Scrape Startup Jobs with Wellfound (2026 Guide)

Pull startup jobs from Wellfound (formerly AngelList Talent) using Thirdwatch. YC/tier-1 startups + funding signals + recipes.

Apr 28, 2026 · 5 min read · 1,159 words

Thirdwatch's Wellfound Scraper returns startup-only jobs — title, company, location, salary, equity range, funding stage, last raise, investors, team size. Built for startup-recruiter tools, founder-network platforms, venture-research analysts, and tech-talent-acquisition functions targeting early-stage companies.

Why scrape Wellfound for startup jobs

Startup hiring is a separate ecosystem from enterprise. According to Wellfound's 2024 Talent report, the platform indexes 100K+ active startups across all stages with founder/operator-led recruiting workflows. For startup-focused recruiters, founder-network builders, and venture-research analysts, Wellfound's curated startup-only assortment is materially higher-signal than LinkedIn Jobs filtered to startup-tier.

The job-to-be-done is structured. A startup-recruiter platform indexes 1,000 YC + tier-1 startups for candidate-side discovery. A venture-research analyst tracks portfolio company hiring velocity to inform follow-on investment decisions. A founder-network builder maps active startup hiring to identify high-momentum companies. A talent-acquisition function at an early-stage startup researches comp benchmarks for new hires. All reduce to startup-handle list + role pull + funding-stage filtering.

How does this compare to the alternatives?

Three options for startup-jobs data:

| Approach | Cost per 1,000 records | Reliability | Setup time | Maintenance |
| --- | --- | --- | --- | --- |
| Wellfound Recruiter (paid seat) | $299–$3,000/seat/month | High, with sourcing tools | Hours | Per-seat license |
| LinkedIn Jobs filtered to startups | Pay per result | Generic, requires filtering | Hours | LinkedIn TOS |
| Thirdwatch Wellfound Scraper | Pay per result | Production-tested with production-grade anti-bot tooling | 5 minutes | Thirdwatch tracks Wellfound changes |

LinkedIn covers all employer tiers, but its startup filter is imperfect. Wellfound's first-party Recruiter product is the gold standard for active recruiters, but it is priced per individual seat. The Thirdwatch Wellfound Scraper actor gives you raw startup-jobs data at the lowest unit cost of the three.

How to scrape Wellfound in 4 steps

Step 1: How do I authenticate against Apify?

Sign in at apify.com (free tier, no credit card), open Settings → Integrations, and copy your personal API token:

export APIFY_TOKEN="apify_api_xxxxxxxxxxxxxxxx"

Step 2: How do I pull a startup watchlist?

Pass startup-handle queries.

import os, requests, pandas as pd

ACTOR = "thirdwatch~wellfound-jobs-scraper"
TOKEN = os.environ["APIFY_TOKEN"]

STARTUPS = ["openai", "anthropic", "stripe", "linear", "vercel",
            "scale-ai", "modal", "ramp", "mercury", "rippling",
            "deel", "notion", "figma", "airtable", "retool"]

resp = requests.post(
    f"https://api.apify.com/v2/acts/{ACTOR}/run-sync-get-dataset-items",
    params={"token": TOKEN},
    json={"queries": STARTUPS, "maxResults": 500},
    timeout=3600,
)
resp.raise_for_status()  # fail on auth or actor errors instead of parsing an error body
df = pd.DataFrame(resp.json())
print(f"{len(df)} jobs across {df.company_name.nunique()} startups")

15 startups × ~30 active roles = ~450 records — small enough to run on demand at the actor's pay-per-result pricing.

Step 3: How do I filter by funding stage and equity?

Filter to specific funding stages + parse equity ranges.

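# Lower bound of the equity band, in percent ("0.05% - 0.15%" -> 0.05).
# expand=False returns a Series rather than a one-column DataFrame.
df["equity_min"] = df.equity.str.extract(r"([\d.]+)%", expand=False).astype(float)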
df["funding_stage"] = df.company_funding_stage.fillna("Unknown")

senior_roles = df[
    (df.experience_level.isin(["Senior", "Lead", "Principal"]))
    & df.equity_min.notna()
    & (df.equity_min >= 0.1)
    & df.funding_stage.isin(["Series A", "Series B", "Series C"])
].sort_values("equity_min", ascending=False)

print(f"{len(senior_roles)} senior + Series A-C roles with disclosed equity")
# The team-size column is named company_team_size in the actor output.
print(senior_roles[["title", "company_name", "funding_stage",
                    "salary", "equity_min", "company_team_size"]].head(15))

The filter surfaces senior-tier roles at growth-stage startups with equity disclosed — the canonical "high-leverage early-employee" cohort.
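Step 3 keeps only the first percentage in the equity string. Where both bounds matter (for example, comp modelling on the upper end of a band), a minimal sketch like the one below could parse them separately. The helper name `parse_equity_range` and the choice to treat a single disclosed value as both bounds are assumptions for illustration, not part of the actor.

```python
import re
from typing import Optional, Tuple

# One or two percentages, e.g. "0.05% - 0.15%" or a lone "0.25%".
_EQUITY = re.compile(
    r"(\d+(?:\.\d+)?)\s*%(?:\s*(?:-|–|to)\s*(\d+(?:\.\d+)?)\s*%)?"
)

def parse_equity_range(text: Optional[str]) -> Tuple[Optional[float], Optional[float]]:
    """Return (min_pct, max_pct); (None, None) when no equity is disclosed."""
    if not isinstance(text, str):  # covers None and pandas NaN
        return (None, None)
    match = _EQUITY.search(text)
    if match is None:
        return (None, None)
    low = float(match.group(1))
    # Assumption: a single disclosed value is used as both bounds.
    high = float(match.group(2)) if match.group(2) else low
    return (low, high)

for raw in ["0.05% - 0.15%", "0.5% - 1%", "0.25%", None]:
    print(raw, "->", parse_equity_range(raw))
```

With the DataFrame from Step 2, `df.equity.map(parse_equity_range)` yields one tuple per row, which can then be split into `equity_min` and `equity_max` columns.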

Step 4: How do I track funding-velocity correlations?

Cross-reference recent raises with hiring velocity.

# Parse as UTC so both sides of the subtraction are timezone-aware.
df["last_raise_date"] = pd.to_datetime(df.company_last_raise_date, utc=True)
df["days_since_raise"] = (pd.Timestamp.now(tz="UTC") - df.last_raise_date).dt.days

post_raise = df[df.days_since_raise <= 90]
hiring_burst = (
    post_raise.groupby("company_name")
    .agg(
        days_since_raise=("days_since_raise", "first"),
        funding_stage=("funding_stage", "first"),
        last_raise=("company_last_raise", "first"),
        open_roles=("title", "count"),
    )
    .sort_values("open_roles", ascending=False)
)
print(hiring_burst.head(15))

Companies posting 10+ roles within 90 days of a raise are deploying capital aggressively — high-leverage signal for sales prospecting and venture follow-on.

Sample output

A single Wellfound role looks like this. Five rows weigh ~6 KB.

{
  "title": "Senior Backend Engineer",
  "company_name": "Anthropic",
  "company_funding_stage": "Series E",
  "company_last_raise": "$3.5B",
  "company_last_raise_date": "2024-12-15",
  "company_team_size": "500-1000",
  "company_investors": ["Google", "Spark Capital", "Lightspeed"],
  "location": "San Francisco / Remote",
  "salary": "$220,000 - $320,000",
  "equity": "0.05% - 0.15%",
  "experience_level": "Senior",
  "job_type": "Full-time",
  "remote": true,
  "apply_url": "https://wellfound.com/jobs/..."
}

company_funding_stage + company_last_raise + company_investors are the killer Wellfound-specific fields — none available on LinkedIn directly. equity ranges enable comp-modelling for early-employee compensation that LinkedIn salaries don't capture.

Common pitfalls

Three things go wrong in Wellfound pipelines.

Stale company-stage data: funding stage updates lag 30-60 days post-raise; cross-reference with Crunchbase for recency-sensitive use cases.

Equity-range parsing: equity displays as "0.05% - 0.15%" or "0.5% - 1%" depending on stage; parse min/max separately.

Remote tag inconsistency: "Remote" can mean US-only-remote, Global-remote, or hybrid; for accurate remote-only filtering, supplement with description-keyword matching.
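The remote-tag pitfall above can be sketched as a small keyword classifier. This is only an illustration: the sample record does not show a job-description field, so the `description` key used here is hypothetical, and the keyword patterns are starting points that would need tuning against real postings.

```python
import re

# Illustrative patterns; all matching is done on lowercased text.
_HYBRID = re.compile(r"\bhybrid\b")
_US_ONLY = re.compile(r"\b(?:us|u\.s\.|united states)[ -](?:only|based)\b")
_GLOBAL = re.compile(
    r"\b(?:work from anywhere|anywhere in the world|global remote|remote worldwide)\b"
)

def classify_remote(record: dict) -> str:
    """Return onsite, hybrid, remote_us, remote_global, or remote_unspecified."""
    # "description" is a hypothetical field name; "location" and "remote"
    # come from the sample record.
    text = " ".join(
        str(record.get(key) or "") for key in ("location", "description")
    ).lower()
    if not record.get("remote") and "remote" not in text:
        return "onsite"
    if _HYBRID.search(text):
        return "hybrid"
    if _US_ONLY.search(text):
        return "remote_us"
    if _GLOBAL.search(text):
        return "remote_global"
    return "remote_unspecified"

role = {"remote": True, "location": "San Francisco / Remote",
        "description": "Remote, US-only. Quarterly offsites."}
print(classify_remote(role))
```

Keeping `remote_unspecified` as its own bucket avoids silently counting ambiguous postings as fully remote.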

Thirdwatch's actor handles the anti-bot work and proxy rotation so you can focus on the data. Pair Wellfound with LinkedIn Jobs Scraper for cross-source startup-hiring coverage and Career Site Scraper for direct ATS depth on prioritized startups.

A fourth subtle issue worth flagging: Wellfound's startup roster includes both "active hiring" companies and "stealth" companies posting placeholder roles for SEO reasons; for true active-hiring signal, filter on roles with posted_within_30d: true and exclude companies posting only generic "Engineer" titles without job-specific descriptions.

A fifth pattern unique to Wellfound: many startups post the same role under multiple departments (Engineering, Product, Design) for visibility; for accurate per-role velocity tracking, dedupe on (company, title-norm, posted_at) before counting.

A sixth pitfall: Wellfound's company_team_size field uses bucketed ranges ("11-50", "51-200", "201-500") rather than exact numbers; for cross-startup growth-rate comparisons, use bucket midpoints and treat the size signal as approximate.

A seventh pattern worth flagging for production teams: data-pipeline cost optimization. The actor's pricing scales linearly with record volume, so for high-cadence operations (hourly polling on large watchlists), the dominant cost driver is the size of the watchlist rather than the per-record fee. For cost-disciplined teams, tier the watchlist (Tier 1 hourly, Tier 2 daily, Tier 3 weekly) rather than running everything at the highest cadence; the typical result is a 60-80% cost reduction with minimal signal loss. Combine tiered cadence with explicit dedup keys and incremental snapshot diffing to keep storage and downstream compute proportional to new signal rather than total watchlist size.
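Two of the pitfalls above, duplicate postings across departments and bucketed team sizes, lend themselves to small helpers. The sketch below is one possible approach, not the actor's behavior: the function names are made up for illustration, `posted_at` is assumed to be present in each record (it does not appear in the sample output), and "Acme" is a placeholder company.

```python
import re
from typing import Iterable, List, Optional

def normalize_title(title: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

def dedupe_roles(records: Iterable[dict]) -> List[dict]:
    """Keep the first record per (company, normalized title, posted_at)."""
    seen = set()
    unique = []
    for rec in records:
        key = (
            str(rec.get("company_name", "")).strip().lower(),
            normalize_title(str(rec.get("title", ""))),
            rec.get("posted_at"),  # assumed field name
        )
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

def team_size_midpoint(bucket: Optional[str]) -> Optional[float]:
    """Midpoint of a bucket such as "51-200"; None if it does not parse."""
    match = re.fullmatch(r"\s*(\d+)\s*-\s*(\d+)\s*", bucket or "")
    if match is None:
        return None
    return (int(match.group(1)) + int(match.group(2))) / 2

roles = [
    {"company_name": "Acme", "title": "Senior Backend Engineer", "posted_at": "2026-04-01"},
    {"company_name": "Acme", "title": "Senior  Backend-Engineer", "posted_at": "2026-04-01"},
    {"company_name": "Acme", "title": "Senior Backend Engineer", "posted_at": "2026-04-10"},
]
print(len(dedupe_roles(roles)), team_size_midpoint("51-200"))
```

Run the dedupe step before any per-company role counts, since double-posted roles would otherwise inflate hiring-velocity figures.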

An eighth subtle issue: snapshot-storage strategy materially affects long-term economics. Raw JSON snapshots compressed with gzip typically run 4-8x smaller than uncompressed; for multi-year retention, always compress at write-time. Partition storage by date prefix (snapshots/YYYY/MM/DD/) to enable fast date-range queries and incremental processing rather than full-scan re-aggregation. Most production pipelines keep 90 days of raw snapshots at full fidelity + 12 months of derived per-record aggregates + indefinite retention of derived metric time-series — three retention tiers managed separately.
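As a rough sketch of the storage layout described above, the snippet below writes each pull as a gzip-compressed file under a `YYYY/MM/DD` prefix. The function names and file-name pattern are assumptions, and writing JSON Lines (one record per line) rather than a single JSON array is a choice made here so that files can be streamed; either format compresses the same way.

```python
import gzip
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path

def write_snapshot(records: list, root: Path, when: datetime) -> Path:
    """Write records as gzip-compressed JSON Lines under root/YYYY/MM/DD/."""
    day_dir = root / when.strftime("%Y/%m/%d")
    day_dir.mkdir(parents=True, exist_ok=True)
    path = day_dir / f"wellfound-{when.strftime('%H%M%S')}.jsonl.gz"
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        for rec in records:
            fh.write(json.dumps(rec, ensure_ascii=False) + "\n")
    return path

def read_snapshot(path: Path) -> list:
    """Load one snapshot file back into a list of dicts."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]

# Demo in a throwaway directory; a real pipeline would point root at durable storage.
with tempfile.TemporaryDirectory() as tmp:
    taken_at = datetime(2026, 4, 28, 9, 30, tzinfo=timezone.utc)
    path = write_snapshot([{"title": "Senior Backend Engineer"}],
                          Path(tmp) / "snapshots", taken_at)
    print(path.relative_to(tmp))
    print(read_snapshot(path))
```

Because the date lives in the directory path, a date-range job only needs to list the matching prefixes instead of scanning every file.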

A ninth pattern unique to research-grade data work: schema validation should run continuously, not just at pipeline build-time. Run a daily validation suite that asserts each scraper returns the expected core fields with non-null rates above 80% (for required fields) and 50% (for optional). Alert on schema breakage same-day so consumers don't degrade silently. Most schema drift on third-party platforms shows up as one or two missing fields rather than total breakage; catch it early.
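A daily check along these lines might look like the sketch below. It treats the 80% and 50% figures as minimum non-null rates; the split of fields into required and optional is an assumption based on the sample record, and the function names are illustrative.

```python
from typing import List, Sequence

# Illustrative field lists taken from the sample record; adjust per consumer.
REQUIRED = ("title", "company_name", "apply_url")
OPTIONAL = ("salary", "equity", "company_funding_stage")

def non_null_rate(records: Sequence[dict], field: str) -> float:
    """Share of records where the field is present and non-empty."""
    if not records:
        return 0.0
    filled = sum(1 for rec in records if rec.get(field) not in (None, "", []))
    return filled / len(records)

def validate_schema(records: Sequence[dict],
                    required: Sequence[str] = REQUIRED,
                    optional: Sequence[str] = OPTIONAL,
                    required_min: float = 0.80,
                    optional_min: float = 0.50) -> List[str]:
    """Return human-readable failures; an empty list means the batch passed."""
    failures = []
    for fields, threshold in ((required, required_min), (optional, optional_min)):
        for field in fields:
            rate = non_null_rate(records, field)
            if rate < threshold:
                failures.append(
                    f"{field}: {rate:.0%} non-null, expected >= {threshold:.0%}"
                )
    return failures

# Synthetic batch: apply_url and equity are deliberately sparse.
batch = [
    {"title": "Engineer", "company_name": "Acme", "company_funding_stage": "Seed",
     "apply_url": "https://example.com/job" if i < 7 else None,
     "salary": "$100,000" if i < 5 else "",
     "equity": "0.1%" if i < 4 else None}
    for i in range(10)
]
for problem in validate_schema(batch):
    print(problem)
```

Wiring the returned list into an alerting channel is what turns this from a build-time test into the continuous check described above.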

Frequently asked questions

What's Wellfound (formerly AngelList Talent)?

Wellfound is the canonical startup-jobs platform — exclusively startup roles from seed-stage through late-stage growth, with rich company-side data (funding stage, last round, investor list, team size). For startup recruiters, founder-network builders, and venture-research analysts, Wellfound is materially better-curated than LinkedIn Jobs for early-stage tech hiring.

Why scrape Wellfound vs LinkedIn?

LinkedIn covers all employer tiers; Wellfound is startup-only. For startup-focused use cases (YC + tier-1 startup hiring, founder networking, comp research at early-stage), Wellfound's data quality is higher because the curation filter is built in. Roughly 80% or more of YC W24+ companies post on Wellfound; fewer than 50% post on LinkedIn first.

What startup-specific signals are visible?

Five startup-specific fields per role: company funding stage (Pre-seed → Series F), last raise amount + date, investor list, equity range (often disclosed), team size. Cross-referencing all five reveals which startups are over-hiring vs raising vs maintaining. About 60% of Wellfound startups disclose equity bands openly.

How does Wellfound handle anti-bot defenses?

Wellfound sits behind anti-bot protection. Thirdwatch's actor uses production-grade anti-bot handling to get past it reliably, and it is production-tested at sustained weekly volumes. The query success rate is about 90-95%; failed queries are retried automatically with a fresh proxy IP.

What's the cost for typical startup-research workflows?

Pay-per-result pricing. A 50-startup-watchlist daily refresh at 20 roles each (~1,000 records/day) runs cheaply enough to schedule continuously; a quarterly snapshot of 500 YC + tier-1 portfolio companies stays well within budget. For founder-research and comp-benchmarking, this is materially cheaper than commercial founder-CRM products.
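The volume arithmetic behind these estimates is simple enough to sketch. The snippet below reproduces the 1,000 records/day figure and then compares a flat hourly watchlist with the tiered cadence described under Common pitfalls. The 100-startup watchlist and its tier split are hypothetical numbers, no per-record price is assumed, and the model assumes every poll is billed for every record it returns.

```python
from dataclasses import dataclass
from typing import Iterable

@dataclass(frozen=True)
class Tier:
    name: str
    startups: int
    runs_per_day: float  # 24 = hourly, 1 = daily, 1/7 = weekly

def records_per_day(tiers: Iterable[Tier], roles_per_startup: int) -> float:
    """Expected records pulled per day across all tiers."""
    return sum(t.startups * roles_per_startup * t.runs_per_day for t in tiers)

# FAQ example: 50 startups, 20 roles each, refreshed once a day.
daily = records_per_day([Tier("daily", 50, 1)], roles_per_startup=20)
print(daily)  # 1000

# Hypothetical 100-startup watchlist: everything hourly vs. three tiers.
flat = records_per_day([Tier("all hourly", 100, 24)], roles_per_startup=20)
tiered = records_per_day(
    [Tier("tier 1 hourly", 25, 24),
     Tier("tier 2 daily", 35, 1),
     Tier("tier 3 weekly", 40, 1 / 7)],
    roles_per_startup=20,
)
reduction = 1 - tiered / flat
print(f"flat={flat:.0f}/day tiered={tiered:.0f}/day reduction={reduction:.0%}")
```

Multiplying the daily record count by the actor's current per-result price gives a budget figure; how large the reduction is depends entirely on how many startups sit in the hourly tier.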

How does this compare to AngelList Talent (Wellfound's first-party API)?

AngelList's first-party recruiter API is gated behind paid Wellfound for Recruiters seats ($299+/mo). For high-volume research or platform-builder use cases, the actor is materially cheaper at the cost of building your own filtering UX. For active recruiters with low-volume needs, the SaaS path wins on UX.
