Scrape Startup Jobs with Wellfound (2026 Guide)
Thirdwatch's Wellfound Scraper returns startup-only jobs at $0.008 per record — title, company, location, salary, equity range, funding stage, last raise, investors, team size. Built for startup-recruiter tools, founder-network platforms, venture-research analysts, and tech-talent-acquisition functions targeting early-stage companies.
Why scrape Wellfound for startup jobs
Startup hiring is a separate ecosystem from enterprise. According to Wellfound's 2024 Talent report, the platform indexes 100K+ active startups across all stages with founder/operator-led recruiting workflows. For startup-focused recruiters, founder-network builders, and venture-research analysts, Wellfound's curated startup-only assortment is materially higher-signal than LinkedIn Jobs filtered to startup-tier.
The job-to-be-done is structured. A startup-recruiter platform indexes 1,000 YC + tier-1 startups for candidate-side discovery. A venture-research analyst tracks portfolio company hiring velocity to inform follow-on investment decisions. A founder-network builder maps active startup hiring to identify high-momentum companies. A talent-acquisition function at an early-stage startup researches comp benchmarks for new hires. All reduce to startup-handle list + role pull + funding-stage filtering.
How does this compare to the alternatives?
Three options for startup-jobs data:
| Approach | Cost per 1,000 records | Reliability | Setup time | Maintenance |
|---|---|---|---|---|
| Wellfound Recruiter (paid seat) | $299–$3,000/seat/month | High, with sourcing tools | Hours | Per-seat license |
| LinkedIn Jobs filtered to startups | $8 ($0.008 × 1,000) | Generic, requires filtering | Hours | LinkedIn TOS |
| Thirdwatch Wellfound Scraper | $8 ($0.008 × 1,000) | Production-tested with Camoufox | 5 minutes | Thirdwatch tracks Wellfound changes |
LinkedIn covers all tiers but startup filter is imperfect. Wellfound's first-party Recruiter product is the gold standard for active recruiters but priced for individual seats. The Wellfound Scraper actor page gives you raw startup-jobs data at the lowest unit cost.
How to scrape Wellfound in 4 steps
Step 1: How do I authenticate against Apify?
Sign in at apify.com (free tier, no credit card), open Settings → Integrations, and copy your personal API token:
export APIFY_TOKEN="apify_api_xxxxxxxxxxxxxxxx"
Step 2: How do I pull a startup watchlist?
Pass startup-handle queries.
import os, requests, pandas as pd
ACTOR = "thirdwatch~wellfound-jobs-scraper"
TOKEN = os.environ["APIFY_TOKEN"]
STARTUPS = ["openai", "anthropic", "stripe", "linear", "vercel",
"scale-ai", "modal", "ramp", "mercury", "rippling",
"deel", "notion", "figma", "airtable", "retool"]
resp = requests.post(
f"https://api.apify.com/v2/acts/{ACTOR}/run-sync-get-dataset-items",
params={"token": TOKEN},
json={"queries": STARTUPS, "maxResults": 500},
timeout=3600,
)
df = pd.DataFrame(resp.json())
print(f"{len(df)} jobs across {df.company_name.nunique()} startups")
15 startups × ~30 active roles = ~450 records, costing $3.60.
Step 3: How do I filter by funding stage and equity?
Filter to specific funding stages + parse equity ranges.
import re
df["equity_min"] = df.equity.str.extract(r"([\d.]+)%").astype(float)
df["funding_stage"] = df.company_funding_stage.fillna("Unknown")
senior_roles = df[
(df.experience_level.isin(["Senior", "Lead", "Principal"]))
& df.equity_min.notna()
& (df.equity_min >= 0.1)
& df.funding_stage.isin(["Series A", "Series B", "Series C"])
].sort_values("equity_min", ascending=False)
print(f"{len(senior_roles)} senior + Series A-C roles with disclosed equity")
print(senior_roles[["title", "company_name", "funding_stage",
"salary", "equity_min", "team_size"]].head(15))
The filter surfaces senior-tier roles at growth-stage startups with equity disclosed — the canonical "high-leverage early-employee" cohort.
Step 4: How do I track funding-velocity correlations?
Cross-reference recent raises with hiring velocity.
df["last_raise_date"] = pd.to_datetime(df.company_last_raise_date)
df["days_since_raise"] = (pd.Timestamp.utcnow() - df.last_raise_date).dt.days
post_raise = df[df.days_since_raise <= 90]
hiring_burst = (
post_raise.groupby("company_name")
.agg(
days_since_raise=("days_since_raise", "first"),
funding_stage=("funding_stage", "first"),
last_raise=("company_last_raise", "first"),
open_roles=("title", "count"),
)
.sort_values("open_roles", ascending=False)
)
print(hiring_burst.head(15))
Companies posting 10+ roles within 90 days of a raise are deploying capital aggressively — high-leverage signal for sales prospecting and venture follow-on.
Sample output
A single Wellfound role looks like this. Five rows weigh ~6 KB.
{
"title": "Senior Backend Engineer",
"company_name": "Anthropic",
"company_funding_stage": "Series E",
"company_last_raise": "$3.5B",
"company_last_raise_date": "2024-12-15",
"company_team_size": "500-1000",
"company_investors": ["Google", "Spark Capital", "Lightspeed"],
"location": "San Francisco / Remote",
"salary": "$220,000 - $320,000",
"equity": "0.05% - 0.15%",
"experience_level": "Senior",
"job_type": "Full-time",
"remote": true,
"apply_url": "https://wellfound.com/jobs/..."
}
company_funding_stage + company_last_raise + company_investors are the killer Wellfound-specific fields — none available on LinkedIn directly. equity ranges enable comp-modelling for early-employee compensation that LinkedIn salaries don't capture.
Common pitfalls
Three things go wrong in Wellfound pipelines. Stale company-stage data — funding stage updates lag 30-60 days post-raise; cross-reference with Crunchbase for recency-sensitive use cases. Equity-range parsing — equity displays as "0.05% - 0.15%" or "0.5% - 1%" depending on stage; parse min/max separately. Remote tag inconsistency — "Remote" can mean US-only-remote, Global-remote, or hybrid; for accurate remote-only filtering, supplement with description-keyword matching.
Thirdwatch's actor uses Camoufox + residential proxy at $5/1K, ~36% margin. Pair Wellfound with LinkedIn Jobs Scraper for cross-source startup-hiring coverage and Career Site Scraper for direct ATS depth on prioritized startups. A fourth subtle issue worth flagging: Wellfound's startup roster includes both "active hiring" companies and "stealth" companies posting placeholder roles for SEO reasons; for true active-hiring signal, filter on roles with posted_within_30d: true and exclude companies posting only generic "Engineer" titles without job-specific descriptions. A fifth pattern unique to Wellfound: many startups post the same role under multiple departments (Engineering, Product, Design) for visibility; for accurate per-role velocity tracking, dedupe on (company, title-norm, posted_at) before counting. A sixth and final pitfall: Wellfound's company_team_size field uses bucketed ranges ("11-50", "51-200", "201-500") rather than exact numbers; for cross-startup growth-rate comparisons, use bucket-midpoints and treat the size signal as approximate. A seventh and final pattern worth flagging for production teams: data-pipeline cost optimization. The actor's pricing scales linearly with record volume, so for high-cadence operations (hourly polling on large watchlists), the dominant cost driver is the size of the watchlist rather than the per-record fee. For cost-disciplined teams, tier the watchlist (Tier 1 hourly, Tier 2 daily, Tier 3 weekly) rather than running everything at the highest cadence — typical 60-80% cost reduction with minimal signal loss. Combine tiered cadence with explicit dedup keys and incremental snapshot diffing to keep storage and downstream-compute proportional to new signal rather than total watchlist size.
An eighth subtle issue: snapshot-storage strategy materially affects long-term economics. Raw JSON snapshots compressed with gzip typically run 4-8x smaller than uncompressed; for multi-year retention, always compress at write-time. Partition storage by date prefix (snapshots/YYYY/MM/DD/) to enable fast date-range queries and incremental processing rather than full-scan re-aggregation. Most production pipelines keep 90 days of raw snapshots at full fidelity + 12 months of derived per-record aggregates + indefinite retention of derived metric time-series — three retention tiers managed separately.
A ninth pattern unique to research-grade data work: schema validation should run continuously, not just at pipeline build-time. Run a daily validation suite that asserts each scraper returns the expected core fields with non-null rates above 80% (for required fields) and 50% (for optional). Alert on schema breakage same-day so consumers don't degrade silently. Most schema drift on third-party platforms shows up as one or two missing fields rather than total breakage; catch it early.
Related use cases
- Find YC and Tier-1 startup roles on Wellfound
- Track startup hiring velocity on Wellfound
- Scrape CutShort tech jobs India
- The complete guide to scraping job boards
- All Thirdwatch use-case guides
Frequently asked questions
What's Wellfound (formerly AngelList Talent)?
Wellfound is the canonical startup-jobs platform — exclusively startup roles from seed-stage through late-stage growth, with rich company-side data (funding stage, last round, investor list, team size). For startup recruiters, founder-network builders, and venture-research analysts, Wellfound is materially better-curated than LinkedIn Jobs for early-stage tech hiring.
Why scrape Wellfound vs LinkedIn?
LinkedIn covers all employer tiers; Wellfound is startup-only. For startup-focused use cases (YC + tier-1 startup hiring, founder networking, comp research at early-stage), Wellfound's data quality is higher because the curation filter is built in. About 80%+ of YC W24+ companies post on Wellfound; less than 50% post on LinkedIn first.
What startup-specific signals are visible?
Five startup-specific fields per role: company funding stage (Pre-seed → Series F), last raise amount + date, investor list, equity range (often disclosed), team size. Cross-referencing all five reveals which startups are over-hiring vs raising vs maintaining. About 60% of Wellfound startups disclose equity bands openly.
How does Wellfound handle anti-bot defenses?
Wellfound uses DataDome + Cloudflare Turnstile. Thirdwatch's actor uses Camoufox + residential proxy + humanize behavior to bypass these reliably. Production-tested at sustained weekly volumes. About 90-95% query success rate; failed queries auto-retry with fresh proxy IP.
What's the cost for typical startup-research workflows?
$0.008/record FREE tier. A 50-startup-watchlist daily refresh at 20 roles each = 1,000 records/day = $8/day FREE. Quarterly snapshot of 500 YC startups + tier-1 portfolio companies = ~10K records = $80. For founder-research and comp-benchmarking, this is materially cheaper than commercial founder-CRM products.
How does this compare to AngelList Talent (Wellfound's first-party API)?
AngelList's first-party recruiter API is gated behind paid Wellfound for Recruiters seats ($299+/mo). For high-volume research or platform-builder use cases, the actor is materially cheaper at the cost of building your own filtering UX. For active recruiters with low-volume needs, the SaaS path wins on UX.
Run the Wellfound Scraper on Apify Store — pay-per-job, free to try, no credit card to test.