Scrape LinkedIn Candidates for Recruiter Sourcing (2026)

Source LinkedIn candidates by role, skills, and location using Thirdwatch's LinkedIn Candidate Finder actor. No login required. Python and CRM recipes.

May 26, 2026 · 6 min read · 1,359 words

Thirdwatch's LinkedIn Candidate Finder returns LinkedIn profiles matching recruiter criteria -- role, skills, location, seniority, experience range, and company filters. No LinkedIn login or Recruiter seat needed. Returns candidate name, headline, and profile URL ready for enrichment or ATS import. Built for sourcing teams, staffing agencies, and hiring managers who need structured shortlists fast.

Why scrape LinkedIn for recruiter sourcing

LinkedIn hosts over 1 billion member profiles globally, making it the single largest professional talent database on the open web. According to LinkedIn's 2025 workforce report, 70% of the global workforce is passive talent -- people not actively applying but open to the right opportunity. A Lever recruiting survey found that sourced candidates are 2-3x more likely to be hired than applicants. The challenge for recruiters: LinkedIn Recruiter costs $100-180 per seat per month, imposes search limits, and locks you into LinkedIn's own UI for outreach.

The job-to-be-done is clear. A tech recruiter building a shortlist of senior backend engineers in Bangalore with Kubernetes experience needs 50 matching profiles by end of day. A staffing agency placing data scientists across three US metros wants weekly refreshes of qualified candidates per metro. A startup founder hiring their first five engineers needs to see who in their target market has the right skill stack -- without paying for an enterprise Recruiter seat. All of these reduce to role + skills + location queries returning structured profile data. The LinkedIn Candidate Finder is the data layer.

How does this compare to the alternatives?

Three paths for sourcing LinkedIn candidate data programmatically:

Approach	Cost	Reliability	Setup time	Maintenance
Manual Google x-ray search (`site:linkedin.com/in/`)	Free, hours of your time	Inconsistent, unstructured	Ongoing manual effort	You paste and format by hand
LinkedIn Recruiter / Sales Navigator	$100-180+/seat/month	High within LinkedIn's limits	Account setup	LinkedIn manages
Thirdwatch LinkedIn Candidate Finder	Pay per result	Production-tested, structured output	5 minutes	Thirdwatch maintains

Manual x-ray search is what most sourcers already do: a query such as site:linkedin.com/in/ "senior software engineer" "kubernetes" "Bangalore" pasted into Google. The Candidate Finder automates exactly this workflow and returns structured rows instead of raw search results. LinkedIn Recruiter gives you richer filters plus built-in workflow features (InMail, notes, pipeline stages), but at enterprise pricing. The Candidate Finder fills the gap: structured candidate data at per-result pricing, no login needed.

How to source LinkedIn candidates in 5 steps

Step 1: How do I set up my Apify token?

Sign up at apify.com (free tier, no credit card required). Navigate to Settings, then Integrations, and copy your API token. All examples below assume it is exported:

export APIFY_TOKEN="apify_api_xxxxxxxxxxxxxxxx"

Step 2: How do I search for candidates by role, skills, and location?

Pass the role, skills, and location fields to define your ideal candidate. The actor builds the right query and returns matching LinkedIn profiles.

import os, requests, pandas as pd

ACTOR = "thirdwatch~linkedin-candidate-finder-scraper"
TOKEN = os.environ["APIFY_TOKEN"]

resp = requests.post(
    f"https://api.apify.com/v2/acts/{ACTOR}/run-sync-get-dataset-items",
    params={"token": TOKEN},
    json={
        "role": "Senior Software Engineer",
        "skills": ["python", "kubernetes", "aws"],
        "location": "Bangalore",
        "maxResults": 25,
    },
    timeout=300,
)
resp.raise_for_status()  # stop on HTTP errors instead of parsing an error body
df = pd.DataFrame(resp.json())
print(f"{len(df)} candidates found")
print(df[["fullName", "headline", "url"]].head(10))

The skills array matches each phrase exactly against the candidate's publicly indexed profile. The location field expands common variants automatically -- "Bangalore" also searches "Bengaluru", "Mumbai" also searches "Bombay", "NYC" also searches "New York".
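
To make the matching behaviour concrete, here is a rough sketch of the kind of x-ray query a sourcer would otherwise assemble by hand. It is not the actor's actual implementation: the LOCATION_VARIANTS table and the build_xray_query helper are hypothetical names, and the table contains only the three variant pairs mentioned above.

```python
# Illustrative only: mirrors the manual x-ray search pattern, not the actor's internals.
# LOCATION_VARIANTS and build_xray_query are hypothetical names for this sketch.
LOCATION_VARIANTS = {
    "bangalore": ["Bangalore", "Bengaluru"],
    "mumbai": ["Mumbai", "Bombay"],
    "nyc": ["NYC", "New York"],
}


def build_xray_query(role, skills, location):
    """Build a Google x-ray search string where every term is an exact phrase."""
    variants = LOCATION_VARIANTS.get(location.strip().lower(), [location])
    parts = ["site:linkedin.com/in/", f'"{role}"']
    parts += [f'"{skill}"' for skill in skills]
    # Location variants are alternatives, so they are joined with OR.
    location_clause = " OR ".join(f'"{v}"' for v in variants)
    if len(variants) > 1:
        location_clause = f"({location_clause})"
    parts.append(location_clause)
    return " ".join(parts)


query = build_xray_query("Senior Software Engineer", ["python", "kubernetes"], "Bangalore")
print(query)
```

Because every skill is quoted, a profile that says "k8s" instead of "kubernetes" would not match this query, which is why starting with one or two core skills tends to return more candidates.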

Step 3: How do I filter by seniority and experience range?

Add seniority, minExperienceYears, and maxExperienceYears to narrow the pool to candidates at the right career stage.

resp = requests.post(
    f"https://api.apify.com/v2/acts/{ACTOR}/run-sync-get-dataset-items",
    params={"token": TOKEN},
    json={
        "role": "Product Manager",
        "skills": ["b2b saas", "roadmap"],
        "location": "San Francisco",
        "country": "US",
        "seniority": "senior",
        "minExperienceYears": 5,
        "maxExperienceYears": 12,
        "maxResults": 50,
    },
    timeout=300,
)
resp.raise_for_status()  # stop on HTTP errors instead of parsing an error body
df = pd.DataFrame(resp.json())
print(f"{len(df)} senior PMs, 5-12 years experience")

The country field further narrows results when a city name exists in multiple countries. Combine seniority with the experience range to avoid getting junior candidates who happen to mention "senior" in a project description.

Step 4: How do I target or exclude specific companies?

Use currentCompanies to find candidates from your target talent pools and excludeCompanies to filter out your own employees or competitors you do not want to poach from.

resp = requests.post(
    f"https://api.apify.com/v2/acts/{ACTOR}/run-sync-get-dataset-items",
    params={"token": TOKEN},
    json={
        "role": "Data Scientist",
        "skills": ["machine learning", "pytorch"],
        "location": "New York",
        "currentCompanies": ["Google", "Meta", "Amazon"],
        "excludeCompanies": ["Acme Corp"],
        "keywords": ["PhD"],
        "maxResults": 30,
    },
    timeout=300,
)
resp.raise_for_status()  # stop on HTTP errors instead of parsing an error body
df = pd.DataFrame(resp.json())
print(f"{len(df)} data scientists at target companies, excluding Acme Corp")

The keywords array adds extra terms the profile must contain -- useful for certifications (AWS Solutions Architect, PMP), domain expertise (fintech, healthcare), or education signals (PhD, Stanford).

Step 5: How do I schedule daily sourcing runs?

Set up a recurring schedule so fresh candidates land in your pipeline each weekday morning before the team's standup.

curl -X POST "https://api.apify.com/v2/schedules?token=$APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "daily-swe-bangalore-sourcing",
    "cronExpression": "0 6 * * 1-5",
    "timezone": "Asia/Kolkata",
    "isEnabled": true,
    "actions": [{
      "type": "RUN_ACTOR",
      "actorId": "thirdwatch~linkedin-candidate-finder-scraper",
      "runInput": {
        "role": "Senior Software Engineer",
        "skills": ["python", "kubernetes", "aws"],
        "location": "Bangalore",
        "seniority": "senior",
        "minExperienceYears": 5,
        "maxResults": 50
      }
    }]
  }'

Add an ACTOR.RUN.SUCCEEDED webhook pointing at your ATS or CRM ingestion endpoint. Every weekday at 6 AM IST, fresh candidate profiles flow into your sourcing pipeline automatically.
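
On the receiving side, the ingestion endpoint needs to work out which dataset the run wrote to. The sketch below assumes the webhook uses Apify's default payload template, where the run object sits under resource and carries a defaultDatasetId; dataset_items_url is a hypothetical helper name, and the IDs in the example payload are made up.

```python
API_BASE = "https://api.apify.com/v2"


def dataset_items_url(payload):
    """Return the dataset items URL for a succeeded run, or None for anything else.

    Assumes the default webhook payload shape: eventType at the top level and
    the run object (including defaultDatasetId) under "resource".
    """
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    dataset_id = (payload.get("resource") or {}).get("defaultDatasetId")
    if not dataset_id:
        return None
    return f"{API_BASE}/datasets/{dataset_id}/items"


# Example payload trimmed to the fields used here; the IDs are invented.
example = {
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "resource": {"id": "run123", "defaultDatasetId": "ds456"},
}
print(dataset_items_url(example))
```

The returned URL can then be fetched with requests.get(url, params={"token": TOKEN}) and loaded into a DataFrame the same way as in the earlier steps.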

Sample output

Each record in the dataset contains the candidate's public profile summary. A typical batch of results looks like this:

[
  {
    "fullName": "Priya Sharma",
    "headline": "Senior Software Engineer at Google | Python, Kubernetes, AWS",
    "url": "https://www.linkedin.com/in/priya-sharma/"
  },
  {
    "fullName": "Rahul Menon",
    "headline": "Staff Engineer at Flipkart | Backend Systems | Ex-Amazon",
    "url": "https://www.linkedin.com/in/rahul-menon-eng/"
  },
  {
    "fullName": "Ananya Iyer",
    "headline": "Senior SWE at Microsoft | Cloud Infrastructure | Kubernetes",
    "url": "https://www.linkedin.com/in/ananya-iyer-cloud/"
  }
]

fullName is the candidate's display name as shown on their LinkedIn profile. headline captures whatever the candidate has set -- typically role, company, and key skills. This is your first-pass relevance signal before enriching with a full LinkedIn Profile Scraper. url is the direct link to the profile, stable enough to use as a dedup key across runs.
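
As a rough illustration of using the headline as a first-pass signal, the sketch below ranks the sample records by how many target skills appear in the headline. The headline_score helper is a hypothetical name, and plain substring matching is a deliberate simplification: a short skill such as "aws" would also match unrelated words like "laws".

```python
def headline_score(headline, skills):
    """Count how many target skills appear in the headline (case-insensitive substring)."""
    text = headline.lower()
    return sum(1 for skill in skills if skill.lower() in text)


# The three sample records shown above, trimmed to the fields used here.
candidates = [
    {"fullName": "Priya Sharma", "headline": "Senior Software Engineer at Google | Python, Kubernetes, AWS"},
    {"fullName": "Rahul Menon", "headline": "Staff Engineer at Flipkart | Backend Systems | Ex-Amazon"},
    {"fullName": "Ananya Iyer", "headline": "Senior SWE at Microsoft | Cloud Infrastructure | Kubernetes"},
]
skills = ["python", "kubernetes", "aws"]

# Highest score first; ties keep their original order because sorted() is stable.
ranked = sorted(candidates, key=lambda c: headline_score(c["headline"], skills), reverse=True)
for c in ranked:
    print(headline_score(c["headline"], skills), c["fullName"])
```

A score of zero is not a rejection. It only means the headline does not mention the skills, so those profiles are the ones worth enriching before deciding.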

Common pitfalls

Three things trip up production sourcing pipelines.

Overly narrow queries: combining a niche role with 5+ skills and a small city often returns fewer results than expected, because not all candidates list every skill in their public profile. Start broad (role + 1-2 core skills + metro area) and tighten iteratively.

Duplicate candidates across runs: scheduled runs will surface some of the same profiles. Deduplicate on url before pushing to your ATS. Keeping a simple set of already-seen profile URLs across runs prevents duplicate outreach.

Headline-only matching: the actor returns the LinkedIn headline, which is whatever the candidate chose to write. A "Full Stack Developer" may actually be a senior backend engineer who has not updated their headline. Feed promising URLs into the LinkedIn Profile Scraper for full experience and skills data before making outreach decisions.
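
The deduplication step can be sketched as follows. This is a minimal in-memory version: profile_key and new_candidates are hypothetical helper names, the normalization rules (dropping "www.", the query string, and the trailing slash) are assumptions about how URLs might vary between runs, and a real pipeline would persist the seen set in a file or database.

```python
from urllib.parse import urlsplit


def profile_key(url):
    """Reduce a profile URL to a stable key: host without 'www.', plus lowercased path."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    # urlsplit already separates the query string and fragment from the path.
    return host + parts.path.rstrip("/").lower()


def new_candidates(rows, seen_keys):
    """Return rows whose profile has not been seen before; record their keys in seen_keys."""
    fresh = []
    for row in rows:
        key = profile_key(row["url"])
        if key not in seen_keys:
            seen_keys.add(key)
            fresh.append(row)
    return fresh


seen = set()
day1 = [{"fullName": "Priya Sharma", "url": "https://www.linkedin.com/in/priya-sharma/"}]
day2 = [
    {"fullName": "Priya Sharma", "url": "https://linkedin.com/in/priya-sharma?trk=x"},
    {"fullName": "Rahul Menon", "url": "https://www.linkedin.com/in/rahul-menon-eng/"},
]
print(len(new_candidates(day1, seen)), len(new_candidates(day2, seen)))
```

Only the rows returned by new_candidates would be pushed to the ATS, so a profile that reappears in a later run is skipped.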

The actor handles query construction, location variant expansion, and proxy rotation so you can focus on evaluating the shortlist rather than wrestling with boolean search strings.

For production sourcing workflows, the best practice is to run separate queries per role-location combination rather than one broad query. A "Senior Software Engineer + Bangalore" query and a "Senior Software Engineer + Hyderabad" query each return more relevant results than a single "Senior Software Engineer + India" query, because LinkedIn's public index is biased toward metro-specific profile visibility. Run 5-10 targeted queries per day on a weekday schedule and merge results into your ATS, deduplicating on profile URL.
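
One way to generate those targeted queries is to build one actor input per role-location pair. The sketch below reuses the input field names from the earlier steps; build_inputs is a hypothetical helper, and the default of 50 results per query is an arbitrary choice for illustration.

```python
from itertools import product


def build_inputs(roles, locations, skills, max_results=50):
    """Return one actor input dict per role-location pair."""
    return [
        {"role": role, "skills": list(skills), "location": loc, "maxResults": max_results}
        for role, loc in product(roles, locations)
    ]


inputs = build_inputs(
    roles=["Senior Software Engineer"],
    locations=["Bangalore", "Hyderabad"],
    skills=["python", "kubernetes"],
)
for payload in inputs:
    print(payload["role"], "+", payload["location"])
```

Each dict can be sent as the json body of the same run-sync-get-dataset-items request used in Step 2, with the results merged and deduplicated on url afterwards.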

Related use cases

Frequently asked questions

Do I need a LinkedIn account to scrape candidates?

No. The LinkedIn Candidate Finder works against publicly indexed profile data. No login, no cookies, no LinkedIn Recruiter seat required. Results reflect what is publicly available on the open web at run time.

How many candidate profiles can I pull per run?

Up to 500 profiles per run via the maxResults input. Start with 5-10 to validate match quality for your role and skills combination, then scale up. Very narrow queries may return fewer matches than the cap.

Does it return email addresses or phone numbers?

No. The actor returns fullName, headline, and the LinkedIn profile URL only. Use a dedicated enrichment tool or the LinkedIn Profile Scraper downstream to pull contact details, experience history, or education.

How accurate are the skill matches?

Each skill in the skills array is matched as an exact phrase against publicly indexed profile content. Candidates who list skills in their headline or summary surface reliably. Profiles with vague descriptions may not appear for technical queries.

Can I exclude candidates from specific companies?

Yes. Pass company names in the excludeCompanies array. The actor filters out any profile that mentions those companies. Useful for excluding your own employees or agencies you already work with.

Build a Talent Pipeline from LinkedIn Candidate Data (2026)
Find Passive Candidates on LinkedIn by Skill (2026 Guide)
Track Talent Market Trends with LinkedIn Candidate Data
