Scrape Instagram Profiles and Posts at Scale (2026)
Thirdwatch's Instagram Scraper returns Instagram public posts and profiles at $0.006 per record — captions, like and comment counts, media URLs, hashtags, follower counts, and recent-post arrays — without requiring a login or Graph API access. Built for social-media marketers, influencer analysts, brand-monitoring teams, and trend researchers who need machine-readable Instagram data programmatically.
Why scrape Instagram without API
Instagram is the second-largest social platform globally. According to Meta's 2024 user disclosures, Instagram crossed 2 billion monthly active users with creator engagement growing faster than any other Meta property. For brand and creator monitoring, the platform is non-negotiable. The blocker for systematic access: Meta's Instagram Graph API gates most public-data endpoints behind app review, Business credentials, and creator-account permissions — a multi-week onboarding for what is effectively public information.
The job-to-be-done is structured. A social-media marketing team monitors 50 competitor profiles weekly for engagement benchmarks. An influencer-research team builds a creator shortlist of 200 accounts with follower counts and recent-post engagement. A brand-monitoring team watches hashtags and tagged-account feeds for mentions. A trend researcher collects caption corpora for NLP analysis. All reduce to handle/hashtag list + searchType + max results returning structured rows. The actor is the data layer.
How does this compare to the alternatives?
Three options for getting Instagram data into a pipeline:
| Approach | Cost per 1,000 records | Reliability | Setup time | Maintenance |
|---|---|---|---|---|
| Meta Instagram Graph API | Free with quotas | Official | Weeks (app review + Business creds) | Strict rate limits |
| Influencer SaaS (Modash, HypeAuditor, CreatorIQ) | $5K–$50K/year | High, includes audience demographics | Hours | Vendor lock-in |
| Thirdwatch Instagram Scraper | $6 ($0.006 × 1,000) | Production-tested, no login | 5 minutes | Thirdwatch tracks Instagram changes |
Influencer SaaS bundles Instagram with audience demographics (age/gender/geo splits) the public profile pages don't expose. The Instagram Scraper actor page gives you the public data layer at pay-per-result pricing — most teams build their own monitoring on top for far less than the SaaS cost.
How to scrape Instagram in 4 steps
Step 1: How do I authenticate against Apify?
Sign in at apify.com (free tier, no credit card), open Settings → Integrations, and copy your personal API token. Every example below assumes the token is in APIFY_TOKEN:
export APIFY_TOKEN="apify_api_xxxxxxxxxxxxxxxx"
Step 2: How do I pull profile data with recent posts?
Pass @username-prefixed handles, set searchType: "profiles", and choose how many recent posts to enrich.
import os, requests, pandas as pd
ACTOR = "thirdwatch~instagram-scraper"
TOKEN = os.environ["APIFY_TOKEN"]
CREATORS = ["@natgeo", "@nasa", "@nike",
"@apple", "@vogue", "@gucci"]
resp = requests.post(
f"https://api.apify.com/v2/acts/{ACTOR}/run-sync-get-dataset-items",
params={"token": TOKEN},
json={
"queries": CREATORS,
"searchType": "profiles",
"maxResults": 50,
"maxPostsPerProfile": 12,
},
timeout=900,
)
df = pd.DataFrame(resp.json())
print(f"{len(df)} profiles, total recent posts: "
f"{sum(len(p) for p in df.recentPosts)}")
6 profiles × 12 recent posts each ≈ 78 records (6 profile rows + 72 nested posts), costing $0.50.
Step 3: How do I rank profiles by engagement rate?
Engagement rate (interactions per follower) is the canonical creator-quality metric. Compute it from the recent-posts array.
def engagement_rate(row):
posts = row.get("recentPosts") or []
if not posts or not row.get("followerCount"):
return None
avg_engagement = sum(
(p.get("likeCount") or 0) + (p.get("commentCount") or 0)
for p in posts
) / len(posts)
return avg_engagement / row["followerCount"]
df["engagement_rate"] = df.apply(engagement_rate, axis=1)
ranked = df.sort_values("engagement_rate", ascending=False)
print(ranked[["username", "followerCount", "postCount",
"engagement_rate", "isVerified"]].head(15))
Engagement rate of 1-3% is typical for mega-creators (10M+ followers); 3-6% for mid-tier (100K-1M); 6%+ for niche micro-creators. A high follower count combined with low engagement rate often signals follower fraud — a useful filter for genuine influencer shortlists.
Step 4: How do I track follower growth over time?
Persist daily snapshots of follower counts and diff across days.
import datetime, json, pathlib, glob
today = datetime.date.today().isoformat()
profiles = df[["username", "followerCount", "postCount"]].copy()
profiles["date"] = today
profiles.to_json(f"snapshots/ig-followers-{today}.json", orient="records")
frames = []
for f in sorted(glob.glob("snapshots/ig-followers-*.json")):
frames.append(pd.read_json(f))
history = pd.concat(frames, ignore_index=True)
history["date"] = pd.to_datetime(history["date"])
pivot = history.pivot(index="username", columns="date", values="followerCount")
dates = sorted(pivot.columns)
if len(dates) >= 30:
pivot["growth_30d"] = pivot[dates[-1]] - pivot[dates[-30]]
pivot["growth_30d_pct"] = (pivot[dates[-1]] / pivot[dates[-30]]) - 1
print(pivot.sort_values("growth_30d", ascending=False).head(15))
A 5%+ 30-day follower growth rate is meaningful — most established accounts grow at 1-2% per month. Spikes above 10% usually indicate viral content or bot-driven inflation; cross-check with engagement rate before acting on the signal.
Sample output
A single post record looks like this. Five rows of this shape weigh ~3 KB.
{
"shortcode": "C7xYz12ABcd",
"url": "https://www.instagram.com/p/C7xYz12ABcd/",
"type": "image",
"caption": "Sunrise at the South Rim. #nature #photography",
"likeCount": 245000,
"commentCount": 1200,
"authorUsername": "natgeo",
"timestamp": "2026-04-08T18:30:00+00:00",
"location": "Grand Canyon National Park",
"imageUrl": "https://scontent.cdninstagram.com/...",
"hashtags": ["nature", "photography"]
}
A profile record (with searchType: "profiles") includes profile-level fields plus a recentPosts array:
{
"username": "natgeo",
"fullName": "National Geographic",
"biography": "Experience the world through the eyes of National Geographic...",
"followerCount": 281500000,
"followingCount": 134,
"postCount": 30200,
"isVerified": true,
"isPrivate": false,
"profilePicUrl": "https://scontent.cdninstagram.com/...",
"externalUrl": "https://www.nationalgeographic.com/",
"recentPosts": [/* 12 post objects */]
}
shortcode is Instagram's globally unique post identifier — the canonical key for cross-snapshot dedup. type distinguishes image, video, carousel, and reel content, useful for content-mix analysis. hashtags is parsed from the caption automatically — much cleaner than regex-extracting them downstream.
Common pitfalls
Three things go wrong in production Instagram pipelines. Hashtag rate limiting — Instagram rate-limits hashtag-feed access heavily in 2026; expect partial results on popular hashtags and run smaller batches frequently rather than one large batch. viewCount nullness — viewCount only populates for video and reel posts; image posts return null. For engagement calculations, treat null viewCount as zero (or skip viewCount entirely on image-heavy accounts). Follower-count rounding — Instagram displays follower counts as 2.8M for large accounts; the actor parses these to integers but loses precision. For absolute-precision follower tracking on mega-creators, expect ±0.5% rounding error.
Thirdwatch's actor returns shortcode, url, and timestamp on every post record so cross-snapshot dedup and time-series analysis are clean. The pure-HTTP architecture means a 50-account profile-mode pull with 12 posts each completes in 8-15 minutes wall-clock and costs roughly $4. Pair Instagram with our TikTok Scraper and YouTube Scraper for cross-platform creator research. A fourth subtle issue worth flagging: Instagram's grid sometimes shows pinned posts at the top regardless of recency, which can skew "recent posts" analysis if the pinned post is months old. Cross-check the timestamp on each recentPosts entry rather than trusting their order. A fifth pattern unique to creator analytics: a profile with 5M followers but only 50K likes per post on average has a ~1% engagement rate, which is in the normal mega-creator band; expecting 5M-follower accounts to consistently hit 100K+ likes is a common analyst mistake when the comparison cohort is mid-tier creators.
Related use cases
- Track influencer follower growth on Instagram
- Research Instagram hashtag performance
- Monitor brand Instagram engagement
- The complete guide to scraping social media
- All Thirdwatch use-case guides
Frequently asked questions
Do I need an Instagram account or API access?
No. Thirdwatch's Instagram Scraper accesses publicly visible profile and post pages without login — the same URLs anyone can view as a guest. This sidesteps Meta's Graph API, which gates most public-data access behind app-review requirements and Business-tier credentials that take weeks to onboard.
How much does it cost?
Thirdwatch charges $0.006 per record on the FREE tier and drops to $0.003 at GOLD volume. A 100-account influencer-monitoring batch with 12 recent posts each costs roughly $7-$8 per refresh. Daily cadence on a 100-account watchlist costs ~$220/month — meaningfully below influencer-marketing SaaS subscriptions like Modash or HypeAuditor.
What's the difference between posts and profiles search types?
searchType: "posts" returns individual posts/media — one row per post with full engagement metrics. searchType: "profiles" returns profile-level info (follower count, bio, post count) plus an array of recent posts (controlled by maxPostsPerProfile, default 12). For follower-growth tracking choose profiles; for hashtag content research or single-post analysis choose posts.
Can I scrape hashtag feeds?
Yes, with limitations. Pass #hashtag-prefixed queries and Instagram returns recent posts under that hashtag. Instagram has rate-limited hashtag search heavily in 2026 — expect partial results on broad/popular hashtags and sparse results on smaller ones. For systematic hashtag tracking, run small batches frequently rather than one large batch at a time.
What does the actor NOT return?
Three things: Stories (require login and expire after 24 hours), the Reels For You feed (algorithmically personalised, requires login), and private accounts (the 🔒 padlock means even logged-in scrapers without an active follow can't access). The actor returns whatever Instagram shows publicly; for any of the above, alternative approaches require Meta's Graph API with Business credentials.
How does this compare to apify/instagram-scraper?
The apify/instagram-scraper actor (224K users) is cheaper per result ($0.0027) and battle-tested at scale. Thirdwatch's actor is priced higher but ships a single consolidated schema across posts and profiles, includes maxPostsPerProfile for recent-post enrichment, and is maintained on a predictable release cadence. For pure cost, apify's wins; for schema simplicity in mixed posts/profiles workflows, this one wins.
Run the Instagram Scraper on Apify Store — pay-per-record, free to try, no credit card to test.