Scrape Shopify Reviews for Product Research (2026 Guide)
Pull review counts and average ratings from any Shopify store. Auto-detects Judge.me, Yotpo, Loox, Stamped, Okendo, Reviews.io. Python recipes included.

Thirdwatch's Shopify Reviews Scraper detects the review widget on any Shopify store -- Judge.me, Yotpo, Loox, Stamped, Okendo, Reviews.io, or Shopify native -- and returns total review count and weighted average rating. No login, no API key. Built for DTC brand researchers, e-commerce investors, review platform vendors, and sales teams who need review volume data across hundreds of Shopify stores programmatically.
Why scrape Shopify reviews for product research
Shopify powers over 4.8 million live storefronts worldwide as of 2026, according to BuiltWith's Shopify usage statistics. Every one of those stores can install one of dozens of review widgets -- Judge.me, Yotpo, Loox, Stamped, Okendo, Reviews.io, or Shopify's own native product reviews. The result is a fragmented review ecosystem where no single API gives you a cross-brand view of review volume and quality.
The job-to-be-done is straightforward. A DTC brand researcher benchmarking their review count against ten competitors needs structured data, not hours of manual tab-switching. An e-commerce investor screening 200 Shopify brands for social proof needs review volume as a lead-scoring signal. A review platform sales team prospecting for stores using a competitor widget needs provider detection at scale. All of these reduce to a list of Shopify store URLs returning structured rows with provider, review count, and average rating. The actor is the data layer.
How does this compare to the alternatives?
Three paths to collecting Shopify review data at scale:
| Approach | Reliability | Setup time | Maintenance |
|---|---|---|---|
| DIY Python + BeautifulSoup per widget | Breaks when widget markup changes | 2-4 weeks (per widget) | You track 7+ widget DOM formats |
| Provider-specific API (e.g., Yotpo API) | High for one provider only | 1-2 days per provider | Need API keys per provider |
| Thirdwatch Shopify Reviews Scraper | Auto-detects across all major providers | 5 minutes | Thirdwatch tracks widget changes |
The DIY path requires maintaining separate parsers for each review widget -- Judge.me embeds data differently than Yotpo, which differs from Loox. Provider APIs require store-owner cooperation or partner credentials and only cover a single widget. The Shopify Reviews Scraper handles detection and extraction across all seven providers in a single pass. For context, Judge.me alone powers reviews for over 200,000 Shopify stores according to the Shopify App Store, making it the most common widget you will encounter in DTC brand research.
How to scrape Shopify reviews in 4 steps
Step 1: How do I set up my Apify API token?
Sign in at apify.com (free tier, no credit card required), open Settings, then Integrations, and copy your personal API token. Every example below assumes the token is in APIFY_TOKEN:
export APIFY_TOKEN="apify_api_xxxxxxxxxxxxxxxx"
Step 2: How do I pull review stats for a list of Shopify stores?
Pass store URLs in storeUrls and set sampleProducts to control how many product pages are sampled per store. The default of 10 is reliable for most use cases.
import os

import pandas as pd
import requests

ACTOR = "thirdwatch~shopify-reviews-scraper"
TOKEN = os.environ["APIFY_TOKEN"]

# Run the actor synchronously; the response body is the list of dataset items.
resp = requests.post(
    f"https://api.apify.com/v2/acts/{ACTOR}/run-sync-get-dataset-items",
    params={"token": TOKEN},
    json={
        "storeUrls": [
            "https://www.allbirds.com",
            "https://www.gymshark.com",
            "https://www.bombas.com",
            "https://www.brooklinen.com",
            "https://www.ruggable.com",
        ],
        "sampleProducts": 10,
    },
    timeout=600,
)
resp.raise_for_status()  # stop on HTTP errors instead of parsing an error body

# One row per store
df = pd.DataFrame(resp.json())
print(f"{len(df)} stores analyzed")
print(df[["domain", "provider", "totalReviews", "averageRating"]].to_string(index=False))
A run of five stores with 10 products sampled each completes in under two minutes. The actor returns one row per store with the detected review widget, aggregated review count, and weighted average rating.
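For lists that run into the hundreds of stores, one option is to split the URLs into batches and send one request per batch. The sketch below only builds the request bodies and makes no network calls. The helper names `chunked` and `build_payloads` and the batch size of 25 are illustrative assumptions, not limits documented for the actor; `storeUrls` and `sampleProducts` are the input fields described in this step.

```python
from typing import Iterator


def chunked(urls: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of `urls`, each at most `size` items long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_payloads(urls: list[str], batch_size: int = 25,
                   sample_products: int = 10) -> list[dict]:
    """Build one actor input per batch (batch_size is an assumed value)."""
    return [
        {"storeUrls": batch, "sampleProducts": sample_products}
        for batch in chunked(urls, batch_size)
    ]


# Hypothetical store list, for illustration only
stores = [f"https://store{i}.example" for i in range(60)]
payloads = build_payloads(stores)
print(len(payloads), [len(p["storeUrls"]) for p in payloads])
```

Each payload can then be sent as the `json=` body of its own request, and the per-batch results concatenated into a single table.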
Step 3: How do I rank stores by review volume for competitive benchmarking?
Sort by totalReviews descending to see which brands in your competitive set have the strongest social proof. Filter on provider to understand widget market share.
# Rank by review volume
ranked = df.sort_values("totalReviews", ascending=False)
print("\n--- Competitive Review Benchmark ---")
for _, row in ranked.iterrows():
    coverage = f"{row['productsWithRatings']}/{row['productsSampled']} products rated"
    print(f"{row['domain']:30s} | {row['provider']:20s} | "
          f"{row['totalReviews']:>8,} reviews | {row['averageRating']:.1f} avg | {coverage}")

# Widget market share
widget_share = df.groupby("provider").size().sort_values(ascending=False)
print("\n--- Widget Market Share ---")
print(widget_share)
This gives you a ranked leaderboard of competing Shopify brands by review volume -- the kind of data that normally requires hours of manual store visits.
Step 4: How do I automate daily review monitoring with a schedule?
Set up an Apify schedule to run the scraper daily and track review growth over time.
curl -X POST "https://api.apify.com/v2/schedules?token=$APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "shopify-review-benchmark-daily",
    "cronExpression": "0 8 * * *",
    "timezone": "America/New_York",
    "isEnabled": true,
    "actions": [{
      "type": "RUN_ACTOR",
      "actorId": "thirdwatch~shopify-reviews-scraper",
      "runInput": {
        "storeUrls": [
          "https://www.allbirds.com",
          "https://www.gymshark.com",
          "https://www.bombas.com"
        ],
        "sampleProducts": 15
      }
    }]
  }'
Add an ACTOR.RUN.SUCCEEDED webhook to push results into a Google Sheet, Airtable, or your data warehouse. Over weeks, the time series reveals which competitors are growing review volume fastest.
Sample output
A single record from the dataset looks like this. One row per store, typically under 1 KB.
{
  "domain": "allbirds.com",
  "url": "https://www.allbirds.com",
  "provider": "yotpo",
  "totalReviews": 48219,
  "averageRating": 4.6,
  "productsSampled": 10,
  "productsWithRatings": 9
}
provider tells you exactly which review widget the store uses -- valuable for review platform vendors doing competitive intelligence. totalReviews is the sum across all sampled products; at sampleProducts: 10, it serves as a reliable proxy for the store's total review footprint rather than an exact store-wide count. averageRating is weighted by review count per product, so a best-seller with 5,000 reviews carries more weight than a niche product with 12. productsWithRatings versus productsSampled shows review adoption depth -- a store where only 3 of 10 products have reviews has a thinner social proof layer than one where 9 of 10 do.
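To make the weighting concrete, here is a minimal sketch of a review-count-weighted mean, using the best-seller and niche-product figures from the paragraph above. The ratings of 4.8 and 3.9 are made-up illustration values, and `weighted_average_rating` is a hypothetical helper: the article does not document the actor's exact formula, so treat this as the standard weighted mean rather than the actor's implementation.

```python
from typing import Optional


def weighted_average_rating(products: list[tuple[float, int]]) -> Optional[float]:
    """Average rating weighted by review count.

    `products` is a list of (rating, review_count) pairs.
    Products with zero reviews are ignored; returns None if nothing is rated.
    """
    rated = [(rating, count) for rating, count in products if count > 0]
    total_reviews = sum(count for _, count in rated)
    if total_reviews == 0:
        return None
    return sum(rating * count for rating, count in rated) / total_reviews


# Illustration values: a best-seller with 5,000 reviews, a niche product with 12
sample = [(4.8, 5000), (3.9, 12)]
weighted = weighted_average_rating(sample)
simple = sum(rating for rating, _ in sample) / len(sample)
print(f"{weighted:.2f} vs {simple:.2f}")  # 4.80 vs 4.35
```

The weighted figure stays close to the best-seller's rating, while the unweighted mean lets the 12-review product pull the average down as if it mattered equally.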
Common pitfalls
Three things trip up first-time users of Shopify review data.
Widget fragmentation -- a store might use Judge.me for one product line and Yotpo for another. The actor detects the dominant provider but cannot aggregate across two simultaneous widgets on the same store. If your dataset shows unexpectedly low counts, the store may have split its review stack.
Custom domains versus myshopify.com -- pass the brand's primary domain (allbirds.com), not the allbirds.myshopify.com variant. Both work, but custom domains are more reliable for redirect handling.
Sample size tradeoffs -- the default sampleProducts of 10 is fast and accurate for most stores. For stores with thousands of SKUs and highly variable ratings, bumping to 30-50 improves precision at the cost of longer run time. Start with 10 and only increase if your use case demands per-category granularity.
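One way to spot stores worth a second look for the widget fragmentation pitfall is to compare productsWithRatings against productsSampled. The sketch below flags stores under a coverage threshold. The 0.5 cutoff and the helper names `coverage_ratio` and `flag_thin_coverage` are assumptions for illustration; low coverage can also simply mean the store has few reviews, so a flag is a prompt for manual checking, not a diagnosis.

```python
from typing import Optional


def coverage_ratio(row: dict) -> Optional[float]:
    """Share of sampled products that carry ratings, or None if none were sampled."""
    sampled = row.get("productsSampled") or 0
    rated = row.get("productsWithRatings") or 0
    return rated / sampled if sampled else None


def flag_thin_coverage(rows: list[dict], threshold: float = 0.5) -> list[str]:
    """Return domains whose coverage falls below `threshold` (an assumed cutoff)."""
    flagged = []
    for row in rows:
        ratio = coverage_ratio(row)
        if ratio is not None and ratio < threshold:
            flagged.append(row["domain"])
    return flagged


# Hypothetical rows shaped like the actor's output
rows = [
    {"domain": "thin.example", "productsSampled": 10, "productsWithRatings": 3},
    {"domain": "deep.example", "productsSampled": 10, "productsWithRatings": 9},
    {"domain": "empty.example", "productsSampled": 0, "productsWithRatings": 0},
]
print(flag_thin_coverage(rows))
```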
Thirdwatch's actor handles widget detection, proxy rotation, and markup parsing across all seven major review providers so you focus on the analysis, not the extraction plumbing.
A fourth consideration: review growth velocity. A store with 48,000 reviews accumulated over five years is growing at a different rate than a store with 12,000 reviews accumulated in eight months. To measure velocity, schedule recurring runs and track the delta in totalReviews between snapshots. The difference divided by the elapsed days gives you a daily review acquisition rate per store -- a metric that reveals which competitors are investing in post-purchase review solicitation and which are relying on organic reviews alone. This is particularly valuable for review platform sales teams: a store with high traffic but low review velocity is an ideal prospect for a review-generation tool pitch.
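The velocity calculation can be sketched as follows. The `date` key is an assumption: the sample output shows no timestamp field, so this sketch assumes you attach the run date to each snapshot yourself. The review counts are illustration values, and `daily_review_velocity` is a hypothetical helper name.

```python
from datetime import date


def daily_review_velocity(earlier: dict, later: dict) -> float:
    """Reviews gained per day between two snapshots of the same store.

    Each snapshot is a dict with a 'date' (datetime.date, added by you at
    collection time) and the 'totalReviews' value from that run.
    """
    elapsed_days = (later["date"] - earlier["date"]).days
    if elapsed_days <= 0:
        raise ValueError("later snapshot must be at least one day after earlier")
    return (later["totalReviews"] - earlier["totalReviews"]) / elapsed_days


# Illustration values, not real measurements
first = {"date": date(2026, 1, 1), "totalReviews": 48219}
second = {"date": date(2026, 1, 31), "totalReviews": 48639}
print(daily_review_velocity(first, second))  # 14.0
```

Because totalReviews is a sum over sampled products, deltas are likely easiest to compare when sampleProducts is held constant across runs.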
Frequently asked questions
Which review widgets does the Shopify Reviews Scraper support?
The actor auto-detects Judge.me, Yotpo, Loox, Stamped, Okendo, Reviews.io, and Shopify's native product reviews app. You do not need to know which widget a store uses beforehand -- the scraper identifies the provider and returns it alongside the review data.
Does scraping Shopify reviews require login or API keys?
No. The actor reads publicly visible product pages on any Shopify storefront. No Shopify partner app, API key, or store owner permission is needed. Custom domains work the same as myshopify.com domains.
How accurate are the review totals from a sampled scan?
Totals are summed across the sampled product pages, and the average rating is weighted by each product's review count. At the default sampleProducts of 10, average ratings converge within 0.1 stars for most stores. Increasing sampleProducts to 30-50 improves precision for stores with highly variable per-product ratings.
Can I scrape individual review text from Shopify stores?
This actor returns aggregate review counts and average ratings per store, not individual review text. For full review bodies, you need a provider-specific scraper targeting the review widget's API directly. The aggregate data is sufficient for benchmarking, lead scoring, and competitive landscape mapping.
What happens if a store is not on Shopify?
The actor attempts detection and returns provider as null with totalReviews as null. No charge is incurred on non-Shopify sites when sampling yields zero products.
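Since undetected stores come back with null fields, it can help to set those rows aside before ranking or formatting results. A minimal sketch, assuming rows shaped like the sample output with `None` where the actor returns null; `split_detected` is a hypothetical helper name.

```python
def split_detected(rows: list[dict]) -> tuple[list[dict], list[str]]:
    """Separate rows with review data from domains where nothing was detected."""
    detected, undetected = [], []
    for row in rows:
        if row.get("provider") is None or row.get("totalReviews") is None:
            undetected.append(row["domain"])
        else:
            detected.append(row)
    return detected, undetected


# Hypothetical rows; the second mimics a non-Shopify site
results = [
    {"domain": "allbirds.com", "provider": "yotpo", "totalReviews": 48219},
    {"domain": "not-shopify.example", "provider": None, "totalReviews": None},
]
detected, undetected = split_detected(results)
print(len(detected), undetected)
```

Keeping the undetected domains in a separate list makes it easy to review them by hand rather than silently dropping them.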