Build a Product Sentiment Database from Shopify Reviews

Collect review ratings across hundreds of Shopify stores and build a sentiment database for brand benchmarking, investor screening, and market analysis.

May 26, 2026 · 5 min read · 1,238 words

Thirdwatch's Shopify Reviews Scraper returns review counts, average ratings, and detected review providers for any Shopify store. Feed the output into a sentiment database to benchmark brands, screen DTC investments, and track customer satisfaction across hundreds of storefronts. It needs no store login or review-provider API key, and it works across Judge.me, Yotpo, Loox, Stamped, Okendo, Reviews.io, and Shopify native reviews.

Why build a product sentiment database from Shopify reviews

Customer reviews are the closest proxy to real-time brand health that exists at scale. According to Bazaarvoice's 2025 Shopper Experience Index, 88% of consumers consult reviews before making a purchase, and brands with higher review volumes convert at measurably higher rates. For anyone analyzing the DTC e-commerce landscape -- investors, brand strategists, competitive intelligence teams -- a structured database of review sentiment across Shopify stores is a foundational asset. According to Statista's 2025 e-commerce platform market share data, Shopify powers over 4 million storefronts globally, making it the largest single-platform DTC ecosystem to analyze.

The problem is fragmentation. Shopify's ecosystem has no unified review API. Each store independently installs one of seven or more review widgets, each with its own data format and access method. Building a cross-brand sentiment database by hand means visiting each store, identifying the widget, and manually recording numbers. At 5 minutes per store and 200 stores in a competitive set, that is 1,000 minutes, or roughly 17 hours of manual work, repeated every time you need a refresh. The actor collapses that to a single API call returning structured rows.

How does this compare to the alternatives?

Three approaches to assembling a multi-brand Shopify sentiment database:

Approach	Reliability	Setup time	Maintenance
Manual store visits + spreadsheet	Accurate but non-scalable	Minutes per store	Full re-crawl on each refresh
Yotpo/Judge.me API per provider	High for one widget only	Days per provider integration	Separate auth per provider
Thirdwatch Shopify Reviews Scraper	Auto-detects all 7 providers	5 minutes	Thirdwatch tracks markup changes

The manual path does not scale past 20-30 stores. Provider APIs require either store-owner credentials or a partner integration per widget -- impractical when your database spans brands using different providers. The Shopify Reviews Scraper handles cross-provider detection in one pass, returning a uniform schema regardless of whether the store uses Judge.me or Okendo.

How to build a Shopify sentiment database in 4 steps

Step 1: How do I authenticate and prepare a store list?

export APIFY_TOKEN="apify_api_xxxxxxxxxxxxxxxx"

Prepare your store list as an array of Shopify store URLs. These can be custom domains or myshopify.com URLs. For a DTC competitor analysis, a typical list includes 50-200 brands in your vertical.
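
Store lists assembled from spreadsheets often mix bare domains, trailing slashes, deep links, and duplicates. The following is a minimal cleanup sketch using only the standard library. The helper name normalize_store_urls is hypothetical (it is not part of the actor), and it assumes every store is reachable over HTTPS at its bare origin.

```python
from urllib.parse import urlsplit

def normalize_store_urls(raw_urls):
    """Return de-duplicated https:// origins, preserving first-seen order."""
    seen = set()
    cleaned = []
    for raw in raw_urls:
        raw = raw.strip()
        if not raw:
            continue
        # Bare domains have no scheme; add one so urlsplit can find the host.
        if "://" not in raw:
            raw = "https://" + raw
        host = urlsplit(raw).netloc.lower()
        if not host:
            continue
        url = f"https://{host}"  # assumption: stores are served over HTTPS
        if url not in seen:
            seen.add(url)
            cleaned.append(url)
    return cleaned

STORES = normalize_store_urls([
    "https://www.allbirds.com/",
    "https://www.allbirds.com",
    "HTTPS://WWW.GYMSHARK.COM/collections/all",
    "example-store.myshopify.com",
    "",
])
print(STORES)
```

The sketch treats www and apex hosts as distinct stores; collapse them yourself if your list mixes both forms for the same brand.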

Step 2: How do I collect review data across all stores?

Pass the store list in storeUrls and set sampleProducts to control sampling depth. Higher values produce more accurate totals at the cost of longer run times.

import os
from datetime import date

import pandas as pd
import requests

ACTOR = "thirdwatch~shopify-reviews-scraper"
TOKEN = os.environ["APIFY_TOKEN"]

STORES = [
    "https://www.allbirds.com",
    "https://www.gymshark.com",
    "https://www.bombas.com",
    "https://www.brooklinen.com",
    "https://www.ruggable.com",
    "https://www.chubbiesshorts.com",
    "https://www.mvmtwatches.com",
    "https://www.nativecos.com",
    "https://www.rothys.com",
    "https://www.away.com",
]

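# Run the actor synchronously and fetch the dataset items in one call
resp = requests.post(
    f"https://api.apify.com/v2/acts/{ACTOR}/run-sync-get-dataset-items",
    params={"token": TOKEN},
    json={
        "storeUrls": STORES,
        "sampleProducts": 15,
    },
    timeout=600,
)
resp.raise_for_status()  # fail fast on HTTP errors instead of parsing an error body
df = pd.DataFrame(resp.json())
df["snapshot_date"] = date.today().isoformat()  # stamp the batch for time-series use
print(f"{len(df)} stores collected, {df['totalReviews'].sum():,.0f} total reviews across dataset")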

A run of ten stores at 15 products each finishes in under three minutes. The snapshot_date column anchors each batch to a point in time, which is essential for building time-series sentiment trends.

Step 3: How do I normalize and score sentiment for database storage?

Transform raw review data into a sentiment scoring model. The simplest approach maps averageRating to a labeled tier and uses totalReviews as a confidence weight.

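def sentiment_tier(rating):
    # Missing ratings arrive as None or NaN once loaded into pandas;
    # a plain "is None" check would let NaN fall through to "strong_negative"
    if pd.isna(rating):
        return "unknown"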
    if rating >= 4.5:
        return "strong_positive"
    if rating >= 4.0:
        return "positive"
    if rating >= 3.5:
        return "neutral"
    if rating >= 3.0:
        return "negative"
    return "strong_negative"

df["sentiment"] = df["averageRating"].apply(sentiment_tier)
df["review_coverage"] = df["productsWithRatings"] / df["productsSampled"]
df["confidence"] = df["totalReviews"].apply(
    lambda r: "high" if r and r >= 1000 else ("medium" if r and r >= 100 else "low")
)

# Database-ready output
db_cols = ["domain", "provider", "totalReviews", "averageRating",
           "sentiment", "confidence", "review_coverage", "snapshot_date"]
print(df[db_cols].to_string(index=False))

# Export to CSV for warehouse loading
df[db_cols].to_csv("shopify_sentiment_snapshot.csv", index=False)

The review_coverage ratio (products with ratings divided by products sampled) surfaces stores with shallow review adoption -- a signal that their aggregate rating may be less representative. The confidence tier prevents over-indexing on a 5.0 average from a store with only 8 reviews.

Step 4: How do I maintain the database with scheduled refreshes?

Schedule weekly runs to build a longitudinal sentiment dataset. Each snapshot stacks onto the previous data, enabling trend analysis.

import sqlite3

conn = sqlite3.connect("shopify_sentiment.db")
df[db_cols].to_sql("sentiment_snapshots", conn, if_exists="append", index=False)

# Query: week-over-week rating change.
# 'localtime' keeps SQLite's date('now') (UTC by default) aligned with the
# local date.today() used for snapshot_date in Step 2.
query = """
SELECT domain,
       MAX(CASE WHEN snapshot_date = date('now', 'localtime') THEN averageRating END) AS current_rating,
       MAX(CASE WHEN snapshot_date = date('now', 'localtime', '-7 days') THEN averageRating END) AS prev_rating,
       MAX(CASE WHEN snapshot_date = date('now', 'localtime') THEN totalReviews END) -
       MAX(CASE WHEN snapshot_date = date('now', 'localtime', '-7 days') THEN totalReviews END) AS new_reviews
FROM sentiment_snapshots
GROUP BY domain
HAVING prev_rating IS NOT NULL
ORDER BY new_reviews DESC
"""
trends = pd.read_sql(query, conn)
conn.close()
print(trends)

After a month of weekly snapshots, you have a sentiment trend database that shows which brands are gaining review velocity and which are stagnating -- signals that manual spot-checks cannot capture.
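
Re-running the load on the same day appends a second copy of every row. One way to keep the table idempotent, sketched here with the standard sqlite3 module, is a primary key on (domain, snapshot_date) combined with INSERT OR REPLACE. The column layout mirrors db_cols from Step 3; the helper name upsert_snapshot and the sample row are hypothetical.

```python
import sqlite3

def upsert_snapshot(conn, rows):
    """Insert rows, replacing any existing row for the same (domain, snapshot_date)."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS sentiment_snapshots (
            domain TEXT NOT NULL,
            provider TEXT,
            totalReviews INTEGER,
            averageRating REAL,
            sentiment TEXT,
            confidence TEXT,
            review_coverage REAL,
            snapshot_date TEXT NOT NULL,
            PRIMARY KEY (domain, snapshot_date)
        )
    """)
    conn.executemany(
        """INSERT OR REPLACE INTO sentiment_snapshots
           (domain, provider, totalReviews, averageRating,
            sentiment, confidence, review_coverage, snapshot_date)
           VALUES (:domain, :provider, :totalReviews, :averageRating,
                   :sentiment, :confidence, :review_coverage, :snapshot_date)""",
        rows,
    )
    conn.commit()

conn = sqlite3.connect(":memory:")  # in-memory database for the demo only
row = {"domain": "example-store.com", "provider": "judge.me", "totalReviews": 1200,
       "averageRating": 4.6, "sentiment": "strong_positive", "confidence": "high",
       "review_coverage": 0.9, "snapshot_date": "2026-05-26"}
upsert_snapshot(conn, [row])
upsert_snapshot(conn, [dict(row, totalReviews=1250)])  # same-day rerun replaces the row
stored = conn.execute(
    "SELECT COUNT(*), MAX(totalReviews) FROM sentiment_snapshots"
).fetchone()
print(stored)
```

This approach would take the place of the to_sql append, since a table first created by to_sql carries no primary key for the replace to act on.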

Sample output

A single record from the dataset. One row per store, under 1 KB.

{
    "domain": "gymshark.com",
    "url": "https://www.gymshark.com",
    "provider": "judge.me",
    "totalReviews": 127403,
    "averageRating": 4.7,
    "productsSampled": 15,
    "productsWithRatings": 14
}

provider identifies the review widget -- critical for review platform vendors tracking competitive market share. totalReviews aggregated across 15 sampled products gives a reliable proxy for the store's overall review footprint. averageRating is review-count-weighted, meaning products with more reviews contribute proportionally to the average. A store like Gymshark with 14 of 15 products rated shows deep review adoption, strengthening confidence in the aggregate metric.
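
To make the weighting concrete, here is a small worked example of a review-count-weighted average next to a simple mean. The per-product numbers are invented for illustration, and the function sketches the arithmetic described above rather than the actor's internal code.

```python
def weighted_average_rating(products):
    """products: list of (rating, review_count) pairs; returns None if there are no reviews."""
    total_reviews = sum(count for _, count in products)
    if total_reviews == 0:
        return None
    weighted_sum = sum(rating * count for rating, count in products)
    return weighted_sum / total_reviews

# Hypothetical sample: one heavily reviewed product and two lightly reviewed ones.
sample = [(4.8, 900), (4.0, 80), (3.0, 20)]
weighted = weighted_average_rating(sample)
simple = sum(rating for rating, _ in sample) / len(sample)
print(round(weighted, 2), round(simple, 2))
```

The weighted figure stays close to the rating of the product that most customers actually reviewed, whereas the simple mean is pulled down by two products with few reviews.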

Common pitfalls

Three issues surface when building sentiment databases from Shopify review data.

Conflating store-level and product-level reviews: this actor returns product-page review data (from widgets like Yotpo and Judge.me), not store-level ratings from platforms like Trustpilot. These are different signals. A brand can have a 4.8 product rating and a 2.1 Trustpilot score because of shipping complaints. Track both, but keep them in separate columns.

Small-sample bias: a store with 3 products sampled and 25 total reviews can show a perfect 5.0 average that collapses once you sample 30 products. Use the confidence tier from Step 3 to flag low-review stores before including them in aggregate analyses.

Provider migration: brands occasionally switch review widgets (e.g., from Loox to Yotpo). When this happens, totalReviews may drop sharply because only part of the review history is visible during the migration period. Track the provider field over time and flag provider changes as data-quality events in your pipeline.
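
The provider-migration check can be sketched as a pass over stored snapshots that reports each point where a domain's provider differs from the last one seen. The function name provider_changes, the example domains, and the provider strings "loox" and "yotpo" are assumptions for illustration; only "judge.me" appears in the sample output.

```python
def provider_changes(snapshots):
    """Return (domain, snapshot_date, old_provider, new_provider) for each switch.

    snapshots: list of dicts with "domain", "provider", and an ISO "snapshot_date".
    Rows with a null provider are skipped so a failed detection is not
    mistaken for a migration.
    """
    last_seen = {}
    changes = []
    ordered = sorted(snapshots, key=lambda s: (s["domain"], s["snapshot_date"]))
    for snap in ordered:
        domain, provider = snap["domain"], snap["provider"]
        if provider is None:
            continue
        previous = last_seen.get(domain)
        if previous is not None and provider != previous:
            changes.append((domain, snap["snapshot_date"], previous, provider))
        last_seen[domain] = provider
    return changes

history = [
    {"domain": "brand-a.example", "provider": "loox", "snapshot_date": "2026-05-12"},
    {"domain": "brand-a.example", "provider": "loox", "snapshot_date": "2026-05-19"},
    {"domain": "brand-a.example", "provider": "yotpo", "snapshot_date": "2026-05-26"},
    {"domain": "brand-b.example", "provider": "judge.me", "snapshot_date": "2026-05-12"},
    {"domain": "brand-b.example", "provider": None, "snapshot_date": "2026-05-19"},
    {"domain": "brand-b.example", "provider": "judge.me", "snapshot_date": "2026-05-26"},
]
events = provider_changes(history)
print(events)
```

Rows flagged this way are candidates for excluding that week's totalReviews delta from trend calculations.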

Thirdwatch's actor handles the cross-provider detection and sampling so your pipeline receives a clean, uniform schema on every run.

Related use cases

Frequently asked questions

How many Shopify stores can I scan in a single run?

There is no hard limit on the number of store URLs per run. Pass as many URLs as needed in the storeUrls array. For batches over 500 stores, split into multiple runs to stay within the 600-second timeout and process results incrementally.
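
Splitting a large list into batches can be as simple as the sketch below. The batch size of 500 comes from the guidance above; the helper name chunked and the placeholder store URLs are hypothetical. Each batch would then be passed as storeUrls in its own call, as in Step 2.

```python
def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Placeholder URLs standing in for a real store list.
stores = [f"https://store-{i}.example" for i in range(1, 1201)]
batches = list(chunked(stores, 500))
print([len(batch) for batch in batches])
```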

Can I build sentiment trends over time with this data?

Yes. Schedule the actor to run daily or weekly against the same store list. Each run returns a snapshot of totalReviews and averageRating. Store results in a time-series database and compute week-over-week deltas to track sentiment drift across your competitive set.
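
Scheduled runs do not always land exactly seven days apart, so one option is to compare each domain's two most recent snapshots instead of matching exact dates. The sketch below does this with the standard sqlite3 module; the table is a trimmed-down version of the one in Step 4, and the data and the function name latest_deltas are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sentiment_snapshots "
    "(domain TEXT, snapshot_date TEXT, averageRating REAL, totalReviews INTEGER)"
)
conn.executemany("INSERT INTO sentiment_snapshots VALUES (?, ?, ?, ?)", [
    ("brand-a.example", "2026-05-12", 4.6, 1000),
    ("brand-a.example", "2026-05-19", 4.6, 1040),
    ("brand-a.example", "2026-05-27", 4.5, 1100),  # this run landed a day late
    ("brand-b.example", "2026-05-27", 4.2, 300),   # only one snapshot so far
])

def latest_deltas(conn):
    """Compare the two most recent non-null snapshots for each domain."""
    rows = conn.execute(
        "SELECT domain, snapshot_date, averageRating, totalReviews "
        "FROM sentiment_snapshots "
        "WHERE averageRating IS NOT NULL AND totalReviews IS NOT NULL "
        "ORDER BY domain, snapshot_date"  # ISO dates sort chronologically as text
    ).fetchall()
    history = {}
    for domain, day, rating, reviews in rows:
        history.setdefault(domain, []).append((day, rating, reviews))
    deltas = {}
    for domain, snaps in history.items():
        if len(snaps) < 2:
            continue  # nothing to compare against yet
        (_, prev_rating, prev_reviews), (_, cur_rating, cur_reviews) = snaps[-2:]
        deltas[domain] = {
            "rating_change": round(cur_rating - prev_rating, 2),
            "new_reviews": cur_reviews - prev_reviews,
        }
    return deltas

deltas = latest_deltas(conn)
print(deltas)
```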

Does the actor return sentiment labels like positive or negative?

No. The actor returns numeric averageRating on a 0-5 scale and totalReviews count. You apply your own sentiment thresholds downstream -- for example, treating 4.5 and above as strong positive and below 3.5 as negative. This avoids imposing an opinionated classification layer.

What review providers are covered?

Judge.me, Yotpo, Loox, Stamped, Okendo, Reviews.io, and Shopify native product reviews. The actor auto-detects the provider per store and returns it in the provider field, so your database captures widget market share alongside sentiment data.

How do I handle stores with no detected reviews?

Stores using custom or private review systems return provider as null and totalReviews as null. Filter these rows before aggregating sentiment metrics. They still indicate the store exists on Shopify, which is useful for completeness in a brand database.
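
A minimal sketch of that filtering step, operating on the raw list of records before any aggregation. The function name split_by_review_data and the sample rows are hypothetical; the field names follow the sample output.

```python
def split_by_review_data(rows):
    """Separate rows with usable review data from rows the actor could not score."""
    scored, unscored = [], []
    for row in rows:
        has_data = row.get("provider") is not None and row.get("totalReviews") is not None
        (scored if has_data else unscored).append(row)
    return scored, unscored

rows = [
    {"domain": "brand-a.example", "provider": "judge.me", "totalReviews": 840, "averageRating": 4.6},
    {"domain": "brand-b.example", "provider": None, "totalReviews": None, "averageRating": None},
]
scored, unscored = split_by_review_data(rows)
print(len(scored), len(unscored))
```

Keeping the unscored rows in a separate list preserves them for the completeness use case while leaving them out of sentiment aggregates.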

Scrape Shopify Reviews for Product Research (2026 Guide)
Monitor Shopify Review Scores for DTC Brand Health 2026
Find Shopify Product Complaints for Competitor Gap Analysis
