Skip to main content
Thirdwatchthirdwatch
E-commerce & products

Build a Flipkart Product Research Pipeline (2026 Guide)

Build an India-marketplace product research pipeline using Thirdwatch's Flipkart Scraper. Niche scoring + per-category analysis.

Apr 27, 2026 · 5 min read · 1,162 words
See the scraper →

Thirdwatch's Flipkart Scraper gives India-focused product-research SaaS builders, marketplace operators, and Tier-2/3-targeting brands a structured Flipkart data layer — title, price, MRP, rating, reviews_count, brand, category, image. Build India-specific niche-opportunity scoring tuned to Flipkart's price-sensitive audience.

Why build a Flipkart product-research pipeline

India ecommerce is a separate competitive surface from US/global. According to Flipkart's 2024 Marketplace report, Flipkart accounts for 50%+ of India's online retail GMV outside Tier 1 metros and 35%+ overall. Its category mix, price elasticity, and brand mix differ materially from Amazon India. For India-focused operators, Tier-2/3-targeting brands, and India-specific marketplace SaaS, Flipkart product-research is the foundation rather than an afterthought.

The job-to-be-done is structured. An India product-research SaaS founder ingests 500K+ Flipkart listings weekly to compute India-specific opportunity scores. A consumer-brand operator evaluates 30 candidate niches per quarter for India launches. A Flipkart-FBA seller maintains daily price snapshots on a 2K-product watchlist. A D2C agency builds per-client niche-opportunity dashboards specific to India market dynamics. All reduce to category-level scrapes + price/rating filtering + India-specific scoring logic.

How does this compare to the alternatives?

Three options for Flipkart product-research data:

Approach Cost per 1,000 records Reliability Setup time Maintenance
Manual Flipkart browsing Effectively unbounded analyst time Low Continuous Doesn't scale
Flipkart Marketplace API (seller account) Free with seller approval Limited to your own listings Days (account setup) Strict use-case scope
Thirdwatch Flipkart Scraper Pay per result Production-tested with Production-grade anti-bot handling 5 minutes Thirdwatch tracks Flipkart changes

Manual browsing doesn't scale past 5-10 categories. Flipkart's seller-side Marketplace API is restricted to your own product catalog. The Flipkart Scraper actor page gives you cross-marketplace research data at the lowest unit cost.

How to build a research pipeline in 4 steps

Step 1: How do I authenticate against Apify?

Sign in at apify.com (free tier, no credit card), open Settings → Integrations, and copy your personal API token. Every example below assumes the token is in APIFY_TOKEN:

export APIFY_TOKEN="apify_api_xxxxxxxxxxxxxxxx"

Step 2: How do I pull a category-level batch?

Pass keyword queries that map to your target niches.

import os, requests, pandas as pd

ACTOR = "thirdwatch~flipkart-products-scraper"
TOKEN = os.environ["APIFY_TOKEN"]

NICHES = ["bluetooth speakers under 2000",
          "stainless steel kitchen tools",
          "yoga mat 6mm",
          "office chair ergonomic",
          "running shoes mens",
          "wireless earbuds under 3000",
          "kitchen storage containers",
          "bedsheets king size cotton"]

resp = requests.post(
    f"https://api.apify.com/v2/acts/{ACTOR}/run-sync-get-dataset-items",
    params={"token": TOKEN},
    json={"queries": NICHES, "maxResults": 100},
    timeout=3600,
)
df = pd.DataFrame(resp.json())
print(f"{len(df)} products across {len(NICHES)} niches")

8 niches × 100 products = up to 800 records.

Step 3: How do I score India-specific niche opportunities?

Composite score: price band fit + reviews threshold + non-dominant top brand.

df["price"] = pd.to_numeric(df.price.astype(str).str.replace(r"[₹,]", "", regex=True), errors="coerce")
df["mrp"] = pd.to_numeric(df.mrp.astype(str).str.replace(r"[₹,]", "", regex=True), errors="coerce")
df["rating"] = pd.to_numeric(df.rating, errors="coerce")
df["reviews_count"] = pd.to_numeric(df.reviews_count, errors="coerce")
df["discount_pct"] = (df.mrp - df.price) / df.mrp * 100

scores = (
    df.groupby("searchString")
    .agg(
        product_count=("title", "count"),
        median_price=("price", "median"),
        median_rating=("rating", "median"),
        median_reviews=("reviews_count", "median"),
        median_discount=("discount_pct", "median"),
        top_brand_share=("brand", lambda x: x.value_counts(normalize=True).iloc[0] if len(x) > 0 else 0),
    )
    .assign(
        opportunity_score=lambda d: (
            (d.product_count.between(50, 500))
            * (d.median_rating.between(3.8, 4.3))
            * (d.median_reviews >= 100)
            * (d.top_brand_share < 0.4)
        ).astype(int)
    )
    .sort_values("opportunity_score", ascending=False)
)
print(scores)

The four-condition opportunity filter is similar to Amazon research but with India-tuned thresholds: lower median-reviews floor (100 vs 200) reflecting Flipkart's earlier-market-stage profile.

Step 4: How do I track price drift and seasonal sales?

Persist daily snapshots and surface significant price drops.

import pathlib, json, datetime

ts = datetime.datetime.utcnow().strftime("%Y%m%d")
df[["product_id", "title", "price", "mrp", "discount_pct",
    "rating", "reviews_count", "brand"]].to_json(
    f"snapshots/flipkart-{ts}.json", orient="records", lines=True
)

# Weekly delta detection
prev = pd.read_json("snapshots/flipkart-20260420.json", lines=True)
combined = df.merge(prev, on="product_id", suffixes=("", "_prev"))
combined["price_delta"] = combined.price - combined.price_prev
combined["price_delta_pct"] = combined.price_delta / combined.price_prev

drops = combined[combined.price_delta_pct <= -0.15].sort_values("price_delta_pct")
print(f"{len(drops)} products with 15%+ price drops")
print(drops[["title", "price_prev", "price", "price_delta_pct"]].head(15))

Big Billion Day and weekly Big Saving Days drive most India ecommerce volume, so price-drop detection during these windows is materially predictive of inventory clearance and seasonal-demand peaks.

Sample output

A single Flipkart product record looks like this. Five rows weigh ~6 KB.

{
  "product_id": "MOBFRZBYMGCSGS5J",
  "title": "boAt Stone 1000 14W Bluetooth Speaker",
  "price": "₹1,799",
  "mrp": "₹3,990",
  "rating": 4.3,
  "reviews_count": 28450,
  "category": "Bluetooth Speakers",
  "brand": "boAt",
  "image_url": "https://rukminim2.flixcart.com/...",
  "url": "https://www.flipkart.com/boat-stone-1000-...",
  "in_stock": true,
  "f_assured": true
}

product_id is Flipkart's canonical natural key (FSN) — stable across price and image changes. mrp (Maximum Retail Price) is the regulated retail-price cap; the difference between mrp and price is the displayed discount, a major India shopping signal. f_assured: true indicates Flipkart-fulfilled (the equivalent of Amazon Prime), which converts at higher rates. category provides hierarchical context for cross-niche normalization.

Common pitfalls

Three things go wrong in Flipkart research pipelines. MRP-vs-price misuse — sellers sometimes inflate MRP to display larger fake discounts; for accurate price analysis, anchor against the average price across the same product category rather than the MRP. F-Assured filtering — non-F-Assured products skew toward smaller sellers with longer delivery times and lower conversion; for serious-niche research, filter to f_assured: true to study the high-conversion segment. Brand-name normalization — Indian brands often appear under English and Hindi-script variants in product titles; for clean brand-share computation, normalize via case-folding and translation table.

Thirdwatch's actor handles TLS-level fingerprinting and session rotation so you can focus on the data. Pair Flipkart with Amazon Scraper for cross-marketplace pricing intelligence and AliExpress Scraper for sourcing-cost research. A fourth subtle issue worth flagging: Flipkart's rating field rounds to one decimal (4.3 instead of 4.27), which compresses statistical signal at scale; for high-precision ranking, supplement with reviews_count weighted score rather than rating alone — a 4.3 product with 50K reviews is materially more reliable than a 4.5 with 200 reviews. A fifth pattern unique to India ecommerce: Flipkart aggressively bundles seasonal categories (kurta sets during Diwali, school supplies in June) into temporary "thematic" search results that disappear after the season — for stable longitudinal research, exclude rows where the searchString contains explicit seasonal qualifiers and re-run base searches separately. A sixth pitfall: Flipkart's reviews_count includes both written reviews and ratings-only entries, while Amazon India distinguishes them; for cross-marketplace comparison studies, normalize by extracting the written-review count separately from the rating-only count via the product detail page. A seventh and final pattern worth flagging for India ecommerce research: GST-inclusive vs GST-exclusive pricing varies across category and seller, and the displayed price typically includes GST while the seller's claimed margin is computed on the GST-exclusive base. For accurate margin-modeling on candidate niches, deduct the applicable GST rate (typically 5%, 12%, 18%, or 28% depending on category) from the displayed price before computing seller-economics — overlooking this consistently overstates apparent seller margins by 8-25%.

Related use cases

Frequently asked questions

Why Flipkart specifically for India product research?

Flipkart is India's largest ecommerce marketplace with 50%+ share of online retail GMV in non-metro India. Its assortment skews toward mid-market consumers and price-sensitive segments differently from Amazon India (which leans premium). For India-specific product research — particularly for sellers targeting Tier 2/3 cities — Flipkart's product, price, and rating data is the canonical source.

What signals does India product research need?

Five core signals: price band (Flipkart's median product price is materially lower than Amazon India), rating, reviews_count, brand share within the category, and discount frequency (Flipkart's 'Big Billion Day' and weekly Big Saving Days drive most volume). Cross-tabulate to score niches: high reviews + 3.8-4.2 rating + non-dominant top brand = quality-improvement opportunity.

How does Flipkart pricing compare to Amazon India?

For categories with brand-overlap (electronics, mobiles, large appliances), Flipkart prices run 5-12% lower than Amazon India on average. For non-brand items (apparel, home goods, kitchenware), the gap widens to 15-25%. For sellers, this means margin assumptions calibrated against Amazon India will overstate India-marketplace margins by 10-20%; recompute against Flipkart-anchored pricing.

Which categories are best for niche research on Flipkart?

Home & Kitchen, Personal Care, Mobile Accessories, and Sports & Outdoors have the highest density of low-competition niches with strong demand. Electronics is competitive but high-volume. Apparel is volume-dense but margins are thin. Avoid Health & Pharmaceuticals and Books (heavy regulation / thin margins). Best results: subcategories 3+ levels deep where Flipkart returns under 1,000 competing products.

How does this compare to Jungle Scout for India?

Jungle Scout has limited Flipkart support (their core product is Amazon-focused). For India-specific research the actor delivers structured Flipkart data at a fraction of the cost. Most India-focused operators run the actor for Flipkart + Jungle Scout for Amazon India, then cross-reference for unified India marketplace intelligence.

How fresh does product research data need to be?

For niche-opportunity research, weekly cadence is sufficient — the structural signals (price band, top-brand share) move slowly. For pricing-elasticity studies during Big Billion Day or seasonal sales, daily cadence catches volatility. For year-round category mapping, monthly is fine. Tag scrape timestamps for delta tracking across snapshots.

Related

Try it yourself

100 free credits, no credit card.

About 30 real searches. Add the MCP to Claude or Cursor in two minutes.