Build an India Fashion Trend Pipeline with Myntra (2026)

Turn the Myntra catalogue into a longitudinal India fashion-trend dataset — colour, silhouette, brand and price-band signals. Python pipeline recipes.

May 12, 2026 · 6 min read · 1,276 words
Thirdwatch's Myntra Scraper lets you turn India's largest fashion marketplace into a longitudinal trend dataset — weekly snapshots across 20-30 categories, indexed on colour, silhouette, brand and price-band. Built for fashion researchers building seasonal forecasts, founders sourcing private-label inventory, marketplace analysts benchmarking assortment, and brand teams tracking competitor catalogue ramps.

Why build a trend pipeline on Myntra

Myntra sits at the leading edge of India's fashion assortment. According to Statista's 2024 India online fashion report, Myntra is the country's largest dedicated fashion marketplace by GMV, with 6,000+ brands and several million live SKUs across apparel, footwear, beauty and accessories. Brands stage new-season inventory on Myntra before pushing it to AJIO, their own DTC sites, or offline retail, typically 4-8 weeks ahead. For trend research, that staging window is the entire value proposition: you can see what is about to become commercially important rather than what already is.

The job-to-be-done is structured. A fashion researcher tracks colour-and-silhouette trends across men's, women's and kids' apparel for a seasonal forecast deck. A founder evaluating a private-label launch needs price-band density per category to size the white-space opportunity. A marketplace ops team benchmarks assortment depth versus Myntra to plan category investment. A brand team tracks competitors' new-season ramps in near-real-time. All reduce to weekly longitudinal capture of brand, colour, article_type, price, sizes and rating_count across a fixed category list, then trend computation on top.

How does this compare to the alternatives?

Three options for building an India fashion trend dataset:

| Approach | Reliability | Setup time | Maintenance |
| --- | --- | --- | --- |
| Subscribe to RedSeer / McKinsey reports | High but quarterly/annual cadence, aggregate granularity | Days (procurement) | Static per report |
| DIY Python against multiple India fashion sites | Low: each site ships differently, rate-limits hard, breaks weekly | 2-3 weeks initial | Continuous and high |
| Thirdwatch Myntra Scraper | Production-tested with production-grade anti-bot tooling | 1 hour | Thirdwatch tracks Myntra changes |

Official reports are the right macro context layer but they don't operate at SKU granularity or weekly cadence. The Myntra Scraper actor page gives you the underlying records to build your own trend layer — see the e-commerce scraping guide for adjacent patterns.

How to build a Myntra trend pipeline in 4 steps

Step 1: How do I structure the weekly category sweep?

Pull a fixed category list every Monday at 09:00 IST. Use Apify's scheduler or your own cron. Keep the input shape boring and deterministic for downstream stability.

# Set the token in your shell before running: export APIFY_TOKEN="apify_api_xxxxxxxxxxxxxxxx"
import os, requests, datetime, json, pathlib

ACTOR = "thirdwatch~myntra-scraper"
TOKEN = os.environ["APIFY_TOKEN"]

CATEGORIES = [
    "men-tshirts", "men-shirts", "men-jeans", "men-trousers", "men-footwear",
    "women-tops", "women-dresses", "women-kurtas-kurtis", "women-jeans",
    "women-footwear", "women-handbags",
    "ethnic-wear", "sportswear", "beauty-personal-care",
    "accessories", "bags-backpacks",
]

resp = requests.post(
    f"https://api.apify.com/v2/acts/{ACTOR}/run-sync-get-dataset-items",
    params={"token": TOKEN},
    json={
        "queries": CATEGORIES,
        "sortBy": "popularity",
        "maxResults": 300,
    },
    timeout=1800,
)
resp.raise_for_status()  # fail loudly rather than saving an error payload as a snapshot
records = resp.json()
week = datetime.date.today().isocalendar()
stamp = f"{week.year}-W{week.week:02d}"
pathlib.Path("trends").mkdir(exist_ok=True)
pathlib.Path(f"trends/myntra-{stamp}.json").write_text(json.dumps(records))
print(f"{stamp}: {len(records)} records across {len(CATEGORIES)} categories")

16 categories × 300 products = up to 4,800 records per weekly pull. Over 12 weeks, that's roughly 57,600 records, a usable sample size for India fashion trend analysis at the SKU level.
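Because each snapshot file is keyed on the ISO week, the label depends on the timezone in which "today" is evaluated. The sketch below is one way to pin the stamp to IST regardless of where the scheduler runs. `week_stamp` is a hypothetical helper name introduced here for illustration; it is not part of the actor or the script above.

```python
import datetime

# IST is UTC+05:30 with no daylight saving, so a fixed offset is sufficient.
IST = datetime.timezone(datetime.timedelta(hours=5, minutes=30), name="IST")

def week_stamp(now: datetime.datetime) -> str:
    """Return the ISO year-week label (e.g. '2026-W20') for `now` as seen in IST."""
    if now.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    iso = now.astimezone(IST).isocalendar()
    return f"{iso.year}-W{iso.week:02d}"

# Sunday 20:00 UTC is already Monday 01:30 in IST, so it falls in the next ISO week.
sunday_utc = datetime.datetime(2026, 5, 10, 20, 0, tzinfo=datetime.timezone.utc)
print(week_stamp(sunday_utc))  # 2026-W20
```

Calling `week_stamp(datetime.datetime.now(datetime.timezone.utc))` in place of the `date.today()` lookup would keep filenames consistent if the job ever moves between machines in different timezones.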

Step 2: How do I compute a colour trend index?

Aggregate listings by (week, article_type, primary_colour) and compute the share of listings per article_type. Week-over-week share growth is the trend signal.

import glob, json, pathlib
import pandas as pd

frames = []
for f in sorted(glob.glob("trends/myntra-*.json")):
    stamp = pathlib.Path(f).stem.replace("myntra-", "")
    for j in json.loads(pathlib.Path(f).read_text()):
        frames.append({
            "week": stamp,
            "sku": j.get("sku"),
            "brand": j.get("brand"),
            "article_type": j.get("article_type"),
            "primary_colour": j.get("primary_colour"),
            "price": j.get("price"),
            "rating_count": j.get("rating_count") or 0,
        })

df = pd.DataFrame(frames).dropna(subset=["article_type", "primary_colour"])

colour = (
    df.groupby(["week", "article_type", "primary_colour"])
      .size().reset_index(name="listings")
)
totals = colour.groupby(["week", "article_type"])["listings"].transform("sum")
colour["share"] = colour.listings / totals

# Week-over-week share growth per (article_type, colour)
colour = colour.sort_values(["article_type", "primary_colour", "week"])
colour["share_wow"] = (
    colour.groupby(["article_type", "primary_colour"])["share"].pct_change()
)

trending = colour[
    (colour.share_wow >= 0.20)
    & (colour.listings >= 30)
].sort_values("share_wow", ascending=False)
print(trending.head(20))

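A 20%+ week-over-week growth in listing share, with at least 30 listings to filter noise, is a reasonable trending-colour signal. Note that pct_change compares each row with the previous row for the same (article_type, colour) pair, so a colour that is missing from one snapshot is compared with its last appearance rather than with the prior week. Cross-check against brand breadth in step 3 to filter single-brand bets.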
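To make the share arithmetic concrete, here is a minimal worked example on synthetic counts (the numbers are invented for illustration and are not Myntra data). It applies the same share and pct_change steps as the code above to two weeks of kurta listings.

```python
import pandas as pd

# Synthetic listing counts, made up for illustration only.
colour = pd.DataFrame({
    "week": ["2026-W17", "2026-W17", "2026-W18", "2026-W18"],
    "article_type": ["Kurtas"] * 4,
    "primary_colour": ["Sage Green", "Black", "Sage Green", "Black"],
    "listings": [40, 160, 60, 140],
})

# Share of listings within each (week, article_type) cell.
totals = colour.groupby(["week", "article_type"])["listings"].transform("sum")
colour["share"] = colour["listings"] / totals

# Week-over-week relative change in share per (article_type, colour).
colour = colour.sort_values(["article_type", "primary_colour", "week"])
colour["share_wow"] = (
    colour.groupby(["article_type", "primary_colour"])["share"].pct_change()
)

trending = colour[(colour["share_wow"] >= 0.20) & (colour["listings"] >= 30)]
print(trending[["week", "primary_colour", "share", "share_wow"]])
```

Sage Green moves from 40/200 = 20% of kurta listings to 60/200 = 30%, a relative gain of 50%, so it clears the 20% threshold. Black falls from 80% to 70% and is filtered out. The first week of each series has no prior value, so its share_wow is NaN and it never passes the filter.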

Step 3: How do I separate real trends from single-brand bets?

Compute brand breadth per trending colour. A real trend has many brands ramping; a single-brand bet has one.

brand_breadth = (
    df.groupby(["week", "article_type", "primary_colour"])["brand"]
      .nunique().reset_index(name="brand_count")
)
trending_with_breadth = trending.merge(brand_breadth,
                                       on=["week", "article_type", "primary_colour"])

real_trends = trending_with_breadth[
    trending_with_breadth.brand_count >= 5
].sort_values("share_wow", ascending=False)
print("Cross-brand trending colours:")
print(real_trends[["week", "article_type", "primary_colour",
                   "share", "share_wow", "brand_count"]].head(20))

A 20%+ share growth with 5+ brands ramping the same colour is a high-confidence trend signal — the kind of input a buyer can defend in a planning meeting.

Step 4: How do I add a demand-velocity layer?

Listing share alone is supply-side. Add rating_count delta as a demand-side proxy: products acquiring more reviews per week are selling faster.

# Sort within SKU by week, take diff
df = df.sort_values(["sku", "week"])
df["rc_delta"] = df.groupby("sku")["rating_count"].diff().clip(lower=0)

velocity = (
    df.dropna(subset=["rc_delta"])
      .groupby(["week", "article_type", "primary_colour"])
      .agg(new_reviews=("rc_delta", "sum"),
           skus=("sku", "nunique"))
      .reset_index()
)
velocity["reviews_per_sku"] = velocity.new_reviews / velocity.skus

hot = velocity.merge(real_trends[["week", "article_type", "primary_colour"]],
                     on=["week", "article_type", "primary_colour"])
print("Trending colours with demand confirmation:")
print(hot.sort_values("reviews_per_sku", ascending=False).head(15))

The intersection of supply-side trend (listing share growth) and demand-side trend (reviews-per-SKU above category baseline) is the highest-confidence signal. This is where buying teams allocate budget and content teams plan campaigns.
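The step 4 code ranks cells by reviews_per_sku but does not compute the category baseline mentioned above. One possible definition, used here purely as an assumption, is the pooled reviews per SKU across every colour in the same (week, article_type) cell. The sketch below runs on synthetic rows shaped like the `velocity` frame; `baseline` and `lift` are column names introduced for illustration.

```python
import pandas as pd

# Synthetic velocity rows in the shape produced in step 4 (values are made up).
velocity = pd.DataFrame({
    "week": ["2026-W18"] * 3,
    "article_type": ["Kurtas"] * 3,
    "primary_colour": ["Sage Green", "Black", "Maroon"],
    "new_reviews": [600, 900, 300],
    "skus": [20, 60, 40],
})
velocity["reviews_per_sku"] = velocity["new_reviews"] / velocity["skus"]

# Assumed baseline: pooled reviews per SKU over all colours in the same cell.
cell = velocity.groupby(["week", "article_type"])
velocity["baseline"] = (
    cell["new_reviews"].transform("sum") / cell["skus"].transform("sum")
)

# Lift above 1.0 means the colour collects reviews faster than its category.
velocity["lift"] = velocity["reviews_per_sku"] / velocity["baseline"]
above_baseline = velocity[velocity["lift"] > 1.0]
print(above_baseline[["primary_colour", "reviews_per_sku", "baseline", "lift"]])
```

With these numbers the pooled baseline is 1,800 / 120 = 15 reviews per SKU, so Sage Green (30 per SKU) has a lift of 2.0 while Black (15) and Maroon (7.5) do not clear it. A median-based baseline would be a reasonable alternative if a few very large cells dominate the pooled figure.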

Sample output

A trending-colour record from the pipeline output looks like the example below — one row per (week, article_type, primary_colour) cell. The underlying actor records are the same SKU shape as the other Myntra recipes.

[
  {
    "week": "2026-W18",
    "article_type": "Kurtas",
    "primary_colour": "Sage Green",
    "listings": 142,
    "share": 0.087,
    "share_wow": 0.34,
    "brand_count": 11,
    "new_reviews": 4280,
    "reviews_per_sku": 30.1
  },
  {
    "week": "2026-W18",
    "article_type": "Dresses",
    "primary_colour": "Butter Yellow",
    "listings": 98,
    "share": 0.061,
    "share_wow": 0.28,
    "brand_count": 8,
    "new_reviews": 1920,
    "reviews_per_sku": 19.6
  }
]

Sage green kurtas with 34% week-over-week share growth across 11 brands and 30 new reviews per SKU is the canonical strong-trend row — a buyer can act on it. Butter yellow dresses at 28%/8 brands/19.6 reviews-per-SKU is a softer but still actionable signal. Rows below 20% share growth or with fewer than 5 brands get filtered out as noise.

Common pitfalls

Three things go wrong in production trend pipelines.

Category-mix drift: if your category list shifts week to week, the trend index becomes uninterpretable. Keep the category list fixed for the entire research window.

Colour name normalisation: Myntra ships colour names like Sage Green, Dusty Sage and Light Sage as distinct values. For trend research, build a colour-canonicalisation dictionary (or use a perceptual colour-distance metric on the image_url) to collapse near-synonyms before computing share.

Sale-window distortion: EORS and Big Fashion Days flood the catalogue with discounted SKUs from clearance inventory, which can spike listing counts for legacy colours. Either exclude sale weeks from trend calculations or tag them as a separate regime.

A fourth, subtler issue: primary_colour is missing for some accessories and beauty SKUs; the trend pipeline should .dropna() on the colour column rather than treat null as "Unknown".

Thirdwatch's actor handles the anti-bot work and India-region routing to stay reliable for the multi-thousand-record weekly sweep this pipeline depends on. Even a 5,000-record weekly pull typically completes in single-digit minutes, cheap enough to run year-round. Pair the trend pipeline with AJIO and Nykaa snapshots for cross-marketplace breadth, and with our Pinterest and Instagram social-signal actors for the lead-indicator layer.
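The colour-normalisation and sale-window pitfalls above can each be handled with a small lookup. The sketch below is a starting point under stated assumptions: the mapping entries and the sale-week label are hypothetical placeholders, and `canonical_colour` and `regime` are helper names introduced here, not fields or functions from the actor.

```python
# Hypothetical mapping; extend it from the distinct primary_colour values
# that actually appear in your own snapshots.
CANONICAL_COLOURS = {
    "sage green": "Sage Green",
    "dusty sage": "Sage Green",
    "light sage": "Sage Green",
}

def canonical_colour(raw):
    """Collapse near-synonym colour names; unknown names pass through, None stays None."""
    if raw is None:
        return None
    cleaned = " ".join(raw.split())  # trim and collapse repeated whitespace
    return CANONICAL_COLOURS.get(cleaned.lower(), cleaned)

# Placeholder label, not a real sale calendar: list the ISO weeks that
# overlapped a sale event in your research window.
SALE_WEEKS = {"2026-W01"}

def regime(week):
    """Tag a snapshot week so sale weeks can be excluded or analysed separately."""
    return "sale" if week in SALE_WEEKS else "regular"

print(canonical_colour("Dusty Sage"), regime("2026-W18"))
```

Applying `canonical_colour(j.get("primary_colour"))` while building each row in step 2 keeps missing colours as nulls, so the existing dropna still removes them before shares are computed.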

Frequently asked questions

What makes Myntra a good trend signal source?

Myntra carries 6,000+ brands and 5M+ live SKUs across apparel, footwear, beauty and accessories. Its assortment ramp leads the broader Indian fashion market by 4-8 weeks because brands stage new-season inventory on Myntra before pushing to AJIO, brand sites or offline. For trend research, that lead time is the entire value.

What fields are most useful for trend research?

Four: primary_colour (colour trend index), article_type (silhouette and category), brand (brand-share dynamics), and price (price-band positioning). Combined with rating_count as a demand proxy, these four fields support most trend questions. sizes and color_variants_count give assortment-depth signals.
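Price-band positioning is not computed in the four steps, so here is a minimal sketch of one way to derive it from the price field. The band edges and labels are hypothetical INR values chosen for illustration and should be tuned per category; the rows are synthetic.

```python
import pandas as pd

# Hypothetical band edges in INR; tune them per category.
BAND_EDGES = [0, 500, 1000, 2000, 5000, float("inf")]
BAND_LABELS = ["<500", "500-999", "1000-1999", "2000-4999", "5000+"]

# Synthetic rows (made up) in the shape of the step 2 DataFrame.
df = pd.DataFrame({
    "article_type": ["Kurtas"] * 5,
    "price": [399, 799, 899, 1499, 5200],
})

# right=False makes each band include its lower edge and exclude its upper edge.
df["price_band"] = pd.cut(
    df["price"], bins=BAND_EDGES, labels=BAND_LABELS, right=False
)

# observed=False keeps empty bands in the result so gaps stay visible.
density = (
    df.groupby(["article_type", "price_band"], observed=False)
      .size().reset_index(name="listings")
)
density["share"] = (
    density["listings"]
    / density.groupby("article_type")["listings"].transform("sum")
)
print(density)
```

A band with few or no listings, such as 2000-4999 in this toy data, is the kind of gap a private-label founder would then investigate further.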

What's the right sampling cadence for trend research?

Weekly snapshots of 5,000-10,000 products across 20-30 categories are sufficient for most trend dashboards. Higher cadence is wasted for trend questions because assortment ramps over weeks, not hours. Save the budget for breadth (more categories, deeper SKU coverage per category) rather than frequency.

How do I detect a trending colour or silhouette?

Aggregate (primary_colour, article_type) listing counts per weekly snapshot, compute the share of total listings, and look for week-over-week share growth above 20%. The signal is stronger when listing share growth co-moves with brand breadth — many brands ramping the same colour at once is a real trend, one brand ramping alone is a single-brand bet.

Can I use the rating signal as a demand proxy?

Yes, but carefully. rating_count is a lagging indicator: it accumulates over a product's lifetime and does not reflect current sales velocity. For a demand signal, compute the week-over-week delta in rating_count (new reviews acquired in the window) rather than the absolute level. Brand-new SKUs with zero rating_count need a separate analysis path.
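One simple way to give brand-new SKUs their own path is to flag the week in which each SKU first appears in your snapshots. The sketch below assumes the week labels are zero-padded ISO stamps (so string ordering matches time ordering) and uses synthetic rows; `is_new` is a column name introduced for illustration.

```python
import pandas as pd

# Synthetic rows (made up) in the shape of the step 2 DataFrame.
df = pd.DataFrame({
    "week": ["2026-W17", "2026-W18", "2026-W18"],
    "sku": ["A1", "A1", "B2"],
    "rating_count": [120, 150, 0],
})

# A SKU is "new" in the week it is first observed in the snapshot history.
first_seen = df.groupby("sku")["week"].transform("min")
df["is_new"] = df["week"] == first_seen

latest = df[df["week"] == df["week"].max()]
new_skus = latest[latest["is_new"]]       # no prior week, so no rating delta yet
established = latest[~latest["is_new"]]   # eligible for the rating_count delta
print(new_skus["sku"].tolist(), established["sku"].tolist())
```

Every SKU counts as new in the first snapshot of the research window, so this split only becomes meaningful from the second week onward.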

How does this compare to official India fashion-research reports?

RedSeer, McKinsey and Statista publish India fashion reports with quarterly to annual cadence and aggregate-level granularity. A Myntra trend pipeline gives you weekly cadence at the SKU level, which is the right granularity for category-bet decisions, planogram updates and new-launch timing. Treat the reports as macro context and the pipeline as operational signal.
