Scrape FirstCry Products for India Baby Care Research 2026
Pull FirstCry baby and kids products using Thirdwatch — name, brand, price, MRP, discount, rating count, image, URL. Pay per product. Python recipes.

Thirdwatch's FirstCry Scraper returns FirstCry product search and category results — name, brand, price, MRP, discount, rating count, image, and URL. Built for India baby-care researchers sizing categories, brand teams monitoring share-of-shelf, dropshippers scouting kids products, and analysts building India parenting-economy datasets.
Why scrape FirstCry for India baby care research
FirstCry is the dominant India baby, kids and maternity ecommerce platform. The parent company Brainbees Solutions completed its public listing on the BSE and NSE in August 2024, raising roughly INR 4,194 crore in its IPO — the first pure-play India baby-care platform to go public. According to the IBEF report on India ecommerce and category trackers like RedSeer, India's baby-and-kids retail category is on track to cross USD 30 billion by 2027, with online penetration accelerating fastest in Tier 2 and Tier 3 cities — exactly where FirstCry leads. For anyone serious about India parenting-economy data, FirstCry is the table-stakes source.
The job-to-be-done is structured. A baby-care brand tracks share-of-shelf across diapers, wipes and skincare to monitor its own placement vs Pampers, Huggies, Mamaearth and Himalaya. A D2C founder benchmarks pricing on a new product line against the closest 30 SKUs already on FirstCry. A consultancy builds a quarterly India baby-care category report and needs clean, repeatable raw inputs. A marketplace seller tracks competing FirstCry-listed SKUs to time pricing changes. Every one of these reduces to keyword or category queries plus structured product rows.
How does this compare to alternatives?
Three options for getting FirstCry product data into a research pipeline:
| Approach | Reliability | Setup time | Maintenance |
|---|---|---|---|
| FirstCry partner / affiliate access | Restricted, case-by-case | Weeks to months | Strict scope |
| Manual FirstCry browsing into spreadsheets | Low — analyst hours | Continuous | Doesn't scale |
| Thirdwatch FirstCry Scraper | Production-tested, with anti-bot tooling built in | 5 minutes | Thirdwatch tracks FirstCry changes |
FirstCry's own partner integrations exclude most third-party analytics workflows. The FirstCry Scraper actor gives you the public catalog at transparent per-result pricing, with no application process and no approval gate.
How to scrape FirstCry for baby-care research in 4 steps
Step 1: How do I authenticate against Apify?
Sign in at apify.com (free tier, no credit card needed to test), open Settings → Integrations, and copy your personal API token. Every example below assumes the token is in APIFY_TOKEN:
export APIFY_TOKEN="apify_api_xxxxxxxxxxxxxxxx"
Step 2: How do I pull a baby-care category sweep?
Pass keywords as queries and pick a category for the FirstCry top-level slot. The actor maps each query to firstcry.com/search?searchstring={query} and respects the category filter. The full enum of categories (baby-toys, baby-clothing, diapers, baby-feeding, baby-skincare, baby-gear-strollers, kids-clothing, kids-footwear, kids-toys, school-supplies, mom-care, gifts and more) is in the actor's input schema.
import datetime
import json
import os
import pathlib

import requests

ACTOR = "thirdwatch~firstcry-scraper"
TOKEN = os.environ["APIFY_TOKEN"]

# Top-level FirstCry category -> search keywords to sweep within it
CATEGORIES = {
    "diapers": ["diapers", "pant style diapers", "newborn diapers"],
    "baby-skincare": ["baby lotion", "baby shampoo", "baby massage oil"],
    "baby-feeding": ["feeding bottle", "sippy cup", "weaning food"],
    "baby-toys": ["rattles", "teether", "soft toy"],
    "mom-care": ["nursing pads", "stretch mark cream", "maternity belt"],
}

records = []
for category, queries in CATEGORIES.items():
    resp = requests.post(
        f"https://api.apify.com/v2/acts/{ACTOR}/run-sync-get-dataset-items",
        params={"token": TOKEN},
        json={
            "queries": queries,
            "category": category,
            "sortBy": "popularity",
            "maxResults": 100,
        },
        timeout=600,
    )
    resp.raise_for_status()  # fail loudly instead of parsing an error body as data
    chunk = resp.json()
    for r in chunk:
        r["_category"] = category  # tag each row with the category it was pulled under
    records.extend(chunk)

today = datetime.date.today().isoformat()
out = pathlib.Path(f"snapshots/firstcry-{today}.json")
out.parent.mkdir(parents=True, exist_ok=True)  # write_text fails if snapshots/ is missing
out.write_text(json.dumps(records))
print(f"{today}: {len(records)} products across {len(CATEGORIES)} categories")
Five categories × 3 keywords × 100 results = up to 1,500 records per sweep, enough for a representative baby-care snapshot per run.
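Overlapping keywords in one category (for example "diapers" and "newborn diapers") can plausibly return the same product more than once, so the raw count may overstate distinct SKUs. A minimal sketch of collapsing a sweep on url follows. It assumes each row carries the url and source_query fields shown in the sample output; the `_queries` field and the `dedupe_by_url` helper are hypothetical names introduced here, and whether the actor already deduplicates across queries is not something this guide confirms.

```python
def dedupe_by_url(records):
    """Collapse rows that share a product URL, keeping the first occurrence."""
    merged = {}
    for row in records:
        url = row.get("url")
        if not url:
            continue  # rows without a URL cannot be keyed, so they are skipped
        if url not in merged:
            # copy the first row seen and start a list of matching queries
            merged[url] = dict(row, _queries=[])
        query = row.get("source_query")
        if query and query not in merged[url]["_queries"]:
            merged[url]["_queries"].append(query)
    return list(merged.values())


# Illustrative rows only; the URLs are placeholders, not real FirstCry paths
sample = [
    {"url": "https://example.test/p/1", "source_query": "diapers"},
    {"url": "https://example.test/p/1", "source_query": "newborn diapers"},
    {"url": "https://example.test/p/2", "source_query": "diapers"},
    {"url": None, "source_query": "diapers"},
]
unique = dedupe_by_url(sample)
print(len(unique), unique[0]["_queries"])
```

Keeping the list of matching queries per product is a design choice: it shows which keywords surface a SKU without inflating the row count.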
Step 3: How do I parse INR prices and discounts?
FirstCry prices use the same Indian rupee comma format as other India marketplaces (₹1,299). Strip non-digits for analytics, and compute discount as (MRP - price) / MRP.
import re

import pandas as pd


def parse_inr(s):
    """Turn a display price like "₹1,299" into the integer 1299."""
    # Assumes whole-rupee prices; a value with paise (₹1,299.50) would need decimal handling
    if not s:
        return None
    digits = re.sub(r"[^\d]", "", str(s))
    return int(digits) if digits else None


df = pd.DataFrame(records)
# to_numeric keeps both columns numeric (NaN where missing) so the arithmetic below is safe
df["price_num"] = pd.to_numeric(df["price"].map(parse_inr))
df["mrp_num"] = pd.to_numeric(df["original_price"].map(parse_inr))
df["discount_pct"] = ((df.mrp_num - df.price_num) / df.mrp_num * 100).round(1)

# SKU count, median price and median discount per brand within each category
per_brand = (
    df.dropna(subset=["brand", "price_num"])
    .groupby(["_category", "brand"])
    .agg(skus=("product_name", "count"),
         median_price=("price_num", "median"),
         median_discount=("discount_pct", "median"))
    .sort_values(["_category", "skus"], ascending=[True, False])
)
print(per_brand.head(30))
This gives you a category × brand depth chart: how many SKUs each brand fields, where it prices, and how aggressively it discounts. The same pivot can roll up to brand-level India-wide share-of-shelf for a research report.
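One way to turn SKU counts into share-of-shelf is to divide each brand's SKU count by the category total. The sketch below works directly on the raw record list rather than the DataFrame. It assumes rows carry brand and the `_category` tag added in Step 2; `share_of_shelf` is a hypothetical helper, and counting SKUs is only one possible definition of shelf share (position-weighted variants are equally defensible).

```python
from collections import Counter, defaultdict


def share_of_shelf(rows):
    """Return {category: {brand: percent of that category's SKUs}}."""
    counts = defaultdict(Counter)
    for row in rows:
        brand, category = row.get("brand"), row.get("_category")
        if brand and category:  # rows missing either field are left out
            counts[category][brand] += 1
    shares = {}
    for category, brands in counts.items():
        total = sum(brands.values())
        shares[category] = {
            brand: round(100 * n / total, 1) for brand, n in brands.most_common()
        }
    return shares


# Illustrative rows only, not real catalog counts
rows = [
    {"_category": "diapers", "brand": "Pampers"},
    {"_category": "diapers", "brand": "Pampers"},
    {"_category": "diapers", "brand": "Huggies"},
    {"_category": "diapers", "brand": None},
    {"_category": "baby-skincare", "brand": "Himalaya"},
]
shares = share_of_shelf(rows)
print(shares["diapers"])
```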
Step 4: How do I build a longitudinal baby-care dataset?
Persist daily snapshots, dedupe on URL, and aggregate over time. The product URL is the canonical natural key (FirstCry product IDs embed in the path).
import glob
import json
import pathlib

import pandas as pd

frames = []
for f in sorted(glob.glob("snapshots/firstcry-*.json")):
    # The snapshot date is encoded in the filename: firstcry-YYYY-MM-DD.json
    date = pathlib.Path(f).stem.replace("firstcry-", "")
    for row in json.loads(pathlib.Path(f).read_text()):
        frames.append({
            "date": date,
            "url": row.get("url"),
            "product_name": row.get("product_name"),
            "brand": row.get("brand"),
            "category": row.get("_category"),
            "price": parse_inr(row.get("price")),  # parse_inr is defined in Step 3
            "rating_count": row.get("rating_count"),
        })

hist = (
    pd.DataFrame(frames)
    .dropna(subset=["url", "price"])
    .drop_duplicates(subset=["date", "url"])  # one row per product per snapshot
)
hist["date"] = pd.to_datetime(hist["date"])

# Category median price over time: the headline India baby-care index
idx = (hist.groupby(["date", "category"]).price.median()
           .unstack().sort_index())
idx.to_csv("firstcry_category_price_index.csv")
print(idx.tail(7))
The resulting CSV is the spine of an India baby-care price index. Add weights by rating_count (a usable popularity proxy when sales data isn't available) and you have a defensible category-tracking dataset.
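Weighting by rating_count can be sketched as a weighted median, so that heavily rated SKUs pull the category price more than long-tail listings. The helpers below are hypothetical and pure Python. Adding 1 to every weight is an assumption made here so that unrated SKUs still count a little; dropping them is an equally valid choice.

```python
def weighted_median(values, weights):
    """Smallest value at which the cumulative weight reaches half the total."""
    pairs = sorted((v, w) for v, w in zip(values, weights) if w > 0)
    if not pairs:
        return None
    half = sum(w for _, w in pairs) / 2
    running = 0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value


def weighted_category_price(rows):
    """Weighted median price for one category on one date."""
    priced = [r for r in rows if r.get("price") is not None]
    prices = [r["price"] for r in priced]
    # Assumption: add-one smoothing, so rows with no rating_count get weight 1
    weights = [(r.get("rating_count") or 0) + 1 for r in priced]
    return weighted_median(prices, weights)


# Illustrative rows: the plain median is 200, the weighted median is 100
rows = [
    {"price": 100, "rating_count": 80},
    {"price": 200, "rating_count": 10},
    {"price": 1000, "rating_count": None},
]
print(weighted_category_price(rows))
```

Applied per (date, category) group, this yields a popularity-weighted series alongside the unweighted median index.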
Sample output
A single FirstCry product record looks like this. Five rows weigh roughly 3 KB.
{
  "sku": "12345678",
  "product_id": "12345678",
  "product_name": "Pampers Premium Care Pant Style Diapers Medium - 76 Pieces",
  "brand": "Pampers",
  "price": "₹1,299",
  "original_price": "₹1,599",
  "discount_percent": 18,
  "currency": "INR",
  "rating": null,
  "rating_count": 4825,
  "image_url": "https://cdn.fcglcdn.com/brainbees/images/.../pampers-mp-76.jpg",
  "url": "https://www.firstcry.com/pampers/pampers-premium-care-pants-m-76/.../product-detail",
  "source_query": "diapers"
}
url is the canonical natural key for cross-snapshot dedup. brand makes brand-level rollups trivial. rating_count is the most useful popularity proxy on FirstCry: sales counts aren't exposed, but rating_count tracks demand closely for steady-state SKUs. original_price populated alongside price means the listing is on discount; an absent original_price means the displayed price is the full price.
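The original_price rule above can be written as a small per-record function. This is a sketch under stated assumptions: `to_rupees` and `effective_discount_pct` are hypothetical helpers, prices are assumed to be whole rupees, and the computed figure may differ slightly from the listed discount_percent depending on how the site rounds.

```python
import re


def to_rupees(text):
    """Parse a display price such as "₹1,299" into an int, or None if empty."""
    digits = re.sub(r"\D", "", str(text or ""))
    return int(digits) if digits else None


def effective_discount_pct(record):
    """Discount implied by price and original_price, in percent."""
    price = to_rupees(record.get("price"))
    mrp = to_rupees(record.get("original_price"))
    if price is None:
        return None  # no usable price on the row
    if mrp is None or mrp <= price:
        return 0.0  # no MRP shown, or MRP not above price: treat as full price
    return round((mrp - price) / mrp * 100, 1)


on_sale = {"price": "₹1,299", "original_price": "₹1,599"}
full_price = {"price": "₹499", "original_price": None}
print(effective_discount_pct(on_sale), effective_discount_pct(full_price))
```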
Common pitfalls
Three things go wrong in production FirstCry pipelines:
- MRP-vs-price interpretation. FirstCry sometimes shows the MRP equal to price (no actual discount) while still displaying a discount-style badge driven by bundle offers; trust the numeric (mrp - price) delta over any badge text.
- Brand-name normalization. The same brand appears as "Pampers", "Pampers India" and occasionally with a co-brand prefix; normalize via a small mapping table before brand-level rollups or you'll undercount the long tail of variants.
- Listing-vs-detail data depth. Listing rows are deliberately compact (no ingredients, no full description); for product-level research, layer a detail-page pass on the top N URLs you care about rather than expecting full specs from search results.
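The brand-normalization step can be as small as a lookup table keyed on a cleaned-up brand string. The alias entries below are illustrative assumptions, not a list taken from FirstCry; in practice you would build the table from the distinct brand values in your own snapshots.

```python
# Hypothetical alias table: lower-cased variant -> canonical brand name
BRAND_ALIASES = {
    "pampers": "Pampers",
    "pampers india": "Pampers",
    "huggies": "Huggies",
}


def normalize_brand(raw):
    """Map a raw brand string to its canonical name, if an alias is known."""
    if not raw:
        return None
    key = " ".join(raw.lower().split())  # lower-case and collapse whitespace
    return BRAND_ALIASES.get(key, raw.strip())  # unknown brands pass through


print(normalize_brand("Pampers India"), normalize_brand("  pampers "))
```

Unknown brands are returned unchanged rather than dropped, so gaps in the table show up as extra brand rows instead of silently missing data.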
Thirdwatch's actor uses production-grade anti-bot tooling and proxy rotation under the hood, running over India-residential network conditions so your FirstCry queries land like a real Indian shopper's. The architecture is pure HTTP with no browser, which keeps runs fast and the per-result cost low. Pair FirstCry with our Myntra Scraper for fashion-adjacent kids categories and our Flipkart Scraper for cross-marketplace India baby-care comparisons. For Amazon India coverage, plug in our Amazon Scraper.
A fourth, subtler issue: bundle SKUs (e.g., 3-pack diaper bundles) sit alongside single-pack SKUs in the same listing. For unit-economics research, compute per-unit price using the pack size in the title; otherwise bundle rows distort category medians.
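Per-unit pricing can be sketched by pulling a piece count out of the product title. The regular expression below is an assumption that covers titles shaped like the sample record ("... - 76 Pieces"); FirstCry title formats vary, so multi-pack phrasing such as "Pack of 3" would need its own pattern. `per_unit_price` is a hypothetical helper.

```python
import re

# Assumed pattern: a number followed by a piece-count word, e.g. "76 Pieces"
PACK_SIZE = re.compile(r"(\d+)\s*(?:pieces|piece|pcs|pc)\b", re.IGNORECASE)


def per_unit_price(title, price_rupees):
    """Price per piece, or None when the title carries no recognisable count."""
    match = PACK_SIZE.search(title or "")
    if not match or price_rupees is None:
        return None
    units = int(match.group(1))
    return round(price_rupees / units, 2) if units else None


title = "Pampers Premium Care Pant Style Diapers Medium - 76 Pieces"
print(per_unit_price(title, 1299))
```

Rows that return None are best kept out of per-unit medians rather than treated as single units.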
Related use cases
- Track FirstCry pricing on baby and kids products
- Monitor FirstCry deals and bestsellers
- Build an India baby-care market database with FirstCry
- Scrape Flipkart products for India ecommerce
- The complete guide to scraping ecommerce
- All Thirdwatch use-case guides
Frequently asked questions
How much does it cost to scrape FirstCry?
Thirdwatch's FirstCry Scraper uses transparent per-result pricing with volume tiers, so cost scales with how much data you pull and drops at higher tiers. A daily 20-keyword baby-care sweep at 100 products each (up to 2,000 results per day) is a typical steady-state research load; current pricing is published on the Apify Store listing.
Why scrape FirstCry specifically for India baby care?
FirstCry is India's largest baby, kids and maternity ecommerce platform with millions of monthly active shoppers. Its catalog covers diapers, baby clothing, toys, feeding, mom care and gear at a depth no general marketplace matches. For India baby-care research, brand monitoring, or category sizing, FirstCry is the canonical primary source.
What fields are returned per product?
Each product row includes sku, product_id, product_name, brand, price (INR), original_price (MRP), discount_percent, currency, rating, rating_count, image_url, url and source_query. Category and subcategory context come from your input (which top-level category and subcategory slug you scraped), so you can join rows back to your taxonomy without extra parsing.
How does this compare to FirstCry's own product API?
FirstCry does not publish a public product or affiliate API for general use. Their internal endpoints are gated, and partner integrations are case-by-case. Thirdwatch's actor reads the public catalog the same way a shopper does — no application process, no use-case restrictions beyond standard public-data norms.
How fresh is the data?
Each run pulls live from firstcry.com at request time. FirstCry prices and stock change daily on steady-state SKUs and hourly during sale events (Birthday Bash, end-of-season, festive). For active monitoring, schedule the actor at hourly cadence during announced sales and daily otherwise.
Why are some product fields empty?
FirstCry's listing omits MRP when a product is at full price (no discount displayed), and omits ratings on newly listed SKUs. The actor returns whatever the page shows. For analytics, treat null fields as missing-data flags rather than zero values — a brand-new SKU with no rating is structurally different from a low-rated one.
Run the FirstCry Scraper on Apify Store — pay per product, free to try, no credit card to test.