Scrape Noon Products for MENA Ecommerce (2026)

Thirdwatch's Noon Scraper returns Noon UAE + Saudi products at $0.002 per record — title, brand, price, rating, reviews, category, image. Built for MENA ecommerce research, Saudi-market analysis, UAE competitive intelligence, and cross-platform pricing-research vs Amazon UAE.

Why scrape Noon for MENA ecommerce research

Noon dominates MENA ecommerce. According to Noon's 2024 report, the platform processes $5B+ GMV annually across 6 MENA countries with PIF (Saudi sovereign fund) backing and strong logistics presence. For MENA ecommerce research, Saudi-market analysis, and UAE competitive intelligence, Noon is essential alongside Amazon UAE.

The job-to-be-done is structured. A MENA ecommerce operator monitors competitor pricing across 30 categories weekly. A Saudi-market research function studies category-mix patterns specific to Saudi consumers (vs UAE/global). A consumer-brand operator scopes Noon-listed competitors before launching MENA distribution. A cross-platform pricing-research function compares Noon vs Amazon UAE for arbitrage opportunities. All reduce to category + keyword queries + per-product detail extraction.

How does this compare to the alternatives?

Three options for Noon data:

Approach Cost per 10K records Reliability Setup time Maintenance
Noon Marketplace API (paid) $5K+/year (partnership) Official Days (approval) Per-tier license
Manual Noon browsing Effectively unbounded Low Continuous Doesn't scale
Thirdwatch Noon Scraper $20 ($0.002 × 10K) Camoufox + residential proxy 5 minutes Thirdwatch tracks Noon changes

Noon's first-party Marketplace API is gated behind partnership approval. The Noon Scraper actor page gives you raw competitor data at the lowest unit cost.

How to scrape Noon in 4 steps

Step 1: How do I authenticate against Apify?

Sign in at apify.com (free tier, no credit card), open Settings → Integrations, and copy your personal API token:

export APIFY_TOKEN="apify_api_xxxxxxxxxxxxxxxx"

Step 2: How do I pull category × country batches?

Pass keyword + country queries.

import os, requests, pandas as pd

ACTOR = "thirdwatch~noon-scraper"
TOKEN = os.environ["APIFY_TOKEN"]

NICHES = ["bluetooth speakers", "smartwatch",
          "kitchen appliances", "running shoes",
          "skincare set", "perfume oud",
          "abaya", "kandura"]

resp_uae = requests.post(
    f"https://api.apify.com/v2/acts/{ACTOR}/run-sync-get-dataset-items",
    params={"token": TOKEN},
    json={"queries": NICHES, "country": "uae", "maxResults": 100},
    timeout=900,
)
resp_sa = requests.post(
    f"https://api.apify.com/v2/acts/{ACTOR}/run-sync-get-dataset-items",
    params={"token": TOKEN},
    json={"queries": NICHES, "country": "saudi", "maxResults": 100},
    timeout=900,
)

uae = pd.DataFrame(resp_uae.json()).assign(country="UAE")
sa = pd.DataFrame(resp_sa.json()).assign(country="Saudi")
df = pd.concat([uae, sa], ignore_index=True)
print(f"{len(df)} products: UAE {len(uae)}, Saudi {len(sa)}")

8 niches × 100 × 2 countries = up to 1,600 records, costing $3.20.

Step 3: How do I parse AED/SAR pricing + filter quality?

Currency normalize + quality filter.

import re

def parse_price(s):
    if not isinstance(s, str): return None
    m = re.search(r"([\d,]+\.?\d*)", s.replace(",", ""))
    return float(m.group(1)) if m else None

df["price"] = df.price.apply(parse_price)
df["rating"] = pd.to_numeric(df.rating, errors="coerce")
df["review_count"] = pd.to_numeric(df.review_count, errors="coerce")
df["discount_pct"] = pd.to_numeric(df.discount_pct, errors="coerce")

quality = df[
    (df.rating >= 4.0)
    & (df.review_count >= 50)
    & df.price.notna()
].sort_values(["country", "price"])

per_niche = quality.groupby(["searchString", "country"]).agg(
    product_count=("title", "count"),
    median_price=("price", "median"),
    median_rating=("rating", "median"),
    median_discount=("discount_pct", "median"),
)
print(per_niche)

Per-niche per-country aggregates enable MENA market-research at category granularity.

Step 4: How do I detect deals during sale windows?

During Yellow Friday + White Friday, snapshot 6-hourly to catch flash deals.

import datetime, pathlib, json

ts = datetime.datetime.utcnow().strftime("%Y%m%d-%H")
out = pathlib.Path(f"snapshots/noon-{ts}.json")
out.parent.mkdir(parents=True, exist_ok=True)
out.write_text(df.to_json(orient="records"))

# Detect deep-discount events
deals = df[df.discount_pct >= 30].sort_values("discount_pct", ascending=False)
print(f"{len(deals)} products at 30%+ discount")

Deep-discount cohort during sale windows enables affiliate-marketing content + MENA deal-aggregator products.

Sample output

A single Noon product record looks like this. Five rows weigh ~6 KB.

{
  "product_id": "N53412345A",
  "title": "Anker Soundcore Bluetooth Speaker",
  "brand": "Anker",
  "price": "AED 199",
  "original_price": "AED 299",
  "discount_pct": 33,
  "rating": 4.6,
  "review_count": 1845,
  "category": "Electronics > Audio > Speakers",
  "seller_name": "Noon",
  "primary_image": "https://k.nooncdn.com/...",
  "url": "https://www.noon.com/uae-en/anker-soundcore-...",
  "country": "UAE",
  "is_express_delivery": true,
  "is_in_stock": true
}

product_id is the canonical natural key. is_express_delivery: true indicates Noon-fulfilled (equivalent to Prime), which converts at higher rates. seller_name: "Noon" distinguishes first-party (Noon-stocked) from third-party-marketplace listings.

Common pitfalls

Three things go wrong in Noon pipelines. Currency variance — Noon UAE displays AED; Noon Saudi displays SAR; cross-country comparison requires currency normalization (AED ÷ 0.97 ≈ SAR for parity reference). Akamai-induced retries — aggressive Akamai bot-detection requires browser-restart strategy after every ~2 queries; sustained throughput is lower than typical scrapers. Multi-language listings — Noon mixes English + Arabic titles + descriptions; for English-only research, filter by character-set or use language-detection on title field.

Thirdwatch's actor uses Camoufox + residential proxy at $1.10/1K, ~91% margin. Pair Noon with Amazon Scraper for cross-platform UAE/Saudi pricing intelligence. A fourth subtle issue worth flagging: Noon's first-party "Noon-stocked" inventory shows materially different pricing + delivery from third-party-marketplace inventory; for accurate cross-platform comparisons, segment first-party vs third-party listings separately. A fifth pattern unique to MENA ecommerce: Ramadan + Hajj + Eid drive seasonal demand spikes (groceries, religious goods, tourism-related products) that don't follow Western retail seasonality. For accurate trend research, deseasonalize against Hijri-calendar cycles rather than Gregorian. A sixth and final pitfall: VAT-inclusion varies — UAE VAT (5%) typically included in displayed price; Saudi VAT (15%) sometimes excluded. For accurate cross-country effective-price analysis, normalize VAT-treatment per country rather than treating displayed prices as comparable.

Operational best practices for production pipelines

Tier the cadence to match signal half-life. Noon product data changes weekly during steady-state — daily polling is over-frequent for most use cases. 60-80% cost reduction with negligible signal loss when watchlist is properly tiered. During sale windows (Yellow Friday, Eid), tighten to daily or 6-hourly.

Snapshot raw payloads. Pipeline cost is dominated by scrape volume, not storage. Persisting raw JSON snapshots lets you re-derive metrics — particularly useful as your category-classifier or VAT-normalization logic evolves.

Schema validation. Run a daily validation suite asserting expected core fields with non-null rates above 80% (required) and 50% (optional). Noon schema occasionally changes during platform UI revisions — catch drift early before downstream consumers degrade silently. A seventh and final operational pattern at production scale: cross-snapshot diff alerts. Beyond detecting individual changes, build alerts on cross-snapshot field-level diffs — name changes, category re-classifications, ownership-transfers, status changes. These structural changes precede or follow material events (acquisitions, rebrands, regulatory issues, leadership departures) and are leading indicators of organization-level disruption. Persist a structured-diff log alongside aggregate snapshots: for each entity, persist (field, old_value, new_value) tuples per scrape. Surface high-leverage diffs (name changes, category re-classifications, status updates) to human reviewers; low-leverage diffs (single-record additions, minor count updates) stay in the audit log.

An eighth pattern worth flagging for cost-controlled teams: implement an incremental-diff pipeline that only re-processes records whose hash changed since the previous snapshot. For watchlists where 90%+ of records are unchanged between snapshots, hash-comparison-driven incremental processing reduces downstream-compute by 80-90% while preserving full data fidelity. Combine with snapshot-storage compression for end-to-end pipeline-cost reductions of 70%+ at scale.

Related use cases

Frequently asked questions

Why scrape Noon for MENA ecommerce research?

Noon is the dominant homegrown ecommerce platform in UAE + Saudi Arabia — backed by PIF (Saudi sovereign fund) with 50M+ products across electronics, fashion, beauty, home, and groceries. According to Noon's 2024 report, the platform processes $5B+ GMV annually with strong logistics presence across 6 MENA countries. For MENA ecommerce research, Saudi-market analysis, and UAE competitive intelligence, Noon is essential alongside Amazon UAE.

What data does the actor return?

Per product: title, brand, price (AED or SAR), original price, discount percentage, rating, review count, primary image, category, seller name, product ID, availability flag. About 95% of Noon products have comprehensive metadata.

How does Noon handle anti-bot defenses?

Noon uses Akamai Bot Manager. Thirdwatch's actor uses Camoufox stealth-browser + residential proxy. Production-tested with 90-95% success rate. Akamai blocks aggressive after 2 queries; the actor restarts browser sessions strategically to maintain coverage. Sustained polling: 100 products/minute per proxy IP.

How does Noon compare to Amazon UAE?

Amazon UAE has stronger international-brand coverage; Noon has stronger MENA-specific brands + Saudi-tailored selection. For UAE pricing comparisons, run both — typically 30-40% non-overlap. Noon often has lower headline pricing on local-brand items; Amazon UAE wins on Prime-fulfilled international items. Saudi market favors Noon (lower import friction).

Can I detect rising products / deal events?

Yes. Track per-product price + rating + sales velocity over snapshots. Noon's mid-year sale ("Yellow Friday") drives 30-50% discount density across categories — daily cadence during sale windows catches flash deals. For year-round research, weekly snapshots are sufficient.

How fresh does Noon data need to be?

For active MENA pricing-research, weekly cadence is sufficient. For sale-window monitoring (Yellow Friday in November), daily cadence catches promotional pricing. For competitive-positioning research vs Amazon UAE, weekly cross-platform snapshots produce stable comparisons. For one-off market-research, single comprehensive scrapes suffice.

Run the Noon Scraper on Apify Store — pay-per-record, free to try, no credit card to test.