
Build a TikTok Shop France Product Database With the API

Build a structured TikTok Shop France product database using the Apify API. Python and Node.js examples with schema design and incremental sync logic.

May 26, 2026 · 6 min read · 1,389 words


Thirdwatch's TikTok Shop France Scraper provides a structured API for extracting product data from TikTok Shop's French marketplace: product IDs, prices in EUR, seller names, ratings, review counts, and units sold. It is aimed at developers building product comparison engines, price tracking services, social commerce analytics dashboards, or any application that needs a reliable TikTok Shop France data feed without maintaining scraping infrastructure.

Why build a TikTok Shop France product database

TikTok Shop France is one of the fastest-growing e-commerce surfaces in Europe. According to eMarketer's 2025 social commerce forecast, TikTok Shop's gross merchandise value in Western Europe grew 340% year-over-year in 2025, with France as the second-largest market after the UK. For developers, this means a growing data source that product teams, analytics platforms, and comparison engines need programmatic access to. According to Statista's social commerce forecast for France, French social commerce revenue is projected to exceed 5 billion EUR by 2027, with TikTok Shop as the fastest-growing contributor.

The developer use cases are straightforward. A price-comparison startup needs nightly product feeds from TikTok Shop France alongside Amazon.fr and Cdiscount to show users the cheapest option. An analytics SaaS ingests product records into a warehouse to power seller performance dashboards for brand clients. A dropshipping tool needs product discovery with price, rating, and sold-count signals to surface profitable niches. An internal tool at a D2C brand monitors competitor product launches and pricing on TikTok Shop weekly. All of these need the same thing: a reliable API that returns structured JSON, a predictable schema, and incremental sync logic.

How does this compare to the alternatives?

Approach	Schema stability	Anti-bot handling	Incremental sync	Setup time
DIY Playwright scraper	You maintain it	You build it	You implement it	2-4 weeks
Generic web scraper SaaS	Varies	Basic	Manual	Hours
TikTok Shop Seller API	Official but seller-scoped	N/A	Webhooks	Days (approval)
Thirdwatch TikTok Shop France Scraper	Stable, documented fields	Handled	product_id-based dedup	10 minutes

TikTok's official Seller API requires merchant approval and only returns data for your own store. For competitive intelligence or cross-marketplace product databases, a scraping API is the standard approach. The TikTok Shop France Scraper returns a stable schema and handles extraction complexity so your pipeline code stays clean.

How to build a TikTok Shop France product database in 6 steps

Step 1: How do I set up authentication?

Create a free Apify account at apify.com, go to Settings, then Integrations, and copy your API token. Install the client library for your language:

# Python
pip install apify-client

# Node.js
npm install apify-client

Step 2: How do I run the scraper and retrieve results?

Python example — synchronous call that blocks until the run finishes:

from apify_client import ApifyClient

client = ApifyClient("apify_api_xxxxxxxxxxxxxxxx")

run = client.actor("thirdwatch/tiktok-shop-france-scraper").call(
    run_input={
        "queries": ["soin visage", "coque telephone", "accessoire mode"],
        "maxResults": 300,
    }
)

items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
print(f"Retrieved {len(items)} products")

Node.js equivalent:

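const { ApifyClient } = require("apify-client");

const client = new ApifyClient({ token: "apify_api_xxxxxxxxxxxxxxxx" });

// CommonJS modules do not allow top-level await, so the calls live in an async function.
async function main() {
  // call() starts the actor and resolves once the run has finished.
  const run = await client.actor("thirdwatch/tiktok-shop-france-scraper").call({
    queries: ["soin visage", "coque telephone", "accessoire mode"],
    maxResults: 300,
  });

  // Read the products from the run's default dataset.
  const { items } = await client.dataset(run.defaultDatasetId).listItems();
  console.log(`Retrieved ${items.length} products`);
}

main().catch(console.error);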

Step 3: How do I design the database schema?

Use product_id as the primary key. It is TikTok Shop's stable internal identifier and survives URL changes.

CREATE TABLE tiktok_shop_france_products (
    product_id       TEXT PRIMARY KEY,
    product_name     TEXT NOT NULL,
    seller_name      TEXT,
    price            NUMERIC(10, 2),
    original_price   NUMERIC(10, 2),
    currency         TEXT DEFAULT 'EUR',
    rating           NUMERIC(2, 1),
    review_count     INTEGER DEFAULT 0,
    sold_count       INTEGER DEFAULT 0,
    category         TEXT,
    image_url        TEXT,
    url              TEXT,
    source_query     TEXT,
    marketplace      TEXT DEFAULT 'TikTok Shop France',
    first_seen_at    TIMESTAMP DEFAULT NOW(),
    last_updated_at  TIMESTAMP DEFAULT NOW()
);

CREATE INDEX idx_seller ON tiktok_shop_france_products(seller_name);
CREATE INDEX idx_category ON tiktok_shop_france_products(category);
CREATE INDEX idx_price ON tiktok_shop_france_products(price);

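The first_seen_at and last_updated_at columns track product lifecycle: when a product first appeared and when a sync last saw and refreshed it. The upsert in Step 4 sets last_updated_at on every sync, so it records the most recent sighting rather than the last price or rating change.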

Step 4: How do I implement incremental upsert logic?

On each run, insert new products and update the changing fields of products that already exist.

import psycopg2
from psycopg2.extras import execute_values

conn = psycopg2.connect("postgresql://user:pass@localhost/tiktokshop")
cur = conn.cursor()

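# Deduplicate by product_id first: overlapping queries can return the same
# product, and PostgreSQL rejects an ON CONFLICT DO UPDATE batch that touches
# the same row twice.
unique_items = {item["product_id"]: item for item in items}

values = [
    (
        # .get() yields None (or the given default) when an optional field is missing.
        item["product_id"], item["product_name"], item.get("seller_name"),
        item.get("price"), item.get("original_price"), item.get("currency", "EUR"),
        item.get("rating"), item.get("review_count", 0), item.get("sold_count", 0),
        item.get("category"), item.get("image_url"), item.get("url"),
        item.get("source_query"), item.get("marketplace", "TikTok Shop France"),
    )
    for item in unique_items.values()
]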

execute_values(
    cur,
    """
    INSERT INTO tiktok_shop_france_products
        (product_id, product_name, seller_name, price, original_price,
         currency, rating, review_count, sold_count, category,
         image_url, url, source_query, marketplace)
    VALUES %s
    ON CONFLICT (product_id) DO UPDATE SET
        price = EXCLUDED.price,
        original_price = EXCLUDED.original_price,
        rating = EXCLUDED.rating,
        review_count = EXCLUDED.review_count,
        sold_count = EXCLUDED.sold_count,
        last_updated_at = NOW()
    """,
    values,
)

conn.commit()
print(f"Upserted {len(values)} products")

The ON CONFLICT clause means running the same queries twice never creates duplicates — it updates price, rating, and sold metrics instead.
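The idempotence described above can be checked without a PostgreSQL server. The following is a minimal sketch using Python's built-in sqlite3 module, which accepts the same ON CONFLICT ... DO UPDATE syntax (SQLite 3.24 or later). The trimmed-down products table, the upsert helper, and the sample rows are assumptions for illustration, not the production schema.

```python
import sqlite3

# In-memory stand-in for the PostgreSQL table; only a few columns are kept.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE products (
        product_id   TEXT PRIMARY KEY,
        product_name TEXT NOT NULL,
        price        REAL,
        sold_count   INTEGER DEFAULT 0
    )
    """
)

UPSERT_SQL = """
    INSERT INTO products (product_id, product_name, price, sold_count)
    VALUES (?, ?, ?, ?)
    ON CONFLICT (product_id) DO UPDATE SET
        price = excluded.price,
        sold_count = excluded.sold_count
"""


def upsert(rows):
    """Insert new products and refresh price and sold_count for known ones."""
    conn.executemany(UPSERT_SQL, rows)
    conn.commit()


# First sync: two new products (sample values are made up).
upsert([("p1", "Serum", 9.99, 100), ("p2", "Lampe", 14.50, 50)])
# Second sync: p1 reappears with a new price and sold count.
upsert([("p1", "Serum", 8.99, 130)])

row_count = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]
p1_price, p1_sold = conn.execute(
    "SELECT price, sold_count FROM products WHERE product_id = 'p1'"
).fetchone()
print(row_count, p1_price, p1_sold)  # 2 8.99 130
```

The table still holds two rows after the second sync, and p1 carries the refreshed values.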

Step 5: How do I track price history?

Add a separate price-history table for time-series analysis.

CREATE TABLE tiktok_shop_france_price_history (
    product_id    TEXT REFERENCES tiktok_shop_france_products(product_id),
    price         NUMERIC(10, 2),
    recorded_at   TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (product_id, recorded_at)
);

Insert a price snapshot on each sync, but only when the price changed:

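for item in items:
    # Skip records without a price; there is nothing to snapshot.
    if item.get("price") is None:
        continue
    cur.execute(
        """
        INSERT INTO tiktok_shop_france_price_history (product_id, price)
        SELECT %s, %s
        WHERE NOT EXISTS (
            -- Compare against the most recent snapshot only.
            SELECT 1 FROM (
                SELECT price FROM tiktok_shop_france_price_history
                WHERE product_id = %s
                ORDER BY recorded_at DESC LIMIT 1
            ) AS latest
            WHERE latest.price = %s
        )
        """,
        (item["product_id"], item["price"],
         item["product_id"], item["price"]),
    )
conn.commit()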
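The change-only rule can be tried out locally. The sketch below uses sqlite3 in place of PostgreSQL and passes explicit timestamps so the run is deterministic. The record_price function, the simplified price_history table, and the sample values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE price_history (
        product_id  TEXT,
        price       REAL,
        recorded_at TEXT,
        PRIMARY KEY (product_id, recorded_at)
    )
    """
)


def record_price(product_id, price, recorded_at):
    """Store a snapshot only if it differs from the latest stored price."""
    latest = conn.execute(
        "SELECT price FROM price_history WHERE product_id = ? "
        "ORDER BY recorded_at DESC LIMIT 1",
        (product_id,),
    ).fetchone()
    if latest is not None and latest[0] == price:
        return False  # unchanged since the last snapshot
    conn.execute(
        "INSERT INTO price_history VALUES (?, ?, ?)",
        (product_id, price, recorded_at),
    )
    conn.commit()
    return True


results = [
    record_price("p1", 9.99, "2026-05-01T02:00:00"),  # first sighting, stored
    record_price("p1", 9.99, "2026-05-02T02:00:00"),  # same price, skipped
    record_price("p1", 7.99, "2026-05-03T02:00:00"),  # price drop, stored
]
history = conn.execute(
    "SELECT price FROM price_history ORDER BY recorded_at"
).fetchall()
print(results, history)  # [True, False, True] [(9.99,), (7.99,)]
```

This version reads the latest price and then writes in two steps, which is fine for a single nightly sync process; concurrent writers would want the single-statement form.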

Step 6: How do I schedule automated syncs?

Use Apify's scheduler to trigger nightly runs. Set a cron expression like 0 2 * * * (2 AM UTC daily). The actor stores results in a dataset that your pipeline fetches via the API. Alternatively, use Apify's webhook integration to trigger your upsert script when a run completes — no polling required.

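# Webhook handler (Flask example)
from apify_client import ApifyClient
from flask import Flask, request

app = Flask(__name__)
client = ApifyClient("apify_api_xxxxxxxxxxxxxxxx")

@app.route("/webhook/tiktok-shop-sync", methods=["POST"])
def handle_sync():
    payload = request.get_json()
    # "resource" holds the finished run object, including its dataset ID.
    dataset_id = payload["resource"]["defaultDatasetId"]
    items = list(client.dataset(dataset_id).iterate_items())
    upsert_products(items)  # the Step 4 upsert logic, wrapped in a function
    return {"synced": len(items)}, 200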
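A webhook endpoint may also receive notifications for runs that failed, depending on which events the webhook is subscribed to. As a hedged sketch, the helper below extracts the dataset ID only for succeeded runs. It assumes Apify's default webhook payload template, where eventType and resource are top-level keys; the function name dataset_id_from_webhook and the sample payloads are hypothetical.

```python
from typing import Optional


def dataset_id_from_webhook(payload: dict) -> Optional[str]:
    """Return the dataset ID for a succeeded run, or None for anything else."""
    # Assumption: the default payload template is in use, so the event type
    # is available under "eventType".
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")


# Made-up payloads for illustration.
succeeded = {
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "resource": {"defaultDatasetId": "abc123"},
}
failed = {
    "eventType": "ACTOR.RUN.FAILED",
    "resource": {"defaultDatasetId": "abc123"},
}

print(dataset_id_from_webhook(succeeded))  # abc123
print(dataset_id_from_webhook(failed))     # None
```

A handler can return early with a 200 response when the helper yields None, so failed runs never reach the upsert step.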

Sample output

Two representative product records from the API:

[
  {
    "product_id": "1729384756012345",
    "product_name": "Serum Acide Hyaluronique Hydratant Anti-Age",
    "seller_name": "DermaGlow FR",
    "price": 9.99,
    "original_price": 19.99,
    "currency": "EUR",
    "rating": 4.6,
    "review_count": 2341,
    "sold_count": 12450,
    "category": "Soin Visage",
    "image_url": "https://p16-oec-va.ibyteimg.com/tos-maliva-i-xxxx/serum-ha.jpg",
    "url": "https://www.tiktok.com/view/product/1729384756012345",
    "source_query": "soin visage",
    "marketplace": "TikTok Shop France"
  },
  {
    "product_id": "1729384756054321",
    "product_name": "Lampe LED Bureau Rechargeable USB Pliable",
    "seller_name": "LumiTech Store",
    "price": 14.50,
    "original_price": 14.50,
    "currency": "EUR",
    "rating": 4.2,
    "review_count": 189,
    "sold_count": 876,
    "category": "Maison",
    "image_url": "https://p16-oec-va.ibyteimg.com/tos-maliva-i-xxxx/lampe-led.jpg",
    "url": "https://www.tiktok.com/view/product/1729384756054321",
    "source_query": "lampe bureau",
    "marketplace": "TikTok Shop France"
  }
]

product_id is your dedup key. original_price equal to price means no active discount. sold_count combined with review_count gives you the review conversion rate — useful for detecting products with unusually low or high review rates.
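The two derived signals mentioned above can be computed directly from a record. This is a minimal sketch; the function names review_rate and discount_pct are illustrative, and the zero and missing-value handling is an assumption about how you may want to treat incomplete records.

```python
from typing import Optional


def review_rate(item: dict) -> Optional[float]:
    """Reviews per unit sold, or None when nothing has been sold yet."""
    sold = item.get("sold_count") or 0
    if sold <= 0:
        return None
    return (item.get("review_count") or 0) / sold


def discount_pct(item: dict) -> float:
    """Discount as a percentage of original_price; 0.0 when there is none."""
    price = item.get("price")
    original = item.get("original_price")
    if price is None or not original or original <= price:
        return 0.0
    return round((original - price) / original * 100, 1)


# Values taken from the two sample records above.
serum = {"price": 9.99, "original_price": 19.99,
         "review_count": 2341, "sold_count": 12450}
lamp = {"price": 14.50, "original_price": 14.50,
        "review_count": 189, "sold_count": 876}

print(round(review_rate(serum), 3), discount_pct(serum))  # 0.188 50.0
print(round(review_rate(lamp), 3), discount_pct(lamp))    # 0.216 0.0
```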

Common pitfalls

Four things that break TikTok Shop France data pipelines in production.

Schema assumptions on optional fields. Some products lack original_price (never discounted) or have zero review_count (new listings); your upsert logic should handle nulls and zeros gracefully, not crash on missing keys.

UTF-8 encoding for French text. Product names contain accented characters (acute accents, cedillas, ligatures); ensure your database, CSV exports, and API responses use UTF-8 throughout, or you get garbled names in downstream UIs.

Sold count is cumulative. sold_count reflects lifetime sales, not sales since your last sync; to compute daily or weekly sales velocity, subtract the previous snapshot's value from the current one. That requires storing sold_count snapshots over time, for example by extending the price-history pattern above with a sold_count column.
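Because sold_count is cumulative, velocity comes from the difference between snapshots. The sketch below assumes you keep (timestamp, sold_count) pairs per product; the function name daily_sales_velocity and the sample numbers are hypothetical.

```python
from datetime import datetime
from typing import List, Optional, Tuple


def daily_sales_velocity(
    snapshots: List[Tuple[datetime, int]],
) -> Optional[float]:
    """Average units sold per day between the oldest and newest snapshot."""
    if len(snapshots) < 2:
        return None  # a single snapshot says nothing about velocity
    ordered = sorted(snapshots)
    (start_time, start_sold), (end_time, end_sold) = ordered[0], ordered[-1]
    days = (end_time - start_time).total_seconds() / 86400
    if days <= 0:
        return None
    return (end_sold - start_sold) / days


# Made-up snapshots for one product, one week apart.
snapshots = [
    (datetime(2026, 5, 8, 2, 0), 12450),
    (datetime(2026, 5, 1, 2, 0), 12000),
]
velocity = daily_sales_velocity(snapshots)
print(round(velocity, 1))  # 64.3
```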

A fourth issue, specific to incremental pipelines: category strings are not stable. TikTok Shop product URLs and product_id values are stable, but category strings may change as TikTok reclassifies products. If your analytics depend on category groupings, maintain a category-mapping table that normalizes variations (e.g., "Soin du Visage" vs "Soin Visage") rather than trusting the raw category string as a stable key.

One more pattern worth implementing early is dead-product detection. Products delisted from TikTok Shop stop appearing in search results. Track last_updated_at and flag products not seen in 14+ days as potentially delisted, so your database keeps reflecting active inventory.
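Both ideas, category normalization and dead-product detection, fit in a few lines. This is a sketch under stated assumptions: CATEGORY_MAP is a hypothetical in-memory stand-in for the category-mapping table, and the function names are illustrative.

```python
import unicodedata
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical mapping from normalized raw strings to canonical categories.
CATEGORY_MAP = {
    "soin visage": "Soin Visage",
    "soin du visage": "Soin Visage",
}


def normalize_category(raw: Optional[str]) -> Optional[str]:
    """Map a raw category string to its canonical name, if one is known."""
    if not raw:
        return None
    # Drop accents, case, and extra whitespace before the lookup.
    decomposed = unicodedata.normalize("NFKD", raw)
    unaccented = "".join(c for c in decomposed if not unicodedata.combining(c))
    key = " ".join(unaccented.casefold().split())
    return CATEGORY_MAP.get(key, raw.strip())


def is_potentially_delisted(
    last_updated_at: datetime, now: datetime, max_age_days: int = 14
) -> bool:
    """True when the product has not been seen for max_age_days or more."""
    return now - last_updated_at >= timedelta(days=max_age_days)


print(normalize_category("Soin du Visage"))    # Soin Visage
print(normalize_category("  SOIN   visage "))  # Soin Visage
print(normalize_category("Maison"))            # Maison
print(is_potentially_delisted(datetime(2026, 5, 1), datetime(2026, 5, 26)))  # True
```

Unknown categories pass through unchanged, so new strings show up in reports and can be added to the mapping later.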

Frequently asked questions

What is the output schema for each product record?

Each record contains product_id (string), product_name (string), seller_name (string), price (float, EUR), original_price (float, EUR), currency (string, always 'EUR'), rating (float, 0-5), review_count (integer), sold_count (integer), category (string), image_url (string), url (string), source_query (string), and marketplace (string, always 'TikTok Shop France'). product_id is the stable unique key for deduplication.
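One way to pin this schema down in code is a dataclass with defaults for the fields that may be missing. This is a sketch, not an official client type: the ProductRecord name, the from_item helper, and the choice of which fields are optional are assumptions based on the field list above.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ProductRecord:
    product_id: str
    product_name: str
    seller_name: Optional[str] = None
    price: Optional[float] = None
    original_price: Optional[float] = None
    currency: str = "EUR"
    rating: Optional[float] = None
    review_count: int = 0
    sold_count: int = 0
    category: Optional[str] = None
    image_url: Optional[str] = None
    url: Optional[str] = None
    source_query: Optional[str] = None
    marketplace: str = "TikTok Shop France"

    @classmethod
    def from_item(cls, item: dict) -> "ProductRecord":
        """Build a record, ignoring unknown keys and falling back to
        defaults for keys that are missing or null."""
        names = {f.name for f in fields(cls)}
        known = {k: v for k, v in item.items() if k in names and v is not None}
        return cls(**known)


# Made-up partial record: no original_price, null review_count, one extra key.
record = ProductRecord.from_item(
    {"product_id": "1", "product_name": "Serum", "price": 9.99,
     "review_count": None, "unexpected_field": True}
)
print(record.review_count, record.original_price, record.currency)  # 0 None EUR
```

A record without product_id or product_name raises a TypeError, which surfaces malformed items early instead of writing them to the database.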

Can I run the actor asynchronously and poll for completion?

Yes. Use the Apify API's async run endpoint (POST /v2/acts/{actorId}/runs), which returns a run ID immediately. Poll GET /v2/actor-runs/{runId} until the status is terminal ('SUCCEEDED', 'FAILED', 'ABORTED', or 'TIMED-OUT'), and fetch dataset items only when it is 'SUCCEEDED'. The apify-client libraries handle this for you: client.actor(...).start() followed by client.run(run_id).wait_for_finish() in Python, or waitForFinish() in Node.js.
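If you poll the REST endpoint yourself, the loop needs to stop on any terminal status and give up after a timeout. The sketch below shows that control flow only. The wait_for_run name is hypothetical, and get_status stands in for whatever function you write to call GET /v2/actor-runs/{runId} and return the run's status string; here it is faked so the example runs offline.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "ABORTED", "TIMED-OUT"}


def wait_for_run(
    get_status: Callable[[], str],
    poll_interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until the run reaches a terminal status."""
    waited = 0.0
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout:
            raise TimeoutError(f"run still {status} after {timeout} seconds")
        sleep(poll_interval)
        waited += poll_interval


# Fake status source standing in for the HTTP call.
fake_statuses = iter(["READY", "RUNNING", "RUNNING", "SUCCEEDED"])
final_status = wait_for_run(lambda: next(fake_statuses), sleep=lambda _: None)
print(final_status)  # SUCCEEDED
```

Passing sleep as a parameter keeps the loop testable without real delays; in production the default time.sleep applies.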

Related use cases

Scrape TikTok Shop France Products for Research (2026 Guide)
Monitor TikTok Shop France Pricing Trends and Discounts
Find Trending TikTok Shop France Sellers for Sourcing 2026
