Build a TikTok Shop France Product Database With the API
Build a structured TikTok Shop France product database using the Apify API. Python and Node.js examples with schema design and incremental sync logic.

Thirdwatch's TikTok Shop France Scraper provides a structured API for extracting product data from TikTok Shop's French marketplace: product IDs, prices in EUR, seller names, ratings, review counts, and sold units. It is aimed at developers building product comparison engines, price tracking services, social commerce analytics dashboards, or any other application that needs a reliable TikTok Shop France data feed without maintaining scraping infrastructure.
Why build a TikTok Shop France product database
TikTok Shop France is one of the fastest-growing e-commerce surfaces in Europe. According to eMarketer's 2025 social commerce forecast, TikTok Shop's gross merchandise value in Western Europe grew 340% year-over-year in 2025, with France as the second-largest market after the UK. For developers, this means a growing data source that product teams, analytics platforms, and comparison engines need programmatic access to. According to Statista's social commerce forecast for France, French social commerce revenue is projected to exceed 5 billion EUR by 2027, with TikTok Shop as the fastest-growing contributor.
The developer use cases are straightforward. A price-comparison startup needs nightly product feeds from TikTok Shop France alongside Amazon.fr and Cdiscount to show users the cheapest option. An analytics SaaS ingests product records into a warehouse to power seller performance dashboards for brand clients. A dropshipping tool needs product discovery with price, rating, and sold-count signals to surface profitable niches. An internal tool at a D2C brand monitors competitor product launches and pricing on TikTok Shop weekly. All of these need the same thing: a reliable API that returns structured JSON, a predictable schema, and incremental sync logic.
How does this compare to the alternatives?
| Approach | Schema stability | Anti-bot handling | Incremental sync | Setup time |
|---|---|---|---|---|
| DIY Playwright scraper | You maintain it | You build it | You implement it | 2-4 weeks |
| Generic web scraper SaaS | Varies | Basic | Manual | Hours |
| TikTok Shop Seller API | Official but seller-scoped | N/A | Webhooks | Days (approval) |
| Thirdwatch TikTok Shop France Scraper | Stable, documented fields | Handled | product_id-based dedup | 10 minutes |
TikTok's official Seller API requires merchant approval and only returns data for your own store. For competitive intelligence or cross-marketplace product databases, a scraping API is the standard approach. The TikTok Shop France Scraper returns a stable schema and handles extraction complexity so your pipeline code stays clean.
How to build a TikTok Shop France product database in 6 steps
Step 1: How do I set up authentication?
Create a free Apify account at apify.com, go to Settings, then Integrations, and copy your API token. Install the client library for your language:
# Python
pip install apify-client
# Node.js
npm install apify-client
Step 2: How do I run the scraper and retrieve results?
Python example — synchronous call that blocks until the run finishes:
from apify_client import ApifyClient

client = ApifyClient("apify_api_xxxxxxxxxxxxxxxx")
run = client.actor("thirdwatch/tiktok-shop-france-scraper").call(
    run_input={
        "queries": ["soin visage", "coque telephone", "accessoire mode"],
        "maxResults": 300,
    }
)
items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
print(f"Retrieved {len(items)} products")
Node.js equivalent:
const { ApifyClient } = require("apify-client");

const client = new ApifyClient({ token: "apify_api_xxxxxxxxxxxxxxxx" });

// CommonJS has no top-level await, so the calls run inside an async function.
async function main() {
  const run = await client.actor("thirdwatch/tiktok-shop-france-scraper").call({
    queries: ["soin visage", "coque telephone", "accessoire mode"],
    maxResults: 300,
  });
  const { items } = await client.dataset(run.defaultDatasetId).listItems();
  console.log(`Retrieved ${items.length} products`);
}
main().catch(console.error);
Step 3: How do I design the database schema?
Use product_id as the primary key. It is TikTok Shop's stable internal identifier and survives URL changes.
CREATE TABLE tiktok_shop_france_products (
    product_id TEXT PRIMARY KEY,
    product_name TEXT NOT NULL,
    seller_name TEXT,
    price NUMERIC(10, 2),
    original_price NUMERIC(10, 2),
    currency TEXT DEFAULT 'EUR',
    rating NUMERIC(2, 1),
    review_count INTEGER DEFAULT 0,
    sold_count INTEGER DEFAULT 0,
    category TEXT,
    image_url TEXT,
    url TEXT,
    source_query TEXT,
    marketplace TEXT DEFAULT 'TikTok Shop France',
    first_seen_at TIMESTAMP DEFAULT NOW(),
    last_updated_at TIMESTAMP DEFAULT NOW()
);
CREATE INDEX idx_seller ON tiktok_shop_france_products(seller_name);
CREATE INDEX idx_category ON tiktok_shop_france_products(category);
CREATE INDEX idx_price ON tiktok_shop_france_products(price);
The first_seen_at and last_updated_at columns track product lifecycle: when the product first appeared and when a sync last refreshed its row.
Step 4: How do I implement incremental upsert logic?
On each run, upsert new records and update changed fields for existing products.
import psycopg2
from psycopg2.extras import execute_values

conn = psycopg2.connect("postgresql://user:pass@localhost/tiktokshop")
cur = conn.cursor()

# product_id and product_name are required; optional fields fall back to
# None (or 0 for counters) instead of raising KeyError on sparse records.
values = [
    (
        item["product_id"], item["product_name"], item.get("seller_name"),
        item.get("price"), item.get("original_price"), item.get("currency", "EUR"),
        item.get("rating"), item.get("review_count") or 0, item.get("sold_count") or 0,
        item.get("category"), item.get("image_url"), item.get("url"),
        item.get("source_query"), item.get("marketplace", "TikTok Shop France"),
    )
    for item in items
]
execute_values(
    cur,
    """
    INSERT INTO tiktok_shop_france_products
        (product_id, product_name, seller_name, price, original_price,
         currency, rating, review_count, sold_count, category,
         image_url, url, source_query, marketplace)
    VALUES %s
    ON CONFLICT (product_id) DO UPDATE SET
        price = EXCLUDED.price,
        original_price = EXCLUDED.original_price,
        rating = EXCLUDED.rating,
        review_count = EXCLUDED.review_count,
        sold_count = EXCLUDED.sold_count,
        last_updated_at = NOW()
    """,
    values,
)
conn.commit()
print(f"Upserted {len(values)} products")
The ON CONFLICT clause means running the same queries twice never creates duplicates: it updates price, rating, and sold metrics instead.
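One caveat when several queries are combined in a single run: the same product may come back more than once, and PostgreSQL rejects an INSERT ... ON CONFLICT DO UPDATE statement that would affect the same row twice. A minimal sketch of a deduplication pass to run before building the values list follows. The helper name dedupe_by_product_id is illustrative and not part of the actor or the Apify client.

```python
from typing import Any, Dict, Iterable, List


def dedupe_by_product_id(items: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep one record per product_id; the last occurrence wins."""
    latest: Dict[str, Dict[str, Any]] = {}
    for item in items:
        product_id = item.get("product_id")
        if not product_id:
            continue  # skip records without a usable key
        latest[str(product_id)] = item
    return list(latest.values())
```

Calling items = dedupe_by_product_id(items) before the upsert keeps each batch free of repeated keys. Keeping the last occurrence is an arbitrary choice; keeping the first works equally well if source_query attribution matters more to you.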
Step 5: How do I track price history?
Add a separate price-history table for time-series analysis.
CREATE TABLE tiktok_shop_france_price_history (
    product_id TEXT REFERENCES tiktok_shop_france_products(product_id),
    price NUMERIC(10, 2),
    recorded_at TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (product_id, recorded_at)
);
Insert a price snapshot on each sync, but only when the price changed:
for item in items:
    if item.get("price") is None:
        continue  # nothing to record without a price
    cur.execute(
        """
        INSERT INTO tiktok_shop_france_price_history (product_id, price)
        SELECT %s, %s
        WHERE NOT EXISTS (
            -- Compare against the most recent snapshot only
            SELECT 1
            FROM (
                SELECT price
                FROM tiktok_shop_france_price_history
                WHERE product_id = %s
                ORDER BY recorded_at DESC
                LIMIT 1
            ) AS latest
            WHERE latest.price = %s
        )
        """,
        (item["product_id"], item["price"],
         item["product_id"], item["price"]),
    )
conn.commit()
Step 6: How do I schedule automated syncs?
Use Apify's scheduler to trigger nightly runs. Set a cron expression like 0 2 * * * (2 AM UTC daily). The actor stores results in a dataset that your pipeline fetches via the API. Alternatively, use Apify's webhook integration to trigger your upsert script when a run completes — no polling required.
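If a webhook is subscribed to more than one event type, the receiving code should sync only after successful runs. The sketch below shows that check, assuming Apify's default payload template, in which the event name sits under eventType and the run object under resource. The function name dataset_id_from_webhook is hypothetical.

```python
from typing import Any, Dict, Optional


def dataset_id_from_webhook(payload: Dict[str, Any]) -> Optional[str]:
    """Return the run's dataset ID for a successful run, else None."""
    # Assumption: default Apify payload template with an "eventType" field.
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")
```

A handler can return early when this yields None, so failed or aborted runs never trigger an upsert against a partial dataset.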
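# Webhook handler (Flask example)
from apify_client import ApifyClient
from flask import Flask, request

app = Flask(__name__)
client = ApifyClient("apify_api_xxxxxxxxxxxxxxxx")

@app.route("/webhook/tiktok-shop-sync", methods=["POST"])
def handle_sync():
    # The webhook payload carries the finished run object under "resource"
    payload = request.json
    dataset_id = payload["resource"]["defaultDatasetId"]
    items = list(client.dataset(dataset_id).iterate_items())
    upsert_products(items)  # your Step 4 upsert logic, wrapped in a function
    return {"synced": len(items)}, 200
Sample output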
Two representative product records from the API:
[
  {
    "product_id": "1729384756012345",
    "product_name": "Serum Acide Hyaluronique Hydratant Anti-Age",
    "seller_name": "DermaGlow FR",
    "price": 9.99,
    "original_price": 19.99,
    "currency": "EUR",
    "rating": 4.6,
    "review_count": 2341,
    "sold_count": 12450,
    "category": "Soin Visage",
    "image_url": "https://p16-oec-va.ibyteimg.com/tos-maliva-i-xxxx/serum-ha.jpg",
    "url": "https://www.tiktok.com/view/product/1729384756012345",
    "source_query": "soin visage",
    "marketplace": "TikTok Shop France"
  },
  {
    "product_id": "1729384756054321",
    "product_name": "Lampe LED Bureau Rechargeable USB Pliable",
    "seller_name": "LumiTech Store",
    "price": 14.50,
    "original_price": 14.50,
    "currency": "EUR",
    "rating": 4.2,
    "review_count": 189,
    "sold_count": 876,
    "category": "Maison",
    "image_url": "https://p16-oec-va.ibyteimg.com/tos-maliva-i-xxxx/lampe-led.jpg",
    "url": "https://www.tiktok.com/view/product/1729384756054321",
    "source_query": "lampe bureau",
    "marketplace": "TikTok Shop France"
  }
]
product_id is your dedup key. original_price equal to price means no active discount. sold_count combined with review_count gives you the review conversion rate, which is useful for detecting products with unusually low or high review rates.
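The two derived signals mentioned above can be computed directly from a record. This is a minimal sketch; the function names review_rate and discount_pct are illustrative, and treating a missing original_price as "no discount" is an assumption.

```python
from typing import Any, Dict, Optional


def review_rate(item: Dict[str, Any]) -> Optional[float]:
    """Reviews per unit sold, or None when nothing has been sold yet."""
    sold = item.get("sold_count") or 0
    if sold <= 0:
        return None
    return (item.get("review_count") or 0) / sold


def discount_pct(item: Dict[str, Any]) -> float:
    """Percentage off original_price; 0.0 when there is no active discount."""
    price = item.get("price")
    original = item.get("original_price")
    if price is None or not original or original <= price:
        return 0.0
    return round((original - price) / original * 100, 1)
```

For the first sample record this gives roughly 0.19 reviews per unit sold and a 50% discount; the second record has no discount because price and original_price match.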
Common pitfalls
Four things break TikTok Shop France data pipelines in production. Schema assumptions on optional fields: some products lack original_price (never discounted) or have zero review_count (new listings); your upsert logic should handle nulls and zeros gracefully, not crash on missing keys. UTF-8 encoding for French text: product names contain accented characters (acute accents, cedillas, ligatures); ensure your database, CSV exports, and API responses use UTF-8 throughout, or you get garbled names in downstream UIs. Sold count is cumulative: sold_count reflects lifetime sales, not sales since your last sync; to compute daily or weekly sales velocity, subtract the previous snapshot's value from the current one, which requires storing a sold_count snapshot on every sync (the Step 5 price-history table records only price, so add a sold_count column or a separate snapshot table).
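The velocity calculation can be sketched as follows, assuming you keep timestamped sold_count snapshots for each product. The function name and the choice to return None for out-of-order or decreasing snapshots are illustrative.

```python
from datetime import datetime
from typing import Optional


def daily_sales_velocity(
    prev_sold: int, prev_at: datetime, curr_sold: int, curr_at: datetime
) -> Optional[float]:
    """Units sold per day between two snapshots of the cumulative sold_count."""
    elapsed_days = (curr_at - prev_at).total_seconds() / 86400
    if elapsed_days <= 0:
        return None  # snapshots out of order or taken at the same instant
    delta = curr_sold - prev_sold
    if delta < 0:
        return None  # counter went backwards; treat the pair as unreliable
    return delta / elapsed_days
```

Dividing by elapsed time rather than assuming a fixed sync interval keeps the figure comparable when a nightly run is skipped or delayed.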
A fourth issue specific to incremental pipelines: TikTok Shop product URLs and product_id values are stable, but category strings may change as TikTok reclassifies products. If your analytics depend on category groupings, maintain a category-mapping table that normalizes variations (e.g., "Soin du Visage" vs "Soin Visage") rather than trusting the raw category string as a stable key. A fifth pattern worth implementing early: dead-product detection. Products delisted from TikTok Shop will stop appearing in search results. Track last_updated_at and flag products not seen in 14+ days as potentially delisted — this keeps your database reflecting active inventory.
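Both ideas fit in a few lines of Python. The sketch below assumes rows are (product_id, last_updated_at) pairs read from the products table; the alias entries in CATEGORY_ALIASES are hypothetical placeholders to be replaced with the variants you actually observe.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Tuple

# Hypothetical mapping; extend it with the variants seen in your own data.
CATEGORY_ALIASES: Dict[str, str] = {
    "soin du visage": "Soin Visage",
    "soin visage": "Soin Visage",
}


def normalize_category(raw: str) -> str:
    """Map known variants to one canonical label; pass unknown values through."""
    key = " ".join(raw.split()).lower()
    return CATEGORY_ALIASES.get(key, raw.strip())


def find_stale_products(
    rows: Iterable[Tuple[str, datetime]], now: datetime, max_age_days: int = 14
) -> List[str]:
    """Return product_ids whose last_updated_at is older than max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    return [pid for pid, last_updated_at in rows if last_updated_at < cutoff]
```

Flagging rather than deleting stale rows is the safer default, since a product can drop out of search results temporarily and reappear later.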
Frequently asked questions
What is the output schema for each product record?
Each record contains product_id (string), product_name (string), seller_name (string), price (float, EUR), original_price (float, EUR), currency (string, always 'EUR'), rating (float, 0-5), review_count (integer), sold_count (integer), category (string), image_url (string), url (string), source_query (string), and marketplace (string, always 'TikTok Shop France'). product_id is the stable unique key for deduplication.
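A lightweight check against this schema before loading can catch malformed records early. This is a sketch, not part of the actor: which fields count as required (REQUIRED_FIELDS) is an assumption you should adjust to your own pipeline.

```python
from typing import Any, Dict, List

# Assumption: these three fields must be present for a row to be useful.
REQUIRED_FIELDS = ("product_id", "product_name", "price")


def validate_record(item: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems: List[str] = []
    for field in REQUIRED_FIELDS:
        if item.get(field) in (None, ""):
            problems.append(f"missing {field}")
    rating = item.get("rating")
    if rating is not None and not 0 <= rating <= 5:
        problems.append("rating out of range")
    for field in ("review_count", "sold_count"):
        value = item.get(field)
        if value is not None and (not isinstance(value, int) or value < 0):
            problems.append(f"{field} must be a non-negative integer")
    return problems
```

Records with a non-empty problem list can be logged and skipped so that one bad item does not abort the whole batch.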
Can I run the actor asynchronously and poll for completion?
Yes. Use the Apify API's async run endpoint (POST /v2/acts/{actorId}/runs), which returns a run ID immediately. Poll GET /v2/actor-runs/{runId} until status is 'SUCCEEDED' (or a failure status such as 'FAILED' or 'ABORTED'), then fetch dataset items. The apify-client libraries handle this with client.actor().start() followed by client.run().wait_for_finish() (waitForFinish() in the Node.js client).
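If you poll the REST endpoint yourself rather than using the client library, the loop looks roughly like this. It is a sketch under stated assumptions: fetch_status is a hypothetical callable that wraps the GET request and returns the run's status string, and the set of terminal statuses reflects my understanding of Apify's run lifecycle.

```python
import time
from typing import Callable

# Assumption: statuses after which a run will not change state again.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "ABORTED", "TIMED-OUT"}


def wait_for_run(
    fetch_status: Callable[[], str],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status until the run reaches a terminal status or we give up."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still {status} after {timeout_seconds}s")
        sleep(poll_seconds)
```

The caller should fetch dataset items only when the returned status is SUCCEEDED. Injecting sleep as a parameter keeps the loop easy to exercise in tests without real delays.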