Scrape BookMyShow Movies and Events Data Across India 2026
Extract movies, concerts, comedy shows, and live events from BookMyShow across 600+ Indian cities. Titles, cast, genres, prices, and venues as clean JSON.

Thirdwatch's BookMyShow Scraper extracts movies, concerts, comedy shows, sports events, and workshops from BookMyShow across 600+ Indian cities. Returns title, languages, genres, cast, director, runtime, synopsis, venue, price, and poster URLs as structured JSON. No login required, no API key. Built for entertainment analysts, aggregator builders, marketing teams, and AI agents that need fresh Indian entertainment data on demand.
Why scrape BookMyShow for movie and event data
BookMyShow dominates India's entertainment ticketing market. According to PwC's Global Entertainment and Media Outlook and FICCI-EY's Media and Entertainment report, India's live entertainment and cinema market crossed $2 billion in 2025, with BookMyShow processing the majority of online ticket transactions across the country. The platform covers everything from Bollywood blockbusters and regional-language films to international concert tours, stand-up comedy, sports events, and niche workshops.
The problem: BookMyShow has no public API. There is no structured way to pull listings programmatically. Entertainment analysts tracking release patterns across cities, aggregator sites combining BookMyShow data with IMDb ratings, marketing teams monitoring concert lineups for sponsorship opportunities, and AI agents serving travel recommendations all hit the same wall. Manual copy-paste does not scale beyond a single city. The BookMyShow Scraper solves this by returning clean, structured JSON for every listing in any Indian city, on demand.
How does this compare to the alternatives?
Three paths to getting BookMyShow data into your pipeline:
| Approach | Reliability | Setup time | Maintenance | Data coverage |
|---|---|---|---|---|
| DIY Python scraper | Breaks when BookMyShow updates protection | 2-4 weeks | You maintain anti-bot handling | Whatever you build |
| Generic web-scraping API | Variable; most fail on BookMyShow's protections | 1-2 days | Vendor maintains generic layer | Unstructured HTML |
| Thirdwatch BookMyShow Scraper | Production-tested, handles protection updates | 5 minutes | Thirdwatch maintains | 30+ structured fields per item |
No other BookMyShow scraper exists on the Apify Store today. Building in-house requires working around BookMyShow's request protection, which blocks plain HTTP clients. Generic Indian-events APIs from District or Paytm Insider cover only their own catalogues and require partnership agreements. The BookMyShow Scraper is the only structured, on-demand view of the full BookMyShow catalogue.
How to scrape BookMyShow movies and events in 5 steps
Step 1: How do I get an Apify API token?
Sign up at apify.com (free tier, no credit card required). Navigate to Settings, then Integrations, and copy your personal API token. Every example below assumes the token is stored in APIFY_TOKEN:

    export APIFY_TOKEN="apify_api_xxxxxxxxxxxxxxxx"

Step 2: How do I pull all movies and events for a city?
Pass Indian city names in the queries array. Set mode to auto (the default) to get both movies and events, or narrow to movies or events only.

    import os

    import pandas as pd
    import requests

    ACTOR = "thirdwatch~bookmyshow-scraper"
    TOKEN = os.environ["APIFY_TOKEN"]

    # Run the actor synchronously and get the dataset items back in one call.
    resp = requests.post(
        f"https://api.apify.com/v2/acts/{ACTOR}/run-sync-get-dataset-items",
        params={"token": TOKEN},
        json={
            "queries": ["Mumbai", "Bengaluru"],
            "mode": "auto",
            "includeDetails": True,
            "maxResults": 50,
        },
        timeout=600,
    )
    resp.raise_for_status()  # stop on HTTP errors instead of parsing an error body

    df = pd.DataFrame(resp.json())
    print(f"{len(df)} items: {df['type'].value_counts().to_dict()}")
    print(f"Cities: {df['scraped_city'].unique().tolist()}")

Two cities with 50 results each return up to 100 listings. The includeDetails: true flag enriches each item with cast, director, synopsis, runtime, and venue details from the detail page.
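Because the input has only a handful of fields, it can be worth checking it client-side before spending a run. The sketch below is a hypothetical helper (build_input is not part of the actor). It assumes the three mode values named above and treats maxResults as a positive integer; the actor itself may accept or reject inputs differently.

```python
VALID_MODES = {"auto", "movies", "events"}  # mode values named in this guide


def build_input(queries, mode="auto", include_details=True, max_results=50):
    """Return a run-input dict for the actor, or raise ValueError.

    Hypothetical helper: these checks are client-side assumptions,
    not rules published by the actor.
    """
    # Drop empty strings and surrounding whitespace from the queries.
    cleaned = [q.strip() for q in queries if isinstance(q, str) and q.strip()]
    if not cleaned:
        raise ValueError("queries must contain at least one city name or URL")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    if not isinstance(max_results, int) or max_results < 1:
        raise ValueError("max_results must be a positive integer")
    return {
        "queries": cleaned,
        "mode": mode,
        "includeDetails": bool(include_details),
        "maxResults": max_results,
    }
```

The returned dict can be passed as the json argument of the requests.post call shown in this step.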
Step 3: How do I scrape a specific movie or event by URL?
If you already have a BookMyShow URL, pass it directly in queries. The actor detects whether the input is a city name or a URL automatically.

    # Reuses ACTOR, TOKEN, and requests from Step 2.
    resp = requests.post(
        f"https://api.apify.com/v2/acts/{ACTOR}/run-sync-get-dataset-items",
        params={"token": TOKEN},
        json={
            "queries": [
                "https://in.bookmyshow.com/movies/mumbai/bhooth-bangla/ET00411383",
                "https://in.bookmyshow.com/events/calvin-harris-live-in-mumbai/ET00462236",
            ],
            "includeDetails": True,
            "maxResults": 10,
        },
        timeout=600,
    )
    items = resp.json()
    for item in items:
        print(f"{item['type']}: {item['title']} - {item.get('city', '')}")

This fetches exactly those two items with full metadata. Useful when you already know which listings you want to track.
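When a queries list mixes cities and URLs, it can help to sort them out locally, for example to log which event codes a run should return. The following is a minimal sketch, not the actor's own detection logic. classify_query is a hypothetical name, and the "ET plus digits" pattern is an assumption drawn only from the two example URLs above.

```python
import re
from urllib.parse import urlparse

# Assumption: event codes look like "ET" followed by digits, as in the
# two example URLs in this guide. Other code formats may exist.
EVENT_CODE_RE = re.compile(r"^ET\d+$")


def classify_query(query):
    """Return ("url", event_code_or_None) or ("city", None) for one query."""
    parsed = urlparse(query.strip())
    host = parsed.hostname or ""
    is_bms_host = host == "bookmyshow.com" or host.endswith(".bookmyshow.com")
    if parsed.scheme in ("http", "https") and is_bms_host:
        # The event code, when present, is the last path segment.
        last_segment = parsed.path.rstrip("/").rsplit("/", 1)[-1]
        code = last_segment if EVENT_CODE_RE.match(last_segment) else None
        return ("url", code)
    # Anything that is not a BookMyShow URL is treated as a city name.
    return ("city", None)
```

For the first URL above this returns ("url", "ET00411383"), and for a plain string such as "Mumbai" it returns ("city", None).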
Step 4: How do I filter movies by language and genre?
The actor returns languages and genres as arrays. Filter downstream in Python.

    # Keep movie rows only, then match on the list-valued columns.
    movies = df[df["type"] == "movie"].copy()
    hindi_thrillers = movies[
        movies["languages"].apply(lambda l: isinstance(l, list) and "Hindi" in l)
        & movies["genres"].apply(lambda g: isinstance(g, list) and "Thriller" in g)
    ]
    print(f"Hindi thrillers in Mumbai + Bengaluru: {len(hindi_thrillers)}")
    print(hindi_thrillers[["title", "rating", "runtime_minutes", "director", "release_date"]].head(10))

Language names follow BookMyShow's own casing (Hindi, Tamil, Telugu, Kannada, Malayalam, English). Genre values include Action, Comedy, Drama, Horror, Romance, Thriller, and more.
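The pandas filter matches exact casing, so "hindi" would miss "Hindi". As a hedged alternative, here is a small standard-library sketch that filters the raw item dicts and ignores case. filter_movies and _has_value are hypothetical names, and the field names are taken from the sample records in this guide.

```python
def _has_value(values, wanted):
    """True if `wanted` appears in `values` (a list), ignoring case."""
    if not isinstance(values, list):
        return False  # field missing or not a list, e.g. on event records
    return wanted.casefold() in {str(v).casefold() for v in values}


def filter_movies(items, language, genre):
    """Keep movie records that list both the given language and genre."""
    return [
        item for item in items
        if item.get("type") == "movie"
        and _has_value(item.get("languages"), language)
        and _has_value(item.get("genres"), genre)
    ]
```

Calling filter_movies(items, "hindi", "thriller") on the list returned by resp.json() selects the same kind of rows as the DataFrame filter, without depending on how the caller cased the labels.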
Step 5: How do I schedule daily scraping and send results to a webhook?
Set up a recurring schedule with a webhook to push fresh data into your pipeline automatically.

    curl -X POST "https://api.apify.com/v2/schedules?token=$APIFY_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{
        "name": "bookmyshow-mumbai-daily",
        "cronExpression": "0 8 * * *",
        "timezone": "Asia/Kolkata",
        "isEnabled": true,
        "actions": [{
          "type": "RUN_ACTOR",
          "actorId": "thirdwatch~bookmyshow-scraper",
          "runInput": {
            "queries": ["Mumbai"],
            "mode": "auto",
            "includeDetails": true,
            "maxResults": 200
          }
        }]
      }'

Add an ACTOR.RUN.SUCCEEDED webhook pointing at your ingestion endpoint and the loop closes itself. Every morning at 8 AM IST, fresh BookMyShow listings land in your database or dashboard without manual intervention.
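On the receiving side, the ingestion endpoint needs to turn the webhook call into a dataset fetch. The sketch below shows one way to do that, assuming Apify's default webhook payload, in which eventType names the event and resource is the run object carrying a defaultDatasetId. If you configured a custom payload template, the keys will differ. dataset_items_url is a hypothetical helper name.

```python
def dataset_items_url(webhook_payload):
    """Return the dataset items URL for a succeeded run, else None.

    Assumes Apify's default webhook payload, where `eventType` names the
    event and `resource` is the run object with a `defaultDatasetId`.
    """
    if webhook_payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None  # ignore failed, aborted, or timed-out runs
    dataset_id = (webhook_payload.get("resource") or {}).get("defaultDatasetId")
    if not dataset_id:
        return None
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?format=json"
```

Your handler can then fetch that URL with requests.get, passing the token the same way as in the earlier steps, and load the resulting list into your database.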
Sample output
A movie record and an event record from the dataset look like this:

    [
      {
        "type": "movie",
        "title": "Bhooth Bangla",
        "slug": "bhooth-bangla",
        "event_code": "ET00411383",
        "url": "https://in.bookmyshow.com/movies/mumbai/bhooth-bangla/ET00411383",
        "languages": ["Hindi"],
        "genres": ["Comedy", "Horror", "Thriller"],
        "rating": "UA16+",
        "release_date": "16 APR, 2026",
        "runtime_minutes": 165,
        "synopsis": "Arjun Acharya, a financially troubled man in his late 30s, unexpectedly inherits his grandfather's massive ancestral palace...",
        "director": "Priyadarshan",
        "cast": [{"name": "Akshay Kumar", "url": "https://in.bookmyshow.com/akshay-kumar/94"}],
        "poster_url": "https://assets-in.bmscdn.com/discovery-catalog/events/et00411383-ufeqwqauqg-portrait.jpg",
        "city": "mumbai",
        "scraped_city": "mumbai",
        "data_source": "bookmyshow"
      },
      {
        "type": "event",
        "title": "CALVIN HARRIS - Live in Mumbai",
        "slug": "calvin-harris-live-in-mumbai",
        "event_code": "ET00462236",
        "url": "https://in.bookmyshow.com/events/calvin-harris-live-in-mumbai/ET00462236",
        "category": ["music-shows", "concerts"],
        "genre": "edm",
        "language": "english",
        "event_date": "Sat 18 Apr 2026",
        "event_time": "4:00 PM",
        "duration": "6 Hours",
        "venue_name": "Infinity Bay Sewri",
        "venue_city": "Mumbai",
        "price_starting_from": "₹ 3500 onwards",
        "tags": ["outdoor-events", "fast-filling", "must-attend"],
        "poster_url": "https://assets-in.bmscdn.com/discovery-catalog/events/et00462236-ktjhyrfvnc-portrait.jpg",
        "city": "mumbai",
        "scraped_city": "mumbai",
        "data_source": "bookmyshow"
      }
    ]

Movies return languages (array), genres, cast, director, runtime_minutes, and synopsis. Events return category, venue_name, event_date, event_time, duration, and price_starting_from. Both include poster_url for high-resolution images and event_code for stable BookMyShow identifiers.
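Prices and dates arrive as display strings, so sorting or comparing them usually needs a parsing step. The following sketch handles only the formats visible in the sample records ("₹ 3500 onwards", "16 APR, 2026", "Sat 18 Apr 2026"). Other formats may occur in real data, in which case these hypothetical helpers return None.

```python
import re
from datetime import date

# English month abbreviations mapped to month numbers, independent of locale.
MONTHS = {m: i for i, m in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun",
     "jul", "aug", "sep", "oct", "nov", "dec"], start=1)}


def parse_price(text):
    """Extract the rupee amount from a string like '₹ 3500 onwards'."""
    match = re.search(r"\d[\d,]*", text or "")
    return int(match.group().replace(",", "")) if match else None


def parse_bms_date(text):
    """Parse '16 APR, 2026' or 'Sat 18 Apr 2026' into a date, else None."""
    match = re.search(r"(\d{1,2})\s+([A-Za-z]{3})[A-Za-z]*,?\s+(\d{4})", text or "")
    if not match:
        return None
    day, month_name, year = match.groups()
    month = MONTHS.get(month_name.lower())
    if month is None:
        return None
    try:
        return date(int(year), month, int(day))
    except ValueError:  # impossible calendar date such as 31 Feb
        return None
```

With these, parse_price(item.get("price_starting_from")) yields an integer for events, and parse_bms_date works on both release_date and event_date.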
Common pitfalls
Three things trip up first-time users of BookMyShow data.
Passing movie titles instead of cities. The queries field accepts city names or full BookMyShow URLs only; searching by movie title is not supported. To find a specific film, either pass its city to list everything playing, or paste the movie's BookMyShow URL directly.
Forgetting includeDetails. With includeDetails set to false, you get listing-card data only: no cast, no director, no synopsis, no runtime. The default is true, but if you override it for speed, expect thinner records.
Language and genre normalization. BookMyShow uses its own casing and naming conventions. Movie languages are capitalized ("Hindi"), genres are English-language labels even for regional films, and event categories use slug format ("music-shows", "comedy-shows"). Event records can also carry lowercase values, such as "language": "english" and "genre": "edm" in the sample output. Normalize these downstream before joining with other data sources.
A fourth consideration: regional-language content coverage varies by city. Smaller cities may list only Hindi and local-language films, while metros like Mumbai and Bengaluru list English-language imports and international events alongside regional content. Normalize your downstream analysis by language before comparing across cities, otherwise metro-heavy data skews genre distributions.
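One way to put cities on a comparable footing is to convert raw counts into per-city language shares before comparing anything else. This is a minimal sketch under the assumption that each movie record carries scraped_city and a languages list, as in the sample output. language_share_by_city is a hypothetical helper, and counting a multi-language film once per language is a design choice rather than the only reasonable one.

```python
from collections import Counter, defaultdict


def language_share_by_city(items):
    """Return {city: {language: share}} for movie records.

    Shares are fractions of language mentions within each city, so a
    multi-language film counts once per language it is listed in.
    """
    counts = defaultdict(Counter)
    for item in items:
        if item.get("type") != "movie":
            continue  # events use a different language field
        city = item.get("scraped_city") or "unknown"
        for language in item.get("languages") or []:
            counts[city][language] += 1
    return {
        city: {lang: n / sum(c.values()) for lang, n in c.items()}
        for city, c in counts.items()
    }
```

Comparing shares instead of counts keeps a metro with hundreds of listings from dominating a smaller city with a few dozen.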
Thirdwatch's actor handles BookMyShow's request protection and proxy rotation so you focus on the data, not the scraping infrastructure.
Frequently asked questions
Does the BookMyShow Scraper require a login or API key?
No. The actor reads publicly accessible BookMyShow pages. No login credentials, no API key, and no browser session are needed. You authenticate only against Apify with your personal API token.
Which Indian cities does the scraper support?
Every city BookMyShow serves, more than 600 in all, including Mumbai, Delhi-NCR, Bengaluru, Chennai, Hyderabad, Pune, Kolkata, Ahmedabad, Jaipur, Chandigarh, Kochi, Lucknow, and Indore. Pass the city name in the queries input.
Can I scrape a specific movie or event by URL?
Yes. Drop the full BookMyShow URL into the queries array and the actor fetches that single item directly, returning all available metadata including cast, synopsis, venue, and pricing.
How fresh is the data returned?
Real-time at the moment of each run. BookMyShow updates listings continuously. Schedule the actor hourly or daily on Apify to keep a downstream feed current with new releases and event additions.
What is the difference between includeDetails true and false?
With includeDetails set to true (the default), the actor visits each item's detail page to extract cast, director, synopsis, runtime, and venue. Setting it to false returns listing-card data only, which is faster for high-volume sweeps.