Scrape Commonfloor for Bangalore Real Estate (2026)

Published April 28, 2026 · 1483 words · For researchers

Thirdwatch's Commonfloor Scraper returns India property listings + community data at $0.003 per record — title, property type, location, price, area, amenities, RERA ID, builder, community ratings. Built for Bangalore-focused real estate research, apartment-complex investment analysis, and India proptech aggregator products.

Why scrape Commonfloor for Bangalore research

Commonfloor has historically deep Bangalore + South India property coverage. According to Commonfloor's category data, the platform indexes 500K+ India listings with materially deeper Bangalore + Pune coverage than other platforms — community-level data (apartment-complex amenities, RERA-registered builders, society reviews) is uniquely deep on Commonfloor. For Bangalore-focused real estate research, apartment-complex investment analysis, and South India proptech research, Commonfloor is essential alongside MagicBricks and 99acres.

The job-to-be-done is structured. A Bangalore-focused investment research function maps per-apartment-complex rental yields quarterly. A new-launch tracking function monitors RERA-registered Bangalore projects pre-launch. An apartment-complex investment SaaS surfaces community-level metrics (society ratings, builder reputation) to subscribers. A Bangalore real estate aggregator builder ingests Commonfloor alongside MagicBricks for comprehensive Bangalore coverage. All reduce to city + locality + complex queries + per-listing aggregation.

How does this compare to the alternatives?

Three options for Commonfloor data:

Approach	Cost per 10K records	Reliability	Setup time	Maintenance
Commonfloor B2B API	$5K+/year (partnership)	Official	Days (approval)	Per-tier license
Manual Commonfloor browsing	Effectively unbounded	Low	Continuous	Doesn't scale
Thirdwatch Commonfloor Scraper	$30 ($0.003 × 10K)	HTTP + structured data	5 minutes	Thirdwatch tracks Commonfloor changes

Commonfloor's official API is gated behind B2B partnership. The Commonfloor Scraper actor page gives you raw listing + community data at materially lower per-record cost.

How to scrape Commonfloor in 4 steps

Step 1: How do I authenticate against Apify?

export APIFY_TOKEN="apify_api_xxxxxxxxxxxxxxxx"

Step 2: How do I pull Bangalore localities?

Pass city + locality queries.

import os, requests, pandas as pd

ACTOR = "thirdwatch~commonfloor-scraper"
TOKEN = os.environ["APIFY_TOKEN"]

BANGALORE_LOCALITIES = [
    "Whitefield", "Sarjapur Road", "Hebbal", "HSR Layout",
    "Koramangala", "Indiranagar", "Marathahalli", "Bellandur",
    "Electronic City", "Yelahanka", "JP Nagar", "Jayanagar",
]

queries = [{"city": "Bangalore", "locality": loc, "property_type": "apartment", "listing": "rent"}
           for loc in BANGALORE_LOCALITIES]

resp = requests.post(
    f"https://api.apify.com/v2/acts/{ACTOR}/run-sync-get-dataset-items",
    params={"token": TOKEN},
    json={"queries": queries, "maxResults": 100},
    timeout=900,
)
df = pd.DataFrame(resp.json())
print(f"{len(df)} listings across {df.locality.nunique()} Bangalore localities")

12 localities × 100 = up to 1,200 records, costing $3.60.

Step 3: How do I aggregate per-apartment-complex?

Group by apartment-complex name for community-level metrics.

df["price_inr"] = pd.to_numeric(df.price_inr, errors="coerce")
df["area_sqft"] = pd.to_numeric(df.area_sqft, errors="coerce")
df["price_per_sqft"] = df.price_inr / df.area_sqft

complex_aggregates = (
    df.groupby(["locality", "complex_name"])
    .agg(
        listing_count=("listing_id", "count"),
        median_rent=("price_inr", "median"),
        median_psf=("price_per_sqft", "median"),
        builder=("builder_name", "first"),
        rera_id=("rera_id", "first"),
        community_rating=("community_rating", "first"),
        total_units=("total_units", "first"),
    )
    .query("listing_count >= 3")
    .sort_values(["locality", "median_psf"], ascending=[True, False])
)
print(complex_aggregates.head(20))

Per-apartment-complex aggregates enable society-level investment research — comparing Sobha vs Prestige vs Brigade properties at locality granularity.

Step 4: How do I detect new-launch projects?

Filter to new-launch tier with RERA registration.

new_launches = df[df.listing_type == "new_launch"].drop_duplicates(subset=["complex_name"])
print(f"{len(new_launches)} new-launch projects in Bangalore")
print(new_launches[["complex_name", "builder_name", "rera_id",
                    "expected_possession", "price_per_sqft"]].head(15))

RERA-registered new launches are the primary investment-research surface for pre-launch + under-construction inventory.

Sample output

A single Commonfloor listing record looks like this. Five rows weigh ~7 KB.

{
  "listing_id": "CF12345",
  "title": "3 BHK Apartment for Rent in Sobha Indraprastha",
  "property_type": "Apartment",
  "city": "Bangalore",
  "locality": "Hebbal",
  "complex_name": "Sobha Indraprastha",
  "price": "₹55,000 per month",
  "price_inr": 55000,
  "area_sqft": 1850,
  "bedrooms": 3,
  "bathrooms": 3,
  "balconies": 3,
  "furnished_status": "Semi-Furnished",
  "amenities": ["Swimming Pool", "Gym", "Club House", "Power Backup", "24x7 Security"],
  "builder_name": "Sobha",
  "rera_id": "PRM/KA/RERA/1251/308/PR/180210/000301",
  "community_rating": 4.3,
  "total_units": 480,
  "age": "5-10 Years",
  "posted_date": "2026-04-22",
  "url": "https://www.commonfloor.com/sobha-indraprastha-..."
}

complex_name + community_rating + total_units are the killer Commonfloor-specific community fields. rera_id enables RERA-database cross-reference for compliance verification.

Common pitfalls

Three things go wrong in Commonfloor pipelines. Complex-name normalization — "Sobha Indraprastha" vs "Sobha Indraprasta" vs "Indraprastha by Sobha"; for clean per-complex aggregation, normalize via canonical-name mapping. Listing-vs-community data divergence — community-level data (ratings, total_units) updates quarterly while listing data updates daily; treat the two retention tiers separately. Bangalore-bias in community depth — Commonfloor's strongest coverage is Bangalore; for other cities, community-level fields are sparser. For pan-India apartment-complex research, supplement with city-specific platforms.

Thirdwatch's actor uses HTTP + structured data extraction at $0.10/1K, ~88% margin. Pair Commonfloor with MagicBricks Scraper and 99acres Scraper for comprehensive India real-estate coverage. A fourth subtle issue worth flagging: Commonfloor's community ratings are user-submitted — older complexes (10+ years) accumulate large rating samples (50-200+ ratings) while newer complexes (under 3 years) often have under 10 ratings. For accurate cross-complex comparisons, weight ratings by sample size + apply minimum-sample thresholds (10+ ratings minimum). A fifth pattern unique to Bangalore real estate research: tech-corridor proximity (Whitefield, Bellandur, Sarjapur Road) materially affects rental velocity + price-per-sqft — for accurate locality benchmarking, segment by tech-corridor distance rather than treating all Bangalore localities as comparable. A sixth and final pitfall: Bangalore monsoon season (June-October) drives rental-velocity increases as renters rotate during academic cycles + tech-employer onboarding waves. For accurate base-rate research, account for seasonal demand cycles when interpreting per-locality velocity.

Operational best practices for production pipelines

Tier the cadence to match signal half-life. India real estate listings change slowly — weekly polling on Bangalore localities + monthly on other cities covers most use cases. 60-80% cost reduction with negligible signal loss when watchlist is properly tiered.

Snapshot raw payloads. Pipeline cost is dominated by scrape volume, not storage. Persisting raw JSON snapshots lets you re-derive metrics — particularly useful for community-name normalization as canonical-mapping tables evolve.

Schema validation. Run a daily validation suite asserting expected core fields with non-null rates above 80% (required) and 50% (optional). Commonfloor schema changes during platform UI revisions — catch drift early. Cross-snapshot diff alerts on RERA-status changes + builder-name updates catch structural shifts that pure aggregate-trend monitoring misses. A seventh and final operational pattern at production scale: cross-snapshot diff alerts. Beyond detecting individual changes, build alerts on cross-snapshot field-level diffs — name changes, category re-classifications, status changes. These structural changes precede or follow material events and are leading indicators of organization-level disruption. Persist a structured-diff log alongside aggregate snapshots: for each entity, persist (field, old_value, new_value) tuples per scrape. Surface high-leverage diffs to human reviewers; low-leverage diffs stay in the audit log.

An eighth pattern worth flagging for cost-controlled teams: implement an incremental-diff pipeline that only re-processes records whose hash changed since the previous snapshot. For watchlists where 90%+ of records are unchanged between snapshots, hash-comparison-driven incremental processing reduces downstream-compute by 80-90% while preserving full data fidelity. Combine with snapshot-storage compression for end-to-end pipeline-cost reductions of 70%+ at scale.

Related use cases

Frequently asked questions

Why scrape Commonfloor for Bangalore research?

Commonfloor (acquired by Quikr in 2016) has historically deep Bangalore + South India property coverage with strong community-level data — apartment-complex amenities, RERA-registered builders, society reviews. According to Commonfloor's coverage data, the platform indexes 500K+ India listings with materially deeper Bangalore + Pune coverage than other platforms. For Bangalore-focused real estate research, Commonfloor offers community-level depth alongside MagicBricks/99acres.

What data does the actor return?

Per listing: title, property type, city + locality + apartment-complex name, price (₹), price-per-sqft, area (super-built-up + carpet), bedrooms, bathrooms, balconies, furnished status, amenities list (Pool/Gym/Clubhouse/Power Backup), age, builder name, RERA ID, posted date, image URLs. Community-level data (apartment-complex aggregate ratings, total units, possession year) is materially richer than competitor platforms.

How does Commonfloor compare to MagicBricks + 99acres?

Commonfloor is materially smaller (~500K vs 1.5M+ each at MagicBricks/99acres) but has deeper community/apartment-complex coverage especially in Bangalore + Pune. For apartment-complex-aware research (community ratings, amenities mapping, RERA-builder tracking), Commonfloor is essential. For broad pan-India coverage, MagicBricks/99acres are primary.

Can I track apartment-complex aggregates?

Yes. Commonfloor publishes per-complex data (total units, possession year, builder, amenities, average rating from resident reviews). Aggregate metrics enable apartment-complex-level investment research (yield by complex, rating trend by builder). For Bangalore-focused investment research, apartment-complex view is more actionable than per-listing view.

How fresh do Commonfloor snapshots need to be?

For active rental + buy-listing research, weekly cadence catches new listings within 7 days. For apartment-complex aggregates (which change slowly), monthly cadence is sufficient. For new-launch tracking (RERA-registered projects), daily during peak builder-marketing windows. Most Commonfloor listings stay active 30-60 days.

How does this compare to Commonfloor's first-party API?

Commonfloor offers a paid B2B API ($5K+/year) gated behind partnership approval. The actor delivers similar coverage at $0.003/record without partnership gatekeeping. For active brokerage operations, the API is required. For research-only use cases, the actor scales without onboarding overhead.

Run the Commonfloor Scraper on Apify Store — pay-per-record, free to try, no credit card to test.