Scrape Square Yards Properties for India Real Estate (2026)
Extract structured property listings from Square Yards with coordinates, prices, furnishing, and developer names across 40+ Indian cities. Step-by-step guide.

Thirdwatch's Square Yards Scraper extracts structured India property listings with pay-per-result pricing. Pull sale and rental data across 40+ Indian cities with coordinates, developer names, furnishing status, and floor-level detail. Built for real estate researchers, urban economists, housing-policy analysts, and academic teams studying India's residential property markets at city and locality granularity.
Why use Square Yards for India real estate research
Square Yards operates across 40+ Indian cities and 25+ international markets, making it one of the most geographically complete India-origin real estate platforms. According to Square Yards' 2025 annual report, the platform facilitated over $3.5 billion in property transactions in the prior fiscal year, reflecting both listing depth and transaction-level validation that pure-aggregator portals lack.
The research use case is specific. A housing-policy analyst at NITI Aayog or a state urban development authority needs locality-level price distributions across Tier 1 and Tier 2 cities. An academic studying gentrification patterns in Bangalore needs latitude-longitude pairs alongside price and area to build spatial regression models. A think-tank benchmarking affordable housing supply needs bedroom counts, property types, and developer names to segment market-rate from affordable stock. A real estate consultancy building city-level indices needs time-series snapshots of asking prices by locality.
All of these reduce to the same data pipeline: structured listings with price, location coordinates, property attributes, and developer metadata, refreshed on a regular cadence, across enough cities to support cross-market comparison. The Square Yards Scraper returns exactly these fields per listing.
How does this compare to the alternatives?
Three common approaches for India residential property data:
| Approach | Coverage | Structured fields | Setup time | Maintenance |
|---|---|---|---|---|
| NHB RESIDEX (official index) | 50 cities, quarterly | Index only, no listing-level data | None | Government release schedule |
| Manual portal browsing + copy-paste | Any portal | Whatever you record | Hours per city | Repeats every refresh |
| Thirdwatch Square Yards Scraper | 40+ cities, on-demand | 22 fields per listing including coordinates | Under an hour | Thirdwatch maintains extraction |
NHB RESIDEX is authoritative but publishes aggregate indices, not listing-level records. Manual extraction does not scale past a handful of localities. The scraper sits between them: listing-level granularity at index-building scale. For academic researchers studying India's urban housing markets, listing-level data with coordinates enables spatial econometric methods (geographically weighted regression, spatial autocorrelation analysis) that aggregate indices cannot support. The 22 fields per listing cover the full feature set needed for hedonic pricing models: property attributes (bedrooms, bathrooms, area, floor, furnishing), location attributes (latitude, longitude, locality, city), and market attributes (price, developer, project name).
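Spatial methods such as geographically weighted regression start from pairwise distances between observations. As a minimal sketch, assuming each listing is a dict carrying the `latitude` and `longitude` fields, the great-circle (haversine) distance and a simple radius-based neighbour list could be computed as below. The helper names `haversine_km` and `neighbours_within` and the 25 km radius are illustrative assumptions, not part of the scraper.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def neighbours_within(listings, radius_km):
    """Map each listing index to the indices of listings within radius_km."""
    result = {i: [] for i in range(len(listings))}
    for i, a in enumerate(listings):
        for j in range(i + 1, len(listings)):
            b = listings[j]
            d = haversine_km(a["latitude"], a["longitude"],
                             b["latitude"], b["longitude"])
            if d <= radius_km:
                result[i].append(j)
                result[j].append(i)
    return result


# Coordinates taken from the sample output in this guide
sample = [
    {"locality": "Whitefield", "latitude": 12.9698, "longitude": 77.7500},
    {"locality": "Bagalur", "latitude": 13.1050, "longitude": 77.6350},
]
print(neighbours_within(sample, radius_km=25))
```

The pairwise loop is quadratic in the number of listings, which is fine for a few thousand records; larger samples would likely need a spatial index instead.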
How to scrape Square Yards properties in 5 steps
Step 1: How do I get an Apify API token?
Sign up at apify.com (free tier, no credit card required). Navigate to Settings, then Integrations, and copy your API token. Every code snippet below assumes the token is set as an environment variable:
export APIFY_TOKEN="apify_api_xxxxxxxxxxxxxxxx"
Step 2: How do I install the Apify Python client?
pip install apify-client
Step 3: How do I run the scraper for sale listings?
import os

from apify_client import ApifyClient

# Read the token exported in Step 1
client = ApifyClient(os.environ["APIFY_TOKEN"])

# Start the actor and wait for the run to finish
run = client.actor("thirdwatch/squareyards-scraper").call(run_input={
    "queries": ["Bangalore", "Mumbai", "Pune"],
    "maxResults": 200,
    "propertyFor": "sale",
})

# Collect every record from the run's default dataset
items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
print(f"Collected {len(items)} sale listings")
The queries field accepts city names, locality names, or specific area searches. The propertyFor field takes "sale" or "rent". The maxResults field caps the total number of listings returned per query.
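Since the same three input fields drive both sale and rental runs, it can help to validate them before starting a run. The following is a sketch only: `build_run_input` is a hypothetical helper, and the only facts it relies on are the field names and the two accepted `propertyFor` values described above.

```python
def build_run_input(queries, property_for="sale", max_results=200):
    """Assemble and sanity-check the actor input (hypothetical helper)."""
    if property_for not in ("sale", "rent"):
        raise ValueError('property_for must be "sale" or "rent"')
    # Drop empty strings and stray whitespace from the query list
    cleaned = [q.strip() for q in queries if q and q.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty query is required")
    if int(max_results) <= 0:
        raise ValueError("max_results must be positive")
    return {
        "queries": cleaned,
        "maxResults": int(max_results),
        "propertyFor": property_for,
    }


rent_input = build_run_input(["Bangalore", "Pune"], property_for="rent",
                             max_results=100)
print(rent_input)
```

The resulting dict can be passed as `run_input` to the same `client.actor(...).call(...)` invocation used for sale listings.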
Step 4: How do I export results to a structured format?
import pandas as pd

df = pd.DataFrame(items)

# Keep research-relevant columns
cols = ["title", "property_type", "bedrooms", "bathrooms", "area",
        "price", "locality", "city", "latitude", "longitude",
        "developer_name", "project_name", "furnishing", "floor"]
df = df[cols]

# Price per sq ft for comparability; zero or missing area yields NaN
area = pd.to_numeric(df["area"], errors="coerce")
price = pd.to_numeric(df["price"], errors="coerce")
df["price_per_sqft"] = price / area.where(area > 0)

df.to_csv("squareyards_sale_listings.csv", index=False)
print(df.describe())
Step 5: How do I build a locality-level price distribution?
locality_stats = df.groupby(["city", "locality"]).agg(
    median_price=("price", "median"),
    median_psf=("price_per_sqft", "median"),
    listing_count=("title", "count"),
    median_area=("area", "median"),
    pct_furnished=("furnishing", lambda x: (x == "Furnished").mean()),
).reset_index()

# Keep only localities with enough listings for statistical reliability
locality_stats = locality_stats[locality_stats["listing_count"] >= 10]
locality_stats = locality_stats.sort_values("median_psf", ascending=False)
print(locality_stats.head(20))
Localities with 10+ listings and high price-per-sqft medians are the premium micro-markets; those with low psf and high listing counts may indicate oversupply. Cross-reference with the developer_name field to identify which developers dominate premium localities versus affordable ones, a useful segmentation for real estate consultancies advising developers on market entry strategy.
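The developer cross-reference can be sketched directly on the raw `items` list using only the standard library. The tier rule here is an assumption made for illustration: a locality counts as "premium" when its median price per sq ft is above the median of all locality medians. The function name `developer_counts_by_tier` is likewise hypothetical.

```python
from collections import Counter, defaultdict
from statistics import median


def developer_counts_by_tier(listings, min_listings=10):
    """Count listings per developer in premium vs. other localities."""
    psf_by_locality = defaultdict(list)
    for row in listings:
        price, area = row.get("price"), row.get("area")
        if price and area:  # skip rentals (price is null) and zero areas
            key = (row.get("city"), row.get("locality"))
            psf_by_locality[key].append(price / area)

    # Locality medians, restricted to localities with enough listings
    medians = {loc: median(v) for loc, v in psf_by_locality.items()
               if len(v) >= min_listings}
    tiers = {"premium": Counter(), "other": Counter()}
    if not medians:
        return tiers

    cutoff = median(medians.values())  # assumed premium threshold
    for row in listings:
        loc = (row.get("city"), row.get("locality"))
        developer = row.get("developer_name")
        if loc in medians and developer:
            tier = "premium" if medians[loc] > cutoff else "other"
            tiers[tier][developer] += 1
    return tiers


# Two rows shaped like the scraper output, for demonstration only
demo = [
    {"city": "Bangalore", "locality": "Whitefield", "price": 18500000,
     "area": 1845, "developer_name": "Prestige Group"},
    {"city": "Bangalore", "locality": "Bagalur", "price": 8900000,
     "area": 1200, "developer_name": "Brigade Group"},
]
print(developer_counts_by_tier(demo, min_listings=1))
```

A different cutoff (for example a fixed psf threshold per city) may suit a given study better; the median-of-medians rule is only a convenient default.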
Sample output
Two records from a Bangalore sale query:
[
{
"title": "3 BHK Apartment in Prestige Lakeside Habitat",
"property_for": "sale",
"property_type": "Apartment",
"bedrooms": 3,
"bathrooms": 3,
"area": 1845,
"floor": "12th of 22",
"address": "Whitefield, Bangalore",
"locality": "Whitefield",
"city": "Bangalore",
"latitude": 12.9698,
"longitude": 77.7500,
"description": "East-facing 3BHK in gated community with clubhouse, pool, and gym...",
"image_url": "https://images.squareyards.com/...",
"url": "https://www.squareyards.com/bangalore/...",
"developer_name": "Prestige Group",
"project_name": "Prestige Lakeside Habitat",
"name": "Prestige Lakeside Habitat",
"price": 18500000,
"rent_monthly": null,
"listed_by": "Developer",
"available_from": "2026-09",
"furnishing": "Semi-Furnished"
},
{
"title": "2 BHK Apartment in Brigade El Dorado",
"property_for": "sale",
"property_type": "Apartment",
"bedrooms": 2,
"bathrooms": 2,
"area": 1200,
"floor": "5th of 18",
"address": "Bagalur, Bangalore",
"locality": "Bagalur",
"city": "Bangalore",
"latitude": 13.1050,
"longitude": 77.6350,
"description": "Ready-to-move 2BHK with balcony, near proposed metro station...",
"image_url": "https://images.squareyards.com/...",
"url": "https://www.squareyards.com/bangalore/...",
"developer_name": "Brigade Group",
"project_name": "Brigade El Dorado",
"name": "Brigade El Dorado",
"price": 8900000,
"rent_monthly": null,
"listed_by": "Developer",
"available_from": "Ready to Move",
"furnishing": "Unfurnished"
}
]
The latitude and longitude fields enable direct spatial joins without a geocoding step. The developer_name and project_name fields support developer-level market-share analysis. The furnishing field takes three values (Furnished, Semi-Furnished, and Unfurnished), which correlate strongly with target buyer demographics (investors prefer furnished units for rental yield; end-users prefer unfurnished for customization). The floor field encodes both the unit's floor and the building's total floors, enabling vertical-position analysis in high-rise markets like Mumbai and Gurgaon where higher floors command 5-15% premiums.
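To use the floor field numerically, the string needs to be split into its two parts. The sketch below assumes the "12th of 22" pattern seen in the sample records; any other format (for example a ground-floor label) is not covered by the sample and is returned as `None` rather than guessed. `parse_floor` is a hypothetical helper name.

```python
import re

# Matches strings such as "12th of 22" or "5th of 18"
_FLOOR_RE = re.compile(r"^\s*(\d+)(?:st|nd|rd|th)?\s+of\s+(\d+)\s*$",
                       re.IGNORECASE)


def parse_floor(value):
    """Return (unit_floor, total_floors, relative_height) or None."""
    if not isinstance(value, str):
        return None
    match = _FLOOR_RE.match(value)
    if not match:
        return None
    unit_floor, total_floors = int(match.group(1)), int(match.group(2))
    if total_floors == 0 or unit_floor > total_floors:
        return None  # inconsistent record, leave it for manual review
    return unit_floor, total_floors, unit_floor / total_floors


for raw in ["12th of 22", "5th of 18", "Ground", None]:
    print(repr(raw), "->", parse_floor(raw))
```

The relative height (unit floor divided by total floors) is one possible regressor for a floor-premium analysis; the absolute floor number is another.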
Common pitfalls
Three issues surface in production India real estate research pipelines:
- Price format inconsistency: Square Yards displays prices as "18.5 Lac" or "1.85 Cr" on the frontend. The scraper normalizes these to absolute integers (1850000 and 18500000 respectively), but always verify units in your downstream code.
- Area unit ambiguity: most listings report in square feet, but some older listings or plot listings may use square yards or square meters; cross-check with property_type to filter.
- Stale listing contamination: some listings remain active months after the unit sells. For research requiring transaction-level freshness, filter on available_from and cross-reference against the listing's first-seen date in your snapshot history.
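For the first two pitfalls, a pair of small conversion helpers makes the unit assumptions explicit. This is a sketch: `parse_inr` and `to_sqft` are hypothetical names, the accepted spellings ("Lac", "Lakh", "Cr", "Crore") are an assumption about how prices may be written, and the area unit labels are ones you would have to assign yourself, since the sample output carries no unit field.

```python
import re

# 1 lakh = 100,000 rupees; 1 crore = 10,000,000 rupees
_MULTIPLIERS = {"lac": 100_000, "lakh": 100_000,
                "cr": 10_000_000, "crore": 10_000_000}

# Conversion factors to square feet (the sq m factor is rounded)
_SQFT_PER_UNIT = {"sqft": 1.0, "sqyd": 9.0, "sqm": 10.7639}

_PRICE_RE = re.compile(
    r"\s*(?:₹|Rs\.?)?\s*(\d+(?:\.\d+)?)\s*(lac|lakh|cr|crore)s?\.?\s*",
    re.IGNORECASE,
)


def parse_inr(text):
    """Convert a display price like '1.85 Cr' to integer rupees, else None."""
    match = _PRICE_RE.fullmatch(text)
    if not match:
        return None
    amount = float(match.group(1))
    return round(amount * _MULTIPLIERS[match.group(2).lower()])


def to_sqft(area, unit):
    """Convert an area to square feet; unit is 'sqft', 'sqyd', or 'sqm'."""
    return area * _SQFT_PER_UNIT[unit]


print(parse_inr("1.85 Cr"), parse_inr("18.5 Lac"))
print(to_sqft(100, "sqyd"))
```

Comparing a parsed display price against the scraper's integer `price` on a handful of listings is a quick way to confirm the units before running any aggregate.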
Related use cases
- Build an India property price database from Square Yards
- Monitor Square Yards listings for investment signals
- Track India rental market with Square Yards data
- Scrape Google Maps for location research
- The complete guide to scraping real estate sites
- Square Yards Scraper on Apify Store
- All Thirdwatch use-case guides
Frequently asked questions
Why use Square Yards for India real estate research?
Square Yards is among India's largest integrated real estate platforms with listings across 40+ cities, covering new launches, resale, and rentals. Unlike aggregators that rely on broker uploads, Square Yards lists developer-verified projects with structured price, area, and floor data. This makes it a cleaner primary source for residential real estate research than portals with heavy duplicate or stale listings.
What structured fields does the scraper return per listing?
Each record includes title, property_for (sale or rent), property_type, bedrooms, bathrooms, area, floor, address, locality, city, latitude, longitude, description, image_url, url, developer_name, project_name, name, price (for sale), rent_monthly (for rentals), listed_by, available_from, and furnishing. The latitude and longitude fields enable spatial analysis without a separate geocoding step.