Build an eProcure Tender Database for Your Sales Pipeline
Build a searchable tender database from India's eProcure CPPP portal. Feed government contract data into your sales pipeline with structured extraction.

Thirdwatch's India Government Tenders Scraper extracts structured tender data from India's Central Public Procurement Portal into clean JSON, ready to feed a sales pipeline database. Pull tender IDs, reference numbers, organizations, departments, deadlines, and detail links with pay-per-result pricing. Built for developers and growth engineers building government sales intelligence tools or integrating procurement data into CRM workflows.
Why build a tender database from eProcure
India's government procurement market is massive. According to the World Bank's India procurement assessment, central and state government procurement spending exceeds $500 billion annually, with a growing share moving to electronic platforms like CPPP (eprocure.gov.in). For any B2G (business-to-government) company, this spending represents the largest addressable market in India -- larger than any single private-sector vertical.
The problem is discovery. eProcure's search interface is built for one-off lookups, not systematic pipeline building. There is no API, no export function, and no way to set up saved searches with notifications. A sales team targeting government IT contracts needs to check the portal daily across multiple keywords and departments. A growth engineer building a government market intelligence product needs historical tender data in a queryable format. A channel partner selling to system integrators needs to surface relevant subcontracting opportunities before prime contractors lock in their teams.
All of these use cases require the same foundation: a structured, deduplicated, continuously updated database of eProcure tender records. The scraper provides the data extraction layer. This guide covers how to build the pipeline from extraction through storage to sales workflow integration.
How does this compare to the alternatives?
| Approach | Data freshness | Schema control | Integration flexibility | Maintenance |
|---|---|---|---|---|
| Manual portal checking + spreadsheet | Hours behind, human-dependent | None, ad hoc columns | Copy-paste only | Daily human effort |
| Tender aggregation SaaS (TenderTiger, BidAssist) | Near real-time | Locked to vendor schema | Limited API, vendor-dependent | Subscription renewal |
| In-house web scraper | Real-time on each run | Full control | Full control | Portal changes break scraper |
| Thirdwatch eProcure Scraper + your database | Real-time on each run | Full control of downstream schema | Any database, any CRM | Thirdwatch maintains extraction |
Building on top of the India Government Tenders Scraper gives you schema control and integration flexibility without the maintenance burden of keeping up with eProcure's DOM changes.
How to build the tender database pipeline in 6 steps
Step 1: How do I set up the project?
Install dependencies and set your Apify token.
pip install apify-client psycopg2-binary pandas
export APIFY_TOKEN="apify_api_xxxxxxxxxxxxxxxx"
Step 2: How do I define the database schema?
Create a PostgreSQL table that maps to the scraper's output fields. tender_id is the natural key for deduplication.
CREATE TABLE IF NOT EXISTS tenders (
    tender_id TEXT PRIMARY KEY,
    tender_title TEXT NOT NULL,
    tender_reference_number TEXT,
    organization TEXT,
    department TEXT,
    published_date TEXT,
    bid_submission_deadline TEXT,
    tender_opening_date TEXT,
    detail_href TEXT,
    first_seen_at TIMESTAMP DEFAULT NOW(),
    last_seen_at TIMESTAMP DEFAULT NOW(),
    pipeline_status TEXT DEFAULT 'new'
);
CREATE INDEX idx_tenders_org ON tenders(organization);
CREATE INDEX idx_tenders_dept ON tenders(department);
CREATE INDEX idx_tenders_deadline ON tenders(bid_submission_deadline);
CREATE INDEX idx_tenders_status ON tenders(pipeline_status);
Step 3: How do I extract tenders for multiple sales verticals?
Run the scraper with keyword arrays that cover your target verticals. Use fetchDetails to get complete metadata for pipeline qualification.
import os
from apify_client import ApifyClient
client = ApifyClient(os.environ["APIFY_TOKEN"])
# One actor input per sales vertical
VERTICALS = {
    "IT Services": {
        "queries": ["IT services", "software development", "cloud computing",
                    "cybersecurity", "data centre"],
        "maxResults": 200,
        "fetchDetails": True,
    },
    "Infrastructure": {
        "queries": ["road construction", "bridge construction", "smart city"],
        "maxResults": 150,
        "organization": "",  # left empty: presumably no organization filter
        "fetchDetails": True,
    },
    "Medical Equipment": {
        "queries": ["medical equipment", "hospital supplies", "diagnostic instruments"],
        "maxResults": 100,
        "organization": "All India Institute of Medical Sciences",
        "fetchDetails": True,
    },
}
# Run the actor once per vertical and tag each item with its vertical
all_tenders = []
for vertical_name, config in VERTICALS.items():
    run = client.actor("thirdwatch/india-government-tenders-scraper").call(
        run_input=config
    )
    items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
    for item in items:
        item["_vertical"] = vertical_name
    all_tenders.extend(items)
    print(f"{vertical_name}: {len(items)} tenders")
# Deduplicate by tender_id (the same tender can match several verticals)
seen = set()
unique_tenders = []
for t in all_tenders:
    if t["tender_id"] not in seen:
        seen.add(t["tender_id"])
        unique_tenders.append(t)
print(f"\nTotal unique tenders: {len(unique_tenders)}")
Step 4: How do I upsert tenders into the database?
Upsert by tender_id to handle amended tenders and deadline extensions without creating duplicates.
import psycopg2
conn = psycopg2.connect("postgresql://user:pass@localhost/tenders_db")
cur = conn.cursor()
for t in unique_tenders:
    # Insert new tenders; for known tender_ids refresh the dates and last_seen_at
    cur.execute("""
        INSERT INTO tenders (tender_id, tender_title, tender_reference_number,
                             organization, department, published_date,
                             bid_submission_deadline, tender_opening_date,
                             detail_href, first_seen_at, last_seen_at)
        VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, NOW(), NOW())
        ON CONFLICT (tender_id) DO UPDATE SET
            bid_submission_deadline = EXCLUDED.bid_submission_deadline,
            tender_opening_date = EXCLUDED.tender_opening_date,
            last_seen_at = NOW()
    """, (
        t["tender_id"], t["tender_title"], t.get("tender_reference_number"),
        t.get("organization"), t.get("department"), t.get("published_date"),
        t.get("bid_submission_deadline"), t.get("tender_opening_date"),
        t.get("detail_href"),
    ))
conn.commit()
cur.close()
conn.close()
print(f"Upserted {len(unique_tenders)} records")
Step 5: How do I build a pipeline qualification view?
Create a SQL view that surfaces actionable tenders for your sales team -- open deadlines, sorted by urgency.
-- bid_submission_deadline is stored as TEXT, so cast it before comparing or sorting
CREATE VIEW pipeline_active AS
SELECT
    tender_id,
    tender_title,
    organization,
    department,
    bid_submission_deadline,
    pipeline_status,
    CASE
        WHEN bid_submission_deadline::timestamp < NOW() + INTERVAL '3 days' THEN 'urgent'
        WHEN bid_submission_deadline::timestamp < NOW() + INTERVAL '7 days' THEN 'upcoming'
        ELSE 'open'
    END AS urgency
FROM tenders
WHERE pipeline_status IN ('new', 'qualified', 'preparing')
  AND bid_submission_deadline::timestamp > NOW()
ORDER BY bid_submission_deadline::timestamp ASC;
Step 6: How do I schedule daily pipeline updates?
Automate extraction and loading with an Apify schedule plus a cron job for the database sync.
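# Schedule the scraper on Apify (reuses `client` from Step 3)
import json
schedule = client.schedules().create(
    name="daily-tender-pipeline",
    cron_expression="0 2 * * *",  # 2:00 AM daily in the timezone below
    timezone="Asia/Kolkata",      # IST; without this the expression is evaluated in UTC
    is_enabled=True,
    is_exclusive=True,            # do not start a run while the previous one is still going
    actions=[{
        "type": "RUN_ACTOR",
        "actorId": "thirdwatch/india-government-tenders-scraper",
        # The schedules API takes the actor input as a JSON string plus a content type
        "runInput": {
            "body": json.dumps({
                "queries": ["IT services", "software development", "cloud computing",
                            "cybersecurity", "data centre", "road construction",
                            "medical equipment"],
                "maxResults": 300,
                "fetchDetails": True,
            }),
            "contentType": "application/json; charset=utf-8",
        },
    }],
)
print(f"Schedule created: {schedule['id']}")
Use a webhook or a downstream cron job to pull completed run datasets into your PostgreSQL instance.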
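A minimal sketch of what that downstream cron job could look like, assuming the `tenders` table from Step 2 and an open psycopg2 connection. The names `to_row`, `UPSERT_SQL`, and `sync_latest_run` are illustrative and not part of the scraper; the Apify client is imported lazily so the row-mapping helper can be used on its own.

```python
import os

# Same upsert as Step 4, kept as a constant so the sync job can reuse it
UPSERT_SQL = """
    INSERT INTO tenders (tender_id, tender_title, tender_reference_number,
                         organization, department, published_date,
                         bid_submission_deadline, tender_opening_date,
                         detail_href, first_seen_at, last_seen_at)
    VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, NOW(), NOW())
    ON CONFLICT (tender_id) DO UPDATE SET
        bid_submission_deadline = EXCLUDED.bid_submission_deadline,
        tender_opening_date = EXCLUDED.tender_opening_date,
        last_seen_at = NOW()
"""

COLUMNS = (
    "tender_id", "tender_title", "tender_reference_number",
    "organization", "department", "published_date",
    "bid_submission_deadline", "tender_opening_date", "detail_href",
)


def to_row(item):
    """Map one scraper item to the parameter order of UPSERT_SQL."""
    return tuple(item.get(col) for col in COLUMNS)


def is_loadable(item):
    """The table requires tender_id (primary key) and tender_title (NOT NULL)."""
    return bool(item.get("tender_id")) and bool(item.get("tender_title"))


def sync_latest_run(conn, actor_id="thirdwatch/india-government-tenders-scraper"):
    """Load the dataset of the most recent successful run into PostgreSQL."""
    from apify_client import ApifyClient  # lazy import: only needed for the sync itself

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    dataset = client.actor(actor_id).last_run(status="SUCCEEDED").dataset()
    rows = [to_row(i) for i in dataset.iterate_items() if is_loadable(i)]
    with conn.cursor() as cur:
        cur.executemany(UPSERT_SQL, rows)
    conn.commit()
    return len(rows)
```

Run `sync_latest_run(conn)` from cron a little after the scheduled scrape, for example 30 minutes later, so the run has time to finish.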
Sample output
Three records from a single run targeting IT services tenders. Each record weighs approximately 1.2 KB.
[
  {
    "tender_title": "Development of Integrated Dashboard for National Data Analytics Platform",
    "tender_id": "2026_MEITY_776543_1",
    "tender_reference_number": "MeitY/NDP/2026/DASH-034",
    "organization": "Ministry of Electronics and Information Technology",
    "department": "National e-Governance Division",
    "published_date": "2026-05-21",
    "bid_submission_deadline": "18-Jun-2026 05:00 PM",
    "tender_opening_date": "19-Jun-2026 11:00 AM",
    "detail_href": "https://eprocure.gov.in/eprocure/app?page=FrontEndTendersByOrganisation&service=page"
  },
  {
    "tender_title": "Supply of Firewall and Network Security Appliances for CERT-In",
    "tender_id": "2026_CERTIN_554321_1",
    "tender_reference_number": "CERT-In/PROC/2026/SEC-017",
    "organization": "Indian Computer Emergency Response Team",
    "department": "Ministry of Electronics and Information Technology",
    "published_date": "2026-05-19",
    "bid_submission_deadline": "12-Jun-2026 03:00 PM",
    "tender_opening_date": "13-Jun-2026 10:00 AM",
    "detail_href": "https://eprocure.gov.in/eprocure/app?page=FrontEndTendersByOrganisation&service=page"
  },
  {
    "tender_title": "Cloud Hosting Services for Passport Seva Portal Migration",
    "tender_id": "2026_MEA_998877_1",
    "tender_reference_number": "MEA/CPV/2026/CLOUD-008",
    "organization": "Ministry of External Affairs",
    "department": "Consular Passport and Visa Division",
    "published_date": "2026-05-23",
    "bid_submission_deadline": "25-Jun-2026 02:00 PM",
    "tender_opening_date": "26-Jun-2026 11:00 AM",
    "detail_href": "https://eprocure.gov.in/eprocure/app?page=FrontEndTendersByOrganisation&service=page"
  }
]
tender_id is your deduplication key across runs. organization and department enable routing to the right sales vertical. bid_submission_deadline drives urgency scoring. detail_href links to the full RFP document for bid preparation.
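Urgency scoring can also be done in application code. The sketch below assumes the deadline always arrives in the `DD-Mon-YYYY hh:mm AM/PM` layout seen in the sample records (other layouts are treated as unknown), and it reuses the 3-day and 7-day thresholds from the Step 5 view. `parse_deadline` and `urgency` are illustrative names.

```python
from datetime import datetime, timedelta

# Layout assumed from the sample output, e.g. "18-Jun-2026 05:00 PM"
DEADLINE_FORMAT = "%d-%b-%Y %I:%M %p"


def parse_deadline(raw):
    """Return a naive datetime, or None if the value is missing or in another layout."""
    if not raw:
        return None
    try:
        return datetime.strptime(raw.strip(), DEADLINE_FORMAT)
    except ValueError:
        return None


def urgency(raw, now):
    """Bucket a deadline string relative to `now`, mirroring the pipeline_active view."""
    deadline = parse_deadline(raw)
    if deadline is None:
        return "unknown"
    if deadline <= now:
        return "closed"
    if deadline < now + timedelta(days=3):
        return "urgent"
    if deadline < now + timedelta(days=7):
        return "upcoming"
    return "open"
```

Passing `now` explicitly keeps the function easy to test and lets you decide which timezone the comparison runs in.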
Common pitfalls
India's GeM (Government e-Marketplace) processed over INR 4.4 lakh crore in procurement in FY2024-25, but CPPP remains the primary portal for works and services contracts. Three failure modes are common when building tender pipeline databases on top of it:
- Schema drift from amended tenders: eProcure allows organizations to amend tenders after publication, changing deadlines, scope, or eligibility criteria. If you only insert new records and never update existing ones, your pipeline shows stale deadlines. Always upsert on tender_id and track last_seen_at to detect amendments.
- Fiscal year-end volume spikes: India's fiscal year ends March 31. Departments rush to spend allocated budgets in Q4 (January-March), publishing 2-3x the normal tender volume. Your pipeline and alerting system need to handle this surge without drowning your sales team in noise. Tighten your keyword and organization filters during Q4.
- Missing qualification context: the scraper provides metadata but not the full tender document. A tender title alone is insufficient for bid/no-bid decisions. Always enable fetchDetails and use the detail_href to download the complete RFP before committing resources to bid preparation.
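The amended-tender pitfall can be made visible before the upsert by comparing the stored row with the freshly scraped item. This is a small sketch, assuming both are available as dictionaries keyed by the column names from Step 2; `TRACKED_FIELDS` and `detect_amendments` are hypothetical names.

```python
# Fields whose change we treat as an amendment worth flagging to sales
TRACKED_FIELDS = ("tender_title", "bid_submission_deadline", "tender_opening_date")


def detect_amendments(stored, incoming, fields=TRACKED_FIELDS):
    """Return {field: (old_value, new_value)} for tracked fields that changed.

    A field missing from the incoming item is ignored rather than treated
    as a change, since a partial scrape should not look like an amendment.
    """
    changes = {}
    for field in fields:
        old, new = stored.get(field), incoming.get(field)
        if new is not None and old != new:
            changes[field] = (old, new)
    return changes
```

A non-empty result is a natural trigger for notifying whoever owns that tender in the pipeline.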
Build a scoring model on top of the structured fields: weight organization by your historical win rate, weight department by deal size, and penalize tenders with deadlines under 10 days. This turns raw extraction into qualified pipeline.
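One possible shape for such a scoring model is sketched below. The multiplicative formula, the default values, and the 0.5 penalty are placeholder assumptions to tune against your own data; only the 10-day threshold comes from the text above. The win-rate and deal-size lookups are plain dictionaries you would build from your CRM history.

```python
from datetime import datetime

DEADLINE_FORMAT = "%d-%b-%Y %I:%M %p"  # layout assumed from the sample output


def score_tender(tender, now, org_win_rate, dept_deal_size,
                 default_win_rate=0.05, default_deal_size=1.0,
                 short_notice_days=10, short_notice_penalty=0.5):
    """Score = win rate x relative deal size, halved when the deadline is close."""
    win_rate = org_win_rate.get(tender.get("organization"), default_win_rate)
    deal_size = dept_deal_size.get(tender.get("department"), default_deal_size)
    score = win_rate * deal_size

    try:
        deadline = datetime.strptime(tender["bid_submission_deadline"], DEADLINE_FORMAT)
    except (KeyError, TypeError, ValueError):
        return score  # unknown deadline: leave the score unpenalized

    if (deadline - now).days < short_notice_days:
        score *= short_notice_penalty
    return score
```

Sorting the active pipeline by this score in descending order gives the sales team a ranked worklist instead of a raw feed.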
Related use cases
- India Government Tenders Scraper actor page
- Scrape India government tenders for bid tracking
- Monitor CPPP tender deadlines for compliance
- Find India government contracts by department
- Build an India company registry database from MCA
- The complete guide to scraping compliance data
- All Thirdwatch use-case guides
Frequently asked questions
How do I avoid duplicate tender records in my database?
Use tender_id as your primary deduplication key. Each tender on eProcure has a unique identifier that persists even when the tender is amended or its deadline is extended. On each scrape run, upsert records by tender_id rather than inserting blindly. This lets you track field-level changes (deadline extensions, amended documents) without creating duplicate rows.
What volume of tenders should I expect from a daily scrape?
A broad keyword search like 'IT services' returns 200-500 active tenders on any given day. Narrowing with the organization filter reduces this to 10-50 per department. For a sales pipeline covering 5 keywords across 3 departments, expect 50-300 unique tenders per daily run after deduplication. Volume varies by season -- Q4 (January-March, India's fiscal year-end) sees 2-3x the normal volume as departments rush to spend allocated budgets.