Monitor Competitor LinkedIn Content at Scale — 2026 Guide
Track what competitor brand pages and execs publish on LinkedIn — text, reactions, comments, edits, and media — with Thirdwatch's LinkedIn Post Scraper.

Thirdwatch's LinkedIn Post Scraper returns text, author, reactions, comments, media, and edit status for any public LinkedIn post — exactly the dataset competitive-intelligence and brand-monitoring teams need to track what competitors are saying, how it lands, and when it changes. This guide is the practical workflow for going from a competitor watchlist to a self-maintaining content-intelligence feed.
Why monitor competitor LinkedIn content
LinkedIn has quietly become the default executive-communication channel for B2B brands. According to LinkedIn's own marketing solutions data, four out of five LinkedIn members drive business decisions in their organizations, and the platform's audience has roughly 2x the buying power of the average web audience. Product launches, executive hires, narrative pivots, fundraising news, and customer wins now break on LinkedIn before they hit press releases, and marketing teams that track competitors want to see them as they happen, not a week later in a press round-up.
The job-to-be-done is structured: a watchlist of competitor company pages and key executives, a window of "posts in the last N days", and a structured row per post that captures text, engagement, and timing. Three competitive-intelligence patterns sit on top of that primitive: a daily digest of new competitor posts, an engagement-anomaly alert for posts breaking out (high reactions/comments velocity), and a quarterly content-strategy report comparing tone, topic mix, and posting cadence across competitors.
How does this compare to alternatives?
Three realistic options for competitor LinkedIn monitoring:
| Approach | Reliability | Setup time | Maintenance |
|---|---|---|---|
| Social-listening platform (Brandwatch, Sprinklr) | High, but expensive at competitive-intelligence depth | Days (vendor onboarding) | Vendor-managed |
| LinkedIn Sales Navigator alerts | Coverage limited to follow-the-page surface, no engagement detail | Hours | Built into LinkedIn |
| Thirdwatch LinkedIn Post Scraper + your pipeline | Production-tested, raw data direct from LinkedIn | 5 minutes | Thirdwatch tracks LinkedIn changes |
Social-listening platforms work if you want a managed dashboard and have budget for it. For teams that already have a data warehouse and want competitor content as a clean SQL table they can join against pipeline data, the actor is the lighter-weight, lower-lock-in choice — see the LinkedIn Post Scraper actor page for the live spec.
How to build a competitor content monitoring loop in 6 steps
Step 1: How do I authenticate against Apify?
Sign in at apify.com, grab your API token:
export APIFY_TOKEN="apify_api_xxxxxxxxxxxxxxxx"
Step 2: How do I assemble a competitor post URL list?
The post scraper enriches URLs — it does not discover them. The sourcing pattern that works for most brand-monitoring teams is a hybrid:
# Sourcing layer sketch: map each competitor slug to its company page
competitor_pages = {
    "competitor-a": "https://www.linkedin.com/company/competitor-a/",
    "competitor-b": "https://www.linkedin.com/company/competitor-b/",
    "competitor-c": "https://www.linkedin.com/company/competitor-c/",
}
# Source recent post URLs via:
# (1) Google search: site:linkedin.com/posts "competitor-a"
# (2) LinkedIn company page Activity tab (manual or scraped)
# (3) Periodic scrape of company page recent activity
def source_recent_post_urls(page_url, days=7):
    """Placeholder: plug in whichever sourcing channel above you use."""
    return []  # should return post permalinks published in the last `days` days
post_urls = []
for slug, page_url in competitor_pages.items():
    post_urls.extend(source_recent_post_urls(page_url, days=7))
print(f"{len(post_urls)} posts queued across {len(competitor_pages)} competitors")
Curated sourcing once a week, with hourly enrichment on the resulting URL list, is the usual cadence.
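Because the URL list is assembled from several channels, the same post can show up more than once, or with tracking parameters attached. The following is a minimal sketch of a cleanup pass, assuming a post's identity lives entirely in the URL path (so the query string and fragment can be dropped). The helper names `normalize_post_url` and `dedupe_post_urls` are hypothetical and not part of the actor.

```python
from urllib.parse import urlsplit, urlunsplit

def normalize_post_url(url: str) -> str:
    """Lowercase scheme and host, drop query and fragment, ensure a trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path if parts.path.endswith("/") else parts.path + "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, "", ""))

def dedupe_post_urls(urls):
    """Return normalized URLs in first-seen order, without duplicates."""
    seen = set()
    unique = []
    for url in urls:
        key = normalize_post_url(url)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique
```

Running `post_urls = dedupe_post_urls(post_urls)` before enrichment keeps duplicate URLs from using up the post budget of a run.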
Step 3: How do I enrich the URL list with full post data?
Pass the URL list to the actor. maxPosts caps the run at 200.
import os
import requests
import pandas as pd
ACTOR = "thirdwatch~linkedin-post-scraper"
TOKEN = os.environ["APIFY_TOKEN"]
# Run the actor synchronously and get the dataset items back in one response
resp = requests.post(
    f"https://api.apify.com/v2/acts/{ACTOR}/run-sync-get-dataset-items",
    params={"token": TOKEN},
    json={"postUrls": post_urls, "maxPosts": 200},
    timeout=900,
)
resp.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error body
df = pd.DataFrame(resp.json())
print(f"{len(df)} competitor posts hydrated")
print(df[["author_name", "author_is_company", "reactions_count",
          "comments_count", "posted_relative"]].head(10))
Step 4: How do I detect competitor content trends?
Group by competitor and aggregate. Two views are usually enough: posting cadence and topic mix.
# Posting cadence: how active is each competitor this week?
cadence = (
    df.groupby("author_name")
    .agg(posts=("url", "count"),
         median_reactions=("reactions_count", "median"),
         median_comments=("comments_count", "median"),
         edited_share=("edited", "mean"))
    .sort_values("posts", ascending=False)
)
print(cadence)
# Topic mix: simple keyword tags as a starting point
# Non-capturing groups (?:...) avoid pandas' "has match groups" warning in str.contains
import re
TOPICS = {
    "hiring": re.compile(r"\b(?:hiring|join us|we're hiring|role|open position)\b", re.I),
    "product": re.compile(r"\b(?:launch|release|now available|introducing)\b", re.I),
    "customer": re.compile(r"\b(?:customer|case study|success story)\b", re.I),
    "funding": re.compile(r"\b(?:series [a-d]|raised|funding|round)\b", re.I),
    "exec_move": re.compile(r"\b(?:joining|welcome|promoted|new role)\b", re.I),
}
for tag, pattern in TOPICS.items():
    df[f"is_{tag}"] = df["text"].fillna("").str.contains(pattern)
# Share of each competitor's posts that hit each topic
topic_mix = df.groupby("author_name")[[f"is_{t}" for t in TOPICS]].mean()
print(topic_mix.round(2))
A jump in is_funding posts from a competitor a week before TechCrunch picks up the story is exactly the kind of early-warning signal this loop is for.
Step 5: How do I detect engagement breakouts?
A breakout is a post that performs materially above the competitor's own baseline — the right signal to wake up a strategy review.
# Per-competitor baseline; std is NaN for single-post authors, so treat it as 0
baselines = df.groupby("author_name")["reactions_count"].agg(
    baseline_median="median", baseline_std="std").fillna(0)
df = df.merge(baselines, left_on="author_name", right_index=True)
# Median-centered z-like score; the +1 avoids division by zero
df["z_reactions"] = (df["reactions_count"] - df["baseline_median"]) / (df["baseline_std"] + 1)
breakouts = df[df["z_reactions"] > 2].sort_values("z_reactions", ascending=False)
print(breakouts[["author_name", "text", "reactions_count",
                 "comments_count", "z_reactions", "url"]].head(10))
A score above 2 against the competitor's own recent baseline is a common starting threshold for flagging "something worked here, copy the pattern."
Step 6: How do I diff posts to catch edits?
Re-scrape the same URLs on a schedule. If the same url returns different text, you have an edit event; a changed reactions_count on its own is just engagement drift, which is why the code below hashes only the text.
import sqlite3, hashlib
conn = sqlite3.connect("competitor_posts.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS post_state (
        url TEXT PRIMARY KEY,
        text_hash TEXT, reactions_count INTEGER, comments_count INTEGER,
        edited INTEGER, last_seen TEXT
    )
""")
def diff_and_persist(row):
    # A short SHA-1 prefix of the text is enough to detect a change
    h = hashlib.sha1(row["text"].encode()).hexdigest()[:12]
    prev = conn.execute(
        "SELECT text_hash FROM post_state WHERE url = ?",
        (row["url"],)
    ).fetchone()
    if prev and prev[0] != h:
        print(f"[EDITED] {row['author_name']} - {row['url']}")
    conn.execute("""
        INSERT OR REPLACE INTO post_state
        VALUES (?, ?, ?, ?, ?, datetime('now'))
    """, (row["url"], h, int(row["reactions_count"]),
          int(row["comments_count"]), int(bool(row["edited"]))))
# Fill missing values first: NaN is truthy, so `value or default` would not catch it
clean = df.fillna({"text": "", "reactions_count": 0, "comments_count": 0, "edited": False})
for _, row in clean.iterrows():
    diff_and_persist(row)
conn.commit()
This pattern catches the not-uncommon case of a competitor quietly editing a post after publishing, sometimes to fix a typo, sometimes to walk back a claim. Both are signal.
Sample output
Two rows from a competitor watchlist run, identifiers redacted:
[
  {
    "url": "https://www.linkedin.com/feed/update/urn:li:activity:74440XXXXX0000000/",
    "author_name": "[COMPETITOR_A] (Company)",
    "author_headline": "B2B SaaS for revenue teams · 12,500 followers",
    "author_is_company": true,
    "text": "We are excited to announce our Series C — $80M led by [REDACTED]. This funding accelerates our AI-native roadmap...",
    "posted_relative": "5h",
    "edited": false,
    "reactions_count": 842,
    "comments_count": 67,
    "media": [{"type": "image", "url": "https://media.licdn.com/dms/image/..."}]
  },
  {
    "url": "https://www.linkedin.com/feed/update/urn:li:activity:74445XXXXX0000000/",
    "author_name": "[COMPETITOR_A_CEO]",
    "author_headline": "Co-founder & CEO at [COMPETITOR_A]",
    "author_is_company": false,
    "text": "Why we built [PRODUCT_X] instead of buying it — a thread...",
    "posted_relative": "1d",
    "edited": true,
    "reactions_count": 1247,
    "comments_count": 89
  }
]
The pair tells a story: the company page announced the round, the CEO published a founder-voice post within about a day of it, and the founder-voice post out-engaged the company post roughly 1.5x (1,247 vs. 842 reactions), a typical pattern for B2B SaaS. The edited: true on the founder post is worth checking; founders sometimes adjust early-engagement posts after the first hour to address comments.
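Raw reaction counts ignore how long a post has been live, so a velocity view can be useful alongside them. The sketch below converts posted_relative into approximate hours and divides reactions by that age. Only the "5h" and "1d" forms appear in the sample above; the other unit suffixes in the table are assumptions about LinkedIn's relative timestamps, and the helper names are hypothetical.

```python
import re

# Hours per unit. Only "h" and "d" are confirmed by the sample output;
# "m", "w", "mo", and "yr" are assumed suffixes.
_UNIT_HOURS = {"m": 1 / 60, "h": 1, "d": 24, "w": 24 * 7, "mo": 24 * 30, "yr": 24 * 365}
_RELATIVE = re.compile(r"^\s*(\d+)\s*(mo|yr|m|h|d|w)\b", re.I)

def hours_since_posted(posted_relative):
    """Convert strings like '5h' or '1d' to approximate hours; None if unparseable."""
    match = _RELATIVE.match(posted_relative or "")
    if not match:
        return None
    return int(match.group(1)) * _UNIT_HOURS[match.group(2).lower()]

def reactions_per_hour(post):
    """Reactions divided by post age, flooring the age at one hour."""
    hours = hours_since_posted(post.get("posted_relative"))
    if hours is None:
        return None
    return (post.get("reactions_count") or 0) / max(hours, 1)
```

Applied to the two sample rows, the company post (842 reactions at "5h") works out to about 168 reactions per hour, while the CEO post (1,247 reactions at "1d") comes to about 52 per hour. Relative timestamps are coarse ("1d" can mean anything from 24 to 47 hours), so treat the result as a ranking aid rather than a precise rate.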
Common pitfalls
Four patterns trip up competitor-monitoring pipelines:
- The sourcing layer is the harder problem. The actor reliably enriches URLs, but discovering a competitor's recent URLs is a separate scraping problem (company page activity, Google site search, or manual list).
- Company posts and executive posts ride different engagement curves. Group them separately or you'll over-credit competitors with large company page follower bases.
- Edits are real and frequent. Competitors edit posts in the first hour to fix typos or sharpen claims; persist a text hash and diff.
- Reposts are not exposed. reposts_count is null on the public embed; if reshare velocity matters, sample a few reshare permalinks separately.
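The second pitfall, keeping company-page and executive posts apart, can be sketched with the author_is_company flag the actor returns. This is one possible way to compute separate baselines using only the standard library; the function name `segment_medians` and the "company"/"person" labels are hypothetical.

```python
from collections import defaultdict
from statistics import median

def segment_medians(posts):
    """Median reactions_count for company-page posts and for individual posts."""
    segments = defaultdict(list)
    for post in posts:
        segment = "company" if post.get("author_is_company") else "person"
        segments[segment].append(post.get("reactions_count") or 0)
    return {segment: median(values) for segment, values in segments.items()}
```

Comparing each post against the median of its own segment, rather than one pooled median, keeps a large company page from setting the bar for every executive on the watchlist.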
Thirdwatch's actor uses production-grade anti-bot tooling and rotates outbound IPs by default, so a 200-post enrichment of a 5-competitor watchlist typically completes in three to five minutes — small enough to schedule hourly. Very large runs (10K+ posts) should be chunked across multiple parallel runs to keep individual timeouts comfortable.
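Chunking a large URL list can be as simple as slicing it into fixed-size batches and starting one run per batch. A minimal sketch follows; the default size of 200 mirrors the maxPosts value used in Step 3, and whether that is a hard limit of the actor is an assumption. `chunked` is a hypothetical helper name.

```python
def chunked(urls, size=200):
    """Yield consecutive slices of at most `size` URLs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(urls), size):
        yield urls[start:start + size]
```

Each yielded batch would become the postUrls input of a separate run, which keeps every individual request well inside its timeout.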
Related use cases
- Scrape LinkedIn posts without login
- Track LinkedIn post engagement for influencer research
- Build a LinkedIn thought-leadership dataset
- Track LinkedIn hiring velocity by company
- Monitor brand TikTok presence
- Track brand mentions on Twitter
- The complete guide to scraping social media
- All Thirdwatch use-case guides
Frequently asked questions
Can I scrape posts from any competitor's LinkedIn company page?
Yes, any public company page. The actor accepts /feed/update/urn:li:activity:{id} permalinks for posts published by company pages and returns author_is_company: true so you can distinguish brand posts from employee posts in your dataset. Private or restricted pages are not accessible.
How do I get the URLs of every recent post a competitor published?
The post scraper reads single-post URLs — it does not paginate a company page's full post history. The sourcing step typically uses Google search restricted to site:linkedin.com/posts {company-name}, a manual URL list from the company page's Activity tab, or a periodic profile-activity scrape. Once you have URLs, the post scraper handles enrichment.
How fresh is the data?
Each run pulls live from LinkedIn at request time. A post published two minutes ago is already scrapeable; reaction and comment counts update in real time on the public embed. For competitive intelligence on actively-publishing competitors, an hourly schedule on the last 24 hours of post URLs is the standard pattern.
Can I detect when a competitor edits or deletes a post?
Yes for edits — the actor returns edited: true on any post LinkedIn has marked as edited, and re-scraping the same URL refreshes text and counts in place so you can diff. Deletions surface as a 404 on the next scrape; persist a posts table keyed on url and treat missing rows on re-scrape as deletion events.
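The deletion check described above can be sketched as a set difference between the URLs requested in a run and the URLs that came back, restricted to posts already stored. This assumes the post_state table from Step 6; `find_missing_urls` is a hypothetical helper name.

```python
import sqlite3

def find_missing_urls(conn: sqlite3.Connection, requested_urls, returned_urls):
    """URLs that were requested and previously stored, but absent from the latest scrape."""
    missing = set(requested_urls) - set(returned_urls)
    if not missing:
        return []
    known = {row[0] for row in conn.execute("SELECT url FROM post_state")}
    return sorted(missing & known)
```

A single missing row can also come from a transient failure, so it may be safer to flag a deletion only after a URL has been missing for two or more consecutive runs.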
Does this work for posts in non-English languages?
Yes. The actor extracts text from LinkedIn's public embed regardless of language. UTF-8 is preserved end-to-end, so Arabic, Chinese, Japanese, Hindi, and other non-Latin scripts come through cleanly. For competitive monitoring of multilingual brands (CPG, telco, BFSI in APAC and MENA), this matters.
Is scraping competitor LinkedIn posts legal?
Public posts are visible to anonymous visitors and accessing public web pages programmatically is broadly permitted, but LinkedIn's terms of service restrict automated access. The actor reads only public embed pages. As with any web data work, your legal team should review the use case against jurisdiction-specific laws and your contracts; this guide is not legal advice.