Baidu Search Scraper

Scrape Baidu (百度) search engine results for any search query. Extract organic results, news articles, Baike entries, and SERP metadata including related queries, total result estimates, and detected SERP features.

Baidu is China's dominant search engine with roughly 70% domestic market share. This scraper is the equivalent of a Google Search Scraper for the Chinese-language web — essential for any team doing SEO research, competitive intelligence, or data collection targeting the Chinese market.

What this scraper collects

For each search result, the scraper extracts:

Title — the result headline as shown on Baidu
URL — the real destination URL (resolved from Baidu's redirect wrapper)
Displayed URL — the shortened URL displayed on the SERP
Snippet — the descriptive text shown below the title
Result type — organic, news, baike, video
Source site — hostname of the result
Published date — date shown for news results
Thumbnail URL — image thumbnail when present
Is Baidu-owned — flags results pointing to Baidu properties (Baike, Zhidao, Tieba, etc.)
Related queries — the "related searches" shown on the SERP
Total results estimate — Baidu's stated result count for the query
SERP features — detected features: answer_box, knowledge_panel, image_pack, video_pack, news_box
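
As an illustration of the "Source site" and "Is Baidu-owned" fields above, the sketch below derives both from a resolved destination URL. The function names `source_site` and `is_baidu_owned` are hypothetical, and the rule used here (any hostname at or under `baidu.com`) is an assumption; the scraper's actual detection logic is not documented and may cover other Baidu domains.

```python
from urllib.parse import urlsplit


def source_site(url: str) -> str:
    """Return the lowercase hostname of a URL, or "" if it has none."""
    return (urlsplit(url).hostname or "").lower()


def is_baidu_owned(url: str) -> bool:
    """Assumed rule: the hostname is baidu.com or one of its subdomains."""
    host = source_site(url)
    # Compare against ".baidu.com" so lookalikes such as notbaidu.com fail.
    return host == "baidu.com" or host.endswith(".baidu.com")
```

Under this rule, Baike (`baike.baidu.com`), Zhidao (`zhidao.baidu.com`), and Tieba (`tieba.baidu.com`) results are all flagged, while a URL that merely mentions `baidu.com` in its path or query string is not.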

Supported search modes

Mode	URL pattern	Use case
`web`	`www.baidu.com/s?wd=...`	Standard web search results
`news`	`news.baidu.com/ns?word=...`	News articles from Chinese media
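
The two URL patterns in the table can be sketched as a small URL builder. This is a minimal sketch, not the scraper's own code: the `build_search_url` name and the `https` scheme are assumptions, and any paging or results-per-page parameters the scraper may append are left out because they are not documented here.

```python
from urllib.parse import urlencode

# Host, path, and query parameter name per mode, from the table above.
# The https scheme is an assumption.
SEARCH_ENDPOINTS = {
    "web": ("https://www.baidu.com/s", "wd"),
    "news": ("https://news.baidu.com/ns", "word"),
}


def build_search_url(query: str, search_type: str = "web") -> str:
    """Return the first-page search URL for a query in the given mode."""
    if search_type not in SEARCH_ENDPOINTS:
        raise ValueError(f"unsupported search type: {search_type!r}")
    base, param = SEARCH_ENDPOINTS[search_type]
    # urlencode percent-encodes non-ASCII text as UTF-8 by default,
    # so Chinese queries can be passed in directly.
    return f"{base}?{urlencode({param: query})}"
```

For example, `build_search_url("python", "web")` returns `https://www.baidu.com/s?wd=python`.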

Usage

Configure one or more search queries and set the search mode:

{
  "queries": ["人工智能", "machine learning", "python编程"],
  "searchType": "web",
  "maxPages": 3,
  "maxItems": 100
}

Input parameters

Parameter	Type	Default	Description
`queries`	array	required	Search query strings (Chinese or English)
`searchType`	string	`web`	Search mode: `web` or `news`
`resultsPerPage`	integer	`10`	Results per page (10 or 50)
`maxPages`	integer	`3`	Max SERP pages per query
`maxItems`	integer	`15`	Maximum total results across all queries
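
To check an input object before starting a run, something like the following sketch could apply the defaults from the table and reject values outside the listed options. The `ScraperInput`, `validate_input`, and `result_upper_bound` names are hypothetical, and the minimum of 1 for `maxPages` and `maxItems` is an assumption; the scraper's own validation may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScraperInput:
    # Field names and defaults mirror the parameter table above.
    queries: List[str]
    searchType: str = "web"
    resultsPerPage: int = 10
    maxPages: int = 3
    maxItems: int = 15


def validate_input(raw: dict) -> ScraperInput:
    """Fill in defaults and check each value against the table."""
    if "queries" not in raw:
        raise ValueError("queries is required")
    cfg = ScraperInput(**raw)  # unknown keys raise TypeError
    if not isinstance(cfg.queries, list) or not cfg.queries:
        raise ValueError("queries must be a non-empty list")
    if not all(isinstance(q, str) and q.strip() for q in cfg.queries):
        raise ValueError("each query must be a non-empty string")
    if cfg.searchType not in ("web", "news"):
        raise ValueError("searchType must be 'web' or 'news'")
    if cfg.resultsPerPage not in (10, 50):
        raise ValueError("resultsPerPage must be 10 or 50")
    # Assumption: both limits must be at least 1.
    if cfg.maxPages < 1 or cfg.maxItems < 1:
        raise ValueError("maxPages and maxItems must be at least 1")
    return cfg


def result_upper_bound(cfg: ScraperInput) -> int:
    """Most results a run can return, assuming full pages for every query."""
    per_run = len(cfg.queries) * cfg.maxPages * cfg.resultsPerPage
    return min(cfg.maxItems, per_run)
```

With the input from the Usage section (three queries, `maxPages` of 3, default `resultsPerPage` of 10, `maxItems` of 100), the page limits allow at most 3 × 3 × 10 = 90 results, so `maxItems` is never reached. With the default `maxItems` of 15, that cap applies instead.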

Performance and anti-bot notes

Baidu's web application firewall (WAF) blocks data center IP addresses, so this scraper routes its requests through residential proxies to receive real search results. Requests are paced to avoid triggering rate-limit responses.

Politically sensitive queries on Baidu may return fewer results or redirected results. This is inherent to Baidu's content policies and is not a scraper defect.

Use cases

Chinese SEO research — track keyword rankings and SERP features on Baidu
Brand monitoring — monitor how a brand appears in Chinese search results
Competitive intelligence — analyze which Chinese sites rank for industry keywords
AI training data — collect Baidu SERPs as retrieval anchors for Chinese-language models
Academic research — study information retrieval and content availability on the Chinese web
