OrbTop

Baidu Search Scraper

Scrape Baidu (百度) search engine results for any search query. Extract organic results, news articles, Baike entries, and SERP metadata including related queries, total result estimates, and detected SERP features.

Baidu is China's dominant search engine with roughly 70% domestic market share. This scraper is the equivalent of a Google Search Scraper for the Chinese-language web — essential for any team doing SEO research, competitive intelligence, or data collection targeting the Chinese market.

What this scraper collects

For each search result, the scraper extracts:

  • Title — the result headline as shown on Baidu
  • URL — the real destination URL (resolved from Baidu's redirect wrapper)
  • Displayed URL — the shortened URL displayed on the SERP
  • Snippet — the descriptive text shown below the title
  • Result type — organic, news, baike, or video
  • Source site — hostname of the result
  • Published date — date shown for news results
  • Thumbnail URL — image thumbnail when present
  • Is Baidu-owned — flags results pointing to Baidu properties (Baike, Zhidao, Tieba, etc.)
  • Related queries — the "related searches" shown on the SERP
  • Total results estimate — Baidu's stated result count for the query
  • SERP features — detected features: answer_box, knowledge_panel, image_pack, video_pack, news_box
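
The per-result fields above could be modeled roughly as follows. This is a minimal sketch, not the scraper's actual output schema: the class name, the snake_case field names, and the rule for flagging Baidu-owned hosts (any hostname under baidu.com) are assumptions made for illustration. SERP-level fields such as related queries are left out.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse

# Assumption: Baidu properties such as Baike, Zhidao, and Tieba
# are served from subdomains of baidu.com.
BAIDU_DOMAIN = "baidu.com"


@dataclass
class SearchResult:
    """One search result; field names are hypothetical."""

    title: str
    url: str              # destination URL after redirect resolution
    displayed_url: str    # shortened URL as shown on the SERP
    snippet: str
    result_type: str      # "organic", "news", "baike", or "video"
    published_date: Optional[str] = None   # only shown for news results
    thumbnail_url: Optional[str] = None    # only when a thumbnail is present

    @property
    def source_site(self) -> str:
        # Hostname of the destination URL, lowercased; empty if unparseable.
        return (urlparse(self.url).hostname or "").lower()

    @property
    def is_baidu_owned(self) -> bool:
        # Match baidu.com itself or any subdomain, but not lookalike hosts.
        host = self.source_site
        return host == BAIDU_DOMAIN or host.endswith("." + BAIDU_DOMAIN)
```

Deriving the source site and the Baidu-owned flag from the resolved URL keeps the two values consistent with each other; the scraper may compute them differently.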

Supported search modes

Mode   URL pattern                   Use case
web    www.baidu.com/s?wd=...        Standard web search results
news   news.baidu.com/ns?word=...    News articles from Chinese media
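
To make the two URL patterns concrete, the sketch below builds a search URL for either mode. The function name build_search_url is hypothetical, and the https scheme and UTF-8 percent-encoding are assumptions; only the hosts, paths, and the wd / word parameter names come from the patterns listed above.

```python
from urllib.parse import urlencode

# Host, path, and query-parameter name for each supported mode.
SEARCH_ENDPOINTS = {
    "web": ("www.baidu.com", "/s", "wd"),
    "news": ("news.baidu.com", "/ns", "word"),
}


def build_search_url(query: str, search_type: str = "web") -> str:
    """Return the first-page search URL for a query (hypothetical helper)."""
    if search_type not in SEARCH_ENDPOINTS:
        raise ValueError(f"unsupported search type: {search_type!r}")
    host, path, param = SEARCH_ENDPOINTS[search_type]
    # urlencode percent-encodes non-ASCII text as UTF-8 and spaces as "+".
    return f"https://{host}{path}?{urlencode({param: query})}"


print(build_search_url("machine learning", "news"))
# https://news.baidu.com/ns?word=machine+learning
```

Pagination parameters are deliberately omitted because the patterns above do not specify them.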

Usage

Configure one or more search queries and set the search mode:

{
  "queries": ["人工智能", "machine learning", "python编程"],
  "searchType": "web",
  "maxPages": 3,
  "maxItems": 100
}

Input parameters

Parameter        Type      Default    Description
queries          array     required   Search query strings (Chinese or English)
searchType       string    web        Search mode: web or news
resultsPerPage   integer   10         Results per page (10 or 50)
maxPages         integer   3          Max SERP pages per query
maxItems         integer   15         Maximum total results across all queries
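
The sketch below shows how these parameters interact, assuming the defaults and allowed values listed above. The helper names normalize_input and max_possible_results are hypothetical, and rejecting an empty queries array is an assumption; the scraper's own validation may differ.

```python
from typing import Any, Dict

# Defaults as documented in the parameter list above.
DEFAULTS = {"searchType": "web", "resultsPerPage": 10, "maxPages": 3, "maxItems": 15}


def normalize_input(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Apply defaults and check the documented constraints."""
    queries = raw.get("queries")
    if not isinstance(queries, list) or not queries:
        raise ValueError("queries is required and must be a non-empty array")
    config = {**DEFAULTS, **raw}
    if config["searchType"] not in ("web", "news"):
        raise ValueError("searchType must be 'web' or 'news'")
    if config["resultsPerPage"] not in (10, 50):
        raise ValueError("resultsPerPage must be 10 or 50")
    for key in ("maxPages", "maxItems"):
        if not isinstance(config[key], int) or config[key] < 1:
            raise ValueError(f"{key} must be a positive integer")
    return config


def max_possible_results(config: Dict[str, Any]) -> int:
    """Upper bound on returned results: pages fetched, capped by maxItems."""
    fetched = len(config["queries"]) * config["maxPages"] * config["resultsPerPage"]
    return min(fetched, config["maxItems"])
```

With three queries, maxPages of 3, and the default resultsPerPage of 10, at most 90 results can be fetched, so a maxItems of 100 never takes effect. With the default maxItems of 15, the cap is reached partway through the first query's second page.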

Performance and anti-bot notes

Baidu's web application firewall (WAF) blocks all data center IP addresses, so this scraper routes requests through residential proxies to return real search results. Request pacing is configured to avoid triggering rate-limit responses.
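
As an illustration of what request pacing can look like, the sketch below waits a randomized interval between consecutive items. It is not the scraper's actual pacing logic: the function name paced and the delay values are assumptions chosen for the example.

```python
import random
import time
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def paced(
    items: Iterable[T],
    base_delay: float = 2.0,   # illustrative value, in seconds
    jitter: float = 1.0,       # illustrative value, in seconds
    sleep: Callable[[float], None] = time.sleep,
    rng: Optional[random.Random] = None,
) -> Iterator[T]:
    """Yield items, sleeping base_delay plus random jitter between them."""
    rng = rng or random.Random()
    for index, item in enumerate(items):
        if index > 0:
            # No wait before the first item; a jittered wait before each later one.
            sleep(base_delay + rng.uniform(0.0, jitter))
        yield item
```

A caller would loop over `paced(urls)` and fetch each URL inside the loop. Randomizing the interval avoids a fixed request rhythm, and passing in sleep and rng makes the behavior easy to test without real waiting.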

Politically sensitive queries on Baidu may return reduced result counts or redirected results — this is inherent to Baidu's content policies, not a scraper defect.

Use cases

  • Chinese SEO research — track keyword rankings and SERP features on Baidu
  • Brand monitoring — monitor how a brand appears in Chinese search results
  • Competitive intelligence — analyze which Chinese sites rank for industry keywords
  • AI training data — collect Baidu SERPs as retrieval anchors for Chinese-language models
  • Academic research — study information retrieval and content availability on the Chinese web