OrbTop

BeiDou Navigation System News Scraper

NEWS, BUSINESS, AUTOMATION

Scrape official news, announcements, policies, and technical bulletins from beidou.gov.cn, the website of the Chinese government's BeiDou Navigation Satellite System (BDS) authority (国务院北斗办).

What this actor does

Crawls 5 official sub-channels of the BeiDou Navigation Satellite System website:

| Sub-channel | Chinese name | Content |
| --- | --- | --- |
| xwzx | 新闻中心 | News center: main news (about 1,000 articles) |
| gfgg | 公告公示 | Official announcements and notices |
| zcfg | 政策法规 | Policy and regulations |
| sjjl | 数据交流 | Data exchange bulletins |
| yytg | 应用推广 | Application promotion and use cases |

For each article, it extracts the title, full body text and HTML, publish date, source publication, images, file attachments, and satellite system references (BDS-3, GEO, IGSO, MEO mentions).
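The satellite-reference extraction can be pictured as a keyword scan over the article text. The following is a minimal sketch under stated assumptions, not the actor's actual code: the pattern list and the name `find_satellite_refs` are hypothetical, chosen to cover the terms named above plus the Chinese generation names such as 北斗三号.

```python
import re

# Hypothetical keyword patterns; the actor's real list is not documented.
# ASCII lookarounds are used instead of \b because \b treats adjacent
# Chinese characters as word characters and would miss "3颗GEO卫星".
SATELLITE_PATTERNS = [
    r"BDS-[123]",
    r"北斗[一二三]号",
    r"(?<![A-Za-z])GEO(?![A-Za-z])",
    r"(?<![A-Za-z])IGSO(?![A-Za-z])",
    r"(?<![A-Za-z])MEO(?![A-Za-z])",
]
_SATELLITE_RE = re.compile("|".join(SATELLITE_PATTERNS))


def find_satellite_refs(text: str) -> str:
    """Return unique matches in order of first appearance, pipe-separated."""
    seen = []
    for match in _SATELLITE_RE.finditer(text):
        term = match.group(0)
        if term not in seen:
            seen.append(term)
    return "|".join(seen)
```

Joining with `|` mirrors the pipe-separated format used by the `related_satellites` output field.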

Who uses this

  • Aviation and maritime authorities tracking BDS-3 signal-in-space accuracy and anomaly announcements
  • BDS service providers in Belt and Road Initiative (BRI) countries monitoring policy changes
  • Researchers tracking dual-use satellite navigation policy
  • Organizations needing zh-CN primary-source data on GNSS developments

Input parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| maxItems | integer | 10 | Maximum number of articles to scrape. Set to 0 for all. |
| subchannels | array | all | Sub-channels to include. Omit to scrape all 5 channels. |

Example input (scrape recent news + announcements)

{
  "maxItems": 100,
  "subchannels": ["xwzx", "gfgg"]
}
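The defaults described in the parameter table can be made explicit before a run. This sketch shows one possible reading of those rules; the function `normalize_input`, the constant `ALL_SUBCHANNELS`, and the returned `limit` key are illustrative names, not part of the actor's interface.

```python
# Sub-channel keys as listed in "What this actor does".
ALL_SUBCHANNELS = ["xwzx", "gfgg", "zcfg", "sjjl", "yytg"]


def normalize_input(raw: dict) -> dict:
    """Apply the documented defaults: maxItems=10, 0 means no limit,
    and a missing or empty subchannels list means all five channels."""
    max_items = raw.get("maxItems", 10)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_items, bool) or not isinstance(max_items, int) or max_items < 0:
        raise ValueError("maxItems must be a non-negative integer")

    requested = raw.get("subchannels") or ALL_SUBCHANNELS
    unknown = [key for key in requested if key not in ALL_SUBCHANNELS]
    if unknown:
        raise ValueError(f"unknown sub-channels: {unknown}")

    return {
        "limit": None if max_items == 0 else max_items,  # None = scrape everything
        "subchannels": list(requested),
    }
```

Rejecting unknown keys early is a design choice of this sketch; how the actor itself treats an unrecognized sub-channel is not stated here.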

Output schema

Each record contains:

| Field | Type | Description |
| --- | --- | --- |
| article_id | string | Article ID from URL (e.g. t20260609_29339) |
| subchannel | string | Sub-channel key (xwzx, gfgg, etc.) |
| subchannel_name | string | Sub-channel name in Chinese |
| title_zh | string | Article title (Chinese) |
| body_text | string | Full article body as plain text |
| body_html | string | Full article body as HTML |
| publish_date | string | Publish date (YYYY-MM-DD) |
| source | string | Source publication (e.g. 人民网, 新华网) |
| source_url | string | Canonical article URL |
| images | string | Pipe-separated absolute image URLs from the article |
| attachments | string | Pipe-separated attachment URLs (.pdf, .doc, etc.) |
| related_satellites | string | Pipe-separated BDS satellite references found in text |
| scrapedAt | string | ISO-8601 scrape timestamp |

Sample record

{
  "article_id": "t20260609_29339",
  "subchannel": "xwzx",
  "subchannel_name": "新闻中心",
  "title_zh": "5G+北斗,织密\"三夏\"丰收网",
  "body_text": "近年来,北斗卫星导航系统在农业领域的应用不断深化...",
  "publish_date": "2026-06-08",
  "source": "河南日报",
  "source_url": "http://www.beidou.gov.cn/yw/xwzx/202606/t20260609_29339.html",
  "images": "http://www.beidou.gov.cn/yw/xwzx/202606/W020260609123456.jpg",
  "attachments": "",
  "related_satellites": "北斗三号",
  "scrapedAt": "2026-06-11T00:00:00.000Z"
}
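Because `images`, `attachments`, and `related_satellites` arrive as pipe-separated strings, downstream code will usually want them as lists, and the two date fields as date objects. A minimal post-processing sketch follows; `split_pipes` and `parse_record` are hypothetical helper names, and the sketch assumes no URL contains a literal `|`.

```python
from datetime import date, datetime

PIPE_FIELDS = ("images", "attachments", "related_satellites")


def split_pipes(value: str) -> list:
    """Split a pipe-separated field; an empty string yields an empty list."""
    return [part for part in value.split("|") if part]


def parse_record(record: dict) -> dict:
    """Return a copy of the record with list and date fields converted."""
    parsed = dict(record)
    for field in PIPE_FIELDS:
        parsed[field] = split_pipes(record.get(field, ""))
    parsed["publish_date"] = date.fromisoformat(record["publish_date"])
    # Replace the trailing "Z" so fromisoformat also works on Python < 3.11.
    parsed["scrapedAt"] = datetime.fromisoformat(
        record["scrapedAt"].replace("Z", "+00:00")
    )
    return parsed
```

Applied to the sample record, `attachments` becomes an empty list and `related_satellites` becomes a one-element list containing 北斗三号.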

Notes

  • All content is in Simplified Chinese (zh-CN).
  • The site is server-rendered with no anti-bot measures. No proxy required.
  • Listing pages are paginated as index_N.html (e.g. index_1.html through index_76.html for xwzx).
  • The xwzx (news center) channel has about 77 pages × 13 articles per page, or roughly 1,000 articles in total.
  • A generous 60-second per-request timeout is set to tolerate occasional latency from Chinese government servers.
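
The pagination scheme and the article ID format noted above can be sketched as follows. Two things here are assumptions rather than documented behavior: that the first listing page is plain index.html (which would account for 77 pages when the numbered pages run from 1 to 76), and that every sub-channel lives under the same /yw/ path seen in the sample record's `source_url`. The names `listing_urls` and `article_id_from_url` are illustrative.

```python
import re

# Base path inferred from the sample record's source_url (assumption for
# sub-channels other than xwzx).
BASE_URL = "http://www.beidou.gov.cn/yw"


def listing_urls(subchannel: str, page_count: int) -> list:
    """Build listing-page URLs: index.html (assumed), then index_1.html onward."""
    urls = []
    for n in range(page_count):
        page = "index.html" if n == 0 else f"index_{n}.html"
        urls.append(f"{BASE_URL}/{subchannel}/{page}")
    return urls


def article_id_from_url(url: str):
    """Extract an ID such as t20260609_29339 from an article URL, or None."""
    match = re.search(r"/(t\d{8}_\d+)\.html$", url)
    return match.group(1) if match else None


# Rough size estimate from the Notes: 77 pages of 13 articles each.
estimated_articles = 77 * 13
```

With `page_count=77`, the last generated URL ends in index_76.html, matching the range quoted in the Notes.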