CAST China Space Technology News Scraper
Scrapes news articles from CAST (中国空间技术研究院, China Academy of Space Technology), China's primary satellite manufacturer, which builds roughly 70% of all Chinese satellites. CAST built the Shenzhou crewed spacecraft, the Tiangong space station modules, the Chang'e lunar landers, the Tianwen-1 Mars rover, and China's major satellite bus platforms (DFH-3, DFH-4, DFH-5).
What it scrapes
Articles from CAST's news channels, including:
| Channel ID | Name | Description |
|---|---|---|
| 690 | 本院动态 | Institute news (flagship engineering announcements) |
| 689 | 媒体聚焦 | Media coverage and external press |
| 688 | 八面来风 | External industry coverage |
| 1051 | 科技动态 | Technology updates |
| 1065 | 型号成就 | Product/mission achievements |
Output fields
Each record contains:
| Field | Description |
|---|---|
| article_id | Numeric ID from URL (/news/{id}) |
| channel_id | Source channel ID |
| channel_name | Human-readable channel name in Chinese |
| title_zh | Article title in Chinese |
| body_html | Full article body as raw HTML |
| body_text | Full article body as plain text |
| publish_date | Publish date in ISO-8601 format (YYYY-MM-DD) |
| source | Credited source (信息来源 field) |
| source_url | Canonical URL (https://www.cast.cn/news/{id}) |
| images | Pipe-separated list of image URLs from article body |
| scrapedAt | ISO-8601 timestamp when scraped |
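
Downstream code usually wants a real date and a list of image URLs, not the raw strings. The following is a minimal Python sketch of that post-processing, assuming records arrive as plain dictionaries keyed by the field names above. The helper names `parse_record` and `canonical_url` and every value in the sample record are hypothetical and only illustrate the shape of the data.

```python
from datetime import date


def canonical_url(article_id) -> str:
    """Build the canonical article URL from its numeric ID."""
    return f"https://www.cast.cn/news/{article_id}"


def parse_record(record: dict) -> dict:
    """Return a copy of the record with a typed date and an image list."""
    parsed = dict(record)
    # publish_date is documented as YYYY-MM-DD.
    parsed["publish_date"] = date.fromisoformat(record["publish_date"])
    # images is pipe-separated; an empty or missing value becomes [].
    raw_images = record.get("images") or ""
    parsed["images"] = [url for url in raw_images.split("|") if url]
    return parsed


# Hypothetical sample record; the values are made up for illustration.
sample = {
    "article_id": "12345",
    "channel_id": 690,
    "channel_name": "本院动态",
    "publish_date": "2024-01-15",
    "source_url": "https://www.cast.cn/news/12345",
    "images": "https://www.cast.cn/a.jpg|https://www.cast.cn/b.jpg",
}

parsed = parse_record(sample)
print(parsed["publish_date"].year, len(parsed["images"]))
print(canonical_url(sample["article_id"]) == sample["source_url"])
```

Whether `article_id` is emitted as a string or an integer is not stated here, so the sketch formats it into the URL without assuming either.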
Input
| Parameter | Type | Default | Description |
|---|---|---|---|
| channelIds | array of integers | [690, 689, 688] | CAST channel IDs to scrape |
| maxItems | integer | 10 | Maximum articles to scrape (0 = no limit) |
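
As an illustration, an input object can be assembled and checked before a run is started. This Python sketch uses the parameter names and defaults from the table above. The `build_input` helper and its checks are assumptions for client-side convenience, not documented behaviour of the scraper.

```python
import json

# Defaults as listed in the input table.
DEFAULT_CHANNEL_IDS = [690, 689, 688]
DEFAULT_MAX_ITEMS = 10


def build_input(channel_ids=None, max_items=None) -> dict:
    """Assemble an input object, falling back to the documented defaults."""
    ids = list(DEFAULT_CHANNEL_IDS if channel_ids is None else channel_ids)
    limit = DEFAULT_MAX_ITEMS if max_items is None else max_items
    # Hypothetical sanity checks, applied before the input is submitted.
    if not all(isinstance(i, int) for i in ids):
        raise ValueError("channelIds must be a list of integers")
    if not isinstance(limit, int) or limit < 0:
        raise ValueError("maxItems must be a non-negative integer (0 = no limit)")
    return {"channelIds": ids, "maxItems": limit}


# Scrape the institute news and mission achievement channels with no limit.
print(json.dumps(build_input(channel_ids=[690, 1065], max_items=0)))
```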
Use cases
- Competitive intelligence: Bus platform announcements (DFH-5 rollout, Shijian series) appear in CAST news before CASC press releases
- Mission tracking: Launch outcomes, spacecraft integration milestones
- Defense/space industry analysis: Satellite-bus competitive intelligence for Boeing/Lockheed/Airbus DS analysts
- Academic research: Primary-source Chinese space program news in structured format
Notes
- Site uses a legacy ASP.NET backend; all pages are server-rendered HTML
- No authentication required; publicly accessible
- CAST is also known as the Fifth Research Institute (五院) of CASC
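
Because the pages are server-rendered HTML, article content can be handled with an ordinary HTML parser and needs no browser automation. The sketch below uses Python's standard `html.parser` to pull plain text and image URLs out of a `body_html`-style fragment. The `BodyExtractor` class, the `extract` helper, and the sample markup are hypothetical; the scraper's own extraction logic is not documented here and may differ.

```python
from html.parser import HTMLParser


class BodyExtractor(HTMLParser):
    """Collect visible text chunks and <img> sources from an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <img ... />.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.image_urls.append(src)

    def handle_data(self, data):
        stripped = data.strip()
        if stripped:
            self.text_parts.append(stripped)


def extract(body_html: str):
    """Return (plain text, pipe-separated image URLs) for a fragment."""
    parser = BodyExtractor()
    parser.feed(body_html)
    parser.close()
    return "\n".join(parser.text_parts), "|".join(parser.image_urls)


# Made-up fragment; real article markup will be more complex.
sample_html = (
    "<p>First paragraph.</p>"
    '<p><img src="https://www.cast.cn/example.jpg" /></p>'
    "<p>Second paragraph.</p>"
)
text, images = extract(sample_html)
print(text)
print(images)
```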