China CNIPA Patent Scraper
Scrape Chinese patent records from Google Patents with the CN jurisdiction filter applied. Returns bilingual data — English-translated titles and abstracts alongside the original Chinese — plus filing dates, assignees, inventors, CPC/IPC codes, legal event timelines, and patent family members.
China CNIPA Patent Scraper Features
- Extracts 20 fields per patent, including the English title and the abstract in both English and Chinese
- Search by keyword, IPC/CPC classification code, assignee, or inventor
- Filter by priority date range with dateFrom and dateTo
- Direct URL mode — paste specific Google Patents links and skip the search step entirely
- Returns legal event timeline (filings, grants, status changes) as structured JSON
- Includes patent family members so you can cross-reference equivalent filings in other jurisdictions
- Residential proxy preconfigured — Google Patents serves a soft-block page to datacenter IPs
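As a rough illustration of the cross-referencing use case, the sketch below groups the comma-separated family_members value by the two-letter office prefix that publication numbers start with (for example WO for PCT publications). The helper name group_family_by_jurisdiction is hypothetical and not part of the actor.

```python
import re
from collections import defaultdict


def group_family_by_jurisdiction(family_members: str) -> dict[str, list[str]]:
    """Group comma-separated publication numbers by their two-letter prefix."""
    groups: dict[str, list[str]] = defaultdict(list)
    for number in (part.strip() for part in family_members.split(",")):
        if not number:
            continue  # skip empty entries, e.g. when the field is ""
        match = re.match(r"[A-Z]{2}", number)
        key = match.group(0) if match else "unknown"
        groups[key].append(number)
    return dict(groups)


# Value taken from the sample record in this README.
print(group_family_by_jurisdiction("WO2024164667A1"))
```

An empty family_members string yields an empty dict, so records without family data need no special handling.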
Who Uses Chinese Patent Data?
- IP analysts — Track competitor filings, monitor classification clusters, build prior-art datasets
- Corporate R&D teams — Map who's filing what in China, and when. Useful before you spend a year on the same idea.
- Patent attorneys — Pull family members and legal-event histories for freedom-to-operate analysis
- Market intelligence — Detect filing surges in specific CPC codes as a leading indicator of where Chinese R&D money is going
- Academic researchers — Build longitudinal patent corpora filtered by date, assignee, or technology area
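The filing-surge idea can be approximated from exported records with a simple count. The sketch below assumes records shaped like the output documented further down (cpc_codes as a comma-separated string, filing_date as YYYY-MM-DD); the function name filings_per_year is hypothetical.

```python
from collections import Counter


def filings_per_year(records: list[dict], cpc_prefix: str) -> dict[str, int]:
    """Count records per filing year that carry a CPC code with the given prefix."""
    counts: Counter = Counter()
    for record in records:
        codes = [code.strip() for code in record.get("cpc_codes", "").split(",")]
        if not any(code.startswith(cpc_prefix) for code in codes if code):
            continue
        year = record.get("filing_date", "")[:4]
        if year:  # ignore records without a filing date
            counts[year] += 1
    return dict(sorted(counts.items()))
```

Comparing the counts for consecutive years gives a first, crude signal. Because a single query is capped at 1,000 results, counts for broad codes will undercount unless the search is split into narrower date windows.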
How the China CNIPA Patent Scraper Works
- Pick a mode — Provide a search query, or paste specific patent URLs into startUrls. URLs win when both are set.
- Search pagination — In search mode, the scraper hits Google Patents' JSON XHR endpoint with the country=CN filter and pages until it has enough patent IDs or hits Google's 1,000-result hard cap.
- Detail extraction — Each patent's detail page is fetched through a residential proxy and parsed for bibliographic data, classification codes, legal events, and the Chinese-source text.
- Export — Records land in your Apify dataset as clean JSON, one record per patent.
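The first two steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actor's actual code: choose_mode, collect_ids, the fetch_page callable, and the page size of 100 are all hypothetical.

```python
GOOGLE_RESULT_CAP = 1000  # per-query hard cap described above


def choose_mode(actor_input: dict) -> str:
    """Return "urls" when startUrls is non-empty, otherwise "search"."""
    return "urls" if actor_input.get("startUrls") else "search"


def collect_ids(fetch_page, max_items: int, page_size: int = 100) -> list[str]:
    """Page through search results until enough patent IDs are collected.

    fetch_page(page_index, page_size) is a hypothetical callable that returns
    the publication numbers on one results page, or an empty list when the
    results are exhausted.
    """
    target = min(max_items, GOOGLE_RESULT_CAP)
    ids: list[str] = []
    page = 0
    while len(ids) < target:
        batch = fetch_page(page, page_size)
        if not batch:
            break  # no more results for this query
        ids.extend(batch)
        page += 1
    return ids[:target]
```

Clamping the target to the cap up front means a maxItems above 1,000 stops when results run out or the cap is reached, whichever comes first.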
Input
Search by keyword and date range
{
  "query": "battery electric vehicle",
  "dateFrom": "2022-01-01",
  "dateTo": "2024-12-31",
  "maxItems": 50
}
Search by classification code
{
  "query": "H04L",
  "maxItems": 100
}
Search by assignee
{
  "query": "Huawei",
  "maxItems": 200
}
Direct URL mode
{
  "startUrls": [
    "https://patents.google.com/patent/CN115982415A/en",
    "https://patents.google.com/patent/CN108081978B/en"
  ],
  "maxItems": 2
}
| Field | Type | Default | Description |
|---|---|---|---|
| query | string | battery electric vehicle | Keywords, IPC/CPC code, assignee, or inventor. Empty string runs a broad CN patent search. |
| dateFrom | string | — | Earliest priority date (YYYY-MM-DD). |
| dateTo | string | — | Latest priority date (YYYY-MM-DD). |
| startUrls | array | [] | Specific Google Patents URLs. When provided, the search API is skipped. |
| maxItems | integer | 10 | Cap on patents returned. Google Patents enforces a 1,000-result limit per query. |
| proxyConfiguration | object | Apify Residential | Proxy settings. Residential is required. |
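If you assemble the input programmatically, a small client-side check can catch malformed dates before a run starts. The sketch below uses the field names from the table; the helper build_input and its validation rules are assumptions for illustration, since the actor's own validation is not documented here.

```python
from datetime import datetime


def build_input(query="", date_from=None, date_to=None, start_urls=None, max_items=10):
    """Assemble an actor input dict, checking dates and maxItems locally."""
    if max_items < 1:
        raise ValueError("maxItems must be at least 1")
    actor_input = {"query": query, "maxItems": max_items}
    parsed = {}
    for key, value in (("dateFrom", date_from), ("dateTo", date_to)):
        if value is not None:
            # strptime raises ValueError if the string is not year-month-day
            parsed[key] = datetime.strptime(value, "%Y-%m-%d").date()
            actor_input[key] = value
    if len(parsed) == 2 and parsed["dateFrom"] > parsed["dateTo"]:
        raise ValueError("dateFrom must not be later than dateTo")
    if start_urls:
        actor_input["startUrls"] = list(start_urls)
    return actor_input


print(build_input("battery electric vehicle", "2022-01-01", "2024-12-31", max_items=50))
```

The returned dict has the same shape as the JSON examples above and can be serialized with json.dumps.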
China CNIPA Patent Scraper Output Fields
{
  "publication_number": "CN115982415A",
  "application_number": "CN:202310093204.3A",
  "title_en": "Incremental graph division method, device, equipment, medium and product",
  "abstract_en": "The application discloses an incremental graph division method...",
  "abstract_cn": "本申请公开了一种增量图划分方法、装置、设备、介质及产品...",
  "filing_date": "2023-02-06",
  "publication_date": "2023-04-18",
  "grant_date": "",
  "priority_date": "2023-02-06",
  "status": "Pending",
  "inventors_cn": "汤韬, 高鹏飞, 孙权, 潘婧, 赵金涛, 郑建宾, 艾博轩, 庞悦",
  "assignees_en": "China Unionpay Co Ltd",
  "assignees_cn": "China Unionpay Co Ltd",
  "cpc_codes": "G06F16/174, G06F16/901, G06F16/9536, G06Q50/00, Y02D10/00",
  "ipc_codes": "",
  "legal_events": "[{\"date\":\"2023-02-06\",\"title\":\"Application filed by China Unionpay Co Ltd\",\"type\":\"filed\"}]",
  "family_members": "WO2024164667A1",
  "pdf_url": "https://patentimages.storage.googleapis.com/64/48/49/800ddc17fec1fc/CN115982415A.pdf",
  "google_patents_url": "https://patents.google.com/patent/CN115982415A/en",
  "scraped_at": "2026-05-11T04:22:20.537Z"
}
| Field | Type | Description |
|---|---|---|
| publication_number | string | CNIPA publication number (e.g. CN114547329B) |
| application_number | string | Application number with country prefix |
| title_en | string | Patent title in English (Google-translated) |
| abstract_en | string | Full abstract in English (Google-translated) |
| abstract_cn | string | Full abstract in original Chinese |
| filing_date | string | Filing date (YYYY-MM-DD) |
| publication_date | string | Publication date (YYYY-MM-DD) |
| grant_date | string | Grant date (YYYY-MM-DD). Empty if not yet granted. |
| priority_date | string | Priority date (YYYY-MM-DD) |
| status | string | Legal status (Pending, Active, Expired, etc.) |
| inventors_cn | string | Inventors in Chinese characters, comma-separated |
| assignees_en | string | Current assignees in English, comma-separated |
| assignees_cn | string | Original assignees in Chinese, comma-separated |
| cpc_codes | string | CPC classification codes (leaf-level only), comma-separated |
| ipc_codes | string | IPC classification codes, comma-separated |
| legal_events | string | JSON-encoded array of {date, title, type} events |
| family_members | string | Patent family member publication numbers, comma-separated |
| pdf_url | string | URL to patent PDF |
| google_patents_url | string | Source URL of the Google Patents detail page |
| scraped_at | string | ISO timestamp when the record was scraped |
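Because every field arrives as a string, most downstream code will want to decode the multi-valued fields first. The sketch below is one possible way to do that, using only the field names from the table; normalize_record is a hypothetical helper. It assumes that no individual value contains a comma, which may not hold for assignee names such as "Co., Ltd."

```python
import json

# Fields documented above as comma-separated strings.
LIST_FIELDS = (
    "inventors_cn", "assignees_en", "assignees_cn",
    "cpc_codes", "ipc_codes", "family_members",
)


def normalize_record(record: dict) -> dict:
    """Return a copy with list fields split and legal_events decoded."""
    out = dict(record)
    for field in LIST_FIELDS:
        raw = record.get(field, "")
        out[field] = [item.strip() for item in raw.split(",") if item.strip()]
    # legal_events is a JSON-encoded array of {date, title, type} objects
    out["legal_events"] = json.loads(record.get("legal_events") or "[]")
    # an empty grant_date means the patent has not been granted yet
    out["grant_date"] = record.get("grant_date") or None
    return out


sample = {
    "cpc_codes": "G06F16/174, G06F16/901",
    "ipc_codes": "",
    "grant_date": "",
    "legal_events": '[{"date":"2023-02-06","title":"Application filed by China Unionpay Co Ltd","type":"filed"}]',
}
normalized = normalize_record(sample)
print(normalized["cpc_codes"], normalized["legal_events"][0]["type"])
```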
FAQ
How do I scrape Chinese patents from CNIPA?
China CNIPA Patent Scraper handles it. Plug in a query, classification code, or assignee, set maxItems, and run. The scraper queries Google Patents with a CN-jurisdiction filter rather than CNIPA's own site. Google Patents is a convenient public source for CNIPA records because it serves machine-translated English abstracts alongside the Chinese originals.
How much does this actor cost to run?
China CNIPA Patent Scraper uses pay-per-event pricing on the default_2603_basic profile at a 1.5x price coefficient. Residential proxy usage is included in the proxy tier. A 200-patent run typically costs a few cents in platform fees.
What data can I get for each patent?
China CNIPA Patent Scraper returns 20 fields per record: an English title, the abstract in English and Chinese, four key dates (filing, publication, grant, priority), legal status, inventors in Chinese, assignees in English and Chinese, CPC and IPC codes, a legal event timeline, family members, and a direct PDF link. Enough to build prior-art datasets without scraping a second source.
Can I filter by date or classification code?
China CNIPA Patent Scraper accepts both. Use dateFrom/dateTo for priority date windows, and put a CPC or IPC code in the query field (e.g. H04L or G06F16/174). You can combine a classification code with a keyword if you want a narrower slice.
Does this actor need proxies?
China CNIPA Patent Scraper requires a residential proxy. Google Patents serves a soft-block "Sorry" page to datacenter IPs while still returning HTTP 200, so the block cannot be detected from the status code alone. The actor is preconfigured with Apify Residential; leave it alone unless you have a specific reason to override it.
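Since the status code alone does not reveal a soft block, a client has to inspect the response body. The heuristic below is only a sketch: the exact markup of the "Sorry" page is not documented here, so the marker check, the 2,000-character window, and the function name looks_soft_blocked are all assumptions.

```python
def looks_soft_blocked(status_code: int, body: str, publication_number: str) -> bool:
    """Guess whether a detail-page response is the soft-block page.

    Assumption: the block page mentions "Sorry" near the top and does not
    contain the publication number that was requested.
    """
    if status_code != 200:
        return False  # genuine HTTP errors are a separate case
    has_marker = "sorry" in body[:2000].lower()
    has_patent = publication_number in body
    return has_marker and not has_patent
```

Requiring both conditions avoids flagging a genuine patent page whose text happens to contain the word "sorry".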
Need More Features?
Need additional patent fields, different jurisdictions, or full-text claim extraction? Open an issue or get in touch.
Why Use China CNIPA Patent Scraper?
- Bilingual output — Returns the Google-translated English abstract alongside the original Chinese, plus assignees in both languages, so downstream tools can pick whichever they need.
- Search or seed — Run a broad keyword search across all CN patents, or paste a list of specific URLs when you already know which records you want. Both modes share the same parser.
- Structured legal events — Most patent scrapers give you a status string and call it done. This one ships the full event timeline (filings, grants, expirations) as JSON, which is what you actually need for due diligence.