Chrome Web Store Scraper
Scrape Chrome extension listings from the Chrome Web Store. The scraper pulls comprehensive metadata for each extension: name, rating, review count, user count, version, full manifest, permissions, category, developer info, screenshots, and website URL. Search by keyword, or provide specific extension IDs or Chrome Web Store URLs.
Features
- Three input modes: by extension ID, by URL, or by search query
- Rich data extraction: 25 fields per extension including the full parsed manifest.json
- No proxy needed: the Chrome Web Store serves pages to datacenter IPs without blocking
- Fast extraction: Data is embedded server-side in the HTML — no JavaScript rendering required
- Permissions analysis: Extracts permissions array from the manifest for security audits
Use Cases
- Extension research and competitive analysis
- Security auditing: identify extensions with broad permissions (<all_urls>, webRequest, etc.)
- Developer directory building
- Chrome extension market research and trend tracking
- Finding extensions by category or keyword
Input Configuration
| Field | Type | Description |
|---|---|---|
| extensionIds | Array | List of extension IDs (32-char alphanumeric) to scrape directly |
| startUrls | Array | Chrome Web Store URLs (detail pages, search pages, or category pages) |
| searchQuery | String | Search term to find extensions (e.g. "password manager", "ad blocker") |
| maxItems | Integer | Maximum number of records to return (0 = unlimited, default 20) |
Provide one of extensionIds, startUrls, or searchQuery. If none are provided, the actor runs a default search for "productivity".
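The selection rule above can be sketched as a small input builder. This is an illustration only, not the actor's own code: `build_input` is a hypothetical helper, and rejecting a mix of modes is an assumption, since the README does not say how combined inputs are handled.

```python
import re

# Per the input table: extension IDs are 32-character alphanumeric strings.
EXTENSION_ID_RE = re.compile(r"[A-Za-z0-9]{32}")


def build_input(extension_ids=None, start_urls=None, search_query=None, max_items=20):
    """Build an actor input dict (hypothetical helper, not part of the actor)."""
    if max_items < 0:
        raise ValueError("maxItems must be 0 (unlimited) or a positive integer")

    modes = {
        "extensionIds": extension_ids,
        "startUrls": start_urls,
        "searchQuery": search_query,
    }
    chosen = {key: value for key, value in modes.items() if value}

    # Assumption: exactly one mode per run, so a mix is rejected here.
    if len(chosen) > 1:
        raise ValueError(f"Provide only one of: {', '.join(modes)}")

    # Mirrors the documented fallback when no mode is supplied.
    if not chosen:
        chosen = {"searchQuery": "productivity"}

    for ext_id in chosen.get("extensionIds", []):
        if not EXTENSION_ID_RE.fullmatch(ext_id):
            raise ValueError(f"Not a 32-char alphanumeric ID: {ext_id!r}")

    return {**chosen, "maxItems": max_items}
```

Calling `build_input()` with no arguments yields the default "productivity" search with `maxItems` set to 20.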
Example: Scrape by Extension ID
{
  "extensionIds": ["cjpalhdlnbpafiamejdnhcphjbkeiagm"],
  "maxItems": 1
}
Example: Search by Keyword
{
  "searchQuery": "password manager",
  "maxItems": 20
}
Example: Specific Store URLs
{
  "startUrls": [
    "https://chromewebstore.google.com/detail/ublock-origin/cjpalhdlnbpafiamejdnhcphjbkeiagm",
    "https://chromewebstore.google.com/search/vpn"
  ],
  "maxItems": 50
}
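When working with a mixed list of store URLs, it can help to tell detail pages from search or category pages and to recover the extension ID from the former. The sketch below assumes detail URLs follow the `/detail/<slug>/<extension_id>` layout seen in the example above; `extension_id_from_url` is a hypothetical helper name.

```python
from urllib.parse import urlparse


def extension_id_from_url(url):
    """Return the extension ID from a detail-page URL, or None for other pages."""
    parts = [part for part in urlparse(url).path.split("/") if part]
    # Assumed layout, taken from the example URL: /detail/<slug>/<extension_id>
    if len(parts) == 3 and parts[0] == "detail":
        return parts[2]
    return None
```

Search URLs such as `https://chromewebstore.google.com/search/vpn` return `None`, so the same function can be used to split a URL list into detail pages and listing pages.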
Output Fields
Each result record contains:
| Field | Type | Description |
|---|---|---|
| extension_id | String | Unique 32-char extension ID |
| url | String | Chrome Web Store detail page URL |
| name | String | Extension display name |
| short_description | String | Short description shown in search results |
| long_description | String | Full description from the detail page |
| rating | Number | Average user rating (0–5) |
| review_count | Integer | Total number of user reviews |
| user_count | Integer | Approximate number of active users |
| version | String | Current published version |
| size | String | Extension file size (e.g. "4.27MiB") |
| category | String | Primary category path (e.g. "productivity/workflow") |
| website_url | String | Developer's website URL |
| icon_url | String | Extension icon URL |
| header_image_url | String | Header/marquee image URL |
| promo_image_url | String | Promotional tile image URL |
| screenshots | Array | Screenshot image URLs |
| developer_email | String | Developer contact email |
| developer_name | String | Developer display name |
| developer_id | String | Developer identifier |
| manifest | Object | Full parsed manifest.json |
| permissions | Array | Extension permissions list |
| languages | Array | Supported language names |
| published_at | String | Original publish date (ISO 8601) |
| updated_at | String | Last update date (ISO 8601) |
| scraped_at | String | Scrape timestamp (ISO 8601) |
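Because size and the three date fields arrive as strings, downstream analysis usually needs to normalize them first. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, only the "MiB" suffix is confirmed by this README (the "KiB" and "GiB" entries are guesses), and timestamps are assumed to end in "Z" as in the sample record.

```python
import re
from datetime import datetime

# Only "MiB" appears in the README; the other units are assumptions.
_UNITS = {"KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3}


def size_to_bytes(size):
    """Convert a size string such as "4.27MiB" to an integer byte count."""
    match = re.fullmatch(r"\s*([\d.]+)\s*(KiB|MiB|GiB)\s*", size)
    if match is None:
        raise ValueError(f"Unrecognized size string: {size!r}")
    return round(float(match.group(1)) * _UNITS[match.group(2)])


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp with a trailing Z into an aware datetime."""
    # Replacing Z keeps this working on Python versions before 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def days_since_update(record):
    """Whole days between the last update and the moment of scraping."""
    delta = parse_timestamp(record["scraped_at"]) - parse_timestamp(record["updated_at"])
    return delta.days
```

A staleness figure computed this way is one simple input for trend tracking, for example to filter out extensions that have not been updated in a long time.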
Sample Output
{
  "extension_id": "cjpalhdlnbpafiamejdnhcphjbkeiagm",
  "url": "https://chromewebstore.google.com/detail/ublock-origin/cjpalhdlnbpafiamejdnhcphjbkeiagm",
  "name": "uBlock Origin",
  "short_description": "Finally, an efficient blocker. Easy on CPU and memory.",
  "rating": 4.6973,
  "review_count": 35453,
  "user_count": 14000000,
  "version": "1.71.0",
  "size": "4.27MiB",
  "category": "make_chrome_yours/privacy",
  "developer_email": "ubo@raymondhill.net",
  "developer_name": "Raymond Hill (gorhill)",
  "permissions": ["alarms", "contextMenus", "privacy", "storage", "tabs", "webRequest", "webRequestBlocking", "<all_urls>"],
  "published_at": "2014-06-24T00:52:35.000Z",
  "updated_at": "2026-05-12T05:16:59.000Z",
  "scraped_at": "2026-06-12T03:00:10.000Z"
}
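For the security-audit use case, records like the one above can be screened for broad permissions. This is a hedged sketch rather than a built-in feature: `flag_broad_permissions` is a hypothetical helper, and the default set holds only the two permissions this README names as broad, so it should be extended to fit your own policy.

```python
# The two permissions the README cites as broad; extend as needed.
BROAD_PERMISSIONS = frozenset({"<all_urls>", "webRequest"})


def flag_broad_permissions(records, broad=BROAD_PERMISSIONS):
    """Return (name, sorted broad permissions) for records that request any."""
    flagged = []
    for record in records:
        # A missing or null permissions field is treated as an empty list.
        requested = set(record.get("permissions") or [])
        hits = sorted(requested & set(broad))
        if hits:
            flagged.append((record.get("name"), hits))
    return flagged
```

Passing the sample record in a list flags it for both `<all_urls>` and `webRequest`; records without a `permissions` field are skipped.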
Technical Notes
- Data is extracted from server-rendered AF_initDataCallback script blocks; no browser rendering needed
- core_crawler (CheerioCrawler) with concurrency 3 to respect Google rate limits
- No proxy required; datacenter IPs work fine
- Memory: 512 MB is sufficient for most runs
- Timeout: 4 hours default (plenty for bulk runs)
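To illustrate the first note, the sketch below pulls the data payloads out of AF_initDataCallback script blocks with a regular expression. It is not the actor's parser (the actor uses CheerioCrawler). The block shape `AF_initDataCallback({key: 'ds:0', hash: '1', data:[...], sideChannel: {}});` is an assumption about the page markup, and `extract_init_data` is a hypothetical name. Mapping positions inside the payload to output fields is left out because this README does not document it.

```python
import json
import re

# Assumed shape of each block:
# AF_initDataCallback({key: 'ds:0', hash: '1', data:[...], sideChannel: {}});
_CALLBACK_RE = re.compile(
    r"AF_initDataCallback\(\{key:\s*'(?P<key>[^']+)',\s*hash:\s*'[^']*',\s*"
    r"data:(?P<data>.*?),\s*sideChannel:\s*\{\}\}\);",
    re.DOTALL,
)


def extract_init_data(html):
    """Map each callback key (e.g. "ds:0") to its JSON-decoded data payload."""
    blocks = {}
    for match in _CALLBACK_RE.finditer(html):
        try:
            blocks[match.group("key")] = json.loads(match.group("data"))
        except json.JSONDecodeError:
            # Skip payloads that are not plain JSON rather than guessing.
            continue
    return blocks
```

Because the payload is already present in the HTML response, a plain HTTP fetch followed by this kind of parsing is enough, which is why no headless browser is involved.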