Curated Developer Newsletter Link Scraper
Extract every curated link from the archives of top developer newsletters, including JavaScript Weekly, Ruby Weekly, Node Weekly, Android Weekly, and iOS Dev Weekly.
Each newsletter issue is decomposed into its constituent curated items: the external destination URL, headline, editor commentary, section, link position, and a sponsored-slot flag. The result is a structured dev-trend and link-intelligence dataset showing which libraries, tools, repos, and articles the most authoritative curators featured, when, and with what framing.
What it does
For each newsletter archive you specify, the scraper:
- Loads the issue archive index to enumerate all historical issue URLs
- Fetches each issue page (most recent first, up to `maxIssues` per newsletter)
- Extracts every curated link item with full metadata
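The first two steps can be sketched as follows. This is a minimal illustration, not the scraper's actual code: it assumes issue links on the archive page follow the `/issues/<number>` pattern seen in the JavaScript Weekly example URL, and the names `IssueLinkCollector` and `list_issue_urls` are hypothetical.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumption: issue pages live at /issues/<number> on the newsletter's own domain.
ISSUE_PATH = re.compile(r"^/issues/(\d+)/?$")


class IssueLinkCollector(HTMLParser):
    """Collects issue numbers and absolute issue URLs from an archive index page."""

    def __init__(self, archive_url: str) -> None:
        super().__init__()
        self.archive_url = archive_url
        self.archive_host = urlparse(archive_url).netloc
        self.issues = {}  # issue number -> absolute issue URL

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        target = urlparse(urljoin(self.archive_url, href))
        if target.netloc != self.archive_host:
            return  # ignore links that leave the newsletter's domain
        match = ISSUE_PATH.match(target.path)
        if match:
            self.issues[int(match.group(1))] = target.geturl()


def list_issue_urls(archive_html: str, archive_url: str, max_issues: int = 3) -> list:
    """Return issue URLs, most recent first; max_issues=0 means all issues."""
    collector = IssueLinkCollector(archive_url)
    collector.feed(archive_html)
    numbers = sorted(collector.issues, reverse=True)
    if max_issues > 0:
        numbers = numbers[:max_issues]
    return [collector.issues[n] for n in numbers]
```

Sorting by issue number rather than by page order keeps the "most recent first" guarantee even if an archive page lists issues in a different order.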
Supported newsletter families:
| Newsletter | Domain | Issues |
|---|---|---|
| JavaScript Weekly | javascriptweekly.com | 700+ |
| Ruby Weekly | rubyweekly.com | 700+ |
| Node Weekly | nodeweekly.com | 500+ |
| iOS Dev Weekly | iosdevweekly.com | 700+ |
| Android Weekly | androidweekly.net | 700+ |
Any newsletter with the same archive structure can be added via the `newsletters` input.
Output fields
| Field | Type | Description |
|---|---|---|
| `newsletter_name` | string | Newsletter name (e.g. JavaScript Weekly) |
| `newsletter_domain` | string | Domain (e.g. javascriptweekly.com) |
| `issue_number` | integer | Issue number |
| `issue_url` | string | Full URL of the issue page |
| `issue_title` | string | Issue title (e.g. JavaScript Weekly Issue 700: August 15, 2024) |
| `issue_date` | string | Publication date parsed from the title |
| `link_position` | integer | Order of this link within the issue |
| `link_title` | string | Curated item headline |
| `link_url` | string | External destination URL |
| `link_domain` | string | Domain of the destination URL |
| `editor_commentary` | string | Editor's description/blurb for the item |
| `section` | string | Section label (e.g. In Brief, Releases, Jobs) where available |
| `is_sponsored` | boolean | Whether this is a sponsored or classified ad slot |
| `scraped_at` | string | ISO timestamp of when this record was scraped |
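Several fields are derived rather than read directly from the page. The sketch below shows one way `issue_number`, `issue_date`, and `link_domain` could be computed. It assumes titles follow the "Issue N: Month D, YYYY" shape from the example above, and it emits the date as ISO `YYYY-MM-DD`, which is an assumption since the output date format is not specified. The function names are hypothetical.

```python
import re
from datetime import datetime
from typing import Optional, Tuple
from urllib.parse import urlparse

# Assumption: titles look like "JavaScript Weekly Issue 700: August 15, 2024",
# with the ": <date>" part being optional.
TITLE_PATTERN = re.compile(r"Issue\s+(?P<number>\d+)(?::\s*(?P<date>.+))?")


def parse_issue_title(issue_title: str) -> Tuple[Optional[int], str]:
    """Return (issue_number, issue_date); the date is "" when it cannot be parsed."""
    match = TITLE_PATTERN.search(issue_title.strip())
    if not match:
        return None, ""
    number = int(match.group("number"))
    raw_date = match.group("date")
    if not raw_date:
        return number, ""
    try:
        parsed = datetime.strptime(raw_date.strip(), "%B %d, %Y")
    except ValueError:
        return number, ""  # unexpected date wording: leave the field empty
    return number, parsed.date().isoformat()


def link_domain(link_url: str) -> str:
    """Return the lowercase host of the destination URL, or "" if there is none."""
    return (urlparse(link_url).hostname or "").lower()
```

Returning an empty string for a missing date mirrors the example record, where `issue_date` is empty because the title carries no date.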
Use cases
- Dev-trend analysis: track which libraries, repos, and tools the most authoritative curators featured over time
- Content marketing / PR: discover when a project first appeared in a newsletter, and with what framing
- Link intelligence: build a structured index of the most frequently curated developer resources
- Competitive intelligence: see which competing products and tools get curated vs your own
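Two of these use cases can be illustrated with a few lines over the output records. This is a sketch under assumptions: the records are already loaded as a list of dicts with the fields listed above, and `top_domains` and `first_appearance` are hypothetical helper names, not part of the scraper.

```python
from collections import Counter
from typing import Iterable, List, Optional, Tuple


def top_domains(records: Iterable[dict], limit: int = 10) -> List[Tuple[str, int]]:
    """Count how often each destination domain was curated, ignoring sponsored slots."""
    counts = Counter(
        r["link_domain"]
        for r in records
        if not r["is_sponsored"] and r["link_domain"]
    )
    return counts.most_common(limit)


def first_appearance(records: Iterable[dict], keyword: str, newsletter_name: str) -> Optional[dict]:
    """Return the earliest record in one newsletter whose title or blurb mentions keyword."""
    needle = keyword.lower()
    hits = [
        r
        for r in records
        if r["newsletter_name"] == newsletter_name
        and (needle in r["link_title"].lower() or needle in r["editor_commentary"].lower())
    ]
    # Issue numbers are only comparable within a single newsletter,
    # which is why the search is restricted to one newsletter_name.
    return min(hits, key=lambda r: (r["issue_number"], r["link_position"]), default=None)
```

A plain substring match is the simplest possible filter; a real analysis would likely need to handle name variants and false positives.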
Input options
| Parameter | Type | Default | Description |
|---|---|---|---|
| `newsletters` | array | All 5 major newsletters | List of newsletter archive URLs (e.g. https://javascriptweekly.com/issues) |
| `maxIssues` | integer | 3 | Max number of issues to scrape per newsletter (most recent first, 0 = all) |
| `maxItems` | integer | 15 | Maximum total link records to return |
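As a rough illustration of how these parameters interact, the sketch below builds an input object with the documented defaults and applies the overall `maxItems` cap to a stream of records. The helper names are hypothetical, and rejecting a `maxItems` below 1 is an assumption: the table documents `0 = all` only for `maxIssues`.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Optional


def build_input(
    newsletters: Optional[List[str]] = None,
    max_issues: int = 3,
    max_items: int = 15,
) -> dict:
    """Build an input object using the documented parameter names and defaults."""
    if max_issues < 0:
        raise ValueError("maxIssues must be 0 (all issues) or a positive integer")
    if max_items < 1:
        # Assumption: no special "unlimited" value is documented for maxItems.
        raise ValueError("maxItems must be a positive integer")
    run_input = {"maxIssues": max_issues, "maxItems": max_items}
    if newsletters:
        run_input["newsletters"] = list(newsletters)
    # When newsletters is omitted, the key is left out so the default list applies.
    return run_input


def cap_records(records: Iterable[dict], max_items: int) -> Iterator[dict]:
    """Stop after max_items records in total, across all newsletters."""
    return islice(records, max_items)
```

Note that `maxIssues` applies per newsletter while `maxItems` applies to the whole run, so with the defaults a run can stop partway through the first issue.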
Example output
{
  "newsletter_name": "JavaScript Weekly",
  "newsletter_domain": "javascriptweekly.com",
  "issue_number": 789,
  "issue_url": "https://javascriptweekly.com/issues/789",
  "issue_title": "JavaScript Weekly Issue 789",
  "issue_date": "",
  "link_position": 1,
  "link_title": "Announcing TypeScript 5.8",
  "link_url": "https://devblogs.microsoft.com/typescript/announcing-typescript-5-8/",
  "link_domain": "devblogs.microsoft.com",
  "editor_commentary": "TypeScript 5.8 is here with require() of ES modules in Node.js 22 and a bunch of other goodies.",
  "section": "",
  "is_sponsored": false,
  "scraped_at": "2025-06-12T10:23:45.000Z"
}
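Consumers of the dataset may want to check records against the documented field types before loading them into a database. The following is a minimal sketch of such a check; `EXPECTED_TYPES` is transcribed from the output fields table, and `validate_record` is a hypothetical helper, not something the scraper provides.

```python
# Field names and types as listed in the output fields table.
EXPECTED_TYPES = {
    "newsletter_name": str,
    "newsletter_domain": str,
    "issue_number": int,
    "issue_url": str,
    "issue_title": str,
    "issue_date": str,
    "link_position": int,
    "link_title": str,
    "link_url": str,
    "link_domain": str,
    "editor_commentary": str,
    "section": str,
    "is_sponsored": bool,
    "scraped_at": str,
}


def validate_record(record: dict) -> list:
    """Return a list of problems; an empty list means the record matches the table."""
    problems = []
    for field, expected in EXPECTED_TYPES.items():
        if field not in record:
            problems.append(f"missing field: {field}")
            continue
        value = record[field]
        # bool is a subclass of int in Python, so reject True/False for integer fields.
        wrong_type = not isinstance(value, expected) or (
            expected is int and isinstance(value, bool)
        )
        if wrong_type:
            problems.append(f"{field}: expected {expected.__name__}, got {type(value).__name__}")
    return problems
```

Empty strings, as in `issue_date` and `section` in the example record, pass this check because the fields are still strings.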
Notes
- No authentication required. All target newsletters are publicly accessible.
- The scraper waits 500 ms between requests to avoid putting unnecessary load on the newsletter servers.
- For full archive crawls (all issues, all newsletters), set `maxIssues: 0` and a high `maxItems` value. A full crawl of all five newsletters (3,300+ issues going by the table above, at ~15 links each) produces tens of thousands of records.
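The request pacing described in these notes can be sketched as a small generator. This is an illustration only, not the scraper's implementation: `fetch_politely` and `fetch_html` are hypothetical names, and the `sleep` parameter exists purely so the delay can be replaced in tests.

```python
import time
import urllib.request
from typing import Callable, Iterable, Iterator, Tuple


def fetch_html(url: str) -> str:
    """Fetch a public page with the standard library; no authentication involved."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch_politely(
    urls: Iterable[str],
    fetch: Callable[[str], str] = fetch_html,
    delay_seconds: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Tuple[str, str]]:
    """Yield (url, body) pairs, pausing delay_seconds between consecutive requests."""
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # wait before every request except the first
        yield url, fetch(url)
```

At 500 ms per request, the delay alone adds roughly half an hour to a crawl of about 3,300 issue pages, which is worth keeping in mind when planning a full archive run.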