OrbTop

Curated Developer Newsletter Link Scraper

Extract every curated link from the archives of top developer newsletters: JavaScript Weekly, Ruby Weekly, Node Weekly, Android Weekly, iOS Dev Weekly, and more.

Each newsletter issue is decomposed into its constituent curated items: the external destination URL, headline, editor commentary, section, link position, and a sponsored-slot flag. The result is a structured dev-trend and link-intelligence dataset showing which libraries, tools, repos, and articles the most authoritative curators featured, when, and with what framing.

What it does

For each newsletter archive you specify, the scraper:

  1. Loads the issue archive index to enumerate all historical issue URLs
  2. Fetches each issue page (most recent first, up to maxIssues per newsletter)
  3. Extracts every curated link item with full metadata
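
Steps 1 and 2 can be sketched as a small, network-free helper that takes the HTML of an archive index and returns the issue URLs to fetch. This is only an illustration, not the scraper's actual code: the function name `enumerate_issue_urls` is hypothetical, and the `/issues/<number>` URL pattern is an assumption based on the example issue URL in this document, so other newsletters may need a different pattern.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed issue URL shape, taken from the example "https://javascriptweekly.com/issues/789".
ISSUE_PATH = re.compile(r"/issues/(\d+)/?$")


class _HrefCollector(HTMLParser):
    """Collects the href attribute of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def enumerate_issue_urls(index_html, archive_url, max_issues=3):
    """Return issue URLs found in the archive index, most recent first.

    max_issues=0 means "all issues", mirroring the maxIssues input option.
    """
    parser = _HrefCollector()
    parser.feed(index_html)

    issues = {}  # issue number -> absolute URL (deduplicates repeated links)
    for href in parser.hrefs:
        absolute = urljoin(archive_url, href)
        match = ISSUE_PATH.search(absolute)
        if match:
            issues[int(match.group(1))] = absolute

    ordered = [issues[number] for number in sorted(issues, reverse=True)]
    return ordered if max_issues == 0 else ordered[:max_issues]
```

Sorting by the numeric issue number, rather than relying on the order of links in the page, keeps the "most recent first" guarantee even if the archive index lists issues in a different order.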

Supported newsletter families:

Newsletter          Domain                  Issues
JavaScript Weekly   javascriptweekly.com    700+
Ruby Weekly         rubyweekly.com          700+
Node Weekly         nodeweekly.com          500+
iOS Dev Weekly      iosdevweekly.com        700+
Android Weekly      androidweekly.net       700+

Any newsletter with the same archive structure can be added via the newsletters input.

Output fields

Field               Type      Description
newsletter_name     string    Newsletter name (e.g. JavaScript Weekly)
newsletter_domain   string    Domain (e.g. javascriptweekly.com)
issue_number        integer   Issue number
issue_url           string    Full URL of the issue page
issue_title         string    Issue title (e.g. JavaScript Weekly Issue 700: August 15, 2024)
issue_date          string    Publication date parsed from the title
link_position       integer   Order of this link within the issue
link_title          string    Curated item headline
link_url            string    External destination URL
link_domain         string    Domain of the destination URL
editor_commentary   string    Editor's description/blurb for the item
section             string    Section label (e.g. In Brief, Releases, Jobs) where available
is_sponsored        boolean   Whether this is a sponsored or classified ad slot
scraped_at          string    ISO timestamp of when this record was scraped
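
Two of these fields are derived from other fields: issue_date comes from issue_title, and link_domain comes from link_url. The sketch below shows one way such derivations could work. The helper names are hypothetical, the ISO date output format is an assumption (the field is only documented as a string), and the title pattern follows the single example title given in the table.

```python
import re
from datetime import datetime
from urllib.parse import urlparse

# Matches a trailing date such as "August 15, 2024" at the end of an issue title.
DATE_IN_TITLE = re.compile(r"([A-Z][a-z]+ \d{1,2}, \d{4})\s*$")


def parse_issue_date(issue_title):
    """Return the date found in the title as YYYY-MM-DD, or "" if none is present."""
    match = DATE_IN_TITLE.search(issue_title)
    if not match:
        return ""
    try:
        # %B expects an English month name under the default C locale.
        return datetime.strptime(match.group(1), "%B %d, %Y").date().isoformat()
    except ValueError:
        return ""


def link_domain(link_url):
    """Return the host part of a destination URL, or "" if it has none."""
    return (urlparse(link_url).hostname or "").lower()
```

Returning an empty string when the title carries no date matches the example output in this document, where issue_title has no date and issue_date is empty.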

Use cases

  • Dev-trend analysis: track which libraries, repos, and tools the most authoritative curators featured over time
  • Content marketing / PR: discover when a project first appeared in a newsletter, and with what framing
  • Link intelligence: build a structured index of the most frequently curated developer resources
  • Competitive intelligence: see which competing products and tools get curated vs your own
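
As a rough illustration of the link-intelligence and first-appearance use cases, the following sketch aggregates scraped records that have been loaded as a list of dictionaries using the output field names. The function names are hypothetical, and skipping sponsored slots by default is a design choice of this sketch, not a documented behavior of the scraper.

```python
from collections import Counter


def top_domains(records, limit=10, include_sponsored=False):
    """Return (link_domain, count) pairs for the most frequently curated domains."""
    counts = Counter(
        record["link_domain"]
        for record in records
        if record.get("link_domain")
        and (include_sponsored or not record.get("is_sponsored"))
    )
    return counts.most_common(limit)


def first_issue_featuring(records, link_domain, newsletter_domain=None):
    """Return the lowest issue_number that links to link_domain, or None.

    Pass newsletter_domain to restrict the search to one newsletter, since
    issue numbers are not comparable across different newsletters.
    """
    numbers = [
        record["issue_number"]
        for record in records
        if record.get("link_domain") == link_domain
        and (newsletter_domain is None
             or record.get("newsletter_domain") == newsletter_domain)
        and isinstance(record.get("issue_number"), int)
    ]
    return min(numbers) if numbers else None
```

Excluding sponsored slots keeps paid placements from inflating the counts; pass include_sponsored=True to compare editorial and paid exposure.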

Input options

Parameter     Type      Default                    Description
newsletters   array     All 5 major newsletters    List of newsletter archive URLs (e.g. https://javascriptweekly.com/issues)
maxIssues     integer   3                          Max number of issues to scrape per newsletter (most recent first, 0 = all)
maxItems      integer   15                         Maximum total link records to return
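
The options above can be assembled into an input payload as in the sketch below. The helper `build_input` is hypothetical and only mirrors the documented parameter names and defaults. It assumes that omitting newsletters selects the default set, and it does not assign any meaning to maxItems = 0, since the table documents 0 = all only for maxIssues.

```python
import json


def build_input(newsletters=None, max_issues=3, max_items=15):
    """Build an input payload using the documented parameter names and defaults."""
    for name, value in (("maxIssues", max_issues), ("maxItems", max_items)):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer")

    payload = {"maxIssues": max_issues, "maxItems": max_items}
    if newsletters:
        # Only set the key when given; leaving it out is assumed to use the default list.
        payload["newsletters"] = list(newsletters)
    return payload


# Example: the five most recent JavaScript Weekly issues, capped at 100 link records.
print(json.dumps(
    build_input(["https://javascriptweekly.com/issues"], max_issues=5, max_items=100),
    indent=2,
))
```

Because maxItems caps the total across all newsletters, a low value can be reached before later newsletters in the list are scraped, so it is worth raising it together with maxIssues.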

Example output

{
  "newsletter_name": "JavaScript Weekly",
  "newsletter_domain": "javascriptweekly.com",
  "issue_number": 789,
  "issue_url": "https://javascriptweekly.com/issues/789",
  "issue_title": "JavaScript Weekly Issue 789",
  "issue_date": "",
  "link_position": 1,
  "link_title": "Announcing TypeScript 5.8",
  "link_url": "https://devblogs.microsoft.com/typescript/announcing-typescript-5-8/",
  "link_domain": "devblogs.microsoft.com",
  "editor_commentary": "TypeScript 5.8 is here with require() of ES modules in Node.js 22 and a bunch of other goodies.",
  "section": "",
  "is_sponsored": false,
  "scraped_at": "2025-06-12T10:23:45.000Z"
}

Notes

  • No authentication required. All target newsletters are publicly accessible.
  • The scraper waits 500 ms between requests to avoid overloading the newsletter servers.
  • For full archive crawls (all issues, all newsletters), set maxIssues: 0 and a high maxItems value. A full crawl of all five newsletters (3,300+ issues × ~15 links each, going by the issue counts listed above) produces tens of thousands of records.
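
For planning a full crawl, a back-of-the-envelope estimate can be computed from the figures in this document. The sketch below uses the lower-bound issue counts from the supported newsletters table, the ~15 links per issue mentioned above, and the 500 ms delay. It assumes one archive index request per newsletter and ignores download and parsing time, so real totals and runtimes will be higher.

```python
# Lower-bound issue counts copied from the supported newsletters table ("700+" -> 700).
ISSUE_COUNTS = {
    "javascriptweekly.com": 700,
    "rubyweekly.com": 700,
    "nodeweekly.com": 500,
    "iosdevweekly.com": 700,
    "androidweekly.net": 700,
}


def estimate_full_crawl(issue_counts, links_per_issue=15, delay_seconds=0.5):
    """Estimate record volume and the minimum time spent in inter-request delays."""
    issues = sum(issue_counts.values())
    # Assumption: one archive index page per newsletter plus one page per issue.
    requests = issues + len(issue_counts)
    return {
        "issues": issues,
        "records": issues * links_per_issue,
        "min_delay_minutes": round(requests * delay_seconds / 60, 1),
    }


print(estimate_full_crawl(ISSUE_COUNTS))
```

The record estimate gives a floor for choosing maxItems: any value below it would cut a full crawl short.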