OrbTop

NewsletterHunt Archived Issue Scraper

LEAD GENERATION

Scrapes NewsletterHunt's cross-publication archive of real sent newsletter emails. Returns the actual content (subject line, full email body HTML, sender, and date) plus the newsletter's signup URL. No login required.

NewsletterHunt archives sent emails from hundreds of publications on one public site. This actor crawls that archive and returns it as structured data. Useful for competitive email-marketing analysis, AI training corpora, and lead generation (each newsletter carries a direct signup URL).

What You Get

Each output record is one archived email issue, with the fields needed to see what was sent, when, by whom, and where to sign up.

Field                  Description
newsletter_slug        URL slug identifying the newsletter (e.g. the-hustle)
newsletter_name        Publication display name (e.g. The Hustle)
newsletter_url         NewsletterHunt page for this newsletter
newsletter_signup_url  The newsletter's own website or subscribe URL
topic                  Topic/category tags assigned by NewsletterHunt
email_id               NewsletterHunt archive ID for this issue
email_url              Direct link to the archived issue page
email_subject          Subject line of the email issue
email_sender           Publication or author who sent it
email_date             Publication date (ISO 8601)
email_body_html        Full rendered email body HTML (capped at 500 KB)
email_body_text        Plain-text extract of the body (tags stripped, capped at 50 KB)
scraped_at             Timestamp of when this record was scraped
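
The email_body_text field is described as a tag-stripped, size-capped extract of email_body_html. A minimal sketch of how such an extract could be produced with Python's standard library follows. It is not the actor's own code: the helper names are hypothetical, and it assumes the caps are measured in UTF-8 bytes, which the description above does not specify.

```python
from html.parser import HTMLParser

HTML_CAP_BYTES = 500 * 1024  # cap stated for email_body_html
TEXT_CAP_BYTES = 50 * 1024   # cap stated for email_body_text


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Strip tags and collapse whitespace (hypothetical helper)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def cap_utf8(value, limit):
    """Truncate to at most `limit` UTF-8 bytes without splitting a character."""
    raw = value.encode("utf-8")
    if len(raw) <= limit:
        return value
    # errors="ignore" drops a multi-byte character cut in half by the slice.
    return raw[:limit].decode("utf-8", errors="ignore")


body_html = "<p>Hello <b>world</b></p><style>p{color:red}</style>"
body_text = cap_utf8(html_to_text(body_html), TEXT_CAP_BYTES)
```

Joining text nodes with a space keeps adjacent paragraphs apart, at the cost of occasionally inserting a space inside a word that is split across inline tags.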

How It Works

The actor runs a three-level crawl: it discovers newsletters from the listing page, fetches each newsletter's full issue archive via a JSON endpoint, then retrieves the email body from each issue page.
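
The shape of that crawl can be sketched as three nested steps. The sketch below is illustrative only: the three callables are hypothetical stand-ins for the listing-page fetch, the JSON archive fetch, and the issue-page fetch, and no real endpoint or selector is assumed.

```python
from typing import Callable, Iterable, Iterator


def crawl(
    list_newsletters: Callable[[], Iterable[dict]],
    list_issues: Callable[[dict], Iterable[dict]],
    fetch_issue: Callable[[dict], dict],
) -> Iterator[dict]:
    """Yield one merged record per archived issue."""
    for newsletter in list_newsletters():      # level 1: listing page
        for issue in list_issues(newsletter):  # level 2: issue archive
            detail = fetch_issue(issue)        # level 3: issue page
            yield {**newsletter, **issue, **detail}


# Stub data sources so the sketch runs without network access.
def stub_newsletters():
    return [{"newsletter_slug": "the-hustle", "newsletter_name": "The Hustle"}]


def stub_issues(newsletter):
    return [{"email_id": "1"}, {"email_id": "2"}]


def stub_issue_page(issue):
    return {"email_subject": f"Issue {issue['email_id']}", "email_body_html": None}


records = list(crawl(stub_newsletters, stub_issues, stub_issue_page))
```

Because crawl is a generator, a caller can stop consuming early and no further issue pages are requested.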

The email body is embedded directly in the issue page's HTML as an iframe srcdoc attribute, so no additional fetch is required. Subject, sender, and date are pulled from the same page.
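
As an illustration of reading an iframe's srcdoc attribute, here is a minimal sketch using Python's html.parser. The function name is hypothetical, and the sketch assumes the first iframe carrying a srcdoc is the email body, which may not match the actor's actual selection logic.

```python
from html.parser import HTMLParser


class _SrcdocFinder(HTMLParser):
    """Remembers the srcdoc value of the first <iframe> that has one."""

    def __init__(self):
        super().__init__()
        self.srcdoc = None

    def handle_starttag(self, tag, attrs):
        if tag == "iframe" and self.srcdoc is None:
            value = dict(attrs).get("srcdoc")
            if value is not None:
                # HTMLParser has already unescaped entities in the value.
                self.srcdoc = value


def extract_srcdoc(page_html):
    """Return the embedded document as a string, or None if absent."""
    finder = _SrcdocFinder()
    finder.feed(page_html)
    finder.close()
    return finder.srcdoc


sample_page = '<html><body><iframe srcdoc="&lt;p&gt;Hello&lt;/p&gt;"></iframe></body></html>'
email_body_html = extract_srcdoc(sample_page)
```

Returning None when no srcdoc is present lines up with the note that some older issues have no body.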

Input

Parameter  Type     Default  Description
maxItems   integer  10       Maximum number of email records to return. Set to 0 for no limit.
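
The maxItems semantics (a positive cap, with 0 meaning unlimited) can be expressed as a thin wrapper over any record iterator. This is a sketch of the documented behaviour, not the actor's code; limit_items is a hypothetical name, and rejecting negative values is an assumption.

```python
from itertools import islice


def limit_items(records, max_items=10):
    """Yield at most max_items records; 0 means no limit."""
    if max_items < 0:
        raise ValueError("maxItems must be 0 or a positive integer")
    if max_items == 0:
        return iter(records)
    return islice(records, max_items)


default_run = list(limit_items(range(25)))        # default cap of 10
unlimited_run = list(limit_items(range(25), 0))   # 0 disables the cap
```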

Usage Notes

  • The listing page currently shows ~9 newsletters. Each has many archived issues.
  • Some older archived issues may not have a full email body. In that case the body is returned as null.
  • NewsletterHunt is a public archive. No authentication is needed.
  • Polite crawl rate: 5 concurrent requests.
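
One common way to hold a crawler to a fixed number of in-flight requests is an asyncio semaphore. The sketch below shows that technique with a limit of 5. It is an assumption about how such a limit could be enforced, not a description of the actor's internals, and fake_fetch is a stand-in for a real HTTP call.

```python
import asyncio

MAX_CONCURRENCY = 5  # matches the stated crawl rate


async def fetch_all(urls, fetch, limit=MAX_CONCURRENCY):
    """Run fetch(url) for every URL with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(url):
        async with semaphore:
            return await fetch(url)

    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(*(bounded(u) for u in urls))


async def _demo():
    in_flight = 0
    peak = 0

    async def fake_fetch(url):
        # Stand-in for a network request; tracks concurrent calls.
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)
        in_flight -= 1
        return url.upper()

    results = await fetch_all([f"u{i}" for i in range(20)], fake_fetch)
    return results, peak


results, peak = asyncio.run(_demo())
```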

Example Output

{
  "newsletter_slug": "money-stuff-by-matt-levine",
  "newsletter_name": "Money Stuff by Matt Levine",
  "newsletter_url": "https://newsletterhunt.com/newsletters/money-stuff-by-matt-levine",
  "newsletter_signup_url": "http://link.mail.bloombergbusiness.com/join/4wm/moneystuff-signup",
  "topic": "Finance",
  "email_id": "340238",
  "email_url": "https://newsletterhunt.com/emails/340238",
  "email_subject": "Money Stuff: Index Funds Can't Say No to SpaceX",
  "email_sender": "Money Stuff by Matt Levine",
  "email_date": "2020-12-09T11:43:00",
  "email_body_html": "<!DOCTYPE html>...",
  "email_body_text": "Money Stuff: Index Funds Can't Say No to SpaceX...",
  "scraped_at": "2026-06-02T17:30:00.000Z"
}
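
A record like the one above can be loaded with the standard library. The short sketch below parses the two timestamp fields. It assumes scraped_at always ends in "Z" (UTC), as in the example, and treats email_date as a naive datetime because the sample carries no offset.

```python
import json
from datetime import datetime

# Subset of the example record above.
record_json = """
{
  "newsletter_name": "Money Stuff by Matt Levine",
  "email_id": "340238",
  "email_subject": "Money Stuff: Index Funds Can't Say No to SpaceX",
  "email_date": "2020-12-09T11:43:00",
  "email_body_html": null,
  "scraped_at": "2026-06-02T17:30:00.000Z"
}
"""

record = json.loads(record_json)

# No UTC offset in the sample, so this is a naive datetime.
sent_at = datetime.fromisoformat(record["email_date"])

# Python versions before 3.11 do not accept a trailing "Z" in fromisoformat.
scraped_at = datetime.fromisoformat(record["scraped_at"].replace("Z", "+00:00"))

# Issues without a body arrive as null, which json.loads maps to None.
has_body = record["email_body_html"] is not None
```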

Related Actors

Pairs well with actors that target specific newsletter platforms for deeper per-publication archives, or with a newsletter directory scraper for subscriber counts and open rates.