NewsletterHunt Archived Issue Scraper
NewsletterHunt Archived Issue Scraper
Scrapes NewsletterHunt's cross-publication archive of real sent newsletter emails. Returns the actual content — subject lines, full email body HTML, sender, date — plus the newsletter's signup URL. No login required.
NewsletterHunt archives sent emails from hundreds of publications on one public site. This actor crawls that archive and returns it as structured data. Useful for competitive email-marketing analysis, AI training corpora, and lead generation (each newsletter carries a direct signup URL).
What You Get
Each output record is one archived email issue. That includes everything you need to understand what was sent, when, by whom, and where to sign up.
| Field | Description |
|---|---|
newsletter_slug |
URL slug identifying the newsletter (e.g. the-hustle) |
newsletter_name |
Publication display name (e.g. The Hustle) |
newsletter_url |
NewsletterHunt page for this newsletter |
newsletter_signup_url |
The newsletter's own website or subscribe URL |
topic |
Topic/category tags assigned by NewsletterHunt |
email_id |
NewsletterHunt archive ID for this issue |
email_url |
Direct link to the archived issue page |
email_subject |
Subject line of the email issue |
email_sender |
Publication or author who sent it |
email_date |
Publication date (ISO 8601) |
email_body_html |
Full rendered email body HTML (capped at 500 KB) |
email_body_text |
Plain-text extract of the body (tags stripped, capped at 50 KB) |
scraped_at |
Timestamp of when this record was scraped |
How It Works
Three-level crawl. Discovers newsletters from the listing page, fetches each newsletter's full issue archive via a JSON endpoint, then retrieves the email body from each issue page.
The email body is embedded directly in the page HTML as an iframe srcdoc attribute — no additional fetches required. Subject, sender, and date are pulled from the page alongside it.
Input
| Parameter | Type | Default | Description |
|---|---|---|---|
maxItems |
integer | 10 | Maximum number of email records to return. Set to 0 for no limit. |
Usage Notes
- The listing page currently shows ~9 newsletters. Each has many archived issues.
- Some older archived issues may not have a full email body (displayed as
null). - NewsletterHunt is a public archive. No authentication is needed.
- Polite crawl rate: 5 concurrent requests.
Example Output
{
"newsletter_slug": "money-stuff-by-matt-levine",
"newsletter_name": "Money Stuff by Matt Levine",
"newsletter_url": "https://newsletterhunt.com/newsletters/money-stuff-by-matt-levine",
"newsletter_signup_url": "http://link.mail.bloombergbusiness.com/join/4wm/moneystuff-signup",
"topic": "Finance",
"email_id": "340238",
"email_url": "https://newsletterhunt.com/emails/340238",
"email_subject": "Money Stuff: Index Funds Can't Say No to SpaceX",
"email_sender": "Money Stuff by Matt Levine",
"email_date": "2020-12-09T11:43:00",
"email_body_html": "<!DOCTYPE html>...",
"email_body_text": "Money Stuff: Index Funds Can't Say No to SpaceX...",
"scraped_at": "2026-06-02T17:30:00.000Z"
}
Related Actors
Pairs well with actors that target specific newsletter platforms for deeper per-publication archives, or with a newsletter directory scraper for subscriber counts and open rates.