PR Newswire Scraper

Scrape press releases from PR Newswire — the world's largest press release distribution network.

For each release, the actor collects:

Headline — the full press release title
Publication date — ISO 8601 datetime with timezone offset
Author / company — the organization that issued the release
Full body text — the complete text of the press release
Canonical URL — permanent link to the release page
Release ID — the unique numeric identifier from the URL slug

Usage

By default, the actor scrapes the All News Releases listing from newest to oldest.

To scrape a specific category listing page, pass one or more URLs via startUrls:

{
    "maxItems": 100,
    "startUrls": [
        { "url": "https://www.prnewswire.com/news-releases/financial-services-latest-news/" },
        { "url": "https://www.prnewswire.com/news-releases/life-sciences-latest-news/" }
    ]
}

Any PR Newswire category listing page that uses the /news-releases/<category>-latest-news/ URL pattern is supported.

Input

Field	Type	Default	Description
`maxItems`	integer	10	Maximum number of press releases to collect.
`startUrls`	array	(all releases)	One or more PR Newswire category listing URLs.

Output

Each output record (saved to the dataset) includes:

Field	Type	Description
`release_id`	string	Numeric ID from the URL slug (e.g. `302793395`)
`title`	string	Press release headline
`url`	string	Canonical URL
`published_at`	string	ISO 8601 datetime (e.g. `2026-06-07T14:30:00-04:00`)
`author`	string	Publishing company or author name
`summary`	string	Short excerpt (when available from listing)
`full_text`	string	Full body text of the press release
`scraped_at`	string	ISO 8601 scrape timestamp

Performance

Proxy: None required — PR Newswire is directly accessible.
Concurrency: 5 parallel requests (polite default).
Coverage: Paginated scraping — all pages of a category listing are crawled.
Rate limiting: Automatic backoff on HTTP 429 responses.

Technical Notes

PR Newswire serves server-rendered HTML across all listing and detail pages. No JavaScript rendering or CAPTCHA solving is required. The actor uses Cheerio for fast HTML parsing and respects pagination via ?page=N query parameters.

The search feature on prnewswire.com is JavaScript-powered and not accessible to a plain HTTP crawler — use category listing URLs instead.