Website Contact Extractor

Extract emails, phone numbers, physical addresses, and social media links from any website. Supply a list of URLs and get one structured record per domain — ready for lead generation, outreach, or contact research.

What it does

For each website in your input list, the actor:

Fetches the homepage
Probes common contact sub-pages (/contact, /contact-us, /about, /about-us, /impressum, /legal)
Extracts and deduplicates all contact data across those pages
Returns one row per domain
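
The page-selection step above can be sketched as follows. This is a minimal illustration, not the actor's actual code: the function name `candidate_urls` and the constant `CONTACT_PATHS` are hypothetical, and only the list of paths is taken from the description.

```python
from urllib.parse import urljoin, urlsplit

# Sub-page paths named in the description above.
CONTACT_PATHS = ["/contact", "/contact-us", "/about", "/about-us", "/impressum", "/legal"]


def candidate_urls(start_url: str) -> list[str]:
    """Return the homepage URL followed by the contact sub-pages to probe."""
    parts = urlsplit(start_url)
    # Reduce whatever URL was supplied to the site root (scheme + host).
    homepage = f"{parts.scheme}://{parts.netloc}/"
    return [homepage] + [urljoin(homepage, path) for path in CONTACT_PATHS]


if __name__ == "__main__":
    for url in candidate_urls("https://example.com/some/page?x=1"):
        print(url)
```

With six sub-page paths plus the homepage, a domain yields at most seven pages under this assumption.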

Output fields

Field	Description
`url`	Homepage URL (scheme + host)
`domain`	Domain name without `www` (e.g. `example.com`)
`emails`	Comma-separated email addresses found on the site
`phones`	Comma-separated phone numbers found on the site
`social_links`	Comma-separated social media profile URLs
`address`	Physical address if found (JSON-LD schema.org or heuristic footer detection)
`pages_crawled`	Number of pages successfully crawled for this domain
`scraped_at`	ISO-8601 timestamp
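
To illustrate how the `url` and `domain` fields might relate to an input URL, and how one row per domain could be enforced, here is a small sketch. The helper names `normalise` and `unique_domains` are hypothetical, and the actor's real normalisation may differ (for example in how it treats ports or subdomains).

```python
from urllib.parse import urlsplit


def normalise(start_url: str) -> tuple[str, str]:
    """Return (url, domain): scheme + host, and the host without a leading 'www.'."""
    parts = urlsplit(start_url)
    host = (parts.hostname or "").lower()
    domain = host[4:] if host.startswith("www.") else host
    return f"{parts.scheme}://{parts.netloc}", domain


def unique_domains(start_urls: list[dict]) -> dict[str, str]:
    """Map each domain to the first homepage URL seen for it."""
    seen: dict[str, str] = {}
    for item in start_urls:
        url, domain = normalise(item["url"])
        seen.setdefault(domain, url)  # assumption: the first occurrence wins
    return seen


if __name__ == "__main__":
    print(unique_domains([
        {"url": "https://www.example.com"},
        {"url": "https://example.com/contact"},
    ]))
```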

Input

Field	Type	Description
`startUrls`	Array	List of website URLs to extract contact info from
`maxItems`	Integer	Maximum number of domain records to return (0 = no limit)

Example input

{
    "startUrls": [
        { "url": "https://example.com" },
        { "url": "https://another-company.com" }
    ],
    "maxItems": 10
}

Example output

{
    "url": "https://example.com",
    "domain": "example.com",
    "emails": "hello@example.com, support@example.com",
    "phones": "+1 800 555 0100",
    "social_links": "https://linkedin.com/company/example, https://twitter.com/example",
    "address": "123 Main St, San Francisco, CA 94105, US",
    "pages_crawled": 7,
    "scraped_at": "2026-06-04T10:00:00.000Z"
}
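
Because `emails`, `phones`, and `social_links` arrive as comma-separated strings, a consumer will usually want to split them into lists. The sketch below shows one way to do that; `expand_record` and `MULTI_VALUE_FIELDS` are hypothetical names, not part of the actor. The `address` field is deliberately left alone, since addresses contain commas themselves.

```python
# Fields documented as comma-separated in the output table.
MULTI_VALUE_FIELDS = ("emails", "phones", "social_links")


def expand_record(record: dict) -> dict:
    """Return a copy of the record with comma-separated fields split into lists."""
    expanded = dict(record)
    for field in MULTI_VALUE_FIELDS:
        raw = record.get(field) or ""
        expanded[field] = [part.strip() for part in raw.split(",") if part.strip()]
    return expanded


if __name__ == "__main__":
    record = {
        "domain": "example.com",
        "emails": "hello@example.com, support@example.com",
        "phones": "+1 800 555 0100",
        "social_links": "https://linkedin.com/company/example, https://twitter.com/example",
        "address": "123 Main St, San Francisco, CA 94105, US",
    }
    print(expand_record(record)["emails"])
```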

Notes

Email extraction prioritises <a href="mailto:..."> links (clean, unambiguous) before falling back to a regex sweep of page text.
Phone extraction requires at least 7 digits and rejects date-like strings, version numbers, and decimal sequences to minimise false positives.
Social link detection covers LinkedIn, X/Twitter, Facebook, Instagram, YouTube, TikTok, GitHub, Pinterest, WhatsApp, Telegram, Medium, Reddit, and Snapchat.
Physical address is read from JSON-LD schema.org/PostalAddress markup first, then from common CSS selectors (footer address, [itemprop="address"], .address).
Sub-pages that return 404 or other errors are silently skipped — only successfully loaded pages contribute data.
Duplicate domains in startUrls are collapsed to one record.
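
The email and phone heuristics in the notes above can be approximated with the standard library alone. This is a sketch under stated assumptions, not the actor's implementation: the regular expressions, the class and function names, and the choice to use the regex sweep only when no `mailto:` link is found are all illustrative guesses.

```python
import re
from html.parser import HTMLParser
from urllib.parse import unquote

# Illustrative patterns; the actor's real expressions are not documented.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
DATE_RE = re.compile(r"^\d{1,4}[-/.]\d{1,2}[-/.]\d{1,4}$")
DOTTED_RE = re.compile(r"^\d+(\.\d+)+$")  # version numbers and decimals


class MailtoCollector(HTMLParser):
    """Collect addresses from <a href="mailto:..."> links."""

    def __init__(self) -> None:
        super().__init__()
        self.emails: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if href.lower().startswith("mailto:"):
            # Drop any "?subject=..." suffix and decode percent-escapes.
            address = unquote(href[7:].split("?", 1)[0]).strip()
            if address:
                self.emails.append(address.lower())


def extract_emails(html: str) -> list[str]:
    """Prefer mailto: links; fall back to a regex sweep if none are present."""
    collector = MailtoCollector()
    collector.feed(html)
    found = collector.emails or [m.lower() for m in EMAIL_RE.findall(html)]
    return list(dict.fromkeys(found))  # deduplicate, keeping first-seen order


def looks_like_phone(candidate: str) -> bool:
    """Require at least 7 digits and reject dates, versions, and decimals."""
    text = candidate.strip()
    digits = re.sub(r"\D", "", text)
    if len(digits) < 7:
        return False
    return not (DATE_RE.match(text) or DOTTED_RE.match(text))


if __name__ == "__main__":
    print(extract_emails('<a href="mailto:hello@example.com">Mail us</a>'))
    print(looks_like_phone("+1 800 555 0100"), looks_like_phone("2026-06-04"))
```

One trade-off of this sketch: the dotted-sequence check also rejects phone numbers written purely with dots, so a production filter would need a finer rule.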

Pricing

You pay per website processed: a charge applies when the run starts, plus a per-record fee for each result returned.
