TheOrg Org Chart People Scraper
Scrape public org chart data from TheOrg. Returns people, titles, reporting relationships, departments, and company metadata — no login or cookies required. TheOrg publishes its entire org chart database as server-rendered HTML, so every record here is genuinely public.
TheOrg Org Chart People Scraper Features
- Extracts org chart nodes with parent-child reporting edges; the reports_to field tells you who each person reports to.
- Returns team/department membership per person from the company's team structure.
- Collects company metadata alongside each person: industry, HQ location, website, slug.
- Derives seniority level (C-suite, VP, Director, Manager, Senior IC, IC) from job title automatically.
- Accepts a list of company URLs directly or discovers companies via TheOrg's sitemap (200k+ org profiles).
- Respects maxItems so you can run small targeted pulls without crawling the full index.
Who Uses This Data?
- SDRs and BDRs — build buying committees by mapping titles and reporting lines at target accounts before cold outreach.
- Recruiters — identify hiring manager and team structure before reaching out to candidates.
- Market researchers — analyze org structure patterns across industries or geographies.
- Sales engineers — understand decision-maker hierarchy at prospect accounts before demos.
- GTM analysts — enrich CRM accounts with org chart data from a public, cookieless source.
How TheOrg Org Chart People Scraper Works
- Accepts a list of company URLs (startUrls) or crawls TheOrg's compressed sitemap index to discover company pages.
- Fetches each company's org page and extracts the __NEXT_DATA__ SSR JSON payload, so no JS rendering is needed.
- Parses initialNodes (org chart with parentId edges) and initialTeams (department membership) from the page data.
- Emits one record per unique person, with company metadata attached to each row.
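The extraction step above can be sketched roughly as follows. This is a minimal illustration, not the actor's actual code: the helper names `extract_next_data` and `find_key` are hypothetical, the sample HTML and the names in it are made up, and because the exact location of initialNodes and initialTeams inside the payload is not documented here, the sketch searches for the keys recursively instead of assuming a fixed path.

```python
import json
import re
from typing import Any, Optional

# Next.js pages embed their server-side data in a script tag with this id.
NEXT_DATA_RE = re.compile(
    r'<script[^>]*\bid="__NEXT_DATA__"[^>]*>(.*?)</script>',
    re.DOTALL,
)


def extract_next_data(html: str) -> dict:
    """Return the parsed __NEXT_DATA__ JSON payload from a page's HTML."""
    match = NEXT_DATA_RE.search(html)
    if match is None:
        raise ValueError("no __NEXT_DATA__ script tag found")
    return json.loads(match.group(1))


def find_key(obj: Any, key: str) -> Optional[Any]:
    """Depth-first search for the first value stored under `key`."""
    if isinstance(obj, dict):
        if key in obj:
            return obj[key]
        for value in obj.values():
            found = find_key(value, key)
            if found is not None:
                return found
    elif isinstance(obj, list):
        for item in obj:
            found = find_key(item, key)
            if found is not None:
                return found
    return None


# Made-up page fragment; the nesting under props/pageProps is an assumption.
SAMPLE_HTML = """<html><body>
<script id="__NEXT_DATA__" type="application/json">
{"props": {"pageProps": {
  "initialNodes": [{"id": "1", "parentId": null, "name": "Ada"}],
  "initialTeams": [{"name": "Engineering"}]}}}
</script></body></html>"""

payload = extract_next_data(SAMPLE_HTML)
nodes = find_key(payload, "initialNodes")
teams = find_key(payload, "initialTeams")
print(len(nodes), len(teams))
```

Searching by key keeps the sketch tolerant of changes in how the page nests its props, at the cost of returning the first match if a key ever appears more than once.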
Input
{
  "startUrls": [
    { "url": "https://theorg.com/org/stripe" },
    { "url": "https://theorg.com/org/notion" }
  ],
  "maxItems": 100
}
| Field | Type | Default | Description |
|---|---|---|---|
| startUrls | array | [{"url": "https://theorg.com/org/happilo"}] | List of TheOrg company page URLs to scrape. Leave empty to use sitemap discovery. |
| maxItems | integer | 10 | Maximum number of person records to collect. Set to 0 for unlimited (sitemap mode). |
Sitemap mode: leave startUrls empty and the actor fetches TheOrg's sitemap index (cdn.theorg.com/sitemap.xml), then walks the 400+ compressed company sitemaps it lists to collect company URLs. Use maxItems to cap output.
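As a rough sketch of how a sitemap walk like this can work, the snippet below parses a sitemap index, decompresses each child sitemap, and collects the URLs it finds. It assumes the child sitemaps are gzip-compressed and follow the standard sitemaps.org schema. The function names, the `limit` parameter, and the child sitemap URL are hypothetical, and the network fetch is passed in as a callable so the example runs offline on in-memory data.

```python
import gzip
import xml.etree.ElementTree as ET
from typing import Callable, List

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_locs(xml_bytes: bytes) -> List[str]:
    """Return every <loc> URL in a sitemap or sitemap index document."""
    root = ET.fromstring(xml_bytes)
    return [el.text.strip() for el in root.iterfind(".//sm:loc", SITEMAP_NS) if el.text]


def collect_company_urls(
    index_xml: bytes, fetch: Callable[[str], bytes], limit: int = 0
) -> List[str]:
    """Walk each gzip-compressed child sitemap; limit=0 means no cap."""
    urls: List[str] = []
    for sitemap_url in parse_locs(index_xml):
        for url in parse_locs(gzip.decompress(fetch(sitemap_url))):
            urls.append(url)
            if limit and len(urls) >= limit:
                return urls
    return urls


# In-memory stand-ins for the real index and one child sitemap.
INDEX_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.invalid/companies-1.xml.gz</loc></sitemap>
</sitemapindex>"""

CHILD_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://theorg.com/org/stripe</loc></url>
  <url><loc>https://theorg.com/org/notion</loc></url>
</urlset>"""

FAKE_RESPONSES = {"https://example.invalid/companies-1.xml.gz": gzip.compress(CHILD_XML)}

company_urls = collect_company_urls(INDEX_XML, FAKE_RESPONSES.__getitem__)
print(company_urls)
```

Note that `limit` here caps discovered URLs, whereas maxItems caps person records, so a real crawl would stop on the record count instead.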
TheOrg Org Chart People Scraper Output Fields
{
  "org_name": "Stripe",
  "org_slug": "stripe",
  "org_website": "https://stripe.com",
  "org_industry": "Financial Services, Software",
  "org_hq_location": "San Francisco, California, United States",
  "person_name": "Patrick Collison",
  "person_title": "CEO",
  "person_role_level": "C-suite",
  "person_profile_url": "https://theorg.com/org/stripe/people/patrick-collison",
  "person_linkedin_url": "",
  "reports_to": "",
  "department": ""
}
| Field | Type | Description |
|---|---|---|
| org_name | string | Company name |
| org_slug | string | Company slug from the URL |
| org_website | string | Company website URL |
| org_industry | string | Industry tag(s) from TheOrg |
| org_hq_location | string | City, region, country from company profile |
| person_name | string | Full name |
| person_title | string | Job title |
| person_role_level | string | Seniority: C-suite, VP, Director, Manager, Senior IC, or IC |
| person_profile_url | string | Direct link to the person's TheOrg profile |
| person_linkedin_url | string | LinkedIn URL when present in page data |
| reports_to | string | Manager's name (from org chart parent edge) |
| department | string | Team or department name |
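One way to consume these records downstream is to group people by their manager. The sketch below is an illustration only: the `direct_reports` helper is hypothetical, and the sample rows use a placeholder company and placeholder names, not real output. It relies solely on the org_slug, person_name, and reports_to fields described in the table above.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def direct_reports(records: List[dict]) -> Dict[Tuple[str, str], List[str]]:
    """Group person names under (org_slug, manager name)."""
    grouped: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for record in records:
        manager = record.get("reports_to", "")
        if not manager:
            # Empty reports_to: top-level person or no org chart edge available.
            continue
        grouped[(record["org_slug"], manager)].append(record["person_name"])
    return dict(grouped)


# Placeholder rows shaped like the output fields; only the relevant keys are shown.
SAMPLE_RECORDS = [
    {"org_slug": "acme", "person_name": "Person A", "reports_to": ""},
    {"org_slug": "acme", "person_name": "Person B", "reports_to": "Person A"},
    {"org_slug": "acme", "person_name": "Person C", "reports_to": "Person A"},
    {"org_slug": "acme", "person_name": "Person D", "reports_to": "Person B"},
]

reports = direct_reports(SAMPLE_RECORDS)
for (org, manager), people in reports.items():
    print(org, manager, people)
```

Because reports_to holds a name and not an id, two managers with the same name at one company would be merged by this grouping.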
FAQ
Does TheOrg require a login to view org charts?
No. TheOrg publishes org charts publicly — no account, no cookies. The data is server-rendered and available to any HTTP client with a valid browser user-agent.
How fresh is the data?
TheOrg data reflects whatever is on their site at crawl time. Org charts are user-contributed and vary in freshness by company. High-profile companies tend to be updated more frequently.
What is the reports_to field?
It contains the name of the person's direct manager, derived from the org chart's parent-child node structure. The field is empty for top-level executives (no parent node) or for people sourced from team pages where the org chart edge isn't available.
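Turning parent-child edges into manager names can be sketched as a simple id lookup. This is an assumption-laden illustration: parentId is named in this README, but the `id` and `name` keys, the `resolve_reports_to` helper, and the sample nodes are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def resolve_reports_to(nodes: List[dict]) -> List[Tuple[str, str]]:
    """Return (person name, manager name) pairs; manager is "" when unknown."""
    names_by_id: Dict[str, str] = {node["id"]: node["name"] for node in nodes}
    pairs: List[Tuple[str, str]] = []
    for node in nodes:
        parent_id: Optional[str] = node.get("parentId")
        # No parent, or a parent id that is not in the chart, yields an empty manager.
        manager = names_by_id.get(parent_id, "") if parent_id else ""
        pairs.append((node["name"], manager))
    return pairs


# Made-up nodes: a root, a direct report, and a node with a dangling parent id.
SAMPLE_NODES = [
    {"id": "n1", "parentId": None, "name": "Person A"},
    {"id": "n2", "parentId": "n1", "name": "Person B"},
    {"id": "n3", "parentId": "missing", "name": "Person C"},
]

pairs = resolve_reports_to(SAMPLE_NODES)
print(pairs)
```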
How does sitemap mode work?
When startUrls is empty, the actor fetches TheOrg's sitemap index and walks 400+ compressed company sitemaps to collect URLs, then crawls each org page. This mode can return data for 200k+ organizations — use maxItems to cap it.
What does person_role_level derive from?
The title string. Matches common patterns: "CEO", "Chief", "President" → C-suite; "VP", "Vice President" → VP; "Director" → Director; "Manager" → Manager; "Senior", "Lead", "Principal", "Staff", "Head of" → Senior IC; everything else → IC.
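The matching described above could look something like the sketch below. It is not the actor's implementation: the `role_level` function and rule table are hypothetical and include only the tokens listed in this answer, so abbreviations such as other C-level acronyms fall through to IC. One ordering choice is an assumption on my part: the VP rule is checked before the C-suite rule so that "Vice President" is not caught by the "President" pattern.

```python
import re
from typing import List, Pattern, Tuple

# Rules are checked top to bottom; the first match wins.
RULES: List[Tuple[str, Pattern[str]]] = [
    ("VP", re.compile(r"\b(vp|vice president)\b", re.IGNORECASE)),
    ("C-suite", re.compile(r"\b(ceo|chief|president)\b", re.IGNORECASE)),
    ("Director", re.compile(r"\bdirector\b", re.IGNORECASE)),
    ("Manager", re.compile(r"\bmanager\b", re.IGNORECASE)),
    ("Senior IC", re.compile(r"\b(senior|lead|principal|staff|head of)\b", re.IGNORECASE)),
]


def role_level(title: str) -> str:
    """Map a job title to a seniority bucket, defaulting to IC."""
    for level, pattern in RULES:
        if pattern.search(title):
            return level
    return "IC"


for sample in ["CEO", "Vice President of Sales", "Head of Design", "Account Executive"]:
    print(sample, "->", role_level(sample))
```

With this ordering, a title such as "Senior Manager" lands in Manager because the Manager rule is checked before the Senior IC rule.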
Need More Features?
Open an issue or contact support if you need additional fields, pagination controls, or a team-filtered mode.
Why Use TheOrg Org Chart People Scraper?
- Cookieless source: TheOrg requires no authentication, unlike LinkedIn scrapers, which typically need active session cookies and have to contend with aggressive anti-bot measures. You get org chart data for mapping buyers without the credential management problem.
- Reporting edges included: the reports_to field surfaces manager-report relationships that LinkedIn does not expose. Buying committee mapping gets considerably more useful when you know the chain of command.
- Affordable at scale: PPE pricing means you pay per record, not per run. Grabbing 500 decision-makers from 10 target accounts costs less than a cup of coffee.