IRS Form 990-PF Private Foundation Scraper - Grant Detail
Parses IRS bulk XML downloads for Form 990-PF (private foundation) filings and extracts Part XV grant detail: recipient name, address, purpose, and amount paid. Covers the roughly 120k private foundations that file annually. Distinct from the propublica-nonprofit-crawler, which targets Form 990 public charities; Form 990-PF uses a different schema, and this scraper is aimed at grant-research buyers.
Features
- Downloads IRS TEOS (Tax Exempt Organization Search) annual ZIP batches directly from the IRS open-data portal.
- Filters for 990-PF returns only and parses XML for each filing.
- Extracts foundation header data: EIN, legal name, address, filing year.
- Extracts financial summaries: total assets at year-end, total revenue, qualifying distributions, investment income, excise tax.
- Extracts all officers and trustees with title and compensation.
- Extracts Part XV grant rows: recipient name, address (US or foreign), purpose, and dollar amount.
- Three modes: browse all foundations in a year, look up specific EINs, or filter by recipient name.
- Supports state filter, foundation name filter, and minimum grant amount threshold.
- No API key, no proxy, no browser required. IRS open data is fully public.
Who Uses 990-PF Data?
- Grant writers — Identify private foundations that have funded projects in your sector, then look up recipient names and grant purposes to write more targeted proposals.
- Nonprofit fundraising teams — Prospect for new funders by searching for foundations that have granted to peer organizations.
- Foundation-prospect-research consultants — Build bulk grant landscapes for clients: which funders are active in education/health/arts in a given state, and how much are they distributing.
- Grant-research SaaS platforms — Power foundation search features with structured 990-PF data without writing an IRS parser.
- Impact investors and program officers — Analyze grant flows across a field by pulling multi-year distributions and recipient patterns.
- Journalists — Follow the money from a specific foundation across years and grantees.
How the IRS 990-PF Scraper Works
The IRS publishes all 990-series XML filings as bulk ZIP archives on their Form 990 Series Downloads page. Each annual ZIP batch contains thousands of XML files, one per return. The scraper:
- Fetches the IRS download page to discover the ZIP URLs for the requested year.
- Downloads each ZIP batch sequentially (batches are ~70–100 MB compressed).
- Unpacks each XML file and checks the ReturnTypeCd for 990PF.
- Parses the IRS XML schema to extract foundation header, financials, officers, and Part XV grant groups.
- Applies any filters (state, name, EIN, recipient, minimum grant amount) and saves matching records.
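The unpack, check, and parse steps above can be sketched roughly as follows. This is a minimal illustration, not the scraper's actual code: the helper names (`local`, `find_text`, `iter_990pf_grants`) are hypothetical, and apart from ReturnTypeCd the element names (`GrantOrContributionPdDurYrGrp`, `RecipientBusinessName`, `GrantOrContributionPurposeTxt`, `Amt`) are assumptions about the IRS e-file schema that may differ between schema versions. It works on an already-downloaded ZIP held in memory and matches elements by local name so the XML namespace does not matter.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace, e.g. '{http://www.irs.gov/efile}EIN' -> 'EIN'."""
    return tag.rsplit("}", 1)[-1]


def find_text(node, *path):
    """Follow a path of local element names from node; return stripped text or None."""
    for name in path:
        node = next((child for child in node if local(child.tag) == name), None)
        if node is None:
            return None
    return (node.text or "").strip() or None


def iter_990pf_grants(zip_bytes):
    """Yield (ein, recipient, purpose, amount) for each grant row in 990-PF returns."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for member in archive.namelist():
            if not member.lower().endswith(".xml"):
                continue
            root = ET.fromstring(archive.read(member))
            # Skip every return that is not a 990-PF (990, 990-EZ, ...).
            if find_text(root, "ReturnHeader", "ReturnTypeCd") != "990PF":
                continue
            ein = find_text(root, "ReturnHeader", "Filer", "EIN")
            for el in root.iter():
                # Assumed name of the Part XV grant group element.
                if local(el.tag) != "GrantOrContributionPdDurYrGrp":
                    continue
                recipient = (
                    find_text(el, "RecipientBusinessName", "BusinessNameLine1Txt")
                    or find_text(el, "RecipientPersonNm")
                )
                purpose = find_text(el, "GrantOrContributionPurposeTxt")
                amount = int(find_text(el, "Amt") or 0)
                yield ein, recipient, purpose, amount
```

Writing this as a generator means only one parsed XML tree is held in memory at a time, which matters when a batch contains thousands of returns.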
Modes
grants_by_year (default)
Returns all 990-PF filings in the specified year. Use stateFilter, foundationName, and minGrantAmount to narrow results.
{
"mode": "grants_by_year",
"filingYear": 2024,
"stateFilter": "CA",
"minGrantAmount": 100000,
"maxItems": 50
}
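How the three narrowing inputs might combine is sketched below. The function names are hypothetical and the semantics are assumptions: the state is compared case-insensitively, foundationName is treated as a substring match, and minGrantAmount keeps a filing when at least one of its grants meets the threshold. Records are assumed to follow the output schema, with each grant stored as a recipient|city, state zip|purpose|amount string.

```python
def grant_amount(grant):
    """Return the amount, assumed to be the last pipe-delimited field of a grant string."""
    try:
        return float(grant.rsplit("|", 1)[-1])
    except ValueError:
        return 0.0


def matches_filters(record, state_filter=None, foundation_name=None, min_grant_amount=0):
    """Return True if a filing record passes every filter that was supplied."""
    if state_filter and record.get("foundation_state", "").upper() != state_filter.upper():
        return False
    if foundation_name and foundation_name.lower() not in record.get("foundation_name", "").lower():
        return False
    if min_grant_amount:
        # Assumption: keep the filing if at least one grant meets the threshold.
        if not any(grant_amount(g) >= min_grant_amount for g in record.get("grants", [])):
            return False
    return True
```

Filters that are left unset are skipped, so the same function covers an unfiltered browse of the whole year.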
foundation_lookup
Fetch filings for specific foundations by EIN. Scans annual batches until all requested EINs are found.
{
"mode": "foundation_lookup",
"einList": ["13-1803509", "23-7129889"],
"filingYear": 2024,
"maxItems": 10
}
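The example input writes EINs with a hyphen, while the output schema lists a 9-digit EIN, so some normalization is presumably needed before matching. The sketch below shows one way to do that and to stop scanning once every requested EIN has been seen. Both function names are hypothetical, and returning only the first filing found per EIN is an assumption.

```python
import re


def normalize_ein(ein):
    """Reduce an EIN such as '13-1803509' to its 9 digits."""
    digits = re.sub(r"\D", "", str(ein))
    if len(digits) != 9:
        raise ValueError(f"EIN must have 9 digits: {ein!r}")
    return digits


def lookup(records, ein_list):
    """Collect the first filing for each requested EIN, stopping early when all are found."""
    wanted = {normalize_ein(e) for e in ein_list}
    found = []
    for record in records:
        ein = normalize_ein(record["foundation_ein"])
        if ein in wanted:
            found.append(record)
            wanted.discard(ein)
            if not wanted:
                break  # nothing left to look for; skip the remaining records
    return found
```

Because `records` can be any iterable, the early `break` also avoids pulling further batches when it is fed by a lazy generator.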
grants_by_recipient
Returns foundations that made grants to recipients matching the supplied name substring. Useful for finding all funders that have supported a specific organization.
{
"mode": "grants_by_recipient",
"recipientName": "Red Cross",
"filingYear": 2024,
"maxItems": 20
}
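A recipient substring match could look like the sketch below. The function name is hypothetical, and the case-insensitive comparison is an assumption. It checks only the recipient field, taken to be the first pipe-delimited field of each grant string as in the output schema, so a name that appears only in the purpose text does not match.

```python
def grants_matching_recipient(record, recipient_name):
    """Return the grant strings whose recipient field contains recipient_name."""
    needle = recipient_name.casefold()
    return [
        grant
        for grant in record.get("grants", [])
        if needle in grant.split("|", 1)[0].casefold()
    ]
```

A filing would then be kept when this list is non-empty.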
Output Schema
Each record represents one foundation's 990-PF filing for a given tax year.
| Field | Type | Description |
|---|---|---|
| foundation_ein | string | 9-digit EIN |
| foundation_name | string | Legal name |
| filing_year | number | Tax year |
| return_type | string | Always 990PF |
| foundation_address | string | Street address |
| foundation_city | string | City |
| foundation_state | string | 2-letter state code |
| foundation_zip | string | ZIP code |
| total_assets_eoy | number | Total assets at year-end (USD) |
| total_revenue | number | Total revenue (USD) |
| total_grants_paid | number | Qualifying distributions paid (USD) |
| investment_income | number | Net investment income (USD) |
| excise_tax_paid | number | Excise tax on investment income (USD) |
| trustee_count | number | Number of trustees/officers |
| trustees | array | name\|title\|compensation per trustee |
| grant_count | number | Number of Part XV grants |
| grants | array | recipient\|city, state zip\|purpose\|amount per grant |
| source_xml_url | string | IRS ZIP batch URL |
| source_file | string | XML filename in the ZIP |
| scraped_at | string | ISO 8601 scrape timestamp |
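Because each entry in grants is a pipe-delimited string, downstream code will usually want to split it into fields. The sketch below is one way to do that. The `Grant` type and `parse_grant` function are hypothetical names, and the handling of stray pipes and blank amounts is an assumption about how the data may look.

```python
from typing import NamedTuple, Optional


class Grant(NamedTuple):
    recipient: str
    location: str  # "city, state zip" as emitted by the scraper
    purpose: str
    amount: Optional[float]


def parse_grant(row):
    """Split a 'recipient|city, state zip|purpose|amount' string into a Grant."""
    parts = row.split("|")
    if len(parts) < 4:
        raise ValueError(f"expected 4 pipe-delimited fields: {row!r}")
    # Assumption: any extra pipes belong to the free-text purpose field.
    purpose = "|".join(parts[2:-1])
    amount_text = parts[-1].strip()
    amount = float(amount_text) if amount_text else None
    return Grant(parts[0].strip(), parts[1].strip(), purpose.strip(), amount)
```

For example, summing `parse_grant(g).amount or 0` over a record's grants gives a figure that can be compared against total_grants_paid as a sanity check.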
Notes
- Memory: The actor uses 2048 MB by default to handle large ZIP decompression and XML parsing. Reduce to 512 MB for small runs.
- Timeout: IRS ZIP batches are large. Allow at least 1 hour for a complete year scan.
- Year coverage: IRS publishes TEOS data for the current and prior years. Data for the most recent tax year typically appears a few months after year-end.
- maxItems: The default is 15 for quick validation. Increase for bulk extraction.
- Proxy: Not required. The IRS open-data portal is fully public.