NSF Awards Scraper - Research Grants, PIs & Funding Data
Extract NSF award records from the official research.gov Awards API. Returns rich structured data including award title, abstract, awardee organization, principal investigator (name + email), NSF program officer (name/email/phone), funds obligated and estimated totals, directorate and division, program element/reference codes, CFDA numbers, start/expiration dates, and performance location. Filter by keyword, PI name, awardee organization, state, date range, and CFDA program.
What You Get
Each record contains:
- Award identification: award ID, title, CFDA number, directorate/division abbreviations, program element and reference codes
- Awardee details: organization name, city, state, zip, country, UEI number
- Principal Investigator (PI): first/last/full name, email address
- Co-PIs: semicolon-separated list of co-PI names
- Program Officer (PO): name, email, phone — the key contact at NSF
- Funding: funds obligated (USD), estimated total (USD)
- Dates: start date, expiration date, latest amendment date
- Program context: NSF funding program name, primary program, org long name
- Performance location: city, state, zip, country (may differ from awardee)
- Public access mandate flag
- Direct URL to the NSF award detail page
Use Cases
| Buyer | What They Use It For |
|---|---|
| University sponsored-programs offices | Track competitor institutions' NSF funding; identify gaps in their own programs |
| Grant writers | Research NSF program officers by directorate; identify active programs and funded PIs |
| Biotech / deep-tech VCs | Identify university researchers with NSF backing as potential spinout founders |
| Research-policy analysts | Analyze funding flows by directorate, state, or institution over time |
| Competitive-intel teams | Track who NSF is funding in a specific technology area |
| Lead-gen / BD teams | PI and PO email fields for direct outreach in the academic/government sector |
Input Parameters
| Parameter | Type | Description |
|---|---|---|
| keyword | string | Free-text search across award title and abstract |
| piName | string | Filter by PI name (e.g. Smith) |
| awardeeName | string | Filter by awardee institution (e.g. Massachusetts Institute of Technology) |
| awardeeStateCode | string | Filter by US state code (e.g. CA, MA, NY) |
| dateStart | string | Earliest award start date, MM/DD/YYYY format |
| dateEnd | string | Latest award start date, MM/DD/YYYY format |
| cfdaNumber | string | CFDA program number (e.g. 47.070 for Computer & Information Science) |
| maxItems | integer | Maximum records to return. Default 10. Set 0 for unlimited (use date slicing for large pulls). |
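As a rough illustration of how the parameters above fit together, the following sketch checks an input object before a run. The parameter names come from the table; the `validate_input` helper and its specific checks are hypothetical and are not part of the scraper itself.

```python
from datetime import datetime

# Parameter names from the Input Parameters table.
ALLOWED_KEYS = {
    "keyword", "piName", "awardeeName", "awardeeStateCode",
    "dateStart", "dateEnd", "cfdaNumber", "maxItems",
}
DATE_FORMAT = "%m/%d/%Y"  # MM/DD/YYYY, as documented for dateStart/dateEnd


def validate_input(params: dict) -> dict:
    """Hypothetical pre-flight check; the rules here are assumptions."""
    unknown = set(params) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    # strptime raises ValueError if a date does not parse as month/day/year.
    parsed = {}
    for key in ("dateStart", "dateEnd"):
        if key in params:
            parsed[key] = datetime.strptime(params[key], DATE_FORMAT)
    if len(parsed) == 2 and parsed["dateStart"] > parsed["dateEnd"]:
        raise ValueError("dateStart must not be after dateEnd")

    # maxItems defaults to 10; 0 means unlimited.
    max_items = params.get("maxItems", 10)
    if not isinstance(max_items, int) or max_items < 0:
        raise ValueError("maxItems must be a non-negative integer")
    return {**params, "maxItems": max_items}


example = validate_input({
    "keyword": "quantum",
    "awardeeStateCode": "MA",
    "dateStart": "01/01/2024",
    "dateEnd": "12/31/2024",
})
print(example)
```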
Handling the API's 3,000-Record Limit
The NSF Awards API returns at most ~3,000 results per query. For larger data pulls:
- Slice by year: run one query per calendar year using dateStart/dateEnd
- Combine with keyword or awardeeStateCode to narrow each slice further
Example: to get all 2024 awards in California, set dateStart: 01/01/2024, dateEnd: 12/31/2024, awardeeStateCode: CA.
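The year-slicing approach can be sketched as a small helper that produces one input object per calendar year. The `year_slices` function is a hypothetical illustration; only the parameter names and the MM/DD/YYYY date format come from this document.

```python
from datetime import date


def year_slices(first_year: int, last_year: int, **filters) -> list[dict]:
    """Build one scraper input per calendar year (hypothetical helper)."""
    slices = []
    for year in range(first_year, last_year + 1):
        slices.append({
            **filters,
            "dateStart": date(year, 1, 1).strftime("%m/%d/%Y"),
            "dateEnd": date(year, 12, 31).strftime("%m/%d/%Y"),
            "maxItems": 0,  # 0 means unlimited within the slice
        })
    return slices


# One input per year for California awards, 2023 through 2024.
for run_input in year_slices(2023, 2024, awardeeStateCode="CA"):
    print(run_input)
```

If a single year still approaches the ~3,000-result ceiling, the same idea can be applied with narrower date ranges or an additional filter per slice.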
Sample Output Record
{
"award_id": "2606034",
"title": "CAREER: Generalization Capabilities of Machine Learning...",
"awardee_name": "University of California-Los Angeles",
"awardee_city": "LOS ANGELES",
"awardee_state_code": "CA",
"uei_number": "RN64EPNH8JC6",
"pi_full_name": "Hayden Schaeffer",
"pi_email": "hayden@math.ucla.edu",
"po_name": "Stacey Levine",
"po_email": "slevine@nsf.gov",
"po_phone": "7032922948",
"funds_obligated_amt": 214876,
"estimated_total_amt": 214876,
"start_date": "10/01/2026",
"exp_date": "09/30/2029",
"cfda_number": "47.049",
"directorate_abbr": "MPS",
"division_abbr": "DMS",
"fund_program_name": "APPLIED MATHEMATICS",
"program_element_codes": "126600",
"program_reference_codes": "075Z, 079Z",
"public_access_mandate": "1",
"perf_state_code": "CA",
"award_detail_url": "https://www.nsf.gov/awardsearch/showAward?AWD_ID=2606034",
"status": "success"
}
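For downstream analysis it can help to convert the string fields of a record into typed values. The sketch below is one possible way to do that; the `normalize_record` helper is hypothetical, and it assumes that dates use MM/DD/YYYY, that code lists are comma-separated, and that a public access mandate value of "1" means true, as the sample record suggests.

```python
from datetime import datetime


def normalize_record(rec: dict) -> dict:
    """Return a copy of an output record with typed fields (hypothetical helper)."""
    out = dict(rec)

    # Dates appear as MM/DD/YYYY strings in the sample record.
    for key in ("start_date", "exp_date"):
        if rec.get(key):
            out[key] = datetime.strptime(rec[key], "%m/%d/%Y").date()

    # Assumption: reference codes are comma-separated, as in "075Z, 079Z".
    codes = rec.get("program_reference_codes", "")
    out["program_reference_codes"] = [c.strip() for c in codes.split(",") if c.strip()]

    # Assumption: "1" marks the public access mandate as set.
    out["public_access_mandate"] = rec.get("public_access_mandate") == "1"
    return out


sample = {
    "award_id": "2606034",
    "start_date": "10/01/2026",
    "exp_date": "09/30/2029",
    "program_reference_codes": "075Z, 079Z",
    "public_access_mandate": "1",
}
print(normalize_record(sample))
```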
Data Source
All data comes from the NSF Awards API (research.gov), the official public interface to the NSF Awards database. No authentication required. The API covers 500,000+ awards since 1959.
The scraper requests all 40+ available fields by name. This includes fields such as piEmail, poEmail, abstractText, and coPDPI, which the API returns only when they are explicitly listed in printFields, so no data is silently omitted.
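To illustrate what listing fields in printFields might look like, here is a minimal sketch that builds a query URL without sending any request. The base URL is an assumption (check the current NSF API documentation), the `build_query_url` helper is hypothetical, and the field list shows only the four fields named above rather than the full set the scraper requests.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the current NSF Awards API documentation.
BASE_URL = "https://www.research.gov/awardapi-service/v1/awards.json"

# Only the fields named in this README; the scraper itself requests 40+.
PRINT_FIELDS = ["piEmail", "poEmail", "abstractText", "coPDPI"]


def build_query_url(filters: dict, fields: list[str] = PRINT_FIELDS) -> str:
    """Combine search filters with an explicit printFields list (hypothetical helper)."""
    params = {**filters, "printFields": ",".join(fields)}
    return f"{BASE_URL}?{urlencode(params)}"


print(build_query_url({"keyword": "machine learning"}))
```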
Technical Notes
- Rate limiting: the scraper sends ~1 request per second, in line with API courtesy guidelines, and retries automatically on 429/503 responses
- Memory: 512 MB default, sufficient for full paginated runs
- Timeout: 4-hour ceiling for full-history slices; typical keyword searches complete in seconds
- No proxy required: the NSF Awards API is public and accessible without geo-routing
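The retry behaviour on 429/503 responses can be sketched as follows. This is not the scraper's actual implementation: the `fetch_with_retry` helper, the exponential backoff schedule, and the attempt limit are all assumptions chosen for illustration. The `fetch` argument is any callable returning a `(status_code, body)` pair, which keeps the example free of network calls.

```python
import time

RETRYABLE_STATUSES = {429, 503}  # status codes named in the Technical Notes


def fetch_with_retry(fetch, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it returns a non-retryable status or attempts run out.

    Backoff doubles after each retryable response (an assumed policy).
    """
    status, body = None, None
    for attempt in range(max_attempts):
        status, body = fetch()
        if status not in RETRYABLE_STATUSES:
            return status, body
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    # All attempts returned a retryable status; hand back the last response.
    return status, body


# Demo with canned responses and a no-op sleep, so it runs instantly.
canned = iter([(429, ""), (503, ""), (200, '{"response": {}}')])
print(fetch_with_retry(lambda: next(canned), sleep=lambda seconds: None))
```

Passing `sleep` as a parameter is a design choice that makes the backoff schedule easy to inspect in tests.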