OrbTop

NIH RePORTER Scraper - Grants, PIs & Linked Publications

Extract NIH-funded research project records from the official RePORTER v2 API — no account or proxy required. Retrieve PI names, award amounts, activity codes, study sections, dates, active/terminated status, and optionally linked PubMed publication IDs.

What you get

Each output record corresponds to one NIH project award (one fiscal-year slice). Fields include:

| Field | Description |
| --- | --- |
| project_num | Full NIH project number (e.g. 5R01CA123456-05) |
| core_project_num | Core project number; groups subprojects and multi-year awards |
| appl_id | Application ID |
| fiscal_year | NIH fiscal year |
| project_title | Project title |
| abstract_text | Full project abstract |
| phr_text | Public health relevance statement |
| activity_code | NIH activity code (R01, R21, K99, F31, P30, U54, …) |
| agency_ic_admin | Administering institute/center (NCI, NIAID, NHLBI, …) |
| award_amount | Total award amount (USD) |
| direct_cost_amt | Direct costs (USD) |
| indirect_cost_amt | Indirect costs (USD) |
| contact_pi_name | Contact PI name |
| principal_investigators | Full PI roster; each entry is a JSON string with full_name, profile_id, is_contact_pi, and title |
| organization_name | Funded institution |
| org_state | US state of funded institution |
| is_active | Whether the project is currently active |
| arra_funded | Whether funded via ARRA (stimulus) |
| budget_start / budget_end | Budget period dates |
| project_start_date / project_end_date | Project period dates |
| full_study_section | NIH study section that reviewed the application |
| agency_ic_fundings | IC-level funding breakdown (FY:IC:amount strings) |
| spending_categories | NIH spending categories |
| linked_publication_pmids | PubMed IDs of linked publications (when Include Linked Publications is enabled) |
| project_detail_url | Direct link to the RePORTER project-details page |
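
Two of the fields above are encoded strings rather than plain values: each principal_investigators entry is a JSON string, and each agency_ic_fundings entry is an FY:IC:amount string. The sketch below shows one way to decode both. The helper names are hypothetical, and it assumes the amount part is always numeric and that IC codes never contain a colon.

```python
import json


def parse_principal_investigators(entries):
    """Decode each JSON-string entry of principal_investigators into a dict."""
    return [json.loads(entry) for entry in entries]


def parse_ic_fundings(entries):
    """Split 'FY:IC:amount' strings into (fiscal_year, ic, amount) tuples."""
    parsed = []
    for entry in entries:
        # Assumption: exactly three colon-separated parts per entry.
        fiscal_year, ic, amount = entry.split(":")
        parsed.append((int(fiscal_year), ic, float(amount)))
    return parsed


# Sample values copied from the example output in this document.
record = {
    "principal_investigators": [
        '{"full_name":"Jane Doe","profile_id":12345,"is_contact_pi":true,"title":"Prof."}'
    ],
    "agency_ic_fundings": ["2024:NCI:512000"],
}

pis = parse_principal_investigators(record["principal_investigators"])
# Pick the contact PI, or None if no entry is flagged as contact.
contact_pi = next((pi for pi in pis if pi.get("is_contact_pi")), None)
fundings = parse_ic_fundings(record["agency_ic_fundings"])

print(contact_pi["full_name"])  # Jane Doe
print(fundings)                 # [(2024, 'NCI', 512000.0)]
```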

Filtering options

| Input | Effect |
| --- | --- |
| Keyword / Text Search | Search across title, abstract, and terms |
| Fiscal Years | Limit to one or more NIH fiscal years (strongly recommended for large pulls) |
| Activity Codes | E.g. R01, R21, K99, F31, P30, U54 |
| Administering Institute | E.g. NCI, NIAID, NHLBI, NIGMS |
| PI Names | Filter by PI last name |
| Organization Names | Filter by funded institution |
| Organization States | Filter by US state (e.g. CA, MA, NY) |
| Active Projects Only | Exclude terminated/closed awards |
| Newly Added Only | Only records recently added to RePORTER |
| Include Linked Publications | Fetch linked PubMed IDs for each project |
| Max Items | Cap on total records returned |
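
To illustrate how filters like these could translate into a RePORTER v2 search request, here is a minimal sketch of a request-body builder. It is not the scraper's actual code. The function and parameter names are hypothetical, and the criteria keys (fiscal_years, activity_codes, agencies, org_names, org_states, pi_names, include_active_projects, advanced_text_search) are assumptions based on the public API that should be checked against the official API documentation before use.

```python
def build_search_payload(keyword=None, fiscal_years=None, activity_codes=None,
                         institutes=None, pi_last_names=None, org_names=None,
                         org_states=None, active_only=False,
                         offset=0, limit=500):
    """Assemble a search request body, leaving out any filter that is unset."""
    criteria = {}
    if keyword:
        # Assumed shape of the text-search criterion (title, abstract, terms).
        criteria["advanced_text_search"] = {
            "operator": "and",
            "search_field": "projecttitle,abstracttext,terms",
            "search_text": keyword,
        }
    # Simple list-valued filters: assumed API key -> value supplied by caller.
    list_filters = {
        "fiscal_years": fiscal_years,
        "activity_codes": activity_codes,
        "agencies": institutes,
        "org_names": org_names,
        "org_states": org_states,
    }
    for key, values in list_filters.items():
        if values:
            criteria[key] = list(values)
    if pi_last_names:
        # Assumption: PI filters are objects keyed by name part.
        criteria["pi_names"] = [{"last_name": name} for name in pi_last_names]
    if active_only:
        criteria["include_active_projects"] = True
    return {"criteria": criteria, "offset": offset, "limit": limit}


payload = build_search_payload(
    keyword="immunotherapy",
    fiscal_years=[2023, 2024],
    activity_codes=["R01", "R21"],
    institutes=["NCI"],
    org_states=["CA"],
    active_only=True,
)
print(payload)
```

Omitting unset filters keeps the request body small and avoids sending empty lists, which an API might interpret differently from a missing key.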

API limits & pagination

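The NIH RePORTER v2 API enforces a hard cap of 15,000 rows per search query (offset + page size cannot exceed 15,000). For large pulls, specify one or more Fiscal Years: the scraper runs a separate query per year, so the cap applies to each year rather than to the whole pull. A single fiscal year typically contains 60,000–100,000 awards, so a year with no other filters is still truncated; the scraper fetches up to 15,000 records per year and logs a warning when the cap is reached. To retrieve a complete year, combine Fiscal Years with narrower filters (for example Activity Codes, Administering Institute, or Organization States) so that each query matches no more than 15,000 records.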
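
The arithmetic behind the cap can be sketched as a small page planner that yields (offset, limit) pairs and stops once offset + limit would exceed 15,000. This is an illustration, not the scraper's implementation: the function name is hypothetical, and the page size of 500 is an assumed value.

```python
MAX_ROWS = 15_000  # hard cap described above: offset + page size <= 15,000


def page_windows(total_available, page_size=500, max_items=None):
    """Yield (offset, limit) pairs that never cross the 15,000-row cap."""
    reachable = min(total_available, MAX_ROWS)
    if max_items is not None:
        reachable = min(reachable, max_items)
    offset = 0
    while offset < reachable:
        # Shrink the final page so it ends exactly at the reachable total.
        limit = min(page_size, reachable - offset)
        yield offset, limit
        offset += limit


# A fiscal year with 80,000 matching awards: only the first 15,000 are reachable.
windows = list(page_windows(total_available=80_000))
truncated = 80_000 > MAX_ROWS

print(len(windows))   # 30
print(windows[-1])    # (14500, 500)
print(truncated)      # True
```

When truncated is true, the query needs narrower filters to reach the remaining records; paging further is not possible.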

Use cases

  • Grant landscape analysis — map NIH funding across institutes, activity codes, and institutions
  • PI profiling — identify investigators and their award history
  • Policy research — track ARRA, COVID-response, or newly-terminated awards
  • Publication pipeline — link grants to downstream PubMed output
  • Competitive intelligence — benchmark funding in a specific disease area or geography
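
As a small illustration of the grant landscape use case, the sketch below totals award_amount by administering institute and activity code. The function name is hypothetical and the sample records are made-up values; only the field names come from the output schema described above.

```python
from collections import defaultdict


def funding_by_institute_and_code(records):
    """Sum award_amount per (agency_ic_admin, activity_code) pair."""
    totals = defaultdict(int)
    for rec in records:
        key = (rec.get("agency_ic_admin"), rec.get("activity_code"))
        # Treat a missing or null award amount as zero.
        totals[key] += rec.get("award_amount") or 0
    return dict(totals)


# Made-up sample records for illustration only.
records = [
    {"agency_ic_admin": "NCI", "activity_code": "R01", "award_amount": 512000},
    {"agency_ic_admin": "NCI", "activity_code": "R01", "award_amount": 400000},
    {"agency_ic_admin": "NIAID", "activity_code": "R21", "award_amount": 250000},
    {"agency_ic_admin": "NIAID", "activity_code": "R21", "award_amount": None},
]

totals = funding_by_institute_and_code(records)
for (institute, code), amount in sorted(totals.items()):
    print(f"{institute} {code}: ${amount:,}")
# NCI R01: $912,000
# NIAID R21: $250,000
```

Because each record is one fiscal-year slice, the same grouping with core_project_num as the key would give a multi-year total per project.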

Example output

{
  "project_num": "5R01CA123456-05",
  "core_project_num": "R01CA123456",
  "appl_id": 10987654,
  "fiscal_year": 2024,
  "project_title": "Novel Approaches to Targeted Cancer Therapy",
  "activity_code": "R01",
  "agency_ic_admin": "NCI",
  "award_amount": 512000,
  "direct_cost_amt": 350000,
  "indirect_cost_amt": 162000,
  "contact_pi_name": "DOE, JANE",
  "principal_investigators": [
    "{\"full_name\":\"Jane Doe\",\"profile_id\":12345,\"is_contact_pi\":true,\"title\":\"Prof.\"}"
  ],
  "organization_name": "STANFORD UNIVERSITY",
  "org_state": "CA",
  "is_active": true,
  "arra_funded": false,
  "budget_start": "2024-04-01",
  "budget_end": "2025-03-31",
  "project_start_date": "2020-04-01",
  "project_end_date": "2025-03-31",
  "full_study_section": "Tumor Microenvironment Study Section",
  "agency_ic_fundings": ["2024:NCI:512000"],
  "spending_categories": ["Cancer"],
  "linked_publication_pmids": [],
  "project_detail_url": "https://reporter.nih.gov/project-details/R01CA123456",
  "status": "success"
}

Data source

All data is drawn from the NIH Research Portfolio Online Reporting Tools (RePORTER) — a public database maintained by the National Institutes of Health. No authentication is required. The scraper calls the official v2 REST API and does not require a proxy.