EPA Toxic Release Inventory (TRI) Crawler

Crawl toxic chemical release data from the EPA Toxics Release Inventory via the Envirofacts REST API. Extract facility details, chemical names, release quantities by environmental medium (air, water, land), off-site transfers, geographic coordinates, and carcinogen flags. Filter by state, chemical, reporting year, and facility name.

What does the EPA TRI Crawler do?

The EPA TRI Crawler queries the EPA Envirofacts Data Service API to extract facility-level toxic chemical release data from the Toxics Release Inventory (TRI) program. The TRI tracks releases of over 860 chemicals from ~25,000 reporting facilities across the United States, with data going back to 1987. The crawler joins data from four EPA tables -- TRI_REPORTING_FORM, TRI_FACILITY, TRI_RELEASE_QTY, and TRI_CHEM_INFO -- to produce a single flat record per facility-chemical-year combination with release quantities broken down by air emissions, water discharges, land disposal, underground injection, and off-site transfers.
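The join described above can be pictured with a minimal Python sketch. The row keys (`form_id`, `facility_id`, `chemical_id`, `year`) and the pre-indexed lookup dictionaries are hypothetical names chosen for illustration; the actual EPA column names and the crawler's internals may differ.

```python
from typing import Any

Row = dict[str, Any]

# Release pathways carried into the flat record (names match the output fields).
MEDIA = ("fugitive_air", "stack_air", "water", "underground", "land", "off_site")


def build_record(form: Row, facilities: dict[str, Row],
                 chemicals: dict[str, Row], releases: dict[str, Row]) -> Row:
    """Merge one reporting-form row with its facility, chemical, and release rows.

    All input key names are hypothetical stand-ins for the EPA columns.
    """
    facility = facilities.get(form["facility_id"], {})
    chemical = chemicals.get(form["chemical_id"], {})
    quantities = releases.get(form["form_id"], {})

    record: Row = {
        "trifid": form["facility_id"],
        "facility_name": facility.get("name"),
        "state": facility.get("state"),
        "chemical_name": chemical.get("name"),
        "carcinogen": chemical.get("carcinogen", False),
        "reporting_year": form["year"],
    }
    # Missing media default to 0 so every record has the same shape.
    for medium in MEDIA:
        record[medium] = quantities.get(medium, 0)
    return record


form = {"form_id": "F1", "facility_id": "85003PHNXM2827N", "chemical_id": "C1", "year": 2023}
facilities = {"85003PHNXM2827N": {"name": "PHOENIX METALS CO", "state": "AZ"}}
chemicals = {"C1": {"name": "LEAD", "carcinogen": False}}
releases = {"F1": {"fugitive_air": 120.0, "stack_air": 350.5, "land": 780.0}}

record = build_record(form, facilities, chemicals, releases)
```

Indexing the facility, chemical, and release rows by ID first keeps each merge a constant-time lookup, which matters when one facility appears on many reporting forms.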

EPA TRI Crawler Features

  • Extracts data from 25,000+ reporting facilities across all US states and territories
  • Joins four EPA tables into a single denormalized record per facility-chemical-year
  • Breaks down release quantities by medium -- fugitive air, stack air, surface water, underground injection, land disposal, and off-site transfers
  • Filters by state -- all 50 states plus DC, Puerto Rico, and US Virgin Islands (in-memory filtering since the EPA API silently ignores state filters on the reporting form table)
  • Filters by chemical name -- partial matching (e.g., "lead", "benzene", "mercury")
  • Filters by reporting year -- any year from 1987 to present
  • Filters by facility name -- partial matching to find specific industrial sites
  • Filters for carcinogens only -- optionally restricts results to releases of known carcinogens
  • Includes chemical metadata -- CAS registry number, carcinogen flag, Clean Air Act classification, and chemical category (Metal, Dioxin, PBT, PFAS)
  • Converts coordinates -- DMS (degrees-minutes-seconds) to decimal degrees for mapping
  • No proxy required -- EPA Envirofacts is a public government API with no authentication
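The coordinate conversion listed above follows the standard degrees-minutes-seconds formula. The sketch below is illustrative only: the function name and the `negative` flag (for western or southern hemispheres) are assumptions, and how the crawler actually parses the EPA's stored coordinate format is not shown here.

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float,
                   negative: bool = False) -> float:
    """Convert degrees-minutes-seconds to decimal degrees.

    Set negative=True for western longitudes or southern latitudes.
    """
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("minutes and seconds must be in the range [0, 60)")
    value = abs(degrees) + minutes / 60 + seconds / 3600
    return -value if negative else value


latitude = dms_to_decimal(33, 30, 0)                    # 33 deg 30 min north
longitude = dms_to_decimal(112, 7, 30, negative=True)   # 112 deg 7 min 30 s west
```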

EPA TRI Crawler Output Fields

| Field | Type | Description |
| --- | --- | --- |
| trifid | string | TRI facility ID |
| facility_name | string | Facility name |
| street_address | string | Street address |
| city | string | City |
| county | string | County |
| state | string | State abbreviation |
| zip_code | string | ZIP code |
| latitude | number | Latitude coordinate (decimal degrees) |
| longitude | number | Longitude coordinate (decimal degrees) |
| parent_company | string | Parent company name |
| chemical_name | string | Chemical name |
| cas_number | string | CAS (Chemical Abstracts Service) registry number |
| carcinogen | boolean | Whether the chemical is a known carcinogen |
| clean_air_act_chemical | boolean | Whether the chemical is regulated under the Clean Air Act |
| classification | string | Chemical classification (Dioxin, Metal, PBT, PFAS, etc.) |
| unit_of_measure | string | Unit of measurement for release quantities (Pounds or Grams) |
| total_releases | number | Total releases, on-site and off-site (in reported units) |
| fugitive_air | number | Fugitive air emissions |
| stack_air | number | Stack/point air emissions |
| water | number | Surface water discharges |
| underground | number | Underground injection |
| land | number | On-site land disposal (landfills, surface impoundments, land treatment) |
| off_site | number | Off-site transfers for disposal/release |
| reporting_year | number | Reporting year |
| federal_facility | boolean | Whether the facility is federally owned |

Who Uses EPA TRI Data?

  • Environmental researchers: Analyze pollution trends by chemical, facility, and region to study environmental health impacts and regulatory effectiveness
  • ESG analysts: Evaluate corporate environmental performance by tracking toxic releases from specific companies and their subsidiaries
  • Community organizers: Identify major pollution sources near residential areas and track whether releases are increasing or decreasing over time
  • Journalists: Investigate industrial pollution patterns, compare facility-level release data, and hold polluters accountable with public records
  • Government agencies: Monitor compliance with environmental regulations and identify facilities that may need additional oversight
  • Public health researchers: Correlate chemical release data with health outcomes using geographic coordinates and carcinogen flags

How to Use the EPA TRI Crawler

Input Parameters

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| state | No | All | US state abbreviation (e.g., CA, TX, OH) |
| chemical | No | All | Chemical name, partial matching (e.g., "lead", "benzene") |
| reportingYear | No | (none) | Reporting year (e.g., 2023) |
| facilityName | No | All | Facility name, partial matching |
| carcinogenOnly | No | false | If true, only return releases of known carcinogens |
| maxItems | No | 100 | Maximum release records to return. Set to 0 for unlimited (requires at least one filter) |

Note: When maxItems is set to 0 (unlimited), at least one search filter is required. This prevents accidentally crawling the entire TRI database (more than 4 million records).
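The unlimited-run rule can be expressed as a small input check. This is a sketch under assumptions, not the crawler's actual validation: the function name is hypothetical, and it assumes that carcinogenOnly counts as a search filter alongside the other four parameters.

```python
# Parameters treated as search filters (assumption: carcinogenOnly counts too).
FILTER_KEYS = ("state", "chemical", "reportingYear", "facilityName", "carcinogenOnly")


def validate_input(config: dict) -> dict:
    """Apply the documented maxItems default and reject unfiltered unlimited runs."""
    max_items = config.get("maxItems", 100)
    if not isinstance(max_items, int) or max_items < 0:
        raise ValueError("maxItems must be a non-negative integer")

    has_filter = any(config.get(key) for key in FILTER_KEYS)
    if max_items == 0 and not has_filter:
        raise ValueError("maxItems=0 (unlimited) requires at least one search filter")

    return {**config, "maxItems": max_items}


defaults_applied = validate_input({"state": "AZ"})
unlimited_ok = validate_input({"chemical": "mercury", "maxItems": 0})
```

Calling `validate_input({"maxItems": 0})` raises `ValueError`, since no filter narrows the unlimited run.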

Example Configurations

Get mercury releases in Arizona for 2023:

{
    "state": "AZ",
    "chemical": "mercury",
    "reportingYear": 2023,
    "maxItems": 100
}

Get all carcinogen releases in Ohio:

{
    "state": "OH",
    "carcinogenOnly": true,
    "reportingYear": 2023,
    "maxItems": 100
}

Search for a specific facility:

{
    "facilityName": "Freeport",
    "reportingYear": 2023,
    "maxItems": 50
}
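To make the filter semantics concrete, the sketch below applies a configuration like the ones above to a single output record. It assumes partial matching means a case-insensitive substring test and that all supplied filters must match; the `matches` helper is hypothetical and does not reflect whether the crawler filters through the API or in memory.

```python
def matches(record: dict, config: dict) -> bool:
    """Return True if an output record satisfies every filter set in config."""
    state = config.get("state")
    if state and (record.get("state") or "").upper() != state.upper():
        return False

    chemical = config.get("chemical")
    if chemical and chemical.lower() not in (record.get("chemical_name") or "").lower():
        return False

    facility = config.get("facilityName")
    if facility and facility.lower() not in (record.get("facility_name") or "").lower():
        return False

    year = config.get("reportingYear")
    if year is not None and record.get("reporting_year") != year:
        return False

    if config.get("carcinogenOnly") and not record.get("carcinogen"):
        return False
    return True


record = {
    "facility_name": "PHOENIX METALS CO",
    "state": "AZ",
    "chemical_name": "LEAD",
    "carcinogen": False,
    "reporting_year": 2023,
}

selected = matches(record, {"state": "AZ", "chemical": "lead", "reportingYear": 2023})
```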

Sample Output

{
    "trifid": "85003PHNXM2827N",
    "facility_name": "PHOENIX METALS CO",
    "street_address": "2827 N 29TH AVE",
    "city": "PHOENIX",
    "county": "MARICOPA",
    "state": "AZ",
    "zip_code": "85009",
    "latitude": 33.467842,
    "longitude": -112.119637,
    "parent_company": "PHOENIX METALS COMPANY LLC",
    "chemical_name": "LEAD",
    "cas_number": "7439921",
    "carcinogen": false,
    "clean_air_act_chemical": true,
    "classification": "Metal",
    "unit_of_measure": "Pounds",
    "total_releases": 1250.5,
    "fugitive_air": 120.0,
    "stack_air": 350.5,
    "water": 0,
    "underground": 0,
    "land": 780.0,
    "off_site": 0,
    "reporting_year": 2023,
    "federal_facility": false
}

EPA TRI Data FAQ

How do I get toxic release data from the EPA? Use the EPA TRI Crawler to query the Envirofacts REST API. Set your filters (state, chemical, year, facility name) and the crawler returns structured JSON records with facility details, release quantities broken down by environmental medium, chemical metadata, and geographic coordinates.

What is the Toxics Release Inventory (TRI)? The TRI is an EPA program that requires certain industrial facilities to report annually on the quantities of toxic chemicals they release to the environment or transfer off-site. It covers over 860 chemicals and ~25,000 facilities, with data available from 1987 to present.

How are release quantities broken down? Each record includes quantities for six release pathways: fugitive air emissions, stack/point air emissions, surface water discharges, underground injection, on-site land disposal, and off-site transfers. The total_releases field is the sum of all pathways. Quantities are reported in Pounds or Grams depending on the chemical.
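As a quick consistency check, the six pathway fields can be summed and compared with total_releases, and gram-denominated records can be converted before comparing across chemicals. The helper names below are hypothetical; the conversion uses the exact definition of 453.59237 grams per pound.

```python
PATHWAYS = ("fugitive_air", "stack_air", "water", "underground", "land", "off_site")
GRAMS_PER_POUND = 453.59237


def pathway_sum(record: dict) -> float:
    """Add up the six release pathways, treating missing values as 0."""
    return sum(record.get(key) or 0 for key in PATHWAYS)


def to_pounds(quantity: float, unit: str) -> float:
    """Normalize a quantity reported in Pounds or Grams to pounds."""
    if unit == "Pounds":
        return quantity
    if unit == "Grams":
        return quantity / GRAMS_PER_POUND
    raise ValueError(f"unexpected unit_of_measure: {unit!r}")


sample = {
    "unit_of_measure": "Pounds",
    "total_releases": 1250.5,
    "fugitive_air": 120.0,
    "stack_air": 350.5,
    "water": 0,
    "underground": 0,
    "land": 780.0,
    "off_site": 0,
}

is_consistent = abs(pathway_sum(sample) - sample["total_releases"]) < 1e-6
total_in_pounds = to_pounds(sample["total_releases"], sample["unit_of_measure"])
```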

Does this crawler require proxies or authentication? No. The EPA Envirofacts API is a public government service. No authentication, API keys, or proxies are required.

How long does a typical run take? The EPA API responds in 3-7 seconds per request, and each record requires multiple API calls to join facility, release, and transfer data. Expect approximately 30 seconds for 10 records and 5 minutes for 100 records.

Why does state filtering use in-memory matching? The EPA Envirofacts API silently ignores the STATE_ABBR filter on the TRI_REPORTING_FORM table (the column exists on TRI_FACILITY, not TRI_REPORTING_FORM). The crawler works around this by pre-loading matching facility IDs from the TRI_FACILITY table and filtering reporting form records in memory.
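The workaround amounts to building a set of facility IDs for the requested state and keeping only reporting-form rows whose facility is in that set. In this sketch, STATE_ABBR comes from the description above, while the `TRI_FACILITY_ID` key and the function name are assumptions made for illustration.

```python
def filter_forms_by_state(forms: list[dict], facilities: list[dict], state: str) -> list[dict]:
    """Keep reporting-form rows whose facility is located in the given state."""
    wanted_ids = {
        facility["TRI_FACILITY_ID"]
        for facility in facilities
        if (facility.get("STATE_ABBR") or "").upper() == state.upper()
    }
    # Set membership keeps the per-row check constant-time.
    return [form for form in forms if form.get("TRI_FACILITY_ID") in wanted_ids]


facilities = [
    {"TRI_FACILITY_ID": "85003PHNXM2827N", "STATE_ABBR": "AZ"},
    {"TRI_FACILITY_ID": "EXAMPLE-OH-0001", "STATE_ABBR": "OH"},  # made-up ID
]
forms = [
    {"TRI_FACILITY_ID": "85003PHNXM2827N", "REPORTING_YEAR": 2023},
    {"TRI_FACILITY_ID": "EXAMPLE-OH-0001", "REPORTING_YEAR": 2023},
]

arizona_forms = filter_forms_by_state(forms, facilities, "az")
```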

What chemicals are tracked? The TRI tracks over 860 chemicals including metals (lead, mercury, chromium), volatile organic compounds (benzene, toluene), dioxins, persistent bioaccumulative toxins (PBTs), and PFAS compounds. Each chemical record includes the CAS registry number, carcinogen flag, and classification.

Need a Custom Feature?

If you need additional data fields, custom aggregations, or integration with your environmental monitoring pipeline, file an issue or get in touch.