OrbTop

CMS Hospital Price Transparency Scraper

Parse hospital standard-charge files mandated by the CMS Hospital Price Transparency rule. Returns structured rows by billing code (CPT, HCPCS, MS-DRG, APR-DRG, RC, NDC) and payer/plan, with gross charges, cash-discount prices, negotiated dollar amounts, negotiated percentages, and de-identified min/max rates. Also fetches hospital identity records (CCN, NPI, address) for ~6,000 US hospitals from the CMS enrollment API.


CMS Hospital Price Transparency Scraper Features

  • Parses CMS JSON MRF schemas v1, v2, and v3. Auto-detects the version, so you don't have to.
  • Returns negotiated rates by payer and plan — dollar, percentage, and algorithm-based methodologies
  • Includes gross charge, cash-discount price, estimated allowed amount, and de-identified min/max in every charge row
  • Three modes — mrf_parse for a single file, hospital_list for CMS enrollment, discover_and_parse for the combined pipeline
  • Filter by state, CCN, billing-code system, specific code, or payer name substring
  • No proxy required — CMS APIs and most hospital MRFs are publicly reachable

Who Uses Hospital Price Transparency Data?

  • Price-comparison startups — Build patient-facing tools on actual negotiated rates instead of survey data
  • Self-funded employers — Pull payer-specific rates by code to benchmark before contract renegotiation
  • Benefits consultants — Analyze plan-by-plan variation across hospital networks for client RFPs
  • Healthcare journalists — Investigate pricing disparities. The data is public — somebody just has to parse it.
  • Healthcare data vendors — Join MRF charge data to the CMS enrollment registry to enrich existing hospital intelligence products

How the CMS Hospital Price Transparency Scraper Works

  1. Pick a mode: mrf_parse takes a single MRF URL, hospital_list walks the CMS enrollment dataset, and discover_and_parse runs both.
  2. Schema detection — The parser inspects the JSON shape and routes to the v3 nested handler (standard_charges[] with payers_information[]) or the v1/v2 flat handler. No configuration.
  3. Filtering — Code-type, billing-code, payer-substring, and state filters apply during parsing, so only matching rows reach output. Cheaper than post-filtering.
  4. Export — One record per payer/plan/billing-code/setting combination. That's the granularity the CMS rule requires, and it's what makes the data joinable downstream.
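The schema-detection and parse-time filtering steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actor's actual code: the helper names `is_nested` and `iter_rows` are hypothetical, and apart from `standard_charges` and `payers_information` the input key names (`gross_charge`, `standard_charge_dollar`, and so on) are assumed for the example.

```python
from typing import Any, Iterator, Optional


def is_nested(item: dict[str, Any]) -> bool:
    """True when the item has standard_charges[] entries holding payers_information[]."""
    charges = item.get("standard_charges")
    return isinstance(charges, list) and any(
        isinstance(c, dict) and "payers_information" in c for c in charges
    )


def iter_rows(
    item: dict[str, Any], payer_filter: Optional[str] = None
) -> Iterator[dict[str, Any]]:
    """Yield one row per payer/plan/setting, dropping non-matching payers while walking."""
    needle = payer_filter.lower() if payer_filter else None

    if not is_nested(item):
        # Flat layout (assumed): the item already is a single row.
        if needle is None or needle in str(item.get("payer_name", "")).lower():
            yield dict(item)
        return

    for charge in item["standard_charges"]:
        if not isinstance(charge, dict):
            continue
        for payer in charge.get("payers_information", []):
            # Filter here, before a row is built, so unmatched payers never reach output.
            if needle and needle not in str(payer.get("payer_name", "")).lower():
                continue
            yield {
                "description": item.get("description"),
                "setting": charge.get("setting"),
                "standard_charge_gross": charge.get("gross_charge"),
                "payer_name": payer.get("payer_name"),
                "plan_name": payer.get("plan_name"),
                "standard_charge_negotiated_dollar": payer.get("standard_charge_dollar"),
            }
```

Because `iter_rows` is a generator, rows for filtered-out payers are never materialized, which is the reason filtering during the walk tends to be cheaper than filtering a finished dataset.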

Input

Parse a single MRF

{
  "mode": "mrf_parse",
  "mrfUrl": "https://example-hospital.com/standard-charges.json",
  "billingCodeType": "CPT",
  "maxItems": 1000,
  "sp_intended_usage": "Rate comparison for employer plan negotiation",
  "sp_improvement_suggestions": "None"
}

Pull hospital enrollment records

{
  "mode": "hospital_list",
  "stateFilter": "TX",
  "maxItems": 500,
  "sp_intended_usage": "Build a TX hospital registry",
  "sp_improvement_suggestions": "None"
}

Filter to one billing code across payers

{
  "mode": "mrf_parse",
  "mrfUrl": "https://example-hospital.com/standard-charges.json",
  "billingCodeType": "CPT",
  "billingCode": "70551",
  "maxItems": 0
}

| Field              | Type    | Default     | Description                                                      |
|--------------------|---------|-------------|------------------------------------------------------------------|
| mode               | string  | mrf_parse   | mrf_parse, hospital_list, or discover_and_parse.                 |
| mrfUrl             | string  | CMS example | MRF JSON URL. Required for mrf_parse and discover_and_parse.     |
| stateFilter        | string  |             | Two-letter state code. Filters hospital_list results.            |
| hospitalCcn        | string  |             | CMS Certification Number for a single hospital.                  |
| billingCodeType    | string  |             | CPT, HCPCS, MS-DRG, APR-DRG, RC, NDC, or Internal. Empty = all.  |
| billingCode        | string  |             | Specific code to filter (e.g. 70551).                            |
| payerFilter        | string  |             | Case-insensitive payer-name substring.                           |
| maxItems           | integer | 15          | Cap on records. 0 = unlimited.                                   |
| proxyConfiguration | object  | none        | Proxy settings. Off by default.                                  |
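The constraints in the input fields above can be checked before a run is submitted. The sketch below is a hypothetical client-side helper (`build_input` is not part of the actor) that applies only the rules stated for the fields: the three modes, the mrfUrl requirement, the allowed code systems, the two-letter state code, and a non-negative maxItems.

```python
from typing import Any

VALID_MODES = {"mrf_parse", "hospital_list", "discover_and_parse"}
MODES_NEEDING_URL = {"mrf_parse", "discover_and_parse"}
VALID_CODE_TYPES = {"CPT", "HCPCS", "MS-DRG", "APR-DRG", "RC", "NDC", "Internal"}


def build_input(mode: str = "mrf_parse", **fields: Any) -> dict[str, Any]:
    """Validate the documented fields and return a dict ready to serialize as JSON."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if mode in MODES_NEEDING_URL and not fields.get("mrfUrl"):
        raise ValueError(f"mrfUrl is required for mode {mode!r}")

    code_type = fields.get("billingCodeType", "")
    if code_type and code_type not in VALID_CODE_TYPES:
        raise ValueError(f"unsupported billingCodeType: {code_type!r}")

    state = fields.get("stateFilter", "")
    if state and not (len(state) == 2 and state.isalpha()):
        raise ValueError("stateFilter must be a two-letter state code")

    # 15 mirrors the documented default; 0 means unlimited.
    max_items = fields.get("maxItems", 15)
    if not isinstance(max_items, int) or max_items < 0:
        raise ValueError("maxItems must be a non-negative integer")

    return {"mode": mode, **fields, "maxItems": max_items}
```

For example, `build_input("hospital_list", stateFilter="TX", maxItems=500)` produces the same mode, state, and cap as the enrollment example shown earlier, while `build_input("mrf_parse")` raises because no mrfUrl was supplied.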

CMS Hospital Price Transparency Scraper Output Fields

The output schema is shared between modes. Charge fields are populated for charge_row records and null/empty for hospital_info records. Use record_type to distinguish.

MRF parse — one row per payer/plan/code

{
  "hospital_name": "EXAMPLE REGIONAL MEDICAL CENTER",
  "mrf_url": "https://example-hospital.com/standard-charges.json",
  "mrf_version": "3.0.0",
  "mrf_last_updated": "2025-01-15",
  "billing_code": "70551",
  "billing_code_type": "CPT",
  "description": "MRI Brain without contrast",
  "payer_name": "Aetna",
  "plan_name": "Aetna PPO Standard",
  "setting": "outpatient",
  "methodology": "fee schedule",
  "standard_charge_gross": 4200,
  "standard_charge_discounted_cash": 1890,
  "standard_charge_negotiated_dollar": 1240,
  "standard_charge_negotiated_percentage": null,
  "standard_charge_negotiated_algorithm": "",
  "standard_charge_min": 980,
  "standard_charge_max": 1600,
  "estimated_amount": 1240,
  "additional_payer_notes": "",
  "record_type": "charge_row"
}

Hospital list — one row per hospital

{
  "hospital_name": "MEMORIAL HOSPITAL OF LARAMIE COUNTY",
  "hospital_ccn": "530012",
  "hospital_npi": "1568469223",
  "hospital_address": "214 E 23RD ST",
  "hospital_city": "CHEYENNE",
  "hospital_state": "WY",
  "hospital_zip": "82001",
  "record_type": "hospital_info"
}

| Field                                 | Type   | Description                                                            |
|---------------------------------------|--------|------------------------------------------------------------------------|
| hospital_name                         | string | Hospital name                                                          |
| hospital_ccn                          | string | CMS Certification Number                                               |
| hospital_npi                          | string | National Provider Identifier                                           |
| hospital_address                      | string | Street address                                                         |
| hospital_city                         | string | City                                                                   |
| hospital_state                        | string | Two-letter state code                                                  |
| hospital_zip                          | string | ZIP code                                                               |
| mrf_url                               | string | Source machine-readable file URL                                       |
| mrf_version                           | string | CMS schema version (e.g. 3.0.0)                                        |
| mrf_last_updated                      | string | Date the MRF was last updated (from file header)                       |
| billing_code                          | string | Billing code (e.g. 70551)                                              |
| billing_code_type                     | string | Code system: CPT, HCPCS, MS-DRG, APR-DRG, RC, NDC, Internal            |
| description                           | string | Service or item description                                            |
| payer_name                            | string | Payer name                                                             |
| plan_name                             | string | Plan name                                                              |
| setting                               | string | inpatient, outpatient, or both                                         |
| methodology                           | string | Rate methodology (fee schedule, percent of total billed charges, etc.) |
| standard_charge_gross                 | number | Gross / chargemaster price                                             |
| standard_charge_discounted_cash       | number | Cash / self-pay discount price                                         |
| standard_charge_negotiated_dollar     | number | Negotiated dollar amount                                               |
| standard_charge_negotiated_percentage | number | Negotiated percentage of gross                                         |
| standard_charge_negotiated_algorithm  | string | Algorithm description when rate is formula-based                       |
| standard_charge_min                   | number | De-identified minimum negotiated charge                                |
| standard_charge_max                   | number | De-identified maximum negotiated charge                                |
| estimated_amount                      | number | Estimated allowed amount                                               |
| additional_payer_notes                | string | Additional payer or plan notes                                         |
| record_type                           | string | charge_row for MRF data, hospital_info for enrollment data             |
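Since both record kinds share one schema, downstream code usually starts by splitting on record_type and then reducing the three negotiated-rate fields to a single comparable number. A minimal sketch follows. The helper names are hypothetical, and it assumes standard_charge_negotiated_percentage is on a 0-100 scale applied to standard_charge_gross, which should be checked against the methodology field before relying on it.

```python
from typing import Any, Optional


def split_records(
    records: list[dict[str, Any]],
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Partition output into (charge rows, hospital rows) using record_type."""
    charges = [r for r in records if r.get("record_type") == "charge_row"]
    hospitals = [r for r in records if r.get("record_type") == "hospital_info"]
    return charges, hospitals


def effective_negotiated(row: dict[str, Any]) -> Optional[float]:
    """Return a dollar figure for the negotiated rate, or None if it cannot be derived."""
    dollar = row.get("standard_charge_negotiated_dollar")
    if dollar not in (None, ""):
        return float(dollar)

    pct = row.get("standard_charge_negotiated_percentage")
    gross = row.get("standard_charge_gross")
    if pct not in (None, "") and gross not in (None, ""):
        # Assumption: percentage is 0-100 and applies to the gross charge.
        return float(gross) * float(pct) / 100.0

    # Algorithm-based rates are free text and are not evaluated here.
    return None
```

Rows whose rate is described only in standard_charge_negotiated_algorithm return None, so they can be reported separately rather than silently treated as zero.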

FAQ

How do I scrape hospital prices from CMS machine-readable files?

CMS Hospital Price Transparency Scraper handles it in mrf_parse mode. Supply the MRF URL in mrfUrl, set optional filters, and run. The parser auto-detects v1/v2/v3 schema and outputs one row per payer/plan/billing-code combination.

Where do I find hospital MRF URLs?

CMS does not publish a single index of every hospital's file. Most hospitals link to their MRF from a "price transparency" or "standard charges" page on their own site. The CMS Hospital Price Transparency enforcement dataset tracks compliance but doesn't reliably include direct file links. Aggregators like DoltHub's hospital-price-transparency project and Turquoise Health publish compiled URL lists.

What billing code types does this scraper support?

CMS Hospital Price Transparency Scraper supports every code system in the CMS standard: CPT, HCPCS, MS-DRG, APR-DRG, Revenue Code (RC), NDC, and Internal. Filter via billingCodeType, or leave it blank to get everything.

How much does this actor cost to run?

CMS Hospital Price Transparency Scraper uses pay-per-event pricing on the default_2603_basic profile at a 1.0x coefficient. No proxy fees. Parsing a typical hospital MRF (a few thousand rows) costs cents in platform fees.

Does this actor need proxies?

CMS Hospital Price Transparency Scraper runs proxy-free by default. CMS data APIs and most hospital MRF hosts accept public traffic without rate-limiting. The proxyConfiguration field is exposed in case a specific hospital's host turns out to be sensitive, though most aren't.

Can I filter to a single hospital?

CMS Hospital Price Transparency Scraper accepts hospitalCcn to filter hospital_list mode to one CMS Certification Number. For MRF parsing, point mrfUrl directly at that hospital's published file.


Need More Features?

Need CSV MRF support, streaming parse for very large files, or auto-discovery of MRF URLs from a hospital domain? Open an issue or get in touch.

Why Use CMS Hospital Price Transparency Scraper?

  • Handles every CMS schema — v1, v2, and v3 are all parsed by the same actor with no config. Most one-off scripts pick one and break on the rest.
  • Joinable output — Charge rows and hospital enrollment records share the same schema and overlap on hospital_ccn, so you can join them in SQL without an intermediate ETL step.
  • Filter at parse time — Code-type, billing-code, and payer filters apply while the file is being read, which keeps datasets small when you only care about one procedure.
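The joinable-output point above can be illustrated without a database. The sketch below performs an inner join of charge rows to hospital rows in plain Python. `join_on_ccn` is a hypothetical helper, and it assumes charge rows carry a populated hospital_ccn, as the bullet describes; rows without a matching CCN are dropped.

```python
from typing import Any


def join_on_ccn(records: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Attach city and state from hospital_info rows to charge rows sharing a CCN."""
    # Index enrollment rows by CCN for constant-time lookup.
    hospitals = {
        r["hospital_ccn"]: r
        for r in records
        if r.get("record_type") == "hospital_info" and r.get("hospital_ccn")
    }

    joined = []
    for row in records:
        if row.get("record_type") != "charge_row":
            continue
        info = hospitals.get(row.get("hospital_ccn"))
        if info is None:
            continue  # inner join: skip charge rows with no enrollment match
        joined.append(
            {
                **row,
                "hospital_city": info.get("hospital_city"),
                "hospital_state": info.get("hospital_state"),
            }
        )
    return joined
```

The same result in SQL would be a self-join of the dataset on hospital_ccn, with record_type used to separate the two sides.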