# Amtrak Rail Network Scraper — Routes, Stations, Alerts
Scrapes the entire US passenger rail network from Amtrak. Returns every published route — Acela, Northeast Regional, Coast Starlight, California Zephyr, Silver Meteor and ~50 others — plus all 1,000+ stations with codes, state, timezone, amenity flags, and current service alerts.
## Amtrak Rail Network Scraper Features
- Returns 53 published Amtrak routes with their full city paths.
- Returns 1,026 stations with the 3-letter codes, city, state, timezone, and amenity flags everyone keeps re-typing into spreadsheets.
- Reports per-station accessibility — wheelchair-accessible building, wheelchair lift, staffed booth, QuikTrak kiosk.
- Pulls current service alerts for each route on a given travel date.
- Filter by route name or 4-letter route code (Acela, COAS, Northeast Regional). Leave the filter empty for the whole network.
- Pure JSON-API scraping. No headless browser, no captcha plumbing, no dramatics.
## Who Uses Amtrak Rail Network Data?
- Travel-tech startups — power route lookup, fare-comparison front-ends, and itinerary builders without paying for a GTFS feed you'll have to clean anyway.
- Transportation researchers — study network coverage, accessibility distribution, and service patterns across the US passenger-rail system.
- Booking & comparison sites — keep an authoritative reference of every Amtrak station and route, refreshed on a schedule.
- Accessibility analytics — audit which stations actually have wheelchair lifts versus only an "accessible building" flag, which is not the same thing.
- Internal tools — populate dropdowns and validate user-typed origin / destination cities against the real Amtrak station catalog.
## How Amtrak Rail Network Scraper Works
- Pulls the routes catalog from Amtrak's published `routes-list.json`.
- Pulls the station master and amenity tables in parallel and joins them on station code.
- Optionally fans out a small alert lookup per filtered route on the requested travel date.
- Emits one record per route and one record per station, capped by `maxItems`.
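The station-master/amenity join in the steps above can be sketched as follows. This is illustrative Python with toy payloads standing in for the real endpoint responses (the field names here are simplified assumptions, not Amtrak's actual schema):

```python
# Toy stand-ins for the station master and amenity tables; the real
# scraper fetches these from Amtrak's content endpoints in parallel.
stations = [
    {"code": "NYP", "name": "New York, NY", "state": "NY"},
    {"code": "WAS", "name": "Washington, DC", "state": "DC"},
]
amenities = {
    "NYP": {"quiktrak": True, "staffed": True},
    "WAS": {"quiktrak": True, "staffed": True},
}

def join_stations(stations, amenities):
    """Join the station master with the amenity table on station code.

    Stations missing from the amenity table fall back to all-False flags,
    matching the output schema's "unused side is empty / false" convention.
    """
    joined = []
    for st in stations:
        flags = amenities.get(st["code"], {})
        joined.append({
            **st,
            "station_quiktrak": flags.get("quiktrak", False),
            "station_staffed": flags.get("staffed", False),
        })
    return joined
```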
The scraper hits public Amtrak content endpoints with a Chrome TLS fingerprint and a US residential proxy. Booking-flow journey searches sit behind a separate Akamai Bot Manager layer and are intentionally out of scope.
## Input
```json
{
  "routeFilter": [],
  "includeStations": true,
  "alertDate": "",
  "maxItems": 15,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"],
    "countryCode": "US"
  }
}
```
| Field | Type | Default | Description |
|---|---|---|---|
| `routeFilter` | string[] | `[]` | Route names or 4-letter codes to include (case-insensitive substring match). Empty = every route. |
| `includeStations` | boolean | `true` | When true, emits one record per station in addition to per-route records. |
| `alertDate` | string | today | Date for the route service-alert query. Accepts MM/DD/YYYY or YYYY-MM-DD. |
| `maxItems` | integer | `15` | Cap on total records (routes + stations combined). Set higher to crawl the full ~1,080-record network. |
| `proxyConfiguration` | object | Apify residential / US | Standard Apify proxy block. Residential US is recommended. |
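Since `alertDate` accepts two formats, a consumer normalizing it before passing it in could use a helper along these lines (a sketch; the actor's own parsing code is not published):

```python
from datetime import datetime

def parse_alert_date(value: str) -> datetime:
    """Accept MM/DD/YYYY or YYYY-MM-DD, the two formats alertDate allows."""
    for fmt in ("%m/%d/%Y", "%Y-%m-%d"):
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue  # try the next accepted format
    raise ValueError(f"Unrecognized date: {value!r}")
```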
### Run a single route

```json
{
  "routeFilter": ["Acela"],
  "includeStations": false,
  "maxItems": 5
}
```
### Run the whole network

```json
{
  "routeFilter": [],
  "includeStations": true,
  "maxItems": 1100
}
```
## Amtrak Rail Network Scraper Output Fields

Records carry a `record_type` of either `route` or `station`. Route fields are populated for routes; station fields are populated for stations; the unused side is empty / false.
### Route record example

```json
{
  "record_type": "route",
  "route_code": "COAS",
  "route_name": "Coast Starlight",
  "route_url": "https://www.amtrak.com/train-routes/coast-starlight-train.html",
  "cities_served": [
    "Seattle",
    "Tacoma",
    "Portland",
    "Sacramento",
    "San Francisco area",
    "Los Angeles"
  ],
  "service_alerts": [],
  "station_code": "",
  "station_name": "",
  "station_city": "",
  "station_state": "",
  "station_timezone": "",
  "station_quiktrak": false,
  "station_staffed": false,
  "station_accessible": false,
  "station_wheelchair_lift": false,
  "source_url": "https://www.amtrak.com/services/routes-list.json",
  "scraped_at": "2026-04-27T14:40:11.718Z"
}
```
### Station record example

```json
{
  "record_type": "station",
  "route_code": "",
  "route_name": "",
  "route_url": "",
  "cities_served": [],
  "service_alerts": [],
  "station_code": "NYP",
  "station_name": "New York, NY",
  "station_city": "New York",
  "station_state": "NY",
  "station_timezone": "E",
  "station_quiktrak": true,
  "station_staffed": true,
  "station_accessible": true,
  "station_wheelchair_lift": true,
  "source_url": "https://www.amtrak.com/services/data.stations.json",
  "scraped_at": "2026-04-27T14:40:11.722Z"
}
```
| Field | Type | Description |
|---|---|---|
| `record_type` | string | `route` or `station`. |
| `route_code` | string | 4-letter Amtrak route code (e.g. ACEL, COAS). Empty for station records. |
| `route_name` | string | Human-readable route name (e.g. Acela, Coast Starlight). |
| `route_url` | string | Public Amtrak route page URL. |
| `cities_served` | string[] | Ordered list of cities along the route, parsed from Amtrak's published cityServed string. |
| `service_alerts` | string[] | One-line summaries of any service alerts published for this route on `alertDate`. |
| `station_code` | string | 3-letter Amtrak station code (e.g. NYP, WAS, CHI). |
| `station_name` | string | Full station display name. |
| `station_city` | string | City the station is in. |
| `station_state` | string | Two-letter US/CA state code. |
| `station_timezone` | string | `E`, `C`, `M`, `P`, `A`, or `H`. |
| `station_quiktrak` | boolean | Self-service QuikTrak ticketing kiosk available. |
| `station_staffed` | boolean | Station has Amtrak staff on site. |
| `station_accessible` | boolean | Station building is wheelchair-accessible. |
| `station_wheelchair_lift` | boolean | Wheelchair lift available for boarding. |
| `source_url` | string | Source endpoint the record was derived from. |
| `scraped_at` | string | ISO 8601 timestamp when the record was scraped. |
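For the dropdown-and-validation use case, station records index naturally by code and city. A minimal sketch over records shaped like the output schema above (the helper name is illustrative, not part of the actor):

```python
def build_station_index(records):
    """Index station records by station_code and by lowercased city name."""
    by_code, by_city = {}, {}
    for r in records:
        if r["record_type"] != "station":
            continue  # skip route records in a mixed dataset
        by_code[r["station_code"]] = r
        by_city.setdefault(r["station_city"].lower(), []).append(r)
    return by_code, by_city
```

With the index in hand, validating a user-typed origin city is a dictionary lookup rather than a scan of the full catalog.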
## FAQ
### How do I scrape Amtrak data?
Amtrak Rail Network Scraper hits Amtrak's public content endpoints, joins the route catalog with the station master and amenity tables, and emits one flat record per route or station. Configure the input — usually leaving everything at defaults works — and run it.
### Does Amtrak Rail Network Scraper return live fares or seat availability?
No. Amtrak's booking flow sits behind Akamai Bot Manager with sensor cookies and is intentionally out of scope. The actor returns the published network catalog (routes, stations, amenities) and route-level service alerts, not per-search journey results.
### Does Amtrak Rail Network Scraper need proxies?
Yes — the static endpoints use Akamai TLS fingerprinting that rejects standard HTTP clients. The actor ships with the Apify residential US proxy preset and a Chrome TLS fingerprint, which is enough.
### Can I filter to a single route?
Yes. Pass `routeFilter: ["Acela"]` (or `["COAS"]`, `["Northeast Regional"]`, etc.) to narrow the scrape. Filter terms are case-insensitive substring matches against both the route code and the route name. Leaving the filter empty pulls the entire 53-route catalog.
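The matching rule described here behaves roughly like the sketch below (illustrative Python, not the actor's actual code):

```python
def route_matches(route_code, route_name, filters):
    """Case-insensitive substring match on route code or name.

    An empty filter list matches every route, mirroring the
    "empty = every route" default of routeFilter.
    """
    if not filters:
        return True
    haystacks = (route_code.lower(), route_name.lower())
    return any(f.lower() in h for f in filters for h in haystacks)
```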
### How many records does a full run produce?
1,079 in total — 53 routes plus 1,026 stations. Set `maxItems` accordingly; the default is 15 to keep test runs fast.
### How fresh is the data?
Routes and stations are published catalogs that change a handful of times a year. Service alerts are queried in real time for the date you supply.
## Need More Features?
Need bus / Greyhound / Megabus coverage, a journey-search mode, or a different cut of the data? File an issue or get in touch.
## Why Use Amtrak Rail Network Scraper?
- Cheap — pay-per-event pricing, ~$0.0015 per record at the default coefficient. The whole 1,080-record network costs less than a Northeast Regional sandwich.
- Clean output — flat JSON, consistent field names, station codes uppercased, dates ISO-formatted. You spend less time normalizing and more time using the data.
- Stable — no headless browser, no captcha solver, no scraping-the-DOM heuristics. Just published Amtrak content endpoints over HTTP.