OrbTop

EcoBici CDMX Trip Data Scraper

Extract historical trip records and live station data from EcoBici, Mexico City's Lyft-operated public bike-share network, which has 480+ stations and records over 10 million trips per year.

What this actor does

  • Downloads and stream-parses historical monthly trip CSVs going back to 2010 (97 MB+ per monthly file)
  • Fetches live GBFS station data: station names, coordinates, dock capacity, and real-time service status
  • Enriches trip records with origin and destination station details (lat/lng, capacity, operational status)
  • Respects memory constraints by stream-parsing each CSV row by row, so a file is never held in memory in full
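
As an illustration of the streaming approach described above, the following is a minimal Python sketch, not the actor's actual code. The function name `iter_trip_rows` and the sample column names are hypothetical; the real EcoBici CSV headers may differ. It reads rows one at a time from any iterable of text lines and stops early once a cap is reached, so the whole file is never held in memory.

```python
import csv
import io
from typing import Iterable, Iterator, Optional


def iter_trip_rows(lines: Iterable[str], max_items: Optional[int] = None) -> Iterator[dict]:
    """Yield one dict per CSV row without materializing the whole file."""
    reader = csv.DictReader(lines)  # consumes the iterable lazily, line by line
    for count, row in enumerate(reader):
        if max_items is not None and count >= max_items:
            break  # stop reading as soon as the cap is reached
        yield row


# Hypothetical sample data standing in for a monthly trip file.
sample = io.StringIO("bike_id,origin,destination\n101,1,2\n102,3,4\n103,5,6\n")
rows = list(iter_trip_rows(sample, max_items=2))
print(rows)
```

In a real run the lines would come from an HTTP response that is read incrementally rather than from an in-memory string.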

Input

Field     Type     Default       Description
mode      string   both          What to fetch: trips (historical CSVs), stations (live GBFS), or both
yearFrom  integer  current year  Earliest year to include in historical trip data
yearTo    integer  current year  Latest year to include in historical trip data
maxItems  integer  unlimited     Cap on total output records
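
To make the defaults concrete, here is a hedged Python sketch of how an input object with these four fields could be normalized and expanded into a list of months to fetch. The helper names `normalize_input` and `months_in_range` and the validation rules are assumptions for illustration; only the field names and defaults come from the table above. The sketch also lists every month of each requested year, including months that may not be published yet.

```python
from datetime import date
from typing import List, Optional

VALID_MODES = {"trips", "stations", "both"}


def normalize_input(raw: dict, today: Optional[date] = None) -> dict:
    """Apply the documented defaults and reject obviously inconsistent values."""
    current_year = (today or date.today()).year
    mode = raw.get("mode", "both")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    year_from = int(raw.get("yearFrom", current_year))
    year_to = int(raw.get("yearTo", current_year))
    if year_from > year_to:
        raise ValueError("yearFrom must not be later than yearTo")
    max_items = raw.get("maxItems")  # None stands for "unlimited"
    return {"mode": mode, "yearFrom": year_from, "yearTo": year_to, "maxItems": max_items}


def months_in_range(year_from: int, year_to: int) -> List[str]:
    """List every YYYY-MM key between the two years, inclusive."""
    return [f"{y:04d}-{m:02d}" for y in range(year_from, year_to + 1) for m in range(1, 13)]


config = normalize_input({"yearFrom": 2022, "yearTo": 2023}, today=date(2024, 5, 1))
months = months_in_range(config["yearFrom"], config["yearTo"])
print(config["mode"], len(months), months[0], months[-1])
```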

Output

Each record contains:

Field                               Description
trip_id                             Row-level identifier (YYYY-MM-<index>)
bike_id                             Bike identifier from the source CSV
user_gender                         Rider gender (M/F)
user_age                            Rider age in years
origin_station_id                   Numeric station ID (origin)
destination_station_id              Numeric station ID (destination)
origin_station_name                 Station name from GBFS
destination_station_name            Station name from GBFS
origin_lat / origin_lng             Station coordinates
destination_lat / destination_lng   Station coordinates
depart_at                           Departure datetime (ISO 8601)
arrive_at                           Arrival datetime (ISO 8601)
duration_seconds                    Trip duration in seconds
trip_type                           docked (standard docked bike)
month                               Source month (YYYY-MM)
year                                Source year
source_csv_url                      URL of the CSV file
station_capacity                    Total dock slots at origin station
station_status                      IN_SERVICE or NOT_IN_SERVICE
source_url                          EcoBici open data page URL
scraped_at                          ISO 8601 scrape timestamp
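
Several of these fields are derived rather than copied from the source file. The Python sketch below shows one way `trip_id`, `month`, `year`, and `duration_seconds` could be computed from a row's position and its two timestamps. The function `build_trip_fields` is hypothetical, and details such as whether the index is zero-based or padded are assumptions that the listing does not specify.

```python
from datetime import datetime


def build_trip_fields(year: int, month: int, index: int, depart_at: str, arrive_at: str) -> dict:
    """Derive the identifier, period, and duration fields for one trip row."""
    depart = datetime.fromisoformat(depart_at)
    arrive = datetime.fromisoformat(arrive_at)
    month_key = f"{year:04d}-{month:02d}"
    return {
        "trip_id": f"{month_key}-{index}",  # YYYY-MM-<index>; index format is assumed
        "depart_at": depart.isoformat(),
        "arrive_at": arrive.isoformat(),
        "duration_seconds": int((arrive - depart).total_seconds()),
        "month": month_key,
        "year": year,
    }


example = build_trip_fields(2023, 3, 1042, "2023-03-05T08:15:00", "2023-03-05T08:32:30")
print(example["trip_id"], example["duration_seconds"])
```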

Use cases

  • Urban mobility research and transportation policy analysis
  • Last-mile connectivity studies (Roma, Condesa, Polanco, Centro Histórico)
  • Station-level demand forecasting and rebalancing optimization
  • Academic research on bike-share systems in Latin America
  • Housing density and real estate proximity signals

Notes

  • Monthly CSV files for recent years are approximately 97 MB each. Runs that request multiple years will take proportionally longer.
  • Station enrichment relies on the live GBFS feed at the time of each run. Decommissioned stations may be missing from the GBFS feed but still appear in historical CSVs, so trips that reference them may lack station names, coordinates, and capacity.
  • The actor stream-parses each CSV instead of loading it whole, so memory usage stays well below 1 GB even for large historical downloads.
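
The enrichment caveat above can be sketched in Python as a lookup that tolerates unknown station IDs. This is an illustrative assumption about behavior, not the actor's confirmed logic: the function `enrich_trip` is hypothetical, and filling gaps with `None` is one possible choice. The station dictionaries use the `name`, `lat`, `lon`, and `capacity` keys defined for GBFS station information.

```python
from typing import Dict, Optional


def enrich_trip(trip: dict, stations: Dict[str, dict]) -> dict:
    """Attach station details to a trip, leaving gaps as None when a station is unknown."""
    enriched = dict(trip)
    for side in ("origin", "destination"):
        station: Optional[dict] = stations.get(str(trip[f"{side}_station_id"]))
        enriched[f"{side}_station_name"] = station["name"] if station else None
        enriched[f"{side}_lat"] = station["lat"] if station else None
        enriched[f"{side}_lng"] = station["lon"] if station else None
        if side == "origin":
            # The output table describes station_capacity as the origin station's dock count.
            enriched["station_capacity"] = station["capacity"] if station else None
    return enriched


# Hypothetical data: station 1 is live, station 999 is no longer in the feed.
stations_by_id = {"1": {"name": "Example Station", "lat": 19.43, "lon": -99.17, "capacity": 27}}
trip = {"origin_station_id": 1, "destination_station_id": 999}
enriched = enrich_trip(trip, stations_by_id)
print(enriched["origin_station_name"], enriched["destination_station_name"])
```

Keeping the trip and leaving the missing fields empty preserves the historical record, at the cost of downstream code having to handle gaps.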