EcoBici CDMX Trip Data Scraper
Extract historical trip records and live station data from EcoBici — Mexico City's Lyft-operated public bike-share network with 480+ stations and over 10 million trips per year.
What this actor does
- Downloads and stream-parses historical monthly trip CSVs going back to 2010 (97 MB+ per monthly file)
- Fetches live GBFS station data — station names, coordinates, dock capacity, and real-time service status
- Enriches trip records with origin and destination station details (lat/lng, capacity, operational status)
- Respects memory constraints via streaming CSV parsing — files are never buffered in full
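The streaming behaviour described above can be illustrated with a minimal Python sketch. It is not the actor's actual code: the function name `stream_trip_rows` and the sample column names are assumptions made for illustration, and a small in-memory buffer stands in for a downloaded monthly file.

```python
import csv
import io
from itertools import islice
from typing import Iterable, Iterator, Optional


def stream_trip_rows(lines: Iterable[str], max_items: Optional[int] = None) -> Iterator[dict]:
    """Yield one dict per CSV row without loading the whole file into memory."""
    reader = csv.DictReader(lines)  # pulls lines lazily from the iterable
    # islice with stop=None yields every row; otherwise it stops after max_items rows.
    yield from islice(reader, max_items)


# Tiny stand-in for a monthly CSV; these column names are assumed, not confirmed.
sample = io.StringIO(
    "bike_id,origin_station_id,destination_station_id\n"
    "1001,12,45\n"
    "1002,45,12\n"
    "1003,7,7\n"
)
rows = list(stream_trip_rows(sample, max_items=2))
```

Because the reader consumes any iterable of text lines, the same function would work on a line iterator over an HTTP response, so only the current row needs to be held in memory at a time.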
Input
| Field | Type | Default | Description |
|---|---|---|---|
| mode | string | both | What to fetch: trips (historical CSVs), stations (live GBFS), or both |
| yearFrom | integer | current year | Earliest year to include in historical trip data |
| yearTo | integer | current year | Latest year to include in historical trip data |
| maxItems | integer | unlimited | Cap on total output records |
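As a rough illustration of how the defaults in the input table might be applied, here is a minimal Python sketch. The helper `normalize_input` is hypothetical, and rejecting a `yearFrom` later than `yearTo` is an assumption rather than documented behaviour.

```python
from datetime import date
from typing import Optional

VALID_MODES = {"trips", "stations", "both"}


def normalize_input(raw: dict, current_year: Optional[int] = None) -> dict:
    """Fill in the documented defaults and reject obviously invalid input."""
    year = current_year if current_year is not None else date.today().year
    mode = raw.get("mode", "both")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}, got {mode!r}")
    year_from = int(raw.get("yearFrom", year))
    year_to = int(raw.get("yearTo", year))
    if year_from > year_to:  # assumed validation rule, not stated in the source
        raise ValueError("yearFrom must not be later than yearTo")
    max_items = raw.get("maxItems")  # None stands for "unlimited"
    return {
        "mode": mode,
        "yearFrom": year_from,
        "yearTo": year_to,
        "maxItems": None if max_items is None else int(max_items),
    }


cfg = normalize_input({"mode": "trips", "yearFrom": 2022}, current_year=2024)
```

Here `cfg` requests trips only, from 2022 through the year passed as `current_year`, with no record cap.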
Output
Each record contains:
| Field | Description |
|---|---|
| trip_id | Row-level identifier (YYYY-MM-<index>) |
| bike_id | Bike identifier from the source CSV |
| user_gender | Rider gender (M/F) |
| user_age | Rider age in years |
| origin_station_id | Numeric station ID (origin) |
| destination_station_id | Numeric station ID (destination) |
| origin_station_name | Station name from GBFS |
| destination_station_name | Station name from GBFS |
| origin_lat / origin_lng | Station coordinates |
| destination_lat / destination_lng | Station coordinates |
| depart_at | Departure datetime (ISO 8601) |
| arrive_at | Arrival datetime (ISO 8601) |
| duration_seconds | Trip duration in seconds |
| trip_type | docked (standard docked bike) |
| month | Source month (YYYY-MM) |
| year | Source year |
| source_csv_url | URL of the CSV file |
| station_capacity | Total dock slots at origin station |
| station_status | IN_SERVICE or NOT_IN_SERVICE |
| source_url | EcoBici open data page URL |
| scraped_at | ISO 8601 scrape timestamp |
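The derived fields in the output table (trip_id, month, year, duration_seconds) could plausibly be computed as in the sketch below. The helper `derive_trip_fields` is hypothetical, and details such as whether the index is zero-padded or where it starts are not stated in the source, so treat the format here as an assumption.

```python
from datetime import datetime


def derive_trip_fields(source_month: str, index: int, depart_at: str, arrive_at: str) -> dict:
    """Build the derived columns for one trip row.

    source_month is the YYYY-MM of the monthly CSV the row came from;
    index is the row's position within that file (numbering scheme assumed).
    """
    depart = datetime.fromisoformat(depart_at)
    arrive = datetime.fromisoformat(arrive_at)
    return {
        "trip_id": f"{source_month}-{index}",
        "month": source_month,
        "year": int(source_month[:4]),
        "duration_seconds": int((arrive - depart).total_seconds()),
    }


fields = derive_trip_fields("2024-03", 17, "2024-03-05T08:15:00", "2024-03-05T08:32:30")
# fields["duration_seconds"] is 1050 (17 minutes 30 seconds)
```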
Use cases
- Urban mobility research and transportation policy analysis
- Last-mile connectivity studies (Roma, Condesa, Polanco, Centro Histórico)
- Station-level demand forecasting and rebalancing optimization
- Academic research on bike-share systems in Latin America
- Housing density and real estate proximity signals
Notes
- Monthly CSV files for recent years are approximately 97 MB each. Runs that request multiple years will take proportionally longer.
- Station enrichment relies on the live GBFS feed at the time of each run. Decommissioned stations may not appear in the GBFS but will still appear in historical CSVs.
- The actor uses streaming CSV parsing with no in-memory buffering — memory usage stays well below 1 GB even for large historical downloads.
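The note on decommissioned stations implies that enrichment has to tolerate station IDs that are missing from the live feed. A minimal Python sketch of that lookup follows. The function `enrich_trip` is hypothetical, the lookup is assumed to be keyed by GBFS `station_id` strings, and leaving missing values as `None` is an assumption about how gaps are represented; `name`, `lat`, `lon`, and `capacity` are the field names used by the GBFS station information feed.

```python
def enrich_trip(trip: dict, stations: dict) -> dict:
    """Attach station details to a trip, leaving None where a station is unknown."""
    enriched = dict(trip)
    for side in ("origin", "destination"):
        # GBFS station IDs are strings, while the CSV IDs are numeric.
        station = stations.get(str(trip.get(f"{side}_station_id")), {})
        enriched[f"{side}_station_name"] = station.get("name")
        enriched[f"{side}_lat"] = station.get("lat")
        enriched[f"{side}_lng"] = station.get("lon")
    origin = stations.get(str(trip.get("origin_station_id")), {})
    enriched["station_capacity"] = origin.get("capacity")
    return enriched


# Illustrative data only: one live station, one ID that is absent from the feed.
stations = {"12": {"name": "Station A", "lat": 19.43, "lon": -99.16, "capacity": 27}}
trip = {"origin_station_id": 12, "destination_station_id": 999}
result = enrich_trip(trip, stations)
```

In this example the origin fields are filled from the lookup, while the destination name and coordinates stay `None` because station 999 is not in the feed.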