Cricsheet Ball-by-Ball Data Scraper (IPL, T20, Tests & More)
Download ball-by-ball delivery data from Cricsheet — the canonical open cricket dataset. Returns one row per delivery across IPL, T20 World Cup, Tests, The Hundred, WBBL, and a dozen other competitions, with computed strike-rate and economy-rate columns included.
Cricsheet Scraper Features
- Downloads competition archives directly from Cricsheet — no scraping HTML, no pagination, no headaches
- Returns one row per delivery with full match context: teams, venue, season, innings, over, ball number
- Covers 16 competitions: IPL, Men's T20I, Women's T20I, Tests, ODIs, The Hundred, BBL, WBBL, CPL, PSL, SA20, LPL, and more
- Computes strike rate and economy rate per batter/bowler at each ball — running totals, not career averages
- Captures wicket type and dismissed player on dismissal balls
- Captures extra type (wides, no-balls, byes, leg-byes) on extra deliveries
- maxItems cap for fast sampling: 5 balls for a quick test, unlimited for the full archive
- No browser, no proxy, no anti-bot. Cricsheet is open data.
What Can You Do With Cricsheet Ball-by-Ball Data?
- Fantasy cricket analysts — Build ball-by-ball performance models for IPL or T20 World Cup — the dataset has every delivery from every season Cricsheet covers
- Data scientists & ML engineers — Train batting/bowling prediction models on millions of deliveries with clean, structured features
- Cricket statisticians — Compute career averages, phase-by-phase economy rates, powerplay strike rates, partnership data — it's all in the delivery rows
- Sports betting researchers — Build in-match probability models using historical ball sequence data
- Journalists & broadcasters — Pull match context and scoring patterns for retrospective coverage. The Hundred 2026 and T20 World Cup archives are live.
- Educators — Use real match data for sports analytics courses without wrangling messy CSV exports
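As one illustration of the statistician use case above, here is a minimal sketch that computes powerplay strike rates from delivery rows in plain Python. The function name `powerplay_strike_rates`, the six-over powerplay, the choice not to count wides as balls faced, and the sample rows are all assumptions made for the example; only the field names come from the output schema.

```python
from collections import defaultdict

def powerplay_strike_rates(rows, powerplay_overs=6):
    """Return {batter: strike rate} for deliveries in the first `powerplay_overs` overs."""
    runs = defaultdict(int)
    balls = defaultdict(int)
    for row in rows:
        if row["over"] >= powerplay_overs:  # the over field is 0-indexed
            continue
        runs[row["batter"]] += row["runs_off_bat"]
        # Assumption: a wide does not count as a ball faced by the batter.
        if row["extra_type"] != "wides":
            balls[row["batter"]] += 1
    return {b: round(runs[b] / balls[b] * 100, 2) for b in balls if balls[b]}

# Made-up rows for illustration only, not real match data.
sample_rows = [
    {"batter": "Batter A", "over": 0, "runs_off_bat": 4, "extra_type": ""},
    {"batter": "Batter A", "over": 0, "runs_off_bat": 0, "extra_type": "wides"},
    {"batter": "Batter A", "over": 1, "runs_off_bat": 2, "extra_type": ""},
    {"batter": "Batter B", "over": 5, "runs_off_bat": 1, "extra_type": ""},
    {"batter": "Batter B", "over": 6, "runs_off_bat": 6, "extra_type": ""},  # outside the powerplay
]
print(powerplay_strike_rates(sample_rows))
```

The sketch pools every row it is given, so filter by match_id and innings first if you want per-innings figures rather than an aggregate.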
How the Cricsheet Scraper Works
- You select one or more competition keys (e.g. ["ipl", "t20s"]) and an optional maxItems cap
- The scraper downloads each competition's zip archive directly from Cricsheet (typically 5–20 MB per competition)
- Each zip contains one JSON file per match. The scraper reads every match in the archive
- For each delivery in each innings, it emits one output row — with match metadata, delivery fields, and running cumulative stats for that batter and bowler in that innings
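The steps above can be sketched roughly as follows. This is not the actor's source code: the function names, the download URL pattern, and the assumed Cricsheet JSON layout (info, innings, overs, deliveries, with runs, extras, and wickets per delivery) are assumptions to check against Cricsheet's own format documentation. The sketch only covers flattening; running stats are left out.

```python
import io
import json
import zipfile
from urllib.request import urlopen

def download_archive(filename="ipl_male_json.zip"):
    """Fetch one competition archive. The URL pattern is an assumption."""
    with urlopen(f"https://cricsheet.org/downloads/{filename}") as resp:
        return resp.read()

def iter_delivery_rows(zip_bytes, competition, source):
    """Yield one dict per delivery from a zip holding one JSON file per match."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in sorted(archive.namelist()):
            if not name.endswith(".json"):
                continue  # skip README or other non-match files
            match = json.loads(archive.read(name))
            info = match.get("info", {})
            teams = info.get("teams", ["", ""])
            base = {
                "match_id": name.rsplit("/", 1)[-1][: -len(".json")],
                "competition": competition,
                "season": str(info.get("season", "")),
                "venue": info.get("venue", ""),
                "city": info.get("city", ""),
                "date": (info.get("dates") or [""])[0],  # first date only
                "team_1": teams[0],
                "team_2": teams[1],
                "winner": info.get("outcome", {}).get("winner", ""),
                "source": source,
            }
            for innings_no, innings in enumerate(match.get("innings", []), start=1):
                for over in innings.get("overs", []):
                    deliveries = over.get("deliveries", [])
                    for ball_no, d in enumerate(deliveries, start=1):
                        extras = d.get("extras", {})
                        wickets = d.get("wickets", [])
                        yield {
                            **base,
                            "innings": innings_no,
                            "over": over["over"],
                            "ball": ball_no,  # 1-indexed, counts extras too
                            "batter": d["batter"],
                            "bowler": d["bowler"],
                            "non_striker": d["non_striker"],
                            "runs_off_bat": d["runs"]["batter"],
                            "extras": d["runs"]["extras"],
                            "extra_type": next(iter(extras), ""),
                            "wicket_kind": wickets[0]["kind"] if wickets else "",
                            "dismissed_player": wickets[0]["player_out"] if wickets else "",
                        }
```

A typical call would be `rows = iter_delivery_rows(download_archive(), "IPL", "ipl_male_json.zip")`. Because the function is a generator, a maxItems-style cap can be applied with `itertools.islice(rows, 1000)` without reading the whole archive.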
Input
{
"competitions": ["ipl", "t20s"],
"maxItems": 1000
}
| Field | Type | Default | Description |
|---|---|---|---|
| competitions | array | (required) | List of competition keys to download. See below for all valid keys. |
| maxItems | integer | 0 (unlimited) | Max delivery rows to return. 0 means no cap: you get the full archive. |
Valid competition keys:
| Key | Competition |
|---|---|
| ipl | Indian Premier League (Men) |
| t20s | T20 Internationals (Men) |
| it20s | Informal T20 Internationals (Men) |
| tests | Test Matches (Men) |
| odis | One Day Internationals (Men) |
| the_hundred | The Hundred (Men) |
| wbbl | Women's Big Bash League |
| bbl | Big Bash League (Men) |
| cpl | Caribbean Premier League |
| psl | Pakistan Super League |
| sa20 | SA20 (Men) |
| wt20s | T20 Internationals (Women) |
| wtests | Test Matches (Women) |
| wodis | ODIs (Women) |
| lpl | Lanka Premier League |
| sma | Super Smash (New Zealand) |
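If you build the input programmatically, a small pre-flight check can catch typos in competition keys before a run starts. The sketch below is a hypothetical client-side helper (`validate_input` is not part of the actor); the key set is copied from the table above, and the error messages are made up for the example.

```python
# Keys copied from the table of valid competition keys.
VALID_KEYS = {
    "ipl", "t20s", "it20s", "tests", "odis", "the_hundred", "wbbl", "bbl",
    "cpl", "psl", "sa20", "wt20s", "wtests", "wodis", "lpl", "sma",
}

def validate_input(actor_input):
    """Return a normalized copy of the input, or raise ValueError."""
    competitions = actor_input.get("competitions")
    if not isinstance(competitions, list) or not competitions:
        raise ValueError("competitions must be a non-empty list of keys")
    unknown = [k for k in competitions
               if not isinstance(k, str) or k not in VALID_KEYS]
    if unknown:
        raise ValueError(f"unknown competition keys: {unknown}")
    max_items = actor_input.get("maxItems", 0)  # 0 means no cap
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_items, bool) or not isinstance(max_items, int) or max_items < 0:
        raise ValueError("maxItems must be a non-negative integer")
    return {"competitions": competitions, "maxItems": max_items}

print(validate_input({"competitions": ["ipl", "t20s"], "maxItems": 1000}))
```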
Cricsheet Scraper Output Fields
{
"match_id": "1082591",
"competition": "IPL",
"season": "2017",
"venue": "Rajiv Gandhi International Stadium, Uppal",
"city": "Hyderabad",
"date": "2017-04-05",
"team_1": "Sunrisers Hyderabad",
"team_2": "Royal Challengers Bangalore",
"winner": "Sunrisers Hyderabad",
"innings": 1,
"over": 0,
"ball": 1,
"batter": "DA Warner",
"bowler": "TS Mills",
"non_striker": "S Dhawan",
"runs_off_bat": 0,
"extras": 0,
"extra_type": "",
"wicket_kind": "",
"dismissed_player": "",
"batter_runs_cumulative": 0,
"bowler_balls_bowled": 1,
"strike_rate": 0,
"economy_rate": 0,
"source": "ipl_male_json.zip"
}
| Field | Type | Description |
|---|---|---|
| match_id | string | Cricsheet match ID (numeric, matches the source JSON filename) |
| competition | string | Competition name, e.g. "IPL", "T20I (Men)", "Test (Men)" |
| season | string | Season year or label from the Cricsheet metadata |
| venue | string | Venue name |
| city | string | City |
| date | string | Match date (YYYY-MM-DD); first date for multi-day matches |
| team_1 | string | First team name |
| team_2 | string | Second team name |
| winner | string | Winning team name, empty if no result or tied |
| innings | integer | Innings number (1 or 2 in limited-overs matches; Tests can have up to 4) |
| over | integer | Over number (0-indexed) |
| ball | integer | Ball number within the over (1-indexed, includes extras) |
| batter | string | Batter name |
| bowler | string | Bowler name |
| non_striker | string | Non-striker name |
| runs_off_bat | integer | Runs scored off the bat on this delivery |
| extras | integer | Extra runs on this delivery (wides, no-balls, etc.) |
| extra_type | string | Type of extra: "wides", "no_balls", "byes", "legbyes", "penalty", or empty |
| wicket_kind | string | Dismissal type (caught, bowled, lbw, run out, etc.), empty if no wicket |
| dismissed_player | string | Name of dismissed player, empty if no wicket |
| batter_runs_cumulative | integer | Batter's total runs in this innings up to and including this ball |
| bowler_balls_bowled | integer | Legal balls bowled by this bowler in this innings up to and including this ball |
| strike_rate | number | Batter strike rate at this point (runs / balls × 100) |
| economy_rate | number | Bowler economy rate at this point (runs / legal balls × 6) |
| source | string | Source zip filename, e.g. "ipl_male_json.zip" |
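To make the four running-stat fields concrete, here is a minimal sketch of how they could be recomputed from the raw delivery fields of a single innings. The actor's exact conventions are not documented here, so the rules below are assumptions following common scoring practice: wides are not balls faced, wides and no-balls are not legal balls, and byes, leg-byes, and penalty runs are not charged to the bowler. The function name `add_running_stats` is hypothetical.

```python
from collections import defaultdict

def add_running_stats(rows):
    """Annotate the rows of ONE innings (in delivery order) with running stats."""
    batter_runs = defaultdict(int)
    batter_balls = defaultdict(int)
    bowler_runs = defaultdict(int)
    bowler_balls = defaultdict(int)
    out = []
    for row in rows:
        batter, bowler, kind = row["batter"], row["bowler"], row["extra_type"]
        batter_runs[batter] += row["runs_off_bat"]
        if kind != "wides":  # assumption: wides are not balls faced
            batter_balls[batter] += 1
        if kind not in ("wides", "no_balls"):  # only legal deliveries count
            bowler_balls[bowler] += 1
        # Assumption: the bowler concedes runs off the bat plus wides and
        # no-balls, but not byes, leg-byes, or penalty runs.
        bowler_runs[bowler] += row["runs_off_bat"]
        if kind in ("wides", "no_balls"):
            bowler_runs[bowler] += row["extras"]
        faced, legal = batter_balls[batter], bowler_balls[bowler]
        out.append({
            **row,
            "batter_runs_cumulative": batter_runs[batter],
            "bowler_balls_bowled": legal,
            "strike_rate": round(batter_runs[batter] / faced * 100, 2) if faced else 0,
            "economy_rate": round(bowler_runs[bowler] / legal * 6, 2) if legal else 0,
        })
    return out
```

For example, a batter on 6 runs from 2 balls faced has a strike rate of 6 / 2 × 100 = 300, and a bowler who has conceded 7 runs from 2 legal balls has an economy rate of 7 / 2 × 6 = 21.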
FAQ
How do I scrape IPL ball-by-ball data?
The Cricsheet Scraper downloads the full IPL archive with competitions: ["ipl"]. The IPL archive covers all seasons Cricsheet has ball-by-ball records for, roughly 2008 onwards. The current archive holds more than 1,200 matches.
How much does the Cricsheet Scraper cost to run?
The scraper is priced per delivery record. At the default pay-per-event (PPE) coefficient, a full IPL archive run (more than 700,000 deliveries) costs a few cents. Cricsheet is open data, so there are no proxy or residential IP costs.
Does the Cricsheet Scraper need a proxy or browser?
No. Cricsheet.org is a fully open, publicly accessible dataset with no bot protection. The scraper uses direct HTTP downloads with no browser required.
What is the strike_rate field?
It's the batter's running strike rate in this innings at the point of this delivery — not a career figure. Computed as (cumulative runs off bat) / (balls faced) × 100. It updates on every legal delivery.
Can I get the full T20 World Cup dataset?
The t20s competition key covers all men's T20 Internationals in the Cricsheet archive, which includes T20 World Cup matches. Use competitions: ["t20s"] with no maxItems cap to get the full archive.
Why does the Cricsheet Scraper have more records than competitors?
The Cricsheet Scraper pulls data from Cricsheet directly — a dedicated ball-by-ball cricket data project that has been maintained since 2012. The existing scrapers on Apify have 2–4 users combined; none compute derived statistics inline.
Need More Features?
Open a feature request on the actor page — common requests include filtered exports by date range, competition-specific schemas, or batched competition sweeps.
Why Use the Cricsheet Scraper?
- Derived stats included — Strike rate and economy rate per batter/bowler are computed inline, not left as an exercise for the analyst
- Multi-competition in one run — Pass multiple keys and get a unified, normalized dataset across all selected competitions
- Open data, minimal cost — Cricsheet is a public archive; the only cost is the per-record PPE charge, which is low
- No proxy, no browser, no fragility — Direct zip downloads from Cricsheet's CDN. Nothing to break when a UI changes.