OrbTop

Cricsheet Ball-by-Ball Data Scraper (IPL, T20, Tests & More)

Download ball-by-ball delivery data from Cricsheet — the canonical open cricket dataset. Returns one row per delivery across IPL, T20 World Cup, Tests, The Hundred, WBBL, and a dozen other competitions, with computed strike-rate and economy-rate columns included.


Cricsheet Scraper Features

  • Downloads competition archives directly from Cricsheet — no scraping HTML, no pagination, no headaches
  • Returns one row per delivery with full match context: teams, venue, season, innings, over, ball number
  • Covers 16 competitions: IPL, Men's T20I, Women's T20I, Tests, ODIs, The Hundred, BBL, WBBL, CPL, PSL, SA20, LPL, and more
  • Computes strike rate and economy rate per batter/bowler at each ball — running totals, not career averages
  • Captures wicket type and dismissed player on dismissal balls
  • Captures extra type (wides, no-balls, byes, leg-byes) on extra deliveries
  • maxItems cap for fast sampling — 5 balls for a quick test, unlimited for the full archive
  • No browser, no proxy, no anti-bot. Cricsheet is open data.

What Can You Do With Cricsheet Ball-by-Ball Data?

  • Fantasy cricket analysts — Build ball-by-ball performance models for IPL or T20 World Cup — the dataset has every delivery from every season Cricsheet covers
  • Data scientists & ML engineers — Train batting/bowling prediction models on millions of deliveries with clean, structured features
  • Cricket statisticians — Compute career averages, phase-by-phase economy rates, powerplay strike rates, partnership data — it's all in the delivery rows
  • Sports betting researchers — Build in-match probability models using historical ball sequence data
  • Journalists & broadcasters — Pull match context and scoring patterns for retrospective coverage. The Hundred 2026 and T20 World Cup archives are live.
  • Educators — Use real match data for sports analytics courses without wrangling messy CSV exports

How the Cricsheet Scraper Works

  1. You select one or more competition keys (e.g. ["ipl", "t20s"]) and an optional maxItems cap
  2. The scraper downloads each competition's zip archive directly from Cricsheet (typically 5–20 MB per competition)
  3. Each zip contains one JSON file per match. The scraper reads every match in the archive
  4. For each delivery in each innings, it emits one output row — with match metadata, delivery fields, and running cumulative stats for that batter and bowler in that innings
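
The steps above can be sketched in Python as follows. This is a minimal illustration, not the scraper's actual code: the function name `iter_delivery_rows` is hypothetical, the per-match layout (`info`, then `innings` containing `overs` containing `deliveries`) is an assumption about the Cricsheet JSON files, and only a subset of the output fields is filled in. Downloading the zip is left out so the sketch works on bytes you already have.

```python
import io
import json
import zipfile
from typing import Iterator


def iter_delivery_rows(zip_bytes: bytes, max_items: int = 0) -> Iterator[dict]:
    """Yield one flat row per delivery from a Cricsheet-style zip archive.

    Assumed layout of each .json member (not confirmed by this page):
    {"info": {...}, "innings": [{"overs": [{"over": n, "deliveries": [...]}]}]}
    max_items == 0 means no cap, mirroring the maxItems input.
    """
    emitted = 0
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in sorted(archive.namelist()):
            if not name.endswith(".json"):
                continue  # skip README or other non-match files
            match = json.loads(archive.read(name))
            info = match.get("info", {})
            # The match ID is taken from the filename, e.g. "1082591.json".
            match_id = name.rsplit("/", 1)[-1][: -len(".json")]
            for innings_no, innings in enumerate(match.get("innings", []), start=1):
                for over in innings.get("overs", []):
                    for ball_no, d in enumerate(over.get("deliveries", []), start=1):
                        yield {
                            "match_id": match_id,
                            "venue": info.get("venue", ""),
                            "innings": innings_no,
                            "over": over["over"],  # 0-indexed
                            "ball": ball_no,       # 1-indexed, extras included
                            "batter": d["batter"],
                            "bowler": d["bowler"],
                            "runs_off_bat": d["runs"]["batter"],
                            "extras": d["runs"]["extras"],
                        }
                        emitted += 1
                        if max_items and emitted >= max_items:
                            return  # cap reached, stop reading the archive
```

Writing it as a generator means a small `max_items` stops early without parsing the rest of the archive, which is the behaviour the maxItems cap describes for quick tests.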

Input

{
  "competitions": ["ipl", "t20s"],
  "maxItems": 1000
}

Field | Type | Default | Description
----- | ---- | ------- | -----------
competitions | array | none (required) | List of competition keys to download. See below for all valid keys.
maxItems | integer | 0 (unlimited) | Max delivery rows to return. 0 means no cap, so you get the full archive.

Valid competition keys:

Key | Competition
--- | -----------
ipl | Indian Premier League (Men)
t20s | T20 Internationals (Men)
it20s | Informal T20 Internationals (Men)
tests | Test Matches (Men)
odis | One Day Internationals (Men)
the_hundred | The Hundred (Men)
wbbl | Women's Big Bash League
bbl | Big Bash League (Men)
cpl | Caribbean Premier League
psl | Pakistan Super League
sa20 | SA20 (Men)
wt20s | T20 Internationals (Women)
wtests | Test Matches (Women)
wodis | ODIs (Women)
lpl | Lanka Premier League
sma | Super Smash (New Zealand)
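
If you build the input programmatically, a small pre-flight check can catch typos in competition keys before a run starts. The sketch below is an illustration only: `validate_input` and `VALID_KEYS` are hypothetical names, the key set is copied from the table above, and the actor's own validation may behave differently.

```python
# Keys copied from the table above; update this set if the actor adds more.
VALID_KEYS = {
    "ipl", "t20s", "it20s", "tests", "odis", "the_hundred", "wbbl", "bbl",
    "cpl", "psl", "sa20", "wt20s", "wtests", "wodis", "lpl", "sma",
}


def validate_input(actor_input: dict) -> dict:
    """Return a normalized copy of the input, or raise ValueError."""
    competitions = actor_input.get("competitions")
    if not isinstance(competitions, list) or not competitions:
        raise ValueError("competitions must be a non-empty list of keys")
    if not all(isinstance(key, str) for key in competitions):
        raise ValueError("every competition key must be a string")
    unknown = sorted(set(competitions) - VALID_KEYS)
    if unknown:
        raise ValueError(f"unknown competition keys: {unknown}")

    # maxItems is optional; 0 means "no cap".
    max_items = actor_input.get("maxItems", 0)
    if isinstance(max_items, bool) or not isinstance(max_items, int) or max_items < 0:
        raise ValueError("maxItems must be a non-negative integer")
    return {"competitions": competitions, "maxItems": max_items}
```

For example, `validate_input({"competitions": ["ipl", "t20s"], "maxItems": 1000})` returns the input unchanged, while a misspelled key such as `"ipll"` raises a `ValueError` naming it.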

Cricsheet Scraper Output Fields

{
  "match_id": "1082591",
  "competition": "IPL",
  "season": "2017",
  "venue": "Rajiv Gandhi International Stadium, Uppal",
  "city": "Hyderabad",
  "date": "2017-04-05",
  "team_1": "Sunrisers Hyderabad",
  "team_2": "Royal Challengers Bangalore",
  "winner": "Sunrisers Hyderabad",
  "innings": 1,
  "over": 0,
  "ball": 1,
  "batter": "DA Warner",
  "bowler": "TS Mills",
  "non_striker": "S Dhawan",
  "runs_off_bat": 0,
  "extras": 0,
  "extra_type": "",
  "wicket_kind": "",
  "dismissed_player": "",
  "batter_runs_cumulative": 0,
  "bowler_balls_bowled": 1,
  "strike_rate": 0,
  "economy_rate": 0,
  "source": "ipl_male_json.zip"
}

Field | Type | Description
----- | ---- | -----------
match_id | string | Cricsheet match ID (numeric, matches the source JSON filename)
competition | string | Competition name, e.g. "IPL", "T20I (Men)", "Test (Men)"
season | string | Season year or label from the Cricsheet metadata
venue | string | Venue name
city | string | City
date | string | Match date (YYYY-MM-DD); first date for multi-day matches
team_1 | string | First team name
team_2 | string | Second team name
winner | string | Winning team name; empty if no result or tied
innings | integer | Innings number (1 or 2 in limited-overs matches; Test matches can have up to 4)
over | integer | Over number (0-indexed)
ball | integer | Ball number within the over (1-indexed, includes extras)
batter | string | Batter name
bowler | string | Bowler name
non_striker | string | Non-striker name
runs_off_bat | integer | Runs scored off the bat on this delivery
extras | integer | Extra runs on this delivery (wides, no-balls, etc.)
extra_type | string | Type of extra: "wides", "no_balls", "byes", "legbyes", "penalty", or empty
wicket_kind | string | Dismissal type (caught, bowled, lbw, run out, etc.); empty if no wicket
dismissed_player | string | Name of dismissed player; empty if no wicket
batter_runs_cumulative | integer | Batter's total runs in this innings up to and including this ball
bowler_balls_bowled | integer | Legal balls bowled by this bowler in this innings up to and including this ball
strike_rate | number | Batter strike rate at this point (runs / balls × 100)
economy_rate | number | Bowler economy rate at this point (runs / legal balls × 6)
source | string | Source zip filename, e.g. "ipl_male_json.zip"
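
As a quick illustration of working with these rows downstream, the sketch below totals runs, dismissals by type, and extra runs by type for any list of output rows. `summarise_rows` is a hypothetical helper, not part of the scraper, and it assumes each row carries at most one `extra_type`, as the string field above suggests.

```python
from collections import Counter


def summarise_rows(rows: list[dict]) -> dict:
    """Aggregate delivery rows into total runs, wickets by kind, extras by type."""
    total_runs = 0
    wickets = Counter()
    extras = Counter()
    for row in rows:
        # Every run on the delivery counts toward the team total.
        total_runs += row["runs_off_bat"] + row["extras"]
        if row["wicket_kind"]:  # empty string means no wicket fell
            wickets[row["wicket_kind"]] += 1
        if row["extra_type"]:   # empty string means no extras
            extras[row["extra_type"]] += row["extras"]
    return {
        "total_runs": total_runs,
        "wickets": dict(wickets),
        "extras": dict(extras),
    }
```

Filter the rows first (for example by `match_id` and `innings`) to get an innings scorecard summary rather than an archive-wide total.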

FAQ

How do I scrape IPL ball-by-ball data?

Run the Cricsheet Scraper with competitions: ["ipl"] to download the full IPL archive. It covers every season for which Cricsheet has ball-by-ball records, from 2008 onwards, and currently holds around 1,200 matches.

How much does the Cricsheet Scraper cost to run?

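The scraper is priced per delivery record. At the default PPE coefficient, a full IPL archive run costs a few cents. With around 1,200 matches and at most about 240 legal deliveries per T20 match, plus extras, the archive is on the order of 300,000 delivery rows. Cricsheet is open data, so there are no proxy or residential IP costs.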

Does the Cricsheet Scraper need a proxy or browser?

No. Cricsheet.org is a fully open, publicly accessible dataset with no bot protection. The scraper uses direct HTTP downloads with no browser required.

What is the strike_rate field?

It's the batter's running strike rate in this innings at the point of this delivery — not a career figure. Computed as (cumulative runs off bat) / (balls faced) × 100. It updates on every legal delivery.
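
The running calculation can be sketched as below, together with the matching economy-rate formula from the output field table (runs / legal balls × 6). This is an illustration under stated assumptions, not the scraper's code: the function names are hypothetical, rounding to two decimals is an assumption, and so is returning 0 before the first counted ball (chosen to match the sample output row).

```python
def strike_rate(runs_off_bat: int, balls_faced: int) -> float:
    """Runs per 100 balls; 0.0 before the first counted ball (assumption)."""
    if balls_faced == 0:
        return 0.0
    return round(runs_off_bat / balls_faced * 100, 2)


def economy_rate(runs_conceded: int, legal_balls: int) -> float:
    """Runs per six legal balls; 0.0 before the first legal ball (assumption)."""
    if legal_balls == 0:
        return 0.0
    return round(runs_conceded / legal_balls * 6, 2)


def running_strike_rates(deliveries: list[tuple[int, bool]]) -> list[float]:
    """Strike rate after each delivery for one batter in one innings.

    Each item is (runs_off_bat, is_legal). Only legal deliveries add to
    balls faced, following the "updates on every legal delivery" wording.
    """
    runs = balls = 0
    rates = []
    for runs_off_bat, is_legal in deliveries:
        runs += runs_off_bat
        if is_legal:
            balls += 1
        rates.append(strike_rate(runs, balls))
    return rates
```

For a batter who faces a dot ball, hits a four, sees a wide go past, and then takes a single, `running_strike_rates([(0, True), (4, True), (0, False), (1, True)])` gives `[0.0, 200.0, 200.0, 166.67]`. The wide leaves the rate unchanged because it is not a legal delivery.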

Can I get the full T20 World Cup dataset?

The t20s competition key covers all men's T20 Internationals in the Cricsheet archive, which includes T20 World Cup matches. Use competitions: ["t20s"] with no maxItems cap to get the full archive.

Why does Cricsheet have more records than competitors?

The Cricsheet Scraper pulls data from Cricsheet directly — a dedicated ball-by-ball cricket data project that has been maintained since 2012. The existing scrapers on Apify have 2–4 users combined; none compute derived statistics inline.


Need More Features?

Open a feature request on the actor page — common requests include filtered exports by date range, competition-specific schemas, or batched competition sweeps.

Why Use the Cricsheet Scraper?

  • Derived stats included — Strike rate and economy rate per batter/bowler are computed inline, not left as an exercise for the analyst
  • Multi-competition in one run — Pass multiple keys and get a unified, normalized dataset across all selected competitions
  • Open data, minimal cost — Cricsheet is a public archive; the only cost is the per-record PPE charge, which is low
  • No proxy, no browser, no fragility — Direct zip downloads from Cricsheet's CDN. Nothing to break when a UI changes.