TMDB Movie & TV Metadata Scraper
Scrape rich metadata for movies and TV shows from The Movie Database (TMDB) — no API key required. Discovers titles from TMDB's public discover/browse pages and extracts full detail records including cast, directors, genres, keywords, ratings, runtime, original language, and production companies.
What you get
Each record in the output dataset contains:
| Field | Description |
|---|---|
| `tmdb_id` | Numeric TMDB ID |
| `title` | Movie or TV show title |
| `media_type` | `movie` or `tv` |
| `tmdb_url` | Canonical TMDB page URL |
| `original_title` | Title in the original language |
| `release_date` | First release or air date (YYYY-MM-DD) |
| `vote_average` | Aggregate rating (0–10 scale) |
| `vote_count` | Number of votes |
| `user_score_percent` | User score percentage (0–100) |
| `overview` | Plot summary or show description |
| `genres` | Comma-separated genre names |
| `runtime_minutes` | Runtime in minutes |
| `original_language` | Original language |
| `production_companies` | Comma-separated production company names |
| `imdb_id` | IMDb ID (e.g. `tt0137523`), when listed on the TMDB page |
| `cast_top` | Comma-separated top-billed cast names |
| `directors` | Comma-separated director names |
| `keywords` | Comma-separated TMDB keyword tags |
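Several fields arrive as comma-separated strings, so downstream code usually wants to split them into lists first. The following is a minimal sketch of that step, not part of the actor itself. The helper name `split_list_fields` is hypothetical, and the sketch assumes a ", " or "," delimiter, so any name that itself contains a comma would be split incorrectly.

```python
# Fields the output table describes as comma-separated strings.
LIST_FIELDS = ("genres", "production_companies", "cast_top", "directors", "keywords")


def split_list_fields(record: dict) -> dict:
    """Return a copy of a record with comma-separated fields turned into lists.

    Hypothetical helper: missing or empty fields become empty lists.
    """
    out = dict(record)
    for field in LIST_FIELDS:
        raw = record.get(field) or ""
        out[field] = [part.strip() for part in raw.split(",") if part.strip()]
    return out


# Illustrative placeholder record, not real scraper output.
sample = {"tmdb_id": 1, "title": "Example Title", "genres": "Drama, Thriller", "keywords": ""}
parsed = split_list_fields(sample)
print(parsed["genres"])    # ['Drama', 'Thriller']
print(parsed["keywords"])  # []
```

Scalar fields such as `tmdb_id` and `title` pass through unchanged.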
Why use this instead of the TMDB API?
Most TMDB scrapers on the Apify Store are thin wrappers around the TMDB REST API — they require you to register for and manage your own API key and stay within TMDB's per-account rate limits. This actor scrapes TMDB's public web pages directly, so:
- No API key registration or management
- No per-account rate limits to worry about
- Both movies and TV shows in one unified output schema
- Includes fields not always easily queryable via API (IMDb cross-ID, keyword tags, top cast)
Inputs
| Input | Type | Default | Description |
|---|---|---|---|
| `maxItems` | integer | 15 | Maximum number of records to return. Set to 0 for no limit. |
| `mediaType` | string | `both` | Which media type to scrape: `movie`, `tv`, or `both`. |
| `startPage` | integer | 1 | Discover page to start from (each page has ~20 titles). |
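To check an input object before starting a run, the documented defaults and allowed values can be encoded in a few lines. This is a sketch of the constraints listed in the table, not the actor's own validation logic. The function name `normalize_input` is hypothetical, and the lower bound of 1 for `startPage` is an assumption.

```python
# Defaults and allowed values as documented in the Inputs table.
DEFAULTS = {"maxItems": 15, "mediaType": "both", "startPage": 1}
VALID_MEDIA_TYPES = {"movie", "tv", "both"}


def normalize_input(actor_input: dict) -> dict:
    """Fill in documented defaults and reject values outside the documented ranges."""
    merged = {**DEFAULTS, **actor_input}
    if merged["mediaType"] not in VALID_MEDIA_TYPES:
        raise ValueError(f"mediaType must be one of {sorted(VALID_MEDIA_TYPES)}")
    if not isinstance(merged["maxItems"], int) or merged["maxItems"] < 0:
        raise ValueError("maxItems must be a non-negative integer (0 means no limit)")
    # Assumption: pages are 1-indexed, matching the default of 1.
    if not isinstance(merged["startPage"], int) or merged["startPage"] < 1:
        raise ValueError("startPage must be an integer >= 1")
    return merged


print(normalize_input({"mediaType": "tv", "maxItems": 100}))
# {'maxItems': 100, 'mediaType': 'tv', 'startPage': 1}
```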
Example use cases
- Media server catalogs: Build or enrich metadata catalogs for Plex, Jellyfin, or Kodi libraries without managing API credentials.
- Recommendation engines: Feed movie/TV metadata into ML pipelines — genres, keywords, cast, and ratings in one schema.
- Cross-referencing: Use `imdb_id` to join TMDB data with IMDb datasets for enriched analytics.
- Market research: Track ratings and popularity trends across the TMDB catalog over time.
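As an illustration of the cross-referencing use case, the sketch below left-joins scraped records against rows from an IMDb dataset using `imdb_id`. It assumes the IMDb rows are already loaded as dictionaries and keyed by a column named `tconst`. Both that column name and the helper `join_on_imdb_id` are assumptions to adapt to your data. The sample rows are placeholders, not real data.

```python
def join_on_imdb_id(tmdb_records, imdb_rows, imdb_key="tconst"):
    """Left-join TMDB records with IMDb rows; unmatched records get imdb=None."""
    index = {row[imdb_key]: row for row in imdb_rows}
    joined = []
    for record in tmdb_records:
        imdb_id = record.get("imdb_id")
        # imdb_id may be missing or empty, so guard before the lookup.
        match = index.get(imdb_id) if imdb_id else None
        joined.append({**record, "imdb": match})
    return joined


# Placeholder rows for illustration only.
tmdb = [
    {"tmdb_id": 1, "title": "Example A", "imdb_id": "tt0000001"},
    {"tmdb_id": 2, "title": "Example B", "imdb_id": None},
]
imdb = [{"tconst": "tt0000001", "numVotes": 123}]

for row in join_on_imdb_id(tmdb, imdb):
    print(row["title"], row["imdb"])
```

A left join keeps records without an `imdb_id` rather than dropping them.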
How it works
- Discover: Crawls paginated TMDB browse pages (`/movie?language=en-US&page=N`, `/tv?...`), 20 titles per page, up to 500 pages per type.
- Detail: For each title, fetches the detail page and extracts:
  - JSON-LD (`schema.org` `Movie`/`TVSeries`): name, description, rating, genres, runtime, release date
  - DOM: user score chart, directors, cast, keywords, original title, language, production companies
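The JSON-LD part of the detail step can be sketched with the Python standard library alone. This is an illustration of the general technique, not the actor's actual code. The class and function names are hypothetical, and the property names (`name`, `description`, `aggregateRating`, `genre`, `datePublished`) are standard schema.org properties that TMDB pages are assumed, not confirmed, to use.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.documents.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed blocks instead of failing the whole page.


def extract_title_metadata(html: str):
    """Return a partial record from the first Movie/TVSeries JSON-LD block, or None."""
    collector = JsonLdCollector()
    collector.feed(html)
    for doc in collector.documents:
        if isinstance(doc, dict) and doc.get("@type") in ("Movie", "TVSeries"):
            rating = doc.get("aggregateRating") or {}
            return {
                "title": doc.get("name"),
                "overview": doc.get("description"),
                "vote_average": rating.get("ratingValue"),
                "vote_count": rating.get("ratingCount"),
                "genres": doc.get("genre"),
                "release_date": doc.get("datePublished"),
            }
    return None
```

Fields that appear only in the DOM, such as cast, directors, and keywords, would need separate selectors, which this sketch does not attempt.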
Notes
- TMDB's discover pages order titles by popularity (most popular first). Use `startPage` to offset into the catalog.
- The `imdb_id` field is populated only when TMDB links to IMDb on the detail page. This is common for well-known titles, but the field may be absent for obscure entries.
- Runtime is in minutes for movies. For TV shows, TMDB typically reports the average episode length.
- The `language=en-US` parameter is appended to all requests to ensure English metadata in the output.
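Because pages are ordered by popularity and hold about 20 titles each, `startPage` maps roughly onto a popularity rank. The sketch below works through that arithmetic for a single media type. It assumes exactly 20 titles per page and the 500-page ceiling described under "How it works". The function name `page_window` is hypothetical, and how `maxItems` is divided between movies and TV when `mediaType` is `both` is not covered.

```python
import math

TITLES_PER_PAGE = 20  # Approximate; the documentation says ~20 per page.
MAX_PAGES = 500       # Documented ceiling per media type.


def page_window(start_page: int, max_items: int):
    """Return (first popularity rank, last page needed) for one media type."""
    first_rank = (start_page - 1) * TITLES_PER_PAGE + 1
    if max_items == 0:  # 0 means "no limit", so crawl to the last page.
        last_page = MAX_PAGES
    else:
        pages_needed = math.ceil(max_items / TITLES_PER_PAGE)
        last_page = min(MAX_PAGES, start_page + pages_needed - 1)
    return first_rank, last_page


print(page_window(1, 15))   # (1, 1)
print(page_window(3, 45))   # (41, 5)
print(page_window(10, 0))   # (181, 500)
```

For example, `startPage` 3 with `maxItems` 45 starts near rank 41 and reads pages 3 through 5.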