TMDB Movie & TV Metadata Scraper

Scrape rich metadata for movies and TV shows from The Movie Database (TMDB) — no API key required. Discovers titles from TMDB's public discover/browse pages and extracts full detail records including cast, directors, genres, keywords, ratings, runtime, original language, and production companies.

What you get

Each record in the output dataset contains:

Field	Description
`tmdb_id`	Numeric TMDB ID
`title`	Movie or TV show title
`media_type`	`movie` or `tv`
`tmdb_url`	Canonical TMDB page URL
`original_title`	Title in the original language
`release_date`	First release or air date (YYYY-MM-DD)
`vote_average`	Aggregate rating (0–10 scale)
`vote_count`	Number of votes
`user_score_percent`	User score percentage (0–100)
`overview`	Plot summary or show description
`genres`	Comma-separated genre names
`runtime_minutes`	Runtime in minutes
`original_language`	Original language
`production_companies`	Comma-separated production company names
`imdb_id`	IMDb ID (e.g. `tt0137523`) — when listed on the TMDB page
`cast_top`	Comma-separated top-billed cast names
`directors`	Comma-separated director names
`keywords`	Comma-separated TMDB keyword tags
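
Several fields above arrive as comma-separated strings, so downstream code usually wants them as lists. The following is a minimal sketch of that conversion. The helper name `split_list_fields` and the sample record are illustrative assumptions, not part of the actor or its real output.

```python
from typing import Any

# Fields the table above describes as comma-separated strings.
LIST_FIELDS = ("genres", "production_companies", "cast_top", "directors", "keywords")


def split_list_fields(record: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the record with comma-separated fields turned into lists."""
    parsed = dict(record)
    for field in LIST_FIELDS:
        # Missing or empty fields become an empty list.
        raw = record.get(field) or ""
        parsed[field] = [part.strip() for part in raw.split(",") if part.strip()]
    return parsed


# Hypothetical record; the values are illustrative, not real scraper output.
sample = {
    "tmdb_id": 550,
    "title": "Fight Club",
    "media_type": "movie",
    "genres": "Drama, Thriller",
    "directors": "David Fincher",
    "keywords": "",
}

parsed = split_list_fields(sample)
print(parsed["genres"])
```

Splitting on commas will break names that themselves contain a comma (some production company names do), so treat the resulting lists as best-effort.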

Why use this instead of the TMDB API?

Most TMDB scrapers on the Apify Store are thin wrappers around the TMDB REST API — they require you to register for and manage your own API key and stay within TMDB's per-account rate limits. This actor scrapes TMDB's public web pages directly, so:

- No API key registration or management
- No per-account rate limits to worry about
- Both movies and TV shows in one unified output schema
- Includes fields that are not always easy to query via the API (IMDb cross-ID, keyword tags, top cast)

Inputs

Input	Type	Default	Description
`maxItems`	integer	15	Maximum number of records to return. Set to `0` for no limit.
`mediaType`	string	`both`	Which media type to scrape: `movie`, `tv`, or `both`.
`startPage`	integer	1	Discover page to start from (each page has ~20 titles).
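
As a rough illustration, an input object can be assembled and sanity-checked before starting a run. The field names and defaults come from the table above. The `build_input` helper and its validation rules (for example, rejecting `startPage` below 1) are assumptions made for this sketch and may not match what the actor itself enforces.

```python
import json

ALLOWED_MEDIA_TYPES = {"movie", "tv", "both"}


def build_input(max_items: int = 15, media_type: str = "both", start_page: int = 1) -> dict:
    """Build an input object using the field names and defaults from the table above."""
    # The checks below are assumed for illustration, not documented actor behavior.
    if max_items < 0:
        raise ValueError("maxItems must be 0 (no limit) or a positive integer")
    if media_type not in ALLOWED_MEDIA_TYPES:
        raise ValueError(f"mediaType must be one of {sorted(ALLOWED_MEDIA_TYPES)}")
    if start_page < 1:
        raise ValueError("startPage must be 1 or greater")
    return {"maxItems": max_items, "mediaType": media_type, "startPage": start_page}


# Request up to 100 TV records, starting from the third discover page.
run_input = build_input(max_items=100, media_type="tv", start_page=3)
print(json.dumps(run_input, indent=2))
```

The printed JSON can be pasted into the actor's input editor or passed to whichever client you use to start runs.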

Example use cases

- Media server catalogs: Build or enrich metadata catalogs for Plex, Jellyfin, or Kodi libraries without managing API credentials.
- Recommendation engines: Feed movie/TV metadata (genres, keywords, cast, and ratings in one schema) into ML pipelines.
- Cross-referencing: Use `imdb_id` to join TMDB data with IMDb datasets for enriched analytics.
- Market research: Track ratings and popularity trends across the TMDB catalog over time.
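
The cross-referencing use case can be sketched as a simple left join on `imdb_id`. This example assumes the external data has already been loaded into dictionaries keyed by an `imdb_id` field. The function name `join_on_imdb_id`, the `my_rating` column, and all sample values are hypothetical.

```python
def join_on_imdb_id(tmdb_records: list[dict], external_rows: list[dict]) -> list[dict]:
    """Left-join scraper records with external rows that share an imdb_id."""
    # Index the external rows once; rows without an ID cannot be matched.
    external_by_id = {row["imdb_id"]: row for row in external_rows if row.get("imdb_id")}
    joined = []
    for record in tmdb_records:
        match = external_by_id.get(record.get("imdb_id"), {})
        # Scraper fields win on key collisions; unmatched records pass through unchanged.
        joined.append({**match, **record})
    return joined


# Illustrative data only; neither list is real scraper or IMDb output.
tmdb_records = [
    {"tmdb_id": 550, "title": "Fight Club", "imdb_id": "tt0137523"},
    {"tmdb_id": 1, "title": "Obscure Title", "imdb_id": None},
]
external_rows = [{"imdb_id": "tt0137523", "my_rating": 9}]

for row in join_on_imdb_id(tmdb_records, external_rows):
    print(row["title"], row.get("my_rating"))
```

A left join keeps records whose `imdb_id` is missing, which matters here because the field is not guaranteed to be populated.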

How it works

Discover: Crawls paginated TMDB browse pages (`/movie?language=en-US&page=N`, `/tv?...`), 20 titles per page, up to 500 pages per type.
Detail: For each title, fetches the detail page and extracts:
- JSON-LD (schema.org Movie / TVSeries): name, description, rating, genres, runtime, release date
- DOM: user score chart, directors, cast, keywords, original title, language, production companies
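
The JSON-LD part of the Detail step can be approximated with the Python standard library alone. This is a sketch of the general technique, not the actor's actual code: the class and function names are made up, the sample HTML is invented, and the mapping from schema.org properties (`name`, `description`, `genre`, `datePublished`, `aggregateRating`) to output fields is an assumption about how such a page might be structured.

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collect the contents of <script type="application/ld+json"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self._in_json_ld = False
        self._buffer: list[str] = []
        self.blocks: list[dict] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                parsed = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                return  # Skip malformed blocks instead of failing the whole page.
            if isinstance(parsed, dict):
                self.blocks.append(parsed)


def extract_json_ld_fields(html: str) -> dict:
    """Map the first Movie/TVSeries JSON-LD block to a few output fields."""
    parser = JsonLdParser()
    parser.feed(html)
    parser.close()
    for block in parser.blocks:
        if block.get("@type") in ("Movie", "TVSeries"):
            rating = block.get("aggregateRating") or {}
            genre = block.get("genre") or []
            if isinstance(genre, str):  # schema.org allows a single string here.
                genre = [genre]
            return {
                "title": block.get("name"),
                "overview": block.get("description"),
                "genres": ", ".join(genre),
                "release_date": block.get("datePublished"),
                "vote_average": rating.get("ratingValue"),
                "vote_count": rating.get("ratingCount"),
            }
    return {}


# Invented page fragment for demonstration; not copied from TMDB.
sample_html = """
<html><head>
<script type="application/ld+json">
{"@type": "Movie", "name": "Example Film", "description": "A made-up plot.",
 "genre": ["Drama", "Thriller"], "datePublished": "1999-10-15",
 "aggregateRating": {"ratingValue": 8.4, "ratingCount": 1000}}
</script>
</head><body></body></html>
"""

print(extract_json_ld_fields(sample_html))
```

Fields that only appear in the DOM (cast, directors, keywords, and so on) would need separate selectors, which depend on TMDB's current page markup and are not shown here.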

Notes

- TMDB's discover pages order titles by popularity (most popular first). Use `startPage` to offset into the catalog.
- The `imdb_id` field is populated only when TMDB links to IMDb on the detail page. This is common for well-known titles, but the field may be absent for obscure entries.
- `runtime_minutes` is the runtime in minutes for movies. For TV shows, TMDB typically reports the average episode length.
- The `language=en-US` parameter is appended to all requests to ensure English metadata in the output.
