BBC Good Food Recipe Scraper

Overview

The BBC Good Food Recipe Scraper uses sitemap discovery to enumerate and extract the full BBC Good Food recipe catalogue (15,000+ recipes). It captures structured data from each recipe page, including ingredients, step-by-step instructions, the UK nutrition panel, BBC-specific skill levels, dietary tags, star ratings, and schema.org/Recipe JSON-LD fields.

BBC Good Food is the largest free English-language recipe authority in the UK, with content ranging from quick weeknight dinners to elaborate celebration cakes. Generic multi-site scrapers require you to supply URLs, and they discard BBC-specific fields; this actor discovers the entire corpus automatically and extracts every structured field the site provides.

Features

Full sitemap enumeration: Walks the BBC Good Food sitemap index and collects every recipe URL across all quarterly recipe sitemaps (15,000+ recipes).
BYO (bring-your-own) URL mode: Supply specific recipe URLs via startUrls to scrape targeted recipes without a full crawl.
schema.org/Recipe extraction: Parses the embedded JSON-LD block on each page for all standard Recipe fields.
BBC-specific fields: Extracts skill level (Easy / More effort / A challenge), dietary tags (vegetarian, vegan, gluten-free, healthy, etc.), and the UK nutrition panel.
Respectful crawling: Honours the site's crawl-delay directive with conservative concurrency.
Incremental-friendly: Use maxItems to cap run size for incremental update workflows.

Use Cases

Building recipe datasets for LLM fine-tuning or RAG pipelines.
Meal planning and nutrition app data ingestion.
Food-trend analytics using BBC's categorisation taxonomy and editorial dietary tags.
Competitive benchmarking for recipe content platforms.
Academic research on UK food culture and cooking trends.

How It Works

Sitemap discovery: Fetches https://www.bbcgoodfood.com/sitemap.xml (a 260-child index) and filters to recipe-type sitemaps (e.g. 2026-Q2-recipe.xml).
URL collection: Extracts all /recipes/<slug> URLs from matching sitemaps, capped at maxItems.
Page extraction: Fetches each recipe page and parses the schema.org/Recipe JSON-LD block plus supplemental BBC DOM fields.
Output: Stores one record per recipe in the Apify dataset.
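
The discovery and collection steps above can be sketched in Python with the standard library only. This is a minimal illustration, not the actor's actual code: the function names, the regular expressions (inferred from the `2026-Q2-recipe.xml` example and the `/recipes/<slug>` pattern), the `/sitemaps/` path, and the `article` sitemap in the sample data are all assumptions. The sample XML is hand-written so the sketch runs without network access.

```python
import re
import xml.etree.ElementTree as ET

# Namespace used by <sitemap>, <url> and <loc> elements in sitemap XML.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Assumed file-name pattern, inferred from the example "2026-Q2-recipe.xml".
RECIPE_SITEMAP = re.compile(r"/\d{4}-Q[1-4]-recipe\.xml$")
# Assumed recipe URL shape: /recipes/<slug> with no further path segments.
RECIPE_URL = re.compile(r"^https://www\.bbcgoodfood\.com/recipes/[^/?#]+$")


def locs(xml_text):
    """Return every <loc> value in a sitemap or sitemap index document."""
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iterfind(".//sm:loc", NS) if el.text]


def recipe_sitemaps(index_xml):
    """Keep only child sitemaps whose file name looks like a recipe sitemap."""
    return [u for u in locs(index_xml) if RECIPE_SITEMAP.search(u)]


def recipe_urls(sitemap_xml, max_items=0):
    """Collect /recipes/<slug> URLs; max_items=0 means no cap."""
    urls = [u for u in locs(sitemap_xml) if RECIPE_URL.match(u)]
    return urls if max_items == 0 else urls[:max_items]


# Hand-written samples standing in for the real documents (paths are invented).
SAMPLE_INDEX = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://www.bbcgoodfood.com/sitemaps/2026-Q2-recipe.xml</loc></sitemap>
  <sitemap><loc>https://www.bbcgoodfood.com/sitemaps/2026-Q2-article.xml</loc></sitemap>
</sitemapindex>"""

SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.bbcgoodfood.com/recipes/easy-chocolate-cake</loc></url>
  <url><loc>https://www.bbcgoodfood.com/recipes/iced-tea</loc></url>
  <url><loc>https://www.bbcgoodfood.com/recipes/collection/easy-cake-recipes</loc></url>
</urlset>"""

print(recipe_sitemaps(SAMPLE_INDEX))
print(recipe_urls(SAMPLE_SITEMAP, max_items=1))
```

In this sketch the cap is applied after filtering, so non-recipe URLs never count towards maxItems.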

Input

Field	Type	Required	Description
`maxItems`	Integer	Yes	Maximum number of recipes to scrape. Set to `0` for the full corpus (15K+). Default: `10`.
`startUrls`	Array	No	Specific BBC Good Food recipe URLs to scrape. Skips sitemap discovery when provided.

Example — Full sitemap run (capped)

{
  "maxItems": 500
}

Example — BYO URLs

{
  "startUrls": [
    { "url": "https://www.bbcgoodfood.com/recipes/easy-chocolate-cake" },
    { "url": "https://www.bbcgoodfood.com/recipes/iced-tea" }
  ],
  "maxItems": 10
}
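
One way to read the two input shapes above is as a small decision: if startUrls is present, use those URLs; otherwise fall back to sitemap discovery, with maxItems set to 0 meaning no cap. The Python sketch below illustrates that reading. The function name `plan_run`, the mode labels, and the choice to apply the cap to startUrls as well are assumptions made for illustration, not documented actor behaviour.

```python
DEFAULT_MAX_ITEMS = 10  # default stated in the input table


def plan_run(actor_input):
    """Return (mode, urls, cap) for an input dict; cap None means uncapped."""
    max_items = actor_input.get("maxItems", DEFAULT_MAX_ITEMS)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_items, bool) or not isinstance(max_items, int) or max_items < 0:
        raise ValueError("maxItems must be a non-negative integer")
    cap = None if max_items == 0 else max_items

    # startUrls entries are objects with a "url" key, as in the BYO example.
    urls = [entry["url"] for entry in actor_input.get("startUrls") or []]
    if urls:
        # Assumption: the cap also limits user-supplied URLs.
        return "byo", urls[:cap], cap
    return "sitemap", [], cap


print(plan_run({"maxItems": 500}))
```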

Output

One record per recipe. All fields sourced from schema.org/Recipe JSON-LD unless noted.

Field	Type	Description
`slug`	String	URL slug (e.g. `easy-chocolate-cake`)
`url`	String	Full recipe page URL
`name`	String	Recipe title
`author`	String	Recipe author name
`description`	String	Short editorial description
`recipe_category`	String	Category (e.g. Cake, Dinner, Drink)
`recipe_cuisine`	String	Cuisine type (e.g. British, Italian)
`recipe_yield`	String	Serving yield (e.g. "Serves 8")
`prep_time`	String	Prep time as ISO 8601 duration (e.g. `PT20M`)
`cook_time`	String	Cook time as ISO 8601 duration
`total_time`	String	Total time as ISO 8601 duration
`skill_level`	String	BBC skill rating: Easy / More effort / A challenge
`recipe_ingredient`	Array	List of ingredient strings
`recipe_instructions`	Array	List of step-by-step instruction strings
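`nutrition`	String	JSON-encoded per-serving nutrition data under schema.org keys (`calories`, `fatContent`, `saturatedFatContent`, `carbohydrateContent`, `sugarContent`, `fiberContent`, `proteinContent`, `sodiumContent`)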
`aggregate_rating`	Number	Average star rating (1–5 scale)
`rating_count`	Integer	Number of ratings
`keywords`	Array	Editorial keyword tags
`dietary_tags`	Array	Dietary suitability tags (vegetarian, vegan, gluten-free, healthy, etc.)
`image_urls`	Array	Recipe image URLs
`date_published`	String	Publication date (ISO 8601)

Example output record

{
  "slug": "easy-chocolate-cake",
  "url": "https://www.bbcgoodfood.com/recipes/easy-chocolate-cake",
  "name": "Easy chocolate cake",
  "author": "Miriam Nice",
  "description": "Master the chocolate cake with an airy, light sponge and rich buttercream filling...",
  "recipe_category": "Cake",
  "recipe_cuisine": "",
  "recipe_yield": "Serves 8-10",
  "prep_time": "PT30M",
  "cook_time": "PT25M",
  "total_time": "PT55M",
  "skill_level": "Easy",
  "recipe_ingredient": [
    "225g unsalted butter, softened",
    "225g golden caster sugar",
    "4 large eggs"
  ],
  "recipe_instructions": [
    "Heat oven to 190C/170C fan/gas 5. Butter two 20cm sandwich tins...",
    "Beat 225g softened unsalted butter and 225g golden caster sugar until fluffy..."
  ],
  "nutrition": "{\"calories\":\"546 calories\",\"fatContent\":\"31 grams fat\",\"saturatedFatContent\":\"19 grams saturated fat\",\"carbohydrateContent\":\"63 grams carbohydrates\",\"sugarContent\":\"51 grams sugar\",\"fiberContent\":\"1 grams fiber\",\"proteinContent\":\"5 grams protein\",\"sodiumContent\":\"0.5 milligram of sodium\"}",
  "aggregate_rating": 4.7,
  "rating_count": 2314,
  "keywords": ["Afternoon tea", "Celebration cake", "Chocolate cake"],
  "dietary_tags": [],
  "image_urls": ["https://images.immediate.co.uk/production/volatile/sites/30/2020/08/easy_chocolate_cake-b62f92c.jpg?resize=440,230"],
  "date_published": "2020-08-21T00:00:00+00:00"
}
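
The prep_time, cook_time, and total_time fields are ISO 8601 duration strings, so downstream code usually needs to convert them to numbers. The following Python sketch handles the day, hour, minute, and second subset of the format (for example `PT30M` or `PT1H30M`). The helper name `duration_minutes` is hypothetical, and the sketch assumes the site does not emit month, year, or fractional components.

```python
import re

# Day/hour/minute/second subset of ISO 8601 durations, e.g. PT20M or PT1H30M.
DURATION = re.compile(r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$")


def duration_minutes(value):
    """Convert a duration such as 'PT1H30M' to minutes (float).

    Returns None for empty, missing, or unrecognised values.
    """
    if not value:
        return None
    match = DURATION.match(value)
    # Reject strings like "P" or "PT" that carry no actual component.
    if not match or not any(match.groups()):
        return None
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return days * 1440 + hours * 60 + minutes + seconds / 60


record = {"prep_time": "PT30M", "cook_time": "PT25M", "total_time": "PT55M"}
print(duration_minutes(record["prep_time"]) + duration_minutes(record["cook_time"]))
```

With the example record, prep and cook times sum to the same number of minutes as total_time, which makes a cheap consistency check on scraped data.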

Notes

Crawl-delay: BBC Good Food's robots.txt specifies a 12-second crawl delay. The actor respects this via low concurrency. Full-corpus runs (~15K recipes) will take several hours.
New recipes: Recipe sitemaps are split by quarter (e.g. 2026-Q2-recipe.xml). Run the actor periodically to capture newly published recipes.
Ratings on new recipes: Freshly published recipes may have no ratings yet, in which case aggregate_rating and rating_count are null.
Nutrition format: The nutrition field is a JSON string. Parse it with JSON.parse(record.nutrition) to access individual nutrients.
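
For consumers working in Python rather than JavaScript, the equivalent of JSON.parse is `json.loads`. The sketch below decodes the nutrition string and extracts the leading number from each value, using the values shown in the example output record. The helper name `parse_nutrition` is hypothetical, and the sketch discards the unit text, so it assumes you already know the unit for each key.

```python
import json
import re


def parse_nutrition(raw):
    """Decode the nutrition JSON string and keep the leading number of each value."""
    if not raw:
        return {}
    numbers = {}
    for key, text in json.loads(raw).items():
        # Values look like "546 calories"; entries without a leading number are skipped.
        match = re.match(r"\s*(\d+(?:\.\d+)?)", str(text))
        if match:
            numbers[key] = float(match.group(1))
    return numbers


raw = (
    '{"calories":"546 calories","fatContent":"31 grams fat",'
    '"proteinContent":"5 grams protein","sodiumContent":"0.5 milligram of sodium"}'
)
print(parse_nutrition(raw))
```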
