OrbTop

Babylist Baby Registry Product Catalog Scraper

Scrapes Babylist's product catalog: product details, pricing, ratings, multi-retailer where-to-buy links, editorial badges, and category data from the leading US baby registry platform.

What it does

Babylist renders its pages server-side with React (SSR) and embeds all product data in the page HTML as a JSON blob. This actor reads that JSON straight from the HTTP response, so it needs no headless browser and gives you fast, reliable access to the full product catalog.
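
To illustrate the general technique, here is a minimal sketch of pulling an embedded JSON blob out of server-rendered HTML using only the Python standard library. The `<script type="application/json">` container and the shape of the sample payload are assumptions made for this example; the actual markup Babylist uses and the actor's internal code are not documented here.

```python
import json
from html.parser import HTMLParser


class EmbeddedJsonParser(HTMLParser):
    """Collects and decodes the contents of <script type="application/json"> tags."""

    def __init__(self):
        super().__init__()
        self._capturing = False
        self._chunks = []
        self.blobs = []

    def handle_starttag(self, tag, attrs):
        # Assumption: the data lives in a JSON-typed script tag.
        if tag == "script" and dict(attrs).get("type") == "application/json":
            self._capturing = True
            self._chunks = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._capturing:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._capturing:
            self._capturing = False
            self.blobs.append(json.loads("".join(self._chunks)))


def extract_embedded_json(html):
    """Return every JSON blob found in the given HTML string."""
    parser = EmbeddedJsonParser()
    parser.feed(html)
    parser.close()
    return parser.blobs


# Hypothetical page fragment, not real Babylist markup.
SAMPLE_HTML = """
<html><body>
<div id="root">...</div>
<script type="application/json">{"product": {"id": 13869, "name": "Book Rack"}}</script>
</body></html>
"""

blobs = extract_embedded_json(SAMPLE_HTML)
print(blobs[0]["product"]["name"])  # prints "Book Rack"
```

Because the data is already present in the HTML, a plain HTTP fetch followed by this kind of parsing is enough; no JavaScript has to run.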

Scrape modes

| Mode | Description |
| --- | --- |
| sitemap | Walks the full product sitemap (~10,000 products) |
| category | Scrapes a specific category (e.g. strollers, car-seats, baby-monitors) |
| product_urls | Scrapes a list of product URLs you supply |
| editorial | Extracts products from Babylist's best-of editorial lists |
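
As a rough picture of what walking a sitemap involves, the sketch below parses a standard sitemap XML document and keeps only URLs that look like product pages. The sample XML, the non-product URL in it, and the assumption that product pages can be recognized by a `/gp/` path segment are illustrative; the real sitemap layout may differ.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content for demonstration only.
SAMPLE_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.babylist.com/gp/3-sprouts-book-rack/13869/22497</loc></url>
  <url><loc>https://www.babylist.com/hypothetical-non-product-page</loc></url>
</urlset>
"""


def product_urls_from_sitemap(xml_text, max_items=0):
    """Return product-looking URLs from a sitemap; max_items=0 means no limit."""
    root = ET.fromstring(xml_text.strip())
    urls = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        if "/gp/" not in url:  # assumption: product pages live under /gp/
            continue
        urls.append(url)
        if max_items and len(urls) >= max_items:
            break
    return urls


print(product_urls_from_sitemap(SAMPLE_SITEMAP))
```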

Input parameters

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| mode | string | sitemap | Which data to collect: sitemap, category, product_urls, or editorial |
| categorySlug | string | strollers | Category slug for category mode (e.g. car-seats, baby-monitors) |
| startUrls | array | | Product URLs for product_urls mode |
| maxItems | integer | 10 | Maximum products to scrape; 0 = unlimited |
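
The defaults and the `0 = unlimited` rule above can be checked before a run with a small helper. This is a sketch of one way to validate an input object on the caller's side, not the actor's own validation; the function names are made up, and treating a missing `startUrls` as an empty list is an assumption, since no default is listed for it.

```python
VALID_MODES = {"sitemap", "category", "product_urls", "editorial"}

# Defaults taken from the input parameter table.
DEFAULTS = {"mode": "sitemap", "categorySlug": "strollers", "maxItems": 10}


def normalize_input(raw):
    """Apply defaults and reject obviously invalid input (hypothetical helper)."""
    cfg = {**DEFAULTS, **raw}
    cfg.setdefault("startUrls", [])  # assumption: no URLs unless supplied

    if cfg["mode"] not in VALID_MODES:
        raise ValueError(f"unknown mode: {cfg['mode']!r}")

    max_items = cfg["maxItems"]
    if isinstance(max_items, bool) or not isinstance(max_items, int) or max_items < 0:
        raise ValueError("maxItems must be a non-negative integer")

    if cfg["mode"] == "product_urls" and not cfg["startUrls"]:
        raise ValueError("product_urls mode needs at least one entry in startUrls")

    return cfg


def item_limit(cfg):
    """Translate maxItems into a limit, where 0 means unlimited (None)."""
    return None if cfg["maxItems"] == 0 else cfg["maxItems"]


cfg = normalize_input({"mode": "category", "categorySlug": "car-seats"})
print(cfg["maxItems"], item_limit(cfg))  # prints "10 10"
```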

Output schema

Each record contains:

| Field | Type | Description |
| --- | --- | --- |
| product_id | integer | Babylist internal product group ID |
| variant_id | integer | Babylist internal variant ID |
| product_name | string | Full product name including variant |
| product_url | string | Canonical Babylist product URL |
| brand | string | Brand name |
| category | string | Primary category |
| subcategory | string | Subcategory (last breadcrumb) |
| description | string | Product description (HTML stripped) |
| price_current | number | Current Babylist price in USD |
| price_was | number | Original/MSRP price when on sale |
| currency | string | Always USD |
| rating_average | number | Community star rating (1.0-5.0) |
| rating_count | integer | Number of community ratings |
| review_highlights | string | Top review excerpts (JSON array) |
| retailer_links | string | Multi-retailer where-to-buy options (JSON array of {retailer, price, url}) |
| editorial_badges | string | Editorial callouts e.g. "Best of Babylist 2025" (JSON array) |
| image_urls | string | Product image URLs (JSON array) |
| color_options | string | Available color variants (JSON array) |
| size_options | string | Available size/style variants (JSON array) |
| gtin | string | Global Trade Item Number (barcode) |
| in_stock | boolean | Whether the product is available on Babylist |
| fsa_hsa_eligible | boolean | Whether the product qualifies for FSA/HSA spending |
| scraped_at | string | ISO timestamp of when the record was scraped |

Example output

{
  "product_id": 13869,
  "variant_id": 22497,
  "product_name": "Book Rack - Yellow Lion",
  "product_url": "https://www.babylist.com/gp/3-sprouts-book-rack/13869/22497",
  "brand": "3 Sprouts",
  "category": "Nursery",
  "subcategory": "Nursery Storage",
  "description": "Adorable character book rack to store your little one's favorite books...",
  "price_current": 39.99,
  "price_was": null,
  "currency": "USD",
  "rating_average": 4.8,
  "rating_count": 312,
  "in_stock": true,
  "fsa_hsa_eligible": false,
  "retailer_links": "[{\"retailer\":\"Amazon\",\"price\":37.99,\"url\":\"https://...\"}]",
  "image_urls": "[\"https://images.babylist.com/...\"]",
  "scraped_at": "2026-05-24T14:31:59.000Z"
}
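
Several fields (`retailer_links`, `image_urls`, and the other fields marked "JSON array" in the schema) arrive as JSON-encoded strings rather than nested arrays, so they need a second decoding step before use. A minimal sketch, using a trimmed copy of the record above; `decode_record` is a hypothetical helper name:

```python
import json

# Fields the output schema describes as JSON arrays stored in strings.
JSON_ARRAY_FIELDS = (
    "review_highlights",
    "retailer_links",
    "editorial_badges",
    "image_urls",
    "color_options",
    "size_options",
)


def decode_record(record):
    """Return a copy of the record with JSON-string fields parsed into lists."""
    decoded = dict(record)
    for field in JSON_ARRAY_FIELDS:
        value = decoded.get(field)
        if isinstance(value, str):  # absent or null fields are left untouched
            decoded[field] = json.loads(value)
    return decoded


record = {
    "product_name": "Book Rack - Yellow Lion",
    "price_current": 39.99,
    "retailer_links": '[{"retailer":"Amazon","price":37.99,"url":"https://..."}]',
    "image_urls": '["https://images.babylist.com/..."]',
}

decoded = decode_record(record)
cheapest = min(decoded["retailer_links"], key=lambda link: link["price"])
print(cheapest["retailer"], cheapest["price"])  # prints "Amazon 37.99"
```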

Usage examples

Scrape the full product catalog (sitemap mode):

{
  "mode": "sitemap",
  "maxItems": 0
}

Scrape strollers category:

{
  "mode": "category",
  "categorySlug": "strollers",
  "maxItems": 100
}

Scrape specific products:

{
  "mode": "product_urls",
  "startUrls": [
    "https://www.babylist.com/gp/uppababy-mesa-v2/12345/67890",
    "https://www.babylist.com/gp/snoo-smart-sleeper/54321/11111"
  ]
}
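
Before passing URLs to `product_urls` mode, it can help to check that they look like product pages. The sketch below assumes the path layout `/gp/<slug>/<product_id>/<variant_id>`, which is inferred from the example output (URL `.../gp/3-sprouts-book-rack/13869/22497` with `product_id` 13869 and `variant_id` 22497) and is not a documented guarantee; `parse_product_url` is a hypothetical helper.

```python
import re
from urllib.parse import urlparse

# Assumed path layout: /gp/<slug>/<product_id>/<variant_id>
PRODUCT_PATH = re.compile(
    r"^/gp/(?P<slug>[^/]+)/(?P<product_id>\d+)/(?P<variant_id>\d+)/?$"
)


def parse_product_url(url):
    """Return slug and IDs for a product URL, or None if it does not match."""
    parsed = urlparse(url)
    if parsed.netloc != "www.babylist.com":
        return None
    match = PRODUCT_PATH.match(parsed.path)
    if match is None:
        return None
    return {
        "slug": match["slug"],
        "product_id": int(match["product_id"]),
        "variant_id": int(match["variant_id"]),
    }


info = parse_product_url("https://www.babylist.com/gp/3-sprouts-book-rack/13869/22497")
print(info["product_id"], info["variant_id"])  # prints "13869 22497"
```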

Extract editorial best-of lists:

{
  "mode": "editorial",
  "maxItems": 50
}

Performance

  • Full sitemap crawl (~10,000 products): runs in datacenter mode, no proxy required
  • Memory: 512 MB
  • Concurrency: 8 parallel requests
  • Timeout: 4 hours (for full catalog runs)
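
For a sense of what a cap of 8 parallel requests means in practice, here is a minimal sketch of bounded concurrency with `asyncio.Semaphore`. It is not the actor's implementation: `fetch_product` is a stand-in that sleeps instead of making a network call, and the URLs are made up.

```python
import asyncio

MAX_CONCURRENCY = 8  # matches the concurrency figure listed above


async def fetch_product(url):
    """Stand-in for an HTTP request; sleeps briefly instead of hitting the network."""
    await asyncio.sleep(0.01)
    return {"product_url": url}


async def crawl(urls, limit=MAX_CONCURRENCY):
    """Fetch all URLs with at most `limit` requests in flight at once."""
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0  # highest number of simultaneous requests observed

    async def worker(url):
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await fetch_product(url)
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(worker(url) for url in urls))
    return results, peak


# Hypothetical URLs for demonstration only.
urls = [f"https://www.babylist.com/gp/example/{i}/{i}" for i in range(40)]
results, peak = asyncio.run(crawl(urls))
print(len(results), peak)
```

For scale, fitting roughly 10,000 products inside the 4-hour timeout requires an overall average of about 0.7 products per second (10,000 / 14,400 s).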