DTC Bike Brand Catalog & Pricing Aggregator
Scrapes product listings, specifications, and current pricing from three direct-to-consumer bicycle brand websites: Canyon, Trek, and Felt. Returns a unified dataset of bike models with prices, stock status, frame material, drivetrain, and available size/color options.
What it does
The scraper visits the official brand stores and extracts current catalog data per bike model:
- Canyon (Salesforce Commerce Cloud) — category listing pages feed product detail URLs; per-product data is extracted from ProductGroup JSON-LD on the detail pages
- Trek (SAP Hybris with Vue SSR) — category listing pages carry complete product data in Vue component attributes; no detail-page hop required
- Felt (Shopify) — collection listing pages route to product detail pages where Shopify's ProductGroup JSON-LD is parsed
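The JSON-LD step used for Canyon and Felt can be sketched as follows. This is a minimal illustration, not the scraper's actual code: it assumes the detail page embeds its structured data in `<script type="application/ld+json">` tags and that `@type` is the plain string `ProductGroup`. The names `_JsonLdCollector` and `extract_product_groups` are hypothetical.

```python
import json
from html.parser import HTMLParser


class _JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content can arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def extract_product_groups(html):
    """Return every JSON-LD object whose @type is exactly "ProductGroup"."""
    collector = _JsonLdCollector()
    collector.feed(html)
    groups = []
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks rather than failing the whole page
        # A script tag may hold a single object or a list of objects.
        for item in data if isinstance(data, list) else [data]:
            if isinstance(item, dict) and item.get("@type") == "ProductGroup":
                groups.append(item)
    return groups
```

In schema.org's vocabulary a ProductGroup usually lists its variants under `hasVariant`, which is where per-size and per-color offers would be expected; pages that nest objects under `@graph` or use a list-valued `@type` would need extra handling that this sketch omits.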
Output fields
| Field | Type | Description |
|---|---|---|
| `brand` | string | Source brand: canyon, trek, or felt |
| `model_name` | string | Full model name (e.g. "Aeroad CFR Di2") |
| `model_year` | number | Model year extracted from name/tags |
| `category` | string | road, gravel, mtb, ebike, or urban |
| `sub_category` | string | Sub-category slug from URL path |
| `sku` | string | Product SKU or handle |
| `frame_material` | string | carbon, aluminum, steel, or titanium |
| `groupset` | string | Drivetrain groupset inferred from specs/name |
| `wheelset` | string | Wheelset specification (when available) |
| `weight_kg` | number | Bike weight in kg (when declared in specs) |
| `frame_size_options` | string | Available frame sizes, comma-separated |
| `color_options` | string | Available colors, comma-separated |
| `price_usd` | number | Current retail price in USD |
| `price_msrp_usd` | number | MSRP/original price when different from sale price |
| `in_stock` | boolean | Current stock status |
| `discontinued` | boolean | Whether the model is discontinued |
| `product_url` | string | Canonical product page URL |
| `image_url` | string | Primary product image URL |
| `description` | string | Product description (up to 500 characters) |
| `spec_sheet` | string | JSON string of key component specs (Canyon only) |
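Because `frame_size_options` and `color_options` are comma-separated strings and `spec_sheet` is a JSON string, consumers will typically want to expand them before analysis. The helper below is a rough sketch of that step. The function name `expand_record` is hypothetical, and the handling of missing or empty values is an assumption about how absent fields appear in the output.

```python
import json


def _split_csv(value):
    """Split a comma-separated string into trimmed, non-empty parts."""
    return [part.strip() for part in (value or "").split(",") if part.strip()]


def expand_record(record):
    """Return a copy of an output record with list and dict fields decoded.

    Assumes missing fields are absent, None, or empty strings.
    """
    raw_specs = record.get("spec_sheet")
    return {
        **record,
        "frame_size_options": _split_csv(record.get("frame_size_options")),
        "color_options": _split_csv(record.get("color_options")),
        # spec_sheet is documented as Canyon-only, so default to an empty dict.
        "spec_sheet": json.loads(raw_specs) if raw_specs else {},
    }
```

Stripping whitespace around each part keeps the helper tolerant of both `"S,M,L"` and `"S, M, L"`, since the exact separator spacing is not specified above.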
Input
| Field | Type | Default | Description |
|---|---|---|---|
| `maxItems` | integer | 10 | Maximum total records to return |
| `brands` | array | ["canyon", "trek", "felt"] | Which brands to include |
| `categories` | array | [] | Filter by category slug (empty = all) |
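To make the semantics of these inputs concrete, here is a small sketch of how the defaults and filters could be applied to a stream of records. It is illustrative only: the scraper may well apply these limits while crawling rather than afterwards, and matching `categories` against the `category` output field (rather than `sub_category`) is an assumption. `DEFAULT_INPUT` and `select_records` are hypothetical names.

```python
DEFAULT_INPUT = {
    "maxItems": 10,
    "brands": ["canyon", "trek", "felt"],
    "categories": [],  # empty list means "all categories"
}


def select_records(records, run_input=None):
    """Apply brand/category filters and the maxItems cap to an iterable of records."""
    settings = {**DEFAULT_INPUT, **(run_input or {})}
    brands = set(settings["brands"])
    categories = set(settings["categories"])
    selected = []
    for record in records:
        if len(selected) >= settings["maxItems"]:
            break  # maxItems caps the total across all brands
        if record["brand"] not in brands:
            continue
        if categories and record["category"] not in categories:
            continue
        selected.append(record)
    return selected
```

For example, `select_records(records, {"brands": ["felt"], "categories": ["gravel"], "maxItems": 5})` would keep at most five Felt gravel records under these assumptions.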
Notes
- Canyon products include a `spec_sheet` JSON field with detailed component specs from their ProductGroup JSON-LD.
- Trek product detail pages are JavaScript-rendered (SPA). All available data (name, price, category, stock status) is extracted from server-side-rendered listing page attributes, so Trek records do not include spec sheets or weight.
- Felt product pages are Shopify-hosted and return a full ProductGroup JSON-LD; spec extraction is limited to what Shopify exposes in structured data.
- Pricing data reflects the public retail price at the time of scraping. Sale prices are captured in `price_usd`; if a strike-through MSRP is present, it appears in `price_msrp_usd`.
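The pricing rule above can be illustrated with a short sketch. It assumes the page yields a current price and, optionally, a strike-through price as numbers, and that a missing MSRP is represented as `None`; both the function name `map_prices` and that representation are assumptions, not confirmed behavior of the scraper.

```python
def map_prices(current_price, strike_through_price=None):
    """Map scraped prices onto the price_usd / price_msrp_usd output fields."""
    record = {"price_usd": current_price, "price_msrp_usd": None}
    # MSRP is only recorded when it is present and differs from the current price.
    if strike_through_price is not None and strike_through_price != current_price:
        record["price_msrp_usd"] = strike_through_price
    return record
```

Under this mapping, a record is on sale exactly when `price_msrp_usd` is not `None`, which gives consumers a simple test for discounted models.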