OrbTop

Grand Comics Database (GCD) Comic Issue Scraper



Extract comprehensive bibliographic data for comic book issues from the Grand Comics Database public API.

The Grand Comics Database (comics.org) is one of the most comprehensive open comics catalogs on the web, covering over 2 million issues across 229,000+ series from 17,000+ publishers worldwide. While the GCD website is Cloudflare-protected, the /api/ endpoints are publicly accessible and return clean JSON with no challenge. This actor uses the API path exclusively, so no browser is required.


What you get

Each record includes:

  • Issue identity — GCD issue ID, series ID, series name, issue number, volume, title
  • Publication details — publication date, on-sale date, key date, cover price (+ currency), page count
  • Barcodes & identifiers — barcode, ISBN, indicia publisher
  • Creator credits — per-story writer, penciller, inker, colorist, letterer, editor credits (in structured string format)
  • Story details — story count, story titles, character lists, genre tags, reprint notes
  • Variant linkage — variant_of_id and is_variant flag for variant cover tracking

Three crawl modes

Mode            Description
series_walk     Walk all GCD series page by page to cover the full catalog (229K+ series, 2M+ issues)
publisher_walk  Scope to specific publishers by ID (e.g. 78 = Marvel, 18 = DC)
by_series_ids   Fetch issues for a specific list of GCD series IDs

Input

Field         Type     Description
mode          string   Crawl mode: series_walk (default), publisher_walk, or by_series_ids
maxItems      integer  Maximum number of issue records to return (0 = unlimited; default 15)
seriesIds     array    GCD series IDs to scrape (by_series_ids mode only)
publisherIds  array    GCD publisher IDs to filter by (publisher_walk mode only)
startPage     integer  Series page to start from in series_walk mode (default 1)
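
As a rough illustration of how the input fields above fit together, the sketch below assembles an input payload in Python and prints it as JSON. The helper name build_input and its validation rules are assumptions made for this example; the actor itself may accept or reject combinations differently.

```python
import json

VALID_MODES = {"series_walk", "publisher_walk", "by_series_ids"}


def build_input(mode="series_walk", max_items=15, series_ids=None,
                publisher_ids=None, start_page=1):
    """Assemble an input dict from the documented fields.

    Hypothetical helper: the checks below are this example's own
    assumptions, not rules confirmed by the actor.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if max_items < 0:
        raise ValueError("maxItems must be 0 (unlimited) or a positive integer")

    payload = {"mode": mode, "maxItems": max_items}
    if mode == "by_series_ids":
        if not series_ids:
            raise ValueError("by_series_ids needs at least one series ID")
        payload["seriesIds"] = list(series_ids)
    elif mode == "publisher_walk":
        if not publisher_ids:
            raise ValueError("publisher_walk needs at least one publisher ID")
        payload["publisherIds"] = list(publisher_ids)
    else:
        # startPage is only documented for series_walk mode
        payload["startPage"] = start_page
    return payload


# Publisher IDs 78 (Marvel) and 18 (DC) are the examples given in this README.
marvel_and_dc = build_input(mode="publisher_walk", max_items=100,
                            publisher_ids=[78, 18])
print(json.dumps(marvel_and_dc, indent=2))
```

The resulting JSON can be pasted into the actor's input editor or passed to whichever client you use to start runs.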

Output dataset fields

Field              Type     Description
gcd_issue_id       integer  GCD database ID for the issue
issue_url          string   Canonical API URL for the issue
series_id          integer  GCD series ID (for joining with series data)
series_name        string   Series name with year, e.g. "Amazing Spider-Man (1963 series)"
issue_number       string   Issue number as published
volume             string   Volume designation
title              string   Issue title (if any)
publication_date   string   Human-readable publication date
on_sale_date       string   On-sale date (when available)
key_date           string   GCD sortable key date (YYYY-MM-DD format)
price              string   Cover price (numeric portion)
currency           string   Currency code (USD, GBP, EUR, etc.)
pages              integer  Page count
barcode            string   UPC barcode
isbn               string   ISBN (when available)
editing_credits    string   Editor credits
cover_image_url    string   Cover image URL (always null; not available via the GCD API)
variant_of_id      integer  GCD ID of the base issue (for variant covers)
is_variant         boolean  True if this is a variant cover
story_count        integer  Number of stories in the issue
story_titles       string   Story titles joined by " | "
story_credits      string   Creator credits per story joined by " || "
characters         string   Character appearances
genre              string   Genre tags
reprint_notes      string   Reprint notes
indicia_publisher  string   Indicia publisher name
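
Because story_titles and story_credits arrive as joined strings, downstream code usually needs to split them back into per-story values. The following is a minimal sketch of one way to do that, assuming the two fields list stories in the same order. The helper names and the sample record are invented for illustration, and each credit chunk is treated as an opaque string because its internal layout is not specified here.

```python
from itertools import zip_longest


def split_joined(value, separator):
    """Split a joined field into a list, treating None or '' as no entries."""
    if not value:
        return []
    return [part.strip() for part in value.split(separator)]


def pair_stories(record):
    """Return (title, credits) tuples, one per story.

    Assumes titles and credits are listed in the same order; if one
    field has fewer entries, the missing side is filled with ''.
    """
    titles = split_joined(record.get("story_titles"), " | ")
    credits = split_joined(record.get("story_credits"), " || ")
    return list(zip_longest(titles, credits, fillvalue=""))


# Made-up record for illustration only; not real GCD data.
sample = {
    "story_titles": "Cover | Main Story",
    "story_credits": "pencils: A. Artist || script: B. Writer",
}
for title, credit in pair_stories(sample):
    print(f"{title}: {credit}")
```

Since key_date is a sortable YYYY-MM-DD string, records can be put in chronological order with a plain string sort, for example sorted(records, key=lambda r: r["key_date"] or "").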

Use cases

  • Comic price guides — join GCD issue records with pricing data (PriceCharting, GoCollect) via series_id / gcd_issue_id
  • Collection management — build a personal or dealer catalog of owned/wanted issues
  • Market research — identify key issue variants, first appearances, creator runs
  • Publisher catalogs — enumerate complete run data for any publisher
  • Data enrichment — supplement your comic database with bibliographic depth
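
For the price-guide use case, the join itself can be a simple lookup on gcd_issue_id. The sketch below assumes your pricing rows have already been mapped to GCD issue IDs; the guide_price field, the attach_prices helper, and all sample values are hypothetical and do not reflect any pricing provider's actual format.

```python
def attach_prices(issues, prices, key="gcd_issue_id"):
    """Left-join pricing rows onto issue records by a shared ID.

    Issues without a matching pricing row get guide_price = None.
    """
    price_by_id = {row[key]: row for row in prices}
    merged = []
    for issue in issues:
        match = price_by_id.get(issue[key])
        guide_price = match["guide_price"] if match else None
        merged.append({**issue, "guide_price": guide_price})
    return merged


# Made-up data for illustration only.
issues = [
    {"gcd_issue_id": 1001, "series_id": 50, "issue_number": "1"},
    {"gcd_issue_id": 1002, "series_id": 50, "issue_number": "2"},
]
prices = [{"gcd_issue_id": 1001, "guide_price": "125.00"}]

for row in attach_prices(issues, prices):
    print(row["issue_number"], row["guide_price"])
```

A left join keeps every GCD issue in the result, which makes gaps in pricing coverage easy to spot.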

GCD data licensing

GCD data is released under a Creative Commons license (CC-BY-SA). Commercial use with attribution is permitted. See comics.org for details.


Rate limits

The GCD API runs on a community-run server. To stay well within polite-use limits, this actor uses conservative concurrency (8 parallel issue fetches), with a 200 ms delay between series pages and a 50 ms delay between issue fetches.
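
If you write your own client against the same API, a similar throttle can be sketched with an asyncio semaphore. This is not the actor's actual implementation: fetch_all and fake_fetch are hypothetical names, the 8-slot limit and 50 ms pause are taken from the figures above, and holding the slot during the pause is a design choice made for this example.

```python
import asyncio

MAX_PARALLEL = 8       # parallel issue fetches, per the figure above
ISSUE_DELAY_S = 0.05   # 50 ms pause after each issue fetch


async def fetch_all(issue_ids, fetch_issue,
                    max_parallel=MAX_PARALLEL, delay_s=ISSUE_DELAY_S):
    """Run fetch_issue over issue_ids with bounded concurrency."""
    semaphore = asyncio.Semaphore(max_parallel)

    async def polite_fetch(issue_id):
        async with semaphore:
            result = await fetch_issue(issue_id)
            # Hold the slot during the pause so each slot is rate-limited.
            await asyncio.sleep(delay_s)
            return result

    # gather returns results in the same order as issue_ids
    return await asyncio.gather(*(polite_fetch(i) for i in issue_ids))


async def _demo():
    in_flight = 0
    peak = 0

    async def fake_fetch(issue_id):
        # Stand-in for a real HTTP call; records how many run at once.
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)
        in_flight -= 1
        return {"gcd_issue_id": issue_id}

    records = await fetch_all(range(20), fake_fetch)
    return records, peak


records, peak = asyncio.run(_demo())
print(f"fetched {len(records)} records, peak concurrency {peak}")
```

Swapping fake_fetch for a real HTTP coroutine keeps the same concurrency ceiling regardless of how many issue IDs are queued.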


Notes

  • publisher, publisher_id, country, and language are not returned by the issue API endpoint; they live on the series object. If you need these fields, use series_id to join against a separate series walk.
  • cover_image_url is always null — the GCD API does not expose image URLs for individual issues.
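
The series_id join mentioned in the first note can be sketched as a dictionary lookup. The shape of the series rows below is an assumption (this README does not document a series dataset), and enrich_with_series and the sample values are invented for illustration.

```python
SERIES_FIELDS = ("publisher", "publisher_id", "country", "language")


def enrich_with_series(issues, series_rows):
    """Copy series-level fields onto each issue record via series_id.

    Issues whose series is missing from series_rows get None for
    every series-level field.
    """
    series_by_id = {row["series_id"]: row for row in series_rows}
    enriched = []
    for issue in issues:
        series = series_by_id.get(issue["series_id"], {})
        extra = {field: series.get(field) for field in SERIES_FIELDS}
        enriched.append({**issue, **extra})
    return enriched


# Made-up rows for illustration only.
series_rows = [
    {"series_id": 50, "publisher": "Example Comics", "publisher_id": 9,
     "country": "us", "language": "en"},
]
issue_rows = [
    {"gcd_issue_id": 1001, "series_id": 50},
    {"gcd_issue_id": 2001, "series_id": 77},
]

for row in enrich_with_series(issue_rows, series_rows):
    print(row["gcd_issue_id"], row["publisher"], row["language"])
```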