Book Series In Order — Reading Order Scraper

Scrapes book series reading order data from bookseriesinorder.com. Returns publication order, chronological order, series positions, and book metadata for thousands of authors, series, and characters.

ISBN databases and metadata APIs don't give you reading order. That's not an oversight — they weren't built for it. Bookseriesinorder.com is the canonical source for exactly this, and this actor walks its entire catalog.

What It Extracts

One record per book per section. An author with a 20-book series listed under both publication order and chronological order produces 40 records — same books, different sequence context.

Field	Type	Description
`author`	string	Author or series/character name (from page H1)
`author_slug`	string	URL slug identifying the author or series page
`series_name`	string	Series or collection name, parsed from the H2 heading
`order_type`	string	`Publication Order` or `Chronological Order`
`book_title`	string	Book title
`series_position`	integer	Position within this ordered list (1-based)
`publication_year`	integer	Publication year
`is_standalone`	boolean	True when the book appears under a standalone section
`co_author`	string	Co-author name if noted in the title
`notes`	string	Additional notes from the book entry
`author_url`	string	Canonical URL of the source page
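
Because each book appears once per order type, downstream code usually needs to regroup records before displaying a reading list. The sketch below shows one way to do that, assuming records arrive as plain dictionaries keyed by the field names in the table above. The `BookRecord` type, the `reading_order` helper, and the sample values are hypothetical illustrations, not actual actor output.

```python
from collections import defaultdict
from typing import Dict, List, Tuple, TypedDict


class BookRecord(TypedDict, total=False):
    # Field names follow the table above; only the fields used here are listed.
    series_name: str
    order_type: str
    book_title: str
    series_position: int
    publication_year: int


def reading_order(records: List[BookRecord]) -> Dict[Tuple[str, str], List[str]]:
    """Map (series_name, order_type) to titles sorted by series_position."""
    grouped: Dict[Tuple[str, str], List[BookRecord]] = defaultdict(list)
    for rec in records:
        grouped[(rec["series_name"], rec["order_type"])].append(rec)
    return {
        key: [r["book_title"] for r in sorted(recs, key=lambda r: r["series_position"])]
        for key, recs in grouped.items()
    }


# Made-up sample: two books, each listed under both order types (4 records).
sample: List[BookRecord] = [
    {"series_name": "Example Series", "order_type": "Publication Order",
     "book_title": "Book One", "series_position": 1, "publication_year": 2001},
    {"series_name": "Example Series", "order_type": "Publication Order",
     "book_title": "The Prequel", "series_position": 2, "publication_year": 2005},
    {"series_name": "Example Series", "order_type": "Chronological Order",
     "book_title": "The Prequel", "series_position": 1, "publication_year": 2005},
    {"series_name": "Example Series", "order_type": "Chronological Order",
     "book_title": "Book One", "series_position": 2, "publication_year": 2001},
]

orders = reading_order(sample)
```

Keying on the pair of series name and order type keeps the two sequences separate, so the same title can hold a different position in each list.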

How to Use

Full catalog crawl: Leave `startUrls` empty. The actor walks all 26 sitemaps (5,000+ pages) and emits one record per book per section. Use `maxItems` to cap the total. Long runs take hours, so set a generous timeout.

Targeted scrape: Provide specific author or series URLs in `startUrls`. The actor returns records for every book section on those pages.

{
  "startUrls": [
    { "url": "https://www.bookseriesinorder.com/lee-child/" },
    { "url": "https://www.bookseriesinorder.com/james-patterson/" }
  ],
  "maxItems": 500
}
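
When the list of pages comes from another system, the input can be generated rather than written by hand. The following is a minimal sketch, assuming page URLs follow the `https://www.bookseriesinorder.com/<slug>/` pattern seen in the example above. The `build_input` helper is hypothetical and only produces the input JSON; it does not call the actor.

```python
import json
from typing import List, Optional

BASE_URL = "https://www.bookseriesinorder.com"


def build_input(slugs: List[str], max_items: Optional[int] = None) -> dict:
    """Build an actor input dict from page slugs such as 'lee-child'."""
    actor_input: dict = {
        "startUrls": [{"url": f"{BASE_URL}/{slug.strip('/')}/"} for slug in slugs]
    }
    # maxItems is omitted when None. How the actor treats a missing value
    # (the default of 10 versus no cap) is not verified in this sketch.
    if max_items is not None:
        actor_input["maxItems"] = max_items
    return actor_input


payload = build_input(["lee-child", "james-patterson"], max_items=500)
print(json.dumps(payload, indent=2))
```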

Input Parameters

Parameter	Type	Default	Description
`startUrls`	array	Lee Child example	Specific bookseriesinorder.com author or series pages to scrape. When empty, the full sitemap is used.
`maxItems`	integer	10	Maximum number of book records to return. Leave blank for no cap.

Scale

The site covers 5,000+ author and series pages. A full run returns millions of records: every book in every series listed under every available order type. Most use cases want a targeted subset via `startUrls`.

Why This Data

Open Library, Google Books, and Goodreads expose metadata — title, ISBN, cover, author, description. None of them expose clean publication-vs-chronological series ordering. Library apps, recommendation engines, and reading tracker tools need this distinction. Bookseriesinorder.com is the only structured source for it.

Technical Notes

No proxy required. The site returns 200 on standard requests.
Concurrency is kept low (3 concurrent requests, 500 ms delay) out of courtesy to a small content site.
Both author pages (e.g. `/lee-child/`) and character/series pages (e.g. `/scot-harvath/`) are handled.
`maxItems` caps total records across all pages. A single author page can produce hundreds of records, so a low cap may be exhausted by the first page.
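
For readers who want to apply a similar politeness policy in their own tooling, the sketch below limits in-flight requests with a semaphore and pauses after each one. It is not the actor's implementation: the `crawl_politely` function and the injected `fetch` callable are hypothetical, and it assumes the 500 ms delay applies per request slot, which the notes above do not specify.

```python
import asyncio
from typing import Awaitable, Callable, List

MAX_CONCURRENCY = 3   # value quoted in the notes above
DELAY_SECONDS = 0.5   # value quoted in the notes above


async def crawl_politely(
    urls: List[str],
    fetch: Callable[[str], Awaitable[str]],
    max_concurrency: int = MAX_CONCURRENCY,
    delay: float = DELAY_SECONDS,
) -> List[str]:
    """Fetch every URL with at most `max_concurrency` requests in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(url: str) -> str:
        async with semaphore:
            body = await fetch(url)
            # Hold the slot for `delay` seconds before another request may start.
            await asyncio.sleep(delay)
            return body

    # gather preserves input order, so results line up with `urls`.
    return list(await asyncio.gather(*(worker(u) for u in urls)))


async def fake_fetch(url: str) -> str:
    # Stand-in for a real HTTP client so the sketch runs offline.
    return f"<html>{url}</html>"


pages = asyncio.run(
    crawl_politely(["/lee-child/", "/scot-harvath/"], fake_fetch, delay=0)
)
```

Passing the fetch function in as an argument keeps the throttling logic independent of any particular HTTP library and makes it easy to exercise without touching the site.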

Data sourced from bookseriesinorder.com. Actor by OrbTop.