OrbTop

Common Sense Media Parental Content Rating Scraper

EDUCATION, AI


Scrape structured parental content ratings for movies and TV shows from Common Sense Media, a widely used source of age-appropriateness guidance for parents, schools, and streaming platforms.

What you get

Each record contains:

  • Title and media type (movie or TV)
  • CSM age recommendation — the editor's recommended minimum age (e.g., 9)
  • CSM overall rating — editorial star rating 1–5
  • Parents say rating — community star rating from parent reviewers 1–5
  • Kids say rating — community star rating from child reviewers 1–5
  • One-line verdict — the editor's brief summary sentence
  • Content domain scores (JSON, each 0–5): positive_messages, positive_role_models, violence, sex, language, consumerism, drinking_drugs_smoking, diverse_representations
  • Genre, release year, runtime, MPAA/ESRB rating
  • Review summary from meta description

Use cases

  • Family streaming apps — filter content by age-appropriateness and domain scores
  • EdTech / parental control products — build curated content catalogs with safety metadata
  • Brand safety vendors — classify media by content domains for ad-placement decisions
  • Kids content recommenders — rank titles by positive messages and role models
  • Research — analyze content trends across decades of movie/TV releases
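
As a rough illustration of the filtering and ranking use cases, the sketch below keeps titles at or below a viewer's age with a cap on violence, then ranks them by positive messages. The helper names (suitableFor, rankByPositivity), the thresholds, and the two "Hypothetical Title" records are invented for illustration; only the field names come from the actor's output record.

```javascript
// Sample records shaped like the actor's output.
// Only "Spirited Away" comes from this README; the other two are invented.
const records = [
  { title: "Spirited Away", age_recommendation: 9,
    content_domains: '{"positive_messages":4,"positive_role_models":4,"violence":3}' },
  { title: "Hypothetical Title A", age_recommendation: 6,
    content_domains: '{"positive_messages":5,"positive_role_models":3,"violence":1}' },
  { title: "Hypothetical Title B", age_recommendation: 13,
    content_domains: '{"positive_messages":2,"positive_role_models":2,"violence":4}' },
];

// content_domains is a JSON string, so decode it once per record.
const withDomains = records.map((r) => ({ ...r, domains: JSON.parse(r.content_domains) }));

// Keep titles recommended for the viewer's age or younger,
// with a violence score at or below the given cap.
function suitableFor(items, viewerAge, maxViolence) {
  return items.filter(
    (r) => r.age_recommendation <= viewerAge && r.domains.violence <= maxViolence
  );
}

// Rank by positive messages, breaking ties with positive role models (both descending).
function rankByPositivity(items) {
  return [...items].sort(
    (a, b) =>
      b.domains.positive_messages - a.domains.positive_messages ||
      b.domains.positive_role_models - a.domains.positive_role_models
  );
}

const picks = rankByPositivity(suitableFor(withDomains, 10, 3));
console.log(picks.map((r) => r.title)); // [ 'Hypothetical Title A', 'Spirited Away' ]
```

A real product would likely combine several domains (for example sex and language as well as violence) and let parents set the thresholds themselves.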

How it works

The actor walks the Common Sense Media reviews sitemap, filters to /movie-reviews/ and /tv-reviews/ URLs (~55,000+ reviews), and scrapes each detail page using Cheerio. Numeric content domain scores come from the Drupal data layer embedded in every page; community ratings are parsed from the rendered HTML.

No proxy required — the site is server-rendered and publicly accessible.
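
The URL filtering step described above can be sketched as follows. This is not the actor's source code: the function name mediaTypeFor, the "tv" label, and the two "example-" slugs are assumptions, and fetching and parsing the sitemap XML is left out.

```javascript
// Map the two review path prefixes to a media type.
// "movie" matches the sample output; "tv" is an assumed label.
const REVIEW_PREFIXES = { "/movie-reviews/": "movie", "/tv-reviews/": "tv" };

// Return "movie", "tv", or null for URLs that should be skipped.
function mediaTypeFor(url) {
  const { pathname } = new URL(url);
  for (const [prefix, type] of Object.entries(REVIEW_PREFIXES)) {
    // Require a slug after the prefix so listing pages are not treated as reviews.
    if (pathname.startsWith(prefix) && pathname.length > prefix.length) return type;
  }
  return null;
}

// Stand-in for URLs read from the sitemap.
const sitemapUrls = [
  "https://www.commonsensemedia.org/movie-reviews/spirited-away",
  "https://www.commonsensemedia.org/tv-reviews/example-show",   // hypothetical slug
  "https://www.commonsensemedia.org/book-reviews/example-book", // hypothetical slug
];

// Keep only movie and TV reviews, tagging each with its media type.
const queue = sitemapUrls
  .map((url) => ({ url, media_type: mediaTypeFor(url) }))
  .filter((entry) => entry.media_type !== null);

console.log(queue);
```

Matching on the parsed pathname rather than the raw string avoids false positives from query strings or other hosts that happen to contain the same text.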

Input

| Parameter | Type | Default | Description |
|-----------|---------|---------|-------------|
| maxItems | integer | 10 | Maximum number of reviews to scrape. Set to 0 for all reviews (~55,000+). |
| startUrls | array | [] | Optional list of specific review URLs to scrape directly (e.g., https://www.commonsensemedia.org/movie-reviews/spirited-away). If empty, uses the sitemap. |
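
To make the interaction between the two parameters concrete, here is a minimal sketch of how an input object could be interpreted. The helper resolveInput and its return shape are hypothetical, and startUrls is assumed to be a plain array of URL strings; the actor's real input handling may differ.

```javascript
// Defaults taken from the parameters listed above.
const DEFAULTS = { maxItems: 10, startUrls: [] };

// Hypothetical helper: merge user input with the defaults and describe the resulting run.
function resolveInput(input = {}) {
  const { maxItems, startUrls } = { ...DEFAULTS, ...input };
  if (!Number.isInteger(maxItems) || maxItems < 0) {
    throw new RangeError("maxItems must be a non-negative integer");
  }
  if (!Array.isArray(startUrls)) {
    throw new TypeError("startUrls must be an array");
  }
  return {
    // 0 means "no cap", modelled here as Infinity.
    limit: maxItems === 0 ? Infinity : maxItems,
    // An empty startUrls list falls back to sitemap discovery.
    source: startUrls.length > 0 ? "startUrls" : "sitemap",
    startUrls,
  };
}

console.log(resolveInput());                // default run: 10 reviews from the sitemap
console.log(resolveInput({ maxItems: 0 })); // full crawl: no cap, sitemap
console.log(resolveInput({
  startUrls: ["https://www.commonsensemedia.org/movie-reviews/spirited-away"],
}));                                        // targeted run: only the listed URL
```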

Output

{
  "title": "Spirited Away",
  "media_type": "movie",
  "csm_url": "https://www.commonsensemedia.org/movie-reviews/spirited-away",
  "age_recommendation": 9,
  "csm_overall_rating": 5,
  "parents_say_rating": 5,
  "kids_say_rating": 4,
  "one_line_verdict": "Magnificent movie with scary creatures and a strong heroine.",
  "content_domains": "{\"positive_messages\":4,\"positive_role_models\":4,\"violence\":3,\"sex\":1,\"language\":0,\"consumerism\":0,\"drinking_drugs_smoking\":2,\"diverse_representations\":4}",
  "genre": "Anime",
  "release_year": 2002,
  "runtime_minutes": 125,
  "mpaa_esrb_rating": "PG",
  "review_summary": "Magnificent movie with scary creatures and a strong heroine. Read Common Sense Media's Spirited Away review, age rating, and parents guide."
}
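
Because content_domains arrives as a JSON string rather than a nested object, downstream code has to decode it before reading individual scores. The sketch below shows one way to do that with a range check. The helper name parseContentDomains is made up, and treating a missing domain as null is an assumption, since this README does not say whether every review carries all eight scores.

```javascript
// The eight domain keys listed in this README.
const DOMAIN_KEYS = [
  "positive_messages", "positive_role_models", "violence", "sex",
  "language", "consumerism", "drinking_drugs_smoking", "diverse_representations",
];

// Decode content_domains and check that every present score lies in the 0 to 5 range.
function parseContentDomains(record) {
  const raw = JSON.parse(record.content_domains);
  const domains = {};
  for (const key of DOMAIN_KEYS) {
    const score = raw[key];
    if (score === undefined || score === null) {
      domains[key] = null; // assumption: a review may omit a domain
      continue;
    }
    if (typeof score !== "number" || score < 0 || score > 5) {
      throw new RangeError(`Unexpected score for ${key}: ${score}`);
    }
    domains[key] = score;
  }
  return domains;
}

// The sample record above, trimmed to the fields used here.
const record = {
  title: "Spirited Away",
  content_domains:
    '{"positive_messages":4,"positive_role_models":4,"violence":3,"sex":1,' +
    '"language":0,"consumerism":0,"drinking_drugs_smoking":2,"diverse_representations":4}',
};

const domains = parseContentDomains(record);
console.log(domains.violence); // 3
```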

Notes

  • Content domain scores follow the CSM 0–5 scale: 0 = not present, 5 = extreme/pervasive
  • content_domains is returned as a JSON string for schema compatibility — parse it with JSON.parse() in your downstream code
  • The sitemap also lists reviews for books, games, podcasts, and apps; the actor filters to movies and TV only (~55,000 reviews)
  • A full scrape of all ~55,000 reviews takes approximately 3–4 hours at default concurrency
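
For planning smaller runs, the full-scrape figure above implies a throughput of roughly 4 to 5 pages per second. The sketch below works that out and scales it to an arbitrary maxItems. It assumes run time grows linearly with the number of reviews and ignores startup and sitemap-loading overhead, so treat the results as ballpark numbers; the function names are illustrative.

```javascript
// Figures quoted in the note above.
const TOTAL_REVIEWS = 55000;
const FULL_RUN_HOURS = { fast: 3, slow: 4 };

// Average pages scraped per second over a full run of the given length.
function pagesPerSecond(totalPages, hours) {
  return totalPages / (hours * 3600);
}

// Estimated minutes to scrape n reviews at a given rate (linear scaling assumed).
function estimateMinutes(n, rate) {
  return n / rate / 60;
}

const fastRate = pagesPerSecond(TOTAL_REVIEWS, FULL_RUN_HOURS.fast);
const slowRate = pagesPerSecond(TOTAL_REVIEWS, FULL_RUN_HOURS.slow);

console.log(fastRate.toFixed(1), slowRate.toFixed(1)); // 5.1 3.8

// Rough range for a 1,000-review run.
console.log(
  estimateMinutes(1000, fastRate).toFixed(1),
  estimateMinutes(1000, slowRate).toFixed(1)
); // 3.3 4.4
```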