OrbTop

Google Scholar Case Law Scraper


Search and scrape US court opinions from Google Scholar's case law database. Filter by keywords, court level (federal or state), and date ranges to collect case names, citations, courts, decision dates, snippets, cited-by counts, and direct links. Optionally fetches the full opinion text from the detail pages.


Google Scholar Case Law Scraper Features

  • Searches every US court Google Scholar indexes — Supreme Court, federal circuits, state appellate, the full set
  • Filters by court level: all, federal-only, or state-only
  • Accepts year-range filters for date-bounded searches
  • Returns 14 fields per case including citation, court, decision year, snippet, and cited-by URL
  • Pulls full opinion text from detail pages when fetchFullText is enabled
  • Captures the cited_by count and cited_by_url so you can chain citation-network crawls
  • Returns PDF links where Google Scholar surfaces them
  • Routes through residential proxies — Google Scholar throttles datacenter IPs aggressively

Who Uses Google Scholar Case Law Data?

  • Legal researchers and academics — Pull case-law datasets for citation analysis, doctrinal studies, or empirical legal research
  • Legal AI training teams — Build training data for case summarization, citation extraction, or legal Q&A models
  • Practicing attorneys — Run keyword sweeps across federal or state opinions for due-diligence and brief-writing
  • Litigation analytics platforms — Surface trend data on how often specific doctrines or precedents appear across jurisdictions
  • Journalists and policy researchers — Trace how a constitutional doctrine moves across circuits over time

How the Google Scholar Case Law Scraper Works

  1. Set a search query — Plain-language keywords like "fourth amendment search and seizure"
  2. Pick a court filter (all, federal, or state) and optional dateStart/dateEnd year bounds
  3. Toggle fetchFullText if you need opinion bodies — Slower per case, but you get the full text instead of just the snippet
  4. The scraper paginates through Google Scholar results and returns one record per case
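The pagination in step 4 can be pictured with a minimal sketch. It is not the scraper's actual code: `fetch_page` is a hypothetical callable standing in for whatever requests and parses one result page, and the page size of 10 is an assumption.

```python
from typing import Callable, Iterator, List

Record = dict  # one case record, shaped like the output example in this document


def paginate(fetch_page: Callable[[int], List[Record]],
             max_items: int,
             page_size: int = 10) -> Iterator[Record]:
    """Yield up to max_items records, requesting one result page at a time.

    fetch_page(offset) is a hypothetical helper that returns the records
    found on the page starting at `offset` (an empty list when exhausted).
    """
    emitted = 0
    offset = 0
    while emitted < max_items:
        page = fetch_page(offset)
        if not page:  # an empty page means there are no more results
            return
        for record in page:
            yield record
            emitted += 1
            if emitted >= max_items:
                return
        offset += page_size  # assumed page size; adjust to the real one
```

Passing the fetcher in as a parameter keeps the loop testable without touching the network, and the `max_items` check inside the inner loop stops mid-page so a run never returns more records than requested.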

The scraper uses residential proxies by default. Google Scholar's bot detection is aggressive, and datacenter IPs hit captchas quickly. With residential proxies, the run completes cleanly at the rate Scholar allows.


Input

{
  "query": "fourth amendment search and seizure",
  "court": "all",
  "dateStart": "2015",
  "dateEnd": "2024",
  "fetchFullText": false,
  "maxItems": 50
}

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| query | string | (required) | Search keywords for case law (e.g. first amendment freedom of speech). |
| court | string | all | Court filter. One of all, federal, state. |
| dateStart | string | "" | Start of date range, YYYY format (e.g. 2020). |
| dateEnd | string | "" | End of date range, YYYY format. |
| fetchFullText | boolean | false | Fetch the full opinion text from detail pages. Slower but provides complete case text. |
| maxItems | integer | 10 | Maximum number of case law records to scrape. |
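
If you build inputs programmatically, a small pre-flight check can catch mistakes before a run starts. The sketch below mirrors the field table above; `normalize_input` is a hypothetical helper, not part of the scraper, and two of its rules are assumptions that the table does not state: that dateStart must not be later than dateEnd, and that maxItems must be at least 1.

```python
import re

ALLOWED_COURTS = {"all", "federal", "state"}
YEAR_RE = re.compile(r"\d{4}")


def normalize_input(raw: dict) -> dict:
    """Apply the documented defaults and reject values the table rules out."""
    query = str(raw.get("query", "")).strip()
    if not query:
        raise ValueError("query is required")

    court = raw.get("court", "all")
    if court not in ALLOWED_COURTS:
        raise ValueError(f"court must be one of {sorted(ALLOWED_COURTS)}")

    years = {}
    for key in ("dateStart", "dateEnd"):
        value = raw.get(key, "")
        # Empty string means "no bound"; anything else must be a YYYY string.
        if value and not (isinstance(value, str) and YEAR_RE.fullmatch(value)):
            raise ValueError(f"{key} must be a four-digit year string (YYYY)")
        years[key] = value

    # Assumption: an inverted range is a mistake rather than a valid input.
    if years["dateStart"] and years["dateEnd"] and years["dateStart"] > years["dateEnd"]:
        raise ValueError("dateStart must not be later than dateEnd")

    max_items = raw.get("maxItems", 10)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(max_items, bool) or not isinstance(max_items, int) or max_items < 1:
        raise ValueError("maxItems must be a positive integer")

    return {
        "query": query,
        "court": court,
        **years,
        "fetchFullText": bool(raw.get("fetchFullText", False)),
        "maxItems": max_items,
    }
```

Calling `normalize_input({"query": "qualified immunity"})` fills in every default from the table, so the returned dict can be serialized directly as the run input.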

Federal-only search with full opinion text

{
  "query": "miranda warnings custodial interrogation",
  "court": "federal",
  "fetchFullText": true,
  "maxItems": 25
}

Date-bounded state-court search

{
  "query": "qualified immunity",
  "court": "state",
  "dateStart": "2020",
  "dateEnd": "2024",
  "maxItems": 100
}

Google Scholar Case Law Output Fields

{
  "case_name": "Miranda v. Arizona",
  "citation": "384 U.S. 436 (1966)",
  "court": "Supreme Court of the United States",
  "date_decided": "1966",
  "snippet": "...the prosecution may not use statements, whether exculpatory or inculpatory, stemming from custodial interrogation of the defendant unless...",
  "full_text": null,
  "cited_by_count": 60842,
  "cited_by_url": "https://scholar.google.com/scholar?cites=...",
  "related_cases_url": "https://scholar.google.com/scholar?q=related:...",
  "versions_count": 12,
  "source": "Supreme Court",
  "scholar_url": "https://scholar.google.com/scholar_case?case=...",
  "pdf_url": null,
  "case_id": "5677742783094545037"
}

| Field | Type | Description |
| --- | --- | --- |
| case_name | string | Full case name (e.g. Miranda v. Arizona). |
| citation | string | Legal citation extracted from source metadata. |
| court | string | Court that decided the case. |
| date_decided | string | Year or full date the case was decided. |
| snippet | string | Search-result snippet showing relevant excerpts. |
| full_text | string | Full opinion text. Populated only when fetchFullText is true. |
| cited_by_count | integer | Number of cases citing this case. |
| cited_by_url | string | Google Scholar URL listing the citing cases. |
| related_cases_url | string | Google Scholar URL for related cases. |
| versions_count | integer | Number of other sources reporting this case. |
| source | string | Source publication or court name from the metadata line. |
| scholar_url | string | Direct Google Scholar URL for this case. |
| pdf_url | string | PDF URL when available, otherwise null. |
| case_id | string | Google Scholar internal case ID extracted from the URL. |
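
As a rough illustration of working with these records downstream, the sketch below reads the case ID from a scholar_url and ranks records by cited_by_count. Both helper names are hypothetical, the ID in the usage line is made up, and the code assumes the ID travels in the `case` query parameter, as the scholar_url pattern in the example suggests.

```python
from typing import List, Optional
from urllib.parse import parse_qs, urlparse


def case_id_from_url(scholar_url: str) -> Optional[str]:
    """Return the value of the `case` query parameter, or None if absent."""
    params = parse_qs(urlparse(scholar_url).query)
    values = params.get("case")
    return values[0] if values else None


def most_cited(records: List[dict], top_n: int = 5) -> List[dict]:
    """Return the top_n records by cited_by_count, highest first.

    A missing or null count is treated as 0 so partial records still sort.
    """
    ranked = sorted(records, key=lambda r: r.get("cited_by_count") or 0, reverse=True)
    return ranked[:top_n]


# Usage with a made-up ID, not a real case:
# case_id_from_url("https://scholar.google.com/scholar_case?case=1234567890")
```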

FAQ

How do I scrape Google Scholar case law?

Google Scholar Case Law Scraper hits the standard Scholar case-law search URL with your query, court filter, and year range, then parses the result page. With fetchFullText enabled it follows each result link to pull the opinion body.

Can I filter by court?

Google Scholar Case Law Scraper accepts all, federal, or state for the court field. Federal returns every federal circuit and district court that Scholar indexes; state returns every state appellate and supreme court. For finer filtering (say, only the Ninth Circuit), pass the court name in your query string and let Scholar's relevance ranking handle it.
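
One way to apply the court-name approach is sketched below. Both helpers are hypothetical and live on the caller's side. The post-filter is an optional extra step, not something the scraper does, and it assumes the court field contains the court's name as readable text, which may not hold for every record.

```python
from typing import List


def with_court_hint(base_input: dict, court_name: str) -> dict:
    """Return a copy of the run input with the court name appended to the query."""
    updated = dict(base_input)
    updated["query"] = f'{base_input["query"]} {court_name}'
    return updated


def filter_by_court(records: List[dict], court_name: str) -> List[dict]:
    """Keep records whose court field mentions court_name (case-insensitive)."""
    needle = court_name.casefold()
    return [r for r in records if needle in (r.get("court") or "").casefold()]


# Usage: narrow a federal search toward one circuit, then tidy the results.
run_input = with_court_hint(
    {"query": "qualified immunity", "court": "federal", "maxItems": 50},
    "Ninth Circuit",
)
```

Because relevance ranking only nudges results toward the named court, filtering the returned records on the court field is a cheap way to drop stragglers from other courts.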

Does this need proxies?

Google Scholar Case Law Scraper uses residential proxies by default. Google Scholar throttles and captchas datacenter IPs aggressively, so residential is effectively required for any meaningful crawl.

Can I get the full opinion text?

Google Scholar Case Law Scraper returns the full opinion body when fetchFullText is set to true. Each detail-page fetch adds an extra request, so it's slower per case — but if you need the complete text for citation extraction or LLM training, that's the way to get it.
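
For the citation-extraction use case, a minimal sketch of pulling citations out of full_text might look like this. It only recognizes U.S. Reports citations of the form "384 U.S. 436"; other reporters would need their own patterns. The function name and the regular expression are illustrative assumptions, not part of the scraper.

```python
import re
from typing import List, Optional

# Matches U.S. Reports citations such as "384 U.S. 436".
# Other reporters (F.3d, S. Ct., state reporters) are deliberately out of scope.
US_REPORTS_RE = re.compile(r"\b(\d{1,3}) U\. ?S\. (\d{1,4})\b")


def extract_us_citations(full_text: Optional[str]) -> List[str]:
    """Return unique U.S. Reports citations in order of first appearance."""
    if not full_text:  # full_text is null unless fetchFullText was enabled
        return []
    found: List[str] = []
    for match in US_REPORTS_RE.finditer(full_text):
        cite = f"{match.group(1)} U.S. {match.group(2)}"
        if cite not in found:
            found.append(cite)
    return found
```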

How much does the Google Scholar Case Law Scraper cost to run?

Google Scholar Case Law Scraper is priced per record returned via the pay-per-event model. Runs with fetchFullText enabled return larger records, but the per-record price is the same.


Need More Features?

Need court-specific filters, citation graph crawling, or PDF download? File an issue or get in touch.

Why Use the Google Scholar Case Law Scraper?

  • Complete metadata — 14 fields per case including citation, court, cited-by count, and a direct Scholar URL
  • Cited-by chaining — Returns the cited_by_url so you can recursively crawl citation networks. Useful when you're tracing how a doctrine spreads across jurisdictions.
  • Full text when you need it — Optional opinion-body fetch, off by default to keep snippets-only runs fast
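
The cited-by chaining mentioned above can be sketched as a bounded breadth-first crawl. This is an illustration under assumptions: `fetch_citing` is a hypothetical callable that takes a cited_by_url and returns records in the output format, and the depth and size limits are arbitrary defaults chosen to keep a crawl from growing without bound.

```python
from collections import deque
from typing import Callable, List, Tuple


def crawl_citations(seed: dict,
                    fetch_citing: Callable[[str], List[dict]],
                    max_depth: int = 2,
                    max_cases: int = 100) -> List[Tuple[str, str]]:
    """Breadth-first walk over cited-by links.

    Returns edges as (citing_case_id, cited_case_id). Each case is expanded
    at most once, tracked by case_id.
    """
    seen = {seed["case_id"]}
    edges: List[Tuple[str, str]] = []
    queue = deque([(seed, 0)])
    while queue and len(seen) < max_cases:
        case, depth = queue.popleft()
        url = case.get("cited_by_url")
        if depth >= max_depth or not url:
            continue  # too deep, or nothing cites this case
        for citing in fetch_citing(url):
            edges.append((citing["case_id"], case["case_id"]))
            if citing["case_id"] not in seen and len(seen) < max_cases:
                seen.add(citing["case_id"])
                queue.append((citing, depth + 1))
    return edges
```

Deduplicating on case_id matters here because heavily cited opinions cite each other, and without it the same case would be fetched repeatedly.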