OrbTop

Cannabis License Scraper - National License Aggregator

Aggregate state cannabis-control-board license rosters into one normalized dataset. Returns license number, type, status, business name, address (with coordinates where the source publishes them), primary contact, equity flags, and adult-use vs. medical authorization across roughly 6,500 US cannabis license records.


Cannabis License Aggregator Features

  • Federates six jurisdictions out of the box: New York OCM (adult-use), New York OCM (cannabinoid hemp), Massachusetts CCC, Connecticut DCP, Boulder County (Colorado), and Utah DOH.
  • Normalizes a dozen distinct status codes into a single nine-value license_status enum, and each state's license-type taxonomy into nine canonical types plus other, so you can filter license_status = 'active' and have it mean the same thing in every state.
  • Round-robin pagination across jurisdictions. You see records from every requested state before any one dominates the budget.
  • Filters by license type (retailer, cultivator, manufacturer, processor, testing-lab, microbusiness, distributor, delivery, hemp), active-only status, adult-use only, or medical only.
  • Flags social-equity / minority-owned licensees where the source state publishes that signal.
  • Pure API access — no browser, no proxies, no CAPTCHA. Just JSON from public open-data portals.
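The status normalization in the list above can be pictured as a lookup from a state's raw status string to the nine-value enum. This is a minimal sketch: the raw strings in RAW_STATUS_MAP are illustrative assumptions, not the actor's actual mapping table, and normalize_status is a hypothetical helper name.

```python
# The nine normalized values listed for the license_status output field.
NORMALIZED_STATUSES = {
    "active", "expired", "suspended", "revoked", "pending",
    "inactive", "surrendered", "denied", "other",
}

# Hypothetical raw strings (lowercased) mapped to normalized values.
# Real state portals use their own vocabularies; extend per source.
RAW_STATUS_MAP = {
    "active": "active",
    "expired": "expired",
    "suspended": "suspended",
    "revoked": "revoked",
    "pending": "pending",
    "in process": "pending",
    "inactive": "inactive",
    "surrendered": "surrendered",
    "denied": "denied",
}


def normalize_status(raw):
    """Map a raw status string to the normalized enum; unknowns become 'other'."""
    key = (raw or "").strip().lower()
    return RAW_STATUS_MAP.get(key, "other")
```

Falling back to other rather than raising keeps a run alive when a state introduces a new status label, and the original string stays available in license_status_raw.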

Who Uses Cannabis License Data?

  • Cannabis B2B sales (seed-to-sale, payments, packaging) — Pull active retailers and manufacturers by state, feed them into a CRM, run an outbound campaign by Friday.
  • Testing labs and ancillary services — Identify cultivators and processors by state with cultivation-tier and environment fields where available.
  • Cannabis-only insurance and finance — Active-license filter plus expiration date plus operational status — the inputs an underwriting model actually needs.
  • M&A intelligence and market research — Track license issuance, expirations, and equity-program participation across states without rebuilding the same scraper six times.
  • Compliance and license verification — Cross-reference a business name or license number against the source state's official roster.
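For the verification use case above, one way to cross-reference a name or license number against downloaded records is a small in-memory index. This is a sketch under assumptions: the suffix list and the helper names (name_key, build_index, lookup) are hypothetical, and only the field names come from the output schema.

```python
import re

# Common entity suffixes ignored when comparing names (illustrative list).
SUFFIXES = {"llc", "inc", "ltd", "corp", "co"}


def name_key(name):
    """Lowercase, strip punctuation, and drop entity suffixes."""
    cleaned = re.sub(r"[^a-z0-9]+", " ", (name or "").lower())
    return " ".join(t for t in cleaned.split() if t not in SUFFIXES)


def build_index(records):
    """Index records by license number, legal name, and DBA name."""
    index = {}
    for record in records:
        keys = {
            (record.get("license_number") or "").strip().upper(),
            name_key(record.get("business_name")),
            name_key(record.get("dba_name")),
        }
        for key in keys - {""}:
            index.setdefault(key, []).append(record)
    return index


def lookup(index, query):
    """Try the query as a license number first, then as a business name."""
    return index.get(query.strip().upper()) or index.get(name_key(query)) or []
```

Exact matching on a normalized key is deliberately conservative; fuzzy matching would catch more spelling variants at the cost of false positives, which matters for compliance checks.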

How the Aggregator Works

  1. Pick jurisdictions by slug (ny-ocm, ma-ccc, ct-dcp, ...) or by state code (NY, MA, CT). Leave both empty to pull from every supported jurisdiction.
  2. Optionally filter by license type, active-only, adult-use, or medical. The filter runs after normalization, so license_type = 'retailer' matches dispensaries in NY, retailers in MA, hybrid retailers in CT, and pharmacies in UT in one query.
  3. The aggregator round-robins across jurisdictions, fetching a page from each in turn. Snapshot-style sources (Massachusetts) are loaded once and sliced client-side.
  4. Records come back in a single flat schema, ready for a warehouse, a CRM, or a CSV export.
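The round-robin step above can be sketched with a queue of per-jurisdiction page iterators. This is an illustration of the idea, not the actor's code: paged and round_robin are hypothetical names, and the snapshot case is modeled as an already-loaded list sliced into pages.

```python
from collections import deque


def paged(records, page_size):
    """Slice an already-loaded snapshot into pages (snapshot-style source)."""
    for start in range(0, len(records), page_size):
        yield records[start:start + page_size]


def round_robin(sources, max_items):
    """Take one page from each source in turn until max_items is reached.

    `sources` maps a jurisdiction slug to an iterator of pages (lists of dicts).
    """
    queue = deque(sources.items())
    out = []
    while queue and len(out) < max_items:
        slug, pages = queue.popleft()
        page = next(pages, None)
        if page is None:
            continue  # source exhausted: do not re-queue it
        for record in page:
            if len(out) >= max_items:
                break
            out.append({**record, "source_jurisdiction": slug})
        queue.append((slug, pages))
    return out
```

With a small maxItems, this ordering is what lets a low-volume jurisdiction such as ut-doh contribute records before a large one such as ny-ocm consumes the whole budget.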

Input

{
    "jurisdictions": ["ny-ocm", "ma-ccc"],
    "licenseTypes": ["retailer"],
    "onlyActive": true,
    "maxItems": 500
}
| Field | Type | Default | Description |
|---|---|---|---|
| jurisdictions | array[string] | (all) | Jurisdiction slugs. See supported list below. |
| states | array[string] | | 2-letter state codes. Expands to every supported jurisdiction in the state. |
| licenseTypes | array[string] | | Filter to specific normalized types: retailer, cultivator, manufacturer, processor, testing-lab, microbusiness, distributor, delivery, hemp. |
| onlyActive | boolean | false | Keep only records whose normalized status is active. Drops expired, suspended, revoked, surrendered, denied, pending, inactive, and other. |
| onlyAdultUse | boolean | false | Filter to recreational / adult-use licensees only. |
| onlyMedical | boolean | false | Filter to medical licensees only. Hybrid licensees pass both adult-use and medical filters. |
| maxItems | integer | 15 | Maximum records to return across all jurisdictions. |
| proxyConfiguration | object | (off) | Optional Apify proxy. Not needed for any supported portal. |
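The filter fields above can be read as a single predicate applied to each normalized record. The sketch below is one plausible reading, not the actor's implementation: matches is a hypothetical helper, and combining onlyAdultUse with onlyMedical is assumed to keep only hybrid licensees.

```python
def matches(record, run_input):
    """Return True when a normalized record passes every requested filter."""
    types = run_input.get("licenseTypes") or []
    if types and record.get("license_type") not in types:
        return False
    # onlyActive keeps exactly one normalized status.
    if run_input.get("onlyActive") and record.get("license_status") != "active":
        return False
    if run_input.get("onlyAdultUse") and not record.get("is_adult_use"):
        return False
    if run_input.get("onlyMedical") and not record.get("is_medical"):
        return False
    return True
```

Because the predicate reads only normalized fields, the same input behaves identically for every jurisdiction.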

Supported Jurisdictions

| Slug | Jurisdiction | State | Records (approx.) |
|---|---|---|---|
| ny-ocm | New York Office of Cannabis Management (Adult-Use) | NY | 2,800 |
| ny-hemp | New York OCM (Cannabinoid Hemp) | NY | 2,700 |
| ma-ccc | Massachusetts Cannabis Control Commission | MA | 960 |
| ct-dcp | Connecticut DCP (Hybrid + Medical) | CT | 44 |
| co-boulder | Boulder County, Colorado | CO | 33 |
| ut-doh | Utah Department of Health (Medical Pharmacies) | UT | 10 |
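To show how the states input expands into slugs from the list above, here is a small sketch. The slug-to-state pairs come from the table; the function name and the choice to take the union when both jurisdictions and states are supplied are assumptions.

```python
# Slug -> state code, taken from the supported-jurisdictions list.
JURISDICTION_STATES = {
    "ny-ocm": "NY",
    "ny-hemp": "NY",
    "ma-ccc": "MA",
    "ct-dcp": "CT",
    "co-boulder": "CO",
    "ut-doh": "UT",
}


def resolve_jurisdictions(jurisdictions=None, states=None):
    """Return the slugs to query; empty input means every jurisdiction."""
    if not jurisdictions and not states:
        return list(JURISDICTION_STATES)
    wanted_slugs = set(jurisdictions or [])
    unknown = wanted_slugs - set(JURISDICTION_STATES)
    if unknown:
        raise ValueError(f"unsupported jurisdiction slugs: {sorted(unknown)}")
    wanted_states = {s.upper() for s in (states or [])}
    # Assumed behavior: union of explicit slugs and state expansions.
    return [
        slug for slug, state in JURISDICTION_STATES.items()
        if slug in wanted_slugs or state in wanted_states
    ]
```

For example, states ["NY"] expands to both ny-ocm and ny-hemp, which is why a New York pull may include hemp licensees unless licenseTypes narrows it.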

Example: pull every active retailer in New York

{
    "states": ["NY"],
    "licenseTypes": ["retailer"],
    "onlyActive": true,
    "maxItems": 1000
}

Example: cultivators across all jurisdictions, active only

{
    "licenseTypes": ["cultivator"],
    "onlyActive": true,
    "maxItems": 500
}

Cannabis License Output Fields

{
    "license_number": "OCM-RETL-25-000306",
    "business_name": "100 North 3rd Ltd",
    "dba_name": "7 Leaf Clover",
    "license_type": "retailer",
    "license_type_raw": "Adult-Use Retail Dispensary License",
    "license_status": "active",
    "license_status_raw": "Active",
    "license_issued_date": "2025-03-24",
    "license_effective_date": "2025-03-24",
    "license_expiration_date": "2027-03-24",
    "application_number": "OCMRETL-2023-000090",
    "primary_contact_name": "Jennifer Babaian",
    "phone": "",
    "email": "",
    "website": "",
    "address": "132 Metropolitan Ave",
    "city": "Brooklyn",
    "state": "NY",
    "zip": "11249",
    "county": "Kings",
    "region": "Brooklyn",
    "lat": null,
    "lng": null,
    "is_adult_use": true,
    "is_medical": false,
    "is_social_equity": true,
    "priority_status": "Women-Owned Business, Minority-Owned Business",
    "operational_status": "Active",
    "commence_operations_date": "",
    "cultivation_environment": "",
    "cultivation_tier": "",
    "license_fee_amount": null,
    "source_jurisdiction": "ny-ocm",
    "source_state": "NY",
    "source_url": "https://data.ny.gov/Government-Finance/Current-OCM-Licenses/jskf-tt3q"
}
| Field | Type | Description |
|---|---|---|
| license_number | string | State-assigned license number. Primary key within the source. |
| business_name | string | Legal business / entity name. |
| dba_name | string | Doing-business-as / trade name when distinct from legal name. |
| license_type | string | Normalized type: retailer, cultivator, manufacturer, processor, testing-lab, microbusiness, distributor, delivery, hemp, other. |
| license_type_raw | string | Raw type string from the source, useful when you need the state's original taxonomy. |
| license_status | string | Normalized status: active, expired, suspended, revoked, pending, inactive, surrendered, denied, other. |
| license_status_raw | string | Raw status string from the source. |
| license_issued_date | string | Date the license was originally issued (YYYY-MM-DD). |
| license_effective_date | string | Date the current term took effect (YYYY-MM-DD). |
| license_expiration_date | string | Date the license expires (YYYY-MM-DD). |
| application_number | string | Original application identifier when supplied. |
| primary_contact_name | string | Primary contact / responsible party name when supplied. |
| phone | string | Business phone number when published. |
| email | string | Business email when published (Massachusetts only, in practice). |
| website | string | Business website URL when published. |
| address | string | Establishment street address. |
| city | string | Establishment city. |
| state | string | Two-letter state code. |
| zip | string | ZIP / postal code. |
| county | string | County when published. |
| region | string | Sub-state region label (e.g. NY OCM exposes Brooklyn, Hudson Valley). |
| lat | number | Latitude (WGS84) when published. |
| lng | number | Longitude (WGS84) when published. |
| is_adult_use | boolean | true when the license authorizes recreational / adult-use sales. |
| is_medical | boolean | true when the license authorizes medical cannabis activities. |
| is_social_equity | boolean | true when the licensee qualifies under a social-equity / minority-owned program. |
| priority_status | string | Equity / priority program label from the source state. |
| operational_status | string | Operational status (e.g. Active, Operating, Non-Operational). |
| commence_operations_date | string | Date the licensee commenced operations (Massachusetts publishes this). |
| cultivation_environment | string | Indoor, Outdoor, or mixed, when the source publishes it. |
| cultivation_tier | string | Canopy tier (state-specific). |
| license_fee_amount | number | License fee paid in USD when published. |
| source_jurisdiction | string | Slug of the jurisdiction that produced this record. |
| source_state | string | Two-letter source state code. |
| source_url | string | URL of the source dataset on the state open-data portal. |
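Since the output is a flat schema, exporting it for a CRM import takes only the standard library. The sketch below writes a subset of the fields described above to CSV text; the column selection and the records_to_csv name are illustrative choices, not part of the actor.

```python
import csv
import io

# Illustrative subset of output fields for a lead list.
EXPORT_FIELDS = [
    "license_number", "business_name", "license_type",
    "license_status", "city", "state", "priority_status",
]


def records_to_csv(records, fields=EXPORT_FIELDS):
    """Serialize records to CSV text, ignoring fields not in `fields`."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fields,
        extrasaction="ignore",  # drop columns we did not ask for
        restval="",             # fill columns a record is missing
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buffer.getvalue()
```

The csv module quotes values that contain commas, which matters for priority_status strings such as "Women-Owned Business, Minority-Owned Business".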

FAQ

How do I scrape cannabis licenses across multiple US states?

Cannabis License Aggregator ships with six jurisdictions across five states and normalizes them into a single schema. Pick jurisdictions by slug, by state code, or leave both empty to hit every supported portal in one run.

How much does Cannabis License Aggregator cost to run?

Cannabis License Aggregator runs on pay-per-event pricing: $0.10 per actor start plus $0.00125 per record. A full sweep of every supported jurisdiction (~6,500 records) costs about $8.23 ($0.10 + 6,500 × $0.00125).
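The pricing arithmetic is easy to check before a run. This small sketch uses only the two prices quoted above; estimate_cost is a hypothetical helper, and Decimal is used to avoid floating-point rounding in the per-record price.

```python
from decimal import Decimal

START_FEE = Decimal("0.10")      # per actor start
PER_RECORD = Decimal("0.00125")  # per record returned


def estimate_cost(records, starts=1):
    """Estimated charge in USD for a number of records and actor starts."""
    return START_FEE * starts + PER_RECORD * records
```

For instance, estimate_cost(6500) evaluates to 8.225 and estimate_cost(500) to 0.725, so a 500-record test run costs well under a dollar.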

Does Cannabis License Aggregator need proxies?

No. All supported sources are public state-government open-data portals (Socrata SODA and direct JSON snapshots) designed for third-party consumption. The actor defaults to direct requests.

Can I filter to only active licenses?

Yes. onlyActive: true drops every record whose normalized status is anything other than active — that means no expired, no suspended, no revoked, no pending applications. This is what you want for outbound sales lists.

What's the difference between license_type and license_type_raw?

license_type is the normalized value, one of nine canonical categories (or other when nothing fits), so a query for retailer matches dispensaries in NY, retailers in MA, hybrid retailers in CT, and pharmacies in UT. license_type_raw is whatever the source state shipped, for when you need the original taxonomy.
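One way to picture the relationship between the two fields is a keyword rule table applied to the raw string. This is a sketch only: the keywords, their ordering, and the normalize_type name are assumptions for illustration, and the actor's real per-state mapping may differ.

```python
# Ordered rules: the first keyword match wins. Hypothetical keywords.
TYPE_RULES = [
    ("testing-lab", ("testing", "laboratory")),
    ("microbusiness", ("microbusiness",)),
    ("delivery", ("delivery",)),
    ("distributor", ("distribut",)),
    ("hemp", ("hemp",)),
    ("cultivator", ("cultivat",)),
    ("manufacturer", ("manufactur",)),
    ("processor", ("processor",)),
    ("retailer", ("retail", "dispensary", "pharmacy")),
]


def normalize_type(raw):
    """Map a raw license-type string to a normalized type, else 'other'."""
    text = (raw or "").lower()
    for normalized, keywords in TYPE_RULES:
        if any(keyword in text for keyword in keywords):
            return normalized
    return "other"
```

Under these assumed rules, "Adult-Use Retail Dispensary License" and a Utah "Medical Cannabis Pharmacy" both normalize to retailer, while the raw strings remain available for anyone who needs the state's own wording.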

Which states are coming next?

The most-requested additions are California (DCC, formerly BCC), Washington (LCB), Oregon (OLCC), Illinois, and Michigan. California publishes licensee data behind CloudFront, Washington publishes only Excel files, and Oregon publishes PDFs. Each requires a per-state adapter. File a request for the state you need.


Need More Features?

Need a state that isn't in the registry, owner-history fields, or violations data? File an issue or get in touch.

Why Use the Cannabis License Aggregator?

  • One schema, six jurisdictions: Query once, get normalized results from NY, MA, CT, CO, and UT. No per-state ETL.
  • Built for cannabis B2B sales: The licenseTypes filter, the onlyActive switch, and the social-equity flag cover the screens that seed-to-sale, payments, packaging, and insurance vendors ask for first.
  • Cheap to run: $0.00125 per record. A full sweep of all six jurisdictions is coffee money, not a budget line item.