OrbTop

Brazil CNPJ Receita Federal Scraper

Scrape Brazilian companies (CNPJ) from Receita Federal's open registry via minhareceita.org, the community mirror that refreshes monthly from the government bulk dumps. Returns the full company record — legal name, trade name, address, partners (QSA), CNAE industry codes, tax-regime flags, and registration status — across all 55M+ active CNPJs.


Brazil CNPJ Receita Federal Scraper Features

  • Looks up a single CNPJ or paginates the registry by filter.
  • Filters by state (UF), CNAE industry code, IBGE municipality, legal form, or partner CPF — alone or combined.
  • Returns 40+ fields per company, including the QSA partner list and tax-regime flags most enrichment APIs gate behind a paywall.
  • Flags MEI (micro-entrepreneur), Simples Nacional, and public-sector entities so you can segment the file before it leaves the run.
  • Pure JSON API — no browser, no proxy, no captcha. Hard to break.
  • Uses the upstream cursor pagination directly, so it does not skip records the way offset-based scrapers do once you cross the 10k mark.
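As a rough illustration of the segmentation the flags allow, the sketch below groups output records by the boolean fields mei, simples_nacional, and orgao_publico from the output schema. The bucket names, the precedence order, and the sample records are assumptions made for this example, not behavior of the actor.

```python
from collections import defaultdict


def segment(records):
    """Group records into buckets using the boolean flags in the output schema.

    Precedence (public sector, then MEI, then Simples Nacional) is an
    illustrative choice, not something the actor prescribes.
    """
    buckets = defaultdict(list)
    for rec in records:
        if rec.get("orgao_publico"):
            buckets["public_sector"].append(rec)
        elif rec.get("mei"):
            buckets["mei"].append(rec)
        elif rec.get("simples_nacional"):
            buckets["simples_nacional"].append(rec)
        else:
            buckets["other"].append(rec)
    return dict(buckets)


# Made-up records carrying only the fields this sketch reads.
sample = [
    {"cnpj": "A", "mei": True, "simples_nacional": True, "orgao_publico": False},
    {"cnpj": "B", "mei": False, "simples_nacional": True, "orgao_publico": False},
    {"cnpj": "C", "mei": False, "simples_nacional": False, "orgao_publico": True},
    {"cnpj": "D", "mei": False, "simples_nacional": False, "orgao_publico": False},
]
segments = segment(sample)
```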

Who Uses Brazil CNPJ Data?

  • B2B prospecting teams — Pull every company in a CNAE + state combo, drop it into a CRM, and stop buying lead lists.
  • Fintechs and KYC providers — Verify a customer's CNPJ against the registry and capture the partner network for AML screening.
  • Tax consultancies — Filter by Simples Nacional opt-ins or specific tax regimes to find the firms whose books you actually want.
  • Trade-credit insurers — Pre-fill underwriting workflows with status, capital, opening date, and partner data without paying per-call enrichment fees.
  • Researchers and journalists — Map ownership chains by partner CPF — Receita already masks the CPFs (***123456**), so the field is safe to surface.

How Brazil CNPJ Receita Federal Scraper Works

  1. Pick a mode. Either supply one CNPJ for a direct lookup, or set at least one filter (UF, CNAE, municipality, legal form, or partner CPF) for a bulk crawl.
  2. The scraper hits minhareceita.org. For bulk runs it walks the cursor-paginated search until it has maxItems records or the cursor runs out.
  3. Each upstream record is flattened into the dataset schema below. Nested objects become formatted strings; arrays of objects become lists of human-readable strings; CPFs stay masked.
  4. Results stream into the Apify dataset. Stop the run any time — the records already in the dataset are yours.
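The bulk-crawl loop in step 2 can be sketched roughly as follows. This is a minimal illustration, not the actor's source: fetch_page is a hypothetical callable standing in for the HTTP request, and the response keys "data" and "cursor" are assumed names, since the upstream response shape is not described here.

```python
from typing import Callable, Iterator, Optional


def crawl(fetch_page: Callable[[Optional[str]], dict], max_items: int) -> Iterator[dict]:
    """Yield records page by page until max_items is reached or the cursor runs out.

    fetch_page(cursor) is assumed to return {"data": [...], "cursor": str or None}.
    """
    cursor = None
    saved = 0
    while saved < max_items:
        page = fetch_page(cursor)
        for record in page.get("data", []):
            if saved >= max_items:
                return
            yield record
            saved += 1
        cursor = page.get("cursor")
        if not cursor:  # no further pages
            return


def fake_fetch(cursor: Optional[str]) -> dict:
    """In-memory stand-in for the upstream API, used only to exercise the loop."""
    pages = {
        None: {"data": [{"cnpj": "1"}, {"cnpj": "2"}], "cursor": "a"},
        "a": {"data": [{"cnpj": "3"}, {"cnpj": "4"}], "cursor": "b"},
        "b": {"data": [{"cnpj": "5"}], "cursor": None},
    }
    return pages[cursor]


first_three = [r["cnpj"] for r in crawl(fake_fetch, 3)]
everything = [r["cnpj"] for r in crawl(fake_fetch, 100)]
```

Because the function is a generator, a consumer can stop iterating at any point and keep what it has already received, which mirrors the "stop the run any time" behavior in step 4.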

Input

{
  "uf": "SP",
  "cnae": "6209100",
  "pageSize": 100,
  "maxItems": 1000
}

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| cnpj | string | "" | A single 14-digit CNPJ (dots, slashes, and dashes are ignored). Overrides every search filter. |
| uf | string | "DF" | Two-letter Brazilian state code (SP, RJ, DF…). Leave empty to match any state. |
| cnae | string | "" | CNAE code (matches both primary and secondary activities). Example: 6209100 (IT support). |
| municipio | string | "" | IBGE or SIAFI municipal code, digits only. Example: 5300108 (Brasília). |
| naturezaJuridica | string | "" | Legal-form code. Example: 2135 (Empresário Individual). |
| partnerCpf | string | "" | Partner CPF/CNPJ in the masked Receita format (***123456** for a CPF). Use this to find every company in which an individual is a partner. |
| pageSize | integer | 100 | Records per upstream API page (1–1000). Larger pages are faster but use more memory. |
| maxItems | integer | 50 | Maximum number of records the run will save across all queries. |

Bulk-search mode requires at least one of uf, cnae, municipio, naturezaJuridica, or partnerCpf. The upstream API times out on no-filter queries — the scraper rejects them up front so you do not pay for a dead run.
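A minimal sketch of that up-front check, assuming the input arrives as a plain dict with the field names from the table above. The function name pick_mode and the error message are made up for this example; the actor's real validation may differ.

```python
SEARCH_FILTERS = ("uf", "cnae", "municipio", "naturezaJuridica", "partnerCpf")


def pick_mode(actor_input: dict) -> str:
    """Return "lookup" or "search", or raise if no usable filter is set."""
    # A cnpj overrides every search filter.
    if str(actor_input.get("cnpj", "")).strip():
        return "lookup"
    # Bulk search needs at least one non-empty filter.
    if any(str(actor_input.get(key, "")).strip() for key in SEARCH_FILTERS):
        return "search"
    raise ValueError("Bulk search needs at least one of: " + ", ".join(SEARCH_FILTERS))
```

Note that uf defaults to "DF", so a caller has to clear it explicitly, and set no other filter, before the check rejects the input.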

Single-CNPJ lookup

{
  "cnpj": "00.000.000/0001-91",
  "maxItems": 1
}
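If you want to catch typos before starting a run, a client-side pre-check along these lines may help. It strips formatting the same way the cnpj field is described as doing, then verifies the two check digits with the standard modulo-11 rule for numeric CNPJs. The function names are hypothetical, and nothing here implies that the actor itself validates check digits.

```python
import re


def normalize_cnpj(raw: str) -> str:
    """Drop dots, slashes, dashes and any other non-digit characters."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) != 14:
        raise ValueError(f"expected 14 digits, got {len(digits)}")
    return digits


def _check_digit(digits: str, weights: list) -> int:
    """Modulo-11 check digit: 0 when the remainder is below 2, else 11 - remainder."""
    total = sum(int(d) * w for d, w in zip(digits, weights))
    remainder = total % 11
    return 0 if remainder < 2 else 11 - remainder


def has_valid_check_digits(cnpj: str) -> bool:
    """Verify the last two digits of an already-normalized numeric CNPJ."""
    w1 = [5, 4, 3, 2, 9, 8, 7, 6, 5, 4, 3, 2]
    w2 = [6] + w1
    first = _check_digit(cnpj[:12], w1)
    second = _check_digit(cnpj[:12] + str(first), w2)
    return cnpj[12:] == f"{first}{second}"


cnpj = normalize_cnpj("00.000.000/0001-91")
is_valid = has_valid_check_digits(cnpj)
```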

Find every IT support company in São Paulo

{
  "uf": "SP",
  "cnae": "6209100",
  "maxItems": 5000
}

Pull every Empresário Individual (including MEIs) in a city

{
  "municipio": "5300108",
  "naturezaJuridica": "2135",
  "maxItems": 1000
}

Brazil CNPJ Receita Federal Scraper Output Fields

{
  "cnpj": "00000000000191",
  "razao_social": "BANCO DO BRASIL SA",
  "nome_fantasia": "DIRECAO GERAL",
  "matriz_filial": "MATRIZ",
  "situacao_cadastral": "ATIVA",
  "situacao_cadastral_motivo": "SEM MOTIVO",
  "data_situacao_cadastral": "2005-11-03",
  "data_abertura": "1966-08-01",
  "cnae_principal_codigo": "6422100",
  "cnae_principal_descricao": "Bancos múltiplos, com carteira comercial",
  "cnaes_secundarios": ["6499999 — Outras atividades de serviços financeiros"],
  "natureza_juridica_codigo": "2038",
  "natureza_juridica": "Sociedade de Economia Mista",
  "porte": "DEMAIS",
  "capital_social": 120000000000,
  "logradouro": "QUADRA SAUN QUADRA 5 BLOCO B TORRE I, II, III",
  "numero": "SN",
  "complemento": "ANDAR T I SL S101 A S1602",
  "bairro": "ASA NORTE",
  "municipio": "BRASILIA",
  "uf": "DF",
  "cep": "70040912",
  "pais": "",
  "telefone1": "6134939002",
  "telefone2": "",
  "fax": "6134931040",
  "email": "secex@bb.com.br",
  "simples_nacional": false,
  "data_opcao_simples": "2007-07-01",
  "data_exclusao_simples": "2007-07-01",
  "mei": false,
  "data_opcao_mei": "2009-07-01",
  "data_exclusao_mei": "2009-07-01",
  "regime_tributario": ["LUCRO REAL (2020)", "LUCRO REAL (2021)"],
  "ente_federativo": "",
  "orgao_publico": false,
  "qsa": ["TARCIANA PAULA GOMES MEDEIROS — Presidente — id: ***128734** — entry: 2023-01-26"],
  "qsa_count": 42,
  "ultima_atualizacao": "",
  "source_url": "https://minhareceita.org/00000000000191"
}

| Field | Type | Description |
| --- | --- | --- |
| cnpj | string | 14-digit CNPJ, Brazil's company tax/registry ID. |
| razao_social | string | Legal name of the company. |
| nome_fantasia | string | Trade name (DBA). |
| matriz_filial | string | MATRIZ (head office) or FILIAL (branch). |
| situacao_cadastral | string | ATIVA, SUSPENSA, INAPTA, BAIXADA, or NULA. |
| situacao_cadastral_motivo | string | Reason text for the current status. |
| data_situacao_cadastral | string | Date the status took effect (YYYY-MM-DD). |
| data_abertura | string | Date the company started operating (YYYY-MM-DD). |
| cnae_principal_codigo | string | Primary CNAE industry code. |
| cnae_principal_descricao | string | Primary CNAE description. |
| cnaes_secundarios | string[] | Secondary CNAE codes formatted as CODE — Description. |
| natureza_juridica_codigo | string | Legal-form code. |
| natureza_juridica | string | Legal-form description. |
| porte | string | Size class: MICRO EMPRESA, EPP, DEMAIS. |
| capital_social | number | Declared capital in BRL. |
| logradouro | string | Street address line, prefixed with the street type (e.g. AVENIDA Brasil). |
| numero | string | Street number. |
| complemento | string | Suite, floor, room. |
| bairro | string | Neighborhood. |
| municipio | string | City name. |
| uf | string | Two-letter state code. |
| cep | string | Postal code (8 digits). |
| pais | string | Country (only when registered abroad). |
| telefone1 | string | Primary phone (DDD + number, concatenated). |
| telefone2 | string | Secondary phone. |
| fax | string | Fax number. |
| email | string | Contact email reported to Receita Federal. |
| simples_nacional | boolean | Opted into Simples Nacional. |
| data_opcao_simples | string | Simples Nacional opt-in date. |
| data_exclusao_simples | string | Simples Nacional opt-out date. |
| mei | boolean | MEI (Microempreendedor Individual). |
| data_opcao_mei | string | MEI opt-in date. |
| data_exclusao_mei | string | MEI opt-out date. |
| regime_tributario | string[] | Tax regimes by year, e.g. LUCRO REAL (2020), SIMPLES (2021). |
| ente_federativo | string | Federative entity (only on public-sector CNPJs). |
| orgao_publico | boolean | True when the legal-form code identifies a public-sector body. |
| qsa | string[] | Partner list. Each entry is NAME — ROLE — id: ***123456** — entry: YYYY-MM-DD. CPFs are pre-masked by Receita Federal. |
| qsa_count | number | Number of partners. |
| ultima_atualizacao | string | Upstream record's last-updated timestamp (when present). |
| source_url | string | Direct upstream URL for the record. |
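Because qsa and regime_tributario arrive as formatted strings, downstream code may want to split them back into parts. The sketch below assumes every entry follows exactly the formats shown in the sample output above; the regular expressions and function names are illustrative, and entries that deviate (for example, a name that itself contains the separator) would need extra handling.

```python
import re

# Matches "NAME — ROLE — id: ***123456** — entry: YYYY-MM-DD".
QSA_PATTERN = re.compile(
    r"^(?P<name>.+?) — (?P<role>.+?) — id: (?P<id>\S*) — entry: (?P<entry>\S*)$"
)
# Matches "LUCRO REAL (2020)".
REGIME_PATTERN = re.compile(r"^(?P<regime>.+) \((?P<year>\d{4})\)$")


def parse_qsa(entry: str) -> dict:
    """Split one qsa string into name, role, masked id and entry date."""
    match = QSA_PATTERN.match(entry)
    if match is None:
        raise ValueError(f"unexpected qsa format: {entry!r}")
    return match.groupdict()


def parse_regime(entry: str) -> tuple:
    """Split one regime_tributario string into (regime, year)."""
    match = REGIME_PATTERN.match(entry)
    if match is None:
        raise ValueError(f"unexpected regime format: {entry!r}")
    return match["regime"], int(match["year"])


partner = parse_qsa(
    "TARCIANA PAULA GOMES MEDEIROS — Presidente — id: ***128734** — entry: 2023-01-26"
)
regime = parse_regime("LUCRO REAL (2020)")
```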

FAQ

How do I scrape Brazilian companies by CNAE and state?

Brazil CNPJ Receita Federal Scraper takes a uf plus a cnae and pages through every matching company in the registry. A São Paulo IT-support sweep is { "uf": "SP", "cnae": "6209100" }. The cursor pagination handles the rest.

Does Brazil CNPJ Receita Federal Scraper need proxies?

No. The data sits behind Cloudflare CDN with caching enabled, so plain HTTP requests work fine. No proxy, no captcha, no anti-bot to dodge.

How current is the Receita Federal data?

The upstream mirror reloads from Receita Federal's monthly bulk dumps, so most records are 0–45 days behind the official source. Status changes (for example ATIVA to BAIXADA) and partner changes show up on the next refresh, not the same day, although the Receita website itself is on the same cycle.

Can I look up a single CNPJ instead of running a bulk crawl?

Yes. Set the cnpj field to a 14-digit CNPJ (formatting is stripped) and ignore everything else. The actor returns one record and exits. Useful for KYC enrichment.

What about the partner CPFs — is that legal to surface?

Receita Federal already publishes the QSA with masked CPFs in the format ***123456**. The scraper passes those through as-is — you never see a complete unmasked CPF, so there is nothing to redact downstream.
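When filling the partnerCpf input, a small helper like the one below can check that a value is already in the masked shape, or derive the mask from a full CPF you legitimately hold. It assumes the six visible digits are positions 4 to 9 of the 11-digit CPF, which matches the ***123456** pattern above but is not spelled out on this page; both function names are hypothetical.

```python
import re

MASKED_CPF = re.compile(r"\*{3}\d{6}\*{2}")


def is_masked_cpf(value: str) -> bool:
    """True when value looks like ***123456** (3 stars, 6 digits, 2 stars)."""
    return MASKED_CPF.fullmatch(value) is not None


def mask_cpf(cpf: str) -> str:
    """Mask a full CPF, keeping only the middle six digits (assumed positions 4-9)."""
    digits = re.sub(r"\D", "", cpf)
    if len(digits) != 11:
        raise ValueError(f"expected 11 digits, got {len(digits)}")
    return f"***{digits[3:9]}**"


masked = mask_cpf("000.123.456-00")  # made-up CPF used only for illustration
```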

How much does a typical run cost?

The actor uses standard PPE pricing: $0.10 per run start plus $0.001 per record. A 10,000-record state-wide CNAE sweep is about $10.10. The math is unambitious because the data is.
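The arithmetic behind that figure, as a small sketch using the two prices quoted above (the function name is made up for this example, and Decimal is used only to avoid floating-point rounding noise):

```python
from decimal import Decimal

RUN_START_FEE = Decimal("0.10")    # USD, charged once per run
PER_RECORD_FEE = Decimal("0.001")  # USD, charged per saved record


def estimate_cost(records: int) -> Decimal:
    """Estimated run cost in USD: start fee plus per-record fee."""
    if records < 0:
        raise ValueError("records must be non-negative")
    return RUN_START_FEE + PER_RECORD_FEE * records


sweep_cost = estimate_cost(10_000)  # 0.10 + 10,000 * 0.001
```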


Need More Features?

Need additional filters, custom fields, or a different Brazilian dataset wired in? File an issue or get in touch.

Why Use Brazil CNPJ Receita Federal Scraper?

  • Bulk discovery, not just enrichment — Most CNPJ actors on the market take a known CNPJ and return one record. This one finds the CNPJs in the first place by CNAE, state, city, or legal form.
  • Full registry data — Returns the partner list (QSA), tax-regime history, MEI flag, and Simples Nacional status — fields that competitors typically gate behind premium tiers.
  • Cheap and fast — Pure JSON API, no browser, no proxy. ~$0.001 per record and several hundred records per minute on a single instance.