OrbTop

US DOL H1B LCA PERM Scraper - Visa Wage Disclosure Data


DOL H1B LCA PERM Disclosure Scraper

Scrape visa disclosure data from the US Department of Labor. Returns H-1B, H-1B1, E-3, PERM, H-2A, H-2B, and CW-1 certifications with employer, wage, SOC, and worksite details — the same Excel dumps the DOL publishes every quarter, except you don't have to open a 75 MB spreadsheet to read them.

What does the DOL H1B LCA PERM Scraper do?

The scraper downloads the official quarterly disclosure XLSX files from dol.gov/agencies/eta/foreign-labor/performance, stream-parses them row by row, and returns structured records matching your filters. It covers five visa programs: LCA (H-1B, H-1B1 Singapore, H-1B1 Chile, E-3 Australian), PERM labor certifications, H-2A agricultural, H-2B non-agricultural, and CW-1 CNMI workers. Filter by employer, job title, SOC code, state, case status, or minimum annual wage. No auth, no proxy, no browser — just the data.
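
As a rough illustration of row-by-row parsing, the sketch below opens an XLSX workbook as a ZIP archive and walks the sheet XML with Python's standard-library iterparse, yielding one row at a time. It is not the actor's actual code. The sheet path xl/worksheets/sheet1.xml, the helper names, and the tiny demo workbook are assumptions made for illustration. Empty cells are not padded, so column positions can shift on sparse rows, and the shared-string table is held in memory in full.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = "{" + MAIN + "}"


def load_shared_strings(zf):
    """Return the workbook's shared-string table (empty if the part is absent)."""
    try:
        fh = zf.open("xl/sharedStrings.xml")
    except KeyError:
        return []
    strings = []
    with fh:
        for _, el in ET.iterparse(fh, events=("end",)):
            if el.tag == NS + "si":
                strings.append("".join(t.text or "" for t in el.iter(NS + "t")))
                el.clear()
    return strings


def iter_rows(xlsx_file, sheet="xl/worksheets/sheet1.xml"):
    """Yield each worksheet row as a list of cell strings, one row at a time."""
    with zipfile.ZipFile(xlsx_file) as zf:
        shared = load_shared_strings(zf)
        with zf.open(sheet) as fh:
            for _, el in ET.iterparse(fh, events=("end",)):
                if el.tag != NS + "row":
                    continue
                values = []
                for cell in el.iter(NS + "c"):
                    kind = cell.get("t")
                    if kind == "inlineStr":
                        values.append("".join(t.text or "" for t in cell.iter(NS + "t")))
                        continue
                    v = cell.find(NS + "v")
                    text = v.text if v is not None else None
                    if kind == "s" and text is not None:
                        text = shared[int(text)]  # index into shared strings
                    values.append(text)
                yield values
                el.clear()  # drop the row's cells once they have been consumed


def make_demo_workbook():
    """Build a two-row in-memory workbook (hypothetical data) for the demo."""
    sheet = (
        f'<worksheet xmlns="{MAIN}"><sheetData>'
        '<row r="1"><c r="A1" t="s"><v>0</v></c><c r="B1" t="s"><v>1</v></c></row>'
        '<row r="2"><c r="A2" t="inlineStr"><is><t>Google LLC</t></is></c>'
        '<c r="B2"><v>177000</v></c></row>'
        '</sheetData></worksheet>'
    )
    sst = f'<sst xmlns="{MAIN}"><si><t>EMPLOYER_NAME</t></si><si><t>WAGE</t></si></sst>'
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("xl/worksheets/sheet1.xml", sheet)
        zf.writestr("xl/sharedStrings.xml", sst)
    buf.seek(0)
    return buf


for row in iter_rows(make_demo_workbook()):
    print(row)
```

Because iter_rows is a generator, a caller can stop early (for example, once maxItems records have matched) without reading the rest of the sheet.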

DOL H1B Scraper Features

  • Five visa programs in one actor — LCA (H-1B family), PERM, H-2A, H-2B, CW-1. Unified output schema across all programs, so downstream code stays simple.
  • 45+ structured fields per record — employer, worksite, wages (offered + prevailing), SOC, case status, decision date, attorney, FEIN, NAICS, and more.
  • Wage normalization — offered wages are annualized to USD regardless of unit (Year / Month / Bi-Weekly / Week / Hour). Filter by minAnnualWage without worrying whether H-2B pays hourly.
  • Six filters, AND-combined — employer name (contains), job title (contains), SOC code (prefix), worksite state, case status, min annual wage.
  • Stream parsing — handles Q4 cumulative files that inflate to 600+ MB of inner XML without blowing memory.
  • No proxy, no auth — DOL files are public US-hosted downloads. Your proxy bill stays at zero.
  • Every fiscal year from 2020 onward — pick any FY / quarter. The actor builds the URL, the DOL CDN serves it.
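
The wage normalization described above can be sketched as a lookup of pay periods per year. The multipliers here are assumptions, since the actor's exact factors are not documented in this README. In particular, 2,080 hours per year (40 hours × 52 weeks) is a common convention for hourly pay, not a confirmed value.

```python
# Assumed pay periods per year for each wage_unit_of_pay value.
# 2080 = 40 hours/week x 52 weeks; the actor may use a different figure.
PERIODS_PER_YEAR = {
    "Year": 1,
    "Month": 12,
    "Bi-Weekly": 26,
    "Week": 52,
    "Hour": 2080,
}


def annualize(amount, unit):
    """Convert an offered wage to an annual figure, or None if data is missing."""
    if amount is None or unit is None:
        return None
    if unit not in PERIODS_PER_YEAR:
        raise ValueError(f"unknown wage unit: {unit!r}")
    return amount * PERIODS_PER_YEAR[unit]


print(annualize(177000, "Year"))
print(annualize(25, "Hour"))
```

Under these assumptions, an hourly H-2B wage and a yearly H-1B salary end up in the same unit, so a single minAnnualWage threshold can be compared against both.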

Who Uses DOL H1B / LCA / PERM Data?

  • Immigration attorneys — research precedent filings for a given employer, SOC code, or worksite. Build arguments from actual wage levels and case outcomes.
  • Compensation teams — benchmark H-1B salaries by role, level, and location. The LCA file is what levels.fyi wishes it was.
  • Recruiters and sourcing teams — find companies actively sponsoring H-1B in your target SOC, then cross-reference to LinkedIn.
  • Policy researchers and journalists — track denial rates, Level I wage abuse, H-1B dependent flags, and geographic concentration of foreign-labor filings.
  • Salary-benchmark products — pipe the full quarterly dataset into your pricing engine. Clean fields, consistent names, no parsing Excel.

How the DOL Scraper Works

  1. Pick a visa program, fiscal year, and quarter. The default (LCA / FY2025 / Q4) pulls the most recent annual LCA file.
  2. Optionally add filters — employer, job title, SOC prefix, worksite state, case status, min wage.
  3. The actor downloads the XLSX directly from dol.gov, stream-parses each row, and applies your filters.
  4. Matching rows are written to the Apify dataset in the unified schema. Set maxItems: 0 to pull the whole file.
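
Steps 2 to 4 can be sketched as a filter applied to a stream of records. This is an illustration under assumptions rather than the actor's code: the function names are hypothetical, and treating caseStatus as a case-insensitive exact match is a guess. The contains and prefix semantics follow the input descriptions in this README.

```python
from itertools import islice


def matches(record, filters):
    """Return True only if the record passes every non-empty filter (AND)."""
    def text(key):
        return (record.get(key) or "").lower()

    employer = filters.get("employerName", "").lower()
    title = filters.get("jobTitle", "").lower()
    soc = filters.get("socCode", "")
    state = filters.get("worksiteState", "").upper()
    status = filters.get("caseStatus", "").lower()
    min_wage = filters.get("minAnnualWage", 0)

    if employer and employer not in text("employer_name"):
        return False  # contains match on employer name
    if title and title not in text("job_title"):
        return False  # contains match on job title
    if soc and not (record.get("soc_code") or "").startswith(soc):
        return False  # prefix match on SOC code
    if state and (record.get("worksite_state") or "").upper() != state:
        return False
    if status and text("case_status") != status:
        return False  # assumed: exact match, ignoring case
    if min_wage and (record.get("wage_annual_min") or 0) < min_wage:
        return False  # records with no annualized wage fail a wage floor
    return True


def run(records, filters, max_items=100):
    """Collect matching records; max_items=0 means no limit."""
    matching = (r for r in records if matches(r, filters))
    return list(islice(matching, max_items or None))
```

Because the filter runs on a generator, records that do not match are discarded as they are read, and the scan stops as soon as max_items matches have been collected.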

Input

{
    "visaProgram": "LCA",
    "fiscalYear": 2025,
    "fiscalQuarter": "Q4",
    "employerName": "Google",
    "worksiteState": "CA",
    "minAnnualWage": 150000,
    "maxItems": 100
}

Field | Type | Default | Description
visaProgram | string | LCA | LCA, PERM, H-2A, H-2B, or CW
fiscalYear | integer | 2025 | DOL fiscal year (Oct–Sep). Supports 2020 through 2030.
fiscalQuarter | string | Q4 | Q1, Q2, Q3, or Q4. Q4 files are typically cumulative for the full FY.
employerName | string | "" | Case-insensitive contains match against employer legal name.
jobTitle | string | "" | Case-insensitive contains match against job title.
socCode | string | "" | Prefix match against SOC code (e.g. 15-1252 for Software Developers).
worksiteState | string | "" | Two-letter state code (e.g. CA, NY, TX).
caseStatus | string | "" | Certified, Denied, Withdrawn, or Certified - Withdrawn. Empty = all.
minAnnualWage | integer | 0 | Minimum offered wage (annualized). 0 = no minimum.
maxItems | integer | 100 | Max records to return. 0 = unlimited (run to end of file).
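
If you assemble inputs programmatically, a small client-side check can catch mistakes before a run starts. The sketch below merges overrides onto the documented defaults and validates them against the constraints in the table above. The helper name build_input is hypothetical, and the actor itself may validate differently.

```python
VISA_PROGRAMS = {"LCA", "PERM", "H-2A", "H-2B", "CW"}
QUARTERS = {"Q1", "Q2", "Q3", "Q4"}
CASE_STATUSES = {"", "Certified", "Denied", "Withdrawn", "Certified - Withdrawn"}

# Defaults as listed in the input table.
DEFAULTS = {
    "visaProgram": "LCA", "fiscalYear": 2025, "fiscalQuarter": "Q4",
    "employerName": "", "jobTitle": "", "socCode": "", "worksiteState": "",
    "caseStatus": "", "minAnnualWage": 0, "maxItems": 100,
}


def build_input(**overrides):
    """Merge overrides onto the defaults and check the documented constraints."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")
    cfg = {**DEFAULTS, **overrides}

    if cfg["visaProgram"] not in VISA_PROGRAMS:
        raise ValueError("visaProgram must be one of LCA, PERM, H-2A, H-2B, CW")
    if not 2020 <= cfg["fiscalYear"] <= 2030:
        raise ValueError("fiscalYear must be between 2020 and 2030")
    if cfg["fiscalQuarter"] not in QUARTERS:
        raise ValueError("fiscalQuarter must be Q1, Q2, Q3, or Q4")
    if cfg["caseStatus"] not in CASE_STATUSES:
        raise ValueError("unrecognized caseStatus")
    state = cfg["worksiteState"]
    if state and not (len(state) == 2 and state.isalpha()):
        raise ValueError("worksiteState must be a two-letter code")
    if cfg["minAnnualWage"] < 0 or cfg["maxItems"] < 0:
        raise ValueError("minAnnualWage and maxItems must be non-negative")
    return cfg


print(build_input(employerName="Google", worksiteState="CA", minAnnualWage=150000))
```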

Example: PERM labor certifications for Intel

{
    "visaProgram": "PERM",
    "fiscalYear": 2026,
    "fiscalQuarter": "Q1",
    "employerName": "Intel",
    "maxItems": 500
}

Example: All certified H-1Bs for Software Developers in California

{
    "visaProgram": "LCA",
    "fiscalYear": 2025,
    "fiscalQuarter": "Q4",
    "socCode": "15-1252",
    "worksiteState": "CA",
    "caseStatus": "Certified",
    "maxItems": 0
}

DOL H1B Scraper Output Fields

{
    "case_number": "I-200-23355-585999",
    "visa_class": "H-1B",
    "case_status": "Certified",
    "received_date": "2023-12-21",
    "decision_date": "2023-12-29",
    "employment_start_date": "2024-06-07",
    "employment_end_date": "2027-06-06",
    "employer_name": "Google LLC",
    "employer_address": "1600 Amphitheatre Parkway",
    "employer_city": "Mountain View",
    "employer_state": "CA",
    "employer_postal_code": "94043",
    "employer_country": "UNITED STATES OF AMERICA",
    "employer_phone": "+16502037602",
    "employer_fein": "77-0493581",
    "naics_code": "541512",
    "agent_attorney_name": "Felipe Peixoto Andrade",
    "agent_attorney_firm": "Berry Appleman & Leiden LLP",
    "job_title": "Software Engineer",
    "soc_code": "15-1299.08",
    "soc_title": "Computer Systems Engineers/Architects",
    "wage_rate_of_pay_from": 177000,
    "wage_rate_of_pay_to": null,
    "wage_unit_of_pay": "Year",
    "wage_annual_min": 177000,
    "prevailing_wage": 131144,
    "pw_unit_of_pay": "Year",
    "pw_wage_level": "II",
    "pw_source": null,
    "worksite_address": "225 Humboldt Ct",
    "worksite_city": "Sunnyvale",
    "worksite_county": "SANTA CLARA",
    "worksite_state": "CA",
    "worksite_postal_code": "94089",
    "full_time_position": "Y",
    "total_worker_positions": 1,
    "new_employment": 0,
    "continued_employment": 1,
    "change_previous_employment": 0,
    "new_concurrent_employment": 0,
    "change_employer": 0,
    "amended_petition": 0,
    "h1b_dependent": "No",
    "willful_violator": "No",
    "support_h1b": null,
    "source_file": "LCA_Disclosure_Data_FY2024_Q1.xlsx"
}

Field | Type | Description
case_number | string | DOL case number
visa_class | string | H-1B, H-1B1 Singapore, H-1B1 Chile, E-3 Australian, PERM, H-2A, H-2B, CW-1
case_status | string | Certified, Denied, Withdrawn, or Certified - Withdrawn
received_date | string | DOL received date (YYYY-MM-DD)
decision_date | string | DOL decision date (YYYY-MM-DD)
employment_start_date | string | Requested employment start date (YYYY-MM-DD)
employment_end_date | string | Requested employment end date (YYYY-MM-DD)
employer_name | string | Employer legal name
employer_address | string | Employer street address
employer_city | string | Employer city
employer_state | string | Employer state code
employer_postal_code | string | Employer postal code
employer_country | string | Employer country
employer_phone | string | Employer phone
employer_fein | string | Federal Employer Identification Number
naics_code | string | NAICS industry code
agent_attorney_name | string | Full name of agent or attorney
agent_attorney_firm | string | Law firm or business name
job_title | string | Job title
soc_code | string | SOC occupation code
soc_title | string | SOC occupation title
wage_rate_of_pay_from | number | Offered wage low end (in wage_unit_of_pay)
wage_rate_of_pay_to | number | Offered wage high end (in wage_unit_of_pay)
wage_unit_of_pay | string | Year, Hour, Month, Week, or Bi-Weekly
wage_annual_min | number | Offered wage low end, annualized to USD/year
prevailing_wage | number | DOL prevailing wage (in pw_unit_of_pay)
pw_unit_of_pay | string | Prevailing wage period
pw_wage_level | string | Prevailing wage level (I, II, III, IV)
pw_source | string | Prevailing wage source (OES, CBA, DBA, SCA, Other)
worksite_address | string | Worksite street address
worksite_city | string | Worksite city
worksite_county | string | Worksite county
worksite_state | string | Worksite state code
worksite_postal_code | string | Worksite postal code
full_time_position | string | Y or N
total_worker_positions | number | Total worker positions requested
new_employment | number | Count of new-employment positions
continued_employment | number | Count of continued-employment positions
change_previous_employment | number | Count of change-previous-employment positions
new_concurrent_employment | number | Count of new-concurrent-employment positions
change_employer | number | Count of change-employer positions
amended_petition | number | Count of amended-petition positions
h1b_dependent | string | H-1B dependent employer flag (Yes/No)
willful_violator | string | Willful violator employer flag (Yes/No)
support_h1b | string | Statutory basis / H-1B support flag
source_file | string | Source XLSX filename (e.g. LCA_Disclosure_Data_FY2024_Q1.xlsx)

PERM records populate employer, SOC, wage, and worksite fields; H-1B-specific fields (h1b_dependent, pw_wage_level, total_worker_positions, etc.) are null for PERM. H-2A / H-2B / CW-1 populate the LCA-style fields directly, with wages often in hourly units.
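
Because some fields can be null depending on the program, downstream code should guard against missing values. As one hypothetical example, the sketch below computes how far an offered wage sits above the prevailing wage. It returns None whenever either figure is missing or the prevailing wage is not quoted per year. The function name and the decision to skip non-yearly prevailing wages are assumptions made for this sketch.

```python
def wage_premium(record):
    """Return offered/prevailing - 1 as a fraction, or None if not computable."""
    offered = record.get("wage_annual_min")
    prevailing = record.get("prevailing_wage")
    if offered is None or not prevailing:
        return None  # missing or zero prevailing wage
    if record.get("pw_unit_of_pay") != "Year":
        return None  # sketch only handles yearly prevailing wages
    return offered / prevailing - 1


# Values taken from the sample record shown above.
sample = {"wage_annual_min": 177000, "prevailing_wage": 131144, "pw_unit_of_pay": "Year"}
premium = wage_premium(sample)
print(f"{premium:.1%}")
```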


FAQ

How do I scrape DOL H1B LCA data?

The DOL H1B LCA PERM Scraper downloads the quarterly XLSX disclosure file you pick and filters records in-stream. Default input returns 100 LCA records from the most recent fiscal year. For bulk pulls, set maxItems: 0 and the actor runs until the end of the file.

How much does it cost to run?

The scraper is pay-per-event: $0.10 per actor start plus $0.001 per record scraped. A 10,000-record pull costs about $10.10 ($0.10 + 10,000 × $0.001). A focused filter (one employer, one quarter) usually returns a few hundred records for well under a dollar.
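
To budget a run ahead of time, the pricing above reduces to a one-line formula. The sketch below uses Decimal to avoid floating-point rounding. The function name is hypothetical, and the rates are the ones quoted in this FAQ.

```python
from decimal import Decimal

START_FEE = Decimal("0.10")    # charged once per actor start
PER_RECORD = Decimal("0.001")  # charged per record scraped


def estimate_cost(records):
    """Estimated cost in USD for a single run returning `records` records."""
    return START_FEE + PER_RECORD * records


print(estimate_cost(10_000))
print(estimate_cost(300))
```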

Can I get data for a specific company?

Yes. Set employerName to any substring of the employer legal name. Matching is case-insensitive — google matches Google LLC and Google Operating LLC. Combine with worksiteState or socCode to narrow further.

What's the difference between LCA and PERM?

LCA (Labor Condition Application) is the temporary-visa filing for H-1B, H-1B1, and E-3 workers. PERM is the permanent-residency labor certification filed before an employer can sponsor a green card. The LCA file has ~100K records per quarter; PERM has ~15-20K. Most H-1B salary analysis uses LCA.

Does it need a proxy?

No. DOL files are hosted on a public US CDN. The actor's default proxy configuration is off.

How fresh is the data?

The DOL publishes disclosure files quarterly, usually 30-60 days after the quarter ends. The actor always downloads the current version of whatever FY / quarter you request — no caching on our side.
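
When deciding which file to request, it helps to map a calendar date to its DOL fiscal year and quarter. The sketch below assumes the October to September fiscal year described in the input table. The helper name is hypothetical, and it does not account for the publication lag.

```python
from datetime import date


def fiscal_period(d):
    """Map a calendar date to (fiscal_year, quarter) for an Oct-Sep fiscal year."""
    fiscal_year = d.year + 1 if d.month >= 10 else d.year
    # Oct-Dec -> Q1, Jan-Mar -> Q2, Apr-Jun -> Q3, Jul-Sep -> Q4
    quarter = ((d.month - 10) % 12) // 3 + 1
    return fiscal_year, f"Q{quarter}"


# The sample record's decision date falls in the FY2024 Q1 file.
print(fiscal_period(date(2023, 12, 29)))
```

This agrees with the sample output above, where a decision dated 2023-12-29 comes from LCA_Disclosure_Data_FY2024_Q1.xlsx.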

Can I pull historical years?

Yes. Every fiscal year from 2020 onward uses the same file naming convention and schema. Earlier years exist but have different columns and are not currently supported.


Need More Features?

Need custom fields, a different filter, or integration help? File an issue or get in touch.

Why Use the DOL H1B Scraper?

  • All five visa programs, one schema — LCA, PERM, H-2A, H-2B, CW-1 normalized to consistent field names. Stop writing per-program adapters.
  • Wages normalized to annual USD — one field (wage_annual_min) works across Year / Hour / Month / Week / Bi-Weekly records. Sorting and thresholds just work.
  • Straightforward pricing — $0.10 per actor start plus $0.001 per record. Predictable, not surprise-bill territory.