OrbTop

OPM FedScope Federal Workforce Scraper

Scrapes federal workforce data from OPM's FedScope datasets on www.opm.gov. Returns pre-aggregated employment statistics by agency, pay plan, grade, occupation, and duty location — sourced directly from OPM's quarterly ZIP releases, no auth required.


OPM FedScope Federal Workforce Scraper Features

  • Extracts federal employment statistics from OPM's quarterly FedScope dataset ZIPs
  • Returns agency headcount, average salary, and average length of service across multiple dimensions
  • Filters by dataset type: employment summary, employment cubes, accessions, separations, or all
  • Streams large ZIP files line by line, so even a 276 MB flat file is never loaded into memory whole
  • Covers data going back to FY 2005 (accessions/separations) and September 1998 (employment cubes)
  • No proxies required. OPM's public data page is accessible from any IP with a browser User-Agent.

Who Uses Federal Workforce Data?

  • Policy researchers — Analyze agency staffing trends, grade distributions, and pay over time
  • Government contractors — Identify agencies by workforce size, occupation series, and location to target business development
  • HR consultants — Benchmark federal pay scales against private sector by grade, occupation, and region
  • Journalists and watchdogs — Track workforce reductions, hiring surges, or pay changes at specific agencies
  • Data scientists — Build longitudinal models of federal employment using quarterly snapshots back to the late 1990s

How OPM FedScope Federal Workforce Scraper Works

  1. Fetches the OPM data catalog page and parses all FedScope dataset entries — name, type, and ZIP download URL
  2. Filters datasets by your selected dataset_type (employment summary, employment cubes, accessions, separations, or all) and downloads at most max_datasets ZIPs
  3. Streams each ZIP file through an in-memory unzipper and parses the tab-delimited or pipe-delimited text files line by line
  4. Maps each row to the output schema and saves records until maxItems is reached, then closes the stream cleanly
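Steps 3 and 4 can be sketched roughly as follows. This is a minimal illustration in Python, not the actor's actual code: the function name `iter_rows`, the use of the first line as a header, and the use of the member filename as `table_name` are all assumptions. Python's `zipfile` also needs a seekable source, so the sketch assumes the ZIP has already been downloaded or buffered; the actor's own unzipper may work differently.

```python
import io
import zipfile
from typing import Iterator


def iter_rows(zip_source, max_items: int, delimiter: str = "|") -> Iterator[dict]:
    """Yield up to max_items rows from the .txt members of a ZIP, one line at a time."""
    emitted = 0
    with zipfile.ZipFile(zip_source) as archive:
        for member in archive.namelist():
            if not member.lower().endswith(".txt"):
                continue
            # archive.open() decompresses lazily, so the whole file is never held at once.
            with archive.open(member) as raw:
                lines = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                # Assumption: the first line of each file is a header row.
                header = next(lines, "").rstrip("\r\n").split(delimiter)
                for line in lines:
                    if emitted >= max_items:
                        return  # stop early; the with-blocks close the stream
                    values = line.rstrip("\r\n").split(delimiter)
                    yield {"table_name": member, **dict(zip(header, values))}
                    emitted += 1


# Build a tiny pipe-delimited ZIP in memory to exercise the function.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "paygrade.txt",
        "pay_plan|grade|employee_count\nGS|12|5430\nGS|13|4100\nGS|14|2200\n",
    )

rows = list(iter_rows(buffer, max_items=2))
print(rows)
```

Passing `delimiter="\t"` would cover the tab-delimited files. Because the function is a generator, reading stops as soon as the cap is hit, which mirrors the maxItems behaviour described above.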

Input

{
  "maxItems": 10,
  "dataset_type": "employment_summary",
  "max_datasets": 1
}
| Field | Type | Default | Description |
| --- | --- | --- | --- |
| maxItems | integer | 10 | Maximum number of records to return across all datasets fetched |
| dataset_type | string | employment_summary | Type of FedScope dataset. Options: employment_summary, employment_cube, accessions, separations, all |
| max_datasets | integer | 1 | Maximum number of dataset ZIPs to download. Default 1 for fast runs. Increase for historical data. |

To pull a full history, set max_datasets higher and dataset_type to employment_cube for quarterly snapshots going back to 1998.

{
  "maxItems": 50000,
  "dataset_type": "employment_cube",
  "max_datasets": 10
}
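How dataset_type and max_datasets interact can be pictured with a small sketch. The `Dataset` class, the `select_datasets` function, the dataset names, and the `example.invalid` URLs are all hypothetical; the only behaviour taken from this page is "filter by type, then cap the count", with the catalog assumed to list the newest datasets first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    dataset_type: str
    zip_url: str


def select_datasets(catalog, dataset_type="employment_summary", max_datasets=1):
    """Filter catalog entries by type, keep catalog order, and cap the count."""
    if dataset_type == "all":
        matches = list(catalog)
    else:
        matches = [d for d in catalog if d.dataset_type == dataset_type]
    return matches[:max_datasets]


# Hypothetical catalog entries, newest first.
catalog = [
    Dataset("Cube (March 2025)", "employment_cube", "https://example.invalid/a.zip"),
    Dataset("Summary (March 2025)", "employment_summary", "https://example.invalid/b.zip"),
    Dataset("Cube (December 2024)", "employment_cube", "https://example.invalid/c.zip"),
    Dataset("Cube (September 2024)", "employment_cube", "https://example.invalid/d.zip"),
]

selected = select_datasets(catalog, dataset_type="employment_cube", max_datasets=2)
print([d.name for d in selected])
```

With the inputs shown, only the two most recent cube entries are kept, and the summary entry is skipped.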

OPM FedScope Federal Workforce Scraper Output Fields

{
  "dataset_name": "FedScope Employment Summary Data (March 2025)",
  "dataset_type": "employment_summary",
  "table_name": "Pay Plan and Grade",
  "datecode": "202503",
  "agency_code": "VA",
  "agency_name": "DEPARTMENT OF VETERANS AFFAIRS",
  "sub_agency_code": "VATA",
  "sub_agency_name": "VETERANS HEALTH ADMINISTRATION",
  "occupation_series": null,
  "occupation_title": null,
  "pay_plan": "GS",
  "grade": "12",
  "location_code": null,
  "location_state": null,
  "location_state_name": null,
  "employee_count": 5430,
  "average_salary": 95821,
  "average_los": 12.4,
  "source_url": "https://www.opm.gov/data/datasets/Files/753/bc88ce69-1bbe-406f-9441-3c5153014616.zip",
  "scraped_at": "2026-05-04T20:10:14.421Z"
}
| Field | Type | Description |
| --- | --- | --- |
| dataset_name | string | Full dataset name from OPM (e.g. "FedScope Employment Summary Data (March 2025)") |
| dataset_type | string | Dataset type: employment_summary, employment_cube, accessions, or separations |
| table_name | string | Sub-table within the dataset ZIP (e.g. "By Agency and SubAgency") |
| datecode | string | Reporting period in YYYYMM format (e.g. "202503" for March 2025) |
| agency_code | string | Agency abbreviation code (e.g. "VA", "DD", "HE") |
| agency_name | string | Full agency name (e.g. "DEPARTMENT OF VETERANS AFFAIRS") |
| sub_agency_code | string | Sub-agency abbreviation code |
| sub_agency_name | string | Full sub-agency name |
| occupation_series | string | OPM occupation series code (e.g. "610" for Nurse) |
| occupation_title | string | Occupation title (e.g. "NURSE") |
| pay_plan | string | Pay plan code (e.g. "GS", "SES", "VN", "SV") |
| grade | string | Grade within the pay plan (e.g. "12", "F", "3") |
| location_code | string | OPM duty location code |
| location_state | string | Two-letter state abbreviation (e.g. "CA", "TX") |
| location_state_name | string | Full state name (e.g. "CALIFORNIA") |
| employee_count | integer | Number of employees matching these dimensions |
| average_salary | number | Average annual salary in USD |
| average_los | number | Average length of service in years |
| source_url | string | URL to the source ZIP file on www.opm.gov |
| scraped_at | string | ISO timestamp when this record was scraped |

Not every field is populated in every record. Employment Summary tables are pre-aggregated along specific dimensions, so a "By Pay Plan and Grade" table has no location or occupation data, and those fields come back as null.
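When working with the output downstream, it can help to drop the null fields so each record carries only the dimensions its table defines. A minimal sketch in Python, using a trimmed copy of the sample record above; the helper name `populated_fields` is an assumption, not part of the actor:

```python
def populated_fields(record: dict) -> dict:
    """Return a copy of the record without fields whose value is None (JSON null)."""
    return {key: value for key, value in record.items() if value is not None}


# Trimmed version of the sample output record shown above.
record = {
    "table_name": "Pay Plan and Grade",
    "agency_code": "VA",
    "occupation_series": None,
    "pay_plan": "GS",
    "grade": "12",
    "location_state": None,
    "employee_count": 5430,
}

clean = populated_fields(record)
print(sorted(clean))
```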


🔍 FAQ

How do I scrape federal employment data from OPM?

OPM FedScope Federal Workforce Scraper fetches from www.opm.gov/data/datasets/, not the newer data.opm.gov portal (which requires a live browser session to generate download URLs). The legacy page has direct static ZIP links — cleaner, faster, and no JavaScript required.
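Because the legacy page exposes plain links, collecting the ZIP URLs needs no browser. The sketch below shows the general idea with Python's standard `html.parser`. The HTML snippet and file paths are made up for illustration, and the real catalog markup may differ, so treat the class `ZipLinkParser` as an assumption rather than the actor's parser.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ZipLinkParser(HTMLParser):
    """Collect absolute URLs of every <a href="...zip"> link on a page."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.zip_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.lower().endswith(".zip"):
            # Resolve relative links against the page URL.
            self.zip_urls.append(urljoin(self.base_url, href))


# Made-up markup standing in for the catalog page.
sample_html = """
<ul>
  <li><a href="/data/datasets/Files/1/example-a.zip">Example dataset A</a></li>
  <li><a href="/data/datasets/about.html">About</a></li>
  <li><a href="Files/2/example-b.ZIP">Example dataset B</a></li>
</ul>
"""

parser = ZipLinkParser("https://www.opm.gov/data/datasets/")
parser.feed(sample_html)
print(parser.zip_urls)
```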

How much does it cost to run?

Each actor start costs $0.10. Each record costs $0.001. A run with max_datasets: 1 that returns the full maxItems: 10000 costs $0.10 + 10,000 × $0.001 = $10.10, which is a lot cheaper than building a federal payroll database from scratch.
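The pricing above reduces to one line of arithmetic. A small Python helper, assuming billing is based on the number of records actually returned (the function name `estimate_cost` is illustrative):

```python
START_FEE_USD = 0.10    # charged once per actor start
PER_RECORD_USD = 0.001  # charged per record returned


def estimate_cost(records: int) -> float:
    """Estimated run cost in USD, rounded to the cent."""
    return round(START_FEE_USD + records * PER_RECORD_USD, 2)


print(estimate_cost(10_000))
print(estimate_cost(10))
```

A run that stops early because the dataset has fewer rows than maxItems would cost less than the estimate for the full cap.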

What's the difference between employment_summary and employment_cube?

Employment summary datasets are pre-aggregated tables (by agency, by pay plan/grade, by duty location, etc.) and run fast — one ZIP contains about 10 tables with tens of thousands of rows total. Employment cubes are quarterly raw snapshots with more granularity going back to September 1998. Use summary for current state; use cubes for longitudinal analysis.

Does OPM FedScope Federal Workforce Scraper need proxies?

It does not. OPM's data download page is accessible from Apify datacenter IPs with a standard browser User-Agent. No residential proxy, no CAPTCHA, no session required.

Can I pull historical data across multiple quarters?

Set max_datasets to however many quarters you need and dataset_type to employment_cube. The most recent datasets appear first in the catalog, so max_datasets: 4 gives you roughly the last year of quarterly snapshots.
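Once several quarters are in hand, the datecode field makes it straightforward to line them up. The Python sketch below totals employee_count per reporting period. The numbers are invented for illustration, and it assumes all records come from a single table so that rows do not overlap.

```python
from collections import defaultdict


def headcount_by_period(records) -> dict:
    """Sum employee_count per datecode; YYYYMM strings sort chronologically."""
    totals = defaultdict(int)
    for record in records:
        totals[record["datecode"]] += record["employee_count"]
    return dict(sorted(totals.items()))


# Invented sample rows across three quarterly snapshots.
records = [
    {"datecode": "202503", "agency_code": "VA", "employee_count": 120},
    {"datecode": "202412", "agency_code": "VA", "employee_count": 115},
    {"datecode": "202503", "agency_code": "HE", "employee_count": 80},
    {"datecode": "202409", "agency_code": "VA", "employee_count": 110},
]

series = headcount_by_period(records)
print(series)
```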


Need More Features?

Need custom field mappings, accessions/separations analysis, or a different OPM data product? File an issue or get in touch.

Why Use OPM FedScope Federal Workforce Scraper?

  • No auth, no CAPTCHA — OPM's legacy data page is fully public with direct ZIP links, so the actor just downloads and parses
  • Streaming ZIP parser — handles files up to 276 MB uncompressed without loading them into memory, which means it doesn't fall over on large historical pulls
  • 20+ years of history available — employment cube datasets go back to September 1998, accessions and separations to FY 2005