TokyoDev Job & Company Scraper

Scrapes tech job listings and company profiles from TokyoDev.com, the primary English-language job board for developers targeting Japan's tech industry. Returns jobs with titles, salaries, remote policies, Japanese language requirements, visa sponsorship signals, and technology tags — plus company profiles with descriptions and tech stacks — across ~182 job listings and ~232 company pages.

TokyoDev Scraper Features

Scrapes job listings, company profiles, or both via a single scrapeMode selector
Extracts Japanese language requirement per listing — true/false, not buried in description text
Captures remote policy per job: fully-remote, partially-remote, or no-remote
Returns apply-from-abroad eligibility where disclosed — useful for candidates outside Japan
Collects technology and skill tags per listing (Ruby, Python, React, etc.)
Filters by remote policy, seniority level, or Japanese language requirement before saving
Accepts specific TokyoDev URLs directly — skip sitemap discovery for targeted runs
Uses residential proxy to bypass Cloudflare protection on all non-sitemap pages

Who Uses TokyoDev Data?

Recruiters — Pull structured Japan tech listings with remote and language filters already applied, not raw HTML to parse
Job aggregators — Ingest English-language Japan tech jobs with consistent field structure across listings
Market researchers — Analyze salary trends, remote policy distribution, and Japanese language demand across the Japan tech sector
HR analytics teams — Build datasets tracking which companies are hiring, what seniority levels are in demand, and what tech stacks are common
Candidate matching platforms — Filter by japanese_required and apply_from_abroad to surface realistic options for international applicants

How TokyoDev Scraper Works

Fetches /sitemap.xml — accessible without Cloudflare challenge — and classifies URLs into job listings and company profile pages
Applies mode filter (jobs, companies, or both) and optional filters for remote policy, seniority, and Japanese language requirement
Loads each target page using a Playwright browser with residential proxy and anti-detection fingerprinting to bypass Cloudflare
Extracts data from both JSON-LD structured markup and rendered HTML, with HTML as fallback for fields not in the schema
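
The last step above (prefer JSON-LD, fall back to rendered HTML) can be sketched roughly as follows. This is a minimal illustration, not the scraper's actual code: it assumes the page embeds a schema.org JobPosting object in a script tag of type application/ld+json and that the job title is also rendered in the first h1. The names PageParser, find_job_posting, and extract_job are hypothetical, and the sample HTML is made up.

```python
import json
from html.parser import HTMLParser
from typing import Any, Dict, List, Optional


class PageParser(HTMLParser):
    """Collects JSON-LD script bodies and the text of the first <h1>."""

    def __init__(self) -> None:
        super().__init__()
        self.json_ld_blocks: List[str] = []
        self.h1_text: Optional[str] = None
        self._capture: Optional[str] = None  # "json_ld", "h1", or None
        self._buffer: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._capture, self._buffer = "json_ld", []
        elif tag == "h1" and self.h1_text is None:
            self._capture, self._buffer = "h1", []

    def handle_data(self, data):
        if self._capture is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._capture == "json_ld":
            self.json_ld_blocks.append("".join(self._buffer))
            self._capture = None
        elif tag == "h1" and self._capture == "h1":
            self.h1_text = "".join(self._buffer).strip()
            self._capture = None


def find_job_posting(blocks: List[str]) -> Dict[str, Any]:
    """Return the first schema.org JobPosting object, or {} if none parses."""
    for block in blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # Malformed block: skip it and rely on the HTML fallback.
        for item in (data if isinstance(data, list) else [data]):
            if isinstance(item, dict) and item.get("@type") == "JobPosting":
                return item
    return {}


def extract_job(html: str) -> Dict[str, Any]:
    """Build a partial job record, preferring JSON-LD over rendered HTML."""
    parser = PageParser()
    parser.feed(html)
    parser.close()
    posting = find_job_posting(parser.json_ld_blocks)
    org = posting.get("hiringOrganization")
    return {
        # Prefer the schema value; fall back to the rendered heading.
        "job_title": posting.get("title") or parser.h1_text,
        "company_name": org.get("name") if isinstance(org, dict) else None,
        "posted_date": posting.get("datePosted"),
    }


# Made-up sample pages: one with structured markup, one without.
WITH_SCHEMA = """<html><head><script type="application/ld+json">
{"@type": "JobPosting", "title": "Senior Rails Engineer",
 "hiringOrganization": {"@type": "Organization", "name": "TableCheck"},
 "datePosted": "2025-03-20"}
</script></head><body><h1>Heading not used</h1></body></html>"""
WITHOUT_SCHEMA = "<html><body><h1> Senior Rails Engineer </h1></body></html>"

from_schema = extract_job(WITH_SCHEMA)
from_html = extract_job(WITHOUT_SCHEMA)
```

Fields that exist only in the rendered page, such as remote policy or the Japanese language flag, would need their own HTML selectors. Those depend on the site's markup and are left out of this sketch.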

Input

{
  "scrapeMode": "jobs",
  "remotePolicy": "fully-remote",
  "japaneseRequired": "no-japanese-required",
  "maxItems": 50,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}

Field	Type	Default	Description
`scrapeMode`	string	`"both"`	What to scrape: `"jobs"`, `"companies"`, or `"both"`
`searchUrls`	array	—	Optional: specific TokyoDev URLs to scrape. Skips sitemap discovery.
`remotePolicy`	string	`""`	Filter by remote policy: `"fully-remote"`, `"partially-remote"`, `"no-remote"`, or empty for all
`seniority`	string	`""`	Filter by seniority: `"intern"`, `"junior"`, `"intermediate"`, `"senior"`, or empty for all
`japaneseRequired`	string	`""`	Filter by Japanese language: `"japanese-required"`, `"no-japanese-required"`, or empty for all
`maxItems`	integer	`50`	Maximum number of results to return
`proxyConfiguration`	object	RESIDENTIAL	Proxy settings — residential proxy required for Cloudflare bypass
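
A run input can be checked against the table above before a run is started. The sketch below is an illustration only: validate_input is a hypothetical helper, the allowed values and defaults are copied from the table, and the rule that maxItems must be at least 1 is an assumption.

```python
from typing import Any, Dict, List

# Allowed values copied from the input table; "" means "no filter".
ALLOWED_VALUES = {
    "scrapeMode": {"jobs", "companies", "both"},
    "remotePolicy": {"", "fully-remote", "partially-remote", "no-remote"},
    "seniority": {"", "intern", "junior", "intermediate", "senior"},
    "japaneseRequired": {"", "japanese-required", "no-japanese-required"},
}

# Defaults copied from the input table.
DEFAULTS = {
    "scrapeMode": "both",
    "remotePolicy": "",
    "seniority": "",
    "japaneseRequired": "",
    "maxItems": 50,
}


def validate_input(run_input: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    merged = {**DEFAULTS, **run_input}
    errors: List[str] = []

    for field, allowed in ALLOWED_VALUES.items():
        value = merged[field]
        if not isinstance(value, str) or value not in allowed:
            errors.append(f"{field}: unexpected value {value!r}")

    max_items = merged["maxItems"]
    # bool is a subclass of int, so exclude it explicitly.
    # Assumption: a limit below 1 is treated as an error.
    if isinstance(max_items, bool) or not isinstance(max_items, int) or max_items < 1:
        errors.append(f"maxItems: expected a positive integer, got {max_items!r}")

    search_urls = merged.get("searchUrls")
    if search_urls is not None and not (
        isinstance(search_urls, list) and all(isinstance(u, str) for u in search_urls)
    ):
        errors.append("searchUrls: expected a list of URL strings")

    return errors


example_input = {
    "scrapeMode": "jobs",
    "remotePolicy": "fully-remote",
    "japaneseRequired": "no-japanese-required",
    "maxItems": 50,
    "proxyConfiguration": {"useApifyProxy": True, "apifyProxyGroups": ["RESIDENTIAL"]},
}
problems = validate_input(example_input)
```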

TokyoDev Scraper Output Fields

Job Listings

{
  "job_title": "Senior Rails Engineer",
  "company_name": "TableCheck",
  "company_url": "https://www.tablecheck.com",
  "location": "Tokyo",
  "job_type": "full-time",
  "seniority": "senior",
  "remote_policy": "partially-remote",
  "japanese_required": false,
  "apply_from_abroad": true,
  "salary_range": "8000000-14000000 JPY",
  "description": "TableCheck is looking for a senior Rails engineer...",
  "requirements": ["5+ years Rails experience", "Experience with PostgreSQL"],
  "tags": ["Ruby", "Rails", "PostgreSQL", "React"],
  "apply_url": "https://www.tablecheck.com/jobs/apply/rails-engineer",
  "posted_date": "2025-03-20",
  "job_url": "https://www.tokyodev.com/companies/tablecheck/jobs/senior-rails-engineer"
}

Field	Type	Description
`job_title`	string	Job title
`company_name`	string	Hiring company name
`company_url`	string	Company website URL
`location`	string	Job location (e.g. Tokyo, Remote, Osaka)
`job_type`	string	Employment type: full-time, contract, intern
`seniority`	string	Seniority level: junior, intermediate, senior
`remote_policy`	string	Remote work policy: fully-remote, partially-remote, no-remote
`japanese_required`	boolean	Whether Japanese language proficiency is required
`apply_from_abroad`	boolean	Whether candidates can apply from outside Japan
`salary_range`	string	Salary range if disclosed
`description`	string	Full job description text
`requirements`	array	Job requirements and qualifications
`tags`	array	Technology and skill tags (e.g. Ruby, Python, React)
`apply_url`	string	Direct URL to apply for the position
`posted_date`	string	Date the job was posted
`job_url`	string	Full TokyoDev job listing URL
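
Because salary_range is returned as a string, salary analysis needs a parsing step. The sketch below assumes every disclosed range follows the "MIN-MAX CUR" shape of the sample record ("8000000-14000000 JPY"). Other shapes may exist in real data, so anything that does not match is returned as None rather than guessed at. SalaryRange and parse_salary_range are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class SalaryRange(NamedTuple):
    minimum: int
    maximum: int
    currency: str


# Matches "<digits>-<digits> <3-letter currency code>", e.g. "8000000-14000000 JPY".
_SALARY_PATTERN = re.compile(r"^\s*(\d+)\s*-\s*(\d+)\s+([A-Z]{3})\s*$")


def parse_salary_range(value: Optional[str]) -> Optional[SalaryRange]:
    """Parse a salary_range string; return None when absent or unrecognized."""
    if not value:
        return None
    match = _SALARY_PATTERN.match(value)
    if match is None:
        return None
    return SalaryRange(int(match.group(1)), int(match.group(2)), match.group(3))


parsed = parse_salary_range("8000000-14000000 JPY")
```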

Company Profiles

When scrapeMode is "companies" or "both", company records are written to the same dataset as job records. Company records populate company_name, company_url, description, location, tags, and job_url (set to the company profile URL); all job-specific fields are null. The example below omits the null fields for brevity.

{
  "company_name": "Mercari",
  "company_url": "https://www.mercari.com",
  "location": "Tokyo",
  "description": "Mercari is Japan's largest marketplace app...",
  "tags": ["Go", "Kotlin", "Swift", "React", "Kubernetes"],
  "job_url": "https://www.tokyodev.com/companies/mercari"
}
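
Since jobs and companies share one dataset, downstream code usually has to separate them first. A minimal sketch, assuming a company record can be recognized by a null job_title (one of the job-specific fields) and that company_name matches exactly between a job and its company profile. split_records and attach_jobs are hypothetical helpers, and the TableCheck company record is an illustrative stand-in.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def split_records(records: List[Record]) -> Tuple[List[Record], List[Record]]:
    """Split a mixed dataset into (jobs, companies) using job_title."""
    jobs = [r for r in records if r.get("job_title") is not None]
    companies = [r for r in records if r.get("job_title") is None]
    return jobs, companies


def attach_jobs(records: List[Record]) -> List[Record]:
    """Return company records with an added open_jobs list of job titles."""
    jobs, companies = split_records(records)
    titles_by_company: Dict[str, List[str]] = defaultdict(list)
    for job in jobs:
        titles_by_company[job["company_name"]].append(job["job_title"])
    return [
        {**company, "open_jobs": titles_by_company.get(company["company_name"], [])}
        for company in companies
    ]


# Trimmed records for illustration; real records carry every field.
records = [
    {"job_title": "Senior Rails Engineer", "company_name": "TableCheck",
     "job_url": "https://www.tokyodev.com/companies/tablecheck/jobs/senior-rails-engineer"},
    {"job_title": None, "company_name": "TableCheck",
     "job_url": "https://www.tokyodev.com/companies/tablecheck"},
    {"job_title": None, "company_name": "Mercari",
     "job_url": "https://www.tokyodev.com/companies/mercari"},
]
joined = attach_jobs(records)
```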

🔍 FAQ

How do I scrape TokyoDev.com?

TokyoDev Scraper handles sitemap discovery automatically. Set scrapeMode to "jobs", "companies", or "both", apply any filters you need, configure the residential proxy, and run it. For targeted runs, paste specific TokyoDev URLs into searchUrls to skip the sitemap phase entirely.

Does TokyoDev Scraper need proxies?

It does. TokyoDev puts a Cloudflare managed challenge in front of every page route except the sitemap. The scraper uses a Playwright browser with a residential proxy and anti-detection fingerprinting to get through. Because /sitemap.xml is served without a challenge, the scraper uses it for URL discovery without consuming proxy budget.
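
The sitemap-based discovery described here can be sketched as follows. This is an illustration under stated assumptions, not the scraper's own code: it assumes the sitemap follows the standard sitemaps.org format and that URLs have the path shapes seen in the sample output, /companies/<slug> for company profiles and /companies/<slug>/jobs/<job-slug> for job listings. classify_url and classify_sitemap are hypothetical names, and the /articles URL in the sample is made up.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def classify_url(url: str) -> Optional[str]:
    """Return "job", "company", or None based on the URL path shape."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) == 2 and parts[0] == "companies":
        return "company"
    if len(parts) == 4 and parts[0] == "companies" and parts[2] == "jobs":
        return "job"
    return None


def classify_sitemap(xml_text: str) -> Dict[str, List[str]]:
    """Group every <loc> entry of a sitemap into job and company URLs."""
    buckets: Dict[str, List[str]] = {"job": [], "company": []}
    root = ET.fromstring(xml_text)
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        kind = classify_url(url)
        if kind is not None:
            buckets[kind].append(url)
    return buckets


# Made-up sitemap excerpt for illustration.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.tokyodev.com/companies/mercari</loc></url>
  <url><loc>https://www.tokyodev.com/companies/tablecheck/jobs/senior-rails-engineer</loc></url>
  <url><loc>https://www.tokyodev.com/articles</loc></url>
</urlset>"""

discovered = classify_sitemap(SAMPLE_SITEMAP)
```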

What data can I get from TokyoDev.com?

TokyoDev Scraper returns job titles, companies, locations, employment types, seniority levels, remote policies, Japanese language requirements, apply-from-abroad flags, salary ranges, descriptions, requirements lists, technology tags, apply URLs, and posting dates. Company profiles include the company description, location, and tech stack tags.

Can I filter for jobs that don't require Japanese?

Set japaneseRequired to "no-japanese-required". TokyoDev Scraper applies the filter before saving records, so only matching results land in the dataset — you don't have to filter downstream.
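
Conceptually, a pre-save filter maps each input value onto a check against the extracted record. The sketch below shows one way that mapping could look. matches_filters is a hypothetical helper, and the way string filter values map to the boolean japanese_required field is an assumption based on the field descriptions.

```python
from typing import Any, Dict

# Assumed mapping from the japaneseRequired input value to the boolean output field.
JAPANESE_FILTER = {"japanese-required": True, "no-japanese-required": False}


def matches_filters(record: Dict[str, Any], run_input: Dict[str, Any]) -> bool:
    """Return True when a job record passes every non-empty filter."""
    remote = run_input.get("remotePolicy", "")
    if remote and record.get("remote_policy") != remote:
        return False

    seniority = run_input.get("seniority", "")
    if seniority and record.get("seniority") != seniority:
        return False

    japanese = run_input.get("japaneseRequired", "")
    # Unknown filter values raise KeyError here instead of passing silently.
    if japanese and record.get("japanese_required") is not JAPANESE_FILTER[japanese]:
        return False

    return True


# Trimmed version of the sample job record.
job = {
    "job_title": "Senior Rails Engineer",
    "seniority": "senior",
    "remote_policy": "partially-remote",
    "japanese_required": False,
}
keep = matches_filters(job, {"japaneseRequired": "no-japanese-required"})
```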

How much does TokyoDev Scraper cost to run?

TokyoDev Scraper uses pay-per-event pricing. Because it needs a browser with a residential proxy for each page, the cost per record is higher than for plain HTTP scrapers. Running the full board (~182 jobs + ~232 companies) typically costs a few dollars, depending on proxy consumption.

Need More Features?

Need scheduled runs, webhook delivery, or fields not currently extracted? File an issue or get in touch.

Why Use TokyoDev Scraper?

Structured language and remote data — japanese_required and remote_policy are extracted as typed fields, not buried in description text, so your filters work without NLP preprocessing
Dual-mode output — Jobs and company profiles in a single run with a shared schema, so you can join them by company_name without running two separate scrapers
Cloudflare-resilient by design — A residential proxy with browser fingerprinting handles Cloudflare without manual intervention; discovering URLs through the unchallenged sitemap keeps that step cheap
