OrbTop

Preply Tutor Directory Scraper

Scrape the Preply tutor directory for all subjects and languages. Preply is one of the world's largest online tutoring marketplaces with 40,000+ tutors across 179 subjects. This actor extracts structured tutor profiles from Preply's listing pages without requiring individual profile page visits.

What data does it extract?

Each tutor record includes:

Field                     Description
tutor_id                  Unique Preply tutor ID
name                      Tutor display name (first name + last initial)
subjects_taught           List of subjects taught (e.g. English, Spanish, Math)
native_language           Tutor's native language(s)
country_of_birth          Country of birth
hourly_rate_usd           Hourly lesson rate in USD
trial_lesson_price_usd    Trial lesson price in USD
rating                    Average rating (0–5 scale)
reviews_count             Total number of student reviews
lessons_count             Total number of lessons completed
languages_spoken          Languages spoken with proficiency level (e.g. English (Native))
profile_url               Full URL to the tutor's Preply profile
badges                    Achievement badges (e.g. Super Tutor, Top 5)

How does it work?

The actor reads the __NEXT_DATA__ JSON that Preply's Next.js front end embeds in each listing page, so no JavaScript rendering is required. It:

  1. Loads the English tutors seed page to discover all 179 subject slugs
  2. Paginates through each subject's tutor listing (e.g. /en/online/english-tutors?page=N)
  3. Extracts all tutor fields directly from the embedded JSON — no per-profile page visits needed
  4. Respects maxItems to cap the result count
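
The extraction step above can be sketched in Python as follows. This is a minimal illustration, not the actor's actual code: the function names extract_next_data and listing_url are hypothetical, the https://preply.com host is taken from the example profile_url, and the internal layout of the JSON payload is not documented here, so the sketch stops at parsing it.

```python
import json
import re

# Matches the <script id="__NEXT_DATA__" ...> tag that Next.js embeds in a page.
NEXT_DATA_RE = re.compile(
    r'<script[^>]*\bid="__NEXT_DATA__"[^>]*>(.*?)</script>',
    re.DOTALL,
)


def extract_next_data(html: str) -> dict:
    """Return the parsed __NEXT_DATA__ payload from a page's HTML."""
    match = NEXT_DATA_RE.search(html)
    if match is None:
        raise ValueError("__NEXT_DATA__ script tag not found")
    return json.loads(match.group(1))


def listing_url(subject_slug: str, page: int) -> str:
    """Build a listing URL following the /en/online/<slug>?page=N pattern."""
    return f"https://preply.com/en/online/{subject_slug}?page={page}"
```

Because the payload is plain JSON inside the HTML, a single HTTP GET per listing page is enough; no headless browser is involved.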

Input

Parameter    Type       Default    Description
maxItems     integer    15         Maximum number of tutor records to scrape
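
One way to picture how a maxItems cap behaves is the sketch below. It assumes tutor records arrive as a lazy stream; cap_items and fake_pages are hypothetical names used only for illustration, and the actor's real implementation may differ.

```python
from itertools import islice
from typing import Iterable, Iterator


def cap_items(records: Iterable[dict], max_items: int) -> Iterator[dict]:
    """Yield at most max_items records and stop consuming the source early."""
    return islice(records, max_items)


def fake_pages() -> Iterator[dict]:
    # Stand-in for a real crawl, which would yield tutors page by page.
    for i in range(1000):
        yield {"tutor_id": str(i)}


sample = list(cap_items(fake_pages(), 15))
print(len(sample))  # 15
```

With a lazy source, the cap also limits how many listing pages are requested, since nothing past the last needed record is ever pulled.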

Output

Results are saved to the default dataset. Example record:

{
  "tutor_id": "311304",
  "name": "Louise M.",
  "subjects_taught": ["English"],
  "native_language": "English",
  "country_of_birth": "United Kingdom",
  "hourly_rate_usd": 36,
  "trial_lesson_price_usd": 18,
  "rating": 4.99,
  "reviews_count": 25,
  "lessons_count": 3275,
  "languages_spoken": ["French (C2)", "Arabic (C2)", "English (Native)", "Spanish (A1)", "Italian (A1)"],
  "profile_url": "https://preply.com/en/tutor/311304",
  "badges": []
}
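
Downstream code often needs the language and the proficiency level as separate values. The sketch below splits languages_spoken entries of the form shown in the example record; parse_language is a hypothetical helper, and the "Name (Level)" pattern is assumed from that single example.

```python
import re
from typing import Optional, Tuple

# "French (C2)" -> language "French", level "C2"
LEVEL_RE = re.compile(r"^(?P<language>.+?)\s*\((?P<level>[^()]+)\)$")


def parse_language(entry: str) -> Tuple[str, Optional[str]]:
    """Split an entry into (language, level); level is None if absent."""
    entry = entry.strip()
    match = LEVEL_RE.match(entry)
    if match is None:
        return entry, None
    return match.group("language"), match.group("level")


spoken = ["French (C2)", "Arabic (C2)", "English (Native)"]
print([parse_language(s) for s in spoken])
```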

Use cases

  • Language school lead generation — identify tutors by subject and native language for outreach
  • Pricing intelligence — benchmark hourly rates across subjects and countries of origin
  • Marketplace research — analyze tutor supply, rating distributions, and lesson volume by subject
  • Competitor analysis — track tutor counts and quality signals across Preply's subject catalog
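
As an illustration of the pricing use case, the following sketch computes the median hourly rate per country of birth from a list of output records. The function name median_rate_by_country and the sample rows are made up for the example; only the field names come from the output schema.

```python
from collections import defaultdict
from statistics import median


def median_rate_by_country(records):
    """Group hourly_rate_usd by country_of_birth and return the medians."""
    rates = defaultdict(list)
    for record in records:
        rate = record.get("hourly_rate_usd")
        country = record.get("country_of_birth")
        # Skip records where either field is missing.
        if rate is not None and country:
            rates[country].append(rate)
    return {country: median(values) for country, values in rates.items()}


# Illustrative rows, not real scraped data.
rows = [
    {"country_of_birth": "United Kingdom", "hourly_rate_usd": 36},
    {"country_of_birth": "United Kingdom", "hourly_rate_usd": 20},
    {"country_of_birth": "Spain", "hourly_rate_usd": 15},
]
print(median_rate_by_country(rows))
```

The median is used rather than the mean so that a few very expensive tutors do not skew the benchmark.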

Notes

  • Prices are normalized to USD using Preply's seoPrice field (always factor=1)
  • 179 subjects are crawled; tutors teaching multiple subjects appear on multiple subject listing pages
  • Set a low maxItems (e.g. 100–1000) for targeted sampling
  • robots.txt specifies crawl-delay: 10; the actor respects this with conservative concurrency
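
Two of the notes above lend themselves to a short sketch: reading a crawl delay from robots.txt, and collapsing tutors that show up under several subjects. Both helpers are hypothetical, and the notes do not say whether the actor de-duplicates on its own, so the second function is shown as something a consumer of the dataset could do. The sample robots.txt lines are illustrative; only the delay value of 10 comes from the notes.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt content, not a copy of Preply's actual file.
SAMPLE_ROBOTS = ["User-agent: *", "Crawl-delay: 10"]


def crawl_delay_seconds(robots_lines, user_agent="*", default=0):
    """Return the Crawl-delay for user_agent, or default if none is set."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    delay = parser.crawl_delay(user_agent)
    return default if delay is None else delay


def dedupe_by_tutor_id(records):
    """Yield each tutor once, keeping the first record seen per tutor_id."""
    seen = set()
    for record in records:
        if record["tutor_id"] not in seen:
            seen.add(record["tutor_id"])
            yield record


print(crawl_delay_seconds(SAMPLE_ROBOTS))  # 10
```

A fetch loop honoring this value would pause for that many seconds between requests to the same host.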