OrbTop

DIO.me Brazil Coding Bootcamp & Courses Scraper

EDUCATION · LEAD GENERATION · DEVELOPER TOOLS

Scrapes all courses, bootcamps, and learning tracks from DIO.me — Brazil's largest free developer education platform with over 9 million registered students and enterprise-sponsored bootcamps from companies including Itaú, XP Inc., Vivo, Globo, and Claro.

What it does

This actor walks DIO.me's sitemap to discover all available courses and bootcamp tracks, then extracts structured data from the Next.js __NEXT_DATA__ JSON embedded in each page. No headless browser such as Playwright is required, because DIO.me serves fully server-side-rendered HTML.
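The two steps described above, sitemap discovery and __NEXT_DATA__ extraction, can be sketched with the Python standard library alone. This is an illustration rather than the actor's actual source: the function names, the sample sitemap, and the sample page are invented, and the contents of props.pageProps on real DIO.me pages are an assumption.

```python
import json
import re
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Next.js embeds its page data in a <script id="__NEXT_DATA__"> tag.
NEXT_DATA_RE = re.compile(
    r'<script[^>]*\bid="__NEXT_DATA__"[^>]*>(.*?)</script>',
    re.DOTALL,
)


def sitemap_urls(sitemap_xml: str) -> list[str]:
    """Return every <loc> URL found in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


def extract_next_data(html: str) -> dict:
    """Parse the embedded __NEXT_DATA__ JSON out of a server-rendered page."""
    match = NEXT_DATA_RE.search(html)
    if match is None:
        raise ValueError("no __NEXT_DATA__ script tag found")
    return json.loads(match.group(1))


# Invented inputs, for illustration only.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.dio.me/courses/metodos-e-gems</loc></url>
</urlset>"""

SAMPLE_HTML = (
    '<html><body><script id="__NEXT_DATA__" type="application/json">'
    '{"props": {"pageProps": {"title": "Métodos e Gems"}}}'
    "</script></body></html>"
)

urls = sitemap_urls(SAMPLE_SITEMAP)
page_data = extract_next_data(SAMPLE_HTML)
print(urls)
print(page_data["props"]["pageProps"]["title"])
```

A regular expression is enough here because the script tag holds a single JSON document; a full HTML parser would be the more robust choice if the markup varies between pages.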

Covers two content types:

  • Courses — individual video modules (typically 1–4 hours) with instructor names, skills taught, and lesson syllabus
  • Bootcamps / Tracks — multi-week programs (often enterprise-sponsored) with student counts, start/end dates, and career path classification

Why it matters

DIO.me occupies a unique data niche: sponsor-company attribution. Every bootcamp lists the Brazilian enterprise that funded it (bank, telco, media company, etc.) alongside participant counts. This data is unavailable from Alura, Rocketseat, or Platzi (all paid-tier-only platforms). It is directly useful for:

  • B2B ed-tech and enterprise L&D vendors — which Brazilian companies are investing in developer upskilling, and in which tech stacks?
  • Tech talent market research — which skills are most in-demand in the Brazilian market?
  • Competitive intelligence — track DIO's course catalog growth and sponsor relationships over time
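As a rough illustration of the first two questions, the sketch below groups track records by sponsor_company, summing student_count and counting the skills listed in skills_taught. The records are invented for the example and are not real DIO.me figures; only the field names come from the output schema.

```python
from collections import Counter, defaultdict

# Invented records for illustration; real values come from the actor's dataset.
records = [
    {"type": "bootcamp", "sponsor_company": "Itaú",
     "student_count": 1000, "skills_taught": "Java|Spring"},
    {"type": "bootcamp", "sponsor_company": "Itaú",
     "student_count": 500, "skills_taught": "Java|AWS"},
    {"type": "bootcamp", "sponsor_company": "Vivo",
     "student_count": 800, "skills_taught": "Python"},
    {"type": "course", "sponsor_company": None,
     "student_count": None, "skills_taught": "Ruby"},
]

students_by_sponsor = Counter()
skills_by_sponsor = defaultdict(Counter)

for rec in records:
    sponsor = rec.get("sponsor_company")
    if not sponsor:
        continue  # courses carry no sponsor, so skip them
    students_by_sponsor[sponsor] += rec.get("student_count") or 0
    for skill in (rec.get("skills_taught") or "").split("|"):
        skill = skill.strip()
        if skill:
            skills_by_sponsor[sponsor][skill] += 1

for sponsor, students in students_by_sponsor.most_common():
    top_skills = [name for name, _ in skills_by_sponsor[sponsor].most_common(3)]
    print(sponsor, students, top_skills)
```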

Input

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| maxItems | integer | 10 | Maximum number of records to return. Set to 0 for a full run (about 739 courses + 36 tracks as of June 2026). |
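One way to read the maxItems rule, where 0 means "no limit", is sketched below. The helper name limit_items is hypothetical and the actor may apply the limit differently, for example after extraction rather than on the URL list.

```python
from itertools import islice
from typing import Iterable, Iterator


def limit_items(urls: Iterable[str], max_items: int = 10) -> Iterator[str]:
    """Yield at most max_items URLs; max_items == 0 means no limit."""
    if max_items < 0:
        raise ValueError("max_items must be >= 0")
    # islice treats a stop value of None as "run to the end".
    stop = None if max_items == 0 else max_items
    return islice(urls, stop)


discovered = [f"https://www.dio.me/courses/example-{i}" for i in range(25)]
print(len(list(limit_items(discovered))))      # default limit of 10
print(len(list(limit_items(discovered, 0))))   # full run
```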

Output fields

| Field | Type | Description |
| --- | --- | --- |
| id | string | DIO internal UUID |
| title | string | Course or bootcamp title |
| slug | string | URL slug |
| type | string | course, bootcamp, or formacao |
| category | string | Career category (e.g. Back-end Developer) |
| language | string | Always pt-BR |
| duration_hours | number | Estimated duration in hours |
| is_free | boolean | Whether content is free to access |
| sponsor_company | string | Enterprise sponsor name (tracks only) |
| instructor_names | string | Comma-separated instructor names |
| student_count | number | Total students enrolled (tracks only) |
| syllabus_modules | string | Pipe-separated lesson names (courses only) |
| skills_taught | string | Pipe-separated technology skills |
| certification_offered | boolean | Whether a certificate is offered (DIO issues certificates for all content) |
| start_date | string | Bootcamp start date (ISO-8601, tracks only) |
| end_date | string | Bootcamp end date (ISO-8601, tracks only) |
| description_html | string | Full HTML description |
| description_text | string | Plain-text description |
| thumbnail_url | string | Course badge or track preview image URL |
| source_url | string | Canonical URL on dio.me |
| scrapedAt | string | ISO-8601 scrape timestamp |
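Because instructor_names, syllabus_modules, and skills_taught arrive as delimited strings, consumers will usually want to split them into lists. The sketch below shows one way to do that; split_field and normalize are hypothetical helper names, and the record is invented. Note that a comma split would mis-handle any instructor name that itself contains a comma.

```python
def split_field(value, sep):
    """Split a delimited string into a clean list; None or "" gives []."""
    if not value:
        return []
    return [part.strip() for part in value.split(sep) if part.strip()]


def normalize(record: dict) -> dict:
    """Return a copy of the record with delimited fields turned into lists."""
    out = dict(record)
    out["instructor_names"] = split_field(record.get("instructor_names"), ",")
    out["skills_taught"] = split_field(record.get("skills_taught"), "|")
    out["syllabus_modules"] = split_field(record.get("syllabus_modules"), "|")
    return out


# Invented record for illustration only.
raw = {
    "title": "Example course",
    "instructor_names": "Ana Silva, João Souza",
    "skills_taught": "Ruby|Rails",
    "syllabus_modules": None,
}
clean = normalize(raw)
print(clean["instructor_names"])
print(clean["skills_taught"])
print(clean["syllabus_modules"])
```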

Sample output (course, some fields omitted)

{
  "id": "2290d641-d0ef-489f-aeb5-733ba49da653",
  "title": "Métodos e Gems",
  "slug": "metodos-e-gems",
  "type": "course",
  "category": "Back-end Developer",
  "language": "pt-BR",
  "duration_hours": 1,
  "is_free": true,
  "sponsor_company": null,
  "instructor_names": "Tenille Martins",
  "skills_taught": "Ruby",
  "certification_offered": true,
  "source_url": "https://www.dio.me/courses/metodos-e-gems",
  "scrapedAt": "2026-06-12T09:00:00.000Z"
}
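A record like the one above can be loaded with the standard json module, and its scrapedAt value converted to a timezone-aware datetime. This is a minimal consumer-side sketch; parse_timestamp is a hypothetical helper, and the JSON below repeats only a subset of the sample's fields.

```python
import json
from datetime import datetime, timezone

SAMPLE = """
{
  "id": "2290d641-d0ef-489f-aeb5-733ba49da653",
  "title": "Métodos e Gems",
  "type": "course",
  "duration_hours": 1,
  "is_free": true,
  "sponsor_company": null,
  "source_url": "https://www.dio.me/courses/metodos-e-gems",
  "scrapedAt": "2026-06-12T09:00:00.000Z"
}
"""


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO-8601 timestamp that ends in 'Z' into an aware datetime."""
    # datetime.fromisoformat() accepts a trailing "Z" only from Python 3.11,
    # so rewrite it as an explicit UTC offset for older versions.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


record = json.loads(SAMPLE)
scraped_at = parse_timestamp(record["scrapedAt"])

print(record["title"], record["is_free"], record["sponsor_company"])
print(scraped_at.astimezone(timezone.utc).isoformat())
```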

Notes

  • DIO.me has no anti-bot protection. Direct HTTP fetches succeed without a proxy.
  • The sitemap contains ~739 course URLs and ~36 track URLs as of June 2026.
  • Each course has a subscription_type of either dev (DIO's own content) or premium. The is_free field reflects this distinction.
  • Track-level data (bootcamps) includes sponsor company and student counts; course-level data includes per-lesson syllabus and instructor details.
  • Content is entirely in Brazilian Portuguese (pt-BR).
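The first and third notes can be sketched as follows. Both functions are hypothetical: fetch_html shows a plain GET with the standard library and an invented User-Agent string, and derive_is_free assumes that subscription_type dev means free and premium means paid, a mapping the notes imply but do not state outright.

```python
import urllib.request


def fetch_html(url: str, timeout: float = 30.0) -> str:
    """Fetch a page with a plain HTTP GET; no headless browser involved."""
    request = urllib.request.Request(
        url,
        headers={"User-Agent": "dio-catalog-example/0.1"},  # invented value
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def derive_is_free(subscription_type):
    """Map subscription_type to is_free; None means the type is unknown."""
    if subscription_type is None:
        return None
    # ASSUMPTION: "dev" marks free content and anything else is paid.
    return subscription_type == "dev"


print(derive_is_free("dev"))
print(derive_is_free("premium"))
```

Calling fetch_html on a course URL would return the page HTML that carries the __NEXT_DATA__ script tag; it is left uncalled here so the example runs without network access.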