OrbTop

NTD Classical Chinese Dance Competition Scraper

Extracts competition news, contestant highlights, and announcements from the website of the NTD International Classical Chinese Dance Competition, the world's only globally televised classical Chinese dance competition, held biennially in New York since 2007.

What it does

Connects to the competitions.ntdtv.com WordPress REST API and collects posts and pages across three language variants:

  • Traditional Chinese (zh-hant) — competition news and dancer spotlights
  • Simplified Chinese (zh-hans) — competition news and media coverage
  • English — dance section static pages (rules, registration, history)
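
As a rough illustration of the collection step, the sketch below builds paginated listing URLs for the posts and pages endpoints. It assumes the site exposes the default WordPress REST root (/wp-json/wp/v2) over HTTPS; the category ID in the usage note is hypothetical, since the actual IDs are not listed here.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: default WordPress REST root on the competition site.
API_ROOT = "https://competitions.ntdtv.com/wp-json/wp/v2"


def build_list_url(resource: str, page: int = 1, per_page: int = 100,
                   category_id: Optional[int] = None) -> str:
    """Build a paginated listing URL for 'posts' or 'pages'."""
    # WordPress caps per_page at 100, so larger sets need several pages.
    params = {"page": page, "per_page": per_page}
    if category_id is not None:
        params["categories"] = category_id  # only meaningful for posts
    return f"{API_ROOT}/{resource}?{urlencode(params)}"


def fetch_page(url: str) -> list:
    """Download one page of results and decode the JSON array."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)
```

For example, build_list_url("posts", page=2, category_id=7) yields a URL for the second page of a (hypothetical) category 7, which fetch_page can then download.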

Each record includes the post title, publication date, category label, body HTML and text, winner/rank extraction (where available), division, and a link back to the source article on ntdtv.com or epochtimes.com.
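
A minimal sketch of how one API response item might be mapped to a record. It relies on the standard WordPress post fields (id, slug, title.rendered, date, content.rendered, link); the helper names strip_html and to_record are illustrative, and using link for source_url is an assumption rather than the scraper's confirmed behaviour.

```python
from html import unescape
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML fragment, ignoring tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(html: str) -> str:
    """Return plain text with tags removed and whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def to_record(post: dict, category: str) -> dict:
    """Map a WordPress post object to a subset of the output fields."""
    body_html = post.get("content", {}).get("rendered", "")
    return {
        "post_id": post["id"],
        "post_slug": post["slug"],
        "title": unescape(post["title"]["rendered"]),
        "category": category,
        "publish_date": post["date"][:10],  # keep the YYYY-MM-DD part
        "body_html": body_html,
        "body_text": strip_html(body_html),
        "source_url": post["link"],  # assumption: permalink as source
    }
```

Text nodes are joined with a space, so adjacent block elements do not run together; the trade-off is that a word split by inline tags would gain a space.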

Output fields

Field           Description
post_id         WordPress post ID
post_slug       URL slug
title           Post or page title
category        Category label (e.g. classical-dance-zh-hant, competition-news-zh-hans)
division        Detected competition division (adult-male, adult-female, junior, etc.)
publish_date    Publication date (YYYY-MM-DD)
body_html       Raw HTML content
body_text       Stripped plain text
winner_name     Extracted winner name (when detectable from title/body)
winner_country  Winner country (where available)
rank            gold / silver / bronze / honorable-mention (when detectable)
year            Competition year
source_url      Canonical URL of the post or source article
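
The rank and year fields are described as detected from the title or body. One plausible way to do that is keyword matching, sketched below. The keyword lists (English terms plus the Traditional and Simplified Chinese award names) and the function names are assumptions for illustration, not the scraper's actual rules.

```python
import re
from typing import Optional

# Assumed keyword lists; the real detection rules are not documented.
RANK_KEYWORDS = {
    "gold": ("gold", "金獎", "金奖"),
    "silver": ("silver", "銀獎", "银奖"),
    "bronze": ("bronze", "銅獎", "铜奖"),
    "honorable-mention": ("honorable mention", "優秀獎", "优秀奖"),
}


def detect_rank(text: str) -> Optional[str]:
    """Return the first rank whose keyword occurs in the text, else None."""
    lowered = text.lower()
    for rank, keywords in RANK_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return rank
    return None


def detect_year(text: str) -> Optional[int]:
    """Return the first standalone four-digit year in the 2000s, else None."""
    match = re.search(r"(?<!\d)(20\d{2})(?!\d)", text)
    return int(match.group(1)) if match else None
```

Because ranks are checked in a fixed order, a text that mentions several awards resolves to the highest one listed; plain substring matching can also produce false positives, which is one reason these fields are only filled "when detectable".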

Input

Field     Default  Description
maxItems  10       Maximum number of records to return
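
To illustrate how the maxItems limit and its default of 10 could be applied, here is a small sketch. The function name limit_records and the idea of passing the input as a plain dictionary are assumptions made for the example.

```python
from itertools import islice
from typing import Iterable, List

DEFAULT_MAX_ITEMS = 10  # default taken from the input table


def limit_records(records: Iterable, run_input: dict) -> List:
    """Return at most maxItems records, falling back to the default."""
    max_items = int(run_input.get("maxItems", DEFAULT_MAX_ITEMS))
    # islice stops consuming the source once the cap is reached.
    return list(islice(records, max_items))
```

Called with an empty input, limit_records returns the first 10 records; with {"maxItems": 3} it returns the first 3.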

Notes

  • No proxy required — the API is publicly accessible
  • Deduplicates records across language queries (a post can appear under multiple category IDs)
  • Most posts are article stubs that link to the full content on ntdtv.com or epochtimes.com; body_text captures only the content available on the competition site itself
  • The dance section has about 370 records in total (zh-hant: ~244, zh-hans: ~121, pages: 8)
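
The deduplication mentioned in the notes can be sketched as a single pass keyed on post_id, keeping the first occurrence of each post. This is an assumed approach; the scraper may use a different key or strategy.

```python
from typing import Iterable, List


def dedupe_by_post_id(records: Iterable[dict]) -> List[dict]:
    """Keep the first record seen for each post_id, preserving order."""
    seen = set()
    unique = []
    for record in records:
        post_id = record["post_id"]
        if post_id in seen:
            continue  # same post returned under another category
        seen.add(post_id)
        unique.append(record)
    return unique
```

With this approach, a post returned under two category IDs keeps the category label from whichever query reached it first.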