OrbTop

ArchDaily Architecture Projects Scraper

Categories: E-commerce, Business

Scrape architectural project data from ArchDaily, the world's most visited architecture website. The scraper extracts comprehensive project metadata, including location, team credits, images, architectural drawings, and publication details.

What You Get

Each result is a structured record for one architectural project:

| Field | Description |
|---|---|
| project_id | Numeric ArchDaily project ID |
| project_url | Canonical ArchDaily project URL |
| project_title | Full project name |
| architect_firm | Primary architecture firm |
| architect_firm_url | ArchDaily profile URL for the firm |
| architect_country | Country of the architecture firm |
| project_year | Year of completion |
| project_location_country | Country where the project is located |
| project_location_city | City of the project |
| project_location_address | Street address |
| latitude | GPS latitude |
| longitude | GPS longitude |
| building_type | Primary category (e.g. Cultural, Residential, Educational) |
| typology_tags | Comma-separated typology tags |
| area_sqm | Floor area in square metres |
| area_sqft | Floor area in square feet |
| photographer_name | Primary photo credit |
| photographer_url | ArchDaily photographer profile URL |
| lead_architects | Named lead architects |
| design_team | Full design team credits |
| clients | Client name(s) |
| structural_engineers | Structural engineering firm(s) |
| mechanical_engineers | Mechanical / services engineering |
| landscape_architects | Landscape architecture firm(s) |
| manufacturers | Product and manufacturer credits |
| awards | Awards listed for the project |
| publication_date | ISO date published on ArchDaily |
| text_summary | First editorial paragraph |
| image_urls | JSON array of photo image URLs |
| drawing_urls | JSON array of drawing URLs (plans, sections, elevations) |
| num_images | Total number of photos |
| num_drawings | Total number of architectural drawings |
| num_comments | Reader comment count |
| num_saves | Times the project was saved |
| scrapedAt | ISO-8601 scrape timestamp |

Input Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| maxItems | integer | none | Maximum number of projects to collect (required) |
| startPage | integer | 1 | Listing page to start from |
| endPage | integer | none | Last listing page to scrape (optional) |
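
As a rough illustration of how these parameters fit together, the Python sketch below assembles an input object. The helper name `build_input` and its validation rules are assumptions made for illustration, not part of the actor; only the keys `maxItems`, `startPage`, and `endPage` come from the parameter list above.

```python
from typing import Optional


def build_input(max_items: int, start_page: int = 1,
                end_page: Optional[int] = None) -> dict:
    """Assemble an input object using the documented parameter names.

    Hypothetical helper: the checks below are assumptions, not actor behaviour.
    """
    if max_items < 1:
        raise ValueError("maxItems must be a positive integer")
    if start_page < 1:
        raise ValueError("startPage must be 1 or greater")
    if end_page is not None and end_page < start_page:
        raise ValueError("endPage must not be smaller than startPage")

    actor_input = {"maxItems": max_items, "startPage": start_page}
    if end_page is not None:
        # endPage is optional, so it is only included when given
        actor_input["endPage"] = end_page
    return actor_input


# Collect up to 200 projects from listing pages 5 through 10
print(build_input(200, start_page=5, end_page=10))
```

Leaving `end_page` unset omits `endPage` from the input entirely, which mirrors its optional status in the parameter list.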

Example Output

{
  "project_id": 612345,
  "project_url": "https://www.archdaily.com/612345/some-building-firm",
  "project_title": "Community Library / Studio Example",
  "architect_firm": "Studio Example Architecture",
  "architect_country": "Japan",
  "project_year": 2022,
  "project_location_country": "Japan",
  "project_location_city": "Tokyo",
  "building_type": "Cultural Architecture",
  "area_sqm": 1840,
  "area_sqft": 19805,
  "image_urls": "[\"https://images.adsttc.com/...\"]",
  "drawing_urls": "[\"https://images.adsttc.com/...-plan.jpg\"]",
  "num_images": 18,
  "num_drawings": 4,
  "publication_date": "2023-03-15",
  "scrapedAt": "2025-01-01T00:00:00.000Z"
}
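
Note that `image_urls` and `drawing_urls` appear in the example as JSON-encoded strings rather than native arrays. The following is a minimal Python sketch for decoding them, assuming each record has already been loaded as a dictionary. The helper name `decode_url_fields` is hypothetical.

```python
import json


def decode_url_fields(record: dict) -> dict:
    """Return a copy of the record with URL list fields decoded to lists."""
    decoded = dict(record)
    for field in ("image_urls", "drawing_urls"):
        value = decoded.get(field)
        if isinstance(value, str):
            # The field holds a JSON array serialised as a string
            decoded[field] = json.loads(value)
    return decoded


# Trimmed-down record based on the example output
sample = {
    "project_id": 612345,
    "image_urls": "[\"https://images.adsttc.com/...\"]",
    "num_images": 18,
}
print(decode_url_fields(sample)["image_urls"])
```

The helper skips fields that are missing or already decoded, so it should be safe to apply to partial records.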

Use Cases

  • Architecture research — analyse global trends in building typology, materials, and sustainable design across thousands of projects
  • Real estate intelligence — benchmark residential and commercial projects by size, location, and year
  • Academic datasets — build training datasets for architectural AI models linking images to project metadata
  • Firm profiling — map a firm's portfolio, geographic reach, and project history
  • Market analysis — track which building types and geographies receive the most coverage in international architecture media

Notes

  • Images and drawings are classified from filename keywords (supports English, Spanish, Polish, and German drawing names)
  • Area is normalised to both square metres and square feet; if the source uses imperial units, the actor converts automatically
  • The scraper paginates through ArchDaily's project listing pages sequentially; use startPage and endPage together to target a specific range
  • Concurrency is kept low (1 concurrent request, about 3 s between requests) so that the crawler places minimal load on ArchDaily's servers
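
The first two notes can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the actor's implementation: the keyword list, the rounding rule, and the names `classify_image` and `normalise_area` are all hypothetical, and the actor's real keyword set and rounding may differ.

```python
from urllib.parse import urlparse

# Illustrative keyword list (an assumption, not the actor's actual list)
DRAWING_KEYWORDS = (
    "plan", "section", "elevation",      # English
    "planta", "corte", "alzado",         # Spanish
    "rzut", "przekroj", "elewacja",      # Polish
    "grundriss", "schnitt", "ansicht",   # German
)

SQFT_PER_SQM = 10.7639  # approximate conversion factor


def classify_image(url: str) -> str:
    """Label a URL 'drawing' if its filename contains a drawing keyword."""
    filename = urlparse(url).path.rsplit("/", 1)[-1].lower()
    if any(keyword in filename for keyword in DRAWING_KEYWORDS):
        return "drawing"
    return "photo"


def normalise_area(value: float, unit: str) -> dict:
    """Return the area in both units, given a value in 'sqm' or 'sqft'."""
    if unit == "sqm":
        sqm = value
    elif unit == "sqft":
        sqm = value / SQFT_PER_SQM
    else:
        raise ValueError(f"unsupported unit: {unit}")
    # Rounding to whole units is an assumption made for this sketch
    return {"area_sqm": round(sqm), "area_sqft": round(sqm * SQFT_PER_SQM)}


print(classify_image("https://images.adsttc.com/media/foo-plan.jpg"))
print(normalise_area(100, "sqm"))
```

Plain substring matching keeps the sketch short but can misclassify filenames that merely contain a keyword, such as "plant", so a production classifier would likely need stricter token matching.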