OrbTop

Wildflower.org Native Plant Database Scraper

E-commerce, Education

Scrapes the Lady Bird Johnson Wildflower Center's NPIN (Native Plant Information Network), the canonical US native-plant database with ~9,000 species. Each record includes the scientific name, common names, USDA symbol, growing conditions, bloom data, wildlife value, pollinator and butterfly larval-host data, and commercial availability.

What does it scrape?

  • Source: wildflower.org/plants — the authoritative North American native plant reference maintained by the Lady Bird Johnson Wildflower Center at UT Austin
  • Coverage: ~9,000 native plant species (all plant families)
  • Data per plant: ~30 fields covering taxonomy, growing conditions, ecology, and horticultural use

Use Cases

  • Building native plant recommendation tools and apps
  • Ecological research and biodiversity analysis
  • Horticultural and landscaping databases
  • Pollinator habitat and butterfly larval host mapping
  • Retail nursery and seed catalog enrichment

Output Fields

| Field | Description |
| --- | --- |
| id_plant | NPIN plant symbol (USDA-aligned primary key, e.g. HEAN3) |
| scientific_name | Scientific name (genus + species) |
| common_names | Common names (comma-separated) |
| family | Plant family (e.g. Asteraceae) |
| genus | Genus |
| species | Species epithet |
| synonyms | Taxonomic synonyms (comma-separated) |
| usda_symbol | USDA PLANTS cross-walk symbol |
| usda_native_status | USDA native status codes by region (comma-separated) |
| plant_type | Plant habit/type (tree, shrub, forb-herb, grass, vine, cactus) |
| duration | Life duration: annual, biennial, perennial (comma-separated) |
| native_distribution | US states and Canadian provinces where native (comma-separated abbreviations) |
| native_habitat | Natural habitat description |
| bloom_color | Flower bloom color(s) (comma-separated) |
| bloom_time | Bloom months (comma-separated, e.g. Jun, Jul, Aug) |
| height | Plant height range |
| leaf_description | Leaf description |
| flower_description | Flower description |
| fruit_description | Fruit/seed description |
| sun_exposure | Sun requirement: sun, part shade, shade (comma-separated) |
| soil_moisture | Soil moisture preference: dry, moist, wet (comma-separated) |
| soil_description | Soil type and conditions description |
| cold_tolerant | Cold-tolerant flag (Yes/No) |
| heat_tolerant | Heat-tolerant flag (Yes/No) |
| drought_tolerant | Drought-tolerant flag (Yes/No) |
| water_use | Water use category (low, medium, high) |
| maintenance | Maintenance level |
| use_ornamental | Ornamental use notes |
| use_wildlife | Wildlife value notes (birds, mammals, insects supported) |
| use_pollinators | Pollinator value: bees, butterflies, hummingbirds (comma-separated) |
| butterfly_larval_host_for | Lepidoptera species for which this plant is a larval host (comma-separated) |
| conspicuous_flowers | Conspicuous flowers flag (Yes/No) |
| deer_resistant | Deer resistance rating |
| propagation | Propagation instructions |
| seed_collection | Seed collection notes |
| commercially_available | Commercially available flag (Yes/No) |
| image_urls | Image URLs from the NPIN gallery (comma-separated) |
| wildflower_url | Canonical NPIN detail page URL |
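
Many of the fields above are comma-separated strings or Yes/No flags, so downstream code will usually want to convert them to lists and booleans first. The following is a minimal sketch of that step, not part of the scraper itself. The helper name `normalize_record` and the two field groupings are illustrative assumptions derived from the field descriptions above. Plain comma splitting may mis-split values that contain commas themselves (for example some synonyms or URLs).

```python
from typing import Any

# Assumption: these groupings mirror the "comma-separated" and "Yes/No"
# notes in the field descriptions; adjust them if the real dataset differs.
LIST_FIELDS = {
    "common_names", "synonyms", "usda_native_status", "duration",
    "native_distribution", "bloom_color", "bloom_time", "sun_exposure",
    "soil_moisture", "use_pollinators", "butterfly_larval_host_for",
    "image_urls",
}
FLAG_FIELDS = {
    "cold_tolerant", "heat_tolerant", "drought_tolerant",
    "conspicuous_flowers", "commercially_available",
}


def normalize_record(record: dict[str, Any]) -> dict[str, Any]:
    """Return a copy with list fields split and Yes/No flags as booleans."""
    out: dict[str, Any] = {}
    for key, value in record.items():
        if key in LIST_FIELDS and isinstance(value, str):
            # "Jun, Jul" -> ["Jun", "Jul"]; empty pieces are dropped.
            out[key] = [part.strip() for part in value.split(",") if part.strip()]
        elif key in FLAG_FIELDS and isinstance(value, str):
            # Anything other than yes/no (e.g. an empty string) becomes None.
            out[key] = {"yes": True, "no": False}.get(value.strip().lower())
        else:
            out[key] = value
    return out
```

Unknown or missing flag values are mapped to `None` rather than `False`, so "not stated" stays distinguishable from an explicit "No".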

Input Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| maxItems | integer | Maximum number of plant records to return. Set to 0 for all ~9,000 plants. Default: 10. |
| families | array | Limit scraping to specific plant families (e.g. ["Asteraceae", "Fabaceae"]). Leave empty to scrape all families. |
| proxyConfiguration | object | Proxy settings. Residential proxy is required and auto-configured. |
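
As an illustration of how these parameters fit together, the sketch below assembles a run input as a plain dictionary and prints it as JSON. The helper `build_run_input` is a hypothetical name, not part of the scraper. It leaves out `proxyConfiguration` on the assumption that the auto-configured default is sufficient.

```python
import json
from typing import Any, Optional


def build_run_input(max_items: int = 10,
                    families: Optional[list[str]] = None) -> dict[str, Any]:
    """Build an input object using the documented parameter names."""
    if max_items < 0:
        raise ValueError("maxItems must be 0 (all plants) or a positive integer")
    return {
        "maxItems": max_items,             # 0 means "return everything"
        "families": list(families or []),  # empty list means "all families"
    }


if __name__ == "__main__":
    # Example: up to 50 records from two families.
    print(json.dumps(build_run_input(50, ["Asteraceae", "Fabaceae"]), indent=2))
```

The resulting dictionary can be pasted into the input editor as JSON or passed to whichever client is used to start the run.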

Example Output

{
  "id_plant": "HEAN3",
  "scientific_name": "Helianthus annuus",
  "common_names": "common sunflower",
  "family": "Asteraceae",
  "genus": "Helianthus",
  "species": "annuus",
  "usda_symbol": "HEAN3",
  "plant_type": "forb/herb",
  "duration": "annual",
  "bloom_color": "yellow",
  "bloom_time": "Jun, Jul, Aug, Sep",
  "native_distribution": "AZ, CA, CO, ID, KS, MN, MT, NE, NM, ND, OK, OR, SD, TX, UT, WA, WY",
  "sun_exposure": "sun",
  "soil_moisture": "dry, moist",
  "drought_tolerant": "Yes",
  "commercially_available": "Yes",
  "use_pollinators": "bees, butterflies",
  "wildflower_url": "https://www.wildflower.org/plants/result.php?id_plant=HEAN3"
}
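
A record shaped like the one above can be queried directly. The sketch below, offered as one possible approach, finds plants that are native to a given state and in bloom during a given month. The function names are illustrative assumptions, and the sample record is a subset of the example output.

```python
from typing import Any, Iterable


def split_field(value: Any) -> list[str]:
    """Split a comma-separated string field; missing values give an empty list."""
    if not isinstance(value, str):
        return []
    return [part.strip() for part in value.split(",") if part.strip()]


def native_and_blooming(records: Iterable[dict[str, Any]],
                        state: str, month: str) -> list[str]:
    """Return scientific names of plants native to `state` that bloom in `month`."""
    state_code = state.strip().upper()
    month_abbr = month.strip()[:3].lower()  # "July" and "Jul" both become "jul"
    matches = []
    for record in records:
        states = {s.upper() for s in split_field(record.get("native_distribution"))}
        months = {m[:3].lower() for m in split_field(record.get("bloom_time"))}
        if state_code in states and month_abbr in months:
            matches.append(record.get("scientific_name", ""))
    return matches


sample = {
    "scientific_name": "Helianthus annuus",
    "bloom_time": "Jun, Jul, Aug, Sep",
    "native_distribution": "AZ, CA, CO, ID, KS, MN, MT, NE, NM, ND, OK, OR, SD, TX, UT, WA, WY",
}
print(native_and_blooming([sample], "TX", "July"))  # ['Helianthus annuus']
```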

Notes

  • The scraper discovers all plant families automatically from the NPIN search index, then collects plant IDs from each family listing, and finally fetches each plant's detail page.
  • Use the families input to limit coverage to specific botanical families (faster runs for targeted research).
  • Setting maxItems to 0 returns all available plants. Expect a full run over the complete ~9,000-plant dataset to take several hours.
  • Residential proxy is required to access the site reliably.
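
The three-stage flow described in these notes (families, then plant IDs per family, then detail pages) can be sketched as a generator. This is an illustration of the control flow only, not the scraper's actual code. The three callables are hypothetical stand-ins for the network steps, and the de-duplication of plant IDs and case-insensitive family matching are assumptions.

```python
from typing import Callable, Iterable, Iterator, Optional


def crawl(list_families: Callable[[], Iterable[str]],
          list_plant_ids: Callable[[str], Iterable[str]],
          fetch_plant: Callable[[str], dict],
          families: Optional[list[str]] = None,
          max_items: int = 10) -> Iterator[dict]:
    """Yield plant records; max_items=0 means no limit."""
    wanted = {f.lower() for f in families} if families else None
    seen: set[str] = set()
    count = 0
    for family in list_families():                 # stage 1: discover families
        if wanted is not None and family.lower() not in wanted:
            continue
        for plant_id in list_plant_ids(family):    # stage 2: collect plant IDs
            if plant_id in seen:                   # assumption: skip repeated IDs
                continue
            seen.add(plant_id)
            yield fetch_plant(plant_id)            # stage 3: fetch the detail page
            count += 1
            if max_items and count >= max_items:
                return


# Usage with in-memory stand-ins; IDs other than HEAN3 are placeholders.
listing = {"Asteraceae": ["HEAN3", "PLANT_A"], "Fabaceae": ["PLANT_B"]}
records = crawl(lambda: listing, lambda fam: listing[fam],
                lambda pid: {"id_plant": pid}, families=["Fabaceae"], max_items=0)
print([r["id_plant"] for r in records])  # ['PLANT_B']
```

Because the limit is checked after each detail fetch, a small `maxItems` stops the crawl early instead of listing every family first.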