ALETHEIA vs PubChem vs EPA CompTox: which chemical safety API is right for you?
An honest comparison of three public chemical-data APIs on coverage, exposure contexts, regulator consensus, licensing, and use cases. We make one of these and we still think the other two are the right tool for some jobs — here's the actual map.
"Which chemical safety API should we use?" doesn't have a single answer because the three serious options solve different problems. PubChem is built for chemistry research. EPA CompTox is built for US environmental and occupational toxicology. ALETHEIA is built for marketplace screening and consumer-product safety. If you pick the wrong one for your use case, you'll do a lot of post-processing — or worse, miss what you were supposed to catch.
This page is the comparison we wish someone had written when we were evaluating what to build. We're biased — we make ALETHEIA — but we're trying to be honest about where the others win. (For the longer version of why we built this rather than using PubChem or CompTox ourselves, read Why we built ALETHEIA.)
The three APIs at a glance
Option 1
PubChem (NCBI / NLM)
The largest public chemistry database, hosted by the US National Library of Medicine. Free, no API key required for most endpoints, three API surfaces (PUG-REST, PUG-View, and the legacy PUG-SOAP interface), and an enormous compound catalog assembled from contributing data sources around the world.
- Coverage: ~119M compounds
- Cost: Free, public domain
- Auth: None required
- Primary use: Chemistry research
- Authority: NIH / NLM
- Returns regulator consensus? No
Best for: chemistry research, structure searches, identifier resolution (CAS ↔ InChI ↔ SMILES), bioactivity literature, anything where you need the largest possible compound corpus and you'll do the regulatory interpretation yourself.
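As a rough illustration of the identifier-resolution job, here is a minimal Python sketch against PUG-REST's name-to-property lookup. The requested properties and the helper names (`property_url`, `parse_properties`, `resolve`) are our own choices for illustration, and the sketch leaves out the rate limiting and error handling a real integration would need.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def property_url(name, properties=("InChI", "InChIKey", "MolecularFormula")):
    """Build a PUG-REST URL that looks a compound up by name."""
    encoded = quote(name, safe="")  # percent-encode spaces, slashes, etc.
    return f"{PUG_REST}/compound/name/{encoded}/property/{','.join(properties)}/JSON"


def parse_properties(payload):
    """Return the per-compound property rows from a PUG-REST JSON response."""
    return payload.get("PropertyTable", {}).get("Properties", [])


def resolve(name, timeout=10.0):
    """Network call: fetch identifiers for one compound name."""
    with urlopen(property_url(name), timeout=timeout) as resp:
        return parse_properties(json.load(resp))
```

Calling `resolve("aspirin")` should return a list of rows keyed by `CID` plus the requested properties. Nothing in those rows is a regulatory classification, which is the interpretation work you take on yourself with PubChem.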
Option 2
EPA CompTox Chemicals Dashboard (CCD)
The US Environmental Protection Agency's curated chemicals dataset, served via the CCD API. Strong on environmental and occupational toxicology, hazard predictions (QSAR models), and US-EPA-specific identifiers like DTXSID. Free with a registration key.
- Coverage: ~900K chemicals (DSSTox)
- Cost: Free, key required
- Auth: API key (registration)
- Primary use: Environmental / occupational
- Authority: US EPA
- Returns regulator consensus? No (US-EPA-centric)
Best for: environmental risk assessment, occupational exposure work, hazard prediction (QSAR), and any US-jurisdiction project where EPA is the primary authority you need to satisfy. Less useful if your customers are global or if non-US regulators (IARC, EU CLP, ECHA) matter to your decisions.
Option 3
ALETHEIA Safety Intelligence
A curated chemical and consumer-product safety API focused on marketplace screening and compliance workflows. Smaller corpus than the other two, but every record is normalized across 15+ regulators and rendered through 19 exposure contexts (8 human life-stages, 2 pets, 5 routes, 4 environmental). Paid; available on RapidAPI or direct.
- Coverage: 1,879 curated compounds + 959 materials + 1,262 products
- Cost: $0 / $29 / $99 / Custom
- Auth: API key (free tier available)
- Primary use: Marketplace screening, consumer-product compliance
- Authority: Holistic Quality LLC (aggregator)
- Returns regulator consensus? Yes (15+ agencies per record)
Best for: marketplaces screening seller-submitted SKUs, consumer brands doing compliance audits, applications that need life-stage-aware risk surfacing (kid products, pet products, prenatal/nursing-targeted SKUs), and anyone who needs a single API call to return cross-agency consensus instead of integrating 15+ separate regulator websites.
Side-by-side
| Feature | PubChem | EPA CompTox | ALETHEIA |
|---|---|---|---|
| Compound corpus | ~119M | ~900K | 1,879 curated |
| Cost model | Free | Free | Free tier + paid plans |
| Multi-regulator consensus per record | No | EPA-centric | 15+ regulators |
| Life-stage exposure contexts | No | No | 19 contexts (8 human + 2 pet + 5 route + 4 env) |
| IARC carcinogen Groups | Sometimes (data source dependent) | References, not normalized | Yes, normalized |
| EU CLP / ECHA REACH | No | No | Yes |
| CalEPA Prop 65 listings | No | Cross-references | Yes, with NSRLs / MADLs |
| Batch lookup endpoint | PUG-REST limited | Limited | Yes, 10/50/100 by tier |
| Embeddable safety badges | No | No | Yes (SVG) |
| Watchlist + webhooks | No | No | Yes |
| Commercial-reuse licensing | Public domain (data sources vary) | Government data (US) | Cleared for commercial reuse |
| SLA / uptime commitment | None | None | 99.9% target on paid tiers |
| Latency on cached lookups | Variable | Variable | Typically <200ms |
When to choose each one
Choose PubChem if…
- You're doing scientific research, chemistry, or QSAR modeling and you need the largest possible compound corpus.
- You need identifier resolution (CAS ↔ InChI ↔ SMILES ↔ PubChem CID) and don't care about regulatory classifications.
- You want bioactivity / assay data linked to literature.
- You're cost-sensitive enough that "free, no SLA" beats "$29/mo with SLA."
- You're prepared to do the regulatory interpretation yourself by integrating multiple agency websites separately.
Choose EPA CompTox if…
- You're working on environmental risk assessment or occupational toxicology in a US-jurisdiction context.
- You need EPA-specific data: IRIS reference doses, CompTox DTXSID identifiers, EPA hazard predictions, DSSTox-curated chemical records.
- QSAR / predictive models matter to your work.
- You're satisfied with US-only regulatory framing and don't need IARC, EU CLP, ECHA, or other non-US agencies in the same record.
- You can tolerate variable uptime — the CCD API has had unscheduled downtime in 2026.
Choose ALETHEIA if…
- You're a marketplace screening seller-submitted products and you need a one-call answer to "is this listing's ingredient panel safe to publish?"
- Your audience cares about life-stage-specific risk — kid products, baby products, pet products, prenatal/nursing-targeted SKUs, immunocompromised consumers — and "humans" as a single bucket isn't enough.
- You need cross-agency consensus in one response: IARC + EPA + EU CLP + ECHA + Prop 65 + FDA + Health Canada + WHO + OSHA, normalized into a single record.
- You want embeddable safety badges, watchlist + webhooks, or batch endpoints with sane caps.
- You need a commercial-reuse licensing chain that's already cleared, with a documented DPA, terms of service, and a contracting entity (Holistic Quality LLC) you can put on a procurement form.
- You want an SLA you can point at, latency commitments, and predictable pricing.
When you'd actually combine them
For most teams, the right answer isn't "one of three" — it's two or three working together at different layers of the stack. Some patterns we've seen work:
- PubChem for identifier resolution, ALETHEIA for screening. If you're parsing ingredient lists from PDFs or seller-submitted text, PubChem's name-to-CAS resolution is broader than anyone else's. Resolve the name first, then send the CAS number to ALETHEIA for the regulatory record.
- EPA CompTox for environmental impact, ALETHEIA for consumer impact. An outdoor / lawn / garden marketplace might use CompTox for aquatic-life and bee-tox predictions on pesticides while using ALETHEIA for the human-exposure consumer-product workflow.
- PubChem as a fallback when ALETHEIA returns "compound not in corpus." Our 1,879-compound corpus is curated, not exhaustive. If you hit a 404 on a niche specialty chemical, PubChem's catalog is the right fallback for the chemistry layer — though you'll be doing the regulatory work yourself for that compound.
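The first and third patterns can be sketched together in Python. This is a rough outline under stated assumptions, not our client library: `fetch_synonyms` and `lookup_record` are hypothetical callables you would supply (the first wrapping a PubChem synonym lookup, the second wrapping an ALETHEIA request and returning `None` on a 404), and no ALETHEIA endpoint path or response shape is assumed. The only fixed logic is the standard CAS check-digit rule, used to pick a plausible CAS number out of a synonym list.

```python
import re

CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_cas(candidate):
    """True if the string is CAS-formatted and its check digit matches."""
    match = CAS_PATTERN.match(candidate.strip())
    if not match:
        return False
    # Weight the digits right to left by 1, 2, 3, ... (check digit excluded).
    digits = (match.group(1) + match.group(2))[::-1]
    checksum = sum(int(d) * i for i, d in enumerate(digits, start=1))
    return checksum % 10 == int(match.group(3))


def first_cas(synonyms):
    """Return the first synonym that is a valid CAS number, else None."""
    return next((s.strip() for s in synonyms if is_valid_cas(s)), None)


def screen(name, fetch_synonyms, lookup_record):
    """Resolve a name to a CAS number, then look up the regulatory record.

    fetch_synonyms(name) -> list[str]    # hypothetical PubChem wrapper
    lookup_record(cas)   -> dict | None  # hypothetical ALETHEIA wrapper
    """
    cas = first_cas(fetch_synonyms(name))
    if cas is None:
        return {"name": name, "status": "unresolved"}
    record = lookup_record(cas)
    if record is None:
        # Not in the curated corpus: fall back to PubChem for the chemistry
        # layer and route the compound to manual regulatory review.
        return {"name": name, "cas": cas, "status": "not_in_corpus"}
    return {"name": name, "cas": cas, "status": "screened", "record": record}
```

Passing the two lookups in as arguments keeps the routing logic testable without network access and makes it easy to swap either source later.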
We don't think these are competitors in the strict sense. They're complementary tools that solve different problems. ALETHEIA wins on the marketplace-screening and consumer-product-compliance jobs because that's the job we built it for. PubChem and CompTox win on the jobs they were built for. Use the right one.
Try ALETHEIA on a real product
Free tier: 500 requests/day, with batch lookups of up to 10 compounds. 7-day Pro trial: 10,000 requests/day with 50-compound batches, no credit card required.
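To get a feel for what those limits mean for a catalog, here is a small Python sketch of the arithmetic. It assumes that one batch call counts as a single request against the daily quota, which the figures above do not state, so treat the results as an estimate; the tier keys and the helper names `chunk` and `plan` are illustrative only.

```python
from math import ceil

# Limits quoted above; the dictionary keys are our own labels.
BATCH_CAP = {"free": 10, "pro_trial": 50}
DAILY_QUOTA = {"free": 500, "pro_trial": 10_000}


def chunk(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def plan(n_compounds, tier):
    """Return (batch requests needed, days needed) for a catalog size.

    ASSUMPTION: each batch call consumes one request from the daily quota.
    """
    requests = ceil(n_compounds / BATCH_CAP[tier])
    days = ceil(requests / DAILY_QUOTA[tier])
    return requests, days
```

Under that assumption, a 1,200-ingredient catalog needs 120 batch calls on the free tier and 24 on the Pro trial, both inside a single day's quota.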