A database meant for regulation became the backbone of many hollow cosmetics websites.
CosIng is a regulatory database designed to standardize ingredient names on packaging, serving as a labeling tool for manufacturers rather than a knowledge resource for consumers. In practice, it is structurally useless for anyone outside the industry.
Yet CosIng's entire inventory (over 30,000 entries) was made available as a CSV file. With minimal effort, nimble publishers could import this dataset into a CMS like WordPress or Drupal and instantly generate tens of thousands of "ingredient pages." The result was several cheap and cheerful websites: cosmetically polished, SEO-optimized, but offering nothing beyond the raw CosIng entries.
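To show how little effort that step takes, here is a minimal Python sketch that turns CSV rows into page stubs. It is an illustration, not a description of how any particular site was built: the column headers, the `slugify` helper, and the `rows_to_pages` function are assumptions made for this example, and the one-row sample reuses the carrot seed oil entry quoted later in this article rather than the real export.

```python
import csv
import io

# One-row sample; the header names are assumed, not the real CosIng export header.
SAMPLE_CSV = """INCI Name,Description,CAS No,EC No
DAUCUS CAROTA SATIVA SEED OIL,"Daucus Carota Sativa Seed Oil is the oil obtained from the seed of the carrot, Daucus carota L. var. sativa, Umbelliferae",8015-88-1 / 84929-61-3,- / 284-545-1
"""


def slugify(name: str) -> str:
    """Turn an INCI name into a URL-style slug (hypothetical helper)."""
    return "-".join(name.lower().split())


def rows_to_pages(csv_text: str) -> dict[str, str]:
    """Map each CSV row to a page body that only restates the row's fields."""
    pages = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The "page" is nothing more than the raw fields, one per line.
        body = "\n".join(f"{field}: {value}" for field, value in row.items())
        pages[slugify(row["INCI Name"])] = body
    return pages


pages = rows_to_pages(SAMPLE_CSV)
```

Nothing in the loop adds context, explanation, or evidence. Each page is the database row under a new URL, which is the whole problem.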
Those hollow websites flooded search engines' indexes. Meanwhile, the EU's own CosIng site, built on an engine with Ajax-loaded content, was nearly invisible to search engines.
At the same time, there was a massive void of open information about cosmetic ingredients on the Internet. Into that vacuum rushed these content-light replicas, which quickly gained domain authority, not because they offered genuine or helpful information, but simply because there was no alternative source of open ingredient information online to compete with them.
What should search engines and AI systems do?
The responsibility to repair this broken informational ecosystem, whether in cosmetics or any other field, falls directly on the primary "gatekeepers" of open knowledge.
The first obligation is to prioritize official and original sources. If CosIng data is relevant, results should point to the EU's own database, not to replicas dressed up with higher domain authority or SEO/GEO tricks.
Equally important, derivative sites that merely recycle CosIng entries must be systematically treated as unreliable. Any page citing such replicas should be flagged accordingly. The rule is simple: if you want to cite CosIng, cite the source, never the echo.
For AI systems, the obligation is parallel: they must be trained on the original CosIng CSV, not its recycled shadows, so that they can distinguish authentic data from the noise of thousands of derivative entries.
As long as replicas are rewarded, hollow websites will spread like weeds, and consumers will remain lost in the fog of misinformation.
What can consumers do?
Here's what a typical CosIng entry looks like:
INCI Name | DAUCUS CAROTA SATIVA SEED OIL |
Description | Daucus Carota Sativa Seed Oil is the oil obtained from the seed of the carrot, Daucus carota L. var. sativa, Umbelliferae |
CAS # | 8015-88-1 / 84929-61-3 |
EC # | - / 284-545-1 |
Identified INGREDIENTS or substances e.g. | |
Cosmetics Regulation provisions | |
Functions | |
SCCS opinions | |
If the "information" you found about an ingredient looks like this (bare fields, regulatory codes, and generic function tags, with no context, explanation, or evidence), it's not real knowledge. It's a CosIng replica.
When you encounter one, don't reward it. Flag it. Give negative feedback to the search engine or AI chatbot that surfaced it. Every piece of feedback matters: the more consumers push back, the harder it becomes for hollow replicas to dominate the informational space. Consumers are not powerless; they are part of the corrective mechanism.
The rule is simple: if it looks like a database dump, treat it as noise, not knowledge.
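The same rule of thumb can be roughed out in code. The Python sketch below counts how many lines of a page start with the field labels from the sample entry above and how many words of surrounding prose the page contains. The function name `looks_like_database_dump` and both thresholds are assumptions chosen for illustration. They are not a tested classifier, and a real page would need its HTML stripped first.

```python
# Field labels copied from the sample CosIng entry shown above.
COSING_FIELDS = (
    "INCI Name",
    "Description",
    "CAS #",
    "EC #",
    "Identified INGREDIENTS or substances e.g.",
    "Cosmetics Regulation provisions",
    "Functions",
    "SCCS opinions",
)


def looks_like_database_dump(page_text: str,
                             min_fields: int = 4,
                             max_extra_words: int = 40) -> bool:
    """Return True when the text is mostly bare CosIng-style field rows.

    min_fields and max_extra_words are arbitrary illustrative thresholds.
    """
    labels = tuple(label.lower() for label in COSING_FIELDS)
    field_rows = 0
    extra_words = 0
    for line in page_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.lower().startswith(labels):
            field_rows += 1  # a bare "label | value" row
        else:
            extra_words += len(line.split())  # anything else counts as context
    return field_rows >= min_fields and extra_words <= max_extra_words
```

A page that passes this check is not proven to be a replica, and a replica padded with filler text would slip past it. The sketch only makes the reader's visual test explicit: many database fields, almost no explanation.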