Through automated categorisation, Carbon enables publishers to increase the quantity and quality of their first-party assets. Uniquely scored signals are mapped into a taxonomy built on IAB standards and layered with Carbon’s own brand and keyword signals. Here’s how we do it.

Step one: Analysis

Carbon applies page analysis and NLP (Natural Language Processing) to sources including URLs, titles, meta tags and content to identify keywords.
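To make the idea concrete, here is a minimal sketch of keyword extraction across those sources. It is an illustration only, not Carbon’s implementation: the function names, the stopword list and the per-source weights are all assumptions, and simple word counting stands in for full NLP.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical weights: how much a word counts depending on where it appears.
SOURCE_WEIGHTS = {"url": 3.0, "title": 3.0, "meta": 2.0, "content": 1.0}

# A tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"a", "an", "and", "for", "how", "in", "is", "of", "on",
             "the", "to", "with", "your"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens, minus stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def extract_keywords(url, title, meta, content, top_n=5):
    """Return the top_n (keyword, weighted count) pairs across all sources."""
    sources = {
        "url": urlparse(url).path,  # use the path only, not the domain
        "title": title,
        "meta": meta,
        "content": content,
    }
    scores = Counter()
    for name, text in sources.items():
        for token in tokenize(text):
            scores[token] += SOURCE_WEIGHTS[name]
    return scores.most_common(top_n)


keywords = extract_keywords(
    url="https://example.com/sleep/best-pillows",
    title="The best pillows for better sleep",
    meta="Pillows and bedding reviewed for sleep quality",
    content="A good pillow supports your neck. Sleep improves with the right pillow.",
)
print(keywords[0])  # ('sleep', 9.0)
```

In this example 'sleep' comes out on top because it appears in every source, including the heavily weighted URL and title.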

Step two: Keyword matching

Carbon uses a language processing technique that builds word vectors (or word maps) to reveal the many relationships between keywords. By identifying contextual similarities (e.g. ‘bed’ and ‘pillow’ could both relate to sleep) and shared word structures (e.g. ‘sleep’ and ‘sleeping’), our machine learning can match each keyword to the most accurate category within the taxonomy.
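As a rough illustration of how vector-based matching works, the sketch below compares keyword vectors with category vectors using cosine similarity. Everything in it is an assumption made for the example: the three-dimensional vectors are invented by hand (real word vectors are learned from text and have hundreds of dimensions), the category names are hypothetical, and the crude suffix stripping is a stand-in for proper handling of word structure.

```python
import math

# Toy vectors invented for illustration only.
WORD_VECTORS = {
    "bed": [0.9, 0.1, 0.0],
    "pillow": [0.8, 0.2, 0.1],
    "sleep": [1.0, 0.0, 0.0],
    "goal": [0.0, 0.9, 0.2],
    "striker": [0.1, 1.0, 0.1],
}

# Hypothetical categories, each represented by a vector in the same space.
CATEGORY_VECTORS = {
    "Sleep": [1.0, 0.0, 0.0],
    "Football": [0.0, 1.0, 0.0],
}


def cosine(a, b):
    """Cosine similarity: close to 1.0 when two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def normalise(word):
    """Strip a common suffix if that yields a known word ('sleeping' -> 'sleep')."""
    word = word.lower()
    for suffix in ("ing", "s"):
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and stem in WORD_VECTORS:
            return stem
    return word


def match_category(keyword):
    """Return the closest category for a keyword, or None if it is unknown."""
    vector = WORD_VECTORS.get(normalise(keyword))
    if vector is None:
        return None
    return max(CATEGORY_VECTORS, key=lambda c: cosine(vector, CATEGORY_VECTORS[c]))


print(match_category("pillow"))    # Sleep
print(match_category("sleeping"))  # Sleep
print(match_category("striker"))   # Football
```

Both 'bed' and 'pillow' land in the same category because their vectors point in a similar direction, even though the words share no letters.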

Robust taxonomy

Our taxonomy is made up of thousands of top-level and sub-level interest categories, based on the IAB Content Taxonomy 2.2, plus our custom nodes, brands and keywords. This ensures granular categorisation of signals to fuel accurate audience analysis and segmentation.

Data scoring

Carbon’s algorithms score our data based on four core statistical features of a site visit, offering reliable and accurate data signals for better-performing audiences.

Step three: Optimisation

Our tech considers all relevant keyword signals, determining the strength of each one based on three things: the keywords referring to it, how important our algorithms have determined those keywords are to the page, and our prior knowledge of what category the site falls into.
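One way to picture how these three inputs might combine is the sketch below. It is a simplified assumption, not Carbon’s scoring formula: the function names, the `prior_weight` parameter and the multiplicative boost are all hypothetical choices made for illustration.

```python
def signal_strength(keyword_importance, site_prior, prior_weight=0.5):
    """Combine keyword evidence with prior knowledge of the site.

    keyword_importance: maps each keyword referring to the category to its
        importance on the page (0 to 1).
    site_prior: prior belief (0 to 1) that the site belongs to the category.
    prior_weight: hypothetical knob for how much the prior boosts the score.
    """
    evidence = sum(keyword_importance.values())
    return evidence * (1.0 + prior_weight * site_prior)


def rank_categories(signals, site_priors):
    """Return (category, strength) pairs, strongest first."""
    scored = {
        category: signal_strength(keywords, site_priors.get(category, 0.0))
        for category, keywords in signals.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Example page: two keywords point to 'Sleep', one points to 'Football'.
signals = {
    "Sleep": {"pillow": 0.5, "bed": 0.25},
    "Football": {"goal": 0.5},
}
# The site is already believed to be partly about sleep.
site_priors = {"Sleep": 0.5}

ranked = rank_categories(signals, site_priors)
print(ranked)  # [('Sleep', 0.9375), ('Football', 0.5)]
```

Here 'Sleep' wins both because more keywords refer to it and because the site’s prior category gives it a boost, while 'Football' relies on a single keyword with no supporting prior.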

We regularly review the results of our algorithms to ensure no keywords are missed and no irregular classifications occur.


Want to learn more?

If you want to learn more about Carbon’s tech, such as our automated categorisation, and how it can support your audience and monetisation strategy, why not request a demo?