In an index model, Carbon’s tech counts the users with known data who have visited a URL (e.g. the number of known male and female users) and represents these counts as percentages, e.g. 67% female and 33% male. To infer the gender of an unknown user, the index model averages these percentages across all the URLs that user has visited, returning a confidence level for each demographic, e.g. 54% female and 46% male.
This type of model is very effective, as it gives us a less biased view of a page. It also needs minimal startup time and adapts to new content. We use index models to derive first-party demographics for income, age, gender, education, and the presence of children.
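The index model described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Carbon's implementation: the function names `build_url_index` and `infer_demographic`, the data shapes, and the toy visit data are all hypothetical.

```python
from collections import defaultdict

def build_url_index(visits, known_labels):
    """Return url -> {label: share} using only users whose label is known.

    visits: iterable of (user_id, url) pairs (hypothetical data shape).
    known_labels: dict mapping user_id -> demographic label.
    """
    counts = defaultdict(lambda: defaultdict(int))
    seen = set()
    for user, url in visits:
        label = known_labels.get(user)
        if label is None or (user, url) in seen:
            continue  # skip unknown users and repeat visits
        seen.add((user, url))
        counts[url][label] += 1
    index = {}
    for url, by_label in counts.items():
        total = sum(by_label.values())
        index[url] = {label: n / total for label, n in by_label.items()}
    return index

def infer_demographic(user_urls, index, labels=("female", "male")):
    """Average the per-URL shares over every indexed URL the user visited."""
    scored = [index[url] for url in user_urls if url in index]
    if not scored:
        return None  # no indexed URL, so nothing to infer from
    return {
        label: sum(s.get(label, 0.0) for s in scored) / len(scored)
        for label in labels
    }

# Toy data: URL "a" has 3 known female and 1 known male visitor,
# URL "b" has 1 of each.
known = {"u1": "female", "u2": "female", "u3": "female", "u4": "male",
         "u5": "female", "u6": "male"}
visits = [("u1", "a"), ("u2", "a"), ("u3", "a"), ("u4", "a"),
          ("u5", "b"), ("u6", "b")]
index = build_url_index(visits, known)
confidence = infer_demographic(["a", "b"], index)
```

Here the unknown user visited both URLs, so the result is the mean of the two per-URL splits.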
A content-based model predicts gender from the content of the pages a profile visits. Techniques like TF-IDF (term frequency-inverse document frequency) are used to determine the important information on a page. This information is then used as input features, such as page keywords, categories, and browser.
Carbon’s tech then learns which inputs map accurately to known output signals, in this case demographics. Labelled data (such as data that users declare on a page) is used to train the model and enables us to provide confidence levels for a page’s demographics. Once trained, this model runs independently of any third-party data.
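To make the TF-IDF step concrete, here is a minimal pure-Python sketch of how important terms on a page could be picked out as candidate features. It assumes the common weighting of term frequency multiplied by log(N / document frequency); the function names, the tokenised toy pages, and the `top_terms` helper are hypothetical, and the supervised training step on labelled data is not shown.

```python
import math
from collections import Counter

def tf_idf(pages):
    """Return page_id -> {term: weight} for tokenised pages.

    pages: dict mapping page_id -> list of tokens (hypothetical shape).
    """
    n_pages = len(pages)
    doc_freq = Counter()
    for tokens in pages.values():
        doc_freq.update(set(tokens))  # count each term once per page
    weights = {}
    for page_id, tokens in pages.items():
        counts = Counter(tokens)
        weights[page_id] = {
            term: (count / len(tokens)) * math.log(n_pages / doc_freq[term])
            for term, count in counts.items()
        }
    return weights

def top_terms(page_weights, k=2):
    """Pick the k highest-weighted terms as keyword features."""
    return sorted(page_weights, key=page_weights.get, reverse=True)[:k]

# Toy pages: "the" appears on every page, so its weight drops to zero.
pages = {
    "sport": ["the", "football", "match", "the", "football"],
    "recipe": ["the", "pasta", "recipe", "the", "sauce"],
    "news": ["the", "match", "report", "the", "election"],
}
weights = tf_idf(pages)
sport_keywords = top_terms(weights["sport"], k=2)
```

Terms that appear on every page carry no weight, while terms concentrated on one page rank highest, which is why they make useful input features.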
An interest-based model provides predictions based on the interests associated with a profile, which are gathered from the pages that profile visits. Two gender clusters are created, male and female, each containing all the interests associated with labelled users of that gender. Predictions are made by comparing an unlabelled user with each gender cluster and seeing which one they more closely resemble.
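The cluster comparison can be sketched as follows. The text does not say how similarity is measured, so this example assumes cosine similarity over interest counts; that choice, the function names, and the toy interests are all hypothetical.

```python
import math
from collections import Counter

def build_clusters(labelled_users):
    """Return label -> Counter of interests pooled from labelled users.

    labelled_users: iterable of (label, interests) pairs (hypothetical shape).
    """
    clusters = {}
    for label, interests in labelled_users:
        clusters.setdefault(label, Counter()).update(interests)
    return clusters

def cosine(a, b):
    """Cosine similarity between two interest-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def predict(interests, clusters):
    """Score an unlabelled user against each cluster; return the best label."""
    profile = Counter(interests)
    scores = {label: cosine(profile, c) for label, c in clusters.items()}
    return max(scores, key=scores.get), scores

# Toy labelled users; the interests are made up for illustration.
labelled = [
    ("female", ["yoga", "cooking"]),
    ("female", ["yoga", "travel"]),
    ("male", ["gaming", "travel"]),
    ("male", ["gaming", "cars"]),
]
clusters = build_clusters(labelled)
label, scores = predict(["yoga", "travel"], clusters)
```

The unlabelled user shares two interests with one cluster and only one with the other, so the first cluster scores higher.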
When creating a segment in the platform based on any demographic, strength is presented as a slider that sets the minimum confidence level. The diagram (right) illustrates the age confidence levels for a user whose most likely age group is 25-34. This user could also be included in a segment targeting the 18-24 age group with medium confidence, as well as a segment targeting the 55-64 age group with low confidence.
This illustrates Carbon’s targeting mechanism. Regardless of a user’s most likely demographic group, if they display enough behaviour associated with another group and the strength level is set low enough, they will be included in segments for that group too.
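The slider behaviour amounts to a threshold check, sketched below. The confidence values are invented for illustration (the figures in the diagram are not reproduced here), and `in_segment` and `matching_groups` are hypothetical names.

```python
def in_segment(confidences, target_group, min_confidence):
    """True if the user's confidence for the target group meets the slider value."""
    return confidences.get(target_group, 0.0) >= min_confidence

def matching_groups(confidences, min_confidence):
    """List every demographic group the user qualifies for at this strength."""
    return [group for group, value in confidences.items()
            if value >= min_confidence]

# Hypothetical age confidences for one user; 25-34 is the most likely group.
age_confidences = {
    "18-24": 0.30,
    "25-34": 0.45,
    "35-44": 0.10,
    "45-54": 0.07,
    "55-64": 0.08,
}

high_strength = matching_groups(age_confidences, 0.40)
medium_strength = matching_groups(age_confidences, 0.25)
low_strength = matching_groups(age_confidences, 0.08)
```

Lowering the minimum confidence widens the set of segments the same user falls into, which is the trade-off the slider exposes.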
Want to learn more?
If you want to learn more about Carbon’s tech, such as our demographic modelling and how it helps with your audience and monetisation strategy, why not request a demo?