Making Behavioural Targeting work


This is the English original of the article that appeared on onlinemarketing.de:

Since the late 90s behavioral targeting has had more comebacks than Tina Turner – and this despite the fact that, as one venture capitalist I spoke to recently put it, “Targeting has never created more value than it costs.”
The continuing and resilient interest in behavioral targeting in the face of mediocre and inconsistent performance tells us that there is an enduring market need for the proposition. In this post I want to present my hypothesis as to why targeting hasn’t worked for over a decade, and what we could do to make it finally deliver on its promises.

Behavioral targeting has oversimplified a complex problem
Most companies specializing in behavioral targeting have taken the same basic approach: one or more experts create a system of categories intended to capture any interesting phenomenon on the internet. As the user comes into contact with a web page, the content is – manually or automatically – associated with one or a couple of these categories. The user’s contact with this page is then integrated into the user profile as a contact with that category at a particular point in time.

There are three salient problems with this approach:

  1. It is subjective – one or more ‘experts’ have created the category system, which means it is a man-made prioritization of how to partition the world into a discrete set of categories
  2. It is static – the category system needs to be maintained by human beings, and the second they pause to do something else is the second it goes out of date, as reality moves on and the category system does not
  3. It is information destructive – taking an entire piece of internet content and reducing it to one or a couple of categories destroys all other information that might be significant about it


When the time comes to build targets, categories are combined into Boolean expressions that the user must conform to in order to be in the target group. A target for a campaign trying to generate test drives in the new BMW 5-series might therefore be defined as: “In the last 30 days the user has had 3 or more contacts with ‘Automotive content’ and 2 or more contacts with ‘Economy & Finance content’”.
This form of target generation is hypothesis-driven, which means it is inherently trial-and-error. Indeed sometimes 5 or more targets are created and run in parallel to simultaneously test 5 hypotheses.
In other words: The approach taken to profiling and targeting for more than a decade is an oversimplification of a very complex problem. What we need is a new approach that eliminates the weaknesses we have just identified.
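
To make the Boolean target above concrete, here is a minimal sketch of the BMW rule in Python. The profile layout (a list of category and timestamp pairs), the function name `in_target` and the sample data are my own assumptions for illustration, not any vendor's actual format.

```python
from collections import Counter
from datetime import datetime, timedelta

def in_target(contacts, now, window_days=30):
    """Check the example rule: 3+ 'Automotive' and 2+ 'Economy & Finance'
    contacts within the last `window_days` days.

    `contacts` is a hypothetical profile: a list of (category, timestamp) pairs.
    """
    cutoff = now - timedelta(days=window_days)
    recent = Counter(category for category, ts in contacts if ts >= cutoff)
    return recent["Automotive"] >= 3 and recent["Economy & Finance"] >= 2

now = datetime(2024, 1, 31)
profile = [
    ("Automotive", now - timedelta(days=2)),
    ("Automotive", now - timedelta(days=5)),
    ("Automotive", now - timedelta(days=12)),
    ("Economy & Finance", now - timedelta(days=1)),
    ("Economy & Finance", now - timedelta(days=40)),  # outside the 30-day window
]
print(in_target(profile, now))
```

Note how everything else about the pages this user visited is already gone by the time the rule runs: the rule can only see category counts.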

Really putting behavioral into targeting

The first thing we should do to alleviate the problems of the old approach is to eliminate the category system and replace it with a machine-learning system capable of determining what actually constitutes information on a given page (I use Gregory Bateson’s definition of information as ‘a difference that makes a difference’). One way to do that is by semantically analyzing the content of the page and determining the significant terms and phrases that set this page apart from the ‘background noise’ of all other analyzed pages. These terms and phrases constitute the information of the page as they are the differences that make a difference. Thus when the user comes into contact with a page, the profile is not enriched with one or a couple of categories, but instead with every significant term and phrase on that page, along with a number of other salient characteristics of the contact event. So, how does that solve problems with the traditional approach to profiling?

  1. It is objective – there is no category system created by humans – in fact at no point are humans involved in the process of determining what constitutes information on the page
  2. It is dynamic – every new page that is semantically analyzed leads to a real-time recalibration of the model that determines the significance of terms and phrases
  3. It is information preserving – instead of immediately reducing the page to one or more data points in a predefined schema (the category system) this approach preserves all the information on the page for use in target generation
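
As a rough illustration of finding the terms that set a page apart from the ‘background noise’, the sketch below ranks a page's terms by TF-IDF against a small background corpus. TF-IDF is a stand-in I have chosen for simplicity; it is not necessarily the semantic analysis a production system would use, and the function names and sample pages are assumptions.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def significant_terms(page, background_pages, top_n=3):
    """Rank the terms on `page` by TF-IDF against the background pages."""
    background = [set(tokenize(p)) for p in background_pages]
    n_docs = len(background) + 1  # the background pages plus the page itself
    term_counts = Counter(tokenize(page))
    scores = {}
    for term, count in term_counts.items():
        # Document frequency: the page itself plus every background page with the term.
        df = 1 + sum(term in doc for doc in background)
        scores[term] = count * math.log(n_docs / df)
    # Highest score first; ties are broken alphabetically for a stable result.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]

background_pages = [
    "the new phone is on sale",
    "the weather is sunny today",
    "the market is up today",
]
page = "the new sedan test drive: book a test drive in the sedan"
top = significant_terms(page, background_pages)
print(top)
```

A word like "the" appears on every page and therefore scores zero, while terms that are frequent on this page and rare elsewhere rise to the top. Because the scores depend on the background corpus, adding new pages changes them, which is the ‘dynamic’ property in point 2.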


In a month this turns into hundreds of billions of data points, thus creating the challenge of how to transform the massive amount of information into action in a simple, pragmatic way. The secret to this is an empirical target generation approach that starts with a simple question: What do you want to achieve?

Let the data do the talking – a small example

If you are working for BMW trying to generate test drives in the new 5-series, this is your marketing goal. You would then simply deploy a tracking pixel to the BMW page that says: “Thank you for signing up for a test drive in the BMW 5-series,” and start sampling the users that really perform this action.
After collecting a critical number of users in the sample, the target generation engine creates a statistical model that describes how the users taking the action are different from those not taking the action.
Now we see the need to accumulate all this information: We don’t know ex ante which information is going to be important in discriminating the action-takers from the non-action-takers – the target generation process will determine that for us. If we start by putting the pages fetched by the user into subjectively defined categories, we run a very high risk of destroying the characteristics of the action-taking users that really set them apart from the rest of the population.
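
A toy version of such a model can be sketched as follows. I am assuming here that each profile is simply a set of significant terms, and I use smoothed per-term log-odds (a naive-Bayes-style score) as a simple stand-in for whatever statistical model a real target generation engine would fit. All names and data are hypothetical.

```python
import math
from collections import Counter

def term_log_odds(converters, others, alpha=1.0):
    """Smoothed log-odds of a term appearing in an action-taker's profile
    versus a non-action-taker's profile."""
    conv_df = Counter(t for profile in converters for t in set(profile))
    other_df = Counter(t for profile in others for t in set(profile))
    weights = {}
    for term in set(conv_df) | set(other_df):
        # Additive smoothing keeps the ratio finite for terms seen in one group only.
        p_conv = (conv_df[term] + alpha) / (len(converters) + 2 * alpha)
        p_other = (other_df[term] + alpha) / (len(others) + 2 * alpha)
        weights[term] = math.log(p_conv / p_other)
    return weights

def score(profile, weights):
    """Sum the weights of the terms in a profile; higher means more 'twin-like'."""
    return sum(weights.get(term, 0.0) for term in set(profile))

# Hypothetical sample: users who hit the test-drive confirmation pixel vs. users who did not.
converters = [{"sedan", "leasing", "touring"}, {"sedan", "leasing", "golf"}]
others = [{"golf", "recipes"}, {"football", "recipes"}, {"golf", "weather"}]

weights = term_log_odds(converters, others)
print(score({"sedan", "leasing"}, weights) > score({"recipes", "golf"}, weights))
```

The point is that nobody decided in advance that "leasing" matters; the weights fall out of the difference between the two samples.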
Interestingly, who you want to reach need not be just action-takers – they can be males, 20-29 years old, high income with an interest in extreme sports. All this new approach to targeting needs is a sample of the ‘right’ users – such as one might get from registration data and online behavior – and then it will create its unique models to find the ‘statistical twins’ of the real males, the real males 20-29 years old, the real males 20-29 years old with a high income and so on. Thus this new approach works equally well for both performance- and branding-oriented advertising.

The right user at the right place at the right time

This has all been about users, but as we know from our own experience, users behave and respond very differently in different contexts (sites, channels, URLs) and at different points in time. The empirical target generation approach has the benefit of being able to take these two additional characteristics into account. Thus as a user comes into contact with an ad placement of a particular format on a particular URL at a particular point in time, these characteristics can all be taken into a real-time calculation of the probability that the user belongs to the different ‘classes’ that constitute the currently active targets, whether these be branding- or performance-oriented.
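
One way to picture that real-time calculation is a logistic score over user and context features, sketched below. The feature naming scheme, the weights and the bias are invented for illustration; in practice they would come out of the target generation process rather than being written by hand.

```python
import math

def membership_probability(user_terms, context, weights, bias=0.0):
    """Logistic score: estimated probability that this ad contact belongs to a target class.

    `user_terms` are terms from the user's profile; `context` describes the
    placement (site, format, time of day). Both are hypothetical feature sets.
    """
    features = {f"term={term}" for term in user_terms}
    features |= {f"{key}={value}" for key, value in context.items()}
    z = bias + sum(weights.get(feature, 0.0) for feature in features)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weights for one active target (e.g. 'BMW 5-series test drive').
weights = {
    "term=sedan": 1.2,
    "term=leasing": 0.8,
    "site=autoblog.example": 0.6,
    "hour=evening": 0.4,
    "format=banner": -0.3,
}

user_terms = {"sedan", "leasing"}
evening = {"site": "autoblog.example", "hour": "evening", "format": "banner"}
morning = {"site": "news.example", "hour": "morning", "format": "banner"}

p_evening = membership_probability(user_terms, evening, weights, bias=-2.0)
p_morning = membership_probability(user_terms, morning, weights, bias=-2.0)
print(p_evening > p_morning)
```

The same user gets a different probability depending on where and when the ad contact happens, which is exactly the ‘right place, right time’ part of the argument.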

A generic approach to targeting with a multitude of applications

So what is this new approach to behavioral targeting all about? Actually, it is very simple: Using a machine-learning approach to collect all the information contained in the content consumed by internet users, the system finds the right users at the right place at the right time through empirical target generation. The generic nature of this approach means it has a multitude of applications across the many categories of the online advertising technology ecosystem.