Ad Targeting Is Hard

Editor’s note: Benjy Weinberger is the engineering site lead for foursquare’s San Francisco office. He previously worked on infrastructure and revenue engineering at Twitter, and before that on search and ad engineering at Google for eight years.

Microsoft recently announced that it’s taking a huge $6.2 billion writedown over the failed aQuantive acquisition. This news, and the scrutiny of Facebook’s business model following its IPO drama, show that in online advertising, it’s all about the targeting.

As this Reuters analysis explains, there’s so much online advertising space that merely putting billboards up all over the internet is no longer a lucrative business. Meanwhile, Google AdWords remains phenomenally successful, generating over $36B in revenue in 2011. The key difference? Targeting. Google’s sophisticated ad-targeting algorithms greatly increase each ad’s relevance to the user, and therefore the likelihood of the user clicking on it. This is what makes AdWords so much more effective than banner ads.

So why isn’t everyone just improving their targeting? Unfortunately, it’s not that simple. Ad targeting is a difficult artificial intelligence (AI) problem, and while you may not agree that it’s a worthy one, it does require a lot of technical heavy lifting. Here’s why:

The Algorithm

A targeting algorithm takes everything you know about the impression – search keywords, location, demographics, previous user activity, time of day, the previous CTR (clickthrough rate) of the ad and so on – and uses that to choose which one ad to show from among millions of candidates. And it has to do this in a fraction of a second. This is not a trivial problem. Can you think, offhand, how you’d do it? If so, I’d like to talk to you about a Data Scientist role at foursquare…
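To make the shape of the problem concrete, here is a deliberately naive sketch in Python. It is not how Google or anyone else actually does it: the `Ad` class, the `score` function and the sample ads are all hypothetical, and the scoring rule (keyword overlap multiplied by historical CTR) is an assumption chosen purely for illustration. A real system would use far more signals and could not afford to scan every candidate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ad:
    ad_id: str
    keywords: frozenset      # keywords the advertiser targets (hypothetical field)
    historical_ctr: float    # previous clickthrough rate, between 0 and 1


def score(ad, query_keywords):
    """Toy relevance score: fraction of the ad's keywords matched, times its CTR."""
    if not ad.keywords:
        return 0.0
    overlap = len(ad.keywords & query_keywords) / len(ad.keywords)
    return overlap * ad.historical_ctr


def choose_ad(candidates, query_keywords):
    """Return the highest-scoring candidate, or None if nothing is relevant."""
    best = max(candidates, key=lambda ad: score(ad, query_keywords), default=None)
    if best is None or score(best, query_keywords) == 0.0:
        return None
    return best


# Hypothetical inventory; a real one would hold millions of ads.
ADS = [
    Ad("pizza-delivery", frozenset({"pizza", "delivery"}), 0.04),
    Ad("running-shoes", frozenset({"running", "shoes"}), 0.06),
    Ad("pizza-oven", frozenset({"pizza", "oven"}), 0.02),
]

winner = choose_ad(ADS, {"pizza", "delivery", "near", "me"})
```

Even this toy version shows the tension: the ad with the best historical CTR is not necessarily the most relevant one for this particular impression, so the signals have to be combined somehow.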

Ad targeting is a relevance problem somewhat similar to web search: given a huge repository of information, and whatever we know about what the user is looking for, find the most relevant information and return it. The algorithms are not the same, and indeed Google has entirely separate divisions working on each problem, for both technical and ethical reasons, but the difficulty is similar.

Basically, to even begin to tackle ad targeting, you need top-notch data scientists with PhDs in Machine Learning, Information Retrieval or other AI fields. If you’ve spent any time at all at a startup you know how hard it is to hire these people.

The Data

Even once you have an algorithm, it’s not much use without data. The more you know about the user, the more precise your algorithm can be. This is not just for the obvious reason that you need something to target by, but also because you need to train your algorithm. Machine Learning algorithms are so called because they adapt through an iterative process: you feed them a set of training data, along with the expected results, and they slowly increase their precision, in a manner analogous to human learning.
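As a rough illustration of that iterative process, here is a minimal sketch in Python of one common approach, logistic regression trained by gradient descent. I’m not claiming this is what any particular ad system uses; the feature names and the four training examples are invented for the example.

```python
import math


def predict(weights, features):
    """Predicted click probability from a logistic model."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(weights, examples):
    """Average prediction error over the training set (lower is better)."""
    total = 0.0
    for features, clicked in examples:
        p = predict(weights, features)
        total -= math.log(p) if clicked else math.log(1.0 - p)
    return total / len(examples)


def train(examples, epochs=200, learning_rate=0.5):
    """Repeatedly nudge the weights toward the expected results."""
    weights = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        for features, clicked in examples:
            # How far off was the prediction for this example?
            error = predict(weights, features) - clicked
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
    return weights


# Hypothetical training data: (bias, keyword_match, is_evening) -> clicked?
EXAMPLES = [
    ((1.0, 1.0, 0.0), 1),
    ((1.0, 1.0, 1.0), 1),
    ((1.0, 0.0, 0.0), 0),
    ((1.0, 0.0, 1.0), 0),
]

untrained = [0.0, 0.0, 0.0]
trained = train(EXAMPLES)
```

Comparing `log_loss(untrained, EXAMPLES)` with `log_loss(trained, EXAMPLES)` shows the error shrinking as the model sees the data again and again. With four examples that is easy; the hard part in practice is having enough real examples for the learned weights to mean anything.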

The kind of data you can gather depends largely on the consumer service you provide: Google knows a lot about your current intent, via your search keywords. Facebook knows a lot about your context, via your social activity. So far, intent appears to be more valuable than context when it comes to ad targeting. But the holy grail is to have both, which partly explains Google+.

For precise targeting you need a lot of data, particularly about current intent, and this is hard to come by for any but the most successful services.

The Systems

Assuming you have the algorithms and the data, you still have the problem of how to apply them efficiently. You can’t let your user wait around while you laboriously figure out which ads to show. Ad systems are typically expected to return a result within a few hundred milliseconds. It takes very large, very complex distributed systems to pull this off. Google’s SmartASS system, for example, is one of the best-engineered systems I’ve ever encountered. Systems of this sophistication are hard to build.

The Virtuous Trinity

All too often, online advertising is a zero-sum game. The more intrusive display ads are to the user, the more benefit the advertiser perceives. And in a CPI (cost-per-impression) paradigm, the ad publisher is firmly on the side of the advertiser.

But with strong targeting and CPC (cost-per-click) billing, a virtuous trinity emerges: the more relevant an ad is, the happier the user is, and the more likely to click on the ad. This gets the advertiser more engagement, and the ad publisher (who gets paid per click) more money. All three participants are incentivized by better targeting, and all share in the created value. Creating this virtuous trinity, rather than spamming the web with banner ads, is how to truly succeed in online advertising.
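The incentive difference is easy to see with some back-of-the-envelope arithmetic. The Python sketch below uses made-up prices and clickthrough rates; the function names are hypothetical and the numbers are assumptions, not industry figures.

```python
def publisher_revenue_cpi(impressions, price_per_impression):
    """Under CPI the publisher is paid per impression, clicked or not."""
    return impressions * price_per_impression


def publisher_revenue_cpc(impressions, ctr, price_per_click):
    """Under CPC the publisher is paid only for the impressions that get clicked."""
    return impressions * ctr * price_per_click


IMPRESSIONS = 1000

# Assumed figures: targeting lifts CTR from 1% to 3%.
cpi_poor_targeting = publisher_revenue_cpi(IMPRESSIONS, 0.005)
cpi_good_targeting = publisher_revenue_cpi(IMPRESSIONS, 0.005)
cpc_poor_targeting = publisher_revenue_cpc(IMPRESSIONS, 0.01, 0.50)
cpc_good_targeting = publisher_revenue_cpc(IMPRESSIONS, 0.03, 0.50)
```

Under these assumptions, tripling the clickthrough rate triples the publisher’s CPC revenue, while CPI revenue doesn’t move at all. That is why CPC billing puts the publisher on the side of relevance.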

To get ad targeting right you need a combination of cutting-edge algorithms, sophisticated systems and mountains of relevant data. Putting all these in place requires a world-class engineering team, the right product, and a lot of users. Not many companies have all these assets.

Microsoft does, though, which makes this recent news somewhat perplexing. I guess that just having the ability to do something isn’t enough. Whether you actually get out there and do it or not is the $6.2B question.