Programmatic Platforms vs. “Standard” Digital Platforms

By Eric Picard (Originally published on AdExchanger.com August 5th, 2013)

I’ve been struggling lately with some oddities in how the “programmatic” media space functions. Ad-tech infrastructure — both the “standard” digital infrastructure made up of publishers’ ad servers (DoubleClick for Publishers, 24/7 Open AdStream, Adtech, OpenX, etc.) and the real-time bidding platforms that run alongside those legacy platforms — has drawn huge investment. But misunderstandings and false assumptions abound about how these technologies operate, their limitations and what types of businesses should use them.

My friend, Jed Nahum, wrote a great article last week about programmatic buying and selling, including the sometimes confusing ways in which we use the word “programmatic” because of the complexity we’ve created. What I loved about Jed’s article is that he laid out a taxonomy of four different ways the “programmatic advertising world” operates (or will soon operate):

  1. RTB/programmatic spot: The RTB world we all know and love.
  2. Deal ID/private marketplaces: Using the “RTB pipes” to execute buys that are similar to direct buys, but that aren’t guaranteed.
  3. Programmatic direct: Using non-RTB “pipes” to buy directly from publishers, including guaranteed buys previously supported only via a direct human sales relationship.
  4. Programmatic forward: The (still-to-come) extension of the Deal ID/private marketplace world to guaranteed/reserved buys over the RTB “pipes.”

I really like Jed’s taxonomy because it calls out the very real differences in how various constituents in our space operate. If I do a quick mapping of vendors to their various spaces here (and I’m bound to forget a few), you can see that the players are siloed in their approaches. Full disclosure: My company, Rare Crowds, sits across all of these boxes today – although we don’t have a bidder or an ad server. We sit above that layer and push into each of those various platforms, so in a sense we don’t compete with any of these companies, but would partner with any of them.

[Table omitted from this version: a mapping of vendors to the four programmatic categories above.]

The problem I have with this is that the vehicles to programmatically buy media are locked to the plumbing (technology layer) that supports them. And yet they’re all programmatic. That would be fine if we didn’t have this “programmatic forward” component at the bottom — the one that all the RTB folks are working toward.

I still stand by my definition of programmatic as any method of buying or selling media that enables a buyer to complete a media purchase without human intervention from a seller. I’ll push forward by saying that — contrary to popular belief — the technology layer, the plumbing, is irrelevant to the channel.

When I talk to people from the programmatic-direct world, they argue they’re the logical path for managing guaranteed direct-media buys because they’re directly plugged into publishers’ platforms. But when I talk to the RTB folks, they make a very good argument about how they can expose publisher inventory “directly” between a buying platform and the publisher’s ad server with a check-box configuration setting — and that it’s better for buyers because they can apply all their first-party data to the buys and get the best of all worlds.

The reality is this: Both sides are “sort of” right. But, in the end, it just doesn’t matter. All the vendors will ultimately plug into each other and liquidity will flow. The programmatic-direct vendors, though, need to make sure that they don’t miss the value proposition of all the various partnerships they should be creating.

One of the most significant developments in our space in the last few years was the DoubleClick Ad Exchange’s rollout of dynamic allocation, the next-generation technology that replaces the previous AdMeld product. Essentially it’s a switch that sits between the exchange and DoubleClick for Publishers and makes real-time decisions about how to allocate inventory to the exchange and for guaranteed buys. Maxifier offers a very similar product that will work with other exchanges (which is also super important, but I give the nod to Google because of its scale).

The reason this matters is that publishers need a dynamic-allocation technology that regulates the decision about which line items get the impression. The yield increases are significant, especially those coming through the exchange.

Dynamic-allocation technologies level the playing field between the RTB players and the programmatic-direct players. Buyers will ultimately need to be able to support both channels. This is critical for decision-makers at agency trading desks or large technical advertisers with their own platforms, such as eBay or Amazon.

They need a way to rationalize when they should be doing dynamic buys, controlled at the impression level (RTB), and when they should be doing direct buys in advance.  Ideally, they need a system that sits above all of these various channels and allocates budget to them in advance, but that also monitors and optimizes how those budgets are allocated throughout the life of a campaign.
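
To make the idea concrete, here is a minimal sketch, in Python, of what such a cross-channel allocation layer might do. The channel names, performance scores and budget figures are invented for illustration; a real system would optimize against pacing, forecasts and many more constraints.

```python
# Hypothetical sketch: reallocate the remaining campaign budget across buying
# channels (RTB spot, private marketplace, programmatic direct) part-way
# through a flight, weighting each channel by its observed performance so far.

def reallocate(total_remaining, performance_by_channel):
    """Split the remaining budget in proportion to each channel's
    performance score (e.g., conversions per dollar spent so far)."""
    total_score = sum(performance_by_channel.values())
    if total_score == 0:
        # No signal yet: spread the budget evenly across channels.
        even_share = total_remaining / len(performance_by_channel)
        return {channel: even_share for channel in performance_by_channel}
    return {
        channel: total_remaining * score / total_score
        for channel, score in performance_by_channel.items()
    }

# Example: $50,000 left in the flight; RTB spot is converting best so far.
print(reallocate(50_000, {
    "rtb_spot": 0.8,
    "private_marketplace": 0.5,
    "programmatic_direct": 0.3,
}))
```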

Regardless of what any one vendor will tell you, all the functionality of the current set of “legacy” ad-server technologies will be replicated in the RTB stack over the next few years. And the current lines that sit between those stacks will get blurrier. Anyone preaching strict separation between those stacks is a bit suspect.

How arbitrage works in digital advertising today

By Eric Picard (Originally published on iMediaConnection.com July 11, 2013)

The idea that ad networks, trading desks, and various technical “traders” in the ad ecosystem are “arbitraging” their customers is fairly well understood. The idea that an intermediary of some kind sells ad inventory to a media buyer, but then buys it somewhere else for a lower price is a pretty basic reality in our industry. But what most of us don’t understand is how it gets done and especially how technically advanced firms are doing it.

So let’s talk about this today — how arbitrage is enacted by various constituents, and I’d love to hear some reactions in the comments about how marketers and media buyers feel about it, especially if they weren’t aware of how it was done. Note: There are many ways to do this; I’m just going to give you some examples.

Old school networks

Back in the day, ad networks would go to large publishers and negotiate low-priced remnant buys (wholesale buys), picking up raw impressions for pennies on the dollar, with the rule that the network could only resell those impressions without identifying the publisher (blind inventory resale).

The networks that have done this well traditionally apply some targeting capabilities to sell based on context/content and also audience attributes. But even this is all very old school. The more advanced networks even back in the old days employed a variety of yield optimization technologies and techniques on top of targeting to ensure that they took as little risk on inventory as possible.

RTB procurement

Many networks now use the exchanges as their primary “procurement” mechanism for inventory. In this world there’s very little risk for networks, since they can set up each individual campaign in advance to procure inventory at lower prices than they’ve sold it for. There is some risk that they won’t be able to procure enough inventory to cover what they’ve pre-sold. But the risk of being left holding a large amount of unsold inventory is much lower, which saves money.

Once you mitigate that primary risk and then add in the ability to ensure margin by setting margin limits, which any DSP can do “off the shelf,” the risk in managing an ad network is so low that almost anyone can do it — as long as you don’t care about maximizing your margins. That’s where a whole new class of arbitrage has entered the market.
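
For what it’s worth, the margin-limit arithmetic itself is trivial. A sketch with made-up numbers, just to show the mechanics of capping bids to protect a resale margin:

```python
def max_bid_cpm(resale_cpm, target_margin):
    """Highest CPM the network can pay and still keep its target margin on
    inventory it has already resold at resale_cpm."""
    return resale_cpm * (1 - target_margin)

# Resold at a $4.00 CPM with a 30% margin target: never bid above $2.80.
print(max_bid_cpm(4.00, 0.30))
```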

Technical arbitrage

There are many different ways that companies are innovating around arbitrage, but I’ll give you a baseline summary so you can understand why many of the networks (or networks that advertise themselves as some kind of “services-based DSP”) are able to be successful today.

Imagine a network that has built an internal ad platform that enables the following:

  • Build a giant (anonymous) cookie pool of all users on the internet.
  • Develop a statistical model for each user that tracks which sites the network has historically seen that user on, by day and day of week.
  • Develop a historical model of how much inventory on each site tends to cost to win on the exchange, perhaps even for each individual user.
  • When a campaign tries to reach a specific type of user, the system matches against each user; then, in the milliseconds before the bid must be returned, the network’s systems determine how likely they are to see this user again that day — and whether they’re likely to find that user on sites where they’ve historically been able to buy inventory for less money than the site the user is on at the moment.
  • If the algorithm thinks it can find that user for less money, it either bids low or ignores the bid opportunity until it sees that user later in the day — when it probably can win the bid. (A simplified sketch of this logic follows the list.)
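
Here is the simplified sketch promised above. It is purely illustrative (not any particular network’s system), and the sites, probabilities and prices are invented:

```python
# Illustrative sketch only: decide whether to bid on the current impression
# for a user, or skip it and wait to find the same user later today on a
# historically cheaper site.

def bid_decision(user_history, current_site, current_floor_cpm, campaign_max_cpm):
    """user_history maps site -> (probability of seeing this user there today,
    typical CPM needed to win that user's impressions on that site)."""
    cheaper_elsewhere = any(
        p_seen > 0.5 and winning_cpm < current_floor_cpm
        for site, (p_seen, winning_cpm) in user_history.items()
        if site != current_site
    )
    if cheaper_elsewhere:
        return "pass"   # expect to win this user later, for less money
    if current_floor_cpm <= campaign_max_cpm:
        return "bid"    # no cheaper path to this user is likely today
    return "pass"

history = {"news-site.example": (0.9, 1.20), "premium-site.example": (0.4, 5.50)}
print(bid_decision(history, "premium-site.example", 5.00, 6.00))  # pass
```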

This kind of technology is now running on a good number of networks, with many “variations” on this theme — some networks are using data related to time of day to make optimization decisions. One network told me that it finds that users are likely to click and convert first thing in the morning (before they start their busy day), in mid-morning surfing (after they’ve gotten some work done), after lunch (when they’re presumably trying to avoid nodding off), and in the late afternoon before going home for the day. They optimize their bidding strategy around these scenarios either by time of day or (in more sophisticated models) depending on the specific user’s historical behavior.

You shouldn’t begrudge the networks too much for this “technical arbitrage,” since all that technology requires a significant upfront investment. They’re still giving you access to the same user pool — but one question that nags at me is whether they’re giving you that user on sites that aren’t very good.

It also raises a question: if these very technical networks are buying their inventory on a per-impression basis, all the stories about fraud make me a little worried. A truly sophisticated algorithm that tracks unique IDs should be able to see that some of those IDs are racking up too many impressions to be human. But I haven’t done any analysis on this — it’s just a latent concern I have.

Life after the death of 3rd Party Cookies

By Eric Picard (Originally published on AdExchanger.com July 8th, 2013)

In spite of plenty of criticism by the IAB and others in the industry, Mozilla is moving forward with its plan to block third-party cookies and to create a “Cookie Clearinghouse” to determine which cookies will be allowed and which will be blocked.  I’ve written many articles about the ethical issues involved in third-party tracking and targeting over the last few years, and one I wrote in March — “We Don’t Need No Stinkin’ Third-Party Cookies” — led to dozens of conversations on this topic with both business and technology people across the industry.

The basic tenor of those conversations was frustration. More interesting to me than the business discussions, which tended to be both inaccurate and hyperbolic, were my conversations with senior technical leaders within various DSPs, SSPs and exchanges. Those leaders’ reactions ranged from completely freaked out to subdued resignation. While it’s clear there are ways we can technically resolve the issues, the real question isn’t whether we can come up with a solution, but how difficult it will be (i.e. how many engineering hours will be required) to pull it off.

Is This The End Or The Beginning?

Ultimately, Mozilla will do whatever it wants to do. It’s completely within its rights to stop supporting third-party cookies, and while that decision may cause chaos for an ecosystem of ad-technology vendors, it’s completely Mozilla’s call. The company is taking a moral stance that’s, frankly, quite defensible. I’m actually surprised it’s taken Mozilla this long to do it, and I don’t expect it will take Microsoft very long to do the same. Google may well follow suit, as taking a similar stance would likely strengthen its own position.

To understand what life after third-party cookies might look like, companies first need to understand how technology vendors use these cookies to target consumers. Outside of technology teams, this understanding is surprisingly difficult to come by, so here’s what you need to know:

Every exchange, Demand-Side Platform, Supply-Side Platform and third-party data company has its own large “cookie store,” a database of every single unique user it encounters, identified by an anonymous cookie. If a DSP, for instance, wants to use information from a third-party data company, it needs to be able to accurately match that third-party cookie data with its own unique-user pool. So in order to identify users across various publishers, all the vendors in the ecosystem have connected with other vendors to synchronize their cookies.

With third-party cookies, they could do this rather simply. While the exact methodology varies by vendor, it essentially boils down to this:

  1. The exchange, DSP, SSP or ad server carves off a small number of impressions for each unique user for cookie syncing. All of these systems can predict pretty accurately how many times a day they’ll see each user and on which sites, so they can easily determine which impressions are worth the least amount of money.
  2. When a unique ID shows up in one of these carved-off impressions, the vendor serves up a data-matching pixel for the third-party data company. The vendor places its unique ID for that user into the call to the data company. The data company looks up its own unique ID, which it then passes back to the vendor along with the vendor’s unique ID.
  3. That creates a lookup table between the technology vendor and the data company so that when an impression happens, all the various systems are mapped together. In other words, when it encounters a unique ID for which it has a match, the vendor can pass the data company’s ID to the necessary systems in order to bid for an ad placement or make another ad decision.
  4. Because all the vendors have shared their unique IDs with each other and matched them together, this creates a seamless (while still, for all practical purposes, anonymous) map of each user online. (A toy sketch of the resulting lookup table follows this list.)
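
Conceptually, the end result of all that pixel syncing is nothing more than a lookup table keyed on each vendor’s anonymous ID. A toy sketch, with invented IDs and segment names:

```python
# Toy sketch of the outcome of cookie syncing: a match table that maps the
# DSP's anonymous user ID to the data company's anonymous user ID.

match_table = {}  # dsp_user_id -> data_company_user_id

def record_sync(dsp_user_id, data_company_user_id):
    """Called when the data company's sync pixel fires and returns its ID."""
    match_table[dsp_user_id] = data_company_user_id

def lookup_segments(dsp_user_id, data_company_segments):
    """At decision time, translate the DSP's ID and fetch third-party segments."""
    partner_id = match_table.get(dsp_user_id)
    return data_company_segments.get(partner_id, []) if partner_id else []

record_sync("dsp-abc123", "dc-98765")
segments_by_id = {"dc-98765": ["auto-intender", "hhi-100k-plus"]}
print(lookup_segments("dsp-abc123", segments_by_id))  # ['auto-intender', 'hhi-100k-plus']
```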

All of this depends on the basic third-party cookie infrastructure Mozilla is planning to block, which means that all of those data linkages will be broken for Mozilla users. Luckily, some alternatives are available.

Alternatives To Third-Party Cookies

1)  First-Party Cookies: First-party cookies also can be (and already are) used for tracking and ad targeting, and they can be synchronized across vendors on behalf of a publisher or advertiser. In my March article about third-party cookies, I discussed how this can be done using subdomains.

Since then, several technical people have told me they couldn’t use the same cross-vendor-lookup model, outlined above, with first-party cookies — but generally agreed that it could be done using subdomain mapping. Managing subdomains at the scale that would be needed, though, creates a new hurdle for the industry. To be clear, for this to work, every publisher would need to map a subdomain for every single vendor and data provider that touches inventory on its site.

So there are two main reasons that switching to first-party cookies is undesirable for the online-ad ecosystem:  first, the amount of work that would need to be done; second, the lack of a process in place to handle all of this in a scalable way.

Personally, I don’t see anything that can’t be solved here. Someone needs to offer the market a technology solution for scalable subdomain mapping, and all the vendors and data companies need to jump through the hoops. It won’t happen in a week, but it shouldn’t take a year. First-party cookie tracking (even with synchronization) is much more ethically defensible than third-party cookies because, with first-party cookies, direct relationships with publishers or advertisers drive the interaction. If the industry does switch to mostly first-party cookies, it will quickly drive publishers to adopt direct relationships with data companies, probably in the form of Data Management Platform relationships.

2) Relying On The Big Guns: Facebook, Google, Amazon and/or other large players will certainly figure out how to take advantage of this situation to provide value to advertisers.

Quite honestly, I think Facebook is in the best position to offer a solution to the marketplace, given that it has the most unique users and its users are generally active across devices. This is very valuable, and while it puts Facebook in a much stronger position than the rest of the market, I really do see Facebook as the best voice of truth for targeting. Despite some bad press and some minor incidents, Facebook appears to be very dedicated to protecting user privacy – and also is already highly scrutinized and policed.

A Facebook-controlled clearinghouse for data vendors could solve many problems across the board. I trust Facebook more than other potential solutions to build the right kind of privacy controls for ad targeting. And because people usually log into only their own Facebook account, this avoids the problems that have hounded cookie-based targeting when people share devices, such as when a husband uses his wife’s computer one afternoon and suddenly her laptop thinks she’s a male fly-fishing enthusiast.

3) Digital Fingerprinting: Fingerprinting, of course, is as complex and as fraught with ethical issues as third-party cookies, but it has the advantage of being an alternative that many companies already are using today. Essentially, fingerprinting analyzes many different data points that are exposed by a unique session, using statistics to create a unique “fingerprint” of a device and its user.

This approach suffers from one of the same problems as cookies, the challenge of dealing with multiple consumers using the same device. But it’s not a bad solution. One advantage is that fingerprinting can take advantage of users with static IP addresses (or IP addresses that are not officially static but that rarely change).
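
To make the mechanics concrete, here is a minimal, hypothetical sketch of the hashing step: combine whatever session attributes are observable into a single stable identifier. Real fingerprinting systems weigh far more signals statistically; the attribute names below are just examples.

```python
import hashlib

def device_fingerprint(attributes):
    """Hash a dict of observable session attributes into one identifier.
    Which attributes are available, and how stable they are, varies widely."""
    canonical = "|".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

print(device_fingerprint({
    "user_agent": "Mozilla/5.0 (Windows NT 6.1; rv:21.0) ...",
    "ip": "203.0.113.42",
    "screen": "1920x1080",
    "timezone": "UTC-5",
}))
```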

Ultimately, though, this is a moot point because of…

4) IPv6: IPv6 is on the way. This will give every computer and every device a static, permanent unique identifier, at which point IPv6 will replace not only cookies, but also fingerprinting and every other form of tracking identification. That said, we’re still a few years away from having enough IPv6 adoption to make this happen.

If Anyone From Mozilla Reads This Article

Rather than blocking third-party cookies completely, it would be fantastic if you could leave them active during each session and just blow them away at the end of each session. This would keep the market from building third-party profiles, but would keep some very convenient features intact. Some examples include frequency capping within a session, so that users don’t have to see the same ad 10 times; and conversion tracking for DR advertisers, given that DR advertisers (for a whole bunch of stupid reasons) typically only care about conversions that happen within an hour of a click. You already have Private Browsing technology; just apply that technology to third-party cookies.

Why no one can define “premium” inventory

By Eric Picard (Originally published on iMediaConnection.com on June 17th, 2013)

What is premium inventory? The simple answer is that it’s inventory that the advertiser would be happy to run its advertising on if it could manually review every single publisher and page that the ad was going to appear within.

When buyers make “direct” media buys against specific content, they get access to this level of comfort, meaning that they don’t have to worry about where their media dollars end up being spent. But this doesn’t scale well across more than a few dozen sales relationships.

To address this problem of scale, buyers extend their media buys through ad networks and exchange mechanisms. But in this process, they often lose control over where their ads will run. Theoretically the ad network is acting as a proxy of the buyer in order to support the need for “curation” of the ad experience, but this clearly is not usually the case. Ad networks don’t actually have the technology to handle curation of the advertising experience (i.e., monitoring the quality of the publishers and pages they are placing advertising on) at scale any more than the media buyer does, which leads to frequent problems of low quality inventory on ad networks.

Now apply this issue to the new evolution of real-time bidding and ad exchanges. A major problem with buying on exchanges is that the curation problem gets dropped back into the laps of buyers, across more publishers and pages than they can manually curate, which requires a whole new set of skills and tools. But the skills aren’t there yet, and the problem hasn’t been handled well by the various systems providers. So the agencies build out trading desks where that skillset is supposed to live, but the quality of the end results is highly suspect, as we’re seeing from all the recent articles on fraud.

So the true answer to this conundrum of what is premium must be to find scalable mechanisms to ensure that a brand’s quality goals for the inventory it is running advertising against are met. The market needs to be able to efficiently execute media buys against high-quality inventory at media prices that buyers are comfortable paying — if not happy to pay.

The definition of “high quality” is an interesting problem with which I’ve been struggling. Here’s what I’ve come up with: Every brand has its own point of view on “high quality” because it has its own goals and brand guidelines. A pharma advertiser might want to buy ad inventory on health websites, but it might want to only run on general health content, not content that is condition specific. Or an auto advertiser might want to buy ad inventory on auto-related content, but not on reviews of automobiles.

Most brands obviously want to avoid porn, hate speech, and probably gambling pages — but what about content that is very cluttered with ads or where the page layout is so ugly that ads will look like crap? Or pages that are relatively neutral — meaning not good, but not horrible?

Then we run into a problem that nobody has been willing to bring up broadly, but it’s one that gets talked about all the time privately: Inventory is a combination of publisher, page, and audience.

How are we defining audience today? There’s blended data such as comScore or Nielsen data, which use methodologies that are in some cases vetted by third parties, but relatively loosely. There’s first-party data such as CRM, retargeting, or publisher registration data, which will vary broadly in quality based on many issues but are generally well understood by the buyer and the seller. And there’s third-party data from data companies. But frankly, nobody is rating the quality of this data. Even on a baseline level, there are no neutral parties evaluating the methodology used from a data sciences point of view to validate that the method is defensible. And as importantly, there is no neutral party measuring the accuracy of the data quantitatively (e.g., a data provider says that this user is from a household with an income above $200,000, but how have we proven this to be true?).

When we talk about currency in this space, we accept whatever minimum bar the industry has laid down as truth via the Media Rating Council, hold our nose, and move forward. But we’ve barely got impression guidelines the industry is willing to accept, let alone all of these other things like page clutter and accuracy of audience data.

And even more importantly, nobody is looking at all the data (publisher, page, audience) from the point of view of the buyer. And as we discussed above, every buyer — and potentially every campaign for every brand — will view quality very differently. Because the skillset of measuring quality is in direct competition with the goal of getting budgets spent efficiently — or what some might call scale — nobody wants to talk about this problem. After all, if buyers start getting picky about the quality of the inventory on any dimension, the worry is that they might reduce the scale of inventory available to them. The issues are directly in conflict with each other. Brand safety, inventory quality, and related issues should be handled as a separate policy matter from media buying, as the minimum quality bar should not be subject to negotiation based on scale issues. Running ads on low-quality sites is a bad idea from a brand perspective, and that line shouldn’t be crossed just to hit a price or volume number.

So instead we talk about the issue sitting in front of our nose that has gotten some press: fraud. The questions that advertisers are raising about our channels center around this concern. But the advertisers should be asking lots of questions about the broader issue — which is, “How are you making sure that my ads are running on high-quality inventory?” Luckily there are some technologies and services on the market that can help provide quality inventory at scale, and this area of product development is only going to get better over time.

Which Type Of Fraud Have You Been Suckered Into?

By Eric Picard (Originally published by AdExchanger.com on May 30th, 2013)

For the last few years, Mike Shields over at Adweek has done a great job of calling out bad actors in our space.  He’s shined a great big spotlight on the shadowy underbelly of our industry – especially where ad networks and RTB intersect with ad spend.

Many kinds of fraud take place in digital advertising, but two major kinds are significantly affecting the online display space today. (To be clear, these same types of fraud also affect video, mobile and social. I’m just focusing on display because it attracts more spending and it’s considered more mainstream.) I’ll call these “page fraud” and “bot fraud.”

Page Fraud

This type of fraud is perpetrated by publishers who load many different ads onto one page.  Some of the ads are visible, others hidden.  Sometimes they’re even hidden in “layers,” so that many ads are stacked on top of one another and only the top one is visible. Sometimes the ads are hidden within iframes that are set to 1×1 pixel size (so they’re not visible at all). Sometimes they’re simply rendered off the page in hidden frames or layers.

It’s possible that a publisher using an ad unit provided by an ad network could be unaware that the network is doing something unscrupulous – at least at first. But such publishers are like pizza shops that sell more pizzas than they could possibly make with the flour they’ve purchased: they may not know the exact nature of the bad behavior, but they must eventually realize that something funny is going on. The bad behavior is plain to any publisher that compares the number of page views it gets with the number of ad impressions it sells. So I don’t cut them any slack.

This page fraud, by the way, is not the same thing as a “viewability” problem, in which below-the-fold ads never render visibly on the user’s screen. Page fraud is perpetrated by the company that owns the web page on which the ads are supposed to be displayed. Those owners knowingly do so, either by programming their web pages with these fraudulent techniques or by using networks that sell fake ad impressions on their pages.

There are many fraud-detection techniques you can employ to make sure that your campaign isn’t the victim of page fraud. And there are many companies – such as TrustMetrics, Double Verify and Integral Ad Science – that offer technologies and services to detect, stop and avoid this type of fraud. Foiling it requires page crawling as well as advanced statistical analysis.
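
As a rough illustration of the page-crawling side, the sketch below flags ad iframes that are sized 1x1 or hidden via inline CSS. It assumes the beautifulsoup4 library and is only a first-pass heuristic; the vendors named above combine many more signals with statistical analysis.

```python
# First-pass heuristic: flag ad iframes that are 1x1 (or zero-sized) or hidden
# via inline CSS. Assumes the beautifulsoup4 package is installed.
from bs4 import BeautifulSoup

def suspicious_ad_frames(html):
    soup = BeautifulSoup(html, "html.parser")
    flagged = []
    for frame in soup.find_all("iframe"):
        style = (frame.get("style") or "").replace(" ", "").lower()
        tiny = frame.get("width") in ("0", "1") or frame.get("height") in ("0", "1")
        hidden = "display:none" in style or "visibility:hidden" in style
        if tiny or hidden:
            flagged.append(frame.get("src"))
    return flagged

page = '<iframe src="https://ads.example/slot1" width="1" height="1"></iframe>'
print(suspicious_ad_frames(page))  # ['https://ads.example/slot1']
```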

Bot Fraud

This second type of fraud, which can be perpetrated by a publisher or a network, is a much nastier kind of fraud than page fraud. It requires real-time protection that should ultimately be built into every ad server in the market.

Bot fraud happens when a fraudster builds a software robot (or bot) – or uses an off-the-shelf bot – that mimics the behavior of a real user. Simple bots pretend to be a person but behave in a repetitive way that can be quickly identified as nonhuman; perhaps the bot doesn’t rotate its IP address often and creates either impressions or clicks faster than humanly possible. But the more sophisticated bots are very difficult to differentiate from humans.
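
A crude sketch of the kind of first-pass heuristic described above: count ad events per anonymous ID (or IP) per minute and flag anything faster than a human could plausibly be. The threshold is invented, and production systems are far more sophisticated.

```python
from collections import defaultdict

# Flag IDs that generate ad events faster than a human plausibly could.
# The threshold is invented for illustration.
MAX_EVENTS_PER_MINUTE = 30

def flag_suspected_bots(events):
    """events: iterable of (cookie_or_ip_id, unix_timestamp) pairs."""
    per_minute = defaultdict(lambda: defaultdict(int))
    for uid, ts in events:
        per_minute[uid][int(ts // 60)] += 1
    return {
        uid for uid, minute_counts in per_minute.items()
        if max(minute_counts.values()) > MAX_EVENTS_PER_MINUTE
    }

# "id-1" fires 200 events in a few minutes and gets flagged; "id-2" does not.
print(flag_suspected_bots([("id-1", 1000 + i) for i in range(200)] + [("id-2", 1000)]))
```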

Many of these bots are able to mimic human behavior because they’re backed by “botnets” that sit on thousands of computers across the world and take over legitimate users’ machines.  These “zombie” computers then bring up the fraudsters’ bot software behind the scenes on the user’s machine, creating fake ad impressions on a real human’s computer.  (For more information on botnets, read “A Botnet Primer for Advertisers.”) Another approach that some fraudsters take is to “farm out” the bot work to real humans, who typically sit in public cyber cafes in foreign countries and just visit web pages, refreshing and clicking on ads over and over again. These low-tech “botnets” are generally easy to detect because the traffic, while human and “real,” comes from a single IP address and usually from physical locations where the heavy traffic seems improbable – often China, Vietnam, other Asian countries or Eastern Europe.

Many companies have invested a lot of money to stay ahead of bot fraud. Google’s DoubleClick ad servers already do a good job of avoiding these types of bot fraud, as do Atlas and others.

Anecdotally, though, newer ad servers such as the various DSPs seem to be having trouble with this; I’ve heard examples through the grapevine on pretty much all of them, which has been a bit of a black eye for the RTB space. This kind of fraud has been around for a very long time and only gets more sophisticated; new bots are rolled out as quickly as new detection techniques are developed.

The industry should demand that their ad servers take on this problem of bot fraud detection, as it really can only be handled at scale by significant investment – and it should be built right into the core campaign infrastructure across the board. Much like the issues of “visible impressions” and verification that have gotten a lot of play in the industry press, bot fraud is core to the ad-serving infrastructure and requires a solution that uses ad-serving-based technology. The investment is marginal on top of the existing ad-serving investments that already have been made, and all of these features should be offered for free as part of the existing ad-server fees.

Complain to – or request bot-fraud-detection features from – your ad server, DSP, SSP and exchange to make sure they’re prioritizing feature development properly. If you don’t complain, they won’t prioritize this; instead, you’ll get less-critical new features first.

Why Is This Happening?

I’ve actually been asked this a lot, and the question seems to indicate a misunderstanding – as if it were some sort of weird “hacking” being done to punish the ad industry. The answer is much simpler:  money.  Publishers and ad networks make money by selling ads. If they don’t have much traffic, they don’t make much money. With all the demand flowing across networks and exchanges today, much of the traffic is delivered across far more and smaller sites than in the past. This opens up significant opportunities for unscrupulous fraudsters.

Page fraud is clearly aimed at benefiting the publisher, but it also benefits the networks. Bot fraud is a little less clear – and I do believe that some publishers who aren’t aware of fraud are getting paid for bot-created ad impressions. In these cases, the network that owns the impressions has configured the bots to drive up its revenues. But as I said above, publishers would have to be almost incompetent not to notice the difference between the number of impressions delivered by a bot-fraud-committing ad network and the numbers provided by third parties like Alexa, comScore, Nielsen, Compete, Hitwise, Quantcast, Google Analytics, Omniture and others.

Media buyers should be very skeptical when they see reports from ad networks or DSPs showing millions of impressions coming from sites that clearly aren’t likely to have millions of impressions to sell. And if you’re buying campaigns with any amount of targeting – especially targeting that should significantly limit available inventory, such as geo or income – or with frequency caps, you need to be extra skeptical when reviewing your reports, or use a service that does that analysis for you.

Changing The Industry: Don’t Just Complain, Do Something

By Eric Picard (Originally published on AdExchanger – May 6th, 2013)

Take it from me: It’s easy to complain. I’ve been writing monthly articles in this space since 1999, which I think qualifies me as a “semi-professional” complainer. In these articles, I discuss issues facing our industry in a public forum and – sometimes – offer suggestions on how we can fix them. But what gets masked, perhaps, is how problems are really solved behind the scenes in our industry.

I’ve been involved in many industry projects over the years. Some of these projects have come through industry organizations like the Interactive Advertising Bureau, some through my employers, and some through unofficial channels such as informal dinners with industry folks that unexpectedly yield ideas for solutions.

Ultimately, things change when someone is passionate about a topic, decides to do more than just talk about it, and rallies a group of people to make that change happen. More than anything, in my experience, what matters is intent. Pushing forward with the intent to make change happen is what drives change. It sounds simple, but it really is the case.

Recently, I had an experience that illustrates my point. But first, let me set the stage…

In ad technology, most of the company leaders know each other pretty well. Despite Terry Kawaja’s best efforts at showing how complex the “LUMAscape” is, the reality is that we’re a pretty compact industry with a fairly small number of key people.

When it comes to ad technology, there are probably only 100-200 people who’ve been primarily responsible for turning the crank these past 15-20 years – and most of us know each other. Even when we find ourselves competing, we find plenty of respect for each other, as well as – perhaps surprisingly – a lot of consensus on the big issues. When I get in a room with Tom Shields from Yieldex, or Tony Katsur from Maxifier, or John Ramey from isocket, the conversation is always going to be fun because we’re all knowledgeable and passionate about the issues. Even if we don’t talk regularly, we can dive right into these conversations because we can use shorthand. We don’t have to explain any context or background; we all have a baseline of understanding.

So when I started reading my friends’ and colleagues’ articles complaining about something as simple as what we call a new category of digital media, I realized a problem was brewing.

The Great Programmatic Premium Debate

The term “programmatic premium” has taken on a life of its own and become confusing to many in the industry. Some folks on the real-time bidding side of the market started using the term to show that ad exchanges were finally getting access to “premium” inventory. But there’s also a whole new category of media tools being used to buy and sell inventory through direct publisher integration. These companies were also using the term “programmatic premium,” but with a whole different set of meanings.

After reading the fifth or sixth article on this topic, I got fed up. I sent an email to all the people who were working in this new space and suggested that we get together and help the industry by deciding what the name should be, developing a position and pushing it to the market. I remembered what happened with “demand-side platforms” and “supply-side platforms,” and felt that if we could come to some consensus on “programmatic premium,” perhaps we could help fix the problem quickly.

I knew that most of the folks involved were likely to attend the AdExchanger Programmatic I/O conference in San Francisco, so I suggested that the group meet at the conference and discuss this. A few weeks ago, this informal industry roundtable – including representatives from eight companies – met during the conference and also gathered input before and after the meeting from others who couldn’t attend in person.

After lengthy debate, we came to a unanimous decision: The best name to describe inventory purchased in an automated way through a buying or selling tool that is directly integrated with a publisher’s ad server is “programmatic direct.” This name reflects the difference between programmatic real-time bidding (RTB) and the growing category of directly integrated buying and selling tools that can replace the antiquated request-for-proposal (RFP) and insertion-order (I/O) process that has been in place since the industry began.

Last week, John Ramey from isocket published a summary of the discussion on AdExchanger, to which all the participants added comments. The effort ended in success, and now we can move forward as an industry.

How to Drive Change: Do Something

But this is just one example of many. My point is that the potential to move the industry forward lies in all our hands. We are creating the future of advertising and media – and most people don’t realize the power they hold in that process. Every fire needs a spark to ignite it. It’s from you that real change can come.

There are dozens of examples like the one above, showing how with a little effort, consensus building and leadership, the industry can change.

Another such example is the Universal Ad Package that the IAB pushed through in record time back in 2002. The legend is that this effort started when some folks from Microsoft, Disney and Yahoo got together for drinks at a conference and realized that they could shortcut the whole process of pushing better ad formats through the IAB if they did the legwork ahead of time. They invited several other publishers to the conversation, and before long they had a consensus and a very complete body of work that made the standardization job much easier by the time it arrived at the IAB.

Whether you’re trying to change the name of a category, fix a fundamental problem with the number of standard ad formats, or change the way that some major piece of technology works, don’t shy away from having those conversations. If you don’t like the way a product works, don’t complain to your colleagues. Instead, send an email or pick up the phone and call the person in charge of product management for that product.  You’d be shocked by how few people do this, and how much the product managers want to hear this kind of feedback.

If you are passionate about changing the industry, then do it. Join an IAB committee. Write emails, write articles, have dinner at conferences with like-minded colleagues – and even competitors.  That’s how change happens. That’s how you get things done.

Targeting fundamentals everyone should know


By Eric Picard (Originally published in iMediaConnection, April 11th, 2013)

Targeting data is ubiquitous in online advertising and has become something close to “currency.” And I mean currency in the same way that we think about Nielsen ratings in TV or impression counts in digital display. We pay for inventory today in many cases based on a combination of the publisher, the content associated with the impression, and the data associated with a variety of elements. This includes the IP address of the computer (lots of derived data comes from this), the context of the page, various content categories and quality metrics, and — of course — behavioral and other user-based targeting attributes.

But for all the vetting done by buyers of base media attributes, such as the publisher or the page or quality scores, there’s still very little understanding of where targeting data comes from. And even less when it comes to understanding how it should be valued and why. So this article is about just that topic: how targeting data is derived and how you should think about it from a value perspective.

Let’s get the basic stuff out of the way: anything derived from the IP address and user agent. When a browser visits a web page, it spits out a bunch of data to the servers that it accesses. The two key attributes are IP address and user agent. The IP address is a simple one; it’s the number assigned to the user’s computer by the internet to allow that computer to be identified by the various servers it touches. It’s a unique number that allows an immense amount of information to be inferred; the key piece of information inferred is the geography of the user.

There are lots of techniques used here, with varying degrees of “granularity.” But we’ll just leave it at the idea that companies have amassed lists of IP addresses assigned to specific geographic locations. It’s pretty accurate in most cases, but there are still scenarios where people connect to the internet via private networks (such as a corporate VPN) that confuse the world by assigning IP addresses to users in one location when they are actually in another. This was the classic problem with IP-address-based geography back in the days of dial-up, when most users showed up as part of Reston, Va. (where AOL had its data centers). Today, when most users are on broadband, the mapping is much more accurate and comprehensive.
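
Mechanically, geo-IP lookup reduces to matching an address against a licensed, constantly updated table of ranges. The sketch below uses Python’s standard ipaddress module and an invented two-row table just to show the shape of the lookup.

```python
# The range table below is invented (using documentation-reserved IP blocks);
# real providers license and constantly update tables like this.
import ipaddress

GEO_TABLE = [
    (ipaddress.ip_network("203.0.113.0/24"), "Seattle, WA"),
    (ipaddress.ip_network("198.51.100.0/24"), "Reston, VA"),
]

def geo_lookup(ip_string):
    address = ipaddress.ip_address(ip_string)
    for network, location in GEO_TABLE:
        if address in network:
            return location
    return "unknown"

print(geo_lookup("203.0.113.42"))  # Seattle, WA
```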

As important as geography are the various mappings that are done against location. Claritas, Prism, and other derived data products make use of geography to map a variety of attributes to the user browsing the page. And these techniques have moved out of traditional media (especially direct-response mailing lists) to digital and are quite useful. The only issue is that the further down the chain of assumptions used to derive attributes, the more muddled things become. Statistically, the data still is relevant, but on a per-user basis it is potentially completely inaccurate. That shouldn’t stop you from using this information, nor should you devalue it — but just be clear that there’s a margin of error here.

User agent is an identifier for the browser itself, which can be used to target users of specific browsers but also to identify non-browser activity that chooses to identify itself. For instance, various web crawlers such as search engines identify themselves to the server delivering a web page, and ad servers know not to count those ad impressions as human. This assumes good behavior on the part of the programmers, and sometimes “real” user agents are spoofed when the intent is to create fake impressions. Sometimes a malicious ad network or bad actor will do this to create fake traffic to drive revenue.
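
A minimal sketch of the self-identification check described above: look for known crawler tokens in the user-agent string. Real ad servers rely on maintained industry bot lists plus behavioral checks, precisely because user agents can be spoofed; the token list here is illustrative.

```python
# Filter self-identified crawlers by user-agent substring. The token list is
# illustrative; user agents can be spoofed, so this is only a first filter.
KNOWN_CRAWLER_TOKENS = ("googlebot", "bingbot", "slurp", "baiduspider")

def is_declared_crawler(user_agent):
    ua = user_agent.lower()
    return any(token in ua for token in KNOWN_CRAWLER_TOKENS)

print(is_declared_crawler("Mozilla/5.0 (compatible; Googlebot/2.1)"))        # True
print(is_declared_crawler("Mozilla/5.0 (Windows NT 10.0; Win64; x64) ..."))  # False
```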

Crawled data

There’s a whole class of data that’s derived by sending a robot to a web page, crawling through the content on the page, and classifying the content based on all sorts of analysis. This mechanism is how Google, Bing, and other search engines classify the web. Contextual targeting systems like AdSense classify the web pages into keywords that can be matched by ad sales systems. And quality companies, like Trust Metrics and others, scan pages and use hundreds or thousands of criteria to value the rank of the page — everything from ensuring that the page doesn’t contain porn or hate speech to analyzing the amount of white space around images and ads and the number of ads on a page.
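
A toy sketch of the crawl-and-classify step: strip a page down to text and pull out the most frequent terms as candidate keywords. It assumes the beautifulsoup4 library; real contextual and quality systems use much richer taxonomies, scoring criteria and models.

```python
# Toy keyword extraction: strip a page to text, drop a few stopwords, and
# count the most frequent terms. Assumes the beautifulsoup4 package.
from collections import Counter
from bs4 import BeautifulSoup

STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "best"}

def top_keywords(html, n=5):
    text = BeautifulSoup(html, "html.parser").get_text(" ").lower()
    words = (word.strip(".,!?") for word in text.split())
    counts = Counter(word for word in words if word and word not in STOPWORDS)
    return [word for word, _ in counts.most_common(n)]

print(top_keywords("<html><body><h1>Fly fishing gear</h1><p>The best fly rods "
                   "and fishing reels for river fishing.</p></body></html>"))
```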

User targeting

Beyond the basics of browser, IP, and page content, the world is much less simple. Rather than diving into methodologies and trying to simplify a complex problem, I’ll simply list and summarize the options here:

Registration data: Publishers used to require registration in order to access their content and, in that process, request a bunch of data such as address, demographics, psychographics, and interests. This process fell out of favor for many publishers over the years, but it’s coming back hard. Many folks in our industry are cynical about registration data, using their own experiences and feelings to discount the validity of user registration data. But in reality, this data is highly accurate; even for large portals, it is often higher than 70 percent accurate, and for news sites and smaller publishers, it’s much more accurate.

Interestingly, the use of co-registration through Facebook, Twitter, LinkedIn, and others is making this data much more accurate. One of the most valuable things about registration data is that it creates a permanent link between a user and the publisher that lives beyond the cookie. Subsequently captured data from various sessions is extremely accurate even if the user fudged his or her registration information.

First-party behavioral data: Publishers and advertisers have a great advantage over third parties in that they have a direct relationship with the user. This gives them incredible opportunities to create deeply refined targeting segments based on interest, behavior, and especially custom-created content such as showcases, contests, and other registration-driven experiences. Once a publisher or advertiser creates a profile of a user, it has the means to track and store very rich targeting data — much richer in theory than a third party could easily create. For instance, you might imagine that Yahoo Finance benefits greatly from registered users who track their stock portfolios via the site. Similarly, users searching for autos, travel, and other vertical-specific information create immense targeting value.

Publishers curbed their internal targeting efforts years ago because they found that when third-party data companies bought targeted campaigns on their sites, the publishers’ high-cost, high-value targeting data leaked away to those third parties. But the world has shifted again, and publishers and advertisers both are benefiting greatly from the data management platforms (DMPs) that are now common on the market. The race toward using first-party cookies as the standard for data collection is further strengthening publishers’ positions. Targeted content categories and contests are another way that publishers and advertisers have a huge advantage over third parties.

Creating custom content or contests with the intent to derive high-value audience data that is extremely vertical or particularly valuable is easy when you have a direct relationship with the user. You might imagine that Amazon has a huge lead in the market when it comes to valuation of users by vertical product interest. Similarly, big publishers can segment users into buckets based on their interest in numerous topics that can be used to extrapolate value.

Third-party data: There are many methods used to track and value users based on third-party cookies (those pesky cookies set by companies that generally don’t have a direct relationship with the user — and which are tracking them across websites). Luckily there are lots of articles out there (including many I’ve written) on how this works. But to quickly summarize: Third-party data companies generally make use of third-party cookies that are triggered on numerous websites across the internet via the use of tracking pixels. These pixels are literally just a 1×1 pixel image (sometimes called a “clear pixel”), or even just a simple no-image JavaScript call from the third-party server, that allows the data company to set and/or access a cookie on the user’s browser. These cookies are extremely useful to data companies in tracking users because the same cookie can be accessed on any website, on any domain, across sessions, and sometimes across years of time.
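
A hypothetical sketch of what a third-party tracking-pixel endpoint does, assuming Flask for the web layer: read or assign an anonymous ID cookie, note the site the pixel fired on, and respond as an image. The domain names and parameters are invented.

```python
# Hypothetical third-party tracking-pixel endpoint (Flask assumed): read or
# assign an anonymous ID cookie, record the site the pixel fired on, and
# answer with an image response. Profile storage is just an in-memory dict.
import uuid
from flask import Flask, request, make_response

app = Flask(__name__)
profiles = {}  # anonymous_id -> list of sites where the pixel was seen

@app.route("/pixel.gif")
def pixel():
    uid = request.cookies.get("uid") or uuid.uuid4().hex
    profiles.setdefault(uid, []).append(request.args.get("site", "unknown"))
    resp = make_response(b"", 200)  # a real endpoint returns 1x1 transparent GIF bytes
    resp.headers["Content-Type"] = "image/gif"
    resp.set_cookie("uid", uid, max_age=60 * 60 * 24 * 365)  # the third-party cookie
    return resp

# Publishers embed something like:
#   <img src="https://tracker.example/pixel.gif?site=publisher.com" width="1" height="1">
```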

Unfortunately for the third-party data companies, third-party cookies have recently come under intense scrutiny since Apple’s Safari doesn’t allow them by default and Firefox has announced that it will set new defaults in its next browser version to block third-party cookies. This means that those companies relying exclusively on third-party cookies will see their audience share erode and will need to fall back on other methods of tracking and profiling users. Note that these companies all use anonymous cookies and work hard to be safe and fair in their use of data. But the reality is that this method is becoming harder for companies to use.

By following users across websites, these companies can amass large and comprehensive profiles of users such that advertising can be targeted against them in deep ways and more money can be made from those ad impressions.

We don’t need no stinkin’ 3rd party cookies!

By Eric Picard (Originally published on AdExchanger.com)

I’ve been writing about some of the ethical issues with “opt-out” third-party tracking for a long time. It’s a practice that makes me extremely uncomfortable, which is not where I started out. You can read my opus on this topic here.

In this article, I want to go into detail about why third-party cookies aren’t needed by the ecosystem, and why doing away with them as a default setting is both acceptable and not nearly as harmful as many are claiming.

 

First order of business: What is a third-party cookie?

When a user visits a web page, they load a variety of content. Some of this content comes from the domain they’re visiting. (For simplicity’s sake, let’s call it Publisher.com.) Some comes from third parties that load content onto Publisher.com’s site. (Let’s call one of them ContentPartner.com.) For example, you could visit a site about cooking, and the Food Network could provide some pictures or recipes that the publisher embeds into the page. Those pictures and recipes sit on servers controlled by the content partner and point to that partner’s domain.

When content providers deliver content to a browser, they have the opportunity to set a cookie. When you’re visiting Publisher.com’s page, it can set a first-party cookie because you’re visiting its web domain. In our example above, ContentPartner.com is also delivering content to your browser from within Publisher.com’s page, so the kind of cookie it can deliver is a third-party cookie. There are many legitimate reasons why both parties would drop a cookie on your browser.

If this ended there, we probably wouldn’t have a problem. But this construct – allowing content from multiple domains to be mapped together into one web page, which was really a matter of convenience when the web first was created – is the same mechanism the ad industry uses to drop tracking pixels and ads onto publishers’ web pages.

For example, you might visit Publisher.com and see an ad delivered by AdServer.com. And on every page of that site, you might load tracking pixels delivered by TrackingVendor1.com, TrackingVendor2.com, etc. In this case, only Publisher.com can set a first-party cookie. All the other vendors are setting third-party cookies.

There are many uses for third-party cookies that most people would have no issue with, but some uses of third-party cookies have privacy advocates up in arms. I’ll wave an ugly stick at this issue and just summarize it by saying: Companies that have no direct relationship with the user are tracking that user’s behavior across the entire web, creating profiles on him or her, and profiting off of that user’s behavior without his or her permission.

This column isn’t about whether that issue is ethical or acceptable, because allowing third-party cookies to be active by default is done at the whim of the browser companies. I’ve predicted for about five years that the trend would head toward all browsers blocking them by default. So far Safari (Apple’s browser) doesn’t allow third-party cookies by default, and Mozilla’s Firefox has announced it will block them by default in the next version of Firefox.

Why I don’t think blocking third-party cookies is a problem

There are many scenarios where publishers legitimately need to deliver content from multiple domains. Sometimes several publishers are owned by one company, and they share central resources across those publishers, such as web analytics, ad serving, and content distribution networks (like Akamai). It has been standard practice in many of these cases for publishers to map their vendors against their domain, which by the way allows them to set first-party cookies as well.

How do they accomplish this? They set a ‘subdomain’ that is mapped to the third party’s servers. Here’s an example:

Publisher.com wants to use a web analytics provider but set cookies from its own domain. It creates a subdomain called WebAnalytics.Publisher.com using its Domain Name System (DNS) settings. (I won’t get too technical, but DNS is the way the Internet maps IP addresses – the numeric identifiers of servers – to domain names.) It’s honestly as simple as one of the publisher’s IT people opening up the web page that manages their DNS, creating a subdomain name, and mapping it to a specific IP address. And that’s it.

This allows the third-party vendor to place first-party cookies onto the browser of the user visiting Publisher.com. This is a standard practice that is broadly used across the industry, and it’s critically important to the way that the Internet functions. There are many reasons vendors use subdomains, not just to set first-party cookies. For instance, this is standard practice in the web analytics space (except for Google Analytics) and for content delivery networks (CDNs).
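
For illustration, here is a hypothetical sketch (again assuming Flask) of the vendor’s side of that arrangement: once the publisher has pointed WebAnalytics.Publisher.com at the vendor’s servers, the vendor can scope its cookie to the publisher’s domain, so the browser treats it as first-party.

```python
# Hypothetical vendor endpoint served from WebAnalytics.Publisher.com (Flask
# assumed). Because the cookie's Domain is the publisher's, the browser
# treats it as a first-party cookie even though the vendor's servers set it.
import uuid
from flask import Flask, request, make_response

app = Flask(__name__)

@app.route("/collect")
def collect():
    uid = request.cookies.get("pub_uid") or uuid.uuid4().hex
    response = make_response("", 204)  # no body needed for an analytics beacon
    response.set_cookie("pub_uid", uid, domain=".publisher.com",
                        max_age=60 * 60 * 24 * 365)
    return response
```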

So why doesn’t everybody just map subdomains and set first-party cookies?

First, let me say that while it is fairly easy to map a subdomain for the publisher’s IT department, it would be impractical for a demand-side platform (DSP) or other buy-side vendor to go out and have every existing publisher map a subdomain for them. For those focused on first-party data on the advertiser side, they’ll still have access to that data in this world. But for broader data sets, they’ll be picking up their targeting data via the exchange as pushed through by the publisher on the impression itself. For the data management platforms (DMPs), given that this is their primary business, it is a reasonable thing for them to map subdomains for each publisher and advertiser that they work with.

Also, the thing that vendors like about third-party cookies is that, by default, they work across domains. That means a data company could set pixels on every publisher site it could convince to place them, and then automatically track one cookie across every site the user visited. Switching to first-party cookies breaks that broad set of actions across multiple publishers into pockets of activity at the individual publisher level. There is no cheap, convenient way to map one user’s activity across multiple publishers. Only companies with a vested interest – the DMPs – will make that investment, and the cost will keep smaller vendors who can’t make it from participating.

But, is that so bad?

So does moving to first-party cookies break the online advertising industry?

Nope. But it does complicate things. Let me tell you about a broadly used practice in our industry – one that every single data company uses on a regular basis. It’s a practice that gets very little attention today but is pretty ubiquitous. It’s called cookie mapping.

Here’s how it works: Let’s say one vendor has its unique anonymous cookies tracking user behavior and creating big profiles of activity, and it wants to use that data on a different vendor’s servers. In order for this to work, the two vendors need to map together their profiles, finding unique users (anonymously) who are the same user across multiple databases. How this is done is extremely technical, and I’m not going to mangle it by trying to simplify the process. But at a very high level, it’s something like this:

A media buyer wants to use targeting data on an exchange via a DSP. The DSP enables the buyer to access data from multiple data vendors. The DSP has its own cookies that it sets on users when it runs ads (today these are third-party cookies). The DSP and the data vendor work with a specialist vendor to map together the DSP’s cookies and the data vendor’s cookies. These cookie mapping specialists (Experian and Acxiom are examples, but others provide this service as well) use a complex set of mechanisms to map overlapping cookies between the two vendors. They also have privacy auditing processes in place to ensure that this is done in an ethical and safe way, and that none of the vendors gets access to personally identifiable data.
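
At the risk of oversimplifying what those specialists actually do, here is a minimal sketch of the core idea, with hypothetical IDs: each “sync event” represents a browser on which both parties’ cookie IDs were observed together (typically via a redirect that carries one party’s ID to the other), and the result is a match table the DSP can consult at bid time.

```python
# Minimal sketch of cookie mapping. Every ID below is hypothetical, and the
# real process involves far more machinery and privacy auditing.
from collections import defaultdict

# Each sync event is a browser hit where both vendors' IDs were seen together.
sync_events = [
    {"dsp_id": "dsp-111", "data_vendor_id": "dv-901"},
    {"dsp_id": "dsp-222", "data_vendor_id": "dv-902"},
    {"dsp_id": "dsp-111", "data_vendor_id": "dv-901"},  # same browser seen twice
]

def build_match_table(events):
    """Map each DSP cookie ID to the data-vendor IDs seen on the same browser."""
    table = defaultdict(set)
    for event in events:
        table[event["dsp_id"]].add(event["data_vendor_id"])
    return table

match_table = build_match_table(sync_events)

# At bid time, the DSP can translate its own cookie into the vendor's ID and
# look up that vendor's targeting segments for the user.
print(match_table["dsp-111"])  # {'dv-901'}
```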

Note that this same process is used between advertisers and publishers and their own DMPs so that first-party data from CRM and user registration databases can be mapped to behavioral and other kinds of data.

The trend for data companies in the last few years has been to move into DMP mode, working directly with the advertisers and publishers rather than trying to survive as third-party data providers. This move was intelligent – almost prescient of the change that is happening in the browser space right now.

My short take on this evolution

I feel that this change is both inevitable and positive. It puts more power back in the hands of publishers; it solidifies their value proposition of having a direct relationship with the consumer; and it will drive a lot more publisher investment in data management platforms and other big-data technology. The last few years have seen a data asymmetry problem arise in which buyers had more data available to them than publishers did, and publishers had no insight into the value of their own audience. They didn’t understand why the buyer was buying their audience. This will fall back into equilibrium in this new world.

Long-tail publishers will need to rely on their existing vendors to ensure they can easily map a first-party cookie to a data pool – these solutions need to be built by companies that cater to long-tail publishers, such as ad networks. The networks will need to work with their own DMP and data solutions to ensure that they’re mapping domains together on behalf of their long-tail publishers and pushing that targeting data, along with the inventory, into the exchanges. The other option for longer-tail publishers is to license their content to larger publishers who can aggregate it into their own sites. It will require some work, which also means ambiguity and stress. But certainly this is not insurmountable.

I’ll also say that first-party tracking is both ethical and justifiable. Third-party tracking without the user’s permission is ethically challenging, and I’d argue it’s not in the best interest of our industry to try to perpetuate it – especially since there are viable and acceptable alternatives.

That doesn’t mean switching off of third-party cookies is free or easy. But in my opinion, it’s the right way to do this for long-term viability.

What everyone should know about ad serving

By Eric Picard (Originally published on iMediaConnection.com)

Publisher-side ad servers such as DoubleClick for Publishers, Open AdStream, FreeWheel, and others are the most critical components of the ad industry. They’re ultimately responsible for coordinating all the revenue the publisher collects, and they do an amazing amount of work.

Many people in the industry — especially on the business side — look at their ad server as mission critical, sort of in the way they look at the electricity provided by their power utility. Critical — but only in that it delivers ads. To ad operations or salespeople, the ad server is most often associated with its user interface — really just the workflow they interact with directly. But that view misses most of what the system actually does.

The way that the ad server operates under the surface is actually something everyone in the industry should understand. Only by understanding some of the details of how these systems function can good business decisions be made.

Ad delivery

Ad servers by nature make use of several real-time systems, the most critical being ad delivery. But ad delivery is not a name that adequately describes what those systems do. An ad delivery system is really a decision engine. It reviews an ad impression in the exact moment that it is created (by a user visiting a page), reviews all the information about that impression, and makes the decision about which ad it should deliver. But the real question is this: How does that decision get made?

An impression could be thought of as a molecule made up of atoms. Each atom is an attribute that describes something about that impression. These atomic attributes can be simple media attributes, such as the page location that the ad is embedded into, the category of content that the page sits within, or the dimensions of the creative. They can be audience attributes such as demographic information taken from the user’s registration data or from a third-party data company. They can be complex audience segments provided by a DMP such as “soccer mom” — which is in itself almost a molecular object made up of the attributes of female, parent, children in sports — and of course various other demographic and psychographic atomic attributes.
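
As a rough illustration of that “molecule,” here is a minimal sketch in code; the attribute names and values are hypothetical and far simpler than what a real delivery engine sees.

```python
# Minimal sketch of an impression as a bundle of atomic attributes.
from dataclasses import dataclass, field

@dataclass
class Impression:
    # media attributes
    page_url: str
    placement: str           # e.g. "top-banner"
    ad_size: str             # e.g. "300x250"
    content_category: str    # e.g. "sports"
    # audience attributes (registration data, third parties, or a DMP)
    demographics: dict = field(default_factory=dict)
    segments: set = field(default_factory=set)

imp = Impression(
    page_url="https://publisher.com/sports/article-123",
    placement="top-banner",
    ad_size="300x250",
    content_category="sports",
    demographics={"gender": "female", "parent": True},
    segments={"soccer mom"},  # itself a bundle of the atomic attributes above
)
print(imp)
```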

When taken all together, these attributes define all the possible interpretations of that impression. The delivery engine now must decide (all within a few milliseconds) how to allocate that impression against available line items. This real-time inventory allocation issue is the most critical moment in the life of an impression. Most people in our industry have no understanding of what happens in that moment, which has led to many uninformed business, partnership, and vendor licensing decisions over the years, especially when it comes to operations, inventory management, and yield.

Real-time inventory allocation decides which line items will be matched against an impression. The way these decisions get made reflects the relative importance placed on them by the engineers who wrote the allocation rules. These, of course, are informed by business people who are responsible for yield and revenue, but the reality is that the tuning of allocation against a specific publisher’s needs is not possible in a large shared system. So the rules get tuned as best they can to match the overarching case that most customers face.

Inventory prediction

Well before an impression is generated and allocated to a line item in real time, the inventory was sold based on predictions of how much volume would exist in the future. We call these predicted impressions "avails" (for "available to sell") in our industry, and they’re essentially the basis for how all guaranteed impressions are sold.

We’ll get back to real-time allocation in a moment, but first let’s talk a bit about avails. The avails calculation, done by another component of the ad server that is responsible for inventory prediction, is one of the hardest computer science problems facing the industry today. Predicting how much inventory will exist is hard — and extremely complicated.

Imagine, if you will, that you’ve been asked to work on a different kind of prediction problem than ad serving — perhaps traffic patterns on a state highway system. As you might imagine, predicting how many cars will be on the entire highway next month is probably not very hard to do with a pretty high degree of accuracy. There’s historical data going back years, month by month. So you could take a look at the month of April for the last five years, see if there’s any significant variance, and use a bit of somewhat sophisticated math to determine a confidence interval for how many cars will be on the highway in April 2013.

But imagine that you now wanted to zoom into a specific location — let’s say the Golden Gate Bridge. And you wanted to break that prediction down further, let’s say Wednesday, April 3. And let’s say that we wanted to predict not only how many cars would be on the bridge that day, but how many cars with only one passenger. And further, we wanted to know how many of those cars were red and driven by women. And of those red, female-driven cars, how many of them are convertible sports cars? Between 2 and 3 p.m.

Even if you could get some idea of how many matches you’ve had in the past, predicting at this level of granularity is very hard. Never mind the many outside factors that could affect it: there are short-term signals, such as weather and scheduled sporting events, that can improve accuracy as you get closer in time to the event, and there are far less predictable events such as car accidents, earthquakes, etc.

This is essentially the same kind of prediction problem as the avails problem we face in the online advertising industry. Each time we layer one more bit of data (some defining attribute) onto our inventory definition, we make it harder and harder to predict with any accuracy how many of those impressions will exist. And because we’ve signed up for a guarantee that this inventory will exist, the engineers creating the prediction algorithms need to be very conservative in their estimates.
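
Here is a minimal sketch, with made-up numbers, of what “conservative” means in practice: rather than promising the historical average for a given inventory definition, the forecaster offers something closer to the lower bound of a confidence interval.

```python
# Minimal sketch of a conservative avails estimate. The history is hypothetical.
import statistics

# Impressions observed for the same targeting definition in prior periods
history = [1_050_000, 980_000, 1_120_000, 1_010_000, 940_000]

mean = statistics.mean(history)
stdev = statistics.stdev(history)

# Roughly a 95% one-sided lower bound under a normal assumption (z ~ 1.645)
conservative_avails = mean - 1.645 * stdev

print(f"historical mean:         {mean:,.0f}")
print(f"avails offered for sale: {conservative_avails:,.0f}")
```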

When an ad campaign is booked by an account manager at the publisher, they "pull avails" based on their read of the RFP and media plan and try to find matching inventory. These avails are then reserved in the system (the system puts a hold, for a period of time, on the avails being sent back to the buyer) until the insertion order (I/O) is signed by the buyer. At that moment, a preliminary allocation of predicted avails (impressions that don’t exist yet) is made by a reservation system, which divvies out the avails among the various I/Os. This is another kind of allocation the ad server does in advance of the campaign actually running live, and it has as much impact on overall yield as the real-time allocation does, or even more.
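
As a minimal sketch of that pre-flight reservation step (all names and volumes are hypothetical), the system divvies predicted avails out among signed insertion orders long before any of those impressions actually exist.

```python
# Minimal sketch of reserving predicted avails against insertion orders.
def reserve_avails(predicted_avails: int, insertion_orders: list[dict]) -> dict:
    """Greedily hold predicted impressions for each signed I/O."""
    remaining = predicted_avails
    holds = {}
    for io in insertion_orders:
        hold = min(io["requested_impressions"], remaining)
        holds[io["name"]] = hold
        remaining -= hold
    holds["unsold forecast"] = remaining
    return holds

ios = [
    {"name": "IO-1001 (auto advertiser)", "requested_impressions": 400_000},
    {"name": "IO-1002 (CPG advertiser)", "requested_impressions": 350_000},
]
print(reserve_avails(900_000, ios))
# {'IO-1001 (auto advertiser)': 400000, 'IO-1002 (CPG advertiser)': 350000, 'unsold forecast': 150000}
```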

How real-time allocation decisions get made

Once a contract has been signed to guarantee that these impressions will in fact be delivered, it’s up to the delivery engine’s allocation system to decide which of the matching impressions to assign to which line items. The primary criterion used to make this decision is how far behind each matching line item is in delivering against its contract, which we call "starvation" (i.e., is the line item starving to death, or is it on track to fulfill its obligated impression volume?).

Because the engineers who wrote the avails prediction algorithms were conservative, the system generally has a lot of wiggle room when it comes to delivering against most line items that are not too complex. That means there are usually more impressions available at allocation time than were predicted ahead of time. So when none of the matching line items is starving, other decision criteria can be used. The clearest one is yield (i.e., of the line items available to allocate, which one will get me the most money for this impression?).
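
Here is a minimal sketch of that decision logic, with hypothetical field names: feed the most starved matching line item first and, when nothing is behind schedule, pick the line that pays the most.

```python
# Minimal sketch of starvation-then-yield allocation for one impression.
def pick_line_item(matching_lines: list[dict]) -> dict:
    def pacing(line):
        # delivered volume relative to where the line should be at this point
        # in its flight; below 1.0 means the line is behind (starving)
        expected = line["contracted"] * line["flight_elapsed_pct"]
        return line["delivered"] / expected if expected else 1.0

    starving = [li for li in matching_lines if pacing(li) < 1.0]
    if starving:
        return min(starving, key=pacing)                  # most behind wins
    return max(matching_lines, key=lambda li: li["cpm"])  # otherwise, best yield

lines = [
    {"name": "line-A", "contracted": 1_000_000, "delivered": 300_000,
     "flight_elapsed_pct": 0.5, "cpm": 4.00},
    {"name": "line-B", "contracted": 500_000, "delivered": 260_000,
     "flight_elapsed_pct": 0.5, "cpm": 9.00},
]
print(pick_line_item(lines)["name"])  # line-A is starving, so it wins the impression
```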

Implications of real-time allocation and inventory prediction

There’s a tendency in our industry to think about ad inventory as if it “exists” ahead of time, but as we’ve just seen, an impression is ephemeral. It exists only for a few milliseconds in the brain of a computer that decides what ad to send to the user’s machine. Generally there are many ways that each impression could be fulfilled, and the systems involved have to make millions or billions of decisions every hour.

We tend to think about inventory in terms of premium and remnant, or through a variety of other lenses. But the reality is that before inventory is sold or unsold, premium or remnant, or anything else, it gets run through this initial mechanism. In many cases, extremely valuable inventory gets allocated to very low-CPM opportunities, or even to remnant, because of factors having little to do with what that impression “is.”

There are many vendors in the space, but let’s chat for a moment about two groups of vendors: supply-side platforms (SSPs) and yield management companies.

Yield management firms focus on providing ways for publishers to increase yield on inventory (get more money from the same impressions), and most have different strategies. The two primary companies folks talk to me about these days are Yieldex and Maxifier. Yieldex focuses on the pre-allocation problem — the avails reservations done by account managers as well as the inventory prediction problem. Yieldex also provides a lot of analytics capabilities and is going to factor significantly in the programmatic premium space as well. Maxifier focuses on the real-time allocation problem and finds matches between avails that drive yield up, and it improves matches on other performance metrics like click-through and conversions, as well as any other KPI the publisher tracks, such as viewability or even engagement. Maxifier does this while ensuring that campaigns deliver, since premium campaigns are paid on delivery but measured in many cases on performance. The company is also going to figure heavily into the programmatic premium space, but in a totally different way than Yieldex. In other words, the two companies don’t really compete with each other.

Google’s recent release of its dynamic allocation features for the ad exchange (sort of the evolution of the Admeld technology) also plays heavily into real-time allocation and yield decisions. Specifically, the company can compare every impression’s yield opportunity between guaranteed (premium) line items and the response from the DoubleClick Exchange (AdX) to determine on a per-impression basis which will pay the publisher more money. This is very close to what Maxifier does, but Maxifier does this across all SSPs and exchanges involved in the process. Publishers I’ve talked to using all of these technologies have gushed to me about the improvements they’ve seen.
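
Conceptually, and greatly simplified, that per-impression comparison looks something like the sketch below; the function and numbers are hypothetical, and real systems also weigh pacing and delivery goals as described earlier.

```python
# Minimal sketch of a dynamic-allocation-style decision: send the impression
# wherever the publisher earns more on this particular impression.
def choose_demand_source(guaranteed_line_cpm: float, exchange_bid_cpm: float) -> str:
    return "exchange" if exchange_bid_cpm > guaranteed_line_cpm else "guaranteed line item"

print(choose_demand_source(guaranteed_line_cpm=6.50, exchange_bid_cpm=7.25))
# -> exchange
```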

SSPs are another animal altogether. While the yield vendors above are focused on increasing the value of premium inventory and/or maximizing yield between premium and exchange inventory (I think of this as pushing information into the ad server to increase value), the SSPs are given remnant inventory to optimize for yield among all the various venues for clearing remnant inventory. By forcing competition among ad networks, exchanges, and other vehicles, they can drive the price up on remnant inventory.

How to apply this article to your business decisions

I’ve had dozens of conversations with publishers about yield, programmatic premium, SSPs, and other vendors. The most important takeaway I can leave you with is that you should think about premium yield optimization as a totally different track than discussions about remnant inventory.

When it comes to remnant inventory, whoever gets the first "look" at the inventory is likely to provide the highest increase in yield. So when testing remnant options, you have to ensure that you’re testing each one in exactly the same way — never stacked beneath one another in the chain. Most SSPs and exchanges ultimately provide the exact same demand through slightly different lenses. This means that, barring some radical technical superiority — which none has demonstrated to me so far — the decision will most likely come down to ease of integration and, ultimately, customer service.

Can That Startup Stand Alone, Or Is It Just A Feature?

By Eric Picard (Originally published on AdExchanger.com January 24, 2013)

One of the most common conversations I have about startups, especially in the ad technology space, is whether a company has legs. Will it be able to grow large and "go it alone," or is it really just a feature of a bigger system or platform that hasn’t been added yet? This line of thinking is super important for investors to understand, and it’s a key part of the decision-making process for entrepreneurs as they begin building a company.

Why does this matter?

Investors all have their own set of criteria for how they evaluate investments. Angels and Micro-VCs frequently are willing to invest in companies that are built for a relatively short term flip. They ask, “Can the company build a robust set of features that attract a set of customers in the short term, run at a low burn rate (or even quickly become profitable), integrate with one or more existing platforms, and then get bought within one to three years for a decent return?” In this case, building a company that’s a feature is completely viable and acceptable.

This approach is great if the startup can either bootstrap (not take any money) or pull in just a small amount of investment capital and get profitable (or exit) quickly. For angels and micro-VCs, this kind of investment is great because they can get fantastic returns on a short investment horizon, and sometimes it gets them stock in a larger, high-flying tech company at a great multiple. Sometimes these are companies at Series B or C rounds that they couldn’t otherwise get into; sometimes they’re large publicly traded companies whose stock the investor gets at a significant market discount.

If the startup is going to need significant capital, it must be able to build a large business that has significant revenues and long sustainable growth. It needs to be able to stand alone for three to six years, and during that time build a large company with an opportunity to exit for greater than $100M – and have a credible story for why they could be worth more than $1B and exit via either an IPO or a significant acquisition.

There have been several examples in the past few years of companies that have been funded as if they were a full standalone company that could build massive revenue and exit big. They’ve taken tens of millions of dollars in funding, and need to either IPO or exit for greater than $100M to be seen as a win by their investors. In these cases, the investors couldn’t properly evaluate if the startup was really a feature or a standalone. So it’s important to have a way to evaluate this in order to avoid making those mistakes – both as an investor and an entrepreneur.

Question 1: Will Google or another big company just add this functionality to its existing product?

Many big companies with significant platform investments constantly extend their product set over time. Big companies (Google, Microsoft, Yahoo, Amazon, AOL) and the big ad-tech-specific companies (AppNexus, Rubicon, Pubmatic, MediaMath, etc.) have large engineering teams, and the reality is that it’s most efficient for large engineering teams to work on large, complex, and technically demanding problems. They tend to add smaller features on a much slower cycle. That doesn’t mean they won’t add them – it just means they have bigger, higher-priority problems to solve – especially around scale, stability, and redundancy. But eventually they’re going to need to add those features, and they’ll do it either by acquiring them or by committing resources to build them. At that point, they’ll be doing what’s known as a "build/buy analysis" to determine where to invest capital to get the product to their customers. The analysis is going to look something like this:

  1. How many man-hours of development will this take to build?
  2. What’s the cost to develop it in-house – both the actual cost of development and the opportunity cost in time to market and in the other features the team won’t be able to work on during the development process?
  3. The answer is likely to come out between $6 and $30 million – with a rare outside answer landing around $50 to $80 million, depending on the complexity of the feature.
  4. That means that for most "feature" companies, exits will be in those ranges, with perhaps a 10 to 20% premium applied. The likelihood of a premium relates directly to how much revenue the company brings with it, minus the costs of operating the company and of integrating it.

This means that startups need to think through a bunch of things if they’re building for acquisition. They should be cheap to buy – ideally in the first range we just discussed. They should be easy to integrate. They should have as few redundant systems and features as possible. They should be architected for scale – to handle the kinds of load that one of those big companies will need to transfer onto their systems post acquisition. But they should be cost effective operationally.

One very smart thing “feature” startups should do as early in their life cycle as possible is integrate with as many of those companies as possible. Thus when that build/buy analysis is done by the acquiring company, the startup is already integrated, is top of mind with the acquirer, and already is familiar with how to develop against their APIs. And in many cases the developers within the startup will already be known by at least some of the developers within the larger company.

Another big question that will come up during due diligence by an acquiring company is whether the startup has any redundant systems that will need to be decommissioned or moved into internal data centers. This is more important than many realize – especially when it comes to datacenter redundancy. Very few startups have the kind of security and cost efficiencies that a Google, Microsoft, or Amazon has in house. So if they’ve invested in their own infrastructure, they’re going to need to move it out of their datacenter and into the bigger company’s infrastructure. Datacenter contracts should be crafted to facilitate this – and the hardware is probably just a loss. Building in the cloud can solve this in some cases, but in others it can cause problems of its own – e.g., real-time systems built in the cloud are highly unlikely to scale to cover the needs of a big company. Architecting for scale, even if the startup doesn’t scale up right away, is a critical consideration.

Question 2: How do we define a “standalone” business, and what justifies the higher acquisition costs?

The first and foremost consideration in this case is revenue. Companies with deep customer relationships and lots of revenue can run standalone and/or be acquired for a large amount of money at significant multiples. Think of a few companies that have been acquired on that scale: AdMeld built a real and significant business, with deep customer relationships and real revenue, and was acquired by Google at a significant premium. The same goes for Invite Media and DoubleClick, both bought by Google; aQuantive, bought by Microsoft; and Right Media, bought by Yahoo.

All of these companies built significant businesses, with large numbers of customers, decent margins, and were not bought for their technology. In each of those cases, the core technology of each company was completely re-architected and redeveloped by the acquirer (in aQuantive’s case, a little less so).

So when starting a company, and when evaluating that company for investment, one must consider how much revenue, and how many customer relationships, the company can effectively build over a three to six year period. If the answer to both questions is “lots,” and a credible case can be made – then consideration for this kind of investment (both in time/effort from the entrepreneur and cash from the investors) is justified.

Other considerations for whether the business will stand the test of time:

How “sticky” will those relationships be? Ad serving is the quintessential example of a sticky ad technology. It’s so complex and cost-prohibitive to switch out your ad server – for publishers or agencies alike – that it almost never happens. That’s one reason ad servers with lots of traction were bought at such huge premiums, even though there was significant price compression over time. If the technology is sticky because of integrations into either infrastructure or business processes, it’s a good candidate to be a standalone.

Does the company sell a product, service or commodity that has transfer-pricing pressure?

Or, does the company either create the product it sells from scratch or process the product of another company in a way that adds significant value? Ideally in the latter scenario it would do this in a way that is difficult to replicate – for example, by analyzing big data with some unique and differentiated methods in order to add intelligence and value to what it sells.

Transfer-pricing pressure is critical to understand. The best current example of it is music. Any business that has started in the music space has been doomed to either incredibly poor margins or instability, because the music companies can simply increase the cost of the final good being sold (songs or albums) any time they please. As soon as one of the companies selling music digitally begins to get profitable, the music industry can throttle its margins by raising prices. In the advertising space, this is similar to a problem the early ad networks – the ad rep firms – had. Because the ad rep firms didn’t own the inventory they sold, and resold the publisher’s product directly with no additional value, they were doomed to low-multiple exits and low valuations if they managed to get large enough to IPO.

A recent example of transformation in a category that shows how some of these standalone companies have become more resilient is the data companies. They started out with relatively undifferentiated mechanisms to create targeting segments that led to the creation of ad networks, then were integrated with data marketplaces on the exchanges, and now have transformed themselves into Data Management Platforms. Lotame is a great example of a company that has made this transition – but many others have as well.

By applying this type of analysis to opportunities to create a company or invest in a company, entrepreneurs and investors can make smart decisions about what to build, and what to invest in.