Data Collection | springload

Feb 18 2015

3rd Party Data, Ad Ecosystem, Ad Technology, Data Collection, Online Media, Targeting

The real reason advertising isn’t more relevant

By Eric Picard (Originally Published on iMedia – February 18, 2015)

I have been pretty publicly dismissive of the idea that we will see significant consumer value driven by ad targeting’s creation of more relevant advertising in the near future. Despite the frequent claim in the industry, I’d call this a false meme today; we don’t have nearly enough disparate messages from marketers to segment the population well enough. At the very least this future is further out simply because there are not enough advertisers spending enough money on enough distinct messages for enough distinct industry verticals, or enough products, to allow us to have enough relevant messages to show people.

Let me be clear: There are privacy issues with which we must contend. But if we step past them for the purposes of this article and look just at this issue of relevance driving value to the consumer, we have a long way to go. The current trend toward massive use of retargeting clearly isn’t hitting this mark if we just make our judgment based on anecdotal input from friends, family, and ourselves. How many times have you experienced (or been told by someone else about) the situation where you visit an online store, buy a product, and then get targeted with ads for the product you just purchased for several days afterwards on numerous websites?

Are the ads more relevant to you? Maybe. Do they add any value to you? Quite the opposite. You probably find the situation as annoying as I do. If I buy a new grill, show me products related to grilling — not the damn grill I just bought. If I buy a new pair of shoes, show me clothing or accessories related to the shoes. If I buy a new car, stop showing me ads for that car or even its competitors. Instead, show me ads related to the fact that I just bought that specific car, or even just that are relevant to a recent car buyer. But at the very least, stop wasting your money showing me the exact product I just purchased.
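To make the grill example concrete, here is a minimal sketch of a purchase-aware ad filter. The product names and the COMPLEMENTS table are made up for illustration; a real system would draw both from a product catalog and order history.

```python
# Hypothetical mapping from a purchased product to related products.
COMPLEMENTS = {
    "grill": ["charcoal", "grill cover"],
    "running shoes": ["running socks"],
}

def select_ads(purchased_items, candidate_ads, complements=COMPLEMENTS):
    """Drop ads for items already bought and move related products to the front."""
    purchased = set(purchased_items)
    related = {extra for item in purchased for extra in complements.get(item, [])}
    kept = [ad for ad in candidate_ads if ad not in purchased]
    # sorted() is stable: related products (key False) come before the rest.
    return sorted(kept, key=lambda ad: ad not in related)

ads = select_ads(["grill"], ["grill", "sofa", "charcoal", "grill cover"])
print(ads)
```

The grill itself never comes back, and the grilling accessories are shown ahead of the unrelated sofa ad.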

Frankly, there are reasons why the scenarios I suggest above aren’t happening. About 10 years ago, I had a conversation with an executive at a major publisher who was complaining about how irrelevant the ads on the website were to him. He hated the fact that he kept seeing a “toenail fungus” ad when he didn’t have toenail fungus. Instead, he would love to have seen ads for rock climbing gear, as that was his passion and he was currently looking for new gear.

I explained to him that the toenail fungus ad was creating both category and brand awareness so that if and when he eventually got toenail fungus, he’d remember that he could fix the problem. I also noted that we currently had literally not one ad from an advertiser that sold rock climbing gear available to target to him, so we could not meet his ad targeting needs in that way. This caused him pause. He finally got the point and was willing to concede that maybe he was a good target for toenail fungus ads — but that he hated the creative of the ad and found it “disgusting.” I explained that we could adjust the creative acceptance policy of the site to deal with that issue editorially and that maybe the ad would be more effective if the images were less graphic.

In those days, before programmatic advertising, the solution to the problem seemed like it was just around the corner. But now, a decade later, we still haven’t solved the issue. For clarity, I do very much believe that there will be a tipping point — that as we add the infrastructure and data needed to micro-segment audiences, we will see major changes. Once we have the ability to show a high-quality ad experience and effectively segment users to put ads in front of them with the same level of segmentation as a niche magazine content experience, advertisers in the myriad niche segments of advertising will flood the digital channels with creative that can be matched to the right user. We should explore this a bit.

Consider this example: We are trying to build an advertising experience that is more relevant, and the profile of the person is a 45-year-old male suburban homeowner who is an avid golfer and sports car enthusiast, with teenagers in the house. We can probably find some number of ads that are relevant. But if we want to really add value to that person, we need deeper profile information, with a better understanding of where he is in the buying cycle for each of those areas, and categorization of creative messages to help tailor the ad experience for the individual.

Example: The avid golfer. There’s a whole ecosystem around golf that could be useful in creating value to the user beyond just showing ads for golf equipment in general. For instance, if our golfer was shopping for a new driver, it would be relevant to show him ads for drivers. Or if several new clubs had been purchased recently, maybe the ads should focus on balls, bags, shoes, or clothing.

Targeting our golfer based on specific product matches is pretty obvious, but equally interesting would be the case where he lived in the northeast, it was winter, and he’d recently shown interest in booking a vacation. In that case, the systems should be tailoring the vacation advertising around golfing destinations. This means ads for all sorts of products and services need to be categorized by the messaging used within them such that this kind of matching could be accomplished. Similarly, tailoring ads for numerous products and services around golf should be possible and would make those messages more relevant to our golfer. But obviously to make that experience work well, we’d need lots of products and services that could be tailored around the “concept” of golf. Otherwise, we’d show this poor guy the same five ads all the time.
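As a rough sketch of what this concept-based matching could look like, assume every ad creative has been tagged with the concepts in its messaging. The ad names, tags, and the rank_ads function below are all hypothetical, not any vendor’s actual system.

```python
# Hypothetical creative catalog: ad name -> concepts used in its messaging.
AD_CONCEPTS = {
    "Beach resort": {"vacation"},
    "Golf resort getaway": {"vacation", "golf"},
    "New driver sale": {"golf", "equipment"},
}

def rank_ads(interests, current_intent, ad_concepts):
    """Keep ads that match what the user is shopping for now, best interest overlap first."""
    eligible = [name for name, concepts in ad_concepts.items() if current_intent in concepts]
    return sorted(eligible, key=lambda name: len(ad_concepts[name] & interests), reverse=True)

# Our golfer is currently looking at vacations.
ranked = rank_ads({"golf", "sports cars"}, "vacation", AD_CONCEPTS)
print(ranked)
```

With only three tagged ads the result is thin, which is exactly the inventory problem: the matching is easy, but it needs many creatives tailored around each concept to be worth doing.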

Our systems are on the cusp of these capabilities today. In fact, some of these scenarios could be activated by specific vendors in the industry. But the capabilities need to be ubiquitous enough that marketers drive those scenarios into their advertising creative and into their media plans. So it’s a bit of a chicken-and-egg conundrum: Marketers aren’t driving these scenarios to their vendors, so the vendors haven’t yet activated the capabilities to fulfill the scenarios.

We will get there. But it could take some time.

May 10 2014

3rd Party Data, Data Collection, Media Agencies, Online Privacy, Targeting

The 7 types of targeting you need to know

By Eric Picard (Originally Published in iMedia – May 10, 2014)

For as long as people have been buying ads, they have been targeting their desired audiences. The science behind this obviously has changed over the years. In the beginning — say, back in ancient Greece — it was as simple as putting the name of your pottery shop on a few of your clay pots. This evolved to more location-based models over the millennia, of course, and today we can geo-target your mobile device. End of story? Not quite.

As we think about the evolution of targeted advertising over the past 50 years, there are panel-based “currency” data providers such as Nielsen, Arbitron, and others. These services allow buyers to place ads on specific published content across numerous media, with an understanding of the overall audience breakdown that views this content. Buyers can place their ads on content where their desired audience makes up some percentage of the audience that consumes that content. By doing this across a certain number of publications or shows, they can be relatively confident that they are reaching a certain number of members of their target audience.
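The arithmetic behind this kind of buy is simple enough to sketch. The placements, audience sizes, and composition percentages below are invented numbers, not real panel data, and the total counts impressions against the target rather than unique people, since it ignores overlap between placements.

```python
# Hypothetical plan: (placement, total audience, percent of that audience in the target)
placements = [
    ("Show A", 1_000_000, 20),
    ("Magazine B", 250_000, 60),
    ("Site C", 500_000, 10),
]

def estimated_target_impressions(plan):
    """Sum audience * target share across placements (no de-duplication)."""
    return sum(audience * percent // 100 for _, audience, percent in plan)

total = estimated_target_impressions(placements)
print(total)
```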

This is easy when you’re selling a product or service that has a very broad audience — say, toothpaste. But when you have a very targeted customer you’re trying to reach, it can be much more difficult. Other than niche publications clearly aligned with your target customer — say, knitting magazines or websites — it has been hard to find enough touchpoints to reach prospective customers easily.

That has changed significantly over the last few years. Let’s focus on digital media for our purposes. The core types of targeting available today include the following.

Panel-based data

Panel-based data is the most broadly used today, from providers such as Nielsen, comScore, and others. These panels are used as described above — to understand the overall audiences that consume content provided by a publisher. This “whole milk” approach works well for brand advertisers that have large audiences that are easy to find.

Geography

This category includes geo-targeting and geo-derived information such as Nielsen PRIZM clusters that merge information about households in specific geographies. This is much more important today than in the past, given that mobile devices offer information about where audiences are at the moment of the ad delivery, thereby taking location-based advertising to new heights. On mobile devices, this matters a lot, as some of the mechanisms available on the web are either not available on mobile, or much less available due to technical limitations related to cookies.

First-party audience data

First-party audience data is available from either the advertisers directly (data they have about their existing customers) or from publishers directly (data they have about their individual audience members). First-party data is derived either from explicitly provided information or from observed behavior.

On the advertiser side, this is typically CRM data; generally these are either customers or prospects with whom the advertiser has had direct contact. Perhaps the person in question has purchased from the advertiser before, or perhaps that person has signed up for a newsletter. In the case of e-commerce, perhaps the user has visited the site but hasn’t purchased, in which case a click-path analysis might derive some information about the person’s interests.

In the case of publishers, this information can be captured through registration (which actually tends to be much more accurate than professionals believe; as it turns out, many people don’t put in fake information) or from observed behavior (users who read financial news get put into a finance bucket to be targeted when consuming other kinds of content).
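The “finance bucket” idea can be sketched in a few lines. The page-view log, the user IDs, and the three-view threshold are assumptions for illustration; real publishers use their own rules and recency windows.

```python
from collections import Counter

# Hypothetical page-view log: (user_id, content_category)
page_views = [
    ("u1", "finance"), ("u1", "finance"), ("u1", "sports"),
    ("u2", "sports"), ("u1", "finance"), ("u2", "finance"),
]

def behavioral_segments(views, min_views=3):
    """Put a user into a category bucket once they have viewed it min_views times."""
    counts = Counter(views)  # counts each (user, category) pair
    segments = {}
    for (user, category), seen in counts.items():
        if seen >= min_views:
            segments.setdefault(user, set()).add(category)
    return segments

segments = behavioral_segments(page_views)
print(segments)
```

Once u1 is in the finance bucket, the publisher can target that user with finance-related ads even while they are reading sports.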

Third-party audience data

Third-party audience data is available from numerous providers. Typically these data points are derived from observing the behavior (anonymously) of the end users as they’re moving across numerous websites. Sometimes this data is derived from other sources, such as credit card activity matched anonymously to users via cookie matching.

Third-party retargeting data

Third-party retargeting data is available from numerous providers. These companies will typically place targeting tags on both the advertiser and publisher websites and then link those together in order to execute media buys. Because the provider needs to have matched cookies on both the advertiser and publisher websites, typically these services run as ad networks, since they need to close the loop directly. But there are providers that allow advertisers to create their own retargeting cookie pools and reach their customers and prospects over ad exchanges and through their own direct publisher relationships. This is frequently referred to as second-party targeting.

Look-alike targeting

Look-alike targeting is available from numerous providers as well. It enables the buyer to provide the look-alike vendor or network with a pool of cookies or data definitions. The provider will then find matching audiences who “look like” the users the buyer supplied. This allows the buyer to get value similar to retargeting campaigns, but for much larger audiences.
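One simple way to picture “looks like” is attribute overlap. The sketch below uses Jaccard similarity between attribute sets, which is my stand-in for illustration; actual vendors generally use their own statistical models, and the profiles, IDs, and 0.5 threshold here are invented.

```python
def jaccard(a, b):
    """Share of distinct attributes that two profiles have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def look_alikes(seed_profiles, candidates, threshold=0.5):
    """Return candidate IDs whose best similarity to any seed profile meets the threshold."""
    return sorted(
        user_id
        for user_id, attrs in candidates.items()
        if max(jaccard(attrs, seed) for seed in seed_profiles) >= threshold
    )

seeds = [{"golf", "luxury_auto", "travel"}]      # the buyer's known customers
pool = {
    "c1": {"golf", "travel", "cooking"},         # shares 2 of 4 distinct attributes
    "c2": {"knitting", "gardening"},             # shares nothing
    "c3": {"golf", "luxury_auto", "travel"},     # identical profile
}
matches = look_alikes(seeds, pool)
print(matches)
```

Lowering the threshold grows the audience at the cost of how closely it resembles the seed pool, which is the basic trade-off a buyer is making with this kind of targeting.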

Custom micro-segmentation

Custom micro-segmentation is available from a few providers. This enables the buyer to specify extremely targeted audiences that are orders of magnitude more targeted than what is available over the open market and that match their ad campaign goals exactly or much more closely. This type of targeting can be used for brand campaigns or for performance.

The types of targeting above are broad bucket definitions, and there are now literally hundreds of thousands, if not millions, of available targeting segments on the market. Vendors should be more than happy to educate buyers (and sellers) on the opportunity and methodologies behind the data segmentation. I highly recommend that one or more buyers within every buying group become an expert in the types of available segmentation and the data models involved.

Jul 08 2013

3rd Party Data, Ad Ecosystem, Ad Serving, Ad Technology, Data Collection, Economics, Media Agencies, Predictions, Publishers, Targeting

Life after the death of 3rd Party Cookies

By Eric Picard (Originally published on AdExchanger.com July 8th, 2013)

In spite of plenty of criticism by the IAB and others in the industry, Mozilla is moving forward with its plan to block third-party cookies and to create a “Cookie Clearinghouse” to determine which cookies will be allowed and which will be blocked. I’ve written many articles about the ethical issues involved in third-party tracking and targeting over the last few years, and one I wrote in March — “We Don’t Need No Stinkin’ Third-Party Cookies” — led to dozens of conversations on this topic with both business and technology people across the industry.

The basic tenor of those conversations was frustration. More interesting to me than the business discussions, which tended to be both inaccurate and hyperbolic, were my conversations with senior technical leaders within various DSPs, SSPs and exchanges. Those leaders’ reactions ranged from completely freaked out to subdued resignation. While it’s clear there are ways we can technically resolve the issues, the real question isn’t whether we can come up with a solution, but how difficult it will be (i.e. how many engineering hours will be required) to pull it off.

Is This The End Or The Beginning?

Ultimately, Mozilla will do whatever it wants to do. It’s completely within its rights to stop supporting third-party cookies, and while that decision may cause chaos for an ecosystem of ad-technology vendors, it’s completely Mozilla’s call. The company is taking a moral stance that’s, frankly, quite defensible. I’m actually surprised it’s taken Mozilla this long to do it, and I don’t expect it will take Microsoft very long to do the same. Google may well follow suit, as taking a similar stance would likely strengthen its own position.

To understand what life after third-party cookies might look like, companies first need to understand how technology vendors use these cookies to target consumers. Outside of technology teams, this understanding is surprisingly difficult to come by, so here’s what you need to know:

Every exchange, Demand-Side Platform, Supply-Side Platform and third-party data company has its own large “cookie store,” a database of every single unique user it encounters, identified by an anonymous cookie. If a DSP, for instance, wants to use information from a third-party data company, it needs to be able to accurately match that third-party cookie data with its own unique-user pool. So in order to identify users across various publishers, all the vendors in the ecosystem have connected with other vendors to synchronize their cookies.

With third-party cookies, they could do this rather simply. While the exact methodology varies by vendor, it essentially boils down to this:

1) The exchange, DSP, SSP or ad server carves off a small number of impressions for each unique user for cookie synching. All of these systems can predict pretty accurately how many times a day they’ll see each user and on which sites, so they can easily determine which impressions are worth the least amount of money.
2) When a unique ID shows up in one of these carved-off impressions, the vendor serves up a data-matching pixel for the third-party data company. The vendor places its unique ID for that user into the call to the data company. The data company looks up its own unique ID, which it then passes back to the vendor with the vendor’s unique ID.
3) That creates a lookup table between the technology vendor and the data company so that when an impression happens, all the various systems are mapped together. In other words, when it encounters a unique ID for which it has a match, the vendor can pass the data company’s ID to the necessary systems in order to bid for an ad placement or make another ad decision.
4) Because all the vendors have shared their unique IDs with each other and matched them together, this creates a seamless (while still, for all practical purposes, anonymous) map of each user online.
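The steps above can be sketched as a toy simulation. The class names, the dc_uid cookie name, and the ID formats are all hypothetical, and a plain dictionary stands in for the cookies the browser holds for the data company’s domain.

```python
class DataCompany:
    """Stands in for a third-party data provider with its own cookie store."""

    def __init__(self):
        self._next_id = 1

    def handle_sync_pixel(self, cookie_jar, vendor_uid):
        """Read (or set) our own cookie, then pass both IDs back to the vendor."""
        if "dc_uid" not in cookie_jar:
            cookie_jar["dc_uid"] = f"dc-{self._next_id}"
            self._next_id += 1
        return vendor_uid, cookie_jar["dc_uid"]


class Vendor:
    """Stands in for an exchange, DSP, SSP or ad server."""

    def __init__(self):
        self.match_table = {}  # vendor's unique ID -> data company's unique ID

    def run_sync(self, vendor_uid, data_company, data_company_cookie_jar):
        # Serve the data-matching pixel with our own ID in the call.
        echoed_uid, dc_uid = data_company.handle_sync_pixel(data_company_cookie_jar, vendor_uid)
        self.match_table[echoed_uid] = dc_uid

    def lookup(self, vendor_uid):
        """At impression time, find the data company's ID for this user, if matched."""
        return self.match_table.get(vendor_uid)


browser_cookies_for_data_company = {}  # the browser's third-party cookie jar
dsp = Vendor()
provider = DataCompany()
dsp.run_sync("dsp-42", provider, browser_cookies_for_data_company)
print(dsp.lookup("dsp-42"), dsp.lookup("dsp-99"))
```

The whole exchange hinges on the data company being able to read its own cookie from inside someone else’s page. If the browser refuses to store or send that third-party cookie, the match table never fills in.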

All of this depends on the basic third-party cookie infrastructure Mozilla is planning to block, which means that all of those data linkages will be broken for Mozilla users. Luckily, some alternatives are available.

Alternatives To Third-Party Cookies

1) First-Party Cookies: First-party cookies also can be (and already are) used for tracking and ad targeting, and they can be synchronized across vendors on behalf of a publisher or advertiser. In my March article about third-party cookies, I discussed how this can be done using subdomains.

Since then, several technical people have told me they couldn’t use the same cross-vendor-lookup model, outlined above, with first-party cookies — but generally agreed that it could be done using subdomain mapping. Managing subdomains at the scale that would be needed, though, creates a new hurdle for the industry. To be clear, for this to work, every publisher would need to map a subdomain for every single vendor and data provider that touches inventory on its site.

So there are two main reasons that switching to first-party cookies is undesirable for the online-ad ecosystem: first, the amount of work that would need to be done; second, the lack of a process in place to handle all of this in a scalable way.
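To get a feel for the scale of the subdomain work, here is a small sketch that generates the DNS records involved. It assumes the mapping is done with CNAME records, which is one plausible way to do it rather than an established industry process, and every hostname below is a made-up example.

```python
def cname_records(publisher_domains, vendor_hosts):
    """Build one subdomain mapping per (publisher, vendor) pair."""
    records = []
    for publisher in publisher_domains:
        for label, target in vendor_hosts.items():
            records.append((f"{label}.{publisher}", "CNAME", target))
    return records

publishers = ["publisher-one.example", "publisher-two.example"]
vendors = {
    "dsp": "sync.dsp.example",
    "ssp": "sync.ssp.example",
    "data": "sync.dataco.example",
}
records = cname_records(publishers, vendors)
print(len(records))
for record in records[:3]:
    print(record)
```

The record count is publishers multiplied by vendors, so two publishers and three vendors already need six mappings. At the scale of the real ecosystem, that multiplication is the hurdle.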

Personally, I don’t see anything that can’t be solved here. Someone needs to offer the market a technology solution for scalable subdomain mapping, and all the vendors and data companies need to jump through the hoops. It won’t happen in a week, but it shouldn’t take a year. First-party cookie tracking (even with synchronization) is much more ethically defensible than third-party cookies because, with first-party cookies, direct relationships with publishers or advertisers drive the interaction. If the industry does switch to mostly first-party cookies, it will quickly drive publishers to adopt direct relationships with data companies, probably in the form of Data Management Platform relationships.

2) Relying On The Big Guns: Facebook, Google, Amazon and/or other large players will certainly figure out how to take advantage of this situation to provide value to advertisers.

Quite honestly, I think Facebook is in the best position to offer a solution to the marketplace, given that it has the most unique users and its users are generally active across devices. This is very valuable, and while it puts Facebook in a much stronger position than the rest of the market, I really do see Facebook as the best voice of truth for targeting. Despite some bad press and some minor incidents, Facebook appears to be very dedicated to protecting user privacy – and also is already highly scrutinized and policed.

A Facebook-controlled clearinghouse for data vendors could solve many problems across the board. I trust Facebook more than other potential solutions to build the right kind of privacy controls for ad targeting. And because people usually log into only their own Facebook account, this avoids the problems that have hounded cookie-based targeting related to people sharing devices, such as when a husband uses his wife’s computer one afternoon and suddenly her laptop thinks she’s a male fly-fishing enthusiast.

3) Digital Fingerprinting: Fingerprinting, of course, is as complex and as fraught with ethical issues as third-party cookies, but it has the advantage of being an alternative that many companies already are using today. Essentially, fingerprinting analyzes many different data points that are exposed by a unique session, using statistics to create a unique “fingerprint” of a device and its user.

This approach suffers from one of the same problems as cookies, the challenge of dealing with multiple consumers using the same device. But it’s not a bad solution. One advantage is that fingerprinting can take advantage of users with static IP addresses (or IP addresses that are not officially static but that rarely change).
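As a deliberately simplified sketch, a fingerprint can be pictured as a hash over the signals a session exposes. Real fingerprinting relies on statistical matching that tolerates small changes; this exact-match hash does not, and the signal names and values are invented for illustration.

```python
import hashlib

def fingerprint(signals):
    """Hash a canonical, order-independent rendering of the exposed signals."""
    canonical = "|".join(f"{key}={signals[key]}" for key in sorted(signals))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

visit_one = {
    "ip": "203.0.113.25",
    "user_agent": "ExampleBrowser/20.0",
    "screen": "1920x1080",
    "timezone": "UTC-5",
}
# Same device, signals reported in a different order.
visit_two = dict(reversed(list(visit_one.items())))
# Same device, but the IP address changed.
visit_three = {**visit_one, "ip": "203.0.113.77"}

print(fingerprint(visit_one) == fingerprint(visit_two))
print(fingerprint(visit_one) == fingerprint(visit_three))
```

The third visit shows why stable IP addresses help: with an exact hash, one changed signal produces a different fingerprint, and the statistical approaches exist to bridge that kind of drift.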

Ultimately, though, this is a moot point because of…

4) IPv6: IPv6 is on the way. This will give every computer and every device a static permanent unique identifier, at which point IPv6 will replace not only cookies, but also fingerprinting and every other form of tracking identification. That said, we’re still a few years away from having enough IPv6 adoption to make this happen.

If Anyone From Mozilla Reads This Article

Rather than blocking third-party cookies completely, it would be fantastic if you could leave them active during each session and just blow them away at the end of each session. This would keep the market from building third-party profiles, but would keep some very convenient features intact. Some examples include frequency capping within a session, so that users don’t have to see the same ad 10 times; and conversion tracking for DR advertisers, given that DR advertisers (for a whole bunch of stupid reasons) typically only care about conversions that happen within an hour of a click. You already have Private Browsing technology; just apply that technology to third-party cookies.
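Here is a rough sketch of the behavior I am asking for, using frequency capping as the example. SessionCookieJar and should_show_ad are hypothetical names, not anything in Firefox, and the cap of three is arbitrary.

```python
class SessionCookieJar:
    """Third-party cookies that live only until the session ends."""

    def __init__(self):
        self._cookies = {}

    def get(self, domain, name, default=None):
        return self._cookies.get((domain, name), default)

    def set(self, domain, name, value):
        self._cookies[(domain, name)] = value

    def end_session(self):
        # Blow everything away so no long-term profile can be built.
        self._cookies.clear()


def should_show_ad(jar, ad_server, ad_id, cap=3):
    """Frequency-cap an ad within the current session only."""
    key = f"freq_{ad_id}"
    seen = jar.get(ad_server, key, 0)
    if seen >= cap:
        return False
    jar.set(ad_server, key, seen + 1)
    return True


jar = SessionCookieJar()
in_session = [should_show_ad(jar, "adserver.example", "grill-ad") for _ in range(4)]
jar.end_session()
after_restart = should_show_ad(jar, "adserver.example", "grill-ad")
print(in_session, after_restart)
```

Within the session the cap works, so the user is spared the fourth repeat. After the session ends the ad server has no memory of the user at all.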

Apr 15 2013

3rd Party Data, Ad Ecosystem, Ad Technology, Data Collection, Online Media, Online Privacy, Publishers, Targeting

Targeting fundamentals everyone should know

By Eric Picard (Originally published in iMediaConnection, April 11th, 2013)

Targeting data is ubiquitous in online advertising and has become close to “currency” as we think about it in advertising. And I mean currency in the same way that we think about Nielsen ratings in TV or impression counts in digital display. We pay for inventory today in many cases based on a combination of the publisher, the content associated with the impression, and the data associated with a variety of elements. This includes the IP address of the computer (lots of derived data comes from this), the context of the page, various content categories and quality metrics, and — of course — behavioral and other user-based targeting attributes.

But for all the vetting done by buyers of base media attributes, such as the publisher or the page or quality scores, there’s still very little understanding of where targeting data comes from. And even less when it comes to understanding how it should be valued and why. So this article is about just that topic: how targeting data is derived and how you should think about it from a value perspective.

Let’s get the basic stuff out of the way: anything derived from the IP address and user agent. When a browser visits a web page, it spits out a bunch of data to the servers that it accesses. The two key attributes are IP address and user agent. The IP address is a simple one; it’s the number assigned to the user’s computer by its network or internet service provider so that the various servers it touches can identify it and send data back. It’s a number from which an immense amount of information can be inferred; the key piece of information inferred is the geography of the user.

There are lots of techniques used here to varying degrees of “granularity.” But we’ll just leave it at the idea that companies have amassed lists of IP addresses assigned to specific geographic locations. It’s pretty accurate in most cases, but there are still scenarios where people are connected to the internet via private networks (such as a corporate VPN) that confuse the world by assigning IP addresses to users in one location when they are actually in another. This was the classic problem with IP-address-based geography back in the days of dial-up, when most users showed up as part of Reston, Va. (where AOL had its data centers). Today, when most users are on broadband, the mapping is much more accurate and comprehensive.
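At its simplest, that list of IP addresses is a table of address ranges mapped to places. The sketch below uses Python’s standard ipaddress module; the ranges are reserved documentation blocks and the locations attached to them are made up, so this is not real geolocation data.

```python
import ipaddress

# Hypothetical geo table: (address range, location). Not real assignments.
GEO_TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"), "Seattle, WA"),
    (ipaddress.ip_network("198.51.100.0/24"), "Reston, VA"),
]

def lookup_geo(ip_string):
    """Return the location of the first range containing the address, else None."""
    address = ipaddress.ip_address(ip_string)
    for network, location in GEO_TABLE:
        if address in network:
            return location
    return None

print(lookup_geo("198.51.100.7"))
print(lookup_geo("203.0.113.9"))
```

The lookup describes where the address range is registered, not where the person is. A user on a corporate VPN that exits through the second range would show up in Reston no matter where they are sitting.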

As important as geography are the various mappings that are done against location. Claritas, PRIZM, and other derived data products make use of geography to map a variety of attributes to the user browsing the page. And these techniques have moved out of traditional media (especially direct-response mailing lists) to digital and are quite useful. The only issue is that the further down the chain of assumptions used to derive attributes, the more muddled things become. Statistically, the data still is relevant, but on a per-user basis it is potentially completely inaccurate. That shouldn’t stop you from using this information, nor should you devalue it, but just be clear that there’s a margin of error here.

User agent is an identifier for the browser itself, which can be used to target users of specific browsers but also to identify non-browser activity that chooses to identify itself. For instance, various web crawlers such as search engines identify themselves to the server delivering a web page, and ad servers know not to count those ad impressions as human. This assumes good behavior on behalf of the programmers, and sometimes “real” user agents are spoofed when the intent is to create fake impressions. Sometimes a malicious ad network or bad actor will do this to create fake traffic to drive revenue.
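A minimal sketch of that filtering, assuming a simple substring check: the marker list is my own illustrative choice and far shorter than the bot lists real ad servers rely on.

```python
# Illustrative markers only; production systems use maintained bot lists.
BOT_MARKERS = ("bot", "crawler", "spider")

def is_declared_bot(user_agent):
    """True when the user agent string announces itself as automated traffic."""
    lowered = user_agent.lower()
    return any(marker in lowered for marker in BOT_MARKERS)

def count_human_impressions(user_agents):
    """Count only impressions whose user agent does not declare a bot."""
    return sum(1 for ua in user_agents if not is_declared_bot(ua))

impressions = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (Windows NT 6.1; rv:20.0) Gecko/20100101 Firefox/20.0",
    "ExampleCrawler/1.0",
]
print(count_human_impressions(impressions))
```

This only catches traffic that identifies itself honestly. A bad actor that spoofs the Firefox string above would be counted as human, which is why user agent checks are a starting point and not a fraud defense.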

Crawled data

There’s a whole class of data that’s derived by sending a robot to a web page, crawling through the content on the page, and classifying the content based on all sorts of analysis. This mechanism is how Google, Bing, and other search engines classify the web. Contextual targeting systems like AdSense classify the web pages into keywords that can be matched by ad sales systems. And quality companies, like Trust Metrics and others, scan pages and use hundreds or thousands of criteria to value the rank of the page — everything from ensuring that the page doesn’t contain porn or hate speech to analyzing the amount of white space around images and ads and the number of ads on a page.
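A toy version of contextual classification might look like the sketch below. The keyword lists and the two-hit threshold are assumptions for illustration; this is not how AdSense or any quality vendor actually scores pages.

```python
import re

# Hypothetical category keyword lists.
CATEGORY_KEYWORDS = {
    "golf": {"golf", "driver", "fairway", "putter"},
    "finance": {"stock", "portfolio", "dividend"},
}

def classify_page(text, min_hits=2):
    """Return categories with at least min_hits distinct keywords in the page text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(
        category
        for category, keywords in CATEGORY_KEYWORDS.items()
        if len(words & keywords) >= min_hits
    )

print(classify_page("Choosing a new driver for the fairway: golf tips"))
print(classify_page("One stock tip"))
```

Requiring more than one keyword keeps a passing mention from classifying a page, at the cost of missing short pages that really are about the topic.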

User targeting

Beyond the basics of browser, IP, and page content, the world is much less simple. Rather than diving into methodologies and trying to simplify a complex problem, I’ll simply list and summarize the options here:

Registration data: Publishers used to require registration in order to access their content and, in that process, request a bunch of data such as address, demographics, psychographics, and interests. This process fell out of favor for many publishers over the years, but it’s coming back hard. Many folks in our industry are cynical about registration data, using their own experiences and feelings to discount the validity of user registration data. But in reality, this data is highly accurate; even for large portals, it is often higher than 70 percent accurate, and for news sites and smaller publishers, it’s much more accurate.

Interestingly, the use of co-registration through Facebook, Twitter, LinkedIn, and others is making this data much more accurate. One of the most valuable things about registration data is that it creates a permanent link between a user and the publisher that lives beyond the cookie. Subsequently captured data from various sessions is extremely accurate even if the user fudged his or her registration information.

First-party behavioral data: Publishers and advertisers have a great advantage over third parties in that they have a direct relationship with the user. This gives them incredible opportunities to create deeply refined targeting segments based on interest, behavior, and especially from custom created content such as showcases, contests, and other registration information. Once a publisher or advertiser creates a profile of a user, it has the means to track and store very rich targeting data — much richer in theory than a third party could easily create. For instance, you might imagine that Yahoo Finance benefits highly from registered users who track their stock portfolio via the site. Similarly, users searching for autos, travel, and other vertical-specific information create immense targeting value.

Publishers curbed their internal targeting efforts years ago because they found that third-party data companies were buying targeted campaigns on their sites and that their high-cost, high-value targeting data was then leaking away to those third parties. But the world has shifted again, and publishers and advertisers both are benefiting highly from the data management platforms (DMPs) that are now common on the market. The race toward using first-party cookies as the standard for data collection is further strengthening publishers’ positions. Targeted content categories and contests are another way that publishers and advertisers have a huge advantage over third parties.

Creating custom content or contests with the intent to derive high-value audience data that is extremely vertical or particularly valuable is easy when you have a direct relationship with the user. You might imagine that Amazon has a huge lead in the market when it comes to valuation of users by vertical product interest. Similarly, big publishers can segment users into buckets based on their interest in numerous topics that can be used to extrapolate value.

Third-party data: There are many methods used to track and value users based on third-party cookies (those pesky cookies set by companies that generally don’t have a direct relationship with the user, and which track them across websites). Luckily there are lots of articles out there (including many I’ve written) on how this works. But to quickly summarize: Third-party data companies generally make use of third-party cookies that are triggered on numerous websites across the internet via the use of tracking pixels. These pixels are literally just a 1×1 pixel image (sometimes called a “clear pixel”), or even just a simple no-image JavaScript call to the third-party server, that allows the data company to set and/or read its own cookie on the user’s browser. These cookies are extremely useful to data companies in tracking users because the same cookie can be accessed on any website, on any domain, across sessions, and sometimes across years of time.
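To show how little machinery a tracking pixel needs, here is a simplified sketch of the server side. PixelServer, the uid cookie name, the one-year lifetime, and the page names are all hypothetical; the image bytes are a commonly used tiny transparent GIF.

```python
import base64
from http.cookies import SimpleCookie

# A tiny transparent GIF, the "clear pixel" itself.
PIXEL_GIF = base64.b64decode("R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7")

class PixelServer:
    """Stands in for a third-party data company's pixel endpoint."""

    def __init__(self):
        self._next_id = 1
        self.sightings = {}  # uid -> pages where the pixel fired

    def handle(self, cookie_header, referring_page):
        """Read the existing cookie or set a new one, then log the page."""
        jar = SimpleCookie()
        jar.load(cookie_header or "")
        headers = {"Content-Type": "image/gif"}
        if "uid" in jar:
            uid = jar["uid"].value
        else:
            uid = f"user-{self._next_id}"
            self._next_id += 1
            headers["Set-Cookie"] = f"uid={uid}; Max-Age=31536000; Path=/"
        self.sightings.setdefault(uid, []).append(referring_page)
        return headers, PIXEL_GIF

server = PixelServer()
first_headers, body = server.handle("", "news.example/story")
second_headers, _ = server.handle("uid=user-1", "shop.example/grills")
print(server.sightings)
```

The second request comes from a different website, yet the browser sends back the same cookie, so both visits land in one profile. That cross-site join is the entire value of the third-party cookie to a data company.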

Unfortunately for the third-party data companies, third-party cookies have recently come under intense scrutiny since Apple’s Safari doesn’t allow them by default and Firefox has announced that it will set new defaults in its next browser version to block third-party cookies. This means that those companies relying exclusively on third-party cookies will see their audience share erode and will need to fall back on other methods of tracking and profiling users. Note that these companies all use anonymous cookies and work hard to be safe and fair in their use of data. But the reality is that this method is becoming harder for companies to use.

By following users across websites, these companies can amass large and comprehensive profiles of users such that advertising can be targeted against them in deep ways and more money can be made from those ad impressions.
Read more at http://www.imediaconnection.com/content/33972.asp#qakIxCXJbl9KpiG3.99

Mar 24 2013

3rd Party Data, Ad Ecosystem, Ad Serving, Ad Technology, Data Collection, Online Media, Online Privacy, Targeting

We don’t need no stinkin’ 3rd party cookies!

By Eric Picard (Originally published on AdExchanger.com)

I’ve been writing about some of the ethical issues with “opt-out” third-party tracking for a long time. It’s a practice that makes me extremely uncomfortable, which is not where I started out. You can read my opus on this topic here.

In this article, I want to go into detail about why third-party cookies aren’t needed by the ecosystem, and why doing away with them as a default setting is both acceptable and not nearly as harmful as many are claiming.

First order of business: What is a third-party cookie?

When a user visits a web page, they load a variety of content. Some of this content comes from the domain they’re visiting. (For simplicity’s sake, let’s call it Publisher.com.) Some comes from third parties that are loading this content onto Publisher.com’s web site. (Let’s call one of them ContentPartner.com.) An example would be that you could visit a site about cooking, and the Food Network could provide some pictures or recipes that the publisher embeds into the page. Those pictures and recipes sit on servers controlled by the content partner and point to that partner’s domain.

When content providers deliver content to a browser, they have the opportunity to set a cookie. When you’re visiting Publisher.com’s page, it can set a first-party cookie because you’re visiting its web domain. In our example above, ContentPartner.com is also delivering content to your browser from within Publisher.com’s page, so the kind of cookie it can deliver is a third-party cookie. There are many legitimate reasons why both parties would drop a cookie on your browser.

If this ended there, we probably wouldn’t have a problem. But this construct – allowing content from multiple domains to be mapped together into one web page, which was really a matter of convenience when the web first was created – is the same mechanism the ad industry uses to drop tracking pixels and ads onto publishers’ web pages.

For example, you might visit Publisher.com and see an ad delivered by AdServer.com. And on every page of that site, you might load tracking pixels delivered by TrackingVendor1.com, TrackingVendor2.com, etc. In this case, only Publisher.com can set a first-party cookie. All the other vendors are setting third-party cookies.
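The distinction comes down to comparing the domain of the page with the domain each piece of content is loaded from. Here is a rough Python sketch of that comparison. The helper names `site_of` and `cookie_party` are hypothetical, and treating the last two labels of a hostname as "the site" is a simplifying assumption; real browsers consult the Public Suffix List so that domains like example.co.uk are handled correctly.

```python
from urllib.parse import urlsplit


def site_of(url):
    """Naive 'site' of a URL: the last two labels of its hostname."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def cookie_party(page_url, resource_url):
    """Classify a cookie set by resource_url while the user views page_url."""
    same_site = site_of(page_url) == site_of(resource_url)
    return "first-party" if same_site else "third-party"


page = "http://www.Publisher.com/article"
for resource in (
    "http://Publisher.com/logo.png",
    "http://AdServer.com/ad.js",
    "http://TrackingVendor1.com/pixel.gif",
):
    print(resource, "->", cookie_party(page, resource))
```

Only the resource served from Publisher.com itself is classified as first-party; the ad server and the tracking vendor are third parties on this page.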

There are many uses for third-party cookies that most people would have no issue with, but some uses of third-party cookies have privacy advocates up in arms. I’ll wave an ugly stick at this issue and just summarize it by saying: Companies that have no direct relationship with the user are tracking that user’s behavior across the entire web, creating profiles on him or her, and profiting off of that user’s behavior without his or her permission.

This column isn’t about whether that issue is ethical or acceptable, because allowing third-party cookies to be active by default is done at the whim of the browser companies. I’ve predicted for about five years that the trend would head toward all browsers blocking them by default. So far Safari (Apple’s browser) doesn’t allow third-party cookies by default, and Mozilla’s Firefox has announced it will block them by default in the next version of Firefox.

Why I don’t think blocking third-party cookies is a problem

There are many scenarios where publishers legitimately need to deliver content from multiple domains. Sometimes several publishers are owned by one company, and they share central resources across those publishers, such as web analytics, ad serving, and content distribution networks (like Akamai). It has been standard practice in many of these cases for publishers to map their vendors against their domain, which by the way allows them to set first-party cookies as well.

How do they accomplish this? They set a ‘subdomain’ that is mapped to the third party’s servers. Here’s an example:

Publisher.com wants to use a web analytics provider but set cookies from its own domain. It creates a subdomain called WebAnalytics.Publisher.com using the Domain Name System, or DNS. (I won’t get too technical, but DNS is the way that the Internet maps domain names to IP addresses, the numeric identifiers of servers.) It’s honestly as simple as one of the publisher’s IT people opening up a web page that manages their DNS, creating a subdomain name, and mapping it to a specific IP address. And that’s it.

This allows the third-party vendor to place first-party cookies onto the browser of the user visiting Publisher.com. This is a standard practice that is broadly used across the industry, and it’s critically important to the way that the Internet functions. There are many reasons vendors use subdomains, not just to set first-party cookies. For instance, this is standard practice in the web analytics space (except for Google Analytics) and for content delivery networks (CDNs).
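A small Python sketch may help show why the subdomain trick works. The DNS table and IP addresses below are made up (they come from address ranges reserved for documentation), and `domain_matches` is a simplified version of the cookie domain-matching rule browsers apply; the full rule also excludes raw IP addresses, which this sketch ignores.

```python
# Hypothetical DNS records for the publisher's zone.
DNS_RECORDS = {
    "publisher.com": "203.0.113.10",                # publisher's own server
    "webanalytics.publisher.com": "198.51.100.25",  # the vendor's server
}


def domain_matches(host, cookie_domain):
    """True if a cookie scoped to cookie_domain applies to host."""
    host = host.lower()
    cookie_domain = cookie_domain.lower().lstrip(".")
    return host == cookie_domain or host.endswith("." + cookie_domain)


vendor_host = "WebAnalytics.Publisher.com"
print(DNS_RECORDS[vendor_host.lower()])              # resolves to the vendor's machine
print(domain_matches(vendor_host, "publisher.com"))  # cookie counts as publisher.com's
print(domain_matches("trackingvendor1.com", "publisher.com"))
```

The vendor's machine answers the request, but because the hostname sits under publisher.com, a cookie scoped to publisher.com is treated as first-party.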

So why doesn’t everybody just map subdomains and set first-party cookies?

First, let me say that while it is fairly easy to map a subdomain for the publisher’s IT department, it would be impractical for a demand-side platform (DSP) or other buy-side vendor to go out and have every existing publisher map a subdomain for them. For those focused on first-party data on the advertiser side, they’ll still have access to that data in this world. But for broader data sets, they’ll be picking up their targeting data via the exchange as pushed through by the publisher on the impression itself. For the data management platforms (DMPs), given that this is their primary business, it is a reasonable thing for them to map subdomains for each publisher and advertiser that they work with.

Also, the thing that vendors like about third-party cookies is that by default they work across domains. That means that data companies could set pixels on every publisher’s web site they could convince to place their pixels, and then automatically they would track one cookie across every site the user visited. Switching to first-party cookies breaks that broad set of actions across multiple publishers into pockets of activity at the individual publisher level. There is no cheap, convenient way to map one user’s activity across multiple publishers. And only those companies that have a vested interest, the DMPs, will make that investment, which will keep small vendors who can’t afford it from participating.

But, is that so bad?

So does moving to first-party cookies break the online advertising industry?

Nope. But it does complicate things. Let me tell you about a broadly used practice in our industry – one that every single data company uses on a regular basis. It’s a practice that gets very little attention today but is pretty ubiquitous. It’s called cookie mapping.

Here’s how it works: Let’s say one vendor has its unique anonymous cookies tracking user behavior and creating big profiles of activity, and it wants to use that data on a different vendor’s servers. In order for this to work, the two vendors need to map together their profiles, finding unique users (anonymously) who are the same user across multiple databases. How this is done is extremely technical, and I’m not going to mangle it by trying to simplify the process. But at a very high level, it’s something like this:

Media Buyer wants to use targeting data on an exchange using a DSP. The DSP enables the buyer to access data from multiple data vendors. The DSP has its own cookies that it sets (today these are third-party cookies) on users when it runs ads. The DSP and the data vendor work with a specialist vendor to map together the DSP’s cookies and the data vendor’s cookies. These cookie mapping specialists (Experian and Acxiom are examples, but others provide this service as well) use a complex set of mechanisms to map together overlapping cookies between the two vendors. They also have privacy auditing processes in place to ensure that this is done in an ethical and safe way, so that none of the vendors gets access to personally identifiable data.

Note that this same process is used between advertisers and publishers and their own DMPs so that first-party data from CRM and user registration databases can be mapped to behavioral and other kinds of data.
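Once the mapping exists, using it is conceptually a lookup. The Python sketch below shows only that last step, under loud assumptions: the match table, the cookie IDs, the segment names, and the function `segments_for_dsp_cookie` are all invented for illustration, and real systems are far more involved.

```python
# Hypothetical match table produced by a cookie mapping specialist:
# DSP cookie ID -> data vendor cookie ID for the same anonymous browser.
MATCH_TABLE = {
    "dsp-001": "dv-A17",
    "dsp-002": "dv-B42",
}

# Hypothetical behavioral segments held by the data vendor.
VENDOR_SEGMENTS = {
    "dv-A17": ["recent-car-buyer"],
    "dv-B42": ["grilling", "outdoor-living"],
}


def segments_for_dsp_cookie(dsp_cookie_id):
    """Translate a DSP cookie into the vendor's segments, if it was mapped."""
    vendor_id = MATCH_TABLE.get(dsp_cookie_id)
    if vendor_id is None:
        return []  # no overlap between the two vendors for this browser
    return VENDOR_SEGMENTS.get(vendor_id, [])


print(segments_for_dsp_cookie("dsp-002"))
print(segments_for_dsp_cookie("dsp-999"))
```

A browser that was never mapped simply yields no segments, which is why the size of the overlap between two vendors matters so much to buyers.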

The trend for data companies in the last few years has been to move into DMP mode, working directly with the advertisers and publishers rather than trying to survive as third-party data providers. This move was intelligent – almost prescient of the change that is happening in the browser space right now.

My short take on this evolution

I feel that this change is both inevitable and positive. It puts more power back in the hands of publishers; it solidifies their value proposition as having a direct relationship with the consumer, and will drive a lot more investment in data management platforms and other big data by publishers. The last few years have seen a data asymmetry problem arise where the buyers had more data available to them than the publishers, and the publishers had no insight into the value of their own audience. They didn’t understand why the buyer was buying their audience. This will fall back into equilibrium in this new world.

Long tail publishers will need to rely on their existing vendors to ensure they can easily map a first-party cookie to a data pool – these solutions need to be baked by companies who cater to long tail publishers, such as ad networks. The networks will need to work with their own DMP and data solutions to ensure that they’re mapping domains together on behalf of their long tail publishers and pushing that targeting data with the inventory into the exchanges. The other option for longer tail publishers is to license their content to larger publishers who can aggregate this content into their sites. It will require some work, which also means ambiguity and stress. But certainly this is not insurmountable.

I also will say that first-party tracking is both ethical and justifiable. Third-party tracking without the user’s permission is ethically a challenging issue, and I’d argue that it’s not in the best interest of our industry to try and perpetuate – especially since there are viable and acceptable alternatives.

That doesn’t mean switching off of third-party cookies is free or easy. But in my opinion, it’s the right way to do this for long-term viability.

Mar 03 2012

3rd Party Data, Data Collection, Online Privacy, Targeting

The Ethical Issues with 3rd Party Behavioral Tracking

(Originally published on AdExchanger, October 2011) by Eric Picard

Companies that track consumers’ behavior across the web without their consent, and without providing them any recognizable value, should stop. I’ll argue that virtually no company that tracks consumer behavior across multiple sites actually provides consumers with recognizable value.

And the real issue here is that consumers never opt in to being tracked this way. If we required an opt-in, the ethical issues would go away. But we don’t require one because in reality, consumers don’t want this, don’t benefit from it, and as an industry we’re acting in unethical ways. I realize that for this audience, my position makes me as unpopular as a New York City steam bath in August, but I challenge the industry to really stand up and do the right thing here.

For clarity: Publishers that track what their visitors do on that one publisher’s site face completely different issues. Consumers who visit a publisher’s site are engaging in a direct relationship with that publisher. As long as the publisher is collecting data to be used only on its own website, this is defensible: the consumer has elected to visit the site, and gets the value of the content that the publisher provides. And if the publisher asks the consumer to opt in to being tracked across multiple websites, then there is no ethical issue at stake. But cross-publisher behavioral tracking should definitely require an opt-in.

As long as a publisher has a clear privacy policy, data collection for their site without an opt-in is ethical. The consumer gets value from personalization of content as well as enabling the publisher to sell behaviorally targeted advertising. And the publisher has the right to collect this data to optimize their business, especially given that most publishers make the most of their revenue from advertising – this data is generally used to better sell ads to advertisers.

While a consumer is visiting a publisher’s site, the publisher certainly has the right to track his or her behavior. And having a user specifically ‘opt-out’ of being tracked on that publisher is the ‘right’ option to provide in terms of creating consumer good will.

There are three arguments commonly used by advertising industry professionals that make a case for behavioral targeting today:

1. Behavioral targeting makes advertising more relevant, a consumer benefit.

Targeting doesn’t make advertising more relevant; it makes a small percentage of the ads people see more relevant. In order to really increase relevance of advertising via targeting, the number of advertisers would need to vastly increase. There are more than 5 trillion ad impressions per month in online display. And more than 90% of US display ad spend is driven by fewer than 6,000 advertisers.

Frequently the argument is used that with Paid Search, consumers feel that contextual targeting makes the ads more relevant. But contextual targeting doesn’t require consumer behavior data. And further, the basis is completely wrong: Paid Search has roughly 250 billion impressions a month and 400,000 active advertisers (versus 5 trillion ad impressions and 6,000 to 8,000 advertisers for display ads).

The math is pretty simple – there is very little opportunity to target display advertising against niche segments today in order to increase overall relevancy of the ads. The reason people aren’t seeing relevant ads is not because targeting is not good enough, or we’re not collecting enough behavioral data, it’s because there are too few ad creatives to apply against the vast number of ad impressions.
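To spell the math out, here is the back-of-the-envelope calculation in Python, using only the rough figures quoted above (and the low end of the 6,000 to 8,000 advertiser range for display). The variable names are mine.

```python
# Rough monthly figures quoted in this article.
display_impressions = 5_000_000_000_000   # 5 trillion
display_advertisers = 6_000               # low end of the 6,000-8,000 range
search_impressions = 250_000_000_000      # 250 billion
search_advertisers = 400_000

display_per_advertiser = display_impressions / display_advertisers
search_per_advertiser = search_impressions / search_advertisers

print(f"Display: {display_per_advertiser:,.0f} impressions per advertiser")
print(f"Search:  {search_per_advertiser:,.0f} impressions per advertiser")
print(f"Ratio:   {display_per_advertiser / search_per_advertiser:,.0f}x")
```

On these numbers each display advertiser has to cover roughly 833 million impressions a month against 625,000 for each search advertiser, a gap of more than a thousand to one.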

This could change over time as more advertisers start moving into display – especially if we ever find a way to make display advertising efficient and effective for local advertisers, at which point there’d be lots more creatives. But even in these cases – cross-publisher tracking wouldn’t be necessary. As long as we had geographic targeting available (which doesn’t require any browsing history) and basic data that the publisher could track themselves, the ads could be much more relevant and valuable. But, until we grow the number of advertisers, and more importantly, the number of creatives to more closely match paid search – this is a moot point.

2. There is no harm in the third-party tracking technologies, the tracking is anonymous.

So-called anonymous tracking is not very secure; the anonymity is fairly easily broken. Cracking open that anonymous shell and merging it with personally identifiable information from other sources is a fairly easy engineering feat. Search for “netflix prize privacy” in any search engine for one example (keeping in mind that this example is from 2007). There have been many more examples since then.

Beyond this, many of the players in the behavioral targeting space are small startups, without a huge amount of investment in security infrastructure. Even major corporations have leaked millions of people’s data into the public domain. Searching for “AOL Data Leak”, “Sony PlayStation Data Leak”, “Skype Android Data Leak”, “Epsilon Data Leak” should prove interesting. More of these happen all the time. If these major corporations can’t keep your personal data secure, the idea that a small startup is a safe place for this data simply doesn’t ring true. And I say this with all respect to my friends working at these companies – at the same time, I’m worried about it.

There are lots of very ethical people in this industry who would never do anything intentional or nefarious with the behavioral data that is collected. But that doesn’t mean that bad actors don’t exist. And even if good people are at the helm in some of these companies today – over time, mergers and acquisitions, or bankruptcies can cause changes in ownership over this potentially sensitive data.

I think I’ve shown above that the potential for damage is both real and proven, and could be quite harmful.

3. We’re not as ‘bad’ as the offline marketers.

I probably shouldn’t have to even engage in this kind of argument. But just because I hear this a lot – I’ll address it.

Yes, the offline direct marketers do use a lot of targeting data – much of it using personally identifiable information – in order to target users on direct mail campaigns and other mechanisms. Yes, they use credit card purchase data, and financial data that no consumer really would ever be excited about anyone getting access to. And yes – they’ve been doing it for years. As far as I’m concerned, this is unethical as hell. And I’ve written lots of articles over the years that state my position on this.

That doesn’t justify us doing this online, even if we were doing things in a vacuum. However, we’re not doing things in a vacuum. The process used in the online space in order to support a lot of this behavioral targeting data being actually used for buying ads requires cookie matching.

Cookie matching requires the use of a third party to find some kind of ‘data key’ that sits in a third party data provider, which is used to match two anonymous sources together. This can be an email address, or could be a telephone number, or a mailing address, or something else. This ‘key’ allows two separate anonymous cookies to be matched together as one single user.
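Here is a simplified Python sketch of matching on a shared key. Everything in it is illustrative: the cookie IDs and email addresses are invented, the function names `key_of` and `match_cookies` are hypothetical, and hashing the email before comparing is my assumption about how a careful provider might avoid passing raw addresses around, not a description of any specific vendor's process.

```python
import hashlib


def key_of(email):
    """Normalize an email address and hash it to form the join key."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def match_cookies(source_a, source_b):
    """Pair cookie IDs from two sources that share the same key.

    Each source maps cookie_id -> email known for that cookie.
    """
    b_by_key = {key_of(email): cookie_id for cookie_id, email in source_b.items()}
    pairs = []
    for a_id, email in source_a.items():
        b_id = b_by_key.get(key_of(email))
        if b_id is not None:
            pairs.append((a_id, b_id))
    return sorted(pairs)


vendor_a = {"a-1": "Pat@Example.com", "a-2": "lee@example.com"}
vendor_b = {"b-9": "pat@example.com ", "b-7": "sam@example.com"}
print(match_cookies(vendor_a, vendor_b))
```

Note what the sketch makes plain: whoever holds the key holds personally identifiable information, even if the two cookies being joined are each "anonymous" on their own.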

The main providers of this service are the same exact ‘offline marketing’ data companies that are referenced by people in our industry as the ones who are ‘worse than we are.’ In other words – there is no difference whatsoever between the online and offline data providers – as they are both used in order for this behavior to function in our space.

Some of you may remember that the acquisition of Abacus (an offline data provider to direct marketing companies who built targeted mailing lists) by DoubleClick was blocked way back in 1999 because of fears that Abacus’ data would be able to be merged with online data, and that this was a dangerous thing to the privacy of consumers.

But in reality – only a few years later, other providers of offline data for direct marketing began offering this kind of cookie matching service to online marketers – without any acquisitions. If this was such a concern that it caused DoubleClick’s acquisition to be rejected by regulators – then why is this not a concern when it’s done as part of the day-to-day business of the online advertising industry?

In the end, this is really just about doing the right thing – from my perspective. Consumers should give their approval before anyone without a direct relationship with them begins tracking their behavior across numerous web sites. This seems self-evident to me, and to most consumers I’ve talked to about it. The only argument to the contrary I’ve ever gotten was from professionals in the marketing industry. And as I’ve shown above, these don’t hold water as far as I’m concerned.

Tagged Data, Privacy, Targeting

Mar 03 2012

3rd Party Data, Data Collection, Online Privacy, Publishers, Targeting

It’s not your data!

(Originally published under the title “Our industry’s Unethical, Indefensible behavior”, in iMediaConnection, April 2011) by Eric Picard

I’ve been writing a lot lately on the topic of online privacy at the intersection of advertising, and particularly the way the third-party tracking ecosystem has been evolving for the past few years. There is an ongoing onslaught of discussion about legislation and how we’re probably going to get regulated. Some of my closest friends in the industry are at odds with my position, and many people are finding themselves diametrically opposed to people they respect over this issue. People are claiming that if we stop the targeting, all the value in this industry will bottom out — that another bubble will burst, and advertising Armageddon will follow. I disagree. I believe a huge amount of value can be generated without marginally ethical behavior.

To me, it’s a very clear issue — one based on ethics and logic. If companies are tracking people across multiple websites without their consent, and without providing any recognizable value, and those people want the tracking stopped — then it should probably stop. There is real money on the table for the companies that do this data collection, and changing the opt-out model to an opt-in model would decimate their financial outlooks. But this ultimately doesn’t matter. As an industry, we are doing something that most people simply don’t want us to do.

When a publisher tracks what its visitors do on that one publisher’s site, tracking is a defensible practice. The online users who visit a publisher’s site are electing to visit that publisher, and as long as the publisher is collecting data to be used only on its own website, then this falls into the standard quid pro quo relationship that already exists.

People get free or reduced-cost content that they desire to consume from a publisher. The publisher shows them ads, and frequently requires that the consumer register or subscribe (regardless of whether this is a free or paid subscription) and hand over some data to be used to better sell ads to advertisers. While a person is visiting a publisher’s site, the publisher certainly has the right to track his or her behavior. There are lots of reasons justifying this right. And consumers can choose to simply avoid visiting that particular publisher if they disagree with the publisher’s privacy policy. And having a user specifically opt out of being tracked on that publisher’s site is a great option to provide.

However, my issue is with the practice that has exploded over the past few years, where third-party companies place tracking tags all over the internet, across multiple publishers, and create comprehensive profiles of consumer behavior. This happens without any discernible value given back to consumers (I have lots more to say on this issue below) and without their direct knowledge or consent. This tracking is all enabled by third-party tracking using third-party cookies. This capability was not what the browser designers created cookies for, and it is a sort of hack of the way browsers operate. If “hack” is too strong a word, it’s at least an unintended loophole in browser design that has been used in ways that are hardly defensible.

While I am passionate on this topic, I actually think this argument is a moot point in many ways. I predict that the browsers are going to very elegantly enable consumers to block third-party cookies in the next few releases, and the whole house of cards built on top of this loophole in cookie security is going to fall to the ground.

The Internet Explorer team at Microsoft has already announced that IE 9 will make it extremely easy to block third-party cookies and content. And most technical people running the browser groups at Firefox (keep in mind, there really are no business people involved in this open-source browser) and Google (where technology drives most decisions) are all pretty smart; they understand the tracking behavior that they want to shield the public from. This is clearly an issue that technologists understand better than the general population, and most technical people I’ve talked to have arrived at the same conclusion: Blocking third-party tracking is in the best interest of consumers, it should be extremely easy to do, and the decision should be pre-populated as an opt-out.

Most of the discussions I’ve had on the opposite side of this issue have been with business people. They believe that there is no danger to consumers from what they perceive to be anonymous tracking of online behavior. And they continue to look at people who don’t agree with them as privacy fanatics who are irrationally trying to limit their businesses from succeeding. This isn’t the case, and I certainly am not fanatical about privacy. But I’ve learned a lot over the past 10 years about this topic, and on top of this, the market has radically shifted in the past three years. The amount of tracking going on has seen a huge increase, and the safeguards on the data being collected are quite squishy.

There is a real issue here that apparently hasn’t been understood by a lot of non-technical people. So-called anonymous tracking is fairly easily cracked open. And now that there are many mechanisms that have been created for matching cookies across domains and companies, there are numerous broadly correlated profiles of user behavior floating around. Many of the companies that have copies of these profiles are small startups, many without nearly the funding or maturity needed to build extremely secure environments. And even some of the biggest companies out there have had significant security breaches over the last few years — breaches that have leaked millions of people’s data into the public domain.

Many of the executives at the companies operating in this sphere are very reputable and honorable people who are certainly not being malicious or trying to hurt people. But what happens if their companies are purchased by less-reputable entities? Clearly those with scruples will simply quit and find other work. But now we’ve got a company run by unethical and dangerous individuals with access to a ton of data that can pretty quickly and easily be reverse-engineered to do diabolical things.

Or what if a startup isn’t successful and goes into bankruptcy — and the data assets get auctioned off to the highest bidder? Or what if there is a security breach and a hacker gets access to the company’s log files or plants spyware on its servers? There have been cases in this industry of crackers getting into server farms and hosting software there that gave them access to a lot of data. And of course, there is the other problem of companies that are just unethical to begin with.

Many proofs have been created that show how easy it is to reverse-engineer anonymous tracking. With a small amount of data to correlate with non-private activity, any decent engineer can take apart the anonymous shell around a person’s profile and merge it with personally identifiable information from other sources. And suddenly we’ve got non-anonymous profiles with all sorts of data in the hands of not-so-scrupulous people. Not a recipe for comfort.
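To give a feel for how little it takes, here is a toy Python sketch of a linkage attack. It is not a reconstruction of any published study: the profiles, names, threshold, and the function `reidentify` are invented, and real attacks use more careful statistics. The idea is simply to find the public, named record that overlaps most with an "anonymous" one.

```python
def reidentify(anonymous_profiles, public_profiles, min_overlap=3):
    """Guess a name for each anonymous ID from overlapping activity.

    Both arguments map an identifier to a set of observed events.
    """
    matches = {}
    for anon_id, anon_events in anonymous_profiles.items():
        best_name, best_overlap = None, 0
        for name, public_events in public_profiles.items():
            overlap = len(anon_events & public_events)
            if overlap > best_overlap:
                best_name, best_overlap = name, overlap
        if best_overlap >= min_overlap:   # require enough shared events
            matches[anon_id] = best_name
    return matches


# "Anonymous" tracking data: cookie ID -> (site, date) visits.
anonymous = {
    "cookie-42": {("grill-reviews", "05-01"), ("shoe-store", "05-02"),
                  ("local-news", "05-03"), ("health-forum", "05-03")},
}
# Public, named activity, such as signed reviews or forum posts.
public = {
    "Alex Doe": {("grill-reviews", "05-01"), ("shoe-store", "05-02"),
                 ("local-news", "05-03")},
    "Sam Roe": {("local-news", "05-03")},
}
print(reidentify(anonymous, public))
```

Three public data points are enough here to attach a name to cookie-42, and with it the health-forum visit that was never public.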

At this point, the business people typically try to argue that without the work they do, consumers will have the horrible (never mind that it’s what already exists) experience of having to see advertising that is not relevant. The fallacious argument goes like this: if we have better targeting, the ads that consumers see will be more relevant, and they will have a better experience visiting websites that are ad-funded.

There is no persuasive argument to be made that consumers benefit (really at all) from third-party tracking. The ads are not perceptibly more relevant (to the consumer), despite the advertiser’s ability to do deep statistical analysis and see a measurable lift in performance. The only groups really benefiting from the third-party tracking that’s going on are the companies that sell it, and to some degree the advertisers that are able to make use of it for a tiny percentage of their overall spend.

This argument is really hard to defend, and has been made by the ad industry for the past 15 years. I’ve made this argument myself a bunch of times. See this video for definitive proof. Please note that watching myself in this video drove two major shifts in my life: First, I saw that even I didn’t really believe this argument anymore, and I stopped championing this position. Second, I realized I needed to lose a ton of weight (which I’ve since done).

The argument of more relevant display ads is a fallacy. There is simply not enough ad inventory available to really improve relevance to a degree that it would meet the bar of a consumer. Getting a tiny percentage lift on CPAs that are already tiny doesn’t matter enough to justify the issues I’m complaining about from a consumer perspective.

Just because I looked at a pair of shoes online and then one out of 50,000 of the ads I see afterwards is for the same pair of shoes doesn’t mean that we’re making advertising more relevant. It means we’re making a few ads more relevant. A tiny handful. A handful that is so small that it won’t for a moment change the way that consumers feel about online ads. To make advertising feel more relevant to consumers, we’d need hundreds of thousands or even millions of ads from a similar number of companies.

One argument I hear a lot is that consumers prefer the ad experience from paid search because they feel the ads are more relevant. But there is no real comparison to make here. There are something like 5,000 advertisers that make up more than 90 percent of the U.S. ad spend on display, across approximately 5 trillion monthly impressions across hundreds of millions of ad locations. Paid search has more than 400,000 active advertisers at any given time, with only about 250 million impressions per month and only something like 2-3 million commercially viable keywords. Paid search has more relevant ads than display because of this high concentration of advertisers across a small number of ads. We’d need a similar kind of ratio to really appear more relevant to consumers based on targeting in display ads — and we’re nowhere close to this. If someone ever figures out how to get local advertisers to buy display advertising, this could happen — but we’re a long way from this nirvana.

Another argument I hear is that we’re “not as bad as the offline direct marketers, who have been doing much more of this for years, and who have way more data than the online marketers.” And generally the argument is included that consumers clearly haven’t rebelled against direct mail, so they shouldn’t have a problem with what online marketing does.

This is simply silly from my point of view. First, the companies that lead the offline direct marketing industry are exactly the pivotal players that are enabling much of the third-party tracking going on in the online space. They’re the ones gluing together the cookies from multiple parties, so there is no “them vs. us.” We are the same exact industry, and the players are active across the board, across any perceived boundary.

Second, just because consumers have given in on the offline tracking that is going on and data sharing that happens regularly across the credit card and finance industry, this doesn’t imply their implicit acceptance of similar behavior in other venues. Like a frog dropped in warm water and slowly boiled, they didn’t understand what was happening in the offline world until it was too late. Now most consumers understand the issues, and they are not happy about this happening again in the online space where companies are more visibly collecting data about their behavior without permission. At least with the credit card companies, consumers get tangible benefit from the use of the credit card. In the online space, there is no perceptible value.

If you still believe that there is a credible argument to make to the average consumer on this topic, try explaining to an acquaintance who doesn’t work in the online advertising industry what tangible value they get from allowing a third party to track them. And be sure to explain what is really happening, including how many different sites they’re being tracked on without their consent. See if they call foul on you.

And frankly, you need to really question this issue yourself. Imagine your reaction if you found out that some company was hiring people to follow your wife, husband, mother, or children around and note what they do all day in order to build segmentation models for marketing. Imagine that when you confronted them, their response was, “But we anonymize the data. Trust us.” It just doesn’t cut the mustard from my point of view.

I have discussed this issue with lots of consumers, and not a single one — not one person — has ever said that he or she was satisfied with the ability to opt out. Every single one has complained about the fact that this was done without permission.

From a moral and ethical standpoint, I can no longer say with a straight face that third-party tracking is OK. I simply don’t believe it. There is no justification I can see from a consumer point of view that they should simply sit back and swallow all this tracking that doesn’t benefit them. Companies are making money off of their personal activity data. Every person I’ve talked to outside of our industry believes they have the right to expect that someone should need to ask permission before tracking.

I now believe that companies with no direct relationship with a consumer should not have the right to track that consumer’s behavior across multiple websites, make money off that consumer’s data, and potentially put that user’s privacy at risk without explicitly asking permission first. First-party tracking is acceptable and justifiable. If I visit a publisher’s website, there’s an understood quid pro quo that all consumers are fairly aware of at this point; they know they need to put up with advertising in order to get access to content and free or reduced-cost tools (e.g., email, IM, etc.).

On the advertiser side, consumers generally don’t have a problem if they are tracked when they visit the website of a company of which they are a customer. Amazon is often used as an example here. Just as there is a reasonable expectation that a shop owner would watch what you’re looking at and make suggestions to you inside their store, Amazon has legitimate reasons to track shopping behavior and provides customer value by doing it.

In the end, just because we can do something doesn’t mean we should do something.

Tagged Data, Privacy, Targeting

Mar 03 2012

3rd Party Data, Data Collection, Online Privacy, Targeting

Why consumers think online marketing is creepy

(Originally published in iMediaConnection, December 2010) by Eric Picard

When the concept of cookies was introduced into web browsers, the idea was simple and designed to allow for some permanence in the relationship between a person and a website beyond a single session. Cookies would allow someone to visit a site, return to the site, and have his or her user ID already populated into the browser. It also would allow the website itself to create a persistent relationship with a person who visits the site across multiple visits.

These browser cookies were left fairly open and flexible, and were not just limited to the sites that people were visiting. They enabled broad collection of browsing information by any entity that had rights to place images or content onto any site, even from different domains than the person was visiting. This broad capability is what enabled the creation of third-party tracking as a business. And it is extremely useful to companies that market online to consumers. Various ad serving companies enabled this capability (anonymizing the person’s browser such that no personally identifiable information would be passed to the advertiser, just a unique number representing the person) for advertisers, which then were able to understand broad consumer behavior in new ways. And the anonymous nature of these cookies made everyone involved feel justified and comfortable using the technology in this extended way.
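To make the mechanism above concrete, here is a minimal sketch of how one third-party domain could recognize the same browser across unrelated sites. It is a toy simulation, not any ad server's real code: the `ThirdPartyTracker` class, the `serve_pixel` method, and the `*.example` site names are all hypothetical.

```python
import uuid
from collections import defaultdict


class ThirdPartyTracker:
    """Toy model of an ad server whose pixel is embedded on many sites."""

    def __init__(self):
        # anonymous cookie ID -> list of sites where the pixel was loaded
        self.history = defaultdict(list)

    def serve_pixel(self, cookie_id, referring_site):
        # First contact: the browser has no cookie yet, so mint a random ID.
        if cookie_id is None:
            cookie_id = uuid.uuid4().hex
        # Record the visit against the anonymous ID (no name, no email).
        self.history[cookie_id].append(referring_site)
        # The browser stores this value and sends it back on the next request.
        return cookie_id


tracker = ThirdPartyTracker()
browser_cookie = None  # the browser's stored cookie for the tracker's domain

# The same browser visits three unrelated sites that all embed the pixel.
for site in ["news.example", "travel.example", "autos.example"]:
    browser_cookie = tracker.serve_pixel(browser_cookie, site)

print(tracker.history[browser_cookie])
```

The point of the sketch is that the tracker never needs to know who the person is: one random number, sent back by the browser on every page that embeds the tracker's content, is enough to stitch the visits together.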

More recently, the online advertising industry has gone through a series of revolutions that are fundamentally changing the way these tracking cookies operate. This change has the potential to radically improve the utility of advertising to companies that market online due to the advent of advertising exchanges like the Google-owned DoubleClick ad exchange, the Yahoo-owned Right Media exchange, or the AppNexus exchange, which recently closed $50 million in funding from various sources including Microsoft.

An entire ecosystem of companies has grown up around these exchanges. On the side of publishers, there are the supply-side platforms (SSPs) like Pubmatic, Rubicon, AdMeld, and others. On the side of the advertisers and agencies, there are the demand-side platforms (DSPs) such as Invite Media (owned by Google), MediaMath, Turn, DataXu, Triggit, and others. And all of these providers look for as much data as possible to be injected into the ad impression stream so that an appropriate valuation of the impression can be made. The publishers and SSPs push some data into the chain from what they’ve collected. The advertisers come to the table with what they’ve collected. And the exchanges enable third-party data companies to inject data as well. At the end of the day, the battle has become about who has the best data that nobody else has in order for someone to get an edge and either make more or pay less money than competitors. This space is loosely referred to as the “real-time bidding” space, even though RTB is only part of the story and not always used.

So let’s examine the data side of this market, and what’s going on. Because either what is going on is all perfectly fine, or it is not. And consumers are getting creeped out by it. The question is: Should they be creeped out, or not? Is everything happening in this space completely benign, or is it harmful in some way? From a more philosophical point of view, should companies be able to use cookies to track behavior of individual people via their web browsers and use that data to make money without consent of the person, and without asking first? Should this be legal? That’s at the heart of the FTC’s recent “Do not track” discussion. The commission is proposing a simplified “opt-out” of tracking that does not quite go to the onerous “opt-in” requirement that many have feared, but that would certainly let consumers easily stop being tracked.

Let’s investigate how the ecosystem for data companies works today, and let’s really ask ourselves if there is an issue here. Over the past few years, third-party data companies like AudienceScience, Tacoda (bought by AOL), BlueKai, eXelate, Bizo, and others have found that they can create business arrangements with various websites to enable tracking of behavior, which can then be sold to one of the companies in the RTB space. A good example of this kind of relationship would be a travel website that enables a data company to cookie any user that is searching for travel arrangements and collect the dates and destinations of those travel plans. Or an automotive website that lets a data company track which models of car a person is shopping for. Once the data is available to advertisers and ad agencies, either through a publisher, an SSP, an exchange, or a DSP, the advertiser can bid on impressions specifically based on the audience characteristics suggested by their browsing behavior.
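As a rough illustration of bidding on audience characteristics, the sketch below scales a base bid when an impression carries a segment the advertiser cares about. Everything here is an assumption for illustration: the `Impression` type, the `bid_cpm` function, the segment names, and the multipliers are invented, and real bidders use far more elaborate valuation models.

```python
from dataclasses import dataclass, field


@dataclass
class Impression:
    cookie_id: str
    site: str
    # Audience segments injected by a (hypothetical) third-party data company.
    segments: set = field(default_factory=set)


def bid_cpm(impression, base_cpm, segment_multipliers):
    """Return a CPM bid: the base price scaled by the best matching segment."""
    matching = [
        multiplier
        for segment, multiplier in segment_multipliers.items()
        if segment in impression.segments
    ]
    # With no matching segment, fall back to the unmodified base bid.
    return base_cpm * max(matching, default=1.0)


# Hypothetical values an automotive advertiser might configure.
multipliers = {"in_market_suv": 3.0, "travel_intent_hawaii": 2.0}

known = Impression("abc123", "news.example", {"in_market_suv"})
unknown = Impression("zzz999", "news.example")

print(bid_cpm(known, 1.50, multipliers))    # bids more for the SUV shopper
print(bid_cpm(unknown, 1.50, multipliers))  # base bid for an unknown browser
```

The same page view is worth different amounts to the advertiser depending only on what the attached data says about the browser, which is why so much of the competition is over data.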

All of these companies go to varying lengths to ensure that no personally identifiable information is exposed when they trade this information over, and nothing I’ve said above sounds overly concerning — especially when you think about the messaging that this is anonymous. And some companies in this space have gone to great lengths to ensure that there is at least a potential consumer upside. BlueKai comes to mind immediately with its BlueKai registry, which enables consumers to see what data are collected about them, edit their profile, and then select a charity to which a portion of the proceeds from their tracking can be assigned.

Very few people outside our industry are aware of all the “cookie matching” that goes on. This process essentially lets two different data providers compare cookies and match the intersection of the audience members between those cookies. This is typically done through a third-party service like Acxiom or Experian, which don’t allow the two parties to match the users in such a way that they can accidentally match personally identifiable information to a profile. A scenario would be an advertiser that has a list of cookies of its customers, which could compare those cookies to a data provider’s list of cookies that show their customers’ other profile attributes from surfing across various websites. Then the advertiser can bid differently on its own customers when it sees them. Given that the cost of acquiring a customer is far higher than retaining a customer, this is good business in many ways. But is it good in general?
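The matching step can be sketched as a simple join performed by the neutral party. This is a simplified assumption of how such a service might work, not a description of Acxiom's or Experian's actual systems; the `match_audiences` function, the ID formats, and the attribute names are hypothetical.

```python
def match_audiences(match_table, advertiser_customers, provider_profiles):
    """Join two cookie spaces on behalf of both parties.

    match_table: advertiser cookie ID -> provider cookie ID
                 (held only by the neutral match service)
    advertiser_customers: set of advertiser cookie IDs for known customers
    provider_profiles: provider cookie ID -> set of behavioral attributes
    """
    matched = {}
    for advertiser_id in advertiser_customers:
        provider_id = match_table.get(advertiser_id)
        # Keep only browsers that both parties have seen.
        if provider_id in provider_profiles:
            matched[advertiser_id] = provider_profiles[provider_id]
    return matched


# Hypothetical data: none of these records contain names or addresses.
match_table = {"adv-1": "dp-9", "adv-2": "dp-4", "adv-3": "dp-7"}
customers = {"adv-1", "adv-3", "adv-5"}
profiles = {
    "dp-9": {"auto_intender"},
    "dp-4": {"frequent_traveler"},
    "dp-2": {"new_parent"},
}

overlap = match_audiences(match_table, customers, profiles)
print(overlap)
```

Only the intersection comes back to the advertiser, keyed by its own cookie IDs, so it can bid differently on its existing customers without either side handing over its full list.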

It does raise the question of whether you should have to go and opt out of this in the first place. I’ve had dozens of conversations with people about this, and not one of them was happy about being tracked this way. Some were resigned and disappointed, some were creeped out, and others were downright angry. When people start saying things like, “What gives them the right?” I get a bit concerned that legislation can’t be too far behind.

Tagged Data, Privacy, Targeting

Mar 03 2012

3rd Party Data, Data Collection, Online Privacy, Targeting

The real reason consumers are creeped out by online ads

(Originally published in iMediaConnection, September 2010) by Eric Picard

Direct response marketers have been using various statistical models for decades to determine how to predict human behavior. They’ve built proven models that can help a marketer reach a highly targeted audience with a high degree of reliability and show that audience a message that has a higher probability of success than a random untargeted message. The easiest way to see this at work is to buy a house.

Two years ago, I bought a house (my timing was impeccable). Within weeks of my mortgage closing, I began to receive all sorts of interesting things in the mail. This was interesting because I explicitly opted out of having the data from my mortgage shared with anyone (or so I thought). As it turns out, this isn’t really possible — at least, I wasn’t able to pull it off, and I am aware of how the DR industry works. The average consumer hasn’t got a chance.

The kinds of mail I began receiving included lots of offers for things like mortgage refinance (despite the fact that I had bought my house only weeks before), various types of insurance (most were flavors of home warranties), and then literally hundreds (possibly thousands) of offers from local businesses to try their services. This included some that were logical and tied to my physical relocation to a new neighborhood — various dentists, hair salons, landscapers, accountants, hardware stores, and roofing companies.

The DR industry has statistical models that clearly show the series of marketing opportunities that are associated with major life events. So when you have a baby, there are many things you’re likely to need to buy. When you buy a house, it’s very similar (in fact, these events are highly correlated). For instance, having a baby frequently is followed by purchasing a new (and safer or more spacious) car, SUV, crossover, or minivan. Life insurance is another highly correlated purchase.

These models are built, and the “sensing” mechanisms flow out into the various sources of publicly available data, as well as numerous private sources of data like financial services companies. For decades, your every credit card purchase has been carefully scrutinized and analyzed and applied against highly refined statistical models to figure out what opportunities exist to sell you other products and services.

Many people have begun to realize this — but it took decades to build the systems, and decades more to have the knowledge of their existence permeate the culture. So by the time you read this, many of you have simply accepted that this is standard practice. You’ve come to terms with your outrage at the fact that, without explicitly asking for your permission, data about your private life has been used to segment you into various buckets in order to more effectively market to you.

One of the major problems with this traditional direct response marketing is the massive expense behind it. Despite being a highly profitable, high-revenue business, it’s extremely expensive to operate. Building the statistical models, mining the data across numerous sources, and then building personalized (not private in any way, mind you) profiles against which to sell the personal contact information you’ve amassed — including phone numbers, physical home addresses, and names — isn’t cheap. And when it first started out, the costs were much higher because computing power was relatively much more expensive.

And that’s been the problem with DR since it began: Building these mailing lists of highly personalized targeting opportunities is so expensive, the pools of individuals who match them are so small, and the amount of time that the data are fresh and relevant is so short that the opportunity for any single marketer to reach target audiences is pretty small. Maintaining the freshness of the data is a big part of the expense. From a marketer’s perspective, the decision to use these mechanisms is quite simple — the response rates are well known and the ROI decision is easy. But the number of customers any one company can create using these tools is low enough that other forms of marketing are needed.

If the benefits of DR are its targeted, effective nature and clear ROI, its handicap is the limited audience size for any one company. DR is like fly fishing; pick the fly that will work on that specific type of fish and get the fly into the right location at the right moment. Good old brand advertising has nothing to fear from DR for this reason. The benefit of brand advertising is that you reach a large-scale audience at a low cost and get your message to the masses. Brand advertising is like fishing with a big net; you catch a lot of fish, but you have to throw a lot of them back because they weren’t what you were looking to catch. The problem is, at these large scales, the ability to know how effectively you’re reaching the ideal audience is pretty limited.

Over time, a secondary market of service providers using panels of users that fit various criteria has developed. And at very large scale, marketers have been able to look at various media planning tools for decades that can show them the likelihood of reaching a desired audience based on association of the audience with various television shows, magazines, radio stations, newspapers, etc. But all these tools show is that there is a probability of reaching a certain relatively broad type of audience (e.g., women in a certain age group). But this is better than nothing and has worked fairly well.

And thus the market flourished. And along came online advertising.

When online advertising began, many saw this as the holy grail of marketing. Finally (they said) here is a place where computers are deeply integrated by nature, and we can combine the two methodologies: We can build systems that enable DR and brand advertising to coexist, and eventually we can find a way to do both things. We can reach highly targeted audiences at large scale and low cost and dynamically generate targeting profiles that radically improve ROI.

I cannot tell you how many meetings I’ve been in over the past 15 years where the conversation flowed essentially like this: “What we’re really trying to do is build a database with one row for each person on the planet, and one column for each targeting attribute we believe we can sell to marketers.” This 6-billion-row database with millions of columns has been theoretical, of course; neither the technology to pull it off nor the reach to every person on the planet has been available.

And there is, of course, the major issue with privacy that keeps coming up and biting this industry on the backside. Whereas it took decades for the idea of big DR databases with personal data to permeate into the culture, online advertising showed up when the issues were a lot clearer to most people. And since the state of the art of behavioral targeting has begun to show some noticeable results, people are beginning to get “creeped out.” Recent Wall Street Journal and New York Times articles have highlighted how the industry has begun to change; they’ve talked about all the various targeting tags all over the commercial web that track interest and behavior. Users are noticing targeted ads, for better or worse. And the consumer response typically has been something along the lines of, “Who gave you the right?!”

Recently a friend of mine said that she had searched for a specific pair of shoes online, added them to the shopping cart of a website, and then decided to hold off on the purchase. For the next few days, she saw ads for that specific pair of shoes on numerous websites as she surfed across the web. She didn’t find this targeting of a relevant ad to be useful or “less annoying” than non-targeted ads. She found it creepy.

When I talk to people in our industry about the issues surrounding privacy and targeting, they frequently fall back on the defensive leg of providing consumers with more relevant advertising. They say that once ads are more relevant, consumers will resent advertising less — that they might even like it. I’ve used these arguments myself in the past. The reality is that consumers would benefit from more relevant ads and might resent advertising less if the content of those ads matched better against their interests. But when we make them feel like someone is watching over their shoulders as they do things online, make no mistake — they resent it.

The example of Amazon.com comes up frequently in conversations around our industry. Amazon inherently shows products that match the kinds of things you’ve shopped for or purchased in the past. And often I’ve heard examples like, “When I go into a store and the shopkeeper recognizes me and makes a recommendation for me, I like it, and I begin to frequent this store more often because of the personalized service.”

But the reality is that this is a direct relationship that the consumer has with a specific merchant. It’s a one-on-one relationship that gives specific benefit and that has a clearly understood set of relationship rules. One colleague recently described behavioral targeting like this: “It’s like you are shopping in a store, and a guy in dark sunglasses and a trench coat is following you around and whispering into his watch. Then when you go into another store, he sidles up to the merchant and whispers in her ear that you were just shopping for a negligee in another store down the street, and that you seem to prefer underwire cups.”

The reality of behavioral targeting is not far off from this example, and this seems to be missed by the marketing industry. Ultimately, consumers will decide what is and isn’t acceptable to them, and beware the marketing industry executives who believe they will make that decision on the consumer’s behalf. Now that people are relatively aware of the DR marketing practices in the traditional world, they are getting fed up with them in the online world, where they felt relatively anonymous and private. Consumers recognize that Amazon.com might know a lot about their purchase and shopping behavior while on that particular website; however, they would likely feel very uncomfortable if that data were then sold on the open market without their explicit permission to any advertiser willing to pay for it. Politicians have become aware of this growing consumer resentment, meaning that legislation is likely not far behind.

The online advertising industry isn’t wholly clueless, and many have been trying to come up with new approaches that they feel are less antagonistic to consumers, while still providing value to advertisers. In future articles, I’ll be exploring some of the ways that companies are thinking about the problem and beginning to address the issues.

Tagged Data, Privacy, Targeting

Mar 03 2012

3rd Party Data, Data Collection, Online Privacy, Targeting

Fixing Online Advertising’s Privacy Woes

(Originally published in iMediaConnection, August 2010) by Eric Picard

Privacy is something I’ve been concerned about for some time when it comes to online advertising. John Hagel and Marc Singer’s excellent book “Net Worth” raised the issue in a significant way for me from a business perspective, way back in the ’90s, and Cory Doctorow’s recent novel “Little Brother” paints a bleak picture of what could happen to private citizens if privacy isn’t carefully guarded.

I raise this, of course, because of the recent Wall Street Journal and New York Times articles that have raised the specter of major privacy concerns because of the widespread tracking done by numerous parties in the online advertising space. I began worrying about the likelihood that targeting and privacy would begin to clash in a significant way back in 2004 when I started to understand what was going to happen with display advertising as we moved as an industry away from selling mainly context-based display ads and toward personalized, highly targeted audience-based display ads. And as we began moving toward automated buying systems and real-time bidding for ads based on audience attributes over the last five years, I knew we were in for it once again. From my point of view, it’s always been about when, not if, we were going to run into a consumer backlash against how much data we can (and do) collect in the online space.

Of course, the part that is a little ironic is that very detailed tracking of purchasing behavior and extrapolation of that behavior to other personal life stages and psychographic profiles (a process that is pretty accurate) of each person’s behavior has been common for decades via credit cards, financial services, and offline (traditional) targeting for direct marketing. And for the most part this hasn’t been widely reviled by the press, nor has it caused a consumer backlash against so-called massive mega-corporations with vast amounts of data about what we personally buy, do, and who we are.

Yes, it’s ironic that traditional marketing media have been tracking far more data than we can today online, and yes, it really could be interpreted that we are “less bad” than our traditional media cousins. However, this is not really a strong defensive statement, though it is still frequently stated by my colleagues in this industry. Perhaps only slightly less frequently than that other old nugget about consumers getting the benefit of “more relevant” or “personalized advertising” if they submit to being tracked for the purposes of selling targeted advertising against their anonymous profiles. This is, of course, a statement I’ve only ever heard espoused by folks in the online advertising industry — and not something consumers are consciously happy or excited about, nor something almost any consumer would react positively to.

It’s a bad meme — something we as an industry know to be true. (After all, many magazines, as an example, are bought just as much to see the ads as to read the articles. Think fashion, home improvement, and technology magazines if you disagree.) But that just isn’t a powerful message for consumers, and it is generally used by the press with some sarcasm to show how out of touch we are with consumers. And don’t get me wrong — I’ve made these statements myself. In fact, I was videotaped last year for a privacy-related video where I talk about targeting and online advertising — and I ultimately don’t get much beyond any of the arguments above in my short clip.

So, what is the issue here? Let’s look at some of the main questions being posed:

Should we be able to target ads based on tracking of anonymous user behavior? I believe so.
Is there significant chance of consumers being personally identified and something nefarious happening to them? Not today — although down the road, that could change as computing power gets much more advanced.
Do consumers get any value from targeting that we can use as a value proposition in educating them about these issues? Absolutely yes, but which messages we should use are not always clear.
Is the massive amount of data being tracked about consumer behavior a good thing or a bad thing? Well, that depends.

The advertising economy
When my parents were children in large working-class families in Massachusetts, it was a very big deal to have chicken for dinner. Chicken dinner was something their families typically had on Sunday — with a large family carefully dividing up a relatively small bird (by today’s standards). Oranges might be available at certain times of year, but year-round access to all sorts of fresh vegetables and fruits was simply unheard of. And products in general were scarcer, relatively costlier, and generally less affordable to large swaths of the population.

But with advances in supply chain management, modern manufacturing and farming techniques, and reduced transportation costs, the way the average modern family lives would be considered vastly wealthier and more privileged by the standards of my parents’ or grandparents’ generations.

As technology across all industries improves, we continue to see cost reductions in products and a wider variety of products due to general efficiencies and capabilities growing over time. And as media has fragmented, we’ve seen the costs and inefficiencies of marketing and advertising grow significantly as well. Targeting and personalization of marketing are mechanisms that help us rein in the growing costs and gain efficiency as well as effectiveness.

I have a core belief that I’d like to share with you. I believe that advertising is a fundamental driver of our economy. Advertising, as it so happens, actually works. Companies that advertise (especially those that do it well) sell more products and services. Those companies prosper, and hire more employees to work for them, thereby creating more jobs. And this virtuous cycle is very clear.

It is fairly well understood that watching the marketing spend of major corporations is a major predictor of the economy. When marketing spend drops, the economy soon drops as well. And it’s a leading indicator of a return to economic health — when marketing spend increases, the economy is on its way back to health. The question is: Which is driving which effect? Ultimately, I believe that advertising is both a predictor and a driver of the economy. It’s been shown repeatedly that those companies that increase marketing spend during an economic downturn generally do better during that downturn than competitors, and they tend to have incredible long-term advantage over competitors that decreased spend during the downturn. In some cases, this long-term advantage can create a market-leading company.

So when we talk about techniques for improving the effectiveness of advertising, like targeting, I get very excited. I believe there is incredible value to increasing the effectiveness, reducing inefficiency and increasing the amount of spending we do on advertising as a society. The overall increase in economic value from advertising is something I believe in. And one major way to increase the amount of spending done on advertising is to increase efficiency and decrease waste.

That’s where targeting and personalization of advertising are incredibly important. By showing ads to consumers that are relevant to them — and personalized to whatever possible degree — we can help advertisers hone their messages, spend money on reaching the audience interested in their products or services, and do it at scale. The positive impact of this isn’t well understood by most people — even many of those in our industry. Many parrot these words about “more relevant” advertising as if they are a shield to keep away the hounds of regulation. But there is a truth in this message that goes far beyond what is generally understood.

So — is this economic boom on its way? When will we feel its effects? Read on.

In reality, even the most sophisticated systems we have for targeting and personalizing messages to consumers are pretty bad at it. Even if we knew literally every piece of information about a consumer that could possibly be used to deliver a targeted advertising experience, we couldn’t really do much with it today. Maybe publishers are able to charge a bit more for those ads that can be sold on a targeted basis. And maybe advertisers and agencies can bid higher in real-time for ads they know are going to be delivered to their target audience. But this is really just the beginning of a much more sophisticated advertising industry, and not a world where we can effectively reach an audience with the mythical “right ad at the right time in the right place.”

In a sense, the whole way we go about it is wrong. The surreptitious surveillance of the population in order to data-mine their online activity and build statistically driven models for how to deliver appropriate and relevant ads is possible to some extent today. As computing power grows over the next few years, it will become even more accurate and effective. And eventually the ad creative itself will become intelligent and customize itself to fine-tune the images, sounds, videos, and even the pitches to the individual consumer’s preferences. But doing this quietly in the background — hiding that it is being done — is perhaps destructive to the relationships businesses should be building with their customers.

Hagel and Singer raised this issue very effectively in “Net Worth.” Their prediction was that some entity would become an “infomediary,” building a set of tools and technologies that would allow the user to control what information is tracked about their behavior. This infomediary would enable the user to control which other entities would get access to this data, and perhaps even ensure that the consumer is paid for access to the data that enable better targeting of marketing. In other words, the infomediary would represent the consumer’s data to marketers on their behalf and share the proceeds with them. In 1999, this sounded like so much science fiction. And in a sense, it was. But we’re much closer to a world where this is possible — and desirable to both consumers and businesses.

So is the infomediary the only way to protect consumers?

No. There really isn’t anything shady going on here. While every industry will always have a few “bad actors” who try to game the system, the motivation for targeting marketing communications to potential and existing customers is self-evident value. Any company trying to sell its products must educate potential customers of the fact that its product exists (make them aware), educate them on the value proposition of that product (create purchase intent if it’s possible), and ultimately create demand for that product.

Some products are well suited to some consumers and ill suited to others. Enabling the kind of filtering that shows ads for adventure vacations to those who like to take them and quiet romantic beach vacations to those who are more likely to take them is an easy-to-understand motivation to most people. The concern is that something nefarious will be done with this data. People are concerned that, at best, some big faceless corporation will profit off of data collected on their customers. At worst, people fret that information that is sensitive or embarrassing would be used in a way that somehow affects the consumer.

And why shouldn’t people be worried that something bad will happen, or that they are being “used” by companies and taken advantage of?

Ultimately, we’re in a nice quiet moment where the technology is not yet advanced enough to have much happen — good or bad. While there is definitely a statistical advantage to having the data applied to marketing campaigns, the advantage is really just tightening up efficiencies for marketers at this point. But there’s little question in my mind that things will progress pretty quickly; within three to five years, much more sophisticated and effective technologies will exist.

I do think that the ad industry has a golden opportunity to self-regulate, but it’s also likely that some form of regulation will be applied to this space in the next few years. I’m personally not happy about the prospect of regulation, as the nuance in how these technologies work is quite important. Heavy-handed regulations will stymie the development of this space at a time we can ill afford it.

An argument could be made that if a viable infomediary model were to arise, consumers could share in the revenue generated by it. But the reality is that we’re talking about a very small amount of money. For the individual consumer, it’s probably not enough money to be meaningful. On the other hand, the impact that the data could have on advertising spending is significant, and the positive impact of this would be incredibly valuable to all consumers.

Before legislators begin jumping in and trying to “protect consumers” from this kind of technology, they should really understand what both the downside and upside of these technologies are likely to be. The downside is relatively painless, while the upside is potentially massive. There are great opportunities for this nascent industry to get momentum and really create a positive impact. But that will only happen if we as an industry are careful to avoid any perception of bad behavior in the way we track consumer behavior and target the delivery of ads.

Tagged Data, Privacy, Targeting
