idnits 2.17.00 (12 Aug 2021)

/tmp/idnits40878/draft-cooper-web-tracking-opt-outs-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 1 instance of lines with non-RFC2606-compliant FQDNs in the
     document.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (March 7, 2011) is 4086 days in the past. Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: draft-ietf-httpstate-cookie has been published as
     RFC 6265

  -- Obsolete informational reference (is this intentional?): RFC 2109
     (Obsoleted by RFC 2965)

  -- Obsolete informational reference (is this intentional?): RFC 2616
     (Obsoleted by RFC 7230, RFC 7231, RFC 7232, RFC 7233, RFC 7234, RFC 7235)

     Summary: 0 errors (**), 0 flaws (~~), 3 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                          A. Cooper
Internet-Draft                         Center for Democracy & Technology
Intended status: Informational                             H. Tschofenig
Expires: September 8, 2011                        Nokia Siemens Networks
                                                           March 7, 2011

        Overview of Universal Opt-Out Mechanisms for Web Tracking
                  draft-cooper-web-tracking-opt-outs-00

Abstract

   Web servers and the entities that operate them have long had the
   ability to track user agents as they access resources hosted across
   different web domains. Concern over the privacy implications of such
   tracking has prompted recent work on a number of solutions that aim
   to provide a universal opt-out mechanism for web tracking that can be
   effectuated through a simple binary choice presented to users.

   This document provides an overview of the following mechanisms:
   permanent opt-out cookies, cookie blocking, domain blocking, a "Do
   Not Track" (DNT) HTTP header, and a Do Not Track Document Object
   Model (DOM) property. The aim of this document is to describe each
   approach, the pros and cons of each, and areas where standardization
   may be necessary should each approach be further pursued, without
   making recommendations about which approach or approaches should be
   adopted.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 8, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  History of Opt-Out Cookies
     1.2.  Drawbacks of Opt-Out Cookies
     1.3.  New Tracking Opt-Out Mechanisms
   2.  Terminology: First Party vs. Third Party
   3.  Tracking Opt-Out Mechanisms
     3.1.  Permanent Opt-Out Cookies
     3.2.  Cookie Blocking
     3.3.  Domain Blocking
     3.4.  Do Not Track HTTP Header
     3.5.  Do Not Track DOM Property
   4.  Security Considerations
   5.  IANA Considerations
   6.  Acknowledgments
   7.  Informational References
   Authors' Addresses

1.  Introduction

   The Hypertext Transfer Protocol (HTTP) is a generic and stateless
   application-level protocol for distributed collaborative hypermedia
   information systems. The stateless nature of the HTTP protocol is a
   useful property for scalability and for robustness.
   However, for more complex web sites it is often important to carry
   state information between different web pages and to offer
   reidentification of previous visitors for usability reasons. This
   has led web application developers to invent mechanisms for
   maintaining state information about end user interactions. In fact,
   one mechanism -- the cookie (originally specified in [RFC2109] and
   now being revised by [I-D.ietf-httpstate-cookie]) -- has been added
   to HTTP itself. Since cookies come with limitations, such as the
   number of cookies that are allowed to be stored per domain, the size
   of an individual cookie, and the total number of cookies that can be
   stored, they are not the only state management mechanism used by
   developers. Other mechanisms include combinations of server-side
   databases, hidden form fields, URL query parameters, extensions to
   the CGI model, storage capabilities offered by additional plug-ins
   (such as Adobe Flash and Microsoft Silverlight), HTML5 web storage,
   and special browser extensions (such as Internet Explorer's userdata
   behavior).

   State created by the web server allows the server to uniquely
   identify individual user agents, providing a mechanism to correlate
   information about the activity of a single user agent across requests
   for different resources. Many of today's web sites cause user agents
   to fetch resources from a large number of other sites which may also
   make use of state management techniques.

   State information, such as cookie state stored within the browser, is
   not accessible to every site due to user agent security policies
   (which may include the same-origin policy
   [I-D.abarth-principles-of-origin] and its variations), but sharing of
   information between web sites visited by a single user can take many
   different forms.
   Data may be shared between two sites that both cause requests to the
   same third site, between sites that share DNS CNAME records or
   authoritative DNS servers, or between sites that share identifying
   URLs or referer headers [Krishnamurthy06] [Krishnamurthy07]. These
   techniques, together with uses of cookies, JavaScript, Flash, and
   other mechanisms for data aggregation purposes, have become pervasive
   among popular web sites [Krishnamurthy09], allowing users to be
   tracked in a multitude of ways.

   Concern over the privacy implications of this tracking has prompted
   recent work on a number of different solutions that aim to provide a
   universal opt-out mechanism for web tracking that can be effectuated
   through a simple binary choice presented to users. This document
   provides an overview of several such mechanisms.

1.1.  History of Opt-Out Cookies

   Web tracking was first widely employed by "third-party" advertising
   networks, which locate their advertising resources at their own
   domains (not at the "first-party" domains to which user agents
   typically issue requests at the direction of users). User agent
   requests for top-level documents from many separate first-party
   domains often generate requests for resources that are all located at
   the same third-party ad network domain, providing the ad network with
   the ability to build a profile of the first-party resources accessed
   by the user agent. Ad networks then use these profiles to
   individually tailor ads served to a particular user agent. This
   practice is known as "behavioral advertising."

   Concern over the privacy implications of the tracking involved in
   behavioral advertising gave rise in 1999 to the Network Advertising
   Initiative (NAI), a consortium of online advertising companies
   [NAI-History]. Shortly after its formation, the NAI developed a set
   of guidelines that its member companies were bound to follow.
   Among these guidelines was a requirement that the ad companies
   provide web users with the ability to opt out of ad targeting
   [NAI-Guidelines].

   The primary mechanism adopted for effectuating the opt out was an
   "opt-out cookie": an HTTP cookie that stores the user's preference to
   be opted out of ad targeting. Under the guidelines, NAI members
   could provide users with links to set their opt-out cookies from
   their own web sites and from a central site [NAI-Registry]. A newer
   central site now provides users with access to the opt-out cookies
   for companies that are members of a number of other advertising trade
   associations in addition to the NAI, all of which are operating under
   the banner of the Digital Advertising Alliance (DAA) [DAA10].

1.2.  Drawbacks of Opt-Out Cookies

   Several drawbacks to the opt-out cookie approach have been identified
   over time. Storing the user's preference in a cookie is problematic
   because users are often encouraged to delete their cookies in order
   to protect their privacy. If they follow this advice, they delete
   their opt-out cookies as well, and ad targeting resumes.

   Because HTTP cookies are typically only returned to the origin server
   that set them [I-D.ietf-httpstate-cookie], using cookies to control
   user preferences requires that users obtain individual opt-out
   cookies for each tracking domain. With current upper estimates for
   the number of tracking domains reaching over 300
   [PrivacyChoice-Tracker-Index], this creates a complex cookie
   management task for users.

   Not all of these tracking domains are used for behavioral
   advertising.
   Tracking -- in the generic sense of correlating a single user agent's
   requests across multiple domains -- is used for a number of other
   purposes, including web analytics, web site personalization, ad
   reporting (e.g., calculating the number of ad views or clicks),
   market research, fraud detection, and federated authentication. Like
   behavioral advertising, some of these services (web analytics, ad
   reporting, some market research services) use cookies as their
   primary means of identifying user agents and could therefore make use
   of opt-out cookies to store user preferences. But recent
   investigations have indicated that only about half of the 300 or so
   tracking domains offer opt-out cookies [Brock11]. Meanwhile, the DAA
   site offers the opt-out cookies of only about 60 companies.

   For some of the other tracking purposes, using an opt-out cookie
   would make little sense. For example, a site or service that
   requires users to authenticate to obtain access to a personal profile
   might find it more reasonable to store the user's opt-out choice on a
   back-end system as part of the user's profile. Since cookies were
   designed to overcome the statelessness of web transactions, any site
   or service that persists state about individual users in some non-
   cookie-based storage can likely find a more streamlined way to store
   individual opt-out preferences than by using opt-out cookies.

   Opt-out cookies also do not control tracking that makes use of other
   technologies. Flash cookies, HTML5 web storage, browser
   fingerprinting, the CSS history leak, and a number of other non-HTTP-
   cookie mechanisms can be used to track web activity across domains
   [Kamkar10][EFF][Baron10].

1.3.  New Tracking Opt-Out Mechanisms

   For all of these reasons, a number of new solutions have been
   proposed to improve upon the status quo for opting out of web
   tracking.
   While these mechanisms differ in their implementations, they share a
   similar goal: to provide a universal opt-out for web tracking that
   can be effectuated through a simple binary choice presented to users
   (this will be referred to hereafter as the "DNT goal"). This
   document provides an overview of the following mechanisms:

   o  Permanent opt-out cookies

   o  Cookie blocking

   o  Domain blocking

   o  Do Not Track HTTP header

   o  Do Not Track Document Object Model (DOM) property

   The aim is to generally describe each approach, the pros and cons of
   each, and areas where standardization may be necessary should each
   approach be further pursued. This document does not recommend any
   particular solution or set of solutions. This is very much a first
   draft; feedback and insights into the various approaches are most
   welcome.

2.  Terminology: First Party vs. Third Party

   There are a number of web-related terms that have taken on special
   meaning within discussions about web tracking. Some of these
   meanings may differ from the common understanding of the same terms
   in the IETF context.

   In the context of web tracking, a "domain" usually refers to the
   portion of a web resource's host name consisting of the second-level
   domain and top-level domain. For example, the domain corresponding
   to http://count.example.com/ would be example.com. The term
   "subdomain" is often used to describe a fully qualified domain name
   (FQDN). For example, the URI http://count.example.com/ contains the
   subdomain count.example.com.

   A "first-party domain" usually refers to the domain of a web site to
   which a user agent directs an explicit request on behalf of a user.
   A "third-party domain" usually refers to the domain of a web resource
   that a user agent requests as a result of a first-party request.
   A third-party resource is hosted at a different domain from the
   first-party domain that triggers the third-party request. As an
   example, if a user directs his user agent to http://www.foo.com/ and
   as a result the user agent also makes a request to www.bar.com,
   foo.com is the first-party domain and bar.com is the third-party
   domain.

   This distinction between first-party and third-party domains is in
   part a result of long-standing user agent practices for handling HTTP
   cookies. Typically, HTTP cookies are returned only to the origin
   server that set them [I-D.ietf-httpstate-cookie]. Cookies set from
   first-party domains may not be read by third-party domains and vice
   versa. In some cases, cookies set from first-party domains that
   contain subdomains are accessible by all subdomains of the first-
   party domain. The distinction between first-party domains and third-
   party domains is reflected in browser-based cookie controls: major
   web browsers all offer distinct first-party cookie settings and
   third-party cookie settings.

   However, a user's perception or expectation of the difference between
   a "first party" and a "third party" may not fall neatly within the
   distinction between "first-party domain" and "third-party domain."
   Consider Example Company, which hosts its web site at example.com and
   contracts with an analytics service provider, Count Company. The
   analytics service is architected such that it operates from
   count.example.com, a subdomain. When a user visits www.example.com,
   a request is triggered to count.example.com, and data about the
   user's visit is returned to count.example.com to be processed by
   Count Company. Although all of these exchanges would be between the
   user agent and first-party domains, the user may only expect to be
   sending data to Example Company (the "first party"), not to Count
   Company (the "third party").
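
   As a rough illustration of the domain-based distinction described
   above, the following Python sketch derives a "domain" from the last
   two labels of a host name and classifies a request accordingly. The
   function names are hypothetical, and the two-label rule is a
   simplification: it would misclassify hosts under suffixes that span
   more than one label (co.uk, for example).

```python
def domain_of(host: str) -> str:
    """Return the second-level plus top-level labels of a host name.

    Assumption: a naive two-label rule, not a public-suffix lookup.
    """
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def is_third_party_domain(first_party_host: str, request_host: str) -> bool:
    """True when the requested host belongs to a different domain."""
    return domain_of(first_party_host) != domain_of(request_host)


# The analytics subdomain from the example shares the first-party domain.
print(is_third_party_domain("www.example.com", "count.example.com"))
# A request to an unrelated domain is a third-party domain request.
print(is_third_party_domain("www.foo.com", "www.bar.com"))
```

   Under this rule count.example.com counts as a first-party domain for
   a visit to www.example.com even though a different company processes
   the data, which is the mismatch with user expectation that the
   Example Company case points out.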

   Conversely, consider that Example Company runs a social network,
   Example Social, hosted at examplesocial.com, and a photo-sharing
   service, Example Photos, hosted at examplephotos.com. Example Social
   might have a feature that allows users to share their photos from
   Example Photos on their profiles hosted at examplesocial.com. In
   this case, a user agent that requests a resource hosted at
   examplesocial.com would also automatically request and receive
   content hosted at examplephotos.com. While user agents might
   consider examplephotos.com to be a third-party domain, the user might
   consider all the content they receive to be coming from a single
   first party, Example Company.

   It has been suggested that this distinction between first parties and
   third parties from the user expectation perspective can be
   approximated by distinguishing domains based on their Public Suffixes
   [Mozilla] plus one additional domain label ("PS+1")
   [I-D.mayer-do-not-track].

   In the remainder of this document, "first-party domain" and "third-
   party domain" will be used to describe the typical distinction used
   by web browsers between the two types of cookies; the terms "first
   party" and "third party" will be used when the user expectation
   perspective is more appropriate.

   A summary of the terminology used in the document (some of which is
   drawn from [I-D.mayer-do-not-track]) is as follows:

   o  Domain: The portion of a web resource's host name consisting of
      the second-level domain and top-level domain.

   o  DNT goal: To provide a universal opt-out for web tracking that can
      be effectuated through a simple binary choice presented to users.

   o  First party: A functional entity with which a user reasonably
      expects to exchange data.

   o  First-party domain: The domain of a web site to which a user agent
      directs an explicit request on behalf of a user.

   o  Third party: A functional entity that a user does not reasonably
      expect to receive the user's data.

   o  Third-party domain: The domain of a web resource that a user agent
      requests as a result of a first-party request.

3.  Tracking Opt-Out Mechanisms

   The mechanisms described in this section are at various stages of
   development, deployment, and standardization. The mechanisms are not
   necessarily mutually exclusive; it is possible that a combination of
   approaches could be employed to fulfill different aspects of opt-out
   functionality, although the mechanics of such combinations are out of
   scope for this document. It is also possible that some of the
   mechanisms or similar concepts could be adapted to address tracking
   outside of the web context -- for example, within mobile applications
   or email applications. These other contexts are likewise out of
   scope.

   Much of the privacy concern about web tracking has focused on
   tracking conducted by third parties because it often occurs without
   the knowledge of users and is performed by companies with which users
   may have no relationship. However, tracking may also be performed by
   first parties. For example, first parties may track users in order
   to provide personalized or customized content, or they may share
   information about user agent requests with third parties who then
   aggregate that information across multiple first parties. While the
   traditional opt-out cookie approach does not address first-party
   tracking, some of the newer mechanisms could be implemented so as to
   address first-party tracking. A discussion of the extent to which
   each of the mechanisms addresses first-party tracking is included in
   the sections below.

3.1.  Permanent Opt-Out Cookies

   A number of web browser extensions exist to make opt-out cookies
   permanent: Targeted Advertising Cookie Opt-Out (TACO) for Firefox and
   Google Chrome [Abine11], Keep My Opt-Outs (KMOO) for Chrome
   [Google11], and Keep MORE Opt-Outs, developed by PrivacyChoice
   [PrivacyChoice11]. These extensions first install the opt-out
   cookies for a number of ad companies -- all NAI members for KMOO and
   larger lists of companies for the other two extensions. If the user
   already has uniquely identifying cookies for any domains on the list,
   those cookies are deleted. Thereafter, the extensions wait for a
   cookie change event and preserve the opt-out cookies even when a user
   clears his or her cookies.

   The main benefit of this approach is that it does not require any
   changes on the server side. Servers used to track user agents can
   continue to operate as they have since opt-out cookies were first
   introduced. This approach can also apply to tracking conducted for
   many different purposes or to tracking from first-party domains --
   any domain that offers an opt-out cookie could be included in the
   list of domains for which the browser extension installs an opt-out
   cookie. Keep MORE Opt-Outs, for example, takes this approach.

   While this approach overcomes one of the limitations of opt-out
   cookies -- their lack of persistence -- it still requires managing
   potentially hundreds of opt-out cookies and ensuring that the list of
   precisely which opt-out cookies to retain remains up-to-date even as
   entities that track reconfigure their own cookie-setting practices on
   the server side. This may amount to a complex managerial task for
   the browser extension developer. Furthermore, this approach does not
   work for the domains of entities that conduct tracking but do not
   offer an opt-out cookie, of which there are potentially hundreds.
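
   The extension behavior described above (install the opt-out cookies,
   delete identifying cookies for listed domains, then restore opt-out
   cookies after a cookie change) can be sketched in Python as follows.
   This is only an illustration under stated assumptions: the cookie
   jar is modeled as a plain dictionary, the domains and cookie names
   are hypothetical, and removing every cookie for a listed domain is a
   simplification of "uniquely identifying cookies are deleted".

```python
from typing import Dict

# domain -> {cookie name: cookie value}
CookieJar = Dict[str, Dict[str, str]]

# Hypothetical opt-out list; real extensions maintain their own lists.
OPT_OUT_LIST: CookieJar = {
    "ads.example": {"optout": "1"},
    "tracker.example": {"OPT_OUT": "true"},
}


def install_opt_outs(jar: CookieJar, opt_outs: CookieJar) -> None:
    """Replace whatever is stored for each listed domain with its opt-out
    cookie (simplification: all other cookies for that domain are dropped)."""
    for domain, cookie in opt_outs.items():
        jar[domain] = dict(cookie)


def on_cookie_change(jar: CookieJar, opt_outs: CookieJar) -> None:
    """Cookie-change handler: put back any opt-out cookie that is missing
    or altered, for example after the user clears all cookies."""
    for domain, cookie in opt_outs.items():
        stored = jar.setdefault(domain, {})
        for name, value in cookie.items():
            if stored.get(name) != value:
                stored[name] = value


jar: CookieJar = {"ads.example": {"uid": "abc123"}}
install_opt_outs(jar, OPT_OUT_LIST)   # identifying cookie replaced
jar.clear()                           # user clears cookies
on_cookie_change(jar, OPT_OUT_LIST)   # opt-out cookies restored
```

   The sketch also makes the maintenance burden visible: the list has
   to name the exact cookie each domain expects, so it goes stale
   whenever a listed domain changes its opt-out cookie.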

   Most opt-out cookies do not contain unique user agent identifiers, so
   installing a domain's opt-out cookie and deleting other uniquely
   identifying cookies from that domain will generally prevent that
   domain from continuing to track the user agent via HTTP cookies
   (while also providing a way for users to verify that they have been
   opted out). However, in general it does not prevent tracking via
   other means such as Flash cookies or HTML5 web storage.

   No implementations of this approach exist natively in user agents;
   they are all currently browser extensions that require user-
   initiated installation. If this approach were to be pursued further,
   there may be a need to specify a standard way of representing the
   list of opt-out cookies that a particular user agent or extension
   makes permanent and/or the rules for processing the list (similar to
   what may be required to standardize block lists, see Section 3.3).

3.2.  Cookie Blocking

   Since much web tracking has historically occurred via HTTP cookies,
   it has been suggested that providing users with simple settings to
   turn cookie blocking on and off may serve the purpose of a universal,
   binary tracking opt-out choice. All of the major web browsers offer
   blanket settings for blocking all third-party cookies. However,
   current implementations differ in their functionality; for example,
   in some browsers, blocking third-party cookies prevents third-party
   cookies that the user had previously downloaded from being read,
   whereas in others pre-existing third-party cookies can continue to be
   read and the block merely prevents new third-party cookies from being
   set. This kind of variation reflects different evaluations of the
   trade-off between the benefits of more comprehensive blocking and the
   potential for cookie blocking to alter or break the functionality of
   certain web sites.
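
   The difference between the two blocking behaviors described above
   can be made concrete with a small Python model. The class and mode
   names are hypothetical and do not correspond to any browser's actual
   settings; the sketch only contrasts "block setting" with "block
   setting and reading".

```python
from enum import Enum
from typing import Dict, Optional


class BlockMode(Enum):
    SET_ONLY = "set-only"          # pre-existing cookies are still sent
    SET_AND_READ = "set-and-read"  # pre-existing cookies are withheld too


class ThirdPartyCookieStore:
    """Toy model of a user agent's third-party cookie handling."""

    def __init__(self, mode: BlockMode,
                 existing: Optional[Dict[str, str]] = None) -> None:
        self.mode = mode
        self.cookies: Dict[str, str] = dict(existing or {})

    def set_cookie(self, domain: str, value: str) -> bool:
        """Both modes refuse new third-party cookies; nothing is stored."""
        return False

    def cookie_for_request(self, domain: str) -> Optional[str]:
        """Return the cookie sent with a third-party request, if any."""
        if self.mode is BlockMode.SET_AND_READ:
            return None
        return self.cookies.get(domain)


old = {"ads.example": "uid=abc123"}
lenient = ThirdPartyCookieStore(BlockMode.SET_ONLY, old)
strict = ThirdPartyCookieStore(BlockMode.SET_AND_READ, old)
print(lenient.cookie_for_request("ads.example"))  # identifier still sent
print(strict.cookie_for_request("ads.example"))   # nothing sent
```

   In the set-only mode a previously stored identifier keeps flowing to
   the third-party domain, so a user who turns blocking on may continue
   to be tracked.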

   The main advantages of the cookie-blocking approach are that it
   targets what is still the most common means of tracking (HTTP
   cookies) and it is already built into the most widely used web
   browsers. However, because of the variations across the browsers,
   some implementations -- particularly those that continue to allow
   some third-party cookie reading or setting even after users have
   affirmatively chosen to block third-party cookies -- may not match
   users' expectations of what a universal tracking opt-out solution
   should accomplish.

   On the other hand, complete third-party cookie blocking does have the
   potential to inhibit the functionality of some web sites (including
   functionality unrelated to tracking). Some sites may even prevent
   users from accessing the sites unless they re-enable third-party
   cookies. This kind of behavior serves as a disincentive to using
   existing cookie-blocking settings as a means to achieve the DNT goal.

   When it prevents uniquely identifying third-party cookies from being
   read, cookie blocking can be an effective and user-verifiable tool
   for opting users out of tracking of all kinds. In addition to third-
   party cookie blocking, most browsers also provide a setting to block
   all first-party cookies, but because use of this setting breaks
   significant amounts of web functionality, it is not a reasonable
   mechanism for opting out of tracking from first-party domains. Nor
   does cookie blocking have any effect on tracking that occurs via
   other means.

3.3.  Domain Blocking

   Domain blocking requires the user agent to maintain a list of domains
   to block and to block requests that the user agent would otherwise
   make to domains on the list. If the list consists of domains from
   which tracking occurs, domain blocking prevents tracking by
   preventing the user agent from communicating with those domains.
   Domain blocking has been used for years to block web content of many
   different kinds, including advertising (see, for example, the AdBlock
   Plus extension for Firefox and Chrome [AdBlock-Plus]). The Tracking
   Protection feature in Microsoft Internet Explorer 9 makes use of
   third-party domain blocking (among other functionality)
   [Microsoft10]. Many implementations of domain blocking have the
   ability to periodically update their block lists (by contacting some
   authoritative source) to stay up-to-date with server reconfigurations
   and other changes.

   Although giving users a simple binary choice about blocking a list of
   domains is likely sufficient to achieve the DNT goal, the domain
   blocking approach can also include more granular options that give
   users finer-grained control over their web communications. Existing
   implementations allow blocking at the level of a subdomain, path or
   file, for example. They also combine domain blocking with domain
   whitelisting so that certain domains are kept affirmatively
   reachable.

   Domain blocking is a powerful solution because it entirely prevents
   tracking from occurring via any mechanism that originates with a web
   server request, including cookie setting, other HTTP-header-based
   mechanisms, and the transmission of scripts, images or other files
   that trigger tracking. Domain blocking is also verifiable in that
   observing requests issued by the user agent will demonstrate that
   domains on the list are not being accessed.

   However, to an even greater extent than cookie blocking, domain
   blocking may cause site functionality to break. For domains that
   conduct tracking and serve content from the same domain, blocking
   will prevent both the tracking and the content delivery, even if the
   user desires to opt out of the tracking without losing access to the
   content or some version of the content.
   Domain operators that want to be able to continue serving content and
   tracking user agents in the face of pervasive domain blocking would
   need to conduct these activities from separate domains (as was
   envisioned in the original proposal for behavioral advertising domain
   blocking [CDT07]), keeping only the tracking domain on the block
   lists. In some cases this change could require significant costs in
   terms of server reconfiguration. Moreover, domain operators whose
   domains are placed on block lists against their will could seek to
   avoid being blocked by switching domains (possibly on a recurring
   basis to circumvent list updates). And as with cookie blocking,
   first-party domains that detect domain blocking may require users to
   turn domain blocking off before providing access to first-party
   content.

   Domain blocking requires that the list of domains to block be kept
   up-to-date, which may require some management overhead. Domain
   blocking cannot be used to block first-party tracking since blocking
   first-party domain requests would prevent users from accessing
   content that they explicitly wished to access.

   The IE 9 Tracking Protection feature allows for block lists to be
   independently created according to a specified file format. The
   format and the rules for processing block list entries have been
   submitted to the W3C for potential standardization [Zeigler11].
   AdBlock Plus has its own filter list format [AdBlock-Plus-Filters].
   Ultimately, standardization of the block list format and processing
   rules is likely to be required if the goal is for multiple user
   agents to be able to use the same independently created block lists.

3.4.  Do Not Track HTTP Header

   The proposed Do Not Track HTTP header is a user agent feature that
   appends a new header to HTTP requests that expresses the user's
   preference not to be tracked.
   In existing header implementations, the header value is binary: 1
   means no tracking and 0 means tracking is permissible. Users can
   control whether the header is sent through a simple browser
   preference. A DNT header has been implemented in the current Firefox
   beta [Stamm] and in a number of browser extensions
   [Soghoian][Palant11][NoScript]. Depending on the user agent's
   policy, the header could be appended to every web request, or to a
   subset of requests (for example, only third-party domain requests, or
   all requests aside from those for which the user has explicitly
   chosen to permit tracking).

   Unlike the mechanisms already discussed, the DNT header does not
   provide a technical means of enforcing any sort of ban on tracking.
   Cookies and other tracking mechanisms would still be operational.
   Thus the presence of the header does not run the risk of directly
   interfering with existing web site functionality (as cookie or domain
   blocking might).

   Rather, the header provides a statement of the user's preference to
   the domains to which the user agent makes requests. This creates the
   possibility for the header to provide much broader-based protection
   against tracking than the other mechanisms if the majority of
   tracking entities abide by it. Every tracking entity that receives
   the header would be able to act on it, including first parties,
   entities that use tracking for purposes other than behavioral
   advertising, and entities that track users via mechanisms other than
   HTTP cookies.

   The lack of a technical enforcement mechanism creates a need to
   develop some common understanding of what "tracking" means, how
   domain operators should behave when they receive the header, and to
   whom the header applies. Should first parties that share tracking
   data with third parties be required to abide by the header?
   Should first parties and third parties be distinguished by domain
   name or by user expectation? Should tracking for certain purposes
   (fraud detection or ad reporting, for example) be permitted
   regardless of whether the header is present? Should the header affect
   the extent to which web request data is retained on the server side?
   There are a number of efforts underway to try to develop some
   consensus about the answers to these and other questions in a way
   that balances the realities of web server operation, legitimate uses
   of web request data, and users' desire for privacy protection
   [Mayer][CDT11][Eckersley11]. One of these efforts is seeking to
   define the semantics and intended usage of the header in the context
   of its potential standardization at the IETF
   [I-D.mayer-do-not-track]. How these questions are answered will
   determine the extent to which server-side reconfiguration is
   necessary for entities that wish to honor the header.

   Until some sort of consensus is reached about the semantics and usage
   of the header on the server side, the level of protection against
   tracking that the header affords will remain uncertain. Even if a
   common semantic were established, the header would still require
   users to trust that their web request data, including unique
   identifiers sent via cookies or other means, would not be used for
   tracking whenever the header is present. This sort of guarantee may
   require enforcement or intervention from governmental privacy
   authorities in order to truly be effective.

   As with cookie blocking, some sites that detect the header may
   prevent users from accessing their content, or they may request that
   users turn the header off before access is granted.
   If the header is deployed without granular user control over the
   sites to which it is sent, this kind of server-side reaction to the
   header could incentivize users to simply turn the header off
   entirely, because they would have no way to send the header to some
   sites but not others. Regardless of whether such controls exist,
   having individual sites that ignore the header or that ask users to
   disable it frustrates the DNT goal of having a universal, binary
   opt-out mechanism.

   For a DNT header to be interoperable across web sites and user
   agents, it would need to be defined according to the syntax specified
   in the HTTP specification [RFC2616] and registered according to the
   procedures in RFC 3864 [RFC3864]. This path is currently being
   pursued in [I-D.mayer-do-not-track]. Standardization of the header
   has also been proposed to the W3C [Zeigler11].

3.5. Do Not Track DOM Property

   In a similar vein to the DNT header, the Document Object Model (DOM)
   could be extended to include a property that expresses the user's
   preference with respect to tracking. Users could set the value of the
   property through a simple browser preference, causing the property to
   be set for all documents (or for documents from some subset of
   domains, with exceptions specified by the user). Client-side code
   could query the property before taking tracking-related actions.

   The DOM property has advantages and disadvantages similar to those of
   the header. Its mere deployment need not interfere with any existing
   web functionality. It has the potential to be accessed and respected
   by first parties and trackers of all kinds, although its
   applicability is limited to sites architected to have access to the
   DOM -- tracking that occurs entirely on the server side will be
   unaffected by the property.
   Responding to the presence of the property will require some shared
   understanding of the property's semantics. Its presence may lead
   sites to request that users allow tracking in order to access the
   desired content.

   One way in which the property differs from the header is that it may
   reduce the number of server calls made on behalf of users who opt out
   of tracking. This could be the case if detection of the property
   causes client-side code not to make requests to tracking domains that
   otherwise would have been made. This lack of requests issued on
   behalf of users who have opted out could provide a limited means for
   users to verify that their preference is being honored -- if users
   who set the property to the "no tracking" setting observe fewer or
   different server calls than users who allow tracking, this may
   provide some evidence that sites are honoring the property, although
   this would likely need to be evaluated on a site-by-site basis since
   sites may need to implement their responses to the property
   differently.

   As with the header, for the DOM property to be interoperable, its
   syntax and semantics would need to be standardized. A DNT DOM
   property has been proposed to the W3C for standardization
   [Zeigler11].

4. Security Considerations

   This document describes various mechanisms that allow users to opt
   out of web tracking. Thus one way to frame the security goal of these
   solutions is the prevention of information leakage to those doing the
   tracking, particularly third parties. The adversary from a user agent
   point of view can therefore be considered to be any third party that
   conducts tracking.

   Because any information that is shared with a third party could
   potentially be used to identify a user agent, altogether preventing
   communication with third-party domains when a user contacts a
   first-party domain is perhaps the most intuitive way to prevent
   information leakage to third parties. For example, a user agent might
   be configured to serve content only from example.com when a user
   enters http://www.example.com in the browser address bar. However,
   this approach of preventing all third-party communications is
   unrealistic since today's web sites often combine content aggregated
   from many other sites. Hence the task of preventing third-party
   tracking is more complicated. To address this complexity, the
   mechanisms discussed in this draft are either more subtle or more
   granular (or both) than all-out blocking of third parties, and they
   all face a number of security challenges.

   Regardless of whether any opt-out mechanism is used, first parties
   always have the ability to convey information related to tracking to
   third parties through an out-of-band or back-end channel. Since user
   agents cannot observe these exchanges, there is little they can do to
   prevent them.

   The same origin policy treats subdomains as belonging to the
   first-party domain. However, a first party can configure its DNS
   servers in a way that a DNS CNAME alias points to a server belonging
   to another organization. With appropriate cookie settings by the
   first party, it is possible for the third party to obtain access to
   all cookies. Permanent opt-out cookies, cookie blocking, and domain
   blocking are not able to prevent this data sharing if they are
   configured to respect the usual same origin policy. A DNT header or
   DOM property may prevent this sharing if the first party respects the
   user's preference as signaled by the header or property.
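
   The CNAME limitation described above can be illustrated with a
   minimal sketch. It assumes a deliberately simplified, name-based
   blocking decision; the function names, the block list contents, and
   the alias metrics.example.com are hypothetical and are not taken from
   any of the mechanisms surveyed in this document. Real user agents
   apply more elaborate rules when deciding what counts as first party.

```python
from typing import Iterable


def matches_domain(host: str, domain: str) -> bool:
    """Return True if host equals domain or is a subdomain of it."""
    host = host.lower().rstrip(".")
    domain = domain.lower().rstrip(".")
    return host == domain or host.endswith("." + domain)


def should_block(request_host: str, first_party: str,
                 block_list: Iterable[str]) -> bool:
    """Hypothetical, simplified name-based blocking decision.

    Hosts under the first-party domain are never blocked; any other
    host is blocked only if it falls under a block-listed domain.
    """
    if matches_domain(request_host, first_party):
        return False
    return any(matches_domain(request_host, d) for d in block_list)


# Assumed DNS data, shown only for illustration. The blocking logic
# above never consults it, which is the point of the example:
# metrics.example.com is a CNAME alias for another organization's server.
CNAME_ALIASES = {"metrics.example.com": "collector.tracker.example.net"}
BLOCK_LIST = {"tracker.example.net"}

# A direct request to the tracking domain is blocked.
print(should_block("collector.tracker.example.net", "example.com", BLOCK_LIST))

# The aliased request is allowed, although it reaches the same server.
print(should_block("metrics.example.com", "example.com", BLOCK_LIST))
```

   Under these assumptions the decision depends only on the requested
   host name, so a request to the aliased subdomain is treated as
   first-party traffic even though it is served by the other
   organization.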

   All techniques that block direct communication to specific third
   party sites (via a block list mechanism) suffer from the generic
   limitations of blacklisting mechanisms. Third parties that want to
   avoid being blocked will regularly change their domains, attempt to
   require users to exert additional effort in order to manage
   blacklists, or relay communication through intermediaries to
   obfuscate the identification of their domains. To emphasize the
   negative impact on user experiences that blacklisting can have, some
   third parties may bundle extra functionality onto the same (blocked)
   domain, rendering it inaccessible to those using block lists.

   The online management of block lists raises questions about who
   provides the lists, how easy they are for users to download or
   reconfigure, which list is used by default, what security mechanisms
   control the manipulation of the lists, and what conflict resolution
   mechanism is offered when black and white lists are combined. The
   answers to these questions depend heavily on the technology chosen
   for managing the lists. Failing to secure the lists against
   manipulation could allow information to be leaked to third parties
   against the user's wishes.

   Mechanisms that convey user preferences in a header or as a DOM
   property will require the receiving party to adhere to the
   instructions. As with the block listing mechanisms, implementation
   details pertaining to the default settings in browsers, the ease of
   changing the settings, and whether the settings can be manipulated
   will affect the security of the settings themselves.

   Some web proxies, gateways, and other intermediaries are known to
   strip certain HTTP headers (the Referer header, for example) or only
   allow a strict set of HTTP headers to pass through.
   While third-party companies are unlikely to have the incentive to
   cooperate with these intermediaries for the explicit purpose of
   removing or modifying the DNT header, such removal would result in
   the user's preference not being expressed to receiving servers.
   Scripts could be used to modify or disable the DNT header or DOM
   property within the browser to achieve the same effect, but these are
   fairly easy to detect and therefore unlikely to be abused by third
   parties that want to conduct tracking against the user's will. Given
   that third parties can simply ignore the user's preference if they
   want to conduct tracking under the DNT header or DOM property
   scenarios, these attacks are unlikely to be used.

5. IANA Considerations

   This document makes no requests of IANA.

6. Acknowledgments

   The authors would like to thank Michael Hanson for inspiring the work
   on this draft and Justin Brookman, Sue Glueck, and Erica Newland for
   their reviews.

7. Informational References

   [Abine11]  Abine, "Targeted Advertising Cookie Opt-Out (TACO)",
              https://addons.mozilla.org/en-US/firefox/addon/targeted-advertising-cookie-op/,
              February 2011.

   [AdBlock-Plus]
              AdBlock Plus, "AdBlock Plus", http://adblockplus.org/en/.

   [AdBlock-Plus-Filters]
              AdBlock Plus, "Writing Adblock Plus filters",
              http://adblockplus.org/en/filters.

   [Baron10]  Baron, D., "Preventing attacks on a user's history through
              CSS :visited selectors",
              http://dbaron.org/mozilla/visited-privacy, April 2010.

   [Brock11]  Brock, J., "Keep MORE Opt Outs",
              http://blog.privacychoice.org/2011/01/31/keep-more-opt-outs/,
              January 2011.

   [CDT07]    Cooper, A., "Dispelling "Do Not Track" Myths",
              http://www.cdt.org/blogs/alissa-cooper/dispelling-do-not-track-myths,
              October 2007.

   [CDT11]    Center for Democracy & Technology, "What Does "Do Not
              Track" Mean?
              A Scoping Proposal from the Center for Democracy &
              Technology", http://cdt.org/files/pdfs/CDT-DNT-Report.pdf.

   [DAA10]    Digital Advertising Alliance, "Opt Out from Online
              Behavioral Advertising", http://www.aboutads.info/choices/,
              2010.

   [EFF]      Electronic Frontier Foundation, "Panopticlick",
              http://panopticlick.eff.org/.

   [Eckersley11]
              Eckersley, P., "What Does the "Track" in "Do Not Track"
              Mean?",
              https://www.eff.org/deeplinks/2011/02/what-does-track-do-not-track-mean.

   [Google11]
              Google, "Keep My Opt-Outs",
              https://chrome.google.com/webstore/detail/hhnjdplhmcnkiecampfdgfjilccfpfoe,
              January 2011.

   [I-D.abarth-principles-of-origin]
              Barth, A., "Principles of the Same-Origin Policy",
              draft-abarth-principles-of-origin-00 (work in progress),
              February 2011.

   [I-D.ietf-httpstate-cookie]
              Barth, A., "HTTP State Management Mechanism",
              draft-ietf-httpstate-cookie-23 (work in progress),
              March 2011.

   [I-D.mayer-do-not-track]
              Mayer, J., Narayanan, A., and S. Stamm, "Do Not Track: A
              Universal Third-Party Web Tracking Opt Out",
              draft-mayer-do-not-track-00 (work in progress),
              March 2011.

   [Kamkar10]
              Kamkar, S., "Evercookie", http://samy.pl/evercookie/,
              September 2010.

   [Krishnamurthy06]
              Krishnamurthy, B. and C. Wills, "Generating a privacy
              footprint on the Internet", Proceedings of the ACM
              SIGCOMM Internet Measurement Conference, pages 65-70, Rio
              de Janeiro, Brazil, October 2006,
              http://www.cs.wpi.edu/~cew/papers/imc06.pdf.

   [Krishnamurthy07]
              Krishnamurthy, B., Malandrino, D., and C. Wills,
              "Measuring privacy loss and the impact of privacy
              protection in web browsing", Proceedings of the
              Symposium on Usable Privacy and Security, pages 52-63,
              Pittsburgh, PA USA, July 2007, ACM International
              Conference Proceedings Series,
              http://www.cs.wpi.edu/~cew/papers/soups07.pdf.

   [Krishnamurthy09]
              Krishnamurthy, B. and C. Wills, "Privacy diffusion on the
              web: A longitudinal perspective", Proceedings of the
              World Wide Web Conference, pages 541-550, Madrid, Spain,
              April 2009, http://www.cs.wpi.edu/~cew/papers/www09.pdf.

   [Mayer]    Mayer, J. and A. Narayanan, "Do Not Track: Universal Web
              Tracking Opt-Out", http://donottrack.us/.

   [Microsoft10]
              Microsoft, "IE9 and Privacy: Introducing Tracking
              Protection",
              http://blogs.msdn.com/b/ie/archive/2010/12/07/ie9-and-privacy-introducing-tracking-protection-v8.aspx,
              December 2010.

   [Mozilla]  Mozilla Foundation, "Public Suffix List",
              http://publicsuffix.org/.

   [NAI-Guidelines]
              Network Advertising Initiative, "Network Advertising
              Initiative Self-Regulatory Principles for Online
              Preference Marketing by Network Advertisers",
              http://www.ftc.gov/os/2000/07/NAI%207-10%20Final.pdf,
              July 2000.

   [NAI-History]
              Network Advertising Initiative, "Network Advertising
              Initiative History",
              http://www.networkadvertising.org/about/history.asp.

   [NAI-Registry]
              Network Advertising Initiative, "Network Advertising
              Initiative Opt-Out Registry",
              http://www.networkadvertising.org/managing/opt_out.asp.

   [NoScript]
              Maone, G., "X-Do-Not-Track? DNT, c'est plus facile...",
              http://hackademix.net/2011/01/28/x-do-not-track-dnt-cest-plus-facile/.

   [Palant11]
              Palant, W., "Adblock Plus and (a little) more: Updated
              roadmap (Adblock Plus 1.3.5)",
              https://adblockplus.org/blog/updated-roadmap-adblock-plus-135,
              February 2011.

   [PrivacyChoice-Tracker-Index]
              PrivacyChoice, "PrivacyChoice Tracker Index",
              http://www.privacychoice.org/companies/all.

   [PrivacyChoice11]
              PrivacyChoice, "Keep MORE Opt-Outs",
              https://chrome.google.com/extensions/detail/eoibfeagdaaoimfpfalgbmmegagdconp,
              January 2011.

   [RFC2109]  Kristol, D. and L.
              Montulli, "HTTP State Management Mechanism", RFC 2109,
              February 1997.

   [RFC2616]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
              Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
              Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

   [RFC3864]  Klyne, G., Nottingham, M., and J. Mogul, "Registration
              Procedures for Message Header Fields", BCP 90, RFC 3864,
              September 2004.

   [Soghoian]
              Soghoian, C. and S. Stamm, "Universal Behavioral
              Advertising Opt-Out",
              https://addons.mozilla.org/en-US/firefox/addon/universal-behavioral-advertisi/.

   [Stamm]    Stamm, S., "Implement do-not-track HTTP header to express
              user intent to halt tracking across site",
              http://hg.mozilla.org/mozilla-central/rev/6963333a74d1.

   [Zeigler11]
              Zeigler, A., Bateman, A., and E. Graff, "Web Tracking
              Protection: W3C Member Submission 24 February 2011",
              http://www.w3.org/Submission/web-tracking-protection/,
              February 2011.

Authors' Addresses

   Alissa Cooper
   Center for Democracy & Technology
   1634 Eye St. NW, Suite 1100
   Washington, DC 20006
   USA

   Email: acooper@cdt.org

   Hannes Tschofenig
   Nokia Siemens Networks
   Finland

   Email: hannes.tschofenig@nsn.com