Network Working Group                                      S. Bortzmeyer
Internet-Draft                                                     AFNIC
Intended status: Informational                         December 17, 2013
Expires: June 20, 2014

                     DNS privacy problem statement
                 draft-bortzmeyer-dnsop-dns-privacy-01

Abstract

   This document describes the privacy issues associated with the use of
   the DNS by Internet users. It is intended to be mostly a problem
   statement and it does not prescribe solutions.

   Discussions of the document should take place on the dnsop mailing
   list [dnsop].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts.
   The list of current Internet-Drafts is at
   http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 20, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Risks
     2.1.  Data in the DNS request
     2.2.  On the wire
     2.3.  In the servers
       2.3.1.  In the resolvers
       2.3.2.  In the authoritative name servers
       2.3.3.  Rogue servers
   3.  Actual "attacks"
   4.  Legalities
   5.  Security considerations
   6.  Acknowledgments
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Author's Address

1.  Introduction

   The Domain Name System is specified in [RFC1034] and [RFC1035]. It
   is one of the most important infrastructure components of the
   Internet and one of the most often ignored or misunderstood. Almost
   every activity on the Internet starts with a DNS query (and often
   several). Its use has many privacy implications and we try to give
   a comprehensive and accurate list of them here.

   Let us start with a small reminder of the way the DNS works (with
   some simplifications). A client, the stub resolver, issues a DNS
   query to a server, the resolver (also called caching resolver or full
   resolver or recursive name server). For instance, the query is "What
   are the AAAA records for www.example.com?". AAAA is the qtype (Query
   Type) and www.example.com the qname (Query Name). To get the answer,
   the resolver will first query the root name servers, which will, most
   of the time, send a referral. Here, the referral will be to the .com
   name servers. In turn, they will send a referral to the example.com
   name servers, which will provide the answer. The root name servers,
   the name servers of .com and those of example.com are called
   authoritative name servers. It is important, when analyzing the
   privacy issues, to remember that the question put to all these name
   servers is always the original question, not a derived question.
   Contrary to what many "DNS for dummies" articles say, the question
   sent to the root name servers is "What are the AAAA records for
   www.example.com?", not "What are the name servers of .com?". So, the
   DNS leaks more information than it should.

   Because the DNS uses caching heavily, not all questions are sent to
   the authoritative name servers.
   If the stub resolver, a few seconds later, asks the resolver "What
   are the SRV records of _xmpp-server._tcp.example.com?", the resolver
   will remember that it knows the name servers of example.com and will
   just query them, bypassing the root and .com. Because there is
   typically no caching in the stub resolver, the resolver, unlike the
   authoritative servers, sees everything.

   Almost all DNS queries are today sent over UDP, and this has
   practical consequences if someone thinks of encrypting this traffic
   (some encryption solutions are designed for TCP, not UDP).

   It should also be noted that DNS resolvers sometimes forward requests
   to bigger machines with a larger and more widely shared cache, the
   forwarders. From the point of view of privacy, forwarders are like
   resolvers, except that the caching in the resolver in front of them
   decreases the amount of data they can see.

   Another important point to keep in mind when analyzing the privacy
   issues of DNS is the mix of many sorts of DNS requests received by a
   server. Let's assume the eavesdropper wants to know which Web page is
   visited by a user. For a typical Web page displayed by the user,
   there are three sorts of DNS requests:

   Primary request:  this is the domain name that the user typed,
      selected from a bookmark, or chose by clicking on a hyperlink.
      Presumably, this is what is of interest to the eavesdropper.

   Secondary requests:  these are the requests performed by the user
      agent (here, the Web browser) without any direct involvement or
      knowledge of the user. For the Web, they are triggered by
      included content, CSS sheets, JavaScript code, embedded images,
      etc. In some cases, there can be dozens of domain names in a
      single page.

   Tertiary requests:  these are the requests performed by the DNS
      system itself.
      For instance, if the answer to a query is a
      referral to a set of name servers, and the glue is not returned,
      the resolver will have to do tertiary requests to turn the name
      servers' names into IP addresses.

   For privacy-related terms, we will use the terminology of
   [RFC6973].

2.  Risks

   This draft is limited to the study of privacy risks for the end user
   (the one performing DNS requests). Privacy risks for the holder of a
   zone (the risk that someone gets the data) are discussed in [RFC5936]
   and in [I-D.koch-perpass-dns-confidentiality]. Non-privacy risks
   (such as cache poisoning) are out of scope.

2.1.  Data in the DNS request

   The DNS request includes many fields but two of them seem especially
   relevant for the privacy issues: the qname and the source IP address.
   "Source IP address" is used in the loose sense of "source IP address
   + maybe source port", because the port is also in the request and can
   be used to tell apart several users sharing an IP address (behind a
   CGN, for instance).

   The qname is the full name sent by the original user. It gives
   information about what the user does ("What are the MX records of
   example.net?" means he probably wants to send email to someone at
   example.net, which may be a domain used by only a few persons and
   therefore very revealing). Some qnames are more sensitive than
   others. For instance, querying the A record of google-analytics.com
   reveals very little (everybody visits Web sites which use Google
   Analytics) but querying the A record of www.verybad.example, where
   verybad.example is the domain of an illegal or very offensive
   organization, may create more problems for the user. Another example
   is when the qname reveals the software one uses. For instance, some
   BitTorrent clients query a SRV record for
   _bittorrent-tracker._tcp.domain.example.
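
   As an illustration of how exposed the qname and qtype are, the
   following is a minimal Python sketch, assuming the classic unencrypted
   wire format of [RFC1035] with a single question. The function names
   build_query and observe are hypothetical and exist only for this
   example; it is not a complete DNS implementation (no compression, no
   EDNS, no error handling).

```python
import struct

# A few qtype codes from the DNS specifications (A, MX, AAAA, SRV).
QTYPE_CODES = {"A": 1, "MX": 15, "AAAA": 28, "SRV": 33}
QTYPE_NAMES = {code: name for name, code in QTYPE_CODES.items()}


def build_query(qname: str, qtype: str, query_id: int = 0x1234) -> bytes:
    """Encode a minimal DNS query: 12-byte header plus one question."""
    # Header fields: ID, flags (only RD set), QDCOUNT=1, AN/NS/ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The qname is a sequence of length-prefixed labels ending with a zero byte.
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in qname.rstrip(".").split(".")
    )
    # The question ends with the qtype and the class (1 = IN).
    question = labels + b"\x00" + struct.pack("!HH", QTYPE_CODES[qtype], 1)
    return header + question


def observe(packet: bytes) -> tuple[str, str]:
    """Recover (qname, qtype) from raw bytes, as a passive observer could."""
    offset = 12  # skip the fixed-size header
    labels = []
    while packet[offset] != 0:
        length = packet[offset]
        labels.append(packet[offset + 1 : offset + 1 + length].decode("ascii"))
        offset += 1 + length
    qtype_code, _qclass = struct.unpack("!HH", packet[offset + 1 : offset + 5])
    return ".".join(labels), QTYPE_NAMES.get(qtype_code, str(qtype_code))


packet = build_query("www.example.com", "AAAA")
print(observe(packet))  # ('www.example.com', 'AAAA')
```

   Nothing in the packet needs to be decrypted or guessed: the labels
   of the qname appear as plain ASCII in the bytes, next to the qtype.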

   For the communication between the stub resolver and the resolver, the
   source IP address is that of the user's machine. Therefore, all
   the issues and warnings about collection of IP addresses apply here.
   For the communication between the resolver and the authoritative name
   servers, the source IP address has a different meaning; it does not
   have the same status as the source address in an HTTP connection. It
   is now the IP address of the resolver which, in a way, "hides" the
   real user. However, it does not always work. Sometimes
   [I-D.vandergaast-edns-client-subnet] is used. Sometimes the end user
   has a personal resolver on her machine. In that case, the IP address
   is as sensitive as it is for HTTP.

   A note about IP addresses: there is currently no IETF document which
   describes in detail the privacy issues of IP addressing. In the
   meantime, the discussion here is intended to include both IPv4 and
   IPv6 source addresses. For a number of reasons their assignment and
   utilization characteristics are different, which may have
   implications for details of information leakage associated with the
   collection of source addresses. (For example, a specific IPv6 source
   address seen on the public Internet is less likely than an IPv4
   address to originate behind a CGN or other NAT.) However, for both
   IPv4 and IPv6 addresses, it's important to note that source addresses
   are propagated with queries and comprise metadata about the host,
   user, or application that originated them.

2.2.  On the wire

   DNS traffic can be seen by an eavesdropper like any other traffic.
   It is typically not encrypted. (DNSSEC, specified in [RFC4033],
   explicitly excludes confidentiality from its goals.) So, if an
   initiator starts an HTTPS communication with a recipient, while the
   HTTP traffic will be encrypted, the DNS exchange prior to it will not
   be.
   As other protocols become more and more privacy-aware
   and secured against surveillance, the DNS risks becoming "the
   weakest link" in privacy.

   What also makes the DNS traffic different is that it may take a
   different path than the communication between the initiator and the
   recipient. For instance, an eavesdropper may be unable to tap the
   wire between the initiator and the recipient but may have access to
   the wire going to the resolver, or to the authoritative name servers.

   The best place, from an eavesdropper's point of view, is clearly
   between the stub resolvers and the resolvers, because the
   eavesdropper is not limited by DNS caching there.

   The attack surface between the stub resolver and the rest of the
   world can vary widely depending upon how the end user's computer is
   configured. In roughly increasing order of attack surface:

   The resolver can be on the end user's computer. In (currently) a
   small number of cases, individuals may choose to operate their own
   DNS resolver on their local machine. In this case the attack surface
   for the stub resolver to caching resolver connection is limited to
   that single machine.

   The resolver can be in the IAP (Internet Access Provider) premises.
   For most residential users and potentially other networks the typical
   case is for the end user's computer to be configured (typically
   automatically through DHCP) with the addresses of the DNS resolver at
   the IAP. The attack surface for on-the-wire attacks is therefore
   from the end user system across the local network and across the IAP
   network to the IAP's resolvers.

   The resolver may also be at the local network edge. For many/most
   enterprise networks and for some residential users the caching
   resolver may exist on a server at the edge of the local network. In
   this case the attack surface is the local network.
   Note that in
   large enterprise networks the DNS resolver may not be located at the
   edge of the local network but rather at the edge of the overall
   enterprise network. In this case the enterprise network could be
   thought of as similar to the IAP network referenced above.

   The resolver can be a public DNS service. Some end users' machines
   may be configured to use public DNS resolvers such as Google Public
   DNS or OpenDNS. The end users may have configured their machines to
   use these DNS resolvers themselves, or their IAP may choose to use
   the public DNS resolvers rather than operating its own resolvers.
   In this case the attack surface is the entire public Internet between
   the end user's connection and the public DNS service.

2.3.  In the servers

   Using the terminology of [RFC6973], the DNS servers (resolvers and
   authoritative servers) are enablers: they facilitate communication
   between an initiator and a recipient without being directly in the
   communications path. As a result, they are often forgotten in risk
   analysis. But, to quote [RFC6973] again, "Although [...] enablers
   may not generally be considered as attackers, they may all pose
   privacy threats (depending on the context) because they are able to
   observe, collect, process, and transfer privacy-relevant data." In
   [RFC6973] parlance, enablers become observers when they start
   collecting data.

   Many programs exist to collect and analyze DNS data at the servers.
   They range from the "query log" of programs like BIND to tcpdump and
   more sophisticated programs like PacketQ [packetq] and DNSmezzo
   [dnsmezzo]. The organization managing the DNS server can use this
   data itself or it can be part of a surveillance program like PRISM
   [prism] and pass data to an outside attacker.
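
   To make concrete what each kind of server is in a position to
   observe, here is a toy Python model of the resolution process
   described in Section 1. It is only a sketch under simplifying
   assumptions: a fixed three-level delegation tree, a resolver that
   caches delegations but not answers, and full qnames sent at every
   step. The names ToyResolver, in_zone and next_delegation are
   hypothetical and chosen for this example.

```python
# Toy delegation tree (assumption): the root, .com, and example.com.
ZONES = [".", "com.", "example.com."]


def in_zone(qname, zone):
    """True if the fully qualified qname is at or below the zone."""
    return zone == "." or qname == zone or qname.endswith("." + zone)


def next_delegation(qname, zone):
    """Closest child zone of `zone` that contains qname, or None."""
    deeper = [z for z in ZONES if len(z) > len(zone) and in_zone(qname, z)]
    return min(deeper, key=len) if deeper else None


class ToyResolver:
    def __init__(self):
        self.known_zones = {"."}  # the root servers are always known
        self.seen_by_resolver = []
        self.seen_by_zone = {zone: [] for zone in ZONES}

    def resolve(self, qname, qtype):
        # The resolver sees every question from the stub resolver.
        self.seen_by_resolver.append((qname, qtype))
        # Start at the deepest zone whose name servers are already cached.
        zone = max((z for z in self.known_zones if in_zone(qname, z)), key=len)
        while True:
            # Each authoritative server contacted sees the full question.
            self.seen_by_zone[zone].append((qname, qtype))
            child = next_delegation(qname, zone)
            if child is None:
                return  # this zone holds the answer
            self.known_zones.add(child)  # cache the referral
            zone = child


resolver = ToyResolver()
resolver.resolve("www.example.com.", "AAAA")
resolver.resolve("_xmpp-server._tcp.example.com.", "SRV")
print(len(resolver.seen_by_resolver))   # 2
print(resolver.seen_by_zone["."])       # [('www.example.com.', 'AAAA')]
```

   In this model, the root and .com servers see only the first question
   (but see it in full), the example.com servers see both, and the
   resolver sees everything.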

   Sometimes, these data are kept for a long time and/or distributed to
   third parties, for research purposes [ditl], for security analysis,
   or for surveillance tasks. Also, there are observation points in the
   network which gather DNS data and then make it accessible to third
   parties for research or security purposes ("passive DNS
   [passive-dns]").

2.3.1.  In the resolvers

   The resolvers see the entire traffic since there is typically no
   caching in front of them. They are therefore well situated to observe
   the traffic. To summarize: your resolver knows a lot about you. The
   resolver of a large IAP, or a large public resolver, can collect data
   from many users. You may get an idea of the data collected by
   reading the privacy policy of a big public resolver [1].

2.3.2.  In the authoritative name servers

   Unlike the resolvers, the authoritative name servers are limited by
   caching. They see only a part of the requests. For aggregated
   statistics ("what is the percentage of LOC queries?"), this is
   sufficient, but it may prevent an observer from seeing everything.
   Nevertheless, the authoritative name servers see a part of the
   traffic and this sample may be sufficient to defeat some privacy
   expectations.

   Also, the end user typically has some legal/contractual link with the
   resolver (he has chosen the IAP, or he has chosen to use a given
   public resolver) while he is often not even aware of the role of the
   authoritative name servers and their observation abilities.

   It is an interesting question whether the privacy issues are bigger
   in the root or in a large TLD. The root sees the traffic for all the
   TLDs (and the huge amount of traffic for non-existing TLDs) but a
   large TLD has less caching in front of it.

   As noted before, using a local resolver or a resolver close to the
   machine decreases the attack surface for an on-the-wire eavesdropper.
   But it may decrease privacy against an observer located on an
   authoritative name server since the authoritative name server will
   see the IP address of the end client, and not the address of a big
   resolver shared by many users. This difference disappears if
   [I-D.vandergaast-edns-client-subnet] is used because, in this case,
   the authoritative name server sees the original IP prefix or address
   (depending on the setup) even behind a shared resolver.

   As of today, all the instances of one root name server, L-root,
   together receive around 20,000 queries per second. While most of it
   is junk (errors on the TLD name), it gives an idea of the volume of
   data which pours into name servers.

   Many domains, including TLDs, are partially hosted by third-party
   servers, sometimes in a different country. The contracts between the
   domain manager and these servers may or may not take privacy into
   account. But it may be surprising for an end user that requests to a
   given ccTLD may go to servers managed by organisations outside of the
   country.

2.3.3.  Rogue servers

   A rogue DHCP server can direct you to a rogue resolver. Most of the
   time, it seems to be done to divert traffic, by providing lies for
   some domain names. But it could be used just to capture the traffic
   and gather information about you. The same applies to malware like
   DNSchanger [dnschanger], which changes the resolver in the machine's
   configuration.

3.  Actual "attacks"

   A very quick examination of DNS traffic may lead to the false
   conclusion that extracting the needle from the haystack is difficult.
   "Interesting" primary DNS requests are mixed with useless (for the
   eavesdropper) secondary and tertiary requests (see the terminology in
   Section 1). But, in this time of "big data" processing, powerful
   techniques now exist to get from the raw data to what you're actually
   interested in.
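
   As a deliberately naive illustration (it is not one of the
   techniques from the papers cited in this section), the Python sketch
   below drops the qnames that most clients ask for, on the assumption
   made in Section 2.1 that a name everybody queries reveals very
   little, and keeps the rare ones per client. The observations, the
   function name revealing_names and the 0.5 threshold are all
   hypothetical.

```python
from collections import defaultdict

# Hypothetical observations (client address, qname), using documentation
# addresses and the example names from Section 2.1.
observations = [
    ("192.0.2.1", "google-analytics.com"),
    ("192.0.2.1", "www.verybad.example"),
    ("192.0.2.2", "google-analytics.com"),
    ("192.0.2.2", "www.example.com"),
    ("192.0.2.3", "google-analytics.com"),
    ("192.0.2.3", "www.example.com"),
]


def revealing_names(observations, max_share=0.5):
    """Per client, keep the qnames asked by at most max_share of all clients."""
    clients_by_name = defaultdict(set)
    for client, qname in observations:
        clients_by_name[qname].add(client)
    total_clients = len({client for client, _ in observations})

    result = defaultdict(set)
    for qname, clients in clients_by_name.items():
        # Names queried by a large share of clients carry little information.
        if len(clients) / total_clients <= max_share:
            for client in clients:
                result[client].add(qname)
    return dict(result)


print(revealing_names(observations))
# {'192.0.2.1': {'www.verybad.example'}}
```

   Even this crude filter, a few lines long, isolates the one sensitive
   lookup in the sample; real analyses have far more data and far better
   methods at their disposal.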

   Many research papers about malware detection use DNS traffic to
   detect "abnormal" behaviour that can be traced back to the activity
   of malware on infected machines. Yes, this research was done for
   good purposes but, technically, it is a privacy attack and it
   demonstrates the power of the observation of DNS traffic. See
   [dns-footprint], [dagon-malware] and [darkreading-dns].

4.  Legalities

   To our knowledge, there are no specific privacy laws for DNS data.
   Interpreting general privacy laws like [data-protection-directive]
   (European Union) in the context of DNS traffic data is not an easy
   task and it seems there is no court precedent here.

5.  Security considerations

   This document is entirely about security, more precisely privacy.
   Possible solutions to the issues described here are discussed in
   [I-D.bortzmeyer-dnsop-privacy-sol] or in
   [I-D.wijngaards-dnsop-confidentialdns].

6.  Acknowledgments

   Thanks to Nathalie Boulvard and to the CENTR members for the original
   work which led to this draft. Thanks to Ondrej Sury for the
   interesting discussions. Thanks to Mohsen Souissi for proofreading.
   Thanks to Dan York, Suzanne Woolf and Frank Denis for good written
   contributions.

7.  References

7.1.  Normative References

   [RFC1034]  Mockapetris, P., "Domain names - concepts and facilities",
              STD 13, RFC 1034, November 1987.

   [RFC1035]  Mockapetris, P., "Domain names - implementation and
              specification", STD 13, RFC 1035, November 1987.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC6973]  Cooper, A., Tschofenig, H., Aboba, B., Peterson, J.,
              Morris, J., Hansen, M., and R. Smith, "Privacy
              Considerations for Internet Protocols", RFC 6973, July
              2013.

7.2.  Informative References

   [RFC2181]  Elz, R. and R. Bush, "Clarifications to the DNS
              Specification", RFC 2181, July 1997.

   [RFC4033]  Arends, R., Austein, R., Larson, M., Massey, D., and S.
              Rose, "DNS Security Introduction and Requirements", RFC
              4033, March 2005.

   [RFC5246]  Dierks, T. and E. Rescorla, "The Transport Layer Security
              (TLS) Protocol Version 1.2", RFC 5246, August 2008.

   [RFC5936]  Lewis, E. and A. Hoenes, "DNS Zone Transfer Protocol
              (AXFR)", RFC 5936, June 2010.

   [RFC6347]  Rescorla, E. and N. Modadugu, "Datagram Transport Layer
              Security Version 1.2", RFC 6347, January 2012.

   [I-D.koch-perpass-dns-confidentiality]
              Koch, P., "Confidentiality Aspects of DNS Data,
              Publication, and Resolution",
              draft-koch-perpass-dns-confidentiality-00 (work in
              progress), November 2013.

   [I-D.vandergaast-edns-client-subnet]
              Contavalli, C., Gaast, W., Leach, S., and E. Lewis,
              "Client Subnet in DNS Requests",
              draft-vandergaast-edns-client-subnet-02 (work in
              progress), July 2013.

   [I-D.bortzmeyer-dnsop-privacy-sol]
              Bortzmeyer, S., "Possible solutions to DNS privacy
              issues", draft-bortzmeyer-dnsop-privacy-sol-00 (work in
              progress), December 2013.

   [I-D.wijngaards-dnsop-confidentialdns]
              Wijngaards, W., "Confidential DNS",
              draft-wijngaards-dnsop-confidentialdns-00 (work in
              progress), November 2013.

   [dnsop]    IETF, "The dnsop mailing list", October 2013.

   [dagon-malware]
              Dagon, D., "Corrupted DNS Resolution Paths: The Rise of a
              Malicious Resolution Authority", 2007.

   [dns-footprint]
              Stoner, E., "DNS footprint of malware", October 2010.

   [darkreading-dns]
              Lemos, R., "Got Malware? Three Signs Revealed In DNS
              Traffic", May 2013.

   [dnschanger]
              Wikipedia, "DNSchanger", November 2011.

   [dnscrypt]
              Denis, F., "DNSCrypt".

   [dnscurve]
              Bernstein, D., "DNScurve".

   [packetq]  "PacketQ, a simple tool to make SQL-queries against
              PCAP-files", 2011.

   [dnsmezzo]
              Bortzmeyer, S., "PacketQ, a simple tool to make SQL-
              queries against PCAP-files", 2009.

   [prism]    NSA, "PRISM", 2007.

   [crime]    Rizzo, J. and T. Dong, "The CRIME attack against TLS",
              2012.

   [ditl]     "A Day in the Life of the Internet (DITL)", 2002.

   [data-protection-directive]
              "European directive 95/46/EC on the protection of
              individuals with regard to the processing of personal data
              and on the free movement of such data", November 1995.

   [passive-dns]
              Weimer, F., "Passive DNS Replication", April 2005.

   [tor-leak]
              "DNS leaks in Tor", 2013.

Author's Address

   Stephane Bortzmeyer
   AFNIC
   Immeuble International
   Saint-Quentin-en-Yvelines  78181
   France

   Phone: +33 1 39 30 83 46
   Email: bortzmeyer+ietf@nic.fr
   URI:   http://www.afnic.fr/