pearg                                                          J.L. Hall
Internet-Draft                                          Internet Society
Intended status: Informational                                M.D. Aaron
Expires: 5 September 2022                                     CU Boulder
                                                         A. Andersdotter

                                                                B. Jones
                                                               Princeton
                                                             N. Feamster
                                                               U Chicago
                                                               M. Knodel
                                       Center for Democracy & Technology
                                                            4 March 2022

             A Survey of Worldwide Censorship Techniques
                    draft-irtf-pearg-censorship-05

Abstract

   This document describes technical mechanisms employed in network
   censorship that regimes around the world use for blocking or
   impairing Internet traffic.  It aims to make designers, implementers,
   and users of Internet protocols aware of the properties exploited and
   mechanisms used for censoring end-user access to information.  This
   document makes no suggestions on individual protocol considerations,
   and is purely informational, intended as a reference.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 5 September 2022.

Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.
   Code Components extracted from this document must include Revised BSD
   License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Revised BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Technical Prescription
   4.  Technical Identification
     4.1.  Points of Control
     4.2.  Application Layer
       4.2.1.  HTTP Request Header Identification
       4.2.2.  HTTP Response Header Identification
       4.2.3.  Transport Layer Security (TLS)
       4.2.4.  Instrumenting Content Distributors
       4.2.5.  Deep Packet Inspection (DPI) Identification
     4.3.  Transport Layer
       4.3.1.  Shallow Packet Inspection and Transport Header
               Identification
       4.3.2.  Protocol Identification
     4.4.  Residual Censorship
   5.  Technical Interference
     5.1.  Application Layer
       5.1.1.  DNS Interference
     5.2.  Transport Layer
       5.2.1.  Performance Degradation
       5.2.2.  Packet Dropping
       5.2.3.  RST Packet Injection
     5.3.  Multi-layer and Non-layer
       5.3.1.  Distributed Denial of Service (DDoS)
       5.3.2.  Network Disconnection or Adversarial Route Announcement
       5.3.3.  Censorship in Depth
   6.  Non-Technical Interference
     6.1.  Manual Filtering
     6.2.  Self-Censorship
     6.3.  Server Takedown
     6.4.  Notice and Takedown
     6.5.  Domain-Name Seizures
   7.  Contributors
   8.  Informative References
   Authors' Addresses

1.  Introduction

   Censorship is where an entity in a position of power - such as a
   government, organization, or individual - suppresses communication
   that it considers objectionable, harmful, sensitive, politically
   incorrect, or inconvenient [WP-Def-2020].  Although censors must act
   through legal, military, or other means, this document focuses
   largely on the technical mechanisms used to achieve network
   censorship.

   This document describes technical mechanisms that censorship regimes
   around the world use for blocking or impairing Internet traffic.
   See [RFC7754] for a discussion of Internet blocking and filtering in
   terms of implications for Internet architecture, rather than
   end-user access to content and services.  There is also a growing
   field of academic study of censorship circumvention (see the review
   article of [Tschantz-2016]), results from which we seek to make
   relevant here for protocol designers and implementers.

2.  Terminology

   We describe three elements of Internet censorship: prescription,
   identification, and interference.  The document contains three major
   sections, each corresponding to one of these elements.
Prescription 124 is the process by which censors determine what types of material they 125 should censor, e.g., classifying pornographic websites as 126 undesirable. Identification is the process by which censors classify 127 specific traffic or traffic identifiers to be blocked or impaired, 128 e.g., deciding that webpages containing "sex" in an HTTP Header or 129 that accept traffic through the URL wwww.sex.example are likely to be 130 undesirable. Interference is the process by which censors intercede 131 in communication and prevents access to censored materials by 132 blocking access or impairing the connection, e.g., implementing a 133 technical solution capable of identifying HTTP headers or URLs and 134 ensuring they are rendered wholly or partially inaccessible. 136 3. Technical Prescription 138 Prescription is the process of figuring out what censors would like 139 to block [Glanville-2008]. Generally, censors aggregate information 140 "to block" in blocklists or use real-time heuristic assessment of 141 content [Ding-1999]. Some national networks are designed to more 142 naturally serve as points of control [Leyba-2019]. There are also 143 indications that online censors use probabilistic machine learning 144 techniques [Tang-2016]. Indeed, web crawling and machine learning 145 techniques are an active research idea in the effort to identify 146 content deemed as morally or commercially harmful to companies or 147 consumers in some jurisdictions [SIDN2020]. 149 There are typically a few types of blocklist elements: Keyword, 150 domain name, protocol, or Internet Protocol (IP) address. Keyword 151 and domain name blocking take place at the application level, e.g., 152 HTTP; protocol blocking often occurs using Deep Packet Inspection to 153 identify a forbidden protocol; IP blocking tends to take place using 154 IP addresses in IPv4/IPv6 headers. 
   Some censors also use the presence of certain keywords to enable
   more aggressive blocklists [Rambert-2021] or to be more permissive
   with content [Knockel-2021].

   The mechanisms for building up these blocklists vary.  Censors can
   purchase "content control" software from private industry, such as
   SmartFilter, which lets censors filter traffic from broad categories
   they would like to block, such as gambling or pornography
   [Knight-2005].  In these cases, these private services attempt to
   categorize every semi-questionable website so as to allow for
   meta-tag blocking.  Similarly, they tune real-time content heuristic
   systems to map their assessments onto categories of objectionable
   content.

   Countries that are more interested in retaining specific political
   control typically have ministries or organizations that maintain
   blocklists.  Examples include the Ministry of Industry and
   Information Technology in China, the Ministry of Culture and Islamic
   Guidance in Iran, and bodies specific to copyright in France
   [HADOPI-2020] and to consumer protection law across the EU
   [Reda-2017].

4.  Technical Identification

4.1.  Points of Control

   Internet censorship takes place in all parts of the network
   topology.  It may be implemented in the network itself (e.g., local
   loop or backhaul), on the services side of communication (e.g., web
   hosts, cloud providers, or content delivery networks), in the
   ancillary services ecosystem (e.g., the domain name system or
   certificate authorities), or on the end-client side (e.g., in an
   end-user device such as a smartphone, laptop, or desktop, or in
   software executed on such devices).  An important aspect of
   pervasive technical interception is the necessity to rely on
   software or hardware to intercept the content the censor is
   interested in.
   There are various logical and physical points-of-control censors may
   use for interception mechanisms, including, though not limited to,
   the following.

   *  Internet Backbone: If a censor controls the gateways into a
      region, they can filter undesirable traffic that is traveling
      into and out of the region by packet sniffing and port mirroring
      at the relevant exchange points.  Censorship at this point of
      control is most effective at controlling the flow of information
      between a region and the rest of the Internet, but is ineffective
      at identifying content traveling between users within a region.
      Some national network designs naturally serve as more effective
      chokepoints and points of control [Leyba-2019].

   *  Internet Service Providers: Internet Service Providers (ISPs) are
      frequently exploited points of control.  They have the benefit of
      being easily enumerable by a censor - often falling under the
      jurisdictional or operational control of a censor in an
      indisputable way - with the additional feature that an ISP can
      identify the regional and international traffic of all its users.
      The censor's filtration mechanisms can be placed on an ISP via
      governmental mandates, ownership, or voluntary/coercive
      influence.

   *  Institutions: Private institutions such as corporations, schools,
      and Internet cafes can use filtration mechanisms.  These
      mechanisms are occasionally deployed at the request of a
      government censor, but can also be implemented to help achieve
      institutional goals, such as fostering a particular moral outlook
      on life among schoolchildren, independent of broader society or
      government goals.

   *  Content Distribution Networks (CDNs): CDNs seek to collapse
      network topology in order to better locate content closer to the
      service's users.  This reduces content transmission latency and
      improves quality of service.
      The CDN service's content servers, located "close" to the user in
      a network sense, can be powerful points of control for censors,
      especially if the location of CDN content repositories allows for
      easier interference.

   *  Certificate Authorities (CAs) for Public-Key Infrastructures
      (PKIs): Authorities that issue cryptographically secured
      resources can be a significant point of control.  CAs that issue
      certificates to domain holders for TLS/HTTPS (the Web PKI) or
      Regional/Local Internet Registries (RIRs) that issue Route
      Origination Authorizations (ROAs) to BGP operators can be forced
      to issue rogue certificates that may allow compromise, e.g., by
      allowing censorship software to engage in identification and
      interference where not possible before.  CAs may also be forced
      to revoke certificates.  This may lead to adversarial traffic
      routing or TLS interception being allowed, or an otherwise
      rightful origin or destination point of traffic flows being
      unable to communicate in a secure way.

   *  Services: Application service providers can be pressured,
      coerced, or legally required to censor specific content or data
      flows.  Service providers naturally face incentives to maximize
      their potential customer base, and potential service shutdowns or
      legal liability due to censorship efforts may seem much less
      attractive than excluding particular content, users, or uses of
      their service.  Services have increasingly become focal points of
      censorship discussions, as well as the focus of discussions of
      moral imperatives to use censorship tools.

   *  Content sites: On the service side of communications lie many
      platforms that publish user-generated content and require terms
      of service compliance for all content and user accounts in order
      to avoid intermediary liability for the web hosts.  In aggregate,
      these policies, actions, and remedies are known as content
      moderation.
      Content moderation happens above the services or application
      layer, but these mechanisms are built to filter, sort, and block
      content and users, thus making them available to censors through
      direct pressure on the private entity.

   *  Personal Devices: Censors can mandate that censorship software be
      installed at the device level.  This has many disadvantages in
      terms of scalability, ease of circumvention, and operating system
      requirements.  (Of course, if a personal device is treated with
      censorship software before sale and this software is difficult to
      reconfigure, this may work in favor of those seeking to control
      information, say for children, students, customers, or
      employees.)  The emergence of mobile devices exacerbates these
      feasibility problems.  This software can also be mandated by
      institutional actors acting on non-governmentally mandated moral
      imperatives.

   At all levels of the network hierarchy, the filtration mechanisms
   used to censor undesirable traffic are essentially the same: a
   censor either directly identifies undesirable content using the
   identifiers described below and then uses a blocking or shaping
   mechanism such as the ones exemplified below to prevent or impair
   access, or requests that an actor ancillary to the censor, such as a
   private entity, perform these functions.  Identification of
   undesirable traffic can occur at the application, transport, or
   network layer of the IP stack.  Censors often focus on web traffic,
   so the relevant protocols tend to be filtered in predictable ways
   (see Section 4.2.1 and Section 4.2.2).  For example, a subversive
   image might make it past a keyword filter.  However, if the image is
   later deemed undesirable, a censor may then blocklist the provider
   site's IP address.

4.2.  Application Layer

   The following subsections describe properties and tradeoffs of
   common ways in which censors filter using application-layer
   information.  Each subsection includes empirical examples describing
   these common behaviors for further reference.

4.2.1.  HTTP Request Header Identification

   An HTTP header contains a lot of useful information for traffic
   identification.  Although "host" is the only required header field
   in an HTTP request (for HTTP/1.1 and later), an HTTP method is also
   necessary to do anything useful.  As such, "method" and "host" are
   the two fields used most often for ubiquitous censorship.  A censor
   can sniff traffic and identify a specific domain name (host) and
   usually a page name (GET /page) as well.  This identification
   technique is usually paired with transport header identification
   (see Section 4.3.1) for a more robust method.

   Tradeoffs: Request Identification is a technically straightforward
   identification method that can be easily implemented at the backbone
   or ISP level.  The hardware needed for this sort of identification
   is cheap and easy to acquire, making it desirable when budget and
   scope are a concern.  HTTPS will encrypt the relevant request and
   response fields, so pairing with transport identification (see
   Section 4.3.1) is necessary for HTTPS filtering.  However, some
   countermeasures can trivially defeat simple forms of HTTP Request
   Header Identification.  For example, two cooperating endpoints - an
   instrumented web server and client - could encrypt or otherwise
   obfuscate the "host" header in a request, potentially thwarting
   techniques that match against "host" header values.
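   The method/host matching described above can be sketched in a few
   lines.  The blocklist contents and function name below are
   illustrative, not drawn from any deployed product, and a real
   middlebox would operate on sniffed TCP payloads rather than a byte
   string:

```python
# Sketch of HTTP request header identification: extract the method and
# the Host header from a cleartext HTTP/1.1 request and match the host
# against a blocklist.  BLOCKED_HOSTS and the function name are
# hypothetical, for illustration only.

BLOCKED_HOSTS = {"bad.foo.example"}  # hypothetical blocklist

def classify_http_request(raw: bytes) -> str:
    """Return 'block' if the request's Host header is blocklisted."""
    try:
        head, _, _ = raw.partition(b"\r\n\r\n")
        lines = head.decode("ascii", errors="replace").split("\r\n")
        method = lines[0].split(" ", 1)[0]            # e.g. "GET"
        headers = {}
        for line in lines[1:]:
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        host = headers.get("host", "")
    except IndexError:
        return "pass"                                 # unparseable: let through
    if method in ("GET", "POST") and host in BLOCKED_HOSTS:
        return "block"
    return "pass"

print(classify_http_request(b"GET /page HTTP/1.1\r\nHost: bad.foo.example\r\n\r\n"))
# -> block
```

   Pairing this with transport header identification (Section 4.3.1)
   would simply add a check on the destination IP address and port
   before parsing the payload.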
   Empirical Examples: Studies exploring censorship mechanisms have
   found evidence of HTTP header/URL filtering in many countries,
   including Bangladesh, Bahrain, China, India, Iran, Malaysia,
   Pakistan, Russia, Saudi Arabia, South Korea, Thailand, and Turkey
   [Verkamp-2012] [Nabi-2013] [Aryan-2012].  Commercial technologies
   such as the McAfee SmartFilter and NetSweeper are often purchased by
   censors [Dalek-2013].  These commercial technologies use a
   combination of HTTP Request Identification and Transport Header
   Identification to filter specific URLs.  Dalek et al. and Jones et
   al. identified the use of these products in the wild [Dalek-2013]
   [Jones-2014].

4.2.2.  HTTP Response Header Identification

   While HTTP Request Header Identification relies on the information
   contained in the HTTP request from client to server, response
   identification uses information sent in the response by the server
   to the client to identify undesirable content.

   Tradeoffs: As with HTTP Request Header Identification, the
   techniques used to identify HTTP traffic are well-known, cheap, and
   relatively easy to implement.  However, they are made useless by
   HTTPS because HTTPS encrypts the response and its headers.

   The response fields are also less helpful for identifying content
   than request fields, as "Server" could easily be identified using
   HTTP Request Header Identification, and "Via" is rarely relevant.
   HTTP Response censorship mechanisms normally let the first n packets
   through while the mirrored traffic is being processed; this may
   allow some content through, and the user may be able to detect that
   the censor is actively interfering with undesirable content.

   Empirical Examples: In 2009, Jong Park et al. at the University of
   New Mexico demonstrated that the Great Firewall of China (GFW) has
   used this technique [Crandall-2010].  However, Jong Park et al.
   found that the GFW discontinued this practice during the course of
   the study.  Due to the overlap between HTTP response filtering and
   keyword filtering (see Section 4.2.4), it is likely that most
   censors rely on keyword filtering over TCP streams instead of HTTP
   response filtering.

4.2.3.  Transport Layer Security (TLS)

   Similar to HTTP, censors have deployed a variety of techniques
   towards censoring Transport Layer Security (TLS) (and by extension
   HTTPS).  Most of these techniques relate to the Server Name
   Indication (SNI) field, including censoring SNI, Encrypted SNI, or
   omitted SNI.  Censors can also censor HTTPS content via server
   certificates.

4.2.3.1.  Server Name Indication (SNI)

   In encrypted connections using TLS, there may be servers that host
   multiple "virtual servers" at a given network address, and the
   client will need to specify in the (unencrypted) Client Hello
   message which domain name it seeks to connect to (so that the server
   can respond with the appropriate TLS certificate) using the Server
   Name Indication (SNI) TLS extension [RFC6066].  Since SNI is often
   sent in the clear (as are the cert fields sent in response), censors
   and filtering software can use it (and response cert fields) as a
   basis for blocking, filtering, or impairment by dropping connections
   to domains that match prohibited content (e.g., bad.foo.example may
   be censored while good.foo.example is not) [Shbair-2015].  There are
   ongoing standardization efforts in the TLS Working Group to encrypt
   SNI [I-D.ietf-tls-sni-encryption] [I-D.ietf-tls-esni], and recent
   research shows promising results in the use of encrypted SNI in the
   face of SNI-based filtering [Chai-2019] in some countries.

   Domain fronting has been one popular way to avoid identification by
   censors [Fifield-2015].
   Applications using domain fronting put a different domain name in
   the SNI extension than in the Host: header, which is protected by
   HTTPS.  The visible SNI would indicate an unblocked domain, while
   the blocked domain remains hidden in the encrypted application
   header.  Some encrypted messaging services relied on domain fronting
   to enable their provision in countries employing SNI-based
   filtering.  These services used the cover provided by domains for
   which blocking at the domain level would be undesirable to hide
   their true domain names.  However, the companies holding the most
   popular domains have since reconfigured their software to prevent
   this practice.  It may be possible to achieve similar results using
   potential future options to encrypt SNI.

   Tradeoffs: Some clients do not send the SNI extension (e.g., clients
   that only support versions of SSL and not TLS), rendering this
   method ineffective (see Section 4.2.3.3).  In addition, this
   technique requires deep packet inspection techniques that can be
   computationally and infrastructurally expensive, and improper
   configuration of an SNI-based block can result in significant
   overblocking, e.g., when a second-level domain like
   populardomain.example is inadvertently blocked.  In the case of
   encrypted SNI, pressure to censor may transfer to other points of
   intervention, such as content and application providers.

   Empirical Examples: There are many examples of security firms that
   offer SNI-based filtering products [Trustwave-2015] [Sophos-2015]
   [Shbair-2015], and the governments of China, Egypt, Iran, Qatar,
   South Korea, Turkey, Turkmenistan, and the UAE all do widespread SNI
   filtering or blocking [OONI-2018] [OONI-2019] [NA-SK-2019]
   [CitizenLab-2018] [Gatlan-2019] [Chai-2019] [Grover-2019]
   [Singh-2019].

4.2.3.2.  Encrypted SNI (ESNI)

   Given the data leakage present with the SNI field, a natural
   response is to encrypt it, which is forthcoming in TLS 1.3 with
   Encrypted Client Hello (ECH).  Prior to ECH, the Encrypted SNI
   (ESNI) extension was available to prevent the data leakage caused by
   SNI; it encrypts only the SNI field.  Unfortunately, censors can
   target connections that use the ESNI extension specifically for
   censorship.  This guarantees overblocking for the censor, but can be
   worth the cost if ESNI is not yet widely deployed within the
   country.  Encrypted Client Hello (ECH) is the emerging standard for
   protecting the entire TLS Client Hello, but it is not yet widely
   deployed.

   Tradeoffs: The cost of censoring Encrypted SNI (ESNI) is
   significantly higher to a censor than that of censoring SNI, as the
   censor can no longer target censorship to specific domains and
   guarantees over-blocking.  In these cases, the censor uses the
   over-blocking to discourage the use of ESNI entirely.

   Empirical Examples: In 2020, China began censoring all uses of
   Encrypted SNI (ESNI) [Bock-2020b], even for innocuous connections.
   The censorship mechanism for China's ESNI censorship differs from
   how China censors SNI-based connections, suggesting that new
   middleboxes were deployed specifically to target ESNI connections.

4.2.3.3.  Omitted-SNI

   Researchers have observed that some clients omit the SNI extension
   entirely.  This omitted-SNI approach limits the information
   available to a censor.  As with ESNI, censors can choose to block
   connections that omit the SNI, though this too risks over-blocking.

   Tradeoffs: The approach of censoring all connections that omit the
   SNI field is guaranteed to over-block, though connections that omit
   the SNI field should be relatively rare in the wild.
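   The SNI-based techniques in the preceding subsections all hinge on
   what a middlebox can read out of the cleartext Client Hello.  As a
   minimal sketch, the following parses the server_name extension from
   a ClientHello, assuming the whole message fits in one TLS record;
   the blocklist and function names are hypothetical, and a real DPI
   engine would be far more tolerant of fragmentation and malformed
   input:

```python
# Sketch of SNI-based identification: parse the cleartext TLS
# ClientHello, extract the server_name extension if present, and
# classify the connection.  BLOCKED_SNI is a hypothetical blocklist.
import ssl

BLOCKED_SNI = {"bad.foo.example"}

def extract_sni(record: bytes):
    """Return the SNI hostname from a TLS ClientHello record, or None."""
    if len(record) < 6 or record[0] != 0x16 or record[5] != 0x01:
        return None                  # not a TLS handshake / ClientHello
    pos = 9 + 2 + 32                 # record hdr, handshake hdr, version, random
    pos += 1 + record[pos]           # legacy session_id
    pos += 2 + int.from_bytes(record[pos:pos + 2], "big")  # cipher_suites
    pos += 1 + record[pos]           # compression_methods
    if pos + 2 > len(record):
        return None                  # no extension block at all
    end = pos + 2 + int.from_bytes(record[pos:pos + 2], "big")
    pos += 2
    while pos + 4 <= end:            # walk the extension list
        etype = int.from_bytes(record[pos:pos + 2], "big")
        elen = int.from_bytes(record[pos + 2:pos + 4], "big")
        if etype == 0:               # server_name extension
            name_len = int.from_bytes(record[pos + 7:pos + 9], "big")
            return record[pos + 9:pos + 9 + name_len].decode("ascii")
        pos += 4 + elen
    return None                      # SNI omitted

def classify(record: bytes) -> str:
    sni = extract_sni(record)
    if sni is None:
        return "block-or-pass"       # censor's policy choice for omitted SNI
    return "block" if sni in BLOCKED_SNI else "pass"

# Generate a real ClientHello with Python's ssl module to test against:
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
tls = ctx.wrap_bio(incoming, outgoing, server_hostname="bad.foo.example")
try:
    tls.do_handshake()
except ssl.SSLWantReadError:
    pass                             # ClientHello now sits in `outgoing`
hello = outgoing.read()
print(extract_sni(hello))            # -> bad.foo.example
```

   The same loop also illustrates the omitted-SNI case: when the walk
   over the extension list finds no server_name extension, the censor
   is left with only the block-everything-or-nothing choice described
   above.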
   Empirical Examples: In the past, researchers have observed censors
   in Russia blocking connections that omit the SNI field [Bock-2020b].

4.2.3.4.  Server Response Certificate

   During the TLS handshake, after the TLS Client Hello, the server
   will respond with the TLS certificate.  This certificate also
   contains the domain the client is trying to access, creating another
   avenue that censors can use to perform censorship.  This technique
   will not work in TLS 1.3, as the certificate will be encrypted.

   Tradeoffs: Censoring based on the server certificate requires deep
   packet inspection techniques that can be more computationally
   expensive compared to other methods.  Additionally, the certificate
   is sent later in the TLS handshake than the SNI field, forcing the
   censor to track the connection for longer.

   Empirical Examples: Researchers have observed the Reliance Jio ISP
   in India using certificate response fields to censor connections
   [Satija-2021].

4.2.4.  Instrumenting Content Distributors

   Many governments pressure content providers to censor themselves, or
   provide the legal framework within which content distributors are
   incentivized to follow the content restriction preferences of agents
   external to the content distributor [Boyle-1997].  Due to the
   extensive reach of such censorship, we define a content distributor
   as any service that provides utility to users, including everything
   from web sites to locally installed programs.  A commonly used
   method of instrumenting content distributors consists of keyword
   identification to detect restricted terms on their platform.
   Governments may provide the terms on such keyword lists.
   Alternatively, the content provider may be expected to come up with
   its own list.  A different method of instrumenting content
   distributors consists of requiring a distributor to disassociate
   from some categories of users.
   See also Section 6.4.

   Tradeoffs: By instrumenting content distributors to identify
   restricted content or content providers, the censor can gain new
   information at the cost of political capital with the companies it
   forces or encourages to participate in censorship.  For example, the
   censor can gain insight about the content of encrypted traffic by
   coercing web sites to identify restricted content.  Coercing content
   distributors to regulate users, categories of users, content, and
   content providers may encourage users and content providers to
   exhibit self-censorship, an additional advantage for censors (see
   Section 6.2).  The tradeoffs for instrumenting content distributors
   are highly dependent on the content provider and the requested
   assistance.  A typical concern is that the targeted keywords or
   categories of users are too broad, risk being too broadly applied,
   or are not subjected to a sufficiently robust legal process prior to
   their mandatory application (see p. 8 of [EC-2012]).

   Empirical Examples: Researchers discovered keyword identification by
   content providers on platforms ranging from instant messaging
   applications [Senft-2013] to search engines [Rushe-2015]
   [Cheng-2010] [Whittaker-2013] [BBC-2013] [Condliffe-2013].  To
   demonstrate the prevalence of this type of keyword identification,
   we look to search engine censorship.

   Search engine censorship demonstrates keyword identification by
   content providers and can be regional or worldwide.  Implementation
   is occasionally voluntary, but normally it is based on the laws and
   regulations of the country a search engine is operating in.  The
   keyword blocklists are most likely maintained by the search engine
   provider.  China is known to require search engine providers to
   "voluntarily" maintain search term blocklists to acquire and keep an
   Internet content provider (ICP) license [Cheng-2010].
It is clear 524 these blocklists are maintained by each search engine provider based 525 on the slight variations in the intercepted searches [Zhu-2011] 526 [Whittaker-2013]. The United Kingdom has been pushing search engines 527 to self-censor with the threat of litigation if they do not do it 528 themselves: Google and Microsoft have agreed to block more than 529 100,000 queries in the U.K. to help combat abuse [BBC-2013] 530 [Condliffe-2013]. European Union law, as well as US law, requires 531 modification of search engine results in response to either 532 copyright, trademark, data protection or defamation concerns 533 [EC-2012]. 535 Depending on the output, search engine keyword identification may be 536 difficult or easy to detect. In some cases specialized or blank 537 results provide a trivial enumeration mechanism, but more subtle 538 censorship can be difficult to detect. In February 2015, Microsoft's 539 search engine, Bing, was accused of censoring Chinese content outside 540 of China [Rushe-2015] because Bing returned different results for 541 censored terms in Chinese and English. However, it is possible that 542 censorship of the largest base of Chinese search users, China, biased 543 Bing's results so that the more popular results in China (the 544 uncensored results) were also more popular for Chinese speakers 545 outside of China. 547 Disassociation by content distributors from certain categories of 548 users has happened for instance in Spain, as a result of the conflict 549 between the Catalan independence movement and the Spanish legal 550 presumption of a unitary state [Lomas-2019]. E-sport event 551 organizers have also disassociated themselves from top players who 552 expressed political opinions in relation to the 2019 Hong Kong 553 protests [Victor-2019]. See also Section 5.3.2. 555 4.2.5.
Deep Packet Inspection (DPI) Identification 557 DPI (deep packet inspection) is technically any kind of packet 558 analysis beyond IP address and port number, and it has become 559 computationally feasible as a component of censorship mechanisms in 560 recent years [Wagner-2009]. Unlike other techniques, DPI reassembles 561 network flows to examine the application "data" section, as opposed 562 to only headers, and is therefore often used for keyword 563 identification. DPI also differs from other identification 564 technologies because it can leverage additional packet and flow 565 characteristics, e.g., packet sizes and timings, when identifying 566 content. To prevent substantial quality of service (QoS) impacts, 567 DPI normally analyzes a copy of data while the original packets 568 continue to be routed. Typically, the traffic is split using either 569 a mirror switch or fiber splitter, and analyzed on a cluster of 570 machines running Intrusion Detection Systems (IDS) configured for 571 censorship. 573 Tradeoffs: DPI is one of the most expensive identification mechanisms 574 and can have a large QoS impact [Porter-2010]. When used as a 575 keyword filter for TCP flows, DPI systems can also cause major 576 overblocking problems. Like other techniques, DPI is less useful 577 against encrypted data, though DPI can leverage unencrypted elements 578 of an encrypted data flow, e.g., the Server Name Indication (SNI) 579 sent in the clear for TLS, or metadata about an encrypted flow, e.g., 580 packet sizes, which differ across video and textual flows, to 581 identify traffic. See Section 4.2.3.1 for more information about 582 SNI-based filtration mechanisms. 584 Other kinds of information can be inferred by comparing certain 585 unencrypted elements exchanged during TLS handshakes to similar data 586 points from known sources.
This practice, called TLS fingerprinting, 587 allows a probabilistic identification of a party's operating system, 588 browser, or application based on a comparison of the specific 589 combinations of TLS version, ciphersuites, compression options, etc. 590 sent in the ClientHello message to similar signatures found in 591 unencrypted traffic [Husak-2016]. 593 Despite these problems, DPI is the most powerful identification 594 method and is widely used in practice. The Great Firewall of China 595 (GFW), the largest censorship system in the world, uses DPI to 596 identify restricted content over HTTP and DNS and inject TCP RSTs and 597 bad DNS responses, respectively, into connections [Crandall-2010] 598 [Clayton-2006] [Anonymous-2014]. 600 Empirical Examples: Several studies have found evidence of censors 601 using DPI for censoring content and tools. Clayton et al., Crandall 602 et al., Anonymous, and Khattak et al. all explored the GFW 604 [Crandall-2010] [Clayton-2006] [Anonymous-2014]. Khattak et al. even 605 probed the firewall to discover implementation details like how much 606 state it stores [Khattak-2013]. The Tor project claims that China, 607 Iran, Ethiopia, and others must have used DPI to block the obfs2 608 protocol [Wilde-2012]. Malaysia has been accused of using targeted 609 DPI, paired with DDoS, to identify and subsequently attack pro- 610 opposition material [Wagstaff-2013]. It also seems likely that 611 organizations not so worried about blocking content in real-time 612 could use DPI to sort and categorically search gathered traffic using 613 technologies such as NarusInsight [Hepting-2011]. 615 4.3. Transport Layer 617 4.3.1. Shallow Packet Inspection and Transport Header Identification 619 Of the various shallow packet inspection methods, Transport Header 620 Identification is the most pervasive, reliable, and predictable type 621 of identification.
Transport headers contain a few invaluable pieces 622 of information that must be transparent for traffic to be 623 successfully routed: destination and source IP address and port. 624 Destination and Source IP are doubly useful, as they not only 625 allow a censor to block undesirable content via IP blocklisting, but 626 also allow a censor to identify the IP of the user making the 627 request and the IP address of the destination being visited, which in 628 most cases can be used to infer the domain being visited 629 [Patil-2019]. Port is useful for allowlisting certain applications. 631 Trade-offs: Header identification is popular due to its simplicity, 632 availability, and robustness. 634 Header identification is trivial to implement, but is difficult to 635 implement in backbone or ISP routers at scale, and is therefore 636 typically implemented with DPI. Blocklisting an IP is equivalent to 637 installing a specific route on a router (such as a /32 route for IPv4 638 addresses and a /128 route for IPv6 addresses). However, due to 639 limited flow table space, this cannot scale beyond a few thousand IPs 640 at most. IP blocking is also relatively crude. It often leads to 641 overblocking and cannot deal with some services like Content 642 Distribution Networks (CDN) that host content at hundreds or 643 thousands of IP addresses. Despite these limitations, IP blocking is 644 extremely effective because the user needs to proxy their traffic 645 through another destination to circumvent this type of 646 identification. 648 Port-blocking is generally not useful because many types of content 649 share the same port and it is possible for censored applications to 650 change their port. For example, most HTTP traffic goes over port 80, 651 so the censor cannot differentiate between restricted and allowed web 652 content solely on the basis of port.
HTTPS goes over port 443, with 653 similar consequences for the censor, except that only partial metadata may 654 now be available to the censor. Port allowlisting is occasionally 655 used, where a censor limits communication to approved ports, such as 656 80 for HTTP traffic, and is most effective when used in conjunction 657 with other identification mechanisms. For example, a censor could 658 block the default HTTPS port, port 443, thereby forcing most users to 659 fall back to HTTP. A counter-example is that port 25 (SMTP) has long 660 been blocked on residential ISPs' networks to reduce the risk for 661 email spam, but doing so also prohibits residential ISP customers 662 from running their own email servers. 664 4.3.2. Protocol Identification 666 Censors sometimes identify entire protocols to be blocked using a 667 variety of traffic characteristics. For example, Iran impairs the 668 performance of HTTPS traffic, a protocol that prevents further 669 analysis, to encourage users to switch to HTTP, a protocol that they 670 can analyze [Aryan-2012]. A simple protocol identification would be 671 to recognize all TCP traffic over port 443 as HTTPS, but more 672 sophisticated analysis of the statistical properties of payload data 673 and flow behavior would be more effective, even when port 443 is not 674 used [Hjelmvik-2010] [Sandvine-2014]. 676 If censors can detect circumvention tools, they can block them, so 677 censors like China are extremely interested in identifying the 678 protocols for censorship circumvention tools. In recent years, this 679 has devolved into an arms race between censors and circumvention tool 680 developers. As part of this arms race, China developed an extremely 681 effective protocol identification technique that researchers call 682 active probing or active scanning.
684 In active probing, the censor determines whether hosts are running a 685 circumvention protocol by trying to initiate communication using the 686 circumvention protocol. If the host and the censor successfully 687 negotiate a connection, then the censor conclusively knows that host 688 is running a circumvention tool. China has used active scanning to 689 great effect to block Tor [Winter-2012]. 691 Trade-offs: Protocol identification necessarily only provides insight 692 into the way information is traveling, and not the information 693 itself. 695 Protocol identification is useful for detecting and blocking 696 circumvention tools, like Tor, or traffic that is difficult to 697 analyze, like VoIP or SSL, because the censor can assume that this 698 traffic should be blocked. However, this can lead to over-blocking 699 problems when used with popular protocols. These methods are 700 expensive, both computationally and financially, due to the use of 701 statistical analysis, and can be ineffective due to their imprecise 702 nature. Moreover, censorship circumvention groups like the Tor 703 Project have developed "pluggable transports" which seek to make the 704 traffic of censorship circumvention tools appear indistinguishable 705 from other kinds of traffic [Tor-2020]. 707 Censors have also used protocol identification in the past in an 708 'allowlist' filtering capacity, such as by only allowing specific, 709 pre-vetted protocols to be used and blocking any unrecognized 710 protocols [Bock-2020]. These protocol filtering approaches can also 711 lead to over-blocking if the allowed list of protocols is too small 712 or incomplete, but can be cheap to implement, as many standard 713 'allowed' protocols are simple to identify (such as HTTP).
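The allowlist filtering approach described above can be sketched as follows; the two detectors below are deliberately simplistic placeholders (a real filter would inspect far more than the first payload bytes):

```python
# Sketch of an allowlist protocol filter: a connection whose first
# payload bytes are not recognized as an approved protocol is dropped.
def looks_like_http(payload: bytes) -> bool:
    # HTTP requests start with a method token such as GET or POST.
    return payload.split(b" ", 1)[0] in {b"GET", b"POST", b"HEAD", b"PUT"}

def looks_like_tls(payload: bytes) -> bool:
    # TLS records begin with content type 0x16 (handshake), version 3.x.
    return len(payload) >= 3 and payload[0] == 0x16 and payload[1] == 0x03

def allowlist_verdict(payload: bytes) -> str:
    if looks_like_http(payload) or looks_like_tls(payload):
        return "forward"
    return "drop"  # anything unidentified is censored
```

An obfuscated circumvention protocol whose bytes match neither detector is dropped, which is precisely the over-blocking risk noted above.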
715 Empirical Examples: Protocol identification can be easy to detect if 716 it is conducted in real time and only a particular protocol is 717 blocked, but some types of protocol identification, like active 718 scanning, are much more difficult to detect. Protocol identification 719 has been used by Iran to identify and throttle SSH traffic to make it 720 unusable [Anonymous-2007] and by China to identify and block Tor 721 relays [Winter-2012]. Protocol identification has also been used for 722 traffic management, such as the 2007 case where Comcast in the United 723 States used RST injection to interrupt BitTorrent traffic 724 [Schoen-2007]. In 2020, Iran deployed an allowlist protocol filter, 725 which only allowed three protocols to be used (DNS, TLS, and HTTP) on 726 specific ports and censored any connection it could not identify 727 [Bock-2020]. 729 4.4. Residual Censorship 731 Another feature of some modern censorship systems is residual 732 censorship, a punitive form of censorship whereby after a censor 733 disrupts a forbidden connection, the censor continues to target 734 subsequent connections, even if they are innocuous [Bock-2021]. 735 Residual censorship can take many forms and often relies on the 736 methods of technical interference described in the next section. 738 An important facet of residual censorship is precisely what the 739 censor continues to block after censorship is initially triggered. 740 There are three common options available to an adversary: 2-tuple 741 (client IP, server IP), 3-tuple (client IP, server IP+port), or 742 4-tuple (client IP+port, server IP+port). Future connections that 743 match the tuple of information the censor records will be disrupted 744 [Bock-2021]. 746 Residual censorship can sometimes be difficult to identify and can 747 often complicate censorship measurement.
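The tuple options above determine how broadly residual censorship applies. The state a censor might keep can be sketched as follows (the class and field names are illustrative, and the addresses in the usage below are from the documentation ranges):

```python
# Sketch of residual censorship state: after a flow triggers
# censorship, the censor records a tuple and disrupts later flows that
# match it. 3-tuple mode ignores the client's ephemeral source port.
class ResidualCensor:
    def __init__(self, mode: str = "3-tuple"):
        self.mode = mode
        self.flagged = set()

    def _key(self, cip, cport, sip, sport):
        if self.mode == "2-tuple":
            return (cip, sip)
        if self.mode == "3-tuple":
            return (cip, sip, sport)
        return (cip, cport, sip, sport)  # 4-tuple

    def trigger(self, cip, cport, sip, sport):
        """Record the flow that triggered censorship."""
        self.flagged.add(self._key(cip, cport, sip, sport))

    def is_censored(self, cip, cport, sip, sport) -> bool:
        """Would a later flow be disrupted, even if innocuous?"""
        return self._key(cip, cport, sip, sport) in self.flagged
```

Under 3-tuple mode, a follow-up connection from a fresh client port to the same server and port is still disrupted; under 4-tuple mode it is not.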
749 Trade-offs: The impact of residual censorship is to provide users 750 with further discouragement from trying to access forbidden content, 751 though it is not clear how successful it is at accomplishing this. 753 Empirical Examples: China has used 3-tuple residual censorship in 754 conjunction with their HTTP censorship for years and researchers have 755 reported seeing similar residual censorship for HTTPS. China seems 756 to use a mix of 3-tuple and 4-tuple residual censorship for their 757 censorship of HTTPS with ESNI. Some censors that perform censorship 758 via packet dropping often accidentally implement 4-tuple residual 759 censorship, including Iran and Kazakhstan [Bock-2021]. 761 5. Technical Interference 763 5.1. Application Layer 765 5.1.1. DNS Interference 767 There are a variety of mechanisms that censors can use to block or 768 filter access to content by altering responses from the DNS 769 [AFNIC-2013] [ICANN-SSAC-2012], including blocking the response, 770 replying with an error message, or responding with an incorrect 771 address. Note that there are now encrypted transports for DNS 772 queries in DNS-over-HTTPS [RFC8484] and DNS-over-TLS [RFC7858] that 773 can mitigate interference with DNS queries between the stub and the 774 resolver. 776 Responding to a DNS query with an incorrect address can be achieved 777 with on-path interception, off-path cache poisoning, and lying by the 778 nameserver. 780 "DNS mangling" is a network-level technique of on-path interception 781 where an incorrect IP address is returned in response to a DNS query 782 to a censored destination. An example of this is what some Chinese 783 networks do (we are not aware of any other wide-scale uses of 784 mangling). On those Chinese networks, every DNS request in transit 785 is examined (presumably by network inspection technologies such as 786 DPI) and, if it matches a censored domain, a false response is 787 injected. 
End users can see this technique in action by simply 788 sending DNS requests to any unused IP address in China (see example 789 below). If it is not a censored name, there will be no response. If 790 it is censored, a forged response will be returned. For example, 791 using the command-line dig utility to query an unused IP address in 792 China of 192.0.2.2 for the name "www.uncensored.example" compared 793 with "www.censored.example" (censored at the time of writing), we get 794 a forged IP address "198.51.100.0" as a response: 796 % dig +short +nodnssec @192.0.2.2 A www.uncensored.example 797 ;; connection timed out; no servers could be reached 799 % dig +short +nodnssec @192.0.2.2 A www.censored.example 800 198.51.100.0 802 DNS cache poisoning happens off-path and refers to a mechanism where 803 a censor interferes with the response sent by an authoritative DNS 804 name server to a recursive resolver by responding more quickly than 805 the authoritative name server can respond with an alternative IP 806 address [Halley-2008]. Cache poisoning occurs after the requested 807 site's name servers resolve the request and attempt to forward the 808 true IP back to the requesting device; on the return route the 809 resolved IP is recursively cached by each DNS server that initially 810 forwarded the request. During this caching process if an undesirable 811 keyword is recognized, the resolved IP is "poisoned" and an 812 alternative IP (or NXDOMAIN error) is returned more quickly than the 813 upstream resolver can respond, causing a forged IP address to be 814 cached (and potentially recursively so). The alternative IPs usually 815 direct to a nonsense domain or a warning page. Alternatively, 816 Iranian censorship appears to prevent the communication en-route, 817 preventing a response from ever being sent [Aryan-2012]. 
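The dig probe shown above distinguishes DNS mangling from a normal failure by the shape of the response; the classification step can be sketched as follows (the function name is illustrative):

```python
from typing import Optional

# Sketch of the logic behind the dig probe above: a query sent to an
# unused IP address inside the censored network should time out, so
# any answer that does arrive must have been injected on-path.
def classify_probe(answer: Optional[str]) -> str:
    """answer: the A record returned, or None if the query timed out."""
    if answer is None:
        return "uncensored"  # nothing is listening at an unused IP
    return "censored"        # a forged answer was injected en route
```

Applied to the dig output above, the timeout for www.uncensored.example classifies as uncensored and the forged 198.51.100.0 answer as censored.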
819 There are also cases of what is colloquially called "DNS lying", 820 where a censor mandates that the DNS responses provided - by an 821 operator of a recursive resolver such as an Internet access provider 822 - be different than what the authoritative name server would provide 823 [Bortzmeyer-2015]. 825 Trade-offs: These forms of DNS interference require the censor to 826 force a user to traverse a controlled DNS hierarchy (or intervening 827 network on which the censor serves as an Active Pervasive Attacker 828 [RFC7624] to rewrite DNS responses) for the mechanism to be 829 effective. It can be circumvented by using alternative DNS resolvers 830 (such as any of the public DNS resolvers) that may fall outside of 831 the jurisdictional control of the censor, or Virtual Private Network 832 (VPN) technology. DNS mangling and cache poisoning also imply 833 returning an incorrect IP to those attempting to resolve a domain 834 name, but in some cases the destination may be technically 835 accessible; over HTTP, for example, the user may have another method 836 of obtaining the IP address of the desired site and may be able to 837 access it if the site is configured to be the default server 838 listening at this IP address. Target blocking has also been a 839 problem, as occasionally users outside of the censor's region will be 840 directed through DNS servers or DNS-rewriting network equipment 841 controlled by a censor, causing the request to fail. The ease of 842 circumvention paired with the large risk of content blocking and 843 target blocking makes DNS interference a partial, difficult, and less 844 than ideal censorship mechanism. 846 Additionally, the above mechanisms rely on DNSSEC not being deployed 847 or DNSSEC validation not being active on the client or recursive 848 resolver (neither of which are hard to imagine given limited 849 deployment of DNSSEC and limited client support for DNSSEC 850 validation).
Note that an adversary seeking to merely block 851 resolution can serve a DNSSEC record that doesn't validate correctly, 852 assuming of course that the client/recursive resolver validates. 854 Previously, censorship techniques were used that relied on 855 DNS requests being passed in cleartext over port 53 [SSAC-109-2020]. 856 With the deployment of encrypted DNS (e.g., DNS-over-HTTPS [RFC8484]) 857 these requests are now increasingly passed on port 443 with other 858 HTTPS traffic, or in the case of DNS-over-TLS [RFC7858] no longer 859 passed in the clear (see also Section 4.3.1). 861 Empirical Examples: DNS interference, when properly implemented, is 862 easy to identify based on the shortcomings identified above. Turkey 863 relied on DNS interference for its country-wide block of websites 864 such as Twitter and YouTube for almost a week in March of 2014, but the 865 ease of circumvention resulted in an increase in the popularity of 866 Twitter until Turkish ISPs implemented an IP blocklist to achieve 867 the governmental mandate [Zmijewski-2014]. Ultimately, Turkish ISPs 868 started hijacking all requests to Google and Level 3's international 869 DNS resolvers [Zmijewski-2014]. DNS interference, when incorrectly 870 implemented, has resulted in some of the largest "censorship 871 disasters". In January 2014, China started directing all requests 872 passing through the Great Firewall to a single domain, 873 dongtaiwang.com, due to an improperly configured DNS poisoning 874 attempt; this incident is thought to be the largest Internet-service 875 outage in history [AFP-2014] [Anon-SIGCOMM12]. Countries such as 876 China, Iran, Turkey, and the United States have discussed blocking 877 entire TLDs as well, but only Iran has acted by blocking all Israeli 878 (.il) domains [Albert-2011].
DNS-blocking is commonly deployed in 879 European countries to deal with undesirable content, such as child 880 abuse content (Norway, United Kingdom, Belgium, Denmark, Finland, 881 France, Germany, Ireland, Italy, Malta, the Netherlands, Poland, 882 Spain and Sweden [Wright-2013] [Eneman-2010]), online gambling 883 (Belgium, Bulgaria, Czech Republic, Cyprus, Denmark, Estonia, France, 884 Greece, Hungary, Italy, Latvia, Lithuania, Poland, Portugal, Romania, 885 Slovakia, Slovenia, Spain (see Section 6.3.2 of: [EC-gambling-2012], 886 [EC-gambling-2019])), copyright infringement (all European Economic 887 Area countries), hate-speech and extremism (France [Hertel-2015]) and 888 terrorism content (France [Hertel-2015]). 890 5.2. Transport Layer 892 5.2.1. Performance Degradation 894 While other interference techniques outlined in this section mostly 895 focus on blocking or preventing access to content, it can be an 896 effective censorship strategy in some cases to not entirely block 897 access to a given destination or service, but instead to degrade the 898 performance of the relevant network connection. The resulting user 899 experience for a site or service under performance degradation can be 900 so bad that users opt to use a different site, service, or method of 901 communication, or may not engage in communication at all if there are 902 no alternatives. Traffic shaping techniques that rate-limit the 903 bandwidth available to certain types of traffic are one example of 904 performance degradation. 906 Trade-offs: While implementing a performance degradation will not 907 always eliminate the ability of people to access a desired resource, 908 it may force them to use other means of communication where 909 censorship (or surveillance) is more easily accomplished. 911 Empirical Examples: Iran has been known to shape the bandwidth 912 available to HTTPS traffic to encourage unencrypted HTTP traffic 913 [Aryan-2012]. 915 5.2.2.
Packet Dropping 917 Packet dropping is a simple mechanism to prevent undesirable traffic. 918 The censor identifies undesirable traffic and chooses not to 919 forward any packets it sees associated with that 920 undesirable traffic, rather than routing them normally. 921 This can be paired with any of the previously described mechanisms so 922 long as the censor knows the user must route traffic through a 923 controlled router. 925 Trade-offs: Packet Dropping is most successful when every traversing 926 packet has transparent information linked to undesirable content, 927 such as a Destination IP. One downside Packet Dropping suffers from 928 is the necessity of blocking all content from otherwise allowable IPs 929 based on a single subversive sub-domain; blogging services and GitHub 930 repositories are good examples. China famously dropped all GitHub 931 packets for three days based on a single repository hosting 932 undesirable content [Anonymous-2013]. The need to inspect every 933 traversing packet in close to real time also makes Packet Dropping 934 somewhat challenging from a QoS perspective. 936 Empirical Examples: Packet Dropping is a very common form of 937 technical interference and lends itself to accurate detection given 938 the unique nature of the time-out requests it leaves in its wake. 939 The Great Firewall of China has been observed using packet dropping 940 as one of its primary mechanisms of technical censorship 941 [Ensafi-2013]. Iran has also used Packet Dropping as the mechanism 942 for throttling SSH [Aryan-2012]. These are but two examples of a 943 ubiquitous censorship practice. 945 5.2.3. RST Packet Injection 947 Packet injection, generally, refers to a man-in-the-middle (MITM) 948 network interference technique that spoofs packets in an established 949 traffic stream.
RST packets are normally used to let one side of a TCP 950 connection know the other side has stopped sending information, and 951 thus the receiver should close the connection. RST Packet Injection 952 is a specific type of packet injection attack that is used to 953 interrupt an established stream by sending RST packets to both sides 954 of a TCP connection; as each receiver thinks the other has dropped 955 the connection, the session is terminated. 957 QUIC is not vulnerable to these types of injection attacks once the 958 connection has been set up, but is vulnerable during setup (See 959 [I-D.ietf-quic-transport] for more details). 961 Trade-offs: Although ineffective against non-TCP protocols (QUIC, 962 IPSec), RST Packet Injection has a few advantages that make it 963 extremely popular as a technique employed for censorship. RST Packet 964 Injection is an out-of-band interference mechanism, allowing the 965 avoidance of the QoS bottleneck one can encounter with inline 966 techniques such as Packet Dropping. This out-of-band property allows 967 a censor to inspect a copy of the information, usually mirrored by an 968 optical splitter, making it an ideal pairing for DPI and protocol 969 identification [Weaver-2009] (this asynchronous version of a MITM is 970 often called a Man-on-the-Side (MOTS)). RST Packet Injection also 971 has the advantage of only requiring one of the two endpoints to 972 accept the spoofed packet for the connection to be interrupted. 974 The difficult part of RST Packet Injection is spoofing "enough" 975 correct information to ensure one end-point accepts an RST packet as 976 legitimate; this generally implies a correct IP, port, and TCP 977 sequence number. Sequence number is the hardest to get correct, as 978 [RFC0793] specifies an RST Packet should be in-sequence to be 979 accepted, although the RFC also recommends allowing in-window packets 980 as "good enough".
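The practical cost of that in-window allowance can be quantified; a sketch, assuming the 32-bit TCP sequence space and an illustrative 65535-byte receive window:

```python
import math

SEQ_SPACE = 2 ** 32  # 32-bit TCP sequence-number space

def guesses_needed(window: int) -> int:
    """Sequence-number guesses a blind injector needs when receivers
    accept any in-window RST: the injector can step through the whole
    space one window-sized stride at a time."""
    return math.ceil(SEQ_SPACE / window)

# With a 65535-byte window this is 65538 guesses, versus up to 2^32
# attempts when an exactly in-sequence RST is required.
```

This is what makes blind injection against a tracked flow practical for a censor that knows the IPs and ports but not the exact sequence number.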
This in-window recommendation is important, as, if 981 it is implemented, it allows for successful Blind RST Injection 982 attacks [Netsec-2011]. When in-window sequencing is allowed, it is 983 trivial to conduct a Blind RST Injection: while the term "blind" 984 injection implies the censor doesn't know any sensitive sequencing 985 information about the TCP stream they are injecting into, they can 986 simply enumerate all ~70000 possible windows; this is particularly 987 useful for interrupting encrypted/obfuscated protocols such as SSH or 988 Tor [Gilad]. Some censorship evasion systems work by trying to 989 confuse the censor into tracking incorrect information, rendering 990 their RST Packet Injection useless [Khattak-2013], [Wang-2017], 991 [Li-2017], [Bock-2019], [Wang-2020]. 993 RST Packet Injection relies on a stateful network, making it useless 994 against UDP connections. RST Packet Injection is among the most 995 popular censorship techniques used today given its versatile nature 996 and effectiveness against all types of TCP traffic. Recent research 997 shows that a TCP RST packet injection attack can even work in the 998 case of an off-path attacker [Cao-2016]. 1000 Empirical Examples: RST Packet Injection, as mentioned above, is most 1001 often paired with identification techniques that require splitting, 1002 such as DPI or protocol identification. In 2007, Comcast was accused 1003 of using RST Packet Injection to interrupt traffic it identified as 1004 BitTorrent [Schoen-2007]; this later led to a US Federal 1005 Communications Commission ruling against Comcast [VonLohmann-2008]. 1006 China has also been known to use RST Packet Injection for censorship 1007 purposes. This interference is especially evident in the 1008 interruption of encrypted/obfuscated protocols, such as those used by 1009 Tor [Winter-2012]. 1011 5.3. Multi-layer and Non-layer 1013 5.3.1.
Distributed Denial of Service (DDoS) 1015 Distributed Denial of Service attacks are a common attack mechanism 1016 used by "hacktivists" and malicious hackers, but censors have used 1017 DDoS in the past for a variety of reasons. There is a huge variety 1018 of DDoS attacks [Wikip-DoS], but at a high level two possible impacts 1019 tend to occur: a flood attack renders the service unusable 1020 while resources are spent handling the flood, while a crash attack 1021 aims to crash the service so resources can be reallocated elsewhere 1022 without "releasing" the service. 1024 Trade-offs: DDoS is an appealing mechanism when a censor would like 1025 to prevent all access to undesirable content, instead of only access 1026 in their region for a limited period of time, but this is really the 1027 only uniquely beneficial feature for DDoS as a technique employed for 1028 censorship. The resources required to carry out a successful DDoS 1029 against major targets are substantial, usually requiring the censor 1030 to rent or own a malicious distributed platform such as 1031 a botnet, and the technique is imprecise. DDoS is an incredibly crude censorship 1032 technique, and appears to largely be used as a timely, easy-to-access 1033 mechanism for blocking undesirable content for a limited period of 1034 time. 1036 Empirical Examples: In 2012 the U.K.'s GCHQ used DDoS to temporarily 1037 shut down IRC chat rooms frequented by members of Anonymous using the 1038 SYN Flood DDoS method; a SYN flood exploits the handshake used by TCP 1039 to overload the victim server with so many requests that legitimate 1040 traffic becomes slow or impossible [Schone-2014] [CERT-2000]. 1041 Dissenting opinion websites are frequently victims of DDoS around 1042 politically sensitive events in Burma [Villeneuve-2011].
Controlling 1043 parties in Russia [Kravtsova-2012], Zimbabwe [Orion-2013], and 1044 Malaysia [Muncaster-2013] have been accused of using DDoS to 1045 interrupt opposition support and access during elections. In 2015, 1046 China launched a DDoS attack using a true MITM system collocated with 1047 the Great Firewall, dubbed "Great Cannon", that was able to inject 1048 JavaScript code into web visits to a Chinese search engine that 1049 commandeered those user agents to send DDoS traffic to various sites 1050 [Marczak-2015]. 1052 5.3.2. Network Disconnection or Adversarial Route Announcement 1054 While it is perhaps the crudest of all techniques employed for 1055 censorship, there is no more effective way of making sure undesirable 1056 information isn't allowed to propagate on the web than by shutting 1057 off the network. The network can be logically cut off in a region 1058 when a censoring body withdraws all of the Border Gateway Protocol 1059 (BGP) prefixes routing through the censor's country. 1061 Trade-offs: The impact of a network disconnection in a region is huge 1062 and absolute; the censor pays for absolute control over digital 1063 information by losing all the benefits the Internet brings; this is 1064 rarely a long-term solution for any censor and is normally only used 1065 as a last resort in times of substantial unrest. 1067 Empirical Examples: Network Disconnections tend to only happen in 1068 times of substantial unrest, largely due to the huge social, 1069 political, and economic impact such a move has. One of the first, 1070 highly covered occurrences was with the Junta in Myanmar employing 1071 Network Disconnection to help Junta forces quash a rebellion in 2007 1072 [Dobie-2007]. China disconnected the network in the Xinjiang region 1073 during unrest in 2009 in an effort to prevent the protests from 1074 spreading to other regions [Heacock-2009].
The Arab Spring saw the 1075 most frequent usage of Network Disconnection, with events in 1076 Egypt and Libya in 2011 [Cowie-2011], and Syria in 2012 1077 [Thomson-2012]. Russia has indicated that it will attempt to 1078 disconnect all Russian networks from the global internet in April 1079 2019 as part of a test of the nation's network independence. Reports 1080 also indicate that, as part of the test disconnect, Russian 1081 telecommunications firms must now route all traffic to state-operated 1082 monitoring points [Cimpanu-2019]. India was the country that saw the 1083 largest number of internet shutdowns per year in 2016 and 2017 1084 [Dada-2017]. 1086 5.3.3. Censorship in Depth 1088 Often, censors implement multiple techniques in tandem, creating 1089 "censorship in depth". Censorship in depth can take many forms; some 1090 censors block the same content through multiple techniques (such as 1091 blocking a domain by DNS, IP blocking, and HTTP simultaneously), some 1092 deploy parallel systems to improve censorship reliability (such as 1093 deploying multiple different censorship systems to block the same 1094 domain), and others can use complementary systems to limit evasion 1095 (such as by blocking unwanted protocols entirely, forcing users to 1096 use other filtered protocols). 1098 Trade-offs: Censorship in depth can be attractive for censors to 1099 deploy, as it offers additional guarantees about censorship: even if 1100 someone evades one type of censorship, they may still be blocked by 1101 another. The main drawback to this approach is the cost to initial 1102 deployment, as it requires the system to deploy multiple censorship 1103 systems in tandem. 1105 Empirical Examples: Censorship in depth is present in many large 1106 censoring nation states today.
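Censorship in depth amounts to OR-ing independent mechanisms: a flow is censored if any deployed check matches it. A schematic sketch (every check below is a hypothetical placeholder for a real mechanism, and the names and addresses are illustrative):

```python
# Sketch of censorship in depth: several independent checks inspect
# the same flow, and a single match suffices to censor it, so evading
# any one mechanism is not enough.
def censored_in_depth(flow: dict, checks) -> bool:
    return any(check(flow) for check in checks)

# Placeholder per-mechanism checks over a flow record.
dns_check = lambda f: f.get("domain") == "blocked.example"
ip_check = lambda f: f.get("dst_ip") == "198.51.100.1"
sni_check = lambda f: f.get("sni") == "blocked.example"
CHECKS = [dns_check, ip_check, sni_check]
```

A user who tunnels around the DNS check, for example, is still caught by the IP or SNI check, which is the guarantee that makes layered deployment attractive despite its cost.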
Researchers have observed that China has deployed significant censorship in depth, often censoring the same resource across multiple protocols [Chai-2019], [Bock-2020b] or deploying additional censorship systems to censor the same content and protocol [Bock-2021b]. Iran has also deployed a complementary protocol filter to limit which protocols can be used on certain ports, forcing users to rely on protocols its censorship system can filter [Bock-2020].

6. Non-Technical Interference

6.1. Manual Filtering

As the name implies, sometimes manpower is the easiest way to figure out which content to block. Manual Filtering differs from the common tactic of building up blocklists in that it doesn't necessarily target a specific IP or DNS name, but instead removes or flags content. Given the imprecise nature of automatic filtering, manually sorting through content and flagging dissenting websites, blogs, articles, and other media for filtration can be an effective technique. This filtration can occur at the backbone/ISP level - China's army of monitors is a good example [BBC-2013b] - but more commonly manual filtering occurs at an institutional level. Internet Content Providers (ICPs) such as Google or Weibo require a business license to operate in China. One of the prerequisites for a business license is an agreement to sign a "voluntary pledge" known as the "Public Pledge on Self-discipline for the Chinese Internet Industry". Failure to "energetically uphold" the pledged values can lead to the ICPs being held liable for the offending content by the Chinese government [BBC-2013b].

6.2. Self-Censorship

Self-censorship is difficult to document, as it manifests primarily through a lack of undesirable content.
Tools which encourage self-censorship are those which may lead a prospective speaker to believe that speaking increases the risk of unfavourable outcomes for the speaker (technical monitoring, identification requirements, etc.). Reporters Without Borders documents methods of imposing self-censorship in its annual World Press Freedom Index reports [RWB2020].

6.3. Server Takedown

As mentioned in passing by [Murdoch-2011], servers must have a physical location somewhere in the world. If undesirable content is hosted in the censoring country, the servers can be physically seized or - in cases where a server is virtualized in a cloud infrastructure and may not have a fixed physical location - the hosting provider can be required to prevent access.

6.4. Notice and Takedown

In many countries, legal mechanisms exist by which an individual or other content provider can issue a legal request to a content host that requires the host to take down content. Examples include the systems employed by companies like Google to comply with "Right to be Forgotten" policies in the European Union [Google-RTBF], intermediary liability rules for electronic platform providers [EC-2012], and the copyright-oriented notice and takedown regime of the United States Digital Millennium Copyright Act (DMCA) Section 512 [DMLP-512].

6.5. Domain-Name Seizures

Domain names are catalogued in name servers operated by legal entities called registries. These registries can be made to cede control over a domain name to someone other than the entity which registered it through a legal procedure grounded in either private contracts or public law. Domain-name seizures are increasingly used by both public authorities and private entities to deal with undesired content dissemination [ICANN2012] [EFF2017].

7. Contributors

This document benefited from discussions with and input from David Belson, Stephane Bortzmeyer, Vinicius Fortuna, Gurshabad Grover, Andrew McConachie, Martin Nilsson, Michael Richardson, Patrick Vacek, and Chris Wood.

8. Informative References

[AFNIC-2013] AFNIC, "Report of the AFNIC Scientific Council: Consequences of DNS-based Internet filtering", 2013.

[AFP-2014] AFP, "China Has Massive Internet Breakdown Reportedly Caused By Their Own Censoring Tools", 2014.

[Albert-2011] Albert, K., "DNS Tampering and the new ICANN gTLD Rules", 2011.

[Anon-SIGCOMM12] Anonymous, "The Collateral Damage of Internet Censorship by DNS Injection", 2012.

[Anonymous-2007] Anonymous, "How to Bypass Comcast's Bittorrent Throttling", 2012.

[Anonymous-2013] Anonymous, "GitHub blocked in China - how it happened, how to get around it, and where it will take us", 2013.

[Anonymous-2014] Anonymous, "Towards a Comprehensive Picture of the Great Firewall's DNS Censorship", 2014.

[AP-2012] Associated Press, "Sattar Beheshit, Iranian Blogger, Was Beaten In Prison According To Prosecutor", 2012.

[Aryan-2012] Aryan, S., Aryan, H., and J.A. Halderman, "Internet Censorship in Iran: A First Look", 2012.

[BBC-2013] BBC News, "Google and Microsoft agree steps to block abuse images", 2013.

[BBC-2013b] BBC, "China employs two million microblog monitors state media say", 2013.

[Bentham-1791] Bentham, J., "Panopticon Or the Inspection House", 1791.

[Bock-2019] Bock, K., Hughey, G., Qiang, X., and D. Levin, "Geneva: Evolving Censorship Evasion Strategies", 2019.

[Bock-2020] Bock, K., Fax, Y., Reese, K., Singh, J., and D. Levin, "Detecting and Evading Censorship-in-Depth: A Case Study of Iran's Protocol Filter", 2020.

[Bock-2020b] Bock, K., iyouport, Anonymous, Merino, L., Fifield, D., Houmansadr, A., and D. Levin, "Exposing and Circumventing China's Censorship of ESNI", 2020.

[Bock-2021] Bock, K., Bharadwaj, P., Singh, J., and D. Levin, "Your Censor is My Censor: Weaponizing Censorship Infrastructure for Availability Attacks", 2021.

[Bock-2021b] Bock, K., Naval, G., Reese, K., and D. Levin, "Even Censors Have a Backup: Examining China's Double HTTPS Censorship Middleboxes", 2021.

[Bortzmeyer-2015] Bortzmeyer, S., "DNS Censorship (DNS Lies) As Seen By RIPE Atlas", 2015.

[Boyle-1997] Boyle, J., "Foucault in Cyberspace: Surveillance, Sovereignty, and Hardwired Censors", 1997.

[Bristow-2013] Bristow, M., "China's internet 'spin doctors'", 2013.

[Calamur-2013] Calamur, K., "Prominent Egyptian Blogger Arrested", 2013.

[Cao-2016] Cao, Y., Qian, Z., Wang, Z., Dao, T., Krishnamurthy, S., and L. Marvel, "Off-Path TCP Exploits: Global Rate Limit Considered Dangerous", 2016.

[CERT-2000] CERT, "TCP SYN Flooding and IP Spoofing Attacks", 2000.

[Chai-2019] Chai, Z., Ghafari, A., and A. Houmansadr, "On the Importance of Encrypted-SNI (ESNI) to Censorship Circumvention", 2019.

[Cheng-2010] Cheng, J., "Google stops Hong Kong auto-redirect as China plays hardball", 2010.

[Cimpanu-2019] Cimpanu, C., "Russia to disconnect from the internet as part of a planned test", 2019.

[CitizenLab-2018] Marczak, B., Dalek, J., McKune, S., Senft, A., Scott-Railton, J., and R. Deibert, "Bad Traffic: Sandvine's PacketLogic Devices Used to Deploy Government Spyware in Turkey and Redirect Egyptian Users to Affiliate Ads?", 2018.

[Clayton-2006] Clayton, R., "Ignoring the Great Firewall of China", 2006.

[Condliffe-2013] Condliffe, J., "Google Announces Massive New Restrictions on Child Abuse Search Terms", 2013.

[Cowie-2011] Cowie, J., "Egypt Leaves the Internet", 2011.

[Crandall-2010] Crandall, J., "Empirical Study of a National-Scale Distributed Intrusion Detection System: Backbone-Level Filtering of HTML Responses in China", 2010.

[Dada-2017] Dada, T. and P. Micek, "Launching STOP: the #KeepItOn internet shutdown tracker", 2017.

[Dalek-2013] Dalek, J., "A Method for Identifying and Confirming the Use of URL Filtering Products for Censorship", 2013.

[Ding-1999] Ding, C., Chi, C.H., Deng, J., and C.L. Dong, "Centralized Content-Based Web Filtering and Blocking: How Far Can It Go?", 1999.

[DMLP-512] Digital Media Law Project, "Protecting Yourself Against Copyright Claims Based on User Content", 2012.

[Dobie-2007] Dobie, M., "Junta tightens media screw", 2007.

[EC-2012] European Commission, "Summary of the results of the Public Consultation on the future of electronic commerce in the Internal Market and the implementation of the Directive on electronic commerce (2000/31/EC)", 2012.

[EC-gambling-2012] European Commission, "Online gambling in the Internal Market", 2012.

[EC-gambling-2019] European Commission, "Evaluation of regulatory tools for enforcing online gambling rules and channeling demand towards controlled offers", 2019.

[EFF2017] Malcom, J., Stoltz, M., Rossi, G., and V. Paxson, "Which Internet registries offer the best protection for domain owners?", 2017.
[Ellul-1973] Ellul, J., "Propaganda: The Formation of Men's Attitudes", 1973.

[Eneman-2010] Eneman, M., "ISPs filtering of child abusive material: A critical reflection of its effectiveness", 2010.

[Ensafi-2013] Ensafi, R., "Detecting Intentional Packet Drops on the Internet via TCP/IP Side Channels", 2013.

[Fareed-2008] Fareed, M., "China joins a turf war", 2008.

[Fifield-2015] Fifield, D., Lan, C., Hynes, R., Wegmann, P., and V. Paxson, "Blocking-resistant communication through domain fronting", 2015.

[Gao-2014] Gao, H., "Tiananmen, Forgotten", 2014.

[Gatlan-2019] Gatlan, S., "South Korea is Censoring the Internet by Snooping on SNI Traffic", 2019.

[Gilad] Gilad, Y. and A. Herzberg, "Off-Path TCP Injection Attacks", 2014.

[Glanville-2008] Glanville, J., "The Big Business of Net Censorship", 2008.

[Google-RTBF] Google, Inc., "Search removal request under data protection law in Europe", 2015.

[Grover-2019] Grover, G., Singh, K., and E. Hickok, "Reliance Jio is using SNI inspection to block websites", 2019.

[Guardian-2014] The Guardian, "Chinese blogger jailed under crackdown on 'internet rumours'", 2014.

[HADOPI-2020] Haute Autorité pour la Diffusion des oeuvres et la Protection des Droits sur Internet, "Présentation" [Overview], 2020.

[Halley-2008] Halley, B., "How DNS cache poisoning works", 2014.

[Heacock-2009] Heacock, R., "China Shuts Down Internet in Xinjiang Region After Riots", 2009.

[Hepting-2011] Electronic Frontier Foundation, "Hepting vs. AT&T", 2011.

[Hertel-2015] Hertel, O., "Comment les autorités peuvent bloquer un site Internet" [How the authorities can block a website], 2015.

[Hjelmvik-2010] Hjelmvik, E., "Breaking and Improving Protocol Obfuscation", 2010.
[Hopkins-2011] Hopkins, C., "Communications Blocked in Libya, Qatari Blogger Arrested: This Week in Online Tyranny", 2011.

[Husak-2016] Husak, M., Cermak, M., Jirsik, T., and P. Celeda, "HTTPS traffic analysis and client identification using passive SSL/TLS fingerprinting", 2016.

[I-D.ietf-quic-transport] Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed and Secure Transport", Work in Progress, Internet-Draft, draft-ietf-quic-transport-34, 14 January 2021.

[I-D.ietf-tls-esni] Rescorla, E., Oku, K., Sullivan, N., and C. A. Wood, "TLS Encrypted Client Hello", Work in Progress, Internet-Draft, draft-ietf-tls-esni-14, 13 February 2022.

[I-D.ietf-tls-sni-encryption] Huitema, C. and E. Rescorla, "Issues and Requirements for Server Name Identification (SNI) Encryption in TLS", Work in Progress, Internet-Draft, draft-ietf-tls-sni-encryption-09, 28 October 2019.

[ICANN-SSAC-2012] ICANN Security and Stability Advisory Committee (SSAC), "SAC 056: SSAC Advisory on Impacts of Content Blocking via the Domain Name System", 2012.

[ICANN2012] ICANN Security and Stability Advisory Committee, "Guidance for Preparing Domain Name Orders, Seizures & Takedowns", 2012.

[Jones-2014] Jones, B., "Automated Detection and Fingerprinting of Censorship Block Pages", 2014.

[Khattak-2013] Khattak, S., "Towards Illuminating a Censorship Monitor's Model to Facilitate Evasion", 2013.

[Knight-2005] Knight, W., "Iranian net censorship powered by US technology", 2005.

[Knockel-2021] Knockel, J. and L. Ruan, "Measuring QQMail's automated email censorship in China", 2021.

[Kopel-2013] Kopel, K., "Operation Seizing Our Sites: How the Federal Government is Taking Domain Names Without Prior Notice", 2013.
[Kravtsova-2012] Kravtsova, Y., "Cyberattacks Disrupt Opposition's Election", 2012.

[Leyba-2019] Leyba, K., Edwards, B., Freeman, C., Crandall, J., and S. Forrest, "Borders and Gateways: Measuring and Analyzing National AS Chokepoints", 2019.

[Li-2017] Li, F., Razaghpanah, A., Kakhki, A., Niaki, A., Choffnes, D., Gill, P., and A. Mislove, "lib•erate, (n): A library for exposing (traffic-classification) rules and avoiding them efficiently", 2017.

[Lomas-2019] Lomas, N., "Github removes Tsunami Democràtic's APK after a takedown order from Spain", 2019.

[Marczak-2015] Marczak, B., Weaver, N., Dalek, J., Ensafi, R., Fifield, D., McKune, S., Rey, A., Scott-Railton, J., Deibert, R., and V. Paxson, "An Analysis of China's 'Great Cannon'", 2015.

[Muncaster-2013] Muncaster, P., "Malaysian election sparks web blocking/DDoS claims", 2013.

[Murdoch-2011] Murdoch, S.J. and R. Anderson, "Access Denied: Tools and Technology of Internet Filtering", 2011.

[NA-SK-2019] Morgus, R., Sherman, J., and S. Nam, "Analysis: South Korea's New Tool for Filtering Illegal Internet Content", 2019.

[Nabi-2013] Nabi, Z., "The Anatomy of Web Censorship in Pakistan", 2013.

[Netsec-2011] n3t2.3c, "TCP-RST Injection", 2011.

[OONI-2018] Evdokimov, L., "Iran Protests: DPI blocking of Instagram (Part 2)", 2018.

[OONI-2019] Singh, S., Filastò, A., and M. Xynou, "China is now blocking all language editions of Wikipedia", 2019.

[Orion-2013] Orion, E., "Zimbabwe election hit by hacking and DDoS attacks", 2013.

[Patil-2019] Patil, S. and N. Borisov, "What Can You Learn from an IP?", 2019.

[Porter-2010] Porter, T., "The Perils of Deep Packet Inspection", 2010.
[Rambert-2021] Rambert, R., Weinberg, Z., Barradas, D., and N. Christin, "Chinese Wall or Swiss Cheese? Keyword filtering in the Great Firewall of China", 2021.

[Reda-2017] Reda, J., "New EU law prescribes website blocking in the name of 'consumer protection'", 2017.

[RFC0793] Postel, J., "Transmission Control Protocol", STD 7, RFC 793, DOI 10.17487/RFC0793, September 1981.

[RFC6066] Eastlake 3rd, D., "Transport Layer Security (TLS) Extensions: Extension Definitions", RFC 6066, DOI 10.17487/RFC6066, January 2011.

[RFC7624] Barnes, R., Schneier, B., Jennings, C., Hardie, T., Trammell, B., Huitema, C., and D. Borkmann, "Confidentiality in the Face of Pervasive Surveillance: A Threat Model and Problem Statement", RFC 7624, DOI 10.17487/RFC7624, August 2015.

[RFC7754] Barnes, R., Cooper, A., Kolkman, O., Thaler, D., and E. Nordmark, "Technical Considerations for Internet Service Blocking and Filtering", RFC 7754, DOI 10.17487/RFC7754, March 2016.

[RFC7858] Hu, Z., Zhu, L., Heidemann, J., Mankin, A., Wessels, D., and P. Hoffman, "Specification for DNS over Transport Layer Security (TLS)", RFC 7858, DOI 10.17487/RFC7858, May 2016.

[RFC8484] Hoffman, P. and P. McManus, "DNS Queries over HTTPS (DoH)", RFC 8484, DOI 10.17487/RFC8484, October 2018.

[RSF-2005] Reporters Sans Frontieres, "Technical ways to get around censorship", 2005.

[Rushe-2015] Rushe, D., "Bing censoring Chinese language search results for users in the US", 2013.

[RWB2020] Reporters Without Borders, "2020 World Press Freedom Index: Entering a decisive decade for journalism, exacerbated by coronavirus", 2020.

[Sandvine-2014] Sandvine, "Technology Showcase on Traffic Classification: Why Measurements and Freeform Policy Matter", 2014.
[Satija-2021] Satija, S. and R. Chatterjee, "BlindTLS: Circumventing TLS-based HTTPS censorship", 2021.

[Schoen-2007] Schoen, S., "EFF tests agree with AP: Comcast is forging packets to interfere with user traffic", 2007.

[Schone-2014] Schone, M., Esposito, R., Cole, M., and G. Greenwald, "Snowden Docs Show UK Spies Attacked Anonymous, Hackers", 2014.

[Senft-2013] Senft, A., "Asia Chats: Analyzing Information Controls and Privacy in Asian Messaging Applications", 2013.

[Shbair-2015] Shbair, W.M., Cholez, T., Goichot, A., and I. Chrisment, "Efficiently Bypassing SNI-based HTTPS Filtering", 2015.

[SIDN2020] Moura, G., "Detecting and Taking Down Fraudulent Webshops at the .nl ccTLD", 2020.

[Singh-2019] Singh, K., Grover, G., and V. Bansal, "How India Censors the Web", 2019.

[Sophos-2015] Sophos, "Understanding Sophos Web Filtering", 2015.

[SSAC-109-2020] ICANN Security and Stability Advisory Committee, "SAC109: The Implications of DNS over HTTPS and DNS over TLS", 2020.

[Tang-2016] Tang, C., "In-depth analysis of the Great Firewall of China", 2016.

[Thomson-2012] Thomson, I., "Syria Cuts off Internet and Mobile Communication", 2012.

[Tor-2020] The Tor Project, "Tor: Pluggable Transports", 2020.

[Trustwave-2015] Trustwave, "Filter: SNI extension feature and HTTPS blocking", 2015.

[Tschantz-2016] Tschantz, M., Afroz, S., Anonymous, A., and V. Paxson, "SoK: Towards Grounding Censorship Circumvention in Empiricism", 2016.

[Verkamp-2012] Verkamp, J.P. and M. Gupta, "Inferring Mechanics of Web Censorship Around the World", 2012.

[Victor-2019] Victor, D., "Blizzard Sets Off Backlash for Penalizing Hearthstone Gamer in Hong Kong", 2019.
[Villeneuve-2011] Villeneuve, N., "Open Access: Chapter 8, Control and Resistance, Attacks on Burmese Opposition Media", 2011.

[VonLohmann-2008] VonLohmann, F., "FCC Rules Against Comcast for BitTorrent Blocking", 2008.

[Wagner-2009] Wagner, B., "Deep Packet Inspection and Internet Censorship: International Convergence on an 'Integrated Technology of Control'", 2009.

[Wagstaff-2013] Wagstaff, J., "In Malaysia, online election battles take a nasty turn", 2013.

[Wang-2017] Wang, Z., Cao, Y., Qian, Z., Song, C., and S. Krishnamurthy, "Your State is Not Mine: A Closer Look at Evading Stateful Internet Censorship", 2017.

[Wang-2020] Wang, Z., Zhu, S., Cao, Y., Qian, Z., Song, C., Krishnamurthy, S., Chan, K., and T. Braun, "SYMTCP: Eluding Stateful Deep Packet Inspection with Automated Discrepancy Discovery", 2020.

[Weaver-2009] Weaver, N., Sommer, R., and V. Paxson, "Detecting Forged TCP Packets", 2009.

[Whittaker-2013] Whittaker, Z., "1,168 keywords Skype uses to censor, monitor its Chinese users", 2013.

[Wikip-DoS] Wikipedia, "Denial of Service Attacks", 2016.

[Wilde-2012] Wilde, T., "Knock Knock Knockin' on Bridges Doors", 2012.

[Winter-2012] Winter, P., "How China is Blocking Tor", 2012.

[WP-Def-2020] Wikipedia contributors, "Censorship", 2020.

[Wright-2013] Wright, J. and Y. Breindl, "Internet filtering trends in liberal democracies: French and German regulatory debates", 2013.

[Zhu-2011] Zhu, T., "An Analysis of Chinese Search Engine Filtering", 2011.

[Zmijewski-2014] Zmijewski, E., "Turkish Internet Censorship Takes a New Turn", 2014.

Authors' Addresses

Joseph Lorenzo Hall
Internet Society
Email: hall@isoc.org

Michael D. Aaron
CU Boulder
Email: michael.drew.aaron@gmail.com

Amelia Andersdotter
Email: amelia.ietf@andersdotter.cc

Ben Jones
Princeton
Email: bj6@cs.princeton.edu

Nick Feamster
U Chicago
Email: feamster@uchicago.edu

Mallory Knodel
Center for Democracy & Technology
Email: mknodel@cdt.org