TRANS                                                        L. Nordberg
Internet-Draft                                                  NORDUnet
Intended status: Experimental                                 D. Gillmor
Expires: July 14, 2017                                              ACLU
                                                               T. Ritter

                                                        January 10, 2017

                            Gossiping in CT
                       draft-ietf-trans-gossip-04

Abstract

   The logs in Certificate Transparency are untrusted in the sense that
   the users of the system don't have to trust that they behave
   correctly since the behavior of a log can be verified to be correct.

   This document tries to solve the problem with logs presenting a
   "split view" of their operations. It describes three gossiping
   mechanisms for Certificate Transparency: SCT Feedback, STH
   Pollination and Trusted Auditor Relationship.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 14, 2017.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Defining the problem
   3.  Overview
   4.  Terminology
     4.1.  Pre-Loaded vs Locally Added Anchors
   5.  Who gossips with whom
   6.  What to gossip about and how
   7.  Data flow
   8.  Gossip Mechanisms
     8.1.  SCT Feedback
       8.1.1.  SCT Feedback data format
       8.1.2.  HTTPS client to server
       8.1.3.  HTTPS server operation
       8.1.4.  HTTPS server to auditors
     8.2.  STH pollination
       8.2.1.  HTTPS Clients and Proof Fetching
       8.2.2.  STH Pollination without Proof Fetching
       8.2.3.  Auditor Action
       8.2.4.  STH Pollination data format
     8.3.  Trusted Auditor Stream
       8.3.1.  Trusted Auditor data format
   9.  3-Method Ecosystem
     9.1.  SCT Feedback
     9.2.  STH Pollination
     9.3.  Trusted Auditor Relationship
     9.4.  Interaction
   10. Security considerations
     10.1.  Attacks by actively malicious logs
     10.2.  Dual-CA Compromise
     10.3.  Censorship/Blocking considerations
     10.4.  Flushing Attacks
       10.4.1.  STHs
       10.4.2.  SCTs & Certificate Chains on HTTPS Servers
       10.4.3.  SCTs & Certificate Chains on HTTPS Clients
     10.5.  Privacy considerations
       10.5.1.  Privacy and SCTs
       10.5.2.  Privacy in SCT Feedback
       10.5.3.  Privacy for HTTPS clients performing STH Proof Fetching
       10.5.4.  Privacy in STH Pollination
       10.5.5.  Privacy in STH Interaction
       10.5.6.  Trusted Auditors for HTTPS Clients
       10.5.7.  HTTPS Clients as Auditors
   11. Policy Recommendations
     11.1.  Blocking Recommendations
       11.1.1.  Frustrating blocking
       11.1.2.  Responding to possible blocking
     11.2.  Proof Fetching Recommendations
     11.3.  Record Distribution Recommendations
       11.3.1.  Mixing Algorithm
       11.3.2.  The Deletion Algorithm
     11.4.  Concrete Recommendations
       11.4.1.  STH Pollination
       11.4.2.  SCT Feedback
   12. IANA considerations
   13. Contributors
   14. ChangeLog
     14.1.  Changes between ietf-03 and ietf-04
     14.2.  Changes between ietf-02 and ietf-03
     14.3.  Changes between ietf-01 and ietf-02
     14.4.  Changes between ietf-00 and ietf-01
     14.5.  Changes between -01 and -02
     14.6.  Changes between -00 and -01
   15. References
     15.1.  Normative References
     15.2.  Informative References
   Authors' Addresses

1.  Introduction

   The purpose of the protocols in this document, collectively referred
   to as CT Gossip, is to detect certain misbehavior by CT logs. In
   particular, CT Gossip aims to detect logs that are providing
   inconsistent views to different log clients, and logs failing to
   include submitted certificates within the time period stipulated by
   MMD.

   One of the major challenges of any gossip protocol is limiting damage
   to user privacy. The goal of CT gossip is to publish and distribute
   information about the logs and their operations, but not to expose
   any additional information about the operation of any of the other
   participants. Privacy of consumers of log information (in
   particular, of web browsers and other TLS clients) should not be
   undermined by gossip.

   This document presents three different, complementary mechanisms for
   non-log elements of the CT ecosystem to exchange information about
   logs in a manner that preserves the privacy of HTTPS clients. They
   should provide protective benefits for the system as a whole even if
   their adoption is not universal.

2.  Defining the problem

   When a log provides different views of the log to different clients
   this is described as a partitioning attack. Each client would be
   able to verify the append-only nature of the log but, in the extreme
   case, each client might see a unique view of the log.

   The CT logs are public, append-only and untrusted and thus have to be
   audited for consistency, i.e., they should never rewrite history.
   Additionally, auditors and other log clients need to exchange
   information about logs in order to be able to detect a partitioning
   attack (as described above).

   Gossiping about log behavior helps address the problem of detecting
   malicious or compromised logs with respect to a partitioning attack.
   We want some side of the partitioned tree, and ideally both sides, to
   see the other side.

   Disseminating information about a log poses a potential threat to the
   privacy of end users. Some data of interest (e.g. SCTs) is linkable
   to specific log entries and thereby to specific websites, which makes
   sharing them with others a privacy concern. Gossiping about this
   data has to take privacy considerations into account in order not to
   expose associations between users of the log (e.g., web browsers) and
   certificate holders (e.g., web sites). Even sharing STHs (which do
   not link to specific log entries) can be problematic - user tracking
   by fingerprinting through rare STHs is one potential attack (see
   Section 8.2).

3.  Overview

   This document presents three gossiping mechanisms: SCT Feedback, STH
   Pollination, and a Trusted Auditor Relationship.

   SCT Feedback enables HTTPS clients to share Signed Certificate
   Timestamps (SCTs) (Section 3.3 of [RFC-6962-BIS-09]) with CT auditors
   in a privacy-preserving manner by sending SCTs to originating HTTPS
   servers, who in turn share them with CT auditors.

   In STH Pollination, HTTPS clients use HTTPS servers as pools to share
   Signed Tree Heads (STHs) (Section 3.6 of [RFC-6962-BIS-09]) with
   other connecting clients in the hope that STHs will find their way to
   CT auditors.

   HTTPS clients in a Trusted Auditor Relationship share SCTs and STHs
   with trusted CT auditors directly, with expectations of privacy
   sensitive data being handled according to whatever privacy policy is
   agreed on between client and trusted party.

   Despite the privacy risks with sharing SCTs there is no loss in
   privacy if a client sends SCTs for a given site to the site
   corresponding to the SCT. This is because the site's logs would
   already indicate that the client is accessing that site. In this way
   a site can accumulate records of SCTs that have been issued by
   various logs for that site, providing a consolidated repository of
   SCTs that could be shared with auditors. Auditors can use this
   information to detect a misbehaving log that fails to include a
   certificate within the time period stipulated by its MMD metadata.
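
   As an illustration of the MMD check mentioned above, the following
   is a minimal sketch, not part of the protocol. It assumes the
   auditor already knows the log's MMD in seconds, holds an STH from
   that log, and has separately determined (for example via an
   inclusion proof) whether the certificate is covered by that STH.
   The names SctObservation, inclusion_deadline_ms and mmd_violated are
   hypothetical and do not appear in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SctObservation:
    """Hypothetical auditor-side record for one SCT (not a draft structure)."""
    log_id: str
    timestamp_ms: int  # SCT timestamp, in milliseconds since the epoch


def inclusion_deadline_ms(sct: SctObservation, mmd_seconds: int) -> int:
    """Time by which the log promised to have incorporated the entry."""
    return sct.timestamp_ms + mmd_seconds * 1000


def mmd_violated(sct: SctObservation, mmd_seconds: int,
                 sth_timestamp_ms: int, included_in_sth: bool) -> bool:
    """Return True if an STH issued after the deadline lacks the entry.

    Assumption: `included_in_sth` was established elsewhere, e.g. by a
    verified inclusion proof against the tree described by the STH.
    """
    past_deadline = sth_timestamp_ms > inclusion_deadline_ms(sct, mmd_seconds)
    return past_deadline and not included_in_sth
```

   An STH older than the deadline says nothing about a violation, which
   is why the sketch only flags STHs issued after the deadline.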

   Sharing an STH is considered reasonably safe from a privacy
   perspective as long as the same STH is shared by a large number of
   other log clients. This safety in numbers can be achieved by only
   allowing gossiping of STHs issued in a certain window of time, while
   also refusing to gossip about STHs from logs with too high an STH
   issuance frequency (see Section 8.2).

4.  Terminology

   This document relies on terminology and data structures defined in
   [RFC-6962-BIS-09], including MMD, STH, SCT, Version, LogID, SCT
   timestamp, CtExtensions, SCT signature, Merkle Tree Hash.

   This document relies on terminology defined in
   [draft-ietf-trans-threat-analysis-03], including Auditing.

4.1.  Pre-Loaded vs Locally Added Anchors

   Throughout the document, we refer to both Trust Anchors (Certificate
   Authorities) and Logs. Both Logs and Trust Anchors may be locally
   added by an administrator. Unless otherwise clarified, in both cases
   we refer to the set of Trust Anchors and Logs that come pre-loaded
   and pre-trusted in a piece of client software.

5.  Who gossips with whom

   o  HTTPS clients and servers (SCT Feedback and STH Pollination)

   o  HTTPS servers and CT auditors (SCT Feedback and STH Pollination)

   o  CT auditors (Trusted Auditor Relationship)

   Additionally, some HTTPS clients may engage with an auditor who they
   trust with their privacy:

   o  HTTPS clients and CT auditors (Trusted Auditor Relationship)

6.  What to gossip about and how

   There are three separate gossip streams:

   o  SCT Feedback - transporting SCTs and certificate chains from HTTPS
      clients to CT auditors via HTTPS servers.

   o  STH Pollination - HTTPS clients and CT auditors using HTTPS
      servers as STH pools for exchanging STHs.

   o  Trusted Auditor Stream - HTTPS clients communicating directly with
      trusted CT auditors sharing SCTs, certificate chains and STHs.

   It is worthwhile to note that when an HTTPS client or CT auditor
   interacts with a log, they may equivalently interact with a log
   mirror or cache that replicates the log.

7.  Data flow

   The following picture shows how certificates, SCTs and STHs flow
   through a CT system with SCT Feedback and STH Pollination. It does
   not show what goes in the Trusted Auditor Relationship stream.

      +- Cert ---- +----------+
      |            |    CA    | ----------+
      |   + SCT -> +----------+           |
      v   |                          Cert [& SCT]
   +----------+                           |
   |   Log    | ---------- SCT -----------+
   +----------+                           v
     |  ^                         +----------+
     |  |        SCTs & Certs --- | Website  |
     |  |[1]            |         +----------+
     |  |[2]           STHs         ^     |
     |  |[3]            v           |     |
     |  |          +----------+     |     |
     |  +--------> | Auditor  |     |  HTTPS traffic
     |             +----------+     |     |
    STH                             |  SCT & Cert
     |                     SCTs & Certs   |
   Log entries                      |     |
     |                             STHs  STHs
     v                              |     |
   +----------+                     |     v
   | Monitor  |                   +----------+
   +----------+                   | Browser  |
                                  +----------+

   #   Auditor                        Log
   [1] |--- get-sth ------------------->|
       |<-- STH ------------------------|
   [2] |--- leaf hash + tree size ----->|
       |<-- index + inclusion proof ----|
   [3] |--- tree size 1 + tree size 2 ->|
       |<-- consistency proof ----------|

8.  Gossip Mechanisms

8.1.  SCT Feedback

   The goal of SCT Feedback is for clients to share SCTs and certificate
   chains with CT auditors while still preserving the privacy of the end
   user. The sharing of SCTs contributes to the overall goal of
   detecting misbehaving logs by providing auditors with SCTs from many
   vantage points, making it more likely to catch a violation of a log's
   MMD or a log presenting inconsistent views. The sharing of
   certificate chains is beneficial to HTTPS server operators interested
   in direct feedback from clients for detecting bogus certificates
   issued in their name and therefore incentivizes server operators to
   take part in SCT Feedback.

   SCT Feedback is the most privacy-preserving gossip mechanism, as it
   does not directly expose any links between an end user and the sites
   they've visited to any third party.

   HTTPS clients store SCTs and certificate chains they see, and later
   send them to the originating HTTPS server by posting them to a well-
   known URL (associated with that server), as described in
   Section 8.1.2. Note that clients will send the same SCTs and chains
   to a server multiple times with the assumption that any man-in-the-
   middle attack eventually will cease, and an honest server will
   eventually receive collected malicious SCTs and certificate chains.

   HTTPS servers store SCTs and certificate chains received from
   clients, as described in Section 8.1.3. They later share them with
   CT auditors by either posting them to auditors or making them
   available via a well-known URL. This is described in Section 8.1.4.

8.1.1.  SCT Feedback data format

   The data shared between HTTPS clients and servers, as well as between
   HTTPS servers and CT auditors, is a JSON array [RFC7159]. Each item
   in the array is a JSON object with the following content:

   o  x509_chain: An array of PEM-encoded X.509 certificates. The first
      element is the end-entity certificate, the second certifies the
      first and so on.

   o  sct_data: An array of objects consisting of the base64
      representation of the binary SCT data as defined in
      [RFC-6962-BIS-09] Section 3.3.

   We will refer to this object as 'sct_feedback'.

   The x509_chain element always contains a full chain from a leaf
   certificate to a self-signed trust anchor.

   See Section 8.1.2 for details on what the sct_data element contains
   as well as more details about the x509_chain element.

8.1.2.  HTTPS client to server

   When an HTTPS client connects to an HTTPS server, the client receives
   a set of SCTs as part of the TLS handshake. SCTs are included in the
   TLS handshake using one or more of the three mechanisms described in
   [RFC-6962-BIS-09] section 3.4 - in the server certificate, in a TLS
   extension, or in an OCSP extension. The client MUST discard SCTs
   that are not signed by a log known to the client and SHOULD store the
   remaining SCTs together with a locally constructed certificate chain
   which is trusted (i.e. terminated in a pre-loaded or locally
   installed Trust Anchor) in an sct_feedback object or equivalent data
   structure for later use in SCT Feedback.

   The SCTs stored on the client MUST be keyed by the exact domain name
   the client contacted. They MUST NOT be sent to any domain not
   matching the original domain (e.g. if the original domain is
   sub.example.com they must not be sent to sub.sub.example.com or to
   example.com.) They MUST NOT be sent to any Subject Alternate Names
   specified in the certificate. In the case of certificates that
   validate multiple domain names, the same SCT is expected to be stored
   multiple times.

   Not following these constraints would increase the risk for two types
   of privacy breaches. First, the HTTPS server receiving the SCT would
   learn about other sites visited by the HTTPS client. Second,
   auditors receiving SCTs from the HTTPS server would learn information
   about other HTTPS servers visited by its clients.

   If the client later again connects to the same HTTPS server, it again
   receives a set of SCTs and calculates a certificate chain, and again
   creates an sct_feedback or similar object. If this object does not
   exactly match an existing object in the store, then the client MUST
   add this new object to the store, associated with the exact domain
   name contacted, as described above. An exact comparison is needed to
   ensure that attacks involving alternate chains are detected. An
   example of such an attack is described in
   [dual-ca-compromise-attack]. However, at least one optimization is
   safe and MAY be performed: If the certificate chain exactly matches
   an existing certificate chain, the client MAY store the union of the
   SCTs from the two objects in the first (existing) object.

   If the client does connect to the same HTTPS server a subsequent
   time, it MUST send to the server sct_feedback objects in the store
   that are associated with that domain name. However, it is not
   necessary to send an sct_feedback object constructed from the current
   TLS session, and if the client does so, it MUST NOT be marked as sent
   in any internal tracking done by the client.

   Refer to Section 11.3 for recommendations for implementation.

   Because SCTs can be used as a tracking mechanism (see
   Section 10.5.2), they deserve special treatment when they are
   received from (and provided to) domains that are loaded as
   subresources from an origin domain. Such domains are commonly called
   'third party domains'. An HTTPS client SHOULD store SCT Feedback
   using a 'double-keying' approach, which isolates third party domains
   by the first party domain. This is described in [double-keying].

   Gossip would be performed normally for third party domains only when
   the user revisits the first party domain. In lieu of 'double-
   keying', an HTTPS client MAY treat SCT Feedback in the same manner it
   treats other security mechanisms that can enable tracking (such as
   HSTS and HPKP).

   If the HTTPS client has configuration options for not sending cookies
   to third parties, SCTs of third parties MUST be treated as cookies
   with respect to this setting. This prevents third party tracking
   through the use of SCTs/certificates, which would bypass the cookie
   policy. For domains that are only loaded as third party domains, the
   client may never perform SCT Feedback; however the client may perform
   STH Pollination after fetching an inclusion proof, as specified in
   Section 8.2.

   SCTs and corresponding certificates are POSTed to the originating
   HTTPS server at the well-known URL:

   https:///.well-known/ct-gossip/v1/sct-feedback

   The data sent in the POST is defined in Section 8.1.1. This data
   SHOULD be sent in an already-established TLS session. This makes it
   hard for an attacker to disrupt SCT Feedback without also disturbing
   ordinary secure browsing (https://). This is discussed more in
   Section 11.1.1.

   The HTTPS server SHOULD respond with an HTTP 200 response code and an
   empty body if it was able to process the request. An HTTPS client
   who receives any other response SHOULD consider it an error.

   Some clients have trust anchors or logs that are locally added (e.g.
   by an administrator or by the user themselves). These additions are
   potentially privacy-sensitive because they can carry information
   about the specific configuration, computer, or user.

   Certificates validated by locally added trust anchors will commonly
   have no SCTs associated with them, so in this case no action is
   needed with respect to CT Gossip. SCTs issued by locally added logs
   MUST NOT be reported via SCT Feedback.

   If a certificate is validated by SCTs that are issued by publicly
   trusted logs, but chains to a local trust anchor, the client MAY
   perform SCT Feedback for this SCT and certificate chain bundle. If
   it does so, the client MUST include the full chain of certificates
   chaining to the local trust anchor in the x509_chain array.
   Performing SCT Feedback in this scenario may be advantageous for the
   broader internet and CT ecosystem, but may also disclose information
   about the client. If the client elects to omit SCT Feedback, it can
   choose to perform STH Pollination after fetching an inclusion proof,
   as specified in Section 8.2.

   We require the client to send the full chain (or nothing at all) for
   two reasons. Firstly, it simplifies the operation on the server if
   there are not two code paths. Secondly, omitting the chain does not
   actually preserve user privacy. The Issuer field in the certificate
   describes the signing certificate. And if the certificate is being
   submitted at all, it means the certificate is logged, and has SCTs.
   This means that the Issuer can be queried and obtained from the log,
   so omitting the signing certificate from the client's submission does
   not actually help user privacy.

8.1.3.  HTTPS server operation

   HTTPS servers can be configured (or omit configuration), resulting
   in, broadly, two modes of operation. In the simpler mode, the server
   will only track leaf certificates and SCTs applicable to those leaf
   certificates. In the more complex mode, the server will confirm the
   client's chain validation and store the certificate chain. The
   latter mode requires more configuration, but is necessary to prevent
   denial of service (DoS) attacks on the server's storage space.

   In the simple mode of operation, upon receiving a submission at the
   sct-feedback well-known URL, an HTTPS server will perform a set of
   operations, checking on each sct_feedback object before storing it:

   1.  the HTTPS server MAY modify the sct_feedback object, and discard
       all items in the x509_chain array except the first item (which is
       the end-entity certificate)

   2.  if a bit-wise compare of the sct_feedback object matches one
       already in the store, this sct_feedback object SHOULD be
       discarded

   3.  if the leaf cert is not for a domain for which the server is
       authoritative, the SCT MUST be discarded

   4.  if an SCT in the sct_data array can't be verified to be a valid
       SCT for the accompanying leaf cert, and issued by a known log,
       the individual SCT SHOULD be discarded

   The modification in step number 1 is necessary to prevent a malicious
   client from exhausting the server's storage space. A client can
   generate their own issuing certificate authorities, and create an
   arbitrary number of chains that terminate in an end-entity
   certificate with an existing SCT. By discarding all but the end-
   entity certificate, we prevent a simple HTTPS server from storing
   this data. Note that operation in this mode will not prevent the
   attack described in [dual-ca-compromise-attack]. Skipping this step
   requires additional configuration as described below.

   The check in step 2 is for detecting duplicates and minimizing
   processing and storage by the server. As on the client, an exact
   comparison is needed to ensure that attacks involving alternate
   chains are detected. Again, at least one optimization is safe and
   MAY be performed. If the certificate chain exactly matches an
   existing certificate chain, the server MAY store the union of the
   SCTs from the two objects in the first (existing) object. If the
   validity check on any of the SCTs fails, the server SHOULD NOT store
   the union of the SCTs.

   The check in step 3 is to help prevent malfunctioning clients from
   exposing which sites they visit. It additionally helps prevent DoS
   attacks on the server.

   [ Note: Thinking about building this, how does the SCT Feedback app
   know which sites it's authoritative for? It will need that amount of
   configuration at least. ]

   The check in step 4 is to prevent DoS attacks where an adversary
   fills up the store prior to attacking a client (thus preventing the
   client's feedback from being recorded), or an attack where an
   adversary simply attempts to fill up the server's storage space.
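
   The four simple-mode steps above can be sketched as follows. This
   is an illustrative sketch only, not a reference implementation. It
   assumes that sct_data items are base64 strings, that the store is an
   in-memory list, and that the caller supplies two hypothetical
   callbacks: is_authoritative(leaf_pem), standing in for the server's
   own configuration, and sct_is_valid(leaf_pem, sct), standing in for
   real SCT signature verification against known logs. Python dict
   equality is used as a stand-in for the bit-wise compare in step 2,
   and dropping an object whose SCTs all fail step 4 is an assumption
   of this sketch rather than a requirement of the text.

```python
import json
from typing import Callable, Dict, List

Feedback = Dict[str, List[str]]


def process_submission(body: str,
                       store: List[Feedback],
                       is_authoritative: Callable[[str], bool],
                       sct_is_valid: Callable[[str, str], bool]) -> int:
    """Apply simple-mode steps 1-4 to a POSTed JSON array.

    Returns the number of sct_feedback objects added to `store`.
    Both callbacks are hypothetical hooks supplied by the caller.
    """
    stored = 0
    for item in json.loads(body):
        chain = item.get("x509_chain") or []
        if not chain:
            continue  # nothing to key on; ignore malformed items
        # Step 1 (MAY): keep only the end-entity certificate.
        feedback: Feedback = {"x509_chain": chain[:1],
                              "sct_data": list(item.get("sct_data", []))}
        # Step 2 (SHOULD): discard exact duplicates of stored objects.
        if feedback in store:
            continue
        # Step 3 (MUST): discard feedback for domains we do not serve.
        leaf = feedback["x509_chain"][0]
        if not is_authoritative(leaf):
            continue
        # Step 4 (SHOULD): drop individual SCTs that do not verify.
        feedback["sct_data"] = [sct for sct in feedback["sct_data"]
                                if sct_is_valid(leaf, sct)]
        # Assumption: an object with no remaining SCTs, or one that is
        # now identical to a stored object, is not worth storing.
        if not feedback["sct_data"] or feedback in store:
            continue
        store.append(feedback)
        stored += 1
    return stored
```

   Because step 1 truncates every chain to the leaf, two submissions
   that differ only in their intermediate certificates collapse into
   one stored object, which is the storage protection described above.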
536 The above describes the simpler mode of operation. In the more 537 advanced server mode, the server will detect the attack described in 538 [dual-ca-compromise-attack]. In this configuration the server will 539 not modify the sct_feedback object prior to performing checks 2, 3, 540 and 4. 542 To prevent a malicious client from filling the server's data store, 543 the HTTPS server SHOULD perform an additional check in the more 544 advanced mode: 546 o if the x509_chain consists of an invalid certificate chain, or the 547 culminating trust anchor is not recognized by the server, the 548 server SHOULD modify the sct_feedback object, discarding all items 549 in the x509_chain array except the first item 551 The HTTPS server MAY choose to omit checks 4 or 5. This will place 552 the server at risk of having its data store filled up by invalid 553 data, but can also allow a server to identify interesting certificate 554 or certificate chains that omit valid SCTs, or do not chain to a 555 trusted root. This information may enable an HTTPS server operator 556 to detect attacks or unusual behavior of Certificate Authorities even 557 outside the Certificate Transparency ecosystem. 559 8.1.4. HTTPS server to auditors 561 HTTPS servers receiving SCTs from clients SHOULD share SCTs and 562 certificate chains with CT auditors by either serving them on the 563 well-known URL: 565 https:///.well-known/ct-gossip/v1/collected-sct-feedback 567 or by HTTPS POSTing them to a set of preconfigured auditors. This 568 allows an HTTPS server to choose between an active push model or a 569 passive pull model. 571 The data received in a GET of the well-known URL or sent in the POST 572 is defined in Section 8.1.1 with the following difference: The 573 x509_chain element may contain only he end-entity certificate, as 574 described below. 576 HTTPS servers SHOULD share all sct_feedback objects they see that 577 pass the checks in Section 8.1.3. 
If this is an infeasible amount of 578 data, the server MAY choose to expire submissions according to an 579 undefined policy. Suggestions for such a policy can be found in 580 Section 11.3. 582 HTTPS servers MUST NOT share any other data that they may learn from 583 the submission of SCT Feedback by HTTPS clients, like the HTTPS 584 client IP address or the time of submission. 586 As described above, HTTPS servers can be configured (or omit 587 configuration), resulting in two modes of operation. In one mode, 588 the x509_chain array will contain a full certificate chain. This 589 chain may terminate in a trust anchor the auditor may recognize, or 590 it may not. (One scenario where this could occur is if the client 591 submitted a chain terminating in a locally added trust anchor, and 592 the server kept this chain.) In the other mode, the x509_chain array 593 will consist of only a single element, which is the end-entity 594 certificate. 596 Auditors SHOULD provide the following URL accepting HTTPS POSTing of 597 SCT feedback data: 599 https:///ct-gossip/v1/sct-feedback 601 Auditors SHOULD regularly poll HTTPS servers at the well-known 602 collected-sct-feedback URL. The frequency of the polling and how to 603 determine which domains to poll is outside the scope of this 604 document. However, the selection MUST NOT be influenced by potential 605 HTTPS clients connecting directly to the auditor. For example, if a 606 poll to example.com occurs directly after a client submits an SCT for 607 example.com, an adversary observing the auditor can trivially 608 conclude the activity of the client. 610 8.2. STH pollination 612 The goal of sharing Signed Tree Heads (STHs) through pollination is 613 to share STHs between HTTPS clients and CT auditors while still 614 preserving the privacy of the end user. 
The sharing of STHs 615 contributes to the overall goal of detecting misbehaving logs by 616 providing CT auditors with STHs from many vantage points, making it 617 possible to detect logs that are presenting inconsistent views. 619 HTTPS servers supporting the protocol act as STH pools. HTTPS 620 clients and CT auditors in possession of STHs can pollinate STH 621 pools by sending STHs to them, and retrieving new STHs to send to 622 other STH pools. CT auditors can improve the value of their auditing 623 by retrieving STHs from pools. 625 HTTPS clients send STHs to HTTPS servers by POSTing them to the well- 626 known URL: 628 https:///.well-known/ct-gossip/v1/sth-pollination 630 The data sent in the POST is defined in Section 8.2.4. This data 631 SHOULD be sent in an already established TLS session. This makes it 632 hard for an attacker to disrupt STH gossiping without also disturbing 633 ordinary secure browsing (https://). This is discussed more in 634 Section 11.1.1. 636 On a successful connection to an HTTPS server implementing STH 637 Pollination, the response code will be 200, and the response body is 638 application/json, containing zero or more STHs in the same format, as 639 described in Section 8.2.4. 641 An HTTPS client may acquire STHs by several methods: 643 o in replies to pollination POSTs; 645 o asking logs that it recognizes for the current STH, either 646 directly (v2/get-sth) or indirectly (for example over DNS); 648 o resolving an SCT and certificate to an STH via an inclusion proof; 650 o resolving one STH to another via a consistency proof. 651 HTTPS clients (that have STHs) and CT auditors SHOULD pollinate STH 652 pools with STHs. Which STHs to send and how often pollination should 653 happen are left as undefined policy, with the exception of the privacy 654 concerns explained below. Suggestions for the policy can be found in 655 Section 11.3. 657 An HTTPS client could be tracked by giving it a unique or rare STH.
658 To address this concern, we place restrictions on different 659 components of the system to ensure an STH will not be rare. 661 o HTTPS clients silently ignore STHs from logs with an STH issuance 662 frequency of more than one STH per hour. Logs use the STH 663 Frequency Count metadata to express this ([RFC-6962-BIS-09] 664 sections 3.6 and 5.1). 666 o HTTPS clients silently ignore STHs which are not fresh. 668 An STH is considered fresh iff its timestamp is less than 14 days in 669 the past. Given a maximum STH issuance rate of one per hour, an 670 attacker has at most 336 (14 days x 24 hours) unique STHs per log for tracking. Clients MUST 671 ignore STHs older than 14 days. We consider STHs within this 672 validity window not to be personally identifiable data, and STHs 673 outside this window to be personally identifiable. 675 When multiplied by the number of logs from which a client accepts 676 STHs, this number of unique STHs grows and the negative privacy 677 implications grow with it. It's important that this is taken into 678 account when logs are chosen for default settings in HTTPS clients. 679 This concern is discussed in Section 10.5.5. 681 A log may cease operation, in which case there will soon be no STH 682 within the validity window. Clients SHOULD perform all three methods 683 of gossip about a log that has ceased operation since it is possible 684 the log was still compromised and gossip can detect that. STH 685 Pollination is the one mechanism where a client must know about a log 686 shutdown. A client who does not know about a log shutdown MUST NOT 687 attempt any heuristic to detect a shutdown. Instead the client MUST 688 be informed about the shutdown from a verifiable source (e.g. a 689 software update). The client SHOULD be provided the final STH issued 690 by the log and SHOULD resolve SCTs and STHs to this final STH.
If an 691 SCT or STH cannot be resolved to the final STH, clients SHOULD follow 692 the requirements and recommendations set forth in Section 11.1.2. 694 8.2.1. HTTPS Clients and Proof Fetching 696 There are two types of proofs a client may retrieve; inclusion proofs 697 and consistency proofs. 699 An HTTPS client will retrieve SCTs together with certificate chains 700 from an HTTPS server. Using the timestamp in the SCT together with 701 the end-entity certificate and the issuer key hash, it can obtain an 702 inclusion proof to an STH in order to verify the promise made by the 703 SCT. 705 An HTTPS client will have STHs from performing STH Pollination, and 706 may obtain a consistency proof to a more recent STH. 708 An HTTPS client may also receive an SCT bundled with an inclusion 709 proof to a historical STH via an unspecified future mechanism. 710 Because this historical STH is considered personally identifiable 711 information per above, the client needs to obtain a consistency proof 712 to a more recent STH. 714 A client SHOULD perform proof fetching. A client MUST NOT perform 715 proof fetching for any SCTs or STHs issued by a locally added log. A 716 client MAY fetch an inclusion proof for an SCT (issued by a pre- 717 loaded log) that validates a certificate chaining to a locally added 718 trust anchor. 720 If a client requested either proof directly from a log or auditor, it 721 would reveal the client's browsing habits to a third party. To 722 mitigate this risk, an HTTPS client MUST retrieve the proof in a 723 manner that disguises the client. 725 Depending on the client's DNS provider, DNS may provide an 726 appropriate intermediate layer that obfuscates the linkability 727 between the user of the client and the request for inclusion (while 728 at the same time providing a caching layer for oft-requested 729 inclusion proofs). See [draft-ct-over-dns] for an example of how 730 this can be done. 
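As an illustration of the inclusion proof check mentioned earlier in this section, the following Python sketch verifies a Merkle audit path against the root hash carried in an STH. It assumes the SHA-256 tree hashing used by CT logs (0x00 prefix for leaves, 0x01 prefix for interior nodes) and is not part of the gossip protocol itself. All function and parameter names are hypothetical, and build_root and build_path exist only to produce demonstration inputs; how the proof is fetched in a privacy-preserving manner is outside the sketch.

```python
import hashlib


def leaf_hash(entry: bytes) -> bytes:
    # Leaf hashes are domain-separated with a 0x00 prefix.
    return hashlib.sha256(b"\x00" + entry).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # Interior node hashes use a 0x01 prefix.
    return hashlib.sha256(b"\x01" + left + right).digest()


def verify_inclusion(leaf: bytes, index: int, tree_size: int,
                     path: list, root: bytes) -> bool:
    """Recompute the root from a leaf hash and its audit path."""
    if index >= tree_size:
        return False
    fn, sn = index, tree_size - 1
    r = leaf
    for p in path:
        if sn == 0:
            return False  # path is longer than the tree allows
        if fn & 1 or fn == sn:
            r = node_hash(p, r)  # sibling is on the left
            # Skip levels where this node has no right sibling.
            while fn != 0 and not fn & 1:
                fn >>= 1
                sn >>= 1
        else:
            r = node_hash(r, p)  # sibling is on the right
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root


def _split(n: int) -> int:
    # Largest power of two strictly smaller than n (n > 1).
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def build_root(entries: list) -> bytes:
    # Demonstration helper: Merkle tree hash over raw entries.
    if len(entries) == 1:
        return leaf_hash(entries[0])
    k = _split(len(entries))
    return node_hash(build_root(entries[:k]), build_root(entries[k:]))


def build_path(index: int, entries: list) -> list:
    # Demonstration helper: audit path for entries[index].
    if len(entries) == 1:
        return []
    k = _split(len(entries))
    if index < k:
        return build_path(index, entries[:k]) + [build_root(entries[k:])]
    return build_path(index - k, entries[k:]) + [build_root(entries[:k])]
```

A client holding an SCT would derive the leaf hash from the logged entry, obtain the audit path and leaf index from the log (or an intermediary), and pass the tree size and root hash of the STH it is resolving to. A False result is not by itself proof of misbehavior; it should be handled like the other resolution errors discussed in this section.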
732 Anonymity networks such as Tor also present a mechanism for a client 733 to anonymously retrieve a proof from an auditor or log. 735 Even when using a privacy-preserving layer between the client and the 736 log, certain observations may be made about an anonymous client or 737 general user behavior depending on how proofs are fetched. For 738 example, if a client fetched all outstanding proofs at once, a log 739 would know that SCTs or STHs received around the same time are more 740 likely to come from a particular client. This could potentially go 741 so far as correlation of activity at different times to a single 742 client. In aggregate the data could reveal what sites are commonly 743 visited together. HTTPS clients SHOULD use a strategy of proof 744 fetching that attempts to obfuscate these patterns. A suggestion for 745 such a policy can be found in Section 11.2. 747 Resolving SCTs or STHs may result in errors. These errors 748 may stem from routine downtime or other transient conditions, or they may be 749 indicative of an attack. Clients SHOULD follow the requirements and 750 recommendations set forth in Section 11.1.2 when handling these 751 errors in order to give the CT ecosystem the greatest chance of 752 detecting and responding to a compromise. 754 8.2.2. STH Pollination without Proof Fetching 756 An HTTPS client MAY participate in STH Pollination without fetching 757 proofs. In this situation, the client receives STHs from a server, 758 applies the same validation logic to them (signed by a known log, 759 within the validity window) and will later pass them to another HTTPS 760 server. 762 When operating in this fashion, the HTTPS client is promoting gossip 763 for Certificate Transparency, but derives no direct benefit itself.
764 In comparison, a client who resolves SCTs or historical STHs to 765 recent STHs and pollinates them is assured that if it was attacked, 766 there is a probability that the ecosystem will detect and respond to 767 the attack (by distrusting the log). 769 8.2.3. Auditor Action 771 CT auditors participate in STH pollination by retrieving STHs from 772 HTTPS servers. They verify that the STH is valid by checking the 773 signature, and requesting a consistency proof from the STH to the 774 most recent STH. 776 After retrieving the consistency proof to the most recent STH, they 777 SHOULD pollinate this new STH among participating HTTPS servers. In 778 this way, as STHs "age out" and are no longer fresh, their "lineage" 779 continues to be tracked in the system. 781 8.2.4. STH Pollination data format 783 The data sent from HTTPS clients and CT auditors to HTTPS servers is 784 a JSON object [RFC7159] with the following content: 786 o sths - an array of 0 or more fresh SignedTreeHeads as defined in 787 [RFC-6962-BIS-09] Section 3.6.1. 789 8.3. Trusted Auditor Stream 791 HTTPS clients MAY send SCTs and cert chains, as well as STHs, 792 directly to auditors. If sent, this data MAY include data that 793 reflects locally added logs or trust anchors. Note that there are 794 privacy implications in doing so, these are outlined in 795 Section 10.5.1 and Section 10.5.6. 797 The most natural trusted auditor arrangement arguably is a web 798 browser that is "logged in to" a provider of various internet 799 services. Another equivalent arrangement is a trusted party like a 800 corporation to which an employee is connected through a VPN or by 801 other similar means. A third might be individuals or smaller groups 802 of people running their own services. In such a setting, retrieving 803 proofs from that third party could be considered reasonable from a 804 privacy perspective. 
The HTTPS client may also do its own auditing 805 and might additionally share SCTs and STHs with the trusted party to 806 contribute to herd immunity. Here, the ordinary [RFC-6962-BIS-09] 807 protocol is sufficient for the client to do the auditing while SCT 808 Feedback and STH Pollination can be used in whole or in parts for the 809 gossip part. 811 Another well established trusted party arrangement on the internet 812 today is the relation between internet users and their providers of 813 DNS resolver services. DNS resolvers are typically provided by the 814 internet service provider (ISP) used, which by the nature of name 815 resolving already know a great deal about which sites their users 816 visit. As mentioned in Section 8.2.1, in order for HTTPS clients to 817 be able to retrieve proofs in a privacy preserving manner, logs could 818 expose a DNS interface in addition to the ordinary HTTPS interface. 819 A specification of such a protocol can be found in 820 [draft-ct-over-dns]. 822 8.3.1. Trusted Auditor data format 824 Trusted Auditors expose a REST API at the fixed URI: 826 https:///ct-gossip/v1/trusted-auditor 828 Submissions are made by sending an HTTPS POST request, with the body 829 of the POST in a JSON object. Upon successful receipt the Trusted 830 Auditor returns 200 OK. 832 The JSON object consists of two top-level keys: 'sct_feedback' and 833 'sths'. The 'sct_feedback' value is an array of JSON objects as 834 defined in Section 8.1.1. The 'sths' value is an array of STHs as 835 defined in Section 8.2.4. 837 Example: 839 { 840 'sct_feedback' : 841 [ 842 { 843 'x509_chain' : 844 [ 845 '----BEGIN CERTIFICATE---\n 846 AAA...', 847 '----BEGIN CERTIFICATE---\n 848 AAA...', 849 ... 850 ], 851 'sct_data' : 852 [ 853 'AAA...', 854 'AAA...', 855 ... 856 ] 857 }, ... 858 ], 859 'sths' : 860 [ 861 'AAA...', 862 'AAA...', 863 ... 864 ] 865 } 867 9. 
3-Method Ecosystem 869 The use of three distinct methods for auditing logs may seem 870 excessive, but each represents a needed component in the CT 871 ecosystem. To understand why, the drawbacks of each component must 872 be outlined. In this discussion we assume that an attacker knows 873 which mechanisms an HTTPS client and HTTPS server implement. 875 9.1. SCT Feedback 877 SCT Feedback requires the cooperation of HTTPS clients and more 878 importantly HTTPS servers. Although SCT Feedback does require a 879 significant amount of server-side logic to respond to the 880 corresponding APIs, this functionality does not require 881 customization, so it may be pre-provided and work out of the box. 882 However, to take full advantage of the system, an HTTPS server would 883 wish to perform some configuration to optimize its operation: 885 o Minimize its disk commitment by maintaining a list of known SCTs 886 and certificate chains (or hashes thereof) 888 o Maximize its chance of detecting a misissued certificate by 889 configuring a trust store of CAs 891 o Establish a "push" mechanism for POSTing SCTs to CT auditors 893 These configuration needs, and the simple fact that it would require 894 some deployment of software, mean that some percentage of HTTPS 895 servers will not deploy SCT Feedback. 897 It is worthwhile to note that an attacker may be able to prevent 898 detection of an attack on a webserver (in all cases) if SCT Feedback 899 is not implemented. This attack is detailed in Section 10.1. 901 If SCT Feedback were the only mechanism in the ecosystem, any server 902 that did not implement the feature would open itself and its users to 903 attack without any possibility of detection. 905 If SCT Feedback is not deployed by a webserver, malicious logs will 906 be able to attack all users of the webserver (who do not have a 907 Trusted Auditor relationship) with impunity.
Additionally, users who 908 wish to have the strongest measure of privacy protection (by 909 disabling STH Pollination Proof Fetching and forgoing a Trusted 910 Auditor) could be attacked without risk of detection. 912 9.2. STH Pollination 914 STH Pollination requires the cooperation of HTTPS clients, HTTPS 915 servers, and logs. 917 For a client to fully participate in STH Pollination, and have this 918 mechanism detect attacks against it, the client must have a way to 919 safely perform Proof Fetching in a privacy preserving manner. (The 920 client may pollinate STHs it receives without performing Proof 921 Fetching, but we do not consider this option in this section.) 923 HTTPS servers must deploy software (although, as in the case with SCT 924 Feedback this logic can be pre-provided) and commit some configurable 925 amount of disk space to the endeavor. 927 Logs (or a third party mirroring the logs) must provide access to 928 clients to query proofs in a privacy preserving manner, most likely 929 through DNS. 931 Unlike SCT Feedback, the STH Pollination mechanism is not hampered if 932 only a minority of HTTPS servers deploy it. However, it makes an 933 assumption that an HTTPS client performs Proof Fetching (such as the 934 DNS mechanism discussed). Unfortunately, any manner that is 935 anonymous for some (such as clients who use shared DNS services such 936 as a large ISP), may not be anonymous for others. 938 For instance, DNS requests expose a considerable amount of sensitive 939 information (including what data is already present in the cache) in 940 plaintext over the network. For this reason, some percentage of 941 HTTPS clients may choose to not enable the Proof Fetching component 942 of STH Pollination. (Although they can still request and send STHs 943 among participating HTTPS servers, even when this affords them no 944 direct benefit.) 
946 If STH Pollination was the only mechanism deployed, users that 947 disable it would be able to be attacked without risk of detection. 949 If STH Pollination was not deployed, HTTPS clients visiting HTTPS 950 Servers who did not deploy SCT Feedback could be attacked without 951 risk of detection. 953 9.3. Trusted Auditor Relationship 955 The Trusted Auditor Relationship is expected to be the rarest gossip 956 mechanism, as an HTTPS client is providing an unadulterated report of 957 its browsing history to a third party. While there are valid and 958 common reasons for doing so, there is no appropriate way to enter 959 into this relationship without retrieving informed consent from the 960 user. 962 However, the Trusted Auditor Relationship mechanism still provides 963 value to a class of HTTPS clients. For example, web crawlers have no 964 concept of a "user" and no expectation of privacy. Organizations 965 already performing network auditing for anomalies or attacks can run 966 their own Trusted Auditor for the same purpose with marginal increase 967 in privacy concerns. 969 The ability to change one's Trusted Auditor is a form of Trust 970 Agility that allows a user to choose who to trust, and be able to 971 revise that decision later without consequence. A Trusted Auditor 972 connection can be made more confidential than DNS (through the use of 973 TLS), and can even be made (somewhat) anonymous through the use of 974 anonymity services such as Tor. (Note that this does ignore the de- 975 anonymization possibilities available from viewing a user's browsing 976 history.) 978 If the Trusted Auditor relationship was the only mechanism deployed, 979 users who do not enable it (the majority) would be able to be 980 attacked without risk of detection. 982 If the Trusted Auditor relationship was not deployed, crawlers and 983 organizations would build it themselves for their own needs. 
By 984 standardizing it, users who wish to opt in (for instance those 985 unwilling to participate fully in STH Pollination) can have an 986 interoperable standard they can use to choose and change their 987 trusted auditor. 989 9.4. Interaction 991 The interactions of the mechanisms are outlined as follows: 993 HTTPS clients can be attacked without risk of detection if they do 994 not participate in any of the three mechanisms. 996 HTTPS clients are afforded the greatest chance of detecting an attack 997 when they either participate in both SCT Feedback and STH Pollination 998 with Proof Fetching or have a Trusted Auditor relationship. 999 (Participating in SCT Feedback is required to prevent a malicious log 1000 from refusing to ever resolve an SCT to an STH, as put forward in 1001 Section 10.1.) Additionally, participating in SCT Feedback enables 1002 an HTTPS client to assist in detecting the exact target of an attack. 1004 HTTPS servers that omit SCT Feedback enable malicious logs to carry 1005 out attacks without risk of detection. If these servers are targeted 1006 specifically, even if the attack is detected, without SCT Feedback 1007 they may never learn that they were specifically targeted. HTTPS 1008 servers without SCT Feedback do gain some measure of herd immunity, 1009 but only because their clients participate in STH Pollination (with 1010 Proof Fetching) or have a Trusted Auditor Relationship. 1012 When HTTPS servers omit SCT Feedback, it allows their users to be 1013 attacked without detection by a malicious log; the vulnerable users 1014 are those who do not have a Trusted Auditor relationship. 1016 10. Security considerations 1018 10.1. Attacks by actively malicious logs 1020 One of the most powerful attacks possible in the CT ecosystem comes from a 1021 trusted log that has actively decided to be malicious. It can carry 1022 out an attack in two ways: 1024 In the first attack, the log can present a split view of the log for 1025 all time.
The only way to detect this attack is to resolve each view 1026 of the log to the two most recent STHs and then force the log to 1027 present a consistency proof. (Which it cannot.) This attack can be 1028 detected by CT auditors participating in STH Pollination, as long as 1029 they are explicitly built to handle the situation of a log 1030 continuously presenting a split view. 1032 In the second attack, the log can sign an SCT, and refuse to ever 1033 include the certificate that the SCT refers to in the tree. 1034 (Alternately, it can include it in a branch of the tree and issue an 1035 STH, but then abandon that branch.) Whenever someone requests an 1036 inclusion proof for that SCT (or a consistency proof from that STH), 1037 the log would respond with an error, and a client may simply regard 1038 the response as a transient error. This attack can be detected using 1039 SCT Feedback, or an Auditor of Last Resort, as presented in 1040 Section 11.1.2. 1042 10.2. Dual-CA Compromise 1044 [dual-ca-compromise-attack] describes an attack possible by an 1045 adversary who compromises two Certificate Authorities and a Log. This 1046 attack is difficult to defend against in the CT ecosystem, and 1047 [dual-ca-compromise-attack] describes a few approaches to doing so. 1048 We note that Gossip is not intended to defend against this attack, 1049 but can in certain modes. 1051 Defending against the Dual-CA Compromise attack requires SCT 1052 Feedback, and explicitly requires the server to save full certificate 1053 chains (described in Section 8.1.3 as the 'complex' configuration.) 1054 After CT auditors receive the full certificate chains from servers, 1055 they MAY compare the chain built by clients to the chain supplied by 1056 the log. If the chains differ significantly, the auditor SHOULD 1057 raise a concern. 
A method of determining if chains differ 1058 significantly is by asserting that one chain is not a subset of the 1059 other and that the roots of the chains are different. 1061 [Note: Justification for this algorithm: 1063 Cross-Signatures could result in a different org being treated as the 1064 'root', but in this case, one chain would be a subset of the other. 1066 Intermediate swapping (e.g. different signature algorithms) could 1067 result in different chains, but the root would be the same. 1069 (Hitting both those cases at once would cause a false positive 1070 though, but this would likely be rare.) 1072 Are there other cases that could occur? (Left for the purposes of 1073 reading during pre-Last Call, to be removed by Editor)] 1075 10.3. Censorship/Blocking considerations 1077 We assume a network attacker who is able to fully control the 1078 client's internet connection for some period of time, including 1079 selectively blocking requests to certain hosts and truncating TLS 1080 connections based on information observed or guessed about client 1081 behavior. In order to successfully detect log misbehavior, the 1082 gossip mechanisms must still work even in these conditions. 1084 There are several gossip connections that can be blocked: 1086 1. Clients sending SCTs to servers in SCT Feedback 1088 2. Servers sending SCTs to auditors in SCT Feedback (server push 1089 mechanism) 1091 3. Servers making SCTs available to auditors (auditor pull 1092 mechanism) 1094 4. Clients fetching proofs in STH Pollination 1096 5. Clients sending STHs to servers in STH Pollination 1098 6. Servers sending STHs to clients in STH Pollination 1100 7. Clients sending SCTs to Trusted Auditors 1102 If a party cannot connect to another party, it can be assured that 1103 the connection did not succeed. While it may not have been 1104 maliciously blocked, it knows the transaction did not succeed. 
1105 Mechanisms which result in a positive affirmation from the recipient 1106 that the transaction succeeded allow confirmation that a connection 1107 was not blocked. In this situation, the party can factor this into 1108 strategies suggested in Section 11.3 and in Section 11.1.2. 1110 The connections that allow positive affirmation are 1, 2, 4, 5, and 1111 7. 1113 More insidious is blocking the connections that do not allow positive 1114 confirmation: 3 and 6. An attacker may truncate or drop a response 1115 from a server to a client, such that the server believes it has 1116 shared data with the recipient, when it has not. However, in both 1117 scenarios (3 and 6), the server cannot distinguish the client as a 1118 cooperating member of the CT ecosystem or as an attacker performing a 1119 Sybil attack, aiming to flush the server's data store. Therefore the 1120 fact that these connections can be undetectably blocked does not 1121 actually alter the threat model of servers responding to these 1122 requests. The choice of algorithm to release data is crucial to 1123 protect against these attacks; strategies are suggested in 1124 Section 11.3. 1126 Handling censorship and network blocking (which is indistinguishable 1127 from network error) is relegated to the implementation policy chosen 1128 by clients. Suggestions for client behavior are specified in 1129 Section 11.1. 1131 10.4. Flushing Attacks 1133 A flushing attack is an attempt by an adversary to flush a particular 1134 piece of data from a pool. In the CT Gossip ecosystem, an attacker 1135 may have performed an attack and left evidence of a compromised log 1136 on a client or server. They would be interested in flushing that 1137 data, i.e. tricking the target into gossiping or pollinating the 1138 incriminating evidence with only attacker-controlled clients or 1139 servers with the hope they trick the target into deleting it. 
1141 Flushing attacks may be defended against differently depending on the 1142 entity (HTTPS client or HTTPS server) and record (STHs or SCTs with 1143 Certificate Chains). 1145 10.4.1. STHs 1147 For both HTTPS clients and HTTPS servers, STHs within the validity 1148 window SHOULD NOT be deleted. An attacker cannot flush an item from 1149 the cache if it is never removed, so flushing attacks are completely 1150 mitigated. 1152 The required disk space for all STHs within the validity window is 1153 336 STHs per log that is trusted. If 20 logs are trusted, and each 1154 STH takes 1 Kilobyte, this is 336 x 20 x 1 Kilobyte = 6,720 Kilobytes, or about 6.56 Megabytes. 1156 Note that it is important that implementors do not provision the cache for 1157 exactly the expected size - if an attack does occur, a small 1158 number of additional STHs will enter into the cache. These STHs will 1159 be in addition to the expected set, and will be evidence of the 1160 attack. 1162 If an HTTPS client or HTTPS server is operating in a constrained 1163 environment and cannot devote enough storage space to hold all STHs 1164 within the validity window, it is recommended to use the 1165 Deletion Algorithm in Section 11.3.2 to make it more difficult for the 1166 attacker to perform a flushing attack. 1168 10.4.2. SCTs & Certificate Chains on HTTPS Servers 1170 An HTTPS server will only accept SCTs and Certificate Chains for 1171 domains it is authoritative for. Therefore the storage space needed 1172 is bounded by the number of logs it accepts, multiplied by the number 1173 of domains it is authoritative for, multiplied by the number of 1174 certificates issued for those domains. 1176 Imagine a server authoritative for 10,000 domains, and each domain 1177 has 3 certificate chains, and 10 SCTs. A certificate chain is 5 1178 Kilobytes in size and an SCT 1 Kilobyte. This yields 732 Megabytes. 1180 This data can be large, but it is calculable.
Web properties with 1181 more certificates and domains are more likely to be able to handle 1182 the increased storage need, while small web properties will not see 1183 an undue burden. Therefore HTTPS servers SHOULD NOT delete SCTs or 1184 Certificate Chains. This completely mitigates flushing attacks. 1186 Again, note that it is important that implementors do not provision 1187 the cache for exactly the expected size - if an attack does occur, the new 1188 SCT(s) and Certificate Chain(s) will enter into the cache. This data 1189 will be in addition to the expected set, and will be evidence of the 1190 attack. 1192 If an HTTPS server is operating in a constrained environment and 1193 cannot devote enough storage space to hold all SCTs and Certificate 1194 Chains it is authoritative for, it is recommended to configure the SCT 1195 Feedback mechanism to allow only certain certificates that are known 1196 to be valid. These chains and SCTs can then be discarded without 1197 being stored or subsequently provided to any clients or auditors. If 1198 the allowlist is not sufficient, the Deletion Algorithm in 1199 Section 11.3.2 is recommended to make it more difficult for the 1200 attacker to perform a flushing attack. 1202 10.4.3. SCTs & Certificate Chains on HTTPS Clients 1204 HTTPS clients will accumulate SCTs and Certificate Chains without 1205 bound. It is expected they will choose a particular cache size and 1206 delete entries when the cache reaches its limit. This does not 1207 mitigate flushing attacks, and such an attack is documented in 1208 [gossip-mixing]. 1210 The Deletion Algorithm in Section 11.3.2 is recommended to make it 1211 more difficult for the attacker to perform a flushing attack. 1213 10.5. Privacy considerations 1215 CT Gossip deals with HTTPS clients which are trying to share 1216 indicators that correspond to their browsing history.
The most 1217 sensitive relationships in the CT ecosystem are the relationships 1218 between HTTPS clients and HTTPS servers. Client-server relationships 1219 can be aggregated into a network graph with potentially serious 1220 implications for correlative de-anonymization of clients and 1221 relationship-mapping or clustering of servers or of clients. 1223 There are, however, certain clients that do not require privacy 1224 protection. Examples of these clients are web crawlers or robots. 1225 But even in this case, the method by which these clients crawl the 1226 web may in fact be considered sensitive information. In general, it 1227 is better to err on the side of safety, and not assume a client is 1228 okay with giving up its privacy. 1230 10.5.1. Privacy and SCTs 1232 An SCT contains information that links it to a particular web site. 1233 Because the client-server relationship is sensitive, gossip between 1234 clients and servers about unrelated SCTs is risky. Therefore, a 1235 client with an SCT for a given server SHOULD NOT transmit that 1236 information over any channel other than the following two: to the 1237 server associated with the SCT itself; or to a Trusted Auditor, if 1238 one exists. 1240 10.5.2. Privacy in SCT Feedback 1242 SCTs introduce yet another mechanism for HTTPS servers to store state 1243 on an HTTPS client, and potentially track users. HTTPS clients which 1244 allow users to clear history or cookies associated with an origin 1245 MUST clear stored SCTs and certificate chains associated with the 1246 origin as well. 1248 Auditors should treat all SCTs as sensitive data. SCTs received 1249 directly from an HTTPS client are especially sensitive, because the 1250 auditor is trusted by the client not to reveal its associations 1251 with servers.
Auditors MUST NOT share such SCTs in any way, 1252 including sending them to an external log, without first mixing them 1253 with multiple other SCTs learned through submissions from multiple 1254 other clients. Suggestions for mixing SCTs are presented in 1255 Section 11.3. 1257 There is a possible fingerprinting attack where a log issues a unique 1258 SCT for targeted log client(s). A colluding log and HTTPS server 1259 operator could therefore be a threat to the privacy of an HTTPS 1260 client. Given all the other opportunities for HTTPS servers to 1261 fingerprint clients - TLS session tickets, HPKP and HSTS headers, 1262 HTTP Cookies, etc. - this is considered acceptable. 1264 The fingerprinting attack described above would be mitigated by a 1265 requirement that logs must use a deterministic signature scheme when 1266 signing SCTs ([RFC-6962-BIS-09] Section 2.1.4). A log signing using 1267 RSA is not required to use a deterministic signature scheme. 1269 Since logs are allowed to issue a new SCT for a certificate already 1270 present in the log, mandating deterministic signatures does not stop 1271 this fingerprinting attack altogether. It does make the attack 1272 harder to pull off without being detected though. 1274 There is another similar fingerprinting attack where an HTTPS server 1275 tracks a client by using a unique certificate or a variation of cert 1276 chains. The risk for this attack is accepted on the same grounds as 1277 the unique SCT attack described above. 1279 10.5.3. Privacy for HTTPS clients performing STH Proof Fetching 1281 An HTTPS client performing Proof Fetching SHOULD NOT request proofs 1282 from a CT log that it doesn't accept SCTs from. An HTTPS client 1283 SHOULD regularly request an STH from all logs it is willing to 1284 accept, even if it has seen no SCTs from that log. 
1286 The time between two polls for new STHs SHOULD NOT be significantly 1287 shorter than the MMD of the polled log divided by its STH Frequency 1288 Count ([RFC-6962-BIS-09] Section 5.1). 1290 The actual mechanism by which Proof Fetching is done carries 1291 considerable privacy concerns. Although out of scope for this 1292 document, DNS is one mechanism currently under discussion. DNS exposes data 1293 in plaintext over the network (including what sites the user is 1294 visiting and what sites they have previously visited) and may not be 1295 suitable for some clients. 1297 10.5.4. Privacy in STH Pollination 1299 An STH linked to an HTTPS client may indicate the following about 1300 that client: 1302 o that the client gossips; 1304 o that the client has been using CT at least until the time that the 1305 timestamp and the tree size indicate; 1307 o that the client is talking, possibly indirectly, to the log 1308 indicated by the tree hash; 1310 o which software and software version is being used. 1312 There is a possible fingerprinting attack where a log issues a unique 1313 STH for a targeted HTTPS client. This is similar to the 1314 fingerprinting attack described in Section 10.5.2, but can operate 1315 cross-origin. If a log (or HTTPS server cooperating with a log) 1316 provides a unique STH to a client, the targeted client will be the 1317 only client pollinating that STH cross-origin. 1319 This attack is partially mitigated because the log is limited in the number of 1320 STHs it can issue. The log must 'save' one of its STHs each MMD to 1321 perform the attack. 1323 10.5.5. Privacy in STH Interaction 1325 An HTTPS client may pollinate any STH within the last 14 days. An 1326 HTTPS client may also pollinate an STH for any log that it knows 1327 about. When a client pollinates STHs to a server, it will release 1328 more than one STH at a time. It is unclear if a server may 'prime' a 1329 client and be able to reliably detect the client at a later time.
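The release behavior described above (only STHs from the last 14 days are eligible, and several are released at once) can be sketched as follows. This is an illustration under stated assumptions: the StoredSTH type, the function names, and the max_count parameter are hypothetical, and uniform random sampling is only one possible release policy, not one mandated by this document.

```python
import random
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

VALIDITY_WINDOW = timedelta(days=14)

# Backed by the operating system's cryptographically secure generator.
_rng = random.SystemRandom()


@dataclass(frozen=True)
class StoredSTH:
    """Hypothetical stored STH: the issuing log and the STH timestamp."""
    log_id: str
    timestamp: datetime


def eligible_sths(sths, now):
    """Keep only STHs that are still inside the 14 day window."""
    return [s for s in sths if now - s.timestamp < VALIDITY_WINDOW]


def pick_for_pollination(sths, now, max_count):
    """Pick up to max_count eligible STHs, unpredictably."""
    candidates = eligible_sths(sths, now)
    if len(candidates) <= max_count:
        return candidates
    return _rng.sample(candidates, max_count)
```

Filtering before sampling means an expired STH is never selected, and the selection always terminates even when few eligible STHs remain.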
1331 It's clear that a single site can track a user any way they wish, but 1332 this attack works cross-origin and is therefore more concerning. Two 1333 independent sites A and B want to collaborate to track a user cross- 1334 origin. A feeds a client Carol some N specific STHs from the M logs 1335 Carol trusts, chosen to be older and less common, but still in the 1336 validity window. Carol visits B and chooses to release some of the 1337 STHs she has stored, according to some policy. 1339 Modeling a representation for how common older STHs are in the pools 1340 of clients, and examining that with a given policy of how to choose 1341 which of those STHs to send to B, it should be possible to calculate 1342 statistics about how unique Carol looks when talking to B and how 1343 useful/accurate such a tracking mechanism is. 1345 Building such a model is likely impossible without some real world 1346 data, and requires a given implementation of a policy. To combat 1347 this attack, suggestions are provided in Section 11.3 to attempt to 1348 minimize it, but follow-up testing with real world deployment to 1349 improve the policy will be required. 1351 10.5.6. Trusted Auditors for HTTPS Clients 1353 Some HTTPS clients may choose to use a trusted auditor. This trust 1354 relationship exposes a large amount of information about the client 1355 to the auditor. In particular, it will identify the web sites that 1356 the client has visited to the auditor. Some clients may already 1357 share this information to a third party, for example, when using a 1358 server to synchronize browser history across devices in a server- 1359 visible way, or when doing DNS lookups through a trusted DNS 1360 resolver. For clients with such a relationship already established, 1361 sending SCTs to a trusted auditor run by the same organization does 1362 not appear to expose any additional information to the trusted third 1363 party. 
1365 Clients who wish to contact a CT auditor without associating their 1366 identities with their SCTs may wish to use an anonymizing network 1367 like Tor to submit SCT Feedback to the auditor. Auditors SHOULD 1368 accept SCT Feedback that arrives over such anonymizing networks. 1370 Clients sending feedback to an auditor may prefer to reduce the 1371 temporal granularity of the history exposure to the auditor by 1372 caching and delaying their SCT Feedback reports. This is elaborated 1373 upon in Section 11.3. This strategy is only as effective as the 1374 granularity of the timestamps embedded in the SCTs and STHs. 1376 10.5.7. HTTPS Clients as Auditors 1378 Some HTTPS clients may choose to act as CT auditors themselves. A 1379 Client taking on this role needs to consider the following: 1381 o an Auditing HTTPS client potentially exposes its history to the 1382 logs that they query. Querying the log through a cache or a proxy 1383 with many other users may avoid this exposure, but may expose 1384 information to the cache or proxy, in the same way that a non- 1385 Auditing HTTPS Client exposes information to a Trusted Auditor. 1387 o an effective CT auditor needs a strategy about what to do in the 1388 event that it discovers misbehavior from a log. Misbehavior from 1389 a log involves the log being unable to provide either (a) a 1390 consistency proof between two valid STHs or (b) an inclusion proof 1391 for a certificate to an STH any time after the log's MMD has 1392 elapsed from the issuance of the SCT. The log's inability to 1393 provide either proof will not be externally cryptographically- 1394 verifiable, as it may be indistinguishable from a network error. 1396 11. Policy Recommendations 1398 This section is intended as suggestions to implementors of HTTPS 1399 Clients, HTTPS servers, and CT auditors. It is not a requirement for 1400 technique of implementation, so long as privacy considerations 1401 established above are obeyed. 1403 11.1. 
Blocking Recommendations 1405 11.1.1. Frustrating blocking 1407 When making gossip connections to HTTPS servers or Trusted Auditors, 1408 it is desirable to minimize the plaintext metadata in the connection 1409 that can be used to identify the connection as a gossip connection 1410 and therefore be of interest to block. Additionally, introducing 1411 some randomness into client behavior may be important. We assume 1412 that the adversary is able to inspect the behavior of the HTTPS 1413 client and understand how it makes gossip connections. 1415 As an example, if a client, after establishing a TLS connection (and 1416 receiving an SCT, but not making its own HTTP request yet), 1417 immediately opens a second TLS connection for the purpose of gossip, 1418 the adversary can reliably block this second connection to block 1419 gossip without affecting normal browsing. For this reason it is 1420 recommended to run the gossip protocols over an existing connection 1421 to the server, making use of connection multiplexing such as HTTP 1422 Keep-Alive or SPDY. 1424 Truncation is also a concern. If a client always establishes a TLS 1425 connection, makes a request, receives a response, and then always 1426 attempts a gossip communication immediately following the first 1427 response, truncation will allow an attacker to block gossip reliably. 1429 For these reasons, we recommend that, if at all possible, clients 1430 SHOULD send gossip data in an already established TLS session. This 1431 can be done through the use of HTTP Pipelining, SPDY, or HTTP/2. 1433 11.1.2. Responding to possible blocking 1435 In some circumstances a client may have a piece of data that they 1436 have attempted to share (via SCT Feedback or STH Pollination), but 1437 have been unable to do so: with every attempt they receive an error. 1438 These situations are: 1440 1. The client has an SCT and a certificate, and attempts to retrieve 1441 an inclusion proof - but receives an error on every attempt. 
1443 2. The client has an STH, and attempts to resolve it to a newer STH 1444 via a consistency proof - but receives an error on every attempt. 1446 3. The client has attempted to share an SCT and constructed 1447 certificate via SCT Feedback - but receives an error on every 1448 attempt. 1450 4. The client has attempted to share an STH via STH Pollination - 1451 but receives an error on every attempt. 1453 5. The client has attempted to share a specific piece of data with a 1454 Trusted Auditor - but receives an error on every attempt. 1456 In the case of 1 or 2, it is conceivable that the reason for the 1457 errors is that the log acted improperly, either through malicious 1458 actions or compromise. It may be impossible to fetch a proof because 1459 it does not exist (and only errors or timeouts occur). One such 1460 situation may arise because of an actively malicious log, as 1461 presented in Section 10.1. This data is especially important to 1462 share with the broader internet to detect this situation. 1464 If the client has repeatedly attempted to resolve an SCT to an STH via an inclusion 1465 proof, and every attempt has failed, this SCT might very 1466 well be a compromising proof of an attack. However, the client MUST 1467 NOT share the data with any third party (except a Trusted 1468 Auditor, should one exist). 1470 If the client has repeatedly attempted to resolve an STH to a newer STH via a 1471 consistency proof, and every attempt has failed, the client 1472 MAY share the STH with an "Auditor of Last Resort" even if the STH in 1473 question is no longer within the validity window. This auditor may 1474 be pre-configured in the client, but the client SHOULD permit a user 1475 to disable the functionality or change to whom data is sent.
The 1476 Auditor of Last Resort itself represents a point of failure and 1477 privacy concerns, so if implemented, it SHOULD connect using public 1478 key pinning and not consider an item delivered until it receives a 1479 confirmation. 1481 In the cases 3, 4, and 5, we assume that the webserver(s) or trusted 1482 auditor in question is either experiencing an operational failure, or 1483 being attacked. In both cases, a client SHOULD retain the data for 1484 later submission (subject to Private Browsing or other history- 1485 clearing actions taken by the user.) This is elaborated upon more in 1486 Section 11.3. 1488 11.2. Proof Fetching Recommendations 1490 Proof fetching (both inclusion proofs and consistency proofs) SHOULD 1491 be performed at random time intervals. If proof fetching occurred 1492 all at once, in a flurry of activity, a log would know that SCTs or 1493 STHs received around the same time are more likely to come from a 1494 particular client. While proof fetching is required to be done in a 1495 manner that attempts to be anonymous from the perspective of the log, 1496 the correlation of activity to a single client would still reveal 1497 patterns of user behavior we wish to keep confidential. These 1498 patterns could be recognizable as a single user, or could reveal what 1499 sites are commonly visited together in the aggregate. 1501 11.3. Record Distribution Recommendations 1503 In several components of the CT Gossip ecosystem, the recommendation 1504 is made that data from multiple sources be ingested, mixed, stored 1505 for an indeterminate period of time, provided (multiple times) to a 1506 third party, and eventually deleted. 
The instances of these 1507 recommendations in this draft are: 1509 o When a client receives SCTs during SCT Feedback, it should store 1510 the SCTs and Certificate Chain for some amount of time, provide 1511 some of them back to the server at some point, and may eventually 1512 remove them from its store 1514 o When a client receives STHs during STH Pollination, it should 1515 store them for some amount of time, mix them with other STHs, 1516 release some of them to various servers at some point, 1517 resolve some of them to new STHs, and eventually remove them from 1518 its store 1520 o When a server receives SCTs during SCT Feedback, it should store 1521 them for some period of time, provide them to auditors some number 1522 of times, and may eventually remove them 1524 o When a server receives STHs during STH Pollination, it should 1525 store them for some period of time, mix them with other STHs, 1526 provide some of them to connecting clients, may resolve them to 1527 new STHs via Proof Fetching, and eventually remove them from its 1528 store 1530 o When a Trusted Auditor receives SCTs or historical STHs from 1531 clients, it should store them for some period of time, mix them 1532 with SCTs received from other clients, and act upon them at some 1533 point in time 1535 Each of these instances has specific requirements for user privacy, 1536 and each has options that may not be invoked. As one example, an 1537 HTTPS client should not mix SCTs from server A with SCTs from server 1538 B and release server B's SCTs to server A. As another example, an 1539 HTTPS server may choose to resolve STHs to a single more current STH 1540 via proof fetching, but it is under no obligation to do so. 1542 These requirements should be met, but the general problem of 1543 aggregating multiple pieces of data, choosing when and how many to 1544 release, and when to remove them is shared.
This problem has 1545 previously been considered in the case of Mix Networks and Remailers, 1546 including papers such as [trickle]. 1548 There are several concerns to be addressed in this area, outlined 1549 below. 1551 11.3.1. Mixing Algorithm 1553 When SCTs or STHs are recorded by a participant in CT Gossip and 1554 later used, it is important that they are selected from the datastore 1555 in a non-deterministic fashion. 1557 This is most important for servers, as they can be queried for SCTs 1558 and STHs anonymously. If the server used a predictable ordering 1559 algorithm, an attacker could exploit the predictability to learn 1560 information about a client. One such method would be by observing 1561 the (encrypted) traffic to a server. When a client of interest 1562 connects, the attacker makes a note. The attacker then observes more clients 1563 connecting, predicts at what point the client-of-interest's data 1564 will be disclosed, and queries the server at that 1565 point. 1567 Although most important for servers, random ordering is still 1568 strongly recommended for clients and Trusted Auditors. The above 1569 attack can still occur for these entities, although the circumstances 1570 are less straightforward. For clients, an attacker could observe 1571 their behavior, note when they receive an STH from a server, and use 1572 JavaScript to cause a network connection at the correct time to force 1573 a client to disclose the specific STH. Trusted Auditors are stewards 1574 of sensitive client data. If an attacker had the ability to observe 1575 the activities of a Trusted Auditor (perhaps by being a log, or 1576 another auditor), they could perform the same attack - noting the 1577 disclosure of data from a client to the Trusted Auditor, and then 1578 correlating a later disclosure from the Trusted Auditor as coming 1579 from that client. 1581 Random ordering can be ensured by several mechanisms.
A datastore 1582 can be shuffled, using a secure shuffling algorithm such as Fisher- 1583 Yates. Alternatively, a series of random indexes into the data store 1584 can be selected (if a collision occurs, a new index is selected). A 1585 cryptographically secure random number generator must be used in 1586 either case. If shuffling is performed, the datastore must be marked 1587 'dirty' upon item insertion, and at least one shuffle operation 1588 must occur on a dirty datastore before data is retrieved from it for use. 1590 11.3.2. The Deletion Algorithm 1592 No entity in CT Gossip is required to delete records at any time, 1593 except to respect a user's wishes such as private browsing mode or 1594 clearing history. However, it is likely that over time the 1595 accumulated storage will grow in size and need to be pruned. 1597 While deletion of data will occur, proof fetching can ensure that any 1598 misbehavior from a log will still be detected, even after the direct 1599 evidence from the attack is deleted. Proof fetching ensures that if 1600 a log presents a split view for a client, it must maintain that 1601 split view in perpetuity. An inclusion proof from an SCT to an STH 1602 does not erase the evidence - the new STH is evidence itself. A 1603 consistency proof from that STH to a new one likewise - the new STH 1604 is every bit as incriminating as the first. (Client behavior in the 1605 situation where an SCT or STH cannot be resolved is suggested in 1606 Section 11.1.2.) Because of this property, we recommend that a 1607 client performing proof fetching make every effort to 1608 not delete data until it has been successfully resolved to a new STH 1609 via a proof. 1611 When it is time to delete a record, it can be done in a way that 1612 makes it more difficult for a successful flushing attack to be 1613 performed. 1615 1. When the record cache has reached a certain size that is yet 1616 under the limit, aggressively perform proof fetching.
This 1617 should resolve records to a small set of STHs that can be 1618 retained. Once a proof has been fetched, the record is safer to 1619 delete. 1621 2. If proof fetching has failed, or is disabled, begin by deleting 1622 SCTs and Certificate Chains that have been successfully reported. 1623 Deletion from this set of SCTs should be done at random. For a 1624 client, a submission is not counted as being reported unless it 1625 is sent over a connection using a different SCT, so the attacker 1626 is faced with a recursive problem. (For a server, this step does 1627 not apply.) 1629 3. Attempt to save any submissions that have failed proof fetching 1630 repeatedly, as these are the most likely to be indicative of an 1631 attack. 1633 4. Finally, if the above steps have been followed and have not 1634 succeeded in reducing the size sufficiently, records may be 1635 deleted at random. 1637 Note that if proof fetching is disabled (which is expected although 1638 not required for servers) - the algorithm collapses down to 'delete 1639 at random'. 1641 The decision to delete records at random is intentional. Introducing 1642 non-determinism in the decision is absolutely necessary to make it 1643 more difficult for an adversary to know with certainty or high 1644 confidence that the record has been successfully flushed from a 1645 target. 1647 11.4. Concrete Recommendations 1649 We present the following pseudocode as a concrete outline of our 1650 policy recommendations. 1652 Both suggestions presented are applicable to both clients and 1653 servers. Servers may not perform proof fetching, in which case large 1654 portions of the pseudocode are not applicable. But it should work in 1655 either case. 1657 11.4.1. STH Pollination 1659 The STH class contains data pertaining specifically to the STH 1660 itself. 
1662 class STH 1663 { 1664 uint16 proof_attempts 1665 uint16 proof_failure_count 1666 uint32 num_reports_to_thirdparty 1667 datetime timestamp 1668 byte[] data 1669 } 1671 The broader STH store itself would contain all the STHs known by an 1672 entity participating in STH Pollination (either client or server). 1673 This simplistic view of the class does not take into account the 1674 complicated locking that would likely be required for a data 1675 structure being accessed by multiple threads. Something to note 1676 about this pseudocode is that it does not remove STHs once they have 1677 been resolved to a newer STH. Doing so might make older STHs within 1678 the validity window rarer and thus enable tracking. 1680 class STHStore 1681 { 1682 STH[] sth_list 1684 // This function is run after receiving a set of STHs from 1685 // a third party in response to a pollination submission 1686 def insert(STH[] new_sths) { 1687 foreach(new in new_sths) { 1688 if(this.sth_list.contains(new)) 1689 continue 1690 this.sth_list.insert(new) 1691 } 1692 } 1694 // This function is called to delete the given STH 1695 // from the data store 1696 def delete_now(STH s) { 1697 this.sth_list.remove(s) 1698 } 1700 // When it is time to perform STH Pollination, the HTTPS client 1701 // calls this function to get a selection of STHs to send as 1702 // feedback 1703 def get_pollination_selection() { 1704 if(len(this.sth_list) < MAX_STH_TO_GOSSIP) 1705 return this.sth_list 1706 else { 1707 indexes = set() 1708 modulus = len(this.sth_list) 1709 while(len(indexes) < MAX_STH_TO_GOSSIP) { 1710 r = randomInt() % modulus 1711 // Ignore STHs that are past the validity window but not 1712 // yet removed. 
1713 if(r not in indexes 1714 && now() - this.sth_list[r].timestamp < TWO_WEEKS) 1715 indexes.insert(r) 1716 } 1718 return_selection = [] 1719 foreach(i in indexes) { 1720 return_selection.insert(this.sth_list[i]) 1721 } 1722 return return_selection 1723 } 1724 } 1725 } 1726 We also suggest a function that will be called periodically in the 1727 background, iterating through the STH store, performing a cleaning 1728 operation and queuing consistency proofs. This function can live as 1729 a member function of the STHStore class. 1731 //Just a suggestion: 1732 #define MIN_PROOF_FAILURES_CONSIDERED_SUSPICIOUS 3 1734 def clean_list() { 1735 foreach(sth in this.sth_list) { 1737 if(now() - sth.timestamp > TWO_WEEKS) { 1738 //STH is too old, we must remove it 1739 if(proof_fetching_enabled 1740 && auditor_of_last_resort_enabled 1741 && sth.proof_failure_count 1742 > MIN_PROOF_FAILURES_CONSIDERED_SUSPICIOUS) { 1743 queue_for_auditor_of_last_resort(sth, 1744 auditor_of_last_resort_callback) 1745 } else { 1746 delete_now(sth) 1747 } 1748 } 1750 else if(proof_fetching_enabled 1751 && now() - sth.timestamp > LOG_MMD 1752 && sth.proof_attempts != UINT16_MAX 1753 // Only fetch a proof if we have never received a proof 1754 // before. (This also avoids submitting something 1755 // already in the queue.) 1756 && sth.proof_attempts == sth.proof_failure_count) { 1757 sth.proof_attempts++ 1758 queue_consistency_proof(sth, consistency_proof_callback) 1759 } 1760 } 1761 } 1763 These functions also exist in the STHStore class. 1765 // This function is called after successfully pollinating STHs 1766 // to a third party. It is passed the STHs sent to the third 1767 // party, which is the output of get_pollination_selection(), as well 1768 // as the STHs received in the response.
1769 def successful_thirdparty_submission_callback(STH[] submitted_sth_list, 1770 STH[] new_sths) 1771 { 1772 foreach(sth in submitted_sth_list) { 1773 sth.num_reports_to_thirdparty++ 1774 } 1776 this.insert(new_sths); 1777 } 1779 // Attempt auditor of last resort submissions until it succeeds 1780 def auditor_of_last_resort_callback(original_sth, error) { 1781 if(!error) { 1782 delete_now(original_sth) 1783 } 1784 } 1786 def consistency_proof_callback(consistency_proof, original_sth, error) { 1787 if(!error) { 1788 insert(consistency_proof.current_sth) 1789 } else { 1790 original_sth.proof_failure_count++ 1791 } 1792 } 1794 11.4.2. SCT Feedback 1796 The SCT class contains data pertaining specifically to an SCT itself. 1798 class SCT 1799 { 1800 uint16 proof_failure_count 1801 bool has_been_resolved_to_sth 1802 bool proof_outstanding 1803 byte[] data 1804 } 1806 The SCT bundle will contain the trusted certificate chain the HTTPS 1807 client built (chaining to a trusted root certificate.) It also 1808 contains the list of associated SCTs, the exact domain it is 1809 applicable to, and metadata pertaining to how often it has been 1810 reported to the third party. 
1812 class SCTBundle 1813 { 1814 X509[] certificate_chain 1815 SCT[] sct_list 1816 string domain 1817 uint32 num_reports_to_thirdparty 1819 def equals(sct_bundle) { 1820 if(sct_bundle.domain != this.domain) 1821 return false 1822 if(sct_bundle.certificate_chain != this.certificate_chain) 1823 return false 1824 if(sct_bundle.sct_list != this.sct_list) 1825 return false 1827 return true 1828 } 1829 def approx_equals(sct_bundle) { 1830 if(sct_bundle.domain != this.domain) 1831 return false 1832 if(sct_bundle.certificate_chain != this.certificate_chain) 1833 return false 1835 return true 1836 } 1838 def insert_scts(sct[] sct_list) { 1839 this.sct_list.union(sct_list) 1840 this.num_reports_to_thirdparty = 0 1841 } 1843 def has_been_fully_resolved_to_sths() { 1844 foreach(s in this.sct_list) { 1845 if(!s.has_been_resolved_to_sth && !s.proof_outstanding) 1846 return false 1847 } 1848 return true 1849 } 1851 def max_proof_failures() { 1852 uint max = 0 1853 foreach(sct in this.sct_list) { 1854 if(sct.proof_failure_count > max) 1855 max = sct.proof_failure_count 1856 } 1857 return max 1858 } 1859 } 1860 For each domain, we store a SCTDomainEntry that holds the SCTBundles 1861 seen for that domain, as well as encapsulating some logic relating to 1862 SCT Feedback for that particular domain. In particular, this data 1863 structure also contains the logic that handles domains not supporting 1864 SCT Feedback. Its behavior is: 1866 1. When a user visits a domain, SCT Feedback is attempted for it. 1867 If it fails, it will retry after a month (configurable). If it 1868 succeeds, excellent. SCT Feedback data is still collected and 1869 stored even if SCT Feedback failed. 1871 2. After 3 month-long waits between failures, the domain will be 1872 marked as failing long-term. No SCT Feedback data will be stored 1873 beyond meta-data, but SCT Feedback will still be attempted after 1874 month-long waits 1876 3. 
If at any point in time, SCT Feedback succeeds, all failure 1877 counters are reset 1879 4. If a domain succeeds, but then begins failing, it must fail more 1880 than 90% of the time (configurable) and then the process begins 1881 at (2). 1883 If a domain is visited infrequently (say, once every 7 months) then 1884 it will be evicted from the cache and start all over again (according 1885 to the suggestion values in the below pseudocode). 1887 [ Note: To be certain the logic is correct I give the following test 1888 cases which illustrate the intended behavior. Hopefully the code 1889 matches! 1891 Succeed 1 Time num_submissions_attempted=1 num_submissions_succeeded=1 num_feedback_loop_failures=0 1892 Fail 10 Times num_submissions_attempted=11 num_submissions_succeeded=1 num_feedback_loop_failures=0 1893 ... wait a month ... 1894 Fail 1 month later num_submissions_attempted=12 num_submissions_succeeded=1 num_feedback_loop_failures=1 1895 ... wait a month ... 1896 Succeed 1 month later num_submissions_attempted=13 num_submissions_succeeded=2 num_feedback_loop_failures=0(r) indicates (Reset) 1897 -> Feedback is attempted regularly. 1899 Succeed 1 Time num_submissions_attempted=1 num_submissions_succeeded=1 num_feedback_loop_failures=0 1900 Fail 10 Times num_submissions_attempted=11 num_submissions_succeeded=1 num_feedback_loop_failures=0 1901 ... wait a month ... 1902 Fail 1 month later num_submissions_attempted=12 num_submissions_succeeded=1 num_feedback_loop_failures=1 1903 ... wait a month ... 1904 Fail 1 month later num_submissions_attempted=13 num_submissions_succeeded=1 num_feedback_loop_failures=2 1905 ... wait a month ... 1906 Succeed 1 month later num_submissions_attempted=14 num_submissions_succeeded=2 num_feedback_loop_failures=0(r) 1907 -> Feedback is attempted regularly. 
1909 Succeed 1 Time num_submissions_attempted=1 num_submissions_succeeded=1 num_feedback_loop_failures=0 1910 Fail 10 Times num_submissions_attempted=11 num_submissions_succeeded=1 num_feedback_loop_failures=0 1911 ... wait a month ... 1912 Fail 1 month later num_submissions_attempted=12 num_submissions_succeeded=1 num_feedback_loop_failures=1 1913 ... wait a month ... 1914 Fail 1 month later num_submissions_attempted=13 num_submissions_succeeded=1 num_feedback_loop_failures=2 1915 ... wait a month ... 1916 Fail 1 month later num_submissions_attempted=14 num_submissions_succeeded=2 num_feedback_loop_failures=3 1917 ... clear_old_data() is run every hour ... 1918 num_submissions_attempted=0 num_submissions_succeeded=0 num_feedback_loop_failures=3 1919 sct_feedback_failing_longterm=True 1920 Fail 1 month later num_submissions_attempted=1 num_submissions_succeeded=0 num_feedback_loop_failures=4 1921 sct_feedback_failing_longterm=True 1922 ... clear_old_data() is run every hour ... 1923 num_submissions_attempted=0(r) num_submissions_succeeded=0 num_feedback_loop_failures=3 1924 sct_feedback_failing_longterm=True 1925 Succeed 1 month later num_submissions_attempted=2 num_submissions_succeeded=1 num_feedback_loop_failures=0(r) 1926 sct_feedback_failing_longterm=False 1927 -> Feedback is attempted regularly. 1929 Note above that the second run of clear_old_data() will reset num_submissions_attempted from 1 to 0. This is 1930 CRITICAL. Otherwise, we would have the below bug (where after 10 months of failures, a success would not hit 1931 the required ratio to keep going) 1933 //The below represents a bug. 1934 Succeed 1 Time num_submissions_attempted=1 num_submissions_succeeded=1 num_feedback_loop_failures=0 1935 Fail 10 Times num_submissions_attempted=11 num_submissions_succeeded=1 num_feedback_loop_failures=0 1936 ... wait a month ... 1937 Fail 1 month later num_submissions_attempted=12 num_submissions_succeeded=1 num_feedback_loop_failures=1 1938 ... wait a month ... 
1939 Fail 1 month later num_submissions_attempted=13 num_submissions_succeeded=1 num_feedback_loop_failures=2 1940 ... wait a month ... 1941 Fail 1 month later num_submissions_attempted=14 num_submissions_succeeded=2 num_feedback_loop_failures=3 1942 ... clear_old_data() is run every hour ... 1943 num_submissions_attempted=0 num_submissions_succeeded=0 num_feedback_loop_failures=3 1944 sct_feedback_failing_longterm=True 1945 Fail 1 month later num_submissions_attempted=1 num_submissions_succeeded=0 num_feedback_loop_failures=4 1946 sct_feedback_failing_longterm=True 1947 Fail 9 times for 9 months 1948 num_submissions_attempted=10 num_submissions_succeeded=0 num_feedback_loop_failures=13 1949 sct_feedback_failing_longterm=True 1950 Succeed 1 month later num_submissions_attempted=11 num_submissions_succeeded=1 num_feedback_loop_failures=0(r) 1951 sct_feedback_failing_longterm=False 1952 -> Feedback is NOT attempted regularly. ] 1954 //Suggestions: 1955 // After concluding a domain doesn't support feedback, we try again 1956 // after WAIT_BETWEEN_SCT_FEEDBACK_ATTEMPTS amount of time to see if 1957 // they added support 1958 #define WAIT_BETWEEN_SCT_FEEDBACK_ATTEMPTS 1 month 1960 // If we've waited MIN_SCT_FEEDBACK_ATTEMPTS_BEFORE_OMITTING_STORAGE 1961 // multiplied by WAIT_BETWEEN_SCT_FEEDBACK_ATTEMPTS amount of time, we 1962 // still attempt SCT Feedback, but no longer bother storing any data 1963 // until the domain supports SCT Feedback 1964 #define MIN_SCT_FEEDBACK_ATTEMPTS_BEFORE_OMITTING_STORAGE 3 1966 // If more than this fraction of SCT Feedback attempts previously succeeded, 1967 // we consider the domain to support feedback and to be merely experiencing 1968 // transient errors 1969 #define MIN_RATIO_FOR_SCT_FEEDBACK_TO_BE_WORKING .10 1971 class SCTDomainEntry 1972 { 1973 // This is the primary key of the object, the exact domain name it 1974 // is valid for 1975 string domain 1977 // This is the last time the domain was contacted.
For client 1978 // operations it is updated whenever the client makes any request 1979 // (not just feedback) to the domain. For server operations, it is 1980 // updated whenever any client contacts the domain. Responsibility 1981 // for updating lies OUTSIDE of the class 1982 public datetime last_contact_for_domain 1984 // This is the last time SCT Feedback was attempted for the domain. 1985 // It is updated whenever feedback is attempted - responsibility for 1986 // updating lies OUTSIDE of the class 1987 // This is not used when this algorithm runs on servers 1988 public datetime last_sct_feedback_attempt 1990 // This is the number of times we have waited a 1991 // WAIT_BETWEEN_SCT_FEEDBACK_ATTEMPTS amount of time, and still failed 1992 // e.g. 10 months of failures 1993 // This is not used when this algorithm runs on servers 1994 private uint16 num_feedback_loop_failures 1996 // This is whether or not SCT Feedback has failed enough times that we 1997 // should not bother storing data for it anymore. It is a small function 1998 // used for illustrative purposes 1999 // This is not used when this algorithm runs on servers 2000 private bool sct_feedback_failing_longterm() 2001 { return num_feedback_loop_failures >= MIN_SCT_FEEDBACK_ATTEMPTS_BEFORE_OMITTING_STORAGE } 2003 // This is the number of SCT Feedback submissions attempted. 2005 // Responsibility for incrementing lies OUTSIDE of the class 2006 // (And watch for integer overflows) 2007 // This is not used when this algorithm runs on servers 2008 public uint16 num_submissions_attempted 2010 // This is the number of successful SCT Feedback submissions. This 2011 // variable is updated by the class.
2012        // This is not used when this algorithm runs on servers
2013        private uint16 num_submissions_succeeded

2015        // This contains all the bundles of SCT data we have observed for
2016        // this domain
2017        SCTBundle[] observed_records

2019        // This function can be called to determine if we should attempt
2020        // SCT Feedback for this domain.
2021        def should_attempt_feedback() {
2022          // Servers always perform feedback!
2023          if(operator_is_server)
2024            return true

2026          // If we have not tried in a month, try again
2027          if(now() - last_sct_feedback_attempt > WAIT_BETWEEN_SCT_FEEDBACK_ATTEMPTS)
2028            return true

2030          // If we have tried recently (attempts > 0) and it seems to be working, go for it!
2031          if(num_submissions_attempted > 0 &&
2032             (num_submissions_succeeded / num_submissions_attempted) > MIN_RATIO_FOR_SCT_FEEDBACK_TO_BE_WORKING)
2033            return true

2035          // Otherwise don't try
2036          return false
2037        }

2039        // For Clients, this function is called after a successful
2040        // connection to an HTTPS server, with a single SCTBundle
2041        // constructed from that connection's certificate chain and SCTs.
2042        // For Servers, this is called after receiving SCT Feedback with
2043        // all the bundles sent in the feedback.
2044        def insert(SCTBundle[] bundles) {
2045          // Do not store data for long-failing domains
2046          if(sct_feedback_failing_longterm()) {
2047            return
2048          }

2050          foreach(b in bundles) {
2051            if(operator_is_server) {
2052              if(!passes_validity_checks(b))
2053                return
2054            }

2056            bool have_inserted = false
2057            foreach(e in this.observed_records) {
2058              if(e.equals(b))
2059                return
2060              else if(e.approx_equals(b)) {
2061                have_inserted = true
2062                e.insert_scts(b.sct_list)
2063              }
2064            }
2065            if(!have_inserted)
2066              this.observed_records.insert(b)
2067          }
2068          SCTStoreManager.update_cache_percentage()
2069        }

2071        // When it is time to perform SCT Feedback, the HTTPS client
2072        // calls this function to get a selection of SCTBundles to send
2073        // as feedback
2074        def get_gossip_selection() {
2075          if(len(observed_records) > MAX_SCT_RECORDS_TO_GOSSIP) {
2076            indexes = set()
2077            modulus = len(observed_records)
2078            while(len(indexes) < MAX_SCT_RECORDS_TO_GOSSIP) {
2079              r = randomInt() % modulus
2080              if(r not in indexes)
2081                indexes.insert(r)
2082            }

2084            return_selection = []
2085            foreach(i in indexes) {
2086              return_selection.insert(this.observed_records[i])
2087            }

2089            return return_selection
2090          }
2091          else
2092            return this.observed_records
2093        }

2095        def passes_validity_checks(SCTBundle b) {
2096          // This function performs the validity checks specified in
2097          // {{feedback-srvop}}
2098        }
2099      }
2100      The SCTDomainEntry is responsible for handling the outcome of a
2101      submission report for that domain using its member function:

2103      // This function is called after providing SCT Feedback
2104      // to a server.  It is passed the feedback sent to the other party, which
2105      // is the output of get_gossip_selection(), and also the SCTBundle
2106      // representing the connection the data was sent on.
2107      // (When this code runs on the server, connectionBundle is NULL)
2108      // If the Feedback was not sent successfully, error is True
2109      def after_submit_to_thirdparty(error, SCTBundle[] submittedBundles,
2110                                     SCTBundle connectionBundle)
2111      {
2112        // Server operation in this instance is exceedingly simple
2113        if(operator_is_server) {
2114          if(error)
2115            return
2116          foreach(bundle in submittedBundles)
2117            bundle.num_reports_to_thirdparty++
2118          return
2119        }

2121        // Client behavior is much more complicated
2122        if(error) {
2123          if(sct_feedback_failing_longterm()) {
2124            num_feedback_loop_failures++
2125          }
2126          else if((num_submissions_succeeded / num_submissions_attempted)
2127                  > MIN_RATIO_FOR_SCT_FEEDBACK_TO_BE_WORKING) {
2128            // Do nothing.  num_submissions_succeeded will not be incremented.
2129            // After enough of these failures, the ratio will fall below the
2130            // acceptable threshold
2131          } else {
2132            // The domain has begun its three-month grace period.  We will
2133            // attempt submissions once a month
2134            num_feedback_loop_failures++
2135          }
2136          return
2137        }
2138        // We succeeded, so reset all of our failure states
2139        // Note, there is a race condition here if clear_old_data() is called
2140        // while this callback is outstanding.
2141        num_feedback_loop_failures = 0
2142        if(num_submissions_succeeded != UINT16_MAX )
2143          num_submissions_succeeded++

2145        foreach(bundle in submittedBundles)
2146        {
2147          // Compare Certificate Chains, if they do not match, it counts as a
2148          // submission.
2149          if(!connectionBundle.approx_equals(bundle))
2150            bundle.num_reports_to_thirdparty++
2151          else {
2152            // This check ensures that a SCT Bundle is not considered reported
2153            // if it is submitted over a connection with the same SCTs.
                // This
2154            // satisfies the constraint in Paragraph 5 of {{feedback-clisrv}}
2155            // Consider three submission scenarios:
2156            // Submitted SCTs   Connection SCTs   Considered Submitted
2157            // A, B             A, B              No - no new information
2158            // A                A, B              Yes - B is a new SCT
2159            // A, B             A                 No - no new information
2160            if(connectionBundle.sct_list is NOT a subset of bundle.sct_list)
2161              bundle.num_reports_to_thirdparty++
2162          }
2163        }
2164      }

2166      Instances of the SCTDomainEntry class are stored as part of a larger
2167      class that manages the entire SCT Cache, storing them in a hashmap
2168      keyed by domain.  This class also tracks the current size of the
2169      cache, and will trigger cache eviction.

2171      //Suggestions:
2172      #define CACHE_PRESSURE_SAFE .50
2173      #define CACHE_PRESSURE_IMMINENT .70
2174      #define CACHE_PRESSURE_ALMOST_FULL .85
2175      #define CACHE_PRESSURE_FULL .95
2176      #define WAIT_BETWEEN_IMMINENT_CACHE_EVICTION 5 minutes

2178      class SCTStoreManager
2179      {
2180        hashmap all_sct_stores
2181        uint32 current_cache_size
2182        datetime imminent_cache_pressure_check_performed

2184        float current_cache_percentage() {
2185          return current_cache_size / MAX_CACHE_SIZE;
2186        }

2188        static def update_cache_percentage() {
2189          // This function calculates the current size of the cache
2190          // and updates current_cache_size
2191          /* ... perform calculations ...
              */
2192          current_cache_size = /* new calculated value */

2194          // Perform locking to prevent multiple of these functions being
2195          // called concurrently or unnecessarily
2196          if(current_cache_percentage() > CACHE_PRESSURE_FULL) {
2197            cache_is_full()
2198          }

2200          else if(current_cache_percentage() > CACHE_PRESSURE_ALMOST_FULL) {
2201            cache_pressure_almost_full()
2202          }

2204          else if(current_cache_percentage() > CACHE_PRESSURE_IMMINENT) {
2205            // Do not repeatedly perform the imminent cache pressure operation
2206            if(now() - imminent_cache_pressure_check_performed >
2207                WAIT_BETWEEN_IMMINENT_CACHE_EVICTION) {
2208              cache_pressure_is_imminent()
2209            }
2210          }
2211        }
2212      }

2214      The SCTStoreManager contains a function that will be called
2215      periodically in the background, iterating through all SCTDomainEntry
2216      objects and performing maintenance tasks.  It removes data for
2217      domains we have not contacted in a long time.  This function is not
2218      intended to clear data if the cache is getting full; separate
2219      functions are used for that.
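
As a rough illustration of the age-based part of this maintenance pass, the following Python sketch drops entries for domains that have not been contacted recently. It is a simplification under stated assumptions, not the draft's algorithm: `DomainEntry` is a hypothetical stand-in for SCTDomainEntry, the store is a plain dict keyed by domain, a month is approximated as 30 days, and proof fetching and the long-term-failure reset are left out. The two retention periods mirror the suggested TIME_UNTIL_OLD_SUBMITTED_SCTDATA_ERASED and TIME_UNTIL_OLD_UNSUBMITTED_SCTDATA_ERASED values used by clear_old_data() in this section.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Suggested retention periods; "month" is approximated as 30 days (assumption).
TIME_UNTIL_OLD_SUBMITTED_SCTDATA_ERASED = timedelta(days=90)
TIME_UNTIL_OLD_UNSUBMITTED_SCTDATA_ERASED = timedelta(days=180)


@dataclass
class DomainEntry:
    """Hypothetical, simplified stand-in for SCTDomainEntry."""
    last_contact_for_domain: datetime
    num_submissions_succeeded: int = 0
    observed_records: list = field(default_factory=list)


def is_stale(entry: DomainEntry, now: datetime) -> bool:
    """Return True if the entry is old enough to be erased."""
    age = now - entry.last_contact_for_domain
    if entry.num_submissions_succeeded > 0:
        # Data that was successfully submitted can be dropped sooner.
        return age > TIME_UNTIL_OLD_SUBMITTED_SCTDATA_ERASED
    # Data that was never submitted successfully is kept longer.
    return age > TIME_UNTIL_OLD_UNSUBMITTED_SCTDATA_ERASED


def clear_old_data(store: dict, now: datetime) -> list:
    """Remove stale entries from the store and return the removed domains."""
    # Collect keys first so the dict is not modified while iterating over it.
    stale = [domain for domain, entry in store.items() if is_stale(entry, now)]
    for domain in stale:
        del store[domain]
    return stale
```

Collecting the stale keys before deleting avoids mutating the container during iteration, a hazard that the remove-inside-foreach pattern in pseudocode can hide.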
2221      // Suggestions:
2222      #define TIME_UNTIL_OLD_SUBMITTED_SCTDATA_ERASED 3 months
2223      #define TIME_UNTIL_OLD_UNSUBMITTED_SCTDATA_ERASED 6 months

2225      def clear_old_data()
2226      {
2227        foreach(domainEntry in all_sct_stores)
2228        {
2229          // Queue proof fetches
2230          if(proof_fetching_enabled) {
2231            foreach(sctBundle in domainEntry.observed_records) {
2232              if(!sctBundle.has_been_fully_resolved_to_sths()) {
2233                foreach(sct in sctBundle.sct_list) {
2234                  if(!sct.has_been_resolved_to_sth && !sct.proof_outstanding) {
2235                    sct.proof_outstanding = True
2236                    queue_inclusion_proof(sct, inclusion_proof_callback)
2237                  }
2238                }
2239              }
2240            }
2241          }

2243          // Do not store data for domains that do not support SCT Feedback
2244          if(!operator_is_server
2245             && domainEntry.sct_feedback_failing_longterm())
2246          {
2247            // Note that resetting these variables every single time is
2248            // necessary to avoid a bug
2249            all_sct_stores[domainEntry].num_submissions_attempted = 0
2250            all_sct_stores[domainEntry].num_submissions_succeeded = 0
2251            delete all_sct_stores[domainEntry].observed_records
2252            all_sct_stores[domainEntry].observed_records = NULL
2253          }

2255          // This check removes successfully submitted data for
2256          // old domains we have not dealt with in a long time
2257          if(domainEntry.num_submissions_succeeded > 0
2258             && now() - domainEntry.last_contact_for_domain
2259                > TIME_UNTIL_OLD_SUBMITTED_SCTDATA_ERASED)
2260          {
2261            all_sct_stores.remove(domainEntry)
2262          }

2264          // This check removes unsuccessfully submitted data for
2265          // old domains we have not dealt with in a very long time
2266          if(now() - domainEntry.last_contact_for_domain
2267              > TIME_UNTIL_OLD_UNSUBMITTED_SCTDATA_ERASED)
2268          {
2269            all_sct_stores.remove(domainEntry)
2270          }

2272          SCTStoreManager.update_cache_percentage()
2273      }

2275      Inclusion Proof Fetching is handled fairly independently.

2277      // This function is a callback invoked after an inclusion proof
2278      // has been retrieved.
          // It can exist on the SCT class or independently,
2279      // so long as it can modify the SCT class' members
2280      def inclusion_proof_callback(inclusion_proof, original_sct, error)
2281      {
2282        // Unlike the STH code, this counter must be incremented on the
2283        // callback as there is a race condition on using this counter in the
2284        // cache_* functions.
2285        original_sct.proof_attempts++
2286        original_sct.proof_outstanding = False
2287        if(!error) {
2288          original_sct.has_been_resolved_to_sth = True
2289          insert_to_sth_datastore(inclusion_proof.new_sth)
2290        } else {
2291          original_sct.proof_failure_count++
2292        }
2293      }

2295      If the cache is getting full, these three member functions of the
2296      SCTStoreManager class will be used.

2298      // -----------------------------------------------------------------
2299      // This function is called when the cache is not yet full, but is
2300      // nearing it.  It prioritizes deleting data that should be safe
2301      // to delete (because it has been shared with the site or resolved
2302      // to a STH)
2303      def cache_pressure_is_imminent()
2304      {
2305        bundlesToDelete = []
2306        foreach(domainEntry in all_sct_stores) {
2307          foreach(sctBundle in domainEntry.observed_records) {

2309            if(proof_fetching_enabled) {
2310              // First, queue proofs for anything not already queued.
2311              if(!sctBundle.has_been_fully_resolved_to_sths()) {
2312                foreach(sct in sctBundle.sct_list) {
2313                  if(!sct.has_been_resolved_to_sth
2314                     && !sct.proof_outstanding) {
2315                    sct.proof_outstanding = True
2316                    queue_inclusion_proof(sct, inclusion_proof_callback)
2317                  }
2318                }
2319              }

2321              // Second, consider deleting entries that have been fully
2322              // resolved.
2323              else {
2324                bundlesToDelete.append( Struct(domainEntry, sctBundle) )
2325              }
2326            }

2328            // Third, consider deleting entries that have been successfully
2329            // reported
2330            if(sctBundle.num_reports_to_thirdparty > 0) {
2331              bundlesToDelete.append( Struct(domainEntry, sctBundle) )
2332            }
2333          }
2334        }

2336        // Fourth, delete the eligible entries at random until the cache is
2337        // at a safe level
2338        uint recalculateIndex = 0
2339        #define RECALCULATE_EVERY_N_OPERATIONS 50

2341        while(bundlesToDelete.length > 0 &&
2342              current_cache_percentage() > CACHE_PRESSURE_SAFE) {
2343          uint rndIndex = rand() % bundlesToDelete.length
2344          bundlesToDelete[rndIndex].domainEntry.observed_records.remove(bundlesToDelete[rndIndex].sctBundle)
2345          bundlesToDelete.removeAt(rndIndex)

2347          recalculateIndex++
2348          if(recalculateIndex % RECALCULATE_EVERY_N_OPERATIONS == 0) {
2349            update_cache_percentage()
2350          }
2351        }

2353        // Finally, tell the proof fetching engine to go faster
2354        if(proof_fetching_enabled) {
2355          // This function would speed up proof fetching until an
2356          // arbitrary time has passed.  Perhaps until it has fetched
2357          // proofs for the number of items currently in its queue?  Or
2358          // a percentage of them?
2359          proof_fetch_faster_please()
2360        }
2361        update_cache_percentage();
2362      }

2364      // -----------------------------------------------------------------
2365      // This function is called when the cache is almost full.
          // It will
2366      // evict entries at random, while attempting to save entries that
2367      // appear to have proof fetching failures
2368      def cache_pressure_almost_full()
2369      {
2370        uint recalculateIndex = 0
2371        uint savedRecords = 0
2372        #define RECALCULATE_EVERY_N_OPERATIONS 50

2374        while(all_sct_stores.length > savedRecords &&
2375              current_cache_percentage() > CACHE_PRESSURE_SAFE) {
2376          uint rndIndex1 = rand() % all_sct_stores.length
2377          uint rndIndex2 = rand() % all_sct_stores[rndIndex1].observed_records.length

2379          if(proof_fetching_enabled) {
2380            if(all_sct_stores[rndIndex1].observed_records[rndIndex2].max_proof_failures() >
2381               MIN_PROOF_FAILURES_CONSIDERED_SUSPICIOUS) {
2382              savedRecords++
2383              continue
2384            }
2385          }

2387          // If proof fetching is not enabled we need some other logic
2388          else {
2389            if(all_sct_stores[rndIndex1].observed_records[rndIndex2].num_reports_to_thirdparty == 0) {
2390              savedRecords++
2391              continue
2392            }
2393          }

2395          all_sct_stores[rndIndex1].observed_records.removeAt(rndIndex2)
2396          if(all_sct_stores[rndIndex1].observed_records.length == 0) {
2397            all_sct_stores.removeAt(rndIndex1)
2398          }

2400          recalculateIndex++
2401          if(recalculateIndex % RECALCULATE_EVERY_N_OPERATIONS == 0) {
2402            update_cache_percentage()
2403          }
2404        }

2406        update_cache_percentage();
2407      }
2408      // -----------------------------------------------------------------
2409      // This function is called when the cache is full, and will evict
2410      // cache entries at random
2411      def cache_is_full()
2412      {
2413        uint recalculateIndex = 0
2414        #define RECALCULATE_EVERY_N_OPERATIONS 50

2416        while(all_sct_stores.length > 0 &&
2417              current_cache_percentage() > CACHE_PRESSURE_SAFE) {
2418          uint rndIndex1 = rand() % all_sct_stores.length
2419          uint rndIndex2 = rand() % all_sct_stores[rndIndex1].observed_records.length

2421          all_sct_stores[rndIndex1].observed_records.removeAt(rndIndex2)
2422          if(all_sct_stores[rndIndex1].observed_records.length == 0) {
2423            all_sct_stores.removeAt(rndIndex1)
2424          }

2426          recalculateIndex++
2427
              if(recalculateIndex % RECALCULATE_EVERY_N_OPERATIONS == 0) {
2428            update_cache_percentage()
2429          }
2430        }

2432        update_cache_percentage();
2433      }

2435   12.  IANA considerations

2437      [ TBD ]

2439   13.  Contributors

2441      The authors would like to thank the following contributors for
2442      valuable suggestions: Al Cutter, Ben Laurie, Benjamin Kaduk, Josef
2443      Gustafsson, Karen Seo, Magnus Ahltorp, Steven Kent, Yan Zhu.

2445   14.  ChangeLog

2447   14.1.  Changes between ietf-03 and ietf-04

2449      o  No changes.

2451   14.2.  Changes between ietf-02 and ietf-03

2453      o  TBDs resolved.

2455      o  References added.

2457      o  Pseudocode changed to work for both clients and servers.

2459   14.3.  Changes between ietf-01 and ietf-02

2461      o  Requiring full certificate chain in SCT Feedback.

2463      o  Clarifications on what clients store for and send in SCT Feedback
2464         added.

2466      o  SCT Feedback server operation updated to protect against DoS
2467         attacks on servers.

2469      o  Pre-Loaded vs Locally Added Anchors explained.

2471      o  Base for well-known URLs changed.

2473      o  Remove all mentions of monitors - gossip deals with auditors.

2475      o  New sections added: Trusted Auditor protocol, attacks by actively
2476         malicious log, the Dual-CA compromise attack, policy
2477         recommendations.

2479   14.4.  Changes between ietf-00 and ietf-01

2481      o  Improve language and readability based on feedback from Stephen
2482         Kent.

2484      o  STH Pollination Proof Fetching defined and indicated as optional.

2486      o  3-Method Ecosystem section added.

2488      o  Cases with Logs ceasing operation handled.

2490      o  Text on tracking via STH Interaction added.

2492      o  Section with some early recommendations for mixing added.

2494      o  Section detailing blocking connections, frustrating it, and the
2495         implications added.

2497   14.5.  Changes between -01 and -02

2499      o  STH Pollination defined.

2501      o  Trusted Auditor Relationship defined.

2503      o  Overview section rewritten.

2505      o  Data flow picture added.

2507      o  Section on privacy considerations expanded.
2509   14.6.  Changes between -00 and -01

2511      o  Add the SCT feedback mechanism: Clients send SCTs to originating
2512         web server which shares them with auditors.

2514      o  Stop assuming that clients see STHs.

2516      o  Don't use HTTP headers but instead .well-known URLs - avoid that
2517         battle.

2519      o  Stop referring to trans-gossip and trans-gossip-transport-https -
2520         too complicated.

2522      o  Remove all protocols but HTTPS in order to simplify - let's come
2523         back and add more later.

2525      o  Add more reasoning about privacy.

2527      o  Do specify data formats.

2529   15.  References

2531   15.1.  Normative References

2533      [RFC-6962-BIS-09]
2534                 Laurie, B., Langley, A., Kasper, E., Messeri, E., and R.
2535                 Stradling, "Certificate Transparency", October 2015,
2536                 .

2539      [RFC7159]  Bray, T., "The JavaScript Object Notation (JSON) Data
2540                 Interchange Format", RFC 7159, March 2014.

2542   15.2.  Informative References

2544      [double-keying]
2545                 Perry, M., Clark, E., and S. Murdoch, "Cross-Origin
2546                 Identifier Unlinkability", May 2015,
2547                 .

2550      [draft-ct-over-dns]
2551                 Laurie, B., Phaneuf, P., and A. Eijdenberg, "Certificate
2552                 Transparency over DNS", February 2016,
2553                 .

2556      [draft-ietf-trans-threat-analysis-03]
2557                 Kent, S., "Attack Model and Threat for Certificate
2558                 Transparency", October 2015,
2559                 .

2562      [dual-ca-compromise-attack]
2563                 Gillmor, D., "can CT defend against dual CA compromise?",
2564                 n.d., .

2567      [gossip-mixing]
2568                 Ritter, T., "A Bit on Certificate Transparency Gossip",
2569                 June 2016, .

2572      [trickle]  Serjantov, A., Dingledine, R., and P. Syverson, "From
2573                 a Trickle to a Flood: Active Attacks on Several Mix
2574                 Types", October 2002,
2575                 .

2577   Authors' Addresses

2579      Linus Nordberg
2580      NORDUnet

2582      Email: linus@nordu.net

2584      Daniel Kahn Gillmor
2585      ACLU

2587      Email: dkg@fifthhorseman.net
2588      Tom Ritter

2590      Email: tom@ritter.vg