idnits 2.17.00 (12 Aug 2021) /tmp/idnits51030/draft-gpew-priv-ppm-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There are 2 instances of too long lines in the document, the longest one being 7 characters in excess of 72. Miscellaneous warnings: ---------------------------------------------------------------------------- -- The document date (7 March 2022) is 68 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Looks like a reference, but probably isn't: '32' on line 892 == Outdated reference: draft-irtf-cfrg-hpke has been published as RFC 9180 ** Downref: Normative reference to an Informational draft: draft-irtf-cfrg-hpke (ref. 'I-D.irtf-cfrg-hpke') ** Downref: Normative reference to an Informational RFC: RFC 2818 ** Downref: Normative reference to an Informational RFC: RFC 5861 -- No information found for draft-cfrg-patton-vdaf - is the name correct? Summary: 4 errors (**), 0 flaws (~~), 1 warning (==), 4 comments (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 Network Working Group T. Geoghegan 3 Internet-Draft ISRG 4 Intended status: Standards Track C. Patton 5 Expires: 8 September 2022 Cloudflare 6 E. Rescorla 7 Mozilla 8 C.A. Wood 9 Cloudflare 10 7 March 2022 12 Privacy Preserving Measurement 13 draft-gpew-priv-ppm-01 15 Abstract 17 There are many situations in which it is desirable to take 18 measurements of data which people consider sensitive. In these 19 cases, the entity taking the measurement is usually not interested in 20 people's individual responses but rather in aggregated data. 21 Conventional methods require collecting individual responses and then 22 aggregating them, thus representing a threat to user privacy and 23 rendering many such measurements difficult and impractical. This 24 document describes a multi-party privacy preserving measurement (PPM) 25 protocol which can be used to collect aggregate data without 26 revealing any individual user's data. 28 Discussion Venues 30 This note is to be removed before publishing as an RFC. 32 Discussion of this document takes place on the mailing list (), which 33 is archived at . 35 Source for this draft and an issue tracker can be found at 36 https://github.com/abetterinternet/ppm-specification. 38 Status of This Memo 40 This Internet-Draft is submitted in full conformance with the 41 provisions of BCP 78 and BCP 79. 43 Internet-Drafts are working documents of the Internet Engineering 44 Task Force (IETF). Note that other groups may also distribute 45 working documents as Internet-Drafts. The list of current Internet- 46 Drafts is at https://datatracker.ietf.org/drafts/current/. 48 Internet-Drafts are draft documents valid for a maximum of six months 49 and may be updated, replaced, or obsoleted by other documents at any 50 time. It is inappropriate to use Internet-Drafts as reference 51 material or to cite them other than as "work in progress." 
53 This Internet-Draft will expire on 8 September 2022. 55 Copyright Notice 57 Copyright (c) 2022 IETF Trust and the persons identified as the 58 document authors. All rights reserved. 60 This document is subject to BCP 78 and the IETF Trust's Legal 61 Provisions Relating to IETF Documents (https://trustee.ietf.org/ 62 license-info) in effect on the date of publication of this document. 63 Please review these documents carefully, as they describe your rights 64 and restrictions with respect to this document. Code Components 65 extracted from this document must include Revised BSD License text as 66 described in Section 4.e of the Trust Legal Provisions and are 67 provided without warranty as described in the Revised BSD License. 69 Table of Contents 71 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 72 1.1. DISCLAIMER . . . . . . . . . . . . . . . . . . . . . . . 4 73 1.2. Conventions and Definitions . . . . . . . . . . . . . . . 4 74 2. Overview . . . . . . . . . . . . . . . . . . . . . . . . . . 5 75 2.1. System Architecture . . . . . . . . . . . . . . . . . . . 6 76 2.2. Validating Inputs . . . . . . . . . . . . . . . . . . . . 8 77 3. Message Transport . . . . . . . . . . . . . . . . . . . . . . 9 78 3.1. Errors . . . . . . . . . . . . . . . . . . . . . . . . . 9 79 4. Protocol Definition . . . . . . . . . . . . . . . . . . . . . 10 80 4.1. Task Configuration . . . . . . . . . . . . . . . . . . . 11 81 4.2. Uploading Reports . . . . . . . . . . . . . . . . . . . . 12 82 4.2.1. Key Configuration Request . . . . . . . . . . . . . . 12 83 4.2.2. Upload Request . . . . . . . . . . . . . . . . . . . 13 84 4.2.3. Upload Extensions . . . . . . . . . . . . . . . . . . 15 85 4.3. Verifying and Aggregating Reports . . . . . . . . . . . . 16 86 4.3.1. Aggregate Request . . . . . . . . . . . . . . . . . . 17 87 4.3.2. Aggregate Share Request . . . . . . . . . . . . . . . 19 88 4.4. Collecting Results . . . . . . . . . . . . . . . . . . . 21 89 4.4.1. 
Validating Batch Parameters . . . . . . . . . . . . . 23 90 4.4.2. Anti-replay . . . . . . . . . . . . . . . . . . . . . 24 91 5. Operational Considerations . . . . . . . . . . . . . . . . . 25 92 5.1. Protocol participant capabilities . . . . . . . . . . . . 25 93 5.1.1. Client capabilities . . . . . . . . . . . . . . . . . 25 94 5.1.2. Aggregator capabilities . . . . . . . . . . . . . . . 25 95 5.1.3. Collector capabilities . . . . . . . . . . . . . . . 26 97 5.2. Data resolution limitations . . . . . . . . . . . . . . . 26 98 5.3. Aggregation utility and soft batch deadlines . . . . . . 27 99 5.4. Protocol-specific optimizations . . . . . . . . . . . . . 27 100 5.4.1. Reducing storage requirements . . . . . . . . . . . . 27 101 6. Security Considerations . . . . . . . . . . . . . . . . . . . 28 102 6.1. Threat model . . . . . . . . . . . . . . . . . . . . . . 28 103 6.1.1. Client/user . . . . . . . . . . . . . . . . . . . . . 29 104 6.1.2. Aggregator . . . . . . . . . . . . . . . . . . . . . 29 105 6.1.3. Leader . . . . . . . . . . . . . . . . . . . . . . . 31 106 6.1.4. Collector . . . . . . . . . . . . . . . . . . . . . . 32 107 6.1.5. Aggregator collusion . . . . . . . . . . . . . . . . 32 108 6.1.6. Attacker on the network . . . . . . . . . . . . . . . 32 109 6.2. Client authentication or attestation . . . . . . . . . . 34 110 6.3. Anonymizing proxies . . . . . . . . . . . . . . . . . . . 34 111 6.4. Batch parameters . . . . . . . . . . . . . . . . . . . . 34 112 6.5. Differential privacy . . . . . . . . . . . . . . . . . . 34 113 6.6. Robustness in the presence of malicious servers . . . . . 35 114 6.7. Infrastructure diversity . . . . . . . . . . . . . . . . 35 115 6.8. System requirements . . . . . . . . . . . . . . . . . . . 35 116 6.8.1. Data types . . . . . . . . . . . . . . . . . . . . . 35 117 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 35 118 7.1. Protocol Message Media Types . . . . . . . . . . . . . . 35 119 7.1.1. 
"application/ppm-hpke-config" media type . . . . . . 36 120 7.1.2. "message/ppm-report" media type . . . . . . . . . . . 37 121 7.1.3. "message/ppm-aggregate-req" media type . . . . . . . 38 122 7.1.4. "message/ppm-aggregate-resp" media type . . . . . . . 39 123 7.1.5. "message/ppm-aggregate-share-req" media type . . . . 39 124 7.1.6. "message/ppm-aggregate-share-resp" media type . . . . 40 125 7.1.7. "message/ppm-collect-req" media type . . . . . . . . 41 126 7.1.8. "message/ppm-collect-req" media type . . . . . . . . 42 127 7.2. Upload Extension Registry . . . . . . . . . . . . . . . . 43 128 7.3. URN Sub-namespace for PPM (urn:ietf:params:ppm) . . . . . 43 129 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 43 130 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 43 131 9.1. Normative References . . . . . . . . . . . . . . . . . . 43 132 9.2. Informative References . . . . . . . . . . . . . . . . . 44 133 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 45 135 1. Introduction 137 This document describes a protocol for privacy preserving 138 measurement. The protocol is executed by a large set of clients and 139 a small set of servers. The servers' goal is to compute some 140 aggregate statistic over the clients' inputs without learning the 141 inputs themselves. This is made possible by distributing the 142 computation among the servers in such a way that, as long as at least 143 one of them executes the protocol honestly, no input is ever seen in 144 the clear by any server. 146 1.1. DISCLAIMER 148 This document is a work in progress. We have not yet settled on the 149 design of the protocol framework or the set of features we intend to 150 support. 152 1.2. 
Conventions and Definitions 154 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 155 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 156 "OPTIONAL" in this document are to be interpreted as described in 157 BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all 158 capitals, as shown here. 160 The following terms are used: 162 Aggregation function: The function computed over the users' inputs. 164 Aggregator: An endpoint that runs the input-validation protocol and 165 accumulates input shares. 167 Batch: A set of reports that are aggregated into an output. 169 Batch duration: The time difference between the oldest and newest 170 report in a batch. 172 Batch interval: A parameter of the collect or aggregate-share 173 request that specifies the time range of the reports in the batch. 175 Client: The endpoint from which a user sends data to be aggregated, 176 e.g., a web browser. 178 Collector: The endpoint that receives the output of the aggregation 179 function. 181 Input: The measurement (or measurements) emitted by a client, before 182 any encryption or secret sharing scheme is applied. 184 Input share: An aggregator's share of the output of the VDAF 185 [I-D.draft-cfrg-patton-vdaf] sharding algorithm. This algorithm 186 is run by each client in order to cryptographically protect its 187 measurement. 189 Measurement: A single value (e.g., a count) being reported by a 190 client. Multiple measurements may be grouped into a single 191 protocol input. 193 Minimum batch duration: The minimum batch duration permitted for a 194 PPM task, i.e., the minimum time difference between the oldest and 195 newest report in a batch. 197 Minimum batch size: The minimum number of reports in a batch. 199 Leader: A distinguished aggregator that coordinates input validation 200 and data collection. 202 Aggregate result: The output of the aggregation function over a 203 given set of reports. 
205 Aggregate share: A share of the aggregate result emitted by an 206 aggregator. Aggregate shares are reassembled by the collector 207 into the final output. 209 Output share: An aggregator's share of the output of the VDAF 210 [I-D.draft-cfrg-patton-vdaf] preparation step. Many output shares 211 are combined into an aggregate share via the VDAF aggregation 212 algorithm. 214 Proof: A value generated by the client and used by the aggregators 215 to verify the client's input. 217 Report: Uploaded to the leader from the client. A report contains 218 the secret-shared and encrypted input and proof. 220 Server: An aggregator. 222 This document uses the presentation language of [RFC8446]. 224 2. Overview 226 The protocol is executed by a large set of clients and a small set of 227 servers. We call the servers the _aggregators_. Each client's input 228 to the protocol is a set of measurements (e.g., counts of some user 229 behavior). Given the input set of measurements x_1, ..., x_n held by 230 n users, the goal of a _privacy preserving measurement (PPM) 231 protocol_ is to compute y = F(p, x_1, ..., x_n) for some function F 232 while revealing nothing else about the measurements. 234 This protocol is extensible and allows for the addition of new 235 cryptographic schemes that implement the VDAF interface specified in 236 [I-D.draft-cfrg-patton-vdaf]. Candidates include: 238 * prio3, which allows for aggregate statistics such as sum, mean, 239 histograms, etc. This class of VDAFs is based on Prio [CGB17] and 240 includes improvements described in [BBCGGI19]. 242 * poplar1, which allows for finding the most popular strings among a 243 collection of clients (e.g., the URL of their home page) as well 244 as counting the number of clients that hold a given string. This 245 VDAF is the basis of the Poplar protocol of [BBCGGI21], which is 246 designed to solve the heavy hitters problem in a privacy 247 preserving manner. 
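As a rough illustration of how the input shares and aggregate shares defined above relate to each other, the following Python sketch splits each measurement into additive shares modulo a prime, has each aggregator sum the shares it holds, and lets the collector recombine the aggregate shares. It is not taken from this document or from any specific VDAF: the modulus and the function names shard, aggregate and unshard are assumptions chosen for illustration, and the input-validation (proof) step is omitted entirely.

```python
import secrets

# Illustrative prime modulus (assumption); a real VDAF fixes its own field.
MODULUS = 2**61 - 1


def shard(measurement, num_aggregators=2):
    """Split a measurement into additive shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(num_aggregators - 1)]
    last = (measurement - sum(shares)) % MODULUS
    return shares + [last]


def aggregate(shares_held):
    """An aggregator sums the shares it holds into one aggregate share."""
    return sum(shares_held) % MODULUS


def unshard(aggregate_shares):
    """The collector combines the aggregate shares into the aggregate result."""
    return sum(aggregate_shares) % MODULUS


# Three clients each report one small count (hypothetical values).
measurements = [3, 5, 7]
per_client = [shard(m) for m in measurements]

# By convention in this sketch, share 0 goes to the leader, share 1 to the helper.
leader_share = aggregate(s[0] for s in per_client)
helper_share = aggregate(s[1] for s in per_client)

# Only the recombined value reveals the sum of the measurements.
result = unshard([leader_share, helper_share])
```

In this sketch each individual share is uniformly distributed modulo MODULUS, so neither aggregator learns anything about a client's measurement from its own share alone; only the sum of both aggregate shares yields the total.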
249 This protocol is designed to work with schemes that use secret 250 sharing. Rather than send its input in the clear, each client shards 251 its measurements into a sequence of _input shares_ and sends an input 252 share to each of the aggregators. This provides two important 253 properties: 255 * It's impossible to deduce the measurement without knowing _all_ of 256 the shares. 258 * It allows the aggregators to compute the final output by first 259 aggregating their measurement shares locally, then combining 260 the results to obtain the final output. 262 2.1. System Architecture 266 The overall system architecture is shown in Figure 1. 268 +------------+ 269 | | 270 +--------+ | Helper | 271 | | | | 272 | Client +----+ +-----^------+ 273 | | | | 274 +--------+ | | 275 | | 276 +--------+ | +-----v------+ +-----------+ 277 | | +-----> | | | 278 | Client +----------> Leader <---------> Collector | 279 | | +-----> | | | 280 +--------+ | +-----^------+ +-----------+ 281 | | 282 +--------+ | | 283 | | | | 284 | Client +----+ +-----V------+ 285 | | | | 286 +--------+ | Helper | 287 | | 288 +------------+ 289 Figure 1: System Architecture 291 [[OPEN ISSUE: This shows two helpers, but the document only allows 292 one for now. https://github.com/abetterinternet/ppm-specification/ 293 issues/117]] 295 The main participants in the protocol are as follows: 297 Collector: The entity which wants to take the measurement and 298 ultimately receives the results. Any given measurement will have 299 a single collector. 301 Client(s): The endpoints which directly take the measurement(s) and 302 report them to the PPM system. In order to provide reasonable 303 levels of privacy, there must be a large number of clients. 305 Aggregator: An endpoint which receives report shares. Each 306 aggregator works with the other aggregators to compute the final 307 aggregate. This protocol defines two types of aggregators: 308 Leaders and Helpers. 
For each measurement, there is a single 309 leader and helper. 311 Leader: The leader is responsible for coordinating the protocol. It 312 receives the encrypted shares, distributes them to the helpers, 313 and orchestrates the process of computing the final measurement as 314 requested by the collector. 316 Helper: Helpers are responsible for executing the protocol as 317 instructed by the leader. The protocol is designed so that 318 helpers can be relatively lightweight, with most of the state held 319 at the leader. 321 The basic unit of PPM is the "task" which represents a single 322 measurement (though potentially taken over multiple time windows). 323 The definition of a task includes the following parameters: 325 * The type of each measurement. 327 * The aggregation function to compute (e.g., sum, mean, etc.) and an 328 optional aggregation parameter. 330 * The set of aggregators and necessary cryptographic keying material 331 to use. 333 * The VDAF to execute, which to some extent is dictated by the 334 previous choices. 336 * The minimum "batch size" of reports which can be aggregated. 338 * The rate at which measurements can be taken, i.e., the "minimum 339 batch window". 341 These parameters are distributed out of band to the clients and to 342 the aggregators. Each task is identified by a unique 32-byte ID 343 which is used to refer to it in protocol messages. 345 During the duration of the measurement, each client records its own 346 value(s), packages them up into a report, and sends them to the 347 leader. Each share is separately encrypted for each aggregator so 348 that even though they pass through the leader, the leader is unable 349 to see or modify them. Depending on the measurement, the client may 350 only send one report or may send many reports over time. 352 The leader distributes the shares to the helpers and orchestrates the 353 process of verifying them (see Section 2.2) and assembling them into 354 a final measurement for the collector. 
Depending on the VDAF, it may 355 be possible to incrementally process each report as it comes in, or it 356 may be necessary to wait until the entire batch of reports is 357 received. 359 2.2. Validating Inputs 361 An essential task of any data collection pipeline is ensuring that 362 the data being aggregated is "valid". In PPM, input validation is 363 complicated by the fact that none of the entities other than the 364 client ever sees the values for individual clients. 366 In order to address this problem, the aggregators engage in a secure, 367 multi-party computation specified by the chosen VDAF 368 [I-D.draft-cfrg-patton-vdaf] in order to prepare a report for 369 aggregation. At the beginning of this computation, each aggregator 370 is in possession of an input share uploaded by the client. At the 371 end of the computation, each aggregator is in possession of either an 372 "output share" that is ready to be aggregated or an indication that a 373 valid output share could not be computed. 375 To facilitate this computation, the input shares generated by the 376 client include information used by the aggregators during aggregation 377 in order to validate their corresponding output shares. For example, 378 prio3 includes a distributed zero-knowledge proof of the input's 379 validity [BBCGGI19], which the aggregators jointly verify; they 380 reject the report if it cannot be verified. However, they do not 381 learn anything about the individual report other than that it is 382 valid. 384 The specific properties attested to in the proof vary depending on 385 the measurement being taken. For instance, if we want to measure the 386 time the user took performing a given task, the proof might 387 demonstrate that the value reported was within a certain range (e.g., 388 0-60 seconds). 
By contrast, if we wanted to report which of a set of 389 N options the user selected, the report might contain N integers and 390 the proof would demonstrate that N-1 were 0 and the other was 1. 392 It is important to recognize that "validity" is distinct from 393 "correctness". For instance, the user might have spent 30s on a task 394 but the client might report 60s. This is a problem with any 395 measurement system and PPM does not attempt to address it; it merely 396 ensures that the data is within acceptable limits, so the client 397 could not report 10^6s or -20s. 399 3. Message Transport 401 Communications between PPM entities are carried over HTTPS [RFC2818]. 402 HTTPS provides server authentication and confidentiality. In 403 addition, report shares are encrypted directly to the aggregators 404 using HPKE [I-D.irtf-cfrg-hpke]. 406 3.1. Errors 408 Errors can be reported in PPM both at the HTTP layer and within 409 challenge objects as defined in Section 7. PPM servers can return 410 responses with an HTTP error response code (4XX or 5XX). For 411 example, if the client submits a request using a method not allowed 412 in this document, then the server MAY return status code 405 (Method 413 Not Allowed). 415 When the server responds with an error status, it SHOULD provide 416 additional information using a problem document [RFC7807]. To 417 facilitate automatic response to errors, this document defines the 418 following standard tokens for use in the "type" field (within the PPM 419 URN namespace "urn:ietf:params:ppm:error:"): 421 +=====================+=========================================+ 422 | Type | Description | 423 +=====================+=========================================+ 424 | unrecognizedMessage | The message type for a response was | 425 | | incorrect or the payload was malformed. 
| 426 +---------------------+-----------------------------------------+ 427 | unrecognizedTask | An endpoint received a message with an | 428 | | unknown task ID. | 429 +---------------------+-----------------------------------------+ 430 | outdatedConfig | The message was generated using an | 431 | | outdated configuration. | 432 +---------------------+-----------------------------------------+ 434 Table 1 436 This list is not exhaustive. The server MAY return errors set to a 437 URI other than those defined above. Servers MUST NOT use the PPM URN 438 namespace for errors not listed in the appropriate IANA registry (see 439 Section 7.3). Clients SHOULD display the "detail" field of all 440 errors. The "instance" value MUST be the endpoint to which the 441 request was targeted. The problem document MUST also include a 442 "taskid" member which contains the associated PPM task ID (this value 443 is always known, see Section 4.1). 445 In the remainder of this document, we use the tokens in the table 446 above to refer to error types, rather than the full URNs. For 447 example, an "error of type 'unrecognizedMessage'" refers to an error 448 document with "type" value 449 "urn:ietf:params:ppm:error:unrecognizedMessage". 451 This document uses the verbs "abort" and "alert with [some error 452 message]" to describe how protocol participants react to various 453 error conditions. 455 4. Protocol Definition 457 PPM has three major interactions which need to be defined: 459 * Uploading reports from the client to the aggregators 461 * Computing the results of a given measurement 463 * Reporting results to the collector 465 We start with some basic type definitions used in other messages. 467 /* ASCII encoded URL. 
e.g., "https://example.com" */ 468 opaque Url<1..2^16-1>; 470 uint64 Duration; /* Number of seconds elapsed between two instants */ 472 uint64 Time; /* seconds elapsed since start of UNIX epoch */ 474 /* An interval of time of length duration, where start is included and (start + 475 duration) is excluded. */ 476 struct { 477 Time start; 478 Duration duration; 479 } Interval; 481 /* A nonce used to uniquely identify a report in the context of a PPM task. It 482 includes the time at which the report was generated and a random, 64-bit 483 integer. */ 484 struct { 485 Time time; 486 uint64 rand; 487 } Nonce; 489 4.1. Task Configuration 491 Prior to the start of execution of the protocol, all participants 492 must agree on the configuration for each task. A task is uniquely 493 identified by its task ID: 495 opaque TaskId[32]; 497 A TaskId is a globally unique sequence of bytes. It is RECOMMENDED 498 that this be set to a random string output by a cryptographically 499 secure pseudorandom number generator. Each task has the following 500 parameters associated with it: 502 * aggregator_endpoints: A list of URLs relative to which an 503 aggregator's API endpoints can be found. Each endpoint's list 504 MUST be in the same order. The leader's endpoint MUST be the 505 first in the list. The order of the encrypted_input_shares in a 506 Report (see Section 4.2) MUST be the same as the order in which 507 aggregators appear in this list. 509 * collector_config: The HPKE configuration of the collector 510 (described in Section 4.2.1). Having participants agree on this 511 absolves collectors of the burden of operating an HTTP server. 512 See #102 (https://github.com/abetterinternet/prio-documents/ 513 issues/102) for discussion. 515 * max_batch_lifetime: The maximum number of times a batch of reports 516 may be used in collect requests. 518 * min_batch_size: The minimum number of reports that appear in a 519 batch. 
521 * min_batch_duration: The minimum time difference between the oldest 522 and newest report in a batch. This defines the boundaries with 523 which the batch interval of each collect request must be aligned. 524 (See Section 4.4.1.) 526 * protocol: A named parameter identifying the VDAF scheme in use. 528 4.2. Uploading Reports 530 Clients periodically upload reports to the leader, which then 531 distributes the individual shares to each helper. 533 4.2.1. Key Configuration Request 535 Before the client can upload its report to the leader, it must know 536 the public key of each of the aggregators. These are retrieved from 537 each aggregator by sending a GET request to [aggregator]/key_config, 538 where [aggregator] is the aggregator's endpoint URL, obtained from 539 the task parameters. The aggregator responds to well-formed requests 540 with status 200 and an HpkeConfig value: 542 struct { 543 HpkeConfigId id; 544 HpkeKemId kem_id; 545 HpkeKdfId kdf_id; 546 HpkeAeadId aead_id; 547 HpkePublicKey public_key; 548 } HpkeConfig; 550 uint8 HpkeConfigId; 551 opaque HpkePublicKey<1..2^16-1>; 552 uint16 HpkeAeadId; // Defined in I-D.irtf-cfrg-hpke 553 uint16 HpkeKemId; // Defined in I-D.irtf-cfrg-hpke 554 uint16 HpkeKdfId; // Defined in I-D.irtf-cfrg-hpke 556 [OPEN ISSUE: Decide whether to expand the width of the id, or support 557 multiple cipher suites (a la OHTTP/ECH).] 559 The client MUST abort if any of the following happens for any 560 key_config request: 562 * the client and aggregator failed to establish a secure, 563 aggregator-authenticated channel; 565 * the GET request failed or didn't return a valid key config; or 567 * the key config specifies a KEM, KDF, or AEAD algorithm the client 568 doesn't recognize. 570 Aggregators SHOULD use HTTP caching to permit client-side caching of 571 this resource [RFC5861]. Aggregators SHOULD favor long cache 572 lifetimes to avoid frequent cache revalidation, e.g., on the order of 573 days. 
Aggregators can control this cached lifetime with the Cache- 574 Control header, as follows: 576 Cache-Control: max-age=86400 578 Clients SHOULD follow the usual HTTP caching [RFC7234] semantics for 579 key configurations. 581 Note: Long cache lifetimes may result in clients using stale HPKE 582 keys; aggregators SHOULD continue to accept reports with old keys for 583 at least twice the cache lifetime in order to avoid rejecting 584 reports. 586 4.2.2. Upload Request 588 Clients upload reports by using an HTTP POST to [leader]/upload, 589 where [leader] is the first entry in the task's aggregator endpoints. 590 The payload is structured as follows: 592 struct { 593 TaskID task_id; 594 Nonce nonce; 595 Extension extensions<4..2^16-1>; 596 EncryptedInputShare encrypted_input_shares<1..2^16-1>; 597 } Report; 599 This message is called the client's _report_. It contains the 600 following fields: 602 * task_id is the task ID of the task for which the report is 603 intended. 605 * nonce is the report nonce generated by the client. This field is 606 used by the aggregators to ensure the report appears in at most 607 one batch. (See Section 4.4.2.) 609 * extensions is a list of extensions to be included in the Upload 610 flow; see Section 4.2.3. 612 * encrypted_input_shares contains the encrypted input shares of each 613 of the aggregators. The order in which the encrypted input shares 614 appear MUST match the order of the task's aggregator_endpoints 615 (i.e., the first share should be the leader's, the second share 616 should be for the first helper, and so on). 618 Encrypted input shares are structured as follows: 620 struct { 621 HpkeConfigId aggregator_config_id; 622 opaque enc<1..2^16-1>; 623 opaque payload<1..2^16-1>; 624 } EncryptedInputShare; 626 * aggregator_config_id is equal to HpkeConfig.id, where HpkeConfig 627 is the key config of the aggregator receiving the input share. 
629 * enc is the HPKE encapsulated key, used by the aggregator to 630 decrypt its input share. 632 * payload is the encrypted input share. 634 To generate the report, the client begins by sharding its measurement 635 into a sequence of input shares as specified by the VDAF in use. To 636 encrypt an input share, the client first generates an HPKE 637 [I-D.irtf-cfrg-hpke] context for the aggregator by running 639 enc, context = SetupBaseS(pk, 640 "pda input share" || task_id || server_role) 642 where pk is the aggregator's public key, task_id is Report.task_id, 643 and server_role is a byte whose value is 0x01 if the aggregator is 644 the leader and 0x00 if the aggregator is the helper. enc is the HPKE 645 encapsulated key and context is the HPKE context used by the client 646 for encryption. The payload is encrypted as 648 payload = context.Seal(nonce || extensions, input_share) 650 where input_share is the aggregator's input share and nonce and 651 extensions are the corresponding fields of Report. 653 The leader responds to well-formed requests to [leader]/upload with 654 status 200 and an empty body. Malformed requests are handled as 655 described in Section 3.1. Clients SHOULD NOT upload the same 656 measurement value in more than one report if the leader responds with 657 status 200 and an empty body. 659 The leader responds to requests with out-of-date HpkeConfig.id 660 values, indicated by EncryptedInputShare.aggregator_config_id, with status 400 661 and an error of type 'outdatedConfig'. Clients SHOULD invalidate any 662 cached aggregator HpkeConfig and retry with a freshly generated 663 Report. If this retried report does not succeed, clients MUST abort 664 and discontinue retrying. 666 The leader MUST ignore any report whose nonce contains a timestamp 667 that falls in a batch interval for which it has received at least one 668 collect request from the collector. (See Section 4.4.) 
Otherwise, 669 comparing the aggregate result to the previous aggregate result may 670 result in a privacy violation. (Note that the helpers enforce this 671 as well; see Section 4.3.1.) In addition, the leader SHOULD abort 672 the upload protocol and alert the client with error "staleReport". 674 4.2.3. Upload Extensions 676 Each UploadReq carries a list of extensions that clients may use to 677 convey additional, authenticated information in the report. [OPEN 678 ISSUE: The extensions aren't authenticated. It's probably a good 679 idea to be a bit more clear about how we envision extensions being 680 used. Right now this includes client attestation for defeating Sybil 681 attacks. See issue#89.] Each extension is a tag-length encoded 682 value of the following form: 684 struct { 685 ExtensionType extension_type; 686 opaque extension_data<0..2^16-1>; 687 } Extension; 689 enum { 690 TBD(0), 691 (65535) 692 } ExtensionType; 694 "extension_type" indicates the type of extension, and 695 "extension_data" contains information specific to the extension. 697 4.3. Verifying and Aggregating Reports 699 Once a set of clients have uploaded their reports to the leader, the 700 leader can send them to the helpers to be verified and aggregated. 701 In order to enable the system to handle very large batches of 702 reports, this process can be performed incrementally. To aggregate a 703 set of reports, the leader sends an AggregateReq to each helper 704 containing those report shares. The helper then processes them 705 (verifying the proofs and incorporating their values into the ongoing 706 aggregate) and replies to the leader. 708 The exact structure of the aggregation flow depends on the VDAF. 709 Specifically: 711 * Some VDAFs (e.g., prio3) allow the leader to start aggregating 712 reports proactively before all the reports in a batch are 713 received. Others (e.g., poplar1) require all the reports to be 714 present and must be initiated by the collector. 
716 * Processing the reports -- especially validating them -- may 717 require multiple round trips. 719 Note that it is possible to aggregate reports from one batch while 720 reports from the next batch are coming in. This is because each 721 report is validated independently. 723 This process is illustrated below in Figure 2. In this example, the 724 batch size is 20, but the leader opts to process the reports in sub- 725 batches of 10. Each sub-batch takes two round-trips to process. 726 Once both sub-batches have been processed, the leader can issue an 727 AggregateShareReq in order to retrieve the helper's aggregated 728 result. 730 In order to allow the helpers to retain minimal state, the helper can 731 attach a state parameter to its response, with the leader returning 732 the state value in the next request, thus offloading the state to the 733 leader. This state value MUST be cryptographically protected as 734 described in Section 4.3.1.2. 736 Leader Helper 738 AggregateReq (Reports 1-10) --------------------------------> \ 739 <------------------------------------ AggregateResp (State 1) | Reports 740 AggregateReq (continued, State 1) ---------------------> | 1-10 741 <------------------------------------ AggregateResp (State 2) / 743 AggregateReq (Reports 11-20, State 2) ----------------------> \ 744 <------------------------------------ AggregateResp (State 3) | Reports 745 AggregateReq (continued, State 3) --------------------------> | 11-20 746 <------------------------------------ AggregateResp (State 4) / 748 AggregateShareReq (State 4) --------------------------------> 749 <-------------------------------- AggregateShareResp (Result) 751 Figure 2: Aggregation Process (batch size=20) 753 [OPEN ISSUE: Should there be an indication of whether a given 754 AggregateReq is a continuation of a previous sub-batch?] 756 [TODO: Decide if and how the collector's request is authenticated.] 758 4.3.1.
Aggregate Request 760 The AggregateReq request is used by the leader to send a set of 761 reports to the helper. These reports MUST all be associated with the 762 same PPM task and batch. 764 For each aggregator endpoint [aggregator] in AggregateReq.task_id's 765 parameters except its own, the leader sends a POST request to 766 [aggregator]/aggregate with the following message: 768 struct { 769 TaskID task_id; 770 opaque agg_param<0..2^16-1>; // VDAF aggregation parameter 771 opaque helper_state<0..2^16>; // helper's opaque state 772 AggregateSubReq seq<1..2^24-1>; 773 } AggregateReq; 775 The structure contains the PPM task, an opaque, VDAF-specific 776 aggregation parameter, an opaque _helper state_ string, and a 777 sequence of _sub-requests_, each corresponding to a unique client 778 report. Sub-requests are structured as follows: 780 struct { 781 Nonce nonce; // Equal to Report.nonce. 782 Extension extensions<4..2^16-1>; // Equal to Report.extensions. 783 EncryptedInputShare helper_share; 784 opaque message<0..2^16-1>; // VDAF message 785 } AggregateSubReq; 787 The nonce and extensions fields have the same value as those in the 788 report uploaded by the client. Similarly, the helper_share field is 789 the EncryptedInputShare from the Report whose index in 790 Report.encrypted_input_shares is equal to the index of [aggregator] 791 in the task's aggregator endpoints. [OPEN ISSUE: We usually only 792 need to send this in the first aggregate request. Shall we exclude 793 it in subsequent requests somehow?] The remainder of the structure 794 is dedicated to VDAF-specific request parameters. 796 In order to provide replay protection, the leader preprocesses the 797 set of reports it sends in the AggregateReq as described in 798 Section 4.4.2. Any reports filtered out by this procedure MUST be 799 ignored. 801 The helper handles well-formed requests as follows. (As usual, 802 malformed requests are handled as described in Section 3.1.)
It 803 first looks for PPM parameters corresponding to AggregateReq.task_id. 804 It then preprocesses the sub-requests as described in Section 4.4.2. 805 Any sub-requests filtered out by this procedure MUST be ignored. 807 In addition, for any report whose nonce contains a timestamp that 808 falls in a batch interval for which it has completed at least one 809 aggregate-share request (see Section 4.3.2), the helper MUST send an 810 error message in response rather than its next VDAF message. Note 811 that this means leaders cannot interleave a sequence of aggregate and 812 aggregate-share requests for a single batch. 814 The response is an HTTP 200 OK with a body consisting of the helper's 815 updated state and a sequence of _sub-responses_. Each sub-response 816 encodes the nonce and a VDAF-specific message: 818 struct { 819 opaque helper_state<0..2^16>; 820 AggregateSubResp seq<1..2^24-1>; 821 } AggregateResp; 823 struct { 824 Nonce nonce; 825 opaque message<0..2^16-1>; // VDAF message 826 } AggregateSubResp; 827 The helper handles each sub-request AggregateSubReq as follows. It 828 first looks up the HPKE config and corresponding secret key 829 associated with helper_share.config_id. If not found, then the sub- 830 response consists of an "unrecognized config" alert. [TODO: We'll 831 want to be more precise about what this means. See issue#57.] Next, 832 it attempts to decrypt the payload with the following procedure: 834 context = SetupBaseR(helper_share.enc, sk, 835 "pda input share" || task_id || server_role) 836 input_share = context.Open(nonce || extensions, helper_share) 838 where sk is the HPKE secret key, task_id is AggregateReq.task_id and 839 server_role is the role of the server (0x01 for the leader and 0x00 840 for the helper). nonce and extensions are obtained from the 841 corresponding fields in AggregateSubReq. If decryption fails, then 842 the sub-response consists of a "decryption error" alert. [See 843 issue#57.]
Otherwise, the helper handles the request for its 844 plaintext input share input_share and updates its state as specified 845 by the PPM protocol. 847 After processing all of the sub-requests, the helper encrypts its 848 updated state and constructs its response to the aggregate request. 850 4.3.1.1. Leader State 852 The leader is required to buffer reports while waiting to aggregate 853 them. The leader SHOULD NOT accept reports whose timestamps are too 854 far in the future. Implementors MAY provide for some small leeway, 855 usually no more than a few minutes, to account for clock skew. 857 4.3.1.2. Helper State 859 The helper state is an optional parameter of an aggregate request 860 that the helper can use to carry state across requests. At least 861 part of the state will usually need to be encrypted in order to 862 protect user privacy. However, the details of precisely how the 863 state is encrypted and the information that it carries is up to the 864 helper implementation. 866 4.3.2. Aggregate Share Request 868 Once the aggregators have verified at least as many reports as 869 required for the PPM task, the leader issues an "aggregate-share 870 request" to each helper. The helper responds to this request by 871 extracting its aggregate share from its state and encrypting it under 872 the collector's HPKE public key. 874 [OPEN ISSUE: consider updating the checksum algorithm to not permit 875 collisions] 877 First, the leader computes a checksum over the set of output shares 878 included in the batch window. The checksum is computed by taking the 879 SHA256 hash of each nonce from the client reports included in the 880 aggregation, then combining the hash values with a bitwise-XOR 881 operation. 
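The checksum computation described above can be sketched as follows. This is an illustrative Python sketch, not normative text: the function name batch_checksum is hypothetical, and it assumes each nonce is supplied already serialized to bytes (the wire encoding of the Nonce structure is taken as given).

```python
import hashlib


def batch_checksum(encoded_nonces):
    """XOR together the SHA-256 digests of the serialized report nonces.

    `encoded_nonces` is an iterable of bytes objects (assumed encoding).
    Returns a 32-byte value suitable for the `checksum[32]` field.
    """
    checksum = bytes(32)  # all-zero value, the identity element for XOR
    for nonce in encoded_nonces:
        digest = hashlib.sha256(nonce).digest()
        # Combine the running checksum and this digest byte by byte.
        checksum = bytes(a ^ b for a, b in zip(checksum, digest))
    return checksum
```

Because XOR is commutative, the result does not depend on the order in which the leader and helper processed the reports. Note also that a nonce included twice cancels itself out, which is one way two different report sets can yield the same checksum.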
883 Then, for each aggregator endpoint [aggregator] in the parameters 884 associated with CollectReq.task_id (see Section 4.4) except its own, 885 the leader sends a POST request to [aggregator]/aggregate_share with 886 the following message: 888 struct { 889 TaskID task_id; 890 Interval batch_interval; 891 uint64 report_count; 892 opaque checksum[32]; 893 opaque helper_state<0..2^16>; 894 } AggregateShareReq; 896 * task_id is the task ID associated with the PPM parameters. 898 * batch_interval is the batch interval of the request. 900 * report_count is the number of reports included in the aggregation. 902 * checksum is the checksum computed over the set of client reports, 903 computed as described above. 905 * helper_state is the helper's state, which is carried across 906 requests from the leader. 908 To respond to an AggregateShareReq message, the helper first looks up 909 the PPM parameters associated with task task_id. Then, using the 910 procedure in Section 4.4.1, it ensures that the request meets the 911 requirements of the batch parameters. It also computes a checksum 912 based on its view of the output shares included in the batch window, 913 and checks that the report_count and checksum included in the request 914 match its computed values. If so, it aggregates all valid output 915 shares that fall in the batch interval into an aggregate share. 
The 916 response contains an opaque, VDAF-specific message: 918 struct { 919 opaque message<0..2^16-1>; // VDAF message 920 } AggregateShare; 921 Next, the helper encrypts the aggregate share agg_share under the 922 collector's public key as follows: 924 enc, context = SetupBaseS(pk, 925 "pda aggregate share" || task_id || server_role) 926 encrypted_agg_share = context.Seal(batch_interval, agg_share) 928 where pk is the HPKE public key encoded by the collector's HPKE key 929 configuration, task_id is AggregateShareReq.task_id and server_role 930 is the role of the server (0x01 for the leader and 0x00 for the 931 helper). agg_share is the serialized AggregateShare, and 932 batch_interval is obtained from the AggregateShareReq. 934 This encryption prevents the leader from learning the actual result, 935 as it only has its own share and not the helper's share, which is 936 encrypted for the collector. The helper responds to the leader 937 with HTTP status 200 OK and a body consisting of the following 938 structure: 940 struct { 941 HpkeConfigId collector_hpke_config_id; 942 opaque enc<1..2^16-1>; 943 opaque payload<1..2^16>; 944 } EncryptedAggregateShare; 946 * collector_hpke_config_id is collector_config.id from the task 947 parameters corresponding to CollectReq.task_id. 949 * enc is the HPKE encapsulated key, used by the collector to decrypt 950 the aggregate share. 952 * payload is an encrypted AggregateShare. 954 The leader uses the helper's aggregate share response to respond to 955 the collector's collect request (see Section 4.4). 957 4.4. Collecting Results 959 The collector uses CollectReq to ask the leader to collect and return 960 the results for a given PPM task over a given time period. To make a 961 collect request, the collector issues a POST request to 962 [leader]/collect, where [leader] is the leader's endpoint URL.
The 963 body of the request is structured as follows: 965 [OPEN ISSUE: Decide if and how the collector's request is 966 authenticated. If not, then we need to ensure that collect job URIs 967 are resistant to enumeration attacks.] struct { TaskID task_id; 968 Interval batch_interval; opaque agg_param<0..2^16-1>; // VDAF 969 aggregation parameter } CollectReq; 971 The named parameters are: 973 * task_id, the PPM task ID. 975 * batch_interval, the request's batch interval. 977 * agg_param, an aggregation parameter for the VDAF being executed. 979 Depending on the VDAF scheme and how the leader is configured, the 980 leader and helper may already have prepared all the reports falling 981 within batch_interval and be ready to return the aggregate shares 982 right away, but this cannot be guaranteed. In fact, for some VDAFs, 983 it is not possible to begin preparing inputs until the collector 984 provides the aggregation parameter in the CollectReq. For these 985 reasons, collect requests are handled asynchronously. 987 Upon receipt of a CollectReq, the leader begins by checking that the 988 request meets the requirements of the batch parameters using the 989 procedure in Section 4.4.1. If so, it immediately sends the 990 collector a response with HTTP status 303 See Other and a Location 991 header containing a URI identifying the collect job that can be 992 polled by the collector, called the "collect job URI". 994 The leader then begins working with the helper to prepare the shares 995 falling into CollectReq.batch_interval (or continues this process, 996 depending on the VDAF) as described in Section 4.3. 998 After receiving the response to its CollectReq, the collector makes 999 an HTTP GET request to the collect job URI to check on the status of 1000 the collect job and eventually obtain the result. If the collect job 1001 is not finished yet, the leader responds with HTTP status 202 1002 Accepted.
The response MAY include a Retry-After header field to 1003 suggest a polling interval to the collector. 1005 Once all the necessary reports have been prepared, the leader obtains 1006 the helper's encrypted aggregate share for the batch interval by 1007 sending an AggregateShareReq to the helper as described in 1008 Section 4.3.2. The leader then computes its own aggregate share by 1009 aggregating all of the prepared output shares that fall within the 1010 batch interval. 1012 When both aggregators' shares are successfully obtained, the leader 1013 responds to subsequent HTTP GET requests to the collect job's URI 1014 with HTTP status 200 OK and a body consisting of a CollectResult: 1016 struct { 1017 EncryptedAggregateShare shares<1..2^16-1>; 1018 } CollectResult; 1020 * shares is a vector of EncryptedAggregateShares, as described in 1021 Section 4.3.2, except that for the leader's share, the task_id and 1022 batch_interval used to encrypt the AggregateShare are obtained 1023 from the CollectReq. 1025 If obtaining aggregate shares fails, then the leader responds to 1026 subsequent HTTP GET requests to the collect job URI with an HTTP 1027 error status and a problem document as described in Section 3.1. 1029 The leader MUST retain a collect job's results until the collector 1030 sends an HTTP DELETE request to the collect job URI, in which case 1031 the leader responds with HTTP status 204 No Content. 1033 [OPEN ISSUE: Allow the leader to drop aggregate shares after some 1034 reasonable amount of time has passed, but it's not clear how to 1035 specify that. ACME doesn't bother to say anything at all about this 1036 when describing how subscribers should fetch certificates: 1037 https://datatracker.ietf.org/doc/html/rfc8555#section-7.4.2] 1039 [OPEN ISSUE: Describe how intra-protocol errors yield collect errors 1040 (see issue#57). For example, how does a leader respond to a collect 1041 request if the helper drops out?] 1043 4.4.1.
Validating Batch Parameters 1045 Before an aggregator responds to a collect request or aggregate-share 1046 request, it must first check that the request does not violate the 1047 parameters associated with the PPM task. It does so as described 1048 here. 1050 First the aggregator checks that the request's batch interval 1051 respects the boundaries defined by the PPM task's parameters. 1052 Namely, it checks that both batch_interval.start and 1053 batch_interval.duration are divisible by min_batch_duration and that 1054 batch_interval.duration >= min_batch_duration. Unless both these 1055 conditions are true, it aborts and alerts the peer with "invalid 1056 batch interval". 1058 Next, the aggregator checks that the request respects the generic 1059 privacy parameters of the PPM task. Let X denote the set of reports 1060 for which the aggregator has recovered a valid output share and which 1061 fall in the batch interval of the request. 1063 * If len(X) < min_batch_size, then the aggregator aborts and alerts 1064 the peer with "insufficient batch size". 1066 * The aggregator keeps track of the number of times each report was 1067 added to the batch of an AggregateShareReq. If any report in X 1068 was added to at least max_batch_lifetime previous batches, then 1069 the helper aborts and alerts the peer with "request exceeds the 1070 batch's privacy budget". 1072 4.4.2. Anti-replay 1074 Using a client-provided report multiple times within a single batch, 1075 or using the same report in multiple batches, may allow a server to 1076 learn information about the client's measurement, violating the 1077 privacy goal of PPM. To prevent such replay attacks, this 1078 specification requires the aggregators to detect and filter out 1079 replayed reports. 1081 To detect replay attacks, each aggregator keeps track of the set of 1082 nonces pertaining to reports that were previously aggregated for a 1083 given task. 
If the leader receives a report from a client whose 1084 nonce is in this set, it simply ignores it. A helper who receives an 1085 encrypted input share whose nonce is in this set replies to the 1086 leader with an error as described in Section 4.3.1. 1088 [OPEN ISSUE: This has the potential to require aggregators to store 1089 nonce sets indefinitely. See issue#180.] 1091 A malicious aggregator may attempt to force a replay by replacing the 1092 nonce generated by the client with a nonce its peer has not yet seen. 1093 To prevent this, clients incorporate the nonce into the AAD for HPKE 1094 encryption, ensuring that the output share is only recovered if the 1095 aggregator is given the correct nonce. (See Section 4.2.2.) 1097 Aggregators prevent the same report from being used in multiple 1098 batches (except as required by the protocol) by only responding to 1099 valid collect requests, as described in Section 4.4.1. 1101 5. Operational Considerations 1103 PPM protocols have inherent constraints derived from the tradeoff 1104 between privacy guarantees and computational complexity. These 1105 tradeoffs influence how applications may choose to utilize services 1106 implementing the specification. 1108 5.1. Protocol participant capabilities 1110 The design in this document has different assumptions and 1111 requirements for different protocol participants, including clients, 1112 aggregators, and collectors. This section describes these 1113 capabilities in more detail. 1115 5.1.1. Client capabilities 1117 Clients have limited capabilities and requirements. Their only 1118 inputs to the protocol are (1) the parameters configured out of band 1119 and (2) a measurement. Clients are not expected to store any state 1120 across any upload flows, nor are they required to implement any sort 1121 of report upload retry mechanism.
By design, the protocol in this 1122 document is robust against individual client upload failures since 1123 the protocol output is an aggregate over all inputs. 1125 5.1.2. Aggregator capabilities 1127 Helpers and leaders have different operational requirements. The 1128 design in this document assumes an operationally competent leader, 1129 i.e., one that has no storage or computation limitations or 1130 constraints, but only a modestly provisioned helper, i.e., one that 1131 has computation, bandwidth, and storage constraints. By design, 1132 leaders must be at least as capable as helpers, where helpers are 1133 generally required to: 1135 * Support the collect protocol, which includes validating and 1136 aggregating reports; and 1138 * Publish and manage an HPKE configuration that can be used for the 1139 upload protocol. 1141 In addition, for each PPM task, helpers are required to: 1143 * Implement some form of batch-to-report index, as well as inter- 1144 and intra-batch replay mitigation storage, which includes some way 1145 of tracking batch report size with optional support for state 1146 offloading. Some of this state may be used for replay attack 1147 mitigation. The replay mitigation strategy is described in 1148 Section 4.4.2. 1150 Beyond the minimal capabilities required of helpers, leaders are 1151 generally required to: 1153 * Support the upload protocol and store reports; and 1155 * Track batch report size during each collect flow and request 1156 encrypted output shares from helpers. 1158 In addition, for each PPM task, leaders are required to: 1160 * Implement and store state for the form of inter- and intra-batch 1161 replay mitigation in Section 4.4.2; and 1163 * Store helper state. 1165 5.1.3. Collector capabilities 1167 Collectors statefully interact with aggregators to produce an 1168 aggregate output. Their input to the protocol is the task 1169 parameters, configured out of band, which include the corresponding 1170 batch window and size. 
For each collect invocation, collectors are 1171 required to keep state from the start of the protocol to the end as 1172 needed to produce the final aggregate output. 1174 Collectors must also maintain state for the lifetime of each task, 1175 which includes key material associated with the HPKE key 1176 configuration. 1178 5.2. Data resolution limitations 1180 Privacy comes at the cost of computational complexity. While affine- 1181 aggregatable encodings (AFEs) can compute many useful statistics, 1182 they require more bandwidth and CPU cycles to account for finite- 1183 field arithmetic during input-validation. The increased work from 1184 verifying inputs decreases the throughput of the system or the inputs 1185 processed per unit time. Throughput is related to the verification 1186 circuit's complexity and the available compute-time to each 1187 aggregator. 1189 Applications that utilize proofs with a large number of 1190 multiplication gates or a high frequency of inputs may need to limit 1191 inputs into the system to meet bandwidth or compute constraints. 1192 Some methods of overcoming these limitations include choosing a 1193 better representation for the data or introducing sampling into the 1194 data collection methodology. 1196 [[TODO: Discuss explicit key performance indicators, here or 1197 elsewhere.]] 1199 5.3. Aggregation utility and soft batch deadlines 1201 A soft real-time system should produce a response within a deadline 1202 to be useful. This constraint may be relevant when the value of an 1203 aggregate decreases over time. A missed deadline can reduce an 1204 aggregate's utility but not necessarily cause failure in the system. 1206 An example of a soft real-time constraint is the expectation that 1207 input data can be verified and aggregated in a period equal to data 1208 collection, given some computational budget. Meeting these deadlines 1209 will require efficient implementations of the input-validation 1210 protocol. 
Applications might batch requests or utilize more 1211 efficient serialization to improve throughput. 1213 Some applications may be constrained by the time that it takes to 1214 reach a privacy threshold defined by a minimum number of reports. 1215 One possible solution is to increase the reporting period so more 1216 samples can be collected, balanced against the urgency of responding 1217 to a soft deadline. 1219 5.4. Protocol-specific optimizations 1221 Not all PPM tasks have the same operational requirements, so the 1222 protocol is designed to allow implementations to reduce operational 1223 costs in certain cases. 1225 5.4.1. Reducing storage requirements 1227 In general, the aggregators are required to keep state for all valid 1228 reports for as long as collect requests can be made for them. In 1229 particular, the aggregators must store a batch as long as the batch 1230 has not been queried more than max_batch_lifetime times. However, it 1231 is not always necessary to store the reports themselves. For schemes 1232 like Prio in which the input-validation protocol is only run once per 1233 report, each aggregator only needs to store its aggregate share for 1234 each possible batch interval, along with the number of times the 1235 aggregate share was used in a batch. (The helper may store its 1236 aggregate shares in its encrypted state, thereby offloading this 1237 state to the leader.) This is due to the requirement that the batch 1238 interval respect the boundaries defined by the PPM parameters. (See 1239 Section 4.4.1.) 1241 6. Security Considerations 1243 Prio assumes a powerful adversary with the ability to compromise an 1244 unbounded number of clients. In doing so, the adversary can provide 1245 malicious (yet truthful) inputs to the aggregation function. Prio 1246 also assumes that all but one server operates honestly, where a 1247 dishonest server does not execute the protocol faithfully as 1248 specified. 
The system also assumes that servers communicate over 1249 secure and mutually authenticated channels. In practice, this can be 1250 done by TLS or some other form of application-layer authentication. 1252 In the presence of this adversary, Prio provides two important 1253 properties for computing an aggregation function F: 1255 1. Privacy. The aggregators and collector learn only the output of 1256 F computed over all client inputs, and nothing else. 1258 2. Robustness. As long as the aggregators execute the input- 1259 validation protocol correctly, a malicious client can skew the 1260 output of F only by reporting false (untruthful) input. The 1261 output cannot be influenced in any other way. 1263 There are several additional constraints that a Prio deployment must 1264 satisfy in order to achieve these goals: 1266 1. Minimum batch size. The aggregation batch size has an obvious 1267 impact on privacy. (A batch size of one hides nothing of the 1268 input.) 1270 2. Aggregation function choice. Some aggregation functions leak 1271 slightly more than the function output itself. 1273 [TODO: discuss these in more detail.] 1275 6.1. Threat model 1277 In this section, we enumerate the actors participating in the Prio 1278 system and enumerate their assets (secrets that are either inherently 1279 valuable or which confer some capability that enables further attack 1280 on the system), the capabilities that a malicious or compromised 1281 actor has, and potential mitigations for attacks enabled by those 1282 capabilities. 1284 This model assumes that all participants have previously agreed upon 1285 and exchanged all shared parameters over some unspecified secure 1286 channel. 1288 6.1.1. Client/user 1290 6.1.1.1. Assets 1292 1. Unshared inputs. Clients are the only actor that can ever see 1293 the original inputs. 1295 2. Unencrypted input shares. 1297 6.1.1.2. Capabilities 1299 1. Individual users can reveal their own input and compromise their 1300 own privacy. 
1302 2. Clients (that is, software which might be used by many users of 1303 the system) can defeat privacy by leaking input outside of the 1304 Prio system. 1306 3. Clients may affect the quality of aggregations by reporting false 1307 input. 1309 * Prio can only prove that submitted input is valid, not that it 1310 is true. False input can be mitigated orthogonally to the 1311 Prio protocol (e.g., by requiring that aggregations include a 1312 minimum number of contributions) and so these attacks are 1313 considered to be outside of the threat model. 1315 4. Clients can send invalid encodings of input. 1317 6.1.1.3. Mitigations 1319 1. The input validation protocol executed by the aggregators 1320 prevents either individual clients or coalitions of clients from 1321 compromising the robustness property. 1323 2. If aggregator output satisfies differential privacy Section 6.5, 1324 then all records not leaked by malicious clients are still 1325 protected. 1327 6.1.2. Aggregator 1329 6.1.2.1. Assets 1331 1. Unencrypted input shares. 1333 2. Input share decryption keys. 1335 3. Client identifying information. 1337 4. Aggregate shares. 1339 5. Aggregator identity. 1341 6.1.2.2. Capabilities 1343 1. Aggregators may defeat the robustness of the system by emitting 1344 bogus output shares. 1346 2. If clients reveal identifying information to aggregators (such as 1347 a trusted identity during client authentication), aggregators can 1348 learn which clients are contributing input. 1350 1. Aggregators may reveal that a particular client contributed 1351 input. 1353 2. Aggregators may attack robustness by selectively omitting 1354 inputs from certain clients. 1356 * For example, omitting submissions from a particular 1357 geographic region to falsely suggest that a particular 1358 localization is not being used. 1360 3. Individual aggregators may compromise availability of the system 1361 by refusing to emit aggregate shares. 1363 4. Input validity proof forging.
Any aggregator can collude with a 1364 malicious client to craft a proof that will fool honest 1365 aggregators into accepting invalid input. 1367 5. Aggregators can count the total number of input shares, which 1368 could compromise user privacy (and differential privacy 1369 Section 6.5) if the presence or absence of a share for a given 1370 user is sensitive. 1372 6.1.2.3. Mitigations 1374 1. The linear secret sharing scheme employed by the client ensures 1375 that privacy is preserved as long as at least one aggregator does 1376 not reveal its input shares. 1378 2. If computed over a sufficient number of reports, aggregate shares 1379 reveal nothing about either the inputs or the participating 1380 clients. 1382 3. Clients can ensure that aggregate counts are non-sensitive by 1383 generating input independently of user behavior. For example, a 1384 client should periodically upload a report even if the event that 1385 the task is tracking has not occurred, so that the absence of 1386 reports cannot be distinguished from their presence. 1388 4. Bogus inputs can be generated that encode "null" shares that do 1389 not affect the aggregate output, but mask the total number of 1390 true inputs. 1392 * Either leaders or clients can generate these inputs to mask 1393 the total number from non-leader aggregators or all the 1394 aggregators, respectively. 1396 * In either case, care must be taken to ensure that bogus inputs 1397 are indistinguishable from true inputs (metadata, etc), 1398 especially when constructing timestamps on reports. 1400 [OPEN ISSUE: Define what "null" shares are. They should be defined 1401 such that inserting null shares into an aggregation is effectively a 1402 no-op. See issue#98.] 1404 6.1.3. Leader 1406 The leader is also an aggregator, and so all the assets, capabilities 1407 and mitigations available to aggregators also apply to the leader. 1409 6.1.3.1. Capabilities 1411 1. Input validity proof verification. 
The leader can forge proofs 1412 and collude with a malicious client to trick aggregators into 1413 aggregating invalid inputs. 1415 * This capability is no stronger than any aggregator's ability 1416 to forge validity proof in collusion with a malicious client. 1418 2. Relaying messages between aggregators. The leader can compromise 1419 availability by dropping messages. 1421 * This capability is no stronger than any aggregator's ability 1422 to refuse to emit aggregate shares. 1424 3. Shrinking the anonymity set. The leader instructs aggregators to 1425 construct output parts and so could request aggregations over few 1426 inputs. 1428 6.1.3.2. Mitigations 1429 1. Aggregators enforce agreed upon minimum aggregation thresholds to 1430 prevent deanonymizing. 1432 2. If aggregator output satisfies differential privacy Section 6.5, 1433 then genuine records are protected regardless of the size of the 1434 anonymity set. 1436 6.1.4. Collector 1438 6.1.4.1. Capabilities 1440 1. Advertising shared configuration parameters (e.g., minimum 1441 thresholds for aggregations, joint randomness, arithmetic 1442 circuits). 1444 2. Collectors may trivially defeat availability by discarding 1445 aggregate shares submitted by aggregators. 1447 3. Known input injection. Collectors may collude with clients to 1448 send known input to the aggregators, allowing collectors to 1449 shrink the effective anonymity set by subtracting the known 1450 inputs from the final output. Sybil attacks [Dou02] could be 1451 used to amplify this capability. 1453 6.1.4.2. Mitigations 1455 1. Aggregators should refuse shared parameters that are trivially 1456 insecure (i.e., aggregation threshold of 1 contribution). 1458 2. If aggregator output satisfies differential privacy Section 6.5, 1459 then genuine records are protected regardless of the size of the 1460 anonymity set. 1462 6.1.5. Aggregator collusion 1464 If all aggregators collude (e.g. 
by promiscuously sharing unencrypted 1465 input shares), then none of the properties of the system hold. 1466 Accordingly, such scenarios are outside of the threat model. 1468 6.1.6. Attacker on the network 1470 We assume the existence of attackers on the network links between 1471 participants. 1473 6.1.6.1. Capabilities 1475 1. Observation of network traffic. Attackers may observe messages 1476 exchanged between participants at the IP layer. 1478 1. The time of transmission of input shares by clients could 1479 reveal information about user activity. 1481 * For example, if a user opts into a new feature, and the 1482 client immediately reports this to aggregators, then just 1483 by observing network traffic, the attacker can infer what 1484 the user did. 1486 2. Observation of message size could allow the attacker to learn 1487 how much input is being submitted by a client. 1489 * For example, if the attacker observes an encrypted message 1490 of some size, they can infer the size of the plaintext, 1491 plus or minus the cipher block size. From this they may 1492 be able to infer which aggregations the user has opted 1493 into or out of. 1495 2. Tampering with network traffic. Attackers may drop messages or 1496 inject new messages into communications between participants. 1498 6.1.6.2. Mitigations 1500 1. All messages exchanged between participants in the system should 1501 be encrypted. 1503 2. All messages exchanged between aggregators, the collector and the 1504 leader should be mutually authenticated so that network attackers 1505 cannot impersonate participants. 1507 3. Clients should be required to submit inputs at regular intervals 1508 so that the timing of individual messages does not reveal 1509 anything. 1511 4. Clients should submit dummy inputs even for aggregations the user 1512 has not opted into. 
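As a rough illustration of mitigations 3 and 4 above, the following Python sketch shows a client that emits one report per interval whether or not the tracked event occurred. It is not part of the protocol: the field modulus, the fixed share encoding, and the function names (share, report_for_interval, aggregate) are assumptions made for this example, input validity proofs and HPKE encryption are omitted, and treating a zero measurement as the "dummy" input is only one possible reading, since this document does not yet define "null" shares.

```python
import secrets

# Hypothetical field modulus for illustration only (2**61 - 1 is prime);
# this document does not fix a field for this purpose.
MODULUS = 2**61 - 1
SHARE_BYTES = 8  # every encoded share has the same length


def share(value: int, num_aggregators: int = 2) -> list[int]:
    """Split value into additive shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(num_aggregators - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def report_for_interval(event_occurred: bool) -> list[bytes]:
    """Build exactly one report per interval, real (1) or dummy (0)."""
    measurement = 1 if event_occurred else 0
    # Fixed-length encoding so real and dummy reports have the same size.
    return [s.to_bytes(SHARE_BYTES, "big") for s in share(measurement)]


def aggregate(reports: list[list[bytes]]) -> int:
    """Each aggregator sums its own shares; the sums are then combined."""
    num_aggregators = len(reports[0])
    aggregate_shares = [
        sum(int.from_bytes(r[i], "big") for r in reports) % MODULUS
        for i in range(num_aggregators)
    ]
    return sum(aggregate_shares) % MODULUS


# Five intervals, two real events: five equally sized reports are sent.
events = [True, False, False, True, False]
reports = [report_for_interval(e) for e in events]
print(aggregate(reports))
```

Because the client sends a report of the same length in every interval, neither the timing nor the size of a single upload indicates whether the event occurred, and the zero-valued reports leave the sum unchanged. Report metadata such as timestamps would still need care, as noted in Section 6.1.2.3.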
1514 [[OPEN ISSUE: The threat model for Prio --- as it's described in the 1515 original paper and [BBCGGI19] --- considers *either* a malicious 1516 client (attacking soundness) *or* a malicious subset of aggregators 1517 (attacking privacy). In particular, soundness isn't guaranteed if 1518 any one of the aggregators is malicious; in theory it may be possible 1519 for a malicious client and aggregator to collude and break soundness. 1520 Is this a contingency we need to address? There are techniques in 1521 [BBCGGI19] that account for this; we need to figure out if they're 1522 practical.]] 1524 6.2. Client authentication or attestation 1526 [TODO: Solve issue#89] 1528 6.3. Anonymizing proxies 1530 Client reports can contain auxiliary information such as source IP, 1531 HTTP user agent or, in deployments which use it, client authentication 1532 information, which could be used by aggregators to identify 1533 participating clients or permit some attacks on robustness. This 1534 auxiliary information could be removed by having clients submit 1535 reports to an anonymizing proxy server which would then use Oblivious 1536 HTTP [I-D.thomson-http-oblivious] to forward inputs to the PPM 1537 leader, without requiring any server participating in PPM to be aware 1538 of whatever client authentication or attestation scheme is in use. 1540 6.4. Batch parameters 1542 An important parameter of a PPM deployment is the minimum batch size. 1543 If an aggregation includes too few inputs, then the outputs can 1544 reveal information about individual participants. Aggregators use 1545 the batch size field of the shared task parameters to enforce minimum 1546 batch size during the collect protocol, but server implementations 1547 may also opt out of participating in a PPM task if the minimum batch 1548 size is too small. This document does not specify how to choose 1549 minimum batch sizes. 1551 The PPM parameters also specify the maximum number of times a report 1552 can be used.
Some protocols, such as Poplar [BBCGGI21], require 1553 reports to be used in multiple batches spanning multiple collect 1554 requests. 1556 6.5. Differential privacy 1558 Optionally, PPM deployments can choose to ensure their output F 1559 achieves differential privacy [Vad16]. A simple approach would 1560 require the aggregators to add two-sided noise (e.g. sampled from a 1561 two-sided geometric distribution) to outputs. Since each aggregator 1562 is adding noise independently, privacy can be guaranteed even if all 1563 but one of the aggregators is malicious. Differential privacy is a 1564 strong privacy definition, and protects users in extreme 1565 circumstances: Even if an adversary has prior knowledge of every 1566 input in a batch except for one, that one record is still formally 1567 protected. 1569 [OPEN ISSUE: While parameters configuring the differential privacy 1570 noise (like specific distributions / variance) can be agreed upon out 1571 of band by the aggregators and collector, there may be benefits to 1572 adding explicit protocol support by encoding them into task 1573 parameters.] 1575 6.6. Robustness in the presence of malicious servers 1577 Most PPM protocols, including Prio and Poplar, are robust against 1578 malicious clients, but are not robust against malicious servers. Any 1579 aggregator can simply emit bogus aggregate shares and undetectably 1580 spoil aggregates. If enough aggregators were available, this could 1581 be mitigated by running the protocol multiple times with distinct 1582 subsets of aggregators chosen so that no aggregator appears in all 1583 subsets and checking all the outputs against each other. If all the 1584 protocol runs do not agree, then participants know that at least one 1585 aggregator is defective, and it may be possible to identify the 1586 defector (i.e., if a majority of runs agree, and a single aggregator 1587 appears in every run that disagrees). 
See #22 1588 (https://github.com/abetterinternet/ppm-specification/issues/22) for 1589 discussion. 1591 6.7. Infrastructure diversity 1593 Prio deployments should ensure that aggregators do not have common 1594 dependencies that would enable a single vendor to reassemble inputs. 1595 For example, if all participating aggregators stored unencrypted 1596 input shares on the same cloud object storage service, then that 1597 cloud vendor would be able to reassemble all the input shares and 1598 defeat privacy. 1600 6.8. System requirements 1602 6.8.1. Data types 1604 7. IANA Considerations 1606 7.1. Protocol Message Media Types 1608 This specification defines the following protocol messages, along 1609 with their corresponding media types: 1611 * HpkeConfig Section 4.1: "application/ppm-hpke-config" 1613 * Report Section 4.2.2: "message/ppm-report" 1615 * AggregateReq Section 4.3.1: "message/ppm-aggregate-req" 1616 * AggregateResp Section 4.3.1: "message/ppm-aggregate-resp" 1618 * AggregateShareReq Section 4.3.2: "message/ppm-aggregate-share-req" 1620 * AggregateShareResp Section 4.3.2: 1621 "message/ppm-aggregate-share-resp" 1623 * CollectReq Section 4.4: "message/ppm-collect-req" 1625 * CollectResult Section 4.4: "message/ppm-collect-result" 1627 The definition for each media type is in the following subsections. 1629 Protocol message format evolution is supported through the definition 1630 of new formats that are identified by new media types. 1632 IANA [shall update / has updated] the "Media Types" registry at 1633 https://www.iana.org/assignments/media-types with the registration 1634 information in this section for all media types listed above. 1636 [OPEN ISSUE: Solicit review of these allocations from domain 1637 experts.] 1639 7.1.1.
"application/ppm-hpke-config" media type 1641 Type name: application 1643 Subtype name: ppm-hpke-config 1645 Required parameters: N/A 1647 Optional parameters: None 1649 Encoding considerations: only "8bit" or "binary" is permitted 1651 Security considerations: see Section 4.1 1653 Interoperability considerations: N/A 1655 Published specification: this specification 1657 Applications that use this media type: N/A 1659 Fragment identifier considerations: N/A 1661 Additional information: Magic number(s): N/A 1663 Deprecated alias names for this type: N/A 1664 File extension(s): N/A 1666 Macintosh file type code(s): N/A 1668 Person and email address to contact for further information: see 1669 Authors' Addresses section 1671 Intended usage: COMMON 1673 Restrictions on usage: N/A 1675 Author: see Authors' Addresses section 1677 Change controller: IESG 1679 7.1.2. "message/ppm-report" media type 1681 Type name: message 1683 Subtype name: ppm-report 1685 Required parameters: N/A 1687 Optional parameters: None 1689 Encoding considerations: only "8bit" or "binary" is permitted 1691 Security considerations: see Section 4.2.2 1693 Interoperability considerations: N/A 1695 Published specification: this specification 1697 Applications that use this media type: N/A 1699 Fragment identifier considerations: N/A 1701 Additional information: Magic number(s): N/A 1703 Deprecated alias names for this type: N/A 1705 File extension(s): N/A 1707 Macintosh file type code(s): N/A 1709 Person and email address to contact for further information: see 1710 Authors' Addresses section 1712 Intended usage: COMMON 1714 Restrictions on usage: N/A 1716 Author: see Authors' Addresses section 1718 Change controller: IESG 1720 7.1.3.
"message/ppm-aggregate-req" media type 1722 Type name: message 1724 Subtype name: ppm-aggregate-req 1726 Required parameters: N/A 1728 Optional parameters: None 1730 Encoding considerations: only "8bit" or "binary" is permitted 1732 Security considerations: see Section 4.3.1 1734 Interoperability considerations: N/A 1736 Published specification: this specification 1738 Applications that use this media type: N/A 1740 Fragment identifier considerations: N/A 1742 Additional information: Magic number(s): N/A 1744 Deprecated alias names for this type: N/A 1746 File extension(s): N/A 1748 Macintosh file type code(s): N/A 1750 Person and email address to contact for further information: see 1751 Authors' Addresses section 1753 Intended usage: COMMON 1755 Restrictions on usage: N/A 1757 Author: see Authors' Addresses section 1759 Change controller: IESG 1761 7.1.4. "message/ppm-aggregate-resp" media type 1763 Type name: message 1765 Subtype name: ppm-aggregate-resp 1767 Required parameters: N/A 1769 Optional parameters: None 1771 Encoding considerations: only "8bit" or "binary" is permitted 1773 Security considerations: see Section 4.3.1 1775 Interoperability considerations: N/A 1777 Published specification: this specification 1779 Applications that use this media type: N/A 1781 Fragment identifier considerations: N/A 1783 Additional information: Magic number(s): N/A 1785 Deprecated alias names for this type: N/A 1787 File extension(s): N/A 1789 Macintosh file type code(s): N/A 1791 Person and email address to contact for further information: see 1792 Authors' Addresses section 1794 Intended usage: COMMON 1796 Restrictions on usage: N/A 1798 Author: see Authors' Addresses section 1800 Change controller: IESG 1802 7.1.5.
"message/ppm-aggregate-share-req" media type 1804 Type name: message 1806 Subtype name: ppm-aggregate-share-req 1808 Required parameters: N/A 1809 Optional parameters: None 1811 Encoding considerations: only "8bit" or "binary" is permitted 1813 Security considerations: see Section 4.3.2 1815 Interoperability considerations: N/A 1817 Published specification: this specification 1819 Applications that use this media type: N/A 1821 Fragment identifier considerations: N/A 1823 Additional information: Magic number(s): N/A 1825 Deprecated alias names for this type: N/A 1827 File extension(s): N/A 1829 Macintosh file type code(s): N/A 1831 Person and email address to contact for further information: see 1832 Authors' Addresses section 1834 Intended usage: COMMON 1836 Restrictions on usage: N/A 1838 Author: see Authors' Addresses section 1840 Change controller: IESG 1842 7.1.6. "message/ppm-aggregate-share-resp" media type 1844 Type name: message 1846 Subtype name: ppm-aggregate-share-resp 1848 Required parameters: N/A 1850 Optional parameters: None 1852 Encoding considerations: only "8bit" or "binary" is permitted 1854 Security considerations: see Section 4.3.2 1856 Interoperability considerations: N/A 1857 Published specification: this specification 1859 Applications that use this media type: N/A 1861 Fragment identifier considerations: N/A 1863 Additional information: Magic number(s): N/A 1865 Deprecated alias names for this type: N/A 1867 File extension(s): N/A 1869 Macintosh file type code(s): N/A 1871 Person and email address to contact for further information: see 1872 Authors' Addresses section 1874 Intended usage: COMMON 1876 Restrictions on usage: N/A 1878 Author: see Authors' Addresses section 1880 Change controller: IESG 1882 7.1.7.
"message/ppm-collect-req" media type 1884 Type name: message 1886 Subtype name: ppm-collect-req 1888 Required parameters: N/A 1890 Optional parameters: None 1892 Encoding considerations: only "8bit" or "binary" is permitted 1894 Security considerations: see Section 4.4 1896 Interoperability considerations: N/A 1898 Published specification: this specification 1900 Applications that use this media type: N/A 1902 Fragment identifier considerations: N/A 1904 Additional information: Magic number(s): N/A 1905 Deprecated alias names for this type: N/A 1907 File extension(s): N/A 1909 Macintosh file type code(s): N/A 1911 Person and email address to contact for further information: see 1912 Authors' Addresses section 1914 Intended usage: COMMON 1916 Restrictions on usage: N/A 1918 Author: see Authors' Addresses section 1920 Change controller: IESG 1922 7.1.8. "message/ppm-collect-result" media type 1924 Type name: message 1926 Subtype name: ppm-collect-result 1928 Required parameters: N/A 1930 Optional parameters: None 1932 Encoding considerations: only "8bit" or "binary" is permitted 1934 Security considerations: see Section 4.4 1936 Interoperability considerations: N/A 1938 Published specification: this specification 1940 Applications that use this media type: N/A 1942 Fragment identifier considerations: N/A 1944 Additional information: Magic number(s): N/A 1946 Deprecated alias names for this type: N/A 1948 File extension(s): N/A 1950 Macintosh file type code(s): N/A 1952 Person and email address to contact for further information: see 1953 Authors' Addresses section 1955 Intended usage: COMMON 1957 Restrictions on usage: N/A 1959 Author: see Authors' Addresses section 1961 Change controller: IESG 1963 7.2. Upload Extension Registry 1965 This document requests creation of a new registry for extensions to 1966 the Upload protocol.
This registry should contain the following 1967 columns: 1969 [TODO: define how we want to structure this registry when the time 1970 comes] 1972 7.3. URN Sub-namespace for PPM (urn:ietf:params:ppm) 1974 The following value [will be/has been] registered in the "IETF URN 1975 Sub-namespace for Registered Protocol Parameter Identifiers" 1976 registry, following the template in [RFC3553]: 1978 Registry name: ppm 1980 Specification: [[THIS DOCUMENT]] 1982 Repository: http://www.iana.org/assignments/ppm 1984 Index value: No transformation needed. 1986 Initial contents: The types and descriptions in the table in 1987 Section 3.1 above, with the Reference field set to point to this 1988 specification. 1990 8. Acknowledgements 1992 The text in Section 3 is based extensively on [RFC8555] 1994 9. References 1996 9.1. Normative References 1998 [I-D.irtf-cfrg-hpke] 1999 Barnes, R. L., Bhargavan, K., Lipp, B., and C. A. Wood, 2000 "Hybrid Public Key Encryption", Work in Progress, 2001 Internet-Draft, draft-irtf-cfrg-hpke-12, 2 September 2021, 2002 . 2005 [I-D.thomson-http-oblivious] 2006 Thomson, M. and C. A. Wood, "Oblivious HTTP", Work in 2007 Progress, Internet-Draft, draft-thomson-http-oblivious-02, 2008 24 August 2021, . 2011 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 2012 Requirement Levels", BCP 14, RFC 2119, 2013 DOI 10.17487/RFC2119, March 1997, 2014 . 2016 [RFC2818] Rescorla, E., "HTTP Over TLS", RFC 2818, 2017 DOI 10.17487/RFC2818, May 2000, 2018 . 2020 [RFC3553] Mealling, M., Masinter, L., Hardie, T., and G. Klyne, "An 2021 IETF URN Sub-namespace for Registered Protocol 2022 Parameters", BCP 73, RFC 3553, DOI 10.17487/RFC3553, June 2023 2003, . 2025 [RFC5861] Nottingham, M., "HTTP Cache-Control Extensions for Stale 2026 Content", RFC 5861, DOI 10.17487/RFC5861, May 2010, 2027 . 2029 [RFC7234] Fielding, R., Ed., Nottingham, M., Ed., and J. 
Reschke, 2030 Ed., "Hypertext Transfer Protocol (HTTP/1.1): Caching", 2031 RFC 7234, DOI 10.17487/RFC7234, June 2014, 2032 . 2034 [RFC7807] Nottingham, M. and E. Wilde, "Problem Details for HTTP 2035 APIs", RFC 7807, DOI 10.17487/RFC7807, March 2016, 2036 . 2038 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2039 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 2040 May 2017, . 2042 [RFC8446] Rescorla, E., "The Transport Layer Security (TLS) Protocol 2043 Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018, 2044 . 2046 9.2. Informative References 2048 [BBCGGI19] Boneh, D., Boyle, E., Corrigan-Gibbs, H., Gilboa, N., and 2049 Y. Ishai, "Zero-Knowledge Proofs on Secret-Shared Data via 2050 Fully Linear PCPs", 5 January 2021, 2051 . 2053 [BBCGGI21] Boneh, D., Boyle, E., Corrigan-Gibbs, H., Gilboa, N., and 2054 Y. Ishai, "Lightweight Techniques for Private Heavy 2055 Hitters", 5 January 2021, 2056 . 2058 [CGB17] Corrigan-Gibbs, H. and D. Boneh, "Prio: Private, Robust, 2059 and Scalable Computation of Aggregate Statistics", 14 2060 March 2017, . 2062 [Dou02] Douceur, J., "The Sybil Attack", 10 October 2002, 2063 . 2066 [I-D.draft-cfrg-patton-vdaf] 2067 "*** BROKEN REFERENCE ***". 2069 [RFC8555] Barnes, R., Hoffman-Andrews, J., McCarney, D., and J. 2070 Kasten, "Automatic Certificate Management Environment 2071 (ACME)", RFC 8555, DOI 10.17487/RFC8555, March 2019, 2072 . 2074 [Vad16] Vadhan, S., "The Complexity of Differential Privacy", 9 2075 August 2016, 2076 . 2079 Authors' Addresses 2081 Tim Geoghegan 2082 ISRG 2083 Email: timgeog+ietf@gmail.com 2085 Christopher Patton 2086 Cloudflare 2087 Email: chrispatton+ietf@gmail.com 2089 Eric Rescorla 2090 Mozilla 2091 Email: ekr@rtfm.com 2092 Christopher A. Wood 2093 Cloudflare 2094 Email: caw@heapingbits.net