QUIC                                                     J. Iyengar, Ed.
Internet-Draft                                                    Google
Intended status: Standards Track                         M. Thomson, Ed.
Expires: June 8, 2018                                            Mozilla
                                                        December 5, 2017

           QUIC: A UDP-Based Multiplexed and Secure Transport
                      draft-ietf-quic-transport-08

Abstract

   This document defines the core of the QUIC transport protocol. This
   document describes connection establishment, packet format,
   multiplexing and reliability. Accompanying documents describe the
   cryptographic handshake and loss detection.

Note to Readers

   Discussion of this draft takes place on the QUIC working group
   mailing list (quic@ietf.org), which is archived at
   https://mailarchive.ietf.org/arch/search/?email_list=quic [1].

   Working Group information can be found at https://github.com/quicwg
   [2]; source code and issues list for this draft can be found at
   https://github.com/quicwg/base-drafts/labels/-transport [3].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 8, 2018.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Conventions and Definitions
     2.1. Notational Conventions
   3. A QUIC Overview
     3.1. Low-Latency Connection Establishment
     3.2. Stream Multiplexing
     3.3. Rich Signaling for Congestion Control and Loss Recovery
     3.4. Stream and Connection Flow Control
     3.5. Authenticated and Encrypted Header and Payload
     3.6. Connection Migration and Resilience to NAT Rebinding
     3.7. Version Negotiation
   4. Versions
   5. Packet Types and Formats
     5.1. Long Header
     5.2. Short Header
     5.3. Version Negotiation Packet
     5.4. Cryptographic Handshake Packets
       5.4.1. Initial Packet
       5.4.2. Retry Packet
       5.4.3. Handshake Packet
     5.5. Protected Packets
     5.6. Connection ID
     5.7. Packet Numbers
       5.7.1. Initial Packet Number
     5.8. Handling Packets from Different Versions
   6. Frames and Frame Types
   7. Life of a Connection
     7.1. Matching Packets to Connections
     7.2. Version Negotiation
       7.2.1. Sending Version Negotiation Packets
       7.2.2. Handling Version Negotiation Packets
       7.2.3. Using Reserved Versions
     7.3. Cryptographic and Transport Handshake
     7.4. Transport Parameters
       7.4.1. Transport Parameter Definitions
       7.4.2. Values of Transport Parameters for 0-RTT
       7.4.3. New Transport Parameters
       7.4.4. Version Negotiation Validation
     7.5. Stateless Retries
     7.6. Proof of Source Address Ownership
       7.6.1. Client Address Validation Procedure
       7.6.2. Address Validation on Session Resumption
       7.6.3. Address Validation Token Integrity
     7.7. Connection Migration
       7.7.1. Privacy Implications of Connection Migration
       7.7.2. Address Validation for Migrated Connections
     7.8. Spurious Connection Migrations
     7.9. Connection Termination
       7.9.1. Closing and Draining Connection States
       7.9.2. Idle Timeout
       7.9.3. Immediate Close
       7.9.4. Stateless Reset
   8. Frame Types and Formats
     8.1. Variable-Length Integer Encoding
     8.2. PADDING Frame
     8.3. RST_STREAM Frame
     8.4. CONNECTION_CLOSE frame
     8.5. APPLICATION_CLOSE frame
     8.6. MAX_DATA Frame
     8.7. MAX_STREAM_DATA Frame
     8.8. MAX_STREAM_ID Frame
     8.9. PING Frame
     8.10. BLOCKED Frame
     8.11. STREAM_BLOCKED Frame
     8.12. STREAM_ID_BLOCKED Frame
     8.13. NEW_CONNECTION_ID Frame
     8.14. STOP_SENDING Frame
     8.15. PONG Frame
     8.16. ACK Frame
       8.16.1. ACK Block Section
       8.16.2. Sending ACK Frames
       8.16.3. ACK Frames and Packet Protection
     8.17. STREAM Frames
   9. Packetization and Reliability
     9.1. Special Considerations for PMTU Discovery
   10. Streams: QUIC's Data Structuring Abstraction
     10.1. Stream Identifiers
     10.2. Stream States
       10.2.1. Send Stream States
       10.2.2. Receive Stream States
       10.2.3. Permitted Frame Types
       10.2.4. Bidirectional Stream States
     10.3. Solicited State Transitions
     10.4. Stream Concurrency
     10.5. Sending and Receiving Data
     10.6. Stream Prioritization
   11. Flow Control
     11.1. Edge Cases and Other Considerations
       11.1.1. Response to a RST_STREAM
       11.1.2. Data Limit Increments
     11.2. Stream Limit Increment
       11.2.1. Blocking on Flow Control
     11.3. Stream Final Offset
   12. Error Handling
     12.1. Connection Errors
     12.2. Stream Errors
     12.3. Transport Error Codes
     12.4. Application Protocol Error Codes
   13. Security and Privacy Considerations
     13.1. Spoofed ACK Attack
     13.2. Slowloris Attacks
     13.3. Stream Fragmentation and Reassembly Attacks
     13.4. Stream Commitment Attack
   14. IANA Considerations
     14.1. QUIC Transport Parameter Registry
     14.2. QUIC Transport Error Codes Registry
   15. References
     15.1. Normative References
     15.2. Informative References
     15.3. URIs
   Appendix A. Contributors
   Appendix B. Acknowledgments
   Appendix C. Change Log
     C.1. Since draft-ietf-quic-transport-07
     C.2. Since draft-ietf-quic-transport-06
     C.3. Since draft-ietf-quic-transport-05
     C.4. Since draft-ietf-quic-transport-04
     C.5. Since draft-ietf-quic-transport-03
     C.6. Since draft-ietf-quic-transport-02
     C.7. Since draft-ietf-quic-transport-01
     C.8. Since draft-ietf-quic-transport-00
     C.9. Since draft-hamilton-quic-transport-protocol-01
   Authors' Addresses

1. Introduction

   QUIC is a multiplexed and secure transport protocol that runs on top
   of UDP. QUIC aims to provide a flexible set of features that allow
   it to be a general-purpose transport for multiple applications.

   QUIC implements techniques learned from experience with TCP, SCTP and
   other transport protocols. QUIC uses UDP as substrate so as to not
   require changes to legacy client operating systems and middleboxes to
   be deployable. QUIC authenticates all of its headers and encrypts
   most of the data it exchanges, including its signaling. This allows
   the protocol to evolve without incurring a dependency on upgrades to
   middleboxes. This document describes the core QUIC protocol,
   including the conceptual design, wire format, and mechanisms of the
   QUIC protocol for connection establishment, stream multiplexing,
   stream and connection-level flow control, and data reliability.

   Accompanying documents describe QUIC's loss detection and congestion
   control [QUIC-RECOVERY], and the use of TLS 1.3 for key negotiation
   [QUIC-TLS].

2. Conventions and Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   Definitions of terms that are used in this document:

   Client: The endpoint initiating a QUIC connection.

   Server: The endpoint accepting incoming QUIC connections.

   Endpoint: The client or server end of a connection.

   Stream: A logical, bi-directional channel of ordered bytes within a
      QUIC connection.

   Connection: A conversation between two QUIC endpoints with a single
      encryption context that multiplexes streams within it.

   Connection ID: The 64-bit unsigned number used as an identifier for
      a QUIC connection.

   QUIC packet: A well-formed UDP payload that can be parsed by a QUIC
      receiver. QUIC packet size in this document refers to the UDP
      payload size.

2.1. Notational Conventions

   Packet and frame diagrams use the format described in Section 3.1 of
   [RFC2360], with the following additional conventions:

   [x] Indicates that x is optional

   {x} Indicates that x is encrypted

   x (A) Indicates that x is A bits long

   x (A/B/C) ... Indicates that x is one of A, B, or C bits long

   x (i) ... Indicates that x uses the variable-length encoding in
      Section 8.1

   x (*) ... Indicates that x is variable-length

3. A QUIC Overview

   This section briefly describes QUIC's key mechanisms and benefits.
   Key strengths of QUIC include:

   o Low-latency connection establishment

   o Multiplexing without head-of-line blocking

   o Authenticated and encrypted header and payload

   o Rich signaling for congestion control and loss recovery

   o Stream and connection flow control

   o Connection migration and resilience to NAT rebinding

   o Version negotiation

3.1. Low-Latency Connection Establishment

   QUIC relies on a combined cryptographic and transport handshake for
   setting up a secure transport connection. QUIC connections are
   expected to commonly use 0-RTT handshakes, meaning that for most QUIC
   connections, data can be sent immediately following the client
   handshake packet, without waiting for a reply from the server. QUIC
   provides a dedicated stream (Stream ID 0) to be used for performing
   the cryptographic handshake and QUIC options negotiation.
   The format of the QUIC options and parameters used during
   negotiation is described in this document, but the handshake
   protocol that runs on Stream ID 0 is described in the accompanying
   cryptographic handshake draft [QUIC-TLS].

3.2. Stream Multiplexing

   When application messages are transported over TCP, independent
   application messages can suffer from head-of-line blocking. When an
   application multiplexes many streams atop TCP's single-bytestream
   abstraction, a loss of a TCP segment results in blocking of all
   subsequent segments until a retransmission arrives, irrespective of
   the application streams that are encapsulated in subsequent segments.
   QUIC ensures that lost packets carrying data for an individual stream
   only impact that specific stream. Data received on other streams can
   continue to be reassembled and delivered to the application.

3.3. Rich Signaling for Congestion Control and Loss Recovery

   QUIC's packet framing and acknowledgments carry rich information that
   helps both congestion control and loss recovery in fundamental ways.
   Each QUIC packet carries a new packet number, including those
   carrying retransmitted data. This obviates the need for a separate
   mechanism to distinguish acknowledgments for retransmissions from
   those for original transmissions, avoiding TCP's retransmission
   ambiguity problem. QUIC acknowledgments also explicitly encode the
   delay between the receipt of a packet and its acknowledgment being
   sent, and together with the monotonically-increasing packet numbers,
   this allows for precise network roundtrip-time (RTT) calculation.
   QUIC's ACK frames support multiple ACK blocks, so QUIC is more
   resilient to reordering than TCP with SACK support, as well as able
   to keep more bytes on the wire when there is reordering or loss.

3.4. Stream and Connection Flow Control

   QUIC implements stream- and connection-level flow control.
   At a high level, a QUIC receiver advertises the maximum amount of
   data that it is willing to receive on each stream. As data is sent,
   received, and delivered on a particular stream, the receiver sends
   MAX_STREAM_DATA frames that increase the advertised limit for that
   stream, allowing the peer to send more data on that stream.

   In addition to this stream-level flow control, QUIC implements
   connection-level flow control to limit the aggregate buffer that a
   QUIC receiver is willing to allocate to all streams on a connection.
   Connection-level flow control works in the same way as stream-level
   flow control, but the bytes delivered and the limits are aggregated
   across all streams.

3.5. Authenticated and Encrypted Header and Payload

   TCP headers appear in plaintext on the wire and are not
   authenticated, causing a plethora of injection and header
   manipulation issues for TCP, such as receive-window manipulation and
   sequence-number overwriting. While some of these are mechanisms used
   by middleboxes to improve TCP performance, others are active attacks.
   Even "performance-enhancing" middleboxes that routinely interpose on
   the transport state machine end up limiting the evolvability of the
   transport protocol, as has been observed in the design of MPTCP
   [RFC6824] and in its subsequent deployability issues.

   QUIC packets are always authenticated and the payload is typically
   fully encrypted. The parts of the packet header which are not
   encrypted are still authenticated by the receiver, so as to thwart
   any packet injection or manipulation by third parties. Some early
   handshake packets, such as the Version Negotiation packet, are not
   encrypted, but information sent in these unencrypted handshake
   packets is later verified as part of cryptographic processing.

3.6. Connection Migration and Resilience to NAT Rebinding

   QUIC connections are identified by a Connection ID, a 64-bit unsigned
   number randomly generated by the server. QUIC's consistent
   connection ID allows connections to survive changes to the client's
   IP and port, such as those caused by NAT rebindings or by the client
   changing network connectivity to a new address. QUIC provides
   automatic cryptographic verification of a rebound client, since the
   client continues to use the same session key for encrypting and
   decrypting packets. The consistent connection ID can be used to
   allow migration of the connection to a new server IP address as well,
   since the Connection ID remains consistent across changes in the
   client's and the server's network addresses.

3.7. Version Negotiation

   QUIC version negotiation allows for multiple versions of the protocol
   to be deployed and used concurrently. Version negotiation is
   described in Section 7.2.

4. Versions

   QUIC versions are identified using a 32-bit unsigned number.

   The version 0x00000000 is reserved to represent version negotiation.
   This version of the specification is identified by the number
   0x00000001.

   Version 0x00000001 of QUIC uses TLS as a cryptographic handshake
   protocol, as described in [QUIC-TLS].

   Versions with the most significant 16 bits of the version number
   cleared are reserved for use in future IETF consensus documents.

   Versions that follow the pattern 0x?a?a?a?a are reserved for use in
   forcing version negotiation to be exercised; that is, any version
   number in which the low four bits of every octet are 1010 (in
   binary). A client or server MAY advertise support for any of these
   reserved versions.
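
   The two reservation rules above reduce to simple bit masks. The
   following Python sketch is illustrative only; the function names are
   hypothetical and are not defined by this document.

```python
def _check_32bit(version: int) -> None:
    # QUIC versions are 32-bit unsigned numbers.
    if not 0 <= version <= 0xFFFFFFFF:
        raise ValueError("QUIC versions are 32-bit unsigned numbers")


def is_reserved_for_negotiation(version: int) -> bool:
    """True if the version matches the pattern 0x?a?a?a?a, that is,
    the low four bits of every octet are 1010 in binary."""
    _check_32bit(version)
    return (version & 0x0F0F0F0F) == 0x0A0A0A0A


def is_reserved_for_ietf(version: int) -> bool:
    """True if the most significant 16 bits of the version are cleared."""
    _check_32bit(version)
    return (version >> 16) == 0
```

   For instance, 0x1a2a3a4a matches the negotiation-forcing pattern,
   while 0x00000001 does not but falls in the range whose upper 16 bits
   are cleared.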

   Reserved version numbers will probably never represent a real
   protocol; a client MAY use one of these version numbers with the
   expectation that the server will initiate version negotiation; a
   server MAY advertise support for one of these versions and can expect
   that clients ignore the value.

   [[RFC editor: please remove the remainder of this section before
   publication.]]

   The version number for the final version of this specification
   (0x00000001) is reserved for the version of the protocol that is
   published as an RFC.

   Version numbers used to identify IETF drafts are created by adding
   the draft number to 0xff000000. For example,
   draft-ietf-quic-transport-13 would be identified as 0xff00000D.

   Implementors are encouraged to register version numbers of QUIC that
   they are using for private experimentation on the GitHub wiki [4].

5. Packet Types and Formats

   We first describe QUIC's packet types and their formats, since some
   are referenced in subsequent mechanisms.

   All numeric values are encoded in network byte order (that is,
   big-endian) and all field sizes are in bits. When discussing
   individual bits of fields, the least significant bit is referred to
   as bit 0. Hexadecimal notation is used for describing the value of
   fields.

   Any QUIC packet has either a long or a short header, as indicated by
   the Header Form bit. Long headers are expected to be used early in
   the connection before version negotiation and establishment of 1-RTT
   keys. Short headers are minimal version-specific headers, which are
   used after version negotiation and 1-RTT keys are established.

5.1. Long Header

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+
   |1|   Type (7)  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                       Connection ID (64)                      +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Version (32)                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Packet Number (32)                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Payload (*)                        ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                      Figure 1: Long Header Format

   Long headers are used for packets that are sent prior to the
   completion of version negotiation and establishment of 1-RTT keys.
   Once both conditions are met, a sender switches to sending packets
   using the short header (Section 5.2). The long form allows for
   special packets - such as the Version Negotiation packet - to be
   represented in this uniform fixed-length packet format. A long
   header contains the following fields:

   Header Form: The most significant bit (0x80) of octet 0 (the first
      octet) is set to 1 for long headers.

   Long Packet Type: The remaining seven bits of octet 0 contain the
      packet type. This field can indicate one of 128 packet types.
      The types specified for this version are listed in Table 1.

   Connection ID: Octets 1 through 8 contain the connection ID.
      Section 5.6 describes the use of this field in more detail.

   Version: Octets 9 to 12 contain the selected protocol version. This
      field indicates which version of QUIC is in use and determines how
      the rest of the protocol fields are interpreted.

   Packet Number: Octets 13 to 16 contain the packet number.
      Section 5.7 describes the use of packet numbers.

   Payload: Octets from 17 onwards (the rest of QUIC packet) are the
      payload of the packet.
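
   The octet offsets listed above can be sketched as a fixed 17-octet
   unpack. The Python below is a minimal illustration, not a reference
   parser: the names LongHeader and parse_long_header are hypothetical,
   and the sketch does not special-case the Version Negotiation packet
   (Section 5.3), which shares the long header form but carries a list
   of versions where the packet number would be.

```python
import struct
from typing import NamedTuple


class LongHeader(NamedTuple):
    packet_type: int    # low seven bits of octet 0
    connection_id: int  # octets 1 through 8
    version: int        # octets 9 to 12
    packet_number: int  # octets 13 to 16
    payload: bytes      # octets from 17 onwards


LONG_HEADER_LEN = 17  # 1 + 8 + 4 + 4 octets


def parse_long_header(packet: bytes) -> LongHeader:
    """Split a UDP payload into the long header fields of Figure 1."""
    if len(packet) < LONG_HEADER_LEN:
        raise ValueError("packet too short for a long header")
    # "!" selects network byte order (big-endian) with no padding.
    first, connection_id, version, packet_number = struct.unpack_from(
        "!BQII", packet
    )
    if not first & 0x80:
        raise ValueError("Header Form bit is 0: not a long header")
    return LongHeader(
        packet_type=first & 0x7F,
        connection_id=connection_id,
        version=version,
        packet_number=packet_number,
        payload=packet[LONG_HEADER_LEN:],
    )
```

   Interpreting the type and payload remains version-specific; the
   sketch only separates the fields.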

   The following packet types are defined:

   +------+-----------------+---------------+
   | Type | Name            | Section       |
   +------+-----------------+---------------+
   | 0x7F | Initial         | Section 5.4.1 |
   |      |                 |               |
   | 0x7E | Retry           | Section 5.4.2 |
   |      |                 |               |
   | 0x7D | Handshake       | Section 5.4.3 |
   |      |                 |               |
   | 0x7C | 0-RTT Protected | Section 5.5   |
   +------+-----------------+---------------+

                   Table 1: Long Header Packet Types

   The header form, packet type, connection ID, packet number and
   version fields of a long header packet are version-independent. The
   types of packets defined in Table 1 are version-specific. See
   Section 5.8 for details on how packets from different versions of
   QUIC are interpreted.

   The interpretation of the fields and the payload are specific to a
   version and packet type. Type-specific semantics for this version
   are described in the following sections.

5.2. Short Header

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+
   |0|C|K| Type (5)|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                      [Connection ID (64)]                     +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Packet Number (8/16/32)                   ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Protected Payload (*)                    ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                     Figure 2: Short Header Format

   The short header can be used after the version and 1-RTT keys are
   negotiated. This header form has the following fields:

   Header Form: The most significant bit (0x80) of octet 0 is set to 0
      for the short header.

   Omit Connection ID Flag: The second bit (0x40) of octet 0 indicates
      whether the Connection ID field is omitted. If set to 0, then the
      Connection ID field is present; if set to 1, the Connection ID
      field is omitted.
      The Connection ID field can only be omitted if the
      omit_connection_id transport parameter (Section 7.4.1) is
      specified by the intended recipient of the packet.

   Key Phase Bit: The third bit (0x20) of octet 0 indicates the key
      phase, which allows a recipient of a packet to identify the packet
      protection keys that are used to protect the packet. See
      [QUIC-TLS] for details.

   Short Packet Type: The remaining 5 bits of octet 0 include one of 32
      packet types. Table 2 lists the types that are defined for short
      packets.

   Connection ID: If the Omit Connection ID Flag is not set, a
      connection ID occupies octets 1 through 8 of the packet. See
      Section 5.6 for more details.

   Packet Number: The length of the packet number field depends on the
      packet type. This field can be 1, 2 or 4 octets long depending on
      the short packet type.

   Protected Payload: Packets with a short header always include a
      1-RTT protected payload.

   The packet type in a short header currently determines only the size
   of the packet number field. Additional types can be used to signal
   the presence of other fields.

   +------+--------------------+
   | Type | Packet Number Size |
   +------+--------------------+
   | 0x1F | 1 octet            |
   |      |                    |
   | 0x1E | 2 octets           |
   |      |                    |
   | 0x1D | 4 octets           |
   +------+--------------------+

                  Table 2: Short Header Packet Types

   The header form, omit connection ID flag, and connection ID of a
   short header packet are version-independent. The remaining fields
   are specific to the selected QUIC version. See Section 5.8 for
   details on how packets from different versions of QUIC are
   interpreted.

5.3. Version Negotiation Packet

   A Version Negotiation packet is inherently not version-specific, and
   does not use the packet headers defined above.
   Upon receipt by a client, it will appear to be a packet using the
   long header, but will be identified as a Version Negotiation packet
   based on the Version field.

   The Version Negotiation packet is a response to a client packet that
   contains a version that is not supported by the server, and is only
   sent by servers.

   The layout of a Version Negotiation packet is:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+
   |1|  Unused (7) |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                       Connection ID (64)                      +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Version (32)                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Supported Version 1 (32)                   ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 [Supported Version 2 (32)]                  ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                  ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 [Supported Version N (32)]                  ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 3: Version Negotiation Packet

   The value in the Unused field is selected randomly by the server.
   The Connection ID field echoes the corresponding value from the
   triggering client packet. This allows clients some assurance that
   the server received the packet and that the Version Negotiation
   packet is in fact from the server. The Version field MUST be set to
   0x00000000. The remainder of the Version Negotiation packet is a
   list of 32-bit versions which the server supports.

   A Version Negotiation packet cannot be explicitly acknowledged in an
   ACK frame by a client. Receiving another Initial packet implicitly
   acknowledges a Version Negotiation packet.

   See Section 7.2 for a description of the version negotiation process.

5.4. Cryptographic Handshake Packets

   Once version negotiation is complete, the cryptographic handshake is
   used to agree on cryptographic keys. The cryptographic handshake is
   carried in Initial (Section 5.4.1), Retry (Section 5.4.2) and
   Handshake (Section 5.4.3) packets.

   All these packets use the long header and contain the current QUIC
   version in the version field.

   In order to prevent tampering by version-unaware middleboxes,
   handshake packets are protected with a connection- and
   version-specific key, as described in [QUIC-TLS]. This protection
   does not provide confidentiality or integrity against on-path
   attackers, but provides some level of protection against off-path
   attackers.

5.4.1. Initial Packet

   The Initial packet uses long headers with a type value of 0x7F. It
   carries the first cryptographic handshake message sent by the client.

   The client populates the connection ID field with randomly selected
   values, unless it has received a packet from the server. If the
   client has received a packet from the server, the connection ID field
   uses the value provided by the server.

   The first Initial packet that is sent by a client contains a
   randomized packet number. All subsequent packets contain a packet
   number that is incremented by one (see Section 5.7).

   The payload of an Initial packet consists of a STREAM frame (or
   frames) for stream 0 containing a cryptographic handshake message,
   with enough PADDING frames that the packet is at least 1200 octets
   (see Section 9). The stream in this packet always starts at an
   offset of 0 (see Section 7.5) and the complete cryptographic
   handshake message MUST fit in a single packet (see Section 7.3).

   The client uses the Initial packet type for any packet that contains
   an initial cryptographic handshake message.
This includes all cases
663    where a new packet containing the initial cryptographic message needs
664    to be created, such as the packets sent after receiving a
665    Version Negotiation (Section 5.3) or Retry (Section 5.4.2) packet.

667 5.4.2.  Retry Packet

669    A Retry packet uses long headers with a type value of 0x7D. It
670    carries cryptographic handshake messages and acknowledgments. It is
671    used by a server that wishes to perform a stateless retry (see
672    Section 7.5).

674    The packet number and connection ID fields echo the corresponding
675    fields from the triggering client packet. This allows a client to
676    verify that the server received its packet.

678    A Retry packet is never explicitly acknowledged in an ACK frame by a
679    client. Receiving another Initial packet implicitly acknowledges a
680    Retry packet.

682    After receiving a Retry packet, the client uses a new Initial packet
683    containing the next cryptographic handshake message. The client
684    retains the state of its cryptographic handshake, but discards all
685    transport state. The Initial packet that is generated in response to
686    a Retry packet includes STREAM frames on stream 0 that start again at
687    an offset of 0.

689    Continuing the cryptographic handshake is necessary to ensure that an
690    attacker cannot force a downgrade of any cryptographic parameters.
691    In addition to continuing the cryptographic handshake, the client
692    MUST remember the results of any version negotiation that occurred
693    (see Section 7.2). The client MAY also retain any observed RTT or
694    congestion state that it has accumulated for the flow, but other
695    transport state MUST be discarded.

697    The payload of the Retry packet contains a single STREAM frame on
698    stream 0 with offset 0 containing the server's cryptographic
699    stateless retry material. It MUST NOT contain any other frames. The
700    next STREAM frame sent by the server will also start at stream offset
701    0.

703 5.4.3.
Handshake Packet 705 A Handshake packet uses long headers with a type value of 0x7C. It 706 is used to carry acknowledgments and cryptographic handshake messages 707 from the server and client. 709 The connection ID field in a Handshake packet contains a connection 710 ID that is chosen by the server (see Section 5.6). 712 The first Handshake packet sent by a server contains a randomized 713 packet number. This value is increased for each subsequent packet 714 sent by the server as described in Section 5.7. The client 715 increments the packet number from its previous packet by one for each 716 Handshake packet that it sends (which might be an Initial, 0-RTT 717 Protected, or Handshake packet). 719 The payload of this packet contains STREAM frames and could contain 720 PADDING and ACK frames. 722 5.5. Protected Packets 724 Packets that are protected with 0-RTT keys are sent with long 725 headers; all packets protected with 1-RTT keys are sent with short 726 headers. The different packet types explicitly indicate the 727 encryption level and therefore the keys that are used to remove 728 packet protection. 730 Packets protected with 0-RTT keys use a type value of 0x7B. The 731 connection ID field for a 0-RTT packet is selected by the client. 733 The client can send 0-RTT packets after receiving a Handshake packet 734 (Section 5.4.3), if that packet does not complete the handshake. 735 Even if the client receives a different connection ID in the 736 Handshake packet, it MUST continue to use the connection ID selected 737 by the client for 0-RTT packets, see Section 5.6. 739 The version field for protected packets is the current QUIC version. 741 The packet number field contains a packet number, which increases 742 with each packet sent, see Section 5.7 for details. 744 The payload is protected using authenticated encryption. [QUIC-TLS] 745 describes packet protection in detail. 
After decryption, the
746    plaintext consists of a sequence of frames, as described in
747    Section 6.

749 5.6.  Connection ID

751    QUIC connections are identified by their 64-bit Connection ID. All
752    long headers contain a Connection ID. Short headers indicate the
753    presence of a Connection ID using the Omit Connection ID flag. When
754    present, the Connection ID is in the same location in all packet
755    headers, making it straightforward for middleboxes, such as load
756    balancers, to locate and use it.

758    The client MUST choose a random connection ID and use it in Initial
759    packets (Section 5.4.1) and 0-RTT packets (Section 5.5).

761    When the server receives an Initial packet and decides to proceed with
762    the handshake, it chooses a new value for the connection ID and sends
763    that in a Handshake packet (Section 5.4.3). The server MAY choose to
764    use the value that the client initially selects.

766    Once the client receives the connection ID that the server has
767    chosen, it MUST use it for all subsequent Handshake (Section 5.4.3)
768    and 1-RTT (Section 5.5) packets but not for 0-RTT packets
769    (Section 5.5).

771    A server's Version Negotiation (Section 5.3) and Retry (Section 5.4.2)
772    packets MUST use the connection ID selected by the client.

774 5.7.  Packet Numbers

776    The packet number is an integer in the range 0 to 2^62-1. The value
777    is used in determining the cryptographic nonce for packet encryption.
778    Each endpoint maintains a separate packet number for sending and
779    receiving. The packet number for sending MUST increase by at least
780    one after sending any packet, unless otherwise specified (see
781    Section 5.7.1).

783    A QUIC endpoint MUST NOT reuse a packet number within the same
784    connection (that is, under the same cryptographic keys).
If the
785    packet number for sending reaches 2^62 - 1, the sender MUST close the
786    connection without sending a CONNECTION_CLOSE frame or any further
787    packets; a server MAY send a Stateless Reset (Section 7.9.4) in
788    response to further packets that it receives.

790    For the packet header, the number of bits required to represent the
791    packet number is reduced by including only the least significant
792    bits of the packet number. The actual packet number for each packet
793    is reconstructed at the receiver based on the largest packet number
794    received on a successfully authenticated packet.

796    A packet number is decoded by finding the packet number value that is
797    closest to the next expected packet. The next expected packet is the
798    highest received packet number plus one. For example, if the highest
799    successfully authenticated packet had a packet number of 0xaa82f30e,
800    then a packet containing a 16-bit value of 0x1f94 will be decoded as
801    0xaa831f94.

803    The sender MUST use a packet number size able to represent a range
804    more than twice as large as the difference between the largest
805    acknowledged packet and the packet number being sent. A peer receiving
806    the packet will then correctly decode the packet number, unless the
807    packet is delayed in transit such that it arrives after many
808    higher-numbered packets have been received. An endpoint SHOULD use a
809    large enough packet number encoding to allow the packet number to be
810    recovered even if the packet arrives after packets that are sent
811    afterwards.

813    As a result, the size of the packet number encoding is at least one
814    more than the base 2 logarithm of the number of contiguous
815    unacknowledged packet numbers, including the new packet.
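
The decoding and sizing rules in this section can be sketched in
Python as follows. This is an illustrative sketch only, not part of
the protocol definition: the function names are hypothetical, and the
set of candidate encoding sizes (8, 16 and 32 bits) is an assumption
made for the example.

```python
MAX_PACKET_NUMBER = (1 << 62) - 1  # upper bound stated in Section 5.7


def decode_packet_number(largest_received: int, truncated: int, bits: int) -> int:
    """Return the packet number closest to largest_received + 1 whose
    least significant `bits` bits equal `truncated`."""
    expected = largest_received + 1
    window = 1 << bits
    half = window // 2
    # Start from the value that shares its high-order bits with `expected`.
    candidate = (expected & ~(window - 1)) | truncated
    # Move one window up or down if that lands closer to `expected`.
    if candidate <= expected - half and candidate + window <= MAX_PACKET_NUMBER:
        return candidate + window
    if candidate > expected + half and candidate >= window:
        return candidate - window
    return candidate


def min_encoding_bits(largest_acked: int, packet_number: int,
                      sizes=(8, 16, 32)) -> int:
    """Smallest size (from the assumed `sizes`) that represents more than
    twice the distance from the largest acknowledged packet."""
    span = packet_number - largest_acked
    for bits in sizes:
        if (1 << bits) > 2 * span:
            return bits
    raise ValueError("no assumed encoding size covers this range")
```

With the values used in this section, decode_packet_number(0xaa82f30e,
0x1f94, 16) yields 0xaa831f94.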
817 For example, if an endpoint has received an acknowledgment for packet 818 0x6afa2f, sending a packet with a number of 0x6b4264 requires a 819 16-bit or larger packet number encoding; whereas a 32-bit packet 820 number is needed to send a packet with a number of 0x6bc107. 822 Version Negotiation (Section 5.3) and Retry (Section 5.4.2) packets 823 have special rules for populating the packet number field. 825 5.7.1. Initial Packet Number 827 The initial value for packet number MUST be selected randomly from a 828 range between 0 and 2^32 - 1025 (inclusive). This value is selected 829 so that Initial and Handshake packets exercise as many possible 830 values for the Packet Number field as possible. 832 Limiting the range allows both for loss of packets and for any 833 stateless exchanges. Packet numbers are incremented for subsequent 834 packets, but packet loss and stateless handling can both mean that 835 the first packet sent by an endpoint isn't necessarily the first 836 packet received by its peer. The first packet received by a peer 837 cannot be 2^32 or greater or the recipient will incorrectly assume a 838 packet number that is 2^32 values lower and discard the packet. 840 Use of a secure random number generator [RFC4086] is not necessary 841 for generating the initial packet number, nor is it necessary that 842 the value be uniformly distributed. 844 5.8. Handling Packets from Different Versions 846 Between different versions the following things are guaranteed to 847 remain constant: 849 o the location of the header form flag, 851 o the location of the Omit Connection ID flag in short headers, 853 o the location and size of the Connection ID field in both header 854 forms, 856 o the location and size of the Version field in long headers, 857 o the format and semantics of the Version Negotiation packet. 859 Implementations MUST assume that an unsupported version uses an 860 unknown packet format. 
All other fields MUST be ignored when
861    processing a packet that contains an unsupported version.

863 6.  Frames and Frame Types

865    The payload of all packets, after removing packet protection,
866    consists of a sequence of frames, as shown in Figure 4. Version
867    Negotiation and Stateless Reset do not contain frames.

869     0                   1                   2                   3
870     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
871    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
872    |                         Frame 1 (*)                         ...
873    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
874    |                         Frame 2 (*)                         ...
875    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
876                                   ...
877    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
878    |                         Frame N (*)                         ...
879    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

881                 Figure 4: Contents of Protected Payload

883    Protected payloads MUST contain at least one frame, and MAY contain
884    multiple frames and multiple frame types.

886    Frames MUST fit within a single QUIC packet and MUST NOT span a QUIC
887    packet boundary. Each frame begins with a Frame Type byte,
888    indicating its type, followed by additional type-dependent fields:

890     0                   1                   2                   3
891     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
892    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
893    |   Type (8)    |          Type-Dependent Fields (*)          ...
894    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

896                     Figure 5: Generic Frame Layout

898    Frame types are listed in Table 3. Note that the Frame Type byte in
899    STREAM and ACK frames is used to carry other frame-specific flags.
900    For all other frames, the Frame Type byte simply identifies the
901    frame. These frames are explained in more detail as they are
902    referenced later in the document.

904    +-------------+-------------------+--------------+
905    | Type Value  | Frame Type Name   | Definition   |
906    +-------------+-------------------+--------------+
907    | 0x00        | PADDING           | Section 8.2  |
908    |             |                   |              |
909    | 0x01        | RST_STREAM        | Section 8.3  |
910    |             |                   |              |
911    | 0x02        | CONNECTION_CLOSE  | Section 8.4  |
912    |             |                   |              |
913    | 0x03        | APPLICATION_CLOSE | Section 8.5  |
914    |             |                   |              |
915    | 0x04        | MAX_DATA          | Section 8.6  |
916    |             |                   |              |
917    | 0x05        | MAX_STREAM_DATA   | Section 8.7  |
918    |             |                   |              |
919    | 0x06        | MAX_STREAM_ID     | Section 8.8  |
920    |             |                   |              |
921    | 0x07        | PING              | Section 8.9  |
922    |             |                   |              |
923    | 0x08        | BLOCKED           | Section 8.10 |
924    |             |                   |              |
925    | 0x09        | STREAM_BLOCKED    | Section 8.11 |
926    |             |                   |              |
927    | 0x0a        | STREAM_ID_BLOCKED | Section 8.12 |
928    |             |                   |              |
929    | 0x0b        | NEW_CONNECTION_ID | Section 8.13 |
930    |             |                   |              |
931    | 0x0c        | STOP_SENDING      | Section 8.14 |
932    |             |                   |              |
933    | 0x0d        | PONG              | Section 8.15 |
934    |             |                   |              |
935    | 0x0e        | ACK               | Section 8.16 |
936    |             |                   |              |
937    | 0x10 - 0x17 | STREAM            | Section 8.17 |
938    +-------------+-------------------+--------------+

940                   Table 3: Frame Types

942 7.  Life of a Connection

944    A QUIC connection is a single conversation between two QUIC
945    endpoints. QUIC's connection establishment intertwines version
946    negotiation with the cryptographic and transport handshakes to reduce
947    connection establishment latency, as described in Section 7.3. Once
948    established, a connection may migrate to a different IP or port at
949    either endpoint, due to NAT rebinding or mobility, as described in
950    Section 7.7. Finally a connection may be terminated by either
951    endpoint, as described in Section 7.9.

953 7.1.  Matching Packets to Connections

955    Incoming packets are classified on receipt. Packets can either be
956    associated with an existing connection, be discarded, or - for
957    servers - potentially create a new connection.

959    Packets that can be associated with an existing connection are
960    handled according to the current state of that connection.
Packets 961 are associated with existing connections using connection ID if it is 962 present; this might include connection IDs that were advertised using 963 NEW_CONNECTION_ID (Section 8.13). Packets without connection IDs and 964 long-form packets for connections that have incomplete cryptographic 965 handshakes are associated with an existing connection using the tuple 966 of source and destination IP addresses and ports. 968 A packet that uses the short header could be associated with an 969 existing connection with an incomplete cryptographic handshake. Such 970 a packet could be a valid packet that has been reordered with respect 971 to the long-form packets that will complete the cryptographic 972 handshake. This might happen after the final set of cryptographic 973 handshake messages from either peer. These packets are expected to 974 be correlated with a connection using the tuple of IP addresses and 975 ports. Packets that might be reordered in this fashion SHOULD be 976 buffered in anticipation of the handshake completing. 978 0-RTT packets might be received prior to a Client Initial packet at a 979 server. If the version of these packets is acceptable to the server, 980 it MAY buffer these packets in anticipation of receiving a reordered 981 Client Initial packet. 983 Buffering ensures that data is not lost, which improves performance; 984 conversely, discarding these packets could create false loss signals 985 for the congestion controllers. However, limiting the number and 986 size of buffered packets might be needed to prevent exposure to 987 denial of service. 989 For clients, any packet that cannot be associated with an existing 990 connection SHOULD be discarded if it is not buffered. Discarded 991 packets MAY be logged for diagnostic or security purposes. 993 For servers, packets that aren't associated with a connection 994 potentially create a new connection. 
However, only packets that use 995 the long packet header and that are at least the minimum size defined 996 for the protocol version can be initial packets. A server MAY 997 discard packets with a short header or packets that are smaller than 998 the smallest minimum size for any version that the server supports. 999 A server that discards a packet that cannot be associated with a 1000 connection MAY also generate a stateless reset (Section 7.9.4). 1002 This version of QUIC defines a minimum size for initial packets of 1003 1200 octets (see Section 9). Versions of QUIC that define smaller 1004 minimum initial packet sizes need to be aware that initial packets 1005 will be discarded without action by servers that only support 1006 versions with larger minimums. Clients that support multiple QUIC 1007 versions can avoid this problem by ensuring that they increase the 1008 size of their initial packets to the largest minimum size across all 1009 of the QUIC versions they support. Servers need to recognize initial 1010 packets that are the minimum size of all QUIC versions they support. 1012 7.2. Version Negotiation 1014 QUIC's connection establishment begins with version negotiation, 1015 since all communication between the endpoints, including packet and 1016 frame formats, relies on the two endpoints agreeing on a version. 1018 A QUIC connection begins with a client sending an Initial packet 1019 (Section 5.4.1). The details of the handshake mechanisms are 1020 described in Section 7.3, but any Initial packet sent from the client 1021 to the server MUST use the long header format - which includes the 1022 version of the protocol being used - and they MUST be padded to at 1023 least 1200 octets. 1025 The server receives this packet and determines whether it potentially 1026 creates a new connection (see Section 7.1). If the packet might 1027 generate a new connection, the server then checks whether it 1028 understands the version that the client has selected. 
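
The header-form and size checks in Section 7.1 can be illustrated
with a short Python sketch. Only the 1200-octet minimum for this
version comes from the text; the per-version table, the second
version number and its 1350-octet minimum are hypothetical values
chosen to show the effect of supporting several versions. The header
form flag is taken to be the most significant bit of the first octet,
as drawn in Figure 3.

```python
# Minimum initial packet size per version. Only the first entry comes
# from this document; the second is a made-up version for illustration.
MIN_INITIAL_SIZE = {
    0x00000001: 1200,
    0x00000002: 1350,  # hypothetical
}


def client_padding_target(client_versions) -> int:
    """A multi-version client pads to the largest minimum it supports."""
    return max(MIN_INITIAL_SIZE[v] for v in client_versions)


def could_create_connection(datagram: bytes, server_versions) -> bool:
    """Whether a packet that matches no connection may start a new one.

    Short-header packets, and packets smaller than the smallest minimum
    of any version the server supports, are candidates for discard.
    """
    if not datagram or not (datagram[0] & 0x80):
        return False  # empty, or short header
    smallest = min(MIN_INITIAL_SIZE[v] for v in server_versions)
    return len(datagram) >= smallest
```

A packet that passes this check still has its version examined, as
described in Section 7.2.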
1030 If the packet contains a version that is acceptable to the server, 1031 the server proceeds with the handshake (Section 7.3). This commits 1032 the server to the version that the client selected. 1034 7.2.1. Sending Version Negotiation Packets 1036 If the version selected by the client is not acceptable to the 1037 server, the server responds with a Version Negotiation packet 1038 (Section 5.3). This includes a list of versions that the server will 1039 accept. 1041 A server sends a Version Negotiation packet for any packet with an 1042 unacceptable version if that packet could create a new connection. 1043 This allows a server to process packets with unsupported versions 1044 without retaining state. Though either the Client Initial packet or 1045 the version negotiation packet that is sent in response could be 1046 lost, the client will send new packets until it successfully receives 1047 a response or it abandons the connection attempt. 1049 7.2.2. Handling Version Negotiation Packets 1051 When the client receives a Version Negotiation packet, it first 1052 checks that the connection ID matches the connection ID the client 1053 sent. If this check fails, the packet MUST be discarded. 1055 Once the Version Negotiation packet is determined to be valid, the 1056 client then selects an acceptable protocol version from the list 1057 provided by the server. The client then attempts to create a 1058 connection using that version. Though the contents of the Client 1059 Initial packet the client sends might not change in response to 1060 version negotiation, a client MUST increase the packet number it uses 1061 on every packet it sends. Packets MUST continue to use long headers 1062 and MUST include the new negotiated protocol version. 1064 The client MUST use the long header format and include its selected 1065 version on all packets until it has 1-RTT keys and it has received a 1066 packet from the server which is not a Version Negotiation packet. 
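
A minimal Python sketch of both sides of this exchange follows,
assuming the Figure 3 layout (one flags octet, a 64-bit connection
ID, a 32-bit Version field of zero, then 32-bit versions). The
function names are hypothetical. The client side also applies the
rule from Section 7.2.2 that a packet listing the client's chosen
version is ignored.

```python
import struct

# Flags octet, 64-bit connection ID, 32-bit Version field (network order).
HEADER = struct.Struct("!BQI")


def build_version_negotiation(connection_id, supported, unused=0):
    """Server side: echo the client's connection ID and list versions.
    `unused` stands for the randomly selected 7-bit Unused field."""
    packet = HEADER.pack(0x80 | (unused & 0x7F), connection_id, 0)
    return packet + b"".join(struct.pack("!I", v) for v in supported)


def handle_version_negotiation(packet, sent_connection_id,
                               current_version, client_versions):
    """Client side: return the version to retry with, or None if the
    packet is discarded or offers nothing the client supports."""
    body_len = len(packet) - HEADER.size
    if body_len < 4 or body_len % 4:
        return None  # malformed: no complete version list
    flags, connection_id, version = HEADER.unpack_from(packet)
    if not (flags & 0x80) or version != 0:
        return None  # not a Version Negotiation packet
    if connection_id != sent_connection_id:
        return None  # MUST be discarded
    offered = [v for (v,) in struct.iter_unpack("!I", packet[HEADER.size:])]
    if current_version in offered:
        return None  # lists the version already in use: ignore
    for candidate in client_versions:  # client preference order
        if candidate in offered:
            return candidate
    return None
```

The caller is expected to keep using long headers, and to keep
increasing its packet number, when it retries with the returned
version.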
1068 A client MUST NOT change the version it uses unless it is in response 1069 to a Version Negotiation packet from the server. Once a client 1070 receives a packet from the server which is not a Version Negotiation 1071 packet, it MUST discard other Version Negotiation packets on the same 1072 connection. Similarly, a client MUST ignore a Version Negotiation 1073 packet if it has already received and acted on a Version Negotiation 1074 packet. 1076 A client MUST ignore a Version Negotiation packet that lists the 1077 client's chosen version. 1079 Version negotiation packets have no cryptographic protection. The 1080 result of the negotiation MUST be revalidated as part of the 1081 cryptographic handshake (see Section 7.4.4). 1083 7.2.3. Using Reserved Versions 1085 For a server to use a new version in the future, clients must 1086 correctly handle unsupported versions. To help ensure this, a server 1087 SHOULD include a reserved version (see Section 4) while generating a 1088 Version Negotiation packet. 1090 The design of version negotiation permits a server to avoid 1091 maintaining state for packets that it rejects in this fashion. The 1092 validation of version negotiation (see Section 7.4.4) only validates 1093 the result of version negotiation, which is the same no matter which 1094 reserved version was sent. A server MAY therefore send different 1095 reserved version numbers in the Version Negotiation Packet and in its 1096 transport parameters. 1098 A client MAY send a packet using a reserved version number. This can 1099 be used to solicit a list of supported versions from a server. 1101 7.3. Cryptographic and Transport Handshake 1103 QUIC relies on a combined cryptographic and transport handshake to 1104 minimize connection establishment latency. QUIC allocates stream 0 1105 for the cryptographic handshake. 
Version 0x00000001 of QUIC uses TLS
1106    1.3 as described in [QUIC-TLS]; a different QUIC version number could
1107    indicate that a different cryptographic handshake protocol is in use.

1109    QUIC provides this stream with reliable, ordered delivery of data.
1110    In return, the cryptographic handshake provides QUIC with:

1112    o  authenticated key exchange, where

1114       *  a server is always authenticated,

1116       *  a client is optionally authenticated,

1118       *  every connection produces distinct and unrelated keys,

1120       *  keying material is usable for packet protection for both 0-RTT
1121          and 1-RTT packets, and

1123       *  1-RTT keys have forward secrecy

1125    o  authenticated values for the transport parameters of the peer (see
1126       Section 7.4)

1128    o  authenticated confirmation of version negotiation (see
1129       Section 7.4.4)

1131    o  authenticated negotiation of an application protocol (TLS uses
1132       ALPN [RFC7301] for this purpose)

1134    o  for the server, the ability to carry data that provides assurance
1135       that the client can receive packets that are addressed with the
1136       transport address that is claimed by the client (see Section 7.6)

1138    The initial cryptographic handshake message MUST be sent in a single
1139    packet. Any second attempt that is triggered by address validation
1140    MUST also be sent within a single packet. This avoids having to
1141    reassemble a message from multiple packets. Reassembling messages
1142    requires that a server maintain state prior to establishing a
1143    connection, exposing the server to a denial of service risk.

1145    The first client packet of the cryptographic handshake protocol MUST
1146    fit within a 1232 octet QUIC packet payload. This includes overheads
1147    that reduce the space available to the cryptographic handshake
1148    protocol.

1150    Details of how TLS is integrated with QUIC are provided in
1151    [QUIC-TLS].

1153 7.4.
Transport Parameters

1155    During connection establishment, both endpoints make authenticated
1156    declarations of their transport parameters. These declarations are
1157    made unilaterally by each endpoint. Endpoints are required to comply
1158    with the restrictions implied by these parameters; the description of
1159    each parameter includes rules for its handling.

1161    The format of the transport parameters is the TransportParameters
1162    struct from Figure 6. This is described using the presentation
1163    language from Section 3 of [I-D.ietf-tls-tls13].

1165       uint32 QuicVersion;

1167       enum {
1168          initial_max_stream_data(0),
1169          initial_max_data(1),
1170          initial_max_stream_id_bidi(2),
1171          idle_timeout(3),
1172          omit_connection_id(4),
1173          max_packet_size(5),
1174          stateless_reset_token(6),
1175          ack_delay_exponent(7),
1176          initial_max_stream_id_uni(8),
1177          (65535)
1178       } TransportParameterId;

1180       struct {
1181          TransportParameterId parameter;
1182          opaque value<0..2^16-1>;
1183       } TransportParameter;

1185       struct {
1186          select (Handshake.msg_type) {
1187             case client_hello:
1188                QuicVersion initial_version;

1190             case encrypted_extensions:
1191                QuicVersion negotiated_version;
1192                QuicVersion supported_versions<4..2^8-4>;

1194             case new_session_ticket:
1195                struct {};
1196          };
1197          TransportParameter parameters<30..2^16-1>;
1198       } TransportParameters;

1200               Figure 6: Definition of TransportParameters

1202    The "extension_data" field of the quic_transport_parameters extension
1203    defined in [QUIC-TLS] contains a TransportParameters value. TLS
1204    encoding rules are therefore used to encode the transport parameters.

1206    QUIC encodes transport parameters into a sequence of octets, which
1207    are then included in the cryptographic handshake. Once the handshake
1208    completes, the transport parameters declared by the peer are
1209    available. Each endpoint validates the value provided by its peer.
1210 In particular, version negotiation MUST be validated (see 1211 Section 7.4.4) before the connection establishment is considered 1212 properly complete. 1214 Definitions for each of the defined transport parameters are included 1215 in Section 7.4.1. Any given parameter MUST appear at most once in a 1216 given transport parameters extension. An endpoint MUST treat receipt 1217 of duplicate transport parameters as a connection error of type 1218 TRANSPORT_PARAMETER_ERROR. 1220 7.4.1. Transport Parameter Definitions 1222 An endpoint MUST include the following parameters in its encoded 1223 TransportParameters: 1225 initial_max_stream_data (0x0000): The initial stream maximum data 1226 parameter contains the initial value for the maximum data that can 1227 be sent on any newly created stream. This parameter is encoded as 1228 an unsigned 32-bit integer in units of octets. This is equivalent 1229 to an implicit MAX_STREAM_DATA frame (Section 8.7) being sent on 1230 all streams immediately after opening. 1232 initial_max_data (0x0001): The initial maximum data parameter 1233 contains the initial value for the maximum amount of data that can 1234 be sent on the connection. This parameter is encoded as an 1235 unsigned 32-bit integer in units of octets. This is equivalent to 1236 sending a MAX_DATA (Section 8.6) for the connection immediately 1237 after completing the handshake. 1239 idle_timeout (0x0003): The idle timeout is a value in seconds that 1240 is encoded as an unsigned 16-bit integer. The maximum value is 1241 600 seconds (10 minutes). 1243 A server MUST include the following transport parameters: 1245 stateless_reset_token (0x0006): The Stateless Reset Token is used in 1246 verifying a stateless reset, see Section 7.9.4. This parameter is 1247 a sequence of 16 octets. 1249 A client MUST NOT include a stateless reset token. 
A server MUST 1250 treat receipt of a stateless_reset_token transport parameter as a 1251 connection error of type TRANSPORT_PARAMETER_ERROR. 1253 An endpoint MAY use the following transport parameters: 1255 initial_max_stream_id_bidi (0x0002): The initial maximum stream ID 1256 parameter contains the initial maximum stream number the peer may 1257 initiate for bidirectional streams, encoded as an unsigned 32-bit 1258 integer. This value MUST be a valid bidirectional stream ID for a 1259 peer-initiated stream (that is, the two least significant bits are 1260 set to 0 by a server and to 1 by a client). If an invalid value 1261 is provided, the recipient MUST generate a connection error of 1262 type TRANSPORT_PARAMETER_ERROR. Setting this parameter is 1263 equivalent to sending a MAX_STREAM_ID (Section 8.8) immediately 1264 after completing the handshake. The maximum bidirectional stream 1265 ID is set to 0 if this parameter is absent, preventing the 1266 creation of new bidirectional streams until a MAX_STREAM_ID frame 1267 is sent. Note that a default value of 0 does not prevent the 1268 cryptographic handshake stream (that is, stream 0) from being 1269 used. 1271 initial_max_stream_id_uni (0x0008): The initial maximum stream ID 1272 parameter contains the initial maximum stream number the peer may 1273 initiate for unidirectional streams, encoded as an unsigned 32-bit 1274 integer. The value MUST be a valid unidirectional ID for the 1275 recipient (that is, the two least significant bits are set to 2 by 1276 a server and to 3 by a client). If an invalid value is provided, 1277 the recipient MUST generate a connection error of type 1278 TRANSPORT_PARAMETER_ERROR. Setting this parameter is equivalent 1279 to sending a MAX_STREAM_ID (Section 8.8) immediately after 1280 completing the handshake. 
The maximum unidirectional stream ID is 1281 set to 0 if this parameter is absent, preventing the creation of 1282 new unidirectional streams until a MAX_STREAM_ID frame is sent. 1284 omit_connection_id (0x0004): The omit connection identifier 1285 parameter indicates that packets sent to the endpoint that 1286 advertises this parameter MAY omit the connection ID in packets 1287 using short header format. This can be used by an endpoint where 1288 it knows that source and destination IP address and port are 1289 sufficient for it to identify a connection. This parameter is 1290 zero length. Absence of this parameter means that the connection 1291 ID MUST be present in every packet sent to this endpoint. 1293 max_packet_size (0x0005): The maximum packet size parameter places a 1294 limit on the size of packets that the endpoint is willing to 1295 receive, encoded as an unsigned 16-bit integer. This indicates 1296 that packets larger than this limit will be dropped. The default 1297 for this parameter is the maximum permitted UDP payload of 65527. 1298 Values below 1200 are invalid. This limit only applies to 1299 protected packets (Section 5.5). 1301 ack_delay_exponent (0x0007): An 8-bit unsigned integer value 1302 indicating an exponent used to decode the ACK Delay field in the 1303 ACK frame, see Section 8.16. If this value is absent, a default 1304 value of 3 is assumed (indicating a multiplier of 8). Values 1305 above 20 are invalid. 1307 7.4.2. Values of Transport Parameters for 0-RTT 1309 Transport parameters from the server MUST be remembered by the client 1310 for use with 0-RTT data. If the TLS NewSessionTicket message 1311 includes the quic_transport_parameters extension, then those values 1312 are used for the server values when establishing a new connection 1313 using that ticket. Otherwise, the transport parameters that the 1314 server advertises during connection establishment are used. 
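
The constraints that Section 7.4.1 places on received parameters can
be gathered into one check, sketched below in Python. The function
and exception names are hypothetical. The text names
TRANSPORT_PARAMETER_ERROR for duplicates, for a stateless reset token
sent by a client and for invalid stream ID values; using the same
error for an absent required parameter or an out-of-range value is an
assumption of this sketch.

```python
class TransportParameterError(Exception):
    """Stands in for a connection error of type TRANSPORT_PARAMETER_ERROR."""


# initial_max_stream_data, initial_max_data, idle_timeout
REQUIRED = {0x0000, 0x0001, 0x0003}
# Parameter ID -> required two least significant bits of the value,
# as (when sent by a server, when sent by a client).
STREAM_ID_BITS = {0x0002: (0, 1), 0x0008: (2, 3)}


def check_transport_parameters(params, from_server):
    """Validate a list of (parameter ID, value octets) pairs and return
    the ACK delay multiplier implied by ack_delay_exponent."""
    seen = {}
    for param_id, value in params:
        if param_id in seen:
            raise TransportParameterError("duplicate parameter")
        seen[param_id] = value
    if REQUIRED - set(seen):
        raise TransportParameterError("required parameter absent")  # assumed error type
    if from_server:
        if len(seen.get(0x0006, b"")) != 16:
            raise TransportParameterError("stateless_reset_token must be 16 octets")
    elif 0x0006 in seen:
        raise TransportParameterError("client sent stateless_reset_token")
    for param_id, (server_bits, client_bits) in STREAM_ID_BITS.items():
        if param_id in seen:
            wanted = server_bits if from_server else client_bits
            if (int.from_bytes(seen[param_id], "big") & 0x3) != wanted:
                raise TransportParameterError("invalid stream ID")
    if int.from_bytes(seen[0x0003], "big") > 600:
        raise TransportParameterError("idle_timeout above 600 seconds")  # assumed
    if 0x0005 in seen and int.from_bytes(seen[0x0005], "big") < 1200:
        raise TransportParameterError("max_packet_size below 1200")
    exponent = int.from_bytes(seen.get(0x0007, b"\x03"), "big")  # default 3
    if exponent > 20:
        raise TransportParameterError("ack_delay_exponent above 20")
    return 1 << exponent
```

When ack_delay_exponent is absent, the function returns 8, matching
the default multiplier given above.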
1316 A server can remember the transport parameters that it advertised, or 1317 store an integrity-protected copy of the values in the ticket and 1318 recover the information when accepting 0-RTT data. A server uses the 1319 transport parameters in determining whether to accept 0-RTT data. 1321 A server MAY accept 0-RTT and subsequently provide different values 1322 for transport parameters for use in the new connection. If 0-RTT 1323 data is accepted by the server, the server MUST NOT reduce any limits 1324 or alter any values that might be violated by the client with its 1325 0-RTT data. In particular, a server that accepts 0-RTT data MUST NOT 1326 set values for initial_max_data or initial_max_stream_data that are 1327 smaller than the remembered value of those parameters. Similarly, a 1328 server MUST NOT reduce the value of initial_max_stream_id_bidi or 1329 initial_max_stream_id_uni. 1331 Omitting or setting a zero value for certain transport parameters can 1332 result in 0-RTT data being enabled, but not usable. The following 1333 transport parameters SHOULD be set to non-zero values for 0-RTT: 1334 initial_max_stream_id_bidi, initial_max_stream_id_uni, 1335 initial_max_data, initial_max_stream_data. 1337 A server MUST reject 0-RTT data or even abort a handshake if the 1338 implied values for transport parameters cannot be supported. 1340 7.4.3. New Transport Parameters 1342 New transport parameters can be used to negotiate new protocol 1343 behavior. An endpoint MUST ignore transport parameters that it does 1344 not support. Absence of a transport parameter therefore disables any 1345 optional protocol feature that is negotiated using the parameter. 1347 New transport parameters can be registered according to the rules in 1348 Section 14.1. 1350 7.4.4. Version Negotiation Validation 1352 Though the cryptographic handshake has integrity protection, two 1353 forms of QUIC version downgrade are possible. 
In the first, an 1354 attacker replaces the QUIC version in the Initial packet. In the 1355 second, a fake Version Negotiation packet is sent by an attacker. To 1356 protect against these attacks, the transport parameters include three 1357 fields that encode version information. These parameters are used to 1358 retroactively authenticate the choice of version (see Section 7.2). 1360 The cryptographic handshake provides integrity protection for the 1361 negotiated version as part of the transport parameters (see 1362 Section 7.4). As a result, attacks on version negotiation by an 1363 attacker can be detected. 1365 The client includes the initial_version field in its transport 1366 parameters. The initial_version is the version that the client 1367 initially attempted to use. If the server did not send a version 1368 negotiation packet (Section 5.3), this will be identical to the 1369 negotiated_version field in the server transport parameters. 1371 A server that processes all packets in a stateful fashion can 1372 remember how version negotiation was performed and validate the 1373 initial_version value. 1375 A server that does not maintain state for every packet it receives 1376 (i.e., a stateless server) uses a different process. If the 1377 initial_version matches the version of QUIC that is in use, a 1378 stateless server can accept the value. 1380 If the initial_version is different from the version of QUIC that is 1381 in use, a stateless server MUST check that it would have sent a 1382 version negotiation packet if it had received a packet with the 1383 indicated initial_version. If a server would have accepted the 1384 version included in the initial_version and the value differs from 1385 the QUIC version that is in use, the server MUST terminate the 1386 connection with a VERSION_NEGOTIATION_ERROR error code. 1388 The server includes both the version of QUIC that is in use and a 1389 list of the QUIC versions that the server supports.
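The stateless server check on initial_version described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name, the exception class, and the version numbers in SUPPORTED_VERSIONS are hypothetical and do not come from this document.

```python
# Hypothetical set of versions this server would accept; the values are
# placeholders chosen for illustration only.
SUPPORTED_VERSIONS = frozenset({0xFF000007, 0xFF000008})


class VersionNegotiationError(Exception):
    """Stands in for closing with VERSION_NEGOTIATION_ERROR."""


def validate_initial_version(initial_version, version_in_use,
                             supported=SUPPORTED_VERSIONS):
    """Stateless validation of the client's initial_version field."""
    # Matching values can be accepted without any further checks.
    if initial_version == version_in_use:
        return
    # The values differ, so the server must have sent a version
    # negotiation packet for initial_version. That only happens when
    # the server does not support that version.
    if initial_version in supported:
        raise VersionNegotiationError(
            "server would have accepted initial_version 0x%08x"
            % initial_version)
```

A stateful server could instead compare initial_version against what it recorded when it sent the version negotiation packet.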
1391 The negotiated_version field is the version that is in use. This 1392 MUST be set by the server to the value that is on the Initial packet 1393 that it accepts (not an Initial packet that triggers a Retry or 1394 Version Negotiation packet). A client that receives a 1395 negotiated_version that does not match the version of QUIC that is in 1396 use MUST terminate the connection with a VERSION_NEGOTIATION_ERROR 1397 error code. 1399 The server includes a list of versions that it would send in any 1400 version negotiation packet (Section 5.3) in the supported_versions 1401 field. The server populates this field even if it did not send a 1402 version negotiation packet. This field is absent if the parameters 1403 are included in a NewSessionTicket message. 1405 The client validates that the negotiated_version is included in the 1406 supported_versions list and - if version negotiation was performed - 1407 that it would have selected the negotiated version. A client MUST 1408 terminate the connection with a VERSION_NEGOTIATION_ERROR error code 1409 if the current QUIC version is not listed in the supported_versions 1410 list. A client MUST terminate with a VERSION_NEGOTIATION_ERROR error 1411 code if version negotiation occurred but it would have selected a 1412 different version based on the value of the supported_versions list. 1414 When an endpoint accepts multiple QUIC versions, it can potentially 1415 interpret transport parameters as they are defined by any of the QUIC 1416 versions it supports. The version field in the QUIC packet header is 1417 authenticated using transport parameters. The position and the 1418 format of the version fields in transport parameters MUST either be 1419 identical across different QUIC versions, or be unambiguously 1420 different to ensure no confusion about their interpretation. One way 1421 that a new format could be introduced is to define a TLS extension 1422 with a different codepoint. 1424 7.5. 
Stateless Retries 1426 A server can process initial cryptographic handshake messages from 1427 a client without committing any state. This allows a server to 1428 perform address validation (Section 7.6), or to defer connection 1429 establishment costs. 1431 A server that generates a response to an initial packet without 1432 retaining connection state MUST use the Retry packet (Section 5.4.2). 1433 This packet causes a client to reset its transport state and to 1434 continue the connection attempt with new connection state while 1435 maintaining the state of the cryptographic handshake. 1437 A server MUST NOT send multiple Retry packets in response to a client 1438 handshake packet. Thus, any cryptographic handshake message that is 1439 sent MUST fit within a single packet. 1441 In TLS, the Retry packet type is used to carry the HelloRetryRequest 1442 message. 1444 7.6. Proof of Source Address Ownership 1446 Transport protocols commonly spend a round trip checking that a 1447 client owns the transport address (IP and port) that it claims. 1448 Verifying that a client can receive packets sent to its claimed 1449 transport address protects against spoofing of this information by 1450 malicious clients. 1452 This technique is used primarily to prevent QUIC from being used for 1453 traffic amplification attacks. In such an attack, a packet is sent to 1454 a server with spoofed source address information that identifies a 1455 victim. If a server generates more or larger packets in response to 1456 that packet, the attacker can use the server to send more data toward 1457 the victim than it would be able to send on its own. 1459 Several methods are used in QUIC to mitigate this attack. Firstly, 1460 the initial handshake packet is padded to at least 1200 octets. This 1461 allows a server to send a similar amount of data without risking 1462 causing an amplification attack toward an unproven remote address.
1464 A server eventually confirms that a client has received its messages 1465 when the cryptographic handshake successfully completes. This might 1466 be insufficient, either because the server wishes to avoid the 1467 computational cost of completing the handshake, or it might be that 1468 the size of the packets that are sent during the handshake is too 1469 large. This is especially important for 0-RTT, where the server 1470 might wish to provide application data traffic - such as a response 1471 to a request - in response to the data carried in the early data from 1472 the client. 1474 To send additional data prior to completing the cryptographic 1475 handshake, the server then needs to validate that the client owns the 1476 address that it claims. 1478 Source address validation is therefore performed during the 1479 establishment of a connection. TLS provides the tools that support 1480 the feature, but basic validation is performed by the core transport 1481 protocol. 1483 A different type of source address validation is performed after a 1484 connection migration, see Section 7.7.2. 1486 7.6.1. Client Address Validation Procedure 1488 QUIC uses token-based address validation. Any time the server wishes 1489 to validate a client address, it provides the client with a token. 1490 As long as the token cannot be easily guessed (see Section 7.6.3), if 1491 the client is able to return that token, it proves to the server that 1492 it received the token. 1494 During the processing of the cryptographic handshake messages from a 1495 client, TLS will request that QUIC make a decision about whether to 1496 proceed based on the information it has. TLS will provide QUIC with 1497 any token that was provided by the client. For an initial packet, 1498 QUIC can decide to abort the connection, allow it to proceed, or 1499 request address validation. 1501 If QUIC decides to request address validation, it provides the 1502 cryptographic handshake with a token. 
The contents of this token are 1503 consumed by the server that generates the token, so there is no need 1504 for a single well-defined format. A token could include information 1505 about the claimed client address (IP and port), a timestamp, and any 1506 other supplementary information the server will need to validate the 1507 token in the future. 1509 The cryptographic handshake is responsible for enacting validation by 1510 sending the address validation token to the client. A legitimate 1511 client will include a copy of the token when it attempts to continue 1512 the handshake. The cryptographic handshake extracts the token then 1513 asks QUIC a second time whether the token is acceptable. In 1514 response, QUIC can either abort the connection or permit it to 1515 proceed. 1517 A connection MAY be accepted without address validation - or with 1518 only limited validation - but a server SHOULD limit the data it sends 1519 toward an unvalidated address. Successful completion of the 1520 cryptographic handshake implicitly provides proof that the client has 1521 received packets from the server. 1523 7.6.2. Address Validation on Session Resumption 1525 A server MAY provide clients with an address validation token during 1526 one connection that can be used on a subsequent connection. Address 1527 validation is especially important with 0-RTT because a server 1528 potentially sends a significant amount of data to a client in 1529 response to 0-RTT data. 1531 A different type of token is needed when resuming. Unlike the token 1532 that is created during a handshake, there might be some time between 1533 when the token is created and when the token is subsequently used. 1534 Thus, a resumption token SHOULD include an expiration time. It is 1535 also unlikely that the client port number is the same on two 1536 different connections; validating the port is therefore unlikely to 1537 be successful. 
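As an illustration of the token contents discussed above, the following sketch builds a resumption-style token that binds the client IP address and an expiration time, and deliberately leaves out the port. The token format is not defined by this document; the layout, the use of HMAC-SHA-256 for integrity protection, and the names make_token and check_token are all assumptions made for this example.

```python
import hashlib
import hmac
import ipaddress
import struct

TAG_LEN = 32   # length of an HMAC-SHA-256 tag
HDR_LEN = 9    # 8-octet expiration time plus 1-octet address length


def make_token(key, client_ip, expires_at):
    """Build an integrity-protected token (hypothetical format)."""
    addr = ipaddress.ip_address(client_ip).packed
    body = struct.pack("!QB", expires_at, len(addr)) + addr
    tag = hmac.new(key, body, hashlib.sha256).digest()
    return body + tag


def check_token(key, token, client_ip, now):
    """Return True if the token is intact, unexpired, and matches the IP."""
    if len(token) < HDR_LEN + TAG_LEN:
        return False
    body, tag = token[:-TAG_LEN], token[-TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many octets matched.
    if not hmac.compare_digest(tag, expected):
        return False
    expires_at, addr_len = struct.unpack("!QB", body[:HDR_LEN])
    addr = body[HDR_LEN:]
    if len(addr) != addr_len:
        return False
    return now <= expires_at and addr == ipaddress.ip_address(client_ip).packed
```

Because only the server needs to verify the tag, the key never leaves the server and no per-client state has to be stored.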
1539 This token can be provided to the cryptographic handshake immediately 1540 after establishing a connection. QUIC might also generate an updated 1541 token if significant time passes or the client address changes for 1542 any reason (see Section 7.7). The cryptographic handshake is 1543 responsible for providing the client with the token. In TLS the 1544 token is included in the ticket that is used for resumption and 1545 0-RTT, which is carried in a NewSessionTicket message. 1547 7.6.3. Address Validation Token Integrity 1549 An address validation token MUST be difficult to guess. Including a 1550 large enough random value in the token would be sufficient, but this 1551 depends on the server remembering the value it sends to clients. 1553 A token-based scheme allows the server to offload any state 1554 associated with validation to the client. For this design to work, 1555 the token MUST be covered by integrity protection against 1556 modification or falsification by clients. Without integrity 1557 protection, malicious clients could generate or guess values for 1558 tokens that would be accepted by the server. Only the server 1559 requires access to the integrity protection key for tokens. 1561 In TLS the address validation token is often bundled with the 1562 information that TLS requires, such as the resumption secret. In 1563 this case, adding integrity protection can be delegated to the 1564 cryptographic handshake protocol, avoiding redundant protection. If 1565 integrity protection is delegated to the cryptographic handshake, an 1566 integrity failure will result in immediate cryptographic handshake 1567 failure. If integrity protection is performed by QUIC, QUIC MUST 1568 abort the connection if the integrity check fails with a 1569 PROTOCOL_VIOLATION error code. 1571 7.7. Connection Migration 1573 QUIC connections are identified by their 64-bit Connection ID. 
1574 QUIC's consistent connection ID allows connections to survive changes 1575 to the client's IP and/or port, such as those caused by client or 1576 server migrating to a new network. Connection migration allows a 1577 client to retain any shared state with a connection when they move 1578 networks. This includes state that can be hard to recover such as 1579 outstanding requests, which might otherwise be lost with no easy way 1580 to retry them. 1582 An endpoint that receives packets that contain a source IP address 1583 and port that has not yet been used can start sending new packets 1584 with those as a destination IP address and port. Packets exchanged 1585 between endpoints can then follow the new path. 1587 Due to variations in path latency or packet reordering, packets from 1588 different source addresses might be reordered. The packet with the 1589 highest packet number MUST be used to determine which path to use. 1590 Endpoints also need to be prepared to receive packets from an older 1591 source address. 1593 An endpoint MUST validate that its peer can receive packets at the 1594 new address before sending any significant quantity of data to that 1595 address, or it risks being used for denial of service. See 1596 Section 7.7.2 for details. 1598 7.7.1. Privacy Implications of Connection Migration 1600 Using a stable connection ID on multiple network paths allows a 1601 passive observer to correlate activity between those paths. A client 1602 that moves between networks might not wish to have their activity 1603 correlated by any entity other than a server. The NEW_CONNECTION_ID 1604 message can be sent by a server to provide an unlinkable connection 1605 ID for use in case the client wishes to explicitly break linkability 1606 between two points of network attachment. 1608 A client might need to send packets on multiple networks without 1609 receiving any response from the server. 
To ensure that the client is 1610 not linkable across each of these changes, a new connection ID and 1611 packet number gap are needed for each network. To support this, a 1612 server sends multiple NEW_CONNECTION_ID messages. Each 1613 NEW_CONNECTION_ID is marked with a sequence number. Connection IDs 1614 MUST be used in the order in which they are numbered. 1616 A client which wishes to break linkability upon changing networks 1617 MUST use the connection ID provided by the server as well as 1618 incrementing the packet sequence number by an externally 1619 unpredictable value computed as described in Section 7.7.1.1. Packet 1620 number gaps are cumulative. A client might skip connection IDs, but 1621 it MUST ensure that it applies the associated packet number gaps for 1622 connection IDs that it skips in addition to the packet number gap 1623 associated with the connection ID that it does use. 1625 A server that receives a packet that is marked with a new connection 1626 ID recovers the packet number by adding the cumulative packet number 1627 gap to its expected packet number. A server SHOULD discard packets 1628 that contain a smaller gap than it advertised. 1630 For instance, a server might provide a packet number gap of 7 1631 associated with a new connection ID. If the server received packet 1632 10 using the previous connection ID, it should expect packets on the 1633 new connection ID to start at 18. A packet with the new connection 1634 ID and a packet number of 17 is discarded as being in error. 1636 7.7.1.1. Packet Number Gap 1638 In order to avoid linkage, the packet number gap MUST be externally 1639 indistinguishable from random. 
The packet number gap for a 1640 connection ID with a given sequence number is computed by encoding the 1641 sequence number as a 32-bit integer in big-endian format, and then 1642 computing: 1644 Gap = HKDF-Expand-Label(packet_number_secret, 1645 "QUIC packet sequence gap", sequence, 4) 1647 The output of HKDF-Expand-Label is interpreted as a big-endian 1648 number. "packet_number_secret" is derived from the TLS key exchange, 1649 as described in Section 5.6 of [QUIC-TLS]. 1651 7.7.2. Address Validation for Migrated Connections 1653 An endpoint that observes a new remote IP address and 1654 port (or just a new remote port) on packets from its peer is likely 1655 seeing a connection migration at the peer. 1657 However, it is also possible that the peer is spoofing its source 1658 address in order to cause the endpoint to send excessive amounts of 1659 data to an unwilling host. If the endpoint sends significantly more 1660 data than the peer, connection migration might be used to amplify the 1661 volume of data that an attacker can generate toward a victim. 1663 Thus, when seeing a new remote transport address, an endpoint MUST 1664 verify that its peer can receive and respond to packets at that new 1665 address. By providing copies of the data that it receives, the peer 1666 proves that it is receiving packets at the new address and consents 1667 to receive data. 1669 Prior to validating the new remote address, an endpoint MUST limit 1670 the amount of data and packets that it sends to its peer. At a 1671 minimum, this needs to consider the possibility that packets are sent 1672 without congestion feedback. 1674 Once a connection is established, address validation is relatively 1675 simple (see Section 7.6 for the process that is used during the 1676 handshake). An endpoint validates a remote address by sending a PING 1677 frame containing a payload that is hard to guess. This frame MUST be 1678 sent in a packet that is sent to the new address.
Once a PONG frame 1679 containing the same payload is received, the address is considered to 1680 be valid. The PONG frame can use any path on its return. A PING 1681 frame containing 12 randomly generated [RFC4086] octets is sufficient 1682 to ensure that it is easier to receive the packet than it is to guess 1683 the value correctly. 1685 If the PING frame is determined to be lost, a new PING frame SHOULD 1686 be generated. This PING frame MUST include a new Data field that is 1687 similarly difficult to guess. 1689 If validation of the new remote address fails, after allowing enough 1690 time for possible loss and recovery of packets carrying PING and PONG 1691 frames, the endpoint MUST terminate the connection. When setting 1692 this timer, implementations are cautioned that the new path could 1693 have a longer round trip time than the original. The endpoint MUST 1694 NOT send a CONNECTION_CLOSE frame in this case; it has to assume that 1695 the remote peer does not want to receive any more packets. 1697 If the remote address is validated successfully, the endpoint MAY 1698 increase the rate that it sends on the new path using the state from 1699 the previous path. The capacity available on the new path might not 1700 be the same as the old path. An endpoint MUST NOT restore its send 1701 rate unless it is reasonably sure that the path is the same as the 1702 previous path. For instance, a change in only port number is likely 1703 indicative of a rebinding in a middlebox and not a complete change in 1704 path. This determination likely depends on heuristics, which could 1705 be imperfect; if the new path capacity is significantly reduced, 1706 ultimately this relies on the congestion controller responding to 1707 congestion signals and reducing send rates appropriately.
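The PING/PONG exchange used for address validation in this section can be sketched as the small bookkeeping class below. It is an illustration under stated assumptions: the class and method names are hypothetical, frame encoding and the validation timer are omitted, and addresses are represented as opaque hashable values.

```python
import hmac
import secrets

PING_PAYLOAD_LEN = 12  # 12 random octets, as suggested in the text above


class PathValidator:
    """Tracks outstanding PING payloads per remote address (hypothetical)."""

    def __init__(self):
        self.pending = {}       # address -> payload awaiting a PONG
        self.validated = set()  # addresses that have been validated

    def start(self, address):
        """Return a fresh payload to send in a PING to `address`."""
        payload = secrets.token_bytes(PING_PAYLOAD_LEN)
        self.pending[address] = payload
        return payload

    def on_ping_lost(self, address):
        """A lost PING is replaced by one carrying a new payload."""
        return self.start(address)

    def on_pong(self, payload):
        """Match a PONG by payload only; it may arrive on any path."""
        for address, expected in list(self.pending.items()):
            if hmac.compare_digest(expected, payload):
                del self.pending[address]
                self.validated.add(address)
                return address
        return None  # unknown or abandoned payload: ignore the PONG
```

Matching on the payload rather than on the source address of the PONG reflects the rule that the PONG frame can use any path on its return.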
1709 After verifying an address, the endpoint SHOULD update any address 1710 validation tokens (Section 7.6) that it has issued to its peer if 1711 those are no longer valid based on the changed address. 1713 Address validation using the PING frame MAY be used at any time by 1714 either peer. For instance, an endpoint might check that a peer is 1715 still in possession of its address after a period of quiescence. 1717 Upon seeing a connection migration, an endpoint that sees a new 1718 address MUST abandon any address validation it is performing with 1719 other addresses on the expectation that the validation is likely to 1720 fail. Abandoning address validation primarily means not closing the 1721 connection when a PONG frame is not received, but it could also mean 1722 ceasing retransmissions of the PING frame. An endpoint that doesn't 1723 retransmit a PING frame might receive a PONG frame, which it MUST 1724 ignore. 1726 7.8. Spurious Connection Migrations 1728 A connection migration could be triggered by an attacker that is able 1729 to capture and forward a packet such that it arrives before the 1730 legitimate copy of that packet. Such a packet will appear to be a 1731 legitimate connection migration and the legitimate copy will be 1732 dropped as a duplicate. 1734 After a spurious migration, validation of the source address will 1735 fail because the entity at the source address does not have the 1736 necessary cryptographic keys to read or respond to the PING frame 1737 that is sent to it, even if it wanted to. Such a spurious connection 1738 migration could result in the connection being dropped when the 1739 source address validation fails. This grants an attacker the ability 1740 to terminate the connection. 1742 Receipt of packets with higher packet numbers from the legitimate 1743 address will trigger another connection migration. This will cause 1744 the validation of the address of the spurious migration to be 1745 abandoned. 
1747 To ensure that a peer sends packets from the legitimate address 1748 before the validation of the new address can fail, an endpoint SHOULD 1749 attempt to validate the old remote address before attempting to 1750 validate the new address. If the connection migration is spurious, 1751 then the legitimate address will be used to respond and the 1752 connection will migrate back to the old address. 1754 As with any address validation, packets containing retransmissions of 1755 the PING frame validating an address MUST be sent to the address 1756 being validated. Consequently, during a migration of a peer, an 1757 endpoint could be sending to multiple remote addresses. 1759 An endpoint MAY abandon address validation for an address that it 1760 considers to be already valid. That is, if successive connection 1761 migrations occur in quick succession with the final remote address 1762 being identical to the initial remote address, the endpoint MAY 1763 abandon address validation for that address. 1765 7.9. Connection Termination 1767 Connections should remain open until they become idle for a pre- 1768 negotiated period of time. A QUIC connection, once established, can 1769 be terminated in one of three ways: 1771 o idle timeout (Section 7.9.2) 1773 o immediate close (Section 7.9.3) 1775 o stateless reset (Section 7.9.4) 1777 7.9.1. Closing and Draining Connection States 1779 The closing and draining connection states exist to ensure that 1780 connections close cleanly and that delayed or reordered packets are 1781 properly discarded. These states SHOULD persist for three times the 1782 current Retransmission Timeout (RTO) interval as defined in 1783 [QUIC-RECOVERY]. 1785 An endpoint enters a closing period after initiating an immediate 1786 close (Section 7.9.3) and optionally after an idle timeout 1787 (Section 7.9.2). 
While closing, an endpoint MUST NOT send packets 1788 unless they contain a CONNECTION_CLOSE or APPLICATION_CLOSE frame 1789 (see Section 7.9.3 for details). 1791 In the closing state, only a packet containing a closing frame can be 1792 sent. An endpoint retains only enough information to generate a 1793 packet containing a closing frame and to identify packets as 1794 belonging to the connection. The connection ID and QUIC version are 1795 sufficient information to identify packets for a closing connection; 1796 an endpoint can discard all other connection state. An endpoint MAY 1797 retain packet protection keys for incoming packets to allow it to 1798 read and process a closing frame. 1800 The draining state is entered once an endpoint receives a signal that 1801 its peer is closing or draining. While otherwise identical to the 1802 closing state, an endpoint in the draining state MUST NOT send any 1803 packets. Retaining packet protection keys is unnecessary once a 1804 connection is in the draining state. 1806 An endpoint MAY transition from the closing period to the draining 1807 period if it can confirm that its peer is also closing or draining. 1808 Receiving a closing frame is sufficient confirmation, as is receiving 1809 a stateless reset. The draining period SHOULD end when the closing 1810 period would have ended. In other words, the endpoint can use the 1811 same end time, but cease retransmission of the closing packet. 1813 Disposing of connection state prior to the end of the closing or 1814 draining period could cause delayed or reordered packets to be 1815 handled poorly. Endpoints that have some alternative means to ensure 1816 that late-arriving packets on the connection do not create QUIC 1817 state, such as those that are able to close the UDP socket, MAY use 1818 an abbreviated draining period which can allow for faster resource 1819 recovery.
Servers that retain an open socket for accepting new 1820 connections SHOULD NOT exit the closing or draining period early. 1822 Once the closing or draining period has ended, an endpoint SHOULD 1823 discard all connection state. This results in new packets on the 1824 connection being handled generically. For instance, an endpoint MAY 1825 send a stateless reset in response to any further incoming packets. 1827 The draining and closing periods do not apply when a stateless reset 1828 (Section 7.9.4) is sent. 1830 7.9.2. Idle Timeout 1832 A connection that remains idle for longer than the idle timeout (see 1833 Section 7.4.1) is closed. A connection enters the draining state 1834 when the idle timeout expires. 1836 The time at which an idle timeout takes effect won't be perfectly 1837 synchronized on both endpoints. An endpoint that sends packets near 1838 the end of an idle period could have those packets discarded if its 1839 peer enters the draining state before the packet is received. 1841 7.9.3. Immediate Close 1843 An endpoint sends a closing frame, either CONNECTION_CLOSE or 1844 APPLICATION_CLOSE, to terminate the connection immediately. Either 1845 closing frame causes all streams to immediately become closed; open 1846 streams can be assumed to be implicitly reset. 1848 After sending a closing frame, endpoints immediately enter the 1849 closing state. During the closing period, an endpoint that sends a 1850 closing frame SHOULD respond to any packet that it receives with 1851 another packet containing a closing frame. To minimize the state 1852 that an endpoint maintains for a closing connection, endpoints MAY 1853 send the exact same packet. However, endpoints SHOULD limit the 1854 number of packets they generate containing a closing frame. For 1855 instance, an endpoint could progressively increase the number of 1856 packets that it receives before sending additional packets or 1857 increase the time between packets. 
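One way to limit closing-frame packets as suggested above is to double the number of received packets required before each response. The sketch below is only one possible policy; the class name and the doubling factor are assumptions made for illustration.

```python
class ClosingResponder:
    """Decides when to resend the closing packet (hypothetical policy)."""

    def __init__(self):
        self.received = 0   # packets received since the last response
        self.threshold = 1  # packets required before the next response

    def on_packet(self):
        """Return True if the stored closing packet should be sent again."""
        self.received += 1
        if self.received >= self.threshold:
            self.received = 0
            self.threshold *= 2  # require twice as many packets next time
            return True
        return False
```

With this policy, responses are sent for the 1st, 3rd, 7th, and 15th packets received during the closing period, so the response rate falls off quickly while a peer that lost the first closing packet is still likely to see a later copy.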
1859 Note: Allowing retransmission of a packet contradicts other advice 1860 in this document that recommends the creation of new packet 1861 numbers for every packet. Sending new packet numbers is primarily 1862 of advantage to loss recovery and congestion control, which are 1863 not expected to be relevant for a closed connection. 1864 Retransmitting the final packet requires less state. 1866 After receiving a closing frame, endpoints enter the draining state. 1867 An endpoint that receives a closing frame MAY send a single packet 1868 containing a closing frame before entering the draining state, using 1869 a CONNECTION_CLOSE frame and a NO_ERROR code if appropriate. An 1870 endpoint MUST NOT send further packets, which could result in a 1871 constant exchange of closing frames until the closing period on 1872 either peer ended. 1874 An immediate close can be used after an application protocol has 1875 arranged to close a connection. This might be after the application 1876 protocol negotiates a graceful shutdown. The application protocol 1877 exchanges whatever messages are needed to cause both endpoints 1878 to agree to close the connection, after which the application 1879 requests that the connection be closed. The application protocol can 1880 use an APPLICATION_CLOSE message with an appropriate error code to 1881 signal closure. 1883 7.9.4. Stateless Reset 1885 A stateless reset is provided as an option of last resort for a 1886 server that does not have access to the state of a connection. A 1887 server crash or outage might result in clients continuing to send 1888 data to a server that is unable to properly continue the connection. 1889 A server that wishes to communicate a fatal connection error MUST use 1890 a closing frame if it has sufficient state to do so. 1892 To support this process, the server sends a stateless_reset_token 1893 value during the handshake in the transport parameters.
This value 1894 is protected by encryption, so only client and server know this 1895 value. 1897 A server that receives packets that it cannot process sends a packet 1898 in the following layout: 1900 0 1 2 3 1901 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1902 +-+-+-+-+-+-+-+-+ 1903 |0|C|K|Type (5) | 1904 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1905 | | 1906 + [Connection ID (64)] + 1907 | | 1908 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1909 | Packet Number (8/16/32) | 1910 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1911 | Random Octets (*) ... 1912 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1913 | | 1914 + + 1915 | | 1916 + Stateless Reset Token (128) + 1917 | | 1918 + + 1919 | | 1920 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1922 A server copies the connection ID field from the packet that triggers 1923 the stateless reset. A server omits the connection ID if explicitly 1924 configured to do so, or if the client packet did not include a 1925 connection ID. 1927 The Packet Number field is set to a randomized value. The server 1928 SHOULD send a packet with a short header and a type of 0x1F. This 1929 produces the shortest possible packet number encoding, which 1930 minimizes the perceived gap between the last packet that the server 1931 sent and this packet. A server MAY use a different short header 1932 type, indicating a different packet number length, but a longer 1933 packet number encoding might allow this message to be identified as a 1934 stateless reset more easily using heuristics. 1936 After the first short header octet and optional connection ID, the 1937 server includes the value of the Stateless Reset Token that it 1938 included in its transport parameters. 1940 After the Packet Number, the server pads the message with an 1941 arbitrary number of octets containing random values. 
1943 Finally, the last 16 octets of the packet are set to the value of the 1944 Stateless Reset Token. 1946 This design ensures that a stateless reset packet is - to the extent 1947 possible - indistinguishable from a regular packet. 1949 A stateless reset is not appropriate for signaling error conditions. 1950 An endpoint that wishes to communicate a fatal connection error MUST 1951 use a CONNECTION_CLOSE or APPLICATION_CLOSE frame if it has 1952 sufficient state to do so. 1954 This stateless reset design is specific to QUIC version 1. A server 1955 that supports multiple versions of QUIC needs to generate a stateless 1956 reset that will be accepted by clients that support any version that 1957 the server might support (or might have supported prior to losing 1958 state). Designers of new versions of QUIC need to be aware of this 1959 and either reuse this design, or use a portion of the packet other 1960 than the last 16 octets for carrying data. 1962 7.9.4.1. Detecting a Stateless Reset 1964 A client detects a potential stateless reset when a packet with a 1965 short header either cannot be decrypted or is marked as a duplicate 1966 packet. The client then compares the last 16 octets of the packet 1967 with the Stateless Reset Token provided by the server in its 1968 transport parameters. If these values are identical, the client MUST 1969 enter the draining period and not send any further packets on this 1970 connection. If the comparison fails, the packet can be discarded. 1972 7.9.4.2. Calculating a Stateless Reset Token 1974 The stateless reset token MUST be difficult to guess. In order to 1975 create a Stateless Reset Token, a server could randomly generate 1976 [RFC4086] a secret for every connection that it creates. However, 1977 this presents a coordination problem when there are multiple servers 1978 in a cluster or a storage problem for a server that might lose state. 
1979 Stateless reset specifically exists to handle the case where state is 1980 lost, so this approach is suboptimal. 1982 A single static key can be used across all connections to the same 1983 endpoint by generating the proof using a second iteration of a 1984 preimage-resistant function that takes three inputs: the static key, 1985 the connection ID for the connection (see Section 5.6), and an 1986 identifier for the server instance. A server could use HMAC 1987 [RFC2104] (for example, HMAC(static_key, server_id || connection_id)) 1988 or HKDF [RFC5869] (for example, using the static key as input keying 1989 material, with server and connection identifiers as salt). The 1990 output of this function is truncated to 16 octets to produce the 1991 Stateless Reset Token for that connection. 1993 A server that loses state can use the same method to generate a valid 1994 Stateless Reset Token. The connection ID comes from the packet that 1995 the server receives. 1997 This design relies on the client always sending a connection ID in 1998 its packets so that the server can use the connection ID from a 1999 packet to reset the connection. A server that uses this design 2000 cannot allow clients to omit a connection ID (that is, it cannot use 2001 the truncate_connection_id transport parameter; see Section 7.4.1). 2003 Revealing the Stateless Reset Token allows any entity to terminate 2004 the connection, so a value can only be used once. This method for 2005 choosing the Stateless Reset Token means that the combination of 2006 server instance, connection ID, and static key cannot occur for 2007 another connection. A connection ID from a connection that is reset 2008 by revealing the Stateless Reset Token cannot be reused for new 2009 connections at the same server without first changing to use a 2010 different static key or server identifier. 2012 Note that Stateless Reset messages do not have any cryptographic 2013 protection. 2015 8.
Frame Types and Formats 2017 As described in Section 6, Regular packets contain one or more 2018 frames. We now describe the various QUIC frame types that can be 2019 present in a Regular packet. The use of these frames and various 2020 frame header bits are described in subsequent sections. 2022 8.1. Variable-Length Integer Encoding 2024 QUIC frames use a common variable-length encoding for all non- 2025 negative integer values. This encoding ensures that smaller integer 2026 values need fewer octets to encode. 2028 The QUIC variable-length integer encoding reserves the two most 2029 significant bits of the first octet to encode the base 2 logarithm of 2030 the integer encoding length in octets. The integer value is encoded 2031 on the remaining bits, in network byte order. 2033 This means that integers are encoded on 1, 2, 4, or 8 octets and can 2034 encode 6, 14, 30, or 62 bit values respectively. Table 4 summarizes 2035 the encoding properties. 2037 +------+--------+-------------+-----------------------+ 2038 | 2Bit | Length | Usable Bits | Range | 2039 +------+--------+-------------+-----------------------+ 2040 | 00 | 1 | 6 | 0-63 | 2041 | | | | | 2042 | 01 | 2 | 14 | 0-16383 | 2043 | | | | | 2044 | 10 | 4 | 30 | 0-1073741823 | 2045 | | | | | 2046 | 11 | 8 | 62 | 0-4611686018427387903 | 2047 +------+--------+-------------+-----------------------+ 2049 Table 4: Summary of Integer Encodings 2051 For example, the eight octet sequence c2 19 7c 5e ff 14 e8 8c (in 2052 hexadecimal) decodes to the decimal value 151288809941952652; the 2053 four octet sequence 9d 7f 3e 7d decodes to 494878333; the two octet 2054 sequence 7b bd decodes to 15293; and the single octet 25 decodes to 2055 37 (as does the two octet sequence 40 25). 2057 Error codes (Section 12.3) are described using integers, but do not 2058 use this encoding. 2060 8.2. PADDING Frame 2062 The PADDING frame (type=0x00) has no semantic value. PADDING frames 2063 can be used to increase the size of a packet. 
Padding can be used to 2064 increase an initial client packet to the minimum required size, or to 2065 provide protection against traffic analysis for protected packets. 2067 A PADDING frame has no content. That is, a PADDING frame consists of 2068 the single octet that identifies the frame as a PADDING frame. 2070 8.3. RST_STREAM Frame 2072 An endpoint may use a RST_STREAM frame (type=0x01) to abruptly 2073 terminate a stream. 2075 After sending a RST_STREAM, an endpoint ceases transmission and 2076 retransmission of STREAM frames on the identified stream. A receiver 2077 of RST_STREAM can discard any data that it already received on that 2078 stream. 2080 The RST_STREAM frame is as follows: 2082 0 1 2 3 2083 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2084 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2085 | Stream ID (i) ... 2086 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2087 | Application Error Code (16) | 2088 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2089 | Final Offset (i) ... 2090 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2092 The fields are: 2094 Stream ID: A variable-length integer encoding of the Stream ID of 2095 the stream being terminated. 2097 Application Protocol Error Code: A 16-bit application protocol error 2098 code (see Section 12.4) which indicates why the stream is being 2099 closed. 2101 Final Offset: A variable-length integer indicating the absolute byte 2102 offset of the end of data written on this stream by the RST_STREAM 2103 sender. 2105 8.4. CONNECTION_CLOSE frame 2107 An endpoint sends a CONNECTION_CLOSE frame (type=0x02) to notify its 2108 peer that the connection is being closed. CONNECTION_CLOSE is used 2109 to signal errors at the QUIC layer, or the absence of errors (with 2110 the NO_ERROR code). 
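The fields marked (i) in the frame layouts above use the variable-length integer encoding of Section 8.1. The following is a minimal sketch of that encoding, not a normative implementation; the function names encode_varint and decode_varint are hypothetical and chosen only for illustration.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer in the shortest of the 1, 2, 4, or 8 octet forms."""
    if value < 0:
        raise ValueError("value must be non-negative")
    # prefix is the 2-bit value from Table 4; the length is 2**prefix octets.
    for prefix, length in enumerate((1, 2, 4, 8)):
        usable_bits = 8 * length - 2
        if value < (1 << usable_bits):
            encoded = value | (prefix << usable_bits)
            return encoded.to_bytes(length, "big")  # network byte order
    raise ValueError("value does not fit in 62 bits")


def decode_varint(data: bytes) -> Tuple[int, int]:
    """Decode one integer from the start of data; return (value, octets consumed)."""
    if not data:
        raise ValueError("no data to decode")
    # The two most significant bits hold log2 of the encoding length.
    length = 1 << (data[0] >> 6)
    if len(data) < length:
        raise ValueError("truncated integer")
    mask = (1 << (8 * length - 2)) - 1
    return int.from_bytes(data[:length], "big") & mask, length
```

The example sequences listed in Section 8.1 can be used to check a decoder of this kind; note that an encoder that always picks the shortest form, as this sketch does, produces 25 rather than 40 25 for the value 37.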
2112 If there are open streams that haven't been explicitly closed, they 2113 are implicitly closed when the connection is closed. 2115 The CONNECTION_CLOSE frame is as follows: 2117 0 1 2 3 2118 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2119 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2120 | Error Code (16) | Reason Phrase Length (i) ... 2121 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2122 | Reason Phrase (*) ... 2123 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2125 The fields of a CONNECTION_CLOSE frame are as follows: 2127 Error Code: A 16-bit error code which indicates the reason for 2128 closing this connection. CONNECTION_CLOSE uses codes from the 2129 space defined in Section 12.3 (APPLICATION_CLOSE uses codes from 2130 the application protocol error code space, see Section 12.4). 2132 Reason Phrase Length: A variable-length integer specifying the 2133 length of the reason phrase in bytes. Note that a 2134 CONNECTION_CLOSE frame cannot be split between packets, so in 2135 practice any limits on packet size will also limit the space 2136 available for a reason phrase. 2138 Reason Phrase: A human-readable explanation for why the connection 2139 was closed. This can be zero length if the sender chooses to not 2140 give details beyond the Error Code. This SHOULD be a UTF-8 2141 encoded string [RFC3629]. 2143 8.5. APPLICATION_CLOSE frame 2145 An APPLICATION_CLOSE frame (type=0x03) uses the same format as the 2146 CONNECTION_CLOSE frame (Section 8.4), except that it uses error codes 2147 from the application protocol error code space (Section 12.4) instead 2148 of the transport error code space. 2150 Other than the error code space, the format and semantics of the 2151 APPLICATION_CLOSE frame are identical to the CONNECTION_CLOSE frame. 2153 8.6. 
MAX_DATA Frame 2155 The MAX_DATA frame (type=0x04) is used in flow control to inform the 2156 peer of the maximum amount of data that can be sent on the connection 2157 as a whole. 2159 The frame is as follows: 2161 0 1 2 3 2162 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2163 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2164 | Maximum Data (i) ... 2165 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2167 The fields in the MAX_DATA frame are as follows: 2169 Maximum Data: A variable-length integer indicating the maximum 2170 amount of data that can be sent on the entire connection, in units 2171 of octets. 2173 All data sent in STREAM frames counts toward this limit, with the 2174 exception of data on stream 0. The sum of the largest received 2175 offsets on all streams - including streams in terminal states, but 2176 excluding stream 0 - MUST NOT exceed the value advertised by a 2177 receiver. An endpoint MUST terminate a connection with a 2178 QUIC_FLOW_CONTROL_RECEIVED_TOO_MUCH_DATA error if it receives more 2179 data than the maximum data value that it has sent, unless this is a 2180 result of a change in the initial limits (see Section 7.4.2). 2182 8.7. MAX_STREAM_DATA Frame 2184 The MAX_STREAM_DATA frame (type=0x05) is used in flow control to 2185 inform a peer of the maximum amount of data that can be sent on a 2186 stream. 2188 The frame is as follows: 2190 0 1 2 3 2191 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2192 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2193 | Stream ID (i) ... 2194 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2195 | Maximum Stream Data (i) ... 2196 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2198 The fields in the MAX_STREAM_DATA frame are as follows: 2200 Stream ID: The stream ID of the stream that is affected encoded as a 2201 variable-length integer. 
2203 Maximum Stream Data: A variable-length integer indicating the 2204 maximum amount of data that can be sent on the identified stream, 2205 in units of octets. 2207 When counting data toward this limit, an endpoint accounts for the 2208 largest received offset of data that is sent or received on the 2209 stream. Loss or reordering can mean that the largest received offset 2210 on a stream can be greater than the total size of data received on 2211 that stream. Receiving STREAM frames might not increase the largest 2212 received offset. 2214 The data sent on a stream MUST NOT exceed the largest maximum stream 2215 data value advertised by the receiver. An endpoint MUST terminate a 2216 connection with a FLOW_CONTROL_ERROR error if it receives more data 2217 than the largest maximum stream data that it has sent for the 2218 affected stream, unless this is a result of a change in the initial 2219 limits (see Section 7.4.2). 2221 8.8. MAX_STREAM_ID Frame 2223 The MAX_STREAM_ID frame (type=0x06) informs the peer of the maximum 2224 stream ID that it is permitted to open. 2226 The frame is as follows: 2228 0 1 2 3 2229 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2230 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2231 | Maximum Stream ID (i) ... 2232 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2234 The fields in the MAX_STREAM_ID frame are as follows: 2236 Maximum Stream ID: ID of the maximum unidirectional or bidirectional 2237 peer-initiated stream ID for the connection encoded as a variable- 2238 length integer. The limit applies to unidirectional streams if the 2239 second least significant bit of the stream ID is 1, and applies 2240 to bidirectional streams if it is 0. 2242 Loss or reordering can mean that a MAX_STREAM_ID frame can be 2243 received which states a lower stream limit than the client has 2244 previously received.
MAX_STREAM_ID frames which do not increase the 2245 maximum stream ID MUST be ignored. 2247 A peer MUST NOT initiate a stream with a higher stream ID than the 2248 greatest maximum stream ID it has received. An endpoint MUST 2249 terminate a connection with a STREAM_ID_ERROR error if a peer 2250 initiates a stream with a higher stream ID than it has sent, unless 2251 this is a result of a change in the initial limits (see 2252 Section 7.4.2). 2254 8.9. PING Frame 2256 Endpoints can use PING frames (type=0x07) to verify that their peers 2257 are still alive or to check reachability to the peer. 2259 The PING frame contains a variable-length payload. 2261 0 1 2 3 2262 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2263 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2264 | Length(8) | Data (*) ... 2265 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2267 Length: This 8-bit value describes the length of the Data field. 2269 Data: This variable-length field contains arbitrary data. 2271 A PING frame with an empty Data field causes the packet containing it 2272 to be acknowledged as normal. No other action is required of the 2273 recipient. 2275 An empty PING frame can be used to keep a connection alive when an 2276 application or application protocol wishes to prevent the connection 2277 from timing out. An application protocol SHOULD provide guidance 2278 about the conditions under which generating a PING is recommended. 2279 This guidance SHOULD indicate whether it is the client or the server 2280 that is expected to send the PING. Having both endpoints send PING 2281 frames without coordination can produce an excessive number of 2282 packets and poor performance. 2284 If the Data field is not empty, the recipient of this frame MUST 2285 generate a PONG frame (Section 8.15) containing the same Data. 
A 2286 PING frame with data is not appropriate for use in keeping a 2287 connection alive, because the PONG frame elicits an acknowledgement, 2288 causing the sender of the original PING to send two packets. 2290 A connection will time out if no packets are sent or received for a 2291 period longer than the time specified in the idle_timeout transport 2292 parameter (see Section 7.9). However, state in middleboxes might 2293 time out earlier than that. Though REQ-5 in [RFC4787] recommends a 2 2294 minute timeout interval, experience shows that sending packets every 2295 15 to 30 seconds is necessary to prevent the majority of middleboxes 2296 from losing state for UDP flows. 2298 8.10. BLOCKED Frame 2300 A sender SHOULD send a BLOCKED frame (type=0x08) when it wishes to 2301 send data, but is unable to due to connection-level flow control (see 2302 Section 11.2.1). BLOCKED frames can be used as input to tuning of 2303 flow control algorithms (see Section 11.1.2). 2305 The BLOCKED frame is as follows: 2307 0 1 2 3 2308 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2309 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2310 | Offset (i) ... 2311 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2313 The BLOCKED frame contains a single field. 2315 Offset: A variable-length integer indicating the connection-level 2316 offset at which the blocking occurred. 2318 8.11. STREAM_BLOCKED Frame 2320 A sender SHOULD send a STREAM_BLOCKED frame (type=0x09) when it 2321 wishes to send data, but is unable to due to stream-level flow 2322 control. This frame is analogous to BLOCKED (Section 8.10). 2324 The STREAM_BLOCKED frame is as follows: 2326 0 1 2 3 2327 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2328 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2329 | Stream ID (i) ... 2330 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2331 | Offset (i) ... 
2332 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2334 The STREAM_BLOCKED frame contains two fields: 2336 Stream ID: A variable-length integer indicating the stream which is 2337 flow control blocked. 2339 Offset: A variable-length integer indicating the offset of the 2340 stream at which the blocking occurred. 2342 8.12. STREAM_ID_BLOCKED Frame 2344 A sender MAY send a STREAM_ID_BLOCKED frame (type=0x0a) when it 2345 wishes to open a stream, but is unable to due to the maximum stream 2346 ID limit set by its peer (see Section 8.8). This does not open the 2347 stream, but informs the peer that a new stream was needed, but the 2348 stream limit prevented the creation of the stream. 2350 The STREAM_ID_BLOCKED frame is as follows: 2352 0 1 2 3 2353 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2354 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2355 | Stream ID (i) ... 2356 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2358 The STREAM_ID_BLOCKED frame contains a single field. 2360 Stream ID: A variable-length integer indicating the highest stream 2361 ID that the sender was permitted to open. 2363 8.13. NEW_CONNECTION_ID Frame 2365 A server sends a NEW_CONNECTION_ID frame (type=0x0b) to provide the 2366 client with alternative connection IDs that can be used to break 2367 linkability when migrating connections (see Section 7.7.1). 2369 The NEW_CONNECTION_ID is as follows: 2371 0 1 2 3 2372 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2373 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2374 | Sequence (i) ... 
2375 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2376 | | 2377 + Connection ID (64) + 2378 | | 2379 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2380 | | 2381 + + 2382 | | 2383 + Stateless Reset Token (128) + 2384 | | 2385 + + 2386 | | 2387 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2389 The fields are: 2391 Sequence: A variable-length integer. This value starts at 0 and 2392 increases by 1 for each connection ID that is provided by the 2393 server. The connection ID that is assigned during the handshake 2394 is assumed to have a sequence of -1. That is, the value selected 2395 during the handshake comes immediately before the first value that 2396 a server can send. 2398 Connection ID: A 64-bit connection ID. 2400 Stateless Reset Token: A 128-bit value that will be used for a 2401 stateless reset when the associated connection ID is used (see 2402 Section 7.9.4). 2404 8.14. STOP_SENDING Frame 2406 An endpoint may use a STOP_SENDING frame (type=0x0c) to communicate 2407 that incoming data is being discarded on receipt at application 2408 request. This signals a peer to abruptly terminate transmission on a 2409 stream. 2411 The STOP_SENDING frame is as follows: 2413 0 1 2 3 2414 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2415 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2416 | Stream ID (i) ... 2417 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2418 | Application Error Code (16) | 2419 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2421 The fields are: 2423 Stream ID: A variable-length integer carrying the Stream ID of the 2424 stream being ignored. 2426 Application Error Code: A 16-bit, application-specified reason the 2427 sender is ignoring the stream (see Section 12.4). 2429 8.15. PONG Frame 2431 The PONG frame (type=0x0d) is sent in response to a PING frame that 2432 contains data. Its format is identical to the PING frame 2433 (Section 8.9).
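The relationship between PING and PONG can be sketched as follows. This is an illustrative sketch only: it assumes that the single frame type octet immediately precedes the Length and Data fields shown in Section 8.9, and the helper names build_ping and pong_for_ping are hypothetical.

```python
from typing import Optional

PING_TYPE = 0x07
PONG_TYPE = 0x0D


def build_ping(data: bytes = b"") -> bytes:
    """Build a PING frame: type octet, 8-bit Length, then Data."""
    if len(data) > 255:
        raise ValueError("Data does not fit the 8-bit Length field")
    return bytes([PING_TYPE, len(data)]) + data


def pong_for_ping(frame: bytes) -> Optional[bytes]:
    """Return the PONG frame owed for a received PING, or None if Data is empty."""
    if len(frame) < 2 or frame[0] != PING_TYPE:
        raise ValueError("not a PING frame")
    length = frame[1]
    data = frame[2:2 + length]
    if len(data) != length:
        raise ValueError("truncated PING frame")
    if length == 0:
        # An empty PING is only acknowledged; no PONG is generated.
        return None
    # The PONG uses the PING layout and echoes the same Data.
    return bytes([PONG_TYPE, length]) + data
```

For example, pong_for_ping(build_ping(b"abc")) yields a PONG carrying the same three octets, while an empty PING yields None.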
2435 An endpoint that receives an unsolicited PONG frame - that is, a PONG 2436 frame containing a payload that is empty - MUST generate a connection 2437 error of type FRAME_ERROR, indicating the PONG frame (that is, 2438 0x10d). If the content of a PONG frame does not match the content of 2439 a PING frame previously sent by the endpoint, the endpoint MAY 2440 generate a connection error of type UNSOLICITED_PONG. 2442 8.16. ACK Frame 2444 Receivers send ACK frames (type=0x0e) to inform senders which packets 2445 they have received and processed. A sent packet that has never been 2446 acknowledged is missing. The ACK frame contains any number of ACK 2447 blocks. ACK blocks are ranges of acknowledged packets. 2449 Unlike TCP SACKs, QUIC acknowledgements are irrevocable. Once a 2450 packet has been acknowledged, even if it does not appear in a future 2451 ACK frame, it remains acknowledged. 2453 A client MUST NOT acknowledge Version Negotiation or Retry packets. 2454 These packet types contain packet numbers selected by the client, not 2455 the server. 2457 A sender MAY intentionally skip packet numbers to introduce entropy 2458 into the connection, to avoid opportunistic acknowledgement attacks. 2459 The sender SHOULD close the connection if an unsent packet number is 2460 acknowledged. The format of the ACK frame is efficient at expressing 2461 even long blocks of missing packets, allowing for large, 2462 unpredictable gaps. 2464 An ACK frame is shown below. 2466 0 1 2 3 2467 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2468 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2469 | Largest Acknowledged (i) ... 2470 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2471 | ACK Delay (i) ... 2472 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2473 | ACK Block Count (i) ... 2474 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2475 | ACK Blocks (*) ...
2476 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2478 Figure 7: ACK Frame Format 2480 The fields in the ACK frame are as follows: 2482 Largest Acknowledged: A variable-length integer representing the 2483 largest packet number the peer is acknowledging; this is usually 2484 the largest packet number that the peer has received prior to 2485 generating the ACK frame. 2487 ACK Delay: A variable-length integer indicating the time in 2488 microseconds from when the largest acknowledged packet, as indicated in 2489 the Largest Acknowledged field, was received by this peer to when 2490 this ACK was sent. The value of the ACK Delay field is scaled by 2491 multiplying the encoded value by 2 to the power of the value 2492 of the "ack_delay_exponent" transport parameter set by the sender 2493 of the ACK frame. The "ack_delay_exponent" defaults to 3, or a 2494 multiplier of 8 (see Section 7.4.1). Scaling in this fashion 2495 allows for a larger range of values with a shorter encoding at the 2496 cost of lower resolution. 2498 ACK Block Count: The number of Additional ACK Block (and Gap) fields 2499 after the First ACK Block. 2501 ACK Blocks: Contains one or more blocks of packet numbers which have 2502 been successfully received, see Section 8.16.1. 2504 8.16.1. ACK Block Section 2506 The ACK Block Section consists of alternating Gap and ACK Block 2507 fields in descending packet number order. A First ACK Block field is 2508 followed by a variable number of alternating Gap and Additional ACK 2509 Blocks. The number of Gap and Additional ACK Block fields is 2510 determined by the ACK Block Count field. 2512 Gap and ACK Block fields use a relative integer encoding for 2513 efficiency. Though each encoded value is positive, the values are 2514 subtracted, so that each ACK Block describes progressively lower- 2515 numbered packets.
As long as contiguous ranges of packets are small, 2516 the variable-length integer encoding ensures that each range can be 2517 expressed in a small number of octets. 2519 0 1 2 3 2520 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2521 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2522 | First ACK Block (i) ... 2523 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2524 | Gap (i) ... 2525 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2526 | Additional ACK Block (i) ... 2527 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2528 | Gap (i) ... 2529 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2530 | Additional ACK Block (i) ... 2531 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2532 ... 2533 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2534 | Gap (i) ... 2535 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2536 | Additional ACK Block (i) ... 2537 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2539 Figure 8: ACK Block Section 2541 Each ACK Block acknowledges a contiguous range of packets by 2542 indicating the number of acknowledged packets that precede the 2543 largest packet number in that block. A value of zero indicates that 2544 only the largest packet number is acknowledged. Larger ACK Block 2545 values indicate a larger range, with corresponding lower values for 2546 the smallest packet number in the range. Thus, given a largest 2547 packet number for the ACK, the smallest value is determined by the 2548 formula: 2550 smallest = largest - ack_block 2552 The range of packets that are acknowledged by the ACK block includes 2553 the range from the smallest packet number to the largest, inclusive.
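The range arithmetic of this section can be sketched as follows, combining the formula above with the Gap formula given later in this section. This is a hedged illustration rather than a normative algorithm: the function name ack_ranges is hypothetical, and the inputs are assumed to have already been decoded from their variable-length integer form.

```python
from typing import List, Tuple


def ack_ranges(largest_acknowledged: int,
               first_ack_block: int,
               additional: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Expand an ACK Block Section into inclusive (smallest, largest) ranges.

    additional holds the (gap, ack_block) pairs that follow the First ACK
    Block, in the order they appear on the wire (descending packet numbers).
    """
    largest = largest_acknowledged
    smallest = largest - first_ack_block      # smallest = largest - ack_block
    if smallest < 0:
        raise ValueError("negative packet number in First ACK Block")
    ranges = [(smallest, largest)]
    for gap, ack_block in additional:
        largest = smallest - gap - 2          # largest = previous_smallest - gap - 2
        smallest = largest - ack_block
        if largest < 0 or smallest < 0:
            # The draft requires a FRAME_ERROR connection error in this case.
            raise ValueError("negative packet number in ACK Block")
        ranges.append((smallest, largest))
    return ranges
```

As a worked example, a Largest Acknowledged of 20 with a First ACK Block of 2, followed by the pairs (gap=1, block=3) and (gap=0, block=0), acknowledges packets 18-20, 12-15, and 10; packets 16, 17, and 11 are the unacknowledged gaps.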
2555 The largest value for the First ACK Block is determined by the 2556 Largest Acknowledged field; the largest for Additional ACK Blocks is 2557 determined by cumulatively subtracting the size of all preceding ACK 2558 Blocks and Gaps. 2560 Each Gap indicates a range of packets that are not being 2561 acknowledged. The number of packets in the gap is one higher than 2562 the encoded value of the Gap field. 2564 The value of the Gap field establishes the largest packet number 2565 value for the ACK block that follows the gap using the following 2566 formula: 2568 largest = previous_smallest - gap - 2 2570 If the calculated value for largest or smallest packet number for any 2571 ACK Block is negative, an endpoint MUST generate a connection error 2572 of type FRAME_ERROR indicating an error in an ACK frame (that is, 2573 0x10e). 2575 The fields in the ACK Block Section are: 2577 First ACK Block: A variable-length integer indicating the number of 2578 contiguous packets preceding the Largest Acknowledged that are 2579 being acknowledged. 2581 Gap (repeated): A variable-length integer indicating the number of 2582 contiguous unacknowledged packets preceding the packet number one 2583 lower than the smallest in the preceding ACK Block. 2585 ACK Block (repeated): A variable-length integer indicating the 2586 number of contiguous acknowledged packets preceding the largest 2587 packet number, as determined by the preceding Gap. 2589 8.16.2. Sending ACK Frames 2591 Implementations MUST NOT generate packets that only contain ACK 2592 frames in response to packets which only contain ACK frames. 2593 However, they MUST acknowledge packets containing only ACK frames 2594 when sending ACK frames in response to other packets. 2595 Implementations MUST NOT send more than one ACK frame per received 2596 packet that contains frames other than ACK frames. Packets 2597 containing non-ACK frames MUST be acknowledged immediately or when a 2598 delayed ack timer expires.
2600 To limit ACK blocks to those that have not yet been received by the 2601 sender, the receiver SHOULD track which ACK frames have been 2602 acknowledged by its peer. Once an ACK frame has been acknowledged, 2603 the packets it acknowledges SHOULD NOT be acknowledged again. 2605 A receiver that is only sending ACK frames will not receive 2606 acknowledgments for its packets. Sending an occasional MAX_DATA or 2607 MAX_STREAM_DATA frame as data is received will ensure that 2608 acknowledgements are generated by a peer. Otherwise, an endpoint MAY 2609 send a PING frame once per RTT to solicit an acknowledgment. 2611 To limit receiver state or the size of ACK frames, a receiver MAY 2612 limit the number of ACK blocks it sends. A receiver can do this even 2613 without receiving acknowledgment of its ACK frames, with the 2614 knowledge this could cause the sender to unnecessarily retransmit 2615 some data. Standard QUIC [QUIC-RECOVERY] algorithms declare packets 2616 lost after sufficiently newer packets are acknowledged. Therefore, 2617 the receiver SHOULD repeatedly acknowledge newly received packets in 2618 preference to packets received in the past. 2620 8.16.3. ACK Frames and Packet Protection 2622 ACK frames that acknowledge protected packets MUST be carried in a 2623 packet that has an equivalent or greater level of packet protection. 2625 Packets that are protected with 1-RTT keys MUST be acknowledged in 2626 packets that are also protected with 1-RTT keys. 2628 A packet that is not protected and claims to acknowledge a packet 2629 number that was sent with packet protection is not valid. An 2630 unprotected packet that carries acknowledgments for protected packets 2631 MUST be discarded in its entirety. 2633 Packets that a client sends with 0-RTT packet protection MUST be 2634 acknowledged by the server in packets protected by 1-RTT keys. 
This 2635 can mean that the client is unable to use these acknowledgments if 2636 the server cryptographic handshake messages are delayed or lost. 2637 Note that the same limitation applies to other data sent by the 2638 server protected by the 1-RTT keys. 2640 Unprotected packets, such as those that carry the initial 2641 cryptographic handshake messages, MAY be acknowledged in unprotected 2642 packets. Unprotected packets are vulnerable to falsification or 2643 modification. Unprotected packets can be acknowledged along with 2644 protected packets in a protected packet. 2646 An endpoint SHOULD acknowledge packets containing cryptographic 2647 handshake messages in the next unprotected packet that it sends, 2648 unless it is able to acknowledge those packets in later packets 2649 protected by 1-RTT keys. At the completion of the cryptographic 2650 handshake, both peers send unprotected packets containing 2651 cryptographic handshake messages followed by packets protected by 2652 1-RTT keys. An endpoint SHOULD acknowledge the unprotected packets 2653 that complete the cryptographic handshake in a protected packet, 2654 because its peer is guaranteed to have access to 1-RTT packet 2655 protection keys. 2657 For instance, a server acknowledges a TLS ClientHello in the packet 2658 that carries the TLS ServerHello; similarly, a client can acknowledge 2659 a TLS HelloRetryRequest in the packet containing a second TLS 2660 ClientHello. The complete set of server handshake messages (TLS 2661 ServerHello through to Finished) might be acknowledged by a client in 2662 protected packets, because it is certain that the server is able to 2663 decipher the packet. 2665 8.17. STREAM Frames 2667 STREAM frames implicitly create a stream and carry stream data. The 2668 STREAM frame takes the form 0b00010XXX (or the set of values from 2669 0x10 to 0x17). The value of the three low-order bits of the frame 2670 type determines the fields that are present in the frame.
2672 o The FIN bit (0x01) of the frame type is set only on frames that 2673 contain the final offset of the stream. Setting this bit 2674 indicates that the frame marks the end of the stream. 2676 o The LEN bit (0x02) in the frame type is set to indicate that there 2677 is a Length field present. If this bit is set to 0, the Length 2678 field is absent and the Stream Data field extends to the end of 2679 the packet. If this bit is set to 1, the Length field is present. 2681 o The OFF bit (0x04) in the frame type is set to indicate that there 2682 is an Offset field present. When set to 1, the Offset field is 2683 present; when set to 0, the Offset field is absent and the Stream 2684 Data starts at an offset of 0 (that is, the frame contains the 2685 first octets of the stream, or the end of a stream that includes 2686 no data). 2688 A STREAM frame is shown below. 2690 0 1 2 3 2691 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2692 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2693 | Stream ID (i) ... 2694 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2695 | [Offset (i)] ... 2696 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2697 | [Length (i)] ... 2698 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2699 | Stream Data (*) ... 2700 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2702 Figure 9: STREAM Frame Format 2704 The STREAM frame contains the following fields: 2706 Stream ID: A variable-length integer indicating the stream ID of the 2707 stream (see Section 10.1). 2709 Offset: A variable-length integer specifying the byte offset in the 2710 stream for the data in this STREAM frame. This field is present 2711 when the OFF bit is set to 1. When the Offset field is absent, 2712 the offset is 0. 2714 Length: A variable-length integer specifying the length of the 2715 Stream Data field in this STREAM frame. 
This field is present 2716 when the LEN bit is set to 1. When the LEN bit is set to 0, the 2717 Stream Data field consumes all the remaining octets in the packet. 2719 Stream Data: The bytes from the designated stream to be delivered. 2721 A stream frame's Stream Data MUST NOT be empty, unless the FIN bit is 2722 set. When the FIN flag is sent on an empty STREAM frame, the offset 2723 in the STREAM frame is the offset of the next byte that would be 2724 sent. 2726 The first byte in the stream has an offset of 0. The largest offset 2727 delivered on a stream - the sum of the re-constructed offset and data 2728 length - MUST be less than 2^62. 2730 Stream multiplexing is achieved by interleaving STREAM frames from 2731 multiple streams into one or more QUIC packets. A single QUIC packet 2732 can include multiple STREAM frames from one or more streams. 2734 Implementation note: One of the benefits of QUIC is avoidance of 2735 head-of-line blocking across multiple streams. When a packet loss 2736 occurs, only streams with data in that packet are blocked waiting for 2737 a retransmission to be received, while other streams can continue 2738 making progress. Note that when data from multiple streams is 2739 bundled into a single QUIC packet, loss of that packet blocks all 2740 those streams from making progress. An implementation is therefore 2741 advised to bundle as few streams as necessary in outgoing packets 2742 without losing transmission efficiency to underfilled packets. 2744 9. Packetization and Reliability 2746 The Path Maximum Transmission Unit (PMTU) is the maximum size of the 2747 entire IP header, UDP header, and UDP payload. The UDP payload 2748 includes the QUIC packet header, protected payload, and any 2749 authentication fields. 2751 All QUIC packets SHOULD be sized to fit within the estimated PMTU to 2752 avoid IP fragmentation or packet drops. 
To optimize bandwidth 2753 efficiency, endpoints SHOULD use Packetization Layer PMTU Discovery 2754 ([PLPMTUD]) and MAY use PMTU Discovery ([PMTUDv4], [PMTUDv6]) for 2755 detecting the PMTU, setting the PMTU appropriately, and storing the 2756 result of previous PMTU determinations. 2758 In the absence of these mechanisms, QUIC endpoints SHOULD NOT send IP 2759 packets larger than 1280 octets. Assuming the minimum IP header 2760 size, this results in a QUIC packet size of 1232 octets for IPv6 and 2761 1252 octets for IPv4. 2763 QUIC endpoints that implement any kind of PMTU discovery SHOULD 2764 maintain an estimate for each combination of local and remote IP 2765 addresses (as each pairing could have a different maximum MTU in the 2766 path). 2768 QUIC depends on the network path supporting an MTU of at least 1280 2769 octets. This is the IPv6 minimum and therefore also supported by 2770 most modern IPv4 networks. An endpoint MUST NOT reduce its MTU 2771 below this number, even if it receives signals that indicate a 2772 smaller limit might exist. 2774 Clients MUST ensure that the first packet in a connection, and any 2775 retransmissions of those octets, has a QUIC packet size of at least 1200 2776 octets. The packet size for a QUIC packet includes the QUIC header 2777 and integrity check, but not the UDP or IP header. 2779 The initial client packet SHOULD be padded to exactly 1200 octets 2780 unless the client has a reasonable assurance that the PMTU is larger. 2781 Sending a packet of this size ensures that the network path supports 2782 an MTU of this size and helps reduce the amplitude of amplification 2783 attacks caused by server responses toward an unverified client 2784 address. 2786 Servers MUST ignore an initial plaintext packet from a client if its 2787 total size is less than 1200 octets.
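The size arithmetic in the preceding paragraphs can be checked with a short sketch. This is only an illustration: the constant and function names are hypothetical, and the header sizes assume the minimum IPv4 header (20 octets), the fixed IPv6 header (40 octets), and the 8-octet UDP header.

```python
# Illustrative constants; the names are hypothetical, the values come from
# the text above and from the standard IP and UDP header sizes.
DEFAULT_MAX_IP_PACKET = 1280  # octets, used when no PMTU discovery is available
IPV4_MIN_HEADER = 20          # octets, IPv4 header without options
IPV6_HEADER = 40              # octets, fixed IPv6 header
UDP_HEADER = 8                # octets
MIN_INITIAL_PACKET = 1200     # octets, minimum size of the first client packet


def max_quic_packet_size(ip_version, pmtu=DEFAULT_MAX_IP_PACKET):
    """Largest QUIC packet that fits in an IP packet of `pmtu` octets."""
    ip_header = {4: IPV4_MIN_HEADER, 6: IPV6_HEADER}[ip_version]
    return pmtu - ip_header - UDP_HEADER


def server_accepts_initial(quic_packet_size):
    """A server ignores an initial packet smaller than 1200 octets."""
    return quic_packet_size >= MIN_INITIAL_PACKET
```

With the default of 1280 octets this reproduces the 1232-octet (IPv6) and 1252-octet (IPv4) figures quoted above.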
2789 If a QUIC endpoint determines that the PMTU between any pair of local 2790 and remote IP addresses has fallen below 1280 octets, it MUST 2791 immediately cease sending QUIC packets on the affected path. This 2792 could result in termination of the connection if an alternative path 2793 cannot be found. 2795 A sender bundles one or more frames in a Regular QUIC packet (see 2796 Section 6). 2798 A sender SHOULD minimize per-packet bandwidth and computational costs 2799 by bundling as many frames as possible within a QUIC packet. A 2800 sender MAY wait for a short period of time to bundle multiple frames 2801 before sending a packet that is not maximally packed, to avoid 2802 sending out large numbers of small packets. An implementation may 2803 use heuristics about expected application sending behavior to 2804 determine whether and for how long to wait. This waiting period is 2805 an implementation decision, and an implementation should be careful 2806 to delay conservatively, since any delay is likely to increase 2807 application-visible latency. 2809 Regular QUIC packets are "containers" of frames; a packet is never 2810 retransmitted whole. How an endpoint handles the loss of the frame 2811 depends on the type of the frame. Some frames are simply 2812 retransmitted, some have their contents moved to new frames, and 2813 others are never retransmitted. 2815 When a packet is detected as lost, the sender re-sends any frames as 2816 necessary: 2818 o All application data sent in STREAM frames MUST be retransmitted, 2819 unless the endpoint has sent a RST_STREAM for that stream. When 2820 an endpoint sends a RST_STREAM frame, data outstanding on that 2821 stream SHOULD NOT be retransmitted, since subsequent data on this 2822 stream is expected to not be delivered by the receiver. 2824 o ACK and PADDING frames MUST NOT be retransmitted. ACK frames 2825 containing updated information will be sent as described in 2826 Section 8.16. 
2828 o STOP_SENDING frames MUST be retransmitted until the receive stream 2829 enters either a "Data Recvd" or "Reset Recvd" state. See 2830 Section 10.3. 2832 o The most recent MAX_STREAM_DATA frame for a stream MUST be 2833 retransmitted until the receive stream enters a "Size Known" 2834 state. Any previous unacknowledged MAX_STREAM_DATA frame for the 2835 same stream SHOULD NOT be retransmitted since a newer 2836 MAX_STREAM_DATA frame for a stream obviates the need for 2837 delivering older ones. Similarly, the most recent MAX_DATA frame 2838 MUST be retransmitted; previous unacknowledged ones SHOULD NOT be 2839 retransmitted. 2841 o BLOCKED, STREAM_BLOCKED, and STREAM_ID_BLOCKED frames SHOULD be 2842 retransmitted if the sender is still blocked on the same limit. 2843 If the limit has been increased since the frame was originally 2844 sent, the frame SHOULD NOT be retransmitted. 2846 o All other frames MUST be retransmitted. 2848 Upon detecting losses, a sender MUST take appropriate congestion 2849 control action. The details of loss detection and congestion control 2850 are described in [QUIC-RECOVERY]. 2852 A packet MUST NOT be acknowledged until packet protection has been 2853 successfully removed and all frames contained in the packet have been 2854 processed. For STREAM frames, this means the data has been queued 2855 (but not necessarily delivered to the application). This also means 2856 that any stream state transitions triggered by STREAM or RST_STREAM 2857 frames have occurred. Once the packet has been fully processed, a 2858 receiver acknowledges receipt by sending one or more ACK frames 2859 containing the packet number of the received packet. 2861 To avoid creating an indefinite feedback loop, an endpoint MUST NOT 2862 send an ACK frame in response to a packet containing only ACK or 2863 PADDING frames, even if there are packet gaps which precede the 2864 received packet. 
The endpoint MUST acknowledge packets containing 2865 only ACK or PADDING frames in the next ACK frame that it sends. 2867 Strategies and implications of the frequency of generating 2868 acknowledgments are discussed in more detail in [QUIC-RECOVERY]. 2870 9.1. Special Considerations for PMTU Discovery 2872 Traditional ICMP-based path MTU discovery in IPv4 [RFC1191] is 2873 potentially vulnerable to off-path attacks that successfully guess 2874 the IP/port 4-tuple and reduce the MTU to a bandwidth-inefficient 2875 value. TCP connections mitigate this risk by using the (at minimum) 2876 8 bytes of transport header echoed in the ICMP message to validate 2877 the TCP sequence number as valid for the current connection. 2878 However, as QUIC operates over UDP, in IPv4 the echoed information 2879 could consist only of the IP and UDP headers, which usually has 2880 insufficient entropy to mitigate off-path attacks. 2882 As a result, endpoints that implement PMTUD in IPv4 SHOULD take steps 2883 to mitigate this risk. For instance, an application could: 2885 o Set the IPv4 Don't Fragment (DF) bit on a small proportion of 2886 packets, so that most invalid ICMP messages arrive when there are 2887 no DF packets outstanding, and can therefore be identified as 2888 spurious. 2890 o Store additional information from the IP or UDP headers from DF 2891 packets (for example, the IP ID or UDP checksum) to further 2892 authenticate incoming Datagram Too Big messages. 2894 o Any reduction in PMTU due to a report contained in an ICMP packet 2895 is provisional until QUIC's loss detection algorithm determines 2896 that the packet is actually lost. 2898 10. Streams: QUIC's Data Structuring Abstraction 2900 Streams in QUIC provide a lightweight, ordered byte-stream 2901 abstraction. 2903 There are two basic types of stream in QUIC. Unidirectional streams 2904 carry data in one direction only; bidirectional streams allow for 2905 data to be sent in both directions. 
Different stream identifiers are 2906 used to distinguish between unidirectional and bidirectional streams, 2907 as well as to create a separation between streams that are initiated 2908 by the client and server (see Section 10.1). 2910 Either type of stream can be created by either endpoint, can 2911 concurrently send data interleaved with other streams, and can be 2912 cancelled. 2914 Data that is received on a stream is delivered in order within that 2915 stream, but there is no particular delivery order across streams. 2916 Transmit ordering among streams is left to the implementation. 2918 The creation and destruction of streams are expected to have minimal 2919 bandwidth and computational cost. A single STREAM frame may create, 2920 carry data for, and terminate a stream, or a stream may last the 2921 entire duration of a connection. 2923 Streams are individually flow controlled, allowing an endpoint to 2924 limit memory commitment and to apply back pressure. The creation of 2925 streams is also flow controlled, with each peer declaring the maximum 2926 stream ID it is willing to accept at a given time. 2928 An alternative view of QUIC streams is as an elastic "message" 2929 abstraction, similar to the way ephemeral streams are used in SST 2930 [SST], which may be a more appealing description for some 2931 applications. 2933 10.1. Stream Identifiers 2935 Streams are identified by an unsigned 62-bit integer, referred to as 2936 the Stream ID. The least significant two bits of the Stream ID are 2937 used to identify the type of stream (unidirectional or bidirectional) 2938 and the initiator of the stream. 2940 The least significant bit (0x1) of the Stream ID identifies the 2941 initiator of the stream. Clients initiate even-numbered streams 2942 (those with the least significant bit set to 0); servers initiate 2943 odd-numbered streams (with the bit set to 1). 
Separation of the 2944 stream identifiers ensures that client and server are able to open 2945 streams without the latency imposed by negotiating for an identifier. 2947 If an endpoint receives a frame for a stream that it expects to 2948 initiate (i.e., odd-numbered for the client or even-numbered for the 2949 server), but which it has not yet opened, it MUST close the 2950 connection with error code STREAM_STATE_ERROR. 2952 The second least significant bit (0x2) of the Stream ID 2953 differentiates between unidirectional streams and bidirectional 2954 streams. Unidirectional streams always have this bit set to 1 and 2955 bidirectional streams have this bit set to 0. 2957 The two type bits from a Stream ID therefore identify streams as 2958 summarized in Table 5. 2960 +----------+----------------------------------+ 2961 | Low Bits | Stream Type | 2962 +----------+----------------------------------+ 2963 | 0x0 | Client-Initiated, Bidirectional | 2964 | | | 2965 | 0x1 | Server-Initiated, Bidirectional | 2966 | | | 2967 | 0x2 | Client-Initiated, Unidirectional | 2968 | | | 2969 | 0x3 | Server-Initiated, Unidirectional | 2970 +----------+----------------------------------+ 2972 Table 5: Stream ID Types 2974 Stream ID 0 (0x0) is a client-initiated, bidirectional stream that is 2975 used for the cryptographic handshake. Stream 0 MUST NOT be used for 2976 application data. 2978 A QUIC endpoint MUST NOT reuse a Stream ID. Open streams can be used 2979 in any order. Streams that are used out of order result in opening 2980 all lower-numbered streams of the same type in the same direction. 2982 Stream IDs are encoded as a variable-length integer (see 2983 Section 8.1). 2985 10.2. Stream States 2987 This section describes the two types of QUIC stream in terms of the 2988 states of their send or receive components. 
Two state machines are 2989 described: one for streams on which an endpoint transmits data 2990 (Section 10.2.1); another for streams from which an endpoint receives 2991 data (Section 10.2.2). 2993 Unidirectional streams use the applicable state machine directly. 2994 Bidirectional streams use both state machines. For the most part, 2995 the use of these state machines is the same whether the stream is 2996 unidirectional or bidirectional. The conditions for opening a stream 2997 are slightly more complex for a bidirectional stream because the 2998 opening of either send or receive causes the stream to open in both 2999 directions. 3001 Opening a stream causes all lower-numbered streams of the same type 3002 to implicitly open. This includes both send and receive streams if 3003 the stream is bidirectional. For bidirectional streams, an endpoint 3004 can send data on an implicitly opened stream. On both unidirectional 3005 and bidirectional streams, an endpoint MAY send MAX_STREAM_DATA or 3006 STOP_SENDING on implicitly opened streams. An endpoint SHOULD NOT 3007 implicitly open streams that it initiates, instead opening streams in 3008 order. 3010 Note: These states are largely informative. This document uses 3011 stream states to describe rules for when and how different types 3012 of frames can be sent and the reactions that are expected when 3013 different types of frames are received. Though these state 3014 machines are intended to be useful in implementing QUIC, these 3015 states aren't intended to constrain implementations. An 3016 implementation can define a different state machine as long as its 3017 behavior is consistent with an implementation that implements 3018 these states. 3020 10.2.1. Send Stream States 3022 Figure 10 shows the states for the part of a stream that sends data 3023 to a peer. 3025 o 3026 | Application Open 3027 | Open Paired Stream (bidirectional) 3028 v 3029 +-------+ 3030 | Open | Send RST_STREAM 3031 | |-----------------------. 
3032 +-------+ | 3033 | | 3034 | Send STREAM / | 3035 | STREAM_BLOCKED | 3036 v | 3037 +-------+ | 3038 | Send | Send RST_STREAM | 3039 | |---------------------->| 3040 +-------+ | 3041 | | 3042 | Send STREAM + FIN | 3043 v v 3044 +-------+ +-------+ 3045 | Data | Send RST_STREAM | Reset | 3046 | Sent +------------------>| Sent | 3047 +-------+ +-------+ 3048 | | 3049 | Recv All ACKs | Recv ACK 3050 v v 3051 +-------+ +-------+ 3052 | Data | | Reset | 3053 | Recvd | | Recvd | 3054 +-------+ +-------+ 3056 Figure 10: States for Send Streams 3058 The sending part of a stream that the endpoint initiates (types 0 and 2 3059 for clients, 1 and 3 for servers) is opened by the application or 3060 application protocol. The "Open" state represents a newly created 3061 stream that is able to accept data from the application. Stream data 3062 might be buffered in this state in preparation for sending. 3064 The sending part of a bidirectional stream initiated by a peer (type 3065 0 for a server, type 1 for a client) enters the "Open" state if the 3066 receiving part enters the "Recv" state. 3068 Sending the first STREAM or STREAM_BLOCKED frame causes a send stream 3069 to enter the "Send" state. An implementation might choose to defer 3070 allocating a Stream ID to a send stream until it sends the first 3071 frame and enters this state, which can allow for better stream 3072 prioritization. 3074 In the "Send" state, an endpoint transmits - and retransmits as 3075 necessary - data in STREAM frames. The endpoint respects the flow 3076 control limits of its peer, accepting MAX_STREAM_DATA frames. An 3077 endpoint in the "Send" state generates STREAM_BLOCKED frames if it 3078 encounters flow control limits. 3080 After the application indicates that stream data is complete and a 3081 STREAM frame containing the FIN bit is sent, the send stream enters 3082 the "Data Sent" state. From this state, the endpoint only 3083 retransmits stream data as necessary.
The endpoint no longer needs 3084 to track flow control limits or send STREAM_BLOCKED frames for a send 3085 stream in this state. The endpoint can ignore any MAX_STREAM_DATA 3086 frames it receives from its peer in this state; MAX_STREAM_DATA 3087 frames might be received until the peer receives the final stream 3088 offset. 3090 Once all stream data has been successfully acknowledged, the send 3091 stream enters the "Data Recvd" state, which is a terminal state. 3093 From any of the "Open", "Send", or "Data Sent" states, an application 3094 can signal that it wishes to abandon transmission of stream data. 3095 Similarly, the endpoint might receive a STOP_SENDING frame from its 3096 peer. In either case, the endpoint sends a RST_STREAM frame, which 3097 causes the stream to enter the "Reset Sent" state. 3099 An endpoint MAY send a RST_STREAM as the first frame on a send 3100 stream; this causes the send stream to open and then immediately 3101 transition to the "Reset Sent" state. 3103 Once a packet containing a RST_STREAM has been acknowledged, the send 3104 stream enters the "Reset Recvd" state, which is a terminal state. 3106 10.2.2. Receive Stream States 3108 Figure 11 shows the states for the part of a stream that receives 3109 data from a peer. The states for a receive stream mirror only some 3110 of the states of the send stream at the peer. A receive stream 3111 doesn't track states on the send stream that cannot be observed, such 3112 as the "Open" state; instead, receive streams track the delivery of 3113 data to the application or application protocol some of which cannot 3114 be observed by the sender. 3116 o 3117 | Recv STREAM / STREAM_BLOCKED / RST_STREAM 3118 | Open Paired Stream (bidirectional) 3119 | Recv MAX_STREAM_DATA 3120 v 3121 +-------+ 3122 | Recv | Recv RST_STREAM 3123 | |-----------------------. 
3124 +-------+ | 3125 | | 3126 | Recv STREAM + FIN | 3127 v | 3128 +-------+ | 3129 | Size | Recv RST_STREAM | 3130 | Known +---------------------->| 3131 +-------+ | 3132 | | 3133 | Recv All Data | 3134 v v 3135 +-------+ +-------+ 3136 | Data | Recv RST_STREAM | Reset | 3137 | Recvd +<-- (optional) --->| Recvd | 3138 +-------+ +-------+ 3139 | | 3140 | App Read All Data | App Read RST 3141 v v 3142 +-------+ +-------+ 3143 | Data | | Reset | 3144 | Read | | Read | 3145 +-------+ +-------+ 3147 Figure 11: States for Receive Streams 3149 The receiving part of a stream initiated by a peer (types 1 and 3 for 3150 a client, or 0 and 2 for a server) is created when the first STREAM, 3151 STREAM_BLOCKED, RST_STREAM, or MAX_STREAM_DATA (bidirectional only, 3152 see below) is received for that stream. The initial state for a 3153 receive stream is "Recv". Receiving a RST_STREAM frame causes the 3154 receive stream to immediately transition to the "Reset Recvd" state. 3156 The receive stream enters the "Recv" state when the sending part of a 3157 bidirectional stream initiated by the endpoint (type 0 for a client, 3158 type 1 for a server) enters the "Open" state. 3160 A bidirectional stream also opens when a MAX_STREAM_DATA frame is 3161 received. Receiving a MAX_STREAM_DATA frame implies that the remote 3162 peer has opened the stream and is providing flow control credit. A 3163 MAX_STREAM_DATA frame might arrive before a STREAM or STREAM_BLOCKED 3164 frame if packets are lost or reordered. 3166 In the "Recv" state, the endpoint receives STREAM and STREAM_BLOCKED 3167 frames. Incoming data is buffered and reassembled into the correct 3168 order for delivery to the application. As data is consumed by the 3169 application and buffer space becomes available, the endpoint sends 3170 MAX_STREAM_DATA frames to allow the peer to send more data. 3172 When a STREAM frame with a FIN bit is received, the final offset (see 3173 Section 11.3) is known.
The receive stream enters the "Size Known" 3174 state. In this state, the endpoint no longer needs to send 3175 MAX_STREAM_DATA frames; it only receives any retransmissions of 3176 stream data. 3178 Once all data for the stream has been received, the receive stream 3179 enters the "Data Recvd" state. This might happen as a result of 3180 receiving the same STREAM frame that causes the transition to "Size 3181 Known". In this state, the endpoint has all stream data. Any STREAM 3182 or STREAM_BLOCKED frames it receives for the stream can be discarded. 3184 The "Data Recvd" state persists until stream data has been delivered 3185 to the application or application protocol. Once stream data has 3186 been delivered, the stream enters the "Data Read" state, which is a 3187 terminal state. 3189 Receiving a RST_STREAM frame in the "Recv" or "Size Known" states 3190 causes the stream to enter the "Reset Recvd" state. This might cause 3191 the delivery of stream data to the application to be interrupted. 3193 It is possible that all stream data is received when a RST_STREAM is 3194 received (that is, from the "Data Recvd" state). Similarly, it is 3195 possible for remaining stream data to arrive after receiving a 3196 RST_STREAM frame (the "Reset Recvd" state). An implementation can 3197 manage this situation as it chooses. Sending RST_STREAM 3198 means that an endpoint cannot guarantee delivery of stream data; 3199 however, there is no requirement that stream data not be delivered if 3200 a RST_STREAM is received. An implementation MAY interrupt delivery 3201 of stream data, discard any data that was not consumed, and signal 3202 the existence of the RST_STREAM immediately. Alternatively, the 3203 RST_STREAM signal might be suppressed or withheld if stream data is 3204 completely received. In the latter case, the receive stream 3205 effectively transitions to "Data Recvd" from "Reset Recvd".
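The mandatory arrows of Figure 11 can be sketched as a small table-driven state machine. This is a minimal illustration, not a prescribed design: the event names are hypothetical labels for the arrows in the figure, and the sketch makes one of the permitted implementation choices by suppressing a RST_STREAM that arrives in "Data Recvd" (the optional arrows between "Data Recvd" and "Reset Recvd" are omitted).

```python
from enum import Enum


class RecvState(Enum):
    RECV = "Recv"
    SIZE_KNOWN = "Size Known"
    DATA_RECVD = "Data Recvd"
    DATA_READ = "Data Read"
    RESET_RECVD = "Reset Recvd"
    RESET_READ = "Reset Read"


# Hypothetical event names, one per arrow in Figure 11.
TRANSITIONS = {
    (RecvState.RECV, "stream_fin"): RecvState.SIZE_KNOWN,
    (RecvState.RECV, "rst_stream"): RecvState.RESET_RECVD,
    (RecvState.SIZE_KNOWN, "all_data"): RecvState.DATA_RECVD,
    (RecvState.SIZE_KNOWN, "rst_stream"): RecvState.RESET_RECVD,
    (RecvState.DATA_RECVD, "app_read_all"): RecvState.DATA_READ,
    (RecvState.RESET_RECVD, "app_read_rst"): RecvState.RESET_READ,
}


def next_state(state, event):
    """Return the state after `event`; unlisted events leave it unchanged."""
    return TRANSITIONS.get((state, event), state)
```

Because unlisted (state, event) pairs are ignored, late STREAM frames or a RST_STREAM received after all data has arrived do not disturb the state in this sketch.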
3207 Once the signal indicating that the receive stream was reset has 3208 been delivered to the application, the receive stream transitions to the 3209 "Reset Read" state, which is a terminal state. 3211 10.2.3. Permitted Frame Types 3213 The sender of a stream sends just three frame types that affect the 3214 state of a stream at either sender or receiver: STREAM 3215 (Section 8.17), STREAM_BLOCKED (Section 8.11), and RST_STREAM 3216 (Section 8.3). 3218 A sender MUST NOT send any of these frames from a terminal state 3219 ("Data Recvd" or "Reset Recvd"). A sender MUST NOT send STREAM or 3220 STREAM_BLOCKED after sending a RST_STREAM; that is, in the "Reset 3221 Sent" state in addition to the terminal states. A receiver could 3222 receive any of these frames in any state, but only due to the 3223 possibility of delayed delivery of packets carrying them. 3225 The receiver of a stream sends MAX_STREAM_DATA (Section 8.7) and 3226 STOP_SENDING frames (Section 8.14). 3228 The receiver only sends MAX_STREAM_DATA in the "Recv" state. A 3229 receiver can send STOP_SENDING in any state where it has not received 3230 a RST_STREAM frame; that is, states other than "Reset Recvd" or "Reset 3231 Read". However, there is little value in sending a STOP_SENDING frame 3232 after all stream data has been received in the "Data Recvd" state. A 3233 sender could receive these frames in any state as a result of delayed 3234 delivery of packets. 3236 10.2.4. Bidirectional Stream States 3238 A bidirectional stream is composed of a send stream and a receive 3239 stream. Implementations may represent states of the bidirectional 3240 stream as composites of send and receive stream states. The simplest 3241 model presents the stream as "open" when either send or receive 3242 stream is in a non-terminal state and "closed" when both send and 3243 receive streams are in a terminal state.
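The simplest composite model described above might be sketched as follows. The function name is hypothetical; the terminal states are the ones named in Sections 10.2.1 and 10.2.2 ("Data Recvd" and "Reset Recvd" for send streams, "Data Read" and "Reset Read" for receive streams).

```python
# Terminal states of each half of a bidirectional stream, as named in the
# send and receive state machines.
SEND_TERMINAL = {"Data Recvd", "Reset Recvd"}
RECV_TERMINAL = {"Data Read", "Reset Read"}


def composite_state(send_state, recv_state):
    """Return "closed" only when both halves are terminal, else "open"."""
    if send_state in SEND_TERMINAL and recv_state in RECV_TERMINAL:
        return "closed"
    return "open"
```

Note that "Data Recvd" is terminal for a send stream but not for a receive stream, which is why the two sets are kept separate.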
3245 Table 6 shows a more complex mapping of bidirectional stream states 3246 that loosely correspond to the stream states in HTTP/2 [HTTP2]. This 3247 shows that multiple states on send or receive streams are mapped to 3248 the same composite state. Note that this is just one possibility for 3249 such a mapping; this mapping requires that data is acknowledged 3250 before the transition to a "closed" or "half-closed" state. 3252 +-----------------------+---------------------+---------------------+ 3253 | Send Stream | Receive Stream | Composite State | 3254 +-----------------------+---------------------+---------------------+ 3255 | No Stream/Open | No Stream/Recv *1 | idle | 3256 | | | | 3257 | Open/Send/Data Sent | Recv/Size Known | open | 3258 | | | | 3259 | Open/Send/Data Sent | Data Recvd/Data | half-closed | 3260 | | Read | (remote) | 3261 | | | | 3262 | Open/Send/Data Sent | Reset Recvd/Reset | half-closed | 3263 | | Read | (remote) | 3264 | | | | 3265 | Data Recvd | Recv/Size Known | half-closed (local) | 3266 | | | | 3267 | Reset Sent/Reset | Recv/Size Known | half-closed (local) | 3268 | Recvd | | | 3269 | | | | 3270 | Data Recvd | Recv/Size Known | half-closed (local) | 3271 | | | | 3272 | Reset Sent/Reset | Data Recvd/Data | closed | 3273 | Recvd | Read | | 3274 | | | | 3275 | Reset Sent/Reset | Reset Recvd/Reset | closed | 3276 | Recvd | Read | | 3277 | | | | 3278 | Data Recvd | Data Recvd/Data | closed | 3279 | | Read | | 3280 | | | | 3281 | Data Recvd | Reset Recvd/Reset | closed | 3282 | | Read | | 3283 +-----------------------+---------------------+---------------------+ 3285 Table 6: Possible Mapping of Stream States to HTTP/2 3287 Note (*1): A stream is considered "idle" if it has not yet been 3288 created, or if the receive stream is in the "Recv" state without 3289 yet having received any frames. 3291 10.3. 
Solicited State Transitions 3293 If an endpoint is no longer interested in the data it is receiving on 3294 a stream, it MAY send a STOP_SENDING frame identifying that stream to 3295 prompt closure of the stream in the opposite direction. This 3296 typically indicates that the receiving application is no longer 3297 reading data it receives from the stream, but is not a guarantee that 3298 incoming data will be ignored. 3300 STREAM frames received after sending STOP_SENDING are still counted 3301 toward the connection and stream flow-control windows, even though 3302 these frames will be discarded upon receipt. This avoids potential 3303 ambiguity about which STREAM frames count toward flow control. 3305 A STOP_SENDING frame requests that the receiving endpoint send a 3306 RST_STREAM frame. An endpoint that receives a STOP_SENDING frame 3307 MUST send a RST_STREAM frame for that stream, and can use an error 3308 code of STOPPING. If the STOP_SENDING frame is received on a send 3309 stream that is already in the "Data Sent" state, a RST_STREAM frame 3310 MAY still be sent in order to cancel retransmission of previously- 3311 sent STREAM frames. 3313 STOP_SENDING SHOULD only be sent for a receive stream that has not 3314 been reset. STOP_SENDING is most useful for streams in the "Recv" or 3315 "Size Known" states. 3317 An endpoint is expected to send another STOP_SENDING frame if a 3318 packet containing a previous STOP_SENDING is lost. However, once 3319 either all stream data or a RST_STREAM frame has been received for 3320 the stream - that is, the stream is in any state other than "Recv" or 3321 "Size Known" - sending a STOP_SENDING frame is unnecessary. 3323 10.4. Stream Concurrency 3325 An endpoint limits the number of concurrently active incoming streams 3326 by adjusting the maximum stream ID. An initial value is set in the 3327 transport parameters (see Section 7.4.1) and is subsequently 3328 increased by MAX_STREAM_ID frames (see Section 8.8). 
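As a rough sketch of this bookkeeping, the peer that is subject to the limit might track it as shown below. The class and method names are hypothetical. The sketch assumes a single limit is tracked per instance, and it relies on the rule stated later in this section that a MAX_STREAM_ID frame that does not increase the maximum is ignored.

```python
class StreamIdLimit:
    """Tracks the maximum stream ID a peer has allowed us to initiate."""

    def __init__(self, initial_max_stream_id):
        # Initial value taken from the peer's transport parameters.
        self.max_stream_id = initial_max_stream_id

    def on_max_stream_id(self, advertised):
        # Frames can arrive out of order, so only ever move the limit up.
        if advertised > self.max_stream_id:
            self.max_stream_id = advertised

    def may_open(self, stream_id):
        # A new stream is permitted only if its ID is within the limit.
        return stream_id <= self.max_stream_id
```

For example, with an initial limit of 8, stream 12 cannot be opened until a MAX_STREAM_ID frame advertising at least 12 has been processed; a later frame advertising 4 leaves the limit unchanged.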
3330 The maximum stream ID is specific to each endpoint and applies only 3331 to the peer that receives the setting. That is, clients specify the 3332 maximum stream ID the server can initiate, and servers specify the 3333 maximum stream ID the client can initiate. Each endpoint may respond 3334 on streams initiated by the other peer, regardless of whether it is 3335 permitted to initiate new streams. 3337 Endpoints MUST NOT exceed the limit set by their peer. An endpoint 3338 that receives a STREAM frame with an ID greater than the limit it has 3339 sent MUST treat this as a stream error of type STREAM_ID_ERROR 3340 (Section 12), unless this is a result of a change in the initial 3341 offsets (see Section 7.4.2). 3343 A receiver MUST NOT renege on an advertisement; that is, once a 3344 receiver advertises a stream ID via a MAX_STREAM_ID frame, it MUST 3345 NOT subsequently advertise a smaller maximum ID. A sender may 3346 receive MAX_STREAM_ID frames out of order; a sender MUST therefore 3347 ignore any MAX_STREAM_ID that does not increase the maximum. 3349 10.5. Sending and Receiving Data 3351 Once a stream is created, endpoints may use the stream to send and 3352 receive data. Each endpoint may send a series of STREAM frames 3353 encapsulating data on a stream until the stream is terminated in that 3354 direction. Streams are an ordered byte-stream abstraction, and they 3355 have no other structure within them. STREAM frame boundaries are not 3356 expected to be preserved in retransmissions from the sender or during 3357 delivery to the application at the receiver. 3359 When new data is to be sent on a stream, a sender MUST set the 3360 encapsulating STREAM frame's offset field to the stream offset of the 3361 first byte of this new data. The first byte of data that is sent on 3362 a stream has the stream offset 0. The largest offset delivered on a 3363 stream MUST be less than 2^62.
A receiver MUST ensure that received 3364 stream data is delivered to the application as an ordered byte- 3365 stream. Data received out of order MUST be buffered for later 3366 delivery, as long as it is not in violation of the receiver's flow 3367 control limits. 3369 An endpoint MUST NOT send data on any stream without ensuring that it 3370 is within the data limits set by its peer. The cryptographic 3371 handshake stream, Stream 0, is exempt from the connection-level data 3372 limits established by MAX_DATA. Data on stream 0 other than the 3373 initial cryptographic handshake message is still subject to stream- 3374 level data limits and MAX_STREAM_DATA. This message is exempt from 3375 flow control because it needs to be sent in a single packet 3376 regardless of the server's flow control state. This rule applies 3377 even for 0-RTT handshakes where the remembered value of 3378 MAX_STREAM_DATA would not permit sending a full initial cryptographic 3379 handshake message. 3381 Flow control is described in detail in Section 11, and congestion 3382 control is described in the companion document [QUIC-RECOVERY]. 3384 10.6. Stream Prioritization 3386 Stream multiplexing has a significant effect on application 3387 performance if resources allocated to streams are correctly 3388 prioritized. Experience with other multiplexed protocols, such as 3389 HTTP/2 [HTTP2], shows that effective prioritization strategies have a 3390 significant positive impact on performance. 3392 QUIC does not provide frames for exchanging prioritization 3393 information. Instead it relies on receiving priority information 3394 from the application that uses QUIC. Protocols that use QUIC are 3395 able to define any prioritization scheme that suits their application 3396 semantics. 
A protocol might define explicit messages for signaling 3397 priority, such as those defined in HTTP/2; it could define rules that 3398 allow an endpoint to determine priority based on context; or it could 3399 leave the determination to the application. 3401 A QUIC implementation SHOULD provide ways in which an application can 3402 indicate the relative priority of streams. When deciding which 3403 streams to dedicate resources to, QUIC SHOULD use the information 3404 provided by the application. Failure to account for priority of 3405 streams can result in suboptimal performance. 3407 Stream priority is most relevant when deciding which stream data will 3408 be transmitted. Often, there will be limits on what can be 3409 transmitted as a result of connection flow control or the current 3410 congestion controller state. 3412 Giving preference to the transmission of its own management frames 3413 ensures that the protocol functions efficiently. That is, 3414 prioritizing frames other than STREAM frames ensures that loss 3415 recovery, congestion control, and flow control operate effectively. 3417 Stream 0 MUST be prioritized over other streams prior to the 3418 completion of the cryptographic handshake. This includes the 3419 retransmission of the second flight of client handshake messages, 3420 that is, the TLS Finished and any client authentication messages. 3422 STREAM frames that are determined to be lost SHOULD be retransmitted 3423 before sending new data, unless application priorities indicate 3424 otherwise. Retransmitting lost stream data can fill in gaps, which 3425 allows the peer to consume already received data and free up flow 3426 control window. 3428 11. Flow Control 3430 It is necessary to limit the amount of data that a sender may have 3431 outstanding at any time, so as to prevent a fast sender from 3432 overwhelming a slow receiver, or to prevent a malicious sender from 3433 consuming significant resources at a receiver. 
This section 3434 describes QUIC's flow-control mechanisms. 3436 QUIC employs a credit-based flow-control scheme similar to HTTP/2's 3437 flow control [HTTP2]. A receiver advertises the number of octets it 3438 is prepared to receive on a given stream and for the entire 3439 connection. This leads to two levels of flow control in QUIC: (i) 3440 Connection flow control, which prevents senders from exceeding a 3441 receiver's buffer capacity for the connection, and (ii) Stream flow 3442 control, which prevents a single stream from consuming the entire 3443 receive buffer for a connection. 3445 A data receiver sends MAX_STREAM_DATA or MAX_DATA frames to the 3446 sender to advertise additional credit. MAX_STREAM_DATA frames carry 3447 the maximum absolute byte offset of a stream, while MAX_DATA 3448 carries the maximum sum of the absolute byte offsets of all streams 3449 other than stream 0. 3451 A receiver MAY advertise a larger offset at any point by sending 3452 MAX_DATA or MAX_STREAM_DATA frames. A receiver MUST NOT renege on an 3453 advertisement; that is, once a receiver advertises an offset, it MUST 3454 NOT subsequently advertise a smaller offset. A sender could receive 3455 MAX_DATA or MAX_STREAM_DATA frames out of order; a sender MUST 3456 therefore ignore any flow control offset that does not move the 3457 window forward. 3459 A receiver MUST close the connection with a FLOW_CONTROL_ERROR error 3460 (Section 12) if the peer violates the advertised connection or stream 3461 data limits. 3463 A sender SHOULD send BLOCKED or STREAM_BLOCKED frames to indicate it 3464 has data to write but is blocked by flow control limits. These 3465 frames are expected to be sent infrequently in common cases, but they 3466 are considered useful for debugging and monitoring purposes. 3468 A receiver advertises credit for a stream by sending a 3469 MAX_STREAM_DATA frame with the Stream ID set appropriately.
A 3470 receiver could use the current offset of data consumed to determine 3471 the flow control offset to be advertised. A receiver MAY send 3472 MAX_STREAM_DATA frames in multiple packets in order to make sure that 3473 the sender receives an update before running out of flow control 3474 credit, even if one of the packets is lost. 3476 Connection flow control is a limit to the total bytes of stream data 3477 sent in STREAM frames on all streams. A receiver advertises credit 3478 for a connection by sending a MAX_DATA frame. A receiver maintains a 3479 cumulative sum of bytes received on all streams, which is used to 3480 check for flow control violations. A receiver might use a sum of 3481 bytes consumed on all contributing streams to determine the maximum 3482 data limit to be advertised. 3484 11.1. Edge Cases and Other Considerations 3486 There are some edge cases which must be considered when dealing with 3487 stream and connection level flow control. Given enough time, both 3488 endpoints must agree on flow control state. If one end believes it 3489 can send more than the other end is willing to receive, the 3490 connection will be torn down when too much data arrives. 3492 Conversely, if a sender believes it is blocked while the receiver 3493 expects more data can be received, then the connection can be in a 3494 deadlock, with the sender waiting for a MAX_DATA or MAX_STREAM_DATA 3495 frame which will never come. 3497 On receipt of a RST_STREAM frame, an endpoint will tear down state 3498 for the matching stream and ignore further data arriving on that 3499 stream. This could result in the endpoints getting out of sync, 3500 since the RST_STREAM frame may have arrived out of order and there 3501 may be further bytes in flight. The data sender would have counted 3502 the data against its connection level flow control budget, but a 3503 receiver that has not received these bytes would not know to include 3504 them as well.
The receiver must learn the number of bytes that were 3505 sent on the stream to make the same adjustment in its connection flow 3506 controller. 3508 To avoid this de-synchronization, a RST_STREAM sender MUST include 3509 the final byte offset sent on the stream in the RST_STREAM frame. On 3510 receiving a RST_STREAM frame, a receiver definitively knows how many 3511 bytes were sent on that stream before the RST_STREAM frame, and the 3512 receiver MUST use the final offset to account for all bytes sent on 3513 the stream in its connection level flow controller. 3515 11.1.1. Response to a RST_STREAM 3517 RST_STREAM terminates one direction of a stream abruptly. Whether 3518 any action or response can or should be taken on the data already 3519 received is an application-specific issue, but it will often be the 3520 case that upon receipt of a RST_STREAM an endpoint will choose to 3521 stop sending data in its own direction. If the sender of a 3522 RST_STREAM wishes to explicitly state that no future data will be 3523 processed, that endpoint MAY send a STOP_SENDING frame at the same 3524 time. 3526 11.1.2. Data Limit Increments 3528 This document leaves when and how many bytes to advertise in a 3529 MAX_DATA or MAX_STREAM_DATA frame to implementations, but offers a few 3530 considerations. These frames contribute to connection overhead. 3531 Therefore, frequently sending frames with small changes is 3532 undesirable. At the same time, infrequent updates require larger 3533 increments to limits if blocking is to be avoided. Larger 3534 updates in turn require a receiver to commit more resources. 3535 There is therefore a tradeoff between resource commitment and overhead 3536 when determining how large a limit is advertised.
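The tradeoff above can be illustrated with a minimal sketch. It assumes one possible policy, which this document does not mandate: the receiver stays silent until half of its window has been consumed, then advertises a limit one full window beyond the consumed offset. The class name ReceiveWindow, its fields, and the one-half threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReceiveWindow:
    """Hypothetical receiver-side bookkeeping for one flow control limit."""

    window: int          # bytes of buffer the receiver is willing to commit
    consumed: int = 0    # bytes the application has read so far
    advertised: int = 0  # largest limit sent to the peer so far

    def __post_init__(self) -> None:
        # Assumption: the initial limit equals one full window.
        if self.advertised == 0:
            self.advertised = self.window

    def on_consumed(self, nbytes: int) -> Optional[int]:
        """Record application reads; return a new limit to advertise, if any."""
        self.consumed += nbytes
        remaining = self.advertised - self.consumed
        # Assumed policy: stay quiet until half of the window is used up.
        # This keeps frame overhead low at the cost of larger increments.
        if remaining > self.window // 2:
            return None
        new_limit = self.consumed + self.window
        # Never renege: only ever advertise a larger limit.
        if new_limit <= self.advertised:
            return None
        self.advertised = new_limit
        return new_limit
```

A caller would place any returned value in a MAX_DATA or MAX_STREAM_DATA frame. A smaller threshold sends fewer frames but raises the risk that the sender becomes blocked before the update arrives.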
3538 A receiver MAY use an autotuning mechanism to tune the frequency and 3539 amount that it increases data limits based on a roundtrip time 3540 estimate and the rate at which the receiving application consumes 3541 data, similar to common TCP implementations. 3543 11.2. Stream Limit Increment 3545 As with flow control, this document leaves when and how many streams 3546 to make available to a peer via MAX_STREAM_ID to implementations, but 3547 offers a few considerations. MAX_STREAM_ID frames constitute minimal 3548 overhead, while withholding MAX_STREAM_ID frames can prevent the peer 3549 from using the available parallelism. 3551 Implementations will likely want to increase the maximum stream ID as 3552 peer-initiated streams close. A receiver MAY also advance the 3553 maximum stream ID based on current activity, system conditions, and 3554 other environmental factors. 3556 11.2.1. Blocking on Flow Control 3558 If a sender does not receive a MAX_DATA or MAX_STREAM_DATA frame when 3559 it has run out of flow control credit, the sender will be blocked and 3560 SHOULD send a BLOCKED or STREAM_BLOCKED frame. These frames are 3561 expected to be useful for debugging at the receiver; they do not 3562 require any other action. A receiver SHOULD NOT wait for a BLOCKED 3563 or STREAM_BLOCKED frame before sending MAX_DATA or MAX_STREAM_DATA, 3564 since doing so will mean that a sender is unable to send for an 3565 entire round trip. 3567 For smooth operation of the congestion controller, it is generally 3568 considered best to not let the sender go into quiescence if 3569 avoidable. To avoid blocking a sender, and to reasonably account for 3570 the possibility of loss, a receiver should send a MAX_DATA or 3571 MAX_STREAM_DATA frame at least two roundtrips before it expects the 3572 sender to get blocked. 3574 A sender sends a BLOCKED or STREAM_BLOCKED frame only once 3575 when it reaches a data limit.
A sender SHOULD NOT send multiple 3576 BLOCKED or STREAM_BLOCKED frames for the same data limit, unless the 3577 original frame is determined to be lost. Another BLOCKED or 3578 STREAM_BLOCKED frame can be sent after the data limit is increased. 3580 11.3. Stream Final Offset 3582 The final offset is the count of the number of octets that are 3583 transmitted on a stream. For a stream that is reset, the final 3584 offset is carried explicitly in a RST_STREAM frame. Otherwise, the 3585 final offset is the offset of the end of the data carried in a STREAM 3586 frame marked with a FIN flag, or 0 in the case of incoming 3587 unidirectional streams. 3589 An endpoint will know the final offset for a stream when the receive 3590 stream enters the "Size Known" or "Reset Recvd" state. 3592 An endpoint MUST NOT send data on a stream at or beyond the final 3593 offset. 3595 Once a final offset for a stream is known, it cannot change. If a 3596 RST_STREAM or STREAM frame causes the final offset to change for a 3597 stream, an endpoint SHOULD respond with a FINAL_OFFSET_ERROR error 3598 (see Section 12). A receiver SHOULD treat receipt of data at or 3599 beyond the final offset as a FINAL_OFFSET_ERROR error, even after a 3600 stream is closed. Generating these errors is not mandatory, but only 3601 because requiring that an endpoint generate these errors also means 3602 that the endpoint needs to maintain the final offset state for closed 3603 streams, which could mean a significant state commitment. 3605 12. Error Handling 3607 An endpoint that detects an error SHOULD signal the existence of that 3608 error to its peer. Errors can affect an entire connection (see 3609 Section 12.1), or a single stream (see Section 12.2). 3611 The most appropriate error code (Section 12.3) SHOULD be included in 3612 the frame that signals the error. Where this specification 3613 identifies error conditions, it also identifies the error code that 3614 is used. 
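As an illustration of how an error condition is detected and mapped to its code, the final offset rules of Section 11.3 can be sketched as follows. This is a minimal sketch, not a definitive implementation: the FinalOffsetTracker class and its method names are hypothetical, and only the error code value 0x6 is taken from Section 12.3.

```python
from typing import Optional

FINAL_OFFSET_ERROR = 0x6  # transport error code listed in Section 12.3


class FinalOffsetTracker:
    """Hypothetical per-stream receive state for final offset checks."""

    def __init__(self) -> None:
        self.highest_received = 0                # end of the furthest data seen
        self.final_offset: Optional[int] = None  # unknown until FIN or RST_STREAM

    def _set_final(self, candidate: int) -> Optional[int]:
        # Once known, the final offset cannot change.
        if self.final_offset is not None and candidate != self.final_offset:
            return FINAL_OFFSET_ERROR
        # A final offset below data that was already received is invalid.
        if candidate < self.highest_received:
            return FINAL_OFFSET_ERROR
        self.final_offset = candidate
        return None

    def on_stream_frame(self, offset: int, length: int, fin: bool) -> Optional[int]:
        """Return an error code for a violation, or None if the frame is valid."""
        end = offset + length
        # Data extending past a known final offset is an error.
        if self.final_offset is not None and end > self.final_offset:
            return FINAL_OFFSET_ERROR
        if fin:
            err = self._set_final(end)
            if err is not None:
                return err
        self.highest_received = max(self.highest_received, end)
        return None

    def on_rst_stream(self, final_offset: int) -> Optional[int]:
        """RST_STREAM carries the final offset explicitly."""
        return self._set_final(final_offset)
```

Because generating this error is only a SHOULD, an implementation that discards state for closed streams could skip these checks once the tracker is freed.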
3616 A stateless reset (Section 7.9.4) is not suitable for any error that 3617 can be signaled with a CONNECTION_CLOSE, APPLICATION_CLOSE, or 3618 RST_STREAM frame. A stateless reset MUST NOT be used by an endpoint 3619 that has the state necessary to send a frame on the connection. 3621 12.1. Connection Errors 3623 Errors that result in the connection being unusable, such as an 3624 obvious violation of protocol semantics or corruption of state that 3625 affects an entire connection, MUST be signaled using a 3626 CONNECTION_CLOSE or APPLICATION_CLOSE frame (Section 8.4, 3627 Section 8.5). An endpoint MAY close the connection in this manner 3628 even if the error only affects a single stream. 3630 Application protocols can signal application-specific protocol errors 3631 using the APPLICATION_CLOSE frame. Errors that are specific to the 3632 transport, including all those described in this document, are 3633 carried in a CONNECTION_CLOSE frame. Other than the type of error 3634 code they carry, these frames are identical in format and semantics. 3636 A CONNECTION_CLOSE or APPLICATION_CLOSE frame could be sent in a 3637 packet that is lost. An endpoint SHOULD be prepared to retransmit a 3638 packet containing either frame type if it receives more packets on a 3639 terminated connection. Limiting the number of retransmissions and 3640 the time over which this final packet is sent limits the effort 3641 expended on terminated connections. 3643 An endpoint that chooses not to retransmit packets containing 3644 CONNECTION_CLOSE or APPLICATION_CLOSE risks a peer missing the first 3645 such packet. The only mechanism available to an endpoint that 3646 continues to receive data for a terminated connection is to use the 3647 stateless reset process (Section 7.9.4). 3649 An endpoint that receives an invalid CONNECTION_CLOSE or 3650 APPLICATION_CLOSE frame MUST NOT signal the existence of the error to 3651 its peer. 3653 12.2. 
Stream Errors 3655 If the error affects a single stream, but otherwise leaves the 3656 connection in a recoverable state, the endpoint can send a RST_STREAM 3657 frame (Section 8.3) with an appropriate error code to terminate just 3658 the affected stream. 3660 Stream 0 is critical to the functioning of the entire connection. If 3661 stream 0 is closed with either a RST_STREAM or STREAM frame bearing 3662 the FIN flag, an endpoint MUST generate a connection error of type 3663 PROTOCOL_VIOLATION. 3665 RST_STREAM MUST be instigated by the application and MUST carry an 3666 application error code. Resetting a stream without knowledge of the 3667 application protocol could cause the protocol to enter an 3668 unrecoverable state. Application protocols might require certain 3669 streams to be reliably delivered in order to guarantee consistent 3670 state between endpoints. 3672 12.3. Transport Error Codes 3674 QUIC error codes are 16-bit unsigned integers. 3676 This section lists the defined QUIC transport error codes that may be 3677 used in a CONNECTION_CLOSE frame. These errors apply to the entire 3678 connection. 3680 NO_ERROR (0x0): An endpoint uses this with CONNECTION_CLOSE to 3681 signal that the connection is being closed abruptly in the absence 3682 of any error. 3684 INTERNAL_ERROR (0x1): The endpoint encountered an internal error and 3685 cannot continue with the connection. 3687 FLOW_CONTROL_ERROR (0x3): An endpoint received more data than it 3688 permitted in its advertised data limits (see Section 11). 3690 STREAM_ID_ERROR (0x4): An endpoint received a frame for a stream 3691 identifier that exceeded its advertised maximum stream ID. 3693 STREAM_STATE_ERROR (0x5): An endpoint received a frame for a stream 3694 that was not in a state that permitted that frame (see 3695 Section 10.2). 3697 FINAL_OFFSET_ERROR (0x6): An endpoint received a STREAM frame 3698 containing data that exceeded the previously established final 3699 offset. 
Or an endpoint received a RST_STREAM frame containing a 3700 final offset that was lower than the maximum offset of data that 3701 was already received. Or an endpoint received a RST_STREAM frame 3702 containing a different final offset to the one already 3703 established. 3705 FRAME_FORMAT_ERROR (0x7): An endpoint received a frame that was 3706 badly formatted. For instance, an empty STREAM frame that omitted 3707 the FIN flag, or an ACK frame that has more acknowledgment ranges 3708 than the remainder of the packet could carry. This is a generic 3709 error code; an endpoint SHOULD use the more specific frame format 3710 error codes (0x1XX) if possible. 3712 TRANSPORT_PARAMETER_ERROR (0x8): An endpoint received transport 3713 parameters that were badly formatted, included an invalid value, 3714 omitted a mandatory parameter, included a forbidden parameter, 3715 or were otherwise in error. 3717 VERSION_NEGOTIATION_ERROR (0x9): An endpoint received transport 3718 parameters that contained version negotiation parameters that 3719 disagreed with the version negotiation that it performed. This 3720 error code indicates a potential version downgrade attack. 3722 PROTOCOL_VIOLATION (0xA): An endpoint detected an error with 3723 protocol compliance that was not covered by more specific error 3724 codes. 3726 UNSOLICITED_PONG (0xB): An endpoint received a PONG frame that did 3727 not correspond to any PING frame that it previously sent. 3729 FRAME_ERROR (0x1XX): An endpoint detected an error in a specific 3730 frame type. The frame type is included as the last octet of the 3731 error code. For example, an error in a MAX_STREAM_ID frame would 3732 be indicated with the code (0x106). 3734 See Section 14.2 for details of registering new error codes. 3736 12.4. Application Protocol Error Codes 3738 Application protocol error codes are 16-bit unsigned integers, but 3739 the management of application error codes is left to application 3740 protocols.
Application protocol error codes are used for the 3741 RST_STREAM (Section 8.3) and APPLICATION_CLOSE (Section 8.5) frames. 3743 There is no restriction on the use of the 16-bit error code space for 3744 application protocols. However, QUIC reserves the error code with a 3745 value of 0 to mean STOPPING. The application error code of STOPPING 3746 (0) is used by the transport to cancel a stream in response to 3747 receipt of a STOP_SENDING frame. 3749 13. Security and Privacy Considerations 3751 13.1. Spoofed ACK Attack 3753 An attacker receives an STK from the server and then releases the IP 3754 address on which it received the STK. The attacker may, in the 3755 future, spoof this same address (which now presumably addresses a 3756 different endpoint), and initiate a 0-RTT connection with a server on 3757 the victim's behalf. The attacker then spoofs ACK frames to the 3758 server, which cause the server to potentially drown the victim in 3759 data. 3761 There are two possible mitigations to this attack. The simplest one 3762 is that a server can unilaterally create a gap in packet-number 3763 space. In the non-attack scenario, the client will send an ACK frame 3764 with the larger value for largest acknowledged. In the attack 3765 scenario, the attacker could acknowledge a packet in the gap. If the 3766 server sees an acknowledgment for a packet that was never sent, the 3767 connection can be aborted. 3769 The second mitigation is that the server can require that 3770 acknowledgments for sent packets match the encryption level of the 3771 sent packet. This mitigation is useful if the connection has an 3772 ephemeral forward-secure key that is generated and used for every new 3773 connection. If a packet sent is protected with a forward-secure key, 3774 then any acknowledgments that are received for it MUST also be 3775 forward-secure protected.
Since the attacker will not have the 3776 forward secure key, the attacker will not be able to generate 3777 forward-secure protected packets with ACK frames. 3779 13.2. Slowloris Attacks 3781 The attacks commonly known as Slowloris [SLOWLORIS] try to keep many 3782 connections to the target endpoint open and hold them open as long as 3783 possible. These attacks can be executed against a QUIC endpoint by 3784 generating the minimum amount of activity necessary to avoid being 3785 closed for inactivity. This might involve sending small amounts of 3786 data, gradually opening flow control windows in order to control the 3787 sender rate, or manufacturing ACK frames that simulate a high loss 3788 rate. 3790 QUIC deployments SHOULD provide mitigations for the Slowloris 3791 attacks, such as increasing the maximum number of clients the server 3792 will allow, limiting the number of connections a single IP address is 3793 allowed to make, imposing restrictions on the minimum transfer speed 3794 a connection is allowed to have, and restricting the length of time 3795 an endpoint is allowed to stay connected. 3797 13.3. Stream Fragmentation and Reassembly Attacks 3799 An adversarial endpoint might intentionally fragment the data on 3800 stream buffers in order to cause disproportionate memory commitment. 3801 An adversarial endpoint could open a stream and send some STREAM 3802 frames containing arbitrary fragments of the stream content. 3804 The attack is mitigated if flow control windows correspond to 3805 available memory. However, some receivers will over-commit memory 3806 and advertise flow control offsets in the aggregate that exceed 3807 actual available memory. The over-commitment strategy can lead to 3808 better performance when endpoints are well behaved, but renders 3809 endpoints vulnerable to the stream fragmentation attack. 3811 QUIC deployments SHOULD provide mitigations against the stream 3812 fragmentation attack. 
Mitigations could consist of avoiding over- 3813 committing memory, delaying reassembly of STREAM frames, implementing 3814 heuristics based on the age and duration of reassembly holes, or some 3815 combination. 3817 13.4. Stream Commitment Attack 3819 An adversarial endpoint can open lots of streams, exhausting state on 3820 an endpoint. The adversarial endpoint could repeat the process on a 3821 large number of connections, in a manner similar to SYN flooding 3822 attacks in TCP. 3824 Normally, clients will open streams sequentially, as explained in 3825 Section 10.1. However, when several streams are initiated at short 3826 intervals, transmission errors may cause STREAM DATA frames opening 3827 streams to be received out of sequence. A receiver is obligated to 3828 open intervening streams if a higher-numbered stream ID is received. 3829 Thus, on a new connection, opening stream 2000001 opens 1 million 3830 streams, as required by the specification. 3832 The number of active streams is limited by the concurrent stream 3833 limit transport parameter, as explained in Section 10.4. If chosen 3834 judiciously, this limit mitigates the effect of the stream 3835 commitment attack. However, setting the limit too low could affect 3836 performance when applications expect to open a large number of streams. 3838 14. IANA Considerations 3840 14.1. QUIC Transport Parameter Registry 3842 IANA [SHALL add/has added] a registry for "QUIC Transport Parameters" 3843 under a "QUIC Protocol" heading. 3845 The "QUIC Transport Parameters" registry governs a 16-bit space. 3846 This space is split into two spaces that are governed by different 3847 policies. Values with the first byte in the range 0x00 to 0xfe (in 3848 hexadecimal) are assigned via the Specification Required policy 3849 [RFC8126]. Values with the first byte 0xff are reserved for Private 3850 Use [RFC8126].
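The split of the 16-bit space by first byte can be shown with a short sketch. The function name registry_policy is hypothetical; the byte ranges and policy names are those stated above.

```python
def registry_policy(value: int) -> str:
    """Return the RFC 8126 policy that governs a 16-bit registry value."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("registry values occupy a 16-bit space")
    first_byte = value >> 8  # most significant of the two bytes
    if first_byte == 0xFF:
        return "Private Use"
    # First bytes 0x00 through 0xfe.
    return "Specification Required"
```

Under this reading, 0xfeff is the largest value that can be registered and 0xff00 is the first Private Use value.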
3852 Registrations MUST include the following fields: 3854 Value: The numeric value of the assignment (registrations will be 3855 between 0x0000 and 0xfeff). 3857 Parameter Name: A short mnemonic for the parameter. 3859 Specification: A reference to a publicly available specification for 3860 the value. 3862 The nominated expert(s) verify that a specification exists and is 3863 readily accessible. The expert(s) are encouraged to be biased 3864 towards approving registrations unless they are abusive, frivolous, 3865 or actively harmful (not merely aesthetically displeasing, or 3866 architecturally dubious). 3868 The initial contents of this registry are shown in Table 7. 3870 +--------+----------------------------+---------------+ 3871 | Value | Parameter Name | Specification | 3872 +--------+----------------------------+---------------+ 3873 | 0x0000 | initial_max_stream_data | Section 7.4.1 | 3874 | | | | 3875 | 0x0001 | initial_max_data | Section 7.4.1 | 3876 | | | | 3877 | 0x0002 | initial_max_stream_id_bidi | Section 7.4.1 | 3878 | | | | 3879 | 0x0003 | idle_timeout | Section 7.4.1 | 3880 | | | | 3881 | 0x0004 | omit_connection_id | Section 7.4.1 | 3882 | | | | 3883 | 0x0005 | max_packet_size | Section 7.4.1 | 3884 | | | | 3885 | 0x0006 | stateless_reset_token | Section 7.4.1 | 3886 | | | | 3887 | 0x0007 | ack_delay_exponent | Section 7.4.1 | 3888 | | | | 3889 | 0x0008 | initial_max_stream_id_uni | Section 7.4.1 | 3890 +--------+----------------------------+---------------+ 3892 Table 7: Initial QUIC Transport Parameters Entries 3894 14.2. QUIC Transport Error Codes Registry 3896 IANA [SHALL add/has added] a registry for "QUIC Transport Error 3897 Codes" under a "QUIC Protocol" heading. 3899 The "QUIC Transport Error Codes" registry governs a 16-bit space. 3900 This space is split into two spaces that are governed by different 3901 policies. 
Values with the first byte in the range 0x00 to 0xfe (in 3902 hexadecimal) are assigned via the Specification Required policy 3903 [RFC8126]. Values with the first byte 0xff are reserved for Private 3904 Use [RFC8126]. 3906 Registrations MUST include the following fields: 3908 Value: The numeric value of the assignment (registrations will be 3909 between 0x0000 and 0xfeff). 3911 Code: A short mnemonic for the error code. 3913 Description: A brief description of the error code semantics, which 3914 MAY be a summary if a specification reference is provided. 3916 Specification: A reference to a publicly available specification for 3917 the value. 3919 The initial contents of this registry are shown in Table 8. Note 3920 that FRAME_ERROR takes the range from 0x100 to 0x1FF and private use 3921 occupies the range from 0xFF00 to 0xFFFF. 3923 +-----------+------------------------+---------------+--------------+ 3924 | Value | Error | Description | Specificatio | 3925 | | | | n | 3926 +-----------+------------------------+---------------+--------------+ 3927 | 0x0 | NO_ERROR | No error | Section 12.3 | 3928 | | | | | 3929 | 0x1 | INTERNAL_ERROR | Implementatio | Section 12.3 | 3930 | | | n error | | 3931 | | | | | 3932 | 0x3 | FLOW_CONTROL_ERROR | Flow control | Section 12.3 | 3933 | | | error | | 3934 | | | | | 3935 | 0x4 | STREAM_ID_ERROR | Invalid | Section 12.3 | 3936 | | | stream ID | | 3937 | | | | | 3938 | 0x5 | STREAM_STATE_ERROR | Frame | Section 12.3 | 3939 | | | received in | | 3940 | | | invalid | | 3941 | | | stream state | | 3942 | | | | | 3943 | 0x6 | FINAL_OFFSET_ERROR | Change to | Section 12.3 | 3944 | | | final stream | | 3945 | | | offset | | 3946 | | | | | 3947 | 0x7 | FRAME_FORMAT_ERROR | Generic frame | Section 12.3 | 3948 | | | format error | | 3949 | | | | | 3950 | 0x8 | TRANSPORT_PARAMETER_ER | Error in | Section 12.3 | 3951 | | ROR | transport | | 3952 | | | parameters | | 3953 | | | | | 3954 | 0x9 | VERSION_NEGOTIATION_ER | Version | Section 12.3
| 3955 | | ROR | negotiation | | 3956 | | | failure | | 3957 | | | | | 3958 | 0xA | PROTOCOL_VIOLATION | Generic | Section 12.3 | 3959 | | | protocol | | 3960 | | | violation | | 3961 | | | | | 3962 | 0xB | UNSOLICITED_PONG | Unsolicited | Section 12.3 | 3963 | | | PONG frame | | 3964 | | | | | 3965 | 0x100-0x1 | FRAME_ERROR | Specific | Section 12.3 | 3966 | FF | | frame format | | 3967 | | | error | | 3968 +-----------+------------------------+---------------+--------------+ 3970 Table 8: Initial QUIC Transport Error Codes Entries 3972 15. References 3974 15.1. Normative References 3976 [I-D.ietf-tls-tls13] 3977 Rescorla, E., "The Transport Layer Security (TLS) Protocol 3978 Version 1.3", draft-ietf-tls-tls13-22 (work in progress), 3979 November 2017. 3981 [PLPMTUD] Mathis, M. and J. Heffner, "Packetization Layer Path MTU 3982 Discovery", RFC 4821, DOI 10.17487/RFC4821, March 2007, 3983 . 3985 [PMTUDv4] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, 3986 DOI 10.17487/RFC1191, November 1990, 3987 . 3989 [PMTUDv6] McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed., 3990 "Path MTU Discovery for IP version 6", STD 87, RFC 8201, 3991 DOI 10.17487/RFC8201, July 2017, 3992 . 3994 [QUIC-RECOVERY] 3995 Iyengar, J., Ed. and I. Swett, Ed., "QUIC Loss Detection 3996 and Congestion Control", draft-ietf-quic-recovery-00 (work 3997 in progress), December 2017. 3999 [QUIC-TLS] 4000 Thomson, M., Ed. and S. Turner, Ed., "Using Transport 4001 Layer Security (TLS) to Secure QUIC", draft-ietf-quic- 4002 tls-00 (work in progress), December 2017. 4004 [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, 4005 DOI 10.17487/RFC1191, November 1990, 4006 . 4008 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 4009 Requirement Levels", BCP 14, RFC 2119, 4010 DOI 10.17487/RFC2119, March 1997, 4011 . 4013 [RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO 4014 10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November 4015 2003, . 
4017 [RFC4086] Eastlake 3rd, D., Schiller, J., and S. Crocker, 4018 "Randomness Requirements for Security", BCP 106, RFC 4086, 4019 DOI 10.17487/RFC4086, June 2005, 4020 . 4022 [RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for 4023 Writing an IANA Considerations Section in RFCs", BCP 26, 4024 RFC 8126, DOI 10.17487/RFC8126, June 2017, 4025 . 4027 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 4028 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 4029 May 2017, . 4031 15.2. Informative References 4033 [EARLY-DESIGN] 4034 Roskind, J., "QUIC: Multiplexed Transport Over UDP", 4035 December 2013, . 4037 [HTTP2] Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext 4038 Transfer Protocol Version 2 (HTTP/2)", RFC 7540, 4039 DOI 10.17487/RFC7540, May 2015, 4040 . 4042 [RFC2104] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed- 4043 Hashing for Message Authentication", RFC 2104, 4044 DOI 10.17487/RFC2104, February 1997, 4045 . 4047 [RFC2360] Scott, G., "Guide for Internet Standards Writers", BCP 22, 4048 RFC 2360, DOI 10.17487/RFC2360, June 1998, 4049 . 4051 [RFC4787] Audet, F., Ed. and C. Jennings, "Network Address 4052 Translation (NAT) Behavioral Requirements for Unicast 4053 UDP", BCP 127, RFC 4787, DOI 10.17487/RFC4787, January 4054 2007, . 4056 [RFC5869] Krawczyk, H. and P. Eronen, "HMAC-based Extract-and-Expand 4057 Key Derivation Function (HKDF)", RFC 5869, 4058 DOI 10.17487/RFC5869, May 2010, 4059 . 4061 [RFC6824] Ford, A., Raiciu, C., Handley, M., and O. Bonaventure, 4062 "TCP Extensions for Multipath Operation with Multiple 4063 Addresses", RFC 6824, DOI 10.17487/RFC6824, January 2013, 4064 . 4066 [RFC7301] Friedl, S., Popov, A., Langley, A., and E. Stephan, 4067 "Transport Layer Security (TLS) Application-Layer Protocol 4068 Negotiation Extension", RFC 7301, DOI 10.17487/RFC7301, 4069 July 2014, . 4071 [SLOWLORIS] 4072 RSnake Hansen, R., "Welcome to Slowloris...", June 2009, 4073 . 
4076    [SST]      Ford, B., "Structured streams", ACM SIGCOMM Computer
4077               Communication Review Vol. 37, pp. 361,
4078               DOI 10.1145/1282427.1282421, October 2007.

4080 15.3.  URIs

4082    [1] https://mailarchive.ietf.org/arch/search/?email_list=quic

4084    [2] https://github.com/quicwg

4086    [3] https://github.com/quicwg/base-drafts/labels/-transport

4088    [4] https://github.com/quicwg/base-drafts/wiki/QUIC-Versions

4090 Appendix A.  Contributors

4092    The original authors of this specification were Ryan Hamilton, Jana
4093    Iyengar, Ian Swett, and Alyssa Wilk.

4095    The original design and rationale behind this protocol draw
4096    significantly from work by Jim Roskind [EARLY-DESIGN].  In
4097    alphabetical order, the contributors to the pre-IETF QUIC project at
4098    Google are: Britt Cyr, Jeremy Dorfman, Ryan Hamilton, Jana Iyengar,
4099    Fedor Kouranov, Charles Krasic, Jo Kulik, Adam Langley, Jim Roskind,
4100    Robbie Shade, Satyam Shekhar, Cherie Shi, Ian Swett, Raman Tenneti,
4101    Victor Vasiliev, Antonio Vicente, Patrik Westin, Alyssa Wilk, Dale
4102    Worley, Fan Yang, Dan Zhang, Daniel Ziegler.

4104 Appendix B.  Acknowledgments

4106    Special thanks are due to the following for helping shape pre-IETF
4107    QUIC and its deployment: Chris Bentzel, Misha Efimov, Roberto Peon,
4108    Alistair Riddoch, Siddharth Vijayakrishnan, and Assar Westerlund.

4110    This document has benefited immensely from various private
4111    discussions and public ones on the quic@ietf.org and
4112    proto-quic@chromium.org mailing lists.  Our thanks to all.

4114 Appendix C.  Change Log

4116       *RFC Editor's Note:* Please remove this section prior to
4117       publication of a final version of this document.

4119    Issue and pull request numbers are listed with a leading octothorp.

4121 C.1.  Since draft-ietf-quic-transport-07

4123    o  Employ variable-length integer encodings throughout (#595)

4125    o  Draining period can terminate early (#869)

4127 C.2.  Since draft-ietf-quic-transport-06

4129    o  Replaced FNV-1a with AES-GCM for all "Cleartext" packets (#554)

4131    o  Split error code space between application and transport (#485)

4133    o  Stateless reset token moved to end (#820)

4135    o  1-RTT-protected long header types removed (#848)

4137    o  No acknowledgments during draining period (#852)

4139    o  Remove "application close" as a separate close type (#854)

4141    o  Remove timestamps from the ACK frame (#841)

4143    o  Require transport parameters to only appear once (#792)

4145 C.3.  Since draft-ietf-quic-transport-05

4147    o  Stateless token is server-only (#726)

4149    o  Refactor section on connection termination (#733, #748, #328,
4150       #177)

4152    o  Limit size of Version Negotiation packet (#585)

4154    o  Clarify when and what to ack (#736)

4156    o  Renamed STREAM_ID_NEEDED to STREAM_ID_BLOCKED

4158    o  Clarify Keep-alive requirements (#729)

4160 C.4.  Since draft-ietf-quic-transport-04

4162    o  Introduce STOP_SENDING frame, RST_STREAM only resets in one
4163       direction (#165)

4165    o  Removed GOAWAY; application protocols are responsible for graceful
4166       shutdown (#696)

4168    o  Reduced the number of error codes (#96, #177, #184, #211)

4170    o  Version validation fields can't move or change (#121)

4172    o  Removed versions from the transport parameters in a
4173       NewSessionTicket message (#547)

4175    o  Clarify the meaning of "bytes in flight" (#550)

4177    o  Public reset is now stateless reset and not visible to the path
4178       (#215)

4180    o  Reordered bits and fields in STREAM frame (#620)

4182    o  Clarifications to the stream state machine (#572, #571)

4184    o  Increased the maximum length of the Largest Acknowledged field in
4185       ACK frames to 64 bits (#629)

4187    o  truncate_connection_id is renamed to omit_connection_id (#659)

4189    o  CONNECTION_CLOSE terminates the connection like TCP RST (#330,
4190       #328)

4192    o  Update labels used in HKDF-Expand-Label to match TLS 1.3 (#642)

4194 C.5.  Since draft-ietf-quic-transport-03

4196    o  Change STREAM and RST_STREAM layout

4198    o  Add MAX_STREAM_ID settings

4200 C.6.  Since draft-ietf-quic-transport-02

4202    o  The size of the initial packet payload has a fixed minimum (#267,
4203       #472)

4205    o  Define when Version Negotiation packets are ignored (#284, #294,
4206       #241, #143, #474)

4208    o  The 64-bit FNV-1a algorithm is used for integrity protection of
4209       unprotected packets (#167, #480, #481, #517)

4211    o  Rework initial packet types to change how the connection ID is
4212       chosen (#482, #442, #493)

4214    o  No timestamps are forbidden in unprotected packets (#542, #429)

4216    o  Cryptographic handshake is now on stream 0 (#456)

4218    o  Remove congestion control exemption for cryptographic handshake
4219       (#248, #476)

4221    o  Version 1 of QUIC uses TLS; a new version is needed to use a
4222       different handshake protocol (#516)

4224    o  STREAM frames have a reduced number of offset lengths (#543, #430)

4226    o  Split some frames into separate connection- and stream-level
4227       frames (#443)

4229       *  WINDOW_UPDATE split into MAX_DATA and MAX_STREAM_DATA (#450)

4231       *  BLOCKED split to match WINDOW_UPDATE split (#454)

4233       *  Define STREAM_ID_NEEDED frame (#455)

4235    o  A NEW_CONNECTION_ID frame supports connection migration without
4236       linkability (#232, #491, #496)

4238    o  Transport parameters for 0-RTT are retained from a previous
4239       connection (#405, #513, #512)

4241       *  A client in 0-RTT no longer required to reset excess streams
4242          (#425, #479)

4244    o  Expanded security considerations (#440, #444, #445, #448)

4246 C.7.  Since draft-ietf-quic-transport-01

4248    o  Defined short and long packet headers (#40, #148, #361)

4250    o  Defined a versioning scheme and stable fields (#51, #361)

4252    o  Define reserved version values for "greasing" negotiation (#112,
4253       #278)

4255    o  The initial packet number is randomized (#35, #283)
4256    o  Narrow the packet number encoding range requirement (#67, #286,
4257       #299, #323, #356)

4259    o  Defined client address validation (#52, #118, #120, #275)

4261    o  Define transport parameters as a TLS extension (#49, #122)

4263    o  SCUP and COPT parameters are no longer valid (#116, #117)

4265    o  Transport parameters for 0-RTT are either remembered from before,
4266       or assume default values (#126)

4268    o  The server chooses connection IDs in its final flight (#119, #349,
4269       #361)

4271    o  The server echoes the Connection ID and packet number fields when
4272       sending a Version Negotiation packet (#133, #295, #244)

4274    o  Defined a minimum packet size for the initial handshake packet
4275       from the client (#69, #136, #139, #164)

4277    o  Path MTU Discovery (#64, #106)

4279    o  The initial handshake packet from the client needs to fit in a
4280       single packet (#338)

4282    o  Forbid acknowledgment of packets containing only ACK and PADDING
4283       (#291)

4285    o  Require that frames are processed when packets are acknowledged
4286       (#381, #341)

4288    o  Removed the STOP_WAITING frame (#66)

4290    o  Don't require retransmission of old timestamps for lost ACK frames
4291       (#308)

4293    o  Clarified that frames are not retransmitted, but the information
4294       in them can be (#157, #298)

4296    o  Error handling definitions (#335)

4298    o  Split error codes into four sections (#74)

4300    o  Forbid the use of Public Reset where CONNECTION_CLOSE is possible
4301       (#289)

4303    o  Define packet protection rules (#336)
4304    o  Require that stream be entirely delivered or reset, including
4305       acknowledgment of all STREAM frames or the RST_STREAM, before it
4306       closes (#381)

4308    o  Remove stream reservation from state machine (#174, #280)

4310    o  Only stream 1 does not contribute to connection-level flow control
4311       (#204)

4313    o  Stream 1 counts towards the maximum concurrent stream limit (#201,
4314       #282)

4316    o  Remove connection-level flow control exclusion for some streams
4317       (except 1) (#246)

4319    o  RST_STREAM affects connection-level flow control (#162, #163)

4321    o  Flow control accounting uses the maximum data offset on each
4322       stream, rather than bytes received (#378)

4324    o  Moved length-determining fields to the start of STREAM and ACK
4325       (#168, #277)

4327    o  Added the ability to pad between frames (#158, #276)

4329    o  Remove error code and reason phrase from GOAWAY (#352, #355)

4331    o  GOAWAY includes a final stream number for both directions (#347)

4333    o  Error codes for RST_STREAM and CONNECTION_CLOSE are now at a
4334       consistent offset (#249)

4336    o  Defined priority as the responsibility of the application protocol
4337       (#104, #303)

4339 C.8.  Since draft-ietf-quic-transport-00

4341    o  Replaced DIVERSIFICATION_NONCE flag with KEY_PHASE flag

4343    o  Defined versioning

4345    o  Reworked description of packet and frame layout

4347    o  Error code space is divided into regions for each component

4349    o  Use big endian for all numeric values

4351 C.9.  Since draft-hamilton-quic-transport-protocol-01

4353    o  Adopted as base for draft-ietf-quic-tls

4355    o  Updated authors/editors list

4357    o  Added IANA Considerations section

4359    o  Moved Contributors and Acknowledgments to appendices

4361 Authors' Addresses

4363    Jana Iyengar (editor)
4364    Google

4366    Email: jri@google.com

4368    Martin Thomson (editor)
4369    Mozilla

4371    Email: martin.thomson@gmail.com