idnits 2.17.00 (12 Aug 2021)

/tmp/idnits15298/draft-ietf-quic-transport-34.txt:
 -(8089): Line appears to be too long, but this could be caused by non-ascii characters in UTF-8 encoding

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == There are 6 instances of lines with non-ascii characters in the document.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (15 January 2021) is 484 days in the past. Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Looks like a reference, but probably isn't: '0' on line 2375 == Missing Reference: 'CH' is mentioned on line 2371, but not defined == Missing Reference: 'SH' is mentioned on line 2373, but not defined == Missing Reference: 'EE' is mentioned on line 2374, but not defined == Missing Reference: 'CERT' is mentioned on line 2374, but not defined == Missing Reference: 'CV' is mentioned on line 2374, but not defined == Missing Reference: 'FIN' is mentioned on line 2374, but not defined -- Looks like a reference, but probably isn't: '1' on line 2373 == Outdated reference: draft-ietf-quic-invariants has been published as RFC 8999 == Outdated reference: draft-ietf-quic-recovery has been published as RFC 9002 == Outdated reference: draft-ietf-quic-tls has been published as RFC 9001 == Outdated reference: A later version (-16) exists of draft-ietf-quic-manageability-08 -- Obsolete informational reference (is this intentional?): RFC 4941 (Obsoleted by RFC 8981) Summary: 0 errors (**), 0 flaws (~~), 12 warnings (==), 5 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 QUIC J. Iyengar, Ed. 3 Internet-Draft Fastly 4 Intended status: Standards Track M. Thomson, Ed. 5 Expires: 19 July 2021 Mozilla 6 15 January 2021 8 QUIC: A UDP-Based Multiplexed and Secure Transport 9 draft-ietf-quic-transport-34 11 Abstract 13 This document defines the core of the QUIC transport protocol. QUIC 14 provides applications with flow-controlled streams for structured 15 communication, low-latency connection establishment, and network path 16 migration. 
QUIC includes security measures that ensure 17 confidentiality, integrity, and availability in a range of deployment 18 circumstances. Accompanying documents describe the integration of 19 TLS for key negotiation, loss detection, and an exemplary congestion 20 control algorithm. 22 DO NOT DEPLOY THIS VERSION OF QUIC 24 DO NOT DEPLOY THIS VERSION OF QUIC UNTIL IT IS IN AN RFC. This 25 version is still a work in progress. For trial deployments, please 26 use earlier versions. 28 Note to Readers 30 Discussion of this draft takes place on the QUIC working group 31 mailing list (quic@ietf.org (mailto:quic@ietf.org)), which is 32 archived at https://mailarchive.ietf.org/arch/search/?email_list=quic 34 Working Group information can be found at https://github.com/quicwg; 35 source code and issues list for this draft can be found at 36 https://github.com/quicwg/base-drafts/labels/-transport. 38 Status of This Memo 40 This Internet-Draft is submitted in full conformance with the 41 provisions of BCP 78 and BCP 79. 43 Internet-Drafts are working documents of the Internet Engineering 44 Task Force (IETF). Note that other groups may also distribute 45 working documents as Internet-Drafts. The list of current Internet- 46 Drafts is at https://datatracker.ietf.org/drafts/current/. 48 Internet-Drafts are draft documents valid for a maximum of six months 49 and may be updated, replaced, or obsoleted by other documents at any 50 time. It is inappropriate to use Internet-Drafts as reference 51 material or to cite them other than as "work in progress." 53 This Internet-Draft will expire on 19 July 2021. 55 Copyright Notice 57 Copyright (c) 2021 IETF Trust and the persons identified as the 58 document authors. All rights reserved. 60 This document is subject to BCP 78 and the IETF Trust's Legal 61 Provisions Relating to IETF Documents (https://trustee.ietf.org/ 62 license-info) in effect on the date of publication of this document. 
63 Please review these documents carefully, as they describe your rights 64 and restrictions with respect to this document. Code Components 65 extracted from this document must include Simplified BSD License text 66 as described in Section 4.e of the Trust Legal Provisions and are 67 provided without warranty as described in the Simplified BSD License. 69 Table of Contents 71 1. Overview . . . . . . . . . . . . . . . . . . . . . . . . . . 7 72 1.1. Document Structure . . . . . . . . . . . . . . . . . . . 8 73 1.2. Terms and Definitions . . . . . . . . . . . . . . . . . . 10 74 1.3. Notational Conventions . . . . . . . . . . . . . . . . . 11 75 2. Streams . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 76 2.1. Stream Types and Identifiers . . . . . . . . . . . . . . 13 77 2.2. Sending and Receiving Data . . . . . . . . . . . . . . . 14 78 2.3. Stream Prioritization . . . . . . . . . . . . . . . . . . 14 79 2.4. Operations on Streams . . . . . . . . . . . . . . . . . . 15 80 3. Stream States . . . . . . . . . . . . . . . . . . . . . . . . 15 81 3.1. Sending Stream States . . . . . . . . . . . . . . . . . . 16 82 3.2. Receiving Stream States . . . . . . . . . . . . . . . . . 18 83 3.3. Permitted Frame Types . . . . . . . . . . . . . . . . . . 21 84 3.4. Bidirectional Stream States . . . . . . . . . . . . . . . 21 85 3.5. Solicited State Transitions . . . . . . . . . . . . . . . 23 86 4. Flow Control . . . . . . . . . . . . . . . . . . . . . . . . 24 87 4.1. Data Flow Control . . . . . . . . . . . . . . . . . . . . 24 88 4.2. Increasing Flow Control Limits . . . . . . . . . . . . . 25 89 4.3. Flow Control Performance . . . . . . . . . . . . . . . . 26 90 4.4. Handling Stream Cancellation . . . . . . . . . . . . . . 27 91 4.5. Stream Final Size . . . . . . . . . . . . . . . . . . . . 27 92 4.6. Controlling Concurrency . . . . . . . . . . . . . . . . . 28 93 5. Connections . . . . . . . . . . . . . . . . . . . . . . . . . 29 94 5.1. Connection ID . . . . . . . . 
. . . . . . . . . . . . . . 29 95 5.1.1. Issuing Connection IDs . . . . . . . . . . . . . . . 31 96 5.1.2. Consuming and Retiring Connection IDs . . . . . . . . 32 97 5.2. Matching Packets to Connections . . . . . . . . . . . . . 33 98 5.2.1. Client Packet Handling . . . . . . . . . . . . . . . 34 99 5.2.2. Server Packet Handling . . . . . . . . . . . . . . . 35 100 5.2.3. Considerations for Simple Load Balancers . . . . . . 35 101 5.3. Operations on Connections . . . . . . . . . . . . . . . . 36 102 6. Version Negotiation . . . . . . . . . . . . . . . . . . . . . 37 103 6.1. Sending Version Negotiation Packets . . . . . . . . . . . 37 104 6.2. Handling Version Negotiation Packets . . . . . . . . . . 38 105 6.2.1. Version Negotiation Between Draft Versions . . . . . 38 106 6.3. Using Reserved Versions . . . . . . . . . . . . . . . . . 39 107 7. Cryptographic and Transport Handshake . . . . . . . . . . . . 39 108 7.1. Example Handshake Flows . . . . . . . . . . . . . . . . . 41 109 7.2. Negotiating Connection IDs . . . . . . . . . . . . . . . 42 110 7.3. Authenticating Connection IDs . . . . . . . . . . . . . . 43 111 7.4. Transport Parameters . . . . . . . . . . . . . . . . . . 45 112 7.4.1. Values of Transport Parameters for 0-RTT . . . . . . 46 113 7.4.2. New Transport Parameters . . . . . . . . . . . . . . 48 114 7.5. Cryptographic Message Buffering . . . . . . . . . . . . . 49 115 8. Address Validation . . . . . . . . . . . . . . . . . . . . . 49 116 8.1. Address Validation During Connection Establishment . . . 50 117 8.1.1. Token Construction . . . . . . . . . . . . . . . . . 51 118 8.1.2. Address Validation using Retry Packets . . . . . . . 51 119 8.1.3. Address Validation for Future Connections . . . . . . 52 120 8.1.4. Address Validation Token Integrity . . . . . . . . . 55 121 8.2. Path Validation . . . . . . . . . . . . . . . . . . . . . 55 122 8.2.1. Initiating Path Validation . . . . . . . . . . . . . 56 123 8.2.2. Path Validation Responses . . . . . . 
. . . . . . . . 57 124 8.2.3. Successful Path Validation . . . . . . . . . . . . . 58 125 8.2.4. Failed Path Validation . . . . . . . . . . . . . . . 58 126 9. Connection Migration . . . . . . . . . . . . . . . . . . . . 59 127 9.1. Probing a New Path . . . . . . . . . . . . . . . . . . . 60 128 9.2. Initiating Connection Migration . . . . . . . . . . . . . 60 129 9.3. Responding to Connection Migration . . . . . . . . . . . 61 130 9.3.1. Peer Address Spoofing . . . . . . . . . . . . . . . . 62 131 9.3.2. On-Path Address Spoofing . . . . . . . . . . . . . . 62 132 9.3.3. Off-Path Packet Forwarding . . . . . . . . . . . . . 63 133 9.4. Loss Detection and Congestion Control . . . . . . . . . . 64 134 9.5. Privacy Implications of Connection Migration . . . . . . 65 135 9.6. Server's Preferred Address . . . . . . . . . . . . . . . 66 136 9.6.1. Communicating a Preferred Address . . . . . . . . . . 66 137 9.6.2. Migration to a Preferred Address . . . . . . . . . . 67 138 9.6.3. Interaction of Client Migration and Preferred 139 Address . . . . . . . . . . . . . . . . . . . . . . . 67 140 9.7. Use of IPv6 Flow-Label and Migration . . . . . . . . . . 68 141 10. Connection Termination . . . . . . . . . . . . . . . . . . . 69 142 10.1. Idle Timeout . . . . . . . . . . . . . . . . . . . . . . 69 143 10.1.1. Liveness Testing . . . . . . . . . . . . . . . . . . 69 144 10.1.2. Deferring Idle Timeout . . . . . . . . . . . . . . . 70 145 10.2. Immediate Close . . . . . . . . . . . . . . . . . . . . 70 146 10.2.1. Closing Connection State . . . . . . . . . . . . . . 71 147 10.2.2. Draining Connection State . . . . . . . . . . . . . 72 148 10.2.3. Immediate Close During the Handshake . . . . . . . . 73 149 10.3. Stateless Reset . . . . . . . . . . . . . . . . . . . . 74 150 10.3.1. Detecting a Stateless Reset . . . . . . . . . . . . 77 151 10.3.2. Calculating a Stateless Reset Token . . . . . . . . 78 152 10.3.3. Looping . . . . . . . . . . . . . . . . . . . . . . 79 153 11. 
Error Handling . . . . . . . . . . . . . . . . . . . . . . . 79 154 11.1. Connection Errors . . . . . . . . . . . . . . . . . . . 80 155 11.2. Stream Errors . . . . . . . . . . . . . . . . . . . . . 81 156 12. Packets and Frames . . . . . . . . . . . . . . . . . . . . . 81 157 12.1. Protected Packets . . . . . . . . . . . . . . . . . . . 82 158 12.2. Coalescing Packets . . . . . . . . . . . . . . . . . . . 82 159 12.3. Packet Numbers . . . . . . . . . . . . . . . . . . . . . 83 160 12.4. Frames and Frame Types . . . . . . . . . . . . . . . . . 85 161 12.5. Frames and Number Spaces . . . . . . . . . . . . . . . . 89 162 13. Packetization and Reliability . . . . . . . . . . . . . . . . 90 163 13.1. Packet Processing . . . . . . . . . . . . . . . . . . . 90 164 13.2. Generating Acknowledgments . . . . . . . . . . . . . . . 91 165 13.2.1. Sending ACK Frames . . . . . . . . . . . . . . . . . 91 166 13.2.2. Acknowledgment Frequency . . . . . . . . . . . . . . 92 167 13.2.3. Managing ACK Ranges . . . . . . . . . . . . . . . . 93 168 13.2.4. Limiting Ranges by Tracking ACK Frames . . . . . . . 94 169 13.2.5. Measuring and Reporting Host Delay . . . . . . . . . 95 170 13.2.6. ACK Frames and Packet Protection . . . . . . . . . . 95 171 13.2.7. PADDING Frames Consume Congestion Window . . . . . . 95 172 13.3. Retransmission of Information . . . . . . . . . . . . . 96 173 13.4. Explicit Congestion Notification . . . . . . . . . . . . 98 174 13.4.1. Reporting ECN Counts . . . . . . . . . . . . . . . . 99 175 13.4.2. ECN Validation . . . . . . . . . . . . . . . . . . . 99 176 14. Datagram Size . . . . . . . . . . . . . . . . . . . . . . . . 101 177 14.1. Initial Datagram Size . . . . . . . . . . . . . . . . . 102 178 14.2. Path Maximum Transmission Unit . . . . . . . . . . . . . 103 179 14.2.1. Handling of ICMP Messages by PMTUD . . . . . . . . . 104 180 14.3. Datagram Packetization Layer PMTU Discovery . . . . . . 105 181 14.3.1. DPLPMTUD and Initial Connectivity . . . . . . . 
. . 105 182 14.3.2. Validating the Network Path with DPLPMTUD . . . . . 105 183 14.3.3. Handling of ICMP Messages by DPLPMTUD . . . . . . . 105 184 14.4. Sending QUIC PMTU Probes . . . . . . . . . . . . . . . . 105 185 14.4.1. PMTU Probes Containing Source Connection ID . . . . 106 186 15. Versions . . . . . . . . . . . . . . . . . . . . . . . . . . 106 187 16. Variable-Length Integer Encoding . . . . . . . . . . . . . . 107 188 17. Packet Formats . . . . . . . . . . . . . . . . . . . . . . . 108 189 17.1. Packet Number Encoding and Decoding . . . . . . . . . . 108 190 17.2. Long Header Packets . . . . . . . . . . . . . . . . . . 109 191 17.2.1. Version Negotiation Packet . . . . . . . . . . . . . 112 192 17.2.2. Initial Packet . . . . . . . . . . . . . . . . . . . 113 193 17.2.3. 0-RTT . . . . . . . . . . . . . . . . . . . . . . . 115 194 17.2.4. Handshake Packet . . . . . . . . . . . . . . . . . . 117 195 17.2.5. Retry Packet . . . . . . . . . . . . . . . . . . . . 118 196 17.3. Short Header Packets . . . . . . . . . . . . . . . . . . 120 197 17.3.1. 1-RTT Packet . . . . . . . . . . . . . . . . . . . . 120 198 17.4. Latency Spin Bit . . . . . . . . . . . . . . . . . . . . 122 199 18. Transport Parameter Encoding . . . . . . . . . . . . . . . . 123 200 18.1. Reserved Transport Parameters . . . . . . . . . . . . . 124 201 18.2. Transport Parameter Definitions . . . . . . . . . . . . 124 202 19. Frame Types and Formats . . . . . . . . . . . . . . . . . . . 128 203 19.1. PADDING Frames . . . . . . . . . . . . . . . . . . . . . 129 204 19.2. PING Frames . . . . . . . . . . . . . . . . . . . . . . 129 205 19.3. ACK Frames . . . . . . . . . . . . . . . . . . . . . . . 130 206 19.3.1. ACK Ranges . . . . . . . . . . . . . . . . . . . . . 131 207 19.3.2. ECN Counts . . . . . . . . . . . . . . . . . . . . . 132 208 19.4. RESET_STREAM Frames . . . . . . . . . . . . . . . . . . 133 209 19.5. STOP_SENDING Frames . . . . . . . . . . . . . . . . . . 134 210 19.6. CRYPTO Frames . 
. . . . . . . . . . . . . . . . . . . . 135 211 19.7. NEW_TOKEN Frames . . . . . . . . . . . . . . . . . . . . 136 212 19.8. STREAM Frames . . . . . . . . . . . . . . . . . . . . . 136 213 19.9. MAX_DATA Frames . . . . . . . . . . . . . . . . . . . . 138 214 19.10. MAX_STREAM_DATA Frames . . . . . . . . . . . . . . . . . 138 215 19.11. MAX_STREAMS Frames . . . . . . . . . . . . . . . . . . . 139 216 19.12. DATA_BLOCKED Frames . . . . . . . . . . . . . . . . . . 140 217 19.13. STREAM_DATA_BLOCKED Frames . . . . . . . . . . . . . . . 141 218 19.14. STREAMS_BLOCKED Frames . . . . . . . . . . . . . . . . . 141 219 19.15. NEW_CONNECTION_ID Frames . . . . . . . . . . . . . . . . 142 220 19.16. RETIRE_CONNECTION_ID Frames . . . . . . . . . . . . . . 144 221 19.17. PATH_CHALLENGE Frames . . . . . . . . . . . . . . . . . 145 222 19.18. PATH_RESPONSE Frames . . . . . . . . . . . . . . . . . . 145 223 19.19. CONNECTION_CLOSE Frames . . . . . . . . . . . . . . . . 146 224 19.20. HANDSHAKE_DONE Frames . . . . . . . . . . . . . . . . . 147 225 19.21. Extension Frames . . . . . . . . . . . . . . . . . . . . 147 226 20. Error Codes . . . . . . . . . . . . . . . . . . . . . . . . . 148 227 20.1. Transport Error Codes . . . . . . . . . . . . . . . . . 148 228 20.2. Application Protocol Error Codes . . . . . . . . . . . . 150 229 21. Security Considerations . . . . . . . . . . . . . . . . . . . 150 230 21.1. Overview of Security Properties . . . . . . . . . . . . 150 231 21.1.1. Handshake . . . . . . . . . . . . . . . . . . . . . 151 232 21.1.2. Protected Packets . . . . . . . . . . . . . . . . . 153 233 21.1.3. Connection Migration . . . . . . . . . . . . . . . . 153 234 21.2. Handshake Denial of Service . . . . . . . . . . . . . . 158 235 21.3. Amplification Attack . . . . . . . . . . . . . . . . . . 159 236 21.4. Optimistic ACK Attack . . . . . . . . . . . . . . . . . 159 237 21.5. Request Forgery Attacks . . . . . . . . . . . . . . . . 159 238 21.5.1. 
Control Options for Endpoints . . . . . . . . . . . 160 239 21.5.2. Request Forgery with Client Initial Packets . . . . 161 240 21.5.3. Request Forgery with Preferred Addresses . . . . . . 162 241 21.5.4. Request Forgery with Spoofed Migration . . . . . . . 162 242 21.5.5. Request Forgery with Version Negotiation . . . . . . 163 243 21.5.6. Generic Request Forgery Countermeasures . . . . . . 163 244 21.6. Slowloris Attacks . . . . . . . . . . . . . . . . . . . 164 245 21.7. Stream Fragmentation and Reassembly Attacks . . . . . . 165 246 21.8. Stream Commitment Attack . . . . . . . . . . . . . . . . 165 247 21.9. Peer Denial of Service . . . . . . . . . . . . . . . . . 166 248 21.10. Explicit Congestion Notification Attacks . . . . . . . . 166 249 21.11. Stateless Reset Oracle . . . . . . . . . . . . . . . . . 167 250 21.12. Version Downgrade . . . . . . . . . . . . . . . . . . . 167 251 21.13. Targeted Attacks by Routing . . . . . . . . . . . . . . 168 252 21.14. Traffic Analysis . . . . . . . . . . . . . . . . . . . . 168 253 22. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 168 254 22.1. Registration Policies for QUIC Registries . . . . . . . 168 255 22.1.1. Provisional Registrations . . . . . . . . . . . . . 168 256 22.1.2. Selecting Codepoints . . . . . . . . . . . . . . . . 169 257 22.1.3. Reclaiming Provisional Codepoints . . . . . . . . . 170 258 22.1.4. Permanent Registrations . . . . . . . . . . . . . . 170 259 22.2. QUIC Versions Registry . . . . . . . . . . . . . . . . . 171 260 22.3. QUIC Transport Parameter Registry . . . . . . . . . . . 172 261 22.4. QUIC Frame Types Registry . . . . . . . . . . . . . . . 173 262 22.5. QUIC Transport Error Codes Registry . . . . . . . . . . 174 263 23. References . . . . . . . . . . . . . . . . . . . . . . . . . 176 264 23.1. Normative References . . . . . . . . . . . . . . . . . . 176 265 23.2. Informative References . . . . . . . . . . . . . . . . . 178 266 Appendix A. Pseudocode . . . . . . . . . . 
. . . . . . . . . . . 181 267 A.1. Sample Variable-Length Integer Decoding . . . . . . . . . 181 268 A.2. Sample Packet Number Encoding Algorithm . . . . . . . . . 182 269 A.3. Sample Packet Number Decoding Algorithm . . . . . . . . . 182 270 A.4. Sample ECN Validation Algorithm . . . . . . . . . . . . . 183 271 Appendix B. Change Log . . . . . . . . . . . . . . . . . . . . . 184 272 B.1. Since draft-ietf-quic-transport-32 . . . . . . . . . . . 184 273 B.2. Since draft-ietf-quic-transport-31 . . . . . . . . . . . 185 274 B.3. Since draft-ietf-quic-transport-30 . . . . . . . . . . . 185 275 B.4. Since draft-ietf-quic-transport-29 . . . . . . . . . . . 186 276 B.5. Since draft-ietf-quic-transport-28 . . . . . . . . . . . 186 277 B.6. Since draft-ietf-quic-transport-27 . . . . . . . . . . . 187 278 B.7. Since draft-ietf-quic-transport-26 . . . . . . . . . . . 188 279 B.8. Since draft-ietf-quic-transport-25 . . . . . . . . . . . 188 280 B.9. Since draft-ietf-quic-transport-24 . . . . . . . . . . . 188 281 B.10. Since draft-ietf-quic-transport-23 . . . . . . . . . . . 189 282 B.11. Since draft-ietf-quic-transport-22 . . . . . . . . . . . 190 283 B.12. Since draft-ietf-quic-transport-21 . . . . . . . . . . . 191 284 B.13. Since draft-ietf-quic-transport-20 . . . . . . . . . . . 191 285 B.14. Since draft-ietf-quic-transport-19 . . . . . . . . . . . 192 286 B.15. Since draft-ietf-quic-transport-18 . . . . . . . . . . . 192 287 B.16. Since draft-ietf-quic-transport-17 . . . . . . . . . . . 193 288 B.17. Since draft-ietf-quic-transport-16 . . . . . . . . . . . 194 289 B.18. Since draft-ietf-quic-transport-15 . . . . . . . . . . . 195 290 B.19. Since draft-ietf-quic-transport-14 . . . . . . . . . . . 195 291 B.20. Since draft-ietf-quic-transport-13 . . . . . . . . . . . 195 292 B.21. Since draft-ietf-quic-transport-12 . . . . . . . . . . . 196 293 B.22. Since draft-ietf-quic-transport-11 . . . . . . . . . . . 197 294 B.23. Since draft-ietf-quic-transport-10 . . . . . . . . . 
. . 197 295 B.24. Since draft-ietf-quic-transport-09 . . . . . . . . . . . 198 296 B.25. Since draft-ietf-quic-transport-08 . . . . . . . . . . . 199 297 B.26. Since draft-ietf-quic-transport-07 . . . . . . . . . . . 199 298 B.27. Since draft-ietf-quic-transport-06 . . . . . . . . . . . 200 299 B.28. Since draft-ietf-quic-transport-05 . . . . . . . . . . . 201 300 B.29. Since draft-ietf-quic-transport-04 . . . . . . . . . . . 201 301 B.30. Since draft-ietf-quic-transport-03 . . . . . . . . . . . 202 302 B.31. Since draft-ietf-quic-transport-02 . . . . . . . . . . . 202 303 B.32. Since draft-ietf-quic-transport-01 . . . . . . . . . . . 203 304 B.33. Since draft-ietf-quic-transport-00 . . . . . . . . . . . 205 305 B.34. Since draft-hamilton-quic-transport-protocol-01 . . . . . 205 306 Contributors . . . . . . . . . . . . . . . . . . . . . . . . . . 205 307 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 207 309 1. Overview 311 QUIC is a secure general-purpose transport protocol. This document 312 defines version 1 of QUIC, which conforms to the version-independent 313 properties of QUIC defined in [QUIC-INVARIANTS]. 315 QUIC is a connection-oriented protocol that creates a stateful 316 interaction between a client and server. 318 The QUIC handshake combines negotiation of cryptographic and 319 transport parameters. QUIC integrates the TLS ([TLS13]) handshake, 320 although using a customized framing for protecting packets. The 321 integration of TLS and QUIC is described in more detail in 322 [QUIC-TLS]. The handshake is structured to permit the exchange of 323 application data as soon as possible. This includes an option for 324 clients to send data immediately (0-RTT), which requires some form of 325 prior communication or configuration to enable. 327 Endpoints communicate in QUIC by exchanging QUIC packets. Most 328 packets contain frames, which carry control information and 329 application data between endpoints. 
QUIC authenticates the entirety 330 of each packet and encrypts as much of each packet as is practical. 331 QUIC packets are carried in UDP datagrams ([UDP]) to better 332 facilitate deployment in existing systems and networks. 334 Application protocols exchange information over a QUIC connection via 335 streams, which are ordered sequences of bytes. Two types of stream 336 can be created: bidirectional streams, which allow both endpoints to 337 send data; and unidirectional streams, which allow a single endpoint 338 to send data. A credit-based scheme is used to limit stream creation 339 and to bound the amount of data that can be sent. 341 QUIC provides the necessary feedback to implement reliable delivery 342 and congestion control. An algorithm for detecting and recovering 343 from loss of data is described in [QUIC-RECOVERY]. QUIC depends on 344 congestion control to avoid network congestion. An exemplary 345 congestion control algorithm is also described in [QUIC-RECOVERY]. 347 QUIC connections are not strictly bound to a single network path. 348 Connection migration uses connection identifiers to allow connections 349 to transfer to a new network path. Only clients are able to migrate 350 in this version of QUIC. This design also allows connections to 351 continue after changes in network topology or address mappings, such 352 as might be caused by NAT rebinding. 354 Once established, multiple options are provided for connection 355 termination. Applications can manage a graceful shutdown, endpoints 356 can negotiate a timeout period, errors can cause immediate connection 357 teardown, and a stateless mechanism provides for termination of 358 connections after one endpoint has lost state. 360 1.1. Document Structure 362 This document describes the core QUIC protocol and is structured as 363 follows: 365 * Streams are the basic service abstraction that QUIC provides. 
367 - Section 2 describes core concepts related to streams, 369 - Section 3 provides a reference model for stream states, and 371 - Section 4 outlines the operation of flow control. 373 * Connections are the context in which QUIC endpoints communicate. 375 - Section 5 describes core concepts related to connections, 377 - Section 6 describes version negotiation, 379 - Section 7 details the process for establishing connections, 380 - Section 8 describes address validation and critical denial of 381 service mitigations, 383 - Section 9 describes how endpoints migrate a connection to a new 384 network path, 386 - Section 10 lists the options for terminating an open 387 connection, and 389 - Section 11 provides guidance for stream and connection error 390 handling. 392 * Packets and frames are the basic unit used by QUIC to communicate. 394 - Section 12 describes concepts related to packets and frames, 396 - Section 13 defines models for the transmission, retransmission, 397 and acknowledgment of data, and 399 - Section 14 specifies rules for managing the size of datagrams 400 carrying QUIC packets. 402 * Finally, encoding details of QUIC protocol elements are described 403 in: 405 - Section 15 (Versions), 407 - Section 16 (Integer Encoding), 409 - Section 17 (Packet Headers), 411 - Section 18 (Transport Parameters), 413 - Section 19 (Frames), and 415 - Section 20 (Errors). 417 Accompanying documents describe QUIC's loss detection and congestion 418 control [QUIC-RECOVERY], and the use of TLS and other cryptographic 419 mechanisms [QUIC-TLS]. 421 This document defines QUIC version 1, which conforms to the protocol 422 invariants in [QUIC-INVARIANTS]. 424 To refer to QUIC version 1, cite this document. References to the 425 limited set of version-independent properties of QUIC can cite 426 [QUIC-INVARIANTS]. 428 1.2. 
Terms and Definitions 430 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 431 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 432 "OPTIONAL" in this document are to be interpreted as described in 433 BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all 434 capitals, as shown here. 436 Commonly used terms in the document are described below. 438 QUIC: The transport protocol described by this document. QUIC is a 439 name, not an acronym. 441 Endpoint: An entity that can participate in a QUIC connection by 442 generating, receiving, and processing QUIC packets. There are 443 only two types of endpoint in QUIC: client and server. 445 Client: The endpoint that initiates a QUIC connection. 447 Server: The endpoint that accepts a QUIC connection. 449 QUIC packet: A complete processable unit of QUIC that can be 450 encapsulated in a UDP datagram. One or more QUIC packets can be 451 encapsulated in a single UDP datagram. 453 Ack-eliciting Packet: A QUIC packet that contains frames other than 454 ACK, PADDING, and CONNECTION_CLOSE. These cause a recipient to 455 send an acknowledgment; see Section 13.2.1. 457 Frame: A unit of structured protocol information. There are 458 multiple frame types, each of which carries different information. 459 Frames are contained in QUIC packets. 461 Address: When used without qualification, the tuple of IP version, 462 IP address, and UDP port number that represents one end of a 463 network path. 465 Connection ID: An identifier that is used to identify a QUIC 466 connection at an endpoint. Each endpoint selects one or more 467 Connection IDs for its peer to include in packets sent towards the 468 endpoint. This value is opaque to the peer. 470 Stream: A unidirectional or bidirectional channel of ordered bytes 471 within a QUIC connection. A QUIC connection can carry multiple 472 simultaneous streams. 474 Application: An entity that uses QUIC to send and receive data. 
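As an informal illustration of the Ack-eliciting Packet definition above, the following Python sketch classifies a packet by the frame types it carries. Representing frames as plain strings, and the names NON_ACK_ELICITING and is_ack_eliciting, are assumptions made for this example only; they are not defined by this document.

```python
# Hypothetical representation: a packet is a list of frame type names.
# Frame types that do not, on their own, cause a recipient to acknowledge.
NON_ACK_ELICITING = frozenset({"ACK", "PADDING", "CONNECTION_CLOSE"})


def is_ack_eliciting(frame_types):
    """Return True if the packet contains any frame other than
    ACK, PADDING, or CONNECTION_CLOSE."""
    return any(ft not in NON_ACK_ELICITING for ft in frame_types)


if __name__ == "__main__":
    # A packet with only ACK and PADDING frames is not ack-eliciting.
    print(is_ack_eliciting(["ACK", "PADDING"]))
    # Adding a STREAM frame makes the packet ack-eliciting.
    print(is_ack_eliciting(["ACK", "STREAM"]))
```

A real implementation would likely test numeric frame type values (Section 19) rather than names; the string form is used here only to keep the sketch short.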
476     This document uses the terms "QUIC packets", "UDP datagrams", and "IP
477     packets" to refer to the units of the respective protocols. That is,
478     one or more QUIC packets can be encapsulated in a UDP datagram, which
479     is in turn encapsulated in an IP packet.

481  1.3. Notational Conventions

483     Packet and frame diagrams in this document use a custom format. The
484     purpose of this format is to summarize, not define, protocol
485     elements. Prose defines the complete semantics and details of
486     structures.

488     Complex fields are named and then followed by a list of fields
489     surrounded by a pair of matching braces. Each field in this list is
490     separated by commas.

492     Individual fields include length information, plus indications about
493     fixed value, optionality, or repetitions. Individual fields use the
494     following notational conventions, with all lengths in bits:

496     x (A): Indicates that x is A bits long

498     x (i): Indicates that x holds an integer value using the variable-
499        length encoding in Section 16

501     x (A..B): Indicates that x can be any length from A to B; A can be
502        omitted to indicate a minimum of zero bits and B can be omitted to
503        indicate no set upper limit; values in this format always end on
504        a byte boundary

506     x (L) = C: Indicates that x has a fixed value of C with the length
507        described by L, which can use any of the three length forms above

509     x (L) = C..D: Indicates that x has a value in the range from C to D,
510        inclusive, with the length described by L, as above

512     [x (L)]: Indicates that x is optional (and has length of L)

514     x (L) ...: Indicates that zero or more instances of x are present
515        (and that each instance is length L)

517     This document uses network byte order (that is, big endian) values.
518     Fields are placed starting from the high-order bits of each byte.

520     By convention, individual fields reference a complex field by using
521     the name of the complex field.

523     For example:

525     Example Structure {
526       One-bit Field (1),
527       7-bit Field with Fixed Value (7) = 61,
528       Field with Variable-Length Integer (i),
529       Arbitrary-Length Field (..),
530       Variable-Length Field (8..24),
531       Field With Minimum Length (16..),
532       Field With Maximum Length (..128),
533       [Optional Field (64)],
534       Repeated Field (8) ...,
535     }

537                      Figure 1: Example Format

539     When a single-bit field is referenced in prose, the position of that
540     field can be clarified by using the value of the byte that carries
541     the field with the field's value set. For example, the value 0x80
542     could be used to refer to the single-bit field in the most
543     significant bit of the byte, such as One-bit Field in Figure 1.

545  2. Streams

547     Streams in QUIC provide a lightweight, ordered byte-stream
548     abstraction to an application. Streams can be unidirectional or
549     bidirectional.

551     Streams can be created by sending data. Other processes associated
552     with stream management - ending, cancelling, and managing flow
553     control - are all designed to impose minimal overheads. For
554     instance, a single STREAM frame (Section 19.8) can open, carry data
555     for, and close a stream. Streams can also be long-lived and can last
556     the entire duration of a connection.

558     Streams can be created by either endpoint, can concurrently send data
559     interleaved with other streams, and can be cancelled. QUIC does not
560     provide any means of ensuring ordering between bytes on different
561     streams.

563     QUIC allows for an arbitrary number of streams to operate
564     concurrently and for an arbitrary amount of data to be sent on any
565     stream, subject to flow control constraints and stream limits; see
566     Section 4.

568  2.1. Stream Types and Identifiers

570     Streams can be unidirectional or bidirectional. Unidirectional
571     streams carry data in one direction: from the initiator of the stream
572     to its peer. Bidirectional streams allow for data to be sent in both
573     directions.


   Streams are identified within a connection by a numeric value,
   referred to as the stream ID. A stream ID is a 62-bit integer (0 to
   2^62-1) that is unique for all streams on a connection. Stream IDs
   are encoded as variable-length integers; see Section 16. A QUIC
   endpoint MUST NOT reuse a stream ID within a connection.

   The least significant bit (0x1) of the stream ID identifies the
   initiator of the stream. Client-initiated streams have even-numbered
   stream IDs (with the bit set to 0), and server-initiated streams have
   odd-numbered stream IDs (with the bit set to 1).

   The second least significant bit (0x2) of the stream ID distinguishes
   between bidirectional streams (with the bit set to 0) and
   unidirectional streams (with the bit set to 1).

   The two least significant bits from a stream ID therefore identify a
   stream as one of four types, as summarized in Table 1.

             +======+==================================+
             | Bits | Stream Type                      |
             +======+==================================+
             | 0x0  | Client-Initiated, Bidirectional  |
             +------+----------------------------------+
             | 0x1  | Server-Initiated, Bidirectional  |
             +------+----------------------------------+
             | 0x2  | Client-Initiated, Unidirectional |
             +------+----------------------------------+
             | 0x3  | Server-Initiated, Unidirectional |
             +------+----------------------------------+

                        Table 1: Stream ID Types

   The stream space for each type begins at the minimum value (0x0
   through 0x3 respectively); successive streams of each type are
   created with numerically increasing stream IDs. A stream ID that is
   used out of order results in all streams of that type with
   lower-numbered stream IDs also being opened.

2.2. Sending and Receiving Data

   STREAM frames (Section 19.8) encapsulate data sent by an application.
   An endpoint uses the Stream ID and Offset fields in STREAM frames to
   place data in order.
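
   As an informal illustration of the stream ID layout in Section 2.1
   (it is not part of the protocol definition), the two least
   significant bits can be decoded as in the Python sketch below. The
   names StreamType, stream_type, is_server_initiated,
   is_unidirectional, and nth_stream_id are hypothetical and exist only
   for this example.

```python
from enum import IntEnum

MAX_STREAM_ID = 2**62 - 1  # a stream ID is a 62-bit integer


class StreamType(IntEnum):
    """The four stream types of Table 1, keyed by the two low-order bits."""
    CLIENT_BIDI = 0x0
    SERVER_BIDI = 0x1
    CLIENT_UNI = 0x2
    SERVER_UNI = 0x3


def stream_type(stream_id: int) -> StreamType:
    """Classify a stream ID by its two least significant bits."""
    if not 0 <= stream_id <= MAX_STREAM_ID:
        raise ValueError("stream ID out of range")
    return StreamType(stream_id & 0x3)


def is_server_initiated(stream_id: int) -> bool:
    """Bit 0x1 identifies the initiator (set means server)."""
    return bool(stream_id & 0x1)


def is_unidirectional(stream_id: int) -> bool:
    """Bit 0x2 distinguishes unidirectional from bidirectional streams."""
    return bool(stream_id & 0x2)


def nth_stream_id(kind: StreamType, n: int) -> int:
    """Return the ID of the n-th stream (counting from 0) of a given type."""
    return (n << 2) | int(kind)
```

   Under these assumptions, nth_stream_id(StreamType.CLIENT_UNI, 2)
   yields stream ID 10, the third client-initiated unidirectional
   stream.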

   Endpoints MUST be able to deliver stream data to an application as an
   ordered byte-stream. Delivering an ordered byte-stream requires that
   an endpoint buffer any data that is received out of order, up to the
   advertised flow control limit.

   QUIC makes no specific allowances for delivery of stream data out of
   order. However, implementations MAY choose to offer the ability to
   deliver data out of order to a receiving application.

   An endpoint could receive data for a stream at the same stream offset
   multiple times. Data that has already been received can be
   discarded. The data at a given offset MUST NOT change if it is sent
   multiple times; an endpoint MAY treat receipt of different data at
   the same offset within a stream as a connection error of type
   PROTOCOL_VIOLATION.

   Streams are an ordered byte-stream abstraction with no other
   structure visible to QUIC. STREAM frame boundaries are not expected
   to be preserved when data is transmitted, retransmitted after packet
   loss, or delivered to the application at a receiver.

   An endpoint MUST NOT send data on any stream without ensuring that it
   is within the flow control limits set by its peer. Flow control is
   described in detail in Section 4.

2.3. Stream Prioritization

   Stream multiplexing can have a significant effect on application
   performance if resources allocated to streams are correctly
   prioritized.

   QUIC does not provide a mechanism for exchanging prioritization
   information. Instead, it relies on receiving priority information
   from the application.

   A QUIC implementation SHOULD provide ways in which an application can
   indicate the relative priority of streams. An implementation uses
   information provided by the application to determine how to allocate
   resources to active streams.

2.4. Operations on Streams

   This document does not define an API for QUIC, but instead defines a
   set of functions on streams that application protocols can rely upon.
   An application protocol can assume that a QUIC implementation
   provides an interface that includes the operations described in this
   section. An implementation designed for use with a specific
   application protocol might provide only those operations that are
   used by that protocol.

   On the sending part of a stream, an application protocol can:

   * write data, understanding when stream flow control credit
     (Section 4.1) has successfully been reserved to send the written
     data;

   * end the stream (clean termination), resulting in a STREAM frame
     (Section 19.8) with the FIN bit set; and

   * reset the stream (abrupt termination), resulting in a RESET_STREAM
     frame (Section 19.4) if the stream was not already in a terminal
     state.

   On the receiving part of a stream, an application protocol can:

   * read data; and

   * abort reading of the stream and request closure, possibly
     resulting in a STOP_SENDING frame (Section 19.5).

   An application protocol can also request to be informed of state
   changes on streams, including when the peer has opened or reset a
   stream, when a peer aborts reading on a stream, when new data is
   available, and when data can or cannot be written to the stream due
   to flow control.

3. Stream States

   This section describes streams in terms of their send or receive
   components. Two state machines are described: one for the streams on
   which an endpoint transmits data (Section 3.1), and another for
   streams on which an endpoint receives data (Section 3.2).

   Unidirectional streams use either the sending or receiving state
   machine depending on the stream type and endpoint role.
   Bidirectional streams use both state machines at both endpoints.
   For the most part, the use of these state machines is the same
   whether the stream is unidirectional or bidirectional. The
   conditions for opening a stream are slightly more complex for a
   bidirectional stream because the opening of either the send or
   receive side causes the stream to open in both directions.

   The state machines shown in this section are largely informative.
   This document uses stream states to describe rules for when and how
   different types of frames can be sent and the reactions that are
   expected when different types of frames are received. Though these
   state machines are intended to be useful in implementing QUIC, these
   states are not intended to constrain implementations. An
   implementation can define a different state machine as long as its
   behavior is consistent with an implementation that implements these
   states.

      Note: In some cases, a single event or action can cause a
      transition through multiple states. For instance, sending STREAM
      with a FIN bit set can cause two state transitions for a sending
      stream: from the Ready state to the Send state, and from the Send
      state to the Data Sent state.

3.1. Sending Stream States

   Figure 2 shows the states for the part of a stream that sends data to
   a peer.

          o
          | Create Stream (Sending)
          | Peer Creates Bidirectional Stream
          v
      +-------+
      | Ready | Send RESET_STREAM
      |       |-----------------------.
      +-------+                       |
          |                           |
          | Send STREAM /             |
          |      STREAM_DATA_BLOCKED  |
          v                           |
      +-------+                       |
      | Send  | Send RESET_STREAM     |
      |       |---------------------->|
      +-------+                       |
          |                           |
          | Send STREAM + FIN         |
          v                           v
      +-------+                   +-------+
      | Data  | Send RESET_STREAM | Reset |
      | Sent  |------------------>| Sent  |
      +-------+                   +-------+
          |                           |
          | Recv All ACKs             | Recv ACK
          v                           v
      +-------+                   +-------+
      | Data  |                   | Reset |
      | Recvd |                   | Recvd |
      +-------+                   +-------+

             Figure 2: States for Sending Parts of Streams

   The sending part of a stream that the endpoint initiates (types 0 and
   2 for clients, 1 and 3 for servers) is opened by the application.
   The "Ready" state represents a newly created stream that is able to
   accept data from the application. Stream data might be buffered in
   this state in preparation for sending.

   Sending the first STREAM or STREAM_DATA_BLOCKED frame causes a
   sending part of a stream to enter the "Send" state. An
   implementation might choose to defer allocating a stream ID to a
   stream until it sends the first STREAM frame and enters this state,
   which can allow for better stream prioritization.

   The sending part of a bidirectional stream initiated by a peer (type
   0 for a server, type 1 for a client) starts in the "Ready" state when
   the receiving part is created.

   In the "Send" state, an endpoint transmits - and retransmits as
   necessary - stream data in STREAM frames. The endpoint respects the
   flow control limits set by its peer, and continues to accept and
   process MAX_STREAM_DATA frames. An endpoint in the "Send" state
   generates STREAM_DATA_BLOCKED frames if it is blocked from sending by
   stream flow control limits (Section 4.1).

   After the application indicates that all stream data has been sent
   and a STREAM frame containing the FIN bit is sent, the sending part
   of the stream enters the "Data Sent" state.
   From this state, the endpoint only retransmits stream data as
   necessary. The endpoint does not need to check flow control limits
   or send STREAM_DATA_BLOCKED frames for a stream in this state.
   MAX_STREAM_DATA frames might be received until the peer receives the
   final stream offset. The endpoint can safely ignore any
   MAX_STREAM_DATA frames it receives from its peer for a stream in this
   state.

   Once all stream data has been successfully acknowledged, the sending
   part of the stream enters the "Data Recvd" state, which is a terminal
   state.

   From any of the "Ready", "Send", or "Data Sent" states, an
   application can signal that it wishes to abandon transmission of
   stream data. Alternatively, an endpoint might receive a STOP_SENDING
   frame from its peer. In either case, the endpoint sends a
   RESET_STREAM frame, which causes the stream to enter the "Reset Sent"
   state.

   An endpoint MAY send a RESET_STREAM as the first frame that mentions
   a stream; this causes the sending part of that stream to open and
   then immediately transition to the "Reset Sent" state.

   Once a packet containing a RESET_STREAM has been acknowledged, the
   sending part of the stream enters the "Reset Recvd" state, which is a
   terminal state.

3.2. Receiving Stream States

   Figure 3 shows the states for the part of a stream that receives data
   from a peer. The states for a receiving part of a stream mirror only
   some of the states of the sending part of the stream at the peer.
   The receiving part of a stream does not track states on the sending
   part that cannot be observed, such as the "Ready" state. Instead,
   the receiving part of a stream tracks the delivery of data to the
   application, some of which cannot be observed by the sender.
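
   For comparison with the receiving side, the sending-part transitions
   of Section 3.1 (Figure 2) could be captured in a transition table, as
   in the Python sketch below. It is illustrative only: the event names
   are assumptions made for this example, retransmissions are not
   modelled, and an implementation is free to structure its state
   differently.

```python
from enum import Enum, auto


class SendState(Enum):
    READY = auto()
    SEND = auto()
    DATA_SENT = auto()
    DATA_RECVD = auto()   # terminal
    RESET_SENT = auto()
    RESET_RECVD = auto()  # terminal


class SendEvent(Enum):
    """Hypothetical event names; they are not defined by the protocol."""
    SEND_STREAM = auto()        # STREAM or STREAM_DATA_BLOCKED frame sent
    SEND_STREAM_FIN = auto()    # STREAM frame with the FIN bit sent
    SEND_RESET_STREAM = auto()  # RESET_STREAM frame sent
    RECV_ALL_ACKS = auto()      # all stream data acknowledged
    RECV_RESET_ACK = auto()     # RESET_STREAM acknowledged


S, E = SendState, SendEvent

TRANSITIONS = {
    (S.READY, E.SEND_STREAM): S.SEND,
    # Sending STREAM + FIN from "Ready" passes through "Send" implicitly.
    (S.READY, E.SEND_STREAM_FIN): S.DATA_SENT,
    (S.READY, E.SEND_RESET_STREAM): S.RESET_SENT,
    (S.SEND, E.SEND_STREAM): S.SEND,
    (S.SEND, E.SEND_STREAM_FIN): S.DATA_SENT,
    (S.SEND, E.SEND_RESET_STREAM): S.RESET_SENT,
    (S.DATA_SENT, E.RECV_ALL_ACKS): S.DATA_RECVD,
    (S.DATA_SENT, E.SEND_RESET_STREAM): S.RESET_SENT,
    (S.RESET_SENT, E.RECV_RESET_ACK): S.RESET_RECVD,
}


def next_state(state: SendState, event: SendEvent) -> SendState:
    """Return the state after `event`, or raise if Figure 2 has no such edge."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"{event.name} is not permitted in {state.name}") from None
```

   Terminal states have no outgoing entries, so any event applied to
   "Data Recvd" or "Reset Recvd" is rejected by this sketch.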

          o
          | Recv STREAM / STREAM_DATA_BLOCKED / RESET_STREAM
          | Create Bidirectional Stream (Sending)
          | Recv MAX_STREAM_DATA / STOP_SENDING (Bidirectional)
          | Create Higher-Numbered Stream
          v
      +-------+
      | Recv  | Recv RESET_STREAM
      |       |-----------------------.
      +-------+                       |
          |                           |
          | Recv STREAM + FIN         |
          v                           |
      +-------+                       |
      | Size  | Recv RESET_STREAM     |
      | Known |---------------------->|
      +-------+                       |
          |                           |
          | Recv All Data             |
          v                           v
      +-------+ Recv RESET_STREAM +-------+
      | Data  |--- (optional) --->| Reset |
      | Recvd |  Recv All Data    | Recvd |
      +-------+<-- (optional) ----+-------+
          |                           |
          | App Read All Data         | App Read Reset
          v                           v
      +-------+                   +-------+
      | Data  |                   | Reset |
      | Read  |                   | Read  |
      +-------+                   +-------+

            Figure 3: States for Receiving Parts of Streams

   The receiving part of a stream initiated by a peer (types 1 and 3 for
   a client, or 0 and 2 for a server) is created when the first STREAM,
   STREAM_DATA_BLOCKED, or RESET_STREAM frame is received for that
   stream. For bidirectional streams initiated by a peer, receipt of a
   MAX_STREAM_DATA or STOP_SENDING frame for the sending part of the
   stream also creates the receiving part. The initial state for the
   receiving part of a stream is "Recv".

   For a bidirectional stream, the receiving part enters the "Recv"
   state when the sending part initiated by the endpoint (type 0 for a
   client, type 1 for a server) enters the "Ready" state.

   An endpoint opens a bidirectional stream when a MAX_STREAM_DATA or
   STOP_SENDING frame is received from the peer for that stream.
   Receiving a MAX_STREAM_DATA frame for an unopened stream indicates
   that the remote peer has opened the stream and is providing flow
   control credit. Receiving a STOP_SENDING frame for an unopened
   stream indicates that the remote peer no longer wishes to receive
   data on this stream.
   Either frame might arrive before a STREAM or STREAM_DATA_BLOCKED
   frame if packets are lost or reordered.

   Before a stream is created, all streams of the same type with
   lower-numbered stream IDs MUST be created. This ensures that the
   creation order for streams is consistent on both endpoints.

   In the "Recv" state, the endpoint receives STREAM and
   STREAM_DATA_BLOCKED frames. Incoming data is buffered and can be
   reassembled into the correct order for delivery to the application.
   As data is consumed by the application and buffer space becomes
   available, the endpoint sends MAX_STREAM_DATA frames to allow the
   peer to send more data.

   When a STREAM frame with a FIN bit is received, the final size of the
   stream is known; see Section 4.5. The receiving part of the stream
   then enters the "Size Known" state. In this state, the endpoint no
   longer needs to send MAX_STREAM_DATA frames; it only receives any
   retransmissions of stream data.

   Once all data for the stream has been received, the receiving part
   enters the "Data Recvd" state. This might happen as a result of
   receiving the same STREAM frame that causes the transition to "Size
   Known". After all data has been received, any STREAM or
   STREAM_DATA_BLOCKED frames for the stream can be discarded.

   The "Data Recvd" state persists until stream data has been delivered
   to the application. Once stream data has been delivered, the stream
   enters the "Data Read" state, which is a terminal state.

   Receiving a RESET_STREAM frame in the "Recv" or "Size Known" states
   causes the stream to enter the "Reset Recvd" state. This might cause
   the delivery of stream data to the application to be interrupted.

   It is possible that all stream data has already been received when a
   RESET_STREAM is received (that is, in the "Data Recvd" state).
   Similarly, it is possible for remaining stream data to arrive after
   receiving a RESET_STREAM frame (the "Reset Recvd" state). An
   implementation is free to manage this situation as it chooses.

   Sending RESET_STREAM means that an endpoint cannot guarantee delivery
   of stream data; however, there is no requirement that stream data not
   be delivered if a RESET_STREAM is received. An implementation MAY
   interrupt delivery of stream data, discard any data that was not
   consumed, and signal the receipt of the RESET_STREAM. A RESET_STREAM
   signal might be suppressed or withheld if stream data is completely
   received and is buffered to be read by the application. If the
   RESET_STREAM is suppressed, the receiving part of the stream remains
   in "Data Recvd".

   Once the application receives the signal indicating that the stream
   was reset, the receiving part of the stream transitions to the "Reset
   Read" state, which is a terminal state.

3.3. Permitted Frame Types

   The sender of a stream sends just three frame types that affect the
   state of a stream at either sender or receiver: STREAM
   (Section 19.8), STREAM_DATA_BLOCKED (Section 19.13), and RESET_STREAM
   (Section 19.4).

   A sender MUST NOT send any of these frames from a terminal state
   ("Data Recvd" or "Reset Recvd"). A sender MUST NOT send a STREAM or
   STREAM_DATA_BLOCKED frame for a stream in the "Reset Sent" state or
   any terminal state, that is, after sending a RESET_STREAM frame. A
   receiver could receive any of these three frames in any state, due to
   the possibility of delayed delivery of packets carrying them.

   The receiver of a stream sends MAX_STREAM_DATA (Section 19.10) and
   STOP_SENDING frames (Section 19.5).

   The receiver only sends MAX_STREAM_DATA in the "Recv" state.
   A receiver MAY send STOP_SENDING in any state where it has not
   received a RESET_STREAM frame; that is, states other than "Reset
   Recvd" or "Reset Read". However, there is little value in sending a
   STOP_SENDING frame in the "Data Recvd" state, since all stream data
   has been received. A sender could receive either of these two frames
   in any state as a result of delayed delivery of packets.

3.4. Bidirectional Stream States

   A bidirectional stream is composed of sending and receiving parts.
   Implementations can represent states of the bidirectional stream as
   composites of sending and receiving stream states. The simplest
   model presents the stream as "open" when either sending or receiving
   parts are in a non-terminal state and "closed" when both sending and
   receiving streams are in terminal states.

   Table 2 shows a more complex mapping of bidirectional stream states
   that loosely correspond to the stream states in HTTP/2 [HTTP2]. This
   shows that multiple states on sending or receiving parts of streams
   are mapped to the same composite state. Note that this is just one
   possibility for such a mapping; this mapping requires that data is
   acknowledged before the transition to a "closed" or "half-closed"
   state.

   +======================+======================+=================+
   | Sending Part         | Receiving Part       | Composite State |
   +======================+======================+=================+
   | No Stream/Ready      | No Stream/Recv *1    | idle            |
   +----------------------+----------------------+-----------------+
   | Ready/Send/Data Sent | Recv/Size Known      | open            |
   +----------------------+----------------------+-----------------+
   | Ready/Send/Data Sent | Data Recvd/Data Read | half-closed     |
   |                      |                      | (remote)        |
   +----------------------+----------------------+-----------------+
   | Ready/Send/Data Sent | Reset Recvd/Reset    | half-closed     |
   |                      | Read                 | (remote)        |
   +----------------------+----------------------+-----------------+
   | Data Recvd           | Recv/Size Known      | half-closed     |
   |                      |                      | (local)         |
   +----------------------+----------------------+-----------------+
   | Reset Sent/Reset     | Recv/Size Known      | half-closed     |
   | Recvd                |                      | (local)         |
   +----------------------+----------------------+-----------------+
   | Reset Sent/Reset     | Data Recvd/Data Read | closed          |
   | Recvd                |                      |                 |
   +----------------------+----------------------+-----------------+
   | Reset Sent/Reset     | Reset Recvd/Reset    | closed          |
   | Recvd                | Read                 |                 |
   +----------------------+----------------------+-----------------+
   | Data Recvd           | Data Recvd/Data Read | closed          |
   +----------------------+----------------------+-----------------+
   | Data Recvd           | Reset Recvd/Reset    | closed          |
   |                      | Read                 |                 |
   +----------------------+----------------------+-----------------+

          Table 2: Possible Mapping of Stream States to HTTP/2

      Note (*1): A stream is considered "idle" if it has not yet been
      created, or if the receiving part of the stream is in the "Recv"
      state without yet having received any frames.

3.5. Solicited State Transitions

   If an application is no longer interested in the data it is receiving
   on a stream, it can abort reading the stream and specify an
   application error code.

   If the stream is in the "Recv" or "Size Known" states, the transport
   SHOULD signal this by sending a STOP_SENDING frame to prompt closure
   of the stream in the opposite direction. This typically indicates
   that the receiving application is no longer reading data it receives
   from the stream, but it is not a guarantee that incoming data will be
   ignored.

   STREAM frames received after sending a STOP_SENDING frame are still
   counted toward connection and stream flow control, even though these
   frames can be discarded upon receipt.

   A STOP_SENDING frame requests that the receiving endpoint send a
   RESET_STREAM frame. An endpoint that receives a STOP_SENDING frame
   MUST send a RESET_STREAM frame if the stream is in the "Ready" or
   "Send" state. If the stream is in the "Data Sent" state, the
   endpoint MAY defer sending the RESET_STREAM frame until the packets
   containing outstanding data are acknowledged or declared lost. If
   any outstanding data is declared lost, the endpoint SHOULD send a
   RESET_STREAM frame instead of retransmitting the data.

   An endpoint SHOULD copy the error code from the STOP_SENDING frame to
   the RESET_STREAM frame it sends, but can use any application error
   code. An endpoint that sends a STOP_SENDING frame MAY ignore the
   error code in any RESET_STREAM frames subsequently received for that
   stream.

   STOP_SENDING SHOULD only be sent for a stream that has not been reset
   by the peer. STOP_SENDING is most useful for streams in the "Recv"
   or "Size Known" states.

   An endpoint is expected to send another STOP_SENDING frame if a
   packet containing a previous STOP_SENDING is lost.
   However, once either all stream data or a RESET_STREAM frame has
   been received for the stream - that is, the stream is in any state
   other than "Recv" or "Size Known" - sending a STOP_SENDING frame is
   unnecessary.

   An endpoint that wishes to terminate both directions of a
   bidirectional stream can terminate one direction by sending a
   RESET_STREAM frame, and it can encourage prompt termination in the
   opposite direction by sending a STOP_SENDING frame.

4. Flow Control

   Receivers need to limit the amount of data that they are required to
   buffer, in order to prevent a fast sender from overwhelming them or a
   malicious sender from consuming a large amount of memory. To enable
   a receiver to limit memory commitments for a connection, streams are
   flow controlled both individually and across a connection as a whole.
   A QUIC receiver controls the maximum amount of data the sender can
   send on a stream as well as across all streams at any time, as
   described in Section 4.1 and Section 4.2.

   Similarly, to limit concurrency within a connection, a QUIC endpoint
   controls the maximum cumulative number of streams that its peer can
   initiate, as described in Section 4.6.

   Data sent in CRYPTO frames is not flow controlled in the same way as
   stream data. QUIC relies on the cryptographic protocol
   implementation to avoid excessive buffering of data; see [QUIC-TLS].
   To avoid excessive buffering at multiple layers, QUIC implementations
   SHOULD provide an interface for the cryptographic protocol
   implementation to communicate its buffering limits.

4.1. Data Flow Control

   QUIC employs a limit-based flow-control scheme where a receiver
   advertises the limit of total bytes it is prepared to receive on a
   given stream or for the entire connection.
   This leads to two levels of data flow control in QUIC:

   * Stream flow control, which prevents a single stream from consuming
     the entire receive buffer for a connection by limiting the amount
     of data that can be sent on each stream.

   * Connection flow control, which prevents senders from exceeding a
     receiver's buffer capacity for the connection, by limiting the
     total bytes of stream data sent in STREAM frames on all streams.

   Senders MUST NOT send data in excess of either limit.

   A receiver sets initial limits for all streams through transport
   parameters during the handshake (Section 7.4). Subsequently, a
   receiver sends MAX_STREAM_DATA (Section 19.10) or MAX_DATA
   (Section 19.9) frames to the sender to advertise larger limits.

   A receiver can advertise a larger limit for a stream by sending a
   MAX_STREAM_DATA frame with the corresponding stream ID. A
   MAX_STREAM_DATA frame indicates the maximum absolute byte offset of a
   stream. A receiver could determine the flow control offset to be
   advertised based on the current offset of data consumed on that
   stream.

   A receiver can advertise a larger limit for a connection by sending a
   MAX_DATA frame, which indicates the maximum of the sum of the
   absolute byte offsets of all streams. A receiver maintains a
   cumulative sum of bytes received on all streams, which is used to
   check for violations of the advertised connection or stream data
   limits. A receiver could determine the maximum data limit to be
   advertised based on the sum of bytes consumed on all streams.

   Once a receiver advertises a limit for the connection or a stream, it
   is not an error to advertise a smaller limit, but the smaller limit
   has no effect.

   A receiver MUST close the connection with a FLOW_CONTROL_ERROR error
   (Section 11) if the sender violates the advertised connection or
   stream data limits.
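
   The receiver-side accounting described above can be sketched in
   Python as follows. This is a simplified illustration, not a
   definitive implementation: the class and method names are
   hypothetical, a single initial stream limit is assumed for every
   stream, and FlowControlError merely stands in for closing the
   connection with FLOW_CONTROL_ERROR.

```python
class FlowControlError(Exception):
    """Stand-in for a connection error of type FLOW_CONTROL_ERROR."""


class ReceiverFlowControl:
    """Tracks advertised limits and the highest offset seen per stream."""

    def __init__(self, max_data: int, initial_max_stream_data: int):
        self.max_data = max_data                  # connection-level limit
        self.initial_max_stream_data = initial_max_stream_data
        self.stream_limit = {}                    # stream ID -> stream limit
        self.highest_offset = {}                  # stream ID -> highest end offset
        self.total_received = 0                   # sum of highest offsets

    def on_stream_frame(self, stream_id: int, offset: int, length: int) -> None:
        """Check a received STREAM frame against both levels of limits."""
        limit = self.stream_limit.setdefault(
            stream_id, self.initial_max_stream_data)
        end = offset + length
        if end > limit:
            raise FlowControlError("stream data limit exceeded")
        previous = self.highest_offset.get(stream_id, 0)
        if end > previous:
            # Only newly covered offsets count toward the connection limit,
            # so duplicate or overlapping data is not charged twice.
            if self.total_received + (end - previous) > self.max_data:
                raise FlowControlError("connection data limit exceeded")
            self.total_received += end - previous
            self.highest_offset[stream_id] = end

    def advertise_max_stream_data(self, stream_id: int, new_limit: int) -> None:
        """Record a MAX_STREAM_DATA value; a smaller limit has no effect."""
        current = self.stream_limit.get(stream_id, self.initial_max_stream_data)
        self.stream_limit[stream_id] = max(current, new_limit)

    def advertise_max_data(self, new_limit: int) -> None:
        """Record a MAX_DATA value; a smaller limit has no effect."""
        self.max_data = max(self.max_data, new_limit)
```

   Tracking the highest offset per stream, rather than counting bytes
   as they arrive, is one way to keep retransmitted data from being
   counted more than once.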

   A sender MUST ignore any MAX_STREAM_DATA or MAX_DATA frames that do
   not increase flow control limits.

   If a sender has sent data up to the limit, it will be unable to send
   new data and is considered blocked. A sender SHOULD send a
   STREAM_DATA_BLOCKED or DATA_BLOCKED frame to indicate to the receiver
   that it has data to write but is blocked by flow control limits. If
   a sender is blocked for a period longer than the idle timeout
   (Section 10.1), the receiver might close the connection even when the
   sender has data that is available for transmission. To keep the
   connection from closing, a sender that is flow control limited SHOULD
   periodically send a STREAM_DATA_BLOCKED or DATA_BLOCKED frame when it
   has no ack-eliciting packets in flight.

4.2. Increasing Flow Control Limits

   Implementations decide when and how much credit to advertise in
   MAX_STREAM_DATA and MAX_DATA frames, but this section offers a few
   considerations.

   To avoid blocking a sender, a receiver MAY send a MAX_STREAM_DATA or
   MAX_DATA frame multiple times within a round trip or send it early
   enough to allow time for loss of the frame and subsequent recovery.

   Control frames contribute to connection overhead. Therefore,
   frequently sending MAX_STREAM_DATA and MAX_DATA frames with small
   changes is undesirable. On the other hand, if updates are less
   frequent, larger increments to limits are necessary to avoid blocking
   a sender, requiring larger resource commitments at the receiver.
   There is a trade-off between resource commitment and overhead when
   determining how large a limit is advertised.

   A receiver can use an autotuning mechanism to tune the frequency and
   amount of advertised additional credit based on a round-trip time
   estimate and the rate at which the receiving application consumes
   data, similar to common TCP implementations.
   As an optimization, an endpoint could send frames related to flow
   control only when there are other frames to send, ensuring that flow
   control does not cause extra packets to be sent.

   A blocked sender is not required to send STREAM_DATA_BLOCKED or
   DATA_BLOCKED frames. Therefore, a receiver MUST NOT wait for a
   STREAM_DATA_BLOCKED or DATA_BLOCKED frame before sending a
   MAX_STREAM_DATA or MAX_DATA frame; doing so could result in the
   sender being blocked for the rest of the connection. Even if the
   sender sends these frames, waiting for them will result in the sender
   being blocked for at least an entire round trip.

   When a sender receives credit after being blocked, it might be able
   to send a large amount of data in response, resulting in short-term
   congestion; see Section 7.7 in [QUIC-RECOVERY] for a discussion of
   how a sender can avoid this congestion.

4.3. Flow Control Performance

   If an endpoint cannot ensure that its peer always has available flow
   control credit that is greater than the peer's bandwidth-delay
   product on this connection, its receive throughput will be limited by
   flow control.

   Packet loss can cause gaps in the receive buffer, preventing the
   application from consuming data and freeing up receive buffer space.

   Sending timely updates of flow control limits can improve
   performance. Sending packets only to provide flow control updates
   can increase network load and adversely affect performance. Sending
   flow control updates along with other frames, such as ACK frames,
   reduces the cost of those updates.

4.4. Handling Stream Cancellation

   Endpoints need to eventually agree on the amount of flow control
   credit that has been consumed on every stream, to be able to account
   for all bytes for connection-level flow control.

   On receipt of a RESET_STREAM frame, an endpoint will tear down state
   for the matching stream and ignore further data arriving on that
   stream.

   RESET_STREAM terminates one direction of a stream abruptly. For a
   bidirectional stream, RESET_STREAM has no effect on data flow in the
   opposite direction. Both endpoints MUST maintain flow control state
   for the stream in the unterminated direction until that direction
   enters a terminal state.

4.5. Stream Final Size

   The final size is the amount of flow control credit that is consumed
   by a stream. Assuming that every contiguous byte on the stream was
   sent once, the final size is the number of bytes sent. More
   generally, this is one higher than the offset of the byte with the
   largest offset sent on the stream, or zero if no bytes were sent.

   A sender always communicates the final size of a stream to the
   receiver reliably, no matter how the stream is terminated. The final
   size is the sum of the Offset and Length fields of a STREAM frame
   with a FIN flag, noting that these fields might be implicit.
   Alternatively, the Final Size field of a RESET_STREAM frame carries
   this value. This guarantees that both endpoints agree on how much
   flow control credit was consumed by the sender on that stream.

   An endpoint will know the final size for a stream when the receiving
   part of the stream enters the "Size Known" or "Reset Recvd" state
   (Section 3). The receiver MUST use the final size of the stream to
   account for all bytes sent on the stream in its connection-level flow
   controller.

   An endpoint MUST NOT send data on a stream at or beyond the final
   size.

   Once a final size for a stream is known, it cannot change. If a
   RESET_STREAM or STREAM frame is received indicating a change in the
   final size for the stream, an endpoint SHOULD respond with a
   FINAL_SIZE_ERROR error; see Section 11.
   A receiver SHOULD treat receipt of data at or beyond the final size
   as a FINAL_SIZE_ERROR error, even after a stream is closed.
   Generating these errors is not mandatory, because requiring that an
   endpoint generate these errors also means that the endpoint needs to
   maintain the final size state for closed streams, which could mean a
   significant state commitment.

4.6. Controlling Concurrency

   An endpoint limits the cumulative number of incoming streams a peer
   can open. Only streams with a stream ID less than (max_streams * 4 +
   initial_stream_id_for_type) can be opened; see Table 1. Initial
   limits are set in the transport parameters; see Section 18.2.
   Subsequent limits are advertised using MAX_STREAMS frames; see
   Section 19.11. Separate limits apply to unidirectional and
   bidirectional streams.

   If a max_streams transport parameter or a MAX_STREAMS frame is
   received with a value greater than 2^60, this would allow a maximum
   stream ID that cannot be expressed as a variable-length integer; see
   Section 16. If either is received, the connection MUST be closed
   immediately with a connection error of type TRANSPORT_PARAMETER_ERROR
   if the offending value was received in a transport parameter or of
   type FRAME_ENCODING_ERROR if it was received in a frame; see
   Section 10.2.

   Endpoints MUST NOT exceed the limit set by their peer. An endpoint
   that receives a frame with a stream ID exceeding the limit it has
   sent MUST treat this as a connection error of type STREAM_LIMIT_ERROR
   (Section 11).

   Once a receiver advertises a stream limit using the MAX_STREAMS
   frame, advertising a smaller limit has no effect. A receiver MUST
   ignore any MAX_STREAMS frame that does not increase the stream limit.
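
   The stream limit arithmetic above can be illustrated with a short
   Python sketch. The function names are hypothetical, and ValueError
   is used only as a stand-in for the connection errors named in the
   text; this is an aid to reading, not a definitive implementation.

```python
MAX_STREAMS_CEILING = 2**60  # larger values cannot map to a valid stream ID


def check_max_streams(value: int) -> int:
    """Reject a max_streams value that would exceed the stream ID space."""
    if value > MAX_STREAMS_CEILING:
        # Stand-in for TRANSPORT_PARAMETER_ERROR or FRAME_ENCODING_ERROR.
        raise ValueError("max_streams value exceeds 2^60")
    return value


def may_open(stream_id: int, max_streams: int) -> bool:
    """Return True if stream_id falls within the limit for its type."""
    first_id_of_type = stream_id & 0x3  # 0x0 through 0x3; see Table 1
    return stream_id < max_streams * 4 + first_id_of_type


def update_limit(current: int, advertised: int) -> int:
    """Apply a MAX_STREAMS value; one that does not increase the limit is ignored."""
    return max(current, check_max_streams(advertised))
```

   With a limit of exactly 2^60 streams of one type, the largest stream
   ID that can be opened is (2^60 - 1) * 4 + 3 = 2^62 - 1, which is the
   largest value a variable-length integer can carry.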
1275 As with stream and connection flow control, this document leaves 1276 implementations to decide when and how many streams should be 1277 advertised to a peer via MAX_STREAMS. Implementations might choose 1278 to increase limits as streams are closed, to keep the number of 1279 streams available to peers roughly consistent. 1281 An endpoint that is unable to open a new stream due to the peer's 1282 limits SHOULD send a STREAMS_BLOCKED frame (Section 19.14). This 1283 signal is considered useful for debugging. An endpoint MUST NOT wait 1284 to receive this signal before advertising additional credit, since 1285 doing so will mean that the peer will be blocked for at least an 1286 entire round trip, and potentially indefinitely if the peer chooses 1287 not to send STREAMS_BLOCKED frames. 1289 5. Connections 1291 A QUIC connection is shared state between a client and a server. 1293 Each connection starts with a handshake phase, during which the two 1294 endpoints establish a shared secret using the cryptographic handshake 1295 protocol [QUIC-TLS] and negotiate the application protocol. The 1296 handshake (Section 7) confirms that both endpoints are willing to 1297 communicate (Section 8.1) and establishes parameters for the 1298 connection (Section 7.4). 1300 An application protocol can use the connection during the handshake 1301 phase with some limitations. 0-RTT allows application data to be 1302 sent by a client before receiving a response from the server. 1303 However, 0-RTT provides no protection against replay attacks; see 1304 Section 9.2 of [QUIC-TLS]. A server can also send application data 1305 to a client before it receives the final cryptographic handshake 1306 messages that allow it to confirm the identity and liveness of the 1307 client. These capabilities allow an application protocol to offer 1308 the option of trading some security guarantees for reduced latency. 
1310 The use of connection IDs (Section 5.1) allows connections to migrate 1311 to a new network path, both as a direct choice of an endpoint and 1312 when forced by a change in a middlebox. Section 9 describes 1313 mitigations for the security and privacy issues associated with 1314 migration. 1316 For connections that are no longer needed or desired, there are 1317 several ways for a client and server to terminate a connection, as 1318 described in Section 10. 1320 5.1. Connection ID 1322 Each connection possesses a set of connection identifiers, or 1323 connection IDs, each of which can identify the connection. 1324 Connection IDs are independently selected by endpoints; each endpoint 1325 selects the connection IDs that its peer uses. 1327 The primary function of a connection ID is to ensure that changes in 1328 addressing at lower protocol layers (UDP, IP) do not cause packets 1329 for a QUIC connection to be delivered to the wrong endpoint. Each 1330 endpoint selects connection IDs using an implementation-specific (and 1331 perhaps deployment-specific) method that will allow packets with that 1332 connection ID to be routed back to the endpoint and to be identified 1333 by the endpoint upon receipt. 1335 Multiple connection IDs are used so that endpoints can send packets 1336 that cannot be identified by an observer as being for the same 1337 connection without cooperation from an endpoint; see Section 9.5. 1339 Connection IDs MUST NOT contain any information that can be used by 1340 an external observer (that is, one that does not cooperate with the 1341 issuer) to correlate them with other connection IDs for the same 1342 connection. As a trivial example, this means the same connection ID 1343 MUST NOT be issued more than once on the same connection. 1345 Packets with long headers include Source Connection ID and 1346 Destination Connection ID fields. These fields are used to set the 1347 connection IDs for new connections; see Section 7.2 for details. 
1349 Packets with short headers (Section 17.3) only include the 1350 Destination Connection ID and omit the explicit length. The length 1351 of the Destination Connection ID field is expected to be known to 1352 endpoints. Endpoints using a load balancer that routes based on 1353 connection ID could agree with the load balancer on a fixed length 1354 for connection IDs, or agree on an encoding scheme. A fixed portion 1355 could encode an explicit length, which allows the entire connection 1356 ID to vary in length and still be used by the load balancer. 1358 A Version Negotiation (Section 17.2.1) packet echoes the connection 1359 IDs selected by the client, both to ensure correct routing toward the 1360 client and to demonstrate that the packet is in response to a packet 1361 sent by the client. 1363 A zero-length connection ID can be used when a connection ID is not 1364 needed to route to the correct endpoint. However, multiplexing 1365 connections on the same local IP address and port while using zero- 1366 length connection IDs will cause failures in the presence of peer 1367 connection migration, NAT rebinding, and client port reuse. An 1368 endpoint MUST NOT use the same IP address and port for multiple 1369 concurrent connections with zero-length connection IDs, unless it is 1370 certain that those protocol features are not in use. 1372 When an endpoint uses a non-zero-length connection ID, it needs to 1373 ensure that the peer has a supply of connection IDs from which to 1374 choose for packets sent to the endpoint. These connection IDs are 1375 supplied by the endpoint using the NEW_CONNECTION_ID frame 1376 (Section 19.15). 1378 5.1.1. Issuing Connection IDs 1380 Each Connection ID has an associated sequence number to assist in 1381 detecting when NEW_CONNECTION_ID or RETIRE_CONNECTION_ID frames refer 1382 to the same value. 
The initial connection ID issued by an endpoint 1383 is sent in the Source Connection ID field of the long packet header 1384 (Section 17.2) during the handshake. The sequence number of the 1385 initial connection ID is 0. If the preferred_address transport 1386 parameter is sent, the sequence number of the supplied connection ID 1387 is 1. 1389 Additional connection IDs are communicated to the peer using 1390 NEW_CONNECTION_ID frames (Section 19.15). The sequence number on 1391 each newly issued connection ID MUST increase by 1. The connection 1392 ID that a client selects for the first Destination Connection ID 1393 field it sends and any connection ID provided by a Retry packet are 1394 not assigned sequence numbers. 1396 When an endpoint issues a connection ID, it MUST accept packets that 1397 carry this connection ID for the duration of the connection or until 1398 its peer invalidates the connection ID via a RETIRE_CONNECTION_ID 1399 frame (Section 19.16). Connection IDs that are issued and not 1400 retired are considered active; any active connection ID is valid for 1401 use with the current connection at any time, in any packet type. 1402 This includes the connection ID issued by the server via the 1403 preferred_address transport parameter. 1405 An endpoint SHOULD ensure that its peer has a sufficient number of 1406 available and unused connection IDs. Endpoints advertise the number 1407 of active connection IDs they are willing to maintain using the 1408 active_connection_id_limit transport parameter. An endpoint MUST NOT 1409 provide more connection IDs than the peer's limit. An endpoint MAY 1410 send connection IDs that temporarily exceed a peer's limit if the 1411 NEW_CONNECTION_ID frame also requires the retirement of any excess, 1412 by including a sufficiently large value in the Retire Prior To field. 
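The sequence numbering and limit rules above can be illustrated with a small issuer. This is a sketch under stated assumptions: the class ConnectionIdIssuer and its methods are hypothetical, connection IDs are drawn at random with a fixed length (this document leaves the selection method to implementations), and the temporary over-limit case that relies on the Retire Prior To field is not modeled.

```python
import secrets


class ConnectionIdIssuer:
    """Hypothetical tracker for connection IDs issued to one peer."""

    def __init__(self, peer_active_limit, cid_length=8, uses_preferred_address=False):
        self.peer_active_limit = peer_active_limit
        self.cid_length = cid_length
        self.active = {}      # sequence number -> connection ID
        self._issued = set()  # every ID ever issued; none may repeat
        self.next_seq = 0
        self._add()           # sequence 0: Source Connection ID used in the handshake
        if uses_preferred_address:
            self._add()       # sequence 1: ID carried in preferred_address

    def _add(self):
        cid = secrets.token_bytes(self.cid_length)
        while cid in self._issued:
            cid = secrets.token_bytes(self.cid_length)
        self._issued.add(cid)
        seq = self.next_seq
        self.active[seq] = cid
        self.next_seq += 1    # each newly issued ID increases the sequence by 1
        return seq, cid

    def issue(self):
        """Return (sequence, cid) for a NEW_CONNECTION_ID frame, or None at the limit."""
        if len(self.active) >= self.peer_active_limit:
            return None
        return self._add()

    def on_retire(self, seq):
        """Handle RETIRE_CONNECTION_ID from the peer for the given sequence number."""
        self.active.pop(seq, None)
```

With a peer limit of 2, the issuer can supply one additional connection ID after the handshake and must wait for a retirement before supplying another.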
1414 A NEW_CONNECTION_ID frame might cause an endpoint to add some active 1415 connection IDs and retire others based on the value of the Retire 1416 Prior To field. After processing a NEW_CONNECTION_ID frame and 1417 adding and retiring active connection IDs, if the number of active 1418 connection IDs exceeds the value advertised in its 1419 active_connection_id_limit transport parameter, an endpoint MUST 1420 close the connection with an error of type CONNECTION_ID_LIMIT_ERROR. 1422 An endpoint SHOULD supply a new connection ID when the peer retires a 1423 connection ID. If an endpoint provided fewer connection IDs than the 1424 peer's active_connection_id_limit, it MAY supply a new connection ID 1425 when it receives a packet with a previously unused connection ID. An 1426 endpoint MAY limit the total number of connection IDs issued for each 1427 connection to avoid the risk of running out of connection IDs; see 1428 Section 10.3.2. An endpoint MAY also limit the issuance of 1429 connection IDs to reduce the amount of per-path state it maintains, 1430 such as path validation status, as its peer might interact with it 1431 over as many paths as there are issued connection IDs. 1433 An endpoint that initiates migration and requires non-zero-length 1434 connection IDs SHOULD ensure that the pool of connection IDs 1435 available to its peer allows the peer to use a new connection ID on 1436 migration, as the peer will be unable to respond if the pool is 1437 exhausted. 1439 An endpoint that selects a zero-length connection ID during the 1440 handshake cannot issue a new connection ID. A zero-length 1441 Destination Connection ID field is used in all packets sent toward 1442 such an endpoint over any network path. 1444 5.1.2. Consuming and Retiring Connection IDs 1446 An endpoint can change the connection ID it uses for a peer to 1447 another available one at any time during the connection. 
An endpoint 1448 consumes connection IDs in response to a migrating peer; see 1449 Section 9.5 for more. 1451 An endpoint maintains a set of connection IDs received from its peer, 1452 any of which it can use when sending packets. When the endpoint 1453 wishes to remove a connection ID from use, it sends a 1454 RETIRE_CONNECTION_ID frame to its peer. Sending a 1455 RETIRE_CONNECTION_ID frame indicates that the connection ID will not 1456 be used again and requests that the peer replace it with a new 1457 connection ID using a NEW_CONNECTION_ID frame. 1459 As discussed in Section 9.5, endpoints limit the use of a connection 1460 ID to packets sent from a single local address to a single 1461 destination address. Endpoints SHOULD retire connection IDs when 1462 they are no longer actively using either the local or destination 1463 address for which the connection ID was used. 1465 An endpoint might need to stop accepting previously issued connection 1466 IDs in certain circumstances. Such an endpoint can cause its peer to 1467 retire connection IDs by sending a NEW_CONNECTION_ID frame with an 1468 increased Retire Prior To field. The endpoint SHOULD continue to 1469 accept the previously issued connection IDs until they are retired by 1470 the peer. If the endpoint can no longer process the indicated 1471 connection IDs, it MAY close the connection. 1473 Upon receipt of an increased Retire Prior To field, the peer MUST 1474 stop using the corresponding connection IDs and retire them with 1475 RETIRE_CONNECTION_ID frames before adding the newly provided 1476 connection ID to the set of active connection IDs. This ordering 1477 allows an endpoint to replace all active connection IDs without the 1478 possibility of a peer having no available connection IDs and without 1479 exceeding the limit the peer sets in the active_connection_id_limit 1480 transport parameter; see Section 18.2. 
Failure to cease using the 1481 connection IDs when requested can result in connection failures, as 1482 the issuing endpoint might be unable to continue using the connection 1483 IDs with the active connection. 1485 An endpoint SHOULD limit the number of connection IDs it has retired 1486 locally for which RETIRE_CONNECTION_ID frames have not yet been acknowledged. An endpoint SHOULD allow 1487 for sending and tracking a number of RETIRE_CONNECTION_ID frames of 1488 at least twice the active_connection_id_limit. An endpoint MUST NOT 1489 forget a connection ID without retiring it, though it MAY choose to 1490 treat having connection IDs in need of retirement that exceed this 1491 limit as a connection error of type CONNECTION_ID_LIMIT_ERROR. 1493 Endpoints SHOULD NOT issue updates of the Retire Prior To field 1494 before receiving RETIRE_CONNECTION_ID frames that retire all 1495 connection IDs indicated by the previous Retire Prior To value. 1497 5.2. Matching Packets to Connections 1499 Incoming packets are classified on receipt. Packets can either be 1500 associated with an existing connection, or - for servers - 1501 potentially create a new connection. 1503 Endpoints try to associate a packet with an existing connection. If 1504 the packet has a non-zero-length Destination Connection ID 1505 corresponding to an existing connection, QUIC processes that packet 1506 accordingly. Note that more than one connection ID can be associated 1507 with a connection; see Section 5.1. 1509 If the Destination Connection ID is zero length and the addressing 1510 information in the packet matches the addressing information the 1511 endpoint uses to identify a connection with a zero-length connection 1512 ID, QUIC processes the packet as part of that connection. An 1513 endpoint can use just destination IP and port or both source and 1514 destination addresses for identification, though this makes 1515 connections fragile as described in Section 5.1.
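The lookup described above can be sketched as a table keyed two ways. This is an illustrative assumption about data structures, not something this document specifies: the ConnectionTable class is hypothetical, and it identifies zero-length connections by local address and port only.

```python
class ConnectionTable:
    """Hypothetical index used to associate incoming packets with connections."""

    def __init__(self):
        self._by_cid = {}         # non-zero-length connection ID -> connection
        self._by_local_addr = {}  # (local IP, local port) -> connection

    def register_cid(self, cid, conn):
        """Associate one of possibly many connection IDs with a connection."""
        if not cid:
            raise ValueError("use register_zero_length for zero-length IDs")
        self._by_cid[cid] = conn

    def register_zero_length(self, local_addr, conn):
        """Register a connection that uses a zero-length connection ID."""
        if local_addr in self._by_local_addr:
            # Sharing an address and port between such connections is fragile.
            raise ValueError("address already used by a zero-length connection")
        self._by_local_addr[local_addr] = conn

    def match(self, dcid, local_addr):
        """Return the matching connection, or None if the packet is unattributed."""
        if dcid:
            return self._by_cid.get(dcid)
        return self._by_local_addr.get(local_addr)
```

A result of None corresponds to a packet that cannot be attributed to an existing connection.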
1517 Endpoints can send a Stateless Reset (Section 10.3) for any packets 1518 that cannot be attributed to an existing connection. A stateless 1519 reset allows a peer to more quickly identify when a connection 1520 becomes unusable. 1522 Packets that are matched to an existing connection are discarded if 1523 the packets are inconsistent with the state of that connection. For 1524 example, packets are discarded if they indicate a different protocol 1525 version than that of the connection, or if the removal of packet 1526 protection is unsuccessful once the expected keys are available. 1528 Invalid packets that lack strong integrity protection, such as 1529 Initial, Retry, or Version Negotiation, MAY be discarded. An 1530 endpoint MUST generate a connection error if processing the contents 1531 of these packets prior to discovering an error, or fully revert any 1532 changes made during that processing. 1534 5.2.1. Client Packet Handling 1536 Valid packets sent to clients always include a Destination Connection 1537 ID that matches a value the client selects. Clients that choose to 1538 receive zero-length connection IDs can use the local address and port 1539 to identify a connection. Packets that do not match an existing 1540 connection, based on Destination Connection ID or, if this value is 1541 zero-length, local IP address and port, are discarded. 1543 Due to packet reordering or loss, a client might receive packets for 1544 a connection that are encrypted with a key it has not yet computed. 1545 The client MAY drop these packets, or MAY buffer them in anticipation 1546 of later packets that allow it to compute the key. 1548 If a client receives a packet that uses a different version than it 1549 initially selected, it MUST discard that packet. 1551 5.2.2. 
Server Packet Handling 1553 If a server receives a packet that indicates an unsupported version 1554 and if the packet is large enough to initiate a new connection for 1555 any supported version, the server SHOULD send a Version Negotiation 1556 packet as described in Section 6.1. A server MAY limit the number of 1557 packets to which it responds with a Version Negotiation packet. 1558 Servers MUST drop smaller packets that specify unsupported versions. 1560 The first packet for an unsupported version can use different 1561 semantics and encodings for any version-specific field. In 1562 particular, different packet protection keys might be used for 1563 different versions. Servers that do not support a particular version 1564 are unlikely to be able to decrypt the payload of the packet or 1565 properly interpret the result. Servers SHOULD respond with a Version 1566 Negotiation packet, provided that the datagram is sufficiently long. 1568 Packets with a supported version, or no version field, are matched to 1569 a connection using the connection ID or - for packets with zero- 1570 length connection IDs - the local address and port. These packets 1571 are processed using the selected connection; otherwise, the server 1572 continues below. 1574 If the packet is an Initial packet fully conforming with the 1575 specification, the server proceeds with the handshake (Section 7). 1576 This commits the server to the version that the client selected. 1578 If a server refuses to accept a new connection, it SHOULD send an 1579 Initial packet containing a CONNECTION_CLOSE frame with error code 1580 CONNECTION_REFUSED. 1582 If the packet is a 0-RTT packet, the server MAY buffer a limited 1583 number of these packets in anticipation of a late-arriving Initial 1584 packet. Clients are not able to send Handshake packets prior to 1585 receiving a server response, so servers SHOULD ignore any such 1586 packets. 1588 Servers MUST drop incoming packets under all other circumstances. 
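The server-side decisions in this section can be summarized as one classification function. This is a simplified sketch: the Action enum, the function name, and the string packet type labels are hypothetical; the 1200-byte default stands in for "large enough to initiate a new connection" and is an assumption tied to this version of QUIC; and the validity checks on Initial packets and the option to refuse a connection are omitted.

```python
from enum import Enum


class Action(Enum):
    SEND_VERSION_NEGOTIATION = "send_version_negotiation"
    PROCESS_ON_CONNECTION = "process_on_connection"
    START_HANDSHAKE = "start_handshake"
    BUFFER_0RTT = "buffer_0rtt"
    DROP = "drop"


def classify_server_packet(packet_type, version, datagram_size, has_connection,
                           supported_versions, min_initial_datagram=1200):
    """Decide what a server does with an incoming packet.

    version is None for packets without a version field (short header).
    """
    if version is not None and version not in supported_versions:
        if datagram_size >= min_initial_datagram:
            return Action.SEND_VERSION_NEGOTIATION
        return Action.DROP  # smaller packets with unsupported versions are dropped
    if has_connection:
        return Action.PROCESS_ON_CONNECTION
    if packet_type == "initial":
        return Action.START_HANDSHAKE  # assumes the Initial packet is fully conforming
    if packet_type == "0rtt":
        return Action.BUFFER_0RTT      # optional; a server may also drop these
    return Action.DROP
```

A real server would additionally rate-limit Version Negotiation responses and bound the number of buffered 0-RTT packets.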
1590 5.2.3. Considerations for Simple Load Balancers 1592 A server deployment could load balance among servers using only 1593 source and destination IP addresses and ports. Changes to the 1594 client's IP address or port could result in packets being forwarded 1595 to the wrong server. Such a server deployment could use one of the 1596 following methods for connection continuity when a client's address 1597 changes. 1599 * Servers could use an out-of-band mechanism to forward packets to 1600 the correct server based on Connection ID. 1602 * If servers can use a dedicated server IP address or port, other 1603 than the one that the client initially connects to, they could use 1604 the preferred_address transport parameter to request that clients 1605 move connections to that dedicated address. Note that clients 1606 could choose not to use the preferred address. 1608 A server in a deployment that does not implement a solution to 1609 maintain connection continuity when the client address changes SHOULD 1610 indicate migration is not supported using the 1611 disable_active_migration transport parameter. The 1612 disable_active_migration transport parameter does not prohibit 1613 connection migration after a client has acted on a preferred_address 1614 transport parameter. 1616 Server deployments that use this simple form of load balancing MUST 1617 avoid the creation of a stateless reset oracle; see Section 21.11. 1619 5.3. Operations on Connections 1621 This document does not define an API for QUIC, but instead defines a 1622 set of functions for QUIC connections that application protocols can 1623 rely upon. An application protocol can assume that an implementation 1624 of QUIC provides an interface that includes the operations described 1625 in this section. An implementation designed for use with a specific 1626 application protocol might provide only those operations that are 1627 used by that protocol. 
1629 When implementing the client role, an application protocol can: 1631 * open a connection, which begins the exchange described in 1632 Section 7; 1634 * enable Early Data when available; and 1636 * be informed when Early Data has been accepted or rejected by a 1637 server. 1639 When implementing the server role, an application protocol can: 1641 * listen for incoming connections, which prepares for the exchange 1642 described in Section 7; 1644 * if Early Data is supported, embed application-controlled data in 1645 the TLS resumption ticket sent to the client; and 1647 * if Early Data is supported, retrieve application-controlled data 1648 from the client's resumption ticket and accept or reject Early 1649 Data based on that information. 1651 In either role, an application protocol can: 1653 * configure minimum values for the initial number of permitted 1654 streams of each type, as communicated in the transport parameters 1655 (Section 7.4); 1657 * control resource allocation for receive buffers by setting flow 1658 control limits both for streams and for the connection; 1660 * identify whether the handshake has completed successfully or is 1661 still ongoing; 1663 * keep a connection from silently closing, either by generating PING 1664 frames (Section 19.2) or by requesting that the transport send 1665 additional frames before the idle timeout expires (Section 10.1); 1666 and 1668 * immediately close (Section 10.2) the connection. 1670 6. Version Negotiation 1672 Version negotiation allows a server to indicate that it does not 1673 support the version the client used. A server sends a Version 1674 Negotiation packet in response to each packet that might initiate a 1675 new connection; see Section 5.2 for details. 1677 The size of the first packet sent by a client will determine whether 1678 a server sends a Version Negotiation packet.
Clients that support 1679 multiple QUIC versions SHOULD ensure that the first UDP datagram they 1680 send is sized to the largest of the minimum datagram sizes from all 1681 versions they support, using PADDING frames (Section 19.1) as 1682 necessary. This ensures that the server responds if there is a 1683 mutually supported version. A server might not send a Version 1684 Negotiation packet if the datagram it receives is smaller than the 1685 minimum size specified in a different version; see Section 14.1. 1687 6.1. Sending Version Negotiation Packets 1689 If the version selected by the client is not acceptable to the 1690 server, the server responds with a Version Negotiation packet; see 1691 Section 17.2.1. This includes a list of versions that the server 1692 will accept. An endpoint MUST NOT send a Version Negotiation packet 1693 in response to receiving a Version Negotiation packet. 1695 This system allows a server to process packets with unsupported 1696 versions without retaining state. Though either the Initial packet 1697 or the Version Negotiation packet that is sent in response could be 1698 lost, the client will send new packets until it successfully receives 1699 a response or it abandons the connection attempt. 1701 A server MAY limit the number of Version Negotiation packets it 1702 sends. For instance, a server that is able to recognize packets as 1703 0-RTT might choose not to send Version Negotiation packets in 1704 response to 0-RTT packets with the expectation that it will 1705 eventually receive an Initial packet. 1707 6.2. Handling Version Negotiation Packets 1709 Version Negotiation packets are designed to allow for functionality 1710 to be defined in the future that allows QUIC to negotiate the version 1711 of QUIC to use for a connection. 
Future standards-track 1712 specifications might change how implementations that support multiple 1713 versions of QUIC react to Version Negotiation packets received in 1714 response to an attempt to establish a connection using this version. 1716 A client that supports only this version of QUIC MUST abandon the 1717 current connection attempt if it receives a Version Negotiation 1718 packet, with the following two exceptions. A client MUST discard any 1719 Version Negotiation packet if it has received and successfully 1720 processed any other packet, including an earlier Version Negotiation 1721 packet. A client MUST discard a Version Negotiation packet that 1722 lists the QUIC version selected by the client. 1724 How to perform version negotiation is left as future work defined by 1725 future standards-track specifications. In particular, that future 1726 work will ensure robustness against version downgrade attacks; see 1727 Section 21.12. 1729 6.2.1. Version Negotiation Between Draft Versions 1731 [[RFC editor: please remove this section before publication.]] 1733 When a draft implementation receives a Version Negotiation packet, it 1734 MAY use it to attempt a new connection with one of the versions 1735 listed in the packet, instead of abandoning the current connection 1736 attempt; see Section 6.2. 1738 The client MUST check that the Destination and Source Connection ID 1739 fields match the Source and Destination Connection ID fields in a 1740 packet that the client sent. If this check fails, the packet MUST be 1741 discarded. 1743 Once the Version Negotiation packet is determined to be valid, the 1744 client then selects an acceptable protocol version from the list 1745 provided by the server. The client then attempts to create a new 1746 connection using that version. The new connection MUST use a new 1747 random Destination Connection ID different from the one it had 1748 previously sent. 
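The client-side checks from Section 6.2 and this section can be combined into one decision function. This is a sketch only: the function name, its parameters, and the tuple return values are hypothetical. The client_versions list is assumed to be in the client's preference order, and a client that supports only one version passes a single-element list.

```python
def handle_version_negotiation(sent_dcid, sent_scid, vn_dcid, vn_scid,
                               offered_versions, attempted_version,
                               client_versions, processed_other_packet):
    """Return ("discard", None), ("abandon", None), or ("retry", version)."""
    # Discard if any other packet was already received and processed.
    if processed_other_packet:
        return ("discard", None)
    # The packet must echo the connection IDs the client sent, swapped.
    if vn_dcid != sent_scid or vn_scid != sent_dcid:
        return ("discard", None)
    # A packet that lists the version the client selected is discarded.
    if attempted_version in offered_versions:
        return ("discard", None)
    # Draft implementations may attempt a new connection with a listed version.
    for version in client_versions:
        if version in offered_versions:
            return ("retry", version)
    return ("abandon", None)
```

When the result is "retry", the new connection attempt needs a fresh random Destination Connection ID that differs from the one previously sent.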
1750 Note that this mechanism does not protect against downgrade attacks 1751 and MUST NOT be used outside of draft implementations. 1753 6.3. Using Reserved Versions 1755 For a server to use a new version in the future, clients need to 1756 correctly handle unsupported versions. Some version numbers 1757 (0x?a?a?a?a as defined in Section 15) are reserved for inclusion in 1758 fields that contain version numbers. 1760 Endpoints MAY add reserved versions to any field where unknown or 1761 unsupported versions are ignored to test that a peer correctly 1762 ignores the value. For instance, an endpoint could include a 1763 reserved version in a Version Negotiation packet; see Section 17.2.1. 1764 Endpoints MAY send packets with a reserved version to test that a 1765 peer correctly discards the packet. 1767 7. Cryptographic and Transport Handshake 1769 QUIC relies on a combined cryptographic and transport handshake to 1770 minimize connection establishment latency. QUIC uses the CRYPTO 1771 frame (Section 19.6) to transmit the cryptographic handshake. The 1772 version of QUIC defined in this document is identified as 0x00000001 1773 and uses TLS as described in [QUIC-TLS]; a different QUIC version 1774 could indicate that a different cryptographic handshake protocol is 1775 in use. 1777 QUIC provides reliable, ordered delivery of the cryptographic 1778 handshake data. QUIC packet protection is used to encrypt as much of 1779 the handshake protocol as possible. 
The cryptographic handshake MUST 1780 provide the following properties: 1782 * authenticated key exchange, where 1784 - a server is always authenticated, 1786 - a client is optionally authenticated, 1788 - every connection produces distinct and unrelated keys, and 1789 - keying material is usable for packet protection for both 0-RTT 1790 and 1-RTT packets 1792 * authenticated exchange of values for transport parameters of both 1793 endpoints, and confidentiality protection for server transport 1794 parameters (see Section 7.4) 1796 * authenticated negotiation of an application protocol (TLS uses 1797 ALPN [ALPN] for this purpose) 1799 The CRYPTO frame can be sent in different packet number spaces 1800 (Section 12.3). The offsets used by CRYPTO frames to ensure ordered 1801 delivery of cryptographic handshake data start from zero in each 1802 packet number space. 1804 Figure 4 shows a simplified handshake and the exchange of packets and 1805 frames that are used to advance the handshake. Exchange of 1806 application data during the handshake is enabled where possible, 1807 shown with a '*'. Once the handshake is complete, endpoints are able 1808 to exchange application data freely. 1810 Client Server 1812 Initial (CRYPTO) 1813 0-RTT (*) ----------> 1814 Initial (CRYPTO) 1815 Handshake (CRYPTO) 1816 <---------- 1-RTT (*) 1817 Handshake (CRYPTO) 1818 1-RTT (*) ----------> 1819 <---------- 1-RTT (HANDSHAKE_DONE) 1821 1-RTT <=========> 1-RTT 1823 Figure 4: Simplified QUIC Handshake 1825 Endpoints can use packets sent during the handshake to test for 1826 Explicit Congestion Notification (ECN) support; see Section 13.4. An 1827 endpoint validates support for ECN by observing whether the ACK 1828 frames acknowledging the first packets it sends carry ECN counts, as 1829 described in Section 13.4.2. 1831 Endpoints MUST explicitly negotiate an application protocol. This 1832 avoids situations where there is a disagreement about the protocol 1833 that is in use. 1835 7.1. 
Example Handshake Flows 1837 Details of how TLS is integrated with QUIC are provided in 1838 [QUIC-TLS], but some examples are provided here. An extension of 1839 this exchange to support client address validation is shown in 1840 Section 8.1.2. 1842 Once any address validation exchanges are complete, the cryptographic 1843 handshake is used to agree on cryptographic keys. The cryptographic 1844 handshake is carried in Initial (Section 17.2.2) and Handshake 1845 (Section 17.2.4) packets. 1847 Figure 5 provides an overview of the 1-RTT handshake. Each line 1848 shows a QUIC packet with the packet type and packet number shown 1849 first, followed by the frames that are typically contained in those 1850 packets. So, for instance the first packet is of type Initial, with 1851 packet number 0, and contains a CRYPTO frame carrying the 1852 ClientHello. 1854 Multiple QUIC packets -- even of different packet types -- can be 1855 coalesced into a single UDP datagram; see Section 12.2. As a result, 1856 this handshake could consist of as few as 4 UDP datagrams, or any 1857 number more (subject to limits inherent to the protocol, such as 1858 congestion control and anti-amplification). For instance, the 1859 server's first flight contains Initial packets, Handshake packets, 1860 and "0.5-RTT data" in 1-RTT packets. 1862 Client Server 1864 Initial[0]: CRYPTO[CH] -> 1866 Initial[0]: CRYPTO[SH] ACK[0] 1867 Handshake[0]: CRYPTO[EE, CERT, CV, FIN] 1868 <- 1-RTT[0]: STREAM[1, "..."] 1870 Initial[1]: ACK[0] 1871 Handshake[0]: CRYPTO[FIN], ACK[0] 1872 1-RTT[0]: STREAM[0, "..."], ACK[0] -> 1874 Handshake[1]: ACK[0] 1875 <- 1-RTT[1]: HANDSHAKE_DONE, STREAM[3, "..."], ACK[0] 1877 Figure 5: Example 1-RTT Handshake 1879 Figure 6 shows an example of a connection with a 0-RTT handshake and 1880 a single packet of 0-RTT data. 
Note that as described in 1881 Section 12.3, the server acknowledges 0-RTT data in 1-RTT packets, 1882 and the client sends 1-RTT packets in the same packet number space. 1884 Client Server 1886 Initial[0]: CRYPTO[CH] 1887 0-RTT[0]: STREAM[0, "..."] -> 1889 Initial[0]: CRYPTO[SH] ACK[0] 1890 Handshake[0]: CRYPTO[EE, FIN] 1891 <- 1-RTT[0]: STREAM[1, "..."] ACK[0] 1893 Initial[1]: ACK[0] 1894 Handshake[0]: CRYPTO[FIN], ACK[0] 1895 1-RTT[1]: STREAM[0, "..."] ACK[0] -> 1897 Handshake[1]: ACK[0] 1898 <- 1-RTT[1]: HANDSHAKE_DONE, STREAM[3, "..."], ACK[1] 1900 Figure 6: Example 0-RTT Handshake 1902 7.2. Negotiating Connection IDs 1904 A connection ID is used to ensure consistent routing of packets, as 1905 described in Section 5.1. The long header contains two connection 1906 IDs: the Destination Connection ID is chosen by the recipient of the 1907 packet and is used to provide consistent routing; the Source 1908 Connection ID is used to set the Destination Connection ID used by 1909 the peer. 1911 During the handshake, packets with the long header (Section 17.2) are 1912 used to establish the connection IDs used by both endpoints. Each 1913 endpoint uses the Source Connection ID field to specify the 1914 connection ID that is used in the Destination Connection ID field of 1915 packets being sent to them. After processing the first Initial 1916 packet, each endpoint sets the Destination Connection ID field in 1917 subsequent packets it sends to the value of the Source Connection ID 1918 field that it received. 1920 When an Initial packet is sent by a client that has not previously 1921 received an Initial or Retry packet from the server, the client 1922 populates the Destination Connection ID field with an unpredictable 1923 value. This Destination Connection ID MUST be at least 8 bytes in 1924 length. Until a packet is received from the server, the client MUST 1925 use the same Destination Connection ID value on all packets in this 1926 connection.
1928 The Destination Connection ID field from the first Initial packet 1929 sent by a client is used to determine packet protection keys for 1930 Initial packets. These keys change after receiving a Retry packet; 1931 see Section 5.2 of [QUIC-TLS]. 1933 The client populates the Source Connection ID field with a value of 1934 its choosing and sets the Source Connection ID Length field to 1935 indicate the length. 1937 The first flight of 0-RTT packets uses the same Destination Connection 1938 ID and Source Connection ID values as the client's first Initial 1939 packet. 1941 Upon first receiving an Initial or Retry packet from the server, the 1942 client uses the Source Connection ID supplied by the server as the 1943 Destination Connection ID for subsequent packets, including any 0-RTT 1944 packets. This means that a client might have to change the 1945 connection ID it sets in the Destination Connection ID field twice 1946 during connection establishment: once in response to a Retry, and 1947 once in response to an Initial packet from the server. Once a client 1948 has received a valid Initial packet from the server, it MUST discard 1949 any subsequent packet it receives on that connection with a different 1950 Source Connection ID. 1952 A client MUST change the Destination Connection ID it uses for 1953 sending packets in response to only the first received Initial or 1954 Retry packet. A server MUST set the Destination Connection ID it 1955 uses for sending packets based on the first received Initial packet. 1956 Any further changes to the Destination Connection ID are only 1957 permitted if the values are taken from NEW_CONNECTION_ID frames; if 1958 subsequent Initial packets include a different Source Connection ID, 1959 they MUST be discarded. This avoids unpredictable outcomes that 1960 might otherwise result from stateless processing of multiple Initial 1961 packets with different Source Connection IDs.
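The client-side rules above can be sketched as a small state tracker. This is an illustrative sketch only: the class name and the "retry"/"initial" labels are hypothetical, and ignoring a second Retry (or a Retry that arrives after a server Initial) is an assumption of this sketch rather than something stated in this section.

```python
class ClientDcid:
    """Tracks the Destination Connection ID a client puts in packets it sends."""

    def __init__(self, initial_dcid: bytes):
        self.dcid = initial_dcid
        self.retry_seen = False
        self.server_scid = None  # set by the first Initial from the server

    def on_server_packet(self, kind: str, scid: bytes) -> bool:
        """Return True if the packet is kept, False if it is discarded.

        kind is "retry" or "initial" (hypothetical labels for this sketch).
        """
        if self.server_scid is not None:
            # After the first server Initial, a different SCID means discard.
            return kind == "initial" and scid == self.server_scid
        if kind == "retry":
            if self.retry_seen:
                return False  # assumption: only the first Retry is processed
            self.retry_seen = True
            self.dcid = scid  # first change: in response to the Retry
            return True
        if kind == "initial":
            self.server_scid = scid
            self.dcid = scid  # second (or only) change: server's first Initial
            return True
        return False
```

Using the identifiers from a handshake with Retry, a client that starts with DCID "S1" switches to "S2" after the Retry and to "S3" after the server's first Initial; a later Initial carrying any other Source Connection ID is discarded.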
1963 The Destination Connection ID that an endpoint sends can change over 1964 the lifetime of a connection, especially in response to connection 1965 migration (Section 9); see Section 5.1.1 for details. 1967 7.3. Authenticating Connection IDs 1969 The choice each endpoint makes about connection IDs during the 1970 handshake is authenticated by including all values in transport 1971 parameters; see Section 7.4. This ensures that all connection IDs 1972 used for the handshake are also authenticated by the cryptographic 1973 handshake. 1975 Each endpoint includes the value of the Source Connection ID field 1976 from the first Initial packet it sent in the 1977 initial_source_connection_id transport parameter; see Section 18.2. 1978 A server includes the Destination Connection ID field from the first 1979 Initial packet it received from the client in the 1980 original_destination_connection_id transport parameter; if the server 1981 sent a Retry packet, this refers to the first Initial packet received 1982 before sending the Retry packet. If it sends a Retry packet, a 1983 server also includes the Source Connection ID field from the Retry 1984 packet in the retry_source_connection_id transport parameter. 1986 The values provided by a peer for these transport parameters MUST 1987 match the values that an endpoint used in the Destination and Source 1988 Connection ID fields of Initial packets that it sent (and received, 1989 for servers). Endpoints MUST validate that received transport 1990 parameters match received Connection ID values. Including connection 1991 ID values in transport parameters and verifying them ensures that 1992 an attacker cannot influence the choice of connection ID for a 1993 successful connection by injecting packets carrying attacker-chosen 1994 connection IDs during the handshake.
1996 An endpoint MUST treat absence of the initial_source_connection_id 1997 transport parameter from either endpoint or absence of the 1998 original_destination_connection_id transport parameter from the 1999 server as a connection error of type TRANSPORT_PARAMETER_ERROR. 2001 An endpoint MUST treat the following as a connection error of type 2002 TRANSPORT_PARAMETER_ERROR or PROTOCOL_VIOLATION: 2004 * absence of the retry_source_connection_id transport parameter from 2005 the server after receiving a Retry packet, 2007 * presence of the retry_source_connection_id transport parameter 2008 when no Retry packet was received, or 2010 * a mismatch between values received from a peer in these transport 2011 parameters and the value sent in the corresponding Destination or 2012 Source Connection ID fields of Initial packets. 2014 If a zero-length connection ID is selected, the corresponding 2015 transport parameter is included with a zero-length value. 2017 Figure 7 shows the connection IDs (with DCID=Destination Connection 2018 ID, SCID=Source Connection ID) that are used in a complete handshake. 2019 The exchange of Initial packets is shown, plus the later exchange of 2020 1-RTT packets that includes the connection ID established during the 2021 handshake. 2023 Client Server 2025 Initial: DCID=S1, SCID=C1 -> 2026 <- Initial: DCID=C1, SCID=S3 2027 ... 2028 1-RTT: DCID=S3 -> 2029 <- 1-RTT: DCID=C1 2031 Figure 7: Use of Connection IDs in a Handshake 2033 Figure 8 shows a similar handshake that includes a Retry packet. 2035 Client Server 2037 Initial: DCID=S1, SCID=C1 -> 2038 <- Retry: DCID=C1, SCID=S2 2039 Initial: DCID=S2, SCID=C1 -> 2040 <- Initial: DCID=C1, SCID=S3 2041 ... 2042 1-RTT: DCID=S3 -> 2043 <- 1-RTT: DCID=C1 2045 Figure 8: Use of Connection IDs in a Handshake with Retry 2047 In both cases (Figure 7 and Figure 8), the client sets the value of 2048 the initial_source_connection_id transport parameter to "C1". 
2050 When the handshake does not include a Retry (Figure 7), the server 2051 sets original_destination_connection_id to "S1" and 2052 initial_source_connection_id to "S3". In this case, the server does 2053 not include a retry_source_connection_id transport parameter. 2055 When the handshake includes a Retry (Figure 8), the server sets 2056 original_destination_connection_id to "S1", 2057 retry_source_connection_id to "S2", and initial_source_connection_id 2058 to "S3". 2060 7.4. Transport Parameters 2062 During connection establishment, both endpoints make authenticated 2063 declarations of their transport parameters. Endpoints are required 2064 to comply with the restrictions that each parameter defines; the 2065 description of each parameter includes rules for its handling. 2067 Transport parameters are declarations that are made unilaterally by 2068 each endpoint. Each endpoint can choose values for transport 2069 parameters independent of the values chosen by its peer. 2071 The encoding of the transport parameters is detailed in Section 18. 2073 QUIC includes the encoded transport parameters in the cryptographic 2074 handshake. Once the handshake completes, the transport parameters 2075 declared by the peer are available. Each endpoint validates the 2076 values provided by its peer. 2078 Definitions for each of the defined transport parameters are included 2079 in Section 18.2. 2081 An endpoint MUST treat receipt of a transport parameter with an 2082 invalid value as a connection error of type 2083 TRANSPORT_PARAMETER_ERROR. 2085 An endpoint MUST NOT send a parameter more than once in a given 2086 transport parameters extension. An endpoint SHOULD treat receipt of 2087 duplicate transport parameters as a connection error of type 2088 TRANSPORT_PARAMETER_ERROR. 2090 Endpoints use transport parameters to authenticate the negotiation of 2091 connection IDs during the handshake; see Section 7.3. 
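The connection ID checks from Section 7.3 that a client applies to the server's transport parameters can be sketched as follows. The function, its arguments, and the exception class are hypothetical names for this sketch; a real implementation may signal PROTOCOL_VIOLATION instead of TRANSPORT_PARAMETER_ERROR for the Retry-related and mismatch cases, as the text permits.

```python
class TransportParameterError(Exception):
    """Stands in for a connection error of type TRANSPORT_PARAMETER_ERROR."""


def check_server_cids(params, first_dcid, server_scid, retry_scid=None):
    """Client-side check of the server's connection ID transport parameters.

    params      -- dict of received transport parameters (name -> bytes)
    first_dcid  -- DCID of the client's first Initial packet
    server_scid -- SCID of the server's first Initial packet
    retry_scid  -- SCID of the Retry packet, or None if no Retry was received
    """
    expected = {
        "initial_source_connection_id": server_scid,
        "original_destination_connection_id": first_dcid,
    }
    has_retry_param = "retry_source_connection_id" in params
    if (retry_scid is not None) != has_retry_param:
        # Absent after a Retry, or present when no Retry was received.
        raise TransportParameterError("retry_source_connection_id")
    if retry_scid is not None:
        expected["retry_source_connection_id"] = retry_scid
    for name, value in expected.items():
        # A zero-length connection ID is still sent, as a zero-length value,
        # so absence and b"" are distinct cases.
        if name not in params:
            raise TransportParameterError("missing " + name)
        if params[name] != value:
            raise TransportParameterError("mismatch in " + name)
```

With the identifiers from Figure 8, the call check_server_cids(params, b"S1", b"S3", retry_scid=b"S2") succeeds only when the server sent original_destination_connection_id "S1", retry_source_connection_id "S2", and initial_source_connection_id "S3".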
2093 Application Layer Protocol Negotiation (ALPN; see [ALPN]) allows 2094 clients to offer multiple application protocols during connection 2095 establishment. The transport parameters that a client includes 2096 during the handshake apply to all application protocols that the 2097 client offers. Application protocols can recommend values for 2098 transport parameters, such as the initial flow control limits. 2099 However, application protocols that set constraints on values for 2100 transport parameters could make it impossible for a client to offer 2101 multiple application protocols if these constraints conflict. 2103 7.4.1. Values of Transport Parameters for 0-RTT 2105 Using 0-RTT depends on both client and server using protocol 2106 parameters that were negotiated from a previous connection. To 2107 enable 0-RTT, endpoints store the value of the server transport 2108 parameters from a connection and apply them to any 0-RTT packets that 2109 are sent in subsequent connections to that peer that use a session 2110 ticket issued on that connection. This information is stored with 2111 any information required by the application protocol or cryptographic 2112 handshake; see Section 4.6 of [QUIC-TLS]. 2114 Remembered transport parameters apply to the new connection until the 2115 handshake completes and the client starts sending 1-RTT packets. 2116 Once the handshake completes, the client uses the transport 2117 parameters established in the handshake. Not all transport 2118 parameters are remembered, as some do not apply to future connections 2119 or they have no effect on use of 0-RTT. 2121 The definition of a new transport parameter (Section 7.4.2) MUST 2122 specify whether storing the transport parameter for 0-RTT is 2123 mandatory, optional, or prohibited. A client need not store a 2124 transport parameter it cannot process. 
2126 A client MUST NOT use remembered values for the following parameters: 2127 ack_delay_exponent, max_ack_delay, initial_source_connection_id, 2128 original_destination_connection_id, preferred_address, 2129 retry_source_connection_id, and stateless_reset_token. The client 2130 MUST use the server's new values in the handshake instead; if the 2131 server does not provide new values, the default value is used. 2133 A client that attempts to send 0-RTT data MUST remember all other 2134 transport parameters used by the server that it is able to process. 2135 The server can remember these transport parameters, or store an 2136 integrity-protected copy of the values in the ticket and recover the 2137 information when accepting 0-RTT data. A server uses the transport 2138 parameters in determining whether to accept 0-RTT data. 2140 If 0-RTT data is accepted by the server, the server MUST NOT reduce 2141 any limits or alter any values that might be violated by the client 2142 with its 0-RTT data. In particular, a server that accepts 0-RTT data 2143 MUST NOT set values for the following parameters (Section 18.2) that 2144 are smaller than the remembered value of the parameters. 2146 * active_connection_id_limit 2148 * initial_max_data 2150 * initial_max_stream_data_bidi_local 2152 * initial_max_stream_data_bidi_remote 2154 * initial_max_stream_data_uni 2156 * initial_max_streams_bidi 2158 * initial_max_streams_uni 2159 Omitting or setting a zero value for certain transport parameters can 2160 result in 0-RTT data being enabled, but not usable. The applicable 2161 subset of transport parameters that permit sending of application 2162 data SHOULD be set to non-zero values for 0-RTT. This includes 2163 initial_max_data and either initial_max_streams_bidi and 2164 initial_max_stream_data_bidi_remote, or initial_max_streams_uni and 2165 initial_max_stream_data_uni. 
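A server deciding whether it can accept 0-RTT might compare its current parameter values against the remembered ones, along the lines of this sketch. The function name is hypothetical, and the defaults used for absent parameters (0 for the flow control and stream limits, 2 for active_connection_id_limit) are assumptions taken from Section 18.2 rather than from this section.

```python
# Parameters that a server accepting 0-RTT must not reduce (list above).
NO_REDUCTION = (
    "active_connection_id_limit",
    "initial_max_data",
    "initial_max_stream_data_bidi_local",
    "initial_max_stream_data_bidi_remote",
    "initial_max_stream_data_uni",
    "initial_max_streams_bidi",
    "initial_max_streams_uni",
)

# Assumed defaults for absent parameters (see Section 18.2).
DEFAULTS = {name: 0 for name in NO_REDUCTION}
DEFAULTS["active_connection_id_limit"] = 2


def reduced_limits(remembered: dict, current: dict) -> list:
    """Return the names of parameters whose current value is below the
    remembered value; an empty list means 0-RTT can be accepted on this
    criterion."""
    return [
        name
        for name in NO_REDUCTION
        if current.get(name, DEFAULTS[name]) < remembered.get(name, DEFAULTS[name])
    ]
```

If the returned list is not empty, the server either raises the listed values back to at least the remembered ones or rejects 0-RTT.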
2167 A server MAY store and recover the previously sent values of the 2168 max_idle_timeout, max_udp_payload_size, and disable_active_migration 2169 parameters and reject 0-RTT if it selects smaller values. Lowering 2170 the values of these parameters while also accepting 0-RTT data could 2171 degrade the performance of the connection. Specifically, lowering 2172 the max_udp_payload_size could result in dropped packets leading to 2173 worse performance compared to rejecting 0-RTT data outright. 2175 A server MUST reject 0-RTT data if the restored values for transport 2176 parameters cannot be supported. 2178 When sending frames in 0-RTT packets, a client MUST only use 2179 remembered transport parameters; importantly, it MUST NOT use updated 2180 values that it learns from the server's updated transport parameters 2181 or from frames received in 1-RTT packets. Updated values of 2182 transport parameters from the handshake apply only to 1-RTT packets. 2183 For instance, flow control limits from remembered transport 2184 parameters apply to all 0-RTT packets even if those values are 2185 increased by the handshake or by frames sent in 1-RTT packets. A 2186 server MAY treat use of updated transport parameters in 0-RTT as a 2187 connection error of type PROTOCOL_VIOLATION. 2189 7.4.2. New Transport Parameters 2191 New transport parameters can be used to negotiate new protocol 2192 behavior. An endpoint MUST ignore transport parameters that it does 2193 not support. Absence of a transport parameter therefore disables any 2194 optional protocol feature that is negotiated using the parameter. As 2195 described in Section 18.1, some identifiers are reserved in order to 2196 exercise this requirement. 2198 A client that does not understand a transport parameter can discard 2199 it and attempt 0-RTT on subsequent connections. 
However, if the 2200 client adds support for a discarded transport parameter, it risks 2201 violating the constraints that the transport parameter establishes if 2202 it attempts 0-RTT. New transport parameters can avoid this problem 2203 by setting a default of the most conservative value. Clients can 2204 avoid this problem by remembering all parameters, even ones not 2205 currently supported. 2207 New transport parameters can be registered according to the rules in 2208 Section 22.3. 2210 7.5. Cryptographic Message Buffering 2212 Implementations need to maintain a buffer of CRYPTO data received out 2213 of order. Because there is no flow control of CRYPTO frames, an 2214 endpoint could potentially force its peer to buffer an unbounded 2215 amount of data. 2217 Implementations MUST support buffering at least 4096 bytes of data 2218 received in out-of-order CRYPTO frames. Endpoints MAY choose to 2219 allow more data to be buffered during the handshake. A larger limit 2220 during the handshake could allow for larger keys or credentials to be 2221 exchanged. An endpoint's buffer size does not need to remain 2222 constant during the life of the connection. 2224 Being unable to buffer CRYPTO frames during the handshake can lead to 2225 a connection failure. If an endpoint's buffer is exceeded during the 2226 handshake, it can expand its buffer temporarily to complete the 2227 handshake. If an endpoint does not expand its buffer, it MUST close 2228 the connection with a CRYPTO_BUFFER_EXCEEDED error code. 2230 Once the handshake completes, if an endpoint is unable to buffer all 2231 data in a CRYPTO frame, it MAY discard that CRYPTO frame and all 2232 CRYPTO frames received in the future, or it MAY close the connection 2233 with a CRYPTO_BUFFER_EXCEEDED error code. Packets containing 2234 discarded CRYPTO frames MUST be acknowledged because the packet has 2235 been received and processed by the transport even though the CRYPTO 2236 frame was discarded. 2238 8. 
Address Validation 2240 Address validation ensures that an endpoint cannot be used for a 2241 traffic amplification attack. In such an attack, a packet is sent to 2242 a server with spoofed source address information that identifies a 2243 victim. If a server generates more or larger packets in response to 2244 that packet, the attacker can use the server to send more data toward 2245 the victim than it would be able to send on its own. 2247 The primary defense against amplification attacks is verifying that a 2248 peer is able to receive packets at the transport address that it 2249 claims. Therefore, after receiving packets from an address that is 2250 not yet validated, an endpoint MUST limit the amount of data it sends 2251 to the unvalidated address to three times the amount of data received 2252 from that address. This limit on the size of responses is known as 2253 the anti-amplification limit. 2255 Address validation is performed both during connection establishment 2256 (see Section 8.1) and during connection migration (see Section 8.2). 2258 8.1. Address Validation During Connection Establishment 2260 Connection establishment implicitly provides address validation for 2261 both endpoints. In particular, receipt of a packet protected with 2262 Handshake keys confirms that the peer successfully processed an 2263 Initial packet. Once an endpoint has successfully processed a 2264 Handshake packet from the peer, it can consider the peer address to 2265 have been validated. 2267 Additionally, an endpoint MAY consider the peer address validated if 2268 the peer uses a connection ID chosen by the endpoint and the 2269 connection ID contains at least 64 bits of entropy. 2271 For the client, the value of the Destination Connection ID field in 2272 its first Initial packet allows it to validate the server address as 2273 a part of successfully processing any packet. 
Initial packets from 2274 the server are protected with keys that are derived from this value 2275 (see Section 5.2 of [QUIC-TLS]). Alternatively, the value is echoed 2276 by the server in Version Negotiation packets (Section 6) or included 2277 in the Integrity Tag in Retry packets (Section 5.8 of [QUIC-TLS]). 2279 Prior to validating the client address, servers MUST NOT send more 2280 than three times as many bytes as the number of bytes they have 2281 received. This limits the magnitude of any amplification attack that 2282 can be mounted using spoofed source addresses. For the purposes of 2283 avoiding amplification prior to address validation, servers MUST 2284 count all of the payload bytes received in datagrams that are 2285 uniquely attributed to a single connection. This includes datagrams 2286 that contain packets that are successfully processed and datagrams 2287 that contain packets that are all discarded. 2289 Clients MUST ensure that UDP datagrams containing Initial packets 2290 have UDP payloads of at least 1200 bytes, adding PADDING frames as 2291 necessary. A client that sends padded datagrams allows the server to 2292 send more data prior to completing address validation. 2294 Loss of an Initial or Handshake packet from the server can cause a 2295 deadlock if the client does not send additional Initial or Handshake 2296 packets. A deadlock could occur when the server reaches its anti- 2297 amplification limit and the client has received acknowledgments for 2298 all the data it has sent. In this case, when the client has no 2299 reason to send additional packets, the server will be unable to send 2300 more data because it has not validated the client's address. To 2301 prevent this deadlock, clients MUST send a packet on a probe timeout 2302 (PTO, see Section 6.2 of [QUIC-RECOVERY]). 
Specifically, the client 2303 MUST send an Initial packet in a UDP datagram that contains at least 2304 1200 bytes if it does not have Handshake keys, and otherwise send a 2305 Handshake packet. 2307 A server might wish to validate the client address before starting 2308 the cryptographic handshake. QUIC uses a token in the Initial packet 2309 to provide address validation prior to completing the handshake. 2310 This token is delivered to the client during connection establishment 2311 with a Retry packet (see Section 8.1.2) or in a previous connection 2312 using the NEW_TOKEN frame (see Section 8.1.3). 2314 In addition to sending limits imposed prior to address validation, 2315 servers are also constrained in what they can send by the limits set 2316 by the congestion controller. Clients are only constrained by the 2317 congestion controller. 2319 8.1.1. Token Construction 2321 A token sent in a NEW_TOKEN frame or a Retry packet MUST be 2322 constructed in a way that allows the server to identify how it was 2323 provided to a client. These tokens are carried in the same field, 2324 but require different handling from servers. 2326 8.1.2. Address Validation using Retry Packets 2328 Upon receiving the client's Initial packet, the server can request 2329 address validation by sending a Retry packet (Section 17.2.5) 2330 containing a token. This token MUST be repeated by the client in all 2331 Initial packets it sends for that connection after it receives the 2332 Retry packet. 2334 In response to processing an Initial containing a token that was 2335 provided in a Retry packet, a server cannot send another Retry 2336 packet; it can only refuse the connection or permit it to proceed. 2338 As long as it is not possible for an attacker to generate a valid 2339 token for its own address (see Section 8.1.4) and the client is able 2340 to return that token, it proves to the server that it received the 2341 token. 
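The sending limit that applies before address validation (Section 8 and Section 8.1) can be sketched as a simple byte counter. The class and method names are hypothetical; the sketch assumes the caller reports the UDP payload size of every datagram attributed to the connection, including datagrams whose packets were all discarded.

```python
class AmplificationLimiter:
    """Tracks the anti-amplification limit for one unvalidated address."""

    FACTOR = 3  # send at most three times the bytes received

    def __init__(self):
        self.received = 0
        self.sent = 0
        self.validated = False

    def on_datagram_received(self, payload_len: int) -> None:
        # Count all payload bytes, whether or not the packets were processed.
        self.received += payload_len

    def on_address_validated(self) -> None:
        self.validated = True

    def can_send(self, payload_len: int) -> bool:
        if self.validated:
            return True  # only the congestion controller limits sending now
        return self.sent + payload_len <= self.FACTOR * self.received

    def on_datagram_sent(self, payload_len: int) -> None:
        self.sent += payload_len
```

A server that has received a single 1200-byte Initial datagram can send up to 3600 bytes before it has to wait for more data from the client, which is why padded client datagrams let the server send more of its first flight.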
2343 A server can also use a Retry packet to defer the state and 2344 processing costs of connection establishment. Requiring the server 2345 to provide a different connection ID, along with the 2346 original_destination_connection_id transport parameter defined in 2347 Section 18.2, forces the server to demonstrate that it, or an entity 2348 it cooperates with, received the original Initial packet from the 2349 client. Providing a different connection ID also grants a server 2350 some control over how subsequent packets are routed. This can be 2351 used to direct connections to a different server instance. 2353 If a server receives a client Initial that contains an invalid Retry 2354 token but is otherwise valid, it knows the client will not accept 2355 another Retry token. The server can discard such a packet and allow 2356 the client to time out to detect handshake failure, but that could 2357 impose a significant latency penalty on the client. Instead, the 2358 server SHOULD immediately close (Section 10.2) the connection with an 2359 INVALID_TOKEN error. Note that a server has not established any 2360 state for the connection at this point and so does not enter the 2361 closing period. 2363 A flow showing the use of a Retry packet is shown in Figure 9. 2365 Client Server 2367 Initial[0]: CRYPTO[CH] -> 2369 <- Retry+Token 2371 Initial+Token[1]: CRYPTO[CH] -> 2373 Initial[0]: CRYPTO[SH] ACK[1] 2374 Handshake[0]: CRYPTO[EE, CERT, CV, FIN] 2375 <- 1-RTT[0]: STREAM[1, "..."] 2377 Figure 9: Example Handshake with Retry 2379 8.1.3. Address Validation for Future Connections 2381 A server MAY provide clients with an address validation token during 2382 one connection that can be used on a subsequent connection. Address 2383 validation is especially important with 0-RTT because a server 2384 potentially sends a significant amount of data to a client in 2385 response to 0-RTT data. 
2387 The server uses the NEW_TOKEN frame (Section 19.7) to provide the 2388 client with an address validation token that can be used to validate 2389 future connections. In a future connection, the client includes this 2390 token in Initial packets to provide address validation. The client 2391 MUST include the token in all Initial packets it sends, unless a 2392 Retry replaces the token with a newer one. The client MUST NOT use 2393 the token provided in a Retry for future connections. Servers MAY 2394 discard any Initial packet that does not carry the expected token. 2396 Unlike the token that is created for a Retry packet, which is used 2397 immediately, the token sent in the NEW_TOKEN frame can be used after 2398 some period of time has passed. Thus, a token SHOULD have an 2399 expiration time, which could be either an explicit expiration time or 2400 an issued timestamp that can be used to dynamically calculate the 2401 expiration time. A server can store the expiration time or include 2402 it in an encrypted form in the token. 2404 A token issued with NEW_TOKEN MUST NOT include information that would 2405 allow values to be linked by an observer to the connection on which 2406 it was issued. For example, it cannot include the previous 2407 connection ID or addressing information, unless the values are 2408 encrypted. A server MUST ensure that every NEW_TOKEN frame it sends 2409 is unique across all clients, with the exception of those sent to 2410 repair losses of previously sent NEW_TOKEN frames. Information that 2411 allows the server to distinguish between tokens from Retry and 2412 NEW_TOKEN MAY be accessible to entities other than the server. 2414 It is unlikely that the client port number is the same on two 2415 different connections; validating the port is therefore unlikely to 2416 be successful. 
2418 A token received in a NEW_TOKEN frame is applicable to any server 2419 that the connection is considered authoritative for (e.g., server 2420 names included in the certificate). When connecting to a server for 2421 which the client retains an applicable and unused token, it SHOULD 2422 include that token in the Token field of its Initial packet. 2423 Including a token might allow the server to validate the client 2424 address without an additional round trip. A client MUST NOT include 2425 a token that is not applicable to the server that it is connecting 2426 to, unless the client has the knowledge that the server that issued 2427 the token and the server the client is connecting to are jointly 2428 managing the tokens. A client MAY use a token from any previous 2429 connection to that server. 2431 A token allows a server to correlate activity between the connection 2432 where the token was issued and any connection where it is used. 2433 Clients that want to break continuity of identity with a server can 2434 discard tokens provided using the NEW_TOKEN frame. In comparison, a 2435 token obtained in a Retry packet MUST be used immediately during the 2436 connection attempt and cannot be used in subsequent connection 2437 attempts. 2439 A client SHOULD NOT reuse a NEW_TOKEN token for different connection 2440 attempts. Reusing a token allows connections to be linked by 2441 entities on the network path; see Section 9.5. 2443 Clients might receive multiple tokens on a single connection. Aside 2444 from preventing linkability, any token can be used in any connection 2445 attempt. Servers can send additional tokens to either enable address 2446 validation for multiple connection attempts or to replace older 2447 tokens that might become invalid. For a client, this ambiguity means 2448 that sending the most recent unused token is most likely to be 2449 effective. 
Though saving and using older tokens has no negative 2450 consequences, clients can regard older tokens as being less likely to be 2451 useful to the server for address validation. 2453 When a server receives an Initial packet with an address validation 2454 token, it MUST attempt to validate the token, unless it has already 2455 completed address validation. If the token is invalid then the 2456 server SHOULD proceed as if the client did not have a validated 2457 address, including potentially sending a Retry. Tokens provided with 2458 NEW_TOKEN frames and Retry packets can be distinguished by servers 2459 (see Section 8.1.1), and the latter validated more strictly. If the 2460 validation succeeds, the server SHOULD then allow the handshake to 2461 proceed. 2463 Note: The rationale for treating the client as unvalidated rather 2464 than discarding the packet is that the client might have received 2465 the token in a previous connection using the NEW_TOKEN frame, and 2466 if the server has lost state, it might be unable to validate the 2467 token at all, leading to connection failure if the packet is 2468 discarded. 2470 In a stateless design, a server can use encrypted and authenticated 2471 tokens to pass information to clients that the server can later 2472 recover and use to validate a client address. Tokens are not 2473 integrated into the cryptographic handshake and so they are not 2474 authenticated. For instance, a client might be able to reuse a 2475 token. To avoid attacks that exploit this property, a server can 2476 limit its use of tokens to only the information needed to validate 2477 client addresses. 2479 Clients MAY use tokens obtained on one connection for any connection 2480 attempt using the same version. When selecting a token to use, 2481 clients do not need to consider other properties of the connection 2482 that is being attempted, including the choice of possible application 2483 protocols, session tickets, or other connection properties.
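One possible stateless token design is sketched below. It is not a format defined by this document: the layout, type tags, lifetimes, and function names are all hypothetical. The sketch only authenticates the token with an HMAC and does not encrypt it, binds the token to the client IP address but not the port, and does not implement replay protection; a deployment would need to consider each of these.

```python
import hashlib
import hmac
import secrets
import struct
import time

KEY = secrets.token_bytes(32)             # integrity key known only to the server
RETRY, NEW_TOKEN = 1, 2                   # hypothetical type tags (Section 8.1.1)
LIFETIME = {RETRY: 10, NEW_TOKEN: 86400}  # seconds; illustrative values only
BODY_LEN = 1 + 8 + 16                     # type, issue time, random nonce


def _mac(body: bytes, client_ip: str) -> bytes:
    # The address is an input to the MAC but is not carried in the token.
    return hmac.new(KEY, body + client_ip.encode(), hashlib.sha256).digest()


def make_token(kind: int, client_ip: str, now=None) -> bytes:
    issued = int(time.time() if now is None else now)
    # The random nonce makes every issued token unique.
    body = struct.pack("!BQ", kind, issued) + secrets.token_bytes(16)
    return body + _mac(body, client_ip)


def check_token(token: bytes, client_ip: str, now=None):
    """Return the token type if the token is valid for client_ip, else None."""
    if len(token) != BODY_LEN + 32:
        return None
    body, tag = token[:BODY_LEN], token[BODY_LEN:]
    if not hmac.compare_digest(tag, _mac(body, client_ip)):
        return None
    kind, issued = struct.unpack("!BQ", body[:9])
    current = int(time.time() if now is None else now)
    if kind not in LIFETIME or current - issued > LIFETIME[kind]:
        return None
    return kind
```

When check_token returns None for a token in an Initial packet, the server proceeds as if the client address is unvalidated rather than discarding the packet.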
2485 8.1.4. Address Validation Token Integrity 2487 An address validation token MUST be difficult to guess. Including a 2488 random value with at least 128 bits of entropy in the token would be 2489 sufficient, but this depends on the server remembering the value it 2490 sends to clients. 2492 A token-based scheme allows the server to offload any state 2493 associated with validation to the client. For this design to work, 2494 the token MUST be covered by integrity protection against 2495 modification or falsification by clients. Without integrity 2496 protection, malicious clients could generate or guess values for 2497 tokens that would be accepted by the server. Only the server 2498 requires access to the integrity protection key for tokens. 2500 There is no need for a single well-defined format for the token 2501 because the server that generates the token also consumes it. Tokens 2502 sent in Retry packets SHOULD include information that allows the 2503 server to verify that the source IP address and port in client 2504 packets remain constant. 2506 Tokens sent in NEW_TOKEN frames MUST include information that allows 2507 the server to verify that the client IP address has not changed from 2508 when the token was issued. Servers can use tokens from NEW_TOKEN in 2509 deciding not to send a Retry packet, even if the client address has 2510 changed. If the client IP address has changed, the server MUST 2511 adhere to the anti-amplification limit; see Section 8. Note that in 2512 the presence of NAT, this requirement might be insufficient to 2513 protect other hosts that share the NAT from amplification attack. 2515 Attackers could replay tokens to use servers as amplifiers in DDoS 2516 attacks. To protect against such attacks, servers MUST ensure that 2517 replay of tokens is prevented or limited. Servers SHOULD ensure that 2518 tokens sent in Retry packets are only accepted for a short time, as 2519 they are returned immediately by clients. 
Tokens that are provided 2520 in NEW_TOKEN frames (Section 19.7) need to be valid for longer, but 2521 SHOULD NOT be accepted multiple times. Servers are encouraged to 2522 allow tokens to be used only once, if possible; tokens MAY include 2523 additional information about clients to further narrow applicability 2524 or reuse. 2526 8.2. Path Validation 2528 Path validation is used by both peers during connection migration 2529 (see Section 9) to verify reachability after a change of address. In 2530 path validation, endpoints test reachability between a specific local 2531 address and a specific peer address, where an address is the two- 2532 tuple of IP address and port. 2534 Path validation tests that packets sent on a path to a peer are 2535 received by that peer. Path validation is used to ensure that 2536 packets received from a migrating peer do not carry a spoofed source 2537 address. 2539 Path validation does not validate that a peer can send in the return 2540 direction. Acknowledgments cannot be used for return path validation 2541 because they contain insufficient entropy and might be spoofed. 2542 Endpoints independently determine reachability on each direction of a 2543 path, and therefore return reachability can only be established by 2544 the peer. 2546 Path validation can be used at any time by either endpoint. For 2547 instance, an endpoint might check that a peer is still in possession 2548 of its address after a period of quiescence. 2550 Path validation is not designed as a NAT traversal mechanism. Though 2551 the mechanism described here might be effective for the creation of 2552 NAT bindings that support NAT traversal, the expectation is that one 2553 or other peer is able to receive packets without first having sent a 2554 packet on that path. Effective NAT traversal needs additional 2555 synchronization mechanisms that are not provided here. 
2557 An endpoint MAY include other frames with the PATH_CHALLENGE and 2558 PATH_RESPONSE frames used for path validation. In particular, an 2559 endpoint can include PADDING frames with a PATH_CHALLENGE frame for 2560 Path Maximum Transmission Unit Discovery (PMTUD; see Section 14.2.1); 2561 it can also include its own PATH_CHALLENGE frame with a PATH_RESPONSE 2562 frame. 2564 An endpoint uses a new connection ID for probes sent from a new local 2565 address; see Section 9.5. When probing a new path, an endpoint can 2566 ensure that its peer has an unused connection ID available for 2567 responses. Sending NEW_CONNECTION_ID and PATH_CHALLENGE frames in 2568 the same packet, if the peer's active_connection_id_limit permits, 2569 ensures that an unused connection ID will be available to the peer 2570 when sending a response. 2572 An endpoint can choose to simultaneously probe multiple paths. The 2573 number of simultaneous paths used for probes is limited by the number 2574 of extra connection IDs its peer has previously supplied, since each 2575 new local address used for a probe requires a previously unused 2576 connection ID. 2578 8.2.1. Initiating Path Validation 2580 To initiate path validation, an endpoint sends a PATH_CHALLENGE frame 2581 containing an unpredictable payload on the path to be validated. 2583 An endpoint MAY send multiple PATH_CHALLENGE frames to guard against 2584 packet loss. However, an endpoint SHOULD NOT send multiple 2585 PATH_CHALLENGE frames in a single packet. 2587 An endpoint SHOULD NOT probe a new path with packets containing a 2588 PATH_CHALLENGE frame more frequently than it would send an Initial 2589 packet. This ensures that connection migration is no more load on a 2590 new path than establishing a new connection. 2592 The endpoint MUST use unpredictable data in every PATH_CHALLENGE 2593 frame so that it can associate the peer's response with the 2594 corresponding PATH_CHALLENGE. 
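The requirement for unpredictable PATH_CHALLENGE data can be illustrated with a minimal sketch. It assumes the 8-byte Data field and frame type 0x1a defined for PATH_CHALLENGE elsewhere in this document; the PathValidator class and its method names are hypothetical and are not part of the protocol.

```python
import os

PATH_CHALLENGE_TYPE = 0x1A  # frame type of PATH_CHALLENGE
CHALLENGE_DATA_LEN = 8      # PATH_CHALLENGE carries 8 bytes of data


class PathValidator:
    """Hypothetical helper tracking outstanding challenges for one path."""

    def __init__(self):
        self.outstanding = set()  # payloads sent and not yet answered
        self.validated = False

    def new_challenge(self) -> bytes:
        # Fresh unpredictable data for every frame, so that a response
        # can be matched to the challenge that elicited it.
        data = os.urandom(CHALLENGE_DATA_LEN)
        self.outstanding.add(data)
        return bytes([PATH_CHALLENGE_TYPE]) + data

    def on_path_response(self, data: bytes) -> bool:
        # Only a response echoing data from an earlier challenge counts.
        if data in self.outstanding:
            self.validated = True
            self.outstanding.clear()
        return self.validated
```

Keeping every outstanding payload, rather than only the latest, lets a response to any retransmitted challenge complete validation.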
2596 An endpoint MUST expand datagrams that contain a PATH_CHALLENGE frame 2597 to at least the smallest allowed maximum datagram size of 1200 bytes, 2598 unless the anti-amplification limit for the path does not permit 2599 sending a datagram of this size. Sending UDP datagrams of this size 2600 ensures that the network path from the endpoint to the peer can be 2601 used for QUIC; see Section 14. 2603 When an endpoint is unable to expand the datagram size to 1200 bytes 2604 due to the anti-amplification limit, the path MTU will not be 2605 validated. To ensure that the path MTU is large enough, the endpoint 2606 MUST perform a second path validation by sending a PATH_CHALLENGE 2607 frame in a datagram of at least 1200 bytes. This additional 2608 validation can be performed after a PATH_RESPONSE is successfully 2609 received or when enough bytes have been received on the path that 2610 sending the larger datagram will not result in exceeding the anti- 2611 amplification limit. 2613 Unlike other cases where datagrams are expanded, endpoints MUST NOT 2614 discard datagrams that appear to be too small when they contain 2615 PATH_CHALLENGE or PATH_RESPONSE. 2617 8.2.2. Path Validation Responses 2619 On receiving a PATH_CHALLENGE frame, an endpoint MUST respond by 2620 echoing the data contained in the PATH_CHALLENGE frame in a 2621 PATH_RESPONSE frame. An endpoint MUST NOT delay transmission of a 2622 packet containing a PATH_RESPONSE frame unless constrained by 2623 congestion control. 2625 A PATH_RESPONSE frame MUST be sent on the network path where the 2626 PATH_CHALLENGE was received. This ensures that path validation by a 2627 peer only succeeds if the path is functional in both directions. 2628 This requirement MUST NOT be enforced by the endpoint that initiates 2629 path validation as that would enable an attack on migration; see 2630 Section 9.3.3. 
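The two response rules above (echo the data, and reply on the path where the challenge arrived) might be sketched as follows. The Path type and the build_path_response function are illustrative assumptions; the frame type values 0x1a and 0x1b are those assigned to PATH_CHALLENGE and PATH_RESPONSE.

```python
from dataclasses import dataclass

PATH_CHALLENGE_TYPE = 0x1A
PATH_RESPONSE_TYPE = 0x1B


@dataclass(frozen=True)
class Path:
    """Hypothetical path identity: local and remote (IP, port) tuples."""
    local: tuple
    remote: tuple


def build_path_response(challenge_frame: bytes, received_on: Path):
    """Return the PATH_RESPONSE frame and the path it has to be sent on."""
    if len(challenge_frame) != 9 or challenge_frame[0] != PATH_CHALLENGE_TYPE:
        raise ValueError("not a PATH_CHALLENGE frame")
    # Echo the 8 bytes of challenge data unchanged.
    response = bytes([PATH_RESPONSE_TYPE]) + challenge_frame[1:]
    # The response goes out on the path where the challenge was received.
    return response, received_on
```

The sketch deliberately omits padding and congestion control; it only shows the echo and the choice of path.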
2632 An endpoint MUST expand datagrams that contain a PATH_RESPONSE frame 2633 to at least the smallest allowed maximum datagram size of 1200 bytes. 2634 This verifies that the path is able to carry datagrams of this size 2635 in both directions. However, an endpoint MUST NOT expand the 2636 datagram containing the PATH_RESPONSE if the resulting data exceeds 2637 the anti-amplification limit. This is expected to only occur if the 2638 received PATH_CHALLENGE was not sent in an expanded datagram. 2640 An endpoint MUST NOT send more than one PATH_RESPONSE frame in 2641 response to one PATH_CHALLENGE frame; see Section 13.3. The peer is 2642 expected to send more PATH_CHALLENGE frames as necessary to evoke 2643 additional PATH_RESPONSE frames. 2645 8.2.3. Successful Path Validation 2647 Path validation succeeds when a PATH_RESPONSE frame is received that 2648 contains the data that was sent in a previous PATH_CHALLENGE frame. 2649 A PATH_RESPONSE frame received on any network path validates the path 2650 on which the PATH_CHALLENGE was sent. 2652 If an endpoint sends a PATH_CHALLENGE frame in a datagram that is not 2653 expanded to at least 1200 bytes, and if the response to it validates 2654 the peer address, the path is validated but not the path MTU. As a 2655 result, the endpoint can now send more than three times the amount of 2656 data that has been received. However, the endpoint MUST initiate 2657 another path validation with an expanded datagram to verify that the 2658 path supports the required MTU. 2660 Receipt of an acknowledgment for a packet containing a PATH_CHALLENGE 2661 frame is not adequate validation, since the acknowledgment can be 2662 spoofed by a malicious peer. 2664 8.2.4. Failed Path Validation 2666 Path validation only fails when the endpoint attempting to validate 2667 the path abandons its attempt to validate the path. 2669 Endpoints SHOULD abandon path validation based on a timer. 
When 2670 setting this timer, implementations are cautioned that the new path 2671 could have a longer round-trip time than the original. A value of 2672 three times the larger of the current Probe Timeout (PTO) or the PTO 2673 for the new path (that is, using kInitialRtt as defined in 2674 [QUIC-RECOVERY]) is RECOMMENDED. 2676 This timeout allows for multiple PTOs to expire prior to failing path 2677 validation, so that loss of a single PATH_CHALLENGE or PATH_RESPONSE 2678 frame does not cause path validation failure. 2680 Note that the endpoint might receive packets containing other frames 2681 on the new path, but a PATH_RESPONSE frame with appropriate data is 2682 required for path validation to succeed. 2684 When an endpoint abandons path validation, it determines that the 2685 path is unusable. This does not necessarily imply a failure of the 2686 connection - endpoints can continue sending packets over other paths 2687 as appropriate. If no paths are available, an endpoint can wait for 2688 a new path to become available or close the connection. An endpoint 2689 that has no valid network path to its peer MAY signal this using the 2690 NO_VIABLE_PATH connection error, noting that this is only possible if 2691 the network path exists but does not support the required MTU 2692 (Section 14). 2694 A path validation might be abandoned for other reasons besides 2695 failure. Primarily, this happens if a connection migration to a new 2696 path is initiated while a path validation on the old path is in 2697 progress. 2699 9. Connection Migration 2701 The use of a connection ID allows connections to survive changes to 2702 endpoint addresses (IP address and port), such as those caused by an 2703 endpoint migrating to a new network. This section describes the 2704 process by which an endpoint migrates to a new address. 2706 The design of QUIC relies on endpoints retaining a stable address for 2707 the duration of the handshake. 
An endpoint MUST NOT initiate 2708 connection migration before the handshake is confirmed, as defined in 2709 Section 4.1.2 of [QUIC-TLS]. 2711 If the peer sent the disable_active_migration transport parameter, an 2712 endpoint also MUST NOT send packets (including probing packets; see 2713 Section 9.1) from a different local address to the address the peer 2714 used during the handshake, unless the endpoint has acted on a 2715 preferred_address transport parameter from the peer. If the peer 2716 violates this requirement, the endpoint MUST either drop the incoming 2717 packets on that path without generating a stateless reset or proceed 2718 with path validation and allow the peer to migrate. Generating a 2719 stateless reset or closing the connection would allow third parties 2720 in the network to cause connections to close by spoofing or otherwise 2721 manipulating observed traffic. 2723 Not all changes of peer address are intentional, or active, 2724 migrations. The peer could experience NAT rebinding: a change of 2725 address due to a middlebox, usually a NAT, allocating a new outgoing 2726 port or even a new outgoing IP address for a flow. An endpoint MUST 2727 perform path validation (Section 8.2) if it detects any change to a 2728 peer's address, unless it has previously validated that address. 2730 When an endpoint has no validated path on which to send packets, it 2731 MAY discard connection state. An endpoint capable of connection 2732 migration MAY wait for a new path to become available before 2733 discarding connection state. 2735 This document limits migration of connections to new client 2736 addresses, except as described in Section 9.6. Clients are 2737 responsible for initiating all migrations. Servers do not send non- 2738 probing packets (see Section 9.1) toward a client address until they 2739 see a non-probing packet from that address.
If a client receives 2740 packets from an unknown server address, the client MUST discard these 2741 packets. 2743 9.1. Probing a New Path 2745 An endpoint MAY probe for peer reachability from a new local address 2746 using path validation (Section 8.2) prior to migrating the connection 2747 to the new local address. Failure of path validation simply means 2748 that the new path is not usable for this connection. Failure to 2749 validate a path does not cause the connection to end unless there are 2750 no valid alternative paths available. 2752 PATH_CHALLENGE, PATH_RESPONSE, NEW_CONNECTION_ID, and PADDING frames 2753 are "probing frames", and all other frames are "non-probing frames". 2754 A packet containing only probing frames is a "probing packet", and a 2755 packet containing any other frame is a "non-probing packet". 2757 9.2. Initiating Connection Migration 2759 An endpoint can migrate a connection to a new local address by 2760 sending packets containing non-probing frames from that address. 2762 Each endpoint validates its peer's address during connection 2763 establishment. Therefore, a migrating endpoint can send to its peer 2764 knowing that the peer is willing to receive at the peer's current 2765 address. Thus an endpoint can migrate to a new local address without 2766 first validating the peer's address. 2768 To establish reachability on the new path, an endpoint initiates path 2769 validation (Section 8.2) on the new path. An endpoint MAY defer path 2770 validation until after a peer sends the next non-probing frame to its 2771 new address. 2773 When migrating, the new path might not support the endpoint's current 2774 sending rate. Therefore, the endpoint resets its congestion 2775 controller and RTT estimate, as described in Section 9.4. 2777 The new path might not have the same ECN capability. Therefore, the 2778 endpoint validates ECN capability as described in Section 13.4. 2780 9.3. 
Responding to Connection Migration 2782 Receiving a packet from a new peer address containing a non-probing 2783 frame indicates that the peer has migrated to that address. 2785 If the recipient permits the migration, it MUST send subsequent 2786 packets to the new peer address and MUST initiate path validation 2787 (Section 8.2) to verify the peer's ownership of the address if 2788 validation is not already underway. 2790 An endpoint only changes the address to which it sends packets in 2791 response to the highest-numbered non-probing packet. This ensures 2792 that an endpoint does not send packets to an old peer address in the 2793 case that it receives reordered packets. 2795 An endpoint MAY send data to an unvalidated peer address, but it MUST 2796 protect against potential attacks as described in Section 9.3.1 and 2797 Section 9.3.2. An endpoint MAY skip validation of a peer address if 2798 that address has been seen recently. In particular, if an endpoint 2799 returns to a previously-validated path after detecting some form of 2800 spurious migration, skipping address validation and restoring loss 2801 detection and congestion state can reduce the performance impact of 2802 the attack. 2804 After changing the address to which it sends non-probing packets, an 2805 endpoint can abandon any path validation for other addresses. 2807 Receiving a packet from a new peer address could be the result of a 2808 NAT rebinding at the peer. 2810 After verifying a new client address, the server SHOULD send new 2811 address validation tokens (Section 8) to the client. 2813 9.3.1. Peer Address Spoofing 2815 It is possible that a peer is spoofing its source address to cause an 2816 endpoint to send excessive amounts of data to an unwilling host. If 2817 the endpoint sends significantly more data than the spoofing peer, 2818 connection migration might be used to amplify the volume of data that 2819 an attacker can generate toward a victim. 
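The limit that bounds this amplification (Section 8) can be sketched as a small accounting object. This is an illustration under stated assumptions, not a definitive implementation: the class and method names are hypothetical, and the factor of three is the anti-amplification limit that applies to an unvalidated address.

```python
class AmplificationLimiter:
    """Hypothetical per-address accounting for the anti-amplification limit."""

    FACTOR = 3  # may send at most 3x the bytes received before validation

    def __init__(self):
        self.bytes_received = 0
        self.bytes_sent = 0
        self.validated = False  # set once path validation succeeds

    def on_datagram_received(self, size: int) -> None:
        self.bytes_received += size

    def can_send(self, size: int) -> bool:
        # A validated address is no longer subject to the limit.
        if self.validated:
            return True
        return self.bytes_sent + size <= self.FACTOR * self.bytes_received

    def on_datagram_sent(self, size: int) -> None:
        self.bytes_sent += size
```

For example, after receiving a single 400-byte datagram from a new address, one 1200-byte datagram fits within the limit and nothing further can be sent until more data arrives or the address is validated.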
2821 As described in Section 9.3, an endpoint is required to validate a 2822 peer's new address to confirm the peer's possession of the new 2823 address. Until a peer's address is deemed valid, an endpoint limits 2824 the amount of data it sends to that address; see Section 8. In the 2825 absence of this limit, an endpoint risks being used for a denial of 2826 service attack against an unsuspecting victim. 2828 If an endpoint skips validation of a peer address as described above, 2829 it does not need to limit its sending rate. 2831 9.3.2. On-Path Address Spoofing 2833 An on-path attacker could cause a spurious connection migration by 2834 copying and forwarding a packet with a spoofed address such that it 2835 arrives before the original packet. The packet with the spoofed 2836 address will be seen to come from a migrating connection, and the 2837 original packet will be seen as a duplicate and dropped. After a 2838 spurious migration, validation of the source address will fail 2839 because the entity at the source address does not have the necessary 2840 cryptographic keys to read or respond to the PATH_CHALLENGE frame 2841 that is sent to it even if it wanted to. 2843 To protect the connection from failing due to such a spurious 2844 migration, an endpoint MUST revert to using the last validated peer 2845 address when validation of a new peer address fails. Additionally, 2846 receipt of packets with higher packet numbers from the legitimate 2847 peer address will trigger another connection migration. This will 2848 cause the validation of the address of the spurious migration to be 2849 abandoned, thus containing migrations initiated by the attacker 2850 injecting a single packet. 2852 If an endpoint has no state about the last validated peer address, it 2853 MUST close the connection silently by discarding all connection 2854 state. This results in new packets on the connection being handled 2855 generically. 
For instance, an endpoint MAY send a stateless reset in 2856 response to any further incoming packets. 2858 9.3.3. Off-Path Packet Forwarding 2860 An off-path attacker that can observe packets might forward copies of 2861 genuine packets to endpoints. If the copied packet arrives before 2862 the genuine packet, this will appear as a NAT rebinding. Any genuine 2863 packet will be discarded as a duplicate. If the attacker is able to 2864 continue forwarding packets, it might be able to cause migration to a 2865 path via the attacker. This places the attacker on path, giving it 2866 the ability to observe or drop all subsequent packets. 2868 This style of attack relies on the attacker using a path that has 2869 approximately the same characteristics as the direct path between 2870 endpoints. The attack is more reliable if relatively few packets are 2871 sent or if packet loss coincides with the attempted attack. 2873 A non-probing packet received on the original path that increases the 2874 maximum received packet number will cause the endpoint to move back 2875 to that path. Eliciting packets on this path increases the 2876 likelihood that the attack is unsuccessful. Therefore, mitigation of 2877 this attack relies on triggering the exchange of packets. 2879 In response to an apparent migration, endpoints MUST validate the 2880 previously active path using a PATH_CHALLENGE frame. This induces 2881 the sending of new packets on that path. If the path is no longer 2882 viable, the validation attempt will time out and fail; if the path is 2883 viable, but no longer desired, the validation will succeed, but only 2884 results in probing packets being sent on the path. 2886 An endpoint that receives a PATH_CHALLENGE on an active path SHOULD 2887 send a non-probing packet in response. If the non-probing packet 2888 arrives before any copy made by an attacker, this results in the 2889 connection being migrated back to the original path. 
Any subsequent 2890 migration to another path restarts this entire process. 2892 This defense is imperfect, but this is not considered a serious 2893 problem. If the path via the attacker is reliably faster than the 2894 original path despite multiple attempts to use that original path, it 2895 is not possible to distinguish between an attack and an improvement in 2896 routing. 2898 An endpoint could also use heuristics to improve detection of this 2899 style of attack. For instance, NAT rebinding is improbable if 2900 packets were recently received on the old path; similarly, rebinding 2901 is rare on IPv6 paths. Endpoints can also look for duplicated 2902 packets. Conversely, a change in connection ID is more likely to 2903 indicate an intentional migration rather than an attack. 2905 9.4. Loss Detection and Congestion Control 2907 The capacity available on the new path might not be the same as the 2908 old path. Packets sent on the old path MUST NOT contribute to 2909 congestion control or RTT estimation for the new path. 2911 On confirming a peer's ownership of its new address, an endpoint MUST 2912 immediately reset the congestion controller and round-trip time 2913 estimator for the new path to initial values (see Appendices A.3 and 2914 B.3 in [QUIC-RECOVERY]) unless the only change in the peer's address 2915 is its port number. Because port-only changes are commonly the 2916 result of NAT rebinding or other middlebox activity, the endpoint MAY 2917 retain its congestion control state and round-trip estimate 2918 in those cases instead of reverting to initial values. In cases 2919 where congestion control state retained from an old path is used on a 2920 new path with substantially different characteristics, a sender could 2921 transmit too aggressively until the congestion controller and the RTT 2922 estimator have adapted. Generally, implementations are advised to be 2923 cautious when using previous values on a new path.
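The reset rule above, including the port-only exception, might look as follows. The PathEstimates type and the function name are hypothetical. The 333 ms value is kInitialRtt from [QUIC-RECOVERY]; the 12000-byte window is a placeholder corresponding to ten 1200-byte datagrams, and [QUIC-RECOVERY] remains the authority for the exact initial values.

```python
from dataclasses import dataclass

K_INITIAL_RTT = 0.333  # seconds; kInitialRtt from [QUIC-RECOVERY]


@dataclass
class PathEstimates:
    """Hypothetical container for per-path congestion and RTT state."""
    smoothed_rtt: float = K_INITIAL_RTT
    congestion_window: int = 12000  # placeholder initial window, in bytes


def estimates_for_new_path(old, old_addr, new_addr, retain_on_port_change=True):
    """Pick the state to use once the peer's new address is confirmed.

    Addresses are (ip, port) tuples.
    """
    old_ip, old_port = old_addr
    new_ip, new_port = new_addr
    port_only = old_ip == new_ip and old_port != new_port
    if port_only and retain_on_port_change:
        # Likely NAT rebinding: retaining the old state is permitted.
        return old
    # Any other change: start again from initial values.
    return PathEstimates()
```

An implementation that prefers caution can pass retain_on_port_change=False and always restart from initial values.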
2925 There could be apparent reordering at the receiver when an endpoint 2926 sends data and probes from/to multiple addresses during the migration 2927 period, since the two resulting paths could have different round-trip 2928 times. A receiver of packets on multiple paths will still send ACK 2929 frames covering all received packets. 2931 While multiple paths might be used during connection migration, a 2932 single congestion control context and a single loss recovery context 2933 (as described in [QUIC-RECOVERY]) could be adequate. For instance, 2934 an endpoint might delay switching to a new congestion control context 2935 until it is confirmed that an old path is no longer needed (such as 2936 the case in Section 9.3.3). 2938 A sender can make exceptions for probe packets so that their loss 2939 detection is independent and does not unduly cause the congestion 2940 controller to reduce its sending rate. An endpoint might set a 2941 separate timer when a PATH_CHALLENGE is sent, which is cancelled if 2942 the corresponding PATH_RESPONSE is received. If the timer fires 2943 before the PATH_RESPONSE is received, the endpoint might send a new 2944 PATH_CHALLENGE, and restart the timer for a longer period of time. 2945 This timer SHOULD be set as described in Section 6.2.1 of 2946 [QUIC-RECOVERY] and MUST NOT be more aggressive. 2948 9.5. Privacy Implications of Connection Migration 2950 Using a stable connection ID on multiple network paths would allow a 2951 passive observer to correlate activity between those paths. An 2952 endpoint that moves between networks might not wish to have its 2953 activity correlated by any entity other than its peer, so different 2954 connection IDs are used when sending from different local addresses, 2955 as discussed in Section 5.1. For this to be effective, endpoints 2956 need to ensure that connection IDs they provide cannot be linked by 2957 any other entity.
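One way to keep connection IDs distinct per local address is to draw each new local address's ID from the set of IDs the peer has supplied and that have not yet been used. The following is a minimal sketch; the ConnectionIdPool class and its method names are hypothetical.

```python
class ConnectionIdPool:
    """Hypothetical allocator of peer-issued connection IDs per local address."""

    def __init__(self, peer_issued_cids):
        self.unused = list(peer_issued_cids)  # IDs not yet used on any path
        self.by_local_addr = {}               # local (ip, port) -> assigned ID

    def cid_for(self, local_addr):
        # Keep using the same ID while sending from the same local address.
        if local_addr in self.by_local_addr:
            return self.by_local_addr[local_addr]
        # A new local address needs a previously unused ID.
        if not self.unused:
            return None  # no spare ID: this address cannot be used yet
        cid = self.unused.pop(0)
        self.by_local_addr[local_addr] = cid
        return cid
```

Returning None models the situation in which the peer has not supplied enough spare connection IDs, which also bounds how many paths can be probed at once (Section 8.2).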
2959 At any time, endpoints MAY change the Destination Connection ID they 2960 transmit with to a value that has not been used on another path. 2962 An endpoint MUST NOT reuse a connection ID when sending from more 2963 than one local address, for example when initiating connection 2964 migration as described in Section 9.2 or when probing a new network 2965 path as described in Section 9.1. 2967 Similarly, an endpoint MUST NOT reuse a connection ID when sending to 2968 more than one destination address. Due to network changes outside 2969 the control of its peer, an endpoint might receive packets from a new 2970 source address with the same destination connection ID, in which case 2971 it MAY continue to use the current connection ID with the new remote 2972 address while still sending from the same local address. 2974 These requirements regarding connection ID reuse apply only to the 2975 sending of packets, as unintentional changes in path without a change 2976 in connection ID are possible. For example, after a period of 2977 network inactivity, NAT rebinding might cause packets to be sent on a 2978 new path when the client resumes sending. An endpoint responds to 2979 such an event as described in Section 9.3. 2981 Using different connection IDs for packets sent in both directions on 2982 each new network path eliminates the use of the connection ID for 2983 linking packets from the same connection across different network 2984 paths. Header protection ensures that packet numbers cannot be used 2985 to correlate activity. This does not prevent other properties of 2986 packets, such as timing and size, from being used to correlate 2987 activity. 2989 An endpoint SHOULD NOT initiate migration with a peer that has 2990 requested a zero-length connection ID, because traffic over the new 2991 path might be trivially linkable to traffic over the old one. 
If the 2992 server is able to associate packets with a zero-length connection ID 2993 to the right connection, it means that the server is using other 2994 information to demultiplex packets. For example, a server might 2995 provide a unique address to every client, for instance using HTTP 2996 alternative services [ALTSVC]. Information that might allow correct 2997 routing of packets across multiple network paths will also allow 2998 activity on those paths to be linked by entities other than the peer. 3000 A client might wish to reduce linkability by switching to a new 3001 connection ID, source UDP port, or IP address (see [RFC4941]) when 3002 sending traffic after a period of inactivity. Changing the address 3003 from which it sends packets at the same time might cause the server 3004 to detect a connection migration. This ensures that the mechanisms 3005 that support migration are exercised even for clients that do not 3006 experience NAT rebindings or genuine migrations. Changing address 3007 can cause a peer to reset its congestion control state (see 3008 Section 9.4), so addresses SHOULD only be changed infrequently. 3010 An endpoint that exhausts available connection IDs cannot probe new 3011 paths or initiate migration, nor can it respond to probes or attempts 3012 by its peer to migrate. To ensure that migration is possible and 3013 packets sent on different paths cannot be correlated, endpoints 3014 SHOULD provide new connection IDs before peers migrate; see 3015 Section 5.1.1. If a peer might have exhausted available connection 3016 IDs, a migrating endpoint could include a NEW_CONNECTION_ID frame in 3017 all packets sent on a new network path. 3019 9.6. Server's Preferred Address 3021 QUIC allows servers to accept connections on one IP address and 3022 attempt to transfer these connections to a more preferred address 3023 shortly after the handshake. 
This is particularly useful when 3024 clients initially connect to an address shared by multiple servers 3025 but would prefer to use a unicast address to ensure connection 3026 stability. This section describes the protocol for migrating a 3027 connection to a preferred server address. 3029 Migrating a connection to a new server address mid-connection is not 3030 supported by the version of QUIC specified in this document. If a 3031 client receives packets from a new server address when the client has 3032 not initiated a migration to that address, the client SHOULD discard 3033 these packets. 3035 9.6.1. Communicating a Preferred Address 3037 A server conveys a preferred address by including the 3038 preferred_address transport parameter in the TLS handshake. 3040 Servers MAY communicate a preferred address of each address family 3041 (IPv4 and IPv6) to allow clients to pick the one most suited to their 3042 network attachment. 3044 Once the handshake is confirmed, the client SHOULD select one of the 3045 two addresses provided by the server and initiate path validation 3046 (see Section 8.2). A client constructs packets using any previously 3047 unused active connection ID, taken from either the preferred_address 3048 transport parameter or a NEW_CONNECTION_ID frame. 3050 As soon as path validation succeeds, the client SHOULD begin sending 3051 all future packets to the new server address using the new connection 3052 ID and discontinue use of the old server address. If path validation 3053 fails, the client MUST continue sending all future packets to the 3054 server's original IP address. 3056 9.6.2. Migration to a Preferred Address 3058 A client that migrates to a preferred address MUST validate the 3059 address it chooses before migrating; see Section 21.5.3. 3061 A server might receive a packet addressed to its preferred IP address 3062 at any time after it accepts a connection. 
If this packet contains a 3063 PATH_CHALLENGE frame, the server sends a packet containing a 3064 PATH_RESPONSE frame as per Section 8.2. The server MUST send non- 3065 probing packets from its original address until it receives a non- 3066 probing packet from the client at its preferred address and until the 3067 server has validated the new path. 3069 The server MUST probe on the path toward the client from its 3070 preferred address. This helps to guard against spurious migration 3071 initiated by an attacker. 3073 Once the server has completed its path validation and has received a 3074 non-probing packet with a new largest packet number on its preferred 3075 address, the server begins sending non-probing packets to the client 3076 exclusively from its preferred IP address. The server SHOULD drop 3077 newer packets for this connection that are received on the old IP 3078 address. The server MAY continue to process delayed packets that are 3079 received on the old IP address. 3081 The addresses that a server provides in the preferred_address 3082 transport parameter are only valid for the connection in which they 3083 are provided. A client MUST NOT use these for other connections, 3084 including connections that are resumed from the current connection. 3086 9.6.3. Interaction of Client Migration and Preferred Address 3088 A client might need to perform a connection migration before it has 3089 migrated to the server's preferred address. In this case, the client 3090 SHOULD perform path validation to both the original and preferred 3091 server address from the client's new address concurrently. 3093 If path validation of the server's preferred address succeeds, the 3094 client MUST abandon validation of the original address and migrate to 3095 using the server's preferred address. 
If path validation of the 3096 server's preferred address fails but validation of the server's 3097 original address succeeds, the client MAY migrate to its new address 3098 and continue sending to the server's original address. 3100 If packets received at the server's preferred address have a 3101 different source address than observed from the client during the 3102 handshake, the server MUST protect against potential attacks as 3103 described in Section 9.3.1 and Section 9.3.2. In addition to 3104 intentional simultaneous migration, this might also occur because the 3105 client's access network used a different NAT binding for the server's 3106 preferred address. 3108 Servers SHOULD initiate path validation to the client's new address 3109 upon receiving a probe packet from a different address; see 3110 Section 8. 3112 A client that migrates to a new address SHOULD use a preferred 3113 address from the same address family for the server. 3115 The connection ID provided in the preferred_address transport 3116 parameter is not specific to the addresses that are provided. This 3117 connection ID is provided to ensure that the client has a connection 3118 ID available for migration, but the client MAY use this connection ID 3119 on any path. 3121 9.7. Use of IPv6 Flow-Label and Migration 3123 Endpoints that send data using IPv6 SHOULD apply an IPv6 flow label 3124 in compliance with [RFC6437], unless the local API does not allow 3125 setting IPv6 flow labels. 3127 The flow label generation MUST be designed to minimize the chances of 3128 linkability with a previously used flow label, as a stable flow label 3129 would enable correlating activity on multiple paths; see Section 9.5. 3131 [RFC6437] suggests deriving values using a pseudorandom function to 3132 generate flow labels. 
Including the Destination Connection ID field 3133 in addition to source and destination addresses when generating flow 3134 labels ensures that changes are synchronized with changes in other 3135 observable identifiers. A cryptographic hash function that combines 3136 these inputs with a local secret is one way this might be 3137 implemented. 3139 10. Connection Termination 3141 An established QUIC connection can be terminated in one of three 3142 ways: 3144 * idle timeout (Section 10.1) 3146 * immediate close (Section 10.2) 3148 * stateless reset (Section 10.3) 3150 An endpoint MAY discard connection state if it does not have a 3151 validated path on which it can send packets; see Section 8.2. 3153 10.1. Idle Timeout 3155 If a max_idle_timeout is specified by either peer in its transport 3156 parameters (Section 18.2), the connection is silently closed and its 3157 state is discarded when it remains idle for longer than the minimum 3158 of both peers' max_idle_timeout values. 3160 Each endpoint advertises a max_idle_timeout, but the effective value 3161 at an endpoint is computed as the minimum of the two advertised 3162 values (or the sole advertised value, if only one endpoint advertises 3163 a nonzero value). By announcing a max_idle_timeout, an endpoint 3164 commits to initiating an immediate close (Section 10.2) if it 3165 abandons the connection prior to the effective value. 3167 An endpoint restarts its idle timer when a packet from its peer is 3168 received and processed successfully. An endpoint also restarts its 3169 idle timer when sending an ack-eliciting packet if no other ack- 3170 eliciting packets have been sent since last receiving and processing 3171 a packet. Restarting this timer when sending a packet ensures that 3172 connections are not closed after new activity is initiated.
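The computation of the effective idle timeout and the timer restart rules above can be sketched as follows. Times are in milliseconds, a value of 0 stands for "not advertised", and the function and class names are hypothetical.

```python
def effective_idle_timeout(local_ms: int, peer_ms: int):
    """Minimum of the advertised nonzero values, or None if neither is set."""
    values = [v for v in (local_ms, peer_ms) if v > 0]
    return min(values) if values else None


class IdleTimer:
    """Hypothetical idle timer following the restart rules of Section 10.1."""

    def __init__(self, timeout_ms: int, now_ms: int = 0):
        self.timeout_ms = timeout_ms
        self.deadline = now_ms + timeout_ms
        self.sent_ack_eliciting_since_receive = False

    def on_packet_received(self, now_ms: int) -> None:
        # Any packet that is received and processed restarts the timer.
        self.deadline = now_ms + self.timeout_ms
        self.sent_ack_eliciting_since_receive = False

    def on_packet_sent(self, now_ms: int, ack_eliciting: bool) -> None:
        # Only the first ack-eliciting packet sent since the last
        # received packet restarts the timer.
        if ack_eliciting and not self.sent_ack_eliciting_since_receive:
            self.deadline = now_ms + self.timeout_ms
            self.sent_ack_eliciting_since_receive = True

    def expired(self, now_ms: int) -> bool:
        return now_ms > self.deadline
```

Restricting the send-side restart to the first ack-eliciting packet keeps an endpoint that sends repeatedly, without ever hearing from its peer, from postponing the timeout indefinitely.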

To avoid excessively small idle timeout periods, endpoints MUST
increase the idle timeout period to be at least three times the
current Probe Timeout (PTO). This allows for multiple PTOs to
expire, and therefore multiple probes to be sent and lost, prior to
idle timeout.

10.1.1. Liveness Testing

An endpoint that sends packets close to the effective timeout risks
having them be discarded at the peer, since the idle timeout period
might have expired at the peer before these packets arrive.

An endpoint can send a PING or another ack-eliciting frame to test
the connection for liveness if the peer could time out soon, such as
within a PTO; see Section 6.2 of [QUIC-RECOVERY]. This is especially
useful if any available application data cannot be safely retried.
Note that the application determines what data is safe to retry.

10.1.2. Deferring Idle Timeout

An endpoint might need to send ack-eliciting packets to avoid an idle
timeout if it is expecting response data, but does not have or is
unable to send application data.

An implementation of QUIC might provide applications with an option
to defer an idle timeout. This facility could be used when the
application wishes to avoid losing state that has been associated
with an open connection, but does not expect to exchange application
data for some time. With this option, an endpoint could send a PING
frame (Section 19.2) periodically, which will cause the peer to
restart its idle timeout period. Sending a packet containing a PING
frame restarts the idle timeout for this endpoint also if this is the
first ack-eliciting packet sent since receiving a packet. Sending a
PING frame causes the peer to respond with an acknowledgment, which
also restarts the idle timeout for the endpoint.

Application protocols that use QUIC SHOULD provide guidance on when
deferring an idle timeout is appropriate. Unnecessary sending of
PING frames could have a detrimental effect on performance.

A connection will time out if no packets are sent or received for a
period longer than the time negotiated using the max_idle_timeout
transport parameter; see Section 10. However, state in middleboxes
might time out earlier than that. Though REQ-5 in [RFC4787]
recommends a 2-minute timeout interval, experience shows that sending
packets every 30 seconds is necessary to prevent the majority of
middleboxes from losing state for UDP flows [GATEWAY].

10.2. Immediate Close

An endpoint sends a CONNECTION_CLOSE frame (Section 19.19) to
terminate the connection immediately. A CONNECTION_CLOSE frame
causes all streams to immediately become closed; open streams can be
assumed to be implicitly reset.

After sending a CONNECTION_CLOSE frame, an endpoint immediately
enters the closing state; see Section 10.2.1. After receiving a
CONNECTION_CLOSE frame, endpoints enter the draining state; see
Section 10.2.2.

Violations of the protocol lead to an immediate close.

An immediate close can be used after an application protocol has
arranged to close a connection. This might be after the application
protocol negotiates a graceful shutdown. The application protocol
can exchange messages that are needed for both application endpoints
to agree that the connection can be closed, after which the
application requests that QUIC close the connection. When QUIC
consequently closes the connection, a CONNECTION_CLOSE frame with an
application-supplied error code will be used to signal closure to the
peer.

The closing and draining connection states exist to ensure that
connections close cleanly and that delayed or reordered packets are
properly discarded. These states SHOULD persist for at least three
times the current Probe Timeout (PTO) interval as defined in
[QUIC-RECOVERY].

Disposing of connection state prior to exiting the closing or
draining state could result in an endpoint generating a stateless
reset unnecessarily when it receives a late-arriving packet.
Endpoints that have some alternative means to ensure that
late-arriving packets do not induce a response, such as those that
are able to close the UDP socket, MAY end these states earlier to
allow for faster resource recovery. Servers that retain an open
socket for accepting new connections SHOULD NOT end the closing or
draining states early.

Once its closing or draining state ends, an endpoint SHOULD discard
all connection state. The endpoint MAY send a stateless reset in
response to any further incoming packets belonging to this
connection.

10.2.1. Closing Connection State

An endpoint enters the closing state after initiating an immediate
close.

In the closing state, an endpoint retains only enough information to
generate a packet containing a CONNECTION_CLOSE frame and to identify
packets as belonging to the connection. An endpoint in the closing
state sends a packet containing a CONNECTION_CLOSE frame in response
to any incoming packet that it attributes to the connection.

An endpoint SHOULD limit the rate at which it generates packets in
the closing state. For instance, an endpoint could wait for a
progressively increasing number of received packets or amount of time
before responding to received packets.
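
One way to realize "a progressively increasing number of received
packets" is to double the number of packets that must arrive before
the next response. The sketch below is only an illustration of that
idea; the class name ClosingResponder and the doubling schedule are
assumptions, and this document does not mandate any particular
schedule.

```python
class ClosingResponder:
    """Hypothetical rate limiter for responses sent in the closing state.

    Responds to the 1st, 2nd, 4th, 8th, ... received packet, so the
    number of CONNECTION_CLOSE packets sent grows only logarithmically
    with the number of packets received.
    """

    def __init__(self) -> None:
        self.received = 0
        self.next_threshold = 1

    def should_respond(self) -> bool:
        """Call once per packet attributed to the closing connection."""
        self.received += 1
        if self.received >= self.next_threshold:
            self.next_threshold *= 2
            return True
        return False
```

With this schedule, ten incoming packets produce four responses (to
packets 1, 2, 4, and 8).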

An endpoint's selected connection ID and the QUIC version are
sufficient information to identify packets for a closing connection;
the endpoint MAY discard all other connection state. An endpoint
that is closing is not required to process any received frame. An
endpoint MAY retain packet protection keys for incoming packets to
allow it to read and process a CONNECTION_CLOSE frame.

An endpoint MAY drop packet protection keys when entering the closing
state and send a packet containing a CONNECTION_CLOSE frame in
response to any UDP datagram that is received. However, an endpoint
that discards packet protection keys cannot identify and discard
invalid packets. To avoid being used for an amplification attack,
such endpoints MUST limit the cumulative size of packets they send to
three times the cumulative size of the packets that are received and
attributed to the connection. To minimize the state that an endpoint
maintains for a closing connection, endpoints MAY send the exact same
packet in response to any received packet.

Note: Allowing retransmission of a closing packet is an exception to
   the requirement that a new packet number be used for each packet
   in Section 12.3. Sending new packet numbers is primarily of
   advantage to loss recovery and congestion control, which are not
   expected to be relevant for a closed connection. Retransmitting
   the final packet requires less state.

While in the closing state, an endpoint could receive packets from a
new source address, possibly indicating a connection migration; see
Section 9. An endpoint in the closing state MUST either discard
packets received from an unvalidated address or limit the cumulative
size of packets it sends to an unvalidated address to three times the
size of packets it receives from that address.
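
Both three-times limits above amount to the same byte accounting. The
following sketch shows one possible form of it; the class name
AmplificationBudget and its methods are assumptions for illustration.
An implementation would keep one such budget per connection (when
keys have been dropped) or per unvalidated address.

```python
class AmplificationBudget:
    """Hypothetical tracker for the three-times sending limit."""

    FACTOR = 3

    def __init__(self) -> None:
        self.bytes_received = 0
        self.bytes_sent = 0

    def on_receive(self, size: int) -> None:
        # Count every datagram attributed to the connection or address.
        self.bytes_received += size

    def can_send(self, size: int) -> bool:
        # Cumulative bytes sent must stay within 3x cumulative bytes received.
        return self.bytes_sent + size <= self.FACTOR * self.bytes_received

    def on_send(self, size: int) -> None:
        if not self.can_send(size):
            raise ValueError("send would exceed the amplification limit")
        self.bytes_sent += size
```

For example, after receiving a single 100-byte datagram, the endpoint
can send up to 300 bytes in total and nothing further until more data
arrives.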

An endpoint is not expected to handle key updates when it is closing
(Section 6 of [QUIC-TLS]). A key update might prevent the endpoint
from moving from the closing state to the draining state, as the
endpoint will not be able to process subsequently received packets,
but it otherwise has no impact.

10.2.2. Draining Connection State

The draining state is entered once an endpoint receives a
CONNECTION_CLOSE frame, which indicates that its peer is closing or
draining. While otherwise identical to the closing state, an
endpoint in the draining state MUST NOT send any packets. Retaining
packet protection keys is unnecessary once a connection is in the
draining state.

An endpoint that receives a CONNECTION_CLOSE frame MAY send a single
packet containing a CONNECTION_CLOSE frame before entering the
draining state, using a NO_ERROR code if appropriate. An endpoint
MUST NOT send further packets. Doing so could result in a constant
exchange of CONNECTION_CLOSE frames until one of the endpoints exits
the closing state.

An endpoint MAY enter the draining state from the closing state if it
receives a CONNECTION_CLOSE frame, which indicates that the peer is
also closing or draining. In this case, the draining state ends when
the closing state would have ended. In other words, the endpoint
uses the same end time, but ceases transmission of any packets on
this connection.

10.2.3. Immediate Close During the Handshake

When sending CONNECTION_CLOSE, the goal is to ensure that the peer
will process the frame. Generally, this means sending the frame in a
packet with the highest level of packet protection to avoid the
packet being discarded. After the handshake is confirmed (see
Section 4.1.2 of [QUIC-TLS]), an endpoint MUST send any
CONNECTION_CLOSE frames in a 1-RTT packet.
However, prior to
confirming the handshake, it is possible that more advanced packet
protection keys are not available to the peer, so another
CONNECTION_CLOSE frame MAY be sent in a packet that uses a lower
packet protection level. More specifically:

* A client will always know whether the server has Handshake keys
  (see Section 17.2.2.1), but it is possible that a server does not
  know whether the client has Handshake keys. Under these
  circumstances, a server SHOULD send a CONNECTION_CLOSE frame in
  both Handshake and Initial packets to ensure that at least one of
  them is processable by the client.

* A client that sends CONNECTION_CLOSE in a 0-RTT packet cannot be
  assured that the server has accepted 0-RTT. Sending a
  CONNECTION_CLOSE frame in an Initial packet makes it more likely
  that the server can receive the close signal, even if the
  application error code might not be received.

* Prior to confirming the handshake, a peer might be unable to
  process 1-RTT packets, so an endpoint SHOULD send CONNECTION_CLOSE
  in both Handshake and 1-RTT packets. A server SHOULD also send
  CONNECTION_CLOSE in an Initial packet.

Sending a CONNECTION_CLOSE of type 0x1d in an Initial or Handshake
packet could expose application state or be used to alter application
state. A CONNECTION_CLOSE of type 0x1d MUST be replaced by a
CONNECTION_CLOSE of type 0x1c when sending the frame in Initial or
Handshake packets. Otherwise, information about the application
state might be revealed. Endpoints MUST clear the value of the
Reason Phrase field and SHOULD use the APPLICATION_ERROR code when
converting to a CONNECTION_CLOSE of type 0x1c.

CONNECTION_CLOSE frames sent in multiple packet types can be
coalesced into a single UDP datagram; see Section 12.2.

An endpoint can send a CONNECTION_CLOSE frame in an Initial packet.
This might be in response to unauthenticated information received in
Initial or Handshake packets. Such an immediate close might expose
legitimate connections to a denial of service. QUIC does not include
defensive measures for on-path attacks during the handshake; see
Section 21.2. However, at the cost of reducing feedback about errors
for legitimate peers, some forms of denial of service can be made
more difficult for an attacker if endpoints discard illegal packets
rather than terminating a connection with CONNECTION_CLOSE. For this
reason, endpoints MAY discard packets rather than immediately close
if errors are detected in packets that lack authentication.

An endpoint that has not established state, such as a server that
detects an error in an Initial packet, does not enter the closing
state. An endpoint that has no state for the connection does not
enter a closing or draining period on sending a CONNECTION_CLOSE
frame.

10.3. Stateless Reset

A stateless reset is provided as an option of last resort for an
endpoint that does not have access to the state of a connection. A
crash or outage might result in peers continuing to send data to an
endpoint that is unable to properly continue the connection. An
endpoint MAY send a stateless reset in response to receiving a packet
that it cannot associate with an active connection.

A stateless reset is not appropriate for indicating errors in active
connections. An endpoint that wishes to communicate a fatal
connection error MUST use a CONNECTION_CLOSE frame if it is able.

To support this process, an endpoint issues a stateless reset token,
which is a 16-byte value that is hard to guess. If the peer
subsequently receives a stateless reset, which is a UDP datagram that
ends in that stateless reset token, the peer will immediately end the
connection.
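
The frame-type substitution required by Section 10.2.3 (type 0x1d
replaced by type 0x1c in Initial and Handshake packets, with the
Reason Phrase cleared) can be sketched as below. The ConnectionClose
class, the string packet-type labels, and the function name are
assumptions made for this example. The numeric value used for
APPLICATION_ERROR (0x0c) is taken to be the one registered in
Section 20 and should be checked against that section.

```python
from dataclasses import dataclass

CONNECTION_CLOSE_TRANSPORT = 0x1C    # transport-level close
CONNECTION_CLOSE_APPLICATION = 0x1D  # application-level close
APPLICATION_ERROR = 0x0C  # assumed value; verify against Section 20


@dataclass(frozen=True)
class ConnectionClose:
    """Hypothetical, simplified view of a CONNECTION_CLOSE frame."""

    frame_type: int
    error_code: int
    reason_phrase: bytes = b""


def close_frame_for(frame: ConnectionClose, packet_type: str) -> ConnectionClose:
    """Return the frame that may be sent in the given packet type.

    packet_type is one of "initial", "handshake", "0rtt", or "1rtt"
    (labels chosen for this sketch).
    """
    needs_conversion = (
        frame.frame_type == CONNECTION_CLOSE_APPLICATION
        and packet_type in ("initial", "handshake")
    )
    if needs_conversion:
        # Hide application state: generic code, empty Reason Phrase.
        return ConnectionClose(CONNECTION_CLOSE_TRANSPORT, APPLICATION_ERROR, b"")
    return frame
```

A transport-level frame (type 0x1c) passes through unchanged, as does
an application-level frame sent in a 0-RTT or 1-RTT packet.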

A stateless reset token is specific to a connection ID. An endpoint
issues a stateless reset token by including the value in the
Stateless Reset Token field of a NEW_CONNECTION_ID frame. Servers
can also issue a stateless_reset_token transport parameter during the
handshake that applies to the connection ID that it selected during
the handshake. These exchanges are protected by encryption, so only
client and server know their value. Note that clients cannot use the
stateless_reset_token transport parameter because their transport
parameters do not have confidentiality protection.

Tokens are invalidated when their associated connection ID is retired
via a RETIRE_CONNECTION_ID frame (Section 19.16).

An endpoint that receives packets that it cannot process sends a
packet in the following layout (see Section 1.3):

Stateless Reset {
  Fixed Bits (2) = 1,
  Unpredictable Bits (38..),
  Stateless Reset Token (128),
}

Figure 10: Stateless Reset Packet

This design ensures that a stateless reset packet is, to the extent
possible, indistinguishable from a regular packet with a short
header.

A stateless reset uses an entire UDP datagram, starting with the
first two bits of the packet header. The remainder of the first byte
and an arbitrary number of bytes following it are set to values that
SHOULD be indistinguishable from random. The last 16 bytes of the
datagram contain a Stateless Reset Token.

To entities other than its intended recipient, a stateless reset will
appear to be a packet with a short header. For the stateless reset
to appear as a valid QUIC packet, the Unpredictable Bits field needs
to include at least 38 bits of data (or 5 bytes, less the two fixed
bits).
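
The layout in Figure 10 can be assembled as in the sketch below. The
function name build_stateless_reset and its total_len parameter are
assumptions for illustration. The sketch enforces only the layout and
the 21-byte minimum that follows from it (5 bytes including the two
fixed bits, plus the 16-byte token); it does not apply any limit that
depends on the size of the triggering packet.

```python
import os

TOKEN_LEN = 16
MIN_LEN = 5 + TOKEN_LEN  # 2 fixed bits + 38 unpredictable bits + token


def build_stateless_reset(token: bytes, total_len: int = MIN_LEN) -> bytes:
    """Assemble a datagram laid out as in Figure 10 (illustrative only)."""
    if len(token) != TOKEN_LEN:
        raise ValueError("stateless reset token must be 16 bytes")
    if total_len < MIN_LEN:
        raise ValueError("stateless reset must be at least 21 bytes")
    # Everything before the token is random ...
    head = bytearray(os.urandom(total_len - TOKEN_LEN))
    # ... except the two most significant bits of the first byte, which
    # are fixed to 0b01 so that the datagram resembles a short header.
    head[0] = 0x40 | (head[0] & 0x3F)
    return bytes(head) + token
```

os.urandom draws from the operating system's cryptographically secure
generator, which serves the "indistinguishable from random" goal for
the Unpredictable Bits field.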

The resulting minimum size of 21 bytes does not guarantee that a
stateless reset is difficult to distinguish from other packets if the
recipient requires the use of a connection ID. To achieve that end,
the endpoint SHOULD ensure that all packets it sends are at least 22
bytes longer than the minimum connection ID length that it requests
the peer to include in its packets, adding PADDING frames as
necessary. This ensures that any stateless reset sent by the peer is
indistinguishable from a valid packet sent to the endpoint. An
endpoint that sends a stateless reset in response to a packet that is
43 bytes or shorter SHOULD send a stateless reset that is one byte
shorter than the packet it responds to.

These values assume that the Stateless Reset Token is the same length
as the minimum expansion of the packet protection AEAD. Additional
unpredictable bytes are necessary if the endpoint could have
negotiated a packet protection scheme with a larger minimum
expansion.

To avoid being used for amplification, an endpoint MUST NOT send a
stateless reset that is three or more times larger than the packet
it receives. Section 10.3.3 describes additional limits on
stateless reset size.

Endpoints MUST discard packets that are too small to be valid QUIC
packets. To give an example, with the set of AEAD functions defined
in [QUIC-TLS], short header packets that are smaller than 21 bytes
are never valid.

Endpoints MUST send stateless reset packets formatted as a packet
with a short header. However, endpoints MUST treat any packet ending
in a valid stateless reset token as a stateless reset, as other QUIC
versions might allow the use of a long header.

An endpoint MAY send a stateless reset in response to a packet with a
long header.
Sending a stateless reset is not effective prior to the
stateless reset token being available to a peer. In this QUIC
version, packets with a long header are only used during connection
establishment. Because the stateless reset token is not available
until connection establishment is complete or near completion,
ignoring an unknown packet with a long header might be as effective
as sending a stateless reset.

An endpoint cannot determine the Source Connection ID from a packet
with a short header; therefore, it cannot set the Destination
Connection ID in the stateless reset packet. The Destination
Connection ID will therefore differ from the value used in previous
packets. A random Destination Connection ID makes the connection ID
appear to be the result of moving to a new connection ID that was
provided using a NEW_CONNECTION_ID frame (Section 19.15).

Using a randomized connection ID results in two problems:

* The packet might not reach the peer. If the Destination
  Connection ID is critical for routing toward the peer, then this
  packet could be incorrectly routed. This might also trigger
  another Stateless Reset in response; see Section 10.3.3. A
  Stateless Reset that is not correctly routed is an ineffective
  error detection and recovery mechanism. In this case, endpoints
  will need to rely on other methods, such as timers, to detect
  that the connection has failed.

* The randomly generated connection ID can be used by entities other
  than the peer to identify this as a potential stateless reset. An
  endpoint that occasionally uses different connection IDs might
  introduce some uncertainty about this.

This stateless reset design is specific to QUIC version 1.
An
endpoint that supports multiple versions of QUIC needs to generate a
stateless reset that will be accepted by peers that support any
version that the endpoint might support (or might have supported
prior to losing state). Designers of new versions of QUIC need to be
aware of this and either reuse this design, or use a portion of the
packet other than the last 16 bytes for carrying data.

10.3.1. Detecting a Stateless Reset

An endpoint detects a potential stateless reset using the trailing 16
bytes of the UDP datagram. An endpoint remembers all Stateless Reset
Tokens associated with the connection IDs and remote addresses for
datagrams it has recently sent. This includes Stateless Reset Tokens
from NEW_CONNECTION_ID frames and the server's transport parameters
but excludes Stateless Reset Tokens associated with connection IDs
that are either unused or retired. The endpoint identifies a
received datagram as a stateless reset by comparing the last 16 bytes
of the datagram with all Stateless Reset Tokens associated with the
remote address on which the datagram was received.

This comparison can be performed for every inbound datagram.
Endpoints MAY skip this check if any packet from a datagram is
successfully processed. However, the comparison MUST be performed
when the first packet in an incoming datagram either cannot be
associated with a connection, or cannot be decrypted.

An endpoint MUST NOT check for any Stateless Reset Tokens associated
with connection IDs it has not used or for connection IDs that have
been retired.

When comparing a datagram to Stateless Reset Token values, endpoints
MUST perform the comparison without leaking information about the
value of the token.
For example, performing this comparison in
constant time protects the value of individual Stateless Reset Tokens
from information leakage through timing side channels. Another
approach would be to store and compare the transformed values of
Stateless Reset Tokens instead of the raw token values, where the
transformation is defined as a cryptographically secure pseudorandom
function using a secret key (e.g., block cipher, HMAC [RFC2104]). An
endpoint is not expected to protect information about whether a
packet was successfully decrypted, or the number of valid Stateless
Reset Tokens.

If the last 16 bytes of the datagram are identical in value to a
Stateless Reset Token, the endpoint MUST enter the draining period
and not send any further packets on this connection.

10.3.2. Calculating a Stateless Reset Token

The stateless reset token MUST be difficult to guess. In order to
create a Stateless Reset Token, an endpoint could randomly generate
([RANDOM]) a secret for every connection that it creates. However,
this presents a coordination problem when there are multiple
instances in a cluster or a storage problem for an endpoint that
might lose state. Stateless reset specifically exists to handle the
case where state is lost, so this approach is suboptimal.

A single static key can be used across all connections to the same
endpoint by generating the proof using a pseudorandom function that
takes a static key and the connection ID chosen by the endpoint (see
Section 5.1) as input. An endpoint could use HMAC [RFC2104] (for
example, HMAC(static_key, connection_id)) or HKDF [RFC5869] (for
example, using the static key as input keying material, with the
connection ID as salt). The output of this function is truncated to
16 bytes to produce the Stateless Reset Token for that connection.
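
The HMAC variant described above, together with the constant-time
comparison discussed in Section 10.3.1, can be sketched as follows.
The function names are assumptions for illustration, and the choice
of SHA-256 as the underlying hash is also an assumption; this
document does not name a hash function.

```python
import hashlib
import hmac
from typing import Iterable

TOKEN_LEN = 16


def stateless_reset_token(static_key: bytes, connection_id: bytes) -> bytes:
    """HMAC(static_key, connection_id), truncated to 16 bytes.

    SHA-256 is an assumption of this sketch.
    """
    digest = hmac.new(static_key, connection_id, hashlib.sha256).digest()
    return digest[:TOKEN_LEN]


def matches_stateless_reset(datagram: bytes, tokens: Iterable[bytes]) -> bool:
    """Compare the trailing 16 bytes against each remembered token.

    Each comparison uses hmac.compare_digest, which runs in time that
    does not depend on where the inputs differ, and the loop does not
    stop at the first match.
    """
    if len(datagram) < TOKEN_LEN:
        return False
    tail = datagram[-TOKEN_LEN:]
    matched = False
    for token in tokens:
        matched |= hmac.compare_digest(tail, token)
    return matched
```

Because the token is a pure function of the static key and the
connection ID, an instance that has lost all connection state can
still recompute it from the Destination Connection ID of an incoming
packet.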

An endpoint that loses state can use the same method to generate a
valid Stateless Reset Token. The connection ID comes from the packet
that the endpoint receives.

This design relies on the peer always sending a connection ID in its
packets so that the endpoint can use the connection ID from a packet
to reset the connection. An endpoint that uses this design MUST
either use the same connection ID length for all connections or
encode the length of the connection ID such that it can be recovered
without state. In addition, it cannot provide a zero-length
connection ID.

Revealing the Stateless Reset Token allows any entity to terminate
the connection, so a value can only be used once. This method for
choosing the Stateless Reset Token means that the combination of
connection ID and static key MUST NOT be used for another connection.
A denial of service attack is possible if the same connection ID is
used by instances that share a static key, or if an attacker can
cause a packet to be routed to an instance that has no state but the
same static key; see Section 21.11. A connection ID from a
connection that is reset by revealing the Stateless Reset Token MUST
NOT be reused for new connections at nodes that share a static key.

The same Stateless Reset Token MUST NOT be used for multiple
connection IDs. Endpoints are not required to compare new values
against all previous values, but a duplicate value MAY be treated as
a connection error of type PROTOCOL_VIOLATION.

Note that Stateless Reset packets do not have any cryptographic
protection.

10.3.3. Looping

The design of a Stateless Reset is such that without knowing the
stateless reset token it is indistinguishable from a valid packet.
For instance, if a server sends a Stateless Reset to another server
it might receive another Stateless Reset in response, which could
lead to an infinite exchange.

An endpoint MUST ensure that every Stateless Reset that it sends is
smaller than the packet that triggered it, unless it maintains state
sufficient to prevent looping. In the event of a loop, this results
in packets eventually being too small to trigger a response.

An endpoint can remember the number of Stateless Reset packets that
it has sent and stop generating new Stateless Reset packets once a
limit is reached. Using separate limits for different remote
addresses will ensure that Stateless Reset packets can be used to
close connections when other peers or connections have exhausted
limits.

Reducing the size of a Stateless Reset below 41 bytes means that the
packet could reveal to an observer that it is a Stateless Reset,
depending upon the length of the peer's connection IDs. Conversely,
refusing to send a Stateless Reset in response to a small packet
might result in Stateless Reset not being useful in detecting cases
of broken connections where only very small packets are sent; such
failures might only be detected by other means, such as timers.

11. Error Handling

An endpoint that detects an error SHOULD signal the existence of that
error to its peer. Both transport-level and application-level errors
can affect an entire connection; see Section 11.1. Only
application-level errors can be isolated to a single stream; see
Section 11.2.

The most appropriate error code (Section 20) SHOULD be included in
the frame that signals the error.
Where this specification
identifies error conditions, it also identifies the error code that
is used; though these are worded as requirements, different
implementation strategies might lead to different errors being
reported. In particular, an endpoint MAY use any applicable error
code when it detects an error condition; a generic error code (such
as PROTOCOL_VIOLATION or INTERNAL_ERROR) can always be used in place
of specific error codes.

A stateless reset (Section 10.3) is not suitable for any error that
can be signaled with a CONNECTION_CLOSE or RESET_STREAM frame. A
stateless reset MUST NOT be used by an endpoint that has the state
necessary to send a frame on the connection.

11.1. Connection Errors

Errors that result in the connection being unusable, such as an
obvious violation of protocol semantics or corruption of state that
affects an entire connection, MUST be signaled using a
CONNECTION_CLOSE frame (Section 19.19).

Application-specific protocol errors are signaled using the
CONNECTION_CLOSE frame with a frame type of 0x1d. Errors that are
specific to the transport, including all those described in this
document, are carried in the CONNECTION_CLOSE frame with a frame type
of 0x1c.

A CONNECTION_CLOSE frame could be sent in a packet that is lost. An
endpoint SHOULD be prepared to retransmit a packet containing a
CONNECTION_CLOSE frame if it receives more packets on a terminated
connection. Limiting the number of retransmissions and the time over
which this final packet is sent limits the effort expended on
terminated connections.

An endpoint that chooses not to retransmit packets containing a
CONNECTION_CLOSE frame risks a peer missing the first such packet.
The only mechanism available to an endpoint that continues to receive
data for a terminated connection is to attempt the stateless reset
process (Section 10.3).

As the AEAD on Initial packets does not provide strong
authentication, an endpoint MAY discard an invalid Initial packet.
Discarding an Initial packet is permitted even where this
specification otherwise mandates a connection error. An endpoint can
only discard a packet if it does not process the frames in the packet
or reverts the effects of any processing. Discarding invalid Initial
packets might be used to reduce exposure to denial of service; see
Section 21.2.

11.2. Stream Errors

If an application-level error affects a single stream, but otherwise
leaves the connection in a recoverable state, the endpoint can send a
RESET_STREAM frame (Section 19.4) with an appropriate error code to
terminate just the affected stream.

Resetting a stream without the involvement of the application
protocol could cause the application protocol to enter an
unrecoverable state. RESET_STREAM MUST only be instigated by the
application protocol that uses QUIC.

The semantics of the application error code carried in RESET_STREAM
are defined by the application protocol. Only the application
protocol is able to cause a stream to be terminated. A local
instance of the application protocol uses a direct API call and a
remote instance uses the STOP_SENDING frame, which triggers an
automatic RESET_STREAM.

Application protocols SHOULD define rules for handling streams that
are prematurely cancelled by either endpoint.

12. Packets and Frames

QUIC endpoints communicate by exchanging packets. Packets have
confidentiality and integrity protection; see Section 12.1. Packets
are carried in UDP datagrams; see Section 12.2.

This version of QUIC uses the long packet header during connection
establishment; see Section 17.2. Packets with the long header are
Initial (Section 17.2.2), 0-RTT (Section 17.2.3), Handshake
(Section 17.2.4), and Retry (Section 17.2.5). Version negotiation
uses a version-independent packet with a long header; see
Section 17.2.1.

Packets with the short header are designed for minimal overhead and
are used after a connection is established and 1-RTT keys are
available; see Section 17.3.

12.1. Protected Packets

QUIC packets have different levels of cryptographic protection based
on the type of packet. Details of packet protection are found in
[QUIC-TLS]; this section includes an overview of the protections that
are provided.

Version Negotiation packets have no cryptographic protection; see
[QUIC-INVARIANTS].

Retry packets use an authenticated encryption with associated data
function (AEAD; [AEAD]) to protect against accidental modification.

Initial packets use an AEAD, the keys for which are derived using a
value that is visible on the wire. Initial packets therefore do not
have effective confidentiality protection. Initial protection exists
to ensure that the sender of the packet is on the network path. Any
entity that receives an Initial packet from a client can recover the
keys that will allow them to both read the contents of the packet and
generate Initial packets that will be successfully authenticated at
either endpoint. The AEAD also protects Initial packets against
accidental modification.

All other packets are protected with keys derived from the
cryptographic handshake. The cryptographic handshake ensures that
only the communicating endpoints receive the corresponding keys for
Handshake, 0-RTT, and 1-RTT packets.
Packets protected with 0-RTT and 1-RTT keys have strong confidentiality and integrity protection.

The Packet Number field that appears in some packet types has alternative confidentiality protection that is applied as part of header protection; see Section 5.4 of [QUIC-TLS] for details. The underlying packet number increases with each packet sent in a given packet number space; see Section 12.3 for details.

12.2. Coalescing Packets

Initial (Section 17.2.2), 0-RTT (Section 17.2.3), and Handshake (Section 17.2.4) packets contain a Length field that determines the end of the packet. The length includes both the Packet Number and Payload fields, both of which are confidentiality protected and initially of unknown length. The length of the Payload field is learned once header protection is removed.

Using the Length field, a sender can coalesce multiple QUIC packets into one UDP datagram. This can reduce the number of UDP datagrams needed to complete the cryptographic handshake and start sending data. This can also be used to construct PMTU probes; see Section 14.4.1. Receivers MUST be able to process coalesced packets.

Coalescing packets in order of increasing encryption levels (Initial, 0-RTT, Handshake, 1-RTT; see Section 4.1.4 of [QUIC-TLS]) makes it more likely the receiver will be able to process all the packets in a single pass. A packet with a short header does not include a length, so it can only be the last packet included in a UDP datagram. An endpoint SHOULD include multiple frames in a single packet if they are to be sent at the same encryption level, instead of coalescing multiple packets at the same encryption level.

Receivers MAY route based on the information in the first packet contained in a UDP datagram. Senders MUST NOT coalesce QUIC packets with different connection IDs into a single UDP datagram.
Receivers SHOULD ignore any subsequent packets with a different Destination Connection ID than the first packet in the datagram.

Every QUIC packet that is coalesced into a single UDP datagram is separate and complete. The receiver of coalesced QUIC packets MUST individually process each QUIC packet and separately acknowledge them, as if they were received as the payload of different UDP datagrams. For example, if decryption fails (because the keys are not available or any other reason), the receiver MAY either discard or buffer the packet for later processing and MUST attempt to process the remaining packets.

Retry packets (Section 17.2.5), Version Negotiation packets (Section 17.2.1), and packets with a short header (Section 17.3) do not contain a Length field and so cannot be followed by other packets in the same UDP datagram. Note also that there is no situation where a Retry or Version Negotiation packet is coalesced with another packet.

12.3. Packet Numbers

The packet number is an integer in the range 0 to 2^62-1. This number is used in determining the cryptographic nonce for packet protection. Each endpoint maintains a separate packet number for sending and receiving.

Packet numbers are limited to this range because they need to be representable in whole in the Largest Acknowledged field of an ACK frame (Section 19.3). When present in a long or short header however, packet numbers are reduced and encoded in 1 to 4 bytes; see Section 17.1.

Version Negotiation (Section 17.2.1) and Retry (Section 17.2.5) packets do not include a packet number.

Packet numbers are divided into 3 spaces in QUIC:

*  Initial space: All Initial packets (Section 17.2.2) are in this space.

*  Handshake space: All Handshake packets (Section 17.2.4) are in this space.

*  Application data space: All 0-RTT (Section 17.2.3) and 1-RTT (Section 17.3.1) packets are in this space.

As described in [QUIC-TLS], each packet type uses different protection keys.

Conceptually, a packet number space is the context in which a packet can be processed and acknowledged. Initial packets can only be sent with Initial packet protection keys and acknowledged in packets that are also Initial packets. Similarly, Handshake packets are sent at the Handshake encryption level and can only be acknowledged in Handshake packets.

This enforces cryptographic separation between the data sent in the different packet number spaces. Packet numbers in each space start at packet number 0. Subsequent packets sent in the same packet number space MUST increase the packet number by at least one.

0-RTT and 1-RTT data exist in the same packet number space to make loss recovery algorithms easier to implement between the two packet types.

A QUIC endpoint MUST NOT reuse a packet number within the same packet number space in one connection. If the packet number for sending reaches 2^62 - 1, the sender MUST close the connection without sending a CONNECTION_CLOSE frame or any further packets; an endpoint MAY send a Stateless Reset (Section 10.3) in response to further packets that it receives.

A receiver MUST discard a newly unprotected packet unless it is certain that it has not processed another packet with the same packet number from the same packet number space. Duplicate suppression MUST happen after removing packet protection for the reasons described in Section 9.5 of [QUIC-TLS].

Endpoints that track all individual packets for the purposes of detecting duplicates are at risk of accumulating excessive state.
The data required for detecting duplicates can be limited by maintaining a minimum packet number below which all packets are immediately dropped. Any minimum needs to account for large variations in round trip time, which includes the possibility that a peer might probe network paths with much larger round trip times; see Section 9.

Packet number encoding at a sender and decoding at a receiver are described in Section 17.1.

12.4. Frames and Frame Types

The payload of QUIC packets, after removing packet protection, consists of a sequence of complete frames, as shown in Figure 11. Version Negotiation, Stateless Reset, and Retry packets do not contain frames.

   Packet Payload {
     Frame (8..) ...,
   }

                    Figure 11: QUIC Payload

The payload of a packet that contains frames MUST contain at least one frame, and MAY contain multiple frames and multiple frame types. An endpoint MUST treat receipt of a packet containing no frames as a connection error of type PROTOCOL_VIOLATION. Frames always fit within a single QUIC packet and cannot span multiple packets.

Each frame begins with a Frame Type, indicating its type, followed by additional type-dependent fields:

   Frame {
     Frame Type (i),
     Type-Dependent Fields (..),
   }

                 Figure 12: Generic Frame Layout

Table 3 lists and summarizes information about each frame type that is defined in this specification. A description of this summary is included after the table.

+=============+======================+===============+======+======+
| Type Value  | Frame Type Name      | Definition    | Pkts | Spec |
+=============+======================+===============+======+======+
| 0x00        | PADDING              | Section 19.1  | IH01 | NP   |
+-------------+----------------------+---------------+------+------+
| 0x01        | PING                 | Section 19.2  | IH01 |      |
+-------------+----------------------+---------------+------+------+
| 0x02 - 0x03 | ACK                  | Section 19.3  | IH_1 | NC   |
+-------------+----------------------+---------------+------+------+
| 0x04        | RESET_STREAM         | Section 19.4  | __01 |      |
+-------------+----------------------+---------------+------+------+
| 0x05        | STOP_SENDING         | Section 19.5  | __01 |      |
+-------------+----------------------+---------------+------+------+
| 0x06        | CRYPTO               | Section 19.6  | IH_1 |      |
+-------------+----------------------+---------------+------+------+
| 0x07        | NEW_TOKEN            | Section 19.7  | ___1 |      |
+-------------+----------------------+---------------+------+------+
| 0x08 - 0x0f | STREAM               | Section 19.8  | __01 | F    |
+-------------+----------------------+---------------+------+------+
| 0x10        | MAX_DATA             | Section 19.9  | __01 |      |
+-------------+----------------------+---------------+------+------+
| 0x11        | MAX_STREAM_DATA      | Section 19.10 | __01 |      |
+-------------+----------------------+---------------+------+------+
| 0x12 - 0x13 | MAX_STREAMS          | Section 19.11 | __01 |      |
+-------------+----------------------+---------------+------+------+
| 0x14        | DATA_BLOCKED         | Section 19.12 | __01 |      |
+-------------+----------------------+---------------+------+------+
| 0x15        | STREAM_DATA_BLOCKED  | Section 19.13 | __01 |      |
+-------------+----------------------+---------------+------+------+
| 0x16 - 0x17 | STREAMS_BLOCKED      | Section 19.14 | __01 |      |
+-------------+----------------------+---------------+------+------+
| 0x18        | NEW_CONNECTION_ID    | Section 19.15 | __01 | P    |
+-------------+----------------------+---------------+------+------+
| 0x19        | RETIRE_CONNECTION_ID | Section 19.16 | __01 |      |
+-------------+----------------------+---------------+------+------+
| 0x1a        | PATH_CHALLENGE       | Section 19.17 | __01 | P    |
+-------------+----------------------+---------------+------+------+
| 0x1b        | PATH_RESPONSE        | Section 19.18 | ___1 | P    |
+-------------+----------------------+---------------+------+------+
| 0x1c - 0x1d | CONNECTION_CLOSE     | Section 19.19 | ih01 | N    |
+-------------+----------------------+---------------+------+------+
| 0x1e        | HANDSHAKE_DONE       | Section 19.20 | ___1 |      |
+-------------+----------------------+---------------+------+------+

                       Table 3: Frame Types

The format and semantics of each frame type are explained in more detail in Section 19. The remainder of this section provides a summary of important and general information.

The Frame Type in ACK, STREAM, MAX_STREAMS, STREAMS_BLOCKED, and CONNECTION_CLOSE frames is used to carry other frame-specific flags. For all other frames, the Frame Type field simply identifies the frame.

The "Pkts" column in Table 3 lists the types of packets that each frame type could appear in, indicated by the following characters:

I:  Initial (Section 17.2.2)

H:  Handshake (Section 17.2.4)

0:  0-RTT (Section 17.2.3)

1:  1-RTT (Section 17.3.1)

ih:  Only a CONNECTION_CLOSE frame of type 0x1c can appear in Initial or Handshake packets.

For more detail about these restrictions, see Section 12.5. Note that all frames can appear in 1-RTT packets. An endpoint MUST treat receipt of a frame in a packet type that is not permitted as a connection error of type PROTOCOL_VIOLATION.
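
As an illustration only, the "Pkts" column of Table 3 can be held as a lookup from frame type to the set of packet types that may carry it. The sketch below is not part of this specification; the names PKTS and frame_permitted are hypothetical, and packet types are written with the same single characters used in the table (I, H, 0, 1). The "ih" entry is expanded so that type 0x1c is allowed in Initial and Handshake packets and type 0x1d is not.

```python
# Hypothetical helper mirroring the "Pkts" column of Table 3.
# Each row is (first type, last type, permitted packet-type characters).
_ROWS = [
    (0x00, 0x00, "IH01"),  # PADDING
    (0x01, 0x01, "IH01"),  # PING
    (0x02, 0x03, "IH1"),   # ACK
    (0x04, 0x04, "01"),    # RESET_STREAM
    (0x05, 0x05, "01"),    # STOP_SENDING
    (0x06, 0x06, "IH1"),   # CRYPTO
    (0x07, 0x07, "1"),     # NEW_TOKEN
    (0x08, 0x0F, "01"),    # STREAM
    (0x10, 0x10, "01"),    # MAX_DATA
    (0x11, 0x11, "01"),    # MAX_STREAM_DATA
    (0x12, 0x13, "01"),    # MAX_STREAMS
    (0x14, 0x14, "01"),    # DATA_BLOCKED
    (0x15, 0x15, "01"),    # STREAM_DATA_BLOCKED
    (0x16, 0x17, "01"),    # STREAMS_BLOCKED
    (0x18, 0x18, "01"),    # NEW_CONNECTION_ID
    (0x19, 0x19, "01"),    # RETIRE_CONNECTION_ID
    (0x1A, 0x1A, "01"),    # PATH_CHALLENGE
    (0x1B, 0x1B, "1"),     # PATH_RESPONSE
    (0x1C, 0x1C, "IH01"),  # CONNECTION_CLOSE, QUIC-layer error
    (0x1D, 0x1D, "01"),    # CONNECTION_CLOSE, application error
    (0x1E, 0x1E, "1"),     # HANDSHAKE_DONE
]

# Expand the ranges into one dictionary entry per frame type value.
PKTS = {t: pkts for lo, hi, pkts in _ROWS for t in range(lo, hi + 1)}


def frame_permitted(frame_type: int, packet_type: str) -> bool:
    """Return True if Table 3 allows frame_type in packet_type.

    packet_type is one of "I", "H", "0", "1". A False result corresponds
    to the PROTOCOL_VIOLATION case described in the text.
    """
    if frame_type not in PKTS:
        # Frame types outside Table 3 are not covered by this sketch.
        raise ValueError("frame type not in Table 3")
    return packet_type in PKTS[frame_type]
```

A receiver could call frame_permitted(0x06, "0") and, on a False result, raise the connection error; the sketch itself only answers the lookup.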

The "Spec" column in Table 3 summarizes any special rules governing the processing or generation of the frame type, as indicated by the following characters:

N:  Packets containing only frames with this marking are not ack-eliciting; see Section 13.2.

C:  Packets containing only frames with this marking do not count toward bytes in flight for congestion control purposes; see [QUIC-RECOVERY].

P:  Packets containing only frames with this marking can be used to probe new network paths during connection migration; see Section 9.1.

F:  The content of frames with this marking is flow controlled; see Section 4.

The "Pkts" and "Spec" columns in Table 3 do not form part of the IANA registry; see Section 22.4.

An endpoint MUST treat the receipt of a frame of unknown type as a connection error of type FRAME_ENCODING_ERROR.

All frames are idempotent in this version of QUIC. That is, a valid frame does not cause undesirable side effects or errors when received more than once.

The Frame Type field uses a variable-length integer encoding (see Section 16) with one exception. To ensure simple and efficient implementations of frame parsing, a frame type MUST use the shortest possible encoding. For frame types defined in this document, this means a single-byte encoding, even though it is possible to encode these values as a two-, four- or eight-byte variable-length integer. For instance, though 0x4001 is a legitimate two-byte encoding for a variable-length integer with a value of 1, PING frames are always encoded as a single byte with the value 0x01. This rule applies to all current and future QUIC frame types. An endpoint MAY treat the receipt of a frame type that uses a longer encoding than necessary as a connection error of type PROTOCOL_VIOLATION.

12.5. Frames and Number Spaces

Some frames are prohibited in different packet number spaces. The rules here generalize those of TLS, in that frames associated with establishing the connection can usually appear in packets in any packet number space, whereas those associated with transferring data can only appear in the application data packet number space:

*  PADDING, PING, and CRYPTO frames MAY appear in any packet number space.

*  CONNECTION_CLOSE frames signaling errors at the QUIC layer (type 0x1c) MAY appear in any packet number space. CONNECTION_CLOSE frames signaling application errors (type 0x1d) MUST only appear in the application data packet number space.

*  ACK frames MAY appear in any packet number space, but can only acknowledge packets that appeared in that packet number space. However, as noted below, 0-RTT packets cannot contain ACK frames.

*  All other frame types MUST only be sent in the application data packet number space.

Note that it is not possible to send the following frames in 0-RTT packets for various reasons: ACK, CRYPTO, HANDSHAKE_DONE, NEW_TOKEN, PATH_RESPONSE, and RETIRE_CONNECTION_ID. A server MAY treat receipt of these frames in 0-RTT packets as a connection error of type PROTOCOL_VIOLATION.

13. Packetization and Reliability

A sender sends one or more frames in a QUIC packet; see Section 12.4.

A sender can minimize per-packet bandwidth and computational costs by including as many frames as possible in each QUIC packet. A sender MAY wait for a short period of time to collect multiple frames before sending a packet that is not maximally packed, to avoid sending out large numbers of small packets. An implementation MAY use knowledge about application sending behavior or heuristics to determine whether and for how long to wait.
This waiting period is an implementation decision, and an implementation should be careful to delay conservatively, since any delay is likely to increase application-visible latency.

Stream multiplexing is achieved by interleaving STREAM frames from multiple streams into one or more QUIC packets. A single QUIC packet can include multiple STREAM frames from one or more streams.

One of the benefits of QUIC is avoidance of head-of-line blocking across multiple streams. When a packet loss occurs, only streams with data in that packet are blocked waiting for a retransmission to be received, while other streams can continue making progress. Note that when data from multiple streams is included in a single QUIC packet, loss of that packet blocks all those streams from making progress. Implementations are advised to include as few streams as necessary in outgoing packets without losing transmission efficiency to underfilled packets.

13.1. Packet Processing

A packet MUST NOT be acknowledged until packet protection has been successfully removed and all frames contained in the packet have been processed. For STREAM frames, this means the data has been enqueued in preparation to be received by the application protocol, but it does not require that data is delivered and consumed.

Once the packet has been fully processed, a receiver acknowledges receipt by sending one or more ACK frames containing the packet number of the received packet.

An endpoint SHOULD treat receipt of an acknowledgment for a packet it did not send as a connection error of type PROTOCOL_VIOLATION, if it is able to detect the condition. Further discussion of how this might be achieved is in Section 21.4.

13.2. Generating Acknowledgments

Endpoints acknowledge all packets they receive and process.
However, only ack-eliciting packets cause an ACK frame to be sent within the maximum ack delay. Packets that are not ack-eliciting are only acknowledged when an ACK frame is sent for other reasons.

When sending a packet for any reason, an endpoint SHOULD attempt to include an ACK frame if one has not been sent recently. Doing so helps with timely loss detection at the peer.

In general, frequent feedback from a receiver improves loss and congestion response, but this has to be balanced against excessive load generated by a receiver that sends an ACK frame in response to every ack-eliciting packet. The guidance offered below seeks to strike this balance.

13.2.1. Sending ACK Frames

Every packet SHOULD be acknowledged at least once, and ack-eliciting packets MUST be acknowledged at least once within the maximum delay an endpoint communicated using the max_ack_delay transport parameter; see Section 18.2. max_ack_delay declares an explicit contract: an endpoint promises to never intentionally delay acknowledgments of an ack-eliciting packet by more than the indicated value. If it does, any excess accrues to the RTT estimate and could result in spurious or delayed retransmissions from the peer. A sender uses the receiver's max_ack_delay value in determining timeouts for timer-based retransmission, as detailed in Section 6.2 of [QUIC-RECOVERY].

An endpoint MUST acknowledge all ack-eliciting Initial and Handshake packets immediately and all ack-eliciting 0-RTT and 1-RTT packets within its advertised max_ack_delay, with the following exception. Prior to handshake confirmation, an endpoint might not have packet protection keys for decrypting Handshake, 0-RTT, or 1-RTT packets when they are received. It might therefore buffer them and acknowledge them when the requisite keys become available.
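
The timing rule above can be sketched as a small function that returns the latest time at which an acknowledgment should go out. This is a minimal illustration, not a normative algorithm: the function name ack_deadline_ms, the space labels, and the use of integer milliseconds are assumptions made for the example, and the exception for packets buffered while keys are unavailable is not modeled.

```python
from typing import Optional

# Packet number spaces whose ack-eliciting packets are acknowledged
# immediately (hypothetical labels chosen for this sketch).
IMMEDIATE_SPACES = {"initial", "handshake"}


def ack_deadline_ms(space: str, ack_eliciting: bool,
                    received_at_ms: int, max_ack_delay_ms: int) -> Optional[int]:
    """Return the latest send time for an ACK covering this packet.

    Returns None for packets that are not ack-eliciting, since those are
    only acknowledged when an ACK frame is sent for other reasons.
    """
    if not ack_eliciting:
        return None
    if space in IMMEDIATE_SPACES:
        # Initial and Handshake packets: acknowledge without delay.
        return received_at_ms
    # 0-RTT and 1-RTT packets: within the advertised max_ack_delay.
    return received_at_ms + max_ack_delay_ms
```

For example, with an advertised max_ack_delay of 25 milliseconds, a 1-RTT packet received at time 1000 has a deadline of 1025, while a Handshake packet received at the same time has a deadline of 1000.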

Since packets containing only ACK frames are not congestion controlled, an endpoint MUST NOT send more than one such packet in response to receiving an ack-eliciting packet.

An endpoint MUST NOT send a non-ack-eliciting packet in response to a non-ack-eliciting packet, even if there are packet gaps that precede the received packet. This avoids an infinite feedback loop of acknowledgments, which could prevent the connection from ever becoming idle. Non-ack-eliciting packets are eventually acknowledged when the endpoint sends an ACK frame in response to other events.

In order to assist loss detection at the sender, an endpoint SHOULD generate and send an ACK frame without delay when it receives an ack-eliciting packet either:

*  when the received packet has a packet number less than another ack-eliciting packet that has been received, or

*  when the packet has a packet number larger than the highest-numbered ack-eliciting packet that has been received and there are missing packets between that packet and this packet.

Similarly, packets marked with the ECN Congestion Experienced (CE) codepoint in the IP header SHOULD be acknowledged immediately, to reduce the peer's response time to congestion events.

The algorithms in [QUIC-RECOVERY] are expected to be resilient to receivers that do not follow the guidance offered above. However, an implementation should only deviate from these requirements after careful consideration of the performance implications of a change, for connections made by the endpoint and for other users of the network.

An endpoint that is only sending ACK frames will not receive acknowledgments from its peer unless those acknowledgments are included in packets with ack-eliciting frames. An endpoint SHOULD send an ACK frame with other frames when there are new ack-eliciting packets to acknowledge.
When only non-ack-eliciting packets need to be acknowledged, an endpoint MAY wait until an ack-eliciting packet has been received to include an ACK frame with outgoing frames.

A receiver MUST NOT send an ack-eliciting frame in all packets that would otherwise be non-ack-eliciting, to avoid an infinite feedback loop of acknowledgments.

13.2.2. Acknowledgment Frequency

A receiver determines how frequently to send acknowledgments in response to ack-eliciting packets. This determination involves a trade-off.

Endpoints rely on timely acknowledgment to detect loss; see Section 6 of [QUIC-RECOVERY]. Window-based congestion controllers, such as the one in Section 7 of [QUIC-RECOVERY], rely on acknowledgments to manage their congestion window. In both cases, delaying acknowledgments can adversely affect performance.

On the other hand, reducing the frequency of packets that carry only acknowledgments reduces packet transmission and processing cost at both endpoints. It can improve connection throughput on severely asymmetric links and reduce the volume of acknowledgment traffic using return path capacity; see Section 3 of [RFC3449].

A receiver SHOULD send an ACK frame after receiving at least two ack-eliciting packets. This recommendation is general in nature and consistent with recommendations for TCP endpoint behavior [RFC5681]. Knowledge of network conditions, knowledge of the peer's congestion controller, or further research and experimentation might suggest alternative acknowledgment strategies with better performance characteristics.

A receiver MAY process multiple available packets before determining whether to send an ACK frame in response.

13.2.3. Managing ACK Ranges

When an ACK frame is sent, one or more ranges of acknowledged packets are included.
Including acknowledgments for older packets reduces the chance of spurious retransmissions caused by losing previously sent ACK frames, at the cost of larger ACK frames.

ACK frames SHOULD always acknowledge the most recently received packets, and the more out-of-order the packets are, the more important it is to send an updated ACK frame quickly, to prevent the peer from declaring a packet as lost and spuriously retransmitting the frames it contains. An ACK frame is expected to fit within a single QUIC packet. If it does not, then older ranges (those with the smallest packet numbers) are omitted.

A receiver limits the number of ACK Ranges (Section 19.3.1) it remembers and sends in ACK frames, both to limit the size of ACK frames and to avoid resource exhaustion. After receiving acknowledgments for an ACK frame, the receiver SHOULD stop tracking those acknowledged ACK Ranges. Senders can expect acknowledgments for most packets, but QUIC does not guarantee receipt of an acknowledgment for every packet that the receiver processes.

It is possible that retaining many ACK Ranges could cause an ACK frame to become too large. A receiver can discard unacknowledged ACK Ranges to limit ACK frame size, at the cost of increased retransmissions from the sender. This is necessary if an ACK frame would be too large to fit in a packet. Receivers MAY also limit ACK frame size further to preserve space for other frames or to limit the capacity that acknowledgments consume.

A receiver MUST retain an ACK Range unless it can ensure that it will not subsequently accept packets with numbers in that range. Maintaining a minimum packet number that increases as ranges are discarded is one way to achieve this with minimal state.
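
One possible shape for the "minimum packet number" approach is sketched below. It is an illustration under stated assumptions rather than a recommended design: the class name ReceivedPackets, the max_ranges limit, and the list-of-pairs representation are all hypothetical. The tracker keeps at most max_ranges ranges; when it drops the oldest range it raises the minimum so that packets in the dropped range are never accepted again.

```python
class ReceivedPackets:
    """Tracks received packet numbers for one packet number space."""

    def __init__(self, max_ranges: int = 32):
        self.max_ranges = max_ranges
        self.ranges = []          # ascending list of [low, high] pairs
        self.min_acceptable = 0   # packets below this are always dropped

    def on_packet(self, pn: int) -> bool:
        """Record pn; return False if the packet has to be discarded."""
        if pn < self.min_acceptable:
            return False          # range was discarded, cannot rule out a duplicate
        if any(low <= pn <= high for low, high in self.ranges):
            return False          # duplicate of a tracked packet
        self.ranges.append([pn, pn])
        self._merge()
        # Enforce the range limit by discarding the oldest ranges and
        # advancing the minimum past everything that was discarded.
        while len(self.ranges) > self.max_ranges:
            _, high = self.ranges.pop(0)
            self.min_acceptable = max(self.min_acceptable, high + 1)
        return True

    def _merge(self) -> None:
        """Sort ranges and join those that overlap or are adjacent."""
        self.ranges.sort()
        merged = [self.ranges[0]]
        for low, high in self.ranges[1:]:
            if low <= merged[-1][1] + 1:
                merged[-1][1] = max(merged[-1][1], high)
            else:
                merged.append([low, high])
        self.ranges = merged
```

With max_ranges set to 2, receiving packets 0, 1, 5, and 9 leaves the ranges 5..5 and 9..9 and a minimum of 2, so a late copy of packet 1 is rejected without any per-packet state for it.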

Receivers can discard all ACK Ranges, but they MUST retain the largest packet number that has been successfully processed as that is used to recover packet numbers from subsequent packets; see Section 17.1.

A receiver SHOULD include an ACK Range containing the largest received packet number in every ACK frame. The Largest Acknowledged field is used in ECN validation at a sender and including a lower value than what was included in a previous ACK frame could cause ECN to be unnecessarily disabled; see Section 13.4.2.

Section 13.2.4 describes an exemplary approach for determining what packets to acknowledge in each ACK frame. Though the goal of this algorithm is to generate an acknowledgment for every packet that is processed, it is still possible for acknowledgments to be lost.

13.2.4. Limiting Ranges by Tracking ACK Frames

When a packet containing an ACK frame is sent, the largest acknowledged in that frame can be saved. When a packet containing an ACK frame is acknowledged, the receiver can stop acknowledging packets less than or equal to the largest acknowledged in the sent ACK frame.

A receiver that sends only non-ack-eliciting packets, such as ACK frames, might not receive an acknowledgment for a long period of time. This could cause the receiver to maintain state for a large number of ACK frames for a long period of time, and ACK frames it sends could be unnecessarily large. In such a case, a receiver could send a PING or other small ack-eliciting frame occasionally, such as once per round trip, to elicit an ACK from the peer.

In cases without ACK frame loss, this algorithm allows for a minimum of 1 RTT of reordering. In cases with ACK frame loss and reordering, this approach does not guarantee that every acknowledgment is seen by the sender before it is no longer included in the ACK frame.
Packets could be received out of order and all subsequent ACK frames containing them could be lost. In this case, the loss recovery algorithm could cause spurious retransmissions, but the sender will continue making forward progress.

13.2.5. Measuring and Reporting Host Delay

An endpoint measures the delays intentionally introduced between the time the packet with the largest packet number is received and the time an acknowledgment is sent. The endpoint encodes this acknowledgment delay in the ACK Delay field of an ACK frame; see Section 19.3. This allows the receiver of the ACK frame to adjust for any intentional delays, which is important for getting a better estimate of the path RTT when acknowledgments are delayed.

A packet might be held in the OS kernel or elsewhere on the host before being processed. An endpoint MUST NOT include delays that it does not control when populating the ACK Delay field in an ACK frame. However, endpoints SHOULD include buffering delays caused by unavailability of decryption keys, since these delays can be large and are likely to be non-repeating.

When the measured acknowledgment delay is larger than its max_ack_delay, an endpoint SHOULD report the measured delay. This information is especially useful during the handshake when delays might be large; see Section 13.2.1.

13.2.6. ACK Frames and Packet Protection

ACK frames MUST only be carried in a packet that has the same packet number space as the packet being acknowledged; see Section 12.1. For instance, packets that are protected with 1-RTT keys MUST be acknowledged in packets that are also protected with 1-RTT keys.

Packets that a client sends with 0-RTT packet protection MUST be acknowledged by the server in packets protected by 1-RTT keys.
This can mean that the client is unable to use these acknowledgments if the server cryptographic handshake messages are delayed or lost. Note that the same limitation applies to other data sent by the server protected by the 1-RTT keys.

13.2.7. PADDING Frames Consume Congestion Window

Packets containing PADDING frames are considered to be in flight for congestion control purposes [QUIC-RECOVERY]. Packets containing only PADDING frames therefore consume congestion window but do not generate acknowledgments that will open the congestion window. To avoid a deadlock, a sender SHOULD ensure that other frames are sent periodically in addition to PADDING frames to elicit acknowledgments from the receiver.

13.3. Retransmission of Information

QUIC packets that are determined to be lost are not retransmitted whole. The same applies to the frames that are contained within lost packets. Instead, the information that might be carried in frames is sent again in new frames as needed.

New frames and packets are used to carry information that is determined to have been lost. In general, information is sent again when a packet containing that information is determined to be lost and sending ceases when a packet containing that information is acknowledged.

*  Data sent in CRYPTO frames is retransmitted according to the rules in [QUIC-RECOVERY], until all data has been acknowledged. Data in CRYPTO frames for Initial and Handshake packets is discarded when keys for the corresponding packet number space are discarded.

*  Application data sent in STREAM frames is retransmitted in new STREAM frames unless the endpoint has sent a RESET_STREAM for that stream. Once an endpoint sends a RESET_STREAM frame, no further STREAM frames are needed.

*  ACK frames carry the most recent set of acknowledgments and the acknowledgment delay from the largest acknowledged packet, as described in Section 13.2.1. Delaying the transmission of packets containing ACK frames or resending old ACK frames can cause the peer to generate an inflated RTT sample or unnecessarily disable ECN.

*  Cancellation of stream transmission, as carried in a RESET_STREAM frame, is sent until acknowledged or until all stream data is acknowledged by the peer (that is, either the "Reset Recvd" or "Data Recvd" state is reached on the sending part of the stream). The content of a RESET_STREAM frame MUST NOT change when it is sent again.

*  Similarly, a request to cancel stream transmission, as encoded in a STOP_SENDING frame, is sent until the receiving part of the stream enters either a "Data Recvd" or "Reset Recvd" state; see Section 3.5.

*  Connection close signals, including packets that contain CONNECTION_CLOSE frames, are not sent again when packet loss is detected, but as described in Section 10.

*  The current connection maximum data is sent in MAX_DATA frames. An updated value is sent in a MAX_DATA frame if the packet containing the most recently sent MAX_DATA frame is declared lost, or when the endpoint decides to update the limit. Care is necessary to avoid sending this frame too often as the limit can increase frequently and cause an unnecessarily large number of MAX_DATA frames to be sent; see Section 4.2.

*  The current maximum stream data offset is sent in MAX_STREAM_DATA frames. Like MAX_DATA, an updated value is sent when the packet containing the most recent MAX_STREAM_DATA frame for a stream is lost or when the limit is updated, with care taken to prevent the frame from being sent too often.
An endpoint SHOULD stop sending 4421 MAX_STREAM_DATA frames when the receiving part of the stream 4422 enters a "Size Known" or "Reset Recvd" state. 4424 * The limit on streams of a given type is sent in MAX_STREAMS 4425 frames. Like MAX_DATA, an updated value is sent when a packet 4426 containing the most recent MAX_STREAMS frame for a stream type is 4427 declared lost or when the limit is updated, with care taken to 4428 prevent the frame from being sent too often. 4430 * Blocked signals are carried in DATA_BLOCKED, STREAM_DATA_BLOCKED, 4431 and STREAMS_BLOCKED frames. DATA_BLOCKED frames have connection 4432 scope, STREAM_DATA_BLOCKED frames have stream scope, and 4433 STREAMS_BLOCKED frames are scoped to a specific stream type. New 4434 frames are sent if the packet containing the most recent frame for a 4435 scope is lost, but only while the endpoint is blocked on the 4436 corresponding limit. These frames always include the limit that 4437 is causing blocking at the time that they are transmitted. 4439 * A liveness or path validation check using PATH_CHALLENGE frames is 4440 sent periodically until a matching PATH_RESPONSE frame is received 4441 or until there is no remaining need for liveness or path 4442 validation checking. PATH_CHALLENGE frames include a different 4443 payload each time they are sent. 4445 * Responses to path validation using PATH_RESPONSE frames are sent 4446 just once. The peer is expected to send more PATH_CHALLENGE 4447 frames as necessary to evoke additional PATH_RESPONSE frames. 4449 * New connection IDs are sent in NEW_CONNECTION_ID frames and 4450 retransmitted if the packet containing them is lost. 4451 Retransmissions of this frame carry the same sequence number 4452 value. Likewise, retired connection IDs are sent in 4453 RETIRE_CONNECTION_ID frames and retransmitted if the packet 4454 containing them is lost. 4456 * NEW_TOKEN frames are retransmitted if the packet containing them 4457 is lost.
No special support is made for detecting reordered and 4458 duplicated NEW_TOKEN frames other than a direct comparison of the 4459 frame contents. 4461 * PING and PADDING frames contain no information, so lost PING or 4462 PADDING frames do not require repair. 4464 * The HANDSHAKE_DONE frame MUST be retransmitted until it is 4465 acknowledged. 4467 Endpoints SHOULD prioritize retransmission of data over sending new 4468 data, unless priorities specified by the application indicate 4469 otherwise; see Section 2.3. 4471 Even though a sender is encouraged to assemble frames containing up- 4472 to-date information every time it sends a packet, it is not forbidden 4473 to retransmit copies of frames from lost packets. A sender that 4474 retransmits copies of frames needs to handle decreases in available 4475 payload size due to change in packet number length, connection ID 4476 length, and path MTU. A receiver MUST accept packets containing an 4477 outdated frame, such as a MAX_DATA frame carrying a smaller maximum 4478 data than one found in an older packet. 4480 A sender SHOULD avoid retransmitting information from packets once 4481 they are acknowledged. This includes packets that are acknowledged 4482 after being declared lost, which can happen in the presence of 4483 network reordering. Doing so requires senders to retain information 4484 about packets after they are declared lost. A sender can discard 4485 this information after a period of time elapses that adequately 4486 allows for reordering, such as a PTO (Section 6.2 of 4487 [QUIC-RECOVERY]), or on other events, such as reaching a memory 4488 limit. 4490 Upon detecting losses, a sender MUST take appropriate congestion 4491 control action. The details of loss detection and congestion control 4492 are described in [QUIC-RECOVERY]. 4494 13.4. Explicit Congestion Notification 4496 QUIC endpoints can use Explicit Congestion Notification (ECN) 4497 [RFC3168] to detect and respond to network congestion. 
ECN allows an 4498 endpoint to set an ECT codepoint in the ECN field of an IP packet. A 4499 network node can then indicate congestion by setting the CE codepoint 4500 in the ECN field instead of dropping the packet [RFC8087]. Endpoints 4501 react to reported congestion by reducing their sending rate in 4502 response, as described in [QUIC-RECOVERY]. 4504 To enable ECN, a sending QUIC endpoint first determines whether a 4505 path supports ECN marking and whether the peer reports the ECN values 4506 in received IP headers; see Section 13.4.2. 4508 13.4.1. Reporting ECN Counts 4510 Use of ECN requires the receiving endpoint to read the ECN field from 4511 an IP packet, which is not possible on all platforms. If an endpoint 4512 does not implement ECN support or does not have access to received 4513 ECN fields, it does not report ECN counts for packets it receives. 4515 Even if an endpoint does not set an ECT field on packets it sends, 4516 the endpoint MUST provide feedback about ECN markings it receives, if 4517 these are accessible. Failing to report the ECN counts will cause 4518 the sender to disable use of ECN for this connection. 4520 On receiving an IP packet with an ECT(0), ECT(1) or CE codepoint, an 4521 ECN-enabled endpoint accesses the ECN field and increases the 4522 corresponding ECT(0), ECT(1), or CE count. These ECN counts are 4523 included in subsequent ACK frames; see Section 13.2 and Section 19.3. 4525 Each packet number space maintains separate acknowledgment state and 4526 separate ECN counts. Coalesced QUIC packets (see Section 12.2) share 4527 the same IP header so the ECN counts are incremented once for each 4528 coalesced QUIC packet. 4530 For example, if one each of an Initial, Handshake, and 1-RTT QUIC 4531 packet are coalesced into a single UDP datagram, the ECN counts for 4532 all three packet number spaces will be incremented by one each, based 4533 on the ECN field of the single IP header. 
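The counting rule for coalesced packets can be illustrated with a minimal sketch. The class name EcnCounts, the method on_datagram, and the string labels for the packet number spaces are hypothetical and not part of this document; the codepoint values are those of the ECN field in [RFC3168].

```python
from collections import defaultdict

# ECN field codepoints (two low-order bits of the IP TOS / Traffic Class byte).
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


class EcnCounts:
    """Separate ECN counters for each packet number space (illustrative only)."""

    def __init__(self):
        # packet number space -> {codepoint: count}
        self.counts = defaultdict(lambda: {ECT0: 0, ECT1: 0, CE: 0})

    def on_datagram(self, ecn_field, spaces):
        """Count one processed QUIC packet per entry in `spaces`.

        `ecn_field` is the codepoint read from the single IP header and
        `spaces` names the packet number space of each coalesced QUIC
        packet that was processed from that datagram.
        """
        if ecn_field == NOT_ECT:
            return  # Not-ECT packets are not counted
        for space in spaces:
            self.counts[space][ecn_field] += 1


# One Initial, one Handshake, and one 1-RTT packet in a CE-marked datagram.
counts = EcnCounts()
counts.on_datagram(CE, ["initial", "handshake", "application"])
```

After this call, the CE count of each of the three packet number spaces is 1, all derived from the ECN field of one IP header.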
4535 ECN counts are only incremented when QUIC packets from the received 4536 IP packet are processed. As such, duplicate QUIC packets are not 4537 processed and do not increase ECN counts; see Section 21.10 for 4538 relevant security concerns. 4540 13.4.2. ECN Validation 4542 It is possible for faulty network devices to corrupt or erroneously 4543 drop packets that carry a non-zero ECN codepoint. To ensure 4544 connectivity in the presence of such devices, an endpoint validates 4545 the ECN counts for each network path and disables use of ECN on that 4546 path if errors are detected. 4548 To perform ECN validation for a new path: 4550 * The endpoint sets an ECT(0) codepoint in the IP header of early 4551 outgoing packets sent on a new path to the peer ([RFC8311]). 4553 * The endpoint monitors whether all packets sent with an ECT 4554 codepoint are eventually deemed lost (Section 6 of 4555 [QUIC-RECOVERY]), indicating that ECN validation has failed. 4557 If an endpoint has cause to expect that IP packets with an ECT 4558 codepoint might be dropped by a faulty network element, the endpoint 4559 could set an ECT codepoint for only the first ten outgoing packets on 4560 a path, or for a period of three PTOs (see Section 6.2 of 4561 [QUIC-RECOVERY]). If all packets marked with non-zero ECN codepoints 4562 are subsequently lost, it can disable marking on the assumption that 4563 the marking caused the loss. 4565 An endpoint thus attempts to use ECN and validates this for each new 4566 connection, when switching to a server's preferred address, and on 4567 active connection migration to a new path. Appendix A.4 describes 4568 one possible algorithm. 4570 Other methods of probing paths for ECN support are possible, as are 4571 different marking strategies. Implementations MAY use other methods 4572 defined in RFCs; see [RFC8311]. Implementations that use the ECT(1) 4573 codepoint need to perform ECN validation using the reported ECT(1) 4574 counts. 4576 13.4.2.1. 
Receiving ACK Frames with ECN Counts 4578 Erroneous application of CE markings by the network can result in 4579 degraded connection performance. An endpoint that receives an ACK 4580 frame with ECN counts therefore validates the counts before using 4581 them. It performs this validation by comparing newly received counts 4582 against those from the last successfully processed ACK frame. Any 4583 increase in the ECN counts is validated based on the ECN markings 4584 that were applied to packets that are newly acknowledged in the ACK 4585 frame. 4587 If an ACK frame newly acknowledges a packet that the endpoint sent 4588 with either the ECT(0) or ECT(1) codepoint set, ECN validation fails 4589 if the corresponding ECN counts are not present in the ACK frame. 4590 This check detects a network element that zeroes the ECN field or a 4591 peer that does not report ECN markings. 4593 ECN validation also fails if the sum of the increase in ECT(0) and 4594 ECN-CE counts is less than the number of newly acknowledged packets 4595 that were originally sent with an ECT(0) marking. Similarly, ECN 4596 validation fails if the sum of the increases to ECT(1) and ECN-CE 4597 counts is less than the number of newly acknowledged packets sent 4598 with an ECT(1) marking. These checks can detect remarking of ECN-CE 4599 markings by the network. 4601 An endpoint could miss acknowledgments for a packet when ACK frames 4602 are lost. It is therefore possible for the total increase in ECT(0), 4603 ECT(1), and ECN-CE counts to be greater than the number of packets 4604 that are newly acknowledged by an ACK frame. This is why ECN counts 4605 are permitted to be larger than the total number of packets that are 4606 acknowledged. 4608 Validating ECN counts from reordered ACK frames can result in 4609 failure. An endpoint MUST NOT fail ECN validation as a result of 4610 processing an ACK frame that does not increase the largest 4611 acknowledged packet number. 
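The checks described so far in this section might be combined as in the following sketch. It is an illustration under stated assumptions, not a normative algorithm: the names EcnCounts and ecn_validation_passes are hypothetical, and the caller is assumed to track how many newly acknowledged packets were sent with each ECT codepoint.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EcnCounts:
    ect0: int = 0
    ect1: int = 0
    ce: int = 0


def ecn_validation_passes(prev: EcnCounts,
                          new: Optional[EcnCounts],
                          newly_acked_ect0: int,
                          newly_acked_ect1: int,
                          increases_largest_acked: bool) -> bool:
    """Return False if this ACK frame causes ECN validation to fail.

    `prev` holds the counts from the last successfully processed ACK frame;
    `new` is None when the ACK frame carries no ECN counts.
    """
    # A reordered ACK frame must not cause validation to fail.
    if not increases_largest_acked:
        return True
    # ECT-marked packets were newly acknowledged but no counts were reported.
    if new is None:
        return newly_acked_ect0 == 0 and newly_acked_ect1 == 0
    ce_increase = new.ce - prev.ce
    # Increases may exceed the newly acknowledged count (lost ACK frames),
    # but must never fall short of it.
    if (new.ect0 - prev.ect0) + ce_increase < newly_acked_ect0:
        return False
    if (new.ect1 - prev.ect1) + ce_increase < newly_acked_ect1:
        return False
    return True
```

For instance, if three packets sent with ECT(0) are newly acknowledged and the counts rise by two ECT(0) and one ECN-CE, the check passes; if the counts rise by only one ECT(0) and one ECN-CE, it fails.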
4613 ECN validation can fail if the received total count for either ECT(0) 4614 or ECT(1) exceeds the total number of packets sent with each 4615 corresponding ECT codepoint. In particular, validation will fail 4616 when an endpoint receives a non-zero ECN count corresponding to an 4617 ECT codepoint that it never applied. This check detects when packets 4618 are remarked to ECT(0) or ECT(1) in the network. 4620 13.4.2.2. ECN Validation Outcomes 4622 If validation fails, then the endpoint MUST disable ECN. It stops 4623 setting the ECT codepoint in IP packets that it sends, assuming that 4624 either the network path or the peer does not support ECN. 4626 Even if validation fails, an endpoint MAY revalidate ECN for the same 4627 path at any later time in the connection. An endpoint could continue 4628 to periodically attempt validation. 4630 Upon successful validation, an endpoint MAY continue to set an ECT 4631 codepoint in subsequent packets it sends, with the expectation that 4632 the path is ECN-capable. Network routing and path elements can 4633 however change mid-connection; an endpoint MUST disable ECN if 4634 validation later fails. 4636 14. Datagram Size 4638 A UDP datagram can include one or more QUIC packets. The datagram 4639 size refers to the total UDP payload size of a single UDP datagram 4640 carrying QUIC packets. The datagram size includes one or more QUIC 4641 packet headers and protected payloads, but not the UDP or IP headers. 4643 The maximum datagram size is defined as the largest size of UDP 4644 payload that can be sent across a network path using a single UDP 4645 datagram. QUIC MUST NOT be used if the network path cannot support a 4646 maximum datagram size of at least 1200 bytes. 4648 QUIC assumes a minimum IP packet size of at least 1280 bytes. This 4649 is the IPv6 minimum size ([IPv6]) and is also supported by most 4650 modern IPv4 networks. 
Assuming the minimum IP header size of 40 4651 bytes for IPv6 and 20 bytes for IPv4 and a UDP header size of 8 4652 bytes, this results in a maximum datagram size of 1232 bytes for IPv6 4653 and 1252 bytes for IPv4. Thus, modern IPv4 and all IPv6 network 4654 paths are expected to be able to support QUIC. 4656 Note: This requirement to support a UDP payload of 1200 bytes limits 4657 the space available for IPv6 extension headers to 32 bytes or IPv4 4658 options to 52 bytes if the path only supports the IPv6 minimum MTU 4659 of 1280 bytes. This affects Initial packets and path validation. 4661 Any maximum datagram size larger than 1200 bytes can be discovered 4662 using Path Maximum Transmission Unit Discovery (PMTUD; see 4663 Section 14.2.1) or Datagram Packetization Layer PMTU Discovery 4664 (DPLPMTUD; see Section 14.3). 4666 Enforcement of the max_udp_payload_size transport parameter 4667 (Section 18.2) might act as an additional limit on the maximum 4668 datagram size. A sender can avoid exceeding this limit, once the 4669 value is known. However, prior to learning the value of the 4670 transport parameter, endpoints risk datagrams being lost if they send 4671 datagrams larger than the smallest allowed maximum datagram size of 4672 1200 bytes. 4674 UDP datagrams MUST NOT be fragmented at the IP layer. In IPv4 4675 ([IPv4]), the DF bit MUST be set if possible, to prevent 4676 fragmentation on the path. 4678 QUIC sometimes requires datagrams to be no smaller than a certain 4679 size; see Section 8.1 as an example. However, the size of a datagram 4680 is not authenticated. That is, if an endpoint receives a datagram of 4681 a certain size, it cannot know that the sender sent the datagram at 4682 the same size. Therefore, an endpoint MUST NOT close a connection 4683 when it receives a datagram that does not meet size constraints; the 4684 endpoint MAY however discard such datagrams. 4686 14.1. 
Initial Datagram Size 4688 A client MUST expand the payload of all UDP datagrams carrying 4689 Initial packets to at least the smallest allowed maximum datagram 4690 size of 1200 bytes by adding PADDING frames to the Initial packet or 4691 by coalescing the Initial packet; see Section 12.2. Initial packets 4692 can even be coalesced with invalid packets, which a receiver will 4693 discard. Similarly, a server MUST expand the payload of all UDP 4694 datagrams carrying ack-eliciting Initial packets to at least the 4695 smallest allowed maximum datagram size of 1200 bytes. 4697 Sending UDP datagrams of this size ensures that the network path 4698 supports a reasonable Path Maximum Transmission Unit (PMTU), in both 4699 directions. Additionally, a client that expands Initial packets 4700 helps reduce the amplitude of amplification attacks caused by server 4701 responses toward an unverified client address; see Section 8. 4703 Datagrams containing Initial packets MAY exceed 1200 bytes if the 4704 sender believes that the network path and peer both support the size 4705 that it chooses. 4707 A server MUST discard an Initial packet that is carried in a UDP 4708 datagram with a payload that is smaller than the smallest allowed 4709 maximum datagram size of 1200 bytes. A server MAY also immediately 4710 close the connection by sending a CONNECTION_CLOSE frame with an 4711 error code of PROTOCOL_VIOLATION; see Section 10.2.3. 4713 The server MUST also limit the number of bytes it sends before 4714 validating the address of the client; see Section 8. 4716 14.2. Path Maximum Transmission Unit 4718 The Path Maximum Transmission Unit (PMTU) is the maximum size of the 4719 entire IP packet including the IP header, UDP header, and UDP 4720 payload. The UDP payload includes one or more QUIC packet headers 4721 and protected payloads. The PMTU can depend on path characteristics, 4722 and can therefore change over time. 
The largest UDP payload an 4723 endpoint sends at any given time is referred to as the endpoint's 4724 maximum datagram size. 4726 An endpoint SHOULD use DPLPMTUD (Section 14.3) or PMTUD 4727 (Section 14.2.1) to determine whether the path to a destination will 4728 support a desired maximum datagram size without fragmentation. In 4729 the absence of these mechanisms, QUIC endpoints SHOULD NOT send 4730 datagrams larger than the smallest allowed maximum datagram size. 4732 Both DPLPMTUD and PMTUD send datagrams that are larger than the 4733 current maximum datagram size, referred to as PMTU probes. All QUIC 4734 packets that are not sent in a PMTU probe SHOULD be sized to fit 4735 within the maximum datagram size to avoid the datagram being 4736 fragmented or dropped ([RFC8085]). 4738 If a QUIC endpoint determines that the PMTU between any pair of local 4739 and remote IP addresses cannot support the smallest allowed maximum 4740 datagram size of 1200 bytes, it MUST immediately cease sending QUIC 4741 packets, except for those in PMTU probes or those containing 4742 CONNECTION_CLOSE frames, on the affected path. An endpoint MAY 4743 terminate the connection if an alternative path cannot be found. 4745 Each pair of local and remote addresses could have a different PMTU. 4746 QUIC implementations that implement any kind of PMTU discovery 4747 therefore SHOULD maintain a maximum datagram size for each 4748 combination of local and remote IP addresses. 4750 A QUIC implementation MAY be more conservative in computing the 4751 maximum datagram size to allow for unknown tunnel overheads or IP 4752 header options/extensions. 4754 14.2.1. Handling of ICMP Messages by PMTUD 4756 Path Maximum Transmission Unit Discovery (PMTUD; [RFC1191], 4757 [RFC8201]) relies on reception of ICMP messages (e.g., IPv6 Packet 4758 Too Big messages) that indicate when an IP packet is dropped because 4759 it is larger than the local router MTU. 
DPLPMTUD can also optionally 4760 use these messages. This use of ICMP messages is potentially 4761 vulnerable to attacks by entities that cannot observe packets but 4762 might successfully guess the addresses used on the path. These 4763 attacks could reduce the PMTU to a bandwidth-inefficient value. 4765 An endpoint MUST ignore an ICMP message that claims the PMTU has 4766 decreased below QUIC's smallest allowed maximum datagram size. 4768 The requirements for generating ICMP ([RFC1812], [RFC4443]) state 4769 that the quoted packet should contain as much of the original packet 4770 as possible without exceeding the minimum MTU for the IP version. 4771 The size of the quoted packet can actually be smaller, or the 4772 information unintelligible, as described in Section 1.1 of 4773 [DPLPMTUD]. 4775 QUIC endpoints using PMTUD SHOULD validate ICMP messages to protect 4776 from packet injection as specified in [RFC8201] and Section 5.2 of 4777 [RFC8085]. This validation SHOULD use the quoted packet supplied in 4778 the payload of an ICMP message to associate the message with a 4779 corresponding transport connection (see Section 4.6.1 of [DPLPMTUD]). 4780 ICMP message validation MUST include matching IP addresses and UDP 4781 ports ([RFC8085]) and, when possible, connection IDs to an active 4782 QUIC session. The endpoint SHOULD ignore all ICMP messages that fail 4783 validation. 4785 An endpoint MUST NOT increase PMTU based on ICMP messages; see 4786 Section 3, clause 6 of [DPLPMTUD]. Any reduction in QUIC's maximum 4787 datagram size in response to ICMP messages MAY be provisional until 4788 QUIC's loss detection algorithm determines that the quoted packet has 4789 actually been lost. 4791 14.3. Datagram Packetization Layer PMTU Discovery 4793 Datagram Packetization Layer PMTU Discovery (DPLPMTUD; [DPLPMTUD]) 4794 relies on tracking loss or acknowledgment of QUIC packets that are 4795 carried in PMTU probes. 
PMTU probes for DPLPMTUD that use the 4796 PADDING frame implement "Probing using padding data", as defined in 4797 Section 4.1 of [DPLPMTUD]. 4799 Endpoints SHOULD set the initial value of BASE_PLPMTU (Section 5.1 of 4800 [DPLPMTUD]) to be consistent with QUIC's smallest allowed maximum 4801 datagram size. The MIN_PLPMTU is the same as the BASE_PLPMTU. 4803 QUIC endpoints implementing DPLPMTUD maintain a DPLPMTUD Maximum 4804 Packet Size (MPS, Section 4.4 of [DPLPMTUD]) for each combination of 4805 local and remote IP addresses. This corresponds to the maximum 4806 datagram size. 4808 14.3.1. DPLPMTUD and Initial Connectivity 4810 From the perspective of DPLPMTUD, QUIC is an acknowledged 4811 Packetization Layer (PL). A QUIC sender can therefore enter the 4812 DPLPMTUD BASE state (Section 5.2 of [DPLPMTUD]) when the QUIC 4813 connection handshake has been completed. 4815 14.3.2. Validating the Network Path with DPLPMTUD 4817 QUIC is an acknowledged PL, therefore a QUIC sender does not 4818 implement a DPLPMTUD CONFIRMATION_TIMER while in the SEARCH_COMPLETE 4819 state; see Section 5.2 of [DPLPMTUD]. 4821 14.3.3. Handling of ICMP Messages by DPLPMTUD 4823 An endpoint using DPLPMTUD requires the validation of any received 4824 ICMP Packet Too Big (PTB) message before using the PTB information, 4825 as defined in Section 4.6 of [DPLPMTUD]. In addition to UDP port 4826 validation, QUIC validates an ICMP message by using other PL 4827 information (e.g., validation of connection IDs in the quoted packet 4828 of any received ICMP message). 4830 The considerations for processing ICMP messages described in 4831 Section 14.2.1 also apply if these messages are used by DPLPMTUD. 4833 14.4. Sending QUIC PMTU Probes 4835 PMTU probes are ack-eliciting packets. 4837 Endpoints could limit the content of PMTU probes to PING and PADDING 4838 frames, since packets that are larger than the current maximum 4839 datagram size are more likely to be dropped by the network. 
Loss of 4840 a QUIC packet that is carried in a PMTU probe is therefore not a 4841 reliable indication of congestion and SHOULD NOT trigger a congestion 4842 control reaction; see Section 3, Bullet 7 of [DPLPMTUD]. However, 4843 PMTU probes consume congestion window, which could delay subsequent 4844 transmission by an application. 4846 14.4.1. PMTU Probes Containing Source Connection ID 4848 Endpoints that rely on the destination connection ID for routing 4849 incoming QUIC packets are likely to require that the connection ID be 4850 included in PMTU probes to route any resulting ICMP messages 4851 (Section 14.2.1) back to the correct endpoint. However, only long 4852 header packets (Section 17.2) contain the Source Connection ID field, 4853 and long header packets are not decrypted or acknowledged by the peer 4854 once the handshake is complete. 4856 One way to construct a PMTU probe is to coalesce (see Section 12.2) a 4857 packet with a long header, such as a Handshake or 0-RTT packet 4858 (Section 17.2), with a short header packet in a single UDP datagram. 4859 If the resulting PMTU probe reaches the endpoint, the packet with the 4860 long header will be ignored, but the short header packet will be 4861 acknowledged. If the PMTU probe causes an ICMP message to be sent, 4862 the first part of the probe will be quoted in that message. If the 4863 Source Connection ID field is within the quoted portion of the probe, 4864 that could be used for routing or validation of the ICMP message. 4866 Note: The purpose of using a packet with a long header is only to 4867 ensure that the quoted packet contained in the ICMP message 4868 contains a Source Connection ID field. This packet does not need 4869 to be a valid packet and it can be sent even if there is no 4870 current use for packets of that type. 4872 15. Versions 4874 QUIC versions are identified using a 32-bit unsigned number. 4876 The version 0x00000000 is reserved to represent version negotiation. 
4877 This version of the specification is identified by the number 4878 0x00000001. 4880 Other versions of QUIC might have different properties from this 4881 version. The properties of QUIC that are guaranteed to be consistent 4882 across all versions of the protocol are described in 4883 [QUIC-INVARIANTS]. 4885 Version 0x00000001 of QUIC uses TLS as a cryptographic handshake 4886 protocol, as described in [QUIC-TLS]. 4888 Versions with the most significant 16 bits of the version number 4889 cleared are reserved for use in future IETF consensus documents. 4891 Versions that follow the pattern 0x?a?a?a?a are reserved for use in 4892 forcing version negotiation to be exercised; that is, any version 4893 number where the low four bits of every byte are 1010 (in binary). A 4894 client or server MAY advertise support for any of these reserved 4895 versions. 4897 Reserved version numbers will never represent a real protocol; a 4898 client MAY use one of these version numbers with the expectation that 4899 the server will initiate version negotiation; a server MAY advertise 4900 support for one of these versions and can expect that clients ignore 4901 the value. 4903 16. Variable-Length Integer Encoding 4905 QUIC packets and frames commonly use a variable-length encoding for 4906 non-negative integer values. This encoding ensures that smaller 4907 integer values need fewer bytes to encode. 4909 The QUIC variable-length integer encoding reserves the two most 4910 significant bits of the first byte to encode the base 2 logarithm of 4911 the integer encoding length in bytes. The integer value is encoded 4912 on the remaining bits, in network byte order. 4914 This means that integers are encoded on 1, 2, 4, or 8 bytes and can 4915 encode 6-, 14-, 30-, or 62-bit values respectively. Table 4 4916 summarizes the encoding properties.
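As an informal illustration of the encoding just described, the following sketch encodes a value on the fewest bytes that can hold it and decodes a value from the front of a buffer. The function names encode_varint and decode_varint are hypothetical, and choosing the shortest length is a choice made by this sketch.

```python
def encode_varint(value: int) -> bytes:
    """Encode `value` using the shortest of the four permitted lengths."""
    if not 0 <= value <= 2**62 - 1:
        raise ValueError("value does not fit in 62 bits")
    for prefix, length in ((0b00, 1), (0b01, 2), (0b10, 4), (0b11, 8)):
        usable_bits = 8 * length - 2
        if value < (1 << usable_bits):
            # The two most significant bits carry log2(length).
            return (value | (prefix << usable_bits)).to_bytes(length, "big")
    raise AssertionError("unreachable: range was checked above")


def decode_varint(data: bytes) -> tuple[int, int]:
    """Return (value, bytes consumed) for the integer at the start of `data`."""
    if not data:
        raise ValueError("empty input")
    length = 1 << (data[0] >> 6)  # 1, 2, 4, or 8 bytes
    if len(data) < length:
        raise ValueError("truncated input")
    mask = (1 << (8 * length - 2)) - 1  # clear the two length bits
    return int.from_bytes(data[:length], "big") & mask, length
```

With these definitions, encode_varint(15293) yields the two bytes 0x7bbd, and decode_varint on those bytes returns the value 15293 with a consumed length of 2.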
4918 +======+========+=============+=======================+ 4919 | 2Bit | Length | Usable Bits | Range | 4920 +======+========+=============+=======================+ 4921 | 00 | 1 | 6 | 0-63 | 4922 +------+--------+-------------+-----------------------+ 4923 | 01 | 2 | 14 | 0-16383 | 4924 +------+--------+-------------+-----------------------+ 4925 | 10 | 4 | 30 | 0-1073741823 | 4926 +------+--------+-------------+-----------------------+ 4927 | 11 | 8 | 62 | 0-4611686018427387903 | 4928 +------+--------+-------------+-----------------------+ 4930 Table 4: Summary of Integer Encodings 4932 Examples and a sample decoding algorithm are shown in Appendix A.1. 4934 Values do not need to be encoded on the minimum number of bytes 4935 necessary, with the sole exception of the Frame Type field; see 4936 Section 12.4. 4938 Versions (Section 15), packet numbers sent in the header 4939 (Section 17.1), and the length of connection IDs in long header 4940 packets (Section 17.2) are described using integers, but do not use 4941 this encoding. 4943 17. Packet Formats 4945 All numeric values are encoded in network byte order (that is, big- 4946 endian) and all field sizes are in bits. Hexadecimal notation is 4947 used for describing the value of fields. 4949 17.1. Packet Number Encoding and Decoding 4951 Packet numbers are integers in the range 0 to 2^62-1 (Section 12.3). 4952 When present in long or short packet headers, they are encoded in 1 4953 to 4 bytes. The number of bits required to represent the packet 4954 number is reduced by including only the least significant bits of the 4955 packet number. 4957 The encoded packet number is protected as described in Section 5.4 of 4958 [QUIC-TLS]. 4960 Prior to receiving an acknowledgment for a packet number space, the 4961 full packet number MUST be included; it is not to be truncated as 4962 described below. 
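The following sketch illustrates truncation to the least significant bytes, together with the restriction that applies before any acknowledgment has been received. The function name truncate_packet_number and the acked_in_space parameter are hypothetical, and the sketch does not choose the encoding length itself.

```python
def truncate_packet_number(full_pn: int, num_bytes: int,
                           acked_in_space: bool) -> bytes:
    """Return the `num_bytes` least significant bytes of `full_pn`.

    `acked_in_space` says whether any acknowledgment has been received
    for the packet number space; until then the encoding has to hold the
    entire packet number.
    """
    if not 1 <= num_bytes <= 4:
        raise ValueError("packet numbers are encoded in 1 to 4 bytes")
    if not 0 <= full_pn <= 2**62 - 1:
        raise ValueError("packet number out of range")
    if not acked_in_space and full_pn >> (8 * num_bytes):
        raise ValueError("full packet number required before first ACK")
    mask = (1 << (8 * num_bytes)) - 1
    return (full_pn & mask).to_bytes(num_bytes, "big")
```

For example, truncating packet number 0xac5c02 to two bytes after an acknowledgment has been received yields 0x5c02.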
4964 After an acknowledgment is received for a packet number space, the 4965 sender MUST use a packet number size able to represent more than 4966 twice as large a range as the difference between the largest 4967 acknowledged packet number and the packet number being sent. A peer receiving 4968 the packet will then correctly decode the packet number, unless the 4969 packet is delayed in transit such that it arrives after many higher- 4970 numbered packets have been received. An endpoint SHOULD use a large 4971 enough packet number encoding to allow the packet number to be 4972 recovered even if the packet arrives after packets that are sent 4973 afterwards. 4975 As a result, the size of the packet number encoding is at least one 4976 bit more than the base-2 logarithm of the number of contiguous 4977 unacknowledged packet numbers, including the new packet. Pseudocode 4978 and examples for packet number encoding can be found in Appendix A.2. 4980 At a receiver, protection of the packet number is removed prior to 4981 recovering the full packet number. The full packet number is then 4982 reconstructed based on the number of significant bits present, the 4983 value of those bits, and the largest packet number received in a 4984 successfully authenticated packet. Recovering the full packet number 4985 is necessary to successfully remove packet protection. 4987 Once header protection is removed, the packet number is decoded by 4988 finding the packet number value that is closest to the next expected 4989 packet. The next expected packet is the highest received packet 4990 number plus one. Pseudocode and an example for packet number 4991 decoding can be found in Appendix A.3. 4993 17.2.
Long Header Packets 4995 Long Header Packet { 4996 Header Form (1) = 1, 4997 Fixed Bit (1) = 1, 4998 Long Packet Type (2), 4999 Type-Specific Bits (4), 5000 Version (32), 5001 Destination Connection ID Length (8), 5002 Destination Connection ID (0..160), 5003 Source Connection ID Length (8), 5004 Source Connection ID (0..160), 5005 Type-Specific Payload (..), 5006 } 5008 Figure 13: Long Header Packet Format 5010 Long headers are used for packets that are sent prior to the 5011 establishment of 1-RTT keys. Once 1-RTT keys are available, a sender 5012 switches to sending packets using the short header (Section 17.3). 5013 The long form allows for special packets - such as the Version 5014 Negotiation packet - to be represented in this uniform fixed-length 5015 packet format. Packets that use the long header contain the 5016 following fields: 5018 Header Form: The most significant bit (0x80) of byte 0 (the first 5019 byte) is set to 1 for long headers. 5021 Fixed Bit: The next bit (0x40) of byte 0 is set to 1, unless the 5022 packet is a Version Negotiation packet. Packets containing a zero 5023 value for this bit are not valid packets in this version and MUST 5024 be discarded. A value of 1 for this bit allows QUIC to coexist 5025 with other protocols; see [RFC7983]. 5027 Long Packet Type: The next two bits (those with a mask of 0x30) of 5028 byte 0 contain a packet type. Packet types are listed in Table 5. 5030 Type-Specific Bits: The semantics of the lower four bits (those with 5031 a mask of 0x0f) of byte 0 are determined by the packet type. 5033 Version: The QUIC Version is a 32-bit field that follows the first 5034 byte. This field indicates the version of QUIC that is in use and 5035 determines how the rest of the protocol fields are interpreted. 5037 Destination Connection ID Length: The byte following the version 5038 contains the length in bytes of the Destination Connection ID 5039 field that follows it. 
This length is encoded as an 8-bit 5040 unsigned integer. In QUIC version 1, this value MUST NOT exceed 5041 20. Endpoints that receive a version 1 long header with a value 5042 larger than 20 MUST drop the packet. In order to properly form a 5043 Version Negotiation packet, servers SHOULD be able to read longer 5044 connection IDs from other QUIC versions. 5046 Destination Connection ID: The Destination Connection ID field 5047 follows the Destination Connection ID Length field, which 5048 indicates the length of this field. Section 7.2 describes the use 5049 of this field in more detail. 5051 Source Connection ID Length: The byte following the Destination 5052 Connection ID contains the length in bytes of the Source 5053 Connection ID field that follows it. This length is encoded as an 5054 8-bit unsigned integer. In QUIC version 1, this value MUST NOT 5055 exceed 20 bytes. Endpoints that receive a version 1 long header 5056 with a value larger than 20 MUST drop the packet. In order to 5057 properly form a Version Negotiation packet, servers SHOULD be able 5058 to read longer connection IDs from other QUIC versions. 5060 Source Connection ID: The Source Connection ID field follows the 5061 Source Connection ID Length field, which indicates the length of 5062 this field. Section 7.2 describes the use of this field in more 5063 detail. 5065 Type-Specific Payload: The remainder of the packet, if any, is type- 5066 specific.
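The field layout above can be read with a short parser such as the following sketch. The LongHeader class and the parse_long_header function are hypothetical names, and returning None stands in for dropping or discarding the packet; header protection and the type-specific fields are not handled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LongHeader:
    packet_type: int         # Long Packet Type, mask 0x30 of byte 0
    type_specific_bits: int  # mask 0x0f of byte 0
    version: int
    dcid: bytes
    scid: bytes
    payload: bytes           # Type-Specific Payload (may be empty)


def parse_long_header(packet: bytes) -> Optional[LongHeader]:
    """Parse the common long header fields; None means drop the packet."""
    # Shortest possible long header: byte 0, Version, two zero-length IDs.
    if len(packet) < 7 or not packet[0] & 0x80:
        return None
    first = packet[0]
    version = int.from_bytes(packet[1:5], "big")
    # In version 1 a zero Fixed Bit makes the packet invalid.
    if version == 1 and not first & 0x40:
        return None
    pos = 5
    ids = []
    for _ in range(2):  # Destination Connection ID, then Source Connection ID
        if pos >= len(packet):
            return None
        cid_len = packet[pos]
        pos += 1
        if version == 1 and cid_len > 20:
            return None
        if pos + cid_len > len(packet):
            return None
        ids.append(packet[pos:pos + cid_len])
        pos += cid_len
    return LongHeader((first & 0x30) >> 4, first & 0x0f, version,
                      ids[0], ids[1], packet[pos:])
```

A packet whose first byte is 0xc0 and whose Version field is 1 parses as an Initial packet (type 0x0); a version 1 packet that declares a 21-byte connection ID is dropped.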
5068 In this version of QUIC, the following packet types with the long 5069 header are defined: 5071 +======+===========+================+ 5072 | Type | Name | Section | 5073 +======+===========+================+ 5074 | 0x0 | Initial | Section 17.2.2 | 5075 +------+-----------+----------------+ 5076 | 0x1 | 0-RTT | Section 17.2.3 | 5077 +------+-----------+----------------+ 5078 | 0x2 | Handshake | Section 17.2.4 | 5079 +------+-----------+----------------+ 5080 | 0x3 | Retry | Section 17.2.5 | 5081 +------+-----------+----------------+ 5083 Table 5: Long Header Packet Types 5085 The header form bit, Destination and Source Connection ID lengths, 5086 Destination and Source Connection ID fields, and Version fields of a 5087 long header packet are version-independent. The other fields in the 5088 first byte are version-specific. See [QUIC-INVARIANTS] for details 5089 on how packets from different versions of QUIC are interpreted. 5091 The interpretation of the fields and the payload are specific to a 5092 version and packet type. While type-specific semantics for this 5093 version are described in the following sections, several long-header 5094 packets in this version of QUIC contain these additional fields: 5096 Reserved Bits: Two bits (those with a mask of 0x0c) of byte 0 are 5097 reserved across multiple packet types. These bits are protected 5098 using header protection; see Section 5.4 of [QUIC-TLS]. The value 5099 included prior to protection MUST be set to 0. An endpoint MUST 5100 treat receipt of a packet that has a non-zero value for these bits 5101 after removing both packet and header protection as a connection 5102 error of type PROTOCOL_VIOLATION. Discarding such a packet after 5103 only removing header protection can expose the endpoint to 5104 attacks; see Section 9.5 of [QUIC-TLS]. 
5106 Packet Number Length: In packet types that contain a Packet Number 5107 field, the least significant two bits (those with a mask of 0x03) 5108 of byte 0 contain the length of the packet number, encoded as an 5109 unsigned, two-bit integer that is one less than the length of the 5110 packet number field in bytes. That is, the length of the packet 5111 number field is the value of this field, plus one. These bits are 5112 protected using header protection; see Section 5.4 of [QUIC-TLS]. 5114 Length: The length of the remainder of the packet (that is, the 5115 Packet Number and Payload fields) in bytes, encoded as a variable- 5116 length integer (Section 16). 5118 Packet Number: The packet number field is 1 to 4 bytes long. The 5119 packet number is protected using header protection; see 5120 Section 5.4 of [QUIC-TLS]. The length of the packet number field 5121 is encoded in the Packet Number Length bits of byte 0; see above. 5123 17.2.1. Version Negotiation Packet 5125 A Version Negotiation packet is inherently not version-specific. 5126 Upon receipt by a client, it will be identified as a Version 5127 Negotiation packet based on the Version field having a value of 0. 5129 The Version Negotiation packet is a response to a client packet that 5130 contains a version that is not supported by the server, and is only 5131 sent by servers. 5133 The layout of a Version Negotiation packet is: 5135 Version Negotiation Packet { 5136 Header Form (1) = 1, 5137 Unused (7), 5138 Version (32) = 0, 5139 Destination Connection ID Length (8), 5140 Destination Connection ID (0..2040), 5141 Source Connection ID Length (8), 5142 Source Connection ID (0..2040), 5143 Supported Version (32) ..., 5144 } 5146 Figure 14: Version Negotiation Packet 5148 The value in the Unused field is set to an arbitrary value by the 5149 server. Clients MUST ignore the value of this field. 
Where QUIC 5150 might be multiplexed with other protocols (see [RFC7983]), servers 5151 SHOULD set the most significant bit of this field (0x40) to 1 so that 5152 Version Negotiation packets appear to have the Fixed Bit field. Note 5153 that other versions of QUIC might not make a similar recommendation. 5155 The Version field of a Version Negotiation packet MUST be set to 5156 0x00000000. 5158 The server MUST include the value from the Source Connection ID field 5159 of the packet it receives in the Destination Connection ID field. 5160 The value for Source Connection ID MUST be copied from the 5161 Destination Connection ID of the received packet, which is initially 5162 randomly selected by a client. Echoing both connection IDs gives 5163 clients some assurance that the server received the packet and that 5164 the Version Negotiation packet was not generated by an entity that 5165 did not observe the Initial packet. 5167 Future versions of QUIC could have different requirements for the 5168 lengths of connection IDs. In particular, connection IDs might have 5169 a smaller minimum length or a greater maximum length. Version- 5170 specific rules for the connection ID therefore MUST NOT influence a 5171 server decision about whether to send a Version Negotiation packet. 5173 The remainder of the Version Negotiation packet is a list of 32-bit 5174 versions that the server supports. 5176 A Version Negotiation packet is not acknowledged. It is only sent in 5177 response to a packet that indicates an unsupported version; see 5178 Section 5.2.2. 5180 The Version Negotiation packet does not include the Packet Number and 5181 Length fields present in other packets that use the long header form. 5182 Consequently, a Version Negotiation packet consumes an entire UDP 5183 datagram. 5185 A server MUST NOT send more than one Version Negotiation packet in 5186 response to a single UDP datagram. 5188 See Section 6 for a description of the version negotiation process. 
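As an illustration of the layout in Figure 14, the following Python sketch assembles a Version Negotiation packet. The function name build_version_negotiation and its parameters are hypothetical. The default value for the Unused field sets only the 0x40 bit, which follows the recommendation for deployments where QUIC might be multiplexed with other protocols; a real server could fill the remaining bits arbitrarily.

```python
import struct


def build_version_negotiation(received_dcid, received_scid,
                              supported_versions, unused=0x40):
    """Build a Version Negotiation packet in reply to a client packet.

    received_dcid and received_scid are the connection IDs read from
    the client's packet; they are swapped in the reply.
    """
    if not 0 <= unused <= 0x7F:
        raise ValueError("Unused field is 7 bits wide")
    if len(received_dcid) > 255 or len(received_scid) > 255:
        raise ValueError("connection ID length must fit in one byte")

    out = bytearray([0x80 | unused])       # Header Form = 1, then Unused
    out += struct.pack("!I", 0)            # Version = 0x00000000
    out.append(len(received_scid))         # Destination CID := client's Source CID
    out += received_scid
    out.append(len(received_dcid))         # Source CID := client's Destination CID
    out += received_dcid
    for version in supported_versions:     # list of 32-bit supported versions
        out += struct.pack("!I", version)
    return bytes(out)
```

Because the packet carries no Length field, the returned bytes are expected to occupy a whole UDP datagram.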
5190 17.2.2. Initial Packet 5192 An Initial packet uses long headers with a type value of 0x0. It 5193 carries the first CRYPTO frames sent by the client and server to 5194 perform key exchange, and carries ACKs in either direction. 5196 Initial Packet { 5197 Header Form (1) = 1, 5198 Fixed Bit (1) = 1, 5199 Long Packet Type (2) = 0, 5200 Reserved Bits (2), 5201 Packet Number Length (2), 5202 Version (32), 5203 Destination Connection ID Length (8), 5204 Destination Connection ID (0..160), 5205 Source Connection ID Length (8), 5206 Source Connection ID (0..160), 5207 Token Length (i), 5208 Token (..), 5209 Length (i), 5210 Packet Number (8..32), 5211 Packet Payload (8..), 5212 } 5214 Figure 15: Initial Packet 5216 The Initial packet contains a long header as well as the Length and 5217 Packet Number fields; see Section 17.2. The first byte contains the 5218 Reserved and Packet Number Length bits; see also Section 17.2. 5219 Between the Source Connection ID and Length fields, there are two 5220 additional fields specific to the Initial packet. 5222 Token Length: A variable-length integer specifying the length of the 5223 Token field, in bytes. This value is zero if no token is present. 5224 Initial packets sent by the server MUST set the Token Length field 5225 to zero; clients that receive an Initial packet with a non-zero 5226 Token Length field MUST either discard the packet or generate a 5227 connection error of type PROTOCOL_VIOLATION. 5229 Token: The value of the token that was previously provided in a 5230 Retry packet or NEW_TOKEN frame; see Section 8.1. 5232 Packet Payload: The payload of the packet. 5234 In order to prevent tampering by version-unaware middleboxes, Initial 5235 packets are protected with connection- and version-specific keys 5236 (Initial keys) as described in [QUIC-TLS]. 
This protection does not 5237 provide confidentiality or integrity against attackers that can 5238 observe packets, but provides some level of protection against 5239 attackers that cannot observe packets. 5241 The client and server use the Initial packet type for any packet that 5242 contains an initial cryptographic handshake message. This includes 5243 all cases where a new packet containing the initial cryptographic 5244 message needs to be created, such as the packets sent after receiving 5245 a Retry packet (Section 17.2.5). 5247 A server sends its first Initial packet in response to a client 5248 Initial. A server MAY send multiple Initial packets. The 5249 cryptographic key exchange could require multiple round trips or 5250 retransmissions of this data. 5252 The payload of an Initial packet includes a CRYPTO frame (or frames) 5253 containing a cryptographic handshake message, ACK frames, or both. 5254 PING, PADDING, and CONNECTION_CLOSE frames of type 0x1c are also 5255 permitted. An endpoint that receives an Initial packet containing 5256 other frames can either discard the packet as spurious or treat it as 5257 a connection error. 5259 The first packet sent by a client always includes a CRYPTO frame that 5260 contains the start or all of the first cryptographic handshake 5261 message. The first CRYPTO frame sent always begins at an offset of 5262 0; see Section 7. 5264 Note that if the server sends a TLS HelloRetryRequest (see 5265 Section 4.7 of [QUIC-TLS]), the client will send another series of 5266 Initial packets. These Initial packets will continue the 5267 cryptographic handshake and will contain CRYPTO frames starting at an 5268 offset matching the size of the CRYPTO frames sent in the first 5269 flight of Initial packets. 5271 17.2.2.1. Abandoning Initial Packets 5273 A client stops both sending and processing Initial packets when it 5274 sends its first Handshake packet. 
A server stops sending and 5275 processing Initial packets when it receives its first Handshake 5276 packet. Though packets might still be in flight or awaiting 5277 acknowledgment, no further Initial packets need to be exchanged 5278 beyond this point. Initial packet protection keys are discarded (see 5279 Section 4.9.1 of [QUIC-TLS]) along with any loss recovery and 5280 congestion control state; see Section 6.4 of [QUIC-RECOVERY]. 5282 Any data in CRYPTO frames is discarded - and no longer retransmitted 5283 - when Initial keys are discarded. 5285 17.2.3. 0-RTT 5287 A 0-RTT packet uses long headers with a type value of 0x1, followed 5288 by the Length and Packet Number fields; see Section 17.2. The first 5289 byte contains the Reserved and Packet Number Length bits; see 5290 Section 17.2. A 0-RTT packet is used to carry "early" data from the 5291 client to the server as part of the first flight, prior to handshake 5292 completion. As part of the TLS handshake, the server can accept or 5293 reject this early data. 5295 See Section 2.3 of [TLS13] for a discussion of 0-RTT data and its 5296 limitations. 5298 0-RTT Packet { 5299 Header Form (1) = 1, 5300 Fixed Bit (1) = 1, 5301 Long Packet Type (2) = 1, 5302 Reserved Bits (2), 5303 Packet Number Length (2), 5304 Version (32), 5305 Destination Connection ID Length (8), 5306 Destination Connection ID (0..160), 5307 Source Connection ID Length (8), 5308 Source Connection ID (0..160), 5309 Length (i), 5310 Packet Number (8..32), 5311 Packet Payload (8..), 5312 } 5314 Figure 16: 0-RTT Packet 5316 Packet numbers for 0-RTT protected packets use the same space as 5317 1-RTT protected packets. 5319 After a client receives a Retry packet, 0-RTT packets are likely to 5320 have been lost or discarded by the server. A client SHOULD attempt 5321 to resend data in 0-RTT packets after it sends a new Initial packet. 
5322 New packet numbers MUST be used for any new packets that are sent; as 5323 described in Section 17.2.5.3, reusing packet numbers could 5324 compromise packet protection. 5326 A client only receives acknowledgments for its 0-RTT packets once the 5327 handshake is complete, as defined in Section 4.1.1 of [QUIC-TLS]. 5329 A client MUST NOT send 0-RTT packets once it starts processing 1-RTT 5330 packets from the server. This means that 0-RTT packets cannot 5331 contain any response to frames from 1-RTT packets. For instance, a 5332 client cannot send an ACK frame in a 0-RTT packet, because that can 5333 only acknowledge a 1-RTT packet. An acknowledgment for a 1-RTT 5334 packet MUST be carried in a 1-RTT packet. 5336 A server SHOULD treat a violation of remembered limits 5337 (Section 7.4.1) as a connection error of an appropriate type (for 5338 instance, a FLOW_CONTROL_ERROR for exceeding stream data limits). 5340 17.2.4. Handshake Packet 5342 A Handshake packet uses long headers with a type value of 0x2, 5343 followed by the Length and Packet Number fields; see Section 17.2. 5344 The first byte contains the Reserved and Packet Number Length bits; 5345 see Section 17.2. It is used to carry cryptographic handshake 5346 messages and acknowledgments from the server and client. 5348 Handshake Packet { 5349 Header Form (1) = 1, 5350 Fixed Bit (1) = 1, 5351 Long Packet Type (2) = 2, 5352 Reserved Bits (2), 5353 Packet Number Length (2), 5354 Version (32), 5355 Destination Connection ID Length (8), 5356 Destination Connection ID (0..160), 5357 Source Connection ID Length (8), 5358 Source Connection ID (0..160), 5359 Length (i), 5360 Packet Number (8..32), 5361 Packet Payload (8..), 5362 } 5364 Figure 17: Handshake Protected Packet 5366 Once a client has received a Handshake packet from a server, it uses 5367 Handshake packets to send subsequent cryptographic handshake messages 5368 and acknowledgments to the server. 
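The Initial, 0-RTT, and Handshake layouts shown above share the same structure in byte 0. The Python sketch below composes that byte as it exists before header protection is applied, and recovers the packet number length from an unprotected byte. The names LONG_PACKET_TYPES, first_byte_before_protection, and packet_number_length are hypothetical; header protection itself is defined in [QUIC-TLS] and is not modeled here.

```python
# Long Packet Type values from Table 5 for the types that carry a
# Packet Number field.
LONG_PACKET_TYPES = {"initial": 0x0, "0-rtt": 0x1, "handshake": 0x2}


def first_byte_before_protection(packet_type, pn_length):
    """Return byte 0 of a long header packet prior to header protection."""
    if pn_length not in (1, 2, 3, 4):
        raise ValueError("packet number field is 1 to 4 bytes long")
    type_bits = LONG_PACKET_TYPES[packet_type]
    reserved_bits = 0                      # MUST be 0 before protection
    return (0x80                           # Header Form = 1
            | 0x40                         # Fixed Bit = 1
            | (type_bits << 4)             # Long Packet Type (mask 0x30)
            | (reserved_bits << 2)         # Reserved Bits (mask 0x0c)
            | (pn_length - 1))             # Packet Number Length (mask 0x03)


def packet_number_length(unprotected_first_byte):
    """Length in bytes of the Packet Number field: field value plus one."""
    return (unprotected_first_byte & 0x03) + 1
```

For example, a Handshake packet with a two-byte packet number has an unprotected first byte of 0xe1.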
5370 The Destination Connection ID field in a Handshake packet contains a 5371 connection ID that is chosen by the recipient of the packet; the 5372 Source Connection ID includes the connection ID that the sender of 5373 the packet wishes to use; see Section 7.2. 5375 Handshake packets have their own packet number space, and thus the 5376 first Handshake packet sent by a server contains a packet number of 5377 0. 5379 The payload of this packet contains CRYPTO frames and could contain 5380 PING, PADDING, or ACK frames. Handshake packets MAY contain 5381 CONNECTION_CLOSE frames of type 0x1c. Endpoints MUST treat receipt 5382 of Handshake packets with other frames as a connection error of type 5383 PROTOCOL_VIOLATION. 5385 Like Initial packets (see Section 17.2.2.1), data in CRYPTO frames 5386 for Handshake packets is discarded - and no longer retransmitted - 5387 when Handshake protection keys are discarded. 5389 17.2.5. Retry Packet 5391 A Retry packet uses a long packet header with a type value of 0x3. 5392 It carries an address validation token created by the server. It is 5393 used by a server that wishes to perform a retry; see Section 8.1. 5395 Retry Packet { 5396 Header Form (1) = 1, 5397 Fixed Bit (1) = 1, 5398 Long Packet Type (2) = 3, 5399 Unused (4), 5400 Version (32), 5401 Destination Connection ID Length (8), 5402 Destination Connection ID (0..160), 5403 Source Connection ID Length (8), 5404 Source Connection ID (0..160), 5405 Retry Token (..), 5406 Retry Integrity Tag (128), 5407 } 5409 Figure 18: Retry Packet 5411 A Retry packet (shown in Figure 18) does not contain any protected 5412 fields. The value in the Unused field is set to an arbitrary value 5413 by the server; a client MUST ignore these bits. In addition to the 5414 fields from the long header, it contains these additional fields: 5416 Retry Token: An opaque token that the server can use to validate the 5417 client's address. 
5419 Retry Integrity Tag: See the Retry Packet Integrity section of 5420 [QUIC-TLS]. 5422 17.2.5.1. Sending a Retry Packet 5424 The server populates the Destination Connection ID with the 5425 connection ID that the client included in the Source Connection ID of 5426 the Initial packet. 5428 The server includes a connection ID of its choice in the Source 5429 Connection ID field. This value MUST NOT be equal to the Destination 5430 Connection ID field of the packet sent by the client. A client MUST 5431 discard a Retry packet that contains a Source Connection ID field 5432 that is identical to the Destination Connection ID field of its 5433 Initial packet. The client MUST use the value from the Source 5434 Connection ID field of the Retry packet in the Destination Connection 5435 ID field of subsequent packets that it sends. 5437 A server MAY send Retry packets in response to Initial and 0-RTT 5438 packets. A server can either discard or buffer 0-RTT packets that it 5439 receives. A server can send multiple Retry packets as it receives 5440 Initial or 0-RTT packets. A server MUST NOT send more than one Retry 5441 packet in response to a single UDP datagram. 5443 17.2.5.2. Handling a Retry Packet 5445 A client MUST accept and process at most one Retry packet for each 5446 connection attempt. After the client has received and processed an 5447 Initial or Retry packet from the server, it MUST discard any 5448 subsequent Retry packets that it receives. 5450 Clients MUST discard Retry packets that have a Retry Integrity Tag 5451 that cannot be validated; see the Retry Packet Integrity section of 5452 [QUIC-TLS]. This diminishes an attacker's ability to inject a Retry 5453 packet and protects against accidental corruption of Retry packets. 5454 A client MUST discard a Retry packet with a zero-length Retry Token 5455 field. 
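The client-side checks described so far can be summarized in a short Python sketch. The ClientRetryState class and should_accept_retry function are hypothetical names. Validation of the Retry Integrity Tag is defined in [QUIC-TLS] and is represented here only by a boolean that the caller supplies.

```python
from dataclasses import dataclass


@dataclass
class ClientRetryState:
    # Destination Connection ID of the client's first Initial packet.
    original_dcid: bytes
    # True once an Initial or Retry packet from the server was processed.
    processed_server_packet: bool = False


def should_accept_retry(state, retry_scid, retry_token, integrity_tag_valid):
    """Return True if the client processes this Retry, False to discard it."""
    if state.processed_server_packet:
        return False      # at most one Retry per connection attempt
    if not integrity_tag_valid:
        return False      # tag cannot be validated
    if len(retry_token) == 0:
        return False      # zero-length Retry Token
    if retry_scid == state.original_dcid:
        return False      # Source CID equals the Initial's Destination CID
    return True
```

A caller would set processed_server_packet to True after acting on an accepted Retry, so that later Retry packets are discarded.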
5457 The client responds to a Retry packet with an Initial packet that 5458 includes the provided Retry Token to continue connection 5459 establishment. 5461 A client sets the Destination Connection ID field of this Initial 5462 packet to the value from the Source Connection ID in the Retry 5463 packet. Changing Destination Connection ID also results in a change 5464 to the keys used to protect the Initial packet. It also sets the 5465 Token field to the token provided in the Retry. The client MUST NOT 5466 change the Source Connection ID because the server could include the 5467 connection ID as part of its token validation logic; see 5468 Section 8.1.4. 5470 A Retry packet does not include a packet number and cannot be 5471 explicitly acknowledged by a client. 5473 17.2.5.3. Continuing a Handshake After Retry 5475 Subsequent Initial packets from the client include the connection ID 5476 and token values from the Retry packet. The client copies the Source 5477 Connection ID field from the Retry packet to the Destination 5478 Connection ID field and uses this value until an Initial packet with 5479 an updated value is received; see Section 7.2. The value of the 5480 Token field is copied to all subsequent Initial packets; see 5481 Section 8.1.2. 5483 Other than updating the Destination Connection ID and Token fields, 5484 the Initial packet sent by the client is subject to the same 5485 restrictions as the first Initial packet. A client MUST use the same 5486 cryptographic handshake message it included in this packet. A server 5487 MAY treat a packet that contains a different cryptographic handshake 5488 message as a connection error or discard it. Note that including a 5489 Token field reduces the available space for the cryptographic 5490 handshake message, which might result in the client needing to send 5491 multiple Initial packets. 
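The field updates a client makes when it responds to a Retry can be sketched in Python as follows. InitialFields and initial_after_retry are hypothetical names, and the record models only the fields discussed here. The change of Initial keys that follows from the new Destination Connection ID is not modeled.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class InitialFields:
    dcid: bytes             # Destination Connection ID
    scid: bytes             # Source Connection ID
    token: bytes            # Token field (empty if no token is present)
    crypto_message: bytes   # cryptographic handshake message


def initial_after_retry(first, retry_scid, retry_token):
    """Derive the fields of the Initial packet sent in response to a Retry.

    Only the Destination Connection ID and Token change; the Source
    Connection ID and the handshake message are carried over unchanged.
    """
    return replace(first, dcid=retry_scid, token=retry_token)
```

Keeping the record immutable makes it harder to alter the Source Connection ID or the handshake message by accident.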
5493 A client MAY attempt 0-RTT after receiving a Retry packet by sending 5494 0-RTT packets to the connection ID provided by the server. 5496 A client MUST NOT reset the packet number for any packet number space 5497 after processing a Retry packet. In particular, 0-RTT packets 5498 contain confidential information that will most likely be 5499 retransmitted on receiving a Retry packet. The keys used to protect 5500 these new 0-RTT packets will not change as a result of responding to 5501 a Retry packet. However, the data sent in these packets could be 5502 different than what was sent earlier. Sending these new packets with 5503 the same packet number is likely to compromise the packet protection 5504 for those packets because the same key and nonce could be used to 5505 protect different content. A server MAY abort the connection if it 5506 detects that the client reset the packet number. 5508 The connection IDs used on Initial and Retry packets exchanged 5509 between client and server are copied to the transport parameters and 5510 validated as described in Section 7.3. 5512 17.3. Short Header Packets 5514 This version of QUIC defines a single packet type that uses the short 5515 packet header. 5517 17.3.1. 1-RTT Packet 5519 A 1-RTT packet uses a short packet header. It is used after the 5520 version and 1-RTT keys are negotiated. 5522 1-RTT Packet { 5523 Header Form (1) = 0, 5524 Fixed Bit (1) = 1, 5525 Spin Bit (1), 5526 Reserved Bits (2), 5527 Key Phase (1), 5528 Packet Number Length (2), 5529 Destination Connection ID (0..160), 5530 Packet Number (8..32), 5531 Packet Payload (8..), 5532 } 5534 Figure 19: 1-RTT Packet 5536 1-RTT packets contain the following fields: 5538 Header Form: The most significant bit (0x80) of byte 0 is set to 0 5539 for the short header. 5541 Fixed Bit: The next bit (0x40) of byte 0 is set to 1. Packets 5542 containing a zero value for this bit are not valid packets in this 5543 version and MUST be discarded. 
A value of 1 for this bit allows 5544 QUIC to coexist with other protocols; see [RFC7983]. 5546 Spin Bit: The third most significant bit (0x20) of byte 0 is the 5547 latency spin bit, set as described in Section 17.4. 5549 Reserved Bits: The next two bits (those with a mask of 0x18) of byte 5550 0 are reserved. These bits are protected using header protection; 5551 see Section 5.4 of [QUIC-TLS]. The value included prior to 5552 protection MUST be set to 0. An endpoint MUST treat receipt of a 5553 packet that has a non-zero value for these bits, after removing 5554 both packet and header protection, as a connection error of type 5555 PROTOCOL_VIOLATION. Discarding such a packet after only removing 5556 header protection can expose the endpoint to attacks; see 5557 Section 9.5 of [QUIC-TLS]. 5559 Key Phase: The next bit (0x04) of byte 0 indicates the key phase, 5560 which allows a recipient of a packet to identify the packet 5561 protection keys that are used to protect the packet. See 5562 [QUIC-TLS] for details. This bit is protected using header 5563 protection; see Section 5.4 of [QUIC-TLS]. 5565 Packet Number Length: The least significant two bits (those with a 5566 mask of 0x03) of byte 0 contain the length of the packet number, 5567 encoded as an unsigned, two-bit integer that is one less than the 5568 length of the packet number field in bytes. That is, the length 5569 of the packet number field is the value of this field, plus one. 5570 These bits are protected using header protection; see Section 5.4 5571 of [QUIC-TLS]. 5573 Destination Connection ID: The Destination Connection ID is a 5574 connection ID that is chosen by the intended recipient of the 5575 packet. See Section 5.1 for more details. 5577 Packet Number: The packet number field is 1 to 4 bytes long. The 5578 packet number is protected using header protection; see 5579 Section 5.4 of [QUIC-TLS]. The length of the packet number field 5580 is encoded in the Packet Number Length field.
See Section 17.1 for 5581 details. 5583 Packet Payload: 1-RTT packets always include a 1-RTT protected 5584 payload. 5586 The header form bit and the connection ID field of a short header 5587 packet are version-independent. The remaining fields are specific to 5588 the selected QUIC version. See [QUIC-INVARIANTS] for details on how 5589 packets from different versions of QUIC are interpreted. 5591 17.4. Latency Spin Bit 5593 The latency spin bit, which is defined for 1-RTT packets 5594 (Section 17.3.1), enables passive latency monitoring from observation 5595 points on the network path throughout the duration of a connection. 5596 The server reflects the spin value received, while the client 'spins' 5597 it after one RTT. On-path observers can measure the time between two 5598 spin bit toggle events to estimate the end-to-end RTT of a 5599 connection. 5601 The spin bit is only present in 1-RTT packets, since it is possible 5602 to measure the initial RTT of a connection by observing the 5603 handshake. Therefore, the spin bit is available after version 5604 negotiation and connection establishment are completed. On-path 5605 measurement and use of the latency spin bit is further discussed in 5606 [QUIC-MANAGEABILITY]. 5608 The spin bit is an OPTIONAL feature of this version of QUIC. An 5609 endpoint that does not support this feature MUST disable it, as 5610 defined below. 5612 Each endpoint unilaterally decides if the spin bit is enabled or 5613 disabled for a connection. Implementations MUST allow administrators 5614 of clients and servers to disable the spin bit either globally or on 5615 a per-connection basis. Even when the spin bit is not disabled by 5616 the administrator, endpoints MUST disable their use of the spin bit 5617 for a random selection of at least one in every 16 network paths, or 5618 for one in every 16 connection IDs, in order to ensure that QUIC 5619 connections that disable the spin bit are commonly observed on the 5620 network. 
As each endpoint disables the spin bit independently, this 5621 ensures that the spin bit signal is disabled on approximately one in 5622 eight network paths. 5624 When the spin bit is disabled, endpoints MAY set the spin bit to any 5625 value, and MUST ignore any incoming value. It is RECOMMENDED that 5626 endpoints set the spin bit to a random value either chosen 5627 independently for each packet or chosen independently for each 5628 connection ID. 5630 If the spin bit is enabled for the connection, the endpoint maintains 5631 a spin value for each network path and sets the spin bit in the 5632 packet header to the currently stored value when a 1-RTT packet is 5633 sent on that path. The spin value is initialized to 0 in the 5634 endpoint for each network path. Each endpoint also remembers the 5635 highest packet number seen from its peer on each path. 5637 When a server receives a 1-RTT packet that increases the highest 5638 packet number seen by the server from the client on a given network 5639 path, it sets the spin value for that path to be equal to the spin 5640 bit in the received packet. 5642 When a client receives a 1-RTT packet that increases the highest 5643 packet number seen by the client from the server on a given network 5644 path, it sets the spin value for that path to the inverse of the spin 5645 bit in the received packet. 5647 An endpoint resets the spin value for a network path to zero when 5648 changing the connection ID being used on that network path. 5650 18. Transport Parameter Encoding 5652 The extension_data field of the quic_transport_parameters extension 5653 defined in [QUIC-TLS] contains the QUIC transport parameters. They 5654 are encoded as a sequence of transport parameters, as shown in 5655 Figure 20: 5657 Transport Parameters { 5658 Transport Parameter (..) 
..., 5659 } 5660 Figure 20: Sequence of Transport Parameters 5662 Each transport parameter is encoded as an (identifier, length, value) 5663 tuple, as shown in Figure 21: 5665 Transport Parameter { 5666 Transport Parameter ID (i), 5667 Transport Parameter Length (i), 5668 Transport Parameter Value (..), 5669 } 5671 Figure 21: Transport Parameter Encoding 5673 The Transport Parameter Length field contains the length of the 5674 Transport Parameter Value field in bytes. 5676 QUIC encodes transport parameters into a sequence of bytes, which is 5677 then included in the cryptographic handshake. 5679 18.1. Reserved Transport Parameters 5681 Transport parameters with an identifier of the form "31 * N + 27" for 5682 integer values of N are reserved to exercise the requirement that 5683 unknown transport parameters be ignored. These transport parameters 5684 have no semantics, and can carry arbitrary values. 5686 18.2. Transport Parameter Definitions 5688 This section details the transport parameters defined in this 5689 document. 5691 Many transport parameters listed here have integer values. Those 5692 transport parameters that are identified as integers use a variable- 5693 length integer encoding; see Section 16. Transport parameters have a 5694 default value of 0 if the transport parameter is absent, unless 5695 otherwise stated. 5697 The following transport parameters are defined: 5699 original_destination_connection_id (0x00): The value of the 5700 Destination Connection ID field from the first Initial packet sent 5701 by the client; see Section 7.3. This transport parameter is only 5702 sent by a server. 5704 max_idle_timeout (0x01): The max idle timeout is a value in 5705 milliseconds that is encoded as an integer; see Section 10.1. 5706 Idle timeout is disabled when both endpoints omit this transport 5707 parameter or specify a value of 0. 5709 stateless_reset_token (0x02): A stateless reset token is used in 5710 verifying a stateless reset; see Section 10.3.
This parameter is 5711 a sequence of 16 bytes. This transport parameter MUST NOT be sent 5712 by a client, but MAY be sent by a server. A server that does not 5713 send this transport parameter cannot use stateless reset 5714 (Section 10.3) for the connection ID negotiated during the 5715 handshake. 5717 max_udp_payload_size (0x03): The maximum UDP payload size parameter 5718 is an integer value that limits the size of UDP payloads that the 5719 endpoint is willing to receive. UDP datagrams with payloads 5720 larger than this limit are not likely to be processed by the 5721 receiver. 5723 The default for this parameter is the maximum permitted UDP 5724 payload of 65527. Values below 1200 are invalid. 5726 This limit does act as an additional constraint on datagram size 5727 in the same way as the path MTU, but it is a property of the 5728 endpoint and not the path; see Section 14. It is expected that 5729 this is the space an endpoint dedicates to holding incoming 5730 packets. 5732 initial_max_data (0x04): The initial maximum data parameter is an 5733 integer value that contains the initial value for the maximum 5734 amount of data that can be sent on the connection. This is 5735 equivalent to sending a MAX_DATA (Section 19.9) for the connection 5736 immediately after completing the handshake. 5738 initial_max_stream_data_bidi_local (0x05): This parameter is an 5739 integer value specifying the initial flow control limit for 5740 locally-initiated bidirectional streams. This limit applies to 5741 newly created bidirectional streams opened by the endpoint that 5742 sends the transport parameter. In client transport parameters, 5743 this applies to streams with an identifier with the least 5744 significant two bits set to 0x0; in server transport parameters, 5745 this applies to streams with the least significant two bits set to 5746 0x1. 
5748 initial_max_stream_data_bidi_remote (0x06): This parameter is an 5749 integer value specifying the initial flow control limit for peer- 5750 initiated bidirectional streams. This limit applies to newly 5751 created bidirectional streams opened by the endpoint that receives 5752 the transport parameter. In client transport parameters, this 5753 applies to streams with an identifier with the least significant 5754 two bits set to 0x1; in server transport parameters, this applies 5755 to streams with the least significant two bits set to 0x0. 5757 initial_max_stream_data_uni (0x07): This parameter is an integer 5758 value specifying the initial flow control limit for unidirectional 5759 streams. This limit applies to newly created unidirectional 5760 streams opened by the endpoint that receives the transport 5761 parameter. In client transport parameters, this applies to 5762 streams with an identifier with the least significant two bits set 5763 to 0x3; in server transport parameters, this applies to streams 5764 with the least significant two bits set to 0x2. 5766 initial_max_streams_bidi (0x08): The initial maximum bidirectional 5767 streams parameter is an integer value that contains the initial 5768 maximum number of bidirectional streams the endpoint that receives 5769 this transport parameter is permitted to initiate. If this 5770 parameter is absent or zero, the peer cannot open bidirectional 5771 streams until a MAX_STREAMS frame is sent. Setting this parameter 5772 is equivalent to sending a MAX_STREAMS (Section 19.11) of the 5773 corresponding type with the same value. 5775 initial_max_streams_uni (0x09): The initial maximum unidirectional 5776 streams parameter is an integer value that contains the initial 5777 maximum number of unidirectional streams the endpoint that 5778 receives this transport parameter is permitted to initiate. 
If 5779 this parameter is absent or zero, the peer cannot open 5780 unidirectional streams until a MAX_STREAMS frame is sent. Setting 5781 this parameter is equivalent to sending a MAX_STREAMS 5782 (Section 19.11) of the corresponding type with the same value. 5784 ack_delay_exponent (0x0a): The acknowledgment delay exponent is an 5785 integer value indicating an exponent used to decode the ACK Delay 5786 field in the ACK frame (Section 19.3). If this value is absent, a 5787 default value of 3 is assumed (indicating a multiplier of 8). 5788 Values above 20 are invalid. 5790 max_ack_delay (0x0b): The maximum acknowledgment delay is an integer 5791 value indicating the maximum amount of time in milliseconds by 5792 which the endpoint will delay sending acknowledgments. This value 5793 SHOULD include the receiver's expected delays in alarms firing. 5794 For example, if a receiver sets a timer for 5ms and alarms 5795 commonly fire up to 1ms late, then it should send a max_ack_delay 5796 of 6ms. If this value is absent, a default of 25 milliseconds is 5797 assumed. Values of 2^14 or greater are invalid. 5799 disable_active_migration (0x0c): The disable active migration 5800 transport parameter is included if the endpoint does not support 5801 active connection migration (Section 9) on the address being used 5802 during the handshake. An endpoint that receives this transport 5803 parameter MUST NOT use a new local address when sending to the 5804 address that the peer used during the handshake. This transport 5805 parameter does not prohibit connection migration after a client 5806 has acted on a preferred_address transport parameter. This 5807 parameter is a zero-length value. 5809 preferred_address (0x0d): The server's preferred address is used to 5810 effect a change in server address at the end of the handshake, as 5811 described in Section 9.6. This transport parameter is only sent 5812 by a server. 
Servers MAY choose to only send a preferred address 5813 of one address family by sending an all-zero address and port 5814 (0.0.0.0:0 or [::]:0) for the other family. IP addresses are 5815 encoded in network byte order. 5817 The preferred_address transport parameter contains an address and 5818 port for both IP version 4 and 6. The four-byte IPv4 Address 5819 field is followed by the associated two-byte IPv4 Port field. 5820 This is followed by a 16-byte IPv6 Address field and two-byte IPv6 5821 Port field. After address and port pairs, a Connection ID Length 5822 field describes the length of the following Connection ID field. 5823 Finally, a 16-byte Stateless Reset Token field includes the 5824 stateless reset token associated with the connection ID. The 5825 format of this transport parameter is shown in Figure 22. 5827 The Connection ID field and the Stateless Reset Token field 5828 contain an alternative connection ID that has a sequence number of 5829 1; see Section 5.1.1. Having these values sent alongside the 5830 preferred address ensures that there will be at least one unused 5831 active connection ID when the client initiates migration to the 5832 preferred address. 5834 The Connection ID and Stateless Reset Token fields of a preferred 5835 address are identical in syntax and semantics to the corresponding 5836 fields of a NEW_CONNECTION_ID frame (Section 19.15). A server 5837 that chooses a zero-length connection ID MUST NOT provide a 5838 preferred address. Similarly, a server MUST NOT include a zero- 5839 length connection ID in this transport parameter. A client MUST 5840 treat violation of these requirements as a connection error of 5841 type TRANSPORT_PARAMETER_ERROR. 
5843 Preferred Address { 5844 IPv4 Address (32), 5845 IPv4 Port (16), 5846 IPv6 Address (128), 5847 IPv6 Port (16), 5848 Connection ID Length (8), 5849 Connection ID (..), 5850 Stateless Reset Token (128), 5851 } 5852 Figure 22: Preferred Address format 5854 active_connection_id_limit (0x0e): The active connection ID limit is 5855 an integer value specifying the maximum number of connection IDs 5856 from the peer that an endpoint is willing to store. This value 5857 includes the connection ID received during the handshake, that 5858 received in the preferred_address transport parameter, and those 5859 received in NEW_CONNECTION_ID frames. The value of the 5860 active_connection_id_limit parameter MUST be at least 2. An 5861 endpoint that receives a value less than 2 MUST close the 5862 connection with an error of type TRANSPORT_PARAMETER_ERROR. If 5863 this transport parameter is absent, a default of 2 is assumed. If 5864 an endpoint issues a zero-length connection ID, it will never send 5865 a NEW_CONNECTION_ID frame and therefore ignores the 5866 active_connection_id_limit value received from its peer. 5868 initial_source_connection_id (0x0f): The value that the endpoint 5869 included in the Source Connection ID field of the first Initial 5870 packet it sends for the connection; see Section 7.3. 5872 retry_source_connection_id (0x10): The value that the server 5873 included in the Source Connection ID field of a Retry packet; see 5874 Section 7.3. This transport parameter is only sent by a server. 5876 If present, transport parameters that set initial per-stream flow 5877 control limits (initial_max_stream_data_bidi_local, 5878 initial_max_stream_data_bidi_remote, and initial_max_stream_data_uni) 5879 are equivalent to sending a MAX_STREAM_DATA frame (Section 19.10) on 5880 every stream of the corresponding type immediately after opening. If 5881 the transport parameter is absent, streams of that type start with a 5882 flow control limit of 0. 
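As an illustration only, the mapping between the low two bits of a stream ID and the three initial per-stream flow control parameters described above can be written as a lookup table. This is a minimal sketch, not part of the protocol definition; the names limit_parameter and initial_stream_limit, and the use of a dictionary to hold decoded transport parameters, are assumptions made for the example.

```python
from typing import Dict, Optional

BIDI_LOCAL = "initial_max_stream_data_bidi_local"
BIDI_REMOTE = "initial_max_stream_data_bidi_remote"
UNI = "initial_max_stream_data_uni"

# Keyed by (parameters were sent by the client, low two bits of the stream ID).
# The entries restate the per-parameter text; the table itself is illustrative.
_LIMIT_PARAMETER = {
    (True, 0x0): BIDI_LOCAL,
    (True, 0x1): BIDI_REMOTE,
    (True, 0x3): UNI,
    (False, 0x1): BIDI_LOCAL,
    (False, 0x0): BIDI_REMOTE,
    (False, 0x2): UNI,
}


def limit_parameter(sent_by_client: bool, stream_id: int) -> Optional[str]:
    """Name the parameter that sets the initial limit for stream_id.

    Returns None for unidirectional streams opened by the sender of the
    parameters, which none of the three parameters covers.
    """
    return _LIMIT_PARAMETER.get((sent_by_client, stream_id & 0x3))


def initial_stream_limit(params: Dict[str, int], sent_by_client: bool,
                         stream_id: int) -> Optional[int]:
    """Look up the initial limit; an absent parameter means a limit of 0."""
    name = limit_parameter(sent_by_client, stream_id)
    if name is None:
        return None
    return params.get(name, 0)
```

For example, under these assumptions a client that sent only initial_max_stream_data_uni would start every server-initiated bidirectional stream with a limit of 0 until it sends a MAX_STREAM_DATA frame.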
5884 A client MUST NOT include any server-only transport parameter: 5885 original_destination_connection_id, preferred_address, 5886 retry_source_connection_id, or stateless_reset_token. A server MUST 5887 treat receipt of any of these transport parameters as a connection 5888 error of type TRANSPORT_PARAMETER_ERROR. 5890 19. Frame Types and Formats 5892 As described in Section 12.4, packets contain one or more frames. 5893 This section describes the format and semantics of the core QUIC 5894 frame types. 5896 19.1. PADDING Frames 5898 A PADDING frame (type=0x00) has no semantic value. PADDING frames 5899 can be used to increase the size of a packet. Padding can be used to 5900 increase an initial client packet to the minimum required size, or to 5901 provide protection against traffic analysis for protected packets. 5903 PADDING frames are formatted as shown in Figure 23, which shows that 5904 PADDING frames have no content. That is, a PADDING frame consists of 5905 the single byte that identifies the frame as a PADDING frame. 5907 PADDING Frame { 5908 Type (i) = 0x00, 5909 } 5911 Figure 23: PADDING Frame Format 5913 19.2. PING Frames 5915 Endpoints can use PING frames (type=0x01) to verify that their peers 5916 are still alive or to check reachability to the peer. 5918 PING frames are formatted as shown in Figure 24, which shows that 5919 PING frames have no content. 5921 PING Frame { 5922 Type (i) = 0x01, 5923 } 5925 Figure 24: PING Frame Format 5927 The receiver of a PING frame simply needs to acknowledge the packet 5928 containing this frame. 5930 The PING frame can be used to keep a connection alive when an 5931 application or application protocol wishes to prevent the connection 5932 from timing out; see Section 10.1.2. 5934 19.3. ACK Frames 5936 Receivers send ACK frames (types 0x02 and 0x03) to inform senders of 5937 packets they have received and processed. The ACK frame contains one 5938 or more ACK Ranges. ACK Ranges identify acknowledged packets. 
If 5939 the frame type is 0x03, ACK frames also contain the cumulative count 5940 of QUIC packets with associated ECN marks received on the connection 5941 up until this point. QUIC implementations MUST properly handle both 5942 types and, if they have enabled ECN for packets they send, they 5943 SHOULD use the information in the ECN section to manage their 5944 congestion state. 5946 QUIC acknowledgments are irrevocable. Once acknowledged, a packet 5947 remains acknowledged, even if it does not appear in a future ACK 5948 frame. This is unlike reneging for TCP SACKs ([RFC2018]). 5950 Packets from different packet number spaces can be identified using 5951 the same numeric value. An acknowledgment for a packet needs to 5952 indicate both a packet number and a packet number space. This is 5953 accomplished by having each ACK frame only acknowledge packet numbers 5954 in the same space as the packet in which the ACK frame is contained. 5956 Version Negotiation and Retry packets cannot be acknowledged because 5957 they do not contain a packet number. Rather than relying on ACK 5958 frames, these packets are implicitly acknowledged by the next Initial 5959 packet sent by the client. 5961 ACK frames are formatted as shown in Figure 25. 5963 ACK Frame { 5964 Type (i) = 0x02..0x03, 5965 Largest Acknowledged (i), 5966 ACK Delay (i), 5967 ACK Range Count (i), 5968 First ACK Range (i), 5969 ACK Range (..) ..., 5970 [ECN Counts (..)], 5971 } 5973 Figure 25: ACK Frame Format 5975 ACK frames contain the following fields: 5977 Largest Acknowledged: A variable-length integer representing the 5978 largest packet number the peer is acknowledging; this is usually 5979 the largest packet number that the peer has received prior to 5980 generating the ACK frame. Unlike the packet number in the QUIC 5981 long or short header, the value in an ACK frame is not truncated. 5983 ACK Delay: A variable-length integer encoding the acknowledgment 5984 delay in microseconds; see Section 13.2.5. 
It is decoded by 5985 multiplying the value in the field by 2 to the power of the 5986 ack_delay_exponent transport parameter sent by the sender of the 5987 ACK frame; see Section 18.2. Compared to simply expressing the 5988 delay as an integer, this encoding allows for a larger range of 5989 values within the same number of bytes, at the cost of lower 5990 resolution. 5992 ACK Range Count: A variable-length integer specifying the number of 5993 ACK Range fields in the frame. 5995 First ACK Range: A variable-length integer indicating the number of 5996 contiguous packets preceding the Largest Acknowledged that are 5997 being acknowledged. That is, the smallest packet acknowledged in 5998 the range is determined by subtracting the First ACK Range value 5999 from the Largest Acknowledged. 6001 ACK Ranges: Contains additional ranges of packets that are 6002 alternately not acknowledged (Gap) and acknowledged (ACK Range); 6003 see Section 19.3.1. 6005 ECN Counts: The three ECN Counts; see Section 19.3.2. 6007 19.3.1. ACK Ranges 6009 Each ACK Range consists of alternating Gap and ACK Range Length 6010 values in descending packet number order. ACK Ranges can be 6011 repeated. The number of Gap and ACK Range Length values is 6012 determined by the ACK Range Count field; one of each value is present 6013 for each value in the ACK Range Count field. 6015 ACK Ranges are structured as shown in Figure 26. 6017 ACK Range { 6018 Gap (i), 6019 ACK Range Length (i), 6020 } 6022 Figure 26: ACK Ranges 6024 The fields that form each ACK Range are: 6026 Gap: A variable-length integer indicating the number of contiguous 6027 unacknowledged packets preceding the packet number one lower than 6028 the smallest in the preceding ACK Range. 6030 ACK Range Length: A variable-length integer indicating the number of 6031 contiguous acknowledged packets preceding the largest packet 6032 number, as determined by the preceding Gap. 
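The following non-normative sketch shows one way a receiver might decode the ACK Delay field and expand the First ACK Range, Gap, and ACK Range Length fields into inclusive packet number ranges, using the smallest and largest relations given in Section 19.3.1. The function names and the FrameEncodingError exception are assumptions made for the example; the fields are taken to be already decoded from their variable-length integer encoding.

```python
from typing import List, Tuple


class FrameEncodingError(Exception):
    """Stands in for a connection error of type FRAME_ENCODING_ERROR."""


def decode_ack_delay(field_value: int, ack_delay_exponent: int = 3) -> int:
    """Return the acknowledgment delay in microseconds.

    ack_delay_exponent is the value sent by the sender of the ACK frame;
    3 is the default that applies when the parameter is absent.
    """
    return field_value * (2 ** ack_delay_exponent)


def decode_ack_ranges(largest_acknowledged: int, first_ack_range: int,
                      ack_ranges: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Expand an ACK frame into (smallest, largest) inclusive ranges.

    ack_ranges holds the (Gap, ACK Range Length) pairs in wire order.
    """
    largest = largest_acknowledged
    smallest = largest - first_ack_range
    if smallest < 0:
        raise FrameEncodingError("negative packet number")
    acknowledged = [(smallest, largest)]
    for gap, ack_range_length in ack_ranges:
        # largest = previous_smallest - gap - 2
        largest = smallest - gap - 2
        # smallest = largest - ack_range
        smallest = largest - ack_range_length
        if largest < 0 or smallest < 0:
            raise FrameEncodingError("negative packet number")
        acknowledged.append((smallest, largest))
    return acknowledged
```

With Largest Acknowledged 100, First ACK Range 2, and a single (Gap 1, ACK Range Length 3) pair, this sketch yields the ranges 98 to 100 and 92 to 95, leaving packets 96 and 97 unacknowledged.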
6034 Gap and ACK Range Length values use a relative integer encoding for 6035 efficiency. Though each encoded value is positive, the values are 6036 subtracted, so that each ACK Range describes progressively lower- 6037 numbered packets. 6039 Each ACK Range acknowledges a contiguous range of packets by 6040 indicating the number of acknowledged packets that precede the 6041 largest packet number in that range. A value of zero indicates that 6042 only the largest packet number is acknowledged. Larger ACK Range 6043 values indicate a larger range, with corresponding lower values for 6044 the smallest packet number in the range. Thus, given a largest 6045 packet number for the range, the smallest value is determined by the 6046 formula: 6048 smallest = largest - ack_range 6050 An ACK Range acknowledges all packets between the smallest packet 6051 number and the largest, inclusive. 6053 The largest value for an ACK Range is determined by cumulatively 6054 subtracting the size of all preceding ACK Range Lengths and Gaps. 6056 Each Gap indicates a range of packets that are not being 6057 acknowledged. The number of packets in the gap is one higher than 6058 the encoded value of the Gap field. 6060 The value of the Gap field establishes the largest packet number 6061 value for the subsequent ACK Range using the following formula: 6063 largest = previous_smallest - gap - 2 6065 If any computed packet number is negative, an endpoint MUST generate 6066 a connection error of type FRAME_ENCODING_ERROR. 6068 19.3.2. ECN Counts 6070 The ACK frame uses the least significant bit of the type value (that 6071 is, type 0x03) to indicate ECN feedback and report receipt of QUIC 6072 packets with associated ECN codepoints of ECT(0), ECT(1), or CE in 6073 the packet's IP header. ECN Counts are only present when the ACK 6074 frame type is 0x03. 6076 When present, there are 3 ECN counts, as shown in Figure 27. 
6078 ECN Counts { 6079 ECT0 Count (i), 6080 ECT1 Count (i), 6081 ECN-CE Count (i), 6082 } 6084 Figure 27: ECN Count Format 6086 The three ECN Counts are: 6088 ECT0 Count: A variable-length integer representing the total number 6089 of packets received with the ECT(0) codepoint in the packet number 6090 space of the ACK frame. 6092 ECT1 Count: A variable-length integer representing the total number 6093 of packets received with the ECT(1) codepoint in the packet number 6094 space of the ACK frame. 6096 ECN-CE Count: A variable-length integer representing the total number of 6097 packets received with the CE codepoint in the packet number space 6098 of the ACK frame. 6100 ECN counts are maintained separately for each packet number space. 6102 19.4. RESET_STREAM Frames 6104 An endpoint uses a RESET_STREAM frame (type=0x04) to abruptly 6105 terminate the sending part of a stream. 6107 After sending a RESET_STREAM, an endpoint ceases transmission and 6108 retransmission of STREAM frames on the identified stream. A receiver 6109 of RESET_STREAM can discard any data that it already received on that 6110 stream. 6112 An endpoint that receives a RESET_STREAM frame for a send-only stream 6113 MUST terminate the connection with error STREAM_STATE_ERROR. 6115 RESET_STREAM frames are formatted as shown in Figure 28. 6117 RESET_STREAM Frame { 6118 Type (i) = 0x04, 6119 Stream ID (i), 6120 Application Protocol Error Code (i), 6121 Final Size (i), 6122 } 6124 Figure 28: RESET_STREAM Frame Format 6126 RESET_STREAM frames contain the following fields: 6128 Stream ID: A variable-length integer encoding of the Stream ID of 6129 the stream being terminated. 6131 Application Protocol Error Code: A variable-length integer 6132 containing the application protocol error code (see Section 20.2) 6133 that indicates why the stream is being closed.
6135 Final Size: A variable-length integer indicating the final size of 6136 the stream by the RESET_STREAM sender, in units of bytes; see 6137 Section 4.5. 6139 19.5. STOP_SENDING Frames 6141 An endpoint uses a STOP_SENDING frame (type=0x05) to communicate that 6142 incoming data is being discarded on receipt at application request. 6143 STOP_SENDING requests that a peer cease transmission on a stream. 6145 A STOP_SENDING frame can be sent for streams in the Recv or Size 6146 Known states; see Section 3.1. Receiving a STOP_SENDING frame for a 6147 locally-initiated stream that has not yet been created MUST be 6148 treated as a connection error of type STREAM_STATE_ERROR. An 6149 endpoint that receives a STOP_SENDING frame for a receive-only stream 6150 MUST terminate the connection with error STREAM_STATE_ERROR. 6152 STOP_SENDING frames are formatted as shown in Figure 29. 6154 STOP_SENDING Frame { 6155 Type (i) = 0x05, 6156 Stream ID (i), 6157 Application Protocol Error Code (i), 6158 } 6160 Figure 29: STOP_SENDING Frame Format 6162 STOP_SENDING frames contain the following fields: 6164 Stream ID: A variable-length integer carrying the Stream ID of the 6165 stream being ignored. 6167 Application Protocol Error Code: A variable-length integer 6168 containing the application-specified reason the sender is ignoring 6169 the stream; see Section 20.2. 6171 19.6. CRYPTO Frames 6173 A CRYPTO frame (type=0x06) is used to transmit cryptographic 6174 handshake messages. It can be sent in all packet types except 0-RTT. 6175 The CRYPTO frame offers the cryptographic protocol an in-order stream 6176 of bytes. CRYPTO frames are functionally identical to STREAM frames, 6177 except that they do not bear a stream identifier; they are not flow 6178 controlled; and they do not carry markers for optional offset, 6179 optional length, and the end of the stream. 6181 CRYPTO frames are formatted as shown in Figure 30.
6183 CRYPTO Frame { 6184 Type (i) = 0x06, 6185 Offset (i), 6186 Length (i), 6187 Crypto Data (..), 6188 } 6190 Figure 30: CRYPTO Frame Format 6192 CRYPTO frames contain the following fields: 6194 Offset: A variable-length integer specifying the byte offset in the 6195 stream for the data in this CRYPTO frame. 6197 Length: A variable-length integer specifying the length of the 6198 Crypto Data field in this CRYPTO frame. 6200 Crypto Data: The cryptographic message data. 6202 There is a separate flow of cryptographic handshake data in each 6203 encryption level, each of which starts at an offset of 0. This 6204 implies that each encryption level is treated as a separate CRYPTO 6205 stream of data. 6207 The largest offset delivered on a stream - the sum of the offset and 6208 data length - cannot exceed 2^62-1. Receipt of a frame that exceeds 6209 this limit MUST be treated as a connection error of type 6210 FRAME_ENCODING_ERROR or CRYPTO_BUFFER_EXCEEDED. 6212 Unlike STREAM frames, which include a Stream ID indicating to which 6213 stream the data belongs, the CRYPTO frame carries data for a single 6214 stream per encryption level. The stream does not have an explicit 6215 end, so CRYPTO frames do not have a FIN bit. 6217 19.7. NEW_TOKEN Frames 6219 A server sends a NEW_TOKEN frame (type=0x07) to provide the client 6220 with a token to send in the header of an Initial packet for a future 6221 connection. 6223 NEW_TOKEN frames are formatted as shown in Figure 31. 6225 NEW_TOKEN Frame { 6226 Type (i) = 0x07, 6227 Token Length (i), 6228 Token (..), 6229 } 6231 Figure 31: NEW_TOKEN Frame Format 6233 NEW_TOKEN frames contain the following fields: 6235 Token Length: A variable-length integer specifying the length of the 6236 token in bytes. 6238 Token: An opaque blob that the client can use with a future Initial 6239 packet. The token MUST NOT be empty. 
A client MUST treat receipt 6240 of a NEW_TOKEN frame with an empty Token field as a connection 6241 error of type FRAME_ENCODING_ERROR. 6243 A client might receive multiple NEW_TOKEN frames that contain the 6244 same token value if packets containing the frame are incorrectly 6245 determined to be lost. Clients are responsible for discarding 6246 duplicate values, which might be used to link connection attempts; 6247 see Section 8.1.3. 6249 Clients MUST NOT send NEW_TOKEN frames. A server MUST treat receipt 6250 of a NEW_TOKEN frame as a connection error of type 6251 PROTOCOL_VIOLATION. 6253 19.8. STREAM Frames 6255 STREAM frames implicitly create a stream and carry stream data. The 6256 STREAM frame Type field takes the form 0b00001XXX (or the set of 6257 values from 0x08 to 0x0f). The three low-order bits of the frame 6258 type determine the fields that are present in the frame: 6260 * The OFF bit (0x04) in the frame type is set to indicate that there 6261 is an Offset field present. When set to 1, the Offset field is 6262 present. When set to 0, the Offset field is absent and the Stream 6263 Data starts at an offset of 0 (that is, the frame contains the 6264 first bytes of the stream, or the end of a stream that includes no 6265 data). 6267 * The LEN bit (0x02) in the frame type is set to indicate that there 6268 is a Length field present. If this bit is set to 0, the Length 6269 field is absent and the Stream Data field extends to the end of 6270 the packet. If this bit is set to 1, the Length field is present. 6272 * The FIN bit (0x01) indicates that the frame marks the end of the 6273 stream. The final size of the stream is the sum of the offset and 6274 the length of this frame. 6276 An endpoint MUST terminate the connection with error 6277 STREAM_STATE_ERROR if it receives a STREAM frame for a locally- 6278 initiated stream that has not yet been created, or for a send-only 6279 stream. 6281 STREAM frames are formatted as shown in Figure 32. 
6283 STREAM Frame { 6284 Type (i) = 0x08..0x0f, 6285 Stream ID (i), 6286 [Offset (i)], 6287 [Length (i)], 6288 Stream Data (..), 6289 } 6291 Figure 32: STREAM Frame Format 6293 STREAM frames contain the following fields: 6295 Stream ID: A variable-length integer indicating the stream ID of the 6296 stream; see Section 2.1. 6298 Offset: A variable-length integer specifying the byte offset in the 6299 stream for the data in this STREAM frame. This field is present 6300 when the OFF bit is set to 1. When the Offset field is absent, 6301 the offset is 0. 6303 Length: A variable-length integer specifying the length of the 6304 Stream Data field in this STREAM frame. This field is present 6305 when the LEN bit is set to 1. When the LEN bit is set to 0, the 6306 Stream Data field consumes all the remaining bytes in the packet. 6308 Stream Data: The bytes from the designated stream to be delivered. 6310 When a Stream Data field has a length of 0, the offset in the STREAM 6311 frame is the offset of the next byte that would be sent. 6313 The first byte in the stream has an offset of 0. The largest offset 6314 delivered on a stream - the sum of the offset and data length - 6315 cannot exceed 2^62-1, as it is not possible to provide flow control 6316 credit for that data. Receipt of a frame that exceeds this limit 6317 MUST be treated as a connection error of type FRAME_ENCODING_ERROR or 6318 FLOW_CONTROL_ERROR. 6320 19.9. MAX_DATA Frames 6322 A MAX_DATA frame (type=0x10) is used in flow control to inform the 6323 peer of the maximum amount of data that can be sent on the connection 6324 as a whole. 6326 MAX_DATA frames are formatted as shown in Figure 33. 6328 MAX_DATA Frame { 6329 Type (i) = 0x10, 6330 Maximum Data (i), 6331 } 6333 Figure 33: MAX_DATA Frame Format 6335 MAX_DATA frames contain the following field: 6337 Maximum Data: A variable-length integer indicating the maximum 6338 amount of data that can be sent on the entire connection, in units 6339 of bytes. 
6341 All data sent in STREAM frames counts toward this limit. The sum of 6342 the final sizes on all streams - including streams in terminal states 6343 - MUST NOT exceed the value advertised by a receiver. An endpoint 6344 MUST terminate a connection with a FLOW_CONTROL_ERROR error if it 6345 receives more data than the maximum data value that it has sent. 6346 This includes violations of remembered limits in Early Data; see 6347 Section 7.4.1. 6349 19.10. MAX_STREAM_DATA Frames 6351 A MAX_STREAM_DATA frame (type=0x11) is used in flow control to inform 6352 a peer of the maximum amount of data that can be sent on a stream. 6354 A MAX_STREAM_DATA frame can be sent for streams in the Recv state; 6355 see Section 3.1. Receiving a MAX_STREAM_DATA frame for a locally- 6356 initiated stream that has not yet been created MUST be treated as a 6357 connection error of type STREAM_STATE_ERROR. An endpoint that 6358 receives a MAX_STREAM_DATA frame for a receive-only stream MUST 6359 terminate the connection with error STREAM_STATE_ERROR. 6361 MAX_STREAM_DATA frames are formatted as shown in Figure 34. 6363 MAX_STREAM_DATA Frame { 6364 Type (i) = 0x11, 6365 Stream ID (i), 6366 Maximum Stream Data (i), 6367 } 6369 Figure 34: MAX_STREAM_DATA Frame Format 6371 MAX_STREAM_DATA frames contain the following fields: 6373 Stream ID: The stream ID of the stream that is affected encoded as a 6374 variable-length integer. 6376 Maximum Stream Data: A variable-length integer indicating the 6377 maximum amount of data that can be sent on the identified stream, 6378 in units of bytes. 6380 When counting data toward this limit, an endpoint accounts for the 6381 largest received offset of data that is sent or received on the 6382 stream. Loss or reordering can mean that the largest received offset 6383 on a stream can be greater than the total size of data received on 6384 that stream. Receiving STREAM frames might not increase the largest 6385 received offset. 
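The accounting rule above can be illustrated with a small, non-normative sketch of receive-side bookkeeping for one stream. The class name StreamReceiveState, its attributes, and the FlowControlError exception are hypothetical; only the rule that the limit is checked against the largest received offset is taken from the text.

```python
class FlowControlError(Exception):
    """Stands in for a connection error of type FLOW_CONTROL_ERROR."""


class StreamReceiveState:
    """Tracks the largest received offset against the advertised limit."""

    def __init__(self, max_stream_data: int) -> None:
        # Largest Maximum Stream Data value advertised to the peer so far.
        self.max_stream_data = max_stream_data
        # Largest offset (offset plus length) seen in any STREAM frame.
        self.largest_offset = 0

    def on_stream_frame(self, offset: int, length: int) -> int:
        """Account for a STREAM frame.

        Returns the amount by which the largest received offset advanced,
        which is 0 for frames that only fill in earlier data.
        """
        end = offset + length
        if end > self.max_stream_data:
            raise FlowControlError("stream data limit exceeded")
        advanced = max(0, end - self.largest_offset)
        self.largest_offset = max(self.largest_offset, end)
        return advanced
```

In this sketch, a frame covering bytes 50 to 59 that arrives before bytes 0 to 49 moves the largest received offset to 60 even though only 10 bytes have been received; the later arrival of bytes 0 to 49 leaves it unchanged.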
6387 The data sent on a stream MUST NOT exceed the largest maximum stream 6388 data value advertised by the receiver. An endpoint MUST terminate a 6389 connection with a FLOW_CONTROL_ERROR error if it receives more data 6390 than the largest maximum stream data that it has sent for the 6391 affected stream. This includes violations of remembered limits in 6392 Early Data; see Section 7.4.1. 6394 19.11. MAX_STREAMS Frames 6396 A MAX_STREAMS frame (type=0x12 or 0x13) informs the peer of the 6397 cumulative number of streams of a given type it is permitted to open. 6398 A MAX_STREAMS frame with a type of 0x12 applies to bidirectional 6399 streams, and a MAX_STREAMS frame with a type of 0x13 applies to 6400 unidirectional streams. 6402 MAX_STREAMS frames are formatted as shown in Figure 35. 6404 MAX_STREAMS Frame { 6405 Type (i) = 0x12..0x13, 6406 Maximum Streams (i), 6407 } 6409 Figure 35: MAX_STREAMS Frame Format 6411 MAX_STREAMS frames contain the following field: 6413 Maximum Streams: A count of the cumulative number of streams of the 6414 corresponding type that can be opened over the lifetime of the 6415 connection. This value cannot exceed 2^60, as it is not possible 6416 to encode stream IDs larger than 2^62-1. Receipt of a frame that 6417 permits opening of a stream larger than this limit MUST be treated 6418 as a connection error of type FRAME_ENCODING_ERROR. 6420 Loss or reordering can cause a MAX_STREAMS frame to be received that 6421 states a lower stream limit than an endpoint has previously received. 6422 MAX_STREAMS frames that do not increase the stream limit MUST be 6423 ignored. 6425 An endpoint MUST NOT open more streams than permitted by the current 6426 stream limit set by its peer. For instance, a server that receives a 6427 unidirectional stream limit of 3 is permitted to open streams 3, 7, 6428 and 11, but not stream 15. An endpoint MUST terminate a connection 6429 with a STREAM_LIMIT_ERROR error if a peer opens more streams than was 6430 permitted.
This includes violations of remembered limits in Early 6431 Data; see Section 7.4.1. 6433 Note that these frames (and the corresponding transport parameters) 6434 do not describe the number of streams that can be opened 6435 concurrently. The limit includes streams that have been closed as 6436 well as those that are open. 6438 19.12. DATA_BLOCKED Frames 6440 A sender SHOULD send a DATA_BLOCKED frame (type=0x14) when it wishes 6441 to send data, but is unable to do so due to connection-level flow 6442 control; see Section 4. DATA_BLOCKED frames can be used as input to 6443 tuning of flow control algorithms; see Section 4.2. 6445 DATA_BLOCKED frames are formatted as shown in Figure 36. 6447 DATA_BLOCKED Frame { 6448 Type (i) = 0x14, 6449 Maximum Data (i), 6450 } 6452 Figure 36: DATA_BLOCKED Frame Format 6454 DATA_BLOCKED frames contain the following field: 6456 Maximum Data: A variable-length integer indicating the connection- 6457 level limit at which blocking occurred. 6459 19.13. STREAM_DATA_BLOCKED Frames 6461 A sender SHOULD send a STREAM_DATA_BLOCKED frame (type=0x15) when it 6462 wishes to send data, but is unable to do so due to stream-level flow 6463 control. This frame is analogous to DATA_BLOCKED (Section 19.12). 6465 An endpoint that receives a STREAM_DATA_BLOCKED frame for a send-only 6466 stream MUST terminate the connection with error STREAM_STATE_ERROR. 6468 STREAM_DATA_BLOCKED frames are formatted as shown in Figure 37. 6470 STREAM_DATA_BLOCKED Frame { 6471 Type (i) = 0x15, 6472 Stream ID (i), 6473 Maximum Stream Data (i), 6474 } 6476 Figure 37: STREAM_DATA_BLOCKED Frame Format 6478 STREAM_DATA_BLOCKED frames contain the following fields: 6480 Stream ID: A variable-length integer indicating the stream that is 6481 blocked due to flow control. 6483 Maximum Stream Data: A variable-length integer indicating the offset 6484 of the stream at which the blocking occurred. 6486 19.14. 
STREAMS_BLOCKED Frames 6488 A sender SHOULD send a STREAMS_BLOCKED frame (type=0x16 or 0x17) when 6489 it wishes to open a stream, but is unable to do so due to the maximum 6490 stream limit set by its peer; see Section 19.11. A STREAMS_BLOCKED 6491 frame of type 0x16 is used to indicate reaching the bidirectional 6492 stream limit, and a STREAMS_BLOCKED frame of type 0x17 is used to 6493 indicate reaching the unidirectional stream limit. 6495 A STREAMS_BLOCKED frame does not open the stream, but informs the 6496 peer that a new stream was needed and the stream limit prevented the 6497 creation of the stream. 6499 STREAMS_BLOCKED frames are formatted as shown in Figure 38. 6501 STREAMS_BLOCKED Frame { 6502 Type (i) = 0x16..0x17, 6503 Maximum Streams (i), 6504 } 6506 Figure 38: STREAMS_BLOCKED Frame Format 6508 STREAMS_BLOCKED frames contain the following field: 6510 Maximum Streams: A variable-length integer indicating the maximum 6511 number of streams allowed at the time the frame was sent. This 6512 value cannot exceed 2^60, as it is not possible to encode stream 6513 IDs larger than 2^62-1. Receipt of a frame that encodes a larger 6514 stream ID MUST be treated as a connection error of type STREAM_LIMIT_ERROR or 6515 FRAME_ENCODING_ERROR. 6517 19.15. NEW_CONNECTION_ID Frames 6519 An endpoint sends a NEW_CONNECTION_ID frame (type=0x18) to provide 6520 its peer with alternative connection IDs that can be used to break 6521 linkability when migrating connections; see Section 9.5. 6523 NEW_CONNECTION_ID frames are formatted as shown in Figure 39. 6525 NEW_CONNECTION_ID Frame { 6526 Type (i) = 0x18, 6527 Sequence Number (i), 6528 Retire Prior To (i), 6529 Length (8), 6530 Connection ID (8..160), 6531 Stateless Reset Token (128), 6532 } 6534 Figure 39: NEW_CONNECTION_ID Frame Format 6536 NEW_CONNECTION_ID frames contain the following fields: 6538 Sequence Number: The sequence number assigned to the connection ID 6539 by the sender, encoded as a variable-length integer; see 6540 Section 5.1.1.
6542 Retire Prior To: A variable-length integer indicating which 6543 connection IDs should be retired; see Section 5.1.2. 6545 Length: An 8-bit unsigned integer containing the length of the 6546 connection ID. Values less than 1 and greater than 20 are invalid 6547 and MUST be treated as a connection error of type 6548 FRAME_ENCODING_ERROR. 6550 Connection ID: A connection ID of the specified length. 6552 Stateless Reset Token: A 128-bit value that will be used for a 6553 stateless reset when the associated connection ID is used; see 6554 Section 10.3. 6556 An endpoint MUST NOT send this frame if it currently requires that 6557 its peer send packets with a zero-length Destination Connection ID. 6558 Changing the length of a connection ID to or from zero-length makes 6559 it difficult to identify when the value of the connection ID changed. 6560 An endpoint that is sending packets with a zero-length Destination 6561 Connection ID MUST treat receipt of a NEW_CONNECTION_ID frame as a 6562 connection error of type PROTOCOL_VIOLATION. 6564 Transmission errors, timeouts and retransmissions might cause the 6565 same NEW_CONNECTION_ID frame to be received multiple times. Receipt 6566 of the same frame multiple times MUST NOT be treated as a connection 6567 error. A receiver can use the sequence number supplied in the 6568 NEW_CONNECTION_ID frame to handle receiving the same 6569 NEW_CONNECTION_ID frame multiple times. 6571 If an endpoint receives a NEW_CONNECTION_ID frame that repeats a 6572 previously issued connection ID with a different Stateless Reset 6573 Token or a different sequence number, or if a sequence number is used 6574 for different connection IDs, the endpoint MAY treat that receipt as 6575 a connection error of type PROTOCOL_VIOLATION. 6577 The Retire Prior To field applies to connection IDs established 6578 during connection setup and the preferred_address transport 6579 parameter; see Section 5.1.2. 
The Retire Prior To field MUST be less 6580 than or equal to the Sequence Number field. Receiving a value 6581 greater than the Sequence Number MUST be treated as a connection 6582 error of type FRAME_ENCODING_ERROR. 6584 Once a sender indicates a Retire Prior To value, smaller values sent 6585 in subsequent NEW_CONNECTION_ID frames have no effect. A receiver 6586 MUST ignore any Retire Prior To fields that do not increase the 6587 largest received Retire Prior To value. 6589 An endpoint that receives a NEW_CONNECTION_ID frame with a sequence 6590 number smaller than the Retire Prior To field of a previously 6591 received NEW_CONNECTION_ID frame MUST send a corresponding 6592 RETIRE_CONNECTION_ID frame that retires the newly received connection 6593 ID, unless it has already done so for that sequence number. 6595 19.16. RETIRE_CONNECTION_ID Frames 6597 An endpoint sends a RETIRE_CONNECTION_ID frame (type=0x19) to 6598 indicate that it will no longer use a connection ID that was issued 6599 by its peer. This includes the connection ID provided during the 6600 handshake. Sending a RETIRE_CONNECTION_ID frame also serves as a 6601 request to the peer to send additional connection IDs for future use; 6602 see Section 5.1. New connection IDs can be delivered to a peer using 6603 the NEW_CONNECTION_ID frame (Section 19.15). 6605 Retiring a connection ID invalidates the stateless reset token 6606 associated with that connection ID. 6608 RETIRE_CONNECTION_ID frames are formatted as shown in Figure 40. 6610 RETIRE_CONNECTION_ID Frame { 6611 Type (i) = 0x19, 6612 Sequence Number (i), 6613 } 6615 Figure 40: RETIRE_CONNECTION_ID Frame Format 6617 RETIRE_CONNECTION_ID frames contain the following field: 6619 Sequence Number: The sequence number of the connection ID being 6620 retired; see Section 5.1.2. 
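The following non-normative sketch illustrates one way a receiver might apply the Retire Prior To rules of Section 19.15 and produce the RETIRE_CONNECTION_ID frames described above. The class PeerConnectionIds, its method names, and the use of ValueError to stand in for a connection error are assumptions made for illustration only. The variable-length integer encoding follows Section 16, and the connection ID provided during the handshake is assumed to carry sequence number 0 (Section 5.1.1).

```python
def encode_varint(value: int) -> bytes:
    """Encode a QUIC variable-length integer (Section 16)."""
    if value < 0 or value > 2**62 - 1:
        raise ValueError("value out of range for a variable-length integer")
    if value < 2**6:
        return value.to_bytes(1, "big")
    if value < 2**14:
        return (value | 0x4000).to_bytes(2, "big")
    if value < 2**30:
        return (value | 0x80000000).to_bytes(4, "big")
    return (value | 0xC000000000000000).to_bytes(8, "big")


def encode_retire_connection_id(sequence_number: int) -> bytes:
    """Build a RETIRE_CONNECTION_ID frame: Type (0x19), Sequence Number."""
    return encode_varint(0x19) + encode_varint(sequence_number)


class PeerConnectionIds:
    """Hypothetical tracker for connection IDs issued by the peer."""

    def __init__(self, handshake_cid: bytes):
        self.active = {0: handshake_cid}  # sequence number -> connection ID
        self.retired = set()              # sequence numbers already retired
        self.retire_prior_to = 0          # largest Retire Prior To received

    def on_new_connection_id(self, seq: int, retire_prior_to: int, cid: bytes):
        """Process one NEW_CONNECTION_ID frame; return frames to send."""
        if not 1 <= len(cid) <= 20:
            raise ValueError("FRAME_ENCODING_ERROR: invalid Length")
        if retire_prior_to > seq:
            raise ValueError("FRAME_ENCODING_ERROR: Retire Prior To too large")
        # Values that do not increase the largest received value are ignored.
        self.retire_prior_to = max(self.retire_prior_to, retire_prior_to)
        # A repeated frame is not an error; this sketch keeps the first
        # connection ID seen for a sequence number and ignores repeats.
        if seq not in self.retired:
            self.active.setdefault(seq, cid)
        frames = []
        for s in sorted(self.active):
            if s < self.retire_prior_to:
                del self.active[s]
                self.retired.add(s)
                frames.append(encode_retire_connection_id(s))
        return frames
```

A frame that arrives late with a sequence number below the current Retire Prior To value is retired at once, and a repeated frame produces no second RETIRE_CONNECTION_ID frame for the same sequence number.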
6622 Receipt of a RETIRE_CONNECTION_ID frame containing a sequence number 6623 greater than any previously sent to the peer MUST be treated as a 6624 connection error of type PROTOCOL_VIOLATION. 6626 The sequence number specified in a RETIRE_CONNECTION_ID frame MUST 6627 NOT refer to the Destination Connection ID field of the packet in 6628 which the frame is contained. The peer MAY treat this as a 6629 connection error of type PROTOCOL_VIOLATION. 6631 An endpoint cannot send this frame if it was provided with a zero- 6632 length connection ID by its peer. An endpoint that provides a zero- 6633 length connection ID MUST treat receipt of a RETIRE_CONNECTION_ID 6634 frame as a connection error of type PROTOCOL_VIOLATION. 6636 19.17. PATH_CHALLENGE Frames 6638 Endpoints can use PATH_CHALLENGE frames (type=0x1a) to check 6639 reachability to the peer and for path validation during connection 6640 migration. 6642 PATH_CHALLENGE frames are formatted as shown in Figure 41. 6644 PATH_CHALLENGE Frame { 6645 Type (i) = 0x1a, 6646 Data (64), 6647 } 6649 Figure 41: PATH_CHALLENGE Frame Format 6651 PATH_CHALLENGE frames contain the following field: 6653 Data: This 8-byte field contains arbitrary data. 6655 Including 64 bits of entropy in a PATH_CHALLENGE frame ensures that 6656 it is easier to receive the packet than it is to guess the value 6657 correctly. 6659 The recipient of this frame MUST generate a PATH_RESPONSE frame 6660 (Section 19.18) containing the same Data. 6662 19.18. PATH_RESPONSE Frames 6664 A PATH_RESPONSE frame (type=0x1b) is sent in response to a 6665 PATH_CHALLENGE frame. 6667 PATH_RESPONSE frames are formatted as shown in Figure 42, which is 6668 identical to the PATH_CHALLENGE frame (Section 19.17). 
6670 PATH_RESPONSE Frame { 6671 Type (i) = 0x1b, 6672 Data (64), 6673 } 6675 Figure 42: PATH_RESPONSE Frame Format 6677 If the content of a PATH_RESPONSE frame does not match the content of 6678 a PATH_CHALLENGE frame previously sent by the endpoint, the endpoint 6679 MAY generate a connection error of type PROTOCOL_VIOLATION. 6681 19.19. CONNECTION_CLOSE Frames 6683 An endpoint sends a CONNECTION_CLOSE frame (type=0x1c or 0x1d) to 6684 notify its peer that the connection is being closed. The 6685 CONNECTION_CLOSE with a frame type of 0x1c is used to signal errors 6686 at only the QUIC layer, or the absence of errors (with the NO_ERROR 6687 code). The CONNECTION_CLOSE frame with a type of 0x1d is used to 6688 signal an error with the application that uses QUIC. 6690 If there are open streams that have not been explicitly closed, they 6691 are implicitly closed when the connection is closed. 6693 CONNECTION_CLOSE frames are formatted as shown in Figure 43. 6695 CONNECTION_CLOSE Frame { 6696 Type (i) = 0x1c..0x1d, 6697 Error Code (i), 6698 [Frame Type (i)], 6699 Reason Phrase Length (i), 6700 Reason Phrase (..), 6701 } 6703 Figure 43: CONNECTION_CLOSE Frame Format 6705 CONNECTION_CLOSE frames contain the following fields: 6707 Error Code: A variable-length integer error code that indicates the 6708 reason for closing this connection. A CONNECTION_CLOSE frame of 6709 type 0x1c uses codes from the space defined in Section 20.1. A 6710 CONNECTION_CLOSE frame of type 0x1d uses codes from the 6711 application protocol error code space; see Section 20.2. 6713 Frame Type: A variable-length integer encoding the type of frame 6714 that triggered the error. A value of 0 (equivalent to the mention 6715 of the PADDING frame) is used when the frame type is unknown. The 6716 application-specific variant of CONNECTION_CLOSE (type 0x1d) does 6717 not include this field. 6719 Reason Phrase Length: A variable-length integer specifying the 6720 length of the reason phrase in bytes. 
Because a CONNECTION_CLOSE 6721 frame cannot be split between packets, any limits on packet size 6722 will also limit the space available for a reason phrase. 6724 Reason Phrase: Additional diagnostic information for the closure. 6726 This can be zero length if the sender chooses not to give details 6727 beyond the Error Code. This SHOULD be a UTF-8 encoded string 6728 [RFC3629], though the frame does not carry information, such as 6729 language tags, that would aid comprehension by any entity other 6730 than the one that created the text. 6732 The application-specific variant of CONNECTION_CLOSE (type 0x1d) can 6733 only be sent using 0-RTT or 1-RTT packets; see Section 12.5. When an 6734 application wishes to abandon a connection during the handshake, an 6735 endpoint can send a CONNECTION_CLOSE frame (type 0x1c) with an error 6736 code of APPLICATION_ERROR in an Initial or a Handshake packet. 6738 19.20. HANDSHAKE_DONE Frames 6740 The server uses a HANDSHAKE_DONE frame (type=0x1e) to signal 6741 confirmation of the handshake to the client. 6743 HANDSHAKE_DONE frames are formatted as shown in Figure 44, which 6744 shows that HANDSHAKE_DONE frames have no content. 6746 HANDSHAKE_DONE Frame { 6747 Type (i) = 0x1e, 6748 } 6750 Figure 44: HANDSHAKE_DONE Frame Format 6752 A HANDSHAKE_DONE frame can only be sent by the server. Servers MUST 6753 NOT send a HANDSHAKE_DONE frame before completing the handshake. A 6754 server MUST treat receipt of a HANDSHAKE_DONE frame as a connection 6755 error of type PROTOCOL_VIOLATION. 6757 19.21. Extension Frames 6759 QUIC frames do not use a self-describing encoding. An endpoint 6760 therefore needs to understand the syntax of all frames before it can 6761 successfully process a packet. This allows for efficient encoding of 6762 frames, but it means that an endpoint cannot send a frame of a type 6763 that is unknown to its peer. 
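As a non-normative illustration of why a receiver needs to know the syntax of every frame type, the sketch below parses a packet payload that contains only a small subset of the frames defined in Section 19 (RETIRE_CONNECTION_ID, PATH_CHALLENGE, PATH_RESPONSE, CONNECTION_CLOSE, and HANDSHAKE_DONE). The names FrameError, decode_varint, take, and parse_frames are assumptions made for this example. Because frames carry no length prefix, the parser cannot skip a type it does not recognize; it reports FRAME_ENCODING_ERROR, as listed in Section 20.1. A complete implementation would recognize every frame type it has negotiated rather than this subset.

```python
FRAME_ENCODING_ERROR = 0x7  # transport error code from Section 20.1


class FrameError(Exception):
    """Hypothetical exception standing in for a connection error."""

    def __init__(self, code, message):
        super().__init__(message)
        self.code = code


def decode_varint(buf, pos):
    """Decode a variable-length integer (Section 16); return (value, pos)."""
    if pos >= len(buf):
        raise FrameError(FRAME_ENCODING_ERROR, "truncated integer")
    length = 1 << (buf[pos] >> 6)  # two-bit prefix selects 1, 2, 4, or 8 bytes
    if pos + length > len(buf):
        raise FrameError(FRAME_ENCODING_ERROR, "truncated integer")
    value = buf[pos] & 0x3F
    for byte in buf[pos + 1:pos + length]:
        value = (value << 8) | byte
    return value, pos + length


def take(buf, pos, count):
    """Return (count bytes starting at pos, new pos) or fail if truncated."""
    if pos + count > len(buf):
        raise FrameError(FRAME_ENCODING_ERROR, "truncated frame")
    return buf[pos:pos + count], pos + count


def parse_frames(payload):
    """Parse a payload into a list of tuples, one per frame."""
    frames = []
    pos = 0
    while pos < len(payload):
        frame_type, pos = decode_varint(payload, pos)
        if frame_type == 0x19:
            seq, pos = decode_varint(payload, pos)
            frames.append(("RETIRE_CONNECTION_ID", seq))
        elif frame_type in (0x1A, 0x1B):
            data, pos = take(payload, pos, 8)  # Data (64)
            name = "PATH_CHALLENGE" if frame_type == 0x1A else "PATH_RESPONSE"
            frames.append((name, data))
        elif frame_type in (0x1C, 0x1D):
            error_code, pos = decode_varint(payload, pos)
            trigger = None
            if frame_type == 0x1C:  # only the transport variant has Frame Type
                trigger, pos = decode_varint(payload, pos)
            reason_len, pos = decode_varint(payload, pos)
            reason, pos = take(payload, pos, reason_len)
            frames.append(("CONNECTION_CLOSE", error_code, trigger, reason))
        elif frame_type == 0x1E:
            frames.append(("HANDSHAKE_DONE",))  # no content after the type
        else:
            # The length of an unknown frame cannot be determined.
            raise FrameError(FRAME_ENCODING_ERROR,
                             f"unknown frame type {frame_type:#x}")
    return frames
```

The CONNECTION_CLOSE branch shows the point concretely: the same field list is parsed differently depending on the frame type, which a receiver can do only if it already knows both variants.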
6765 An extension to QUIC that wishes to use a new type of frame MUST 6766 first ensure that a peer is able to understand the frame. An 6767 endpoint can use a transport parameter to signal its willingness to 6768 receive extension frame types. One transport parameter can indicate 6769 support for one or more extension frame types. 6771 Extensions that modify or replace core protocol functionality 6772 (including frame types) will be difficult to combine with other 6773 extensions that modify or replace the same functionality unless the 6774 behavior of the combination is explicitly defined. Such extensions 6775 SHOULD define their interaction with previously-defined extensions 6776 modifying the same protocol components. 6778 Extension frames MUST be congestion controlled and MUST cause an ACK 6779 frame to be sent. The exception is extension frames that replace or 6780 supplement the ACK frame. Extension frames are not included in flow 6781 control unless specified in the extension. 6783 An IANA registry is used to manage the assignment of frame types; see 6784 Section 22.4. 6786 20. Error Codes 6788 QUIC transport error codes and application error codes are 62-bit 6789 unsigned integers. 6791 20.1. Transport Error Codes 6793 This section lists the defined QUIC transport error codes that can be 6794 used in a CONNECTION_CLOSE frame with a type of 0x1c. These errors 6795 apply to the entire connection. 6797 NO_ERROR (0x0): An endpoint uses this with CONNECTION_CLOSE to 6798 signal that the connection is being closed abruptly in the absence 6799 of any error. 6801 INTERNAL_ERROR (0x1): The endpoint encountered an internal error and 6802 cannot continue with the connection. 6804 CONNECTION_REFUSED (0x2): The server refused to accept a new 6805 connection. 6807 FLOW_CONTROL_ERROR (0x3): An endpoint received more data than it 6808 permitted in its advertised data limits; see Section 4. 
6810 STREAM_LIMIT_ERROR (0x4): An endpoint received a frame for a stream 6811 identifier that exceeded its advertised stream limit for the 6812 corresponding stream type. 6814 STREAM_STATE_ERROR (0x5): An endpoint received a frame for a stream 6815 that was not in a state that permitted that frame; see Section 3. 6817 FINAL_SIZE_ERROR (0x6): An endpoint received a STREAM frame 6818 containing data that exceeded the previously established final 6819 size; an endpoint received a STREAM frame or a RESET_STREAM 6820 frame containing a final size that was lower than the size of 6821 stream data that was already received; or an endpoint received a 6822 STREAM frame or a RESET_STREAM frame containing a different final 6823 size to the one already established. 6825 FRAME_ENCODING_ERROR (0x7): An endpoint received a frame that was 6826 badly formatted, for instance, a frame of an unknown type or an 6827 ACK frame that has more acknowledgment ranges than the remainder 6828 of the packet could carry. 6830 TRANSPORT_PARAMETER_ERROR (0x8): An endpoint received transport 6831 parameters that were badly formatted, included an invalid value, 6832 omitted a mandatory transport parameter, included a forbidden 6833 transport parameter, or were otherwise in error. 6835 CONNECTION_ID_LIMIT_ERROR (0x9): The number of connection IDs 6836 provided by the peer exceeds the advertised 6837 active_connection_id_limit. 6839 PROTOCOL_VIOLATION (0xa): An endpoint detected an error with 6840 protocol compliance that was not covered by more specific error 6841 codes. 6843 INVALID_TOKEN (0xb): A server received a client Initial that 6844 contained an invalid Token field. 6846 APPLICATION_ERROR (0xc): The application or application protocol 6847 caused the connection to be closed. 6849 CRYPTO_BUFFER_EXCEEDED (0xd): An endpoint has received more data in 6850 CRYPTO frames than it can buffer.
6852 KEY_UPDATE_ERROR (0xe): An endpoint detected errors in performing 6853 key updates; see Section 6 of [QUIC-TLS]. 6855 AEAD_LIMIT_REACHED (0xf): An endpoint has reached the 6856 confidentiality or integrity limit for the AEAD algorithm used by 6857 the given connection. 6859 NO_VIABLE_PATH (0x10): An endpoint has determined that the network 6860 path is incapable of supporting QUIC. An endpoint is unlikely to 6861 receive CONNECTION_CLOSE carrying this code except when the path 6862 does not support a large enough MTU. 6864 CRYPTO_ERROR (0x1XX): The cryptographic handshake failed. A range 6865 of 256 values is reserved for carrying error codes specific to the 6866 cryptographic handshake that is used. Codes for errors occurring 6867 when TLS is used for the crypto handshake are described in 6868 Section 4.8 of [QUIC-TLS]. 6870 See Section 22.5 for details of registering new error codes. 6872 In defining these error codes, several principles are applied. Error 6873 conditions that might require specific action on the part of a 6874 recipient are given unique codes. Errors that represent common 6875 conditions are given specific codes. Absent either of these 6876 conditions, error codes are used to identify a general function of 6877 the stack, like flow control or transport parameter handling. 6878 Finally, generic errors are provided for conditions where 6879 implementations are unable or unwilling to use more specific codes. 6881 20.2. Application Protocol Error Codes 6883 The management of application error codes is left to application 6884 protocols. Application protocol error codes are used for the 6885 RESET_STREAM frame (Section 19.4), the STOP_SENDING frame 6886 (Section 19.5), and the CONNECTION_CLOSE frame with a type of 0x1d 6887 (Section 19.19). 6889 21. Security Considerations 6891 The goal of QUIC is to provide a secure transport connection. 
6892 Section 21.1 provides an overview of those properties; subsequent 6893 sections discuss constraints and caveats regarding these properties, 6894 including descriptions of known attacks and countermeasures. 6896 21.1. Overview of Security Properties 6898 A complete security analysis of QUIC is outside the scope of this 6899 document. This section provides an informal description of the 6900 desired security properties as an aid to implementors and to help 6901 guide protocol analysis. 6903 QUIC assumes the threat model described in [SEC-CONS] and provides 6904 protections against many of the attacks that arise from that model. 6906 For this purpose, attacks are divided into passive and active 6907 attacks. Passive attackers have the capability to read packets from 6908 the network, while active attackers also have the capability to write 6909 packets into the network. However, a passive attack could involve an 6910 attacker with the ability to cause a routing change or other 6911 modification in the path taken by packets that comprise a connection. 6913 Attackers are additionally categorized as either on-path attackers or 6914 off-path attackers. An on-path attacker can read, modify, or remove 6915 any packet it observes such that it no longer reaches its 6916 destination, while an off-path attacker observes the packets, but 6917 cannot prevent the original packet from reaching its intended 6918 destination. Both types of attackers can also transmit arbitrary 6919 packets. This definition differs from that of Section 3.5 of 6920 [SEC-CONS] in that an off-path attacker is able to observe packets. 6922 Properties of the handshake, protected packets, and connection 6923 migration are considered separately. 6925 21.1.1. Handshake 6927 The QUIC handshake incorporates the TLS 1.3 handshake and inherits 6928 the cryptographic properties described in Appendix E.1 of [TLS13]. 
6929 Many of the security properties of QUIC depend on the TLS handshake 6930 providing these properties. Any attack on the TLS handshake could 6931 affect QUIC. 6933 Any attack on the TLS handshake that compromises the secrecy or 6934 uniqueness of session keys, or the authentication of the 6935 participating peers, affects other security guarantees provided by 6936 QUIC that depend on those keys. For instance, migration (Section 9) 6937 depends on the efficacy of confidentiality protections, both for the 6938 negotiation of keys using the TLS handshake and for QUIC packet 6939 protection, to avoid linkability across network paths. 6941 An attack on the integrity of the TLS handshake might allow an 6942 attacker to affect the selection of application protocol or QUIC 6943 version. 6945 In addition to the properties provided by TLS, the QUIC handshake 6946 provides some defense against DoS attacks on the handshake. 6948 21.1.1.1. Anti-Amplification 6950 Address validation (Section 8) is used to verify that an entity that 6951 claims a given address is able to receive packets at that address. 6952 Address validation limits amplification attack targets to addresses 6953 for which an attacker can observe packets. 6955 Prior to address validation, endpoints are limited in what they are 6956 able to send. Endpoints cannot send data toward an unvalidated 6957 address in excess of three times the data received from that address. 6959 Note: The anti-amplification limit only applies when an endpoint 6960 responds to packets received from an unvalidated address. The 6961 anti-amplification limit does not apply to clients when 6962 establishing a new connection or when initiating connection 6963 migration. 6965 21.1.1.2. Server-Side DoS 6967 Computing the server's first flight for a full handshake is 6968 potentially expensive, requiring both a signature and a key exchange 6969 computation. 
In order to prevent computational DoS attacks, the 6970 Retry packet provides a cheap token exchange mechanism that allows 6971 servers to validate a client's IP address prior to doing any 6972 expensive computations at the cost of a single round trip. After a 6973 successful handshake, servers can issue new tokens to a client, which 6974 will allow new connection establishment without incurring this cost. 6976 21.1.1.3. On-Path Handshake Termination 6978 An on-path or off-path attacker can force a handshake to fail by 6979 replacing or racing Initial packets. Once valid Initial packets have 6980 been exchanged, subsequent Handshake packets are protected with the 6981 handshake keys and an on-path attacker cannot force handshake failure 6982 other than by dropping packets to cause endpoints to abandon the 6983 attempt. 6985 An on-path attacker can also replace the addresses of packets on 6986 either side and therefore cause the client or server to have an 6987 incorrect view of the remote addresses. Such an attack is 6988 indistinguishable from the functions performed by a NAT. 6990 21.1.1.4. Parameter Negotiation 6992 The entire handshake is cryptographically protected, with the Initial 6993 packets being encrypted with per-version keys and the Handshake and 6994 later packets being encrypted with keys derived from the TLS key 6995 exchange. Further, parameter negotiation is folded into the TLS 6996 transcript and thus provides the same integrity guarantees as 6997 ordinary TLS negotiation. An attacker can observe the client's 6998 transport parameters (as long as it knows the version-specific salt) 6999 but cannot observe the server's transport parameters and cannot 7000 influence parameter negotiation. 7002 Connection IDs are unencrypted but integrity protected in all 7003 packets. 7005 This version of QUIC does not incorporate a version negotiation 7006 mechanism; implementations of incompatible versions will simply fail 7007 to establish a connection. 
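The anti-amplification limit described in Section 21.1.1.1 can be sketched as a small per-address byte counter. This is a non-normative illustration; the class AmplificationLimiter and its method names are assumptions made for this example, and the sketch counts datagram sizes in bytes without modeling how an implementation attributes bytes to an address.

```python
class AmplificationLimiter:
    """Hypothetical per-address accounting for the anti-amplification limit."""

    FACTOR = 3  # at most three times the data received, per Section 21.1.1.1

    def __init__(self):
        self.bytes_received = 0
        self.bytes_sent = 0
        self.validated = False

    def on_datagram_received(self, size: int) -> None:
        """Record bytes received from the (still unvalidated) address."""
        self.bytes_received += size

    def on_address_validated(self) -> None:
        """Address validation (Section 8) lifts the limit."""
        self.validated = True

    def can_send(self, size: int) -> bool:
        """Return True if sending size more bytes stays within the limit."""
        if self.validated:
            return True
        return self.bytes_sent + size <= self.FACTOR * self.bytes_received

    def on_datagram_sent(self, size: int) -> None:
        """Record bytes sent; refuse to exceed the limit."""
        if not self.can_send(size):
            raise RuntimeError("anti-amplification limit would be exceeded")
        self.bytes_sent += size
```

For example, after receiving a single 1200-byte datagram from an unvalidated address, this limiter permits up to 3600 bytes to be sent toward that address and blocks further sending until more data arrives or the address is validated.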
7009 21.1.2. Protected Packets 7011 Packet protection (Section 12.1) applies authenticated encryption to 7012 all packets except Version Negotiation packets, though Initial and 7013 Retry packets have limited protection due to the use of version- 7014 specific keying material; see [QUIC-TLS] for more details. This 7015 section considers passive and active attacks against protected 7016 packets. 7018 Both on-path and off-path attackers can mount a passive attack in 7019 which they save observed packets for an offline attack against packet 7020 protection at a future time; this is true for any observer of any 7021 packet on any network. 7023 A blind attacker, one who injects packets without being able to 7024 observe valid packets for a connection, is unlikely to be successful, 7025 since packet protection ensures that valid packets are only generated 7026 by endpoints that possess the key material established during the 7027 handshake; see Section 7 and Section 21.1.1. Similarly, any active 7028 attacker that observes packets and attempts to insert new data or 7029 modify existing data in those packets should not be able to generate 7030 packets deemed valid by the receiving endpoint, other than Initial 7031 packets. 7033 A spoofing attack, in which an active attacker rewrites unprotected 7034 parts of a packet that it forwards or injects, such as the source or 7035 destination address, is only effective if the attacker can forward 7036 packets to the original endpoint. Packet protection ensures that the 7037 packet payloads can only be processed by the endpoints that completed 7038 the handshake, and invalid packets are ignored by those endpoints. 7040 An attacker can also modify the boundaries between packets and UDP 7041 datagrams, causing multiple packets to be coalesced into a single 7042 datagram, or splitting coalesced packets into multiple datagrams. 
7043 Aside from datagrams containing Initial packets, which require 7044 padding, modification of how packets are arranged in datagrams has no 7045 functional effect on a connection, although it might change some 7046 performance characteristics. 7048 21.1.3. Connection Migration 7050 Connection migration (Section 9) provides endpoints with the ability 7051 to transition between IP addresses and ports on multiple paths, using 7052 one path at a time for transmission and receipt of non-probing 7053 frames. Path validation (Section 8.2) establishes that a peer is 7054 both willing and able to receive packets sent on a particular path. 7055 This helps reduce the effects of address spoofing by limiting the 7056 number of packets sent to a spoofed address. 7058 This section describes the intended security properties of connection 7059 migration under various types of DoS attacks. 7061 21.1.3.1. On-Path Active Attacks 7063 An attacker that can cause a packet it observes to no longer reach 7064 its intended destination is considered an on-path attacker. When an 7065 attacker is present between a client and server, endpoints are 7066 required to send packets through the attacker to establish 7067 connectivity on a given path. 7069 An on-path attacker can: 7071 * Inspect packets 7073 * Modify IP and UDP packet headers 7075 * Inject new packets 7077 * Delay packets 7079 * Reorder packets 7081 * Drop packets 7083 * Split and merge datagrams along packet boundaries 7085 An on-path attacker cannot: 7087 * Modify an authenticated portion of a packet and cause the 7088 recipient to accept that packet 7090 An on-path attacker has the opportunity to modify the packets that it 7091 observes; however, any modifications to an authenticated portion of a 7092 packet will cause it to be dropped by the receiving endpoint as 7093 invalid, as packet payloads are both authenticated and encrypted.
7095 In the presence of an on-path attacker, QUIC aims to provide the 7096 following properties: 7098 1. An on-path attacker can prevent use of a path for a connection, 7099 causing the connection to fail if it cannot use a different path 7100 that does not contain the attacker. This can be achieved by 7101 dropping all packets, modifying them so that they fail to 7102 decrypt, or other methods. 7104 2. An on-path attacker can prevent migration to a new path for which 7105 the attacker is also on-path by causing path validation to fail 7106 on the new path. 7108 3. An on-path attacker cannot prevent a client from migrating to a 7109 path for which the attacker is not on-path. 7111 4. An on-path attacker can reduce the throughput of a connection by 7112 delaying packets or dropping them. 7114 5. An on-path attacker cannot cause an endpoint to accept a packet 7115 for which it has modified an authenticated portion of that 7116 packet. 7118 21.1.3.2. Off-Path Active Attacks 7120 An off-path attacker is not directly on the path between a client and 7121 server, but could be able to obtain copies of some or all packets 7122 sent between the client and the server. It is also able to send 7123 copies of those packets to either endpoint. 7125 An off-path attacker can: 7127 * Inspect packets 7129 * Inject new packets 7131 * Reorder injected packets 7133 An off-path attacker cannot: 7135 * Modify packets sent by endpoints 7137 * Delay packets 7139 * Drop packets 7141 * Reorder original packets 7143 An off-path attacker can create modified copies of packets that it 7144 has observed and inject those copies into the network, potentially 7145 with spoofed source and destination addresses. 7147 For the purposes of this discussion, it is assumed that an off-path 7148 attacker has the ability to inject a modified copy of a packet into 7149 the network that will reach the destination endpoint prior to the 7150 arrival of the original packet observed by the attacker. 
In other 7151 words, an attacker has the ability to consistently "win" a race with 7152 the legitimate packets between the endpoints, potentially causing the 7153 original packet to be ignored by the recipient. 7155 It is also assumed that an attacker has the resources necessary to 7156 affect NAT state, potentially both causing an endpoint to lose its 7157 NAT binding, and an attacker to obtain the same port for use with its 7158 traffic. 7160 In the presence of an off-path attacker, QUIC aims to provide the 7161 following properties: 7163 1. An off-path attacker can race packets and attempt to become a 7164 "limited" on-path attacker. 7166 2. An off-path attacker can cause path validation to succeed for 7167 forwarded packets with the source address listed as the off-path 7168 attacker as long as it can provide improved connectivity between 7169 the client and the server. 7171 3. An off-path attacker cannot cause a connection to close once the 7172 handshake has completed. 7174 4. An off-path attacker cannot cause migration to a new path to fail 7175 if it cannot observe the new path. 7177 5. An off-path attacker can become a limited on-path attacker during 7178 migration to a new path for which it is also an off-path 7179 attacker. 7181 6. An off-path attacker can become a limited on-path attacker by 7182 affecting shared NAT state such that it sends packets to the 7183 server from the same IP address and port that the client 7184 originally used. 7186 21.1.3.3. Limited On-Path Active Attacks 7188 A limited on-path attacker is an off-path attacker that has offered 7189 improved routing of packets by duplicating and forwarding original 7190 packets between the server and the client, causing those packets to 7191 arrive before the original copies such that the original packets are 7192 dropped by the destination endpoint. 
7194 A limited on-path attacker differs from an on-path attacker in that 7195 it is not on the original path between endpoints, and therefore the 7196 original packets sent by an endpoint are still reaching their 7197 destination. This means that a future failure to route copied 7198 packets to the destination faster than their original path will not 7199 prevent the original packets from reaching the destination. 7201 A limited on-path attacker can: 7203 * Inspect packets 7205 * Inject new packets 7207 * Modify unencrypted packet headers 7209 * Reorder packets 7211 A limited on-path attacker cannot: 7213 * Delay packets so that they arrive later than packets sent on the 7214 original path 7216 * Drop packets 7218 * Modify the authenticated and encrypted portion of a packet and 7219 cause the recipient to accept that packet 7221 A limited on-path attacker can only delay packets up to the point 7222 that the original packets arrive before the duplicate packets, 7223 meaning that it cannot offer routing with worse latency than the 7224 original path. If a limited on-path attacker drops packets, the 7225 original copy will still arrive at the destination endpoint. 7227 In the presence of a limited on-path attacker, QUIC aims to provide 7228 the following properties: 7230 1. A limited on-path attacker cannot cause a connection to close 7231 once the handshake has completed. 7233 2. A limited on-path attacker cannot cause an idle connection to 7234 close if the client is first to resume activity. 7236 3. A limited on-path attacker can cause an idle connection to be 7237 deemed lost if the server is the first to resume activity. 7239 Note that these guarantees are the same guarantees provided for any 7240 NAT, for the same reasons. 7242 21.2. Handshake Denial of Service 7244 As an encrypted and authenticated transport, QUIC provides a range of 7245 protections against denial of service.
Once the cryptographic 7246 handshake is complete, QUIC endpoints discard most packets that are 7247 not authenticated, greatly limiting the ability of an attacker to 7248 interfere with existing connections. 7250 Once a connection is established, QUIC endpoints might accept some 7251 unauthenticated ICMP packets (see Section 14.2.1), but the use of 7252 these packets is extremely limited. The only other type of packet 7253 that an endpoint might accept is a stateless reset (Section 10.3), 7254 which relies on the token being kept secret until it is used. 7256 During the creation of a connection, QUIC only provides protection 7257 against attack from off the network path. All QUIC packets contain 7258 proof that the recipient saw a preceding packet from its peer. 7260 Addresses cannot change during the handshake, so endpoints can 7261 discard packets that are received on a different network path. 7263 The Source and Destination Connection ID fields are the primary means 7264 of protection against off-path attack during the handshake; see 7265 Section 8.1. These are required to match those set by a peer. 7266 Except for Initial and stateless reset packets, an endpoint only 7267 accepts packets that include a Destination Connection ID field that 7268 matches a value the endpoint previously chose. This is the only 7269 protection offered for Version Negotiation packets. 7271 The Destination Connection ID field in an Initial packet is selected 7272 by a client to be unpredictable, which serves an additional purpose. 7273 The packets that carry the cryptographic handshake are protected with 7274 a key that is derived from this connection ID and a salt specific to 7275 the QUIC version. This allows endpoints to use the same process for 7276 authenticating packets that they receive as they use after the 7277 cryptographic handshake completes. Packets that cannot be 7278 authenticated are discarded.
Protecting packets in this fashion 7279 provides a strong assurance that the sender of the packet saw the 7280 Initial packet and understood it. 7282 These protections are not intended to be effective against an 7283 attacker that is able to receive QUIC packets prior to the connection 7284 being established. Such an attacker can potentially send packets 7285 that will be accepted by QUIC endpoints. This version of QUIC 7286 attempts to detect this sort of attack, but it expects that endpoints 7287 will fail to establish a connection rather than recovering. For the 7288 most part, the cryptographic handshake protocol [QUIC-TLS] is 7289 responsible for detecting tampering during the handshake. 7291 Endpoints are permitted to use other methods to detect and attempt to 7292 recover from interference with the handshake. Invalid packets can be 7293 identified and discarded using other methods, but no specific method 7294 is mandated in this document. 7296 21.3. Amplification Attack 7298 An attacker might be able to receive an address validation token 7299 (Section 8) from a server and then release the IP address it used to 7300 acquire that token. At a later time, the attacker can initiate a 7301 0-RTT connection with a server by spoofing this same address, which 7302 might now address a different (victim) endpoint. The attacker can 7303 thus potentially cause the server to send an initial congestion 7304 window's worth of data towards the victim. 7306 Servers SHOULD provide mitigations for this attack by limiting the 7307 usage and lifetime of address validation tokens; see Section 8.1.3. 7309 21.4. Optimistic ACK Attack 7311 An endpoint that acknowledges packets it has not received might cause 7312 a congestion controller to permit sending at rates beyond what the 7313 network supports. An endpoint MAY skip packet numbers when sending 7314 packets to detect this behavior. 
An endpoint can then immediately
7315 close the connection with a connection error of type
7316 PROTOCOL_VIOLATION; see Section 10.2.

7318 21.5. Request Forgery Attacks

7320 A request forgery attack occurs where an endpoint causes its peer to
7321 issue a request towards a victim, with the request controlled by the
7322 endpoint. Request forgery attacks aim to provide an attacker with
7323 access to capabilities of its peer that might otherwise be
7324 unavailable to the attacker. For a networking protocol, a request
7325 forgery attack is often used to exploit any implicit authorization
7326 conferred on the peer by the victim due to the peer's location in the
7327 network.

7329 For request forgery to be effective, an attacker needs to be able to
7330 influence what packets the peer sends and where these packets are
7331 sent. If an attacker can target a vulnerable service with a
7332 controlled payload, that service might perform actions that are
7333 attributed to the attacker's peer, but decided by the attacker.

7335 For example, cross-site request forgery [CSRF] exploits on the Web
7336 cause a client to issue requests that include authorization cookies
7337 [COOKIE], allowing one site access to information and actions that
7338 are intended to be restricted to a different site.

7340 As QUIC runs over UDP, the primary attack modality of concern is one
7341 where an attacker can select the address to which its peer sends UDP
7342 datagrams and can control some of the unprotected content of those
7343 packets. As much of the data sent by QUIC endpoints is protected,
7344 this includes control over ciphertext. An attack is successful if an
7345 attacker can cause a peer to send a UDP datagram to a host that will
7346 perform some action based on content in the datagram.

7348 This section discusses ways in which QUIC might be used for request
7349 forgery attacks.

7351 This section also describes limited countermeasures that can be
7352 implemented by QUIC endpoints. These mitigations can be employed
7353 unilaterally by a QUIC implementation or deployment, without
7354 potential targets for request forgery attacks taking action. However,
7355 these countermeasures could be insufficient if UDP-based services do
7356 not properly authorize requests.

7358 Because the migration attack described in Section 21.5.4 is quite
7359 powerful and does not have adequate countermeasures, QUIC server
7360 implementations should assume that attackers can cause them to
7361 generate arbitrary UDP payloads to arbitrary destinations. QUIC
7362 servers SHOULD NOT be deployed in networks that do not deploy ingress
7363 filtering [BCP38] and also have inadequately secured UDP endpoints.

7365 Although it is not generally possible to ensure that clients are not
7366 co-located with vulnerable endpoints, this version of QUIC does not
7367 allow servers to migrate, thus preventing spoofed migration attacks
7368 on clients. Any future extension that allows server migration MUST
7369 also define countermeasures for forgery attacks.

7371 21.5.1. Control Options for Endpoints

7373 QUIC offers some opportunities for an attacker to influence or
7374 control where its peer sends UDP datagrams:

7376 * initial connection establishment (Section 7), where a server is
7377 able to choose where a client sends datagrams, for example by
7378 populating DNS records;

7380 * preferred addresses (Section 9.6), where a server is able to
7381 choose where a client sends datagrams;

7383 * spoofed connection migrations (Section 9.3.1), where a client is
7384 able to use source address spoofing to select where a server sends
7385 subsequent datagrams; and

7387 * spoofed packets that cause a server to send a Version Negotiation
7388 packet (Section 21.5.5).

7390 In all cases, the attacker can cause its peer to send datagrams to a
7391 victim that might not understand QUIC.
That is, these packets are
7392 sent by the peer prior to address validation; see Section 8.

7394 Outside of the encrypted portion of packets, QUIC offers an endpoint
7395 several options for controlling the content of UDP datagrams that its
7396 peer sends. The Destination Connection ID field offers direct
7397 control over bytes that appear early in packets sent by the peer; see
7398 Section 5.1. The Token field in Initial packets offers a server
7399 control over other bytes of Initial packets; see Section 17.2.2.

7401 There are no measures in this version of QUIC to prevent indirect
7402 control over the encrypted portions of packets. It is necessary to
7403 assume that endpoints are able to control the contents of frames that
7404 a peer sends, especially those frames that convey application data,
7405 such as STREAM frames. Though this depends to some degree on details
7406 of the application protocol, some control is possible in many
7407 protocol usage contexts. As the attacker has access to packet
7408 protection keys, they are likely to be capable of predicting how a
7409 peer will encrypt future packets. Successful control over datagram
7410 content then only requires that the attacker be able to predict the
7411 packet number and placement of frames in packets with some amount of
7412 reliability.

7414 This section assumes that limiting control over datagram content is
7415 not feasible. The focus of the mitigations in subsequent sections is
7416 on limiting the ways in which datagrams that are sent prior to
7417 address validation can be used for request forgery.

7419 21.5.2. Request Forgery with Client Initial Packets

7421 An attacker acting as a server can choose the IP address and port on
7422 which it advertises its availability, so Initial packets from clients
7423 are assumed to be available for use in this sort of attack.
The
7424 address validation implicit in the handshake ensures that - for a new
7425 connection - a client will not send other types of packet to a
7426 destination that does not understand QUIC or is not willing to accept
7427 a QUIC connection.

7429 Initial packet protection (Section 5.2 of [QUIC-TLS]) makes it
7430 difficult for servers to control the content of Initial packets sent
7431 by clients. A client choosing an unpredictable Destination
7432 Connection ID ensures that servers are unable to control any of the
7433 encrypted portion of Initial packets from clients.

7435 However, the Token field is open to server control and does allow a
7436 server to use clients to mount request forgery attacks. Use of
7437 tokens provided with the NEW_TOKEN frame (Section 8.1.3) offers the
7438 only option for request forgery during connection establishment.

7440 Clients, however, are not obligated to use the NEW_TOKEN frame.
7441 Request forgery attacks that rely on the Token field can be avoided
7442 if clients send an empty Token field when the server address has
7443 changed from when the NEW_TOKEN frame was received.

7445 Clients could avoid using NEW_TOKEN if the server address changes.
7446 However, not including a Token field could adversely affect
7447 performance. Servers could rely on NEW_TOKEN to enable sending of
7448 data in excess of the three-times limit on sending data; see
7449 Section 8.1. In particular, this affects cases where clients use
7450 0-RTT to request data from servers.

7452 Sending a Retry packet (Section 17.2.5) offers a server the option to
7453 change the Token field. After sending a Retry, the server can also
7454 control the Destination Connection ID field of subsequent Initial
7455 packets from the client. This also might allow indirect control over
7456 the encrypted content of Initial packets.
However, the exchange of a
7457 Retry packet validates the server's address, thereby preventing the
7458 use of subsequent Initial packets for request forgery.

7460 21.5.3. Request Forgery with Preferred Addresses

7462 Servers can specify a preferred address, which clients then migrate
7463 to after confirming the handshake; see Section 9.6. The Destination
7464 Connection ID field of packets that the client sends to a preferred
7465 address can be used for request forgery.

7467 A client MUST NOT send non-probing frames to a preferred address
7468 prior to validating that address; see Section 8. This greatly
7469 reduces the options that a server has to control the encrypted
7470 portion of datagrams.

7472 This document does not offer any additional countermeasures that are
7473 specific to use of preferred addresses and can be implemented by
7474 endpoints. The generic measures described in Section 21.5.6 could be
7475 used as further mitigation.

7477 21.5.4. Request Forgery with Spoofed Migration

7479 Clients are able to present a spoofed source address as part of an
7480 apparent connection migration to cause a server to send datagrams to
7481 that address.

7483 The Destination Connection ID field in any packets that a server
7484 subsequently sends to this spoofed address can be used for request
7485 forgery. A client might also be able to influence the ciphertext.

7487 A server that only sends probing packets (Section 9.1) to an address
7488 prior to address validation provides an attacker with only limited
7489 control over the encrypted portion of datagrams. However,
7490 particularly for NAT rebinding, this can adversely affect
7491 performance. If the server sends frames carrying application data,
7492 an attacker might be able to control most of the content of
7493 datagrams.

7495 This document does not offer specific countermeasures that can be
7496 implemented by endpoints aside from the generic measures described in
7497 Section 21.5.6.
However, countermeasures for address spoofing at the
7498 network level, in particular ingress filtering [BCP38], are
7499 especially effective against attacks that use spoofing and originate
7500 from an external network.

7502 21.5.5. Request Forgery with Version Negotiation

7504 Clients that are able to present a spoofed source address on a packet
7505 can cause a server to send a Version Negotiation packet
7506 (Section 17.2.1) to that address.

7508 The absence of size restrictions on the connection ID fields for
7509 packets of an unknown version increases the amount of data that the
7510 client controls from the resulting datagram. The first byte of this
7511 packet is not under client control and the next four bytes are zero,
7512 but the client is able to control up to 512 bytes starting from the
7513 fifth byte.

7515 No specific countermeasures are provided for this attack, though
7516 generic protections (Section 21.5.6) could apply. In this case,
7517 ingress filtering [BCP38] is also effective.

7519 21.5.6. Generic Request Forgery Countermeasures

7521 The most effective defense against request forgery attacks is to
7522 modify vulnerable services to use strong authentication. However,
7523 this is not always something that is within the control of a QUIC
7524 deployment. This section outlines some other steps that QUIC
7525 endpoints could take unilaterally. These additional steps are all
7526 discretionary as, depending on circumstances, they could interfere
7527 with or prevent legitimate uses.

7529 Services offered over loopback interfaces often lack proper
7530 authentication. Endpoints MAY prevent connection attempts or
7531 migration to a loopback address. Endpoints SHOULD NOT allow
7532 connections or migration to a loopback address if the same service
7533 was previously available at a different interface or if the address
7534 was provided by a service at a non-loopback address.
Endpoints that
7535 depend on these capabilities could offer an option to disable these
7536 protections.

7538 Similarly, endpoints could regard a change in address to a link-local
7539 address [RFC4291] or an address in a private use range [RFC1918] from
7540 a global, unique-local [RFC4193], or non-private address as a
7541 potential attempt at request forgery. Endpoints could refuse to use
7542 these addresses entirely, but that carries a significant risk of
7543 interfering with legitimate uses. Endpoints SHOULD NOT refuse to use
7544 an address unless they have specific knowledge about the network
7545 indicating that sending datagrams to unvalidated addresses in a given
7546 range is not safe.

7548 Endpoints MAY choose to reduce the risk of request forgery by not
7549 including values from NEW_TOKEN frames in Initial packets or by only
7550 sending probing frames in packets prior to completing address
7551 validation. Note that this does not prevent an attacker from using
7552 the Destination Connection ID field for an attack.

7554 Endpoints are not expected to have specific information about the
7555 location of servers that could be vulnerable targets of a request
7556 forgery attack. However, it might be possible over time to identify
7557 specific UDP ports that are common targets of attacks or particular
7558 patterns in datagrams that are used for attacks. Endpoints MAY
7559 choose to avoid sending datagrams to these ports or not send
7560 datagrams that match these patterns prior to validating the
7561 destination address. Endpoints MAY retire connection IDs containing
7562 patterns known to be problematic without using them.

7564 Note: Modifying endpoints to apply these protections is more
7565 efficient than deploying network-based protections, as endpoints
7566 do not need to perform any additional processing when sending to
7567 an address that has been validated.

7569 21.6. Slowloris Attacks

7571 The attacks commonly known as Slowloris ([SLOWLORIS]) try to keep
7572 many connections to the target endpoint open and hold them open as
7573 long as possible. These attacks can be executed against a QUIC
7574 endpoint by generating the minimum amount of activity necessary to
7575 avoid being closed for inactivity. This might involve sending small
7576 amounts of data, gradually opening flow control windows in order to
7577 control the sender rate, or manufacturing ACK frames that simulate a
7578 high loss rate.

7580 QUIC deployments SHOULD provide mitigations for the Slowloris
7581 attacks, such as increasing the maximum number of clients the server
7582 will allow, limiting the number of connections a single IP address is
7583 allowed to make, imposing restrictions on the minimum transfer speed
7584 a connection is allowed to have, and restricting the length of time
7585 an endpoint is allowed to stay connected.

7587 21.7. Stream Fragmentation and Reassembly Attacks

7589 An adversarial sender might intentionally not send portions of the
7590 stream data, causing the receiver to commit resources for the unsent
7591 data. This could cause a disproportionate receive buffer memory
7592 commitment and/or the creation of a large and inefficient data
7593 structure at the receiver.

7595 An adversarial receiver might intentionally not acknowledge packets
7596 containing stream data in an attempt to force the sender to store the
7597 unacknowledged stream data for retransmission.

7599 The attack on receivers is mitigated if flow control windows
7600 correspond to available memory. However, some receivers will over-
7601 commit memory and advertise flow control offsets in the aggregate
7602 that exceed actual available memory. The over-commitment strategy
7603 can lead to better performance when endpoints are well behaved, but
7604 renders endpoints vulnerable to the stream fragmentation attack.
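
The non-over-committing strategy described above can be illustrated with a
minimal Python sketch, in which per-stream receive credit is granted only
while the aggregate fits within a fixed memory budget. The class name
ReceiveBudget, its methods, and the byte figures in the usage note are
hypothetical; none of them are defined by this document.

```python
class ReceiveBudget:
    """Tracks flow control credit so the aggregate never exceeds memory.

    Hypothetical helper for illustration only; not part of QUIC.
    """

    def __init__(self, memory_limit: int) -> None:
        self.memory_limit = memory_limit  # bytes available for receive buffers
        self.granted = {}                 # stream ID -> outstanding credit (bytes)

    def committed(self) -> int:
        """Total credit currently advertised across all streams."""
        return sum(self.granted.values())

    def grant(self, stream_id: int, want: int) -> int:
        """Grant up to `want` more bytes of credit, capped by free memory."""
        free = self.memory_limit - self.committed()
        credit = max(0, min(want, free))
        self.granted[stream_id] = self.granted.get(stream_id, 0) + credit
        return credit

    def release(self, stream_id: int, consumed: int) -> None:
        """Return credit once the application has read `consumed` bytes."""
        held = self.granted.get(stream_id, 0)
        self.granted[stream_id] = held - min(consumed, held)
```

With a 1000-byte budget, a request for 600 bytes on one stream followed by a
request for 600 bytes on another would be granted 600 and then 400 bytes. A
receiver that over-commits would instead grant both requests in full and
rely on peers not filling every window at once.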

7606 QUIC deployments SHOULD provide mitigations against stream
7607 fragmentation attacks. Mitigations could consist of avoiding over-
7608 committing memory, limiting the size of tracking data structures,
7609 delaying reassembly of STREAM frames, implementing heuristics based
7610 on the age and duration of reassembly holes, or some combination.

7612 21.8. Stream Commitment Attack

7614 An adversarial endpoint can open a large number of streams,
7615 exhausting state on an endpoint. The adversarial endpoint could
7616 repeat the process on a large number of connections, in a manner
7617 similar to SYN flooding attacks in TCP.

7619 Normally, clients will open streams sequentially, as explained in
7620 Section 2.1. However, when several streams are initiated at short
7621 intervals, loss or reordering can cause STREAM frames that open
7622 streams to be received out of sequence. On receiving a higher-
7623 numbered stream ID, a receiver is required to open all intervening
7624 streams of the same type; see Section 3.2. Thus, on a new
7625 connection, opening stream 4000000 opens 1 million and 1 client-
7626 initiated bidirectional streams.

7628 The number of active streams is limited by the
7629 initial_max_streams_bidi and initial_max_streams_uni transport
7630 parameters as updated by any received MAX_STREAMS frames, as
7631 explained in Section 4.6. If chosen judiciously, these limits
7632 mitigate the effect of the stream commitment attack. However,
7633 setting the limit too low could affect performance when applications
7634 expect to open a large number of streams.

7636 21.9. Peer Denial of Service

7638 QUIC and TLS both contain frames or messages that have legitimate
7639 uses in some contexts, but that can be abused to cause a peer to
7640 expend processing resources without having any observable impact on
7641 the state of the connection.

7643 Messages can also be used to change and revert state in small or
7644 inconsequential ways, such as by sending small increments to flow
7645 control limits.

7647 If processing costs are disproportionately large in comparison to
7648 bandwidth consumption or effect on state, then this could allow a
7649 malicious peer to exhaust processing capacity.

7651 While there are legitimate uses for all messages, implementations
7652 SHOULD track cost of processing relative to progress and treat
7653 excessive quantities of any non-productive packets as indicative of
7654 an attack. Endpoints MAY respond to this condition with a connection
7655 error, or by dropping packets.

7657 21.10. Explicit Congestion Notification Attacks

7659 An on-path attacker could manipulate the value of ECN fields in the
7660 IP header to influence the sender's rate. [RFC3168] discusses
7661 manipulations and their effects in more detail.

7663 A limited on-path attacker can duplicate and send packets with
7664 modified ECN fields to affect the sender's rate. If duplicate
7665 packets are discarded by a receiver, an attacker will need to race
7666 the duplicate packet against the original to be successful in this
7667 attack. Therefore, QUIC endpoints ignore the ECN field on an IP
7668 packet unless at least one QUIC packet in that IP packet is
7669 successfully processed; see Section 13.4.

7671 21.11. Stateless Reset Oracle

7673 Stateless resets create a possible denial of service attack analogous
7674 to a TCP reset injection. This attack is possible if an attacker is
7675 able to cause a stateless reset token to be generated for a
7676 connection with a selected connection ID. An attacker that can cause
7677 this token to be generated can reset an active connection with the
7678 same connection ID.
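
To see why a shared static key creates this oracle, consider a minimal
Python sketch. It assumes the token is derived from a static key and the
connection ID with HMAC-SHA-256, truncated to the 16-byte token length; that
construction, the function name stateless_reset_token, and the example key
and connection ID are illustrative assumptions, not requirements of this
document.

```python
import hashlib
import hmac

TOKEN_LENGTH = 16  # a stateless reset token is 128 bits


def stateless_reset_token(static_key: bytes, connection_id: bytes) -> bytes:
    """Derive a reset token from a static key (assumed HMAC-SHA-256 construction)."""
    digest = hmac.new(static_key, connection_id, hashlib.sha256).digest()
    return digest[:TOKEN_LENGTH]


# Two hypothetical server instances that share one static key.
shared_key = b"example static key"
cid = bytes.fromhex("0102030405060708")

# Instance A holds the connection state and advertised this token to the peer.
token_at_instance_a = stateless_reset_token(shared_key, cid)

# Instance B has no state for `cid`. If a packet carrying `cid` is steered to
# it, it derives the very same token and would emit a valid stateless reset.
token_at_instance_b = stateless_reset_token(shared_key, cid)
```

Because both instances derive the same token, an attacker who can steer a
packet bearing the victim's connection ID to an instance without state
obtains a reset that the victim's peer will accept.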

7680 If a packet can be routed to different instances that share a static
7681 key, for example by changing an IP address or port, then an attacker
7682 can cause the server to send a stateless reset. To defend against
7683 this style of denial of service, endpoints that share a static key
7684 for stateless reset (see Section 10.3.2) MUST be arranged so that
7685 packets with a given connection ID always arrive at an instance that
7686 has connection state, unless that connection is no longer active.

7688 More generally, servers MUST NOT generate a stateless reset if a
7689 connection with the corresponding connection ID could be active on
7690 any endpoint using the same static key.

7692 In the case of a cluster that uses dynamic load balancing, it is
7693 possible that a change in load balancer configuration could occur
7694 while an active instance retains connection state. Even if an
7695 instance retains connection state, the change in routing and
7696 resulting stateless reset will result in the connection being
7697 terminated. If there is no chance of the packet being routed to the
7698 correct instance, it is better to send a stateless reset than wait
7699 for the connection to time out. However, this is acceptable only if
7700 the routing cannot be influenced by an attacker.

7702 21.12. Version Downgrade

7704 This document defines QUIC Version Negotiation packets in Section 6
7705 that can be used to negotiate the QUIC version used between two
7706 endpoints. However, this document does not specify how this
7707 negotiation will be performed between this version and subsequent
7708 future versions. In particular, Version Negotiation packets do not
7709 contain any mechanism to prevent version downgrade attacks. Future
7710 versions of QUIC that use Version Negotiation packets MUST define a
7711 mechanism that is robust against version downgrade attacks.

7713 21.13. Targeted Attacks by Routing

7715 Deployments should limit the ability of an attacker to target a new
7716 connection to a particular server instance. Ideally, routing
7717 decisions are made independently of client-selected values, including
7718 addresses. Once an instance is selected, a connection ID can be
7719 selected so that later packets are routed to the same instance.

7721 21.14. Traffic Analysis

7723 The length of QUIC packets can reveal information about the length of
7724 the content of those packets. The PADDING frame is provided so that
7725 endpoints have some ability to obscure the length of packet content;
7726 see Section 19.1.

7728 Note however that defeating traffic analysis is challenging and the
7729 subject of active research. Length is not the only way that
7730 information might leak. Endpoints might also reveal sensitive
7731 information through other side channels, such as the timing of
7732 packets.

7734 22. IANA Considerations

7736 This document establishes several registries for the management of
7737 codepoints in QUIC. These registries operate on a common set of
7738 policies as defined in Section 22.1.

7740 22.1. Registration Policies for QUIC Registries

7742 All QUIC registries allow for both provisional and permanent
7743 registration of codepoints. This section documents policies that are
7744 common to these registries.

7746 22.1.1. Provisional Registrations

7748 Provisional registrations of codepoints are intended to allow for
7749 private use and experimentation with extensions to QUIC. Provisional
7750 registrations only require the inclusion of the codepoint value and
7751 contact information. However, provisional registrations could be
7752 reclaimed and reassigned for another purpose.

7754 Provisional registrations require Expert Review, as defined in
7755 Section 4.5 of [RFC8126].
Designated expert(s) are advised that only
7756 registrations for an excessive proportion of remaining codepoint
7757 space or the very first unassigned value (see Section 22.1.2) can be
7758 rejected.

7760 Provisional registrations will include a date field that indicates
7761 when the registration was last updated. A request to update the date
7762 on any provisional registration can be made without review from the
7763 designated expert(s).

7765 All QUIC registries include the following fields to support
7766 provisional registration:

7768 Value: The assigned codepoint.

7770 Status: "Permanent" or "Provisional".

7772 Specification: A reference to a publicly available specification for
7773 the value.

7775 Date: The date of last update to the registration.

7777 Change Controller: The entity that is responsible for the definition
7778 of the registration.

7780 Contact: Contact details for the registrant.

7782 Notes: Supplementary notes about the registration.

7784 Provisional registrations MAY omit the Specification and Notes
7785 fields, plus any additional fields that might be required for a
7786 permanent registration. The Date field is not required as part of
7787 requesting a registration as it is set to the date the registration
7788 is created or updated.

7790 22.1.2. Selecting Codepoints

7792 New uses of codepoints from QUIC registries SHOULD use a randomly
7793 selected codepoint that excludes both existing allocations and the
7794 first unallocated codepoint in the selected space. Requests for
7795 multiple codepoints MAY use a contiguous range. This minimizes the
7796 risk that differing semantics are attributed to the same codepoint by
7797 different implementations.

7799 Use of the first unassigned codepoint is reserved for allocation
7800 using the Standards Action policy; see Section 4.9 of [RFC8126]. The
7801 early codepoint assignment process [EARLY-ASSIGN] can be used for
7802 these values.
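
The random selection recommended at the start of this section can be
sketched in Python as follows. This is an illustrative sketch only: the
function name pick_codepoint, its parameters, and the example range and
allocations in the usage note are hypothetical and do not reflect the
contents of any registry.

```python
import secrets


def pick_codepoint(allocated: set, low: int, high: int) -> int:
    """Pick a random unallocated codepoint in [low, high].

    The first unallocated codepoint in the range is skipped, since its use
    is reserved for allocation under the Standards Action policy.
    """
    in_range = sum(1 for v in allocated if low <= v <= high)
    if (high - low + 1) - in_range < 2:
        raise ValueError("fewer than two unallocated codepoints in range")

    # The guard above guarantees that at least one free value exists.
    first_free = next(v for v in range(low, high + 1) if v not in allocated)

    # Rejection sampling: draw uniformly until the candidate is acceptable.
    while True:
        candidate = low + secrets.randbelow(high - low + 1)
        if candidate not in allocated and candidate != first_free:
            return candidate
```

For example, pick_codepoint(set(range(0x40, 0x50)), 0x40, 0x3fff) returns a
value that is neither already allocated nor equal to 0x50, the first
unallocated codepoint in that range.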

7804 For codepoints that are encoded in variable-length integers
7805 (Section 16), such as frame types, codepoints that encode to four or
7806 eight bytes (that is, values 2^14 and above) SHOULD be used unless
7807 the usage is especially sensitive to having a longer encoding.

7809 Applications to register codepoints in QUIC registries MAY include a
7810 requested codepoint as part of the registration. IANA MUST allocate
7811 the selected codepoint if the codepoint is unassigned and the
7812 requirements of the registration policy are met.

7814 22.1.3. Reclaiming Provisional Codepoints

7816 A request might be made to remove an unused provisional registration
7817 from the registry to reclaim space in a registry, or portion of the
7818 registry (such as the 64-16383 range for codepoints that use
7819 variable-length encodings). This SHOULD be done only for the
7820 codepoints with the earliest recorded date, and entries that have been
7821 updated less than a year prior SHOULD NOT be reclaimed.

7823 A request to remove a codepoint MUST be reviewed by the designated
7824 expert(s). The expert(s) MUST attempt to determine whether the
7825 codepoint is still in use. Experts are advised to contact the listed
7826 contacts for the registration, plus as wide a set of protocol
7827 implementers as possible in order to determine whether any use of the
7828 codepoint is known. The expert(s) are advised to allow at least four
7829 weeks for responses.

7831 If any use of the codepoints is identified by this search or a
7832 request to update the registration is made, the codepoint MUST NOT be
7833 reclaimed. Instead, the date on the registration is updated. A note
7834 might be added for the registration recording relevant information
7835 that was learned.

7837 If no use of the codepoint was identified and no request was made to
7838 update the registration, the codepoint MAY be removed from the
7839 registry.
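
The date-based part of the reclamation rules above can be sketched in
Python as follows. This covers only the mechanical filter (provisional
entries, oldest first, none updated within the last year); the expert
review and the search for deployed usage still have to happen. The
Registration class, the function reclaim_candidates, and the reading of
"a year" as 365 days are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Registration:
    """Hypothetical, reduced view of a registry entry."""
    value: int
    status: str         # "Permanent" or "Provisional"
    last_updated: date  # the registry's Date field


def reclaim_candidates(entries, today):
    """Return provisional entries not updated within the last year, oldest first."""
    cutoff = today - timedelta(days=365)  # assumption: "a year" taken as 365 days
    stale = [e for e in entries
             if e.status == "Provisional" and e.last_updated <= cutoff]
    return sorted(stale, key=lambda e: e.last_updated)
```

Entries near the front of the returned list are the ones a reclamation
request would normally target first; the list is only an input to the
review described above, not a decision.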

7841 This review and consultation process also applies to requests to
7842 change a provisional registration into a permanent registration,
7843 except that the goal is not to determine whether there is no use of
7844 the codepoint, but to determine that the registration is an accurate
7845 representation of any deployed usage.

7847 22.1.4. Permanent Registrations

7849 Permanent registrations in QUIC registries use the Specification
7850 Required policy ([RFC8126]), unless otherwise specified. The
7851 designated expert(s) verify that a specification exists and is
7852 readily accessible. Expert(s) are encouraged to be biased towards
7853 approving registrations unless they are abusive, frivolous, or
7854 actively harmful (not merely aesthetically displeasing, or
7855 architecturally dubious). The creation of a registry MAY specify
7856 additional constraints on permanent registrations.

7858 The creation of a registry MAY identify a range of codepoints where
7859 registrations are governed by a different registration policy. For
7860 instance, the frame type registry in Section 22.4 has a stricter
7861 policy for codepoints in the range from 0 to 63.

7863 Any stricter requirements for permanent registrations do not prevent
7864 provisional registrations for affected codepoints. For instance, a
7865 provisional registration for a frame type of 61 could be requested.

7867 All registrations made by Standards Track publications MUST be
7868 permanent.

7870 All registrations in this document are assigned a permanent status
7871 and list a change controller of the IETF and a contact of the QUIC
7872 working group (quic@ietf.org).

7874 22.2. QUIC Versions Registry

7876 IANA [SHALL add/has added] a registry for "QUIC Versions" under a
7877 "QUIC" heading.

7879 The "QUIC Versions" registry governs a 32-bit space; see Section 15.
7880 This registry follows the registration policy from Section 22.1.
7881 Permanent registrations in this registry are assigned using the
7882 Specification Required policy ([RFC8126]).

7884 The codepoint of 0x00000001 is assigned with
7885 permanent status to the protocol defined in this document. The
7886 codepoint of 0x00000000 is permanently reserved; the note for this
7887 codepoint [shall] indicate[s] that this version is reserved for
7888 Version Negotiation.

7890 All codepoints that follow the pattern 0x?a?a?a?a are reserved and
7891 MUST NOT be assigned by IANA and MUST NOT appear in the listing of
7892 assigned values.

7894 [[RFC editor: please remove the following note before publication.]]

7896 IANA note: Several pre-standardization versions will likely be in
7897 use at the time of publication. There is no need to document
7898 these in an RFC, but recording information about these versions
7899 will ensure that the information in the registry is accurate. The
7900 document editors or working group chairs can facilitate getting
7901 the necessary information.

7903 22.3. QUIC Transport Parameter Registry

7905 IANA [SHALL add/has added] a registry for "QUIC Transport Parameters"
7906 under a "QUIC" heading.

7908 The "QUIC Transport Parameters" registry governs a 62-bit space.
7909 This registry follows the registration policy from Section 22.1.
7910 Permanent registrations in this registry are assigned using the
7911 Specification Required policy ([RFC8126]).

7913 In addition to the fields in Section 22.1.1, permanent registrations
7914 in this registry MUST include the following field:

7916 Parameter Name: A short mnemonic for the parameter.

7918 The initial contents of this registry are shown in Table 6.
7920 +=======+=====================================+===============+ 7921 | Value | Parameter Name | Specification | 7922 +=======+=====================================+===============+ 7923 | 0x00 | original_destination_connection_id | Section 18.2 | 7924 +-------+-------------------------------------+---------------+ 7925 | 0x01 | max_idle_timeout | Section 18.2 | 7926 +-------+-------------------------------------+---------------+ 7927 | 0x02 | stateless_reset_token | Section 18.2 | 7928 +-------+-------------------------------------+---------------+ 7929 | 0x03 | max_udp_payload_size | Section 18.2 | 7930 +-------+-------------------------------------+---------------+ 7931 | 0x04 | initial_max_data | Section 18.2 | 7932 +-------+-------------------------------------+---------------+ 7933 | 0x05 | initial_max_stream_data_bidi_local | Section 18.2 | 7934 +-------+-------------------------------------+---------------+ 7935 | 0x06 | initial_max_stream_data_bidi_remote | Section 18.2 | 7936 +-------+-------------------------------------+---------------+ 7937 | 0x07 | initial_max_stream_data_uni | Section 18.2 | 7938 +-------+-------------------------------------+---------------+ 7939 | 0x08 | initial_max_streams_bidi | Section 18.2 | 7940 +-------+-------------------------------------+---------------+ 7941 | 0x09 | initial_max_streams_uni | Section 18.2 | 7942 +-------+-------------------------------------+---------------+ 7943 | 0x0a | ack_delay_exponent | Section 18.2 | 7944 +-------+-------------------------------------+---------------+ 7945 | 0x0b | max_ack_delay | Section 18.2 | 7946 +-------+-------------------------------------+---------------+ 7947 | 0x0c | disable_active_migration | Section 18.2 | 7948 +-------+-------------------------------------+---------------+ 7949 | 0x0d | preferred_address | Section 18.2 | 7950 +-------+-------------------------------------+---------------+ 7951 | 0x0e | active_connection_id_limit | Section 18.2 | 7952 
+-------+-------------------------------------+---------------+
| 0x0f  | initial_source_connection_id        | Section 18.2  |
+-------+-------------------------------------+---------------+
| 0x10  | retry_source_connection_id          | Section 18.2  |
+-------+-------------------------------------+---------------+

Table 6: Initial QUIC Transport Parameters Entries

Each value of the format "31 * N + 27" for integer values of N (that
is, 27, 58, 89, ...) is reserved; these values MUST NOT be assigned
by IANA and MUST NOT appear in the listing of assigned values.

22.4.  QUIC Frame Types Registry

IANA [SHALL add/has added] a registry for "QUIC Frame Types" under a
"QUIC" heading.

The "QUIC Frame Types" registry governs a 62-bit space.  This
registry follows the registration policy from Section 22.1.
Permanent registrations in this registry are assigned using the
Specification Required policy ([RFC8126]), except for values between
0x00 and 0x3f (in hexadecimal; inclusive), which are assigned using
Standards Action or IESG Approval as defined in Sections 4.9 and 4.10
of [RFC8126].

In addition to the fields in Section 22.1.1, permanent registrations
in this registry MUST include the following field:

Frame Name:  A short mnemonic for the frame type.

In addition to the advice in Section 22.1, specifications for new
permanent registrations SHOULD describe the means by which an
endpoint might determine that it can send the identified type of
frame.  An accompanying transport parameter registration is expected
for most registrations; see Section 22.3.  Specifications for
permanent registrations also need to describe the format and assigned
semantics of any fields in the frame.

The initial contents of this registry are tabulated in Table 3.  Note
that the registry does not include the "Pkts" and "Spec" columns from
Table 3.

22.5.
QUIC Transport Error Codes Registry

IANA [SHALL add/has added] a registry for "QUIC Transport Error
Codes" under a "QUIC" heading.

The "QUIC Transport Error Codes" registry governs a 62-bit space.
This space is split into three regions that are governed by different
policies.  Permanent registrations in this registry are assigned
using the Specification Required policy ([RFC8126]), except for
values between 0x00 and 0x3f (in hexadecimal; inclusive), which are
assigned using Standards Action or IESG Approval as defined in
Sections 4.9 and 4.10 of [RFC8126].

In addition to the fields in Section 22.1.1, permanent registrations
in this registry MUST include the following fields:

Code:  A short mnemonic for the error code.

Description:  A brief description of the error code semantics, which
   MAY be a summary if a specification reference is provided.

The initial contents of this registry are shown in Table 7.

+======+===========================+================+===============+
|Value | Code                      |Description     | Specification |
+======+===========================+================+===============+
|0x0   | NO_ERROR                  |No error        | Section 20    |
+------+---------------------------+----------------+---------------+
|0x1   | INTERNAL_ERROR            |Implementation  | Section 20    |
|      |                           |error           |               |
+------+---------------------------+----------------+---------------+
|0x2   | CONNECTION_REFUSED        |Server refuses a| Section 20    |
|      |                           |connection      |               |
+------+---------------------------+----------------+---------------+
|0x3   | FLOW_CONTROL_ERROR        |Flow control    | Section 20    |
|      |                           |error           |               |
+------+---------------------------+----------------+---------------+
|0x4   | STREAM_LIMIT_ERROR        |Too many streams| Section 20    |
|      |                           |opened          |               |
+------+---------------------------+----------------+---------------+
|0x5   | STREAM_STATE_ERROR        |Frame received  | Section 20    |
|      |                           |in invalid      |               |
|      |                           |stream state    |               |
+------+---------------------------+----------------+---------------+
|0x6   | FINAL_SIZE_ERROR          |Change to final | Section 20    |
|      |                           |size            |               |
+------+---------------------------+----------------+---------------+
|0x7   | FRAME_ENCODING_ERROR      |Frame encoding  | Section 20    |
|      |                           |error           |               |
+------+---------------------------+----------------+---------------+
|0x8   | TRANSPORT_PARAMETER_ERROR |Error in        | Section 20    |
|      |                           |transport       |               |
|      |                           |parameters      |               |
+------+---------------------------+----------------+---------------+
|0x9   | CONNECTION_ID_LIMIT_ERROR |Too many        | Section 20    |
|      |                           |connection IDs  |               |
|      |                           |received        |               |
+------+---------------------------+----------------+---------------+
|0xa   | PROTOCOL_VIOLATION        |Generic protocol| Section 20    |
|      |                           |violation       |               |
+------+---------------------------+----------------+---------------+
|0xb   | INVALID_TOKEN             |Invalid Token   | Section 20    |
|      |                           |Received        |               |
+------+---------------------------+----------------+---------------+
|0xc   | APPLICATION_ERROR         |Application     | Section 20    |
|      |                           |error           |               |
+------+---------------------------+----------------+---------------+
|0xd   | CRYPTO_BUFFER_EXCEEDED    |CRYPTO data     | Section 20    |
|      |                           |buffer          |               |
|      |                           |overflowed      |               |
+------+---------------------------+----------------+---------------+
|0xe   | KEY_UPDATE_ERROR          |Invalid packet  | Section 20    |
|      |                           |protection      |               |
|      |                           |update          |               |
+------+---------------------------+----------------+---------------+
|0xf   | AEAD_LIMIT_REACHED        |Excessive use of| Section 20    |
|      |                           |packet          |               |
|      |                           |protection keys |               |
+------+---------------------------+----------------+---------------+
|0x10  | NO_VIABLE_PATH            |No viable       | Section 20    |
|      |                           |network path    |               |
|      |                           |exists          |               |
+------+---------------------------+----------------+---------------+ 8078 Table 7: Initial QUIC Transport Error Codes Entries 8080 23. References 8082 23.1. Normative References 8084 [BCP38] Ferguson, P. and D. Senie, "Network Ingress Filtering: 8085 Defeating Denial of Service Attacks which employ IP Source 8086 Address Spoofing", BCP 38, RFC 2827, DOI 10.17487/RFC2827, 8087 May 2000, . 8089 [DPLPMTUD] Fairhurst, G., Jones, T., Tüxen, M., Rüngeler, I., and T. 8090 Völker, "Packetization Layer Path MTU Discovery for 8091 Datagram Transports", RFC 8899, DOI 10.17487/RFC8899, 8092 September 2020, . 8094 [EARLY-ASSIGN] 8095 Cotton, M., "Early IANA Allocation of Standards Track Code 8096 Points", BCP 100, RFC 7120, DOI 10.17487/RFC7120, January 8097 2014, . 8099 [IPv4] Postel, J., "Internet Protocol", STD 5, RFC 791, 8100 DOI 10.17487/RFC0791, September 1981, 8101 . 8103 [QUIC-INVARIANTS] 8104 Thomson, M., "Version-Independent Properties of QUIC", 8105 Work in Progress, Internet-Draft, draft-ietf-quic- 8106 invariants-13, 15 January 2021, 8107 . 8110 [QUIC-RECOVERY] 8111 Iyengar, J., Ed. and I. Swett, Ed., "QUIC Loss Detection 8112 and Congestion Control", Work in Progress, Internet-Draft, 8113 draft-ietf-quic-recovery-34, 15 January 2021, 8114 . 8116 [QUIC-TLS] Thomson, M., Ed. and S. Turner, Ed., "Using Transport 8117 Layer Security (TLS) to Secure QUIC", Work in Progress, 8118 Internet-Draft, draft-ietf-quic-tls-34, 15 January 2021, 8119 . 8121 [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, 8122 DOI 10.17487/RFC1191, November 1990, 8123 . 8125 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 8126 Requirement Levels", BCP 14, RFC 2119, 8127 DOI 10.17487/RFC2119, March 1997, 8128 . 8130 [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition 8131 of Explicit Congestion Notification (ECN) to IP", 8132 RFC 3168, DOI 10.17487/RFC3168, September 2001, 8133 . 
8135 [RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO 8136 10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November 8137 2003, . 8139 [RFC6437] Amante, S., Carpenter, B., Jiang, S., and J. Rajahalme, 8140 "IPv6 Flow Label Specification", RFC 6437, 8141 DOI 10.17487/RFC6437, November 2011, 8142 . 8144 [RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage 8145 Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085, 8146 March 2017, . 8148 [RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for 8149 Writing an IANA Considerations Section in RFCs", BCP 26, 8150 RFC 8126, DOI 10.17487/RFC8126, June 2017, 8151 . 8153 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 8154 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 8155 May 2017, . 8157 [RFC8201] McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed., 8158 "Path MTU Discovery for IP version 6", STD 87, RFC 8201, 8159 DOI 10.17487/RFC8201, July 2017, 8160 . 8162 [RFC8311] Black, D., "Relaxing Restrictions on Explicit Congestion 8163 Notification (ECN) Experimentation", RFC 8311, 8164 DOI 10.17487/RFC8311, January 2018, 8165 . 8167 [TLS13] Rescorla, E., "The Transport Layer Security (TLS) Protocol 8168 Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018, 8169 . 8171 [UDP] Postel, J., "User Datagram Protocol", STD 6, RFC 768, 8172 DOI 10.17487/RFC0768, August 1980, 8173 . 8175 23.2. Informative References 8177 [AEAD] McGrew, D., "An Interface and Algorithms for Authenticated 8178 Encryption", RFC 5116, DOI 10.17487/RFC5116, January 2008, 8179 . 8181 [ALPN] Friedl, S., Popov, A., Langley, A., and E. Stephan, 8182 "Transport Layer Security (TLS) Application-Layer Protocol 8183 Negotiation Extension", RFC 7301, DOI 10.17487/RFC7301, 8184 July 2014, . 8186 [ALTSVC] Nottingham, M., McManus, P., and J. Reschke, "HTTP 8187 Alternative Services", RFC 7838, DOI 10.17487/RFC7838, 8188 April 2016, . 
8190 [COOKIE] Barth, A., "HTTP State Management Mechanism", RFC 6265, 8191 DOI 10.17487/RFC6265, April 2011, 8192 . 8194 [CSRF] Barth, A., Jackson, C., and J. Mitchell, "Robust defenses 8195 for cross-site request forgery", Proceedings of the 15th 8196 ACM conference on Computer and communications security - 8197 CCS '08, DOI 10.1145/1455770.1455782, 2008, 8198 . 8200 [EARLY-DESIGN] 8201 Roskind, J., "QUIC: Multiplexed Transport Over UDP", 2 8202 December 2013, . 8204 [GATEWAY] Hätönen, S., Nyrhinen, A., Eggert, L., Strowes, S., 8205 Sarolahti, P., and M. Kojo, "An experimental study of home 8206 gateway characteristics", Proceedings of the 10th annual 8207 conference on Internet measurement - IMC '10, 8208 DOI 10.1145/1879141.1879174, 2010, 8209 . 8211 [HTTP2] Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext 8212 Transfer Protocol Version 2 (HTTP/2)", RFC 7540, 8213 DOI 10.17487/RFC7540, May 2015, 8214 . 8216 [IPv6] Deering, S. and R. Hinden, "Internet Protocol, Version 6 8217 (IPv6) Specification", STD 86, RFC 8200, 8218 DOI 10.17487/RFC8200, July 2017, 8219 . 8221 [QUIC-MANAGEABILITY] 8222 Kuehlewind, M. and B. Trammell, "Manageability of the QUIC 8223 Transport Protocol", Work in Progress, Internet-Draft, 8224 draft-ietf-quic-manageability-08, 2 November 2020, 8225 . 8228 [RANDOM] Eastlake 3rd, D., Schiller, J., and S. Crocker, 8229 "Randomness Requirements for Security", BCP 106, RFC 4086, 8230 DOI 10.17487/RFC4086, June 2005, 8231 . 8233 [RFC1812] Baker, F., Ed., "Requirements for IP Version 4 Routers", 8234 RFC 1812, DOI 10.17487/RFC1812, June 1995, 8235 . 8237 [RFC1918] Rekhter, Y., Moskowitz, B., Karrenberg, D., de Groot, G. 8238 J., and E. Lear, "Address Allocation for Private 8239 Internets", BCP 5, RFC 1918, DOI 10.17487/RFC1918, 8240 February 1996, . 8242 [RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP 8243 Selective Acknowledgment Options", RFC 2018, 8244 DOI 10.17487/RFC2018, October 1996, 8245 . 
8247 [RFC2104] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed- 8248 Hashing for Message Authentication", RFC 2104, 8249 DOI 10.17487/RFC2104, February 1997, 8250 . 8252 [RFC3449] Balakrishnan, H., Padmanabhan, V., Fairhurst, G., and M. 8253 Sooriyabandara, "TCP Performance Implications of Network 8254 Path Asymmetry", BCP 69, RFC 3449, DOI 10.17487/RFC3449, 8255 December 2002, . 8257 [RFC4193] Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast 8258 Addresses", RFC 4193, DOI 10.17487/RFC4193, October 2005, 8259 . 8261 [RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing 8262 Architecture", RFC 4291, DOI 10.17487/RFC4291, February 8263 2006, . 8265 [RFC4443] Conta, A., Deering, S., and M. Gupta, Ed., "Internet 8266 Control Message Protocol (ICMPv6) for the Internet 8267 Protocol Version 6 (IPv6) Specification", STD 89, 8268 RFC 4443, DOI 10.17487/RFC4443, March 2006, 8269 . 8271 [RFC4787] Audet, F., Ed. and C. Jennings, "Network Address 8272 Translation (NAT) Behavioral Requirements for Unicast 8273 UDP", BCP 127, RFC 4787, DOI 10.17487/RFC4787, January 8274 2007, . 8276 [RFC4941] Narten, T., Draves, R., and S. Krishnan, "Privacy 8277 Extensions for Stateless Address Autoconfiguration in 8278 IPv6", RFC 4941, DOI 10.17487/RFC4941, September 2007, 8279 . 8281 [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion 8282 Control", RFC 5681, DOI 10.17487/RFC5681, September 2009, 8283 . 8285 [RFC5869] Krawczyk, H. and P. Eronen, "HMAC-based Extract-and-Expand 8286 Key Derivation Function (HKDF)", RFC 5869, 8287 DOI 10.17487/RFC5869, May 2010, 8288 . 8290 [RFC7983] Petit-Huguenin, M. and G. Salgueiro, "Multiplexing Scheme 8291 Updates for Secure Real-time Transport Protocol (SRTP) 8292 Extension for Datagram Transport Layer Security (DTLS)", 8293 RFC 7983, DOI 10.17487/RFC7983, September 2016, 8294 . 8296 [RFC8087] Fairhurst, G. and M. 
Welzl, "The Benefits of Using Explicit Congestion Notification
(ECN)", RFC 8087, DOI 10.17487/RFC8087, March 2017, .

[SEC-CONS] Rescorla, E. and B. Korver, "Guidelines for Writing RFC
           Text on Security Considerations", BCP 72, RFC 3552,
           DOI 10.17487/RFC3552, July 2003, .

[SLOWLORIS]
           RSnake Hansen, R., "Welcome to Slowloris...", June 2009, .

Appendix A.  Pseudocode

The pseudocode in this section describes sample algorithms.  These
algorithms are intended to be correct and clear, rather than being
optimally performant.

The pseudocode segments in this section are licensed as Code
Components; see the copyright notice.

A.1.  Sample Variable-Length Integer Decoding

The pseudocode in Figure 45 shows how a variable-length integer can
be read from a stream of bytes.  The function ReadVarint takes a
single argument, a sequence of bytes which can be read in network
byte order.

   ReadVarint(data):
     // The length of variable-length integers is encoded in the
     // first two bits of the first byte.
     v = data.next_byte()
     prefix = v >> 6
     length = 1 << prefix

     // Once the length is known, remove these bits and read any
     // remaining bytes.
     v = v & 0x3f
     repeat length-1 times:
       v = (v << 8) + data.next_byte()
     return v

Figure 45: Sample Variable-Length Integer Decoding Algorithm

For example, the eight-byte sequence 0xc2197c5eff14e88c decodes to
the decimal value 151,288,809,941,952,652; the four-byte sequence
0x9d7f3e7d decodes to 494,878,333; the two-byte sequence 0x7bbd
decodes to 15,293; and the single byte 0x25 decodes to 37 (as does
the two-byte sequence 0x4025).

A.2.  Sample Packet Number Encoding Algorithm

The pseudocode in Figure 46 shows how an implementation can select an
appropriate size for packet number encodings.
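
One way to make that size selection concrete is the Python sketch
below.  It is an illustration under stated assumptions, not a
normative algorithm: the name encode_packet_number and the bytes
return value are choices made for this example, and the logarithm
used by Figure 46 is replaced by an equivalent integer test (the
smallest num_bytes for which num_unacked <= 2^(8 * num_bytes - 1)) so
that floating-point rounding plays no part.

```python
from typing import Optional


def encode_packet_number(full_pn: int, largest_acked: Optional[int]) -> bytes:
    """Truncate full_pn to the smallest size that Figure 46 would select.

    Illustrative sketch; the name and return type are assumptions.
    """
    # Count the contiguous unacknowledged packet numbers, including the
    # packet that is about to be sent.
    if largest_acked is None:
        num_unacked = full_pn + 1
    else:
        num_unacked = full_pn - largest_acked
    if num_unacked < 1:
        raise ValueError("full_pn must be larger than largest_acked")

    # ceil((log2(num_unacked) + 1) / 8) is the smallest num_bytes with
    # num_unacked <= 2 ** (8 * num_bytes - 1); test that with integers.
    for num_bytes in (1, 2, 3, 4):
        if num_unacked <= 1 << (8 * num_bytes - 1):
            break
    else:
        raise ValueError("range too large for a 4-byte packet number")

    # Keep the num_bytes least-significant bytes, in network byte order.
    truncated = full_pn & ((1 << (8 * num_bytes)) - 1)
    return truncated.to_bytes(num_bytes, "big")


# 29,519 outstanding packet numbers fit a 2-byte encoding.
print(encode_packet_number(0xAC5C02, 0xAC5C02 - 29_519).hex())  # 5c02
# With nothing acknowledged yet, packet number 0 needs a single byte.
print(encode_packet_number(0, None).hex())  # 00
```

The sketch raises an error rather than guessing when the range cannot
be represented in four bytes; a real sender would need its own policy
for that case.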

The EncodePacketNumber function takes two arguments:

*  full_pn is the full packet number of the packet being sent.

*  largest_acked is the largest packet number that has been
   acknowledged by the peer in the current packet number space, if
   any.

   EncodePacketNumber(full_pn, largest_acked):

     // The number of bits must be at least one more
     // than the base-2 logarithm of the number of contiguous
     // unacknowledged packet numbers, including the new packet.
     if largest_acked is None:
       num_unacked = full_pn + 1
     else:
       num_unacked = full_pn - largest_acked

     min_bits = log(num_unacked, 2) + 1
     num_bytes = ceil(min_bits / 8)

     // Encode the integer value and truncate to
     // the num_bytes least-significant bytes.
     return encode(full_pn, num_bytes)

Figure 46: Sample Packet Number Encoding Algorithm

For example, if an endpoint has received an acknowledgment for packet
0xabe8b3 and is sending a packet with a number of 0xac5c02, there are
29,519 (0x734f) outstanding packets.  In order to represent at least
twice this range (59,038 packets, or 0xe69e), 16 bits are required.

In the same state, sending a packet with a number of 0xace8fe uses
the 24-bit encoding, because at least 18 bits are required to
represent twice the range (131,222 packets, or 0x20096).

A.3.  Sample Packet Number Decoding Algorithm

The pseudocode in Figure 47 includes an example algorithm for
decoding packet numbers after header protection has been removed.

The DecodePacketNumber function takes three arguments:

*  largest_pn is the largest packet number that has been successfully
   processed in the current packet number space.

*  truncated_pn is the value of the Packet Number field.

*  pn_nbits is the number of bits in the Packet Number field (8, 16,
   24, or 32).
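
The three arguments listed above map directly onto a function
signature.  The Python sketch below is one possible runnable rendering
of the decoding algorithm of Figure 47, offered as an aid to
experimentation rather than as normative text.  The name
decode_packet_number is chosen for this illustration, and integer
floor division stands in for the pseudocode's division by two.

```python
def decode_packet_number(largest_pn: int, truncated_pn: int, pn_nbits: int) -> int:
    """Recover a full packet number from its truncated encoding.

    Illustrative sketch; the function name is an assumption.
    """
    expected_pn = largest_pn + 1
    pn_win = 1 << pn_nbits      # size of the packet number window
    pn_hwin = pn_win // 2       # half of that window
    pn_mask = pn_win - 1        # selects the truncated low-order bits

    # Start from expected_pn with its low-order bits replaced by the
    # received value, then move by one window if that candidate falls
    # outside (expected_pn - pn_hwin, expected_pn + pn_hwin].
    candidate_pn = (expected_pn & ~pn_mask) | truncated_pn

    # Candidate is too small: step up one window, unless that would
    # exceed the 62-bit packet number space.
    if (candidate_pn <= expected_pn - pn_hwin
            and candidate_pn < (1 << 62) - pn_win):
        return candidate_pn + pn_win

    # Candidate is too large: step down one window, unless that would
    # go below zero.
    if candidate_pn > expected_pn + pn_hwin and candidate_pn >= pn_win:
        return candidate_pn - pn_win

    return candidate_pn


# Values from this appendix's decoding example.
print(hex(decode_packet_number(0xA82F30EA, 0x9B32, 16)))  # 0xa82f9b32
```

Python integers are unbounded, so the two range checks are not needed
for memory safety here; they are kept so that the result stays within
the 62-bit packet number space, as in the pseudocode.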

   DecodePacketNumber(largest_pn, truncated_pn, pn_nbits):
      expected_pn  = largest_pn + 1
      pn_win       = 1 << pn_nbits
      pn_hwin      = pn_win / 2
      pn_mask      = pn_win - 1
      // The incoming packet number should be greater than
      // expected_pn - pn_hwin and less than or equal to
      // expected_pn + pn_hwin
      //
      // This means we cannot just strip the trailing bits from
      // expected_pn and add the truncated_pn because that might
      // yield a value outside the window.
      //
      // The following code calculates a candidate value and
      // makes sure it's within the packet number window.
      // Note the extra checks to prevent overflow and underflow.
      candidate_pn = (expected_pn & ~pn_mask) | truncated_pn
      if candidate_pn <= expected_pn - pn_hwin and
         candidate_pn < (1 << 62) - pn_win:
         return candidate_pn + pn_win
      if candidate_pn > expected_pn + pn_hwin and
         candidate_pn >= pn_win:
         return candidate_pn - pn_win
      return candidate_pn

Figure 47: Sample Packet Number Decoding Algorithm

For example, if the highest successfully authenticated packet had a
packet number of 0xa82f30ea, then a packet containing a 16-bit value
of 0x9b32 will be decoded as 0xa82f9b32.

A.4.  Sample ECN Validation Algorithm

Each time an endpoint commences sending on a new network path, it
determines whether the path supports ECN; see Section 13.4.  If the
path supports ECN, the goal is to use ECN.  Endpoints might also
periodically reassess a path that was determined to not support ECN.

This section describes one method for testing new paths.  This
algorithm is intended to show how a path might be tested for ECN
support.  Endpoints can implement different methods.

The path is assigned an ECN state that is one of "testing",
"unknown", "failed", or "capable".
On paths with a "testing" or
"capable" state, the endpoint sends packets with an ECT marking, by
default ECT(0); otherwise, the endpoint sends unmarked packets.

To start testing a path, the ECN state is set to "testing" and
existing ECN counts are remembered as a baseline.

The testing period runs for a number of packets or a limited time, as
determined by the endpoint.  The goal is not to limit the duration of
the testing period, but to ensure that enough marked packets are sent
for received ECN counts to provide a clear indication of how the path
treats marked packets.  Section 13.4.2 suggests limiting this to 10
packets or 3 times the probe timeout.

After the testing period ends, the ECN state for the path becomes
"unknown".  From the "unknown" state, successful validation of the
ECN counts in an ACK frame (see Section 13.4.2.1) causes the ECN
state for the path to become "capable", unless no marked packet has
been acknowledged.

If validation of ECN counts fails at any time, the ECN state for the
affected path becomes "failed".  An endpoint can also mark the ECN
state for a path as "failed" if marked packets are all declared lost
or if they are all CE marked.

Following this algorithm ensures that ECN is rarely disabled for
paths that properly support ECN.  Any path that incorrectly modifies
markings will cause ECN to be disabled.  For those rare cases where
marked packets are discarded by the path, the short duration of the
testing period limits the number of losses incurred.

Appendix B.  Change Log

   *RFC Editor's Note:*  Please remove this section prior to
   publication of a final version of this document.

Issue and pull request numbers are listed with a leading octothorp.

B.1.
Since draft-ietf-quic-transport-32 8488 * Endpoints are required to limit the total data they send in 8489 response to an apparent connection migration to three times what 8490 was received (#4257, #4264) 8492 * Added an error code for path validation failures (#4257, #4331) 8494 * Defined DoS protections for clients during the handshake (#4259, 8495 #4330, #4344) 8497 * Prohibited connection errors when datagrams are not the required 8498 size (#4273, #4342) 8500 * Stop using initial timeout for path validation (#4261, #4262, 8501 #4263). 8503 * A number of improvements to IANA considerations: 8505 - Added a registry for versions (#4345, #4280) 8507 - Clarified rules for first reserved value (#4378, #4389) 8509 - Reserved values are not added to the registry (#4372, #4428) 8511 * Added final version numbers (#4430) 8513 B.2. Since draft-ietf-quic-transport-31 8515 * Require expansion of datagrams to ensure that a path supports at 8516 least 1200 bytes in both directions: 8518 - During the handshake ack-eliciting Initial packets from the 8519 server need to be expanded (#4183, #4188) 8521 - Path validation now requires packets containing PATH_CHALLENGE 8522 and PATH_RESPONSE to be expanded and PATH_RESPONSE is sent on 8523 the same network path (#4216, #4226) 8525 * Though senders need to expand datagrams in some cases, receivers 8526 cannot enforce this requirement (#4253, #4254) 8528 * Split contact into contact and change controller for IANA 8529 registrations (#4230, #4239) 8531 B.3. Since draft-ietf-quic-transport-30 8533 * Use TRANSPORT_PARAMETER_ERROR for an invalid transport parameter 8534 (#4099, #4100) 8536 * Add a new error code for AEAD_LIMIT_REACHED code to avoid conflict 8537 (#4087, #4088) 8539 * Allow use of address validation token when server address changes 8540 (#4076, #4089) 8542 B.4. 
Since draft-ietf-quic-transport-29 8544 * Require the same connection ID on coalesced packets (#3800, #3930) 8546 * Allow caching of packets that can't be decrypted, by allowing the 8547 reported acknowledgment delay to exceed max_ack_delay prior to 8548 confirming the handshake (#3821, #3980, #4035, #3874) 8550 * Allow connection ID to be used for address validation (#3834, 8551 #3924) 8553 * Required protocol operations are no longer directed at 8554 implementations, but are features provided to application 8555 protocols (#3838, #3935) 8557 * Narrow requirements for reset of congestion state on path change 8558 (#3842, #3945) 8560 * Add a three times amplification limit for sending of 8561 CONNECTION_CLOSE with reduced state (#3845, #3864) 8563 * Change error code for invalid RETIRE_CONNECTION_ID frames (#3860, 8564 #3861) 8566 * Recommend retention of state for lost packets to allow for late 8567 arrival and avoid unnecessary retransmission (#3956, #3957) 8569 * Allow a server to reject connections if a client reuses packet 8570 numbers after Retry (#3989, #3990) 8572 * Limit recommendation for immediate acknowledgment to when ack- 8573 eliciting packets are reordered (#4001, #4000) 8575 B.5. Since draft-ietf-quic-transport-28 8577 * Made SERVER_BUSY error (0x2) more generic, now CONNECTION_REFUSED 8578 (#3709, #3690, #3694) 8580 * Allow TRANSPORT_PARAMETER_ERROR when validating connection IDs 8581 (#3703, #3691) 8583 * Integrate QUIC-specific language from draft-ietf-tsvwg-datagram- 8584 plpmtud (#3695, #3702) 8586 * disable_active_migration does not apply to the addresses offered 8587 in server_preferred_address (#3608, #3670) 8589 B.6. 
Since draft-ietf-quic-transport-27 8591 * Allowed CONNECTION_CLOSE in any packet number space, with a 8592 requirement to use a new transport-level error for application- 8593 specific errors in Initial and Handshake packets (#3430, #3435, 8594 #3440) 8596 * Clearer requirements for address validation (#2125, #3327) 8598 * Security analysis of handshake and migration (#2143, #2387, #2925) 8600 * The entire payload of a datagram is used when counting bytes for 8601 mitigating amplification attacks (#3333, #3470) 8603 * Connection IDs can be used at any time, including in the handshake 8604 (#3348, #3560, #3438, #3565) 8606 * Only one ACK should be sent for each instance of reordering 8607 (#3357, #3361) 8609 * Remove text allowing a server to proceed with a bad Retry token 8610 (#3396, #3398) 8612 * Ignore active_connection_id_limit with a zero-length connection ID 8613 (#3427, #3426) 8615 * Require active_connection_id_limit be remembered for 0-RTT (#3423, 8616 #3425) 8618 * Require ack_delay not be remembered for 0-RTT (#3433, #3545) 8620 * Redefined max_packet_size to max_udp_datagram_size (#3471, #3473) 8622 * Guidance on limiting outstanding attempts to retire connection IDs 8623 (#3489, #3509, #3557, #3547) 8625 * Restored text on dropping bogus Version Negotiation packets 8626 (#3532, #3533) 8628 * Clarified that largest acknowledged needs to be saved, but not 8629 necessarily signaled in all cases (#3541, #3581) 8631 * Addressed linkability risk with the use of preferred_address 8632 (#3559, #3563) 8634 * Added authentication of handshake connection IDs (#3439, #3499) 8636 * Opening a stream in the wrong direction is an error (#3527) 8638 B.7. Since draft-ietf-quic-transport-26 8640 * Change format of transport parameters to use varints (#3294, 8641 #3169) 8643 B.8. 
Since draft-ietf-quic-transport-25 8645 * Define the use of CONNECTION_CLOSE prior to establishing 8646 connection state (#3269, #3297, #3292) 8648 * Allow use of address validation tokens after client address 8649 changes (#3307, #3308) 8651 * Define the timer for address validation (#2910, #3339) 8653 B.9. Since draft-ietf-quic-transport-24 8655 * Added HANDSHAKE_DONE to signal handshake confirmation (#2863, 8656 #3142, #3145) 8658 * Add integrity check to Retry packets (#3014, #3274, #3120) 8660 * Specify handling of reordered NEW_CONNECTION_ID frames (#3194, 8661 #3202) 8663 * Require checking of sequence numbers in RETIRE_CONNECTION_ID 8664 (#3037, #3036) 8666 * active_connection_id_limit is enforced (#3193, #3197, #3200, 8667 #3201) 8669 * Correct overflow in packet number decode algorithm (#3187, #3188) 8671 * Allow use of CRYPTO_BUFFER_EXCEEDED for CRYPTO frame errors 8672 (#3258, #3186) 8674 * Define applicability and scope of NEW_TOKEN (#3150, #3152, #3155, 8675 #3156) 8677 * Tokens from Retry and NEW_TOKEN must be differentiated (#3127, 8678 #3128) 8680 * Allow CONNECTION_CLOSE in response to invalid token (#3168, #3107) 8681 * Treat an invalid CONNECTION_CLOSE as an invalid frame (#2475, 8682 #3230, #3231) 8684 * Throttle when sending CONNECTION_CLOSE after discarding state 8685 (#3095, #3157) 8687 * Application-variant of CONNECTION_CLOSE can only be sent in 0-RTT 8688 or 1-RTT packets (#3158, #3164) 8690 * Advise sending while blocked to avoid idle timeout (#2744, #3266) 8692 * Define error codes for invalid frames (#3027, #3042) 8694 * Idle timeout is symmetric (#2602, #3099) 8696 * Prohibit IP fragmentation (#3243, #3280) 8698 * Define the use of provisional registration for all registries 8699 (#3109, #3020, #3102, #3170) 8701 * Packets on one path must not adjust values for a different path 8702 (#2909, #3139) 8704 B.10. 
Since draft-ietf-quic-transport-23 8706 * Allow ClientHello to span multiple packets (#2928, #3045) 8708 * Client Initial size constraints apply to UDP datagram payload 8709 (#3053, #3051) 8711 * Stateless reset changes (#2152, #2993) 8713 - tokens need to be compared in constant time 8715 - detection uses UDP datagrams, not packets 8717 - tokens cannot be reused (#2785, #2968) 8719 * Clearer rules for sharing of UDP ports and use of connection IDs 8720 when doing so (#2844, #2851) 8722 * A new connection ID is necessary when responding to migration 8723 (#2778, #2969) 8725 * Stronger requirements for connection ID retirement (#3046, #3096) 8727 * NEW_TOKEN cannot be empty (#2978, #2977) 8728 * PING can be sent at any encryption level (#3034, #3035) 8730 * CONNECTION_CLOSE is not ack-eliciting (#3097, #3098) 8732 * Frame encoding error conditions updated (#3027, #3042) 8734 * Non-ack-eliciting packets cannot be sent in response to non-ack- 8735 eliciting packets (#3100, #3104) 8737 * Servers have to change connection IDs in Retry (#2837, #3147) 8739 B.11. 
Since draft-ietf-quic-transport-22 8741 * Rules for preventing correlation by connection ID tightened 8742 (#2084, #2929) 8744 * Clarified use of CONNECTION_CLOSE in Handshake packets (#2151, 8745 #2541, #2688) 8747 * Discourage regressions of largest acknowledged in ACK (#2205, 8748 #2752) 8750 * Improved robustness of validation process for ECN counts (#2534, 8751 #2752) 8753 * Require endpoints to ignore spurious migration attempts (#2342, 8754 #2893) 8756 * Transport parameter for disabling migration clarified to allow NAT 8757 rebinding (#2389, #2893) 8759 * Document principles for defining new error codes (#2388, #2880) 8761 * Reserve transport parameters for greasing (#2550, #2873) 8763 * A maximum ACK delay of 0 is used for handshake packet number 8764 spaces (#2646, #2638) 8766 * Improved rules for use of congestion control state on new paths 8767 (#2685, #2918) 8769 * Removed recommendation to coordinate spin for multiple connections 8770 that share a path (#2763, #2882) 8772 * Allow smaller stateless resets and recommend a smaller minimum on 8773 packets that might trigger a stateless reset (#2770, #2869, #2927, 8774 #3007). 8776 * Provide guidance around the interface to QUIC as used by 8777 application protocols (#2805, #2857) 8779 * Frames other than STREAM can cause STREAM_LIMIT_ERROR (#2825, 8780 #2826) 8782 * Tighter rules about processing of rejected 0-RTT packets (#2829, 8783 #2840, #2841) 8785 * Explanation of the effect of Retry on 0-RTT packets (#2842, #2852) 8787 * Cryptographic handshake needs to provide server transport 8788 parameter encryption (#2920, #2921) 8790 * Moved ACK generation guidance from recovery draft to transport 8791 draft (#1860, #2916). 8793 B.12. Since draft-ietf-quic-transport-21 8795 * Connection ID lengths are now one octet, but limited in version 1 8796 to 20 octets of length (#2736, #2749) 8798 B.13. 
Since draft-ietf-quic-transport-20

*  Error codes are encoded as variable-length integers (#2672, #2680)

*  NEW_CONNECTION_ID includes a request to retire old connection IDs
   (#2645, #2769)

*  Tighter rules for generating and explicitly eliciting ACK frames
   (#2546, #2794)

*  Recommend having only one packet per encryption level in a
   datagram (#2308, #2747)

*  More normative language about use of stateless reset (#2471,
   #2574)

*  Allow reuse of stateless reset tokens (#2732, #2733)

*  Allow, but not require, enforcing non-duplicate transport
   parameters (#2689, #2691)

*  Added an active_connection_id_limit transport parameter (#1994,
   #1998)

*  max_ack_delay transport parameter defaults to 0 (#2638, #2646)

*  When sending 0-RTT, only remembered transport parameters apply
   (#2458, #2360, #2466, #2461)

*  Define handshake completion and confirmation; define clearer rules
   for when encryption keys should be discarded (#2214, #2267, #2673)

*  Prohibit path migration prior to handshake confirmation (#2309,
   #2370)

*  PATH_RESPONSE no longer needs to be received on the validated path
   (#2582, #2580, #2579, #2637)

*  PATH_RESPONSE frames are not stored and retransmitted (#2724,
   #2729)

*  Document hack for enabling routing of ICMP when doing PMTU probing
   (#1243, #2402)

B.14.
Since draft-ietf-quic-transport-19 8843 * Refine discussion of 0-RTT transport parameters (#2467, #2464) 8845 * Fewer transport parameters need to be remembered for 0-RTT (#2624, 8846 #2467) 8848 * Spin bit text incorporated (#2564) 8850 * Close the connection when maximum stream ID in MAX_STREAMS exceeds 8851 2^62 - 1 (#2499, #2487) 8853 * New connection ID required for intentional migration (#2414, 8854 #2413) 8856 * Connection ID issuance can be rate-limited (#2436, #2428) 8858 * The "QUIC bit" is ignored in Version Negotiation (#2400, #2561) 8860 * Initial packets from clients need to be padded to 1200 unless a 8861 Handshake packet is sent as well (#2522, #2523) 8863 * CRYPTO frames can be discarded if too much data is buffered 8864 (#1834, #2524) 8866 * Stateless reset uses a short header packet (#2599, #2600) 8868 B.15. Since draft-ietf-quic-transport-18 8869 * Removed version negotiation; version negotiation, including 8870 authentication of the result, will be addressed in the next 8871 version of QUIC (#1773, #2313) 8873 * Added discussion of the use of IPv6 flow labels (#2348, #2399) 8875 * A connection ID can't be retired in a packet that uses that 8876 connection ID (#2101, #2420) 8878 * Idle timeout transport parameter is in milliseconds (from seconds) 8879 (#2453, #2454) 8881 * Endpoints are required to use new connection IDs when they use new 8882 network paths (#2413, #2414) 8884 * Increased the set of permissible frames in 0-RTT (#2344, #2355) 8886 B.16. 
Since draft-ietf-quic-transport-17 8888 * Stream-related errors now use STREAM_STATE_ERROR (#2305) 8890 * Endpoints discard initial keys as soon as handshake keys are 8891 available (#1951, #2045) 8893 * Expanded conditions for ignoring ICMP packet too big messages 8894 (#2108, #2161) 8896 * Remove rate control from PATH_CHALLENGE/PATH_RESPONSE (#2129, 8897 #2241) 8899 * Endpoints are permitted to discard malformed initial packets 8900 (#2141) 8902 * Clarified ECN implementation and usage requirements (#2156, #2201) 8904 * Disable ECN count verification for packets that arrive out of 8905 order (#2198, #2215) 8907 * Use Probe Timeout (PTO) instead of RTO (#2206, #2238) 8909 * Loosen constraints on retransmission of ACK ranges (#2199, #2245) 8911 * Limit Retry and Version Negotiation to once per datagram (#2259, 8912 #2303) 8914 * Set a maximum value for max_ack_delay transport parameter (#2282, 8915 #2301) 8917 * Allow server preferred address for both IPv4 and IPv6 (#2122, 8918 #2296) 8920 * Corrected requirements for migration to a preferred address 8921 (#2146, #2349) 8923 * ACK of non-existent packet is illegal (#2298, #2302) 8925 B.17. 
Since draft-ietf-quic-transport-16 8927 * Stream limits are defined as counts, not maximums (#1850, #1906) 8929 * Require amplification attack defense after closing (#1905, #1911) 8931 * Remove reservation of application error code 0 for STOPPING 8932 (#1804, #1922) 8934 * Renumbered frames (#1945) 8936 * Renumbered transport parameters (#1946) 8938 * Numeric transport parameters are expressed as varints (#1608, 8939 #1947, #1955) 8941 * Reorder the NEW_CONNECTION_ID frame (#1952, #1963) 8943 * Rework the first byte (#2006) 8945 - Fix the 0x40 bit 8947 - Change type values for long header 8949 - Add spin bit to short header (#631, #1988) 8951 - Encrypt the remainder of the first byte (#1322) 8953 - Move packet number length to first byte 8955 - Move ODCIL to first byte of retry packets 8957 - Simplify packet number protection (#1575) 8959 * Allow STOP_SENDING to open a remote bidirectional stream (#1797, 8960 #2013) 8962 * Added mitigation for off-path migration attacks (#1278, #1749, 8963 #2033) 8965 * Don't let the PMTU drop below 1280 (#2063, #2069) 8967 * Require peers to replace retired connection IDs (#2085) 8969 * Servers are required to ignore Version Negotiation packets (#2088) 8971 * Tokens are repeated in all Initial packets (#2089) 8973 * Clarified how PING frames are sent after loss (#2094) 8975 * Initial keys are discarded once Handshake keys are available (#1951, 8976 #2045) 8978 * ICMP PTB validation clarifications (#2161, #2109, #2108) 8980 B.18. Since draft-ietf-quic-transport-15 8982 Substantial editorial reorganization; no technical changes. 8984 B.19. 
Since draft-ietf-quic-transport-14 8986 * Merge ACK and ACK_ECN (#1778, #1801) 8988 * Explicitly communicate max_ack_delay (#981, #1781) 8990 * Validate original connection ID after Retry packets (#1710, #1486, 8991 #1793) 8993 * Idle timeout is optional and has no specified maximum (#1765) 8995 * Update connection ID handling; add RETIRE_CONNECTION_ID type 8996 (#1464, #1468, #1483, #1484, #1486, #1495, #1729, #1742, #1799, 8997 #1821) 8999 * Include a Token in all Initial packets (#1649, #1794) 9001 * Prevent handshake deadlock (#1764, #1824) 9003 B.20. Since draft-ietf-quic-transport-13 9005 * Streams open when higher-numbered streams of the same type open 9006 (#1342, #1549) 9008 * Split initial stream flow control limit into 3 transport 9009 parameters (#1016, #1542) 9011 * All flow control transport parameters are optional (#1610) 9012 * Removed UNSOLICITED_PATH_RESPONSE error code (#1265, #1539) 9014 * Permit stateless reset in response to any packet (#1348, #1553) 9016 * Recommended defense against stateless reset spoofing (#1386, 9017 #1554) 9019 * Prevent infinite stateless reset exchanges (#1443, #1627) 9021 * Forbid processing of the same packet number twice (#1405, #1624) 9023 * Added a packet number decoding example (#1493) 9025 * More precisely define idle timeout (#1429, #1614, #1652) 9027 * Corrected format of Retry packet and prevented looping (#1492, 9028 #1451, #1448, #1498) 9030 * Permit 0-RTT after receiving Version Negotiation or Retry (#1507, 9031 #1514, #1621) 9033 * Permit Retry in response to 0-RTT (#1547, #1552) 9035 * Looser verification of ECN counters to account for ACK loss 9036 (#1555, #1481, #1565) 9038 * Remove frame type field from APPLICATION_CLOSE (#1508, #1528) 9040 B.21. 
Since draft-ietf-quic-transport-12 9042 * Changes to integration of the TLS handshake (#829, #1018, #1094, 9043 #1165, #1190, #1233, #1242, #1252, #1450, #1458) 9045 - The cryptographic handshake uses CRYPTO frames, not stream 0 9047 - QUIC packet protection is used in place of TLS record 9048 protection 9050 - Separate QUIC packet number spaces are used for the handshake 9052 - Changed Retry to be independent of the cryptographic handshake 9054 - Added NEW_TOKEN frame and Token fields to Initial packet 9056 - Limit the use of HelloRetryRequest to address TLS needs (like 9057 key shares) 9059 * Enable server to transition connections to a preferred address 9060 (#560, #1251, #1373) 9062 * Added ECN feedback mechanisms and handling; new ACK_ECN frame 9063 (#804, #805, #1372) 9065 * Changed rules and recommendations for use of new connection IDs 9066 (#1258, #1264, #1276, #1280, #1419, #1452, #1453, #1465) 9068 * Added a transport parameter to disable intentional connection 9069 migration (#1271, #1447) 9071 * Packets with different connection IDs can't be coalesced (#1287, 9072 #1423) 9074 * Fixed sampling method for packet number encryption; the length 9075 field in long headers includes the packet number field in addition 9076 to the packet payload (#1387, #1389) 9078 * Stateless Reset is now symmetric and subject to size constraints 9079 (#466, #1346) 9081 * Added frame type extension mechanism (#58, #1473) 9083 B.22. Since draft-ietf-quic-transport-11 9085 * Enable server to transition connections to a preferred address 9086 (#560, #1251) 9088 * Packet numbers are encrypted (#1174, #1043, #1048, #1034, #850, 9089 #990, #734, #1317, #1267, #1079) 9091 * Packet numbers use a variable-length encoding (#989, #1334) 9093 * STREAM frames can now be empty (#1350) 9095 B.23. 
Since draft-ietf-quic-transport-10 9097 * Swap payload length and packet number fields in long header 9098 (#1294) 9100 * Clarified that CONNECTION_CLOSE is allowed in Handshake packet 9101 (#1274) 9103 * Spin bit reserved (#1283) 9105 * Coalescing multiple QUIC packets in a UDP datagram (#1262, #1285) 9106 * A more complete connection migration (#1249) 9108 * Refine opportunistic ACK defense text (#305, #1030, #1185) 9110 * A Stateless Reset Token isn't mandatory (#818, #1191) 9112 * Removed implicit stream opening (#896, #1193) 9114 * An empty STREAM frame can be used to open a stream without sending 9115 data (#901, #1194) 9117 * Define stream counts in transport parameters rather than a maximum 9118 stream ID (#1023, #1065) 9120 * STOP_SENDING is now prohibited before streams are used (#1050) 9122 * Recommend including ACK in Retry packets and allow PADDING (#1067, 9123 #882) 9125 * Endpoints now become closing after an idle timeout (#1178, #1179) 9127 * Remove implication that Version Negotiation is sent when a packet 9128 of the wrong version is received (#1197) 9130 B.24. Since draft-ietf-quic-transport-09 9132 * Added PATH_CHALLENGE and PATH_RESPONSE frames to replace PING with 9133 Data and PONG frame. Changed ACK frame type from 0x0e to 0x0d. 9134 (#1091, #725, #1086) 9136 * A server can now only send 3 packets without validating the client 9137 address (#38, #1090) 9139 * Delivery order of stream data is no longer strongly specified 9140 (#252, #1070) 9142 * Rework of packet handling and version negotiation (#1038) 9144 * Stream 0 is now exempt from flow control until the handshake 9145 completes (#1074, #725, #825, #1082) 9147 * Improved retransmission rules for all frame types: information is 9148 retransmitted, not packets or frames (#463, #765, #1095, #1053) 9150 * Added an error code for server busy signals (#1137) 9151 * Endpoints now set the connection ID that their peer uses. 9152 Connection IDs are variable length. 
Removed the 9153 omit_connection_id transport parameter and the corresponding short 9154 header flag. (#1089, #1052, #1146, #821, #745, #821, #1166, #1151) 9156 B.25. Since draft-ietf-quic-transport-08 9158 * Clarified requirements for BLOCKED usage (#65, #924) 9160 * BLOCKED frame now includes reason for blocking (#452, #924, #927, 9161 #928) 9163 * GAP limitation in ACK Frame (#613) 9165 * Improved PMTUD description (#614, #1036) 9167 * Clarified stream state machine (#634, #662, #743, #894) 9169 * Reserved versions don't need to be generated deterministically 9170 (#831, #931) 9172 * You don't always need the draining period (#871) 9174 * Stateless reset clarified as version-specific (#930, #986) 9176 * initial_max_stream_id_x transport parameters are optional (#970, 9177 #971) 9179 * ACK delay assumes a default value during the handshake (#1007, 9180 #1009) 9182 * Removed transport parameters from NewSessionTicket (#1015) 9184 B.26. Since draft-ietf-quic-transport-07 9186 * The long header now has version before packet number (#926, #939) 9188 * Rename and consolidate packet types (#846, #822, #847) 9190 * Packet types are assigned new codepoints and the Connection ID 9191 Flag is inverted (#426, #956) 9193 * Removed type for Version Negotiation and use Version 0 (#963, 9194 #968) 9196 * Streams are split into unidirectional and bidirectional (#643, 9197 #656, #720, #872, #175, #885) 9198 - Stream limits now have separate uni- and bi-directional 9199 transport parameters (#909, #958) 9201 - Stream limit transport parameters are now optional and default 9202 to 0 (#970, #971) 9204 * The stream state machine has been split into read and write (#634, 9205 #894) 9207 * Employ variable-length integer encodings throughout (#595) 9209 * Improvements to connection close 9211 - Added distinct closing and draining states (#899, #871) 9213 - Draining period can terminate early (#869, #870) 9215 - Clarifications about stateless reset (#889, #890) 9217 * Address validation 
for connection migration (#161, #732, #878) 9219 * Clearly defined retransmission rules for BLOCKED (#452, #65, #924) 9221 * negotiated_version is sent in server transport parameters (#710, 9222 #959) 9224 * Increased the range over which packet numbers are randomized 9225 (#864, #850, #964) 9227 B.27. Since draft-ietf-quic-transport-06 9229 * Replaced FNV-1a with AES-GCM for all "Cleartext" packets (#554) 9231 * Split error code space between application and transport (#485) 9233 * Stateless reset token moved to end (#820) 9235 * 1-RTT-protected long header types removed (#848) 9237 * No acknowledgments during draining period (#852) 9239 * Remove "application close" as a separate close type (#854) 9241 * Remove timestamps from the ACK frame (#841) 9243 * Require transport parameters to only appear once (#792) 9245 B.28. Since draft-ietf-quic-transport-05 9247 * Stateless token is server-only (#726) 9249 * Refactor section on connection termination (#733, #748, #328, 9250 #177) 9252 * Limit size of Version Negotiation packet (#585) 9254 * Clarify when and what to ack (#736) 9256 * Renamed STREAM_ID_NEEDED to STREAM_ID_BLOCKED 9258 * Clarify Keep-alive requirements (#729) 9260 B.29. 
Since draft-ietf-quic-transport-04 9262 * Introduce STOP_SENDING frame, RESET_STREAM only resets in one 9263 direction (#165) 9265 * Removed GOAWAY; application protocols are responsible for graceful 9266 shutdown (#696) 9268 * Reduced the number of error codes (#96, #177, #184, #211) 9270 * Version validation fields can't move or change (#121) 9272 * Removed versions from the transport parameters in a 9273 NewSessionTicket message (#547) 9275 * Clarify the meaning of "bytes in flight" (#550) 9277 * Public reset is now stateless reset and not visible to the path 9278 (#215) 9280 * Reordered bits and fields in STREAM frame (#620) 9282 * Clarifications to the stream state machine (#572, #571) 9284 * Increased the maximum length of the Largest Acknowledged field in 9285 ACK frames to 64 bits (#629) 9287 * truncate_connection_id is renamed to omit_connection_id (#659) 9289 * CONNECTION_CLOSE terminates the connection like TCP RST (#330, 9290 #328) 9292 * Update labels used in HKDF-Expand-Label to match TLS 1.3 (#642) 9294 B.30. Since draft-ietf-quic-transport-03 9296 * Change STREAM and RESET_STREAM layout 9298 * Add MAX_STREAM_ID settings 9300 B.31. 
Since draft-ietf-quic-transport-02 9302 * The size of the initial packet payload has a fixed minimum (#267, 9303 #472) 9305 * Define when Version Negotiation packets are ignored (#284, #294, 9306 #241, #143, #474) 9308 * The 64-bit FNV-1a algorithm is used for integrity protection of 9309 unprotected packets (#167, #480, #481, #517) 9311 * Rework initial packet types to change how the connection ID is 9312 chosen (#482, #442, #493) 9314 * Timestamps are forbidden in unprotected packets (#542, #429) 9316 * Cryptographic handshake is now on stream 0 (#456) 9318 * Remove congestion control exemption for cryptographic handshake 9319 (#248, #476) 9321 * Version 1 of QUIC uses TLS; a new version is needed to use a 9322 different handshake protocol (#516) 9324 * STREAM frames have a reduced number of offset lengths (#543, #430) 9326 * Split some frames into separate connection- and stream- level 9327 frames (#443) 9329 - WINDOW_UPDATE split into MAX_DATA and MAX_STREAM_DATA (#450) 9331 - BLOCKED split to match WINDOW_UPDATE split (#454) 9333 - Define STREAM_ID_NEEDED frame (#455) 9335 * A NEW_CONNECTION_ID frame supports connection migration without 9336 linkability (#232, #491, #496) 9338 * Transport parameters for 0-RTT are retained from a previous 9339 connection (#405, #513, #512) 9340 - A client in 0-RTT no longer required to reset excess streams 9341 (#425, #479) 9343 * Expanded security considerations (#440, #444, #445, #448) 9345 B.32. 
Since draft-ietf-quic-transport-01 9347 * Defined short and long packet headers (#40, #148, #361) 9349 * Defined a versioning scheme and stable fields (#51, #361) 9351 * Define reserved version values for "greasing" negotiation (#112, 9352 #278) 9354 * The initial packet number is randomized (#35, #283) 9356 * Narrow the packet number encoding range requirement (#67, #286, 9357 #299, #323, #356) 9359 * Defined client address validation (#52, #118, #120, #275) 9361 * Define transport parameters as a TLS extension (#49, #122) 9363 * SCUP and COPT parameters are no longer valid (#116, #117) 9365 * Transport parameters for 0-RTT are either remembered from before, 9366 or assume default values (#126) 9368 * The server chooses connection IDs in its final flight (#119, #349, 9369 #361) 9371 * The server echoes the Connection ID and packet number fields when 9372 sending a Version Negotiation packet (#133, #295, #244) 9374 * Defined a minimum packet size for the initial handshake packet 9375 from the client (#69, #136, #139, #164) 9377 * Path MTU Discovery (#64, #106) 9379 * The initial handshake packet from the client needs to fit in a 9380 single packet (#338) 9382 * Forbid acknowledgment of packets containing only ACK and PADDING 9383 (#291) 9385 * Require that frames are processed when packets are acknowledged 9386 (#381, #341) 9388 * Removed the STOP_WAITING frame (#66) 9390 * Don't require retransmission of old timestamps for lost ACK frames 9391 (#308) 9393 * Clarified that frames are not retransmitted, but the information 9394 in them can be (#157, #298) 9396 * Error handling definitions (#335) 9398 * Split error codes into four sections (#74) 9400 * Forbid the use of Public Reset where CONNECTION_CLOSE is possible 9401 (#289) 9403 * Define packet protection rules (#336) 9405 * Require that stream be entirely delivered or reset, including 9406 acknowledgment of all STREAM frames or the RESET_STREAM, before it 9407 closes (#381) 9409 * Remove stream reservation from 
state machine (#174, #280) 9411 * Only stream 1 does not contribute to connection-level flow control 9412 (#204) 9414 * Stream 1 counts towards the maximum concurrent stream limit (#201, 9415 #282) 9417 * Remove connection-level flow control exclusion for some streams 9418 (except 1) (#246) 9420 * RESET_STREAM affects connection-level flow control (#162, #163) 9422 * Flow control accounting uses the maximum data offset on each 9423 stream, rather than bytes received (#378) 9425 * Moved length-determining fields to the start of STREAM and ACK 9426 (#168, #277) 9428 * Added the ability to pad between frames (#158, #276) 9430 * Remove error code and reason phrase from GOAWAY (#352, #355) 9432 * GOAWAY includes a final stream number for both directions (#347) 9434 * Error codes for RESET_STREAM and CONNECTION_CLOSE are now at a 9435 consistent offset (#249) 9437 * Defined priority as the responsibility of the application protocol 9438 (#104, #303) 9440 B.33. Since draft-ietf-quic-transport-00 9442 * Replaced DIVERSIFICATION_NONCE flag with KEY_PHASE flag 9444 * Defined versioning 9446 * Reworked description of packet and frame layout 9448 * Error code space is divided into regions for each component 9450 * Use big endian for all numeric values 9452 B.34. Since draft-hamilton-quic-transport-protocol-01 9454 * Adopted as base for draft-ietf-quic-tls 9456 * Updated authors/editors list 9458 * Added IANA Considerations section 9460 * Moved Contributors and Acknowledgments to appendices 9462 Contributors 9464 The original design and rationale behind this protocol draw 9465 significantly from work by Jim Roskind [EARLY-DESIGN]. 9467 The IETF QUIC Working Group received an enormous amount of support 9468 from many people. 
The following people provided substantive 9469 contributions to this document: 9471 * Alessandro Ghedini 9473 * Alyssa Wilk 9475 * Antoine Delignat-Lavaud 9477 * Brian Trammell 9479 * Christian Huitema 9481 * Colin Perkins 9483 * David Schinazi 9484 * Dmitri Tikhonov 9486 * Eric Kinnear 9488 * Eric Rescorla 9490 * Gorry Fairhurst 9492 * Ian Swett 9494 * Igor Lubashev 9496 * 奥 一穂 (Kazuho Oku) 9498 * Lars Eggert 9500 * Lucas Pardue 9502 * Magnus Westerlund 9504 * Marten Seemann 9506 * Martin Duke 9508 * Mike Bishop 9510 * Mikkel Fahnøe Jørgensen 9512 * Mirja Kühlewind 9514 * Nick Banks 9516 * Nick Harper 9518 * Patrick McManus 9520 * Roberto Peon 9522 * Ryan Hamilton 9524 * Subodh Iyengar 9526 * Tatsuhiro Tsujikawa 9528 * Ted Hardie 9530 * Tom Jones 9531 * Victor Vasiliev 9533 Authors' Addresses 9535 Jana Iyengar (editor) 9536 Fastly 9538 Email: jri.ietf@gmail.com 9540 Martin Thomson (editor) 9541 Mozilla 9543 Email: mt@lowentropy.net