idnits 2.17.00 (12 Aug 2021) /tmp/idnits14439/draft-ietf-quic-manageability-16.txt: -(1383): Line appears to be too long, but this could be caused by non-ascii characters in UTF-8 encoding Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == There are 2 instances of lines with non-ascii characters in the document. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- -- The document date (6 April 2022) is 38 days in the past. Is this intentional? Checking references for intended status: Informational ---------------------------------------------------------------------------- == Outdated reference: draft-ietf-dprive-dnsoquic has been published as RFC 9250 == Outdated reference: draft-ietf-tsvwg-transport-encrypt has been published as RFC 9065 == Outdated reference: A later version (-16) exists of draft-ietf-quic-applicability-15 Summary: 0 errors (**), 0 flaws (~~), 4 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group M. Kuehlewind 3 Internet-Draft Ericsson 4 Intended status: Informational B. 
Trammell 5 Expires: 8 October 2022 Google Switzerland GmbH 6 6 April 2022 8 Manageability of the QUIC Transport Protocol 9 draft-ietf-quic-manageability-16 11 Abstract 13 This document discusses manageability of the QUIC transport protocol, 14 focusing on the implications of QUIC's design and wire image on 15 network operations involving QUIC traffic. It is intended as a 16 "user's manual" for the wire image, providing guidance for network 17 operators and equipment vendors who rely on the use of transport- 18 aware network functions. 20 Status of This Memo 22 This Internet-Draft is submitted in full conformance with the 23 provisions of BCP 78 and BCP 79. 25 Internet-Drafts are working documents of the Internet Engineering 26 Task Force (IETF). Note that other groups may also distribute 27 working documents as Internet-Drafts. The list of current Internet- 28 Drafts is at https://datatracker.ietf.org/drafts/current/. 30 Internet-Drafts are draft documents valid for a maximum of six months 31 and may be updated, replaced, or obsoleted by other documents at any 32 time. It is inappropriate to use Internet-Drafts as reference 33 material or to cite them other than as "work in progress." 35 This Internet-Draft will expire on 8 October 2022. 37 Copyright Notice 39 Copyright (c) 2022 IETF Trust and the persons identified as the 40 document authors. All rights reserved. 42 This document is subject to BCP 78 and the IETF Trust's Legal 43 Provisions Relating to IETF Documents (https://trustee.ietf.org/ 44 license-info) in effect on the date of publication of this document. 45 Please review these documents carefully, as they describe your rights 46 and restrictions with respect to this document. Code Components 47 extracted from this document must include Revised BSD License text as 48 described in Section 4.e of the Trust Legal Provisions and are 49 provided without warranty as described in the Revised BSD License. 51 Table of Contents 53 1. Introduction . . . . . . . . . 
. . . . . . . . . . . . . . . 3 54 2. Features of the QUIC Wire Image . . . . . . . . . . . . . . . 4 55 2.1. QUIC Packet Header Structure . . . . . . . . . . . . . . 4 56 2.2. Coalesced Packets . . . . . . . . . . . . . . . . . . . . 6 57 2.3. Use of Port Numbers . . . . . . . . . . . . . . . . . . . 7 58 2.4. The QUIC Handshake . . . . . . . . . . . . . . . . . . . 7 59 2.5. Integrity Protection of the Wire Image . . . . . . . . . 12 60 2.6. Connection ID and Rebinding . . . . . . . . . . . . . . . 12 61 2.7. Packet Numbers . . . . . . . . . . . . . . . . . . . . . 13 62 2.8. Version Negotiation and Greasing . . . . . . . . . . . . 13 63 3. Network-Visible Information about QUIC Flows . . . . . . . . 14 64 3.1. Identifying QUIC Traffic . . . . . . . . . . . . . . . . 14 65 3.1.1. Identifying Negotiated Version . . . . . . . . . . . 15 66 3.1.2. First Packet Identification for Garbage Rejection . . 15 67 3.2. Connection Confirmation . . . . . . . . . . . . . . . . . 15 68 3.3. Distinguishing Acknowledgment Traffic . . . . . . . . . . 16 69 3.4. Server Name Indication (SNI) . . . . . . . . . . . . . . 16 70 3.4.1. Extracting Server Name Indication (SNI) 71 Information . . . . . . . . . . . . . . . . . . . . . 16 72 3.5. Flow Association . . . . . . . . . . . . . . . . . . . . 18 73 3.6. Flow Teardown . . . . . . . . . . . . . . . . . . . . . . 18 74 3.7. Flow Symmetry Measurement . . . . . . . . . . . . . . . . 19 75 3.8. Round-Trip Time (RTT) Measurement . . . . . . . . . . . . 19 76 3.8.1. Measuring Initial RTT . . . . . . . . . . . . . . . . 19 77 3.8.2. Using the Spin Bit for Passive RTT Measurement . . . 19 78 4. Specific Network Management Tasks . . . . . . . . . . . . . . 21 79 4.1. Passive Network Performance Measurement and 80 Troubleshooting . . . . . . . . . . . . . . . . . . . . 21 81 4.2. Stateful Treatment of QUIC Traffic . . . . . . . . . . . 22 82 4.3. Address Rewriting to Ensure Routing Stability . . . . . . 23 83 4.4. 
Server Cooperation with Load Balancers . . . . . . . . . 24 84 4.5. Filtering Behavior . . . . . . . . . . . . . . . . . . . 24 85 4.6. UDP Blocking, Throttling, and NAT Binding . . . . . . . . 24 86 4.7. DDoS Detection and Mitigation . . . . . . . . . . . . . . 25 87 4.8. Quality of Service Handling and ECMP Routing . . . . . . 27 88 4.9. Handling ICMP Messages . . . . . . . . . . . . . . . . . 27 89 4.10. Guiding Path MTU . . . . . . . . . . . . . . . . . . . . 27 91 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 29 92 6. Security Considerations . . . . . . . . . . . . . . . . . . . 29 93 7. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 29 94 8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 30 95 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 30 96 9.1. Normative References . . . . . . . . . . . . . . . . . . 30 97 9.2. Informative References . . . . . . . . . . . . . . . . . 31 98 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 34 100 1. Introduction 102 QUIC [QUIC-TRANSPORT] is a new transport protocol that is 103 encapsulated in UDP. QUIC integrates TLS [QUIC-TLS] to encrypt all 104 payload data and most control information. QUIC version 1 was 105 designed primarily as a transport for HTTP, with the resulting 106 protocol being known as HTTP/3 [QUIC-HTTP]. 108 This document provides guidance for network operations that manage 109 QUIC traffic. This includes guidance on how to interpret and utilize 110 information that is exposed by QUIC to the network, requirements and 111 assumptions of the QUIC design with respect to network treatment, and 112 a description of how common network management practices will be 113 impacted by QUIC. 115 QUIC is an end-to-end transport protocol. No information in the 116 protocol header, even that which can be inspected, is mutable by the 117 network. This is enforced through integrity protection of the wire 118 image [WIRE-IMAGE]. 
Encryption of most transport-layer control 119 signaling means that less information is visible to the network than 120 is the case with TCP. 122 Integrity protection can also simplify troubleshooting at the end 123 points as none of the nodes on the network path can modify transport 124 layer information. However, it means in-network operations that 125 depend on modification of data (for examples, see [RFC9065]) are not 126 possible without the cooperation of a QUIC endpoint. Such 127 cooperation might be possible with the introduction of a proxy which 128 authenticates as an endpoint. Proxy operations are not in scope for 129 this document. 131 Network management is not a one-size-fits-all endeavour: practices 132 considered necessary or even mandatory within enterprise networks 133 with certain compliance requirements, for example, would be 134 impermissible on other networks without those requirements. The 135 presence of a particular practice in this document should therefore 136 not be construed as a recommendation to apply it. For each practice, 137 this document describes what is and is not possible with the QUIC 138 transport protocol as defined. 140 This document focuses solely on network management practices that 141 observe traffic on the wire. Replacement of troubleshooting based on 142 observation with active measurement techniques, for example, is 143 therefore out of scope. A more generalized treatment of network 144 management operations on encrypted transports is given in [RFC9065]. 146 QUIC-specific terminology used in this document is defined in 147 [QUIC-TRANSPORT]. 149 2. Features of the QUIC Wire Image 151 This section discusses those aspects of the QUIC transport protocol 152 that have an impact on the design and operation of devices that 153 forward QUIC packets. 
This section is therefore primarily 154 considering the unencrypted part of QUIC's wire image [WIRE-IMAGE], 155 which is defined as the information available in the packet header in 156 each QUIC packet, and the dynamics of that information. Since QUIC 157 is a versioned protocol, the wire image of the header format can also 158 change from version to version. However, the field that identifies 159 the QUIC version in some packets, and the format of the Version 160 Negotiation Packet, are both inspectable and invariant 161 [QUIC-INVARIANTS]. 163 This document addresses version 1 of the QUIC protocol, whose wire 164 image is fully defined in [QUIC-TRANSPORT] and [QUIC-TLS]. Features 165 of the wire image described herein may change in future versions of 166 the protocol, except when specified as an invariant 167 [QUIC-INVARIANTS], and cannot be used to identify QUIC as a protocol 168 or to infer the behavior of future versions of QUIC. 170 2.1. QUIC Packet Header Structure 172 QUIC packets may have either a long header or a short header. The 173 first bit of the QUIC header is the Header Form bit, and indicates 174 which type of header is present. The purpose of this bit is 175 invariant across QUIC versions. 177 The long header exposes more information. It contains a version 178 number, as well as source and destination connection IDs for 179 associating packets with a QUIC connection. The definition and 180 location of these fields in the QUIC long header are invariant for 181 future versions of QUIC, although future versions of QUIC may provide 182 additional fields in the long header [QUIC-INVARIANTS]. 184 In version 1 of QUIC, the long header is used during connection 185 establishment to transmit crypto handshake data, perform version 186 negotiation, retry, and send 0-RTT data. 188 Short headers contain only an optional destination connection ID and 189 the spin bit for RTT measurement. In version 1 of QUIC, they are 190 used after connection establishment. 
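As a minimal sketch of the header structure described above, the following Python function reads only the fields that [QUIC-INVARIANTS] fixes for long headers: the Header Form bit, the version, and the two connection IDs. The function name and the returned dictionary layout are illustrative assumptions, not part of any QUIC specification or library. For short headers it reports only the header form, because the destination connection ID length is implicit and cannot be recovered from the packet alone.

```python
def parse_invariant_header(datagram: bytes) -> dict:
    """Read the version-independent fields of the first QUIC packet.

    Hypothetical helper for illustration; raises IndexError on
    truncated input rather than validating lengths.
    """
    first = datagram[0]
    if not first & 0x80:
        # Header Form bit is 0: short header. The DCID length is
        # implicit, so nothing further can be parsed without state.
        return {"form": "short"}

    # Long header: 32-bit version, then length-prefixed DCID and SCID.
    version = int.from_bytes(datagram[1:5], "big")
    dcid_len = datagram[5]
    dcid = datagram[6:6 + dcid_len]
    scid_len = datagram[6 + dcid_len]
    scid_start = 7 + dcid_len
    scid = datagram[scid_start:scid_start + scid_len]
    return {"form": "long", "version": version, "dcid": dcid, "scid": scid}
```

An observer would typically apply such a function only to the first packet in a datagram and treat the result as a hint, since any UDP payload with the top bit set will parse as a "long header".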
192 The following information is exposed in QUIC packet headers in all 193 versions of QUIC: 195 * version number: the version number is present in the long header, 196 and identifies the version used for that packet. During Version 197 Negotiation (see Section 17.2.1 of [QUIC-TRANSPORT] and 198 Section 2.8), the version number field has a special value 199 (0x00000000) that identifies the packet as a Version Negotiation 200 packet. QUIC version 1 uses version 0x00000001. Operators should 201 expect to observe packets with other version numbers as a result 202 of various Internet experiments, future standards, and greasing 203 ([RFC7801]). All deployed versions are maintained in an IANA 204 registry (see Section 22.2 of [QUIC-TRANSPORT]). 206 * source and destination connection ID: short and long headers carry 207 a destination connection ID, a variable-length field that can be 208 used to identify the connection associated with a QUIC packet, for 209 load-balancing and NAT rebinding purposes; see Section 4.4 and 210 Section 2.6. Long packet headers additionally carry a source 211 connection ID. The source connection ID corresponds to the 212 destination connection ID the source would like to have on packets 213 sent to it, and is only present on long headers. On long header 214 packets, the length of the connection IDs is also present; on 215 short header packets, the length of the destination connection ID 216 is implicit. 218 In version 1 of QUIC, the following additional information is 219 exposed: 221 * "fixed bit": The second-most-significant bit of the first octet of 222 most QUIC packets of the current version is set to 1, enabling 223 endpoints to demultiplex with other UDP-encapsulated protocols. 224 Even though this bit is fixed in the version 1 specification, 225 endpoints might use an extension that varies the bit. Therefore, 226 observers cannot reliably use it as an identifier for QUIC. 
228 * latency spin bit: The third-most-significant bit of the first 229 octet in the short header for version 1. The spin bit is set by 230 endpoints such that tracking edge transitions can be used to 231 passively observe end-to-end RTT. See Section 3.8.2 for further 232 details. 234 * header type: The long header has a 2 bit packet type field 235 following the Header Form and fixed bits. Header types correspond 236 to stages of the handshake; see Section 17.2 of [QUIC-TRANSPORT] 237 for details. 239 * length: The length of the remaining QUIC packet after the length 240 field, present on long headers. This field is used to implement 241 coalesced packets during the handshake (see Section 2.2). 243 * token: Initial packets may contain a token, a variable-length 244 opaque value optionally sent from client to server, used for 245 validating the client's address. Retry packets also contain a 246 token, which can be used by the client in an Initial packet on a 247 subsequent connection attempt. The length of the token is 248 explicit in both cases. 250 Retry (Section 17.2.5 of [QUIC-TRANSPORT]) and Version Negotiation 251 (Section 17.2.1 of [QUIC-TRANSPORT]) packets are not encrypted or 252 protected in any way. For other kinds of packets, version 1 of QUIC 253 cryptographically obfuscates other information in the packet headers: 255 * packet number: All packets except Version Negotiation and Retry 256 packets have an associated packet number; however, this packet 257 number is encrypted, and therefore not of use to on-path 258 observers. The offset of the packet number can be decoded in long 259 headers, while it is implicit (depending on destination connection 260 ID length) in short headers. The length of the packet number is 261 cryptographically protected. 263 * key phase: The Key Phase bit, present in short headers, specifies 264 the keys used to encrypt the packet to support key rotation. The 265 Key Phase bit is cryptographically protected. 267 2.2. 
Coalesced Packets 269 Multiple QUIC packets may be coalesced into a single UDP datagram, 270 with a datagram carrying one or more long header packets followed by 271 zero or one short header packets. When packets are coalesced, the 272 Length fields in the long headers are used to separate QUIC packets; 273 see Section 12.2 of [QUIC-TRANSPORT]. The Length field is variable 274 length, and its position in the header is also variable depending on 275 the length of the source and destination connection ID; see 276 Section 17.2 of [QUIC-TRANSPORT]. 278 2.3. Use of Port Numbers 280 Applications that have a mapping for TCP as well as QUIC are expected 281 to use the same port number for both services. However, as for all 282 other IETF transports [RFC7605], there is no guarantee that a 283 specific application will use a given registered port, or that a 284 given port carries traffic belonging to the respective registered 285 service, especially when application layer information is encrypted. 286 For example, [QUIC-HTTP] specifies the use of the HTTP Alternative 287 Services mechanism [RFC7838] for discovery of HTTP/3 services on 288 other ports. 290 Further, as QUIC has a connection ID, it is also possible to maintain 291 multiple QUIC connections over one 5-tuple (protocol, source and 292 destination IP address, and source and destination port). However, 293 if the connection ID is zero-length, all packets of the 5-tuple 294 likely belong to the same QUIC connection. 296 2.4. The QUIC Handshake 298 New QUIC connections are established using a handshake, which is 299 distinguishable on the wire and contains some information that can be 300 passively observed. 302 To illustrate the information visible in the QUIC wire image during 303 the handshake, we first show the general communication pattern 304 visible in the UDP datagrams containing the QUIC handshake, then 305 examine each of the datagrams in detail. 
The QUIC handshake can normally be recognized on the wire through
four flights of datagrams labelled "Client Initial", "Server
Initial", "Client Completion", and "Server Completion", as
illustrated in Figure 1.

A handshake starts with the client sending one or more datagrams
containing Initial packets, detailed in Figure 2, which elicits the
Server Initial response detailed in Figure 3, typically containing
three types of packets: Initial packet(s) with the beginning of the
server's side of the TLS handshake, Handshake packet(s) with the rest
of the server's portion of the TLS handshake, and 1-RTT packet(s), if
present.

Client                                    Server
 |                                          |
 +----Client Initial----------------------->|
 +----(zero or more 0-RTT)----------------->|
 |                                          |
 |<-----------------------Server Initial----+
 |<--------(1-RTT encrypted data starts)----+
 |                                          |
 +----Client Completion-------------------->|
 +----(1-RTT encrypted data starts)-------->|
 |                                          |
 |<--------------------Server Completion----+
 |                                          |

Figure 1: General communication pattern visible in the QUIC handshake

As shown here, the client can send 0-RTT data as soon as it has sent
its Client Hello, and the server can send 1-RTT data as soon as it
has sent its Server Hello. The Client Completion flight contains at
least one Handshake packet and could also include an Initial packet.
QUIC packets in separate contexts during the handshake can be
coalesced (see Section 2.2) in order to reduce the number of UDP
datagrams sent during the handshake.

Handshake packets can arrive out-of-order without impacting the
handshake as long as the reordering was not accompanied by extensive
delays that trigger a spurious Probe Timeout (Section 6.2 of
[QUIC-RECOVERY]).
If QUIC packets get lost or reordered, packets belonging 348 to the same flight might not be observed in close time succession, 349 though the sequence of the flights will not change, because one 350 flight depends upon the peer's previous flight. 352 Datagrams that contain an Initial packet (Client Initial, Server 353 Initial, and some Client Completion) contain at least 1200 octets of 354 UDP payload. This protects against amplification attacks and 355 verifies that the network path meets the requirements for the minimum 356 QUIC IP packet size; see Section 14 of [QUIC-TRANSPORT]. This is 357 accomplished by either adding PADDING frames within the Initial 358 packet, coalescing other packets with the Initial packet, or leaving 359 unused payload in the UDP packet after the Initial packet. A network 360 path needs to be able to forward at least this size of packet for 361 QUIC to be used. 363 The content of Initial packets is encrypted using Initial Secrets, 364 which are derived from a per-version constant and the client's 365 destination connection ID. That content is therefore observable by 366 any on-path device that knows the per-version constant and is 367 considered visible in this illustration. The content of QUIC 368 Handshake packets is encrypted using keys established during the 369 initial handshake exchange, and is therefore not visible. 371 Initial, Handshake, and 1-RTT packets belong to different 372 cryptographic and transport contexts. The Client Completion 373 (Figure 4) and the Server Completion (Figure 5) flights conclude the 374 Initial and Handshake contexts, by sending final acknowledgments and 375 CRYPTO frames. 
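The derivation of Initial Secrets described above can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the salt is the version 1 constant from Section 5.2 of [QUIC-TLS], the hash is SHA-256, and the helper names (hkdf_extract, hkdf_expand_label, initial_secrets) are hypothetical. It stops at the client and server Initial secrets; deriving packet protection keys and removing header protection requires further steps defined in [QUIC-TLS].

```python
import hashlib
import hmac

# Per-version constant for QUIC version 1 (Section 5.2 of [QUIC-TLS]).
INITIAL_SALT_V1 = bytes.fromhex("38762cf7f55934b34d179ae6a4c80cadccbb7f0a")


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    # HKDF-Extract with SHA-256.
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    # HKDF-Expand with SHA-256.
    out, block, counter = b"", b"", 1
    while len(out) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        out += block
        counter += 1
    return out[:length]


def hkdf_expand_label(secret: bytes, label: bytes, length: int) -> bytes:
    # TLS 1.3 HKDF-Expand-Label with an empty context.
    full_label = b"tls13 " + label
    info = (length.to_bytes(2, "big") + bytes([len(full_label)])
            + full_label + b"\x00")
    return hkdf_expand(secret, info, length)


def initial_secrets(client_dcid: bytes) -> tuple:
    # Input is the Destination Connection ID of the client's first
    # Initial packet, which is visible in the long header.
    initial = hkdf_extract(INITIAL_SALT_V1, client_dcid)
    return (hkdf_expand_label(initial, b"client in", 32),
            hkdf_expand_label(initial, b"server in", 32))
```

Because both inputs are public, any on-path device can compute the same secrets, which is why Initial packet content is treated as visible in this section.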
+----------------------------------------------------------+
| UDP header (source and destination UDP ports)            |
+----------------------------------------------------------+
| QUIC long header (type = Initial, Version, DCID, SCID) (Length)
+----------------------------------------------------------+  |
| QUIC CRYPTO frame header                                 |  |
+----------------------------------------------------------+  |
| | TLS Client Hello (incl. TLS SNI)                     | |  |
+----------------------------------------------------------+  |
| QUIC PADDING frames                                      |  |
+----------------------------------------------------------+<-+

Figure 2: Example Client Initial datagram without 0-RTT

A Client Initial packet exposes the version, source and destination
connection IDs without encryption. The payload of the Initial packet
is protected using the Initial secret. The complete TLS Client
Hello, including any TLS Server Name Indication (SNI) present, is
sent in one or more CRYPTO frames across one or more QUIC Initial
packets.
+------------------------------------------------------------+
| UDP header (source and destination UDP ports)              |
+------------------------------------------------------------+
| QUIC long header (type = Initial, Version, DCID, SCID)   (Length)
+------------------------------------------------------------+  |
| QUIC CRYPTO frame header                                   |  |
+------------------------------------------------------------+  |
| TLS Server Hello                                           |  |
+------------------------------------------------------------+  |
| QUIC ACK frame (acknowledging client hello)                |  |
+------------------------------------------------------------+<-+
| QUIC long header (type = Handshake, Version, DCID, SCID) (Length)
+------------------------------------------------------------+  |
| encrypted payload (presumably CRYPTO frames)               |  |
+------------------------------------------------------------+<-+
| QUIC short header                                          |
+------------------------------------------------------------+
| 1-RTT encrypted payload                                    |
+------------------------------------------------------------+

Figure 3: Coalesced Server Initial datagram pattern

The Server Initial datagram also exposes version number, source and
destination connection IDs in the clear; the payload of the Initial
packet(s) is protected using the Initial secret.
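To illustrate how the Length fields delimit coalesced packets in a datagram such as the one in Figure 3, the following sketch walks the long header packets of a version 1 datagram and treats whatever remains as a short header packet or trailing padding. The function names and labels are assumptions made for this example; the field layout follows Section 17.2 of [QUIC-TRANSPORT]. The sketch does no bounds checking and raises IndexError on truncated input.

```python
LONG_TYPES = {0: "initial", 1: "0-rtt", 2: "handshake", 3: "retry"}


def read_varint(buf: bytes, pos: int) -> tuple:
    # QUIC variable-length integer: the top two bits of the first
    # byte give the encoded size (1, 2, 4, or 8 bytes).
    first = buf[pos]
    size = 1 << (first >> 6)
    value = first & 0x3F
    for byte in buf[pos + 1:pos + size]:
        value = (value << 8) | byte
    return value, pos + size


def split_coalesced(datagram: bytes) -> list:
    """Return (label, bytes) pairs for each packet in a v1 datagram."""
    packets, pos = [], 0
    while pos < len(datagram):
        first = datagram[pos]
        if not first & 0x80:
            # No Length field: a short header packet (or padding)
            # always extends to the end of the datagram.
            packets.append(("short-or-padding", datagram[pos:]))
            break
        version = int.from_bytes(datagram[pos + 1:pos + 5], "big")
        if version != 1:
            # Layout beyond the invariants is unknown for other versions.
            packets.append(("unknown-version", datagram[pos:]))
            break
        ptype = (first >> 4) & 0x03
        cur = pos + 5
        cur += 1 + datagram[cur]          # DCID length and DCID
        cur += 1 + datagram[cur]          # SCID length and SCID
        if ptype == 3:
            # Retry packets carry no Length field.
            packets.append(("retry", datagram[pos:]))
            break
        if ptype == 0:
            token_len, cur = read_varint(datagram, cur)
            cur += token_len              # skip the Initial token
        length, cur = read_varint(datagram, cur)
        end = cur + length                # Length covers PN and payload
        packets.append((LONG_TYPES[ptype], datagram[pos:end]))
        pos = end
    return packets
```

Only the packet boundaries are recovered this way; the payloads of Handshake, 0-RTT, and 1-RTT packets remain encrypted with keys that an observer does not have.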
+------------------------------------------------------------+
| UDP header (source and destination UDP ports)              |
+------------------------------------------------------------+
| QUIC long header (type = Initial, Version, DCID, SCID)   (Length)
+------------------------------------------------------------+  |
| QUIC ACK frame (acknowledging Server Initial)              |  |
+------------------------------------------------------------+<-+
| QUIC long header (type = Handshake, Version, DCID, SCID) (Length)
+------------------------------------------------------------+  |
| encrypted payload (presumably CRYPTO/ACK frames)           |  |
+------------------------------------------------------------+<-+
| QUIC short header                                          |
+------------------------------------------------------------+
| 1-RTT encrypted payload                                    |
+------------------------------------------------------------+

Figure 4: Coalesced Client Completion datagram pattern

The Client Completion flight does not expose any additional
information; however, as the destination connection ID is server-
selected, it usually is not the same ID that is sent in the Client
Initial. Client Completion flights contain 1-RTT packets, which
indicate that the handshake has completed on the client (see
Section 3.2) and can be used for three-way handshake RTT estimation
as described in Section 3.8.
+------------------------------------------------------------+
| UDP header (source and destination UDP ports)              |
+------------------------------------------------------------+
| QUIC long header (type = Handshake, Version, DCID, SCID) (Length)
+------------------------------------------------------------+  |
| encrypted payload (presumably ACK frame)                   |  |
+------------------------------------------------------------+<-+
| QUIC short header                                          |
+------------------------------------------------------------+
| 1-RTT encrypted payload                                    |
+------------------------------------------------------------+

Figure 5: Coalesced Server Completion datagram pattern

Similar to Client Completion, Server Completion also exposes no
additional information; observing it serves only to determine that
the handshake has completed.

When the client uses 0-RTT data, the Client Initial flight can also
include one or more 0-RTT packets, as shown in Figure 6.

+----------------------------------------------------------+
| UDP header (source and destination UDP ports)            |
+----------------------------------------------------------+
| QUIC long header (type = Initial, Version, DCID, SCID) (Length)
+----------------------------------------------------------+  |
| QUIC CRYPTO frame header                                 |  |
+----------------------------------------------------------+  |
| TLS Client Hello (incl. TLS SNI)                         |  |
+----------------------------------------------------------+<-+
| QUIC long header (type = 0-RTT, Version, DCID, SCID)   (Length)
+----------------------------------------------------------+  |
| 0-RTT encrypted payload                                  |  |
+----------------------------------------------------------+<-+

Figure 6: Coalesced 0-RTT Client Initial datagram

When a 0-RTT packet is coalesced with an Initial packet, the datagram
will be padded to 1200 bytes.
Additional datagrams containing only 488 0-RTT packets with long headers can be sent after the client Initial 489 packet(s), containing more 0-RTT data. The amount of 0-RTT protected 490 data that can be sent in the first flight is limited by the initial 491 congestion window, typically to around 10 packets (see Section 7.2 of 492 [QUIC-RECOVERY]). 494 2.5. Integrity Protection of the Wire Image 496 As soon as the cryptographic context is established, all information 497 in the QUIC header, including exposed information, is integrity 498 protected. Further, information that was exposed in packets sent 499 before the cryptographic context was established is validated during 500 the cryptographic handshake. Therefore, devices on path cannot alter 501 any information or bits in QUIC packets. Such alterations would 502 cause the integrity check to fail, which results in the receiver 503 discarding the packet. Some parts of Initial packets could be 504 altered by removing and re-applying the authenticated encryption 505 without immediate discard at the receiver. However, the 506 cryptographic handshake validates most fields and any modifications 507 in those fields will result in connection establishment failing 508 later. 510 2.6. Connection ID and Rebinding 512 The connection ID in the QUIC packet headers allows association of 513 QUIC packets using information independent of the 5-tuple. This 514 allows rebinding of a connection after one of the endpoints - usually 515 the client - has experienced an address change. Further it can be 516 used by in-network devices to ensure that related 5-tuple flows are 517 appropriately balanced together. 519 Client and server each choose a connection ID during the handshake; 520 for example, a server might request that a client use a connection 521 ID, whereas the client might choose a zero-length value. 
Connection IDs for either endpoint may change during the lifetime of
a connection, with the new connection ID being supplied via encrypted
frames (see Section 5.1 of [QUIC-TRANSPORT]). Therefore, observing a
new connection ID does not necessarily indicate a new connection.

[QUIC_LB] specifies algorithms for encoding the server mapping in a
connection ID in order to share this information with selected on-
path devices such as load balancers. Server mappings should only be
exposed to selected entities. Uncontrolled exposure would allow
linkage of multiple IP addresses to the same host if the server also
supports migration, which opens an attack vector on specific servers
or pools. The best way to obscure an encoding is to appear random to
any other observers, which is most rigorously achieved with
encryption. As a result, any attempt to infer information from
specific parts of a connection ID is unlikely to be useful.

2.7.  Packet Numbers

The Packet Number field is always present in the QUIC packet header
in version 1; however, it is always encrypted. The encryption key
for packet number protection on Initial packets -- which are sent
before cryptographic context establishment -- is specific to the QUIC
version, while packet number protection on subsequent packets uses
secrets derived from the end-to-end cryptographic context. Packet
numbers are therefore not part of the wire image that is visible to
on-path observers.

2.8.  Version Negotiation and Greasing

Version Negotiation packets are used by the server to indicate that a
requested version from the client is not supported (see Section 6 of
[QUIC-TRANSPORT]). Version Negotiation packets are not intrinsically
protected, but future QUIC versions could use later encrypted
messages to verify that they were authentic.
Therefore, any modification of this list will be detected and may
cause the endpoints to terminate the connection attempt.

Also note that the list of versions in the Version Negotiation packet
may contain reserved versions. This mechanism is used to avoid
ossification in the implementation of the selection mechanism.
Further, a client may send an Initial packet with a reserved version
number to trigger version negotiation. In the Version Negotiation
packet, the connection IDs of the client's Initial packet are
reflected to provide a proof of return-routability. Therefore,
changing this information will also cause the connection to fail.

QUIC is expected to evolve rapidly, so new versions, both
experimental and IETF standard versions, will be deployed on the
Internet more often than with other commonly deployed Internet- and
transport-layer protocols. Use of the version number field for
traffic recognition will therefore behave differently than with these
protocols. Using a particular version number to recognize valid QUIC
traffic is likely to persistently miss a fraction of QUIC flows, and
completely fail in the near future. Reliance on the version number
field for the purposes of admission control is similarly likely to
rapidly lead to unintended failure modes. Admission of QUIC traffic
regardless of version avoids these failure modes, avoids unnecessary
deployment delays, and supports continuous version-based evolution.

3.  Network-Visible Information about QUIC Flows

This section addresses the different kinds of observations and
inferences that can be made about QUIC flows by a passive observer in
the network based on the wire image in Section 2.
Here we assume a bidirectional observer (one that can see packets in both directions in the sequence in which they are carried on the wire) unless noted, but typically without access to any keying information.

3.1.  Identifying QUIC Traffic

The QUIC wire image is not specifically designed to be distinguishable from other UDP traffic by a passive observer in the network. While certain QUIC applications may be heuristically identifiable on a per-application basis, there is no general method for distinguishing QUIC traffic from otherwise-unclassifiable UDP traffic on a given link. Any unrecognized UDP traffic may therefore be QUIC traffic.

At the time of writing, two application bindings for QUIC have been published or adopted by the IETF: HTTP/3 [QUIC-HTTP] and DNS over Dedicated QUIC Connections [I-D.ietf-dprive-dnsoquic]. These are both known at the time of writing to have active Internet deployments, so an assumption that all QUIC traffic is HTTP/3 is not valid. HTTP/3 uses UDP port 443 by convention but various methods can be used to specify alternate port numbers. Simple assumptions about whether a given flow is using QUIC based upon a UDP port number may therefore not hold; see also Section 5 of [RFC7605].

While the second-most-significant bit (0x40) of the first octet is set to 1 in most QUIC packets of the current version (see Section 2.1 and Section 17 of [QUIC-TRANSPORT]), this method of recognizing QUIC traffic is not reliable. First, it only provides one bit of information and is prone to collision with UDP-based protocols other than those considered in [RFC7983]. Second, this feature of the wire image is not invariant [QUIC-INVARIANTS] and may change in future versions of the protocol, or even be negotiated during the handshake via the use of an extension [QUIC-GREASE].

Even though transport parameters transmitted in the client's Initial packet are observable by the network, they cannot be modified by the network without causing connection failure. Further, the reply from the server cannot be observed, so observers on the network cannot know which parameters are actually in use.

3.1.1.  Identifying Negotiated Version

An in-network observer assuming that a set of packets belongs to a QUIC flow might infer the version number in use by observing the handshake: for QUIC version 1, if the version number in the Initial packet from a client is the same as the version number in the Initial packet of the server response, that version has been accepted by both endpoints to be used for the rest of the connection.

The negotiated version cannot be identified for flows for which a handshake is not observed, such as in the case of connection migration; however, it might be possible to associate a flow with a flow for which a version has been identified; see Section 3.5.

3.1.2.  First Packet Identification for Garbage Rejection

A related question is whether the first packet of a given flow on a port known to be associated with QUIC is a valid QUIC packet. This determination supports in-network filtering of garbage UDP packets (reflection attacks, random backscatter, etc.). While heuristics based on the first byte of the packet (packet type) could be used to separate valid from invalid first packet types, the deployment of such heuristics is not recommended, as bits in the first byte may have different meanings in future versions of the protocol.

3.2.  Connection Confirmation

This document focuses on QUIC version 1, and this Connection Confirmation section applies only to packets belonging to QUIC version 1 flows; for purposes of on-path observation, it assumes that these packets have been identified as such through the observation of a version number exchange as described above.

Connection establishment uses Initial and Handshake packets containing a TLS handshake, and Retry packets that do not contain parts of the handshake. Connection establishment can therefore be detected using heuristics similar to those used to detect TLS over TCP. A client initiating a connection may also send data in 0-RTT packets directly after the Initial packet containing the TLS Client Hello. Since packets may be reordered or lost in the network, 0-RTT packets could be seen before the Initial packet.

Note that in this version of QUIC, clients send Initial packets before servers do, servers send Handshake packets before clients do, and only clients send Initial packets with tokens. Therefore, an endpoint can be identified as a client or server by an on-path observer. An attempted connection after Retry can be detected by correlating the contents of the Retry packet with the Token and the Destination Connection ID fields of the new Initial packet.

3.3.  Distinguishing Acknowledgment Traffic

Some deployed in-network functions distinguish packets that carry only acknowledgment (ACK-only) information from packets carrying upper-layer data in order to attempt to enhance performance, for example by queueing ACKs differently or manipulating ACK signaling [RFC3449]. Distinguishing ACK packets is possible in TCP, but is not supported by QUIC, since acknowledgment signaling is carried inside QUIC's encrypted payload, and ACK manipulation is impossible.
Specifically, heuristics attempting to distinguish ACK-only packets from payload-carrying packets based on packet size are likely to fail, and their use to infer the internals of QUIC's operation is not recommended, as those mechanisms can change, e.g., due to the use of extensions.

3.4.  Server Name Indication (SNI)

The client's TLS ClientHello may contain a Server Name Indication (SNI) [RFC6066] extension, by which the client reveals the name of the server it intends to connect to, in order to allow the server to present a certificate based on that name. It may also contain an Application-Layer Protocol Negotiation (ALPN) [RFC7301] extension, by which the client exposes the names of application-layer protocols it supports; an observer can deduce that one of those protocols will be used if the connection continues.

Work is currently underway in the TLS working group to encrypt the contents of the ClientHello in TLS 1.3 [TLS-ECH]. This would make SNI-based application identification impossible by on-path observation for QUIC and other protocols that use TLS.

3.4.1.  Extracting Server Name Indication (SNI) Information

If the ClientHello is not encrypted, SNI can be derived from the client's Initial packet(s) by calculating the Initial secret to decrypt the packet payload and parsing the QUIC CRYPTO frame(s) containing the TLS ClientHello.

As both the derivation of the Initial secret and the structure of the Initial packet itself are version-specific, the first step is always to parse the version number (the second through fifth bytes of the long header). Note that only long header packets carry the version number, so it is necessary to also check if the first bit of the QUIC packet is set to 1, indicating a long header.
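
As a rough illustration of this first step, the following Python sketch checks the long header bit and reads the version field from a UDP payload. The function name long_header_version is hypothetical and chosen only for this example; the sketch assumes the input is the raw UDP payload of a single datagram and does no further validation.

```python
import struct
from typing import Optional


def long_header_version(datagram: bytes) -> Optional[int]:
    """Return the 32-bit version field of a long header packet.

    Returns None if the payload is too short or if the first (most
    significant) bit of the first byte is not set, i.e., the packet
    does not carry a long header and therefore no version number.
    """
    if len(datagram) < 5:
        return None  # need the first byte plus a 4-byte version field
    if (datagram[0] & 0x80) == 0:
        return None  # first bit clear: not a long header
    # The version occupies the second through fifth bytes, in network
    # byte order.
    (version,) = struct.unpack_from("!I", datagram, 1)
    return version
```

An observer would compare the returned value against the versions it knows how to parse (0x00000001 for QUIC version 1) before attempting any version-specific processing, and would leave packets with other versions alone.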

Note that proprietary QUIC versions that were deployed before standardization might not set the first bit in a QUIC long header packet to 1. However, it is expected that these versions will gradually disappear over time and therefore do not require any special consideration or treatment.

When the version has been identified as QUIC version 1, the packet type needs to be verified as an Initial packet by checking that the third and fourth bits of the header are both set to 0. Then the Destination Connection ID needs to be extracted from the packet. The Initial secret is calculated using the version-specific Initial salt, as described in Section 5.2 of [QUIC-TLS]. The length of the connection ID is indicated in the 6th byte of the header, followed by the connection ID itself.

Note that subsequent Initial packets might contain a Destination Connection ID other than the one used to generate the Initial secret. Therefore, attempts to decrypt these packets using the procedure above might fail unless the Initial secret is retained by the observer.

To determine the end of the packet header and find the start of the payload, the packet number length, the source connection ID length, and the token length need to be extracted. The packet number length is defined by the seventh and eighth bits of the header as described in Section 17.2 of [QUIC-TRANSPORT], but is protected as described in Section 5.4 of [QUIC-TLS]. The source connection ID length is specified in the byte after the destination connection ID. The token length, which follows the source connection ID, is a variable-length integer as specified in Section 16 of [QUIC-TRANSPORT].

After decryption, the client's Initial packet(s) can be parsed to detect the CRYPTO frame(s) that contain the TLS ClientHello, which then can be parsed similarly to TLS over TCP connections.
Note that there can be multiple CRYPTO frames spread out over one or more Initial packets, and they might not be in order, so reassembling the CRYPTO stream by parsing offsets and lengths is required. Further, the client's Initial packet(s) may contain other frames, so the first bytes of each frame need to be checked to identify the frame type and determine whether the frame can be skipped over. Note that the length of the frames is dependent on the frame type; see Section 19 of [QUIC-TRANSPORT]. E.g., PADDING frames, each consisting of a single zero byte, may occur before, after, or between CRYPTO frames. However, extensions might define additional frame types. If an unknown frame type is encountered, it is impossible to know the length of that frame, which prevents skipping over it, and therefore parsing fails.

3.5.  Flow Association

The QUIC connection ID (see Section 2.6) is designed to allow a coordinating on-path device, such as a load-balancer, to associate two flows when one of the endpoints changes address. This change can be due to NAT rebinding or address migration.

The connection ID must change upon intentional address change by an endpoint, and connection ID negotiation is encrypted, so it is not possible for a passive observer to link intended changes of address using the connection ID.

When one endpoint's address unintentionally changes, as is the case with NAT rebinding, an on-path observer may be able to use the connection ID to associate the flow on the new address with the flow on the old address.

A network function that attempts to use the connection ID to associate flows must be robust to the failure of this technique. Since the connection ID may change multiple times during the lifetime of a connection, packets with the same 5-tuple but different connection IDs might or might not belong to the same connection.
Likewise, packets with the same connection ID but different 5-tuples might not belong to the same connection, either.

Connection IDs should be treated as opaque; see Section 4.4 for caveats regarding connection ID selection at servers.

3.6.  Flow Teardown

QUIC does not expose the end of a connection; the only indication to on-path devices that a flow has ended is that packets are no longer observed. Stateful devices on path such as NATs and firewalls must therefore use idle timeouts to determine when to drop state for QUIC flows; see Section 4.2.

3.7.  Flow Symmetry Measurement

QUIC explicitly exposes which side of a connection is a client and which side is a server during the handshake. In addition, the symmetry of a flow (whether primarily client-to-server, primarily server-to-client, or roughly bidirectional, as input to basic traffic classification techniques) can be inferred through the measurement of data rate in each direction. Note that QUIC packets containing only control frames (such as ACK-only packets) may be padded. Padding, though optional, may conceal connection roles or flow symmetry information.

3.8.  Round-Trip Time (RTT) Measurement

The round-trip time (RTT) of QUIC flows can be inferred by observation once per flow, during the handshake, as in passive TCP measurement; this requires parsing of the QUIC packet header and recognition of the handshake, as illustrated in Section 2.4. It can also be inferred during the flow's lifetime, if the endpoints use the spin bit facility described below and in Section 17.3.1 of [QUIC-TRANSPORT].

3.8.1.  Measuring Initial RTT

In the common case, the delay between the client's Initial packet (containing the TLS ClientHello) and the server's Initial packet (containing the TLS ServerHello) represents the RTT component on the path between the observer and the server.
The delay between the server's first Handshake packet and the Handshake packet sent by the client represents the RTT component on the path between the observer and the client. While the client may send 0-RTT packets after the Initial packet during connection re-establishment, these can be ignored for RTT measurement purposes.

Handshake RTT can be measured by adding the client-to-observer and observer-to-server RTT components together. This measurement necessarily includes all transport- and application-layer delay at both endpoints.

3.8.2.  Using the Spin Bit for Passive RTT Measurement

The spin bit provides a version-specific method to measure per-flow RTT from observation points on the network path throughout the duration of a connection. See Section 17.4 of [QUIC-TRANSPORT] for the definition of the spin bit in Version 1 of QUIC. Endpoint participation in spin bit signaling is optional. That is, while its location is fixed in this version of QUIC, an endpoint can unilaterally choose to not support "spinning" the bit.

Use of the spin bit for RTT measurement by devices on path is only possible when both endpoints enable it. Some endpoints may disable use of the spin bit by default, others only in specific deployment scenarios, e.g., for servers and clients where the RTT would reveal the presence of a VPN or proxy. To avoid making these connections identifiable based on the usage of the spin bit, all endpoints randomly disable "spinning" for at least one eighth of connections, even if otherwise enabled by default. An endpoint not participating in spin bit signaling for a given connection can use a fixed spin value for the duration of the connection, or can set the bit randomly on each packet sent.

When in use, the latency spin bit in each direction changes value once per RTT any time that both endpoints are sending packets continuously.
An on-path observer can observe the time difference between edges (changes from 1 to 0 or 0 to 1) in the spin bit signal in a single direction to measure one sample of end-to-end RTT. This mechanism follows the principles of protocol measurability laid out in [IPIM].

Note that this measurement, as with passive RTT measurement for TCP, includes all transport protocol delay (e.g., delayed sending of acknowledgments) and/or application layer delay (e.g., waiting for a response to be generated). It therefore provides devices on path a good instantaneous estimate of the RTT as experienced by the application.

However, application-limited and flow-control-limited senders can have application and transport layer delay, respectively, that are much greater than network RTT. When the sender is application-limited and, e.g., only sends a small amount of periodic application traffic, where that period is longer than the RTT, measuring the spin bit provides information about the application period, not the network RTT.

Since the spin bit logic at each endpoint considers only samples from packets that advance the largest packet number, signal generation itself is resistant to reordering. However, reordering can cause problems at an observer by causing spurious edge detection and therefore inaccurate (i.e., lower) RTT estimates, if reordering occurs across a spin-bit flip in the stream.

Simple heuristics based on the observed data rate per flow or changes in the RTT series can be used to reject bad RTT samples due to lost or reordered edges in the spin signal, as well as application or flow control limitation; for example, QoF [TMA-QOF] rejects component RTTs significantly higher than RTTs over the history of the flow. These heuristics may use the handshake RTT as an initial RTT estimate for a given flow.
Usually such heuristics would also detect if the spin is either constant or randomly set for a connection.

An on-path observer that can see traffic in both directions (from client to server and from server to client) can also use the spin bit to measure "upstream" and "downstream" component RTT; i.e., the component of the end-to-end RTT attributable to the paths between the observer and the server and the observer and the client, respectively. It does this by measuring the delay between a spin edge observed in the upstream direction and that observed in the downstream direction, and vice versa.

Raw RTT samples generated using these techniques can be processed in various ways to generate useful network performance metrics. A simple linear smoothing or moving minimum filter can be applied to the stream of RTT samples to get a more stable estimate of application-experienced RTT. RTT samples measured from the spin bit can also be used to generate RTT distribution information, including minimum RTT (which approximates network RTT over longer time windows) and RTT variance (which approximates one-way packet delay variance as seen by an application end-point).

4.  Specific Network Management Tasks

In this section, we review specific network management and measurement techniques and how QUIC's design impacts them.

4.1.  Passive Network Performance Measurement and Troubleshooting

Limited RTT measurement is possible by passive observation of QUIC traffic; see Section 3.8. No passive measurement of loss is possible with the present wire image. Limited observation of upstream congestion may be possible via the observation of CE markings in the IP header [RFC3168] on ECN-enabled QUIC traffic.
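
The CE observation mentioned above needs only the IP header, so it can be sketched without any QUIC parsing. The following Python fragment is a minimal illustration, assuming the observer has the raw IP packet starting at the first header byte; the names is_ce_marked and ce_fraction are hypothetical and used only for this example.

```python
from typing import Iterable

ECN_MASK = 0x03  # ECN field: two least significant bits of the TOS / Traffic Class octet
ECN_CE = 0x03    # "Congestion Experienced" codepoint (binary 11)


def is_ce_marked(ip_packet: bytes) -> bool:
    """Return True if the IP header carries the CE codepoint."""
    if len(ip_packet) < 2:
        return False
    ip_version = ip_packet[0] >> 4
    if ip_version == 4:
        # IPv4: the second header byte holds DSCP (6 bits) + ECN (2 bits).
        tos = ip_packet[1]
    elif ip_version == 6:
        # IPv6: the Traffic Class octet straddles the first two bytes.
        tos = ((ip_packet[0] & 0x0F) << 4) | (ip_packet[1] >> 4)
    else:
        return False
    return (tos & ECN_MASK) == ECN_CE


def ce_fraction(ip_packets: Iterable[bytes]) -> float:
    """Fraction of the given packets that carry a CE mark (0.0 if none given)."""
    packets = list(ip_packets)
    if not packets:
        return 0.0
    return sum(is_ce_marked(p) for p in packets) / len(packets)
```

A CE fraction computed this way reflects only congestion marked upstream of the observation point, and only for flows on which the endpoints have enabled ECN.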

On-path devices can also make measurements of RTT, loss and other performance metrics when information is carried in an additional network-layer packet header (Section 6 of [I-D.ietf-tsvwg-transport-encrypt] describes use of operations, administration and management (OAM) information). Using network-layer approaches also has the advantage that common observation and analysis tools can be consistently used for multiple transport protocols; however, these techniques are often limited to measurements within one or multiple cooperating domains.

4.2.  Stateful Treatment of QUIC Traffic

Stateful treatment of QUIC traffic (e.g., at a firewall or NAT middlebox) is possible through QUIC traffic and version identification (Section 3.1) and observation of the handshake for connection confirmation (Section 3.2). The lack of any visible end-of-flow signal (Section 3.6) means that this state must be purged either through timers or through least-recently-used eviction, depending on application requirements.

While QUIC has no clear network-visible end-of-flow signal and therefore does require timer-based state removal, the QUIC handshake indicates confirmation by both ends of a valid bidirectional transmission. As soon as the handshake has completed, timers should be set long enough to also allow for short idle time during a valid transmission.

[RFC4787] requires a network state timeout that is not less than 2 minutes for most UDP traffic. However, in practice, a QUIC endpoint can experience lower timeouts, in the range of 30 to 60 seconds [QUIC-TIMEOUT].

In contrast, [RFC5382] recommends a state timeout of more than 2 hours for TCP, given that TCP is a connection-oriented protocol with well-defined closure semantics.
Even though QUIC has explicitly been designed to tolerate NAT rebindings, decreasing the NAT timeout is not recommended, as it may negatively impact application performance or incentivize endpoints to send very frequent keep-alive packets.

The recommendation is therefore that, even when lower state timeouts are used for other UDP traffic, a state timeout of at least two minutes ought to be used for QUIC traffic.

If state is removed too early, this could lead to black-holing of incoming packets after a short idle period. To detect this situation, a timer at the client needs to expire before a re-establishment can happen (if at all), which would lead to unnecessarily long delays in an otherwise working connection.

Furthermore, not all endpoints use routing architectures where connections will survive a port or address change. So even when the client revives the connection, a NAT rebinding can cause a routing mismatch where a packet is not even delivered to the server that might support address migration. For these reasons, the limits in [RFC4787] are important to avoid black-holing of packets (and hence avoid interrupting the flow of data to the client), especially where devices are able to distinguish QUIC traffic from other UDP payloads.

The QUIC header optionally contains a connection ID which could provide additional entropy beyond the 5-tuple. The QUIC handshake needs to be observed in order to understand whether the connection ID is present and what length it has. However, connection IDs may be renegotiated after the handshake, and this renegotiation is not visible to the path. Therefore, using the connection ID as a flow key field for stateful treatment of flows is not recommended as connection ID changes will cause undetectable and unrecoverable loss of state in the middle of a connection.
In particular, the use of the connection ID for functions that require state to make a forwarding decision is not viable as it will break connectivity, or at minimum cause long timeout-based delays before this problem is detected by the endpoints and the connection can potentially be re-established.

Use of connection IDs is specifically discouraged for NAT applications. If a NAT hits an operational limit, it is recommended to drop the initial packets of a flow instead (see also Section 4.5), which potentially triggers TCP fallback. Use of the connection ID to multiplex multiple connections on the same IP address/port pair is not a viable solution, as it risks connectivity breakage if the connection ID changes.

4.3.  Address Rewriting to Ensure Routing Stability

While QUIC's migration capability makes it possible for a connection to survive client address changes, this does not work if the routers or switches in the server infrastructure route using the address-port 4-tuple. If infrastructure routes on addresses only, NAT rebinding or address migration will cause packets to be delivered to the wrong server. [QUIC_LB] describes a way to address this problem by coordinating the selection and use of connection IDs between load-balancers and servers.

Applying address translation at a middlebox to maintain a stable address-port mapping for flows based on connection ID might seem like a solution to this problem. However, hiding information about the change of the IP address or port conceals important and security-relevant information from QUIC endpoints and as such would facilitate amplification attacks (see Section 8 of [QUIC-TRANSPORT]).
A NAT function that hides peer address changes prevents the other end from detecting and mitigating attacks, as the endpoint cannot verify connectivity to the new address using QUIC PATH_CHALLENGE and PATH_RESPONSE frames.

In addition, a change of IP address or port is also an input signal to other internal mechanisms in QUIC. When a path change is detected, path-dependent variables like congestion control parameters will be reset, protecting the new path from overload.

4.4.  Server Cooperation with Load Balancers

In the case of networking architectures that include load balancers, the connection ID can be used as a way for the server to signal information about the desired treatment of a flow to the load balancers. Guidance on assigning connection IDs is given in [QUIC-APPLICABILITY]. [QUIC_LB] describes a system for coordinating selection and use of connection IDs between load-balancers and servers.

4.5.  Filtering Behavior

[RFC4787] describes possible packet filtering behaviors that relate to NATs but is often also used in other scenarios where packet filtering is desired. Though the guidance there holds, a particularly unwise behavior admits a handful of UDP packets and then makes a decision as to whether or not to filter later packets in the same connection. QUIC applications are encouraged to fall back to TCP if early packets do not arrive at their destination [QUIC-APPLICABILITY], as QUIC is based on UDP and there are known blocks of UDP traffic (see Section 4.6). Admitting a few packets allows the QUIC endpoint to determine that the path accepts QUIC. Sudden drops afterwards will result in slow and costly timeouts before abandoning the connection.

4.6.  UDP Blocking, Throttling, and NAT Binding

Today, UDP is the most prevalent DDoS vector, since it is easy for compromised non-admin applications to send a flood of large UDP packets (while with TCP the attacker gets throttled by the congestion controller) or to craft reflection and amplification attacks. Some networks therefore block UDP traffic. With increased deployment of QUIC, there is also an increased need to allow UDP traffic on ports used for QUIC. However, if UDP is generally enabled on these ports, UDP flood attacks may also use the same ports. One possible response to this threat is to throttle UDP traffic on the network, allocating a fixed portion of the network capacity to UDP and blocking UDP datagrams over that cap. As the portion of QUIC traffic compared to TCP is also expected to increase over time, using such a limit is not recommended, but if done, limits might need to be adapted dynamically.

Further, if throttling of UDP traffic is desired, it is recommended to block individual QUIC flows entirely rather than dropping packets indiscriminately. When the handshake is blocked, QUIC-capable applications may fall back to TCP. However, blocking a random fraction of QUIC packets across 4-tuples will allow many QUIC handshakes to complete, preventing TCP fallback, but these connections will suffer from severe packet loss (see also Section 4.5). Therefore, UDP throttling should be realized by per-flow policing, as opposed to per-packet policing. Note that this per-flow policing should be stateless to avoid problems with stateful treatment of QUIC flows (see Section 4.2), for example blocking a portion of the space of values of a hash function over the addresses and ports in the UDP datagram.
While QUIC endpoints are often able to survive address changes, e.g., due to NAT rebindings, blocking a portion of the traffic based on 5-tuple hashing increases the risk of black-holing an active connection when the address changes.

Note that some source ports are assumed to be reflection attack vectors by some servers; see Section 8.1 of [QUIC-APPLICABILITY]. As a result, NAT binding to these source ports can result in that traffic being blocked.

4.7.  DDoS Detection and Mitigation

On-path observation of the transport headers of packets can be used for various security functions. For example, Denial of Service (DoS) and Distributed DoS (DDoS) attacks against the infrastructure or against an endpoint can be detected and mitigated by characterising anomalous traffic. Other uses include support for security audits (e.g., verifying the compliance with ciphersuites); client and application fingerprinting for inventory; and alerts for network intrusion detection and other next generation firewall functions.

Current practices in detection and mitigation of DDoS attacks generally involve classification of incoming traffic (as packets, flows, or some other aggregate) into "good" (productive) and "bad" (DDoS) traffic, and then differential treatment of this traffic to forward only good traffic. This operation is often done in a separate specialized mitigation environment through which all traffic is filtered; a generalized architecture for separation of concerns in mitigation is given in [DOTS-ARCH].

Efficient classification of this DDoS traffic in the mitigation environment is key to the success of this approach. Limited first-packet garbage detection as in Section 3.1.2 and stateful tracking of QUIC traffic as in Section 4.2 above may be useful during classification.

Note that the use of a connection ID to support connection migration renders 5-tuple based filtering insufficient to detect active flows and requires more state to be maintained by DDoS defense systems if support of migration of QUIC flows is desired. For the common case of NAT rebinding, where the client's address changes without the client's intent or knowledge, DDoS defense systems can detect a change in the client's endpoint address by linking flows based on the server's connection IDs. However, QUIC's linkability resistance ensures that a deliberate connection migration is accompanied by a change in the connection ID. In this case, the connection ID cannot be used to distinguish valid, active traffic from new attack traffic.

It is also possible for endpoints to directly support security functions such as DoS classification and mitigation. Endpoints can cooperate with an in-network device directly by, e.g., sharing information about connection IDs.

Another potential method could use an on-path network device that relies on pattern inferences in the traffic and heuristics or machine learning instead of processing observed header information.

However, it is questionable whether connection migrations must be supported during a DDoS attack. While unintended migration without a connection ID change can be more easily supported, it might be acceptable to not support migrations of active QUIC connections that are not visible to the network functions performing the DDoS detection. As soon as the connection blocking is detected by the client, the client may be able to rely on the 0-RTT data mechanism provided by QUIC. When clients migrate to a new path, they should be prepared for the migration to fail and attempt to reconnect quickly.
Beyond in-network DDoS protection mechanisms, TCP syncookies
[RFC4937] are a well-established method of mitigating some kinds of
TCP DDoS attacks. QUIC Retry packets are the functional analogue to
syncookies, forcing clients to prove possession of their IP address
before committing server state. However, there are safeguards in
QUIC against unsolicited injection of these packets by intermediaries
who do not have consent of the end server. See [QUIC_LB] for
standard ways for intermediaries to send Retry packets on behalf of
consenting servers.

4.8.  Quality of Service Handling and ECMP Routing

It is expected that any QoS handling in the network, e.g., based on
use of DiffServ Code Points (DSCPs) [RFC2475] as well as Equal-Cost
Multi-Path (ECMP) routing, is applied on a per-flow basis (and not
per-packet), such that all packets belonging to the same active QUIC
connection get uniform treatment.

Using ECMP to distribute packets from a single flow across multiple
network paths, or any other non-uniform treatment of packets
belonging to the same connection, could result in variations in
order, delivery rate, and drop rate. As feedback about loss or delay
of each packet is used as input to the congestion controller, these
variations could adversely affect performance. Depending on the loss
recovery mechanism implemented, QUIC may be more tolerant of packet
reordering than traditional TCP traffic (see Section 2.7). However,
the recovery mechanism used by a flow cannot be known by the network
and therefore reordering tolerance should be considered as unknown.

Note that the 5-tuple of a QUIC connection can change due to
migration. In this case different flows are observed by the path and
may be treated differently, as congestion control is usually reset
on migration (see also Section 3.5).

4.9.  Handling ICMP Messages

Datagram Packetization Layer PMTU Discovery (PLPMTUD) can be used by
QUIC to probe for the supported PMTU. PLPMTUD optionally uses ICMP
messages (e.g., IPv6 Packet Too Big messages). Given known attacks
with the use of ICMP messages, the use of PLPMTUD in QUIC has been
designed to safely use but not rely on receiving ICMP feedback (see
Section 14.2.1 of [QUIC-TRANSPORT]).

Networks are recommended to forward these ICMP messages and retain as
much of the original packet as possible without exceeding the minimum
MTU for the IP version when generating ICMP messages as recommended
in [RFC1812] and [RFC4443].

4.10.  Guiding Path MTU

Some network segments support 1500-byte packets, but can only do so
by fragmenting at a lower layer before traversing a network segment
with a smaller MTU, and then reassembling within the network segment.
This is permissible even when the IP layer is IPv6 or IPv4 with the
DF bit set, because fragmentation occurs below the IP layer. However,
this process can add to compute and memory costs, leading to a
bottleneck that limits network capacity. In such networks this
generates a desire to influence a majority of senders to use smaller
packets, to avoid exceeding limited reassembly capacity.

For TCP, MSS clamping (Section 3.2 of [RFC4459]) is often used to
change the sender's TCP maximum segment size, but QUIC requires a
different approach. Section 14 of [QUIC-TRANSPORT] advises senders
to probe larger sizes using Datagram Packetization Layer PMTU
Discovery ([DPLPMTUD]) or Path Maximum Transmission Unit Discovery
(PMTUD: [RFC1191] and [RFC8201]). This mechanism encourages senders
to approach the maximum packet size, which could then cause
fragmentation within a network segment of which they may not be
aware.
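To illustrate why probing drives senders toward the largest size the
path will forward, the following Python sketch searches for the
largest acknowledged probe size between a base and a ceiling. It is a
simplified illustration, not the DPLPMTUD state machine: the function
name, the binary search strategy, and the `probe_acked` callback
(standing in for "send a padded probe of this size and wait for an
acknowledgment") are all assumptions made for this example. The base
of 1200 bytes reflects the minimum datagram size QUIC requires a path
to support.

```python
from typing import Callable


def search_plpmtu(
    probe_acked: Callable[[int], bool],
    base: int = 1200,
    ceiling: int = 1500,
) -> int:
    """Return the largest probe size in [base, ceiling] that was acknowledged.

    Hypothetical helper: `probe_acked(size)` is assumed to return True when
    a probe of `size` bytes is acknowledged and False when it is lost.
    `base` is assumed to work without probing.
    """
    confirmed = base
    low, high = base + 1, ceiling
    while low <= high:
        size = (low + high) // 2
        if probe_acked(size):
            confirmed = size   # this size works; try something larger
            low = size + 1
        else:
            high = size - 1    # probe lost; only the probe is affected
    return confirmed
```

If a network segment silently fragments and reassembles below the IP
layer, every probe up to the ceiling is acknowledged and the sender
settles on the ceiling. If the segment instead drops packets above its
preferred size, the search converges on that size, which is the
behavior the remainder of this section recommends.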
If path performance is limited when forwarding larger packets, an
on-path device should support a maximum packet size for a specific
transport flow and then consistently drop all packets that exceed the
configured size when the inner IPv4 packet has DF set, or IPv6 is
used.

Networks with configurations that would lead to fragmentation of
large packets within a network segment should drop such packets
rather than fragmenting them. Network operators who plan to
implement a more selective policy may start by focusing on QUIC.

QUIC flows cannot always be easily distinguished from other UDP
traffic, but we assume at least some portion of QUIC traffic can be
identified (see Section 3.1). For networks supporting QUIC, it is
recommended that a path drops any packet larger than the
fragmentation size. When a QUIC endpoint uses DPLPMTUD, it will use
a QUIC probe packet to discover the PMTU. If this probe is lost, it
will not impact the flow of QUIC data.

IPv4 routers generate an ICMP message when a packet is dropped
because the link MTU was exceeded. [RFC8504] specifies how an IPv6
node generates an ICMPv6 Packet Too Big message (PTB) in this case.
PMTUD relies upon an endpoint receiving such PTB messages [RFC8201],
whereas DPLPMTUD does not rely upon these messages, but still can
optionally use these to improve performance (see Section 4.6 of
[DPLPMTUD]).

A network cannot know in advance which discovery method is used by a
QUIC endpoint, so it should send a PTB message in addition to
dropping an oversized packet. A generated PTB message should be
compliant with the validation requirements of Section 14.2.1 of
[QUIC-TRANSPORT]; otherwise it will be ignored for PMTU discovery.
This provides a signal to the endpoint to prevent the packet size
from growing too large, which can entirely avoid network segment
fragmentation for that flow.

Endpoints can cache PMTU information in the IP-layer cache. This
short-term consistency between the PMTU for flows can help avoid an
endpoint using a PMTU that is inefficient. The IP cache can also
influence the PMTU value of other IP flows that use the same path
[RFC8201][DPLPMTUD], including IP packets carrying protocols other
than QUIC. The representation of an IP path is
implementation-specific [RFC8201].

5.  IANA Considerations

This document has no actions for IANA.

6.  Security Considerations

QUIC is an encrypted and authenticated transport. That means that,
once the cryptographic handshake is complete, QUIC endpoints discard
most packets that are not authenticated, greatly limiting the ability
of an attacker to interfere with existing connections.

However, some information is still observable, as supporting
manageability of QUIC traffic inherently involves tradeoffs with the
confidentiality of QUIC's control information; this entire document
is therefore security-relevant.

More security considerations for QUIC are discussed in
[QUIC-TRANSPORT] and [QUIC-TLS], generally considering active or
passive attackers in the network as well as attacks on specific QUIC
mechanisms.

Version Negotiation packets do not contain any mechanism to prevent
version downgrade attacks. However, future versions of QUIC that use
Version Negotiation packets are required to define a mechanism that
is robust against version downgrade attacks. Therefore, a network
node should not attempt to impact version selection, as version
downgrade may result in connection failure.

7.  Contributors

The following people have contributed significant text to and/or
feedback on this document:

*  Chris Box
*  Dan Druta
*  David Schinazi
*  Gorry Fairhurst
*  Ian Swett
*  Igor Lubashev
*  Jana Iyengar
*  Jared Mauch
*  Lars Eggert
*  Lucas Pardue
*  Marcus Ihlar
*  Mark Nottingham
*  Martin Duke
*  Martin Thomson
*  Matt Joras
*  Mike Bishop
*  Nick Banks
*  Thomas Fossati
*  Sean Turner

8.  Acknowledgments

Special thanks to last call reviewers Elwyn Davies, Barry Leiba, Al
Morton, and Peter Saint-Andre.

This work was partially supported by the European Commission under
Horizon 2020 grant agreement no. 688421 Measurement and Architecture
for a Middleboxed Internet (MAMI), and by the Swiss State Secretariat
for Education, Research, and Innovation under contract no. 15.0268.
This support does not imply endorsement.

9.  References

9.1.  Normative References

[QUIC-TLS]
   Thomson, M., Ed. and S. Turner, Ed., "Using TLS to Secure QUIC",
   RFC 9001, DOI 10.17487/RFC9001, May 2021.

[QUIC-TRANSPORT]
   Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based
   Multiplexed and Secure Transport", RFC 9000,
   DOI 10.17487/RFC9000, May 2021.

9.2.  Informative References

[DOTS-ARCH]
   Mortensen, A., Ed., Reddy.K, T., Ed., Andreasen, F., Teague, N.,
   and R. Compton, "DDoS Open Threat Signaling (DOTS) Architecture",
   RFC 8811, DOI 10.17487/RFC8811, August 2020.

[DPLPMTUD]
   Fairhurst, G., Jones, T., Tüxen, M., Rüngeler, I., and T. Völker,
   "Packetization Layer Path MTU Discovery for Datagram Transports",
   RFC 8899, DOI 10.17487/RFC8899, September 2020.

[I-D.ietf-dprive-dnsoquic]
   Huitema, C., Dickinson, S., and A. Mankin, "DNS over Dedicated
   QUIC Connections", Work in Progress, Internet-Draft,
   draft-ietf-dprive-dnsoquic-11, 21 March 2022.

[I-D.ietf-tsvwg-transport-encrypt]
   Fairhurst, G. and C. Perkins, "Considerations around Transport
   Header Confidentiality, Network Operations, and the Evolution of
   Internet Transport Protocols", Work in Progress, Internet-Draft,
   draft-ietf-tsvwg-transport-encrypt-21, 20 April 2021.

[IPIM]
   Allman, M., Beverly, R., and B. Trammell, "In-Protocol Internet
   Measurement (arXiv preprint 1612.02902)", 9 December 2016.

[QUIC-APPLICABILITY]
   Kuehlewind, M. and B. Trammell, "Applicability of the QUIC
   Transport Protocol", Work in Progress, Internet-Draft,
   draft-ietf-quic-applicability-15, 7 March 2022.

[QUIC-GREASE]
   Thomson, M., "Greasing the QUIC Bit", Work in Progress,
   Internet-Draft, draft-ietf-quic-bit-grease-02, 10 November 2021.

[QUIC-HTTP]
   Bishop, M., "Hypertext Transfer Protocol Version 3 (HTTP/3)",
   Work in Progress, Internet-Draft, draft-ietf-quic-http-34,
   2 February 2021.

[QUIC-INVARIANTS]
   Thomson, M., "Version-Independent Properties of QUIC", RFC 8999,
   DOI 10.17487/RFC8999, May 2021.

[QUIC-RECOVERY]
   Iyengar, J., Ed. and I. Swett, Ed., "QUIC Loss Detection and
   Congestion Control", RFC 9002, DOI 10.17487/RFC9002, May 2021.

[QUIC-TIMEOUT]
   Roskind, J., "QUIC (IETF-88 TSV Area Presentation)",
   7 November 2013.

[QUIC_LB]
   Duke, M., Banks, N., and C. Huitema, "QUIC-LB: Generating
   Routable QUIC Connection IDs", Work in Progress, Internet-Draft,
   draft-ietf-quic-load-balancers-13, 28 March 2022.

[RFC1191]
   Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
   DOI 10.17487/RFC1191, November 1990.
[RFC1812]
   Baker, F., Ed., "Requirements for IP Version 4 Routers",
   RFC 1812, DOI 10.17487/RFC1812, June 1995.

[RFC2475]
   Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., and W.
   Weiss, "An Architecture for Differentiated Services", RFC 2475,
   DOI 10.17487/RFC2475, December 1998.

[RFC3168]
   Ramakrishnan, K., Floyd, S., and D. Black, "The Addition of
   Explicit Congestion Notification (ECN) to IP", RFC 3168,
   DOI 10.17487/RFC3168, September 2001.

[RFC3449]
   Balakrishnan, H., Padmanabhan, V., Fairhurst, G., and M.
   Sooriyabandara, "TCP Performance Implications of Network Path
   Asymmetry", BCP 69, RFC 3449, DOI 10.17487/RFC3449,
   December 2002.

[RFC4443]
   Conta, A., Deering, S., and M. Gupta, Ed., "Internet Control
   Message Protocol (ICMPv6) for the Internet Protocol Version 6
   (IPv6) Specification", STD 89, RFC 4443, DOI 10.17487/RFC4443,
   March 2006.

[RFC4459]
   Savola, P., "MTU and Fragmentation Issues with In-the-Network
   Tunneling", RFC 4459, DOI 10.17487/RFC4459, April 2006.

[RFC4787]
   Audet, F., Ed. and C. Jennings, "Network Address Translation
   (NAT) Behavioral Requirements for Unicast UDP", BCP 127,
   RFC 4787, DOI 10.17487/RFC4787, January 2007.

[RFC4937]
   Arberg, P. and V. Mammoliti, "IANA Considerations for PPP over
   Ethernet (PPPoE)", RFC 4937, DOI 10.17487/RFC4937, June 2007.

[RFC5382]
   Guha, S., Ed., Biswas, K., Ford, B., Sivakumar, S., and P.
   Srisuresh, "NAT Behavioral Requirements for TCP", BCP 142,
   RFC 5382, DOI 10.17487/RFC5382, October 2008.

[RFC6066]
   Eastlake 3rd, D., "Transport Layer Security (TLS) Extensions:
   Extension Definitions", RFC 6066, DOI 10.17487/RFC6066,
   January 2011.

[RFC7301]
   Friedl, S., Popov, A., Langley, A., and E. Stephan, "Transport
   Layer Security (TLS) Application-Layer Protocol Negotiation
   Extension", RFC 7301, DOI 10.17487/RFC7301, July 2014.

[RFC7605]
   Touch, J., "Recommendations on Using Assigned Transport Port
   Numbers", BCP 165, RFC 7605, DOI 10.17487/RFC7605, August 2015.

[RFC7801]
   Dolmatov, V., Ed., "GOST R 34.12-2015: Block Cipher
   "Kuznyechik"", RFC 7801, DOI 10.17487/RFC7801, March 2016.

[RFC7838]
   Nottingham, M., McManus, P., and J. Reschke, "HTTP Alternative
   Services", RFC 7838, DOI 10.17487/RFC7838, April 2016.

[RFC7983]
   Petit-Huguenin, M. and G. Salgueiro, "Multiplexing Scheme
   Updates for Secure Real-time Transport Protocol (SRTP) Extension
   for Datagram Transport Layer Security (DTLS)", RFC 7983,
   DOI 10.17487/RFC7983, September 2016.

[RFC8201]
   McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed., "Path
   MTU Discovery for IP version 6", STD 87, RFC 8201,
   DOI 10.17487/RFC8201, July 2017.

[RFC8504]
   Chown, T., Loughney, J., and T. Winters, "IPv6 Node
   Requirements", BCP 220, RFC 8504, DOI 10.17487/RFC8504,
   January 2019.

[RFC9065]
   Fairhurst, G. and C. Perkins, "Considerations around Transport
   Header Confidentiality, Network Operations, and the Evolution of
   Internet Transport Protocols", RFC 9065, DOI 10.17487/RFC9065,
   July 2021.

[TLS-ECH]
   Rescorla, E., Oku, K., Sullivan, N., and C. A. Wood, "TLS
   Encrypted Client Hello", Work in Progress, Internet-Draft,
   draft-ietf-tls-esni-14, 13 February 2022.

[TMA-QOF]
   Trammell, B., Gugelmann, D., and N. Brownlee, "Inline Data
   Integrity Signals for Passive Measurement (in Proc. TMA 2014)",
   April 2014.

[WIRE-IMAGE]
   Trammell, B. and M. Kuehlewind, "The Wire Image of a Network
   Protocol", RFC 8546, DOI 10.17487/RFC8546, April 2019.
Authors' Addresses

Mirja Kuehlewind
Ericsson
Email: mirja.kuehlewind@ericsson.com

Brian Trammell
Google Switzerland GmbH
Gustav-Gull-Platz 1
CH-8004 Zurich
Switzerland
Email: ietf@trammell.ch