idnits 2.17.00 (12 Aug 2021)

/tmp/idnits9848/draft-kuhn-quic-careful-resume-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords.

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).

  -- The document date (6 May 2022) is 8 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '' and '' lines.

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

     No issues found here.

     Summary: 0 errors (**), 0 flaws (~~), 1 warning (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Internet Engineering Task Force                                  N. Kuhn
Internet-Draft                                       Thales Alenia Space
Intended status: Informational                                E. Stephan
Expires: 7 November 2022                                          Orange
                                                            G. Fairhurst
                                                                T. Jones
                                                  University of Aberdeen
                                                              C. Huitema
                                                    Private Octopus Inc.
                                                              6 May 2022

                     Carefully Resume QUIC Session
                   draft-kuhn-quic-careful-resume-01

Abstract

   This document provides a method to allow a QUIC session to carefully
   resume using a previously utilised Internet path.

   The method uses a set of computed congestion control parameters that
   are based on the previously observed path characteristics, such as
   the bottleneck bandwidth, available capacity, or the RTT.  These
   parameters are stored and can then be used to modify the congestion
   control behaviour of a subsequent connection.  The draft discusses
   assumptions around how a server ought to utilise the parameters to
   provide opportunities for a new flow to more quickly get up to speed
   (i.e., utilise available capacity).  It discusses how these changes
   impact the capacity at a shared network bottleneck and the response
   that is needed after any indication that the new rate is
   inappropriate.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 7 November 2022.

Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction
   2.  Language, notations and terms
     2.1.  Requirements Language
     2.2.  Notations and Terms
   3.  Scenarios of Interest
     3.1.  Large BDP Scenarios
     3.2.  Accommodating from a Known Reduction in Capacity
     3.3.  Optimizing Client Requests
     3.4.  Sharing Transport Information across Multiple Connections
     3.5.  Connection Establishment, Client and Server
   4.  The Phases of CC
   5.  Safe Jump
     5.1.  Rationale behind the Safety Guidelines
     5.2.  Rationale #1: Variable Network Conditions
     5.3.  Rationale #2: Malicious clients
     5.4.  Trade-off between the different solutions
       5.4.1.  Interoperability and Use Cases
       5.4.2.  Summary
   6.  Safety Guidelines
   7.  Acknowledgments
   8.  IANA Considerations
   9.  Security Considerations
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Appendix A.  Implementation Considerations
     A.1.  Rationale behind the different implementation options
     A.2.  Independent Local Storage of Values
     A.3.  Using NEW_TOKEN frames
     A.4.  BDP Frame
   Authors' Addresses

1.  Introduction

   The specification for the QUIC transport protocol [RFC9000] notes
   "Generally, implementations are advised to be cautious when using
   previous values on a new path."  The method described in this
   document uses a set of computed Congestion Control (CC) parameters
   that are based on the previously observed path characteristics, such
   as the bottleneck bandwidth, available capacity, or the Round Trip
   Time (RTT).  These parameters are stored and can then be used to
   modify the congestion control behaviour of a subsequent connection.

   All Internet transports are required to use a CC method.  In 2010,
   RFC 5783 provided a survey of alternative CC methods, and noted that
   there are challenges when a CC operates across paths with a high and/
   or variable bandwidth-delay product (BDP) [RFC5783].

   A CC algorithm typically takes time to ramp up the packet rate,
   called the "slow-start phase", informally known as the time to "Get
   up to speed".  The slow-start phase is a period in which a sender
   intentionally uses less capacity than might be available, with the
   intention of avoiding overshooting the actual capacity at a
   bottleneck, which would result in increased queueing (latency/jitter)
   and/or packet loss due to congestion.  An overshoot in the capacity
   can also have a detrimental effect on other flows sharing a common
   bottleneck.
   In the extreme case, persistent congestion can result in unwanted
   starvation of other flows [RFC8867] (i.e., preventing other flows
   from successfully sharing a common bottleneck).

   In Reno, the slow-start phase consists of a sequence of increases in
   the congestion window (cwnd) starting from the Initial Window (IW).
   Each step lasts approximately one path RTT, until the sender
   estimates that its sending rate has reached (or is nearing) the
   capacity at the bottleneck for the path.

   To fully utilise the capacity along a path with a certain path RTT,
   the transport needs to determine an appropriate volume of bytes in
   flight, based on the product of the available capacity and the RTT.
   [RFC6349] defines the BDP as follows: "Derived from Round-Trip Time
   (RTT) and network Bottleneck Bandwidth (BB), the Bandwidth-Delay
   Product (BDP) determines the Send and Received Socket buffer sizes
   required to achieve the maximum TCP Throughput."  The BDP estimated
   by a server includes all buffering experienced along a network path.
   Various approaches are possible to determine the BDP, based on
   measurements of the path characteristics.  [RFC6349] specifies one
   procedure for TCP.  CC for QUIC is specified in [RFC9002], which does
   not specify a required method to measure the BDP, allowing the sender
   to implement an appropriate method.

   This document specifies a method for QUIC that can improve traffic
   delivery (e.g., throughput) by allowing a QUIC connection to reduce
   the total duration of the slow-start phase under specific conditions.
   This introduces an alternative way to discover initial key path
   parameters, including a way to more rapidly and safely grow the cwnd.

   There are scenarios where sharing previously computed parameters
   relating to path characteristics, such as the bottleneck bandwidth or
   RTT, can help to save round-trip times at the start of a new
   connection.
   For example:

   1.  To optimize sessions that use a series of short flows over the
       same path, each of which needs to individually learn the
       available capacity/RTT (e.g., a client using Dynamic Adaptive
       Streaming over HTTP, DASH);

   2.  After a pause in transmission (e.g., when a user uses a path,
       pauses a session, and then wishes to resume the session over the
       same path);

   3.  To resume a session after a service disruption (e.g., where the
       network service is temporarily reduced due to a link propagation
       impairment, or where a user on a train journey travels through
       different areas of connectivity with a temporary change in path
       characteristics before the user returns to the original path
       characteristics).

   In all of these cases, specific characteristics of the path may have
   been learned, including CC information, such as the available
   capacity and RTT.  This CC information might be expected to be
   similar when a new connection is made between the same local and
   remote endpoints.

   While the server could take optimization decisions without
   considering the client's preference, in some cases a client could
   have information that is not available at the server.  A client may
   provide hints, for example: (1) information about how the upper
   layers expect to use a connection, such as the expected size of
   transfer; (2) an indication that the path/local interface has
   changed; (3) information related to current hardware limitations of
   the client; or (4) an understanding about the capacity needs of other
   concurrent flows that would compete for shared capacity.  As a
   result, a client could explicitly ask for tuning the slow start of
   the resumed connection, or to inhibit tuning.  This is discussed
   further later in the document.

   There are also cases where using the parameters of a previous
   connection is not appropriate, and there is a need to evaluate the
   potential for malicious use of the method.

   The remainder of this document:

   1.  discusses use-cases where carefully resuming QUIC sessions is
       expected to have benefit;

   2.  proposes guidelines for how to carefully utilise the previously
       stored CC information;

   3.  describes implementation considerations for the proposed method
       using QUIC;

   4.  discusses the trade-offs associated with the different
       implementation solutions.

2.  Language, notations and terms

   This section provides a brief summary of key terms and the
   requirements language that is used.  The document uses language drawn
   from a range of IETF RFCs.

2.1.  Requirements Language

   The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.2.  Notations and Terms

   This document defines current and saved values for a set of CC
   parameters:

   *  IW: Initial Window [RFC9002];

   *  current_iw: Current IW;

   *  recom_iw: Recommended IW;

   *  current_bb: Current estimated bottleneck bandwidth;

   *  saved_bb: Estimated bottleneck bandwidth preserved from a previous
      connection;

   *  current_rtt: Current RTT;

   *  saved_rtt: RTT measurement preserved from a previous connection;

   *  client_ip: IP address of the client;

   *  current_client_ip: Current IP address of the client;

   *  saved_client_ip: IP address of a previous connection by the
      client;

   *  remembered BDP parameters: a combination of saved_rtt and saved_bb.

   Congestion controllers, such as CUBIC or Reno, could estimate the
   saved_bb and current_bb values by utilizing a combination of the
   cwnd/flight_size and the minimum RTT.  A different method could be
   used to estimate the same values when using a rate-based congestion
   controller, such as BBR [I-D.cardwell-iccrg-bbr-congestion-control].
   It is important to consider whether the methods could result in over-
   estimating the bottleneck bandwidth, and the preserved values
   therefore ought to be used with caution.

3.  Scenarios of Interest

3.1.  Large BDP Scenarios

   QUIC introduces the concept of transport parameters (Section 7.4 of
   [RFC9000]).  This document notes that a new connection can utilise a
   set of key transport parameters from a previous connection to reduce
   the completion time for a transfer of size much larger than the IW
   over paths where the available capacity is also significantly larger
   than the IW.  This benefit is particularly evident for a path where
   the RTT is much larger than a typical Internet RTT.
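
   To make the dependence on the RTT concrete, the following Python
   sketch estimates how many round trips an idealised slow start needs
   before the cwnd covers the BDP.  It is an illustration only and not
   part of the method: it assumes the cwnd exactly doubles each RTT with
   no loss, and the function names, the 12000-byte IW and the path
   figures used below are hypothetical values chosen for the example.

```python
import math


def bdp_bytes(bottleneck_bw_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product in bytes (bandwidth in bit/s, RTT in s)."""
    return bottleneck_bw_bps / 8.0 * rtt_s


def slow_start_rounds(bdp: float, iw: float) -> int:
    """RTTs needed for a window starting at iw to reach bdp.

    Assumption: idealised slow start, i.e. the window doubles once
    per RTT and no packet is lost or CE-marked on the way.
    """
    if bdp <= iw:
        return 0
    return math.ceil(math.log2(bdp / iw))


if __name__ == "__main__":
    iw = 12000  # hypothetical IW: 10 packets of 1200 bytes
    for label, rtt in (("long-RTT path", 0.6), ("short-RTT path", 0.02)):
        rounds = slow_start_rounds(bdp_bytes(50e6, rtt), iw)
        print(f"{label}: {rounds} RTTs, about {rounds * rtt:.2f} s")
```

   With these illustrative inputs (50 Mbit/s, IW of 12000 bytes), the
   600 ms path needs 9 round trips (about 5.4 s) to reach the BDP,
   whereas the 20 ms path needs 4 round trips (about 0.08 s).  The
   absolute time spent in slow start therefore grows with both the RTT
   and the ratio of the BDP to the IW.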

   For example, over a satellite access network, a 5.3 MB transfer takes
   up to 9 seconds using standard congestion control, whereas using the
   specified method this could reduce to 4 seconds [IJSCN]; and the time
   to complete a 1 MB transfer could be reduced by 62 % [MAPRG111].
   Benefits are also expected for other sizes of transfer and for
   different path characteristics that also result in a higher BDP.

3.2.  Accommodating from a Known Reduction in Capacity

   A transport protocol is not able to assume that the path
   characteristics it experiences remain the same.  Variation can arise
   from a combination of various factors:

   *  Competing network traffic sharing a common bottleneck can result
      in short or long term variation;

   *  Changes in the forward path can change the set of links/routers
      over which the flow is forwarded (from routing/mobility/circuit
      restoration/interface change), resulting in a change in the
      bandwidth and the other traffic that shares all/part of the path;

   *  Link conditions can result in a change of the total bandwidth
      (e.g., as a result of changes in propagation conditions or sharing
      of a medium);

   *  Application/endpoint behaviour can change the capacity available
      to a flow.

   The characteristics of an Internet path therefore need to be measured
   by the transport protocol, and may not reflect the actual path used
   by a new connection.  Older measurements, or cases where measurements
   are known to vary significantly, are more likely to be invalid.

   In some cases (e.g., after a change in the interface used by the
   local endpoint), a client may be aware of such a change, and might be
   able to infer that a previously available path has again become
   available.
   However, to utilise the previous information, the client
   would need assurance that the path was to the same endpoint, and that
   the characteristics have not significantly changed from those
   previously measured.  When the path is expected to be the same, there
   is then an opportunity to save time (eliminate RTTs consumed by slow
   start) by utilising saved CC information for the path.

3.3.  Optimizing Client Requests

   Some styles of usage do not use long-lasting connections at the
   transport layer.  Instead, they use a series of shorter connections.
   An example is a client using Dynamic Adaptive Streaming over HTTP
   (DASH).  Such a client might be unable to reach the video playback
   quality that is supported by the path, because for each video chunk,
   the transport protocol needs to independently determine the path
   capacity.  The lower transfer rate is safe, but can also lead to an
   overly conservative requested rate by the client, because clients
   often adapt their application-layer requests based on the transport
   performance (i.e., the client could fail to increase the requested
   quality of video chunks, or to fill buffers to avoid stalling
   playback or to send high quality advertisements).

   There are other cases where applications could provide additional
   services if a client knew the path characteristics.

3.4.  Sharing Transport Information across Multiple Connections

   There can be benefit in sharing transport information across multiple
   concurrent connections.  [RFC9040] considers the sharing of transport
   parameters between TCP connections that originate from the same host.
   The proposal in this document has the advantage of storing server-
   generated information at the client and not requiring the server to
   retain additional state for each client.

3.5.  Connection Establishment, Client and Server

   In the previously detailed scenarios, the application data transfer
   was unidirectional towards the client, i.e., the main flow of data
   was from a server to a client (e.g., downloading a file or web page).
   This is the focus of the current version of the document.

   In a different example, the application data transfer can still be
   unidirectional, but towards the server, e.g., uploading an image/
   video to a server.

   There are also use cases where a client initiates a connection for a
   bidirectional service where both endpoints send appreciable data to
   each other, such as to support a remotely executing application, or a
   video conference call.

   In general, the guidelines proposed in this document apply when a
   congestion controller is sending data to a remote peer and that
   remote endpoint resumes the session.  Both endpoints can assume the
   role of a client or a server.

4.  The Phases of CC

   This document defines a series of different phases through which the
   CC algorithm moves as a connection gets up to speed.  The phases are
   labelled as follows:

   1.  Observe: During a previous connection, the current RTT
       (current_rtt), bottleneck bandwidth (current_bb) and current
       client IP (current_client_ip) are stored as saved_rtt, saved_bb
       and saved_client_ip;

   2.  Reconnaissance: When resuming a session between the same pair of
       IP addresses, the server measures the path characteristics of a
       new connection to confirm the path appears the same as observed
       previously (e.g., path with similar RTT).  The server also seeks
       assurance that initial data is not lost, to avoid resuming under
       congested conditions.

   3.  Unvalidated: Utilise the saved path characteristics to send at a
       rate higher than allowed by slow start.
       The convergence towards
       the previous rate is expected to be quicker than when using
       traditional slow-start mechanisms, but should not be
       instantaneous, to avoid adding congestion to a congested
       bottleneck.

       1.  If the unvalidated rate was used without inducing noticeable
           congestion to the path, the server is permitted to continue
           sending at this rate in the 'Normal' phase.

       2.  If the validation phase determines that previous parameters
           are not valid (due to a change) or congestion was
           experienced, the sender must withdraw rapidly to a safe rate,
           before it enters the 'Normal' phase.

   4.  Normal: Resume using the normal CC method.

5.  Safe Jump

5.1.  Rationale behind the Safety Guidelines

   NOTE: The sender ought not to re-utilise all capacity it previously
   used, to avoid starving flows that started after the measurement.
   How strongly should this be stated (MUST or SHOULD)?  What safety
   factor is appropriate for the resuming sender?  If using slow-start
   it would anyway double the rate on the next RTT, so is capacity/2
   appropriate to initially try?

   A connection MUST NOT use the previously measured saved_rtt and
   saved_bb to simply initialise a new flow to resume sending at the
   same rate.

   *  Rationale #1: Bottleneck bandwidth and network traffic can change
      at any time.  An Internet method needs to be robust to network
      conditions that can differ from one session to the next, due to
      variations in the forwarding path, reconfiguration of equipment or
      changes in the link conditions.  An Internet method needs to be
      robust to changes in network traffic, including the arrival of new
      traffic flows that compete for the bottleneck capacity.
      Behaviours need to be designed that avoid sending excessive data
      into a congested bottleneck because this can have a material
      impact on any flows using that bottleneck, and the ability of
      those flows to control their own sending rate.

   *  Rationale #2: Information sent by a malicious client is not
      relevant.  A client could request a server to use a cwnd higher
      than appropriate, to gain an unfair share of capacity for itself
      or to induce congestion for other flows.  In any case, a server
      can decide whether to fully use the newly allowed rate.

5.2.  Rationale #1: Variable Network Conditions

   The server MUST check the validity of any received saved_rtt and
   saved_bb parameters, whether these are sent by a client or are stored
   at the server.  The following events indicate cases where the use of
   these parameters is inappropriate:

   *  IP address change: If the client changes its local IP address
      (i.e., the saved_client_ip is different from the
      current_client_ip), the different source address is assumed to be
      an indication of a different network path.  This new path does not
      necessarily exhibit the same characteristics as the old one.  If
      the server changes its IP address after a migration, it would not
      be safe to exploit previously estimated parameters.

   *  RTT change: A significant change in RTT might be an indication
      that the network conditions have changed.  Since the CC
      information is directly impacted by the RTT, a significant change
      in the RTT is a strong indication that the previously estimated
      BDP parameters are likely to not be valid for the current path.
      NOTE: This document needs to define a significant change.

   *  Lifetime of the information: The CC information is temporal.
      Frequent connections to the same IP address are likely to track
      changes, but long-term use of previous values is not appropriate.
      NOTE: This document needs to define how long.

   *  BB over-estimation: There are cases where using a measured cwnd
      would inflate the bottleneck bandwidth.  At the end of the CC slow
      start phase, the value of cwnd can be significantly larger than
      the minimum value needed to utilise the path (i.e., cwnd
      overshoot).  In most cases, the cwnd finally converges to a stable
      value after a few more RTTs.  It would be inappropriate to use an
      overshoot in the cwnd as a basis for estimating the bottleneck
      bandwidth.  NOTE: One mitigation could be to further restrict to
      only a fraction (e.g., 1/2) of the previously used cwnd; another
      mitigation might be to calculate the bottleneck bandwidth based on
      the flight_size or an averaged cwnd.

   *  Preventing Starvation of New Flows: It would not be appropriate to
      fully use a bottleneck bandwidth estimate based on a previous
      measurement of capacity, because new flows might have started
      using the available capacity since that measurement was made.  The
      mitigation could be to restrict to only a fraction (e.g., 1/2) of
      the previously used cwnd.

   There are several solutions to mitigate the impact of changes in
   network conditions:

   *  Rationale #1 - Solution #1: When resuming a session, restore the
      current_bb and current_rtt from the saved_bb and saved_rtt
      parameters estimated from a previous connection.

   *  Rationale #1 - Solution #2: When resuming a session, implement a
      safety check so that using the saved_bb and saved_rtt parameters
      does not cause congestion over the path.  In this case, the
      current_bb and current_rtt might not be set directly to the
      saved_bb and saved_rtt: the server might wait for the completion
      of the safety check before this is done.

   Section 6 describes various approaches for Rationale #1 - Solution
   #2.

5.3.  Rationale #2: Malicious clients

   The server MUST check the integrity of the saved_rtt and saved_bb
   parameters received from a client.

   There are several solutions to avoid attacks by malicious clients:

   *  Rationale #2 - Solution #1: The server stores a local estimate of
      the bottleneck bandwidth and RTT parameters as the saved_bb and
      saved_rtt.

   *  Rationale #2 - Solution #2: The server sends the estimate of the
      bottleneck bandwidth and RTT parameters to the client as the
      saved_bb and saved_rtt in a block of information that is
      authenticated.  This information could also be encrypted by the
      server.  The client resends the same information when resuming a
      connection.  The server can use its local key information to
      authenticate the information, without needing to keep a local
      copy.

   *  Rationale #2 - Solution #3: This approach is the same as above,
      except that the server sends an estimate of the saved_rtt and
      saved_bb parameters in a form that may be read by the client.  The
      information might not be encrypted, or the information might be
      duplicated outside of the encrypted block.  This allows a client
      to read, but not modify, the saved_rtt and saved_bb parameters and
      could enable a client to decide whether it thinks the new
      parameters are appropriate, based on client-side information about
      the network conditions, connectivity, or needs of the session
      using the connection.

   Appendix A describes various implementation approaches for each of
   these solutions using local storage (Appendix A.2 for Rationale #2 -
   Solution #1), NEW_TOKEN Frame (Appendix A.3 for Rationale #2 -
   Solution #2), BDP extension Frame (Appendix A.4 for Rationale #2 -
   Solution #3).

5.4.  Trade-off between the different solutions

   This section provides a description of several implementation options
   and discusses their respective advantages and drawbacks.
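
   As an illustration of the authenticated block used by Rationale #2 -
   Solution #2, the Python sketch below shows one way a server could
   seal saved_rtt, saved_bb and saved_client_ip so that it can later
   verify the block without keeping a local copy.  This is a minimal
   sketch under stated assumptions, not the encoding of the NEW_TOKEN or
   BDP Frame: the field layout, the use of HMAC-SHA256, the IPv4-only
   address field and the function names seal() and open_block() are all
   hypothetical choices made for the example.

```python
import hashlib
import hmac
import struct
from typing import Optional, Tuple

# Hypothetical layout: saved_rtt (microseconds), saved_bb (bytes/s),
# saved_client_ip (4-byte IPv4 address), all in network byte order.
_FMT = "!QQ4s"
_TAG_LEN = hashlib.sha256().digest_size  # 32 bytes


def seal(key: bytes, saved_rtt_us: int, saved_bb: int,
         saved_client_ip: bytes) -> bytes:
    """Serialise the parameters and append an HMAC-SHA256 tag."""
    payload = struct.pack(_FMT, saved_rtt_us, saved_bb, saved_client_ip)
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return payload + tag


def open_block(key: bytes, block: bytes) -> Optional[Tuple[int, int, bytes]]:
    """Return (saved_rtt_us, saved_bb, saved_client_ip), or None when
    the block has the wrong length or fails authentication."""
    if len(block) != struct.calcsize(_FMT) + _TAG_LEN:
        return None
    payload, tag = block[:-_TAG_LEN], block[-_TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(tag, expected):
        return None
    return struct.unpack(_FMT, payload)
```

   A block that passes this check has only been shown to originate from
   the server; the validity checks of Section 5.2 (address change, RTT
   change, lifetime) still apply to the values it carries.  The sketch
   authenticates but does not encrypt, so the client can read the
   values, which is closer to Solution #3 than to an opaque token.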

   While the solutions for Rationale #2 remain open to discussion, the
   server MUST consider Rationale #1 - Solution #2 and avoid Rationale
   #1 - Solution #1: the server MUST implement a safety check to measure
   whether the saved BDP parameters (i.e., saved_rtt and saved_bb) are
   relevant or check that their usage would not cause excessive
   congestion over the path.

   Security considerations are discussed in Section 9.

5.4.1.  Interoperability and Use Cases

   A server that stores a resumption ticket for each client, to protect
   against replay from a third-party IP address, could also store the IP
   address (i.e., saved_client_ip) and BDP parameters (i.e., saved_rtt
   and saved_bb) of the previous session of the client.

   When the BDP Frame extension is used, locally stored BDP parameters
   at the server can provide a cross-check of the BDP parameters sent by
   a client.  The server can enable a safe jump even without the BDP
   Frame extension.  However, using the parameters enables a client to
   choose whether to request this or not, enabling it to utilize local
   knowledge of the network conditions, connectivity, or session
   requirements.

   XXX-Editor-note: Text to be improved: Storing local values related to
   the BDP would help improve the ingress for 0-RTT connections;
   however, not using a BDP Frame extension could reduce the interest of
   the approach where (1) the client knows the BDP estimation at the
   server, (2) the client decides to accept or reject ingress
   optimization, (3) the client tunes application level requests.

5.4.2.  Summary

   Local storage of values can be secure and the BDP Frame extension
   provides more information to the client and more interoperability.
   Figure 1 provides a summary of the advantages and drawbacks of each
   approach.

   +---------+-----------+----------------+---------------+-----------+
   |Rationale| Solution  |   Advantage    |   Drawback    |  Comment  |
   +---------+-----------+----------------+---------------+-----------+
   |#1       |#1         |                |               |           |
   |Variable |set        |Ingress optim.  |Risk of adding |MUST NOT   |
   |Network  |current_*  |                | congestion    |implement  |
   |         |to saved_* |                |               |           |
   |         +-----------+----------------+---------------+-----------+
   |         |#2         |                |               |           |
   |         |Implement  |Reduce risk of  |Negative impact|MUST       |
   |         |safety     | adding         | on ingress    |implement  |
   |         |check      | congestion     | optim.        |Section 6  |
   +---------+-----------+----------------+---------------+-----------+
   |#2       |#1         |                |               |           |
   |Malicious|Local      |Enforced        |Client unable  |           |
   |client   |storage    | security       | to decide to  |           |
   |         |           |                | reject        |           |
   |         |           |                |Malicious      |           |
   |         |           |                | server could  |           |
   |         |           |                | fill client's |           |
   |         |           |                | buffer        |           |
   |         |           |                |Limited        |           |
   |         |           |                | use-cases     |App. A.2   |
   |         +-----------+----------------+---------------+-----------+
   |         |#2         |                |               |           |
   |         |NEW_TOKEN  |Save resource   |Malicious      |           |
   |         |           | at server      | client could  |           |
   |         |           |Opaque token    | change token  |           |
   |         |           | protected      | even if       |           |
   |         |           |                | protected     |           |
   |         |           |                |Malicious      |           |
   |         |           |                | server could  |           |
   |         |           |                | fill client's |           |
   |         |           |                | buffer        |           |
   |         |           |                |Server may not |           |
   |         |           |                | trust client  |App. A.3   |
   |         +-----------+----------------+---------------+-----------+
   |         |#3         |                |               |           |
   |         |BDP        |Extended        |Malicious      |           |
   |         |extension  | use-cases      | client could  |           |
   |         |           |Save resource   | change BDP    |           |
   |         |           | at server      | even if       |           |
   |         |           |Client can      | protected     |           |
   |         |           | read and decide|Server may not |           |
   |         |           | to reject      | trust client  |           |
   |         |           |BDP extension   |               |           |
   |         |           | protected      |               |           |
   |         |           |                |               |App. A.4   |
   +---------+-----------+----------------+---------------+-----------+

   XXX-Editor-Note: Need to clarify the text around changing
   the authenticated token.

                      Figure 1: Comparing solutions

6.  Safety Guidelines

   The following safety guidelines refer to the labelling defined in
   Section 4.

   The safety guidelines are designed to mitigate the risk that a server
   adds excessive congestion to an already congested path.  The
   following mechanisms help in fulfilling this objective:

   *  (observation phase) The server SHOULD NOT store and/or send
      information related to a previously estimated bottleneck bandwidth
      (saved_bb) (see Section 2.2 for more details on bottleneck
      bandwidth definition), if the cwnd is not at least four times
      larger than the IW.

   *  (reconnaissance phase) The server MUST NOT send more than the
      recommended maximum IW (recom_iw) in the first RTT of transmitting
      data [RFC9000].  (When used in a controlled network, additional
      information about local path characteristics could be known that
      might be used to configure a non-standard IW).

   *  (reconnaissance phase) The server MUST compare the measured
      transport parameters (in particular current_rtt) of the 0-RTT
      connection with those of the 1-RTT connection (in particular
      saved_rtt).  The method MUST NOT be used when the path fails to be
      validated;

   *  (unvalidated phase) The server MUST NOT use the parameters if any
      of the first IW packets are detected as lost or acknowledgements
      indicate that packets were ECN CE-marked.  These are indications
      of potential congestion and therefore the method MUST NOT be used;

   *  (unvalidated phase) The server MUST implement the retreat method
      when packets are detected as lost or acknowledgements indicate the
      packets were ECN CE-marked.  These are indications of potential
      congestion and therefore the method MUST NOT be used.
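
   The guidelines above can be read as a small decision procedure.  The
   Python sketch below is one hedged interpretation, not a normative
   algorithm: the Phase enum, the function names and the boolean inputs
   (iw_fully_acked, congestion_seen) are hypothetical, and the 1.2 RTT
   tolerance is the tentative value that this draft uses elsewhere in
   this section and flags as still open.

```python
from enum import Enum
from typing import Tuple


class Phase(Enum):
    RECONNAISSANCE = "reconnaissance"
    UNVALIDATED = "unvalidated"
    NORMAL = "normal"


# Tentative tolerance taken from the draft's current_rtt < 1.2*saved_rtt.
RTT_TOLERANCE = 1.2


def may_save_bb(cwnd: int, iw: int) -> bool:
    """Observation phase: only remember saved_bb when the cwnd is at
    least four times the IW."""
    return cwnd >= 4 * iw


def after_reconnaissance(current_rtt: float, saved_rtt: float,
                         iw_fully_acked: bool,
                         congestion_seen: bool) -> Phase:
    """Decide whether the saved parameters may be used at all.

    congestion_seen is True when any of the first IW packets was lost
    or reported as ECN CE-marked.
    """
    path_matches = current_rtt < RTT_TOLERANCE * saved_rtt
    if iw_fully_acked and not congestion_seen and path_matches:
        return Phase.UNVALIDATED
    # Path not validated: fall back to the normal CC method.
    return Phase.NORMAL


def after_unvalidated(congestion_seen: bool) -> Tuple[Phase, bool]:
    """Return the next phase and whether the retreat to a safe rate
    has to be performed before entering it."""
    return Phase.NORMAL, congestion_seen
```

   In this reading, a resumed connection that sees a matching RTT and a
   cleanly acknowledged IW moves to the unvalidated phase, and any loss
   or CE mark observed there triggers the retreat before the connection
   continues in the normal phase.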

   The proposed mechanisms SHOULD be limited by any rate-limitation
   mechanisms of QUIC, such as flow control mechanisms or amplification
   attack prevention.  In particular, it may be necessary to issue
   proactive MAX_DATA frames to increase the flow control limits of a
   connection.  In addition, the maximum number of packets that can be
   sent without acknowledgements needs to be chosen to avoid creating
   or increasing congestion on the path.

   This extension MUST NOT provide an opportunity for the current
   connection to be a vector of an amplification attack.  The address
   validation process, used to prevent amplification attacks, SHOULD be
   performed [RFC9000].

   XXX-Editor-note: This probably should be a range rather than an
   inequality (current_rtt < 1.2*saved_rtt).

   The following mechanisms could be implemented:

   *  Exploit a standard IW:

      1.  The server sends the first data packet using the IW - this is
          a safe starting point for any path where there is no path
          information or congestion control information.  This avoids
          adding excessive congestion to a path;

      2.  The sender monitors the reception of the IW data.  If the path
          characteristics resemble those of a recent previous session
          to the same server (i.e., current_rtt < 1.2*saved_rtt and all
          data was acknowledged without reported congestion), the
          method permits the sender to utilise the saved_bb as an input
          to adapt current_bb to rapidly determine a new safe rate;

      3.  The sender needs to avoid a burst of packets resulting from a
          step-increase in the congestion window [RFC9000].  Pacing the
          packets as a function of the current_rtt can provide this
          additional safety during the period in which the CWND is
          increased by the method.

   *  Identify a relevant pacing rhythm:

      -  The server estimates the pacing rhythm using saved_rtt and
         saved_bb.
         The Inter-packet Transmission Time (ITT) is determined by
         the ratio between the current Maximum Message Size (MMS) and
         the ratio between the saved_bb and saved_rtt.  A tunable
         safety margin can avoid sending more than a recommended
         maximum IW (recom_iw), in which case current_iw replaces
         saved_bb in this ratio:

         o  current_iw = min(recom_iw,saved_bb)

         o  ITT = MMS/(current_iw/saved_rtt)

      -  When the successful receipt of the IW data is acknowledged, the
         server returns to a standard slow-start mechanism.

   *  Tune slow-start mechanisms: After transport parameters are set to
      a previously estimated bottleneck bandwidth, if the slow-start
      mechanisms continue, the sender can then overshoot the bottleneck
      capacity.  This can occur even when using the safety check
      described in this section.

      -  For NewReno and CUBIC, it is recommended to exit slow-start and
         enter the congestion avoidance phase.

      -  For BBR, it is recommended to enter the "probe bandwidth"
         state.

   This follows the idea presented in [RFC4782],
   [I-D.irtf-iccrg-sallantin-initial-spreading] and [CONEXT15].

7.  Acknowledgments

   The authors would like to thank Gabriel Montenegro, Patrick McManus,
   Ian Swett, Igor Lubashev, Robin Marx, Roland Bless and Franklin Simo
   for their fruitful comments on earlier versions of this document.

8.  IANA Considerations

   TBD: Text is required to register the BDP Frame and the enable_bdp
   transport parameter.  Parameters are registered using the procedure
   defined in [RFC9000].

9.  Security Considerations

   Security considerations for QUIC are discussed in Section 6.

   The client can send information related to the saved_rtt and saved_bb
   to the server with the BDP Frame extension using either Rationale #2
   - Solution #2 or Rationale #2 - Solution #3.  However, the server
   SHOULD NOT trust the client.
   Indeed, even if 0-RTT packets containing the BDP Frame are
   encrypted, a client could modify the values within the extension and
   then encrypt the 0-RTT packet.  Authentication mechanisms might not
   guarantee that the values are safe.  Although it is not easy for a
   client to modify authenticated or encrypted data without this being
   detected by a server, a malicious client could still attempt such a
   modification.  One way to avoid this is for a server to also store
   the saved_rtt and saved_bb parameters.

   A malicious client might modify the saved_bb parameter to convince
   the server to use a larger CWND than appropriate.  Using the
   algorithms proposed in Section 6, the server may reduce any intended
   harm and can check that part of the information provided by the
   client is valid.

   Storing the BDP parameters locally at the server reduces the risks
   associated with allowing the client to transmit information related
   to the BDP of the path, for example when a malicious client tries to
   break the protection of the information that it had received.

10.  References

10.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4782]  Floyd, S., Allman, M., Jain, A., and P. Sarolahti, "Quick-
              Start for TCP and IP", RFC 4782, DOI 10.17487/RFC4782,
              January 2007.

   [RFC6349]  Constantine, B., Forget, G., Geib, R., and R. Schrage,
              "Framework for TCP Throughput Testing", RFC 6349,
              DOI 10.17487/RFC6349, August 2011.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC8446]  Rescorla, E., "The Transport Layer Security (TLS) Protocol
              Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018.

   [RFC9000]  Iyengar, J., Ed. and M.
              Thomson, Ed., "QUIC: A UDP-Based Multiplexed and Secure
              Transport", RFC 9000, DOI 10.17487/RFC9000, May 2021.

   [RFC9002]  Iyengar, J., Ed. and I. Swett, Ed., "QUIC Loss Detection
              and Congestion Control", RFC 9002, DOI 10.17487/RFC9002,
              May 2021.

10.2.  Informative References

   [CONEXT15] Li, Q., Dong, M., and P B. Godfrey, "Halfback: Running
              Short Flows Quickly and Safely", ACM CoNEXT, 2015.

   [I-D.cardwell-iccrg-bbr-congestion-control]
              Cardwell, N., Cheng, Y., Yeganeh, S. H., Swett, I., and V.
              Jacobson, "BBR Congestion Control", Work in Progress,
              Internet-Draft, draft-cardwell-iccrg-bbr-congestion-
              control-02, 7 March 2022.

   [I-D.irtf-iccrg-sallantin-initial-spreading]
              Sallantin, R., Baudoin, C., Arnal, F., Dubois, E., Chaput,
              E., and A. Beylot, "Safe increase of the TCP's Initial
              Window Using Initial Spreading", Work in Progress,
              Internet-Draft, draft-irtf-iccrg-sallantin-initial-
              spreading-00, 15 January 2014.

   [IJSCN]    Thomas, L., Dubois, E., Kuhn, N., and E. Lochin, "Google
              QUIC performance over a public SATCOM access",
              International Journal of Satellite Communications and
              Networking 10.1002/sat.1301, 2019.

   [MAPRG111] Kuhn, N., Stephan, E., Fairhurst, G., Jones, T., and C.
              Huitema, "Feedback from using QUIC's 0-RTT-BDP extension
              over SATCOM public access", IETF 111 - MAPRG meeting,
              2022.

   [RFC5783]  Welzl, M. and W. Eddy, "Congestion Control in the RFC
              Series", RFC 5783, DOI 10.17487/RFC5783, February 2010.

   [RFC8867]  Sarker, Z., Singh, V., Zhu, X., and M. Ramalho, "Test
              Cases for Evaluating Congestion Control for Interactive
              Real-Time Media", RFC 8867, DOI 10.17487/RFC8867, January
              2021.

   [RFC9040]  Touch, J., Welzl, M., and S. Islam, "TCP Control Block
              Interdependence", RFC 9040, DOI 10.17487/RFC9040, July
              2021.

Appendix A.  Implementation Considerations

A.1.
Rationale behind the different implementation options

   The NewSessionTicket message of TLS can offer a solution.  The
   proposal is to add a 'bdp_metada' field in the NewSessionTicket,
   which the client is able to read.  The only extension currently
   defined in TLS 1.3 that can be seen by the client is
   max_early_data_size (see Section 4.6.1 of [RFC8446]).  However, in
   the general design of QUIC, TLS sessions are managed by a TLS stack.

   Three distinct approaches are presented: sending an opaque blob to
   the client that the client may return to the server when establishing
   a future new connection (see Appendix A.3), enabling local storage of
   the BDP information (see Appendix A.2) and a BDP Frame extension (see
   Appendix A.4).

A.2.  Independent Local Storage of Values

   This approach lets both a client and a server independently store
   their BDP parameters:

   *  During a 1-RTT session, the endpoint stores the RTT (as the
      saved_rtt) and bottleneck bandwidth (as the saved_bb) together in
      the session resume ticket.  The client can also store the IP
      address of the server;

   *  The server maintains a table of previously issued tickets, indexed
      by the random ticket identifier that is used to guarantee
      uniqueness of the Authenticated Encryption with Associated Data
      (AEAD) encryption.  Old tickets are removed from the table using
      the Least Recently Used (LRU) logic.  For each ticket identifier,
      the table holds the RTT and bottleneck bandwidth (i.e. saved_rtt
      and saved_bb), and also the IP address of the client (i.e.
      saved_client_ip).

   During the 0-RTT session, the local endpoint waits for the first RTT
   measurement from the remote endpoint IP address.  This is used to
   verify that the current_rtt has not significantly changed from the
   saved_rtt (used as an indication that the BDP information is
   appropriate for the current path).
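
   The server-side table described above can be sketched as follows.
   This is a minimal illustration under stated assumptions, not a
   definitive implementation: the SavedPath and TicketTable names, the
   capacity parameter, the units of the fields and the choice to ignore
   an entry when the client IP address differs are all hypothetical.

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SavedPath:
    """Values held for one ticket identifier (units are assumptions)."""
    saved_rtt: float       # seconds
    saved_bb: float        # bottleneck bandwidth, bytes per second
    saved_client_ip: str   # client address seen when the ticket was issued


class TicketTable:
    """Table of issued tickets with Least Recently Used (LRU) eviction."""

    def __init__(self, capacity: int) -> None:
        self._capacity = capacity
        self._entries: "OrderedDict[bytes, SavedPath]" = OrderedDict()

    def store(self, ticket_id: bytes, entry: SavedPath) -> None:
        """Insert or refresh an entry, evicting the oldest if full."""
        self._entries[ticket_id] = entry
        self._entries.move_to_end(ticket_id)
        while len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # drop least recently used

    def lookup(self, ticket_id: bytes, client_ip: str) -> Optional[SavedPath]:
        """Return the saved values, or None if unknown or from another IP.

        Rejecting a mismatched client IP is an assumption of this sketch.
        """
        entry = self._entries.get(ticket_id)
        if entry is None or entry.saved_client_ip != client_ip:
            return None
        self._entries.move_to_end(ticket_id)  # mark as recently used
        return entry
```

   A lookup that returns None simply means that the connection proceeds
   with standard congestion control; a successful lookup still has to
   pass the RTT comparison before the saved values are used.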

   If this RTT is confirmed, the endpoint also verifies that an IW of
   data has been acknowledged without requiring retransmission or
   resulting in an ECN CE-mark.  This second check detects whether a
   path is experiencing significant congestion (i.e., where it would not
   be safe to update the cwnd based on the saved_bb).  In practice, this
   could be realized by a proportional increase in the cwnd, where the
   increase is (saved_bb/IW)*proportion_of_IW_currently-ACKed.

   This solution does not allow a client to request the server not to
   use the BDP parameters.  If the server does not want to store the
   metrics from previous connections, an equivalent of the
   tcp_no_metrics_save for QUIC may be necessary.  Such an option could
   be negotiated to allow a client to choose whether the saved
   information is used.

A.3.  Using NEW_TOKEN frames

   A server can send a NEW_TOKEN Frame to the client.  The token is an
   opaque (encrypted) blob and the client cannot read its content (see
   Section 19.7 of [RFC9000]).  The client sends the received token in
   the header of an Initial packet of a later connection.

A.4.  BDP Frame

   Using BDP Frames, the server could send information relating to the
   path characteristics to the client.  The use of the BDP Frame is
   negotiated with the client.  The client can read its content.  If the
   client agrees with the usage of previous session parameters, it can
   send the BDP Frame back to the server in an Initial packet of a later
   connection.

Authors' Addresses

   Nicolas Kuhn
   Thales Alenia Space
   Email: nicolas.kuhn.ietf@gmail.com

   Emile Stephan
   Orange
   Email: emile.stephan@orange.com

   Godred Fairhurst
   University of Aberdeen
   Department of Engineering
   Fraser Noble Building
   Aberdeen
   Email: gorry@erg.abdn.ac.uk

   Tom Jones
   University of Aberdeen
   Department of Engineering
   Fraser Noble Building
   Aberdeen
   Email: tom@erg.abdn.ac.uk

   Christian Huitema
   Private Octopus Inc.
   Email: huitema@huitema.net