idnits 2.17.00 (12 Aug 2021)

/tmp/idnits10852/draft-ietf-quic-recovery-34.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == There are 3 instances of lines with non-ascii characters in the document.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (15 January 2021) is 484 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'Initial' is mentioned on line 1708, but not defined

  == Outdated reference: draft-ietf-quic-tls has been published as RFC 9001

  == Outdated reference: draft-ietf-quic-transport has been published as RFC
     9000

  == Outdated reference: draft-ietf-tcpm-rack has been published as RFC 8985

     Summary: 0 errors (**), 0 flaws (~~), 6 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

QUIC                                                     J. Iyengar, Ed.
Internet-Draft                                                    Fastly
Intended status: Standards Track                           I. Swett, Ed.
Expires: 19 July 2021                                             Google
                                                         15 January 2021


               QUIC Loss Detection and Congestion Control
                      draft-ietf-quic-recovery-34

Abstract

   This document describes loss detection and congestion control
   mechanisms for QUIC.

Note to Readers

   Discussion of this draft takes place on the QUIC working group
   mailing list (quic@ietf.org), which is archived at
   https://mailarchive.ietf.org/arch/search/?email_list=quic.

   Working Group information can be found at https://github.com/quicwg;
   source code and issues list for this draft can be found at
   https://github.com/quicwg/base-drafts/labels/-recovery.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 19 July 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions and Definitions
   3.  Design of the QUIC Transmission Machinery
   4.  Relevant Differences Between QUIC and TCP
     4.1.  Separate Packet Number Spaces
     4.2.  Monotonically Increasing Packet Numbers
     4.3.  Clearer Loss Epoch
     4.4.  No Reneging
     4.5.  More ACK Ranges
     4.6.  Explicit Correction For Delayed Acknowledgments
     4.7.  Probe Timeout Replaces RTO and TLP
     4.8.  The Minimum Congestion Window is Two Packets
   5.  Estimating the Round-Trip Time
     5.1.  Generating RTT samples
     5.2.  Estimating min_rtt
     5.3.  Estimating smoothed_rtt and rttvar
   6.  Loss Detection
     6.1.  Acknowledgment-Based Detection
       6.1.1.  Packet Threshold
       6.1.2.  Time Threshold
     6.2.  Probe Timeout
       6.2.1.  Computing PTO
       6.2.2.  Handshakes and New Paths
       6.2.3.  Speeding Up Handshake Completion
       6.2.4.  Sending Probe Packets
     6.3.  Handling Retry Packets
     6.4.  Discarding Keys and Packet State
   7.  Congestion Control
     7.1.  Explicit Congestion Notification
     7.2.  Initial and Minimum Congestion Window
     7.3.  Congestion Control States
       7.3.1.  Slow Start
       7.3.2.  Recovery
       7.3.3.  Congestion Avoidance
     7.4.  Ignoring Loss of Undecryptable Packets
     7.5.  Probe Timeout
     7.6.  Persistent Congestion
       7.6.1.  Duration
       7.6.2.  Establishing Persistent Congestion
       7.6.3.  Example
     7.7.  Pacing
     7.8.  Under-utilizing the Congestion Window
   8.  Security Considerations
     8.1.  Loss and Congestion Signals
     8.2.  Traffic Analysis
     8.3.  Misreporting ECN Markings
   9.  IANA Considerations
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Appendix A.  Loss Recovery Pseudocode
     A.1.  Tracking Sent Packets
       A.1.1.  Sent Packet Fields
     A.2.  Constants of Interest
     A.3.  Variables of interest
     A.4.  Initialization
     A.5.  On Sending a Packet
     A.6.  On Receiving a Datagram
     A.7.  On Receiving an Acknowledgment
     A.8.  Setting the Loss Detection Timer
     A.9.  On Timeout
     A.10. Detecting Lost Packets
     A.11. Upon Dropping Initial or Handshake Keys
   Appendix B.  Congestion Control Pseudocode
     B.1.  Constants of interest
     B.2.  Variables of interest
     B.3.  Initialization
     B.4.  On Packet Sent
     B.5.  On Packet Acknowledgment
     B.6.  On New Congestion Event
     B.7.  Process ECN Information
     B.8.  On Packets Lost
     B.9.  Removing Discarded Packets From Bytes In Flight
   Appendix C.  Change Log
     C.1.  Since draft-ietf-quic-recovery-32
     C.2.  Since draft-ietf-quic-recovery-31
     C.3.  Since draft-ietf-quic-recovery-30
     C.4.  Since draft-ietf-quic-recovery-29
     C.5.  Since draft-ietf-quic-recovery-28
     C.6.  Since draft-ietf-quic-recovery-27
     C.7.  Since draft-ietf-quic-recovery-26
     C.8.  Since draft-ietf-quic-recovery-25
     C.9.  Since draft-ietf-quic-recovery-24
     C.10. Since draft-ietf-quic-recovery-23
     C.11. Since draft-ietf-quic-recovery-22
     C.12. Since draft-ietf-quic-recovery-21
     C.13. Since draft-ietf-quic-recovery-20
     C.14. Since draft-ietf-quic-recovery-19
     C.15. Since draft-ietf-quic-recovery-18
     C.16. Since draft-ietf-quic-recovery-17
     C.17. Since draft-ietf-quic-recovery-16
     C.18. Since draft-ietf-quic-recovery-14
     C.19. Since draft-ietf-quic-recovery-13
     C.20. Since draft-ietf-quic-recovery-12
     C.21. Since draft-ietf-quic-recovery-11
     C.22. Since draft-ietf-quic-recovery-10
     C.23. Since draft-ietf-quic-recovery-09
     C.24. Since draft-ietf-quic-recovery-08
     C.25. Since draft-ietf-quic-recovery-07
     C.26. Since draft-ietf-quic-recovery-06
     C.27. Since draft-ietf-quic-recovery-05
     C.28. Since draft-ietf-quic-recovery-04
     C.29. Since draft-ietf-quic-recovery-03
     C.30. Since draft-ietf-quic-recovery-02
     C.31. Since draft-ietf-quic-recovery-01
     C.32. Since draft-ietf-quic-recovery-00
     C.33. Since draft-iyengar-quic-loss-recovery-01
   Appendix D.  Contributors
   Acknowledgments
   Authors' Addresses

1.  Introduction

   QUIC is a secure general-purpose transport protocol, described in
   [QUIC-TRANSPORT].  This document describes loss detection and
   congestion control mechanisms for QUIC.

2.  Conventions and Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   Definitions of terms that are used in this document:

   Ack-eliciting frames:  All frames other than ACK, PADDING, and
      CONNECTION_CLOSE are considered ack-eliciting.

   Ack-eliciting packets:  Packets that contain ack-eliciting frames
      elicit an ACK from the receiver within the maximum acknowledgment
      delay and are called ack-eliciting packets.

   In-flight packets:  Packets are considered in-flight when they are
      ack-eliciting or contain a PADDING frame, and they have been sent
      but are not acknowledged, declared lost, or discarded along with
      old keys.

3.  Design of the QUIC Transmission Machinery

   All transmissions in QUIC are sent with a packet-level header, which
   indicates the encryption level and includes a packet sequence number
   (referred to below as a packet number).  The encryption level
   indicates the packet number space, as described in Section 12.3 of
   [QUIC-TRANSPORT].  Packet numbers never repeat within a packet number
   space for the lifetime of a connection.  Packet numbers are sent in
   monotonically increasing order within a space, preventing ambiguity.
   It is permitted for some packet numbers to never be used, leaving
   intentional gaps.

   This design obviates the need for disambiguating between
   transmissions and retransmissions; this eliminates significant
   complexity from QUIC's interpretation of TCP loss detection
   mechanisms.

   QUIC packets can contain multiple frames of different types.  The
   recovery mechanisms ensure that data and frames that need reliable
   delivery are acknowledged or declared lost and sent in new packets as
   necessary.  The types of frames contained in a packet affect recovery
   and congestion control logic:

   *  All packets are acknowledged, though packets that contain no
      ack-eliciting frames are only acknowledged along with
      ack-eliciting packets.

   *  Long header packets that contain CRYPTO frames are critical to the
      performance of the QUIC handshake and use shorter timers for
      acknowledgment.

   *  Packets containing frames besides ACK or CONNECTION_CLOSE frames
      count toward congestion control limits and are considered
      in-flight.

   *  PADDING frames cause packets to contribute toward bytes in flight
      without directly causing an acknowledgment to be sent.

4.  Relevant Differences Between QUIC and TCP

   Readers familiar with TCP's loss detection and congestion control
   will find algorithms here that parallel well-known TCP ones.
   However, protocol differences between QUIC and TCP contribute to
   algorithmic differences.  These protocol differences are briefly
   described below.

4.1.  Separate Packet Number Spaces

   QUIC uses separate packet number spaces for each encryption level,
   except that 0-RTT and all generations of 1-RTT keys use the same
   packet number space.  Separate packet number spaces ensure that the
   acknowledgment of packets sent with one level of encryption will not
   cause spurious retransmission of packets sent with a different
   encryption level.  Congestion control and round-trip time (RTT)
   measurement are unified across packet number spaces.

4.2.  Monotonically Increasing Packet Numbers

   TCP conflates transmission order at the sender with delivery order at
   the receiver, resulting in the retransmission ambiguity problem
   ([RETRANSMISSION]).
   QUIC separates transmission order from delivery order: packet
   numbers indicate transmission order, and delivery order is
   determined by the stream offsets in STREAM frames.

   QUIC's packet number is strictly increasing within a packet number
   space, and directly encodes transmission order.  A higher packet
   number signifies that the packet was sent later, and a lower packet
   number signifies that the packet was sent earlier.  When a packet
   containing ack-eliciting frames is detected lost, QUIC includes
   necessary frames in a new packet with a new packet number, removing
   ambiguity about which packet is acknowledged when an ACK is received.
   Consequently, more accurate RTT measurements can be made, spurious
   retransmissions are trivially detected, and mechanisms such as Fast
   Retransmit can be applied universally, based only on packet number.

   This design point significantly simplifies loss detection mechanisms
   for QUIC.  Most TCP mechanisms implicitly attempt to infer
   transmission ordering based on TCP sequence numbers, a non-trivial
   task, especially when TCP timestamps are not available.

4.3.  Clearer Loss Epoch

   QUIC starts a loss epoch when a packet is lost.  The loss epoch ends
   when any packet sent after the start of the epoch is acknowledged.
   TCP waits for the gap in the sequence number space to be filled, and
   so if a segment is lost multiple times in a row, the loss epoch may
   not end for several round trips.  Because both should reduce their
   congestion windows only once per epoch, QUIC will do it once for
   every round trip that experiences loss, while TCP may only do it once
   across multiple round trips.

4.4.  No Reneging

   QUIC ACK frames contain information similar to that in TCP Selective
   Acknowledgments (SACKs, [RFC2018]).
   However, QUIC does not allow a packet acknowledgment to be reneged,
   greatly simplifying implementations on both sides and reducing memory
   pressure on the sender.

4.5.  More ACK Ranges

   QUIC supports many ACK ranges, as opposed to TCP's three SACK ranges.
   In high loss environments, this speeds recovery, reduces spurious
   retransmits, and ensures forward progress without relying on
   timeouts.

4.6.  Explicit Correction For Delayed Acknowledgments

   QUIC endpoints measure the delay incurred between when a packet is
   received and when the corresponding acknowledgment is sent, allowing
   a peer to maintain a more accurate round-trip time estimate; see
   Section 13.2 of [QUIC-TRANSPORT].

4.7.  Probe Timeout Replaces RTO and TLP

   QUIC uses a probe timeout (PTO; see Section 6.2), with a timer based
   on TCP's RTO computation; see [RFC6298].  QUIC's PTO includes the
   peer's maximum expected acknowledgment delay instead of using a fixed
   minimum timeout.

   Similar to the RACK-TLP loss detection algorithm for TCP ([RACK]),
   QUIC does not collapse the congestion window when the PTO expires,
   since a single packet loss at the tail does not indicate persistent
   congestion.  Instead, QUIC collapses the congestion window when
   persistent congestion is declared; see Section 7.6.  In doing this,
   QUIC avoids unnecessary congestion window reductions, obviating the
   need for correcting mechanisms such as F-RTO ([RFC5682]).  Since QUIC
   does not collapse the congestion window on a PTO expiration, a QUIC
   sender is not limited from sending more in-flight packets after a PTO
   expiration if it still has available congestion window.  This occurs
   when a sender is application-limited and the PTO timer expires.  This
   is more aggressive than TCP's RTO mechanism when application-limited,
   but identical when not application-limited.

   QUIC allows probe packets to temporarily exceed the congestion window
   whenever the timer expires.

4.8.  The Minimum Congestion Window is Two Packets

   TCP uses a minimum congestion window of one packet.  However, loss of
   that single packet means that the sender needs to wait for a PTO
   (Section 6.2) to recover, which can be much longer than a round-trip
   time.  Sending a single ack-eliciting packet also increases the
   chances of incurring additional latency when a receiver delays its
   acknowledgment.

   QUIC therefore recommends that the minimum congestion window be two
   packets.  While this increases network load, it is considered safe,
   since the sender will still reduce its sending rate exponentially
   under persistent congestion (Section 6.2).

5.  Estimating the Round-Trip Time

   At a high level, an endpoint measures the time from when a packet was
   sent to when it is acknowledged as a round-trip time (RTT) sample.
   The endpoint uses RTT samples and peer-reported host delays (see
   Section 13.2 of [QUIC-TRANSPORT]) to generate a statistical
   description of the network path's RTT.  An endpoint computes the
   following three values for each path: the minimum value over a period
   of time (min_rtt), an exponentially-weighted moving average
   (smoothed_rtt), and the mean deviation (referred to as "variation" in
   the rest of this document) in the observed RTT samples (rttvar).

5.1.  Generating RTT samples

   An endpoint generates an RTT sample on receiving an ACK frame that
   meets the following two conditions:

   *  the largest acknowledged packet number is newly acknowledged, and

   *  at least one of the newly acknowledged packets was ack-eliciting.
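
   The two conditions above can be sketched in Python as follows.  This
   is a minimal illustration, not part of this document's pseudocode:
   the SentPacket record, the rtt_sample function, and the use of
   seconds as the time unit are assumptions made for the example.  The
   sample itself is the time elapsed since the largest acknowledged
   packet was sent.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class SentPacket:
    """Hypothetical record kept by the sender for each sent packet."""
    packet_number: int
    time_sent: float       # seconds, read from a monotonic clock
    ack_eliciting: bool


def rtt_sample(newly_acked: Iterable[SentPacket],
               largest_acked: int,
               ack_time: float) -> Optional[float]:
    """Return an RTT sample in seconds, or None if the ACK frame does
    not meet both conditions for generating a sample."""
    newly_acked = list(newly_acked)

    # Condition 1: the largest acknowledged packet number must be among
    # the packets this ACK frame newly acknowledges.
    largest = next(
        (p for p in newly_acked if p.packet_number == largest_acked), None)
    if largest is None:
        return None

    # Condition 2: at least one newly acknowledged packet must have been
    # ack-eliciting.
    if not any(p.ack_eliciting for p in newly_acked):
        return None

    # The sample is the time elapsed since the largest acknowledged
    # packet was sent.
    return ack_time - largest.time_sent
```

   For instance, an ACK frame received at t = 50 ms that newly
   acknowledges an ack-eliciting packet sent at t = 10 ms as its largest
   acknowledged packet yields a 40 ms sample, while an ACK frame that
   newly acknowledges only non-ack-eliciting packets yields none.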

   The RTT sample, latest_rtt, is generated as the time elapsed since
   the largest acknowledged packet was sent:

   latest_rtt = ack_time - send_time_of_largest_acked

   An RTT sample is generated using only the largest acknowledged packet
   in the received ACK frame.  This is because a peer reports
   acknowledgment delays for only the largest acknowledged packet in an
   ACK frame.  While the reported acknowledgment delay is not used by
   the RTT sample measurement, it is used to adjust the RTT sample in
   subsequent computations of smoothed_rtt and rttvar (Section 5.3).

   To avoid generating multiple RTT samples for a single packet, an ACK
   frame SHOULD NOT be used to update RTT estimates if it does not newly
   acknowledge the largest acknowledged packet.

   An RTT sample MUST NOT be generated on receiving an ACK frame that
   does not newly acknowledge at least one ack-eliciting packet.  A peer
   usually does not send an ACK frame when only non-ack-eliciting
   packets are received.  Therefore, an ACK frame that contains
   acknowledgments for only non-ack-eliciting packets could include an
   arbitrarily large ACK Delay value.  Ignoring such ACK frames avoids
   complications in subsequent smoothed_rtt and rttvar computations.

   A sender might generate multiple RTT samples per RTT when multiple
   ACK frames are received within an RTT.  As suggested in [RFC6298],
   doing so might result in inadequate history in smoothed_rtt and
   rttvar.  Ensuring that RTT estimates retain sufficient history is an
   open research question.

5.2.  Estimating min_rtt

   min_rtt is the sender's estimate of the minimum RTT observed for a
   given network path over a period of time.  In this document, min_rtt
   is used by loss detection to reject implausibly small RTT samples.

   min_rtt MUST be set to the latest_rtt on the first RTT sample.
   min_rtt MUST be set to the lesser of min_rtt and latest_rtt
   (Section 5.1) on all other samples.
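
   As a minimal sketch of the update rule above, min_rtt tracking might
   look like this in Python.  The function names and the use of None to
   mean "no sample yet" are illustrative assumptions, not part of this
   document.

```python
from typing import Iterable, Optional


def update_min_rtt(min_rtt: Optional[float], latest_rtt: float) -> float:
    """Return the new min_rtt after observing latest_rtt (in seconds).

    min_rtt is None until the first RTT sample arrives; that convention
    is an assumption of this sketch.
    """
    if min_rtt is None:
        # First RTT sample: min_rtt is set to latest_rtt.
        return latest_rtt
    # All other samples: keep the lesser of min_rtt and latest_rtt.
    return min(min_rtt, latest_rtt)


def min_rtt_over(samples: Iterable[float]) -> Optional[float]:
    """Apply update_min_rtt to a sequence of RTT samples in order."""
    min_rtt = None
    for latest_rtt in samples:
        min_rtt = update_min_rtt(min_rtt, latest_rtt)
    return min_rtt
```

   With samples of 80 ms, 120 ms, 60 ms, and 90 ms, min_rtt is 80 ms
   after the first two samples and 60 ms from the third sample onward.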

   An endpoint uses only locally observed times in computing the min_rtt
   and does not adjust for acknowledgment delays reported by the peer.
   Doing so allows the endpoint to set a lower bound for the
   smoothed_rtt based entirely on what it observes (see Section 5.3),
   and limits potential underestimation due to erroneously-reported
   delays by the peer.

   The RTT for a network path may change over time.  If a path's actual
   RTT decreases, the min_rtt will adapt immediately on the first low
   sample.  If the path's actual RTT increases, however, the min_rtt
   will not adapt to it, allowing future RTT samples that are smaller
   than the new RTT to be included in smoothed_rtt.

   Endpoints SHOULD set the min_rtt to the newest RTT sample after
   persistent congestion is established.  This is to allow a connection
   to reset its estimate of min_rtt and smoothed_rtt (Section 5.3) after
   a disruptive network event, and because it is possible that an
   increase in path delay resulted in persistent congestion being
   incorrectly declared.

   Endpoints MAY re-establish the min_rtt at other times in the
   connection, such as when traffic volume is low and an acknowledgment
   is received with a low acknowledgment delay.  Implementations SHOULD
   NOT refresh the min_rtt value too often, since the actual minimum RTT
   of the path is not frequently observable.

5.3.  Estimating smoothed_rtt and rttvar

   smoothed_rtt is an exponentially-weighted moving average of an
   endpoint's RTT samples, and rttvar estimates the variation in the RTT
   samples using a mean variation.

   The calculation of smoothed_rtt uses RTT samples after adjusting them
   for acknowledgment delays.  These delays are decoded from the ACK
   Delay field of ACK frames as described in Section 19.3 of
   [QUIC-TRANSPORT].

   The peer might report acknowledgment delays that are larger than the
   peer's max_ack_delay during the handshake (Section 13.2.1 of
   [QUIC-TRANSPORT]).  To account for this, the endpoint SHOULD ignore
   max_ack_delay until the handshake is confirmed, as defined in
   Section 4.1.2 of [QUIC-TLS].  When they occur, these large
   acknowledgment delays are likely to be non-repeating and limited to
   the handshake.  The endpoint can therefore use them without limiting
   them to the max_ack_delay, avoiding unnecessary inflation of the RTT
   estimate.

   Note that a large acknowledgment delay can result in a substantially
   inflated smoothed_rtt if there is an error either in the peer's
   reporting of the acknowledgment delay or in the endpoint's min_rtt
   estimate.  Therefore, prior to handshake confirmation, an endpoint
   MAY ignore RTT samples if adjusting the RTT sample for acknowledgment
   delay causes the sample to be less than the min_rtt.

   After the handshake is confirmed, any acknowledgment delays reported
   by the peer that are greater than the peer's max_ack_delay are
   attributed to unintentional but potentially repeating delays, such as
   scheduler latency at the peer or loss of previous acknowledgments.
   Excess delays could also be due to a non-compliant receiver.
   Therefore, these extra delays are considered effectively part of path
   delay and incorporated into the RTT estimate.

   Therefore, when adjusting an RTT sample using peer-reported
   acknowledgment delays, an endpoint:

   *  MAY ignore the acknowledgment delay for Initial packets, since
      these acknowledgments are not delayed by the peer (Section 13.2.1
      of [QUIC-TRANSPORT]);

   *  SHOULD ignore the peer's max_ack_delay until the handshake is
      confirmed;

   *  MUST use the lesser of the acknowledgment delay and the peer's
      max_ack_delay after the handshake is confirmed; and

   *  MUST NOT subtract the acknowledgment delay from the RTT sample if
      the resulting value is smaller than the min_rtt.  This limits the
      underestimation of the smoothed_rtt due to a misreporting peer.

   Additionally, an endpoint might postpone the processing of
   acknowledgments when the corresponding decryption keys are not
   immediately available.  For example, a client might receive an
   acknowledgment for a 0-RTT packet that it cannot decrypt because
   1-RTT packet protection keys are not yet available to it.  In such
   cases, an endpoint SHOULD subtract such local delays from its RTT
   sample until the handshake is confirmed.

   Similar to [RFC6298], smoothed_rtt and rttvar are computed as
   follows.

   An endpoint initializes the RTT estimator during connection
   establishment and when the estimator is reset during connection
   migration; see Section 9.4 of [QUIC-TRANSPORT].  Before any RTT
   samples are available for a new path or when the estimator is reset,
   the estimator is initialized using the initial RTT; see
   Section 6.2.2.

   smoothed_rtt and rttvar are initialized as follows, where kInitialRtt
   contains the initial RTT value:

   smoothed_rtt = kInitialRtt
   rttvar = kInitialRtt / 2

   RTT samples for the network path are recorded in latest_rtt; see
   Section 5.1.  On the first RTT sample after initialization, the
   estimator is reset using that sample.  This ensures that the
   estimator retains no history of past samples.
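
   The initialization, first-sample reset, and acknowledgment delay
   adjustment described in this section can be pulled together in a
   small Python sketch.  It is an illustration under stated assumptions
   rather than a reference implementation: the RttEstimator class and
   its method names are hypothetical, times are in seconds, the default
   initial RTT of 333 ms and max_ack_delay of 25 ms are assumed values
   (see Section 6.2.2 and [QUIC-TRANSPORT]), and the order of the
   smoothed_rtt and rttvar updates follows the pseudocode in this
   section.

```python
class RttEstimator:
    """Hypothetical per-path RTT estimator (times in seconds)."""

    def __init__(self, initial_rtt=0.333, max_ack_delay=0.025):
        # Before any sample is available, the estimator is initialized
        # from the initial RTT (assumed default: 333 ms).
        self.smoothed_rtt = initial_rtt
        self.rttvar = initial_rtt / 2
        self.min_rtt = None
        self.max_ack_delay = max_ack_delay  # assumed peer value: 25 ms
        self.has_sample = False

    def on_rtt_sample(self, latest_rtt, ack_delay, handshake_confirmed):
        if not self.has_sample:
            # First sample: reset the estimator so that no history from
            # the initial RTT is retained.
            self.min_rtt = latest_rtt
            self.smoothed_rtt = latest_rtt
            self.rttvar = latest_rtt / 2
            self.has_sample = True
            return

        # min_rtt uses only locally observed times.
        self.min_rtt = min(self.min_rtt, latest_rtt)

        # The peer's max_ack_delay is ignored until the handshake is
        # confirmed; afterwards the lesser of the two values is used.
        if handshake_confirmed:
            ack_delay = min(ack_delay, self.max_ack_delay)

        # Do not subtract the delay if the result would fall below
        # min_rtt.
        adjusted_rtt = latest_rtt
        if self.min_rtt + ack_delay < latest_rtt:
            adjusted_rtt = latest_rtt - ack_delay

        # Update order follows the pseudocode in this section.
        self.smoothed_rtt = 7 / 8 * self.smoothed_rtt + 1 / 8 * adjusted_rtt
        rttvar_sample = abs(self.smoothed_rtt - adjusted_rtt)
        self.rttvar = 3 / 4 * self.rttvar + 1 / 4 * rttvar_sample
```

   For example, after a first sample of 100 ms, a second sample of
   140 ms with a reported acknowledgment delay of 20 ms is adjusted to
   120 ms, giving a smoothed_rtt of 102.5 ms.  A later sample of 105 ms
   with the same reported delay is used unadjusted, because subtracting
   20 ms would take it below the min_rtt of 100 ms.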

   On the first RTT sample after initialization, smoothed_rtt and rttvar
   are set as follows:

   smoothed_rtt = latest_rtt
   rttvar = latest_rtt / 2

   On subsequent RTT samples, smoothed_rtt and rttvar evolve as follows:

   ack_delay = decoded acknowledgment delay from ACK frame
   if (handshake confirmed):
     ack_delay = min(ack_delay, max_ack_delay)
   adjusted_rtt = latest_rtt
   if (min_rtt + ack_delay < latest_rtt):
     adjusted_rtt = latest_rtt - ack_delay
   smoothed_rtt = 7/8 * smoothed_rtt + 1/8 * adjusted_rtt
   rttvar_sample = abs(smoothed_rtt - adjusted_rtt)
   rttvar = 3/4 * rttvar + 1/4 * rttvar_sample

6.  Loss Detection

   QUIC senders use acknowledgments to detect lost packets, and a probe
   timeout (see Section 6.2) to ensure acknowledgments are received.
   This section provides a description of these algorithms.

   If a packet is lost, the QUIC transport needs to recover from that
   loss, such as by retransmitting the data, sending an updated frame,
   or discarding the frame.  For more information, see Section 13.3 of
   [QUIC-TRANSPORT].

   Loss detection is separate per packet number space, unlike RTT
   measurement and congestion control, because RTT and congestion
   control are properties of the path, whereas loss detection also
   relies upon key availability.

6.1.  Acknowledgment-Based Detection

   Acknowledgment-based loss detection implements the spirit of TCP's
   Fast Retransmit ([RFC5681]), Early Retransmit ([RFC5827]), FACK
   ([FACK]), SACK loss recovery ([RFC6675]), and RACK-TLP ([RACK]).
   This section provides an overview of how these algorithms are
   implemented in QUIC.

   A packet is declared lost if it meets all of the following
   conditions:

   *  The packet is unacknowledged, in-flight, and was sent prior to an
      acknowledged packet.

   *  The packet was sent kPacketThreshold packets before an
      acknowledged packet (Section 6.1.1), or it was sent long enough in
      the past (Section 6.1.2).

   The acknowledgment indicates that a packet sent later was delivered,
   and the packet and time thresholds provide some tolerance for packet
   reordering.

   Spuriously declaring packets as lost leads to unnecessary
   retransmissions and may result in degraded performance due to the
   actions of the congestion controller upon detecting loss.
   Implementations can detect spurious retransmissions and increase the
   reordering threshold in packets or time to reduce future spurious
   retransmissions and loss events.  Implementations with adaptive time
   thresholds MAY choose to start with smaller initial reordering
   thresholds to minimize recovery latency.

6.1.1.  Packet Threshold

   The RECOMMENDED initial value for the packet reordering threshold
   (kPacketThreshold) is 3, based on best practices for TCP loss
   detection ([RFC5681], [RFC6675]).  In order to remain similar to TCP,
   implementations SHOULD NOT use a packet threshold less than 3; see
   [RFC5681].

   Some networks may exhibit higher degrees of packet reordering,
   causing a sender to detect spurious losses.  Additionally, packet
   reordering could be more common with QUIC than TCP, because network
   elements that could observe and reorder TCP packets cannot do that
   for QUIC, because QUIC packet numbers are encrypted.  Algorithms that
   increase the reordering threshold after spuriously detecting losses,
   such as RACK [RACK], have proven to be useful in TCP and are expected
   to be at least as useful in QUIC.

6.1.2.  Time Threshold

   Once a later packet within the same packet number space has been
   acknowledged, an endpoint SHOULD declare an earlier packet lost if it
   was sent a threshold amount of time in the past.
   To avoid declaring packets as lost too early, this time threshold
   MUST be set to at least the local timer granularity, as indicated by
   the kGranularity constant. The time threshold is:

   max(kTimeThreshold * max(smoothed_rtt, latest_rtt), kGranularity)

   If packets sent prior to the largest acknowledged packet cannot yet
   be declared lost, then a timer SHOULD be set for the remaining time.

   Using max(smoothed_rtt, latest_rtt) protects from the two following
   cases:

   *  the latest RTT sample is lower than the smoothed RTT, perhaps due
      to reordering where the acknowledgment encountered a shorter path;

   *  the latest RTT sample is higher than the smoothed RTT, perhaps due
      to a sustained increase in the actual RTT, but the smoothed RTT
      has not yet caught up.

   The RECOMMENDED time threshold (kTimeThreshold), expressed as a
   round-trip time multiplier, is 9/8. The RECOMMENDED value of the
   timer granularity (kGranularity) is 1ms.

   Note: TCP's RACK ([RACK]) specifies a slightly larger threshold,
      equivalent to 5/4, for a similar purpose. Experience with QUIC
      shows that 9/8 works well.

   Implementations MAY experiment with absolute thresholds, thresholds
   from previous connections, adaptive thresholds, or including RTT
   variation. Smaller thresholds reduce reordering resilience and
   increase spurious retransmissions, and larger thresholds increase
   loss detection delay.

6.2.  Probe Timeout

   A Probe Timeout (PTO) triggers sending one or two probe datagrams
   when ack-eliciting packets are not acknowledged within the expected
   period of time or the server may not have validated the client's
   address. A PTO enables a connection to recover from loss of tail
   packets or acknowledgments.

   As with loss detection, the probe timeout is per packet number space.
   That is, a PTO value is computed per packet number space.
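
   Before moving on to the probe timeout computation, the packet and
   time thresholds of Section 6.1 can be illustrated with a minimal
   Python sketch. It is not a definitive implementation: the SentPacket
   record, the function names, and the use of milliseconds are
   assumptions made for illustration, and "sent prior to an acknowledged
   packet" is approximated by comparing packet numbers against the
   largest acknowledged packet number.

```python
from dataclasses import dataclass

K_PACKET_THRESHOLD = 3       # RECOMMENDED initial value (Section 6.1.1)
K_TIME_THRESHOLD = 9 / 8     # RECOMMENDED RTT multiplier (Section 6.1.2)
K_GRANULARITY_MS = 1.0       # RECOMMENDED timer granularity


@dataclass
class SentPacket:            # hypothetical record of one in-flight packet
    packet_number: int
    time_sent_ms: float


def loss_delay_ms(smoothed_rtt_ms, latest_rtt_ms):
    """max(kTimeThreshold * max(smoothed_rtt, latest_rtt), kGranularity)."""
    return max(K_TIME_THRESHOLD * max(smoothed_rtt_ms, latest_rtt_ms),
               K_GRANULARITY_MS)


def detect_lost(unacked, largest_acked, now_ms,
                smoothed_rtt_ms, latest_rtt_ms):
    """Split unacknowledged in-flight packets into (lost, pending)."""
    delay = loss_delay_ms(smoothed_rtt_ms, latest_rtt_ms)
    lost, pending = [], []
    for pkt in unacked:
        if pkt.packet_number > largest_acked:
            # Not sent prior to an acknowledged packet: cannot be lost yet.
            pending.append(pkt)
        elif (largest_acked - pkt.packet_number >= K_PACKET_THRESHOLD
              or now_ms - pkt.time_sent_ms >= delay):
            lost.append(pkt)     # packet threshold or time threshold met
        else:
            # A timer would be set for the remaining time.
            pending.append(pkt)
    return lost, pending
```

   With a largest acknowledged packet number of 5, packet 1 meets the
   packet threshold, while a packet 4 sent only a few milliseconds ago
   meets neither threshold and stays pending until the timer fires.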

   A PTO timer expiration event does not indicate packet loss and MUST
   NOT cause prior unacknowledged packets to be marked as lost. When an
   acknowledgment is received that newly acknowledges packets, loss
   detection proceeds as dictated by packet and time threshold
   mechanisms; see Section 6.1.

   The PTO algorithm used in QUIC implements the reliability functions
   of Tail Loss Probe [RACK], RTO [RFC5681], and F-RTO algorithms for
   TCP [RFC5682]. The timeout computation is based on TCP's
   retransmission timeout period [RFC6298].

6.2.1.  Computing PTO

   When an ack-eliciting packet is transmitted, the sender schedules a
   timer for the PTO period as follows:

   PTO = smoothed_rtt + max(4*rttvar, kGranularity) + max_ack_delay

   The PTO period is the amount of time that a sender ought to wait for
   an acknowledgment of a sent packet. This time period includes the
   estimated network round-trip time (smoothed_rtt), the variation in
   the estimate (4*rttvar), and max_ack_delay, to account for the
   maximum time by which a receiver might delay sending an
   acknowledgment.

   When the PTO is armed for Initial or Handshake packet number spaces,
   the max_ack_delay in the PTO period computation is set to 0, since
   the peer is expected to not delay these packets intentionally; see
   Section 13.2.1 of [QUIC-TRANSPORT].

   The PTO period MUST be at least kGranularity, to avoid the timer
   expiring immediately.

   When ack-eliciting packets in multiple packet number spaces are in
   flight, the timer MUST be set to the earlier value of the Initial and
   Handshake packet number spaces.

   An endpoint MUST NOT set its PTO timer for the application data
   packet number space until the handshake is confirmed. Doing so
   prevents the endpoint from retransmitting information in packets when
   either the peer does not yet have the keys to process them or the
   endpoint does not yet have the keys to process their acknowledgments.
   For example, this can happen when a client sends 0-RTT packets to the
   server; it does so without knowing whether the server will be able to
   decrypt them. Similarly, this can happen when a server sends 1-RTT
   packets before confirming that the client has verified the server's
   certificate and can therefore read these 1-RTT packets.

   A sender SHOULD restart its PTO timer every time an ack-eliciting
   packet is sent or acknowledged, or when Initial or Handshake keys are
   discarded (Section 4.9 of [QUIC-TLS]). This ensures the PTO is
   always set based on the latest estimate of the round-trip time and
   for the correct packet across packet number spaces.

   When a PTO timer expires, the PTO backoff MUST be increased,
   resulting in the PTO period being set to twice its current value.
   The PTO backoff factor is reset when an acknowledgment is received,
   except in the following case. A server might take longer to respond
   to packets during the handshake than otherwise. To protect such a
   server from repeated client probes, the PTO backoff is not reset at a
   client that is not yet certain that the server has finished
   validating the client's address. That is, a client does not reset
   the PTO backoff factor on receiving acknowledgments in Initial
   packets.

   This exponential reduction in the sender's rate is important because
   consecutive PTOs might be caused by loss of packets or
   acknowledgments due to severe congestion. Even when there are
   ack-eliciting packets in-flight in multiple packet number spaces, the
   exponential increase in probe timeout occurs across all spaces to
   prevent excess load on the network. For example, a timeout in the
   Initial packet number space doubles the length of the timeout in the
   Handshake packet number space.

   The total length of time over which consecutive PTOs expire is
   limited by the idle timeout.
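
   The PTO period and its exponential backoff can be sketched in Python
   as follows. This is only an illustration under stated assumptions:
   the function name, the pto_count parameter (the number of
   consecutive PTO expirations since the backoff was last reset), and
   the use of milliseconds are not taken from the text above.

```python
K_GRANULARITY_MS = 1.0   # RECOMMENDED timer granularity


def pto_period_ms(smoothed_rtt_ms, rttvar_ms, max_ack_delay_ms,
                  handshake_space=False, pto_count=0):
    """PTO = smoothed_rtt + max(4*rttvar, kGranularity) + max_ack_delay,
    doubled once for each consecutive PTO expiration."""
    if handshake_space:
        # Initial and Handshake spaces: the peer is expected not to
        # delay acknowledgments, so max_ack_delay is taken as 0.
        max_ack_delay_ms = 0.0
    base = (smoothed_rtt_ms
            + max(4 * rttvar_ms, K_GRANULARITY_MS)
            + max_ack_delay_ms)
    return base * (2 ** pto_count)


# Application data space: 100 + 4*10 + 25 = 165 ms.
print(pto_period_ms(100.0, 10.0, 25.0))
# After two consecutive expirations the period is four times as long.
print(pto_period_ms(100.0, 10.0, 25.0, pto_count=2))
```

   Keeping the backoff as a separate exponent, rather than overwriting
   the base period, makes it straightforward to reset the backoff when
   an acknowledgment arrives while still recomputing the base from the
   latest RTT estimate.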

   The PTO timer MUST NOT be set if a timer is set for time threshold
   loss detection; see Section 6.1.2. A timer that is set for time
   threshold loss detection will expire earlier than the PTO timer in
   most cases and is less likely to spuriously retransmit data.

6.2.2.  Handshakes and New Paths

   Resumed connections over the same network MAY use the previous
   connection's final smoothed RTT value as the resumed connection's
   initial RTT. When no previous RTT is available, the initial RTT
   SHOULD be set to 333ms. This results in handshakes starting with a
   PTO of 1 second, as recommended for TCP's initial retransmission
   timeout; see Section 2 of [RFC6298].

   A connection MAY use the delay between sending a PATH_CHALLENGE and
   receiving a PATH_RESPONSE to set the initial RTT (see kInitialRtt in
   Appendix A.2) for a new path, but the delay SHOULD NOT be considered
   an RTT sample.

   Initial packets and Handshake packets might never be acknowledged,
   but they are removed from bytes in flight when the Initial and
   Handshake keys are discarded, as described below in Section 6.4.
   When Initial or Handshake keys are discarded, the PTO and loss
   detection timers MUST be reset, because discarding keys indicates
   forward progress and the loss detection timer might have been set for
   a now discarded packet number space.

6.2.2.1.  Before Address Validation

   Until the server has validated the client's address on the path, the
   amount of data it can send is limited to three times the amount of
   data received, as specified in Section 8.1 of [QUIC-TRANSPORT]. If
   no additional data can be sent, the server's PTO timer MUST NOT be
   armed until datagrams have been received from the client, because
   packets sent on PTO count against the anti-amplification limit. Note
   that the server could fail to validate the client's address even if
   0-RTT is accepted.
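
   Two points from Sections 6.2.2 and 6.2.2.1 can be checked with a
   short Python sketch: the arithmetic behind the 1 second handshake
   PTO, and the server-side rule for arming the PTO timer before
   address validation. The function names and byte counters are
   hypothetical, and the sketch assumes that, before any RTT sample,
   smoothed_rtt equals the initial RTT and rttvar is half of it.

```python
K_INITIAL_RTT_MS = 333.0      # initial RTT when no previous RTT is known
K_GRANULARITY_MS = 1.0        # RECOMMENDED timer granularity
AMPLIFICATION_FACTOR = 3      # three times the amount of data received


def initial_handshake_pto_ms():
    """PTO at the start of a handshake with no prior RTT information."""
    smoothed_rtt = K_INITIAL_RTT_MS
    rttvar = K_INITIAL_RTT_MS / 2      # assumption, see lead-in
    # max_ack_delay is 0 for the Initial and Handshake spaces.
    return smoothed_rtt + max(4 * rttvar, K_GRANULARITY_MS)


def server_may_arm_pto(bytes_received, bytes_sent, address_validated):
    """A server at the anti-amplification limit must not arm its PTO."""
    if address_validated:
        return True
    return bytes_sent < AMPLIFICATION_FACTOR * bytes_received


print(initial_handshake_pto_ms())   # 333 + 4 * 166.5, roughly one second
```

   Once a further datagram arrives from the client, bytes_received grows
   and the same check permits the server to arm the timer again.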

   Since the server could be blocked until more datagrams are received
   from the client, it is the client's responsibility to send packets to
   unblock the server until it is certain that the server has finished
   its address validation (see Section 8 of [QUIC-TRANSPORT]). That is,
   the client MUST set the probe timer if the client has not received an
   acknowledgment for any of its Handshake packets and the handshake is
   not confirmed (see Section 4.1.2 of [QUIC-TLS]), even if there are no
   packets in flight. When the PTO fires, the client MUST send a
   Handshake packet if it has Handshake keys, otherwise it MUST send an
   Initial packet in a UDP datagram with a payload of at least 1200
   bytes.

6.2.3.  Speeding Up Handshake Completion

   When a server receives an Initial packet containing duplicate CRYPTO
   data, it can assume the client did not receive all of the server's
   CRYPTO data sent in Initial packets, or the client's estimated RTT is
   too small. When a client receives Handshake or 1-RTT packets prior
   to obtaining Handshake keys, it may assume some or all of the
   server's Initial packets were lost.

   To speed up handshake completion under these conditions, an endpoint
   MAY, for a limited number of times per connection, send a packet
   containing unacknowledged CRYPTO data earlier than the PTO expiry,
   subject to the address validation limits in Section 8.1 of
   [QUIC-TRANSPORT]. Doing so at most once for each connection is
   adequate to quickly recover from a single packet loss. An endpoint
   that always retransmits packets in response to receiving packets that
   it cannot process risks creating an infinite exchange of packets.

   Endpoints can also use coalesced packets (see Section 12.2 of
   [QUIC-TRANSPORT]) to ensure that each datagram elicits at least one
   acknowledgment.
   For example, a client can coalesce an Initial packet containing PING
   and PADDING frames with a 0-RTT data packet and a server can coalesce
   an Initial packet containing a PING frame with one or more packets in
   its first flight.

6.2.4.  Sending Probe Packets

   When a PTO timer expires, a sender MUST send at least one
   ack-eliciting packet in the packet number space as a probe. An
   endpoint MAY send up to two full-sized datagrams containing
   ack-eliciting packets, to avoid an expensive consecutive PTO
   expiration due to a single lost datagram, or transmit data from
   multiple packet number spaces. All probe packets sent on a PTO MUST
   be ack-eliciting.

   In addition to sending data in the packet number space for which the
   timer expired, the sender SHOULD send ack-eliciting packets from
   other packet number spaces with in-flight data, coalescing packets if
   possible. This is particularly valuable when the server has both
   Initial and Handshake data in-flight or the client has both Handshake
   and Application Data in-flight, because the peer might only have
   receive keys for one of the two packet number spaces.

   If the sender wants to elicit a faster acknowledgment on PTO, it can
   skip a packet number to eliminate the acknowledgment delay.

   An endpoint SHOULD include new data in packets that are sent on PTO
   expiration. Previously sent data MAY be sent if no new data can be
   sent. Implementations MAY use alternative strategies for determining
   the content of probe packets, including sending new or retransmitted
   data based on the application's priorities.

   It is possible the sender has no new or previously-sent data to send.
   As an example, consider the following sequence of events: new
   application data is sent in a STREAM frame, deemed lost, then
   retransmitted in a new packet, and then the original transmission is
   acknowledged.
   When there is no data to send, the sender SHOULD send a PING or other
   ack-eliciting frame in a single packet, re-arming the PTO timer.

   Alternatively, instead of sending an ack-eliciting packet, the sender
   MAY mark any packets still in flight as lost. Doing so avoids
   sending an additional packet, but increases the risk that loss is
   declared too aggressively, resulting in an unnecessary rate reduction
   by the congestion controller.

   Consecutive PTO periods increase exponentially, and as a result,
   connection recovery latency increases exponentially as packets
   continue to be dropped in the network. Sending two packets on PTO
   expiration increases resilience to packet drops, thus reducing the
   probability of consecutive PTO events.

   When the PTO timer expires multiple times and new data cannot be
   sent, implementations must choose between sending the same payload
   every time or sending different payloads. Sending the same payload
   may be simpler and ensures the highest priority frames arrive first.
   Sending different payloads each time reduces the chances of spurious
   retransmission.

6.3.  Handling Retry Packets

   A Retry packet causes a client to send another Initial packet,
   effectively restarting the connection process. A Retry packet
   indicates that the Initial was received, but not processed. A Retry
   packet cannot be treated as an acknowledgment, because it does not
   indicate that a packet was processed or specify the packet number.

   Clients that receive a Retry packet reset congestion control and loss
   recovery state, including resetting any pending timers. Other
   connection state, in particular cryptographic handshake messages, is
   retained; see Section 17.2.5 of [QUIC-TRANSPORT].

   The client MAY compute an RTT estimate to the server as the time
   period from when the first Initial was sent to when a Retry or a
   Version Negotiation packet is received.
   The client MAY use this value in place of its default for the initial
   RTT estimate.

6.4.  Discarding Keys and Packet State

   When Initial and Handshake packet protection keys are discarded (see
   Section 4.9 of [QUIC-TLS]), all packets that were sent with those
   keys can no longer be acknowledged because their acknowledgments
   cannot be processed. The sender MUST discard all recovery state
   associated with those packets and MUST remove them from the count of
   bytes in flight.

   Endpoints stop sending and receiving Initial packets once they start
   exchanging Handshake packets; see Section 17.2.2.1 of
   [QUIC-TRANSPORT]. At this point, recovery state for all in-flight
   Initial packets is discarded.

   When 0-RTT is rejected, recovery state for all in-flight 0-RTT
   packets is discarded.

   If a server accepts 0-RTT, but does not buffer 0-RTT packets that
   arrive before Initial packets, early 0-RTT packets will be declared
   lost, but that is expected to be infrequent.

   It is expected that keys are discarded after packets encrypted with
   them would be acknowledged or declared lost. However, Initial and
   Handshake secrets are discarded as soon as handshake and 1-RTT keys
   are proven to be available to both client and server; see
   Section 4.9.1 of [QUIC-TLS].

7.  Congestion Control

   This document specifies a sender-side congestion controller for QUIC
   similar to TCP NewReno ([RFC6582]).

   The signals QUIC provides for congestion control are generic and are
   designed to support different sender-side algorithms. A sender can
   unilaterally choose a different algorithm to use, such as Cubic
   ([RFC8312]).

   If a sender uses a different controller than that specified in this
   document, the chosen controller MUST conform to the congestion
   control guidelines specified in Section 3.1 of [RFC8085].

   Similar to TCP, packets containing only ACK frames do not count
   towards bytes in flight and are not congestion controlled. Unlike
   TCP, QUIC can detect the loss of these packets and MAY use that
   information to adjust the congestion controller or the rate of
   ACK-only packets being sent, but this document does not describe a
   mechanism for doing so.

   The algorithm in this document specifies and uses the controller's
   congestion window in bytes.

   An endpoint MUST NOT send a packet if it would cause bytes_in_flight
   (see Appendix B.2) to be larger than the congestion window, unless
   the packet is sent on a PTO timer expiration (see Section 6.2) or
   when entering recovery (see Section 7.3.2).

7.1.  Explicit Congestion Notification

   If a path has been validated to support ECN ([RFC3168], [RFC8311]),
   QUIC treats a Congestion Experienced (CE) codepoint in the IP header
   as a signal of congestion. This document specifies an endpoint's
   response when the peer-reported ECN-CE count increases; see
   Section 13.4.2 of [QUIC-TRANSPORT].

7.2.  Initial and Minimum Congestion Window

   QUIC begins every connection in slow start with the congestion window
   set to an initial value. Endpoints SHOULD use an initial congestion
   window of 10 times the maximum datagram size (max_datagram_size),
   while limiting the window to the larger of 14720 bytes or twice the
   maximum datagram size. This follows the analysis and recommendations
   in [RFC6928], increasing the byte limit to account for the smaller
   8-byte overhead of UDP compared to the 20-byte overhead for TCP.

   If the maximum datagram size changes during the connection, the
   initial congestion window SHOULD be recalculated with the new size.
   If the maximum datagram size is decreased in order to complete the
   handshake, the congestion window SHOULD be set to the new initial
   congestion window.
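
   The initial congestion window rule above amounts to a min/max
   expression, shown here as a minimal Python sketch. The function and
   constant names are hypothetical; only the values 10, 14720, and 2
   come from the text.

```python
INITIAL_WINDOW_PACKETS = 10       # 10 times the maximum datagram size
INITIAL_WINDOW_BYTE_LIMIT = 14720
MINIMUM_LIMIT_PACKETS = 2         # "twice the maximum datagram size"


def initial_congestion_window(max_datagram_size):
    """Initial congestion window in bytes for a given datagram size."""
    limit = max(INITIAL_WINDOW_BYTE_LIMIT,
                MINIMUM_LIMIT_PACKETS * max_datagram_size)
    return min(INITIAL_WINDOW_PACKETS * max_datagram_size, limit)


for size in (1200, 1472, 9000):
    print(size, initial_congestion_window(size))
```

   For 1200-byte datagrams the window is 12000 bytes; the 14720-byte
   limit only takes effect once datagrams exceed 1472 bytes, and for
   very large datagrams the window becomes twice the datagram size.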

   Prior to validating the client's address, the server can be further
   limited by the anti-amplification limit as specified in Section 8.1
   of [QUIC-TRANSPORT]. Though the anti-amplification limit can prevent
   the congestion window from being fully utilized and therefore slow
   down the increase in congestion window, it does not directly affect
   the congestion window.

   The minimum congestion window is the smallest value the congestion
   window can decrease to as a response to loss, increase in the
   peer-reported ECN-CE count, or persistent congestion. The
   RECOMMENDED value is 2 * max_datagram_size.

7.3.  Congestion Control States

   The NewReno congestion controller described in this document has
   three distinct states, as shown in Figure 1.

                 New Path or      +------------+
            persistent congestion |   Slow     |
        (O)---------------------->|   Start    |
                                  +------------+
                                        |
                                Loss or |
                        ECN-CE increase |
                                        v
 +------------+     Loss or       +------------+
 | Congestion |  ECN-CE increase  |  Recovery  |
 | Avoidance  |------------------>|  Period    |
 +------------+                   +------------+
           ^                            |
           |                            |
           +----------------------------+
              Acknowledgment of packet
                sent during recovery

           Figure 1: Congestion Control States and Transitions

   These states and the transitions between them are described in
   subsequent sections.

7.3.1.  Slow Start

   A NewReno sender is in slow start any time the congestion window is
   below the slow start threshold. A sender begins in slow start
   because the slow start threshold is initialized to an infinite value.

   While a sender is in slow start, the congestion window increases by
   the number of bytes acknowledged when each acknowledgment is
   processed. This results in exponential growth of the congestion
   window.

   The sender MUST exit slow start and enter a recovery period when a
   packet is lost or when the ECN-CE count reported by its peer
   increases.
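
   The exponential growth in slow start can be made concrete with a
   small Python sketch. It is only an illustration: the function names
   are hypothetical, and the simulation assumes an idealized path on
   which the whole congestion window is sent and acknowledged once per
   round trip with no loss or ECN-CE marks.

```python
import math


def on_ack_slow_start(congestion_window, ssthresh, acked_bytes):
    """Grow the window by the acknowledged bytes while in slow start."""
    if congestion_window < ssthresh:
        congestion_window += acked_bytes
    return congestion_window


def simulate_round_trips(initial_window, round_trips):
    """Window at the start of each round trip (idealized, see lead-in)."""
    congestion_window = initial_window
    ssthresh = math.inf          # initialized to an infinite value
    history = [congestion_window]
    for _ in range(round_trips):
        # Assumption: every byte of the current window is acknowledged.
        congestion_window = on_ack_slow_start(
            congestion_window, ssthresh, acked_bytes=congestion_window)
        history.append(congestion_window)
    return history


print(simulate_round_trips(12000, 3))
```

   Under these assumptions the window doubles every round trip, which
   is the exponential growth described above.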

   A sender re-enters slow start any time the congestion window is less
   than the slow start threshold, which only occurs after persistent
   congestion is declared.

7.3.2.  Recovery

   A NewReno sender enters a recovery period when it detects the loss of
   a packet or the ECN-CE count reported by its peer increases. A
   sender that is already in a recovery period stays in it and does not
   re-enter it.

   On entering a recovery period, a sender MUST set the slow start
   threshold to half the value of the congestion window when loss is
   detected. The congestion window MUST be set to the reduced value of
   the slow start threshold before exiting the recovery period.

   Implementations MAY reduce the congestion window immediately upon
   entering a recovery period or use other mechanisms, such as
   Proportional Rate Reduction ([PRR]), to reduce the congestion window
   more gradually. If the congestion window is reduced immediately, a
   single packet can be sent prior to reduction. This speeds up loss
   recovery if the data in the lost packet is retransmitted and is
   similar to TCP as described in Section 5 of [RFC6675].

   The recovery period aims to limit congestion window reduction to once
   per round trip. Therefore during a recovery period, the congestion
   window does not change in response to new losses or increases in the
   ECN-CE count.

   A recovery period ends and the sender enters congestion avoidance
   when a packet sent during the recovery period is acknowledged. This
   is slightly different from TCP's definition of recovery, which ends
   when the lost segment that started recovery is acknowledged
   ([RFC5681]).

7.3.3.  Congestion Avoidance

   A NewReno sender is in congestion avoidance any time the congestion
   window is at or above the slow start threshold and not in a recovery
   period.
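
   The recovery period of Section 7.3.2, which separates slow start
   from congestion avoidance, can be sketched in Python as follows. The
   NewRenoState container and function names are hypothetical, and the
   sketch assumes the option of reducing the congestion window
   immediately on entering recovery (rather than gradually, as with
   PRR), bounded below by the minimum congestion window.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class NewRenoState:                       # hypothetical container
    congestion_window: int                # bytes
    ssthresh: float = math.inf            # slow start threshold, bytes
    recovery_start_time: Optional[float] = None


def in_recovery(state, sent_time):
    """True if the packet was sent before the current recovery began."""
    return (state.recovery_start_time is not None
            and sent_time <= state.recovery_start_time)


def on_congestion_event(state, sent_time, now, minimum_window):
    """React to loss of, or an ECN-CE mark on, a packet sent at sent_time."""
    if in_recovery(state, sent_time):
        return                            # one reduction per round trip
    state.recovery_start_time = now       # a new recovery period begins
    state.ssthresh = state.congestion_window // 2
    # Assumption: immediate reduction, never below the minimum window.
    state.congestion_window = max(state.ssthresh, minimum_window)


def recovery_ends_on_ack(state, acked_sent_time):
    """Recovery ends when a packet sent during it is acknowledged."""
    return (state.recovery_start_time is not None
            and acked_sent_time > state.recovery_start_time)
```

   Comparing send times against the start of the recovery period is one
   way to ensure that further losses from the same flight of packets do
   not reduce the window a second time.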

   A sender in congestion avoidance uses an Additive Increase
   Multiplicative Decrease (AIMD) approach that MUST limit the increase
   to the congestion window to at most one maximum datagram size for
   each congestion window that is acknowledged.

   The sender exits congestion avoidance and enters a recovery period
   when a packet is lost or when the ECN-CE count reported by its peer
   increases.

7.4.  Ignoring Loss of Undecryptable Packets

   During the handshake, some packet protection keys might not be
   available when a packet arrives and the receiver can choose to drop
   the packet. In particular, Handshake and 0-RTT packets cannot be
   processed until the Initial packets arrive and 1-RTT packets cannot
   be processed until the handshake completes. Endpoints MAY ignore the
   loss of Handshake, 0-RTT, and 1-RTT packets that might have arrived
   before the peer had packet protection keys to process those packets.

   Endpoints MUST NOT ignore the loss of packets that were sent after
   the earliest acknowledged packet in a given packet number space.

7.5.  Probe Timeout

   Probe packets MUST NOT be blocked by the congestion controller. A
   sender MUST however count these packets as being additionally in
   flight, since these packets add network load without establishing
   packet loss. Note that sending probe packets might cause the
   sender's bytes in flight to exceed the congestion window until an
   acknowledgment is received that establishes loss or delivery of
   packets.

7.6.  Persistent Congestion

   When a sender establishes loss of all packets sent over a long enough
   duration, the network is considered to be experiencing persistent
   congestion.

7.6.1.  Duration

   The persistent congestion duration is computed as follows:

   (smoothed_rtt + max(4*rttvar, kGranularity) + max_ack_delay) *
       kPersistentCongestionThreshold

   Unlike the PTO computation in Section 6.2, this duration includes the
   max_ack_delay irrespective of the packet number spaces in which
   losses are established.

   This duration allows a sender to send as many packets before
   establishing persistent congestion, including some in response to PTO
   expiration, as TCP does with Tail Loss Probes ([RACK]) and a
   Retransmission Timeout ([RFC5681]).

   Larger values of kPersistentCongestionThreshold cause the sender to
   become less responsive to persistent congestion in the network, which
   can result in aggressive sending into a congested network. Too small
   a value can result in a sender declaring persistent congestion
   unnecessarily, resulting in reduced throughput for the sender.

   The RECOMMENDED value for kPersistentCongestionThreshold is 3, which
   results in behavior that is approximately equivalent to a TCP sender
   declaring an RTO after two TLPs.

   This design does not use consecutive PTO events to establish
   persistent congestion, since application patterns impact PTO
   expirations. For example, a sender that sends small amounts of data
   with silence periods between them restarts the PTO timer every time
   it sends, potentially preventing the PTO timer from expiring for a
   long period of time, even when no acknowledgments are being received.
   The use of a duration enables a sender to establish persistent
   congestion without depending on PTO expiration.

7.6.2.  Establishing Persistent Congestion

   A sender establishes persistent congestion after the receipt of an
   acknowledgment if two packets that are ack-eliciting are declared
   lost, and:

   *  across all packet number spaces, none of the packets sent between
      the send times of these two packets are acknowledged;

   *  the duration between the send times of these two packets exceeds
      the persistent congestion duration (Section 7.6.1); and

   *  a prior RTT sample existed when these two packets were sent.

   These two packets MUST be ack-eliciting, since a receiver is required
   to acknowledge only ack-eliciting packets within its maximum ack
   delay; see Section 13.2 of [QUIC-TRANSPORT].

   The persistent congestion period SHOULD NOT start until there is at
   least one RTT sample. Before the first RTT sample, a sender arms its
   PTO timer based on the initial RTT (Section 6.2.2), which could be
   substantially larger than the actual RTT. Requiring a prior RTT
   sample prevents a sender from establishing persistent congestion with
   potentially too few probes.

   Since network congestion is not affected by packet number spaces,
   persistent congestion SHOULD consider packets sent across packet
   number spaces. A sender that does not have state for all packet
   number spaces or an implementation that cannot compare send times
   across packet number spaces MAY use state for just the packet number
   space that was acknowledged. This might result in erroneously
   declaring persistent congestion, but it will not lead to a failure to
   detect persistent congestion.

   When persistent congestion is declared, the sender's congestion
   window MUST be reduced to the minimum congestion window
   (kMinimumWindow), similar to a TCP sender's response on an RTO
   ([RFC5681]).

7.6.3.  Example

   The following example illustrates how a sender might establish
   persistent congestion.
   Assume:

   smoothed_rtt + max(4*rttvar, kGranularity) + max_ack_delay = 2
   kPersistentCongestionThreshold = 3

   Consider the following sequence of events:

          +========+===========================+
          | Time   | Action                    |
          +========+===========================+
          | t=0    | Send packet #1 (app data) |
          +--------+---------------------------+
          | t=1    | Send packet #2 (app data) |
          +--------+---------------------------+
          | t=1.2  | Recv acknowledgment of #1 |
          +--------+---------------------------+
          | t=2    | Send packet #3 (app data) |
          +--------+---------------------------+
          | t=3    | Send packet #4 (app data) |
          +--------+---------------------------+
          | t=4    | Send packet #5 (app data) |
          +--------+---------------------------+
          | t=5    | Send packet #6 (app data) |
          +--------+---------------------------+
          | t=6    | Send packet #7 (app data) |
          +--------+---------------------------+
          | t=8    | Send packet #8 (PTO 1)    |
          +--------+---------------------------+
          | t=12   | Send packet #9 (PTO 2)    |
          +--------+---------------------------+
          | t=12.2 | Recv acknowledgment of #9 |
          +--------+---------------------------+

                              Table 1

   Packets 2 through 8 are declared lost when the acknowledgment for
   packet 9 is received at t = 12.2.

   The congestion period is calculated as the time between the oldest
   and newest lost packets: 8 - 1 = 7. The persistent congestion
   duration is: 2 * 3 = 6. Because the threshold was reached and
   because none of the packets between the oldest and the newest lost
   packets were acknowledged, the network is considered to have
   experienced persistent congestion.

   While this example shows PTO expirations, they are not required for
   persistent congestion to be established.

7.7.  Pacing

   A sender SHOULD pace sending of all in-flight packets based on input
   from the congestion controller.
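
   Before the details of pacing, the persistent congestion check of
   Section 7.6 can be replayed against the values of Table 1 with a
   minimal Python sketch. The function name and arguments are
   hypothetical, all lost packets are assumed to be ack-eliciting, and
   the sketch assumes an RTT sample already existed before t=0 (as it
   would after a completed handshake).

```python
K_PERSISTENT_CONGESTION_THRESHOLD = 3   # RECOMMENDED value


def is_persistent_congestion(lost_send_times, acked_send_times,
                             pto_base, first_rtt_sample_time):
    """pto_base is smoothed_rtt + max(4*rttvar, kGranularity)
    + max_ack_delay; all times share one unit."""
    duration = pto_base * K_PERSISTENT_CONGESTION_THRESHOLD
    # Only consider lost packets sent after an RTT sample existed.
    events = [(t, "lost") for t in lost_send_times
              if t > first_rtt_sample_time]
    events += [(t, "acked") for t in acked_send_times]
    run_start = None
    for send_time, kind in sorted(events):
        if kind == "acked":
            run_start = None           # an acknowledgment breaks the run
        elif run_start is None:
            run_start = send_time      # oldest lost packet of a new run
        elif send_time - run_start > duration:
            return True                # run of losses exceeds duration
    return False


# Table 1: packets #2..#8 lost, packets #1 (t=0) and #9 (t=12) acked.
lost = [1, 2, 3, 4, 5, 6, 8]
acked = [0, 12]
print(is_persistent_congestion(lost, acked, pto_base=2,
                               first_rtt_sample_time=-1))
```

   The run of losses spans 8 - 1 = 7 time units, which exceeds the
   duration of 2 * 3 = 6, so the check reports persistent congestion.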

   Sending multiple packets into the network without any delay between
   them creates a packet burst that might cause short-term congestion
   and losses. Senders MUST either use pacing or limit such bursts.
   Senders SHOULD limit bursts to the initial congestion window; see
   Section 7.2. A sender with knowledge that the network path to the
   receiver can absorb larger bursts MAY use a higher limit.

   An implementation should take care to architect its congestion
   controller to work well with a pacer. For instance, a pacer might
   wrap the congestion controller and control the availability of the
   congestion window, or a pacer might pace out packets handed to it by
   the congestion controller.

   Timely delivery of ACK frames is important for efficient loss
   recovery. Packets containing only ACK frames SHOULD therefore not be
   paced, to avoid delaying their delivery to the peer.

   Endpoints can implement pacing as they choose. A perfectly paced
   sender spreads packets exactly evenly over time. For a window-based
   congestion controller, such as the one in this document, that rate
   can be computed by averaging the congestion window over the
   round-trip time. Expressed as a rate in units of bytes per time,
   where congestion_window is in bytes:

   rate = N * congestion_window / smoothed_rtt

   Or, expressed as an inter-packet interval in units of time:

   interval = ( smoothed_rtt * packet_size / congestion_window ) / N

   Using a value for "N" that is small, but at least 1 (for example,
   1.25) ensures that variations in round-trip time do not result in
   under-utilization of the congestion window.

   Practical considerations, such as packetization, scheduling delays,
   and computational efficiency, can cause a sender to deviate from this
   rate over time periods that are much shorter than a round-trip time.
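
   The rate and interval formulas above can be evaluated with a short
   Python sketch. This is only an illustration: the function names are
   hypothetical, times are expressed in milliseconds, and N defaults to
   the example value of 1.25.

```python
PACING_N = 1.25   # "N" in the formulas; 1.25 is the example value


def pacing_rate(congestion_window, smoothed_rtt_ms, n=PACING_N):
    """rate = N * congestion_window / smoothed_rtt (bytes per ms)."""
    return n * congestion_window / smoothed_rtt_ms


def pacing_interval(congestion_window, smoothed_rtt_ms, packet_size,
                    n=PACING_N):
    """interval = (smoothed_rtt * packet_size / congestion_window) / N."""
    return (smoothed_rtt_ms * packet_size / congestion_window) / n


# A 12000-byte window over a 100 ms round trip, 1200-byte packets.
print(pacing_rate(12000, 100))             # bytes per millisecond
print(pacing_interval(12000, 100, 1200))   # milliseconds between packets
```

   The two forms are consistent with each other: dividing the packet
   size by the rate gives the same inter-packet interval.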

   One possible implementation strategy for pacing uses a leaky bucket
   algorithm, where the capacity of the "bucket" is limited to the
   maximum burst size and the rate the "bucket" fills is determined by
   the above function.

7.8.  Under-utilizing the Congestion Window

   When bytes in flight is smaller than the congestion window and
   sending is not pacing limited, the congestion window is
   under-utilized. When this occurs, the congestion window SHOULD NOT
   be increased in either slow start or congestion avoidance. This can
   happen due to insufficient application data or flow control limits.

   A sender that paces packets (see Section 7.7) might delay sending
   packets and not fully utilize the congestion window due to this
   delay. A sender SHOULD NOT consider itself application limited if it
   would have fully utilized the congestion window without pacing delay.

   A sender MAY implement alternative mechanisms to update its
   congestion window after periods of under-utilization, such as those
   proposed for TCP in [RFC7661].

8.  Security Considerations

8.1.  Loss and Congestion Signals

   Loss detection and congestion control fundamentally involve
   consumption of signals, such as delay, loss, and ECN markings, from
   unauthenticated entities. An attacker can cause endpoints to reduce
   their sending rate by manipulating these signals; by dropping
   packets, by altering path delay strategically, or by changing ECN
   codepoints.

8.2.  Traffic Analysis

   Packets that carry only ACK frames can be heuristically identified by
   observing packet size. Acknowledgment patterns may expose
   information about link characteristics or application behavior. To
   reduce leaked information, endpoints can bundle acknowledgments with
   other frames, or they can use PADDING frames at a potential cost to
   performance.

8.3.  Misreporting ECN Markings

   A receiver can misreport ECN markings to alter the congestion
   response of a sender. Suppressing reports of ECN-CE markings could
   cause a sender to increase their send rate. This increase could
   result in congestion and loss.

   A sender can detect suppression of reports by marking occasional
   packets that it sends with an ECN-CE marking. If a packet sent with
   an ECN-CE marking is not reported as having been CE marked when the
   packet is acknowledged, then the sender can disable ECN for that path
   by not setting ECT codepoints in subsequent packets sent on that path
   [RFC3168].

   Reporting additional ECN-CE markings will cause a sender to reduce
   their sending rate, which is similar in effect to advertising reduced
   connection flow control limits and so no advantage is gained by doing
   so.

   Endpoints choose the congestion controller that they use. Congestion
   controllers respond to reports of ECN-CE by reducing their rate, but
   the response may vary. Markings can be treated as equivalent to loss
   ([RFC3168]), but other responses can be specified, such as
   ([RFC8511]) or ([RFC8311]).

9.  IANA Considerations

   This document has no IANA actions.

10.  References

10.1.  Normative References

   [QUIC-TLS] Thomson, M., Ed. and S. Turner, Ed., "Using TLS to Secure
              QUIC", Work in Progress, Internet-Draft,
              draft-ietf-quic-tls-34, 15 January 2021.

   [QUIC-TRANSPORT]
              Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based
              Multiplexed and Secure Transport", Work in Progress,
              Internet-Draft, draft-ietf-quic-transport-34, 15 January
              2021.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, DOI 10.17487/RFC3168, September 2001.

   [RFC8085]  Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
              Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
              March 2017.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

10.2.  Informative References

   [FACK]     Mathis, M. and J. Mahdavi, "Forward Acknowledgement:
              Refining TCP Congestion Control", ACM SIGCOMM, August
              1996.

   [PRR]      Mathis, M., Dukkipati, N., and Y. Cheng, "Proportional
              Rate Reduction for TCP", RFC 6937, DOI 10.17487/RFC6937,
              May 2013.

   [RACK]     Cheng, Y., Cardwell, N., Dukkipati, N., and P. Jha, "The
              RACK-TLP loss detection algorithm for TCP", Work in
              Progress, Internet-Draft, draft-ietf-tcpm-rack-15, 22
              December 2020.

   [RETRANSMISSION]
              Karn, P. and C. Partridge, "Improving Round-Trip Time
              Estimates in Reliable Transport Protocols", ACM SIGCOMM
              CCR, January 1995.

   [RFC2018]  Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
              Selective Acknowledgment Options", RFC 2018,
              DOI 10.17487/RFC2018, October 1996.

   [RFC3465]  Allman, M., "TCP Congestion Control with Appropriate Byte
              Counting (ABC)", RFC 3465, DOI 10.17487/RFC3465, February
              2003.

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009.

   [RFC5682]  Sarolahti, P., Kojo, M., Yamamoto, K., and M. Hata,
              "Forward RTO-Recovery (F-RTO): An Algorithm for Detecting
              Spurious Retransmission Timeouts with TCP", RFC 5682,
              DOI 10.17487/RFC5682, September 2009.

   [RFC5827]  Allman, M., Avrachenkov, K., Ayesta, U., Blanton, J., and
              P.
Hurtig, "Early Retransmit for TCP and Stream Control 1399 Transmission Protocol (SCTP)", RFC 5827, 1400 DOI 10.17487/RFC5827, May 2010, 1401 . 1403 [RFC6297] Welzl, M. and D. Ros, "A Survey of Lower-than-Best-Effort 1404 Transport Protocols", RFC 6297, DOI 10.17487/RFC6297, June 1405 2011, . 1407 [RFC6298] Paxson, V., Allman, M., Chu, J., and M. Sargent, 1408 "Computing TCP's Retransmission Timer", RFC 6298, 1409 DOI 10.17487/RFC6298, June 2011, 1410 . 1412 [RFC6582] Henderson, T., Floyd, S., Gurtov, A., and Y. Nishida, "The 1413 NewReno Modification to TCP's Fast Recovery Algorithm", 1414 RFC 6582, DOI 10.17487/RFC6582, April 2012, 1415 . 1417 [RFC6675] Blanton, E., Allman, M., Wang, L., Jarvinen, I., Kojo, M., 1418 and Y. Nishida, "A Conservative Loss Recovery Algorithm 1419 Based on Selective Acknowledgment (SACK) for TCP", 1420 RFC 6675, DOI 10.17487/RFC6675, August 2012, 1421 . 1423 [RFC6928] Chu, J., Dukkipati, N., Cheng, Y., and M. Mathis, 1424 "Increasing TCP's Initial Window", RFC 6928, 1425 DOI 10.17487/RFC6928, April 2013, 1426 . 1428 [RFC7661] Fairhurst, G., Sathiaseelan, A., and R. Secchi, "Updating 1429 TCP to Support Rate-Limited Traffic", RFC 7661, 1430 DOI 10.17487/RFC7661, October 2015, 1431 . 1433 [RFC8311] Black, D., "Relaxing Restrictions on Explicit Congestion 1434 Notification (ECN) Experimentation", RFC 8311, 1435 DOI 10.17487/RFC8311, January 2018, 1436 . 1438 [RFC8312] Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and 1439 R. Scheffenegger, "CUBIC for Fast Long-Distance Networks", 1440 RFC 8312, DOI 10.17487/RFC8312, February 2018, 1441 . 1443 [RFC8511] Khademi, N., Welzl, M., Armitage, G., and G. Fairhurst, 1444 "TCP Alternative Backoff with ECN (ABE)", RFC 8511, 1445 DOI 10.17487/RFC8511, December 2018, 1446 . 1448 Appendix A. Loss Recovery Pseudocode 1450 We now describe an example implementation of the loss detection 1451 mechanisms described in Section 6. 
1453 The pseudocode segments in this section are licensed as Code 1454 Components; see the copyright notice. 1456 A.1. Tracking Sent Packets 1458 To correctly implement congestion control, a QUIC sender tracks every 1459 ack-eliciting packet until the packet is acknowledged or lost. It is 1460 expected that implementations will be able to access this information 1461 by packet number and crypto context and store the per-packet fields 1462 (Appendix A.1.1) for loss recovery and congestion control. 1464 After a packet is declared lost, the endpoint can still maintain 1465 state for it for an amount of time to allow for packet reordering; 1466 see Section 13.3 of [QUIC-TRANSPORT]. This enables a sender to 1467 detect spurious retransmissions. 1469 Sent packets are tracked for each packet number space, and ACK 1470 processing only applies to a single space. 1472 A.1.1. Sent Packet Fields 1474 packet_number: The packet number of the sent packet. 1476 ack_eliciting: A boolean that indicates whether a packet is ack- 1477 eliciting. If true, it is expected that an acknowledgment will be 1478 received, though the peer could delay sending the ACK frame 1479 containing it by up to the max_ack_delay. 1481 in_flight: A boolean that indicates whether the packet counts 1482 towards bytes in flight. 1484 sent_bytes: The number of bytes sent in the packet, not including 1485 UDP or IP overhead, but including QUIC framing overhead. 1487 time_sent: The time the packet was sent. 1489 A.2. Constants of Interest 1491 Constants used in loss recovery are based on a combination of RFCs, 1492 papers, and common practice. 1494 kPacketThreshold: Maximum reordering in packets before packet 1495 threshold loss detection considers a packet lost. The value 1496 recommended in Section 6.1.1 is 3. 1498 kTimeThreshold: Maximum reordering in time before time threshold 1499 loss detection considers a packet lost. Specified as an RTT 1500 multiplier. The value recommended in Section 6.1.2 is 9/8. 
1502  kGranularity: Timer granularity. This is a system-dependent value,
1503    and Section 6.1.2 recommends a value of 1ms.

1505  kInitialRtt: The RTT used before an RTT sample is taken. The value
1506    recommended in Section 6.2.2 is 333ms.

1508  kPacketNumberSpace: An enum to enumerate the three packet number
1509    spaces.

1511    enum kPacketNumberSpace {
1512      Initial,
1513      Handshake,
1514      ApplicationData,
1515    }

1517  A.3. Variables of interest

1519  Variables required to implement the loss detection mechanisms are
1520  described in this section.

1522  latest_rtt: The most recent RTT measurement made when receiving an
1523    ack for a previously unacked packet.

1525  smoothed_rtt: The smoothed RTT of the connection, computed as
1526    described in Section 5.3.

1528  rttvar: The RTT variation, computed as described in Section 5.3.

1530  min_rtt: The minimum RTT seen over a period of time, ignoring
1531    acknowledgment delay, as described in Section 5.2.

1533  first_rtt_sample: The time that the first RTT sample was obtained.

1535  max_ack_delay: The maximum amount of time by which the receiver
1536    intends to delay acknowledgments for packets in the Application
1537    Data packet number space, as defined by the eponymous transport
1538    parameter (Section 18.2 of [QUIC-TRANSPORT]). Note that the
1539    actual ack_delay in a received ACK frame may be larger due to late
1540    timers, reordering, or loss.

1542  loss_detection_timer: Multi-modal timer used for loss detection.

1544  pto_count: The number of times a PTO has been sent without receiving
1545    an ack.

1547  time_of_last_ack_eliciting_packet[kPacketNumberSpace]: The time the
1548    most recent ack-eliciting packet was sent.

1550  largest_acked_packet[kPacketNumberSpace]: The largest packet number
1551    acknowledged in the packet number space so far.

1553  loss_time[kPacketNumberSpace]: The time at which the next packet in
1554    that packet number space can be considered lost based on exceeding
1555    the reordering window in time.
1557  sent_packets[kPacketNumberSpace]: An association of packet numbers
1558    in a packet number space to information about them. Described in
1559    detail above in Appendix A.1.

1561  A.4. Initialization

1563  At the beginning of the connection, initialize the loss detection
1564  variables as follows:

1566    loss_detection_timer.reset()
1567    pto_count = 0
1568    latest_rtt = 0
1569    smoothed_rtt = kInitialRtt
1570    rttvar = kInitialRtt / 2
1571    min_rtt = 0
1572    first_rtt_sample = 0
1573    for pn_space in [ Initial, Handshake, ApplicationData ]:
1574      largest_acked_packet[pn_space] = infinite
1575      time_of_last_ack_eliciting_packet[pn_space] = 0
1576      loss_time[pn_space] = 0

1578  A.5. On Sending a Packet

1580  After a packet is sent, information about the packet is stored. The
1581  parameters to OnPacketSent are described in detail above in
1582  Appendix A.1.1.

1584  Pseudocode for OnPacketSent follows:

1586    OnPacketSent(packet_number, pn_space, ack_eliciting,
1587                 in_flight, sent_bytes):
1588      sent_packets[pn_space][packet_number].packet_number =
1589                                               packet_number
1590      sent_packets[pn_space][packet_number].time_sent = now()
1591      sent_packets[pn_space][packet_number].ack_eliciting =
1592                                               ack_eliciting
1593      sent_packets[pn_space][packet_number].in_flight = in_flight
1594      sent_packets[pn_space][packet_number].sent_bytes = sent_bytes
1595      if (in_flight):
1596        if (ack_eliciting):
1597          time_of_last_ack_eliciting_packet[pn_space] = now()
1598        OnPacketSentCC(sent_bytes)
1599        SetLossDetectionTimer()

1601  A.6. On Receiving a Datagram

1603  When a server is blocked by anti-amplification limits, receiving a
1604  datagram unblocks it, even if none of the packets in the datagram are
1605  successfully processed. In such a case, the PTO timer will need to
1606  be re-armed.

1608  Pseudocode for OnDatagramReceived follows:

1610    OnDatagramReceived(datagram):
1611      // If this datagram unblocks the server, arm the
1612      // PTO timer to avoid deadlock.
1613      if (server was at anti-amplification limit):
1614        SetLossDetectionTimer()

1616  A.7. On Receiving an Acknowledgment

1618  When an ACK frame is received, it may newly acknowledge any number of
1619  packets.

1621  Pseudocode for OnAckReceived and UpdateRtt follows:

1623    IncludesAckEliciting(packets):
1624      for packet in packets:
1625        if (packet.ack_eliciting):
1626          return true
1627      return false

1629    OnAckReceived(ack, pn_space):
1630      if (largest_acked_packet[pn_space] == infinite):
1631        largest_acked_packet[pn_space] = ack.largest_acked
1633      else:
1634        largest_acked_packet[pn_space] =
1635            max(largest_acked_packet[pn_space], ack.largest_acked)

1637      // DetectAndRemoveAckedPackets finds packets that are newly
1638      // acknowledged and removes them from sent_packets.
1639      newly_acked_packets =
1640          DetectAndRemoveAckedPackets(ack, pn_space)
1641      // Nothing to do if there are no newly acked packets.
1642      if (newly_acked_packets.empty()):
1643        return

1645      // Update the RTT if the largest acknowledged is newly acked
1646      // and at least one ack-eliciting was newly acked.
1647      if (newly_acked_packets.largest().packet_number ==
1648              ack.largest_acked &&
1649          IncludesAckEliciting(newly_acked_packets)):
1650        latest_rtt =
1651          now() - newly_acked_packets.largest().time_sent
1652        UpdateRtt(ack.ack_delay)

1654      // Process ECN information if present.
1655      if (ACK frame contains ECN information):
1656        ProcessECN(ack, pn_space)

1658      lost_packets = DetectAndRemoveLostPackets(pn_space)
1659      if (!lost_packets.empty()):
1660        OnPacketsLost(lost_packets)
1661      OnPacketsAcked(newly_acked_packets)

1663      // Reset pto_count unless the client is unsure if
1664      // the server has validated the client's address.
1665      if (PeerCompletedAddressValidation()):
1666        pto_count = 0
1667      SetLossDetectionTimer()

1669    UpdateRtt(ack_delay):
1670      if (first_rtt_sample == 0):
1671        min_rtt = latest_rtt
1672        smoothed_rtt = latest_rtt
1673        rttvar = latest_rtt / 2
1674        first_rtt_sample = now()
1675        return

1677      // min_rtt ignores acknowledgment delay.
1678      min_rtt = min(min_rtt, latest_rtt)
1679      // Limit ack_delay by max_ack_delay after handshake
1680      // confirmation.
1681      if (handshake confirmed):
1682        ack_delay = min(ack_delay, max_ack_delay)

1684      // Adjust for acknowledgment delay if plausible.
1685      adjusted_rtt = latest_rtt
1686      if (latest_rtt > min_rtt + ack_delay):
1687        adjusted_rtt = latest_rtt - ack_delay

1689      rttvar = 3/4 * rttvar + 1/4 * abs(smoothed_rtt - adjusted_rtt)
1690      smoothed_rtt = 7/8 * smoothed_rtt + 1/8 * adjusted_rtt

1692  A.8. Setting the Loss Detection Timer

1694  QUIC loss detection uses a single timer for all timeout loss
1695  detection. The duration of the timer is based on the timer's mode,
1696  which is set in the packet and timer events further below. The
1697  function SetLossDetectionTimer defined below shows how the single
1698  timer is set.

1700  This algorithm may result in the timer being set in the past,
1701  particularly if timers wake up late. Timers set in the past fire
1702  immediately.

1704  Pseudocode for SetLossDetectionTimer follows (where the "^" operator
1705  represents exponentiation):

1707    GetLossTimeAndSpace():
1708      time = loss_time[Initial]
1709      space = Initial
1710      for pn_space in [ Handshake, ApplicationData ]:
1711        if (loss_time[pn_space] != 0 && (time == 0 || loss_time[pn_space] < time)):
1712          time = loss_time[pn_space]
1713          space = pn_space
1714      return time, space

1716    GetPtoTimeAndSpace():
1717      duration = (smoothed_rtt + max(4 * rttvar, kGranularity))
1718          * (2 ^ pto_count)
1719      // Arm PTO from now when there are no inflight packets.
1720      if (no in-flight packets):
1721        assert(!PeerCompletedAddressValidation())
1722        if (has handshake keys):
1723          return (now() + duration), Handshake
1724        else:
1725          return (now() + duration), Initial
1726      pto_timeout = infinite
1727      pto_space = Initial
1728      for space in [ Initial, Handshake, ApplicationData ]:
1729        if (no in-flight packets in space):
1730          continue
1731        if (space == ApplicationData):
1732          // Skip Application Data until handshake confirmed.
1733          if (handshake is not confirmed):
1734            return pto_timeout, pto_space
1735          // Include max_ack_delay and backoff for Application Data.
1736          duration += max_ack_delay * (2 ^ pto_count)

1738        t = time_of_last_ack_eliciting_packet[space] + duration
1739        if (t < pto_timeout):
1740          pto_timeout = t
1741          pto_space = space
1742      return pto_timeout, pto_space

1744    PeerCompletedAddressValidation():
1745      // Assume clients validate the server's address implicitly.
1746      if (endpoint is server):
1747        return true
1748      // Servers complete address validation when a
1749      // protected packet is received.
1750      return has received Handshake ACK ||
1751           handshake confirmed

1753    SetLossDetectionTimer():
1754      earliest_loss_time, _ = GetLossTimeAndSpace()
1755      if (earliest_loss_time != 0):
1756        // Time threshold loss detection.
1757        loss_detection_timer.update(earliest_loss_time)
1758        return

1760      if (server is at anti-amplification limit):
1761        // The server's timer is not set if nothing can be sent.
1762        loss_detection_timer.cancel()
1763        return

1765      if (no ack-eliciting packets in flight &&
1766          PeerCompletedAddressValidation()):
1767        // There are no packets to declare lost, so no timer is set.
1768        // However, the client needs to arm the timer if the
1769        // server might be blocked by the anti-amplification limit.
1770        loss_detection_timer.cancel()
1771        return

1773      timeout, _ = GetPtoTimeAndSpace()
1774      loss_detection_timer.update(timeout)

1776  A.9.
On Timeout

1778  When the loss detection timer expires, the timer's mode determines
1779  the action to be performed.

1781  Pseudocode for OnLossDetectionTimeout follows:

1783    OnLossDetectionTimeout():
1784      earliest_loss_time, pn_space = GetLossTimeAndSpace()
1785      if (earliest_loss_time != 0):
1786        // Time threshold loss detection.
1787        lost_packets = DetectAndRemoveLostPackets(pn_space)
1788        assert(!lost_packets.empty())
1789        OnPacketsLost(lost_packets)
1790        SetLossDetectionTimer()
1791        return

1793      if (bytes_in_flight > 0):
1794        // PTO. Send new data if available, else retransmit old data.
1795        // If neither is available, send a single PING frame.
1796        _, pn_space = GetPtoTimeAndSpace()
1797        SendOneOrTwoAckElicitingPackets(pn_space)
1798      else:
1799        assert(!PeerCompletedAddressValidation())
1800        // Client sends an anti-deadlock packet: Initial is padded
1801        // to earn more anti-amplification credit,
1802        // a Handshake packet proves address ownership.
1803        if (has Handshake keys):
1804          SendOneAckElicitingHandshakePacket()
1805        else:
1806          SendOneAckElicitingPaddedInitialPacket()

1808      pto_count++
1809      SetLossDetectionTimer()

1811  A.10. Detecting Lost Packets

1813  DetectAndRemoveLostPackets is called every time an ACK is received or
1814  the time threshold loss detection timer expires. This function
1815  operates on the sent_packets for that packet number space and returns
1816  a list of packets newly detected as lost.

1818  Pseudocode for DetectAndRemoveLostPackets follows:

1820    DetectAndRemoveLostPackets(pn_space):
1821      assert(largest_acked_packet[pn_space] != infinite)
1822      loss_time[pn_space] = 0
1823      lost_packets = []
1824      loss_delay = kTimeThreshold * max(latest_rtt, smoothed_rtt)

1826      // Minimum time of kGranularity before packets are deemed lost.
1827      loss_delay = max(loss_delay, kGranularity)

1829      // Packets sent before this time are deemed lost.
1830      lost_send_time = now() - loss_delay

1832      foreach unacked in sent_packets[pn_space]:
1833        if (unacked.packet_number > largest_acked_packet[pn_space]):
1834          continue

1836        // Mark packet as lost, or set time when it should be marked.
1837        // Note: The use of kPacketThreshold here assumes that there
1838        // were no sender-induced gaps in the packet number space.
1839        if (unacked.time_sent <= lost_send_time ||
1840            largest_acked_packet[pn_space] >=
1841              unacked.packet_number + kPacketThreshold):
1842          sent_packets[pn_space].remove(unacked.packet_number)
1843          lost_packets.insert(unacked)
1844        else:
1845          if (loss_time[pn_space] == 0):
1846            loss_time[pn_space] = unacked.time_sent + loss_delay
1847          else:
1848            loss_time[pn_space] = min(loss_time[pn_space],
1849                                      unacked.time_sent + loss_delay)
1850      return lost_packets

1852  A.11. Upon Dropping Initial or Handshake Keys

1854  When Initial or Handshake keys are discarded, packets from the space
1855  are discarded and loss detection state is updated.

1857  Pseudocode for OnPacketNumberSpaceDiscarded follows:

1859    OnPacketNumberSpaceDiscarded(pn_space):
1860      assert(pn_space != ApplicationData)
1861      RemoveFromBytesInFlight(sent_packets[pn_space])
1862      sent_packets[pn_space].clear()
1863      // Reset the loss detection and PTO timer
1864      time_of_last_ack_eliciting_packet[pn_space] = 0
1865      loss_time[pn_space] = 0
1866      pto_count = 0
1867      SetLossDetectionTimer()

1869  Appendix B. Congestion Control Pseudocode

1871  We now describe an example implementation of the congestion
1872  controller described in Section 7.

1874  The pseudocode segments in this section are licensed as Code
1875  Components; see the copyright notice.

1877  B.1. Constants of interest

1879  Constants used in congestion control are based on a combination of
1880  RFCs, papers, and common practice.

1882  kInitialWindow: Default limit on the initial bytes in flight as
1883    described in Section 7.2.
1885  kMinimumWindow: Minimum congestion window in bytes as described in
1886    Section 7.2.

1888  kLossReductionFactor: Scaling factor applied to reduce the
1889    congestion window when a new loss event is detected. Section 7
1890    recommends a value of 0.5.

1892  kPersistentCongestionThreshold: Period of time for persistent
1893    congestion to be established, specified as a PTO multiplier.
1894    Section 7.6 recommends a value of 3.

1896  B.2. Variables of interest

1898  Variables required to implement the congestion control mechanisms are
1899  described in this section.

1901  max_datagram_size: The sender's current maximum payload size. Does
1902    not include UDP or IP overhead. The max datagram size is used for
1903    congestion window computations. An endpoint sets the value of
1904    this variable based on its Path Maximum Transmission Unit (PMTU;
1905    see Section 14.2 of [QUIC-TRANSPORT]), with a minimum value of
1906    1200 bytes.

1908  ecn_ce_counters[kPacketNumberSpace]: The highest value reported for
1909    the ECN-CE counter in the packet number space by the peer in an
1910    ACK frame. This value is used to detect increases in the reported
1911    ECN-CE counter.

1913  bytes_in_flight: The sum of the size in bytes of all sent packets
1914    that contain at least one ack-eliciting or PADDING frame, and have
1915    not been acknowledged or declared lost. The size does not include
1916    IP or UDP overhead, but does include the QUIC header and AEAD
1917    overhead. Packets only containing ACK frames do not count towards
1918    bytes_in_flight to ensure congestion control does not impede
1919    congestion feedback.

1921  congestion_window: Maximum number of bytes allowed to be in flight.

1923  congestion_recovery_start_time: The time the current recovery period
1924    started due to the detection of loss or ECN. When a packet sent
1925    after this time is acknowledged, QUIC exits congestion recovery.

1927  ssthresh: Slow start threshold in bytes.
When the congestion window
1928    is below ssthresh, the mode is slow start and the window grows by
1929    the number of bytes acknowledged.

1931  The congestion control pseudocode also accesses some of the variables
1932  from the loss recovery pseudocode.

1934  B.3. Initialization

1936  At the beginning of the connection, initialize the congestion control
1937  variables as follows:

1939    congestion_window = kInitialWindow
1940    bytes_in_flight = 0
1941    congestion_recovery_start_time = 0
1942    ssthresh = infinite
1943    for pn_space in [ Initial, Handshake, ApplicationData ]:
1944      ecn_ce_counters[pn_space] = 0

1946  B.4. On Packet Sent

1948  Whenever a packet is sent, and it contains non-ACK frames, the packet
1949  increases bytes_in_flight.

1951    OnPacketSentCC(sent_bytes):
1952      bytes_in_flight += sent_bytes

1954  B.5. On Packet Acknowledgment

1956  This is invoked from loss detection's OnAckReceived and is supplied
1957  with the newly acknowledged packets (acked_packets) from sent_packets.

1959  In congestion avoidance, implementers that use an integer
1960  representation for congestion_window should be careful with division,
1961  and can use the alternative approach suggested in Section 2.1 of
1962  [RFC3465].

1964    InCongestionRecovery(sent_time):
1965      return sent_time <= congestion_recovery_start_time

1967    OnPacketsAcked(acked_packets):
1968      for acked_packet in acked_packets:
1969        OnPacketAcked(acked_packet)

1971    OnPacketAcked(acked_packet):
1972      if (!acked_packet.in_flight):
1973        return
1974      // Remove from bytes_in_flight.
1975      bytes_in_flight -= acked_packet.sent_bytes
1976      // Do not increase congestion_window if application
1977      // limited or flow control limited.
1978      if (IsAppOrFlowControlLimited()):
1979        return
1980      // Do not increase congestion window in recovery period.
1981      if (InCongestionRecovery(acked_packet.time_sent)):
1982        return
1983      if (congestion_window < ssthresh):
1984        // Slow start.
1985        congestion_window += acked_packet.sent_bytes
1986      else:
1987        // Congestion avoidance.
1988        congestion_window +=
1989          max_datagram_size * acked_packet.sent_bytes
1990          / congestion_window

1992  B.6. On New Congestion Event

1994  Invoked from ProcessECN and OnPacketsLost when a new congestion event
1995  is detected. If not already in recovery, this starts a recovery
1996  period and reduces the slow start threshold and congestion window
1997  immediately.

1999    OnCongestionEvent(sent_time):
2000      // No reaction if already in a recovery period.
2001      if (InCongestionRecovery(sent_time)):
2002        return

2004      // Enter recovery period.
2005      congestion_recovery_start_time = now()
2006      ssthresh = congestion_window * kLossReductionFactor
2007      congestion_window = max(ssthresh, kMinimumWindow)
2008      // A packet can be sent to speed up loss recovery.
2009      MaybeSendOnePacket()

2011  B.7. Process ECN Information

2013  Invoked when an ACK frame with an ECN section is received from the
2014  peer.

2016    ProcessECN(ack, pn_space):
2017      // If the ECN-CE counter reported by the peer has increased,
2018      // this could be a new congestion event.
2019      if (ack.ce_counter > ecn_ce_counters[pn_space]):
2020        ecn_ce_counters[pn_space] = ack.ce_counter
2021        sent_time = sent_packets[pn_space][ack.largest_acked].time_sent
2022        OnCongestionEvent(sent_time)

2024  B.8. On Packets Lost

2026  Invoked when DetectAndRemoveLostPackets deems packets lost.

2028    OnPacketsLost(lost_packets):
2029      sent_time_of_last_loss = 0
2030      // Remove lost packets from bytes_in_flight.
2031      for lost_packet in lost_packets:
2032        if (lost_packet.in_flight):
2033          bytes_in_flight -= lost_packet.sent_bytes
2034          sent_time_of_last_loss =
2035            max(sent_time_of_last_loss, lost_packet.time_sent)
2036      // Congestion event if in-flight packets were lost.
2037      if (sent_time_of_last_loss != 0):
2038        OnCongestionEvent(sent_time_of_last_loss)

2040      // Reset the congestion window if the loss of these
2041      // packets indicates persistent congestion.
2042      // Only consider packets sent after getting an RTT sample.
2043      if (first_rtt_sample == 0):
2044        return
2045      pc_lost = []
2046      for lost in lost_packets:
2047        if (lost.time_sent > first_rtt_sample):
2048          pc_lost.insert(lost)
2049      if (InPersistentCongestion(pc_lost)):
2050        congestion_window = kMinimumWindow
2051        congestion_recovery_start_time = 0

2053  B.9. Removing Discarded Packets From Bytes In Flight

2055  When Initial or Handshake keys are discarded, packets sent in that
2056  space no longer count toward bytes in flight.

2058  Pseudocode for RemoveFromBytesInFlight follows:

2060    RemoveFromBytesInFlight(discarded_packets):
2061      // Remove any unacknowledged packets from flight.
2062      foreach packet in discarded_packets:
2063        if (packet.in_flight):
2064          bytes_in_flight -= packet.sent_bytes

2066  Appendix C. Change Log

2068  *RFC Editor's Note:* Please remove this section prior to
2069  publication of a final version of this document.

2071  Issue and pull request numbers are listed with a leading octothorp.

2073  C.1. Since draft-ietf-quic-recovery-32

2075  * Clarifications to definition of persistent congestion (#4413,
2076    #4414, #4421, #4429, #4437)

2078  C.2. Since draft-ietf-quic-recovery-31

2080  * Limit the number of Initial packets sent in response to
2081    unauthenticated packets (#4183, #4188)

2083  C.3. Since draft-ietf-quic-recovery-30

2085  Editorial changes only.

2087  C.4.
Since draft-ietf-quic-recovery-29 2089 * Allow caching of packets that can't be decrypted, by allowing the 2090 reported acknowledgment delay to exceed max_ack_delay prior to 2091 confirming the handshake (#3821, #3980, #4035, #3874) 2093 * Persistent congestion cannot include packets sent before the first 2094 RTT sample for the path (#3875, #3889) 2096 * Recommend reset of min_rtt in persistent congestion (#3927, #3975) 2098 * Persistent congestion is independent of packet number space 2099 (#3939, #3961) 2101 * Only limit bursts to the initial window without information about 2102 the path (#3892, #3936) 2104 * Add normative requirements for increasing and reducing the 2105 congestion window (#3944, #3978, #3997, #3998) 2107 C.5. Since draft-ietf-quic-recovery-28 2109 * Refactored pseudocode to correct PTO calculation (#3564, #3674, 2110 #3681) 2112 C.6. Since draft-ietf-quic-recovery-27 2114 * Added recommendations for speeding up handshake under some loss 2115 conditions (#3078, #3080) 2117 * PTO count is reset when handshake progress is made (#3272, #3415) 2119 * PTO count is not reset by a client when the server might be 2120 awaiting address validation (#3546, #3551) 2122 * Recommend repairing losses immediately after entering the recovery 2123 period (#3335, #3443) 2125 * Clarified what loss conditions can be ignored during the handshake 2126 (#3456, #3450) 2128 * Allow, but don't recommend, using RTT from previous connection to 2129 seed RTT (#3464, #3496) 2131 * Recommend use of adaptive loss detection thresholds (#3571, #3572) 2133 C.7. Since draft-ietf-quic-recovery-26 2135 No changes. 2137 C.8. Since draft-ietf-quic-recovery-25 2139 No significant changes. 2141 C.9. Since draft-ietf-quic-recovery-24 2143 * Require congestion control of some sort (#3247, #3244, #3248) 2145 * Set a minimum reordering threshold (#3256, #3240) 2147 * PTO is specific to a packet number space (#3067, #3074, #3066) 2149 C.10. 
Since draft-ietf-quic-recovery-23

   *  Define under-utilizing the congestion window (#2630, #2686,
      #2675)

   *  PTO MUST send data if possible (#3056, #3057)

   *  Connection Close is not ack-eliciting (#3097, #3098)

   *  MUST limit bursts to the initial congestion window (#3160)

   *  Define the current max_datagram_size for congestion control
      (#3041, #3167)

C.11.  Since draft-ietf-quic-recovery-22

   *  PTO should always send an ack-eliciting packet (#2895)

   *  Unify the Handshake Timer with the PTO timer (#2648, #2658,
      #2886)

   *  Move ACK generation text to transport draft (#1860, #2916)

C.12.  Since draft-ietf-quic-recovery-21

   *  No changes

C.13.  Since draft-ietf-quic-recovery-20

   *  Path validation can be used as initial RTT value (#2644, #2687)

   *  max_ack_delay transport parameter defaults to 0 (#2638, #2646)

   *  ACK delay only measures intentional delays induced by the
      implementation (#2596, #2786)

C.14.  Since draft-ietf-quic-recovery-19

   *  Change kPersistentThreshold from an exponent to a multiplier
      (#2557)

   *  Send a PING if the PTO timer fires and there's nothing to send
      (#2624)

   *  Set loss delay to at least kGranularity (#2617)

   *  Merge application limited and sending after idle sections.
      Always limit burst size instead of requiring resetting CWND to
      initial CWND after idle (#2605)

   *  Rewrite RTT estimation, allow RTT samples where a newly acked
      packet is ack-eliciting but the largest_acked is not (#2592)

   *  Don't arm the handshake timer if there is no handshake data
      (#2590)

   *  Clarify that the time threshold loss alarm takes precedence over
      the crypto handshake timer (#2590, #2620)

   *  Change initial RTT to 500ms to align with RFC6298 (#2184)

C.15.  Since draft-ietf-quic-recovery-18

   *  Change IW byte limit to 14720 from 14600 (#2494)

   *  Update PTO calculation to match RFC6298 (#2480, #2489, #2490)

   *  Improve loss detection's description of multiple packet number
      spaces and pseudocode (#2485, #2451, #2417)

   *  Declare persistent congestion even if non-probe packets are sent
      and don't make persistent congestion more aggressive than RTO
      verified was (#2365, #2244)

   *  Move pseudocode to the appendices (#2408)

   *  What to send on multiple PTOs (#2380)

C.16.  Since draft-ietf-quic-recovery-17

   *  After Probe Timeout discard in-flight packets or send another
      (#2212, #1965)

   *  Endpoints discard initial keys as soon as handshake keys are
      available (#1951, #2045)

   *  0-RTT state is discarded when 0-RTT is rejected (#2300)

   *  Loss detection timer is cancelled when ack-eliciting frames are
      in flight (#2117, #2093)

   *  Packets are declared lost if they are in flight (#2104)

   *  After becoming idle, either pace packets or reset the congestion
      controller (#2138, #2187)

   *  Process ECN counts before marking packets lost (#2142)

   *  Mark packets lost before resetting crypto_count and pto_count
      (#2208, #2209)

   *  Congestion and loss recovery state are discarded when keys are
      discarded (#2327)

C.17.  Since draft-ietf-quic-recovery-16

   *  Unify TLP and RTO into a single PTO; eliminate min RTO, min TLP
      and min crypto timeouts; eliminate timeout validation (#2114,
      #2166, #2168, #1017)

   *  Redefine congestion avoidance in terms of when the period starts
      (#1928, #1930)

   *  Document what needs to be tracked for packets that are in flight
      (#765, #1724, #1939)

   *  Integrate both time and packet thresholds into loss detection
      (#1969, #1212, #934, #1974)

   *  Reduce congestion window after idle, unless pacing is used
      (#2007, #2023)

   *  Disable RTT calculation for packets that don't elicit
      acknowledgment (#2060, #2078)

   *  Limit ack_delay by max_ack_delay (#2060, #2099)

   *  Initial keys are discarded once Handshake keys are available
      (#1951, #2045)

   *  Reorder ECN and loss detection in pseudocode (#2142)

   *  Only cancel loss detection timer if ack-eliciting packets are in
      flight (#2093, #2117)

C.18.  Since draft-ietf-quic-recovery-14

   *  Used max_ack_delay from transport params (#1796, #1782)

   *  Merge ACK and ACK_ECN (#1783)

C.19.  Since draft-ietf-quic-recovery-13

   *  Corrected the lack of ssthresh reduction in CongestionEvent
      pseudocode (#1598)

   *  Considerations for ECN spoofing (#1426, #1626)

   *  Clarifications for PADDING and congestion control (#837, #838,
      #1517, #1531, #1540)

   *  Reduce early retransmission timer to RTT/8 (#945, #1581)

   *  Packets are declared lost after an RTO is verified (#935, #1582)

C.20.  Since draft-ietf-quic-recovery-12

   *  Changes to manage separate packet number spaces and encryption
      levels (#1190, #1242, #1413, #1450)

   *  Added ECN feedback mechanisms and handling; new ACK_ECN frame
      (#804, #805, #1372)

C.21.  Since draft-ietf-quic-recovery-11

   No significant changes.

C.22.  Since draft-ietf-quic-recovery-10

   *  Improved text on ack generation (#1139, #1159)

   *  Make references to TCP recovery mechanisms informational (#1195)

   *  Define time_of_last_sent_handshake_packet (#1171)

   *  Added signal from TLS that the data it includes needs to be sent
      in a Retry packet (#1061, #1199)

   *  Minimum RTT (min_rtt) is initialized with an infinite value
      (#1169)

C.23.  Since draft-ietf-quic-recovery-09

   No significant changes.

C.24.  Since draft-ietf-quic-recovery-08

   *  Clarified pacing and RTO (#967, #977)

C.25.  Since draft-ietf-quic-recovery-07

   *  Include ACK delay in RTO (and TLP) computations (#981)

   *  ACK delay in SRTT computation (#961)

   *  Default RTT and Slow Start (#590)

   *  Many editorial fixes.

C.26.  Since draft-ietf-quic-recovery-06

   No significant changes.

C.27.  Since draft-ietf-quic-recovery-05

   *  Add more congestion control text (#776)

C.28.  Since draft-ietf-quic-recovery-04

   No significant changes.

C.29.  Since draft-ietf-quic-recovery-03

   No significant changes.

C.30.  Since draft-ietf-quic-recovery-02

   *  Integrate F-RTO (#544, #409)

   *  Add congestion control (#545, #395)

   *  Require connection abort if a skipped packet was acknowledged
      (#415)

   *  Simplify RTO calculations (#142, #417)

C.31.  Since draft-ietf-quic-recovery-01

   *  Overview added to loss detection

   *  Changes initial default RTT to 100ms

   *  Added time-based loss detection and fixed early retransmit

   *  Clarified loss recovery for handshake packets

   *  Fixed references and made TCP references informative

C.32.  Since draft-ietf-quic-recovery-00

   *  Improved description of constants and ACK behavior

C.33.  Since draft-iyengar-quic-loss-recovery-01

   *  Adopted as base for draft-ietf-quic-recovery

   *  Updated authors/editors list

   *  Added table of contents

Appendix D.  Contributors

   The IETF QUIC Working Group received an enormous amount of support
   from many people.  The following people provided substantive
   contributions to this document:

   *  Alessandro Ghedini

   *  Benjamin Saunders

   *  Gorry Fairhurst

   *  山本和彦 (Kazu Yamamoto)

   *  奥 一穂 (Kazuho Oku)

   *  Lars Eggert

   *  Magnus Westerlund

   *  Marten Seemann

   *  Martin Duke

   *  Martin Thomson

   *  Mirja Kühlewind

   *  Nick Banks

   *  Praveen Balasubramanian

Acknowledgments

Authors' Addresses

   Jana Iyengar (editor)
   Fastly

   Email: jri.ietf@gmail.com


   Ian Swett (editor)
   Google

   Email: ianswett@google.com