idnits 2.17.00 (12 Aug 2021)

/tmp/idnits29669/draft-welzl-ledbat-survey-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** The document seems to lack a License Notice according IETF Trust
     Provisions of 28 Dec 2009, Section 6.b.ii or Provisions of 12 Sep 2009
     Section 6.b -- however, there's a paragraph with a matching beginning.
     Boilerplate error?

     (You're using the IETF Trust Provisions' Section 6.b License Notice from
     12 Feb 2009 rather than one of the newer Notices.  See
     https://trustee.ietf.org/license-info/.)

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (March 3, 2009) is 4826 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  -- Obsolete informational reference (is this intentional?): RFC 1323
     (Obsoleted by RFC 7323)

  -- Obsolete informational reference (is this intentional?): RFC 2309
     (Obsoleted by RFC 7567)

  -- Obsolete informational reference (is this intentional?): RFC 3662
     (Obsoleted by RFC 8622)

     Summary: 1 error (**), 0 flaws (~~), 1 warning (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Internet Engineering Task Force                                 M. Welzl
Internet-Draft                                   University of Innsbruck
Intended status: Informational                             March 3, 2009
Expires: September 4, 2009


         A Survey of Lower-than-Best Effort Transport Protocols
                    draft-welzl-ledbat-survey-00.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on September 4, 2009.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   This document provides a survey of transport protocols which are
   designed to have a smaller bandwidth and/or delay impact on standard
   TCP than standard TCP itself when they share a bottleneck with it.
   Such protocols could be used for low-priority "background" traffic,
   as they provide what is sometimes called a "less than" (or "lower
   than") best effort service.

Table of Contents

   1.  Introduction
   2.  Delay-based transport protocols
   3.  Non-delay-based transport protocols
   4.  Application layer approaches
   5.  Orthogonal work
   6.  Acknowledgements
   7.  IANA Considerations
   8.  Security Considerations
   9.  Informative References
   Author's Address

1.  Introduction

   As a starting point for the work in the LEDBAT group, this document
   presents a brief survey of efforts to attain a Less than Best Effort
   (LBE) service without help from routers.  We loosely define an LBE
   service as a service which has smaller bandwidth and/or delay impact
   on standard TCP than standard TCP itself when sharing a bottleneck
   with it.  We refer to systems that provide this service as Less than
   Best Effort (LBE) systems.  Generally, LBE behavior can be achieved
   by reacting to queue growth earlier than standard TCP would, or by
   changing the congestion avoidance behavior of TCP without utilizing
   any additional implicit feedback.  Some mechanisms achieve LBE
   behavior at the application layer, e.g. by changing the receiver
   window of standard TCP.  There is also a substantial amount of work
   that is related to the LBE concept but does not present a solution
   that can be installed in end hosts or expected to work over the
   Internet.  According to this classification, solutions have been
   categorized as delay-based transport protocols, non-delay-based
   transport protocols, application layer approaches and orthogonal work
   in this document.

   The author wishes to emphasize that, in its present form, this
   document is only a starting point and not based on a thorough
   literature study.  Many relevant references will be missing, and an
   apology goes to all authors of related work that has not been
   mentioned here.

2.  Delay-based transport protocols

   It is wrong to generally equate "little impact on standard TCP" with
   "small sending rate".  Unless the sender's maximum window is limited
   for some reason, and in the absence of ECN support, standard TCP will
   normally increase its rate until a queue overflows, causing one or
   more packets to be dropped and the rate to be reduced.  A protocol
   which stops increasing the rate before this event happens can, in
   principle, achieve a better performance than standard TCP.  In the
   absence of any other traffic, this is even true for TCP itself when
   its maximum send window is limited to the bandwidth*round-trip time
   (RTT) product.

   TCP Vegas [Bra+94] is one of the first protocols known to have a
   smaller sending rate than standard TCP when both protocols share a
   bottleneck [Kur+00] -- yet it was designed to achieve more, not less
   throughput than standard TCP.  Indeed, when it is the only protocol
   on the bottleneck, the throughput of TCP Vegas is greater than the
   throughput of standard TCP.  Depending on the bottleneck queue
   length, TCP Vegas itself can be starved by standard TCP flows.  This
   can be remedied to some degree by the RED Active Queue Management
   mechanism [RFC2309].

   The congestion avoidance behavior is the protocol's most important
   feature in terms of historical relevance as well as relevance in the
   context of this document (although it has been shown that other
   elements of the protocol can sometimes play a greater role for its
   overall behavior [Hen+00]).  In Congestion Avoidance, once per RTT,
   TCP Vegas calculates the expected throughput as WindowSize / BaseRTT,
   where WindowSize is the current congestion window and BaseRTT is the
   minimum of all measured RTTs.  The expected throughput is then
   compared with the actual (measured) throughput.  Because BaseRTT is
   the smallest RTT seen, the actual throughput can at best equal the
   expected throughput, and queuing delay makes it fall below.  If the
   actual throughput falls below the expected throughput by less than a
   lower threshold, this is taken as a sign that the network is
   underutilized, causing the protocol to linearly increase its rate.
   If the actual throughput falls below the expected throughput by more
   than an upper threshold, this is taken as a sign of congestion,
   causing the protocol to linearly decrease its rate.  Between the two
   thresholds, the rate is left unchanged.

   TCP Vegas has been analyzed extensively.  One of the most prominent
   properties of TCP Vegas is its fairness between multiple flows of the
   same kind, which does not penalize flows with large propagation
   delays in the same way as standard TCP.  While it was not the first
   protocol to use delay as a congestion indication, its predecessors
   (which can be found in [Bra+94]) are not discussed here because of
   the historical "landmark" role that TCP Vegas has taken in the
   literature.

   Transport protocols which were designed to be non-intrusive include
   TCP-LP [Kuz+06], TCP Nice [Ven+02] and 4CP [Liu+07].  Using a simple
   analytical model, the authors of [Kuz+06] illustrate the feasibility
   of this endeavor by showing that, due to the non-linear relationship
   between throughput and RTT, it is possible to remain transparent to
   standard TCP even when the flows under consideration have a larger
   RTT than standard TCP flows.

   TCP Nice [Ven+02] follows the same basic approach as TCP Vegas but
   improves upon it in some aspects.  Because of its moderate linear-
   decrease congestion response, TCP Vegas can affect standard TCP
   despite its ability to detect congestion early.  TCP Nice removes
   this issue by halving the congestion window (at most once per RTT,
   like standard TCP) instead of linearly reducing it.  To avoid being
   too conservative, this is only done if a fixed predefined fraction of
   delay-based incipient congestion signals appears within one RTT.
   Otherwise, TCP Nice falls back to the congestion avoidance rules of
   TCP Vegas if no packet was lost, or to those of standard TCP if a
   packet was lost.  One more feature of TCP Nice is its ability to
   support a congestion window of less than one packet, by clocking out
   single packets over more than one RTT.  With ns-2 simulations and
   real-life experiments using a Linux implementation, the authors of
   [Ven+02] show that TCP Nice achieves its goal of efficiently
   utilizing spare capacity while being non-intrusive to standard TCP.

   Unlike TCP Vegas and TCP Nice, TCP-LP uses the one-way delay (OWD)
   instead of the RTT as an indicator of incipient congestion.  This is
   done to avoid reacting to delay fluctuations that are caused by
   reverse cross-traffic.  Using the TCP Timestamps option [RFC1323],
   the OWD is determined as the difference between the receiver's
   Timestamp value in the ACK and the original Timestamp value that the
   receiver copied into the ACK.  While the result of this subtraction
   can only precisely represent the OWD if clocks are synchronized, its
   absolute value is of no concern to TCP-LP, which only compares it
   against previously measured values, and hence clock synchronization
   is unnecessary.  Using a constant smoothing parameter, TCP-LP
   calculates an Exponentially Weighted Moving Average (EWMA) of the
   measured OWD and checks whether the result exceeds a threshold within
   the range of the minimum and maximum OWD that was seen during the
   connection's lifetime; if it does, this condition is interpreted as
   an "early congestion indication".  The minimum and maximum OWD values
   are initialized during the slow-start phase.
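
   The early congestion test described above for TCP-LP can be sketched
   as follows.  This is only an illustration under stated assumptions,
   not the implementation from [Kuz+06]: the class and function names
   are hypothetical, the smoothing weight (gamma = 1/8) and the
   threshold position (delta = 0.15) are assumed example values, and the
   minimum and maximum are simply tracked from the first sample onward
   rather than being tied to slow-start.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OwdDetector:
    """Hypothetical delay-threshold detector in the style described for TCP-LP."""
    gamma: float = 0.125   # EWMA smoothing weight (assumed example value)
    delta: float = 0.15    # threshold position between min and max OWD (assumed)
    d_min: Optional[float] = None
    d_max: Optional[float] = None
    smoothed: Optional[float] = None

    def update(self, receiver_ts: float, echoed_sender_ts: float) -> bool:
        """Feed the two timestamps from one ACK; True means early congestion."""
        # The difference includes the unknown clock offset between the hosts.
        owd = receiver_ts - echoed_sender_ts
        self.d_min = owd if self.d_min is None else min(self.d_min, owd)
        self.d_max = owd if self.d_max is None else max(self.d_max, owd)
        if self.smoothed is None:
            self.smoothed = owd
        else:
            self.smoothed = (1 - self.gamma) * self.smoothed + self.gamma * owd
        # The threshold lies inside [d_min, d_max], so the offset cancels out.
        threshold = self.d_min + self.delta * (self.d_max - self.d_min)
        return self.smoothed > threshold


def indications(true_owds_ms: List[float], clock_offset_ms: float) -> List[bool]:
    """Run the detector over a trace, with the receiver clock shifted by an offset."""
    detector = OwdDetector()
    results = []
    for send_ts, owd in enumerate(true_owds_ms):
        recv_ts = send_ts + owd + clock_offset_ms
        results.append(detector.update(recv_ts, float(send_ts)))
    return results
```

   Calling indications() on the same delay trace with two different
   clock offsets yields the same sequence of indications, which
   illustrates why unsynchronized clocks do not matter here: both the
   smoothed value and the threshold shift by the same constant.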

   Regarding its reaction to an early congestion indication, TCP-LP
   tries to strike a middle ground between the overly conservative
   choice of immediately setting the congestion window to one packet and
   the presumably too aggressive choice of halving the congestion window
   like standard TCP.  It does so by halving the window at first in
   response to an early congestion indication, then initializing an
   "interference time-out timer", and maintaining the window size until
   this timer fires.  If another early congestion indication appears
   during this "interference phase", the window is then set to 1;
   otherwise, the window is maintained and TCP-LP continues to increase
   it in the standard Additive-Increase fashion.  This method ensures
   that it takes at least two RTTs for a TCP-LP flow to decrease its
   window to 1, and, like standard TCP, TCP-LP reacts to congestion at
   most once per RTT.

   With ns-2 simulations and real-life experiments using a Linux
   implementation, the authors of [Kuz+06] show that TCP-LP is largely
   non-intrusive to TCP traffic while still utilizing a large portion of
   the excess network bandwidth, which is fairly shared among competing
   TCP-LP flows.  They also show that using their protocol for bulk data
   transfers greatly reduces file transfer times of competing best-
   effort web traffic.

3.  Non-delay-based transport protocols

   4CP [Liu+07], which stands for "Competitive and Considerate
   Congestion Control", is a protocol which provides an LBE service by
   changing the window control rules of standard TCP.  A "virtual
   window" is maintained, which, during a so-called "bad congestion
   phase", is reduced to less than a predefined minimum value of the
   actual congestion window.  The congestion window is only increased
   again once the virtual window exceeds this minimum, and in this way
   the virtual window controls the duration during which the sender
   transmits with a fixed minimum rate.  The 4CP congestion avoidance
   algorithm allows for setting a target average window and avoids
   starvation of "background" flows while bounding the impact on
   "foreground" flows.  Its performance was evaluated in ns-2
   simulations and in real-life experiments with a kernel-level
   implementation in Microsoft Windows Vista.

   Some work was done on applying weights to congestion control
   mechanisms, allowing a flow to be as aggressive as a number of
   parallel TCP flows at the same time.  This is usually motivated by
   the fact that users may want to assign different priorities to
   different flows.  The first, and best known, such protocol is MulTCP
   [Cro+98], which emulates N TCPs in a rather simple fashion.  An
   improved version of MulTCP is presented in [Hac+04], and there is
   also a variant where only one feedback loop is applied to control a
   larger traffic aggregate by the name of Probe-Aided (PA-)MulTCP
   [Kuo+08].  Another protocol, CP [Ott+04], applies the same concept to
   the TFRC protocol [RFC5348] in order to provide such fairness
   differentiation for multimedia flows.

   The general assumption underlying all of the above work is that these
   protocols are "N-TCP-friendly", i.e. they are as TCP-friendly as N
   TCPs, where N is a positive (and possibly natural) number which is
   greater than or equal to 1.  The MulTFRC [Dam+09] protocol, another
   extension of TFRC for multiple flows, is however able to support
   values between 0 and 1, making it applicable as a mechanism for an
   LBE service.  Since it does not react to delay like the mechanisms
   above but adjusts its rate like TFRC, it can probably be expected to
   be more aggressive than mechanisms such as TCP Nice or TCP-LP.  This
   also means that MulTFRC is less likely to be prone to starvation, as
   its aggression is tunable at a fine granularity even when N is
   between 0 and 1.

4.  Application layer approaches

   The mechanism described in [Spr+00] controls the bandwidth by letting
   the receiver intelligently manipulate the receiver window of standard
   TCP.  This is done because the authors assume a client-server setting
   where the receiver's access link is typically the bottleneck.  The
   scheme incorporates a delay-based calculation of the expected queue
   length at the bottleneck, which is quite similar to the calculation
   in the above delay-based protocols, e.g. TCP Vegas.  Using a Linux
   implementation, where TCP flows are classified according to their
   application's needs, it is shown that a significant improvement in
   packet latency can be attained over an unmodified system while
   maintaining good link utilization.

   Receiver window tuning is also done in [Key+04], where choosing the
   right value for the window is phrased as an optimization problem.  On
   this basis, two algorithms are presented: binary search, which
   reaches a good operation point faster but fluctuates, and stochastic
   optimization, which does not fluctuate but converges more slowly.
   These algorithms merely use the previous receiver window and the
   amount of data received during the previous control interval as
   input.  According to [Key+04], the encouraging simulation results
   suggest that such an application level mechanism can work almost as
   well as a transport layer scheme like TCP-LP.

   TODO: mention other rwnd tuning and different application layer work,
   e.g. from related work sections of [Egg+05] and [Kok+04] and intro of
   [Key+04].

5.  Orthogonal work

   Various suggestions have been published for realizing an LBE service
   by influencing the way packets are treated in routers.  One example
   is the Persistent Class Based Queuing (P-CBQ) scheme presented in
   [Car+01], which is a variant of Class Based Queuing (CBQ) with per-
   flow accounting.  RFC 3662 [RFC3662] defines a DiffServ per-domain
   behavior called "Lower Effort".

   Harp [Kok+04] realizes an LBE service by dissipating background
   traffic to less-utilized paths of the network.  This is achieved
   without changing routers by using edge nodes as relays.  According to
   the authors, these edge nodes should be gateways of organizations in
   order to align their scheme with usage incentives, but the technical
   solution would also work if Harp were only deployed in end hosts.  It
   detects impending congestion by looking at delay, similarly to TCP
   Nice [Ven+02], and manages to improve utilization and fairness over
   pure single-path solutions.

   An entirely different approach is taken in [Egg+05]: here, the
   priority of a flow is reduced via a generic idletime scheduling
   strategy in a host's operating system.  While results presented in
   this paper show that the new scheduler can effectively shield regular
   tasks from low-priority ones (e.g. TCP from greedy UDP) with only a
   minor performance impact, it is an underlying assumption that all
   involved end hosts would use the idletime scheduler.  In other words,
   it is not the focus of this work to protect a standard TCP flow which
   originates from any host where the presented scheduling scheme may
   not be implemented.

   TODO: studies dealing with the precision of congestion prediction in
   end hosts (i.e. using delay to determine the onset of congestion) may
   be relevant in this document, and could be discussed here, e.g.
   [Bha+07] and the references therein.

6.  Acknowledgements

   The author would like to thank Dragana Damjanovic for reference
   pointers.  Surely lots of other folks will help in one way or another
   later and I'll thank them all here.

7.  IANA Considerations

   This memo includes no request to IANA.

8.  Security Considerations

   This document introduces no new security considerations.

9.  Informative References

   [Bha+07]   Bhandarkar, S., Reddy, A., Zhang, Y., and D. Loguinov,
              "Emulating AQM from end hosts", Proceedings of ACM
              SIGCOMM 2007, 2007.

   [Bra+94]   Brakmo, L., O'Malley, S., and L. Peterson, "TCP Vegas: New
              techniques for congestion detection and avoidance",
              Proceedings of SIGCOMM '94, pages 24-35, August 1994.

   [Car+01]   Carlberg, K., Gevros, P., and J. Crowcroft, "Lower than
              best effort: a design and implementation", Workshop on
              Data communication in Latin America and the
              Caribbean 2001, San Jose, Costa Rica, pp. 244-265, 2001.

   [Cro+98]   Crowcroft, J. and P. Oechslin, "Differentiated end-to-end
              Internet services using a weighted proportional fair
              sharing TCP", ACM SIGCOMM Computer Communication
              Review vol. 28, no. 3 (July 1998), pp. 53-69, 1998.

   [Dam+09]   Damjanovic, D. and M. Welzl, "MulTFRC: Providing Weighted
              Fairness for Multimedia Applications (and others too!)",
              Work in progress ..., 2009.

   [Egg+05]   Eggert, L. and J. Touch, "Idletime Scheduling with
              Preemption Intervals", Proceedings of 20th ACM Symposium
              on Operating Systems Principles SOSP 2005, Brighton,
              United Kingdom, pp. 249-262, October 2005.

   [Hac+04]   Hacker, T., Noble, B., and B. Athey, "Improving Throughput
              and Maintaining Fairness using Parallel TCP", Proceedings
              of Infocom 2004, March 2004.

   [Hen+00]   Hengartner, U., Bolliger, J., and T. Gross, "TCP Vegas
              revisited", Proceedings of Infocom 2000, March 2000.

   [Key+04]   Key, P., Massoulié, L., and B. Wang, "Emulating Low-
              Priority Transport at the Application Layer: a Background
              Transfer Service", Proceedings of ACM SIGMETRICS 2004,
              January 2004.

   [Kok+04]   Kokku, R., Bohra, A., Ganguly, S., and A. Venkataramani,
              "A Multipath Background Network Architecture", Proceedings
              of Infocom 2007, May 2007.

   [Kuo+08]   Kuo, F. and X. Fu, "Probe-Aided MulTCP: an aggregate
              congestion control mechanism", ACM SIGCOMM Computer
              Communication Review vol. 38, no. 1 (January 2008), pp.
              17-28, 2008.

   [Kur+00]   Kurata, K., Hasegawa, G., and M. Murata, "Fairness
              Comparisons Between TCP Reno and TCP Vegas for Future
              Deployment of TCP Vegas", Proceedings of INET 2000,
              July 2000.

   [Kuz+06]   Kuzmanovic, A. and E. Knightly, "TCP-LP: low-priority
              service via end-point congestion control", IEEE/ACM
              Transactions on Networking (ToN) Volume 14, Issue 4, pp.
              739-752, August 2006.

   [Liu+07]   Liu, S., Vojnovic, M., and D. Gunawardena, "Competitive
              and Considerate Congestion Control for Bulk Data
              Transfers", Proceedings of IWQoS 2007, June 2007.

   [Ott+04]   Ott, D., Sparks, T., and K. Mayer-Patel, "Aggregate
              congestion control for distributed multimedia
              applications", Proceedings of Infocom 2004, March 2004.

   [RFC1323]  Jacobson, V., Braden, B., and D. Borman, "TCP Extensions
              for High Performance", RFC 1323, May 1992.

   [RFC2309]  Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering,
              S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G.,
              Partridge, C., Peterson, L., Ramakrishnan, K., Shenker,
              S., Wroclawski, J., and L. Zhang, "Recommendations on
              Queue Management and Congestion Avoidance in the
              Internet", RFC 2309, April 1998.

   [RFC3662]  Bless, R., Nichols, K., and K. Wehrle, "A Lower Effort
              Per-Domain Behavior (PDB) for Differentiated Services",
              RFC 3662, December 2003.

   [RFC5348]  Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP
              Friendly Rate Control (TFRC): Protocol Specification",
              RFC 5348, September 2008.

   [Spr+00]   Spring, N., Chesire, M., Berryman, M., Sahasranaman, V.,
              Anderson, T., and B. Bershad, "Receiver based management
              of low bandwidth access links", Proceedings of
              Infocom 2000, pp. 245-254, vol.1, 2000.

   [Ven+02]   Venkataramani, A., Kokku, R., and M. Dahlin, "TCP Nice: a
              mechanism for background transfers", Proceedings of
              OSDI '02, 2002.

Author's Address

   Michael Welzl
   University of Innsbruck
   Technikerstr. 21 A
   Innsbruck,   6020
   Austria

   Phone: +43 512 507 6110
   Email: michael.welzl@uibk.ac.at