idnits 2.17.00 (12 Aug 2021)

/tmp/idnits22427/draft-ietf-6lo-fragment-recovery-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

     (Using the creation date from RFC4944, updated by this document, for
     RFC5378 checks: 2005-07-13)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (May 20, 2019) is 1097 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: draft-ietf-6lo-minimal-fragment has been published
     as RFC 8930

  == Outdated reference: draft-ietf-6tisch-architecture has been published as
     RFC 9030

     Summary: 0 errors (**), 0 flaws (~~), 3 warnings (==), 2 comments (--).
     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

6lo                                                      P. Thubert, Ed.
Internet-Draft                                             Cisco Systems
Updates: 4944 (if approved)                                 May 20, 2019
Intended status: Standards Track
Expires: November 21, 2019

                  6LoWPAN Selective Fragment Recovery
                  draft-ietf-6lo-fragment-recovery-03

Abstract

   This draft updates RFC 4944 with a simple protocol to recover
   individual fragments across a route-over mesh network, with a minimal
   flow control to protect the network against bloat.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 21, 2019.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
     2.1.  BCP 14
     2.2.  References
     2.3.  6LoWPAN Acronyms
     2.4.  Referenced Work
     2.5.  New Terms
   3.  Updating RFC 4944
   4.  Updating draft-ietf-6lo-minimal-fragment
     4.1.  Slack in the First Fragment
     4.2.  Gap between frames
     4.3.  Modifying the First Fragment
   5.  New Dispatch types and headers
     5.1.  Recoverable Fragment Dispatch type and Header
     5.2.  RFRAG Acknowledgment Dispatch type and Header
   6.  Fragments Recovery
     6.1.  Forwarding Fragments
       6.1.1.  Upon the first fragment
       6.1.2.  Upon the next fragments
     6.2.  Upon the RFRAG Acknowledgments
     6.3.  Cancelling a Fragmented Packet
   7.  Management Considerations
     7.1.  Protocol Parameters
     7.2.  Observing the network
   8.  Security Considerations
   9.  IANA Considerations
   10. Acknowledgments
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Appendix A.  Rationale
   Appendix B.  Requirements
   Appendix C.  Considerations On Flow Control
   Author's Address

1.  Introduction

   In most Low Power and Lossy Network (LLN) applications, the bulk of
   the traffic consists of small chunks of data (on the order of a few
   bytes to a few tens of bytes) at a time.  Given that an IEEE Std.
   802.15.4 [IEEE.802.15.4] frame can carry 74 bytes or more in all
   cases, fragmentation is usually not required.  However, and though
   this happens only occasionally, a number of mission-critical
   applications do require the capability to transfer larger chunks of
   data, for instance to support a firmware upgrade of the LLN nodes or
   an extraction of logs from LLN nodes.  In the former case, the large
   chunk of data is transferred to the LLN node, whereas in the latter,
   the large chunk flows away from the LLN node.  In both cases, the
   size can be on the order of 10 Kbytes or more and an end-to-end
   reliable transport is required.

   "Transmission of IPv6 Packets over IEEE 802.15.4 Networks" [RFC4944]
   defines the original 6LoWPAN datagram fragmentation mechanism for
   LLNs.  One critical issue with this original design is that routing
   an IPv6 [RFC8200] packet across a route-over mesh requires
   reassembling the full packet at each hop, which may cause latency
   along a path and an overall buffer bloat in the network.
   The "6TiSCH Architecture" [I-D.ietf-6tisch-architecture] recommends
   using a hop-by-hop fragment forwarding technique to alleviate those
   undesirable effects.  "LLN Minimal Fragment Forwarding"
   [I-D.ietf-6lo-minimal-fragment] proposes such a technique, in a
   fashion that is compatible with [RFC4944] without the need to define
   a new protocol.

   However, adding that capability alone to the local implementation of
   the original 6LoWPAN fragmentation would not address the issues of
   locked resources and wasted transmissions due to the loss of a
   fragment.  [RFC4944] does not define a mechanism to first discover a
   fragment loss, and then to recover that loss.  With RFC 4944, the
   forwarding of a whole datagram fails when one fragment is not
   delivered properly to the destination 6LoWPAN endpoint.  Constrained
   memory resources are blocked on the receiver until the receiver times
   out.

   That problem is exacerbated when forwarding fragments over multiple
   hops, since a loss at an intermediate hop will not be discovered by
   either the source or the destination.  The source will keep on
   sending fragments, wasting even more resources in the network and
   possibly contributing to the condition that caused the loss, to no
   avail since the datagram cannot arrive in its entirety.  RFC 4944 is
   also missing signaling to abort a multi-fragment transmission at any
   time and from either end, and, if the capability to forward fragments
   is implemented, to clean up the related state in the network.  It
   also lacks flow control capabilities to avoid contributing to a
   congestion that may in turn cause the loss of a fragment and trigger
   the retransmission of the full datagram.

   This specification proposes a method to forward fragments across a
   multi-hop route-over mesh, and to recover individual fragments
   between LLN endpoints.
   The method is designed to limit congestion
   loss in the network and addresses the requirements that are detailed
   in Appendix B.

2.  Terminology

2.1.  BCP 14

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119][RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.2.  References

   In this document, readers will encounter terms and concepts that are
   discussed in the following documents:

   o  "Problem Statement and Requirements for IPv6 over Low-Power
      Wireless Personal Area Network (6LoWPAN) Routing" [RFC6606]

2.3.  6LoWPAN Acronyms

   This document uses the following acronyms:

   6BBR:  6LoWPAN Backbone Router

   6LBR:  6LoWPAN Border Router

   6LN:  6LoWPAN Node

   6LR:  6LoWPAN Router

   LLN:  Low-Power and Lossy Network

2.4.  Referenced Work

   Past experience with fragmentation has shown that misassociated or
   lost fragments can lead to poor network behavior and, occasionally,
   trouble at the application layer.  The reader is encouraged to read
   "IPv4 Reassembly Errors at High Data Rates" [RFC4963] and follow the
   references for more information.

   That experience led to the definition of the "Path MTU discovery"
   [RFC8201] (PMTUD) protocol that limits fragmentation over the
   Internet.

   Specifically in the case of UDP, valuable additional information can
   be found in "UDP Usage Guidelines for Application Designers"
   [RFC8085].

   Readers are expected to be familiar with all the terms and concepts
   that are discussed in "IPv6 over Low-Power Wireless Personal Area
   Networks (6LoWPANs): Overview, Assumptions, Problem Statement, and
   Goals" [RFC4919] and "Transmission of IPv6 Packets over IEEE 802.15.4
   Networks" [RFC4944].

   "The Benefits of Using Explicit Congestion Notification (ECN)"
   [RFC8087] provides useful information on the potential benefits and
   pitfalls of using ECN.

   Quoting the "Multiprotocol Label Switching (MPLS) Architecture"
   [RFC3031]: with MPLS, "packets are "labeled" before they are
   forwarded.  At subsequent hops, there is no further analysis of the
   packet's network layer header.  Rather, the label is used as an index
   into a table which specifies the next hop, and a new label".  The
   MPLS technique is leveraged in the present specification to forward
   fragments that actually do not have a network layer header, since the
   fragmentation occurs below IP.

   "LLN Minimal Fragment Forwarding" [I-D.ietf-6lo-minimal-fragment]
   introduces the concept of a Virtual Reassembly Buffer (VRB) and an
   associated technique to forward fragments as they come, using the
   Datagram_tag as a label in a fashion similar to MPLS.  This
   specification reuses that technique with slightly modified controls.

2.5.  New Terms

   This specification uses the following terms:

   6LoWPAN endpoints

      The LLN nodes in charge of generating or expanding a 6LoWPAN
      header from/to a full IPv6 packet.  The 6LoWPAN endpoints are the
      points where fragmentation and reassembly take place.

3.  Updating RFC 4944

   This specification updates the fragmentation mechanism that is
   specified in "Transmission of IPv6 Packets over IEEE 802.15.4
   Networks" [RFC4944] for use in route-over LLNs by providing a model
   where fragments can be forwarded end-to-end across a 6LoWPAN LLN, and
   where fragments that are lost on the way can be recovered
   individually.  A new format for fragments is introduced and new
   dispatch types are defined in Section 5.

   [RFC8138] allows modifying the size of a packet en route by removing
   the consumed hops in a compressed Routing Header.
   As a result,
   the fragment_offset and datagram_size cannot be signaled in the
   uncompressed form.  This specification expresses those fields in the
   compressed form and allows modifying them en route (see Section 4.3).

   Note that, consistent with Section 2 of [RFC6282] for the
   fragmentation mechanism described in Section 5.3 of [RFC4944], any
   header that cannot fit within the first fragment MUST NOT be
   compressed when using the fragmentation mechanism described in this
   specification.

4.  Updating draft-ietf-6lo-minimal-fragment

   This specification updates the fragment forwarding mechanism
   specified in "LLN Minimal Fragment Forwarding"
   [I-D.ietf-6lo-minimal-fragment] by providing additional operations to
   improve the management of the Virtual Reassembly Buffer (VRB).

4.1.  Slack in the First Fragment

   At the time of this writing, [I-D.ietf-6lo-minimal-fragment] allows
   for refragmenting in intermediate nodes, meaning that some bytes from
   a given fragment may be left in the VRB to be added to the next
   fragment.  This happens when the outgoing fragment needs space that
   was not needed in the incoming fragment, for instance because the
   6LoWPAN Header Compression is not as efficient on the outgoing link,
   e.g., if the Interface ID (IID) of the source IPv6 address is elided
   by the originator on the first hop because it matches the source MAC
   address, but cannot be elided on the next hops because the source MAC
   address changes.

   This specification cannot allow this operation since fragments are
   recovered end-to-end based on the fragment number.  This means that
   the fragments that contain a 6LoWPAN-compressed header MUST have
   enough slack to enable a less efficient compression in the next hops
   that still fits in one MAC frame.
   For instance, if the IID of the
   source IPv6 address is elided by the originator, then it MUST compute
   the fragment_size as if the MTU was 8 bytes less.  This way, the next
   hop can restore the source IID to the first fragment without
   impacting the second fragment.

4.2.  Gap between frames

   This specification introduces a concept of Inter-Frame Gap, which is
   a configurable interval of time between transmissions to the same
   next hop.  In the case of half duplex interfaces, this InterFrameGap
   ensures that the next hop has finished processing the previous frame
   and is capable of receiving the next one.

   In the case of a mesh operating at a single frequency with
   omnidirectional antennas, a larger InterFrameGap is required to
   protect the frame against hidden terminal collisions with the
   previous frame of the same flow that is still progressing along a
   common path.

   The Inter-Frame Gap is useful even for unfragmented datagrams, but it
   becomes a necessity for fragments that are typically generated in a
   fast sequence and are all sent over the exact same path.

4.3.  Modifying the First Fragment

   The compression of the Hop Limit, of the source and destination
   addresses, and of the Routing Header may change en route in a Route-
   Over mesh LLN.  If the size of the first fragment is modified, then
   the intermediate node MUST adapt the datagram_size to reflect that
   difference.

   The intermediate node MUST also save the difference of datagram_size
   of the first fragment in the VRB and add it to the datagram_size and
   to the fragment_offset of all the subsequent fragments for that
   datagram.

5.  New Dispatch types and headers

   This specification enables the 6LoWPAN fragmentation sublayer to
   provide an MTU up to 2048 bytes to the upper layer, which can be the
   6LoWPAN Header Compression sublayer that is defined in the
   "Compression Format for IPv6 Datagrams" [RFC6282] specification.
   In
   order to achieve this, this specification enables the fragmentation
   and the reliable transmission of fragments over a multihop 6LoWPAN
   mesh network.

   This specification provides a technique that is derived from MPLS in
   order to forward individual fragments across a 6LoWPAN route-over
   mesh.  The Datagram_tag is used as a label; it is locally unique to
   the node that is the source MAC address of the fragment, so together
   the MAC address and the label can identify the fragment globally.  A
   node may build the Datagram_tag in its own locally-significant way,
   as long as the selected tag stays unique to the particular datagram
   for the lifetime of that datagram.  As a result, the label does not
   need to be globally unique, but it must be swapped at each hop as the
   source MAC address changes.

   This specification extends RFC 4944 [RFC4944] with 4 new Dispatch
   types, for Recoverable Fragment (RFRAG) headers with or without
   Acknowledgment Request (RFRAG vs. RFRAG-ARQ), and for the RFRAG
   Acknowledgment back, with or without ECN Echo (RFRAG-ACK vs. RFRAG-
   ECHO).

   (to be confirmed by IANA) The new 6LoWPAN Dispatch types use the
   Value Bit Pattern of 11 1010xx from page 0 [RFC8025], as follows:

              Pattern    Header Type
            +------------+------------------------------------------+
            | 11  10100x | RFRAG      - Recoverable Fragment        |
            | 11  10101x | RFRAG-ACK  - RFRAG Acknowledgment        |
            +------------+------------------------------------------+

             Figure 1: Additional Dispatch Value Bit Patterns

   In the following sections, a "Datagram_tag" extends the semantics
   defined in Section 5.3 of [RFC4944], "Fragmentation Type and Header".
   The Datagram_tag is a locally unique identifier for the datagram from
   the perspective of the sender.  This means that the Datagram_tag
   identifies a datagram uniquely in the network when associated with
   the source of the datagram.
   As the datagram gets forwarded, the
   source changes and the Datagram_tag must be swapped as detailed in
   [I-D.ietf-6lo-minimal-fragment].

5.1.  Recoverable Fragment Dispatch type and Header

   In this specification, the size and offset of the fragments are
   expressed on the compressed packet form as opposed to the
   uncompressed - native - packet form.

   The format of the fragment header is the same for all fragments.  The
   format indicates both a length and an offset, which may seem
   redundant with the sequence field, but are not.  The position of a
   fragment in the recomposition buffer is neither correlated with the
   value of the sequence field nor with the order in which the fragments
   are received.  This enables out-of-sequence and overlapping
   fragments, e.g., a fragment 5 that is retried as smaller fragments 5,
   13 and 14 due to a change of MTU.

   There is no requirement on the receiver to check for contiguity of
   the received fragments, and the sender MUST ensure that when all
   fragments are acknowledged, then the datagram is fully received.
   This may be useful in particular in the case where the MTU changes
   and a fragment sequence is retried with a smaller fragment_size, the
   remainder of the original fragment being retried with new sequence
   values.

   The first fragment is recognized by a sequence of 0; it carries its
   fragment_size and the datagram_size of the compressed packet, whereas
   the other fragments carry their fragment_size and fragment_offset.

   The last fragment for a datagram is recognized when its
   fragment_offset and its fragment_size add up to the datagram_size.

   Recoverable Fragments are sequenced and a bitmap is used in the RFRAG
   Acknowledgment to indicate the received fragments by setting the
   individual bits that correspond to their sequence.
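
   As an informal illustration only (it is not part of this
   specification), the Python sketch below packs and parses the 6-octet
   RFRAG header of Figure 2, using the field widths given in this
   section.  The function names, the dictionary representation and the
   is_last_fragment helper are assumptions made for the example.

```python
import struct

# 7-bit dispatch pattern "11 10100x"; the trailing x bit carries the E flag.
RFRAG_DISPATCH = 0b1110100


def pack_rfrag(tag, seq, frag_size, frag_offset, ack_request=False, ecn=False):
    """Build a 6-octet RFRAG header (hypothetical helper, not a real API)."""
    if not (0 <= tag < 1 << 8 and 0 <= seq < 1 << 5
            and 0 <= frag_size < 1 << 10 and 0 <= frag_offset < 1 << 16):
        raise ValueError("field out of range")
    # First 16 bits: dispatch (7), E (1), Datagram_tag (8).
    first = (RFRAG_DISPATCH << 9) | (int(ecn) << 8) | tag
    # Next 32 bits: X (1), sequence (5), fragment_size (10), fragment_offset (16).
    second = (int(ack_request) << 31) | (seq << 26) | (frag_size << 16) | frag_offset
    return struct.pack("!HI", first, second)


def unpack_rfrag(header):
    """Parse the first 6 octets of a frame as an RFRAG header."""
    first, second = struct.unpack("!HI", header[:6])
    if first >> 9 != RFRAG_DISPATCH:
        raise ValueError("not an RFRAG header")
    return {
        "ecn": bool((first >> 8) & 1),
        "tag": first & 0xFF,
        "ack_request": bool(second >> 31),
        "seq": (second >> 26) & 0x1F,
        # For seq == 0 the last field carries the datagram_size instead.
        "frag_size": (second >> 16) & 0x3FF,
        "frag_offset": second & 0xFFFF,
    }


def is_last_fragment(frag_offset, frag_size, datagram_size):
    """A fragment is the last one when offset + size reach the datagram_size."""
    return frag_offset + frag_size == datagram_size
```

   The sketch leaves the payload, the MAC header and any validation of
   the abort condition (a Fragment_offset of 0) to the caller.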

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                   |1 1 1 0 1 0 0|E|  Datagram_tag |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |X| sequence|   fragment_size   |        fragment_offset        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                                             X set == Ack-Request

                Figure 2: RFRAG Dispatch type and Header

   E:  1 bit; Explicit Congestion Notification; the "E" flag is reset by
      the source of the fragment and set by intermediate routers to
      signal that this fragment experienced congestion along its path.

   Fragment_size:  10 bit unsigned integer; the size of this fragment in
      a unit that depends on the MAC layer technology.  By default, that
      unit is the octet, which allows fragments up to 1023 bytes.  For
      IEEE Std. 802.15.4, the unit is the octet, and the maximum
      fragment size, which is constrained by the maximum frame size of
      128 octets minus the overheads of the MAC and Fragment Headers, is
      not limited by this encoding.

   X:  1 bit; Ack-Request: when set, the sender requires an RFRAG
      Acknowledgment from the receiver.

   Sequence:  5 bit unsigned integer; the sequence number of the
      fragment in the acknowledgement bitmap.  Fragments are numbered
      [0..N] where N is in [0..31].  A Sequence of 0 indicates the first
      fragment in a datagram, but non-zero values are not indicative of
      the position in the recomposition buffer.

   Fragment_offset:  16 bit unsigned integer;

      *  When the Fragment_offset is set to a non-0 value, its semantics
         depend on the value of the Sequence field.

         +  For a first fragment (i.e. with a Sequence of 0), this field
            indicates the datagram_size of the compressed datagram, to
            help the receiver allocate an adapted buffer for the
            reception and reassembly operations.
            The fragment may be
            stored for local recomposition, or it may be routed based on
            the destination IPv6 address, in which case a VRB state must
            be installed as described in Section 6.1.1.

         +  When the Sequence is not 0, this field indicates the offset
            of the fragment in the compressed form.  The fragment may be
            added to a local recomposition buffer or forwarded based on
            an existing VRB as described in Section 6.1.2.

      *  A Fragment_offset that is set to a value of 0 indicates an
         abort condition and all state regarding the datagram should be
         cleaned up once the processing of the fragment is complete; the
         processing of the fragment depends on whether there is a VRB
         already established for this datagram, and the next hop is
         still reachable:

         +  if a VRB already exists and is not broken, the fragment is
            to be forwarded along the associated Label Switched Path
            (LSP) as described in Section 6.1.2, but regardless of the
            value of the Sequence field;

         +  else, if the Sequence is 0, then the fragment is to be
            routed as described in Section 6.1.1 but no state is
            conserved afterwards.  In that case, the session, if it
            exists, is aborted and the packet is also forwarded in an
            attempt to clean up the next hops along the path indicated
            by the IPv6 header (possibly including a routing header).

         If the fragment cannot be forwarded or routed, then an abort
         RFRAG-ACK is sent back to the source.

5.2.  RFRAG Acknowledgment Dispatch type and Header

   This specification also defines a 4-octet RFRAG Acknowledgment bitmap
   that is used by the reassembling end point to confirm selectively the
   reception of individual fragments.  A given offset in the bitmap maps
   one to one with a given sequence number.
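
   The mapping between sequence numbers and bit offsets can be sketched
   as follows.  This Python fragment is illustrative only and assumes,
   as drawn in Figure 3, that offset 0 is the leftmost (most
   significant) bit of the 32-bit bitmap when it is sent in network
   byte order.  The helper names are hypothetical.

```python
import struct

NULL_BITMAP = 0x00000000  # all 0's: the fragmentation process is aborted
FULL_BITMAP = 0xFFFFFFFF  # all 1's: every fragment was received


def ack_bitmap(received):
    """Set the bit at offset `seq` for each received sequence number."""
    bitmap = 0
    for seq in received:
        if not 0 <= seq <= 31:
            raise ValueError("sequence must be in [0..31]")
        bitmap |= 1 << (31 - seq)  # offset 0 is the most significant bit
    return bitmap


def missing_fragments(bitmap, sent):
    """Return the sent sequence numbers whose bit is not set in the bitmap."""
    return [seq for seq in sorted(sent) if not bitmap & (1 << (31 - seq))]


def encode_bitmap(bitmap):
    """Serialize the bitmap as the 4 octets carried in the RFRAG-ACK."""
    return struct.pack("!I", bitmap)
```

   With fragments 0 to 20 sent and fragments 1, 2 and 16 lost, which is
   the situation of Figure 4, ack_bitmap yields 0x9FFF7800 and
   missing_fragments returns [1, 2, 16].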

   The offset of the bit in the bitmap indicates which fragment is
   acknowledged as follows:

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  RFRAG Acknowledgment Bitmap                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    ^                 ^
    |                 |    bitmap indicating whether:
    |                 +----- Fragment with sequence 9 was received
    +----------------------- Fragment with sequence 0 was received

             Figure 3: RFRAG Acknowledgment bitmap encoding

   Figure 4 shows an example Acknowledgment bitmap which indicates that
   all fragments from sequence 0 to 20 were received, except for
   fragments 1, 2 and 16 that were lost and must be retried.

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|0|0|1|1|1|1|1|1|1|1|1|1|1|1|1|0|1|1|1|1|0|0|0|0|0|0|0|0|0|0|0|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Figure 4: Example RFRAG Acknowledgment Bitmap

   The RFRAG Acknowledgment Bitmap is included in a RFRAG Acknowledgment
   header, as follows:

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                   |1 1 1 0 1 0 1|E|  Datagram_tag |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             RFRAG Acknowledgment Bitmap (32 bits)             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         Figure 5: RFRAG Acknowledgment Dispatch type and Header

   E:  1 bit; Explicit Congestion Notification Echo

      When set, the sender indicates that at least one of the
      acknowledged fragments was received with an Explicit Congestion
      Notification, indicating that the path followed by the fragments
      is subject to congestion.  More in Appendix C.

   RFRAG Acknowledgment Bitmap
      An RFRAG Acknowledgment Bitmap, whereby setting the bit at offset
      x indicates that fragment x was received, as shown in Figure 3.
      All 0's is a NULL bitmap that indicates that the fragmentation
      process is aborted.  All 1's is a FULL bitmap that indicates that
      the fragmentation process is complete: all fragments were received
      at the reassembly end point.

6.  Fragments Recovery

   The Recoverable Fragment headers RFRAG and RFRAG-ARQ are used to
   transport a fragment and optionally request an RFRAG Acknowledgment
   that will confirm the reception of one or more fragments.  An
   RFRAG Acknowledgment is carried as a standalone header in a message
   that is sent back to the 6LoWPAN endpoint that was the source of the
   fragments, as known by its MAC address.  The process ensures that at
   every hop, the source MAC address and the Datagram_tag in the
   received fragment are enough information to send the RFRAG
   Acknowledgment back towards the source 6LoWPAN endpoint by reversing
   the MPLS operation.

   The 6LoWPAN endpoint that fragments the packets at 6LoWPAN level (the
   sender) also controls the number of acknowledgments by setting the
   Ack-Request flag in the RFRAG packets.  The sender may set the Ack-
   Request flag on any fragment to perform congestion control by
   limiting the number of outstanding fragments, which are the fragments
   that have been sent but for which reception or loss was not
   positively confirmed by the reassembling endpoint.  The maximum
   number of outstanding fragments is the Window-Size.  It is
   configurable and may vary in case of ECN notification.  When it
   receives a fragment with the Ack-Request flag set, the 6LoWPAN
   endpoint that reassembles the packets at 6LoWPAN level (the receiver)
   MUST send back an RFRAG Acknowledgment to confirm reception of all
   the fragments it has received so far.
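
   One possible way for a sender to bound the number of outstanding
   fragments is sketched below in Python.  It is a simplified
   illustration, not a mechanism mandated by this specification: it
   assumes a fixed Window-Size, requests an acknowledgment on the
   fragment that fills each window and on the last fragment of the
   datagram, and ignores retries and ECN.  The function name is
   hypothetical.

```python
def plan_ack_requests(num_fragments, window_size):
    """Return one boolean per fragment: True where the Ack-Request flag is set.

    Assumption for this sketch: fragments are sent in order and the window
    is closed every `window_size` fragments, as well as on the last one.
    """
    if num_fragments < 1 or window_size < 1:
        raise ValueError("num_fragments and window_size must be positive")
    flags = []
    for index in range(num_fragments):
        end_of_window = (index + 1) % window_size == 0
        last_fragment = index == num_fragments - 1
        flags.append(end_of_window or last_fragment)
    return flags
```

   For a datagram of 7 fragments and a Window-Size of 3, this sets the
   Ack-Request flag on the fragments sent third, sixth and seventh, so
   that at most 3 fragments are outstanding when the sender waits for an
   RFRAG Acknowledgment.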

   The Ack-Request bit marks the end of a window.  It SHOULD be set on
   the last fragment to protect the datagram, and MAY be used in
   intermediate fragments for the purpose of flow control.  This ARQ
   process MUST be protected by an ARQ timer, and the fragment that
   carries the Ack-Request flag MAY be retried upon timeout a
   configurable number of times.  Upon exhaustion of the retries, the
   sender may either abort the transmission of the datagram or retry the
   datagram from the first fragment with an Ack-Request in order to
   reestablish a path and discover which fragments were received over
   the old path.  When the sender of the fragment knows that an
   underlying link-layer mechanism protects the fragments, it may
   refrain from using the RFRAG Acknowledgment mechanism, and never set
   the Ack-Request bit.

   The RFRAG Acknowledgment can optionally carry an ECN indication for
   flow control (see Appendix C).  The receiver of a fragment with the
   'E' (ECN) flag set MUST echo that information by setting the 'E'
   (ECN) flag in the next RFRAG Acknowledgment.

   The sender transfers a controlled number of fragments and MAY flag
   the last fragment of a series with an RFRAG Acknowledgment Request.
   The receiver MUST acknowledge a fragment with the acknowledgment
   request bit set.  If any fragment immediately preceding an
   acknowledgment request is still missing, the receiver MAY
   intentionally delay its acknowledgment to allow in-transit fragments
   to arrive.  Delaying the acknowledgment might defeat the round trip
   delay computation so it should be configurable and not enabled by
   default.

   The receiver MAY issue unsolicited acknowledgments.  An unsolicited
   acknowledgment signals to the sender endpoint that it can resume
   sending if it had reached its maximum number of outstanding
   fragments.
   Another use is to inform the sender that the reassembling
   endpoint has canceled the processing of an individual datagram.  Note
   that acknowledgments might consume precious resources so the use of
   unsolicited acknowledgments should be configurable and not enabled by
   default.

   An observation is that streamlining forwarding of fragments generally
   reduces the latency over the LLN mesh, providing room for retries
   within existing upper-layer reliability mechanisms.  The sender
   protects the transmission over the LLN mesh with a retry timer that
   is computed according to the method detailed in [RFC6298].  It is
   expected that the upper layer retries obey the recommendations in
   "UDP Usage Guidelines" [RFC8085], in which case a single round of
   fragment recovery should fit within the upper layer recovery timers.

   Fragments are sent in a round robin fashion: the sender sends all the
   fragments for a first time before it retries any lost fragment; lost
   fragments are retried in sequence, oldest first.  This mechanism
   enables the receiver to acknowledge fragments that were delayed in
   the network before they are retried.

   When a single frequency is used by contiguous hops, the sender should
   wait a reasonable amount of time between fragments so as to let a
   fragment progress a few hops and avoid hidden terminal issues.  This
   precaution is not required on channel hopping technologies such as
   Time Slotted Channel Hopping (TSCH) [RFC6554].

6.1.  Forwarding Fragments

   It is assumed that the first Fragment is large enough to carry the
   IPv6 header and make routing decisions.  If that is not so, then this
   specification MUST NOT be used.

   This specification extends the Virtual Reassembly Buffer (VRB)
   technique to forward fragments with no intermediate reconstruction of
   the entire packet.  It inherits operations like Datagram_tag
   Switching and using a timer to clean the VRB when the traffic dries
   up.
In more detail, the first fragment carries the IP header and it 623 is routed all the way from the fragmenting endpoint to the 624 reassembling endpoint. Upon the first fragment, the routers along 625 the path install a label-switched path (LSP), and the following 626 fragments are label-switched along that path. As a consequence, 627 alternate routes are not possible for individual fragments. The 628 Datagram_tag is used to carry the label, which is swapped at each hop. 629 All fragments follow the same path and fragments are delivered in the 630 order in which they are sent. 632 6.1.1. Upon the first fragment 634 In Route-Over mode, the source and destination MAC addresses in a 635 frame change at each hop. The label that is formed and placed in the 636 Datagram_tag is associated with the source MAC and is only valid (and 637 unique) for that source MAC. Upon a first fragment (i.e. with a 638 sequence of zero), a VRB and the associated LSP state are created for 639 the tuple (source MAC address, Datagram_tag) and the fragment is 640 forwarded along the IPv6 route that matches the destination IPv6 641 address in the IPv6 header as prescribed by 642 [I-D.ietf-6lo-minimal-fragment]. The LSP state makes it possible to match the 643 (previous MAC address, Datagram_tag) in an incoming fragment to the 644 tuple (next MAC address, swapped Datagram_tag) used in the forwarded 645 fragment and points at the VRB. In addition, the router also forms a 646 Reverse LSP state indexed by the MAC address of the next hop and the 647 swapped Datagram_tag. This Reverse LSP state also points at the VRB 648 and makes it possible to match the (next MAC address, swapped Datagram_tag) 649 found in an RFRAG Acknowledgment to the tuple (previous MAC address, 650 Datagram_tag) used when forwarding a Fragment Acknowledgment (RFRAG-ACK) 651 back to the sender endpoint. 653 6.1.2. Upon the next fragments 655 Upon a next fragment (i.e.
with a non-zero sequence), the router 656 looks up an LSP indexed by the tuple (MAC address, Datagram_tag) found 657 in the fragment. If it is found, the router forwards the fragment 658 using the associated VRB as prescribed by 659 [I-D.ietf-6lo-minimal-fragment]. 661 If the VRB for the tuple is not found, the router builds an RFRAG-ACK 662 to abort the transmission of the packet. The resulting message has 663 the following information: 665 o The source and destination MAC addresses are swapped from those 666 found in the fragment 668 o The Datagram_tag is set to the Datagram_tag found in the fragment 670 o A NULL bitmap is used to signal the abort condition 672 At this point the router is all set and can send the RFRAG-ACK back 673 to the previous router. The RFRAG-ACK should normally be forwarded 674 all the way to the source using the Reverse LSP state in the VRBs in 675 the intermediate routers as described in the next section. 677 6.2. Upon the RFRAG Acknowledgments 679 Upon an RFRAG-ACK, the router looks up a Reverse LSP indexed by the 680 tuple (MAC address, Datagram_tag), which are respectively the source 681 MAC address of the received frame and the received Datagram_tag. If 682 it is found, the router forwards the RFRAG-ACK using the associated 683 VRB as prescribed by [I-D.ietf-6lo-minimal-fragment], but using the 684 Reverse LSP so that the RFRAG-ACK flows back to the sender endpoint. 686 If the Reverse LSP is not found, the router MUST silently drop the 687 RFRAG-ACK message. 689 Either way, if the RFRAG-ACK indicates that the datagram was entirely 690 received (FULL bitmap), the router arms a short timer, and upon timeout, the 691 VRB and all the associated state are destroyed. Until the timer 692 elapses, fragments of that datagram may still be received, e.g. if 693 the RFRAG-ACK was lost on the way back and the source retried the 694 last fragment. In that case, the router forwards the fragment 695 according to the state in the VRB.
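The forwarding state described in Sections 6.1 and 6.2 can be pictured with the following minimal Python sketch. It is an illustration only and not part of this specification: the names VRB, FragmentForwarder, on_first_fragment, on_next_fragment and on_rfrag_ack are hypothetical, MAC addresses are plain strings, the tag allocator is a simple counter that ignores the width of the Datagram_tag field, and timers and frame formats are left out.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, Optional, Tuple

# (neighbor MAC address, Datagram_tag); the names are assumptions for this sketch.
Key = Tuple[str, int]


@dataclass
class VRB:
    """Per-datagram state shared by the LSP and the Reverse LSP."""
    prev_mac: str   # MAC address of the previous hop
    prev_tag: int   # Datagram_tag used by the previous hop
    next_mac: str   # MAC address of the next hop
    next_tag: int   # swapped Datagram_tag, chosen by this router


class FragmentForwarder:
    def __init__(self) -> None:
        self.lsp: Dict[Key, VRB] = {}          # indexed by (previous MAC, tag)
        self.reverse_lsp: Dict[Key, VRB] = {}  # indexed by (next MAC, swapped tag)
        self._tags = count(1)                  # simplistic allocator (assumption)

    def on_first_fragment(self, src_mac: str, tag: int, next_hop_mac: str) -> Key:
        """Create the VRB plus both LSP entries; return (next MAC, swapped tag)."""
        vrb = VRB(src_mac, tag, next_hop_mac, next(self._tags))
        self.lsp[(src_mac, tag)] = vrb
        self.reverse_lsp[(vrb.next_mac, vrb.next_tag)] = vrb
        return (vrb.next_mac, vrb.next_tag)

    def on_next_fragment(self, src_mac: str, tag: int) -> Optional[Key]:
        """Label-switch a later fragment; None means 'no VRB found', in which
        case the caller would answer src_mac with a NULL-bitmap RFRAG-ACK."""
        vrb = self.lsp.get((src_mac, tag))
        return None if vrb is None else (vrb.next_mac, vrb.next_tag)

    def on_rfrag_ack(self, src_mac: str, tag: int) -> Optional[Key]:
        """Switch an RFRAG-ACK back toward the sender; None means 'drop silently'."""
        vrb = self.reverse_lsp.get((src_mac, tag))
        return None if vrb is None else (vrb.prev_mac, vrb.prev_tag)
```

In this sketch a single VRB object is referenced from both tables, which mirrors the text: the LSP and the Reverse LSP both point at the same VRB, so destroying the VRB after the short timer would amount to removing both entries.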
697 This specification does not provide a method to discover the number 698 of hops or the minimal value of MTU along those hops. But should the 699 minimal MTU decrease, it is possible to retry a long fragment (say, 700 with a sequence of 5) with first a shorter fragment of the same sequence (5 701 again) and then one or more other fragments with a sequence that was 702 not used before (e.g., 13 and 14). 704 6.3. Cancelling a Fragmented Packet 706 A reset is signaled on the forward path with a pseudo fragment that 707 has the fragment_offset, sequence and fragment_size all set to 0, and 708 no data. 710 When the sender or a router on the way decides that a packet should 711 be dropped and the fragmentation process canceled, it generates a 712 reset pseudo fragment and forwards it down the fragment path. 714 Each router along the path forwards the pseudo fragment 715 based on the VRB state. If an acknowledgment is not requested, the 716 VRB and all associated state are destroyed. 718 Upon reception of the pseudo fragment, the receiver cleans up all 719 resources for the packet associated with the Datagram_tag. If an 720 acknowledgment is requested, the receiver responds with a NULL 721 bitmap. 723 The other way around, the receiver might need to cancel the processing 724 of a fragmented packet for internal reasons, for instance if it is 725 out of reassembly buffers, or considers that this packet is already 726 fully reassembled and passed to the upper layer. In that case, the 727 receiver SHOULD indicate so to the sender with a NULL bitmap in an 728 RFRAG Acknowledgment. Upon an acknowledgment with a NULL bitmap, the 729 sender endpoint MUST abort the transmission of the fragmented 730 datagram. 732 7. Management Considerations 734 7.1. Protocol Parameters 736 There is no particular configuration on the receiver, as echoing ECN 737 should always be on. The configuration only applies to the sender 738 that is in control of the transmission.
The management system SHOULD 739 be capable of providing the parameters below: 741 MinFragmentSize: The MinFragmentSize is the minimum value for the 742 Fragment_Size. 744 OptFragmentSize: The OptFragmentSize is the value for the 745 Fragment_Size that the sender should use to start with. 747 MaxFragmentSize: The MaxFragmentSize is the maximum value for the 748 Fragment_Size. It MUST be lower than the minimum MTU along the 749 path. A large value increases the chances of buffer bloat and 750 transmission loss. The value MUST be less than 512 if the unit 751 that is defined for the PHY layer is the octet. 757 MinWindowSize: The minimum value of Window_Size that the sender can 758 use. 760 OptWindowSize: The OptWindowSize is the value for the Window_Size 761 that the sender should use to start with. 763 MaxWindowSize: The maximum value of Window_Size that the sender can 764 use. The value MUST be less than 32. 766 UseECN: Indicates whether the sender should react to ECN. When the 767 sender reacts to ECN the sender SHOULD adapt the Window_Size 768 between MinWindowSize and MaxWindowSize and it MAY adapt the 769 Fragment_Size if that is supported. 771 InterFrameGap: Indicates a minimum amount of time between 772 transmissions. All packets to the same destination, and in 773 particular fragments, may be subject to receive-while-transmitting 774 and hidden terminal collisions with the next or 775 the previous transmission as the fragments progress along the 776 same path. The InterFrameGap protects the propagation of one 777 transmission before the next one is triggered and creates a 778 duty cycle that controls the ratio of air time and memory in 779 intermediate nodes that a particular datagram will use.
781 MinARQTimeOut: The minimum amount of time a node should wait for an 782 RFRAG Acknowledgment before it takes a next action. 784 OptARQTimeOut: The initial amount of time that a 785 sender should wait for an RFRAG Acknowledgment before it takes 786 a next action. 788 MaxARQTimeOut: The maximum amount of time a node should wait for an 789 RFRAG Acknowledgment before it takes a next action. 791 MaxFragRetries: The maximum number of retries for a particular 792 Fragment. 794 MaxDatagramRetries: The maximum number of retries from scratch for a 795 particular Datagram. 797 7.2. Observing the network 799 The management system should monitor the number of retries and of ECN 800 settings that can be observed from the perspective of both the 801 sender and the receiver, and may tune the optimum size of 802 Fragment_Size and of the Window_Size, OptFragmentSize and OptWindowSize 803 respectively, at the sender. The values should be bounded by the 804 expected number of hops and reduced beyond that when the number of 805 datagrams that can traverse an intermediate point may exceed its 806 capacity and cause a congestion loss. The InterFrameGap is another 807 tool that can be used to increase the spacing between fragments of the 808 same datagram and reduce the ratio of time when a particular 809 intermediate node holds a fragment of that datagram. 811 8. Security Considerations 813 The process of recovering fragments does not appear to create any 814 opening for new threats compared to "Transmission of IPv6 Packets over 815 IEEE 802.15.4 Networks" [RFC4944]. 817 9. IANA Considerations 819 Need extensions for formats defined in "Transmission of IPv6 Packets 820 over IEEE 802.15.4 Networks" [RFC4944]. 822 10. Acknowledgments 824 The author wishes to thank Michel Veillette, Dario Tedeschi, Laurent 825 Toutain, Thomas Watteyne and Michael Richardson for in-depth reviews 826 and comments.
Also many thanks to Jonathan Hui, Jay Werb, Christos 827 Polyzois, Soumitri Kolavennu, Pat Kinney, Margaret Wasserman, Richard 828 Kelsey, Carsten Bormann and Harry Courtice for their various 829 contributions. 831 11. References 833 11.1. Normative References 835 [I-D.ietf-6lo-minimal-fragment] 836 Watteyne, T., Bormann, C., and P. Thubert, "LLN Minimal 837 Fragment Forwarding", draft-ietf-6lo-minimal-fragment-01 838 (work in progress), March 2019. 840 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 841 Requirement Levels", BCP 14, RFC 2119, 842 DOI 10.17487/RFC2119, March 1997. 845 [RFC4944] Montenegro, G., Kushalnagar, N., Hui, J., and D. Culler, 846 "Transmission of IPv6 Packets over IEEE 802.15.4 847 Networks", RFC 4944, DOI 10.17487/RFC4944, September 2007. 850 [RFC6282] Hui, J., Ed. and P. Thubert, "Compression Format for IPv6 851 Datagrams over IEEE 802.15.4-Based Networks", RFC 6282, 852 DOI 10.17487/RFC6282, September 2011. 855 [RFC6554] Hui, J., Vasseur, JP., Culler, D., and V. Manral, "An IPv6 856 Routing Header for Source Routes with the Routing Protocol 857 for Low-Power and Lossy Networks (RPL)", RFC 6554, 858 DOI 10.17487/RFC6554, March 2012. 861 [RFC8025] Thubert, P., Ed. and R. Cragie, "IPv6 over Low-Power 862 Wireless Personal Area Network (6LoWPAN) Paging Dispatch", 863 RFC 8025, DOI 10.17487/RFC8025, November 2016. 866 [RFC8138] Thubert, P., Ed., Bormann, C., Toutain, L., and R. Cragie, 867 "IPv6 over Low-Power Wireless Personal Area Network 868 (6LoWPAN) Routing Header", RFC 8138, DOI 10.17487/RFC8138, 869 April 2017. 871 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 872 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 873 May 2017. 875 11.2. Informative References 877 [I-D.ietf-6tisch-architecture] 878 Thubert, P., "An Architecture for IPv6 over the TSCH mode 879 of IEEE 802.15.4", draft-ietf-6tisch-architecture-20 (work 880 in progress), March 2019.
882 [IEEE.802.15.4] 883 IEEE, "IEEE Standard for Low-Rate Wireless Networks", 884 IEEE Standard 802.15.4, DOI 10.1109/IEEE 885 P802.15.4-REVd/D01. 888 [RFC2914] Floyd, S., "Congestion Control Principles", BCP 41, 889 RFC 2914, DOI 10.17487/RFC2914, September 2000. 892 [RFC3031] Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol 893 Label Switching Architecture", RFC 3031, 894 DOI 10.17487/RFC3031, January 2001. 897 [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition 898 of Explicit Congestion Notification (ECN) to IP", 899 RFC 3168, DOI 10.17487/RFC3168, September 2001. 902 [RFC4919] Kushalnagar, N., Montenegro, G., and C. Schumacher, "IPv6 903 over Low-Power Wireless Personal Area Networks (6LoWPANs): 904 Overview, Assumptions, Problem Statement, and Goals", 905 RFC 4919, DOI 10.17487/RFC4919, August 2007. 908 [RFC4963] Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly 909 Errors at High Data Rates", RFC 4963, 910 DOI 10.17487/RFC4963, July 2007. 913 [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion 914 Control", RFC 5681, DOI 10.17487/RFC5681, September 2009. 917 [RFC6298] Paxson, V., Allman, M., Chu, J., and M. Sargent, 918 "Computing TCP's Retransmission Timer", RFC 6298, 919 DOI 10.17487/RFC6298, June 2011. 922 [RFC6606] Kim, E., Kaspar, D., Gomez, C., and C. Bormann, "Problem 923 Statement and Requirements for IPv6 over Low-Power 924 Wireless Personal Area Network (6LoWPAN) Routing", 925 RFC 6606, DOI 10.17487/RFC6606, May 2012. 928 [RFC7554] Watteyne, T., Ed., Palattella, M., and L. Grieco, "Using 929 IEEE 802.15.4e Time-Slotted Channel Hopping (TSCH) in the 930 Internet of Things (IoT): Problem Statement", RFC 7554, 931 DOI 10.17487/RFC7554, May 2015. 934 [RFC7567] Baker, F., Ed. and G. Fairhurst, Ed., "IETF 935 Recommendations Regarding Active Queue Management", 936 BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015.
939 [RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage 940 Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085, 941 March 2017. 943 [RFC8087] Fairhurst, G. and M. Welzl, "The Benefits of Using 944 Explicit Congestion Notification (ECN)", RFC 8087, 945 DOI 10.17487/RFC8087, March 2017. 948 [RFC8200] Deering, S. and R. Hinden, "Internet Protocol, Version 6 949 (IPv6) Specification", STD 86, RFC 8200, 950 DOI 10.17487/RFC8200, July 2017. 953 [RFC8201] McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed., 954 "Path MTU Discovery for IP version 6", STD 87, RFC 8201, 955 DOI 10.17487/RFC8201, July 2017. 958 Appendix A. Rationale 960 There are a number of uses for large packets in Wireless Sensor 961 Networks. Such uses may not be the most typical or represent the 962 largest amount of traffic over the LLN; however, the associated 963 functionality can be critical enough to justify extra care for 964 ensuring effective transport of large packets across the LLN. 966 The list of those uses includes: 968 Towards the LLN node: 970 Firmware update: For example, a new version of the LLN node 971 software is downloaded from a system manager over unicast or 972 multicast services. Such a reflashing operation typically 973 involves updating a large number of similar LLN nodes over a 974 relatively short period of time. 976 Packages of Commands: A number of commands or a full 977 configuration can be packaged as a single message to ensure 978 consistency and enable atomic execution or complete rollback. 979 Until such commands are fully received and interpreted, the 980 intended operation will not take effect. 982 From the LLN node: 984 Waveform captures: A number of consecutive samples are measured 985 at a high rate for a short time and then transferred from a 986 sensor to a gateway or an edge server as a single large report. 988 Data logs: LLN nodes may generate large logs of sampled data for 989 later extraction.
LLN nodes may also generate system logs to 990 assist in diagnosing problems on the node or network. 992 Large data packets: Rich data types might require more than one 993 fragment. 995 Uncontrolled firmware download or waveform upload can easily result 996 in a massive increase of the traffic and saturate the network. 998 When a fragment is lost in transmission, the lack of recovery in the 999 original fragmentation system of RFC 4944 implies that all fragments 1000 are resent, further contributing to the congestion that caused the 1001 initial loss, and potentially leading to congestion collapse. 1003 This saturation may lead to excessive radio interference, or random 1004 early discard (leaky bucket) in relaying nodes. Additional queuing 1005 and memory congestion may result while waiting for a low-power next 1006 hop to emerge from its sleeping state. 1008 Considering that RFC 4944 defines an MTU of 1280 bytes and that in 1009 most incarnations (except 802.15.4g) an IEEE Std. 802.15.4 frame can 1010 limit the MAC payload to as few as 74 bytes, a packet might be 1011 fragmented into at least 18 fragments at the 6LoWPAN shim layer. 1012 Taking into account the worst-case header overhead for 6LoWPAN 1013 Fragmentation and Mesh Addressing headers will increase the number of 1014 required fragments to around 32. This level of fragmentation is much 1015 higher than that traditionally experienced over the Internet with 1016 IPv4 fragments. At the same time, the use of radios increases the 1017 probability of transmission loss and Mesh-Under techniques compound 1018 that risk over multiple hops. 1020 Mechanisms such as TCP or application-layer segmentation could be 1021 used to support end-to-end reliable transport. One option to support 1022 bulk data transfer over a frame-size-constrained LLN is to set the 1023 Maximum Segment Size to fit within the link maximum frame size. 1024 Doing so, however, can add significant header overhead to each 1025 802.15.4 frame.
In addition, deploying such a mechanism requires 1026 that the end-to-end transport is aware of the delivery properties of 1027 the underlying LLN, which is a layer violation, and difficult to 1028 achieve from the far end of the IPv6 network. 1030 Appendix B. Requirements 1032 For one-hop communications, a number of Low Power and Lossy Network 1033 (LLN) link-layers propose a local acknowledgment mechanism that is 1034 enough to detect and recover the loss of fragments. In a multihop 1035 environment, an end-to-end fragment recovery mechanism might be a 1036 good complement to a hop-by-hop MAC level recovery. This draft 1037 introduces a simple protocol to recover individual fragments between 1038 6LoWPAN endpoints that may be multiple hops away. The method 1039 addresses the following requirements of an LLN: 1041 Number of fragments 1043 The recovery mechanism must support highly fragmented packets, 1044 with a maximum of 32 fragments per packet. 1046 Minimum acknowledgment overhead 1048 Because the radio is half duplex, and because of silent time spent 1049 in the various medium access mechanisms, an acknowledgment 1050 consumes roughly as many resources as a data fragment. 1052 The new end-to-end fragment recovery mechanism should be able to 1053 acknowledge multiple fragments in a single message and not require 1054 an acknowledgment at all if fragments are already protected at a 1055 lower layer. 1057 Controlled latency 1059 The recovery mechanism must succeed or give up within the time 1060 boundary imposed by the recovery process of the Upper Layer 1061 Protocols. 1063 Optional congestion control 1065 The aggregation of multiple concurrent flows may lead to the 1066 saturation of the radio network and congestion collapse. 1068 The recovery mechanism should provide means for controlling the 1069 number of fragments in transit over the LLN. 1071 Appendix C.
Considerations On Flow Control 1073 Considering that a multi-hop LLN can be a very sensitive environment 1074 due to the limited queuing capabilities of a large population of its 1075 nodes, this draft recommends a simple and conservative approach to 1076 Congestion Control, based on TCP congestion avoidance. 1078 Congestion on the forward path is assumed in case of packet loss, and 1079 packet loss is assumed upon time out. The draft makes it possible to control 1080 the number of outstanding fragments, that is, fragments that have been transmitted but 1081 for which an acknowledgment has not yet been received. It must be noted 1082 that the number of outstanding fragments should not exceed the number 1083 of hops in the network, but the way to figure the number of hops is 1084 out of scope for this document. 1086 Congestion on the forward path can also be indicated by an Explicit 1087 Congestion Notification (ECN) mechanism. Though whether and how ECN 1088 [RFC3168] is carried out over the LoWPAN is out of scope, this draft 1089 provides a way for the destination endpoint to echo an ECN indication 1090 back to the source endpoint in an acknowledgment message as 1091 represented in Figure 5 in Section 5.2. 1093 It must be noted that congestion and collision are different topics. 1094 In particular, when a mesh operates on the same channel over multiple 1095 hops, then the forwarding of a fragment over a certain hop may 1096 collide with the forwarding of a next fragment that is following over 1097 a previous hop but in the same interference domain. This draft enables 1098 an end-to-end flow control, but leaves it to the sender stack to pace 1099 individual fragments within a transmit window, so that a given 1100 fragment is sent only when the previous fragment has had a chance to 1101 progress beyond the interference domain of this hop.
In the case of 1102 6TiSCH [I-D.ietf-6tisch-architecture], which operates over the 1103 Time Slotted Channel Hopping [RFC7554] (TSCH) mode of operation of 1104 IEEE 802.15.4, a fragment is forwarded over a different channel at a 1105 different time and it makes full sense to transmit the next fragment 1106 as soon as the previous fragment has had its chance to be forwarded 1107 at the next hop. 1109 From the standpoint of a source 6LoWPAN endpoint, an outstanding 1110 fragment is a fragment that was sent but for which no explicit 1111 acknowledgment was received yet. This means that the fragment might 1112 be on the way, received but not yet acknowledged, or the 1113 acknowledgment might be on the way back. It is also possible that 1114 either the fragment or the acknowledgment was lost on the way. 1116 From the sender standpoint, all outstanding fragments might still be 1117 in the network and contribute to its congestion. There is an 1118 assumption, though, that after a certain amount of time, a frame is 1119 either received or lost, so it is not causing congestion anymore. 1120 This amount of time can be estimated based on the round trip delay 1121 between the 6LoWPAN endpoints. The method detailed in [RFC6298] is 1122 recommended for that computation. 1124 The reader is encouraged to read through "Congestion Control 1125 Principles" [RFC2914]. Additionally [RFC7567] and [RFC5681] provide 1126 deeper information on why this mechanism is needed and how TCP 1127 handles Congestion Control. Basically, the goal here is to manage 1128 the number of fragments present in the network; this is achieved by 1129 reducing the number of outstanding fragments over a congested path 1130 by throttling the sources. 1132 Section 6 describes how the sender decides how many fragments are 1133 (re)sent before an acknowledgment is required, and how the sender 1134 adapts that number to the network conditions.
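As an illustration of the retry timer computation that this appendix delegates to [RFC6298], the following Python sketch applies the smoothing rules of that RFC (SRTT, RTTVAR, alpha = 1/8, beta = 1/4, K = 4) to round trip samples measured between a fragment and its RFRAG Acknowledgment. The class name ArqTimer, its method names, the clock granularity default, and the choice to clamp the result between MinARQTimeOut and MaxARQTimeOut are assumptions made for this sketch; this draft does not mandate them.

```python
from typing import Optional


class ArqTimer:
    """Retry timer sketch following the RFC 6298 update rules."""
    ALPHA = 1 / 8   # SRTT gain from RFC 6298
    BETA = 1 / 4    # RTTVAR gain from RFC 6298
    K = 4           # variance multiplier from RFC 6298

    def __init__(self, min_arq: float, opt_arq: float, max_arq: float,
                 granularity: float = 0.01) -> None:
        # min_arq / opt_arq / max_arq stand for MinARQTimeOut, OptARQTimeOut
        # and MaxARQTimeOut (seconds); using them as bounds is an assumption.
        self.min_arq = min_arq
        self.max_arq = max_arq
        self.granularity = granularity      # clock granularity G (assumed value)
        self.srtt: Optional[float] = None   # smoothed round trip time
        self.rttvar = 0.0                   # round trip time variation
        self.rto = opt_arq                  # start from the configured optimum

    def on_rtt_sample(self, r: float) -> float:
        """Update the timer from one measured round trip time, in seconds."""
        if self.srtt is None:
            # First measurement: SRTT = R, RTTVAR = R / 2.
            self.srtt = r
            self.rttvar = r / 2
        else:
            # RTTVAR is updated first, using the previous SRTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - r)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * r
        rto = self.srtt + max(self.granularity, self.K * self.rttvar)
        self.rto = min(self.max_arq, max(self.min_arq, rto))
        return self.rto

    def on_timeout(self) -> float:
        """Back off exponentially when no RFRAG Acknowledgment arrives in time."""
        self.rto = min(self.max_arq, self.rto * 2)
        return self.rto
```

A sender would typically call on_rtt_sample() when an RFRAG Acknowledgment arrives for a fragment that was not retried, and on_timeout() when the timer fires, treating the latter as a sign of loss and therefore of possible congestion.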
1136 Author's Address 1138 Pascal Thubert (editor) 1139 Cisco Systems, Inc 1140 Building D 1141 45 Allee des Ormes - BP1200 1142 MOUGINS - Sophia Antipolis 06254 1143 FRANCE 1145 Phone: +33 497 23 26 34 1146 Email: pthubert@cisco.com