idnits 2.17.00 (12 Aug 2021) /tmp/idnits23874/draft-ietf-6lo-fragment-recovery-12.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year (Using the creation date from RFC4944, updated by this document, for RFC5378 checks: 2005-07-13) -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (11 February 2020) is 830 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Outdated reference: draft-ietf-6lo-minimal-fragment has been published as RFC 8930 == Outdated reference: A later version (-02) exists of draft-ietf-lwig-6lowpan-virtual-reassembly-01 == Outdated reference: draft-ietf-intarea-frag-fragile has been published as RFC 8900 == Outdated reference: draft-ietf-6tisch-architecture has been published as RFC 9030 Summary: 0 errors (**), 0 flaws (~~), 5 warnings (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 6lo P. Thubert, Ed. 3 Internet-Draft Cisco Systems 4 Updates: 4944 (if approved) 11 February 2020 5 Intended status: Standards Track 6 Expires: 14 August 2020 8 6LoWPAN Selective Fragment Recovery 9 draft-ietf-6lo-fragment-recovery-12 11 Abstract 13 This draft updates RFC 4944 with a simple protocol to recover 14 individual fragments across a route-over mesh network, with a minimal 15 flow control to protect the network against bloat. 17 Status of This Memo 19 This Internet-Draft is submitted in full conformance with the 20 provisions of BCP 78 and BCP 79. 22 Internet-Drafts are working documents of the Internet Engineering 23 Task Force (IETF). Note that other groups may also distribute 24 working documents as Internet-Drafts. The list of current Internet- 25 Drafts is at https://datatracker.ietf.org/drafts/current/. 27 Internet-Drafts are draft documents valid for a maximum of six months 28 and may be updated, replaced, or obsoleted by other documents at any 29 time. It is inappropriate to use Internet-Drafts as reference 30 material or to cite them other than as "work in progress." 
32 This Internet-Draft will expire on 14 August 2020. 34 Copyright Notice 36 Copyright (c) 2020 IETF Trust and the persons identified as the 37 document authors. All rights reserved. 39 This document is subject to BCP 78 and the IETF Trust's Legal 40 Provisions Relating to IETF Documents (https://trustee.ietf.org/ 41 license-info) in effect on the date of publication of this document. 42 Please review these documents carefully, as they describe your rights 43 and restrictions with respect to this document. Code Components 44 extracted from this document must include Simplified BSD License text 45 as described in Section 4.e of the Trust Legal Provisions and are 46 provided without warranty as described in the Simplified BSD License. 48 Table of Contents 50 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 51 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 52 2.1. BCP 14 . . . . . . . . . . . . . . . . . . . . . . . . . 4 53 2.2. References . . . . . . . . . . . . . . . . . . . . . . . 4 54 2.3. New Terms . . . . . . . . . . . . . . . . . . . . . . . . 5 55 3. Updating RFC 4944 . . . . . . . . . . . . . . . . . . . . . . 6 56 4. Extending draft-ietf-6lo-minimal-fragment . . . . . . . . . . 6 57 4.1. Slack in the First Fragment . . . . . . . . . . . . . . . 7 58 4.2. Gap between frames . . . . . . . . . . . . . . . . . . . 7 59 4.3. Modifying the First Fragment . . . . . . . . . . . . . . 7 60 5. New Dispatch types and headers . . . . . . . . . . . . . . . 8 61 5.1. Recoverable Fragment Dispatch type and Header . . . . . . 8 62 5.2. RFRAG Acknowledgment Dispatch type and Header . . . . . . 11 63 6. Fragments Recovery . . . . . . . . . . . . . . . . . . . . . 12 64 6.1. Forwarding Fragments . . . . . . . . . . . . . . . . . . 14 65 6.1.1. Receiving the first fragment . . . . . . . . . . . . 15 66 6.1.2. Receiving the next fragments . . . . . . . . . . . . 15 67 6.2. Receiving RFRAG Acknowledgments . . . . . . . . . . . . . 16 68 6.3. 
Aborting the Transmission of a Fragmented Packet . . . . 16 69 6.4. Applying Recoverable Fragmentation along a Diverse 70 Path . . . . . . . . . . . . . . . . . . . . . . . . . . 17 71 7. Management Considerations . . . . . . . . . . . . . . . . . . 18 72 7.1. Protocol Parameters . . . . . . . . . . . . . . . . . . . 18 73 7.2. Observing the network . . . . . . . . . . . . . . . . . . 21 74 8. Security Considerations . . . . . . . . . . . . . . . . . . . 21 75 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 22 76 10. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 22 77 11. Normative References . . . . . . . . . . . . . . . . . . . . 22 78 12. Informative References . . . . . . . . . . . . . . . . . . . 23 79 Appendix A. Rationale . . . . . . . . . . . . . . . . . . . . . 26 80 Appendix B. Requirements . . . . . . . . . . . . . . . . . . . . 28 81 Appendix C. Considerations on Flow Control . . . . . . . . . . . 28 82 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 30 84 1. Introduction 86 In most Low Power and Lossy Network (LLN) applications, the bulk of 87 the traffic consists of small chunks of data (on the order of a few 88 bytes to a few tens of bytes) at a time. Given that an IEEE Std. 89 802.15.4 [IEEE.802.15.4] frame can carry a payload of 74 bytes or 90 more, fragmentation is usually not required. However, and though 91 this happens only occasionally, a number of mission critical 92 applications do require the capability to transfer larger chunks of 93 data, for instance to support the firmware upgrade of the LLN nodes 94 or the extraction of logs from LLN nodes. In the former case, the 95 large chunk of data is transferred to the LLN node, whereas in the 96 latter, the large chunk flows away from the LLN node. In both cases, 97 the size can be on the order of 10 kilobytes or more and an end-to- 98 end reliable transport is required. 
100 "Transmission of IPv6 Packets over IEEE 802.15.4 Networks" [RFC4944] 101 defines the original 6LoWPAN datagram fragmentation mechanism for 102 LLNs. One critical issue with this original design is that routing 103 an IPv6 [RFC8200] packet across a route-over mesh requires 104 reassembling the full packet at each hop, which may cause latency 105 along a path and an overall buffer bloat in the network. The "6TiSCH 106 Architecture" [I-D.ietf-6tisch-architecture] recommends using a 107 fragment forwarding (FF) technique to alleviate those undesirable 108 effects. "LLN Minimal Fragment Forwarding" 109 [I-D.ietf-6lo-minimal-fragment] specifies the general behavior that 110 all FF techniques including this specification follow, and presents 111 the associated caveats. In particular, the routing information is 112 fully indicated in the first fragment, which is always forwarded 113 first. A state is formed and used to forward all the next fragments 114 along the same path. The datagram_tag is locally significant to the 115 Layer-2 source of the packet and is swapped at each hop. 117 "Virtual reassembly buffers in 6LoWPAN" 118 [I-D.ietf-lwig-6lowpan-virtual-reassembly] (VRB) proposes a FF 119 technique that is compatible with [RFC4944] without the need to 120 define a new protocol. However, adding that capability alone to the 121 local implementation of the original 6LoWPAN fragmentation would not 122 address the inherent fragility of fragmentation (see 123 [I-D.ietf-intarea-frag-fragile]) in particular the issues of 124 resources locked on the receiver and the wasted transmissions due to 125 the loss of a single fragment in a whole datagram. [Kent] compares 126 the unreliable delivery of fragments with a mechanism it calls 127 "selective acknowledgements" that recovers the loss of a fragment 128 individually. The paper illustrates the benefits that can be derived 129 from such a method in figures 1, 2 and 3, on pages 6 and 7. 
130 [RFC4944] has no selective recovery and the whole datagram fails when 131 one fragment is not delivered to the destination 6LoWPAN endpoint. 132 Constrained memory resources are blocked on the receiver until the 133 receiver times out, possibly causing the loss of subsequent packets 134 that cannot be received for the lack of buffers. 136 That problem is exacerbated when forwarding fragments over multiple 137 hops since a loss at an intermediate hop will not be discovered by 138 either the source or the destination, and the source will keep on 139 sending fragments, wasting even more resources in the network and 140 possibly contributing to the condition that caused the loss to no 141 avail since the datagram cannot arrive in its entirety. RFC 4944 is 142 also missing signaling to abort a multi-fragment transmission at any 143 time and from either end, and, if the capability to forward fragments 144 is implemented, clean up the related state in the network. It is 145 also lacking flow control capabilities to avoid participating in 146 congestion that may in turn cause the loss of a fragment and 147 potentially the retransmission of the full datagram. 149 This specification provides a method to forward fragments across a 150 multi-hop route-over mesh, and a selective acknowledgment to recover 151 individual fragments between 6LoWPAN endpoints. The method is 152 designed to limit congestion loss in the network and addresses the 153 requirements that are detailed in Appendix B. 155 2. Terminology 157 2.1. BCP 14 159 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 160 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 161 "OPTIONAL" in this document are to be interpreted as described in BCP 162 14 [RFC2119][RFC8174] when, and only when, they appear in all 163 capitals, as shown here. 165 2.2. 
References 167 In this document, readers will encounter terms and concepts that are 168 discussed in "Problem Statement and Requirements for IPv6 over 169 Low-Power Wireless Personal Area Network (6LoWPAN) Routing" [RFC6606]. 171 "LLN Minimal Fragment Forwarding" [I-D.ietf-6lo-minimal-fragment] 172 introduces the generic concept of a Virtual Reassembly Buffer (VRB) 173 and specifies behaviors and caveats that are common to a large 174 family of FF techniques including this one, which fully inherits from 175 that specification. 177 Past experience with fragmentation has shown that misassociated or 178 lost fragments can lead to poor network behavior and, occasionally, 179 trouble at the application layer. The reader is encouraged to read 180 "IPv4 Reassembly Errors at High Data Rates" [RFC4963] and follow the 181 references for more information. 183 That experience led to the definition of the "Path MTU discovery" 184 [RFC8201] (PMTUD) protocol that limits fragmentation over the 185 Internet. 187 Specifically in the case of UDP, valuable additional information can 188 be found in "UDP Usage Guidelines for Application Designers" 189 [RFC8085]. 191 Readers are expected to be familiar with all the terms and concepts 192 that are discussed in "IPv6 over Low-Power Wireless Personal Area 193 Networks (6LoWPANs): Overview, Assumptions, Problem Statement, and 194 Goals" [RFC4919] and "Transmission of IPv6 Packets over IEEE 802.15.4 195 Networks" [RFC4944]. 197 "The Benefits of Using Explicit Congestion Notification (ECN)" 198 [RFC8087] provides useful information on the potential benefits and 199 pitfalls of using ECN. 201 Quoting the "Multiprotocol Label Switching (MPLS) Architecture" 202 [RFC3031]: with MPLS, 'packets are "labeled" before they are 203 forwarded' along a Label Switched Path (LSP). At subsequent hops, 204 there is no further analysis of the packet's network layer header. 
205 Rather, the label is used as an index into a table which specifies 206 the next hop, and a new label. The MPLS technique is leveraged in 207 the present specification to forward fragments that actually do not 208 have a network layer header, since the fragmentation occurs below IP. 210 2.3. New Terms 212 This specification uses the following terms: 214 6LoWPAN endpoints: The LLN nodes in charge of generating or 215 expanding a 6LoWPAN header from/to a full IPv6 packet. The 216 6LoWPAN endpoints are the points where fragmentation and 217 reassembly take place. 219 Compressed Form: This specification uses the generic term Compressed 220 Form to refer to the format of a datagram after the action of 221 [RFC6282] and possibly [RFC8138] for RPL [RFC6550] artifacts. 223 datagram_size: The size of the datagram in its Compressed Form 224 before it is fragmented. The datagram_size is expressed in a unit 225 that depends on the MAC layer technology, by default a byte. 227 datagram_tag: An identifier of a datagram that is locally unique to 228 the Layer-2 sender. Associated with the MAC address of the 229 sender, this becomes a globally unique identifier for the 230 datagram. 232 fragment_offset: The offset of a particular fragment of a datagram 233 in its Compressed Form. The fragment_offset is expressed in a 234 unit that depends on the MAC layer technology and is by default a 235 byte. 237 RFRAG: Recoverable Fragment 238 RFRAG-ACK: Recoverable Fragment Acknowledgment 240 RFRAG Acknowledgment Request: An RFRAG with the Acknowledgment 241 Request flag ('X' flag) set. 243 NULL bitmap: Refers to a bitmap with all bits set to zero. 245 FULL bitmap: Refers to a bitmap with all bits set to one. 247 Forward: The direction of an LSP, followed by the RFRAG. 249 Reverse: The reverse direction of an LSP, taken by the RFRAG- 250 ACK. 252 3. 
Updating RFC 4944 254 This specification updates the fragmentation mechanism that is 255 specified in "Transmission of IPv6 Packets over IEEE 802.15.4 256 Networks" [RFC4944] for use in route-over LLNs by providing a model 257 where fragments can be forwarded end-to-end across a 6LoWPAN LLN, and 258 where fragments that are lost on the way can be recovered 259 individually. A new format for fragments is introduced and new 260 dispatch types are defined in Section 5. 262 [RFC8138] allows modifying the size of a packet en route by removing 263 the consumed hops in a compressed Routing Header. This requires that 264 fragment_offset and datagram_size (see Section 2.3) are also modified 265 en route, which is difficult to do in the uncompressed form. This 266 specification expresses those fields in the Compressed Form and 267 allows modifying them en route (see Section 4.3) easily. 269 Note that consistent with Section 2 of [RFC6282], for the 270 fragmentation mechanism described in Section 5.3 of [RFC4944], any 271 header that cannot fit within the first fragment MUST NOT be 272 compressed when using the fragmentation mechanism described in this 273 specification. 275 4. Extending draft-ietf-6lo-minimal-fragment 277 This specification implements the generic FF technique specified in 278 "LLN Minimal Fragment Forwarding" [I-D.ietf-6lo-minimal-fragment] in 279 a fashion that enables end-to-end recovery of fragments and some 280 degree of flow control. 282 4.1. Slack in the First Fragment 284 [I-D.ietf-6lo-minimal-fragment] allows for refragmenting in 285 intermediate nodes, meaning that some bytes from a given fragment may 286 be left in the VRB to be added to the next fragment. 
The reason for 287 this happening would be the need for space in the outgoing fragment 288 that was not needed in the incoming fragment, for instance because 289 the 6LoWPAN Header Compression is not as efficient on the outgoing 290 link, e.g., if the Interface ID (IID) of the source IPv6 address is 291 elided by the originator on the first hop because it matches the 292 source MAC address, but cannot be on the next hops because the source 293 MAC address changes. 295 This specification cannot allow this operation since fragments are 296 recovered end-to-end based on a sequence number. This means that the 297 fragments that contain a 6LoWPAN-compressed header MUST have enough 298 slack to enable a less efficient compression in the next hops that 299 still fits in one MAC frame. For instance, if the IID of the source 300 IPv6 address is elided by the originator, then it MUST compute the 301 Fragment_Size as if the MTU was 8 bytes less. This way, the next hop 302 can restore the source IID to the first fragment without impacting 303 the second fragment. 305 4.2. Gap between frames 307 This specification introduces a concept of an inter-frame gap, which 308 is a configurable interval of time between transmissions to a same 309 next hop. In the case of half duplex interfaces, this inter-frame 310 gap ensures that the next hop has completed processing of the 311 previous frame and is capable of receiving the next one. 313 In the case of a mesh operating at a single frequency with 314 omnidirectional antennas, a larger inter-frame gap is required to 315 protect the frame against hidden terminal collisions with the 316 previous frame of a same flow that is still progressing along a 317 common path. 319 The inter-frame gap is useful even for unfragmented datagrams, but it 320 becomes a necessity for fragments that are typically generated in a 321 fast sequence and are all sent over the exact same path. 323 4.3. 
Modifying the First Fragment 325 The compression of the Hop Limit, of the source and destination 326 addresses in the IPv6 Header, and of the Routing Header may change en 327 route in a Route-Over mesh LLN. If the size of the first fragment is 328 modified, then the intermediate node MUST adapt the datagram_size to 329 reflect that difference. 331 The intermediate node MUST also save the difference of datagram_size 332 of the first fragment in the VRB and add it to the datagram_size and 333 to the fragment_offset of all the subsequent fragments for that 334 datagram. 336 5. New Dispatch types and headers 338 This specification enables the 6LoWPAN fragmentation sublayer to 339 provide an MTU up to 2048 bytes to the upper layer, which can be the 340 6LoWPAN Header Compression sublayer that is defined in the 341 "Compression Format for IPv6 Datagrams" [RFC6282] specification. In 342 order to achieve this, this specification enables the fragmentation 343 and the reliable transmission of fragments over a multihop 6LoWPAN 344 mesh network. 346 This specification provides a technique that is derived from MPLS to 347 forward individual fragments across a 6LoWPAN route-over mesh without 348 reassembly at each hop. The datagram_tag is used as a label; it is 349 locally unique to the node that owns the source MAC address of the 350 fragment, so together the MAC address and the label can identify the 351 fragment globally. A node may build the datagram_tag in its own 352 locally-significant way, as long as the chosen datagram_tag stays 353 unique to the particular datagram for the lifetime of that datagram. 354 It results that the label does not need to be globally unique but 355 also that it must be swapped at each hop as the source MAC address 356 changes. 358 This specification extends RFC 4944 [RFC4944] with 2 new Dispatch 359 types, for Recoverable Fragment (RFRAG) and for the RFRAG 360 Acknowledgment back. 
The new 6LoWPAN Dispatch types are taken from 361 Page 0 [RFC8025] as indicated in Table 1 in Section 9. 363 In the following sections, a "datagram_tag" extends the semantics 364 defined in Section 5.3 of [RFC4944], "Fragmentation Type and Header". 365 The datagram_tag is a locally unique identifier for the datagram from 366 the perspective of the sender. This means that the datagram_tag 367 identifies a datagram uniquely in the network when associated with 368 the source of the datagram. As the datagram gets forwarded, the 369 source changes and the datagram_tag must be swapped as detailed in 370 [I-D.ietf-6lo-minimal-fragment]. 372 5.1. Recoverable Fragment Dispatch type and Header 374 In this specification, if the packet is compressed then the size and 375 offset of the fragments are expressed with respect to the Compressed 376 Form of the packet as opposed to the uncompressed (native) 377 packet form. 379 The format of the fragment header is shown in Figure 1. It is the 380 same for all fragments. The format has a length and an offset, as 381 well as a sequence field. This would be redundant if the offset were 382 computed as the product of the sequence and the length, but this is 383 not the case. The position of a fragment in the reassembly buffer is 384 neither correlated with the value of the sequence field nor with the 385 order in which the fragments are received. This enables out-of- 386 sequence subfragmenting, e.g., a fragment seq. 5 that is retried end- 387 to-end as smaller fragments seq. 5, 13 and 14 due to a change of MTU 388 along the path between the 6LoWPAN endpoints. 
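As a rough illustration of the header described in this section, the following Python sketch packs and parses an RFRAG header and shows how a retried fragment could be split into smaller pieces that carry independent sequence and offset values. It is not part of the specification. The function names (pack_rfrag, unpack_rfrag, subfragment) are hypothetical, the field widths are those given in this section (7-bit dispatch, 1-bit E, 8-bit datagram_tag, 1-bit X, 5-bit sequence, 10-bit Fragment_Size, 16-bit fragment_offset), and the byte values in the usage note are made up.

```python
import struct

RFRAG_DISPATCH = 0b1110100  # 7-bit dispatch pattern from the RFRAG header figure


def pack_rfrag(ecn, tag, ack_request, sequence, size, offset):
    """Pack a 6-byte RFRAG header: 7+1+8 bits, then 1+5+10+16 bits."""
    if not (0 <= tag < 1 << 8 and 0 <= sequence < 1 << 5
            and 0 <= size < 1 << 10 and 0 <= offset < 1 << 16):
        raise ValueError("field out of range")
    first = (RFRAG_DISPATCH << 1) | int(bool(ecn))
    word = ((int(bool(ack_request)) << 31) | (sequence << 26)
            | (size << 16) | offset)
    return struct.pack("!BBI", first, tag, word)  # network byte order


def unpack_rfrag(header):
    """Inverse of pack_rfrag; returns a dict of the header fields."""
    first, tag, word = struct.unpack("!BBI", header[:6])
    if first >> 1 != RFRAG_DISPATCH:
        raise ValueError("not an RFRAG dispatch")
    return {
        "ecn": bool(first & 1),
        "tag": tag,
        "ack_request": bool(word >> 31),
        "sequence": (word >> 26) & 0x1F,
        "size": (word >> 16) & 0x3FF,
        "offset": word & 0xFFFF,
    }


def subfragment(offset, size, max_size, sequences):
    """Split the range [offset, offset + size) into pieces of at most
    max_size bytes; each piece takes the next caller-chosen sequence."""
    starts = range(offset, offset + size, max_size)
    if len(sequences) < len(starts):
        raise ValueError("not enough sequence values for the pieces")
    return [(seq, start, min(max_size, offset + size - start))
            for seq, start in zip(sequences, starts)]
```

For instance, under these assumptions, subfragment(400, 240, 80, [5, 13, 14]) splits a 240-byte fragment that used sequence 5 into three 80-byte pieces with sequences 5, 13 and 14. The offsets stay contiguous while the sequence values do not, which is why the header carries both fields.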
390 1 2 3 391 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 392 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 393 |1 1 1 0 1 0 0|E| datagram_tag | 394 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 395 |X| sequence| Fragment_Size | fragment_offset | 396 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 398 X set == Ack-Request 400 Figure 1: RFRAG Dispatch type and Header 402 There is no requirement on the receiver to check for contiguity of 403 the received fragments, and the sender MUST ensure that when all 404 fragments are acknowledged, then the datagram is fully received. 405 This may be useful in particular in the case where the MTU changes 406 and a fragment sequence is retried with a smaller Fragment_Size, the 407 remainder of the original fragment being retried with new sequence 408 values. 410 The first fragment is recognized by a sequence of 0; it carries its 411 Fragment_Size and the datagram_size of the compressed packet before 412 it is fragmented, whereas the other fragments carry their 413 Fragment_Size and fragment_offset. The last fragment for a datagram 414 is recognized when its fragment_offset and its Fragment_Size add up 415 to the datagram_size. 417 Recoverable Fragments are sequenced and a bitmap is used in the RFRAG 418 Acknowledgment to indicate the received fragments by setting the 419 individual bits that correspond to their sequence. 421 X: 1 bit; Ack-Request: when set, the sender requires an RFRAG 422 Acknowledgment from the receiver. 424 E: 1 bit; Explicit Congestion Notification; the "E" flag is reset by 425 the source of the fragment and set by intermediate routers to 426 signal that this fragment experienced congestion along its path. 428 Fragment_Size: 10-bit unsigned integer; the size of this fragment in 429 a unit that depends on the MAC layer technology. Unless 430 overridden by a more specific specification, that unit is the 431 octet, which allows fragments up to 1023 bytes. 
433 datagram_tag: 8 bits; an identifier of the datagram that is locally 434 unique to the sender. 436 Sequence: 5-bit unsigned integer; the sequence number of the 437 fragment in the acknowledgement bitmap. Fragments are numbered 438 [0..N] where N is in [0..31]. A Sequence of 0 indicates the first 439 fragment in a datagram, but non-zero values are not indicative of 440 the position in the reassembly buffer. 442 Fragment_offset: 16-bit unsigned integer. 444 When the Fragment_offset is set to a non-0 value, its semantics 445 depend on the value of the Sequence field as follows: 447 * For a first fragment (i.e., with a Sequence of 0), this field 448 indicates the datagram_size of the compressed datagram, to help 449 the receiver allocate an adapted buffer for the reception and 450 reassembly operations. The fragment may be stored for local 451 reassembly. Alternatively, it may be routed based on the 452 destination IPv6 address. In that case, a VRB state must be 453 installed as described in Section 6.1.1. 454 * When the Sequence is not 0, this field indicates the offset of 455 the fragment in the Compressed Form of the datagram. The 456 fragment may be added to a local reassembly buffer or forwarded 457 based on an existing VRB as described in Section 6.1.2. 459 A Fragment_offset that is set to a value of 0 indicates an abort 460 condition and all state regarding the datagram should be cleaned 461 up once the processing of the fragment is complete; the processing 462 of the fragment depends on whether there is a VRB already 463 established for this datagram, and the next hop is still 464 reachable: 466 * if a VRB already exists and is not broken, the fragment is to 467 be forwarded along the associated Label Switched Path (LSP) as 468 described in Section 6.1.2, but regardless of the value of the 469 Sequence field; 470 * else, if the Sequence is 0, then the fragment is to be routed 471 as described in Section 6.1.1, but no state is conserved 472 afterwards. 
In that case, the session, if it exists, is aborted 473 and the packet is also forwarded in an attempt to clean up the 474 next hops along the path indicated by the IPv6 header (possibly 475 including a routing header). 477 If the fragment cannot be forwarded or routed, then an abort 478 RFRAG-ACK is sent back to the source as described in 479 Section 6.1.2. 481 5.2. RFRAG Acknowledgment Dispatch type and Header 483 This specification also defines a 4-octet RFRAG Acknowledgment bitmap 484 that is used by the reassembling endpoint to selectively confirm the 485 reception of individual fragments. A given offset in the bitmap maps 486 one-to-one with a given sequence number and indicates which fragment 487 is acknowledged as follows: 489 1 2 3 490 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 491 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 492 | RFRAG Acknowledgment Bitmap | 493 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 494 ^ ^ 495 | | bitmap indicating whether: 496 | +----- Fragment with sequence 9 was received 497 +----------------------- Fragment with sequence 0 was received 499 Figure 2: RFRAG Acknowledgment Bitmap Encoding 501 Figure 3 shows an example Acknowledgment bitmap which indicates that 502 all fragments from sequence 0 to 20 were received, except for 503 fragments 1, 2 and 16, which were lost and must be retried. 
505 1 2 3 506 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 507 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 508 |1|0|0|1|1|1|1|1|1|1|1|1|1|1|1|1|0|1|1|1|1|0|0|0|0|0|0|0|0|0|0|0| 509 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 511 Figure 3: Example RFRAG Acknowledgment Bitmap 513 The RFRAG Acknowledgment Bitmap is included in an RFRAG 514 Acknowledgment header, as follows: 516 1 2 3 517 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 518 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 519 |1 1 1 0 1 0 1|E| datagram_tag | 520 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 521 | RFRAG Acknowledgment Bitmap (32 bits) | 522 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 523 Figure 4: RFRAG Acknowledgment Dispatch type and Header 525 E: 1 bit; Explicit Congestion Notification Echo 527 When set, the sender indicates that at least one of the 528 acknowledged fragments was received with an Explicit Congestion 529 Notification, indicating that the path followed by the fragments 530 is subject to congestion. See Appendix C for more. 532 RFRAG Acknowledgment Bitmap: An RFRAG Acknowledgment Bitmap, whereby 533 setting the bit at offset x indicates that fragment x was 534 received, as shown in Figure 2. A NULL bitmap indicates that 535 the fragmentation process is aborted. A FULL bitmap 536 indicates that the fragmentation process is complete: all 537 fragments were received at the reassembly endpoint. 539 6. Fragments Recovery 541 The Recoverable Fragment header RFRAG is used to transport a fragment 542 and optionally request an RFRAG Acknowledgment that will confirm the 543 good reception of one or more fragments. An RFRAG Acknowledgment is 544 carried as a standalone fragment header (i.e., with no 6LoWPAN 545 payload) in a message that is propagated back to the 6LoWPAN endpoint 546 that was the originator of the fragments. 
To achieve this, each hop 547 that performed an MPLS-like operation on fragments reverses that 548 operation for the RFRAG-ACK by sending a frame from the next hop to 549 the previous hop as known by its MAC address in the VRB. The 550 datagram_tag in the RFRAG-ACK is unique to the receiver and is enough 551 information for an intermediate hop to locate the VRB that contains 552 the datagram_tag used by the previous hop and the Layer-2 information 553 associated with it (interface and MAC address). 555 The 6LoWPAN endpoint that fragments the packets at the 6LoWPAN level 556 (the sender) also controls the number of acknowledgments by setting 557 the Ack-Request flag in the RFRAG packets. The sender may set the 558 Ack-Request flag on any fragment to perform congestion control by 559 limiting the number of outstanding fragments, which are the fragments 560 that have been sent but for which reception or loss was not 561 positively confirmed by the reassembling endpoint. The maximum 562 number of outstanding fragments is controlled by the Window-Size. It 563 is configurable and may vary in case of ECN notification. When the 564 6LoWPAN endpoint that reassembles the packets at the 6LoWPAN level 565 (the receiver) receives a fragment with the Ack-Request flag set, it 566 MUST send an RFRAG Acknowledgment back to the originator to confirm 567 reception of all the fragments it has received so far. 569 The Ack-Request ('X') set in an RFRAG marks the end of a window. 570 This flag MUST be set on the last fragment if the sender wishes to 571 protect the datagram, and it MAY be set in any intermediate fragment 572 for the purpose of flow control. 574 This automatic repeat request (ARQ) process MUST be protected by a 575 Retransmission TimeOut (RTO) timer, and the fragment that carries the 576 'X' flag MAY be retried upon a time out for a configurable number of 577 times (see Section 7.1). 
Upon exhaustion of the retries the sender 578 may either abort the transmission of the datagram or retry the 579 datagram from the first fragment with an 'X' flag set in order to 580 reestablish a path and discover which fragments were received over 581 the old path in the acknowledgment bitmap. When the sender of the 582 fragment knows that an underlying link-layer mechanism protects the 583 fragments, it may refrain from using the RFRAG Acknowledgment 584 mechanism, and never set the Ack-Request bit. 586 The receiver MAY issue unsolicited acknowledgments. An unsolicited 587 acknowledgment signals to the sender endpoint that it can resume 588 sending if it had reached its maximum number of outstanding 589 fragments. Another use is to inform the sender that the reassembling 590 endpoint aborted the processing of an individual datagram. 592 The RFRAG Acknowledgment can optionally carry an ECN indication for 593 flow control (see Appendix C). The receiver of a fragment with the 594 'E' (ECN) flag set MUST echo that information by setting the 'E' 595 (ECN) flag in the next RFRAG Acknowledgment. 597 In order to protect the datagram, the sender transfers a controlled 598 number of fragments and flags the last fragment of a window with an 599 RFRAG Acknowledgment Request. The receiver MUST acknowledge a 600 fragment with the acknowledgment request bit set. If any fragment 601 immediately preceding an acknowledgment request is still missing, the 602 receiver MAY intentionally delay its acknowledgment to allow in- 603 transit fragments to arrive. Because it might defeat the round-trip 604 delay computation, delaying the acknowledgment should be configurable 605 and not enabled by default. 
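The acknowledgment exchange described above can be sketched as follows, as a hedged illustration and not as normative text. The helper names (build_ack_bitmap, interpret_ack) are hypothetical. The sketch assumes that sequence 0 maps to the most significant bit of the 32-bit bitmap, following the bit numbering of Figure 2, and it reproduces the example of Figure 3.

```python
NULL_BITMAP = 0x00000000  # all bits zero: the fragmentation process is aborted
FULL_BITMAP = 0xFFFFFFFF  # all bits one: the datagram is complete


def build_ack_bitmap(received):
    """Receiver side: set one bit per received sequence number.
    Assumption: sequence 0 is the leftmost (most significant) bit."""
    bitmap = 0
    for seq in received:
        if not 0 <= seq <= 31:
            raise ValueError("sequence must be in [0..31]")
        bitmap |= 1 << (31 - seq)
    return bitmap


def interpret_ack(bitmap, sent):
    """Sender side: classify an RFRAG Acknowledgment bitmap.
    Returns ("abort", []), ("complete", []) or ("retry", missing)."""
    if bitmap == NULL_BITMAP:
        return ("abort", [])
    if bitmap == FULL_BITMAP:
        return ("complete", [])
    missing = [seq for seq in sorted(sent)
               if not (bitmap >> (31 - seq)) & 1]
    return ("retry", missing)


# Example of Figure 3: sequences 0 to 20 sent, 1, 2 and 16 lost.
received = set(range(21)) - {1, 2, 16}
example = build_ack_bitmap(received)
```

With these assumptions, example equals 0x9FFF7800 and interpret_ack(example, range(21)) returns ("retry", [1, 2, 16]), which the sender would retry oldest first.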

When all the fragments are received, the receiving endpoint
reconstructs the packet, passes it to the upper layer, sends an RFRAG
Acknowledgment on the reverse path with a FULL bitmap, and arms a
short timer, e.g., on the order of an average round-trip delay in the
network.  As the timer runs, the receiving endpoint absorbs the
fragments that were still in flight for that datagram without
creating a new state.  The receiving endpoint aborts the
communication if fragments for that datagram keep arriving after the
timer expires.

Note that acknowledgments might consume precious resources so the use
of unsolicited acknowledgments should be configurable and not enabled
by default.

An observation is that streamlining forwarding of fragments generally
reduces the latency over the LLN mesh, providing room for retries
within existing upper-layer reliability mechanisms.  The sender
protects the transmission over the LLN mesh with a retry timer that
is computed according to the method detailed in [RFC6298].  It is
expected that the upper-layer retries obey the recommendations in
[RFC8085], in which case a single round of fragment recovery should
fit within the upper-layer recovery timers.

Fragments are sent in a round-robin fashion: the sender sends all the
fragments for a first time before it retries any lost fragment; lost
fragments are retried in sequence, oldest first.  This mechanism
enables the receiver to acknowledge fragments that were delayed in
the network before they are retried.

When a single frequency is used by contiguous hops, the sender should
insert a delay between fragments of the same datagram that covers
multiple transmissions so as to let a fragment progress a few hops
and avoid hidden terminal issues.
This precaution is not required on channel hopping technologies such
as Time Slotted Channel Hopping (TSCH) [RFC7554], where nodes that
communicate at Layer-2 are scheduled to send and receive
respectively, and different hops operate on different channels.

6.1.  Forwarding Fragments

It is assumed that the first fragment is large enough to carry the
IPv6 header and make routing decisions.  If that is not so, then this
specification MUST NOT be used.

This specification extends the Virtual Reassembly Buffer (VRB)
technique to forward fragments with no intermediate reconstruction of
the entire packet.  It inherits operations like datagram_tag
switching and using a timer to clean the VRB when the traffic dries
up.  The first fragment carries the IP header and it is routed all
the way from the fragmenting endpoint to the reassembling endpoint.
Upon receiving the first fragment, the routers along the path install
a label-switched path (LSP), and the following fragments are
label-switched along that path.  As a consequence, the next fragments
can only follow the path that was set up by the first fragment and
cannot follow an alternate route.  The datagram_tag is used to carry
the label, which is swapped in each hop.  All fragments follow the
same path and fragments are delivered in the order in which they are
sent.

6.1.1.  Receiving the first fragment

In Route-Over mode, the source and destination MAC addresses in a
frame change at each hop.  The label that is formed and placed in the
datagram_tag is associated with the source MAC address and only valid
(and unique) for that source MAC address.
Upon a first fragment (i.e., with a sequence of zero), an
intermediate router creates a VRB and the associated LSP state for
the tuple (source MAC address, datagram_tag) and the fragment is
forwarded along the IPv6 route that matches the destination IPv6
address in the IPv6 header as prescribed by
[I-D.ietf-6lo-minimal-fragment], where the receiving endpoint
allocates a reassembly buffer.

The LSP state makes it possible to match the (previous MAC address,
datagram_tag) in an incoming fragment to the tuple (next MAC address,
swapped datagram_tag) used in the forwarded fragment and points at
the VRB.  In addition, the router also forms a reverse LSP state
indexed by the MAC address of the next hop and the swapped
datagram_tag.  This reverse LSP state also points at the VRB and
enables matching the (next MAC address, swapped datagram_tag) found
in an RFRAG Acknowledgment to the tuple (previous MAC address,
datagram_tag) used when forwarding an RFRAG Acknowledgment
(RFRAG-ACK) back to the sender endpoint.

The first fragment may be received a second time, indicating that it
did not reach the destination and was retried.  In that case, it
SHOULD follow the same path as the first occurrence.  It is up to the
sending endpoint to determine whether to abort a transmission and
then retry it from scratch, which may build an entirely new path.

6.1.2.  Receiving the next fragments

Upon receiving a next fragment (i.e., with a non-zero sequence), an
intermediate router looks up an LSP indexed by the tuple (MAC
address, datagram_tag) found in the fragment.  If it is found, the
router forwards the fragment using the associated VRB as prescribed
by [I-D.ietf-6lo-minimal-fragment].

If the VRB for the tuple is not found, the router builds an RFRAG-ACK
to abort the transmission of the packet.
The resulting message has the following information:

*  The source and destination MAC addresses are swapped from those
   found in the fragment

*  The datagram_tag is set to the datagram_tag found in the fragment

*  A NULL bitmap is used to signal the abort condition

At this point the router is all set and can send the RFRAG-ACK back
to the previous router.  The RFRAG-ACK should normally be forwarded
all the way to the source using the reverse LSP state in the VRBs in
the intermediate routers as described in the next section.

[I-D.ietf-6lo-minimal-fragment] indicates that the receiving endpoint
stores "the actual packet data from the fragments received so far, in
a form that makes it possible to detect when the whole packet has
been received and can be processed or forwarded".  How this is
computed is implementation specific but relies on receiving all the
bytes up to the datagram_size indicated in the first fragment.  An
implementation may receive overlapping fragments as the result of
retries after an MTU change.

6.2.  Receiving RFRAG Acknowledgments

Upon receipt of an RFRAG-ACK, the router looks up a reverse LSP
indexed by the tuple (MAC address, datagram_tag), which are
respectively the source MAC address of the received frame and the
received datagram_tag.  If it is found, the router forwards the
RFRAG-ACK using the associated VRB as prescribed by
[I-D.ietf-6lo-minimal-fragment], but using the reverse LSP so that
the RFRAG-ACK flows back to the sender endpoint.

If the reverse LSP is not found, the router MUST silently drop the
RFRAG-ACK message.

Either way, if the RFRAG-ACK indicates that the datagram was entirely
received (FULL bitmap), the router arms a short timer, and upon
timeout, the VRB and all the associated state are destroyed.  Until
the timer elapses, fragments of that datagram may still be received,
e.g.,
if the RFRAG-ACK was lost on the way back and the source retried the
last fragment.  In that case, the router forwards the fragment
according to the state in the VRB.

This specification does not provide a method to discover the number
of hops or the minimal value of MTU along those hops.  But should the
minimal MTU decrease, it is possible to retry a long fragment (say,
with a sequence of 5) with first a shorter fragment of the same
sequence (5 again) and then one or more other fragments with a
sequence that was not used before (e.g., 13 and 14).  Note that Path
MTU Discovery is out of scope for this document.

6.3.  Aborting the Transmission of a Fragmented Packet

A reset is signaled on the forward path with a pseudo fragment that
has the fragment_offset, sequence, and Fragment_Size all set to 0,
and no data.

When the sender or a router on the way decides that a packet should
be dropped and the fragmentation process aborted, it generates a
reset pseudo fragment and forwards it down the fragment path.

Each router along the path forwards the pseudo fragment based on the
VRB state.  If an acknowledgment is not requested, the VRB and all
associated state are destroyed.

Upon reception of the pseudo fragment, the receiver cleans up all
resources for the packet associated with the datagram_tag.  If an
acknowledgment is requested, the receiver responds with a NULL
bitmap.

The other way around, the receiver might need to abort the processing
of a fragmented packet for internal reasons, for instance if it is
out of reassembly buffers, already uses all 256 possible values of
the datagram_tag, or if it keeps receiving fragments beyond a
reasonable time while it considers that this packet is already fully
reassembled and was passed to the upper layer.  In that case, the
receiver SHOULD indicate so to the sender with a NULL bitmap in an
RFRAG Acknowledgment.
The RFRAG Acknowledgment is forwarded all the way back to the source
of the packet and cleans up all resources on the way.  Upon an
acknowledgment with a NULL bitmap, the sender endpoint MUST abort the
transmission of the fragmented datagram with one exception: in the
particular case of the first fragment, it MAY decide to retry via an
alternate next hop instead.

6.4.  Applying Recoverable Fragmentation along a Diverse Path

The text above can be read with the assumption of a serial path
between a source and a destination.  Section 4.5.3 of the "6TiSCH
Architecture" [I-D.ietf-6tisch-architecture] defines the concept of a
Track that can be a complex path between a source and a destination
with Packet ARQ, Replication, Elimination and Overhearing (PAREO)
along the Track.  This specification can be used along any subset of
the complex Track where the first fragment is flooded.  The last
RFRAG Acknowledgment is flooded on that same subset in the reverse
direction.  Intermediate RFRAG Acknowledgments can be flooded on any
sub-subset of that reverse subset that reaches back to the source.

7.  Management Considerations

This specification extends "On Forwarding 6LoWPAN Fragments over a
Multihop IPv6 Network" [I-D.ietf-6lo-minimal-fragment] and requires
the same parameters in the receiver and on intermediate nodes.  There
is no new parameter as echoing ECN is always on.  These parameters
typically include the reassembly time-out at the receiver and an
inactivity clean-up timer on the intermediate nodes, and the number
of messages that can be processed in parallel in all nodes.

The configuration settings introduced by this specification only
apply to the sender, which is in full control of the transmission.
LLNs vary a lot in size (there can be thousands of nodes in a mesh),
in speed (from 10 kbps to several Mbps at the PHY layer), in traffic
density, and in optimizations that are desired (e.g., the selection
of a RPL [RFC6550] Objective Function [RFC6552] impacts the shape of
the routing graph).

For that reason, only very generic guidance can be given on the
settings of the sender and on whether complex algorithms are needed
to perform flow control or estimate the round-trip time.  To cover
the most complex use cases, this specification enables the sender to
vary the fragment size, the window size and the inter-frame gap,
based on the amount of losses, the observed variations of the
round-trip time and the setting of the ECN bit.

7.1.  Protocol Parameters

The management system SHOULD be capable of providing the parameters
listed in this section.

An implementation must control the rate at which it sends packets
over the same path to allow the next hop to forward a packet before
it gets the next.  In a wireless network that uses the same frequency
along a path, more time must be inserted to avoid hidden terminal
issues between fragments.  This is controlled by the following
parameter:

inter-frame gap:  Indicates a minimum amount of time between
   transmissions.  All packets to the same destination, and in
   particular fragments, may be subject to receive-while-transmitting
   and hidden-terminal collisions with the next or the previous
   transmission as the fragments progress along the same path.  The
   inter-frame gap protects the propagation of one transmission
   before the next one is triggered and creates a duty cycle that
   controls the ratio of air time and memory in intermediate nodes
   that a particular datagram will use.

An implementation should consider the generic recommendations from
the IETF in the matter of flow control and rate management in
[RFC5033].
To control the flow, an implementation may use a dynamic value of the
window size (Window_Size), adapt the fragment size (Fragment_Size)
and insert an inter-frame gap that is longer than necessary.  In a
large network where nodes contend for the bandwidth, a larger
Fragment_Size consumes less bandwidth but also reduces the fluidity
and incurs higher chances of loss in transmission.  This is
controlled by the following parameters:

MinFragmentSize:  The MinFragmentSize is the minimum value for the
   Fragment_Size.

OptFragmentSize:  The OptFragmentSize is the value for the
   Fragment_Size that the sender should use to start with.  It is
   more than or equal to MinFragmentSize.  It is less than or equal
   to MaxFragmentSize.  On the first fragment, it must enable the
   expansion of the IPv6 addresses and of the Hop Limit field within
   MTU.  On all fragments, it is a balance between the expected
   fluidity and the overhead of MAC and 6LoWPAN headers.  For a small
   MTU, the idea is to keep it close to the maximum, whereas for
   larger MTUs, it might make sense to keep it short enough, so that
   the duty cycle of the transmitter is bounded, e.g., to transmit at
   least 10 frames per second.

MaxFragmentSize:  The MaxFragmentSize is the maximum value for the
   Fragment_Size.  It MUST be lower than the minimum MTU along the
   path.  A large value augments the chances of buffer bloat and
   transmission loss.  The value MUST be less than 512 if the unit
   that is defined for the PHY layer is the octet.

MinWindowSize:  The minimum value of Window_Size that the sender can
   use.

OptWindowSize:  The OptWindowSize is the value for the Window_Size
   that the sender should use to start with.  It is more than or
   equal to MinWindowSize.  It is less than or equal to
   MaxWindowSize.
   The Window_Size should be maintained below the number of hops in
   the path of the fragment to avoid stacking fragments at the
   bottleneck on the path.  If an inter-frame gap is used to avoid
   interference between fragments then the Window_Size should be at
   most on the order of the estimation of the trip time divided by
   the inter-frame gap.

MaxWindowSize:  The maximum value of Window_Size that the sender can
   use.  The value MUST be less than 32.

An implementation may perform its estimate of the RTO or use a
configured one.  The ARQ process is controlled by the following
parameters:

MinARQTimeOut:  The minimum amount of time a node should wait for an
   RFRAG Acknowledgment before it takes a next action.

OptARQTimeOut:  The starting point of the value of the RTO, that is,
   the amount of time that a sender should wait for an RFRAG
   Acknowledgment before it takes a next action.  It is more than or
   equal to MinARQTimeOut.  It is less than or equal to
   MaxARQTimeOut.

MaxARQTimeOut:  The maximum amount of time a node should wait for an
   RFRAG Acknowledgment before it takes a next action.  It must cover
   the longest expected round-trip time, and be several times less
   than the time-out that covers the recomposition buffer at the
   receiver, which is typically on the order of a minute.  See
   Appendix C for recommendations on computing the round-trip time.

MaxFragRetries:  The maximum number of retries for a particular
   fragment.

MaxDatagramRetries:  The maximum number of retries from scratch for a
   particular datagram.

An implementation may be capable of performing flow control based on
ECN; see Appendix C.  This is controlled by the following parameter:

UseECN:  Indicates whether the sender should react to ECN.
   The sender may react to ECN by varying the Window_Size between
   MinWindowSize and MaxWindowSize, varying the Fragment_Size between
   MinFragmentSize and MaxFragmentSize and/or by increasing the
   inter-frame gap.

7.2.  Observing the network

The management system should monitor the number of retries and of ECN
settings that can be observed from the perspective of both the sender
and the receiver, and may tune the optimum size of Fragment_Size and
of the Window_Size, OptFragmentSize and OptWindowSize respectively,
at the sender.  The values should be bounded by the expected number
of hops and reduced beyond that when the number of datagrams that can
traverse an intermediate point may exceed its capacity and cause a
congestion loss.  The inter-frame gap is another tool that can be
used to increase the spacing between fragments of the same datagram
and reduce the ratio of time when a particular intermediate node
holds a fragment of that datagram.

8.  Security Considerations

This document specifies an instantiation of a 6LoWPAN Fragment
Forwarding technique.  [I-D.ietf-6lo-minimal-fragment] provides the
generic description of Fragment Forwarding and this specification
inherits from it.  The generic considerations in the Security
sections of [I-D.ietf-6lo-minimal-fragment] apply equally to this
document.

This specification does not recommend a particular algorithm for the
estimation of the duration of the RTO that covers the detection of
the loss of a fragment with the 'X' flag set; regardless, an attacker
on the path may slow down or discard packets, which in turn can
affect the throughput of fragmented packets.

Compared to "Transmission of IPv6 Packets over IEEE 802.15.4
Networks" [RFC4944], this specification reduces the datagram_tag to 8
bits and the tag wraps faster than with [RFC4944].
But for a constrained network where a node is expected to be able to
hold only one or a few large packets in memory, 256 is still a large
number.  Also, the acknowledgment mechanism allows cleaning up the
state rapidly once the packet is fully transmitted or aborted.

The abstract Virtual Reassembly Buffer inherited from
[I-D.ietf-6lo-minimal-fragment] may be used to perform a
Denial-of-Service (DoS) attack against the intermediate routers since
the routers need to maintain a state per flow.  The particular VRB
implementation technique described in
[I-D.ietf-lwig-6lowpan-virtual-reassembly] allows realigning which
data goes in which fragment, which causes the intermediate node to
store a portion of the data, which adds an attack vector that is not
present with this specification.  With this specification, the data
that is transported in each fragment is conserved and the state to
keep does not include any data that would not fit in the previous
fragment.

9.  IANA Considerations

This document allocates 2 patterns for a total of 4 dispatch values
in Page 0 for recoverable fragments from the "Dispatch Type Field"
registry that was created by "Transmission of IPv6 Packets over IEEE
802.15.4 Networks" [RFC4944] and reformatted by "6LoWPAN Paging
Dispatch" [RFC8025].

The suggested patterns (to be confirmed by IANA) are indicated in
Table 1.

+-------------+------+----------------------------------+-----------+
| Bit Pattern | Page | Header Type                      | Reference |
+=============+======+==================================+===========+
| 11 10100x   | 0    | RFRAG - Recoverable Fragment     | THIS RFC  |
+-------------+------+----------------------------------+-----------+
| 11 10100x   | 1-14 | Unassigned                       |           |
+-------------+------+----------------------------------+-----------+
| 11 10100x   | 15   | Reserved for Experimental Use    | RFC 8025  |
+-------------+------+----------------------------------+-----------+
| 11 10101x   | 0    | RFRAG-ACK - RFRAG                | THIS RFC  |
|             |      | Acknowledgment                   |           |
+-------------+------+----------------------------------+-----------+
| 11 10101x   | 1-14 | Unassigned                       |           |
+-------------+------+----------------------------------+-----------+
| 11 10101x   | 15   | Reserved for Experimental Use    | RFC 8025  |
+-------------+------+----------------------------------+-----------+

          Table 1: Additional Dispatch Value Bit Patterns

10.  Acknowledgments

The author wishes to thank Michel Veillette, Dario Tedeschi, Laurent
Toutain, Carles Gomez Montenegro, Thomas Watteyne, and Michael
Richardson for in-depth reviews and comments.  Also many thanks to
Peter Yee, Colin Perkins, Tirumaleswar Reddy Konda and Erik Nordmark
for their careful reviews and for helping through the IETF Last Call
and IESG review process, and to Jonathan Hui, Jay Werb, Christos
Polyzois, Soumitri Kolavennu, Pat Kinney, Margaret Wasserman, Richard
Kelsey, Carsten Bormann and Harry Courtice for their various
contributions in the long process that led to this document.

11.  Normative References

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
           Requirement Levels", BCP 14, RFC 2119,
           DOI 10.17487/RFC2119, March 1997.

[RFC4944]  Montenegro, G., Kushalnagar, N., Hui, J., and D.
           Culler, "Transmission of IPv6 Packets over IEEE 802.15.4
           Networks", RFC 4944, DOI 10.17487/RFC4944, September 2007.

[RFC6282]  Hui, J., Ed. and P. Thubert, "Compression Format for IPv6
           Datagrams over IEEE 802.15.4-Based Networks", RFC 6282,
           DOI 10.17487/RFC6282, September 2011.

[RFC6554]  Hui, J., Vasseur, JP., Culler, D., and V. Manral, "An IPv6
           Routing Header for Source Routes with the Routing Protocol
           for Low-Power and Lossy Networks (RPL)", RFC 6554,
           DOI 10.17487/RFC6554, March 2012.

[RFC8025]  Thubert, P., Ed. and R. Cragie, "IPv6 over Low-Power
           Wireless Personal Area Network (6LoWPAN) Paging Dispatch",
           RFC 8025, DOI 10.17487/RFC8025, November 2016.

[RFC8138]  Thubert, P., Ed., Bormann, C., Toutain, L., and R. Cragie,
           "IPv6 over Low-Power Wireless Personal Area Network
           (6LoWPAN) Routing Header", RFC 8138, DOI 10.17487/RFC8138,
           April 2017.

[RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
           2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
           May 2017.

[I-D.ietf-6lo-minimal-fragment]
           Watteyne, T., Thubert, P., and C. Bormann, "On Forwarding
           6LoWPAN Fragments over a Multihop IPv6 Network", Work in
           Progress, Internet-Draft,
           draft-ietf-6lo-minimal-fragment-10, 1 February 2020.

12.  Informative References

[RFC8201]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
           "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
           DOI 10.17487/RFC8201, July 2017.

[RFC7567]  Baker, F., Ed. and G. Fairhurst, Ed., "IETF
           Recommendations Regarding Active Queue Management",
           BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015.

[RFC3031]  Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol
           Label Switching Architecture", RFC 3031,
           DOI 10.17487/RFC3031, January 2001.

[RFC5681]  Allman, M., Paxson, V., and E.
           Blanton, "TCP Congestion Control", RFC 5681,
           DOI 10.17487/RFC5681, September 2009.

[RFC2914]  Floyd, S., "Congestion Control Principles", BCP 41,
           RFC 2914, DOI 10.17487/RFC2914, September 2000.

[RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
           of Explicit Congestion Notification (ECN) to IP",
           RFC 3168, DOI 10.17487/RFC3168, September 2001.

[RFC4919]  Kushalnagar, N., Montenegro, G., and C. Schumacher, "IPv6
           over Low-Power Wireless Personal Area Networks (6LoWPANs):
           Overview, Assumptions, Problem Statement, and Goals",
           RFC 4919, DOI 10.17487/RFC4919, August 2007.

[RFC4963]  Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly
           Errors at High Data Rates", RFC 4963,
           DOI 10.17487/RFC4963, July 2007.

[RFC6298]  Paxson, V., Allman, M., Chu, J., and M. Sargent,
           "Computing TCP's Retransmission Timer", RFC 6298,
           DOI 10.17487/RFC6298, June 2011.

[RFC6550]  Winter, T., Ed., Thubert, P., Ed., Brandt, A., Hui, J.,
           Kelsey, R., Levis, P., Pister, K., Struik, R., Vasseur,
           JP., and R. Alexander, "RPL: IPv6 Routing Protocol for
           Low-Power and Lossy Networks", RFC 6550,
           DOI 10.17487/RFC6550, March 2012.

[RFC6552]  Thubert, P., Ed., "Objective Function Zero for the Routing
           Protocol for Low-Power and Lossy Networks (RPL)",
           RFC 6552, DOI 10.17487/RFC6552, March 2012.

[RFC7554]  Watteyne, T., Ed., Palattella, M., and L. Grieco, "Using
           IEEE 802.15.4e Time-Slotted Channel Hopping (TSCH) in the
           Internet of Things (IoT): Problem Statement", RFC 7554,
           DOI 10.17487/RFC7554, May 2015.

[RFC8200]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
           (IPv6) Specification", STD 86, RFC 8200,
           DOI 10.17487/RFC8200, July 2017.

[RFC8085]  Eggert, L., Fairhurst, G., and G.
           Shepherd, "UDP Usage Guidelines", BCP 145, RFC 8085,
           DOI 10.17487/RFC8085, March 2017.

[RFC8087]  Fairhurst, G. and M. Welzl, "The Benefits of Using
           Explicit Congestion Notification (ECN)", RFC 8087,
           DOI 10.17487/RFC8087, March 2017.

[RFC5033]  Floyd, S. and M. Allman, "Specifying New Congestion
           Control Algorithms", BCP 133, RFC 5033,
           DOI 10.17487/RFC5033, August 2007.

[RFC6606]  Kim, E., Kaspar, D., Gomez, C., and C. Bormann, "Problem
           Statement and Requirements for IPv6 over Low-Power
           Wireless Personal Area Network (6LoWPAN) Routing",
           RFC 6606, DOI 10.17487/RFC6606, May 2012.

[I-D.ietf-lwig-6lowpan-virtual-reassembly]
           Bormann, C. and T. Watteyne, "Virtual reassembly buffers
           in 6LoWPAN", Work in Progress, Internet-Draft,
           draft-ietf-lwig-6lowpan-virtual-reassembly-01,
           11 March 2019.

[I-D.ietf-intarea-frag-fragile]
           Bonica, R., Baker, F., Huston, G., Hinden, R., Troan, O.,
           and F. Gont, "IP Fragmentation Considered Fragile", Work
           in Progress, Internet-Draft,
           draft-ietf-intarea-frag-fragile-17, 30 September 2019.

[I-D.ietf-6tisch-architecture]
           Thubert, P., "An Architecture for IPv6 over the TSCH mode
           of IEEE 802.15.4", Work in Progress, Internet-Draft,
           draft-ietf-6tisch-architecture-28, 29 October 2019.

[IEEE.802.15.4]
           IEEE, "IEEE Standard for Low-Rate Wireless Networks",
           IEEE Standard 802.15.4, DOI 10.1109/IEEE
           P802.15.4-REVd/D01.

[Kent]     Kent, C. and J. Mogul, "Fragmentation Considered Harmful",
           In Proc. SIGCOMM '87 Workshop on Frontiers in Computer
           Communications Technology, DOI 10.1145/55483.55524,
           August 1987.

Appendix A.  Rationale

There are a number of uses for large packets in Wireless Sensor
Networks.
Such usages may not be the most typical or represent the largest
amount of traffic over the LLN; however, the associated functionality
can be critical enough to justify extra care for ensuring effective
transport of large packets across the LLN.

The list of those usages includes:

Towards the LLN node:

   Firmware update:  For example, a new version of the LLN node
      software is downloaded from a system manager over unicast or
      multicast services.  Such a reflashing operation typically
      involves updating a large number of similar LLN nodes over a
      relatively short period of time.

   Packages of Commands:  A number of commands or a full
      configuration can be packaged as a single message to ensure
      consistency and enable atomic execution or complete roll back.
      Until such commands are fully received and interpreted, the
      intended operation will not take effect.

From the LLN node:

   Waveform captures:  A number of consecutive samples are measured
      at a high rate for a short time and then transferred from a
      sensor to a gateway or an edge server as a single large report.

   Data logs:  LLN nodes may generate large logs of sampled data for
      later extraction.  LLN nodes may also generate system logs to
      assist in diagnosing problems on the node or network.

   Large data packets:  Rich data types might require more than one
      fragment.

Uncontrolled firmware download or waveform upload can easily result
in a massive increase of the traffic and saturate the network.

When a fragment is lost in transmission, the lack of recovery in the
original fragmentation system of RFC 4944 implies that all fragments
would need to be resent, further contributing to the congestion that
caused the initial loss, and potentially leading to congestion
collapse.
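
The cost of resending all fragments can be illustrated with a small
calculation.  The sketch below is an illustration only and rests on
simplifying assumptions that are not part of this specification:
fragments are lost independently with the same probability,
acknowledgments are never lost, and the function name and the values
of 18 fragments and 5% loss are hypothetical.

```python
def expected_transmissions(n_fragments, loss_prob):
    """Return (all_or_nothing, selective) expected fragment transmissions."""
    if not 0.0 <= loss_prob < 1.0:
        raise ValueError("loss_prob must be in [0, 1)")
    p_fragment_ok = 1.0 - loss_prob
    # Whole-datagram retry: every attempt sends all n fragments and only
    # succeeds if all of them arrive, so attempts follow a geometric law.
    all_or_nothing = n_fragments / (p_fragment_ok ** n_fragments)
    # Selective recovery: each fragment is retried on its own until it
    # arrives, independently of the others.
    selective = n_fragments / p_fragment_ok
    return all_or_nothing, selective


whole, selective = expected_transmissions(n_fragments=18, loss_prob=0.05)
```

Under these assumptions, resending the whole datagram costs about 45
fragment transmissions on average, against about 19 when only the
lost fragments are retried, and the gap widens quickly as the loss
rate or the number of fragments grows.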

This saturation may lead to excessive radio interference, or random
early discard (leaky bucket) in relaying nodes.  Additional queuing
and memory congestion may result while waiting for a low-power next
hop to emerge from its sleeping state.

Considering that RFC 4944 defines an MTU of 1280 bytes and that in
most incarnations (except 802.15.4g) an IEEE Std. 802.15.4 frame can
limit the MAC payload to as few as 74 bytes, a packet might be
fragmented into at least 18 fragments at the 6LoWPAN shim layer.
Taking into account the worst-case header overhead for 6LoWPAN
Fragmentation and Mesh Addressing headers will increase the number of
required fragments to around 32.  This level of fragmentation is much
higher than that traditionally experienced over the Internet with
IPv4 fragments.  At the same time, the use of radios increases the
probability of transmission loss and Mesh-Under techniques compound
that risk over multiple hops.

Mechanisms such as TCP or application-layer segmentation could be
used to support end-to-end reliable transport.  One option to support
bulk data transfer over a frame-size-constrained LLN is to set the
Maximum Segment Size to fit within the link maximum frame size.
Doing so, however, can add significant header overhead to each
802.15.4 frame.  In addition, deploying such a mechanism requires
that the end-to-end transport is aware of the delivery properties of
the underlying LLN, which is a layer violation, and difficult to
achieve from the far end of the IPv6 network.

Appendix B.  Requirements

For one-hop communications, a number of Low Power and Lossy Network
(LLN) link-layers propose a local acknowledgment mechanism that is
enough to detect and recover the loss of fragments.
In a multihop environment, an end-to-end fragment recovery mechanism
might be a good complement to a hop-by-hop MAC level recovery.  This
draft introduces a simple protocol to recover individual fragments
between 6LoWPAN endpoints that may be multiple hops away.  The method
addresses the following requirements of an LLN:

Number of fragments:  The recovery mechanism must support highly
   fragmented packets, with a maximum of 32 fragments per packet.

Minimum acknowledgment overhead:  Because the radio is half duplex,
   and because of silent time spent in the various medium access
   mechanisms, an acknowledgment consumes roughly as many resources
   as a data fragment.

   The new end-to-end fragment recovery mechanism should be able to
   acknowledge multiple fragments in a single message and not require
   an acknowledgment at all if fragments are already protected at a
   lower layer.

Controlled latency:  The recovery mechanism must succeed or give up
   within the time boundary imposed by the recovery process of the
   Upper Layer Protocols.

Optional congestion control:  The aggregation of multiple concurrent
   flows may lead to the saturation of the radio network and
   congestion collapse.

   The recovery mechanism should provide means for controlling the
   number of fragments in transit over the LLN.

Appendix C.  Considerations on Flow Control

Considering that a multi-hop LLN can be a very sensitive environment
due to the limited queuing capabilities of a large population of its
nodes, this draft recommends a simple and conservative approach to
Congestion Control, based on TCP congestion avoidance.

Congestion on the forward path is assumed in case of packet loss, and
packet loss is assumed upon time out.  The draft allows controlling
the number of outstanding fragments that have been transmitted but
for which an acknowledgment was not received yet.
It must be noted that the number of outstanding fragments should not exceed the number of hops in the network, but the way to determine the number of hops is out of scope for this document.

Congestion on the forward path can also be indicated by an Explicit Congestion Notification (ECN) mechanism. Although whether and how ECN [RFC3168] is carried over the LoWPAN is out of scope, this draft provides a way for the destination endpoint to echo an ECN indication back to the source endpoint in an acknowledgment message, as represented in Figure 4 in Section 5.2.

It must be noted that congestion and collision are different topics. In particular, when a mesh operates on the same channel over multiple hops, the forwarding of a fragment over a certain hop may collide with the forwarding of the next fragment, which is following over a previous hop but in the same interference domain. This draft enables end-to-end flow control, but leaves it to the sender stack to pace individual fragments within a transmit window, so that a given fragment is sent only when the previous fragment has had a chance to progress beyond the interference domain of this hop. In the case of 6TiSCH [I-D.ietf-6tisch-architecture], which operates over the Timeslotted Channel Hopping (TSCH) [RFC7554] mode of operation of IEEE 802.15.4, a fragment is forwarded over a different channel at a different time, and it makes full sense to transmit the next fragment as soon as the previous fragment has had its chance to be forwarded at the next hop.

From the standpoint of a source 6LoWPAN endpoint, an outstanding fragment is a fragment that was sent but for which no explicit acknowledgment has been received yet. This means that the fragment might be on the way, received but not yet acknowledged, or the acknowledgment might be on the way back.
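
As a non-normative illustration of this definition, the sketch below keeps a sender-side record of outstanding fragments and clears entries when an acknowledgment covering several fragments arrives, optionally carrying an echoed ECN indication. The names OutstandingTracker, on_send, and on_ack are hypothetical, and the bitmap layout (bit i set means that fragment i was received) is an assumption made for the example, not the encoding of Section 5.2.

```python
MAX_FRAGMENTS = 32  # per-packet limit stated in the requirements


class OutstandingTracker:
    """Hypothetical sender-side record of fragments awaiting acknowledgment."""

    def __init__(self):
        self.outstanding = set()
        self.congestion_echoed = False

    def on_send(self, sequence):
        # A fragment becomes outstanding as soon as it is transmitted.
        if not 0 <= sequence < MAX_FRAGMENTS:
            raise ValueError("sequence out of range")
        self.outstanding.add(sequence)

    def on_ack(self, bitmap, ecn_echo=False):
        # Assumed layout: bit i of 'bitmap' set means fragment i was received.
        acked = {s for s in self.outstanding if (bitmap >> s) & 1}
        self.outstanding -= acked
        # Remember whether the destination echoed a congestion indication.
        self.congestion_echoed = ecn_echo
        return acked


tracker = OutstandingTracker()
for seq in range(4):
    tracker.on_send(seq)
acked = tracker.on_ack(0b0101, ecn_echo=True)
```

In this usage, fragments 0 and 2 are acknowledged by a single message, while fragments 1 and 3 remain outstanding from the sender's point of view until they are acknowledged or timed out.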
It is also possible that either the fragment or the acknowledgment was lost on the way.

From the sender standpoint, all outstanding fragments might still be in the network and contribute to its congestion. There is an assumption, though, that after a certain amount of time, a frame is either received or lost, so it is not causing congestion anymore. This amount of time can be estimated based on the round-trip delay between the 6LoWPAN endpoints. The method detailed in "Computing TCP's Retransmission Timer" [RFC6298] is recommended for that computation.

The reader is encouraged to read through "Congestion Control Principles" [RFC2914]. Additionally, [RFC7567] and [RFC5681] provide deeper information on why this mechanism is needed and how TCP handles Congestion Control. Basically, the goal here is to manage the number of fragments present in the network; this is achieved by reducing the number of outstanding fragments over a congested path by throttling the sources.

Section 6 describes how the sender decides how many fragments are (re)sent before an acknowledgment is required, and how the sender adapts that number to the network conditions.

Author's Address

Pascal Thubert (editor)
Cisco Systems, Inc
Building D
45 Allee des Ormes - BP1200
06254 MOUGINS - Sophia Antipolis
France

Phone: +33 497 23 26 34
Email: pthubert@cisco.com