idnits 2.17.00 (12 Aug 2021) /tmp/idnits4149/draft-ietf-tsvwg-datagram-plpmtud-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- -- The abstract seems to indicate that this document updates RFC8201, but the header doesn't have an 'Updates:' line to match this. -- The abstract seems to indicate that this document updates RFC4821, but the header doesn't have an 'Updates:' line to match this. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (March 05, 2018) is 1538 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Outdated reference: draft-ietf-quic-transport has been published as RFC 9000 == Outdated reference: draft-ietf-tsvwg-sctp-dtls-encaps has been published as RFC 8261 == Outdated reference: A later version (-18) exists of draft-ietf-tsvwg-udp-options-01 ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200) Summary: 1 error (**), 0 flaws (~~), 4 warnings (==), 3 comments (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 Internet Engineering Task Force G. Fairhurst 3 Internet-Draft T. Jones 4 Intended status: Standards Track University of Aberdeen 5 Expires: September 6, 2018 M. Tuexen 6 I. Ruengeler 7 Muenster University of Applied Sciences 8 March 05, 2018 10 Packetization Layer Path MTU Discovery for Datagram Transports 11 draft-ietf-tsvwg-datagram-plpmtud-01 13 Abstract 15 This document describes a robust method for Path MTU Discovery 16 (PMTUD) for datagram Packetization Layers. The method allows a 17 Packetization Layer (PL), or a datagram application that uses a PL, 18 to probe a network path with progressively larger packets to 19 determine a maximum packet size. The document describes an extension 20 to RFC 1191 and RFC 8201, which specify ICMP-based Path MTU Discovery 21 for IPv4 and IPv6. This provides functionality for datagram 22 transports that is equivalent to the Packetization Layer PMTUD 23 specification for TCP, specified in RFC4821. 25 When published, this specification updates RFC4821. 27 Status of This Memo 29 This Internet-Draft is submitted in full conformance with the 30 provisions of BCP 78 and BCP 79. 32 Internet-Drafts are working documents of the Internet Engineering 33 Task Force (IETF). Note that other groups may also distribute 34 working documents as Internet-Drafts. The list of current Internet- 35 Drafts is at http://datatracker.ietf.org/drafts/current/. 37 Internet-Drafts are draft documents valid for a maximum of six months 38 and may be updated, replaced, or obsoleted by other documents at any 39 time. It is inappropriate to use Internet-Drafts as reference 40 material or to cite them other than as "work in progress." 42 This Internet-Draft will expire on September 6, 2018. 44 Copyright Notice 46 Copyright (c) 2018 IETF Trust and the persons identified as the 47 document authors. All rights reserved. 
49 This document is subject to BCP 78 and the IETF Trust's Legal 50 Provisions Relating to IETF Documents 51 (http://trustee.ietf.org/license-info) in effect on the date of 52 publication of this document. Please review these documents 53 carefully, as they describe your rights and restrictions with respect 54 to this document. Code Components extracted from this document must 55 include Simplified BSD License text as described in Section 4.e of 56 the Trust Legal Provisions and are provided without warranty as 57 described in the Simplified BSD License. 59 Table of Contents 61 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 62 1.1. Classical Path MTU Discovery . . . . . . . . . . . . . . 3 63 1.2. Packetization Layer Path MTU Discovery . . . . . . . . . 4 64 1.3. Path MTU Discovery for Datagram Services . . . . . . . . 5 65 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 6 66 3. Features Required to Provide Datagram PLPMTUD . . . . . . . . 7 67 3.1. PMTU Probe Packets . . . . . . . . . . . . . . . . . . . 10 68 3.2. Validation of the Current Effective PMTU . . . . . . . . 11 69 3.3. Reduction of the Effective PMTU . . . . . . . . . . . . . 11 70 4. Datagram Packetization Layer PMTUD . . . . . . . . . . . . . 12 71 4.1. Probing . . . . . . . . . . . . . . . . . . . . . . . . . 12 72 4.2. Verification and Use of PTB Messages . . . . . . . . . . 13 73 4.3. Timers . . . . . . . . . . . . . . . . . . . . . . . . . 13 74 4.4. Constants . . . . . . . . . . . . . . . . . . . . . . . . 14 75 4.5. Variables . . . . . . . . . . . . . . . . . . . . . . . . 14 76 4.6. Selecting PROBED_SIZE . . . . . . . . . . . . . . . . . . 15 77 4.7. State Machine . . . . . . . . . . . . . . . . . . . . . . 15 78 5. Specification of Protocol-Specific Methods . . . . . . . . . 18 79 5.1. DPLPMTUD for UDP and UDP-Lite . . . . . . . . . . . . . . 18 80 5.1.1. UDP Options . . . . . . . . . . . . . . . . . . . . . 18 81 5.1.2. UDP Options Required for PLPMTUD . 
. . . . . . . . . 18 82 5.1.2.1. Echo Request Option . . . . . . . . . . . . . . . 19 83 5.1.2.2. Echo Response Option . . . . . . . . . . . . . . 19 84 5.1.3. Sending UDP-Option Probe Packets . . . . . . . . . . 19 85 5.1.4. Validating the Path with UDP Options . . . . . . . . 20 86 5.1.5. Handling of PTB Messages by UDP . . . . . . . . . . . 20 87 5.2. DPLPMTUD for SCTP . . . . . . . . . . . . . . . . . . . . 20 88 5.2.1. SCTP/IP4 and SCTP/IPv6 . . . . . . . . . . . . . . . 20 89 5.2.1.1. Sending SCTP Probe Packets . . . . . . . . . . . 20 90 5.2.1.2. Validating the Path with SCTP . . . . . . . . . . 21 91 5.2.1.3. PTB Message Handling by SCTP . . . . . . . . . . 21 92 5.2.2. DPLPMTUD for SCTP/UDP . . . . . . . . . . . . . . . . 21 93 5.2.2.1. Sending SCTP/UDP Probe Packets . . . . . . . . . 21 94 5.2.2.2. Validating the Path with SCTP/UDP . . . . . . . . 21 95 5.2.2.3. Handling of PTB Messages by SCTP/UDP . . . . . . 21 96 5.2.3. DPLPMTUD for SCTP/DTLS . . . . . . . . . . . . . . . 22 97 5.2.3.1. Sending SCTP/DTLS Probe Packets . . . . . . . . . 22 98 5.2.3.2. Validating the Path with SCTP/DTLS . . . . . . . 22 99 5.2.3.3. Handling of PTB Messages by SCTP/DTLS . . . . . . 22 100 5.3. PMTUD for QUIC . . . . . . . . . . . . . . . . . . . . . 22 101 5.3.1. Sending QUIC Probe Packets . . . . . . . . . . . . . 22 102 5.3.2. Validating the Path with QUIC . . . . . . . . . . . . 23 103 5.3.3. Handling of PTB Messages by QUIC . . . . . . . . . . 23 104 5.4. Other IETF Transports . . . . . . . . . . . . . . . . . . 23 105 5.5. DPLPMTUD by Applications . . . . . . . . . . . . . . . . 23 106 6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 24 107 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 24 108 8. Security Considerations . . . . . . . . . . . . . . . . . . . 24 109 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 24 110 9.1. Normative References . . . . . . . . . . . . . . . . . . 24 111 9.2. Informative References . . . 
. . . . . . . . . . . . . . 26 112 Appendix A. Event-driven state changes . . . . . . . . . . . . . 26 113 Appendix B. Revision Notes . . . . . . . . . . . . . . . . . . . 29 114 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 30 116 1. Introduction 118 The IETF has specified datagram transport using UDP, SCTP, and DCCP, 119 as well as protocols layered on top of these transports (e.g., SCTP/ 120 UDP, DCCP/UDP). 122 1.1. Classical Path MTU Discovery 124 Classical Path Maximum Transmission Unit Discovery (PMTUD) can be 125 used with any transport that is able to process ICMP Packet Too Big 126 (PTB) messages (e.g., [RFC1191] and [RFC8201]). The term PTB message 127 is applied to both IPv4 ICMP Unreachable messages (Type 3) that carry 128 the error Fragmentation Needed (Type 3, Code 4) and ICMPv6 Packet Too 129 Big messages (Type 2). When a sender receives a PTB message, it 130 reduces the effective Path MTU (PMTU) to the value reported as the 131 Link MTU in the PTB message. A method then increases the packet size 132 from time to time in an attempt to discover an increase in the 133 supported PMTU. The packets sent with a size larger than the current 134 effective PMTU are known as probe packets. 136 Packets not intended as probe packets are either fragmented to the 137 current effective PMTU, or the attempt to send fails with an error 138 code. Applications are sometimes provided with a primitive to let 139 them read the maximum packet size, derived from the current effective 140 PMTU. 142 Classical PMTUD is subject to protocol failures. One failure arises 143 when traffic using a packet size larger than the actual supported 144 PMTU is black-holed (all datagrams sent with this size are silently 145 discarded without the sender receiving ICMP PTB messages). This could 146 arise when the ICMP messages are not delivered back to the sender for 147 some reason [RFC2923]. 
For example, ICMP messages are increasingly 148 filtered by middleboxes (including firewalls) [RFC4890]. Also, in 149 some cases they are not correctly processed by tunnel endpoints. 151 Another failure could result if a node not on the network path sends 152 a PTB that attempts to force the sender to change the effective PMTU 153 [RFC8201]. A sender can protect itself from reacting to such 154 messages by utilising the quoted packet within the PTB message 155 payload to verify that the received PTB message was generated in 156 response to a packet that had actually been sent. However, there are 157 situations where a sender would be unable to provide this 158 verification. 160 Examples where verification is not possible include: 162 o When the router issuing the ICMP message is acting on a tunneled 163 packet, the ICMP message is directed to the tunnel endpoint. This 164 endpoint is responsible for processing the quoted packet in the 165 payload field to remove the effect of the tunnel, and for returning the 166 ICMP message to the sender. Failure to do this results in black- 167 holing. 169 o When the router issuing the ICMP message implements RFC792 170 [RFC0792], which only requires the quoted payload to include the 171 first 64 bits of the IP payload of the packet, and the ICMP 172 message occurs within a tunnel. Even if the decapsulated message 173 is processed by the tunnel endpoint, there could be insufficient 174 bytes remaining for the sender to read the quoted transport 175 information. RFC1812 [RFC1812] requires routers to return the 176 full packet if possible, often the case for IPv4 when the 177 path includes tunnels; or where the packet has been encapsulated/ 178 tunneled over an encrypted transport and it is not possible to 179 determine the original transport header. 
181 o Even when the PTB message includes sufficient bytes of the quoted 182 packet, the network layer could lack sufficient context to perform 183 verification, because this depends on information about the active 184 transport flows at an endpoint node (e.g., the socket/address 185 pairs being used, and other protocol header information). 187 1.2. Packetization Layer Path MTU Discovery 189 The term Packetization Layer (PL) has been introduced to describe the 190 layer that is responsible for placing data blocks into the payload of 191 packets and selecting an appropriate maximum packet size. This 192 function is often performed by a transport protocol, but can also be 193 performed by other encapsulation methods working above the transport. 194 PTB verification is more straightforward at the PL or at a higher 195 layer. 197 In contrast to PMTUD, Packetization Layer Path MTU Discovery 198 (PLPMTUD) [RFC4821] does not rely upon reception and verification of 199 PTB messages. It is therefore more robust than Classical PMTUD. 200 This has become the recommended approach for implementing PMTU 201 discovery with TCP. 203 It uses a general strategy where the PL sends probe packets to search 204 for an appropriate PMTU. The probe packets are sent with a progressively 205 larger packet size. If a probe packet is successfully delivered (as 206 determined by the PL), then the effective Path MTU is raised to the 207 size of the successful probe. If no response is received to a probe 208 packet, the method reduces the probe size. 210 PLPMTUD introduces flexibility in the implementation of PMTU 211 discovery. At one extreme, it can be configured to only perform PTB 212 black hole detection and recovery to increase the robustness of 213 Classical PMTUD, or at the other extreme, all PTB processing can be 214 disabled and PLPMTUD can completely replace Classical PMTUD. PLPMTUD 215 can also include additional consistency checks without increasing the 216 risk of black-holing. 
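The general strategy described in Section 1.2 can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name discover, the halving of the step after a lost probe, and the default sizes are hypothetical choices made for illustration and are not taken from this document or from [RFC4821]. The comparison against path_mtu stands in for the PL's own confirmation that a probe was delivered.

```python
def discover(path_mtu, base=1280, ceiling=9000, step=256):
    """Simulate a raise-on-success, reduce-on-failure probe search.

    Assumptions (not from the draft): `base` is already known to work,
    and the step is halved each time a probe gets no response.
    """
    effective = base                      # current effective PMTU
    probe = min(base + step, ceiling)     # first probe size
    while probe > effective:
        delivered = probe <= path_mtu     # stand-in for PL-level feedback
        if delivered:
            # Successful probe: raise the effective PMTU to the probe size.
            effective = probe
            probe = min(probe + step, ceiling)
        else:
            # No response: reduce the probe size and try again.
            step //= 2
            probe = effective + step
    return effective


if __name__ == "__main__":
    print(discover(1500))
```

The loop ends once the next probe size would no longer exceed the effective PMTU, either because the step has shrunk to zero or because the ceiling has been reached.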
218 1.3. Path MTU Discovery for Datagram Services 220 Section 4 of this document presents a set of algorithms for datagram 221 protocols to discover a maximum size for the effective PMTU across a 222 path. The methods described rely on features of the PL (Section 3) and 223 apply to transport protocols over IPv4 and IPv6. They do not require 224 cooperation from the lower layers (except that they are consistent 225 about which packet sizes are acceptable). A method can utilise ICMP 226 PTB messages when these received messages are made available to the 227 PL. 229 The UDP-Guidelines [RFC8085] state "an application SHOULD either use 230 the Path MTU information provided by the IP layer or implement Path 231 MTU Discovery (PMTUD)", but do not provide a mechanism for 232 discovering the largest size of unfragmented datagram that can be 233 used on a path. Prior to this document, PLPMTUD had not been 234 specified for UDP. 236 Section 10.2 of [RFC4821] recommends a PLPMTUD probing method for the 237 Stream Control Transmission Protocol (SCTP). SCTP utilises heartbeat 238 messages as probe packets, but RFC4821 does not provide a complete 239 specification. This document provides the details to complete that 240 specification. 242 The Datagram Congestion Control Protocol (DCCP) [RFC4340] requires 243 implementations to support Classical PMTUD and states that a DCCP 244 sender "MUST maintain the maximum packet size (MPS) allowed for each 245 active DCCP session". It also defines the current congestion control 246 maximum packet size (CCMPS) supported by a path. This recommends use 247 of PMTUD, and suggests use of control packets (DCCP-Sync) as path 248 probe packets, because they do not risk application data loss. The 249 method defined in this specification could be used with DCCP. 
251 Section 5 specifies the method for a set of transports, and provides 252 information that enables the implementation of PLPMTUD with other 253 datagram transports and applications that use datagram transports. 255 2. Terminology 257 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 258 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 259 document are to be interpreted as described in [RFC2119]. 261 Other terminology is directly copied from [RFC4821], and the 262 definitions in [RFC1122]. 264 Black-Holed: When the sender is unaware that packets are not 265 delivered to the destination endpoint (e.g., when the sender 266 transmits packets of a particular size with a previously known 267 PMTU, but is unaware of a change to the path that resulted in a 268 smaller PMTU). 270 Classical Path MTU Discovery: Classical PMTUD is a process described 271 in [RFC1191] and [RFC8201], in which nodes rely on PTB messages to 272 learn the largest size of unfragmented datagram that can be used 273 across a path. 275 Datagram: A datagram is a transport-layer protocol data unit, 276 transmitted in the payload of an IP packet. 278 Effective PMTU: The current estimated value for PMTU that is used by 279 a Packetization Layer. 281 EMTU_S: The Effective MTU for sending (EMTU_S) is defined in 282 [RFC1122] as "the maximum IP datagram size that may be sent, for a 283 particular combination of IP source and destination addresses...". 285 EMTU_R: The Effective MTU for receiving (EMTU_R) is designated in 286 [RFC1122] as the largest datagram size that can be reassembled by 287 EMTU_R ("Effective MTU to receive"). 289 Link: A communication facility or medium over which nodes can 290 communicate at the link layer, i.e., a layer below the IP layer. 291 Examples are Ethernet LANs and Internet (or higher) layer 292 tunnels. 
294 Link MTU: The Maximum Transmission Unit (MTU) is the size in bytes 295 of the largest IP packet, including the IP header and payload, 296 that can be transmitted over a link. Note that this could more 297 properly be called the IP MTU, to be consistent with how other 298 standards organizations use the acronym MTU. This includes the IP 299 header, but excludes link layer headers and other framing that is 300 not part of IP or the IP payload. Other standards organizations 301 generally define link MTU to include the link layer headers. 303 MPS: The Maximum Packet Size (MPS), the largest size of application 304 data block that can be sent unfragmented across a path. In 305 PLPMTUD this quantity is derived from Effective PMTU by taking 306 into consideration the size of the application and lower protocol 307 layer headers, and can be limited by the application protocol. 309 Packet: An IP header plus the IP payload. 311 Packetization Layer (PL): The layer of the network stack that places 312 data into packets and performs transport protocol functions. 314 Path: The set of links and routers traversed by a packet between a 315 source node and a destination node. 317 Path MTU (PMTU): The minimum of the link MTU of all the links 318 forming a path between a source node and a destination node. 320 PLPMTUD: Packetization Layer Path MTU Discovery, the method 321 described in this document for datagram PLs, which is an extension 322 to Classical PMTU Discovery. 324 Probe packet: A datagram sent with a purposely chosen size 325 (typically larger than the current Effective PMTU or MPS) to 326 detect if messages of this size can be successfully sent along the 327 end-to-end path. 329 3. Features Required to Provide Datagram PLPMTUD 331 TCP PLPMTUD has been defined using standard TCP protocol mechanisms. 332 All of the requirements in [RFC4821] also apply to use of the 333 technique with a datagram PL. 
Unlike TCP, some datagram PLs require 334 additional mechanisms to implement PLPMTUD. 336 There are nine requirements for performing the datagram PLPMTUD 337 method described in this specification: 339 1. PMTU parameters: A PLPMTUD sender is REQUIRED to provide 340 information about the maximum size of packet that can be 341 transmitted by the sender on the local link (the link MTU) and MAY 342 utilize similar information about the receiver when this is 343 supplied (note this could be less than EMTU_R). Some 344 applications also have a maximum transport protocol data unit 345 (PDU) size, in which case there is no benefit from probing for a 346 size larger than this (unless a transport allows multiplexing 347 multiple application PDUs into the same datagram). 349 2. Effective PMTU: A datagram application MUST be able to choose the 350 size of datagrams sent to the network, up to the effective PMTU, 351 or a smaller value (such as the MPS) derived from this. This 352 value is managed by the PMTUD method. The effective PMTU 353 (specified in Section 1 of [RFC1191]) is equivalent to the EMTU_S 354 (specified in [RFC1122]). 356 3. Probe packets: On request, a PLPMTUD sender is REQUIRED to be 357 able to transmit a packet larger than the current effective PMTU 358 (but always with a total size less than the link MTU). The 359 method can use this as a probe packet. In IPv4, a probe packet 360 is always sent with the Don't Fragment (DF) bit set in the IP 361 header, and without network layer endpoint fragmentation. In 362 IPv6, a probe packet is always sent without source fragmentation 363 (as specified in section 5.4 of [RFC8201]). 365 4. Processing PTB messages: A PLPMTUD sender MAY optionally utilize 366 PTB messages received from the network layer to help identify 367 when a path does not support the current probe packet size. 368 Any received PTB message SHOULD/MUST be verified before it is 369 used to update the PMTU discovery information [RFC8201]. 
This 370 verification confirms that the PTB message was sent in response 371 to a packet originated by the sender, and needs to be performed 372 before the PMTU discovery method reacts to the PTB message. When 373 the router link MTU is indicated in the PTB message this MAY be 374 used by datagram PLPMTUD to reduce the size of a probe, but MUST 375 NOT be used to increase the effective PMTU ([RFC8201]). 377 5. Reception feedback: The destination PL endpoint is REQUIRED to 378 provide a feedback method that indicates to the PLPMTUD sender 379 when a probe packet has been received by the destination 380 endpoint. The local PL endpoint at the sending node is REQUIRED 381 to pass this feedback to the sender-side PLPMTUD method. 383 6. Probing and congestion control: The isolated loss of a probe 384 packet SHOULD NOT be treated as an indication of congestion and 385 its loss does not directly trigger a congestion control reaction 386 [RFC4821]. 388 7. Probe loss recovery: If the data block carried by a probe message 389 needs to be sent reliably, the PL (or layers above) MUST arrange 390 retransmission/repair of any resulting loss. This method MUST be 391 robust in the case where probe packets are lost due to other 392 reasons (including link transmission error, congestion). The 393 PLPMTUD method treats isolated loss of a probe packet (with or 394 without a PTB message) as a potential indication of a PMTU limit 395 on the path. The PL MAY retransmit any data included in a lost 396 probe packet without adjusting its congestion window [RFC4821]. 398 8. Cached effective PMTU: The sender MUST cache the effective PMTU 399 value used by an instance of the PL between probes and needs also 400 to consider the disruption that could be incurred by an 401 unsuccessful probe - both upon the flow that incurs a probe loss, 402 and other flows that experience the effect of additional probe 403 traffic. 405 9. 
Shared effective PMTU state: The PMTU value could also be stored 406 with the corresponding entry in the destination cache and used by 407 other PL instances. The specification of PLPMTUD [RFC4821] 408 states: "If PLPMTUD updates the MTU for a particular path, all 409 Packetization Layer sessions that share the path representation 410 (as described in Section 5.2 of [RFC4821]) SHOULD be notified to 411 make use of the new MTU and make the required congestion control 412 adjustments". Such methods need to be robust to the wide variety of 413 underlying network forwarding behaviours. Section 5.2 of 414 [RFC8201] provides guidance on the caching of PMTU information 415 and also the relation to IPv6 flow labels. 417 In addition, the following design principles are stated: 419 o Suitable MPS: The PLPMTUD method SHOULD avoid forcing an 420 application to use an arbitrarily small MPS (effective PMTU) for 421 transmission while the method is searching for the currently 422 supported PMTU. Datagram PLs do not necessarily support 423 fragmentation of PDUs larger than the PMTU. A reduced MPS can 424 adversely impact the performance of a datagram application. 426 o Path validation: The PLPMTUD method MUST be robust to path changes 427 that could have occurred since the path characteristics were last 428 confirmed. 430 o Datagram reordering: A method MUST be robust to the possibility 431 that a flow encounters reordering, or has its traffic (including 432 probe packets) divided over more than one network path. 434 o When to probe: The PLPMTUD method SHOULD determine whether the 435 path capacity has increased since it last measured the path. This 436 determines when the path should again be probed. 438 3.1. PMTU Probe Packets 440 PMTU discovery relies upon the sender being able to generate probe 441 messages with a specific size. TCP is able to generate probe packets 442 by choosing to appropriately segment data being sent [RFC4821]. 
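As an aside on requirement 4 of Section 3 (processing PTB messages), the rule that a verified PTB message may reduce the size of a probe but never increase anything can be sketched as follows. The function name probe_size_after_ptb and its boolean verified argument are hypothetical; how verification against the quoted packet is actually performed depends on the PL and is not shown here.

```python
def probe_size_after_ptb(probe_size, ptb_mtu, verified):
    """Return the probe size to use after receiving a PTB message.

    Hypothetical helper: `verified` is assumed to be the result of
    checking the quoted packet against packets this sender really sent.
    """
    if not verified:
        # Unverified PTB messages are ignored entirely.
        return probe_size
    if ptb_mtu >= probe_size:
        # A PTB message is never used to increase a size.
        return probe_size
    # A verified, smaller MTU may be used to reduce the probe size.
    return ptb_mtu


if __name__ == "__main__":
    print(probe_size_after_ptb(1500, 1400, verified=True))
```

The sketch deliberately leaves the effective PMTU untouched: it only shows how the next probe size could be bounded by a verified PTB message.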
444 In contrast, a datagram PL that needs to construct a probe packet has 445 to either request an application to send a data block that is larger 446 than that generated by an application, or to utilise padding 447 functions to extend a datagram beyond the size of the application 448 data block. Protocols that permit exchange of control messages 449 (without an application data block) could alternatively prefer to 450 generate a probe packet by extending a control message with padding 451 data. 453 When the method fails to validate the PMTU for the path, it may be 454 required to send a probe packet with a size less than the size of the 455 data block generated by an application. In this case, the PL could 456 provide a way to fragment a datagram at the PL, or could instead 457 utilise a control packet with padding. 459 A receiver needs to be able to distinguish an in-band data block from 460 any added padding. This is needed to ensure that any added padding 461 is not passed on to an application at the receiver. 463 This results in three possible ways that a sender can create a probe 464 packet: 466 Probing using application data: A probe packet that contains a data 467 block supplied by an application that matches the size required 468 for the probe. This method requests the application to issue a 469 data block of the desired probe size. If the application/ 470 transport needs protection from the loss of an unsuccessful probe 471 packet, the application/transport needs then to perform transport- 472 layer retransmission/repair of the data block (e.g., by 473 retransmission after loss is detected or by duplicating the data 474 block in a datagram without the padding). 476 Probing using application data and padding data: A probe packet that 477 contains a data block supplied by an application that is combined 478 with padding to inflate the length of the datagram to the size 479 required for the probe. 
If the application/transport needs 480 protection from the loss of this probe packet, the application/ 481 transport may perform transport-layer retransmission/repair of the 482 data block (e.g., by retransmission after loss is detected or by 483 duplicating the data block in a datagram without the padding 484 data). 486 Probing using padding data: A probe packet that contains only 487 control information together with any padding needed to inflate 488 the packet to the size required for the probe. Since these probe 489 packets do not carry an application-supplied data block,they do 490 not typically require retransmission, although they do still 491 consume network capacity and incur endpoint processing. 493 A datagram PLPMTUD MAY choose to use only one of these methods to 494 simplify the implementation. 496 3.2. Validation of the Current Effective PMTU 498 The PL needs a method to determine when probe packets have been 499 successfully received end-to-end across a network path. 501 Transport protocols can include end-to-end methods that detect and 502 report reception of specific datagrams that they send (e.g., DCCP and 503 SCTP provide keep-alive/heartbeat features). When supported, this 504 mechanism SHOULD also be used by PLPMTUD to acknowledge reception of 505 a probe packet. 507 A PL that does not acknowledge data reception (e.g., UDP and UDP- 508 Lite) is unable to detect when the packets it sends are discarded 509 because their size is greater than the actual PMTUD. These PLs need 510 to either rely on an application protocol to detect this, or make use 511 of an additional transport method such as UDP-Options 512 [I-D.ietf-tsvwg-udp-options]. In addition, they might need to send 513 reachability probes (e.g., periodically solicit a response from the 514 destination) to determine whether the current effective PMTU is still 515 supported by the network path. 517 Section Section 4 specifies this function for a set of IETF-specified 518 protocols. 520 3.3. 
Reduction of the Effective PMTU 522 When the current effective PMTU is no longer supported by the network 523 path, the transport needs to detect this and reduce the effective 524 PMTU. 526 o A PL that sends a datagram larger than the actual PMTU that 527 includes no application data block, or one that does not attempt 528 to provide any retransmission, can send a new probe packet with an 529 updated probe size. 531 o A PL that wishes to resend the application data block, could then 532 need to re-fragment the data block to a smaller packet size that 533 is expected to traverse the end-to-end path. This could utilise 534 network-layer or PL fragmentation when these are available. A 535 fragmented datagram MUST NOT be used as a probe packet (see 536 [RFC8201]). 538 A method can additionally utilise PTB messages to detect when the 539 actual PMTU supported by a network path is less than the current size 540 of datagrams (or probe messages) that are being sent. 542 4. Datagram Packetization Layer PMTUD 544 This section specifies Datagram PLPMTUD. 546 The central idea of PLPMTU discovery is probing by a sender. Probe 547 packets of increasing size are sent to find out the maximum size of a 548 user message that is completely transferred across the network path 549 from the sender to the destination. 551 4.1. Probing 553 The PLPMTUD method utilises a timer to trigger the generation of 554 probe packets. The probe_timer is started each time a probe packet 555 is sent to the destination and is cancelled when receipt of the probe 556 packet is acknowledged. 558 The PROBE_COUNT is initialised to zero when a probe packet is first 559 sent with a particular size. Each time the probe_timer expires, the 560 PROBE_COUNT is incremented, and a probe packet of the same size is 561 retransmitted. The maximum number of retransmissions per probing 562 size is configured (MAX_PROBES). 
If the value of the PROBE_COUNT 563 reaches MAX_PROBES, probing will be stopped and the last successfully 564 probed PMTU is set as the effective PMTU. 566 Once probing is completed, the sender continues to use the effective 567 PMTU until either a PTB message is received or the PMTU_RAISE_TIMER 568 expires. If the PL is unable to verify reachability to the 569 destination endpoint after probing has completed, the method uses a 570 REACHABILITY_TIMER to periodically repeat a probe packet for the 571 current effective PMTU size, while the PMTU_RAISE_TIMER is running. 572 If the resulting probe packet is not acknowledged (i.e., the 573 PROBE_TIMER expires), the method re-starts probing for the PMTU. 575 4.2. Verification and Use of PTB Messages 577 This section describes processing for both IPv4 ICMP Unreachable 578 messages (Type 3) and ICMPv6 Packet Too Big messages. 580 A node that receives a PTB message from a router or middlebox MUST 581 verify the PTB message. The node checks the protocol information in 582 the quoted payload to verify that the message originated from the 583 sending node. The node also checks that the reported MTU size is 584 less than the size used by packet probes. PTB messages are discarded 585 if they fail to pass these checks, or where there is insufficient 586 ICMP payload to perform these checks. The checks are intended to 587 provide protection from packets that originate from a node that is 588 not on the network path or a node that attempts to report a larger 589 MTU than the current probe size. 591 PTB messages that have been verified can be utilised by the DPLPMTUD 592 algorithm. A method that utilises these PTB messages can improve 593 performance compared to one that relies solely on probing. 595 4.3. Timers 597 This method utilises three timers: 599 PROBE_TIMER: Configured to expire after a period longer than the 600 maximum time to receive an acknowledgment to a probe packet. 
      This value MUST be larger than 1 second, and SHOULD be larger than
      15 seconds.  Guidance on selection of the timer value is provided
      in Section 3.1.1 of the UDP Usage Guidelines [RFC8085].

   PMTU_RAISE_TIMER:  Configured to the period a sender ought to
      continue to use the current effective PMTU, after which it
      re-commences probing for a higher PMTU.  This timer has a period
      of 600 seconds, as recommended by PLPMTUD [RFC4821].

   REACHABILITY_TIMER:  Configured to the period a sender ought to wait
      before confirming the current effective PMTU is still supported.
      This is less than the PMTU_RAISE_TIMER.

      An application that needs to employ keep-alive messages to deliver
      useful service over UDP SHOULD NOT transmit them more frequently
      than once every 15 seconds and SHOULD use longer intervals when
      possible.  DPLPMTUD ought to suspend reachability probes when no
      application data has been sent since the previous probe packet.
      Guidance on selection of the timer value is provided in Section
      3.1.1 of the UDP Usage Guidelines [RFC8085].

   An implementation could implement the various timers using a single
   timer process.

4.4.  Constants

   The following constants are defined:

   MAX_PROBES:  The maximum value of the PROBE_COUNT.  The default value
      of MAX_PROBES is 10.

   MIN_PMTU:  The smallest allowed probe packet size.  For IPv6, this
      value is 1280 bytes, as specified in [RFC2460].  For IPv4, the
      minimum value is 68 bytes.  (An IPv4 router is required to be able
      to forward a datagram of 68 octets without further fragmentation.
      This is the combined size of an IPv4 header and the minimum
      fragment size of 8 octets.)

   BASE_PMTU:  The BASE_PMTU is considered a size that ought to work in
      most cases.  The size is equal to or larger than the minimum
      permitted and smaller than the maximum allowed.  In the case of
      IPv6, this value is 1280 bytes [RFC2460].
      When using IPv4, a size of 1200 bytes is RECOMMENDED.

   MAX_PMTU:  The MAX_PMTU is the largest size of PMTU that is probed.
      This has to be less than or equal to the minimum of the local MTU
      of the outgoing interface and the destination effective MTU for
      receiving.  An application or PL may reduce this when it knows
      there is no need to send packets above a specific size.

4.5.  Variables

   This method utilises a set of variables:

   effective PMTU:  The effective PMTU is the maximum size of datagram
      that the method has currently determined can be supported along
      the entire path.

   PROBED_SIZE:  The PROBED_SIZE is the size of the current probe
      packet.  This is a tentative value for the effective PMTU, which
      is awaiting confirmation by an acknowledgment.

   PROBE_COUNT:  This is a count of the number of unsuccessful probe
      packets that have been sent with size PROBED_SIZE.  The value is
      initialised to zero when a particular size of PROBED_SIZE is first
      attempted.

   PTB_SIZE:  The PTB_SIZE is the value returned by a verified PTB
      message indicating the local MTU size of a router along the path.

4.6.  Selecting PROBED_SIZE

   Implementations discover the search range by validating the minimum
   path MTU and then using the probe method to select a PROBED_SIZE less
   than or equal to MAX_PMTU, where MAX_PMTU is the minimum of the local
   link MTU and EMTU_R (learned from the remote endpoint).  The MAX_PMTU
   MAY be constrained by an application that has a maximum to the size
   of datagrams it wishes to send.

   Implementations use a search algorithm to choose probe sizes within
   the search range.  XXX The current method does not specify or
   recommend a specific method for selecting a probe size.
   One simple method is to increase the probe size in increments until a
   probe fails; other methods may use tables to select probe sizes, or
   search algorithms.  This part is to be expanded based on experience
   and consideration of methods.  XXX

   Implementations MAY optimise the search procedure by selecting step
   sizes from a table of common MTU sizes.

   Implementations SHOULD select probe sizes to maximise the gain in
   PMTU in each search step.  Implementations ought to take into
   consideration useful probe size steps and a minimum useful gain in
   PMTU.

4.7.  State Machine

   A state machine for Datagram PLPMTUD is depicted in Figure 1.  If
   multihoming is supported, a state machine is needed for each active
   path.

                                       PROBE_TIMER expiry
                                     (PROBE_COUNT = MAX_PROBES)
                        +-------------+                +--------------+
                     =->| PROBE_START |--------------->|PROBE_DISABLED|
  PROBE_TIMER expiry |  +-------------+                +--------------+
    (PROBE_COUNT =   |     |         |
       MAX_PROBES)   -------         | Connectivity confirmed
                                     v
         -----------             +------------+  -- PROBE_TIMER expiry
 MAX_PMTU acked or |             | PROBE_BASE |   | (PROBE_COUNT <
 PTB (>= BASE_PMTU)|      -----> +------------+ <-   MAX_PROBES)
 ----------------  |          /\     |   |
 |                 |           |     |   | PTB
 | PMTU_RAISE_TIMER|           |     |   | (PTB_SIZE < BASE_PMTU)
 | or reachability |           |     |   |        or
 | (PROBE_COUNT    |           |     |   | PROBE_TIMER expiry
 |  = MAX_PROBES)  |           |     |   | (PROBE_COUNT = MAX_PROBES)
 |     -------------           |     |    \
 |    |              PTB       |     |     \
 |    |         (< PROBED_SIZE)|     |      \
 |    |                        |     |       ----------------
 |    |                        |     |                      |
 |    |                        |     | Probe                |
 |    |                        |     | acked                |
 v    |                        |     v                      v
+------------+                +--------------+ Probe  +-------------+
| PROBE_DONE |<-------------- | PROBE_SEARCH |<-------| PROBE_ERROR |
+------------+ MAX_PMTU acked +--------------+ acked  +-------------+
  /\   |            or             /\     |
  |    |    PROBE_TIMER expiry     |      |
  |    |(PROBE_COUNT = MAX_PROBES) |      |
  |    |                           |      |
  ------                           --------
Reachability probe acked        PROBE_TIMER expiry
or PROBE_TIMER expiry           (PROBE_COUNT < MAX_PROBES)
(PROBE_COUNT < MAX_PROBES)                  or
                                       Probe acked

             Figure 1: State machine for Datagram PLPMTUD

   XXX State machine to be updated to describe handling of validated PTB
   messages XXX

   XXX Method may be updated to clarify how probe sizes are used during
   probing XXX

   The following states are defined to reflect the probing process:

   PROBE_START:  The PROBE_START state is the initial state before
      probing has started.  PLPMTUD is not performed in this state.  The
      state transitions to PROBE_BASE when a path has been confirmed,
      i.e., when a sent packet has been acknowledged on this path.  The
      effective PMTU is set to the BASE_PMTU size.  Probing ought to
      start immediately after connection setup to prevent the loss of
      user data.

   PROBE_BASE:  The PROBE_BASE state is the starting point for probing
      with datagram PLPMTUD.  It is used to confirm whether the
      BASE_PMTU size is supported by the network path.  On entry, the
      PROBED_SIZE is set to the BASE_PMTU size and the PROBE_COUNT is
      set to zero.  A probe packet is sent, and the PROBE_TIMER is
      started.  The state is left when the PROBE_COUNT reaches
      MAX_PROBES, a PTB message is verified, or a probe packet is
      acknowledged.

   PROBE_SEARCH:  The PROBE_SEARCH state is the main probing state.
      This state is entered either when probing for the BASE_PMTU was
      successful or when there is a successful reachability test in the
      PROBE_ERROR state.  On entry, the effective PMTU is set to the
      last acknowledged PROBED_SIZE.

      The PROBE_COUNT is set to zero when the first probe packet is sent
      for each probed size.  Each time a probe packet is acknowledged,
      the effective PMTU is set to the PROBED_SIZE, and then the
      PROBED_SIZE is increased.

      When a probe packet is sent and not acknowledged within the period
      of the PROBE_TIMER, the PROBE_COUNT is incremented and the probe
      packet is retransmitted.
      The state is exited when the PROBE_COUNT reaches MAX_PROBES, a PTB
      message is verified, or a probe of size MAX_PMTU is acknowledged.

   PROBE_ERROR:  The PROBE_ERROR state represents the case where the
      network path is not known to support an effective PMTU of at least
      the BASE_PMTU size.  It is entered when either a probe of size
      BASE_PMTU has not been acknowledged or a verified PTB message
      indicates a smaller link MTU than the BASE_PMTU.  On entry, the
      PROBE_COUNT is set to zero, the PROBED_SIZE is set to the MIN_PMTU
      size, and the effective PMTU is reset to the MIN_PMTU size.  In
      this state, a probe packet is sent, and the PROBE_TIMER is
      started.  The state transitions to the PROBE_SEARCH state when a
      probe packet is acknowledged.

   PROBE_DONE:  The PROBE_DONE state indicates a successful end to a
      probing phase.  Datagram PLPMTUD remains in this state until
      either the PMTU_RAISE_TIMER expires or a received PTB message is
      verified.

      When PLPMTUD uses an unacknowledged PL and is in the PROBE_DONE
      state, a REACHABILITY_TIMER periodically resets the PROBE_COUNT
      and schedules a probe packet with the size of the effective PMTU.
      If the probe packet fails to be acknowledged after MAX_PROBES
      attempts, the method enters the PROBE_BASE state.  When used with
      an acknowledged PL (e.g., SCTP), DPLPMTUD SHOULD NOT continue to
      probe in this state.

   PROBE_DISABLED:  The PROBE_DISABLED state indicates that connectivity
      could not be established.  DPLPMTUD MUST NOT probe in this state.

   Appendix A contains an informative description of key events.

5.  Specification of Protocol-Specific Methods

   This section specifies protocol-specific details for datagram PLPMTUD
   for IETF-specified transports.

5.1.  DPLPMTUD for UDP and UDP-Lite

   The current specifications of UDP [RFC0768] and UDP-Lite [RFC3828] do
   not define a method in the RFC-series that supports PLPMTUD.
   In particular, these transports do not provide the transport layer
   features needed to implement datagram PLPMTUD, and any support for
   Datagram PLPMTUD would therefore need to rely on higher-layer
   protocol features [RFC8085].

5.1.1.  UDP Options

   UDP-Options [I-D.ietf-tsvwg-udp-options] supply the additional
   functionality required to implement datagram PLPMTUD.  This enables
   padding to be added to UDP datagrams and can be used to provide
   feedback acknowledgement of received probe packets.

5.1.2.  UDP Options Required for PLPMTUD

   This subsection proposes two new UDP-Options that add support for
   requesting that a datagram response be sent and for marking a
   datagram as a response to a request.

   XXX Future versions of the spec may define a parameter in an Option
   to indicate the EMTU_R to the peer that can be used to initialise
   MAX_PMTU.  XXX

5.1.2.1.  Echo Request Option

   The Echo Request Option allows a sending endpoint to solicit a
   response from a destination endpoint.

   The Echo Request carries a four-byte token set by the sender.  This
   token can be set to a value that is likely to be known only to the
   sender (and becomes known to nodes along the end-to-end path).  The
   sender can then check the value returned in the response to provide
   additional protection from off-path insertion of data [RFC8085].

               +---------+--------+-----------------+
               | Kind=9  | Len=6  |      Token      |
               +---------+--------+-----------------+
                 1 byte    1 byte       4 bytes

                  Figure 2: UDP ECHOREQ Option Format

5.1.2.2.  Echo Response Option

   The Echo Response Option is generated by the PL in response to a
   received Echo Request.  The Token field associates the response with
   the Token value carried in the most recently received Echo Request.
   The generation of UDP packets carrying an Echo Response Option MAY be
   rate-limited.
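
   As an informal illustration, the two options could be encoded and
   matched as in the Python sketch below.  The Kind and Len values are
   those shown in Figures 2 and 3.  The function names and the use of a
   random token are assumptions made for this example; they are not
   defined by this document or by UDP-Options.

```python
import os
import struct

ECHOREQ_KIND = 9   # Kind value from Figure 2
ECHORES_KIND = 10  # Kind value from Figure 3
ECHO_LEN = 6       # 1 byte Kind + 1 byte Len + 4 byte Token


def new_token():
    """Return a random 4-byte token (hypothetical helper)."""
    return os.urandom(4)


def encode_echo(kind, token):
    """Encode an Echo Request or Echo Response option as 6 bytes."""
    if len(token) != 4:
        raise ValueError("token must be exactly 4 bytes")
    return struct.pack("!BB4s", kind, ECHO_LEN, token)


def decode_echo(option):
    """Return (kind, token), or None if the bytes are not an echo option."""
    if len(option) < ECHO_LEN:
        return None
    kind, length, token = struct.unpack("!BB4s", option[:ECHO_LEN])
    if length != ECHO_LEN or kind not in (ECHOREQ_KIND, ECHORES_KIND):
        return None
    return kind, token


def response_matches(sent_token, option):
    """True if option is an Echo Response carrying the token that was sent."""
    decoded = decode_echo(option)
    return decoded == (ECHORES_KIND, sent_token)
```

   In this sketch a sender would keep the token of the outstanding probe
   and treat the probe as acknowledged only when response_matches()
   returns True for a received option.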

               +---------+--------+-----------------+
               | Kind=10 | Len=6  |      Token      |
               +---------+--------+-----------------+
                 1 byte    1 byte       4 bytes

                  Figure 3: UDP ECHORES Option Format

5.1.3.  Sending UDP-Option Probe Packets

   This method specifies a probe packet that does not carry an
   application data block.  The probe packet consists of a UDP datagram
   header followed by a UDP Option containing the ECHOREQ option, which
   is followed by NOP Options to pad the remainder of the datagram
   payload to the probe size.  NOP padding is used to control the length
   of the probe packet.

   A UDP Option carrying the ECHORES option is used to provide feedback
   when a probe packet is received at the destination endpoint.

5.1.4.  Validating the Path with UDP Options

   Since UDP is an unacknowledged PL, a sender that does not have
   higher-layer information confirming correct delivery of datagrams
   SHOULD implement the REACHABILITY_TIMER to periodically send probe
   packets while in the PROBE_DONE state.

5.1.5.  Handling of PTB Messages by UDP

   Normal ICMP verification MUST be performed as specified in
   Section 5.2 of [RFC8085].  This requires that the PL verifies each
   received PTB message to confirm that it was received in response to
   transmitted traffic and that the reported Link MTU is less than the
   current probe size.  A verified PTB message MAY be used as input to
   the PLPMTUD algorithm.

5.2.  DPLPMTUD for SCTP

   Section 10.2 of [RFC4821] specifies a recommended PLPMTUD probing
   method for SCTP.  It recommends the use of the PAD chunk, defined in
   [RFC4820], attached to a minimum length HEARTBEAT chunk to build a
   probe packet.  This enables probing without affecting the transfer of
   user messages and without interfering with congestion control.  This
   is preferred to using DATA chunks (with padding as required) as path
   probes.
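
   As an illustration only, the PAD chunk payload length for such a
   probe packet could be derived as in the following Python sketch.  The
   header sizes are those of IPv4 and IPv6 without options or extension
   headers and of SCTP [RFC4960].  The function names, the parameters,
   and the choice to round down to the 4-byte chunk alignment are
   assumptions made for this example.

```python
# Illustrative sketch only; names and parameters are assumptions.
IPV4_HEADER = 20         # IPv4 header without options
IPV6_HEADER = 40         # IPv6 header without extension headers
SCTP_COMMON_HEADER = 12  # SCTP common header
CHUNK_HEADER = 4         # type, flags, and length of any SCTP chunk
PARAM_HEADER = 4         # type and length of the Heartbeat Info parameter


def round_up4(n):
    """Round n up to the next multiple of 4 (SCTP chunk alignment)."""
    return (n + 3) & ~3


def pad_payload_len(probe_size, ip_header, heartbeat_info_len):
    """Return the PAD chunk payload length for a probe of probe_size bytes.

    The probe is: IP header, SCTP common header, HEARTBEAT chunk,
    PAD chunk header, PAD chunk payload.
    """
    heartbeat_chunk = CHUNK_HEADER + PARAM_HEADER + round_up4(heartbeat_info_len)
    fixed = ip_header + SCTP_COMMON_HEADER + heartbeat_chunk + CHUNK_HEADER
    payload = probe_size - fixed
    if payload < 0:
        raise ValueError("probe size is smaller than the fixed headers")
    # Assumption: round down so that chunk padding never makes the
    # packet larger than the requested probe size.
    return payload - (payload % 4)
```

   For example, with an IPv6 probe size of 1280 bytes and 8 bytes of
   Heartbeat Information, this sketch yields a PAD payload of 1208
   bytes (40 + 12 + 16 + 4 + 1208 = 1280).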

   XXX Future versions of this specification might define a parameter
   contained in the INIT and INIT ACK chunk to indicate the MTU to the
   peer.  However, multihoming makes this a bit complex, so it might not
   be worth doing.  XXX

5.2.1.  SCTP/IPv4 and SCTP/IPv6

   The base protocol is specified in [RFC4960].

5.2.1.1.  Sending SCTP Probe Packets

   Probe packets consist of an SCTP common header followed by a
   HEARTBEAT chunk and a PAD chunk.  The PAD chunk is used to control
   the length of the probe packet.  The HEARTBEAT chunk is used to
   trigger the sending of a HEARTBEAT ACK chunk.  The reception of the
   HEARTBEAT ACK chunk acknowledges reception of a successful probe.

   The HEARTBEAT chunk carries a Heartbeat Information parameter which
   should include, besides the information suggested in [RFC4960], the
   probing size, which is the MTU size the complete datagram will add up
   to.  The size of the PAD chunk is therefore computed by reducing the
   probing size by the IPv4 or IPv6 header size, the SCTP common header,
   the HEARTBEAT request and the PAD chunk header.  The payload of the
   PAD chunk contains arbitrary data.

   To avoid fragmentation of retransmitted data, probing starts right
   after the handshake, before data is sent.  Assuming normal behaviour
   (i.e., the PMTU is smaller than or equal to the interface MTU), this
   process will take a few round trip time periods depending on the
   number of PMTU sizes probed.  The Heartbeat timer can be used to
   implement the PROBE_TIMER.

5.2.1.2.  Validating the Path with SCTP

   Since SCTP provides an acknowledged PL, a sender MUST NOT implement
   the REACHABILITY_TIMER while in the PROBE_DONE state.

5.2.1.3.  PTB Message Handling by SCTP

   Normal ICMP verification MUST be performed as specified in Appendix C
   of [RFC4960].
   This requires that the first 8 bytes of the SCTP common header are
   quoted in the payload of the PTB message, which can be the case for
   ICMPv4 and is normally the case for ICMPv6.

   When a PTB message has been verified, the router Link MTU indicated
   in the PTB message SHOULD be used with the PLPMTUD algorithm,
   provided that the reported Link MTU is less than the current probe
   size.

5.2.2.  DPLPMTUD for SCTP/UDP

   The UDP encapsulation of SCTP is specified in [RFC6951].

5.2.2.1.  Sending SCTP/UDP Probe Packets

   Packet probing can be performed as specified in Section 5.2.1.1.  The
   maximum payload is reduced by 8 bytes, which has to be considered
   when filling the PAD chunk.

5.2.2.2.  Validating the Path with SCTP/UDP

   Since SCTP provides an acknowledged PL, a sender MUST NOT implement
   the REACHABILITY_TIMER while in the PROBE_DONE state.

5.2.2.3.  Handling of PTB Messages by SCTP/UDP

   Normal ICMP verification MUST be performed for PTB messages as
   specified in Appendix C of [RFC4960].  This requires that the first 8
   bytes of the SCTP common header are contained in the PTB message,
   which can be the case for ICMPv4 (but note the UDP header also
   consumes a part of the quoted packet header) and is normally the case
   for ICMPv6.  When the verification is completed, the router Link MTU
   size indicated in the PTB message SHOULD be used with the PLPMTUD
   algorithm, provided that the reported Link MTU is less than the
   current probe size.

5.2.3.  DPLPMTUD for SCTP/DTLS

   The Datagram Transport Layer Security (DTLS) encapsulation of SCTP is
   specified in [I-D.ietf-tsvwg-sctp-dtls-encaps].  It is used for data
   channels in WebRTC implementations.

5.2.3.1.  Sending SCTP/DTLS Probe Packets

   Packet probing can be done as specified in Section 5.2.1.1.

5.2.3.2.  Validating the Path with SCTP/DTLS

   Since SCTP provides an acknowledged PL, a sender MUST NOT implement
   the REACHABILITY_TIMER while in the PROBE_DONE state.

5.2.3.3.  Handling of PTB Messages by SCTP/DTLS

   It is not possible to perform normal ICMP verification as specified
   in [RFC4960], since even if the ICMP message payload contains
   sufficient information, the reflected SCTP common header would be
   encrypted.  Therefore it is not possible to process PTB messages at
   the PL.

5.3.  PMTUD for QUIC

   XXX New section XXX

   Quick UDP Internet Connection (QUIC) is a UDP-based transport that
   provides reception feedback [I-D.ietf-quic-transport].

   Section 9.2 of [I-D.ietf-quic-transport] details the path
   considerations when sending QUIC packets.  It recommends the use of
   PADDING frames to build the probe packet.  This enables probing
   without affecting the transfer of other frames.

5.3.1.  Sending QUIC Probe Packets

   Probe packets consist of a QUIC Header and a payload containing only
   PADDING Frames.  PADDING Frames are a single octet (0x00) and several
   of these can be used to create a probe packet of size PROBED_SIZE.

   A QUIC sender needs to pad initial packets to 1200 bytes to validate
   that the path can support packets of a useful size.  If a QUIC sender
   determines the PMTU on a path has fallen below 1280 octets it MUST
   immediately stop sending on the affected path.

5.3.2.  Validating the Path with QUIC

   Since QUIC provides an acknowledged PL, a sender MUST NOT implement
   the REACHABILITY_TIMER while in the PROBE_DONE state.

5.3.3.  Handling of PTB Messages by QUIC

   QUIC does not specify any methods for validating ICMP responses, but
   does provide some guidelines to make it harder for an off-path
   attacker to inject ICMP messages:

   o  Set the IPv4 Don't Fragment (DF) bit on a small proportion of
      packets, so that most invalid ICMP messages arrive when there are
      no DF packets outstanding, and can therefore be identified as
      spurious.

   o  Store additional information from the IP or UDP headers from DF
      packets (for example, the IP ID or UDP checksum) to further
      authenticate incoming Datagram Too Big messages.

   o  Any reduction in PMTU due to a report contained in an ICMP packet
      is provisional until QUIC's loss detection algorithm determines
      that the packet is actually lost.

   XXX The above list was pulled whole from quic-transport XXX

5.4.  Other IETF Transports

   XXX This section to be updated in a later revision.  XXX

5.5.  DPLPMTUD by Applications

   Applications that use the Datagram API (e.g., applications built
   directly or indirectly on UDP) can implement DPLPMTUD.  Some
   primitives used by DPLPMTUD might not be available via this interface
   (e.g., the ability to access the PMTU cache, or interpret received
   ICMP PTB messages).

   In addition, it is important that PMTUD is not performed by multiple
   protocol layers.

   XXX This section will be completed in a future revision of this ID
   XXX

6.  Acknowledgements

   This work was partially funded by the European Union's Horizon 2020
   research and innovation programme under grant agreement No. 644334
   (NEAT).  The views expressed are solely those of the author(s).

7.  IANA Considerations

   This memo includes no request to IANA.

   XXX If new UDP Options are specified in this document, a request to
   IANA will be included here.  XXX

   If there are no requirements for IANA, the section will be removed
   during conversion into an RFC by the RFC Editor.

8.  Security Considerations

   The security considerations for the use of UDP and SCTP are provided
   in the referenced RFCs.
   Security guidance for applications using UDP is provided in the UDP
   Usage Guidelines [RFC8085].

   PTB messages could potentially be used to cause a node to
   inappropriately reduce the effective PMTU.  A node supporting PLPMTUD
   MUST appropriately verify the payload of PTB messages to ensure these
   are received in response to transmitted traffic (i.e., a reported
   error condition that corresponds to a datagram actually sent by the
   path layer).

   XXX Determine if parallel forwarding paths need to be considered.
   XXX

   A node performing PLPMTUD could experience conflicting information
   about the size of supported probe packets.  This could occur when
   multiple paths are concurrently in use and these exhibit a different
   PMTU.  If not considered, this could result in data being blackholed
   when the effective PMTU is larger than the smallest PMTU across the
   current paths.

9.  References

9.1.  Normative References

   [I-D.ietf-quic-transport]
              Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed
              and Secure Transport", draft-ietf-quic-transport-04 (work
              in progress), June 2017.

   [I-D.ietf-tsvwg-sctp-dtls-encaps]
              Tuexen, M., Stewart, R., Jesup, R., and S. Loreto, "DTLS
              Encapsulation of SCTP Packets", draft-ietf-tsvwg-sctp-
              dtls-encaps-09 (work in progress), January 2015.

   [I-D.ietf-tsvwg-udp-options]
              Touch, J., "Transport Options for UDP", draft-ietf-tsvwg-
              udp-options-01 (work in progress), June 2017.

   [RFC0768]  Postel, J., "User Datagram Protocol", STD 6, RFC 768,
              DOI 10.17487/RFC0768, August 1980.

   [RFC0792]  Postel, J., "Internet Control Message Protocol", STD 5,
              RFC 792, DOI 10.17487/RFC0792, September 1981.

   [RFC1122]  Braden, R., Ed., "Requirements for Internet Hosts -
              Communication Layers", STD 3, RFC 1122,
              DOI 10.17487/RFC1122, October 1989.

   [RFC1812]  Baker, F., Ed., "Requirements for IP Version 4 Routers",
              RFC 1812, DOI 10.17487/RFC1812, June 1995.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460,
              December 1998.

   [RFC3828]  Larzon, L-A., Degermark, M., Pink, S., Jonsson, L-E., Ed.,
              and G. Fairhurst, Ed., "The Lightweight User Datagram
              Protocol (UDP-Lite)", RFC 3828, DOI 10.17487/RFC3828, July
              2004.

   [RFC4820]  Tuexen, M., Stewart, R., and P. Lei, "Padding Chunk and
              Parameter for the Stream Control Transmission Protocol
              (SCTP)", RFC 4820, DOI 10.17487/RFC4820, March 2007.

   [RFC4960]  Stewart, R., Ed., "Stream Control Transmission Protocol",
              RFC 4960, DOI 10.17487/RFC4960, September 2007.

   [RFC6951]  Tuexen, M. and R. Stewart, "UDP Encapsulation of Stream
              Control Transmission Protocol (SCTP) Packets for End-Host
              to End-Host Communication", RFC 6951,
              DOI 10.17487/RFC6951, May 2013.

   [RFC8085]  Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
              Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
              March 2017.

   [RFC8201]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
              "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
              DOI 10.17487/RFC8201, July 2017.

9.2.  Informative References

   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              DOI 10.17487/RFC1191, November 1990.

   [RFC2923]  Lahey, K., "TCP Problems with Path MTU Discovery",
              RFC 2923, DOI 10.17487/RFC2923, September 2000.

   [RFC4340]  Kohler, E., Handley, M., and S. Floyd, "Datagram
              Congestion Control Protocol (DCCP)", RFC 4340,
              DOI 10.17487/RFC4340, March 2006.

   [RFC4821]  Mathis, M. and J.
              Heffner, "Packetization Layer Path MTU Discovery",
              RFC 4821, DOI 10.17487/RFC4821, March 2007.

   [RFC4890]  Davies, E. and J. Mohacsi, "Recommendations for Filtering
              ICMPv6 Messages in Firewalls", RFC 4890,
              DOI 10.17487/RFC4890, May 2007.

Appendix A.  Event-driven state changes

   This appendix contains an informative description of key events:

   Path Setup:  When a new path is initiated, the state is set to
      PROBE_START.  As soon as the path is confirmed, the state changes
      to PROBE_BASE and the probing mechanism for this path is started.
      The first probe packet is sent with the size of the BASE_PMTU.

   Arrival of an Acknowledgment:  Depending on the probing state, the
      reaction differs according to Figure 4, which is just a
      simplification of Figure 1 focusing on this event.

+--------------+                                    +----------------+
| PROBE_START  | --3------------------------------->| PROBE_DISABLED |
+--------------+ --4-----------\                    +----------------+
                                \
+--------------+                 \
| PROBE_ERROR  | ---------------  \
+--------------+                \  \
                                 \  \
+--------------+                  \  \              +--------------+
|  PROBE_BASE  | --1----------     \  ------------> |  PROBE_BASE  |
+--------------+ --2-----     \     \               +--------------+
                         \     \     \
+--------------+          \     \     ------------> +--------------+
| PROBE_SEARCH | --2---    \     -----------------> | PROBE_SEARCH |
+--------------+ --1---\----\---------------------> +--------------+
                        \    \
+--------------+         \    \                     +--------------+
|  PROBE_DONE  |          \    -------------------> |  PROBE_DONE  |
+--------------+           -----------------------> +--------------+

   Condition 1: The maximum PMTU size has not yet been reached.
   Condition 2: The maximum PMTU size has been reached.
   Condition 3: PROBE_TIMER expires and PROBE_COUNT = MAX_PROBES.
   Condition 4: PROBE_ACK received.

       Figure 4: State changes at the arrival of an acknowledgment

   Probing timeout:  The PROBE_COUNT is initialised to zero each time
      the value of PROBED_SIZE is changed.  The PROBE_TIMER is started
      each time a probe packet is sent.  It is stopped when an
      acknowledgment arrives that confirms delivery of a probe packet.
      If the probe packet is not acknowledged before the PROBE_TIMER
      expires, the PROBE_COUNT is incremented.  When the PROBE_COUNT
      equals the value MAX_PROBES, the state is changed; otherwise a new
      probe packet of the same size (PROBED_SIZE) is sent.  The state
      transitions are illustrated in Figure 5.  This shows a
      simplification of Figure 1 with a focus only on this event.

+--------------+                                    +----------------+
| PROBE_START  |----------------------------------->| PROBE_DISABLED |
+--------------+                                    +----------------+

+--------------+                                    +--------------+
| PROBE_ERROR  |                 -----------------> | PROBE_ERROR  |
+--------------+                /                   +--------------+
                               /
+--------------+ --2----------/                     +--------------+
|  PROBE_BASE  | --1------------------------------> |  PROBE_BASE  |
+--------------+                                    +--------------+

+--------------+                                    +--------------+
| PROBE_SEARCH | --1------------------------------> | PROBE_SEARCH |
+--------------+ --2---------                       +--------------+
                             \
+--------------+              \                     +--------------+
|  PROBE_DONE  |               -------------------> |  PROBE_DONE  |
+--------------+                                    +--------------+

   Condition 1: The maximum number of probe packets has not been
   reached.
   Condition 2: The maximum number of probe packets has been reached.

      Figure 5: State changes at the expiration of the probe timer

   PMTU raise timer timeout:  The path through the network can change
      over time.  It is impossible to discover whether a path change has
      increased the actual PMTU by exchanging packets less than or equal
      to the effective PMTU.
      This requires PLPMTUD to periodically send a probe packet to
      detect whether a larger PMTU is possible.  The sending of this
      probe packet is triggered by the PMTU_RAISE_TIMER.  When the timer
      expires, probing is restarted with the BASE_PMTU and the state is
      changed to PROBE_BASE.

   Arrival of an ICMP message:  The active probing of the path can be
      supported by the arrival of PTB messages sent by routers or
      middleboxes with a link MTU that is smaller than the probe packet
      size.  If the PTB message includes the router link MTU, three
      cases can be distinguished:

      1.  The indicated link MTU in the PTB message is between the
          already probed and effective MTU and the probe that triggered
          the PTB message.

      2.  The indicated link MTU in the PTB message is smaller than the
          effective PMTU.

      3.  The indicated link MTU in the PTB message is equal to the
          BASE_PMTU.

      In the first case, the PROBE_BASE state transitions to the
      PROBE_ERROR state.  In the PROBE_SEARCH state, a new probe packet
      is sent with the size reported by the PTB message.  Its result is
      handled according to the former events.

      The second case could be a result of a network re-configuration.
      If the reported link MTU in the PTB message is greater than the
      BASE_PMTU, probing starts again from the PROBE_BASE state.
      Otherwise, the method enters the PROBE_ERROR state.

      In the third case, the maximum possible PMTU has been reached.
      This ought to be probed again, because there could be a link
      further along the path with a still smaller MTU.

      Note: Not all routers include the link MTU size when they send a
      PTB message.  If the PTB message does not indicate the link MTU,
      the probe is handled in the same way as condition 2 of Figure 5.

Appendix B.  Revision Notes

   Note to RFC-Editor: please remove this entire section prior to
   publication.

   Individual draft -00:

   o  Comments and corrections are welcome directly to the authors or
      via the IETF TSVWG working group mailing list.

   o  This update is proposed for WG comments.

   Individual draft -01:

   o  Contains the first representation of the algorithm, showing the
      states and timers.

   o  This update is proposed for WG comments.

   Individual draft -02:

   o  Contains updated representation of the algorithm, and textual
      corrections.

   o  The text describing when to set the effective PMTU has not yet
      been verified by the authors.

   o  To determine security against off-path attacks: we need to decide
      whether a received PTB message SHOULD/MUST be verified.  The text
      on how to handle a PTB message indicating a link MTU larger than
      the probe has not yet been verified by the authors.

   o  No text currently describes how to handle inconsistent results
      from arbitrary re-routing along different parallel paths.

   o  This update is proposed for WG comments.

   Working Group draft -00:

   o  This draft follows a successful adoption call for TSVWG.

   o  There is still work to complete, please comment on this draft.

   Working Group draft -01:

   o  This draft includes an improved introduction.

   o  The draft is updated to require ICMP validation prior to accepting
      PTB messages; this is to be confirmed by the WG.

   o  Section added to discuss Selection of Probe Size; methods are to
      be evaluated and recommendations to be considered.

   o  Section added to align with work proposed in the QUIC WG.

Authors' Addresses

   Godred Fairhurst
   University of Aberdeen
   School of Engineering
   Fraser Noble Building
   Aberdeen  AB24 3U
   UK

   Email: gorry@erg.abdn.ac.uk


   Tom Jones
   University of Aberdeen
   School of Engineering
   Fraser Noble Building
   Aberdeen  AB24 3U
   UK

   Email: tom@erg.abdn.ac.uk


   Michael Tuexen
   Muenster University of Applied Sciences
   Stegerwaldstrasse 39
   Steinfurt  48565
   DE

   Email: tuexen@fh-muenster.de


   Irene Ruengeler
   Muenster University of Applied Sciences
   Stegerwaldstrasse 39
   Steinfurt  48565
   DE

   Email: i.ruengeler@fh-muenster.de