Network Working Group                                           C. Hopps
Internet-Draft                                   LabN Consulting, L.L.C.
Intended status: Standards Track                       December 16, 2019
Expires: June 18, 2020

                        IP Traffic Flow Security
                      draft-ietf-ipsecme-iptfs-00

Abstract

   This document describes a mechanism to enhance IPsec traffic flow
   security by adding traffic flow confidentiality to encrypted IP
   encapsulated traffic. Traffic flow confidentiality is provided by
   obscuring the size and frequency of IP traffic using a fixed-sized,
   constant-send-rate IPsec tunnel. The solution allows for congestion
   control as well.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 18, 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Terminology & Concepts
   2.  The IP-TFS Tunnel
     2.1.  Tunnel Content
     2.2.  IPTFS_PROTOCOL Payload Content
       2.2.1.  Data Blocks
       2.2.2.  No Implicit Padding Required
       2.2.3.  Empty Payload
       2.2.4.  IP Header Value Mapping
     2.3.  Exclusive SA Use
     2.4.  Initiating IP-TFS Operation On The SA.
     2.5.  Modes of Operation
       2.5.1.  Non-Congestion Controlled Mode
       2.5.2.  Congestion Controlled Mode
   3.  Congestion Information
     3.1.  ECN Support
   4.  Configuration
     4.1.  Bandwidth
     4.2.  Fixed Packet Size
     4.3.  Congestion Control
   5.  IKEv2
     5.1.  TFS Type Transform Type
     5.2.  IPTFS_REQUIREMENTS Status Notification
   6.  Packet and Data Formats
     6.1.  ESP IP-TFS Payload
       6.1.1.  Non-Congestion Control IPTFS_PROTOCOL Payload Format
       6.1.2.  Congestion Control IPTFS_PROTOCOL Payload Format
       6.1.3.  Data Blocks
   7.  IANA Considerations
     7.1.  IPTFS_PROTOCOL Type
     7.2.  IKEv2 Transform Type TFS Type
     7.3.  TFS Type Transform IDs Registry
     7.4.  IPTFS_REQUIREMENTS Notify Message Status Type
   8.  Security Considerations
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Appendix A.  Example Of An Encapsulated IP Packet Flow
   Appendix B.  A Send and Loss Event Rate Calculation
   Appendix C.  Comparisons of IP-TFS
     C.1.  Comparing Overhead
       C.1.1.  IP-TFS Overhead
       C.1.2.  ESP with Padding Overhead
     C.2.  Overhead Comparison
     C.3.  Comparing Available Bandwidth
       C.3.1.  Ethernet
   Appendix D.  Acknowledgements
   Appendix E.  Contributors
   Author's Address

1.  Introduction

   Traffic Analysis ([RFC4301], [AppCrypt]) is the act of extracting
   information about data being sent through a network. While one may
   directly obscure the data through the use of encryption [RFC4303],
   the traffic pattern itself exposes information due to variations in
   its shape and timing ([I-D.iab-wire-image], [AppCrypt]). Hiding the
   size and frequency of traffic is referred to as Traffic Flow
   Confidentiality (TFC) per [RFC4303].

   [RFC4303] provides for TFC by allowing padding to be added to
   encrypted IP packets and allowing for transmission of all-pad packets
   (indicated using protocol 59).
   This method has the major limitation that it can significantly
   under-utilize the available bandwidth.

   The IP-TFS solution provides for full TFC without the aforementioned
   bandwidth limitation. To do this, we use a constant-send-rate IPsec
   [RFC4303] tunnel with fixed-sized encapsulating packets; however,
   these fixed-sized packets can contain partial, whole or multiple IP
   packets to maximize the bandwidth of the tunnel.

   For a comparison of the overhead of IP-TFS with the TFC solution
   prescribed by [RFC4303], see Appendix C.

   Additionally, IP-TFS provides for dealing with network congestion
   [RFC2914]. This is important when the IP-TFS user is not in full
   control of the domain through which the IP-TFS tunnel path flows.

1.1.  Terminology & Concepts

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   [RFC2119] [RFC8174] when, and only when, they appear in all capitals,
   as shown here.

   This document assumes familiarity with IP security concepts described
   in [RFC4301].

2.  The IP-TFS Tunnel

   As mentioned in Section 1, IP-TFS utilizes an IPsec [RFC4303] tunnel
   (SA) as its transport. To provide for full TFC we send fixed-sized
   encapsulating packets at a constant rate on the tunnel.

   The primary input to the tunnel algorithm is the requested bandwidth
   of the tunnel. Two values are then required to provide for this
   bandwidth: the fixed size of the encapsulating packets, and the rate
   at which to send them.

   The fixed packet size may either be specified manually or can be
   determined through the use of Path MTU discovery [RFC1191] and
   [RFC8201].

   Given the encapsulating packet size and the requested tunnel
   bandwidth, the corresponding packet send rate can be calculated.
   The packet send rate is the requested bandwidth divided by the
   payload size of the encapsulating packet.

   The egress of the IP-TFS tunnel MUST allow for, and expect, the
   ingress (sending) side of the IP-TFS tunnel to vary the size and rate
   of sent encapsulating packets, unless constrained by other policy.

2.1.  Tunnel Content

   As previously mentioned, one issue with the TFC padding solution in
   [RFC4303] is the large amount of wasted bandwidth, as only one IP
   packet can be sent per encapsulating packet. In order to maximize
   bandwidth, IP-TFS breaks this one-to-one association.

   With IP-TFS we aggregate as well as fragment the inner IP traffic
   flow into fixed-sized encapsulating IPsec tunnel packets. We only
   pad the tunnel packets if there is no data available to be sent at
   the time of tunnel packet transmission, or if fragmentation has been
   disabled by the receiver.

   In order to do this we use a new Encapsulating Security Payload (ESP,
   [RFC4303]) payload type which is the new IP protocol number
   IPTFS_PROTOCOL (TBD1).

2.2.  IPTFS_PROTOCOL Payload Content

   The IPTFS_PROTOCOL ESP payload comprises a 4 or 16 octet header
   followed by either a partial data block, a full data block, or
   multiple partial or full data blocks. The following diagram
   illustrates the IPTFS_PROTOCOL ESP payload within the ESP packet.
   See Section 6.1 for the exact formats of the IPTFS_PROTOCOL payload.

   . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
   . Outer Encapsulating Header ...                                .
   . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
   . ESP Header...                                                 .
   +---------------------------------------------------------------+
   |   ...                         :           BlockOffset         |
   +---------------------------------------------------------------+
   :                  [Optional Congestion Info]                   :
   +---------------------------------------------------------------+
   |       DataBlocks ...                                          ~
   ~                                                               ~
   ~                                                               |
   +---------------------------------------------------------------+
   . ESP Trailer...                                                .
   . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

              Figure 1: Layout of an IP-TFS IPsec Packet

   The "BlockOffset" value is either zero or some offset into or past
   the end of the "DataBlocks" data.

   If the "BlockOffset" value is zero it means that the "DataBlocks"
   data begins with a new data block.

   Conversely, if the "BlockOffset" value is non-zero it points to the
   start of the new data block, and the initial "DataBlocks" data
   belongs to a previous data block that is still being re-assembled.

   The "BlockOffset" can point past the end of the "DataBlocks" data,
   which indicates that the next data block occurs in a subsequent
   encapsulating packet.

   Having the "BlockOffset" always point at the next available data
   block allows for quick recovery with minimal inner packet loss in the
   presence of outer encapsulating packet loss.

   An example IP-TFS packet flow can be found in Appendix A.

2.2.1.  Data Blocks

   +---------------------------------------------------------------+
   | Type  | rest of IPv4, IPv6 or pad.
   +--------

                Figure 2: Layout of IP-TFS data block

   A data block is defined by a 4-bit type code followed by the data
   block data. The type values have been carefully chosen to coincide
   with the IPv4/IPv6 version field values so that no per-data block
   type overhead is required to encapsulate an IP packet. Likewise, the
   length of the data block is extracted from the encapsulated IPv4 or
   IPv6 packet's length field.

2.2.2.  No Implicit Padding Required

   It's worth noting that there is never a need for an implicit pad at
   the end of an encapsulating packet.
   Even when the start of a data block occurs near the end of an
   encapsulating packet such that there is no room for the length field
   of the encapsulated header to be included in the current
   encapsulating packet, the fact that the length comes at a known
   location and is guaranteed to be present is enough to fetch the
   length field from the subsequent encapsulating packet payload. Only
   when there is no data to encapsulate is padding required, and then an
   explicit "Pad Data Block" would be used to identify the padding.

2.2.3.  Empty Payload

   In order to support reporting of congestion control information
   (described later) on a non-IP-TFS enabled SA, IP-TFS allows for the
   sending of an IP-TFS payload with no data blocks (i.e., the ESP
   payload length is equal to the IP-TFS header length). This special
   payload is called an empty payload.

2.2.4.  IP Header Value Mapping

   [RFC4301] provides some direction on when and how to map various
   values from an inner IP header to the outer encapsulating header,
   namely the Don't-Fragment (DF) bit ([RFC0791] and [RFC8200]), the
   Differentiated Services (DS) field [RFC2474] and the Explicit
   Congestion Notification (ECN) field [RFC3168]. Unlike [RFC4301],
   with IP-TFS we may, and often will, be encapsulating more than one IP
   packet per ESP packet. To deal with this we further restrict these
   mappings. In particular, we never map the inner DF bit as it is
   unrelated to the IP-TFS tunnel functionality; we never IP fragment
   the inner packets and the inner packets will not affect the
   fragmentation of the outer encapsulation packets. Likewise, the ECN
   value need not be mapped as any congestion related to the constant-
   send-rate IP-TFS tunnel is unrelated (by design!) to the inner
   traffic flow. Finally, by default the DS field SHOULD NOT be copied,
   although an implementation MAY choose to allow for configuration to
   override this behavior.
   An implementation SHOULD also allow the DS value to be set by
   configuration.

2.3.  Exclusive SA Use

   It is not the intention of this specification to allow for mixed use
   of an IP-TFS enabled SA. In other words, an SA that has IP-TFS
   enabled is exclusively for IP-TFS use and MUST NOT have non-IP-TFS
   payloads such as IP (IP protocol 4), TCP transport (IP protocol 6),
   or ESP pad packets (protocol 59) intermixed with non-empty IP-TFS (IP
   protocol TBD1) payloads. While it's possible to envision making the
   algorithm work in the presence of sequence number skips in the IP-TFS
   payload stream, the added complexity is not deemed worthwhile. Other
   IPsec uses can configure and use their own SAs.

2.4.  Initiating IP-TFS Operation On The SA.

   While a user will normally configure their IPsec tunnel (SA) to
   operate using IP-TFS from the start, we also allow IP-TFS operation
   to be enabled after the SA has been created and used. This late-
   enabling may be useful for debugging or other purposes. To support
   this late-enabled operation the receiver switches to IP-TFS operation
   on receipt of the first ESP payload with the IPTFS_PROTOCOL indicated
   as the payload type which also contains a data block (i.e., a non-
   empty IP-TFS payload). The receipt of an empty IPTFS_PROTOCOL
   payload (i.e., one without any data blocks) is used to communicate
   congestion control information from the receiver back to the sender
   on a non-IP-TFS enabled SA, and MUST NOT cause IP-TFS to be enabled
   on that SA.

2.5.  Modes of Operation

   Just as with normal IPsec/ESP tunnels, IP-TFS tunnels are
   unidirectional. Bidirectional IP-TFS functionality is achieved by
   setting up two IP-TFS tunnels, one in each direction.

   An IP-TFS tunnel can operate in two modes, a non-congestion
   controlled mode and a congestion controlled mode.

2.5.1.  Non-Congestion Controlled Mode

   In the non-congestion controlled mode IP-TFS sends fixed-sized
   packets at a constant rate. The packet send rate is constant and is
   not automatically adjusted regardless of any network congestion
   (e.g., packet loss).

   For similar reasons as given in [RFC7510], the non-congestion
   controlled mode should only be used where the user has full
   administrative control over the path the tunnel will take. This is
   required so the user can guarantee the bandwidth and also be sure
   that the tunnel is not negatively affecting network congestion
   [RFC2914]. In this case packet loss should be reported to the
   administrator (e.g., via syslog, YANG notification, SNMP traps, etc.)
   so that any failures due to a lack of bandwidth can be corrected.

2.5.2.  Congestion Controlled Mode

   With the congestion controlled mode, IP-TFS adapts to network
   congestion by lowering the packet send rate to accommodate the
   congestion, as well as raising the rate when congestion subsides.
   Since overhead is per packet, by allowing for maximal fixed-size
   packets and varying the send rate we minimize transport overhead.

   The output of the congestion control algorithm will adjust the rate
   at which the ingress sends packets. While this document does not
   require a specific congestion control algorithm, best current
   practice RECOMMENDS that the algorithm conform to [RFC5348].
   Congestion control principles are documented in [RFC2914] as well.
   An example of an implementation of the [RFC5348] algorithm which
   matches the requirements of IP-TFS (i.e., designed for fixed-size
   packets and a send rate varied based on congestion) is documented in
   [RFC4342].

   The required inputs for the TCP friendly rate control algorithm
   described in [RFC5348] are the receiver's loss event rate and the
   sender's estimated round-trip time (RTT).
   These values are provided by IP-TFS using the congestion information
   header fields described in Section 3. In particular these values are
   sufficient to implement the algorithm described in [RFC5348].

   At a minimum, the congestion information must be sent by both the
   receiver and the sender at least once per RTT. Prior to
   establishing an RTT the information SHOULD be sent constantly from
   the sender and the receiver so that an RTT estimate can be
   established. The lack of receiving this information over multiple
   consecutive RTT intervals should be considered a congestion event
   that causes the sender to adjust its sending rate lower. For
   example, [RFC4342] calls this the "no feedback timeout" and it is
   equal to 4 RTT intervals. When a "no feedback timeout" has occurred
   [RFC4342] halves the sending rate.

   An implementation could choose to always include the congestion
   information in its IP-TFS payload header if sending on an IP-TFS
   enabled SA. Since IP-TFS normally will operate with a large packet
   size, the congestion information should represent a small portion of
   the available tunnel bandwidth.

   When an implementation is choosing a congestion control algorithm (or
   a selection of algorithms) one should remember that IP-TFS is not
   providing for reliable delivery of IP traffic, and so per packet ACKs
   are not required and are not provided.

   It's worth noting that the variable send-rate of a congestion
   controlled IP-TFS tunnel is not private; however, this send-rate is
   being driven by network congestion, and as long as the encapsulated
   (inner) traffic flow shape and timing are not directly affecting the
   (outer) network congestion, the variations in the tunnel rate will
   not weaken the provided inner traffic flow confidentiality.

2.5.2.1.  Circuit Breakers

   In addition to congestion control, implementations MAY choose to
   define and implement circuit breakers [RFC8084] as a recovery method
   of last resort. Enabling circuit breakers is also a reason a user
   may wish to enable congestion information reports even when using the
   non-congestion controlled mode of operation. The definition of
   circuit breakers is outside the scope of this document.

3.  Congestion Information

   In order to support the congestion control mode, the sender needs to
   know the loss event rate and also be able to approximate the RTT
   ([RFC5348]). In order to obtain these values the receiver sends
   congestion control information on its SA back to the sender. Thus,
   in order to support congestion control the receiver must have a
   paired SA back to the sender (this is always the case when the tunnel
   was created using IKEv2). If the SA back to the sender is a non-IP-
   TFS enabled SA then an IPTFS_PROTOCOL empty payload (i.e., header
   only) is used to convey the information.

   In order to calculate a loss event rate compatible with [RFC5348],
   the receiver needs to have a round-trip time estimate. Thus the
   sender communicates this estimate in the "RTT" header field. On
   startup this value will be zero as no RTT estimate is yet known.

   In order to allow the sender to calculate the "RTT" value, the
   receiver communicates the last sequence number it has seen to the
   sender in the "LastSeqNum" header field. In addition to the
   "LastSeqNum" value, the receiver sends an estimate of the amount of
   time between receiving the "LastSeqNum" packet and transmitting the
   "LastSeqNum" value back to the sender in the congestion information.
   It places this time estimate in the "Delay" header field along with
   the "LastSeqNum".

   The receiver also calculates, and communicates in the "LossEventRate"
   header field, the loss event rate for use by the sender.
   This is slightly different from [RFC4342], which periodically sends
   all the loss interval data back to the sender so that it can do the
   calculation. See Appendix B for a suggested way to calculate the
   loss event rate value. Initially this value will be zero (indicating
   no loss) until enough data has been collected by the receiver to
   update it.

3.1.  ECN Support

   In addition to normal packet loss information, IP-TFS supports use
   of the ECN bits in the encapsulating IP header [RFC3168] for
   identifying congestion. If ECN use is enabled and a packet arrives
   at the egress endpoint with the Congestion Experienced (CE) value
   set, then the receiver considers that packet as being dropped,
   although it does not drop it. The receiver MUST set the E bit in any
   IPTFS_PROTOCOL payload header containing a "LossEventRate" value
   derived from a CE value being considered.

   As noted in [RFC3168] the ECN bits are not protected by IPsec and
   thus may constitute a covert channel. For this reason ECN use SHOULD
   NOT be enabled by default.

4.  Configuration

   IP-TFS is meant to be deployable with a minimal amount of
   configuration. All IP-TFS specific configuration should be able to
   be specified at the unidirectional tunnel ingress (sending) side. It
   is intended that non-IKEv2 operation is supported, at least, with
   local static configuration.

4.1.  Bandwidth

   Bandwidth is a local configuration option. For non-congestion
   controlled mode the bandwidth SHOULD be configured. For congestion
   controlled mode one can configure the bandwidth or have no
   configuration and let congestion control discover the maximum
   bandwidth available. No standardized configuration method is
   required.

4.2.  Fixed Packet Size

   The fixed packet size to be used for the tunnel encapsulation packets
   can be configured manually or can be automatically determined using
   Path MTU discovery (see [RFC1191] and [RFC8201]). No standardized
   configuration method is required.

4.3.  Congestion Control

   Congestion control is a local configuration option. No standardized
   configuration method is required.

5.  IKEv2

5.1.  TFS Type Transform Type

   When IP-TFS is used with IKEv2 a new "TFS Type" Transform Type (TBD2)
   is used to negotiate (as defined in [RFC7296]) the possible operation
   of IP-TFS on a child SA pair. This document defines 3 "TFS Type"
   Transform IDs for the new "TFS Type" Transform Type: None (0),
   TFS_IPTFS_CC (1) for congestion-controlled IP-TFS mode, and
   TFS_IPTFS_NOCC (2) for non-congestion controlled IP-TFS mode. The
   selection of a proposal with a "TFS Type" Transform ID TFS_IPTFS_CC
   or TFS_IPTFS_NOCC does not mandate the use of IP-TFS; rather, it
   indicates a willingness or intent to use IP-TFS on the SA pair. In
   addition, a new Notify Message Status Type IPTFS_REQUIREMENTS (TBD3)
   MAY be used by the initiator as well as the responder to further
   refine any operational requirements.

   Additional "TFS Type" Transform IDs may be defined in the future, and
   so readers are referred to [IKEV2IANA] for the most up to date list.

5.2.  IPTFS_REQUIREMENTS Status Notification

   As mentioned in the previous section, a new Notify Message Status
   Type IPTFS_REQUIREMENTS (TBD3) MAY be sent by the initiator and/or
   the responder to further refine what will be supported. This
   notification is sent during IKE_AUTH and new CREATE_CHILD_SA
   exchanges; however, it MUST NOT be sent, and MUST be ignored, during
   a CREATE_CHILD_SA rekeying exchange as the requirements are not
   allowed to change during rekeying.
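
   As a non-normative illustration of this notification, the sketch
   below models the three "TFS Type" Transform IDs from Section 5.1 and
   the one-octet flags payload defined in the remainder of this section.
   The names TfsType, encode_requirements and peer_dont_fragment are
   hypothetical and do not come from this document or from any IKEv2
   library. The sketch assumes that "ignored during rekeying" means the
   previously negotiated requirement is kept unchanged.

```python
from enum import IntEnum
from typing import Optional


class TfsType(IntEnum):
    """'TFS Type' Transform IDs listed in Section 5.1."""
    NONE = 0
    TFS_IPTFS_CC = 1    # congestion-controlled IP-TFS mode
    TFS_IPTFS_NOCC = 2  # non-congestion controlled IP-TFS mode


# D flag: last bit of the flags octet (|0|0|0|0|0|0|0|D|).
DONT_FRAGMENT = 0x01


def encode_requirements(dont_fragment: bool) -> bytes:
    """Build the 1-octet flags payload; reserved bits are sent as zero."""
    return bytes([DONT_FRAGMENT if dont_fragment else 0])


def peer_dont_fragment(payload: Optional[bytes], *, rekey: bool,
                       previous: bool = False) -> bool:
    """Return True if the peer cannot receive fragmented inner packets.

    Assumption: during a rekey the notification is ignored, so the
    previously negotiated value is returned unchanged.
    """
    if rekey:
        return previous
    if not payload:
        # Notification not sent: all flag bits are implied clear.
        return False
    # Reserved bits are ignored on receive; only D is examined.
    return bool(payload[0] & DONT_FRAGMENT)
```

   For example, peer_dont_fragment(encode_requirements(True),
   rekey=False) reports that the peer requires each inner packet to be
   sent in a single data block.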

   The IPTFS_REQUIREMENTS notification contains a 1 octet payload of
   flags that specify any extra requirements from the sender of the
   message. The flag values (currently a single flag) are defined
   below. If the IPTFS_REQUIREMENTS notification is not sent then it
   implies that all the flag bits are clear.

   +-+-+-+-+-+-+-+-+
   |0|0|0|0|0|0|0|D|
   +-+-+-+-+-+-+-+-+

   0:
      MUST be zero on send and MUST be ignored on receive.

   D:
      Don't Fragment bit. If set, it indicates that the sender of the
      notify message does not support receiving packet fragments (i.e.,
      inner packets MUST be sent using a single "Data Block"). This
      value only applies to what the sender is capable of receiving; the
      sender MAY still send packet fragments unless similarly restricted
      by the receiver in its IPTFS_REQUIREMENTS notification.

6.  Packet and Data Formats

6.1.  ESP IP-TFS Payload

   An ESP IP-TFS payload is identified by the IP protocol number
   IPTFS_PROTOCOL (TBD1). This payload begins with a fixed 4 or 16
   octet header followed by a variable amount of "DataBlocks" data. The
   exact payload format and fields are defined in the following
   sections.

6.1.1.  Non-Congestion Control IPTFS_PROTOCOL Payload Format

   The non-congestion control IPTFS_PROTOCOL payload comprises a 4
   octet header followed by a variable amount of "DataBlocks" data as
   shown below.

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V|C|         Reserved          |          BlockOffset          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       DataBlocks ...
   +-+-+-+-+-+-+-+-+-+-+-

   V:
      A 1 bit version field that MUST be set to zero. If received as
      one the packet MUST be dropped.

   C:
      A 1 bit value that MUST be set to 0 to indicate no congestion
      control information is present.

   Reserved:
      A 14 bit field set to 0 and ignored on receipt.

   BlockOffset:
      A 16 bit unsigned integer counting the number of octets of
      "DataBlocks" data before the start of a new data block.
      "BlockOffset" can count past the end of the "DataBlocks" data, in
      which case all the "DataBlocks" data belongs to the previous data
      block being re-assembled. If the "BlockOffset" extends into
      subsequent packets it continues to count only subsequent
      "DataBlocks" data (i.e., it does not count the non-"DataBlocks"
      octets of subsequent packets).

   DataBlocks:
      Variable number of octets that begins with the start of a data
      block, or the continuation of a previous data block, followed by
      zero or more additional data blocks.

6.1.2.  Congestion Control IPTFS_PROTOCOL Payload Format

   The congestion control IPTFS_PROTOCOL payload comprises a 16
   octet header followed by a variable amount of "DataBlocks" data as
   shown below.

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V|C|E|        Reserved         |          BlockOffset          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              RTT              |             Delay             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         LossEventRate                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          LastSeqNum                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       DataBlocks ...
   +-+-+-+-+-+-+-+-+-+-+-

   V:
      A 1 bit version field that MUST be set to zero. If received as
      one the packet MUST be dropped.

   C:
      A 1 bit value that MUST be set to 1, which indicates the presence
      of the congestion information header fields "RTT", "Delay",
      "LossEventRate" and "LastSeqNum".

   E:
      A 1 bit value that, if set, indicates that Congestion Experienced
      (CE) ECN bits were received and used in deriving the reported
      "LossEventRate".
614      Reserved:
615         A 13 bit field set to 0 and ignored on receipt.

617      BlockOffset:
619         The same value as the non-congestion controlled payload format
620         value.

622      RTT:
623         A 16 bit value specifying the sender's current round-trip time
624         estimate in milliseconds.  The value MAY be zero prior to the
625         sender having calculated a round-trip time estimate.  The value
626         SHOULD be set to zero on non-IP-TFS enabled SAs.

628      Delay:
629         A 16 bit value specifying the delay in milliseconds incurred
630         between the receiver receiving the "LastSeqNum" packet and the
631         sending of this acknowledgement of it.

633      LossEventRate:
634         A 32 bit value specifying the inverse of the current loss event
635         rate as calculated by the receiver.  A value of zero indicates no
636         loss.  Otherwise the loss event rate is "1/LossEventRate".

638      LastSeqNum:
639         A 32 bit value containing the lower 32 bits of the largest
640         sequence number last received.  This is the latest in the sequence,
641         not necessarily the most recently received (in the case of
642         re-ordering of packets it may be less recent).  When determining
643         the largest and 64 bit extended sequence numbers are in use, the
644         upper 32 bits should be used during the comparison.

646      DataBlocks:
647         Variable number of octets that begins with the start of a data
648         block, or the continuation of a previous data block, followed by
649         zero or more additional data blocks.  For the special case of
650         sending congestion control information on a non-IP-TFS enabled SA
651         this value MUST be empty (i.e., be zero octets long).

653   6.1.3.  Data Blocks

655                           1                   2                   3
656       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
657      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
658      | Type  | IPv4, IPv6 or pad...
659      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

661      Type:
662         A 4 bit field where 0x0 identifies a pad data block, 0x4 indicates
663         an IPv4 data block, and 0x6 indicates an IPv6 data block.

665   6.1.3.1.  IPv4 Data Block

667                           1                   2                   3
668       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
669      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
670      |  0x4  |  IHL  | TypeOfService |          TotalLength          |
671      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
672      | Rest of the inner packet ...
673      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

675      These values are the actual values within the encapsulated IPv4
676      header.  In other words, the start of this data block is the start of
677      the encapsulated IP packet.

679      Type:
680         A 4 bit value of 0x4 indicating IPv4 (i.e., first nibble of the
681         IPv4 packet).

683      TotalLength:
684         The 16 bit unsigned integer length field of the IPv4 inner packet.

686   6.1.3.2.  IPv6 Data Block

688                           1                   2                   3
689       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
690      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
691      |  0x6  | TrafficClass  |               FlowLabel               |
692      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
693      |          TotalLength          | Rest of the inner packet ...
694      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

696      These values are the actual values within the encapsulated IPv6
697      header.  In other words, the start of this data block is the start of
698      the encapsulated IP packet.

700      Type:
701         A 4 bit value of 0x6 indicating IPv6 (i.e., first nibble of the
702         IPv6 packet).

704      TotalLength:
705         The 16 bit unsigned integer length field of the inner IPv6
706         packet.

708   6.1.3.3.  Pad Data Block

709                           1                   2                   3
710       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
711      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
712      |  0x0  | Padding ...
713      +-+-+-+-+-+-+-+-+-+-+-

715      Type:
716         A 4 bit value of 0x0 indicating a padding data block.

718      Padding:
719         Extends to the end of the encapsulating packet.

721   7.  IANA Considerations

723   7.1.  IPTFS_PROTOCOL Type

725      This document requests a protocol number IPTFS_PROTOCOL be allocated
726      by IANA from the "Assigned Internet Protocol Numbers" registry for
727      identifying the IP-TFS ESP payload format.

729      Type:
730         TBD1

732      Description:
733         IP-TFS ESP payload format.

735      Reference:
736         This document

738   7.2.  IKEv2 Transform Type TFS Type

740      This document requests an IKEv2 Transform Type "TFS Type" be
741      allocated by IANA from the "Transform Type Values" registry.

743      Type:
744         TBD2

746      Description:
747         TFS Type

749      Used In:
750         (optional in ESP)

752      Reference:
753         This document

755   7.3.  TFS Type Transform IDs Registry

757      This document requests a "Transform Type TBD2 - TFS Type Transform
758      IDs" registry be created.  The registration procedure is Expert
759      Review.  The initial values are as follows:

761      Number    Name             Reference
762      ----------------------------------------
763      0         NONE             This document
764      1         TFS_IPTFS_CC     This document
765      2         TFS_IPTFS_NOCC   This document
766      3-65535   Reserved         This document

768   7.4.  IPTFS_REQUIREMENTS Notify Message Status Type

770      This document requests a status type IPTFS_REQUIREMENTS be allocated
771      from the "IKEv2 Notify Message Types - Status Types" registry.

773      Value:
774         TBD3

776      Name:
777         IPTFS_REQUIREMENTS

779      Reference:
780         This document

782   8.  Security Considerations

784      This document describes a mechanism to add Traffic Flow
785      Confidentiality to IP traffic.  Use of this mechanism is expected to
786      increase the security of the traffic being transported.  Other than
787      the additional security afforded by using this mechanism, IP-TFS
788      utilizes the security protocols [RFC4303] and [RFC7296] and so their
789      security considerations apply to IP-TFS as well.

791      As noted previously in Section 2.5.2, for TFC to be fully maintained
792      the encapsulated traffic flow should not affect network congestion
793      in a predictable way; if it would, use of the non-congestion
794      controlled mode should be considered instead.
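As a non-normative illustration of the congestion control payload header defined in Section 6.1.2, the following Python sketch shows one way a receiver might decode the 16 octet header and apply the "V" and "C" checks.  It is only a sketch under stated assumptions: the names "CCHeader" and "parse_cc_header" are hypothetical, and network byte order is assumed for all multi-octet fields because Section 6.1.2 does not state the byte order explicitly.

```python
import struct
from typing import NamedTuple, Tuple

# Assumption: all multi-octet fields are carried in network byte order.
# Layout: one 16 bit word holding V, C, E and 13 reserved bits, then
# BlockOffset, RTT, Delay (16 bits each), LossEventRate, LastSeqNum (32 bits).
_CC_HEADER = struct.Struct("!HHHHII")


class CCHeader(NamedTuple):
    """Hypothetical container for the decoded header fields."""
    ecn_ce: bool            # E bit
    block_offset: int       # BlockOffset, in octets
    rtt_ms: int             # RTT, in milliseconds
    delay_ms: int           # Delay, in milliseconds
    loss_event_rate: int    # inverse loss event rate, 0 means no loss
    last_seq_num: int       # lower 32 bits of the sequence number


def parse_cc_header(payload: bytes) -> Tuple[CCHeader, bytes]:
    """Split a payload into its decoded header and the DataBlocks octets."""
    if len(payload) < _CC_HEADER.size:
        raise ValueError("payload shorter than the 16 octet header")
    flags, offset, rtt, delay, loss, last_seq = _CC_HEADER.unpack_from(payload)
    if flags & 0x8000:
        raise ValueError("V bit is set: the packet MUST be dropped")
    if not flags & 0x4000:
        raise ValueError("C bit is clear: not a congestion control payload")
    # The 13 reserved bits are ignored on receipt.
    header = CCHeader(bool(flags & 0x2000), offset, rtt, delay, loss, last_seq)
    return header, payload[_CC_HEADER.size:]
```

For example, calling "parse_cc_header" on a payload whose first 16 bit word is 0x6000 yields a header with the "E" flag set, followed by whatever "DataBlocks" octets remain after the first 16 octets.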
796   9.  References

798   9.1.  Normative References

800      [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
801                 Requirement Levels", BCP 14, RFC 2119,
802                 DOI 10.17487/RFC2119, March 1997.

805      [RFC4303]  Kent, S., "IP Encapsulating Security Payload (ESP)",
806                 RFC 4303, DOI 10.17487/RFC4303, December 2005.

809      [RFC7296]  Kaufman, C., Hoffman, P., Nir, Y., Eronen, P., and T.
810                 Kivinen, "Internet Key Exchange Protocol Version 2
811                 (IKEv2)", STD 79, RFC 7296, DOI 10.17487/RFC7296, October
812                 2014.

814      [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
815                 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
816                 May 2017.

818   9.2.  Informative References

820      [AppCrypt]
821                 Schneier, B., "Applied Cryptography: Protocols,
822                 Algorithms, and Source Code in C", 11 2017.

824      [I-D.iab-wire-image]
825                 Trammell, B. and M. Kuehlewind, "The Wire Image of a
826                 Network Protocol", draft-iab-wire-image-01 (work in
827                 progress), November 2018.

829      [IKEV2IANA]
830                 IANA, "Internet Key Exchange Version 2 (IKEv2)
831                 Parameters".

834      [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
835                 DOI 10.17487/RFC0791, September 1981.

838      [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
839                 DOI 10.17487/RFC1191, November 1990.

842      [RFC2474]  Nichols, K., Blake, S., Baker, F., and D. Black,
843                 "Definition of the Differentiated Services Field (DS
844                 Field) in the IPv4 and IPv6 Headers", RFC 2474,
845                 DOI 10.17487/RFC2474, December 1998.

848      [RFC2914]  Floyd, S., "Congestion Control Principles", BCP 41,
849                 RFC 2914, DOI 10.17487/RFC2914, September 2000.

852      [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
853                 of Explicit Congestion Notification (ECN) to IP",
854                 RFC 3168, DOI 10.17487/RFC3168, September 2001.

857      [RFC4301]  Kent, S. and K. Seo, "Security Architecture for the
858                 Internet Protocol", RFC 4301, DOI 10.17487/RFC4301,
859                 December 2005.
861      [RFC4342]  Floyd, S., Kohler, E., and J. Padhye, "Profile for
862                 Datagram Congestion Control Protocol (DCCP) Congestion
863                 Control ID 3: TCP-Friendly Rate Control (TFRC)", RFC 4342,
864                 DOI 10.17487/RFC4342, March 2006.

867      [RFC5348]  Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP
868                 Friendly Rate Control (TFRC): Protocol Specification",
869                 RFC 5348, DOI 10.17487/RFC5348, September 2008.

872      [RFC7510]  Xu, X., Sheth, N., Yong, L., Callon, R., and D. Black,
873                 "Encapsulating MPLS in UDP", RFC 7510,
874                 DOI 10.17487/RFC7510, April 2015.

877      [RFC8084]  Fairhurst, G., "Network Transport Circuit Breakers",
878                 BCP 208, RFC 8084, DOI 10.17487/RFC8084, March 2017.

881      [RFC8200]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
882                 (IPv6) Specification", STD 86, RFC 8200,
883                 DOI 10.17487/RFC8200, July 2017.

886      [RFC8201]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
887                 "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
888                 DOI 10.17487/RFC8201, July 2017.

891   Appendix A.  Example Of An Encapsulated IP Packet Flow

893      Below we show an example inner IP packet flow within the
894      encapsulating tunnel packet stream.  Notice how encapsulated IP
895      packets can start and end anywhere, and more than one or less than
896      one may occur in a single encapsulating packet.

898       Offset: 0       Offset: 100     Offset: 2900    Offset: 1400
899       [ ESP1  (1500) ][ ESP2  (1500) ][ ESP3  (1500) ][ ESP4  (1500) ]
900       [--800--][--800--][60][-240-][--4000----------------------][pad]

902                   Figure 3: Inner and Outer Packet Flow

904      The encapsulated IP packet flow (lengths include IP header and
905      payload) is as follows: an 800 octet packet, an 800 octet packet, a
906      60 octet packet, a 240 octet packet, a 4000 octet packet.

908      The "BlockOffset" values in the 4 IP-TFS payload headers for this
909      packet flow would thus be: 0, 100, 2900, 1400 respectively.
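The four "BlockOffset" values above can be reproduced with a short non-normative Python sketch.  It assumes, as Figure 3 does, that each encapsulating packet carries exactly 1500 octets of "DataBlocks" data and that padding fills whatever follows the last inner packet; the function name "block_offsets" and its parameters are hypothetical.

```python
from typing import List


def block_offsets(inner_sizes: List[int], capacity: int,
                  outer_count: int) -> List[int]:
    """Return the BlockOffset for each of outer_count payloads.

    capacity is the number of DataBlocks octets per outer packet
    (an illustrative simplification taken from Figure 3).
    """
    # Positions in the concatenated DataBlocks stream at which a new data
    # block starts; the last entry is where the pad data block begins.
    starts = [0]
    for size in inner_sizes:
        starts.append(starts[-1] + size)

    offsets = []
    for index in range(outer_count):
        payload_start = index * capacity
        # First data block start at or after the beginning of this payload.
        # If the inner stream is already exhausted the payload begins with
        # padding, so the offset is zero.
        next_start = next((s for s in starts if s >= payload_start),
                          payload_start)
        offsets.append(next_start - payload_start)
    return offsets


print(block_offsets([800, 800, 60, 240, 4000], 1500, 4))
# prints [0, 100, 2900, 1400]
```

Note that the third value (2900) exceeds the 1500 octets available in ESP3, which is how the offset signals that the whole of that payload belongs to the data block already being re-assembled.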
910      The first encapsulating packet ESP1 has a zero "BlockOffset" which
911      points at the IP data block immediately following the IP-TFS header.
912      The following packet ESP2's "BlockOffset" points inward 100 octets to
913      the start of the 60 octet data block.  The third encapsulating packet
914      ESP3 contains the middle portion of the 4000 octet data block so the
915      offset points past its end and into the fourth encapsulating packet.
916      The fourth packet ESP4's offset is 1400 pointing at the padding which
917      follows the completion of the continued 4000 octet packet.

919   Appendix B.  A Send and Loss Event Rate Calculation

921      The current best practice indicates that congestion control should be
922      done in a TCP friendly way.  A TCP friendly congestion control
923      algorithm is described in [RFC5348].  For our use case (as with
924      [RFC4342]) we consider our (fixed) packet size the segment size for
925      the algorithm.  The formula for the send rate is then as follows:

927                                        1
928         X_Pps = -----------------------------------------------
929                 R * (sqrt(2*p/3) + 12*sqrt(3*p/8)*p*(1+32*p^2))

931      Where "X_Pps" is the send rate in packets per second, "R" is the
932      round trip time estimate and "p" is the loss event rate (the inverse
933      of which is provided by the receiver).

935      The IP-TFS receiver, having the RTT estimate from the sender, MAY use
936      the same method as described in [RFC4342] to collect the loss
937      intervals and calculate the loss event rate value using the weighted
938      average as indicated.  The receiver communicates the inverse of this
939      value back to the sender in the IPTFS_PROTOCOL payload header field
940      "LossEventRate".

942      The IP-TFS sender now has both the "R" and "p" values and can
943      calculate the correct sending rate ("X_Pps").  If following [RFC5348]
944      the sender SHOULD also use the slow start mechanism described therein
945      when the IP-TFS SA is first established.

947   Appendix C.  Comparisons of IP-TFS

949   C.1.  Comparing Overhead

951   C.1.1.  IP-TFS Overhead

953      The overhead of IP-TFS is 40 octets per outer packet.  Therefore the
954      octet overhead per inner packet is 40 multiplied by the number of
955      outer packets required (fractional allowed).  The overhead as a
956      percentage of inner packet size is a constant based on the Outer MTU
956      size.

958         OH = 40 * Inner Packet Size / Outer Payload Size
959         OH % of Inner Packet Size = 100 * OH / Inner Packet Size
960         OH % of Inner Packet Size = 4000 / Outer Payload Size

962                           Type    IP-TFS  IP-TFS  IP-TFS
963                           MTU        576    1500    9000
964                           PSize      536    1460    8960
965                           ------------------------------
966                              40    7.46%   2.74%   0.45%
967                             576    7.46%   2.74%   0.45%
968                            1500    7.46%   2.74%   0.45%
969                            9000    7.46%   2.74%   0.45%

971        Figure 4: IP-TFS Overhead as Percentage of Inner Packet Size

973   C.1.2.  ESP with Padding Overhead

975      The overhead per inner packet for constant-send-rate padded ESP
976      (i.e., traditional IPsec TFC) is 36 octets plus any padding, unless
977      fragmentation is required.

979      When fragmentation of the inner packet is required to fit in the
980      outer IPsec packet, overhead is the number of outer packets required
981      to carry the fragmented inner packet times both the inner IP overhead
982      (20) and the outer packet overhead (36) minus the initial inner IP
983      overhead plus any required tail padding in the last encapsulation
984      packet.  The required tail padding is the number of required packets
985      times the difference of the Outer Payload Size and the IP Overhead
986      minus the Inner Payload Size.
         So:

988         Inner Payload Size = IP Packet Size - IP Overhead
989         Outer Payload Size = MTU - IPsec Overhead

991                       Inner Payload Size
992         NF0 = ----------------------------------
993                Outer Payload Size - IP Overhead

995         NF = CEILING(NF0)

997         OH = NF * (IP Overhead + IPsec Overhead)
998              - IP Overhead
999              + NF * (Outer Payload Size - IP Overhead)
1000             - Inner Payload Size

1002        OH = NF * (IPsec Overhead + Outer Payload Size)
1003             - (IP Overhead + Inner Payload Size)

1005        OH = NF * (IPsec Overhead + Outer Payload Size)
1006             - Inner Packet Size

1008  C.2.  Overhead Comparison

1010     The following tables collect the overhead values for some common L3
1011     MTU sizes in order to compare them.  The first table is the number of
1012     octets of overhead for a given L3 MTU sized packet.  The second table
1013     is the percentage of overhead in the same MTU sized packet.

1015       Type    ESP+Pad  ESP+Pad  ESP+Pad   IP-TFS   IP-TFS   IP-TFS
1016       L3 MTU      576     1500     9000      576     1500     9000
1017       PSize       540     1464     8964      536     1460     8960
1018       ------------------------------------------------------------
1019          40       500     1424     8924      3.0      1.1      0.2
1020         128       412     1336     8836      9.6      3.5      0.6
1021         256       284     1208     8708     19.1      7.0      1.1
1022         536         4      928     8428     40.0     14.7      2.4
1023         576       576      888     8388     43.0     15.8      2.6
1024        1460       268        4     7504    109.0     40.0      6.5
1025        1500       228     1500     7464    111.9     41.1      6.7
1026        8960      1408     1540        4    668.7    245.5     40.0
1027        9000      1368     1500     9000    671.6    246.6     40.2

1029                 Figure 5: Overhead comparison in octets

1031       Type    ESP+Pad  ESP+Pad  ESP+Pad   IP-TFS   IP-TFS   IP-TFS
1032       MTU         576     1500     9000      576     1500     9000
1033       PSize       540     1464     8964      536     1460     8960
1034       ------------------------------------------------------------
1035          40   1250.0%  3560.0% 22310.0%    7.46%    2.74%    0.45%
1036         128    321.9%  1043.8%  6903.1%    7.46%    2.74%    0.45%
1037         256    110.9%   471.9%  3401.6%    7.46%    2.74%    0.45%
1038         536      0.7%   173.1%  1572.4%    7.46%    2.74%    0.45%
1039         576    100.0%   154.2%  1456.2%    7.46%    2.74%    0.45%
1040        1460     18.4%     0.3%   514.0%    7.46%    2.74%    0.45%
1041        1500     15.2%   100.0%   497.6%    7.46%    2.74%    0.45%
1042        8960     15.7%    17.2%     0.0%    7.46%    2.74%    0.45%
1043        9000     15.2%    16.7%   100.0%    7.46%    2.74%    0.45%

1045          Figure 6: Overhead as Percentage of Inner Packet Size

1047  C.3.  Comparing Available Bandwidth

1049     Another way to compare the two solutions is to look at the amount of
1050     available bandwidth each solution provides.  The following sections
1051     consider and compare the percentage of available bandwidth.  For the
1052     sake of providing a well understood baseline we will also include
1053     normal (unencrypted) Ethernet as well as normal ESP values.

1055  C.3.1.  Ethernet

1057     In order to calculate the available bandwidth we first calculate the
1058     per packet overhead in bits.  The total overhead of Ethernet is 14+4
1059     octets of header and CRC plus an additional 20 octets of framing
1060     (preamble, start, and inter-packet gap) for a total of 38 octets.
1061     Additionally the minimum payload is 46 octets.

1063     Size    E + P  E + P  E + P  IPTFS  IPTFS  IPTFS   Enet    ESP
1064     MTU       590   1514   9014    590   1514   9014    any    any
1065     OH         74     74     74     78     78     78     38     74
1066     --------------------------------------------------------------
1067        40     614   1538   9038     45     42     40     84    114
1068       128     614   1538   9038    146    134    129    166    202
1069       256     614   1538   9038    293    269    258    294    330
1070       536     614   1538   9038    614    564    540    574    610
1071       576    1228   1538   9038    659    606    581    614    650
1072      1460    1842   1538   9038   1672   1538   1472   1498   1534
1073      1500    1842   3076   9038   1718   1580   1513   1538   1574
1074      8960   11052  10766   9038  10263   9438   9038   8998   9034
1075      9000   11052  10766  18076  10309   9480   9078   9038   9074

1077                     Figure 7: L2 Octets Per Packet

1079     Size    E + P  E + P  E + P  IPTFS  IPTFS  IPTFS   Enet    ESP
1080     MTU       590   1514   9014    590   1514   9014    any    any
1081     OH         74     74     74     78     78     78     38     74
1082     --------------------------------------------------------------
1083        40    2.0M   0.8M   0.1M  27.3M  29.7M  31.0M  14.9M  11.0M
1084       128    2.0M   0.8M   0.1M   8.5M   9.3M   9.7M   7.5M   6.2M
1085       256    2.0M   0.8M   0.1M   4.3M   4.6M   4.8M   4.3M   3.8M
1086       536    2.0M   0.8M   0.1M   2.0M   2.2M   2.3M   2.2M   2.0M
1087       576    1.0M   0.8M   0.1M   1.9M   2.1M   2.2M   2.0M   1.9M
1088      1460    678K   812K   138K   747K   812K   848K   834K   814K
1089      1500    678K   406K   138K   727K   791K   826K   812K   794K
1090      8960    113K   116K   138K   121K   132K   138K   138K   138K
1091      9000    113K   116K    69K   121K   131K   137K   138K   137K

1093              Figure 8: Packets Per Second on 10G Ethernet

1095  Size     E + P   E + P   E + P   IPTFS   IPTFS   IPTFS    Enet     ESP
1096  MTU        590    1514    9014     590    1514    9014     any     any
1097  OH          74      74      74      78      78      78      38      74
1098  ----------------------------------------------------------------------
1099     40    6.51%   2.60%   0.44%  87.30%  94.93%  99.14%  47.62%  35.09%
1100    128   20.85%   8.32%   1.42%  87.30%  94.93%  99.14%  77.11%  63.37%
1101    256   41.69%  16.64%   2.83%  87.30%  94.93%  99.14%  87.07%  77.58%
1102    536   87.30%  34.85%   5.93%  87.30%  94.93%  99.14%  93.38%  87.87%
1103    576   46.91%  37.45%   6.37%  87.30%  94.93%  99.14%  93.81%  88.62%
1104   1460   79.26%  94.93%  16.15%  87.30%  94.93%  99.14%  97.46%  95.18%
1105   1500   81.43%  48.76%  16.60%  87.30%  94.93%  99.14%  97.53%  95.30%
1106   8960   81.07%  83.22%  99.14%  87.30%  94.93%  99.14%  99.58%  99.18%
1107   9000   81.43%  83.60%  49.79%  87.30%  94.93%  99.14%  99.58%  99.18%

1109           Figure 9: Percentage of Bandwidth on 10G Ethernet

1111     A sometimes unexpected result of using IP-TFS (or any packet
1112     aggregating tunnel) is that, for small to medium sized packets, the
1113     available bandwidth is actually greater than native Ethernet.  This
1114     is due to the reduction in Ethernet framing overhead.  This increased
1115     bandwidth is paid for with an increase in latency.  This latency is
1116     the time to send the unrelated octets in the outer tunnel frame.  The
1117     following table illustrates the latency for some common values on a
1118     10G Ethernet link.  The table also includes latency introduced by
1119     padding if using ESP with padding.
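Before turning to the latency figures, the bandwidth claim at the start of this paragraph can be checked numerically.  The following non-normative Python sketch recomputes the IP-TFS and native Ethernet columns of Figure 9 from the per packet overheads given in this appendix (38 octets of Ethernet overhead, a 46 octet minimum Ethernet payload, and 40 octets of IP-TFS overhead per outer packet).  The function names are hypothetical and the model is only the simple ratio of useful octets to L2 octets.

```python
ETHERNET_OVERHEAD = 38     # 14 + 4 octets header/CRC plus 20 octets framing
ETHERNET_MIN_PAYLOAD = 46  # minimum Ethernet payload in octets
IPTFS_OVERHEAD = 40        # IP-TFS overhead per outer packet (C.1.1)


def ethernet_bandwidth_pct(inner_size: int) -> float:
    """Useful share of the link when each inner packet is its own frame."""
    l2_octets = max(inner_size, ETHERNET_MIN_PAYLOAD) + ETHERNET_OVERHEAD
    return 100.0 * inner_size / l2_octets


def iptfs_bandwidth_pct(l3_mtu: int) -> float:
    """Useful share of the link for a fully packed IP-TFS outer packet.

    Independent of the inner packet size, because inner packets are
    aggregated and fragmented to fill every outer packet.
    """
    payload = l3_mtu - IPTFS_OVERHEAD
    return 100.0 * payload / (l3_mtu + ETHERNET_OVERHEAD)


for size in (40, 128, 256):
    print(size, f"{ethernet_bandwidth_pct(size):.2f}%",
          f"{iptfs_bandwidth_pct(1500):.2f}%")
# prints:
# 40 47.62% 94.93%
# 128 77.11% 94.93%
# 256 87.07% 94.93%
```

For these small inner packet sizes the constant 94.93% of IP-TFS with a 1500 octet MTU exceeds the native Ethernet figure, because one set of Ethernet framing octets is shared by many inner packets.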
1121              Type    ESP+Pad  ESP+Pad   IP-TFS   IP-TFS
1122              MTU        1500     9000     1500     9000
1124              ------------------------------------------
1125                 40   1.14 us  7.14 us  1.17 us  7.17 us
1126                128   1.07 us  7.07 us  1.10 us  7.10 us
1127                256   0.97 us  6.97 us  1.00 us  7.00 us
1128                536   0.74 us  6.74 us  0.77 us  6.77 us
1129                576   0.71 us  6.71 us  0.74 us  6.74 us
1130               1460   0.00 us  6.00 us  0.04 us  6.04 us
1131               1500   1.20 us  5.97 us  0.00 us  6.00 us

1133                         Figure 10: Added Latency

1135     Notice that the latency values are very similar between the two
1136     solutions; however, whereas IP-TFS provides for constant high
1137     bandwidth, in some cases even exceeding native Ethernet, ESP with
1138     padding often greatly reduces available bandwidth.

1140  Appendix D.  Acknowledgements

1142     We would like to thank Don Fedyk for help in reviewing this work.

1144  Appendix E.  Contributors

1146     The following people made significant contributions to this document.

1148     Lou Berger
1149     LabN Consulting, L.L.C.

1151     Email: lberger@labn.net

1153  Author's Address

1155     Christian Hopps
1156     LabN Consulting, L.L.C.

1158     Email: chopps@chopps.org