Network                                                         T. Pauly
Internet-Draft                                                Apple Inc.
Intended status: Standards Track                               S. Touati
Expires: December 1, 2017                                       Ericsson
                                                               R. Mantha
                                                           Cisco Systems
                                                            May 30, 2017

               TCP Encapsulation of IKE and IPsec Packets
                    draft-ietf-ipsecme-tcp-encaps-10

Abstract

   This document describes a method to transport IKE and IPsec packets
   over a TCP connection for traversing network middleboxes that may
   block IKE negotiation over UDP. This method, referred to as TCP
   encapsulation, involves sending both IKE packets for Security
   Association establishment and ESP packets over a TCP connection.
   This method is intended to be used as a fallback option when IKE
   cannot be negotiated over UDP.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 1, 2017.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Prior Work and Motivation
     1.2.  Terminology and Notation
   2.  Configuration
   3.  TCP-Encapsulated Header Formats
     3.1.  TCP-Encapsulated IKE Header Format
     3.2.  TCP-Encapsulated ESP Header Format
   4.  TCP-Encapsulated Stream Prefix
   5.  Applicability
     5.1.  Recommended Fallback from UDP
   6.  Connection Establishment and Teardown
   7.  Interaction with NAT Detection Payloads
   8.  Using MOBIKE with TCP encapsulation
   9.  Using IKE Message Fragmentation with TCP encapsulation
   10. Considerations for Keep-alives and DPD
   11. Middlebox Considerations
   12. Performance Considerations
     12.1.  TCP-in-TCP
     12.2.  Added Reliability for Unreliable Protocols
     12.3.  Quality of Service Markings
     12.4.  Maximum Segment Size
     12.5.  Tunnelling ECN in TCP
   13. Security Considerations
   14. IANA Considerations
   15. Acknowledgments
   16. References
     16.1.  Normative References
     16.2.  Informative References
   Appendix A.  Using TCP encapsulation with TLS
   Appendix B.  Example exchanges of TCP Encapsulation with TLS
     B.1.  Establishing an IKE session
     B.2.  Deleting an IKE session
     B.3.  Re-establishing an IKE session
     B.4.  Using MOBIKE between UDP and TCP Encapsulation
   Authors' Addresses

1.  Introduction

   IKEv2 [RFC7296] is a protocol for establishing IPsec Security
   Associations (SAs), using IKE messages over UDP for control traffic,
   and using Encapsulating Security Payload (ESP) messages for encrypted
   data traffic. Many network middleboxes that filter traffic on public
   hotspots block all UDP traffic, including IKE and IPsec, but allow
   TCP connections through since they appear to be web traffic. Devices
   on these networks that need to use IPsec (to access private
   enterprise networks, to route voice-over-IP calls to carrier
   networks, or because of security policies) are unable to establish
   IPsec SAs. This document defines a method for encapsulating both the
   IKE control messages as well as the IPsec data messages within a TCP
   connection.

   Using TCP as a transport for IPsec packets adds a third option to the
   list of traditional IPsec transports:

   1.  Direct. Currently, IKE negotiations begin over UDP port 500.
       If no NAT is detected between the Initiator and the Responder,
       then subsequent IKE packets are sent over UDP port 500 and
       IPsec data packets are sent using ESP [RFC4303].

   2.  UDP Encapsulation [RFC3948].
       If a NAT is detected between the Initiator and the Responder,
       then subsequent IKE packets are sent over UDP port 4500 with
       four bytes of zero at the start of the UDP payload, and ESP
       packets are sent out over UDP port 4500. Some peers default to
       using UDP encapsulation even when no NAT is detected on the
       path, as some middleboxes do not support IP protocols other
       than TCP and UDP.

   3.  TCP Encapsulation. If neither of the other two methods is
       available or appropriate, both IKE negotiation packets and
       ESP packets can be sent over a single TCP connection to the
       peer.

   Direct use of ESP or UDP Encapsulation should be preferred by IKE
   implementations due to performance concerns when using TCP
   Encapsulation (Section 12). Most implementations should use TCP
   Encapsulation only on networks where negotiation over UDP has been
   attempted without receiving responses from the peer, or if a network
   is known to not support UDP.

1.1.  Prior Work and Motivation

   Encapsulating IKE connections within TCP streams is a common approach
   to solve the problem of UDP packets being blocked by network
   middleboxes. The goal of this document is to promote
   interoperability by providing a standard method of framing IKE and
   ESP messages within streams, and to provide guidelines for how to
   configure and use TCP encapsulation.

   Some previous alternatives include:

   Cellular Network Access  Interworking Wireless LAN (IWLAN) uses IKEv2
      to create secure connections to cellular carrier networks for
      making voice calls and accessing other network services over
      Wi-Fi networks. 3GPP has recommended that IKEv2 and ESP packets
      be sent within a TLS connection to be able to establish
      connections on restrictive networks.

   ISAKMP over TCP  Various non-standard extensions to ISAKMP have been
      deployed that send IPsec traffic over TCP or TCP-like packets.

   SSL VPNs  Many proprietary VPN solutions use a combination of TLS and
      IPsec in order to provide reliability. These often run on TCP
      port 443.

   IKEv2 over TCP  IKEv2 over TCP as described in
      [I-D.nir-ipsecme-ike-tcp] is used to avoid UDP fragmentation.

   The goal of this specification is to provide a standardized method
   for using TCP streams to transport IPsec that is compatible with the
   current IKE standard, and avoids the overhead of other alternatives
   that always rely on TCP or TLS.

1.2.  Terminology and Notation

   This document distinguishes between the IKE peer that initiates TCP
   connections to be used for TCP encapsulation and the roles of
   Initiator and Responder for particular IKE messages. During the
   course of IKE exchanges, the role of IKE Initiator and Responder may
   swap for a given SA (as with IKE SA Rekeys), while the initiator of
   the TCP connection is still responsible for tearing down the TCP
   connection and re-establishing it if necessary. For this reason,
   this document will use the term "TCP Originator" to indicate the IKE
   peer that initiates TCP connections. The peer that receives TCP
   connections will be referred to as the "TCP Responder". If an IKE SA
   is rekeyed one or more times, the TCP Originator MUST remain the peer
   that originally initiated the first IKE SA.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Configuration

   One of the main reasons to use TCP encapsulation is that UDP traffic
   may be entirely blocked on a network. Because of this, support for
   TCP encapsulation is not specifically negotiated in the IKE exchange.
   Instead, support for TCP encapsulation must be pre-configured on both
   the TCP Originator and the TCP Responder.

   Implementations MUST support TCP encapsulation on TCP port 4500,
   which is reserved for IPsec NAT Traversal.

   Beyond a flag indicating support for TCP encapsulation, the
   configuration for each peer can include the following optional
   parameters:

   o  Alternate TCP ports on which the specific TCP Responder listens
      for incoming connections. Note that the TCP Originator may
      initiate TCP connections to the TCP Responder from any local port.

   o  An extra framing protocol to use on top of TCP to further
      encapsulate the stream of IKE and IPsec packets. See Appendix A
      for a detailed discussion.

   Since TCP encapsulation of IKE and IPsec packets adds overhead and
   has potential performance trade-offs compared to direct or
   UDP-encapsulated SAs (as described in Performance Considerations,
   Section 12), implementations SHOULD prefer direct ESP or
   UDP-encapsulated SAs over TCP-encapsulated SAs when possible.

3.  TCP-Encapsulated Header Formats

   Like UDP encapsulation, TCP encapsulation uses the first four bytes
   of a message to differentiate IKE and ESP messages. TCP
   encapsulation also adds a length field to define the boundaries of
   messages within a stream. The message length is sent in a 16-bit
   field that precedes every message. If the first 32 bits of the
   message are zeros (a Non-ESP Marker), then the contents comprise an
   IKE message. Otherwise, the contents comprise an ESP message.
   Authentication Header (AH) messages are not supported for TCP
   encapsulation.

   Although a TCP stream may be able to send very long messages,
   implementations SHOULD limit message lengths to typical UDP datagram
   ESP payload lengths. The maximum message length is used as the
   effective MTU for connections that are being encrypted using ESP, so
   the maximum message length will influence characteristics of inner
   connections, such as the TCP Maximum Segment Size (MSS).
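
   The framing described in this section can be sketched as follows.
   This is an illustrative sketch, not a reference implementation: the
   function names (frame_ike, frame_esp, classify) are hypothetical,
   and it assumes that the Length value counts its own two octets, as
   stated in the field descriptions of Sections 3.1 and 3.2.

```python
import struct
from typing import Tuple

NON_ESP_MARKER = b"\x00\x00\x00\x00"  # four zero octets mark an IKE message
MAX_LENGTH = 0xFFFF                   # largest value a 16-bit Length can carry


def frame_ike(ike_message: bytes) -> bytes:
    """Prefix an IKE message with the Length field and Non-ESP Marker."""
    total = 2 + len(NON_ESP_MARKER) + len(ike_message)
    if total > MAX_LENGTH:
        raise ValueError("IKE message too long for a 16-bit Length field")
    # "!H" packs an unsigned 16-bit integer in network byte order.
    return struct.pack("!H", total) + NON_ESP_MARKER + ike_message


def frame_esp(esp_packet: bytes) -> bytes:
    """Prefix an ESP packet with the Length field."""
    if esp_packet[:4] == NON_ESP_MARKER:
        # A zero SPI would be indistinguishable from the Non-ESP Marker.
        raise ValueError("ESP SPI must not be zero")
    total = 2 + len(esp_packet)
    if total > MAX_LENGTH:
        raise ValueError("ESP packet too long for a 16-bit Length field")
    return struct.pack("!H", total) + esp_packet


def classify(contents: bytes) -> Tuple[str, bytes]:
    """Classify message contents (the octets after the Length field)."""
    if contents[:4] == NON_ESP_MARKER:
        return "ike", contents[4:]
    return "esp", contents
```

   For example, a 3-octet IKE payload yields a Length of 9 (2 for the
   Length field, 4 for the marker, 3 for the payload).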

   Note that this method of encapsulation will also work for placing IKE
   and ESP messages within any protocol that presents a stream
   abstraction, beyond TCP.

3.1.  TCP-Encapsulated IKE Header Format

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                   |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Non-ESP Marker                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                      IKE header [RFC7296]                     ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                                Figure 1

   The IKE header is preceded by a 16-bit length field in network byte
   order that specifies the length of the IKE message (including the
   Non-ESP Marker) within the TCP stream. As with IKE over UDP port
   4500, a zeroed 32-bit Non-ESP Marker is inserted before the start of
   the IKE header in order to differentiate the traffic from ESP traffic
   between the same addresses and ports.

   o  Length (2 octets, unsigned integer) - Length of the IKE packet
      including the Length Field and Non-ESP Marker.

3.2.  TCP-Encapsulated ESP Header Format

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                   |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                      ESP header [RFC4303]                     ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                                Figure 2

   The ESP header is preceded by a 16-bit length field in network byte
   order that specifies the length of the ESP packet within the TCP
   stream.

   The SPI field in the ESP header MUST NOT be a zero value.

   o  Length (2 octets, unsigned integer) - Length of the ESP packet
      including the Length Field.

4.  TCP-Encapsulated Stream Prefix

   Each stream of bytes used for IKE and IPsec encapsulation MUST begin
   with a fixed sequence of six bytes as a magic value, containing the
   characters "IKETCP" as ASCII values. This value is intended to
   identify and validate that the TCP connection is being used for TCP
   encapsulation as defined in this document, to avoid conflicts with
   the prevalence of previous non-standard protocols that used TCP port
   4500. This value is only sent once, by the TCP Originator only, at
   the beginning of any stream of IKE and ESP messages.

   If other framing protocols are used within TCP to further encapsulate
   or encrypt the stream of IKE and ESP messages, the Stream Prefix must
   be at the start of the TCP Originator's IKE and ESP message stream
   within the added protocol layer (see Appendix A). Although some
   framing protocols do support negotiating inner protocols, the stream
   prefix should always be used in order for implementations to be as
   generic as possible and not rely on other framing protocols on top of
   TCP.

         0      1      2      3      4      5
      +------+------+------+------+------+------+
      | 0x49 | 0x4b | 0x45 | 0x54 | 0x43 | 0x50 |
      +------+------+------+------+------+------+

                                Figure 3

5.  Applicability

   TCP encapsulation is applicable only when it has been configured to
   be used with specific IKE peers. If a Responder is configured to use
   TCP encapsulation, it MUST listen on the configured port(s) in case
   any peers will initiate new IKE sessions. Initiators MAY use TCP
   encapsulation for any IKE session to a peer that is configured to
   support TCP encapsulation, although it is recommended that Initiators
   only use TCP encapsulation when traffic over UDP is blocked.

   Since the support of TCP encapsulation is a configured property, not
   a negotiated one, it is recommended that if there are multiple IKE
   endpoints representing a single peer (such as multiple machines with
   different IP addresses when connecting by Fully-Qualified Domain
   Name, or endpoints used with IKE redirection), all of the endpoints
   equally support TCP encapsulation.

   If TCP encapsulation is being used for a specific IKE SA, all
   messages for that IKE SA and its Child SAs MUST be sent over a TCP
   connection until the SA is deleted or MOBIKE is used to change the SA
   endpoints and/or encapsulation protocol. See Section 8 for more
   details on using MOBIKE to transition between encapsulation modes.

5.1.  Recommended Fallback from UDP

   Since UDP is the preferred method of transport for IKE messages,
   implementations that use TCP encapsulation should have an algorithm
   for deciding when to use TCP after determining that UDP is unusable.
   If an Initiator implementation has no prior knowledge about the
   network it is on and the status of UDP on that network, it SHOULD
   always attempt to negotiate IKE over UDP first. IKEv2 defines how to
   use retransmission timers with IKE messages, and IKE_SA_INIT messages
   specifically [RFC7296]. Generally, this means that the
   implementation will define a frequency of retransmission, and the
   maximum number of retransmissions allowed before marking the IKE SA
   as failed. An implementation can attempt negotiation over TCP once
   it has hit the maximum retransmissions over UDP, or slightly before
   to reduce connection setup delays. It is recommended that the
   initial message over UDP is retransmitted at least once before
   falling back to TCP, unless the Initiator knows beforehand that the
   network is likely to block UDP.

6.  Connection Establishment and Teardown

   When the IKE Initiator uses TCP encapsulation, it will initiate a TCP
   connection to the Responder using the configured TCP port. The first
   bytes sent on the stream MUST be the stream prefix value (Section 4).
   After this prefix, encapsulated IKE messages will negotiate the IKE
   SA and initial Child SA [RFC7296]. After this point, both
   encapsulated IKE (Figure 1) and ESP (Figure 2) messages will be sent
   over the TCP connection. The TCP Responder MUST wait for the entire
   stream prefix to be received on the stream before trying to parse out
   any IKE or ESP messages. The stream prefix is sent only once, and
   only by the TCP Originator.

   In order to close an IKE session, either the Initiator or Responder
   SHOULD gracefully tear down IKE SAs with DELETE payloads. Once the
   SA has been deleted, the TCP Originator SHOULD close the TCP
   connection if it does not intend to use the connection for another
   IKE session to the TCP Responder. If the connection is left idle,
   and the TCP Responder needs to clean up resources, the TCP Responder
   MAY close the TCP connection.

   An unexpected FIN or a RST on the TCP connection may indicate either
   a loss of connectivity, an attack, or some other error. If a DELETE
   payload has not been sent, both sides SHOULD maintain the state for
   their SAs for the standard lifetime or time-out period. The TCP
   Originator is responsible for re-establishing the TCP connection if
   it is torn down for any unexpected reason. Since new TCP connections
   may use different ports due to NAT mappings or local port allocations
   changing, the TCP Responder MUST allow packets for existing SAs to be
   received from new source ports.

   A peer MUST discard a partially received message due to a broken
   connection.
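
   The receive-side rules above (wait for the entire stream prefix,
   split the stream on the Length field, and discard a partial message
   when the connection breaks) can be sketched as follows. This is an
   illustrative sketch for a TCP Responder; the class and method names
   are hypothetical. It assumes the Length value counts its own two
   octets, and treating a Length smaller than 2 as an error is an
   assumption of this sketch rather than a rule stated in this document.

```python
import struct
from typing import List

STREAM_PREFIX = b"IKETCP"  # 0x49 0x4b 0x45 0x54 0x43 0x50


class StreamDeframer:
    """Incrementally splits a TCP byte stream into encapsulated messages."""

    def __init__(self, expect_prefix: bool = True) -> None:
        # expect_prefix is True on the TCP Responder; the TCP Originator
        # never receives a prefix, so it would pass False.
        self._buffer = bytearray()
        self._prefix_seen = not expect_prefix

    def feed(self, data: bytes) -> List[bytes]:
        """Add received bytes; return the contents of each complete message."""
        self._buffer += data
        if not self._prefix_seen:
            if len(self._buffer) < len(STREAM_PREFIX):
                return []  # wait until the entire prefix has arrived
            if bytes(self._buffer[:len(STREAM_PREFIX)]) != STREAM_PREFIX:
                raise ValueError("stream does not begin with IKETCP")
            del self._buffer[:len(STREAM_PREFIX)]
            self._prefix_seen = True

        messages = []
        while len(self._buffer) >= 2:
            (length,) = struct.unpack_from("!H", self._buffer, 0)
            if length < 2:
                # Assumption: a Length that cannot cover itself is invalid.
                raise ValueError("Length field smaller than its own size")
            if len(self._buffer) < length:
                break  # message is still incomplete
            messages.append(bytes(self._buffer[2:length]))
            del self._buffer[:length]
        return messages

    def connection_lost(self) -> None:
        """Discard any partially received message."""
        self._buffer.clear()
```

   Each returned item is the message contents after the Length field,
   so the first four octets still distinguish IKE from ESP.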

   Whenever the TCP Originator opens a new TCP connection to be used for
   an existing IKE SA, it MUST send the stream prefix first, before any
   IKE or ESP messages. This follows the same behavior as the initial
   TCP connection.

   If a TCP connection is being used to resume a previous IKE session,
   the TCP Responder can recognize the session using either the IKE SPI
   from an encapsulated IKE message or the ESP SPI from an encapsulated
   ESP message. If the session had been fully established previously,
   it is suggested that the TCP Originator send an UPDATE_SA_ADDRESSES
   message if MOBIKE is supported, or an INFORMATIONAL message (a
   keepalive) otherwise.

   The TCP Responder MUST NOT accept any messages for the existing IKE
   session on a new incoming connection unless that connection begins
   with the stream prefix. If either the TCP Originator or TCP
   Responder detects corruption on a connection that was started with a
   valid stream prefix, it SHOULD close the TCP connection. The
   connection can be determined as corrupted if there are too many
   subsequent messages that cannot be parsed as valid IKE messages or
   ESP messages with known SPIs, or if the authentication check for an
   ESP message with a known SPI fails. Implementations SHOULD NOT tear
   down a connection if only a single ESP message has an unknown SPI,
   since the SPI databases may be momentarily out of sync. If there is
   instead a syntax issue within an IKE message, an implementation MUST
   send the INVALID_SYNTAX notify payload and tear down the IKE SA as
   usual, rather than tearing down the TCP connection directly.

   A TCP Originator SHOULD only open one TCP connection per IKE SA,
   over which it sends all of the corresponding IKE and ESP messages.
   This helps ensure that any firewall or NAT mappings allocated for the
   TCP connection apply to all of the traffic associated with the IKE SA
   equally.
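
   The corruption-handling rules earlier in this section can be sketched
   as a small decision function. This is an illustrative sketch only:
   the names (Event, Action, ConnectionMonitor) are hypothetical, and
   the threshold for "too many" unparseable messages is an assumption,
   since this document does not specify a value.

```python
from enum import Enum


class Event(Enum):
    VALID = "valid"                        # message parsed and authenticated
    UNPARSEABLE = "unparseable"            # neither valid IKE nor known-SPI ESP
    UNKNOWN_SPI = "unknown_spi"            # ESP message with an unknown SPI
    ESP_AUTH_FAILURE = "esp_auth_failure"  # known SPI, authentication failed
    IKE_SYNTAX_ERROR = "ike_syntax_error"  # syntax issue inside an IKE message


class Action(Enum):
    CONTINUE = "continue"
    CLOSE_TCP = "close_tcp"
    TEAR_DOWN_IKE_SA = "tear_down_ike_sa"  # after sending INVALID_SYNTAX


class ConnectionMonitor:
    """Decides how to react to each message seen on one TCP connection."""

    def __init__(self, max_consecutive_bad: int = 8) -> None:
        # Hypothetical threshold; the document only says "too many".
        self.max_consecutive_bad = max_consecutive_bad
        self._consecutive_bad = 0

    def observe(self, event: Event) -> Action:
        if event is Event.IKE_SYNTAX_ERROR:
            # Handled at the IKE layer, not by closing the TCP connection.
            return Action.TEAR_DOWN_IKE_SA
        if event is Event.ESP_AUTH_FAILURE:
            return Action.CLOSE_TCP
        if event in (Event.UNPARSEABLE, Event.UNKNOWN_SPI):
            # A single unknown SPI is tolerated; only a run of bad
            # messages marks the connection as corrupted.
            self._consecutive_bad += 1
            if self._consecutive_bad > self.max_consecutive_bad:
                return Action.CLOSE_TCP
            return Action.CONTINUE
        self._consecutive_bad = 0
        return Action.CONTINUE
```

   Resetting the counter on every valid message keeps an occasional
   out-of-sync SPI from accumulating toward the threshold.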

   Similarly, a TCP Responder SHOULD at any given time send packets for
   an IKE SA and its Child SAs over only one TCP connection. It SHOULD
   choose the TCP connection on which it last received a valid and
   decryptable IKE or ESP message. In order to be considered valid for
   choosing a TCP connection, an IKE message must be successfully
   decrypted and authenticated, not be a retransmission of a previously
   received message, and be within the expected window for IKE message
   IDs. Similarly, an ESP message must pass authentication checks, be
   decrypted, and not be a replay of a previous message.

   Since a connection may be broken and a new connection re-established
   by the TCP Originator without the TCP Responder being aware, a TCP
   Responder SHOULD accept receiving IKE and ESP messages on both old
   and new connections until the old connection is closed by the TCP
   Originator. A TCP Responder MAY close a TCP connection that it
   perceives as idle and extraneous (one previously used for IKE and ESP
   messages that has been replaced by a new connection).

   Multiple IKE SAs MUST NOT share a single TCP connection, unless one
   is a rekey of an existing IKE SA, in which case there will
   temporarily be two IKE SAs on the same TCP connection.

7.  Interaction with NAT Detection Payloads

   When negotiating over UDP port 500, IKE_SA_INIT packets include
   NAT_DETECTION_SOURCE_IP and NAT_DETECTION_DESTINATION_IP payloads to
   determine if UDP encapsulation of IPsec packets should be used.
   These payloads contain SHA-1 digests of the SPIs, IP addresses, and
   ports as defined in [RFC7296]. IKE_SA_INIT packets sent on a TCP
   connection SHOULD include these payloads with the same content as
   when sending over UDP, and SHOULD use the applicable TCP ports when
   creating and checking the SHA-1 digests.
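
   As an illustration of the digest computation, the following sketch
   hashes the two IKE SPIs, an IP address, and a port, using the TCP
   port in place of the UDP port. The function names are hypothetical,
   and the exact input layout (Initiator SPI, Responder SPI, address,
   then a 16-bit port in network byte order) is this sketch's reading
   of [RFC7296], which remains the authoritative definition.

```python
import hashlib
import ipaddress
import struct


def nat_detection_digest(spi_i: bytes, spi_r: bytes, ip: str, port: int) -> bytes:
    """SHA-1 over the SPIs (in header order), IP address, and port."""
    if len(spi_i) != 8 or len(spi_r) != 8:
        raise ValueError("IKE SPIs are 8 octets each")
    address = ipaddress.ip_address(ip).packed  # 4 octets (IPv4) or 16 (IPv6)
    data = spi_i + spi_r + address + struct.pack("!H", port)
    return hashlib.sha1(data).digest()


def nat_detected(received_digest: bytes, spi_i: bytes, spi_r: bytes,
                 observed_ip: str, observed_port: int) -> bool:
    """True if the peer's digest does not match what was seen on the wire."""
    expected = nat_detection_digest(spi_i, spi_r, observed_ip, observed_port)
    return received_digest != expected
```

   A receiver would compare the digest carried in the payload against
   one computed from the address and TCP port it actually observed; a
   mismatch indicates that a NAT rewrote the address or port in transit.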

   If a NAT is detected due to the SHA-1 digests not matching the
   expected values, no change should be made for encapsulation of
   subsequent IKE or ESP packets, since TCP encapsulation inherently
   supports NAT traversal. Implementations MAY use the information that
   a NAT is present to influence keep-alive timer values.

   If a NAT is detected, implementations need to handle transport mode
   TCP and UDP packet checksum fixup as defined for UDP encapsulation in
   [RFC3948].

8.  Using MOBIKE with TCP encapsulation

   When an IKE session that has negotiated MOBIKE [RFC4555] is
   transitioning between networks, the Initiator of the transition may
   switch between using TCP encapsulation, UDP encapsulation, or no
   encapsulation. Implementations that implement both MOBIKE and TCP
   encapsulation MUST support dynamically enabling and disabling TCP
   encapsulation as interfaces change.

   When a MOBIKE-enabled Initiator changes networks, the
   UPDATE_SA_ADDRESSES notification SHOULD be sent out first over UDP
   before attempting over TCP. If there is a response to the
   UPDATE_SA_ADDRESSES notification sent over UDP, then the ESP packets
   should be sent directly over IP or over UDP port 4500 (depending on
   whether a NAT was detected), regardless of whether a connection on a
   previous network was using TCP encapsulation. Similarly, if the
   Responder only responds to the UPDATE_SA_ADDRESSES notification over
   TCP, then the ESP packets should be sent over the TCP connection,
   regardless of whether a connection on a previous network used TCP
   encapsulation.

9.  Using IKE Message Fragmentation with TCP encapsulation

   IKE Message Fragmentation [RFC7383] is not required when using TCP
   encapsulation, since a TCP stream already handles the fragmentation
   of its contents across packets. Since fragmentation is redundant in
   this case, implementations might choose to not negotiate IKE
   fragmentation.
   Even if fragmentation is negotiated, an
   implementation SHOULD NOT send fragments when going over a TCP
   connection, although it MUST support receiving fragments.

   If an implementation supports both MOBIKE and IKE fragmentation, it
   SHOULD negotiate IKE fragmentation over a TCP encapsulated session in
   case the session switches to UDP encapsulation on another network.

10.  Considerations for Keep-alives and DPD

   Encapsulating IKE and IPsec inside of a TCP connection can impact the
   strategy that implementations use to detect peer liveness and to
   maintain middlebox port mappings. Peer liveness should be checked
   using IKE Informational packets [RFC7296].

   In general, TCP port mappings are maintained by NATs longer than UDP
   port mappings, so IPsec ESP NAT keep-alives [RFC3948] SHOULD NOT be
   sent when using TCP encapsulation. Any implementation using TCP
   encapsulation MUST silently drop incoming NAT keep-alive packets, and
   not treat them as errors. NAT keep-alive packets over a TCP
   encapsulated IPsec connection will be sent with a length value of 1
   byte, whose value is 0xFF (Figure 2).

   Note that depending on the configuration of TCP and TLS on the
   connection, TCP keep-alives [RFC1122] and TLS keep-alives [RFC6520]
   may be used. These MUST NOT be used as indications of IKE peer
   liveness.

11.  Middlebox Considerations

   Many security networking devices such as Firewalls or Intrusion
   Prevention Systems, network optimization/acceleration devices and
   Network Address Translation (NAT) devices keep the state of sessions
   that traverse through them.

   These devices commonly track the transport layer and/or the
   application layer data to drop traffic that is anomalous or malicious
   in nature.
   While many of these devices will be more likely to pass
   TCP-encapsulated traffic as opposed to UDP-encapsulated traffic, some
   may still block or interfere with TCP-encapsulated IKE and IPsec.

   A network device that monitors the transport layer will track the
   state of TCP sessions, such as TCP sequence numbers. TCP
   encapsulation of IKE should therefore use standard TCP behaviors to
   avoid being dropped by middleboxes.

12.  Performance Considerations

   Several aspects of TCP encapsulation for IKE and IPsec packets may
   negatively impact the performance of connections within a tunnel-mode
   IPsec SA. Implementations should be aware of these performance
   impacts and take these into consideration when determining when to
   use TCP encapsulation. Implementations SHOULD favor using direct ESP
   or UDP encapsulation over TCP encapsulation whenever possible.

12.1.  TCP-in-TCP

   If the outer connection between IKE peers is over TCP, inner TCP
   connections may suffer effects from using TCP within TCP. Running
   TCP within TCP is discouraged, since the TCP algorithms generally
   assume that they are running over an unreliable datagram layer.

   If the outer (tunnel) TCP connection experiences packet loss, this
   loss will be hidden from any inner TCP connections, since the outer
   connection will retransmit to account for the losses. Since the
   outer TCP connection will deliver the inner messages in order, any
   messages after a lost packet may have to wait until the loss is
   recovered. This means that loss on the outer connection will be
   interpreted only as delay by inner connections. The burstiness of
   inner traffic can increase, since a large number of inner packets may
   be delivered across the tunnel at once. The inner TCP connection may
   interpret a long period of delay as a transmission problem,
   triggering a retransmission timeout, which will cause spurious
   retransmissions.
The sending rate of the inner connection may be 568 unnecessarily reduced if the retransmissions are not detected as 569 spurious in time. 571 The inner TCP connection's round-trip-time estimation will be 572 affected by the burstiness of the outer TCP connection if there are 573 long delays when packets are retransmitted by the outer TCP 574 connection. This will make the congestion control loop of the inner 575 TCP traffic less reactive, potentially permanently leading to a lower 576 sending rate than the outer TCP would allow for. 578 TCP-in-TCP can also lead to increased buffering, or bufferbloat. 579 This can occur when the window size of the outer TCP connection is 580 reduced, and becomes smaller than the window sizes of the inner TCP 581 connections. This can lead to packets backing up in the outer TCP 582 connection's send buffers. In order to limit this effect, the outer 583 TCP connection should have limits on its send buffer size, and on the 584 rate at which it reduces its window size. 586 Note that any negative effects will be shared between all flows going 587 through the outer TCP connection. This is of particular concern for 588 any latency-sensitive or real-time applications using the tunnel. If 589 such traffic is using a TCP encapsulated IPsec connection, it is 590 recommended that the number of inner connections sharing the tunnel 591 be limited as much as possible. 593 12.2. Added Reliability for Unreliable Protocols 595 Since ESP is an unreliable protocol, transmitting ESP packets over a 596 TCP connection will change the fundamental behavior of the packets. 597 Some application-level protocols that prefer packet loss to delay 598 (such as Voice over IP or other real-time protocols) may be 599 negatively impacted if their packets are retransmitted by the TCP 600 connection due to packet loss. 602 12.3. 
Quality of Service Markings 604 Quality of Service (QoS) markings, such as DSCP and Traffic Class, 605 should be used with care on TCP connections used for encapsulation. 606 Individual packets SHOULD NOT use different markings than the rest of 607 the connection, since packets with different priorities may be routed 608 differently and cause unnecessary delays in the connection. 610 12.4. Maximum Segment Size 612 A TCP connection used for IKE encapsulation SHOULD negotiate its 613 maximum segment size (MSS) in order to avoid unnecessary 614 fragmentation of packets. 616 12.5. Tunnelling ECN in TCP 618 Since there is not a one-to-one relationship between outer IP packets 619 and inner ESP/IP messages when using TCP encapsulation, the markings 620 for Explicit Congestion Notification (ECN) [RFC3168] cannot be simply 621 mapped. However, any ECN Congestion Experienced (CE) marking on 622 inner messages should be preserved through the tunnel. 624 Implementations SHOULD follow the ECN compatibility mode as described 625 in [RFC6040]. In compatibility mode, the outer TCP connection SHOULD 626 mark its packets as not ECN-capable, and MUST NOT clear any ECN 627 markings on inner packets. Note that outer packets may be ECN marked 628 even though the outer connection did not negotiate support for ECN. 629 If an implementation receives such an outer packet, it MAY propagate 630 the markings as described in the Default Tunnel Egress Behaviour 631 [RFC6040] for any inner packet contained within a single outer TCP 632 packet, or simply apply the rules as if the outer packet were Not-ECT 633 if the inner packet spans multiple outer packets. 635 13. Security Considerations 637 IKE Responders that support TCP encapsulation may become vulnerable 638 to new Denial-of-Service (DoS) attacks that are specific to TCP, such 639 as SYN-flooding attacks. TCP Responders should be aware of this 640 additional attack-surface. 
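As an illustration of the egress rule in Section 12.5 above, the following Python sketch shows one way a decapsulator might choose the ECN field of an inner packet. It is a sketch under stated assumptions, not a normative procedure: the function names and the `outer_marks` list (the ECN fields of the outer TCP segments that carried one inner packet) are hypothetical, and the table encodes the Default Tunnel Egress Behaviour of [RFC6040] as understood here.

```python
# ECN codepoints from RFC 3168 (two-bit field in the IP header).
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11
DROP = None  # hypothetical sentinel: the inner packet should be dropped


def rfc6040_default_egress(inner, outer):
    """Combine inner and outer ECN fields per the RFC 6040 default egress table."""
    if inner == NOT_ECT:
        # A Not-ECT inner packet must never be forwarded carrying a CE mark.
        return DROP if outer == CE else NOT_ECT
    if inner == CE or outer == CE:
        return CE
    if outer == ECT1:
        return ECT1
    return inner


def egress_ecn(inner, outer_marks):
    """Apply the rule from Section 12.5 to one decapsulated inner packet.

    outer_marks holds the ECN field of every outer segment that carried
    part of the inner packet (an assumption of this sketch).
    """
    if len(outer_marks) == 1:
        # Inner packet contained within a single outer TCP packet.
        return rfc6040_default_egress(inner, outer_marks[0])
    # Inner packet spans several outer packets: treat the outer as Not-ECT,
    # which leaves the inner marking untouched.
    return rfc6040_default_egress(inner, NOT_ECT)
```

Treating a multi-segment inner packet as if the outer header were Not-ECT never clears an inner marking, which matches the MUST NOT in Section 12.5.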
642 TCP Responders should be careful to ensure that the stream prefix 643 "IKETCP" uniquely identifies incoming streams as ones that use the 644 TCP encapsulation protocol, and they are not running any other 645 protocols on the same listening port that could conflict with this. 647 Attackers may be able to disrupt the TCP connection by sending 648 spurious RST packets. Due to this, implementations SHOULD make sure 649 that IKE session state persists even if the underlying TCP connection 650 is torn down. 652 If MOBIKE is being used, all of the security considerations outlined 653 for MOBIKE apply [RFC4555]. 655 Similarly to MOBIKE, TCP encapsulation requires a TCP Responder to 656 handle changing of source address and port due to network or 657 connection disruption. The successful delivery of valid IKE or ESP 658 messages over a new TCP connection is used by the TCP Responder to 659 determine where to send subsequent responses. If an attacker is able 660 to send packets on a new TCP connection that pass the validation 661 checks of the TCP Responder, it can influence which path future 662 packets take. For this reason, the validation of messages on the TCP 663 Responder must include decryption, authentication, and replay checks. 665 Since TCP provides a reliable, in-order delivery of ESP messages, the 666 ESP Anti-Replay Window size SHOULD be set to 1. See [RFC4303] for a 667 complete description of the ESP Anti-Replay Window. This increases 668 the protection of implementations against replay attacks. 670 14. IANA Considerations 672 TCP port 4500 is already allocated to IPsec for NAT Traversal. This 673 port SHOULD be used for TCP encapsulated IKE and ESP as described in 674 this document. 676 This document updates the reference for TCP port 4500: 678 Keyword Decimal Description Reference 679 ------- ------- ----------- --------- 680 ipsec-nat-t 4500/tcp IPsec NAT-Traversal [RFC-this-rfc] 682 Figure 4 684 15. 
Acknowledgments 686 The authors would like to acknowledge the input and advice of Stuart 687 Cheshire, Delziel Fernandes, Yoav Nir, Christoph Paasch, Yaron 688 Sheffer, David Schinazi, Graham Bartlett, Byju Pularikkal, March Wu, 689 Kingwel Xie, Valery Smyslov, Jun Hu, and Tero Kivinen. Special 690 thanks to Eric Kinnear for his implementation work. 692 16. References 694 16.1. Normative References 696 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 697 Requirement Levels", BCP 14, RFC 2119, 698 DOI 10.17487/RFC2119, March 1997, 699 . 701 [RFC3948] Huttunen, A., Swander, B., Volpe, V., DiBurro, L., and M. 702 Stenberg, "UDP Encapsulation of IPsec ESP Packets", 703 RFC 3948, DOI 10.17487/RFC3948, January 2005, 704 . 706 [RFC4303] Kent, S., "IP Encapsulating Security Payload (ESP)", 707 RFC 4303, DOI 10.17487/RFC4303, December 2005, 708 . 710 [RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion 711 Notification", RFC 6040, DOI 10.17487/RFC6040, November 712 2010, . 714 [RFC7296] Kaufman, C., Hoffman, P., Nir, Y., Eronen, P., and T. 715 Kivinen, "Internet Key Exchange Protocol Version 2 716 (IKEv2)", STD 79, RFC 7296, DOI 10.17487/RFC7296, October 717 2014, . 719 16.2. Informative References 721 [I-D.nir-ipsecme-ike-tcp] 722 Nir, Y., "A TCP transport for the Internet Key Exchange", 723 draft-nir-ipsecme-ike-tcp-01 (work in progress), July 724 2012. 726 [RFC1122] Braden, R., Ed., "Requirements for Internet Hosts - 727 Communication Layers", STD 3, RFC 1122, 728 DOI 10.17487/RFC1122, October 1989, 729 . 731 [RFC2817] Khare, R. and S. Lawrence, "Upgrading to TLS Within 732 HTTP/1.1", RFC 2817, DOI 10.17487/RFC2817, May 2000, 733 . 735 [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition 736 of Explicit Congestion Notification (ECN) to IP", 737 RFC 3168, DOI 10.17487/RFC3168, September 2001, 738 . 740 [RFC4555] Eronen, P., "IKEv2 Mobility and Multihoming Protocol 741 (MOBIKE)", RFC 4555, DOI 10.17487/RFC4555, June 2006, 742 . 
744 [RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer Security 745 (TLS) Protocol Version 1.2", RFC 5246, 746 DOI 10.17487/RFC5246, August 2008, 747 . 749 [RFC6520] Seggelmann, R., Tuexen, M., and M. Williams, "Transport 750 Layer Security (TLS) and Datagram Transport Layer Security 751 (DTLS) Heartbeat Extension", RFC 6520, 752 DOI 10.17487/RFC6520, February 2012, 753 . 755 [RFC7383] Smyslov, V., "Internet Key Exchange Protocol Version 2 756 (IKEv2) Message Fragmentation", RFC 7383, 757 DOI 10.17487/RFC7383, November 2014, 758 . 760 Appendix A. Using TCP encapsulation with TLS 762 This section provides recommendations on how to use TLS in addition 763 to TCP encapsulation. 765 When using TCP encapsulation, implementations may choose to use TLS 766 [RFC5246] on the TCP connection to be able to traverse middle-boxes, 767 which may otherwise block the traffic. 769 If a web proxy is applied to the ports used for the TCP connection, 770 and TLS is being used, the TCP Originator can send an HTTP CONNECT 771 message to establish an SA through the proxy [RFC2817]. 773 The use of TLS should be configurable on the peers, and may be used 774 as the default when using TCP encapsulation, or else be a fallback 775 when basic TCP encapsulation fails. The TCP Responder may expect to 776 read encapsulated IKE and ESP packets directly from the TCP 777 connection, or it may expect to read them from a stream of TLS data 778 packets. The TCP Originator should be pre-configured to use TLS or 779 not when communicating with a given port on the TCP Responder. 781 When new TCP connections are re-established due to a broken 782 connection, TLS must be re-negotiated. TLS Session Resumption is 783 recommended to improve efficiency in this case. 
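For the web-proxy case mentioned earlier in this appendix, the HTTP CONNECT request sent by the TCP Originator might be built as in the following Python sketch. The helper names and the example host are hypothetical, and proxy authentication is deliberately left out. The default port of 4500 follows the IANA Considerations; a deployment that runs TLS on another pre-configured port would pass that port instead.

```python
def build_connect_request(host: str, port: int = 4500) -> bytes:
    """Build a minimal HTTP/1.1 CONNECT request for host:port."""
    # IPv6 literals must be bracketed in the request target.
    authority = f"[{host}]:{port}" if ":" in host else f"{host}:{port}"
    request = f"CONNECT {authority} HTTP/1.1\r\nHost: {authority}\r\n\r\n"
    return request.encode("ascii")


def connect_succeeded(response_head: bytes) -> bool:
    """Return True if the proxy answered the CONNECT with a 2xx status."""
    status_line = response_head.split(b"\r\n", 1)[0]
    parts = status_line.split(b" ", 2)
    return (
        len(parts) >= 2
        and parts[0].startswith(b"HTTP/")
        and len(parts[1]) == 3
        and parts[1].isdigit()
        and parts[1].startswith(b"2")
    )
```

Once the proxy returns a 2xx response, the connection behaves as a byte stream to the TCP Responder, and the TLS handshake and stream prefix can follow as in Appendix B.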
785 The security of the IKE session is entirely derived from the IKE 786 negotiation and key establishment and not from the TLS session, which 787 in this context is used only for encapsulation. Therefore, 788 when TLS is used on the TCP connection, both the TCP Originator and 789 TCP Responder SHOULD allow the NULL cipher to be selected for 790 performance reasons. 792 Implementations should be aware that the use of TLS introduces 793 another layer of overhead, requiring more bytes to transmit a given 794 IKE or IPsec packet. For this reason, direct ESP, UDP 795 encapsulation, or TCP encapsulation without TLS should be preferred 796 in situations in which TLS is not required in order to traverse 797 middle-boxes. 799 Appendix B. Example exchanges of TCP Encapsulation with TLS 801 B.1. Establishing an IKE session 803 Client Server 804 ---------- ---------- 805 1) -------------------- TCP Connection ------------------- 806 (IP_I:Port_I -> IP_R:Port_R) 807 TcpSyn ----------> 808 <---------- TcpSyn,Ack 809 TcpAck ----------> 811 2) --------------------- TLS Session --------------------- 812 ClientHello ----------> 813 ServerHello 814 Certificate* 815 ServerKeyExchange* 816 <---------- ServerHelloDone 817 ClientKeyExchange 818 CertificateVerify* 819 [ChangeCipherSpec] 820 Finished ----------> 821 [ChangeCipherSpec] 822 <---------- Finished 824 3) ---------------------- Stream Prefix -------------------- 825 "IKETCP" ----------> 826 4) ----------------------- IKE Session --------------------- 827 Length + Non-ESP Marker ----------> 828 IKE_SA_INIT 829 HDR, SAi1, KEi, Ni, 830 [N(NAT_DETECTION_*_IP)] 831 <------ Length + Non-ESP Marker 832 IKE_SA_INIT 833 HDR, SAr1, KEr, Nr, 834 [N(NAT_DETECTION_*_IP)] 835 Length + Non-ESP Marker ----------> 836 first IKE_AUTH 837 HDR, SK {IDi, [CERTREQ] 838 CP(CFG_REQUEST), IDr, 839 SAi2, TSi, TSr, ...} 840 <------ Length + Non-ESP Marker 841 first IKE_AUTH 842 HDR, SK {IDr, [CERT], AUTH, 843 EAP, SAr2, TSi, TSr} 844 Length +
Non-ESP Marker ----------> 845 IKE_AUTH + EAP 846 repeat 1..N times 847 <------ Length + Non-ESP Marker 848 IKE_AUTH + EAP 849 Length + Non-ESP Marker ----------> 850 final IKE_AUTH 851 HDR, SK {AUTH} 852 <------ Length + Non-ESP Marker 853 final IKE_AUTH 854 HDR, SK {AUTH, CP(CFG_REPLY), 855 SA, TSi, TSr, ...} 857 -------------- IKE and IPsec SAs Established ------------ 858 Length + ESP frame ----------> 860 Figure 5 862 1. The client establishes a TCP connection with the server on port 863 4500, or an alternate pre-configured port that the server is 864 listening on. 866 2. If configured to use TLS, the client initiates a TLS handshake. 867 During the TLS handshake, the server SHOULD NOT request the 868 client's certificate, since authentication is handled as part 869 of IKE negotiation. 871 3. The client sends the Stream Prefix for TCP encapsulated IKE 872 traffic (Section 4) to signal the beginning of IKE negotiation. 874 4. The client and server establish an IKE connection. This example 875 shows EAP-based authentication, although any authentication 876 type may be used. 878 B.2. Deleting an IKE session 880 Client Server 881 ---------- ---------- 882 1) ----------------------- IKE Session --------------------- 883 Length + Non-ESP Marker ----------> 884 INFORMATIONAL 885 HDR, SK {[N,] [D,] 886 [CP,] ...} 887 <------ Length + Non-ESP Marker 888 INFORMATIONAL 889 HDR, SK {[N,] [D,] 890 [CP,] ...} 892 2) --------------------- TLS Session --------------------- 893 close_notify ----------> 894 <---------- close_notify 895 3) -------------------- TCP Connection ------------------- 896 TcpFin ----------> 897 <---------- Ack 898 <---------- TcpFin 899 Ack ----------> 900 --------------------- IKE SA Deleted ------------------- 902 Figure 6 904 1. The client and server exchange INFORMATIONAL messages to signal 905 deletion of the IKE SA. 907 2. The client and server close the TLS session by exchanging TLS 908 close_notify alerts. 910 3. The TCP connection is torn down.
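The "IKETCP", "Length + Non-ESP Marker", and "Length + ESP frame" entries in the figures above can be read as the byte layout sketched below in Python. This is a sketch only, with hypothetical function names. It assumes, following the framing defined earlier in this document, that the 16-bit Length field is in network byte order and counts itself, and that the Non-ESP Marker is four zero octets.

```python
import struct

STREAM_PREFIX = b"IKETCP"              # sent once, at the start of the stream
NON_ESP_MARKER = b"\x00\x00\x00\x00"   # assumed: four zero octets precede IKE
LENGTH_SIZE = 2                        # assumed: Length counts itself


def frame_ike(ike_message: bytes) -> bytes:
    """Return Length + Non-ESP Marker + IKE message."""
    total = LENGTH_SIZE + len(NON_ESP_MARKER) + len(ike_message)
    if total > 0xFFFF:
        raise ValueError("message too long for a 16-bit Length field")
    return struct.pack("!H", total) + NON_ESP_MARKER + ike_message


def frame_esp(esp_packet: bytes) -> bytes:
    """Return Length + ESP packet."""
    if esp_packet[:4] == NON_ESP_MARKER:
        raise ValueError("ESP packet would be mistaken for an IKE message")
    total = LENGTH_SIZE + len(esp_packet)
    if total > 0xFFFF:
        raise ValueError("packet too long for a 16-bit Length field")
    return struct.pack("!H", total) + esp_packet


def split_frames(stream: bytes):
    """Check the stream prefix, then split off every complete frame.

    Returns (frames, leftover), where frames is a list of (kind, payload)
    tuples and leftover holds bytes of a frame that is not yet complete.
    """
    if not stream.startswith(STREAM_PREFIX):
        raise ValueError("stream does not start with the IKETCP prefix")
    pos = len(STREAM_PREFIX)
    frames = []
    while len(stream) - pos >= LENGTH_SIZE:
        (length,) = struct.unpack_from("!H", stream, pos)
        if length < LENGTH_SIZE:
            raise ValueError("invalid Length field")
        if pos + length > len(stream):
            break  # wait for more data from the TCP connection
        body = stream[pos + LENGTH_SIZE:pos + length]
        if body[:4] == NON_ESP_MARKER:
            frames.append(("IKE", body[4:]))
        else:
            frames.append(("ESP", body))
        pos += length
    return frames, stream[pos:]
```

Because TCP delivers a byte stream rather than datagrams, a receiver has to buffer the leftover bytes and retry once more data arrives; the second return value of `split_frames` is there for that purpose.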
912 The deletion of the IKE SA should lead to the disposal of the 913 underlying TLS and TCP state. 915 B.3. Re-establishing an IKE session 917 Client Server 918 ---------- ---------- 919 1) -------------------- TCP Connection ------------------- 920 (IP_I:Port_I -> IP_R:Port_R) 921 TcpSyn ----------> 922 <---------- TcpSyn,Ack 923 TcpAck ----------> 924 2) --------------------- TLS Session --------------------- 925 ClientHello ----------> 926 <---------- ServerHello 927 [ChangeCipherSpec] 928 Finished 929 [ChangeCipherSpec] ----------> 930 Finished 931 3) ---------------------- Stream Prefix -------------------- 932 "IKETCP" ----------> 933 4) <---------------------> IKE/ESP flow <------------------> 934 Length + ESP frame ----------> 936 Figure 7 938 1. If a previous TCP connection was broken (for example, due to a 939 RST), the client is responsible for re-initiating the TCP 940 connection. The TCP Originator's address and port (IP_I and 941 Port_I) may be different from the previous connection's address 942 and port. 944 2. In the TLS ClientHello message, the client SHOULD send the Session 945 ID it received in the previous TLS handshake, if available. It 946 is up to the server to perform either an abbreviated handshake 947 or a full handshake, depending on whether the Session ID matches. 949 3. After the TCP and TLS handshakes are complete, the client sends the Stream 950 Prefix for TCP encapsulated IKE traffic (Section 4). 952 4. The IKE and ESP packet flow can resume. If MOBIKE is being 953 used, the Initiator SHOULD send UPDATE_SA_ADDRESSES. 955 B.4.
Using MOBIKE between UDP and TCP Encapsulation 957 Client Server 958 ---------- ---------- 959 (IP_I1:UDP500 -> IP_R:UDP500) 960 1) ----------------- IKE_SA_INIT Exchange ----------------- 961 (IP_I1:UDP4500 -> IP_R:UDP4500) 962 Non-ESP Marker -----------> 963 Initial IKE_AUTH 964 HDR, SK { IDi, CERT, AUTH, 965 CP(CFG_REQUEST), 966 SAi2, TSi, TSr, 967 N(MOBIKE_SUPPORTED) } 968 <----------- Non-ESP Marker 969 Initial IKE_AUTH 970 HDR, SK { IDr, CERT, AUTH, 971 EAP, SAr2, TSi, TSr, 972 N(MOBIKE_SUPPORTED) } 973 <------------------ IKE SA establishment ---------------> 975 2) ------------ MOBIKE Attempt on new network -------------- 976 (IP_I2:UDP4500 -> IP_R:UDP4500) 977 Non-ESP Marker -----------> 978 INFORMATIONAL 979 HDR, SK { N(UPDATE_SA_ADDRESSES), 980 N(NAT_DETECTION_SOURCE_IP), 981 N(NAT_DETECTION_DESTINATION_IP) } 983 3) -------------------- TCP Connection ------------------- 984 (IP_I2:Port_I -> IP_R:Port_R) 985 TcpSyn -----------> 986 <----------- TcpSyn,Ack 987 TcpAck -----------> 989 4) --------------------- TLS Session --------------------- 990 ClientHello -----------> 991 ServerHello 992 Certificate* 993 ServerKeyExchange* 994 <----------- ServerHelloDone 996 ClientKeyExchange 997 CertificateVerify* 998 [ChangeCipherSpec] 999 Finished -----------> 1000 [ChangeCipherSpec] 1001 <----------- Finished 1002 5) ---------------------- Stream Prefix -------------------- 1003 "IKETCP" ----------> 1005 6) ----------------------- IKE Session --------------------- 1006 Length + Non-ESP Marker -----------> 1007 INFORMATIONAL (Same as step 2) 1008 HDR, SK { N(UPDATE_SA_ADDRESSES), 1009 N(NAT_DETECTION_SOURCE_IP), 1010 N(NAT_DETECTION_DESTINATION_IP) } 1012 <------- Length + Non-ESP Marker 1013 HDR, SK { N(NAT_DETECTION_SOURCE_IP), 1014 N(NAT_DETECTION_DESTINATION_IP) } 1015 7) <----------------- IKE/ESP data flow -------------------> 1017 Figure 8 1019 1. 
During the initial IKE_SA_INIT and IKE_AUTH exchanges, the client and server exchange 1020 MOBIKE_SUPPORTED notify payloads (carried in IKE_AUTH, as shown in Figure 8) to indicate support for 1021 MOBIKE. 1023 2. The client changes its point of attachment to the network, and 1024 receives a new IP address. The client attempts to re-establish 1025 the IKE session using the UPDATE_SA_ADDRESSES notify payload, 1026 but the server does not respond because the network blocks UDP 1027 traffic. 1029 3. The client brings up a TCP connection to the server in order to 1030 use TCP encapsulation. 1032 4. The client initiates a TLS handshake with the server. 1034 5. The client sends the Stream Prefix for TCP encapsulated IKE 1035 traffic (Section 4). 1037 6. The client sends the UPDATE_SA_ADDRESSES notify payload on the 1038 TCP encapsulated connection. Note that this IKE message is the 1039 same as the one sent over UDP in step 2, and should have the 1040 same message ID and contents. 1042 7. The IKE and ESP packet flow can resume. 1044 Authors' Addresses 1046 Tommy Pauly 1047 Apple Inc. 1048 1 Infinite Loop 1049 Cupertino, California 95014 1050 US 1052 Email: tpauly@apple.com 1054 Samy Touati 1055 Ericsson 1056 2755 Augustine 1057 Santa Clara, California 95054 1058 US 1060 Email: samy.touati@ericsson.com 1062 Ravi Mantha 1063 Cisco Systems 1064 SEZ, Embassy Tech Village 1065 Panathur, Bangalore 560 037 1066 India 1068 Email: ramantha@cisco.com