idnits 2.17.00 (12 Aug 2021)

/tmp/idnits46436/draft-ymbk-idr-l3nd-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 1 instance of lines with multicast IPv4 addresses in the
     document.  If these are generic example addresses, they should be changed
     to use the 233.252.0.x range defined in RFC 5771

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document date (21 March 2022) is 61 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-09) exists of
     draft-ietf-lsvr-l3dl-08

  -- Possible downref: Non-RFC (?) normative reference: ref. 'IANA-PEN'

  ** Obsolete normative reference: RFC 5226 (Obsoleted by RFC 8126)

  == Outdated reference: A later version (-05) exists of
     draft-ymbk-idr-l3nd-ulpc-04

     Summary: 1 error (**), 0 flaws (~~), 3 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                            R. Bush
Internet-Draft                        Arrcus & Internet Initiative Japan
Intended status: Standards Track                              R. Housley
Expires: 22 September 2022                                Vigil Security
                                                              R. Austein
                                                                  Arrcus
                                                                S. Hares
                                                 Hickory Hill Consulting
                                                                K. Patel
                                                                  Arrcus
                                                           21 March 2022


                       Layer-3 Neighbor Discovery
                         draft-ymbk-idr-l3nd-03

Abstract

   Data Centers where the topology is BGP-based need to discover
   neighbor IP addressing, IP Layer-3 BGP neighbors, etc.  This Layer-3
   Neighbor Discovery protocol identifies BGP neighbor candidates.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 22 September 2022.

Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.
   Code Components extracted from this document must include Revised
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Revised BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Background
   4.  Inter-Link Protocol Overview
     4.1.  L3ND Ladder Diagram
   5.  TLV PDUs
   6.  HELLO
   7.  TCP Set-Up
   8.  OPEN
   9.  ACK
     9.1.  Retransmission
   10. The Encapsulations
     10.1.  The Encapsulation PDU Skeleton
     10.2.  Encapsulation Flags
     10.3.  IPv4 Encapsulation
     10.4.  IPv6 Encapsulation
     10.5.  MPLS Label List
     10.6.  MPLS IPv4 Encapsulation
     10.7.  MPLS IPv6 Encapsulation
   11. VENDOR - Vendor Extensions
   12. Discussion
     12.1.  HELLO Discussion
   13. VLANs/SVIs/Sub-interfaces
   14. Implementation Considerations
   15. Security Considerations
   16. IANA Considerations
     16.1.  Link Local Layer-3 Addresses
     16.2.  Ports for TLS/TCP
     16.3.  PDU Types
     16.4.  Flag Bits
     16.5.  Error Codes
   17. Acknowledgments
   18. References
     18.1.  Normative References
     18.2.  Informative References
   Authors' Addresses

1.  Introduction

   The Massive Data Center (MDC) environment presents unusual problems
   of scale, e.g. O(10,000) forwarding devices, while its homogeneity
   presents opportunities for simple approaches.  Layer-3 Discovery and
   Liveness (L3DL), [I-D.ietf-lsvr-l3dl], provides neighbor discovery at
   Layer-2.  This document (set) provides a similar solution at Layer-3,
   attempting to be as similar as reasonable to L3DL.

   Some guiding principles when dealing with datacenters with tens of
   thousands of devices are

   *  Predictable Reliability,

   *  Security: Authorization and Integrity more than Confidentiality,
      and

   *  Massive Scalability

   Layer-3 Neighbor Discovery (L3ND) provides brutally simple mechanisms
   for neighboring devices to

   *  Discover each other's IP Addresses,

   *  Discover mutually supported layer-3 encapsulations, e.g.
      IPv4/IPv6/MPLS,

   *  Discover Layer-3 IP and/or MPLS addressing of interfaces of the
      encapsulations,

   *  Provide authenticity, integrity, and verification of protocol
      messages, and

   *  Accommodate scaling needed for EVPN etc.

   L3ND is intended for use within single IP subnets (IP over Ethernet
   or other point-to-point or multi-point IP link) in order to exchange
   the data needed to bootstrap BGP-based peering, EVPN, etc.;
   especially in a datacenter Clos [Clos] environment.  Once IP
   connectivity has been leveraged to discover layer-3 addressability
   and forwarding capabilities, normal IP forwarding and routing can
   take over.

   L3ND might be more widely applicable to a range of routing and
   similar protocols which need Layer-3 neighbor discovery.

2.  Terminology

   Even though it concentrates on the inter-device layer, this document
   relies heavily on routing terminology.  The following attempts to
   clarify the use of some possibly confusing terms:

   Clos:       A hierarchic subset of a crossbar switch topology
               commonly used in data centers [Clos].

   Datagram:   The L3ND content of a single Layer-3 UDP Datagram.

   Encapsulation:  Address Family Indicator and Subsequent Address
               Family Indicator (AFI/SAFI).  I.e. classes of Layer-2.5
               and Layer-3 addresses such as IPv4, IPv6, MPLS.

   Link or Logical Link:  A logical connection between two interfaces on
               two different devices.  E.g. two VLANs between the same
               two ports are two links.

   MDC:        Massive Data Center, commonly composed of thousands of
               Top of Rack Switches (TORs).

   MTU:        Maximum Transmission Unit, the size in octets of the
               largest packet that can be sent on a medium, see
               [RFC1122], Section 1.3.3.

   PDU:        Protocol Data Unit, an L3ND application layer message.

   Session:    A session, established via exchange of OPEN PDUs,
               between two L3ND-capable IP interfaces on a link.

   TOR Switch: Top Of Rack switch, aggregates the servers in a rack and
               connects to aggregation layers of the Clos tree, AKA the
               Clos spine.

3.  Background

   L3ND is primarily designed for a Clos type datacenter scale and
   topology, but can accommodate richer topologies which contain
   potential cycles.

   While L3ND is designed for the MDC, there are no inherent reasons it
   could not run on a WAN.  The authentication and authorization needed
   to run safely on a WAN need to be considered, and the appropriate
   level of security options chosen.

   The number of addresses of one Encapsulation type on an interface
   link may be quite large given a TOR switch with tens of servers, each
   server having a few hundred micro-services, resulting in an
   inordinate number of addresses.  And highly automated micro-service
   migration can cause serious address prefix disaggregation, resulting
   in interfaces with thousands of disaggregated prefixes.

   To meet such scaling needs, the L3ND protocol is session oriented and
   uses incremental announcement and withdrawal with session restart, a
   la BGP ([RFC4271]).

4.  Inter-Link Protocol Overview

   A device sends a Layer-3 multicast UDP datagram (HELLO) containing
   the port number on which it is willing to serve a TLS or raw TCP
   connection to support the data exchange of the rest of the protocol
   in a reliable and preferably authenticated manner.

   Another device on the link then establishes a TLS or raw TCP session
   in which inter-device PDUs are used to exchange device and logical
   link identities and layer-2.5 (MPLS) and 3 identifiers (not
   payloads), e.g. more IP Addresses, loopback addresses, port
   identities, and Encapsulations.

   To assure discovery of new devices coming up on a multi-link
   topology, devices on such a topology, and only on a multi-link
   topology, send periodic HELLOs forever, see Section 12.1.

   Given the TLS/TCP session, OPEN PDUs (Section 8) are exchanged; the
   Encapsulations (Section 10) configured on an end point may then be
   announced and modified.
   Note that these are only the encapsulation and addresses configured
   on the announcing interface; though a device's loopback and overlay
   interface(s) may also be announced.

4.1.  L3ND Ladder Diagram

   The HELLO, Section 6, is a priming message sent on all logical links
   configured for L3ND.  It is a small L3ND Multicast UDP PDU with the
   simple goal of advertising a TLS/TCP service available on an
   advertised port on the sending IP interface.

   The HELLO PDU is either IPv4 or IPv6, which selects the AFI to be
   used for the rest of the session(s) between end-points.  Two
   endpoints MAY establish a link for each AFI.

   An interface on the link receiving the HELLO PDU attempts to
   establish a TLS or raw TCP session, as specified by the HELLO, to the
   source IP address of the HELLO on the port advertised in the HELLO.

   The OPEN PDUs (Section 8), used to exchange details about the L3ND
   session, and the ACK/ERROR PDU are mandatory; other PDUs are
   optional, though at least one encapsulation SHOULD be agreed at some
   point.

   Like Multi-Protocol BGP, [RFC4760], an L3ND session running over one
   AFI MAY carry encapsulations etc. of different AFIs.

   An L3ND extension, [I-D.ymbk-idr-l3nd-ulpc], describes the next upper
   layer protocol to exchange BGP parameter information.

   The following is a ladder-style diagram of the L3ND protocol
   exchanges:

       |            HELLO            | Logical Link Peer discovery
       |---------------------------->|
       |          TCP OPEN           | Mandatory
       |<----------------------------|
       |                             |
       |                             |
       |            OPEN             | IDs, security, etc.
       |---------------------------->|
       |             ACK             |
       |<----------------------------|
       |                             |
       |            OPEN             | Mandatory
       |<----------------------------|
       |             ACK             |
       |---------------------------->|
       |                             |
       |                             |
       |  Interface IPv4 Addresses   | Interface IPv4 Addresses
       |---------------------------->| Optional
       |             ACK             |
       |<----------------------------|
       |                             |
       |  Interface IPv4 Addresses   |
       |<----------------------------|
       |             ACK             |
       |---------------------------->|
       |                             |
       |                             |
       |  Interface IPv6 Addresses   | Interface IPv6 Addresses
       |---------------------------->| Optional
       |             ACK             |
       |<----------------------------|
       |                             |
       |  Interface IPv6 Addresses   |
       |<----------------------------|
       |             ACK             |
       |---------------------------->|
       |                             |
       |                             |
       |   Interface MPLSv4 Labels   | Interface MPLSv4 Labels
       |---------------------------->| Optional
       |             ACK             |
       |<----------------------------|
       |                             |
       |   Interface MPLSv4 Labels   | Interface MPLSv4 Labels
       |<----------------------------| Optional
       |             ACK             |
       |---------------------------->|
       |                             |
       |                             |
       |   Interface MPLSv6 Labels   | Interface MPLSv6 Labels
       |---------------------------->| Optional
       |             ACK             |
       |<----------------------------|
       |                             |
       |   Interface MPLSv6 Labels   | Interface MPLSv6 Labels
       |<----------------------------| Optional
       |             ACK             |
       |---------------------------->|

5.  TLV PDUs

   The basic L3ND application layer PDU is a typical TLV (Type Length
   Value) PDU.  As it is transported over TCP, integrity is assured.
   When it is transported over TLS, authenticity is also provided.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Version = 0  |   PDU Type    |        Payload Length         ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                               |                               ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               ~
   ~                          Payload ...                          ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The fields of the basic L3ND header are as follows:

   Version:  An integer differentiating versions of the L3ND protocol.
      Currently, only Version 0 is specified.

   PDU Type:  An integer differentiating PDU payload types.  See
      Section 16.3.

   Payload Length:  Total number of octets in the Payload field.

   Payload:  The application layer content of the L3ND PDU.

6.  HELLO

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Version = 0  | PDU Type = 0  |      Payload Length = 3       ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                               |     Flags     |     Port      ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~               |
   +-+-+-+-+-+-+-+-+

   Flags (bit):
      0 - 0 Raw TCP, 1 TLS
      1 - 0 Self-Signed Cert for TLS, 1 CA-based

   The Payload Length is 3 to cover the Flags and Port fields.

   The Port is the two octet TCP Port Number (default is TBD3) on which
   the HELLO sender MUST have a waiting TLS/TCP (as specified in Flags)
   server listening.  Though the IANA assigned well-known port SHOULD be
   used, this field allows configuration of alternate ports.

   The IPv4 UDP packets are sent to the IPv4 link-local multicast
   address (TBD1) and the IPv6 UDP packets are sent to an IPv6
   link-local multicast address (TBD2).  See Section 12.1 for why
   multicast is used.

   The HELLO PDU solicits a unicast TLS/TCP session open request of the
   same AFI from other devices on the link.

   When a HELLO is received from a source IP address with which there is
   no established TLS/TCP L3ND session, if the receiver has the higher
   of the two IP addresses, it SHOULD respond by sending a TLS/TCP
   client open request, using the same AFI, to the source IP address of
   the HELLO to establish an L3ND TLS/TCP session.
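
   As an illustration only, the HELLO layout above (one octet Version,
   one octet PDU Type, four octet Payload Length, one octet Flags, two
   octet Port) might be encoded and decoded as in the following Python
   sketch.  The function names and the example port are hypothetical,
   and the sketch assumes that flag bit 0 is the most significant bit of
   the Flags octet, following the bit numbering of the diagrams; the
   draft does not state this explicitly.

```python
import struct

L3ND_VERSION = 0
PDU_TYPE_HELLO = 0
HEADER = struct.Struct("!BBI")   # Version, PDU Type, Payload Length
HELLO_BODY = struct.Struct("!BH")  # Flags, Port

# Assumption: bit 0 of Flags is the most significant bit of the octet.
FLAG_TLS = 0x80        # bit 0: 0 raw TCP, 1 TLS
FLAG_CA_BASED = 0x40   # bit 1: 0 self-signed cert, 1 CA-based


def pack_hello(port, use_tls, ca_based):
    """Build the 9-octet HELLO datagram payload."""
    flags = (FLAG_TLS if use_tls else 0) | (FLAG_CA_BASED if ca_based else 0)
    body = HELLO_BODY.pack(flags, port)
    return HEADER.pack(L3ND_VERSION, PDU_TYPE_HELLO, len(body)) + body


def unpack_hello(datagram):
    """Parse a HELLO; raise ValueError if it is not well formed."""
    if len(datagram) != HEADER.size + HELLO_BODY.size:
        raise ValueError("unexpected HELLO size")
    version, pdu_type, length = HEADER.unpack_from(datagram, 0)
    if (version, pdu_type, length) != (L3ND_VERSION, PDU_TYPE_HELLO, 3):
        raise ValueError("not a version 0 HELLO with Payload Length 3")
    flags, port = HELLO_BODY.unpack_from(datagram, HEADER.size)
    return {
        "tls": bool(flags & FLAG_TLS),
        "ca_based": bool(flags & FLAG_CA_BASED),
        "port": port,
    }
```

   A sender would place the result of pack_hello() in a UDP datagram
   addressed to the link-local multicast group; the group addresses and
   the default port are still TBD in this draft, so none are shown.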

   All L3ND PDUs other than HELLO are sent via TLS/TCP, as the server's
   destination IP address is known after the HELLO.

   When an interface is turned up on a device, it SHOULD issue a HELLO
   if it is configured to participate in L3ND sessions, and repeat
   HELLOs at a configured interval, with a default of 60 seconds.

   If the configured multicast destination address is one that is
   propagated by switches, the HELLO SHOULD be repeated at a configured
   interval, with a default of 60 seconds.  This allows discovery by new
   devices which come up on the mesh.  In this multi-link scenario, the
   operator should be aware of the trade-off between timer tuning and
   network noise and adjust the inter-HELLO timer accordingly.

   By default, GTSM [RFC5082] SHOULD be enabled to verify that a
   received HELLO originated on the local link.  It MAY be disabled by
   configuration.  GTSM check failures SHOULD be logged, though with
   rate limiting to keep from overwhelming logs.

   If more than one device responds, one adjacency is formed for each
   unique source IP address.  L3ND treats each adjacency as a separate
   logical link.

   To ameliorate possible load spikes during bootstrap or event
   recovery, there SHOULD be a jittered delay between receipt of a HELLO
   and TLS/TCP open.  The default delay range SHOULD be zero to five
   seconds, and MUST be configurable.

   If a HELLO is received from an IP Address with which there is an
   established session for that AFI, the HELLO SHOULD be dropped.

   A device with a TLS/TCP listener SHOULD log or otherwise report
   repeated failed inbound attempts.

7.  TCP Set-Up

   If the receiver of a HELLO does not agree with the sender's choice of
   TLS/TCP or does not agree with the verification choice, Self-Signed
   or CA-based, the receiver SHOULD respond with a HELLO specifying its
   preferences.
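
   The receiver-side decision just described, together with the jittered
   delay of Section 6, could be sketched as follows.  This is a rough
   illustration under stated assumptions, not a prescribed
   implementation: the names are hypothetical, and the sketch assumes
   that the certificate choice is only compared when both sides ask for
   TLS, since the flag is defined only for TLS.

```python
import random
from typing import NamedTuple

DEFAULT_MAX_JITTER = 5.0  # seconds; the default range is zero to five


class TransportPrefs(NamedTuple):
    tls: bool        # HELLO flag bit 0
    ca_based: bool   # HELLO flag bit 1, meaningful only with TLS


def prefs_agree(local, remote):
    """True when both sides want the same transport and verification."""
    if local.tls != remote.tls:
        return False
    if not local.tls:
        return True  # assumption: cert choice is moot for raw TCP
    return local.ca_based == remote.ca_based


def react_to_hello(local, remote):
    """'open' a session on agreement, else answer with our own HELLO."""
    return "open" if prefs_agree(local, remote) else "send-hello"


def open_delay(max_jitter=DEFAULT_MAX_JITTER, rng=random):
    """Jittered delay, in seconds, between HELLO receipt and open."""
    return rng.uniform(0.0, max_jitter)
```

   Passing the random source as a parameter keeps the delay configurable
   and lets a test substitute a seeded generator.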

   As it is assumed that the configured deployment of a data center
   would have compatible parameters on all devices, any disagreement
   over TLS/TCP or trust anchors MUST be logged, with rate limiting of
   the logging.

   By default, GTSM [RFC5082] SHOULD be enabled to ensure that a SYN
   received in response to a HELLO is on the local link.  It MAY be
   disabled by configuration.  GTSM check failures SHOULD be logged,
   though with rate limiting to keep from overwhelming logs.

   If the receiver of a HELLO agrees with the sender's choice of TLS/TCP
   and authentication, both sides have agreed on an AFI for the
   transport and on each other's IP address in that AFI.  This is
   sufficient to open a TCP session between them, which will allow for
   reliable transport of very large data PDUs while obviating the need
   to invent complex transports.

   The L3ND peer with the higher IP address MUST act as the TLS/TCP
   client and open the transport session (send SYN) toward the peer with
   the lower IP address.

   The server, the sender of the HELLO from the lower IP address,
   listens on the advertised port for the TLS/TCP session open.  The
   receiver of the acceptable HELLO, the TLS/TCP client, initiates a TLS
   or raw TCP session toward the sender of the HELLO, the TLS/TCP
   server, preferably TLS, as advertised.  If TLS, the server has chosen
   and signaled either a self-signed certificate or one configured from
   the operational CA trusted by both parties, as negotiated in the
   HELLO exchange.

   Once the TLS/TCP session is established, if its interface is
   configured as point-to-point, the client side SHOULD stop listening
   on any port for which it has sent a HELLO.  The server, if configured
   as a point-to-point interface, SHOULD stop sending HELLOs.

   If the TLS/TCP open fails, then this SHOULD be logged and the parties
   MUST go back to the initial state and try HELLO.  Logging SHOULD be
   rate limited.
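
   Two of the rules above lend themselves to a short sketch: choosing
   the client and server roles by comparing IP addresses, and rate
   limiting the log messages.  The class and function names are
   hypothetical, and the 60 second logging interval is an assumption
   made for illustration; the draft does not give a value.

```python
import ipaddress
import time


def transport_role(local_ip, peer_ip):
    """The peer with the higher IP address is the TLS/TCP client."""
    local = ipaddress.ip_address(local_ip)
    peer = ipaddress.ip_address(peer_ip)
    if local.version != peer.version:
        raise ValueError("both addresses must be of the same AFI")
    if local == peer:
        raise ValueError("peers cannot share an address")
    return "client" if local > peer else "server"


class RateLimitedLog:
    """Emit at most one message per key per interval."""

    def __init__(self, interval=60.0, clock=time.monotonic, sink=print):
        self._interval = interval  # assumed default, not from the draft
        self._clock = clock
        self._sink = sink
        self._last = {}        # key -> time of last emitted message
        self._suppressed = {}  # key -> messages dropped since then

    def log(self, key, message):
        """Return True if the message was emitted, False if suppressed."""
        now = self._clock()
        last = self._last.get(key)
        if last is not None and now - last < self._interval:
            self._suppressed[key] = self._suppressed.get(key, 0) + 1
            return False
        dropped = self._suppressed.pop(key, 0)
        suffix = f" ({dropped} similar suppressed)" if dropped else ""
        self._sink(message + suffix)
        self._last[key] = now
        return True
```

   Keying the limiter by event type (for example "gtsm-fail" or
   "open-fail") keeps a burst of one failure from hiding another.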

   Should an interface with an established TLS/TCP session be
   reconfigured, changing the TLS/TCP parameters, the TLS/TCP session
   should be closed or torn down and both parties should return to the
   HELLO state.

   Should the TLS/TCP session terminate for any reason, the devices
   SHOULD restart/resume HELLOs.  When the new TLS/TCP session is
   started, if possible the OPEN PDU SHOULD try to resume the lost
   logical session by using the same nonce and resuming from the last
   Serial Number.

   Once the TLS/TCP session has been established, the two devices
   exchange L3ND PDUs, starting with OPENs.

8.  OPEN

   Each device has learned the other's IP Address from the HELLO
   exchange (see Section 6) and has established a TLS/TCP session over
   a particular AFI.

   The first PDU each sends MUST be an OPEN, and the other side MUST
   respond with an ACK PDU.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Version = 0  | PDU Type = 1  |        Payload Length         ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                               |          Session ID           ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                               |         Serial Number         ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                               |   AttrCount   |               ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
   ~                      Attribute List ...                       ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The four octet Payload Length is the number of octets in all fields
   of the PDU from the Session ID through the Serial Number.

   The four octet Session ID is a nonce which uniquely identifies a
   session.  It enables detection of a duplicate OPEN PDU.  It SHOULD be
   either a random number or a high resolution timestamp.  It is needed
   to prevent session closure due to a repeated OPEN caused by a race or
   a dropped or delayed ACK.
   It can be used to resume a dropped logical session.

   The one octet AttrCount is the number of attributes in the Attribute
   List.  A node may send zero or more attributes.

   Attributes are single octets the semantics of which are operator-
   defined, e.g.: spine, leaf, backbone, route reflector, arabica, ...

   Attribute syntax and semantics are local to an operator or
   datacenter; hence there is no global registry.  Nodes exchange their
   attributes only in the OPEN PDU.

   Unlike L3DL [I-D.ietf-lsvr-l3dl], there are no verifiable keys in the
   PDUs.  If the operator wants authentication, integrity, or
   confidentiality, then TLS MUST have been requested by the HELLO and
   agreed by the TLS session open.

   The Serial Number is a monotonically increasing four octet value
   representing the sender's state at the time of sending the last PDU.
   It may be a non-negative integer, a timestamp, etc.  If incrementing
   the Serial Number would cause it to be zero, it should be incremented
   again.

   On session restart (new OPEN, same Session ID), a receiver MAY send
   the last received Serial Number to tell the sender to only send data
   with a Serial Number greater (in the [RFC1982] sense), or send a
   Serial Number of zero to request all data.

   This allows a sender of an OPEN to tell the receiver that the sender
   would like to resume a logical session and that the receiver of the
   OPEN PDU only needs to send data starting with the PDU with the
   lowest Serial Number greater (in the [RFC1982] sense) than the one
   sent in the OPEN.  If the sender is not trying to resume a dropped
   session, the Serial Number MUST be zero.

   If the receiver of an OPEN PDU with a non-zero Serial Number cannot
   resume from the requested point, it should return an ACK with an
   Error Code of 5, Session May Not Be Continued, EType of 1.
   The sender of the failing OPEN PDU SHOULD respond with an OPEN PDU
   with a Serial Number of zero.

   If a sender of OPEN does not receive an ACK of the OPEN PDU in a
   configurable time (default 5 seconds), then it SHOULD close or
   otherwise terminate the TLS/TCP session and restart from the HELLO
   state.

   If an OPEN arrives at L3ND speaker A from B with which A believes it
   already has an L3ND session (i.e. OPENs have already been
   exchanged), and the Serial Number in B's OPEN PDU is non-zero,
   speaker A SHOULD establish a new sending session by sending an OPEN
   with the Serial Number being the same as that of A's last sent and
   ACKed PDU.  A MUST resume sending encapsulations etc. subsequent to
   the requested Serial Number.  And B MUST retain all previously
   discovered encapsulation and other data received from A.

   If an OPEN arrives at L3ND speaker A from B with which A believes it
   already has an L3ND session (i.e. OPENs have already been
   exchanged), and the Serial Number in B's OPEN is zero, then A MUST
   assume that B's internal state has been reset.  All previously
   discovered encapsulation data MUST be discarded; and A MUST respond
   with a new OPEN PDU with a Serial Number of zero.

   TCP KeepAlives should be configured and tuned to meet local
   operational needs.  Some defaults and recommendations are needed
   here.

9.  ACK

   The ACK PDU acknowledges receipt of a PDU and reports any error
   condition which might have been raised.
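
   For orientation, a minimal Python sketch of building and parsing an
   ACK follows, using the field sizes given in this section (one octet
   ACKed PDU, one octet EType, two octet Error Code, two octet Error
   Hint, for a Payload Length of 6).  The function names are
   hypothetical and the sketch is not a definitive implementation.

```python
import struct

L3ND_VERSION = 0
PDU_TYPE_ACK = 3
HEADER = struct.Struct("!BBI")     # Version, PDU Type, Payload Length
ACK_BODY = struct.Struct("!BBHH")  # ACKed PDU, EType, Error Code, Hint


def pack_ack(acked_pdu_type, etype=0, error_code=0, error_hint=0):
    """Build a 12-octet ACK PDU."""
    if etype == 0 and (error_code != 0 or error_hint != 0):
        # With EType zero, Error Code and Error Hint MUST also be zero.
        raise ValueError("EType 0 requires zero Error Code and Hint")
    body = ACK_BODY.pack(acked_pdu_type, etype, error_code, error_hint)
    return HEADER.pack(L3ND_VERSION, PDU_TYPE_ACK, len(body)) + body


def unpack_ack(pdu):
    """Return (acked_pdu_type, etype, error_code, error_hint)."""
    if len(pdu) != HEADER.size + ACK_BODY.size:
        raise ValueError("unexpected ACK size")
    version, pdu_type, length = HEADER.unpack_from(pdu, 0)
    if (version, pdu_type, length) != (L3ND_VERSION, PDU_TYPE_ACK, 6):
        raise ValueError("not a version 0 ACK with Payload Length 6")
    fields = ACK_BODY.unpack_from(pdu, HEADER.size)
    if fields[1] == 0 and (fields[2] != 0 or fields[3] != 0):
        raise ValueError("EType 0 with non-zero Error Code or Hint")
    return fields
```

   For example, pack_ack(1) acknowledges an OPEN without error, while
   pack_ack(1, etype=1, error_code=5) reports that a session may not be
   continued from the requested Serial Number.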

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Version = 0  | PDU Type = 3  |      Payload Length = 6       ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                               |   ACKed PDU   |     EType     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Error Code           |          Error Hint           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The ACK PDU acknowledges receipt of an OPEN, Encapsulation, Vendor
   PDU, etc. and is used to return error codes, if any.

   The one octet ACKed PDU is the PDU Type of the PDU being
   acknowledged, e.g., OPEN, one of the Encapsulations, etc.

   If there was an error processing the received PDU, then the one octet
   EType is non-zero.  If the EType is zero, Error Code and Error Hint
   MUST also be zero.

   A non-zero EType is the receiver's way of telling the PDU's sender
   that the receiver had problems processing the PDU.  The Error Code
   and Error Hint will tell the sender more detail about the error.

   The decimal value of EType gives a strong hint how the receiver
   sending the ACK believes things should proceed:

      0 - No Error, Error Code and Error Hint MUST be zero

      1 - Warning, something not too serious happened, continue

      2 - Session should not be continued, try to restart

      3 - Restart is hopeless, call the operator

      4-15 - Reserved

   The two octet Error Codes, which note protocol failures, are listed
   in Section 16.5.  Someone stuck in the 1990s might think of the
   catenation of EType and Error Code as an echo of 0x1zzz, 0x2zzz, etc.
   They might be right; or not.

   The two octet Error Hint is arbitrary additional data the sender of
   the error PDU thinks will help the recipient or the debugger with the
   particular error.

9.1.  Retransmission

   If a PDU sender expects an ACK, e.g.
   for an OPEN, an Encapsulation, a Vendor PDU, etc., and does not
   receive the ACK for a configurable time (default five seconds), the
   TLS/TCP session should be closed or dropped, and both sides revert to
   HELLO state.

10.  The Encapsulations

   Once the devices know each other's IP Addresses, and have established
   a TLS/TCP session and have successfully exchanged OPENs, the L3ND
   session is considered established, and the devices SHOULD exchange
   Layer-3 interface encapsulations, Layer-3 addresses, and Layer-2.5
   labels.

   Encapsulation data for any AFI/SAFI may be exchanged over a TLS/TCP
   session irrespective of the AFI/SAFI of the session transport.

   The Encapsulation types the peers exchange may be IPv4
   (Section 10.3), IPv6 (Section 10.4), MPLS IPv4 (Section 10.6), MPLS
   IPv6 (Section 10.7), and/or possibly others not defined here.

   The sender of an Encapsulation PDU MUST NOT assume that the receiver
   is capable of the same Encapsulation Type.  An ACK (Section 9) with
   EType of 0 merely acknowledges receipt.  Only if both peers have sent
   the same Encapsulation Type is it safe for Layer-3 protocols to
   assume that they are compatible for that Encapsulation Type.

   A receiver of an encapsulation might recognize an addressing
   conflict, such as both ends of the link trying to use the same
   address.  In this case, the receiver MUST respond with an error
   (Error Code 2, Logical Link Addressing Conflict) ACK.  As there may
   be other usable addresses or encapsulations, the receiver might log
   this error and continue, letting an upper layer topology builder deal
   with what works.

   Further, to consider a logical link of an Encapsulation Type to
   formally be established so that it may be used by other protocols,
   the addressing for the type must be compatible, e.g. on the same IP
   subnet.

10.1.  The Encapsulation PDU Skeleton

   The header for all encapsulation PDUs is as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Version = 0  |   PDU Type    |        Payload Length         ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                               |             Count             ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~               |                 Serial Number                 ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~               |             Encapsulation List...             ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   An Encapsulation PDU describes zero or more addresses of the
   encapsulation type.

   The three octet Count is the number of Encapsulations in the
   Encapsulation List.

   The Serial Number is a monotonically increasing four octet value
   representing the sender's state in time.  It may be an integer, a
   timestamp, etc.  On session restart (new OPEN), a receiver MAY send
   the last received Serial Number to request the sender to only send
   newer data.

   If a sender has multiple links on the same interface, separate state
   (data, ACKs, etc.) must be kept for each peer session.

   Over time, multiple Encapsulation PDUs may be sent for an interface
   in a session as configuration changes.

   The Receiver MUST acknowledge the Encapsulation PDU with an ACK PDU
   (Section 9) whose ACKed PDU field is the PDU Type of the
   Encapsulation PDU being acknowledged.

   If the Sender does not receive an ACK in a configurable interval
   (default five seconds), it SHOULD retransmit.  After a user
   configurable number of failures (default three), the L3ND session
   should be considered dead, TLS/TCP torn down, and the HELLO process
   SHOULD be restarted.
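
   The Serial Number handling used here and in OPEN (Section 8) can be
   illustrated with a short sketch: a 32-bit counter that skips zero
   when it wraps, a comparison in the [RFC1982] sense, and the selection
   of PDUs to send when a peer resumes from a given Serial Number.  The
   function names and the shape of the send history (a list of serial
   and PDU pairs) are assumptions made for illustration.

```python
SERIAL_BITS = 32
SERIAL_MOD = 1 << SERIAL_BITS
SERIAL_HALF = 1 << (SERIAL_BITS - 1)


def next_serial(current):
    """Increment, skipping zero, which is reserved for 'send all'."""
    nxt = (current + 1) % SERIAL_MOD
    return 1 if nxt == 0 else nxt


def serial_gt(a, b):
    """True when a is greater than b in the RFC 1982 sense.

    Values exactly half the number space apart are not comparable
    under RFC 1982; this sketch reports False for them.
    """
    return a != b and ((a - b) % SERIAL_MOD) < SERIAL_HALF


def pdus_to_resend(history, requested):
    """Select PDUs newer than the Serial Number a peer resumes from.

    history is a list of (serial, pdu) pairs in the order sent.
    A requested Serial Number of zero asks for all data.
    """
    if requested == 0:
        return [pdu for _, pdu in history]
    return [pdu for serial, pdu in history if serial_gt(serial, requested)]
```

   Comparing with modular arithmetic rather than a plain ">" keeps
   resumption working across the wrap from 0xFFFFFFFF to 1.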
If the link is broken below layer-3, retransmission MAY be retried if
data have not changed in the interim and the TLS/TCP session is still
alive.

Should an Encapsulation in the Encapsulation List be syntactically
invalid, e.g. an out of bounds prefix length, the entire
Encapsulation PDU MUST be dropped and the sending party notified by
an ACK PDU with an EType of 1 and an Error Code of 3, Encapsulation
Error.

10.2.  Encapsulation Flags

The one octet Encapsulation Flags field is a sequence of one bit
fields as follows:

      0            1            2            3         4 ... 7
+------------+------------+------------+------------+------------+
|  Ann/With  |  Primary   | Under/Over |  Loopback  | Reserved ..|
+------------+------------+------------+------------+------------+

Each encapsulation in an Encapsulation PDU of Type T may announce new
and/or withdraw old encapsulations of Type T.  It indicates this with
the Ann/With Encapsulation Flag, Announce == 1, Withdraw == 0.

Announcing an encapsulation which already exists SHOULD raise an
Announce/Withdraw Error (see Section 16.5); the EType SHOULD be 2,
suggesting a session restart (see Section 9) so all encapsulations
will be resent.

If an interface on a link has multiple addresses for an encapsulation
type, one and only one address MAY be marked as primary (Primary Flag
== 1) for that Encapsulation Type.

An Encapsulation interface address in an Encapsulation PDU MAY be
marked as a loopback, in which case the Loopback bit is set.
Loopback addresses are generally not seen directly on an external
interface.  One or more loopback addresses MAY be exposed by
configuration on one or more L3ND speaking external interfaces, e.g.
for iBGP peering.  They SHOULD be marked as such, Loopback Flag == 1.
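
As an illustration of the flags octet, the following Python sketch
decodes and encodes the four defined bits.  It assumes that bit 0 in
the diagram is the most significant bit of the octet, which is the
usual convention for such diagrams but is not stated explicitly here.
The Under/Over bit is exposed as a raw boolean, reserved bits are
ignored on decode, and the names EncapFlags, decode_flags, and
encode_flags are hypothetical.

```python
from typing import NamedTuple

# Bit positions follow the diagram, counted from the most significant bit
# (bit 0 == 0x80). This bit ordering is an assumption of this sketch.
ANN_WITH = 0x80    # bit 0: Announce == 1, Withdraw == 0
PRIMARY = 0x40     # bit 1: Primary Flag == 1
UNDER_OVER = 0x20  # bit 2: Under/Over
LOOPBACK = 0x10    # bit 3: Loopback Flag == 1


class EncapFlags(NamedTuple):
    announce: bool    # True for an announcement, False for a withdrawal
    primary: bool
    under_over: bool  # raw value of the Under/Over bit
    loopback: bool


def decode_flags(octet: int) -> EncapFlags:
    """Decode the one octet Encapsulation Flags field; reserved bits are ignored."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("Encapsulation Flags is a single octet")
    return EncapFlags(
        announce=bool(octet & ANN_WITH),
        primary=bool(octet & PRIMARY),
        under_over=bool(octet & UNDER_OVER),
        loopback=bool(octet & LOOPBACK),
    )


def encode_flags(flags: EncapFlags) -> int:
    """Encode the flags into one octet, leaving the reserved bits as zero."""
    octet = 0
    if flags.announce:
        octet |= ANN_WITH
    if flags.primary:
        octet |= PRIMARY
    if flags.under_over:
        octet |= UNDER_OVER
    if flags.loopback:
        octet |= LOOPBACK
    return octet
```

Under these assumptions, an announced primary loopback address with
the Under/Over bit clear would carry the flags octet 0xD0.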
Each Encapsulation interface address in an Encapsulation PDU is
either that of the direct 'underlay' interface (Under/Over == 1), or
an 'overlay' address (Under/Over == 0), likely that of a VM or
container guest bridged or configured on to the interface already
having an underlay address.

10.3.  IPv4 Encapsulation

The IPv4 Encapsulation describes a device's ability to exchange IPv4
packets on one or more subnets.  It does so by stating the
interface's addresses and the corresponding prefix lengths.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Version = 0  | PDU Type = 4  |        Payload Length         ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~                               |             Count             ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~               |                 Serial Number                 ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~               | Encaps Flags  |         IPv4 Address          ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~                               |   PrefixLen   |   more ...    ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The three octet Count is the sum of the number of IPv4 Encapsulations
being announced and/or withdrawn.

10.4.  IPv6 Encapsulation

The IPv6 Encapsulation describes a link's ability to exchange IPv6
packets on one or more subnets.  It does so by stating the
interface's addresses and the corresponding prefix lengths.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Version = 0  | PDU Type = 5  |        Payload Length         ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~                               |             Count             ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~               |                 Serial Number                 ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~               | Encaps Flags  |                               ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
|                                                               ~
+                                                               +
~                                                               ~
+                                                               +
~                          IPv6 Prefix                          ~
+                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~                               |   PrefixLen   |   more ...    ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The three octet Count is the sum of the number of IPv6 Encapsulations
being announced and/or withdrawn.

10.5.  MPLS Label List

As an MPLS enabled interface may have a label stack, see [RFC3032], a
variable length list of labels is needed.  These are the labels the
sender will accept for the prefix to which the list is attached.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Label Count  |                 Label                 | Exp |S|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~                 Label                 | Exp |S|   more ...    ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

A one octet Label Count of zero is an implicit withdraw of all labels
for that prefix on that interface.

The bottom of the stack flag, S, MUST be set on one and only one
label.  Should this not be the case, the receiver of the erroneous
PDU MUST respond with an ACK PDU of EType 1 and Error Code 1, MPLS
Error.

10.6.  MPLS IPv4 Encapsulation

The MPLS IPv4 Encapsulation describes a logical link's ability to
exchange labeled IPv4 packets on one or more subnets.
It does so by stating the interface's addresses, the corresponding
prefix lengths, and the corresponding labels which will be accepted
for each address.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Version = 0  | PDU Type = 6  |        Payload Length         ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~                               |             Count             ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~               |                 Serial Number                 ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~               | Encaps Flags  |      MPLS Label List ...      ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         IPv4 Address                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   PrefixLen   |                   more ...                    ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The three octet Count is the sum of the number of MPLSv4
Encapsulations being announced and/or withdrawn.

10.7.  MPLS IPv6 Encapsulation

The MPLS IPv6 Encapsulation describes a logical link's ability to
exchange labeled IPv6 packets on one or more subnets.  It does so by
stating the interface's addresses, the corresponding prefix lengths,
and the corresponding labels which will be accepted for each address.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Version = 0  | PDU Type = 7  |        Payload Length         ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~                               |             Count             ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~               |                 Serial Number                 ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~               | Encaps Flags  |      MPLS Label List ...
                                                                |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               ~
+                                                               +
~                                                               ~
+                         IPv6 Address                          +
~                                                               ~
+                                                               +
~                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Prefix Len   |                   more ...                    ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The three octet Count is the sum of the number of MPLSv6
Encapsulations being announced and/or withdrawn.

11.  VENDOR - Vendor Extensions

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Version = 0  | PDU Type = 255|        Payload Length         ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~                               |         Serial Number         ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~                               |       Enterprise Number       ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~               |   Ent Type    |      Enterprise Data ...      ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Vendors or enterprises may define TLVs beyond the scope of L3ND
standards.  This is done using a Private Enterprise Number [IANA-PEN]
followed by Enterprise Data in a format defined for that three octet
Enterprise Number and one octet Ent Type.

Ent Type allows a Vendor PDU to be sub-typed in the event that the
vendor/enterprise needs multiple PDU types.

As with Encapsulation PDUs, a receiver of a Vendor PDU MUST respond
with an ACK PDU, possibly signalling an error.  Similarly, a Vendor
PDU MUST only be sent over an open session.

12.  Discussion

This section explores some trade-offs taken and some considerations.

12.1.  HELLO Discussion

A device may send IP packets over a Layer-3 interface which transmits
data over a single Layer-2 interface or multiple Layer-2 interfaces.
A Layer-3 interface which sends over multiple Layer-2 interfaces
could have many devices which might come up at various times;
therefore, on such an interface, the configured HELLO PDU retransmit
time SHOULD be set to a non-zero value, and periodic HELLOs should
continue.  On a single Layer-2 interface on a point-to-point (p2p)
connection, the configuration value MAY be set to zero, as HELLOs are
no longer desirable once a TLS/TCP session is up.

A device with multiple Layer-2 interfaces, traditionally called a
switch, may be used to forward packets from multiple devices to one
Layer-3 interface, I, on an L3ND speaking device.  Interface I could
discover a peer J across the switch.  Later, a prospective peer K
could come up across the switch.  If I were not still sending and
listening for HELLOs, the potential peering with K could not be
discovered.  Therefore, on multi-link interfaces, L3ND MUST continue
to send HELLOs as long as they are turned up.

13.  VLANs/SVIs/Sub-interfaces

One can think of the protocol as an instance (i.e. state machine)
which runs on each logical link of a device.

As the upper routing layer must view VLAN topologies as separate
graphs, L3ND treats VLAN ports as separate links.

As Sub-Interfaces each have their own layer-3 identities, they act as
separate interfaces, forming their own links.

14.  Implementation Considerations

An implementation SHOULD provide the ability to configure each
logical interface as L3ND speaking or not.

An implementation SHOULD provide the ability to distribute one or
more loopback addresses or interfaces into L3ND on an external L3ND
speaking interface.

An implementation SHOULD provide the ability to distribute one or
more overlay and/or underlay addresses or interfaces into L3ND on an
external L3ND speaking interface.

An implementation SHOULD provide the ability to configure one of the
addresses of an encapsulation as primary on an L3ND speaking
interface.  If there is only one address for a particular
encapsulation, the implementation MAY mark it as primary by default.

An implementation MAY allow optional configuration which updates the
local forwarding table with overlay and underlay data both learned
from L3ND peers and configured locally.

15.  Security Considerations

For TLS, versions greater than 1.1 MUST be used.

The protocol as is MUST NOT be used outside a datacenter or similarly
closed environment without using TLS encapsulation which is based on
a configured CA trust anchor.

Many datacenter operators have a strange belief that physical walls
and firewalls provide sufficient security.  This is not credible.
All DC protocols need to be examined for exposure and attack surface.
In the case of L3ND, authentication and integrity as provided by TLS
validated to a configured shared CA trust anchor are strongly
RECOMMENDED.

It is generally unwise to assume that on the wire Layer-3 is secure.
Strange/unauthorized devices may plug into a port.  Mis-wiring is
very common in datacenter installations.  A poisoned laptop might be
plugged into a device's port, form malicious sessions, etc. to
divert, intercept, or drop traffic.

Similarly, malicious nodes/devices could mis-announce addressing.

If OPEN PDUs are not sent over validated TLS, an attacker could forge
an OPEN for an existing session and cause the session to be reset.

16.  IANA Considerations

16.1.  Link Local Layer-3 Addresses

IANA is requested to assign one address (TBD1) for L3DL-L3-LL from
the IPv4 Multicast Address Space Registry from the Local Network
Control Block (224.0.0.0 - 224.0.0.255 (224.0.0/24)).

IANA is requested to assign one address (TBD2) for L3DL-L3-LL from
the IPv6 Multicast Address Space Registry in the IPv6 Link-Local
Scope Multicast address (TBD:2).

16.2.  Ports for TLS/TCP

This document requests the IANA to assign a well-known TCP Port
Number (TBD3) to the Layer-3 Neighbor Discovery Protocol for the
following, see Section 7:

   l3nd-server

16.3.  PDU Types

This document requests the IANA create a registry for L3ND PDU Type,
which may range from 0 to 255.  The name of the registry should be
L3ND-PDU-Type.  The policy for adding to the registry is RFC Required
per [RFC5226], either standards track or experimental.  The initial
entries should be the following:

   PDU
   Code    PDU Name
   -----   ----------------------
   0       HELLO
   1       Reserved
   2       OPEN
   3       ACK
   4       IPv4 Announcement
   5       IPv6 Announcement
   6       MPLS IPv4 Announcement
   7       MPLS IPv6 Announcement
   8-254   Reserved
   255     Vendor

16.4.  Flag Bits

This document requests the IANA create a registry for L3ND PL Flag
Bits, which may range from 0 to 7.  The name of the registry should
be L3ND-PL-Flag-Bits.  The policy for adding to the registry is RFC
Required per [RFC5226], either standards track or experimental.  The
initial entries should be the following:

   Bit     Bit Name
   -----   -----------------------------
   0       Announce/Withdraw (ann == 0)
   1       Primary
   2       Underlay/Overlay (under == 0)
   3       Loopback
   4-7     Reserved

16.5.  Error Codes

This document requests the IANA create a registry for L3ND Error
Codes, a 16 bit integer.  The name of the registry should be
L3ND-Error-Codes.  The policy for adding to the registry is RFC
Required per [RFC5226], either standards track or experimental.
The initial entries should be the following:

   Error
   Code    Error Name
   -----   --------------------------------
   0       No Error
   1       MPLS Error
   2       Logical Link Addressing Conflict
   3       Encapsulation Error
   4       Announce/Withdraw Error
   5       Session May Not Be Continued

17.  Acknowledgments

Many kind people helped with the Layer-2 cousin of this protocol,
L3DL.  Cristel Pelsser provided multiple reviews, Harsha Kovuru
commented during implementation, Jeff Haas reviewed and commented,
Joerg Ott did an early but deep transport review, Joe Clarke provided
a useful ops review, John Scudder a deeply serious review and
comments, Martijn Schmidt contributed, and Neeraj Malhotra reviewed.

18.  References

18.1.  Normative References

[I-D.ietf-lsvr-l3dl]
           Bush, R., Austein, R., and K. Patel, "Layer-3 Discovery
           and Liveness", Work in Progress, Internet-Draft,
           draft-ietf-lsvr-l3dl-08, 14 October 2021.

[IANA-PEN] "IANA Private Enterprise Numbers".

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
           Requirement Levels", BCP 14, RFC 2119,
           DOI 10.17487/RFC2119, March 1997.

[RFC3032]  Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y.,
           Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack
           Encoding", RFC 3032, DOI 10.17487/RFC3032, January 2001.

[RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
           Border Gateway Protocol 4 (BGP-4)", RFC 4271,
           DOI 10.17487/RFC4271, January 2006.

[RFC5082]  Gill, V., Heasley, J., Meyer, D., Savola, P., Ed., and C.
           Pignataro, "The Generalized TTL Security Mechanism
           (GTSM)", RFC 5082, DOI 10.17487/RFC5082, October 2007.

[RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
           IANA Considerations Section in RFCs", RFC 5226,
           DOI 10.17487/RFC5226, May 2008.

[RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
           2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
           May 2017.

18.2.  Informative References

[Clos]     "Clos Network".

[I-D.ymbk-idr-l3nd-ulpc]
           Bush, R. and K. Patel, "L3ND Upper-Layer Protocol
           Configuration", Work in Progress, Internet-Draft,
           draft-ymbk-idr-l3nd-ulpc-04, 21 March 2022.

[RFC1122]  Braden, R., Ed., "Requirements for Internet Hosts -
           Communication Layers", STD 3, RFC 1122,
           DOI 10.17487/RFC1122, October 1989.

[RFC1982]  Elz, R. and R. Bush, "Serial Number Arithmetic", RFC 1982,
           DOI 10.17487/RFC1982, August 1996.

[RFC4760]  Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
           "Multiprotocol Extensions for BGP-4", RFC 4760,
           DOI 10.17487/RFC4760, January 2007.

Authors' Addresses

   Randy Bush
   Arrcus & Internet Initiative Japan
   5147 Crystal Springs
   Bainbridge Island, WA 98110
   United States of America
   Email: randy@psg.com

   Russ Housley
   Vigil Security, LLC
   516 Dranesville Road
   Herndon, VA 20170
   United States of America
   Email: housley@vigilsec.com

   Rob Austein
   Arrcus, Inc
   Email: sra@hactrn.net

   Susan Hares
   Hickory Hill Consulting
   7453 Hickory Hill
   Saline, MI 48176
   United States of America
   Phone: +1-734-604-0332
   Email: shares@ndzh.com

   Keyur Patel
   Arrcus
   2077 Gateway Place, Suite #400
   San Jose, CA 95119
   United States of America
   Email: keyur@arrcus.com