Network Working Group                                            S. Kini
Internet-Draft
Intended status: Informational                               K. Kompella
Expires: November 4, 2017                                        Juniper
                                                            S. Sivabalan
                                                                   Cisco
                                                            S. Litkowski
                                                                  Orange
                                                               R. Shakir
                                                                  Google
                                                             J. Tantsura
                                                             May 3, 2017


                    Entropy label for SPRING tunnels
                 draft-ietf-mpls-spring-entropy-label-06

Abstract

   Source routed tunnels with label stacking is a technique that can be
   leveraged to steer a packet through a controlled set of segments.
   This can be applied to the Multi Protocol Label Switching (MPLS) data
   plane.  Entropy label (EL) is a technique used in MPLS to improve
   load-balancing.  This document examines and describes how ELs are to
   be applied to source routed tunnels with label stacks.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 4, 2017.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include
   Simplified BSD License text as described in Section 4.e of the Trust
   Legal Provisions and are provided without warranty as described in
   the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Abbreviations and Terminology
   3.  Use-case requiring multipath load-balancing
   4.  Entropy Readable Label Depth
   5.  Maximum SID Depth
   6.  LSP stitching using the binding SID
   7.  Insertion of entropy labels for SPRING path
     7.1.  Overview
       7.1.1.  Example 1
       7.1.2.  Example 2
     7.2.  Considerations for the placement of entropy labels
       7.2.1.  ERLD value
       7.2.2.  Segment type
         7.2.2.1.  Node-SID
         7.2.2.2.  Adjacency-SID representing an ECMP bundle
         7.2.2.3.  Adjacency-SID representing a single IP link
         7.2.2.4.  Adjacency-SID representing a single link within
                   an L2 bundle
         7.2.2.5.  Adjacency-SID representing an L2 bundle
       7.2.3.  Maximizing number of LSRs that will load-balance
       7.2.4.  Preference for a part of the path
       7.2.5.  Combining criteria
   8.  A simple algorithm example
   9.  Deployment Considerations
   10. Options considered
     10.1.  Single EL at the bottom of the stack of tunnels
     10.2.  An EL per tunnel in the stack
     10.3.  A re-usable EL for a stack of tunnels
     10.4.  EL at top of stack
     10.5.  ELs at readable label stack depths
   11. Acknowledgements
   12. Contributors
   13. IANA Considerations
   14. Security Considerations
   15. References
     15.1.  Normative References
     15.2.  Informative References
   Authors' Addresses

1.  Introduction

   The source routed tunnels with label stacking paradigm is leveraged
   by techniques such as Segment Routing (SR)
   [I-D.ietf-spring-segment-routing] to steer a packet through a set of
   segments.  This can be directly applied to the MPLS data plane, but
   it has implications on the label stack depth.

   Clarifying statements on label stack depth have been provided in
   [RFC7325] but the RFC does not address the case of source routed
   stacked MPLS tunnels as described in
   [I-D.ietf-spring-segment-routing] where deeper label stacks are more
   prevalent.

   Entropy label (EL) [RFC6790] is a technique used in the MPLS data
   plane to provide entropy for load-balancing.  When using LSP
   hierarchies, there are implications on how [RFC6790] should be
   applied.  The current document addresses the case where the hierarchy
   is created at a single LSR as required by source routed tunnels with
   label stacks.

   A use-case requiring load-balancing with source routed tunnels with
   label stacks is given in Section 3.
   A recommended solution is described in Section 7, taking into
   account the limitations of implementations when applying [RFC6790]
   to deeper label stacks.  Options that were considered to arrive at
   the recommended solution are documented for historical purposes in
   Section 10.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   Although this document is not a protocol specification, the use of
   this language clarifies the instructions to protocol designers
   producing solutions that satisfy the requirements set out in this
   document.

2.  Abbreviations and Terminology

   EL - Entropy Label

   ELI - Entropy Label Identifier

   ELC - Entropy Label Capability

   ERLD - Entropy Readable Label Depth

   SR - Segment Routing

   ECMP - Equal Cost Multi Path

   LSR - Label Switch Router

   MPLS - Multiprotocol Label Switching

   MSD - Maximum SID Depth

   SID - Segment Identifier

   RLD - Readable Label Depth

   OAM - Operations, Administration, and Maintenance

3.  Use-case requiring multipath load-balancing

                         +------+
                         |      |
                 +-------|  P3  |-----+
                 | +-----|      |---+ |
               L3| |L4   +------+ L1| |L2     +----+
                 | |                | |    +--| P4 |--+
   +-----+     +-----+            +-----+  |  +----+  |  +-----+
   |  S  |-----| P1  |------------| P2  |--+          +--|  D  |
   |     |     |     |            |     |--+          +--|     |
   +-----+     +-----+            +-----+  |  +----+  |  +-----+
                                           +--| P5 |--+
                                              +----+

   S=Source LSR, D=Destination LSR, P1,P2,P3,P4,P5=Transit LSRs,
   L1,L2,L3,L4=Links

                  Figure 1: Traffic engineering use-case

   Traffic-engineering (TE) is one of the applications of MPLS and is
   also a requirement for source routed tunnels with label stacks
   [RFC7855].  Consider the topology shown in Figure 1.
   The LSR S requires data to be sent to LSR D along a
   traffic-engineered path that goes over the link L1.  Good
   load-balancing is also required across equal cost paths (including
   parallel links).  To engineer traffic along a path that takes link
   L1, the label stack that LSR S creates consists of a label to the
   node SID of LSR P3, stacked over the label for the adjacency SID of
   link L1 and that in turn is stacked over the label to the node SID of
   LSR D.  For simplicity, let's assume that all LSRs use the same label
   space (SRGB) for source routed label stacks.  Let L_N-Px denote the
   label to be used to reach the node SID of LSR Px.  Let L_A-Ln denote
   the label used for the adjacency SID for link Ln.  The LSR S must use
   the label stack <L_N-P3, L_A-L1, L_N-D> for traffic-engineering.
   However, to achieve good load-balancing over the equal cost paths
   P2-P4-D, P2-P5-D and the parallel links L3, L4, a mechanism such as
   Entropy labels [RFC6790] should be adapted for source routed label
   stacks.  Indeed, the SPRING architecture with the MPLS dataplane uses
   nested MPLS LSPs composing the source routed label stacks.  As each
   MPLS node may have limitations in the number of labels it can push
   when it is ingress or inspect when doing load-balancing, an entropy
   label insertion strategy becomes important to keep the benefit of the
   load-balancing.  Multiple ways to apply entropy labels were
   considered and are documented in Section 10 along with their
   trade-offs.  A recommended solution is described in Section 7.

4.  Entropy Readable Label Depth

   The Entropy Readable Label Depth (ERLD) is defined as the number of
   labels a router can both:

   a.  Read in an MPLS packet received on its incoming interface(s)
       (starting from the top of the stack).

   b.  Use in its load-balancing function.

   An ERLD of N means that the router will use the EL for
   load-balancing if the EL is placed within the first N labels of the
   stack.
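
   The ERLD definition above can be read as a simple predicate on a
   received label stack.  The following is a minimal sketch, not part of
   any implementation: the function names el_position and can_use_el are
   hypothetical, the stack is assumed to be a top-first list of label
   values, and the EL value 100001 is an arbitrary placeholder.  The
   only value taken from a specification is the ELI, which [RFC6790]
   reserves as label 7.

```python
from typing import Optional, Sequence

ELI = 7  # Entropy Label Indicator, reserved label value per RFC 6790


def el_position(stack: Sequence[int]) -> Optional[int]:
    """Return the 1-based position (1 = top) of the first EL, or None."""
    for i, label in enumerate(stack):
        if label == ELI and i + 1 < len(stack):
            return i + 2  # the EL is the label right below the ELI
    return None


def can_use_el(stack: Sequence[int], erld: int) -> bool:
    """True if the first EL lies within the first `erld` labels."""
    pos = el_position(stack)
    return pos is not None and pos <= erld


# One transport label above the ELI/EL pair: the EL sits at position 3.
shallow = [16, ELI, 100001]
# Two transport labels above the pair: the EL sits at position 4.
deeper = [16, 20, ELI, 100001]

print(can_use_el(shallow, erld=3), can_use_el(deeper, erld=3))
```

   Under these assumptions a router with an ERLD of 3 can use the EL of
   the first stack but not of the second, and a router with an ERLD of 0
   never uses an EL.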

   A router capable of reading N labels but not using an EL located
   within those N labels MUST consider its ERLD to be 0.  In a
   distributed switching architecture, each linecard may have a
   different capability in terms of ERLD.  For simplicity, an
   implementation MAY use the minimum ERLD among all linecards as the
   ERLD value for the system.

   Examples:

                                                       | Payload  |
                                                       +----------+
                                          | Payload  | |    EL    | P7
                                          +----------+ +----------+
                             | Payload  | |    EL    | |   ELI    |
                             +----------+ +----------+ +----------+
                | Payload  | |    EL    | |   ELI    | | Label 50 |
                +----------+ +----------+ +----------+ +----------+
   | Payload  | |    EL    | |   ELI    | | Label 40 | | Label 40 |
   +----------+ +----------+ +----------+ +----------+ +----------+
   |    EL    | |   ELI    | | Label 30 | | Label 30 | | Label 30 |
   +----------+ +----------+ +----------+ +----------+ +----------+
   |   ELI    | | Label 20 | | Label 20 | | Label 20 | | Label 20 |
   +----------+ +----------+ +----------+ +----------+ +----------+
   | Label 16 | | Label 16 | | Label 16 | | Label 16 | | Label 16 | P1
   +----------+ +----------+ +----------+ +----------+ +----------+
     Packet 1     Packet 2     Packet 3     Packet 4     Packet 5

                    Figure 2: Label stacks with ELI/EL

   In Figure 2, we consider the displayed packets as received on a
   router interface.  We also consider a single ERLD value for the
   router.

   o  If the router has an ERLD of 3, it will be able to load-balance
      Packet 1 displayed in Figure 2 using the EL as part of the load-
      balancing keys.  The ERLD value of 3 means that the router can
      read and take into account the entropy label for load-balancing if
      it is placed between position 1 (top) and position 3.

   o  If the router has an ERLD of 5, it will be able to load-balance
      Packets 1 to 3 in Figure 2 using the EL as part of the load-
      balancing keys.  Packets 4 and 5 have the EL placed at a position
      greater than 5, so the router is not able to read it and use it as
      part of the load-balancing keys.

   o  If the router has an ERLD of 10, it will be able to load-balance
      all the packets displayed in Figure 2 using the EL as part of the
      load-balancing keys.

   To allow efficient load-balancing based on entropy labels, a router
   running SPRING SHOULD advertise its ERLD (or ERLDs), so that all the
   other SPRING routers in the network are aware of its capability.
   How this advertisement is done is outside the scope of this document.

   To advertise an ERLD value, a SPRING router:

   o  MUST be entropy label capable and, as a consequence, MUST apply
      all the procedures defined in [RFC6790].

   o  MUST be able to read an ELI/EL which is located within its ERLD
      value.

   o  MUST take into account this EL in its load-balancing function.

5.  Maximum SID Depth

   The Maximum SID Depth (MSD) defines the maximum number of labels that
   a particular node can impose on a packet.  This includes any kind of
   labels (service, entropy, transport...).  In an MPLS network, the MSD
   is a limit of the Ingress LSR (I-LSR) or any stitching node that
   would perform an imposition of additional labels on an existing label
   stack.

   Depending on the number of MPLS operations (POP, SWAP...) to be
   performed before the PUSH, the MSD may vary due to the hardware or
   software limitations.  As for the ERLD, there may also be different
   MSD limits based on the linecard type used in a distributed switching
   system.

   When an external controller is used to program a label stack on a
   particular node, this node MAY advertise its MSD value or a subset of
   its MSD value to the controller.  How this advertisement is done is
   outside the scope of this document.
   As the controller does not have knowledge of the entire label stack
   to be pushed by the node, the node may advertise an MSD value which
   is lower than its actual limit.  This allows the controller to
   program a label stack up to the advertised MSD value while leaving
   room for the local node to add more labels (e.g., service, entropy,
   transport...) without reaching the hardware/software limit.

              P7 ---- P8 ---- P9
             /                  \
   PE1 --- P1 --- P2 --- P3 --- P4 --- P5 --- P6 --- PE2
                                       |  \            |
     ---->                             P10  \          |
    IP Pkt                             |      \        |
                                       P11 --- P12 --- P13
                                           100    10000

                                 Figure 3

   In Figure 3, an IP packet comes into the MPLS network at PE1.  All
   metrics are considered equal to 1 except P12-P13 which is 10000 and
   P11-P12 which is 100.  PE1 wants to steer the traffic using a SPRING
   path to PE2 along
   PE1->P1->P7->P8->P9->P4->P5->P10->P11->P12->P13->PE2.  By using
   Adjacency SIDs only, PE1 will be required to push (as an I-LSR) 10
   labels on the IP packet received and so requires an MSD of 10.  If
   the IP packet should be carried over an MPLS service like a regular
   layer 3 VPN, an additional service label will be imposed, requiring
   an MSD of 11 for PE1.  In addition, if PE1 wants to insert an ELI/EL
   for load-balancing purposes, PE1 will need to push 13 labels on the
   IP packet, requiring an MSD of 13.

   In the SPRING architecture, Node SIDs or Binding SIDs can be used to
   reduce the label stack size.  As an example, to steer the traffic on
   the same path as before, PE1 may be able to use the following label
   stack: <Node_P9, Node_P5, Binding_P5, Node_PE2>.  In this example we
   consider a combination of Node SIDs and a Binding SID advertised by
   P5 that will stitch the traffic along the path P10->P11->P12->P13.
   The instruction associated with the binding SID at P5 is thus to swap
   Binding_P5 to Adj_P12-P13 and then push <Adj_P11-P12, Node_P11>.
   As P5 acts as a stitching node that pushes additional labels on an
   existing label stack, P5's MSD also needs to be taken into account
   and may limit the number of labels that could be imposed.

6.  LSP stitching using the binding SID

   The binding SID allows binding a segment identifier to an existing
   LSP.  As examples, the binding SID can represent an RSVP-TE tunnel,
   an LDP path (through the mapping server advertisement), a SPRING
   path...  Each LSP associated with a binding SID has its own entropy
   label capability.

   In Figure 3, consider that:

   o  P6, PE2, P10, P11, P12 are pure LDP routers.

   o  PE1, P1, P2, P3, P4, P7, P8, P9 are pure SPRING routers.

   o  P5 is running SPRING and LDP.

   o  P5 acts as a mapping server (MS) and advertises Prefix SIDs for
      the LDP FECs: an index value of 20 is used for PE2.

   o  All SPRING routers use an SRGB of [1000, 1999].

   o  P6 advertises label 20 for the PE2 FEC.

   o  Traffic from PE1 to PE2 uses the shortest path.

   PE1 ----- P1 -- P2 -- P3 -- P4 ---- P5 --- P6 --- PE2

   -->    +----+                 +----+  +----+ +----+
   IP Pkt | IP |                 | IP |  | IP | | IP |
          +----+                 +----+  +----+ +----+
          |1020|                 |1020|  | 20 |
          +----+                 +----+  +----+
                   SPRING                     LDP

   In terms of packet forwarding, by learning the MS advertisement from
   P5, PE1 imposes a label 1020 on an IP packet destined to PE2.
   SPRING routers along the shortest path to PE2 will switch the traffic
   until it reaches P5, which will perform the LSP stitching.  P5 will
   swap the SPRING label 1020 to the LDP label 20 advertised by the
   next hop P6.  P6 will then forward the packet using the LDP label
   towards PE2.

   PE1 cannot push an ELI/EL for the binding SID without knowing that
   the tail-end of the LSP associated with the binding (PE2) is entropy
   label capable.
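
   The label arithmetic and the swap performed at P5 in the example
   above can be sketched as follows.  This is an illustration under
   stated assumptions, not a description of any implementation: the
   helper names prefix_sid_label, ingress_stack and stitch are
   hypothetical, stacks are top-first lists, and the entropy value is an
   arbitrary placeholder.  The SRGB, the index 20 and the LDP label 20
   come from the example; the ELI value 7 is the one reserved by
   [RFC6790].

```python
from typing import List, Tuple

SRGB: Tuple[int, int] = (1000, 1999)  # SRGB of all SPRING routers in the example
ELI = 7                               # Entropy Label Indicator (RFC 6790)


def prefix_sid_label(index: int, srgb: Tuple[int, int] = SRGB) -> int:
    """Map a Prefix-SID index to a label inside the SRGB."""
    base, end = srgb
    label = base + index
    if not base <= label <= end:
        raise ValueError(f"index {index} falls outside SRGB {srgb}")
    return label


def ingress_stack(index: int, binding_elc: bool, entropy: int) -> List[int]:
    """Top-first stack pushed by the ingress (PE1 in the example)."""
    stack = [prefix_sid_label(index)]
    if binding_elc:  # ELI/EL is pushed only when the ELC is known to be set
        stack += [ELI, entropy]
    return stack


def stitch(stack: List[int], spring_label: int, ldp_label: int) -> List[int]:
    """Swap the top SPRING label for the LDP label (P5 in the example)."""
    if not stack or stack[0] != spring_label:
        raise ValueError("top label does not match the stitched SPRING label")
    return [ldp_label] + stack[1:]


at_pe1 = ingress_stack(index=20, binding_elc=False, entropy=0)
after_p5 = stitch(at_pe1, spring_label=1020, ldp_label=20)
print(at_pe1, after_p5)  # [1020] [20]
```

   With binding_elc set to True the same helpers would carry the ELI/EL
   pair unchanged below the swapped label, which is why the ingress
   needs to know the entropy label capability of the tail-end before
   pushing the pair.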

   To accommodate the mix of signalling protocols involved during the
   stitching, the entropy label capability SHOULD be propagated between
   the signalling protocols.  Each binding SID SHOULD have its own
   entropy label capability that MUST be inherited from the entropy
   label capability of the associated LSP.  If the router advertising
   the binding SID does not know the ELC state of the target FEC, it
   MUST NOT set the ELC for the binding SID.  An ingress node MUST NOT
   push an ELI/EL associated with a binding SID unless this binding SID
   has the entropy label capability.  How the entropy label capability
   is advertised for a binding SID is outside the scope of this
   document.

   In our example, if PE2 is LDP entropy label capable, it will add the
   entropy label capability in its LDP advertisement.  When P5 receives
   the FEC/label binding for PE2, it learns about the ELC and can set
   the ELC in the mapping server advertisement.  Thus PE1 learns about
   the ELC of PE2 and may push an ELI/EL associated with the binding
   SID.

   The proposed solution only works if the SPRING router advertising the
   binding SID is also performing the dataplane LSP stitching.  In our
   example, if the mapping server function is hosted on P8 instead of
   P5, P8 does not know about the ELC state of PE2's LDP FEC.  As a
   consequence, it does not set the ELC for the associated binding SID.

7.  Insertion of entropy labels for SPRING path

7.1.  Overview

   The solution described in this section follows [RFC6790].  Within a
   SPRING path, a node may be ingress, egress, transit (regarding the
   entropy label processing described in [RFC6790]), or it can be any
   combination of those.  For example:

   o  The ingress node of a SPRING domain may be an ingress node from an
      entropy label perspective.

   o  Any LSR terminating a segment of the SPRING path is an egress node
      (because it terminates the segment) but may also be a transit node
      if the SPRING path is not terminated because there is a subsequent
      SPRING MPLS label in the stack.

   o  Any LSR processing a binding SID may be a transit node and an
      ingress node (because it may push additional labels when
      processing the binding SID).

   As described earlier, an LSR may have a limitation, ERLD, on the
   depth of the label stack that it can read and process in order to do
   multipath load-balancing based on entropy labels.

   If an EL does not occur within the ERLD of an LSR in the label stack
   of an MPLS packet that it receives, then it would lead to poor load-
   balancing at that LSR.  Hence an ELI/EL pair MUST be within the ERLD
   of the LSR in order for the LSR to use the EL during load-balancing.

   Adding a single ELI/EL pair for the entire SPRING path may also lead
   to poor load-balancing because the ELI/EL may not occur within the
   ERLD of some LSR on the path (if too deep) or may not be present in
   the stack when it reaches some LSRs (if too shallow).

   In order for the EL to occur within the ERLD of LSRs along the path
   corresponding to a SPRING label stack, multiple <ELI, EL> pairs MAY
   be inserted in this label stack.

   The insertion of the ELI/EL SHOULD occur only with a SPRING label
   advertised by an LSR that advertised an ERLD (the LSR is entropy
   label capable) or with a SPRING label associated with a binding SID
   that has the ELC set.

   The ELs among multiple <ELI, EL> pairs inserted in the stack MAY be
   the same or different.  The LSR that inserts <ELI, EL> pairs MAY have
   limitations on the number of such pairs that it can insert and also
   the depth at which it can insert them.  If, due to limitations, the
   inserted ELs are at positions such that an LSR along the path
   receives an MPLS packet without an EL in the label stack within that
   LSR's ERLD, then the load-balancing performed by that LSR would be
   poor.  An implementation MAY consider multiple criteria when
   inserting <ELI, EL> pairs.

7.1.1.  Example 1

                    ECMP           LAG           LAG
   PE1 --- P1 --- P2 --- P3 --- P4 --- P5 --- P6 --- PE2

                                 Figure 4

   In Figure 4, PE1 wants to forward some MPLS VPN traffic over an
   explicit path to PE2 resulting in the following label stack to be
   pushed onto the received IP header: {VPN_label, Adj_P6PE2, Adj_P5P6,
   Adj_P4P5, Adj_P3P4, Adj_Bundle_P2P3, Adj_P1P2}.  PE1 is limited to
   pushing a maximum of 11 labels (MSD=11).  P2, P3 and P6 have an ERLD
   of 3 while others have an ERLD of 10.

   PE1 can only add two ELI/EL pairs in the label stack due to its MSD
   limitation.  It should insert them strategically to benefit load-
   balancing along the longest part of the path.

   PE1 may take into account multiple parameters when inserting ELs, as
   examples:

   o  The ERLD value advertised by transit nodes.

   o  The requirement of load-balancing for a particular label value.

   o  Any service provider preference: favor beginning of the path or
      end of the path.

   In Figure 4, a good strategy may be to use the following stack
   {VPN_label, ELI2, EL2, Adj_P6PE2, Adj_P5P6, Adj_P4P5, Adj_P3P4, ELI1,
   EL1, Adj_Bundle_P2P3, Adj_P1P2}.  The original stack requests P2 to
   forward based on a bundle Adjacency segment that will require load-
   balancing.  Therefore it is important to ensure that P2 can load-
   balance correctly.  As P2 has a limited ERLD of 3, ELI/EL must be
   inserted just next to the label that P2 will use to forward.  On the
   path to PE2, P3 also has a limited ERLD, but P3 will forward based on
   a basic adjacency segment that may require no load-balancing.
   Therefore it does not seem important to ensure that P3 can do load-
   balancing despite its limited ERLD.  The next nodes along the
   forwarding path have a high ERLD that does not cause any issue,
   except P6.  Moreover, P6 is using some LAGs to PE2 and so is expected
   to load-balance.  It becomes important to insert a new ELI/EL just
   next to P6's forwarding label.

   In the case above, the ingress node had enough label push capacity to
   ensure end-to-end load-balancing taking into account the path
   attributes.  There might be some cases where the ingress node does
   not have the necessary label imposition capacity.

7.1.2.  Example 2

                    ECMP           LAG           ECMP          ECMP
   PE1 --- P1 --- P2 --- P3 --- P4 --- P5 --- P6 --- P7 --- P8 --- PE2

                                 Figure 5

   In Figure 5, PE1 wants to forward MPLS VPN traffic over an explicit
   path to PE2 resulting in the following label stack to be pushed onto
   the IP header: {VPN_label, Adj_Bundle_P8PE2, Adj_P7P8,
   Adj_Bundle_P6P7, Adj_P5P6, Adj_P4P5, Adj_P3P4, Adj_Bundle_P2P3,
   Adj_P1P2}.  PE1 is limited to pushing a maximum of 11 labels.  P2, P3
   and P6 have an ERLD of 3 while others have an ERLD of 15.

   Using a similar strategy as the previous case may lead to a dilemma,
   as PE1 can only push a single ELI/EL while we may need a minimum of
   three to load-balance the end-to-end path.  An optimized stack that
   would enable end-to-end load-balancing may be: {VPN_label, ELI3, EL3,
   Adj_Bundle_P8PE2, Adj_P7P8, ELI2, EL2, Adj_Bundle_P6P7, Adj_P5P6,
   Adj_P4P5, Adj_P3P4, ELI1, EL1, Adj_Bundle_P2P3, Adj_P1P2}.

   A decision needs to be taken to favor some part of the path for load-
   balancing considering that load-balancing may not work on the other
   part.  A service provider may decide to place the ELI/EL after the P6
   forwarding label as it will allow P4 and P6 to load-balance.  Placing
   the ELI/EL at the bottom of the stack is also a possibility, enabling
   load-balancing for P4 and P8.
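
   The trade-off described in Example 2 can be checked with a small
   model.  The sketch below is illustrative only and relies on
   assumptions that are not stated in this document: each adjacency
   label is represented by the name of the node that forwards on it,
   the stack is written top-first, a node sees only the labels from its
   own label downwards, and the set NEEDS_LB of nodes that must
   load-balance (P2, P4, P6, P8) is inferred from the bundle labels in
   the stack and from the text's mention of P4.  The function names are
   hypothetical.

```python
from typing import Dict, Iterable, List

NODES = ["P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8"]
ERLD: Dict[str, int] = {n: 15 for n in NODES}
ERLD.update({"P2": 3, "P3": 3, "P6": 3})  # limited nodes in Example 2
NEEDS_LB = {"P2", "P4", "P6", "P8"}       # assumption, see the lead-in
MSD = 11                                  # PE1 push limit in Example 2


def build_stack(pairs_after: Iterable[str]) -> List[str]:
    """Top-first stack; an ELI/EL pair follows each listed node's label."""
    after = set(pairs_after)
    stack: List[str] = []
    for node in NODES:
        stack.append(node)  # adjacency label that `node` forwards on
        if node in after:
            stack += ["ELI", "EL"]
    return stack + ["VPN"]  # service label at the bottom


def load_balancing_nodes(stack: List[str]) -> List[str]:
    """Nodes in NEEDS_LB that find an EL within their ERLD on arrival."""
    result = []
    for i, item in enumerate(stack):
        if item not in NEEDS_LB:
            continue
        seen = stack[i:]  # labels left when the packet reaches the node
        if "EL" in seen and seen.index("EL") + 1 <= ERLD[item]:
            result.append(item)
    return result


for placement in (["P6"], ["P8"], ["P2", "P6", "P8"]):
    stack = build_stack(placement)
    print(placement, len(stack) <= MSD, load_balancing_nodes(stack))
```

   In this model a single pair after the P6 label or at the bottom of
   the stack fits within the limit of 11 labels and serves P4 and P6, or
   P4 and P8, respectively.  Three pairs serve all four nodes but need
   15 labels, which exceeds what PE1 can push.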

7.2.  Considerations for the placement of entropy labels

   The sample cases described in the previous section showed that
   placing the ELI/EL when the maximum number of labels to be pushed is
   limited is not an easy decision and multiple criteria may be taken
   into account.

   This section describes some considerations that could be taken into
   account when placing ELI/ELs.  This list of criteria is not
   exhaustive and an implementation MAY take into account additional
   criteria or tie-breakers that are not documented here.

   An implementation SHOULD try to maximize the load-balancing where
   multiple ECMP paths are available and minimize the number of ELI/ELs
   that need to be inserted.  In case of a trade-off, an implementation
   MAY provide flexibility to the operator to select the criteria to be
   considered when placing ELI/ELs or the sub-objective for which to
   optimize.

   PE1 -- P1 -- P2 -- P3 -- P4 -- P5 -- ... -- P8 -- P9 -- PE2
                      |            |
                     P3'--- P4'--- P5'

                                 Figure 6

   Figure 6 will be used as a reference in the following subsections.

7.2.1.  ERLD value

   As mentioned in Section 7.1, the ERLD value is an important parameter
   to consider when inserting an ELI/EL: if an ELI/EL does not fall
   within the ERLD of a node on the path, the node will not be able to
   load-balance the traffic efficiently.

   The ERLD value can be advertised via protocols and those extensions
   are described in separate documents [I-D.ietf-isis-mpls-elc] and
   [I-D.ietf-ospf-mpls-elc].

   Let's consider a path from PE1 to PE2 using the following stack
   pushed by PE1: {Service_label, Adj_PE2P9, Node_P9, Adj_P1P2}.

   Using the ERLD as an input parameter may help to minimize the number
   of required ELI/EL pairs to be inserted.  An ERLD value must be
   retrieved for each SPRING label in the label stack.

   For a label bound to an adjacency segment, the ERLD is the ERLD of
   the node that advertised the adjacency segment.  In the example
   above, the ERLD associated with Adj_P1P2 would be the ERLD of router
   P1 as P1 will perform the forwarding based on the Adj_P1P2 label.

   For a label bound to a node segment, multiple strategies MAY be
   implemented.  An implementation may try to evaluate the minimum ERLD
   value along the node segment path.  If an implementation cannot find
   the minimum ERLD along the path of the segment, it can use the ERLD
   of the starting node instead.  In the example above, if the
   implementation supports computation of the minimum ERLD along the
   path, the ERLD associated with label Node_P9 would be the minimum
   ERLD among nodes {P2, P3, P4, ..., P8}.  If an implementation does
   not support the computation of the minimum ERLD, it should consider
   the ERLD of P2 (the starting node that will forward based on the
   Node_P9 label).

   For a label bound to a binding segment, if the binding segment
   describes a path, an implementation may also try to evaluate the
   minimum ERLD along this path.  If the implementation cannot find the
   minimum ERLD along the path of the segment, it can use the ERLD of
   the starting node instead.

7.2.2.  Segment type

   Depending on the type of segment a particular label is bound to, an
   implementation may deduce that this particular label will be subject
   to load-balancing on the path.

7.2.2.1.  Node-SID

   An MPLS label bound to a Node-SID represents a path that may cross
   multiple hops.  Load-balancing may be needed on the node starting
   this path but also on any node along the path.

   Let's consider a path from PE1 to PE2 using the following stack
   pushed by PE1: {Service_label, Adj_PE2P9, Node_P9, Adj_P1P2}.

   If, for example, PE1 is limited to pushing 6 labels, it can add a
   single ELI/EL within the label stack.  An operator may want to favor
   a placement that would allow load-balancing along the Node-SID path.
   In Figure 6, P3, which is along the Node-SID path, requires
   load-balancing on two equal-cost paths.

   An implementation may try to evaluate if load-balancing is really
   required within a node segment path.  This could be done by running
   an additional SPT computation and analysis of the node segment path
   to prevent a node segment that does not really require load-balancing
   from being preferred when placing ELI/ELs.  Such inspection may be
   time consuming for implementations and comes without a 100%
   guarantee, as a node segment path may use a LAG that could be
   invisible from the IP topology.  A simpler approach would be to
   consider that a label bound to a Node-SID will be subject to
   load-balancing and requires an ELI/EL.

7.2.2.2.  Adjacency-SID representing an ECMP bundle

   When an adjacency segment representing an ECMP bundle is used within
   a label stack, an implementation can deduce that load-balancing is
   expected at the node that advertised this adjacency segment.  An
   implementation could then favor this particular label value when
   placing ELI/ELs.

7.2.2.3.  Adjacency-SID representing a single IP link

   When an adjacency segment representing a single IP link is used
   within a label stack, an implementation can deduce that load-
   balancing may not be expected at the node that advertised this
   adjacency segment.

   The implementation could then decide to place ELI/ELs to favor other
   LSRs than the one advertising this adjacency segment.

   Readers should note that an adjacency segment representing a single
   IP link may require load-balancing.  This is the case when a LAG (L2
   bundle) is implemented between two IP nodes and the L2 bundle SR
   extensions [I-D.ietf-isis-l2bundles] are not implemented.
In such a 674 case, it may be useful to keep the possibility to insert an EL/ 675 ELI in a readable position for the LSR advertising the label 676 associated with the adjacency segment. 678 7.2.2.4. Adjacency-SID representing a single link within an L2 bundle 680 When L2 bundle SR extensions [I-D.ietf-isis-l2bundles] are used, 681 adjacency segments may be advertised for each member of the bundle. 682 In this case, an implementation can deduce that load-balancing is not 683 expected on the LSR advertising this segment and could then decide to 684 place ELI/ELs to favor LSRs other than the one advertising this 685 adjacency segment. 687 7.2.2.5. Adjacency-SID representing an L2 bundle 689 When L2 bundle SR extensions [I-D.ietf-isis-l2bundles] are used, an 690 adjacency segment may be advertised to represent the bundle. In this 691 case, an implementation can deduce that load-balancing is expected on 692 the LSR advertising this segment and could then decide to place ELI/ 693 ELs to favor this LSR. 695 7.2.3. Maximizing number of LSRs that will load-balance 697 When placing ELI/ELs, an implementation may try to maximize the 698 number of LSRs that both need to load-balance (i.e., have ECMP paths) 699 and that will be able to perform load-balancing (i.e., the EL label 700 is within their ERLD). 702 Let's consider a path from PE1 to PE2 using the following stack 703 pushed by PE1: {Service_label, Adj_PE2P9, Node_P9, Adj_P1P2}. All 704 routers have an ERLD of 10, except P1 and P2, which have an ERLD of 4. 705 PE1 is able to push 6 labels, so only a single ELI/EL can be added. 707 In the example above, adding ELI/EL next to Adj_P1P2 will only allow 708 load-balancing at P1, while inserting it next to Adj_PE2P9 will allow 709 load-balancing at P2, P3, ..., P9, thereby maximizing the number of LSRs that 710 could perform load-balancing. 712 7.2.4.
Preference for a part of the path 714 An implementation may offer the option to favor a part of the end-to-end path 715 when the number of EL/ELIs that can be pushed is not enough to cover 716 the entire path. As an example, a service provider may want to favor 717 load-balancing at the beginning of the path or at the end of the path, so 718 the implementation should prefer putting the ELI/ELs near the top or 719 near the bottom of the stack. 721 7.2.5. Combining criteria 723 An implementation can combine multiple criteria to determine the best 724 EL/ELI placement. However, combining too many criteria may lead to 725 implementation complexity and high control plane resource 726 consumption. Each time the network topology changes, a new 727 evaluation of the EL/ELI placement will be necessary for each 728 impacted LSP. 730 8. A simple algorithm example 732 A simple implementation can take only the ERLD into account when placing 733 ELI/ELs, while still minimizing the number of EL/ELIs inserted and 734 maximizing the number of LSRs that can load-balance. 736 The algorithm example is based on the following considerations: 738 o An LSR that is limited in the number of ELI/EL pairs that it 739 can insert SHOULD insert such pairs deeper in the stack. 741 o An LSR should try to insert ELI/EL pairs at positions so that 742 for the maximum number of transit LSRs, the EL occurs within the 743 ERLD of those LSRs. 745 o An LSR should try to insert the minimum number of such pairs while 746 trying to satisfy the above criteria. 748 The pseudocode of the example is shown below.
750 Initialize the current EL insertion point to the
751   bottommost label in the stack that is EL-capable
752 while (local-node can push more ELI/EL pairs OR
753        insertion point is not above label stack) {
754     insert an ELI/EL pair below current insertion point
755     move new insertion point up from current insertion point until
756         ((last inserted EL is below the ERLD) AND (ERLD > 2)
757          AND
758          (new insertion point is EL-capable))
759     set current insertion point to new insertion point
760 }

762 Figure 7: Example algorithm to insert ELI/EL pairs in a label 763 stack 765 When this algorithm is applied to the example described in Section 3, 766 it will result in ELs being inserted in two positions, one below the 767 label L_N-D and another below L_N-P3. Thus the resulting label stack 768 would be {L_N-P3, ELI, EL, L_A-L1, L_N-D, ELI, EL}. 770 9. Deployment Considerations 772 As long as LSR node dataplane capabilities remain limited (number of 773 labels that can be pushed, or number of labels that can be 774 inspected), hop-by-hop load-balancing of SPRING encapsulated flows 775 will require trade-offs. 777 The entropy label is still a good and usable solution as it allows load- 778 balancing without having to perform a deep packet inspection on each 779 LSR: it does not seem reasonable to have an LSR inspecting UDP ports 780 within a GRE tunnel carried over a 15-label SPRING tunnel. 782 Due to the limited capacity to read a deep stack of MPLS labels, 783 multiple EL/ELIs may be required within the stack, which directly 784 impacts the capacity of the head-end to push a deep stack: each EL/ 785 ELI inserted requires two additional labels to be pushed. 787 Placement strategies of EL/ELIs are required to find the best trade- 788 off. Multiple criteria may be taken into account and some level of 789 customization (by the user) may be required to accommodate the 790 different deployments.
Analyzing the path of each destination to 791 determine the best EL/ELI placement may be time consuming for the 792 control plane; we encourage implementations to find the best trade- 793 off between simplicity, resource consumption, and load-balancing 794 efficiency. 796 In the future, hardware and software capacity may increase dataplane 797 capabilities and may remove some of these limits, increasing load- 798 balancing capability using entropy labels. 800 10. Options considered 802 Different options that were considered to arrive at the recommended 803 solution are documented in this section. 805 10.1. Single EL at the bottom of the stack of tunnels 807 In this option, a single EL is used for the entire label stack. The 808 source LSR S encodes the entropy label (EL) at the bottom of the 809 label stack. In the example described in Section 3, the label stack 810 at LSR S would look like {L_N-P3, L_A-L1, L_N-D, ELI, 811 EL} {remaining packet header}. Note that the notation in [RFC6790] 812 is used to describe the label stack. An issue with this approach is 813 that as the label stack grows due to an increase in the number of SIDs, 814 the EL goes correspondingly deeper in the label stack. Hence, 815 transit LSRs have to access a larger number of bytes in the packet 816 header when making forwarding decisions. In the example described in 817 Section 3, the LSR P1 would load-balance traffic poorly on the 818 parallel links L3 and L4 since the EL is below the ERLD of the packet 819 received by P1. A load-balanced network design using this approach 820 must ensure that all intermediate LSRs have the capability to 821 traverse the maximum label stack depth as required for the 822 application that uses source routed stacking. 824 In the case where the hardware is capable of pushing a single ELI/EL pair at any depth, this option is the same as the recommended 826 solution in Section 7.
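As an illustration only (not part of the draft), the depth problem of the single-EL-at-the-bottom option can be sketched in Python. The function names and the ERLD values are assumptions made for this sketch. An EL is treated as usable when it lies within the first ERLD labels counted from the top of the stack.

```python
def single_el_at_bottom(tunnel_labels):
    """Return the label stack (top first) with one ELI/EL pair at the bottom."""
    return list(tunnel_labels) + ["ELI", "EL"]


def el_depth(stack):
    """1-based depth, counted from the top, of the topmost EL in the stack."""
    return stack.index("EL") + 1


def el_readable(stack, erld):
    """True if the topmost EL lies within the first `erld` labels (assumed rule)."""
    return el_depth(stack) <= erld


# Stack from the Section 3 example as encoded by LSR S in this option.
stack = single_el_at_bottom(["L_N-P3", "L_A-L1", "L_N-D"])
print(stack)                        # EL sits at depth 5
print(el_readable(stack, erld=4))   # hypothetical LSR with ERLD 4
print(el_readable(stack, erld=10))  # hypothetical LSR with ERLD 10
```

Every additional SID pushed above the pair increases the EL depth by one, which is why this option depends on all transit LSRs being able to read deep stacks.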
828 This option was rejected since there exist a number of hardware 829 implementations which have a low maximum readable label depth. 830 Choosing this option can lead to a loss of load-balancing using EL in 831 a significant part of the network when that is a critical requirement 832 in a service-provider network. 834 10.2. An EL per tunnel in the stack 836 In this option, each tunnel in the stack can be given its own EL. 837 The source LSR pushes an ELI/EL pair before pushing a tunnel label when 838 load-balancing is required to direct traffic on that tunnel. In the 839 example described in Section 3, the source LSR S encoded label stack 840 would be {L_N-P3, ELI, EL, L_A-L1, L_N-D, ELI, EL} where all the ELs 841 can be the same. Accessing the EL at an intermediate LSR is 842 independent of the depth of the label stack and hence independent of 843 the specific application that uses source routed tunnels with label 844 stacking. A drawback is that the depth of the label stack grows 845 significantly, to almost 3 times the number of labels in the label 846 stack. The network design should ensure that source LSRs have the 847 capability to push such a deep label stack. Also, the bandwidth 848 overhead and potential MTU issues of deep label stacks should be 849 considered in the network design. 851 In the case where the RLD is the minimum value (3) for all LSRs, all 852 LSRs are EL-capable, and the LSR that is inserting ELI/EL pairs has 853 no limit on how many it can insert, this option is the same as 854 the recommended solution in Section 7. 856 This option was rejected due to the existence of hardware 857 implementations that can push a limited number of labels on the label 858 stack. Choosing this option would result in a hardware requirement 859 to push two additional labels per tunnel label. Hence it would 860 restrict the number of tunnels that can be stacked in an LSP and hence 861 constrain the types of LSPs that can be created. This was considered 862 unacceptable.
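The stack growth discussed for the per-tunnel option (Section 10.2) can be sketched as follows. This is a minimal Python illustration, not part of the draft. The function name and the per-tunnel load-balancing flags are assumptions chosen so that the output reproduces the example stack given above.

```python
def el_per_tunnel(tunnels):
    """Build a label stack (top first) from (label, needs_load_balancing) pairs.

    An ELI/EL pair is placed directly below every tunnel label that is
    flagged as requiring load-balancing (hypothetical input format).
    """
    stack = []
    for label, needs_lb in tunnels:
        stack.append(label)
        if needs_lb:
            stack.extend(["ELI", "EL"])
    return stack


# Section 3 example: the adjacency label L_A-L1 is assumed not to need an EL.
example = el_per_tunnel([("L_N-P3", True), ("L_A-L1", False), ("L_N-D", True)])
print(example, len(example))

# Worst case: every tunnel label gets its own pair, so depth is 3 * n.
worst = el_per_tunnel([(f"T{i}", True) for i in range(1, 5)])
print(len(worst))
```

With n tunnel labels the worst-case depth is 3n, so a head-end limited to pushing 6 labels could stack only 2 tunnels that each carry their own ELI/EL pair.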
864 10.3. A re-usable EL for a stack of tunnels 866 In this option, an LSR that terminates a tunnel re-uses the EL of the 867 terminated tunnel for the next inner tunnel. It does this by storing 868 the EL from the outer tunnel when that tunnel is terminated and re- 869 inserting it below the next inner tunnel label during the label swap 870 operation. The LSR that stacks tunnels should insert an EL below the 871 outermost tunnel. It should not insert ELs for any inner tunnels. 872 Also, the penultimate hop LSR of a segment must not pop the ELI and 873 EL even though they are exposed as the top labels since the 874 terminating LSR of that segment would re-use the EL for the next 875 segment. 877 In the example described in Section 3, the source LSR S encoded label stack would be 878 {L_N-P3, ELI, EL, L_A-L1, L_N-D}. At P1, the outgoing label stack 879 would be {L_N-P3, ELI, EL, L_A-L1, L_N-D} after it has load-balanced 880 to one of the links L3 or L4. At P3, the outgoing label stack would 881 be {L_N-D, ELI, EL}. At P2, the outgoing label stack would be {L_N- 882 D, ELI, EL} and it would load-balance to one of the nexthop LSRs P4 883 or P5. Accessing the EL at an intermediate LSR (e.g., P1) is 884 independent of the depth of the label stack and hence independent of 885 the specific use-case to which the label stack is applied. 887 This option was rejected due to the significant change in label swap 888 operations that would be required for existing hardware. 890 10.4. EL at top of stack 892 A slight variant of the re-usable EL option is to keep the EL at the 893 top of the stack rather than below the tunnel label. In this case, 894 each LSR that is not terminating a segment should continue to keep 895 the received EL at the top of the stack when forwarding the packet 896 along the segment. An LSR that terminates a segment should use the 897 EL from the terminated segment at the top of the stack when 898 forwarding onto the next segment.
900 This option was rejected due to the significant change in label swap 901 operations that would be required for existing hardware. 903 10.5. ELs at readable label stack depths 905 In this option, the source LSR inserts ELs for tunnels in the label 906 stack at depths such that each LSR along the path that must load 907 balance is able to access at least one EL. Note that the source LSR 908 may have to insert multiple ELs in the label stack at different 909 depths for this to work since intermediate LSRs may have differing 910 capabilities in accessing the depth of a label stack. The label 911 stack depth access value of intermediate LSRs must be known to create 912 such a label stack. How this value is determined is outside the 913 scope of this document. This value can be advertised using a 914 protocol such as an IGP. 916 Applying this method to the example in Section 3 above, if LSR P1 917 needs to have the EL within a depth of 4, then the source LSR S 918 encoded label stack would be {L_N-P3, ELI, EL, L_A-L1, L_N-D, ELI, 919 EL} where all the ELs would typically have the same value. 921 In the case where the RLD has different values along the path and the 922 LSR that is inserting ELI/EL pairs has no limit on how many pairs 923 it can insert, and it knows the appropriate positions in the stack 924 where they should be inserted, this option is the same as the 925 recommended solution in Section 7. 927 Note that a refinement of this solution which balances the number of 928 pushed labels against the desired entropy is the solution described 929 in Section 7. 931 11. Acknowledgements 933 The authors would like to thank John Drake, Loa Andersson, Curtis 934 Villamizar, Greg Mirsky, Markus Jork, Kamran Raza, Carlos Pignataro, 935 Bruno Decraene and Nobo Akiya for their review comments and 936 suggestions. 938 12.
Contributors 940 Xiaohu Xu 941 Huawei 943 Email: xuxiaohu@huawei.com 945 Wim Henderickx 946 Nokia 948 Email: wim.henderickx@nokia.com 950 Gunter Van De Velde 951 Nokia 953 Email: gunter.van_de_velde@nokia.com 955 Acee Lindem 956 Cisco 958 Email: acee@cisco.com 960 13. IANA Considerations 962 This memo includes no request to IANA. Note to RFC Editor: Remove 963 this section before publication. 965 14. Security Considerations 967 This document does not introduce any new security considerations 968 beyond those already listed in [RFC6790]. 970 15. References 972 15.1. Normative References 974 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 975 Requirement Levels", BCP 14, RFC 2119, 976 DOI 10.17487/RFC2119, March 1997. 979 [RFC6790] Kompella, K., Drake, J., Amante, S., Henderickx, W., and 980 L. Yong, "The Use of Entropy Labels in MPLS Forwarding", 981 RFC 6790, DOI 10.17487/RFC6790, November 2012. 984 [RFC7855] Previdi, S., Ed., Filsfils, C., Ed., Decraene, B., 985 Litkowski, S., Horneffer, M., and R. Shakir, "Source 986 Packet Routing in Networking (SPRING) Problem Statement 987 and Requirements", RFC 7855, DOI 10.17487/RFC7855, May 988 2016. 990 15.2. Informative References 992 [RFC4206] Kompella, K. and Y. Rekhter, "Label Switched Paths (LSP) 993 Hierarchy with Generalized Multi-Protocol Label Switching 994 (GMPLS) Traffic Engineering (TE)", RFC 4206, 995 DOI 10.17487/RFC4206, October 2005. 998 [RFC7325] Villamizar, C., Ed., Kompella, K., Amante, S., Malis, A., 999 and C. Pignataro, "MPLS Forwarding Compliance and 1000 Performance Requirements", RFC 7325, DOI 10.17487/RFC7325, 1001 August 2014. 1003 [I-D.ietf-spring-segment-routing] 1004 Filsfils, C., Previdi, S., Decraene, B., Litkowski, S., 1005 and R. Shakir, "Segment Routing Architecture", draft-ietf- 1006 spring-segment-routing-11 (work in progress), February 1007 2017. 1009 [I-D.ietf-isis-mpls-elc] 1010 Xu, X., Kini, S., Sivabalan, S., Filsfils, C., and S.
1011 Litkowski, "Signaling Entropy Label Capability Using IS- 1012 IS", draft-ietf-isis-mpls-elc-02 (work in progress), 1013 October 2016. 1015 [I-D.ietf-ospf-mpls-elc] 1016 Xu, X., Kini, S., Sivabalan, S., Filsfils, C., and S. 1017 Litkowski, "Signaling Entropy Label Capability Using 1018 OSPF", draft-ietf-ospf-mpls-elc-04 (work in progress), 1019 November 2016. 1021 [I-D.ietf-isis-l2bundles] 1022 Ginsberg, L., Bashandy, A., Filsfils, C., Nanduri, M., and 1023 E. Aries, "Advertising L2 Bundle Member Link Attributes in 1024 IS-IS", draft-ietf-isis-l2bundles-04 (work in progress), 1025 April 2017. 1027 Authors' Addresses 1029 Sriganesh Kini 1031 EMail: sriganeshkini@gmail.com 1033 Kireeti Kompella 1034 Juniper 1036 EMail: kireeti@juniper.net 1038 Siva Sivabalan 1039 Cisco 1041 EMail: msiva@cisco.com 1043 Stephane Litkowski 1044 Orange 1046 EMail: stephane.litkowski@orange.com 1048 Rob Shakir 1049 Google 1051 EMail: rjs@rob.sh 1053 Jeff Tantsura 1055 EMail: jefftant@gmail.com