Network Working Group                                      Eric C. Rosen
Internet Draft                                       Cisco Systems, Inc.
Expiration Date: January 1999
                                                        Arun Viswanathan
                                                     Lucent Technologies

                                                             Ross Callon
                                               IronBridge Networks, Inc.

                                                               July 1998

              Multiprotocol Label Switching Architecture

                      draft-ietf-mpls-arch-02.txt

Status of this Memo

   This document is an Internet-Draft.  Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups.  Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."
   To view the entire list of current Internet-Drafts, please check the
   "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), ftp.nordu.net (Northern
   Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au (Pacific
   Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu (US West Coast).

Abstract

   This internet draft specifies the architecture for multiprotocol
   label switching (MPLS).  The architecture is based on other label
   switching approaches [2-11] as well as on the MPLS Framework document
   [1].

Table of Contents

    1          Introduction to MPLS
    1.1        Overview
    1.2        Terminology
    1.3        Acronyms and Abbreviations
    1.4        Acknowledgments
    2          Outline of Approach
    2.1        Labels
    2.2        Upstream and Downstream LSRs
    2.3        Labeled Packet
    2.4        Label Assignment and Distribution
    2.5        Attributes of a Label Binding
    2.6        Label Distribution Protocol (LDP)
    2.7        Downstream vs. Downstream-on-Demand
    2.8        Label Retention Mode
    2.9        The Label Stack
    2.10       The Next Hop Label Forwarding Entry (NHLFE)
    2.11       Incoming Label Map (ILM)
    2.12       FEC-to-NHLFE Map (FTN)
    2.13       Label Swapping
    2.14       Scope and Uniqueness of Labels
    2.15       Label Switched Path (LSP), LSP Ingress, LSP Egress
    2.16       Penultimate Hop Popping
    2.17       LSP Next Hop
    2.18       Invalid Incoming Labels
    2.19       LSP Control: Ordered versus Independent
    2.20       Aggregation
    2.21       Route Selection
    2.22       Time-to-Live (TTL)
    2.23       Loop Control
    2.23.1     Loop Prevention
    2.23.2     Interworking of Loop Control Options
    2.24       Label Encodings
    2.24.1     MPLS-specific Hardware and/or Software
    2.24.2     ATM Switches as LSRs
    2.24.3     Interoperability among Encoding Techniques
    2.25       Label Merging
    2.25.1     Non-merging LSRs
    2.25.2     Labels for Merging and Non-Merging LSRs
    2.25.3     Merge over ATM
    2.25.3.1   Methods of Eliminating Cell Interleave
    2.25.3.2   Interoperation: VC Merge, VP Merge, and Non-Merge
    2.26       Tunnels and Hierarchy
    2.26.1     Hop-by-Hop Routed Tunnel
    2.26.2     Explicitly Routed Tunnel
    2.26.3     LSP Tunnels
    2.26.4     Hierarchy: LSP Tunnels within LSPs
    2.26.5     LDP Peering and Hierarchy
    2.27       LDP Transport
    2.28       Multicast
    3          Some Applications of MPLS
    3.1        MPLS and Hop by Hop Routed Traffic
    3.1.1      Labels for Address Prefixes
    3.1.2      Distributing Labels for Address Prefixes
    3.1.2.1    LDP Peers for a Particular Address Prefix
    3.1.2.2    Distributing Labels
    3.1.3      Using the Hop by Hop path as the LSP
    3.1.4      LSP Egress and LSP Proxy Egress
    3.1.5      The Implicit NULL Label
    3.1.6      Option: Egress-Targeted Label Assignment
    3.2        MPLS and Explicitly Routed LSPs
    3.2.1      Explicitly Routed LSP Tunnels: Traffic Engineering
    3.3        Label Stacks and Implicit Peering
    3.4        MPLS and Multi-Path Routing
    3.5        LSP Trees as Multipoint-to-Point Entities
    3.6        LSP Tunneling between BGP Border Routers
    3.7        Other Uses of Hop-by-Hop Routed LSP Tunnels
    3.8        MPLS and Multicast
    4          LDP Procedures for Hop-by-Hop Routed Traffic
    4.1        The Procedures for Advertising and Using labels
    4.1.1      Downstream LSR: Distribution Procedure
    4.1.1.1    PushUnconditional
    4.1.1.2    PushConditional
    4.1.1.3    PulledUnconditional
    4.1.1.4    PulledConditional
    4.1.2      Upstream LSR: Request Procedure
    4.1.2.1    RequestNever
    4.1.2.2    RequestWhenNeeded
    4.1.2.3    RequestOnRequest
    4.1.3      Upstream LSR: NotAvailable Procedure
    4.1.3.1    RequestRetry
    4.1.3.2    RequestNoRetry
    4.1.4      Upstream LSR: Release Procedure
    4.1.4.1    ReleaseOnChange
    4.1.4.2    NoReleaseOnChange
    4.1.5      Upstream LSR: labelUse Procedure
    4.1.5.1    UseImmediate
    4.1.5.2    UseIfLoopFree
    4.1.5.3    UseIfLoopNotDetected
    4.1.6      Downstream LSR: Withdraw Procedure
    4.2        MPLS Schemes: Supported Combinations of Procedures
    4.2.1      TTL-capable LSP Segments
    4.2.2      Using ATM Switches as LSRs
    4.2.2.1    Without Label Merging
    4.2.2.2    With Label Merging
    4.2.3      Interoperability Considerations
    5          Security Considerations
    6          Authors' Addresses
    7          References

1. Introduction to MPLS

1.1. Overview

   In connectionless network layer protocols, as a packet travels from
   one router hop to the next, an independent forwarding decision is
   made at each hop.  Each router runs a network layer routing
   algorithm.  As a packet travels through the network, each router
   analyzes the packet header.  The choice of next hop for a packet is
   based on the header analysis and the result of running the routing
   algorithm.

   Packet headers contain considerably more information than is needed
   simply to choose the next hop.  Choosing the next hop can therefore
   be thought of as the composition of two functions.  The first
   function partitions the entire set of possible packets into a set of
   "Forwarding Equivalence Classes (FECs)".  The second maps each FEC to
   a next hop.
   Insofar as the forwarding decision is concerned, different packets
   which get mapped into the same FEC are indistinguishable.  All
   packets which belong to a particular FEC and which travel from a
   particular node will follow the same path.

   In conventional IP forwarding, a particular router will typically
   consider two packets to be in the same FEC if there is some address
   prefix X in that router's routing tables such that X is the "longest
   match" for each packet's destination address.  As the packet
   traverses the network, each hop in turn reexamines the packet and
   assigns it to a FEC.

   In MPLS, the assignment of a particular packet to a particular FEC is
   done just once, as the packet enters the network.  The FEC to which
   the packet is assigned is encoded as a short fixed length value known
   as a "label".  When a packet is forwarded to its next hop, the label
   is sent along with it; that is, the packets are "labeled".

   At subsequent hops, there is no further analysis of the packet's
   network layer header.  Rather, the label is used as an index into a
   table which specifies the next hop, and a new label.  The old label
   is replaced with the new label, and the packet is forwarded to its
   next hop.  If assignment to a FEC is based on a "longest match", this
   eliminates the need to perform a longest match computation for each
   packet at each hop; the computation can be performed just once.

   Some routers analyze a packet's network layer header not merely to
   choose the packet's next hop, but also to determine a packet's
   "precedence" or "class of service", in order to apply different
   discard thresholds or scheduling disciplines to different packets.
   MPLS allows the precedence or class of service to be inferred from
   the label, so that no further header analysis is needed; in some
   cases MPLS provides a way to explicitly encode a class of service in
   the "label header".
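As a non-normative illustration, the longest-match partitioning described above can be sketched in a few lines of Python.  The routing table contents and the function name `fec_for` are invented for the example; the addresses use the ranges reserved for documentation.

```python
import ipaddress

# Hypothetical routing table of address prefixes, as a conventional
# router would hold.  Contents are illustrative only.
routing_table = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("198.51.100.128/25"),
    ipaddress.ip_network("203.0.113.0/24"),
]

def fec_for(dest):
    """Partition packets into FECs by longest match on the destination.

    Returns the most specific matching prefix; all packets whose
    destination falls under that prefix belong to the same FEC.
    """
    addr = ipaddress.ip_address(dest)
    matches = [p for p in routing_table if addr in p]
    return max(matches, key=lambda p: p.prefixlen, default=None)
```

Under this sketch, `fec_for("198.51.100.200")` selects the /25 rather than the /24, so every packet landing in that FEC can be given the same label at the ingress LSR, and no interior LSR need repeat the longest-match computation.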
   The fact that a packet is assigned to a FEC just once, rather than at
   every hop, allows the use of sophisticated forwarding paradigms.  A
   packet that enters the network at a particular router can be labeled
   differently than the same packet entering the network at a different
   router, and as a result forwarding decisions that depend on the
   ingress point ("policy routing") can be easily made.  In fact, the
   policy used to assign a packet to a FEC need not have only the
   network layer header as input; it may use arbitrary information about
   the packet, and/or arbitrary policy information as input.  Since this
   decouples forwarding from routing, it allows one to use MPLS to
   support a large variety of routing policies that are difficult or
   impossible to support with just conventional network layer
   forwarding.

   Similarly, MPLS facilitates the use of explicit routing, without
   requiring that each IP packet carry the explicit route.  Explicit
   routes may be useful to support policy routing and traffic
   engineering.

   MPLS makes use of a routing approach whereby the normal mode of
   operation is that L3 routing (e.g., existing IP routing protocols
   and/or new IP routing protocols) is used by all nodes to determine
   the routed path.

   MPLS stands for "Multiprotocol" Label Switching, multiprotocol
   because its techniques are applicable to ANY network layer protocol.
   In this document, however, we focus on the use of IP as the network
   layer protocol.

   A router which supports MPLS is known as a "Label Switching Router",
   or LSR.

   A general discussion of issues related to MPLS is presented in "A
   Framework for Multiprotocol Label Switching" [1].

1.2. Terminology

   This section gives a general conceptual overview of the terms used in
   this document.  Some of these terms are more precisely defined in
   later sections of the document.
   DLCI
      a label used in Frame Relay networks to identify frame relay
      circuits

   flow
      a single instance of an application to application flow of data
      (as in the RSVP and IFMP use of the term "flow")

   forwarding equivalence class
      a group of IP packets which are forwarded in the same manner
      (e.g., over the same path, with the same forwarding treatment)

   frame merge
      label merging, when it is applied to operation over frame based
      media, so that the potential problem of cell interleave is not an
      issue

   label
      a short fixed length physically contiguous identifier which is
      used to identify a FEC, usually of local significance

   label merging
      the replacement of multiple incoming labels for a particular FEC
      with a single outgoing label

   label swap
      the basic forwarding operation consisting of looking up an
      incoming label to determine the outgoing label, encapsulation,
      port, and other data handling information

   label swapping
      a forwarding paradigm allowing streamlined forwarding of data by
      using labels to identify classes of data packets which are
      treated indistinguishably when forwarding

   label switched hop
      the hop between two MPLS nodes, on which forwarding is done using
      labels

   label switched path
      the path created by the concatenation of one or more label
      switched hops, allowing a packet to be forwarded by swapping
      labels from an MPLS node to another MPLS node

   layer 2
      the protocol layer under layer 3 (which therefore offers the
      services used by layer 3).  Forwarding, when done by the swapping
      of short fixed length labels, occurs at layer 2 regardless of
      whether the label being examined is an ATM VPI/VCI, a frame relay
      DLCI, or an MPLS label
   layer 3
      the protocol layer at which IP and its associated routing
      protocols operate

   link layer
      synonymous with layer 2

   loop detection
      a method of dealing with loops in which loops are allowed to be
      set up, and data may be transmitted over the loop, but the loop
      is later detected and closed

   loop prevention
      a method of dealing with loops in which data is never transmitted
      over a loop

   label stack
      an ordered set of labels

   loop survival
      a method of dealing with loops in which data may be transmitted
      over a loop, but means are employed to limit the amount of
      network resources which may be consumed by the looping data

   label switched path
      the path through one or more LSRs at one level of the hierarchy
      followed by packets in a particular FEC

   label switching router
      an MPLS node which is capable of forwarding native L3 packets

   merge point
      a node at which label merging is done

   MPLS core standards
      the standards which describe the core MPLS technology

   MPLS domain
      a contiguous set of nodes which operate MPLS routing and
      forwarding and which are also in one Routing or Administrative
      Domain

   MPLS edge node
      an MPLS node that connects an MPLS domain with a node which is
      outside of the domain, either because it does not run MPLS,
      and/or because it is in a different domain.  Note that if an LSR
      has a neighboring host which is not running MPLS, then that LSR
      is an MPLS edge node

   MPLS egress node
      an MPLS edge node in its role in handling traffic as it leaves an
      MPLS domain

   MPLS ingress node
      an MPLS edge node in its role in handling traffic as it enters an
      MPLS domain

   MPLS label
      a label which is carried in a packet header, and which represents
      the packet's FEC

   MPLS node
      a node which is running MPLS.
An MPLS node will be aware of MPLS control protocols, will operate
      one or more L3 routing protocols, and will be capable of
      forwarding packets based on labels.  An MPLS node may optionally
      be also capable of forwarding native L3 packets

   MultiProtocol Label Switching
      an IETF working group and the effort associated with the working
      group

   network layer
      synonymous with layer 3

   stack
      synonymous with label stack

   switched path
      synonymous with label switched path

   virtual circuit
      a circuit used by a connection-oriented layer 2 technology such
      as ATM or Frame Relay, requiring the maintenance of state
      information in layer 2 switches

   VC merge
      label merging where the MPLS label is carried in the ATM VCI
      field (or combined VPI/VCI field), so as to allow multiple VCs to
      merge into one single VC

   VP merge
      label merging where the MPLS label is carried in the ATM VPI
      field, so as to allow multiple VPs to be merged into one single
      VP.  In this case two cells would have the same VCI value only if
      they originated from the same node.  This allows cells from
      different sources to be distinguished via the VCI

   VPI/VCI
      a label used in ATM networks to identify circuits

1.3.
Acronyms and Abbreviations

   ATM     Asynchronous Transfer Mode
   BGP     Border Gateway Protocol
   DLCI    Data Link Circuit Identifier
   FEC     Forwarding Equivalence Class
   FTN     FEC to NHLFE Map
   IGP     Interior Gateway Protocol
   ILM     Incoming Label Map
   IP      Internet Protocol
   LDP     Label Distribution Protocol
   L2      Layer 2
   L3      Layer 3
   LSP     Label Switched Path
   LSR     Label Switching Router
   MPLS    MultiProtocol Label Switching
   MPT     Multipoint to Point Tree
   NHLFE   Next Hop Label Forwarding Entry
   SVC     Switched Virtual Circuit
   SVP     Switched Virtual Path
   TTL     Time-To-Live
   VC      Virtual Circuit
   VCI     Virtual Circuit Identifier
   VP      Virtual Path
   VPI     Virtual Path Identifier

1.4. Acknowledgments

   The ideas and text in this document have been collected from a number
   of sources and comments received.  We would like to thank Rick
   Boivie, Paul Doolan, Nancy Feldman, Yakov Rekhter, Vijay Srinivasan,
   and George Swallow for their inputs and ideas.

2. Outline of Approach

   In this section, we introduce some of the basic concepts of MPLS and
   describe the general approach to be used.

2.1. Labels

   A label is a short, fixed length, locally significant identifier
   which is used to identify a FEC.  The label which is put on a
   particular packet represents the Forwarding Equivalence Class to
   which that packet is assigned.

   Most commonly, packets are assigned to FECs based on their
   destination network layer addresses.  However, the label is never an
   encoding of the destination network layer address.

   If Ru and Rd are LSRs, they may agree that when Ru transmits a packet
   to Rd, Ru will label the packet with label value L if and only if the
   packet is a member of a particular FEC F.  That is, they can agree to
   a "binding" between label L and FEC F for packets moving from Ru to
   Rd.
As a result of such an agreement, L becomes Ru's "outgoing label"
   representing FEC F, and L becomes Rd's "incoming label" representing
   FEC F.

   Note that L does not necessarily represent FEC F for any packets
   other than those which are being sent from Ru to Rd.  L is an
   arbitrary value whose binding to F is local to Ru and Rd.

   When we speak above of packets "being sent" from Ru to Rd, we do not
   imply either that the packet originated at Ru or that its
   destination is Rd.  Rather, we mean to include packets which are
   "transit packets" at one or both of the LSRs.

   Sometimes it may be difficult or even impossible for Rd to tell, of
   an arriving packet carrying label L, that the label L was placed in
   the packet by Ru, rather than by some other LSR.  (This will
   typically be the case when Ru and Rd are not direct neighbors.)  In
   such cases, Rd must make sure that the binding from label to FEC is
   one-to-one.  That is, in such cases, Rd must not agree with Ru1 to
   bind L to FEC F1, while also agreeing with some other LSR Ru2 to
   bind L to a different FEC F2.  It is the responsibility of each LSR
   to ensure that it can uniquely interpret its incoming labels.

2.2. Upstream and Downstream LSRs

   Suppose Ru and Rd have agreed to bind label L to FEC F, for packets
   sent from Ru to Rd.  Then with respect to this binding, Ru is the
   "upstream LSR", and Rd is the "downstream LSR".

   To say that one node is upstream and one is downstream with respect
   to a given binding means only that a particular label represents a
   particular FEC in packets travelling from the upstream node to the
   downstream node.  This is NOT meant to imply that packets in that
   FEC would actually be routed from the upstream node to the
   downstream node.

2.3. Labeled Packet

   A "labeled packet" is a packet into which a label has been encoded.
In some cases, the label resides in an encapsulation header which
   exists specifically for this purpose.  In other cases, the label may
   reside in an existing data link or network layer header, as long as
   there is a field which is available for that purpose.  The
   particular encoding technique to be used must be agreed to by both
   the entity which encodes the label and the entity which decodes the
   label.

2.4. Label Assignment and Distribution

   In the MPLS architecture, the decision to bind a particular label L
   to a particular FEC F is made by the LSR which is DOWNSTREAM with
   respect to that binding.  The downstream LSR then informs the
   upstream LSR of the binding.  Thus labels are "downstream-assigned",
   and label bindings are distributed in the "downstream to upstream"
   direction.

2.5. Attributes of a Label Binding

   A particular binding of label L to FEC F, distributed by Rd to Ru,
   may have associated "attributes".  If Ru, acting as a downstream
   LSR, also distributes a binding of a label to FEC F, then under
   certain conditions, it may be required to also distribute the
   corresponding attribute that it received from Rd.

2.6. Label Distribution Protocol (LDP)

   A Label Distribution Protocol (LDP) is a set of procedures by which
   one LSR informs another of the label/FEC bindings it has made.  Two
   LSRs which use an LDP to exchange label/FEC binding information are
   known as "LDP Peers" with respect to the binding information they
   exchange.  If two LSRs are LDP Peers, we will speak of there being
   an "LDP Adjacency" between them.

   (N.B.: two LSRs may be LDP Peers with respect to some set of
   bindings, but not with respect to some other set of bindings.)

   The LDP also encompasses any negotiations in which two LDP Peers
   need to engage in order to learn of each other's MPLS capabilities.

2.7. Downstream vs.
Downstream-on-Demand

   The MPLS architecture allows an LSR to explicitly request, from its
   next hop for a particular FEC, a label binding for that FEC.  This
   is known as "downstream-on-demand" label distribution.

   The MPLS architecture also allows an LSR to distribute bindings to
   LSRs that have not explicitly requested them.  This is known as
   "downstream" label distribution.

   Both of these label distribution techniques may be used in the same
   network at the same time.  However, on any given LDP adjacency, the
   upstream LSR and the downstream LSR must agree on which technique is
   to be used.

2.8. Label Retention Mode

   An LSR Ru may receive (or have received) a label binding for a
   particular FEC from an LSR Rd, even though Rd is not Ru's next hop
   (or is no longer Ru's next hop) for that FEC.

   Ru then has the choice of whether to keep track of such bindings, or
   whether to discard such bindings.  If Ru keeps track of such
   bindings, then it may immediately begin using the binding again if
   Rd eventually becomes its next hop for the FEC in question.  If Ru
   discards such bindings, then if Rd later becomes the next hop, the
   binding will have to be reacquired.

   If an LSR supports "Liberal Label Retention Mode", it maintains the
   bindings between a label and a FEC which are received from LSRs
   which are not its next hop for that FEC.  If an LSR supports
   "Conservative Label Retention Mode", it discards such bindings.

   Liberal label retention mode allows for quicker adaptation to
   routing changes, especially if loop prevention (see section 2.23) is
   not being used.  Conservative label retention mode, though, requires
   an LSR to maintain many fewer labels.

2.9. The Label Stack

   So far, we have spoken as if a labeled packet carries only a single
   label.
As we shall see, it is useful to have a more general model in which a
   labeled packet carries a number of labels, organized as a last-in,
   first-out stack.  We refer to this as a "label stack".

   IN MPLS, EVERY FORWARDING DECISION IS BASED EXCLUSIVELY ON THE LABEL
   AT THE TOP OF THE STACK.

   Although, as we shall see, MPLS supports a hierarchy, the processing
   of a labeled packet is completely independent of the level of
   hierarchy.  The processing is always based on the top label, without
   regard for the possibility that some number of other labels may have
   been "above it" in the past, or that some number of other labels may
   be below it at present.

   An unlabeled packet can be thought of as a packet whose label stack
   is empty (i.e., whose label stack has depth 0).

   If a packet's label stack is of depth m, we refer to the label at
   the bottom of the stack as the level 1 label, to the label above it
   (if such exists) as the level 2 label, and to the label at the top
   of the stack as the level m label.

   The utility of the label stack will become clear when we introduce
   the notion of LSP Tunnel and the MPLS Hierarchy (section 2.26).

2.10. The Next Hop Label Forwarding Entry (NHLFE)

   The "Next Hop Label Forwarding Entry" (NHLFE) is used when
   forwarding a labeled packet.  It contains the following information:

      1. the packet's next hop

      2. the data link encapsulation to use when transmitting the
         packet

      3. the way to encode the label stack when transmitting the packet

      4. the operation to perform on the packet's label stack; this is
         one of the following operations:

         a) replace the label at the top of the label stack with a
            specified new label

         b) pop the label stack

         c) replace the label at the top of the label stack with a
            specified new label, and then push one or more specified
            new labels onto the label stack.
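The three stack operations listed above can be sketched in Python, with the top of the stack modeled as the last element of a list.  The `Nhlfe` structure, its field names, and the operation names are illustrative assumptions for this sketch, not encodings taken from the MPLS specification.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Nhlfe:
    """Illustrative NHLFE: next hop plus a label stack operation."""
    next_hop: str
    op: str                              # "swap", "pop", or "swap-push"
    out_label: Optional[int] = None
    push_labels: Tuple[int, ...] = ()    # extra labels for "swap-push"

def apply_stack_op(stack: List[int], nhlfe: Nhlfe) -> List[int]:
    """Perform the NHLFE's operation on a copy of the label stack."""
    stack = list(stack)
    if nhlfe.op == "pop":
        stack.pop()                      # b) pop the label stack
    elif nhlfe.op == "swap":
        stack[-1] = nhlfe.out_label      # a) replace the top label
    elif nhlfe.op == "swap-push":
        stack[-1] = nhlfe.out_label      # c) replace the top label...
        stack.extend(nhlfe.push_labels)  #    ...then push new labels
    return stack
```

For instance, `apply_stack_op([17], Nhlfe("R2", "swap", 29))` yields the stack `[29]`; whatever labels sit below the top are carried along untouched, matching the rule that processing is always based on the top label alone.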
624 Note that at a given LSR, the packet's "next hop" might be that LSR 625 itself. In this case, the LSR would need to pop the top level label, 626 and then "forward" the resulting packet to itself. It would then 627 make another forwarding decision, based on what remains after the 628 label stack is popped. This may still be a labeled packet, or it 629 may be the native IP packet. 631 This implies that in some cases the LSR may need to operate on the IP 632 header in order to forward the packet. 634 If the packet's "next hop" is the current LSR, then the label stack 635 operation MUST be to "pop the stack". 637 2.11. Incoming Label Map (ILM) 639 The "Incoming Label Map" (ILM) is a mapping from incoming labels to 640 NHLFEs. It is used when forwarding packets that arrive as labeled 641 packets. 643 2.12. FEC-to-NHLFE Map (FTN) 645 The "FEC-to-NHLFE" (FTN) is a mapping from FECs to NHLFEs. It is used 646 when forwarding packets that arrive unlabeled, but which are to be 647 labeled before being forwarded. 649 2.13. Label Swapping 651 Label swapping is the use of the following procedures to forward a 652 packet. 654 In order to forward a labeled packet, an LSR examines the label at the 655 top of the label stack. It uses the ILM to map this label to an 656 NHLFE. Using the information in the NHLFE, it determines where to 657 forward the packet, and performs an operation on the packet's label 658 stack. It then encodes the new label stack into the packet, and 659 forwards the result. 661 In order to forward an unlabeled packet, an LSR analyzes the network 662 layer header, to determine the packet's FEC. It then uses the FTN to 663 map this to an NHLFE. Using the information in the NHLFE, it 664 determines where to forward the packet, and performs an operation on 665 the packet's label stack. (Popping the label stack would, of course, 666 be illegal in this case.) It then encodes the new label stack into 667 the packet, and forwards the result.
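These two procedures might be sketched as follows. The table contents, the FEC spelled as a documentation-range prefix, and the deliberately minimal NHLFE (only a next hop plus a stack operation) are illustrative assumptions, not part of the architecture:

```python
# Illustrative sketch of label swapping (section 2.13).  The ILM maps an
# incoming top label to an NHLFE; the FTN maps a FEC to an NHLFE.  An
# NHLFE here is a (next hop, operation, labels) tuple; entries invented.

def apply_nhlfe(nhlfe, stack):
    next_hop, op, labels = nhlfe
    if op == "pop":
        stack = stack[:-1]
    else:  # "swap": replace the top label with one or more labels
        stack = stack[:-1] + labels
    return next_hop, stack

ILM = {17: ("LSR-B", "swap", [35])}               # for labeled packets
FTN = {"198.51.100/24": ("LSR-B", "swap", [17])}  # for unlabeled packets

def forward_labeled(stack):
    # The forwarding decision is based exclusively on the TOP label.
    return apply_nhlfe(ILM[stack[-1]], stack)

def forward_unlabeled(fec):
    nhlfe = FTN[fec]
    assert nhlfe[1] != "pop"      # popping an empty stack would be illegal
    return apply_nhlfe(nhlfe, []) # the empty stack has labels pushed on
```

For example, `forward_labeled([17])` would yield `("LSR-B", [35])`, while `forward_unlabeled("198.51.100/24")` would yield `("LSR-B", [17])`.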
669 IT IS IMPORTANT TO NOTE THAT WHEN LABEL SWAPPING IS IN USE, THE NEXT 670 HOP IS ALWAYS TAKEN FROM THE NHLFE; THIS MAY IN SOME CASES BE 671 DIFFERENT FROM WHAT THE NEXT HOP WOULD BE IF MPLS WERE NOT IN USE. 673 2.14. Scope and Uniqueness of Labels 675 A given LSR Rd may bind label L1 to FEC F, and distribute that 676 binding to LDP peer Ru1. Rd may also bind label L2 to FEC F, and 677 distribute that binding to LDP peer Ru2. Whether or not L1 == L2 is 678 not determined by the architecture; this is a local matter. 680 A given LSR Rd may bind label L to FEC F1, and distribute that 681 binding to LDP peer Ru1. Rd may also bind label L to FEC F2, and 682 distribute that binding to LDP peer Ru2. IF (AND ONLY IF) RD CAN 683 TELL, WHEN IT RECEIVES A PACKET WHOSE TOP LABEL IS L, WHETHER THE 684 LABEL WAS PUT THERE BY RU1 OR BY RU2, THEN THE ARCHITECTURE DOES NOT 685 REQUIRE THAT F1 == F2. 687 In general, Rd can only tell whether it was Ru1 or Ru2 that put the 688 particular label value L at the top of the label stack if the 689 following conditions hold: 691 - Ru1 and Ru2 are the only LDP peers to which Rd distributed a 692 binding of label value L, and 694 - Ru1 and Ru2 are each directly connected to Rd via a point-to- 695 point interface. 697 When these conditions hold, an LSR may use labels that have "per 698 interface" scope, i.e., which are only unique per interface. When 699 these conditions do not hold, the labels must be unique over the LSR 700 which has assigned them. 702 If a particular LSR Rd is attached to a particular LSR Ru over two 703 point-to-point interfaces, then Rd may distribute to Ru a binding of 704 label L to FEC F1, as well as a binding of label L to FEC F2, F1 != 705 F2, if and only if each binding is valid only for packets which Ru 706 sends to Rd over a particular one of the interfaces. In all other 707 cases, Rd MUST NOT distribute to Ru bindings of the same label value 708 to two different FECs.
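A minimal sketch of the two scoping disciplines follows. Keying the ILM by (incoming interface, label) for per-interface scope is an implementation assumption made for illustration; the draft does not prescribe a table layout:

```python
# Illustrative sketch of label scope (section 2.14).  With per-platform
# scope the ILM is keyed by the label value alone, so a given value can
# be bound to only one FEC; with per-interface scope it is keyed by
# (incoming interface, label), so the same value 17 may be bound to
# different FECs on different point-to-point interfaces.

per_platform_ilm = {17: "NHLFE for FEC F1"}   # 17 unique across the LSR

per_interface_ilm = {
    ("if0", 17): "NHLFE for FEC F1",  # label 17 arriving on interface if0
    ("if1", 17): "NHLFE for FEC F2",  # the same value arriving on if1
}

def ilm_lookup(ilm, interface, label, per_interface):
    return ilm[(interface, label) if per_interface else label]
```

With per-interface scope, `ilm_lookup` distinguishes the two bindings of label 17 by the arrival interface; with per-platform scope that distinction is impossible, which is why the same value MUST NOT then be bound to two different FECs for the same peer.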
710 This prohibition holds even if the bindings are regarded as being at 711 different "levels of hierarchy". In MPLS, there is no notion of 712 having a different label space for different levels of the hierarchy; 713 when interpreting a label, the level of the label is irrelevant. 715 2.15. Label Switched Path (LSP), LSP Ingress, LSP Egress 717 A "Label Switched Path (LSP) of level m" for a particular packet P is 718 a sequence of routers, 720 <R1, ..., Rn> 722 with the following properties: 724 1. R1, the "LSP Ingress", is an LSR which pushes a label onto P's 725 label stack, resulting in a label stack of depth m; 2. For all i, 1<i<n, P has a label stack of depth m when received by LSR Ri; 3. At no time during P's transit from R1 to R[n-1] does its label stack ever have a depth of less than m; 4. For all i, 1<i<n: Ri transmits P to R[i+1] by means of MPLS, i.e., by using the label at the top of the label stack (the level m label) as an index into an ILM; 5. For all i, 1<i<n: if a system S receives and forwards P after P is transmitted by Ri but before P is received by R[i+1] (e.g., Ri and R[i+1] might be connected via a switched data link subnetwork, and S might be one of the data link switches), then S's forwarding decision is not based on the level m label, or on the network layer header. This may be because: a) the decision is not based on the label stack or the network layer header at all; b) the decision is based on a label stack on which additional labels have been pushed (i.e., on a level m+k label, where k>0). 751 In other words, we can speak of the level m LSP for Packet P as the 752 sequence of routers: 754 1. which begins with an LSR (an "LSP Ingress") that pushes on a 755 level m label, 757 2. all of whose intermediate LSRs make their forwarding decision 758 by label Switching on a level m label, 760 3. which ends (at an "LSP Egress") when a forwarding decision is 761 made by label Switching on a level m-k label, where k>0, or 762 when a forwarding decision is made by "ordinary", non-MPLS 763 forwarding procedures. 765 A consequence (or perhaps a presupposition) of this is that whenever 766 an LSR pushes a label onto an already labeled packet, it needs to 767 make sure that the new label corresponds to a FEC whose LSP Egress is 768 the LSR that assigned the label which is now second in the stack. 770 We will call a sequence of LSRs the "LSP for a particular FEC F" if 771 it is an LSP of level m for a particular packet P when P's level m 772 label is a label corresponding to FEC F. 774 Consider the set of nodes which may be LSP ingress nodes for FEC F. 775 Then there is an LSP for FEC F which begins with each of those nodes. 776 If a number of those LSPs have the same LSP egress, then one can 777 consider the set of such LSPs to be a tree, whose root is the LSP 778 egress.
(Since data travels along this tree towards the root, this 779 may be called a multipoint-to-point tree.) We can thus speak of the 780 "LSP tree" for a particular FEC F. 782 2.16. Penultimate Hop Popping 784 Note that according to the definitions of section 2.15, if <R1, ..., Rn> is a level m LSP for packet P, P may be transmitted from R[n-1] 786 to Rn with a label stack of depth m-1. That is, the label stack may 787 be popped at the penultimate LSR of the LSP, rather than at the LSP 788 Egress. 790 From an architectural perspective, this is perfectly appropriate. 791 The purpose of the level m label is to get the packet to Rn. Once 792 R[n-1] has decided to send the packet to Rn, the label no longer has 793 any function, and need no longer be carried. 795 There is also a practical advantage to doing penultimate hop popping. 796 If one does not do this, then when the LSP egress receives a packet, 797 it first looks up the top label, and determines as a result of that 798 lookup that it is indeed the LSP egress. Then it must pop the stack, 799 and examine what remains of the packet. If there is another label on 800 the stack, the egress will look this up and forward the packet based 801 on this lookup. (In this case, the egress for the packet's level m 802 LSP is also an intermediate node for its level m-1 LSP.) If there is 803 no other label on the stack, then the packet is forwarded according 804 to its network layer destination address. Note that this would 805 require the egress to do TWO lookups, either two label lookups or a 806 label lookup followed by an address lookup. 808 If, on the other hand, penultimate hop popping is used, then when the 809 penultimate hop looks up the label, it determines: 811 - that it is the penultimate hop, and 813 - who the next hop is. 815 The penultimate node then pops the stack, and forwards the packet 816 based on the information gained by looking up the label that was 817 previously at the top of the stack.
When the LSP egress receives the 818 packet, the label which is now at the top of the stack will be the 819 label which it needs to look up in order to make its own forwarding 820 decision. Or, if the packet was only carrying a single label, the 821 LSP egress will simply see the network layer packet, which is just 822 what it needs to see in order to make its forwarding decision. 824 This technique allows the egress to do a single lookup, and also 825 requires only a single lookup by the penultimate node. 827 The creation of the forwarding "fastpath" in a label switching 828 product may be greatly aided if it is known that only a single lookup 829 is ever required: 831 - the code may be simplified if it can assume that only a single 832 lookup is ever needed 834 - the code can be based on a "time budget" that assumes that only a 835 single lookup is ever needed. 837 In fact, when penultimate hop popping is done, the LSP Egress need 838 not even be an LSR. 840 However, some hardware switching engines may not be able to pop the 841 label stack, so this cannot be universally required. There may also 842 be some situations in which penultimate hop popping is not desirable. 843 Therefore the penultimate node pops the label stack only if this is 844 specifically requested by the egress node, OR if the next node in the 845 LSP does not support MPLS. (If the next node in the LSP does support 846 MPLS, but does not make such a request, the penultimate node has no 847 way of knowing that it in fact is the penultimate node.) 849 An LSR which is capable of popping the label stack at all MUST do 850 penultimate hop popping when so requested by its downstream LDP peer. 852 Initial LDP negotiations MUST allow each LSR to determine whether its 853 neighboring LSRs are capable of popping the label stack. An LSR MUST 854 NOT request an LDP peer to pop the label stack unless it is capable 855 of doing so.
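The lookup-count argument above can be made concrete with a small sketch. It considers a packet carrying a single label; the function names and the exact accounting are illustrative only:

```python
# Illustrative comparison of egress processing with and without
# penultimate hop popping (section 2.16), for a packet whose label
# stack holds a single label.  Returns the number of lookups the
# egress must perform.

def egress_without_php(stack):
    lookups = 1          # label lookup: "this label means I am the egress"
    stack = stack[:-1]   # pop the stack ...
    lookups += 1         # ... then a second lookup on what remains
    return lookups       # (another label, or the network layer header)

def egress_with_php(stack):
    # The penultimate node already popped the level m label, so a single
    # lookup (on the exposed label or the IP header) suffices.
    return 1
```

So `egress_without_php([17])` counts two lookups while `egress_with_php([])` counts one, matching the single-lookup "fastpath" motivation above.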
857 It may be asked whether the egress node can always interpret the top 858 label of a received packet properly if penultimate hop popping is 859 used. As long as the uniqueness and scoping rules of section 2.14 860 are obeyed, it is always possible to interpret the top label of a 861 received packet unambiguously. 863 2.17. LSP Next Hop 865 The LSP Next Hop for a particular labeled packet in a particular LSR 866 is the LSR which is the next hop, as selected by the NHLFE entry used 867 for forwarding that packet. 869 The LSP Next Hop for a particular FEC is the next hop as selected by 870 the NHLFE entry indexed by a label which corresponds to that FEC. 872 Note that the LSP Next Hop may differ from the next hop which would 873 be chosen by the network layer routing algorithm. We will use the 874 term "L3 next hop" when we refer to the latter. 876 2.18. Invalid Incoming Labels 878 What should an LSR do if it receives a labeled packet with a 879 particular incoming label, but has no binding for that label? It is 880 tempting to think that the labels can just be removed, and the packet 881 forwarded as an unlabeled IP packet. However, in some cases, doing 882 so could cause a loop. If the upstream LSR thinks the label is bound 883 to an explicit route, and the downstream LSR doesn't think the label 884 is bound to anything, and if the hop by hop routing of the unlabeled 885 IP packet brings the packet back to the upstream LSR, then a loop is 886 formed. 888 It is also possible that the label was intended to represent a route 889 which cannot be inferred from the IP header. 891 Therefore, when a labeled packet is received with an invalid incoming 892 label, it MUST be discarded, UNLESS it is determined by some means 893 (not within the scope of the current document) that forwarding it 894 unlabeled cannot cause any harm. 896 2.19.
LSP Control: Ordered versus Independent 898 Some FECs correspond to address prefixes which are distributed via a 899 dynamic routing algorithm. The setup of the LSPs for these FECs can 900 be done in one of two ways: Independent LSP Control or Ordered LSP 901 Control. 903 In Independent LSP Control, each LSR, upon noting that it recognizes 904 a particular FEC, makes an independent decision to bind a label to 905 that FEC and to distribute that binding to its LDP peers. This 906 corresponds to the way that conventional IP datagram routing works; 907 each node makes an independent decision as to how to treat each 908 packet, and relies on the routing algorithm to converge rapidly so as 909 to ensure that each datagram is correctly delivered. 911 In Ordered LSP Control, an LSR only binds a label to a particular FEC 912 if it is the egress LSR for that FEC, or if it has already received a 913 label binding for that FEC from its next hop for that FEC. 915 If one wants to ensure that traffic in a particular FEC follows a 916 path with some specified set of properties (e.g., that the traffic 917 does not traverse any node twice, that a specified amount of 918 resources are available to the traffic, that the traffic follows an 919 explicitly specified path, etc.) ordered control must be used. With 920 independent control, some LSRs may begin label switching traffic in 921 the FEC before the LSP is completely set up, and thus some traffic in 922 the FEC may follow a path which does not have the specified set of 923 properties. Ordered control also needs to be used if the recognition 924 of the FEC is a consequence of the setting up of the corresponding 925 LSP. 927 Ordered LSP setup may be initiated either by the ingress or the 928 egress. 930 Ordered control and independent control are fully interoperable.
931 However, unless all LSRs in an LSP are using ordered control, the 932 overall effect on network behavior is largely that of independent 933 control, since one cannot be sure that an LSP is not used until it is 934 fully set up. 936 This architecture allows the choice between independent control and 937 ordered control to be a local matter. Since the two methods 938 interwork, a given LSR need support only one or the other. Generally 939 speaking, the choice of independent versus ordered control does not 940 appear to have any effect on the LDP mechanisms which need to be 941 defined. 943 2.20. Aggregation 945 One way of partitioning traffic into FECs is to create a separate FEC 946 for each address prefix which appears in the routing table. However, 947 within a particular MPLS domain, this may result in a set of FECs 948 such that all traffic in all those FECs follows the same route. For 949 example, a set of distinct address prefixes might all have the same 950 egress node, and label swapping might be used only to get the 951 traffic to the egress node. In this case, within the MPLS domain, 952 the union of those FECs is itself a FEC. This creates a choice: 953 should a distinct label be bound to each component FEC, or should a 954 single label be bound to the union, and that label applied to all 955 traffic in the union? 957 The procedure of binding a single label to a union of FECs which is 958 itself a FEC (within some domain), and of applying that label to all 959 traffic in the union, is known as "aggregation". The MPLS 960 architecture allows aggregation. Aggregation may reduce the number 961 of labels which are needed to handle a particular set of packets, and 962 may also reduce the amount of LDP control traffic needed. 964 Given a set of FECs which are "aggregatable" into a single FEC, it is 965 possible to (a) aggregate them into a single FEC, (b) aggregate them 966 into a set of FECs, or (c) not aggregate them at all.
Thus we can 967 speak of the "granularity" of aggregation, with (a) being the 968 "coarsest granularity", and (c) being the "finest granularity". 970 When ordered control is used, each LSR should adopt, for a given set of 971 FECs, the granularity used by its next hop for those FECs. 973 When independent control is used, it is possible that there will be 974 two adjacent LSRs, Ru and Rd, which aggregate some set of FECs 975 differently. 977 If Ru has finer granularity than Rd, this does not cause a problem. 978 Ru distributes more labels for that set of FECs than Rd does. This 979 means that when Ru needs to forward labeled packets in those FECs to 980 Rd, it may need to map n labels into m labels, where n > m. As an 981 option, Ru may withdraw the set of n labels that it has distributed, 982 and then distribute a set of m labels, corresponding to Rd's level of 983 granularity. This is not necessary to ensure correct operation, but 984 it does result in a reduction of the number of labels distributed by 985 Ru, and Ru is not gaining any particular advantage by distributing 986 the larger number of labels. The decision whether to do this or not 987 is a local matter. 989 If Ru has coarser granularity than Rd (i.e., Rd has distributed n 990 labels for the set of FECs, while Ru has distributed m, where n > m), 991 it has two choices: 993 - It may adopt Rd's finer level of granularity. This would require 994 it to withdraw the m labels it has distributed, and distribute n 995 labels. This is the preferred option. 997 - It may simply map its m labels into a subset of Rd's n labels, if 998 it can determine that this will produce the same routing. For 999 example, suppose that Ru applies a single label to all traffic 1000 that needs to pass through a certain egress LSR, whereas Rd binds 1001 a number of different labels to such traffic, depending on the 1002 individual destination addresses of the packets.
If Ru knows the 1003 address of the egress router, and if Rd has bound a label to the 1004 FEC which is identified by that address, then Ru can simply apply 1005 that label. 1007 In any event, every LSR needs to know (by configuration) what 1008 granularity to use for labels that it assigns. Where ordered control 1009 is used, this requires each node to know the granularity only for 1010 FECs which leave the MPLS network at that node. For independent 1011 control, best results may be obtained by ensuring that all LSRs are 1012 consistently configured to know the granularity for each FEC. 1013 However, in many cases this may be done by using a single level of 1014 granularity which applies to all FECs (such as "one label per IP 1015 prefix in the forwarding table", or "one label per egress node"). 1017 2.21. Route Selection 1019 Route selection refers to the method used for selecting the LSP for a 1020 particular FEC. The proposed MPLS protocol architecture supports two 1021 options for Route Selection: (1) hop by hop routing, and (2) explicit 1022 routing. 1024 Hop by hop routing allows each node to independently choose the next 1025 hop for each FEC. This is the usual mode today in existing IP 1026 networks. A "hop by hop routed LSP" is an LSP whose route is selected 1027 using hop by hop routing. 1029 In an explicitly routed LSP, each LSR does not independently choose 1030 the next hop; rather, a single LSR, generally the LSP ingress or the 1031 LSP egress, specifies several (or all) of the LSRs in the LSP. If a 1032 single LSR specifies the entire LSP, the LSP is "strictly" explicitly 1033 routed. If a single LSR specifies only some of the LSP, the LSP is 1034 "loosely" explicitly routed. 
1036 The sequence of LSRs followed by an explicitly routed LSP may be 1037 chosen by configuration, or may be selected dynamically by a single 1038 node (for example, the egress node may make use of the topological 1039 information learned from a link state database in order to compute 1040 the entire path for the tree ending at that egress node). 1042 Explicit routing may be useful for a number of purposes such as 1043 policy routing or traffic engineering. With MPLS the explicit route 1044 needs to be specified at the time that labels are assigned, but the 1045 explicit route does not have to be specified with each IP packet. 1046 This makes MPLS explicit routing much more efficient than the 1047 alternative of IP source routing. 1049 When an LSP is explicitly routed (either loosely or strictly), it is 1050 essential that packets travelling along the LSP reach its end before 1051 they revert to hop by hop routing. Otherwise inconsistent routing 1052 and loops might form. 1054 It is not necessary for a node to be able to create an explicit 1055 route. However, in order to ensure interoperability it is necessary 1056 to ensure that either (i) Every node knows how to use hop by hop 1057 routing; or (ii) Every node knows how to create and follow an 1058 explicit route. We propose that due to the common use of hop by hop 1059 routing in networks today, it is reasonable to make hop by hop 1060 routing the default that all nodes need to be able to use. 1062 2.22. Time-to-Live (TTL) 1064 In conventional IP forwarding, each packet carries a "Time To Live" 1065 (TTL) value in its header. Whenever a packet passes through a 1066 router, its TTL gets decremented by 1; if the TTL reaches 0 before 1067 the packet has reached its destination, the packet gets discarded. 1069 This provides some level of protection against forwarding loops that 1070 may exist due to misconfigurations, or due to failure or slow 1071 convergence of the routing algorithm. 
TTL is sometimes used for other 1072 functions as well, such as multicast scoping, and supporting the 1073 "traceroute" command. This implies that there are two TTL-related 1074 issues that MPLS needs to deal with: (i) TTL as a way to suppress 1075 loops; (ii) TTL as a way to accomplish other functions, such as 1076 limiting the scope of a packet. 1078 When a packet travels along an LSP, it SHOULD emerge with the same 1079 TTL value that it would have had if it had traversed the same 1080 sequence of routers without having been label switched. If the 1081 packet travels along a hierarchy of LSPs, the total number of LSR- 1082 hops traversed SHOULD be reflected in its TTL value when it emerges 1083 from the hierarchy of LSPs. 1085 The way that TTL is handled may vary depending upon whether the MPLS 1086 label values are carried in an MPLS-specific "shim" header, or if the 1087 MPLS labels are carried in an L2 header, such as an ATM header or a 1088 frame relay header. 1090 If the label values are encoded in a "shim" that sits between the 1091 data link and network layer headers, then this shim MUST have a TTL 1092 field that SHOULD be initially loaded from the network layer header 1093 TTL field, SHOULD be decremented at each LSR-hop, and SHOULD be 1094 copied into the network layer header TTL field when the packet 1095 emerges from its LSP. 1097 If the label values are encoded in a data link layer header (e.g., 1098 the VPI/VCI field in ATM's AAL5 header), and the labeled packets are 1099 forwarded by an L2 switch (e.g., an ATM switch), and the data link 1100 layer (like ATM) does not itself have a TTL field, then it will not 1101 be possible to decrement a packet's TTL at each LSR-hop. An LSP 1102 segment which consists of a sequence of LSRs that cannot decrement a 1103 packet's TTL will be called a "non-TTL LSP segment". 1105 When a packet emerges from a non-TTL LSP segment, it SHOULD however 1106 be given a TTL that reflects the number of LSR-hops it traversed. 
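The shim-header TTL rules earlier in this section might be sketched as follows. This is an illustration with invented function names, not a specification of the shim format:

```python
# Illustrative sketch of shim TTL handling (section 2.22): the shim TTL
# SHOULD be loaded from the network layer TTL on entry to the LSP,
# decremented at each LSR-hop, and copied back into the network layer
# header when the packet leaves the LSP.

def enter_lsp(ip_ttl):
    return ip_ttl               # shim TTL initially loaded from the IP TTL

def lsr_hop(shim_ttl):
    shim_ttl -= 1               # decremented at each LSR-hop
    if shim_ttl == 0:
        raise ValueError("TTL expired: discard the packet")
    return shim_ttl

def leave_lsp(shim_ttl):
    return shim_ttl             # copied back into the network layer TTL

shim = enter_lsp(64)
for _ in range(3):              # three LSR-hops along the LSP
    shim = lsr_hop(shim)
ip_ttl = leave_lsp(shim)        # 61: the same value the packet would have
                                # had after three conventional IP hops
```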
In 1107 the unicast case, this can be achieved by propagating a meaningful 1108 LSP length to ingress nodes, enabling the ingress to decrement the 1109 TTL value before forwarding packets into a non-TTL LSP segment. 1111 Sometimes it can be determined, upon ingress to a non-TTL LSP 1112 segment, that a particular packet's TTL will expire before the packet 1113 reaches the egress of that non-TTL LSP segment. In this case, the LSR 1114 at the ingress to the non-TTL LSP segment must not label switch the 1115 packet. This means that special procedures must be developed to 1116 support traceroute functionality; for example, traceroute packets may 1117 be forwarded using conventional hop by hop forwarding. 1119 2.23. Loop Control 1121 On a non-TTL LSP segment, by definition, TTL cannot be used to 1122 protect against forwarding loops. The importance of loop control may 1123 depend on the particular hardware being used to provide the LSR 1124 functions along the non-TTL LSP segment. 1126 Suppose, for instance, that ATM switching hardware is being used to 1127 provide MPLS switching functions, with the label being carried in the 1128 VPI/VCI field. Since ATM switching hardware cannot decrement TTL, 1129 there is no protection against loops. If the ATM hardware is capable 1130 of providing fair access to the buffer pool for incoming cells 1131 carrying different VPI/VCI values, this looping may not have any 1132 deleterious effect on other traffic. If the ATM hardware cannot 1133 provide fair buffer access of this sort, however, then even transient 1134 loops may cause severe degradation of the LSR's total performance. 1136 Even if fair buffer access can be provided, it is still worthwhile to 1137 have some means of detecting loops that last "longer than possible". 1138 In addition, even where TTL and/or per-VC fair queuing provides a 1139 means for surviving loops, it still may be desirable where practical 1140 to avoid setting up LSPs which loop.
1142 The MPLS architecture will therefore provide a technique for ensuring 1143 that looping LSP segments can be detected, and a technique for 1144 ensuring that looping LSP segments are never created. 1146 All LSRs will be required to support a common technique for loop 1147 detection. Support for the loop prevention technique is optional, 1148 though it is recommended in ATM-LSRs that have no other way to 1149 protect themselves against the effects of looping data packets. Use 1150 of the loop prevention technique, when supported, is optional. 1152 The loop prevention technique presupposes the use of Ordered LSP 1153 Control. The loop detection technique, on the other hand, works with 1154 either Independent or Ordered LSP Control. 1156 2.23.1. Loop Prevention 1158 NOTE: The loop prevention technique described here is being 1159 reconsidered, and may be changed. 1161 LSRs maintain for each of their LSPs an LSR id list. This list is a 1162 list of all the LSRs downstream from this LSR on a given LSP. The 1163 LSR id list is used to prevent the formation of switched path loops. 1164 The LSR ID list is propagated upstream from a node to its neighbor 1165 nodes. The LSR ID list is used to prevent loops as follows: 1167 When a node, R, detects a change in the next hop for a given FEC, it 1168 asks its new next hop for a label and the associated LSR ID list for 1169 that FEC. 1171 The new next hop responds with a label for the FEC and an associated 1172 LSR id list. 1174 R looks in the LSR id list. If R determines that it, R, is in the 1175 list then we have a route loop. In this case, we do nothing and the 1176 old LSP will continue to be used until the route protocols break the 1177 loop. The means by which the old LSP is replaced by a new LSP after 1178 the route protocols break the loop is described below. 1180 If R is not in the LSR id list, R will start a "diffusion" 1181 computation [12].
The purpose of the diffusion computation is to 1182 prune the tree upstream of R so that we remove all LSRs from the 1183 tree that would be on a looping path if R were to switch over to the 1184 new LSP. After those LSRs are removed from the tree, it is safe for 1185 R to replace the old LSP with the new LSP (and the old LSP can be 1186 released). 1188 The diffusion computation works as follows: 1190 R adds its LSR id to the list and sends a query message to each of 1191 its "upstream" neighbors (i.e., to each of its neighbors that is not 1192 the new "downstream" next hop). 1194 A node S that receives such a query will process the query as 1195 follows: 1197 - If node R is not node S's next hop for the given FEC, node S will 1198 respond to node R with an "OK" message meaning that as far as 1199 node S is concerned it is safe for node R to switch over to the 1200 new LSP. 1202 - If node R is node S's next hop for the FEC, node S will check to 1203 see if it, node S, is in the LSR id list that it received from 1204 node R. If it is, we have a route loop and S will respond with a 1205 "LOOP" message. R will unsplice the connection to S, pruning S 1206 from the tree. The mechanism by which S will get a new LSP for 1207 the FEC after the route protocols break the loop is described 1208 below. 1210 - If node S is not in the LSR id list, S will add its LSR id to the 1211 LSR id list and send a new query message further upstream. The 1212 diffusion computation will continue to propagate upstream along 1213 each of the paths in the tree upstream of S until either a loop 1214 is detected, in which case the node is pruned as described above, 1215 or we get to a point where a node gets a response ("OK" or 1216 "LOOP") from each of its neighbors, perhaps because none of those 1217 neighbors considers the node in question to be its downstream 1218 next hop.
Once a node has received a response from each of its 1219 upstream neighbors, it returns an "OK" message to its downstream 1220 neighbor. When the original node, node R, gets a response from 1221 each of its neighbors, it is safe to replace the old LSP with the 1222 new one because all the paths that would loop have been pruned 1223 from the tree. 1225 There are a couple of details to discuss: 1227 - First, we need to do something about nodes that for one reason or 1228 another do not produce a timely response to a query 1229 message. If a node Y does not respond to a query from node X 1230 because of a failure of some kind, X will not be able to respond 1231 to its downstream neighbors (if any) or switch over to a new LSP 1232 if X is, like R above, the node that has detected the route 1233 change. This problem is handled by timing out the query message. 1234 If a node doesn't receive a response within a "reasonable" period 1235 of time, it "unsplices" its VC to the upstream neighbor that is 1236 not responding and proceeds as it would if it had received the 1237 "LOOP" message. 1239 - We also need to be concerned about multiple concurrent routing 1240 updates. What happens, for example, when a node M receives a 1241 request for an LSP from an upstream neighbor, N, when M is in the 1242 middle of a diffusion computation, i.e., it has sent a query 1243 upstream but hasn't received all the responses? Since a 1244 downstream node, node R, is about to change from one LSP to 1245 another, M needs to pass to N an LSR id list corresponding to the 1246 union of the old and new LSPs if it is to avoid loops both 1247 before and after the transition. This is easily accomplished 1248 since M already has the LSR id list for the old LSP and it gets 1249 the LSR id list for the new LSP in the query message. After R 1250 makes the switch from the old LSP to the new one, R sends a new 1251 establish message upstream with the LSR id list of (just) the new 1252 LSP.
At this point, the nodes upstream of R know that R has 1253 switched over to the new LSP and that they can return the id list 1254 for (just) the new LSP in response to any new requests for LSPs. 1255 They can also grow the tree to include additional nodes that 1256 would not have been valid for the combined LSR id list. 1258 - We also need to discuss how a node that doesn't have an LSP for a 1259 given stream at the end of a diffusion computation (because it 1260 would have been on a looping LSP) gets one after the routing 1261 protocols break the loop. If node L has been pruned from the 1262 tree and its local route protocol processing entity breaks the 1263 loop by changing L's next hop, L will request a new LSP from its 1264 new downstream neighbor, which it will use once it executes the 1265 diffusion computation as described above. If the loop is broken 1266 by a route change at another point in the loop, i.e., at a point 1267 "downstream" of L, L will get a new LSP as the new LSP tree grows 1268 upstream from the point of the route change, as discussed in the 1269 previous paragraph. 1271 - Note that when a node is pruned from the tree, the switched path 1272 upstream of that node remains "connected". This is important 1273 since it allows the switched path to get "reconnected" to a 1274 downstream switched path after a route change with a minimal 1275 amount of unsplicing and resplicing once the appropriate 1276 diffusion computation(s) have taken place. 1278 The LSR Id list can also be used to provide a "loop detection" 1279 capability. To use it in this manner, an LSR which sees that it is 1280 already in the LSR Id list for a particular FEC will immediately 1281 unsplice itself from the switched path for that FEC, and will NOT 1282 pass the LSR Id list further upstream.
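The membership test at the heart of both loop prevention and this loop detection variant might be sketched as follows. This is an illustrative simplification; the queries, "OK"/"LOOP" responses, and the unsplicing of connections are omitted:

```python
# Illustrative sketch of the LSR id list check (sections 2.23.1).  A
# node receiving a label binding looks for its own id in the downstream
# LSR id list; if present, switching over would form a loop, so the old
# LSP is kept.  Otherwise the node prepends its own id and propagates
# the list upstream.

def check_binding(my_id, lsr_id_list):
    """Return the list to pass upstream, or None if a loop was found."""
    if my_id in lsr_id_list:
        return None                   # route loop: keep using the old LSP
    return [my_id] + lsr_id_list      # safe: propagate the extended list
```

For instance, a node R3 receiving the list ["R2", "R1"] would propagate ["R3", "R2", "R1"], whereas a node R2 receiving that same list would detect the loop.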
The LSR can rejoin a switched path for the FEC when it changes its next hop for that FEC, or when it receives from its current next hop a new LSR Id list in which it is not contained. The diffusion computation would be omitted.

2.23.2. Interworking of Loop Control Options

The MPLS protocol architecture allows some nodes to be using loop prevention while other nodes are not (i.e., the choice of whether or not to use loop prevention may be a local decision). When this mix is used, it is not possible for a loop to form which includes only nodes which do loop prevention. However, it is possible for loops to form which contain a combination of some nodes which do loop prevention and some nodes which do not.

There are at least four identified cases in which it makes sense to combine nodes which do loop prevention with nodes which do not: (i) for transition, in intermediate states while transitioning from all non-loop-prevention to all loop prevention, or vice versa; (ii) for interoperability, where one vendor implements loop prevention but another vendor does not; (iii) where there is a mixed ATM and datagram media network, and loop prevention is desired over the ATM portions of the network but not over the datagram portions; (iv) where some of the ATM switches can do fair access to the buffer pool on a per-VC basis and some cannot, and loop prevention is desired over the ATM portions of the network which cannot.

Note that interworking is straightforward. If an LSR is not doing loop prevention, and it receives from a downstream LSR a label binding which contains loop prevention information, it (a) accepts the label binding, (b) does NOT pass the loop prevention information upstream, and (c) informs the downstream neighbor that the path is loop-free.
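A minimal sketch of this interworking behavior follows; the dictionary fields and function name are illustrative only, not actual LDP message syntax:

```python
# Sketch of interworking: an LSR that is NOT doing loop prevention
# receives a label binding that carries loop prevention information.
# Per the rules above, it (a) accepts the binding, (b) strips the
# loop prevention information before passing anything upstream, and
# (c) tells the downstream neighbor that the path is loop-free.

def handle_binding_no_loop_prevention(binding):
    accepted = {"fec": binding["fec"], "label": binding["label"]}  # (a)
    to_upstream = dict(accepted)  # (b) no loop prevention info upstream
    to_downstream = {"fec": binding["fec"], "loop_free": True}     # (c)
    return accepted, to_upstream, to_downstream

b = {"fec": "198.51.100.0/24", "label": 42, "lsr_id_list": ["R3", "R4"]}
accepted, up, down = handle_binding_no_loop_prevention(b)
assert "lsr_id_list" not in up   # loop prevention info is not propagated
assert down == {"fec": "198.51.100.0/24", "loop_free": True}
```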
Similarly, if an LSR R which is doing loop prevention receives from a downstream LSR a label binding which does not contain any loop prevention information, then R passes the label binding upstream with loop prevention information included, as if R were the egress for the specified FEC.

Optionally, a node is permitted to implement the ability either to do or not to do loop prevention, and is permitted to choose which to use for any one particular LSP based on the information obtained from downstream nodes. When the label binding arrives from downstream, the node may choose whether to use loop prevention so as to continue to use the same approach as was used in the information passed to it. Note that regardless of whether loop prevention is used, the egress node (for any particular LSP) always initiates the exchange of label binding information without waiting for other nodes to act.

2.24. Label Encodings

In order to transmit a label stack along with the packet whose label stack it is, it is necessary to define a concrete encoding of the label stack. The architecture supports several different encoding techniques; the choice of encoding technique depends on the particular kind of device being used to forward labeled packets.

2.24.1. MPLS-specific Hardware and/or Software

If one is using MPLS-specific hardware and/or software to forward labeled packets, the most obvious way to encode the label stack is to define a new protocol to be used as a "shim" between the data link layer and network layer headers. This shim would really be just an encapsulation of the network layer packet; it would be "protocol-independent" such that it could be used to encapsulate any network layer. Hence we will refer to it as the "generic MPLS encapsulation".
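This architecture does not fix a bit layout for the encapsulation; as an illustration only, the sketch below assumes the 4-octet label stack entry (20-bit label, 3-bit CoS, 1-bit bottom-of-stack, 8-bit TTL) that was later standardized for the MPLS shim header:

```python
import struct

def pack_entry(label, cos, bottom, ttl):
    # 20-bit label | 3-bit CoS | 1-bit bottom-of-stack | 8-bit TTL,
    # packed into one network-byte-order 32-bit word
    word = (label << 12) | (cos << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)

def unpack_entry(data):
    (word,) = struct.unpack("!I", data)
    return (word >> 12,            # label
            (word >> 9) & 0x7,     # CoS
            bool((word >> 8) & 1), # bottom-of-stack flag
            word & 0xFF)           # TTL

entry = pack_entry(label=17, cos=5, bottom=True, ttl=64)
assert unpack_entry(entry) == (17, 5, True, 64)
```

A label stack is then simply a sequence of such entries, with the bottom-of-stack bit set only on the last one.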
The generic MPLS encapsulation would in turn be encapsulated in a data link layer protocol.

The generic MPLS encapsulation should contain the following fields:

1. the label stack,

2. a Time-to-Live (TTL) field,

3. a Class of Service (CoS) field.

The TTL field permits MPLS to provide a TTL function similar to what is provided by IP.

The CoS field permits LSRs to apply various packet scheduling disciplines to labeled packets, without requiring separate labels for separate disciplines.

2.24.2. ATM Switches as LSRs

It will be noted that MPLS forwarding procedures are similar to those of legacy "label swapping" switches such as ATM switches. ATM switches use the input port and the incoming VPI/VCI value as the index into a "cross-connect" table, from which they obtain an output port and an outgoing VPI/VCI value. Therefore, if one or more labels can be encoded directly into the fields which are accessed by these legacy switches, then the legacy switches can, with suitable software upgrades, be used as LSRs. We will refer to such devices as "ATM-LSRs".

There are three obvious ways to encode labels in the ATM cell header (presuming the use of AAL5):

1. SVC Encoding

   Use the VPI/VCI field to encode the label which is at the top of the label stack. This technique can be used in any network. With this encoding technique, each LSP is realized as an ATM SVC, and the LDP becomes the ATM "signaling" protocol. With this encoding technique, the ATM-LSRs cannot perform "push" or "pop" operations on the label stack.

2. SVP Encoding

   Use the VPI field to encode the label which is at the top of the label stack, and the VCI field to encode the second label on the stack, if one is present. This technique has some advantages over the previous one, in that it permits the use of ATM "VP-switching".
   That is, the LSPs are realized as ATM SVPs, with LDP serving as the ATM signaling protocol.

   However, this technique cannot always be used. If the network includes an ATM Virtual Path through a non-MPLS ATM network, then the VPI field is not necessarily available for use by MPLS.

   When this encoding technique is used, the ATM-LSR at the egress of the VP effectively does a "pop" operation.

3. SVP Multipoint Encoding

   Use the VPI field to encode the label which is at the top of the label stack, use part of the VCI field to encode the second label on the stack, if one is present, and use the remainder of the VCI field to identify the LSP ingress. If this technique is used, conventional ATM VP-switching capabilities can be used to provide multipoint-to-point VPs. Cells from different packets will then carry different VCI values. As we shall see in section 2.25, this enables us to do label merging, without running into any cell interleaving problems, on ATM switches which can provide multipoint-to-point VPs, but which do not have the VC merge capability.

   This technique depends on the existence of a capability for assigning small unique values to each ATM switch.

If there are more labels on the stack than can be encoded in the ATM header, the ATM encodings must be combined with the generic encapsulation.

2.24.3. Interoperability among Encoding Techniques

If <R1, R2, R3> is a segment of an LSP, it is possible that R1 will use one encoding of the label stack when transmitting packet P to R2, but R2 will use a different encoding when transmitting packet P to R3. In general, the MPLS architecture supports LSPs with different label stack encodings used on different hops. Therefore, when we discuss the procedures for processing a labeled packet, we speak in abstract terms of operating on the packet's label stack.
When a labeled packet is received, the LSR must decode it to determine the current value of the label stack, then must operate on the label stack to determine the new value of the stack, and then encode the new value appropriately before transmitting the labeled packet to its next hop.

Unfortunately, ATM switches have no capability for translating from one encoding technique to another. The MPLS architecture therefore requires that whenever it is possible for two ATM switches to be successive LSRs along a level m LSP for some packet, those two ATM switches use the same encoding technique.

Naturally there will be MPLS networks which contain a combination of ATM switches operating as LSRs and other LSRs which operate using an MPLS shim header. In such networks there may be some LSRs which have ATM interfaces as well as "MPLS Shim" interfaces. This is one example of an LSR with different label stack encodings on different hops. Such an LSR may swap off an ATM encoded label stack on an incoming interface and replace it with an MPLS shim header encoded label stack on the outgoing interface.

2.25. Label Merging

Suppose that an LSR has bound multiple incoming labels to a particular FEC. When forwarding packets in that FEC, one would like to have a single outgoing label which is applied to all such packets. The fact that two different packets in the FEC arrived with different incoming labels is irrelevant; one would like to forward them with the same outgoing label. The capability to do so is known as "label merging".

Let us say that an LSR is capable of label merging if it can receive two packets from different incoming interfaces, and/or with different labels, and send both packets out the same outgoing interface with the same label.
Once the packets are transmitted, the information that they arrived from different interfaces and/or with different incoming labels is lost.

Let us say that an LSR is not capable of label merging if, for any two packets which arrive from different interfaces or with different labels, the packets must either be transmitted out different interfaces or must have different labels.

Label merging would be a requirement of the MPLS architecture, if not for the fact that ATM-LSRs using the SVC or SVP Encodings cannot perform label merging. This is discussed in more detail in the next section.

If a particular LSR cannot perform label merging, then if two packets in the same FEC arrive with different incoming labels, they must be forwarded with different outgoing labels. With label merging, the number of outgoing labels per FEC need only be 1; without label merging, the number of outgoing labels per FEC could be as large as the number of nodes in the network.

With label merging, the number of incoming labels per FEC that a particular LSR needs is never larger than the number of LDP adjacencies. Without label merging, the number of incoming labels per FEC that a particular LSR needs is as large as the number of upstream nodes which forward traffic in the FEC to the LSR in question. In fact, it is difficult for an LSR to even determine how many such incoming labels it must support for a particular FEC.

The MPLS architecture accommodates both merging and non-merging LSRs, but allows for the fact that there may be LSRs which do not support label merging. This leads to the issue of ensuring correct interoperation between merging LSRs and non-merging LSRs. The issue is somewhat different in the case of datagram media versus the case of ATM. The different media types will therefore be discussed separately.

2.25.1. Non-merging LSRs

The MPLS forwarding procedures are very similar to the forwarding procedures used by such technologies as ATM and Frame Relay. That is, a unit of data arrives, a label (VPI/VCI or DLCI) is looked up in a "cross-connect table", on the basis of that lookup an output port is chosen, and the label value is rewritten. In fact, it is possible to use such technologies for MPLS forwarding; LDP can be used as the "signalling protocol" for setting up the cross-connect tables.

Unfortunately, these technologies do not necessarily support the label merging capability. In ATM, if one attempts to perform label merging, the result may be the interleaving of cells from various packets. If cells from different packets get interleaved, it is impossible to reassemble the packets. Some Frame Relay switches use cell switching on their backplanes. These switches may also be incapable of supporting label merging, for the same reason -- cells of different packets may get interleaved, and there is then no way to reassemble the packets.

We propose to support two solutions to this problem. First, MPLS will contain procedures which allow the use of non-merging LSRs. Second, MPLS will support procedures which allow certain ATM switches to function as merging LSRs.

Since MPLS supports both merging and non-merging LSRs, MPLS also contains procedures to ensure correct interoperation between them.

2.25.2. Labels for Merging and Non-Merging LSRs

An upstream LSR which supports label merging needs to be sent only one label per FEC. An upstream neighbor which does not support label merging needs to be sent multiple labels per FEC. However, there is no way of knowing a priori how many labels it needs. This will depend on how many LSRs are upstream of it with respect to the FEC in question.
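The dependence on the upstream topology can be illustrated with a small count; the topology, node names, and helper function below are hypothetical:

```python
# Hypothetical count of the labels a node must be granted by its
# downstream neighbor for one FEC: a merging node needs one label,
# while a non-merging node needs one label for its own traffic plus
# one for each label requested by every node upstream of it.

def labels_needed(node, upstream, merging):
    """Labels 'node' must request from its downstream neighbor."""
    if merging[node]:
        return 1  # all upstream traffic is merged onto one label
    # one label for locally originated traffic, plus all the labels
    # requested from above, passed through unmerged
    return 1 + sum(labels_needed(u, upstream, merging)
                   for u in upstream.get(node, []))

# Three leaves A, B, C feed M, which feeds N (the node we ask about).
upstream = {"N": ["M"], "M": ["A", "B", "C"]}
leaves = {"A": True, "B": True, "C": True}

# If every node merges, one label per FEC suffices.
assert labels_needed("N", upstream, {**leaves, "M": True, "N": True}) == 1
# If M and N cannot merge, N needs one label per path through it.
assert labels_needed("N", upstream, {**leaves, "M": False, "N": False}) == 5
```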
In the MPLS architecture, if a particular upstream neighbor does not support label merging, it is not sent any labels for a particular FEC unless it explicitly asks for a label for that FEC. The upstream neighbor may make multiple such requests, and is given a new label each time. When a downstream neighbor receives such a request from upstream, and the downstream neighbor does not itself support label merging, then it must in turn ask its downstream neighbor for another label for the FEC in question.

It is possible that there may be some nodes which support label merging, but can only merge a limited number of incoming labels into a single outgoing label. Suppose, for example, that due to some hardware limitation a node is capable of merging four incoming labels into a single outgoing label. Suppose, however, that this particular node has six incoming labels arriving at it for a particular FEC. In this case, this node may merge these into two outgoing labels.

Whether label merging is applicable to explicitly routed LSPs is for further study.

2.25.3. Merge over ATM

2.25.3.1. Methods of Eliminating Cell Interleave

There are several methods that can be used to eliminate the cell interleaving problem in ATM, thereby allowing ATM switches to support stream merge:

1. VP merge, using the SVP Multipoint Encoding

   When VP merge is used, multiple virtual paths are merged into a single virtual path, but packets from different sources are distinguished by using different VCs within the VP.

2. VC merge

   When VC merge is used, switches are required to buffer cells from one packet until the entire packet is received (this may be determined by looking for the AAL5 end-of-frame indicator).

VP merge has the advantage that it is compatible with a higher percentage of existing ATM switch implementations.
This makes it more likely that VP merge can be used in existing networks. Unlike VC merge, VP merge does not incur any delays at the merge points and also does not impose any buffer requirements. However, it has the disadvantage that it requires coordination of the VCI space within each VP. There are a number of ways that this can be accomplished. Selection of one or more methods is for further study.

This tradeoff between compatibility with existing equipment versus protocol complexity and scalability implies that it is desirable for the MPLS protocol to support both VP merge and VC merge. In order to do so, each ATM switch participating in MPLS needs to know whether its immediate ATM neighbors perform VP merge, VC merge, or no merge.

2.25.3.2. Interoperation: VC Merge, VP Merge, and Non-Merge

The interoperation of the various forms of merging over ATM is most easily described by first describing the interoperation of VC merge with non-merge.

In the case where VC merge and non-merge nodes are interconnected, the forwarding of cells is based in all cases on a VC (i.e., the concatenation of the VPI and VCI). For each node, if an upstream neighbor is doing VC merge then that upstream neighbor requires only a single VPI/VCI for a particular stream (this is analogous to the requirement for a single label in the case of operation over frame media). If the upstream neighbor is not doing merge, then the neighbor will require a single VPI/VCI per stream for itself, plus enough VPI/VCIs to pass to its upstream neighbors. The number required will be determined by allowing the upstream nodes to request additional VPI/VCIs from their downstream neighbors (this is again analogous to the method used with frame merge).

A similar method is possible to support nodes which perform VP merge.
In this case the VP merge node, rather than requesting a single VPI/VCI or a number of VPI/VCIs from its downstream neighbor, instead may request a single VP (identified by a VPI) but several VCIs within the VP. Furthermore, suppose that a non-merge node is downstream from two different VP merge nodes. This node may need to request one VPI/VCI (for traffic originating from itself) plus two VPs (one for each upstream node), each associated with a specified set of VCIs (as requested from the upstream node).

In order to support all of VP merge, VC merge, and non-merge, it is therefore necessary to allow upstream nodes to request a combination of zero or more VC identifiers (consisting of a VPI/VCI), plus zero or more VPs (identified by VPIs), each containing a specified number of VCs (identified by a set of VCIs which are significant within a VP). VP merge nodes would therefore request one VP, with a contained VCI for traffic that they originate (if appropriate), plus a VCI for each VC requested from above (regardless of whether or not the VC is part of a containing VP). VC merge nodes would request only a single VPI/VCI (since they can merge all upstream traffic into a single VC). Non-merge nodes would pass on any requests that they get from above, plus request a VPI/VCI for traffic that they originate (if appropriate).

2.26. Tunnels and Hierarchy

Sometimes a router Ru takes explicit action to cause a particular packet to be delivered to another router Rd, even though Ru and Rd are not consecutive routers on the Hop-by-hop path for that packet, and Rd is not the packet's ultimate destination. For example, this may be done by encapsulating the packet inside a network layer packet whose destination address is the address of Rd itself. This creates a "tunnel" from Ru to Rd.
We refer to any packet so handled as a "Tunneled Packet".

2.26.1. Hop-by-Hop Routed Tunnel

If a Tunneled Packet follows the Hop-by-hop path from Ru to Rd, we say that it is in a "Hop-by-Hop Routed Tunnel" whose "transmit endpoint" is Ru and whose "receive endpoint" is Rd.

2.26.2. Explicitly Routed Tunnel

If a Tunneled Packet travels from Ru to Rd over a path other than the Hop-by-hop path, we say that it is in an "Explicitly Routed Tunnel" whose "transmit endpoint" is Ru and whose "receive endpoint" is Rd. For example, we might send a packet through an Explicitly Routed Tunnel by encapsulating it in a packet which is source routed.

2.26.3. LSP Tunnels

It is possible to implement a tunnel as an LSP, and use label switching rather than network layer encapsulation to cause the packet to travel through the tunnel. The tunnel would be an LSP <R1, ..., Rn>, where R1 is the transmit endpoint of the tunnel, and Rn is the receive endpoint of the tunnel. This is called an "LSP Tunnel".

The set of packets which are to be sent through the LSP tunnel constitutes a FEC, and each LSR in the tunnel must assign a label to that FEC (i.e., must assign a label to the tunnel). The criteria for assigning a particular packet to an LSP tunnel are a local matter at the tunnel's transmit endpoint. To put a packet into an LSP tunnel, the transmit endpoint pushes a label for the tunnel onto the label stack and sends the labeled packet to the next hop in the tunnel.

If it is not necessary for the tunnel's receive endpoint to be able to determine which packets it receives through the tunnel, as discussed earlier, the label stack may be popped at the penultimate LSR in the tunnel.

A "Hop-by-Hop Routed LSP Tunnel" is a Tunnel that is implemented as a hop-by-hop routed LSP between the transmit endpoint and the receive endpoint.
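The label stack operations involved in an LSP tunnel can be sketched as follows, with hypothetical label values; the transmit endpoint pushes the tunnel label, interior LSRs swap the top label, and the penultimate LSR pops:

```python
# Illustrative label stack manipulations along an LSP tunnel.

def enter_tunnel(stack, tunnel_label):
    # transmit endpoint: push the tunnel's label onto the stack
    return stack + [tunnel_label]

def swap_top(stack, new_label):
    # interior tunnel LSR: replace the top label with the outgoing one
    return stack[:-1] + [new_label]

def penultimate_pop(stack):
    # penultimate LSR: pop so the receive endpoint sees only the
    # lower-level label(s)
    return stack[:-1]

stack = ["L_fec"]                 # level 1 label toward the FEC
stack = enter_tunnel(stack, 101)  # enter the LSP tunnel (level 2 label)
stack = swap_top(stack, 102)      # swapped at an interior tunnel LSR
stack = penultimate_pop(stack)    # popped before the receive endpoint
assert stack == ["L_fec"]         # on exit, only the level 1 label remains
```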
An "Explicitly Routed LSP Tunnel" is an LSP Tunnel that is also an Explicitly Routed LSP.

2.26.4. Hierarchy: LSP Tunnels within LSPs

Consider an LSP <R1, R2, R3, R4>. Let us suppose that R1 receives unlabeled packet P, and pushes on its label stack the label to cause it to follow this path, and that this is in fact the Hop-by-hop path. However, let us further suppose that R2 and R3 are not directly connected, but are "neighbors" by virtue of being the endpoints of an LSP tunnel. So the actual sequence of LSRs traversed by P is <R1, R2, R21, R22, R23, R3, R4>.

When P travels from R1 to R2, it will have a label stack of depth 1. R2, switching on the label, determines that P must enter the tunnel. R2 first replaces the incoming label with a label that is meaningful to R3. Then it pushes on a new label. This level 2 label has a value which is meaningful to R21. Switching is done on the level 2 label by R21, R22, and R23. R23, which is the penultimate hop in the R2-R3 tunnel, pops the label stack before forwarding the packet to R3. When R3 sees packet P, P has only a level 1 label, having now exited the tunnel. Since R3 is the penultimate hop in P's level 1 LSP, it pops the label stack, and R4 receives P unlabeled.

The label stack mechanism allows LSP tunneling to nest to any depth.

2.26.5. LDP Peering and Hierarchy

Suppose that packet P travels along a Level 1 LSP <R1, R2, R3, R4>, and when going from R2 to R3 travels along a Level 2 LSP <R2, R21, R22, R3>. From the perspective of the Level 2 LSP, R2's LDP peer is R21. From the perspective of the Level 1 LSP, R2's LDP peers are R1 and R3. One can have LDP peers at each layer of hierarchy. We will see in sections 3.6 and 3.7 some ways to make use of this hierarchy. Note that in this example, R2 and R21 must be IGP neighbors, but R2 and R3 need not be.

When two LSRs are IGP neighbors, we will refer to them as "Local LDP Peers".
When two LSRs may be LDP peers but are not IGP neighbors, we will refer to them as "Remote LDP Peers". In the above example, R2 and R21 are local LDP peers, but R2 and R3 are remote LDP peers.

The MPLS architecture supports two ways to distribute labels at different layers of the hierarchy: Explicit Peering and Implicit Peering.

One performs label distribution with one's Local LDP Peers by opening LDP connections to them. One can perform label distribution with one's Remote LDP Peers in one of two ways:

1. Explicit Peering

   In explicit peering, one sets up LDP connections between Remote LDP Peers, exactly as one would do for Local LDP Peers. This technique is most useful when the number of Remote LDP Peers is small, or the number of higher level label bindings is large, or the Remote LDP Peers are in distinct routing areas or domains. Of course, one needs to know which labels to distribute to which peers; this is addressed in section 3.1.2.

   Examples of the use of explicit peering are found in sections 3.2.1 and 3.6.

2. Implicit Peering

   In implicit peering, one does not have LDP connections to one's remote LDP peers, but only to one's local LDP peers. To distribute higher level labels to one's remote LDP peers, one encodes the higher level labels as an attribute of the lower level labels, and distributes the lower level label, along with this attribute, to the local LDP peers. The local LDP peers then propagate the information to their peers. This process continues until the information reaches the remote LDP peers. Note that the intermediary nodes may also be remote LDP peers.

   This technique is most useful when the number of Remote LDP Peers is large.
Implicit peering does not require an n-square peering mesh to distribute labels to the remote LDP peers, because the information is piggybacked through the local LDP peering. However, implicit peering requires the intermediate nodes to store information that they might not be directly interested in.

An example of the use of implicit peering is found in section 3.3.

2.27. LDP Transport

LDP is used between nodes in an MPLS network to establish and maintain the label bindings. In order for LDP to operate correctly, LDP information needs to be transmitted reliably, and the LDP messages pertaining to a particular FEC need to be transmitted in sequence. Flow control is also required, as is the capability to carry multiple LDP messages in a single datagram.

These goals will be met by using TCP as the underlying transport for LDP.

(The use of multicast techniques to distribute label bindings is for further study.)

2.28. Multicast

This section is for further study.

3. Some Applications of MPLS

3.1. MPLS and Hop by Hop Routed Traffic

One use of MPLS is to simplify the process of forwarding packets using hop by hop routing.

3.1.1. Labels for Address Prefixes

In general, router R determines the next hop for packet P by finding the address prefix X in its routing table which is the longest match for P's destination address. That is, the packets in a given FEC are just those packets which match a given address prefix in R's routing table. In this case, a FEC can be identified with an address prefix.

If packet P must traverse a sequence of routers, and at each router in the sequence P matches the same address prefix, MPLS simplifies the forwarding process by enabling all routers but the first to avoid executing the best match algorithm; they need only look up the label.

3.1.2. Distributing Labels for Address Prefixes

3.1.2.1. LDP Peers for a Particular Address Prefix

LSRs R1 and R2 are considered to be LDP Peers for address prefix X if and only if one of the following conditions holds:

1. R1's route to X is a route which it learned about via a particular instance of a particular IGP, and R2 is a neighbor of R1 in that instance of that IGP

2. R1's route to X is a route which it learned about by some instance of routing algorithm A1, and that route is redistributed into an instance of routing algorithm A2, and R2 is a neighbor of R1 in that instance of A2

3. R1 is the receive endpoint of an LSP Tunnel that is within another LSP, and R2 is a transmit endpoint of that tunnel, and R1 and R2 are participants in a common instance of an IGP, and are in the same IGP area (if the IGP in question has areas), and R1's route to X was learned via that IGP instance, or is redistributed by R1 into that IGP instance

4. R1's route to X is a route which it learned about via BGP, and R2 is a BGP peer of R1

In general, these rules ensure that if the route to a particular address prefix is distributed via an IGP, the LDP peers for that address prefix are the IGP neighbors. If the route to a particular address prefix is distributed via BGP, the LDP peers for that address prefix are the BGP peers. In other cases of LSP tunneling, the tunnel endpoints are LDP peers.

3.1.2.2. Distributing Labels

In order to use MPLS for the forwarding of normally routed traffic, each LSR MUST:

1. bind one or more labels to each address prefix that appears in its routing table;

2. for each such address prefix X, use an LDP to distribute the binding of a label to X to each of its LDP Peers for X.
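The per-prefix binding and the single best-match lookup at the ingress can be sketched as follows; the prefixes, labels, and next hops are hypothetical (drawn from documentation address ranges):

```python
import ipaddress

# Hypothetical sketch: an ingress LSR runs the longest-prefix-match
# algorithm once to pick the FEC and its bound label; subsequent LSRs
# forward on the label alone.

routing_table = {                       # prefix -> next hop
    "198.51.100.0/24": "R2",
    "198.51.100.0/25": "R3",
}
label_bindings = {                      # one label bound per prefix (FEC)
    "198.51.100.0/24": 20,
    "198.51.100.0/25": 21,
}

def longest_match(dest):
    addr = ipaddress.ip_address(dest)
    matches = [p for p in routing_table
               if addr in ipaddress.ip_network(p)]
    return max(matches, key=lambda p: ipaddress.ip_network(p).prefixlen)

def ingress_label(dest):
    # only the first router executes the best-match algorithm;
    # the label it pushes identifies the FEC for all later hops
    return label_bindings[longest_match(dest)]

assert longest_match("198.51.100.7") == "198.51.100.0/25"
assert ingress_label("198.51.100.7") == 21
assert ingress_label("198.51.100.200") == 20
```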
There is also one circumstance in which an LSR must distribute a label binding for an address prefix, even if it is not the LSR which bound that label to that address prefix:

3. If R1 uses BGP to distribute a route to X, naming some other LSR R2 as the BGP Next Hop to X, and if R1 knows that R2 has assigned label L to X, then R1 must distribute the binding between L and X to any BGP peer to which it distributes that route.

These rules ensure that labels corresponding to address prefixes which correspond to BGP routes are distributed to IGP neighbors if and only if the BGP routes are distributed into the IGP. Otherwise, the labels bound to BGP routes are distributed only to the other BGP speakers.

These rules are intended only to indicate which label bindings must be distributed by a given LSR to which other LSRs.

3.1.3. Using the Hop by Hop path as the LSP

If the hop-by-hop path that packet P needs to follow is <R1, ..., Rn>, then <R1, ..., Rn> can be an LSP as long as:

1. there is a single address prefix X, such that, for all i, 1<=i ...

..., and the Hop-by-hop path for P2 is <R4, R2, R3>. Let's suppose that R3 binds label L3 to X, and distributes this binding to R2. R2 binds label L2 to X, and distributes this binding to both R1 and R4. When R2 receives packet P1, its incoming label will be L2. R2 will overwrite L2 with L3, and send P1 to R3. When R2 receives packet P2, its incoming label will also be L2. R2 again overwrites L2 with L3, and sends P2 on to R3.

Note then that when P1 and P2 are traveling from R2 to R3, they carry the same label, and as far as MPLS is concerned, they cannot be distinguished. Thus instead of talking about two distinct LSPs, <R1, R2, R3> and <R4, R2, R3>, we might talk of a single "Multipoint-to-Point LSP Tree", which we might denote as <{R1, R4}, R2, R3>.

This creates a difficulty when we attempt to use conventional ATM switches as LSRs.
Since conventional ATM switches do not support 2165 multipoint-to-point connections, there must be procedures to ensure 2166 that each LSP is realized as a point-to-point VC. However, if ATM 2167 switches which do support multipoint-to-point VCs are in use, then 2168 the LSPs can be most efficiently realized as multipoint-to-point VCs. 2169 Alternatively, if the SVP Multipoint Encoding (section 2.24.2) can be 2170 used, the LSPs can be realized as multipoint-to-point SVPs. 2172 3.6. LSP Tunneling between BGP Border Routers 2174 Consider the case of an Autonomous System, A, which carries transit 2175 traffic between other Autonomous Systems. Autonomous System A will 2176 have a number of BGP Border Routers, and a mesh of BGP connections 2177 among them, over which BGP routes are distributed. In many such 2178 cases, it is desirable to avoid distributing the BGP routes to 2179 routers which are not BGP Border Routers. If this can be avoided, 2180 the "route distribution load" on those routers is significantly 2181 reduced. However, there must be some means of ensuring that the 2182 transit traffic will be delivered from Border Router to Border Router 2183 by the interior routers. 2185 This can easily be done by means of LSP Tunnels. Suppose that BGP 2186 routes are distributed only to BGP Border Routers, and not to the 2187 interior routers that lie along the Hop-by-hop path from Border 2188 Router to Border Router. LSP Tunnels can then be used as follows: 2190 1. Each BGP Border Router distributes, to every other BGP Border 2191 Router in the same Autonomous System, a label for each address 2192 prefix that it distributes to that router via BGP. 2194 2. The IGP for the Autonomous System maintains a host route for 2195 each BGP Border Router. Each interior router distributes its 2196 labels for these host routes to each of its IGP neighbors. 2198 3. 
Suppose that: 2200 a) BGP Border Router B1 receives an unlabeled packet P, 2202 b) address prefix X in B1's routing table is the longest 2203 match for the destination address of P, 2205 c) the route to X is a BGP route, 2207 d) the BGP Next Hop for X is B2, 2209 e) B2 has bound label L1 to X, and has distributed this 2210 binding to B1, 2212 f) the IGP next hop for the address of B2 is I1, 2214 g) the address of B2 is in B1's and I1's IGP routing tables 2215 as a host route, and 2217 h) I1 has bound label L2 to the address of B2, and 2218 distributed this binding to B1. 2220 Then before sending packet P to I1, B1 must create a label 2221 stack for P, then push on label L1, and then push on label L2. 2223 4. Suppose that BGP Border Router B1 receives a labeled Packet P, 2224 where the label on the top of the label stack corresponds to an 2225 address prefix, X, to which the route is a BGP route, and that 2226 conditions 3b, 3c, 3d, and 3e all hold. Then before sending 2227 packet P to I1, B1 must replace the label at the top of the 2228 label stack with L1, and then push on label L2. 2230 With these procedures, a given packet P follows a level 1 LSP all of 2231 whose members are BGP Border Routers, and between each pair of BGP 2232 Border Routers in the level 1 LSP, it follows a level 2 LSP. 2234 These procedures effectively create a Hop-by-Hop Routed LSP Tunnel 2235 between the BGP Border Routers. 2237 Since the BGP border routers are exchanging label bindings for 2238 address prefixes that are not even known to the IGP routing, the BGP 2239 routers should become explicit LDP peers with each other. 2241 3.7. Other Uses of Hop-by-Hop Routed LSP Tunnels 2243 The use of Hop-by-Hop Routed LSP Tunnels is not restricted to tunnels 2244 between BGP Next Hops. Any situation in which one might otherwise 2245 have used an encapsulation tunnel is one in which it is appropriate 2246 to use a Hop-by-Hop Routed LSP Tunnel. 
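The label stack manipulations performed by B1 in steps 3 and 4 of section 3.6 above can be sketched as follows. The function and the list-based stack representation are hypothetical; L1 is the label B2 bound to X, and L2 is the label I1 bound to the host route for B2, as in the text.

```python
# Hypothetical sketch of BGP Border Router B1's label-stack handling
# (section 3.6). The stack is a list of labels, top of stack first.

def b1_transmit(stack, l1, l2):
    """Prepare a packet's label stack before B1 sends it to I1.

    l1 -- label bound to prefix X by the BGP Next Hop B2
    l2 -- label bound to B2's host route by the IGP next hop I1
    """
    if not stack:                 # step 3: packet arrived unlabeled
        stack = [l1]              # push the BGP-learned label for X
    else:                         # step 4: packet arrived labeled
        stack = [l1] + stack[1:]  # replace the top label with L1
    return [l2] + stack           # then push the IGP-learned label L2

# Unlabeled packet: stack becomes [L2, L1].
assert b1_transmit([], "L1", "L2") == ["L2", "L1"]
# Labeled packet: top label replaced by L1, then L2 pushed.
assert b1_transmit(["Lold", "Lx"], "L1", "L2") == ["L2", "L1", "Lx"]
```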
Instead of encapsulating the 2247 packet with a new header whose destination address is the address of 2248 the tunnel's receive endpoint, the label corresponding to the address 2249 prefix which is the longest match for the address of the tunnel's 2250 receive endpoint is pushed on the packet's label stack. The packet 2251 which is sent into the tunnel may or may not already be labeled. 2253 If the transmit endpoint of the tunnel wishes to put a labeled packet 2254 into the tunnel, it must first replace the label value at the top of 2255 the stack with a label value that was distributed to it by the 2256 tunnel's receive endpoint. Then it must push on the label which 2257 corresponds to the tunnel itself, as distributed to it by the next 2258 hop along the tunnel. To allow this, the tunnel endpoints should be 2259 explicit LDP peers. The label bindings they need to exchange are of 2260 no interest to the LSRs along the tunnel. 2262 3.8. MPLS and Multicast 2264 Multicast routing proceeds by constructing multicast trees. The tree 2265 along which a particular multicast packet must get forwarded depends 2266 in general on the packet's source address and its destination 2267 address. Whenever a particular LSR is a node in a particular 2268 multicast tree, it binds a label to that tree. It then distributes 2269 that binding to its parent on the multicast tree. (If the node in 2270 question is on a LAN, and has siblings on that LAN, it must also 2271 distribute the binding to its siblings. This allows the parent to 2272 use a single label value when multicasting to all children on the 2273 LAN.) 2275 When a multicast labeled packet arrives, the NHLFE corresponding to 2276 the label indicates the set of output interfaces for that packet, as 2277 well as the outgoing label. If the same label encoding technique is 2278 used on all the outgoing interfaces, the very same packet can be sent 2279 to all the children. 2281 4. 
LDP Procedures for Hop-by-Hop Routed Traffic 2283 4.1. The Procedures for Advertising and Using labels 2285 In this section, we consider only label bindings that are used for 2286 traffic to be label switched along its hop-by-hop routed path. In 2287 these cases, the label in question will correspond to an address 2288 prefix in the routing table. 2290 There are a number of different procedures that may be used to 2291 distribute label bindings. Some of these procedures are executed by 2292 the downstream LSR, and others by the upstream LSR. 2294 The downstream LSR must perform: 2296 - The Distribution Procedure, and 2298 - the Withdrawal Procedure. 2300 The upstream LSR must perform: 2302 - The Request Procedure, and 2304 - the NotAvailable Procedure, and 2306 - the Release Procedure, and 2308 - the labelUse Procedure. 2310 The MPLS architecture supports several variants of each procedure. 2312 However, the MPLS architecture does not support all possible 2313 combinations of all possible variants. The set of supported 2314 combinations will be described in section 4.2, where the 2315 interoperability between different combinations will also be 2316 discussed. 2318 4.1.1. Downstream LSR: Distribution Procedure 2320 The Distribution Procedure is used by a downstream LSR to determine 2321 when it should distribute a label binding for a particular address 2322 prefix to its LDP peers. The architecture supports four different 2323 distribution procedures. 2325 Irrespective of the particular procedure that is used, if a label 2326 binding for a particular address prefix has been distributed by a 2327 downstream LSR Rd to an upstream LSR Ru, and if at any time the 2328 attributes (as defined above) of that binding change, then Rd must 2329 inform Ru of the new attributes.
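The rule above, that Rd must re-advertise a distributed binding to Ru whenever the binding's attributes change, can be sketched as follows. The classes, message format, and the "hop_count" attribute are all hypothetical, for illustration only.

```python
# Hypothetical sketch: a downstream LSR Rd tracks which bindings it has
# distributed to which peers, and re-advertises whenever an attribute
# of a distributed binding changes.

class DownstreamLSR:
    def __init__(self):
        self.distributed = {}   # (peer, prefix) -> (label, attributes)
        self.outbox = []        # advertisements "sent" to peers

    def distribute(self, peer, prefix, label, attributes):
        self.distributed[(peer, prefix)] = (label, attributes)
        self.outbox.append((peer, prefix, label, attributes))

    def update_attributes(self, prefix, attributes):
        # Re-advertise to every peer that already holds the binding.
        for (peer, pfx), (label, _) in list(self.distributed.items()):
            if pfx == prefix:
                self.distribute(peer, prefix, label, attributes)

rd = DownstreamLSR()
rd.distribute("Ru", "X", "L", {"hop_count": 3})
rd.update_attributes("X", {"hop_count": 4})

# Ru has been informed of the new attributes for the same label.
assert rd.outbox[-1] == ("Ru", "X", "L", {"hop_count": 4})
```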
2331 If an LSR is maintaining multiple routes to a particular address 2332 prefix, it is a local matter as to whether that LSR binds multiple 2333 labels to the address prefix (one per route), and hence distributes 2334 multiple bindings. 2336 4.1.1.1. PushUnconditional 2338 Let Rd be an LSR. Suppose that: 2340 1. X is an address prefix in Rd's routing table 2342 2. Ru is an LDP Peer of Rd with respect to X 2344 Whenever these conditions hold, Rd must bind a label to X and 2345 distribute that binding to Ru. It is the responsibility of Rd to 2346 keep track of the bindings which it has distributed to Ru, and to 2347 make sure that Ru always has these bindings. 2349 This procedure would be used by LSRs which are performing downstream 2350 label assignment in the Independent LSP Control Mode. 2352 4.1.1.2. PushConditional 2354 Let Rd be an LSR. Suppose that: 2356 1. X is an address prefix in Rd's routing table 2358 2. Ru is an LDP Peer of Rd with respect to X 2360 3. Rd is either an LSP Egress or an LSP Proxy Egress for X, or 2361 Rd's L3 next hop for X is Rn, where Rn is distinct from Ru, and 2362 Rn has bound a label to X and distributed that binding to Rd. 2364 Then as soon as these conditions all hold, Rd should bind a label to 2365 X and distribute that binding to Ru. 2367 Whereas PushUnconditional causes the distribution of label bindings 2368 for all address prefixes in the routing table, PushConditional causes 2369 the distribution of label bindings only for those address prefixes 2370 for which one has received label bindings from one's LSP next hop, or 2371 for which one does not have an MPLS-capable L3 next hop. 2373 This procedure would be used by LSRs which are performing downstream 2374 label assignment in the Ordered LSP Control Mode. 2376 4.1.1.3. PulledUnconditional 2378 Let Rd be an LSR. Suppose that: 2380 1. X is an address prefix in Rd's routing table 2382 2. Ru is a label distribution peer of Rd with respect to X 2383 3. 
Ru has explicitly requested that Rd bind a label to X and 2384 distribute the binding to Ru 2386 Then Rd should bind a label to X and distribute that binding to Ru. 2387 Note that if X is not in Rd's routing table, or if Rd is not an LDP 2388 peer of Ru with respect to X, then Rd must inform Ru that it cannot 2389 provide a binding at this time. 2391 If Rd has already distributed a binding for address prefix X to Ru, 2392 and it receives a new request from Ru for a binding for address 2393 prefix X, it will bind a second label, and distribute the new binding 2394 to Ru. The first label binding remains in effect. 2396 This procedure would be used by LSRs performing downstream-on-demand 2397 label distribution using the Independent LSP Control Mode. 2399 4.1.1.4. PulledConditional 2401 Let Rd be an LSR. Suppose that: 2403 1. X is an address prefix in Rd's routing table 2405 2. Ru is a label distribution peer of Rd with respect to X 2407 3. Ru has explicitly requested that Rd bind a label to X and 2408 distribute the binding to Ru 2410 4. Rd is either an LSP Egress or an LSP Proxy Egress for X, or 2411 Rd's L3 next hop for X is Rn, where Rn is distinct from Ru, and 2412 Rn has bound a label to X and distributed that binding to Rd 2414 Then as soon as these conditions all hold, Rd should bind a label to 2415 X and distribute that binding to Ru. Note that if X is not in Rd's 2416 routing table, or if Rd is not a label distribution peer of Ru with 2417 respect to X, then Rd must inform Ru that it cannot provide a binding 2418 at this time. 2420 However, if the only condition that fails to hold is that Rn has not 2421 yet provided a label to Rd, then Rd must defer any response to Ru 2422 until such time as it has received a binding from Rn. 2424 If Rd has distributed a label binding for address prefix X to Ru, and 2425 at some later time, any attribute of the label binding changes, then 2426 Rd must redistribute the label binding to Ru, with the new attribute.
It must do this even though Ru does not issue a new Request. 2429 This procedure would be used by LSRs that are performing downstream- 2430 on-demand label allocation in the Ordered LSP Control Mode. 2432 In section 4.2, we will discuss how to choose the particular 2433 procedure to be used at any given time, and how to ensure 2434 interoperability among LSRs that choose different procedures. 2436 4.1.2. Upstream LSR: Request Procedure 2438 The Request Procedure is used by the upstream LSR for an address 2439 prefix to determine when to explicitly request that the downstream 2440 LSR bind a label to that prefix and distribute the binding. There 2441 are three possible procedures that can be used. 2443 4.1.2.1. RequestNever 2445 Never make a request. This is useful if the downstream LSR uses the 2446 PushConditional procedure or the PushUnconditional procedure, but is 2447 not useful if the downstream LSR uses the PulledUnconditional 2448 procedure or the PulledConditional procedure. 2450 This procedure would be used by an LSR when downstream label 2451 distribution and Liberal Label Retention Mode are being used. 2453 4.1.2.2. RequestWhenNeeded 2455 Make a request whenever the L3 next hop to the address prefix 2456 changes, and one doesn't already have a label binding from that next 2457 hop for the given address prefix. 2459 This procedure would be used by an LSR whenever Conservative Label 2460 Retention Mode is being used. 2462 4.1.2.3. RequestOnRequest 2464 Issue a request whenever a request is received, in addition to 2465 issuing a request when needed (as described in section 4.1.2.2). If 2466 Rd receives such a request from Ru, for an address prefix X for which 2467 Rd has already distributed a label to Ru, Rd shall assign a new 2468 (distinct) label, bind it to X, and distribute that binding. 2469 (Whether Rd can distribute this binding to Ru immediately or not 2470 depends on the Distribution Procedure being used.)
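A minimal sketch of the RequestOnRequest behaviour just described, in which each request elicits a new, distinct label even for a prefix that already has one. The label allocator and data structures are hypothetical; this is the behaviour a non-merging LSR needs, since it must keep traffic from different requesters on different labels.

```python
# Hypothetical sketch: a downstream LSR that hands out a distinct label
# for every request it receives for a prefix, as RequestOnRequest
# requires when the LSR cannot merge (e.g., an ATM-LSR without VC merge).

class NonMergingLSR:
    def __init__(self):
        self.next_label = 100   # illustrative label space
        self.bindings = []      # (peer, prefix, label)

    def handle_request(self, peer, prefix):
        label = self.next_label     # a new, distinct label per request
        self.next_label += 1
        self.bindings.append((peer, prefix, label))
        return label

rd = NonMergingLSR()
l_a = rd.handle_request("Ru", "X")
l_b = rd.handle_request("Ru", "X")   # second request, same prefix X

# Both bindings exist, and the labels are distinct.
assert l_a != l_b
assert len(rd.bindings) == 2
```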
This procedure would be used by an LSR which is doing downstream-on- 2473 demand label distribution, but is not doing label merging, e.g., an 2474 ATM-LSR which is not capable of VC merge. 2476 4.1.3. Upstream LSR: NotAvailable Procedure 2478 If Ru and Rd are respectively upstream and downstream label 2479 distribution peers for address prefix X, and Rd is Ru's L3 next hop 2480 for X, and Ru requests a binding for X from Rd, but Rd replies that 2481 it cannot provide a binding at this time, then the NotAvailable 2482 procedure determines how Ru responds. There are two possible 2483 procedures governing Ru's behavior: 2485 4.1.3.1. RequestRetry 2487 Ru should issue the request again at a later time. That is, the 2488 requester is responsible for trying again later to obtain the needed 2489 binding. This procedure would be used when downstream-on-demand 2490 label distribution is used. 2492 4.1.3.2. RequestNoRetry 2494 Ru should never reissue the request, instead assuming that Rd will 2495 provide the binding automatically when it is available. This is 2496 useful if Rd uses the PushUnconditional procedure or the 2497 PushConditional procedure, i.e., if downstream label distribution is 2498 used. 2500 4.1.4. Upstream LSR: Release Procedure 2502 Suppose that Rd is an LSR which has bound a label to address prefix 2503 X, and has distributed that binding to LSR Ru. If Rd does not happen 2504 to be Ru's L3 next hop for address prefix X, or has ceased to be Ru's 2505 L3 next hop for address prefix X, then Rd will not be using the 2506 label. The Release Procedure determines how Ru acts in this case. 2507 There are two possible procedures governing Ru's behavior: 2509 4.1.4.1. ReleaseOnChange 2511 Ru should release the binding, and inform Rd that it has done so. 2512 This procedure would be used to implement Conservative Label 2513 Retention Mode. 2515 4.1.4.2.
NoReleaseOnChange 2517 Ru should maintain the binding, so that it can use it again 2518 immediately if Rd later becomes Ru's L3 next hop for X. This 2519 procedure would be used to implement Liberal Label Retention Mode. 2521 4.1.5. Upstream LSR: labelUse Procedure 2523 Suppose Ru is an LSR which has received label binding L for address 2524 prefix X from LSR Rd, and Ru is upstream of Rd with respect to X, and 2525 in fact Rd is Ru's L3 next hop for X. 2527 Ru will make use of the binding if Rd is Ru's L3 next hop for X. If, 2528 at the time the binding is received by Ru, Rd is NOT Ru's L3 next hop 2529 for X, Ru does not make any use of the binding at that time. Ru may 2530 however start using the binding at some later time, if Rd becomes 2531 Ru's L3 next hop for X. 2533 The labelUse Procedure determines just how Ru makes use of Rd's 2534 binding. 2536 There are three procedures which Ru may use: 2538 4.1.5.1. UseImmediate 2540 Ru may put the binding into use immediately. At any time when Ru has 2541 a binding for X from Rd, and Rd is Ru's L3 next hop for X, Rd will 2542 also be Ru's LSP next hop for X. This procedure is used when neither 2543 loop prevention nor loop detection is in use. 2545 4.1.5.2. UseIfLoopFree 2547 Ru will use the binding only if it determines that by doing so, it 2548 will not cause a forwarding loop. 2550 If Ru has a binding for X from Rd, and Rd is (or becomes) Ru's L3 2551 next hop for X, but Rd is NOT Ru's current LSP next hop for X, Ru 2552 does NOT immediately make Rd its LSP next hop. Rather, it initiates 2553 a loop prevention algorithm. If, upon the completion of this 2554 algorithm, Rd is still the L3 next hop for X, Ru will make Rd the LSP 2555 next hop for X, and use L as the outgoing label. 2557 This procedure is used when loop prevention is in use. 2559 The loop prevention algorithm to be used is still under 2560 consideration. 2562 4.1.5.3.
UseIfLoopNotDetected 2564 This procedure is the same as UseImmediate, unless Ru has detected a 2565 loop in the LSP. If a loop has been detected, Ru will discard 2566 packets that would otherwise have been labeled with L and sent to Rd. 2568 This will continue until the next hop for X changes, or until the 2569 loop is no longer detected. 2571 This procedure is used when loop detection, but not loop prevention, 2572 is in use. 2574 4.1.6. Downstream LSR: Withdraw Procedure 2576 In this case, there is only a single procedure. 2578 When LSR Rd decides to break the binding between label L and address 2579 prefix X, then this unbinding must be distributed to all LSRs to 2580 which the binding was distributed. 2582 It is desirable, though not required, that the unbinding of L from X 2583 be distributed by Rd to an LSR Ru before Rd distributes to Ru any new 2584 binding of L to any other address prefix Y, where X != Y. If Ru 2585 learns of the new binding of L to Y before it learns of the unbinding 2586 of L from X, and if packets matching both X and Y are forwarded by Ru 2587 to Rd, then for a period of time, Ru will label both packets matching 2588 X and packets matching Y with label L. 2590 The distribution and withdrawal of label bindings is done via a label 2591 distribution protocol, or LDP. LDP is a two-party protocol. If LSR R1 2592 has received label bindings from LSR R2 via an instance of an LDP, 2593 and that instance of that protocol is closed by either end (whether 2594 as a result of failure or as a matter of normal operation), then all 2595 bindings learned over that instance of the protocol must be 2596 considered to have been withdrawn. 2598 As long as the relevant LDP connection remains open, label bindings 2599 that are withdrawn must always be withdrawn explicitly.
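The withdrawal rules above can be sketched as follows (the classes and session identifiers are hypothetical): a received binding disappears only through an explicit withdraw naming that binding, or implicitly when the whole LDP session over which it was learned is closed.

```python
# Hypothetical sketch of an upstream LSR's store of received bindings,
# honoring the rules above: explicit withdraw removes exactly one
# (label, prefix) binding; session closure withdraws everything learned
# over that session.

class ReceivedBindings:
    def __init__(self):
        self.by_session = {}    # session -> set of (label, prefix)

    def learn(self, session, label, prefix):
        self.by_session.setdefault(session, set()).add((label, prefix))

    def withdraw(self, session, label, prefix):
        # Explicit withdraw: only the named binding is removed.
        self.by_session[session].discard((label, prefix))

    def session_closed(self, session):
        # All bindings learned over this session are considered withdrawn.
        self.by_session.pop(session, None)

ru = ReceivedBindings()
ru.learn("R2", "L", "X")
ru.learn("R2", "L2", "X")
ru.withdraw("R2", "L", "X")
assert ("L2", "X") in ru.by_session["R2"]   # the other binding survives
ru.session_closed("R2")
assert "R2" not in ru.by_session            # everything from R2 is gone
```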
If a second 2600 label is bound to an address prefix, the result is not to implicitly 2601 withdraw the first label, but to bind both labels; this is needed to 2602 support multi-path routing. If a second address prefix is bound to a 2603 label, the result is not to implicitly withdraw the binding of that 2604 label to the first address prefix, but to use that label for both 2605 address prefixes. 2607 4.2. MPLS Schemes: Supported Combinations of Procedures 2609 Consider two LSRs, Ru and Rd, which are label distribution peers with 2610 respect to some set of address prefixes, where Ru is the upstream 2611 peer and Rd is the downstream peer. 2613 The MPLS scheme which governs the interaction of Ru and Rd can be 2614 described as a quintuple of procedures: <Distribution Procedure, Request Procedure, NotAvailable Procedure, Release Procedure, labelUse Procedure>. (Since there is only one Withdraw Procedure, it 2617 need not be mentioned.) A "*" appearing in one of the positions is a 2618 wild-card, meaning that any procedure in that category may be 2619 present; an "N/A" appearing in a particular position indicates that 2620 no procedure in that category is needed. 2622 Only the MPLS schemes which are specified below are supported by the 2623 MPLS Architecture. Other schemes may be added in the future, if a 2624 need for them is shown. 2626 4.2.1. TTL-capable LSP Segments 2628 If Ru and Rd are MPLS peers, and both are capable of decrementing a 2629 TTL field in the MPLS header, then the MPLS scheme in use between Ru 2630 and Rd must be one of the following: 2632 1. <PushUnconditional, RequestNever, N/A, NoReleaseOnChange, UseImmediate> 2635 This is downstream label distribution with independent control, 2636 liberal label retention mode, and no loop detection. 2638 2. <PushUnconditional, RequestNever, N/A, NoReleaseOnChange, UseIfLoopNotDetected> 2641 This is downstream label distribution with independent control, 2642 liberal label retention, and loop detection. 2644 3. <PushConditional, RequestWhenNeeded, RequestNoRetry, ReleaseOnChange, *> 2647 This is downstream label distribution with ordered control and 2648 conservative label retention mode. Loop prevention and loop 2649 detection are optional. 2651 4. <PushConditional, RequestNever, N/A, NoReleaseOnChange, *>
2653 This is downstream label distribution with ordered control and 2654 liberal label retention mode. Loop prevention and loop 2655 detection are optional. 2657 4.2.2. Using ATM Switches as LSRs 2659 The procedures for using ATM switches as LSRs depend on whether the 2660 ATM switches can realize LSP trees as multipoint-to-point VCs or VPs. 2662 Most ATM switches existing today do NOT have a multipoint-to-point 2663 VC-switching capability. Their cross-connect tables could easily be 2664 programmed to move cells from multiple incoming VCs to a single 2665 outgoing VC, but the result would be that cells from different 2666 packets get interleaved. 2668 Some ATM switches do support a multipoint-to-point VC-switching 2669 capability. These switches will queue up all the incoming cells from 2670 an incoming VC until a packet boundary is reached. Then they will 2671 transmit the entire sequence of cells on the outgoing VC, without 2672 allowing cells from any other packet to be interleaved. 2674 Many ATM switches do support a multipoint-to-point VP-switching 2675 capability, which can be used if the Multipoint SVP label encoding is 2676 used. 2678 4.2.2.1. Without Label Merging 2680 Suppose that R1, R2, R3, and R4 are ATM switches which do not support 2681 label merging, but are being used as LSRs. Suppose further that the 2682 L3 hop-by-hop path for address prefix X is <R1, R2, R3, R4>, and that 2683 packets destined for X can enter the network at any of these LSRs. 2684 Since there is no multipoint-to-point capability, the LSPs must be 2685 realized as point-to-point VCs, which means that there need to be 2686 three such VCs for address prefix X: <R1, R2, R3, R4>, <R2, R3, R4>, 2687 and <R3, R4>. 2689 Therefore, if R1 and R2 are MPLS peers, and either is an LSR which is 2690 implemented using conventional ATM switching hardware (i.e., no cell 2691 interleave suppression), the MPLS scheme in use between R1 and R2 2692 must be one of the following: 2694 1. <PulledUnconditional, RequestOnRequest, RequestRetry, ReleaseOnChange, UseImmediate>
2697 This is downstream-on-demand label distribution with 2698 independent control and conservative label retention mode, 2699 without loop prevention or detection. 2701 2. <PulledUnconditional, RequestOnRequest, RequestRetry, ReleaseOnChange, UseIfLoopNotDetected> 2704 This is downstream-on-demand label distribution with 2705 independent control and conservative label retention mode, with 2706 loop detection. 2708 3. <PulledConditional, RequestOnRequest, RequestNoRetry, ReleaseOnChange, *> 2711 This is downstream-on-demand label distribution with ordered 2712 control (initiated by the ingress), conservative label 2713 retention mode, and optional loop detection or loop prevention. 2715 The use of the RequestOnRequest procedure will cause R4 to 2716 distribute three labels for X to R3; R3 will distribute two 2717 labels for X to R2, and R2 will distribute one label for X to 2718 R1. 2720 4.2.2.2. With Label Merging 2722 If R1 and R2 are MPLS peers, at least one of which is an ATM-LSR 2723 which supports label merging, then the MPLS scheme in use between R1 2724 and R2 must be one of the following: 2726 1. <PushConditional, RequestWhenNeeded, RequestNoRetry, ReleaseOnChange, *> 2. <PushUnconditional, RequestNever, N/A, NoReleaseOnChange, *> 3. <PulledUnconditional, RequestWhenNeeded, RequestRetry, ReleaseOnChange, *> 2736 The first of these is an ordered control scheme. The second is 2737 the "downstream" variant of independent control. The third 2738 is the "conservative downstream-on-demand" variant of 2739 independent control. 2741 4.2.3. Interoperability Considerations 2743 It is easy to see that certain quintuples do NOT yield viable MPLS 2744 schemes. For example: 2746 - <PulledUnconditional, RequestNever, *, *, *> 2747 <PulledConditional, RequestNever, *, *, *> 2749 In these MPLS schemes, the downstream LSR Rd distributes label 2750 bindings to upstream LSR Ru only upon request from Ru, but Ru 2751 never makes any such requests. Obviously, these schemes are not 2752 viable, since they will not result in the proper distribution of 2753 label bindings. 2755 - <*, RequestNever, *, *, ReleaseOnChange> 2757 In these MPLS schemes, Ru releases bindings when it isn't using 2758 them, but it never asks for them again, even if it later has a 2759 need for them. These schemes thus do not ensure that label 2760 bindings get properly distributed.
2762 In this section, we specify rules to prevent a pair of LDP peers from 2763 adopting procedures which lead to infeasible MPLS Schemes. These 2764 rules require the exchange of information between LDP peers during 2765 the initialization of the LDP connection between them. 2767 1. Each must state whether it is an ATM-LSR, and if so, whether it 2768 has cell interleave suppression (i.e., VC merging). 2770 2. If Rd is an ATM switch without cell interleave suppression, it 2771 must state whether it intends to use the PulledUnconditional 2772 procedure or the PulledConditional procedure. If the former, 2773 Ru MUST use the RequestRetry procedure; if the latter, Ru MUST 2774 use the RequestNoRetry procedure. 2776 3. If Ru is an ATM switch without cell interleave suppression, it 2777 must state whether it intends to use the RequestRetry or the 2778 RequestNoRetry procedure. If Rd is an ATM switch without cell 2779 interleave suppression, Rd is not bound by this, and in fact Ru 2780 MUST adopt Rd's preferences. However, if Rd is NOT an ATM 2781 switch without cell interleave suppression, then if Ru chooses 2782 RequestRetry, Rd must use PulledUnconditional, and if Ru 2783 chooses RequestNoRetry, Rd MUST use PulledConditional. 2785 4. If Rd is an ATM switch with cell interleave suppression, it 2786 must specify whether it prefers to use PushConditional, 2787 PushUnconditional, or PulledConditional. If Ru is not an ATM 2788 switch without cell interleave suppression, it must then use 2789 RequestWhenNeeded and RequestNoRetry, or else RequestNever and 2790 NoReleaseOnChange, respectively. 2792 5. If Ru is an ATM switch with cell interleave suppression, it 2793 must specify whether it prefers to use RequestWhenNeeded and 2794 RequestNoRetry, or else RequestNever and NoReleaseOnChange. If 2795 Rd is NOT an ATM switch with cell interleave suppression, it 2796 must then use either PushConditional or PushUnconditional, 2797 respectively. 2799 5.
Security Considerations 2801 Security considerations are not discussed in this version of this 2802 draft. 2804 6. Authors' Addresses 2806 Eric C. Rosen 2807 Cisco Systems, Inc. 2808 250 Apollo Drive 2809 Chelmsford, MA, 01824 2810 E-mail: erosen@cisco.com 2812 Arun Viswanathan 2813 Lucent Technologies 2814 101 Crawford Corner Rd., #4D-537 2815 Holmdel, NJ 07733 2816 732-332-5163 2817 E-mail: arunv@dnrc.bell-labs.com 2819 Ross Callon 2820 IronBridge Networks 2821 55 Hayden Avenue, 2822 Lexington, MA 02173 2823 +1-781-402-8017 2824 E-mail: rcallon@ironbridgenetworks.com 2826 7. References 2828 [1] "A Framework for Multiprotocol Label Switching", R.Callon, 2829 P.Doolan, N.Feldman, A.Fredette, G.Swallow, and A.Viswanathan, work 2830 in progress, Internet Draft , 2831 November 1997. 2833 [2] "ARIS: Aggregate Route-Based IP Switching", A. Viswanathan, N. 2834 Feldman, R. Boivie, R. Woundy, work in progress, Internet Draft 2835 , March 1997. 2837 [3] "ARIS Specification", N. Feldman, A. Viswanathan, work in 2838 progress, Internet Draft , March 2839 1997. 2841 [4] "Tag Switching Architecture - Overview", Rekhter, Davie, Katz, 2842 Rosen, Swallow, Farinacci, work in progress, Internet Draft , January, 1997. 2845 [5] "Tag Distribution Protocol", Doolan, Davie, Katz, Rekhter, Rosen, 2846 work in progress, Internet Draft , May, 2847 1997. 2849 [6] "Use of Tag Switching with ATM", Davie, Doolan, Lawrence, 2850 McGloghrie, Rekhter, Rosen, Swallow, work in progress, Internet Draft 2851 , January, 1997. 2853 [7] "Label Switching: Label Stack Encodings", Rosen, Rekhter, Tappan, 2854 Farinacci, Fedorkow, Li, Conta, work in progress, Internet Draft 2855 , February, 1998. 2857 [8] "Partitioning Tag Space among Multicast Routers on a Common 2858 Subnet", Farinacci, work in progress, Internet Draft , December, 1996. 2861 [9] "Multicast Tag Binding and Distribution using PIM", Farinacci, 2862 Rekhter, work in progress, Internet Draft , December, 1996.
2865 [10] "Toshiba's Router Architecture Extensions for ATM: Overview", 2866 Katsube, Nagami, Esaki, RFC 2098, February, 1997. 2868 [11] "Loop-Free Routing Using Diffusing Computations", J.J. Garcia- 2869 Luna-Aceves, IEEE/ACM Transactions on Networking, Vol. 1, No. 1, 2870 February 1993.