Network Working Group                                          S. Bryant
Internet-Draft                                               C. Filsfils
Intended status: Standards Track                                M. Shand
Expires: April 30, 2012                                    Cisco Systems
                                                                   N. So
                                                            Verizon Inc.
                                                        October 11, 2011


                             Remote LFA FRR
                       draft-shand-remote-lfa-00

Abstract

   This draft describes an extension to the basic IP fast re-route
   mechanism described in RFC 5286 that provides additional backup
   connectivity when none can be provided by the basic mechanisms.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC2119 [RFC2119].

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 30, 2012.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

1.  Terminology

   This draft uses the terms defined in [RFC5714].
   This section defines additional terms used in this draft.

   Extended P-space
                   The union of the P-spaces of the neighbours of a
                   specific router with respect to the protected link.

   P-space         P-space is the set of routers reachable from a
                   specific router without any path (including equal
                   cost path splits) transiting the protected link.

                   For example, the P-space of S is the set of routers
                   that S can reach without using the protected link
                   S-E.

   PQ node         A node which is a member of both the extended
                   P-space and the Q-space.

   Q-space         Q-space is the set of routers from which a specific
                   router can be reached without any path (including
                   equal cost path splits) transiting the protected
                   link.

   Repair tunnel   A tunnel established for the purpose of providing a
                   virtual neighbor which is a Loop Free Alternate.

   Remote LFA      The tail-end of a repair tunnel.  This tail-end is a
                   member of both the extended P-space and the Q-space.
                   It is also termed a "PQ" node.

2.  Introduction

   RFC 5714 [RFC5714] describes a framework for IP Fast Re-route and
   provides a summary of various proposed IPFRR solutions.  A basic
   mechanism using loop-free alternates (LFAs) is described in [RFC5286]
   that provides good repair coverage in many topologies
   [I-D.filsfils-rtgwg-lfa-applicability], especially those that are
   highly meshed.  However, some topologies, notably ring based
   topologies, are not well protected by LFAs alone.  This is
   illustrated in Figure 1 below.

                                 S---E
                                /     \
                               A       D
                                \     /
                                 B---C

                    Figure 1: A simple ring topology

   If all link costs are equal, the link S-E cannot be fully protected
   by LFAs.
   The destination C is an ECMP from S, and so can be protected when
   S-E fails, but D and E are not protectable using LFAs.

   This draft describes extensions to the basic repair mechanism in
   which tunnels are used to provide additional logical links which can
   then be used as loop free alternates where none exist in the original
   topology.  For example, if a tunnel is provided between S and C as
   shown in Figure 2, then C, now being a direct neighbor of S, would
   become an LFA for D and E.  The non-failure traffic distribution is
   not disrupted by the provision of such a tunnel since it is only used
   for repair traffic and MUST NOT be used for normal traffic.

                                 S---E
                                / \   \
                               A   \   D
                                \   \ /
                                 B---C

                   Figure 2: The addition of a tunnel

   The use of this technique is not restricted to ring based topologies,
   but is a general mechanism which can be used to enhance the
   protection provided by LFAs.

3.  Repair Paths

   As with LFA FRR, when a router detects an adjacent link failure, it
   uses one or more repair paths in place of the failed link.  Repair
   paths are pre-computed in anticipation of later failures so they can
   be promptly activated when a failure is detected.

   A tunneled repair path tunnels traffic to some staging point in the
   network from which it is assumed that, in the absence of multiple
   failures, it will travel to its destination using normal forwarding
   without looping back.  This is equivalent to providing a virtual
   loop-free alternate to supplement the physical loop-free alternates.
   Hence the name "Remote LFA FRR".  When a link cannot be entirely
   protected with local LFA neighbors, the protecting router seeks the
   help of a remote LFA staging point.

3.1.  Tunnels as Repair Paths

   Consider an arbitrary protected link S-E.  In LFA FRR, if a path to
   the destination from a neighbor N of S does not cause a packet to
   loop back over the link S-E (i.e.
   N is a loop-free alternate), then S can send the packet to N and the
   packet will be delivered to the destination using the pre-failure
   forwarding information.  If there is no such LFA neighbor, then S may
   be able to create a virtual LFA by using a tunnel to carry the packet
   to a point in the network which is not a direct neighbor of S, and
   from which the packet will be delivered to the destination without
   looping back to S.  In this document such a tunnel is termed a repair
   tunnel.  The tail-end of this tunnel is called a "remote LFA" or a
   "PQ node".

   Note that the repair tunnel terminates at some intermediate router
   between S and E, and not E itself.  This is clearly the case, since
   if it were possible to construct a tunnel from S to E then a
   conventional LFA would have been sufficient to effect the repair.

3.2.  Tunnel Requirements

   There are a number of IP in IP tunnel mechanisms that may be used to
   fulfil the requirements of this design, such as IP-in-IP [RFC1853]
   and GRE [RFC1701].

   In an MPLS enabled network using LDP [RFC5036], a simple label stack
   [RFC3032] may be used to provide the required repair tunnel.  In
   this case the outer label is S's neighbor's label for the repair
   tunnel end point, and the inner label is the repair tunnel end
   point's label for the packet destination.  In order for S to obtain
   the correct inner label it is necessary to establish a directed LDP
   session [RFC5036] to the tunnel end point.

   The selection of the specific tunnelling mechanism (and any necessary
   enhancements) used to provide a repair path is outside the scope of
   this document.  The authors simply note that deployment in an
   MPLS/LDP environment is extremely simple and straightforward as an
   LDP LSP from S to the PQ node is readily available, and hence does
   not require any new protocol extension or design change.
   This LSP is automatically established as a basic property of LDP
   behavior.  The performance of the encapsulation and decapsulation is
   also excellent as encapsulation is just a push of one label (like
   conventional MPLS TE FRR) and the decapsulation occurs naturally at
   the penultimate hop before the PQ node.

   When a failure is detected, it is necessary to immediately redirect
   traffic to the repair path.  Consequently, the repair tunnel used
   must be provisioned beforehand in anticipation of the failure.  Since
   the location of the repair tunnels is dynamically determined, it is
   necessary to establish the repair tunnels without management action.
   Multiple repairs may share a tunnel end point.

4.  Construction of Repair Paths

4.1.  Identifying Required Tunneled Repair Paths

   Not all links will require protection using a tunneled repair path.
   If E can already be protected via an LFA, S-E does not need to be
   protected using a repair tunnel, since all destinations normally
   reachable through E must therefore also be protectable by an LFA.
   Such an LFA is frequently termed a "link LFA".  Tunneled repair paths
   are only required for links which do not have a link LFA.

4.2.  Determining Tunnel End Points

   The repair tunnel end point needs to be a node in the network
   reachable from S without traversing S-E.  In addition, the repair
   tunnel end point needs to be a node from which packets will normally
   flow towards their destination without being attracted back to the
   failed link S-E.

   Note that once released from the tunnel, the packet will be
   forwarded, as normal, on the shortest path from the release point to
   its destination.  This may result in the packet traversing the router
   E at the far end of the protected link S-E, but this is obviously
   not required.

   The properties that are required of repair tunnel end points are
   therefore:

   o  The repair tunnel end point MUST be reachable from the tunnel
      source without traversing the failed link; and

   o  When released, tunneled packets MUST proceed towards their
      destination without being attracted back over the failed link.

   Provided both these requirements are met, packets forwarded over the
   repair tunnel will reach their destination and will not loop.

   In some topologies it will not be possible to find a repair tunnel
   end point that exhibits both the required properties.  For example,
   if the ring topology illustrated in Figure 1 had a cost of 4 for the
   link B-C, while the remaining links were cost 1, then it would not be
   possible to establish a tunnel from S to C (without resorting to some
   form of source routing).

4.2.1.  Computing Repair Paths

   The set of routers which can be reached from S without traversing S-E
   is termed the P-space of S with respect to the link S-E.  The P-space
   can be obtained by computing a shortest path tree (SPT) rooted at S
   and excising the sub-tree reached via the link S-E (including those
   which are members of an ECMP).  In the case of Figure 1 the P-space
   comprises nodes A and B only.

   The set of routers from which the node E can be reached, by normal
   forwarding, without traversing the link S-E is termed the Q-space of
   E with respect to the link S-E.  The Q-space can be obtained by
   computing a reverse shortest path tree (rSPT) rooted at E, with the
   sub-tree which traverses the failed link excised (including those
   which are members of an ECMP).  The rSPT uses the cost towards the
   root rather than from it and yields the best paths towards the root
   from other nodes in the network.  In the case of Figure 1 the Q-space
   comprises nodes C and D only.
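
   The two computations above can be sketched in Python for the ring of
   Figure 1.  This is an illustrative sketch only, not part of the
   specification.  Instead of building the SPT and excising a sub-tree,
   it uses a distance test: a node is excluded when some shortest path
   (including any ECMP member) crosses S-E.  For a tree rooted at S or
   at E this is assumed to select the same nodes as the excision.  The
   function names, the graph representation and the choice to leave the
   root out of its own P-space or Q-space are assumptions made for this
   example.

```python
import heapq

INF = float("inf")


def dijkstra(graph, root):
    """Return the shortest-path cost from root to every reachable node."""
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in graph.get(u, {}).items():
            if d + cost < dist.get(v, INF):
                dist[v] = d + cost
                heapq.heappush(heap, (d + cost, v))
    return dist


def reversed_graph(graph):
    """Flip every directed link so Dijkstra yields costs towards the root."""
    rev = {u: {} for u in graph}
    for u, nbrs in graph.items():
        for v, cost in nbrs.items():
            rev.setdefault(v, {})[u] = cost
    return rev


def p_space(graph, s, e):
    """Nodes S reaches with no shortest path (ECMP included) over S->E."""
    from_s, from_e = dijkstra(graph, s), dijkstra(graph, e)
    link = graph[s][e]
    return {x for x, d in from_s.items()
            if x != s and d < link + from_e.get(x, INF)}


def q_space(graph, s, e):
    """Nodes that reach E with no shortest path (ECMP included) over S->E."""
    rev = reversed_graph(graph)  # rSPT: costs towards the root
    to_e, to_s = dijkstra(rev, e), dijkstra(rev, s)
    link = graph[s][e]
    return {x for x, d in to_e.items()
            if x != e and d < to_s.get(x, INF) + link}


# Figure 1 ring, every link cost 1 in both directions (hypothetical encoding).
RING = {}
for u, v in [("S", "E"), ("E", "D"), ("D", "C"),
             ("C", "B"), ("B", "A"), ("A", "S")]:
    RING.setdefault(u, {})[v] = 1
    RING.setdefault(v, {})[u] = 1

print(sorted(p_space(RING, "S", "E")))  # ['A', 'B']
print(sorted(q_space(RING, "S", "E")))  # ['C', 'D']
```

   The strict inequality is what removes ECMP members: node C is at
   cost 3 from S both via A-B and via E-D, so it is left out of the
   P-space of S.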

   The intersection of E's Q-space with S's P-space defines the set of
   viable repair tunnel end points, known as "PQ nodes".  As can be
   seen, for the case of Figure 1 there is no common node and hence no
   viable repair tunnel end point.

   Note that the Q-space calculation could be conducted for each
   individual destination and a per-destination repair tunnel end point
   determined.  However, this would, in the worst case, require an SPF
   computation per destination, which is not considered to be scalable.
   We therefore use the Q-space of E as a proxy for the Q-space of each
   destination.  This approximation is obviously correct since the
   repair is only used for the set of destinations which were, prior to
   the failure, routed through node E.  This is analogous to the use of
   link-LFAs rather than per-prefix LFAs.

4.2.2.  Extended P-space

   The description in Section 4.2.1 calculated router S's P-space rooted
   at S itself.  However, since router S will only use a repair path
   when it has detected the failure of the link S-E, the initial hop of
   the repair path need not be subject to S's normal forwarding decision
   process.  Thus we introduce the concept of extended P-space.  Router
   S's extended P-space is the union of the P-spaces of each of S's
   neighbours.  The use of extended P-space may allow router S to reach
   potential repair tunnel end points that were otherwise unreachable.

   Another way to describe extended P-space is that it is the union of
   (un-extended) P-space and the set of destinations for which S has a
   per-prefix LFA protecting the link S-E, i.e., the repair tunnel end
   point can be reached either directly or using a per-prefix LFA.

   Since in the case of Figure 1 node A is a per-prefix LFA for the
   destination node C, the set of extended P-space nodes comprises nodes
   A, B and C.
   Since node C is also in E's Q-space, there is now a node common to
   both extended P-space and Q-space which can be used as a repair
   tunnel end point to protect the link S-E.

4.2.3.  Selecting Repair Paths

   The mechanisms described above will identify all the possible repair
   tunnel end points that can be used to protect a particular link.  In
   a well-connected network there are likely to be multiple possible
   release points for each protected link.  All will deliver the packets
   correctly so, arguably, it does not matter which is chosen.  However,
   one repair tunnel end point may be preferred over the others on the
   basis of path cost or some other selection criteria.

   In general there are advantages in choosing the repair tunnel end
   point closest (shortest metric) to S.  Choosing the closest maximises
   the opportunity for the traffic to be load balanced once it has been
   released from the tunnel.

   There is no technical requirement for the selection criteria to be
   consistent across all routers, but such consistency may be desirable
   from an operational point of view.

5.  Example Application of Remote LFAs

   An example of a commonly deployed topology which is not fully
   protected by LFAs alone is shown in Figure 3.  PE1 and PE2 are
   connected in the same site.  P1 and P2 may be geographically
   separated (inter-site).  In order to guarantee the lowest latency
   path from/to all other remote PEs, normally the shortest path follows
   the geographical distance of the site locations.  Therefore, to
   ensure this, a lower IGP metric (5) is assigned between PE1 and PE2.
   A high metric (1000) is set on the P-PE links to prevent the PEs
   being used for transit traffic.  The PEs are not individually
   dual-homed in order to reduce costs.

   This is a common topology in SP networks.

   When a failure occurs on the link between PE1 and P1, PE1 does not
   have an LFA for traffic reachable via P1.
   Similarly, by symmetry, if the link between PE2 and P2 fails, PE2
   does not have an LFA for traffic reachable via P2.

   Increasing the metric between PE1 and PE2 to allow the LFA would
   impact the normal traffic performance by potentially increasing the
   latency.

                            |    100    |
                           -P1---------P2-
                             \         /
                         1000 \       / 1000
                              PE1---PE2
                                  5

                     Figure 3: Example SP topology

   Clearly, full protection can be provided, using the techniques
   described in this draft, by PE1 choosing P2 as a PQ node, and PE2
   choosing P1 as a PQ node.

6.  Historical Note

   The basic concepts behind Remote LFA were invented in 2002 and were
   later included in draft-bryant-ipfrr-tunnels, submitted in 2004.

   draft-bryant-ipfrr-tunnels targeted 100% protection coverage and
   hence included additional mechanisms on top of the Remote LFA
   concept.  The addition of these mechanisms made the proposal very
   complex and computationally intensive and it was therefore not
   pursued as a working group item.

   As explained in [I-D.filsfils-rtgwg-lfa-applicability], the purpose
   of the LFA FRR technology is not to provide coverage at any cost.  A
   solution for this already exists with MPLS TE FRR.  MPLS TE FRR is a
   mature technology which is able to provide protection in any topology
   thanks to the explicit routing capability of MPLS TE.

   The purpose of LFA FRR technology is to provide a simple FRR
   solution when such a solution is possible.  The first step along this
   simplicity approach was "local" LFA [RFC5286].  We propose "Remote
   LFA" as a natural second step.  The following section motivates its
   benefits in terms of simplicity, incremental deployment and
   significant coverage increase.

7.  Benefits

   Remote LFAs preserve the benefits of RFC5286: simplicity, incremental
   deployment and good protection coverage.

7.1.  Simplicity

   The remote LFA algorithm is simple to compute.

   o  The extended P-space does not require any new computation (it is
      known once per-prefix LFA computation is completed).

   o  The Q-space is a single reverse SPF rooted at the neighbor.

   o  The directed LDP session is automatically computed and
      established.

   In edge topologies (square, ring), the position and number of the
   directed LDP sessions are deterministic and hence troubleshooting is
   simple.

   In core topologies, our simulation indicates that the 90th percentile
   number of LDP sessions per node to achieve the significant Remote LFA
   coverage observed in section 7.3 is <= 6.  This is insignificant
   compared to the number of LDP sessions commonly deployed per router,
   which is frequently in the several hundreds.

7.2.  Incremental Deployment

   The establishment of the directed LDP session to the PQ node does not
   require any new technology on the PQ node.  Indeed, routers commonly
   support the ability to accept a remote request to open a directed LDP
   session.  The new capability is restricted to the Remote-LFA
   computing node (the originator of the LDP session).

7.3.  Significant Coverage Extension

   The previous sections have already explained how Remote LFAs provide
   protection for frequently occurring edge topologies: square and
   rings.
   In the core, we extend the analysis framework in section 4.3 of
   [I-D.filsfils-rtgwg-lfa-applicability] and provide hereafter the
   Remote LFA coverage results for the 11 topologies:

   +----------+--------------+----------------+------------+
   | Topology | Per-link LFA | Per-prefix LFA | Remote LFA |
   +----------+--------------+----------------+------------+
   | T1       | 45%          | 77%            | 78%        |
   | T2       | 49%          | 99%            | 100%       |
   | T3       | 88%          | 99%            | 99%        |
   | T4       | 68%          | 84%            | 92%        |
   | T5       | 75%          | 94%            | 99%        |
   | T6       | 87%          | 99%            | 100%       |
   | T7       | 16%          | 67%            | 96%        |
   | T8       | 87%          | 100%           | 100%       |
   | T9       | 67%          | 80%            | 98%        |
   | T10      | 98%          | 100%           | 100%       |
   | T11      | 59%          | 77%            | 95%        |
   | Average  | 67%          | 89%            | 96%        |
   | Median   | 68%          | 94%            | 99%        |
   +----------+--------------+----------------+------------+

   Another study [ISOCORE2010] confirms the significant coverage
   increase provided by Remote LFAs.

8.  Complete Protection

   As shown in the previous table, Remote LFA provides 96% average
   (99% median) protection in the 11 analyzed SP topologies.

   In an MPLS network, this is achieved without any scalability impact
   as the tunnels to the PQ nodes are always present as a property of an
   LDP-based deployment.

   In the very few cases where P and Q spaces have an empty
   intersection, one could select the closest node in the Q space
   (i.e. Qc) and signal an explicitly routed RSVP TE LSP to Qc.
   A directed LDP session is then established with Qc and the rest of
   the solution is identical.

   The drawbacks of this solution are:
   1/ it is only available for MPLS networks;
   2/ the addition of LSPs in the SP infrastructure.

   This extension is described for completeness.  In practice, the
   "Remote LFA" solution should be preferred for three reasons: its
   simplicity, its excellent coverage in the analyzed backbones and its
   complete coverage in the most frequent access/aggregation topologies
   (box or ring).
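
   As an illustration of how the extended P-space of Section 4.2.2, the
   "closest to S" preference of Section 4.2.3 and the Qc fallback
   described above could fit together, the following Python sketch
   selects a repair tunnel end point for the ring of Figure 1.  It is a
   sketch under stated assumptions, not the authors' implementation:
   link costs are taken to be symmetric, "closest" is measured as the
   cost from S in the topology with the protected link removed, and all
   function names are hypothetical.  Signalling of the RSVP TE LSP and
   the directed LDP session is not modelled.

```python
import heapq

INF = float("inf")


def dijkstra(graph, root):
    """Return the shortest-path cost from root to every reachable node."""
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in graph[u].items():
            if d + cost < dist.get(v, INF):
                dist[v] = d + cost
                heapq.heappush(heap, (d + cost, v))
    return dist


def ring(bc_cost):
    """Figure 1 ring with symmetric costs; only the B-C cost varies."""
    links = [("S", "E", 1), ("E", "D", 1), ("D", "C", 1),
             ("C", "B", bc_cost), ("B", "A", 1), ("A", "S", 1)]
    graph = {}
    for u, v, cost in links:
        graph.setdefault(u, {})[v] = cost
        graph.setdefault(v, {})[u] = cost
    return graph


def p_space(graph, root, s, e):
    """Nodes root reaches with no shortest path (ECMP included) over S-E."""
    from_root, from_e = dijkstra(graph, root), dijkstra(graph, e)
    via_link = from_root.get(s, INF) + graph[s][e]
    return {x for x, d in from_root.items()
            if x != root and d < via_link + from_e.get(x, INF)}


def extended_p_space(graph, s, e):
    """S's own P-space plus the P-spaces of S's neighbours other than E."""
    ext = p_space(graph, s, s, e)
    for n in graph[s]:
        if n != e:
            ext |= p_space(graph, n, s, e)
    return ext - {s}


def q_space(graph, s, e):
    """Nodes reaching E with no shortest path over S-E (symmetric costs)."""
    to_e, to_s = dijkstra(graph, e), dijkstra(graph, s)
    return {x for x, d in to_e.items()
            if x != e and d < to_s.get(x, INF) + graph[s][e]}


def select_endpoint(graph, s, e):
    """Return (node, kind) where kind is 'pq' or 'qc-fallback'."""
    # Assumed metric: cost from S once the protected link is removed.
    pruned = {u: {v: c for v, c in nbrs.items() if {u, v} != {s, e}}
              for u, nbrs in graph.items()}
    cost = dijkstra(pruned, s)
    q = q_space(graph, s, e)
    pq = extended_p_space(graph, s, e) & q
    candidates, kind = (pq, "pq") if pq else (q, "qc-fallback")
    reachable = sorted(x for x in candidates if x in cost)
    if not reachable:
        return None, kind
    return min(reachable, key=cost.get), kind


print(select_endpoint(ring(1), "S", "E"))  # ('C', 'pq')
print(select_endpoint(ring(4), "S", "E"))  # ('C', 'qc-fallback')
```

   With equal costs the extended P-space is {A, B, C} and C is the PQ
   node, as in Section 4.2.2.  With a cost of 4 on B-C the intersection
   is empty, as noted in Section 4.2, and the sketch falls back to the
   Q-space node nearest to S under the assumed metric.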

9.  IANA Considerations

   There are no IANA considerations that arise from this architectural
   description of IPFRR.

10.  Security Considerations

   The security considerations of RFC 5286 also apply.

   To prevent their use as an attack vector, the repair tunnel end
   points SHOULD be assigned from a set of addresses that are not
   reachable from outside the routing domain.

11.  Acknowledgments

   The authors acknowledge the technical contributions made to this work
   by Stefano Previdi.

12.  Informative References

   [I-D.filsfils-rtgwg-lfa-applicability]
              Filsfils, C., Francois, P., Shand, M., Decraene, B.,
              Uttaro, J., Leymann, N., and M. Horneffer, "LFA
              applicability in SP networks",
              draft-filsfils-rtgwg-lfa-applicability-00 (work in
              progress), March 2010.

   [ISOCORE2010]
              So, N., Lin, T., and C. Chen, "LFA (Loop Free Alternates)
              Case Studies in Verizon's LDP Network", 2010.

   [RFC1701]  Hanks, S., Li, T., Farinacci, D., and P. Traina, "Generic
              Routing Encapsulation (GRE)", RFC 1701, October 1994.

   [RFC1853]  Simpson, W., "IP in IP Tunneling", RFC 1853, October 1995.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3032]  Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y.,
              Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack
              Encoding", RFC 3032, January 2001.

   [RFC5036]  Andersson, L., Minei, I., and B. Thomas, "LDP
              Specification", RFC 5036, October 2007.

   [RFC5286]  Atlas, A. and A. Zinin, "Basic Specification for IP Fast
              Reroute: Loop-Free Alternates", RFC 5286, September 2008.

   [RFC5714]  Shand, M. and S. Bryant, "IP Fast Reroute Framework",
              RFC 5714, January 2010.

Authors' Addresses

   Stewart Bryant
   Cisco Systems
   250, Longwater, Green Park,
   Reading  RG2 6GB
   UK

   Email: stbryant@cisco.com


   Clarence Filsfils
   Cisco Systems
   De Kleetlaan 6a
   1831 Diegem,
   Belgium

   Email: cfilsfil@cisco.com


   Mike Shand

   Email: imc.shand@googlemail.com


   Ning So
   Verizon Inc.

   Phone:
   Email: