Networking Working Group                                     L. Ginsberg
Internet-Draft                                                 P. Psenak
Intended status: Informational                                M. Karasek
Expires: January 9, 2022                                       A. Lindem
                                                           Cisco Systems
                                                           T. Przygienda
                                                                 Juniper
                                                            July 8, 2021

                  IS-IS Flooding Scale Considerations
                draft-ginsberg-lsr-isis-flooding-scale-05

Abstract

   Link State PDU flooding rates in use are much slower than what
   modern networks can support.  The use of IS-IS at larger scale
   requires faster flooding rates to achieve desired convergence goals.
   This document discusses issues associated with increasing flooding
   rates and some recommended practices which allow faster flooding
   rates to be used safely.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 9, 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Historical Behavior
   3.  Flooding Rate and Convergence
     3.1.  Flow Control Considerations
     3.2.  Rate of LSP Acknowledgments
     3.3.  Bandwidth Utilization
     3.4.  Packet Prioritization on Receive
   4.  Minimizing LSP Generation
   5.  Redundant Flooding
   6.  Use of Jumbo Frames
   7.  Deployment Considerations
   8.  IANA Considerations
   9.  Security Considerations
   10. Acknowledgements
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Authors' Addresses

1.  Introduction

   Link state IGPs such as Intermediate-System-to-Intermediate-System
   (IS-IS) depend upon having consistent Link State Databases (LSDBs)
   on all Intermediate Systems (ISs) in the network in order to provide
   correct forwarding of data packets.  When topology changes occur,
   new/updated Link State PDUs (LSPs) are propagated network-wide.  The
   speed of propagation is a key contributor to convergence time.

   Historically, flooding rates have been conservative - on the order
   of tens of LSPs/second.  This derives from guidance in the base
   specification [ISO10589] and from early deployments, when both CPU
   speeds and interface speeds were much slower than they are today and
   the scale of an IS-IS area was smaller than it may be today.
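   The convergence cost of such conservative rates can be estimated
   with simple arithmetic.  The sketch below is illustrative only (the
   rates and LSP counts are examples, not normative values); it
   computes the time needed to serialize a burst of updated LSPs onto
   a single interface:

```python
def flood_time_seconds(num_lsps: int, lsps_per_second: float) -> float:
    """Time to send num_lsps to one neighbor at a fixed flooding rate."""
    return num_lsps / lsps_per_second

# 1000 updated LSPs at the commonplace rate of 33 LSPs/second:
print(round(flood_time_seconds(1000, 33), 1))   # -> 30.3
```

   Raising the rate by an order of magnitude shrinks that interval
   proportionally, which is the motivation developed in the remainder
   of this document.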
   As IS-IS is deployed at greater scale (a larger number of nodes in
   an area and a larger number of neighbors per node), the impact of
   the historic flooding rates becomes more significant.  Consider the
   bringup or failure of a node with 1000 neighbors.  This will result
   in a minimum of 1000 LSP updates.  At a typical LSP flooding rate
   used in many deployments today (33 LSPs/second), it would take 30+
   seconds simply to send the updated LSPs to a given neighbor.
   Depending on the diameter of the network, achieving a consistent
   LSDB on all nodes in the network could easily take a minute (or
   more).

   Increasing the LSP flooding rate therefore becomes an essential
   element of supporting greater network scale.

   The remainder of this document discusses various aspects of
   protocol operation and how they are impacted by increased flooding
   rates.  Where appropriate, best practices are defined which enhance
   an implementation's ability to support faster flooding rates.

2.  Historical Behavior

   The base specification for IS-IS [ISO10589] was first published in
   1992 and updated in 2002.  The update made no changes with regard
   to suggested timer values.  Convergence targets at the time were on
   the order of seconds, and the specified timer values reflect that.
   Here are some examples:

   minimumLSPGenerationInterval - This is the minimum time interval
      between generation of Link State PDUs.  A source Intermediate
      system shall wait at least this long before re-generating one of
      its own Link State PDUs.  The recommended value was 30 seconds.

   minimumLSPTransmissionInterval - This is the amount of time an
      Intermediate system shall wait before further propagating
      another Link State PDU from the same source system.  The
      recommended value was 5 seconds.

   partialSNPInterval - This is the amount of time between periodic
      action for transmission of Partial Sequence Number PDUs.
      It shall be less than minimumLSPTransmissionInterval.  The
      recommended value was 2 seconds.

   Most relevant to a discussion of LSP flooding rate is the
   recommended interval between the transmission of two different LSPs
   on a given interface.

   For broadcast interfaces, [ISO10589] defined:

   minimumBroadcastLSPTransmissionInterval - the minimum interval
      between PDU arrivals which can be processed by the slowest
      Intermediate System on the LAN.  The default value was defined
      as 33 milliseconds.
      NOTE: It was permitted to send multiple LSPs "back-to-back" as a
      burst, but this was limited to 10 LSPs in a one second period.

   Although this value was specific to LAN interfaces, it has commonly
   been applied by implementations to all interfaces, though that was
   not the original intent of the base specification.  In fact,
   Section 12.1.2.4.3 of [ISO10589] states:

      On point-to-point links the peak rate of arrival is limited only
      by the speed of the data link and the other traffic flowing on
      that link.

   Although modern implementations have not strictly adhered to the 33
   millisecond interval, it is commonplace for implementations to
   limit the flooding rate to an order of magnitude similar to the 33
   ms value.

   In the past 20 years, significant work on achieving faster
   convergence - more specifically, sub-second convergence - has
   resulted in implementations modifying a number of the above timers
   in order to support faster signaling of topology changes.  For
   example, minimumLSPGenerationInterval has been modified to support
   millisecond intervals - often with a backoff algorithm applied to
   prevent LSP generation storms in the event of a series of rapid
   oscillations.

   However, the flooding rate has not been fundamentally altered.

3.  Flooding Rate and Convergence

   Convergence involves a number of sequential operations.

   First, the topology change needs to be detected.
   This is a local
   activity occurring only on the node or nodes directly connected to
   the topology change.  The directly connected node(s) must then
   advertise the topology change by updating their LSPs and flooding
   the changed LSPs.  Routers must then process the updated LSDB and
   recalculate paths to affected destinations.  The updated paths must
   then be installed in the forwarding plane.

   Only when all of these steps are completed on all nodes in the
   network has the network completed convergence.

   As the convergence requirement is consistency of LSDBs on all nodes
   in the network, it is fundamental to understand that the goal of
   flooding is to update the LSDB on all nodes in the network "as fast
   as possible".  Controlling the rate of flooding per interface is
   done to address some practical limitations, which include:

   o  Fairness to other data and control traffic on the same interface

   o  Limitations on the processing rate of incoming control traffic

   However, intentionally using different flooding rates on different
   interfaces increases the possibility of longer periods of LSDB
   inconsistency, which, in turn, delays network-wide convergence.

   Many implementations provide knobs to control the rate of LSP
   flooding on a per-interface basis.  To the extent that this serves
   as a flow control mechanism, it may reduce the number of dropped
   LSPs during high activity bursts and thereby reduce the number of
   LSP retransmissions required.  As LSP retransmission timers are
   typically long (multiple seconds), this may result in shorter
   convergence times than if the LSP burst were uncontrolled.  But if
   the performance characteristics of routers in the network are such
   that some routers consistently accept and process fewer
   LSPs/second than other routers, convergence will be degraded.
   Tuning LSP transmission timers on a per-interface basis will never
   provide optimal convergence.
   Consistent flooding rates should be used on all interfaces.

3.1.  Flow Control Considerations

   In large scale deployments where an increased flooding rate is
   being used, it becomes more likely that a burst of LSPs may
   temporarily overwhelm a receiver.  Normal operation of the Update
   Process will recover from this, but it may well make sense to
   employ some form of flow control.  This will not serve to optimize
   convergence, but it can serve to reduce the number of LSP
   retransmissions.  As retransmissions are deliberately done at a
   slow rate, the result of flow control will be a shorter recovery
   time from a transient condition which prevents a node from handling
   the targeted rate of LSP transmission.  A sustained inability to
   handle LSP reception at the targeted flooding rate indicates that
   the network is provisioned in a way which does not support optimal
   convergence, and steps need to be taken to resolve this issue.
   Such steps could include upgrading the routers that consistently
   demonstrate this condition, altering the configuration of the
   problematic routers, altering the position of the problematic
   routers in the network so as to reduce the overall load on those
   routers, or reducing the target maximum LSP transmission rate
   network-wide.

   When flow control is necessary, it can be implemented in a
   straightforward manner based on knowledge of the current flooding
   rate and the current acknowledgement rate.  Such an algorithm is a
   local matter, and there is no requirement or intent to standardize
   an algorithm.  There are, however, a number of aspects which serve
   as guidelines and which can be described.

   A maximum target LSP transmission rate (LSPTxMax) SHOULD be
   configurable.  This represents the fastest LSP transmission rate
   which will be attempted.  This value SHOULD be applicable to all
   interfaces and SHOULD be consistent network-wide.
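   Purely as an illustration (this document neither standardizes nor
   recommends any specific algorithm), a local flow-control scheme in
   the spirit of these guidelines might cut the transmission rate
   sharply when acknowledgements lag and raise it gradually once the
   neighbor sustains the current rate.  All class, method, and
   threshold names below are hypothetical:

```python
class LspFlowControl:
    """Per-neighbor LSP transmission pacing (illustrative sketch).

    The current rate (LSPTxRate) is halved when acknowledgements fall
    behind, and probed upward in small steps only when the neighbor
    keeps up, never exceeding the configured ceiling (LSPTxMax).
    """

    def __init__(self, lsp_tx_max: float = 300.0, lsp_tx_min: float = 33.0):
        self.lsp_tx_max = lsp_tx_max    # configured ceiling (LSPTxMax)
        self.lsp_tx_min = lsp_tx_min    # floor: historical safe rate
        self.lsp_tx_rate = lsp_tx_max   # current rate (LSPTxRate)

    def on_interval(self, sent: int, acked: int) -> float:
        """Adjust LSPTxRate from one measurement interval's counts."""
        if sent and acked < 0.9 * sent:
            # Receiver is falling behind: back off aggressively.
            self.lsp_tx_rate = max(self.lsp_tx_min, self.lsp_tx_rate / 2)
        elif acked >= sent:
            # Neighbor sustained the current rate: probe upward slowly.
            self.lsp_tx_rate = min(self.lsp_tx_max, self.lsp_tx_rate + 10.0)
        return self.lsp_tx_rate
```

   A real implementation would also fold in the expected per-neighbor
   acknowledgement delay (see Section 3.2) before treating sent LSPs
   as unacknowledged.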
   When the current rate of LSP transmission (LSPTxRate) exceeds the
   capabilities of the receiver, the flow control algorithm needs to
   aggressively reduce LSPTxRate within a few seconds.  Slower
   responsiveness is likely to result in a large number of
   retransmissions, which can introduce much larger delays in
   convergence.

   NOTE: Even with modest increases in flooding speed (for example, a
   target LSPTxMax of 300 LSPs/second - 10 times the typical rate
   supported today), a topology change triggering 2100 new LSPs would
   take only 7 seconds to complete.

   Dynamic adjustment of the rate of LSP transmission (LSPTxRate)
   upwards (i.e., faster) SHOULD be done less aggressively and only
   when the neighbor has demonstrated its ability to sustain the
   current LSPTxRate.

   The flow control algorithm MUST NOT assume the receive capabilities
   of a neighbor are static, i.e., it MUST handle transient conditions
   which result in a slower or faster receive rate on the part of a
   neighbor.

   The flow control algorithm also needs to consider the expected
   delay in receiving an acknowledgment (see Section 3.2).  This may
   vary per neighbor.

3.2.  Rate of LSP Acknowledgments

   On point-to-point networks, PSNP PDUs provide acknowledgments for
   received LSPs.  [ISO10589] suggests that some delay be used when
   sending PSNPs.  This provides some optimization, as multiple LSPs
   can be acknowledged in a single PSNP.

   If faster LSP flooding is to be used safely, it is necessary that
   LSPs be acknowledged more promptly as well.  This requires a
   reduction in the delay before sending PSNPs.

   As PSNPs also consume link bandwidth, packet queue space, and
   protocol processing time on receipt, the increased sending of PSNPs
   should be taken into account when considering the rate at which
   LSPs can be sent on an interface.

3.3.  Bandwidth Utilization

   Routing protocol traffic has to share bandwidth on a link with
   other control traffic and data traffic.  During periods of
   instability, routing protocol traffic will increase, but it is
   still desirable that the maximum bandwidth consumed by routing
   protocol traffic be modest.  This needs to be considered when
   setting IS-IS flooding rates.

   If we assume a maximum size of 1492 bytes for an LSP, here are some
   rough estimates of bandwidth consumption at different flooding
   rates:

   +--------------+----------------+-------------+
   | LSPs/second  |  100 Mb Link   |  1 Gb Link  |
   +--------------+----------------+-------------+
   |     100      |     1.2 %      |    0.1 %    |
   +--------------+----------------+-------------+
   |     500      |     6.1 %      |    0.6 %    |
   +--------------+----------------+-------------+
   |     1000     |    12.1 %      |    1.2 %    |
   +--------------+----------------+-------------+

3.4.  Packet Prioritization on Receive

   There are three classes of PDUs sent by IS-IS:

   o  Hellos

   o  LSPs

   o  Complete Sequence Number PDUs (CSNPs) and Partial Sequence
      Number PDUs (PSNPs)

   Implementations today may prioritize the reception of Hellos over
   LSPs and SNPs in order to prevent a burst of LSP updates from
   triggering an adjacency timeout, which in turn would require
   additional LSPs to be updated.

   SNPs serve to acknowledge or trigger the transmission of specified
   LSPs.  On a point-to-point link, PSNPs acknowledge the receipt of
   one or more LSPs.  Because PSNPs (like all IS-IS PDUs) use TLVs in
   the body, it is possible to acknowledge multiple LSPs using a
   single PSNP.  For this reason, [ISO10589] specifies a delay
   (partialSNPInterval) before sending a PSNP so that the number of
   PSNPs required to be sent is reduced.  On receipt of a PSNP, the
   set of LSPs acknowledged by that PSNP can be marked so that they do
   not need to be retransmitted.
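   The delayed-acknowledgement behavior just described can be modeled
   as follows.  This is a deliberately simplified sketch - real PSNPs
   carry LSP Entries TLVs with sequence numbers and checksums, and
   timer handling is left abstract here - but it shows how arming a
   single partialSNPInterval timer lets one PSNP cover every LSP
   received in the window:

```python
class PsnpBatcher:
    """Batch LSP acknowledgements so one PSNP covers many LSPs."""

    def __init__(self, partial_snp_interval: float = 2.0):
        self.partial_snp_interval = partial_snp_interval  # seconds
        self.pending = []          # LSP IDs received but not yet acked
        self.timer_armed = False

    def on_lsp_received(self, lsp_id: str) -> None:
        """Record the LSP and arm the PSNP timer if not already armed."""
        self.pending.append(lsp_id)
        self.timer_armed = True

    def on_timer_expiry(self) -> list:
        """Emit the contents of one PSNP acknowledging all pending LSPs."""
        psnp, self.pending = self.pending, []
        self.timer_armed = False
        return psnp
```

   Shrinking partial_snp_interval, as faster flooding requires, trades
   a larger number of PSNPs for prompter acknowledgement.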
   If a PSNP is dropped on reception, this has a significant impact:
   the set of LSPs advertised in the PSNP cannot be marked as
   acknowledged, which results in needless retransmissions that may
   further delay transmission of other LSPs which have yet to be
   transmitted.  It may also make it more likely that a receiver
   becomes overwhelmed by LSP transmissions.

   It is therefore recommended that implementations prioritize the
   receipt of SNPs over LSPs.

4.  Minimizing LSP Generation

   In IS-IS, the unit of flooding is an LSP.  Each router may generate
   a set of LSPs at each supported level.  Each LSP in the set has an
   LSP number - a value from 0-N, where N = 255 for the base protocol.
   (N has been extended to 65535 by [RFC7356].)  Each LSP carries
   network information using defined Type/Length/Value (TLV) tuples.
   For example, some TLVs carry neighbor information and some TLVs
   carry reachable prefix information.  [ISO10589] strongly recommends
   preserving the association of a given advertisement (such as a
   neighbor) with a specific LSP whenever possible.  This minimizes
   the number of LSPs which need to be regenerated when a topology
   change occurs.  This recommendation becomes even more important as
   the scale of the network increases.

   Consider the following example.

   Node A has 11 neighbors currently in the UP state and is
   advertising them in three LSPs with content as follows:

   A.00-00 contains the following advertisements:
      Neighbor 1
      Neighbor 2
      Neighbor 3
      Neighbor 4
      Neighbor 5
   A.00-01 contains the following advertisements:
      Neighbor 6
      Neighbor 7
      Neighbor 8
      Neighbor 9
      Neighbor 10
   A.00-02 contains the following advertisements:
      Neighbor 11

   Imagine that the adjacency to Neighbor 3 goes down.  There are (at
   least) two ways that A could update its LSPs.
   Method 1: Node A removes the neighbor advertisement for Neighbor 3
   from A.00-00 and sends an update for that LSP.  LSPs 00-01 and
   00-02 are unchanged and so do not have to be flooded.

   Method 2: Node A attempts to reduce the number of LSPs currently
   active and updates the content as follows:

   A.00-00 contains the following advertisements:
      Neighbor 1
      Neighbor 2
      Neighbor 4
      Neighbor 5
      Neighbor 6
   A.00-01 contains the following advertisements:
      Neighbor 7
      Neighbor 8
      Neighbor 9
      Neighbor 10
      Neighbor 11
   A.00-02 becomes empty

   Node A now has to flood all three LSPs.  LSPs #0 and #1 are
   reflooded because their content has changed.  LSP #2 is purged.

   In a large scale network, the impact of using Method #2 becomes
   significant and introduces conditions where a much larger number of
   LSPs needs to be flooded than is the case with Method #1.

   In order to operate at scale, implementations need to follow the
   guidance in [ISO10589] and use Method #1 whenever possible.

5.  Redundant Flooding

   The default operation of the Update Process is to flood on all
   interfaces.  In cases where a network is highly meshed, this can
   result in a significant amount of redundant flooding: nodes will
   receive multiple copies of each updated LSP.

   There are defined mechanisms which can greatly reduce this
   redundant flooding.  These include:

   o  Mesh Groups [RFC2973]

   o  Dynamic Flooding [I-D.ietf-lsr-dynamic-flooding]

6.  Use of Jumbo Frames

   The maximum size of an LSP (LSPBufferSize) is a parameter that
   needs to be set consistently network-wide.  This is because IS-IS
   does not support fragmentation of its PDUs - so in order for
   network-wide flooding of an LSP to be successful, all routers must
   restrict their LSP size to a size which can be supported without
   fragmentation on all interfaces on which IS-IS operates.
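   The number of LSPs a node originates scales roughly inversely with
   the usable payload per LSP, which is why frame size matters.  A
   rough sketch (the 27-byte figure is the fixed LSP header size from
   [ISO10589]; the total TLV payload used here is purely illustrative):

```python
import math

LSP_HEADER_BYTES = 27  # fixed LSP header size per [ISO10589]

def lsps_needed(total_tlv_bytes: int, lsp_buffer_size: int) -> int:
    """Approximate LSP count needed to carry a node's TLV payload."""
    return math.ceil(total_tlv_bytes / (lsp_buffer_size - LSP_HEADER_BYTES))

# A node originating ~100 kB of TLV data (illustrative figure):
print(lsps_needed(100_000, 1492))   # default LSPBufferSize -> 69
print(lsps_needed(100_000, 8192))   # jumbo-frame buffer    -> 13
```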
   In networks where all interfaces on which IS-IS operates support
   large frames, LSPBufferSize may be set to a larger value than the
   default (1492).  This allows more routing information to be encoded
   in a single LSP, which means that fewer LSPs are generated by each
   node, and therefore the number of LSPs which need to be flooded can
   be reduced in some scenarios (e.g., node or interface bringup).

7.  Deployment Considerations

   As noted earlier in this document, it is desirable to have
   consistent flooding speeds on all nodes in the network.  Today,
   this is roughly achieved to the extent that current implementations
   flood at rates which are on the order of what is discussed in
   [ISO10589] (i.e., 33 LSPs/second).

   As the goal is to introduce an order of magnitude increase in the
   rate of flooding (e.g., 10 times the current flooding rate), a
   network which has a mixture of nodes which support the faster
   flooding speeds and nodes which do not is at greater risk of
   introducing longer periods of LSDB inconsistency in the network -
   which is likely to have a negative impact on convergence and
   increase the occurrence of traffic drops or looping.

   It is recommended that all nodes in the network support increased
   flooding rates before use of the increased flooding rates is
   enabled.

   Note that as the Update Process runs in the context of an area (or
   the L2 sub-domain), enablement can safely be done on a per-area
   basis even when nodes in another area do not support the faster
   flooding rates.

8.  IANA Considerations

   This document requires no actions by IANA.

9.  Security Considerations

   Security concerns for IS-IS are addressed in [ISO10589], [RFC5304],
   and [RFC5310].

10.  Acknowledgements

   Thanks to Bruno Decraene for his careful review and insightful
   comments.

11.  References

11.1.  Normative References

   [ISO10589] International Organization for Standardization,
              "Intermediate system to Intermediate system intra-domain
              routeing information exchange protocol for use in
              conjunction with the protocol for providing the
              connectionless-mode Network Service (ISO 8473)",
              ISO/IEC 10589:2002, Second Edition, Nov 2002.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC2973]  Balay, R., Katz, D., and J. Parker, "IS-IS Mesh Groups",
              RFC 2973, DOI 10.17487/RFC2973, October 2000.

   [RFC5304]  Li, T. and R. Atkinson, "IS-IS Cryptographic
              Authentication", RFC 5304, DOI 10.17487/RFC5304,
              October 2008.

   [RFC5310]  Bhatia, M., Manral, V., Li, T., Atkinson, R., White, R.,
              and M. Fanto, "IS-IS Generic Cryptographic
              Authentication", RFC 5310, DOI 10.17487/RFC5310,
              February 2009.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

11.2.  Informative References

   [I-D.ietf-lsr-dynamic-flooding]
              Li, T., Psenak, P., Ginsberg, L., Chen, H., Przygienda,
              T., Cooper, D., Jalil, L., Dontula, S., and G. S.
              Mishra, "Dynamic Flooding on Dense Graphs", draft-ietf-
              lsr-dynamic-flooding-08 (work in progress), December
              2020.

   [RFC7356]  Ginsberg, L., Previdi, S., and Y. Yang, "IS-IS Flooding
              Scope Link State PDUs (LSPs)", RFC 7356,
              DOI 10.17487/RFC7356, September 2014.
Authors' Addresses

   Les Ginsberg
   Cisco Systems
   821 Alder Drive
   Milpitas, CA 95035
   USA

   Email: ginsberg@cisco.com

   Peter Psenak
   Cisco Systems
   Apollo Business Center, Mlynske nivy 43
   Bratislava 821 09
   Slovakia

   Email: ppsenak@cisco.com

   Marek Karasek
   Cisco Systems
   Pujmanove 1753/10a, Prague 4 - Nusle
   Prague 10 14000
   Czech Republic

   Email: mkarasek@cisco.com

   Acee Lindem
   Cisco Systems
   301 Midenhall Way
   Cary, NC 27513
   USA

   Email: acee@cisco.com

   Tony Przygienda
   Juniper
   1137 Innovation Way
   Sunnyvale, CA
   USA

   Email: prz@juniper.net