Network Working Group                                          D. Saucez
Internet-Draft                                                     INRIA
Intended status: Experimental                             O. Bonaventure
Expires: June 23, 2014                                         UCLouvain
                                                              L. Iannone
                                                       Telecom ParisTech
                                                             C. Filsfils
                                                           Cisco Systems
                                                       December 20, 2013

                       LISP ITR Graceful Restart
                 draft-saucez-lisp-itr-graceful-03.txt

Abstract

   The Locator/ID Separation Protocol (LISP) is a map-and-encap
   mechanism that enables communication between hosts identified by
   their Endpoint IDentifiers (EIDs) over the Internet, where EIDs are
   not routable.  To do so, packets toward EIDs are encapsulated in
   packets with routing locators (RLOCs) to form dynamic tunnels.  An
   Ingress Tunnel Router (ITR) that encapsulates EID packets determines
   tunnel endpoints via mappings that associate EIDs to RLOCs.  Before
   encapsulating a packet, the ITR queries the mapping system to obtain
   the mapping associated with the EID of the packet it must
   encapsulate.  This mapping is cached by the ITR in its local
   EID-to-RLOC cache for any subsequent encapsulation toward the same
   EID.  LISP is scalable because EID-to-RLOC mappings are cached on
   ITRs.  Initially, the cache is empty and is populated progressively
   according to the traffic traversing the ITR.  However, after an ITR
   is restarted, e.g., for maintenance reasons, its cache is empty,
   which means that all packets re-routed to the freshly restarted ITR
   will cause cache misses and a potentially high loss rate.  In this
   draft, we present mechanisms to reduce the negative impact on
   traffic caused by the restart of an ITR in a LISP network.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).
   Note that other groups may also distribute working documents as
   Internet-Drafts.  The list of current Internet-Drafts is at
   http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 23, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Definition of terms
     2.1.  LISP Definition of Terms
   3.  Problem Statement
   4.  ITR Graceful Restart
   5.  Security Considerations
   6.  Conclusion
   7.  Acknowledgments
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Authors' Addresses

1.  Introduction

   The Locator/ID Separation Protocol (LISP) [RFC6830] relies on two
   principles.  First, Endpoint Identifiers (EIDs) are allocated to
   hosts while Routing Locators (RLOCs) are allocated to LISP Ingress
   Tunnel Routers (ITRs) and Egress Tunnel Routers (ETRs).  EIDs are
   not directly routable on the global Internet; only RLOCs are.
   Second, LISP relies on mapping and encapsulation.  Hosts are located
   in sites and are served by ITRs and ETRs.  When host A.1 in site A
   needs to send a packet to host B.2 in site B, its packet is
   intercepted by an ITR that serves its site.  The ITR queries a
   mapping system to find the RLOC of the ETR that serves EID B.2.
   Once the RLOC of the ETR serving B's site is known, the ITR
   encapsulates the packet using the encapsulation defined in [RFC6830]
   so that it can reach B's ETR.  B's ETR decapsulates the packet and
   forwards it to host B.2.

   Packets from a LISP site are routed to their closest ITR by means of
   the routing system (e.g., IGP).  When an ITR has just booted (either
   because it has just been added to the network or because it has been
   restarted due to maintenance), a large portion of the traffic can
   potentially be routed to it.  However, in this case, its EID-to-RLOC
   cache is empty.  With traditional routing, such a massive
   redirection has a minor impact on the traffic (except for path
   stretch and latency).  In the context of LISP, however, it can cause
   a high volume of cache misses (i.e., no EID-to-RLOC cache entry
   matching the destination EID), resulting in a high volume of dropped
   packets and hence potentially leading to severe traffic disruption.
   Furthermore, such a high number of cache misses triggers a burst of
   Map-Requests that may overload the mapping system (or Map Resolvers
   if [RFC6833] is used).
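
   The behavior described above can be illustrated with a minimal
   sketch.  It is not taken from any LISP implementation: the class
   name ToyItr, its counters, and the demo() helper are hypothetical,
   the mapping system is reduced to an in-memory table, and the
   Map-Reply is assumed to arrive before the next packet.  Packets
   that miss the cache are dropped, as the implementations known to
   the authors do.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class ToyItr:
    """Toy ITR with an EID-to-RLOC cache (illustrative model only)."""
    mapping_system: dict                      # ip_network -> RLOC string (stand-in)
    cache: dict = field(default_factory=dict)
    encapsulated: int = 0
    dropped: int = 0
    map_requests: int = 0

    @staticmethod
    def _match(table, eid):
        """Return (prefix, RLOC) for the longest prefix covering eid, or None."""
        addr = ipaddress.ip_address(eid)
        hits = [prefix for prefix in table if addr in prefix]
        if not hits:
            return None
        best = max(hits, key=lambda prefix: prefix.prefixlen)
        return best, table[best]

    def forward(self, eid):
        """Encapsulate on a cache hit; on a miss, drop and send a Map-Request."""
        hit = self._match(self.cache, eid)
        if hit is not None:
            self.encapsulated += 1
            return hit[1]
        self.dropped += 1
        self.map_requests += 1
        # Assumption: the reply is installed before the next packet arrives.
        reply = self._match(self.mapping_system, eid)
        if reply is not None:
            prefix, rloc = reply
            self.cache[prefix] = rloc
        return None

    def restart(self):
        """A restart empties the EID-to-RLOC cache."""
        self.cache.clear()


def demo():
    mapping_system = {
        ipaddress.ip_network("192.0.2.0/24"): "203.0.113.1",
        ipaddress.ip_network("198.51.100.0/24"): "203.0.113.2",
    }
    itr = ToyItr(mapping_system)
    flows = ["192.0.2.10", "192.0.2.77", "198.51.100.20"]
    for _ in range(5):                        # five packets per ongoing flow
        for eid in flows:
            itr.forward(eid)
    before = (itr.encapsulated, itr.dropped, itr.map_requests)
    itr.restart()
    for eid in flows:                         # one more packet per flow
        itr.forward(eid)
    after = (itr.encapsulated, itr.dropped, itr.map_requests)
    return before, after
```

   In this toy model, every ongoing flow whose EID-prefix is no longer
   cached loses at least one packet right after the restart, and each
   such loss also produces a Map-Request.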

   This memo raises the question of how to perform a graceful (re)start
   of ITRs in LISP networks.  It documents the problem of ITR (re)start
   with the associated risk of a "miss storm" and discusses EID-to-RLOC
   cache synchronization solutions that provide ITR graceful restart
   without overwhelming the mapping system and without high packet
   losses.

2.  Definition of terms

   This section defines the main elements and terms used throughout the
   document.  The terms introduced by this document are defined below,
   while Section 2.1 provides the definitions related to the LISP
   architecture to make this document easier to read.

   EID-to-RLOC cache miss storm:  A sudden rise of the cache miss rate
      at an ITR to a level significantly higher than the rate observed
      at steady state on the ITR.

   Map-Request storm:  The side effect of an EID-to-RLOC cache miss
      storm is the generation of a high number of Map-Requests, which
      is called a Map-Request storm.

   Synchronization Set:  The set of ITRs that are potentially on the
      path of the same traffic and that should therefore have their
      EID-to-RLOC caches synchronized in order to avoid EID-to-RLOC
      cache miss storms.

   ITR Restart:  Generic term indicating an ITR that has just completed
      the bootstrap phase and is resuming normal operation.  It can be
      either an ITR that has been added to the network (hence, actually
      at its first boot as part of the specific network) or an ITR
      re-booting for reasons such as maintenance or an outage.

2.1.  LISP Definition of Terms

   LISP operates on two name spaces and introduces several new network
   elements.  This section provides high-level definitions of the LISP
   name spaces and network elements and as such, it MUST NOT be
   considered as an authoritative source.
   The reference to the authoritative document for each term is
   included in every term description.

   Ingress Tunnel Router (ITR) [RFC6830]:  An ITR is a router that
      resides in a LISP site.  Packets sent by sources inside of the
      LISP site to destinations outside of the site are candidates for
      encapsulation by the ITR.  The ITR treats the IP destination
      address as an EID and performs an EID-to-RLOC mapping lookup.
      The router then prepends an "outer" IP header with one of its
      globally routable RLOCs in the source address field and the
      result of the mapping lookup in the destination address field.
      Note that this destination RLOC MAY be an intermediate, proxy
      device that has better knowledge of the EID-to-RLOC mapping
      closer to the destination EID.  In general, an ITR receives IP
      packets from site end-systems on one side and sends
      LISP-encapsulated IP packets toward the Internet on the other
      side.  Specifically, when a service provider prepends a LISP
      header for Traffic Engineering purposes, the router that does
      this is also regarded as an ITR.  The outer RLOC the ISP ITR uses
      can be based on the outer destination address (the originating
      ITR's supplied RLOC) or the inner destination address (the
      originating host's supplied EID).

   Egress Tunnel Router (ETR) [RFC6830]:  An ETR is a router that
      accepts an IP packet where the destination address in the "outer"
      IP header is one of its own RLOCs.  The router strips the "outer"
      header and forwards the packet based on the next IP header found.
      In general, an ETR receives LISP-encapsulated IP packets from the
      Internet on one side and sends decapsulated IP packets to site
      end-systems on the other side.  ETR functionality does not have
      to be limited to a router device.  A server host can be the
      endpoint of a LISP tunnel as well.

   Routing LOCator (RLOC) [RFC6830]:  An RLOC is an IPv4 [RFC0791] or
      IPv6 [RFC2460] address of an egress tunnel router (ETR).  An RLOC
      is the output of an EID-to-RLOC mapping lookup.  An EID maps to
      one or more RLOCs.  Typically, RLOCs are numbered from
      topologically aggregatable blocks that are assigned to a site at
      each point to which it attaches to the global Internet; where the
      topology is defined by the connectivity of provider networks,
      RLOCs can be thought of as PA addresses.  Multiple RLOCs can be
      assigned to the same ETR device or to multiple ETR devices at a
      site.

   Endpoint ID (EID) [RFC6830]:  An EID is a 32-bit (for IPv4) or
      128-bit (for IPv6) value used in the source and destination
      address fields of the first (most inner) LISP header of a packet.
      The host obtains a destination EID the same way it obtains a
      destination address today, for example through a Domain Name
      System (DNS) [RFC1034] lookup or Session Initiation Protocol
      (SIP) [RFC3261] exchange.  The source EID is obtained via
      existing mechanisms used to set a host's "local" IP address.  An
      EID used on the public Internet must have the same properties as
      any other IP address used in that manner; this means, among other
      things, that it must be globally unique.  An EID is allocated to
      a host from an EID-prefix block associated with the site where
      the host is located.  An EID can be used by hosts to refer to
      other hosts.  EIDs MUST NOT be used as LISP RLOCs.  Note that EID
      blocks MAY be assigned in a hierarchical manner, independent of
      the network topology, to facilitate scaling of the mapping
      database.  In addition, an EID block assigned to a site may have
      site-local structure (subnetting) for routing within the site;
      this structure is not visible to the global routing system.  In
      theory, the bit string that represents an EID for one device can
      represent an RLOC for a different device.
      As the architecture is realized, if a given bit string is both an
      RLOC and an EID, it must refer to the same entity in both cases.
      When used in discussions with other Locator/ID separation
      proposals, a LISP EID will be called an "LEID".  Throughout this
      document, any reference to "EID" refers to an LEID.

   EID-to-RLOC cache [RFC6830]:  The EID-to-RLOC cache is a
      short-lived, on-demand table in an ITR that stores, tracks, and
      is responsible for timing-out and otherwise validating
      EID-to-RLOC mappings.  This cache is distinct from the full
      "database" of EID-to-RLOC mappings; it is dynamic, local to the
      ITR(s), and relatively small, while the database is distributed,
      relatively static, and much more global in scope.

   EID-to-RLOC Database [RFC6830]:  The EID-to-RLOC database is a
      global distributed database that contains all known EID-prefix to
      RLOC mappings.  Each potential ETR typically contains a small
      piece of the database: the EID-to-RLOC mappings for the EID
      prefixes "behind" the router.  These map to one of the router's
      own, globally visible, IP addresses.  The same database mapping
      entries MUST be configured on all ETRs for a given site.  In a
      steady state the EID-prefixes for the site and the locator-set
      for each EID-prefix MUST be the same on all ETRs.  Procedures to
      enforce and/or verify this are outside the scope of this
      document.  Note that there MAY be transient conditions when the
      EID-prefix for the site and locator-set for each EID-prefix may
      not be the same on all ETRs.  This has no negative implications
      since a partial set of locators can be used.

3.  Problem Statement

   LISP is a map-and-encap mechanism where an ITR dynamically learns a
   mapping when it receives a packet for a destination EID toward which
   it has not encapsulated before.
   When such a packet is received, a cache miss occurs and the ITR
   sends a Map-Request to the mapping system to retrieve the mapping
   that corresponds to the destination of the packet that caused the
   cache miss.  The ITR then caches the mapping for any subsequent
   packet toward the same destination.  LISP [RFC6830] does not specify
   how a packet that causes a cache miss must be handled.  However, to
   the best of our knowledge, current implementations drop packets that
   cause a cache miss.  The consequences of this practice are twofold.
   On the one hand, misses imply packet losses and hence performance
   issues.  On the other hand, because each miss triggers a
   Map-Request, cache misses put load on the mapping system.

   When an ITR restarts, its EID-to-RLOC cache is initially empty and
   grows progressively as it is populated by the traffic.  However,
   because mappings have a limited lifetime, the EID-to-RLOC cache size
   converges to a stable value and misses are always expected to occur.
   As shown in [Networking12], at steady state, networks experience a
   rather stable, and limited, miss rate.  However, when an ITR is
   restarted, e.g., for a maintenance operation, a cache miss storm can
   be observed.  An EID-to-RLOC cache miss storm is a phenomenon during
   which the miss rate is significantly higher than the miss rate
   normally observed in the network.  A miss storm has two severe side
   effects: first, it abruptly increases the load on the mapping
   system, and second, many packets are dropped, which causes
   performance issues.  When an ITR is restarted, two cache miss storms
   can actually be observed.  The first one happens when the ITR is
   stopped (or fails), while the second one happens when the ITR is
   again available for encapsulation.
   The first EID-to-RLOC cache miss storm occurs because all the
   traffic is suddenly redirected to the other ITRs in the network,
   which might not have the mappings for all the EIDs of ongoing
   communications.  The second EID-to-RLOC cache miss storm can be
   observed when the ITR is restarted, because it might have to
   encapsulate all the traffic redirected to it.  When the ITR is
   freshly restarted, its cache is empty, meaning that every packet
   will cause a miss at that particular time.

   Cache misses are normal in a LISP network.  However, these misses
   normally happen only when the first packet of the first flow toward
   an EID is received by an ITR, which has no significant impact on the
   traffic at steady state in the network.  On the contrary, when an
   ITR restarts, cache misses happen on ongoing, potentially
   high-throughput flows for which a high loss rate is not acceptable.
   For this reason, techniques must be applied to avoid EID-to-RLOC
   cache miss storms upon ITR restarts.

   It can be argued that if a router fails and is out of order for a
   long time, avoiding the EID-to-RLOC cache miss storm, which lasts on
   the order of minutes, is not worthwhile.  This is not accurate.
   When a router fails, backup solutions are usually already deployed
   to redirect the traffic instantaneously, with almost no losses.
   Such redirection remains in place until the failure is fixed,
   without any consequence for the traffic except for using a different
   path.  Similarly, when the router comes back online, traffic flows
   through it again only once the state of the router is consistent
   with the rest of the network, so that redirecting the traffic
   through it causes no disruption.  None of this is true for ITRs.
   Even if existing techniques make it possible to redirect the traffic
   with no losses, the LISP encapsulation engine will drop packets
   because of the lack of mappings in the cache, creating traffic
   disruption and a rise in signaling traffic on the mapping system.

   In this memo, we open the discussion on techniques that can be used
   to avoid EID-to-RLOC cache miss storms in the case of a planned ITR
   restart.  In other words, we discuss how to achieve ITR graceful
   restart.

4.  ITR Graceful Restart

   The addition of an ITR causes traffic to be redirected to the
   freshly started ITR and hence risks causing a miss storm.  As the
   cache of an ITR is empty when it starts, every received packet
   potentially causes a miss.  We identify three techniques to protect
   the network from miss storms when an ITR is added (or restarted) in
   the network.  All the ITRs that are potentially used by the same
   node in the network are grouped in synchronization sets.

   o  Non-volatile mapping storage: when an ITR has to be stopped, its
      EID-to-RLOC cache is stored on a non-volatile medium (e.g., a
      hard drive) such that when it is restarted, it can load an
      EID-to-RLOC cache equivalent to the one it had before it
      restarted.

   o  ITR deflection: when a miss occurs at an ITR while it is starting
      up, the ITR deflects the packet that caused the miss to an ITR in
      its synchronization set and, in parallel, sends a Map-Request for
      the EID that caused the miss.  Note that the Map-Request can even
      be sent to another ITR of the site or a Map Resolver working in
      proxy mode.  In this manner, mapping retrieval latency can be
      shortened.

   o  ITR cache synchronization: upon startup, the ITR synchronizes its
      cache with the other ITRs in its synchronization set.  The ITR is
      marked as available only after the cache is synchronized.
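
   The three techniques listed above can be sketched as follows.  This
   is only an illustration under simplifying assumptions and not a
   specification: the class GracefulItr, its methods, the JSON file
   format, and the "deflected" flag are all hypothetical; the cache is
   keyed by exact EID rather than by EID-prefix; and the Map-Reply is
   assumed to be installed immediately after the Map-Request is sent.

```python
import json
import time


class GracefulItr:
    """Toy ITR illustrating the three restart techniques (names hypothetical)."""

    def __init__(self, name, resolver):
        self.name = name
        self.resolver = resolver   # callable: eid -> (rloc, ttl_seconds) or None
        self.cache = {}            # eid -> (rloc, absolute expiry time)
        self.sync_set = []         # other ITRs in the synchronization set
        self.starting = True       # graceful restart period still running
        self.available = False     # True once the ITR attracts traffic
        self.map_requests = 0

    # Technique 1: non-volatile mapping storage
    def save_cache(self, path):
        with open(path, "w") as f:
            json.dump(self.cache, f)

    def load_cache(self, path, now=None):
        now = time.time() if now is None else now
        with open(path) as f:
            stored = json.load(f)
        # Skip mappings whose lifetime ran out while the ITR was down.
        self.cache = {eid: (rloc, exp)
                      for eid, (rloc, exp) in stored.items() if exp > now}

    # Technique 2: ITR deflection
    def forward(self, eid, deflected=False, now=None):
        now = time.time() if now is None else now
        entry = self.cache.get(eid)
        if entry is not None and entry[1] > now:
            return ("encap", self.name, entry[0])
        self._map_request(eid, now)
        # Deflect only while starting, and never deflect a packet twice.
        if self.starting and not deflected:
            for peer in self.sync_set:
                if peer.available:
                    return peer.forward(eid, deflected=True, now=now)
        return ("drop", self.name, None)

    def _map_request(self, eid, now):
        self.map_requests += 1
        reply = self.resolver(eid)
        if reply is not None:
            rloc, ttl = reply
            self.cache[eid] = (rloc, now + ttl)

    # Technique 3: ITR cache synchronization
    def synchronize(self, now=None):
        now = time.time() if now is None else now
        for peer in self.sync_set:
            for eid, (rloc, exp) in peer.cache.items():
                if exp > now:
                    self.cache.setdefault(eid, (rloc, exp))
        # Only now does the ITR advertise itself and attract traffic.
        self.starting = False
        self.available = True
```

   In this sketch, a restarting ITR either calls load_cache() before
   accepting traffic, forwards with deflection while "starting" is
   set, or calls synchronize() and becomes available only afterwards.
   The "deflected" argument stands in for the packet marking needed to
   avoid deflection loops.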

   Non-volatile storage has the advantage of being transparent to the
   network and is well suited to short unavailability periods (e.g.,
   the ITR reboots after an upgrade).  However, this technique is not
   suited to long unavailability periods, after which most of the
   entries might be outdated and new prefixes unknown, or to the case
   where an ITR is added to the network for the first time.  This
   technique is thus recommended only for networks with low mapping
   cache dynamics.

   Traffic deflection to other ITRs (or a PxTR) upon misses causes
   several issues.  First, the ITR that is restarting must determine
   the ITR to which the packet must be deflected.  Second, packets must
   be marked as deflected in order to avoid loops.  In addition, the
   ITR must determine its graceful restart period so that it stops
   deflecting traffic once it reaches steady state.  The deflection
   from one ITR to another can be done directly in LISP, where the ITR
   that has just started LISP-encapsulates and forwards the packet to
   another ITR.  This last ITR must then also run the ETR functionality
   to decapsulate the packet.

   ITR EID-to-RLOC cache synchronization is the technique best suited
   to graceful restart.  When the ITR starts, it sends requests to an
   ITR in its synchronization set (or its MR) to obtain the full cache.
   When the synchronization is finished, the ITR advertises itself as
   an ITR in the network, so that it receives traffic to encapsulate
   only once its cache is synchronized.

5.  Security Considerations

   Security considerations have to be written according to the
   technique finally chosen for ITR graceful restart.  As a general
   security recommendation, mappings must be authenticated in order to
   avoid relay attacks or denial of service.  In any case, ITR graceful
   restart should not introduce any new threat in the core LISP
   mechanism.

6.  Conclusion

   In this memo, we highlighted the implications of the addition or the
   restart of an ITR in a LISP network.  When an ITR is added to a LISP
   network, its EID-to-RLOC cache is initially empty.  Therefore, when
   ongoing flows are routed to the freshly started ITR, their packets
   can cause miss storms, which result in packet drops and mapping
   system overload.  To tackle this issue, we propose and discuss three
   different techniques to reduce the impact of a planned ITR restart.

7.  Acknowledgments

   The authors would like to acknowledge Dino Farinacci, Vince Fuller,
   Darrel Lewis, Fabio Maino, and Simon van der Linden.

8.  References

8.1.  Normative References

   [RFC6830]  Farinacci, D., Fuller, V., Meyer, D., and D. Lewis, "The
              Locator/ID Separation Protocol (LISP)", RFC 6830, January
              2013.

   [RFC6833]  Fuller, V. and D. Farinacci, "Locator/ID Separation
              Protocol (LISP) Map-Server Interface", RFC 6833, January
              2013.

8.2.  Informative References

   [Networking12]
              Saucez, D., Kim, J., Iannone, L., Bonaventure, O., and C.
              Filsfils, "A local Approach to Fast Failure Recovery of
              LISP Ingress Tunnel Routers", The 11th International
              Conference on Networking (Networking'12), May 2012.

   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791, September
              1981.

   [RFC1034]  Mockapetris, P., "Domain names - concepts and facilities",
              STD 13, RFC 1034, November 1987.

   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", RFC 2460, December 1998.

   [RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
              A., Peterson, J., Sparks, R., Handley, M., and E.
              Schooler, "SIP: Session Initiation Protocol", RFC 3261,
              June 2002.

Authors' Addresses

   Damien Saucez
   INRIA
   2004 route des Lucioles BP 93
   Sophia Antipolis Cedex  06902
   France

   Email: damien.saucez@inria.fr


   Olivier Bonaventure
   UCLouvain
   Universite catholique de Louvain, Place Sainte Barbe 2
   Louvain-la-Neuve  1348
   Belgium

   Email: olivier.bonaventure@uclouvain.be
   URI:   http://inl.info.ucl.ac.be


   Luigi Iannone
   Telecom ParisTech
   23, Avenue d'Italie
   75013 Paris
   France

   Email: luigi.iannone@telecom-paristech.fr


   Clarence Filsfils
   Cisco Systems
   Brussels  1000
   Belgium

   Email: cf@cisco.com