BESS Workgroup                                           J. Rabadan, Ed.
Internet Draft                                               J. Kotalwar
Intended status: Standards Track                            S. Sathappan
                                                                   Nokia

                                                                Z. Zhang
                                                                 Juniper

                                                              A. Sajassi
                                                                   Cisco

Expires: May 3, 2018                                    October 30, 2017

                       PIM Proxy in EVPN Networks
                    draft-skr-bess-evpn-pim-proxy-01

Abstract

   Ethernet Virtual Private Networks [RFC7432] are becoming prevalent in
   Data Centers, Data Center Interconnect (DCI) and Service Provider VPN
   applications. One of the goals that EVPN pursues is the reduction of
   flooding and the efficiency of CE-based control plane procedures in
   Broadcast Domains. Examples of this are Proxy ARP/ND and IGMP/MLD
   Proxy. This document complements the latter, describing the
   procedures required to minimize the flooding of PIM messages in EVPN
   Broadcast Domains, and optimize the IP Multicast delivery between PIM
   routers.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on May 3, 2018.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. PIM Proxy Operation in EVPN Broadcast Domains
      2.1. Multicast Router Discovery Procedures in EVPN
         2.1.1. Discovering PIM Routers
         2.1.2. Discovering IGMP Queriers
      2.2. PIM Join/Prune Proxy Procedures
      2.3. PIM Assert Optimization
         2.3.1 Assert Optimization Procedures in Downstream PEs
         2.3.2 Assert Optimization Procedures in Upstream PEs
      2.4. EVPN Multi-Homing and State Synchronization
   3. Interaction with IGMP-snooping and Sources
   4. BGP Information Model
      4.1 Multicast Router Discovery (MRD) Route
      4.2 Selective Multicast Ethernet Tag Route for PIM Proxy
      4.3 PIM RPT-Prune Route
      4.4 IGMP/PIM Join Synch Route for PIM Proxy
      4.5 IGMP/PIM RPT-Prune Synch Route for PIM Proxy
   5. Conclusions
   6. Conventions used in this document
   7. Security Considerations
   8. IANA Considerations
   9. Terminology
   10. References
      10.1 Normative References
      10.2 Informative References
   11. Acknowledgments
   12. Contributors
   13. Authors' Addresses

1. Introduction

   Ethernet Virtual Private Networks [RFC7432] are becoming prevalent in
   Data Centers, Data Center Interconnect (DCI) and Service Provider VPN
   applications. One of the goals that EVPN pursues is the reduction of
   flooding and the efficiency of CE-based control plane procedures in
   Broadcast Domains. Examples of this are [EVPN-PROXY-ARP-ND] for
   improving the efficiency of CE's ARP/ND protocols, and
   [EVPN-IGMP-MLD-PROXY] for IGMP/MLD protocols.

   This document focuses on optimizing the behavior of PIM in EVPN
   Broadcast Domains and re-uses some procedures of
   [EVPN-IGMP-MLD-PROXY]. The reader is also advised to consult
   [RFC8220] to understand certain aspects of the procedures for PIM
   Join/Prune messages received on Attachment Circuits (ACs).

   Section 2 describes the PIM Proxy procedures that the implementation
   should follow, including:

   o The use of EVPN to suppress the flooding of PIM Hello messages in
     shared Broadcast Domains. The benefit of this is twofold:
     - PIM Hello messages will ONLY be flooded to Attachment Circuits
       that are connected to PIM routers, as opposed to all the CEs and
       hosts in the Broadcast Domain.
     - Soft-state PIM Hello messages will be replaced by hard-state BGP
       messages that don't need to be refreshed periodically.

   o The use of EVPN to discover IGMP Queriers, while avoiding the
     flooding of IGMP Queries in the core.

   o The procedures to proxy PIM Join/Prune messages and replace them
     with hard-state EVPN routes that don't need to be refreshed
     periodically. By using BGP EVPN to propagate both Hello and
     Join/Prune messages, we also avoid out-of-order delivery between
     both types of PIM messages.

   o An EVPN-based procedure that removes the need for the PIM routers
     connected to the shared Broadcast Domain to run any PIM Assert
     procedure. PIM Assert procedures may be expensive for PIM routers
     in terms of resource consumption.

   o The use of procedures similar to the ones defined in
     [EVPN-IGMP-MLD-PROXY] to synchronize multicast states among the PEs
     in the same Ethernet Segment.

   Section 3 describes the interaction of PIM Proxy with IGMP Proxy PEs
   and Multicast Sources connected to the same EVPN Broadcast Domain.

   Section 4 defines the BGP Information Model that this document
   requires to address the PIM Proxy procedures.

   This document assumes the reader is familiar with PIM and IGMP
   protocols.

2. PIM Proxy Operation in EVPN Broadcast Domains

   This section describes the operation of PIM Proxy in EVPN Broadcast
   Domains (BDs). Figure 1 depicts an EVPN Broadcast Domain defined in
   four PEs that are connected to PIM routers. This example will be used
   throughout this section and assumes both R4 and R5 are PIM Upstream
   Neighbors for PIM routers R1, R2 and R3 and multicast group G1. In
   this situation, the PIM multicast traffic flows from R4 or R5 to R1,
   R2 and R3. The PIM Join/Prune signaling will flow in the opposite
   direction. From a terminology perspective, we consider PE1 and PE2 as
   egress or downstream PEs, whereas PE3 and PE4 are ingress or upstream
   PEs.

   J(*,G1,IP5)
   +--+
   |R1+------>                XXXXXXXX
   +--+       +-----+   XXXX XX       XXXXX   +-----+      +--+
              | PE1 |XXXXX            XXXX  XX| PE3 +----> |R4|
   +--+       |     |                         |     |      +--+
   |R2+-----> +-----+                         +-----+   <----
   +--+              X                      XX        multicast
   J(*,G1,IP5)      X                        XXX       (S1,G1)
                  XXX      EVPN Broadcast      XX
                   X           Domain           X
   +--+       +-----+                          X            RP
   |R3+--->   | PE2 |                       XX+-----+      +--+
   +--+       |     |                   XXXX  | PE4 +-->   |R5|
              +-----+XXXX           XXXXX     |     |      +--+
   J(S1,G1,IP4)          X    X    X          +-----+
                          XX  XXX XX  XXX
                           XXXXXX XXXXX XXX

      Figure 1 - PIM Routers connected by an EVPN Broadcast Domain

   Note that any PIM message from a router that is not explicitly
   covered in this document will be forwarded by the PEs normally, in
   the data path, as a unicast or multicast packet.

2.1. Multicast Router Discovery Procedures in EVPN

   The procedures defined in this section make use of the Multicast
   Router Discovery (MRD) route described in section 4 and are OPTIONAL.
   An EVPN router not implementing this specification will transparently
   flood PIM Hello messages and IGMP Queries to remote PEs.

2.1.1. Discovering PIM Routers

   As described in [RFC4601] for shared LANs, an EVPN Broadcast Domain
   may have multiple PIM routers connected to it and a single one of
   these routers, the DR, will act on behalf of directly connected hosts
   with respect to the PIM-SM protocol. The DR election, as well as
   discovery and negotiation of options in PIM, is performed using Hello
   messages. PIM Hello messages are periodically exchanged and flooded
   in EVPN Broadcast Domains that don't follow this specification.

   When PIM Proxy is enabled, an EVPN PE will snoop PIM Hello messages
   and forward them only to local ACs where PIM routers have been
   detected. This document assumes that all the procedures defined in
   [RFC8220] to snoop PIM Hellos on local ACs and build the PIM Neighbor
   DB on the PEs are followed.
   However, PIM Hello messages MUST NOT be forwarded to remote EVPN PEs.

   Using Figure 1 as an example, the PIM Proxy operation for Hello
   messages is as follows:

   1) The arrival of a new PIM Hello message at e.g. PE1 will trigger an
      MRD route advertisement including:
      o The IP address and length of the multicast router that issued
        the Hello message. E.g. R1's IP address and length.
      o The DR Priority copied from the Hello DR Priority TLV.
      o Q flag set (if the multicast router is a Querier).
      o P flag set that indicates the router is PIM capable.

   2) All other PEs import the MRD route and do the following:
      o Add the multicast router address to the PIM Neighbor Database
        (PIM Nbr DB) associated to the Originator Router Address.
      o Generate a PIM hello where the IP Source Address is the
        Multicast Router IP and the DR Priority is copied from the
        route. This PIM hello is sent to all the local ACs connected to
        a PIM router. For example, PE3 will send the generated hello
        message to R4.

   3) Each PE will build its PIM Nbr DB out of the local PIM hello
      messages and/or remote MRD routes. The PIM hello timers and other
      hello parameters are not propagated in the MRD routes.

      o The timers are handled locally by the PE and as per [RFC4601].
        This is valid for the hold_time (when a PIM router or PE
        receives a hello message, it resets the neighbor-expiry timer)
        and for the other timers.

      o The Generation ID option is also processed locally on the PE, as
        well as the Generation ID changes for a given multicast router.
        It is not propagated in the MRD route.

      o Procedures described in [RFC4601] are used to remove a local AC
        PIM router from the PIM Nbr DB. When a local router is removed
        from the DB, the MRD route is withdrawn, unless the local router
        is still sending Queries, in which case the route is updated
        with flags P=0 and Q=1.
        Upon receiving the update, the other PEs will remove the router
        from the PIM Nbr DB but not from the list of queriers.

   4) Based on regular PIM DR election procedures (highest DR Priority
      or highest IP), each PE is aware of who the DR is for the BD. For
      more information, refer to section "3. Interaction with
      IGMP-snooping and Sources".

2.1.2. Discovering IGMP Queriers

   In (EVPN) Broadcast Domains that are shared among not only PIM
   routers but also IGMP hosts, one or more PIM routers will also be
   configured as IGMP Queriers. The proxy Querier mechanism described in
   [EVPN-IGMP-MLD-PROXY] suppresses the flooding of queries on the
   Broadcast Domain, by using PE generated Queries from an anycast IP
   address.

   While the proxy Querier mechanism works in most of the use-cases, it
   is sometimes desirable to have a more transparent behavior and
   propagate existing multicast router IGMP Queries as opposed to
   "blindly" querying all the hosts from the PEs. The MRD route defined
   in section 4 can be used for that purpose.

   When the discovered local PIM router is also sending IGMP Queries,
   the PE will issue an MRD route for the multicast router with both Q
   (IGMP Querier) and P (PIM router) flags set. Note that the PE may set
   both flags or only one of them, depending on the capabilities of the
   local router.

   A PE receiving an MRD route with Q=1 will generate IGMP Query
   messages, using the multicast router IP address encoded in the
   received MRD route. If more than one IGMP Querier exists in the EVI,
   the PE receiving the MRD routes with Q=1 will select the lowest IP
   address, as per [RFC2236]. Note that, upon receiving the MRD routes
   with Q=1, the PE must generate IGMP Queries and forward them to all
   the local ACs. Other Queriers listening to these received Query
   messages will stop sending Queries if they are no longer the selected
   Querier, as per [RFC2236].
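
   As an illustration only, the Querier selection above can be sketched
   as follows. The MrdRoute class and the select_querier function are
   hypothetical names introduced for this example; the flag positions
   follow the Flags description in section 4.1, and the sketch assumes
   that all Querier addresses in the Broadcast Domain belong to the
   same address family.

```python
import ipaddress
from dataclasses import dataclass

Q_FLAG = 0x01  # Querier flag: least significant bit of the MRD Flags octet
P_FLAG = 0x02  # PIM router flag: second low-order bit of the Flags octet


@dataclass(frozen=True)
class MrdRoute:
    """Hypothetical in-memory view of a received MRD route."""
    mcast_router: str  # Multicast Router Address, in text form
    flags: int         # Flags octet


def select_querier(routes):
    """Return the lowest address among routes with Q=1, or None.

    Assumes a single address family; comparing IPv4 with IPv6
    addresses would raise TypeError.
    """
    queriers = [ipaddress.ip_address(r.mcast_router)
                for r in routes if r.flags & Q_FLAG]
    return str(min(queriers)) if queriers else None
```

   Addresses are compared numerically rather than as strings, so that,
   for instance, 10.0.0.9 is considered lower than 10.0.0.10.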

   This procedure allows the EVPN PEs to act as proxy Queriers, but
   using the IP address of the best existing IGMP Querier in the EVPN
   Broadcast Domain. This can help IGMP hosts troubleshoot any issues on
   the IGMP routers and check their connectivity to them.

2.2. PIM Join/Prune Proxy Procedures

   This section describes the procedures associated with the PIM Proxy
   function for Join and Prune messages. This document assumes that all
   the procedures defined in [RFC8220] to build multicast states on the
   PEs' local ACs are followed. Figure 2 illustrates a scenario where
   PIM Proxy is enabled on the EVPN PEs.

   J(*,G1,IP5)
   +--+                                              J(*,G1,IP5)
   |R1+------>                XXXXXXXX               P(S1,G1,IP5,rpt)
   +--+       +-----+   XXXX XX       XXXXX   +-----+      +--+
              | PE1 |XXXXX            XXXX  XX| PE3 +----> |R4|
   +--+       |     |       SMET              |     |      +--+
   |R2+-----> +-----+     (*,G1,IP5)          +-----+
   +--+              X   +--------->        XX
   J(*,G1,IP5)      X                        XXX
                   XX                          XX
                   X                            X    J(*,G1,IP5)
   +--+       +-----+      SMET                X     P(S1,G1,IP5,rpt)
   |R3+--->   | PE2 | (S1,G1,IP5,rpt)       XX+-----+      +--+
   +--+       |     |   +-------->      XXXX  | PE4 +-->   |R5|
              +-----+XXXX           XXXXX     |     |      +--+
   P(S1,G1,IP5,rpt)      X    X    X          +-----+       RP
                          XX  XXX XX  XXX
                           XXXXXX XXXXX XXX

                Figure 2 - Proxy PIM Join/Prune in EVPN

   PIM J/P messages are sent by the routers towards upstream sources and
   RPs:
   o (*,G) is used in Join/Prune messages that are sent towards the RP
     for the specified group.
   o (S,G) is used in Join/Prune messages sent towards the specified
     source.
   o (S,G,rpt) is used in Join/Prune messages sent towards the RP. We
     refer to this as RPT message and the Prune message always precedes
     the Join message.

   The typical sequence of PIM messages (for a group) seen in a BD
   connecting PIM routers is the following:

   a) (*,G) Join issued by a downstream router to the RP (to join the
      RP Tree).
   b) (S,G) Join issued by a downstream router switching to the SPT.
   c) (S,G,rpt) Prune issued by a downstream router to the RP to prune
      a specific source from the RPT.
   d) (S,G) Prune issued by a downstream router no longer interested
      in the SPT.
   e) (S,G,rpt) Join issued by a downstream router interested (again)
      in the RPT for (S,G).

   The Proxy PIM procedures for Join/Prune messages are summarized as
   follows:

   1) Downstream PE procedures:

      o A downstream PE will snoop PIM Join/Prune messages and won't
        forward them to remote PEs.

      o Triggered by the reception of the PIM Join message, a downstream
        PE will advertise an SMET route, including the source, group and
        Upstream Neighbor as received from the PIM Join message. A
        single SMET route is advertised per source, group, with the P
        flag set. As an example, in Figure 2, PE1 receives two PIM Join
        messages for the same source, group and Upstream Neighbor,
        however PE1 advertises a single SMET route.

      o When the last connected router sends a PIM Prune message for a
        given source, group and Upstream Neighbor and the state is
        removed, the PE will withdraw the SMET route (note that the
        state is removed once the prune-pend timer expires).

      o SMET routes must always be generated upon receiving a PIM Join
        message, irrespective of the location of the Upstream Neighbor
        and even if the Upstream Neighbor is local to the PE.

      o A downstream PE receiving a PIM Prune (S,G,rpt) message will
        trigger an RPT-Prune route for the source and group.
        Subsequently, if the downstream PE receives a PIM Join (S,G,rpt)
        to cancel the previous Prune (S,G,rpt) and keep pulling the
        multicast traffic from the RPT, the downstream PE will withdraw
        the RPT-Prune route.

      o PIM Timers are handled locally. If the holdtime expires for a
        local Join the PE withdraws the SMET route.

   2) Upstream PE procedures:

      o A received SMET route with P=1 will add state for the source and
        group and will generate a PIM Join message for the source, group
        that will be forwarded to all the local AC PIM routers.

      o A received SMET route withdrawal will remove the state and
        generate a PIM Prune message for the source, group and upstream
        neighbor that will be forwarded to all the local AC PIM routers.

      o A received RPT-Prune route for (S,G) will generate a PIM Prune
        (S,G,rpt) message that will be forwarded to all the local AC PIM
        routers.

      o A received RPT-Prune withdrawal for (S,G) will generate a PIM
        Join (S,G,rpt) message that will be forwarded to all the local
        AC PIM routers.

   It is important to note that, compared to a solution that does not
   snoop PIM messages and does not use BGP to propagate states in the
   core, this EVPN PIM Proxy solution will add some latency derived from
   the procedures described in this document.

2.3. PIM Assert Optimization

   The PIM Assert process described in [RFC4601] is resource-intensive
   for the PIM routers; however, it is needed when PIM routers share a
   multi-access transit LAN. The use of PIM Proxy for EVPN BDs can
   minimize and even suppress the need for PIM Assert as described in
   this section.

   As a refresher, the PIM Assert procedures are needed to prevent two
   or more Upstream PIM routers from forwarding the same multicast
   content to the group of Downstream PIM routers sharing the same
   (EVPN) Broadcast Domain. This multicast packet duplication may happen
   in any of the following cases:

   o Two or more Downstream PIM routers on the BD may issue (*,G) Joins
     to different upstream routers on the BD because they have
     inconsistent MRIB entries regarding how to reach the RP.
     Both paths on the RP tree will be set up, causing two copies of all
     the shared tree traffic to appear on the EVPN Broadcast Domain.

   o Two or more routers on the BD may issue (S,G) Joins to different
     upstream routers on the BD because they have inconsistent MRIB
     entries regarding how to reach source S. Both paths on the
     source-specific tree will be set up, causing two copies of all the
     traffic from S to appear on the BD.

   o A router on the BD may issue a (*,G) Join to one upstream router on
     the BD, and another router on the BD may issue an (S,G) Join to a
     different upstream router on the same BD. Traffic from S may reach
     the BD over both the RPT and the SPT. If the receiver behind the
     downstream (*,G) router doesn't issue an (S,G,rpt) prune, then this
     condition would persist.

   PIM does not prevent such duplicate joins from occurring; instead,
   when duplicate data packets appear on the same BD from different
   routers, these routers notice this and then elect a single forwarder.
   This election is performed using the PIM Assert procedure.

   The issue is minimized or suppressed in this document by making sure
   all the Upstream PEs select the same Upstream Neighbor for a given
   (*,G) or (S,G) in any of the three above situations. If there is only
   one upstream PIM router selected and the same multicast content is
   not allowed to be flooded from more than one Upstream Neighbor, there
   will not be multicast duplication or need for Assert procedures in
   the EVPN Broadcast Domain.

   Figure 3 illustrates an example of the PIM Assert Optimization in
   EVPN.

   J(*,G1,IP5)
   +--+                                              J(*,G1,IP5)
   |R1+------>                XXXXXXXX               J(S1,G1,IP4)
   +--+       +-----+   XXXX XX       XXXXX   +-----+      +--+
              | PE1 |XXXXX            XXXX  XX| PE3 +----> |R4|
   +--+       |     |       SMET              |     |      +--+
   |R2+-----> +-----+     (*,G1,IP5)          +-----+
   +--+              X   +--------->        XX
   J(*,G1,IP4)      X                        XXX
                   XX                          XX
                   X                            X    J(*,G1,IP5)
   +--+       +-----+      SMET                X     J(S1,G1,IP4)
   |R3+--->   | PE2 |   (S1,G1,IP4)         XX+-----+      +--+
   +--+       |     |   +-------->      XXXX  | PE4 +-->   |R5|
              +-----+XXXX           XXXXX     |     |      +--+
   J(S1,G1,IP4)          X    X    X          +-----+       RP
                          XX  XXX XX  XXX      P(S1,G1,IP5,rpt)-->
                           XXXXXX XXXXX XXX

            Figure 3 - Proxy PIM Assert Optimization in EVPN

2.3.1 Assert Optimization Procedures in Downstream PEs

   The Downstream PEs will trigger SMET routes based on the received PIM
   Join messages. This is their behavior when any of the three
   situations described in section 2.3 occurs:

   o If the Downstream PE receives two local (*,G) Joins to different
     Upstream Neighbors, the PE will generate a single SMET route,
     selecting the highest IP address. In Figure 3, if we assume R1
     issues J(*,G1,IP5) and R2 J(*,G1,IP4), PE1 will advertise an SMET
     route for (*,G1,IP5). If PE1 had already advertised (*,G1,IP4), it
     would have sent an update with (*,G1,IP5). Note that the Upstream
     Router IP address is not part of the SMET route key, hence there is
     no need to withdraw the previous (*,G1,IP4).

   o In the same way, if the Downstream PE receives two local (S,G)
     Joins to different Upstream Neighbors, the PE will generate a
     single SMET route, selecting the highest IP address.

   o If the Downstream PE receives a local (S,G) Join and a local (*,G)
     Join for the same group but to different Upstream Neighbors, the PE
     will generate two different SMET routes (since *,G and S,G make two
     different route keys), keeping the original Upstream Neighbors in
     the SMET routes.
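
   The three rules above can be sketched as a small selection function.
   This is an illustration under stated assumptions, not a definitive
   implementation: the function name smet_routes and the tuple layout
   are hypothetical, and 192.0.2.4 and 192.0.2.5 are documentation
   addresses standing in for IP4 and IP5 of Figure 3.

```python
import ipaddress


def smet_routes(local_joins):
    """Map each (source, group) route key to one Upstream Neighbor.

    local_joins is an iterable of (source, group, upstream) tuples,
    with source set to "*" for (*,G) Joins. Because the Upstream
    Router is not part of the SMET route key, Joins sharing a key
    collapse into a single route carrying the highest upstream address.
    """
    selected = {}
    for source, group, upstream in local_joins:
        key = (source, group)
        candidate = ipaddress.ip_address(upstream)
        if key not in selected or candidate > selected[key]:
            selected[key] = candidate
    return {key: str(addr) for key, addr in selected.items()}


# PE1 in Figure 3: R1 sends J(*,G1,IP5) and R2 sends J(*,G1,IP4).
pe1 = smet_routes([("*", "G1", "192.0.2.5"), ("*", "G1", "192.0.2.4")])
```

   With these inputs, pe1 holds a single entry for the key (*,G1) with
   the stand-in for IP5 as the Upstream Neighbor. A (S1,G1) Join and a
   (*,G1) Join would instead yield two entries, each keeping its own
   Upstream Neighbor.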

2.3.2 Assert Optimization Procedures in Upstream PEs

   Upon receiving two or more SMET routes for the same group but
   different Upstream Neighbors, the Upstream PEs will follow this
   procedure:

   1) The Upstream PE will select a unique Upstream Neighbor based on
      the following rules:

      a) The Upstream Neighbor encoded in a (S,G) SMET route has
         precedence over the Upstream Neighbor on the (*,G) SMET route
         for the same group. This is consistent with the Assert winner
         election in [RFC4601]. In the example of Figure 3, PE3 and PE4
         will select IP4 as the Upstream Neighbor for (S1,G1) and
         (*,G1).

      b) In case the SMET routes have the same source (* or S), the
         higher Upstream Neighbor IP Address wins.

   2) After selecting the Unique Upstream Neighbor, the PE will instruct
      the data path to discard any ingress multicast stream that is
      coming from an interface different than the selected Upstream
      Neighbor for the multicast group. In the example in Figure 3, PE4
      will not accept G1 multicast traffic from R5.

      NOTE: when the procedure selects an Upstream Neighbor between the
      (S,G) and (*,G) routes, we assume that the PE's interface that is
      connected to the non-selected Upstream Neighbor, is not shared
      with another Source for the same Group. In the example of Figure
      3, this means that PE4's AC cannot be shared by R5 and S2 for the
      same group G. If PE4's AC is connected to a switch where R5 (RP)
      and S2 are connected, multicast traffic (S2,G) will be dropped by
      PE4, as per (2).

   3) Then the PE will generate the corresponding local PIM messages as
      usual. In the example, PE3 and PE4 generate PIM Join messages for
      (S1,G1,IP4) and (*,G1,IP5).

   4) The PE connected to the non-selected Upstream Neighbor will issue
      a PIM (S,G)/(*,G) Prune or a PIM (S,G,rpt) Prune to make sure the
      non-selected Upstream Router does not forward traffic for the
      group anymore.
In the example, PE4 will issue a local (S1,G1,rpt) 542 Prune message to R5, so that R5 does not forward G1 traffic. 544 In case of any change that impacts on the Upstream Neighbor selection 545 for a given group G1, the upstream PEs will simply update the 546 Upstream Neighbor selection and follow the above procedure. This 547 mechanism prevents the multicast duplication in the EVPN Broadcast 548 Domain and avoids PIM Assert procedures among PIM routers in the BD. 550 2.4. EVPN Multi-Homing and State Synchronization 552 PIM Join/Prune States will be synchronized across all the PEs in an 553 Ethernet Segment by using the procedures described in [EVPN-IGMP-MLD- 554 PROXY] and the IGMP/PIM Join Synch Route with the corresponding Flag 555 P set. This document does not require the use of IGMP Leave Synch 556 Routes. 558 In the same way, RPT-Prune States can be synchronized by using the 559 PIM RPT-Prune Synch route. The generation and process for this route 560 follows similar procedures as for the IGMP/PIM Join Synch Route. 562 In order to synchronize the PIM Neighbors discovered on an Ethernet 563 Segment, the MRD route and its ESI value will be used. Upon receiving 564 a Hello message on a link that is part of a multi-homed Ethernet 565 Segment, the PE will issue an MRD route that encodes the ESI value of 566 the AC over which the Hello was received. Upon receiving the non-zero 567 ESI MRD route, the PEs in the same ES will add the router to their 568 PIM Neighbor DB, using their AC on the same ES as the PIM Neighbor 569 port. This will allow the DF on the ES to generate Hello messages for 570 the local PIM router. 572 A PE that is not part of the ESI would normally receive a single non- 573 zero ESI MRD route per multicast router. In certain transient 574 situations the PE may receive more than one non-zero ESI MRD route 575 for the same multicast router. The PE should recognize this and not 576 generate additional PIM Hello messages for the local ACs. 578 3. 
Interaction with IGMP-snooping and Sources 580 Figure 4 illustrates an example with a multicast source, an IGMP host 581 and a PIM router in the same EVPN BD. 583 XXXXX J(*,G1) 584 XXXXXXX +-----+ +--+ 585 XXXX | PE3 | <---+H3| 586 X | | +--+ 587 +------+ X +--------> +-----+ +---> 588 |Source| +-----+ | S1,G1 X S1,G1 mcast 589 | S1 +---> | PE1 | + mcast XX 590 +------+ | | XX Hello 591 G1 +-----+ + S1,G1 X <---+ 592 XX | mcast +-----+ +--+ 593 X +---------> | PE4 +--> |R4| 594 X | | +--+ 595 XX XXX +-----+ DR 596 XXX XXX XXX 597 XXXXXXX S1,G1, mcast 599 Figure 4 - Proxy PIM interaction with local sources and hosts 601 When PIM routers, multicast sources and IGMP hosts coexist in the 602 same EVPN Broadcast domain, the PEs supporting both IGMP and PIM 603 proxy will provide the following optimizations in the EVPN BD: 605 o If an IGMP host and a PIM router are connected to the same BD on a 606 PE, the PE will advertise a single SMET route per (S,G) or (*,G) 607 irrespective of the received IGMP or PIM message. The IGMP flags 608 can be simultaneously set along with the P flag. 610 o In the same way, if IGMP hosts and PIM routers are connected to the 611 same BD and Ethernet Segment, the IGMP/PIM Join Synch route can be 612 shared by a host and a router requesting the same multicast source 613 and group. 615 o A PE connected to a Source and using Ingress Replication will 616 forward a multicast stream (S1,G1) to all the egress PEs that 617 advertised an SMET route for (S1,G1) and all the egress PEs that 618 advertised an MRD route for the EVPN BD. 620 4. 
BGP Information Model 622 This document defines the following additional routes and requests 623 IANA to allocate a type value in the EVPN route type registry: 625 + Type TBD - Multicast Router Discovery (MRD) Route 626 + Type TBD - PIM RPT-Prune Route 627 + Type TBD - PIM RPT-Prune Join Synch Route 629 In addition, the following routes defined in [EVPN-IGMP-MLD-PROXY] 630 are re-used and extended in this document's procedures: 632 + Type 6 - Selective Multicast Ethernet Tag Route 633 + Type 7 - IGMP Join Synch Route 635 Where Type 7 is requested to be re-named as IGMP/PIM Join Synch 636 Route. 638 4.1 Multicast Router Discovery (MRD) Route 640 Figure 5 shows the content of the MRD route: 642 +-------------------------------------------------+ 643 | RD (8 octets) | 644 +-------------------------------------------------+ 645 | Ethernet Segment ID (10 octets) | 646 +-------------------------------------------------+ 647 | Ethernet Tag ID (4 octets) | 648 +-------------------------------------------------+ 649 | Originator Router Length (1 octet) | 650 +-------------------------------------------------+ 651 | Originator Router Address (Variable) | 652 +-------------------------------------------------+ 653 | Mcast Router Length (1 octet) | 654 +-------------------------------------------------+ 655 | Mcast Router Address 1 (variable) | 656 +-------------------------------------------------+ 657 | Secondary Address List Length (1 octet) | 658 +-------------------------------------------------+ 659 | Secondary Mcast Router Address 1 (variable) | 660 +-------------------------------------------------+ 661 | . | 662 | . 
   |  Secondary Mcast Router Address n (variable)    |
   +-------------------------------------------------+
   |  DR Priority (4 octets)                         |
   +-------------------------------------------------+
   |  Flags (1 octet)                                |
   +-------------------------------------------------+

              Figure 5 Multicast Router Discovery Route

   The support for this new route type is OPTIONAL. Since this new route
   type is OPTIONAL, an implementation not supporting it MUST ignore the
   route, based on the unknown route type value, as specified by Section
   5.4 in [RFC7606].

   The encoding of this route is defined as follows:

   o RD, ESI and Ethernet Tag ID are defined as per [RFC7432] for MAC/IP
     routes.

   o The Originator Router Length and Address encode an IPv4 or IPv6
     address that belongs to the advertising PE.

   o The Multicast Router Length and Address fields encode the Primary
     IP address of the PIM neighbor added to the PE's DB.

   o The Secondary Address List Length encodes the number of Secondary
     IP addresses advertised by the PIM router in the PIM Hello message.
     If this field is zero, the NLRI will not include any Secondary
     Multicast Router Address. All the IP addresses will have the same
     Length, that is, they will all be either IPv4 or IPv6, but not a
     mix of both.

   o DR Priority is copied from the same field in Hello packets, as per
     [RFC4601].

   o Flags:
     - Q: Querier flag. Least significant bit. It indicates the encoded
       multicast router is an IGMP Querier.
     - P: PIM router flag. Second low order bit in the Flags octet. It
       indicates that the multicast router is a PIM router.
     - Q and P may be set simultaneously.

   For BGP processing purposes, only the RD, Ethernet Tag ID, Originator
   Router Length and Address, and Multicast Router Length and Address
   are considered part of the route key. The Secondary Multicast Router
   Addresses and the rest of the fields are not part of the route key.
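
   As a non-normative illustration of the encoding described for the
   MRD route, the following Python sketch serializes the fields in the
   order of Figure 5 and derives the route key. It rests on assumptions
   that this document does not state: the Length fields are taken to
   carry the address length in bits (as [EVPN-IGMP-MLD-PROXY] does for
   the SMET route), and each Secondary Address is taken to have the same
   length as the Mcast Router Address. The names encode_mrd_route and
   mrd_route_key are hypothetical and exist only for this example.

```python
import ipaddress
import struct

# Flags octet of the MRD route: Q is the least significant bit and
# P is the second low-order bit.
FLAG_Q = 0x01
FLAG_P = 0x02


def _addr(address):
    """Return (length in bits, packed bytes); bit lengths are an assumption."""
    packed = ipaddress.ip_address(address).packed
    return len(packed) * 8, packed


def encode_mrd_route(rd, esi, eth_tag, originator, mcast_router,
                     secondaries, dr_priority, flags):
    """Serialize the MRD route fields in the order shown in Figure 5."""
    if len(rd) != 8 or len(esi) != 10:
        raise ValueError("RD must be 8 octets and ESI 10 octets")
    orig_len, orig = _addr(originator)
    rtr_len, rtr = _addr(mcast_router)
    packed_secondaries = []
    for address in secondaries:
        sec_len, sec = _addr(address)
        # Router addresses are all IPv4 or all IPv6, never a mix.
        if sec_len != rtr_len:
            raise ValueError("secondary address family differs from primary")
        packed_secondaries.append(sec)
    out = rd + esi + struct.pack("!I", eth_tag)
    out += bytes([orig_len]) + orig
    out += bytes([rtr_len]) + rtr
    # The list length is the number of secondary addresses, zero if none.
    out += bytes([len(packed_secondaries)]) + b"".join(packed_secondaries)
    out += struct.pack("!IB", dr_priority, flags)
    return out


def mrd_route_key(rd, eth_tag, originator, mcast_router):
    """Fields that form the BGP route key; the ESI, secondary
    addresses, DR Priority and Flags are excluded."""
    return (rd, eth_tag) + _addr(originator) + _addr(mcast_router)
```

   Two MRD routes that differ only in their Secondary Addresses, DR
   Priority or Flags yield the same mrd_route_key() result, so a
   receiver would treat the later one as an update of the same route.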

4.2 Selective Multicast Ethernet Tag Route for PIM Proxy

   This document extends the SMET route defined in [EVPN-IGMP-MLD-PROXY]
   as shown in Figure 6.

   +---------------------------------------+
   |  RD (8 octets)                        |
   +---------------------------------------+
   |  Ethernet Tag ID (4 octets)           |
   +---------------------------------------+
   |  Multicast Source Length (1 octet)    |
   +---------------------------------------+
   |  Multicast Source Address (variable)  |
   +---------------------------------------+
   |  Multicast Group Length (1 octet)     |
   +---------------------------------------+
   |  Multicast Group Address (variable)   |
   +---------------------------------------+
   |  Originator Router Length (1 octet)   |
   +---------------------------------------+
   |  Originator Router Address (variable) |
   +---------------------------------------+
   |  Flags (1 octet) (optional)           |
   +---------------------------------------+
   |  Upstream Router Length (1B)(optional)|
   +---------------------------------------+
   |  Upstream Router Addr (variable)(opt) |
   +---------------------------------------+

   Flags:

       0  1  2  3  4  5  6  7
      +--+--+--+--+--+--+--+--+
      |  |  |  | P|IE|v3|v2|v1|
      +--+--+--+--+--+--+--+--+

       Figure 6 Selective Multicast Ethernet Tag Route and Flags

   As in the case of the MRD route, this route type is OPTIONAL.

   This route will be used as per [EVPN-IGMP-MLD-PROXY], with the
   following extra and optional fields:

   o Upstream Router Length and Address will contain the same
     information as received in a PIM Join/Prune message on a local AC.
     There is only one Upstream Router Address per route.

   o Flags: This field encodes Flags that are now relevant to IGMP and
     PIM. The following new Flag is defined:

     - Flag P: Indicates the SMET route is generated by a received PIM
       Join on a local AC. When P=1, the Upstream Router Length and
       Address fields are present in the route.
       Otherwise the two fields will not be present.

   Compared to [EVPN-IGMP-MLD-PROXY] there is no change in terms of
   fields considered part of the route key for BGP processing. The
   Upstream Router Length and Address are not considered part of the
   route key.

4.3 PIM RPT-Prune Route

   The RPT-Prune route is analogous to the SMET route but for PIM
   RPT-Prune messages. The SMET routes cannot be used to convey
   RPT-Prune messages because they are always triggered by IGMP or PIM
   Join messages. A PIM RPT-Prune message is used to Prune a specific
   (S,G) from the RP Tree by downstream routers. An RPT-Prune message is
   typically seen prior to an RPT-Join message for the (S,G), hence it
   requires its own BGP route type (since the SMET route is always
   advertised based on the received Join messages).

   +---------------------------------------+
   |  RD (8 octets)                        |
   +---------------------------------------+
   |  Ethernet Tag ID (4 octets)           |
   +---------------------------------------+
   |  Multicast Source Length (1 octet)    |
   +---------------------------------------+
   |  Multicast Source Address (variable)  |
   +---------------------------------------+
   |  Multicast Group Length (1 octet)     |
   +---------------------------------------+
   |  Multicast Group Address (variable)   |
   +---------------------------------------+
   |  Originator Router Length (1 octet)   |
   +---------------------------------------+
   |  Originator Router Address (variable) |
   +---------------------------------------+
   |  Upstream Router Length (1B)          |
   +---------------------------------------+
   |  Upstream Router Addr (variable)      |
   +---------------------------------------+

                    Figure 7 PIM RPT-Prune Route

   Fields are defined in the same way as for the SMET route.
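
   As a non-normative illustration of Sections 4.2 and 4.3, the Python
   sketch below builds an SMET route and a PIM RPT-Prune route and
   enforces the rule that the Upstream Router fields accompany the P
   flag. Several points are assumptions rather than statements of this
   document: Length fields are expressed in bits and a (*,G) is encoded
   with a zero Multicast Source Length, both borrowed from
   [EVPN-IGMP-MLD-PROXY]; bit 7 of the Flags octet is read as the least
   significant bit, which places P at 0x10, immediately above IE; and
   the Flags octet is always emitted even though Figure 6 marks it as
   optional. The function names are hypothetical.

```python
import ipaddress
import struct

# Flags octet, reading bit 7 as the least significant bit (assumption).
FLAG_V1, FLAG_V2, FLAG_V3, FLAG_IE, FLAG_P = 0x01, 0x02, 0x04, 0x08, 0x10


def _addr(address):
    """Encode a 1-octet length in bits plus the address; None means (*,G)."""
    if address is None:
        return b"\x00"
    packed = ipaddress.ip_address(address).packed
    return bytes([len(packed) * 8]) + packed


def _common(rd, eth_tag, source, group, originator):
    """Fields shared by the SMET and RPT-Prune routes, in figure order."""
    if len(rd) != 8:
        raise ValueError("RD must be 8 octets")
    return (rd + struct.pack("!I", eth_tag) + _addr(source)
            + _addr(group) + _addr(originator))


def encode_smet_route(rd, eth_tag, source, group, originator, flags,
                      upstream=None):
    """SMET route: Upstream Router fields are present only when P=1."""
    if bool(flags & FLAG_P) != (upstream is not None):
        raise ValueError("Upstream Router is present if and only if P=1")
    out = _common(rd, eth_tag, source, group, originator) + bytes([flags])
    if upstream is not None:
        out += _addr(upstream)
    return out


def encode_rpt_prune_route(rd, eth_tag, source, group, originator, upstream):
    """RPT-Prune route: no Flags octet, Upstream Router always present."""
    return _common(rd, eth_tag, source, group, originator) + _addr(upstream)
```

   A PE that receives both an IGMP Report and a PIM Join for the same
   (S,G) on one BD could call encode_smet_route() once with, for
   example, FLAG_P | FLAG_V2, which reflects the single shared SMET
   route with IGMP flags and the P flag set together.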

4.4 IGMP/PIM Join Synch Route for PIM Proxy

   This document renames the IGMP Join Synch Route defined in
   [EVPN-IGMP-MLD-PROXY] as IGMP/PIM Join Synch Route and extends it
   with new fields and Flags as shown in Figure 8:

   +----------------------------------------------+
   |  Ethernet Segment Identifier (10 octets)     |
   +----------------------------------------------+
   |  Ethernet Tag ID (4 octets)                  |
   +----------------------------------------------+
   |  Multicast Source Length (1 octet)           |
   +----------------------------------------------+
   |  Multicast Source Address (variable)         |
   +----------------------------------------------+
   |  Multicast Group Length (1 octet)            |
   +----------------------------------------------+
   |  Multicast Group Address (variable)          |
   +----------------------------------------------+
   |  Originator Router Length (1 octet)          |
   +----------------------------------------------+
   |  Originator Router Address (variable)        |
   +----------------------------------------------+
   |  Flags (1 octet)                             |
   +----------------------------------------------+
   |  Upstream Router Length (1B)(optional)       |
   +----------------------------------------------+
   |  Upstream Router Addr (variable)(opt)        |
   +----------------------------------------------+

   Flags:

       0  1  2  3  4  5  6  7
      +--+--+--+--+--+--+--+--+
      |  |  |  | P|IE|v3|v2|v1|
      +--+--+--+--+--+--+--+--+

            Figure 8 IGMP/PIM Join Synch Route and Flags

   This route will be used as per [EVPN-IGMP-MLD-PROXY], with the
   following extra and optional fields:

   o Upstream Router Length and Address will contain the same
     information as received in a PIM Join/Prune message on a local AC.
     There is only one Upstream Router Address per route.

   o Flags: This field encodes Flags that are now relevant to IGMP and
     PIM.
     The following new Flag is defined:

     - Flag P: Indicates the Join Synch route is generated by a received
       PIM Join on a local AC. When P=1, the Upstream Router Length and
       Address fields are present in the route. Otherwise the two fields
       will not be present.

   Compared to [EVPN-IGMP-MLD-PROXY] there is no change in terms of
   fields considered part of the route key for BGP processing. The
   Upstream Router Length and Address are not considered part of the
   route key.

4.5 IGMP/PIM RPT-Prune Synch Route for PIM Proxy

   This new route is used to Synch RPT-Prune states among the PEs in the
   Ethernet Segment.

   +----------------------------------------------+
   |  RD (8 octets)                               |
   +----------------------------------------------+
   |  Ethernet Segment Identifier (10 octets)     |
   +----------------------------------------------+
   |  Ethernet Tag ID (4 octets)                  |
   +----------------------------------------------+
   |  Multicast Source Length (1 octet)           |
   +----------------------------------------------+
   |  Multicast Source Address (variable)         |
   +----------------------------------------------+
   |  Multicast Group Length (1 octet)            |
   +----------------------------------------------+
   |  Multicast Group Address (variable)          |
   +----------------------------------------------+
   |  Originator Router Length (1 octet)          |
   +----------------------------------------------+
   |  Originator Router Address (variable)        |
   +----------------------------------------------+
   |  Upstream Router Length (1B)(optional)       |
   +----------------------------------------------+
   |  Upstream Router Addr (variable)(opt)        |
   +----------------------------------------------+

               Figure 9 IGMP/PIM RPT-Prune Synch Route

   The RD, Ethernet Segment Identifier and other fields are defined as
   for the IGMP/PIM Join Synch Route.
   In addition, the Upstream Router Length and Address will contain the
   same information as received in a PIM RPT-Prune message on a local
   AC. The Upstream Router points at the RP for the source and group and
   there is only one Upstream Router Address per route.

   The route key for BGP processing is defined as per the IGMP/PIM Join
   Synch route.

5. Conclusions

   This document extends the IGMP Proxy concept of [EVPN-IGMP-MLD-PROXY]
   to PIM, so that EVPN can also be used to minimize the flooding of PIM
   control messages and optimize the delivery of IP multicast traffic in
   EVPN Broadcast Domains that connect PIM routers.

   This specification describes procedures to Discover new PIM routers
   in the BD, as well as propagate PIM Join/Prune messages using EVPN
   SMET routes and other optimizations.

6. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119 [RFC2119].

   In this document, these words will appear with that interpretation
   only when in ALL CAPS. Lower case uses of these words are not to be
   interpreted as carrying RFC-2119 significance.

   In this document, the characters ">>" preceding an indented line
   indicate a compliance requirement statement using the key words
   listed above. This convention aids reviewers in quickly identifying
   or finding the explicit compliance requirements of this RFC.

7. Security Considerations

   This section will be added in future versions.

8.
IANA Considerations

   This document requests IANA to allocate the following new EVPN route
   types in the corresponding registry:

   + Type TBD - Multicast Router Discovery (MRD) Route
   + Type TBD - PIM RPT-Prune Route
   + Type TBD - PIM RPT-Prune Join Synch Route

   In addition, the following route defined in [EVPN-IGMP-MLD-PROXY]
   should be renamed as follows:

   + Type 7 - IGMP/PIM Join Synch Route

9. Terminology

   o EVI: EVPN Instance.

   o EVPN Broadcast Domain: it refers to an EVI in case of VLAN-based
     and VLAN-bundle interfaces. It refers to a Bridge Domain identified
     by an Ethernet-Tag (in the control plane) in case of VLAN-Aware
     Bundle interfaces.

   o AC: Attachment Circuit.

   o PIM-DM: Protocol Independent Multicast - Dense Mode.

   o PIM-SM: Protocol Independent Multicast - Sparse Mode.

   o PIM-SSM: Protocol Independent Multicast - Source-Specific
     Multicast.

   o S: IP address of the multicast source.

   o G: IP address of the multicast group.

   o N: Upstream neighbor field in a Join/Prune/Graft message.

   o PIM J/P: PIM Join/Prune messages.

   o RP: PIM Rendezvous Point.

   o MRD route: Multicast Router Discovery route.

   o PIM Nbr: PIM Neighbor.

10. References

10.1 Normative References

   [RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
   Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet
   VPN", RFC 7432, DOI 10.17487/RFC7432, February 2015.

   [RFC4601] Fenner, B., Handley, M., Holbrook, H., and I. Kouvelas,
   "Protocol Independent Multicast - Sparse Mode (PIM-SM): Protocol
   Specification (Revised)", RFC 4601, DOI 10.17487/RFC4601, August
   2006.

   [RFC2236] Fenner, W., "Internet Group Management Protocol, Version
   2", RFC 2236, DOI 10.17487/RFC2236, November 1997.

   [RFC8220] Dornon, O. et al, "Protocol Independent Multicast (PIM)
   over Virtual Private LAN Service (VPLS)", RFC 8220, DOI
   10.17487/RFC8220, September 2017.

   [EVPN-IGMP-MLD-PROXY] Sajassi, A. et al, "IGMP and MLD Proxy for
   EVPN", March 2017, work-in-progress,
   draft-ietf-bess-evpn-igmp-mld-proxy-00.

10.2 Informative References

   [EVPN-PROXY-ARP-ND] Rabadan, J. et al, "Operational Aspects of
   Proxy-ARP/ND in EVPN Networks", October 2017, work-in-progress,
   draft-ietf-bess-evpn-proxy-arp-nd-03.

11. Acknowledgments

12. Contributors

13. Authors' Addresses

   Jorge Rabadan
   Nokia
   777 E. Middlefield Road
   Mountain View, CA 94043 USA
   Email: jorge.rabadan@nokia.com

   Senthil Sathappan
   Nokia
   701 E. Middlefield Road
   Mountain View, CA 94043 USA
   Email: senthil.sathappan@nokia.com

   Jayant Kotalwar
   Nokia
   701 E. Middlefield Road
   Mountain View, CA 94043 USA
   Email: jayant.kotalwar@nokia.com

   Zhaohui Zhang
   Juniper Networks
   Email: zzhang@juniper.net

   Ali Sajassi
   Cisco
   Email: sajassi@cisco.com