BESS Workgroup                                           J. Rabadan, Ed.
Internet-Draft                                               J. Kotalwar
Intended status: Standards Track                            S. Sathappan
Expires: 10 August 2022                                            Nokia
                                                                Z. Zhang
                                                                  W. Lin
                                                                 Juniper
                                                                E.
Rosen
                                                              Individual
                                                         6 February 2022


              Multicast Source Redundancy in EVPN Networks
             draft-ietf-bess-evpn-redundant-mcast-source-03

Abstract

   EVPN supports intra- and inter-subnet IP multicast forwarding.
   However, EVPN (like conventional IP multicast techniques, for that
   matter) does not have a solution for the case where: a) a given
   multicast group carries more than one flow (i.e., more than one
   source), and b) it is desired that each receiver gets only one of the
   several flows.  Existing multicast techniques assume there are no
   redundant sources sending the same flow to the same IP multicast
   group and that, if there were redundant sources, the receiver's
   application would deal with the duplicated packets it receives.  This
   document extends the existing EVPN specifications and assumes that IP
   Multicast source redundancy may exist.  It also assumes that, in case
   two or more sources send the same IP Multicast flows into the tenant
   domain, the EVPN PEs need to prevent the receivers from getting
   duplicate packets by following the described procedures.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 10 August 2022.

Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction
     1.1.  Terminology
     1.2.  Background on IP Multicast Delivery in EVPN Networks
       1.2.1.  Intra-subnet IP Multicast Forwarding
       1.2.2.  Inter-subnet IP Multicast Forwarding
     1.3.  Multi-Homed IP Multicast Sources in EVPN
     1.4.  The Need for Redundant IP Multicast Sources in EVPN
   2.  Solution Overview
   3.  BGP EVPN Extensions
   4.  Warm Standby (WS) Solution for Redundant G-Sources
     4.1.  WS Example in an OISM Network
     4.2.  WS Example in a Single-BD Tenant Network
   5.  Hot Standby (HS) Solution for Redundant G-Sources
     5.1.  Extensions for the Advertisement of DCB Labels
     5.2.  Use of BFD in the HS Solution
     5.3.  HS Example in an OISM Network
     5.4.  HS Example in a Single-BD Tenant Network
   6.  Security Considerations
   7.  IANA Considerations
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Appendix A.  Acknowledgments
   Appendix B.  Contributors
   Authors' Addresses

1.  Introduction

   Intra and Inter-subnet IP Multicast forwarding are supported in EVPN
   networks.  [I-D.ietf-bess-evpn-igmp-mld-proxy] describes the
   procedures required to optimize the delivery of IP Multicast flows
   when Sources and Receivers are connected to the same EVPN BD
   (Broadcast Domain), whereas [I-D.ietf-bess-evpn-irb-mcast] specifies
   the procedures to support Inter-subnet IP Multicast in a tenant
   network.  Inter-subnet IP Multicast means that IP Multicast Source
   and Receivers of the same multicast flow are connected to different
   BDs of the same tenant.

   [I-D.ietf-bess-evpn-igmp-mld-proxy], [I-D.ietf-bess-evpn-irb-mcast]
   or conventional IP multicast techniques do not have a solution for
   the case where a given multicast group carries more than one flow
   (i.e., more than one source) and it is desired that each receiver
   gets only one of the several flows.  Multicast techniques assume
   there are no redundant sources sending the same flows to the same IP
   multicast group, and, in case there were redundant sources, the
   receiver's application would deal with the received duplicated
   packets.

   As a workaround in conventional IP multicast (PIM or MVPN networks),
   if all the redundant sources are given the same IP address, each
   receiver will get only one flow.  The reason is that, in conventional
   IP multicast, (S,G) state is always created by the RP (Rendezvous
   Point), and sometimes by the Last Hop Router (LHR).  The (S,G) state
   always binds the (S,G) flow to a source-specific tree, rooted at the
   source IP address.
   If multiple sources have the same IP address, one
   may end up with multiple (S,G) trees.  However, the way the trees are
   constructed ensures that any given LHR or RP is on at most one of
   them.  The use of an anycast address assigned to multiple sources may
   be useful for warm standby redundancy solutions.  However, it is not
   really helpful for hot standby redundancy solutions, and configuring
   the same IP address (in particular, the same IPv4 address) on
   multiple sources may cause issues if the sources need to be reached
   by IP unicast traffic or if the sources are attached to the same
   Broadcast Domain.

   In addition, in the scenario where several G-sources are attached via
   EVPN/OISM, there is not necessarily any (S,G) state created for the
   redundant sources.  The LHRs may have only (*,G) state, and there may
   not be an RP (creating (S,G) state) either.  Therefore, this document
   extends the above two specifications and assumes that IP Multicast
   source redundancy may exist.  It also assumes that, in case two or
   more sources send the same IP Multicast flows into the tenant domain,
   the EVPN PEs need to prevent the receivers from getting duplicate
   packets.

   The solution provides support for Warm Standby (WS) and Hot Standby
   (HS) redundancy.  WS is defined as the redundancy scenario in which
   the upstream PEs attached to the redundant sources of the same tenant
   make sure that only one source of the same flow can send multicast
   traffic to the interested downstream PEs at any given time.  In HS,
   the upstream PEs forward the redundant multicast flows to the
   downstream PEs, and the downstream PEs make sure only one flow is
   forwarded to the interested attached receivers.

1.1.
Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   *  PIM: Protocol Independent Multicast.

   *  MVPN: Multicast Virtual Private Networks.

   *  OISM: Optimized Inter-Subnet Multicast, as in
      [I-D.ietf-bess-evpn-irb-mcast].

   *  Broadcast Domain (BD): an emulated Ethernet, such that two systems
      on the same BD will receive each other's link-local broadcasts.
      In this document, BD also refers to the instantiation of a
      Broadcast Domain on an EVPN PE.  An EVPN PE can be attached to one
      or multiple BDs of the same tenant.

   *  Designated Forwarder (DF): as defined in [RFC7432], an Ethernet
      segment may be multi-homed (attached to more than one PE).  An
      Ethernet segment may also contain multiple BDs, of one or more
      EVIs.  For each such EVI, one of the PEs attached to the segment
      becomes that EVI's DF for that segment.  Since a BD may belong to
      only one EVI, we can speak unambiguously of the BD's DF for a
      given segment.

   *  Upstream PE: in this document, an Upstream PE is the EVPN PE that
      is connected to the IP Multicast source or is the PE closest to
      it.  It receives the IP Multicast flows on local ACs (Attachment
      Circuits).

   *  Downstream PE: in this document, a Downstream PE is the EVPN PE
      that is connected to the IP Multicast receivers and gets the IP
      Multicast flows from remote EVPN PEs.

   *  G-traffic: any frame with an IP payload whose IP Destination
      Address (IP DA) is a multicast group G.

   *  G-source: any system sourcing IP multicast traffic to G.

   *  SFG: Single Flow Group, i.e., a multicast group address G which
      represents traffic that contains only a single flow.
      However, multiple sources, with the same or different IP
      addresses, may be transmitting an SFG.

   *  Redundant G-source: a host or router that transmits an SFG in a
      tenant network where there are more hosts or routers transmitting
      the same SFG.  Redundant G-sources for the same SFG SHOULD have
      different IP addresses, although they MAY have the same IP address
      when in different BDs of the same tenant network.  Redundant
      G-sources are assumed NOT to be "bursty" in this document (typical
      examples are Broadcast TV G-sources or similar).

   *  P-tunnel: Provider tunnel refers to the type of tree a given
      upstream EVPN PE uses to forward multicast traffic to downstream
      PEs.  Examples of P-tunnels supported in this document are Ingress
      Replication (IR), Assisted Replication (AR), Bit Indexed Explicit
      Replication (BIER), multicast Label Distribution Protocol (mLDP)
      or Point to Multi-Point Resource Reservation protocol with Traffic
      Engineering extensions (P2MP RSVP-TE).

   *  Inclusive Multicast Tree or Inclusive Provider Multicast Service
      Interface (I-PMSI): defined in [RFC6513], in this document it is
      applicable only to EVPN and refers to the default multicast tree
      for a given BD.  All the EVPN PEs that are attached to a specific
      BD belong to the I-PMSI for the BD.  The I-PMSI trees are signaled
      by EVPN Inclusive Multicast Ethernet Tag (IMET) routes.

   *  Selective Multicast Tree or Selective Provider Multicast Service
      Interface (S-PMSI): defined in [RFC6513], in this document it is
      applicable only to EVPN and refers to the multicast tree to which
      only the interested PEs of a given BD belong.  There are two
      types of EVPN S-PMSIs:

      -  EVPN S-PMSIs that require the advertisement of S-PMSI A-D
         routes from the upstream PE, as in [EVPN-BUM].  The interested
         downstream PEs join the S-PMSI tree as in [EVPN-BUM].

      -  EVPN S-PMSIs that don't require the advertisement of S-PMSI A-D
         routes.
         They use the forwarding information of the IMET
         routes, but upstream PEs send IP Multicast flows only to
         downstream PEs issuing Selective Multicast Ethernet Tag (SMET)
         routes for the flow.  These S-PMSIs are only supported with the
         following P-tunnels: Ingress Replication (IR), Assisted
         Replication (AR) and BIER.

   This document also assumes familiarity with the terminology of
   [RFC7432], [RFC4364], [RFC6513], [RFC6514],
   [I-D.ietf-bess-evpn-igmp-mld-proxy], [I-D.ietf-bess-evpn-irb-mcast],
   [EVPN-RT5] and [EVPN-BUM].

1.2.  Background on IP Multicast Delivery in EVPN Networks

   IP Multicast is all about forwarding a single copy of a packet from a
   source S to a group of receivers G along a multicast tree.  That
   multicast tree can be created in an EVPN tenant domain where S and
   the receivers for G are connected to the same BD or to different BDs.
   In the former case, we refer to Intra-subnet IP Multicast forwarding,
   whereas the latter case is referred to as Inter-subnet IP Multicast
   forwarding.

1.2.1.
Intra-subnet IP Multicast Forwarding

   When the source S1 and receivers interested in G1 are attached to the
   same BD, the EVPN network can deliver the IP Multicast traffic to the
   receivers in two different ways (Figure 1):

                   S1    +                      S1    +
         (a)        +    |            (b)        +    |
                    |    | (S1,G1)               |    | (S1,G1)
                PE1 |    |                   PE1 |    |
                 +-----+ v                    +-----+ v
                 |+---+|                      |+---+|
                 ||BD1||                      ||BD1||
                 |+---+|                      |+---+|
                 +-----+                      +-----+
            +-------|-------+            +-------|
            |       |       |            |       |
            v       v       v            v       v
         +-----+ +-----+ +-----+      +-----+ +-----+ +-----+
         |+---+| |-----| |-----|      |+---+| |+---+| |+---+|
         ||BD1|| ||BD1|| ||BD1||      ||BD1|| ||BD1|| ||BD1||
         |+---+| |-----| |-----|      |+---+| |+---+| |+---+|
         +-----+ +-----+ +-----+      +-----+ +-----+ +-----+
         PE2|    PE3|    PE4|         PE2|    PE3|    PE4
          - | - - - | -     |          - | - - - | -
        |   |       |   |   |        |   |       |   |
            v       v       v            v       v
        |   R1      R2  |   R3       |   R1      R2  |   R3
          - - - G1- - -                - - - G1- - -

                   Figure 1: Intra-subnet IP Multicast

   Model (a) illustrated in Figure 1 is referred to as "IP Multicast
   delivery as BUM traffic".  This way of delivering IP Multicast
   traffic does not require any extensions to [RFC7432]; however, it
   sends the IP Multicast flows to non-interested receivers, such as
   R3 in Figure 1.  In this example, downstream PEs can snoop
   IGMP/MLD messages from the receivers so that layer-2 multicast state
   is created and, for instance, PE4 can avoid sending (S1,G1) to R3,
   since R3 is not interested in (S1,G1).

   Model (b) in Figure 1 uses an S-PMSI to optimize the delivery of the
   (S1,G1) flow.  For instance, assuming PE1 uses IR, PE1 sends (S1,G1)
   only to the downstream PEs that issued an SMET route for (S1,G1),
   that is, PE2 and PE3.  In case PE1 uses any P-tunnel other than
   IR, AR or BIER, PE1 will advertise an S-PMSI A-D route for (S1,G1)
   and PE2/PE3 will join that tree.

   Procedures for Model (b) are specified in
   [I-D.ietf-bess-evpn-igmp-mld-proxy].

1.2.2.
Inter-subnet IP Multicast Forwarding

   If the source and receivers are attached to different BDs of the same
   tenant domain, the EVPN network can also use Inclusive or Selective
   Trees as depicted in Figure 2, models (a) and (b) respectively.

                   S1    +                      S1    +
         (a)        +    |            (b)        +    |
                    |    | (S1,G1)               |    | (S1,G1)
                PE1 |    |                   PE1 |    |
                 +-----+ v                    +-----+ v
                 |+---+|                      |+---+|
                 ||BD1||                      ||BD1||
                 |+---+|                      |+---+|
                 +-----+                      +-----+
            +-------|-------+            +-------|
            |       |       |            |       |
            v       v       v            v       v
         +-----+ +-----+ +-----+      +-----+ +-----+ +-----+
         |+---+| |+---+| |+---+|      |+---+| |+---+| |+---+|
         ||SBD|| ||SBD|| ||SBD||      ||SBD|| ||SBD|| ||SBD||
         |+-|-+| |+-|-+| |+---+|      |+-|-+| |+-|-+| |+---+|
         | VRF | | VRF | | VRF |      | VRF | | VRF | | VRF |
         |+-v-+| |+-v-+| |+---+|      |+-v-+| |+-v-+| |+---+|
         ||BD2|| ||BD3|| ||BD4||      ||BD2|| ||BD3|| ||BD4||
         |+-|-+| |+-|-+| |+---+|      |+-|-+| |+-|-+| |+---+|
         +--|--+ +--|--+ +-----+      +--|--+ +--|--+ +-----+
         PE2|    PE3|    PE4          PE2|    PE3|    PE4
          - | - - - | -                - | - - - | -
        |   |       |   |            |   |       |   |
            v       v                    v       v
        |   R1      R2  |   R3       |   R1      R2  |   R3
          - - - G1- - -                - - - G1- - -

                   Figure 2: Inter-subnet IP Multicast

   [I-D.ietf-bess-evpn-irb-mcast] specifies the procedures to optimize
   the Inter-subnet Multicast forwarding in an EVPN network.  The IP
   Multicast flows are always sent in the context of the source BD.  As
   described in [I-D.ietf-bess-evpn-irb-mcast], if the downstream PE is
   not attached to the source BD, the IP Multicast flow is received on
   the SBD (Supplementary Broadcast Domain), as in the example in
   Figure 2.

   [I-D.ietf-bess-evpn-irb-mcast] supports Inclusive or Selective
   Multicast Trees, and as explained in Section 1.2.1, the Selective
   Multicast Trees are set up in a different way, depending on the
   P-tunnel being used by the source BD.  As an example, model (a) in
   Figure 2 illustrates the use of an Inclusive Multicast Tree for BD1
   on PE1.
   Since the downstream PEs are not attached to BD1, they will
   all receive (S1,G1) in the context of the SBD and will locally route
   the flow to the local ACs.  Model (b) uses a similar forwarding
   model; however, PE1 sends the (S1,G1) flow in a Selective Multicast
   Tree.  If the P-tunnel is IR, AR or BIER, PE1 does not need to
   advertise an S-PMSI A-D route.

   [I-D.ietf-bess-evpn-irb-mcast] is a superset of the procedures in
   [I-D.ietf-bess-evpn-igmp-mld-proxy], in which sources and receivers
   can be in the same BD or in different BDs of the same tenant.
   [I-D.ietf-bess-evpn-irb-mcast] ensures every upstream PE attached to
   a source will learn of all other PEs (attached to the same Tenant
   Domain) that have interest in a particular set of flows.  This is
   because the downstream PEs advertise SMET routes for a set of flows
   with the SBD's Route Target and they are imported by all the Upstream
   PEs of the tenant.  As a result, inter-subnet multicasting can be
   done within the Tenant Domain, without requiring any Rendezvous
   Points (RP), shared trees, UMH selection or any other complex aspects
   of conventional multicast routing techniques.

1.3.  Multi-Homed IP Multicast Sources in EVPN

   Contrary to conventional multicast routing technologies, multi-homing
   PEs attached to the same source can never create IP Multicast packet
   duplication if the PEs use a multi-homed Ethernet Segment (ES).
   Figure 3 illustrates this by showing two multi-homing PEs (PE1 and
   PE2) that are attached to the same source (S1).  We assume that S1 is
   connected to an all-active ES by a layer-2 switch (SW1) with a Link
   Aggregation Group (LAG) to PE1 and PE2.

                                S1
                                 |
                                 v
                              +-----+
                              | SW1 |
                              +-----+
                        +----  |   |
                 (S1,G1)| +----+   +----+
      IGMP              | | all-active  |
      J(S1,G1)   PE1    v |    ES-1     |    PE2
      +---->  +-----------|---+     +---|-----------+
              | +---+   +---+ |     | +---+         |
      R1  <-----|BD2|   |BD1| |     | |BD1|         |
              | +---+---+---+ |     | +---+---+     |
         +----|     |VRF|  |  |     |     |VRF|     |----+
         |    | +---+---+  |  |     | +---+---+     |    |
         |    | |SBD|      |  |     | |SBD|         |    |
         |    | +---+      |  |     | +---+         |    |
         |    +------------|--+     +---------------+    |
         |                 |                             |
         |                 |                             |
         |                 |                             |
         |     EVPN        |              ^              |
         |     OISM        v   PE3        | SMET         |
         |              +---------------+ | (*,G1)       |
         |              | +---+         | |              |
         |              | |SBD|         |                |
         |              | +---+---+     |                |
         +--------------|     |VRF|     |----------------+
                        | +---+---+---+ |
                        | |BD2|   |BD3| |
                        | +-|-+   +-|-+ |
                        +---|-------|---+
                        ^   |       |   ^
                   IGMP |   v       v   | IGMP
                J(*,G1) |   R2      R3  | J(S1,G1)

                Figure 3: All-active Multi-homing and OISM

   When receiving the (S1,G1) flow from S1, SW1 will choose only one
   link to send the flow, as per [RFC7432].  Assuming PE1 is the
   receiving PE on BD1, the IP Multicast flow will be forwarded as soon
   as BD1 creates multicast state for (S1,G1) or (*,G1).  In the example
   of Figure 3, receivers R1, R2 and R3 are interested in the multicast
   flow to G1.  R1 will receive (S1,G1) directly via the IRB interface
   as per [I-D.ietf-bess-evpn-irb-mcast].  Upon receiving IGMP reports
   from R2 and R3, PE3 will issue an SMET (*,G1) route that will create
   state in PE1's BD1.  PE1 will therefore forward the IP Multicast flow
   to PE3's SBD and PE3 will forward to R2 and R3, as per
   [I-D.ietf-bess-evpn-irb-mcast] procedures.

   When IP Multicast source multi-homing is required, EVPN multi-homed
   Ethernet Segments MUST be used.  EVPN multi-homing guarantees that
   only one Upstream PE will forward a given multicast flow at a time,
   avoiding packet duplication at the Downstream PEs.
   In addition, the
   SMET route for a given flow creates state in all the multi-homing
   Upstream PEs.  Therefore, in case of failure on the Upstream PE
   forwarding the flow, the backup Upstream PE can forward the flow
   immediately.

   This document assumes that multi-homing PEs attached to the same
   source always use multi-homed Ethernet Segments.

1.4.  The Need for Redundant IP Multicast Sources in EVPN

   While multi-homing PEs to the same IP Multicast G-source provides a
   certain level of resiliency, multicast applications are often
   critical in the Operator's network and a greater level of redundancy
   is required.  This document assumes that:

   a.  Redundant G-sources for an SFG may exist in the EVPN tenant
       network.  A Redundant G-source is a host or a router that sends
       an SFG in a tenant network where there is another host or router
       sending traffic to the same SFG.

   b.  Those redundant G-sources may be in the same BD or different BDs
       of the tenant.  There must not be restrictions imposed on the
       location of the receiver systems either.

   c.  The redundant G-sources can be single-homed to only one EVPN PE
       or multi-homed to multiple EVPN PEs.

   d.  The EVPN PEs must avoid duplication of the same SFG on the
       receiver systems.

2.  Solution Overview

   An SFG is represented as (*,G) if any source that issues multicast
   traffic to G is a redundant G-source.  Alternatively, this document
   allows an SFG to be represented as (S,G), where S is a prefix of any
   length.  In this case, a source is considered a redundant G-source
   for the SFG if it is contained in the prefix.  This document allows
   variable length prefixes in the Sources advertised in S-PMSI A-D
   routes only for the particular application of redundant G-sources.
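
   As a non-normative illustration of the two SFG representations
   described above, the following Python sketch checks whether a source
   is a redundant G-source for an SFG.  The function and parameter
   names are hypothetical and are not defined by this document; the
   addresses are the ones used as an example in Section 4.

```python
import ipaddress
from typing import Optional


def is_redundant_g_source(source: str, sfg_source_prefix: Optional[str]) -> bool:
    """Return True if `source` is a redundant G-source for the SFG.

    `sfg_source_prefix` is None for an SFG represented as (*,G), or a
    prefix string such as "10.0.0.0/30" for an SFG represented as (S,G)
    with a variable-length Source.  Both names are illustrative only.
    """
    if sfg_source_prefix is None:
        # (*,G): any source sending to G is a redundant G-source.
        return True
    # (S,G) with S a prefix: the source must be contained in the prefix.
    prefix = ipaddress.ip_network(sfg_source_prefix, strict=True)
    return ipaddress.ip_address(source) in prefix


if __name__ == "__main__":
    for src in ("10.0.0.1", "10.0.0.2", "10.0.0.10"):
        print(src, is_redundant_g_source(src, "10.0.0.0/30"))
```

   Passing None as the prefix models the (*,G) case, in which every
   source of G is treated as redundant.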

   There are two redundant G-source solutions described in this
   document:

   *  Warm Standby (WS) Solution

   *  Hot Standby (HS) Solution

   The WS solution is considered an upstream-PE-based solution (since
   downstream PEs do not participate in the procedures), in which all
   the upstream PEs attached to redundant G-sources for an SFG
   represented by (*,G) or (S,G) will elect a "Single Forwarder" (SF)
   among themselves.  Once an SF is elected, the upstream PEs add a
   Reverse Path Forwarding (RPF) check to the (*,G) or (S,G) state for
   the SFG:

   *  A non-SF upstream PE discards any (*,G)/(S,G) packets received
      over a local AC.

   *  The SF accepts and forwards any (*,G)/(S,G) packets it receives
      over a single local AC (for the SFG).  In case (*,G)/(S,G) packets
      for the SFG are received over multiple local ACs, they will be
      discarded in all the local ACs but one.  The procedure to choose
      the local AC that accepts packets is a local implementation
      matter.

   A failure on the SF will result in the election of a new SF.  The
   Election requires BGP extensions on the existing EVPN routes.  These
   extensions and associated procedures are described in Section 3 and
   Section 4 respectively.

   In the HS solution the downstream PEs are the ones avoiding the SFG
   duplication.  The upstream PEs are aware of the locally attached
   G-sources and add a unique Ethernet Segment Identifier label
   (ESI-label) per SFG to the SFG packets forwarded to downstream PEs.
   The downstream PEs pull the SFG from all the upstream PEs attached to
   the redundant G-sources and avoid duplication on the receiver systems
   by adding an RPF check to the (*,G) state for the SFG:

   *  A downstream PE discards any (*,G) packets it receives from the
      "wrong G-source".

   *  The wrong G-source is identified in the data path by an ESI-label
      that is different from the ESI-label used for the selected
      G-source.

   *  Note that the ESI-label is used here for "ingress filtering" (at
      the egress/downstream PE) as opposed to the [RFC7432] "egress
      filtering" (at the egress/downstream PE) used in the split-horizon
      procedures.  In [RFC7432] the ESI-label indicates what egress ACs
      must be skipped when forwarding BUM traffic to the egress.  In
      this document, the ESI-label indicates what ingress traffic must
      be discarded at the downstream PE.

   The use of ESI-labels for SFGs forwarded by upstream PEs requires
   some control plane and data plane extensions in the procedures used
   by [RFC7432] for multi-homing.  Upon failure of the selected
   G-source, the downstream PE will switch over to a different selected
   G-source, and will therefore change the RPF check for the (*,G)
   state.  The extensions and associated procedures are described in
   Section 3 and Section 5 respectively.

   An operator should use the HS solution if they require a fast
   fail-over time and the additional bandwidth consumption is acceptable
   (SFG packets are received multiple times on the downstream PEs).
   Otherwise the operator should use the WS solution, at the expense of
   a slower fail-over time in case of a G-source or upstream PE failure.
   Besides bandwidth efficiency, another advantage of the WS solution is
   that only the upstream PEs attached to the redundant G-sources for
   the same SFG need to be upgraded to support the new procedures.

   This document does not impose the support of both solutions on a
   system.  If one solution is supported, the support of the other
   solution is OPTIONAL.

3.  BGP EVPN Extensions

   This document makes use of the following BGP EVPN extensions:

   1.  SFG flag in the Multicast Flags Extended Community

       The Single Flow Group (SFG) flag is a new bit requested from IANA
       in the registry Multicast Flags Extended Community Flag
       Values.
       This new flag is set for S-PMSI A-D routes that carry a
       (*,G)/(S,G) SFG in the NLRI.

   2.  ESI Label Extended Community is used in S-PMSI A-D routes

       The HS solution requires the advertisement of one or more ESI
       Label Extended Communities [RFC7432] that encode the Ethernet
       Segment Identifier(s) associated with an S-PMSI A-D (*,G)/(S,G)
       route that advertises the presence of an SFG.  Only the ESI Label
       value in the extended community is relevant to the procedures in
       this document.  The Flags field in the extended community will be
       advertised as 0x00 and ignored on reception.  [RFC7432] specifies
       that the ESI Label Extended Community is advertised along with
       the A-D per ES route.  This document extends the use of this
       extended community so that it can be advertised multiple times
       (with different ESI values) along with the S-PMSI A-D route.

4.  Warm Standby (WS) Solution for Redundant G-Sources

   The general procedure is described as follows:

   1.  Configuration of the upstream PEs

       Upstream PEs (possibly attached to redundant G-sources) need to
       be configured to know which groups are carrying only flows from
       redundant G-sources, that is, the SFGs in the tenant domain.
       They will also be configured to know which local BDs may be
       attached to a redundant G-source.  The SFGs can be configured for
       any source (e.g., an SFG for "*") or for a prefix that contains
       multiple sources that will issue the same SFG (e.g.,
       "10.0.0.0/30").  In the latter case, sources 10.0.0.1 and
       10.0.0.2 are considered Redundant G-sources, whereas 10.0.0.10 is
       not considered a redundant G-source for the same SFG.

       As an example:

       *  PE1 is configured to know that G1 is an SFG for any source and
          redundant G-sources for G1 may be attached to BD1 or BD2.

       *  Or PE1 can also be configured to know that G1 is an SFG for
          the sources contained in 10.0.0.0/30, and those redundant
          G-sources may be attached to BD1 or BD2.

   2.  Signaling the location of a G-source for a given SFG

       Upon receiving G-traffic for a configured SFG on a BD, an
       upstream PE configured to follow this procedure, e.g., PE1:

       *  Originates an S-PMSI A-D (*,G)/(S,G) route for the SFG.  A
          (*,G) route is advertised if the SFG is configured for any
          source, and an (S,G) route is advertised (where the Source can
          have any length) if the SFG is configured for a prefix.

       *  The S-PMSI A-D route is imported by all the PEs attached to
          the tenant domain.  In order to do that, the route will use
          the SBD-RT (Supplementary Broadcast Domain Route-Target) in
          addition to the BD-RT of the BD over which the G-traffic is
          received.  The route SHOULD also carry a DF Election Extended
          Community (EC) and a flag indicating that it conveys an SFG.
          The DF Election EC and its use are specified in [RFC8584].

       *  The above S-PMSI A-D route MAY be advertised with or without a
          PMSI Tunnel Attribute (PTA):

          -  With no PTA if an I-PMSI or S-PMSI A-D with IR/AR/BIER is
             to be used.

          -  With PTA in any other case.

       *  The S-PMSI A-D route is triggered by the first packet of the
          SFG and withdrawn when the flow is not received anymore.
          Detecting when the G-source is no longer active is a local
          implementation matter.  The use of a timer is RECOMMENDED.
          The timer is started when the traffic to G1 stops being
          received.  Upon expiration of the timer, the PE will withdraw
          the route.

   3.  Single Forwarder (SF) Election

       If the PE with a local G-source receives one or more S-PMSI A-D
       routes for the same SFG from a remote PE, it will run a Single
       Forwarder (SF) Election based on the information encoded in the
       DF Election EC.
       Two S-PMSI A-D routes are considered to be for the
       same SFG if they are advertised for the same tenant, and their
       Multicast Source Length, Multicast Source, Multicast Group Length
       and Multicast Group fields match.

       1.  A given DF Alg can only be used if all the PEs running the DF
           Alg have consistent input.  For example, in an OISM network,
           if the redundant G-sources for an SFG are attached to BDs
           with different Ethernet Tags, the Default DF Election Alg
           MUST NOT be used.

       2.  In case there is a mismatch in the DF Election Alg or
           capabilities advertised by two PEs competing for the SF, the
           lowest PE IP address (given by the Originator Address in the
           S-PMSI A-D route) will be used as a tie-breaker.

   4.  RPF check on the PEs attached to a redundant G-source

       All the PEs with a local G-source for the SFG will add an RPF
       check to the (*,G)/(S,G) state for the SFG.  That RPF check
       depends on the SF Election result:

       1.  The non-SF PEs discard any (*,G)/(S,G) packets for the SFG
           received over a local AC.

       2.  The SF accepts any (*,G)/(S,G) packets for the SFG it
           receives over one (and only one) local AC.

   The solution above provides redundancy for SFGs and it does not
   require an upgrade of the downstream PEs (PEs where there is
   certainty that no redundant G-sources are connected).  Other
   G-sources for non-SFGs may exist in the same tenant domain.  This
   document does not change the existing procedures for non-SFG
   G-sources.

   The redundant G-sources can be single-homed or multi-homed to a BD in
   the tenant domain.  Multi-homing does not change the above
   procedures.

   Section 4.1 and Section 4.2 show two examples of the WS solution.

4.1.  WS Example in an OISM Network

   Figure 4 illustrates an example in which S1 and S2 are redundant
   G-sources for the SFG (*,G1).
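
   The Single Forwarder election and the RPF check of Section 4, which
   this example exercises with preferences 200 (PE1) and 100 (PE2), can
   be sketched as follows.  This is a minimal, non-normative sketch:
   the class, field and function names are hypothetical, only a
   preference-style algorithm is modeled, the rule that the highest
   preference wins follows this example (the PE with the higher
   preference becomes the SF) and is otherwise an assumption, and the
   originator addresses are documentation addresses chosen for
   illustration.

```python
import ipaddress
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SfCandidate:
    """One S-PMSI A-D route for the SFG, as seen by an upstream PE."""
    originator: str    # Originator Address carried in the S-PMSI A-D route
    df_alg: str        # DF Alg advertised in the DF Election EC (label only)
    capabilities: int  # capabilities advertised in the DF Election EC
    preference: int    # only meaningful for the preference-style algorithm


def _ip_key(candidate: SfCandidate) -> int:
    """Numeric form of the Originator Address, used for ordering."""
    return int(ipaddress.ip_address(candidate.originator))


def elect_single_forwarder(candidates: List[SfCandidate]) -> SfCandidate:
    """Pick the SF among the PEs that advertised the same SFG."""
    if not candidates:
        raise ValueError("no candidates for this SFG")
    advertised = {(c.df_alg, c.capabilities) for c in candidates}
    if len(advertised) > 1:
        # Mismatch in DF Alg or capabilities: lowest PE IP address wins.
        return min(candidates, key=_ip_key)
    # Consistent input.  Assumption: highest preference wins, and the
    # lowest IP address breaks a preference tie.
    return min(candidates, key=lambda c: (-c.preference, _ip_key(c)))


def accepts_from_local_ac(is_sf: bool, ac: str, selected_ac: str) -> bool:
    """RPF check for SFG packets received over local AC `ac`."""
    # A non-SF PE discards everything; the SF accepts a single local AC.
    return is_sf and ac == selected_ac


if __name__ == "__main__":
    pe1 = SfCandidate("192.0.2.1", "preference", 0, 200)
    pe2 = SfCandidate("192.0.2.2", "preference", 0, 100)
    print("SF:", elect_single_forwarder([pe1, pe2]).originator)
```

   How the SF selects the single local AC it accepts traffic from is a
   local implementation matter, so the sketch takes it as an input.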

                       S1   (Single             S2
                       |    Forwarder)          |
                (S1,G1)|                 (S2,G1)|
                       |                        |
              PE1      |               PE2      |
              +--------v---+           +--------v---+
      S-PMSI  |      +---+ |           |      +---+ | S-PMSI
      (*,G1)  |  +---|BD1| |           |  +---|BD2| | (*,G1)
      Pref200 |  |VRF+---+ |           |  |VRF+---+ | Pref100
        |SFG  |+---+  | |  |           |+---+  |    |    SFG|
        | +----|SBD|--+ |  |-----------||SBD|--+    |---+   |
        v |   |+---+    |  |           |+---+       |   |   v
          |   +---------|--+           +------------+   |
    SMET  |             |                               | SMET
   (*,G1) |             |   (S1,G1)                     | (*,G1)
          |    +--------+------------------+            |
     ^    |    |                           |            |   ^
     |    |    |           EVPN            |            |   |
     |    |    |           OISM            |            |   |
     |    |    |                           |            |   |
      PE3 |    |           PE4             |            | PE5
      +--------v---+       +------------+  |   +------------+
      |      +---+ |       |      +---+ |  |   |      +---+ |
      |  +---|SBD| |-------|  +---|SBD| |--|---|  +---|SBD| |
      |  |VRF+---+ |       |  |VRF+---+ |  |   |  |VRF+---+ |
      |+---+  |    |       |+---+  |    |  |   |+---+  |    |
      ||BD3|--+    |       ||BD4|--+    |  +--->|BD1|--+    |
      |+---+       |       |+---+       |      |+---+       |
      +------------+       +------------+      +------------+
         |  ^                                     |  ^
         |  | IGMP                                |  | IGMP
        R1  | J(*,G1)                            R3  | J(*,G1)

             Figure 4: WS Solution for Redundant G-Sources

   The WS solution works as follows:

   1.  Configuration of the upstream PEs, PE1 and PE2

       PE1 and PE2 are configured to know that G1 is an SFG for any
       source and redundant G-sources for G1 may be attached to BD1 or
       BD2, respectively.

   2.  Signaling the location of S1 and S2 for (*,G1)

       Upon receiving (S1,G1) and (S2,G1) traffic, respectively, on a
       local AC, PE1 and PE2 originate S-PMSI A-D (*,G1) routes with
       the SBD-RT, DF Election Extended Community (EC) and a flag
       indicating that they convey an SFG.

   3.  Single Forwarder (SF) Election

       Based on the DF Election EC content, PE1 and PE2 elect an SF for
       (*,G1).  Assuming both PEs agree on, e.g., Preference-based
       Election as the algorithm to use [DF-PREF], and PE1 has a higher
       preference, PE1 becomes the SF for (*,G1).

   4.  RPF check on the PEs attached to a redundant G-source

       a.
           The non-SF, PE2, discards any (*,G1) packets received over a
           local AC.

       b.  The SF, PE1, accepts (*,G1) packets it receives over one (and
           only one) local AC.

   The end result is that, upon receiving reports for (*,G1) or (S,G1),
   the downstream PEs (PE3 and PE5) will issue SMET routes and will pull
   the multicast SFG from PE1, and PE1 only.  A failure of S1, of the AC
   connected to S1, or of PE1 itself will trigger the withdrawal of the
   S-PMSI A-D (*,G1) route from PE1, and PE2 will be promoted to SF.

4.2.  WS Example in a Single-BD Tenant Network

   Figure 5 illustrates an example in which S1 and S2 are redundant
   G-sources for the SFG (*,G1); however, now all the G-sources and
   receivers are connected to the same BD1 and there is no SBD.

                       S1   (Single             S2
                       |    Forwarder)          |
                (S1,G1)|                 (S2,G1)|
                       |                        |
              PE1      |               PE2      |
              +--------v---+           +--------v---+
      S-PMSI  |      +---+ |           |      +---+ | S-PMSI
      (*,G1)  |      |BD1| |           |      |BD1| | (*,G1)
      Pref200 |      +---+ |           |      +---+ | Pref100
        |SFG  |         |  |           |            |    SFG|
        | +---|         |  |-----------|            |---+   |
        v |   |         |  |           |            |   |   v
          |   +---------|--+           +------------+   |
    SMET  |             |                               | SMET
   (*,G1) |             |   (S1,G1)                     | (*,G1)
          |    +--------+------------------------+      |
     ^    |    |                                 |      |   ^
     |    |    |           EVPN                  |      |   |
     |    |    |                                 |      |   |
     |    |    |                                 |      |   |
      PE3 |    |           PE4                   |      | PE5
      +--------v---+       +------------+      +-|----------+
      |      +---+ |       |      +---+ |      | |    +---+ |
      |      |BD1| |-------|      |BD1| |------| +--->|BD1| |
      |      +---+ |       |      +---+ |      |      +---+ |
      |            |       |            |      |            |
      |            |       |            |      |            |
      |            |       |            |      |            |
      +------------+       +------------+      +------------+
         |  ^                                     |  ^
         |  | IGMP                                |  | IGMP
        R1  | J(*,G1)                            R3  | J(*,G1)

       Figure 5: WS Solution for Redundant G-Sources in the same BD

   The same procedure as in Section 4.1 is valid here, since this is a
   sub-case of the one in Section 4.1.  Upon receiving traffic for the
   SFG G1, PE1 and PE2 advertise the S-PMSI A-D routes with BD1-RT only,
   since there is no SBD.

5.
Hot Standby (HS) Solution for Redundant G-Sources

   If fast-failover is required upon the failure of a G-source or PE
   attached to the G-source and the extra bandwidth consumption in the
   tenant network is not an issue, the HS solution should be used.  The
   procedure is as follows:

   1.  Configuration of the PEs

       As in the WS case, the upstream PEs where redundant G-sources may
       exist need to be configured to know which groups (for any source
       or a prefix containing the intended sources) are carrying only
       flows from redundant G-sources, that is, the SFGs in the tenant
       domain.

       In addition (and this is not done in WS mode), the individual
       redundant G-sources for an SFG need to be associated with an
       Ethernet Segment (ES) on the upstream PEs.  This is irrespective
       of the redundant G-source being multi-homed or single-homed.
       Even for single-homed redundant G-sources the HS procedure relies
       on the ESI labels for the RPF check on downstream PEs.  The term
       "S-ESI" is used in this document to refer to an ESI associated to
       a redundant G-source.

       Contrary to what is specified in the WS method (that is
       transparent to the downstream PEs), the support of the HS
       procedure is required not only on the upstream PEs but also on
       all downstream PEs connected to the receivers in the tenant
       network.  The downstream PEs do not need to be configured to know
       the connected SFGs or their ESIs, since they get that information
       from the upstream PEs.  The downstream PEs will locally select an
       ESI for a given SFG, and will program an RPF check to the
       (*,G)/(S,G) state for the SFG that will discard (*,G)/(S,G)
       packets from the rest of the ESIs.  The selection of the ESI for
       the SFG is based on local policy.

   2.
       Signaling the location of a G-source for a given SFG and its
       association to the local ESIs

       Based on the configuration in step 1, an upstream PE configured
       to follow the HS procedures:

       a.  Advertises an S-PMSI A-D (*,G)/(S,G) route for each
           configured SFG.  These routes need to be imported by all the
           PEs of the tenant domain, therefore they will carry the BD-RT
           and SBD-RT (if the SBD exists).  The route also carries the
           ESI Label Extended Communities needed to convey all the
           S-ESIs associated to the SFG in the PE.

       b.  The S-PMSI A-D route will convey a PTA in the same cases as
           in the WS procedure.

       c.  The S-PMSI A-D (*,G)/(S,G) route is triggered by the
           configuration of the SFG and not by the reception of
           G-traffic.

   3.  Distribution of DCB (Domain-wide Common Block) ESI-labels and
       G-source ES routes

       An upstream PE advertises the corresponding ES, A-D per EVI and
       A-D per ES routes for the local S-ESIs.

       a.  ES routes are used for regular DF Election for the S-ES.
           This document does not introduce any change in the procedures
           related to the ES routes.

       b.  The A-D per EVI and A-D per ES routes MUST include the SBD-RT
           since they have to be imported by all the PEs in the tenant
           domain.

       c.  The A-D per ES routes convey the S-ESI labels that the
           downstream PEs use to add the RPF check for the (*,G)/(S,G)
           associated to the SFGs.  This RPF check requires that all the
           packets for a given G-source are received with the same S-ESI
           label value on the downstream PEs.  For example, if two
           redundant G-sources are multi-homed to PE1 and PE2 via S-ES-1
           and S-ES-2, PE1 and PE2 MUST allocate the same ESI label "Lx"
           for S-ES-1 and they MUST allocate the same ESI label "Ly" for
           S-ES-2.  In addition, Lx and Ly MUST be different.
           These ESI labels are Domain-wide Common Block (DCB) labels
           and follow the allocation procedures in
           [I-D.ietf-bess-mvpn-evpn-aggregation-label].

   4.  Processing of A-D per ES/EVI routes and RPF check on the
       downstream PEs

       The A-D per ES/EVI routes are received and imported in all the
       PEs in the tenant domain.  The processing of the A-D per ES/EVI
       routes on a given PE depends on its configuration:

       a.  The PEs attached to the same BD of the BD-RT that is included
           in the A-D per ES/EVI routes will process the routes as in
           [RFC7432] and [RFC8584].  If the receiving PE is attached to
           the same ES as indicated in the route, [RFC7432]
           split-horizon procedures will be followed and the DF Election
           candidate list may be modified as in [RFC8584] if the ES
           supports the AC-DF capability.

       b.  The PEs that are not attached to the BD-RT but are attached
           to the SBD of the received SBD-RT will import the A-D per
           ES/EVI routes and use them for redundant G-source mass
           withdrawal, as explained later.

       c.  Upon importing A-D per ES routes corresponding to different
           S-ESes, a PE MUST select a primary S-ES and add an RPF check
           to the (*,G)/(S,G) state in the BD or SBD.  This RPF check
           will discard all ingress packets to (*,G)/(S,G) that are not
           received with the ESI-label of the primary S-ES.  The
           selection of the primary S-ES is a matter of local policy.

   5.  G-traffic forwarding for redundant G-sources and fault detection

       Assuming there is (*,G) or (S,G) state for the SFG with OIF
       (Output Interface) list entries associated to remote EVPN PEs,
       upon receiving G-traffic on an S-ES, the upstream PE will add an
       S-ESI label at the bottom of the stack before forwarding the
       traffic to the remote EVPN PEs.  This label is allocated from a
       DCB as described in step 3.
       If P2MP or BIER PMSIs are used, this does not add any new data
       path procedures on the upstream PEs (except that the ESI-label is
       allocated from a DCB as described in
       [I-D.ietf-bess-mvpn-evpn-aggregation-label]).  However, if IR/AR
       are used, this document extends the [RFC7432] procedures by
       pushing the S-ESI labels not only on packets sent to the PEs that
       share the ES but also to the rest of the PEs in the tenant
       domain.  This allows the downstream PEs to receive all the
       multicast packets from the redundant G-sources with an S-ESI
       label (irrespective of the PMSI type and the local ESes), and
       discard any packet that conveys an S-ESI label different from the
       primary S-ESI label (that is, the label associated to the
       selected primary S-ES), as discussed in step 4.

       If the last A-D per EVI or the last A-D per ES route for the
       primary S-ES is withdrawn, the downstream PE will immediately
       select a new primary S-ES and will change the RPF check.  Note
       that if the S-ES is re-used for multiple tenant domains by the
       upstream PEs, the withdrawal of all the A-D per-ES routes for an
       S-ES provides a mass withdrawal capability that makes a
       downstream PE change the RPF check in all the tenant domains
       using the same S-ES.

       The withdrawal of the last S-PMSI A-D route for a given
       (*,G)/(S,G) that represents an SFG SHOULD make the downstream PE
       remove the S-ESI label based RPF check on (*,G)/(S,G).

5.1.  Extensions for the Advertisement of DCB Labels

   DCB Labels are specified in
   [I-D.ietf-bess-mvpn-evpn-aggregation-label] and this document makes
   use of them for the procedures described in Section 5.
   [I-D.ietf-bess-mvpn-evpn-aggregation-label] assumes that DCB labels
   can only be used along with MP2MP/P2MP/BIER tunnels and that, if the
   PMSI label is signaled as a DCB label, then the ESI label used for
   multi-homing is also a DCB label.
   This document extends the use of the DCB allocation for ESI labels
   so that:

   a.  DCB-allocated ESI labels MAY be used along with IR tunnels, and

   b.  DCB-allocated ESI labels MAY be used by PEs that do not use
       DCB-allocated PMSI labels.

   This control plane extension is indicated by adding the DCB-flag or
   the Context Label Space ID Extended Community to the A-D per ES
   route(s) advertised for the S-ES.  The DCB-flag is encoded in the ESI
   Label Extended Community as follows:

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type=0x06     | Sub-Type=0x01 | Flags(1 octet)|  Reserved=0   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Reserved=0   |          ESI Label                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 6: ESI Label Extended Community

   This document defines bit 5 in the Flags octet of the ESI Label
   Extended Community as the ESI-DCB-flag.  When the ESI-DCB-flag is
   set, it indicates that the ESI label is a DCB label.

   A received ESI label is considered DCB if either of these two
   conditions is met:

   a.  The ESI label is encoded in an ESI Label Extended Community with
       the ESI-DCB-flag set.

   b.  The ESI label is signaled from a PE that advertised a PMSI label
       that is a DCB label.

   As in [I-D.ietf-bess-mvpn-evpn-aggregation-label], this document also
   allows the use of the Context Label Space ID Extended Community.
   When the Context Label Space ID Extended Community is advertised
   along with the ESI label in an A-D per ES route, the ESI label is
   from a context label space identified by the DCB label in the
   Extended Community.

5.2.
Use of BFD in the HS Solution

   In addition to using the state of the A-D per EVI, A-D per ES or
   S-PMSI A-D routes to modify the RPF check on (*,G)/(S,G) as discussed
   in Section 5, the Bidirectional Forwarding Detection (BFD) protocol
   MAY be used to find the status of the multipoint tunnels used to
   forward the SFG from the redundant G-sources.

   The BGP-BFD Attribute is advertised along with the S-PMSI A-D or IMET
   routes (depending on whether I-PMSI or S-PMSI trees are used) and the
   procedures described in [EVPN-BFD] are used to bootstrap multipoint
   BFD sessions on the downstream PEs.

5.3.  HS Example in an OISM Network

   Figure 7 illustrates the HS model in an OISM network.  Consider that
   S1 and S2 are redundant G-sources for the SFG (*,G1) in BD1 (any
   source using G1 is assumed to transmit an SFG).  S1 and S2 are
   (all-active) multi-homed to upstream PEs, PE1 and PE2.  The receivers
   are attached to downstream PEs, PE3 and PE5, in BD3 and BD1,
   respectively.  S1 and S2 are assumed to be connected by a LAG to the
   multi-homing PEs, and the multicast traffic can use the link to
   either upstream PE.  The diagram illustrates how S1 sends the
   G-traffic to PE1 and PE1 forwards to the remote interested downstream
   PEs, whereas S2 sends to PE2 and PE2 forwards further.  In this HS
   model, the interested downstream PEs will get duplicate G-traffic
   from the two G-sources for the same SFG.  While the diagram shows
   that the two flows are forwarded by different upstream PEs, the
   all-active multi-homing procedures may cause the two flows to come
   from the same upstream PE.  Therefore, finding out the upstream PE
   for the flow is not enough for the downstream PEs to program the
   required RPF check to avoid duplicate packets on the receiver.
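
   The last point of the paragraph above, that the upstream PE alone
   does not identify the flow, can be illustrated with a short,
   non-normative Python sketch of a downstream RPF check keyed on the
   S-ESI label.  The Packet type, the rpf_accept function and the label
   values 1001 and 1002 are hypothetical and chosen only for this
   illustration; selecting the lowest ESI is just one possible local
   policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    upstream_pe: str  # PE that forwarded the packet into the tenant domain
    source: str       # redundant G-source that originated the flow
    group: str        # multicast group of the SFG
    esi_label: int    # S-ESI label found at the bottom of the label stack


# Hypothetical DCB label values: a given S-ES uses the same value on every PE.
S_ESI_LABELS = {"ESI-1": 1001, "ESI-2": 1002}


def rpf_accept(packet, primary_s_esi):
    """Accept only packets carrying the label of the primary S-ES."""
    return packet.esi_label == S_ESI_LABELS[primary_s_esi]


# All-active multi-homing may deliver both flows through the same upstream PE,
# so filtering on upstream_pe could not tell them apart.
flows = [
    Packet("PE1", "S1", "G1", S_ESI_LABELS["ESI-1"]),
    Packet("PE1", "S2", "G1", S_ESI_LABELS["ESI-2"]),
]
primary = min(S_ESI_LABELS)  # local policy assumed here: lowest ESI
accepted = [p.source for p in flows if rpf_accept(p, primary)]
print(accepted)  # only the flow from S1 passes the check
```

   Both packets arrive from PE1 in this sketch, yet only the one tagged
   with the label of the primary S-ES is accepted, which is the
   behavior the HS procedure relies on.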

                       S1(ESI-1)                S2(ESI-2)
                       |                        |
                       | +----------------------+
                (S1,G1)| |               (S2,G1)|
                       +----------------------+ |
              PE1      | |             PE2    | |
              +--------v---+           +--------v---+
              |      +---+ |           |      +---+ | S-PMSI
   S-PMSI     |  +---|BD1| |           |  +---|BD1| | (*,G1)
   (*,G1)     |  |VRF+---+ |           |  |VRF+---+ | SFG
   SFG        |+---+  | |  |           |+---+  | |  | ESI1,2
   ESI1,2 +---||SBD|--+ |  |-----------||SBD|--+ |  |---+   |
     |    |   |+---+    |  |   EVPN    |+---+    |  |   |   v
     v    |   +---------|--+   OISM    +---------|--+   |
          |             |                        |      |
          |             |   (S1,G1)              |      |
   SMET   |   +---------+------------------+     |      | SMET
   (*,G1) |   |                            |     |      | (*,G1)
     ^    |   | +----------------------------+---+      |   ^
     |    |   | |         (S2,G1)          | |          |   |
     |    |   | |                          | |          |   |
      PE3 |   | |          PE4             | |          | PE5
      +-------v-v--+       +------------+  | | +------------+
      |      +---+ |       |      +---+ |  | | |      +---+ |
      |  +---|SBD| +-------|  +---|SBD| |--|-|-|  +---|SBD| |
      |  |VRF+---+ |       |  |VRF+---+ |  | | |  |VRF+---+ |
      |+---+  |    |       |+---+  |    |  | | |+---+  |    |
      ||BD3|--+    |       ||BD4|--+    |  | +->|BD1|--+    |
      |+---+       |       |+---+       |  +--->+---+       |
      +------------+       +------------+      +------------+
         |  ^                                     |  ^
         |  | IGMP                                |  | IGMP
        R1  | J(*,G1)                            R3  | J(*,G1)

    Figure 7: HS Solution for Multi-homed Redundant G-Sources in OISM

   In this scenario, the HS solution works as follows:

   1.  Configuration of the upstream PEs, PE1 and PE2

       PE1 and PE2 are configured to know that G1 is an SFG for any
       source (a source prefix length could have been configured
       instead) and the redundant G-sources for G1 use S-ESIs ESI-1 and
       ESI-2 respectively.  Both ESes are configured in both PEs and the
       ESI value can be configured or auto-derived.  The ESI-label
       values are allocated from a DCB
       [I-D.ietf-bess-mvpn-evpn-aggregation-label] and are configured
       either locally or by a centralized controller.  We assume ESI-1
       is configured to use ESI-label-1 and ESI-2 to use ESI-label-2.

       The downstream PEs, PE3, PE4 and PE5, are configured to support
       HS mode and select the G-source with, e.g., the lowest ESI value.

   2.  PE1 and PE2 advertise S-PMSI A-D (*,G1) and ES/A-D per ES/EVI
       routes

       Based on the configuration of step 1, PE1 and PE2 advertise an
       S-PMSI A-D (*,G1) route each.  The route from each of the two PEs
       will include TWO ESI Label Extended Communities with ESI-1 and
       ESI-2 respectively, as well as BD1-RT plus SBD-RT and a flag that
       indicates that (*,G1) is an SFG.

       In addition, PE1 and PE2 advertise ES and A-D per ES/EVI routes
       for ESI-1 and ESI-2.  The A-D per ES and per EVI routes will
       include the SBD-RT so that they can be imported by the downstream
       PEs that are not attached to BD1, e.g., PE3 and PE4.  The A-D per
       ES routes will convey ESI-label-1 for ESI-1 (on both PEs) and
       ESI-label-2 for ESI-2 (also on both PEs).

   3.  Processing of A-D per ES/EVI routes and RPF check

       PE1 and PE2 receive each other's ES and A-D per ES/EVI routes.
       Regular [RFC7432] [RFC8584] procedures will be followed for DF
       Election and programming of the ESI-labels for egress
       split-horizon filtering.  PE3/PE4 import the A-D per ES/EVI
       routes in the SBD.  Since PE3 has created a (*,G1) state based on
       local interest, PE3 will add an RPF check to (*,G1) so that
       packets coming with ESI-label-2 are discarded (lowest ESI value
       is assumed to give the primary S-ES).

   4.  G-traffic forwarding and fault detection

       PE1 receives G-traffic (S1,G1) on ES-1 that is forwarded within
       the context of BD1.  Irrespective of the tunnel type, PE1 pushes
       ESI-label-1 at the bottom of the stack and the traffic gets to
       PE3 and PE5 with the mentioned ESI-label (PE4 has no local
       interested receivers).  The G-traffic with ESI-label-1 passes the
       RPF check and it is forwarded to R1.
       In the same way, PE2 sends (S2,G1) with ESI-label-2, but this
       G-traffic does not pass the RPF check and gets discarded at
       PE3/PE5.

       If the link from S1 to PE1 fails, S1 will forward the (S1,G1)
       traffic to PE2 instead.  PE1 withdraws the ES and A-D routes for
       ESI-1.  Now both flows will be originated by PE2; however, the
       RPF checks do not change in PE3/PE5.

       If, subsequently, the link from S1 to PE2 fails, PE2 also
       withdraws the ES and A-D routes for ESI-1.  Since PE3 and PE5 no
       longer have A-D per ES/EVI routes for ESI-1, they immediately
       change the RPF check so that packets with ESI-label-2 are now
       accepted.

   Figure 8 illustrates a scenario where S1 and S2 are single-homed to
   PE1 and PE2 respectively.  This scenario is a sub-case of the one in
   Figure 7.  Now ES-1 only exists in PE1, hence only PE1 advertises the
   A-D per ES/EVI routes for ESI-1.  Similarly, ES-2 only exists in PE2
   and PE2 is the only PE advertising A-D routes for ESI-2.  The same
   procedures as in Figure 7 apply to this use-case.
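
   The failover sequence described for Figure 7 (the two successive
   link failures of S1) can be traced with a small, non-normative
   Python sketch of the state that a downstream PE might keep.  The
   DownstreamRpf class and its method names are hypothetical.  The
   sketch tracks a single A-D route per (S-ES, PE) pair, whereas the
   procedure in Section 5 considers both the A-D per ES and the A-D per
   EVI routes, and it assumes the lowest-ESI local policy used in this
   example.

```python
class DownstreamRpf:
    """Per-SFG route state on a downstream PE (hypothetical helper)."""

    def __init__(self):
        self._advertisers = {}  # S-ESI -> set of PEs with an active A-D route

    def advertise(self, s_esi, pe):
        self._advertisers.setdefault(s_esi, set()).add(pe)

    def withdraw(self, s_esi, pe):
        self._advertisers.get(s_esi, set()).discard(pe)

    def primary(self):
        """Lowest S-ESI that still has at least one A-D route, or None."""
        active = [esi for esi, pes in self._advertisers.items() if pes]
        return min(active) if active else None


pe3 = DownstreamRpf()
for esi in ("ESI-1", "ESI-2"):
    for pe in ("PE1", "PE2"):
        pe3.advertise(esi, pe)

history = [pe3.primary()]
pe3.withdraw("ESI-1", "PE1")  # link S1-PE1 fails: the RPF check is unchanged
history.append(pe3.primary())
pe3.withdraw("ESI-1", "PE2")  # link S1-PE2 fails: last route for ESI-1 is gone
history.append(pe3.primary())
print(history)
```

   The primary S-ES only changes once the last route for ESI-1 has been
   withdrawn, which matches the behavior described for PE3 and PE5.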

                       S1(ESI-1)                S2(ESI-2)
                       |                        |
                (S1,G1)|                 (S2,G1)|
                       |                        |
              PE1      |               PE2      |
              +--------v---+           +--------v---+
              |      +---+ |           |      +---+ | S-PMSI
   S-PMSI     |  +---|BD1| |           |  +---|BD2| | (*,G1)
   (*,G1)     |  |VRF+---+ |           |  |VRF+---+ | SFG
   SFG        |+---+  | |  |           |+---+  | |  | ESI2
   ESI1   +---||SBD|--+ |  |-----------||SBD|--+ |  |---+   |
     |    |   |+---+    |  |   EVPN    |+---+    |  |   |   v
     v    |   +---------|--+   OISM    +---------|--+   |
          |             |                        |      |
          |             |   (S1,G1)              |      |
   SMET   |   +---------+------------------+     |      | SMET
   (*,G1) |   |                            |     |      | (*,G1)
     ^    |   | +--------------------------------+----+ |   ^
     |    |   | |         (S2,G1)          |          | |   |
     |    |   | |                          |          | |   |
      PE3 |   | |          PE4             |          | | PE5
      +-------v-v--+       +------------+  |   +------v-----+
      |      +---+ |       |      +---+ |  |   |      +---+ |
      |  +---|SBD| |-------|  +---|SBD| |--|---|  +---|SBD| |
      |  |VRF+---+ |       |  |VRF+---+ |  |   |  |VRF+---+ |
      |+---+  |    |       |+---+  |    |  |   |+---+  |    |
      ||BD3|--+    |       ||BD4|--+    |  +--->|BD1|--+    |
      |+---+       |       |+---+       |      |+---+       |
      +------------+       +------------+      +------------+
         |  ^                                     |  ^
         |  | IGMP                                |  | IGMP
        R1  | J(*,G1)                            R3  | J(*,G1)

    Figure 8: HS Solution for single-homed Redundant G-Sources in OISM

5.4.  HS Example in a Single-BD Tenant Network

   Irrespective of the redundant G-sources being multi-homed or
   single-homed, if the tenant network has only one BD, e.g., BD1, the
   procedures of Section 5.3 still apply, except that the routes do not
   include any SBD-RT and all the procedures apply to BD1 only.

6.  Security Considerations

   The same Security Considerations described in
   [I-D.ietf-bess-evpn-irb-mcast] are valid for this document.

   From a security perspective, out of the two methods described in this
   document, the WS method is considered lighter in terms of control
   plane and therefore its impact is low on the processing capabilities
   of the PEs.  The HS method adds more burden on the control plane of
   all the PEs of the tenant with sources and receivers.

7.  IANA Considerations

   IANA is requested to allocate a Bit in the Multicast Flags Extended
   Community to indicate that a given (*,G) or (S,G) in an S-PMSI A-D
   route is associated with an SFG.

8.  References

8.1.  Normative References

   [RFC7432]  Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
              Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
              Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February
              2015.

   [RFC6513]  Rosen, E., Ed. and R. Aggarwal, Ed., "Multicast in
              MPLS/BGP IP VPNs", RFC 6513, DOI 10.17487/RFC6513,
              February 2012.

   [RFC6514]  Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP
              Encodings and Procedures for Multicast in MPLS/BGP IP
              VPNs", RFC 6514, DOI 10.17487/RFC6514, February 2012.

   [I-D.ietf-bess-evpn-igmp-mld-proxy]
              Sajassi, A., Thoria, S., Mishra, M., Drake, J., and W.
              Lin, "IGMP and MLD Proxy for EVPN", Work in Progress,
              Internet-Draft, draft-ietf-bess-evpn-igmp-mld-proxy-16,
              13 January 2022.

   [I-D.ietf-bess-evpn-irb-mcast]
              Lin, W., Zhang, Z., Drake, J., Rosen, E. C., Rabadan, J.,
              and A. Sajassi, "EVPN Optimized Inter-Subnet Multicast
              (OISM) Forwarding", Work in Progress, Internet-Draft,
              draft-ietf-bess-evpn-irb-mcast-06, 24 May 2021.

   [RFC8584]  Rabadan, J., Ed., Mohanty, S., Ed., Sajassi, A., Drake,
              J., Nagaraj, K., and S. Sathappan, "Framework for Ethernet
              VPN Designated Forwarder Election Extensibility",
              RFC 8584, DOI 10.17487/RFC8584, April 2019.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [I-D.ietf-bess-mvpn-evpn-aggregation-label]
              Zhang, Z., Rosen, E., Lin, W., Li, Z., and I.
              Wijnands, "MVPN/EVPN Tunnel Aggregation with Common
              Labels", Work in Progress, Internet-Draft,
              draft-ietf-bess-mvpn-evpn-aggregation-label-08,
              20 January 2022.

8.2.  Informative References

   [EVPN-RT5] Rabadan, J., Henderickx, W., Drake, J., Lin, W., and A.
              Sajassi, "IP Prefix Advertisement in EVPN", internet-draft
              ietf-bess-evpn-prefix-advertisement-11.txt, May 2018.

   [EVPN-BUM] Zhang, Z., Lin, W., Rabadan, J., and K. Patel, "Updates on
              EVPN BUM Procedures", internet-draft
              ietf-bess-evpn-bum-procedure-updates-06, June 2019.

   [DF-PREF]  Rabadan, J., Sathappan, S., Przygienda, T., Lin, W.,
              Drake, J., Sajassi, A., and S. Mohanty, "Preference-based
              EVPN DF Election", internet-draft
              ietf-bess-evpn-pref-df-04.txt, June 2019.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February
              2006.

   [EVPN-BFD] Govindan, V., Mallik, M., Sajassi, A., and G. Mirsky,
              "Fault Management for EVPN networks", internet-draft
              ietf-bess-evpn-bfd-01.txt, October 2020.

Appendix A.  Acknowledgments

   The authors would like to thank Mankamana Mishra and Ali Sajassi for
   their review and valuable comments.

Appendix B.  Contributors

Authors' Addresses

   Jorge Rabadan (editor)
   Nokia
   777 Middlefield Road
   Mountain View, CA 94043
   United States of America

   Email: jorge.rabadan@nokia.com


   Jayant Kotalwar
   Nokia
   701 E. Middlefield Road
   Mountain View, CA 94043 USA

   Email: jayant.kotalwar@nokia.com


   Senthil Sathappan
   Nokia
   701 E. Middlefield Road
   Mountain View, CA 94043 USA

   Email: senthil.sathappan@nokia.com


   Zhaohui Zhang
   Juniper Networks

   Email: zzhang@juniper.net


   Wen Lin
   Juniper Networks

   Email: wlin@juniper.net


   Eric C. Rosen
   Individual

   Email: erosen52@gmail.com