idnits 2.17.00 (12 Aug 2021) /tmp/idnits53445/draft-ietf-bess-bgp-multicast-04.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- == There are 8 instances of lines with non-RFC6890-compliant IPv4 addresses in the document. If these are example addresses, they should be changed. Miscellaneous warnings: ---------------------------------------------------------------------------- == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'MUST not' in this paragraph: A network currently running PIM can be incrementally transitioned to BGP based multicast. At any time, a router supporting BGP based multicast can use PIM with some neighbors (upstream or downstream) and BGP with some other neighbors. PIM and BGP MUST not be used simultaneously between two neighbors for multicast purpose, and routers connected to the same LAN MUST be transitioned during the same maintenance window. -- The document date (January 7, 2022) is 127 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFC8174' is mentioned on line 40, but not defined == Missing Reference: 'RFC7438' is mentioned on line 277, but not defined == Missing Reference: 'RFC5492' is mentioned on line 327, but not defined == Missing Reference: 'RFC4760' is mentioned on line 609, but not defined == Missing Reference: 'RFC4271' is mentioned on line 609, but not defined == Missing Reference: 'RFC7524' is mentioned on line 725, but not defined == Missing Reference: 'RFC6388' is mentioned on line 893, but not defined == Unused Reference: 'RFC4601' is defined on line 1091, but no explicit reference was found in the text == Unused Reference: 'RFC5015' is defined on line 1104, but no explicit reference was found in the text == Unused Reference: 'I-D.ietf-bess-mvpn-evpn-aggregation-label' is defined on line 1135, but no explicit reference was found in the text == Unused Reference: 'I-D.ietf-bess-mvpn-pe-ce' is defined on line 1141, but no explicit reference was found in the text == Unused Reference: 'I-D.ietf-spring-segment-routing-policy' is defined on line 1162, but no explicit reference was found in the text == Outdated reference: draft-ietf-idr-tunnel-encaps has been published as RFC 9012 ** Obsolete normative reference: RFC 4601 (Obsoleted by RFC 7761) == Outdated reference: A later version (-09) exists of draft-ietf-bess-bgp-multicast-controller-07 == Outdated reference: A later version (-08) exists of draft-ietf-bess-mvpn-evpn-aggregation-label-07 == Outdated reference: A later version (-19) exists of draft-ietf-lsr-flex-algo-18 == Outdated reference: A later version (-04) exists of draft-ietf-pim-sr-p2mp-policy-03 == Outdated reference: A later version (-22) exists of draft-ietf-spring-segment-routing-policy-14 == Outdated 
reference: A later version (-14) exists of draft-kaliraj-idr-bgp-classful-transport-planes-12 == Outdated reference: A later version (-04) exists of draft-wijnands-mpls-mldp-multi-topology-03 Summary: 1 error (**), 0 flaws (~~), 22 warnings (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 BESS Z. Zhang 3 Internet-Draft L. Giuliano 4 Intended status: Standards Track Juniper Networks 5 Expires: July 11, 2022 K. Patel 6 Arrcus 7 I. Wijnands 8 M. Mishra 9 Cisco Systems 10 A. Gulko 11 Refinitiv 12 January 7, 2022 14 BGP Based Multicast 15 draft-ietf-bess-bgp-multicast-04 17 Abstract 19 This document specifies a BGP address family and related procedures 20 that allow BGP to be used for setting up multicast distribution 21 trees. This document also specifies procedures that enable BGP to be 22 used for multicast source discovery, and for showing interest in 23 receiving particular multicast flows. Taken together, these 24 procedures allow BGP to be used as a replacement for other multicast 25 routing protocols, such as PIM or mLDP. The BGP procedures specified 26 here are based on the BGP multicast procedures that were originally 27 designed for use by providers of Multicast Virtual Private Network 28 service. 30 This document also describes how various signaling mechanisms can be 31 used to set up end-to-end inter-region multicast trees. 33 Requirements Language 35 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 36 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 37 "OPTIONAL" in this document are to be interpreted as described in BCP 38 14 [RFC2119] [RFC8174] when, and only when, they appear in all 39 capitals, as shown here. 41 Status of This Memo 43 This Internet-Draft is submitted in full conformance with the 44 provisions of BCP 78 and BCP 79. 
46 Internet-Drafts are working documents of the Internet Engineering 47 Task Force (IETF). Note that other groups may also distribute 48 working documents as Internet-Drafts. The list of current Internet- 49 Drafts is at https://datatracker.ietf.org/drafts/current/. 51 Internet-Drafts are draft documents valid for a maximum of six months 52 and may be updated, replaced, or obsoleted by other documents at any 53 time. It is inappropriate to use Internet-Drafts as reference 54 material or to cite them other than as "work in progress." 56 This Internet-Draft will expire on July 11, 2022. 58 Copyright Notice 60 Copyright (c) 2022 IETF Trust and the persons identified as the 61 document authors. All rights reserved. 63 This document is subject to BCP 78 and the IETF Trust's Legal 64 Provisions Relating to IETF Documents 65 (https://trustee.ietf.org/license-info) in effect on the date of 66 publication of this document. Please review these documents 67 carefully, as they describe your rights and restrictions with respect 68 to this document. Code Components extracted from this document must 69 include Simplified BSD License text as described in Section 4.e of 70 the Trust Legal Provisions and are provided without warranty as 71 described in the Simplified BSD License. 73 Table of Contents 75 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 76 1.1. Motivation . . . . . . . . . . . . . . . . . . . . . . . 3 77 1.1.1. Native/unlabeled Multicast . . . . . . . . . . . . . 3 78 1.1.2. Labeled Multicast . . . . . . . . . . . . . . . . . . 4 79 1.2. Overview . . . . . . . . . . . . . . . . . . . . . . . . 5 80 1.2.1. (x,g) Multicast . . . . . . . . . . . . . . . . . . . 5 81 1.2.1.1. Source Discovery for ASM . . . . . . . . . . . . 6 82 1.2.1.2. ASM Shared-tree-only Mode . . . . . . . . . . . . 6 83 1.2.1.3. Integration with BGP-MVPN . . . . . . . . . . . . 7 84 1.2.2. BGP Inband Signaling for mLDP Tunnel . . . . . . . . 7 85 1.2.3. BGP Sessions . . . . . . . . 
. . . . . . . . . . . . 7 86 1.2.4. LAN and Parallel Links . . . . . . . . . . . . . . . 8 87 1.2.5. Transition . . . . . . . . . . . . . . . . . . . . . 9 88 1.2.6. Inter-region Multicast . . . . . . . . . . . . . . . 10 89 1.2.6.1. Inband Signaling across a Region . . . . . . . . 10 90 1.2.6.2. Overlay Signaling Over a Region . . . . . . . . . 10 91 1.2.6.3. Controller Based Signaling . . . . . . . . . . . 11 92 1.2.7. BGP Classful Transport Planes . . . . . . . . . . . . 12 93 1.2.8. Flexible Algorithm and Multi-topology . . . . . . . . 13 94 2. Specification . . . . . . . . . . . . . . . . . . . . . . . . 13 95 2.1. BGP NLRIs and Attributes . . . . . . . . . . . . . . . . 13 96 2.1.1. S-PMSI A-D Route . . . . . . . . . . . . . . . . . . 14 97 2.1.2. Leaf A-D Route . . . . . . . . . . . . . . . . . . . 15 98 2.1.3. Source Active A-D Route . . . . . . . . . . . . . . . 16 99 2.1.4. S-PMSI A-D Route for C-multicast mLDP . . . . . . . . 17 100 2.1.5. Session Address Extended Community . . . . . . . . . 17 101 2.1.6. Multicast RPF Address Extended Community . . . . . . 18 102 2.1.7. Topology/IPA Extended Community . . . . . . . . . . . 18 103 2.2. Procedures . . . . . . . . . . . . . . . . . . . . . . . 18 104 2.2.1. Source Discovery for ASM . . . . . . . . . . . . . . 18 105 2.2.2. Originating Tree Join Routes . . . . . . . . . . . . 19 106 2.2.2.1. (x,g) Multicast Tree . . . . . . . . . . . . . . 19 107 2.2.2.2. BGP Inband Signaling for mLDP Tunnel . . . . . . 20 108 2.2.3. Receiving Tree Join Routes . . . . . . . . . . . . . 20 109 2.2.4. Withdrawl of Tree Join Routes . . . . . . . . . . . . 20 110 2.2.5. LAN procedures for (x,g) Unidirectional Tree . . . . 21 111 2.2.5.1. Originating S-PMSI A-D Routes . . . . . . . . . . 21 112 2.2.5.2. Receiving S-PMSI A-D Routes . . . . . . . . . . . 21 113 2.2.6. Distributing Label for Upstream Traffic for 114 Bidirectional Tree/Tunnel . . . . . . . . . . . . . . 22 115 3. IANA Considerations . . . . . . . . . . . . . . . . . 
. . . . 23 116 4. Security Considerations . . . . . . . . . . . . . . . . . . . 23 117 5. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 24 118 6. References . . . . . . . . . . . . . . . . . . . . . . . . . 24 119 6.1. Normative References . . . . . . . . . . . . . . . . . . 24 120 6.2. Informative References . . . . . . . . . . . . . . . . . 25 121 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 27 123 1. Introduction 125 1.1. Motivation 127 This section provides some motivation for BGP signaling for native 128 and labeled multicast. One target deployment would be a Data Center 129 that requires multicast but uses BGP as its only routing protocol 130 [RFC7938]. In such a deployment, it would be desirable to support 131 multicast by extending the deployed routing protocol, without 132 requiring the deployment of tree building protocols such as PIM, 133 mLDP, RSVP-TE P2MP, and without requiring an IGP. 135 Additionally, compared to PIM, BGP based signaling has several 136 advantages as described in the following section, and may be desired 137 in non-DC deployment scenarios as well. 139 1.1.1. Native/unlabeled Multicast 141 Protocol Independent Multicast (PIM) has been the prevailing 142 multicast protocol for many years. Despite its success, it has two 143 drawbacks: 145 o The ASM model, which is prevalent, introduces complexity in the 146 following areas: source discovery procedures, need for Rendezvous 147 Points (RPs) and group-to-RP mappings, need to switch between RP- 148 rooted trees and source-rooted trees, etc. 150 o Periodic protocol state refreshes due to its soft state nature. 152 PIM-SSM removes much of the complexity of PIM-ASM by moving source 153 discovery to the application layer. However, for various reasons, 154 many legacy applications and devices still rely upon network-based 155 source discovery. 
PIM-PORT (PIM over Reliable Transport) solves the 156 soft state issue, though its deployment has also been limited for two 157 reasons: 159 o It does not remove the ASM complexities. 161 o In many of the scenarios where reliable transport is deemed 162 important, BGP-based multicast (e.g. BGP-MVPN) has been used 163 instead of PORT. 165 Partly because of the above-mentioned problems, some Data Center 166 operators have been avoiding deploying multicast in their networks. 168 BGP-MVPN [RFC6514] uses BGP to signal VPN customer multicast state 169 over provider networks. It removes the above-mentioned problems from 170 the SP environment, and the deployment experiences have been 171 encouraging. While RFC 6514 makes it possible for an SP to provide 172 MVPN service without running PIM on its backbone, that RFC still 173 assumes that PIM (or mLDP) runs on the PE-CE links. [draft-ietf-bess- 174 mvpn-pe-ce] adapts the concept of BGP-MVPN to PE-CE links so that the 175 use of PIM on the PE-CE links can be eliminated (though the PIM-ASM 176 complexities still remain in the customer network), and this 177 document extends it further to general topologies, so that the BGP procedures can 178 be run on any router, as a replacement for PIM or mLDP. 180 With that, PIM can be completely eliminated from the network. PIM 181 soft state is replaced by BGP hard state. For ASM, source specific 182 trees are set up directly after simpler source discovery (data driven 183 on FHRs and control driven elsewhere), all based on BGP. All the 184 complexities related to source discovery and shared/source tree 185 switch are also eliminated. Additionally, the trees can be set up 186 with MPLS labels, with just minor enhancements in the signaling. 188 1.1.2. Labeled Multicast 190 There could be two forms of labeled multicast signaled by BGP. The 191 first one is labeled (x,g) multicast where 'x' stands for either 's' 192 or '*'. 
Basically, it is the BGP-signaled multicast tree 193 described in the previous section, but with labels. The second one is for 194 mLDP tunnels with BGP signaling in part or whole through a BGP 195 domain. 197 For both cases, BGP is used because other label distribution 198 mechanisms like mLDP may not be desired by some operators. For 199 example, a DC operator may prefer to have a BGP-only deployment. 201 1.2. Overview 203 1.2.1. (x,g) Multicast 205 PIM-like functionality is provided, using BGP-based join/prune 206 signaling and BGP-based source discovery for ASM. The BGP-based join 207 signaling supports both labeled multicast and IP multicast. 209 The same RPF procedures as in PIM are used for each router to 210 determine the RPF neighbor for a particular source or RPA (in case of 211 Bidirectional Tree). Except in the Bidirectional Tree case and a 212 special case described in Section 1.2.1.2, no (*,G) join is used - 213 LHR routers discover the sources for ASM and then join towards the 214 sources directly. Data driven mechanisms like PIM Assert are replaced 215 by control driven mechanisms (Section 1.2.4). 217 The joins are carried in BGP Updates with MCAST-TREE SAFI and S-PMSI/ 218 Leaf A-D routes defined in this document. The updates are targeted 219 at the upstream neighbor by use of Route Targets. There are three 220 benefits of using S-PMSI/Leaf routes for this purpose: a) when the 221 routes go through RRs, we have to distinguish different routes based 222 on upstream router and downstream router. This leads to Leaf routes. 223 b) for labeled bidirectional trees, we need to signal "upstream fec". 224 S-PMSI suits this very well. c) we may want to allow the option of 225 setting up trees or parts of a tree from the root/upstream towards 226 leaves/downstream and S-PMSI suits that very well. 228 If the BGP updates carry labels (via Tunnel Encapsulation Attribute 229 [I-D.ietf-idr-tunnel-encaps]), then (s,g) multicast traffic can use 230 the labels. 
This is very similar to mLDP Inband Signaling [RFC6826], 231 except that there are no corresponding "mLDP tunnels" for the PIM 232 trees. Similar to mLDP, labeled traffic on transit LANs is point to 233 point. Of course, traffic sent to receivers on a LAN by a LHR is 234 native multicast. 236 For labeled bidirectional (*,g) trees, downstream traffic (away from 237 the RPA) can be forwarded as in the (s,g) case. For upstream traffic 238 (towards RPA), the upstream neighbor needs to advertise a label for 239 its downstream neighbors. The label that the upstream neighbor 240 advertises to its upstream is the same one that it advertises to its 241 downstreams, using an S-PMSI A-D route. 243 1.2.1.1. Source Discovery for ASM 245 This document does not support ASM via shared trees (aka RP Tree, or 246 RPT) with one exception discussed in the next section. Instead, 247 FHRs, LHRs, and optionally RRs work together to propagate/discover 248 source information via control plane and LHRs join source specific 249 Shortest Path Trees (SPT) directly. 251 An FHR originates Source Active A-D routes upon discovering sources 252 for particular flows and advertises them to its peers. It is desired 253 that the SA routes only reach LHRs that are interested in receiving 254 the traffic. To achieve that, the SA routes carry an IPv4 or IPv6 255 address specific Route Target. The Global Administrator field is set 256 to the group address of the flow, and the Local Administrator field is 257 set to 0 or a pre-assigned domain-wide unique value that identifies a 258 VPN. An LHR advertises Route Target Membership routes, with the 259 Route Target field in the NLRI set according to the groups it wants 260 to receive traffic for, in the same way that an FHR encodes the Route Target in its 261 Source Active routes. The propagation of the SA routes is subject to 262 cooperative export filtering as specified in [RFC4684] and referred 263 to as RTC mechanism in this document. 
That way, the LHR only 264 receives Source Active routes for groups that it is interested in. 266 Typically, a set of RRs is used and they maintain all Source Active 267 routes but only distribute to interested LHRs on demand (upon 268 receiving corresponding Route Target Membership routes, which are 269 triggered on LHRs when they receive IGMP/MLD membership reports). The 270 rest of the document assumes that RRs are used, even though that is 271 not required. 273 1.2.1.2. ASM Shared-tree-only Mode 275 It may be desired that only a shared tree is used to distribute all 276 traffic for a particular ASM group from its RP to all LHRs, as 277 described in Section 4.1 "PIM Shared Tree Forwarding" of [RFC7438]. 278 This will significantly cut down the number of trees and works out 279 very well in certain deployment scenarios. For example, all the 280 sources could be connected to the RP, or clustered close to RP. In 281 the latter case, either the paths from FHRs to the RP do not intersect 282 the shared tree so native forwarding can be used between the FHRs and 283 the RP, or other means outside of this document could be used to 284 forward traffic from FHRs to the RP. 286 For native forwarding from FHRs to the RP, SA routes may be used to 287 announce the sources so that the RP can join source specific trees to 288 pull traffic, but the group specific Route Target is not needed. The 289 LHRs do not advertise the group specific Route Target Membership 290 routes as they do not need the SA routes. 292 To establish the shared tree, (*,g) Leaf A-D routes are used as in 293 the bidirectional tree case, though no forwarding state is 294 established to forward traffic from downstream neighbors. 296 1.2.1.3. Integration with BGP-MVPN 298 For each VPN, the Source Active route distribution in that VPN does 299 not have to involve PEs at all unless there are sources/receivers 300 directly connected to some PEs and they are independent of MVPN SA 301 routes. 
For example, FHRs and LHRs establish BGP sessions with RRs 302 of that particular VPN for the purpose of SA distribution. 304 After source discovery, BGP multicast signaling is done from LHRs 305 towards the sources. When the signaling reaches an egress PE, BGP- 306 MVPN signaling takes over, as if a PIM (s,g) join/prune was received 307 on the PE-CE interface. When the BGP-MVPN signaling reaches the 308 ingress PE, BGP multicast signaling as specified in this document 309 takes over, similar to how BGP-MVPN triggers PIM (s,g) join/prune on 310 PE-CE interfaces. 312 1.2.2. BGP Inband Signaling for mLDP Tunnel 314 Part of an (or the whole) mLDP tunnel can also be signaled via BGP 315 and seamlessly integrated with the rest of mLDP tunnel signaled 316 natively via mLDP. All the procedures are similar to mLDP except 317 that the signaling is done via BGP. The mLDP FEC is encoded as the 318 BGP NLRI, with MCAST-TREE SAFI and S-PMSI/Leaf A-D Routes for 319 C-multicast mLDP defined in this document. The Leaf A-D routes 320 correspond to mLDP Label Mapping messages, and the S-PMSI A-D routes 321 are used to signal upstream FEC for MP2MP mLDP tunnels, similar to 322 the bidirectional (*,g) case. 324 1.2.3. BGP Sessions 326 In order for two BGP speakers to exchange MCAST-TREE NLRI, they must 327 use BGP Capabilities Advertisement [RFC5492] to ensure that they both 328 are capable of properly processing the MCAST-TREE NLRI. This is done 329 as specified in [RFC4760], by using a capability code 1 330 (multiprotocol BGP) with an AFI of IPv4 (1) or IPv6 (2) and a SAFI of 331 MCAST-TREE with a value to be assigned by IANA. 333 How the BGP peer sessions are provisioned, whether EBGP or IBGP, 334 whether statically, automatically (e.g., based on IGP neighbor 335 discovery), or programmably via an external controller, is outside 336 the scope of this document. 
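The capability exchange described earlier in this section can be illustrated with a minimal sketch. It assumes the Multiprotocol Extensions capability layout from [RFC4760] (capability code 1, length 4, then a 2-octet AFI, a reserved octet, and a 1-octet SAFI). Because the MCAST-TREE SAFI value is still to be assigned by IANA, `PLACEHOLDER_MCAST_TREE_SAFI` is a purely hypothetical value, and all function names are illustrative rather than part of this specification.

```python
import struct

MP_EXT_CAPABILITY_CODE = 1  # multiprotocol BGP capability, per the text
AFI_IPV4 = 1
AFI_IPV6 = 2

# ASSUMPTION: the real MCAST-TREE SAFI is to be assigned by IANA.
# This number is an arbitrary placeholder for illustration only.
PLACEHOLDER_MCAST_TREE_SAFI = 241


def encode_mp_capability(afi: int, safi: int) -> bytes:
    """Encode code (1), length (1), AFI (2), reserved (1), SAFI (1)."""
    return struct.pack("!BBHBB", MP_EXT_CAPABILITY_CODE, 4, afi, 0, safi)


def decode_mp_capabilities(data: bytes) -> set:
    """Return the (afi, safi) pairs found in a run of capability TLVs."""
    pairs = set()
    offset = 0
    while offset + 2 <= len(data):
        code, length = data[offset], data[offset + 1]
        value = data[offset + 2 : offset + 2 + length]
        if len(value) < length:
            raise ValueError("truncated capability")
        if code == MP_EXT_CAPABILITY_CODE and length == 4:
            afi, _reserved, safi = struct.unpack("!HBB", value)
            pairs.add((afi, safi))
        offset += 2 + length
    return pairs


def can_exchange(local: bytes, peer: bytes, afi: int, safi: int) -> bool:
    """True only if both speakers advertised the given AFI/SAFI."""
    common = decode_mp_capabilities(local) & decode_mp_capabilities(peer)
    return (afi, safi) in common
```

In this sketch, two speakers would exchange MCAST-TREE NLRI for an address family only when `can_exchange` returns true for that AFI and the MCAST-TREE SAFI.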
338 In case of IBGP, every router could peer with Route 339 Reflectors, or hop by hop IBGP sessions could be used to exchange 340 MCAST-TREE NLRIs for joins. In the latter case, unless desired 341 otherwise for reasons outside of the scope of this document, the hop 342 by hop IBGP sessions SHOULD only be used to exchange MCAST-TREE 343 NLRIs. 345 When multihop BGP is used, a router advertises its local interface 346 addresses, for the same purposes that the Address List TLV in LDP 347 serves. This is achieved by advertising the interface addresses as 348 host prefixes with IPv4/v6 Address Specific ECs corresponding to the 349 router's local addresses used for its BGP sessions (Section 2.1.5). 351 Because the BGP Capability Advertisement is only between two peers, 352 when the sessions are only via RRs, a router needs another way to 353 determine if its neighbor is capable of signaling multicast via BGP. 354 The interface address advertisement can be used for that purpose - 355 the inclusion of a Session Address EC indicates that the BGP speaker 356 identified in the EC supports the C-Multicast NLRI. 358 FHRs and LHRs may also establish BGP sessions to some Route 359 Reflectors for source discovery purposes (Section 1.2.1.1). 361 With traditional PIM, the FHRs and LHRs refer to the PIM DRs on 362 the source or receiver networks. With BGP based multicast, PIM may 363 not be running at all, and the FHRs and LHRs refer to the IGMP/MLD 364 queriers, or the DF elected per [I-D.wijnands-bier-mld-lan-election]. 365 Alternatively, if it is known that a network only has senders then no 366 IGMP/MLD or DF election is needed - any router may generate SA 367 routes. That will not cause any issue other than redundant SA routes 368 being originated. 370 1.2.4. LAN and Parallel Links 372 There could be parallel links between two BGP peers. A single multi- 373 hop session, whether IBGP or EBGP, between loopback addresses may be 374 used. 
Except for LAN interfaces in case of unlabeled (x,g) 375 unidirectional trees (note that transit LAN interface is not 376 supported for BGP signaled (*,g) bidirectional tree and for mLDP 377 tunnels, traffic on transit LAN is point to point between neighbors), 378 any link between the two peers can be automatically used by a 379 downstream peer to receive traffic from the upstream peer, and it is 380 for the upstream peer to decide which link to use. If one of the 381 links goes down, the upstream peer switches to a different link and 382 there is no change needed on the downstream peer. 384 For unlabeled (x,g) unidirectional trees, the upstream peer MAY 385 prefer LAN interfaces to send traffic, since multiple downstream 386 peers may be reached simultaneously, or it may make a decision based 387 on local policy, e.g., for load balancing purposes. Because different 388 downstream peers might choose different upstream peers for RPF, when 389 an upstream peer decides to use a LAN interface to send traffic, it 390 originates an S-PMSI A-D route indicating that one or more LAN 391 interfaces will be used. The route carries Route Targets specific to 392 the LANs so that all the peers on the LANs import the route. If more 393 than one router originates the route specifying the same LAN for the 394 same (s,g) or (*,g) flow, then an assert procedure based on the S-PMSI 395 A-D routes takes place and assert losers will stop sending traffic to the 396 LAN. 398 1.2.5. Transition 400 A network currently running PIM can be incrementally transitioned to 401 BGP based multicast. At any time, a router supporting BGP based 402 multicast can use PIM with some neighbors (upstream or downstream) 403 and BGP with some other neighbors. PIM and BGP MUST NOT be used 404 simultaneously between two neighbors for multicast purposes, and 405 routers connected to the same LAN MUST be transitioned during the 406 same maintenance window. 
408 In case of PIM-SSM, any router can be transitioned at any time 409 (except on a LAN). It may receive source tree joins from a mixed set 410 of BGP and PIM downstream neighbors and send source tree joins to its 411 upstream neighbor using either PIM or BGP signaling. 413 In case of PIM-ASM, the RPs are first upgraded to support BGP based 414 multicast. They learn sources either via PIM procedures from PIM 415 FHRs, or via Source Active A-D routes from BGP FHRs. In the former 416 case, the RPs can originate proxy Source Active A-D routes. There 417 may be a mixed set of RPs/RRs - some also capable of traditional PIM 418 RP functionalities, while others only redistribute SA routes. 420 Then any router can be transitioned incrementally. A transitioned 421 LHR router will pull Source Active A-D routes from the RPs/RRs when 422 it receives IGMP/MLD (*,G) joins for ASM groups, and may send either 423 PIM (s,g) joins or BGP Source Tree Join routes. A transitioned 424 transit router may receive (*,g) PIM joins but only send source tree 425 joins after pulling Source Active A-D routes from RPs/RRs. 427 Similarly, a network currently running mLDP can be incrementally 428 transitioned to BGP signaling. Without the complication of ASM, any 429 router can be transitioned at any time, even without the restriction 430 of coordinated transition on a LAN. It may receive mixed mLDP label 431 mapping or BGP updates from different downstream neighbors, and may 432 exchange either mLDP label mapping or BGP updates with its upstream 433 neighbors, depending on whether the neighbor is using BGP based signaling 434 or not. 436 1.2.6. Inter-region Multicast 438 An end-to-end multicast tree or P2MP tunnel may span multiple 439 regions, where a region could be an IGP area (or even a sub-area) or 440 an Autonomous System (AS), and different multicast signaling could be 441 used in different regions. There are several situations to consider. 443 1.2.6.1. 
Inband Signaling across a Region 445 With inband signaling, the multicast tree/tunnel is signaled through 446 a region and internal routers in the region maintain corresponding 447 per-tree/tunnel state. A downstream region and an upstream region 448 may use the same or different signaling. For example, an (*/s, g) IP 449 multicast tree with BGP signaling in a downstream region can be 450 signaled with mLDP Inband Signaling [RFC6826] or with PIM across the 451 upstream region, and a p2mp tunnel with BGP signaling in the 452 downstream region can be signaled with mLDP across the upstream 453 region, or vice versa. An RBR will stitch the upstream portion (e.g., 454 PIM/mLDP-signaled) to the downstream portion (e.g., BGP-signaled). 456 If all routers in the region have a route towards the source/root of 457 the tree/tunnel then there is nothing different from the intra-region 458 case. On the other hand, if internal routers do not have a route 459 towards the source/root, e.g. as with Seamless MPLS 460 [I-D.ietf-mpls-seamless-mpls] or Seamless SR 461 [I-D.hegde-spring-mpls-seamless-sr], the internal routers need to do 462 RPF towards an upstream Regional Border Router (RBR). To signal the 463 RBR information to an internal upstream router, one of the following 464 ways is used depending on the signaling method: 466 o With BGP signaling, the Leaf A-D Route carries a new BGP Extended 467 Community referred to as Multicast RPF Address EC, similar to PIM 468 RPF Vector [RFC5496] and mLDP Recursive FEC [RFC6512]. 470 o With PIM signaling, PIM RPF Vector is used. 472 o With mLDP signaling, mLDP Recursive FEC is used. 474 1.2.6.2. Overlay Signaling Over a Region 476 With overlay signaling, a downstream RBR signals to its upstream RBR 477 over the region and the internal routers do not maintain the state of 478 the (overlay) tree/tunnel. 
This can be done with one of the 479 following methods: 481 o mLDP P2MP tunnels can be signaled over the region via targeted LDP 482 sessions [RFC7060]. 484 o Both IP multicast tree and mLDP P2MP tunnels can be signaled over 485 a region via BGP-MVPN procedures [RFC6514]. 487 o Both IP multicast tree and mLDP P2MP tunnels can be signaled over 488 a region via BGP as discussed in the rest of this section. 490 All three methods are actually very similar in concept. The upstream 491 RBR tunnels packets to the downstream RBR, just as in the intra- 492 region case when two routers on the tree/tunnel are not directly 493 connected. The rest of this section only discusses BGP signaling. 495 When a downstream RBR determines that the route towards the source/ 496 root has a BGP Next Hop towards a BGP speaker capable of multicast 497 signaling via BGP as specified in this document, it signals to that 498 BGP speaker (via a RR or not). 500 Suppose an upstream RBR receives the signaling for the same tree/ 501 tunnel from several downstream RBRs. It could use Ingress 502 Replication to replicate packets directly to those downstream RBRs, 503 or it could use underlay P2MP tunnels instead. 505 In the latter case, the upstream RBR advertises an S-PMSI A-D route 506 with a Provider Tunnel Attribute (PTA) specifying the underlay 507 tunnel. This is very much like the "mLDP Over Targeted Sessions" 508 [RFC7060] or BGP-MVPN [RFC6514] (though MCAST-VPN's C-Multicast 509 routes are replaced with MCAST-TREE's Leaf A-D routes). If the 510 mapping between overlay tree/tunnel and underlay tunnel is one-to- 511 one, the MPLS Label field in the PTA is set to 0; otherwise it is set to 512 a Domain-wide Common Block (DCB) label [I-D.ietf-bess-mvpn-evpn- 513 aggregation-label] or an upstream-assigned label corresponding to the 514 overlay tree/tunnel. 
516 The underlay tunnel, whether P2P to individual downstream RBRs or 517 P2MP to the set of downstream RBRs, can be of any type including 518 Segment Routing (SR) [RFC8402] policies [I-D.ietf-spring-segment- 519 routing-policy] [I-D.ietf-pim-sr-p2mp-policy]. 521 1.2.6.3. Controller Based Signaling 523 [I-D.ietf-bess-bgp-multicast-controller] specifies the procedures for 524 a controller to signal multicast forwarding state to each router on a 525 multicast tree based on the controller's computation. Depending on 526 deployment scenarios, in inter-region cases it is possible that the 527 hop-by-hop signaling specified in this document and the controller 528 based signaling may be used in different regions. 530 Consider a situation where an ABR is connected to three regions A, B, 531 and C, where hop-by-hop signaling is used in A and B, while 532 controller based signaling is used in C. 534 For a particular multicast tree, A is the upstream region, while B 535 and C are two downstream regions. The ABR receives a Leaf A-D route 536 from region B and a Leaf A-D route from C's controller, and sends a 537 Leaf A-D route to its upstream router in A. 539 For a different tree, C is the upstream region while A and B are 540 downstream. The ABR receives two Leaf A-D routes for the tree from 541 regions A and B, and one Leaf A-D route from C's controller. Note 542 that the ABR needs to signal to the controller that it is a leaf of 543 the tree (because of the Leaf A-D routes received from regions A and 544 B). 546 For both cases, the ABR stitches together different segments in 547 different regions by creating forwarding state based on the Leaf A-D 548 routes (optionally based on the S-PMSI A-D routes in regions A and B 549 in addition). 551 1.2.7. BGP Classful Transport Planes 553 [I-D.kaliraj-idr-bgp-classful-transport-planes] specifies a framework 554 for classifying underlay routes into transport classes and mapping 555 service routes to specific transport classes. 
An underlay route signaled with the BGP-CT SAFI carries a Transport
Class Route Target (TC-RT), both to indicate the transport class that
the route belongs to and to control the propagation and importation
of the underlay route. The recipient of an underlay route uses the
TC-RT to determine how the Protocol NH (PNH) is resolved. A service/
overlay route may carry a mapping community that maps to a transport
class that is used to resolve the service route's PNH.

In case of multicast, the selection of the link/tunnel between an
upstream and a downstream tree node may be subject to the transport
class that the tree is for (in case of an underlay tree) or the class
of transport that the tree should use (in case of an overlay tree).
In both the underlay and overlay cases, the transport class is
indicated by a mapping community attached to the BGP multicast
routes, which could be a color community or any community intended
for mapping to the transport.

The mapping community not only affects an upstream node's selection
of link/tunnel to a downstream node, but may also affect a downstream
node's selection of its upstream node (i.e., the RPF procedure).

1.2.8.  Flexible Algorithm and Multi-topology

Similar to classful transport, in case of multi-topology [RFC4915]
[RFC5120] or Flexible Algorithm [I-D.ietf-lsr-flex-algo], a multicast
tree may be required to do RPF based on a particular topology or
Flexible Algorithm (IPA). To signal that, the BGP-MCAST Leaf A-D
route may carry an extended community to encode the topology and/or
IPA. Note that this could also be an operator defined mapping
community that maps to a transport class (that is associated with a
topology or a Flexible Algorithm).

In the overall inter-region scenario, if mLDP is to be used with
Flexible Algorithm or Multi-topology for signaling in a particular
region, [I-D.wijnands-mpls-mldp-multi-topology] specifies how the
topology and/or IPA are encoded.

Similarly, in case of PIM, [RFC6420] specifies how topology
information is encoded in PIM signaling, and a similar mechanism can
be specified for Flexible Algorithm. However, that, and potentially
encoding the transport class in PIM/mLDP, are outside the scope of
this document.

2.  Specification

2.1.  BGP NLRIs and Attributes

The BGP Multiprotocol Extensions [RFC4760] allow BGP to carry routes
from multiple different "AFI/SAFIs". This document defines a new
SAFI known as the MCAST-TREE SAFI, with a value to be assigned by
IANA. This SAFI is used along with the AFI of IPv4 (1) or IPv6 (2).

The MCAST-TREE NLRI defined below is carried in BGP UPDATE messages
[RFC4271] using the BGP Multiprotocol Extensions [RFC4760], with an
AFI of IPv4 (1) or IPv6 (2) assigned by IANA and the MCAST-TREE SAFI
with a value to be assigned by IANA.

The Next Hop field of the MP_REACH_NLRI attribute SHALL be
interpreted as an IPv4 address whenever the length of the Next Hop
address is 4 octets, and as an IPv6 address whenever the length of
the Next Hop address is 16 octets.

The NLRI field in the MP_REACH_NLRI and MP_UNREACH_NLRI is a prefix
with a maximum length of 12 octets for IPv4 AFI and 36 octets for
IPv6 AFI. The following is the format of the MCAST-TREE NLRI:

   +-----------------------------------+
   |      Route Type (1 octet)         |
   +-----------------------------------+
   |        Length (1 octet)           |
   +-----------------------------------+
   |  Route Type specific (variable)   |
   +-----------------------------------+

The Route Type field defines the encoding of the rest of the
MCAST-TREE NLRI (the Route Type specific MCAST-TREE NLRI).

The Length field indicates the length in octets of the Route Type
specific field of the MCAST-TREE NLRI.

The following new route types are defined:

      3    - S-PMSI A-D Route for (x,g)
      4    - Leaf A-D Route
      5    - Source Active A-D Route
      0x43 - S-PMSI A-D Route for C-multicast mLDP

Except for the Source Active A-D routes, the routes are to be
consumed by targeted upstream/downstream neighbors, and are not
propagated further. This can be achieved by outbound filtering based
on the RTs that lead to the importation of the routes.

The Type-3/4 routes MAY carry a Tunnel Encapsulation Attribute (TEA)
[I-D.ietf-idr-tunnel-encaps]. The Type-0x43 route MUST carry a TEA.
When used for mLDP, the Type-4 route MUST carry a TEA. Only the MPLS
tunnel type for the TEA is considered. Others are outside the scope
of this document.

2.1.1.  S-PMSI A-D Route

Similar to what is defined in RFC 6514, an S-PMSI A-D Route Type
specific MCAST-TREE NLRI consists of the following:

   +-----------------------------------+
   |          RD (8 octets)            |
   +-----------------------------------+
   | Multicast Source Length (1 octet) |
   +-----------------------------------+
   |    Multicast Source (variable)    |
   +-----------------------------------+
   | Multicast Group Length (1 octet)  |
   +-----------------------------------+
   |    Multicast Group (variable)     |
   +-----------------------------------+
   |   Upstream Router's IP Address    |
   +-----------------------------------+

If the Multicast Source (or Group) field contains an IPv4 address,
then the value of the Multicast Source (or Group) Length field is 32.
If the Multicast Source (or Group) field contains an IPv6 address,
then the value of the Multicast Source (or Group) Length field is
128.

Usage of other values of the Multicast Source Length and Multicast
Group Length fields is outside the scope of this document.
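
The S-PMSI A-D NLRI layout described above can be sketched in Python
as follows. This is only an illustration under stated assumptions,
not part of the specification: the function name is hypothetical, the
Upstream Router's IP Address is assumed to be encoded without a
length octet (as drawn in the figure), and only the 32-bit and
128-bit length values that are in scope are produced.

```python
import ipaddress
import struct

S_PMSI_AD = 3  # Route Type "S-PMSI A-D Route for (x,g)" from the list above


def encode_s_pmsi_nlri(rd: bytes, source: str, group: str, upstream: str) -> bytes:
    """Build Route Type + Length + route-type-specific fields (hypothetical helper)."""
    if len(rd) != 8:
        raise ValueError("RD must be exactly 8 octets")
    src = ipaddress.ip_address(source)
    grp = ipaddress.ip_address(group)
    up = ipaddress.ip_address(upstream)
    if src.version != grp.version:
        raise ValueError("source and group must belong to the same address family")
    body = (
        rd
        + bytes([src.max_prefixlen]) + src.packed  # length in bits: 32 or 128
        + bytes([grp.max_prefixlen]) + grp.packed
        + up.packed                                # assumed: no length octet
    )
    # The Length field counts only the route-type-specific octets.
    return struct.pack("!BB", S_PMSI_AD, len(body)) + body
```

With an all-zero RD and IPv4 addresses (both chosen only for the
example), the route-type-specific part is 8 + 1 + 4 + 1 + 4 + 4 = 22
octets, so the Length octet is 22 and the whole NLRI is 24 octets.
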

There are two usages for the S-PMSI A-D route. They are described in
Section 2.2.5 and Section 2.2.6 respectively.

2.1.2.  Leaf A-D Route

Similar to the Leaf A-D route in [RFC6514], a MCAST-TREE Leaf A-D
route's route key includes the corresponding x-PMSI NLRI, plus the
Originating Router's IP Address.

   +-----------------------------------+
   |            x-PMSI NLRI            |
   +-----------------------------------+
   |  Originating Router's IP Address  |
   +-----------------------------------+

For example, the entire NLRI of a Leaf A-D route for an (x,g) tree is
as follows:

      +-       +-----------------------------------+
      |        |     Route Type - 4 (Leaf A-D)     |
      |        +-----------------------------------+
      |        |         Length (1 octet)          |
      |     +- +-----------------------------------+ --+
      |     |  |    Route Type - 3 (S-PMSI A-D)    |   |
   L  |  L  |  +-----------------------------------+   | S
   E  |  E  |  |         Length (1 octet)          |   | |
   A  |  A  |  +-----------------------------------+   | P
   F  |  F  |  |           RD (8 octets)           |   | M
      |     |  +-----------------------------------+   | S
      |     |  | Multicast Source Length (1 octet) |   | I
      |     |  +-----------------------------------+   |
   N  |  R  |  |    Multicast Source (variable)    |   |
   L  |  O  |  +-----------------------------------+   |
   R  |  U  |  | Multicast Group Length (1 octet)  |   | N
   I  |  T  |  +-----------------------------------+   | L
      |  E  |  |    Multicast Group (variable)     |   | R
      |     |  +-----------------------------------+   | I
      |  K  |  |   Upstream Router's IP Address    |   |
      |  E  |  +-----------------------------------+ --+
      |  Y  |  |  Originating Router's IP Address  |
      +-    +- +-----------------------------------+

Even though the MCAST-TREE Leaf A-D route is unsolicited, unlike the
Leaf A-D route for GTM in [RFC7524], it is encoded as if a
corresponding S-PMSI A-D route had been received.
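
The nesting shown in the figure can be sketched in Python as follows.
This is a hedged illustration rather than a reference implementation:
the helper names are hypothetical, and the decoder assumes the
embedded x-PMSI NLRI carries its own Route Type and Length octets, as
drawn above, so that the Originating Router's IP Address is whatever
follows it.

```python
import ipaddress
import struct

LEAF_AD = 4  # Route Type "Leaf A-D Route"


def encode_leaf_ad_nlri(x_pmsi_nlri: bytes, originator: str) -> bytes:
    """Wrap a complete x-PMSI NLRI and append the originator (hypothetical helper)."""
    route_key = x_pmsi_nlri + ipaddress.ip_address(originator).packed
    if len(route_key) > 255:
        raise ValueError("route key does not fit the 1-octet Length field")
    return struct.pack("!BB", LEAF_AD, len(route_key)) + route_key


def decode_leaf_ad_nlri(nlri: bytes):
    """Return (embedded x-PMSI NLRI, Originating Router's IP Address)."""
    route_type, length = nlri[0], nlri[1]
    route_key = nlri[2:]
    if route_type != LEAF_AD or length != len(route_key):
        raise ValueError("not a well-formed Leaf A-D NLRI")
    # The embedded NLRI is self-delimiting through its own Length octet.
    pmsi_end = 2 + route_key[1]
    originator = ipaddress.ip_address(route_key[pmsi_end:])
    return route_key[:pmsi_end], originator
```

Because the embedded S-PMSI NLRI is self-delimiting, the decoder does
not need to be told the address family of the originator; a 4-octet
remainder is read as IPv4 and a 16-octet remainder as IPv6.
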

When used for signaling mLDP tunnels, even though the Leaf A-D route
is unsolicited, unlike the "Route-type 0x44 Leaf A-D route for
C-multicast mLDP" in [RFC7441], it is Route-type 4 and encoded as if
a corresponding S-PMSI A-D route had been received.

2.1.3.  Source Active A-D Route

Similar to what is defined in RFC 6514, a Source Active A-D Route
Type specific MCAST-TREE NLRI consists of the following:

   +-----------------------------------+
   |          RD (8 octets)            |
   +-----------------------------------+
   | Multicast Source Length (1 octet) |
   +-----------------------------------+
   |    Multicast Source (variable)    |
   +-----------------------------------+
   | Multicast Group Length (1 octet)  |
   +-----------------------------------+
   |    Multicast Group (variable)     |
   +-----------------------------------+

The definitions of the source/length and group/length fields are the
same as in the S-PMSI A-D routes.

Usage of Source Active A-D routes is described in Section 1.2.1.1.

2.1.4.  S-PMSI A-D Route for C-multicast mLDP

The route is used to signal the upstream FEC for an MP2MP mLDP
tunnel. The route key includes the mLDP FEC and the Upstream
Router's IP Address field. The encoding is similar to that of the
same route in [RFC7441].

2.1.5.  Session Address Extended Community

For two BGP speakers to determine whether they are directly
connected, each advertises its local interface addresses with a
Session Address Extended Community. This is an IPv4/IPv6 Address
Specific EC with the Global Admin Field set to the local address used
for its multihop sessions and the Local Admin Field set to the prefix
length corresponding to the interface's network mask.
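
The construction just described can be sketched in Python as follows.
This is an illustration only: the function name and the textual
"<session address>:<prefix length>" rendering are assumptions made
for readability (the on-the-wire EC uses a sub-type still to be
assigned by IANA), and the addresses in the usage note are the ones
used in this section's own example.

```python
import ipaddress


def session_address_ecs(interfaces, session_addresses):
    """Map each advertised host prefix to its Session Address EC values.

    interfaces        -- local interface addresses in CIDR form
    session_addresses -- local addresses used for multihop BGP sessions
    (hypothetical helper; the EC is rendered as text, not encoded)
    """
    adverts = {}
    for cidr in interfaces:
        iface = ipaddress.ip_interface(cidr)
        # The interface address itself is advertised as a host route.
        host_prefix = f"{iface.ip}/{iface.max_prefixlen}"
        # Global Admin = session address, Local Admin = interface prefix length.
        adverts[host_prefix] = [
            f"{ipaddress.ip_address(addr)}:{iface.network.prefixlen}"
            for addr in session_addresses
        ]
    return adverts
```

Calling session_address_ecs(["10.10.10.1/24", "10.12.0.1/16"],
["11.11.11.1"]) maps 10.10.10.1/32 to 11.11.11.1:24 and 10.12.0.1/32
to 11.11.11.1:16. Passing a second session address adds one more EC
to each advertised host route.
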

For example, if a router has two interfaces with addresses
10.10.10.1/24 and 10.12.0.1/16 respectively (notice the different
network masks), and a loopback address 11.11.11.1/32 that is used for
BGP sessions, then it will advertise prefix 10.10.10.1/32 with a
Session Address EC 11.11.11.1:24 and 10.12.0.1/32 with a Session
Address EC 11.11.11.1:16. If it also uses another loopback address
11.11.11.11/32 for other BGP sessions, then the routes will
additionally carry Session Address ECs 11.11.11.11:24 and
11.11.11.11:16 respectively.

This achieves what the Address List TLV in LDP Address Messages
achieves, and can also be used to indicate that a router supports the
BGP multicast signaling procedures specified in this document.

Only those interface addresses that will be used as resolved nexthops
in the RIB need to be advertised with the Session Address EC. For
example, if the RPF lookup yields the resolved nexthop address A1,
the router needs to find the BGP speaker that owns address A1 through
the (interface address, session address) mapping built from the
interface address NLRIs carrying the Session Address EC. In LDP, the
equivalent (interface address, session address) mapping is built from
the LDP Address Messages.

2.1.6.  Multicast RPF Address Extended Community

This is an IPv4 or IPv6 Address Specific EC with the Global Admin
Field set to the address of the upstream RBR and the Local Admin
Field set to 0.

2.1.7.  Topology/IPA Extended Community

This is a Transitive Opaque Extended Community with the following
format:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     0x03      |   Sub-Type    |           Reserved            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              IPA              |             MT-ID             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

IPA is the Flexible Algorithm number and MT-ID is the Multi-Topology
Identifier to be used for setting up a multicast tree.

2.2.  Procedures

2.2.1.  Source Discovery for ASM

When an FHR first receives a multicast packet addressed to an ASM
group, it originates a Source Active route. The route carries an
IPv4/IPv6 Address Specific RT, with the Global Admin Field set to the
group address and the Local Admin Field set to 0. The route is
advertised to the FHR's peers, which re-advertise it further based on
the RTC mechanisms. Note that typically the route is advertised only
to the RRs.

The FHR withdraws the Source Active route once a certain amount of
time has elapsed since it last received a packet of the (s,g) flow.
The amount of time to wait is a local matter.

2.2.2.  Originating Tree Join Routes

Note that in this document, tree join routes are S-PMSI/Leaf A-D
routes.

2.2.2.1.  (x,g) Multicast Tree

When a router learns from IGMP/MLD or a downstream PIM/BGP peer that
it needs to join a particular (s,g) tree, it determines the RPF
nexthop address with respect to the source, following the same RPF
procedures as defined for PIM. It then finds the BGP router that
advertised the nexthop address as one of its local addresses.

If the RPF neighbor supports the MCAST-TREE SAFI, this router
originates a Leaf A-D route. Although it is unsolicited, it is
constructed as if there was a corresponding S-PMSI A-D route.
The Upstream Router's IP Address field is set to the RPF neighbor's
session address (learnt via the EC attached to the host route for the
RPF nexthop address). An Address Specific RT corresponding to the
session address is attached to the route, with the Global
Administrative Field set to the session address and the Local
Administrative Field set to 0 or a pre-assigned domain-wide unique
value that identifies a VPN.

Similarly, when a router learns that it needs to join a bidirectional
tree for a particular group, it determines the RPF neighbor with
respect to the RPA. If the neighbor supports the MCAST-TREE SAFI, it
originates a Leaf A-D route and advertises the route to the RPF
neighbor (in case of EBGP or hop-by-hop IBGP), or to one or more RRs.

When a router first learns that it needs to receive traffic for an
ASM group, either because of a local (*,g) IGMP/MLD report or a
downstream PIM (*,g) join, it originates an RTC route with the NLRI's
AS field set to its AS number and the Route Target field set to an
address based RT, with the Global Administrator field set to the
group address and the Local Administrator field set to 0 or a
pre-assigned domain-wide unique value that identifies a VPN. The
route is advertised to its peers (most practically some RRs), so that
the router can receive matching Source Active A-D routes. Upon
receiving the Source Active A-D routes, the router originates Leaf
A-D routes as described above, as long as it still needs to receive
traffic for the flows (i.e., the corresponding IGMP/MLD membership
exists or a join from a downstream PIM/BGP neighbor exists).

When a Leaf A-D route is originated by this router, it sets up the
corresponding forwarding state such that the expected incoming
interface list includes all non-LAN interfaces directly connecting to
the upstream neighbor.
LAN interfaces are added upon receiving the corresponding S-PMSI A-D
route (Section 2.2.5.2). If the upstream neighbor is not directly
connected, tunnels may be used; details are to be included in future
revisions.

When the upstream neighbor changes, the previously advertised Leaf
A-D route is withdrawn. If there is a new upstream neighbor, a new
Leaf A-D route is originated, corresponding to the new neighbor.
Because the NLRIs are different for the old and new Leaf A-D routes,
make-before-break as well as MoFRR [RFC7431] can be achieved.

2.2.2.2.  BGP Inband Signaling for mLDP Tunnel

The same mLDP procedures as defined in [RFC6388] are followed, except
that where a label mapping message is sent in [RFC6388], a Leaf A-D
route is sent if the upstream neighbor supports BGP based signaling.

2.2.3.  Receiving Tree Join Routes

A router (auto-)configures Import RTs matching itself so that it can
import tree join routes from its peers. Note that in this document,
tree join routes are S-PMSI/Leaf A-D routes.

When a router receives a tree join route and imports it, it
determines if it needs to originate its own corresponding route and
advertise it further upstream with respect to the source/RPA or mLDP
tunnel root. If this router is the FHR, is on the RPL, or is the
tunnel root, then it does not need to. Otherwise the procedures in
Section 2.2.2 are followed.

Additionally, the router sets up its corresponding forwarding state
such that traffic will be sent to the downstream neighbor, and
received from the downstream neighbor in case of a bidirectional
tree/tunnel. If the downstream neighbor is not directly connected,
tunnels may be used; details are to be included in future revisions.

2.2.4.  Withdrawal of Tree Join Routes

For a particular tree or tunnel, if a downstream neighbor withdraws
its Leaf A-D route, the neighbor is removed from the corresponding
forwarding state.
If all downstream neighbors withdraw their tree join routes and this
router no longer has local receivers, it withdraws the tree join
routes that it previously originated.

As mentioned earlier, when the upstream neighbor changes, the
previously advertised Leaf A-D route is also withdrawn. The
corresponding incoming interfaces are also removed from the
corresponding forwarding state.

2.2.5.  LAN procedures for (x,g) Unidirectional Tree

For a unidirectional (x,g) multicast tree, if there is a LAN
interface connecting to the downstream neighbor, it MAY be preferred
over non-LAN interfaces, but an S-PMSI A-D route MUST be originated
to facilitate the analog of the Assert process (Section 2.2.5.1).

2.2.5.1.  Originating S-PMSI A-D Routes

If this router chooses to use a LAN interface to send traffic to its
neighbors for a particular (s,g) or (*,g) flow, it MUST announce that
by originating a corresponding S-PMSI A-D route. The Tunnel Type in
the PMSI Tunnel Attribute (PTA) is set to 0 (No tunnel information
present). The LAN interface is identified by an IP address specific
RT, with the Global Administrative Field set to the LAN interface's
address prefix and the Local Administrative Field set to the prefix
length. The RT also serves the purpose of restricting the
importation of the route to the routers on the LAN. An operator MUST
ensure that RTs encoded as above are not used for other purposes. In
practice, this should not be an unreasonable requirement.

If multiple LAN interfaces are to be used (to reach different sets of
neighbors), then the route will include multiple RTs, one for each
used LAN interface as described above.

The S-PMSI A-D routes may also be used to announce tunnels that could
be used to send traffic to downstream neighbors that are not directly
connected. Details may be added in future revisions.

2.2.5.2.  Receiving S-PMSI A-D Routes

A router (auto-)configures an Import RT for each of its LAN
interfaces over which BGP is used for multicast signaling. The
construction of the RT is described in the previous section.

When a router R1 imports an S-PMSI A-D route for flow (x,g) from
router R2, R1 checks to see if it is also originating an S-PMSI A-D
route with the same NLRI except for the Upstream Router's IP Address
field. When a router R1 originates an S-PMSI A-D route, it checks to
see if it has also installed an S-PMSI A-D route, from some other
router R2, with the same NLRI except for the Upstream Router's IP
Address field. In either case, R1 checks to see if the two routes
have an RT in common and the RT is encoded as in Section 2.2.5.1. If
so, then there is a LAN attached to both R1 and R2, and both routers
are prepared to send (x,g) traffic onto that LAN. This triggers the
assert procedure to elect a winner: the router with the highest
Upstream Router's IP Address in the NLRI wins. An assert loser will
not include the corresponding LAN interface in its outgoing interface
list, but it keeps the S-PMSI A-D route that it originates.

If this router does not have a matching S-PMSI route of its own with
some common RTs, and the originator of the received S-PMSI route is a
chosen upstream neighbor for the corresponding flow, then this router
updates its forwarding state to include the LAN interface in the
incoming interface list. When the last S-PMSI route with an RT
matching the LAN is withdrawn later, the LAN interface is removed
from the incoming interface list.

Note that a downstream router on the LAN does not participate in the
assert procedure. It adds/keeps the LAN interface in the expected
incoming interfaces as long as its chosen upstream peer originates
the S-PMSI A-D route. It does not switch to the assert winner as its
upstream.
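
The assert election among upstream routers described in this section
can be sketched in Python as follows. This is a simplified model
under stated assumptions, not the normative procedure: the SPmsiRoute
class and its field names are hypothetical, LAN RTs are rendered as
"<LAN prefix>:<prefix length>" strings, and "highest" is taken to
mean the numerically largest address.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class SPmsiRoute:
    flow: tuple         # (RD, source, group): the NLRI minus the upstream address
    upstream: str       # Upstream Router's IP Address from the NLRI
    lan_rts: frozenset  # RTs built from LAN prefix and prefix length


def assert_winner(own: SPmsiRoute, received: list) -> str:
    """Return the Upstream Router's IP Address that wins the assert.

    Only routes for the same flow that share at least one LAN RT with
    `own` take part; routes for other LANs or flows are ignored.
    """
    rivals = [
        route for route in received
        if route.flow == own.flow and route.lan_rts & own.lan_rts
    ]
    contenders = [own] + rivals
    # Compare numerically so that 10.0.0.10 ranks above 10.0.0.9.
    best = max(contenders, key=lambda r: int(ipaddress.ip_address(r.upstream)))
    return best.upstream
```

A router that is not returned as the winner would leave the shared
LAN interface out of its outgoing interface list while keeping its
own S-PMSI A-D route, as stated above.
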
An assert loser MAY keep sending joins upstream based on local policy
even if it has no other downstream neighbors (this could be used for
fast switchover in case the assert winner fails).

2.2.6.  Distributing Label for Upstream Traffic for Bidirectional
        Tree/Tunnel

For MP2MP mLDP tunnels or labeled (*,g) bidirectional trees, an
upstream router needs to advertise a label to all its downstream
neighbors so that the downstream neighbors can send traffic to it.

For MP2MP mLDP tunnels, the same procedures as for mLDP are followed,
except that instead of MP2MP-U Label Mapping messages, S-PMSI A-D
Routes for C-multicast mLDP are used.

For labeled (*,g) bidirectional trees, for a Leaf A-D route received
from a downstream neighbor, a corresponding S-PMSI A-D route is sent
back to the downstream router.

In both cases, a single S-PMSI A-D route is originated for each tree
from this router, but with multiple RTs (one for each downstream
neighbor on the tree). A TEA specifies a label allocated by the
upstream router for its downstream neighbors to send traffic with.
Note that this is still a "downstream allocated" label (the upstream
router is "downstream" from the traffic direction point of view).

The S-PMSI routes do not carry a PTA, unless a P2MP tunnel is used to
reach downstream neighbors. Such a use case is currently out of
scope of this document and may be specified in the future.

3.  IANA Considerations

This document requests IANA to assign a new BGP SAFI value for the
MCAST-TREE SAFI.

This document requests IANA to create a new "BGP MCAST-TREE Route
Types" registry, referencing this document.
The following initial values are defined:

      0~2  - Reserved
      3    - S-PMSI A-D Route for (x,g)
      4    - Leaf A-D Route
      5    - Source Active A-D Route
      0x43 - S-PMSI A-D Route for C-multicast mLDP

This document requests IANA to assign two Sub-Type values from the
Transitive IPv4-Address-Specific Extended Community Sub-types
Registry for the Session Address EC and the Multicast RPF Address EC
respectively.

This document requests IANA to assign two Sub-Type values from the
Transitive IPv6-Address-Specific Extended Community Types Registry
for the Session Address EC and the Multicast RPF Address EC
respectively.

This document requests IANA to assign one Sub-Type value from the
Transitive Opaque Extended Community Types Registry for the Topology/
IPA EC.

4.  Security Considerations

This document shares many of the mechanisms and concepts of MVPN and,
accordingly, can reuse many of the security considerations described
in RFC 6513 and RFC 6514, though the distinctions made on PE-CE links
and relationships in those documents are not relevant.

This document describes interworking with several multicast control
protocols, including PIM-SM, PIM-SSM, PIM-Bidir, mLDP and IGMP/MLD.
Security considerations specified for those protocols are applicable
to this document.

Implementations should include the Multicast Damping procedures
specified in RFC 7899 to protect the control plane from excessive
churn due to multicast dynamicity. Implementations should also
include the ability to rate-limit join state creation on a per-peer
and per-RIB basis, as well as to rate-limit Source Active A-D route
propagation on a per-source, per-peer and per-RIB basis, to
configurable thresholds.

5.  Acknowledgements

The authors thank Marco Rodrigues for his initial idea/ask of using
BGP for multicast signaling beyond MVPN.
We thank Eric Rosen for his questions, suggestions, and help finding
solutions to some issues. We also thank Luay Jalil, James Uttaro and
Shraddha Hegde for their comments and support for the work.

6.  References

6.1.  Normative References

[I-D.ietf-idr-tunnel-encaps]
           Patel, K., Velde, G. V. D., Sangli, S. R., and J. Scudder,
           "The BGP Tunnel Encapsulation Attribute",
           draft-ietf-idr-tunnel-encaps-22 (work in progress),
           January 2021.

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
           Requirement Levels", BCP 14, RFC 2119,
           DOI 10.17487/RFC2119, March 1997.

[RFC4601]  Fenner, B., Handley, M., Holbrook, H., and I. Kouvelas,
           "Protocol Independent Multicast - Sparse Mode (PIM-SM):
           Protocol Specification (Revised)", RFC 4601,
           DOI 10.17487/RFC4601, August 2006.

[RFC4684]  Marques, P., Bonica, R., Fang, L., Martini, L., Raszuk,
           R., Patel, K., and J. Guichard, "Constrained Route
           Distribution for Border Gateway Protocol/MultiProtocol
           Label Switching (BGP/MPLS) Internet Protocol (IP) Virtual
           Private Networks (VPNs)", RFC 4684, DOI 10.17487/RFC4684,
           November 2006.

[RFC5015]  Handley, M., Kouvelas, I., Speakman, T., and L. Vicisano,
           "Bidirectional Protocol Independent Multicast
           (BIDIR-PIM)", RFC 5015, DOI 10.17487/RFC5015,
           October 2007.

[RFC6514]  Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP
           Encodings and Procedures for Multicast in MPLS/BGP IP
           VPNs", RFC 6514, DOI 10.17487/RFC6514, February 2012.

[RFC7441]  Wijnands, IJ., Rosen, E., and U. Joorde, "Encoding
           Multipoint LDP (mLDP) Forwarding Equivalence Classes
           (FECs) in the NLRI of BGP MCAST-VPN Routes", RFC 7441,
           DOI 10.17487/RFC7441, January 2015.

6.2.  Informative References

[I-D.hegde-spring-mpls-seamless-sr]
           Hegde, S., Bowers, C., Xu, X., Gulko, A., Bogdanov, A.,
           Uttaro, J., Jalil, L., Khaddam, M., Alston, A., and L. M.
           Contreras, "Seamless SR Problem Statement",
           draft-hegde-spring-mpls-seamless-sr-06 (work in progress),
           September 2021.

[I-D.ietf-bess-bgp-multicast-controller]
           Zhang, Z., Raszuk, R., Pacella, D., and A. Gulko,
           "Controller Based BGP Multicast Signaling",
           draft-ietf-bess-bgp-multicast-controller-07 (work in
           progress), July 2021.

[I-D.ietf-bess-mvpn-evpn-aggregation-label]
           Zhang, Z., Rosen, E., Lin, W., Li, Z., and I. Wijnands,
           "MVPN/EVPN Tunnel Aggregation with Common Labels",
           draft-ietf-bess-mvpn-evpn-aggregation-label-07 (work in
           progress), December 2021.

[I-D.ietf-bess-mvpn-pe-ce]
           Patel, K., Rosen, E. C., and Y. Rekhter, "BGP as an MVPN
           PE-CE Protocol", draft-ietf-bess-mvpn-pe-ce-01 (work in
           progress), October 2015.

[I-D.ietf-lsr-flex-algo]
           Psenak, P., Hegde, S., Filsfils, C., Talaulikar, K., and
           A. Gulko, "IGP Flexible Algorithm",
           draft-ietf-lsr-flex-algo-18 (work in progress),
           October 2021.

[I-D.ietf-mpls-seamless-mpls]
           Leymann, N., Decraene, B., Filsfils, C., Konstantynowicz,
           M., and D. Steinberg, "Seamless MPLS Architecture",
           draft-ietf-mpls-seamless-mpls-07 (work in progress),
           June 2014.

[I-D.ietf-pim-sr-p2mp-policy]
           Voyer, D., Ed., Filsfils, C., Parekh, R., Bidgoli, H.,
           and Z. Zhang, "Segment Routing Point-to-Multipoint
           Policy", draft-ietf-pim-sr-p2mp-policy-03 (work in
           progress), August 2021.

[I-D.ietf-spring-segment-routing-policy]
           Filsfils, C., Talaulikar, K., Voyer, D., Bogdanov, A., and
           P. Mattes, "Segment Routing Policy Architecture",
           draft-ietf-spring-segment-routing-policy-14 (work in
           progress), October 2021.

[I-D.kaliraj-idr-bgp-classful-transport-planes]
           Vairavakkalai, K., Venkataraman, N., Rajagopalan, B.,
           Mishra, G., Khaddam, M., Xu, X., Szarecki, R. J., and D.
           J. Gowda, "BGP Classful Transport Planes",
           draft-kaliraj-idr-bgp-classful-transport-planes-12 (work
           in progress), August 2021.

[I-D.wijnands-bier-mld-lan-election]
           Wijnands, I., Pfister, P., and J. Zhang, "Generic
           Multicast Router Election on LAN's",
           draft-wijnands-bier-mld-lan-election-01 (work in
           progress), July 2016.

[I-D.wijnands-mpls-mldp-multi-topology]
           Wijnands, I., Raza, K., Mishra, M., Budhiraja, A., Zhang,
           Z., and A. Gulko, "mLDP Extensions for Multi-Topology
           Routing", draft-wijnands-mpls-mldp-multi-topology-03 (work
           in progress), October 2021.

[RFC4915]  Psenak, P., Mirtorabi, S., Roy, A., Nguyen, L., and P.
           Pillay-Esnault, "Multi-Topology (MT) Routing in OSPF",
           RFC 4915, DOI 10.17487/RFC4915, June 2007.

[RFC5120]  Przygienda, T., Shen, N., and N. Sheth, "M-ISIS: Multi
           Topology (MT) Routing in Intermediate System to
           Intermediate Systems (IS-ISs)", RFC 5120,
           DOI 10.17487/RFC5120, February 2008.

[RFC5496]  Wijnands, IJ., Boers, A., and E. Rosen, "The Reverse Path
           Forwarding (RPF) Vector TLV", RFC 5496,
           DOI 10.17487/RFC5496, March 2009.

[RFC6420]  Cai, Y. and H. Ou, "PIM Multi-Topology ID (MT-ID) Join
           Attribute", RFC 6420, DOI 10.17487/RFC6420, November 2011.

[RFC6512]  Wijnands, IJ., Rosen, E., Napierala, M., and N. Leymann,
           "Using Multipoint LDP When the Backbone Has No Route to
           the Root", RFC 6512, DOI 10.17487/RFC6512, February 2012.

[RFC6826]  Wijnands, IJ., Ed., Eckert, T., Leymann, N., and M.
           Napierala, "Multipoint LDP In-Band Signaling for
           Point-to-Multipoint and Multipoint-to-Multipoint Label
           Switched Paths", RFC 6826, DOI 10.17487/RFC6826,
           January 2013.

[RFC7060]  Napierala, M., Rosen, E., and IJ. Wijnands, "Using LDP
           Multipoint Extensions on Targeted LDP Sessions", RFC 7060,
           DOI 10.17487/RFC7060, November 2013.

[RFC7431]  Karan, A., Filsfils, C., Wijnands, IJ., Ed., and B.
           Decraene, "Multicast-Only Fast Reroute", RFC 7431,
           DOI 10.17487/RFC7431, August 2015.

[RFC7938]  Lapukhov, P., Premji, A., and J. Mitchell, Ed., "Use of
           BGP for Routing in Large-Scale Data Centers", RFC 7938,
           DOI 10.17487/RFC7938, August 2016.

[RFC8402]  Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L.,
           Decraene, B., Litkowski, S., and R. Shakir, "Segment
           Routing Architecture", RFC 8402, DOI 10.17487/RFC8402,
           July 2018.

Authors' Addresses

   Zhaohui Zhang
   Juniper Networks

   EMail: zzhang@juniper.net

   Lenny Giuliano
   Juniper Networks

   EMail: lenny@juniper.net

   Keyur Patel
   Arrcus

   EMail: keyur@arrcus.com

   IJsbrand Wijnands
   Cisco Systems

   EMail: ice@cisco.com

   Mankamana Mishra
   Cisco Systems

   EMail: mankamis@cisco.com

   Arkadiy Gulko
   Refinitiv

   EMail: arkadiy.gulko@refinitiv.com