idnits 2.17.00 (12 Aug 2021) /tmp/idnits19645/draft-ietf-idr-rpd-15.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- == There are 8 instances of lines with non-RFC6890-compliant IPv4 addresses in the document. If these are example addresses, they should be changed. Miscellaneous warnings: ---------------------------------------------------------------------------- == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'MUST not' in this paragraph: GeMask: 1 octet for route prefix length match range's lower bound, MUST not be less than Mask or be 0. -- The document date (25 January 2022) is 109 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'I-D.ietf-idr-registered-wide-bgp-communities' is defined on line 886, but no explicit reference was found in the text == Outdated reference: A later version (-07) exists of draft-ietf-idr-wide-bgp-communities-06 Summary: 0 errors (**), 0 flaws (~~), 4 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 Network Working Group Z. Li 3 Internet-Draft Huawei 4 Intended status: Standards Track L. Ou 5 Expires: 29 July 2022 Y. Luo 6 China Telcom Co., Ltd. 7 S. Lu 8 Tencent 9 G. Mishra 10 Verizon Inc. 11 H. Chen 12 Futurewei 13 S. Zhuang 14 H. Wang 15 Huawei 16 25 January 2022 18 BGP Extensions for Routing Policy Distribution (RPD) 19 draft-ietf-idr-rpd-15 21 Abstract 23 It is hard to adjust traffic and optimize traffic paths in a 24 traditional IP network from time to time through manual 25 configurations. It is desirable to have a mechanism for setting up 26 routing policies, which adjusts traffic and optimizes traffic paths 27 automatically. This document describes BGP Extensions for Routing 28 Policy Distribution (BGP RPD) to support this. 30 Requirements Language 32 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 33 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 34 document are to be interpreted as described in [RFC2119] [RFC8174] 35 when, and only when, they appear in all capitals, as shown here. 37 Status of This Memo 39 This Internet-Draft is submitted in full conformance with the 40 provisions of BCP 78 and BCP 79. 42 Internet-Drafts are working documents of the Internet Engineering 43 Task Force (IETF). Note that other groups may also distribute 44 working documents as Internet-Drafts. The list of current Internet- 45 Drafts is at https://datatracker.ietf.org/drafts/current/. 47 Internet-Drafts are draft documents valid for a maximum of six months 48 and may be updated, replaced, or obsoleted by other documents at any 49 time. It is inappropriate to use Internet-Drafts as reference 50 material or to cite them other than as "work in progress." 52 This Internet-Draft will expire on 29 July 2022. 54 Copyright Notice 56 Copyright (c) 2022 IETF Trust and the persons identified as the 57 document authors. All rights reserved. 
59 This document is subject to BCP 78 and the IETF Trust's Legal 60 Provisions Relating to IETF Documents (https://trustee.ietf.org/ 61 license-info) in effect on the date of publication of this document. 62 Please review these documents carefully, as they describe your rights 63 and restrictions with respect to this document. Code Components 64 extracted from this document must include Revised BSD License text as 65 described in Section 4.e of the Trust Legal Provisions and are 66 provided without warranty as described in the Revised BSD License. 68 Table of Contents 70 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 71 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3 72 3. Problem Statement . . . . . . . . . . . . . . . . . . . . . . 4 73 3.1. Inbound Traffic Control . . . . . . . . . . . . . . . . . 4 74 3.2. Outbound Traffic Control . . . . . . . . . . . . . . . . 5 75 4. Protocol Extensions . . . . . . . . . . . . . . . . . . . . . 6 76 4.1. Using a New AFI and SAFI . . . . . . . . . . . . . . . . 6 77 4.2. BGP Wide Community and Atoms . . . . . . . . . . . . . . 8 78 4.2.1. RouteAttr atom Sub-TLV . . . . . . . . . . . . . . . 9 79 4.2.2. Sub-TLVs of the Parameters TLV . . . . . . . . . . . 12 80 4.3. Capability Negotiation . . . . . . . . . . . . . . . . . 14 81 5. Operations . . . . . . . . . . . . . . . . . . . . . . . . . 15 82 5.1. Application Scenario . . . . . . . . . . . . . . . . . . 15 83 5.2. About Failure . . . . . . . . . . . . . . . . . . . . . . 16 84 6. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 17 85 7. Security Considerations . . . . . . . . . . . . . . . . . . . 17 86 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 17 87 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 17 88 9.1. Existing Assignments . . . . . . . . . . . . . . . . . . 17 89 9.2. Registered IANA Wide Communities . . . . . . . . . . . . 18 90 9.3. RouteAttr Atom Type . . . . . . . . . . . . 
. . . . . . . 18 91 9.4. Route Attributes Sub-sub-TLV Registry . . . . . . . . . . 18 92 9.5. Attribute Change Sub-TLV Registry . . . . . . . . . . . . 19 93 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 19 94 10.1. Normative References . . . . . . . . . . . . . . . . . . 19 95 10.2. Informative References . . . . . . . . . . . . . . . . . 20 96 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 21 98 1. Introduction 100 It is difficult to optimize traffic paths in a traditional IP network 101 because of the following: 103 * Complex. Traffic can only be adjusted device by device. The 104 configurations on all the routers that the traffic traverses need 105 to be changed or added. There are already lots of policies 106 configured on the routers in an operational network. There are 107 different types of policies, which include security, management 108 and control policies. These policies are relatively stable. 109 However, the policies for adjusting traffic are dynamic. Whenever 110 the traffic through a route is not expected, the policies to 111 adjust the traffic for that route are configured on the related 112 routers. It is complex to dynamically add or change the policies 113 to the existing policies on the special routers to adjust the 114 traffic. Some people would like to separate the stable route 115 policies from the dynamic ones even though they have configuration 116 automation systems (including YANG models). 118 * Difficult maintenance. The routing policies used to adjust 119 network traffic are dynamic, posing difficulties to subsequent 120 maintenance. High maintenance skills are required. 122 * Slow. Adding or changing some route policies on some routers 123 through a configuration automation system for adjusting some 124 traffic to avoid congestions may be slow. 126 It is desirable to have an automatic mechanism for setting up routing 127 policies, which can simplify routing policy configuration and be 128 fast. 
This document describes extensions to BGP for Routing Policy 129 Distribution to resolve these issues. 131 2. Terminology 133 The following terminology is used in this document. 135 * ACL: Access Control List 137 * BGP: Border Gateway Protocol [RFC4271] 139 * FS: Flow Specification 141 * NLRI: Network Layer Reachability Information [RFC4271] 142 * PBR: Policy-Based Routing 144 * RPD: Routing Policy Distribution 146 * VPN: Virtual Private Network 148 3. Problem Statement 150 Providers need to adjust their business traffic 151 routing policies from time to time for the following reasons: 153 * Business development or network failure introduces link congestion 154 and overload. 156 * Business changes or network additions produce unused resources 157 such as idle links. 159 * Network transmission quality degrades as a result of delay or 160 loss, so traffic needs to be moved to other paths. 162 * To control OPEX and CAPEX, they may prefer the transit provider 163 with the lower price. 165 3.1. Inbound Traffic Control 167 In Figure 1, for the reasons above, the provider P of AS100 may wish 168 the inbound traffic from AS200 to enter AS100 through link L3 instead 169 of the others. Since P doesn't have any administrative control over 170 AS200, there is no way for P to directly modify the route selection 171 criteria inside AS200. 173 Traffic from PE1 to Prefix1 174 -----------------------------------> 176 +-----------------+ +-------------------------+ 177 | +---------+ | L1 | +----+ +----------+| 178 | |Speaker1 | +------------+ |IGW1| |policy || 179 | +---------+ |** L2**| +----+ |controller|| 180 | | ** ** | +----------+| 181 | +---+ | **** | | 182 | |PE1| | **** | | 183 | +---+ | ** ** | | 184 | +---------+ |** L3**| +----+ | 185 | |Speaker2 | +------------+ |IGW2| AS100 | 186 | +---------+ | L4 | +----+ | 187 | | | | 188 | AS200 | | | 189 | | | ...
| 190 | | | | 191 | +---------+ | | +----+ +-------+ | 192 | |Speakern | | | |IGWn| |Prefix1| | 193 | +---------+ | | +----+ +-------+ | 194 +-----------------+ +-------------------------+ 196 Prefix1 advertised from AS100 to AS200 197 <---------------------------------------- 199 Figure 1: Inbound Traffic Control case 201 3.2. Outbound Traffic Control 203 In Figure 2, the provider P of AS100 prefers link L3 for the traffic 204 to the destination Prefix2 among multiple exits and links to AS200. 205 This preference can be dynamic and might change frequently because of 206 the reasons above. So, provider P expects an efficient and 207 convenient solution. 209 Traffic from PE2 to Prefix2 210 -----------------------------------> 211 +-------------------------+ +-----------------+ 212 |+----------+ +----+ |L1 | +---------+ | 213 ||policy | |IGW1| +------------+ |Speaker1 | | 214 ||controller| +----+ |** **| +---------+ | 215 |+----------+ |L2** ** | +-------+| 216 | | **** | |Prefix2|| 217 | | **** | +-------+| 218 | |L3** ** | | 219 | AS100 +----+ |** **| +---------+ | 220 | |IGW2| +------------+ |Speaker2 | | 221 | +----+ |L4 | +---------+ | 222 | | | | 223 |+---+ | | AS200 | 224 ||PE2| ... | | | 225 |+---+ | | | 226 | +----+ | | +---------+ | 227 | |IGWn| | | |Speakern | | 228 | +----+ | | +---------+ | 229 +-------------------------+ +-----------------+ 231 Prefix2 advertised from AS200 to AS100 232 <---------------------------------------- 234 Figure 2: Outbound Traffic Control case 236 4. Protocol Extensions 238 This document specifies a solution using a new AFI and SAFI with the 239 BGP Wide Community for encoding a routing policy. 241 4.1. Using a New AFI and SAFI 243 A new AFI and SAFI are defined: the Routing Policy AFI whose 244 codepoint 16398 has been assigned by IANA, and SAFI whose codepoint 245 75 has been assigned by IANA. 
247 The AFI and SAFI pair uses a new NLRI, which is defined as follows: 249 0 1 2 3 250 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 251 +-+-+-+-+-+-+-+-+ 252 | NLRI Length | 253 +-+-+-+-+-+-+-+-+ 254 | Policy Type | 255 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 256 | Distinguisher (4 octets) | 257 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 258 | Peer IP (4/16 octets) ~ 259 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 261 Figure 3: AFI and SAFI with new NLRI 263 Where: 265 NLRI Length: 1 octet that gives the length of the NLRI. If the Length 266 is anything other than 9 or 21, the NLRI is corrupt and the 267 enclosing UPDATE message MUST be ignored. 269 Policy Type: 1 octet that indicates the type of the policy. 1 is for 270 Export policy. 2 is for Import policy. If the Policy Type is any 271 other value, the NLRI is corrupt and the enclosing UPDATE message 272 MUST be ignored. 274 Distinguisher: 4-octet unsigned integer that uniquely identifies the 275 content/policy. It is used to order the policies from the 276 lower to the higher distinguisher. Policies are applied in ascending 277 order: a policy with a lower distinguisher is applied 278 before the policies with a higher distinguisher. 280 Peer IP: 4- or 16-octet value that identifies an IPv4 or IPv6 peer. Its default 281 value is 0, which indicates that when receiving a BGP UPDATE 282 message with the NLRI, a BGP speaker will apply the policy in the 283 message to all its IPv4/IPv6 peers. 285 Under RPD AFI/SAFI, the RPD routes are stored and ordered according 286 to the keys (Policy Type, Distinguisher, Peer IP). Under IPv4/IPv6 287 Unicast AFI/SAFI, there are IPv4/IPv6 unicast routes learned and 288 various static policies configured. In addition, there are dynamic 289 RPD policies from the RPD AFI/SAFI when RPD is enabled.
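The NLRI layout and its validity rules above can be illustrated with a short sketch. This is not part of the specification: the class name RpdNlri and the helpers decode_nlri, applies_to and sort_key are hypothetical, and the sketch assumes that NLRI Length counts the octets that follow the Length octet (1 + 4 + 4 = 9 for IPv4, 1 + 4 + 16 = 21 for IPv6), which is what the values 9 and 21 suggest. Raising ValueError here merely signals a corrupt NLRI to the caller, which would then ignore the enclosing UPDATE.

```python
import ipaddress
import struct
from dataclasses import dataclass
from typing import Union

EXPORT, IMPORT = 1, 2  # Policy Type values from the field description above

IPAddress = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


@dataclass(frozen=True)
class RpdNlri:  # hypothetical name, not defined by the draft
    policy_type: int
    distinguisher: int
    peer_ip: IPAddress

    def encode(self) -> bytes:
        # Policy Type (1) + Distinguisher (4) + Peer IP (4 or 16) = 9 or 21 octets.
        body = struct.pack("!BI", self.policy_type, self.distinguisher)
        body += self.peer_ip.packed
        return bytes([len(body)]) + body

    def applies_to(self, peer: IPAddress) -> bool:
        # Peer IP 0 means "all peers"; otherwise only the named peer.
        return int(self.peer_ip) == 0 or self.peer_ip == peer

    def sort_key(self):
        # Assumption: the keys (Policy Type, Distinguisher, Peer IP) are
        # compared field by field; the address family only breaks ties.
        return (self.policy_type, self.distinguisher,
                self.peer_ip.version, int(self.peer_ip))


def decode_nlri(data: bytes) -> RpdNlri:
    if not data or data[0] not in (9, 21) or len(data) < 1 + data[0]:
        raise ValueError("corrupt RPD NLRI: bad length")
    length = data[0]
    policy_type, distinguisher = struct.unpack_from("!BI", data, 1)
    if policy_type not in (EXPORT, IMPORT):
        raise ValueError("corrupt RPD NLRI: unknown Policy Type")
    peer_ip = ipaddress.ip_address(data[6:1 + length])
    return RpdNlri(policy_type, distinguisher, peer_ip)
```

Sorting a list of decoded routes with key=RpdNlri.sort_key yields the ascending distinguisher order in which the policies are applied.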
291 Before advertising an IPv4/IPv6 Unicast AFI/SAFI route, the 292 configured policies are applied to it first, and then the RPD Export 293 policies are applied. 295 The NLRI containing the Routing Policy is carried in MP_REACH_NLRI 296 and MP_UNREACH_NLRI path attributes in a BGP UPDATE message, which 297 MUST also contain the BGP mandatory attributes and MAY contain some 298 BGP optional attributes. 300 When receiving a BGP UPDATE message with a routing policy, a BGP 301 speaker processes it as follows: 303 * If the peer IP in the NLRI is 0, the routing policy is applied to 304 all the remote peers of this BGP speaker. 306 * If the peer IP in the NLRI is non-zero, the IP address 307 indicates a remote peer of this BGP speaker and the routing policy 308 is applied to that peer only. 310 The content of the Routing Policy is encoded in a BGP Wide Community. 312 4.2. BGP Wide Community and Atoms 314 The BGP wide community attribute is defined in 315 [I-D.ietf-idr-wide-bgp-communities]. This document specifies how two 316 wide communities are associated with the Routing Policy 317 NLRI (Section 4.1) to distribute routing policy to BGP peers. The 318 wide communities that define routing policy are: 320 * MATCH AND SET ATTR (TBD1) 322 * MATCH AND NOT ADVERTISE (TBD2) 324 These wide communities are passed in the BGP wide community container 325 in the wide community attribute. These communities support three of 326 the optional TLVs: Target TLV, Exclude Target TLV, and Parameter TLV. 327 The value of each of these TLVs comprises a series of Atoms, each of 328 which is a TLV (or sub-TLV). 330 A new wide community Atom is defined for BGP Wide Community Target(s) 331 TLV (RouteAttr), and two new Atoms are defined for BGP Wide Community 332 Parameter(s) TLV.
For your reference, the format of the TLV is 333 illustrated below: 335 0 1 2 3 336 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 337 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 338 | Type | Length | 339 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 340 | Value (variable) ~ 341 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 342 Figure 4: Format of Wide Community Atom TLV 344 4.2.1. RouteAttr atom Sub-TLV 346 A RouteAttr Atom sub-TLV (or RouteAttr sub-TLV for short) is defined 347 and may be included in a Target TLV. It has the following format. 349 0 1 2 3 350 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 351 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 352 | Type (TBD3) | Length (variable) | 353 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 354 | sub-sub-TLVs ~ 355 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 357 Figure 5: Format of RouteAttr Atom sub-TLV 359 The Type for RouteAttr atom is TBD3. In RouteAttr sub-TLV, four sub- 360 sub-TLVs are defined: IPv4 Prefix, IPv6 Prefix, AS-Path, and 361 Community sub-sub-TLV. 363 An IP prefix sub-sub-TLV gives matching criteria on IPv4 prefixes. 364 Its format is illustrated below: 366 0 1 2 3 367 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 368 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 369 | Type 1 | Length (N x 8) |M-Type | Flags | 370 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 371 | IPv4 Address | 372 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 373 | Mask | GeMask | LeMask |M-Type | Flags | 374 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 375 ~ . . . 
376 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 377 | IPv4 Address | 378 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 379 | Mask | GeMask | LeMask | 380 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 382 Figure 6: Format of IPv4 Prefix sub-sub-TLV 384 Type: 1 for IPv4 Prefix. 386 Length: N x 8, where N is the number of tuples <M-Type, Flags, IPv4 387 Address, Mask, GeMask, LeMask>. If Length is not a multiple of 8, 388 the Atom is corrupt and the enclosing UPDATE message MUST be 389 ignored. 391 M-Type: 4-bit field specifying the match type. The following four 392 values are defined. IPaddress is the IP address in the sub-sub- 393 TLV while IProute is the IP route being matched. 395 M-Type = 0: Exact match with the IP address prefix of length Mask. 396 GeMask and LeMask MUST be sent as zero and ignored on receipt. 398 M-Type = 1: Matches if the Mask number of prefix bits exactly 399 match between IPaddress and IProute and the actual prefix 400 length of IProute is greater than or equal to GeMask. LeMask 401 MUST be sent as zero and ignored on receipt. 403 M-Type = 2: Matches if the Mask number of prefix bits exactly 404 match between IPaddress and IProute and the actual prefix 405 length of IProute is less than or equal to LeMask. GeMask MUST 406 be sent as zero and ignored on receipt. 408 M-Type = 3: Matches if the Mask number of prefix bits exactly 409 match between IPaddress and IProute and the actual prefix 410 length of IProute is less than or equal to LeMask and greater 411 than or equal to GeMask. 413 Flags: 4 bits. No flags are currently defined. They MUST be sent 414 as zero and ignored on receipt. 416 IPv4 Address: 4 octets for an IPv4 address. 418 Mask: 1 octet for the IP address prefix length that needs to exactly 419 match between the IP address in the sub-sub-TLV and the route. 421 GeMask: 1 octet for the lower bound of the route prefix length match range. 422 It MUST be either 0 or greater than or equal to Mask.
424 LeMask: 1 octet for the upper bound of the route prefix length match range. 425 It MUST be either 0 or greater than Mask. 427 For example, tuple <M-Type=0, Flags=0, IPv4 Address=1.1.0.0, Mask=22, 428 GeMask=0, LeMask=0> represents an exact IP prefix match for 429 1.1.0.0/22. 431 <M-Type=1, Flags=0, IPv4 Address=16.1.0.0, Mask=24, GeMask=24, 432 LeMask=0> represents match IP prefix 16.1.0.0/24 greater-equal 24 433 (i.e., route matches if route's first Mask=24 bits match 16.1.0 and 434 24 <= route's prefix length <= 32). 436 <M-Type=2, Flags=0, IPv4 Address=17.1.0.0, Mask=24, GeMask=0, 437 LeMask=26> represents match IP prefix 17.1.0.0/24 less-equal 26 438 (i.e., route matches if route's first Mask=24 bits match 17.1.0 and 439 24 <= route's prefix length <= 26). 441 <M-Type=3, Flags=0, IPv4 Address=18.1.0.0, Mask=24, GeMask=24, 442 LeMask=30> represents match IP prefix 18.1.0.0/24 greater-equal 24 443 and less-equal 30 (i.e., route matches if route's first Mask=24 bits 444 match 18.1.0 and 24 <= route's prefix length <= 30). 446 Similarly, an IPv6 Prefix sub-sub-TLV represents match criteria on 447 IPv6 prefixes. Its format is illustrated below: 449 0 1 2 3 450 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 451 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 452 | Type 4 | Length (N x 20) |M-Type | Flags | 453 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 454 | IPv6 Address (16 octets) ~ 455 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 456 | Mask | GeMask | LeMask |M-Type | Flags | 457 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 458 ~ . . . 459 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 460 | IPv6 Address (16 octets) ~ 461 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 462 | Mask | GeMask | LeMask | 463 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 465 Figure 7: Format of IPv6 Prefix sub-sub-TLV 467 An AS-Path sub-sub-TLV represents a match criterion expressed as a regular 468 expression string.
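Before the AS-Path encoding is detailed, the four prefix match types defined for the IPv4 and IPv6 Prefix sub-sub-TLVs can be summarized in a short sketch. The function name prefix_matches and its parameter names are hypothetical. Two points are assumptions rather than statements of the draft: a route shorter than Mask never matches (this follows the worked examples, which all bound the prefix length from below by Mask), and an M-Type above 3 is rejected with an exception because the draft does not say how to treat it.

```python
import ipaddress


def prefix_matches(route, address, mask, m_type, ge_mask=0, le_mask=0):
    """Return True if `route` (e.g. "16.1.0.128/25") matches one prefix tuple."""
    net = ipaddress.ip_network(route)
    base = ipaddress.ip_network(f"{address}/{mask}", strict=False)
    # The first Mask bits must agree, which also requires the route to be
    # at least Mask bits long (assumption taken from the worked examples).
    if net.version != base.version or not net.subnet_of(base):
        return False
    plen = net.prefixlen
    if m_type == 0:   # exact match on the Mask-length prefix
        return plen == mask
    if m_type == 1:   # greater-equal GeMask
        return plen >= ge_mask
    if m_type == 2:   # less-equal LeMask
        return plen <= le_mask
    if m_type == 3:   # within [GeMask, LeMask]
        return ge_mask <= plen <= le_mask
    # Assumption: behavior for undefined M-Type values is not specified.
    raise ValueError(f"undefined M-Type {m_type}")
```

For instance, prefix_matches("17.1.0.0/26", "17.1.0.0", 24, 2, le_mask=26) evaluates the less-equal example from the text. The same function works for IPv6 tuples because the ipaddress module handles both address families.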
The format of the AS-Path sub-sub-TLV is illustrated below: 470 0 1 2 3 471 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 472 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 473 | Type 2 | Length (Variable) | 474 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 475 | AS-Path Regex String | 476 : : 477 | ~ 478 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 480 Figure 8: Format of AS-Path sub-sub-TLV 482 Type: 2 for AS-Path. 484 Length: Variable, maximum is 1024. 486 AS-Path Regex String: AS-Path regular expression string. 488 A Community sub-sub-TLV represents a list of communities, all of 489 which need to match. Its format is illustrated below: 491 0 1 2 3 492 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 493 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 494 | Type 3 | Length (N x 4 + 1) | Flags | 495 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 496 | Community 1 Value | 497 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 498 ~ . . . ~ 499 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 500 | Community N Value | 501 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 503 Figure 9: Format of Community sub-sub-TLV 505 Type: 3 for Community. 507 Length: N x 4 + 1, where N is the number of communities. If Length 508 is not a multiple of 4 plus 1, the Atom is corrupt and the 509 enclosing UPDATE MUST be ignored. 511 Flags: 1 octet. No flags are currently defined. These bits MUST be 512 sent as zero and ignored on receipt. 514 4.2.2. Sub-TLVs of the Parameters TLV 516 This document introduces two community values: 518 MATCH AND SET ATTR (TBD1): If the IPv4/IPv6 unicast routes to a 519 remote peer match the specific conditions defined in the routing 520 policy extracted from the RPD route, then the attributes of the 521 IPv4/IPv6 unicast routes will be modified when sending to the 522 remote peer per the actions defined in the RPD route.
524 MATCH AND NOT ADVERTISE (TBD2): If the IPv4/IPv6 unicast routes to a 525 remote peer match the specific conditions defined in the routing 526 policy extracted from the RPD route, then the IPv4/IPv6 unicast 527 routes will not be advertised to the remote peer. 529 For the Parameter(s) TLV, two action sub-TLVs are defined: MED change 530 sub-TLV and AS-Path change sub-TLV. When the community in the 531 container is MATCH AND SET ATTR, the Parameter(s) TLV can include 532 these sub-TLVs. When the community is MATCH AND NOT ADVERTISE, the 533 Parameter(s) TLV's value is empty. 535 A MED change sub-TLV indicates an action to change the MED. Its 536 format is illustrated below: 538 0 1 2 3 539 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 540 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 541 | Type 1 | Length (5) | OP | 542 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 543 | Value | 544 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 546 Figure 10: Format of MED Change sub-TLV 548 Type: 1 for MED Change. 550 Length: 5. If Length is any other value, the sub-TLV is corrupt and 551 the enclosing UPDATE MUST be ignored. 553 OP: 1 octet. Three are defined: 555 OP = 0: assign the Value to the existing MED. 557 OP = 1: add the Value to the existing MED. If the sum is greater 558 than the maximum value for MED, assign the maximum value to 559 MED. 561 OP = 2: subtract the Value from the existing MED. If the 562 existing MED minus the Value is less than 0, assign 0 to MED. 564 If OP is any other value, the sub-TLV is ignored. 566 Value: 4 octets. 568 An AS-Path change sub-TLV indicates an action to change the AS-Path. 
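Stepping back to the MED Change sub-TLV defined above, its three OP values and their saturation rules can be sketched as follows. The names parse_med_change and apply_med_change are hypothetical. The sketch assumes the field widths shown in Figure 10 (1-octet Type, 2-octet Length, 1-octet OP, 4-octet Value) and uses 2^32 - 1 as the maximum MED, since the MULTI_EXIT_DISC attribute is a four-octet unsigned integer in [RFC4271].

```python
import struct

MED_MAX = 0xFFFFFFFF  # MED is a 4-octet unsigned integer


def parse_med_change(data: bytes):
    """Return (op, value) from a MED Change sub-TLV, or raise ValueError."""
    # Layout assumed from Figure 10: Type (1), Length (2), OP (1), Value (4).
    sub_type, length, op, value = struct.unpack_from("!BHBI", data)
    if sub_type != 1:
        raise ValueError("not a MED Change sub-TLV")
    if length != 5:
        raise ValueError("corrupt MED Change sub-TLV: Length must be 5")
    return op, value


def apply_med_change(med: int, op: int, value: int) -> int:
    if op == 0:                       # assign
        return value
    if op == 1:                       # add, saturating at the maximum MED
        return min(med + value, MED_MAX)
    if op == 2:                       # subtract, saturating at 0
        return max(med - value, 0)
    return med                        # any other OP: the sub-TLV is ignored
```

A caller that catches ValueError from parse_med_change would ignore the enclosing UPDATE, in line with the Length rule above.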
569 The format of the AS-Path Change sub-TLV is illustrated below: 571 0 1 2 3 572 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 573 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 574 | Type 2 | Length (n x 5) | 575 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 576 | AS1 | 577 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 578 | Count1 | 579 +-+-+-+-+-+-+-+-+ 580 ~ . . . 581 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 582 | ASn | 583 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 584 | Countn | 585 +-+-+-+-+-+-+-+-+ 587 Figure 11: Format of AS-Path Change sub-TLV 589 Type: 2 for AS-Path Change. 591 Length: n x 5. If Length is not a multiple of 5, the sub-TLV is 592 corrupt and the enclosing UPDATE MUST be ignored. 594 ASi: 4 octets. An AS number. 596 Counti: 1 octet. ASi repeats Counti times. 598 The sequence of AS numbers is added to the existing AS Path. 600 4.3. Capability Negotiation 602 It is necessary to negotiate the capability to support BGP Extensions 603 for Routing Policy Distribution (RPD). The BGP RPD Capability is a 604 new BGP capability [RFC5492]. The Capability Code for this 605 capability is 72, assigned by IANA. The Capability Length field 606 of this capability is variable. The Capability Value field consists 607 of one or more of the following tuples: 609 +--------------------------------------------------+ 610 | Address Family Identifier (2 octets) | 611 +--------------------------------------------------+ 612 | Subsequent Address Family Identifier (1 octet) | 613 +--------------------------------------------------+ 614 | Send/Receive (1 octet) | 615 +--------------------------------------------------+ 617 Figure 12: BGP RPD Capability 619 The meaning and use of the fields are as follows: 621 Address Family Identifier (AFI): This field is the same as the one 622 used in [RFC4760].
624 Subsequent Address Family Identifier (SAFI): This field is the same 625 as the one used in [RFC4760]. 627 Send/Receive: This field indicates whether the sender is (a) willing 628 to receive Routing Policies from its peer (value 1), (b) would like 629 to send Routing Policies to its peer (value 2), or (c) both (value 3) 630 for the <AFI, SAFI>. If Send/Receive is any other value, that tuple 631 is ignored but any other tuples present are still used. 633 5. Operations 635 This section presents a typical application scenario and some details 636 about handling a related failure. 638 5.1. Application Scenario 640 Figure 13 illustrates a typical scenario, where RPD is used by a 641 controller with a Route Reflector (RR) to adjust traffic dynamically. 643 +--------------+ 644 | Controller | 645 +-------+------+ 646 \ 647 \ RPD 648 .--\._.+--+ ___...__ 649 __( \ '.---... ( ) 650 / RR o -------- A o) ---------- (o X AS2 ) 651 (o E |\ ) _____//(___ ___) 652 ( | \_______ B o) ____/ / ''' 653 (o F \ ) ____/ 654 ( \_____ C o) ______/ ___...__ 655 ' AS1 _) \_____ ( ) 656 '---._.-. ) \_______ (o Y AS3 ) 657 '---' (___ ___) 658 ''' 660 Figure 13: Controller with RR Adjusts Traffic 662 The controller connects to the RR through a BGP session. There is a BGP 663 session between the RR and each of routers A, B and C in AS1, which 664 is shown in the figure. Other sessions in AS1 are not shown in the 665 figure. 667 There is router X in AS2. There is a BGP session between X and each 668 of routers A, B and C in AS1. 670 There is router Y in AS3. There is a BGP session between Y and 671 router C in AS1. 673 The controller sends an RPD route to the RR. After receiving the RPD 674 route from the controller, the RR reflects the RPD route to routers 675 A, B and C. After receiving the RPD route from the RR, routers A, B 676 and C extract the routing policy from the RPD route.
If the peer IP 677 in the NLRI of the RPD route is 0, the routing policy is applied to 678 all the remote peers of routers A, B and C. If the peer IP in the 679 NLRI of the RPD route is non-zero, the IP address indicates a 680 remote peer of routers A, B and C and the routing policy is applied 681 to that specific remote peer. The IPv4/IPv6 unicast routes towards 682 router X in AS2 and router Y in AS3 will be adjusted based on the 683 routing policy sent by the controller via an RPD route. 685 The controller uses the RT extended community to notify a router 686 whether to receive an RPD policy. For example, if no 687 adjustment is needed on router B, the controller sends RPD routes with the RTs 688 for A and C. B will not receive the routes. 690 The process of adjusting traffic in a network is a closed loop. The 691 loop starts from the controller with some traffic expectations on a 692 set of routes. The controller obtains the information about traffic 693 flows for the related routes. It analyzes the traffic and checks 694 whether the current traffic flows meet the expectations. If the 695 expectations are not met, the controller adjusts the traffic. 696 The loop then returns to its start (The controller obtains 697 the information about traffic ...). 699 5.2. About Failure 701 This section describes some details about handling a failure related 702 to an RPD route being applied. 704 An RPD route is not a configuration. When it is sent to a router from 705 a controller, no ack is needed from the router. The existing BGP 706 mechanisms are re-used for delivering an RPD route. Once the route 707 is delivered to a router, the delivery is considered successful. This is guaranteed 708 by BGP. 710 If the router fails to install the route locally, 711 this failure is a bug in the router. The bug needs to be fixed. 713 The errors described in [RFC7606] are handled according to 714 [RFC7606].
These errors are bugs, which need to be resolved. 716 When the controller fails while an RPD route is being applied, for example 717 on the way to the router, some existing mechanisms such as BGP Graceful 718 Restart (GR) [RFC4724] and BGP Long-lived Graceful Restart (LLGR) can 719 be used to let the router keep the routes from the controller for 720 some time. 722 With support of "Long-lived Graceful Restart Capability" 723 [I-D.ietf-idr-long-lived-gr], the routes can be retained for a longer 724 time after the controller fails. 726 After the controller recovers from its failure, the router will have 727 all the routes (including the RPD route being applied) from the 728 controller. 730 In the worst case, the controller fails and the RPD routes for 731 adjusting the traffic are withdrawn. The traffic adjusted/redirected 732 may take its old path. This should be acceptable. 734 6. Contributors 736 The following people have substantially contributed to the definition 737 of the BGP-FS RPD and to the editing of this document: 739 Peng Zhou 740 Huawei 741 Email: Jewpon.zhou@huawei.com 743 7. Security Considerations 745 Protocol extensions defined in this document do not affect BGP 746 security other than as discussed in the Security Considerations 747 section of [RFC8955]. 749 8. Acknowledgements 751 The authors would like to thank Acee Lindem, Jeff Haas, Jie Dong, 752 Lucy Yong, Qiandeng Liang, Zhenqiang Li, Robert Raszuk, Donald 753 Eastlake, Ketan Talaulikar, and Jakob Heitz for their comments on 754 this work. 756 9. IANA Considerations 758 9.1. Existing Assignments 760 IANA has assigned an AFI of value 16398 from the registry "Address 761 Family Numbers" for Routing Policy. 763 IANA has assigned a SAFI of value 75 from the registry "Subsequent 764 Address Family Identifiers (SAFI) Parameters" for Routing Policy. 766 IANA has assigned a Code Point of value 72 from the registry 767 "Capability Codes" for Routing Policy Distribution. 769 9.2.
Registered IANA Wide Communities

771    IANA is requested to assign the following values from the
772    "Registered Wide Community Values" registry:

774     +---------------------+------------------------------+-------------+
775     | Community Value     | Description                  | Reference   |
776     +---------------------+------------------------------+-------------+
777     | TBD1                | MATCH AND SET ATTR           |This document|
778     +---------------------+------------------------------+-------------+
779     | TBD2                | MATCH AND NOT ADVISE         |This document|
780     +---------------------+------------------------------+-------------+

782 9.3.  RouteAttr Atom Type

784    IANA is requested to assign a code-point from the registry "BGP
785    Community Container Atom Types" as follows:

787     +---------------------+------------------------------+-------------+
788     | Atom Code Point     | Description                  | Reference   |
789     +---------------------+------------------------------+-------------+
790     | TBD3 (48 suggested) | RouteAttr Atom               |This document|
791     +---------------------+------------------------------+-------------+

793 9.4.  Route Attributes Sub-sub-TLV Registry

795    IANA is requested to create a registry called "Route Attributes Sub-
796    sub-TLV" under RouteAttr Atom Sub-TLV.  The allocation policy of this
797    registry is "First Come First Served (FCFS)".

799    The initial code points are as follows:

801     +-------------+-----------------------------------+-------------+
802     | Code Point  | Description                       | Reference   |
803     +-------------+-----------------------------------+-------------+
804     | 0           | Reserved                          |             |
805     +-------------+-----------------------------------+-------------+
806     | 1           | IPv4 Prefix Sub-sub-TLV           |This document|
807     +-------------+-----------------------------------+-------------+
808     | 2           | AS-Path Sub-sub-TLV               |This document|
809     +-------------+-----------------------------------+-------------+
810     | 3           | Community Sub-sub-TLV             |This document|
811     +-------------+-----------------------------------+-------------+
812     | 4           | IPv6 Prefix Sub-sub-TLV           |This document|
813     +-------------+-----------------------------------+-------------+
814     | 5 - 255     | Available                         |             |
815     +-------------+-----------------------------------+-------------+

817 9.5.  Attribute Change Sub-TLV Registry

819    IANA is requested to create a registry called "Attribute Change Sub-
820    TLV" under Parameter(s) TLV.  The allocation policy of this registry
821    is "First Come First Served (FCFS)".

823    Initial code points are as follows:

825     +-------------+-----------------------------------+-------------+
826     | Code Point  | Description                       | Reference   |
827     +-------------+-----------------------------------+-------------+
828     | 0           | Reserved                          |             |
829     +-------------+-----------------------------------+-------------+
830     | 1           | MED Change Sub-TLV                |This document|
831     +-------------+-----------------------------------+-------------+
832     | 2           | AS-Path Change Sub-TLV            |This document|
833     +-------------+-----------------------------------+-------------+
834     | 3 - 255     | Available                         |             |
835     +-------------+-----------------------------------+-------------+

837 10.  References

839 10.1.  Normative References

841    [I-D.ietf-idr-wide-bgp-communities]
842               Raszuk, R., Haas, J., Lange, A., Decraene, B., Amante, S.,
843               and P.
Jakma, "BGP Community Container Attribute", Work in
844               Progress, Internet-Draft, draft-ietf-idr-wide-bgp-
845               communities-06, 10 January 2022,
846               .

849    [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
850               Requirement Levels", BCP 14, RFC 2119,
851               DOI 10.17487/RFC2119, March 1997,
852               .

854    [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
855               Border Gateway Protocol 4 (BGP-4)", RFC 4271,
856               DOI 10.17487/RFC4271, January 2006,
857               .

859    [RFC4760]  Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
860               "Multiprotocol Extensions for BGP-4", RFC 4760,
861               DOI 10.17487/RFC4760, January 2007,
862               .

864    [RFC5492]  Scudder, J. and R. Chandra, "Capabilities Advertisement
865               with BGP-4", RFC 5492, DOI 10.17487/RFC5492, February
866               2009, .

868    [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
869               2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
870               May 2017, .

872    [RFC8955]  Loibl, C., Hares, S., Raszuk, R., McPherson, D., and M.
873               Bacher, "Dissemination of Flow Specification Rules",
874               RFC 8955, DOI 10.17487/RFC8955, December 2020,
875               .

877 10.2.  Informative References

879    [I-D.ietf-idr-long-lived-gr]
880               Uttaro, J., Chen, E., Decraene, B., and J. G. Scudder,
881               "Support for Long-lived BGP Graceful Restart", Work in
882               Progress, Internet-Draft, draft-ietf-idr-long-lived-gr-00,
883               5 September 2019, .

886    [I-D.ietf-idr-registered-wide-bgp-communities]
887               Raszuk, R. and J. Haas, "Registered Wide BGP Community
888               Values", Work in Progress, Internet-Draft, draft-ietf-idr-
889               registered-wide-bgp-communities-02, 31 May 2016,
890               .

893    [RFC4724]  Sangli, S., Chen, E., Fernando, R., Scudder, J., and Y.
894               Rekhter, "Graceful Restart Mechanism for BGP", RFC 4724,
895               DOI 10.17487/RFC4724, January 2007,
896               .

898    [RFC7606]  Chen, E., Ed., Scudder, J., Ed., Mohapatra, P., and K.
899               Patel, "Revised Error Handling for BGP UPDATE Messages",
900               RFC 7606, DOI 10.17487/RFC7606, August 2015,
901               .

903 Authors' Addresses

905    Zhenbin Li
906    Huawei
907    Huawei Bld., No.156 Beiqing Rd.
908    Beijing
909    100095
910    China

912    Email: lizhenbin@huawei.com

914    Liang Ou
915    China Telcom Co., Ltd.
916    109 West Zhongshan Ave,Tianhe District
917    Guangzhou
918    510630
919    China

921    Email: ouliang@chinatelecom.cn

923    Yujia Luo
924    China Telcom Co., Ltd.
925    109 West Zhongshan Ave,Tianhe District
926    Guangzhou
927    510630
928    China

930    Email: luoyuj@sdu.edu.cn

932    Sujian Lu
933    Tencent
934    Tengyun Building,Tower A ,No. 397 Tianlin Road
935    Shanghai
936    Xuhui District, 200233
937    China

939    Email: jasonlu@tencent.com

940    Gyan S. Mishra
941    Verizon Inc.
942    13101 Columbia Pike
943    Silver Spring, MD 20904
944    United States of America

946    Phone: 301 502-1347
947    Email: gyan.s.mishra@verizon.com

949    Huaimo Chen
950    Futurewei
951    Boston, MA,
952    United States of America

954    Email: Huaimo.chen@futurewei.com

956    Shunwan Zhuang
957    Huawei
958    Huawei Bld., No.156 Beiqing Rd.
959    Beijing
960    100095
961    China

963    Email: zhuangshunwan@huawei.com

965    Haibo Wang
966    Huawei
967    Huawei Bld., No.156 Beiqing Rd.
968    Beijing
969    100095
970    China

972    Email: rainsword.wang@huawei.com