Internet Engineering Task Force                               J. Scudder
Internet-Draft                                          Juniper Networks
Intended status: Standards Track                              C. Appanna
Expires: March 16, 2013                                  Arista Networks
                                                           I. Varlashkin
                                                 Easynet Global Services
                                                      September 12, 2012


                            Multisession BGP
                   draft-ietf-idr-bgp-multisession-07

Abstract

   This specification augments "Multiprotocol Extensions for BGP-4" (MP-BGP) by proposing a mechanism to facilitate the use of multiple sessions between a given pair of BGP speakers. Each session is used to transport routes related by some session-based attribute such as AFI/SAFI. This provides an alternative to the MP-BGP approach of multiplexing all routes onto a single connection.

   Use of this approach is expected to provide finer-grained fault management and isolation as the BGP protocol is used to support more and more diverse services.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on March 16, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Definitions
   3.  Overview of operations
   4.  Multisession BGP Capability Code
   5.  New NOTIFICATION Subcodes
   6.  Modified Connection Collision Handling
   7.  Connection establishment
   8.  Graceful restart
   9.  Error handling
   10. Operational considerations
   11. Backward Compatibility
   12. State Machine
   13. Discussion
   14. Security Considerations
   15. Acknowledgements
   16. IANA Considerations
   17. References
     17.1.  Normative References
     17.2.  Informative References
   Appendix A.  Multisession usage scenarios
     A.1.  Single session on both sides
     A.2.  Single session on one side, multiple sessions on the other
     A.3.  Multiple sessions based on AFI/SAFI
     A.4.  Multiple sessions based on arbitrary BGP Capabilities
     A.5.  Process level separation of multiple sessions
   Authors' Addresses

1.  Introduction

   Most BGP [RFC4271] implementations only permit a single ESTABLISHED connection to exist with each peer. More precisely, they only permit a single ESTABLISHED connection for any given pair of IP endpoints.

   BGP Capabilities [RFC5492] extend BGP to allow diverse information (encoded as "capabilities") to be associated with a session. In some cases, a capability may relate to the operation of the protocol machinery; an example is Route Refresh [RFC2918]. However, in other cases a capability may relate specifically to some common distinguishing characteristic of the routes carried over the session; an example is Multiprotocol BGP [RFC4760].

   Multiprotocol BGP [RFC4760] extends BGP to allow information for multiple NLRI families and sub-families to be transported in BGP. Routes for different families are distinguished by AFI and SAFI. Routes for different families are commonly multiplexed onto a single BGP session.

   A common criticism of BGP is the fact that most malformed messages cause the session to be terminated. While this behavior is necessary for protocol correctness, one may observe that the protocol machinery of a given implementation may only be defective with respect to a given AFI/SAFI. Thus, it would be desirable to allow the session related to that family to be terminated while leaving other AFI/SAFI unaffected. As BGP is commonly deployed, this is not possible.

   A second criticism of BGP is that it is difficult or in some cases impossible to manage control plane resource contention when BGP is used to support diverse services over a single session. In contrast, if a single BGP session carries only information for a single service (or related set of services) it may be easier to manage such contention.

   In this specification, we propose a mechanism by which multiple transport sessions may be established between a pair of peers. Each transport session is identified by a distinct set of BGP capabilities, notably the MP-BGP capability.

   Each session is distinct from a BGP protocol point of view; an error or other event on one session has no implications for any other session. All protocol modifications proposed by this specification take place during the OPEN exchange phase of the session; there are no modifications to the operation of the protocol once a session reaches ESTABLISHED state.

   Although AFI/SAFI is perhaps the most obvious way to group sets of routes being exchanged between BGP peers, sessions can also be distinguished by other BGP capabilities.
   In general, any capability used in this fashion would be expected to have semantics of identifying some common distinguishing characteristic of a set of routes, just as AFI/SAFI does; however, specifics are beyond the scope of this document. For simplicity, most examples in this document focus on grouping based on the MP-BGP capability (or, interchangeably, AFI/SAFI). Such use is illustrative and is not intended to limit actual applications of the multisession extension.

   Routers implementing this specification MUST also implement the base criteria that are used to define sessions. For example, if AFI/SAFI based sessions are desired, then routers implementing this specification MUST also implement MP-BGP [RFC4760].

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Definitions

   "MP-BGP capability" refers to the capability [RFC5492] with code 1, specified in MP-BGP [RFC4760] section 8.

   A BGP speaker is said to "support" some feature or functionality (for example, to support this specification, or to support a particular AFI/SAFI) when the BGP implementation supports the feature AND the feature has not been disabled by configuration.

   The Session Identifier is a capability or group of capabilities that will be used to differentiate individual BGP sessions between two IP endpoints. When the AFI/SAFI is used to distinguish sessions, the MP-BGP capability is the session identifier.

3.  Overview of operations

   To allow multiple sessions between the same pair of BGP speakers to co-exist, the BGP Multisession extension modifies the Connection Collision Detection procedure of the base BGP specification [RFC4271].
   Rather than considering only the IP addresses of the peers, the new procedure also takes into account a list of certain session attributes, such as AFI/SAFI, to determine the uniqueness of the sessions. When sessions are deemed to be unique, each of them is handled independently; therefore critical conditions (such as malformed UPDATEs) in one session won't affect others.

   The BGP Multisession extension introduces a new BGP capability code to indicate that a BGP speaker supports the protocol modification described in this document, and new error message sub-codes that facilitate handling of incompatible configurations between two speakers.

   The following sections provide a formal description of the protocol enhancement. Additionally, the Appendix contains non-normative examples of desired behaviour for Multisession-enabled BGP speakers, which are intended only for illustrative purposes.

4.  Multisession BGP Capability Code

   This specification defines the Multisession capability [RFC5492]:

   Capability code (1 octet): 68

   Capability length (1 octet): variable

   Capability value: Flags (1 octet) followed by the list of capabilities that define a session.

       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |G|  Reserved   |  Session Id   ~
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   G - the most significant bit was originally intended by an earlier draft version of the Multisession specification to denote the capability of a BGP speaker to group multiple capability values into one session. As this information can be deduced from the Session Id, the use of the G bit is deprecated: implementations conforming to the final version of the Multisession specification SHOULD NOT rely on the value of the G bit.

   Reserved - MUST be set to zero by the sender, MUST be ignored by the receiver.

   Session Id(entifier) - list of zero or more capability codes (1 octet each) defined in BGP, whose values will be used to distinguish one group from another. The size of the list is inferred from the length of the overall capability; it is the capability length minus one. The Multisession capability code itself MUST NOT be listed; if listed it MUST be ignored upon receipt.

   An empty Session Id list and a Session Id containing 1 (one, Multiprotocol Extensions) as the only value are considered equal and indicate that the AFI/SAFI list in the OPEN message is used to distinguish the groups. However, if a BGP speaker wishes to use a compound Session Id that includes the AFI/SAFI list as one of the components, then Capability Code 1 MUST be explicitly included in the Session Id. For example, if a BGP speaker sets the Session Id to 'X' (denoting Capability Foo), then only Foo will be used as the Session Id, i.e. a session where Foo is 1 and AFI/SAFI is 1/1 and a session where Foo is 1 and AFI/SAFI is 1/2 will be considered as conflicting. On the other hand, a Session Id set to '1 X' or 'X 1' indicates that groups are identified by the combination of Foo and AFI/SAFI, i.e. the above two sessions as well as a session where Foo is 2 and AFI/SAFI is 2/4 will be considered unique.

   For a given pair of BGP peers, the Multisession capability MUST be used either on all sessions or on none. This is required due to the different connection collision handling procedure used by multisession.

5.  New NOTIFICATION Subcodes

   BGP [RFC4271] Section 4.5 provides a number of subcodes to the NOTIFICATION message, and Section 6.2 elaborates on the use of those subcodes specific to the OPEN message.

   This specification introduces three new subcodes for the OPEN Message Error code:

      7 - Capability Value Mismatch - Session Id mismatch, i.e.
      the remote speaker wishes to use different capability codes in the Session Id compared to the local speaker

      8 - Grouping Conflict - values of capability codes used in the Session Id of the received message cannot be unambiguously mapped to a locally configured group

      9 - Grouping Required (from earlier drafts, perhaps should be removed if not used)

   BGP implementations conforming to this specification SHOULD use the new sub-codes as described further down in section "Connection establishment" of this document.

6.  Modified Connection Collision Handling

   A BGP speaker conforming to and actively using this specification MUST use the modified connection collision handling procedure as described in this section.

   Two sessions are said to collide if and only if both of the following conditions are true:

   1:  the IP addresses of the peers are the same on both sessions

   2:  values of capability codes used in the session identifier are either the same or overlapping (regardless of whether fully or partially) within a given capability code

   Otherwise the two sessions are considered unique and both MAY transition to the ESTABLISHED state (subject to the rest of the BGP specification).

   Before attempting to create a new session, the local system SHOULD evaluate existing sessions with the same peer. If there is already a session with the same peer in ESTABLISHED state and the new session would collide with it, the BGP speaker SHOULD NOT attempt to create the new session; it is a good idea to notify the operator of the local system about such a potential collision.

   Upon receipt of an OPEN message, a BGP speaker MUST evaluate existing sessions with the same peer. If there is already a session in ESTABLISHED state and the multisession distinguisher values of the old and the new OPEN messages fully match, the old session remains and the new one MUST be closed.
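
   As a non-normative illustration, the two collision conditions of this section can be sketched in Python. The names Session, sessions_collide and FOO (a stand-in code for capability 'X'), as well as the representation of each capability's values as a set, are assumptions made for this sketch and are not defined by this specification. The sketch also assumes that condition 2 must hold for every capability code in the Session Id, which is the reading consistent with the compound Session Id example in Section 4.

```python
from dataclasses import dataclass

MP_BGP = 1   # capability code of Multiprotocol Extensions [RFC4760]
FOO = 200    # hypothetical code standing in for capability 'X' (Foo)


@dataclass
class Session:
    """Hypothetical per-session record; not defined by this document."""
    local_ip: str
    peer_ip: str
    cap_values: dict  # capability code -> set of values sent in OPEN


def sessions_collide(a, b, session_id):
    """Return True if sessions a and b collide under the given Session Id."""
    # Condition 1: both sessions run between the same IP endpoints.
    if (a.local_ip, a.peer_ip) != (b.local_ip, b.peer_ip):
        return False
    # An empty Session Id is equivalent to the single code 1 (AFI/SAFI).
    codes = list(session_id) or [MP_BGP]
    # Condition 2 (assumed reading): for every code in the Session Id the
    # two value sets are the same or overlap, fully or partially.
    for code in codes:
        va = a.cap_values.get(code, set())
        vb = b.cap_values.get(code, set())
        if va != vb and not (va & vb):
            return False
    return True


# The three sessions from the compound Session Id example in Section 4.
s1 = Session("192.0.2.1", "192.0.2.2", {FOO: {1}, MP_BGP: {(1, 1)}})
s2 = Session("192.0.2.1", "192.0.2.2", {FOO: {1}, MP_BGP: {(1, 2)}})
s3 = Session("192.0.2.1", "192.0.2.2", {FOO: {2}, MP_BGP: {(2, 4)}})

print(sessions_collide(s1, s2, [FOO]))          # Session Id 'X'
print(sessions_collide(s1, s2, [MP_BGP, FOO]))  # Session Id '1 X'
print(sessions_collide(s1, s3, [MP_BGP, FOO]))  # Session Id '1 X'
```

   With Session Id 'X' the first two sessions collide because only Foo is compared; with Session Id '1 X' all three sessions are unique, as described in Section 4.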

   If there is a session in OpenConfirm or OpenSent state and the two sessions do not collide according to this document, then both sessions proceed as normal and section 6.8 of [RFC4271] MUST NOT be applied. If, on the other hand, the two sessions collide according to the definition in this document, then the original procedure from section 6.8 of [RFC4271] MUST be applied, except for the NOTIFICATION type. Whereas the original specification prescribes the 'Cease' error code, a multisession-enabled BGP speaker SHOULD send a NOTIFICATION message as described in this document.

7.  Connection establishment

   When BGP Multisession is enabled by configuration for a given peer and the configuration dictates that multiple sessions can potentially be established with that peer, the BGP speaker MUST advertise the Multisession Capability code in the OPEN message on every session with that peer. In all other cases the Multisession capability SHOULD NOT be advertised. The value of the Session Id MUST be the same on every session.

   When a Multisession-enabled BGP speaker receives an OPEN message without the BGP Multisession Capability code, it MUST assume that the peer is not capable of multiple sessions and MUST use the original Connection Collision Detection procedure as described in section 6.8 of [RFC4271].

   When a Multisession-enabled BGP speaker receives an OPEN message containing the BGP Multisession Capability Code but with a Session Id not matching its own Session Id, the local BGP speaker MUST send a NOTIFICATION message with Error Code set to 2 ("OPEN Message Error") and Error Sub-code set to 8 ("Grouping Conflict") and drop the session. If the received Session Id matches the locally configured Session Id, then the BGP speaker MUST verify whether this session would collide with any of the existing sessions as described in section "Modified Connection Collision Handling".
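
   The Session Id comparison described above can be sketched as follows (non-normative). The function names are hypothetical, and the sketch assumes that the order of codes in the Session Id is not significant, since Section 4 treats '1 X' and 'X 1' alike. It returns sub-code 8 because that is what the paragraph above prescribes; note that Section 5 also defines sub-code 7 for a Session Id mismatch.

```python
MULTISESSION = 68  # Multisession capability code (Section 4)
MP_BGP = 1         # Multiprotocol Extensions capability code [RFC4760]


def parse_multisession_value(value):
    """Split a Multisession capability value into (flags, Session Id codes)."""
    if len(value) < 1:
        raise ValueError("capability value must contain the flags octet")
    flags = value[0]
    # One octet per capability code; code 68 itself is ignored if listed.
    codes = [code for code in value[1:] if code != MULTISESSION]
    return flags, codes


def normalize_session_id(codes):
    """Map a Session Id list to a comparable set.

    An empty list is treated as {1}. Ordering is assumed not to matter.
    """
    return frozenset(codes) or frozenset({MP_BGP})


def session_id_notification(local_codes, received_value):
    """Return None on a match, else (Error Code, Error Sub-code)."""
    _flags, received_codes = parse_multisession_value(received_value)
    if normalize_session_id(received_codes) == normalize_session_id(local_codes):
        return None
    return (2, 8)  # OPEN Message Error / Grouping Conflict


# Local Session Id is empty (AFI/SAFI based); the peer lists code 1 explicitly.
print(session_id_notification([], bytes([0x00, 1])))
# Local Session Id is 'X' only (hypothetical code 200); the peer sends '1 X'.
print(session_id_notification([200], bytes([0x00, 1, 200])))
```

   The G bit and the reserved bits are parsed but deliberately not interpreted, in line with Section 4.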

   If the session is allowed to continue by the connection collision detection procedure, the next step for the local speaker is to find a matching group as follows:

   1.  If the BGP capability code values used in the Session Id of the received message match exactly (i.e. for every value in the received OPEN message there is a corresponding value in a locally configured group), then the local BGP speaker proceeds with this session.

   2.  If the values in the received message do not match any of the locally configured groups exactly, but there is one and only one locally configured group such that for every capability code the intersection between received and local values is a non-empty set, then this group is selected for continuing the session. Note that such a partial match results in behaviour similar to non-multisession BGP when capability codes overlap partially. The rationale behind allowing only one group for partial matching is that it simplifies specification and implementation; from an operational perspective, multiple partially matching groups suggest a significant discrepancy in configuration between peers and are therefore unlikely to be required in real-life networks.

   3.  In all other cases the local BGP speaker MUST send a NOTIFICATION message with Error Code set to 2 (OPEN Message Error) and Error Sub-code set to 8 (Grouping Conflict).

   Once the local BGP speaker has identified which locally configured group corresponds to the received OPEN message, it proceeds with the session as it would with a regular non-multisession one; in particular, the original Finite State Machine applies. The BGP speaker is free to handle such a session either in the same process/thread as the one that received the OPEN message, or it can hand over the connection to another process/thread. If used, the connection handover is a local matter of the BGP implementation and not part of this specification.
   The Appendix contains an example of how such a handover could be done.

8.  Graceful restart

   With respect to Section 4.2 of BGP Graceful Restart [RFC4724], when evaluating a new connection, a BGP speaker considers the values of all capability codes used in the Session Identifier.

9.  Error handling

   If a multisession-enabled BGP speaker detects an error condition that warrants a session reset, it SHOULD reset only the session that was affected by the error. Resetting other sessions with the same peer would significantly diminish the value of the multisession extension.

10.  Operational considerations

   The Multisession feature SHOULD be disabled by default. A BGP implementation SHOULD provide a configuration-time option to enable the multisession extension on a per-peer basis. If a BGP implementation supports non-trivial groups, then it SHOULD provide a configuration-time option for the operator to control how sessions are grouped. An example of such an option would be the possibility for an operator to specify which address families will be carried in one session, and which address families will be carried in another session.

   A BGP implementation supporting the multisession extension SHOULD allow the operator to view the state of each individual group and at least the last NOTIFICATION message that caused a connection reset.

   For the sake of interoperability between BGP speakers supporting multisession, an implementation SHOULD NOT impose hard-coded restrictions on how groups based on a particular Session Id are put together. If such restrictions are unavoidable, then the BGP implementation MUST support at least trivial groups based on that attribute. Let's consider an example.
   If implementation A requires AFI/SAFI 1/1 and 1/4 to always be carried within the same session, while implementation B requires AFI/SAFI 1/4 to always be carried only with 1/128 and not with any other, then it is not possible to establish a session between such BGP speakers. However, if implementations A and B both allow each AFI/SAFI to be carried in its own group, then three sessions can be established: one for AFI/SAFI 1/1, another for AFI/SAFI 1/4 and a third for AFI/SAFI 1/128.

11.  Backward Compatibility

   This subsection discusses a BGP speaker's behavior towards a peer that is known or assumed not to support this specification. In short, the BGP speaker's behavior towards such a peer should be as otherwise defined for the BGP protocol, according to [RFC4271] and any other extension supported by the BGP speaker.

   If a BGP speaker receives an OPEN message that doesn't include the Multisession Capability and the local BGP speaker is required to use multisession (e.g. through configuration by the operator), the local BGP speaker MUST drop the session and send an appropriate NOTIFICATION message as described in Section 5. If multisession is not required, the local BGP speaker proceeds with the multisession extension disabled, so it appears as a regular implementation to the peer.

   As previously mentioned, the BGP speaker SHOULD always advertise the Multisession capability in its OPEN message, even towards "backward compatibility" peers.

   Use of techniques such as dynamic capabilities [I-D.ietf-idr-dynamic-cap] for on-the-fly switching of session modes is beyond the scope of this document.

12.  State Machine

   This specification does not modify the BGP FSM as such, but all references to execution of the collision handling procedure of the original BGP specification are replaced with a call to the collision handling procedure described in this document.

   The specific state machine modifications to [RFC4271] Section 8.2.2 are as follows.

13.  Discussion

   Note that many BGP implementations already permit multiple sessions to be used between a given pair of routers, typically by configuring multiple IP addresses on each router and configuring each session to be bound to a different IP address. The principal contribution of this specification is to allow multiple sessions to be created automatically, without additional configuration overhead or address consumption.

   The specification supports the simple case of one capability being used as the session identifier and one connection per session identifier value. It also permits connections to be established based on multiple capabilities as a session identifier with multiple values per capability grouped together per connection.

   In the context of MP-BGP based connections, which we believe may be the most prevalent use of this specification, it permits supporting one AFI/SAFI per connection, and also permits arbitrary grouping of AFI/SAFI onto BGP connections. For such grouping to function as intended, both peers participating in a connection need to agree on what AFI/SAFI groupings will be used. If conflicting groupings are configured, the connections may not establish, or more connections may be established than were expected (in the degenerate case, one connection per AFI/SAFI could be established despite configured groupings). We observe that the potential for misbehavior in the presence of conflicting configuration is not unusual in BGP, and that support for, and configuration of, grouping is purely optional.

14.  Security Considerations

   This document does not change the BGP security model.

15.  Acknowledgements

   The authors would like to thank Martin Djernaes, Pedro Marques, Keyur Patel, Robert Raszuk, Yakov Rekhter, David Ward and Anton Elita for their valuable comments.

16.  IANA Considerations

   IANA has allocated BGP Capability Code 68 as the Multisession BGP Capability.

   This document requests IANA to allocate three new OPEN Message Error subcodes:

      7 - Capability Value Mismatch

      8 - Grouping Conflict

      9 - Grouping Required

17.  References

17.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC4724]  Sangli, S., Chen, E., Fernando, R., Scudder, J., and Y. Rekhter, "Graceful Restart Mechanism for BGP", RFC 4724, January 2007.

   [RFC4760]  Bates, T., Chandra, R., Katz, D., and Y. Rekhter, "Multiprotocol Extensions for BGP-4", RFC 4760, January 2007.

   [RFC5492]  Scudder, J. and R. Chandra, "Capabilities Advertisement with BGP-4", RFC 5492, February 2009.

17.2.  Informative References

   [I-D.ietf-idr-dynamic-cap]  Ramachandra, S. and E. Chen, "Dynamic Capability for BGP-4", draft-ietf-idr-dynamic-cap-14 (work in progress), December 2011.

   [RFC2918]  Chen, E., "Route Refresh Capability for BGP-4", RFC 2918, September 2000.

Appendix A.  Multisession usage scenarios

   This section demonstrates usage of the Multisession Extension in several common scenarios. All examples are presented here for illustrative purposes only; they are not part of the Multisession specification.

A.1.  Single session on both sides

   BGP Speaker A and BGP Speaker B are both configured to exchange IPv4 unicast (AFI=1, SAFI=1) and IPv4 L3VPN (AFI=1, SAFI=128) prefixes over a single session.
   If the Multisession extension is disabled by configuration on both sides, then the session is, from every perspective, indistinguishable from ordinary (non-multisession) BGP peering. If only one of the speakers is enabled (through configuration) for multisession, and yet with only one session to multiplex both AFI/SAFI, then again only a single session is established and it looks like a normal session. Although the multisession-enabled BGP speaker is capable of processing the new NOTIFICATION sub-codes, the other side (non-multisession) won't take advantage of them. On the other hand, use of the new NOTIFICATION sub-codes isn't necessary in this situation because both sides keep all AFI/SAFI within the same session. Finally, if both speakers are multisession-enabled, they still set up a single session, but now they can use the new NOTIFICATION sub-codes for more sophisticated error handling.

   Note that if both speakers are configured to use only a single session and their respective AFI/SAFI lists overlap but do not match exactly, then, as with ordinary (non-multisession) BGP speakers, the session will transition to ESTABLISHED state. It is possible that one of the speakers (or both) requires an exact match of AFI/SAFI lists in order to establish a session (either by implementation or through configuration). In this case such a speaker will send a NOTIFICATION message with Error Code 2 (OPEN Message Error) and Sub-code 8 (Grouping Conflict) and subsequently close the session.

A.2.  Single session on one side, multiple sessions on the other

   In this setup Speaker A is configured to carry IPv4 unicast (AFI=1, SAFI=1) and IPv4 L3VPN (AFI=1, SAFI=128) prefixes within a single session, while Speaker B is configured with two sessions: one for IPv4 unicast and a second for IPv4 L3VPN.
   Several scenarios are possible depending on which speaker sends the OPEN message first and whether Speaker A is multisession-enabled or not.

   Assume that Speaker A is not multisession-enabled, that it sends its OPEN message first, and that there is no existing session between these two peers. Speaker B determines that the OPEN message lists both AFI/SAFI, and it knows that it wants to split them into different sessions; therefore it is obvious that the setup cannot function as intended. Since separation of the two address families into two groups is performed by the operator (as per the Multisession Extension specification), the most appropriate action is to prevent any communication between Speaker A and B until the operator intervenes and resolves the conflict in configuration. To do this, BGP Speaker B sends a NOTIFICATION message with Error Code 6 (Cease), because the peer is not expected to understand the new notification sub-codes. If Speaker A were multisession-enabled, then Speaker B would send a NOTIFICATION message with Error Code 2 (OPEN Message Error) and Error Subcode 9 (Grouping Required).

   Now let's consider the reverse situation: Speaker B sends an OPEN message for either AFI/SAFI first. Assuming Speaker A is not multisession-enabled, it will accept an OPEN message containing either AFI/SAFI and will reply with an OPEN message containing both AFI/SAFI. Although the session might transition for a brief period to ESTABLISHED state, Speaker B upon receipt of the OPEN message will detect the misconfiguration and send a NOTIFICATION message with Error Code 6 (Cease) as in the previous paragraph. If Speaker A were multisession-enabled, it could detect the misconfiguration on its own and send a NOTIFICATION message with Error Code 2 (OPEN Message Error) and Error Subcode 8 (Grouping Conflict).
There is a possibility that Speaker A opens one TCP connection and
sends its OPEN message while, simultaneously, Speaker B opens one or
two TCP connections and sends an OPEN message on each of them. Since
Speaker A is not multisession-enabled, it will invoke the original
collision detection procedure and will drop one of the sessions.
Speaker B, seeing a NOTIFICATION message with the Cease error code,
concludes that Speaker A is not multisession-capable and that the
setup prescribed by Speaker B's configuration cannot be achieved.
Depending on the implementation of Speaker B, a session for one of the
AFI/SAFI may progress to the ESTABLISHED state, but Speaker B will
inform the operator about the incompatible configuration.

It's also possible that Speaker B was initially configured with only
one AFI/SAFI, e.g. IPv4 unicast. The session between the two peers
would come up as described in the previous subsection. Now suppose
Speaker B is configured with an additional session to carry IPv4 L3VPN
prefixes. Since Speaker A does not have multiple sessions configured,
it won't send another OPEN message as long as the first session is in
the ESTABLISHED state. Therefore the only possibility is that Speaker
B attempts to establish a second connection and sends a new OPEN
message containing only the IPv4 L3VPN AFI/SAFI. If Speaker A is not
multisession-enabled, it will drop the second session by sending a
NOTIFICATION message. From this Speaker B can conclude that the
configurations of the two sides are incompatible; it will stop
attempting to bring up the IPv4 L3VPN session and will notify the
operator. The already ESTABLISHED session may remain unaffected
(subject to Speaker B's implementation), just as with
non-multisession speakers.

A.3.  Multiple sessions based on AFI/SAFI

The most common use of the Multisession extension is to separate
prefixes based on AFI/SAFI.
Note that the use of AFI/SAFI-based groups is denoted by an empty
Optional Data field in the Multisession Capability, which is the same
as in the previous two sections. The grouping configuration is
derived from the AFI/SAFI lists actually advertised (MP-BGP
Capability). This is demonstrated in the following examples.

Consider BGP Speaker A and BGP Speaker B, both configured to exchange
IPv4 unicast, IPv4 labelled-unicast and IPv4 L3VPN prefixes, each in
its own session. We start with no existing sessions between these
speakers. Speaker A (though the roles can be reversed) sends an OPEN
message in which, among other capabilities, it announces the MP-BGP
Capability for AFI=1 SAFI=1 and the Multisession Capability with an
empty Optional Data field. Upon receipt of this message, Speaker B
finds that it expects to exchange IPv4 unicast with Speaker A in a
dedicated session. It accepts the connection and sends a similar OPEN
message to Speaker A. As there were no existing sessions, the
collision handling procedure is not invoked at this time. Next,
Speaker A (but again it could be Speaker B) starts a new TCP
connection to Speaker B and sends an OPEN message with the MP-BGP
Capability for AFI=1 SAFI=4 and the Multisession Capability with an
empty Optional Data field. Speaker B is willing to exchange IPv4
labelled-unicast too, but before accepting the proposal it executes
the collision detection procedure. Since the AFI/SAFI lists of the
old (ESTABLISHED) session and the new session are different, the
sessions don't collide, and Speaker B, by sending an OPEN message with
AFI=1 SAFI=4, brings the second session to the ESTABLISHED state. The
third session, for AFI=1 SAFI=128, is brought up in the same way.
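
The collision check used in this walk-through can be illustrated with
a small Python sketch. It is a simplification under stated
assumptions: the Session Id is modelled as the unordered set of
(AFI, SAFI) pairs carried in an OPEN message, the function names
session_id, collides and bring_up are hypothetical, and resolution of
an actual collision is not modelled.

```python
from typing import FrozenSet, Iterable, List, Tuple

AfiSafi = Tuple[int, int]


def session_id(afi_safi_list: Iterable[AfiSafi]) -> FrozenSet[AfiSafi]:
    """Model the Session Id as the unordered set of advertised AFI/SAFI pairs."""
    return frozenset(afi_safi_list)


def collides(existing: List[FrozenSet[AfiSafi]], new_open: Iterable[AfiSafi]) -> bool:
    """True if an existing session already uses the Session Id of the new OPEN."""
    return session_id(new_open) in existing


def bring_up(opens: Iterable[Iterable[AfiSafi]]) -> List[FrozenSet[AfiSafi]]:
    """Process OPENs in order, keeping those that do not collide."""
    sessions: List[FrozenSet[AfiSafi]] = []
    for afi_safi_list in opens:
        if not collides(sessions, afi_safi_list):
            sessions.append(session_id(afi_safi_list))
        # A colliding OPEN would go through collision resolution, not modelled here.
    return sessions
```

Calling bring_up([[(1, 1)], [(1, 4)], [(1, 128)]]) keeps all three
sessions, because no two of the AFI/SAFI lists are equal; a second
OPEN for [(1, 1)] would be reported as a collision.
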
Note that similar behaviour will also be observed if the two speakers
send OPEN messages simultaneously: the modified collision handling
procedure introduced by the Multisession Extension specification will
mark sessions as unique based on the difference in Session Id
(different AFI/SAFI lists). If Speaker A opens a TCP connection and
sends an OPEN message for a given AFI/SAFI, and simultaneously Speaker
B opens a TCP connection and sends an OPEN message for the same
AFI/SAFI, then the modified collision handling procedure will resolve
the conflict just as the original procedure would in a
non-multisession environment. At the same time, the modified
collision handling procedure allows sessions with distinct Session Ids
to coexist without affecting each other. This behaviour also applies
to more complex cases where groups include more AFI/SAFI or are based
on different Capability Codes altogether. For this reason collision
handling is not discussed in the remaining scenarios.

Now suppose Speaker A's configuration is as above, but Speaker B is
configured to combine labelled-unicast and L3VPN prefixes into the
same session. The IPv4 unicast session is brought up as above. Next
there are two possible alternatives. Either Speaker A sends an OPEN
message for one of the remaining sessions, to which Speaker B responds
with a NOTIFICATION message with Error Code 2 and Error Subcode 8; or
Speaker B sends an OPEN message for the combined session including
both of the remaining address families, to which Speaker A responds
with exactly the same NOTIFICATION message. In the end only the IPv4
unicast session remains in the ESTABLISHED state, while the two other
address families require the operator's intervention: either Speaker
A must be configured with a combined session for labelled-unicast and
L3VPN, or Speaker B with one session per AFI/SAFI.
Note that if Speaker B used an implementation that requires the
labelled-unicast and L3VPN address families to be combined into a
single session, the behaviour of each side would be exactly as above.

If Speaker A had no L3VPN configuration for Speaker B at all, then
whether the second session would progress to ESTABLISHED depends on
whether the configuration of either side requires an exact match
between groups. By default, implementations are expected to mimic the
original BGP behaviour, which brings up the overlapping AFI/SAFI
without requiring an exact match, but some implementations may provide
a configuration knob to require an exact match.

Finally, we look at the case where the AFI/SAFI lists of different
configured sessions overlap. Suppose Speaker A is configured with the
following groups: group 1 AFI=1 SAFI=1; group 2 AFI=1 SAFI=4 and
SAFI=128; group 3 AFI=2 SAFI=4. Speaker B is configured as: group 1
AFI=1 SAFI=1; group 2 AFI=1 SAFI=4; group 3 AFI=1 SAFI=128 and AFI=2
SAFI=4. For simplicity's sake we assume that group 1 is brought up
first; both speakers behave as already described in the previous case.
Next, let Speaker A be the first to set up a second TCP connection and
send an OPEN message for group 2. Applying the collision handling
procedure as defined in the Multisession specification, Speaker B
continues processing the received OPEN message. If Speaker B is
configured for a strict match between the groups, it will detect the
incompatibility between the AFI/SAFI list in the received message and
its own configuration, and will therefore send a NOTIFICATION message
with Error Code 2 and Error Subcode 8.
If, on the other hand, Speaker B allows partial overlap between the
received AFI/SAFI list and its own (as a regular BGP implementation
would in the absence of multisession), it will reply with an OPEN
message that lists AFI=1 SAFI=4, and the session potentially
progresses to the ESTABLISHED state, provided that Speaker A doesn't
require an exact match on the AFI/SAFI list. The same applies to
session 3 for the remaining AFI/SAFI. Note that the configuration for
exact or partial match between AFI/SAFI lists is the same for all
sessions between given peers.

A.4.  Multiple sessions based on arbitrary BGP Capabilities

Although grouping based on arbitrary attributes is the most
comprehensive scenario, the behaviour of the BGP speakers is
essentially the same as in the case of AFI/SAFI-based groups.
However, arbitrary groups do add extra complexity, because BGP
speakers need to consider not only the values of a single capability
but also need to agree upon the Capability Codes that constitute the
Session Id. The following example demonstrates the behaviour of
multisession-enabled BGP speakers in a situation where the Session Id
on each side is based on different capabilities.

Suppose there is an imaginary Capability Code X that denotes an
Experiment Id, and two speakers would like to exchange IPv4 unicast
and L3VPN prefixes for two experiments. Speaker A would like to group
prefixes into separate sessions based solely on Experiment Id (so two
sessions with two AFI/SAFI in each), while Speaker B would like to
have a separate session per experiment per AFI/SAFI (so four sessions
with one AFI/SAFI in each). Since the Session Id involves an
attribute other than AFI/SAFI, the Optional Data field in the
Multisession Capability will be non-empty.
The Multisession Capability sent by Speaker A will contain only the
"Experiment Id Capability Code" in the Optional Data, whereas Speaker
B will put there both the "Experiment Id Capability Code" and "MP-BGP
(AFI/SAFI)". When either speaker receives an OPEN message from its
peer, it will notice the mismatch in the content of the Optional Data
field and, since the sessions cannot be established as intended, will
send a NOTIFICATION message with Error Code 2 and Subcode 7, after
which the session will be dropped. Both speakers will notify the
operator and will suppress further attempts to bring the session up
until the configuration of either side changes.

Note that although the Multisession Capability does not contain a
field to denote support for non-AFI/SAFI-based groups, even an
implementation that does not support groups based on arbitrary
capability codes will be able to recognise the configuration mismatch
and provide sufficient information to the peer, as described above.

A.5.  Process level separation of multiple sessions

As fault isolation is the key motivation for the Multisession
Extension, it's natural to consider process-level separation between
the sessions. Although the Multisession specification itself does not
prescribe any particular way of handling each session, BGP
implementations can leverage IPC facilities provided by host operating
systems to hand over an arbitrary session to the appropriate process.
For example, many systems can pass a connection from the process that
accepted the TCP connection to a process dedicated to a particular
group, using a specially crafted message on a Unix socket. This is
somewhat akin to inetd, but based on the content of the OPEN message
(e.g. the AFI/SAFI list) rather than on transport protocol properties
(e.g. TCP/UDP port numbers).
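
One way such a handover could look is sketched below in Python. This
is only an illustration, assuming Python 3.9 or later on a Unix-like
system, where socket.send_fds and socket.recv_fds pass a descriptor as
ancillary data on a Unix socket. The helper names hand_over, take_over
and demo, and the group label, are hypothetical. For brevity both ends
of the Unix socket live in one process here; in a real implementation
they would belong to the accepting process and the per-group process.

```python
import socket


def hand_over(channel: socket.socket, conn: socket.socket, group_label: str) -> None:
    """Send an accepted connection's descriptor, plus a label, over a Unix socket."""
    socket.send_fds(channel, [group_label.encode()], [conn.fileno()])


def take_over(channel: socket.socket):
    """Receive one descriptor and rebuild a socket object around it."""
    data, fds, _flags, _addr = socket.recv_fds(channel, 1024, 1)
    return data.decode(), socket.socket(fileno=fds[0])


def demo():
    # The accepting side: listen on loopback and accept one TCP connection.
    listener = socket.create_server(("127.0.0.1", 0))
    peer = socket.create_connection(listener.getsockname())
    conn, _ = listener.accept()

    # Unix socket pair standing in for the IPC channel between two processes.
    dispatcher_end, worker_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

    hand_over(dispatcher_end, conn, "afi=1/safi=128")
    conn.close()  # the accepting side no longer needs its own copy

    label, worker_conn = take_over(worker_end)
    peer.sendall(b"OPEN")            # stand-in for a real BGP OPEN message
    received = worker_conn.recv(4)   # read through the handed-over descriptor

    for sock in (worker_conn, peer, listener, dispatcher_end, worker_end):
        sock.close()
    return label, received
```

The label travels as ordinary data next to the descriptor, so the
receiving side knows which group the connection belongs to without
parsing the OPEN message again.
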
At one extreme, the process that initially accepts the TCP connection
may be very primitive and leave even connection collision handling to
a specialized process; at the other, that process could handle
collision detection itself, or even handle a particular group on its
own while passing only specific groups to another process. This
process-level separation is a local implementation matter and does not
require specific aid from BGP at the protocol specification level.
Therefore process-level separation is not part of the multisession
specification.

Authors' Addresses

John G. Scudder
Juniper Networks

Email: jgs@juniper.net

Chandra Appanna
Arista Networks

Email: achandra@aristanetworks.com

Ilya Varlashkin
Easynet Global Services

Email: ilya.varlashkin@easynet.net