INTERNET-DRAFT                                               D. Eastlake
Intended status: Proposed Standard                Futurewei Technologies
Expires: July 22, 2022                                  January 23, 2022


       Service Function Chaining (SFC) Parallelism and Diversions
                     draft-eastlake-sfc-parallel-02


Abstract

   Service Function Chaining (SFC) is the processing of packets through
   a sequence of Service Functions (SFs) within an SFC domain by the
   addition of path information and metadata on entry to that domain,
   the use and modification of that path information and metadata to
   step the packet through a sequence of SFs, and the removal of that
   path information and metadata on exit from that domain.
   The IETF has standardized a method for SFC using the Network Service
   Header specified in RFC 8300.

   There are requirements for SFC to process packets through parallel
   sequences of service functions and to easily splice in additional
   service functions or splice service functions out of a service chain.
   This document provides use cases for these requirements and
   extensions to SFC to support them.


Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Distribution of this document is unlimited. Comments should be sent
   to the SFC Working Group mailing list or to the authors.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   https://www.ietf.org/1id-abstracts.html. The list of Internet-Draft
   Shadow Directories can be accessed at
   https://www.ietf.org/shadow.html.


Table of Contents

   1. Introduction
   1.1 Conventions Used in This Document

   2. Service Function Chaining Background
   2.1 The Network Service Header (NSH)
   2.2 NSH Metadata and Variable Length Context Headers

   3. Requirements for Parallelism and Diversions

   4.
   Diversion Points and Rendezvous Points
   4.1 Rendezvous Point Information (RePIn)
   4.1.1 Packet Identifier
   4.1.2 Packet Extent Modified
   4.1.3 Saved Metadata
   4.1.4 Saved TTL
   4.2 Diversion Point (DP) Behavior
   4.3 Rendezvous Point (RP) Behavior

   5. IANA Considerations
   5.1 Variable Length Context Header Type
   5.2 RePIn VLCH Sub-Types

   6. Security Considerations

   Normative References
   Informative References

   Appendix: Relation to Hierarchical SFC

   Acknowledgements
   Authors' Addresses


1. Introduction

   Service Function Chaining (SFC) is the processing of packets through
   a sequence of Service Functions (SFs) within an SFC domain by the
   addition of path information and metadata on entry to that domain,
   the use and modification of that path information and metadata to
   step the packet through a sequence of SFs, and the removal of that
   path information and metadata on exit from that domain. The IETF has
   standardized a method for SFC using the Network Service Header
   specified in [RFC8300].

   There are requirements for SFC to process packets through parallel
   sequences of service functions and to easily splice in additional
   service functions or splice service functions out of a service chain.
   This document provides use cases for these requirements and
   extensions to SFC to support them.

1.1 Conventions Used in This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   Acronyms and terms:

      downstream - The direction from ingress to egress.

      diversion - A reclassification of an SFC packet into one or
         multiple parallel packets with a different SPI where this
         reclassified packet or packets are, in the normal case,
         combined at a downstream rendezvous point which restores the
         original SPI.

      DP - Diversion Point - An SF implementing a diversion.

      MD - Metadata - Part of the NSH.

      NSH - Network Service Header [RFC8300].

      rendezvous - The process of taking one or more corresponding SFC
         packets that have been diverted at an upstream DP, combining
         the packets if there is more than one, and restoring the
         original SPI.

      RePIn - Rendezvous Point Information. Metadata included in an SFC
         packet for use at an RP.

      RP - Rendezvous Point - An SF implementing a rendezvous.

      SF - Service Function [RFC7665].

      SFC - Service Function Chaining [RFC7665].

      SFF - Service Function Forwarder [RFC7665] - A type of node that
         forwards based on the NSH.

      SFP - Service Function Path.

      SI - Service Index - Part of the NSH.

      SPI - Service Path Identifier - Part of the NSH.

      TLV - Type Length Value.

      upstream - The direction from egress to ingress.

      VLCH - Variable Length Context Header - A type of NSH metadata.


2.
Service Function Chaining Background

   Service Function Chaining (SFC) calls for the encapsulation of
   traffic within a service function chaining domain using a Network
   Service Header (NSH [RFC8300]). The NSH is added by the "Classifier"
   (ingress node) on entry to the domain and removed on exit from the
   domain at the downstream egress node, as shown in Figure 1. The NSH
   controls the path of a packet in an SFC domain and includes
   additional information.

                  |
                  v
             +----------+
          . .|Classifier|. . . . . . . . . . . . . .
          .  +----------+                          .
          .       |          +----+                .
          .       |        --+ SF |   Service      .
          .       |       /  +----+   Function     .
          .       v    ---            Chaining     .
          .    +-----+/       +----+  domain       .
          .    | SFF |--------+ SF |               .
          .    +-----+\       +----+               .
          .       |    ---                         .
          .       |       \  +----+                .
          .       |        --+ SF |                .
          .       v          +----+                .
          .    +-----+                 +----+      .
          .    | SFF |-----------------+ SF |      .
          .    +-----+                 +----+      .
          .       |          +----+                .
          .       |        --+ SF |                .
          .       |       /  +----+                .
          .       v    ---                         .
          .    +-----+/       +----+               .
          .    | SFF |--------+ SF |               .
          .    +-----+\       +----+               .
          .       |    ---                         .
          .       |       \  +----+                .
          .       |        --+ SF |                .
          .       v          +----+                .
          .    +------+                            .
          . . .| Exit |. . . . . . . . . . . . . . .
               +------+
                  |
                  v

            Figure 1. Example SFC Path Forwarding Nodes

   Traffic passes through a sequence of Service Function Forwarders
   (SFFs), each of which sends the traffic to one Service Function (SF)
   or, sequentially, to more than one SF. Each SF performs some
   operation on the traffic, for example firewalling, Network Address
   Translation (NAT), or load balancing, and then returns it to the SFF
   from which it was received. There may be multiple instances of SFs
   performing the same function attached to the same or different SFFs.

   Logically, during the transit of an SFF, the outer transport header
   that got the packet to the SFF is stripped (see Figure 2). The SFF
   then decides on the next forwarding step: (1) adding a new transport
   header and forwarding the packet; (2) in case of error, discarding
   or logging the packet and not forwarding it; or (3) if the SFF is
   the exit/egress, removing the NSH and then adding a new transport
   header. The transport used may be different in different regions of
   the SFC domain. For example, a version of the Internet Protocol (IP)
   could be used in some parts and Multi-Protocol Label Switching
   (MPLS) in other parts of the SFC domain.

                 +-----------------------------------+
                 |      Outer Transport Header       |
                 +-----------------------------------+
                 |   Network Service Header (NSH)    |
                 | +------------------------------+  |
                 | | Base Header                  |  |
                 | +------------------------------+  |
                 | | Service Path Header          |  |
                 | +------------------------------+  |
                 | | Metadata (Context Header(s)) |  |
                 | +------------------------------+  |
                 +-----------------------------------+
                 | Original Packet / Frame / Payload |
                 +-----------------------------------+

               Figure 2. Data Encapsulation with the NSH

   An SF can receive one or more SFC packets from an SFF and return to
   it a larger or smaller number of SFC packets; that is to say, SFC
   packets can be discarded or created by an SF.


2.1 The Network Service Header (NSH)

   The NSH is used to encapsulate traffic and control its subsequent
   path. It consists of three parts: the initial 32-bit Base Header,
   the 32-bit Service Path Header, and any Context Headers holding
   metadata, as shown in more detail in Figure 3 and specified in
   [RFC8300].

   The Base Header includes a Length field whose value is the overall
   NSH length, expressed in 4-octet words.
   Because the Base Header and Service Path Header are fixed length,
   the length of the Metadata can be computed from this Length field.
   The Base Header also includes a field indicating the type of
   metadata in the NSH.

   The Service Path Header consists of a Service Path Identifier (SPI)
   and a Service Index (SI). The SPI identifies the logical path the
   packet should follow, while the SI indicates which step along that
   path the packet is at.

   An SF anywhere along a Service Function Path can re-classify an SFC
   packet by replacing the Service Path Identifier (SPI) and Service
   Index (SI) in the NSH. SFFs can also insert, delete, or change
   metadata (Context Header(s)) in the NSH.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Ver|O|U|    TTL    |   Length  |U|U|U|U|MD Type| Next Protocol |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Service Path Identifier              | Service Index |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Optional Context Headers / Metadata                          ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

        Figure 3. Network Service Header Details from [RFC8300]


2.2 NSH Metadata and Variable Length Context Headers

   If the MD Type field in the NSH Base Header has the value 1, there is
   a single fixed length 128-bit Context Header whose format is not
   further defined by the IETF. In that case, the NSH Length field has
   the value 6.

   If the MD Type field has the value 2, there are zero or more
   Variable-Length Context Headers (VLCHs) at the end of the NSH, each
   formatted as shown in Figure 4. The absence of any Context Headers
   is indicated by using MD Type 2 and an NSH Length of 2. MD Type 0 is
   reserved and MD Types 3 through 15 are unassigned.
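
   The Base Header and Service Path Header layout of Figure 3, together
   with the MD Type and Length rules above, can be illustrated with a
   small parser. This is only a sketch: the names NshHeader and
   parse_nsh are hypothetical, the Length field is taken to count
   4-octet words as in [RFC8300], and Context Headers are reported only
   as an octet count rather than decoded.

```python
import struct
from typing import NamedTuple


class NshHeader(NamedTuple):
    """Fields of the two fixed NSH words (hypothetical helper type)."""
    version: int
    oam: int
    ttl: int
    length_words: int     # overall NSH length in 4-octet words
    md_type: int
    next_protocol: int
    spi: int
    si: int
    metadata_octets: int  # octets of Context Headers after the fixed 8


def parse_nsh(data: bytes) -> NshHeader:
    """Decode the Base Header and Service Path Header of an NSH."""
    if len(data) < 8:
        raise ValueError("an NSH has at least 8 octets")
    base, path = struct.unpack("!II", data[:8])
    length_words = (base >> 16) & 0x3F   # 6-bit Length
    md_type = (base >> 8) & 0xF          # 4-bit MD Type
    if length_words < 2 or len(data) < 4 * length_words:
        raise ValueError("Length field inconsistent with the data")
    if md_type == 1 and length_words != 6:
        raise ValueError("MD Type 1 requires a Length of 6")
    return NshHeader(
        version=base >> 30,              # 2-bit Ver
        oam=(base >> 29) & 0x1,          # O bit
        ttl=(base >> 22) & 0x3F,         # 6-bit TTL
        length_words=length_words,
        md_type=md_type,
        next_protocol=base & 0xFF,
        spi=path >> 8,                   # 24-bit SPI
        si=path & 0xFF,                  # 8-bit SI
        # Both fixed headers together occupy 2 words (8 octets).
        metadata_octets=4 * length_words - 8,
    )
```

   For an MD Type 1 NSH this yields 16 metadata octets (the 128-bit
   fixed Context Header); for MD Type 2 with a Length of 2 it yields
   zero, matching the "no Context Headers" case.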

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Metadata Class       |      Type     |U|    Length   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Variable Length Metadata                                     ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

                Figure 4. Variable Length Context Header

   The minimum size for a VLCH is 32 bits, consisting of the Metadata
   Class, Type, one unused bit, and Length as shown in Figure 4.

   The Metadata Class field is 16 bits, and its value specifies the
   organization under which the particular VLCH is specified. Metadata
   Class zero is the IETF base class. The 8-bit Type field's value,
   along with the Metadata Class value, indicates the meaning of the
   Context Header and its Variable-Length Metadata. The size of the
   Length field is 7 bits and its value gives the length in octets of
   the Variable-Length Metadata that follows the initial fixed length
   portion of the VLCH. A VLCH with no Variable-Length Metadata is
   indicated by a Length field whose value is zero. VLCHs are padded so
   that they always start and end at a multiple of 4 bytes from the
   beginning of the NSH.


3. Requirements for Parallelism and Diversions

   There are requirements to split a Service Function Chain (SFC) into
   two or more parallel Service Function Paths (SFPs) that later rejoin
   as shown in Figure 5.

                            +------+      +-----+
                      ----->|SFF 2a|<---->| SF 2|
                     /      +------+\     +-----+
                    /                \____
                   /                      \
           +-----+/         +------+       ->+-----+
      ---->|SFF 1|--------->|SFF 2b|-------->|SFF 3|----->
           +-----+\         +------+       ->+-----+
             /|\   \           /|\        /    /|\
              |     \           |        /      |
             \|/     \         \|/       |     \|/
           +-----+    \      +-----+     |   +-----+
           | SF 1|     \     | SF 3|    /    | SF 5|
           | DP  |      \    +-----+   /     | RP  |
           +-----+       \            /      +-----+
                          \  +------+/     +-----+
                           ->|SFF 2c|<---->| SF 4|
                             +------+      +-----+

              Figure 5: Parallel Service Function Paths

   For example, there may be two or more Service Functions (SFs) that
   can be performed in parallel with the goal, for time critical traffic
   such as some financial or gaming traffic, of delaying the stream of
   packets only by the amount of time taken by the slowest single SF; if
   the packets went through the SFs sequentially, the delay would be the
   sum of the times taken by each of the SFs. An example of such
   potential parallel processing might be that the SFs operate on
   different parts of the packet, such as one SF operating on packet
   addressing while another operates on the information payload. Another
   example might be that one SF creates a signature or integrity code
   over parts of the packet to be inserted into the packet payload while
   another SF encrypts parts of the packet (or alternatively, they
   verify and decrypt in parallel).

   Another example of desirable parallelism would be improved
   reliability or accuracy if the SFs executed in parallel were
   unreliable or were different implementations of the same
   processing. For example, some quantum computers are currently
   unreliable, so it would be desirable to perform some quantum process
   several times and compare the results to pick the most common value,
   or a vote could be taken between the results of different
   implementations of some process.
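
   The two motivations above can be made concrete with a little
   arithmetic. The sketch below uses hypothetical helper names and
   deliberately ignores the cost of copying packets at the DP and
   merging them at the RP, so the parallel figure is a lower bound
   rather than a prediction.

```python
from collections import Counter
from typing import Hashable, Sequence


def sequential_delay(sf_delays: Sequence[float]) -> float:
    """Delay when the SFs are traversed one after another: the sum."""
    return sum(sf_delays)


def parallel_delay(sf_delays: Sequence[float]) -> float:
    """Delay when the SFs run in parallel: that of the slowest SF."""
    if not sf_delays:
        raise ValueError("at least one SF delay is required")
    return max(sf_delays)


def majority_result(results: Sequence[Hashable]) -> Hashable:
    """Pick the most common result among parallel executions.

    Ties go to the value that was seen first.
    """
    if not results:
        raise ValueError("at least one result is required")
    value, _count = Counter(results).most_common(1)[0]
    return value
```

   For three SFs taking 3, 5, and 2 time units, sequential processing
   costs 10 units while parallel processing costs 5. Three unreliable
   executions returning 7, 9, and 7 would be resolved to 7.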

   In Figure 5, any of the parallel paths could have more or fewer than
   one SFF, although exactly one is shown for simplicity. Likewise, any
   of the SFFs in any of the parallel SFPs could process a packet
   through more than one SF, although each is shown using only one SF.
   (Note that while SFFs implement an SFP, an SFP logically consists of
   the sequence of SFs. Thus, for example, an SFP could divert into
   multiple parallel SFPs that rejoin at an RP, all implemented through
   SFs off of one and the same SFF.) It could also be that one or more
   of the parallel paths would themselves further split into parallel
   paths and so on.

   There are cases where it is desirable to divert an SFP so as to
   splice one or more added SFs into that SFP, or to divert it so as to
   splice out one or more sequential SFs that were downstream in that
   SFP, as shown in Figures 6 and 7. Although SF 3 in each of those two
   cases could re-classify the packet with a new SPI and SI that include
   the remainder of the new diverted path, this would require that a new
   SFP with this new SPI already be configured in all the SFFs for the
   remainder of the Service Function Path after a diversion. In the case
   of Figure 7, SF 3 could possibly just adjust the Service Index, but
   this would require relaxing any checking at SF 5 of the SPI/SI or
   source address of packets on the main SFP, or may otherwise be
   undesirable.

                                +-----+       +-----+
                              ->|SFF 3| <---->| SF 4|
                             /  +-----+\      +-----+
                            /           \
        +-----+      +-----+            ->+-----+      +-----+
    --->|SFF 1|----->|SFF 2| - - - - - - >|SFF 4|----->|SFF 5|--->
        +-----+      +-----+              +-----+      +-----+
          /|\          /|\                  /|\          /|\
           |            |                    |            |
          \|/          \|/                  \|/          \|/
        +-----+      +-----+              +-----+      +-----+
        | SF 1|      | SF 3|              | SF 5|      | SF 6|
        +-----+      | DP  |              | RP  |      +-----+
                     +-----+              +-----+

                Figure 6: Splicing in One or More SFFs

                                 ------------\
                                /             \
                               /               \
            +-----+      +-----+      +-----+   \->+-----+
        --->|SFF 1|----->|SFF 2|- - ->|SFF 4|- - ->|SFF 5|--->
            +-----+      +-----+      +-----+      +-----+
              /|\          /|\          /|\          /|\
               |            |            |            |
              \|/          \|/          \|/          \|/
            +-----+      +-----+      +-----+      +-----+
            | SF 1|      | SF 3|      | SF 5|      | SF 5|
            +-----+      | DP  |      +-----+      | RP  |
                         +-----+                   +-----+

                Figure 7: Splicing out One or More SFFs

   Combinations of the cases shown in Figures 5, 6, and 7 may be needed,
   where a diversion such as in Figure 6 or 7 occurs within a parallel
   path as in Figure 5, or parallel paths as in Figure 5 occur within a
   diversion as in Figure 6. Generalizing Figures 6 and 7, a diversion
   might splice in a path with some number of SFs that cuts out a
   portion of the original SFP that had some number of SFs.

   Although DPs and RPs are logically Service Functions (SFs) and shown
   as separate boxes in the above figures, like any other SF they can be
   implemented as co-located with an SFF.


4. Diversion Points and Rendezvous Points

   SF 1 in Figure 5 and SF 3 in Figures 6 and 7 are referred to as
   Diversion Points (DPs) because they are nodes at which an SFP is
   diverted to one or more SFPs with new SPIs that are intended to
   rejoin/return to the original SPI at a downstream Rendezvous Point.
   SF 5 in Figures 5, 6, and 7 is referred to as a Rendezvous Point (RP)
   because it is the node at which one or more SFPs from an upstream DP
   rejoin an original SFP and SPI.
   The packets so received at an RP are merged or coordinated.

   In general, an RP needs to be configured to expect SFC packets to
   arrive at that RP on diverted SFPs. An RP may need additional
   information to be included in the NSH of those packets, as discussed
   in Section 4.1. Diversion Point (DP) behavior is discussed in
   Section 4.2 and RP behavior is discussed in Section 4.3.


4.1 Rendezvous Point Information (RePIn)

   To recombine packets from diverted SFP(s) at a Rendezvous Point
   (RP), or rejoin a diverted SFP to the original SFP, additional
   information may be needed in the packets. This is accomplished
   through inclusion of the RP Information (RePIn) Variable Length
   Context Header (VLCH), as shown in Figure 8, in the packet's NSH.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Metadata Class = 0x0000    |   Type=TBD    |U|    Length   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Saved Service Path Identifier         |   Saved SI    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Variable Length Sub-TLVs                                     ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

                         Figure 8: RePIn VLCH

   The Length field is the total length of the Variable Length Sub-TLVs
   in octets.

   The Saved Service Path Identifier and Saved SI are the SPI and SI in
   the NSH of the SFC packet being diverted, as they are after the
   packet has entered the DP SF and the SI has been decremented.

   The Variable Length Sub-TLVs consist of zero or more RePIn VLCH Sub-
   TLVs. The format of a RePIn VLCH Sub-TLV is as shown in Figure 9,
   except for Sub-Type 1 as discussed in Section 4.1.1; however, all
   RePIn VLCH Sub-TLVs are padded at the end up to a multiple of 4
   octets.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Sub-Type    |X|Sub-Length   |   Variable Length Metadata    ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

                 Figure 9: General RePIn VLCH Sub-TLV

   The Sub-Type field is an 8-bit unsigned integer that is always
   present and indicates the format of the rest of the Sub-TLV. The X
   bit may be assigned a meaning for particular Sub-Types; if no such
   meaning is assigned for a particular Sub-Type, the X bit MUST be sent
   as zero and ignored on receipt. Sub-Length is an unsigned integer
   giving the length of the variable length metadata in octets.

   Unless the specification for a RePIn VLCH Sub-TLV Sub-Type specifies
   that there may be multiple occurrences of that Sub-TLV, it may be
   included only once. If there are multiple instances, the first
   occurrence is used and any subsequent occurrences are ignored.


4.1.1 Packet Identifier

   When an SFC packet is replicated and diverted to more than one
   parallel path to be merged back together at a Rendezvous Point (RP),
   a method of matching packets is needed. One such method is to label
   each copy that originated from the same packet before the split with
   a packet identifier such as a packet counter, fine-grained time
   stamp, or hash code. Such an identifier might already exist in the
   packet, for example a TCP sequence number. If not, it is included as
   a special RePIn VLCH Sub-TLV as shown below. Use of a packet counter
   is RECOMMENDED.
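
   The labeling and matching described above might be sketched as
   follows. All class and function names here are hypothetical. The
   sketch assumes the Packet Identifier fills the 24 bits that follow
   the 8-bit Sub-Type (value 1), that the identifier is the RECOMMENDED
   packet counter, and that the RP knows how many copies to expect for
   each packet. Timeouts for lost copies are left out.

```python
import struct
from collections import defaultdict
from typing import Dict, List, Optional

PACKET_ID_SUB_TYPE = 1
PACKET_ID_MASK = 0xFFFFFF  # assumed 24-bit Packet Identifier


class PacketCounter:
    """DP side: hands out a wrapping packet counter."""

    def __init__(self, start: int = 0) -> None:
        self._next = start & PACKET_ID_MASK

    def next_id(self) -> int:
        value = self._next
        self._next = (self._next + 1) & PACKET_ID_MASK
        return value


def encode_packet_id_sub_tlv(packet_id: int) -> bytes:
    """Pack Sub-Type 1 and the identifier into one 32-bit word."""
    word = (PACKET_ID_SUB_TYPE << 24) | (packet_id & PACKET_ID_MASK)
    return struct.pack("!I", word)


def decode_packet_id_sub_tlv(data: bytes) -> int:
    """Return the identifier from a Packet Identifier Sub-TLV."""
    (word,) = struct.unpack("!I", data[:4])
    if word >> 24 != PACKET_ID_SUB_TYPE:
        raise ValueError("not a Packet Identifier Sub-TLV")
    return word & PACKET_ID_MASK


class CopyMatcher:
    """RP side: groups diverted copies by packet identifier."""

    def __init__(self, expected_copies: int) -> None:
        self._expected = expected_copies
        self._pending: Dict[int, List[bytes]] = defaultdict(list)

    def add(self, packet_id: int, copy: bytes) -> Optional[List[bytes]]:
        """Buffer a copy; return the full set once all have arrived."""
        group = self._pending[packet_id]
        group.append(copy)
        if len(group) < self._expected:
            return None
        return self._pending.pop(packet_id)
```

   A real RP would also need to bound the buffer and decide how long to
   wait for a missing copy, which Section 4.3 leaves to the application
   and implementation.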

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Sub-Type=1   |               Packet Identifier               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Figure 9: Packet Identifier Special Sub-TLV

   Because this is expected to be a common RePIn VLCH Sub-TLV, in order
   to save octets, for this Sub-Type only, the X and Sub-Length fields
   of the general Sub-TLV format are subsumed into the Packet
   Identifier field.


4.1.2 Packet Extent Modified

   If two or more SFPs used in parallel have modified parts of a packet,
   the RP may need additional information to be able to recombine the
   different copies of the packet it will be receiving. As an example of
   the complexities involved, an SF could change the length of part of a
   packet in a way dependent on the content of that part, such as by
   applying a data compression or decompression algorithm to part of
   the packet or by conditionally inserting or removing a VLAN tag
   depending on addressing information.

   In simple cases, such as parallel SFPs that modify fixed size
   disjoint parts of a packet without changing the size of those parts,
   it may be possible for an RP to be configured to recombine the
   results without added information. But in more complex or variable
   length cases, parallel SFPs need to add information as to what part
   of the original packet they modified and how this may have changed
   the length of that part. Also, with such additional information, in
   some cases only one of the parallel SFPs would need to forward all of
   the original packet with modifications to the RP; one or more other
   parallel SFPs could just forward their modified part and the RP would
   be able to recombine the results, thus saving communications link
   capacity that would be used if they all sent full packets.
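
   The simple case described above, where each parallel SFP modifies a
   fixed-size, disjoint part of the packet without changing its size,
   can be sketched as below. The function name and the per-path extent
   configuration are hypothetical. For brevity extents are expressed in
   octets, whereas the Packet Extent Modified Sub-TLV measures in bits.

```python
from typing import Dict, Tuple

Extent = Tuple[int, int]  # (offset, length) in octets, configured at the RP


def recombine(copies: Dict[str, bytes], extents: Dict[str, Extent]) -> bytes:
    """Merge full-size copies, taking each path's extent from its copy."""
    if not copies:
        raise ValueError("no copies to recombine")
    if set(extents) - set(copies):
        raise ValueError("extent configured for a path with no copy")
    sizes = {len(copy) for copy in copies.values()}
    if len(sizes) != 1:
        raise ValueError("copies differ in size; extent metadata is needed")
    size = sizes.pop()

    # Start from any copy; regions no path claims are taken from it.
    merged = bytearray(next(iter(copies.values())))
    previous_end = 0
    # Walk the extents in offset order so overlaps are easy to detect.
    for path, (offset, length) in sorted(extents.items(),
                                         key=lambda item: item[1]):
        if length < 0 or offset < previous_end or offset + length > size:
            raise ValueError(f"extent for {path!r} overlaps or is out of range")
        merged[offset:offset + length] = copies[path][offset:offset + length]
        previous_end = offset + length
    return bytes(merged)
```

   Overlapping extents are rejected here; resolving them would need
   something like the Priority field of the Sub-TLV defined in this
   section.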

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Sub-Type=2   |X|Sub-Length=6 |            Offset             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Priority    |     Original Size     |     Modified Size     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

               Figure 10: Packet Extent Modified Sub-TLV

   If the X bit is zero, the entire modified packet is present in the
   SFC packet. If the X bit is one, only the modified portion appears
   in the packet; this requires any SFs between the SF that modified
   the packet and the RP to be capable of handling such an abbreviated
   packet.

   Offset is the number of bits between the end of the NSH (or of the
   last NSH if there are multiple stacked NSHs) and the start of the
   portion of the packet being modified. Original Size and Modified
   Size are the size in bits of the portion being modified before and
   after that modification. Any of the Offset, Original Size, and
   Modified Size fields may have the value zero.

   If parallel SFPs have modified the same or overlapping parts of a
   packet, the RP may need some way to resolve this conflict. This
   could include a relative priority for changes made by different SFs,
   configured at the RP, indicated in the RP Information (RePIn), or
   obtained from other sources. The Priority field MAY be used for this
   purpose; it contains an unsigned integer where a larger value
   indicates a higher priority that would prevail over a lower
   priority. If not used, the Priority field MUST be sent as zero and
   ignored on receipt.

   If a path has modified more than one portion of a packet, multiple
   instances of the Packet Extent Modified Sub-TLV can be included in
   the RePIn VLCH.


4.1.3 Saved Metadata

   A DP may need to save Metadata so that it is not visible inside a
   diversion and can be restored at the RP.
   This Sub-TLV is used for that purpose. See Sections 4.2 and 4.3 for
   further details of use.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Sub-Type=3   |X| Sub-Length  |          MBZ          |MetaTyp|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Variable Length Saved Metadata                               ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

                   Figure 10: Saved Metadata Sub-TLV

   The X and MBZ bits MUST be sent as zero and ignored on receipt.
   MetaTyp is the MD Type of the Metadata saved. The presence of the
   MBZ field causes the saved metadata to be aligned on a multiple of 4
   octets.


4.1.4 Saved TTL

   The TTL limits the number of SFs that can be traversed between
   ingress and egress. The packet is discarded if the TTL is exhausted.
   This is a safety measure to defend against infinite or very large
   loops due to malfunctions, configuration error, or other reasons.
   Thus, the RECOMMENDED mode of operation is to use a TTL value that is
   decremented continuously from original SFC domain ingress to final
   SFC egress, including throughout any diversions. If the TTL is reset
   on entry to a diversion, then the Saved TTL Sub-TLV MUST be used so
   that the previous TTL can be restored at the diversion's RP.

   Note that resetting the TTL on entry to a diversion opens the
   possibility of a loop, for example where a diversion diverts to
   itself, where there are two diversions X and Y such that X diverts
   to Y and Y diverts to X, or in more complex scenarios. All such
   loops are made safe by using a continuous TTL and unsafe by
   resetting the TTL on diversion entry. Such loops will result in a
   growing amount of metadata, which might safely lead to packet
   discard or unsafely cause repeated fragmentation.
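
   The loop hazard described above can be illustrated with a toy
   simulation. Everything here is a simplification with hypothetical
   names: each hop decrements the TTL by one, the packet is discarded
   when the TTL reaches zero, and a hop listed in reset_at stands for a
   DP that resets the TTL on diversion entry. Metadata growth is not
   modeled.

```python
from typing import AbstractSet, Optional, Sequence

MAX_NSH_TTL = 63  # largest value of the 6-bit NSH TTL field


def hops_until_discard(initial_ttl: int,
                       loop: Sequence[str],
                       reset_at: AbstractSet[str] = frozenset(),
                       reset_value: int = MAX_NSH_TTL,
                       max_hops: int = 10_000) -> Optional[int]:
    """Count hops around a forwarding loop until the packet is dropped.

    Returns None if the packet is still circulating after max_hops.
    """
    if not loop:
        raise ValueError("the loop needs at least one hop")
    ttl = initial_ttl
    for hop in range(1, max_hops + 1):
        node = loop[(hop - 1) % len(loop)]
        ttl -= 1                 # decrement on arrival at this hop
        if ttl <= 0:
            return hop           # TTL exhausted: packet discarded
        if node in reset_at:
            ttl = reset_value    # DP resets the TTL on diversion entry
    return None
```

   With a continuous TTL, a packet bouncing between diversions X and Y
   is dropped after at most 63 hops. If X resets the TTL on every pass,
   the TTL never runs out and the simulation reports no discard.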

   If, despite the above warning, it is desired to reset the TTL at the
   DP and restore it at the RP, the Saved TTL Sub-TLV as shown below is
   used.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Sub-Type=4   |X| Sub-Length=2|        MBZ        | Saved TTL |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                     Figure 11: Saved TTL Sub-TLV

   The X and MBZ bits MUST be sent as zero and ignored on receipt. The
   Saved TTL is the value of the TTL field copied from the NSH after its
   initial decrement on entering the DP SF.


4.2 Diversion Point (DP) Behavior

   If it is desired to simply skip some SFs in an SFP, a diversion may
   not be necessary. The SI can simply be decreased to the value for
   the next desired SF, provided that the SFF to which the SF/DP
   returns the packet can handle the reduced SI value and forward the
   packet to the appropriate SFF. Otherwise, the procedure below in
   this section is used; this procedure MAY also be used in cases where
   simple reduction of the SI would work.

   If the RP can recognize diverted SFC packets and modify/merge them
   appropriately to restore them to the original SFP with appropriate
   Metadata, then inserting a RePIn VLCH might not be needed. In other
   cases, take the steps below. This is a logical procedure and any
   procedure can be used that results in the same diverted SFC
   packet(s).

   1. Construct a RePIn VLCH containing the SPI and SI from the NSH and
      then change those fields in the NSH to the diversion SPI and SI.
      If diversion is to multiple parallel SFPs, make a copy of the SFC
      packet for each diversion, then construct a RePIn VLCH and modify
      the NSH as in the previous sentence for each one of the parallel
      paths. Then perform the following steps on each modified copy and
      its corresponding RePIn VLCH.

   2.
      If it will be necessary for the RP to merge the modified copies of
      the original SFC packet sent over parallel paths, or if the RP
      needs to restore a particular ordering to packets, then add a
      Packet Identifier Sub-TLV to the RePIn VLCH unless sufficient
      information is available in the packet payload for the RP to do so
      without a Packet Identifier Sub-TLV.

   3. Take any Metadata from the original NSH that should be restored at
      the RP and add it to the RePIn VLCH as a Saved Metadata Sub-TLV.
      If the NSH already has any RePIn VLCHs, they need not be saved as
      they will be masked by the new RePIn VLCH that will be inserted
      before them (this indicates that a diversion from a diversion is
      being created). To save space, any Metadata that has been saved in
      the RePIn VLCH and is not needed in the diversion SFP SHOULD be
      removed from the NSH if the NSH is MD Type 2 and MUST be removed
      from the NSH if it is MD Type 1. (In the Type 1 case, this
      converts the NSH to MD Type 2 with no Metadata.)

   4. If it is desired to use a new value for the NSH TTL in the
      diversion, with the old value restored at the RP, add a Saved TTL
      Sub-TLV to the RePIn VLCH and set the TTL in the NSH to a
      configured value which may be dependent on the diversion being
      entered. This is NOT RECOMMENDED as discussed in Section 4.1.4.

   5. The RePIn VLCH constructed as above is inserted into an NSH as in
      the subpoints below:
      5.a If, after the above step 3, the initial NSH in the SFC packet
          is MD Type 2, insert the RePIn VLCH constructed above before
          any existing RePIn VLCH in the NSH.
      5.b If, after the above step 3, the initial NSH in the SFC packet
          is MD Type 1, this implies that there is Type 1 metadata that
          may be needed by one or more SFs in the diversion.
          If the SFs in the diversion can handle stacked NSHs, insert
          an MD Type 2 NSH, copied from the initial NSH except for
          metadata, after the initial MD Type 1 NSH to hold the RePIn
          VLCH. Handling stacked NSHs means the SF (or its proxy) can
          parse through them to find the needed metadata and the payload
          to operate on and, if the SF generates packets, can create
          them with appropriate stacked NSHs. If the SFs in the
          diversion cannot handle stacked NSHs, the creation of the
          diversion is beyond the scope of this document.

   6. Perform such other functions or modifications to the metadata or
      other parts of the SFC packet as are appropriate based on the
      saved or new SPI or other factors.

   The addition of Metadata and a possible additional NSH (see step 5.b
   above) may lead to fragmentation or a decreased payload Maximum
   Transmission Unit (MTU) in some networks.

4.3 Rendezvous Point (RP) Behavior

   An RP will have been configured to know the SFC packet SPI and SI
   values in diverted packets for which it is to perform the RP service.
   The SI is decremented when an SFC packet is received by an SF; for an
   RP this might decrement the SI to zero. The RP performs the steps
   below. If the RP can restore diverted SFC packets to their former SFP
   and, to the extent necessary, match and merge diverted packets
   received over parallel paths and correctly order the resulting SFC
   packets, without the presence of a RePIn VLCH, it does so and the
   remainder of this section is inapplicable. If not, the following
   logical procedure or any procedure resulting in the same SFC packet
   is used.

   1. Steps 2 and 3 below are performed on each diverted packet received
      by the RP. If the RP is merging parallel diversions, step 4 is
      then performed on the set of matching packets.
      In this case, and in any case where the RP should restore packet
      order, the RP must be prepared to buffer packets until they can
      be processed and forwarded. Overflow of such a buffer will result
      in lost packets and should be logged as an error. How long to
      wait for missing diverted packets and what action to take if it
      is decided they have been lost are application and implementation
      dependent. Finally, step 5 is performed.

   2. Find and remove the first RePIn VLCH in the diverted packet. This
      is referred to below as the removed VLCH. It might be in a second
      stacked NSH if the initial NSH has MD Type 1.

   3. Restore the packet from the diversion through the sub-steps listed
      below.
      3.a If there is a Saved TTL in the removed VLCH, restore it into
          the initial NSH.
      3.c Restore the saved SPI and SI from the removed RePIn VLCH into
          the initial NSH.
      3.d Restore metadata as follows:
          3.d.1 Restore any MD Type 2 Saved Metadata from the removed
                VLCH into the NSH from which that VLCH was removed.
          3.d.2 If there is MD Type 1 Saved Metadata in the removed VLCH
                and there is an initial MD Type 1 NSH in the packet,
                replace the MD Type 1 metadata with the saved MD Type 1
                metadata.
          3.d.3 If there is MD Type 1 Saved Metadata in the removed VLCH
                and there is an initial MD Type 2 NSH in the packet,
                insert a new initial NSH into the packet which is a copy
                of that MD Type 2 NSH except that it is MD Type 1 with
                the saved MD Type 1 metadata.
      3.e If, at this point, the packet starts with an MD Type 1 NSH
          followed by an MD Type 2 NSH with no metadata, remove that
          second NSH.

   4. Match up SFC packets arriving at the RP through parallel paths
      using the Packet Identification Sub-TLV in the removed RePIn VLCH
      or using some other technique. For each matching set, perform the
      sub-steps below.
      Arbitrarily select one of the matching diverted packets to modify
      into the merged packet, unless configured to use a particular
      diverted packet such as the one received over a particular
      diversion. The selected packet is referred to below as the merged
      packet even before the merger is complete.
      4.a For error checking, the SPI and SI in the initial NSH of the
          matching packets SHOULD be compared and an error logged if
          they are not identical.
      4.b For safety, it is RECOMMENDED that the minimum NSH TTL from
          the parallel SFC packets be copied into the merged packet.
      4.c Depending on the application and implementation, the remaining
          metadata in the merged packet may be used or updated based on
          the remaining Metadata in the other packets being merged. How
          to do this is beyond the scope of this document.
      4.d The payloads of the other packets being merged, that is, the
          portions after any NSHs, are used to update the payload in
          the merged packet. This may be based on RP configuration for
          the application, on Packet Extent Modified Sub-TLVs in the
          removed RePIn VLCHs, or on a combination of these.

   5. Perform such other functions or modifications to the metadata or
      other parts of the SFC packet as are appropriate based on the
      saved or new SPI or other factors. Then return the packet to the
      SFF.

5. IANA Considerations

   The following subsections provide IANA assignment considerations.

5.1 Variable Length Context Header Type

   IANA is requested to assign a variable length context header type
   from the "NSH IETF-Assigned Optional Variable-Length Metadata Types"
   registry as follows:

      Value   Description                            Reference
      -----   ------------------------------------   ---------------
       TBD    Rendezvous Point Information (RePIn)   [this document]

5.2 RePIn VLCH Sub-Types

   IANA is requested to create a sub-registry under the
   "NSH IETF-Assigned Optional Variable-Length Metadata Types" registry
   as follows:

      Name: Sub-TLVs under the Type TBD Variable Length Context Header
      Registration Procedure: Expert Review
      Reference: [this document]

      Sub-Type   Description              Reference
      --------   ----------------------   ---------------
          0      reserved                 [this document]
          1      Packet Identifier        [this document]
          2      Packet Extent Modified   [this document]
          3      Saved Metadata           [this document]
          4      Saved TTL                [this document]
        5-254    unassigned               [this document]
         255     reserved                 [this document]

6. Security Considerations

   For general SFC and NSH security considerations, see [RFC8300].

   More to be added...

Normative References

   [RFC2119] - Bradner, S., "Key words for use in RFCs to Indicate
         Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119,
         March 1997.

   [RFC8174] - Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
         2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May
         2017.

   [RFC8300] - Quinn, P., Ed., Elzur, U., Ed., and C. Pignataro, Ed.,
         "Network Service Header (NSH)", RFC 8300, DOI 10.17487/RFC8300,
         January 2018.

Informative References

   [ITU_Liaison] - ITU-T SG11, "LS on recent service function chaining
         related developments in Q4/SG11: two new draft Supplements",
         ITU-T SG11-LS-179, March 2021.

   [RFC7665] - Halpern, J., Ed., and C. Pignataro, Ed., "Service
         Function Chaining (SFC) Architecture", RFC 7665, DOI
         10.17487/RFC7665, October 2015.

   [RFC8459] - Dolson, D., Homma, S., Lopez, D., and M. Boucadair,
         "Hierarchical Service Function Chaining (hSFC)", RFC 8459, DOI
         10.17487/RFC8459, September 2018.

Appendix: Relation to Hierarchical SFC

   Experimental [RFC8459] describes "Hierarchical SFC" in which SFs in a
   higher level SFP can be entire lower level SFPs with a different SPI
   and where the higher level SPI is restored at the end of the lower
   level SFP. This is similar to a diversion in this document. The
   Internal Boundary Nodes (IBNs) in [RFC8459] that transition an SFC
   packet between the higher and lower levels are similar to DPs/RPs in
   the terminology of this document.

   Experimental [RFC8459] discusses a wide variety of mechanisms to
   implement Hierarchical SFC while this document looks toward
   specifying a more specific set of mechanisms as a Proposed Standard
   to support parallelism and other types of diversions.

Acknowledgements

   The authors gratefully acknowledge the comments and suggestions of
   the following persons:

      (None yet)

Authors' Addresses

   Donald E. Eastlake, 3rd
   Futurewei Technologies
   2386 Panoramic Circle
   Apopka, FL 32703 USA

   Tel: +1-508-333-2270
   Email: d3e3e3@gmail.com

Copyright and IPR Provisions

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.
   The definitive version of an IETF Document is that published by, or
   under the auspices of, the IETF. Versions of IETF Documents that are
   published by third parties, including those that are translated into
   other languages, should not be considered to be definitive versions
   of IETF Documents. The definitive version of these Legal Provisions
   is that published by, or under the auspices of, the IETF. Versions of
   these Legal Provisions that are published by third parties, including
   those that are translated into other languages, should not be
   considered to be definitive versions of these Legal Provisions. For
   the avoidance of doubt, each Contributor to the IETF Standards
   Process licenses each Contribution that he or she makes as part of
   the IETF Standards Process to the IETF Trust pursuant to the
   provisions of RFC 5378. No language to the contrary, or terms,
   conditions or rights that differ from or are inconsistent with the
   rights and licenses granted under RFC 5378, shall have any effect and
   shall be null and void, whether published or posted by such
   Contributor, or included with or in such Contribution.