idnits 2.17.00 (12 Aug 2021) /tmp/idnits8939/draft-ietf-raw-architecture-04.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- -- The document date (4 March 2022) is 71 days in the past. Is this intentional? Checking references for intended status: Informational ---------------------------------------------------------------------------- == Missing Reference: 'PCE' is mentioned on line 810, but not defined == Outdated reference: A later version (-05) exists of draft-irtf-panrg-path-properties-04 Summary: 0 errors (**), 0 flaws (~~), 2 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 RAW P. Thubert, Ed. 3 Internet-Draft Cisco Systems 4 Intended status: Informational G.Z. Papadopoulos 5 Expires: 5 September 2022 IMT Atlantique 6 4 March 2022 8 Reliable and Available Wireless Architecture 9 draft-ietf-raw-architecture-04 11 Abstract 13 Reliable and Available Wireless (RAW) provides for high reliability 14 and availability for IP connectivity over a wireless medium. The 15 wireless medium presents significant challenges to achieve 16 deterministic properties such as low packet error rate, bounded 17 consecutive losses, and bounded latency. 
This document defines the 18 RAW Architecture following an OODA loop that involves OAM, PCE, PSE 19 and PAREO functions. It builds on the DetNet Architecture and 20 discusses specific challenges and technology considerations needed to 21 deliver DetNet service utilizing scheduled wireless segments and 22 other media, e.g., frequency/time-sharing physical media resources 23 with stochastic traffic. 25 Status of This Memo 27 This Internet-Draft is submitted in full conformance with the 28 provisions of BCP 78 and BCP 79. 30 Internet-Drafts are working documents of the Internet Engineering 31 Task Force (IETF). Note that other groups may also distribute 32 working documents as Internet-Drafts. The list of current Internet- 33 Drafts is at https://datatracker.ietf.org/drafts/current/. 35 Internet-Drafts are draft documents valid for a maximum of six months 36 and may be updated, replaced, or obsoleted by other documents at any 37 time. It is inappropriate to use Internet-Drafts as reference 38 material or to cite them other than as "work in progress." 40 This Internet-Draft will expire on 5 September 2022. 42 Copyright Notice 44 Copyright (c) 2022 IETF Trust and the persons identified as the 45 document authors. All rights reserved. 47 This document is subject to BCP 78 and the IETF Trust's Legal 48 Provisions Relating to IETF Documents (https://trustee.ietf.org/ 49 license-info) in effect on the date of publication of this document. 50 Please review these documents carefully, as they describe your rights 51 and restrictions with respect to this document. Code Components 52 extracted from this document must include Revised BSD License text as 53 described in Section 4.e of the Trust Legal Provisions and are 54 provided without warranty as described in the Revised BSD License. 56 Table of Contents 58 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 59 2. The RAW problem . . . . . . . . . . . . . . . . . . . . . . . 6 60 2.1. Terminology . . . . . . . . 
. . . . . . . . . . . . . . . 6 61 2.1.1. Acronyms . . . . . . . . . . . . . . . . . . . . . . 6 62 2.1.2. Link and Direction . . . . . . . . . . . . . . . . . 7 63 2.1.3. Path and Tracks . . . . . . . . . . . . . . . . . . . 8 64 2.1.4. Deterministic Networking . . . . . . . . . . . . . . 10 65 2.1.5. Reliability and Availability . . . . . . . . . . . . 11 66 2.1.6. OAM variations . . . . . . . . . . . . . . . . . . . 12 67 2.2. Reliability and Availability . . . . . . . . . . . . . . 13 68 2.2.1. High Availability Engineering Principles . . . . . . 13 69 2.2.2. Applying Reliability Concepts to Networking . . . . . 16 70 2.2.3. Wireless Effects Affecting Reliability . . . . . . . 16 71 2.3. Routing Time Scale vs. Forwarding Time Scale . . . . . . 18 72 3. The RAW Conceptual Model . . . . . . . . . . . . . . . . . . 20 73 4. The OODA Loop . . . . . . . . . . . . . . . . . . . . . . . . 22 74 4.1. Observe: The RAW OAM . . . . . . . . . . . . . . . . . . 23 75 4.2. Orient: The Path Computation Engine . . . . . . . . . . . 24 76 4.3. Decide: The Path Selection Engine . . . . . . . . . . . . 24 77 4.4. Act: The PAREO Functions . . . . . . . . . . . . . . . . 26 78 4.4.1. Packet Replication . . . . . . . . . . . . . . . . . 27 79 4.4.2. Packet Elimination . . . . . . . . . . . . . . . . . 28 80 4.4.3. Promiscuous Overhearing . . . . . . . . . . . . . . . 28 81 4.4.4. Constructive Interference . . . . . . . . . . . . . . 29 82 5. Security Considerations . . . . . . . . . . . . . . . . . . . 29 83 5.1. Layer-2 encryption . . . . . . . . . . . . . . . . . . . 29 84 5.2. Forced Access . . . . . . . . . . . . . . . . . . . . . . 29 85 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 30 86 7. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 30 87 8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 30 88 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 30 89 9.1. Normative References . . . . . . . . . . . . . . . . . . 
30 90 9.2. Informative References . . . . . . . . . . . . . . . . . 32 91 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 33 93 1. Introduction 95 Deterministic Networking is an attempt to emulate the properties of a 96 serial link over a switched fabric, by providing a bounded latency 97 and eliminating congestion loss, even when co-existing with best- 98 effort traffic. It is gaining traction in various industries 99 including professional A/V, manufacturing, online gaming, and 100 smartgrid automation, enabling cost and performance optimizations 101 (e.g., vs. loads of P2P cables). 103 Bringing determinism to a packet network means eliminating the 104 statistical effects of multiplexing that result in probabilistic 105 jitter and loss. This can be approached with a tight control of the 106 physical resources to maintain the amount of traffic within a 107 budgeted volume of data per unit of time that fits the physical 108 capabilities of the underlying network, and the use of time-shared 109 resources (bandwidth and buffers) per circuit, and/or by shaping and/ 110 or scheduling the packets at every hop. 112 This innovation was initially introduced on wired networks, with IEEE 113 802.1 Time Sensitive Networking (TSN) - for Ethernet LANs - and IETF 114 DetNet. But the wired and the wireless media are fundamentally 115 different at the physical level and in the possible abstractions that 116 can be built for IPv6 [IPoWIRELESS]. Nevertheless, deterministic 117 capabilities are required in a number of wireless use cases as well 118 [RAW-USE-CASES]. With new scheduled radios such as TSCH and OFDMA 119 [RAW-TECHNOS] being developed to provide determinism over wireless 120 links at the lower layers, providing DetNet capabilities is now 121 becoming possible. 123 Wireless networks operate on a shared medium where uncontrolled 124 interference, including self-induced multipath fading, causes 125 random transmission losses.
Fixed and mobile obstacles and 126 reflectors may block or alter the signal, causing transient and 127 unpredictable variations of the throughput and packet delivery ratio 128 (PDR) of a wireless link. This adds new dimensions to the 129 statistical effects that affect the quality and reliability of the 130 link. Multiple links and transmissions must be used, and the 131 challenge is to provide enough diversity and redundancy to ensure 132 timely packet delivery while preserving energy and optimizing the use 133 of the shared spectrum. 135 Reliable and Available Wireless (RAW) takes up the challenge of 136 providing highly available and reliable end-to-end performance in a 137 network with scheduled wireless segments. To defeat those additional 138 causes of transmission delay and loss in wireless transmission, RAW 139 requires and leverages deterministic layer-2 capabilities. Operating 140 at layer-3, RAW can further increase diversity in the spatial, 141 time, code, and frequency domains by enabling multiple link-layer 142 wired and wireless technologies in parallel or sequentially, for a 143 higher resilience and a wider applicability. RAW can also provide 144 homogeneous services to critical applications beyond the boundaries 145 of a single subnetwork, e.g., controlling the use of diverse radio 146 access technologies to optimize the end-to-end application 147 experience. 149 While the generic "Deterministic Networking Problem Statement" 150 [RFC8557] applies to both the wired and the wireless media, the 151 methods to achieve RAW must extend those used to support time- 152 sensitive networking over wires, as a RAW solution has to address 153 less consistent transmissions, energy conservation and shared 154 spectrum efficiency. 156 RAW provides DetNet elements that are specialized for IPv6 flows 157 [IPv6] over selected deterministic radio technologies [RAW-TECHNOS].
158 Conceptually, RAW is agnostic to the radio layer underneath, though 159 the capability to schedule transmissions is assumed. How the PHY is 160 programmed to do so, and whether the radio is single-hop or meshed, 161 are unknown at the IP layer and not part of the RAW abstraction. 162 Nevertheless, cross-layer optimizations may take place to ensure 163 proper link awareness (e.g., link quality) and packet handling 164 (e.g., scheduling). 166 The "Deterministic Networking Architecture" [RFC8655] is composed of 167 three planes: the Application (User) Plane, the Controller Plane, and 168 the Network Plane. The DetNet Network Plane is composed of a DetNet 169 service sublayer that focuses on flow protection (e.g., using 170 redundancy) and can be fully operated at layer-3, and a DetNet 171 forwarding sublayer that associates the flows to the paths, ensures 172 the availability of the necessary resources, and leverages layer-2 173 functionalities for timely delivery to the next DetNet system. 175 The RAW Architecture extends the DetNet Network Plane to accommodate 176 one or multiple hops of homogeneous or heterogeneous wired and 177 wireless technologies. RAW adds reactivity to the DetNet service 178 sublayer to compensate for the dynamics of the radio links in terms of 179 lossiness and bandwidth. This may apply for instance to mesh 180 networks as illustrated in Figure 3, or diverse radio access networks 181 as illustrated in Figure 5. 183 RAW and DetNet route application flows that require a special 184 treatment along the paths that will provide that treatment. This may 185 be seen as a form of Path Aware Networking and may be subject to 186 impediments documented in [RFC9049]. 188 The establishment of a path is not in scope for RAW. It may be the 189 product of a centralized Controller Plane as described for DetNet.
190 In contrast to wired networks, the action of installing a path over a 191 set of wireless links may be very slow relative to the speed at which 192 the radio conditions vary, and it makes sense in the wireless case to 193 provide redundant forwarding solutions along a complex path (see 194 Section 2.1.3) and to leave it to the Network Plane to select which 195 of those forwarding solutions are to be used for a given packet based 196 on the current conditions. 198 RAW distinguishes the longer time scale at which routes are computed 199 from the shorter forwarding time scale where per-packet decisions 200 are made. RAW operates within the Network Plane at the forwarding 201 time scale on one DetNet flow over a complex path delineated by a 202 Track (see Section 2.1.3.2). The Track is preestablished and 203 installed by means outside of the scope of RAW; it may be strict or 204 loose depending on whether each or just a subset of the hops are 205 observed and controlled by RAW. 207 The RAW Architecture is based on an abstract OODA Loop (Observe, 208 Orient, Decide, Act). The generic concept involves: 210 1. Network Plane measurement protocols for Operations, 211 Administration and Maintenance (OAM) to Observe some or all hops 212 along a Track as well as the end-to-end packet delivery 214 2. Controller Plane elements that report the link statistics to a 215 Path Computation Element (PCE) in a centralized controller that 216 computes and installs the Tracks and provides metadata to Orient 217 the routing decision 219 3. A runtime distributed Path Selection Engine (PSE) that Decides 220 which subTrack to use for the next packet(s) that are routed 221 along the Track 223 4. Packet (hybrid) ARQ, Replication, Elimination and Ordering 224 Dataplane actions that operate at the DetNet Service Layer to 225 increase the reliability of the end-to-end transmissions.
The 226 RAW architecture also covers in-situ signalling when the decision 227 is Acted by a node that is down the Track from the PSE. 229 The overall OODA Loop optimizes the use of redundancy to achieve the 230 required reliability and availability Service Level Agreement (SLA) 231 while minimizing the use of constrained resources such as spectrum 232 and battery. 234 This document presents the RAW problem and associated terminology in 235 Section 2, and elaborates in Section 4 on the OODA loop based on the 236 RAW conceptual model presented in Section 3. 238 2. The RAW problem 240 2.1. Terminology 242 RAW reuses terminology defined for DetNet in the "Deterministic 243 Networking Architecture" [RFC8655], e.g., PREOF for Packet 244 Replication, Elimination and Ordering Functions. 246 RAW also reuses terminology defined for 6TiSCH in [6TiSCH-ARCHI] such 247 as the term Track. A Track associates a complex path with PAREO and 248 shaping operations. The concept is agnostic to the underlying 249 technology and applies to, but is not limited to, any fully or partially 250 wireless mesh. RAW specifies strict and loose Tracks depending on 251 whether the path is fully controlled by RAW or traverses an opaque 252 network where RAW cannot observe and control the individual hops. 254 RAW uses the following terminology and acronyms: 256 2.1.1. Acronyms 258 2.1.1.1. ARQ 260 Automatic Repeat Request, enabling an acknowledged transmission and 261 retries. ARQ is a typical model at Layer-2 on a wireless medium. 262 ARQ is typically implemented hop-by-hop and not end-to-end in 263 wireless networks. Otherwise, it introduces excessive indeterminism in 264 latency, though a limited number of retries within a bounded time may be 265 used within end-to-end constraints. 267 2.1.1.2. OAM 269 OAM stands for Operations, Administration, and Maintenance, and 270 covers the processes, activities, tools, and standards involved with 271 operating, administering, managing and maintaining any system.
This 272 document uses the terms Operations, Administration, and Maintenance, 273 in conformance with the 'Guidelines for the Use of the "OAM" Acronym 274 in the IETF' [RFC6291] and the system observed by the RAW OAM is the 275 Track. 277 2.1.1.3. OODA 279 Observe, Orient, Decide, Act. The OODA Loop is a conceptual cyclic 280 model developed by USAF Colonel John Boyd, and is applicable in 281 multiple domains where agility can provide benefits against brute 282 force. 284 2.1.1.4. PAREO 286 Packet (hybrid) ARQ, Replication, Elimination and Ordering. PAREO is 287 a superset of DetNet's PREOF that includes radio-specific techniques 288 such as short range broadcast, MU-MIMO, PHY rate and other Modulation 289 and Coding Scheme (MCS) adaptation, constructive interference and 290 overhearing, which can be leveraged separately or combined to 291 increase the reliability. 293 2.1.2. Link and Direction 295 2.1.2.1. Flapping 297 In the context of RAW, a link flaps when the reliability of the 298 wireless connectivity drops abruptly for a short period of time, 299 typically lasting from under a second to a few seconds. 301 2.1.2.2. Uplink 303 Connection from end-devices to data communication equipment. In 304 the context of wireless, uplink refers to the connection from a 305 station (STA) to a controller (AP) or from a User Equipment (UE) to a 306 Base Station (BS) such as a 3GPP 5G gNodeB (gNB). 308 2.1.2.3. Downlink 310 The reverse direction from uplink. 312 2.1.2.4. Downstream 314 Following the direction of the flow data path along a Track. 316 2.1.2.5. Upstream 318 Against the direction of the flow data path along a Track. 320 2.1.3. Path and Tracks 322 2.1.3.1. Path 324 Quoting section 1.1.3 of [INT-ARCHI]: 326 | At a given moment, all the IP datagrams from a particular source 327 | host to a particular destination host will typically traverse the 328 | same sequence of gateways. We use the term "path" for this 329 | sequence.
Note that a path is uni-directional; it is not unusual 330 | to have different paths in the two directions between a given host 331 | pair. 333 Section 2 of [I-D.irtf-panrg-path-properties] points to a longer, 334 more modern definition of path, which begins as follows: 336 | A sequence of adjacent path elements over which a packet can be 337 | transmitted, starting and ending with a node. A path is 338 | unidirectional. Paths are time-dependent, i.e., the sequence of 339 | path elements over which packets are sent from one node to another 340 | may change. A path is defined between two nodes. 342 It follows that the general acceptance of a path is a linear sequence 343 of nodes, as opposed to a multi-dimensional graph, defined by the 344 experience of the packet that went from a node A to a node B. 346 With DetNet and RAW, a packet may be duplicated, fragmented and 347 network-coded, and the various byproducts may travel different paths 348 that are not necessarily end-to-end between A and B; we refer to that 349 experience as a complex path. The complex path does not fit the 350 traditional description of a path, and is subject to change from a 351 packet to the next. This is why we introduce below the term of a 352 Track as the overall topology where the possible complex paths are 353 all contained. 355 In the context of this document, a path is observed by following one 356 copy or one fragment of a packet that conserves its uniqueness and 357 integrity. For instance, if C replicates to E and F and D eliminates 358 on the way from A to B, a packet from A to B experiences 2 paths, 359 A->C->E->D->B and A->C->F->D->B. 361 2.1.3.2. Track 363 A networking graph that can be followed to transport packets with 364 equivalent treatment; as opposed to the definition of a path above, a 365 Track represents not an experience but a potential, is not 366 necessarily a linear sequence, and is not necessarily fully traversed 367 (flooded) by all packets of a flow. 
It may contain multiple paths 368 that may overlap, fork and rejoin, for instance to enable the RAW 369 PAREO operations. 371 +---------+ 372 | IoT G/W | 373 +---------+ 374 EGR <=== Elimination at Egress 375 | | 376 /------/ \-------\ Wired backbone 377 | | 378 +--|--+ +--|--+ 379 | | | Backbone | | | Backbone 380 | | | Router | | | Router 381 +--|--+ +--|--+ 382 | | 383 o \ o / Track branch 384 o o o---o---o o o o o 385 \ o / o o o 386 o o \ / o low power lossy network 387 \/ o o o 388 o IN <=== Replication at Track Ingress 389 | 390 o <- source device 392 Figure 1: Example IoT Track to an IoT gateway with 1+1 redundancy 394 In DetNet [RFC8655] terms, a Track has the following properties: 396 * A Track is a layer-3 abstraction built upon P2P IP links between 397 routers. A router may form multiple P2P IP links over a single 398 radio interface. 400 * A Track has one Ingress and one Egress nodes, which operate as 401 DetNet Edge nodes. 403 * A Track is reversible, meaning that packets can be routed against 404 the flow of data packets, e.g., to carry OAM measurements or 405 control messages back to the Ingress. 407 * The vertices of the Track are DetNet Relay nodes that operate at 408 the DetNet Service sublayer and provide the PAREO functions. 410 * The topological edges of the graph are serial sequences of DetNet 411 Transit nodes that operate at the DetNet Forwarding sublayer. 413 2.1.3.3. SubTrack 415 A Track within a Track. The RAW PSE selects a subTrack on a per- 416 packet or a per-collection of packets basis to provide the desired 417 reliability for the transported flows. 419 2.1.3.4. Segment 421 A serial path formed by a topological edge of a Track. East-West 422 Segments are oriented from Ingress (East) to Egress (West). North/ 423 South Segments can be bidirectional; to avoid loops, measures must be 424 taken to ensure that a given packet flows either Northwards or 425 Southwards along a bidirectional Segment, but never bounces back. 427 2.1.4. 
Deterministic Networking 429 This document reuses the terminology in section 2 of [RFC8557] and 430 section 4.1.2 of [RFC8655] for deterministic networking and 431 deterministic networks. 433 2.1.4.1. Flow 435 A collection of consecutive IP packets defined by the upper layers 436 and signaled by the same 5 or 6-tuple, see section 5.1 of [RFC8939]. 437 Packets of the same flow must be placed on the same Track to receive 438 an equivalent treatment from Ingress to Egress within the Track. 439 Multiple flows may be transported along the same Track. The subTrack 440 that is selected for the flow may change over time under the control 441 of the PSE. 443 2.1.4.2. Deterministic Flow Identifier (L2) 445 A tuple identified by a stream_handle, and provided by a bridge, in 446 accordance with IEEE 802.1CB. The tuple comprises at least src MAC, 447 dst MAC, VLAN ID, and L2 priority. Continuous streams are 448 characterized by bandwidth and max packet size; scheduled streams are 449 characterized by a repeating pattern of timed transmissions. 451 2.1.4.3. Deterministic Flow Identifier (L3) 453 See section 3.3 of [DetNet-DP]. The classical IP 5-tuple that 454 identifies a flow comprises the src IP, dst IP, src port, dest port, 455 and the upper layer protocol (ULP). DetNet uses a 6-tuple where the 456 extra field is the DSCP field in the packet. The IPv6 flow label is 457 not used for that purpose. 459 2.1.4.4. TSN 461 TSN stands for Time Sensitive Networking and denotes the efforts at 462 IEEE 802 for deterministic networking, originally for use on 463 Ethernet. Wireless TSN (WTSN) denotes extensions of the TSN work on 464 wireless media such as the selected RAW technologies [RAW-TECHNOS]. 466 2.1.5. Reliability and Availability 468 In the context of the RAW work, Reliability and Availability are 469 defined as follows: 471 2.1.5.1. 
Service Level Agreement 473 In the context of RAW, an SLA (service level agreement) is a contract 474 between a provider, the network, and a client, the application flow, 475 about measurable metrics such as latency boundaries, consecutive 476 losses, and packet delivery ratio (PDR). 478 2.1.5.2. Service Level Objective 480 A service level objective (SLO) is one term in the SLA, for which 481 specific network settings and operations are implemented. For 482 instance, a dynamic tuning of the packet redundancy will address an 483 SLO of consecutive losses in a row by augmenting the chances of 484 delivery of a packet that follows a loss. 486 2.1.5.3. Service Level Indicator 488 A service level indicator (SLI) measures the compliance of an SLO to 489 the terms of the contract. It can be for instance the statistics of 490 individual losses and losses in a row as time series. 492 2.1.5.4. Reliability 494 Reliability is a measure of the probability that an item will perform 495 its intended function for a specified interval under stated 496 conditions (SLA). RAW expresses reliability in terms of Mean Time 497 Between Failures (MTBF) and Maximum Consecutive Failures (MCF). More 498 in [NASA]. 500 2.1.5.5. Available 502 Exempt from unscheduled outage or deviation from the terms of 503 the SLA. A basic expectation for a RAW network is that the flow is 504 maintained in the face of any single breakage or flapping. 506 2.1.5.6. Availability 508 Availability is a measure of the relative amount of time where a RAW 509 Network operates under stated conditions (SLA), expressed as 510 (uptime)/(uptime+downtime). Because a serial wireless path may not 511 be good enough to provide the required reliability, and even 2 512 parallel paths may not be over a longer period of time, the RAW 513 availability implies a journey that is a lot more complex than 514 following a serial path. 516 2.1.6. OAM variations 518 2.1.6.1. Active OAM 520 See [RFC7799].
In the context of RAW, Active OAM is used to observe 521 a particular Track, subTrack, or Segment of a Track regardless of 522 whether it is used for traffic at that time. 524 2.1.6.2. In-Band OAM 526 An active OAM packet is considered in-band for the monitored Track 527 when it traverses the same set of links and interfaces and 528 receives the same QoS and PAREO treatment as the packets of 529 the data flows that are injected in the Track. 531 2.1.6.3. Out-of-Band OAM 533 Out-of-band OAM is an active OAM whose path is not topologically 534 congruent to the Track, or whose test packets receive a QoS and/or 535 PAREO treatment that is different from that of the packets of the 536 data flows that are injected in the Track, or both. 538 2.1.6.4. Limited OAM 540 An active OAM packet is a Limited OAM packet when it observes the RAW 541 operation over a node, a segment, or a subTrack of the Track, though 542 not from Ingress to Egress. It is injected in the datapath and 543 extracted from the datapath around the particular function or 544 subnetwork (e.g., around a relay providing a service layer 545 replication point) that is being tested. 547 2.1.6.5. Upstream OAM 549 An upstream OAM packet is an Out-of-Band OAM packet that traverses 550 the Track from egress to ingress in the reverse direction, to capture 551 and report OAM measurements upstream. The collection may capture all 552 information along the whole Track, or it may only learn select data 553 across all, or only a particular subTrack, or Segment of a Track. 555 2.1.6.6. Residence Time 557 A residence time (RT) is defined as the time period between the 558 moment the reception of a packet starts and the moment the transmission of the packet 559 begins. In the context of RAW, RT is useful for a transit node, not 560 ingress or egress. 562 2.1.6.7.
Additional References 564 [DetNet-OAM] provides additional terminology related to OAM in the 565 context of DetNet and by extension of RAW, whereas [RFC7799] defines 566 the Active, Passive, and Hybrid OAM methods. 568 2.2. Reliability and Availability 570 2.2.1. High Availability Engineering Principles 572 The reliability criteria of a critical system pervade its 573 elements, and if the system comprises a data network then the data 574 network is also subject to the inherited reliability and availability 575 criteria. It is only natural to consider the art of high 576 availability engineering and apply it to wireless communications in 577 the context of RAW. 579 There are three principles [pillars] of high availability 580 engineering: 582 1. elimination of single points of failure 583 2. reliable crossover 584 3. prompt detection of failures as they occur. 586 These principles are common to all high availability systems, not 587 just ones with Internet technology at the center. Examples of both 588 non-Internet and Internet are included. 590 2.2.1.1. Elimination of Single Points of Failure 592 Physical and logical components in a system happen to fail, either as 593 the effect of wear and tear, when used beyond acceptable limits, or 594 due to a software bug. It is necessary to decouple component failure 595 from system failure to avoid the latter. This allows failed 596 components to be restored while the rest of the system continues to 597 function. 599 IP Routers leverage routing protocols to compute alternate routes in 600 case of a failure. There is a rather open-ended issue over alternate 601 routes -- for example, when links are cabled through the same 602 conduit, they form a shared risk link group (SRLG), and will share 603 the same fate if the bundle is cut. The same effect can happen with 604 virtual links that end up in the same physical transport as a result of 605 encapsulation.
Similarly, an interferer or an 606 obstacle may affect multiple wireless transmissions at the same time, 607 even between different sets of peers. 609 Intermediate network Nodes such as routers, switches and APs, wire 610 bundles and the air medium itself can become single points of 611 failure. For High Availability, it is thus required to use 612 physically link- and Node-disjoint paths; in the wireless space, it 613 is also required to use the highest possible degree of diversity 614 (time, space, code, frequency, channel width) in the transmissions 615 over the air to combat the additional causes of transmission loss. 617 From an economics standpoint, executing this principle properly 618 generally increases capital expense because of the redundant 619 equipment. In a constrained network where the waste of energy and 620 bandwidth should be minimized, an excessive use of redundant links 621 must be avoided; for RAW this means that the extra bandwidth must be 622 used wisely and with parsimony. 624 2.2.1.2. Reliable Crossover 626 Having backup equipment has limited value unless it can be 627 reliably switched into use within the down-time parameters. IP 628 Routers execute reliable crossover continuously because the routers 629 will use any alternate routes that are available [RFC0791]. This is 630 due to the stateless nature of IP datagrams and the dissociation of 631 the datagrams from the forwarding routes they take. The "IP Fast 632 Reroute Framework" [FRR] analyzes mechanisms for fast failure 633 detection and path repair for IP Fast-Reroute, and discusses the case 634 of multiple failures and SRLG. Examples of FRR techniques include 635 Remote Loop-Free Alternate [RLFA-FRR] and backup label-switched path 636 (LSP) tunnels for the local repair of LSP tunnels using RSVP-TE 637 [RFC4090]. 639 Deterministic flows, on the contrary, are attached to specific paths 640 where dedicated resources are reserved for each flow.
This is why 641 each DetNet path must inherently provide sufficient redundancy to 642 deliver the guaranteed SLA at all times. The DetNet PREOF typically 643 leverages 1+1 redundancy whereby a packet is sent twice, over non- 644 congruent paths. This avoids the gap during the fast reroute 645 operation, but doubles the traffic in the network. 647 In the case of RAW, the expectation is that multiple transient faults 648 may happen in overlapping time windows, in which case the 1+1 649 redundancy with delayed reestablishment of the second path will not 650 provide the required guarantees. The Data Plane must be configured 651 with a sufficient degree of redundancy to select an alternate 652 redundant path immediately upon a fault, without the need for a slow 653 intervention from the controller plane. 655 2.2.1.3. Prompt Notification of Failures 657 The execution of the two above principles is likely to yield a 658 system where the user will rarely see a failure. But someone needs 659 to see failures in order to direct maintenance. 661 There are many reasons for system monitoring (FCAPS for fault, 662 configuration, accounting, performance, security is a handy mental 663 checklist) but fault monitoring is a sufficient reason. 665 "An Architecture for Describing Simple Network Management Protocol 666 (SNMP) Management Frameworks" [STD 62] describes how to use SNMP to 667 observe and correct long-term faults. 669 "Overview and Principles of Internet Traffic Engineering" [TE] 670 discusses the importance of measurement for network protection, and 671 provides an abstract method for network survivability with the 672 analysis of a traffic matrix as observed by SNMP, probing techniques, 673 FTP, IGP link state advertisements, and more. 675 Those measurements are needed in the context of RAW to inform the 676 controller and make the long-term reactive decision to rebuild a 677 complex path based on statistical and aggregated information.
RAW 678 itself operates in the Network Plane at a faster time scale with live 679 information on speed, state, etc. This live information can be 680 obtained directly from the lower layer, e.g., using L2 triggers, read 681 from a protocol such as the Dynamic Link Exchange Protocol (DLEP) 682 [DLEP], or transported over multiple hops using OAM and reverse OAM, 683 as illustrated in Figure 6. 685 2.2.2. Applying Reliability Concepts to Networking 687 The terms Reliability and Availability are defined for use in RAW in 688 Section 2.1 and the reader is invited to read [NASA] for more details 689 on the general definition of Reliability. Practically speaking, a 690 number of nines is often used to indicate the reliability of a data 691 link, e.g., 5 nines indicate a Packet Delivery Ratio (PDR) of 692 99.999%. 694 This number is typical in a wired environment where the loss is due 695 to a random event such as a solar particle that affects the 696 transmission of a particular frame, but does not affect the previous 697 or next frame, nor frames transmitted on other links. Note that the 698 QoS requirements in RAW may include a bounded latency, and a packet 699 that arrives too late is a fault and not considered as delivered. 701 For a periodic networking pattern such as an automation control loop, 702 this number is proportional to the Mean Time Between Failures (MTBF). 703 When a single fault can have dramatic consequences, the MTBF 704 expresses the chances that the unwanted fault event occurs. In data 705 networks, this is rarely the case. Packet loss can never be fully 706 avoided and the systems are built to resist one loss, e.g., using 707 redundancy with Retries (HARQ) or Packet Replication and Elimination 708 (PRE), or, in a typical control loop, by linear interpolation from 709 the previous measurements.
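The relation between a number of nines, the PDR, and the time between losses in a periodic flow can be sketched as follows. This is a minimal illustration and not part of the RAW architecture: the function names are hypothetical, losses are assumed to be independent (which the text notes is typical of wired links only), and the "linear interpolation from the previous measurements" is modeled as a straight-line extrapolation from the two previous, equally spaced samples.

```python
def pdr_from_nines(nines: int) -> float:
    """Packet Delivery Ratio for a number of nines, e.g., 5 -> 99.999%."""
    return 1.0 - 10.0 ** (-nines)


def mean_time_between_losses(period_s: float, nines: int) -> float:
    """Mean time between two lost packets for a periodic flow.

    Assumption (not from the text): losses are independent, so on
    average one packet in 10**nines is lost.
    """
    return period_s * 10 ** nines


def extrapolate_lost_sample(older: float, newer: float) -> float:
    """Estimate one missing sample by continuing the straight line
    through the two previous, equally spaced measurements."""
    return newer + (newer - older)


if __name__ == "__main__":
    # A control loop sending one packet every 10 ms over a 5-nines link.
    print(f"PDR at 5 nines: {pdr_from_nines(5):.5%}")
    print(f"Mean time between losses: {mean_time_between_losses(0.01, 5):.0f} s")
    # The sample after 20.0 and 20.5 was lost; estimate it.
    print(f"Estimate for the lost sample: {extrapolate_lost_sample(20.0, 20.5)}")
```

Under these assumptions the time between losses grows tenfold with each additional nine, which is why the number of nines can stand in for an MTBF when the traffic is periodic.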
711 But the linear interpolation method cannot resist multiple 712 consecutive losses, and a high MTBF is desired as a guarantee that 713 this will not happen, in other words, that the number of losses-in-a-row can be 714 bounded. In that case, what is really desired is a Maximum 715 Consecutive Failures (MCF). If the number of losses in a row passes 716 the MCF, the control loop has to abort and the system, e.g., the 717 production line, may need to enter an emergency stop condition. 719 Engineers that build automated processes may use the network 720 reliability expressed in nines or as an MTBF as a proxy to indicate 721 an MCF, e.g., as described in section 7.4 of the "Deterministic 722 Networking Use Cases" [RFC8578]. 724 2.2.3. Wireless Effects Affecting Reliability 726 In contrast with wired networks, errors in transmission are the 727 predominant source of packet loss in wireless networks. 729 The root cause for the loss may be of multiple origins, calling for 730 the use of different forms of diversity: 732 Multipath Fading A destructive interference by a reflection of the 733 original signal. 735 A radio signal may be received directly (line-of-sight) and/or as 736 a reflection on a physical structure (echo). The reflections take 737 a longer path and are delayed by the extra distance divided by the 738 speed of light in the medium. Depending on the frequency, the 739 echo lands with a different phase which may add up to 740 (constructive interference) or cancel the direct signal 741 (destructive interference). 743 The affected frequencies depend on the relative position of the 744 sender, the receiver, and all the reflecting objects in the 745 environment. A given hop will suffer from multipath fading for 746 multiple packets in a row until a physical movement changes the 747 reflection patterns. 749 Co-channel Interference Energy in the spectrum used for the 750 transmission confuses the receiver.
752 The wireless medium itself is a Shared Risk Link Group (SRLG) for 753 nearby users of the same spectrum, as an interference may affect 754 multiple co-channel transmissions between different peers within 755 the interference domain of the interferer, possibly even when they 756 use different technologies. 758 Obstacle in Fresnel Zone The optimal transmission happens when the 759 Fresnel Zone between the sender and the receiver is free of 760 obstacles. 762 As long as a physical object (e.g., a metallic trolley between 763 peers) that affects the transmission is not removed, the quality 764 of the link is affected. 766 In an environment that is rich in metallic structures and mobile 767 objects, a single radio link will provide a fuzzy service, meaning 768 that it cannot be trusted to transport the traffic reliably over a 769 long period of time. 771 Transmission losses are typically not independent, and their nature 772 and duration are unpredictable; as long as a physical object (e.g., a 773 metallic trolley between peers) that affects the transmission is not 774 removed, or as long as the interferer (e.g., a radar) keeps 775 transmitting, a continuous stream of packets will be affected. 777 The key technique to combat those unpredictable losses is diversity. 778 Different forms of diversity are necessary to combat different causes 779 of loss and the use of diversity must be maximized to optimize the 780 PDR. 782 A single packet may be sent at different times (time diversity) over 783 diverse paths (spatial diversity) that rely on diverse radio channels 784 (frequency diversity) and diverse PHY technologies, e.g., narrowband 785 vs. spread spectrum, or diverse codes. Using time diversity will 786 defeat short-term interferences; spatial diversity combats very local 787 causes such as multipath fading; narrowband and spread spectrum are 788 relatively innocuous to one another and can be used for diversity in 789 the presence of the other. 791 2.3.
Routing Time Scale vs. Forwarding Time Scale 793 With DetNet, the Controller Plane Function that handles the routing 794 computation and maintenance (the PCE) can be centralized and can 795 reside outside the network. In a wireless mesh, the path to the PCE 796 can be expensive and slow, possibly going across the whole mesh and 797 back. Reaching the PCE can also be slow with regard to the speed 798 of events that affect the forwarding operation at the radio layer. 800 Due to that cost and latency, the Controller Plane is not expected to 801 be sensitive/reactive to transient changes. The abstraction of a 802 link at the routing level is expected to use statistical metrics that 803 aggregate the behavior of a link over long periods of time, and 804 represent its properties as shades of gray as opposed to numerical 805 values such as a link quality indicator, or a boolean value for 806 either up or down. 808 +----------------+ 809 | Controller | 810 | [PCE] | 811 +----------------+ 812 ^ 813 | 814 Slow 815 | 816 _-._-._-._-._-._-. | ._-._-._-._-._-._-._-._-._-._-._-._- 817 _-._-._-._-._-._-._-. | _-._-._-._-._-._-._-._-._-._-._-._- 818 | 819 Expensive 820 | 821 .... | ....... 822 .... . | . ....... 823 .... v ... 824 .. A-------B-------C---D .. 825 ... / \ / \ .. 826 . I ----M-------N--***-- E .. 827 .. \ / / ... 828 .. P--***--Q-----M---R .... 829 .. .... 830 . <----- Fast -------> .... 831 ....... .... 832 ................. 834 *** = flapping at this time 836 Figure 2: Time Scales 838 In the case of wireless, the changes that affect the forwarding 839 decision can happen frequently and often for short durations, e.g., a 840 mobile object moves between a transmitter and a receiver, and will 841 cancel the line of sight transmission for a few seconds, or a radar 842 measures the depth of a pool and interferes on a particular channel 843 for a split second.
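The gap between the two time scales can be illustrated with a small sketch. It is only an illustration under stated assumptions: the per-packet trace, the function names, and the choice of a plain average as the long-term metric are hypothetical and are not defined by this document.

```python
def long_term_pdr(outcomes):
    """Routing-scale view: delivery ratio aggregated over the whole trace."""
    return sum(outcomes) / len(outcomes)


def current_loss_run(outcomes):
    """Forwarding-scale view: number of packets lost in a row right now."""
    run = 0
    for delivered in reversed(outcomes):
        if delivered:
            break
        run += 1
    return run


if __name__ == "__main__":
    # Hypothetical trace of one link: 995 packets delivered, then an
    # obstacle blocks the line of sight and the last 5 packets are lost.
    trace = [True] * 995 + [False] * 5
    print(f"Long-term PDR reported to the controller: {long_term_pdr(trace):.3f}")
    print(f"Losses in a row seen at the radio layer: {current_loss_run(trace)}")
```

The aggregated figure barely moves during the transient, while the per-packet view shows a burst of consecutive losses; this is the kind of event that a slow, centralized computation cannot be expected to react to.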
845 There is thus a desire to separate the long term computation of the 846 route and the short term forwarding decision. In that model, the 847 routing operation computes a complex Track that enables multiple Non- 848 Equal Cost Multi-Path (N-ECMP) forwarding solutions, and leaves it to 849 the Data Plane to make the per-packet decision of which of these 850 possibilities should be used. 852 In the wired world, and more specifically in the context of Traffic 853 Engineering (TE), an alternate path can be used upon the detection of 854 a failure in the main path, e.g., using OAM in MPLS-TP or BFD over a 855 collection of SD-WAN tunnels. RAW formalizes a forwarding time scale 856 that is an order(s) of magnitude shorter than the controller plane 857 routing time scale, and separates the protocols and metrics that are 858 used at both scales. Routing can operate on long term statistics 859 such as delivery ratio over minutes to hours, but as a first 860 approximation can ignore flapping. On the other hand, the RAW 861 forwarding decision is made at the scale of the packet rate, and uses 862 information that must be pertinent at the present time for the 863 current transmission(s). 865 3. The RAW Conceptual Model 867 RAW inherits the conceptual model described in section 4 of the 868 DetNet Architecture [RFC8655]. RAW extends the DetNet service layer 869 to provide additional agility against transmission loss. 871 A RAW Network Plane may be strict or loose, depending on whether RAW 872 observes and takes actions on all hops or not. For instance, the 873 packets between two wireless entities may be relayed over a wired 874 infrastructure such as a Wi-Fi extended service set (ESS) or a 5G 875 Core; in that case, RAW observes and controls the transmission over 876 the wireless first and last hops, as well as end-to-end metrics such 877 as latency, jitter, and delivery ratio. 
This operation is loose 878 since the structure and properties of the wired infrastructure are 879 ignored, and may be either controlled by other means such as DetNet/ 880 TSN, or neglected in the face of the wireless hops. 882 A Controller Plane Function (CPF) called the Path Computation Element 883 (PCE) [RFC4655] interacts with RAW Nodes over a Southbound API. The 884 RAW Nodes are DetNet relays that are capable of additional diversity 885 mechanisms and measurement functions related to the radio interface, 886 in particular the PAREO diversity mechanisms. 888 The PCE defines a complex Track between an Ingress End System and an 889 Egress End System, and indicates to the RAW Nodes where the PAREO 890 operations may be actioned in the Network Plane. The Track may be 891 expressed loosely to enable traversing a non-RAW subnetwork. In that 892 case, the expectation is that the non-RAW subnetwork can be neglected 893 in the RAW computation, that is, considered infinitely fast, reliable 894 and/or available in comparison with the links between RAW nodes. 896 CPF CPF CPF CPF 898 Southbound API 899 _-._-._-._-._-._-._-._-._-._-._-._-._-._-._-._-._-._-._-._-._-._- 900 _-._-._-._-._-._-._-._-._-._-._-._-._-._-._-._-._-._-._-._-._-._- 902 RAW --/ RAW --/ RAW --/ RAW 903 /-- Node /-- Node /-- Node /-- Node --/ 904 Ingress --/ / / /-- Egress 905 End / / .. . End 906 Node ---/ / / .. .. . /-- Node 907 /-- RAW --/ RAW ( non-RAW ) -- RAW --/ 908 Node /-- Node --- ( Nodes ) Node 909 ... . 910 --/ wireless wired 911 /-- link --- link 913 Figure 3: RAW Nodes 915 The Link-Layer metrics are reported to the PCE in a time-aggregated, 916 e.g., statistical fashion.
Example Link-Layer metrics include 917 typical Link bandwidth (the medium speed depends dynamically on the 918 PHY mode), number of flows (bandwidth that can be reserved for a 919 flow depends on the number and size of flows sharing the spectrum) 920 and average and mean squared deviation of availability and 921 reliability figures such as Packet Delivery Ratio (PDR) over long 922 periods of time. 924 Based on those metrics, the PCE installs the Track with enough 925 redundant forwarding solutions to ensure that the Network Plane can 926 reliably deliver the packets within a Service Level Agreement (SLA) 927 associated with the flows that it transports. The SLA defines end-to- 928 end reliability and availability requirements, where reliability may 929 be expressed as a successful delivery in order and within a bounded 930 delay of at least one copy of a packet. 932 Depending on the use case and the SLA, the Track may comprise non-RAW 933 segments, either interleaved inside the Track, or all the way to the 934 Egress End Node (e.g., a server in the Internet). RAW observes the 935 Lower-Layer Links between RAW nodes (typically, radio links) and the 936 end-to-end Network Layer operation to decide at all times which of 937 the PAREO diversity schemes is actioned by which RAW Nodes. 939 Once a Track is established, per-segment and end-to-end reliability 940 and availability statistics are periodically reported to the PCE to 941 assure that the SLA can be met or have it recompute the Track if not. 943 4. The OODA Loop 945 The RAW Architecture is structured as an OODA Loop (Observe, Orient, 946 Decide, Act). It involves: 948 1. Network Plane measurement protocols for Operations, 949 Administration and Maintenance (OAM) to Observe some or all hops 950 along a Track as well as the end-to-end packet delivery, more in 951 Section 4.1; 953 2.
Controller plane elements to report the link statistics to a 954 Path Computation Element (PCE) in a centralized controller that 955 computes and installs the Tracks and provides meta data to Orient 956 the routing decision, more in Section 4.2; 958 3. A Runtime distributed Path Selection Engine (PSE) that Decides 959 which subTrack to use for the next packet(s) that are routed 960 along the Track, more in Section 4.3; 962 4. Packet (hybrid) ARQ, Replication, Elimination and Ordering 963 Dataplane actions that operate at the DetNet Service Layer to 964 increase the reliability of the end-to-end transmission. The RAW 965 architecture also covers in-situ signalling when the decision is 966 Acted by a node that is down the Track from the PSE, more in 967 Section 4.4.
969     +-------> Orient (PCE) --------+
970     |      link stats,             |
971     |     pre-trained model        |
972     |         ...                  |
973     |                              v
974 Observe (OAM)                Decide (PSE)
975     ^                              |
976     |                              |
977     |                              |
978     +-------- Act (PAREO) <--------+
979               At DetNet
980            Service sublayer
982 Figure 4: The RAW OODA Loop
984 The overall OODA Loop optimizes the use of redundancy to achieve the 985 required reliability and availability Service Level Agreement (SLA) 986 while minimizing the use of constrained resources such as spectrum 987 and battery. 989 4.1. Observe: The RAW OAM 991 RAW In-situ OAM operation in the Network Plane may observe either a 992 full Track or subTracks that are being used at this time. As packets 993 may be load balanced, replicated, eliminated, and / or fragmented for 994 Network Coding (NC) forward error correction (FEC), the RAW In-situ 995 operation needs to be able to signal which operation occurred to an 996 individual packet. 998 Active RAW OAM may be needed to observe the unused segments and 999 evaluate the desirability of a rerouting decision.
1001 Finally, the RAW Service Layer Assurance may observe the individual 1002 PAREO operation of a relay node to ensure that it is conforming; this 1003 might require injecting an OAM packet at an upstream point inside the 1004 Track and extracting that packet at another point downstream before 1005 it reaches the egress. 1007 This observation feeds the RAW PSE that makes the decision on which 1008 PAREO function is actioned at which RAW Node, for one or a small 1009 continuous series of packets. 1011 ... .. 1012 RAN 1 ----- ... .. ... 1013 / . .. .... 1014 +-------+ / . .. .... +------+ 1015 |Ingress|- . ..... |Egress| 1016 | End |------ RAN 2 -- . Internet ....---| End | 1017 |System |- .. ..... |System| 1018 +-------+ \ . ...... +------+ 1019 \ ... ... ..... 1020 RAN n -------- ... ..... 1022 <------------------> <--------------------> 1023 Observed by OAM Opaque to OAM 1025 Figure 5: Observed Links in Radio Access Protection 1027 In the case of End-to-End Protection in a Wireless Mesh, the Track 1028 is strict and congruent with the path so all links are observed. 1030 Conversely, in the case of Radio Access Protection illustrated in 1031 Figure 5, the Track is Loose and only the first hop is observed; the 1032 rest of the path is abstracted and considered infinitely reliable. 1033 The loss of a packet is attributed to the first hop Radio Access 1034 Network (RAN), even if a particular loss effectively happens farther 1035 down the path. In that case, RAW enables technology diversity (e.g., 1036 Wi-Fi and 5G) which in turn improves the diversity in spectrum usage. 1038 The Links that are not observed by OAM are opaque to it, meaning that 1039 the OAM information is carried across and possibly echoed as data, 1040 but there is no information capture in intermediate nodes.
In the 1041 example above, the Internet is opaque and not controlled by RAW; 1042 still the RAW OAM measures the end-to-end latency and delivery ratio 1043 for packets sent via each of RAN 1, RAN 2 and RAN n, and determines 1044 whether a packet should be sent over either or a collection of those 1045 access links. 1047 4.2. Orient: The Path Computation Element 1049 RAW separates the long time scale at which a Track is elaborated and 1050 installed, from the short time scale at which the forwarding decision 1051 is taken for one or a few packets (see Section 2.3) that will 1052 experience the same path until the network conditions evolve and 1053 another path is selected within the same Track. 1055 The Track computation is out of scope, but RAW expects that the 1056 Controller plane protocol that installs the Track also provides 1057 related knowledge in the form of meta data about the links, segments 1058 and possible subTracks. That meta data can be a pre-digested 1059 statistical model, and may include prediction of future flaps and 1060 packet loss, as well as recommended actions when that happens. 1062 The meta data may include: 1064 * Pre-Determined subTracks to match predictable error profiles 1066 * Pre-Trained models 1068 * Link Quality Statistics and their projected evolution 1070 The Track is installed with measurable objectives that are computed 1071 by the PCE to achieve the RAW SLA. The objectives can be expressed 1072 as any of maximum number of packets lost in a row, bounded latency, 1073 maximal jitter, maximum number of interleaved out of order packets, 1074 average number of copies received at the elimination point, and 1075 maximal delay between the first and the last received copy of the 1076 same packet. 1078 4.3. Decide: The Path Selection Engine 1080 The RAW OODA Loop operates at the path selection time scale to 1081 provide agility vs. the brute force approach of flooding the whole 1082 Track.
The OODA Loop controls, within the redundant solutions that 1083 are proposed by the PCE, which will be used for each packet to 1084 provide a Reliable and Available service while minimizing the waste 1085 of constrained resources. 1087 To that effect, RAW defines the Path Selection Engine (PSE) that is 1088 the counterpart of the PCE to perform rapid local adjustments of the 1089 forwarding tables within the diversity that the PCE has selected for 1090 the Track. The PSE makes it possible to exploit the richer forwarding 1091 capabilities with PAREO and scheduled transmissions at a faster time 1092 scale over the smaller domain that is the Track, in either a loose or 1093 a strict fashion. 1095 Compared to the PCE, the PSE operates on metrics that evolve faster 1096 and that need to be advertised at a fast rate, but only locally, 1097 within the Track. The forwarding decision may also change rapidly, 1098 but with a scope that is also contained within the Track, with no 1099 visibility to the other Tracks and flows in the network. This is in 1100 contrast to the PCE that must observe the whole network and optimize 1101 all the Tracks globally, which can only be done at a slow pace and 1102 using long-term statistical metrics, as presented in Table 1.
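The per-packet choice that the PSE makes within the alternatives installed by the PCE can be sketched as follows. This is a simplified illustration, not a mechanism specified by this document: the function name, the representation of a subTrack as an ordered list of link names, the boolean link state, and the `copies` parameter are all assumptions made for the example.

```python
def select_subtracks(subtracks, link_up, copies=1):
    """Pick the subTracks to use for the next packet(s).

    subtracks: list of (name, [link, ...]) in the preference order given
               by the PCE (hypothetical representation).
    link_up:   dict mapping a link name to its instant state (True = up),
               as learned locally, e.g., from L2 triggers or OAM.
    copies:    how many subTracks to use in parallel for replication.
    """
    usable = [
        name
        for name, links in subtracks
        # A link with no known state is conservatively treated as down.
        if all(link_up.get(link, False) for link in links)
    ]
    return usable[:copies]


if __name__ == "__main__":
    # Two alternatives pre-installed by the PCE within one Track.
    subtracks = [
        ("upper", ["S-A", "A-C", "C-E"]),
        ("lower", ["S-B", "B-D", "D-F"]),
    ]
    # Instant link state; link A-C is flapping at this time.
    link_up = {"S-A": True, "A-C": False, "C-E": True,
               "S-B": True, "B-D": True, "D-F": True}
    print(select_subtracks(subtracks, link_up))
```

The sketch only consults state that is local to the Track, and it returns an empty list when no alternative is usable, leaving open what a real PSE would do in that case.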
1104 +===============+========================+===================+
1105 |               | PCE (Not in Scope)     | PSE (In Scope)    |
1106 +===============+========================+===================+
1107 | Operation     | Centralized            | Source-Routed or  |
1108 |               |                        | Distributed       |
1109 +---------------+------------------------+-------------------+
1110 | Communication | Slow, expensive        | Fast, local       |
1111 +---------------+------------------------+-------------------+
1112 | Time Scale    | hours and above        | seconds and below |
1113 +---------------+------------------------+-------------------+
1114 | Network Size  | Large, many Tracks to  | Small, within one |
1115 |               | optimize globally      | Track             |
1116 +---------------+------------------------+-------------------+
1117 | Considered    | Averaged, Statistical, | Instant values /  |
1118 | Metrics       | Shade of grey          | boolean condition |
1119 +---------------+------------------------+-------------------+
1121 Table 1: PCE vs. PSE
1123 The PSE sits in the DetNet Service sub-Layer of Edge and Relay Nodes. 1124 On the one hand, it operates on the packet flow, learning the Track 1125 and path selection information from the packet, possibly making a local 1126 decision and retagging the packet to indicate so. On the other hand, 1127 the PSE interacts with the lower layers and with its peers to obtain 1128 up-to-date information about its radio links and the quality of the 1129 overall Track, respectively, as illustrated in Figure 6.
1131 | 1132 packet | going 1133 down the | stack 1134 +==========v==========+=====================+=====================+ 1135 | (iOAM + iCTRL) | (L2 Triggers, DLEP) | (oOAM) | 1136 +==========v==========+=====================+=====================+ 1137 | Learn from Learn from | 1138 | packet tagging Maintain end-to-end | 1139 +----------v----------+ Forwarding OAM packets | 1140 | Forwarding decision < State +---------^-----------| 1141 +----------v----------+ | Enrich or | 1142 + Retag Packet | Learn abstracted > Regenerate | 1143 | and Forward | metrics about Links | OAM packets | 1144 +..........v..........+..........^..........+.........^.v.........+ 1145 | Lower layers | 1146 +..........v.....................^....................^.v.........+ 1147 frame | sent Frame | L2 Ack oOAM | | packet 1148 over | wireless In | In | | and out 1149 v | | v 1151 Figure 6: PSE 1153 4.4. Act: The PAREO Functions 1155 RAW may control whether and how to use packet replication and 1156 elimination (PRE), fragmentation, and network coding, and how the 1157 lower layers perform Automatic Repeat reQuest (ARQ), Hybrid ARQ 1158 (HARQ) that includes Forward Error Correction (FEC), and other 1159 wireless-specific techniques such as overhearing and constructive 1160 interference, in order to increase the reliability and availability 1161 of the end-to-end transmission. 1163 Collectively, those functions are called PAREO for Packet (hybrid) 1164 ARQ, Replication, Elimination and Ordering. By tuning dynamically 1165 the use of PAREO functions, RAW avoids the waste of critical 1166 resources such as spectrum and energy while providing the 1167 guaranteed SLA, e.g., by adding redundancy only when a spike of loss 1168 is observed. 1170 In a nutshell, PAREO establishes several paths in a network to 1171 provide redundancy and parallel transmissions to bound the end-to-end 1172 delay to traverse the network.
Optionally, promiscuous listening 1173 between paths is possible, such that the Nodes on one path may 1174 overhear transmissions along the other path. Considering the 1175 scenario shown in Figure 7, many different paths are possible to 1176 traverse the network from ingress to egress. A simple way to benefit 1177 from this topology could be to use the two independent paths via 1178 Nodes A, C, E and via B, D, F. But more complex paths are possible 1179 by interleaving transmissions from the lower level of the path to the 1180 upper level. 1182 (A) -- (C) -- (E) 1183 / \ 1184 Ingress = | | | = Egress 1185 \ / 1186 (B) -- (D) -- (F) 1188 Figure 7: A Ladder Shape with Two Parallel Paths 1190 PAREO may also take advantage of the shared properties of the 1191 wireless medium to compensate for the potential loss that is incurred 1192 with radio transmissions. 1194 For instance, when the source sends to Node A, Node B may listen 1195 promiscuously and get a second chance to receive the frame without an 1196 additional transmission. Note that B would not have to listen if it 1197 already received that particular frame at an earlier timeslot in a 1198 dedicated transmission towards B. 1200 The PAREO model can be implemented in both centralized and 1201 distributed scheduling approaches. In the centralized approach, a 1202 Path Computation Element (PCE) scheduler calculates a Track and 1203 schedules the communication. In the distributed approach, the Track 1204 is computed within the network, and signaled in the packets, e.g., 1205 using BIER-TE, Segment Routing, or a Source Routing Header. 1207 4.4.1. Packet Replication 1209 By employing a Packet Replication procedure, a Node forwards a copy 1210 of each data packet to more than one successor. To do so, each Node 1211 (i.e., Ingress and intermediate Node) sends the data packet multiple 1212 times as separate unicast transmissions. 
For instance, in Figure 8, 1213 the Ingress Node is transmitting the packet to both successors, nodes 1214 A and B, at two different times.
1216               ===> (A) => (C) => (E) ===
1217             //        \\//   \\//       \\
1218     Ingress           //\\   //\\         Egress
1219             \\       //  \\ //  \\      //
1220               ===> (B) => (D) => (F) ===
1222 Figure 8: Packet Replication
1224 An example schedule is shown in Table 2. This way, the transmission 1225 leverages the time and spatial forms of diversity.
1227 +=========+======+======+======+======+======+======+======+
1228 | Channel | 0    | 1    | 2    | 3    | 4    | 5    | 6    |
1229 +=========+======+======+======+======+======+======+======+
1230 | 0       | S->A | S->B | B->C | B->D | C->F | E->R | F->R |
1231 +---------+------+------+------+------+------+------+------+
1232 | 1       |      | A->C | A->D | C->E | D->E | D->F |      |
1233 +---------+------+------+------+------+------+------+------+
1235 Table 2: Packet Replication: Sample schedule
1237 4.4.2. Packet Elimination 1239 The replication operation increases the traffic load in the network, 1240 due to packet duplications. This may occur at several stages inside 1241 the Track, and to avoid an explosion of the number of copies, a 1242 Packet Elimination procedure must be applied as well. To this aim, 1243 once a Node receives the first copy of a data packet, it discards the 1244 subsequent copies. 1246 The logical functions of Replication and Elimination may be 1247 collocated in an intermediate Node, the Node first eliminating the 1248 redundant copies and then sending the packet exactly once to each of 1249 the selected successors. 1251 4.4.3. Promiscuous Overhearing 1253 Considering that the wireless medium is broadcast by nature, any 1254 neighbor of a transmitter may overhear a transmission. By employing 1255 the Promiscuous Overhearing operation, the next hops have additional 1256 opportunities to capture the data packets.
In Figure 9, when Node A 1257 is transmitting to its DP (Node C), the AP (Node D) and its sibling 1258 (Node B) may decode this data packet as well. As a result, by 1259 employing correlated paths, a Node may have multiple opportunities to 1260 receive a given data packet.
1262               ===> (A) ====> (C) ====> (E) ====
1263             //     ^ | \\                      \\
1264     Ingress        | |   \\                      Egress
1265             \\     | v     \\                  //
1266               ===> (B) ====> (D) ====> (F) ====
1268 Figure 9: Unicast with Overhearing
1270 Variations on the same idea such as link-layer anycast and multicast 1271 may also be used to reach more than one next-hop with a single frame. 1273 4.4.4. Constructive Interference 1275 Constructive Interference can be seen as the reverse of Promiscuous 1276 Overhearing, and refers to the case where two senders transmit the 1277 exact same signal in a fashion that the emitted symbols add up at the 1278 receiver and permit a reception that would not be possible with a 1279 single sender at the same PHY mode and the same power level. 1281 Constructive Interference was proposed on 5G, Wi-Fi 7 and even tested 1282 on IEEE Std 802.15.4. The hard part is to synchronize the senders 1283 so that the signals are emitted at slightly different times, 1284 offsetting the difference in propagation delay that corresponds to the 1285 difference in distance from the transmitters to the receiver at the 1286 speed of light, so that the symbols are superposed long 1287 enough to be recognizable. 1289 5. Security Considerations 1291 RAW uses all forms of diversity including radio technology and 1292 physical path to increase the reliability and availability in the 1293 face of unpredictable conditions. While this is not done 1294 specifically to defeat an attacker, the amount of diversity used in 1295 RAW makes an attack harder to achieve. 1297 5.1. Layer-2 encryption 1299 Radio networks typically encrypt at the MAC layer to protect the 1300 transmission.
If the encryption is per pair of peers, then certain 1301 RAW operations like promiscuous overhearing become impossible. 1303 5.2. Forced Access 1305 RAW will typically select the cheapest collection of links that 1306 matches the requested SLA, for instance, leveraging free Wi-Fi vs. paid 1307 3GPP access. By defeating the cheap connectivity (e.g., PHY-layer 1308 interference) the attacker can force an End System to use the paid 1309 access and increase the cost of the transmission for the user. 1311 6. IANA Considerations 1313 This document has no IANA actions. 1315 7. Contributors 1317 The editor wishes to thank: 1319 Xavi Vilajosana: Wireless Networks Research Lab, Universitat Oberta 1320 de Catalunya 1322 Remous-Aris Koutsiamanis: IMT Atlantique 1324 Nicolas Montavont: IMT Atlantique 1326 Rex Buddenberg: Individual contributor 1328 Greg Mirsky: ZTE 1330 for their contributions to the text and ideas exposed in this 1331 document. 1333 8. Acknowledgments 1335 TBD 1337 9. References 1339 9.1. Normative References 1341 [6TiSCH-ARCHI] 1342 Thubert, P., Ed., "An Architecture for IPv6 over the Time- 1343 Slotted Channel Hopping Mode of IEEE 802.15.4 (6TiSCH)", 1344 RFC 9030, DOI 10.17487/RFC9030, May 2021, 1345 . 1347 [INT-ARCHI] 1348 Braden, R., Ed., "Requirements for Internet Hosts - 1349 Communication Layers", STD 3, RFC 1122, 1350 DOI 10.17487/RFC1122, October 1989, 1351 . 1353 [RAW-TECHNOS] 1354 Thubert, P., Cavalcanti, D., Vilajosana, X., Schmitt, C., 1355 and J. Farkas, "Reliable and Available Wireless 1356 Technologies", Work in Progress, Internet-Draft, draft- 1357 ietf-raw-technologies-05, 2 February 2022, 1358 . 1361 [RAW-USE-CASES] 1362 Bernardos, C. J., Papadopoulos, G. Z., Thubert, P., and F. 1363 Theoleyre, "RAW use-cases", Work in Progress, Internet- 1364 Draft, draft-ietf-raw-use-cases-05, 23 February 2022, 1365 . 1368 [RFC4655] Farrel, A., Vasseur, J.-P., and J.
Ash, "A Path 1369 Computation Element (PCE)-Based Architecture", RFC 4655, 1370 DOI 10.17487/RFC4655, August 2006, 1371 . 1373 [RFC6291] Andersson, L., van Helvoort, H., Bonica, R., Romascanu, 1374 D., and S. Mansfield, "Guidelines for the Use of the "OAM" 1375 Acronym in the IETF", BCP 161, RFC 6291, 1376 DOI 10.17487/RFC6291, June 2011, 1377 . 1379 [RFC7799] Morton, A., "Active and Passive Metrics and Methods (with 1380 Hybrid Types In-Between)", RFC 7799, DOI 10.17487/RFC7799, 1381 May 2016, . 1383 [RFC8578] Grossman, E., Ed., "Deterministic Networking Use Cases", 1384 RFC 8578, DOI 10.17487/RFC8578, May 2019, 1385 . 1387 [IPv6] Deering, S. and R. Hinden, "Internet Protocol, Version 6 1388 (IPv6) Specification", STD 86, RFC 8200, 1389 DOI 10.17487/RFC8200, July 2017, 1390 . 1392 [RFC8557] Finn, N. and P. Thubert, "Deterministic Networking Problem 1393 Statement", RFC 8557, DOI 10.17487/RFC8557, May 2019, 1394 . 1396 [RFC8655] Finn, N., Thubert, P., Varga, B., and J. Farkas, 1397 "Deterministic Networking Architecture", RFC 8655, 1398 DOI 10.17487/RFC8655, October 2019, 1399 . 1401 [RFC8939] Varga, B., Ed., Farkas, J., Berger, L., Fedyk, D., and S. 1402 Bryant, "Deterministic Networking (DetNet) Data Plane: 1403 IP", RFC 8939, DOI 10.17487/RFC8939, November 2020, 1404 . 1406 [RFC9049] Dawkins, S., Ed., "Path Aware Networking: Obstacles to 1407 Deployment (A Bestiary of Roads Not Taken)", RFC 9049, 1408 DOI 10.17487/RFC9049, June 2021, 1409 . 1411 9.2. Informative References 1413 [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, 1414 DOI 10.17487/RFC0791, September 1981, 1415 . 1417 [TE] Awduche, D., Chiu, A., Elwalid, A., Widjaja, I., and X. 1418 Xiao, "Overview and Principles of Internet Traffic 1419 Engineering", RFC 3272, DOI 10.17487/RFC3272, May 2002, 1420 . 1422 [STD 62] Harrington, D., Presuhn, R., and B. 
              Wijnen, "An Architecture for Describing Simple Network
              Management Protocol (SNMP) Management Frameworks",
              STD 62, RFC 3411, DOI 10.17487/RFC3411, December 2002.

   [RFC4090]  Pan, P., Ed., Swallow, G., Ed., and A. Atlas, Ed., "Fast
              Reroute Extensions to RSVP-TE for LSP Tunnels", RFC 4090,
              DOI 10.17487/RFC4090, May 2005.

   [FRR]      Shand, M. and S. Bryant, "IP Fast Reroute Framework",
              RFC 5714, DOI 10.17487/RFC5714, January 2010.

   [RLFA-FRR] Bryant, S., Filsfils, C., Previdi, S., Shand, M., and N.
              So, "Remote Loop-Free Alternate (LFA) Fast Reroute (FRR)",
              RFC 7490, DOI 10.17487/RFC7490, April 2015.

   [DetNet-DP]
              Varga, B., Ed., Farkas, J., Berger, L., Malis, A., and S.
              Bryant, "Deterministic Networking (DetNet) Data Plane
              Framework", RFC 8938, DOI 10.17487/RFC8938, November 2020.

   [DLEP]     Ratliff, S., Jury, S., Satterwhite, D., Taylor, R., and B.
              Berry, "Dynamic Link Exchange Protocol (DLEP)", RFC 8175,
              DOI 10.17487/RFC8175, June 2017.

   [I-D.irtf-panrg-path-properties]
              Enghardt, T. and C. Kraehenbuehl, "A Vocabulary of Path
              Properties", Work in Progress, Internet-Draft,
              draft-irtf-panrg-path-properties-04, 25 October 2021.

   [IPoWIRELESS]
              Thubert, P., "IPv6 Neighbor Discovery on Wireless
              Networks", Work in Progress, Internet-Draft,
              draft-thubert-6man-ipv6-over-wireless-11,
              15 December 2021.

   [DetNet-OAM]
              Mirsky, G., Theoleyre, F., Papadopoulos, G. Z., Bernardos,
              C. J., Varga, B., and J. Farkas, "Framework of Operations,
              Administration and Maintenance (OAM) for Deterministic
              Networking (DetNet)", Work in Progress, Internet-Draft,
              draft-ietf-detnet-oam-framework-05, 14 October 2021.

   [NASA]     Adams, T., "RELIABILITY: Definition & Quantitative
              Illustration".
Authors' Addresses

   Pascal Thubert (editor)
   Cisco Systems, Inc
   Building D
   45 Allee des Ormes - BP1200
   06254 MOUGINS - Sophia Antipolis
   France
   Phone: +33 497 23 26 34
   Email: pthubert@cisco.com

   Georgios Z. Papadopoulos
   IMT Atlantique
   Office B00 - 114A
   2 Rue de la Chataigneraie
   35510 Cesson-Sevigne - Rennes
   France
   Phone: +33 299 12 70 04
   Email: georgios.papadopoulos@imt-atlantique.fr