idnits 2.17.00 (12 Aug 2021) /tmp/idnits46454/draft-ietf-ippm-ioam-data-09.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (March 08, 2020) is 803 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Looks like a reference, but probably isn't: '0' on line 570 -- Looks like a reference, but probably isn't: '1' on line 574 == Unused Reference: 'I-D.lapukhov-dataplane-probe' is defined on line 1815, but no explicit reference was found in the text -- Possible downref: Non-RFC (?) normative reference: ref. 'IEEE1588v2' -- Possible downref: Non-RFC (?) normative reference: ref. 
'POSIX' == Outdated reference: draft-ietf-ntp-packet-timestamps has been published as RFC 8877 == Outdated reference: draft-ietf-nvo3-geneve has been published as RFC 8926 == Outdated reference: A later version (-12) exists of draft-ietf-nvo3-vxlan-gpe-09 == Outdated reference: A later version (-08) exists of draft-ietf-sfc-proof-of-transit-04 == Outdated reference: A later version (-06) exists of draft-spiegel-ippm-ioam-rawexport-02 Summary: 0 errors (**), 0 flaws (~~), 7 warnings (==), 5 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 ippm F. Brockners 3 Internet-Draft S. Bhandari 4 Intended status: Standards Track C. Pignataro 5 Expires: September 9, 2020 Cisco 6 H. Gredler 7 RtBrick Inc. 8 J. Leddy 10 S. Youell 11 JPMC 12 T. Mizrahi 13 Huawei Network.IO Innovation Lab 14 D. Mozes 16 P. Lapukhov 17 Facebook 18 R. Chang 19 Barefoot Networks 20 D. Bernier 21 Bell Canada 22 J. Lemon 23 Broadcom 24 March 08, 2020 26 Data Fields for In-situ OAM 27 draft-ietf-ippm-ioam-data-09 29 Abstract 31 In-situ Operations, Administration, and Maintenance (IOAM) records 32 operational and telemetry information in the packet while the packet 33 traverses a path between two points in the network. This document 34 discusses the data fields and associated data types for in-situ OAM. 35 In-situ OAM data fields can be encapsulated into a variety of 36 protocols such as NSH, Segment Routing, Geneve, IPv6 (via extension 37 header), or IPv4. In-situ OAM can be used to complement OAM 38 mechanisms based on e.g. ICMP or other types of probe packets. 40 Status of This Memo 42 This Internet-Draft is submitted in full conformance with the 43 provisions of BCP 78 and BCP 79. 45 Internet-Drafts are working documents of the Internet Engineering 46 Task Force (IETF). Note that other groups may also distribute 47 working documents as Internet-Drafts. 
The list of current Internet- 48 Drafts is at https://datatracker.ietf.org/drafts/current/. 50 Internet-Drafts are draft documents valid for a maximum of six months 51 and may be updated, replaced, or obsoleted by other documents at any 52 time. It is inappropriate to use Internet-Drafts as reference 53 material or to cite them other than as "work in progress." 55 This Internet-Draft will expire on September 9, 2020. 57 Copyright Notice 59 Copyright (c) 2020 IETF Trust and the persons identified as the 60 document authors. All rights reserved. 62 This document is subject to BCP 78 and the IETF Trust's Legal 63 Provisions Relating to IETF Documents 64 (https://trustee.ietf.org/license-info) in effect on the date of 65 publication of this document. Please review these documents 66 carefully, as they describe your rights and restrictions with respect 67 to this document. Code Components extracted from this document must 68 include Simplified BSD License text as described in Section 4.e of 69 the Trust Legal Provisions and are provided without warranty as 70 described in the Simplified BSD License. 72 Table of Contents 74 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 75 2. Conventions . . . . . . . . . . . . . . . . . . . . . . . . . 3 76 3. Scope, Applicability, and Assumptions . . . . . . . . . . . . 4 77 4. IOAM Data-Fields, Types, Nodes . . . . . . . . . . . . . . . 6 78 4.1. IOAM Data-Fields and Option-Types . . . . . . . . . . . . 6 79 4.2. IOAM-Domains and types of IOAM Nodes . . . . . . . . . . 6 80 4.3. IOAM-Namespaces . . . . . . . . . . . . . . . . . . . . . 8 81 4.4. IOAM Trace Option-Types . . . . . . . . . . . . . . . . . 10 82 4.4.1. Pre-allocated and Incremental Trace Option-Types . . 12 83 4.4.2. IOAM node data fields and associated formats . . . . 16 84 4.4.3. Examples of IOAM node data . . . . . . . . . . . . . 22 85 4.5. IOAM Proof of Transit Option-Type . . . . . . . . . . . . 24 86 4.5.1. IOAM Proof of Transit Type 0 . . . . . 
. . . . . . . 26 87 4.6. IOAM Edge-to-Edge Option-Type . . . . . . . . . . . . . . 27 88 5. Timestamp Formats . . . . . . . . . . . . . . . . . . . . . . 29 89 5.1. PTP Truncated Timestamp Format . . . . . . . . . . . . . 29 90 5.2. NTP 64-bit Timestamp Format . . . . . . . . . . . . . . . 30 91 5.3. POSIX-based Timestamp Format . . . . . . . . . . . . . . 32 92 6. IOAM Data Export . . . . . . . . . . . . . . . . . . . . . . 33 93 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 33 94 7.1. Creation of a new In-Situ OAM Protocol Parameters 95 Registry (IOAM) Protocol Parameters IANA registry . . . . 33 96 7.2. IOAM Option-Type Registry . . . . . . . . . . . . . . . . 34 97 7.3. IOAM Trace-Type Registry . . . . . . . . . . . . . . . . 34 98 7.4. IOAM Trace-Flags Registry . . . . . . . . . . . . . . . . 35 99 7.5. IOAM POT-Type Registry . . . . . . . . . . . . . . . . . 35 100 7.6. IOAM POT-Flags Registry . . . . . . . . . . . . . . . . . 36 101 7.7. IOAM E2E-Type Registry . . . . . . . . . . . . . . . . . 36 102 7.8. IOAM Namespace-ID Registry . . . . . . . . . . . . . . . 36 103 8. Security Considerations . . . . . . . . . . . . . . . . . . . 37 104 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 38 105 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 38 106 10.1. Normative References . . . . . . . . . . . . . . . . . . 39 107 10.2. Informative References . . . . . . . . . . . . . . . . . 39 108 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 41 110 1. Introduction 112 This document defines data fields for "in-situ" Operations, 113 Administration, and Maintenance (IOAM). In-situ OAM records OAM 114 information within the packet while the packet traverses a particular 115 network domain. The term "in-situ" refers to the fact that the OAM 116 data is added to the data packets rather than being sent within 117 packets specifically dedicated to OAM. 
IOAM is to complement 118 mechanisms such as Ping or Traceroute. In terms of "active" or 119 "passive" OAM, "in-situ" OAM can be considered a hybrid OAM type. 120 "In-situ" mechanisms do not require extra packets to be sent. IOAM 121 adds information to the already available data packets and therefore 122 cannot be considered passive. In terms of the classification given 123 in [RFC7799] IOAM could be portrayed as Hybrid Type 1. IOAM 124 mechanisms can be leveraged where mechanisms using e.g. ICMP do not 125 apply or do not offer the desired results, such as proving that a 126 certain traffic flow takes a pre-defined path, SLA verification for 127 the live data traffic, detailed statistics on traffic distribution 128 paths in networks that distribute traffic across multiple paths, or 129 scenarios in which probe traffic is potentially handled differently 130 from regular data traffic by the network devices. 132 IOAM use cases and mechanisms have expanded as this document matured, 133 resulting in additional flags and options that may trigger creation 134 of additional packets dedicated to OAM. The term IOAM continues to 135 be used for such mechanisms, in addition to the "in-situ" mechanisms 136 that motivated this terminology. 138 2. Conventions 140 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 141 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 142 document are to be interpreted as described in [RFC2119]. 
144 Abbreviations used in this document: 146 E2E: Edge to Edge 148 Geneve: Generic Network Virtualization Encapsulation 149 [I-D.ietf-nvo3-geneve] 151 IOAM: In-situ Operations, Administration, and Maintenance 153 MTU: Maximum Transmission Unit 155 NSH: Network Service Header [RFC8300] 157 OAM: Operations, Administration, and Maintenance 159 POT: Proof of Transit 161 SFC: Service Function Chain 163 SID: Segment Identifier 165 SR: Segment Routing 167 VXLAN-GPE: Virtual eXtensible Local Area Network, Generic Protocol 168 Extension [I-D.ietf-nvo3-vxlan-gpe] 170 3. Scope, Applicability, and Assumptions 172 IOAM deployment assumes a set of constraints, requirements, and 173 guiding principles which are described in this section. 175 Scope: This document defines the data fields and associated data 176 types for in-situ OAM. The in-situ OAM data field can be 177 encapsulated in a variety of protocols, including NSH, Segment 178 Routing, Geneve, IPv6, or IPv4. Specification details for these 179 different protocols are outside the scope of this document. 181 Deployment domain (or scope) of in-situ OAM deployment: IOAM is a 182 network domain focused feature, with "network domain" being a set of 183 network devices or entities within a single administration. For 184 example, a network domain can include an enterprise campus using 185 physical connections between devices or an overlay network using 186 virtual connections / tunnels for connectivity between said devices. 187 A network domain is defined by its perimeter or edge. Designers of 188 protocol encapsulations for IOAM must specify mechanisms to ensure 189 that IOAM data stays within an IOAM domain. In addition, the 190 operator of such a domain is expected to put provisions in place to 191 ensure that IOAM data does not leak beyond the edge of an IOAM domain 192 using, for example, packet filtering methods. 
The operator should 193 consider the potential operational impact of IOAM to mechanisms such 194 as ECMP processing (e.g. load-balancing schemes based on packet 195 length could be impacted by the increased packet size due to IOAM), 196 path MTU (i.e. ensure that the MTU of all links within a domain is 197 sufficiently large to support the increased packet size due to IOAM) 198 and ICMP message handling (i.e. in case of IPv6, IOAM support for 199 ICMPv6 Echo Request/Reply is desired which would translate into 200 ICMPv6 extensions to enable IOAM-Data-Fields to be copied from an 201 Echo Request message to an Echo Reply message). 203 IOAM control points: IOAM-Data-Fields are added to or removed from 204 the live user traffic by the devices which form the edge of a domain. 205 Devices which form an IOAM-Domain can add, update or remove IOAM- 206 Data-Fields. Edge devices of an IOAM-Domain can be hosts or network 207 devices. 209 Traffic-sets that IOAM is applied to: IOAM can be deployed on all or 210 only on subsets of the live user traffic. Using IOAM on a selected 211 set of traffic (e.g., per interface, based on an access control list 212 or flow specification defining a specific set of traffic, etc.) could 213 be useful in deployments where the cost of processing IOAM-Data- 214 Fields by encapsulating, transit, or decapsulating node(s) might be a 215 concern from a performance or operational perspective. Thus limiting 216 the amount of traffic IOAM is applied to could be beneficial in some 217 deployments. 219 Encapsulation independence: The definition of IOAM-Data-Fields is 220 independent from the protocols the IOAM-Data-Fields are encapsulated 221 into. IOAM-Data-Fields can be encapsulated into several 222 encapsulating protocols. The specification of how IOAM-Data-Fields 223 are encapsulated into "parent" protocols, like e.g., NSH or IPv6 is 224 outside the scope of this document. 
226 Layering: If several encapsulation protocols (e.g., in case of 227 tunneling) are stacked on top of each other, IOAM-Data-Fields could 228 be present at multiple layers. The behavior follows the ships-in- 229 the-night model, i.e. IOAM-Data-Fields in one layer are independent 230 from IOAM-Data-Fields in another layer. Layering allows operators to 231 instrument the protocol layer they want to measure. The different 232 layers could, but do not have to, share the same IOAM encapsulation 233 mechanisms. 235 IOAM implementation: The definition of the IOAM-Data-Fields takes the 236 specifics of devices with hardware data-plane and software data-plane 237 into account. 239 4. IOAM Data-Fields, Types, Nodes 241 This section details IOAM-related nomenclature and describes data 242 types such as IOAM-Data-Fields, IOAM-Types, IOAM-Namespaces as well 243 as the different types of IOAM nodes. 245 4.1. IOAM Data-Fields and Option-Types 247 An IOAM-Data-Field is a set of bits with a defined format and 248 meaning, which can be stored at a certain place in a packet for the 249 purpose of IOAM. 251 To accommodate the different uses of IOAM, IOAM-Data-Fields fall into 252 different categories. In IOAM these categories are referred to as 253 IOAM-Option-Types. A common registry is maintained for IOAM-Option- 254 Types, see Section 7.2 for details. Corresponding to these IOAM- 255 Option-Types, different IOAM-Data-Fields are defined. IOAM-Data- 256 Fields can be encapsulated into a variety of protocols, such as NSH, 257 Geneve, IPv6, etc. The definition of how IOAM-Data-Fields are 258 encapsulated into other protocols is outside the scope of this 259 document. 261 This document defines four IOAM-Option-Types: 263 o Pre-allocated Trace Option-Type 265 o Incremental Trace Option-Type 267 o Proof of Transit (POT) Option-Type 269 o Edge-to-Edge (E2E) Option-Type 271 4.2. IOAM-Domains and types of IOAM Nodes 273 IOAM is expected to be deployed in a specific domain. 
The part of 274 the network which employs IOAM is referred to as the "IOAM-Domain". 275 One or more IOAM-Option-Types are added to a packet upon entering the 276 IOAM-Domain and are removed from the packet when exiting the domain. 277 Within the IOAM-Domain, the IOAM-Data-Fields MAY be updated by 278 network nodes that the packet traverses. An IOAM-Domain consists of 279 "IOAM encapsulating nodes", "IOAM decapsulating nodes" and "IOAM 280 transit nodes". The role of a node (i.e. encapsulating, transit, 281 decapsulating) is defined within an IOAM-Namespace (see below). A 282 node can have different roles in different IOAM-Namespaces. 284 A device which adds at least one IOAM-Option-Type to the packet is 285 called the "IOAM encapsulating node", whereas a device which removes 286 an IOAM-Option-Type is referred to as the "IOAM decapsulating node". 288 Nodes within the domain which are aware of IOAM data and read and/or 289 write or process the IOAM data are called "IOAM transit nodes". IOAM 290 nodes which add or remove the IOAM-Data-Fields can also update the 291 IOAM-Data-Fields at the same time. Or in other words, IOAM 292 encapsulating or decapsulating nodes can also serve as IOAM transit 293 nodes at the same time. Note that not every node in an IOAM domain 294 needs to be an IOAM transit node. For example, a deployment might 295 require that packets traverse a set of firewalls which support IOAM. 296 In that case, only the set of firewall nodes would be IOAM transit 297 nodes rather than all nodes. 299 An "IOAM encapsulating node" incorporates one or more IOAM-Option- 300 Types (from the list of IOAM-Types, see Section 7.2) into packets 301 that IOAM is enabled for. If IOAM is enabled for a selected subset 302 of the traffic, the IOAM encapsulating node is responsible for 303 applying the IOAM functionality to the selected subset. 305 An "IOAM transit node" updates one or more of the IOAM-Data-Fields. 
306 If both the Pre-allocated and the Incremental Trace Option-Types are 307 present in the packet, each IOAM transit node will update at most one 308 of these Option-Types. A transit node MUST NOT add new IOAM-Option- 309 Types to a packet, and MUST NOT change the IOAM-Data-Fields of an 310 IOAM Edge-to-Edge Option-Type. 312 An "IOAM decapsulating node" removes IOAM-Option-Type(s) from 313 packets. 315 The role of an IOAM-encapsulating, IOAM-transit or IOAM-decapsulating 316 node is always performed within a specific IOAM-Namespace. This 317 means that an IOAM node which is e.g. an IOAM-decapsulating node for 318 IOAM-Namespace "A" but not for IOAM-Namespace "B" will only remove 319 the IOAM-Option-Types for IOAM-Namespace "A" from the packet. An 320 IOAM decapsulating node situated at the edge of an IOAM domain MUST 321 remove all IOAM-Option-Types and associated encapsulation headers for 322 all IOAM-Namespaces from the packet. 324 IOAM-Namespaces allow for a namespace-specific definition and 325 interpretation of IOAM-Data-Fields. An interface-id could for 326 example point to a physical interface (e.g., to understand which 327 physical interface of an aggregated link is used when receiving or 328 transmitting a packet) whereas in another case it could refer to a 329 logical interface (e.g., in case of tunnels). Please refer to 330 Section 4.3 for details on IOAM-Namespaces. 332 4.3. IOAM-Namespaces 334 A subset or all of the IOAM-Option-Types and their corresponding 335 IOAM-Data-Fields can be associated to an IOAM-Namespace. IOAM- 336 Namespaces add further context to IOAM-Option-Types and associated 337 IOAM-Data-Fields. Any IOAM-Namespace MUST interpret the IOAM-Option- 338 Types and associated IOAM-Data-Fields per the definition in this 339 document. 
IOAM-Namespaces group nodes to support different 340 deployment approaches of IOAM (see a few example use-cases below) as 341 well as resolve issues which can occur due to IOAM-Data-Fields not 342 being globally unique (e.g. IOAM node identifiers do not have to be 343 globally unique). IOAM-Data-Fields significance is always within a 344 particular IOAM-Namespace. 346 An IOAM-Namespace is identified by a 16-bit namespace identifier 347 (Namespace-ID). IOAM-Namespace identifiers MUST be present and 348 populated in all IOAM-Option-Types. The Namespace-ID value is 349 divided into two sub-ranges: 351 o An operator-assigned range from 0x0001 to 0x7FFF 353 o An IANA-assigned range from 0x8000 to 0xFFFF 355 The IANA-assigned range is intended to allow future extensions to 356 have new and interoperable IOAM functionality, while the operator- 357 assigned range is intended to be domain specific, and managed by the 358 network operator. The Namespace-ID value of 0x0000 is default and 359 known to all the nodes implementing IOAM. 361 Namespace identifiers allow devices which are IOAM capable to 362 determine: 364 o whether IOAM-Option-Type(s) need to be processed by a device: If 365 the Namespace-ID contained in a packet does not match any 366 Namespace-ID the node is configured to operate on, then the node 367 MUST NOT change the contents of the IOAM-Data-Fields. 369 o which IOAM-Option-Type needs to be processed/updated in case there 370 are multiple IOAM-Option-Types present in the packet. Multiple 371 IOAM-Option-Types can be present in a packet in case of 372 overlapping IOAM-Domains or in case of a layered IOAM deployment. 374 o whether IOAM-Option-Type(s) should be removed from the packet, 375 e.g. at a domain edge or domain boundary. 377 IOAM-Namespaces support several different uses: 379 o IOAM-Namespaces can be used by an operator to distinguish 380 different operational domains. 
Devices at domain edges can filter 381 on Namespace-IDs to provide for proper IOAM-Domain isolation. 383 o IOAM-Namespaces provide additional context for IOAM-Data-Fields 384 and thus ensure that IOAM-Data-Fields are unique and can be 385 interpreted properly by management stations or network 386 controllers. While, for example, the node identifier field 387 (node_id, see below) does not need to be unique in a deployment 388 (e.g. an operator may wish to use different node identifiers for 389 different IOAM layers, even within the same device; or node 390 identifiers might not be unique for other organizational reasons, 391 such as after a merger of two formerly separated organizations), 392 the combination of node_id and Namespace-ID will always be unique. 393 Similarly, IOAM-Namespaces can be used to define how certain IOAM- 394 Data-Fields are interpreted: IOAM offers three different timestamp 395 format options. The Namespace-ID can be used to determine the 396 timestamp format. IOAM-Data-Fields (e.g. buffer occupancy) which 397 do not have a unit associated are to be interpreted within the 398 context of an IOAM-Namespace. 400 o IOAM-Namespaces can be used to identify different sets of devices 401 (e.g., different types of devices) in a deployment: If an operator 402 desires to insert different IOAM-Data-Fields based on the device, 403 the devices could be grouped into multiple IOAM-Namespaces. This 404 could be due to the fact that the IOAM feature set differs between 405 different sets of devices, or it could be for reasons of optimized 406 space usage in the packet header. It could also stem from 407 hardware or operational limitations on the size of the trace data 408 that can be added and processed, preventing collection of a full 409 trace for a flow. 
411 * Assigning different IOAM Namespace-IDs to different sets of 412 nodes or network partitions and using the Namespace-ID as a 413 selector at the IOAM encapsulating node, a full trace for a 414 flow could be collected and constructed via partial traces in 415 different packets of the same flow. Example: An operator could 416 choose to group the devices of a domain into two IOAM- 417 Namespaces, in a way that on average, only every second hop 418 would be recorded by any device. To retrieve a full view of 419 the deployment, the captured IOAM-Data-Fields of the two IOAM- 420 Namespaces need to be correlated. 422 * Assigning different IOAM Namespace-IDs to different sets of 423 nodes or network partitions and using a separate instance of an 424 IOAM-Option-Type for each Namespace-ID, a full trace for a flow 425 could be collected and constructed via partial traces from each 426 IOAM-Option-Type in each of the packets in the flow. Example: 428 An operator could choose to group the devices of a domain into 429 two IOAM-Namespaces, in a way that each IOAM-Namespace is 430 represented by one of two IOAM-Option-Types in the packet. 431 Each node would record data only for the IOAM-Namespace that it 432 belongs to, ignoring the other IOAM-Option-Type with an IOAM- 433 Namespace to which it doesn't belong. To retrieve a full view 434 of the deployment, the captured IOAM-Data-Fields of the two 435 IOAM-Namespaces need to be correlated. 437 4.4. IOAM Trace Option-Types 439 "IOAM tracing data" is expected to be collected at every IOAM 440 transit node that a packet traverses to ensure visibility into the 441 entire path a packet takes within an IOAM-Domain. I.e., in a typical 442 deployment all nodes in an IOAM-Domain would participate in IOAM and 443 thus be IOAM transit nodes, IOAM encapsulating or IOAM decapsulating 444 nodes. 
If not all nodes within a domain support IOAM functionality 445 as defined in this document, IOAM tracing information (i.e., node 446 data, see below) will only be collected on those nodes which support 447 IOAM functionality as defined in this document. Nodes which do not 448 support IOAM functionality as defined in this document will forward 449 the packet without any changes to the IOAM-Data-Fields. The maximum 450 number of hops and the minimum path MTU of the IOAM domain is assumed 451 to be known. 453 To optimize hardware and software implementations IOAM tracing is 454 defined as two separate options. Any deployment MAY choose to 455 configure and support one or both of the following options. 457 Pre-allocated Trace-Option: This trace option is defined as a 458 container of node data fields (see below) with pre-allocated space 459 for each node to populate its information. This option is useful 460 for implementations where it is efficient to allocate the space 461 once and index into the array to populate the data during transit 462 (e.g., software forwarders often fall into this class). The IOAM 463 encapsulating node allocates space for Pre-allocated Trace Option- 464 Type in the packet and sets corresponding fields in this IOAM- 465 Option-Type. The IOAM encapsulating node allocates an array which 466 is used to store operational data retrieved from every node while 467 the packet traverses the domain. IOAM transit nodes update the 468 content of the array, and possibly update the checksums of outer 469 headers. A pointer which is part of the IOAM trace data, points 470 to the next empty slot in the array. 
An IOAM transit node that 471 updates the content of the pre-allocated option also updates the 472 value of the pointer, which specifies where the next IOAM transit 473 node fills in its data. The "node data list" array (see below) in 474 the packet is populated iteratively as the packet traverses the 475 network, starting with the last entry of the array, i.e., "node 476 data list [n]" is the first entry to be populated, "node data list 477 [n-1]" is the second one, etc. 479 Incremental Trace-Option: This trace option is defined as a 480 container of node data fields where each node allocates and pushes 481 its node data immediately following the option header. This type 482 of trace recording is useful for some of the hardware 483 implementations as it eliminates the need for the transit network 484 elements to read the full array in the option and allows the 485 packet to grow as long as the MTU permits. The IOAM 486 encapsulating node allocates space for the Incremental Trace 487 Option-Type. Based on operational state and configuration, the 488 IOAM encapsulating node sets the fields in the Option-Type that 489 control what IOAM-Data-Fields should be collected and how large 490 the node data list can grow. IOAM transit nodes push their node 491 data to the node data list, decrease the remaining length 492 available to subsequent nodes and adjust the lengths and possibly 493 checksums in outer headers. 495 A particular implementation of IOAM MAY choose to support only one of 496 the two trace option types. In the event that both options are 497 utilized at the same time, the Incremental Trace-Option MUST be 498 placed before the Pre-allocated Trace-Option. Deployments which mix 499 devices which support either the Incremental Trace-Option or the Pre- 500 allocated Trace-Option could result in both Option-Types being 501 present in a packet. 
Given that the operator knows which equipment 502 is deployed in a particular IOAM-Domain, the operator will decide by means 503 of configuration which type(s) of trace options will be used for a 504 particular domain. 506 Every node data entry holds information for a particular IOAM transit 507 node that is traversed by a packet. The IOAM decapsulating node 508 removes the IOAM-Option-Type(s) and processes and/or exports the 509 associated data. Like all IOAM-Data-Fields, the IOAM-Data-Fields of 510 the IOAM-Trace-Option-Types are defined in the context of an IOAM- 511 Namespace. 513 IOAM tracing can collect the following types of information: 515 o Identification of the IOAM node. An IOAM node identifier can 516 match to a device identifier or a particular control point or 517 subsystem within a device. 519 o Identification of the interface that a packet was received on, 520 i.e. ingress interface. 522 o Identification of the interface that a packet was sent out on, 523 i.e. egress interface. 525 o Time of day when the packet was processed by the node as well as 526 the transit delay. Different definitions of processing time are 527 feasible and expected, though it is important that all devices of 528 an in-situ OAM domain follow the same definition. 530 o Generic data: Format-free information where the syntax and semantics of 531 the information are defined by the operator in a specific 532 deployment. For a specific IOAM-Namespace, all IOAM nodes should 533 interpret the generic data the same way. Examples of generic 534 IOAM data include geo-location information (location of the node 535 at the time the packet was processed), buffer queue fill level or 536 cache fill level at the time the packet was processed, or even a 537 battery charge level. 539 o Information to detect whether IOAM trace data was added at every 540 hop or whether certain hops in the domain weren't IOAM transit 541 nodes. 543 4.4.1. 
Pre-allocated and Incremental Trace Option-Types 545 The IOAM Pre-allocated Trace-Option and the IOAM Incremental Trace- 546 Option have similar formats. Except where noted below, the internal 547 formats and fields of the two trace options are identical. Both 548 Trace-Options consist of a fixed size "trace option header" and a 549 variable data space to store gathered data, the "node data list". An 550 IOAM transit node (that is not an IOAM encapsulating node or IOAM 551 decapsulating node) MUST NOT modify any of the fields in the fixed 552 size "trace option header", other than "flags" and "RemainingLen", 553 i.e. an IOAM transit node MUST NOT modify the Namespace-ID, NodeLen, 554 IOAM-Trace-Type, or Reserved fields. 556 Pre-allocated and incremental trace option headers: 558 0 1 2 3 559 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 560 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 561 | Namespace-ID |NodeLen | Flags | RemainingLen| 562 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 563 | IOAM-Trace-Type | Reserved | 564 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 566 The trace option data MUST be 4-octet aligned: 568 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ 569 | | | 570 | node data list [0] | | 571 | | | 572 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ D 573 | | a 574 | node data list [1] | t 575 | | a 576 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 577 ~ ... ~ S 578 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ p 579 | | a 580 | node data list [n-1] | c 581 | | e 582 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | 583 | | | 584 | node data list [n] | | 585 | | | 586 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ 588 Namespace-ID: 16-bit identifier of an IOAM-Namespace. 
The 589 Namespace-ID value of 0x0000 is defined as the default value and 590 MUST be known to all the nodes implementing IOAM. For any other 591 Namespace-ID value that does not match any Namespace-ID the node 592 is configured to operate on, the node MUST NOT change the contents 593 of the IOAM-Data-Fields. 595 NodeLen: 5-bit unsigned integer. This field specifies the length of 596 data added by each node in multiples of 4-octets, excluding the 597 length of the "Opaque State Snapshot" field. 599 If IOAM-Trace-Type bit 22 is not set, then NodeLen specifies the 600 actual length added by each node. If IOAM-Trace-Type bit 22 is 601 set, then the actual length added by a node would be (NodeLen + 602 length of the "Opaque State Snapshot" field) in 4 octet units. 604 For example, if 3 IOAM-Trace-Type bits are set and none of them 605 are wide, then NodeLen would be 3. If 3 IOAM-Trace-Type bits are 606 set and 2 of them are wide, then NodeLen would be 5. 608 An IOAM encapsulating node must set NodeLen. 610 A node receiving an IOAM Pre-allocated or Incremental Trace-Option 611 may rely on the NodeLen value, or it may ignore the NodeLen value 612 and calculate the node length from the IOAM-Trace-Type bits (see 613 below). 615 Flags: 4-bit field. Flags are allocated by IANA, as specified in 616 Section 7.4. This document allocates a single flag as follows: 618 Bit 0 "Overflow" (O-bit) (most significant bit). If there are 619 not enough octets left to record node data, the network element 620 adds no field and must set the overflow "O-bit" 621 to "1" in the IOAM-Trace-Option header. This 622 allows subsequent transit nodes to skip further processing of the 623 option. 625 RemainingLen: 7-bit unsigned integer. This field specifies the data 626 space in multiples of 4-octets remaining for recording the node 627 data, before the node data list is considered to have overflowed. 
628 Given that the sender knows the minimum path MTU, the sender MAY 629 set the initial value of RemainingLen according to the number of 630 node data bytes allowed before exceeding the MTU. Subsequent 631 nodes can carry out a simple comparison between RemainingLen and 632 NodeLen, along with the length of the "Opaque State Snapshot" if 633 applicable, to determine whether or not data can be added by this 634 node. When node data is added, the node MUST decrease 635 RemainingLen by the amount of data added. In the pre-allocated 636 trace option, RemainingLen is used to derive the offset in data 637 space to record the node data element. Specifically, the 638 recording of the node data element would start from RemainingLen - 639 NodeLen - sizeof(opaque snapshot) in 4 octet units. 641 IOAM-Trace-Type: A 24-bit identifier which specifies which data 642 types are used in this node data list. 644 The IOAM-Trace-Type value is a bit field. The following bits are 645 defined in this document, with details on each bit described in 646 Section 4.4.2. The order of packing the data fields in each 647 node data element follows the bit order of the IOAM-Trace-Type 648 field, as follows: 650 Bit 0 (Most significant bit) When set indicates presence of 651 Hop_Lim and node_id (short format) in the node data. 653 Bit 1 When set indicates presence of ingress_if_id and 654 egress_if_id (short format) in the node data. 656 Bit 2 When set indicates presence of timestamp seconds in the 657 node data. 659 Bit 3 When set indicates presence of timestamp subseconds in 660 the node data. 662 Bit 4 When set indicates presence of transit delay in the node 663 data. 665 Bit 5 When set indicates presence of IOAM-Namespace specific 666 data (short format) in the node data. 668 Bit 6 When set indicates presence of queue depth in the node 669 data. 671 Bit 7 When set indicates presence of the Checksum Complement 672 node data.
674 Bit 8 When set indicates presence of Hop_Lim and node_id in 675 wide format in the node data. 677 Bit 9 When set indicates presence of ingress_if_id and 678 egress_if_id in wide format in the node data. 680 Bit 10 When set indicates presence of IOAM-Namespace specific 681 data in wide format in the node data. 683 Bit 11 When set indicates presence of buffer occupancy in the 684 node data. 686 Bit 12-21 Undefined. An IOAM encapsulating node MUST set the 687 value of each of these bits to 0. If an IOAM transit 688 node receives a packet with one or more of these bits set 689 to 1, it must either: 691 1. Add corresponding node data filled with the reserved 692 value 0xFFFFFFFF, after the node data fields for the 693 IOAM-Trace-Type bits defined above, such that the 694 total node data added by this node in units of 695 4-octets is equal to NodeLen, or 697 2. Not add any node data fields to the packet, even for 698 the IOAM-Trace-Type bits defined above. 700 Bit 22 When set indicates presence of variable length Opaque 701 State Snapshot field. 703 Bit 23 Reserved: Must be set to zero upon transmission and 704 ignored upon receipt. 706 Section 4.4.2 describes the IOAM-Data-Types and their formats. 707 Within an IOAM-Domain possible combinations of these bits making 708 the IOAM-Trace-Type can be restricted by configuration knobs. 710 Reserved: 8-bits. An IOAM encapsulating node MUST set the value to 711 zero upon transmission. IOAM transit nodes must ignore the 712 received value. 714 Node data List [n]: Variable-length field. This is a list of node 715 data elements where the content of each node data element is 716 determined by the IOAM-Trace-Type. The order of packing the data 717 fields in each node data element follows the bit order of the 718 IOAM-Trace-Type field. 
Each node MUST prepend its node data 719 element in front of the node data elements that it received, such 720 that the transmitted node data list begins with this node's data 721 element as the first populated element in the list. The last node 722 data element in this list is the node data of the first IOAM 723 capable node in the path. Populating the node data list in this 724 way ensures that the order of the node data list is the same for 725 incremental and pre-allocated trace options. In the pre-allocated 726 trace option, the index contained in RemainingLen identifies the 727 offset at which the current node's data is to be populated. 729 4.4.2. IOAM node data fields and associated formats 731 All the IOAM-Data-Fields MUST be 4-octet aligned. If a node which is 732 supposed to update an IOAM-Data-Field is not capable of populating 733 the value of a field set in the IOAM-Trace-Type, the field value MUST 734 be set to 0xFFFFFFFF for 4-octet fields or 0xFFFFFFFFFFFFFFFF for 735 8-octet fields, indicating that the value is not populated, except 736 when explicitly specified in the field description below. 738 Some IOAM-Data-Fields defined below, such as interface identifiers or 739 IOAM-Namespace specific data, are defined in both "short format" and 740 "wide format". The two formats are not mutually exclusive; a deployment 741 could choose to use both. For example, ingress_if_id (short 742 format) could be an identifier for the physical interface, whereas 743 ingress_if_id (wide format) could be an identifier for a logical sub- 744 interface of that physical interface.
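As an informal illustration of the trace option header layout and the RemainingLen handling described above for the pre-allocated trace option, the following Python sketch may help. It is not part of the specification. The function names, the dictionary representation of the header, and the assumption that the data space is exactly (initial RemainingLen * 4) octets long are all hypothetical choices made for this example.

```python
import struct

O_BIT = 0x8  # Flags bit 0 (most significant of the 4 flag bits): Overflow


def parse_trace_header(data: bytes) -> dict:
    """Split the 8-octet trace option header into its fields."""
    namespace_id, word, type_and_reserved = struct.unpack("!HHI", data[:8])
    return {
        "namespace_id": namespace_id,
        "node_len": word >> 11,                # upper 5 bits
        "flags": (word >> 7) & 0xF,            # next 4 bits
        "remaining_len": word & 0x7F,          # lower 7 bits
        "trace_type": type_and_reserved >> 8,  # upper 24 bits
        "reserved": type_and_reserved & 0xFF,  # lower 8 bits
    }


def record_preallocated(header: dict, data_space: bytearray,
                        node_data: bytes) -> bool:
    """Record node_data in a pre-allocated data space, or flag overflow.

    Assumption (for this sketch only): len(data_space) equals the initial
    RemainingLen * 4, and node_data already has the length
    (NodeLen + Opaque State Snapshot length) * 4 octets.
    """
    if len(node_data) % 4:
        raise ValueError("node data must be 4-octet aligned")
    units = len(node_data) // 4
    if header["remaining_len"] < units:
        header["flags"] |= O_BIT  # not enough room: add nothing, set O-bit
        return False
    # Recording starts at RemainingLen - (added length), in 4-octet units.
    start = (header["remaining_len"] - units) * 4
    data_space[start:start + len(node_data)] = node_data
    header["remaining_len"] -= units
    return True
```

With this offset rule the first node on the path writes at the end of the data space and each later node writes immediately in front of it, which matches the ordering of the node data list described above.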
746 Data field and associated data type for each of the IOAM-Data-Fields 747 is shown below: 749 Hop_Lim and node_id short format: 4-octet field defined as follows: 751 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 752 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 753 | Hop_Lim | node_id | 754 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 756 Hop_Lim: 1-octet unsigned integer. It is set to the Hop Limit 757 value in the packet at the node that records this data. Hop 758 Limit information is used to identify the location of the node 759 in the communication path. This is copied from the lower 760 layer, e.g., TTL value in IPv4 header or hop limit field from 761 IPv6 header of the packet when the packet is ready for 762 transmission. The semantics of the Hop_Lim field depend on the 763 lower layer protocol that IOAM is encapsulated into, and 764 therefore its specific semantics are outside the scope of this 765 memo. The value of this field MUST be set to 0xff when the 766 lower level does not have a TTL/Hop limit equivalent field. 768 node_id: 3-octet unsigned integer. Node identifier field to 769 uniquely identify a node within the IOAM-Namespace and 770 associated IOAM-Domain. The procedure to allocate, manage and 771 map the node_ids is beyond the scope of this document. 773 ingress_if_id and egress_if_id: 4-octet field defined as follows: 775 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 776 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 777 | ingress_if_id | egress_if_id | 778 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 780 ingress_if_id: 2-octet unsigned integer. Interface identifier to 781 record the ingress interface the packet was received on. 783 egress_if_id: 2-octet unsigned integer. Interface identifier to 784 record the egress interface the packet is forwarded out of. 
786 Note that because IOAM uses its own IOAM-Namespaces 787 for IOAM-Data-Fields, data fields like interface identifiers can 788 be used in a flexible way to represent system resources that are 789 associated with ingressing or egressing packets, e.g., 790 ingress_if_id could represent a physical interface, a virtual or 791 logical interface, or even a queue. 793 timestamp seconds: 4-octet unsigned integer. Absolute timestamp in 794 seconds that specifies the time at which the packet was received 795 by the node. This field has three possible formats, based on 796 either PTP [IEEE1588v2], NTP [RFC5905], or POSIX [POSIX]. The 797 three timestamp formats are specified in Section 5. In all three 798 cases, the Timestamp Seconds field contains the 32 most 799 significant bits of the timestamp format that is specified in 800 Section 5. If a node is not capable of populating this field, it 801 assigns the value 0xFFFFFFFF. Note that this is a legitimate 802 value that is valid for 1 second in approximately 136 years; the 803 analyzer should correlate several packets or compare the timestamp 804 value to its own time-of-day in order to detect the error 805 indication. 807 timestamp subseconds: 4-octet unsigned integer. Absolute timestamp 808 in subseconds that specifies the time at which the packet was 809 received by the node. This field has three possible formats, 810 based on either PTP [IEEE1588v2], NTP [RFC5905], or POSIX [POSIX]. 811 The three timestamp formats are specified in Section 5. In all 812 three cases, the Timestamp Subseconds field contains the 32 least 813 significant bits of the timestamp format that is specified in 814 Section 5. If a node is not capable of populating this field, it 815 assigns the value 0xFFFFFFFF. Note that this is a legitimate 816 value in the NTP format, valid for approximately 233 picoseconds 817 in every second.
If the NTP format is used, the analyzer should 818 correlate several packets in order to detect the error indication. 820 transit delay: 4-octet unsigned integer in the range 0 to 2^31-1. 821 It is the time in nanoseconds the packet spent in the transit 822 node. This can serve as an indication of the queuing delay at the 823 node. If the transit delay exceeds 2^31-1 nanoseconds then the 824 top bit 'O' is set to indicate overflow and the value is set to 825 0x80000000. When this field is part of the node data but the node 826 populating it is not able to fill it, the 827 field must be filled with the value 0xFFFFFFFF to indicate that it is not 828 populated. 830 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 831 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 832 |O| transit delay | 833 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 835 namespace specific data: 4-octet field which can be used by the node 836 to add IOAM-Namespace specific data. This represents a "free- 837 format" 4-octet bit field with its semantics defined in the 838 context of a specific IOAM-Namespace. 840 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 841 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 842 | namespace specific data | 843 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 845 queue depth: 4-octet unsigned integer field. This field indicates 846 the current length of the egress interface queue of the interface 847 from which the packet is forwarded. The queue depth is 848 expressed as the current number of memory buffers used by the 849 queue (a packet may consume one or more memory buffers, depending 850 on its size).
852 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 853 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 854 | queue depth | 855 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 857 Hop_Lim and node_id wide: 8-octet field defined as follows: 859 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 860 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 861 | Hop_Lim | node_id ~ 862 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 863 ~ node_id (contd) | 864 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 866 Hop_Lim: 1-octet unsigned integer. It is set to the Hop Limit 867 value in the packet at the node that records this data. Hop 868 Limit information is used to identify the location of the node 869 in the communication path. This is copied from the lower layer 870 for e.g. TTL value in IPv4 header or hop limit field from IPv6 871 header of the packet. The semantics of the Hop_Lim field 872 depend on the lower layer protocol that IOAM is encapsulated 873 into, and therefore its specific semantics are outside the 874 scope of this memo. The value of this field MUST be set to 875 0xff when the lower level does not have a TTL/Hop limit 876 equivalent field. 878 node_id: 7-octet unsigned integer. Node identifier field to 879 uniquely identify a node within the IOAM-Namespace and 880 associated IOAM-Domain. The procedure to allocate, manage and 881 map the node_ids is beyond the scope of this document. 883 ingress_if_id and egress_if_id wide: 8-octet field defined as 884 follows: 886 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 887 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 888 | ingress_if_id | 889 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 890 | egress_if_id | 891 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 893 ingress_if_id: 4-octet unsigned integer. 
Interface identifier to 894 record the ingress interface the packet was received on. 896 egress_if_id: 4-octet unsigned integer. Interface identifier to 897 record the egress interface the packet is forwarded out of. 899 namespace specific data wide: 8-octet field which can be used by the 900 node to add IOAM-Namespace specific data. This represents a 901 "free-format" 8-octet bit field with its semantics defined in the 902 context of a specific IOAM-Namespace. 904 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 905 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 906 | namespace specific data ~ 907 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 908 ~ namespace specific data (contd) | 909 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 911 buffer occupancy: 4-octet unsigned integer field. This field 912 indicates the current status of the occupancy of the common buffer 913 pool used by a set of queues. The units of this field may be 914 implementation specific. Hence, the units may need to be 915 interpreted within the context of an IOAM-Namespace and/or node_id 916 if used. The authors acknowledge that in some operational cases 917 there is a need for the units to be consistent across a packet 918 path through the network, and hence recommend that implementations 919 use a standard unit such as bytes. 921 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 922 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 923 | buffer occupancy | 924 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 926 Checksum Complement: 4-octet node data which contains a 4-octet 927 Checksum Complement field. The Checksum Complement is useful when 928 IOAM is transported over encapsulations that make use of a UDP 929 transport, such as VXLAN-GPE or Geneve. Without the Checksum 930 Complement, nodes adding IOAM node data must update the UDP 931 Checksum field.
When the Checksum Complement is present, an IOAM 932 encapsulating node or IOAM transit node adding node data MUST 933 carry out one of the following two alternatives in order to 934 maintain the correctness of the UDP Checksum value: 936 1. Recompute the UDP Checksum field. 938 2. Use the Checksum Complement to make a checksum-neutral update 939 in the UDP payload; the Checksum Complement is assigned a 940 value that complements the rest of the node data fields that 941 were added by the current node, causing the existing UDP 942 Checksum field to remain correct. 944 IOAM decapsulating nodes MUST recompute the UDP Checksum field, 945 since they do not know whether previous hops modified the UDP 946 Checksum field or the Checksum Complement field. 948 Checksum Complement fields are used in a similar manner in 949 [RFC7820] and [RFC7821]. 951 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 952 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 953 | Checksum Complement | 954 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 956 Opaque State Snapshot: Opaque State Snapshot is a variable length 957 field and immediately follows the fixed length IOAM-Data-Fields 958 defined above. It allows the network element to store an 959 arbitrary state in the node data field, without a pre-defined 960 schema. The schema is to be defined within the context of an 961 IOAM-Namespace. The schema needs to be made known to the analyzer 962 by some out-of-band mechanism. The specification of this 963 mechanism is beyond the scope of this document. A 24-bit "Schema 964 Id" field, interpreted within the context of an IOAM-Namespace, 965 indicates which particular schema is used, and should be 966 configured on the network element by the operator. 
968 0 1 2 3 969 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 970 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 971 | Length | Schema ID | 972 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 973 | | 974 | | 975 | Opaque data | 976 ~ ~ 977 . . 978 . . 979 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 980 Length: 1-octet unsigned integer. It is the length in multiples 981 of 4-octets of the Opaque data field that follows Schema ID. 983 Schema ID: 3-octet unsigned integer identifying the schema of 984 Opaque data. 986 Opaque data: Variable length field. This field is interpreted as 987 specified by the schema identified by the Schema ID. 989 When this field is part of the node data but a node populating 990 the field has no opaque state data to report, the Length must be 991 set to 0 and the Schema ID must be set to 0xFFFFFF to mean no 992 schema. 994 4.4.3. Examples of IOAM node data 996 An entry in the "node data list" array can have different formats, 997 following the needs of the deployment. Some deployments might only 998 be interested in recording the node identifiers, whereas others might 999 be interested in recording both the node identifier and a timestamp. This 1000 section provides example entries of the "node data list".
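The mapping from an IOAM-Trace-Type value to the layout of a node data element, as used in the examples of this section, can be sketched informally in Python as follows. This is a non-normative illustration. The table FIELDS, the function name describe_trace_type, and the field name strings are assumptions introduced for this sketch; bits 12-21 (undefined) and bit 23 (reserved) are deliberately ignored.

```python
# (bit, field name, length in 4-octet units); bit 0 is the most
# significant bit of the 24-bit IOAM-Trace-Type.
FIELDS = [
    (0, "Hop_Lim and node_id (short)", 1),
    (1, "ingress_if_id and egress_if_id (short)", 1),
    (2, "timestamp seconds", 1),
    (3, "timestamp subseconds", 1),
    (4, "transit delay", 1),
    (5, "namespace specific data (short)", 1),
    (6, "queue depth", 1),
    (7, "Checksum Complement", 1),
    (8, "Hop_Lim and node_id (wide)", 2),
    (9, "ingress_if_id and egress_if_id (wide)", 2),
    (10, "namespace specific data (wide)", 2),
    (11, "buffer occupancy", 1),
]
OPAQUE_BIT = 22  # variable-length Opaque State Snapshot, not counted in NodeLen


def describe_trace_type(trace_type: int):
    """Return (field names in packing order, NodeLen, opaque snapshot present)."""
    names, node_len = [], 0
    for bit, name, units in FIELDS:
        if trace_type & (1 << (23 - bit)):
            names.append(name)
            node_len += units
    has_opaque = bool(trace_type & (1 << (23 - OPAQUE_BIT)))
    return names, node_len, has_opaque


if __name__ == "__main__":
    for value in (0xD40000, 0xC00000, 0x308002):
        print(hex(value), describe_trace_type(value))
```

Because the fields are listed in bit order, the returned list of names is also the order in which the fields are packed in each node data element.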
1002 0xD40000: IOAM-Trace-Type is 0xD40000 (0b110101000000000000000000) 1003 then the format of node data is: 1005 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1006 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1007 | Hop_Lim | node_id | 1008 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1009 | ingress_if_id | egress_if_id | 1010 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1011 | timestamp subseconds | 1012 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1013 | namespace specific data | 1014 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1016 0xC00000: IOAM-Trace-Type is 0xC00000 (0b110000000000000000000000) 1017 then the format is: 1019 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1020 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1021 | Hop_Lim | node_id | 1022 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1023 | ingress_if_id | egress_if_id | 1024 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1026 0x900000: IOAM-Trace-Type is 0x900000 (0b100100000000000000000000) 1027 then the format is: 1029 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1030 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1031 | Hop_Lim | node_id | 1032 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1033 | timestamp subseconds | 1034 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1036 0x840000: IOAM-Trace-Type is 0x840000 (0b100001000000000000000000) 1037 then the format is: 1039 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1040 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1041 | Hop_Lim | node_id | 1042 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1043 | namespace specific data | 1044 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1046 0x940000: 
IOAM-Trace-Type is 0x940000 (0b100101000000000000000000) 1047 then the format is: 1049 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1050 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1051 | Hop_Lim | node_id | 1052 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1053 | timestamp subseconds | 1054 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1055 | namespace specific data | 1056 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1058 0x308002: IOAM-Trace-Type is 0x308002 (0b001100001000000000000010) 1059 then the format is: 1061 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1062 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1063 | timestamp seconds | 1064 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1065 | timestamp subseconds | 1066 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1067 | Hop_Lim | node_id | 1068 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1069 | node_id(contd) | 1070 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1071 | Length | Schema ID | 1072 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1073 | | 1074 | | 1075 | Opaque data | 1076 ~ ~ 1077 . . 1078 . . 1079 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1081 4.5. IOAM Proof of Transit Option-Type 1083 The IOAM Proof of Transit Option-Type supports path or service 1084 function chain [RFC7665] verification use cases. Proof-of-transit 1085 uses methods like nested hashing or nested encryption of the IOAM 1086 data or mechanisms such as Shamir's Secret Sharing Scheme (SSSS).
1087 While the details of how the IOAM data for the proof of transit option is 1088 processed at IOAM encapsulating, decapsulating and transit nodes are 1089 outside the scope of this document, all of these approaches share the 1090 need to uniquely identify a packet as well as iteratively operate on 1091 a set of information that is handed from node to node. 1092 Correspondingly, two pieces of information are added as IOAM-Data- 1093 Fields to the packet: 1095 o Random: Unique identifier for the packet (e.g., 64 bits allow for 1096 the unique identification of 2^64 packets). 1098 o Cumulative: Information which is handed from node to node and 1099 updated by every node according to a verification algorithm. 1101 The IOAM Proof of Transit Option-Type consists of a fixed size "IOAM 1102 proof of transit option header" and "IOAM proof of transit option 1103 data fields": 1105 IOAM proof of transit option header: 1107 0 1 2 3 1108 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1109 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1110 | Namespace-ID |IOAM POT Type | IOAM POT flags| 1111 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1113 IOAM proof of transit Option-Type IOAM-Data-Fields MUST be 1114 4-octet aligned: 1116 0 1 2 3 1117 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1118 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1119 | POT Option data field determined by IOAM-POT-Type | 1120 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1122 Namespace-ID: 16-bit identifier of an IOAM-Namespace. The 1123 Namespace-ID value of 0x0000 is defined as the default value and 1124 MUST be known to all the nodes implementing IOAM. For any other 1125 Namespace-ID value that does not match any Namespace-ID the node 1126 is configured to operate on, the node MUST NOT change the contents 1127 of the IOAM-Data-Fields.
1129 IOAM POT Type: 8-bit identifier of a particular POT variant that 1130 specifies the POT data that is included. This document defines 1131 POT Type 0: 1133 0: POT data is a 16 Octet field as described below. 1135 IOAM POT flags: 8-bit. Following flags are defined: 1137 Bit 0 "Profile-to-use" (P-bit) (most significant bit). For IOAM 1138 POT types that use a maximum of two profiles to drive 1139 computation, indicates which POT-profile is used. The two 1140 profiles are numbered 0, 1. 1142 Bit 1-7 Reserved: Must be set to zero upon transmission and 1143 ignored upon receipt. 1145 POT Option data: Variable-length field. The type of which is 1146 determined by the IOAM-POT-Type. 1148 4.5.1. IOAM Proof of Transit Type 0 1150 IOAM proof of transit option of IOAM POT Type 0: 1152 0 1 2 3 1153 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1154 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1155 | Namespace-ID |IOAM POT Type=0|P|R R R R R R R| 1156 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ 1157 | Random | | 1158 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P 1159 | Random(contd) | O 1160 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ T 1161 | Cumulative | | 1162 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | 1163 | Cumulative (contd) | | 1164 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ 1166 Namespace-ID: 16-bit identifier of an IOAM-Namespace. The 1167 Namespace-ID value of 0x0000 is defined as the default value and 1168 MUST be known to all the nodes implementing IOAM. For any other 1169 Namespace-ID value that does not match any Namespace-ID the node 1170 is configured to operate on, the node MUST NOT change the contents 1171 of the IOAM-Data-Fields. 1173 IOAM POT Type: 8-bit identifier of a particular POT variant that 1174 specifies the POT data that is included. 
This section defines the 1175 POT data when the IOAM POT Type is set to the value 0. 1177 P bit: 1-bit. "Profile-to-use" (P-bit) (most significant bit). 1178 Indicates which POT-profile is used to generate the Cumulative. 1179 Any node participating in POT will have a maximum of 2 profiles 1180 configured that drive the computation of cumulative. The two 1181 profiles are numbered 0, 1. This bit conveys whether profile 0 or 1182 profile 1 is used to compute the Cumulative. 1184 R (7 bits): 7-bit IOAM POT flags for future use. MUST be set to 1185 zero upon transmission and ignored upon receipt. 1187 Random: 64-bit Per packet Random number. 1189 Cumulative: 64-bit Cumulative that is updated at specific nodes by 1190 processing per packet Random number field and configured 1191 parameters. 1193 Note: Larger or smaller sizes of "Random" and "Cumulative" data are 1194 feasible and could be required for certain deployments (e.g. in case 1195 of space constraints in the encapsulation protocols used). Future 1196 documents may address different sizes of data for "proof of transit". 1198 4.6. IOAM Edge-to-Edge Option-Type 1200 The IOAM Edge-to-Edge Option-Type is to carry data that is added by 1201 the IOAM encapsulating node and interpreted by IOAM decapsulating 1202 node. The IOAM transit nodes MAY process the data but MUST NOT 1203 modify it. 
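For the IOAM POT Type 0 layout of Section 4.5.1, the following Python sketch shows one possible way to encode and decode the 20-octet option (4-octet header followed by the 64-bit Random and 64-bit Cumulative fields). It is an informal illustration only. The function names are hypothetical, and the computation of the Cumulative value itself is outside the scope of this sketch, as it is outside the scope of this document.

```python
import struct

# Namespace-ID (16 bits), IOAM POT Type (8 bits), IOAM POT flags (8 bits),
# Random (64 bits), Cumulative (64 bits), all in network byte order.
POT_TYPE0 = struct.Struct("!HBBQQ")


def pack_pot_type0(namespace_id: int, profile: int,
                   random: int, cumulative: int) -> bytes:
    """Encode a POT Type 0 option; profile (0 or 1) is carried in the P-bit."""
    if profile not in (0, 1):
        raise ValueError("profile must be 0 or 1")
    flags = profile << 7  # P-bit is bit 0 (most significant); R bits are zero
    return POT_TYPE0.pack(namespace_id, 0, flags, random, cumulative)


def unpack_pot_type0(data: bytes) -> dict:
    """Decode a POT Type 0 option, ignoring the reserved flag bits."""
    namespace_id, pot_type, flags, random, cumulative = POT_TYPE0.unpack(
        data[:POT_TYPE0.size])
    if pot_type != 0:
        raise ValueError("not IOAM POT Type 0")
    return {
        "namespace_id": namespace_id,
        "profile": flags >> 7,
        "random": random,
        "cumulative": cumulative,
    }
```

The reserved flag bits are written as zero and masked out on receipt, in line with the "set to zero upon transmission and ignored upon receipt" rule.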
1205 The IOAM Edge-to-Edge Option-Type consists of a fixed size "IOAM Edge- 1206 to-Edge Option-Type header" and "IOAM Edge-to-Edge Option-Type data 1207 fields": 1209 IOAM Edge-to-Edge Option-Type header: 1211 0 1 2 3 1212 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1213 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1214 | Namespace-ID | IOAM-E2E-Type | 1215 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1217 IOAM Edge-to-Edge Option-Type IOAM-Data-Fields MUST 1218 be 4-octet aligned: 1220 0 1 2 3 1221 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1222 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1223 | E2E Option data field determined by IOAM-E2E-Type | 1224 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1226 Namespace-ID: 16-bit identifier of an IOAM-Namespace. The 1227 Namespace-ID value of 0x0000 is defined as the default value and 1228 MUST be known to all the nodes implementing IOAM. For any other 1229 Namespace-ID value that does not match any Namespace-ID the node 1230 is configured to operate on, the node MUST NOT change the 1231 contents of the IOAM-Data-Fields. 1233 IOAM-E2E-Type: A 16-bit identifier which specifies which data types 1234 are used in the E2E option data. The IOAM-E2E-Type value is a bit 1235 field. The order of packing the E2E option data field elements 1236 follows the bit order of the IOAM-E2E-Type field, as follows: 1238 Bit 0 (Most significant bit) When set indicates presence of a 1239 64-bit sequence number added to a specific "packet group" 1240 which is used to detect packet loss, packet reordering, 1241 or packet duplication within the group. The "packet 1242 group" is deployment dependent and defined at the IOAM 1243 encapsulating node, e.g., by n-tuple based classification 1244 of packets.
1246 Bit 1 When set indicates presence of a 32-bit sequence number 1247 added to a specific "packet group" which is used to 1248 detect packet loss, packet reordering, or packet 1249 duplication within that group. The "packet group" is 1250 deployment dependent and defined at the IOAM 1251 encapsulating node e.g. by n-tuple based classification 1252 of packets. 1254 Bit 2 When set indicates presence of timestamp seconds, 1255 representing the time at which the packet entered the 1256 IOAM domain. Within the IOAM encapsulating node, the 1257 time that the timestamp is retrieved can depend on the 1258 implementation. Some possibilities are: 1) the time at 1259 which the packet was received by the node, 2) the time at 1260 which the packet was transmitted by the node, 3) when a 1261 tunnel encapsulation is used, the point at which the 1262 packet is encapsulated into the tunnel. Each 1263 implementation should document when the E2E timestamp 1264 that is going to be put in the packet is retrieved. This 1265 4-octet field has three possible formats; based on either 1266 PTP [IEEE1588v2], NTP [RFC5905], or POSIX [POSIX]. The 1267 three timestamp formats are specified in Section 5. In 1268 all three cases, the Timestamp Seconds field contains the 1269 32 most significant bits of the timestamp format that is 1270 specified in Section 5. If a node is not capable of 1271 populating this field, it assigns the value 0xFFFFFFFF. 1272 Note that this is a legitimate value that is valid for 1 1273 second in approximately 136 years; the analyzer should 1274 correlate several packets or compare the timestamp value 1275 to its own time-of-day in order to detect the error 1276 indication. 1278 Bit 3 When set indicates presence of timestamp subseconds, 1279 representing the time at which the packet entered the 1280 IOAM domain. This 4-octet field has three possible 1281 formats; based on either PTP [IEEE1588v2], NTP [RFC5905], 1282 or POSIX [POSIX]. 
The three timestamp formats are 1283 specified in Section 5. In all three cases, the 1284 Timestamp Subseconds field contains the 32 least 1285 significant bits of the timestamp format that is 1286 specified in Section 5. If a node is not capable of 1287 populating this field, it assigns the value 0xFFFFFFFF. 1289 Note that this is a legitimate value in the NTP format, 1290 valid for approximately 233 picoseconds in every second. 1291 If the NTP format is used, the analyzer should correlate 1292 several packets in order to detect the error indication. 1294 Bit 4-15 Undefined. An IOAM encapsulating node MUST set the value 1295 of these bits to zero upon transmission and ignore them upon 1296 receipt. 1298 E2E Option data: Variable-length field, the type of which is 1299 determined by the IOAM-E2E-Type. 1301 5. Timestamp Formats 1303 The IOAM-Data-Fields include a timestamp field which is represented 1304 in one of three possible timestamp formats. It is assumed that the 1305 management plane is responsible for determining which timestamp 1306 format is used. 1308 5.1. PTP Truncated Timestamp Format 1310 The Precision Time Protocol (PTP) [IEEE1588v2] uses an 80-bit 1311 timestamp format. The truncated timestamp format is a 64-bit field, 1312 which is the 64 least significant bits of the 80-bit PTP timestamp. 1313 The PTP truncated format is specified in Section 4.3 of 1314 [I-D.ietf-ntp-packet-timestamps], and the details are presented below 1315 for the sake of completeness.
1317 0 1 2 3 1318 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1319 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1320 | Seconds | 1321 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1322 | Nanoseconds | 1323 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1325 Figure 1: PTP [IEEE1588v2] Truncated Timestamp Format 1327 Timestamp field format: 1329 Seconds: specifies the integer portion of the number of seconds 1330 since the epoch. 1332 + Size: 32 bits. 1334 + Units: seconds. 1336 Nanoseconds: specifies the fractional portion of the number of 1337 seconds since the epoch. 1339 + Size: 32 bits. 1341 + Units: nanoseconds. The value of this field is in the range 0 1342 to (10^9)-1. 1344 Epoch: 1346 The PTP [IEEE1588v2] epoch is 1 January 1970 00:00:00 TAI, which 1347 is 31 December 1969 23:59:51.999918 UTC. 1349 Resolution: 1351 The resolution is 1 nanosecond. 1353 Wraparound: 1355 This time format wraps around every 2^32 seconds, which is roughly 1356 136 years. The next wraparound will occur in the year 2106. 1358 Synchronization Aspects: 1360 It is assumed that nodes that run this protocol are synchronized 1361 among themselves. Nodes may be synchronized to a global reference 1362 time. Note that if PTP [IEEE1588v2] is used for synchronization, 1363 the timestamp may be derived from the PTP-synchronized clock, 1364 allowing the timestamp to be measured with respect to the clock of 1365 a PTP Grandmaster clock. 1367 The PTP truncated timestamp format is not affected by leap 1368 seconds. 1370 5.2. NTP 64-bit Timestamp Format 1372 The Network Time Protocol (NTP) [RFC5905] timestamp format is 64 bits 1373 long. This format is specified in Section 4.2.1 of 1374 [I-D.ietf-ntp-packet-timestamps], and the details are presented below 1375 for the sake of completeness.
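As a non-normative illustration, the following Python sketch shows one way a node might derive the two 32-bit words of the NTP format (laid out in Figure 2) from a clock that counts seconds and nanoseconds since 1 January 1970 UTC. The function name, the input representation, and the truncating conversion of the fraction are assumptions made for this example.

```python
# Seconds between the NTP epoch (1 January 1900) and 1 January 1970.
NTP_UNIX_OFFSET = 2_208_988_800

def unix_to_ntp(unix_seconds: int, nanoseconds: int) -> tuple:
    """Return (Seconds, Fraction) for the NTP 64-bit timestamp format.

    Seconds is taken modulo 2^32, which models the wraparound of the
    format. Fraction is expressed in units of 2^(-32) seconds and is
    truncated rather than rounded (an assumption of this sketch).
    """
    if not 0 <= nanoseconds < 10**9:
        raise ValueError("nanoseconds must be in 0..(10^9)-1")
    seconds = (unix_seconds + NTP_UNIX_OFFSET) % 2**32
    fraction = (nanoseconds << 32) // 10**9
    return seconds, fraction
```

For example, half a second past the 1970 epoch maps to a Seconds value of 2208988800 and a Fraction value of 2^31.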
1377 0 1 2 3 1378 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1379 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1380 | Seconds | 1381 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1382 | Fraction | 1383 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1385 Figure 2: NTP [RFC5905] 64-bit Timestamp Format 1387 Timestamp field format: 1389 Seconds: specifies the integer portion of the number of seconds 1390 since the epoch. 1392 + Size: 32 bits. 1394 + Units: seconds. 1396 Fraction: specifies the fractional portion of the number of 1397 seconds since the epoch. 1399 + Size: 32 bits. 1401 + Units: the unit is 2^(-32) seconds, which is roughly equal to 1402 233 picoseconds. 1404 Epoch: 1406 The epoch is 1 January 1900 at 00:00 UTC. 1408 Resolution: 1410 The resolution is 2^(-32) seconds. 1412 Wraparound: 1414 This time format wraps around every 2^32 seconds, which is roughly 1415 136 years. The next wraparound will occur in the year 2036. 1417 Synchronization Aspects: 1419 Nodes that use this timestamp format will typically be 1420 synchronized to UTC using NTP [RFC5905]. Thus, the timestamp may 1421 be derived from the NTP-synchronized clock, allowing the timestamp 1422 to be measured with respect to the clock of an NTP server. 1424 The NTP timestamp format is affected by leap seconds; it 1425 represents the number of seconds since the epoch minus the number 1426 of leap seconds that have occurred since the epoch. The value of 1427 a timestamp during or slightly after a leap second may be 1428 temporarily inaccurate. 1430 5.3. POSIX-based Timestamp Format 1432 This timestamp format is based on the POSIX time format [POSIX]. The 1433 detailed specification of the timestamp format used in this document 1434 is presented below. 
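As a non-normative illustration, the following Python sketch packs and unpacks the two 32-bit words of the POSIX-based format shown in Figure 3. The function names and the use of network byte order are assumptions made for this example. The decoder treats 0xFFFFFFFF in the Microseconds word as "not populated", which is unambiguous in this format because valid values lie in the range 0 to (10^6)-1; the same value in the Seconds word is a legitimate timestamp and is therefore passed through unchanged.

```python
import struct

NOT_POPULATED = 0xFFFFFFFF  # value assigned by a node that cannot fill a field

def encode_posix(seconds: int, microseconds: int) -> bytes:
    """Pack Seconds and Microseconds as two 32-bit words (byte order assumed)."""
    if not 0 <= microseconds < 10**6:
        raise ValueError("microseconds must be in 0..(10^6)-1")
    # Seconds is taken modulo 2^32 to model the wraparound of the format.
    return struct.pack("!II", seconds % 2**32, microseconds)

def decode_posix(data: bytes) -> tuple:
    """Return (seconds, microseconds); microseconds is None if not populated."""
    seconds, microseconds = struct.unpack("!II", data)
    if microseconds == NOT_POPULATED:
        microseconds = None
    elif microseconds >= 10**6:
        raise ValueError("microseconds out of range")
    return seconds, microseconds
```

An analyzer that receives 0xFFFFFFFF in the Seconds word would still need to correlate several packets, or compare against its own time of day, before concluding that the field was not populated.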
1436 0 1 2 3 1437 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1438 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1439 | Seconds | 1440 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1441 | Microseconds | 1442 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1444 Figure 3: POSIX-based Timestamp Format 1446 Timestamp field format: 1448 Seconds: specifies the integer portion of the number of seconds 1449 since the epoch. 1451 + Size: 32 bits. 1453 + Units: seconds. 1455 Microseconds: specifies the fractional portion of the number of 1456 seconds since the epoch. 1458 + Size: 32 bits. 1460 + Units: the unit is microseconds. The value of this field is in 1461 the range 0 to (10^6)-1. 1463 Epoch: 1465 The epoch is 1 January 1970 00:00:00 UTC. 1468 Resolution: 1470 The resolution is 1 microsecond. 1472 Wraparound: 1474 This time format wraps around every 2^32 seconds, which is roughly 1475 136 years. The next wraparound will occur in the year 2106. 1477 Synchronization Aspects: 1479 It is assumed that nodes that use this timestamp format run the Linux 1480 operating system, and hence use the POSIX time. In some cases 1481 nodes may be synchronized to UTC using a synchronization mechanism 1482 that is outside the scope of this document, such as NTP [RFC5905]. 1483 Thus, the timestamp may be derived from the NTP-synchronized 1484 clock, allowing the timestamp to be measured with respect to the 1485 clock of an NTP server. 1487 The POSIX-based timestamp format is affected by leap seconds; it 1488 represents the number of seconds since the epoch minus the number 1489 of leap seconds that have occurred since the epoch. The value of 1490 a timestamp during or slightly after a leap second may be 1491 temporarily inaccurate. 1493 6. IOAM Data Export 1495 IOAM nodes collect information for packets traversing a domain that 1496 supports IOAM.
IOAM decapsulating nodes as well as IOAM transit 1497 nodes can choose to retrieve IOAM information from the packet, 1498 process the information further and export the information using, 1499 e.g., IPFIX. The mechanisms and associated data formats for 1500 exporting IOAM data are outside the scope of this document. 1502 Raw data export of IOAM data using IPFIX is discussed in 1503 [I-D.spiegel-ippm-ioam-rawexport]. 1505 7. IANA Considerations 1507 This document requests the following IANA Actions. 1509 7.1. Creation of a new In-Situ OAM Protocol Parameters Registry (IOAM) 1510 Protocol Parameters IANA registry 1512 IANA is requested to create a new protocol registry for "In-Situ OAM 1513 (IOAM) Protocol Parameters". This is the common registry that will 1514 include registrations for all IOAM-Namespaces. Each of the registries 1515 listed below: 1517 IOAM Option-Type 1519 IOAM Trace-Type 1520 IOAM Trace-Flags 1522 IOAM POT-Type 1524 IOAM POT-Flags 1526 IOAM E2E-Type 1528 IOAM Namespace-ID 1530 will contain the current set of possibilities defined in this 1531 document. New registries in this name space are created via the RFC 1532 Required process as per [RFC8126]. 1534 The subsequent sub-sections detail the registries herein contained. 1536 7.2. IOAM Option-Type Registry 1538 This registry defines 128 code points for the IOAM Option-Type field 1539 for identifying IOAM Option-Types as explained in Section 4. The 1540 following code points are defined in this draft: 1542 0 IOAM Pre-allocated Trace Option-Type 1544 1 IOAM Incremental Trace Option-Type 1546 2 IOAM POT Option-Type 1548 3 IOAM E2E Option-Type 1550 4 - 127 are available for assignment via the RFC Required process as per 1551 [RFC8126]. 1553 7.3. IOAM Trace-Type Registry 1555 This registry defines code points for each bit in the 24-bit IOAM- 1556 Trace-Type field for the Pre-allocated trace option and the Incremental trace 1557 option defined in Section 4.4.
The meanings of Bits 0 - 11 for the trace 1558 type are defined in this document in Paragraph 5 of Section 4.4.1: 1560 Bit 0 hop_Lim and node_id in short format 1562 Bit 1 ingress_if_id and egress_if_id in short format 1564 Bit 2 timestamp seconds 1566 Bit 3 timestamp subseconds 1567 Bit 4 transit delay 1569 Bit 5 namespace specific data in short format 1571 Bit 6 queue depth 1573 Bit 7 checksum complement 1575 Bit 8 hop_Lim and node_id in wide format 1577 Bit 9 ingress_if_id and egress_if_id in wide format 1579 Bit 10 namespace specific data in wide format 1581 Bit 11 buffer occupancy 1583 Bit 22 variable length Opaque State Snapshot 1585 Bit 23 reserved 1587 Bits 12 - 21 are available for assignment via the RFC 1588 Required process as per [RFC8126]. 1590 7.4. IOAM Trace-Flags Registry 1592 This registry defines code points for each bit in the 4-bit flags for 1593 the Pre-allocated trace option and for the Incremental trace option 1594 defined in Section 4.4. The meaning of Bit 0 (the most significant 1595 bit) for trace flags is defined in this document in Paragraph 3 of 1596 Section 4.4.1: 1598 Bit 0 "Overflow" (O-bit) 1600 Bits 1 - 3 are available for assignment via the RFC Required process as 1601 per [RFC8126]. 1603 7.5. IOAM POT-Type Registry 1605 This registry defines 256 code points to define the IOAM POT Type for 1606 the IOAM proof of transit option (Section 4.5). The code point value 0 is 1607 defined in this document: 1609 0: 16 Octet POT data 1611 1 - 255 are available for assignment via the RFC Required process as per 1612 [RFC8126]. 1614 7.6. IOAM POT-Flags Registry 1616 This registry defines code points for each bit in the 8-bit flags for 1617 the IOAM POT option defined in Section 4.5. The meaning of Bit 0 for 1618 IOAM POT flags is defined in this document in Section 4.5: 1620 Bit 0 "Profile-to-use" (P-bit) 1622 Bits 1 - 7 are available for assignment via the RFC 1623 Required process as per [RFC8126]. 1625 7.7.
IOAM E2E-Type Registry 1627 This registry defines code points for each bit in the 16-bit IOAM- 1628 E2E-Type field for the IOAM E2E option (Section 4.6). The meanings of Bits 0 1629 - 3 are defined in this document: 1631 Bit 0 64-bit sequence number 1633 Bit 1 32-bit sequence number 1635 Bit 2 timestamp seconds 1637 Bit 3 timestamp subseconds 1639 Bits 4 - 15 are available for assignment via the RFC 1640 Required process as per [RFC8126]. 1642 7.8. IOAM Namespace-ID Registry 1644 IANA is requested to set up an "IOAM Namespace-ID Registry", 1645 containing 16-bit values. The meaning of the value 0x0000 is defined in this 1646 document. IANA is requested to reserve the values 0x0001 to 0x7FFF 1647 for private use (managed by operators), as specified in Section 4.3 1648 of the current document. Registry entries for the values 0x8000 to 1649 0xFFFF are to be assigned via the "Expert Review" policy defined in 1650 [RFC8126]. 1652 0: default namespace (known to all IOAM nodes) 1654 0x0001 - 0x7FFF: reserved for private use 1656 0x8000 - 0xFFFF: unassigned 1658 8. Security Considerations 1660 As discussed in [RFC7276], a successful attack on an OAM protocol in 1661 general, and specifically on IOAM, can prevent the detection of 1662 failures or anomalies, or create a false illusion of nonexistent 1663 ones. In particular, these threats are applicable by compromising 1664 the integrity of IOAM data, either by maliciously modifying IOAM 1665 options in transit, or by injecting packets with maliciously 1666 generated IOAM options. 1668 The Proof of Transit Option-Type (Section 4.5) is used for 1669 verifying the path of data packets. The security considerations of 1670 POT are further discussed in [I-D.ietf-sfc-proof-of-transit].
1672 From a confidentiality perspective, although IOAM options do not 1673 contain user data, they can be used for network reconnaissance, 1674 allowing attackers to collect information about network paths, 1675 performance, queue states, buffer occupancy and other information. 1676 Moreover, if IOAM data leaks from the IOAM domain it may enable 1677 reconnaissance beyond the scope of the IOAM domain. Note that in 1678 case IOAM is used in "Direct Exporting" mode 1679 [I-D.ioamteam-ippm-ioam-direct-export], the IOAM related trace 1680 information would not be available in the customer data packets, but 1681 would trigger export of packet related IOAM information at every 1682 node, thus restricting the potential threat to the management plane 1683 and mitigating the leakage threat. IOAM data exporting and the way 1684 it is secured is outside the scope of this document. 1686 IOAM can be used as a means for implementing Denial of Service (DoS) 1687 attacks, or for amplifying them. For example, a malicious attacker 1688 can add an IOAM header to packets in order to consume the resources 1689 of network devices that take part in IOAM or entities that receive, 1690 collect or analyze the IOAM data. Another example is a packet length 1691 attack, in which an attacker pushes headers associated with IOAM 1692 Option-Types into data packets, causing these packets to be increased 1693 beyond the MTU size, resulting in fragmentation or in packet drops. 1695 Since IOAM options may include timestamps, if network devices use 1696 synchronization protocols then any attack on the time protocol 1697 [RFC7384] can compromise the integrity of the timestamp-related data 1698 fields. 1700 At the management plane, attacks may be implemented by misconfiguring 1701 or by maliciously configuring IOAM-enabled nodes in a way that 1702 enables other attacks. 
Thus, IOAM configuration should be secured in 1703 a way that authenticates authorized users and verifies the integrity 1704 of configuration procedures. 1706 The current document does not define a specific IOAM encapsulation. 1707 It should be noted that some IOAM encapsulation types may introduce 1708 specific security considerations. A specification that defines an 1709 IOAM encapsulation is expected to address the respective 1710 encapsulation-specific security considerations. 1712 Notably, in most cases IOAM is expected to be deployed in specific 1713 network domains, thus confining the potential attack vectors to 1714 within the network domain. A limited administrative domain provides 1715 the operator with the means to select, monitor, and control the 1716 access of all the network devices, making these devices trusted by 1717 the operator. Indeed, in order to limit the scope of threats 1718 mentioned above to within the current network domain, the network 1719 operator is expected to enforce policies that prevent IOAM traffic 1720 from leaking outside of the IOAM domain, and prevent IOAM data from 1721 outside the domain from being processed and used within the domain. 1723 The security considerations of a system that deploys IOAM, much like 1724 any system, should be reviewed on a per-deployment-scenario basis, 1725 based on a system-specific threat analysis, which may lead to 1726 specific security solutions that are beyond the scope of the current 1727 document. For example, in an IOAM deployment that is not confined to 1728 a single LAN, but spans multiple inter-connected sites, the inter- 1729 site links may be secured (e.g., by IPsec) in order to avoid external 1730 threats. 1732 9.
Acknowledgements 1734 The authors would like to thank Eric Vyncke, Nalini Elkins, Srihari 1735 Raghavan, Ranganathan T S, Karthik Babu Harichandra Babu, Akshaya 1736 Nadahalli, LJ Wobker, Erik Nordmark, Vengada Prasad Govindan, Andrew 1737 Yourtchenko, Aviv Kfir, Tianran Zhou and Zhenbin (Robin) for their 1738 comments and advice. 1740 This document leverages and builds on top of several concepts 1741 described in [I-D.kitamura-ipv6-record-route]. The authors would 1742 like to acknowledge the work done by the author Hiroshi Kitamura and 1743 people involved in writing it. 1745 The authors would like to gratefully acknowledge useful review and 1746 insightful comments received from Joe Clarke, Al Morton, Tom Herbert, 1747 Haoyu Song, Mickey Spiegel and Barak Gafni. 1749 10. References 1750 10.1. Normative References 1752 [IEEE1588v2] 1753 Institute of Electrical and Electronics Engineers, "IEEE 1754 Std 1588-2008 - IEEE Standard for a Precision Clock 1755 Synchronization Protocol for Networked Measurement and 1756 Control Systems", IEEE Std 1588-2008, 2008, 1757 . 1760 [POSIX] Institute of Electrical and Electronics Engineers, "IEEE 1761 Std 1003.1-2008 (Revision of IEEE Std 1003.1-2004) - IEEE 1762 Standard for Information Technology - Portable Operating 1763 System Interface (POSIX(R))", IEEE Std 1003.1-2008, 2008, 1764 . 1767 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1768 Requirement Levels", BCP 14, RFC 2119, 1769 DOI 10.17487/RFC2119, March 1997, 1770 . 1772 [RFC5905] Mills, D., Martin, J., Ed., Burbank, J., and W. Kasch, 1773 "Network Time Protocol Version 4: Protocol and Algorithms 1774 Specification", RFC 5905, DOI 10.17487/RFC5905, June 2010, 1775 . 1777 [RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for 1778 Writing an IANA Considerations Section in RFCs", BCP 26, 1779 RFC 8126, DOI 10.17487/RFC8126, June 2017, 1780 . 1782 10.2. Informative References 1784 [I-D.ietf-ntp-packet-timestamps] 1785 Mizrahi, T., Fabini, J., and A.
Morton, "Guidelines for 1786 Defining Packet Timestamps", draft-ietf-ntp-packet- 1787 timestamps-08 (work in progress), February 2020. 1789 [I-D.ietf-nvo3-geneve] 1790 Gross, J., Ganga, I., and T. Sridhar, "Geneve: Generic 1791 Network Virtualization Encapsulation", draft-ietf- 1792 nvo3-geneve-15 (work in progress), February 2020. 1794 [I-D.ietf-nvo3-vxlan-gpe] 1795 Maino, F., Kreeger, L., and U. Elzur, "Generic Protocol 1796 Extension for VXLAN", draft-ietf-nvo3-vxlan-gpe-09 (work 1797 in progress), December 2019. 1799 [I-D.ietf-sfc-proof-of-transit] 1800 Brockners, F., Bhandari, S., Mizrahi, T., Dara, S., and S. 1801 Youell, "Proof of Transit", draft-ietf-sfc-proof-of- 1802 transit-04 (work in progress), November 2019. 1804 [I-D.ioamteam-ippm-ioam-direct-export] 1805 Song, H., Gafni, B., Zhou, T., Li, Z., Brockners, F., 1806 Bhandari, S., Sivakolundu, R., and T. Mizrahi, "In-situ 1807 OAM Direct Exporting", draft-ioamteam-ippm-ioam-direct- 1808 export-00 (work in progress), October 2019. 1810 [I-D.kitamura-ipv6-record-route] 1811 Kitamura, H., "Record Route for IPv6 (PR6) Hop-by-Hop 1812 Option Extension", draft-kitamura-ipv6-record-route-00 1813 (work in progress), November 2000. 1815 [I-D.lapukhov-dataplane-probe] 1816 Lapukhov, P. and r. remy@barefootnetworks.com, "Data-plane 1817 probe for in-band telemetry collection", draft-lapukhov- 1818 dataplane-probe-01 (work in progress), June 2016. 1820 [I-D.spiegel-ippm-ioam-rawexport] 1821 Spiegel, M., Brockners, F., Bhandari, S., and R. 1822 Sivakolundu, "In-situ OAM raw data export with IPFIX", 1823 draft-spiegel-ippm-ioam-rawexport-02 (work in progress), 1824 July 2019. 1826 [RFC7276] Mizrahi, T., Sprecher, N., Bellagamba, E., and Y. 1827 Weingarten, "An Overview of Operations, Administration, 1828 and Maintenance (OAM) Tools", RFC 7276, 1829 DOI 10.17487/RFC7276, June 2014, 1830 . 
1832 [RFC7384] Mizrahi, T., "Security Requirements of Time Protocols in 1833 Packet Switched Networks", RFC 7384, DOI 10.17487/RFC7384, 1834 October 2014, . 1836 [RFC7665] Halpern, J., Ed. and C. Pignataro, Ed., "Service Function 1837 Chaining (SFC) Architecture", RFC 7665, 1838 DOI 10.17487/RFC7665, October 2015, 1839 . 1841 [RFC7799] Morton, A., "Active and Passive Metrics and Methods (with 1842 Hybrid Types In-Between)", RFC 7799, DOI 10.17487/RFC7799, 1843 May 2016, . 1845 [RFC7820] Mizrahi, T., "UDP Checksum Complement in the One-Way 1846 Active Measurement Protocol (OWAMP) and Two-Way Active 1847 Measurement Protocol (TWAMP)", RFC 7820, 1848 DOI 10.17487/RFC7820, March 2016, 1849 . 1851 [RFC7821] Mizrahi, T., "UDP Checksum Complement in the Network Time 1852 Protocol (NTP)", RFC 7821, DOI 10.17487/RFC7821, March 1853 2016, . 1855 [RFC8300] Quinn, P., Ed., Elzur, U., Ed., and C. Pignataro, Ed., 1856 "Network Service Header (NSH)", RFC 8300, 1857 DOI 10.17487/RFC8300, January 2018, 1858 . 1860 Authors' Addresses 1862 Frank Brockners 1863 Cisco Systems, Inc. 1864 Hansaallee 249, 3rd Floor 1865 DUESSELDORF, NORDRHEIN-WESTFALEN 40549 1866 Germany 1868 Email: fbrockne@cisco.com 1870 Shwetha Bhandari 1871 Cisco Systems, Inc. 1872 Cessna Business Park, Sarjapura Marathalli Outer Ring Road 1873 Bangalore, KARNATAKA 560 087 1874 India 1876 Email: shwethab@cisco.com 1878 Carlos Pignataro 1879 Cisco Systems, Inc. 1880 7200-11 Kit Creek Road 1881 Research Triangle Park, NC 27709 1882 United States 1884 Email: cpignata@cisco.com 1886 Hannes Gredler 1887 RtBrick Inc. 
1889 Email: hannes@rtbrick.com 1890 John Leddy 1891 United States 1893 Email: john@leddy.net 1895 Stephen Youell 1896 JP Morgan Chase 1897 25 Bank Street 1898 London E14 5JP 1899 United Kingdom 1901 Email: stephen.youell@jpmorgan.com 1903 Tal Mizrahi 1904 Huawei Network.IO Innovation Lab 1905 Israel 1907 Email: tal.mizrahi.phd@gmail.com 1909 David Mozes 1911 Email: mosesster@gmail.com 1913 Petr Lapukhov 1914 Facebook 1915 1 Hacker Way 1916 Menlo Park, CA 94025 1917 US 1919 Email: petr@fb.com 1921 Remy Chang 1922 Barefoot Networks 1923 4750 Patrick Henry Drive 1924 Santa Clara, CA 95054 1925 US 1927 Email: remy@barefootnetworks.com 1928 Daniel Bernier 1929 Bell Canada 1930 Canada 1932 Email: daniel.bernier@bell.ca 1934 Jennifer Lemon 1935 Broadcom 1936 270 Innovation Drive 1937 San Jose, CA 95134 1938 US 1940 Email: jennifer.lemon@broadcom.com