idnits 2.17.00 (12 Aug 2021)

/tmp/idnits5378/draft-mirsky-ippm-hybrid-two-step-05.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (April 14, 2020) is 767 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-17) exists of
     draft-ietf-ippm-ioam-data-09

  == Outdated reference: A later version (-07) exists of
     draft-ietf-ippm-ioam-direct-export-00

  == Outdated reference: A later version (-12) exists of
     draft-song-ippm-postcard-based-telemetry-07

     Summary: 0 errors (**), 0 flaws (~~), 4 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

IPPM Working Group                                             G. Mirsky
Internet-Draft                                                 ZTE Corp.
Intended status: Standards Track                            W. Lingqiang
Expires: October 16, 2020                                        G. Zhui
                                                         ZTE Corporation
                                                          April 14, 2020


             Hybrid Two-Step Performance Measurement Method
                  draft-mirsky-ippm-hybrid-two-step-05

Abstract

   Development of, and advancements in, automation of network operations
   have brought new requirements for measurement methodology.  Among
   them is the ability to collect instant network state as the packet is
   being processed by the networking elements along its path through the
   domain.  This document introduces a new hybrid measurement method,
   referred to as hybrid two-step, as it separates the act of measuring
   and/or calculating the performance metric from the act of collecting
   and transporting network state.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 16, 2020.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions used in this document
     2.1.  Terminology
     2.2.  Requirements Language
   3.  Problem Overview
   4.  Theory of Operation
     4.1.  Operation of the HTS Ingress Node
     4.2.  Operation of the HTS Transient Node
     4.3.  Operation of the HTS Egress Node
     4.4.  Considerations for HTS Timers
     4.5.  Deploying HTS in a Multicast Network
   5.  IANA Considerations
   6.  Security Considerations
   7.  Acknowledgments
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Authors' Addresses

1.  Introduction

   Successful resolution of challenges of automated network operation,
   as part of, for example, overall service orchestration or data center
   operation, relies on a timely collection of accurate information that
   reflects the state of network elements on an unprecedented scale.
   Because analyzing and acting upon the collected information requires
   considerable computing and storage resources, the network state
   information is unlikely to be processed by the network elements
   themselves but will be relayed into the data storage facilities,
   e.g., data lakes.  The process of producing and collecting network
   state information, also referred to in this document as network
   telemetry, and transporting it for post-processing should work
   equally well with data flows or with test packets injected into the
   network.  RFC 7799 [RFC7799] describes a combination of elements of
   passive and active measurement as a hybrid measurement.

   Several technical methods have been proposed to enable the collection
   of network state information instantaneous to the packet processing,
   among them [P4.INT] and [I-D.ietf-ippm-ioam-data].  Collecting
   telemetry information instantaneously, i.e., in the data packet
   itself, simplifies attributing the telemetry information to the
   particular monitored flow.  On the other hand, this collection method
   impacts the data packets, potentially changing their treatment by the
   networking nodes.  Also, the information the instantaneous method
   collects might be incomplete because of the limited space that can be
   allotted to it.  Other proposals defined methods to collect telemetry
   information in a separate packet from each node traversed by the
   monitored data flow.  Examples of this approach to collecting
   telemetry information are [I-D.ietf-ippm-ioam-direct-export] and
   [I-D.song-ippm-postcard-based-telemetry].  These methods allow data
   collection from any arbitrary path and avoid directly impacting data
   packets.  On the other hand, the correlation of data and the
   monitored flow requires that each packet with telemetry information
   also include characteristic information about the monitored flow.

   This document introduces Hybrid Two-Step (HTS) as a new hybrid
   measurement method that achieves better measurement accuracy by
   separating the act of measuring or calculating the performance metric
   from the act of collecting and transporting this information.  The
   Hybrid Two-Step method extends the two-step mode of Residence Time
   Measurement (RTM) defined in [RFC8169] to on-path network state
   collection and transport.  HTS allows the collection of telemetry
   information from any arbitrary path, does not change data packets of
   the monitored flow, and makes the attribution of telemetry to the
   data flow simple.

2.  Conventions used in this document

2.1.  Terminology

   RTM Residence Time Measurement

   ECMP Equal Cost Multipath

   MTU Maximum Transmission Unit

   HTS Hybrid Two-Step

   Network telemetry - the process of collecting and reporting of
   network state

2.2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Problem Overview

   Performance measurements are meant to provide data that characterize
   conditions experienced by traffic flows in the network and possibly
   trigger operational changes (e.g., re-route of flows, or changes in
   resource allocations).  Modifications to a network are determined
   based on the performance metric information available at the time
   that a change is to be made.  The correctness of this determination
   is based on the quality of the collected metrics data.
   The quality of collected measurement data is defined by:

   o  the resolution and accuracy of each measurement;

   o  predictability of both the time at which each measurement is made
      and the timeliness of measurement collection data delivery for
      use.

   Consider the case of delay measurement that relies on collecting the
   time of packet arrival at the ingress interface and the time of the
   packet transmission at the egress interface.  The method includes
   recording a local clock value on receiving the first octet of an
   affected message at the device ingress, and again recording the clock
   value on transmitting the first octet of the same message at the
   device egress.  In this ideal case, the difference between the two
   recorded clock times corresponds to the time that the message spent
   in traversing the device.  In practice, the recorded time can differ
   from the ideal case by a fixed amount, and a correction can be
   applied to compute the same time difference by taking into account
   the known fixed time associated with the actual measurement.  In this
   way, the resulting time difference reflects any variable delay
   associated with queuing.

   Depending on the implementation, it may be a challenge to compute the
   difference between message arrival and departure times and - on the
   fly - add the necessary residence time information to the same
   message.  That task may become even more challenging if the packet is
   encrypted.  Implementations SHOULD NOT record a message departure
   time that may be significantly inaccurate in the same message, as the
   result of estimating the departure time that includes the variable
   time component (such as that associated with buffering and queuing of
   the message).  A similar problem may cause a lower quality of, for
   example, information that characterizes utilization of the egress
   interface.
   If a node is unable to obtain the data consistently, without variable
   delays for additional processing, the information may not accurately
   reflect the state at the egress interface.  To mitigate this problem,
   [RFC8169] defined an RTM two-step mode.

   Another challenge associated with methods that collect network state
   information into the actual data packet is the risk of exceeding the
   Maximum Transmission Unit (MTU) size, especially if the packet
   traverses overlay domains or VPNs.  Since fragmentation is not
   available in the transport network, operators may have to reduce the
   MTU size advertised to the client layer or risk missing network state
   data for part, most probably the latter part, of the path.

4.  Theory of Operation

   The HTS method consists of two phases:

   o  performing a measurement or obtaining network state information,
      of one or more types, on a node;

   o  collecting and transporting the measurement.

   HTS uses an HTS Trigger carried in a data packet or a specially
   constructed test packet.  The nature of the HTS Trigger is transport
   network layer specific, and its description is outside the scope of
   this document.  The packet that includes the HTS Trigger is also
   referred to in this document as the trigger packet.

   The HTS method uses the HTS Follow-up packet, in this document also
   referred to as the follow-up packet, to collect measurement and
   network state data from the nodes.  The node that creates the HTS
   Trigger also generates the HTS Follow-up packet.  The follow-up
   packet contains characteristic information, copied from the trigger
   packet, sufficient for participating HTS nodes to associate it with
   the original packet.  The exact composition of the characteristic
   information is specific for each transport network, and its
   definition is outside the scope of this document.  The follow-up
   packet also uses the same encapsulation as the data packet.  If only
   network information, and not the payload, is used to load-balance
   flows over equal cost multipath (ECMP), use of network encapsulation
   identical to that of the trigger packet should guarantee that the
   follow-up packet remains in-band, i.e., traverses the same set of
   network elements, with the original data packet with the HTS Trigger.
   There MUST be no more than one outstanding follow-up packet on the
   node for the given path.  That means that if the node receives an HTS
   Trigger for the flow on which it still waits for the follow-up packet
   to the previous HTS Trigger, the node will originate the follow-up
   packet to transport the former set of the network state data and
   transmit it before it sends the follow-up packet with the latest
   collection of network state information.

4.1.  Operation of the HTS Ingress Node

   A node that originates the HTS Trigger is referred to as the HTS
   ingress node.  As stated, the ingress node originates the follow-up
   packet.  The follow-up packet has the transport network encapsulation
   identical to that of the trigger packet, followed by the HTS shim and
   one or more telemetry information elements encoded as
   Type-Length-Value (TLV).  Figure 1 displays an example of the
   follow-up packet format.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                      Transport Network                        ~
   |                        Encapsulation                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Ver|HTS Shim Len|    Flags     |        Sequence Number        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Telemetry Data Profile                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                     Telemetry Data TLVs                       ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                    Figure 1: Follow-up Packet Format

   Fields of the HTS shim are as follows:

      Version (Ver) is a two-bit field.  It specifies the version of the
      HTS shim format.  This document defines the format for the 0b00
      value of the field.

      HTS Shim Length is a six-bit field.  It defines the length of the
      HTS shim in bytes.  The minimum value of the field is four.

      Flags is an eight-bit field.  The format of the Flags field is
      displayed in Figure 2.

         Full (F) flag MUST be set to zero by the node originating the
         HTS follow-up packet and MUST be set to one by the node that
         does not add its telemetry data to avoid exceeding MTU size.

         The node originating the follow-up packet MUST zero the
         Reserved field and ignore it on receipt.

      Sequence Number is a 16-bit field.  The value of the field
      reflects the number of the HTS follow-up packet in the sequence of
      the HTS follow-up packets originated in response to the same HTS
      trigger.  The ingress node MUST set the value of the field to
      zero.

      Telemetry Data Profile is an optional variable-length field of
      single-bit flags.  Each flag indicates a requested type of
      telemetry data to be collected at each HTS node.  The increment of
      the field is four bytes with a minimum length of zero.
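
   As an illustration only, the following Python sketch packs and parses
   the HTS shim laid out in Figure 1.  The function and constant names
   are hypothetical and are not defined by this document.  The sketch
   also assumes that HTS Shim Length counts the Telemetry Data Profile
   together with the fixed four bytes, and that the Full flag, shown as
   bit 0 of the Flags field, is the most significant bit of that octet.

```python
import struct

HTS_VERSION = 0b00      # the only shim version this document defines
FLAG_FULL = 0x80        # assumption: F (bit 0 of Flags) is the top bit
FIXED_SHIM_LEN = 4      # Ver + HTS Shim Length + Flags + Sequence Number


def pack_hts_shim(sequence_number, full=False, profile=b""):
    """Return the HTS shim bytes (hypothetical helper, not part of HTS)."""
    if len(profile) % 4:
        raise ValueError("Telemetry Data Profile grows in four-byte steps")
    # Assumption: the shim length covers the optional profile as well.
    shim_len = FIXED_SHIM_LEN + len(profile)
    if shim_len > 0x3F:
        raise ValueError("HTS Shim Length is only six bits wide")
    if not 0 <= sequence_number <= 0xFFFF:
        raise ValueError("Sequence Number is only 16 bits wide")
    first_octet = (HTS_VERSION << 6) | shim_len
    flags = FLAG_FULL if full else 0  # Reserved bits stay zero
    return struct.pack("!BBH", first_octet, flags, sequence_number) + profile


def unpack_hts_shim(packet):
    """Split bytes that start at the shim into shim fields and TLV bytes."""
    if len(packet) < FIXED_SHIM_LEN:
        raise ValueError("shorter than the minimal HTS shim")
    first_octet, flags, sequence_number = struct.unpack_from("!BBH", packet)
    shim_len = first_octet & 0x3F
    if shim_len < FIXED_SHIM_LEN or shim_len % 4 or shim_len > len(packet):
        raise ValueError("invalid HTS Shim Length")
    return {
        "version": first_octet >> 6,
        "shim_len": shim_len,
        "full": bool(flags & FLAG_FULL),  # Reserved bits are ignored
        "sequence_number": sequence_number,
        "profile": packet[FIXED_SHIM_LEN:shim_len],
        "tlvs": packet[shim_len:],
    }
```

   Under these assumptions, pack_hts_shim(0) yields the four bytes
   04 00 00 00, the minimal shim an ingress node could send when it
   carries no Telemetry Data Profile.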

    0
    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |F|  Reserved   |
   +-+-+-+-+-+-+-+-+

                      Figure 2: Flags Field Format

4.2.  Operation of the HTS Transient Node

   Upon receiving the trigger packet the HTS transient node MUST:

   o  copy the transport information;

   o  start the HTS Follow-up Timer for the obtained flow.

   Upon receiving the follow-up packet the HTS transient node MUST:

   o  verify that the matching transport information exists and the Full
      flag is cleared, then stop the associated HTS Follow-up timer;

   o  collect telemetry data requested in the Telemetry Data Profile
      field or defined by the local HTS policy;

   o  if adding the collected telemetry would not exceed MTU, then
      append data into Telemetry Data TLVs field and transmit the
      follow-up packet;

   o  otherwise, set the value of the Full flag to one and transmit the
      received follow-up packet;

   o  originate the new follow-up packet using the same transport
      information.  The value of the Sequence Number field in the HTS
      shim MUST be set to the value of the field in the received
      follow-up packet incremented by one.  Copy collected telemetry
      data and transmit the packet.

   If the follow-up timer expires the transient node MUST:

   o  originate the follow-up packet using transport information
      associated with the expired timer;

   o  initialize the HTS shim by setting Version field to 0b00 and
      Sequence Number field to 0.  Values of HTS Shim Length and
      Telemetry Data Profile fields MAY be set according to the local
      policy;

   o  copy telemetry information into Telemetry Data TLVs field and
      transmit the packet.

4.3.  Operation of the HTS Egress Node

   Upon receiving the trigger packet the HTS egress node MUST:

   o  copy the transport information;

   o  start the HTS Collection timer for the obtained flow.

   When the egress node receives the follow-up packet for the known
   flow, i.e., the flow for which the Collection timer is running, the
   node MUST:

   o  copy telemetry information;

   o  restart the corresponding Collection timer.

   When the Collection timer expires, the egress node relays the
   collected telemetry information for processing and analysis to a
   local or remote agent.

4.4.  Considerations for HTS Timers

   This specification defines two timers - HTS Follow-up and HTS
   Collection.  Because there MUST be no more than one HTS Trigger for a
   particular flow, the values of the HTS timers are bounded by the rate
   of trigger generation for that flow.

4.5.  Deploying HTS in a Multicast Network

   Previous sections discussed the operation of HTS in a unicast
   network.  Multicast services are important, and the ability to
   collect telemetry information is an invaluable component in
   delivering a high quality of experience.  While the replication of
   data packets is necessary, replication of HTS follow-up packets is
   not.  Replication of multicast data packets down a multicast tree may
   be set based on multicast routing information or explicit information
   included in the special header, as, for example, in Bit-Indexed
   Explicit Replication [RFC8296].  A replicating node processes the HTS
   packet as defined below:

   o  the first transmitted multicast packet MUST be followed by the
      received corresponding HTS packet as described in Section 4.2;

   o  each consecutively transmitted copy of the original multicast
      packet MUST be followed by the new HTS packet originated by the
      replicating node, acting as a transient HTS node whose Follow-up
      timer has expired.

   As a result, there are no duplicate copies of Telemetry Data TLV for
   the same pair of ingress and egress interfaces.
   At the same time, all ingress/egress pairs traversed by the given
   multicast packet are reflected in their respective Telemetry Data
   TLV.  Consequently, a centralized controller would be able to
   reconstruct and analyze the state of the particular multicast
   distribution tree based on HTS packets collected from egress nodes.

5.  IANA Considerations

   TBD

6.  Security Considerations

   Nodes that practice the HTS method are presumed to share a trust
   model that depends on the existence of a trusted relationship among
   nodes.  This is necessary as these nodes are expected to correctly
   modify the specific content of the data in the follow-up packet, and
   the degree to which HTS measurement is useful for network operation
   depends on this ability.  In practice, this means that neither
   confidentiality nor integrity protection can cover those portions of
   messages that contain the network state data.  Though there are
   methods that make it possible in theory to provide either or both
   such protections and still allow for intermediate nodes to make
   detectable yet authenticated modifications, such methods do not seem
   practical at present, particularly for protocols that are used to
   measure latency and/or jitter.

   The ability to potentially authenticate and/or encrypt the network
   state data for scenarios both with and without the participation of
   intermediate nodes that participate in HTS measurement is left for
   further study.

   While it is possible for a supposedly compromised node to intercept
   and modify the network state information in the follow-up packet,
   this is an issue that exists for nodes in general - for all data that
   is to be carried over the particular networking technology - and is
   therefore the basis for an additional presumed trust model associated
   with an existing network.

7.  Acknowledgments

   The authors express their gratitude and appreciation to Joel Halpern
   for the most helpful and insightful discussion on the applicability
   of HTS in a Service Function Chaining domain.

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

8.2.  Informative References

   [I-D.ietf-ippm-ioam-data]
              Brockners, F., Bhandari, S., Pignataro, C., Gredler, H.,
              Leddy, J., Youell, S., Mizrahi, T., Mozes, D., Lapukhov,
              P., remy@barefootnetworks.com, r., daniel.bernier@bell.ca,
              d., and J. Lemon, "Data Fields for In-situ OAM",
              draft-ietf-ippm-ioam-data-09 (work in progress), March
              2020.

   [I-D.ietf-ippm-ioam-direct-export]
              Song, H., Gafni, B., Zhou, T., Li, Z., Brockners, F.,
              Bhandari, S., Sivakolundu, R., and T. Mizrahi, "In-situ
              OAM Direct Exporting",
              draft-ietf-ippm-ioam-direct-export-00 (work in progress),
              February 2020.

   [I-D.song-ippm-postcard-based-telemetry]
              Song, H., Zhou, T., Li, Z., Shin, J., and K. Lee,
              "Postcard-based On-Path Flow Data Telemetry",
              draft-song-ippm-postcard-based-telemetry-07 (work in
              progress), April 2020.

   [P4.INT]   "In-band Network Telemetry (INT)", P4.org Specification,
              October 2017.

   [RFC7799]  Morton, A., "Active and Passive Metrics and Methods (with
              Hybrid Types In-Between)", RFC 7799, DOI 10.17487/RFC7799,
              May 2016.

   [RFC8169]  Mirsky, G., Ruffini, S., Gray, E., Drake, J., Bryant, S.,
              and A. Vainshtein, "Residence Time Measurement in MPLS
              Networks", RFC 8169, DOI 10.17487/RFC8169, May 2017.

   [RFC8296]  Wijnands, IJ., Ed., Rosen, E., Ed., Dolganow, A.,
              Tantsura, J., Aldrin, S., and I.
              Meilik, "Encapsulation for Bit Index Explicit Replication
              (BIER) in MPLS and Non-MPLS Networks", RFC 8296,
              DOI 10.17487/RFC8296, January 2018.

Authors' Addresses

   Greg Mirsky
   ZTE Corp.

   Email: gregimirsky@gmail.com


   Wang Lingqiang
   ZTE Corporation
   No 19, East Huayuan Road
   Beijing  100191
   P.R.China

   Phone: +86 10 82963945
   Email: wang.lingqiang@zte.com.cn


   Guo Zhui
   ZTE Corporation
   No 19, East Huayuan Road
   Beijing  100191
   P.R.China

   Phone: +86 10 82963945
   Email: guo.zhui@zte.com.cn