idnits 2.17.00 (12 Aug 2021)

/tmp/idnits44368/draft-fioccola-rfc8889bis-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([RFC8889]), which it
     shouldn't.  Please replace those with straight textual mentions of the
     documents in question.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document date (February 23, 2022) is 87 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-04) exists of
     draft-fioccola-rfc8321bis-03

  ** Downref: Normative reference to an Informational RFC: RFC 5474

  == Outdated reference: A later version (-17) exists of
     draft-song-opsawg-ifit-framework-16

  == Outdated reference: A later version (-09) exists of
     draft-zhou-ippm-enhanced-alternate-marking-08

     Summary: 2 errors (**), 0 flaws (~~), 3 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    Network Working Group                                   G. Fioccola, Ed.
3    Internet-Draft                                       Huawei Technologies
4    Obsoletes: 8889 (if approved)                                M. Cociglio
5    Intended status: Standards Track                          Telecom Italia
6    Expires: August 27, 2022                                        A. Sapio
7                                                           Intel Corporation
8                                                                    R. Sisto
9                                                       Politecnico di Torino
10                                                                    T. Zhou
11                                                        Huawei Technologies
12                                                          February 23, 2022

14                     Multipoint Alternate-Marking Method
15                         draft-fioccola-rfc8889bis-03

17   Abstract

19      This document generalizes and expands the Alternate-Marking methodology
20      to measure any kind of unicast flow whose packets can follow several
21      different paths in the network -- in wider terms, a multipoint-to-
22      multipoint network.  For this reason, the technique described here is
23      called "Multipoint Alternate Marking".  This document obsoletes
24      [RFC8889].

26   Status of This Memo

28      This Internet-Draft is submitted in full conformance with the
29      provisions of BCP 78 and BCP 79.

31      Internet-Drafts are working documents of the Internet Engineering
32      Task Force (IETF).  Note that other groups may also distribute
33      working documents as Internet-Drafts.  The list of current Internet-
34      Drafts is at https://datatracker.ietf.org/drafts/current/.

36      Internet-Drafts are draft documents valid for a maximum of six months
37      and may be updated, replaced, or obsoleted by other documents at any
38      time.  It is inappropriate to use Internet-Drafts as reference
39      material or to cite them other than as "work in progress."

41      This Internet-Draft will expire on August 27, 2022.

43   Copyright Notice

45      Copyright (c) 2022 IETF Trust and the persons identified as the
46      document authors.  All rights reserved.

48      This document is subject to BCP 78 and the IETF Trust's Legal
49      Provisions Relating to IETF Documents
50      (https://trustee.ietf.org/license-info) in effect on the date of
51      publication of this document.  Please review these documents
52      carefully, as they describe your rights and restrictions with respect
53      to this document.  Code Components extracted from this document must
54      include Simplified BSD License text as described in Section 4.e of
55      the Trust Legal Provisions and are provided without warranty as
56      described in the Simplified BSD License.

58   Table of Contents

60      1.  Introduction
61        1.1.  Requirements Language
62      2.  Terminology
63        2.1.  Correlation with RFC 5644
64      3.  Flow Classification
65      4.  Extension of the Method to Multipoint Flows
66        4.1.  Monitoring Network
67        4.2.  Network Packet Loss
68      5.  Network Clustering
69        5.1.  Algorithm for Clusters Partition
70      6.  Multipoint Packet Loss Measurement
71      7.  Multipoint Delay and Delay Variation
72        7.1.  Delay Measurements on a Multipoint-Paths Basis
73          7.1.1.  Single-Marking Measurement
74        7.2.  Delay Measurements on a Single-Packet Basis
75          7.2.1.  Single- and Double-Marking Measurement
76          7.2.2.  Hashing Selection Method
77      8.  Synchronization and Timing
78      9.  Results of the Multipoint Alternate Marking Experiment
79      10. A Closed-Loop Performance-Management Approach
80      11. Security Considerations
81      12. IANA Considerations
82      13. Contributors
83      14. Acknowledgements
84      15. References
85        15.1.  Normative References
86        15.2.  Informative References
87      Appendix A.  Changes Log
88      Authors' Addresses

90   1.  Introduction

92      The Alternate-Marking method, as described in
93      [I-D.fioccola-rfc8321bis], is applicable to a point-to-point path.
94      The extension proposed in this document applies to the most general
95      case of a multipoint-to-multipoint path and enables flexible and
96      adaptive performance measurements in a managed network.

98      The Alternate-Marking methodology described in
99      [I-D.fioccola-rfc8321bis] allows the synchronization of the
100     measurements in different points by dividing the packet flow into
101     batches.  It is therefore possible to get coherent counters and show
102     what is happening in every marking period for each monitored flow.  The
103     monitoring parameters are the packet counter and timestamps of a flow
104     for each marking period.  Note that additional details about the
105     applicability of the Alternate-Marking methodology are described in
106     [I-D.fioccola-rfc8321bis] while implementation details can be found
107     in the paper "AM-PM: Efficient Network Telemetry using Alternate
108     Marking" [IEEE-Network-PNPM].

110     In some applications of the Alternate-Marking method, the number of
111     monitored flows and nodes is large.  Multipoint Alternate
112     Marking aims to reduce these numbers and make performance
113     monitoring more flexible when a detailed analysis is not needed.
114     For instance, by considering n measurement points and m monitored
115     flows, the order of magnitude of the packet counters for each time
116     interval is n*m*2 (one per color).  The number of measurement points
117     and monitored flows may vary and depends on the portion of the
118     network we are monitoring (core network, metro network, access
119     network) and the granularity (for each service, each customer).  So
120     if both n and m are large, the number of packet counters grows
121     considerably, and Multipoint Alternate Marking offers a tool to control
122     these parameters.

124     The approach presented in this document is applied only to unicast
125     flows and not to multicast.  Broadcast, Unknown Unicast, and
126     Multicast (BUM) traffic is not considered here, because traffic
127     replication is not covered by the Multipoint Alternate-Marking
128     method.  The approach is also applicable to anycast flows, and
129     Equal-Cost Multipath (ECMP) paths can also be easily monitored with
130     this technique.

132     [I-D.fioccola-rfc8321bis] applies to point-to-point unicast flows and
133     BUM traffic.  For BUM traffic, the basic method of
134     [I-D.fioccola-rfc8321bis] can easily be applied link by link and
135     therefore split the multicast flow tree distribution into separate
136     unicast point-to-point links.  This document and its Clustered
137     Alternate-Marking method are instead valid for multipoint-to-multipoint
138     unicast flows, anycast, and ECMP flows.

140     Therefore, the Alternate-Marking method can be extended to any kind
141     of multipoint-to-multipoint path, and the network-clustering
142     approach presented in this document is the formalization of how to
143     implement this property and allow a flexible and optimized
144     performance measurement support for network management in every
145     situation.

147     Without network clustering, it is possible to apply Alternate Marking
148     only to the entire network or to a single flow.  Instead, with network
149     clustering, it is possible to use the partition of the network into
150     clusters at different levels in order to obtain the needed degree of
151     detail.  In some circumstances, it is possible to monitor a
152     multipoint network by analyzing the network clustering, without
153     examining it in depth.  In case of problems (packet loss is measured or
154     the delay is too high), the filtering criteria could be made more
155     specific in order to perform a detailed analysis by using a different
156     combination of clusters up to a per-flow measurement as described in
157     [I-D.fioccola-rfc8321bis].

159     This approach fits very well with the Closed-Loop Network and
160     Software-Defined Network (SDN) paradigm, where the SDN orchestrator
161     and the SDN controllers are the brains of the network and can manage
162     flow control to the switches and routers and, in the same way, can
163     calibrate the performance measurements depending on the desired
164     accuracy.  An SDN controller application can orchestrate how
165     accurately the network performance monitoring is set up by applying
166     the Multipoint Alternate Marking as described in this document.

168     It is important to underline that, as an extension of
169     [I-D.fioccola-rfc8321bis], this is a methodology document, so the
170     mechanism that can be used to transmit the counters and the
171     timestamps is out of scope here, and the implementation is open.
172     Several options are possible -- e.g., see "Enhanced Alternate Marking
173     Method" [I-D.zhou-ippm-enhanced-alternate-marking].

175     This document assumes that the blocks are created according to a
176     fixed timer as per [I-D.fioccola-rfc8321bis].  Switching after a
177     fixed number of packets is an additional possibility, but it is out of
178     scope here.

180     Note that the fragmented packets case can be managed with the
181     Alternate-Marking methodology.  The same considerations of
182     [I-D.fioccola-rfc8321bis] apply also in the case of Multipoint
183     Alternate Marking.  As defined in [I-D.fioccola-rfc8321bis], the
184     marking node MUST mark all the fragments, except in the case of
185     fragmentation within the network domain; in that event, it is
186     suggested to mark only the first fragment.

188  1.1.  Requirements Language

190     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
191     "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
192     "OPTIONAL" in this document are to be interpreted as described in BCP
193     14 [RFC2119] [RFC8174] when, and only when, they appear in all
194     capitals, as shown here.

196  2.  Terminology

198     The definitions of the basic terms are identical to those found in
199     Alternate Marking [I-D.fioccola-rfc8321bis].  Recall
200     that [I-D.fioccola-rfc8321bis] is valid for point-to-point unicast
201     flows and BUM traffic.

203     The important new terms that need to be explained are listed below:

205     Multipoint Alternate Marking:  Extension to
206        [I-D.fioccola-rfc8321bis], valid for multipoint-to-multipoint
207        unicast flows, anycast, and ECMP flows.  It can also be referred
208        to as Clustered Alternate Marking.

210     Flow definition:  The concept of flow is generalized in this
211        document.  The identification fields are selected without any
212        constraints and, in general, the flow can be a multipoint-to-
213        multipoint flow, as a result of aggregate point-to-point flows.

215     Monitoring Network:  Identified with the nodes of the network that
216        are the measurement points (MPs) and the links that are the
217        connections between MPs.  The monitoring network graph depends on
218        the flow definition, so it can represent a specific flow or the
219        entire network topology as the aggregate of all the flows.

221     Cluster:  Smallest identifiable subnetwork of the entire monitoring
222        network graph that still satisfies the condition that the number
223        of packets that go in is the same as the number that go out.

225     Multipoint metrics:  Packet loss, delay and delay variation are
226        extended to the case of multipoint flows.  It is possible to
227        compute these metrics on the basis of multipoint paths in order to
228        associate the measurements to a cluster, a combination of
229        clusters, or the entire monitored network.  For delay and delay
230        variation, it is also possible to define the metrics on a single-
231        packet basis, which means that the multipoint path is used to
232        easily couple packets between input and output nodes of a
233        multipoint path.

235     The next section highlights the correlation with the terms used in
236     RFC 5644 [RFC5644].

238  2.1.  Correlation with RFC 5644

240     RFC 5644 [RFC5644] is limited to active measurements using a single
241     source packet or stream.  Its scope is also limited to observations
242     of corresponding packets along the path (spatial metric) and at one
243     or more destinations (one-to-group) along the path.

245     Instead, the scope of this memo is to define multiparty metrics for
246     passive and hybrid measurements in a group-to-group topology with
247     multiple sources and destinations.

249     RFC 5644 [RFC5644] introduces metric names that can be reused here
250     but have to be extended and rephrased to be applied to the Alternate-
251     Marking schema:

253     a.  the multiparty metrics are not only one-to-group metrics but can
254         also be group-to-group metrics;

256     b.  the spatial metrics, used for measuring the performance of
257         segments of a source to destination path, are applied here to
258         group-to-group segments (called clusters).

260  3.  Flow Classification

262     A unicast flow is identified by all the packets having a set of
263     common characteristics.  This definition is inspired by RFC 7011
264     [RFC7011].

266     As an example, by considering a flow as all the packets sharing the
267     same source IP address or the same destination IP address, it is easy
268     to understand that the resulting pattern will not be a point-to-point
269     connection, but a point-to-multipoint or multipoint-to-point
270     connection.

272     In general, a flow can be defined by a set of selection rules used to
273     match a subset of the packets processed by the network device.  These
274     rules specify a set of Layer 3 and Layer 4 header fields
275     (identification fields) and the relative values that must be found in
276     matching packets.

278     The choice of the identification fields directly affects the type of
279     paths that the flow would follow in the network.  In fact, it is
280     possible to relate a set of identification fields with the pattern of
281     the resulting graphs, as listed in Figure 1.

283     A TCP 5-tuple usually identifies flows following either a single path
284     or a point-to-point multipath (in the case of load balancing).  On
285     the contrary, a single source address selects aggregate flows
286     following a point-to-multipoint pattern, while a multipoint-to-point
287     pattern can result from matching on a single destination address.  In
288     the case where a selection rule and its reverse are used for
289     bidirectional measurements, they can correspond to a point-to-
290     multipoint in one direction and a multipoint-to-point in the opposite
291     direction.

293     So the flows to be monitored are selected at the monitoring points
294     using packet selection rules, which can also change the pattern of
295     the monitored network.

297     Note that, more generally, the flow can be defined at different
298     levels based on the potential encapsulation, and additional
299     conditions that are not in the packet header can also be included as
300     part of matching criteria.

302     The Alternate-Marking method is applicable only to a single path (and
303     partially to a one-to-one multipath), so the extension proposed in
304     this document is suitable also for the most general case of
305     multipoint-to-multipoint, which embraces all the other patterns of
306     Figure 1.

308         point-to-point single path
309             +------+      +------+      +------+
310         ---<>  R1  <>----<>  R2  <>----<>  R3  <>---
311             +------+      +------+      +------+

313         point-to-point multipath
314                          +------+
315                         <>  R2  <>
316                        / +------+ \
317                       /            \
318             +------+ /              \ +------+
319         ---<>  R1  <>                <>  R4  <>---
320             +------+ \              / +------+
321                       \            /
322                        \ +------+ /
323                         <>  R3  <>
324                          +------+

326         point-to-multipoint
327                                     +------+

329                                    <>  R4  <>---
330                                   / +------+
331                         +------+ /
332                        <>  R2  <>
333                       / +------+ \
334             +------+ /            \ +------+
335         ---<>  R1  <>              <>  R5  <>---
336             +------+ \              +------+
337                       \ +------+
338                        <>  R3  <>
339                         +------+ \
340                                   \ +------+
341                                    <>  R6  <>---
342                                     +------+

344         multipoint-to-point
345             +------+
346         ---<>  R1  <>
347             +------+ \
348                       \ +------+
349                        <>  R4  <>
350                       / +------+ \
351             +------+ /            \ +------+
352         ---<>  R2  <>              <>  R6  <>---
353             +------+              / +------+
354                         +------+ /
355                        <>  R5  <>
356                       / +------+
357             +------+ /
358         ---<>  R3  <>
359             +------+

361         multipoint-to-multipoint
362             +------+                +------+
363         ---<>  R1  <>              <>  R6  <>---
364             +------+ \            / +------+
365                       \ +------+ /
366                        <>  R4  <>
367                         +------+ \
368             +------+              \ +------+
369         ---<>  R2  <>              <>  R7  <>---
370             +------+ \            / +------+
371                       \ +------+ /
372                        <>  R5  <>
373                       / +------+ \
374             +------+ /            \ +------+

376         ---<>  R3  <>              <>  R8  <>---
377             +------+                +------+

379                       Figure 1: Flow Classification

381     The case of unicast flow is considered in Figure 1.  The anycast flow
382     is also in scope, because there is no replication and only a single
383     node from the anycast group receives the traffic, so it can be viewed
384     as a special case of unicast flow.  Furthermore, an ECMP flow is in
385     scope by definition, since it is a point-to-multipoint unicast flow.

387  4.  Extension of the Method to Multipoint Flows

389     By using the Alternate-Marking method, only point-to-point paths can
390     be monitored.  To have an IP (TCP/UDP) flow that follows a point-to-
391     point path, we have to assign a specific value to each of the 5
392     identification fields (IP Source, IP Destination, Transport Protocol,
393     Source Port, Destination Port).

395     Multipoint Alternate Marking enables the performance measurement for
396     multipoint flows selected by identification fields without any
397     constraints (even the entire network production traffic).  It is also
398     possible to use multiple marking points for the same monitored flow.

400  4.1.  Monitoring Network

402     The monitoring network is deduced from the production network by
403     identifying the nodes of the graph that are the measurement points,
404     and the links that are the connections between measurement points.

406     There are some techniques that can help with the building of the
407     monitoring network (as an example, see [I-D.ietf-ippm-route]).  In
408     general, there are different options: the monitoring network can be
409     obtained by considering all the possible paths for the traffic or
410     periodically checking the traffic (e.g. daily, weekly, monthly) and
411     updating the graph as appropriate, but this is up to the Network
412     Management System (NMS) configuration.

414     So a graph model of the monitoring network can be built according to
415     the Alternate-Marking method: the monitored interfaces and links are
416     identified.  Only the measurement points and links where the traffic
417     has flowed have to be represented in the graph.

419     Figure 2 shows a simple example of a monitoring network graph:

421                                               +------+
422                                              <>  R6  <>---
423                                             / +------+
424                      +------+     +------+ /
425                     <>  R2  <>---<>  R4  <>
426                    / +------+ \   +------+ \
427                   /            \            \ +------+
428         +------+ /   +------+   \ +------+   <>  R7  <>---
429     ---<>  R1  <>---<>  R3  <>---<>  R5  <>   +------+
430         +------+ \   +------+ \   +------+ \
431                   \            \            \ +------+
432                    \            \            <>  R8  <>---
433                     \            \            +------+
434                      \            \
435                       \            \ +------+
436                        \            <>  R9  <>---
437                         \            +------+
438                          \
439                           \ +------+
440                            <>  R10 <>---
441                             +------+

443                    Figure 2: Monitoring Network Graph

445     Each monitoring point is characterized by the packet counter that
446     refers only to a marking period of the monitored flow.  Also, it is
447     assumed that there is a monitoring point at every possible egress
448     point of the multipoint monitored network.

450     The same is also applicable for the delay, but it will be described
451     in the following sections.

453     The rest of the document assumes that the traffic is going from left
454     to right in order to simplify the explanation.  However, the analysis
455     done for one direction applies equally to all directions.

457  4.2.  Network Packet Loss

459     Since all the packets of the considered flow leaving the network have
460     previously entered the network, the number of packets counted by all
461     the input nodes is always greater than, or equal to, the number of
462     packets counted by all the output nodes.  Noninitial fragments are
463     not considered here.

465     The assumption is the use of the Alternate-Marking method.  In the
466     case of no packet loss occurring in the marking period, if all the
467     input and output points of the network domain to be monitored are
468     measurement points, the sum of the number of packets on all the
469     ingress interfaces equals the number on egress interfaces for the
470     monitored flow.  In this circumstance, if no packet loss occurs, the
471     intermediate measurement points only have the task of splitting the
472     measurement.

474     It is possible to define the Network Packet Loss of one monitored
475     flow for a single period.  In a packet network, the number of lost
476     packets is the number of packets counted by the input nodes minus the
477     number of packets counted by the output nodes.  This is true for
478     every packet flow in each marking period.

480     The monitored network packet loss with n input nodes and m output
481     nodes is given by:

483     PL = (PI1 + PI2 +...+ PIn) - (PO1 + PO2 +...+ POm)

485     where:

487     PL is the network packet loss (number of lost packets)

489     PIi is the number of packets that flowed through the i-th input node
490     in this period

492     POj is the number of packets that flowed through the j-th output node
493     in this period

495     The equation is applied on a per-time-interval basis and a per-flow
496     basis:

498        The reference interval is the Alternate-Marking period, as defined
499        in [I-D.fioccola-rfc8321bis].

501        The flow definition is generalized here.  Indeed, as described
502        before, a multipoint packet flow is considered, and the
503        identification fields can be selected without any constraints.

505  5.  Network Clustering

507     The previous equation of Section 4.2 can determine the number of
508     packets lost globally in the monitored network, exploiting only the
509     data provided by the counters in the input and output nodes.

511     In addition, it is also possible to leverage the data provided by the
512     other counters in the network to converge on the smallest
513     identifiable subnetworks where the losses occur.  These subnetworks
514     are named "clusters".

516     A cluster graph is a subnetwork of the entire monitoring network
517     graph that still satisfies the packet loss equation (introduced in
518     the previous section), where PL in this case is the number of packets
519     lost in the cluster.  As for the entire monitoring network graph, the
520     cluster is defined on a per-flow basis.
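
   As a minimal sketch (not part of the method's specification), the
   packet loss equation of Section 4.2 can be evaluated in the same way
   for the whole monitored network or for a single cluster.  The
   function name and all counter values below are illustrative
   assumptions; only the formula itself comes from the text.

```python
def packet_loss(input_counts, output_counts):
    """PL = (PI1 + ... + PIn) - (PO1 + ... + POm).

    Both arguments hold the per-node packet counters of ONE flow
    for ONE marking period.
    """
    return sum(input_counts) - sum(output_counts)


# Hypothetical counters for one marking period.
# Whole monitored network: 2 input nodes, 2 output nodes.
network_pl = packet_loss([1000, 800], [990, 805])

# A single cluster: 1 input node, 2 output nodes.
cluster_pl = packet_loss([1000], [600, 395])

print(network_pl, cluster_pl)
```

   A result of zero means that no loss was observed for that flow in
   that period; a positive result is the number of packets lost inside
   the network or cluster whose counters were passed in.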

522     For this reason, a cluster should contain all the arcs emanating from
523     its input nodes and all the arcs terminating at its output nodes.
524     This ensures that we can count all the packets (and only those)
525     exiting an input node again at the output node, whatever path they
526     follow.

528     In a completely monitored unidirectional network (a network where
529     every network interface is monitored), each network device
530     corresponds to a cluster, and each physical link corresponds to two
531     clusters (one for each device).

533     Clusters can have different sizes depending on the flow-filtering
534     criteria adopted.

536     Moreover, sometimes clusters can be optionally simplified.  For
537     example, when two monitored interfaces are divided by a single router
538     (one is the input interface, the other is the output interface, and
539     the router has only these two interfaces), instead of counting
540     exactly twice, upon entering and leaving, it is possible to consider
541     a single measurement point.  In this case, we do not care about the
542     internal packet loss of the router.

544     It is worth highlighting that it might also be convenient to define
545     clusters based on the topological information so that they are
546     applicable to all the possible flows in the monitored network.

548  5.1.  Algorithm for Clusters Partition

550     A simple algorithm can be applied in order to split our monitoring
551     network into clusters.  This can be done for each direction
552     separately.  The clusters partition is based on the monitoring
553     network graph, which can be valid for a specific flow or can also be
554     general and valid for the entire network topology.

556     It is a two-step algorithm:

558     o  Group the links that have the same starting node;

560     o  Join the grouped links with at least one ending node in common.

562     Considering that the links are unidirectional, the first step implies
563     listing all the links as connections between two nodes and grouping
564     the different links if they have the same starting node.  Note that
565     it is possible to start from any link, and the procedure will work.
566     Following this classification, the second step implies possibly
567     joining the groups classified in the first step by looking at the
568     ending nodes.  If different groups have at least one common ending
569     node, they are put together and belong to the same set.  After the
570     application of the two steps of the algorithm, each one of the
571     composed sets of links, together with the endpoint nodes, constitutes
572     a cluster.

574     In our monitoring network graph example, it is possible to identify
575     the clusters partition by applying this two-step algorithm.

577     The first step identifies the following groups:

579     1.  Group 1: (R1-R2), (R1-R3), (R1-R10)

581     2.  Group 2: (R2-R4), (R2-R5)

583     3.  Group 3: (R3-R5), (R3-R9)

585     4.  Group 4: (R4-R6), (R4-R7)

587     5.  Group 5: (R5-R8)

589     And then, the second step builds the clusters partition (in
590     particular, we can underline that Groups 2 and 3 connect together,
591     since R5 is in common):

593     1.  Cluster 1: (R1-R2), (R1-R3), (R1-R10)

595     2.  Cluster 2: (R2-R4), (R2-R5), (R3-R5), (R3-R9)

597     3.  Cluster 3: (R4-R6), (R4-R7)

599     4.  Cluster 4: (R5-R8)

601     The flow direction here considered is from left to right.  For the
602     opposite direction, the same reasoning can be applied, and in this
603     example, you get the same clusters partition.
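
   The two steps above can be sketched in a few lines of Python.  This
   is only an illustration under stated assumptions: links are modeled
   as (start, end) pairs, the function name is hypothetical, and the
   link list is the one read from the groups listed above.  It is not
   the implementation analyzed in [IEEE-ACM-ToN-MPNPM].

```python
from collections import defaultdict


def partition_into_clusters(links):
    """Split unidirectional links (start, end) into clusters."""
    # Step 1: group the links that have the same starting node.
    groups = defaultdict(list)
    for start, end in links:
        groups[start].append((start, end))

    # Step 2: join the groups that share at least one ending node.
    clusters = []  # list of (ending nodes, links) pairs
    for group in groups.values():
        ends = {end for _, end in group}
        merged_links = list(group)
        remaining = []
        for c_ends, c_links in clusters:
            if c_ends & ends:  # common ending node: join the two sets
                ends |= c_ends
                merged_links = c_links + merged_links
            else:
                remaining.append((c_ends, c_links))
        remaining.append((ends, merged_links))
        clusters = remaining
    return [c_links for _, c_links in clusters]


# Links of the example monitoring network, left-to-right direction.
links = [
    ("R1", "R2"), ("R1", "R3"), ("R1", "R10"),
    ("R2", "R4"), ("R2", "R5"),
    ("R3", "R5"), ("R3", "R9"),
    ("R4", "R6"), ("R4", "R7"),
    ("R5", "R8"),
]
clusters = partition_into_clusters(links)
for number, cluster in enumerate(clusters, start=1):
    print(f"Cluster {number}:", cluster)
```

   For this input the sketch yields four sets of links, with the R2
   and R3 groups joined because both reach R5, which matches the
   partition listed above.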

605     In the end, the following 4 clusters are obtained:

607         Cluster 1
608                          +------+
609                         <>  R2  <>---
610                        / +------+

612                       /
613             +------+ /   +------+
614         ---<>  R1  <>---<>  R3  <>---
615             +------+ \   +------+
616                       \
617                        \
618                         \
619                          \
620                           \
621                            \
622                             \
623                              \
624                               \ +------+
625                                <>  R10 <>---
626                                 +------+

628         Cluster 2
629             +------+     +------+
630         ---<>  R2  <>---<>  R4  <>---
631             +------+ \   +------+
632                       \
633             +------+   \ +------+
634         ---<>  R3  <>---<>  R5  <>---
635             +------+ \   +------+
636                       \
637                        \
638                         \
639                          \
640                           \ +------+
641                            <>  R9  <>---
642                             +------+

644         Cluster 3
645                         +------+
646                        <>  R6  <>---
647                       / +------+
648             +------+ /
649         ---<>  R4  <>
650             +------+ \
651                       \ +------+
652                        <>  R7  <>---
653                         +------+

655         Cluster 4
656             +------+

658         ---<>  R5  <>
659             +------+ \
660                       \ +------+
661                        <>  R8  <>---
662                         +------+

664                        Figure 3: Clusters Example

666     There are clusters with more than two nodes as well as two-node
667     clusters.  In the two-node clusters, the loss is on the link (Cluster
668     4).  In more-than-two-node clusters, the loss is on the cluster, but
669     we cannot know in which link (Cluster 1, 2, or 3).

671     In this way, the calculation of packet loss, delay and delay
672     variation can be made on a cluster basis.  Note that the packet
673     counters for each marking period permit calculating the packet rate
674     on a cluster basis, so Committed Information Rate (CIR) and Excess
675     Information Rate (EIR) could also be deduced on a cluster basis.

677     Obviously, by combining some clusters in a new connected subnetwork
678     (called a "super cluster"), the packet-loss rule is still true.

680     In this way, in a very large network, there is no need to configure
681     detailed filter criteria to inspect the traffic.  It is possible to
682     check a multipoint network and, in case of problems, go deep with a
683     step-by-step cluster analysis, but only for the cluster or
684     combination of clusters where the problem happens.

686     In summary, once a flow is defined, the algorithm to build the
687     clusters partition is based on topological information; therefore, it
688     considers all the possible links and nodes crossed by the given flow,
689     even if there is no traffic.  So, if the flow does not enter or
690     traverse all the nodes, the counters have a nonzero value for the
691     involved nodes and a zero value for the other nodes without traffic;
692     but in the end, all the formulas are still valid.

694     The algorithm described above is an iterative clustering algorithm,
695     but it is also possible to apply a recursive clustering algorithm by
696     using the node-node adjacency matrix representation
697     [IEEE-ACM-ToN-MPNPM].

699     The complete and mathematical analysis of the possible algorithms for
700     clusters partition, including the considerations in terms of
701     efficiency and a comparison between the different methods, is in the
702     paper [IEEE-ACM-ToN-MPNPM].

704  6.  Multipoint Packet Loss Measurement

706     The Network Packet Loss defined in Section 4.2, which is valid for an
707     entire monitored flow, can easily be extended to each multipoint path
708     (e.g., the whole multipoint network, a cluster, or a combination of
709     clusters).  In this way it is possible to calculate Multipoint Packet
710     Loss that is representative of a multipoint path.

712     The same equation of Section 4.2 can be applied to a generic
713     multipoint path like a cluster or a combination of clusters, where
714     the packet counts are those of the packets entering and leaving the
715     multipoint path.

717     By applying the algorithm described in Section 5.1, it is possible to
718     split the monitoring network into clusters.  Then, packet loss can be
719     measured on a cluster basis for each single period by considering the
720     counters of the input and output nodes that belong to the specific
721     cluster.  This can be done for every packet flow in each marking
722     period.

724  7.  Multipoint Delay and Delay Variation

726     The same line of reasoning can be applied to delay and delay
727     variation.  Similarly to the delay measurements defined in
728     [I-D.fioccola-rfc8321bis], the marking batches anchor the samples to
729     a particular period, and this is the time reference that can be used.
730     It is important to highlight that both delay and delay-variation
731     measurements make sense in a multipoint path.  The delay variation is
732     calculated by considering the same packets selected for measuring the
733     delay.

735     In general, it is possible to perform delay and delay-variation
736     measurements on the basis of multipoint paths or single packets:

738     o  Delay measurements on the basis of multipoint paths mean that the
739        delay value is representative of an entire multipoint path (e.g.,
740        the whole multipoint network, a cluster, or a combination of
741        clusters).

743     o  Delay measurements on a single-packet basis mean that you can use
744        a multipoint path just to easily couple packets between input and
745        output nodes of a multipoint path, as described in the following
746        sections.

748  7.1.  Delay Measurements on a Multipoint-Paths Basis

750  7.1.1.  Single-Marking Measurement

752     Mean delay and mean delay-variation measurements can also be
753     generalized to the case of multipoint flows.  It is possible to
754     compute the average one-way delay of packets in one block, a cluster,
755     or the entire monitored network.

757     The average latency can be measured as the difference between the
758     weighted averages of the mean timestamps of the sets of output and
759     input nodes.  This means that, in the calculation, it is possible to
760     weigh the timestamps by considering the number of packets for each
761     endpoint.
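
   The weighted-average calculation described above can be sketched as
   follows.  This is a minimal illustration, assuming that each
   measurement point reports a packet count and a mean timestamp for
   the marking period, that clocks are synchronized, and that packet
   loss is negligible.  The function names and all values are
   hypothetical.

```python
def weighted_mean_timestamp(nodes):
    """nodes: list of (packet_count, mean_timestamp) per node."""
    total_packets = sum(count for count, _ in nodes)
    weighted_sum = sum(count * mean_ts for count, mean_ts in nodes)
    return weighted_sum / total_packets


def multipoint_mean_delay(input_nodes, output_nodes):
    """Mean one-way delay of a multipoint path for one marking period:
    weighted mean timestamp at the outputs minus the same at the inputs.
    """
    return (weighted_mean_timestamp(output_nodes)
            - weighted_mean_timestamp(input_nodes))


# Hypothetical reports, timestamps in milliseconds.
input_nodes = [(100, 10.0), (300, 12.0)]   # two input nodes
output_nodes = [(200, 15.0), (200, 17.0)]  # two output nodes

print(multipoint_mean_delay(input_nodes, output_nodes))
```

   Weighting by packet count keeps a lightly loaded node from
   influencing the result as much as a heavily loaded one, which is the
   reason the text suggests weighing the timestamps.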
763 Note that, since the one-way delay value is representative of a 764 multipoint path, it is possible to calculate the two-way delay of a 765 multipoint path by summing the one-way delays of the two directions, 766 similarly to [I-D.fioccola-rfc8321bis]. 768 7.2. Delay Measurements on a Single-Packet Basis 770 7.2.1. Single- and Double-Marking Measurement 772 Delay and delay-variation measurements relative to only one picked 773 packet per period (both single and double marked) can be performed in 774 the multipoint scenario, with some limitations: 776 Single marking based on the first/last packet of the interval 777 would not work, because it would not be possible to agree on the 778 first packet of the interval. 780 Double marking or multiplexed marking would work, but each 781 measurement would only give information about the delay of a 782 single path. However, by repeating the measurement multiple 783 times, it is possible to get information about all the paths in 784 the multipoint flow. This can be done in the case of a point-to- 785 multipoint path, but it is more difficult to achieve in the case 786 of a multipoint-to-multipoint path because of the multiple source 787 routers. 789 If we perform a delay measurement for more than one picked 790 packet in the same marking period, and especially if we want to get 791 delay measurements on a multipoint-to-multipoint basis, neither the 792 single- nor the double-marking method is useful in the multipoint 793 scenario, since they would not be representative of the entire flow. 794 The packets can follow different paths with various delays, and in 795 general it can be very difficult to recognize marked packets in a 796 multipoint-to-multipoint path, especially in the case when there is 797 more than one per period.
799 A desirable option is to monitor simultaneously all the paths of a 800 multipoint path in the same marking period; for this purpose, hashing 801 can be used, as reported in the next section. 803 Note that, since the one-way delay measurement is done on a single- 804 packet basis, it is always possible to calculate the two-way delay 805 but it is not immediate since it is necessary to couple the 806 measurement on each single path with the opposite direction. In this 807 case the NMS can do the calculation. 809 7.2.2. Hashing Selection Method 811 RFCs 5474 [RFC5474] and 5475 [RFC5475] introduce sampling and 812 filtering techniques for IP packet selection. 814 The hash-based selection methodologies for delay measurement can work 815 in a multipoint-to-multipoint path and MAY be used either coupled to 816 mean delay or stand-alone. 818 [I-D.mizrahi-ippm-marking] introduces how to use the hash method (RFC 819 5474 [RFC5474] and RFC 5475 [RFC5475]) combined with the Alternate- 820 Marking method for point-to-point flows. It is also called Mixed 821 Hashed Marking: the coupling of a marking method and hashing 822 technique is very useful, because the marking batches anchor the 823 samples selected with hashing, and this simplifies the correlation of 824 the hashing packets along the path. 826 It is possible to use a basic-hash or a dynamic-hash method. One of 827 the challenges of the basic approach is that the frequency of the 828 sampled packets may vary considerably. For this reason, the dynamic 829 approach has been introduced for point-to-point flows in order to 830 have the desired and almost fixed number of samples for each 831 measurement period. Using the hash-based sampling, the number of 832 samples may vary a lot because it depends on the packet rate that is 833 variable. The dynamic approach helps to have an almost fixed number 834 of samples for each marking period, and this is a better option for 835 making regular measurements over time. 
In the hash-based sampling, 836 Alternate Marking is used to create periods, so that hash-based 837 samples are divided into batches, which allows anchoring the selected 838 samples to their period. Moreover, in the dynamic hash-based 839 sampling, by dynamically adapting the length of the hash value, the 840 number of samples is bounded in each marking period. 842 In a multipoint environment, the hashing selection MAY be the 843 solution for performing delay measurements on specific packets and 844 overcoming the single- and double-marking limitations. 846 8. Synchronization and Timing 848 It is important to consider the timing aspects, since out-of-order 849 packets happen and have to be handled as well, as described in 850 [I-D.fioccola-rfc8321bis]. 852 However, in a multisource situation, an additional issue has to be 853 considered. With a multipoint path, the egress nodes will receive 854 alternate marked packets in random order from different ingress 855 nodes, and this must not affect the measurement. 857 So, if we analyze a multipoint-to-multipoint path with more than one 858 marking node, it is important to recognize the reference measurement 859 interval. In general, the measurement interval for describing the 860 results is the interval of the marking node that is most closely aligned with 861 the start of the measurement, as reported in Figure 4. 863 Note that the mark switching approach based on a fixed timer is 864 considered in this document. 866 time -> start stop 867 T(R1) |-------------| 868 T(R2) |-------------| 869 T(R3) |------------| 871 Figure 4: Measurement Interval 873 In Figure 4, it is assumed that the node with the earliest clock (R1) 874 identifies the right starting and ending times of the measurement, 875 but it is just an assumption, and other possibilities could occur.
876 So, in this case, T(R1) is the measurement interval, and its 877 recognition is essential in order to make comparisons with other 878 active/passive/hybrid Packet Loss metrics. 880 Regarding the timing constraints of the methodology, 881 [I-D.fioccola-rfc8321bis] already describes two contributions that 882 are taken into account: the clock error between network devices and 883 the network delay between the measurement points. 885 When we expand to a multipoint environment, we have to consider that 886 there are more marking nodes that mark the traffic based on 887 synchronized clock time. But, due to different synchronization 888 issues that may happen, the marking batches can be of different 889 lengths and with different offsets when they get mixed in a 890 multipoint flow. The additional gap that results between the sources 891 can be incorporated into A, which is the maximum clock skew between 892 the network devices, as already defined in [I-D.fioccola-rfc8321bis]. 894 ...BBBBBBBBB | AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA | BBBBBBBBB... 895 |<======================================>| 896 | L | 897 ...=========>|<==================><==================>|<==========... 898 | L/2 L/2 | 899 |<====>| |<====>| 900 d | | d 901 |<========================>| 902 available counting interval 904 Figure 5: Timing Aspects 906 Moreover, it is assumed that each path of the multipoint flow can 907 still be represented with a distinct normal distribution. So, for 908 the aggregate multipoint path, the combination of normal 909 distributions results in a new normal distribution. Under this 910 assumption, the definition of the guard band d is still applicable as 911 defined in [I-D.fioccola-rfc8321bis] and is given by: 913 d = A + D_avg + 3*D_stddev, 915 where A is the clock accuracy, D_avg is the average value of the 916 network delay, and D_stddev is the standard deviation of the delay.
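As a worked illustration of the guard band, the sketch below evaluates d for one measurement point and the counting interval that remains once a guard band is removed from each end of a marking period of length L, as depicted in Figure 5. The function names and the numeric values are hypothetical, and all quantities are assumed to be expressed in the same time unit (milliseconds here).

```python
def guard_band(clock_accuracy, delay_avg, delay_stddev):
    """Guard band d = A + D_avg + 3*D_stddev (all in the same time unit)."""
    return clock_accuracy + delay_avg + 3 * delay_stddev


def available_counting_interval(period_length, d):
    """Part of a marking period of length L left after removing d at each end."""
    return period_length - 2 * d


# Illustrative values only, in milliseconds.
L_ms = 1000        # marking period length L
A_ms = 1           # clock accuracy A
D_avg_ms = 20      # average network delay
D_stddev_ms = 5    # standard deviation of the delay

d_ms = guard_band(A_ms, D_avg_ms, D_stddev_ms)
interval_ms = available_counting_interval(L_ms, d_ms)
```

With these sample values d is 36 ms and 928 ms of each 1000 ms period remain available for counting. In a multipoint path the same evaluation would be repeated for every measurement point, each with its own delay statistics.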
918 As shown in Figure 5 and according to [I-D.fioccola-rfc8321bis], the 919 condition that must be satisfied to enable the method to function 920 properly is that the available counting interval must be > 0, and 921 that means: 923 L - 2d > 0. 925 This formula needs to be verified for each measurement point on the 926 multipoint path. 928 Note that the timing considerations are valid for both packet loss 929 and delay measurements. 931 9. Results of the Multipoint Alternate Marking Experiment 933 The methodology described in the previous sections can be applied to 934 various performance measurement problems, as also explained in 935 [I-D.fioccola-rfc8321bis]. 937 Either one or two flag bits might be available for marking in 938 different deployments: 940 One flag: packet loss measurement SHOULD be done as described in 941 Section 6 by applying the network clustering partition described 942 in Section 5, while delay measurement MAY be done according to 943 the Mean delay calculation representative of the multipoint path, 944 as described in Section 7.1.1. The single-marking method based on the 945 first/last packet of the interval cannot be applied, as mentioned 946 in Section 7.2.1. 948 Two flags: packet loss measurement SHOULD be done as described in 949 Section 6 by applying the network clustering partition described 950 in Section 5, while delay measurement SHOULD be done on a single- 951 packet basis according to the double-marking method (Section 7.2.1). In 952 this case the Mean delay calculation (Section 7.1.1) MAY also be 953 used as a representative value of a multipoint path. 955 One flag and hash-based selection: packet loss measurement SHOULD 956 be done as described in Section 6 by applying the network 957 clustering partition described in Section 5. Hash-based selection 958 methodologies, introduced in Section 7.2.2, MAY be used for delay 959 measurement.
961 The experiment with Multipoint Alternate Marking methodologies 962 confirmed the benefits of the Alternate Marking methodology described 963 in [I-D.fioccola-rfc8321bis], as its extension to the general case of 964 multipoint-to-multipoint scenarios. 966 The Multipoint Alternate Marking Method is RECOMMENDED only for 967 controlled domains, as per [I-D.fioccola-rfc8321bis]. 969 10. A Closed-Loop Performance-Management Approach 971 The Multipoint Alternate-Marking framework that is introduced in this 972 document adds flexibility to Performance Management (PM), because it 973 can reduce the order of magnitude of the packet counters. This 974 allows an SDN orchestrator to supervise, control, and manage PM in 975 large networks. 977 The monitoring network can be considered as a whole or split into 978 clusters that are the smallest subnetworks (group-to-group segments), 979 maintaining the packet-loss property for each subnetwork. The 980 clusters can also be combined in new, connected subnetworks at 981 different levels, depending on the detail we want to achieve. 983 An SDN controller or a Network Management System (NMS) can calibrate 984 performance measurements, since they are aware of the network 985 topology. They can start without examining in depth. In case of 986 necessity (packet loss is measured or the delay is too high), the 987 filtering criteria could be immediately reconfigured in order to 988 perform a partition of the network by using clusters and/or different 989 combinations of clusters. In this way, the problem can be localized 990 in a specific cluster or a single combination of clusters, and a more 991 detailed analysis can be performed step by step by successive 992 approximation up to a point-to-point flow detailed analysis. This is 993 the so-called "closed loop". 
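A minimal sketch of one closed-loop step is given below, assuming that the controller holds, for a single marking period, one packet counter per measurement point and already knows the input and output measurement points of the whole network and of each cluster. The data layout, the measurement point names, and the function names are hypothetical; the sketch only shows the coarse-to-fine logic and is not a prescribed implementation.

```python
def packet_loss(counters, inputs, outputs):
    """Packets entering minus packets leaving a multipoint path in one period."""
    return (sum(counters[p] for p in inputs)
            - sum(counters[p] for p in outputs))


def localize_loss(counters, network, clusters):
    """Return the clusters showing loss, examined only if the network does.

    network:  (input_points, output_points) for the whole monitored network.
    clusters: dict mapping cluster name -> (input_points, output_points).
    """
    if packet_loss(counters, *network) == 0:
        return []  # coarse view is clean, no need to zoom in
    return [name for name, (ins, outs) in clusters.items()
            if packet_loss(counters, ins, outs) > 0]


# Illustrative counters for one marking period (hypothetical names).
counters = {"in1": 100, "mid1": 60, "mid2": 40, "out1": 60, "out2": 37}
network = (["in1"], ["out1", "out2"])
clusters = {
    "c1": (["in1"], ["mid1", "mid2"]),
    "c2": (["mid1"], ["out1"]),
    "c3": (["mid2"], ["out2"]),
}
lossy = localize_loss(counters, network, clusters)
```

In this example the whole network loses 3 packets, and the per-cluster check attributes the loss to cluster "c3" only. A controller could then refine the flow filter or activate further measurement points inside that cluster.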
995 This approach can be called "network zooming" and can be performed in 996 two different ways: 998 1) change the traffic filter and select more detailed flows; 1000 2) activate new measurement points by defining more specified 1001 clusters. 1003 The network-zooming approach implies that some filters or rules are 1004 changed and that therefore there is a transient time to wait once the 1005 new network configuration takes effect. This time can be determined 1006 by the Network Orchestrator/Controller, based on the network 1007 conditions. 1009 For example, if the network zooming identifies the performance 1010 problem for the traffic coming from a specific source, we need to 1011 recognize the marked signal from this specific source node and its 1012 relative path. For this purpose, we can activate all the available 1013 measurement points and better specify the flow filter criteria (i.e., 1014 5-tuple). As an alternative, it can be enough to select packets from 1015 the specific source for delay measurements; in this case, it is 1016 possible to apply the hashing technique, as mentioned in the previous 1017 sections. 1019 [I-D.song-opsawg-ifit-framework] defines an architecture where the 1020 centralized Data Collector and Network Management can apply the 1021 intelligent and flexible Alternate-Marking algorithm as previously 1022 described. 1024 As for [I-D.fioccola-rfc8321bis], it is possible to classify the 1025 traffic and mark a portion of the total traffic. For each period, 1026 the packet rate and bandwidth are calculated from the number of 1027 packets. In this way, the network orchestrator becomes aware if the 1028 traffic rate surpasses limits. In addition, more precision can be 1029 obtained by reducing the marking period; indeed, some implementations 1030 use a marking period of 1 sec or less. 1032 In addition, an SDN controller could also collect the measurement 1033 history. 
1035 It is important to mention that the Multipoint Alternate Marking 1036 framework also helps Traffic Visualization. Indeed, this methodology 1037 is very useful for identifying which path or cluster is crossed by 1038 the flow. 1040 11. Security Considerations 1042 This document specifies a method of performing measurements that does 1043 not directly affect Internet security or applications that run on the 1044 Internet. However, implementation of this method must be mindful of 1045 security and privacy concerns, as explained in 1046 [I-D.fioccola-rfc8321bis]. 1048 12. IANA Considerations 1050 This document has no IANA actions. 1052 13. Contributors 1054 Greg Mirsky 1055 Ericsson 1056 Email: gregimirsky@gmail.com 1058 Tal Mizrahi 1059 Huawei Technologies 1060 Email: tal.mizrahi.phd@gmail.com 1062 Xiao Min 1063 ZTE Corp. 1064 Email: xiao.min2@zte.com.cn 1066 14. Acknowledgements 1068 The authors would like to thank Martin Duke and Tommy Pauly for their 1069 assistance and their detailed and precious reviews. 1071 15. References 1073 15.1. Normative References 1075 [I-D.fioccola-rfc8321bis] 1076 Fioccola, G., Cociglio, M., Mirsky, G., Mizrahi, T., Zhou, 1077 T., and X. Min, "Alternate-Marking Method", draft- 1078 fioccola-rfc8321bis-03 (work in progress), February 2022. 1080 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1081 Requirement Levels", BCP 14, RFC 2119, 1082 DOI 10.17487/RFC2119, March 1997, 1083 . 1085 [RFC5474] Duffield, N., Ed., Chiou, D., Claise, B., Greenberg, A., 1086 Grossglauser, M., and J. Rexford, "A Framework for Packet 1087 Selection and Reporting", RFC 5474, DOI 10.17487/RFC5474, 1088 March 2009, . 1090 [RFC5475] Zseby, T., Molina, M., Duffield, N., Niccolini, S., and F. 1091 Raspall, "Sampling and Filtering Techniques for IP Packet 1092 Selection", RFC 5475, DOI 10.17487/RFC5475, March 2009, 1093 . 1095 [RFC5644] Stephan, E., Liang, L., and A. 
Morton, "IP Performance 1096 Metrics (IPPM): Spatial and Multicast", RFC 5644, 1097 DOI 10.17487/RFC5644, October 2009, 1098 . 1100 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 1101 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 1102 May 2017, . 1104 15.2. Informative References 1106 [I-D.ietf-ippm-route] 1107 Alvarez-Hamelin, J. I., Morton, A., Fabini, J., Pignataro, 1108 C., and R. Geib, "Advanced Unidirectional Route Assessment 1109 (AURA)", draft-ietf-ippm-route-10 (work in progress), 1110 August 2020. 1112 [I-D.mizrahi-ippm-marking] 1113 Mizrahi, T., Fioccola, G., Cociglio, M., Chen, M., and G. 1114 Mirsky, "Marking Methods for Performance Measurement", 1115 draft-mizrahi-ippm-marking-00 (work in progress), October 1116 2021. 1118 [I-D.song-opsawg-ifit-framework] 1119 Song, H., Qin, F., Chen, H., Jin, J., and J. Shin, "In- 1120 situ Flow Information Telemetry", draft-song-opsawg-ifit- 1121 framework-16 (work in progress), October 2021. 1123 [I-D.zhou-ippm-enhanced-alternate-marking] 1124 Zhou, T., Fioccola, G., Liu, Y., Lee, S., Cociglio, M., 1125 and W. Li, "Enhanced Alternate Marking Method", draft- 1126 zhou-ippm-enhanced-alternate-marking-08 (work in 1127 progress), January 2022. 1129 [IEEE-ACM-ToN-MPNPM] 1130 IEEE/ACM TRANSACTION ON NETWORKING, "Multipoint Passive 1131 Monitoring in Packet Networks", 1132 DOI 10.1109/TNET.2019.2950157, 2019. 1134 [IEEE-Network-PNPM] 1135 IEEE Network, "AM-PM: Efficient Network Telemetry using 1136 Alternate Marking", DOI 10.1109/MNET.2019.1800152, 2019. 1138 [RFC7011] Claise, B., Ed., Trammell, B., Ed., and P. Aitken, 1139 "Specification of the IP Flow Information Export (IPFIX) 1140 Protocol for the Exchange of Flow Information", STD 77, 1141 RFC 7011, DOI 10.17487/RFC7011, September 2013, 1142 . 1144 [RFC8889] Fioccola, G., Ed., Cociglio, M., Sapio, A., and R. 
Sisto, 1145 "Multipoint Alternate-Marking Method for Passive and 1146 Hybrid Performance Monitoring", RFC 8889, 1147 DOI 10.17487/RFC8889, August 2020, 1148 . 1150 Appendix A. Changes Log 1152 Changes from RFC 8889 include: 1154 o Minor editorial changes 1156 o Removed section on "Examples of application" 1158 Changes in v-(01) include: 1160 o Considerations on BUM traffic 1162 o Reference to RFC8321bis for the fragmentation part 1164 o Revised section on "Delay Measurements on a Single-Packet Basis" 1165 o Revised section on "Timing Aspects" 1167 Changes in v-(02) include: 1169 o Clarified the formula in the section on "Timing Aspects" to be 1170 aligned with RFC 8321 1172 o Considerations on two-way delay measurements in both sections 8.1 1173 and 8.2 on delay measurements 1175 o Clarified in section 4.1 on "Monitoring Network" that the 1176 description is done for one direction but it can easily be 1177 extended to all directions 1179 o New section on "Results of the Multipoint Alternate Marking 1180 Experiment" 1182 Changes in v-(03) include: 1184 o Moved and renamed section on "Timing Aspects" as "Synchronization 1185 and Timing" 1187 o Renamed old section on "Multipoint Packet Loss" as "Network Packet 1188 Loss" 1190 o New section on "Multipoint Packet Loss Measurement" 1192 o Renamed section on "Multipoint Performance Measurement" as 1193 "Extension of the Method to Multipoint Flows" 1195 Authors' Addresses 1197 Giuseppe Fioccola (editor) 1198 Huawei Technologies 1199 Riesstrasse, 25 1200 Munich 80992 1201 Germany 1203 Email: giuseppe.fioccola@huawei.com 1205 Mauro Cociglio 1206 Telecom Italia 1207 Via Reiss Romoli, 274 1208 Torino 10148 1209 Italy 1211 Email: mauro.cociglio@telecomitalia.it 1212 Amedeo Sapio 1213 Intel Corporation 1214 4750 Patrick Henry Dr.
1215 Santa Clara, CA 95054 1216 USA 1218 Email: amedeo.sapio@intel.com 1220 Riccardo Sisto 1221 Politecnico di Torino 1222 Corso Duca degli Abruzzi, 24 1223 Torino 10129 1224 Italy 1226 Email: riccardo.sisto@polito.it 1228 Tianran Zhou 1229 Huawei Technologies 1230 156 Beiqing Rd. 1231 Beijing 100095 1232 China 1234 Email: zhoutianran@huawei.com