Network Working Group                                   G. Fioccola, Ed.
Internet-Draft                                       Huawei Technologies
Obsoletes: 8889 (if approved)                                M. Cociglio
Intended status: Standards Track                          Telecom Italia
Expires: October 30, 2022                                       A. Sapio
                                                       Intel Corporation
                                                                R. Sisto
                                                   Politecnico di Torino
                                                                 T. Zhou
                                                     Huawei Technologies
                                                          April 28, 2022


             Multipoint Alternate-Marking Clustered Method
                     draft-ietf-ippm-rfc8889bis-01

Abstract

   This document generalizes and expands the Alternate-Marking
   methodology to measure any kind of unicast flow whose packets can
   follow several different paths in the network -- in wider terms, a
   multipoint-to-multipoint network.  For this reason, the technique
   described here is called "Multipoint Alternate Marking".  This
   document obsoletes RFC 8889.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 30, 2022.

Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Summary of Changes from RFC 8889
     1.2.  Requirements Language
   2.  Terminology
     2.1.  Correlation with RFC 5644
   3.  Flow Classification
   4.  Extension of the Method to Multipoint Flows
     4.1.  Monitoring Network
     4.2.  Network Packet Loss
   5.  Network Clustering
     5.1.  Algorithm for Clusters Partition
   6.  Multipoint Packet Loss Measurement
   7.  Multipoint Delay and Delay Variation
     7.1.  Delay Measurements on a Multipoint-Paths Basis
       7.1.1.  Single-Marking Measurement
     7.2.  Delay Measurements on a Single-Packet Basis
       7.2.1.  Single- and Double-Marking Measurement
       7.2.2.  Hashing Selection Method
   8.  Synchronization and Timing
   9.  Results of the Multipoint Alternate Marking Experiment
   10. A Closed-Loop Performance-Management Approach
   11. Security Considerations
   12. IANA Considerations
   13. Contributors
   14. Acknowledgements
   15. References
     15.1.  Normative References
     15.2.  Informative References
   Appendix A.  Changes Log
   Authors' Addresses

1.  Introduction

   The Alternate-Marking Method, as described in
   [I-D.fioccola-rfc8321bis], is applicable to a point-to-point path.
   The extension proposed in this document applies to the most general
   case of a multipoint-to-multipoint path and enables flexible and
   adaptive performance measurements in a managed network.

   The Alternate-Marking methodology described in
   [I-D.fioccola-rfc8321bis] allows the synchronization of the
   measurements at different points by dividing the packet flow into
   batches.  It is therefore possible to obtain coherent counters and
   show what is happening in every marking period for each monitored
   flow.  The monitoring parameters are the packet counter and
   timestamps of a flow for each marking period.  Note that additional
   details about the applicability of the Alternate-Marking methodology
   are described in [I-D.fioccola-rfc8321bis], while implementation
   details can be found in the paper "AM-PM: Efficient Network Telemetry
   using Alternate Marking" [IEEE-Network-PNPM].

   In some applications of the Alternate-Marking method, there are many
   monitored flows and nodes.  Multipoint Alternate Marking aims to
   reduce these numbers and makes the performance monitoring more
   flexible when a detailed analysis is not needed.  For instance, with
   n measurement points and m monitored flows, the number of packet
   counters for each time interval is on the order of n*m*2 (1 per
   color).  The number of measurement points and monitored flows may
   vary and depends on the portion of the network being monitored (core
   network, metro network, access network) and on the granularity (for
   each service, each customer).  So if both n and m are large, the
   number of packet counters grows considerably, and Multipoint
   Alternate Marking offers a tool to control these parameters.
   The approach presented in this document is applied only to unicast
   flows and not to multicast.  Broadcast, Unknown Unicast, and
   Multicast (BUM) traffic is not considered here, because traffic
   replication is not covered by the Multipoint Alternate-Marking
   method.  The method is also applicable to anycast flows, and
   Equal-Cost Multipath (ECMP) paths can be easily monitored with this
   technique.

   [I-D.fioccola-rfc8321bis] applies to point-to-point unicast flows and
   BUM traffic.  For BUM traffic, the basic method of
   [I-D.fioccola-rfc8321bis] can easily be applied link by link,
   thereby splitting the multicast flow tree distribution into separate
   unicast point-to-point links.  This document and its Clustered
   Alternate-Marking method, in contrast, are valid for multipoint-to-
   multipoint unicast flows, anycast, and ECMP flows.

   Therefore, the Alternate-Marking method can be extended to any kind
   of multipoint-to-multipoint paths, and the network-clustering
   approach presented in this document is the formalization of how to
   implement this property and allow flexible and optimized performance
   measurement support for network management in every situation.

   Without network clustering, it is possible to apply Alternate Marking
   only to the whole network or to a single flow.  With network
   clustering, instead, it is possible to partition the network into
   clusters at different levels in order to provide the needed degree of
   detail.  In some circumstances, it is possible to monitor a
   multipoint network by monitoring the network clusters, without
   examining them in depth.  In case of problems (packet loss is
   measured or the delay is too high), the filtering criteria could be
   enhanced in order to perform a detailed analysis by using a different
   combination of clusters, up to a per-flow measurement as described in
   [I-D.fioccola-rfc8321bis].
   This approach fits very well with the Closed-Loop Network and
   Software-Defined Network (SDN) paradigms, where the SDN orchestrator
   and the SDN controllers are the brains of the network and can manage
   flow control to the switches and routers and, in the same way, can
   calibrate the performance measurements depending on the desired
   accuracy.  An SDN controller application can orchestrate how
   accurately the network performance monitoring is set up by applying
   the Multipoint Alternate Marking as described in this document.

   It is important to underline that, as an extension of
   [I-D.fioccola-rfc8321bis], this is a methodology document, so the
   mechanism that can be used to transmit the counters and the
   timestamps is out of scope here, and the implementation is open.
   Several options are possible -- e.g., see "Enhanced Alternate Marking
   Method" [I-D.zhou-ippm-enhanced-alternate-marking].

   This document assumes that the blocks are created according to a
   fixed timer as per [I-D.fioccola-rfc8321bis].  Switching after a
   fixed number of packets is possible, but it is out of scope here.

   Note that the fragmented packets case can be managed with the
   Alternate-Marking methodology.  The same considerations of
   [I-D.fioccola-rfc8321bis] also apply in the case of Multipoint
   Alternate Marking.  As defined in [I-D.fioccola-rfc8321bis], the
   marking node MUST mark all the fragments except in the case of
   fragmentation within the network domain; in that event, it is
   suggested to mark only the first fragment.

1.1.  Summary of Changes from RFC 8889

   This document defines the Multipoint Alternate-Marking Method,
   addressing ambiguities in the original specification [RFC8889] and
   moving the method beyond its experimental phase.

   The relevant changes are:

   o  Added the recommendations about the different deployments in case
      one or two flag bits are available for marking (Section 9).
   o  Changed the structure to improve the readability.

   o  Removed the wording about the experimentation of the method and
      considerations that no longer apply.

   o  Revised the description of detailed aspects of the methodology,
      e.g., synchronization and timing.

   It is important to note that all the changes are fully backward
   compatible with [RFC8889] and that no new technique has been
   introduced in this document compared to [RFC8889].

1.2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.  Terminology

   The definitions of the basic terms are identical to those found in
   Alternate Marking [I-D.fioccola-rfc8321bis].  Recall that
   [I-D.fioccola-rfc8321bis] is valid for point-to-point unicast flows
   and BUM traffic.

   The important new terms that need to be explained are listed below:

   Multipoint Alternate Marking:  Extension to
      [I-D.fioccola-rfc8321bis], valid for multipoint-to-multipoint
      unicast flows, anycast, and ECMP flows.  It can also be referred
      to as Clustered Alternate Marking.

   Flow definition:  The concept of flow is generalized in this
      document.  The identification fields are selected without any
      constraints and, in general, the flow can be a multipoint-to-
      multipoint flow, as a result of aggregate point-to-point flows.

   Monitoring Network:  Identified with the nodes of the network that
      are the measurement points (MPs) and the links that are the
      connections between MPs.  The monitoring network graph depends on
      the flow definition, so it can represent a specific flow or the
      entire network topology as the aggregate of all the flows.
   Cluster:  Smallest identifiable subnetwork of the entire monitoring
      network graph that still satisfies the condition that the number
      of packets that go in is the same as the number that go out.

   Multipoint metrics:  Packet loss, delay, and delay variation are
      extended to the case of multipoint flows.  It is possible to
      compute these metrics on the basis of multipoint paths in order to
      associate the measurements to a cluster, a combination of
      clusters, or the entire monitored network.  For delay and delay
      variation, it is also possible to define the metrics on a single-
      packet basis, which means that the multipoint path is used to
      easily couple packets between input and output nodes of a
      multipoint path.

   The next section highlights the correlation with the terms used in
   RFC 5644 [RFC5644].

2.1.  Correlation with RFC 5644

   RFC 5644 [RFC5644] is limited to active measurements using a single
   source packet or stream.  Its scope is also limited to observations
   of corresponding packets along the path (spatial metric) and at one
   or more destinations (one-to-group) along the path.

   Instead, the scope of this memo is to define multiparty metrics for
   passive and hybrid measurements in a group-to-group topology with
   multiple sources and destinations.

   RFC 5644 [RFC5644] introduces metric names that can be reused here
   but have to be extended and rephrased to be applied to the Alternate-
   Marking schema:

   a.  the multiparty metrics are not only one-to-group metrics but can
       also be group-to-group metrics;

   b.  the spatial metrics, used for measuring the performance of
       segments of a source to destination path, are applied here to
       group-to-group segments (called clusters).

3.  Flow Classification

   A unicast flow is identified by all the packets having a set of
   common characteristics.  This definition is inspired by RFC 7011
   [RFC7011].
   As an example, by considering a flow as all the packets sharing the
   same source IP address or the same destination IP address, it is easy
   to understand that the resulting pattern will not be a point-to-point
   connection, but a point-to-multipoint or multipoint-to-point
   connection.

   In general, a flow can be defined by a set of selection rules used to
   match a subset of the packets processed by the network device.  These
   rules specify a set of Layer 3 and Layer 4 header fields
   (identification fields) and the relative values that must be found in
   matching packets.

   The choice of the identification fields directly affects the type of
   paths that the flow would follow in the network.  In fact, it is
   possible to relate a set of identification fields with the pattern of
   the resulting graphs, as listed in Figure 1.

   A TCP 5-tuple usually identifies flows following either a single path
   or a point-to-point multipath (in the case of load balancing).  On
   the contrary, a single source address selects aggregate flows
   following a point-to-multipoint path, while a multipoint-to-point
   path can be the result of matching on a single destination address.
   In the case where a selection rule and its reverse are used for
   bidirectional measurements, they can correspond to a point-to-
   multipoint path in one direction and a multipoint-to-point path in
   the opposite direction.

   So the flows to be monitored are selected at the monitoring points
   using packet selection rules, which can also change the pattern of
   the monitored network.

   Note that, more generally, the flow can be defined at different
   levels based on the potential encapsulation, and additional
   conditions that are not in the packet header can also be included as
   part of the matching criteria.
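   As a minimal, non-normative sketch of the selection rules described
   above, a rule can be modelled as a set of identification fields with
   required values, where an unset field acts as a wildcard.  The names
   Packet, SelectionRule, and matches are hypothetical and are not
   defined by this document; the addresses are documentation examples.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    """Layer 3 / Layer 4 identification fields of one packet (illustrative)."""
    src_ip: str
    dst_ip: str
    protocol: str
    src_port: int
    dst_port: int


@dataclass(frozen=True)
class SelectionRule:
    """Hypothetical selection rule: a field left as None is a wildcard."""
    src_ip: Optional[str] = None
    dst_ip: Optional[str] = None
    protocol: Optional[str] = None
    src_port: Optional[int] = None
    dst_port: Optional[int] = None

    def matches(self, pkt: Packet) -> bool:
        # A packet belongs to the flow when every constrained field agrees.
        for name in ("src_ip", "dst_ip", "protocol", "src_port", "dst_port"):
            wanted = getattr(self, name)
            if wanted is not None and wanted != getattr(pkt, name):
                return False
        return True


packets = [
    Packet("192.0.2.1", "198.51.100.7", "TCP", 40000, 443),
    Packet("192.0.2.1", "203.0.113.9", "UDP", 40001, 53),
    Packet("192.0.2.2", "198.51.100.7", "TCP", 40002, 443),
]

# All five fields constrained: usually a single path or point-to-point multipath.
five_tuple = SelectionRule("192.0.2.1", "198.51.100.7", "TCP", 40000, 443)
# Only the source address constrained: an aggregate, point-to-multipoint flow.
by_source = SelectionRule(src_ip="192.0.2.1")
# Only the destination address constrained: a multipoint-to-point flow.
by_destination = SelectionRule(dst_ip="198.51.100.7")

five_tuple_count = sum(five_tuple.matches(p) for p in packets)
by_source_dsts = sorted({p.dst_ip for p in packets if by_source.matches(p)})
by_destination_srcs = sorted(
    {p.src_ip for p in packets if by_destination.matches(p)}
)
```

   In this sketch, the 5-tuple rule selects a single packet, while the
   source-only rule selects traffic towards two destinations and the
   destination-only rule selects traffic from two sources, which
   mirrors the point-to-multipoint and multipoint-to-point patterns.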
   The basic Alternate-Marking method is applicable only to a single
   path (and partially to a one-to-one multipath); the extension
   proposed in this document is also suitable for the most general case
   of multipoint-to-multipoint, which embraces all the other patterns of
   Figure 1.

       point-to-point single path
           +------+      +------+      +------+
       ---<>  R1  <>----<>  R2  <>----<>  R3  <>---
           +------+      +------+      +------+

       point-to-point multipath
                          +------+
                         <>  R2  <>
                        / +------+ \
                       /            \
           +------+   /              \   +------+
       ---<>  R1  <>                    <>  R4  <>---
           +------+   \              /   +------+
                       \            /
                        \ +------+ /
                         <>  R3  <>
                          +------+

       point-to-multipoint
                                    +------+
                                   <>  R4  <>---
                                  / +------+
                        +------+ /
                       <>  R2  <>
                      / +------+ \
           +------+  /            \ +------+
       ---<>  R1  <>               <>  R5  <>---
           +------+  \              +------+
                      \ +------+
                       <>  R3  <>
                        +------+ \
                                  \ +------+
                                   <>  R6  <>---
                                    +------+

       multipoint-to-point
           +------+
       ---<>  R1  <>
           +------+ \
                     \ +------+
                      <>  R4  <>
                     / +------+ \
           +------+ /            \ +------+
       ---<>  R2  <>              <>  R6  <>---
           +------+              / +------+
                       +------+ /
                      <>  R5  <>
                     / +------+
           +------+ /
       ---<>  R3  <>
           +------+

       multipoint-to-multipoint
           +------+                +------+
       ---<>  R1  <>              <>  R6  <>---
           +------+ \            / +------+
                     \ +------+ /
                      <>  R4  <>
                       +------+ \
           +------+              \ +------+
       ---<>  R2  <>              <>  R7  <>---
           +------+ \            / +------+
                     \ +------+ /
                      <>  R5  <>
                     / +------+ \
           +------+ /            \ +------+
       ---<>  R3  <>              <>  R8  <>---
           +------+                +------+

                      Figure 1: Flow Classification

   The case of unicast flow is considered in Figure 1.  The anycast flow
   is also in scope, because there is no replication and only a single
   node from the anycast group receives the traffic, so it can be viewed
   as a special case of unicast flow.  Furthermore, an ECMP flow is in
   scope by definition, since it is a point-to-multipoint unicast flow.

4.  Extension of the Method to Multipoint Flows

   By using the basic Alternate-Marking method, only point-to-point
   paths can be monitored.  To have an IP (TCP/UDP) flow that follows a
   point-to-point path, we generally have to assign a specific value to
   each of 5 identification fields (IP Source, IP Destination, Transport
   Protocol, Source Port, Destination Port).

   Multipoint Alternate Marking enables the performance measurement for
   multipoint flows selected by identification fields without any
   constraints (even the entire network production traffic).  It is also
   possible to use multiple marking points for the same monitored flow.

4.1.  Monitoring Network

   The monitoring network is deduced from the production network by
   identifying the nodes of the graph that are the measurement points,
   and the links that are the connections between measurement points.

   There are some techniques that can help with the building of the
   monitoring network (as an example, see [I-D.ietf-ippm-route]).  In
   general, there are different options: the monitoring network can be
   obtained by considering all the possible paths for the traffic or by
   periodically checking the traffic (e.g., daily, weekly, monthly) and
   updating the graph as appropriate, but this is up to the Network
   Management System (NMS) configuration.

   So a graph model of the monitoring network can be built according to
   the Alternate-Marking method: the monitored interfaces and links are
   identified.  Only the measurement points and links where the traffic
   has flowed have to be represented in the graph.
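   As an illustrative sketch only, the option of building the graph
   from periodically checked traffic can be expressed as keeping just
   the links whose counters show traffic.  The function name
   build_monitoring_graph, the data layout, and the counter values are
   assumptions made for this example; how the per-link counters are
   collected is up to the NMS.

```python
from collections import defaultdict

# Hypothetical per-link packet counts observed for one flow; a pair
# (A, B) means traffic from measurement point A to measurement point B.
observed = {
    ("R1", "R2"): 120,
    ("R1", "R3"): 80,
    ("R2", "R4"): 120,
    ("R3", "R4"): 0,  # candidate link that carried no traffic
}


def build_monitoring_graph(link_counts):
    """Return an adjacency map containing only links that carried traffic."""
    graph = defaultdict(set)
    for (start, end), count in link_counts.items():
        if count > 0:
            graph[start].add(end)
    # Sort the neighbours so that the result is deterministic.
    return {node: sorted(ends) for node, ends in graph.items()}


monitoring_graph = build_monitoring_graph(observed)
```

   With these example values, the link (R3-R4) is left out of the
   graph because no traffic was observed on it.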
   Figure 2 shows a simple example of a monitoring network graph:

                                                 +------+
                                                <>  R6  <>---
                                               / +------+
                        +------+     +------+ /
                       <>  R2  <>---<>  R4  <>
                      / +------+ \   +------+ \
                     /            \            \ +------+
           +------+ /   +------+   \ +------+   <>  R7  <>---
       ---<>  R1  <>---<>  R3  <>---<>  R5  <>   +------+
           +------+ \   +------+ \   +------+ \
                     \            \            \ +------+
                      \            \            <>  R8  <>---
                       \            \            +------+
                        \            \
                         \            \ +------+
                          \            <>  R9  <>---
                           \            +------+
                            \
                             \ +------+
                              <>  R10 <>---
                               +------+

                    Figure 2: Monitoring Network Graph

   Each monitoring point is characterized by the packet counter that
   refers only to a marking period of the monitored flow.  Also, it is
   assumed that there is a monitoring point at all possible egress
   points of the multipoint monitored network.

   The same applies to the delay, as described in the following
   sections.

   The rest of the document assumes that the traffic is going from left
   to right in order to simplify the explanation, but the analysis done
   for one direction applies equally to all directions.

4.2.  Network Packet Loss

   Since all the packets of the considered flow leaving the network have
   previously entered the network, the number of packets counted by all
   the input nodes is always greater than, or equal to, the number of
   packets counted by all the output nodes.  Non-initial fragments are
   not considered here.

   In the case of no packet loss occurring in the marking period, if all
   the input and output points of the network domain to be monitored are
   measurement points, the sum of the number of packets on all the
   ingress interfaces equals the sum on all the egress interfaces for
   the monitored flow.  In this circumstance, if no packet loss occurs,
   the intermediate measurement points only have the task of splitting
   the measurement.

   It is possible to define the Network Packet Loss of one monitored
   flow for a single period.
   In a packet network, the number of lost packets is the number of
   packets counted by the input nodes minus the number of packets
   counted by the output nodes.  This is true for every packet flow in
   each marking period.

   The monitored network packet loss with n input nodes and m output
   nodes is given by:

   PL = (PI1 + PI2 +...+ PIn) - (PO1 + PO2 +...+ POm)

   where:

   PL is the network packet loss (number of lost packets)

   PIi is the number of packets that flowed through the i-th input node
   in this period

   POj is the number of packets that flowed through the j-th output node
   in this period

   The equation is applied on a per-time-interval basis and a per-flow
   basis:

      The reference interval is the Alternate-Marking period, as defined
      in [I-D.fioccola-rfc8321bis].

      The flow definition is generalized here.  Indeed, as described
      before, a multipoint packet flow is considered, and the
      identification fields can be selected without any constraints.

5.  Network Clustering

   The equation of Section 4.2 can determine the number of packets lost
   globally in the monitored network, exploiting only the data provided
   by the counters in the input and output nodes.

   In addition, it is also possible to leverage the data provided by the
   other counters in the network to converge on the smallest
   identifiable subnetworks where the losses occur.  These subnetworks
   are named "clusters".

   A cluster graph is a subnetwork of the entire monitoring network
   graph that still satisfies the packet loss equation (introduced in
   the previous section), where PL in this case is the number of packets
   lost in the cluster.  As for the entire monitoring network graph, the
   cluster is defined on a per-flow basis.

   For this reason, a cluster should contain all the arcs emanating from
   its input nodes and all the arcs terminating at its output nodes.
   This ensures that we can count all the packets (and only those)
   exiting an input node again at the output node, whatever path they
   follow.

   In a completely monitored unidirectional network (a network where
   every network interface is monitored), each network device
   corresponds to a cluster, and each physical link corresponds to two
   clusters (one for each device).

   Clusters can have different sizes depending on the flow-filtering
   criteria adopted.

   Moreover, sometimes clusters can be optionally simplified.  For
   example, when two monitored interfaces are divided by a single router
   (one is the input interface, the other is the output interface, and
   the router has only these two interfaces), instead of counting
   exactly twice, upon entering and leaving, it is possible to consider
   a single measurement point.  In this case, we do not care about the
   internal packet loss of the router.

   It is worth highlighting that it might also be convenient to define
   clusters based on the topological information so that they are
   applicable to all the possible flows in the monitored network.

5.1.  Algorithm for Clusters Partition

   A simple algorithm can be applied in order to split our monitoring
   network into clusters.  This can be done for each direction
   separately.  The clusters partition is based on the monitoring
   network graph, which can be valid for a specific flow or can also be
   general and valid for the entire network topology.

   It is a two-step algorithm:

   o  Group the links where there is the same starting node;

   o  Join the grouped links with at least one ending node in common.

   Considering that the links are unidirectional, the first step implies
   listing all the links as connections between two nodes and grouping
   the different links if they have the same starting node.  Note that
   it is possible to start from any link, and the procedure will work.
   Following this classification, the second step implies joining, where
   applicable, the groups classified in the first step by looking at the
   ending nodes.  If different groups have at least one common ending
   node, they are put together and belong to the same set.  After the
   application of the two steps of the algorithm, each one of the
   composed sets of links, together with the endpoint nodes, constitutes
   a cluster.

   In our monitoring network graph example, it is possible to identify
   the clusters partition by applying this two-step algorithm.

   The first step identifies the following groups:

   1.  Group 1: (R1-R2), (R1-R3), (R1-R10)

   2.  Group 2: (R2-R4), (R2-R5)

   3.  Group 3: (R3-R5), (R3-R9)

   4.  Group 4: (R4-R6), (R4-R7)

   5.  Group 5: (R5-R8)

   Then, the second step builds the clusters partition (in particular,
   Groups 2 and 3 are joined, since they have R5 in common):

   1.  Cluster 1: (R1-R2), (R1-R3), (R1-R10)

   2.  Cluster 2: (R2-R4), (R2-R5), (R3-R5), (R3-R9)

   3.  Cluster 3: (R4-R6), (R4-R7)

   4.  Cluster 4: (R5-R8)

   The flow direction considered here is from left to right.  For the
   opposite direction, the same reasoning can be applied, and in this
   example, you get the same clusters partition.
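   The two-step procedure above can be sketched as follows.  This is a
   minimal, non-normative illustration: the function name
   partition_into_clusters and the representation of links as
   (start, end) pairs are assumptions made for this example, and the
   link list encodes the monitoring network graph of Figure 2.

```python
from collections import defaultdict

# Unidirectional links of the example monitoring network (Figure 2).
LINKS = [
    ("R1", "R2"), ("R1", "R3"), ("R1", "R10"),
    ("R2", "R4"), ("R2", "R5"),
    ("R3", "R5"), ("R3", "R9"),
    ("R4", "R6"), ("R4", "R7"),
    ("R5", "R8"),
]


def partition_into_clusters(links):
    """Group links by starting node, then join groups sharing an ending node."""
    # Step 1: group the links that have the same starting node.
    groups = defaultdict(list)
    for start, end in links:
        groups[start].append((start, end))

    # Step 2: join the groups that have at least one ending node in common.
    clusters = []  # each cluster is a list of links
    for group in groups.values():
        ends = {end for _, end in group}
        merged = list(group)
        remaining = []
        for cluster in clusters:
            cluster_ends = {end for _, end in cluster}
            if ends & cluster_ends:
                # Shared ending node: absorb the existing cluster.
                merged = cluster + merged
                ends |= cluster_ends
            else:
                remaining.append(cluster)
        clusters = remaining + [merged]
    return clusters


clusters = partition_into_clusters(LINKS)
for number, cluster in enumerate(clusters, start=1):
    print(f"Cluster {number}:", ", ".join(f"({a}-{b})" for a, b in cluster))
```

   For the example graph, this sketch yields the same four clusters as
   listed above; Groups 2 and 3 end up in one cluster because both
   contain a link ending at R5.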
   In the end, the following 4 clusters are obtained:

       Cluster 1
                        +------+
                       <>  R2  <>---
                      / +------+
                     /
           +------+ /   +------+
       ---<>  R1  <>---<>  R3  <>---
           +------+ \   +------+
                     \
                      \
                       \
                        \
                         \
                          \
                           \
                            \
                             \ +------+
                              <>  R10 <>---
                               +------+

       Cluster 2
           +------+     +------+
       ---<>  R2  <>---<>  R4  <>---
           +------+ \   +------+
                     \
           +------+   \ +------+
       ---<>  R3  <>---<>  R5  <>---
           +------+ \   +------+
                     \
                      \
                       \
                        \
                         \ +------+
                          <>  R9  <>---
                           +------+

       Cluster 3
                       +------+
                      <>  R6  <>---
                     / +------+
           +------+ /
       ---<>  R4  <>
           +------+ \
                     \ +------+
                      <>  R7  <>---
                       +------+

       Cluster 4
           +------+
       ---<>  R5  <>
           +------+ \
                     \ +------+
                      <>  R8  <>---
                       +------+

                        Figure 3: Clusters Example

   There are clusters with more than two nodes as well as two-node
   clusters.  In the two-node clusters, the loss is on the link (Cluster
   4).  In more-than-two-node clusters, the loss is on the cluster, but
   we cannot know in which link (Cluster 1, 2, or 3).

   The algorithm, as applied in this example of a point-to-multipoint
   network, works in the same way for the more general case of a
   multipoint-to-multipoint network.  It should be highlighted that, for
   a multipoint-to-multipoint network, the multiple sources MUST mark
   the traffic coherently and MUST be synchronized with all the other
   nodes according to the timing requirements detailed in Section 8.

   When the clusters partition is done, the calculation of packet loss,
   delay, and delay variation can be made on a cluster basis.  Note that
   the packet counters for each marking period permit calculating the
   packet rate on a cluster basis, so Committed Information Rate (CIR)
   and Excess Information Rate (EIR) could also be deduced on a cluster
   basis.

   By combining some clusters into a new connected subnetwork, the
   packet-loss rule is still true.
   So it is also possible to consider combinations of clusters where
   appropriate.

   In this way, in a very large network, there is no need to configure
   detailed filter criteria to inspect the traffic.  It is possible to
   check a multipoint network and, in case of problems, go deep with a
   step-by-step cluster analysis, but only for the cluster or
   combination of clusters where the problem happens.

   In summary, once a flow is defined, the algorithm to build the
   clusters partition is based on topological information; therefore, it
   considers all the possible links and nodes crossed by the given flow,
   even if there is no traffic.  So, if the flow does not enter or
   traverse all the nodes, the counters have a non-zero value for the
   involved nodes and a zero value for the other nodes without traffic;
   but in the end, all the formulas are still valid.

   The algorithm described above is an iterative clustering algorithm,
   but it is also possible to apply a recursive clustering algorithm by
   using the node-node adjacency matrix representation
   [IEEE-ACM-ToN-MPNPM].

   The complete mathematical analysis of the possible algorithms for
   clusters partition, including the considerations in terms of
   efficiency and a comparison between the different methods, is in the
   paper [IEEE-ACM-ToN-MPNPM].

6.  Multipoint Packet Loss Measurement

   The Network Packet Loss, defined in Section 4.2 and valid for an
   entire monitored flow, can easily be extended to each multipoint path
   (e.g., the whole multipoint network, a cluster, or a combination of
   clusters).  In this way, it is possible to calculate a Multipoint
   Packet Loss that is representative of a multipoint path.

   The same equation of Section 4.2 can be applied to a generic
   multipoint path like a cluster or a combination of clusters, where
   the packets counted are those entering and leaving the multipoint
   path.
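   As a worked, non-normative example, the equation of Section 4.2 can
   be applied to Cluster 2 and Cluster 3 of Figure 3 and to their
   combination.  The counter values are invented for illustration, the
   function name packet_loss is hypothetical, and the sketch assumes
   that R4 is a single measurement point, so that the count leaving
   Cluster 2 at R4 equals the count entering Cluster 3 at R4.

```python
def packet_loss(input_counts, output_counts):
    """PL = (PI1 + ... + PIn) - (PO1 + ... + POm) for one marking period."""
    return sum(input_counts) - sum(output_counts)


# Hypothetical packet counters for one marking period of one flow.
counters = {
    "R2": 1000, "R3": 800,   # input nodes of Cluster 2
    "R4": 600,               # output of Cluster 2 and input of Cluster 3
    "R5": 900, "R9": 295,    # remaining output nodes of Cluster 2
    "R6": 350, "R7": 248,    # output nodes of Cluster 3
}


def loss_between(inputs, outputs):
    """Apply the equation to the named input and output nodes."""
    return packet_loss(
        [counters[node] for node in inputs],
        [counters[node] for node in outputs],
    )


cluster2_loss = loss_between(["R2", "R3"], ["R4", "R5", "R9"])
cluster3_loss = loss_between(["R4"], ["R6", "R7"])
# In the combination of Clusters 2 and 3, R4 becomes an internal node.
combined_loss = loss_between(["R2", "R3"], ["R5", "R9", "R6", "R7"])
```

   With these values, 5 packets are lost in Cluster 2 and 2 in
   Cluster 3, and the combined subnetwork shows a loss of 7, so the
   packet-loss rule still holds for the combination.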

   By applying the algorithm described in Section 5.1, it is possible to
   split the monitoring network into clusters. Then, packet loss can be
   measured on a cluster basis for each single period by considering the
   counters of the input and output nodes that belong to the specific
   cluster. This can be done for every packet flow in each marking
   period.

7.  Multipoint Delay and Delay Variation

   The same line of reasoning can be applied to delay and delay
   variation. Similarly to the delay measurements defined in
   [I-D.fioccola-rfc8321bis], the marking batches anchor the samples to
   a particular period, and this is the time reference that can be used.
   It is important to highlight that both delay and delay-variation
   measurements make sense in a multipoint path. The delay variation is
   calculated by considering the same packets selected for measuring the
   delay.

   In general, it is possible to perform delay and delay-variation
   measurements on the basis of multipoint paths or single packets:

   o  Delay measurements on the basis of multipoint paths mean that the
      delay value is representative of an entire multipoint path (e.g.,
      the whole multipoint network, a cluster, or a combination of
      clusters).

   o  Delay measurements on a single-packet basis mean that you can use
      a multipoint path just to easily couple packets between input and
      output nodes of a multipoint path, as described in the following
      sections.

7.1.  Delay Measurements on a Multipoint-Paths Basis

7.1.1.  Single-Marking Measurement

   Mean delay and mean delay-variation measurements can also be
   generalized to the case of multipoint flows. It is possible to
   compute the average one-way delay of packets in one block, a cluster,
   or the entire monitored network.

   The average latency can be measured as the difference between the
   weighted averages of the mean timestamps of the sets of output and
   input nodes.
   This means that, in the calculation, it is possible to weigh the
   timestamps by considering the number of packets for each endpoint.

   Note that, since the one-way delay value is representative of a
   multipoint path, it is possible to calculate the two-way delay of a
   multipoint path by summing the one-way delays of the two directions,
   similarly to [I-D.fioccola-rfc8321bis].

7.2.  Delay Measurements on a Single-Packet Basis

7.2.1.  Single- and Double-Marking Measurement

   Delay and delay-variation measurements relative to only one picked
   packet per period (both single and double marked) can be performed in
   the multipoint scenario, with some limitations:

      Single marking based on the first/last packet of the interval
      would not work, because it would not be possible to agree on the
      first packet of the interval.

      Double marking or multiplexed marking would work, but each
      measurement would only give information about the delay of a
      single path. However, by repeating the measurement multiple
      times, it is possible to get information about all the paths in
      the multipoint flow. This can be done in the case of a point-to-
      multipoint path, but it is more difficult to achieve in the case
      of a multipoint-to-multipoint path because of the multiple source
      routers.

   If we want to perform a delay measurement for more than one picked
   packet in the same marking period, and especially if we want to get
   delay measurements on a multipoint-to-multipoint basis, neither the
   single- nor the double-marking method is useful in the multipoint
   scenario, since they would not be representative of the entire flow.
   The packets can follow different paths with various delays, and in
   general it can be very difficult to recognize marked packets in a
   multipoint-to-multipoint path, especially when there is more than one
   per period.
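
   In contrast to the single-packet limitations above, the mean delay of
   Section 7.1.1 is representative of the whole multipoint path. As an
   illustration only, it can be sketched as the difference between two
   weighted averages, where each measurement point contributes its mean
   timestamp weighted by its packet counter for the marking period. The
   function names and the values below are hypothetical, and timestamps
   are assumed to be expressed in milliseconds on a common time base.

```python
from typing import Sequence, Tuple

# One entry per measurement point: (packet counter, mean timestamp in ms).
Point = Tuple[int, float]


def weighted_mean_timestamp(points: Sequence[Point]) -> float:
    """Average of the mean timestamps, weighted by packet counters."""
    total_packets = sum(count for count, _ in points)
    if total_packets == 0:
        raise ValueError("no packets counted in this marking period")
    return sum(count * ts for count, ts in points) / total_packets


def multipoint_mean_delay(input_points: Sequence[Point],
                          output_points: Sequence[Point]) -> float:
    """Mean one-way delay of a multipoint path for one marking period."""
    return (weighted_mean_timestamp(output_points)
            - weighted_mean_timestamp(input_points))


# Hypothetical values: two input nodes and two output nodes.
inputs = [(600, 0.0), (400, 2.0)]    # weighted mean: 0.8 ms
outputs = [(500, 5.0), (500, 7.0)]   # weighted mean: 6.0 ms
mean_delay_ms = multipoint_mean_delay(inputs, outputs)  # about 5.2 ms
```

   If packets are lost inside the multipoint path, the input and output
   averages are taken over different sets of packets, so the result
   should be read as an approximation in that case.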

   A desirable option is to monitor simultaneously all the paths of a
   multipoint path in the same marking period; for this purpose, hashing
   can be used, as reported in the next section.

   Note that, since the one-way delay measurement is done on a single-
   packet basis, it is always possible to calculate the two-way delay,
   but it is not straightforward, since it is necessary to couple the
   measurement on each single path with the opposite direction. In this
   case, the NMS can do the calculation.

7.2.2.  Hashing Selection Method

   RFCs 5474 [RFC5474] and 5475 [RFC5475] introduce sampling and
   filtering techniques for IP packet selection.

   The hash-based selection methodologies for delay measurement can work
   in a multipoint-to-multipoint path and MAY be used either coupled to
   mean delay or stand-alone.

   [I-D.mizrahi-ippm-marking] describes how to use the hash method (RFC
   5474 [RFC5474] and RFC 5475 [RFC5475]) combined with the Alternate-
   Marking method for point-to-point flows. It is also called Mixed
   Hashed Marking: the coupling of a marking method and a hashing
   technique is very useful, because the marking batches anchor the
   samples selected with hashing, and this simplifies the correlation of
   the hashing packets along the path.

   It is possible to use a basic-hash or a dynamic-hash method. One of
   the challenges of the basic approach is that the frequency of the
   sampled packets may vary considerably. For this reason, the dynamic
   approach has been introduced for point-to-point flows in order to
   have the desired and almost fixed number of samples for each
   measurement period. With hash-based sampling, the number of samples
   may vary considerably because it depends on the packet rate, which is
   variable. The dynamic approach helps to have an almost fixed number
   of samples for each marking period, and this is a better option for
   making regular measurements over time.
   In the hash-based sampling, Alternate Marking is used to create
   periods, so that hash-based samples are divided into batches, which
   allows anchoring the selected samples to their period. Moreover, in
   the dynamic hash-based sampling, by dynamically adapting the length
   of the hash value, the number of samples is bounded in each marking
   period.

   In a multipoint environment, the hashing selection MAY be the
   solution for performing delay measurements on specific packets and
   overcoming the single- and double-marking limitations.

8.  Synchronization and Timing

   It is important to consider the timing aspects, since out-of-order
   packets happen and have to be handled as well, as described in
   [I-D.fioccola-rfc8321bis].

   However, in a multisource situation, an additional issue has to be
   considered. With a multipoint path, the egress nodes will receive
   alternate marked packets in random order from different ingress
   nodes, and this must not affect the measurement.

   So, if we analyze a multipoint-to-multipoint path with more than one
   marking node, it is important to recognize the reference measurement
   interval. In general, the measurement interval for describing the
   results is the interval of the marking node that is most aligned with
   the start of the measurement, as reported in Figure 4.

   Note that the mark switching approach based on a fixed timer is
   considered in this document.

      time -> start         stop
      T(R1)   |-------------|
      T(R2)     |-------------|
      T(R3)        |------------|

                   Figure 4: Measurement Interval

   In Figure 4, it is assumed that the node with the earliest clock (R1)
   identifies the right starting and ending times of the measurement,
   but it is just an assumption, and other possibilities could occur.
   So, in this case, T(R1) is the measurement interval, and its
   recognition is essential in order to make comparisons with other
   active/passive/hybrid Packet Loss metrics.

   Regarding the timing constraints of the methodology,
   [I-D.fioccola-rfc8321bis] already describes two contributions that
   are taken into account: the clock error between network devices and
   the network delay between the measurement points.

   When we expand to a multipoint environment, we have to consider that
   there are more marking nodes that mark the traffic based on
   synchronized clock time. But, due to different synchronization
   issues that may happen, the marking batches can be of different
   lengths and with different offsets when they get mixed in a
   multipoint flow. The additional gap that results between the sources
   can be incorporated into A, which is the maximum clock skew between
   the network devices, as already defined in [I-D.fioccola-rfc8321bis].

   ...BBBBBBBBB | AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA | BBBBBBBBB...
                |<======================================>|
                |                   L                    |
   ...=========>|<==================><==================>|<==========...
                |       L/2                   L/2        |
                |<====>|                          |<====>|
                   d   |                          |   d
                       |<========================>|
                        available counting interval

                       Figure 5: Timing Aspects

   Moreover, it is assumed that each path of the multipoint flow can
   still be represented with a distinct normal distribution. So, for
   the aggregate multipoint path, the combination of normal
   distributions results in a new normal distribution. Under this
   assumption, the definition of the guard band d is still applicable as
   defined in [I-D.fioccola-rfc8321bis] and is given by:

      d = A + D_avg + 3*D_stddev,

   where A is the clock accuracy, D_avg is the average value of the
   network delay, and D_stddev is the standard deviation of the delay.
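
   As an illustration only, the guard band formula above and the
   available counting interval drawn in Figure 5 (the marking period L
   minus one guard band d at each end) can be evaluated as follows. The
   function names and the numeric values are hypothetical; all
   quantities are assumed to be expressed in the same time unit
   (milliseconds here).

```python
def guard_band(clock_accuracy: float,
               delay_avg: float,
               delay_stddev: float) -> float:
    """Guard band d = A + D_avg + 3*D_stddev."""
    return clock_accuracy + delay_avg + 3 * delay_stddev


def available_counting_interval(period: float, d: float) -> float:
    """Length of the counting interval of Figure 5: L - 2d."""
    return period - 2 * d


# Hypothetical measurement point: A = 1 ms, D_avg = 20 ms,
# D_stddev = 3 ms, marking period L = 1000 ms.
d = guard_band(clock_accuracy=1, delay_avg=20, delay_stddev=3)
interval = available_counting_interval(period=1000, d=d)

print(d)             # 30
print(interval)      # 940
print(interval > 0)  # True: counters can be read safely in this period
```

   In a multipoint path, the check would be repeated with the values
   that apply to each measurement point, so the most delayed or least
   accurate point constrains the shortest usable marking period.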

   As shown in Figure 5 and according to [I-D.fioccola-rfc8321bis], the
   condition that must be satisfied to enable the method to function
   properly is that the available counting interval must be > 0, and
   that means:

      L - 2d > 0.

   This formula needs to be verified for each measurement point on the
   multipoint path.

   Note that the timing considerations are valid for both packet loss
   and delay measurements.

9.  Results of the Multipoint Alternate Marking Experiment

   The methodology described in the previous sections can be applied to
   various performance measurement problems, as also explained in
   [I-D.fioccola-rfc8321bis].

   Either one or two flag bits might be available for marking in
   different deployments:

   One flag: packet loss measurement SHOULD be done as described in
      Section 6 by applying the network clustering partition described
      in Section 5. Delay measurement MAY be done according to the
      mean delay calculation representative of the multipoint path, as
      described in Section 7.1.1. The single-marking method based on
      the first/last packet of the interval cannot be applied, as
      mentioned in Section 7.2.1.

   Two flags: packet loss measurement SHOULD be done as described in
      Section 6 by applying the network clustering partition described
      in Section 5. Delay measurement SHOULD be done on a single-
      packet basis according to the double-marking method
      (Section 7.2.1). In this case, the mean delay calculation
      (Section 7.1.1) MAY also be used as a representative value of a
      multipoint path.

   One flag and hash-based selection: packet loss measurement SHOULD
      be done as described in Section 6 by applying the network
      clustering partition described in Section 5. Hash-based selection
      methodologies, introduced in Section 7.2.2, MAY be used for delay
      measurement.
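
   As an illustration only, the hash-based selection mentioned in the
   last item can be sketched as follows. This is not the procedure of
   [I-D.mizrahi-ippm-marking] or [RFC5475]; it assumes that a packet is
   selected when the low-order bits of a hash computed over its
   invariant bytes are all zero, and it models the dynamic approach by
   lengthening that bit pattern until the number of samples in the
   marking period fits a bound. SHA-256, the function names, and the
   offline (whole-period) processing are assumptions of this sketch.

```python
import hashlib
from typing import List, Sequence, Tuple


def hash_value(invariant_bytes: bytes) -> int:
    """32-bit hash over packet bytes that do not change along the path."""
    digest = hashlib.sha256(invariant_bytes).digest()
    return int.from_bytes(digest[:4], "big")


def is_selected(invariant_bytes: bytes, bits: int) -> bool:
    """Select the packet when the `bits` low-order hash bits are zero."""
    mask = (1 << bits) - 1
    return hash_value(invariant_bytes) & mask == 0


def select_bounded(packets: Sequence[bytes], max_samples: int,
                   max_bits: int = 32) -> Tuple[int, List[bytes]]:
    """Lengthen the pattern until at most `max_samples` are selected."""
    selected: List[bytes] = list(packets)
    for bits in range(max_bits + 1):
        selected = [p for p in packets if is_selected(p, bits)]
        if len(selected) <= max_samples:
            return bits, selected
    return max_bits, selected


# Hypothetical marking period: 1000 packets, at most 10 delay samples.
period_packets = [i.to_bytes(4, "big") for i in range(1000)]
bits_used, samples = select_bounded(period_packets, max_samples=10)
```

   Because the decision depends only on the packet content and on the
   pattern length, every measurement point that applies the same
   function and the same length selects the same packets, whichever
   path they follow inside the multipoint flow.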

   The experiment with Multipoint Alternate Marking methodologies
   confirmed the benefits of the Alternate Marking methodology described
   in [I-D.fioccola-rfc8321bis], as its extension to the general case of
   multipoint-to-multipoint scenarios.

   The Multipoint Alternate Marking Method is RECOMMENDED only for
   controlled domains, as per [I-D.fioccola-rfc8321bis].

10.  A Closed-Loop Performance-Management Approach

   The Multipoint Alternate-Marking framework that is introduced in this
   document adds flexibility to Performance Management (PM), because it
   can reduce the order of magnitude of the packet counters. This
   allows an SDN orchestrator to supervise, control, and manage PM in
   large networks.

   The monitoring network can be considered as a whole or split into
   clusters that are the smallest subnetworks (group-to-group segments),
   maintaining the packet-loss property for each subnetwork. The
   clusters can also be combined in new, connected subnetworks at
   different levels, depending on the detail we want to achieve.

   An SDN controller or a Network Management System (NMS) can calibrate
   performance measurements, since they are aware of the network
   topology. They can start without an in-depth examination. When
   needed (packet loss is measured or the delay is too high), the
   filtering criteria could be immediately reconfigured in order to
   perform a partition of the network by using clusters and/or different
   combinations of clusters. In this way, the problem can be localized
   in a specific cluster or a single combination of clusters, and a more
   detailed analysis can be performed step by step by successive
   approximation up to a point-to-point flow detailed analysis. This is
   the so-called "closed loop".
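
   As an illustration only, one iteration of this closed loop can be
   sketched for packet loss as follows: the whole network is checked
   first, and the per-cluster counters are read only if loss is
   detected. The function names, the callback that stands for
   activating and reading the measurement points of a cluster, and the
   counter values are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# (input counters, output counters) for one marking period.
Counters = Tuple[Sequence[int], Sequence[int]]


def loss(counters: Counters) -> int:
    """Packets entering minus packets leaving a multipoint path."""
    inputs, outputs = counters
    return sum(inputs) - sum(outputs)


def localize_loss(network: Counters,
                  clusters: Sequence[str],
                  read_cluster: Callable[[str], Counters]) -> List[str]:
    """Return the clusters showing loss, zooming in only when needed."""
    if loss(network) == 0:
        return []  # nothing to localize, keep the coarse measurement
    return [name for name in clusters if loss(read_cluster(name)) > 0]


# Hypothetical counters for the clusters of Figure 3.
per_cluster: Dict[str, Counters] = {
    "Cluster 1": ([1000], [400, 350, 250]),
    "Cluster 2": ([400, 350], [300, 200, 245]),
    "Cluster 3": ([300], [100, 200]),
    "Cluster 4": ([200], [200]),
}
whole_network: Counters = ([1000], [250, 245, 100, 200, 200])

suspects = localize_loss(whole_network, list(per_cluster),
                         per_cluster.__getitem__)
print(suspects)  # ['Cluster 2']
```

   A further step of the loop could repeat the same check on smaller
   flows inside the suspect cluster, down to a point-to-point analysis.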

   This approach can be called "network zooming" and can be performed in
   two different ways:

   1)  change the traffic filter and select more detailed flows;

   2)  activate new measurement points by defining more specific
       clusters.

   The network-zooming approach implies that some filters or rules are
   changed and that therefore there is a transient time to wait once the
   new network configuration takes effect. This time can be determined
   by the Network Orchestrator/Controller, based on the network
   conditions.

   For example, if the network zooming identifies the performance
   problem for the traffic coming from a specific source, we need to
   recognize the marked signal from this specific source node and its
   relative path. For this purpose, we can activate all the available
   measurement points and better specify the flow filter criteria (i.e.,
   5-tuple). As an alternative, it can be enough to select packets from
   the specific source for delay measurements; in this case, it is
   possible to apply the hashing technique, as mentioned in the previous
   sections.

   [I-D.song-opsawg-ifit-framework] defines an architecture where the
   centralized Data Collector and Network Management can apply the
   intelligent and flexible Alternate-Marking algorithm as previously
   described.

   As in [I-D.fioccola-rfc8321bis], it is possible to classify the
   traffic and mark a portion of the total traffic. For each period,
   the packet rate and bandwidth are calculated from the number of
   packets. In this way, the network orchestrator becomes aware if the
   traffic rate surpasses limits. In addition, more precision can be
   obtained by reducing the marking period; indeed, some implementations
   use a marking period of 1 sec or less.

   In addition, an SDN controller could also collect the measurement
   history.
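
   As an illustration only, the per-period packet rate and a simple
   limit check over a collected history can be sketched as follows.
   The function names, the limit, and the counter values are
   hypothetical; bandwidth is not computed here because it would also
   require byte counters.

```python
from typing import List, Sequence


def packet_rates(history: Sequence[int],
                 period_seconds: float) -> List[float]:
    """Packet rate (packets per second) for each marking period."""
    return [count / period_seconds for count in history]


def periods_over_limit(history: Sequence[int], period_seconds: float,
                       max_pps: float) -> List[int]:
    """Indexes of the marking periods whose rate exceeds `max_pps`."""
    rates = packet_rates(history, period_seconds)
    return [index for index, rate in enumerate(rates) if rate > max_pps]


# Hypothetical history: packet counters of three 1-second periods.
history = [900, 1200, 1500]
over = periods_over_limit(history, period_seconds=1.0, max_pps=1000.0)
print(over)  # [1, 2]
```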

   It is important to mention that the Multipoint Alternate Marking
   framework also helps Traffic Visualization. Indeed, this methodology
   is very useful for identifying which path or cluster is crossed by
   the flow.

11.  Security Considerations

   This document specifies a method of performing measurements that does
   not directly affect Internet security or applications that run on the
   Internet. However, implementation of this method must be mindful of
   security and privacy concerns, as explained in
   [I-D.fioccola-rfc8321bis].

12.  IANA Considerations

   This document has no IANA actions.

13.  Contributors

   Greg Mirsky
   Ericsson
   Email: gregimirsky@gmail.com

   Tal Mizrahi
   Huawei Technologies
   Email: tal.mizrahi.phd@gmail.com

   Xiao Min
   ZTE Corp.
   Email: xiao.min2@zte.com.cn

14.  Acknowledgements

   The authors would like to thank Martin Duke and Tommy Pauly for their
   assistance and their detailed and precious reviews.

15.  References

15.1.  Normative References

   [I-D.fioccola-rfc8321bis]
              Fioccola, G., Cociglio, M., Mirsky, G., Mizrahi, T., and
              T. Zhou, "Alternate-Marking Method", draft-fioccola-
              rfc8321bis-04 (work in progress), April 2022.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC5474]  Duffield, N., Ed., Chiou, D., Claise, B., Greenberg, A.,
              Grossglauser, M., and J. Rexford, "A Framework for Packet
              Selection and Reporting", RFC 5474, DOI 10.17487/RFC5474,
              March 2009.

   [RFC5475]  Zseby, T., Molina, M., Duffield, N., Niccolini, S., and F.
              Raspall, "Sampling and Filtering Techniques for IP Packet
              Selection", RFC 5475, DOI 10.17487/RFC5475, March 2009.

   [RFC5644]  Stephan, E., Liang, L., and A.
              Morton, "IP Performance Metrics (IPPM): Spatial and
              Multicast", RFC 5644, DOI 10.17487/RFC5644, October 2009.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

15.2.  Informative References

   [I-D.ietf-ippm-route]
              Alvarez-Hamelin, J. I., Morton, A., Fabini, J., Pignataro,
              C., and R. Geib, "Advanced Unidirectional Route Assessment
              (AURA)", draft-ietf-ippm-route-10 (work in progress),
              August 2020.

   [I-D.mizrahi-ippm-marking]
              Mizrahi, T., Fioccola, G., Cociglio, M., Chen, M., and G.
              Mirsky, "Marking Methods for Performance Measurement",
              draft-mizrahi-ippm-marking-00 (work in progress), October
              2021.

   [I-D.song-opsawg-ifit-framework]
              Song, H., Qin, F., Chen, H., Jin, J., and J. Shin, "A
              Framework for In-situ Flow Information Telemetry", draft-
              song-opsawg-ifit-framework-17 (work in progress), February
              2022.

   [I-D.zhou-ippm-enhanced-alternate-marking]
              Zhou, T., Fioccola, G., Liu, Y., Cociglio, M., Lee, S.,
              and W. Li, "Enhanced Alternate Marking Method", draft-
              zhou-ippm-enhanced-alternate-marking-09 (work in
              progress), February 2022.

   [IEEE-ACM-ToN-MPNPM]
              IEEE/ACM TRANSACTION ON NETWORKING, "Multipoint Passive
              Monitoring in Packet Networks",
              DOI 10.1109/TNET.2019.2950157, 2019.

   [IEEE-Network-PNPM]
              IEEE Network, "AM-PM: Efficient Network Telemetry using
              Alternate Marking", DOI 10.1109/MNET.2019.1800152, 2019.

   [RFC7011]  Claise, B., Ed., Trammell, B., Ed., and P. Aitken,
              "Specification of the IP Flow Information Export (IPFIX)
              Protocol for the Exchange of Flow Information", STD 77,
              RFC 7011, DOI 10.17487/RFC7011, September 2013.

   [RFC8889]  Fioccola, G., Ed., Cociglio, M., Sapio, A., and R.
              Sisto, "Multipoint Alternate-Marking Method for Passive
              and Hybrid Performance Monitoring", RFC 8889,
              DOI 10.17487/RFC8889, August 2020.

Appendix A.  Changes Log

   Changes from RFC 8889 in draft-fioccola-rfc8889bis-00 include:

   o  Minor editorial changes

   o  Removed section on "Examples of application"

   Changes in draft-fioccola-rfc8889bis-01 include:

   o  Considerations on BUM traffic

   o  Reference to RFC8321bis for the fragmentation part

   o  Revised section on "Delay Measurements on a Single-Packet Basis"

   o  Revised section on "Timing Aspects"

   Changes in draft-fioccola-rfc8889bis-02 include:

   o  Clarified the formula in the section on "Timing Aspects" to be
      aligned with RFC 8321

   o  Considerations on two-way delay measurements in both sections 8.1
      and 8.2 on delay measurements

   o  Clarified in section 4.1 on "Monitoring Network" that the
      description is done for one direction but it can easily be
      extended to all directions

   o  New section on "Results of the Multipoint Alternate Marking
      Experiment"

   Changes in draft-fioccola-rfc8889bis-03 include:

   o  Moved and renamed section on "Timing Aspects" as "Synchronization
      and Timing"

   o  Renamed old section on "Multipoint Packet Loss" as "Network Packet
      Loss"

   o  New section on "Multipoint Packet Loss Measurement"

   o  Renamed section on "Multipoint Performance Measurement" as
      "Extension of the Method to Multipoint Flows"

   Changes in draft-fioccola-rfc8889bis-04/draft-ietf-ippm-rfc8889bis-00
   include:

   o  Revised section 5.1 on "Algorithm for Clusters Partition"

   Changes in draft-ietf-ippm-rfc8889bis-01 include:

   o  New section on "Summary of Changes from RFC 8889"

Authors' Addresses

   Giuseppe Fioccola (editor)
   Huawei Technologies
   Riesstrasse, 25
   Munich 80992
   Germany

   Email: giuseppe.fioccola@huawei.com

   Mauro Cociglio
   Telecom Italia
   Via
   Reiss Romoli, 274
   Torino 10148
   Italy

   Email: mauro.cociglio@telecomitalia.it

   Amedeo Sapio
   Intel Corporation
   4750 Patrick Henry Dr.
   Santa Clara, CA 95054
   USA

   Email: amedeo.sapio@intel.com

   Riccardo Sisto
   Politecnico di Torino
   Corso Duca degli Abruzzi, 24
   Torino 10129
   Italy

   Email: riccardo.sisto@polito.it

   Tianran Zhou
   Huawei Technologies
   156 Beiqing Rd.
   Beijing 100095
   China

   Email: zhoutianran@huawei.com