idnits 2.17.00 (12 Aug 2021)

/tmp/idnits44063/draft-fioccola-rfc8889bis-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([RFC8889]), which it
     shouldn't. Please replace those with straight textual mentions of the
     documents in question.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (November 19, 2021) is 183 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Downref: Normative reference to an Informational RFC: RFC 5474

  == Outdated reference: A later version (-04) exists of
     draft-fioccola-rfc8321bis-00

  == Outdated reference: A later version (-17) exists of
     draft-song-opsawg-ifit-framework-16

  == Outdated reference: A later version (-09) exists of
     draft-zhou-ippm-enhanced-alternate-marking-07

     Summary: 2 errors (**), 0 flaws (~~), 4 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                   G. Fioccola, Ed.
Internet-Draft                                       Huawei Technologies
Obsoletes: 8889 (if approved)                                M. Cociglio
Intended status: Standards Track                          Telecom Italia
Expires: May 23, 2022                                           A. Sapio
                                                       Intel Corporation
                                                                R. Sisto
                                                   Politecnico di Torino
                                                       November 19, 2021

                  Multipoint Alternate-Marking Method
                      draft-fioccola-rfc8889bis-00

Abstract

   This document generalizes and expands Alternate-Marking methodology
   to measure any kind of unicast flow whose packets can follow several
   different paths in the network -- in wider terms, a
   multipoint-to-multipoint network. For this reason, the technique
   here described is called "Multipoint Alternate Marking". This
   document obsoletes [RFC8889].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 23, 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include
   Simplified BSD License text as described in Section 4.e of the Trust
   Legal Provisions and are provided without warranty as described in
   the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Terminology
     2.1.  Correlation with RFC 5644
   3.  Flow Classification
   4.  Multipoint Performance Measurement
     4.1.  Monitoring Network
   5.  Multipoint Packet Loss
   6.  Network Clustering
     6.1.  Algorithm for Clusters Partition
   7.  Timing Aspects
   8.  Multipoint Delay and Delay Variation
     8.1.  Delay Measurements on a Multipoint-Paths Basis
       8.1.1.  Single-Marking Measurement
     8.2.  Delay Measurements on a Single-Packet Basis
       8.2.1.  Single- and Double-Marking Measurement
       8.2.2.  Hashing Selection Method
   9.  A Closed-Loop Performance-Management Approach
   10. Security Considerations
   11. IANA Considerations
   12. Contributors
   13. Acknowledgements
   14. References
     14.1.  Normative References
     14.2.  Informative References
   Appendix A.  Changes Log
   Authors' Addresses

1.  Introduction

   The Alternate-Marking method, as described in
   [I-D.fioccola-rfc8321bis], is applicable to a point-to-point path.
   The extension proposed in this document applies to the most general
   case of multipoint-to-multipoint path and enables flexible and
   adaptive performance measurements in a managed network.

   The Alternate-Marking methodology described in
   [I-D.fioccola-rfc8321bis] allows the synchronization of the
   measurements in different points by dividing the packet flow into
   batches. So it is possible to get coherent counters and show what is
   happening in every marking period for each monitored flow. The
   monitoring parameters are the packet counter and timestamps of a flow
   for each marking period. Note that additional details about the
   applicability of the Alternate-Marking methodology are described in
   [I-D.fioccola-rfc8321bis] while implementation details can be found
   in the paper "AM-PM: Efficient Network Telemetry using Alternate
   Marking" [IEEE-Network-PNPM].

   There are some applications of the Alternate-Marking method where
   there are many monitored flows and nodes. Multipoint Alternate
   Marking aims to reduce these numbers and makes the performance
   monitoring more flexible in case a detailed analysis is not needed.
   For instance, by considering n measurement points and m monitored
   flows, the order of magnitude of the packet counters for each time
   interval is n*m*2 (1 per color). The number of measurement points
   and monitored flows may vary and depends on the portion of the
   network we are monitoring (core network, metro network, access
   network) and the granularity (for each service, each customer).
   So if both n and m are large, the number of packet counters grows
   considerably, and Multipoint Alternate Marking offers a tool to
   control these parameters.

   The approach presented in this document is applied only to unicast
   flows and not to multicast. Broadcast, Unknown Unicast, and
   Multicast (BUM) traffic is not considered here, because traffic
   replication is not covered by the Multipoint Alternate-Marking
   method. Furthermore, it is applicable to anycast flows, and
   Equal-Cost Multipath (ECMP) paths can also be easily monitored with
   this technique.

   In short, [I-D.fioccola-rfc8321bis] applies to point-to-point unicast
   flows and BUM traffic, while this document and its Clustered
   Alternate-Marking method are valid for multipoint-to-multipoint
   unicast flows, anycast, and ECMP flows.

   Therefore, the Alternate-Marking method can be extended to any kind
   of multipoint-to-multipoint paths, and the network-clustering
   approach presented in this document is the formalization of how to
   implement this property and allow a flexible and optimized
   performance measurement support for network management in every
   situation.

   Without network clustering, it is possible to apply Alternate Marking
   only to the whole network or per single flow. Instead, with network
   clustering, it is possible to use the partition of the network into
   clusters at different levels in order to reach the needed degree of
   detail. In some circumstances, it is possible to monitor a
   multipoint network by analyzing the network clustering, without
   examining it in depth. In case of problems (packet loss is measured
   or the delay is too high), the filtering criteria could be made more
   specific in order to perform a detailed analysis by using a different
   combination of clusters up to a per-flow measurement as described in
   [I-D.fioccola-rfc8321bis].

   This approach fits very well with the Closed-Loop Network and
   Software-Defined Network (SDN) paradigm, where the SDN orchestrator
   and the SDN controllers are the brains of the network and can manage
   flow control to the switches and routers and, in the same way, can
   calibrate the performance measurements depending on the desired
   accuracy. An SDN controller application can orchestrate how
   accurately the network performance monitoring is set up by applying
   the Multipoint Alternate Marking as described in this document.

   It is important to underline that, as an extension of
   [I-D.fioccola-rfc8321bis], this is a methodology document, so the
   mechanism that can be used to transmit the counters and the
   timestamps is out of scope here, and the implementation is open.
   Several options are possible -- e.g., see "Enhanced Alternate Marking
   Method" [I-D.zhou-ippm-enhanced-alternate-marking].

   Note that the fragmented packets case can be managed with the
   Alternate-Marking methodology only if fragmentation happens outside
   the portion of the network that is monitored. This is always true
   for both [I-D.fioccola-rfc8321bis] and Multipoint Alternate Marking,
   as explained here.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.  Terminology

   The definitions of the basic terms are identical to those found in
   Alternate Marking [I-D.fioccola-rfc8321bis]. Recall that
   [I-D.fioccola-rfc8321bis] is valid for point-to-point unicast
   flows and BUM traffic.

   The important new terms that need to be explained are listed below:

   Multipoint Alternate Marking:  Extension to
      [I-D.fioccola-rfc8321bis], valid for multipoint-to-multipoint
      unicast flows, anycast, and ECMP flows. It can also be referred
      to as Clustered Alternate Marking.

   Flow definition:  The concept of flow is generalized in this
      document. The identification fields are selected without any
      constraints and, in general, the flow can be a
      multipoint-to-multipoint flow, as a result of aggregate
      point-to-point flows.

   Monitoring Network:  Identified with the nodes of the network that
      are the measurement points (MPs) and the links that are the
      connections between MPs. The monitoring network graph depends on
      the flow definition, so it can represent a specific flow or the
      entire network topology as an aggregate of all the flows.

   Cluster:  Smallest identifiable subnetwork of the entire monitoring
      network graph that still satisfies the condition that the number
      of packets that go in is the same as the number that go out.

   Multipoint metrics:  Packet loss, delay, and delay variation are
      extended to the case of multipoint flows. It is possible to
      compute these metrics on the basis of multipoint paths in order to
      associate the measurements to a cluster, a combination of
      clusters, or the entire monitored network. For delay and delay
      variation, it is also possible to define the metrics on a
      single-packet basis, which means that the multipoint path is used
      to easily couple packets between input and output nodes of a
      multipoint path.

   The next section highlights the correlation with the terms used in
   RFC 5644 [RFC5644].

2.1.  Correlation with RFC 5644

   RFC 5644 [RFC5644] is limited to active measurements using a single
   source packet or stream. Its scope is also limited to observations
   of corresponding packets along the path (spatial metric) and at one
   or more destinations (one-to-group) along the path.

   Instead, the scope of this memo is to define multiparty metrics for
   passive and hybrid measurements in a group-to-group topology with
   multiple sources and destinations.

   RFC 5644 [RFC5644] introduces metric names that can be reused here
   but have to be extended and rephrased to be applied to the
   Alternate-Marking schema:

   a.  the multiparty metrics are not only one-to-group metrics but can
       be also group-to-group metrics;

   b.  the spatial metrics, used for measuring the performance of
       segments of a source to destination path, are applied here to
       group-to-group segments (called clusters).

3.  Flow Classification

   A unicast flow is identified by all the packets having a set of
   common characteristics. This definition is inspired by RFC 7011
   [RFC7011].

   As an example, by considering a flow as all the packets sharing the
   same source IP address or the same destination IP address, it is easy
   to understand that the resulting pattern will not be a point-to-point
   connection, but a point-to-multipoint or multipoint-to-point
   connection.

   In general, a flow can be defined by a set of selection rules used to
   match a subset of the packets processed by the network device. These
   rules specify a set of Layer 3 and Layer 4 header fields
   (identification fields) and the relative values that must be found in
   matching packets.

   The choice of the identification fields directly affects the type of
   paths that the flow would follow in the network. In fact, it is
   possible to relate a set of identification fields with the pattern of
   the resulting graphs, as listed in Figure 1.

   A TCP 5-tuple usually identifies flows following either a single path
   or a point-to-point multipath (in the case of load balancing). By
   contrast, a single source address selects aggregate flows following
   a point-to-multipoint pattern, while a multipoint-to-point pattern
   can be the result of matching on a single destination address. In
   the case where a selection rule and its reverse are used for
   bidirectional measurements, they can correspond to a
   point-to-multipoint in one direction and a multipoint-to-point in
   the opposite direction.

   So the flows to be monitored are selected at the monitoring points
   using packet selection rules, which can also change the pattern of
   the monitored network.

   Note that, more generally, the flow can be defined at different
   levels based on the potential encapsulation, and additional
   conditions that are not in the packet header can also be included as
   part of matching criteria.

   The Alternate-Marking method is applicable only to a single path (and
   partially to a one-to-one multipath), so the extension proposed in
   this document is suitable also for the most general case of
   multipoint-to-multipoint, which embraces all the other patterns of
   Figure 1.

   point-to-point single path
       +------+      +------+      +------+
   ---<>  R1  <>----<>  R2  <>----<>  R3  <>---
       +------+      +------+      +------+

   point-to-point multipath
                    +------+
                   <>  R2  <>
                  / +------+ \
                 /            \
       +------+ /              \ +------+
   ---<>  R1  <>                <>  R4  <>---
       +------+ \              / +------+
                 \            /
                  \ +------+ /
                   <>  R3  <>
                    +------+

   point-to-multipoint
                               +------+
                              <>  R4  <>---
                             / +------+
                   +------+ /
                  <>  R2  <>
                 / +------+ \
       +------+ /            \ +------+
   ---<>  R1  <>              <>  R5  <>---
       +------+ \              +------+
                 \ +------+
                  <>  R3  <>
                   +------+ \
                             \ +------+
                              <>  R6  <>---
                               +------+

   multipoint-to-point
       +------+
   ---<>  R1  <>
       +------+ \
                 \ +------+
                  <>  R4  <>
                 / +------+ \
       +------+ /            \ +------+
   ---<>  R2  <>              <>  R6  <>---
       +------+              / +------+
                   +------+ /
                  <>  R5  <>
                 / +------+
       +------+ /
   ---<>  R3  <>
       +------+

   multipoint-to-multipoint
       +------+                +------+
   ---<>  R1  <>              <>  R6  <>---
       +------+ \            / +------+
                 \ +------+ /
                  <>  R4  <>
                   +------+ \
       +------+              \ +------+
   ---<>  R2  <>              <>  R7  <>---
       +------+ \            / +------+
                 \ +------+ /
                  <>  R5  <>
                 / +------+ \
       +------+ /            \ +------+
   ---<>  R3  <>              <>  R8  <>---
       +------+                +------+

                     Figure 1: Flow Classification

   The case of unicast flow is considered in Figure 1. The anycast flow
   is also in scope, because there is no replication and only a single
   node from the anycast group receives the traffic, so it can be viewed
   as a special case of unicast flow. Furthermore, an ECMP flow is in
   scope by definition, since it is a point-to-multipoint unicast flow.

4.  Multipoint Performance Measurement

   By using the Alternate-Marking method, only point-to-point paths can
   be monitored. To have an IP (TCP/UDP) flow that follows a
   point-to-point path, we have to define, with a specific value, 5
   identification fields (IP Source, IP Destination, Transport Protocol,
   Source Port, Destination Port).

   Multipoint Alternate Marking enables the performance measurement for
   multipoint flows selected by identification fields without any
   constraints (even the entire network production traffic). It is also
   possible to use multiple marking points for the same monitored flow.

4.1.  Monitoring Network

   The monitoring network is deduced from the production network by
   identifying the nodes of the graph that are the measurement points,
   and the links that are the connections between measurement points.

   There are some techniques that can help with the building of the
   monitoring network (as an example, see [I-D.ietf-ippm-route]). In
   general, there are different options: the monitoring network can be
   obtained by considering all the possible paths for the traffic or
   periodically checking the traffic (e.g., daily, weekly, monthly) and
   updating the graph as appropriate, but this is up to the Network
   Management System (NMS) configuration.

   So a graph model of the monitoring network can be built according to
   the Alternate-Marking method: the monitored interfaces and links are
   identified. Only the measurement points and links where the traffic
   has flowed have to be represented in the graph.

   Figure 2 shows a simple example of a monitoring network graph:

                                             +------+
                                            <>  R6  <>---
                                           / +------+
                    +------+     +------+ /
                   <>  R2  <>---<>  R4  <>
                  / +------+ \   +------+ \
                 /            \            \ +------+
       +------+ /   +------+   \ +------+   <>  R7  <>---
   ---<>  R1  <>---<>  R3  <>---<>  R5  <>   +------+
       +------+ \   +------+ \   +------+ \
                 \            \            \ +------+
                  \            \            <>  R8  <>---
                   \            \            +------+
                    \            \
                     \            \ +------+
                      \            <>  R9  <>---
                       \            +------+
                        \
                         \ +------+
                          <>  R10 <>---
                           +------+

                  Figure 2: Monitoring Network Graph

   Each monitoring point is characterized by the packet counter that
   refers only to a marking period of the monitored flow.

   The same is also applicable for the delay, but it will be described
   in the following sections.

5.  Multipoint Packet Loss

   Since all the packets of the considered flow leaving the network have
   previously entered the network, the number of packets counted by all
   the input nodes is always greater than, or equal to, the number of
   packets counted by all the output nodes. Noninitial fragments are
   not considered here.

   The assumption is the use of the Alternate-Marking method. In the
   case of no packet loss occurring in the marking period, if all the
   input and output points of the network domain to be monitored are
   measurement points, the sum of the number of packets on all the
   ingress interfaces equals the number on egress interfaces for the
   monitored flow. In this circumstance, if no packet loss occurs, the
   intermediate measurement points only have the task of splitting the
   measurement.

   It is possible to define the Network Packet Loss of one monitored
   flow for a single period. In a packet network, the number of lost
   packets is the number of packets counted by the input nodes minus the
   number of packets counted by the output nodes. This is true for
   every packet flow in each marking period.

   The monitored network packet loss with n input nodes and m output
   nodes is given by:

   PL = (PI1 + PI2 +...+ PIn) - (PO1 + PO2 +...+ POm)

   where:

   PL is the network packet loss (number of lost packets)

   PIi is the number of packets flowed through the i-th input node in
   this period

   POj is the number of packets flowed through the j-th output node in
   this period

   The equation is applied on a per-time-interval basis and a per-flow
   basis:

      The reference interval is the Alternate-Marking period, as defined
      in [I-D.fioccola-rfc8321bis].

      The flow definition is generalized here. Indeed, as described
      before, a multipoint packet flow is considered, and the
      identification fields can be selected without any constraints.

6.  Network Clustering

   The previous equation can determine the number of packets lost
   globally in the monitored network, exploiting only the data provided
   by the counters in the input and output nodes.

   In addition, it is also possible to leverage the data provided by the
   other counters in the network to converge on the smallest
   identifiable subnetworks where the losses occur. These subnetworks
   are named "clusters".

   A cluster graph is a subnetwork of the entire monitoring network
   graph that still satisfies the packet loss equation (introduced in
   the previous section), where PL in this case is the number of packets
   lost in the cluster. As for the entire monitoring network graph, the
   cluster is defined on a per-flow basis.

   For this reason, a cluster should contain all the arcs emanating from
   its input nodes and all the arcs terminating at its output nodes.
   This ensures that we can count all the packets (and only those)
   exiting an input node again at the output node, whatever path they
   follow.
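
   The packet loss equation of Section 5 applies unchanged to the whole
   monitored network or to a single cluster. The following is a minimal
   sketch in Python, not part of the methodology itself: the function
   name and the counter values are hypothetical, and the counters are
   assumed to refer to the same flow and the same marking period.

```python
def multipoint_packet_loss(input_counts, output_counts):
    """Return PL = (PI1 + ... + PIn) - (PO1 + ... + POm) for one marking period.

    input_counts:  packets counted by each input node (PIi)
    output_counts: packets counted by each output node (POj)
    """
    loss = sum(input_counts) - sum(output_counts)
    if loss < 0:
        # More packets out than in means the counters do not belong to the
        # same flow and period, or the node set is not a valid cluster.
        raise ValueError("output nodes counted more packets than input nodes")
    return loss


# Hypothetical counters: three input nodes and two output nodes.
pi = [1000, 750, 250]
po = [1200, 795]
print(multipoint_packet_loss(pi, po))  # 5
```

   The same call can be made with the counters of one cluster's input
   and output nodes to obtain the loss inside that cluster.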

   In a completely monitored unidirectional network (a network where
   every network interface is monitored), each network device
   corresponds to a cluster, and each physical link corresponds to two
   clusters (one for each device).

   Clusters can have different sizes depending on the flow-filtering
   criteria adopted.

   Moreover, sometimes clusters can be optionally simplified. For
   example, when two monitored interfaces are divided by a single router
   (one is the input interface, the other is the output interface, and
   the router has only these two interfaces), instead of counting
   exactly twice, upon entering and leaving, it is possible to consider
   a single measurement point. In this case, we do not care about the
   internal packet loss of the router.

   It is worth highlighting that it might also be convenient to define
   clusters based on the topological information so that they are
   applicable to all the possible flows in the monitored network.

6.1.  Algorithm for Clusters Partition

   A simple algorithm can be applied in order to split our monitoring
   network into clusters. This can be done for each direction
   separately. The clusters partition is based on the monitoring
   network graph, which can be valid for a specific flow or can also be
   general and valid for the entire network topology.

   It is a two-step algorithm:

   o  Group the links where there is the same starting node;

   o  Join the grouped links with at least one ending node in common.

   Considering that the links are unidirectional, the first step implies
   listing all the links as connections between two nodes and grouping
   the different links if they have the same starting node. Note that
   it is possible to start from any link, and the procedure will work.
   Following this classification, the second step implies joining, where
   applicable, the groups classified in the first step by looking at the
   ending nodes. If different groups have at least one common ending
   node, they are put together and belong to the same set. After the
   application of the two steps of the algorithm, each one of the
   composed sets of links, together with the endpoint nodes, constitutes
   a cluster.

   In our monitoring network graph example, it is possible to identify
   the clusters partition by applying this two-step algorithm.

   The first step identifies the following groups:

   1.  Group 1: (R1-R2), (R1-R3), (R1-R10)

   2.  Group 2: (R2-R4), (R2-R5)

   3.  Group 3: (R3-R5), (R3-R9)

   4.  Group 4: (R4-R6), (R4-R7)

   5.  Group 5: (R5-R8)

   And then, the second step builds the clusters partition (in
   particular, we can underline that Groups 2 and 3 connect together,
   since R5 is in common):

   1.  Cluster 1: (R1-R2), (R1-R3), (R1-R10)

   2.  Cluster 2: (R2-R4), (R2-R5), (R3-R5), (R3-R9)

   3.  Cluster 3: (R4-R6), (R4-R7)

   4.  Cluster 4: (R5-R8)

   The flow direction here considered is from left to right. For the
   opposite direction, the same reasoning can be applied, and in this
   example, you get the same clusters partition.
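
   The two-step algorithm above can be sketched in a few lines of
   Python. This is only an illustration under stated assumptions: the
   function name cluster_partition and the representation of a link as
   a (start, end) tuple are choices made for this example and are not
   defined by the methodology. The link list is the one of the
   monitoring network graph example.

```python
from collections import defaultdict


def cluster_partition(links):
    """Partition unidirectional links (start, end) into clusters."""
    # Step 1: group the links that have the same starting node.
    groups = defaultdict(list)
    for start, end in links:
        groups[start].append((start, end))

    # Step 2: join the groups that share at least one ending node.
    clusters = []
    for group in groups.values():
        merged = list(group)
        ends = {end for _, end in group}
        remaining = []
        for cluster in clusters:
            cluster_ends = {end for _, end in cluster}
            if ends & cluster_ends:
                merged = cluster + merged
                ends |= cluster_ends
            else:
                remaining.append(cluster)
        clusters = remaining + [merged]
    return clusters


# Links of the example graph, flow direction from left to right.
links = [
    ("R1", "R2"), ("R1", "R3"), ("R1", "R10"),
    ("R2", "R4"), ("R2", "R5"),
    ("R3", "R5"), ("R3", "R9"),
    ("R4", "R6"), ("R4", "R7"),
    ("R5", "R8"),
]
for number, cluster in enumerate(cluster_partition(links), start=1):
    print(f"Cluster {number}:", ", ".join(f"({a}-{b})" for a, b in cluster))
```

   With the example links, the sketch yields the same four clusters
   listed above, with Groups 2 and 3 joined because both end in R5.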

   In the end, the following 4 clusters are obtained:

   Cluster 1
                    +------+
                   <>  R2  <>---
                  / +------+
                 /
       +------+ /   +------+
   ---<>  R1  <>---<>  R3  <>---
       +------+ \   +------+
                 \
                  \
                   \
                    \
                     \
                      \
                       \
                        \
                         \ +------+
                          <>  R10 <>---
                           +------+

   Cluster 2
       +------+     +------+
   ---<>  R2  <>---<>  R4  <>---
       +------+ \   +------+
                 \
       +------+   \ +------+
   ---<>  R3  <>---<>  R5  <>---
       +------+ \   +------+
                 \
                  \
                   \
                    \
                     \ +------+
                      <>  R9  <>---
                       +------+

   Cluster 3
                   +------+
                  <>  R6  <>---
                 / +------+
       +------+ /
   ---<>  R4  <>
       +------+ \
                 \ +------+
                  <>  R7  <>---
                   +------+

   Cluster 4
       +------+
   ---<>  R5  <>
       +------+ \
                 \ +------+
                  <>  R8  <>---
                   +------+

                      Figure 3: Clusters Example

   There are clusters with more than two nodes as well as two-node
   clusters. In the two-node clusters, the loss is on the link (Cluster
   4). In more-than-two-node clusters, the loss is on the cluster, but
   we cannot know in which link (Cluster 1, 2, or 3).

   In this way, the calculation of packet loss can be made on a cluster
   basis. Note that the packet counters for each marking period permit
   calculating the packet rate on a cluster basis, so Committed
   Information Rate (CIR) and Excess Information Rate (EIR) could also
   be deduced on a cluster basis.

   Obviously, by combining some clusters in a new connected subnetwork
   (called a "super cluster"), the packet-loss rule is still true.

   In this way, in a very large network, there is no need to configure
   detailed filter criteria to inspect the traffic. You can check a
   multipoint network and, in case of problems, go deep with a
   step-by-step cluster analysis, but only for the cluster or
   combination of clusters where the problem happens.

   In summary, once a flow is defined, the algorithm to build the
   clusters partition is based on topological information; therefore, it
   considers all the possible links and nodes crossed by the given flow,
   even if there is no traffic. So, if the flow does not enter or
   traverse all the nodes, the counters have a nonzero value for the
   involved nodes and a zero value for the other nodes without traffic;
   but in the end, all the formulas are still valid.

   The algorithm described above is an iterative clustering algorithm,
   but it is also possible to apply a recursive clustering algorithm by
   using the node-node adjacency matrix representation
   [IEEE-ACM-ToN-MPNPM].

   The complete and mathematical analysis of the possible algorithms for
   clusters partition, including the considerations in terms of
   efficiency and a comparison between the different methods, is in the
   paper [IEEE-ACM-ToN-MPNPM].

7.  Timing Aspects

   It is important to consider the timing aspects, since out-of-order
   packets happen and have to be handled as well, as described in
   [I-D.fioccola-rfc8321bis]. However, in a multisource situation, an
   additional issue has to be considered. With a multipoint path, the
   egress nodes will receive alternate marked packets in random order
   from different ingress nodes, and this must not affect the
   measurement.

   So, if we analyze a multipoint-to-multipoint path with more than one
   marking node, it is important to recognize the reference measurement
   interval. In general, the measurement interval for describing the
   results is the interval of the marking node that is most aligned with
   the start of the measurement, as reported in Figure 4.

   Note that the mark switching approach based on a fixed timer is
   considered in this document.

   time -> start      stop
   T(R1)   |-------------|
   T(R2)     |-------------|
   T(R3)        |------------|

                    Figure 4: Measurement Interval

   In Figure 4, it is assumed that the node with the earliest clock (R1)
   identifies the right starting and ending times of the measurement,
   but it is just an assumption, and other possibilities could occur.
   So, in this case, T(R1) is the measurement interval, and its
   recognition is essential in order to make comparisons with other
   active/passive/hybrid Packet Loss metrics.

   When we expand to multipoint-to-multipoint flows, we have to consider
   that all source nodes mark the traffic, and this adds more
   complexity.

   Regarding the timing aspects of the methodology,
   [I-D.fioccola-rfc8321bis] already describes two contributions that
   are taken into account: the clock error between network devices and
   the network delay between measurement points.

   But we should now consider an additional contribution. Since all
   source nodes mark the traffic, the source measurement intervals can
   be of different lengths and with different offsets, and this mismatch
   m can be added to d, as shown in Figure 5.

   ...BBBBBBBBB | AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA | BBBBBBBBB...
                |<======================================>|
                |                   L                    |
   ...=========>|<==================><==================>|<==========...
                |        L/2                 L/2         |
                |<=><===>|                      |<===><=>|
                  m   d  |                      |  d   m
                         |<====================>|
                        available counting interval

             Figure 5: Timing Aspects for Multipoint Paths

   So the misalignment between the marking source routers gives an
   additional constraint, and the value of m is added to d (which
   already includes clock error and network delay).

   Thus, three different possible contributions are considered: clock
   error between network devices, network delay between measurement
   points, and the misalignment between the marking source routers.
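
   Reading Figure 5, the marking period of length L loses m plus d on
   each side, so the interval left for counting is L - 2m - 2d. The
   following Python sketch checks whether that interval is positive at
   a measurement point. The function names and the sample values (in
   milliseconds) are hypothetical and chosen only for illustration.

```python
def available_counting_interval(period, misalignment, clock_and_delay):
    """Return L - 2m - 2d, all values expressed in the same time unit.

    period:          L, length of the marking period
    misalignment:    m, misalignment between the marking source routers
    clock_and_delay: d, clock error plus network delay
    """
    return period - 2 * misalignment - 2 * clock_and_delay


def counting_is_possible(period, misalignment, clock_and_delay):
    """True when the available counting interval is greater than zero."""
    return available_counting_interval(period, misalignment, clock_and_delay) > 0


# Hypothetical values in milliseconds: L = 60000, m = 1000, d = 500.
print(available_counting_interval(60000, 1000, 500))  # 57000
print(counting_is_possible(60000, 1000, 500))         # True
print(counting_is_possible(2000, 600, 500))           # False
```

   The check has to be repeated for each measurement point on the
   multipoint path, since m and d can differ from point to point.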
746 In the end, the condition that must be satisfied to enable the method 747 to function properly is that the available counting interval must be 748 > 0, and that means: 750 L - 2m - 2d > 0. 752 This formula needs to be verified for each measurement point on the 753 multipoint path, where m is misalignment between the marking source 754 routers, while d, already introduced in [I-D.fioccola-rfc8321bis], 755 takes into account clock error and network delay between network 756 nodes. Therefore, the mismatch between measurement intervals must 757 satisfy this condition. 759 Note that the timing considerations are valid for both packet loss 760 and delay measurements. 762 8. Multipoint Delay and Delay Variation 764 The same line of reasoning can be applied to delay and delay 765 variation. Similarly to the delay measurements defined in 766 [I-D.fioccola-rfc8321bis], the marking batches anchor the samples to 767 a particular period, and this is the time reference that can be used. 768 It is important to highlight that both delay and delay-variation 769 measurements make sense in a multipoint path. The delay variation is 770 calculated by considering the same packets selected for measuring the 771 delay. 773 In general, it is possible to perform delay and delay-variation 774 measurements on the basis of multipoint paths or single packets: 776 o Delay measurements on the basis of multipoint paths mean that the 777 delay value is representative of an entire multipoint path (e.g., 778 the whole multipoint network, a cluster, or a combination of 779 clusters). 781 o Delay measurements on a single-packet basis mean that you can use 782 a multipoint path just to easily couple packets between input and 783 output nodes of a multipoint path, as described in the following 784 sections. 786 8.1. Delay Measurements on a Multipoint-Paths Basis 788 8.1.1. 
Single-Marking Measurement 790 Mean delay and mean delay-variation measurements can also be 791 generalized to the case of multipoint flows. It is possible to 792 compute the average one-way delay of packets in one block, a cluster, 793 or the entire monitored network. 795 The average latency can be measured as the difference between the 796 weighted averages of the mean timestamps of the sets of output and 797 input nodes. This means that, in the calculation, it is possible to 798 weight the timestamps by the number of packets seen at each 799 endpoint. 801 8.2. Delay Measurements on a Single-Packet Basis 803 8.2.1. Single- and Double-Marking Measurement 805 Delay and delay-variation measurements relative to only one picked 806 packet per period (both single and double marked) can be performed in 807 the multipoint scenario, with some limitations: 809 Single marking based on the first/last packet of the interval 810 would not work, because it would not be possible to agree on the 811 first packet of the interval. 813 Double marking or multiplexed marking would work, but each 814 measurement would only give information about the delay of a 815 single path. However, by repeating the measurement multiple 816 times, it is possible to get information about all the paths in 817 the multipoint flow. This can be done in the case of a point-to- 818 multipoint path, but it is more difficult to achieve in the case 819 of a multipoint-to-multipoint path because of the multiple source 820 routers. 822 If we want to perform a delay measurement for more than one picked 823 packet in the same marking period, and especially if we want to get 824 delay measurements on a multipoint-to-multipoint basis, neither the 825 single- nor the double-marking method is useful in the multipoint 826 scenario, since they would not be representative of the entire flow.
827 The packets can follow different paths with various delays, and in 828 general it can be very difficult to recognize marked packets in a 829 multipoint-to-multipoint path, especially in the case when there is 830 more than one per period. 832 A desirable option is to monitor simultaneously all the paths of a 833 multipoint path in the same marking period; for this purpose, hashing 834 can be used, as reported in the next section. 836 8.2.2. Hashing Selection Method 838 RFCs 5474 [RFC5474] and 5475 [RFC5475] introduce sampling and 839 filtering techniques for IP packet selection. 841 The hash-based selection methodologies for delay measurement can work 842 in a multipoint-to-multipoint path and MAY be used either coupled to 843 mean delay or stand-alone. 845 [I-D.mizrahi-ippm-compact-alternate-marking] introduces how to use 846 the hash method (RFC 5474 [RFC5474] and RFC 5475 [RFC5475]) combined 847 with the Alternate-Marking method for point-to-point flows. It is 848 also called Mixed Hashed Marking: the coupling of a marking method 849 and hashing technique is very useful, because the marking batches 850 anchor the samples selected with hashing, and this simplifies the 851 correlation of the hashing packets along the path. 853 It is possible to use a basic-hash or a dynamic-hash method. One of 854 the challenges of the basic approach is that the frequency of the 855 sampled packets may vary considerably. For this reason, the dynamic 856 approach has been introduced for point-to-point flows in order to 857 have the desired and almost fixed number of samples for each 858 measurement period. Using the hash-based sampling, the number of 859 samples may vary a lot because it depends on the packet rate that is 860 variable. The dynamic approach helps to have an almost fixed number 861 of samples for each marking period, and this is a better option for 862 making regular measurements over time. 
In hash-based sampling, 863 Alternate Marking is used to create periods, so that hash-based 864 samples are divided into batches, which allows anchoring the selected 865 samples to their period. Moreover, in dynamic hash-based 866 sampling, by dynamically adapting the length of the hash value, the 867 number of samples is bounded in each marking period. This can be 868 realized by choosing the maximum number of samples (NMAX) to be 869 caught in a marking period. The algorithm starts with only a few 870 hash bits, which permits selecting a greater percentage of packets 871 (e.g., with 0 bits of hash all the packets are sampled, with 1 bit of 872 hash half of the packets are sampled, and so on). When the number of 873 selected packets reaches NMAX, a hashing bit is added. As a 874 consequence, the sampling proceeds at half of the original rate, and 875 the packets already selected that do not match the new hash are 876 discarded. This step can be repeated iteratively. It is assumed 877 that each sample includes the timestamp (used for delay measurement) 878 and the hash value, allowing the management system to match the 879 samples received from the two measurement points. The dynamic 880 process statistically converges at the end of a marking period, and 881 the final number of selected samples is between NMAX/2 and NMAX. 882 Therefore, the dynamic approach paces the sampling rate, which 883 bounds the number of sampled packets per sampling period. 885 In a multipoint environment, the behavior is similar to a point-to- 886 point flow. In particular, in the context of a multipoint-to- 887 multipoint flow, the dynamic hash could be the solution for 888 performing delay measurements on specific packets and overcoming the 889 single- and double-marking limitations.
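The dynamic hash-based selection described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the method as specified by any implementation: the class name DynamicHashSampler, the use of a truncated SHA-256 digest as the packet hash, and the rule "a packet is selected when the low-order hash bits in use are all zero" are choices made here for the example. The sampler covers a single marking period and would be reset when the marking bit flips.

```python
import hashlib

HASH_BITS = 32  # assumed width of the per-packet hash


def packet_hash(packet: bytes) -> int:
    """Assumed hash function: the first 32 bits of SHA-256 over the packet."""
    return int.from_bytes(hashlib.sha256(packet).digest()[:4], "big")


class DynamicHashSampler:
    """Selects packets within one marking period, keeping fewer than NMAX."""

    def __init__(self, nmax: int) -> None:
        self.nmax = nmax
        self.bits = 0       # 0 hash bits in use: every packet is selected
        self.samples = {}   # hash value -> timestamp

    def _matches(self, h: int) -> bool:
        # Selected when the low-order bits currently in use are all zero.
        return h & ((1 << self.bits) - 1) == 0

    def observe(self, packet: bytes, timestamp: float) -> None:
        h = packet_hash(packet)
        if not self._matches(h):
            return
        self.samples[h] = timestamp  # a hash collision overwrites the entry
        # On reaching NMAX, add one hash bit and discard the samples that
        # no longer match; repeat if the pruning did not free any room.
        while len(self.samples) >= self.nmax and self.bits < HASH_BITS:
            self.bits += 1
            self.samples = {k: v for k, v in self.samples.items()
                            if self._matches(k)}


sampler = DynamicHashSampler(nmax=16)
for i in range(1000):                      # synthetic packets for illustration
    sampler.observe(str(i).encode(), float(i))
print(sampler.bits, len(sampler.samples))
```

Each stored entry carries the hash value and the timestamp, which is the information the text assumes every sample exports. Two measurement points running the same procedure may end a period with different hash lengths, so the longer of the two is the one that identifies the packets both of them kept.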
891 The management system receives the samples, including the timestamps 892 and the hash value, from all the MPs, and this happens for both 893 point-to-point and multipoint-to-multipoint flows. Then, the longest 894 hash used by the MPs is deduced and applied to couple timestamps from 895 either the same packets of two MPs of a point-to-point path, or the 896 input and output MPs of a cluster (or a super cluster or the entire 897 network). But some considerations are needed: if there is no packet 898 loss, the set of input samples is always equal to the set of output 899 samples. In the case of packet loss, the set of output samples can 900 be a subset of input samples, but the method still works because, at 901 the end, it is easy to couple the input and output timestamps of each 902 caught packet using the hash (in particular, the "unused part of the 903 hash" that should be different for each packet). 905 Therefore, the basic hash is logically similar to the double-marking 906 method, and in the case of a point-to-point path, double-marking and 907 basic-hash selection are equivalent. The dynamic approach scales the 908 number of measurements per interval. It would seem that double 909 marking would also work well if we reduced the interval length, but 910 this can be done only for a point-to-point path and not for a 911 multipoint path, where the picked packets cannot be 912 coupled. So, in general, if we want to get delay 913 measurements on the basis of a multipoint-to-multipoint path, and 914 want to select more than one packet per period, double marking cannot 915 be used because we would not be able to couple the picked packets 916 between input and output nodes. On the other hand, we can do that by 917 using hashing selection. 919 9.
A Closed-Loop Performance-Management Approach 921 The Multipoint Alternate-Marking framework that is introduced in this 922 document adds flexibility to Performance Management (PM), because it 923 can reduce the order of magnitude of the packet counters. This 924 allows an SDN orchestrator to supervise, control, and manage PM in 925 large networks. 927 The monitoring network can be considered as a whole or split into 928 clusters that are the smallest subnetworks (group-to-group segments), 929 maintaining the packet-loss property for each subnetwork. The 930 clusters can also be combined in new, connected subnetworks at 931 different levels, depending on the detail we want to achieve. 933 An SDN controller or a Network Management System (NMS) can calibrate 934 performance measurements, since they are aware of the network 935 topology. They can start with a coarse analysis. In case of 936 necessity (packet loss is measured or the delay is too high), the 937 filtering criteria could be immediately reconfigured in order to 938 perform a partition of the network by using clusters and/or different 939 combinations of clusters. In this way, the problem can be localized 940 in a specific cluster or a single combination of clusters, and a more 941 detailed analysis can be performed step by step by successive 942 approximation up to a point-to-point flow detailed analysis. This is 943 the so-called "closed loop". 945 This approach can be called "network zooming" and can be performed in 946 two different ways: 948 1) change the traffic filter and select more detailed flows; 950 2) activate new measurement points by defining more specific 951 clusters. 953 The network-zooming approach implies that some filters or rules are 954 changed and that therefore there is a transient time to wait once the 955 new network configuration takes effect. This time can be determined 956 by the Network Orchestrator/Controller, based on the network 957 conditions.
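The successive-approximation idea behind network zooming can be illustrated with a small Python sketch. Everything named here is hypothetical: the Segment type, the loss_of callback standing in for a real measurement, and the loss threshold are assumptions for the example, and a real controller would also wait for the transient time after each reconfiguration before reading the counters.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Segment:
    """A monitored scope: the whole network, a cluster, or a single flow."""
    name: str
    children: List["Segment"] = field(default_factory=list)


def zoom(segment: Segment,
         loss_of: Callable[[Segment], int],
         threshold: int = 0) -> List[Segment]:
    """Return the finest-grained segments whose loss exceeds the threshold."""
    if loss_of(segment) <= threshold:
        return []  # nothing to investigate below this scope
    faulty = [s for child in segment.children
              for s in zoom(child, loss_of, threshold)]
    # If no finer scope shows the loss, report the current one.
    return faulty or [segment]


# Hypothetical topology and per-period packet-loss readings.
network = Segment("network", [
    Segment("cluster-A"),
    Segment("cluster-B", [Segment("flow-B1"), Segment("flow-B2")]),
])
loss = {"network": 5, "cluster-A": 0, "cluster-B": 5,
        "flow-B1": 0, "flow-B2": 5}
print([s.name for s in zoom(network, lambda s: loss[s.name])])
```

The recursion stops at the first level where no finer scope reproduces the problem, which mirrors the step-by-step refinement from the whole network down to a point-to-point flow.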
959 For example, if the network zooming identifies the performance 960 problem for the traffic coming from a specific source, we need to 961 recognize the marked signal from this specific source node and its 962 relative path. For this purpose, we can activate all the available 963 measurement points and better specify the flow filter criteria (i.e., 964 5-tuple). As an alternative, it can be enough to select packets from 965 the specific source for delay measurements; in this case, it is 966 possible to apply the hashing technique, as mentioned in the previous 967 sections. 969 [I-D.song-opsawg-ifit-framework] defines an architecture where the 970 centralized Data Collector and Network Management can apply the 971 intelligent and flexible Alternate-Marking algorithm as previously 972 described. 974 As for [I-D.fioccola-rfc8321bis], it is possible to classify the 975 traffic and mark a portion of the total traffic. For each period, 976 the packet rate and bandwidth are calculated from the number of 977 packets. In this way, the network orchestrator becomes aware if the 978 traffic rate surpasses limits. In addition, more precision can be 979 obtained by reducing the marking period; indeed, some implementations 980 use a marking period of 1 sec or less. 982 In addition, an SDN controller could also collect the measurement 983 history. 985 It is important to mention that the Multipoint Alternate Marking 986 framework also helps Traffic Visualization. Indeed, this methodology 987 is very useful for identifying which path or cluster is crossed by 988 the flow. 990 10. Security Considerations 992 This document specifies a method of performing measurements that does 993 not directly affect Internet security or applications that run on the 994 Internet. However, implementation of this method must be mindful of 995 security and privacy concerns, as explained in 996 [I-D.fioccola-rfc8321bis]. 998 11. IANA Considerations 1000 This document has no IANA actions. 1002 12. 
Contributors 1004 Tianran Zhou 1005 Huawei Technologies 1006 Email: zhoutianran@huawei.com 1008 Greg Mirsky 1009 Ericsson 1010 Email: gregimirsky@gmail.com 1012 Tal Mizrahi 1013 Huawei Technologies 1014 Email: tal.mizrahi.phd@gmail.com 1016 Xiao Min 1017 ZTE Corp. 1018 Email: xiao.min2@zte.com.cn 1020 13. Acknowledgements 1022 The authors would like to thank Martin Duke and Tommy Pauly for their 1023 assistance and their detailed and precious reviews. 1025 14. References 1027 14.1. Normative References 1029 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1030 Requirement Levels", BCP 14, RFC 2119, 1031 DOI 10.17487/RFC2119, March 1997, 1032 <https://www.rfc-editor.org/info/rfc2119>. 1034 [RFC5474] Duffield, N., Ed., Chiou, D., Claise, B., Greenberg, A., 1035 Grossglauser, M., and J. Rexford, "A Framework for Packet 1036 Selection and Reporting", RFC 5474, DOI 10.17487/RFC5474, 1037 March 2009, <https://www.rfc-editor.org/info/rfc5474>. 1039 [RFC5475] Zseby, T., Molina, M., Duffield, N., Niccolini, S., and F. 1040 Raspall, "Sampling and Filtering Techniques for IP Packet 1041 Selection", RFC 5475, DOI 10.17487/RFC5475, March 2009, 1042 <https://www.rfc-editor.org/info/rfc5475>. 1044 [RFC5644] Stephan, E., Liang, L., and A. Morton, "IP Performance 1045 Metrics (IPPM): Spatial and Multicast", RFC 5644, 1046 DOI 10.17487/RFC5644, October 2009, 1047 <https://www.rfc-editor.org/info/rfc5644>. 1049 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 1050 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 1051 May 2017, <https://www.rfc-editor.org/info/rfc8174>. 1053 14.2. Informative References 1055 [I-D.fioccola-rfc8321bis] 1056 Fioccola, G., Cociglio, M., Mirsky, G., and T. Mizrahi, 1057 "Alternate-Marking Method", draft-fioccola-rfc8321bis-00 1058 (work in progress), November 2021. 1060 [I-D.ietf-ippm-route] 1061 Alvarez-Hamelin, J. I., Morton, A., Fabini, J., Pignataro, 1062 C., and R. Geib, "Advanced Unidirectional Route Assessment 1063 (AURA)", draft-ietf-ippm-route-10 (work in progress), 1064 August 2020.
1066 [I-D.mizrahi-ippm-compact-alternate-marking] 1067 Mizrahi, T., Arad, C., Fioccola, G., Cociglio, M., Chen, 1068 M., Zheng, L., and G. Mirsky, "Compact Alternate Marking 1069 Methods for Passive and Hybrid Performance Monitoring", 1070 draft-mizrahi-ippm-compact-alternate-marking-05 (work in 1071 progress), July 2019. 1073 [I-D.song-opsawg-ifit-framework] 1074 Song, H., Qin, F., Chen, H., Jin, J., and J. Shin, "In- 1075 situ Flow Information Telemetry", draft-song-opsawg-ifit- 1076 framework-16 (work in progress), October 2021. 1078 [I-D.zhou-ippm-enhanced-alternate-marking] 1079 Zhou, T., Fioccola, G., Liu, Y., Lee, S., Cociglio, M., 1080 and W. Li, "Enhanced Alternate Marking Method", draft- 1081 zhou-ippm-enhanced-alternate-marking-07 (work in 1082 progress), July 2021. 1084 [IEEE-ACM-ToN-MPNPM] 1085 IEEE/ACM Transactions on Networking, "Multipoint Passive 1086 Monitoring in Packet Networks", 1087 DOI 10.1109/TNET.2019.2950157, 2019. 1089 [IEEE-Network-PNPM] 1090 IEEE Network, "AM-PM: Efficient Network Telemetry using 1091 Alternate Marking", DOI 10.1109/MNET.2019.1800152, 2019. 1093 [RFC7011] Claise, B., Ed., Trammell, B., Ed., and P. Aitken, 1094 "Specification of the IP Flow Information Export (IPFIX) 1095 Protocol for the Exchange of Flow Information", STD 77, 1096 RFC 7011, DOI 10.17487/RFC7011, September 2013, 1097 <https://www.rfc-editor.org/info/rfc7011>. 1099 [RFC8889] Fioccola, G., Ed., Cociglio, M., Sapio, A., and R. Sisto, 1100 "Multipoint Alternate-Marking Method for Passive and 1101 Hybrid Performance Monitoring", RFC 8889, 1102 DOI 10.17487/RFC8889, August 2020, 1103 <https://www.rfc-editor.org/info/rfc8889>. 1105 Appendix A.
Changes Log 1107 Changes from RFC 8889 include: 1109 o Minor editorial changes 1111 o Removed section on "Examples of application" 1113 Authors' Addresses 1115 Giuseppe Fioccola (editor) 1116 Huawei Technologies 1117 Riesstrasse, 25 1118 Munich 80992 1119 Germany 1121 Email: giuseppe.fioccola@huawei.com 1123 Mauro Cociglio 1124 Telecom Italia 1125 Via Reiss Romoli, 274 1126 Torino 10148 1127 Italy 1129 Email: mauro.cociglio@telecomitalia.it 1131 Amedeo Sapio 1132 Intel Corporation 1133 4750 Patrick Henry Dr. 1134 Santa Clara, CA 95054 1135 USA 1137 Email: amedeo.sapio@intel.com 1139 Riccardo Sisto 1140 Politecnico di Torino 1141 Corso Duca degli Abruzzi, 24 1142 Torino 10129 1143 Italy 1145 Email: riccardo.sisto@polito.it