Network Working Group                                   G. Fioccola, Ed.
Internet-Draft                                       Huawei Technologies
Obsoletes: 8889 (if approved)                                M. Cociglio
Intended status: Standards Track                          Telecom Italia
Expires: June 12, 2022                                          A. Sapio
                                                       Intel Corporation
                                                                R. Sisto
                                                   Politecnico di Torino
                                                                 T. Zhou
                                                     Huawei Technologies
                                                        December 9, 2021


                  Multipoint Alternate-Marking Method
                      draft-fioccola-rfc8889bis-01

Abstract

   This document generalizes and expands the Alternate-Marking
   methodology to measure any kind of unicast flow whose packets can
   follow several different paths in the network -- in wider terms, a
   multipoint-to-multipoint network. For this reason, the technique
   described here is called "Multipoint Alternate Marking". This
   document obsoletes RFC 8889.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 12, 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Terminology
     2.1.  Correlation with RFC 5644
   3.  Flow Classification
   4.  Multipoint Performance Measurement
     4.1.  Monitoring Network
   5.  Multipoint Packet Loss
   6.  Network Clustering
     6.1.  Algorithm for Clusters Partition
   7.  Timing Aspects
   8.  Multipoint Delay and Delay Variation
     8.1.  Delay Measurements on a Multipoint-Paths Basis
       8.1.1.  Single-Marking Measurement
     8.2.  Delay Measurements on a Single-Packet Basis
       8.2.1.  Single- and Double-Marking Measurement
       8.2.2.  Hashing Selection Method
   9.  A Closed-Loop Performance-Management Approach
   10. Security Considerations
   11. IANA Considerations
   12. Contributors
   13. Acknowledgements
   14. References
     14.1.  Normative References
     14.2.  Informative References
   Appendix A.  Changes Log
   Authors' Addresses

1.  Introduction

   The Alternate-Marking method, as described in
   [I-D.fioccola-rfc8321bis], is applicable to a point-to-point path.
   The extension proposed in this document applies to the most general
   case of multipoint-to-multipoint path and enables flexible and
   adaptive performance measurements in a managed network.

   The Alternate-Marking methodology described in
   [I-D.fioccola-rfc8321bis] allows the synchronization of the
   measurements in different points by dividing the packet flow into
   batches. It is therefore possible to get coherent counters and show
   what is happening in every marking period for each monitored flow.
   The monitoring parameters are the packet counter and timestamps of a
   flow for each marking period. Note that additional details about the
   applicability of the Alternate-Marking methodology are described in
   [I-D.fioccola-rfc8321bis] while implementation details can be found
   in the paper "AM-PM: Efficient Network Telemetry using Alternate
   Marking" [IEEE-Network-PNPM].

   Some applications of the Alternate-Marking method involve a large
   number of monitored flows and nodes. Multipoint Alternate Marking
   aims to reduce these values and to make the performance monitoring
   more flexible when a detailed analysis is not needed. For instance,
   by considering n measurement points and m monitored flows, the order
   of magnitude of the packet counters for each time interval is n*m*2
   (1 per color). The number of measurement points and monitored flows
   may vary and depends on the portion of the network we are monitoring
   (core network, metro network, access network) and the granularity
   (for each service, each customer).
   So if both n and m are high values, the number of packet counters
   grows considerably, and Multipoint Alternate Marking offers a tool
   to control these parameters.

   The approach presented in this document is applied only to unicast
   flows and not to multicast. Broadcast, Unknown Unicast, and
   Multicast (BUM) traffic is not considered here, because traffic
   replication is not covered by the Multipoint Alternate-Marking
   method. Furthermore, it can be applicable to anycast flows, and
   Equal-Cost Multipath (ECMP) paths can also be easily monitored with
   this technique.

   [I-D.fioccola-rfc8321bis] applies to point-to-point unicast flows and
   BUM traffic. For BUM traffic, the basic method of
   [I-D.fioccola-rfc8321bis] can easily be applied link by link and
   therefore split the multicast flow tree distribution into separate
   unicast point-to-point links. This document and its Clustered
   Alternate-Marking method, instead, are valid for
   multipoint-to-multipoint unicast flows, anycast, and ECMP flows.

   Therefore, the Alternate-Marking method can be extended to any kind
   of multipoint-to-multipoint paths, and the network-clustering
   approach presented in this document is the formalization of how to
   implement this property and allow a flexible and optimized
   performance measurement support for network management in every
   situation.

   Without network clustering, it is possible to apply Alternate Marking
   only to the whole network or to a single flow. With network
   clustering, instead, it is possible to use the partition of the
   network into clusters at different levels in order to obtain the
   needed degree of detail. In some circumstances, it is possible to
   monitor a multipoint network by analyzing the network clustering,
   without examining it in depth.
   In case of problems (packet loss is measured or
   the delay is too high), the filtering criteria could be made more
   specific in order to perform a detailed analysis by using a
   different combination of clusters up to a per-flow measurement as
   described in [I-D.fioccola-rfc8321bis].

   This approach fits very well with the Closed-Loop Network and
   Software-Defined Network (SDN) paradigm, where the SDN orchestrator
   and the SDN controllers are the brains of the network and can manage
   flow control to the switches and routers and, in the same way, can
   calibrate the performance measurements depending on the desired
   accuracy. An SDN controller application can orchestrate how
   accurately the network performance monitoring is set up by applying
   the Multipoint Alternate Marking as described in this document.

   It is important to underline that, as an extension of
   [I-D.fioccola-rfc8321bis], this is a methodology document, so the
   mechanism that can be used to transmit the counters and the
   timestamps is out of scope here, and the implementation is open.
   Several options are possible -- e.g., see "Enhanced Alternate Marking
   Method" [I-D.zhou-ippm-enhanced-alternate-marking].

   This document assumes that the blocks are created according to a
   fixed timer as per [I-D.fioccola-rfc8321bis]. Switching after a
   fixed number of packets is an additional possibility, but it is out
   of scope here.

   Note that the fragmented packets case can be managed with the
   Alternate-Marking methodology. The same considerations of
   [I-D.fioccola-rfc8321bis] apply also in the case of Multipoint
   Alternate Marking. As defined in [I-D.fioccola-rfc8321bis], the
   marking node MUST mark all the fragments except in the case of
   fragmentation within the network domain; in that event, it is
   suggested to mark only the first fragment.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.  Terminology

   The definitions of the basic terms are identical to those found in
   Alternate Marking [I-D.fioccola-rfc8321bis]. Recall that
   [I-D.fioccola-rfc8321bis] is valid for point-to-point unicast
   flows and BUM traffic.

   The important new terms that need to be explained are listed below:

   Multipoint Alternate Marking:  Extension to
      [I-D.fioccola-rfc8321bis], valid for multipoint-to-multipoint
      unicast flows, anycast, and ECMP flows. It can also be referred
      to as Clustered Alternate Marking.

   Flow definition:  The concept of flow is generalized in this
      document. The identification fields are selected without any
      constraints and, in general, the flow can be a multipoint-to-
      multipoint flow, as a result of aggregate point-to-point flows.

   Monitoring Network:  Identified with the nodes of the network that
      are the measurement points (MPs) and the links that are the
      connections between MPs. The monitoring network graph depends on
      the flow definition, so it can represent a specific flow or the
      entire network topology as an aggregate of all the flows.

   Cluster:  Smallest identifiable subnetwork of the entire monitoring
      network graph that still satisfies the condition that the number
      of packets that go in is the same as the number that go out.

   Multipoint metrics:  Packet loss, delay and delay variation are
      extended to the case of multipoint flows. It is possible to
      compute these metrics on the basis of multipoint paths in order to
      associate the measurements to a cluster, a combination of
      clusters, or the entire monitored network.
      For delay and delay
      variation, it is also possible to define the metrics on a single-
      packet basis, which means that the multipoint path is used to
      easily couple packets between input and output nodes of a
      multipoint path.

   The next section highlights the correlation with the terms used in
   RFC 5644 [RFC5644].

2.1.  Correlation with RFC 5644

   RFC 5644 [RFC5644] is limited to active measurements using a single
   source packet or stream. Its scope is also limited to observations
   of corresponding packets along the path (spatial metric) and at one
   or more destinations (one-to-group) along the path.

   Instead, the scope of this memo is to define multiparty metrics for
   passive and hybrid measurements in a group-to-group topology with
   multiple sources and destinations.

   RFC 5644 [RFC5644] introduces metric names that can be reused here
   but have to be extended and rephrased to be applied to the Alternate-
   Marking schema:

   a.  the multiparty metrics are not only one-to-group metrics but can
       be also group-to-group metrics;

   b.  the spatial metrics, used for measuring the performance of
       segments of a source to destination path, are applied here to
       group-to-group segments (called clusters).

3.  Flow Classification

   A unicast flow is identified by all the packets having a set of
   common characteristics. This definition is inspired by RFC 7011
   [RFC7011].

   As an example, by considering a flow as all the packets sharing the
   same source IP address or the same destination IP address, it is easy
   to understand that the resulting pattern will not be a point-to-point
   connection, but a point-to-multipoint or multipoint-to-point
   connection.

   In general, a flow can be defined by a set of selection rules used to
   match a subset of the packets processed by the network device.
   These
   rules specify a set of Layer 3 and Layer 4 header fields
   (identification fields) and the relative values that must be found in
   matching packets.

   The choice of the identification fields directly affects the type of
   paths that the flow would follow in the network. In fact, it is
   possible to relate a set of identification fields with the pattern of
   the resulting graphs, as listed in Figure 1.

   A TCP 5-tuple usually identifies flows following either a single path
   or a point-to-point multipath (in the case of load balancing). By
   contrast, a single source address selects aggregate flows following
   a point-to-multipoint path, while a multipoint-to-point path can
   result from matching on a single destination address. In the case
   where a selection rule and its reverse are used for bidirectional
   measurements, they can correspond to a point-to-multipoint path in
   one direction and a multipoint-to-point path in the opposite
   direction.

   So the flows to be monitored are selected at the monitoring points
   using packet selection rules, which can also change the pattern of
   the monitored network.

   Note that, more generally, the flow can be defined at different
   levels based on the potential encapsulation, and additional
   conditions that are not in the packet header can also be included as
   part of matching criteria.

   The Alternate-Marking method is applicable only to a single path (and
   partially to a one-to-one multipath), so the extension proposed in
   this document is suitable also for the most general case of
   multipoint-to-multipoint, which embraces all the other patterns of
   Figure 1.

   point-to-point single path
       +------+      +------+      +------+
   ---<>  R1  <>----<>  R2  <>----<>  R3  <>---
       +------+      +------+      +------+

   point-to-point multipath
                    +------+
                   <>  R2  <>
                  / +------+ \
                 /            \
       +------+ /              \ +------+
   ---<>  R1  <>                <>  R4  <>---
       +------+ \              / +------+
                 \            /
                  \ +------+ /
                   <>  R3  <>
                    +------+

   point-to-multipoint
                               +------+
                              <>  R4  <>---
                             / +------+
                   +------+ /
                  <>  R2  <>
                 / +------+ \
       +------+ /            \ +------+
   ---<>  R1  <>              <>  R5  <>---
       +------+ \              +------+
                 \ +------+
                  <>  R3  <>
                   +------+ \
                             \ +------+
                              <>  R6  <>---
                               +------+

   multipoint-to-point
       +------+
   ---<>  R1  <>
       +------+ \
                 \ +------+
                  <>  R4  <>
                 / +------+ \
       +------+ /            \ +------+
   ---<>  R2  <>              <>  R6  <>---
       +------+              / +------+
                   +------+ /
                  <>  R5  <>
                 / +------+
       +------+ /
   ---<>  R3  <>
       +------+

   multipoint-to-multipoint
       +------+                +------+
   ---<>  R1  <>              <>  R6  <>---
       +------+ \            / +------+
                 \ +------+ /
                  <>  R4  <>
                   +------+ \
       +------+              \ +------+
   ---<>  R2  <>              <>  R7  <>---
       +------+ \            / +------+
                 \ +------+ /
                  <>  R5  <>
                 / +------+ \
       +------+ /            \ +------+
   ---<>  R3  <>              <>  R8  <>---
       +------+                +------+

                    Figure 1: Flow Classification

   The case of unicast flow is considered in Figure 1. The anycast flow
   is also in scope, because there is no replication and only a single
   node from the anycast group receives the traffic, so it can be viewed
   as a special case of unicast flow. Furthermore, an ECMP flow is in
   scope by definition, since it is a point-to-multipoint unicast flow.

4.  Multipoint Performance Measurement

   By using the Alternate-Marking method, only point-to-point paths can
   be monitored.
   To have an IP (TCP/UDP) flow that follows a point-to-
   point path, each of the 5 identification fields (IP Source, IP
   Destination, Transport Protocol, Source Port, Destination Port) has
   to be set to a specific value.

   Multipoint Alternate Marking enables the performance measurement for
   multipoint flows selected by identification fields without any
   constraints (even the entire network production traffic). It is also
   possible to use multiple marking points for the same monitored flow.

4.1.  Monitoring Network

   The monitoring network is deduced from the production network by
   identifying the nodes of the graph that are the measurement points,
   and the links that are the connections between measurement points.

   There are some techniques that can help with the building of the
   monitoring network (as an example, see [I-D.ietf-ippm-route]). In
   general, there are different options: the monitoring network can be
   obtained by considering all the possible paths for the traffic or by
   periodically checking the traffic (e.g., daily, weekly, monthly) and
   updating the graph as appropriate, but this is up to the Network
   Management System (NMS) configuration.

   So a graph model of the monitoring network can be built according to
   the Alternate-Marking method: the monitored interfaces and links are
   identified. Only the measurement points and links where the traffic
   has flowed have to be represented in the graph.

   Figure 2 shows a simple example of a monitoring network graph:

                                             +------+
                                            <>  R6  <>---
                                           / +------+
                    +------+     +------+ /
                   <>  R2  <>---<>  R4  <>
                  / +------+ \   +------+ \
                 /            \            \ +------+
       +------+ /   +------+   \ +------+   <>  R7  <>---
   ---<>  R1  <>---<>  R3  <>---<>  R5  <>   +------+
       +------+ \   +------+ \   +------+ \
                 \            \            \ +------+
                  \            \            <>  R8  <>---
                   \            \            +------+
                    \            \
                     \            \ +------+
                      \            <>  R9  <>---
                       \            +------+
                        \
                         \ +------+
                          <>  R10 <>---
                           +------+

                 Figure 2: Monitoring Network Graph

   Each monitoring point is characterized by the packet counter that
   refers only to a marking period of the monitored flow. Also, it is
   assumed that there is a monitoring point at all possible egress
   points of the multipoint monitored network.

   The same is also applicable for the delay, but it will be described
   in the following sections.

5.  Multipoint Packet Loss

   Since all the packets of the considered flow leaving the network have
   previously entered the network, the number of packets counted by all
   the input nodes is always greater than, or equal to, the number of
   packets counted by all the output nodes. Noninitial fragments are
   not considered here.

   The assumption is the use of the Alternate-Marking method. In the
   case of no packet loss occurring in the marking period, if all the
   input and output points of the network domain to be monitored are
   measurement points, the sum of the number of packets on all the
   ingress interfaces equals the number on egress interfaces for the
   monitored flow. In this circumstance, if no packet loss occurs, the
   intermediate measurement points only have the task of splitting the
   measurement.

   It is possible to define the Network Packet Loss of one monitored
   flow for a single period.
   In a packet network, the number of lost
   packets is the number of packets counted by the input nodes minus the
   number of packets counted by the output nodes. This is true for
   every packet flow in each marking period.

   The monitored network packet loss with n input nodes and m output
   nodes is given by:

   PL = (PI1 + PI2 +...+ PIn) - (PO1 + PO2 +...+ POm)

   where:

   PL is the network packet loss (number of lost packets)

   PIi is the number of packets that flowed through the i-th input node
   in this period

   POj is the number of packets that flowed through the j-th output node
   in this period

   The equation is applied on a per-time-interval basis and a per-flow
   basis:

      The reference interval is the Alternate-Marking period, as defined
      in [I-D.fioccola-rfc8321bis].

      The flow definition is generalized here. Indeed, as described
      before, a multipoint packet flow is considered, and the
      identification fields can be selected without any constraints.

6.  Network Clustering

   The previous equation can determine the number of packets lost
   globally in the monitored network, exploiting only the data provided
   by the counters in the input and output nodes.

   In addition, it is also possible to leverage the data provided by the
   other counters in the network to converge on the smallest
   identifiable subnetworks where the losses occur. These subnetworks
   are named "clusters".

   A cluster graph is a subnetwork of the entire monitoring network
   graph that still satisfies the packet loss equation (introduced in
   the previous section), where PL in this case is the number of packets
   lost in the cluster. As for the entire monitoring network graph, the
   cluster is defined on a per-flow basis.

   For this reason, a cluster should contain all the arcs emanating from
   its input nodes and all the arcs terminating at its output nodes.
   This ensures that we can count all the packets (and only those)
   exiting an input node again at the output node, whatever path they
   follow.

   In a completely monitored unidirectional network (a network where
   every network interface is monitored), each network device
   corresponds to a cluster, and each physical link corresponds to two
   clusters (one for each device).

   Clusters can have different sizes depending on the flow-filtering
   criteria adopted.

   Moreover, clusters can sometimes be simplified. For example, when
   two monitored interfaces are divided by a single router (one is the
   input interface, the other is the output interface, and the router
   has only these two interfaces), instead of counting exactly twice,
   upon entering and leaving, it is possible to consider a single
   measurement point. In this case, we do not care about the internal
   packet loss of the router.

   It is worth highlighting that it might also be convenient to define
   clusters based on the topological information so that they are
   applicable to all the possible flows in the monitored network.

6.1.  Algorithm for Clusters Partition

   A simple algorithm can be applied in order to split our monitoring
   network into clusters. This can be done for each direction
   separately. The clusters partition is based on the monitoring
   network graph, which can be valid for a specific flow or can also be
   general and valid for the entire network topology.

   It is a two-step algorithm:

   o  Group the links where there is the same starting node;

   o  Join the grouped links with at least one ending node in common.

   Considering that the links are unidirectional, the first step implies
   listing all the links as connections between two nodes and grouping
   the different links if they have the same starting node. Note that
   it is possible to start from any link, and the procedure will work.
   Following this classification, the second step implies possibly
   joining the groups classified in the first step by looking at the
   ending nodes. If different groups have at least one common ending
   node, they are put together and belong to the same set. After the
   application of the two steps of the algorithm, each one of the
   composed sets of links, together with the endpoint nodes, constitutes
   a cluster.

   In our monitoring network graph example, it is possible to identify
   the clusters partition by applying this two-step algorithm.

   The first step identifies the following groups:

   1.  Group 1: (R1-R2), (R1-R3), (R1-R10)

   2.  Group 2: (R2-R4), (R2-R5)

   3.  Group 3: (R3-R5), (R3-R9)

   4.  Group 4: (R4-R6), (R4-R7)

   5.  Group 5: (R5-R8)

   And then, the second step builds the clusters partition (in
   particular, we can underline that Groups 2 and 3 connect together,
   since R5 is in common):

   1.  Cluster 1: (R1-R2), (R1-R3), (R1-R10)

   2.  Cluster 2: (R2-R4), (R2-R5), (R3-R5), (R3-R9)

   3.  Cluster 3: (R4-R6), (R4-R7)

   4.  Cluster 4: (R5-R8)

   The flow direction considered here is from left to right. For the
   opposite direction, the same reasoning can be applied, and in this
   example, you get the same clusters partition.
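
   The two-step partition above can be sketched in a few lines of
   Python. This is only an illustrative sketch, not part of the
   methodology: the function name partition_clusters, the
   FIGURE_2_LINKS constant, and the use of a small union-find to join
   groups transitively are assumptions made for this example. The link
   list is the one read off the monitoring network graph of Figure 2.

```python
from collections import defaultdict

# Unidirectional links (start, end) of the Figure 2 example graph.
FIGURE_2_LINKS = [
    ("R1", "R2"), ("R1", "R3"), ("R1", "R10"),
    ("R2", "R4"), ("R2", "R5"),
    ("R3", "R5"), ("R3", "R9"),
    ("R4", "R6"), ("R4", "R7"),
    ("R5", "R8"),
]


def partition_clusters(links):
    """Split unidirectional links into clusters (hypothetical helper)."""
    # Step 1: group the links that share the same starting node.
    groups = defaultdict(list)
    for start, end in links:
        groups[start].append((start, end))

    # Step 2: join the groups that have at least one ending node in
    # common. A union-find over the starting nodes keeps joins transitive.
    parent = {start: start for start in groups}

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    first_group_for_end = {}
    for start, group in groups.items():
        for _, end in group:
            if end in first_group_for_end:
                parent[find(start)] = find(first_group_for_end[end])
            else:
                first_group_for_end[end] = start

    clusters = defaultdict(list)
    for start, group in groups.items():
        clusters[find(start)].extend(group)
    # Each cluster is returned as a sorted list of its links.
    return [sorted(cluster) for cluster in clusters.values()]


clusters = partition_clusters(FIGURE_2_LINKS)
for cluster in clusters:
    print(cluster)
```

   Under these assumptions, the sketch yields four clusters for the
   example graph, and the groups starting at R2 and R3 end up in the
   same cluster because they share the ending node R5.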

   In the end, the following 4 clusters are obtained:

   Cluster 1
                    +------+
                   <>  R2  <>---
                  / +------+
                 /
       +------+ /   +------+
   ---<>  R1  <>---<>  R3  <>---
       +------+ \   +------+
                 \
                  \
                   \
                    \
                     \
                      \
                       \
                        \
                         \ +------+
                          <>  R10 <>---
                           +------+

   Cluster 2
       +------+     +------+
   ---<>  R2  <>---<>  R4  <>---
       +------+ \   +------+
                 \
       +------+   \ +------+
   ---<>  R3  <>---<>  R5  <>---
       +------+ \   +------+
                 \
                  \
                   \
                    \
                     \ +------+
                      <>  R9  <>---
                       +------+

   Cluster 3
                   +------+
                  <>  R6  <>---
                 / +------+
       +------+ /
   ---<>  R4  <>
       +------+ \
                 \ +------+
                  <>  R7  <>---
                   +------+

   Cluster 4
       +------+
   ---<>  R5  <>
       +------+ \
                 \ +------+
                  <>  R8  <>---
                   +------+

                     Figure 3: Clusters Example

   There are clusters with more than two nodes as well as two-node
   clusters. In the two-node clusters, the loss is on the link (Cluster
   4). In more-than-two-node clusters, the loss is on the cluster, but
   we cannot know in which link (Cluster 1, 2, or 3).

   In this way, the calculation of packet loss can be made on a cluster
   basis. Note that the packet counters for each marking period permit
   calculating the packet rate on a cluster basis, so Committed
   Information Rate (CIR) and Excess Information Rate (EIR) could also
   be deduced on a cluster basis.

   Obviously, when some clusters are combined into a new connected
   subnetwork (called a "super cluster"), the packet-loss rule still
   holds.

   In this way, in a very large network, there is no need to configure
   detailed filter criteria to inspect the traffic. You can check a
   multipoint network and, in case of problems, go deep with a step-by-
   step cluster analysis, but only for the cluster or combination of
   clusters where the problem happens.
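
   As a minimal sketch of the per-cluster calculation, the packet loss
   equation of Section 5 can be applied to the counters of one marking
   period. Everything below is an assumption made for illustration:
   the function name packet_loss, the helper pick, the counter values,
   and the simplification that each node exposes a single counter
   (i.e., loss inside a node is ignored).

```python
def packet_loss(input_counts, output_counts):
    """Return PL = (PI1 + ... + PIn) - (PO1 + ... + POm).

    Both arguments map a node name to the number of packets it counted
    for the monitored flow in one marking period.
    """
    return sum(input_counts.values()) - sum(output_counts.values())


# Hypothetical counters for one marking period (illustrative values only).
counters = {"R2": 1000, "R3": 800, "R4": 600, "R5": 900, "R8": 898, "R9": 295}


def pick(nodes):
    """Select the counters of the given nodes."""
    return {node: counters[node] for node in nodes}


# Cluster 2: input nodes R2 and R3, output nodes R4, R5 and R9.
cluster_2_loss = packet_loss(pick(["R2", "R3"]), pick(["R4", "R5", "R9"]))

# Cluster 4: input node R5, output node R8.
cluster_4_loss = packet_loss(pick(["R5"]), pick(["R8"]))

# Super cluster made of Cluster 2 and Cluster 4: R5 becomes internal,
# so only R2 and R3 (inputs) and R4, R8 and R9 (outputs) are needed.
super_cluster_loss = packet_loss(pick(["R2", "R3"]), pick(["R4", "R8", "R9"]))

print(cluster_2_loss, cluster_4_loss, super_cluster_loss)  # 5 2 7
```

   With these made-up counters, the loss of the super cluster equals
   the sum of the losses of its two clusters, which is the sense in
   which the packet-loss rule remains true when clusters are combined.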

   In summary, once a flow is defined, the algorithm to build the
   clusters partition is based on topological information; therefore, it
   considers all the possible links and nodes crossed by the given flow,
   even if there is no traffic. So, if the flow does not enter or
   traverse all the nodes, the counters have a nonzero value for the
   involved nodes and a zero value for the other nodes without traffic;
   but in the end, all the formulas are still valid.

   The algorithm described above is an iterative clustering algorithm,
   but it is also possible to apply a recursive clustering algorithm by
   using the node-node adjacency matrix representation
   [IEEE-ACM-ToN-MPNPM].

   The complete and mathematical analysis of the possible algorithms for
   clusters partition, including the considerations in terms of
   efficiency and a comparison between the different methods, is in the
   paper [IEEE-ACM-ToN-MPNPM].

7.  Timing Aspects

   It is important to consider the timing aspects, since out-of-order
   packets happen and have to be handled as well, as described in
   [I-D.fioccola-rfc8321bis]. However, in a multisource situation, an
   additional issue has to be considered. With a multipoint path, the
   egress nodes will receive alternate marked packets in random order
   from different ingress nodes, and this must not affect the
   measurement.

   So, if we analyze a multipoint-to-multipoint path with more than one
   marking node, it is important to recognize the reference measurement
   interval. In general, the measurement interval for describing the
   results is the interval of the marking node that is most closely
   aligned with the start of the measurement, as reported in Figure 4.

   Note that the mark switching approach based on a fixed timer is
   considered in this document.

            time -> start         stop
            T(R1)   |-------------|
            T(R2)     |-------------|
            T(R3)        |------------|

                     Figure 4: Measurement Interval

   In Figure 4, it is assumed that the node with the earliest clock (R1)
   identifies the right starting and ending times of the measurement,
   but it is just an assumption, and other possibilities could occur.
   So, in this case, T(R1) is the measurement interval, and its
   recognition is essential in order to make comparisons with other
   active/passive/hybrid Packet Loss metrics.

   When we expand to multipoint-to-multipoint flows, we have to consider
   that all source nodes mark the traffic, and this adds more
   complexity.

   Regarding the timing aspects of the methodology,
   [I-D.fioccola-rfc8321bis] already describes two contributions that
   are taken into account: the clock error between network devices and
   the network delay between measurement points.

   In a multipoint environment, there is more than one marking node.
   All source nodes mark the traffic based on synchronized clock time,
   but the marking periods can have different lengths and different
   offsets.  This introduces an additional contribution to consider,
   since different nodes mark the traffic and the batches get mixed in
   a multipoint flow.  For example, a marking node may apply the
   marking with a delay because it is overloaded while the other
   marking nodes are not.  To take into account this possible
   additional gap between the sources, a mismatch m is introduced that
   can be added to d, as shown in Figure 5.

   ...BBBBBBBBB | AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA | BBBBBBBBB...
                |<======================================>|
                |                   L                    |
   ...=========>|<==================><==================>|<==========...
                |        L/2                 L/2         |
                |<=><===>|                      |<===><=>|
                  m   d  |                      |  d   m
                         |<====================>|
                       available counting interval

              Figure 5: Timing Aspects for Multipoint Paths

   So the misalignment between the marking source routers gives an
   additional constraint, and the value of m is added to d (which
   already includes clock error and network delay).

   Thus, three different possible contributions are considered: clock
   error between network devices, network delay between measurement
   points, and the misalignment between the marking source routers.

   In the end, the condition that must be satisfied to enable the method
   to function properly is that the available counting interval must be
   > 0, and that means:

      L - 2m - 2d > 0.

   This formula needs to be verified for each measurement point on the
   multipoint path, where m is the misalignment between the marking
   source routers, while d, already introduced in
   [I-D.fioccola-rfc8321bis], takes into account clock error and
   network delay between network nodes.  Therefore, the mismatch
   between measurement intervals must satisfy this condition.

   Also, it is worth highlighting that the formula above is identical
   to the one in [I-D.fioccola-rfc8321bis] if m=0; indeed, in the case
   of a point-to-point flow, there is only one marking node and m=0.

   Note that the timing considerations are valid for both packet loss
   and delay measurements.

8.  Multipoint Delay and Delay Variation

   The same line of reasoning can be applied to delay and delay
   variation.  As with the delay measurements defined in
   [I-D.fioccola-rfc8321bis], the marking batches anchor the samples to
   a particular period, and this is the time reference that can be used.
   It is important to highlight that both delay and delay-variation
   measurements make sense in a multipoint path.  The delay variation is
   calculated by considering the same packets selected for measuring the
   delay.
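
   As a minimal sketch of the last point, assume that each measurement
   node exports, for one marking period, a timestamp for every selected
   packet, keyed by a packet identifier.  The dictionaries, the integer
   microsecond units, and the function name below are illustrative
   assumptions and are not defined by this document.  The delay
   variation is taken here as the difference between the one-way delays
   of consecutive selected packets, which is one common definition;
   other definitions are possible.

```python
def delays_and_variation(tx_ts, rx_ts):
    """Compute one-way delays and delay variation for one marking period.

    tx_ts / rx_ts: dicts mapping a packet identifier to the timestamp
    (integer microseconds) taken at the input / output node.
    Packets that never reach the output node are skipped.
    """
    # Keep only packets seen at both nodes, ordered by sending time.
    common = sorted((pid for pid in tx_ts if pid in rx_ts),
                    key=lambda pid: tx_ts[pid])
    delays = [rx_ts[pid] - tx_ts[pid] for pid in common]
    # Delay variation is derived from the very same packets.
    variation = [later - earlier
                 for earlier, later in zip(delays, delays[1:])]
    return delays, variation


# Hypothetical timestamps; packet 4 is lost before the output node.
tx = {1: 1000, 2: 2000, 3: 3000, 4: 4000}
rx = {1: 1400, 2: 2550, 3: 3450}
delays, variation = delays_and_variation(tx, rx)
# delays == [400, 550, 450], variation == [150, -100]
```

   Because both lists are built from the same set of packets, a lost
   sample is dropped from the delay and the delay-variation results
   alike.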

   In general, it is possible to perform delay and delay-variation
   measurements on the basis of multipoint paths or single packets:

   o  Delay measurements on the basis of multipoint paths mean that the
      delay value is representative of an entire multipoint path (e.g.,
      the whole multipoint network, a cluster, or a combination of
      clusters).

   o  Delay measurements on a single-packet basis mean that a
      multipoint path is used just to easily couple packets between the
      input and output nodes of a multipoint path, as described in the
      following sections.

8.1.  Delay Measurements on a Multipoint-Paths Basis

8.1.1.  Single-Marking Measurement

   Mean delay and mean delay-variation measurements can also be
   generalized to the case of multipoint flows.  It is possible to
   compute the average one-way delay of packets in one block, a cluster,
   or the entire monitored network.

   The average latency can be measured as the difference between the
   weighted averages of the mean timestamps of the sets of output and
   input nodes.  This means that, in the calculation, it is possible to
   weigh the timestamps by considering the number of packets for each
   endpoint.

8.2.  Delay Measurements on a Single-Packet Basis

8.2.1.  Single- and Double-Marking Measurement

   Delay and delay-variation measurements relative to only one picked
   packet per period (both single and double marked) can be performed in
   the multipoint scenario, with some limitations:

      Single marking based on the first/last packet of the interval
      would not work, because it would not be possible to agree on the
      first packet of the interval.

      Double marking or multiplexed marking would work, but each
      measurement would only give information about the delay of a
      single path.  However, by repeating the measurement multiple
      times, it is possible to get information about all the paths in
      the multipoint flow.
      This can be done in the case of a point-to-multipoint path, but
      it is more difficult to achieve in the case of a
      multipoint-to-multipoint path because of the multiple source
      routers.

   If we want to perform delay measurements for more than one picked
   packet in the same marking period, and especially if we want delay
   measurements on a multipoint-to-multipoint basis, neither the
   single- nor the double-marking method is useful in the multipoint
   scenario, since they would not be representative of the entire flow.
   The packets can follow different paths with various delays, and in
   general it can be very difficult to recognize marked packets in a
   multipoint-to-multipoint path, especially when there is more than
   one per period.

   A desirable option is to monitor simultaneously all the paths of a
   multipoint path in the same marking period; for this purpose, hashing
   can be used, as reported in the next section.

8.2.2.  Hashing Selection Method

   RFCs 5474 [RFC5474] and 5475 [RFC5475] introduce sampling and
   filtering techniques for IP packet selection.

   The hash-based selection methodologies for delay measurement can work
   in a multipoint-to-multipoint path and MAY be used either coupled to
   mean delay or stand-alone.

   [I-D.mizrahi-ippm-marking] describes how to use the hash method (RFC
   5474 [RFC5474] and RFC 5475 [RFC5475]) combined with the Alternate-
   Marking method for point-to-point flows.  This combination is also
   called Mixed Hashed Marking: the coupling of a marking method and a
   hashing technique is very useful, because the marking batches anchor
   the samples selected with hashing, and this simplifies the
   correlation of the hashing packets along the path.

   It is possible to use a basic-hash or a dynamic-hash method.  One of
   the challenges of the basic approach is that the frequency of the
   sampled packets may vary considerably.
   For this reason, the dynamic approach has been introduced for
   point-to-point flows in order to obtain the desired, almost fixed
   number of samples in each measurement period.  With basic hash-based
   sampling, the number of samples depends on the packet rate, which is
   variable; an almost fixed number of samples per marking period is a
   better option for making regular measurements over time.  In
   hash-based sampling, Alternate Marking is used to create periods, so
   that hash-based samples are divided into batches, which allows
   anchoring the selected samples to their period.  Moreover, in
   dynamic hash-based sampling, the number of samples in each marking
   period is bounded by dynamically adapting the length of the hash
   value.

   In a multipoint environment, the hashing selection MAY be the
   solution for performing delay measurements on specific packets and
   overcoming the single- and double-marking limitations.

9.  A Closed-Loop Performance-Management Approach

   The Multipoint Alternate-Marking framework that is introduced in this
   document adds flexibility to Performance Management (PM), because it
   can reduce the order of magnitude of the packet counters.  This
   allows an SDN orchestrator to supervise, control, and manage PM in
   large networks.

   The monitored network can be considered as a whole or split into
   clusters that are the smallest subnetworks (group-to-group segments),
   maintaining the packet-loss property for each subnetwork.  The
   clusters can also be combined in new, connected subnetworks at
   different levels, depending on the detail we want to achieve.

   An SDN controller or a Network Management System (NMS) can calibrate
   performance measurements, since they are aware of the network
   topology.  They can start without examining the network in depth.
   If necessary (packet loss is measured or the delay is too high), the
   filtering criteria can be immediately reconfigured in order to
   partition the network by using clusters and/or different
   combinations of clusters.  In this way, the problem can be localized
   in a specific cluster or a single combination of clusters, and a more
   detailed analysis can be performed step by step by successive
   approximation, down to a detailed analysis of a point-to-point flow.
   This is the so-called "closed loop".

   This approach can be called "network zooming" and can be performed in
   two different ways:

   1) change the traffic filter and select more detailed flows;

   2) activate new measurement points by defining more specific
      clusters.

   The network-zooming approach implies that some filters or rules are
   changed and that, therefore, there is a transient time to wait until
   the new network configuration takes effect.  This time can be
   determined by the Network Orchestrator/Controller, based on the
   network conditions.

   For example, if the network zooming identifies the performance
   problem for the traffic coming from a specific source, we need to
   recognize the marked signal from this specific source node and its
   relative path.  For this purpose, we can activate all the available
   measurement points and better specify the flow filter criteria (i.e.,
   5-tuple).  As an alternative, it can be enough to select packets from
   the specific source for delay measurements; in this case, it is
   possible to apply the hashing technique, as mentioned in the previous
   sections.

   [I-D.song-opsawg-ifit-framework] defines an architecture where the
   centralized Data Collector and Network Management can apply the
   intelligent and flexible Alternate-Marking algorithm as previously
   described.
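
   The zooming step of the closed loop can be sketched as follows.  This
   is only an illustration under stated assumptions: the packet-loss
   property of a subnetwork is taken to mean that, over one marking
   period, the loss equals the packets counted at its input nodes minus
   the packets counted at its output nodes, and the data layout, the
   node and cluster names, and the functions packet_loss() and zoom()
   are hypothetical.

```python
def packet_loss(inputs, outputs):
    """Loss in one marking period: packets counted in minus counted out."""
    return sum(inputs.values()) - sum(outputs.values())


def zoom(network, threshold=0):
    """Return the names of the clusters whose loss exceeds `threshold`.

    The network is first checked as a whole; the per-cluster counters
    are examined only if the network-wide loss exceeds the threshold.
    """
    if packet_loss(network["inputs"], network["outputs"]) <= threshold:
        return []  # nothing to localize, stay at the coarse level
    return [
        name
        for name, cluster in network["clusters"].items()
        if packet_loss(cluster["inputs"], cluster["outputs"]) > threshold
    ]


# Hypothetical per-node counters for one marking period.
network = {
    "inputs": {"R1": 100, "R2": 50},
    "outputs": {"R5": 80, "R6": 65},
    "clusters": {
        "C1": {"inputs": {"R1": 100, "R2": 50},
               "outputs": {"R3": 90, "R4": 60}},
        "C2": {"inputs": {"R3": 90, "R4": 60},
               "outputs": {"R5": 80, "R6": 65}},
    },
}
suspects = zoom(network)  # ["C2"]: 150 packets enter C2, 145 leave it
```

   A controller could then repeat the same step inside each suspect
   cluster with a narrower flow filter, after waiting for the transient
   time mentioned above.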

   As in [I-D.fioccola-rfc8321bis], it is possible to classify the
   traffic and mark a portion of the total traffic.  For each period,
   the packet rate and bandwidth are calculated from the number of
   packets.  In this way, the network orchestrator becomes aware if the
   traffic rate surpasses limits.  In addition, more precision can be
   obtained by reducing the marking period; indeed, some implementations
   use a marking period of 1 sec or less.

   In addition, an SDN controller could also collect the measurement
   history.

   It is important to mention that the Multipoint Alternate Marking
   framework also helps Traffic Visualization.  Indeed, this methodology
   is very useful for identifying which path or cluster is crossed by
   the flow.

10.  Security Considerations

   This document specifies a method of performing measurements that does
   not directly affect Internet security or applications that run on the
   Internet.  However, implementations of this method must be mindful of
   security and privacy concerns, as explained in
   [I-D.fioccola-rfc8321bis].

11.  IANA Considerations

   This document has no IANA actions.

12.  Contributors

   Greg Mirsky
   Ericsson
   Email: gregimirsky@gmail.com

   Tal Mizrahi
   Huawei Technologies
   Email: tal.mizrahi.phd@gmail.com

   Xiao Min
   ZTE Corp.
   Email: xiao.min2@zte.com.cn

13.  Acknowledgements

   The authors would like to thank Martin Duke and Tommy Pauly for their
   assistance and their detailed and valuable reviews.

14.  References

14.1.  Normative References

   [I-D.fioccola-rfc8321bis]
              Fioccola, G., Cociglio, M., Mirsky, G., and T. Mizrahi,
              "Alternate-Marking Method", draft-fioccola-rfc8321bis-00
              (work in progress), November 2021.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC5474]  Duffield, N., Ed., Chiou, D., Claise, B., Greenberg, A.,
              Grossglauser, M., and J. Rexford, "A Framework for Packet
              Selection and Reporting", RFC 5474, DOI 10.17487/RFC5474,
              March 2009.

   [RFC5475]  Zseby, T., Molina, M., Duffield, N., Niccolini, S., and F.
              Raspall, "Sampling and Filtering Techniques for IP Packet
              Selection", RFC 5475, DOI 10.17487/RFC5475, March 2009.

   [RFC5644]  Stephan, E., Liang, L., and A. Morton, "IP Performance
              Metrics (IPPM): Spatial and Multicast", RFC 5644,
              DOI 10.17487/RFC5644, October 2009.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

14.2.  Informative References

   [I-D.ietf-ippm-route]
              Alvarez-Hamelin, J. I., Morton, A., Fabini, J., Pignataro,
              C., and R. Geib, "Advanced Unidirectional Route Assessment
              (AURA)", draft-ietf-ippm-route-10 (work in progress),
              August 2020.

   [I-D.mizrahi-ippm-marking]
              Mizrahi, T., Fioccola, G., Cociglio, M., Chen, M., and G.
              Mirsky, "Marking Methods for Performance Measurement",
              draft-mizrahi-ippm-marking-00 (work in progress), October
              2021.

   [I-D.song-opsawg-ifit-framework]
              Song, H., Qin, F., Chen, H., Jin, J., and J. Shin, "In-
              situ Flow Information Telemetry", draft-song-opsawg-ifit-
              framework-16 (work in progress), October 2021.

   [I-D.zhou-ippm-enhanced-alternate-marking]
              Zhou, T., Fioccola, G., Liu, Y., Lee, S., Cociglio, M.,
              and W. Li, "Enhanced Alternate Marking Method", draft-
              zhou-ippm-enhanced-alternate-marking-07 (work in
              progress), July 2021.

   [IEEE-ACM-ToN-MPNPM]
              IEEE/ACM Transactions on Networking, "Multipoint Passive
              Monitoring in Packet Networks",
              DOI 10.1109/TNET.2019.2950157, 2019.

   [IEEE-Network-PNPM]
              IEEE Network, "AM-PM: Efficient Network Telemetry using
              Alternate Marking", DOI 10.1109/MNET.2019.1800152, 2019.

   [RFC7011]  Claise, B., Ed., Trammell, B., Ed., and P. Aitken,
              "Specification of the IP Flow Information Export (IPFIX)
              Protocol for the Exchange of Flow Information", STD 77,
              RFC 7011, DOI 10.17487/RFC7011, September 2013.

   [RFC8889]  Fioccola, G., Ed., Cociglio, M., Sapio, A., and R. Sisto,
              "Multipoint Alternate-Marking Method for Passive and
              Hybrid Performance Monitoring", RFC 8889,
              DOI 10.17487/RFC8889, August 2020.

Appendix A.  Changes Log

   Changes from RFC 8889 include:

   o  Minor editorial changes

   o  Removed section on "Examples of application"

   Changes in v-(01) include:

   o  Considerations on BUM traffic

   o  Reference to RFC8321bis for the fragmentation part

   o  Revised section on "Delay Measurements on a Single-Packet Basis"

   o  Revised section on "Timing Aspects"

Authors' Addresses

   Giuseppe Fioccola (editor)
   Huawei Technologies
   Riesstrasse, 25
   Munich  80992
   Germany

   Email: giuseppe.fioccola@huawei.com

   Mauro Cociglio
   Telecom Italia
   Via Reiss Romoli, 274
   Torino  10148
   Italy

   Email: mauro.cociglio@telecomitalia.it

   Amedeo Sapio
   Intel Corporation
   4750 Patrick Henry Dr.
   Santa Clara, CA  95054
   USA

   Email: amedeo.sapio@intel.com

   Riccardo Sisto
   Politecnico di Torino
   Corso Duca degli Abruzzi, 24
   Torino  10129
   Italy

   Email: riccardo.sisto@polito.it

   Tianran Zhou
   Huawei Technologies
   156 Beiqing Rd.
   Beijing  100095
   China

   Email: zhoutianran@huawei.com