Network Working Group                                   G. Fioccola, Ed.
Internet-Draft                                       Huawei Technologies
Obsoletes: 8889 (if approved)                                M. Cociglio
Intended status: Standards Track                          Telecom Italia
Expires: October 28, 2022                                       A. Sapio
                                                       Intel Corporation
                                                                R. Sisto
                                                   Politecnico di Torino
                                                                 T. Zhou
                                                     Huawei Technologies
                                                          April 26, 2022

                  Multipoint Alternate-Marking Method
                     draft-ietf-ippm-rfc8889bis-00

Abstract

   This document generalizes and expands Alternate-Marking methodology
   to measure any kind of unicast flow whose packets can follow several
   different paths in the network -- in wider terms, a
   multipoint-to-multipoint network.  For this reason, the technique
   here described is called "Multipoint Alternate Marking".  This
   document obsoletes RFC 8889.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 28, 2022.

Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Terminology
     2.1.  Correlation with RFC 5644
   3.  Flow Classification
   4.  Extension of the Method to Multipoint Flows
     4.1.  Monitoring Network
     4.2.  Network Packet Loss
   5.  Network Clustering
     5.1.  Algorithm for Clusters Partition
   6.  Multipoint Packet Loss Measurement
   7.  Multipoint Delay and Delay Variation
     7.1.  Delay Measurements on a Multipoint-Paths Basis
       7.1.1.  Single-Marking Measurement
     7.2.  Delay Measurements on a Single-Packet Basis
       7.2.1.  Single- and Double-Marking Measurement
       7.2.2.  Hashing Selection Method
   8.  Synchronization and Timing
   9.  Results of the Multipoint Alternate Marking Experiment
   10. A Closed-Loop Performance-Management Approach
   11. Security Considerations
   12. IANA Considerations
   13. Contributors
   14. Acknowledgements
   15. References
     15.1.  Normative References
     15.2.  Informative References
   Appendix A.  Changes Log
   Authors' Addresses

1.  Introduction

   The Alternate-Marking method, as described in
   [I-D.fioccola-rfc8321bis], is applicable to a point-to-point path.
   The extension proposed in this document applies to the most general
   case of a multipoint-to-multipoint path and enables flexible and
   adaptive performance measurements in a managed network.

   The Alternate-Marking methodology described in
   [I-D.fioccola-rfc8321bis] allows the synchronization of the
   measurements at different points by dividing the packet flow into
   batches.  It is therefore possible to get coherent counters and show
   what is happening in every marking period for each monitored flow.
   The monitoring parameters are the packet counter and timestamps of a
   flow for each marking period.  Note that additional details about the
   applicability of the Alternate-Marking methodology are described in
   [I-D.fioccola-rfc8321bis], while implementation details can be found
   in the paper "AM-PM: Efficient Network Telemetry using Alternate
   Marking" [IEEE-Network-PNPM].

   There are some applications of the Alternate-Marking method where
   there are a lot of monitored flows and nodes.  Multipoint Alternate
   Marking aims to reduce these numbers and makes the performance
   monitoring more flexible in case a detailed analysis is not needed.
   For instance, by considering n measurement points and m monitored
   flows, the order of magnitude of the packet counters for each time
   interval is n*m*2 (1 per color).  The number of measurement points
   and monitored flows may vary and depends on the portion of the
   network being monitored (core network, metro network, access
   network) and the granularity (for each service, each customer).  If
   both n and m are large, the number of packet counters grows quickly,
   and Multipoint Alternate Marking offers a tool to control these
   parameters.

   The approach presented in this document is applied only to unicast
   flows and not to multicast.
   Broadcast, Unknown Unicast, and Multicast (BUM) traffic is not
   considered here, because traffic replication is not covered by the
   Multipoint Alternate-Marking method.  Furthermore, the method is
   applicable to anycast flows, and Equal-Cost Multipath (ECMP) paths
   can also be easily monitored with this technique.

   [I-D.fioccola-rfc8321bis] applies to point-to-point unicast flows and
   BUM traffic.  For BUM traffic, the basic method of
   [I-D.fioccola-rfc8321bis] can easily be applied link by link,
   thereby splitting the multicast flow tree distribution into separate
   unicast point-to-point links.  This document and its Clustered
   Alternate-Marking method, by contrast, are valid for
   multipoint-to-multipoint unicast flows, anycast, and ECMP flows.

   Therefore, the Alternate-Marking method can be extended to any kind
   of multipoint-to-multipoint paths, and the network-clustering
   approach presented in this document is the formalization of how to
   implement this property and allow a flexible and optimized
   performance measurement support for network management in every
   situation.

   Without network clustering, it is possible to apply Alternate Marking
   only to the whole network or to a single flow.  With network
   clustering, instead, it is possible to use the partition of the
   network into clusters at different levels in order to provide the
   needed degree of detail.  In some circumstances, it is possible to
   monitor a multipoint network by monitoring the network clusters,
   without examining them in depth.  In case of problems (packet loss is
   measured or the delay is too high), the filtering criteria could be
   enhanced in order to perform a detailed analysis by using a different
   combination of clusters up to a per-flow measurement as described in
   [I-D.fioccola-rfc8321bis].

   This approach fits very well with the Closed-Loop Network and
   Software-Defined Network (SDN) paradigm, where the SDN orchestrator
   and the SDN controllers are the brains of the network and can manage
   flow control to the switches and routers and, in the same way, can
   calibrate the performance measurements depending on the desired
   accuracy.  An SDN controller application can orchestrate how
   accurately the network performance monitoring is set up by applying
   the Multipoint Alternate Marking as described in this document.

   It is important to underline that, as an extension of
   [I-D.fioccola-rfc8321bis], this is a methodology document, so the
   mechanism that can be used to transmit the counters and the
   timestamps is out of scope here, and the implementation is open.
   Several options are possible -- e.g., see "Enhanced Alternate Marking
   Method" [I-D.zhou-ippm-enhanced-alternate-marking].

   This document assumes that the blocks are created according to a
   fixed timer as per [I-D.fioccola-rfc8321bis].  Switching after a
   fixed number of packets is possible, but it is out of scope here.

   Note that the fragmented packets case can be managed with the
   Alternate-Marking methodology.  The same considerations of
   [I-D.fioccola-rfc8321bis] apply also in the case of Multipoint
   Alternate Marking.  As defined in [I-D.fioccola-rfc8321bis], the
   marking node MUST mark all the fragments except in the case of
   fragmentation within the network domain; in that event, it is
   suggested to mark only the first fragment.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.  Terminology

   The definitions of the basic terms are identical to those found in
   Alternate Marking [I-D.fioccola-rfc8321bis].  Recall that
   [I-D.fioccola-rfc8321bis] is valid for point-to-point unicast
   flows and BUM traffic.

   The important new terms that need to be explained are listed below:

   Multipoint Alternate Marking:  Extension to
      [I-D.fioccola-rfc8321bis], valid for multipoint-to-multipoint
      unicast flows, anycast, and ECMP flows.  It can also be referred
      to as Clustered Alternate Marking.

   Flow definition:  The concept of flow is generalized in this
      document.  The identification fields are selected without any
      constraints and, in general, the flow can be a
      multipoint-to-multipoint flow, as a result of aggregate
      point-to-point flows.

   Monitoring Network:  Identified with the nodes of the network that
      are the measurement points (MPs) and the links that are the
      connections between MPs.  The monitoring network graph depends on
      the flow definition, so it can represent a specific flow or the
      entire network topology as the aggregate of all the flows.

   Cluster:  Smallest identifiable subnetwork of the entire monitoring
      network graph that still satisfies the condition that the number
      of packets that go in is the same as the number that go out.

   Multipoint metrics:  Packet loss, delay, and delay variation are
      extended to the case of multipoint flows.  It is possible to
      compute these metrics on the basis of multipoint paths in order to
      associate the measurements to a cluster, a combination of
      clusters, or the entire monitored network.  For delay and delay
      variation, it is also possible to define the metrics on a
      single-packet basis, which means that the multipoint path is used
      to easily couple packets between input and output nodes of a
      multipoint path.

   The next section highlights the correlation with the terms used in
   RFC 5644 [RFC5644].

2.1.  Correlation with RFC 5644

   RFC 5644 [RFC5644] is limited to active measurements using a single
   source packet or stream.  Its scope is also limited to observations
   of corresponding packets along the path (spatial metric) and at one
   or more destinations (one-to-group) along the path.

   Instead, the scope of this memo is to define multiparty metrics for
   passive and hybrid measurements in a group-to-group topology with
   multiple sources and destinations.

   RFC 5644 [RFC5644] introduces metric names that can be reused here
   but have to be extended and rephrased to be applied to the
   Alternate-Marking schema:

   a.  the multiparty metrics are not only one-to-group metrics but can
       be also group-to-group metrics;

   b.  the spatial metrics, used for measuring the performance of
       segments of a source to destination path, are applied here to
       group-to-group segments (called clusters).

3.  Flow Classification

   A unicast flow is identified by all the packets having a set of
   common characteristics.  This definition is inspired by RFC 7011
   [RFC7011].

   As an example, by considering a flow as all the packets sharing the
   same source IP address or the same destination IP address, it is easy
   to understand that the resulting pattern will not be a point-to-point
   connection, but a point-to-multipoint or multipoint-to-point
   connection.

   In general, a flow can be defined by a set of selection rules used to
   match a subset of the packets processed by the network device.  These
   rules specify a set of Layer 3 and Layer 4 header fields
   (identification fields) and the relative values that must be found in
   matching packets.

   The choice of the identification fields directly affects the type of
   paths that the flow would follow in the network.  In fact, it is
   possible to relate a set of identification fields with the pattern of
   the resulting graphs, as listed in Figure 1.
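
   The relationship between selection rules and the resulting flow
   pattern can be sketched as follows.  This is a minimal illustration,
   not part of the method: the packet representation (a dictionary of
   header fields), the field names, and the helper functions are all
   assumptions made for this example, and the pattern is inferred only
   from the number of distinct source and destination addresses, whereas
   the actual pattern in a network also depends on routing.

```python
from typing import Dict, List

# Hypothetical packet representation: a dict of identification fields.
Packet = Dict[str, object]


def matches(packet: Packet, rule: Packet) -> bool:
    """True when every field named in the rule has the required value."""
    return all(packet.get(field) == value for field, value in rule.items())


def select_flow(packets: List[Packet], rule: Packet) -> List[Packet]:
    """Return the packets matched by a selection rule (the monitored flow)."""
    return [p for p in packets if matches(p, rule)]


def endpoint_pattern(flow: List[Packet]) -> str:
    """Classify a flow by its number of distinct sources and destinations.

    Simplifying assumption: one address stands for one ingress/egress point.
    """
    sources = {p["src"] for p in flow}
    destinations = {p["dst"] for p in flow}
    left = "point" if len(sources) == 1 else "multipoint"
    right = "point" if len(destinations) == 1 else "multipoint"
    return f"{left}-to-{right}"


# Illustrative packets using documentation address ranges.
PACKETS: List[Packet] = [
    {"src": "192.0.2.1", "dst": "198.51.100.1", "proto": 6, "sport": 1025, "dport": 80},
    {"src": "192.0.2.1", "dst": "198.51.100.2", "proto": 6, "sport": 1026, "dport": 80},
    {"src": "192.0.2.2", "dst": "198.51.100.1", "proto": 6, "sport": 1027, "dport": 80},
]

by_source = select_flow(PACKETS, {"src": "192.0.2.1"})
by_destination = select_flow(PACKETS, {"dst": "198.51.100.1"})
```

   In this sketch, a rule with fewer identification fields matches more
   packets and therefore tends to produce a wider pattern; an empty rule
   matches all the traffic.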

   A TCP 5-tuple usually identifies flows following either a single path
   or a point-to-point multipath (in the case of load balancing).  On
   the contrary, a single source address selects aggregate flows
   following a point-to-multipoint pattern, while a multipoint-to-point
   pattern can be the result of matching on a single destination
   address.  In the case where a selection rule and its reverse are used
   for bidirectional measurements, they can correspond to a
   point-to-multipoint pattern in one direction and a
   multipoint-to-point pattern in the opposite direction.

   So the flows to be monitored are selected at the monitoring points
   using packet selection rules, which can also change the pattern of
   the monitored network.

   Note that, more generally, the flow can be defined at different
   levels based on the potential encapsulation, and additional
   conditions that are not in the packet header can also be included as
   part of matching criteria.

   The Alternate-Marking method is applicable only to a single path (and
   partially to a one-to-one multipath), so the extension proposed in
   this document is suitable also for the most general case of
   multipoint-to-multipoint, which embraces all the other patterns of
   Figure 1.

   point-to-point single path
       +------+      +------+      +------+
   ---<>  R1  <>----<>  R2  <>----<>  R3  <>---
       +------+      +------+      +------+

   point-to-point multipath
                    +------+
                   <>  R2  <>
                  / +------+ \
                 /            \
       +------+ /              \ +------+
   ---<>  R1  <>                <>  R4  <>---
       +------+ \              / +------+
                 \            /
                  \ +------+ /
                   <>  R3  <>
                    +------+

   point-to-multipoint
                               +------+
                              <>  R4  <>---
                             / +------+
                   +------+ /
                  <>  R2  <>
                 / +------+ \
       +------+ /            \ +------+
   ---<>  R1  <>              <>  R5  <>---
       +------+ \              +------+
                 \ +------+
                  <>  R3  <>
                   +------+ \
                             \ +------+
                              <>  R6  <>---
                               +------+

   multipoint-to-point
       +------+
   ---<>  R1  <>
       +------+ \
                 \ +------+
                  <>  R4  <>
                 / +------+ \
       +------+ /            \ +------+
   ---<>  R2  <>              <>  R6  <>---
       +------+              / +------+
                   +------+ /
                  <>  R5  <>
                 / +------+
       +------+ /
   ---<>  R3  <>
       +------+

   multipoint-to-multipoint
       +------+                +------+
   ---<>  R1  <>              <>  R6  <>---
       +------+ \            / +------+
                 \ +------+ /
                  <>  R4  <>
                   +------+ \
       +------+              \ +------+
   ---<>  R2  <>              <>  R7  <>---
       +------+ \            / +------+
                 \ +------+ /
                  <>  R5  <>
                 / +------+ \
       +------+ /            \ +------+
   ---<>  R3  <>              <>  R8  <>---
       +------+                +------+

                      Figure 1: Flow Classification

   The case of unicast flow is considered in Figure 1.  The anycast flow
   is also in scope, because there is no replication and only a single
   node from the anycast group receives the traffic, so it can be viewed
   as a special case of unicast flow.  Furthermore, an ECMP flow is in
   scope by definition, since it is a point-to-multipoint unicast flow.

4.  Extension of the Method to Multipoint Flows

   By using the Alternate-Marking method, only point-to-point paths can
   be monitored.
   To have an IP (TCP/UDP) flow that follows a point-to-point path, we
   generally have to fix specific values for 5 identification fields
   (IP Source, IP Destination, Transport Protocol, Source Port,
   Destination Port).

   Multipoint Alternate Marking enables the performance measurement for
   multipoint flows selected by identification fields without any
   constraints (even the entire network production traffic).  It is also
   possible to use multiple marking points for the same monitored flow.

4.1.  Monitoring Network

   The monitoring network is deduced from the production network by
   identifying the nodes of the graph that are the measurement points,
   and the links that are the connections between measurement points.

   There are some techniques that can help with the building of the
   monitoring network (as an example, see [I-D.ietf-ippm-route]).  In
   general, there are different options: the monitoring network can be
   obtained by considering all the possible paths for the traffic or by
   periodically checking the traffic (e.g., daily, weekly, monthly) and
   updating the graph as appropriate, but this is up to the Network
   Management System (NMS) configuration.

   So a graph model of the monitoring network can be built according to
   the Alternate-Marking method: the monitored interfaces and links are
   identified.  Only the measurement points and links where the traffic
   has flowed have to be represented in the graph.
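
   One way to picture the last rule is the following sketch, which
   derives the graph from per-link packet observations.  The input
   format (a mapping from directed links to packet counts) and the
   function name are assumptions made for this illustration; the
   document leaves the construction of the graph to the NMS.

```python
from typing import Dict, Set, Tuple

# Hypothetical input: packets observed on each directed link between
# two measurement points during the observation window.
LinkCounts = Dict[Tuple[str, str], int]


def build_monitoring_graph(link_counts: LinkCounts) -> Dict[str, Set[str]]:
    """Keep only the links (and their endpoints) where traffic has flowed."""
    graph: Dict[str, Set[str]] = {}
    for (start, end), packets in link_counts.items():
        if packets > 0:
            graph.setdefault(start, set()).add(end)
            graph.setdefault(end, set())  # make sure egress-only nodes appear
    return graph


observed = {("R1", "R2"): 10, ("R1", "R3"): 0, ("R2", "R4"): 7}
monitoring_graph = build_monitoring_graph(observed)
```

   Here the link R1-R3 carried no traffic, so neither the link nor the
   node R3 appears in the resulting graph.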

   Figure 2 shows a simple example of a monitoring network graph:

                                             +------+
                                            <>  R6  <>---
                                           / +------+
                    +------+     +------+ /
                   <>  R2  <>---<>  R4  <>
                  / +------+ \   +------+ \
                 /            \            \ +------+
       +------+ /   +------+   \ +------+   <>  R7  <>---
   ---<>  R1  <>---<>  R3  <>---<>  R5  <>   +------+
       +------+ \   +------+ \   +------+ \
                 \            \            \ +------+
                  \            \            <>  R8  <>---
                   \            \            +------+
                    \            \
                     \            \ +------+
                      \            <>  R9  <>---
                       \            +------+
                        \
                         \ +------+
                          <>  R10 <>---
                           +------+

                   Figure 2: Monitoring Network Graph

   Each monitoring point is characterized by the packet counter that
   refers only to a marking period of the monitored flow.  Also, it is
   assumed that there is a monitoring point at all possible egress
   points of the multipoint monitored network.

   The same is also applicable for the delay, but it will be described
   in the following sections.

   The rest of the document assumes that the traffic is going from left
   to right in order to simplify the explanation.  But the analysis done
   for one direction applies equally to all directions.

4.2.  Network Packet Loss

   Since all the packets of the considered flow leaving the network have
   previously entered the network, the number of packets counted by all
   the input nodes is always greater than, or equal to, the number of
   packets counted by all the output nodes.  Non-initial fragments are
   not considered here.

   In the case of no packet loss occurring in the marking period, if all
   the input and output points of the network domain to be monitored are
   measurement points, the sum of the number of packets on all the
   ingress interfaces equals the number on egress interfaces for the
   monitored flow.  In this circumstance, if no packet loss occurs, the
   intermediate measurement points only have the task of splitting the
   measurement.

   It is possible to define the Network Packet Loss of one monitored
   flow for a single period.
   In a packet network, the number of lost packets is the number of
   packets counted by the input nodes minus the number of packets
   counted by the output nodes.  This is true for every packet flow in
   each marking period.

   The monitored network packet loss with n input nodes and m output
   nodes is given by:

   PL = (PI1 + PI2 +...+ PIn) - (PO1 + PO2 +...+ POm)

   where:

   PL is the network packet loss (number of lost packets)

   PIi is the number of packets that flowed through the i-th input node
   in this period

   POj is the number of packets that flowed through the j-th output node
   in this period

   The equation is applied on a per-time-interval basis and a per-flow
   basis:

      The reference interval is the Alternate-Marking period, as defined
      in [I-D.fioccola-rfc8321bis].

      The flow definition is generalized here.  Indeed, as described
      before, a multipoint packet flow is considered, and the
      identification fields can be selected without any constraints.

5.  Network Clustering

   The previous equation of Section 4.2 can determine the number of
   packets lost globally in the monitored network, exploiting only the
   data provided by the counters in the input and output nodes.

   In addition, it is also possible to leverage the data provided by the
   other counters in the network to converge on the smallest
   identifiable subnetworks where the losses occur.  These subnetworks
   are named "clusters".

   A cluster graph is a subnetwork of the entire monitoring network
   graph that still satisfies the packet loss equation (introduced in
   the previous section), where PL in this case is the number of packets
   lost in the cluster.  As for the entire monitoring network graph, the
   cluster is defined on a per-flow basis.

   For this reason, a cluster should contain all the arcs emanating from
   its input nodes and all the arcs terminating at its output nodes.
   This ensures that we can count all the packets (and only those)
   exiting an input node again at the output node, whatever path they
   follow.

   In a completely monitored unidirectional network (a network where
   every network interface is monitored), each network device
   corresponds to a cluster, and each physical link corresponds to two
   clusters (one for each device).

   Clusters can have different sizes depending on the flow-filtering
   criteria adopted.

   Moreover, sometimes clusters can be optionally simplified.  For
   example, when two monitored interfaces are divided by a single router
   (one is the input interface, the other is the output interface, and
   the router has only these two interfaces), instead of counting
   exactly twice, upon entering and leaving, it is possible to consider
   a single measurement point.  In this case, we do not care about the
   internal packet loss of the router.

   It is worth highlighting that it might also be convenient to define
   clusters based on the topological information so that they are
   applicable to all the possible flows in the monitored network.

5.1.  Algorithm for Clusters Partition

   A simple algorithm can be applied in order to split our monitoring
   network into clusters.  This can be done for each direction
   separately.  The clusters partition is based on the monitoring
   network graph, which can be valid for a specific flow or can also be
   general and valid for the entire network topology.

   It is a two-step algorithm:

   o  Group the links where there is the same starting node;

   o  Join the grouped links with at least one ending node in common.

   Considering that the links are unidirectional, the first step implies
   listing all the links as connections between two nodes and grouping
   the different links if they have the same starting node.  Note that
   it is possible to start from any link, and the procedure will work.
   Following this classification, the second step implies joining, where
   applicable, the groups classified in the first step by looking at the
   ending nodes.  If different groups have at least one common ending
   node, they are put together and belong to the same set.  After the
   application of the two steps of the algorithm, each one of the
   composed sets of links, together with the endpoint nodes, constitutes
   a cluster.

   In our monitoring network graph example, it is possible to identify
   the clusters partition by applying this two-step algorithm.

   The first step identifies the following groups:

   1.  Group 1: (R1-R2), (R1-R3), (R1-R10)

   2.  Group 2: (R2-R4), (R2-R5)

   3.  Group 3: (R3-R5), (R3-R9)

   4.  Group 4: (R4-R6), (R4-R7)

   5.  Group 5: (R5-R8)

   And then, the second step builds the clusters partition (in
   particular, we can underline that Groups 2 and 3 connect together,
   since R5 is in common):

   1.  Cluster 1: (R1-R2), (R1-R3), (R1-R10)

   2.  Cluster 2: (R2-R4), (R2-R5), (R3-R5), (R3-R9)

   3.  Cluster 3: (R4-R6), (R4-R7)

   4.  Cluster 4: (R5-R8)

   The flow direction here considered is from left to right.  For the
   opposite direction, the same reasoning can be applied, and in this
   example, you get the same clusters partition.
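
   The two-step algorithm and the example above can be sketched in a
   few lines.  This is one possible iterative realization under stated
   assumptions: links are represented as (start, end) tuples, and the
   function and variable names are chosen only for this illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Link = Tuple[str, str]  # unidirectional link: (starting node, ending node)


def cluster_partition(links: List[Link]) -> List[Set[Link]]:
    """Split a monitoring network graph into clusters with the two steps."""
    # Step 1: group the links that have the same starting node.
    groups: Dict[str, List[Link]] = defaultdict(list)
    for start, end in links:
        groups[start].append((start, end))

    # Step 2: join the groups that have at least one ending node in common.
    # Invariant: the clusters built so far have pairwise disjoint ending nodes.
    clusters: List[Tuple[Set[str], Set[Link]]] = []
    for group in groups.values():
        ends = {end for _, end in group}
        members = set(group)
        untouched = []
        for cluster_ends, cluster_links in clusters:
            if cluster_ends & ends:
                ends |= cluster_ends
                members |= cluster_links
            else:
                untouched.append((cluster_ends, cluster_links))
        untouched.append((ends, members))
        clusters = untouched
    return [cluster_links for _, cluster_links in clusters]


# Links of the monitoring network graph of Figure 2 (left to right).
FIGURE_2_LINKS: List[Link] = [
    ("R1", "R2"), ("R1", "R3"), ("R1", "R10"),
    ("R2", "R4"), ("R2", "R5"),
    ("R3", "R5"), ("R3", "R9"),
    ("R4", "R6"), ("R4", "R7"),
    ("R5", "R8"),
]

clusters = cluster_partition(FIGURE_2_LINKS)
```

   Applied to the links of Figure 2, the sketch yields four sets of
   links that correspond to Clusters 1 to 4 listed above; Groups 2 and
   3 are joined because they share the ending node R5.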

   In the end, the following 4 clusters are obtained:

   Cluster 1
                    +------+
                   <>  R2  <>---
                  / +------+
                 /
       +------+ /   +------+
   ---<>  R1  <>---<>  R3  <>---
       +------+ \   +------+
                 \
                  \
                   \
                    \
                     \
                      \
                       \
                        \
                         \ +------+
                          <>  R10 <>---
                           +------+

   Cluster 2
                    +------+     +------+
                ---<>  R2  <>---<>  R4  <>---
                    +------+ \   +------+
                              \
                    +------+   \ +------+
                ---<>  R3  <>---<>  R5  <>---
                    +------+ \   +------+
                              \
                               \
                                \
                                 \
                                  \ +------+
                                   <>  R9  <>---
                                    +------+

   Cluster 3
                                             +------+
                                            <>  R6  <>---
                                           / +------+
                                 +------+ /
                             ---<>  R4  <>
                                 +------+ \
                                           \ +------+
                                            <>  R7  <>---
                                             +------+

   Cluster 4
                                 +------+
                             ---<>  R5  <>
                                 +------+ \
                                           \ +------+
                                            <>  R8  <>---
                                             +------+

                       Figure 3: Clusters Example

   There are clusters with more than two nodes as well as two-node
   clusters.  In the two-node clusters, the loss is on the link (Cluster
   4).  In more-than-two-node clusters, the loss is on the cluster, but
   we cannot know in which link (Cluster 1, 2, or 3).

   The algorithm, as applied in this example of a point-to-multipoint
   network, works for the more general case of multipoint-to-multipoint
   network in the same way.  It should be highlighted that for a
   multipoint-to-multipoint network the multiple sources MUST mark
   the traffic coherently and MUST be synchronized with all the other
   nodes according to the timing requirements detailed in Section 8.

   When the clusters partition is done, the calculation of packet loss,
   delay, and delay variation can be made on a cluster basis.  Note that
   the packet counters for each marking period permit calculating the
   packet rate on a cluster basis, so Committed Information Rate (CIR)
   and Excess Information Rate (EIR) could also be deduced on a cluster
   basis.

   Obviously, by combining some clusters in a new connected subnetwork
   the packet-loss rule is still true.
   So it is also possible to consider combinations of clusters where
   that is convenient.

   In this way, in a very large network, there is no need to configure
   detailed filter criteria to inspect the traffic.  It is possible to
   check a multipoint network and, in case of problems, go deep with a
   step-by-step cluster analysis, but only for the cluster or
   combination of clusters where the problem happens.

   In summary, once a flow is defined, the algorithm to build the
   clusters partition is based on topological information; therefore, it
   considers all the possible links and nodes crossed by the given flow,
   even if there is no traffic.  So, if the flow does not enter or
   traverse all the nodes, the counters have a non-zero value for the
   involved nodes and a zero value for the other nodes without traffic;
   but in the end, all the formulas are still valid.

   The algorithm described above is an iterative clustering
   algorithm, but it is also possible to apply a recursive clustering
   algorithm by using the node-node adjacency matrix representation
   [IEEE-ACM-ToN-MPNPM].

   The complete and mathematical analysis of the possible algorithms for
   clusters partition, including the considerations in terms of
   efficiency and a comparison between the different methods, is in the
   paper [IEEE-ACM-ToN-MPNPM].

6.  Multipoint Packet Loss Measurement

   The Network Packet Loss, defined in Section 4.2 and valid for an
   entire monitored flow, can easily be extended to each multipoint path
   (e.g., the whole multipoint network, a cluster, or a combination of
   clusters).  In this way it is possible to calculate a Multipoint
   Packet Loss that is representative of a multipoint path.

   The same equation of Section 4.2 can be applied to a generic
   multipoint path like a cluster or a combination of clusters, where
   the packets counted are those entering and leaving the multipoint
   path.
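
   As an illustration, the equation of Section 4.2 can be evaluated for
   Cluster 2 of Figure 3, whose input nodes are R2 and R3 and whose
   output nodes are R4, R5, and R9.  The counter values below are
   invented for this example, and the function name and the
   one-counter-per-measurement-point layout are assumptions of the
   sketch.

```python
from typing import Dict, Iterable


def multipoint_packet_loss(
    counters: Dict[str, int],
    input_nodes: Iterable[str],
    output_nodes: Iterable[str],
) -> int:
    """PL = (PI1 + ... + PIn) - (PO1 + ... + POm) for one marking period."""
    packets_in = sum(counters[node] for node in input_nodes)
    packets_out = sum(counters[node] for node in output_nodes)
    return packets_in - packets_out


# Hypothetical counters read at the end of one marking period.
period_counters = {"R2": 100, "R3": 150, "R4": 60, "R5": 120, "R9": 68}

cluster_2_loss = multipoint_packet_loss(
    period_counters, input_nodes=["R2", "R3"], output_nodes=["R4", "R5", "R9"]
)
```

   With these invented values, 250 packets enter the cluster and 248
   leave it, so the loss attributed to Cluster 2 in this period is 2
   packets; the counters alone do not tell on which link they were lost.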

   By applying the algorithm described in Section 5.1, it is possible to
   split the monitoring network into clusters.  Then, packet loss can be
   measured on a cluster basis for each single period by considering the
   counters of the input and output nodes that belong to the specific
   cluster.  This can be done for every packet flow in each marking
   period.

7.  Multipoint Delay and Delay Variation

   The same line of reasoning can be applied to delay and delay
   variation.  Similarly to the delay measurements defined in
   [I-D.fioccola-rfc8321bis], the marking batches anchor the samples to
   a particular period, and this is the time reference that can be used.
   It is important to highlight that both delay and delay-variation
   measurements make sense in a multipoint path.  The delay variation is
   calculated by considering the same packets selected for measuring the
   delay.

   In general, it is possible to perform delay and delay-variation
   measurements on the basis of multipoint paths or single packets:

   o  Delay measurements on the basis of multipoint paths mean that the
      delay value is representative of an entire multipoint path (e.g.,
      the whole multipoint network, a cluster, or a combination of
      clusters).

   o  Delay measurements on a single-packet basis mean that you can use
      a multipoint path just to easily couple packets between input and
      output nodes of a multipoint path, as described in the following
      sections.

7.1.  Delay Measurements on a Multipoint-Paths Basis

7.1.1.  Single-Marking Measurement

   Mean delay and mean delay-variation measurements can also be
   generalized to the case of multipoint flows.  It is possible to
   compute the average one-way delay of packets in one block, a cluster,
   or the entire monitored network.

   The average latency can be measured as the difference between the
   weighted averages of the mean timestamps of the sets of output and
   input nodes.
   This means that, in the calculation, it is possible to weigh the
   timestamps by considering the number of packets for each endpoint.

   Note that, since the one-way delay value is representative of a
   multipoint path, it is possible to calculate the two-way delay of a
   multipoint path by summing the one-way delays of the two directions,
   similarly to [I-D.fioccola-rfc8321bis].

7.2.  Delay Measurements on a Single-Packet Basis

7.2.1.  Single- and Double-Marking Measurement

   Delay and delay-variation measurements relative to only one picked
   packet per period (both single and double marked) can be performed in
   the multipoint scenario, with some limitations:

      Single marking based on the first/last packet of the interval
      would not work, because it would not be possible to agree on the
      first packet of the interval.

      Double marking or multiplexed marking would work, but each
      measurement would only give information about the delay of a
      single path. However, by repeating the measurement multiple
      times, it is possible to get information about all the paths in
      the multipoint flow. This can be done in the case of a
      point-to-multipoint path, but it is more difficult to achieve in
      the case of a multipoint-to-multipoint path because of the
      multiple source routers.

   If we want to perform a delay measurement for more than one picked
   packet in the same marking period, and especially if we want to get
   delay measurements on a multipoint-to-multipoint basis, neither the
   single- nor the double-marking method is useful in the multipoint
   scenario, since they would not be representative of the entire flow.
   The packets can follow different paths with various delays, and in
   general it can be very difficult to recognize marked packets in a
   multipoint-to-multipoint path, especially in the case when there is
   more than one per period.

   A desirable option is to monitor simultaneously all the paths of a
   multipoint path in the same marking period; for this purpose, hashing
   can be used, as reported in the next section.

   Note that, since the one-way delay measurement is done on a
   single-packet basis, it is always possible to calculate the two-way
   delay, but it is not immediate, since it is necessary to couple the
   measurement on each single path with the opposite direction. In this
   case, the NMS can do the calculation.

7.2.2.  Hashing Selection Method

   RFCs 5474 [RFC5474] and 5475 [RFC5475] introduce sampling and
   filtering techniques for IP packet selection.

   The hash-based selection methodologies for delay measurement can work
   in a multipoint-to-multipoint path and MAY be used either coupled to
   mean delay or stand-alone.

   [I-D.mizrahi-ippm-marking] describes how to use the hash method (RFC
   5474 [RFC5474] and RFC 5475 [RFC5475]) combined with the Alternate-
   Marking method for point-to-point flows. It is also called Mixed
   Hashed Marking: the coupling of a marking method and hashing
   technique is very useful, because the marking batches anchor the
   samples selected with hashing, and this simplifies the correlation of
   the hashing packets along the path.

   It is possible to use a basic-hash or a dynamic-hash method. One of
   the challenges of the basic approach is that the frequency of the
   sampled packets may vary considerably. For this reason, the dynamic
   approach has been introduced for point-to-point flows in order to
   have the desired and almost fixed number of samples for each
   measurement period. With hash-based sampling, the number of
   samples may vary a lot because it depends on the packet rate, which
   is variable. The dynamic approach helps to have an almost fixed
   number of samples for each marking period, and this is a better
   option for making regular measurements over time.
   In the hash-based sampling, Alternate Marking is used to create
   periods, so that hash-based samples are divided into batches, which
   allows anchoring the selected samples to their period. Moreover, in
   the dynamic hash-based sampling, by dynamically adapting the length
   of the hash value, the number of samples is bounded in each marking
   period.

   In a multipoint environment, the hashing selection MAY be the
   solution for performing delay measurements on specific packets and
   overcoming the single- and double-marking limitations.

8.  Synchronization and Timing

   It is important to consider the timing aspects, since out-of-order
   packets happen and have to be handled as well, as described in
   [I-D.fioccola-rfc8321bis].

   However, in a multisource situation, an additional issue has to be
   considered. With a multipoint path, the egress nodes will receive
   alternate marked packets in random order from different ingress
   nodes, and this must not affect the measurement.

   So, if we analyze a multipoint-to-multipoint path with more than one
   marking node, it is important to recognize the reference measurement
   interval. In general, the measurement interval for describing the
   results is the interval of the marking node that is most aligned with
   the start of the measurement, as reported in Figure 4.

   Note that the mark switching approach based on a fixed timer is
   considered in this document.

   time -> start         stop
   T(R1)   |-------------|
   T(R2)     |-------------|
   T(R3)        |------------|

                      Figure 4: Measurement Interval

   In Figure 4, it is assumed that the node with the earliest clock (R1)
   identifies the right starting and ending times of the measurement,
   but it is just an assumption, and other possibilities could occur.
   So, in this case, T(R1) is the measurement interval, and its
   recognition is essential in order to make comparisons with other
   active/passive/hybrid Packet Loss metrics.

   Regarding the timing constraints of the methodology,
   [I-D.fioccola-rfc8321bis] already describes two contributions that
   are taken into account: the clock error between network devices and
   the network delay between the measurement points.

   When we expand to a multipoint environment, we have to consider that
   there are more marking nodes that mark the traffic based on
   synchronized clock time. But, due to different synchronization
   issues that may happen, the marking batches can be of different
   lengths and with different offsets when they get mixed in a
   multipoint flow. The additional gap that results between the sources
   can be incorporated into A, which is the maximum clock skew between
   the network devices, as already defined in [I-D.fioccola-rfc8321bis].

   ...BBBBBBBBB | AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA | BBBBBBBBB...
                |<======================================>|
                |                   L                    |
   ...=========>|<==================><==================>|<==========...
                |       L/2                   L/2        |
                |<====>|                          |<====>|
                   d   |                          |   d
                       |<========================>|
                        available counting interval

                         Figure 5: Timing Aspects

   Moreover, it is assumed that each path of the multipoint flow can
   still be represented with a distinct normal distribution. So, for
   the aggregate multipoint path, the combination of normal
   distributions results in a new normal distribution. Under this
   assumption, the definition of the guard band d is still applicable as
   defined in [I-D.fioccola-rfc8321bis] and is given by:

   d = A + D_avg + 3*D_stddev,

   where A is the clock accuracy, D_avg is the average value of the
   network delay, and D_stddev is the standard deviation of the delay.
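
   As a worked illustration only, the guard band and the counting
   interval drawn in Figure 5 can be computed as in the following
   Python sketch. It assumes that all quantities are expressed in the
   same time unit (milliseconds here) and that, as in Figure 5, a guard
   band d is removed at both ends of a marking period of length L. The
   function names and the numeric values are hypothetical.

```python
def guard_band(clock_error, delay_avg, delay_stddev):
    """Guard band d = A + D_avg + 3*D_stddev (all in the same unit)."""
    return clock_error + delay_avg + 3 * delay_stddev


def available_counting_interval(period_length, d):
    """Part of a marking period of length L left after removing the
    guard band d at both ends, as drawn in Figure 5."""
    return period_length - 2 * d


# Hypothetical values in milliseconds.
A = 1          # clock error between network devices
D_avg = 20     # average network delay
D_stddev = 3   # standard deviation of the network delay
L = 1000       # marking period

d = guard_band(A, D_avg, D_stddev)
interval = available_counting_interval(L, d)
print(d, interval)
```

   With these example values, a marking period of 1000 ms leaves most of
   the period available for reading the counters; a much shorter period
   or a much larger delay spread would shrink that interval.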

   As shown in Figure 5 and according to [I-D.fioccola-rfc8321bis], the
   condition that must be satisfied to enable the method to function
   properly is that the available counting interval must be > 0, and
   that means:

   L - 2d > 0.

   This formula needs to be verified for each measurement point on the
   multipoint path.

   Note that the timing considerations are valid for both packet loss
   and delay measurements.

9.  Results of the Multipoint Alternate Marking Experiment

   The methodology described in the previous sections can be applied to
   various performance measurement problems, as also explained in
   [I-D.fioccola-rfc8321bis].

   Either one or two flag bits might be available for marking in
   different deployments:

      One flag: packet loss measurement SHOULD be done as described in
      Section 6 by applying the network clustering partition described
      in Section 5. Delay measurement MAY be done according to the
      Mean delay calculation representative of the multipoint path, as
      described in Section 7.1.1. The single-marking method based on
      the first/last packet of the interval cannot be applied, as
      mentioned in Section 7.2.1.

      Two flags: packet loss measurement SHOULD be done as described in
      Section 6 by applying the network clustering partition described
      in Section 5. Delay measurement SHOULD be done on a
      single-packet basis according to the double-marking method
      (Section 7.2.1). In this case the Mean delay calculation
      (Section 7.1.1) MAY also be used as a representative value of a
      multipoint path.

      One flag and hash-based selection: packet loss measurement SHOULD
      be done as described in Section 6 by applying the network
      clustering partition described in Section 5. Hash-based selection
      methodologies, introduced in Section 7.2.2, MAY be used for delay
      measurement.

   The experiment with Multipoint Alternate Marking methodologies
   confirmed the benefits of the Alternate Marking methodology described
   in [I-D.fioccola-rfc8321bis], as its extension to the general case of
   multipoint-to-multipoint scenarios.

   The Multipoint Alternate Marking Method is RECOMMENDED only for
   controlled domains, as per [I-D.fioccola-rfc8321bis].

10.  A Closed-Loop Performance-Management Approach

   The Multipoint Alternate-Marking framework that is introduced in this
   document adds flexibility to Performance Management (PM), because it
   can reduce the order of magnitude of the packet counters. This
   allows an SDN orchestrator to supervise, control, and manage PM in
   large networks.

   The monitoring network can be considered as a whole or split into
   clusters that are the smallest subnetworks (group-to-group segments),
   maintaining the packet-loss property for each subnetwork. The
   clusters can also be combined in new, connected subnetworks at
   different levels, depending on the detail we want to achieve.

   An SDN controller or a Network Management System (NMS) can calibrate
   performance measurements, since they are aware of the network
   topology. They can start without examining in depth. In case of
   necessity (packet loss is measured or the delay is too high), the
   filtering criteria could be immediately reconfigured in order to
   perform a partition of the network by using clusters and/or different
   combinations of clusters. In this way, the problem can be localized
   in a specific cluster or a single combination of clusters, and a more
   detailed analysis can be performed step by step by successive
   approximation up to a point-to-point flow detailed analysis. This is
   the so-called "closed loop".
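
   One step of such a closed loop might be sketched in Python as
   follows. This is a minimal illustration, not a procedure defined by
   this document: it assumes that per-period counters are already
   available for the whole network and for each cluster, that loss is
   the sum of input counters minus the sum of output counters, and that
   a simple threshold triggers the drill-down. All names and values are
   hypothetical.

```python
def loss(inputs, outputs):
    """Packets lost between a set of input and output counters."""
    return sum(inputs) - sum(outputs)


def zoom(network, clusters, threshold=0):
    """Return the clusters to analyze further.

    network:  (input_counters, output_counters) for the whole network
    clusters: dict of cluster name -> (input_counters, output_counters)
    Clusters are inspected only if the whole network shows loss above
    the threshold.
    """
    if loss(*network) <= threshold:
        return []
    return [name for name, (ins, outs) in clusters.items()
            if loss(ins, outs) > threshold]


# Hypothetical counters for one marking period.
whole_network = ([1000], [595, 400])
clusters = {
    "C1": ([1000], [600, 400]),
    "C2": ([600], [595]),
    "C3": ([400], [400]),
}

print(zoom(whole_network, clusters))
```

   In this example only cluster C2 is reported, so a controller could
   restrict any finer-grained filters or additional measurement points
   to that cluster.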

   This approach can be called "network zooming" and can be performed in
   two different ways:

   1) change the traffic filter and select more detailed flows;

   2) activate new measurement points by defining more specific
      clusters.

   The network-zooming approach implies that some filters or rules are
   changed and that therefore there is a transient time to wait once the
   new network configuration takes effect. This time can be determined
   by the Network Orchestrator/Controller, based on the network
   conditions.

   For example, if the network zooming identifies the performance
   problem for the traffic coming from a specific source, we need to
   recognize the marked signal from this specific source node and its
   relative path. For this purpose, we can activate all the available
   measurement points and better specify the flow filter criteria (i.e.,
   5-tuple). As an alternative, it can be enough to select packets from
   the specific source for delay measurements; in this case, it is
   possible to apply the hashing technique, as mentioned in the previous
   sections.

   [I-D.song-opsawg-ifit-framework] defines an architecture where the
   centralized Data Collector and Network Management can apply the
   intelligent and flexible Alternate-Marking algorithm as previously
   described.

   As for [I-D.fioccola-rfc8321bis], it is possible to classify the
   traffic and mark a portion of the total traffic. For each period,
   the packet rate and bandwidth are calculated from the number of
   packets. In this way, the network orchestrator becomes aware if the
   traffic rate surpasses limits. In addition, more precision can be
   obtained by reducing the marking period; indeed, some implementations
   use a marking period of 1 sec or less.

   In addition, an SDN controller could also collect the measurement
   history.

   It is important to mention that the Multipoint Alternate Marking
   framework also helps Traffic Visualization. Indeed, this methodology
   is very useful for identifying which path or cluster is crossed by
   the flow.

11.  Security Considerations

   This document specifies a method of performing measurements that does
   not directly affect Internet security or applications that run on the
   Internet. However, implementation of this method must be mindful of
   security and privacy concerns, as explained in
   [I-D.fioccola-rfc8321bis].

12.  IANA Considerations

   This document has no IANA actions.

13.  Contributors

   Greg Mirsky
   Ericsson
   Email: gregimirsky@gmail.com

   Tal Mizrahi
   Huawei Technologies
   Email: tal.mizrahi.phd@gmail.com

   Xiao Min
   ZTE Corp.
   Email: xiao.min2@zte.com.cn

14.  Acknowledgements

   The authors would like to thank Martin Duke and Tommy Pauly for their
   assistance and their detailed and precious reviews.

15.  References

15.1.  Normative References

   [I-D.fioccola-rfc8321bis]
              Fioccola, G., Cociglio, M., Mirsky, G., Mizrahi, T., and
              T. Zhou, "Alternate-Marking Method",
              draft-fioccola-rfc8321bis-04 (work in progress), April
              2022.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC5474]  Duffield, N., Ed., Chiou, D., Claise, B., Greenberg, A.,
              Grossglauser, M., and J. Rexford, "A Framework for Packet
              Selection and Reporting", RFC 5474, DOI 10.17487/RFC5474,
              March 2009.

   [RFC5475]  Zseby, T., Molina, M., Duffield, N., Niccolini, S., and F.
              Raspall, "Sampling and Filtering Techniques for IP Packet
              Selection", RFC 5475, DOI 10.17487/RFC5475, March 2009.

   [RFC5644]  Stephan, E., Liang, L., and A.
              Morton, "IP Performance Metrics (IPPM): Spatial and
              Multicast", RFC 5644, DOI 10.17487/RFC5644, October 2009.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

15.2.  Informative References

   [I-D.ietf-ippm-route]
              Alvarez-Hamelin, J. I., Morton, A., Fabini, J., Pignataro,
              C., and R. Geib, "Advanced Unidirectional Route Assessment
              (AURA)", draft-ietf-ippm-route-10 (work in progress),
              August 2020.

   [I-D.mizrahi-ippm-marking]
              Mizrahi, T., Fioccola, G., Cociglio, M., Chen, M., and G.
              Mirsky, "Marking Methods for Performance Measurement",
              draft-mizrahi-ippm-marking-00 (work in progress), October
              2021.

   [I-D.song-opsawg-ifit-framework]
              Song, H., Qin, F., Chen, H., Jin, J., and J. Shin, "A
              Framework for In-situ Flow Information Telemetry",
              draft-song-opsawg-ifit-framework-17 (work in progress),
              February 2022.

   [I-D.zhou-ippm-enhanced-alternate-marking]
              Zhou, T., Fioccola, G., Liu, Y., Cociglio, M., Lee, S.,
              and W. Li, "Enhanced Alternate Marking Method",
              draft-zhou-ippm-enhanced-alternate-marking-09 (work in
              progress), February 2022.

   [IEEE-ACM-ToN-MPNPM]
              IEEE/ACM Transactions on Networking, "Multipoint Passive
              Monitoring in Packet Networks",
              DOI 10.1109/TNET.2019.2950157, 2019.

   [IEEE-Network-PNPM]
              IEEE Network, "AM-PM: Efficient Network Telemetry using
              Alternate Marking", DOI 10.1109/MNET.2019.1800152, 2019.

   [RFC7011]  Claise, B., Ed., Trammell, B., Ed., and P. Aitken,
              "Specification of the IP Flow Information Export (IPFIX)
              Protocol for the Exchange of Flow Information", STD 77,
              RFC 7011, DOI 10.17487/RFC7011, September 2013.

   [RFC8889]  Fioccola, G., Ed., Cociglio, M., Sapio, A., and R.
              Sisto, "Multipoint Alternate-Marking Method for Passive
              and Hybrid Performance Monitoring", RFC 8889,
              DOI 10.17487/RFC8889, August 2020.

Appendix A.  Changes Log

   Changes from RFC 8889 in draft-fioccola-rfc8889bis-00 include:

   o  Minor editorial changes

   o  Removed section on "Examples of application"

   Changes in draft-fioccola-rfc8889bis-01 include:

   o  Considerations on BUM traffic

   o  Reference to RFC8321bis for the fragmentation part

   o  Revised section on "Delay Measurements on a Single-Packet Basis"

   o  Revised section on "Timing Aspects"

   Changes in draft-fioccola-rfc8889bis-02 include:

   o  Clarified the formula in the section on "Timing Aspects" to be
      aligned with RFC 8321

   o  Considerations on two-way delay measurements in both sections 8.1
      and 8.2 on delay measurements

   o  Clarified in section 4.1 on "Monitoring Network" that the
      description is done for one direction but it can easily be
      extended to all directions

   o  New section on "Results of the Multipoint Alternate Marking
      Experiment"

   Changes in draft-fioccola-rfc8889bis-03 include:

   o  Moved and renamed section on "Timing Aspects" as "Synchronization
      and Timing"

   o  Renamed old section on "Multipoint Packet Loss" as "Network Packet
      Loss"

   o  New section on "Multipoint Packet Loss Measurement"

   o  Renamed section on "Multipoint Performance Measurement" as
      "Extension of the Method to Multipoint Flows"

   Changes in draft-fioccola-rfc8889bis-04/draft-ietf-ippm-rfc8889bis-00
   include:

   o  Revised section 5.1 on "Algorithm for Clusters Partition"

Authors' Addresses

   Giuseppe Fioccola (editor)
   Huawei Technologies
   Riesstrasse, 25
   Munich 80992
   Germany

   Email: giuseppe.fioccola@huawei.com

   Mauro Cociglio
   Telecom Italia
   Via Reiss Romoli, 274
   Torino 10148
   Italy

   Email: mauro.cociglio@telecomitalia.it

   Amedeo Sapio
   Intel Corporation
   4750 Patrick Henry Dr.
   Santa Clara, CA 95054
   USA

   Email: amedeo.sapio@intel.com

   Riccardo Sisto
   Politecnico di Torino
   Corso Duca degli Abruzzi, 24
   Torino 10129
   Italy

   Email: riccardo.sisto@polito.it

   Tianran Zhou
   Huawei Technologies
   156 Beiqing Rd.
   Beijing 100095
   China

   Email: zhoutianran@huawei.com