idnits 2.16.02
/tmp/draft-ietf-rmcat-sbd-11.txt:
Checking boilerplate required by RFC 5378 and the IETF Trust (see
https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------
No issues found here.
Miscellaneous warnings:
----------------------------------------------------------------------------
== The copyright year in the IETF Trust and authors Copyright Line does not
match the current year
== The document seems to lack the recommended RFC 2119 boilerplate, even if
it appears to use RFC 2119 keywords -- however, there's a paragraph with
a matching beginning. Boilerplate error?
(The document does seem to have the reference to RFC 2119 which the
ID-Checklist requires).
-- The document date (March 29, 2018) is 386 days in the past. Is this
intentional?
Checking references for intended status: Experimental
----------------------------------------------------------------------------
== Outdated reference: A later version (-03) exists of
draft-ietf-avtcore-cc-feedback-message-01
== Outdated reference: A later version (-08) exists of
draft-ietf-rmcat-coupled-cc-07
-- Obsolete informational reference (is this intentional?): RFC 2680
(Obsoleted by RFC 7680)
Summary: 0 errors (**), 0 flaws (~~), 4 warnings (==), 2 comments (--).
Run idnits with the --verbose option for more detailed information about
the items above.
--------------------------------------------------------------------------------
2 RTP Media Congestion Avoidance Techniques D. Hayes, Ed.
3 Internet-Draft Simula Research Laboratory
4 Intended status: Experimental S. Ferlin
5 Expires: September 30, 2018
6 M. Welzl
7 K. Hiorth
8 University of Oslo
9 March 29, 2018
11 Shared Bottleneck Detection for Coupled Congestion Control for RTP
12 Media.
13 draft-ietf-rmcat-sbd-11
15 Abstract
17 This document describes a mechanism to detect whether end-to-end data
18 flows share a common bottleneck. It relies on summary statistics
19 that are calculated based on continuous measurements and used as
20 input to a grouping algorithm that runs wherever the knowledge is
21 needed.
23 Status of This Memo
25 This Internet-Draft is submitted in full conformance with the
26 provisions of BCP 78 and BCP 79.
28 Internet-Drafts are working documents of the Internet Engineering
29 Task Force (IETF). Note that other groups may also distribute
30 working documents as Internet-Drafts. The list of current Internet-
31 Drafts is at https://datatracker.ietf.org/drafts/current/.
33 Internet-Drafts are draft documents valid for a maximum of six months
34 and may be updated, replaced, or obsoleted by other documents at any
35 time. It is inappropriate to use Internet-Drafts as reference
36 material or to cite them other than as "work in progress."
38 This Internet-Draft will expire on September 30, 2018.
40 Copyright Notice
42 Copyright (c) 2018 IETF Trust and the persons identified as the
43 document authors. All rights reserved.
45 This document is subject to BCP 78 and the IETF Trust's Legal
46 Provisions Relating to IETF Documents
47 (https://trustee.ietf.org/license-info) in effect on the date of
48 publication of this document. Please review these documents
49 carefully, as they describe your rights and restrictions with respect
50 to this document. Code Components extracted from this document must
51 include Simplified BSD License text as described in Section 4.e of
52 the Trust Legal Provisions and are provided without warranty as
53 described in the Simplified BSD License.
55 Table of Contents
57 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
58 1.1. The Basic Mechanism . . . . . . . . . . . . . . . . . . . 3
59 1.2. The Signals . . . . . . . . . . . . . . . . . . . . . . . 3
60 1.2.1. Packet Loss . . . . . . . . . . . . . . . . . . . . . 3
61 1.2.2. Packet Delay . . . . . . . . . . . . . . . . . . . . 4
62 1.2.3. Path Lag . . . . . . . . . . . . . . . . . . . . . . 4
63 2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 4
64 2.1. Parameters and Their Effect . . . . . . . . . . . . . . . 6
65 2.2. Recommended Parameter Values . . . . . . . . . . . . . . 7
66 3. Mechanism . . . . . . . . . . . . . . . . . . . . . . . . . . 7
67 3.1. SBD Feedback Requirements . . . . . . . . . . . . . . . . 8
68 3.1.1. Feedback When All the Logic is Placed at the Sender . 9
69 3.1.2. Feedback When the Statistics are Calculated at the
70 Receiver and SBD Performed at the Sender . . . . . . 9
71 3.1.3. Feedback When Bottlenecks can be Determined at Both
72 Senders and Receivers . . . . . . . . . . . . . . . . 10
73 3.2. Key Metrics and Their Calculation . . . . . . . . . . . . 10
74 3.2.1. Mean Delay . . . . . . . . . . . . . . . . . . . . . 10
75 3.2.2. Skewness Estimate . . . . . . . . . . . . . . . . . . 11
76 3.2.3. Variability Estimate . . . . . . . . . . . . . . . . 12
77 3.2.4. Oscillation Estimate . . . . . . . . . . . . . . . . 12
78 3.2.5. Packet Loss . . . . . . . . . . . . . . . . . . . . . 13
79 3.3. Flow Grouping . . . . . . . . . . . . . . . . . . . . . . 13
80 3.3.1. Flow Grouping Algorithm . . . . . . . . . . . . . . . 13
81 3.3.2. Using the Flow Group Signal . . . . . . . . . . . . . 16
82 4. Enhancements to the Basic SBD Algorithm . . . . . . . . . . . 16
83 4.1. Reducing Lag and Improving Responsiveness . . . . . . . . 16
84 4.1.1. Improving the Response of the Skewness Estimate . . . 17
85 4.1.2. Improving the Response of the Variability Estimate . 19
86 4.2. Removing Oscillation Noise . . . . . . . . . . . . . . . 19
87 5. Measuring OWD . . . . . . . . . . . . . . . . . . . . . . . . 20
88 5.1. Time-stamp Resolution . . . . . . . . . . . . . . . . . . 20
89 5.2. Clock Skew . . . . . . . . . . . . . . . . . . . . . . . 20
90 6. Expected Feedback from Experiments . . . . . . . . . . . . . 20
91 7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 21
92 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 21
93 9. Security Considerations . . . . . . . . . . . . . . . . . . . 21
94 10. Change history . . . . . . . . . . . . . . . . . . . . . . . 21
95 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 23
96 11.1. Normative References . . . . . . . . . . . . . . . . . . 23
97 11.2. Informative References . . . . . . . . . . . . . . . . . 23
98 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 24
100 1. Introduction
102 In the Internet, it is not normally known if flows (e.g., TCP
103 connections or UDP data streams) traverse the same bottlenecks. Even
104 flows that have the same sender and receiver may take different paths
105 and may or may not share a bottleneck. Flows that share a bottleneck
106 link usually compete with one another for their share of the
107 capacity. This competition has the potential to increase packet loss
108 and delays. This is especially relevant for interactive applications
109 that communicate simultaneously with multiple peers (such as multi-
110 party video). For RTP media applications such as RTCWEB,
111 [I-D.ietf-rmcat-coupled-cc] describes a scheme that combines the
112 congestion controllers of flows in order to honor their priorities
113 and avoid unnecessary packet loss as well as delay. This mechanism
114 relies on some form of Shared Bottleneck Detection (SBD); here, a
115 measurement-based SBD approach is described.
117 1.1. The Basic Mechanism
119 The mechanism groups flows that have similar statistical
120 characteristics together. Section 3.3.1 describes a simple method
121 for achieving this, however, a major part of this draft is concerned
122 with collecting suitable statistics for this purpose.
124 1.2. The Signals
126 The current Internet is unable to explicitly inform endpoints as to
127 which flows share bottlenecks, so endpoints need to infer this from
128 whatever information is available to them. The mechanism described
129 here currently utilizes packet loss and packet delay, but is not
130 restricted to these. As ECN becomes more prevalent it too will
131 become a valuable base signal.
133 1.2.1. Packet Loss
135 Packet loss is often a relatively infrequent indication that a flow
136 traverses a bottleneck. Therefore, on its own it is of limited use
137 for SBD, however, it is a valuable supplementary measure when it is
138 more prevalent (refer to [RFC2680] section 2.5 for measuring packet
139 loss).
141 1.2.2. Packet Delay
143 End-to-end delay measurements include noise from every device along
144 the path in addition to the delay perturbation at the bottleneck
145 device. The noise is often significantly increased if the round-trip
146 time is used. The cleanest signal is obtained by using One-Way-Delay
147 (OWD) (refer to [RFC7679] section 3 for a definition of OWD).
149 Measuring absolute OWD is difficult since it requires both the sender
150 and receiver clocks to be synchronized. However, since the
151 statistics being collected are relative to the mean OWD, a relative
152 OWD measurement is sufficient. Clock skew is not usually significant
153 over the time intervals used by this SBD mechanism (see [RFC6817] A.2
154 for a discussion on clock skew and OWD measurements). However, in
155 circumstances where it is significant, Section 5.2 outlines a way of
156 adjusting the calculations to cater for it.
158 Each packet arriving at the bottleneck buffer may experience very
159 different queue lengths, and therefore different waiting times. A
160 single OWD sample does not, therefore, characterize the path well.
161 However, multiple OWD measurements do reflect the distribution of
162 delays experienced at the bottleneck.
164 1.2.3. Path Lag
166 Flows that share a common bottleneck may traverse different paths,
167 and these paths will often have different base delays. This makes it
168 difficult to correlate changes in delay or loss. This technique uses
169 the long term shape of the delay distribution as a base for
170 comparison to counter this.
172 2. Definitions
174 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
175 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
176 "OPTIONAL" in this document are to be interpreted as described in BCP
177 14 [RFC2119] RFC2119 [RFC2119] RFC8174 [RFC8174] when, and only when,
178 they appear in all capitals, as shown here.
180 Acronyms used in this document:
182 OWD -- One Way Delay
184 MAD -- Mean Absolute Deviation
186 RTT -- Round Trip Time
188 SBD -- Shared Bottleneck Detection
190 Conventions used in this document:
192 T -- the base time interval over which measurements are
193 made
195 N -- the number of base time, T, intervals used in some
196 calculations
198 M -- the number of base time, T, intervals used in some
199 calculations, where M <= N
201 sum(...) -- summation of terms of the variable in parentheses
203 sum_T(...) -- summation of all the measurements of the variable
204 in parentheses taken over the interval T
206 sum_NT(...) -- summation of all measurements taken over the
207 interval N*T
209 sum_MT(...) -- summation of all measurements taken over the
210 interval M*T
212 E_T(...) -- the expectation or mean of the measurements of the
213 variable in parentheses over T
215 E_N(...) -- the expectation or mean of the last N values of
216 the variable in parentheses
218 E_M(...) -- the expectation or mean of the last M values of
219 the variable in parentheses
221 num_T(...) -- the count of measurements of the variable in
222 parentheses taken in the interval T
224 num_MT(...) -- the count of measurements of the variable in
225 parentheses taken in the interval NT
227 PB -- a boolean variable indicating the particular flow
228 was identified transiting a bottleneck in the
229 previous interval T (i.e. Previously Bottleneck)
231 skew_est -- a measure of skewness in a OWD distribution
233 skew_base_T -- a variable used as an intermediate step in
234 calculating skew_est
236 var_est -- a measure of variability in OWD measurements
237 var_base_T -- a variable used as an intermediate step in
238 calculating var_est
240 freq_est -- a measure of low frequency oscillation in the OWD
241 measurements
243 pkt_loss -- a measure of the proportion of packets lost
245 p_l, p_f, p_mad, c_s, c_h, p_s, p_d, p_v -- various thresholds
246 used in the mechanism
248 M and F -- number of values related to N
250 2.1. Parameters and Their Effect
252 T T should be long enough so that there are enough packets
253 received during T for a useful estimate of short term mean
254 OWD and variation statistics. Making T too large can limit
255 the efficacy of freq_est. It will also increase the response
256 time of the mechanism. Making T too small will make the
257 metrics noisier.
259 N & M N should be large enough to provide a stable estimate of
260 oscillations in OWD. Usually M=N, though having M mean_delay) skew_base_T--
499 The mean_delay does not include the mean of the current T interval to
500 enable it to be calculated iteratively.
502 skew_est = sum_MT(skew_base_T)/num_MT(OWD)
504 where skew_est is a number between -1 and 1
506 Note: Care must be taken when implementing the comparisons to ensure
507 that rounding does not bias skew_est. It is important that the mean
508 is calculated with a higher precision than the samples.
510 3.2.3. Variability Estimate
512 Mean Absolute Deviation (MAD) delay is a robust variability measure
513 that copes well with different send rates. It can be implemented in
514 an online manner as follows:
516 var_base_T = sum_T(|OWD - E_T(OWD)|)
518 where
520 |x| is the absolute value of x
522 E_T(OWD) is the mean OWD calculated in the previous T
524 var_est = MAD_MT = sum_MT(var_base_T)/num_MT(OWD)
526 3.2.4. Oscillation Estimate
528 An estimate of the low frequency oscillation of the delay signal is
529 calculated by counting and normalizing the significant mean,
530 E_T(OWD), crossings of mean_delay:
532 freq_est = number_of_crossings / N
534 where we define a significant mean crossing as a crossing that
535 extends p_v * var_est from mean_delay. In our experiments we
536 have found that p_v = 0.7 is a good value.
538 Freq_est is a number between 0 and 1. Freq_est can be approximated
539 incrementally as follows:
541 With each new calculation of E_T(OWD) a decision is made as to
542 whether this value of E_T(OWD) significantly crosses the current
543 long term mean, mean_delay, with respect to the previous
544 significant mean crossing.
546 A cyclic buffer, last_N_crossings, records a 1 if there is a
547 significant mean crossing, otherwise a 0.
549 The counter, number_of_crossings, is incremented when there is a
550 significant mean crossing and decremented when a non-zero value is
551 removed from the last_N_crossings.
553 This approximation of freq_est was not used in [Hayes-LCN14], which
554 calculated freq_est every T using the current E_N(E_T(OWD)). Our
555 tests show that this approximation of freq_est yields results that
556 are almost identical to when the full calculation is performed every
557 T.
559 3.2.5. Packet Loss
561 The proportion of packets lost over the period NT is used as a
562 supplementary measure:
564 pkt_loss = sum_NT(lost packets) / sum_NT(total packets)
566 Note: When pkt_loss is small it is very variable, however, when
567 pkt_loss is high it becomes a stable measure for making grouping
568 decisions.
570 3.3. Flow Grouping
572 3.3.1. Flow Grouping Algorithm
574 The following grouping algorithm is RECOMMENDED for use of SBD with
575 coupled congestion control for RTP media [I-D.ietf-rmcat-coupled-cc]
576 and is sufficient and efficient for small to moderate numbers of
577 flows. For very large numbers of flows (e.g. hundreds), a more
578 complex clustering algorithm may be substituted.
580 Since no single metric is precise enough to group flows (due to
581 noise), the algorithm uses multiple metrics. Each metric offers a
582 different "view" of the bottleneck link characteristics, and used
583 together they enable a more precise grouping of flows than would
584 otherwise be possible.
586 Flows determined to be transiting a bottleneck are successively
587 divided into groups based on freq_est, var_est, skew_est and
588 pkt_loss.
590 The first step is to determine which flows are transiting a
591 bottleneck. This is important, since if a flow is not transiting a
592 bottleneck its delay based metrics will not describe the bottleneck,
593 but the "noise" from the rest of the path. Skewness, with proportion
594 of packet loss as a supplementary measure, is used to do this:
596 1. Grouping will be performed on flows that are inferred to be
597 traversing a bottleneck by:
599 skew_est < c_s
601 || ( skew_est < c_h & PB ) || pkt_loss > p_l
603 The parameter c_s controls how sensitive the mechanism is in
604 detecting a bottleneck. c_s = 0.0 was used in [Hayes-LCN14]. A value
605 of c_s = 0.1 is a little more sensitive, and c_s = -0.1 is a little
606 less sensitive. c_h controls the hysteresis on flows that were
607 grouped as transiting a bottleneck last time. If the test result is
608 TRUE, PB=TRUE, otherwise PB=FALSE.
610 These flows, flows transiting a bottleneck, are then progressively
611 divided into groups based on the freq_est, var_est, and skew_est
612 summary statistics. The process proceeds according to the following
613 steps:
615 2. Group flows whose difference in sorted freq_est is less than a
616 threshold:
618 diff(freq_est) < p_f
620 3. Subdivide the groups obtained in 2. by grouping flows whose
621 difference in sorted E_M(var_est) (highest to lowest) is less
622 than a threshold:
624 diff(var_est) < (p_mad * var_est)
626 The threshold, (p_mad * var_est), is with respect to the highest
627 value in the difference.
629 4. Subdivide the groups obtained in 3. by grouping flows whose
630 difference in sorted skew_est is less than a threshold:
632 diff(skew_est) < p_s
634 5. When packet loss is high enough to be reliable (pkt_loss > p_l),
635 Subdivide the groups obtained in 4. by grouping flows whose
636 difference is less than a threshold
638 diff(pkt_loss) < (p_d * pkt_loss)
640 The threshold, (p_d * pkt_loss), is with respect to the highest
641 value in the difference.
643 This procedure involves sorting estimates from highest to lowest. It
644 is simple to implement, and efficient for small numbers of flows (up
645 to 10-20). Figure 2 illustrates this algorithm.
647 *********
648 * Flows *
649 ***.**.**
650 / '
651 / '--.
652 / \
653 .---v--. .----v---.
654 1. Flows traversing | Cong | | UnCong |
655 a bottleneck '-.--.-' '--------'
656 / \
657 / \
658 / \
659 .--v--. v-----.
660 2. Divide by | g_1 | ... | g_n |
661 freq_est '---.-. '----..
662 / \ / \
663 / '--. v '------.
664 / \ \
665 .----v-. .-v----. .---v--.
666 3. Divide by | g_1a | ... | g_1z | ... | g_nz |
667 var_est '---.-.' '-----.. '-.-.--'
668 / \ / \ / |
669 / '-----. v v v |
670 / \ |
671 .-v-----. .-v-----. .---v---.
672 4. Divide by | g_1ai | ... | g_1ax | ... | g_nzx |
673 skew_est '----.-.' '------.. '-.-.---'
674 / \ / \ / |
675 / '--. v v v |
676 / \ |
677 .-----v--. .-v------. .----v---.
678 5. Divide by | g_1aiA | ... | g_1aiZ | ... | g_nzxZ |
679 pkt_loss '--------' '--------' '--------'
680 (when applicable)
682 Simple grouping algorithm.
684 Figure 2
686 3.3.2. Using the Flow Group Signal
688 Grouping decisions can be made every T from the second T, however
689 they will not attain their full design accuracy until after the
690 2*N'th T interval. We recommend that grouping decisions are not made
691 until 2*M T intervals.
693 Network conditions, and even the congestion controllers, can cause
694 bottlenecks to fluctuate. A coupled congestion controller MAY decide
695 only to couple groups that remain stable, say grouped together 90% of
696 the time, depending on its objectives. Recommendations concerning
697 this are beyond the scope of this document and will be specific to
698 the coupled congestion controller's objectives.
700 4. Enhancements to the Basic SBD Algorithm
702 The SBD algorithm as specified in Section 3 was found to work well
703 for a broad variety of conditions. The following enhancements to the
704 basic mechanisms have been found to significantly improve the
705 algorithm's performance under some circumstances and SHOULD be
706 implemented. These "tweaks" are described separately to keep the
707 main description succinct.
709 4.1. Reducing Lag and Improving Responsiveness
711 This section describes how to improve the responsiveness of the basic
712 algorithm.
714 Measurement based shared bottleneck detection makes decisions in the
715 present based on what has been measured in the past. This means that
716 there is always a lag in responding to changing conditions. This
717 mechanism is based on summary statistics taken over (N*T) seconds.
718 This mechanism can be made more responsive to changing conditions by:
720 1. Reducing N and/or M -- but at the expense of having less accurate
721 metrics, and/or
723 2. Exploiting the fact that more recent measurements are more
724 valuable than older measurements and weighting them accordingly.
726 Although more recent measurements are more valuable, older
727 measurements are still needed to gain an accurate estimate of the
728 distribution descriptor we are measuring. Unfortunately, the simple
729 exponentially weighted moving average weights drop off too quickly
730 for our requirements and have an infinite tail. A simple linearly
731 declining weighted moving average also does not provide enough weight
732 to the most recent measurements. We propose a piecewise linear
733 distribution of weights, such that the first section (samples 1:F) is
734 flat as in a simple moving average, and the second section (samples
735 F+1:M) is linearly declining weights to the end of the averaging
736 window. We choose integer weights, which allows incremental
737 calculation without introducing rounding errors.
739 4.1.1. Improving the Response of the Skewness Estimate
741 The weighted moving average for skew_est, based on skew_est in
742 Section 3.2.2, can be calculated as follows:
744 skew_est = ((M-F+1)*sum(skew_base_T(1:F))
746 + sum([(M-F):1].*skew_base_T(F+1:M)))
748 / ((M-F+1)*sum(numsampT(1:F))
750 + sum([(M-F):1].*numsampT(F+1:M)))
752 where numsampT is an array of the number of OWD samples in each T
753 (i.e. num_T(OWD)), and numsampT(1) is the most recent; skew_base_T(1)
754 is the most recent calculation of skew_base_T; 1:F refers to the
755 integer values 1 through to F, and [(M-F):1] refers to an array of
756 the integer values (M-F) declining through to 1; and ".*" is the
757 array scalar dot product operator.
759 To calculate this weighted skew_est incrementally:
761 Notation: F_ - flat portion, D_ - declining portion, W_ - weighted
762 component
764 Initialize: sum_skewbase = 0, F_skewbase=0, W_D_skewbase=0
766 skewbase_hist = buffer length M initialize to 0
768 numsampT = buffer length M initialized to 0
770 Steps per iteration:
772 1. old_skewbase = skewbase_hist(M)
774 2. old_numsampT = numsampT(M)
776 3. cycle(skewbase_hist)
778 4. cycle(numsampT)
780 5. numsampT(1) = num_T(OWD)
782 6. skewbase_hist(1) = skew_base_T
784 7. F_skewbase = F_skewbase + skew_base_T - skewbase_hist(F+1)
786 8. W_D_skewbase = W_D_skewbase + (M-F)*skewbase_hist(F+1)
787 - sum_skewbase
789 9. W_D_numsamp = W_D_numsamp + (M-F)*numsampT(F+1) - sum_numsamp
790 + F_numsamp
792 10. F_numsamp = F_numsamp + numsampT(1) - numsampT(F+1)
794 11. sum_skewbase = sum_skewbase + skewbase_hist(F+1) - old_skewbase
796 12. sum_numsamp = sum_numsamp + numsampT(1) - old_numsampT
798 13. skew_est = ((M-F+1)*F_skewbase + W_D_skewbase) /
799 ((M-F+1)*F_numsamp+W_D_numsamp)
801 Where cycle(....) refers to the operation on a cyclic buffer where
802 the start of the buffer is now the next element in the buffer.
804 4.1.2. Improving the Response of the Variability Estimate
806 Similarly the weighted moving average for var_est can be calculated
807 as follows:
809 var_est = ((M-F+1)*sum(var_base_T(1:F))
811 + sum([(M-F):1].*var_base_T(F+1:M)))
813 / ((M-F+1)*sum(numsampT(1:F))
815 + sum([(M-F):1].*numsampT(F+1:M)))
817 where numsampT is an array of the number of OWD samples in each T
818 (i.e. num_T(OWD)), and numsampT(1) is the most recent; skew_base_T(1)
819 is the most recent calculation of skew_base_T; 1:F refers to the
820 integer values 1 through to F, and [(M-F):1] refers to an array of
821 the integer values (M-F) declining through to 1; and ".*" is the
822 array scalar dot product operator. When removing oscillation noise
823 (see Section 4.2) this calculation must be adjusted to allow for
824 invalid var_base_T records.
826 Var_est can be calculated incrementally in the same way as skew_est
827 in Section 4.1.1. However, note that the buffer numsampT is used for
828 both calculations so the operations on it should not be repeated.
830 4.2. Removing Oscillation Noise
832 When a path has no bottleneck, var_est will be very small and the
833 recorded significant mean crossings will be the result of path noise.
834 Thus up to N-1 meaningless mean crossings can be a source of error at
835 the point a link becomes a bottleneck and flows traversing it begin
836 to be grouped.
838 To remove this source of noise from freq_est:
840 1. Set the current var_base_T = NaN (a value representing an invalid
841 record, i.e. Not a Number) for flows that are deemed to not be
842 transiting a bottleneck by the first skew_est based grouping test
843 (see Section 3.3.1).
845 2. Then var_est = sum_MT(var_base_T != NaN) / num_MT(OWD)
847 3. For freq_est, only record a significant mean crossing if flow
848 deemed to be transiting a bottleneck.
850 These three changes can help to remove the non-bottleneck noise from
851 freq_est.
853 5. Measuring OWD
855 This section discusses the OWD measurements required for this
856 algorithm to detect shared bottlenecks.
858 The SBD mechanism described in this document relies on differences
859 between OWD measurements to avoid the practical problems with
860 measuring absolute OWD (see [Hayes-LCN14] section IIIC). Since all
861 summary statistics are relative to the mean OWD and sender/receiver
862 clock offsets should be approximately constant over the measurement
863 periods, the offset is subtracted out in the calculation.
865 5.1. Time-stamp Resolution
867 The SBD mechanism requires timing information precise enough to be
868 able to make comparisons. As a rule of thumb, the time resolution
869 should be less than one hundredth of a typical path's range of
870 delays. In general, the coarser the time resolution, the more care
871 that needs to be taken to ensure rounding errors do not bias the
872 skewness calculation. Frequent timing information in millisecond
873 resolution as described by [I-D.ietf-avtcore-cc-feedback-message]
874 should be sufficient for the sender to calculate relative OWD.
876 5.2. Clock Skew
878 Generally sender and receiver clock skew will be too small to cause
879 significant errors in the estimators. Skew_est and freq_est are the
880 most sensitive to this type of noise due to their use of a mean OWD
881 calculated over a longer interval. In circumstances where clock skew
882 is high, basing skew_est only on the previous T's mean and ignoring
883 freq_est provides a noisier but reliable signal.
885 A more sophisticated method is to estimate the effect the clock skew
886 is having on the summary statistics, and then adjust statistics
887 accordingly. There are a number of techniques in the literature,
888 including [Zhang-Infocom02].
890 6. Expected Feedback from Experiments
892 The algorithm described in this memo has so far been evaluated using
893 simulations and small scale experiments. Real network tests using
894 RTP Media Congestion Avoidance Techniques (RMCAT) congestion control
895 algorithms will help confirm the default parameter choice. For
896 example, the time interval T may need to be made longer if the packet
897 rate is very low. Implementers and testers are invited to document
898 their findings in an Internet draft.
900 7. Acknowledgments
902 This work was part-funded by the European Community under its Seventh
903 Framework Programme through the Reducing Internet Transport Latency
904 (RITE) project (ICT-317700). The views expressed are solely those of
905 the authors.
907 8. IANA Considerations
909 This memo includes no request to IANA.
911 9. Security Considerations
913 The security considerations of RFC 3550 [RFC3550], RFC 4585
914 [RFC4585], and RFC 5124 [RFC5124] are expected to apply.
916 Non-authenticated RTCP packets carrying OWD measurements, shared
917 bottleneck indications, and/or summary statistics could allow
918 attackers to alter the bottleneck sharing characteristics for private
919 gain or disruption of other parties' communication. When using SBD
920 for coupled congestion control as described in
921 [I-D.ietf-rmcat-coupled-cc], the security considerations of
922 [I-D.ietf-rmcat-coupled-cc] apply.
924 10. Change history
926 XX RFC ED - PLEASE REMOVE THIS SECTION XXX
928 Changes made to this document:
930 WG-10->WG-11 : Genart review addressed.
932 WG-09->WG-10 : AD review addressed.
934 WG-08->WG-09 : Removed definitions that are no longer used. Added
935 pkt_loss definition. Refined c_s recommendation.
937 WG-07->WG-08 : Updates addressing https://www.ietf.org/mail-
938 archive/web/rmcat/current/msg01671.html Mainly
939 clarifications.
941 WG-06->WG-07 : Updates addressing
942 https://mailarchive.ietf.org/arch/msg/
943 rmcat/80B6q4nI7carGcf_ddBwx7nKvOw. Mainly
944 clarifications. Figure 2 to supplement grouping
945 algorithm description.
947 WG-05->WG-06 : Updates addressing WG reviews
948 https://mailarchive.ietf.org/arch/msg/rmcat/-
949 1JdrTMq1Y5T6ZNlOkrQJQ27TzE and
950 https://mailarchive.ietf.org/arch/msg/rmcat/
951 eI2Q1f8NL2SxbJgjFLR4_rEmJ_g. This has mainly
952 involved minor clarifications, including the moving
953 of 3.4.1 and 3.5 into the new Section 4, and 3.4.1
954 into Section 5
956 WG-04->WG-05 : Fix ToC formatting. Add section on expected
957 feedback from experiments replacing short section
958 on implementation status. Added comment on ECN as
959 a signal. Clarification of lost packet signaling.
960 Change term "draft" to "document" where
961 appropriate. American spelling. Some tightening
962 of the text.
964 WG-03->WG-04 : Add M to terminology table, suggest skew_est based
965 on previous T and no freq_est in clock skew
966 section, feedback requirements as a separate sub
967 section.
969 WG-02->WG-03 : Correct misspelled author
971 WG-01->WG-02 : Removed ambiguity associated with the term
972 "congestion". Expanded the description of
973 initialization messages. Removed PDV metric.
974 Added description of incremental weighted metric
975 calculations for skew_est. Various clarifications
976 based on implementation work. Fixed typos and
977 tuned parameters.
979 WG-00->WG-01 : Moved unbiased skew section to replace skew
980 estimate, more robust variability estimator, the
981 term variance replaced with variability, clock
982 drift term corrected to clock skew, revision to
983 clock skew section with a place holder, description
984 of parameters.
986 02->WG-00 : Fixed missing 0.5 in 3.3.2 and missing brace in
987 3.3.3
989 01->02 : New section describing improvements to the key
990 metric calculations that help to remove noise,
991 bias, and reduce lag. Some revisions to the
992 notation to make it clearer. Some tightening of
993 the thresholds.
995 00->01 : Revisions to terminology for clarity
997 11. References
999 11.1. Normative References
1001 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
1002 Requirement Levels", BCP 14, RFC 2119,
1003 DOI 10.17487/RFC2119, March 1997,
1004 .
1006 11.2. Informative References
1008 [Hayes-LCN14]
1009 Hayes, D., Ferlin, S., and M. Welzl, "Practical Passive
1010 Shared Bottleneck Detection using Shape Summary
1011 Statistics", Proc. the IEEE Local Computer Networks
1012 (LCN) pp150-158, September 2014,
1013 .
1016 [I-D.ietf-avtcore-cc-feedback-message]
1017 Sarker, Z., Perkins, C., Singh, V., and M. Ramalho, "RTP
1018 Control Protocol (RTCP) Feedback for Congestion Control",
1019 draft-ietf-avtcore-cc-feedback-message-01 (work in
1020 progress), March 2018.
1022 [I-D.ietf-rmcat-coupled-cc]
1023 Islam, S., Welzl, M., and S. Gjessing, "Coupled congestion
1024 control for RTP media", draft-ietf-rmcat-coupled-cc-07
1025 (work in progress), September 2017.
1027 [RFC2680] Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
1028 Packet Loss Metric for IPPM", RFC 2680,
1029 DOI 10.17487/RFC2680, September 1999,
1030 .
1032 [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
1033 Jacobson, "RTP: A Transport Protocol for Real-Time
1034 Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
1035 July 2003, .
1037 [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
1038 "Extended RTP Profile for Real-time Transport Control
1039 Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585,
1040 DOI 10.17487/RFC4585, July 2006,
1041 .
1043 [RFC5124] Ott, J. and E. Carrara, "Extended Secure RTP Profile for
1044 Real-time Transport Control Protocol (RTCP)-Based Feedback
1045 (RTP/SAVPF)", RFC 5124, DOI 10.17487/RFC5124, February
1046 2008, .
1048 [RFC6817] Shalunov, S., Hazel, G., Iyengar, J., and M. Kuehlewind,
1049 "Low Extra Delay Background Transport (LEDBAT)", RFC 6817,
1050 DOI 10.17487/RFC6817, December 2012,
1051 .
1053 [RFC7679] Almes, G., Kalidindi, S., Zekauskas, M., and A. Morton,
1054 Ed., "A One-Way Delay Metric for IP Performance Metrics
1055 (IPPM)", STD 81, RFC 7679, DOI 10.17487/RFC7679, January
1056 2016, .
1058 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
1059 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
1060 May 2017, .
1062 [Zhang-Infocom02]
1063 Zhang, L., Liu, Z., and H. Xia, "Clock synchronization
1064 algorithms for network measurements", Proc. the IEEE
1065 International Conference on Computer Communications
1066 (INFOCOM) pp160-169, September 2002,
1067 .
1069 Authors' Addresses
1071 David Hayes (editor)
1072 Simula Research Laboratory
1073 P.O. Box 134
1074 Lysaker 1325
1075 Norway
1077 Email: davidh@simula.no
1079 Simone Ferlin
1081 Email: simone@ferlin.io
1082 Michael Welzl
1083 University of Oslo
1084 PO Box 1080 Blindern
1085 Oslo N-0316
1086 Norway
1088 Email: michawe@ifi.uio.no
1090 Kristian Hiorth
1091 University of Oslo
1092 PO Box 1080 Blindern
1093 Oslo N-0316
1094 Norway
1096 Email: kristahi@ifi.uio.no