idnits 2.17.00 (12 Aug 2021) /tmp/idnits9004/draft-ietf-dime-ovli-05.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The abstract seems to contain references ([RFC2119]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'MUST not' in this paragraph: The lack of end-to-end confidentiality protection means that any Diameter agent in the path of an overload report can view the contents of that report. In addition to the requirement to select which peers are trusted to send overload reports, operators MUST be able to select which peers are authorized to receive reports. A node MUST not send an overload report to a peer not authorized to receive it. Furthermore, an agent MUST remove any overload reports that might have been inserted by other nodes before forwarding a Diameter message to a peer that is not authorized to receive overload reports. -- The document date (December 3, 2014) is 2726 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'RFC5905' is defined on line 1565, but no explicit reference was found in the text == Unused Reference: 'RFC5729' is defined on line 1588, but no explicit reference was found in the text ** Obsolete normative reference: RFC 5226 (Obsoleted by RFC 8126) == Outdated reference: draft-ietf-dime-e2e-sec-req has been published as RFC 7966 -- Obsolete informational reference (is this intentional?): RFC 4006 (Obsoleted by RFC 8506) Summary: 2 errors (**), 0 flaws (~~), 5 warnings (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Diameter Maintenance and Extensions (DIME) J. Korhonen, Ed. 3 Internet-Draft Broadcom 4 Intended status: Standards Track S. Donovan, Ed. 5 Expires: June 6, 2015 B. Campbell 6 Oracle 7 L. Morand 8 Orange Labs 9 December 3, 2014 11 Diameter Overload Indication Conveyance 12 draft-ietf-dime-ovli-05.txt 14 Abstract 16 This specification defines a base solution for Diameter overload 17 control, referred to as Diameter Overload Indication Conveyance 18 (DOIC). 20 Requirements 22 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 23 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 24 document are to be interpreted as described in RFC 2119 [RFC2119]. 26 Status of This Memo 28 This Internet-Draft is submitted in full conformance with the 29 provisions of BCP 78 and BCP 79. 31 Internet-Drafts are working documents of the Internet Engineering 32 Task Force (IETF). Note that other groups may also distribute 33 working documents as Internet-Drafts. 
The list of current Internet- 34 Drafts is at http://datatracker.ietf.org/drafts/current/. 36 Internet-Drafts are draft documents valid for a maximum of six months 37 and may be updated, replaced, or obsoleted by other documents at any 38 time. It is inappropriate to use Internet-Drafts as reference 39 material or to cite them other than as "work in progress." 41 This Internet-Draft will expire on June 6, 2015. 43 Copyright Notice 45 Copyright (c) 2014 IETF Trust and the persons identified as the 46 document authors. All rights reserved. 48 This document is subject to BCP 78 and the IETF Trust's Legal 49 Provisions Relating to IETF Documents 50 (http://trustee.ietf.org/license-info) in effect on the date of 51 publication of this document. Please review these documents 52 carefully, as they describe your rights and restrictions with respect 53 to this document. Code Components extracted from this document must 54 include Simplified BSD License text as described in Section 4.e of 55 the Trust Legal Provisions and are provided without warranty as 56 described in the Simplified BSD License.

58 Table of Contents

60 1. Introduction
61 2. Terminology and Abbreviations
62 3. Solution Overview
63    3.1. Piggybacking
64    3.2. DOIC Capability Announcement
65    3.3. DOIC Overload Condition Reporting
66    3.4. DOIC Extensibility
67    3.5. Simplified Example Architecture
68 4. Solution Procedures
69    4.1. Capability Announcement
70       4.1.1. Reacting Node Behavior
71       4.1.2. Reporting Node Behavior
72       4.1.3. Agent Behavior
73    4.2. Overload Report Processing
74       4.2.1. Overload Control State
75       4.2.2. Reacting Node Behavior
76       4.2.3. Reporting Node Behavior
77    4.3. Protocol Extensibility
78 5. Loss Algorithm
79    5.1. Overview
80    5.2. Reporting Node Behavior
81    5.3. Reacting Node Behavior
82 6. Attribute Value Pairs
83    6.1. OC-Supported-Features AVP
84    6.2. OC-Feature-Vector AVP
85    6.3. OC-OLR AVP
86    6.4. OC-Sequence-Number AVP
87    6.5. OC-Validity-Duration AVP
88    6.6. OC-Report-Type AVP
89    6.7. OC-Reduction-Percentage AVP
90    6.8. Attribute Value Pair flag rules
91 7. Error Response Codes
92 8. IANA Considerations
93    8.1. AVP codes
94    8.2. New registries
95 9. Security Considerations
96    9.1. Potential Threat Modes
97    9.2. Denial of Service Attacks
98    9.3. Non-Compliant Nodes
99    9.4. End-to-End Security Issues
100 10. Contributors
101 11. References
102    11.1. Normative References
103    11.2. Informative References
104 Appendix A. Issues left for future specifications
105    A.1. Additional traffic abatement algorithms
106    A.2. Agent Overload
107    A.3. New Error Diagnostic AVP
108 Appendix B. Deployment Considerations
109 Appendix C. Requirements Conformance Analysis
110    C.1. Deferred Requirements
111    C.2. Detection of non-supporting Intermediaries
112    C.3. Implicit Application Indication
113    C.4. Stateless Operation
114    C.5. No New Vulnerabilities
115    C.6. Detailed Requirements
116       C.6.1. General
117       C.6.2. Performance
118       C.6.3. Heterogeneous Support for Solution
119       C.6.4. Granular Control
120       C.6.5. Priority and Policy
121       C.6.6. Security
122       C.6.7. Flexibility and Extensibility
123 Appendix D. Considerations for Applications Integrating the DOIC
124             Solution
125    D.1. Application Classification
126    D.2. Application Type Overload Implications
127    D.3. Request Transaction Classification
128    D.4. Request Type Overload Implications
129 Authors' Addresses

131 1.
Introduction 133 This specification defines a base solution for Diameter overload 134 control, referred to as Diameter Overload Indication Conveyance 135 (DOIC), based on the requirements identified in [RFC7068]. 137 This specification addresses Diameter overload control between 138 Diameter nodes that support the DOIC solution. The solution, which 139 is designed to apply to existing and future Diameter applications, 140 requires no changes to the Diameter base protocol [RFC6733] and is 141 deployable in environments where some Diameter nodes do not implement 142 the Diameter overload control solution defined in this specification. 144 Note that the overload control solution defined in this specification 145 does not address all the requirements listed in [RFC7068]. A number 146 of overload control related features are left for future 147 specifications. See Appendix A for a list of extensions that are 148 currently being considered. See Appendix C for an analysis of 149 conformance to the requirements specified in [RFC7068]. 151 2. Terminology and Abbreviations 153 Abatement 155 Reaction to receipt of an overload report resulting in a reduction 156 in traffic sent to the reporting node. Abatement actions include 157 diversion and throttling. 159 Abatement Algorithm 161 A mechanism requested by reporting nodes and used by reacting 162 nodes to reduce the amount of traffic sent during an occurrence of 163 overload control. 165 Diversion 167 A mechanism used for overload abatement by selecting a different 168 path for requests. 170 Host-Routed Requests 172 Requests that a reacting node knows will be served by a particular 173 host, either due to the presence of a Destination-Host AVP, or by 174 some other local knowledge on the part of the reacting node. 176 Overload Control State (OCS) 178 Reporting and reacting node internally maintained state describing 179 occurrences of overload control. 
181 Overload Report (OLR) 183 Overload control information for a particular overload occurrence 184 sent by a reporting node. 186 Reacting Node 188 A Diameter node that acts upon an overload report. 190 Realm-Routed Requests 191 Requests for which a reacting node does not know which host will 192 service the request. 194 Reporting Node 196 A Diameter node that generates an overload report. (This may or 197 may not be the overloaded node.) 199 Throttling 201 A mechanism for overload abatement that limits the number of 202 requests sent by the DOIC reacting node. Throttling can include a 203 Diameter Client not sending requests, or a Diameter Agent or 204 Server rejecting requests with appropriate error responses. In 205 both cases the result of the throttling is a permanent rejection 206 of the transaction. 208 3. Solution Overview 210 The Diameter Overload Indication Conveyance (DOIC) solution allows 211 Diameter nodes to request other Diameter nodes to perform overload 212 abatement actions, that is, actions to reduce the load offered to the 213 overloaded node or realm. 215 A Diameter node that supports DOIC is known as a "DOIC node". Any 216 Diameter node can act as a DOIC node, including Diameter Clients, 217 Diameter Servers, and Diameter Agents. DOIC nodes are further 218 divided into "Reporting Nodes" and "Reacting Nodes." A reporting 219 node requests overload abatement by sending Overload Reports (OLR). 221 A reacting node acts upon OLRs, and performs whatever actions are 222 needed to fulfill the abatement requests included in the OLRs. A 223 reporting node may report overload on its own behalf, or on behalf of 224 other nodes. Likewise, a reacting node may perform overload 225 abatement on its own behalf, or on behalf of other nodes. 227 A Diameter node's role as a DOIC node is independent of its Diameter 228 role. For example, Diameter Agents may act as DOIC nodes, even 229 though they are not endpoints in the Diameter sense.
Since Diameter 230 enables bi-directional applications, where Diameter Servers can send 231 requests towards Diameter Clients, a given Diameter node can 232 simultaneously act as both a reporting node and a reacting node. 234 Likewise, a Diameter Agent may act as a reacting node from the 235 perspective of upstream nodes, and a reporting node from the 236 perspective of downstream nodes. 238 DOIC nodes do not generate new messages to carry DOIC-related 239 information. Rather, they "piggyback" DOIC information over existing 240 Diameter messages by inserting new AVPs into existing Diameter 241 requests and responses. Nodes indicate support for DOIC, and any 242 needed DOIC parameters, by inserting an OC-Supported-Features AVP 243 (Section 6.1) into existing requests and responses. Reporting nodes 244 send OLRs by inserting OC-OLR AVPs (Section 6.3). 246 A given OLR applies to the Diameter realm and application of the 247 Diameter message that carries it. If a reporting node supports more 248 than one realm and/or application, it reports independently for each 249 combination of realm and application. Similarly, the OC-Supported- 250 Features AVP applies to the realm and application of the enclosing 251 message. This implies that a node may support DOIC for one 252 application and/or realm, but not another, and may indicate different 253 DOIC parameters for each application and realm for which it supports 254 DOIC. 256 Reacting nodes perform overload abatement according to an agreed-upon 257 abatement algorithm. An abatement algorithm defines the meaning of 258 some of the parameters of an OLR and the procedures required for 259 overload abatement. An overload abatement algorithm separates 260 Diameter requests into two sets. The first set contains the requests 261 that are to undergo overload abatement treatment of either throttling 262 or diversion. The second set contains the requests that are to be 263 given normal routing treatment.
This document specifies a single 264 must-support algorithm, namely the "loss" algorithm (Section 5). 265 Future specifications may introduce new algorithms. 267 Overload conditions may vary in scope. For example, a single 268 Diameter node may be overloaded, in which case reacting nodes may 269 attempt to send requests to other destinations. On the other hand, 270 an entire Diameter realm may be overloaded, in which case such 271 attempts would do harm. DOIC OLRs have a concept of "report type" 272 (Section 6.6), where the type defines such behaviors. Report types 273 are extensible. This document defines report types for overload of a 274 specific host, and for overload of an entire realm. 276 A report of type "HOST_REPORT" is sent to indicate the overload of a 277 specific host, identified by the Origin-Host AVP of the message 278 containing the OLR, for the application-id indicated in the 279 transaction. When receiving an OLR of type "HOST_REPORT", a reacting 280 node applies overload abatement treatment to the host-routed requests 281 identified by the overload abatement algorithm (see definition in 282 Section 2) sent for this application to the overloaded host. 284 A report of type "REALM_REPORT" is sent to indicate the overload of a 285 realm for the application-id indicated in the transaction. The 286 overloaded realm is identified by the Destination-Realm AVP of the 287 message containing the OLR. When receiving an OLR of type 288 "REALM_REPORT", a reacting node applies overload abatement treatment 289 to realm-routed requests identified by the overload abatement 290 algorithm (see definition in Section 2) sent for this application to 291 the overloaded realm. 293 While a reporting node sends OLRs to "adjacent" reacting nodes, nodes 294 that are "adjacent" for DOIC purposes may not be adjacent from a 295 Diameter, or transport, perspective. 
For example, one or more 296 Diameter agents that do not support DOIC may exist between a given 297 pair of reporting and reacting nodes, as long as those agents pass 298 unknown AVPs through unchanged. The report types described in this 299 document can safely pass through non-supporting agents. This may not 300 be true for report types defined in future specifications. 302 3.1. Piggybacking 304 There is no new Diameter application defined to carry overload 305 related AVPs. The overload control AVPs defined in this 306 specification have been designed to be piggybacked on top of existing 307 application messages. This is made possible by adding overload 308 control AVPs, the OC-OLR AVP and the OC-Supported-Features AVP, as 309 optional AVPs into existing commands when the corresponding Command 310 Code Format (CCF) specification allows adding new optional AVPs (see 311 Section 1.3.4 of [RFC6733]). 313 Reacting nodes indicate support for DOIC by including the OC- 314 Supported-Features AVP in all request messages originated or relayed 315 by the reacting node. 317 Reporting nodes indicate support for DOIC by including the OC- 318 Supported-Features AVP in all answer messages originated or relayed 319 by the reporting node that are in response to a request that 320 contained the OC-Supported-Features AVP. Reporting nodes also 321 include overload reports using the OC-OLR AVP in answer messages. 323 Note that the overload control solution does not have fixed server 324 and client roles. The DOIC node role is determined based on the 325 message type: whether the message is a request (i.e. sent by a 326 "reacting node") or an answer (i.e. sent by a "reporting node"). 327 Therefore, in a typical "client-server" deployment, the Diameter 328 Client MAY report its overload condition to the Diameter Server for 329 any Diameter Server initiated message exchange. An example of such 330 an exchange is the Diameter Server requesting a re-authentication from a Diameter 331 Client. 333 3.2.
DOIC Capability Announcement 335 The DOIC solution supports the ability for Diameter nodes to 336 determine if other nodes in the path of a request support the 337 solution. This capability is referred to as DOIC Capability 338 Announcement (DCA) and is separate from Diameter Capability Exchange. 340 The DCA mechanism uses the OC-Supported-Features AVP to indicate the 341 Diameter overload features supported. 343 The first node in the path of a Diameter request that supports the 344 DOIC solution inserts the OC-Supported-Features AVP in the request 345 message. 347 Note: As discussed elsewhere in the document, agents in the path 348 of the request can modify the OC-Supported-Features AVP. 350 Note: The DOIC solution must support deployments where Diameter 351 Clients and/or Diameter Servers do not support the DOIC solution. 352 In this scenario, Diameter Agents that support the DOIC solution 353 may handle overload abatement for the non-supporting Diameter 354 nodes. In this case the DOIC agent will insert the OC-Supported- 355 Features AVP in requests that do not already contain one, telling 356 the reporting node that there is a DOIC node that will handle 357 overload abatement. For transactions where there was an OC- 358 Supported-Features AVP in the request, the agent will insert the 359 OC-Supported-Features AVP in answers, telling the reacting node 360 that there is a reporting node. 362 The OC-Feature-Vector AVP will contain an indication of support for 363 the loss overload abatement algorithm defined in this specification 364 (see Section 5). This ensures that there is at least one commonly 365 supported overload abatement algorithm between the reporting node and 366 the reacting node(s) in the path of the request. 368 The reporting node inserts the OC-Supported-Features AVP in all 369 answer messages to requests that contained the OC-Supported-Features 370 AVP.
The contents of the reporting node's OC-Supported-Features AVP 371 indicate the set of Diameter overload features supported by the 372 reporting node. This specification defines one exception - the 373 reporting node only includes an indication of support for one 374 overload abatement algorithm, independent of the number of overload 375 abatement algorithms actually supported by the reacting node. The 376 overload abatement algorithm indicated is the algorithm that the 377 reporting node intends to use should it enter an overload condition. 378 Reacting nodes can use the indicated overload abatement algorithm to 379 prepare for possible overload reports and must use the indicated 380 overload abatement algorithm if traffic reduction is actually 381 requested. 383 Note that the loss algorithm defined in this document is a 384 stateless abatement algorithm. As a result it does not require 385 any actions by reacting nodes prior to the receipt of an overload 386 report. Stateful abatement algorithms that base the abatement 387 logic on a history of request messages sent might require reacting 388 nodes to maintain state in advance of receiving an overload report 389 to ensure that the overload reports can be properly handled. 391 Reporting nodes are allowed to change the overload abatement 392 algorithm indicated in the OC-Feature-Vector AVP if the reporting 393 node is not currently in an overload condition and sending overload 394 reports. The reporting node is not allowed to change the overload 395 abatement algorithm while the reporting node is in an overload 396 condition. 398 The individual features supported by the DOIC nodes are indicated in 399 the OC-Feature-Vector AVP. Any semantics associated with the 400 features will be defined in extension specifications that introduce 401 the features. 403 The DCA mechanism must also allow the scenario where the set of 404 features supported by the sender of a request and by agents in the 405 path of a request differ. 
In this case, the agent updates the OC- 406 Supported-Features AVP to reflect the mixture of the two sets of 407 supported features. 409 Note: The logic to determine the content of the modified OC- 410 Supported-Features AVP is out-of-scope for this specification and 411 is left to implementation decisions. Care must be taken not to 412 introduce interoperability issues for downstream or upstream DOIC 413 nodes. 415 3.3. DOIC Overload Condition Reporting 417 As with DOIC capability announcement, overload condition reporting 418 uses new AVPs (Section 6.3) to indicate an overload condition. 420 The OC-OLR AVP is referred to as an overload report. The OC-OLR AVP 421 includes the type of report, a sequence number, the length of time 422 that the report is valid and abatement algorithm specific AVPs. 424 Two types of overload reports are defined in this document, host 425 reports and realm reports. 427 A report of type "HOST_REPORT" is sent to indicate the overload of a 428 specific Diameter node for the application-id indicated in the 429 transaction. When receiving an OLR of type host, a reacting node 430 applies overload abatement to what is referred to in this document as 431 host-routed requests. The reacting node applies overload abatement 432 on those host-routed requests which the reacting node knows will be 433 served by the server that matches the Origin-Host AVP of the received 434 message that contained the received OLR of type host. 436 A report of type "REALM_REPORT" applies to realm-routed requests for 437 a specific realm as indicated in the Destination-Realm AVP. 439 This document assumes that there is a single source for realm-reports 440 for a given realm, or that if multiple nodes can send realm reports, 441 that each such node has full knowledge of the overload state of the 442 entire realm. A reacting node cannot distinguish between receiving 443 realm-reports from a single node, or from multiple nodes. 
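The scoping rules for the two report types can be sketched as follows. This is a minimal illustration, not part of the specification: the names OverloadReport, Request, and olr_applies are invented here, host-routed requests are modeled only by the presence of a Destination-Host value (the "other local knowledge" case from Section 2 is ignored), the enum values are placeholders, and the validity duration is assumed to be in seconds.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ReportType(Enum):
    # Symbolic names from this document; the values are placeholders.
    HOST_REPORT = "host"
    REALM_REPORT = "realm"


@dataclass(frozen=True)
class OverloadReport:
    report_type: ReportType
    sequence_number: int
    validity_duration: int  # assumed to be seconds
    application_id: int     # application of the message that carried the OLR
    origin_host: str        # Origin-Host of that message
    realm: str              # realm the report applies to


@dataclass(frozen=True)
class Request:
    application_id: int
    destination_realm: str
    destination_host: Optional[str] = None  # None models a realm-routed request


def olr_applies(olr: OverloadReport, request: Request) -> bool:
    """Return True if the request falls within the scope of the report."""
    if request.application_id != olr.application_id:
        return False
    if olr.report_type is ReportType.HOST_REPORT:
        # Host reports cover host-routed requests to the reporting host.
        return request.destination_host == olr.origin_host
    # Realm reports cover realm-routed requests to the overloaded realm.
    return (request.destination_host is None
            and request.destination_realm == olr.realm)
```

Under these assumptions, a request that names a Destination-Host is never matched by a realm report, and a request without one is never matched by a host report.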
445 Note: Known issues exist if there are multiple sources for overload reports 446 that apply to the same Diameter entity. Reacting nodes 447 have no way of determining the source and, as such, will treat 448 them as coming from a single source. Variance in sequence numbers 449 between the two sources can then cause incorrect overload 450 abatement treatment to be applied for indeterminate periods of 451 time. 453 Reporting nodes are responsible for determining the need for a 454 reduction of traffic. The method for making this determination is 455 implementation-specific and depends on the type of overload report 456 being generated. A host-report, for instance, will generally be 457 generated by tracking utilization of resources required by the host 458 to handle transactions for the Diameter application. A realm-report 459 generally impacts the traffic sent to multiple hosts and, as such, 460 requires tracking the capacity of all servers for realm-routed requests 461 for the application and realm. 463 Once a reporting node determines the need for a reduction in traffic, 464 it uses the DOIC-defined AVPs to report on the condition. These AVPs 465 are included in answer messages sent or relayed by the reporting 466 node. The reporting node indicates the overload abatement algorithm 467 that is to be used to handle the traffic reduction in the OC- 468 Supported-Features AVP. The OC-OLR AVP is used to communicate 469 information about the requested reduction. 471 Reacting nodes, upon receipt of an overload report, are responsible 472 for applying the overload abatement algorithm to traffic impacted by 473 the overload report. The method used to determine the requests that 474 are to receive overload abatement treatment is dependent on the 475 abatement algorithm. The loss abatement algorithm is defined in this 476 document (Section 5). Other abatement algorithms can be defined in 477 extensions to the DOIC solution.
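As a rough illustration of how a reacting node might pick the requests that receive abatement treatment, the sketch below assumes a percentage-based random selection in the spirit of the loss algorithm. The normative rules are those of Section 5, not this code. The function name split_requests, the reduction_percentage parameter, and the use of opaque strings for requests are all assumptions made for the example.

```python
import random
from typing import Iterable, List, Tuple


def split_requests(requests: Iterable[str],
                   reduction_percentage: int,
                   rng: random.Random) -> Tuple[List[str], List[str]]:
    """Split requests into (abated, normal) sets.

    Hypothetical helper: each request is independently selected for
    abatement treatment with probability reduction_percentage / 100.
    """
    abated: List[str] = []
    normal: List[str] = []
    for request in requests:
        # rng.random() lies in [0.0, 1.0), so a percentage of 0 selects
        # nothing and a percentage of 100 selects every request.
        if rng.random() * 100 < reduction_percentage:
            abated.append(request)
        else:
            normal.append(request)
    return abated, normal
```

Requests in the first list would then be throttled or diverted, while requests in the second list receive normal routing treatment. Passing the random generator in explicitly keeps the selection reproducible in tests.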
479 Two types of overload abatement treatment are defined, diversion and 480 throttling. Reacting nodes are responsible for determining which 481 treatment is appropriate for individual requests. 483 As the conditions that lead to the generation of the overload report 484 change, the reporting node can send new overload reports requesting 485 greater reduction if the condition gets worse or less reduction if 486 the condition improves. The reporting node sends an overload report 487 with a duration of zero to indicate that the overload condition has 488 ended and that use of the abatement algorithm to reduce traffic 489 sent is no longer needed. 491 The reacting node also determines when the overload report expires 492 based on the OC-Validity-Duration AVP in the overload report and 493 stops applying the abatement algorithm when the report expires. 495 3.4. DOIC Extensibility 497 The DOIC solution is designed to be extensible. This extensibility 498 is based on existing Diameter-based extensibility mechanisms, along 499 with the DOIC capability announcement mechanism. 501 Multiple categories of extensions are expected. These 502 include the definition of new overload abatement algorithms, the 503 definition of new report types and the definition of new scopes of 504 messages impacted by an overload report. 506 The DOIC solution uses the OC-Supported-Features AVP for DOIC nodes 507 to communicate supported features. The specific features supported 508 by the DOIC node are indicated in the OC-Feature-Vector AVP. DOIC 509 extensions that require new normative behavior define new values for 510 the OC-Feature-Vector AVP. DOIC extensions also have the ability to 511 add new AVPs to the OC-Supported-Features AVP, if additional 512 information about the new feature is required. 514 Reporting nodes use the OC-OLR AVP to communicate overload 515 occurrences.
This AVP can also be extended to add new AVPs allowing 516 reporting nodes to communicate additional information about handling 517 an overload condition. 519 If necessary, new extensions can also define new AVPs that are not 520 part of the OC-Supported-Features and OC-OLR group AVPs. It is, 521 however, recommended that DOIC extensions use the OC-Supported- 522 Features AVP and OC-OLR AVP to carry all DOIC related AVPs. 524 3.5. Simplified Example Architecture 526 Figure 1 illustrates the simplified architecture for Diameter 527 overload information conveyance.

529     Realm X                                  Same or other Realms
530    <--------------------------------------> <---------------------->

532       +--^-----+                 : (optional) :
533       |Diameter|                 :            :
534       |Server A|--+     .--.     : +---^----+ :     .--.
535       +--------+  |   _(    `.   : |Diameter| :   _(    `.   +---^----+
536                   +--(        )--:-|  Agent |-:--(        )--|Diameter|
537       +--------+  | ( `  .  )  ) : +-----^--+ : ( `  .  )  ) | Client |
538       |Diameter|--+  `--(___.-'  :            :  `--(___.-'  +-----^--+
539       |Server B|                 :            :
540       +---^----+                 :            :

542                           End-to-end Overload Indication
543              1)  <----------------------------------------------->
544                              Diameter Application Y

546                  Overload Indication A    Overload Indication A'
547              2)  <----------------------> <---------------------->
548                  Diameter Application Y   Diameter Application Y

550 Figure 1: Simplified architecture choices for overload indication 551 delivery 553 In Figure 1, the Diameter overload indication can be conveyed (1) 554 end-to-end between servers and clients or (2) between servers and 555 Diameter agent inside the realm and then between the Diameter agent 556 and the clients. 558 4. Solution Procedures 560 This section outlines the normative behavior for the DOIC solution. 562 4.1. Capability Announcement 564 This section defines DOIC Capability Announcement (DCA) behavior. 566 4.1.1. Reacting Node Behavior 568 A reacting node MUST include the OC-Supported-Features AVP in all 569 requests. It MAY include the OC-Feature-Vector AVP.
If it does so, 570 it MUST indicate support for the "loss" algorithm. If the reacting 571 node is configured to support features (including other algorithms) 572 in addition to the loss algorithm, it MUST indicate such support in 573 an OC-Feature-Vector AVP. 575 An OC-Supported-Features AVP in answer messages indicates there is a 576 reporting node for the transaction. The reacting node MAY take 577 action, for example creating state for some stateful abatement 578 algorithm, based on the features indicated in the OC-Feature-Vector 579 AVP. 581 Note: The loss abatement algorithm does not require stateful 582 behavior when there is no active overload report. This behavior 583 is described in Section 4.2 and Section 5. 585 4.1.2. Reporting Node Behavior 587 Upon receipt of a request message, a reporting node determines if 588 there is a reacting node for the transaction based on the presence of 589 the OC-Supported-Features AVP in the request message. 591 If the request message contains an OC-Supported-Features AVP then a 592 reporting node MUST include the OC-Supported-Features AVP in the 593 answer message for that transaction. 595 A reporting node MUST NOT include the OC-Supported-Features AVP, OC- 596 OLR AVP or any other overload control AVPs defined in extension 597 drafts in response messages for transactions where the request 598 message does not include the OC-Supported-Features AVP. Lack of the 599 OC-Supported-Features AVP in the request message indicates that there 600 is no reacting node for the transaction. 602 A reporting node knows what overload control functionality is 603 supported by the reacting node based on the content of the OC- 604 Feature-Vector AVP in the request message. 606 A reporting node MUST indicate support for one and only one abatement 607 algorithm in the OC-Feature-Vector AVP. 
The abatement algorithm indicated MUST be the one that the reporting
node wants the reacting node to use when the reporting node enters an
overload condition.

The abatement algorithm selected MUST be from the set of abatement
algorithms contained in the request message's OC-Feature-Vector AVP.

A reporting node that selects the loss algorithm can do so by including
the OC-Feature-Vector AVP with an explicit indication of the loss
algorithm, or it MAY omit the OC-Feature-Vector AVP. If it selects a
different algorithm, it MUST include the OC-Feature-Vector AVP with an
explicit indication of the selected algorithm.

For an ongoing overload condition, a reporting node MUST NOT change the
selected algorithm during the period of time that it is in an overload
condition and, as a result, is sending OC-OLR AVPs in answer messages.

The reporting node MAY change the overload abatement algorithm indicated
in the OC-Supported-Features AVP at any time as long as no previously
sent OLR can still be active.

The reporting node SHOULD indicate support for other DOIC features
defined in extension drafts that it supports and that apply to the
transaction.

   Note: Not all DOIC features will apply to all Diameter applications
   or deployment scenarios. The features included in the
   OC-Feature-Vector AVP are based on local reporting node policy.

4.1.3. Agent Behavior

Diameter Agents that support DOIC SHOULD ensure that all messages
relayed by the agent contain the OC-Supported-Features AVP.

A Diameter Agent SHOULD take on reacting node behavior for Diameter
endpoints that do not support the DOIC solution. A Diameter Agent
detects that a Diameter endpoint does not support DOIC reacting node
behavior when there is no OC-Supported-Features AVP in a request
message.

For a Diameter Agent to be a reacting node for a non-supporting Diameter
endpoint, the Diameter Agent MUST include the OC-Supported-Features AVP
in request messages it relays that did not contain the
OC-Supported-Features AVP when received.

A Diameter Agent SHOULD take on reporting node behavior for Diameter
endpoints that do not support the DOIC solution. A Diameter Agent
detects that a Diameter endpoint does not support DOIC reporting node
behavior when there is no OC-Supported-Features AVP in an answer message
for a transaction that contained the OC-Supported-Features AVP in the
request message.

For a Diameter Agent to take on reporting node behavior for a
non-supporting Diameter endpoint, the Diameter Agent MUST include the
OC-Supported-Features AVP in answer messages it relays that did not
contain the OC-Supported-Features AVP when received.

As with a Diameter endpoint taking on reporting node behavior, a
Diameter Agent MUST only include the OC-Supported-Features AVP in answer
messages for transactions where the request message received by the
Diameter Agent had an OC-Supported-Features AVP.

If a request message already has the OC-Supported-Features AVP then a
Diameter Agent MAY leave it unchanged in the relayed message or MAY
modify it to reflect the features appropriate for the transaction.

   For instance, if the agent supports a superset of the features
   reported by the reacting node then the agent might choose, based on
   local policy, to advertise that superset of features to the
   reporting node.

If the Diameter Agent changes the OC-Supported-Features AVP in a request
message then it is likely it will also need to modify the
OC-Supported-Features AVP in the answer message for the transaction. As
such, a Diameter Agent MAY modify the OC-Supported-Features AVP carried
in answer messages.
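
The agent rules above can be sketched as follows. This is a minimal
illustration, not part of the specification: the Message and DoicAgent
types, the txn_id field used to pair an answer with its request, and the
choice to advertise only the loss algorithm are all assumptions made for
the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

OLR_DEFAULT_ALGO = 0x0000000000000001  # loss algorithm bit (Section 6.2)


@dataclass
class Message:
    """Hypothetical stand-in for a Diameter message (not a real library type)."""
    txn_id: int  # assumed key that pairs an answer with its request
    # None models "no OC-Supported-Features AVP"; an int is the feature vector.
    oc_supported_features: Optional[int] = None


@dataclass
class DoicAgent:
    """Illustrative agent that fills in the capability AVP when it is missing."""
    features: int = OLR_DEFAULT_ALGO
    # Per transaction: did the request, as received, carry the AVP?
    _received_with_avp: Dict[int, bool] = field(default_factory=dict)

    def relay_request(self, req: Message) -> Message:
        self._received_with_avp[req.txn_id] = req.oc_supported_features is not None
        if req.oc_supported_features is None:
            # Take on reacting node behavior for a non-supporting endpoint.
            req.oc_supported_features = self.features
        return req

    def relay_answer(self, ans: Message) -> Message:
        received_with_avp = self._received_with_avp.pop(ans.txn_id, False)
        # Only add the AVP if the request received by the agent carried one.
        if received_with_avp and ans.oc_supported_features is None:
            # Take on reporting node behavior for a non-supporting endpoint.
            ans.oc_supported_features = OLR_DEFAULT_ALGO
        return ans
```

The sketch deliberately leaves out rewriting an AVP that is already
present, since that choice depends on local policy.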

When making changes to the OC-Supported-Features AVP the Diameter Agent
needs to ensure that there is no ambiguity in DOIC behavior for both
upstream and downstream DOIC nodes.

4.2. Overload Report Processing

4.2.1. Overload Control State

Both reacting and reporting nodes maintain Overload Control State (OCS)
for active overload conditions. The following sections define behavior
associated with that OCS.

4.2.1.1. Overload Control State for Reacting Nodes

A reacting node SHOULD maintain the following OCS per supported Diameter
application:

o  A host-type OCS entry for each Destination-Host to which it sends
   host-type requests and

o  A realm-type OCS entry for each Destination-Realm to which it sends
   realm-type requests.

A host-type OCS entry is identified by the pair of application-id and
the node's DiameterIdentity.

A realm-type OCS entry is identified by the pair of application-id and
realm.

The host-type and realm-type OCS entries MAY include the following
information (the actual information stored is an implementation
decision):

o  Sequence number (as received in OC-OLR)

o  Time of expiry (derived from the OC-Validity-Duration AVP received in
   the OC-OLR AVP and the time of reception of the message carrying the
   OC-OLR AVP)

o  Selected Abatement Algorithm (as received in the
   OC-Supported-Features AVP)

o  Abatement Algorithm specific input data (as received in the OC-OLR
   AVP, for example, OC-Reduction-Percentage for the Loss abatement
   algorithm)

4.2.1.2. Overload Control State for Reporting Nodes

A reporting node SHOULD maintain OCS entries per supported Diameter
application, per supported (and eventually selected) Abatement Algorithm
and per report-type.

An OCS entry is identified by the tuple of Application-Id, Report-Type
and Abatement Algorithm and MAY include the following information (the
actual information stored is an implementation decision):

o  Sequence number

o  Validity Duration

o  Expiration Time

o  Algorithm specific input data (for example, the Reduction Percentage
   for the Loss Abatement Algorithm)

4.2.1.3. Reacting Node Maintenance of Overload Control State

When a reacting node receives an OC-OLR AVP, it MUST determine if it is
for an existing or new overload condition.

   Note: For the remainder of this section the term OLR refers to the
   combination of the contents of the received OC-OLR AVP and the
   abatement algorithm indicated in the received OC-Supported-Features
   AVP.

When receiving an answer message with multiple OLRs of different types,
a reacting node MUST process each received OLR.

An OC-OLR AVP with unknown values SHOULD be silently discarded by the
reacting node, and the event SHOULD be logged.

The OLR is for an existing overload condition if a reacting node has an
OCS that matches the received OLR.

   For a host-report this means it matches the application-id and the
   host's DiameterIdentity in an existing host OCS entry.

   For a realm-report this means it matches the application-id and the
   realm in an existing realm OCS entry.

If the OLR is for an existing overload condition then a reacting node
MUST determine if the OLR is a retransmission or an update to the
existing OLR.

If the sequence number for the received OLR is greater than the sequence
number stored in the matching OCS entry then a reacting node MUST update
the matching OCS entry.

If the sequence number for the received OLR is less than or equal to the
sequence number in the matching OCS entry then a reacting node MUST
silently ignore the received OLR.
The matching OCS MUST NOT be updated in this case.

If the received OLR is for a new overload condition then a reacting node
MUST generate a new OCS entry for the overload condition.

   For a host-report this means a reacting node creates an OCS entry
   with the application-id in the received message and the
   DiameterIdentity of the Origin-Host in the received message.

   Note: This solution assumes that the Origin-Host AVP in the answer
   message included by the reporting node is not changed along the path
   to the reacting node.

   For a realm-report this means a reacting node creates an OCS entry
   with the application-id in the received message and the realm of the
   Origin-Realm in the received message.

If the received OLR contains a validity duration of zero ("0") then a
reacting node MUST update the OCS entry as being expired.

   Note: It is not necessarily appropriate to delete the OCS entry, as
   there is recommended behavior that the reacting node slowly returns
   to full traffic when ending an overload abatement period.

The reacting node does not delete an OCS when receiving an answer
message that does not contain an OC-OLR AVP (i.e., absence of OLR means
"no change").

4.2.1.4. Reporting Node Maintenance of Overload Control State

A reporting node SHOULD create a new OCS entry when entering an overload
condition.

   Note: If a reporting node knows through absence of the
   OC-Supported-Features AVP in received messages that there are no
   reacting nodes supporting DOIC then the reporting node can choose to
   not create OCS entries.

When generating a new OCS entry the sequence number SHOULD be set to
zero ("0").

When generating sequence numbers for new overload conditions, the new
sequence number MUST be greater than any sequence number in an active
(unexpired) overload report for the same application and report-type
previously sent by the reporting node.
This property MUST hold over a reboot of the reporting node.

   Note: One way of addressing this over a reboot of a reporting node is
   to use a time stamp for the first overload condition that occurs
   after the reboot and to start using sequence numbers of zero for
   subsequent overload conditions.

A reporting node MUST update an OCS entry when it needs to adjust the
validity duration of the overload condition at reacting nodes.

   For instance, a reporting node might wish to instruct reacting nodes
   to continue overload abatement for a longer period of time than
   originally communicated. This also applies if the reporting node
   wishes to shorten the period of time that overload abatement is to
   continue.

A reporting node MUST NOT update the abatement algorithm in an active
OCS entry.

A reporting node MUST update an OCS entry when it wishes to adjust any
abatement algorithm specific parameters, including the reduction
percentage used for the Loss abatement algorithm.

   For instance, if a reporting node wishes to change the reduction
   percentage either higher, if the overload condition has worsened, or
   lower, if the overload condition has improved, then the reporting
   node would update the appropriate OCS entry.

A reporting node MUST update the sequence number associated with the OCS
entry any time the contents of the OCS entry are changed. This will
result in a new sequence number being sent to reacting nodes,
instructing reacting nodes to process the OC-OLR AVP.

A reporting node SHOULD update an OCS entry with a validity duration of
zero ("0") when the overload condition ends.

   Note: If a reporting node knows that the OCS entries in the reacting
   nodes are near expiration then the reporting node might decide not to
   send an OLR with a validity duration of zero.
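
The reporting-node update rules in this section can be sketched as
follows. The ReportingOcs class and its field names are hypothetical,
the entry covers the loss algorithm only, and persistence of the
sequence number across a reboot is deliberately not modeled.

```python
from dataclasses import dataclass


@dataclass
class ReportingOcs:
    """Hypothetical reporting-node OCS entry for one application and report type."""
    sequence_number: int = 0            # new entries start at zero
    validity_duration_ms: int = 30000   # default from Section 6.5
    reduction_percentage: int = 0

    def update(self, *, reduction_percentage=None, validity_duration_ms=None):
        """Apply changes; bump the sequence number only if something changed."""
        changed = False
        if (reduction_percentage is not None
                and reduction_percentage != self.reduction_percentage):
            self.reduction_percentage = reduction_percentage
            changed = True
        if (validity_duration_ms is not None
                and validity_duration_ms != self.validity_duration_ms):
            self.validity_duration_ms = validity_duration_ms
            changed = True
        if changed:
            # A larger sequence number tells reacting nodes to process the OLR.
            self.sequence_number += 1
        return changed

    def end_overload(self):
        """Signal the end of the overload condition (validity duration zero)."""
        return self.update(validity_duration_ms=0)
```

Note that the abatement algorithm is intentionally not an updatable
field, matching the rule that it cannot change in an active OCS entry.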

A reporting node MUST keep an OCS entry with a validity duration of zero
("0") for a period of time long enough to ensure that any non-expired
OCS entry created in a reacting node as a result of the overload
condition in the reporting node is deleted.

4.2.2. Reacting Node Behavior

When a reacting node sends a request it MUST determine if that request
matches an active OCS.

If the request matches an active OCS then the reacting node MUST use the
overload abatement algorithm indicated in the OCS to determine if the
request is to receive overload abatement treatment.

For the Loss abatement algorithm defined in this specification, see
Section 5 for the overload abatement algorithm logic applied.

If the overload abatement algorithm selects the request for overload
abatement treatment then the reacting node MUST apply overload abatement
treatment on the request. The abatement treatment applied depends on the
context of the request.

If the request is a host-routed request then the reacting node SHOULD
apply throttling abatement treatment to the request.

If the request is a realm-routed request then the reacting node SHOULD
apply diversion abatement treatment to the request.

If the overload abatement treatment results in throttling of the request
and if the reacting node is an agent then the agent MUST send an
appropriate error as defined in Section 7.

The behavior of reacting nodes that are Diameter endpoints when
throttling requests depends on the application and is outside the scope
of this specification.
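
The reacting-node rules of Section 4.2.1.3 and of this section can be
sketched together as follows. The class, function and parameter names
are hypothetical, the table key (report type, application-id, host or
realm) is one possible layout, and only the loss algorithm's reduction
percentage is stored.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

HOST_REPORT, REALM_REPORT = 0, 1  # OC-Report-Type values (Section 6.6)


@dataclass
class OcsEntry:
    sequence_number: int
    expiry: float               # absolute time in seconds
    reduction_percentage: int


class ReactingNodeState:
    """Hypothetical OCS table keyed by (report type, application-id, host/realm)."""

    def __init__(self) -> None:
        self.entries: Dict[Tuple[int, int, str], OcsEntry] = {}

    def process_olr(self, report_type, app_id, origin, seq, validity_ms,
                    reduction, now=None) -> bool:
        """Apply a received OLR; return True if an entry was created or updated."""
        now = time.time() if now is None else now
        key = (report_type, app_id, origin)
        current = self.entries.get(key)
        if current is not None and seq <= current.sequence_number:
            return False        # retransmission or stale report: ignore silently
        # A validity duration of zero gives expiry == now, i.e. already expired.
        self.entries[key] = OcsEntry(seq, now + validity_ms / 1000.0, reduction)
        return True

    def active_entry(self, report_type, app_id, destination,
                     now=None) -> Optional[OcsEntry]:
        """Return the unexpired OCS entry matching an outgoing request, if any."""
        now = time.time() if now is None else now
        entry = self.entries.get((report_type, app_id, destination))
        if entry is None or entry.expiry <= now:
            return None
        return entry


def abatement_treatment(report_type: int) -> str:
    """Host-routed requests are throttled; realm-routed requests are diverted."""
    return "throttle" if report_type == HOST_REPORT else "divert"
```

Expired entries are kept rather than deleted so that a caller could
still ramp traffic back up gradually.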

In the case that the OCS entry indicated that no traffic was to be sent
to the overloaded entity and either the validity duration expires or the
reporting node explicitly signals the end of the overload condition with
a validity duration of zero ("0"), the overload abatement associated
with the overload report MUST be ended in a controlled fashion.

4.2.3. Reporting Node Behavior

If there is an active OCS entry then a reporting node SHOULD include the
OC-OLR AVP in all answer messages to requests that contain the
OC-Supported-Features AVP and that match the active OCS entry.

   Note: A request matches if the application-id in the request matches
   the application-id in any active OCS entry and if the report-type in
   the OCS entry matches a report-type supported by the reporting node
   as indicated in the OC-Supported-Features AVP.

The contents of the OC-OLR AVP depend on the selected algorithm.

A reporting node MAY choose to not resend an overload report to a
reacting node if it can guarantee that this overload report is already
active in the reacting node.

   Note: In some cases (e.g., when there are one or more agents in the
   path between reporting and reacting nodes, or when overload reports
   are discarded by reacting nodes) a reporting node may not be able to
   guarantee that the reacting node has received the report.

A reporting node MUST NOT send overload reports of a type that has not
been advertised as supported by the reacting node.

   Note: A reacting node implicitly advertises support for the host and
   realm report types by including the OC-Supported-Features AVP in the
   request. Support for other report types will be explicitly indicated
   by new feature bits in the OC-Feature-Vector AVP.

A reporting node MAY rely on the OC-Validity-Duration AVP values for the
implicit overload control state cleanup on the reacting node.
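
A minimal sketch of the decision to attach overload reports to an answer
is shown below. The function name and the dictionary keys that describe
an OCS entry are hypothetical; only the host and realm report types are
considered, because those are the types a reacting node advertises
implicitly.

```python
from typing import Dict, List

HOST_REPORT, REALM_REPORT = 0, 1                     # Section 6.6
IMPLICIT_REPORT_TYPES = {HOST_REPORT, REALM_REPORT}  # advertised implicitly


def olrs_for_answer(request_has_supported_features: bool,
                    request_app_id: int,
                    active_ocs: List[Dict[str, int]]) -> List[Dict[str, int]]:
    """Return the OC-OLR contents to attach to the answer for one request."""
    if not request_has_supported_features:
        # No reacting node for this transaction: send no overload AVPs.
        return []
    olrs = []
    for entry in active_ocs:
        if entry["app_id"] != request_app_id:
            continue
        if entry["report_type"] not in IMPLICIT_REPORT_TYPES:
            # Other report types would need an explicit feature bit.
            continue
        olrs.append({
            "OC-Sequence-Number": entry["sequence_number"],
            "OC-Report-Type": entry["report_type"],
            "OC-Reduction-Percentage": entry["reduction_percentage"],
            "OC-Validity-Duration": entry["validity_duration_ms"],
        })
    return olrs
```

The sketch always resends the report; an implementation that can
guarantee the report is already active at the reacting node could skip
it.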

A reporting node SHOULD explicitly indicate the end of an overload
occurrence by sending a new OLR with OC-Validity-Duration set to a value
of zero ("0"). The reporting node SHOULD ensure that all reacting nodes
receive the updated overload report.

   Note: All OLRs sent have an expiration time calculated by adding the
   validity-duration contained in the OLR to the time the message was
   sent. Transit time for the OLR can be safely ignored. The reporting
   node can ensure that all reacting nodes have received the OLR by
   continuing to send it in answer messages until all OLRs sent for that
   overload condition have expired.

When a reporting node sends an OLR, it effectively delegates any
necessary throttling to downstream nodes. If the reporting node also
locally throttles the same set of messages, the overall number of
throttled requests may be higher than intended. Therefore, before
applying local message throttling, a reporting node needs to check if
these messages match existing OCS entries, indicating that these
messages have survived throttling applied by downstream nodes that have
received the related OLR.

However, even if the set of messages match existing OCS entries, the
reporting node can still apply other abatement methods such as
diversion. The reporting node might also need to throttle requests for
reasons other than overload. For example, an agent or server might have
a configured rate limit for each client, and throttle requests that
exceed that limit, even if such requests had already been candidates for
throttling by downstream nodes. The reporting node also has the option
to send new OLRs requesting greater reductions in traffic, reducing the
need for local throttling.

A reporting node SHOULD decrease requested overload abatement treatment
in a controlled fashion to avoid oscillations in traffic.

4.3. Protocol Extensibility

The DOIC solution can be extended. Types of potential extensions include
new traffic abatement algorithms, new report types or other new
functionality.

When defining a new extension that requires new normative behavior, the
specification MUST define a new feature for the OC-Feature-Vector AVP.
This feature bit is used to communicate support for the new feature.

The extension MAY define new AVPs for use in DOIC Capability
Announcement and for use in DOIC Overload reporting. These new AVPs
SHOULD be defined to be extensions to the OC-Supported-Features and
OC-OLR AVPs defined in this document.

The Grouped AVP extension mechanisms defined in [RFC6733] apply. This
allows, for example, defining a new feature that is mandatory to be
understood even when piggybacked on an existing application.

The handling of feature bits in the OC-Feature-Vector AVP that are not
associated with overload abatement algorithms MUST be specified by the
extensions that define the features.

When defining new report type values, the corresponding specification
MUST define the semantics of the new report types and how they affect
the OC-OLR AVP handling. The specification MUST also reserve a
corresponding new feature bit in the OC-Feature-Vector AVP.

The OC-OLR AVP can be expanded with optional sub-AVPs only if a legacy
DOIC implementation can safely ignore them without breaking backward
compatibility for the given OC-Report-Type AVP value. If the new
sub-AVPs imply new semantics for handling the indicated report type,
then a new OC-Report-Type AVP value MUST be defined.

Documents that introduce new report types MUST describe any limitations
on their use across non-supporting agents.

New features (feature bits in the OC-Feature-Vector AVP) and report
types (in the OC-Report-Type AVP) MUST be registered with IANA.
As with any Diameter specification, [RFC6733] requires all new AVPs to
be registered with IANA. See Section 8 for the required procedures.

5. Loss Algorithm

This section documents the Diameter overload loss abatement algorithm.

5.1. Overview

The DOIC specification supports the ability for multiple overload
abatement algorithms to be specified. The abatement algorithm used for
any instance of overload is determined by the Diameter Overload
Capability Announcement process documented in Section 4.1.

The loss algorithm described in this section is the default algorithm
that must be supported by all Diameter nodes that support DOIC.

The loss algorithm is designed to be a straightforward and stateless
overload abatement algorithm. It is used by reporting nodes to request a
percentage reduction in the amount of traffic sent. The traffic impacted
by the requested reduction depends on the type of overload report.

Reacting nodes use a strategy of applying abatement logic to the
requested percentage of request messages sent (or handled in the case of
agents) by the reacting node that are impacted by the overload report.

At a conceptual level, the logic at the reacting node could be outlined
as follows.

1.  An overload report is received and the associated OCS is either
    saved or updated (if required) by the reacting node.

2.  A new Diameter request is generated by the application running on
    the reacting node.

3.  The reacting node determines that an active overload report applies
    to the request, as indicated by the corresponding OCS entry.

4.  The reacting node determines if overload abatement treatment should
    be applied to the request. One approach that could be taken for each
    request is to select a random number between 1 and 100.
    If the random number is less than or equal to the indicated
    reduction percentage then the request is given abatement treatment,
    otherwise the request is given normal routing treatment.

5.2. Reporting Node Behavior

The method a reporting node uses to determine the amount of traffic
reduction required to address an overload condition is an implementation
decision.

When a reporting node that has selected the loss abatement algorithm
determines the need to request a reduction in traffic, it includes an
OC-OLR AVP in response messages as described in Section 4.2.3.

When sending the OC-OLR AVP, the reporting node MUST indicate a
percentage reduction in the OC-Reduction-Percentage AVP.

The reporting node MAY change the reduction percentage in subsequent
overload reports. When doing so the reporting node must conform to the
overload report handling specified in Section 4.2.3.

5.3. Reacting Node Behavior

The method a reacting node uses to determine which request messages are
given abatement treatment is an implementation decision.

When receiving an OC-OLR in an answer message where the algorithm
indicated in the OC-Supported-Features AVP is the loss algorithm, the
reacting node MUST apply abatement treatment to the requested percentage
of request messages sent.

   Note: The loss algorithm is a stateless algorithm. As a result, the
   reacting node does not guarantee that there will be an absolute
   reduction in traffic sent. Rather, it guarantees that the requested
   percentage of new requests will be given abatement treatment.

When applying overload abatement treatment for the loss abatement
algorithm, the reacting node MUST abate the requested percentage of
requests that would have otherwise been sent to the reporting host or
realm.
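
The random-number approach outlined in Section 5.1 can be sketched as
follows. The function name is hypothetical, and the comparison is
inclusive so that a reduction of 100 abates every request while a
reduction of 0 abates none. Treating an out-of-range percentage as "no
abatement" is an assumption based on the statement in Section 6.7 that
values greater than 100 are ignored.

```python
import random


def should_abate(reduction_percentage: int, rng: random.Random) -> bool:
    """Decide, statelessly, whether one request gets abatement treatment."""
    if not 0 <= reduction_percentage <= 100:
        return False  # out-of-range values are ignored (assumption, see lead-in)
    # randint is inclusive on both ends, so the draw is uniform over 1..100.
    return rng.randint(1, 100) <= reduction_percentage
```

For example, calling should_abate(30, random.Random()) once per outgoing
request abates roughly 30 percent of requests over time, without keeping
any per-request state.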

If the reacting node comes out of a 100 percent traffic reduction as a
result of the overload report timing out, the following behavior is
RECOMMENDED. The reacting node sending the traffic should be
conservative and, for example, first send "probe" messages to learn the
overload condition of the overloaded node before converging to any
traffic amount/rate decided by the sender. Similar concerns apply in all
cases when the overload report times out unless the previous overload
report stated 0 percent reduction.

If the reacting node does not receive an OLR in messages sent to the
formerly overloaded node then the reacting node SHOULD slowly increase
the rate of traffic sent to the overloaded node.

When an active overload report expires, it is suggested that the
reacting node progressively decrease the amount of traffic given
abatement treatment, until the reduction is completely removed and no
traffic is given abatement treatment.

   The goal of this behavior is to reduce the probability of overload
   condition thrashing where an immediate transition from 100% reduction
   to 0% reduction results in the reporting node moving quickly back
   into an overload condition.

6. Attribute Value Pairs

This section describes the encoding and semantics of the Diameter
Overload Indication Attribute Value Pairs (AVPs) defined in this
document.

A new application specification can incorporate the overload control
mechanism specified in this document by making it mandatory to implement
for the application and referencing this specification normatively. It
is the responsibility of the Diameter application designers to define
how overload control mechanisms work on that application.

6.1. OC-Supported-Features AVP

The OC-Supported-Features AVP (AVP code TBD1) is of type Grouped and
serves two purposes.
First, it announces a node's support for the DOIC solution in general.
Second, it contains the description of the supported DOIC features of
the sending node. The OC-Supported-Features AVP MUST be included in
every Diameter request message a DOIC supporting node sends.

   OC-Supported-Features ::= < AVP Header: TBD1 >
                             [ OC-Feature-Vector ]
                           * [ AVP ]

The OC-Feature-Vector sub-AVP is used to announce the DOIC features
supported by the DOIC node, in the form of a flag-bits field in which
each bit announces one feature or capability supported by the node (see
Section 6.2). The absence of the OC-Feature-Vector AVP indicates that
only the default traffic abatement algorithm described in this
specification is supported.

6.2. OC-Feature-Vector AVP

The OC-Feature-Vector AVP (AVP code TBD6) is of type Unsigned64 and
contains a 64 bit flags field of announced capabilities of a DOIC node.
The value of zero (0) is reserved.

The OC-Feature-Vector sub-AVP is used to announce the DOIC features
supported by the DOIC node, in the form of a flag-bits field in which
each bit announces one feature or capability supported by the node. The
absence of the OC-Feature-Vector AVP indicates that only the default
traffic abatement algorithm described in this specification is
supported.

The following capabilities are defined in this document:

OLR_DEFAULT_ALGO (0x0000000000000001)

   When this flag is set by a DOIC reacting node it means that the
   default traffic abatement (loss) algorithm is supported. When this
   flag is set by a DOIC reporting node it means that the loss algorithm
   will be used for requested overload abatement.

6.3. OC-OLR AVP

The OC-OLR AVP (AVP code TBD2) is of type Grouped and contains the
information necessary to convey an overload report on an overload
condition at the reporting node.
The OC-OLR AVP does not explicitly contain all information needed by the
reacting node to decide whether a subsequent request must undergo
abatement using the received reduction percentage. The value of the
OC-Report-Type AVP within the OC-OLR AVP indicates which implicit
information is relevant for this decision (see Section 6.6). The
application the OC-OLR AVP applies to is the same as the Application-Id
found in the Diameter message header. The host or realm the OC-OLR AVP
concerns is determined from the Origin-Host AVP and/or Origin-Realm AVP
found in the encapsulating Diameter command. The OC-OLR AVP is intended
to be sent only by a reporting node.

   OC-OLR ::= < AVP Header: TBD2 >
              < OC-Sequence-Number >
              < OC-Report-Type >
              [ OC-Reduction-Percentage ]
              [ OC-Validity-Duration ]
            * [ AVP ]

6.4. OC-Sequence-Number AVP

The OC-Sequence-Number AVP (AVP code TBD3) is of type Unsigned64. Its
usage in the context of overload control is described in Section 4.2.

From the functionality point of view, the OC-Sequence-Number AVP is used
as a non-volatile increasing counter for a sequence of overload reports
between two DOIC nodes for the same overload occurrence. The sequence
number is only required to be unique between two DOIC nodes. Sequence
numbers are treated in a uni-directional manner, i.e., the sequence
numbers used in each direction between two DOIC nodes are not related or
correlated.

6.5. OC-Validity-Duration AVP

The OC-Validity-Duration AVP (AVP code TBD4) is of type Unsigned32 and
indicates in milliseconds the validity time of the overload report. The
number of milliseconds is measured after reception of the first OC-OLR
AVP with a given value of OC-Sequence-Number AVP. The default value for
the OC-Validity-Duration AVP is 30000 (i.e., 30 seconds).
When the OC-Validity-Duration AVP is not present in the OC-OLR AVP, the
default value applies.

6.6. OC-Report-Type AVP

The OC-Report-Type AVP (AVP code TBD5) is of type Enumerated. The value
of the AVP describes what the overload report concerns. The following
values are initially defined:

HOST_REPORT 0
   The overload report is for a host. Overload abatement treatment
   applies to host-routed requests.

REALM_REPORT 1
   The overload report is for a realm. Overload abatement treatment
   applies to realm-routed requests.

6.7. OC-Reduction-Percentage AVP

The OC-Reduction-Percentage AVP (AVP code TBD8) is of type Unsigned32
and describes the percentage of the traffic that the sender is requested
to reduce, compared to what it otherwise would send. The
OC-Reduction-Percentage AVP applies to the default (loss) algorithm
specified in this specification. However, the AVP can be reused for
future abatement algorithms, if its semantics fit into the new
algorithm.

The value of the OC-Reduction-Percentage AVP is between zero (0) and one
hundred (100). Values greater than 100 are ignored. The value of 100
means that all traffic is to be throttled, i.e., the reporting node is
under a severe load and ceases to process any new messages. The value of
0 means that the reporting node is in a stable state and has no need for
the reacting node to apply any traffic abatement. The default value of
the OC-Reduction-Percentage AVP is 0. When the OC-Reduction-Percentage
AVP is not present in the overload report, the default value applies.

6.8. Attribute Value Pair flag rules

                                                   +---------+
                                                   |AVP flag |
                                                   |rules    |
                                                   +----+----+
                        AVP   Section              |    |MUST|
 Attribute Name         Code  Defined  Value Type  |MUST| NOT|
+--------------------------------------------------+----+----+
|OC-Supported-Features  TBD1  6.1      Grouped     |    | V  |
+--------------------------------------------------+----+----+
|OC-OLR                 TBD2  6.3      Grouped     |    | V  |
+--------------------------------------------------+----+----+
|OC-Sequence-Number     TBD3  6.4      Unsigned64  |    | V  |
+--------------------------------------------------+----+----+
|OC-Validity-Duration   TBD4  6.5      Unsigned32  |    | V  |
+--------------------------------------------------+----+----+
|OC-Report-Type         TBD5  6.6      Enumerated  |    | V  |
+--------------------------------------------------+----+----+
|OC-Reduction                                      |    |    |
|  -Percentage          TBD8  6.7      Unsigned32  |    | V  |
+--------------------------------------------------+----+----+
|OC-Feature-Vector      TBD6  6.2      Unsigned64  |    | V  |
+--------------------------------------------------+----+----+

As described in the Diameter base protocol [RFC6733], the M-bit usage
for a given AVP in a given command may be defined by the application.

7. Error Response Codes

When a DOIC node rejects a Diameter request due to overload, the DOIC
node MUST select an appropriate error response code. This determination
is made based on the probability of the request succeeding if retried on
a different path.

A reporting node rejecting a Diameter request due to an overload
condition SHOULD send a DIAMETER_TOO_BUSY error response, if it can
assume that the same request may succeed on a different path.

If a reporting node knows or assumes that the same request will not
succeed on a different path, a DIAMETER_UNABLE_TO_COMPLY error response
SHOULD be used. Retrying would consume valuable resources during an
occurrence of overload.
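
One possible heuristic for this choice is sketched below. It assumes,
for illustration only, that a request addressed to the rejecting node's
own identity cannot succeed elsewhere and that any other request might.
The function name and parameters are hypothetical; the numeric values
are the Result-Code values assigned in [RFC6733].

```python
from typing import Optional

DIAMETER_TOO_BUSY = 3004          # protocol error, retry elsewhere may succeed
DIAMETER_UNABLE_TO_COMPLY = 5012  # permanent failure, do not retry


def overload_result_code(destination_host: Optional[str],
                         own_identity: str) -> int:
    """Pick a Result-Code for a request rejected because of overload."""
    if destination_host == own_identity:
        # A retry would be routed back to this same overloaded node.
        return DIAMETER_UNABLE_TO_COMPLY
    # No Destination-Host, or a different one: another path may succeed.
    return DIAMETER_TOO_BUSY
```

A real node would likely combine this with knowledge of its routing
table, for example whether any alternative peer exists at all.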
      For instance, if the request arrived at the reporting node
      without a Destination-Host AVP then the reporting node might
      determine that there is an alternative Diameter node that could
      successfully process the request and that retrying the
      transaction would not negatively impact the reporting node.
      DIAMETER_TOO_BUSY would be sent in this case.

      If the request arrived at the reporting node with a
      Destination-Host AVP populated with its own Diameter identity
      then the reporting node can assume that retrying the request
      would result in it coming to the same reporting node.
      DIAMETER_UNABLE_TO_COMPLY would be sent in this case.

      A second example is when an agent that supports the DOIC solution
      is performing the role of a reacting node for a non-supporting
      client. Requests that are rejected as a result of DOIC throttling
      by the agent in this scenario would generally be rejected with a
      DIAMETER_UNABLE_TO_COMPLY response code.

8.  IANA Considerations

8.1.  AVP codes

   New AVPs defined by this specification are listed in Section 6. All
   AVP codes are allocated from the 'Authentication, Authorization, and
   Accounting (AAA) Parameters' AVP Codes registry.

8.2.  New registries

   Two new registries are needed under the 'Authentication,
   Authorization, and Accounting (AAA) Parameters' registry.

   A new "Overload Control Feature Vector" registry is required. The
   registry must contain the following:

      Feature Vector Value

      Specification - the specification that defines the new value.

   See Section 6.2 for the initial Feature Vector Value in the
   registry. This specification is the specification defining the
   value. New values can be added into the registry using the
   Specification Required policy [RFC5226].

   A new "Overload Report Type" registry is required.
   The registry must contain the following:

      Report Type Value

      Specification - the specification that defines the new value.

   See Section 6.6 for the initial assignment in the registry. New
   types can be added using the Specification Required policy
   [RFC5226].

9.  Security Considerations

   DOIC gives Diameter nodes the ability to request that downstream
   nodes send fewer Diameter requests. Nodes do this by exchanging
   overload reports that directly effect this reduction. This exchange
   is potentially subject to multiple methods of attack, and has the
   potential to be used as a Denial-of-Service (DoS) attack vector.

   Overload reports may contain information about the topology and
   current status of a Diameter network. This information is
   potentially sensitive. Network operators may wish to control
   disclosure of overload reports to unauthorized parties to avoid its
   use for competitive intelligence or to target attacks.

   Diameter does not include features to provide end-to-end
   authentication, integrity protection, or confidentiality. This may
   cause complications when sending overload reports between
   non-adjacent nodes.

9.1.  Potential Threat Modes

   The Diameter protocol involves transactions in the form of requests
   and answers exchanged between clients and servers. These clients
   and servers may be peers, that is, they may share a direct transport
   (e.g., TCP or SCTP) connection, or the messages may traverse one or
   more intermediaries, known as Diameter Agents. Diameter nodes use
   TLS, DTLS, or IPsec to authenticate peers, and to provide
   confidentiality and integrity protection of traffic between peers.
   Nodes can make authorization decisions based on the peer identities
   authenticated at the transport layer.

   When agents are involved, this presents an effectively transitive
   trust model.
   That is, a Diameter client or server can authorize an agent for
   certain actions, but it must trust that agent to make appropriate
   authorization decisions about its peers, and so on. Since
   confidentiality and integrity protection occur at the transport
   layer, agents can read, and perhaps modify, any part of a Diameter
   message, including an overload report.

   There are several ways an attacker might attempt to exploit the
   overload control mechanism. An unauthorized third party might
   inject an overload report into the network. If this third party is
   upstream of an agent, and that agent fails to apply proper
   authorization policies, downstream nodes may mistakenly trust the
   report. This attack is at least partially mitigated by the
   assumption that nodes include overload reports in Diameter answers
   but not in requests. This requires an attacker to have knowledge of
   the original request in order to construct an answer. Such an
   answer would also need to arrive at a Diameter node via a protected
   transport connection. Therefore, implementations MUST validate that
   an answer containing an overload report is a properly constructed
   response to a pending request prior to acting on the overload
   report, and that the answer was received via an appropriate
   transport connection.

   A similar attack involves a compromised but otherwise authorized
   node that sends an inappropriate overload report. For example, a
   server for the realm "example.com" might send an overload report
   indicating that a competitor's realm "example.net" is overloaded.
   If other nodes act on the report, they may falsely believe that
   "example.net" is overloaded, effectively reducing that realm's
   capacity.
   Therefore, it's critical that nodes validate that an overload report
   received from a peer actually falls within that peer's
   responsibility before acting on the report or forwarding the report
   to other peers. For example, an overload report from a peer that
   applies to a realm not handled by that peer is suspect.

      This attack is partially mitigated by the fact that the
      application, as well as host and realm, for a given OLR is
      determined implicitly by respective AVPs in the enclosing answer.
      If a reporting node modifies any of those AVPs, the enclosing
      transaction will also be affected.

9.2.  Denial of Service Attacks

   Diameter overload reports, especially realm-reports, can cause a
   node to cease sending some or all Diameter requests for an extended
   period. This makes them a tempting vector for DoS attacks.
   Furthermore, since Diameter is almost always used in support of
   other protocols, a DoS attack on Diameter is likely to impact those
   protocols as well. Therefore, Diameter nodes MUST NOT honor or
   forward OLRs received from peers that are not trusted to send them.

   An attacker might use the information in an OLR to assist in DoS
   attacks. For example, an attacker could use information about
   current overload conditions to time an attack for maximum effect, or
   use subsequent overload reports as a feedback mechanism to learn the
   results of a previous or ongoing attack. Operators need the ability
   to ensure that OLRs are not leaked to untrusted parties.

9.3.  Non-Compliant Nodes

   In the absence of an overload control mechanism, Diameter nodes need
   to implement strategies to protect themselves from floods of
   requests, and to make sure that a disproportionate load from one
   source does not prevent other sources from receiving service.
   For example, a Diameter server might throttle a certain percentage
   of requests from sources that exceed certain limits. Overload
   control can be thought of as an optimization for such strategies,
   where downstream nodes never send the excess requests in the first
   place. However, the presence of an overload control mechanism does
   not remove the need for these other protection strategies.

   When a Diameter node sends an overload report, it cannot assume that
   all nodes will comply, even if they indicate support for DOIC. A
   non-compliant node might continue to send requests with no reduction
   in load. Such non-compliance could be accidental, or it could be a
   malicious attempt to gain an unfair advantage over compliant nodes.
   Requirement 28 [RFC7068] indicates that the overload control
   solution cannot assume that all Diameter nodes in a network are
   trusted, and that malicious nodes must not be allowed to take
   advantage of the overload control mechanism to get more than their
   fair share of service.

9.4.  End-to-End Security Issues

   The lack of end-to-end integrity features makes it difficult to
   establish trust in overload reports received from non-adjacent
   nodes. Any agents in the message path may insert or modify overload
   reports. Nodes must trust that their adjacent peers perform proper
   checks on overload reports from their peers, and so on, creating a
   transitive-trust requirement extending for potentially long chains
   of nodes. Network operators must determine if this transitive trust
   requirement is acceptable for their deployments. Nodes supporting
   Diameter overload control MUST give operators the ability to select
   which peers are trusted to deliver overload reports, and whether
   they are trusted to forward overload reports from non-adjacent
   nodes.
   DOIC nodes MUST strip DOIC AVPs from messages received from peers
   that are not trusted for DOIC purposes.

   The lack of end-to-end confidentiality protection means that any
   Diameter agent in the path of an overload report can view the
   contents of that report. In addition to the requirement to select
   which peers are trusted to send overload reports, operators MUST be
   able to select which peers are authorized to receive reports. A
   node MUST NOT send an overload report to a peer not authorized to
   receive it. Furthermore, an agent MUST remove any overload reports
   that might have been inserted by other nodes before forwarding a
   Diameter message to a peer that is not authorized to receive
   overload reports.

      A DOIC node cannot always automatically detect that a peer also
      supports DOIC. For example, a node might have a peer that is a
      non-supporting agent. If nodes on the other side of that agent
      send OC-Supported-Features AVPs, the agent is likely to forward
      them as unknown AVPs. Messages received across the
      non-supporting agent may be indistinguishable from messages
      received across a DOIC-supporting agent, giving the false
      impression that the non-supporting agent actually supports DOIC.
      This complicates the transitive-trust nature of DOIC. Operators
      need to be careful to avoid situations where a non-supporting
      agent is mistakenly trusted to enforce DOIC-related authorization
      policies.

   At the time of this writing, the DIME working group is studying
   requirements for adding end-to-end security features
   [I-D.ietf-dime-e2e-sec-req] to Diameter. These features, when they
   become available, might make it easier to establish trust in
   non-adjacent nodes for overload control purposes.
   Readers should be reminded, however, that the overload control
   mechanism encourages Diameter agents to modify AVPs in, or insert
   additional AVPs into, existing messages that are originated by other
   nodes. If end-to-end security is enabled, there is a risk that such
   modification could violate integrity protection. The details of
   using any future Diameter end-to-end security mechanism with
   overload control will require careful consideration, and are beyond
   the scope of this document.

10.  Contributors

   The following people contributed substantial ideas, feedback, and
   discussion to this document:

   o  Eric McMurry

   o  Hannes Tschofenig

   o  Ulrich Wiehe

   o  Jean-Jacques Trottin

   o  Maria Cruz Bartolome

   o  Martin Dolly

   o  Nirav Salot

   o  Susan Shishufeng

11.  References

11.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", BCP 26, RFC 5226,
              May 2008.

   [RFC5905]  Mills, D., Martin, J., Burbank, J., and W. Kasch,
              "Network Time Protocol Version 4: Protocol and Algorithms
              Specification", RFC 5905, June 2010.

   [RFC6733]  Fajardo, V., Arkko, J., Loughney, J., and G. Zorn,
              "Diameter Base Protocol", RFC 6733, October 2012.

11.2.  Informative References

   [Cx]       3GPP, "ETSI TS 129 229 V11.4.0", August 2013.

   [I-D.ietf-dime-e2e-sec-req]
              Tschofenig, H., Korhonen, J., Zorn, G., and K. Pillay,
              "Diameter AVP Level Security: Scenarios and
              Requirements", draft-ietf-dime-e2e-sec-req-00 (work in
              progress), September 2013.

   [PCC]      3GPP, "ETSI TS 123 203 V11.12.0", December 2013.

   [RFC4006]  Hakala, H., Mattila, L., Koskinen, J-P., Stura, M., and
              J. Loughney, "Diameter Credit-Control Application",
              RFC 4006, August 2005.
   [RFC5729]  Korhonen, J., Jones, M., Morand, L., and T. Tsou,
              "Clarifications on the Routing of Diameter Requests Based
              on the Username and the Realm", RFC 5729, December 2009.

   [RFC7068]  McMurry, E. and B. Campbell, "Diameter Overload Control
              Requirements", RFC 7068, November 2013.

   [S13]      3GPP, "ETSI TS 129 272 V11.9.0", December 2012.

Appendix A.  Issues left for future specifications

   The base solution for overload control does not cover all possible
   use cases. A number of solution aspects were intentionally left for
   future specification and protocol work. The following subsections
   describe some of the potential extensions to the DOIC solution.

A.1.  Additional traffic abatement algorithms

   This specification describes only a simple loss-based algorithm.
   Future algorithms can be added using the solution's extension
   mechanism. New algorithms need to be registered with IANA. See
   Sections 6.1 and 8 for the required IANA steps.

A.2.  Agent Overload

   This specification focuses on Diameter endpoint (server or client)
   overload. A separate extension will be required to describe the
   handling of agent overload.

A.3.  New Error Diagnostic AVP

   This specification indicates the use of existing error messages when
   nodes reject requests due to overload. The DIME working group is
   considering defining additional error codes or AVPs to indicate that
   overload was the reason for the rejection of the message.

Appendix B.  Deployment Considerations

   Non-Supporting Agents

      Due to the way that realm-routed requests are handled in Diameter
      networks with the server selection for the request done by an
      agent, network operators should enable DOIC at agents that
      perform server selection first.
   Topology Hiding Interactions

      There exist proxies that implement what is referred to as
      Topology Hiding. This can include cases where the agent modifies
      the Origin-Host in answer messages. The behavior of the DOIC
      solution is not well understood when this happens. As such, the
      DOIC solution does not address this scenario.

Appendix C.  Requirements Conformance Analysis

   This section contains the result of an analysis of the DOIC
   solution's conformance to the requirements defined in [RFC7068].

C.1.  Deferred Requirements

   The 3GPP has adopted an early version of this document as a
   normative reference in various Diameter-related specifications to
   support the overload control mechanism in their release 12
   framework. The DIME working group has therefore decided to defer
   certain requirements in order to complete the design of an
   extensible, generic solution before the deadline scheduled by the
   3GPP for the completion of the release 12 protocol work by the end
   of 2014. The deferred work includes the following:

   o  Agent Overload - The ability for an agent to report an overload
      condition of the agent itself.

   o  Load Information - The ability for a node to report its load
      level when not overloaded.

   At the time of this writing, DIME has begun separate work efforts
   for these requirements.

C.2.  Detection of non-supporting Intermediaries

   The DOIC mechanism as currently defined does not allow supporting
   nodes to automatically determine whether OC-Supported-Features or
   OC-OLR AVPs are originated by a peer node, or by a non-peer node and
   sent across a non-supporting peer. This makes it impossible to
   detect the presence of non-supporting nodes between supporting
   nodes, except by configuration. The working group determined that
   such a configuration requirement is acceptable.
   This limits full compliance with certain requirements related to the
   limitation of new configuration, deployment in environments with
   mixed support, operating across non-supporting agents, and
   authorization.

C.3.  Implicit Application Indication

   The working group elected to determine the application for an
   overload report from that of the enclosing message. This prevents
   sending an OLR for an application when there are no transactions for
   that application.

   As a consequence, DOIC does not comply with the requirement to be
   able to report overload information across quiescent connections.
   DOIC does not fully comply with requirements to operate on
   up-to-date information, since if an OLR causes all transactions to
   stop for an application, the only way traffic will resume is for the
   OLR to expire.

C.4.  Stateless Operation

   RFC7068 explicitly discourages the sending of OLRs in every answer
   message, as part of the requirement to avoid additional work for
   overloaded nodes. DOIC recommends exactly that behavior during
   active overload conditions. The working group determined that doing
   otherwise would reduce reliability and increase statefulness. (Note
   that DOIC does allow nodes to avoid sending OLRs in every answer if
   they have some other method of ensuring that OLRs get to all
   relevant reacting nodes.)

C.5.  No New Vulnerabilities

   The working group believes that DOIC is compliant with the
   requirement to avoid introducing new vulnerabilities. However, this
   requirement may warrant an early security expert review.

C.6.  Detailed Requirements

   [RFC Editor: Please remove this section and subsections prior to
   publication as an RFC.]

C.6.1.  General

   REQ 1:  The solution MUST provide a communication method for
           Diameter nodes to exchange load and overload information.

           *Partially Compliant*.
           The mechanism uses new AVPs piggybacked on existing Diameter
           messages to exchange overload information. It does not
           currently support "load" information or the ability to
           report overload of an agent. These have been left for
           future extensions.

   REQ 2:  The solution MUST allow Diameter nodes to support overload
           control regardless of which Diameter applications they
           support. Diameter clients and agents must be able to use
           the received load and overload information to support
           graceful behavior during an overload condition. Graceful
           behavior under overload conditions is best described by
           REQ 3.

           *Partially Compliant*. The DOIC AVPs can be used in any
           application that allows the extension of AVPs. However,
           "load" information is not currently supported.

   REQ 3:  The solution MUST limit the impact of overload on the
           overall useful throughput of a Diameter server, even when
           the incoming load on the network is far in excess of its
           capacity. The overall useful throughput under load is the
           ultimate measure of the value of a solution.

           *Compliant*. DOIC provides information that nodes can use
           to reduce the impact of overload.

   REQ 4:  Diameter allows requests to be sent from either side of a
           connection, and either side of a connection may have need to
           provide its overload status. The solution MUST allow each
           side of a connection to independently inform the other of
           its overload status.

           *Compliant*. DOIC AVPs can be included regardless of
           transaction "direction".

   REQ 5:  Diameter allows nodes to determine their peers via dynamic
           discovery or manual configuration. The solution MUST work
           consistently without regard to how peers are determined.

           *Compliant*. DOIC contains no assumptions about how peers
           are discovered.
   REQ 6:  The solution designers SHOULD seek to minimize the amount of
           new configuration required in order to work. For example,
           it is better to allow peers to advertise or negotiate
           support for the solution, rather than to require that this
           knowledge to be configured at each node.

           *Partially Compliant*. Most DOIC parameters are advertised
           using the DOIC capability announcement mechanism. However,
           there are some situations where configuration is required.
           For example, without configuration a DOIC node cannot detect
           that a peer does not support DOIC when nodes on the other
           side of that non-supporting peer do support DOIC.

C.6.2.  Performance

   REQ 7:  The solution and any associated default algorithm(s) MUST
           ensure that the system remains stable. At some point after
           an overload condition has ended, the solution MUST enable
           capacity to stabilize and become equal to what it would be
           in the absence of an overload condition. Note that this
           also requires that the solution MUST allow nodes to shed
           load without introducing non-converging oscillations during
           or after an overload condition.

           *Compliant*. The specification offers guidance that
           implementations should apply hysteresis when recovering from
           overload, and avoid sudden ramp ups in offered load when
           recovering.

   REQ 8:  Supporting nodes MUST be able to distinguish current
           overload information from stale information.

           *Partially Compliant*. DOIC overload reports are "soft
           state", that is, they expire after an indicated period.
           DOIC nodes may also send reports that end existing overload
           conditions. DOIC requires reporting nodes to ensure that
           all relevant reacting nodes receive overload reports.
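           The "soft state" behavior described above can be sketched as
           follows. This is a rough illustration rather than the
           normative processing rules: the OverloadState class and both
           functions are hypothetical names, and the sketch assumes
           that a larger OC-Sequence-Number identifies a newer report
           and that a report with a validity duration of zero ends the
           overload condition. Neither rule is stated in this
           appendix; both are taken as assumptions here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OverloadState:
    """State a reacting node keeps for one reporting node (hypothetical)."""
    sequence_number: int       # OC-Sequence-Number of the accepted report
    reduction_percentage: int  # OC-Reduction-Percentage of that report
    expires_at: float          # local time in seconds when the report lapses


def apply_olr(state: Optional[OverloadState], sequence_number: int,
              reduction_percentage: int, validity_seconds: int,
              now: float) -> Optional[OverloadState]:
    """Return the state to keep after receiving an overload report."""
    live = state is not None and now < state.expires_at
    # Assumption: a report that is not newer than live state is stale.
    if live and sequence_number <= state.sequence_number:
        return state
    # Assumption: a validity duration of zero ends the overload condition.
    if validity_seconds == 0:
        return None
    return OverloadState(sequence_number, reduction_percentage,
                         now + validity_seconds)


def current_reduction(state: Optional[OverloadState], now: float) -> int:
    """Percentage of traffic to abate now; 0 once the report has expired."""
    if state is None or now >= state.expires_at:
        return 0
    return state.reduction_percentage
```

           Because the state carries its own expiry time, a reacting
           node that receives no further reports simply resumes normal
           traffic once the validity period has passed.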
           However, since DOIC does not allow reporting nodes to send
           OLRs in watchdog messages, if an overload condition results
           in zero offered load, the reporting node cannot update the
           condition until the expiration of the original OLR.

   REQ 9:  The solution MUST function across fully loaded as well as
           quiescent transport connections. This is partially derived
           from the requirement for stability in REQ 7.

           *Not Compliant*. DOIC does not allow OLRs to be sent over
           quiescent transport connections. This is due to the fact
           that OLRs cannot be sent outside of the application to which
           they apply.

   REQ 10: Consumers of overload information MUST be able to determine
           when the overload condition improves or ends.

           *Partially Compliant*. (See response to previous two
           requirements.)

   REQ 11: The solution MUST be able to operate in networks of
           different sizes.

           *Compliant*. DOIC makes no assumptions about the size of
           the network. DOIC can operate purely between clients and
           servers, or across agents.

   REQ 12: When a single network node fails, goes into overload, or
           suffers from reduced processing capacity, the solution MUST
           make it possible to limit the impact of the affected node on
           other nodes in the network. This helps to prevent a
           small-scale failure from becoming a widespread outage.

           *Partially Compliant*. DOIC allows overload reports for an
           entire realm, where abated traffic will not be redirected
           towards another server. But in situations where nodes
           choose to divert traffic to other nodes, DOIC offers no way
           of knowing whether the new recipients can handle the traffic
           if they have not already indicated overload. This may be
           mitigated with the use of a future "load" extension, or with
           the use of proprietary dynamic load-balancing mechanisms.
   REQ 13: The solution MUST NOT introduce substantial additional work
           for a node in an overloaded state. For example, a
           requirement for an overloaded node to send overload
           information every time it received a new request would
           introduce substantial work.

           *Not Compliant*. DOIC does in fact encourage an overloaded
           node to send an OLR in every response. The working group
           determined that other mechanisms to ensure that every
           relevant node receives an OLR would create even more work.
           [Note: This needs discussion.]

   REQ 14: Some scenarios that result in overload involve a rapid
           increase of traffic with little time between normal levels
           and levels that induce overload. The solution SHOULD
           provide for rapid feedback when traffic levels increase.

           *Compliant*. The piggyback mechanism allows OLRs to be sent
           at the same rate as application traffic.

   REQ 15: The solution MUST NOT interfere with the congestion control
           mechanisms of underlying transport protocols. For example,
           a solution that opened additional TCP connections when the
           network is congested would reduce the effectiveness of the
           underlying congestion control mechanisms.

           *Compliant*. DOIC does not require or recommend changes in
           the handling of transport protocols or connections.

C.6.3.  Heterogeneous Support for Solution

   REQ 16: The solution is likely to be deployed incrementally. The
           solution MUST support a mixed environment where some, but
           not all, nodes implement it.

           *Partially Compliant*. DOIC works with most
           mixed-deployment scenarios. However, it cannot work across
           a non-supporting proxy that modifies Origin-Host AVPs in
           answer messages. DOIC will have limited impact in networks
           where the nodes that perform server selections do not
           support the mechanism.
   REQ 17: In a mixed environment with nodes that support the solution
           and nodes that do not, the solution MUST NOT result in
           materially less useful throughput during overload as would
           have resulted if the solution were not present. It SHOULD
           result in less severe overload in this environment.

           *Compliant*. In most mixed-support deployments, DOIC will
           offer at least some value, and will not make things worse.

   REQ 18: In a mixed environment of nodes that support the solution
           and nodes that do not, the solution MUST NOT preclude
           elements that support overload control from treating
           elements that do not support overload control in an
           equitable fashion relative to those that do. Users and
           operators of nodes that do not support the solution MUST NOT
           unfairly benefit from the solution. The solution
           specification SHOULD provide guidance to implementers for
           dealing with elements not supporting overload control.

           *Compliant*. DOIC provides mechanisms to abate load from
           non-supporting sources. Furthermore, it recommends that
           reporting nodes remain able to apply whatever protections
           they would ordinarily apply if DOIC were not in use.

   REQ 19: It MUST be possible to use the solution between nodes in
           different realms and in different administrative domains.

           *Partially Compliant*. DOIC allows sending OLRs across
           administrative domains, and potentially to nodes in other
           realms. However, an OLR cannot indicate overload for realms
           other than the one in the Origin-Realm AVP of the containing
           answer.

   REQ 20: Any explicit overload indication MUST be clearly
           distinguishable from other errors reported via Diameter.

           *Compliant*. DOIC sends explicit overload indication in
           overload reports. It does not depend on error result codes.
   REQ 21: In cases where a network node fails, is so overloaded that
           it cannot process messages, or cannot communicate due to a
           network failure, it may not be able to provide explicit
           indications of the nature of the failure or its levels of
           overload. The solution MUST result in at least as much
           useful throughput as would have resulted if the solution
           were not in place.

           *Compliant*. DOIC overload reports have the primary effect
           of suppressing message retries in overload conditions. DOIC
           recommends that messages never be silently dropped if at all
           possible.

C.6.4.  Granular Control

   REQ 22: The solution MUST provide a way for a node to throttle the
           amount of traffic it receives from a peer node. This
           throttling SHOULD be graded so that it can be applied
           gradually as offered load increases. Overload is not a
           binary state; there may be degrees of overload.

           *Compliant*. The "loss" algorithm expresses a percentage
           reduction.

   REQ 23: The solution MUST provide sufficient information to enable a
           load-balancing node to divert messages that are rejected or
           otherwise throttled by an overloaded upstream node to other
           upstream nodes that are the most likely to have sufficient
           capacity to process them.

           *Not Compliant*. DOIC provides no built-in mechanism to
           determine the best place to divert messages that would
           otherwise be throttled. This can be accomplished with a
           future "load" extension, or with proprietary load-balancing
           mechanisms.

   REQ 24: The solution MUST provide a mechanism for indicating load
           levels, even when not in an overload condition, to assist
           nodes in making decisions to prevent overload conditions
           from occurring.

           *Not Compliant*. "Load" information has been left for a
           future extension.

C.6.5.
Priority and Policy

   REQ 25: The base specification for the solution SHOULD offer general
           guidance on which message types might be desirable to send
           or process over others during times of overload, based on
           application-specific considerations. For example, it may be
           more beneficial to process messages for existing sessions
           ahead of new sessions. Some networks may have a requirement
           to give priority to requests associated with emergency
           sessions. Any normative or otherwise detailed definition of
           the relative priorities of message types during an overload
           condition will be the responsibility of the application
           specification.

           *Compliant*. The specification offers guidance on how
           requests might be prioritized for different types of
           applications.

   REQ 26: The solution MUST NOT prevent a node from prioritizing
           requests based on any local policy, so that certain requests
           are given preferential treatment, given additional
           retransmission, not throttled, or processed ahead of others.

           *Compliant*. Nothing in the specification prevents
           application-specific, implementation-specific, or local
           policies.

C.6.6.  Security

   REQ 27: The solution MUST NOT provide new vulnerabilities to
           malicious attack or increase the severity of any existing
           vulnerabilities. This includes vulnerabilities to DoS and
           DDoS attacks as well as replay and man-in-the-middle
           attacks. Note that the Diameter base specification
           [RFC6733] lacks end-to-end security and this must be
           considered (see the Security Considerations in [RFC7068]).
           Note that this requirement was expressed at a high level so
           as to not preclude any particular solution. It is expected
           that the solution will address this in more detail.

           *Compliant*. The working group is not aware of any such
           vulnerabilities. [This may need further analysis.]

   REQ 28: The solution MUST NOT depend on being deployed in
           environments where all Diameter nodes are completely trusted.
           It SHOULD operate as effectively as possible in environments
           where other nodes are malicious; this includes preventing
           malicious nodes from obtaining more than a fair share of
           service. Note that this does not imply any responsibility on
           the solution to detect, or take countermeasures against,
           malicious nodes.

           *Partially Compliant*. Since all Diameter security is
           currently at the transport layer, nodes must trust immediate
           peers to enforce trust policies. However, there are
           situations where a DOIC node cannot determine if an immediate
           peer supports DOIC. The authors recommend an expert security
           review.

   REQ 29: It MUST be possible for a supporting node to make
           authorization decisions about what information will be sent
           to peer nodes based on the identity of those nodes. This
           allows a domain administrator who considers the load of their
           nodes to be sensitive information to restrict access to that
           information. Of course, in such cases, there is no
           expectation that the solution itself will help prevent
           overload from that peer node.

           *Partially Compliant*. (See response to previous
           requirement.)

   REQ 30: The solution MUST NOT interfere with any Diameter-compliant
           method that a node may use to protect itself from overload
           from non-supporting nodes or from denial-of-service attacks.

           *Compliant*. The specification recommends that any such
           protection mechanism needed without DOIC should continue to
           be employed with DOIC.

C.6.7. Flexibility and Extensibility

   REQ 31: There are multiple situations where a Diameter node may be
           overloaded for some purposes but not others.
           For example, this can happen to an agent or server that
           supports multiple applications, or when a server depends on
           multiple external resources, some of which may become
           overloaded while others are fully available. The solution
           MUST allow Diameter nodes to indicate overload with
           sufficient granularity to allow clients to take action based
           on the overloaded resources without unreasonably forcing
           available capacity to go unused. The solution MUST support
           specification of overload information with granularities of
           at least "Diameter node", "realm", and "Diameter application"
           and MUST allow extensibility for others to be added in the
           future.

           *Partially Compliant*. All DOIC overload reports are scoped
           to the specific application and realm. Inside that scope,
           overload can be reported at the specific server or whole
           realm scope. As currently specified, DOIC cannot indicate
           local overload for an agent. At the time of this writing,
           the DIME working group has plans to work on an agent-overload
           extension.

           DOIC allows new "scopes" through the use of extended report
           types.

   REQ 32: The solution MUST provide a method for extending the
           information communicated and the algorithms used for overload
           control.

           *Compliant*. DOIC allows new report types and abatement
           algorithms to be created. These may be indicated using the
           OC-Supported-Features AVP.

   REQ 33: The solution MUST provide a default algorithm that is
           mandatory to implement.

           *Compliant*. The "loss" algorithm is mandatory to implement.

   REQ 34: The solution SHOULD provide a method for exchanging overload
           and load information between elements that are connected by
           intermediaries that do not support the solution.

           *Partially Compliant*. DOIC information can traverse
           non-supporting agents, as long as those agents do not modify
           certain AVPs (e.g., Origin-Host). DOIC does not provide a
           way for supporting nodes to detect such modification.

Appendix D. Considerations for Applications Integrating the DOIC
            Solution

   This section outlines considerations to be taken into account when
   integrating the DOIC solution into Diameter applications.

D.1. Application Classification

   The following is a classification of Diameter applications and
   request types. This discussion is meant to document factors that
   play into decisions made by the Diameter entity responsible for
   handling overload reports.

   Section 8.1 of [RFC6733] defines two state machines that imply two
   types of applications: session-less and session-based applications.
   The primary difference between these types of applications is the
   lifetime of Session-Ids.

   For session-based applications, the Session-Id is used to tie
   multiple requests into a single session.

   The Credit-Control application defined in [RFC4006] is an example of
   a Diameter session-based application.

   In session-less applications, the lifetime of the Session-Id is a
   single Diameter transaction, i.e., the session is implicitly
   terminated after a single Diameter transaction and a new Session-Id
   is generated for each Diameter request.

   For the purposes of this discussion, session-less applications are
   further divided into two types of applications:

   Stateless Applications:

      Requests within a stateless application have no relationship to
      each other. The 3GPP-defined S13 application is an example of a
      stateless application [S13], where only a Diameter command is
      defined between a client and a server and no state is maintained
      between two consecutive transactions.

   Pseudo-Session Applications:

      Applications that do not rely on the Session-Id AVP for
      correlation of application messages related to the same session
      but use other session-related information in the Diameter requests
      for this purpose. The 3GPP-defined Cx application [Cx] is an
      example of a pseudo-session application.

   The handling of overload reports must take the type of application
   into consideration, as discussed in Appendix D.2.

D.2. Application Type Overload Implications

   This section discusses considerations for mitigating overload
   reported by a Diameter entity. This discussion focuses on the type
   of application. Appendix D.3 discusses considerations for handling
   various request types when the target server is known to be in an
   overloaded state.

   These discussions assume that the strategy for mitigating the
   reported overload is to reduce the overall workload sent to the
   overloaded entity. The concept of applying overload treatment to
   requests targeted for an overloaded Diameter entity is inherent to
   this discussion. The method used to reduce offered load is not
   specified here but could include routing requests to another Diameter
   entity known to be able to handle them, or it could mean rejecting
   certain requests. For a Diameter agent, rejecting requests will
   usually mean generating appropriate Diameter error responses. For a
   Diameter client, rejecting requests will depend upon the application.
   For example, it could mean giving an indication to the entity
   requesting the Diameter service that the network is busy and to try
   again later.

   Stateless Applications:

      By definition, there is no relationship between individual
      requests in a stateless application.
      As a result, when a request is sent or relayed to an overloaded
      Diameter entity (either a Diameter Server or a Diameter Agent),
      the sending or relaying entity can choose to apply the overload
      treatment to any request targeted for the overloaded entity.

   Pseudo-Session Applications:

      For pseudo-session applications, there is an implied ordering of
      requests. As a result, decisions about which requests towards an
      overloaded entity to reject could take the command code of the
      request into consideration. This generally means that
      transactions later in the sequence of transactions should be given
      more favorable treatment than messages earlier in the sequence.
      This is because more work has already been done by the Diameter
      network for those transactions that occur later in the sequence.
      Rejecting them could result in increasing the load on the network
      as the transactions earlier in the sequence might also need to be
      repeated.

   Session-Based Applications:

      Overload handling for session-based applications must take into
      consideration the workload associated with setting up and
      maintaining a session. As such, the entity sending requests
      towards an overloaded Diameter entity for a session-based
      application might tend to reject new session requests prior to
      rejecting intra-session requests. In addition, session-ending
      requests might be given a lower probability of being rejected, as
      rejecting session-ending requests could result in session status
      being out of sync between the Diameter clients and servers.
      Application designers who decide to reject mid-session requests
      will need to consider whether the rejection invalidates the
      session and any resulting session cleanup procedures.

D.3. Request Transaction Classification

   Independent Request:

      An independent request is not correlated to any other requests
      and, as such, the lifetime of the Session-Id is constrained to an
      individual transaction.

   Session-Initiating Request:

      A session-initiating request is the initial message that
      establishes a Diameter session. The ACR message defined in
      [RFC6733] is an example of a session-initiating request.

   Correlated Session-Initiating Request:

      There are cases when multiple session-initiating requests must be
      correlated and managed by the same Diameter server. This is
      notably the case in the 3GPP PCC architecture [PCC], where
      multiple apparently independent Diameter application sessions are
      actually correlated and must be handled by the same Diameter
      server.

   Intra-Session Request:

      An intra-session request is a request that uses the same
      Session-Id as the one used in a previous request. An intra-session
      request generally needs to be delivered to the server that handled
      the session-creating request for the session. The STR message
      defined in [RFC6733] is an example of an intra-session request.

   Pseudo-Session Requests:

      Pseudo-session requests are independent requests and do not use
      the same Session-Id but are correlated by other session-related
      information contained in the request. There exist Diameter
      applications that define an expected ordering of transactions.
      This sequencing of independent transactions results in a pseudo
      session. The AIR, MAR and SAR requests in the 3GPP-defined Cx
      [Cx] application are examples of pseudo-session requests.

D.4. Request Type Overload Implications

   The request classes identified in Appendix D.3 have implications on
   decisions about which requests should be throttled first.
   The following list of request treatments regarding throttling is
   provided as a guideline for application designers when implementing
   the Diameter overload control mechanism described in this document.
   The exact behavior regarding throttling is a matter of local policy,
   unless specifically defined for the application.

   Independent Requests:

      Independent requests can generally be given equal treatment when
      making throttling decisions, unless otherwise indicated by
      application requirements or local policy.

   Session-Initiating Requests:

      Session-initiating requests often represent more work than
      independent or intra-session requests. Moreover,
      session-initiating requests are typically followed by other
      session-related requests. Since the main objective of the
      overload control is to reduce the total number of requests sent
      to the overloaded entity, throttling decisions might favor
      allowing intra-session requests over session-initiating requests.
      In the absence of local policies or application-specific
      requirements to the contrary, individual session-initiating
      requests can be given equal treatment when making throttling
      decisions.

   Correlated Session-Initiating Requests:

      A request that results in a new binding, where the binding is used
      for routing of subsequent session-initiating requests to the same
      server, represents more workload than other requests. As such,
      these requests might be throttled more frequently than other
      request types.

   Pseudo-Session Requests:

      Throttling decisions for pseudo-session requests can take into
      consideration where individual requests fit into the overall
      sequence of requests within the pseudo session. Requests that are
      earlier in the sequence might be throttled more aggressively than
      requests that occur later in the sequence.

   Intra-Session Requests:

      There are two types of intra-session requests: requests that
      terminate a session and the remainder of intra-session requests.
      Implementers and operators may choose to throttle
      session-terminating requests less aggressively in order to
      gracefully terminate sessions, allow cleanup of the related
      resources (e.g., session state), and avoid the need for additional
      intra-session requests. Favoring session-termination requests may
      reduce the session management impact on the overloaded entity.
      The default handling of other intra-session requests might be to
      treat them equally when making throttling decisions. There might
      also be application-level considerations as to whether some
      request types are favored over others.

Authors' Addresses

   Jouni Korhonen (editor)
   Broadcom
   Porkkalankatu 24
   Helsinki FIN-00180
   Finland

   Email: jouni.nospam@gmail.com


   Steve Donovan (editor)
   Oracle
   7460 Warren Parkway
   Frisco, Texas 75034
   United States

   Email: srdonovan@usdonovans.com


   Ben Campbell
   Oracle
   7460 Warren Parkway
   Frisco, Texas 75034
   United States

   Email: ben@nostrum.com


   Lionel Morand
   Orange Labs
   38/40 rue du General Leclerc
   Issy-Les-Moulineaux Cedex 9 92794
   France

   Phone: +33145296257
   Email: lionel.morand@orange.com