Diameter Maintenance and Extensions                     J. Korhonen, Ed.
(DIME)                                                          Broadcom
Internet-Draft                                                S. Donovan
Intended status: Standards Track                             B. Campbell
Expires: June 20, 2014                                            Oracle
                                                               L. Morand
                                                             Orange Labs
                                                       December 17, 2013


                Diameter Overload Indication Conveyance
                      draft-ietf-dime-ovli-01.txt

Abstract

   This specification documents a Diameter Overload Control (DOC) base
   solution and the dissemination of the overload report information.

Requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts.
   The list of current Internet-Drafts is at
   http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 20, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology and Abbreviations
   3.  Solution Overview
     3.1.  Architectural Assumptions
       3.1.1.  Application Classification
       3.1.2.  Application Type Overload Implications
       3.1.3.  Request Transaction Classification
       3.1.4.  Request Type Overload Implications
       3.1.5.  Diameter Agent Behaviour
       3.1.6.  Simplified Example Architecture
     3.2.  Conveyance of the Overload Indication
       3.2.1.  DOIC Capability Discovery
     3.3.  Overload Condition Indication
   4.  Attribute Value Pairs
     4.1.  OC-Supported-Features AVP
     4.2.  OC-Feature-Vector AVP
     4.3.  OC-OLR AVP
     4.4.  OC-Sequence-Number AVP
     4.5.  OC-Validity-Duration AVP
     4.6.  OC-Report-Type AVP
     4.7.  OC-Reduction-Percentage AVP
     4.8.  Attribute Value Pair flag rules
   5.  Overload Control Operation
     5.1.  Overload Control Endpoints
     5.2.  Piggybacking Principle
     5.3.  Capability Announcement
       5.3.1.  Reacting Node Endpoint Considerations
       5.3.2.  Reporting Node Endpoint Considerations
     5.4.  Protocol Extensibility
     5.5.  Overload Report Processing
       5.5.1.  Overload Control State
       5.5.2.  Reacting Node Considerations
       5.5.3.  Reporting Node Considerations
   6.  Transport Considerations
   7.  IANA Considerations
     7.1.  AVP codes
     7.2.  New registries
   8.  Security Considerations
     8.1.  Potential Threat Modes
     8.2.  Denial of Service Attacks
     8.3.  Non-Compliant Nodes
     8.4.  End-to-End Security Issues
   9.  Contributors
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Appendix A.  Issues left for future specifications
     A.1.  Additional traffic abatement algorithms
     A.2.  Agent Overload
     A.3.  DIAMETER_TOO_BUSY clarifications
   Appendix B.  Examples
     B.1.  Mix of Destination-Realm routed requests and
           Destination-Host routed requests
   Authors' Addresses

1. Introduction

   This specification defines a base solution for Diameter Overload
   Control (DOC). The requirements for the solution are described and
   discussed in the corresponding design requirements document
   [RFC7068]. Note that the overload control solution defined in this
   specification does not address all the requirements listed in
   [RFC7068]. A number of overload control related features are left
   for future specifications.

   The solution defined in this specification addresses Diameter
   overload control between two endpoints (see Section 5.1).
   Furthermore, the solution is designed to apply to existing and future
   Diameter applications, requires no changes to the Diameter base
   protocol [RFC6733], and is deployable in environments where some
   Diameter nodes do not implement the Diameter overload control
   solution defined in this specification.

2. Terminology and Abbreviations

   Server Farm

      A set of Diameter servers that can handle any request for a given
      set of Diameter applications.
      While these servers support the same set of applications, they do
      not necessarily all have the same capacity. An individual server
      farm might also support a subset of the users for a Diameter
      Realm. A server farm may host a single realm or multiple realms.

   Diameter Routing:

      Diameter Routing between non-adjacent nodes relies on the
      Destination-Realm AVP to determine the Diameter realm in which the
      request needs to be processed. A Destination-Host AVP may also be
      present in the request to address a specific server inside the
      Diameter realm. This function is defined in [RFC6733]. However,
      it is possible to enhance the routing decisions with application
      level knowledge, as is done in 3GPP PCC [3GPP.23.203] and NAI-based
      source routing [RFC5729].

   Diameter layer Load Balancing:

      Diameter layer load balancing allows Diameter requests to be
      distributed across the set of servers. Definition of this
      function is outside the scope of this document.

   Topology Hiding:

      Topology Hiding is loosely defined as ensuring that no Diameter
      topology information about a Diameter network can be discovered
      from Diameter messages sent outside a predefined boundary
      (typically an administrative domain). This includes obfuscating
      identifiers and address information of Diameter entities in the
      Diameter network. It can also include hiding the number of
      various Diameter entities in the Diameter network. Identifying
      information can occur in many Diameter Attribute-Value Pairs
      (AVPs), including Origin-Host, Destination-Host, Route-Record,
      Proxy-Info, Session-ID and other AVPs.

   Throttling:

      Throttling is the reduction of the number of requests sent to an
      entity. Throttling can include a client dropping requests, or an
      agent rejecting requests with appropriate error responses.
      Clients and agents can also choose to redirect throttled requests
      to some other entity or entities capable of handling them.

   Reporting Node

      A Diameter node that generates an overload report. (This may or
      may not be the node that is actually overloaded.)

   Reacting Node

      A Diameter node that consumes and acts upon a report. Note that
      "act upon" does not necessarily mean the reacting node applies an
      abatement algorithm; it might decide to delegate that downstream,
      in which case it also becomes a "reporting node".

   OLR

      Overload Report.

3. Solution Overview

3.1. Architectural Assumptions

   This section describes the high-level architectural and semantic
   assumptions that underlie the Diameter Overload Control Mechanism.

3.1.1. Application Classification

   The following is a classification of Diameter applications and
   requests. This discussion is meant to document factors that play
   into decisions made by the Diameter identity responsible for handling
   overload reports.

   Section 8.1 of [RFC6733] defines two state machines that imply two
   types of applications, session-less and session-based applications.
   The primary difference between these types of applications is the
   lifetime of Session-Ids.

   For session-based applications, the Session-Id is used to tie
   multiple requests into a single session.

   In session-less applications, the lifetime of the Session-Id is a
   single Diameter transaction, i.e. the session is implicitly
   terminated after a single Diameter transaction and a new Session-Id
   is generated for each Diameter request.

   For the purposes of this discussion, session-less applications are
   further divided into two types of applications:

   Stateless applications:

      Requests within a stateless application have no relationship to
      each other.
      The 3GPP defined S13 application is an example of a
      stateless application [3GPP.29.272], where only a Diameter command
      is defined between a client and a server and no state is
      maintained between two consecutive transactions.

   Pseudo-session applications:

      Applications that do not rely on the Session-Id AVP for
      correlation of application messages related to the same session
      but use other session-related information in the Diameter requests
      for this purpose. The 3GPP defined Cx application [3GPP.29.229]
      is an example of a pseudo-session application.

   The Credit-Control application defined in [RFC4006] is an example of
   a Diameter session-based application.

   The handling of overload reports must take the type of application
   into consideration, as discussed in Section 3.1.2.

3.1.2. Application Type Overload Implications

   This section discusses considerations for mitigating overload
   reported by a Diameter entity. This discussion focuses on the type
   of application. Section 3.1.3 discusses considerations for handling
   various request types when the target server is known to be in an
   overloaded state.

   These discussions assume that the strategy for mitigating the
   reported overload is to reduce the overall workload sent to the
   overloaded entity. The concept of applying overload treatment to
   requests targeted for an overloaded Diameter entity is inherent to
   this discussion. The method used to reduce offered load is not
   specified here but could include routing requests to another Diameter
   entity known to be able to handle them, or it could mean rejecting
   certain requests. For a Diameter agent, rejecting requests will
   usually mean generating appropriate Diameter error responses. For a
   Diameter client, rejecting requests will depend upon the application.
   For example, it could mean giving an indication to the entity
   requesting the Diameter service that the network is busy and to try
   again later.

   Stateless applications:

      By definition there is no relationship between individual requests
      in a stateless application. As a result, when a request is sent
      or relayed to an overloaded Diameter entity - either a Diameter
      Server or a Diameter Agent - the sending or relaying entity can
      choose to apply the overload treatment to any request targeted for
      the overloaded entity.

   Pseudo-session applications:

      For pseudo-session applications, there is an implied ordering of
      requests. As a result, decisions about which requests towards an
      overloaded entity to reject could take the command code of the
      request into consideration. This generally means that
      transactions later in the sequence of transactions should be given
      more favorable treatment than transactions earlier in the
      sequence. This is because more work has already been done by the
      Diameter network for those transactions that occur later in the
      sequence. Rejecting them could result in increasing the load on
      the network, as the transactions earlier in the sequence might
      also need to be repeated.

   Session-based applications:

      Overload handling for session-based applications must take into
      consideration the work load associated with setting up and
      maintaining a session. As such, the entity sending requests
      towards an overloaded Diameter entity for a session-based
      application might tend to reject new session requests prior to
      rejecting intra-session requests. In addition, session ending
      requests might be given a lower probability of being rejected, as
      rejecting session ending requests could result in session status
      being out of sync between the Diameter clients and servers.

      Application designers who decide to reject mid-session requests
      will need to consider whether the rejection invalidates the
      session, and any resulting session clean-up procedures.

3.1.3. Request Transaction Classification

   Independent Request:

      An independent request is not correlated to any other requests
      and, as such, the lifetime of the Session-Id is constrained to an
      individual transaction.

   Session-Initiating Request:

      A session-initiating request is the initial message that
      establishes a Diameter session. The ACR message defined in
      [RFC6733] is an example of a session-initiating request.

   Correlated Session-Initiating Request:

      There are cases when multiple session-initiating requests must be
      correlated and managed by the same Diameter server. This is
      notably the case in the 3GPP PCC architecture [3GPP.23.203], where
      multiple apparently independent Diameter application sessions are
      actually correlated and must be handled by the same Diameter
      server.

   Intra-Session Request:

      An intra-session request is a request that uses the same
      Session-Id as the one used in a previous request. An
      intra-session request generally needs to be delivered to the
      server that handled the session-creating request for the session.
      The STR message defined in [RFC6733] is an example of an
      intra-session request.

   Pseudo-Session Requests:

      Pseudo-session requests are independent requests and do not use
      the same Session-Id but are correlated by other session-related
      information contained in the request. There exist Diameter
      applications that define an expected ordering of transactions.
      This sequencing of independent transactions results in a pseudo
      session. The AIR, MAR and SAR requests in the 3GPP defined Cx
      application are examples of pseudo-session requests.

3.1.4. Request Type Overload Implications

   The request classes identified in Section 3.1.3 have implications on
   decisions about which requests should be throttled first. The
   following list of request treatment regarding throttling is provided
   as guidelines for application designers when implementing the
   Diameter overload control mechanism described in this document.
   Exact behavior regarding throttling must be defined per application.

   Independent requests:

      Independent requests can be given equal treatment when making
      throttling decisions.

   Session-initiating requests:

      Session-initiating requests represent more work than independent
      or intra-session requests. Moreover, session-initiating requests
      are typically followed by other session-related requests. As
      such, as the main objective of the overload control is to reduce
      the total number of requests sent to the overloaded entity,
      throttling decisions might favor allowing intra-session requests
      over session-initiating requests. Individual session-initiating
      requests can be given equal treatment when making throttling
      decisions.

   Correlated session-initiating requests:

      A request that results in a new binding, where the binding is used
      for routing of subsequent session-initiating requests to the same
      server, represents more work load than other requests. As such,
      these requests might be throttled more frequently than other
      request types.

   Pseudo-session requests:

      Throttling decisions for pseudo-session requests can take into
      consideration where individual requests fit into the overall
      sequence of requests within the pseudo session. Requests that are
      earlier in the sequence might be throttled more aggressively than
      requests that occur later in the sequence.

   Intra-session requests:

      There are two classes of intra-session requests. The first class
      consists of requests that terminate a session.
      The second one contains the set of requests that are used by the
      Diameter client and server to maintain the ongoing session state.
      Session terminating requests should be throttled less aggressively
      in order to gracefully terminate sessions, allow clean-up of the
      related resources (e.g. session state) and remove the need for
      other intra-session requests, reducing the session management
      impact on the overloaded entity. The default handling of other
      intra-session requests might be to treat them equally when making
      throttling decisions. There might also be application level
      considerations as to whether some request types are favored over
      others.

3.1.5. Diameter Agent Behaviour

   In the context of the Diameter Overload Indication Conveyance (DOIC)
   and reacting to the overload information, the functional behaviour of
   Diameter agents in front of servers, especially Diameter proxies,
   needs to be common. This is important because agents may actively
   participate in the handling of overload conditions. For example,
   they may make intelligent next hop selection decisions based on
   overload conditions, or aggregate overload information to be
   disseminated downstream. Diameter agents may have other deployment
   related tasks that are not defined in the Diameter base protocol
   [RFC6733]. These include, among other tasks, topology hiding and
   acting as a Server Front End (SFE) for a farm of Diameter servers.

   Since the solution defined in this specification must not break the
   Diameter base protocol [RFC6733] at any time, great care has to be
   taken not to assume functionality from the Diameter agents that would
   break base protocol behavior, or to assume agent functionality beyond
   the Diameter base protocol.
   Effectively this means the following from a Diameter agent:

   o  If a Diameter agent presents itself as the "end node", such as an
      agent acting as a topology hiding SFE, the agent is the final
      destination of requests initiated by Diameter clients and the
      original source of the corresponding answers and of
      server-initiated requests. As a consequence, the DOIC mechanism
      MUST NOT leak information about the Diameter nodes behind it.
      This requirement means that such a Diameter agent acts as a
      back-to-back-agent for DOIC purposes. How the Diameter agent in
      this case appears to the Diameter servers in the farm is specific
      to the implementation and deployment within the realm in which the
      Diameter agent is deployed.

   o  If the Diameter agent does not impersonate the servers behind it,
      the Diameter dialogue is established between clients and servers
      and any overload information received by a client would be from
      the server identified by the Origin-Host identity contained in the
      Diameter message.

3.1.6. Simplified Example Architecture

   Figure 1 illustrates the simplified architecture for Diameter
   overload information conveyance. See Section 5.1 for more discussion
   and details of how different Diameter nodes fit into the architecture
   from the DOIC point of view.

    Realm X                                  Same or other Realms
   <--------------------------------------> <---------------------->


   +--^-----+                 : (optional) :
   |Diameter|                 :            :
   |Server A|--+     .--.     : +---^----+ :     .--.
   +--------+  |   _(    `.   : |Diameter| :   _(    `.   +---^----+
               +--(        )--:-|  Agent |-:--(        )--|Diameter|
   +--------+  | ( `  .  )  ) : +-----^--+ : ( `  .  )  ) | Client |
   |Diameter|--+  `--(___.-'  :            :  `--(___.-'  +-----^--+
   |Server B|                 :            :
   +---^----+                 :            :

                     End-to-end Overload Indication
          1)  <----------------------------------------------->
                          Diameter Application Y

              Overload Indication A    Overload Indication A'
          2)  <----------------------> <---------------------->
              standard base protocol   standard base protocol

    Figure 1: Simplified architecture choices for overload indication
                                 delivery

   In Figure 1, the Diameter overload indication can be conveyed (1)
   end-to-end between servers and clients or (2) between servers and the
   Diameter agent inside the realm and then between the Diameter agent
   and the clients, when the Diameter agent is acting as a
   back-to-back-agent for DOIC purposes.

3.2. Conveyance of the Overload Indication

   The following sections describe new Diameter AVPs used for sending
   overload reports, and for declaring support for certain DOIC
   features.

3.2.1. DOIC Capability Discovery

   Support of DOIC may be specified as part of the functionality
   supported by a new Diameter application. In this way, support of the
   considered Diameter application (discovered during the capabilities
   exchange phase as defined in the Diameter base protocol [RFC6733])
   indicates implicit support of the DOIC mechanism.

   When the DOIC mechanism is introduced in existing Diameter
   applications, a specific capability discovery mechanism is required.
   The "DOIC capability discovery mechanism" is based on the presence of
   specific optional AVPs in the Diameter messages, such as the
   OC-Supported-Features AVP (see Section 4.1). Although the
   OC-Supported-Features AVP can be used to advertise a certain set of
   new or existing Diameter overload control capabilities, it is not a
   versioning solution per se; however, it can be used to achieve the
   same result.
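
   As an illustration only, capability discovery can be sketched as a
   bitwise intersection of the feature flags announced by the two
   endpoints. The sketch below assumes the rules given in Sections 4.1
   and 4.2: the OLR_DEFAULT_ALGO flag value, the reserved value zero,
   and the rule that a missing OC-Feature-Vector AVP means "default
   algorithm only". The function names are hypothetical and are not
   defined by this specification.

```python
# Flag value taken from Section 4.2 of this document.
OLR_DEFAULT_ALGO = 0x0000000000000001


def effective_features(feature_vector):
    """Return the feature flags implied by an optional OC-Feature-Vector.

    None models an OC-Supported-Features AVP without an OC-Feature-Vector
    sub-AVP, which Section 4.1 treats as "default algorithm only".
    """
    if feature_vector is None:
        return OLR_DEFAULT_ALGO
    # OC-Feature-Vector is an Unsigned64, so keep only the low 64 bits.
    return feature_vector & 0xFFFFFFFFFFFFFFFF


def common_features(reacting_vector, reporting_vector):
    """Return the flags that both endpoints announced."""
    return effective_features(reacting_vector) & effective_features(reporting_vector)


def doic_usable(reacting_vector, reporting_vector):
    """True when at least one announced capability matches mutually.

    The reserved value zero never yields a common capability.
    """
    return common_features(reacting_vector, reporting_vector) != 0
```

   When doic_usable() returns False, the sketch corresponds to the case
   in Section 4.1 where no capability matches and the node behaves as
   if it did not implement DOIC towards that peer.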

   From the Diameter overload control functionality point of view, the
   "Reacting node" is the requester of the overload report information
   and the "Reporting node" is the provider of the overload report. The
   OC-Supported-Features AVP in the request message is always
   interpreted as an announcement of "DOIC supported capabilities". The
   OC-Supported-Features AVP in the answer is also interpreted as a
   report of "DOIC supported capabilities", and at least one of the
   supported capabilities MUST be common with the "Reacting node" (see
   Section 4.1).

3.3. Overload Condition Indication

   Diameter nodes can request a reduction in offered load by indicating
   an overload condition in the form of an overload report. The
   overload report contains information about how much load should be
   reduced, and may contain other information about the overload
   condition. This information is conveyed in Diameter Attribute Value
   Pairs (AVPs).

   Certain new AVPs may also be used to declare certain DOIC
   capabilities and extensions.

4. Attribute Value Pairs

   This section describes the encoding and semantics of the Diameter
   Overload Indication Attribute Value Pairs (AVPs) defined in this
   document.

4.1. OC-Supported-Features AVP

   The OC-Supported-Features AVP (AVP code TBD1) is of type Grouped and
   serves two purposes. First, it announces a node's support for DOIC
   in general. Second, it contains the description of the supported
   DOIC features of the sending node. The OC-Supported-Features AVP
   SHOULD be included in every Diameter message a DOIC supporting node
   sends (and intends to use for DOIC purposes).

      OC-Supported-Features ::= < AVP Header: TBD1 >
                                < OC-Sequence-Number >
                                [ OC-Feature-Vector ]
                              * [ AVP ]

   The OC-Sequence-Number AVP is used to indicate whether the contents
   of the OC-Supported-Features AVP have changed since the last time the
   node included the OC-Supported-Features AVP (see Section 4.4).
   Although sending the OC-Sequence-Number AVP is mandatory in the
   OC-Supported-Features AVP, the receiving node MAY always choose to
   ignore the sequence number if it can determine the feature support
   changes otherwise.

   The OC-Feature-Vector sub-AVP is used to announce the DOIC features
   supported by the endpoint, in the form of a flag bits field in which
   each bit announces one feature or capability supported by the node
   (see Section 4.2). The absence of the OC-Feature-Vector AVP
   indicates that only the default traffic abatement algorithm described
   in this specification is supported.

   A reacting node includes this AVP to indicate its capabilities to a
   reporting node. For example, the endpoint (reacting node) may
   indicate which (future defined) traffic abatement algorithms it
   supports in addition to the default.

   During the message exchange the overload control endpoints express
   their common set of supported capabilities. The reacting node
   includes the OC-Supported-Features AVP that announces what it
   supports. The reporting node that sends the answer also includes the
   OC-Supported-Features AVP that describes the capabilities it
   supports. The set of capabilities advertised by the reporting node
   depends on local policies. At least one of the announced
   capabilities MUST match mutually. If there is no single matching
   capability, the reacting node MUST act as if it does not implement
   DOIC and cease inserting any DOIC related AVPs into any Diameter
   messages with this specific reporting node.

4.2. OC-Feature-Vector AVP

   The OC-Feature-Vector AVP (AVP code TBD6) is of type Unsigned64 and
   contains a 64 bit flags field of announced capabilities of an
   overload control endpoint. The value of zero (0) is reserved.

   The following capabilities are defined in this document:

   OLR_DEFAULT_ALGO (0x0000000000000001)

      When this flag is set by the overload control endpoint it means
      that the default traffic abatement (loss) algorithm is supported.

4.3. OC-OLR AVP

   The OC-OLR AVP (AVP code TBD2) is of type Grouped and contains the
   necessary information to convey an overload report. The OC-OLR AVP
   does not contain explicit information about which application it
   applies to, who inserted the AVP, or whom the specific OC-OLR AVP
   concerns. This information is implicitly learned from the
   encapsulating Diameter message/command. The application the OC-OLR
   AVP applies to is the same as the Application-Id found in the
   Diameter message header. The identity the OC-OLR AVP concerns is
   determined from the Origin-Host AVP (and Origin-Realm AVP as well)
   found in the encapsulating Diameter command. The OC-OLR AVP is
   intended to be sent only by a reporting node.

      OC-OLR ::= < AVP Header: TBD2 >
                 < OC-Sequence-Number >
                 [ OC-Report-Type ]
                 [ OC-Reduction-Percentage ]
                 [ OC-Validity-Duration ]
               * [ AVP ]

   The OC-Sequence-Number AVP indicates the "freshness" of the OC-OLR
   AVP. It is possible to replay the same OC-OLR AVP multiple times
   between the overload control endpoints; however, when the OC-OLR AVP
   content changes or the sending endpoint otherwise wants the receiving
   endpoint to update its overload control information, then the
   OC-Sequence-Number AVP MUST contain a new value greater than the one
   previously received. The receiver SHOULD discard an OC-OLR AVP with
   a sequence number that is less than the previously received one.
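
   The freshness rule above can be sketched as follows. This is a
   minimal, non-normative illustration: the class and method names are
   hypothetical, keying the state on the Origin-Host of the reporting
   node is an assumption made for the example, and timestamp wraparound
   (see Section 4.4) is deliberately ignored.

```python
class OlrTracker:
    """Remembers the last OC-Sequence-Number accepted per reporting node."""

    def __init__(self):
        # Maps a reporting node's Origin-Host to its last accepted sequence number.
        self._last_seen = {}

    def classify(self, origin_host, sequence_number):
        """Classify a received OC-OLR as "new", "replay" or "stale".

        "stale"  : lower than the last accepted value; SHOULD be discarded.
        "replay" : same value as before; no state update is needed.
        "new"    : greater value (or first report); state is updated.
        """
        last = self._last_seen.get(origin_host)
        if last is not None and sequence_number < last:
            return "stale"
        if last is not None and sequence_number == last:
            return "replay"
        self._last_seen[origin_host] = sequence_number
        return "new"
```

   A reacting node built along these lines would update its overload
   control state only for reports classified as "new".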
625 Note that if a Diameter command were to contain multiple OC-OLR AVPs 626 they all MUST have different OC-Report-Type AVP values. OC-OLR AVPs 627 with unknown values SHOULD be silently discarded and the event SHOULD 628 be logged. 630 The OC-OLR AVP can be expanded with optional sub-AVPs only if a 631 legacy implementation can safely ignore them without breaking 632 backward compatibility for the report handling semantics implied by the 633 given OC-Report-Type AVP value. If the new sub-AVPs imply new semantics 634 for the report handling, then a new OC-Report-Type AVP value MUST be 635 defined. 637 4.4. OC-Sequence-Number AVP 639 The OC-Sequence-Number AVP (AVP code TBD3) is of type Time. Its 640 usage in the context of the overload control is described in Sections 641 4.1 and 4.3. 643 From the functionality point of view, the OC-Sequence-Number AVP MUST 644 be used as a non-volatile increasing counter between two overload 645 control endpoints (neglecting the fact that the content of the AVP 646 is a 64-bit NTP timestamp [RFC5905]). The sequence number is only 647 required to be unique between two overload control endpoints. 648 Sequence numbers are treated in a uni-directional manner, i.e. the 649 sequence numbers in each direction between two endpoints are not 650 related or correlated. 652 When generating sequence numbers, the new sequence number MUST be 653 greater than any sequence number previously seen between two 654 endpoints within a time window that tolerates the wraparound of the 655 NTP timestamp (i.e. approximately 68 years). 657 4.5. OC-Validity-Duration AVP 659 The OC-Validity-Duration AVP (AVP code TBD4) is of type Unsigned32 660 and describes the number of seconds the "new and fresh" OC-OLR AVP 661 and its content are valid since the reception of the new OC-OLR AVP. 662 The default value for the OC-Validity-Duration AVP value is 5 (i.e., 663 5 seconds). When the OC-Validity-Duration AVP is not present in the 664 OC-OLR AVP, the default value applies.
Validity duration values 0 665 (i.e., 0 seconds) and above 86400 (i.e., 24 hours) MUST NOT be used. 666 Invalid validity duration values are treated as if the OC-Validity- 667 Duration AVP were not present. 669 A timeout of the overload report has specific concerns that need to 670 be taken into account by the endpoint acting on the earlier received 671 overload report(s). Section 4.7 discusses the impacts of timeout in 672 the scope of the traffic abatement algorithms. 674 As a general guidance for implementations it is RECOMMENDED never to 675 let any overload report time out. Following this rule, an 676 overload endpoint should explicitly signal the end of an overload 677 condition and not rely on the expiration of the validity time of the 678 overload report in the reacting node. This leaves no need for the 679 reacting node to reason about or guess the overload condition of the 680 reporting node. 682 4.6. OC-Report-Type AVP 684 The OC-Report-Type AVP (AVP code TBD5) is of type Enumerated. The 685 value of the AVP describes what the overload report concerns. The 686 following values are initially defined: 688 0 A host report. The overload treatment should apply to requests 689 that the reacting node knows will reach the overloaded node, for 690 example, requests with a Destination-Host AVP indicating the 691 endpoint. The reacting node learns the "host" implicitly from the 692 Origin-Host AVP of the received message that contained the OC-OLR 693 AVP. 695 1 A realm report. The overload treatment should apply to all 696 requests bound for the overloaded realm. The reacting node learns 697 the "realm" implicitly from the Origin-Realm AVP of the received 698 message that contained the OC-OLR AVP. 700 The default value of the OC-Report-Type AVP is 0 (i.e. the host 701 report). 703 The OC-Report-Type AVP is envisioned to be useful for situations 704 where a reacting node needs to apply different overload treatments 705 for different "types" of overload.
For example, the reacting node(s) 706 might need to throttle requests sent to a specific server 707 (identified by the Destination-Host AVP in the request) differently from requests 708 that can be handled by any server in a realm. The example in 709 Appendix B.1 illustrates this usage. 711 When defining new report type values, the corresponding specification 712 MUST define the semantics of the new report types and how they affect 713 the OC-OLR AVP handling. The specification MUST also reserve a 714 corresponding new feature, see the OC-Supported-Features and OC- 715 Feature-Vector AVPs. 717 4.7. OC-Reduction-Percentage AVP 719 The OC-Reduction-Percentage AVP (AVP code TBD8) is of type Unsigned32 720 and describes the percentage of the traffic that the sender is 721 requested to reduce, compared to what it otherwise would have sent. 722 The OC-Reduction-Percentage AVP applies to the default (loss like) 723 algorithm specified in this specification. However, the AVP can be 724 reused for future abatement algorithms, if its semantics fit into the 725 new algorithm. 727 The value of the OC-Reduction-Percentage AVP is between zero (0) and one 728 hundred (100). Values greater than 100 are interpreted as 100. The 729 value of 100 means that no traffic is expected, i.e. the reporting 730 node is under a severe load and ceases to process any new messages. 731 The value of 0 means that the reporting node is in a stable state and 732 does not request the other endpoint to apply any traffic abatement. 733 The default value of the OC-Reduction-Percentage AVP is 0. When the 734 OC-Reduction-Percentage AVP is not present in the overload report, 735 the default value applies. 737 If an overload control endpoint comes out of the 100 percent traffic 738 reduction as a result of the overload report timing out, it is 739 RECOMMENDED to take the following concerns into account.
The reacting node 740 sending the traffic should be conservative and, for example, first 741 send "probe" messages to learn the overload condition of the 742 overloaded node before converging to any traffic amount/rate decided 743 by the sender. Similar concerns apply in all cases when the overload 744 report times out unless the previous overload report stated 0 percent 745 reduction. 747 4.8. Attribute Value Pair flag rules 749 +---------+ 750 |AVP flag | 751 |rules | 752 +----+----+ 753 AVP Section | |MUST| 754 Attribute Name Code Defined Value Type |MUST| NOT| 755 +--------------------------------------------------+----+----+ 756 |OC-Supported-Features TBD1 x.x Grouped | | V | 757 +--------------------------------------------------+----+----+ 758 |OC-OLR TBD2 x.x Grouped | | V | 759 +--------------------------------------------------+----+----+ 760 |OC-Sequence-Number TBD3 x.x Time | | V | 761 +--------------------------------------------------+----+----+ 762 |OC-Validity-Duration TBD4 x.x Unsigned32 | | V | 763 +--------------------------------------------------+----+----+ 764 |OC-Report-Type TBD5 x.x Enumerated | | V | 765 +--------------------------------------------------+----+----+ 766 |OC-Reduction | | | 767 | -Percentage TBD8 x.x Unsigned32 | | V | 768 +--------------------------------------------------+----+----+ 769 |OC-Feature-Vector TBD6 x.x Unsigned64 | | V | 770 +--------------------------------------------------+----+----+ 772 As described in the Diameter base protocol [RFC6733], the M-bit 773 setting for a given AVP is relevant to an application and each 774 command within that application that includes the AVP. 776 The Diameter overload control AVPs SHOULD always be sent with the 777 M-bit cleared when used within existing Diameter applications to 778 avoid backward compatibility issues. Otherwise, when reused in newly 779 defined Diameter applications, the DOC related AVPs SHOULD have the 780 M-bit set. 782 5. 
Overload Control Operation 784 5.1. Overload Control Endpoints 786 The overload control solution can be considered as an overlay on top 787 of an arbitrary Diameter network. The overload control information 788 is exchanged over a "DOIC association" established between two 789 communication endpoints. The endpoints, namely the "reacting node" 790 and the "reporting node", do not need to be adjacent Diameter peer 791 nodes, nor do they need to be the end-to-end Diameter nodes in a typical 792 "client-server" deployment with multiple intermediate Diameter agent 793 nodes in between. The overload control endpoints are the two 794 Diameter nodes that decide to exchange overload control information 795 between each other. How the endpoints are determined is specific to 796 a deployment, a Diameter node's role in that deployment and local 797 configuration. 799 The following diagrams illustrate the concept of Diameter Overload 800 End-Points and how they differ from the standard [RFC6733] defined 801 client, server and agent Diameter nodes. The following is the key to 802 the elements in the diagrams: 804 C Diameter client as defined in [RFC6733]. 806 S Diameter server as defined in [RFC6733]. 808 A Diameter agent, in either a relay or proxy mode, as defined in 809 [RFC6733]. 811 DEP Diameter Overload End-Point as defined in this document. In the 812 following figures a DEP may terminate two different DOIC 813 associations being a reporter and reactor at the same time. 815 Diameter Session A Diameter session as defined in [RFC6733]. 817 DOIC Association A DOIC association exists between two Diameter 818 Overload End-Points. One of the end-points is the overload 819 reporter and the other is the overload reactor. 821 Figure 2 illustrates the most basic configuration where a client is 822 connected directly to a server. In this case, the Diameter session 823 and the DOIC association are both between the client and server.
825 +-----+ +-----+ 826 | C | | S | 827 +-----+ +-----+ 828 | DEP | | DEP | 829 +--+--+ +--+--+ 830 | | 831 | | 832 |{Diameter Session}| 833 | | 834 |{DOIC Association}| 835 | | 837 Figure 2: Basic DOIC deployment 839 In Figure 3 there is an agent that is not participating directly in 840 the exchange of overload reports. As a result, the Diameter session 841 and the DOIC association are still established between the client and 842 the server. 844 +-----+ +-----+ +-----+ 845 | C | | A | | S | 846 +-----+ +--+--+ +-----+ 847 | DEP | | | DEP | 848 +--+--+ | +--+--+ 849 | | | 850 | | | 851 |----------{Diameter Session}---------| 852 | | | 853 |----------{DOIC Association}---------| 854 | | | 856 Figure 3: DOIC deployment with non participating agent 858 Figure 4 illustrates the case where the client does not support 859 Diameter overload. In this case, the DOIC association is between the 860 agent and the server. The agent handles the role of the reactor for 861 overload reports generated by the server. 863 +-----+ +-----+ +-----+ 864 | C | | A | | S | 865 +--+--+ +-----+ +-----+ 866 | | DEP | | DEP | 867 | +--+--+ +--+--+ 868 | | | 869 | | | 870 |----------{Diameter Session}---------| 871 | | | 872 | |{DOIC Association}| 873 | | | 875 Figure 4: DOIC deployment with non-DOIC client and DOIC enabled agent 877 In Figure 5 there is a DOIC association between the client and the 878 agent and a second DOIC association between the agent and the server. 879 One use case requiring this configuration is when the agent is 880 serving as a SFE for a set of servers. 
882 +-----+ +-----+ +-----+ 883 | C | | A | | S | 884 +-----+ +-----+ +-----+ 885 | DEP | | DEP | | DEP | 886 +--+--+ +--+--+ +--+--+ 887 | | | 888 | | | 889 |----------{Diameter Session}---------| 890 | | | 891 |{DOIC Association}|{DOIC Association}| 892 | | and/or 893 |----------{DOIC Association}---------| 894 | | | 896 Figure 5: A deployment where all nodes support DOIC 898 Figure 6 illustrates a deployment where some clients support Diameter 899 overload control and some do not. In this case the agent must 900 support Diameter overload control for the non supporting client. It 901 might also need to have a DOIC association with the server, as shown 902 here, to handle overload for a server farm and/or for managing Realm 903 overload. 905 +-----+ +-----+ +-----+ +-----+ 906 | C1 | | C2 | | A | | S | 907 +-----+ +--+--+ +-----+ +-----+ 908 | DEP | | | DEP | | DEP | 909 +--+--+ | +--+--+ +--+--+ 910 | | | | 911 | | | | 912 |-------------------{Diameter Session}-------------------| 913 | | | | 914 | |--------{Diameter Session}-----------| 915 | | | | 916 |---------{DOIC Association}----------|{DOIC Association}| 917 | | | and/or 918 |-------------------{DOIC Association}-------------------| 919 | | | | 921 Figure 6: A deployment with DOIC and non-DOIC supporting clients 923 Figure 7 illustrates a deployment where some agents support Diameter 924 overload control and others do not. 926 +-----+ +-----+ +-----+ +-----+ 927 | C | | A | | A | | S | 928 +-----+ +--+--+ +-----+ +-----+ 929 | DEP | | | DEP | | DEP | 930 +--+--+ | +--+--+ +--+--+ 931 | | | | 932 | | | | 933 |-------------------{Diameter Session}-------------------| 934 | | | | 935 | | | | 936 |---------{DOIC Association}----------|{DOIC Association}| 937 | | | and/or 938 |-------------------{DOIC Association}-------------------| 939 | | | | 941 Figure 7: A deployment with DOIC and non-DOIC supporting agents 943 5.2. 
Piggybacking Principle 945 The overload control AVPs defined in this specification have been 946 designed to be piggybacked on top of existing application message 947 exchanges. This is made possible by adding overload control top 948 level AVPs, the OC-OLR AVP and the OC-Supported-Features AVP as 949 optional AVPs into existing commands when the corresponding Command 950 Code Format (CCF) specification allows adding new optional AVPs (see 951 Section 1.3.4 of [RFC6733]). 953 When added to existing commands, both OC-Feature-Vector and OC-OLR 954 AVPs SHOULD have the M-bit flag cleared to avoid backward 955 compatibility issues. 957 A new application specification can incorporate the overload control 958 mechanism specified in this document by making it mandatory to 959 implement for the application and referencing this specification 960 normatively. In such a case, the OC-Feature-Vector and OC-OLR AVPs 961 reused in newly defined Diameter applications SHOULD have the M-bit 962 flag set. However, it is the responsibility of the Diameter 963 application designers to define how overload control mechanisms work 964 in that application. 966 Note that the overload control solution does not have fixed server 967 and client roles. The endpoint role is determined based on the 968 message type: whether the message is a request (i.e. sent by a 969 "reacting node") or an answer (i.e. sent by a "reporting node"). 970 Therefore, in a typical "client-server" deployment, the "client" MAY 971 report its overload condition to the "server" for any server 972 initiated message exchange. An example of such is the server 973 requesting a re-authentication from a client. 975 5.3.
Capability Announcement 977 Since the overload control solution relies on the piggybacking 978 principle for the overload reporting and the overload control 979 endpoints are likely not adjacent peers, the endpoints need a means of finding out whether the other 980 endpoint supports the overload control and what the common traffic 981 abatement algorithm to apply to the traffic is. The approach defined 982 in this specification for the end-to-end capability announcement 983 relies on the exchange of the OC-Supported-Features AVP between the 984 endpoints. The feature announcement solution also works when carried 985 out on existing applications. For newly defined applications the 986 negotiation can be more exact based on the application specification. 987 The announced set of capabilities MUST NOT change during the life 988 time of the Diameter session (or transaction in the case of non-session 989 maintaining applications). 991 5.3.1. Reacting Node Endpoint Considerations 993 The basic principle is that the request message initiating endpoint 994 (i.e. the "reacting node") announces its support for the overload 995 control mechanism by including in the request message the OC- 996 Supported-Features AVP with those capabilities it supports and is 997 willing to use for this Diameter session (or transaction in the case of 998 non-session state maintaining applications, see Section 3.1.2 for 999 more details on Diameter sessions). It is RECOMMENDED that the 1000 request message initiating endpoint includes the capability 1001 announcement in every request regardless of whether it has had prior message 1002 exchanges with the given remote endpoint. In the case of a Diameter 1003 session maintaining application, sending the OC-Supported-Features 1004 AVP in every message is not really necessary after the initial 1005 capability announcement or until there is a change in supported 1006 features.
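As an illustration of how two endpoints might compare announced capabilities, the following sketch applies the rules of Sections 4.1 and 4.2: an absent OC-Feature-Vector AVP means that only the default algorithm is supported, and DOIC is usable only if at least one capability matches. The function names are hypothetical, and representing an absent AVP as None is an assumption of this sketch.

```python
from typing import Optional

OLR_DEFAULT_ALGO = 0x0000000000000001  # flag defined in Section 4.2


def effective_features(feature_vector: Optional[int]) -> int:
    """Return the advertised flags.

    An absent OC-Feature-Vector AVP (None here, by assumption) means that
    only the default traffic abatement algorithm is supported.
    """
    if feature_vector is None:
        return OLR_DEFAULT_ALGO
    return feature_vector & 0xFFFFFFFFFFFFFFFF  # Unsigned64 flags field


def common_features(local: Optional[int], remote: Optional[int]) -> int:
    """Bitwise intersection of the capabilities announced by both endpoints."""
    return effective_features(local) & effective_features(remote)


def doic_enabled(local: Optional[int], remote: Optional[int]) -> bool:
    """At least one announced capability has to match mutually."""
    return common_features(local, remote) != 0
```

For example, doic_enabled(None, OLR_DEFAULT_ALGO) is true, whereas an endpoint announcing only a hypothetical future flag 0x2 shares nothing with a peer that supports just the default algorithm.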
1008 Once the endpoint that initiated the request message receives an 1009 answer message from the remote endpoint, it can detect from the 1010 received answer message whether the remote endpoint supports the 1011 overload control solution and, if it does, what features are 1012 supported. The support for the overload control solution is based on 1013 the presence of the OC-Supported-Features AVP in the Diameter answer 1014 for existing applications. 1016 5.3.2. Reporting Node Endpoint Considerations 1018 When a remote endpoint (i.e. a "reporting node") receives a request 1019 message, it can detect whether the request message initiating 1020 endpoint supports the overload control solution based on the presence 1021 of the OC-Supported-Features AVP. For newly defined applications 1022 the overload control solution support can be part of the application 1023 specification. Based on the content of the OC-Supported-Features AVP 1024 the request message receiving endpoint knows what overload control 1025 functionality the other endpoint supports and can then act accordingly 1026 for the subsequent answer messages it initiates. The answer message 1027 initiating endpoint MAY announce as many supported capabilities as it 1028 has (the announced set is subject to local policy and 1029 configuration). However, at least one of the announced capabilities 1030 MUST be the same as received in the request message. 1032 The answer message initiating endpoint MUST NOT include any overload 1033 control solution defined AVPs into its answer messages if the request 1034 message initiating endpoint has not indicated support at the 1035 beginning of the created session (or transaction in the case of non- 1036 session state maintaining applications). The same also applies if 1037 none of the announced capabilities match between the two endpoints. 1039 5.4. Protocol Extensibility 1041 The overload control solution can be extended, e.g.
with new traffic 1042 abatement algorithms or new functionality. The new features and 1043 algorithms MUST be registered with IANA for use 1044 with the OC-Supported-Features AVP to announce support for the new 1045 features (see Section 7 for the required procedures). 1047 It should be noted that the Grouped AVP extension 1048 mechanisms defined in [RFC6733] also apply. This allows, for example, defining a new 1049 feature that is mandatory to understand even when piggybacked on an 1050 existing application. More specifically, the sub-AVPs inside the 1051 OC-OLR AVP MAY have the M-bit set. However, when overload control 1052 AVPs are piggybacked on top of an existing application, setting the 1053 M-bit in sub-AVPs is NOT RECOMMENDED. 1055 5.5. Overload Report Processing 1057 5.5.1. Overload Control State 1059 Both reacting and reporting nodes maintain an overload condition 1060 state for each endpoint (a host or a realm) they communicate with, provided 1061 both endpoints have announced support for DOIC. See Sections 4.1 and 1062 5.3 for discussion about how the support for DOIC is determined. The 1063 overload condition state SHOULD be able to distinguish between 1064 a realm and a specific host in that realm. 1066 The overload condition state could include the following information 1067 (per host or realm): 1069 o The endpoint information (Diameter identity of the realm and/or 1070 host, application identifier, etc) 1072 o Reduction percentage 1074 o Validity period timer 1076 o Sequence number 1078 o Supported/selected traffic abatement algorithm 1080 The overload control state information SHOULD be maintained as long 1081 as the other endpoint is known to support DOIC (based on the presence 1082 of the DOIC AVPs or by a future application specification). 1084 5.5.2.
Reacting Node Considerations 1086 Once a reacting node receives an OC-OLR AVP from a reporting node, it 1087 applies the traffic abatement based on the commonly supported 1088 algorithm with the reporting node and the current overload condition. 1089 The reacting node learns the reporting node supported abatement 1090 algorithms directly from the received answer message containing the 1091 OC-Supported-Features AVP or indirectly remembering the previously 1092 used traffic abatement algorithm with the given reporting node. 1094 The received OC-Supported-Features AVP does not change the existing 1095 overload condition and/or traffic abatement algorithm settings if the 1096 OC-Sequence-Number AVP contains a value that is equal to the 1097 previously received/recorded one. If the OC-Supported-Features AVP 1098 is received for the first time for the reporting node or the OC- 1099 Sequence-Number AVP value is less than the previously received/ 1100 recorded one (and is outside the valid overflow window), then either 1101 the sequence number is stale (e.g. an intentional or unintentional 1102 replay) and SHOULD be silently discarded. 1104 The OC-OLR AVP contains the necessary information of the overload 1105 condition on the reporting node. Similarly to the OC-Supported- 1106 Features's sequence numbering, the OC-OLR AVP also has the OC- 1107 Sequence-Number AVP and its handling is similar to the one in the OC- 1108 Supported-Features AVP. The reacting node MUST update its overload 1109 condition state whenever receiving the OC-OLR AVP for the first time 1110 or the OC-Sequence-Number sub-AVP indicates a change in the OC-OLR 1111 AVP. 1113 As described in Section 4.3, the OC-OLR AVP contains the necessary 1114 information of the overload condition on the reporting node. 
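The handling of repeated OC-Supported-Features announcements described above might be sketched as follows. The class FeatureCache and its methods are hypothetical, keying the cache on the Origin-Host value is an assumption, and a lower sequence number is simply ignored here without modelling the overflow window.

```python
from typing import Dict, Optional, Tuple


class FeatureCache:
    """Remembers the last OC-Supported-Features seen per reporting node."""

    def __init__(self) -> None:
        # Origin-Host identity -> (sequence number, feature vector)
        self._seen: Dict[str, Tuple[int, int]] = {}

    def observe(self, origin_host: str, seq: int, features: int) -> bool:
        """Record an announcement; return True if the stored settings changed."""
        previous = self._seen.get(origin_host)
        if previous is not None and seq <= previous[0]:
            # Equal: unchanged announcement, existing settings are kept.
            # Lower: treated as stale in this sketch and ignored.
            return False
        self._seen[origin_host] = (seq, features)
        return True

    def features_for(self, origin_host: str) -> Optional[int]:
        """Return the remembered feature vector, or None if none was seen."""
        entry = self._seen.get(origin_host)
        return entry[1] if entry is not None else None
```

This also covers the indirect case, where the reacting node relies on the algorithm remembered from an earlier exchange with the same reporting node.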
1116 From the OC-Report-Type AVP contained in the OC-OLR AVP, the reacting 1117 node learns whether the overload condition report concerns a specific 1118 host (as identified by the Origin-Host AVP of the answer message 1119 containing the OC-OLR AVP) or the entire realm (as identified by the 1120 Origin-Realm AVP of the answer message containing the OC-OLR AVP). 1121 The reacting node learns the Diameter application to which the 1122 overload report applies from the Application-ID of the answer message 1123 containing the OC-OLR AVP. The reacting node MUST use this 1124 information as an input for its traffic abatement algorithm. The 1125 idea is that the reacting node applies different handling of the 1126 traffic abatement depending on whether the sent request messages are targeted to a 1127 specific host (identified by the Destination-Host AVP in the request) or 1128 to any host in a realm (when only the Destination-Realm AVP is 1129 present in the request). Note that future specifications MAY define 1130 new OC-Report-Type AVP values that imply different handling of the 1131 OC-OLR AVP, for example, in the form of new additional AVPs inside the 1132 Grouped OC-OLR AVP that would define the report target at a finer 1133 granularity than just a host. 1135 In the context of this specification and the default traffic 1136 abatement algorithm, the OC-Reduction-Percentage AVP value MUST be 1137 interpreted in the following way: 1139 value == 0 1141 Indicates explicitly the end of the overload condition and the 1142 reacting node SHOULD NOT apply the traffic abatement algorithm 1143 procedures anymore for the given reporting node (or realm). 1145 value == 100 1147 Indicates that the reporting node (or realm) does not want to 1148 receive any traffic from the reacting node for the application the 1149 report concerns. The reacting node MUST take all measures not to 1150 send traffic to the reporting node (or realm) until the 1151 overload condition changes or expires.
1153 0 < value < 100 1155 Indicates that the reporting node urges the reacting node to 1156 reduce its traffic by a given percentage. For example if the 1157 reacting node has been sending 100 packets per second to the 1158 reporting node, then a reception of OC-Reduction-Percentage value 1159 of 10 would mean that from now on the reacting node MUST only send 1160 90 packets per second. How the reacting node achieves the "true 1161 reduction" of transactions leading to the sent request messages is up 1162 to the implementation. The reacting node MAY simply drop every 1163 10th packet from its output queue and let the generic application 1164 logic try to recover from it. 1166 If the OC-OLR AVP is received for the first time, the reacting node 1167 MUST create an overload condition state associated with the related 1168 realm or a specific host in the realm identified in the message 1169 carrying the OC-OLR AVP, as described in Section 5.5.1. 1171 If the value of the OC-Sequence-Number AVP contained in the received 1172 OC-OLR AVP is equal to or less than the value stored in an existing 1173 overload condition state, the received OC-OLR AVP SHOULD be silently 1174 discarded. If the value of the OC-Sequence-Number AVP contained in 1175 the received OC-OLR AVP is greater than the value stored in an 1176 existing overload condition state or there is no previously recorded 1177 sequence number, the reacting node MUST update the overload condition 1178 state associated with the realm or the specific node in the realm. 1180 When an overload condition state is created or updated, the reacting 1181 node MUST apply the traffic abatement requested in the OC-OLR AVP 1182 using the algorithm announced in the OC-Supported-Features AVP 1183 contained in the received answer message along with the OC-OLR AVP.
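The state handling and the interpretation of the OC-Reduction-Percentage AVP described above could be sketched as shown below. All names are hypothetical, the key layout (report type, host or realm identity, application identifier) is an assumption, and random dropping is used in place of the "every 10th packet" example; validity timers are not modelled.

```python
import random
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical key: (report type, host or realm identity, application id)
StateKey = Tuple[int, str, int]


@dataclass
class OverloadState:
    sequence_number: int
    reduction_percentage: int


class ReactingNode:
    def __init__(self, rng: Optional[random.Random] = None) -> None:
        self.states: Dict[StateKey, OverloadState] = {}
        self._rng = rng or random.Random()

    def on_olr(self, key: StateKey, seq: int, reduction: int = 0) -> bool:
        """Process a received OC-OLR; return True if state was created/updated."""
        reduction = min(reduction, 100)  # values above 100 are interpreted as 100
        state = self.states.get(key)
        if state is not None and seq <= state.sequence_number:
            return False  # equal or lower sequence number: silently discard
        self.states[key] = OverloadState(seq, reduction)
        return True

    def should_send(self, key: StateKey) -> bool:
        """Decide whether one request may be sent under the current state."""
        state = self.states.get(key)
        if state is None or state.reduction_percentage == 0:
            return True   # no overload reported, or overload explicitly ended
        if state.reduction_percentage >= 100:
            return False  # reporting node wants no traffic at all
        # Send with probability (100 - reduction) percent.
        return self._rng.random() * 100 >= state.reduction_percentage
```

With a reduction of 10, roughly 90 percent of the calls to should_send() return true over a large number of requests, which corresponds to the 100 to 90 packets per second example above.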
1185 The validity duration of the overload information contained in the 1186 OC-OLR AVP is either explicitly indicated in the OC-Validity-Duration 1187 AVP or is implicitly equal to the default value (5 seconds) if the 1188 OC-Validity-Duration AVP is absent from the OC-OLR AVP. The reacting 1189 node MUST maintain the validity duration in the overload condition 1190 state. Once the validity duration times out, the reacting node MUST 1191 assume the overload condition reported in a previous OC-OLR AVP has 1192 ended. 1194 5.5.3. Reporting Node Considerations 1196 A reporting node is a Diameter node inserting an OC-OLR AVP in a 1197 Diameter message in order to inform a reacting node about an overload 1198 condition and request Diameter traffic abatement. 1200 The operation on the reporting node is rather straightforward. The 1201 reporting node learns the capabilities of the reacting node when it 1202 receives the OC-Supported-Features AVP as part of any Diameter 1203 request message. If the reporting node shares at least one common 1204 feature with the reacting node, then DOIC can be enabled between 1205 these two endpoints. See Section 5.3 for further discussion on the 1206 capability and feature announcement between two endpoints. 1208 When a traffic reduction is required due to an overload condition and 1209 the overload control solution is supported by the sender of the 1210 Diameter request, the reporting node MUST include an OC-Supported- 1211 Features AVP and an OC-OLR AVP in the corresponding Diameter answer. 1212 The OC-OLR AVP contains the required traffic reduction and the OC- 1213 Supported-Features AVP indicates the traffic abatement algorithm to 1214 apply. This algorithm MUST be one of the algorithms advertised by 1215 the request sender. 1217 A reporting node MAY rely on the OC-Validity-Duration AVP values for 1218 the implicit overload condition state cleanup on the reacting node.
1219 However, it is RECOMMENDED that the reporting node always explicitly 1220 indicates the end of an overload condition. 1222 6. Transport Considerations 1224 In order to reduce the additional AVP and message processing introduced 1225 by overload control, it might be beneficial to signal whether 1226 the Diameter command carries overload control information that should 1227 be of interest to an overload aware Diameter node. 1229 Whether such an indication should be included is not part of this specification. 1230 Nor has it been concluded at what layer such a possible 1231 indication should be. Obvious candidates include transport layer 1232 protocols (e.g., SCTP PPID or TCP flags) or Diameter command header 1233 flags. 1235 7. IANA Considerations 1237 7.1. AVP codes 1239 New AVPs defined by this specification are listed in Section 4. All 1240 AVP codes are allocated from the 'Authentication, Authorization, and 1241 Accounting (AAA) Parameters' AVP Codes registry. 1243 7.2. New registries 1245 Three new registries are needed under the 'Authentication, 1246 Authorization, and Accounting (AAA) Parameters' registry. 1248 Section 4.2 defines a new "Overload Control Feature Vector" registry 1249 including the initial assignments. New values can be added into the 1250 registry using the Specification Required policy [RFC5226]. See 1251 Section 4.2 for the initial assignment in the registry. 1253 Section 4.6 defines a new "Overload Report Type" registry with its 1254 initial assignments. New types can be added using the Specification 1255 Required policy [RFC5226]. 1257 8. Security Considerations 1259 This mechanism gives Diameter nodes the ability to request that 1260 downstream nodes send fewer Diameter requests. Nodes do this by 1261 exchanging overload reports that directly affect this reduction. 1262 This exchange is potentially subject to multiple methods of attack, 1263 and has the potential to be used as a Denial-of-Service (DoS) attack 1264 vector.
1266 Overload reports may contain information about the topology and 1267 current status of a Diameter network. This information is 1268 potentially sensitive. Network operators may wish to control 1269 disclosure of overload reports to unauthorized parties to avoid its 1270 use for competitive intelligence or to target attacks. 1272 Diameter does not include features to provide end-to-end 1273 authentication, integrity protection, or confidentiality. This may 1274 cause complications when sending overload reports between non- 1275 adjacent nodes. 1277 8.1. Potential Threat Modes 1279 The Diameter protocol involves transactions in the form of requests 1280 and answers exchanged between clients and servers. These clients and 1281 servers may be peers, that is, they may share a direct transport (e.g. 1283 TCP or SCTP) connection, or the messages may traverse one or more 1284 intermediaries, known as Diameter Agents. Diameter nodes use TLS, 1285 DTLS, or IPsec to authenticate peers, and to provide confidentiality 1286 and integrity protection of traffic between peers. Nodes can make 1287 authorization decisions based on the peer identities authenticated at 1288 the transport layer. 1290 When agents are involved, this presents an effectively hop-by-hop 1291 trust model. That is, a Diameter client or server can authorize an 1292 agent for certain actions, but it must trust that agent to make 1293 appropriate authorization decisions about its peers, and so on. 1295 Since confidentiality and integrity protection occur at the 1296 transport layer, agents can read, and perhaps modify, any part of a 1297 Diameter message, including an overload report. 1299 There are several ways an attacker might attempt to exploit the 1300 overload control mechanism. An unauthorized third party might inject 1301 an overload report into the network.
If this third party is upstream 1302 of an agent, and that agent fails to apply proper authorization 1303 policies, downstream nodes may mistakenly trust the report. This 1304 attack is at least partially mitigated by the assumption that nodes 1305 include overload reports in Diameter answers but not in requests. 1306 This requires an attacker to have knowledge of the original request 1307 in order to construct a response. Therefore, implementations SHOULD 1308 validate that an answer containing an overload report is a properly 1309 constructed response to a pending request prior to acting on the 1310 overload report. 1312 A similar attack involves an otherwise authorized Diameter node that 1313 sends an inappropriate overload report. For example, a server for 1314 the realm "example.com" might send an overload report indicating that 1315 a competitor's realm "example.net" is overloaded. If other nodes act 1316 on the report, they may falsely believe that "example.net" is 1317 overloaded, effectively reducing that realm's capacity. Therefore, 1318 it's critical that nodes validate that an overload report received 1319 from a peer actually falls within that peer's responsibility before 1320 acting on the report or forwarding the report to other peers. For 1321 example, an overload report from a peer that applies to a realm not 1322 handled by that peer is suspect. 1324 An attacker might use the information in an overload report to assist 1325 in certain attacks. For example, an attacker could use information 1326 about current overload conditions to time a DoS attack for maximum 1327 effect, or use subsequent overload reports as a feedback mechanism to 1328 learn the results of a previous or ongoing attack. 1330 8.2. Denial of Service Attacks 1332 Diameter overload reports can cause a node to cease sending some or 1333 all Diameter requests for an extended period. This makes them a 1334 tempting vector for DoS attacks.
Furthermore, since Diameter is almost 1335 always used in support of other protocols, a DoS attack on Diameter 1336 is likely to impact those protocols as well. Therefore, Diameter 1337 nodes MUST NOT honor or forward overload reports from unauthorized or 1338 otherwise untrusted sources. 1340 8.3. Non-Compliant Nodes 1342 When a Diameter node sends an overload report, it cannot assume that 1343 all nodes will comply. A non-compliant node might continue to send 1344 requests with no reduction in load. Requirement 28 [RFC7068] 1345 indicates that the overload control solution cannot assume that all 1346 Diameter nodes in a network are necessarily trusted, and that 1347 malicious nodes not be allowed to take advantage of the overload 1348 control mechanism to get more than their fair share of service. 1350 In the absence of an overload control mechanism, Diameter nodes need 1351 to implement strategies to protect themselves from floods of 1352 requests, and to make sure that a disproportionate load from one 1353 source does not prevent other sources from receiving service. For 1354 example, a Diameter server might reject a certain percentage of 1355 requests from sources that exceed certain limits. Overload control 1356 can be thought of as an optimization for such strategies, where 1357 downstream nodes never send the excess requests in the first place. 1358 However, the presence of an overload control mechanism does not 1359 remove the need for these other protection strategies. 1361 8.4. End-to End-Security Issues 1363 The lack of end-to-end security features makes it far more difficult 1364 to establish trust in overload reports that originate from non- 1365 adjacent nodes. Any agents in the message path may insert or modify 1366 overload reports. Nodes must trust that their adjacent peers perform 1367 proper checks on overload reports from their peers, and so on, 1368 creating a transitive-trust requirement extending for potentially 1369 long chains of nodes. 
   Network operators must determine if this transitive-trust
   requirement is acceptable for their deployments. Nodes supporting
   Diameter overload control MUST give operators the ability to select
   which peers are trusted to deliver overload reports, and whether
   they are trusted to forward overload reports from non-adjacent
   nodes.

   The lack of end-to-end confidentiality protection means that any
   Diameter agent in the path of an overload report can view the
   contents of that report. In addition to the requirement to select
   which peers are trusted to send overload reports, operators MUST be
   able to select which peers are authorized to receive reports. A node
   MUST NOT send an overload report to a peer not authorized to receive
   it. Furthermore, an agent MUST remove any overload reports that
   might have been inserted by other nodes before forwarding a Diameter
   message to a peer that is not authorized to receive overload reports.

   At the time of this writing, the DIME working group is studying
   requirements for adding end-to-end security features to Diameter
   [I-D.ietf-dime-e2e-sec-req]. These features, when they become
   available, might make it easier to establish trust in non-adjacent
   nodes for overload control purposes. Readers should note, however,
   that the overload control mechanism encourages Diameter agents to
   modify AVPs in, or insert additional AVPs into, existing messages
   that are originated by other nodes. If end-to-end security is
   enabled, there is a risk that such modification could violate
   integrity protection. The details of using any future Diameter
   end-to-end security mechanism with overload control will require
   careful consideration, and are beyond the scope of this document.

9. Contributors

   The following people contributed substantial ideas, feedback, and
   discussion to this document:

   o  Eric McMurry

   o  Hannes Tschofenig

   o  Ulrich Wiehe

   o  Jean-Jacques Trottin

   o  Maria Cruz Bartolome

   o  Martin Dolly

   o  Nirav Salot

   o  Susan Shishufeng

10. References

10.1. Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", BCP 26, RFC 5226,
              May 2008.

   [RFC5905]  Mills, D., Martin, J., Burbank, J., and W. Kasch, "Network
              Time Protocol Version 4: Protocol and Algorithms
              Specification", RFC 5905, June 2010.

   [RFC6733]  Fajardo, V., Arkko, J., Loughney, J., and G. Zorn,
              "Diameter Base Protocol", RFC 6733, October 2012.

10.2. Informative References

   [3GPP.23.203]
              3GPP, "Policy and charging control architecture", 3GPP
              TS 23.203 10.9.0, September 2013.

   [3GPP.29.229]
              3GPP, "Cx and Dx interfaces based on the Diameter
              protocol; Protocol details", 3GPP TS 29.229 10.5.0,
              March 2013.

   [3GPP.29.272]
              3GPP, "Evolved Packet System (EPS); Mobility Management
              Entity (MME) and Serving GPRS Support Node (SGSN) related
              interfaces based on Diameter protocol", 3GPP TS 29.272
              10.8.0, June 2013.

   [I-D.ietf-dime-e2e-sec-req]
              Tschofenig, H., Korhonen, J., Zorn, G., and K. Pillay,
              "Diameter AVP Level Security: Scenarios and Requirements",
              draft-ietf-dime-e2e-sec-req-00 (work in progress),
              September 2013.

   [RFC4006]  Hakala, H., Mattila, L., Koskinen, J-P., Stura, M., and J.
              Loughney, "Diameter Credit-Control Application", RFC 4006,
              August 2005.

   [RFC5729]  Korhonen, J., Jones, M., Morand, L., and T. Tsou,
              "Clarifications on the Routing of Diameter Requests Based
              on the Username and the Realm", RFC 5729, December 2009.

   [RFC7068]  McMurry, E. and B. Campbell, "Diameter Overload Control
              Requirements", RFC 7068, November 2013.

Appendix A. Issues left for future specifications

   The base solution for overload control does not cover all possible
   use cases. A number of solution aspects were intentionally left for
   future specification and protocol work.

A.1. Additional traffic abatement algorithms

   This specification describes only a simple loss-based algorithm.
   Future algorithms can be added using the extension mechanism
   designed into the solution. New algorithms need to be registered
   with IANA. See Sections 4.1 and 7 for the required IANA steps.

A.2. Agent Overload

   This specification focuses on Diameter end-point (server or client)
   overload. A separate extension will be required to outline the
   handling of agent overload.

A.3. DIAMETER_TOO_BUSY clarifications

   The current [RFC6733] behaviour in the case of DIAMETER_TOO_BUSY is
   somewhat underspecified. For example, there is no information about
   how long the specific Diameter node is willing to be unavailable. A
   specification updating [RFC6733] should clarify the handling of
   DIAMETER_TOO_BUSY from the point of view of the Diameter node that
   initiates the error answer and from the point of view of the
   Diameter node that initiated the original request. Further, the
   inclusion of additional AVPs providing such information should be
   discussed and possibly recommended.

Appendix B. Examples

B.1. Mix of Destination-Realm routed requests and Destination-Host
     routed requests

   Diameter allows a client to optionally select the destination server
   of a request, even if there are agents between the client and the
   server. The client does this using the Destination-Host AVP.
   In cases where the client does not care if a specific server receives
   the request, it can omit Destination-Host and route the request using
   the Destination-Realm and Application Id, effectively letting an
   agent select the server.

   Clients commonly send mixtures of Destination-Host and
   Destination-Realm routed requests. For example, in an application
   that uses user sessions, a client typically won't care which server
   handles a session-initiating request. But once the session is
   initiated, the client will send all subsequent requests in that
   session to the same server. Therefore it would send the initial
   request with no Destination-Host AVP. If it receives a successful
   answer, the client would copy the Origin-Host value from the answer
   message into a Destination-Host AVP in each subsequent request in
   the session.

   An agent has very limited options in applying overload abatement to
   requests that contain Destination-Host AVPs. It typically cannot
   route the request to a different server than the one identified in
   Destination-Host. Its only remaining options are to throttle such
   requests locally, or to send an overload report back towards the
   client so the client can throttle the requests. The second choice is
   usually more efficient, since it prevents any throttled requests from
   being sent in the first place, and removes the agent's need to send
   errors back to the client for each dropped request.

   On the other hand, an agent has much more leeway to apply overload
   abatement for requests that do not contain Destination-Host AVPs. If
   the agent has multiple servers in its peer table for the given realm
   and application, it can route such requests to other, less overloaded
   servers.
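
   The difference between the two routing cases can be sketched in a
   few lines of Python. This is an illustration only, not part of the
   specification: the function names, the `reductions` table, and the
   choice to weight servers by their remaining capacity are all
   assumptions made for the example. A real agent would follow its own
   local policy.

```python
import random


def pick_server(reductions, rng=random):
    """Pick a server for a realm-routed request, or None if none can take it.

    `reductions` (hypothetical) maps each candidate server to the
    Reduction-Percentage (0-100) from its latest overload report;
    0 means the server has not reported overload.
    """
    # Weight each server by the share of traffic it is still willing to accept.
    weights = {server: 100 - pct for server, pct in reductions.items()}
    candidates = [server for server, weight in weights.items() if weight > 0]
    if not candidates:
        return None  # every server asked for a 100% reduction
    return rng.choices(candidates, weights=[weights[s] for s in candidates])[0]


def route(reductions, destination_host=None, rng=random):
    """Return the server that should receive a request."""
    if destination_host is not None:
        # Destination-Host routed: the agent cannot pick a different server.
        return destination_host
    # Destination-Realm routed: the agent is free to avoid overloaded servers.
    return pick_server(reductions, rng)
```

   With this sketch, a request carrying Destination-Host:S1 is always
   routed to S1 regardless of load, while a realm-routed request is
   steered towards whichever servers have reported the least overload.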

   If the overload severity increases, the agent may reach a point where
   there is not sufficient capacity across all servers to handle even
   realm-routed requests. In this case, the realm itself can be
   considered overloaded. The agent may need the client to throttle
   realm-routed requests in addition to Destination-Host routed
   requests. The overload severity may be different for each server,
   and the severity for the realm is likely to be different than for
   any specific server. Therefore, an agent may need to forward, or
   originate, multiple overload reports with differing ReportType and
   Reduction-Percentage values.

   Figure 8 illustrates such a mixed-routing scenario. In this example,
   the servers S1, S2, and S3 handle requests for the realm "realm".
   Any of the three can handle requests that are not part of a user
   session (i.e., routed by Destination-Realm). But once a session is
   established, all requests in that session must go to the same server.

    Client    Agent      S1        S2        S3
      |         |         |         |         |
      |(1) Request (DR:realm)       |         |
      |-------->|         |         |         |
      |         |         |         |         |
      |         |         |         |         |
      |         |Agent selects S1   |         |
      |         |         |         |         |
      |         |         |         |         |
      |         |         |         |         |
      |         |(2) Request (DR:realm)       |
      |         |-------->|         |         |
      |         |         |         |         |
      |         |         |         |         |
      |         |         |S1 overloaded, returns OLR
      |         |         |         |         |
      |         |         |         |         |
      |         |         |         |         |
      |         |(3) Answer (OR:realm,OH:S1,OLR:RT=DH)
      |         |<--------|         |         |
      |         |         |         |         |
      |         |         |         |         |
      |         |sees OLR,routes DR traffic to S2&S3
      |         |         |         |         |
      |         |         |         |         |
      |         |         |         |         |
      |(4) Answer (OR:realm,OH:S1, OLR:RT=DH) |
      |<--------|         |         |         |
      |         |         |         |         |
      |         |         |         |         |
      |Client throttles requests with DH:S1   |
      |         |         |         |         |
      |         |         |         |         |
      |         |         |         |         |
      |(5) Request (DR:realm)       |         |
      |-------->|         |         |         |
      |         |         |         |         |
      |         |         |         |         |
      |         |Agent selects S2   |         |
      |         |         |         |         |
      |         |         |         |         |
      |         |         |         |         |
      |         |(6) Request (DR:realm)       |
      |         |------------------>|         |
      |         |         |         |         |
      |         |         |         |         |
      |         |         |         |S2 is overloaded...
      |         |         |         |         |
      |         |         |         |         |
      |         |         |         |         |
      |         |(7) Answer (OH:S2, OLR:RT=DH)|
      |         |<------------------|         |
      |         |         |         |         |
      |         |         |         |         |
      |         |Agent sees OLR, realm now overloaded
      |         |         |         |         |
      |         |         |         |         |
      |         |         |         |         |
      |(8) Answer (OR:realm,OH:S2, OLR:RT=DH, OLR:RT=R)
      |<--------|         |         |         |
      |         |         |         |         |
      |         |         |         |         |
      |Client throttles DH:S1, DH:S2, and DR:realm
      |         |         |         |         |
      |         |         |         |         |
      |         |         |         |         |
      |         |         |         |         |
      |         |         |         |         |

      Figure 8: Mix of Destination-Host and Destination-Realm Routed
                                 Requests

   1.  The client sends a request with no Destination-Host AVP (that is,
       a Destination-Realm routed request).

   2.  The agent follows local policy to select a server from its peer
       table. In this case, the agent selects S1 and forwards the
       request.

   3.  S1 is overloaded. It sends an answer indicating success, but also
       includes an overload report. Since the overload report only
       applies to S1, the ReportType is "Destination-Host".

   4.  The agent sees the overload report, and records that S1 is
       overloaded by the value in the Reduction-Percentage AVP. It
       begins diverting the indicated percentage of realm-routed traffic
       from S1 to S2 and S3. Since it can't divert Destination-Host
       routed traffic, it forwards the overload report to the client.
       This effectively delegates the throttling of traffic with
       Destination-Host:S1 to the client.

   5.  The client sends another Destination-Realm routed request.

   6.  The agent selects S2, and forwards the request.

   7.  It turns out that S2 is also overloaded, perhaps due to all that
       traffic it took over for S1. S2 returns a successful answer
       containing an overload report. Since this report only applies to
       S2, the ReportType is "Destination-Host".

   8.  The agent sees that S2 is also overloaded by the value in
       Reduction-Percentage. This value is probably different than the
       value from S1's report. The agent diverts the remaining traffic
       to S3 as best it can, but it calculates that the remaining
       capacity across all three servers is no longer sufficient to
       handle all of the realm-routed traffic. This means the realm
       itself is overloaded. The realm's overload percentage is most
       likely different than that for either S1 or S2. The agent
       forwards S2's report back to the client in the Diameter answer.
       Additionally, the agent generates a new report for the realm of
       "realm", and inserts that report into the answer. The client
       throttles requests with Destination-Host:S1 at one rate, requests
       with Destination-Host:S2 at another rate, and requests with no
       Destination-Host AVP at yet a third rate. (Since S3 has not
       indicated overload, the client does not throttle requests with
       Destination-Host:S3.)
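
   The client state at the end of this example can be sketched as
   follows. This is a minimal illustration under stated assumptions,
   not a definitive implementation: the class and method names are
   hypothetical, the percentages used are invented for the example, and
   report expiry and ordering are ignored. It applies a simple
   loss-based throttle, dropping each request with a probability equal
   to the applicable Reduction-Percentage.

```python
import random


class ClientThrottle:
    """Tracks overload reports and decides whether to send each request."""

    def __init__(self, rng=None):
        self.host_reduction = {}   # Destination-Host -> Reduction-Percentage
        self.realm_reduction = {}  # Destination-Realm -> Reduction-Percentage
        self.rng = rng or random.Random()

    def on_report(self, report_type, target, reduction_pct):
        """Record a report; report_type is "host" or "realm" (hypothetical)."""
        table = self.host_reduction if report_type == "host" else self.realm_reduction
        table[target] = reduction_pct

    def reduction_for(self, dest_realm, dest_host=None):
        """Return the reduction percentage that applies to a request."""
        if dest_host is not None:
            # Destination-Host routed: only that host's report applies.
            return self.host_reduction.get(dest_host, 0)
        # Realm-routed: only the realm report applies.
        return self.realm_reduction.get(dest_realm, 0)

    def should_send(self, dest_realm, dest_host=None):
        """Loss-based throttle: drop roughly reduction_pct percent of requests."""
        pct = self.reduction_for(dest_realm, dest_host)
        return self.rng.random() * 100 >= pct


# State after step 8, using invented percentages.
client = ClientThrottle()
client.on_report("host", "S1", 50)
client.on_report("host", "S2", 20)
client.on_report("realm", "realm", 10)
```

   Here `client.reduction_for("realm", "S3")` is 0, matching the note
   in step 8 that requests with Destination-Host:S3 are not throttled.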

Authors' Addresses

   Jouni Korhonen (editor)
   Broadcom
   Porkkalankatu 24
   Helsinki FIN-00180
   Finland

   Email: jouni.nospam@gmail.com


   Steve Donovan
   Oracle
   17210 Campbell Road
   Dallas, Texas 75254
   United States

   Email: srdonovan@usdonovans.com


   Ben Campbell
   Oracle
   17210 Campbell Road
   Dallas, Texas 75254
   United States

   Email: ben@nostrum.com


   Lionel Morand
   Orange Labs
   38/40 rue du General Leclerc
   Issy-Les-Moulineaux Cedex 9 92794
   France

   Phone: +33145296257
   Email: lionel.morand@orange.com