idnits 2.17.00 (12 Aug 2021)

/tmp/idnits27279/draft-ietf-dime-load-09.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (March 22, 2017) is 1879 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: '1' on line 937

  -- Looks like a reference, but probably isn't: '2' on line 937

  == Missing Reference: 'C' is mentioned on line 848, but not defined

  == Missing Reference: 'A1' is mentioned on line 848, but not defined

  == Missing Reference: 'A2' is mentioned on line 848, but not defined

  == Missing Reference: 'S4' is mentioned on line 848, but not defined

  == Outdated reference: draft-ietf-dime-agent-overload has been published as
     RFC 8581

     Summary: 0 errors (**), 0 flaws (~~), 6 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    Internet Engineering Task Force                              B. Campbell
3    Internet-Draft                                           S. Donovan, Ed.
4    Intended status: Standards Track                                  Oracle
5    Expires: September 23, 2017                                  JJ. Trottin
6                                                                        Nokia
7                                                               March 22, 2017

9                      Diameter Load Information Conveyance
10                           draft-ietf-dime-load-09

12   Abstract

14      RFC 7068 describes requirements for Overload Control in Diameter.
15      This includes a requirement to allow Diameter nodes to send "load"
16      information, even when the node is not overloaded. The RFC 7683
17      Diameter Overload Indication Conveyance (DOIC) solution describes a
18      mechanism meeting most of the requirements, but does not currently
19      include the ability to send load information. This document defines
20      a mechanism for conveying Diameter load information.

22   Status of This Memo

24      This Internet-Draft is submitted in full conformance with the
25      provisions of BCP 78 and BCP 79.

27      Internet-Drafts are working documents of the Internet Engineering
28      Task Force (IETF). Note that other groups may also distribute
29      working documents as Internet-Drafts. The list of current Internet-
30      Drafts is at http://datatracker.ietf.org/drafts/current/.

32      Internet-Drafts are draft documents valid for a maximum of six months
33      and may be updated, replaced, or obsoleted by other documents at any
34      time. It is inappropriate to use Internet-Drafts as reference
35      material or to cite them other than as "work in progress."

37      This Internet-Draft will expire on September 23, 2017.

39   Copyright Notice

41      Copyright (c) 2017 IETF Trust and the persons identified as the
42      document authors. All rights reserved.

44      This document is subject to BCP 78 and the IETF Trust's Legal
45      Provisions Relating to IETF Documents
46      (http://trustee.ietf.org/license-info) in effect on the date of
47      publication of this document. Please review these documents
48      carefully, as they describe your rights and restrictions with respect
49      to this document. Code Components extracted from this document must
50      include Simplified BSD License text as described in Section 4.e of
51      the Trust Legal Provisions and are provided without warranty as
52      described in the Simplified BSD License.

54   Table of Contents

56      1.  Introduction
57      2.  Terminology and Abbreviations
58      3.  Conventions Used in This Document
59      4.  Background
60        4.1.  Differences between Load and Overload information
61        4.2.  How is Load Information Used?
62      5.  Solution Overview
63        5.1.  Theory of Operation
64      6.  Load Mechanism Procedures
65        6.1.  Reporting Node Behavior
66          6.1.1.  Endpoint Reporting Node Behavior
67          6.1.2.  Agent Reporting Node Behavior
68        6.2.  Reacting Node Behavior
69        6.3.  Extensibility
70        6.4.  Addition and Removal of Nodes
71      7.  Attribute Value Pairs
72        7.1.  Load AVP
73        7.2.  Load-Type AVP
74        7.3.  Load-Value AVP
75        7.4.  SourceID AVP
76        7.5.  Attribute Value Pair flag rules
77      8.  Security Considerations
78      9.  IANA Considerations
79        9.1.  AVP Codes
80        9.2.  New Registries
81      10. References
82        10.1.  Normative References
83        10.2.  Informative References
84      Appendix A.  Topology Scenarios
85        A.1.  No Agent
86        A.2.  Single Agent
87        A.3.  Multiple Agents
88        A.4.  Linked Agents
89        A.5.  Shared Server Pools
90        A.6.  Agent Chains
91        A.7.  Fully Meshed Layers
92        A.8.  Partitions
93        A.9.  Active-Standby Nodes
94      Authors' Addresses

96   1. Introduction

98      [RFC7068] describes requirements for Overload Control in Diameter
99      [RFC6733]. The DIME working group has finished the Diameter Overload
100     Indication Conveyance (DOIC) mechanism [RFC7683]. As currently
101     specified, DOIC fulfills some, but not all, of the requirements.

103     In particular, DOIC does not fulfill REQ 23 and REQ 24:

105        REQ 23: The solution MUST provide sufficient information to enable
106        a load-balancing node to divert messages that are rejected or
107        otherwise throttled by an overloaded upstream node to other
108        upstream nodes that are the most likely to have sufficient
109        capacity to process them.

111        REQ 24: The solution MUST provide a mechanism for indicating load
112        levels, even when not in an overload condition, to assist nodes in
113        making decisions to prevent overload conditions from occurring.

115     There are several other requirements in [RFC7068] that mention both
116     overload and load information that are only partially fulfilled by
117     DOIC.
119     The DIME working group explicitly chose not to fulfill these
120     requirements when publishing DOIC [RFC7683], for several reasons.
121     A principal reason was that the working group did not agree on a
122     general approach for conveying load information. It chose to
123     progress the rest of DOIC, and deferred load information conveyance
124     to a DOIC extension or a separate mechanism.

126     This document defines a mechanism that addresses the load-related
127     requirements from RFC 7068.

129  2. Terminology and Abbreviations

131     AVP

133        Attribute Value Pair

135     DOIC

137        Diameter Overload Indication Conveyance ([RFC7683])

139     Load

141        The relative usage of the Diameter message processing capacity of
142        a Diameter node. A low load level indicates that the Diameter
143        node is underutilized. A high load level indicates that the node
144        is closer to being fully utilized.

146     Offered Load

148        The actual traffic sent to the reporting node after overload
149        abatement and routing decisions are made.

151     Reporting Node

153        A Diameter node that generates a Load report.

155     Reacting Node

157        A Diameter node that acts upon a Load report.

159     Routing Information

161        Routing Information referred to in this document can include the
162        Routing and Peer tables defined in RFC 6733. It can also include
163        other implementation-specific tables used to store load
164        information. This document does not define the structure of such
165        tables.

167  3. Conventions Used in This Document

169     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
170     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
171     document are to be interpreted as described in RFC 2119 [RFC2119].

173     The RFC 2119 [RFC2119] interpretation does not apply to the words
174     listed above when they are not used in all-caps format.

176  4. Background

178  4.1. Differences between Load and Overload information

180     Previous discussions of how to solve the load-related requirements in
181     [RFC7068] have shown that people did not have an agreed-upon concept
182     of how "load" information differs from "overload" information. While
183     the two concepts are highly interrelated, there are two primary
184     differences. First, a Diameter node always has a load. At any given
185     time that load may be effectively zero, effectively fully loaded, or
186     somewhere in between. In contrast, overload is an exceptional
187     condition. A node only has overload information when it is in an
188     overloaded state. Furthermore, the relationship between a node's
189     load level and overload state at any given time may be vague. For
190     example, a node may normally operate at a "fully loaded" level, but
191     still not be considered overloaded. Another node may declare itself
192     to be "overloaded" even though it might not be fully "loaded".

194     Second, Overload information, in the form of a DOIC Overload Report
195     (OLR) [RFC7683], indicates an explicit request for action on the part
196     of the reacting node. That is, the OLR requests that the reacting
197     node reduce the offered load -- the actual traffic sent to the
198     reporting node after overload abatement and routing decisions are
199     made -- by an indicated amount (by default), or as prescribed by the
200     selected abatement algorithm. Effectively, DOIC provides a contract
201     between the reporting node and the reacting node.

203     In contrast, load is informational. That is, load information can be
204     considered a hint to the recipient node. That node may use the load
205     information for load balancing purposes, as an input to certain
206     overload abatement techniques, to make inferences about the
207     likelihood that the sending node becomes overloaded in the immediate
208     future, or for other purposes.
210     None of this prevents a Diameter node from deciding to reduce the
211     offered load based on load information. The fundamental difference
212     is that an overload report requires the reduction of offered load.
213     It is also reasonable for a Diameter node to decide to increase the
214     offered load based on load information.

216  4.2. How is Load Information Used?

218     [RFC7068] contemplates two primary uses for load information. REQ 23
219     discusses how load information might be used when performing
220     diversion as an overload abatement technique, as described in
221     [RFC7683]. When a reacting node diverts traffic away from an
222     overloaded node, it needs load information for the other candidates
223     for that traffic in order to effectively load balance the diverted
224     load between potential candidates. Otherwise, diversion has a
225     greater potential to drive other nodes into overload.

227     REQ 24 discusses how Diameter load information might be used when no
228     overload condition currently exists. Diameter nodes can use the load
229     information to make decisions to try to avoid overload conditions in
230     the first place. Normal load-balancing falls into this category, but
231     the Diameter node can take other proactive steps as well.

233     If the loaded nodes are Diameter servers (or clients in the case of
234     server-to-client transactions), both of these uses of load
235     information should be accomplished by a Diameter node that performs
236     server selection (selection of the Diameter endpoint to which the
237     request is to be routed for processing). Typically, server selection
238     is performed by a node (a client or an agent) that is an immediate
239     peer of the server. However, there are scenarios (see Appendix A)
240     where a client or proxy that is not the immediate peer to the
241     selected servers performs server selection. In this case, the client
242     or proxy enforces the server selection by inserting a Destination-
243     Host AVP.
245        For example, a Diameter node (e.g., a client) can use a redirect
246        agent to get candidate destination host addresses. The redirect
247        agent might return several destination host addresses, from which
248        the Diameter node selects one. The Diameter node can use load
249        information received from these hosts to make the selection.

251     Just as load information can be used as part of server selection, it
252     can also be used as input to the selection of the next-hop peer to
253     which a request is to be routed.

255     It should be noted that a Diameter node will need to process both
256     Load reports and Overload reports from the same Diameter node. The
257     reacting node for the Overload report always has the responsibility
258     to reduce the amount of Diameter traffic sent to the overloaded node.
259     Whether and how the reacting node uses load information to achieve
260     this is left as an implementation decision.

262  5. Solution Overview

264     The mechanism defined here for the conveyance of load information is
265     similar in some ways to the mechanism defined for DOIC and is
266     different in other ways.

268     As with DOIC, load information is conveyed by piggy-backing the Load
269     AVPs on existing Diameter applications.

271     There are two primary differences. First, there is no capability
272     negotiation process for load. The sender of the load information is
273     sending it with the expectation that any supporting nodes will use it
274     when making routing decisions. If there are no nodes that support
275     the Load mechanism then the load information is ignored.

277     The second big difference between DOIC and Load is visibility of the
278     DOIC or load information within a Diameter network. DOIC information
279     is sent end-to-end, so all nodes in the path of the answer message
280     that carries the OC-OLR AVP are able to act on the information,
281     although only one node actually consumes and reacts to the
282     report. The DOIC overload reports remain in the message all the
283     way from the reporting node to the node that is the target for the
284     answer message.

286     For the Load mechanism there are two types of Load reports and only
287     the first one is transmitted end-to-end.

289     The first type of Load report is a HOST report which contains the
290     load of the endpoint sending the answer message. This Load report is
291     carried end-to-end to enable any nodes that make server selection
292     decisions to use the load status of the sending endpoint as part of
293     the server selection decision. Unlike with DOIC, more than one node
294     may make use of the load information received.

296     The second type of Load report is a PEER report. This report is used
297     by Diameter nodes as part of the logic to select the next-hop
298     Diameter node and, as such, does not have significance beyond the
299     peer node. Load reports of type PEER are removed by the first
300     supporting Diameter node to receive the report.

302     Because Load reports can traverse Diameter nodes that do not support
303     the Load mechanism, it is necessary to include the identity of the
304     node to which the Load report applies as part of the Load report.
305     This allows a Diameter node to determine whether a Load report
306     applies to its peer or should be ignored.

308     The Load report includes a value indicating relative load of the
309     sending node, specified in a manner consistent with that defined for
310     DNS SRV [RFC2782].

312     The goal is to make it possible to use both the load values received
313     as a part of the Diameter Load mechanism and weight values received
314     as a result of a DNS SRV query. As a result, the Diameter load value
315     has a range of 0-65535. This value and DNS SRV weight values are
316     then used in a distribution algorithm similar to that specified in
317     [RFC2782].

319     The DNS SRV distribution algorithm results in more messages being
320     sent to a node with a higher weight value. As a result, a higher
321     Diameter load value indicates a LOWER load on the sending node. A
322     node that is heavily loaded sends a lower Diameter load value.
323     Stated another way, a node that has zero load would have a load value
324     of 65535. A node that is 100% loaded would have a load value of 0.

326     The distribution algorithm used by Diameter nodes supporting the
327     Diameter Load mechanism is an implementation decision but it needs to
328     result in similar behavior to the algorithm described for the use of
329     weight values specified in [RFC2782].

331     The method for calculating the load value included in the Load report
332     is also left as an implementation decision.

334     The frequency of sending Load reports is also left as an
335     implementation decision. The sending node might choose to send Load
336     reports in all messages or it might choose to only send Load reports
337     when the load value has changed by some implementation-specific
338     amount. The important consideration is that all nodes needing the
339     load information have a sufficiently accurate view of the node's
340     load.

342  5.1. Theory of Operation

344     This section outlines how the Diameter Load mechanism is expected to
345     work.

347     For this discussion, assume the following Diameter network
348     configuration:

350          ---A1---A3----S[1], S[2]...S[p]
351         /   | \ /
352        C    |  x
353         \   | / \
354          ---A2---A4----S[p+1], S[p+2] ...S[n]

356                   Figure 1: Example Diameter Network

358     Note that in this diagram, S[1], S[2] through S[p] are peers to A3.
359     S[p+1], S[p+2] through S[n] are peers to A4.

361     Also assume that the request for a Diameter transaction takes the
362     following path:

364        C     A1     A4     S[n]
365        |      |      |      |
366        |----->|----->|----->|
367        xxR    xxR    xxR

369                     Figure 2: Request Message Path

371     When sending the answer message, an endpoint node that supports the
372     Diameter Load mechanism includes its own load information in the
373     answer message. Because it is a Diameter endpoint it includes a HOST
374     Load report.

376        C     A1     A4     S[n]
377        |      |      |      |
378        |      |      |<-----|
379        |      |      xxA(Load type:HOST, source:S[n])
380        |      |      |      |

382                   Figure 3: Answer Message from S[n]

384     If Agent A4 supports the Load mechanism then A4's actions depend on
385     whether A4 is responsible for doing server selection. If A4 is not
386     doing server selection then A4 ignores the HOST Load report. If A4
387     is responsible for doing server selection then it stores the load
388     information for S[n] in its routing information for the handling of
389     subsequent request messages. In both cases A4 leaves the HOST report
390     in the message.

392        Note: If A4 does not support the Load mechanism then it will relay
393        the answer message without doing any processing on the load
394        information. In this case the load information AVPs will be
395        relayed without change.

397     A4 then calculates its own load information and inserts load
398     information AVPs of type PEER in the message before sending the
399     message to A1.

401        C     A1     A4     S[n]
402        |      |      |      |
403        |      |<-----|      |
404        |      xxA(Load type:PEER, source:A4)
405        |      xxA(Load type:HOST, source:S[n])
406        |      |      |      |

408                    Figure 4: Answer Message from A4

410     If A1 supports the Load mechanism then it processes each of the Load
411     reports it receives separately.

413     For the PEER Load report, A1 first determines if the source of the
414     report indicated in the Load report matches the DiameterIdentity of
415     the Diameter node from which the answer message was received. If the
416     identities do not match then the PEER Load report is discarded. If
417     the identities match then A1 saves the load information in its
418     routing information for routing of subsequent request messages. In
419     both cases A1 strips the PEER Load report from the message.

421     For the HOST Load report, A1's actions depend on whether A1 is
422     responsible for doing server selection. If A1 is not doing server
423     selection then A1 ignores the HOST Load report. If A1 is responsible
424     for doing server selection then it stores the load information for
425     S[n] in its routing information for the handling of subsequent
426     request messages. In both cases A1 leaves the HOST report in the
427     message.

429     A1 then calculates its own load information and inserts load
430     information AVPs of type PEER in the message before sending the
431     message to C:

433        C     A1     A4     S[n]
434        |      |      |      |
435        |<-----|      |      |
436        xxA(Load type:PEER, source:A1)
437        xxA(Load type:HOST, source:S[n])

439                    Figure 5: Answer Message from A1

441     As with A1, C processes each Load report separately.

443     For the PEER Load report, C follows the same procedure as A1 for
444     determining whether the Load report was inserted by the peer from
445     which the answer message was received. If it was, C stores the load
446     information for use when making future routing decisions.

448     For the HOST Load report, C saves the load information only if it is
449     responsible for doing server selection.

451     The load information received by all nodes is then used for routing
452     of subsequent request messages.

454  6. Load Mechanism Procedures

456     This section defines the normative behaviors for the Load mechanism.

458  6.1. Reporting Node Behavior

460     This section defines the procedures of Diameter reporting nodes that
461     generate Load reports.

463  6.1.1. Endpoint Reporting Node Behavior

465     A Diameter endpoint that supports the Diameter Load mechanism MUST
466     include a Load report of type HOST in sufficient answer messages to
467     ensure that all consumers of the load information receive timely
468     updates.

470     The Diameter endpoint MUST include its own DiameterIdentity in the
471     SourceID AVP included in the Load AVP.

473     The Diameter endpoint MUST include a Load-Type AVP of type HOST in
474     the Load AVP.

476     The Diameter endpoint MUST include its load value in the Load-Value
477     AVP in the Load AVP.
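
        As a rough illustration of the endpoint requirements above, the
        following Python sketch builds a HOST Load report as a plain
        dictionary. The linear mapping from utilization to Load-Value,
        the function names, and the dictionary layout are assumptions
        made for this example only. The draft leaves the calculation
        method to the implementation and fixes only the 0-65535 range
        and the rule that a higher value means a lower load.

```python
MAX_LOAD_VALUE = 65535  # upper bound of the Load-Value range


def load_value_from_utilization(utilization: float) -> int:
    """Map a utilization in [0.0, 1.0] to a Load-Value.

    Assumption: a simple linear mapping. 0.0 (idle) maps to 65535 and
    1.0 (fully loaded) maps to 0, matching the two fixed points named
    in the draft. Out-of-range inputs are clamped.
    """
    clamped = min(max(utilization, 0.0), 1.0)
    return round((1.0 - clamped) * MAX_LOAD_VALUE)


def build_host_load_report(own_identity: str, utilization: float) -> dict:
    """Build a HOST Load report (hypothetical in-memory representation)."""
    return {
        "Load-Type": "HOST",                                     # endpoint report
        "Load-Value": load_value_from_utilization(utilization),  # its load value
        "SourceID": own_identity,                                # own DiameterIdentity
    }
```

        For example, build_host_load_report("s1.example.com", 0.0)
        yields a report whose Load-Value is 65535. A real implementation
        would encode the report as a Grouped Load AVP rather than a
        dictionary.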
479     The LOAD value should be calculated in a way that reflects the
480     available load independently of the weight of each server, so that
481     LOAD values from different nodes can be compared accurately. Any
482     specific LOAD value needs to identify the same amount of available
483     capacity, regardless of the Diameter node that calculates the value.

485     The mechanism used to calculate the LOAD value that fulfills this
486     requirement is an implementation decision.

488     The frequency of sending Load reports is an implementation decision.

490        For instance, if the only consumer of the Load reports is the
491        endpoint's peer then the endpoint can choose to only include a
492        Load report when the load of the endpoint has changed by a
493        meaningful percentage. If there are consumers of the endpoint
494        Load report other than the endpoint's peer (this will be the case
495        if other nodes are responsible for server selection) then the
496        endpoint might choose to include Load reports in all answer
497        messages as a way of ensuring that all nodes doing server
498        selection get accurate load information.

500  6.1.2. Agent Reporting Node Behavior

502     A Diameter Agent that supports the Diameter Load mechanism MUST
503     include a PEER Load report in sufficient answer messages to ensure
504     that all users of the load information receive timely updates.

506     The Diameter Agent MUST include its own DiameterIdentity in the
507     SourceID AVP included in the Load AVP.

509     The Diameter Agent MUST include a Load-Type AVP of type PEER in the
510     Load AVP.

512     The Diameter Agent MUST include its load value in the Load-Value AVP
513     in the Load AVP.

515     The LOAD value should be calculated in a way that reflects the
516     available load independently of the weight of each agent, so that
517     LOAD values from different nodes can be compared accurately. Any
518     specific LOAD value needs to identify the same amount of available
519     capacity, regardless of the Diameter node that calculates the value.
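
        Section 6.1.1 notes that a reporting node can choose to include
        a Load report only when its load has changed by a meaningful
        amount. One possible way to implement such a policy is sketched
        below in Python. The class name, the per-peer bookkeeping, and
        the default threshold are all hypothetical; the draft leaves
        the reporting frequency to the implementation.

```python
class LoadReportPolicy:
    """Decide whether an answer sent to a given peer should carry a Load report."""

    def __init__(self, threshold: int = 3277) -> None:
        # Assumption: 3277 is roughly 5% of the 0-65535 range. The value
        # is arbitrary and only serves the example.
        self.threshold = threshold
        self.last_sent = {}  # peer identity -> last Load-Value reported to it

    def should_report(self, peer: str, current_value: int) -> bool:
        """Return True if the peer has no report yet or the value moved enough."""
        previous = self.last_sent.get(peer)
        if previous is None or abs(current_value - previous) >= self.threshold:
            # Remember what this peer was told so the next comparison is
            # made against the value the peer actually holds.
            self.last_sent[peer] = current_value
            return True
        return False
```

        Tracking the last reported value per peer is one way to make
        sure that every peer receives a report at least once, even when
        the load itself is stable.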
521     The mechanism used to calculate the LOAD value that fulfills this
522     requirement is an implementation decision.

524     The frequency of sending Load reports is an implementation decision.

526        Note: In the case of PEER Load reports it is only necessary to
527        include Load reports when the load value has changed by some
528        meaningful value, as long as the agent ensures that all peers
529        receive the report. It is also acceptable to include the Load
530        report in every answer message handled by the Diameter Agent.

532  6.2. Reacting Node Behavior

534     This section defines the behavior of Diameter nodes processing Load
535     reports.

537     A Diameter node that supports the Diameter Load mechanism MUST be
538     prepared to process Load reports of type HOST and of type PEER, as
539     indicated in the Load-Type AVP included in the Load AVP, whether
540     received in the same answer message or in multiple answer messages.

542        Note that the node needs to be able to handle messages with no
543        Load reports, messages with just a PEER Load report, messages with
544        just a HOST Load report and messages with both types of Load
545        reports.

547     If the Diameter node is not responsible for doing server selection
548     then it SHOULD ignore Load reports of type HOST.

550     If the Diameter node is responsible for doing server selection then
551     it SHOULD save the load value included in the Load-Value AVP included
552     in the Load AVP of type HOST in its routing information.

554     If the Diameter node receives a Load report of type PEER then the
555     Diameter node MUST determine if the Load report was inserted into the
556     answer message by the peer from which the message was received. This
557     is achieved by comparing the DiameterIdentity associated with the
558     connection from which the message was received with the
559     DiameterIdentity included in the SourceID AVP in the Load report.
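
        The handling of HOST and PEER reports described so far, together
        with the walk-through in Section 5.1, can be sketched in Python
        as follows. The dictionary-based report layout, the function
        name, and the routing_info structure are assumptions made for
        illustration. Identities are compared with plain string equality
        here, which a real implementation may need to refine.

```python
def process_load_reports(reports, connection_peer, does_server_selection, routing_info):
    """Apply the HOST/PEER rules to the Load reports of one answer message.

    reports: list of dicts with "Load-Type", "Load-Value" and "SourceID".
    connection_peer: DiameterIdentity of the connection the answer arrived on.
    routing_info: dict updated in place with the accepted load values.
    Returns the reports that remain in the message if it is forwarded.
    """
    kept = []
    for report in reports:
        load_type = report["Load-Type"]
        if load_type == "HOST":
            # Only nodes doing server selection store HOST load values.
            if does_server_selection:
                routing_info[("HOST", report["SourceID"])] = report["Load-Value"]
            kept.append(report)  # HOST reports are left in the message
        elif load_type == "PEER":
            # Accept the report only if the adjacent peer inserted it.
            if report["SourceID"] == connection_peer:
                routing_info[("PEER", report["SourceID"])] = report["Load-Value"]
            # PEER reports are not forwarded, whether stored or ignored.
        else:
            kept.append(report)  # assumption: unknown types pass through untouched
    return kept
```

        An agent would then add its own PEER report to the returned
        list before relaying the answer, so that the message carries at
        most one PEER report at each hop.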
561 If the Diameter node determines that the Load report of type PEER was 562 not received from the peer that sent or relayed the answer message 563 then the node MUST ignore the Load report. 565 If the Diameter node determines that the Load report of type PEER was 566 received from the peer that sent or relayed the answer message then 567 the node SHOULD save the load information in its routing information. 569 In all cases, a Diameter Agent MUST strip all Load reports of type 570 PEER received in answer messages. 572 Note: This ensures that there will be precisely one Load report of 573 type PEER, that of the Diameter node sending the message, in any 574 answer messages sent by the Diameter Agent. 576 How a Diameter node uses load information for making routing 577 decisions is an implementation decision. However, the distribution 578 algorithm MUST result in similar behavior as the algorithm described 579 for the use of weight values in [RFC2782]. 581 6.3. Extensibility 583 The Load mechanism can be extended to include additional information 584 in the Load reports. 586 Any extension may define new AVPs for use in Load reports. These new 587 AVPs SHOULD be defined to be extensions to the Load AVPs defined in 588 this document. 590 [RFC6733] defined Grouped AVP extension mechanisms apply. This 591 allows, for example, defining a new feature that is mandatory to be 592 understood even when piggybacked on an existing application. 594 As with any Diameter specification, [RFC6733] requires all new AVPs 595 to be registered with IANA. See Section 9 for the required 596 procedures. 598 6.4. Addition and Removal of Nodes 600 When a Diameter node is added, the new node will start by advertising 601 its load. Downstream nodes will need to factor the new load 602 information into load balancing decisions. The downstream nodes can 603 attempt to ensure a smooth increase of the traffic to the new node, 604 avoiding an immediate spike of traffic to the new node. 
The method 605 for handling of such a smooth increase is implementation specific but 606 it can rely on the evolution of load information received from the 607 new node and from the other nodes. 609 When removing a node in a controlled way (e.g. for maintenance 610 purpose, so outside a failure case), it might be appropriate to 611 progressively reduce the traffic to this node by routing traffic to 612 other nodes. Simple load information (load percentage) would not be 613 sufficient. The method for handling of the node removal is 614 implementation specific but it can rely on the evolution of the load 615 information received from the node to be removed. 617 7. Attribute Value Pairs 619 The section defines the AVPs required for the Load mechanism. 621 7.1. Load AVP 623 The Load AVP (AVP code TBD1) is of type Grouped and is used to convey 624 load information between Diameter nodes. 626 Load ::= < AVP Header: TBD1 > 627 [ Load-Type ] 628 [ Load-Value ] 629 [ SourceID ] 630 * [ AVP ] 632 7.2. Load-Type AVP 634 The Load-Type AVP (AVP code TBD2) is of type Enumerated. It is used 635 to convey the type of Diameter node that sent the load information. 636 The following values are defined: 638 HOST 0 The Load report is for a host. 640 PEER 1 The Load report is for a peer. 642 7.3. Load-Value AVP 644 The Load-Value AVP (AVP code TBD3) is of type Unsigned64. It is used 645 to convey relative load information about the sender of the Load 646 report. 648 The Load-Value AVP is specified in a manner similar to the weight 649 value in DNS SRV ([RFC2782]). 651 The Load-Value has a range of 0-65535. 653 A higher value indicates a lower load on the sending node. A lower 654 value indicates that the sending node is heavily loaded. 656 Stated another way, a node that has zero load would have a load 657 value of 65535. A node that is 100% loaded would have a load 658 value of 0. 660 7.4. SourceID AVP 662 The SourceID AVP is defined in [I-D.ietf-dime-agent-overload]. 
It is 663 used to identify the Diameter node that sent the Load report. 665 7.5. Attribute Value Pair flag rules 667 +---------+ 668 |AVP flag | 669 |rules | 670 +----+----+ 671 AVP Section | |MUST| 672 Attribute Name Code Defined Value Type |MUST| NOT| 673 +--------------------------------------------------------+----+----+ 674 |Load TBD1 x.1 Grouped | | V | 675 +--------------------------------------------------------+----+----+ 676 |Load-Type TBD2 x.2 Enumerated | | V | 677 +--------------------------------------------------------+----+----+ 678 |Load-Value TBD3 x.3 Unsigned64 | | V | 679 +------------------------------------------------------ -+----+----+ 680 |SourceID TBD4 x.4 DiameterIdentity | | V | 681 +--------------------------------------------------------+----+----+ 683 As described in the Diameter base protocol [RFC6733], the M-bit usage 684 for a given AVP in a given command may be defined by the application. 686 8. Security Considerations 688 Load information may be sensitive information in some cases. 689 Depending on the mechanism, an unauthorized recipient might be able 690 to infer the topology of a Diameter network from load information. 691 Load information might be useful in identifying targets for Denial of 692 Service (DoS) attacks, where a node known to be already heavily 693 loaded might be a tempting target. Load information might also be 694 useful as feedback about the success of an ongoing DoS attack. 696 Given that routing decisions are impacted by load information, there 697 is potential for negative impacts on a Diameter network caused by 698 erroneous or malicious Load reports. This includes the malicious 699 changing of load values by Diameter Agents. 701 Any load information conveyance mechanism will need to allow 702 operators to avoid sending load information to nodes that are not 703 authorized to receive it. 
Since Diameter currently only offers
704    authentication of nodes at the transport level and does not support
705    end-to-end security mechanisms, any solution that sends load
706    information to non-peer nodes requires a transitive-trust model.

708 9.  IANA Considerations

710 9.1.  AVP Codes

712    New AVPs defined by this specification are listed in
713    Section 7. All AVP codes are allocated from the
714    'Authentication, Authorization, and Accounting (AAA) Parameters' AVP
715    Codes registry.

717 9.2.  New Registries

719    This document makes no new registry requests of IANA.

721 10.  References

723 10.1.  Normative References

725    [I-D.ietf-dime-agent-overload]
726               Donovan, S., "Diameter Agent Overload",
727               draft-ietf-dime-agent-overload-02 (work in progress), August 2015.

729    [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
730               Requirement Levels", BCP 14, RFC 2119,
731               DOI 10.17487/RFC2119, March 1997.

734    [RFC2782]  Gulbrandsen, A., Vixie, P., and L. Esibov, "A DNS RR for
735               specifying the location of services (DNS SRV)", RFC 2782,
736               DOI 10.17487/RFC2782, February 2000.

739    [RFC6733]  Fajardo, V., Ed., Arkko, J., Loughney, J., and G. Zorn,
740               Ed., "Diameter Base Protocol", RFC 6733,
741               DOI 10.17487/RFC6733, October 2012.

744    [RFC7683]  Korhonen, J., Ed., Donovan, S., Ed., Campbell, B., and L.
745               Morand, "Diameter Overload Indication Conveyance",
746               RFC 7683, DOI 10.17487/RFC7683, October 2015.

749 10.2.  Informative References

751    [RFC7068]  McMurry, E. and B. Campbell, "Diameter Overload Control
752               Requirements", RFC 7068, DOI 10.17487/RFC7068, November
753               2013.

755 Appendix A.  Topology Scenarios

757    This section presents a number of Diameter topology scenarios, and
758    discusses how load information might be used in each scenario.

760 A.1.  No Agent

762    Figure 6 shows a simple client-server scenario, where a client picks
763    from a set of candidate servers available for a particular realm and
764    application.
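As an illustration only, the following sketch shows one way a client could pick among candidate servers using the most recent Load-Value received from each. It treats the value like a DNS SRV weight: a higher Load-Value means a less loaded server, so that server is chosen proportionally more often. The function name pick_server, the server names, the optional draw parameter (used to make the choice repeatable), and the fallback used when every candidate reports 0 are assumptions made for this example and are not taken from this document.

```python
import random
from bisect import bisect_right
from itertools import accumulate

MAX_LOAD_VALUE = 65535  # upper bound of the Load-Value range


def pick_server(load_values, draw=None):
    """Pick a server name, weighting each candidate by its Load-Value.

    load_values: dict mapping a (hypothetical) server name to the last
                 Load-Value received from it (0 = fully loaded,
                 65535 = no load).
    draw:        optional float in [0, 1); if omitted, a random one is used.
    """
    if not load_values:
        raise ValueError("no candidate servers")
    names = sorted(load_values)
    weights = [load_values[name] for name in names]
    if any(w < 0 or w > MAX_LOAD_VALUE for w in weights):
        raise ValueError("Load-Value out of range 0-65535")

    total = sum(weights)
    if total == 0:
        # Assumption: every candidate reports 100% load, so fall back
        # to the first name rather than failing.
        return names[0]

    if draw is None:
        draw = random.random()
    # Running sums of the weights; the chosen server is the first one
    # whose running sum exceeds the scaled draw.
    running = list(accumulate(weights))
    index = bisect_right(running, draw * total)
    return names[min(index, len(names) - 1)]
```

With Load-Values of 60000 for S1 and 20000 for S2, roughly three out of four transactions would go to S1 under this sketch.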
The client selects the server for a given transaction
765    using the load information received from each server.

767                          ------S1
768                         /
769                       C
770                         \
771                          ------S2

773                 Figure 6: Basic Client Server Scenario

775       If a node supports dynamic discovery, it will not obtain load
776       information from the nodes with which it has no Diameter
777       connection established. Nevertheless, it might take into account
778       the load information from the other nodes when deciding whether to
779       add connections to new nodes with the dynamic discovery mechanism.

781       Note: The use of dynamic connections needs to be considered.

783 A.2.  Single Agent

785    Figure 7 shows a client that sends requests to an agent. The agent
786    selects the request destination from a set of candidate servers,
787    using load information received from each server. The client does
788    not need to receive load information, since it does not select
789    between multiple agents.

791                              ------S1
792                             /
793                       C----A
794                             \
795                              ------S2

797                    Figure 7: Simple Agent Scenario

799 A.3.  Multiple Agents

801    Figure 8 shows a client selecting between multiple agents, and each
802    agent selecting from multiple servers. The client selects an agent
803    based on the load information received from each agent. Each agent
804    selects a server based on the load information received from its
805    servers.

807    This scenario adds a complication that one set of servers may be more
808    loaded than the other set. If, for example, S4 was the least loaded
809    server, C would need to know to select agent A2 to reach S4. This
810    might require C to receive load information from the servers as well
811    as the agents. Alternatively, each agent might use the load of its
812    servers as an input into calculating its own load, in effect
813    aggregating upstream load.

815    Similarly, if C sends a host-routed request [RFC7683], it needs to
816    know which agent can deliver requests to the selected server.
817    Without some special, potentially proprietary, knowledge of the
818    topology upstream of A1 and A2, C would select the agent based on the
819    normal peer selection procedures for the realm and application, and
820    perhaps consider the load information from A1 and A2. If C sends a
821    request to A1 that contains a Destination-Host AVP with a value of
822    S4, A1 will not be able to deliver the request.

824                         -----S3
825                        /
826                  ---A1------S1
827                 /
828                C
829                 \
830                  ---A2------S2
831                        \
832                         ---- S4

834                  Figure 8: Multiple Agents and Servers

836 A.4.  Linked Agents

838    Figure 9 shows a scenario similar to that of Figure 8, except that
839    the agents are linked, so that A1 can forward a request to A2, and
840    vice-versa. Each agent could receive load information from the
841    linked agent, as well as its connected servers.

843    This somewhat simplifies the complication from Figure 8, because
844    C does not necessarily need to choose a particular agent to
845    reach a particular server. But it creates a similar question of how,
846    for example, A1 might know that S4 was less loaded than S1 or S3.
847    Additionally, it creates the opportunity for sub-optimal request
848    paths. For example, [C,A1,A2,S4] vs. [C,A2,S4].

850    A likely application for linked agents is when each agent prefers to
851    route only to directly connected servers and only forwards requests
852    to another agent under exceptional circumstances. For example, A1
853    might not forward requests to A2 unless both S1 and S3 are
854    overloaded. In this case, A1 might use the load information from S1
855    and S3 to select between those, and only consider the load
856    information from A2 (and other connected agents) if it needs to
857    divert requests to different agents.

859                         -----S3
860                        /
861                  ---A1------S1
862                 /    |
863                C     |
864                 \    |
865                  ---A2------S2
866                        \
867                         ---- S4

869                        Figure 9: Linked Agents

871    Figure 10 is a variant of Figure 9.
In this case, C1 sends all
872    traffic through A1 and C2 sends all traffic through A2. By default,
873    A1 will load balance traffic between S1 and S3 and A2 will load
874    balance traffic between S2 and S4.

876    Now, if S1 and S3 are significantly more loaded than S2 and S4, A1
877    may route some C1 traffic to A2. This is a non-optimal path, but it
878    allows better load balancing between the servers. To achieve this, A1
879    needs to receive some load information from A2 about the S2/S4 load.

881                      -----S3
882                     /
883            C1----A1------S1
884                  |
885                  |
886                  |
887            C2----A2------S2
888                     \
889                      ---- S4

891                        Figure 10: Linked Agents

893 A.5.  Shared Server Pools

895    Figure 11 is similar to Figure 9, except that instead of a link
896    between agents, each agent is linked to all servers. (The links to
897    each set of servers should be interpreted as a link to each server.
898    The links are not shown separately due to the limitations of ASCII
899    art.)

901    In this scenario, each agent can select among all of the servers,
902    based on the load information from the servers. The client need only
903    be concerned with the load information of the agents.

905               ---A1---S[1], S[2]...S[p]
906              /     \ /
907             C       x
908              \     / \
909               ---A2---S[p+1], S[p+2] ...S[n]

911                     Figure 11: Shared Server Pools

913 A.6.  Agent Chains

915    The scenario in Figure 12 is similar to that of Figure 8, except
916    that, instead of the client possibly needing to select an agent that
917    can route requests to the least loaded server, in this case A1 and A2
918    need to make similar decisions when selecting between A3 and A4. As in
919    the former scenario, this could be mitigated if A3 and A4 aggregate
920    upstream loads into the load information they report downstream.

922               ---A1---A3----S[1], S[2]...S[p]
923              /   |      \ /
924             C    |       x
925              \   |      / \
926               ---A2---A4----S[p+1], S[p+2] ...S[n]

928                         Figure 12: Agent Chains

930 A.7.  Fully Meshed Layers

932    Figure 13 extends the scenario in Figure 11 by adding an extra layer
933    of agents.
But since each layer of nodes can reach any node in the
934    next layer, each node only needs to consider the load of its next-hop
935    peer.

937               ---A1---A3---S[1], S[2]...S[p]
938              /   | \ / |\ /
939             C    |  x  | x
940              \   | / \ |/ \
941               ---A2---A4---S[p+1], S[p+2] ...S[n]

943                          Figure 13: Full Mesh

945 A.8.  Partitions

947    A Diameter network with multiple servers is said to be "partitioned"
948    when only a subset of available servers can serve a particular
949    realm-routed request. For example, one group of servers may handle
950    users whose names start with "A" through "M", and another group may
951    handle "N" through "Z".

953    In such a partitioned network, nodes cannot load-balance requests
954    across partitions, since not all servers can handle the request. A
955    client, or an intermediate agent, may still be able to load-balance
956    between servers inside a partition.

958 A.9.  Active-Standby Nodes

960    The previous scenarios assume that traffic can be load balanced among
961    all peers that are eligible to handle a request. That is, the peers
962    operate in an "active-active" configuration. In an "active-standby"
963    configuration, traffic would be load-balanced among active peers.
964    Requests would only be sent to peers in a "standby" state if the
965    active peers became unavailable. For example, requests might be
966    diverted to a standby peer if one or more active peers become
967    overloaded.

969 Authors' Addresses

971    Ben Campbell
972    Oracle
973    7460 Warren Parkway # 300
974    Frisco, Texas 75034
975    USA

977    Email: ben@nostrum.com

979    Steve Donovan (editor)
980    Oracle
981    7460 Warren Parkway # 300
982    Frisco, Texas 75034
983    United States

985    Email: srdonovan@usdonovans.com

987    Jean-Jacques Trottin
988    Nokia
989    Route de Villejust
990    91620 Nozay
991    France

993    Email: jean-jacques.trottin@nokia.com