--- 1/draft-ietf-dime-doic-rate-control-07.txt 2018-03-05 12:14:50.066323656 -0800 +++ 2/draft-ietf-dime-doic-rate-control-08.txt 2018-03-05 12:14:50.106324604 -0800 @@ -1,19 +1,19 @@ Diameter Maintenance and Extensions (DIME) S. Donovan, Ed. Internet-Draft Oracle Intended status: Standards Track E. Noel -Expires: March 31, 2018 AT&T Labs - September 27, 2017 +Expires: September 6, 2018 AT&T Labs + March 5, 2018 Diameter Overload Rate Control - draft-ietf-dime-doic-rate-control-07.txt + draft-ietf-dime-doic-rate-control-08.txt Abstract This specification documents an extension to the Diameter Overload Indication Conveyance (DOIC) [RFC7683] base solution. This extension adds a new overload control abatement algorithm. This abatement algorithm allows for a DOIC reporting node to specify a maximum rate at which a DOIC reacting node sends Diameter requests to the DOIC reporting node. @@ -31,98 +31,99 @@ Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at https://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." - This Internet-Draft will expire on March 31, 2018. + This Internet-Draft will expire on September 6, 2018. Copyright Notice - Copyright (c) 2017 IETF Trust and the persons identified as the + Copyright (c) 2018 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. 
Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 - 2. Terminology and Abbreviations . . . . . . . . . . . . . . . . 4 - 3. Interaction with DOIC Report Rypes . . . . . . . . . . . . . 5 + 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 + 3. Interaction with DOIC Report Types . . . . . . . . . . . . . 5 4. Capability Announcement . . . . . . . . . . . . . . . . . . . 5 5. Overload Report Handling . . . . . . . . . . . . . . . . . . 6 5.1. Reporting Node Overload Control State . . . . . . . . . . 6 5.2. Reacting Node Overload Control State . . . . . . . . . . 6 5.3. Reporting Node Maintenance of Overload Control State . . 7 5.4. Reacting Node Maintenance of Overload Control State . . . 7 5.5. Reporting Node Behavior for Rate Abatement Algorithm . . 7 5.6. Reacting Node Behavior for Rate Abatement Algorithm . . . 8 6. Rate Abatement Algorithm AVPs . . . . . . . . . . . . . . . . 8 6.1. OC-Supported-Features AVP . . . . . . . . . . . . . . . . 8 6.1.1. OC-Feature-Vector AVP . . . . . . . . . . . . . . . . 8 6.2. OC-OLR AVP . . . . . . . . . . . . . . . . . . . . . . . 9 6.2.1. OC-Maximum-Rate AVP . . . . . . . . . . . . . . . . . 9 6.3. Attribute Value Pair Flag Rules . . . . . . . . . . . . . 9 7. Rate Based Abatement Algorithm . . . . . . . . . . . . . . . 10 7.1. Overview . . . . . . . . . . . . . . . . . . . . . . . . 10 7.2. Reporting Node Behavior . . . . . . . . . . . . . . . . . 10 7.3. Reacting Node Behavior . . . . . . . . . . . . . . . . . 11 - 7.3.1. Default Algorithm . . . . . . . . . . . . . . . . . . 11 + 7.3.1. Default Algorithm for Rate-based Control . . . . . . 
11 7.3.2. Priority Treatment . . . . . . . . . . . . . . . . . 14 7.3.3. Optional Enhancement: Avoidance of Resonance . . . . 16 8. IANA Consideration . . . . . . . . . . . . . . . . . . . . . 17 8.1. AVP Codes . . . . . . . . . . . . . . . . . . . . . . . . 17 8.2. New Registries . . . . . . . . . . . . . . . . . . . . . 17 9. Security Considerations . . . . . . . . . . . . . . . . . . . 17 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 18 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 18 11.1. Normative References . . . . . . . . . . . . . . . . . . 18 11.2. Informative References . . . . . . . . . . . . . . . . . 18 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 18 1. Introduction This document defines a new Diameter overload control abatement - algorithm. + algorithm, the "rate" algorithm. - The base Diameter overload specification [RFC7683] defines the loss + The base Diameter overload specification [RFC7683] defines the "loss" algorithm as the default Diameter overload abatement algorithm. The loss algorithm allows a reporting node to instruct a reacting node to reduce the amount of traffic sent to the reporting node by abating (diverting or throttling) a percentage of requests sent to the server. While this can effectively decrease the load handled by the server, it does not directly address cases where the rate of arrival of service requests increases quickly. If the service requests that result in Diameter transactions increase quickly then the loss algorithm cannot guarantee the load presented to the server remains below a specific rate level. The loss algorithm can be slow to protect the stability of reporting nodes when subjected to rapidly changing loads. Consider the case where a reacting node is handling 100 service requests per second, where each of these service requests results in one Diameter transaction being sent to a reporting node. 
If the reporting node is approaching an overload state, or is already in an overload state, it will send a Diameter overload report requesting a - percentage reduction in traffic sent. Assume for this discussion + percentage reduction in traffic sent when the loss algorithm is used + as the Diameter overload abatement algorithm. Assume for this discussion that the reporting node requests a 10% reduction. The reacting node will then abate (divert or throttle) ten Diameter transactions a second, sending the remaining 90 transactions per second to the reporting node. Now assume that the reacting node's service requests spike to 1000 requests per second. The reacting node will continue to honor the reporting node's request for a 10% reduction in traffic. This results, in this example, in the reacting node sending 900 Diameter transactions per second, abating the remaining 100 transactions per @@ -134,40 +135,40 @@ report requesting that the reacting node abate 91% of requests to get back to the desired 90 transactions per second. However, once the spike has abated and the service requests handled by the reacting node return to 100 per second, this will result in just 9 transactions per second being sent to the reporting node, requiring a new overload report setting the reduction percentage back to 10%. This control feedback loop has the potential to make the situation worse by causing wide fluctuations in traffic on multiple nodes in the Diameter network. - One of the benefits of a rate based algorithm is that it better - handles spikes in traffic. Instead of sending a request to reduce - traffic by a percentage, the rate approach allows the reporting node - to specify the maximum number of Diameter requests per second that - can be sent to the reporting node. For instance, in this example, - the reporting node could send a rate-based request specifying the - maximum transactions per second to be 90. 
The reacting node will - send the 90 regardless of whether it is receiving 100 or 1000 service - requests per second. + One of the benefits of a rate-based algorithm over the loss algorithm + is that it better handles spikes in traffic. Instead of sending a + request to reduce traffic by a percentage, the rate approach allows + the reporting node to specify the maximum number of Diameter requests + per second that can be sent to the reporting node. For instance, in + this example, the reporting node could send a rate-based request + specifying the maximum transactions per second to be 90. The + reacting node will send the 90 regardless of whether it is receiving + 100 or 1000 service requests per second. This document extends the base DOIC solution [RFC7683] to add support for the rate-based overload abatement algorithm. This document draws heavily on work in the SIP Overload Control working group. The definition of the rate abatement algorithm is copied almost verbatim from the SOC document [RFC7415], with changes focused on making the wording consistent with the DOIC solution and the Diameter protocol. -2. Terminology and Abbreviations +2. Terminology Diameter Node An RFC6733 Diameter Client, RFC6733 Diameter Server, or RFC6733 Diameter Agent. Diameter Endpoint An RFC6733 Diameter Client or RFC6733 Diameter Server. @@ -177,78 +178,76 @@ [RFC7683]. Reporting Node A DOIC Node that sends a DOIC overload report. Reacting Node A DOIC Node that receives and acts on a DOIC overload report. -3. Interaction with DOIC Report Rypes +3. Interaction with DOIC Report Types - As of the publication of this specification there are two DOIC report - types defined with the specification of a third in progress: + As of the publication of this specification, there are two DOIC + report types defined with the specification of a third in progress: - 1. Host - Overload of a specific Diameter Application at a specific - Diameter Node as defined in [RFC7683]. 
+ HOST_REPORT 0 Overload of a specific Diameter Application at a + specific Diameter Node as defined in [RFC7683] - 2. Realm - Overload of a specific Diameter Application at a specific - Diameter Realm as defined in [RFC7683]. + REALM_REPORT 1 Overload of a specific Diameter Application at a + specific Diameter Realm as defined in [RFC7683] - 3. Peer - Overload of a specific Diameter peer as defined in - [I-D.ietf-dime-agent-overload]. + PEER_REPORT 2 Overload of a specific Diameter peer as defined in + [I-D.ietf-dime-agent-overload] The rate algorithm MAY be selected by reporting nodes for any of these report types. It is expected that all report types defined in the future will indicate whether or not the rate algorithm can be used with that report type. 4. Capability Announcement This extension defines the rate abatement algorithm (referred to as - rate in this document) feature. Support for the rate feature will be - reflected by use of a new value, as defined in Section 6.1.1, in the - OC-Feature-Vector AVP per the rules defined in [RFC7683]. + rate in this document) feature. Support of the rate feature by the + DOIC node is announced by a new value of the OC-Feature-Vector AVP, + as described in Section 6.1.1, per the rules defined in [RFC7683]. - Note that Diameter nodes that support the rate feature will, by - definition, support both the loss and rate based abatement - algorithms. DOIC reacting nodes SHOULD indicate support for both the - loss and rate algorithms in the OC-Feature-Vector AVP. + The loss algorithm being the default algorithm supported by all nodes + that support the Diameter overload control mechanism as specified in + [RFC7683], DOIC nodes supporting the rate feature will support both + the loss and rate based abatement algorithms. - There may be local policy reasons that cause a DOIC node that - supports the rate abatement algorithm to not include it in the OC- - Feature-Vector. 
All reacting nodes, however, must continue to - include loss in the OC-Feature-Vector in order to remain compliant - with [RFC7683]. + DOIC reacting nodes supporting the rate feature MUST indicate support + for both the loss and rate algorithms in the OC-Feature-Vector AVP. + + As defined in [RFC7683], a DOIC reporting node supporting the rate + feature MUST select a single abatement algorithm in the OC-Feature- + Vector AVP and OC-Peer-Algo AVP in the answer messages sent to the DOIC reacting + nodes. A reporting node MAY select one abatement algorithm to apply to host and realm reports and a different algorithm to apply to peer reports. For host or realm reports the selected algorithm is reflected in the OC-Feature-Vector AVP sent as part of the OC-Supported-Features AVP included in answer messages for transactions where the request contained an OC-Supported-Features AVP. This is per the procedures defined in [RFC7683]. For peer reports the selected algorithm is reflected in the OC-Peer-Algo AVP sent as part of the OC-Supported-Features AVP included in answer messages for transactions where the request contained an OC-Supported-Features AVP. This is per the procedures defined in [I-D.ietf-dime-agent-overload]. - Editor's Node: The peer report specification is still under - development and, as such, the above paragraph is subject to - change. - 5. Overload Report Handling This section describes any changes to the behavior defined in [RFC7683] for handling of overload reports when the rate overload abatement algorithm is used. 5.1. Reporting Node Overload Control State A reporting node that uses the rate abatement algorithm SHOULD maintain reporting node Overload Control State (OCS) for each @@ -257,33 +256,34 @@ This is different from the behavior defined in [RFC7683] where a single loss percentage is sent to all reacting nodes. 
A reporting node SHOULD maintain OCS entries when using the rate abatement algorithm per supported Diameter application, per targeted reacting node and per report type. A rate OCS entry is identified by the tuple of Application-Id, report type and DiameterIdentity of the target of the rate OLR. - A reporting node that supports the rate abatement algorithm MUST - include the rate of its abatement algorithm in the OC-Maximum-Rate - AVP when sending a rate OLR. + A reporting node that has selected the rate overload abatement + algorithm MUST indicate the rate requested to be applied by DOIC + reacting nodes in the OC-Maximum-Rate AVP included in the OC-OLR AVP. All other elements for the OCS defined in [RFC7683] and [I-D.ietf-dime-agent-overload] also apply to the reporting node's OCS when using the rate abatement algorithm. 5.2. Reacting Node Overload Control State A reacting node that supports the rate abatement algorithm MUST indicate rate as the selected abatement algorithm in the reacting - node OCS when receiving a rate OLR. + node OCS based on the OC-Feature-Vector AVP or the OC-Peer-Algo AVP + in the received OC-Supported-Features AVP. A reacting node that supports the rate abatement algorithm MUST include the rate specified in the OC-Maximum-Rate AVP included in the OC-OLR AVP as an element of the abatement algorithm specific portion of reacting node OCS entries. All other elements for the OCS defined in [RFC7683] and [I-D.ietf-dime-agent-overload] also apply to the reporting node's OCS when using the rate abatement algorithm. @@ -296,27 +296,27 @@ A reporting node that has selected the rate abatement algorithm and enters an overload condition MUST indicate the selected rate in the resulting reporting node OCS entries. 
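The per-target state described in Section 5.1 can be pictured with a small sketch. It is only an illustration under stated assumptions: the class and method names (RateOcsEntry, ReportingNodeOcs, ensure_entry) are hypothetical and are not defined by this specification or by [RFC7683]. Only the key tuple (Application-Id, report type, DiameterIdentity of the target) and the report type values 0, 1 and 2 come from the text.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Dict, Optional, Tuple


class ReportType(IntEnum):
    # Report type values as listed in Section 3.
    HOST_REPORT = 0
    REALM_REPORT = 1
    PEER_REPORT = 2


@dataclass
class RateOcsEntry:
    """Hypothetical reporting node OCS entry for the rate algorithm."""
    # Rate to send in OC-Maximum-Rate (requests per second); None until
    # the reporting node enters an overload condition and picks a rate.
    maximum_rate: Optional[int] = None


# Key: (Application-Id, report type, DiameterIdentity of the OLR target).
OcsKey = Tuple[int, ReportType, str]


class ReportingNodeOcs:
    """Hypothetical container holding one rate OCS entry per key tuple."""

    def __init__(self) -> None:
        self._entries: Dict[OcsKey, RateOcsEntry] = {}

    def ensure_entry(self, app_id: int, report_type: ReportType,
                     target: str) -> RateOcsEntry:
        # Create the entry on first use; later calls return the same object.
        key = (app_id, report_type, target)
        if key not in self._entries:
            self._entries[key] = RateOcsEntry()
        return self._entries[key]

    def __len__(self) -> int:
        return len(self._entries)


# Usage: two reacting nodes (hypothetical identities) get separate entries,
# so each can be given its own rate.
ocs = ReportingNodeOcs()
first = ocs.ensure_entry(4, ReportType.HOST_REPORT, "client1.example.com")
first.maximum_rate = 90
second = ocs.ensure_entry(4, ReportType.HOST_REPORT, "client2.example.com")
second.maximum_rate = 40
```

Keying on the full tuple is what lets the reporting node hand out a different rate to each reacting node, in contrast to the single loss percentage of [RFC7683].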
When selecting the rate algorithm in the response to a request that contained an OC-Supported-Features AVP with an OC-Feature-Vector AVP indicating support for the rate feature, a reporting node MUST ensure that a reporting node OCS entry exists for the target of the overload report. The target is defined as follows: - o For Host reports the target is the DiameterIdentity contained in + o For Host reports, the target is the DiameterIdentity contained in the Origin-Host AVP received in the request. - o For Realm reports the target is the DiameterIdentity contained in + o For Realm reports, the target is the DiameterIdentity contained in the Origin-Realm AVP received in the request. - o For Peer reports the target is the DiameterIdentity of the + o For Peer reports, the target is the DiameterIdentity of the Diameter Peer from which the request was received. 5.4. Reacting Node Maintenance of Overload Control State When receiving an answer message indicating that the reporting node has selected the rate algorithm, a reacting node MUST indicate the rate abatement algorithm in the reacting node OCS entry for the reporting node. A reacting node receiving an overload report for the rate abatement @@ -326,22 +326,22 @@ 5.5. Reporting Node Behavior for Rate Abatement Algorithm When in an overload condition with rate selected as the overload abatement algorithm and when handling a request that contained an OC-Supported-Features AVP that indicated support for the rate abatement algorithm, a reporting node SHOULD include an OC-OLR AVP for the rate algorithm using the parameters stored in the reporting node OCS for the target of the overload report. When sending an overload report for the rate algorithm, the OC- - Maximum-Rate AVP MUST be included and the OC-Reduction-Percentage AVP - MUST NOT be included. + Maximum-Rate AVP MUST be included in the OC-OLR AVP and the OC- + Reduction-Percentage AVP MUST NOT be included. 5.6. 
Reacting Node Behavior for Rate Abatement Algorithm When determining if abatement treatment should be applied to a request being sent to a reporting node that has selected the rate overload abatement algorithm, the reacting node MAY use the algorithm detailed in Section 7. Note: Other algorithms for controlling the rate can be implemented by the reacting node as long as they result in the correct rate of @@ -357,28 +357,29 @@ 6.1. OC-Supported-Features AVP The rate algorithm does not add any new AVPs to the OC-Supported-Features AVP. The rate algorithm does add a new feature bit to be carried in the OC-Feature-Vector AVP. 6.1.1. OC-Feature-Vector AVP - This extension adds the following capabilities to the OC-Feature- - Vector AVP. + This extension adds the following capability to the OC-Feature-Vector + AVP. - OLR_RATE_ALGORITHM (0x0000000000000004) + OLR_RATE_ALGORITHM (bit 2) - When this flag is set by the overload control endpoint it - indicates that the DOIC Node supports the rate overload control - algorithm. + Bit 2 is assigned to the rate overload abatement algorithm. When + this flag is set by the overload control endpoint it indicates + that the DOIC Node supports the rate overload abatement + algorithm. 6.2. OC-OLR AVP This extension defines the OC-Maximum-Rate AVP to be an optional part of the OC-OLR AVP. OC-OLR ::= < AVP Header: TBD2 > < OC-Sequence-Number > < OC-Report-Type > [ OC-Reduction-Percentage ] @@ -466,45 +467,44 @@ When setting the maximum rate for a particular reacting node, the reporting node may need to take into account the workload (e.g., CPU load per request) of the distribution of message types from that reacting node. Furthermore, because the reacting node may prioritize the specific types of messages it sends while under overload restriction, this distribution of message types may be different from the message distribution for that reacting node under non-overload conditions (e.g., either higher or lower CPU load). 
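As a rough illustration of the workload consideration above, a reporting node might derive a maximum rate from a CPU budget and the expected message mix. This is a sketch under stated assumptions, not a method defined by this specification: the function name max_rate_for_mix, the CPU budget, the per-request costs and both message mixes are hypothetical figures chosen only to show the arithmetic.

```python
from typing import Iterable, Tuple


def max_rate_for_mix(cpu_budget_us_per_s: int,
                     mix: Iterable[Tuple[int, int]]) -> int:
    """Return a candidate OC-Maximum-Rate value in requests per second.

    cpu_budget_us_per_s: CPU time (microseconds per second) the reporting
        node is willing to spend on this reacting node (hypothetical).
    mix: (share in percent, CPU cost in microseconds per request) for each
        message type; shares are assumed to sum to 100.
    """
    # Sum of share * cost equals 100 times the average cost per request.
    weighted_cost = sum(share * cost for share, cost in mix)
    if weighted_cost <= 0:
        raise ValueError("message mix must have a positive total cost")
    # Integer arithmetic avoids floating-point rounding in the division.
    return cpu_budget_us_per_s * 100 // weighted_cost


# Hypothetical mixes: under overload the reacting node prioritizes the
# more expensive message type, so the same CPU budget supports fewer
# requests per second.
normal_mix = [(20, 4000), (80, 1000)]    # average cost 1600 us per request
overload_mix = [(50, 4000), (50, 1000)]  # average cost 2500 us per request

print(max_rate_for_mix(400_000, normal_mix))    # 250
print(max_rate_for_mix(400_000, overload_mix))  # 160
```

The two results differ only because the message mix changed, which is the effect the paragraph above asks the reporting node to account for.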
- Note that the AVP for the rate algorithm is an upper bound (in - request messages per second) on the traffic sent by the reacting node - to the reporting node. The reacting node may send traffic at a rate - significantly lower than the upper bound, for a variety of reasons. + Note that the value of the OC-Maximum-Rate AVP (in request messages per + second) for the rate algorithm provides an upper bound on the traffic + sent by the reacting node to the reporting node. In other words, when multiple reacting nodes are being controlled by - an overloaded reporting node, at any given time some reacting nodes + an overloaded reporting node, at any given time, some reporting nodes may receive requests at a rate below their target maximum Diameter request rate while others receive requests above that target rate. But the resulting request rate presented to the overloaded reporting node will converge towards the target Diameter request rate. Upon detection of overload, and the determination to invoke overload controls, the reporting node MUST follow the specifications in [RFC7683] to notify its clients of the allocated target maximum Diameter request rate and to notify them that the rate overload abatement is in effect. The reporting node MUST use the OC-Maximum-Rate AVP defined in this specification to communicate a target maximum Diameter request rate to each of its clients. 7.3. Reacting Node Behavior -7.3.1. Default Algorithm +7.3.1. Default Algorithm for Rate-based Control In determining whether or not to transmit a specific message, the reacting node can use any algorithm that limits the message rate to the OC-Maximum-Rate AVP value in units of messages per second. For ease of discussion, we define T = 1/[OC-Maximum-Rate] as the target inter-Diameter request interval. It may be strictly deterministic, or it may be probabilistic. It may, or may not, have a tolerance factor, to allow for short bursts, as long as the long term rate remains below 1/T. @@ -824,22 +824,22 @@ Erramilli, A. 
and L. Forys, "Traffic Synchronization Effects In Teletraffic Systems", 1991. [RFC7415] Noel, E. and P. Williams, "Session Initiation Protocol (SIP) Rate Control", RFC 7415, DOI 10.17487/RFC7415, February 2015. Authors' Addresses Steve Donovan (editor) Oracle - 17210 Campbell Road - Dallas, Texas 75254 + 7460 Warren Pkwy # 300 + Frisco, Texas 75034 United States Email: srdonovan@usdonovans.com Eric Noel AT&T Labs 200s Laurel Avenue Middletown, NJ 07747 United States