--- 1/draft-ietf-dime-doic-rate-control-09.txt 2018-10-03 12:13:09.812670905 -0700 +++ 2/draft-ietf-dime-doic-rate-control-10.txt 2018-10-03 12:13:09.852671872 -0700 @@ -1,124 +1,126 @@ Diameter Maintenance and Extensions (DIME) S. Donovan, Ed. Internet-Draft Oracle Intended status: Standards Track E. Noel -Expires: March 14, 2019 AT&T Labs - September 10, 2018 +Expires: April 6, 2019 AT&T Labs + October 3, 2018 Diameter Overload Rate Control - draft-ietf-dime-doic-rate-control-09.txt + draft-ietf-dime-doic-rate-control-10 Abstract This specification documents an extension to the Diameter Overload Indication Conveyance (DOIC) [RFC7683] base solution. This extension adds a new overload control abatement algorithm. This abatement algorithm allows for a DOIC reporting node to specify a maximum rate at which a DOIC reacting node sends Diameter requests to the DOIC reporting node. Requirements The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", - "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this - document are to be interpreted as described in RFC 2119 [RFC2119]. + "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in BCP + 14 [RFC2119] [RFC8174] when, and only when, they appear in all + capitals, as shown here. Status of This Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
- This Internet-Draft will expire on March 14, 2019. + This Internet-Draft will expire on April 6, 2019. Copyright Notice Copyright (c) 2018 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 - 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 + 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5 3. Interaction with DOIC Report Types . . . . . . . . . . . . . 5 - 4. Capability Announcement . . . . . . . . . . . . . . . . . . . 5 + 4. Capability Announcement . . . . . . . . . . . . . . . . . . . 6 5. Overload Report Handling . . . . . . . . . . . . . . . . . . 6 5.1. Reporting Node Overload Control State . . . . . . . . . . 6 - 5.2. Reacting Node Overload Control State . . . . . . . . . . 6 + 5.2. Reacting Node Overload Control State . . . . . . . . . . 7 5.3. Reporting Node Maintenance of Overload Control State . . 7 - 5.4. Reacting Node Maintenance of Overload Control State . . . 7 + 5.4. Reacting Node Maintenance of Overload Control State . . . 8 5.5. Reporting Node Behavior for Rate Abatement Algorithm . . 8 - 5.6. Reacting Node Behavior for Rate Abatement Algorithm . . . 8 - 6. Rate Abatement Algorithm AVPs . . . . . . . . . . . . . . . . 8 - 6.1. OC-Supported-Features AVP . . . . . . . . . . . . . . . . 8 + 5.6. Reacting Node Behavior for Rate Abatement Algorithm . . . 9 + 6. 
Rate Abatement Algorithm AVPs . . . . . . . . . . . . . . . . 9 + 6.1. OC-Supported-Features AVP . . . . . . . . . . . . . . . . 9 6.1.1. OC-Feature-Vector AVP . . . . . . . . . . . . . . . . 9 6.2. OC-OLR AVP . . . . . . . . . . . . . . . . . . . . . . . 9 - 6.2.1. OC-Maximum-Rate AVP . . . . . . . . . . . . . . . . . 9 - 6.3. Attribute Value Pair Flag Rules . . . . . . . . . . . . . 9 + 6.2.1. OC-Maximum-Rate AVP . . . . . . . . . . . . . . . . . 10 + 6.3. Attribute Value Pair Flag Rules . . . . . . . . . . . . . 10 7. Rate Based Abatement Algorithm . . . . . . . . . . . . . . . 10 - 7.1. Overview . . . . . . . . . . . . . . . . . . . . . . . . 10 - 7.2. Reporting Node Behavior . . . . . . . . . . . . . . . . . 10 - 7.3. Reacting Node Behavior . . . . . . . . . . . . . . . . . 11 - 7.3.1. Default Algorithm for Rate-based Control . . . . . . 11 - 7.3.2. Priority Treatment . . . . . . . . . . . . . . . . . 14 - 7.3.3. Optional Enhancement: Avoidance of Resonance . . . . 16 - 8. IANA Consideration . . . . . . . . . . . . . . . . . . . . . 17 - 8.1. AVP Codes . . . . . . . . . . . . . . . . . . . . . . . . 17 - 8.2. New Registries . . . . . . . . . . . . . . . . . . . . . 17 - 8.3. New DOIC report types . . . . . . . . . . . . . . . . . . 17 - 9. Security Considerations . . . . . . . . . . . . . . . . . . . 18 - 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 18 - 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 18 - 11.1. Normative References . . . . . . . . . . . . . . . . . . 18 - 11.2. Informative References . . . . . . . . . . . . . . . . . 18 - Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 19 + 7.1. Overview . . . . . . . . . . . . . . . . . . . . . . . . 11 + 7.2. Reporting Node Behavior . . . . . . . . . . . . . . . . . 11 + 7.3. Reacting Node Behavior . . . . . . . . . . . . . . . . . 12 + 7.3.1. Default Algorithm for Rate-based Control . . . . . . 12 + 7.3.2. Priority Treatment . . . . . . . . . . . . . . 
. . . 15 + 7.3.3. Optional Enhancement: Avoidance of Resonance . . . . 17 + 8. IANA Consideration . . . . . . . . . . . . . . . . . . . . . 18 + 8.1. AVP Codes . . . . . . . . . . . . . . . . . . . . . . . . 18 + 8.2. New Registries . . . . . . . . . . . . . . . . . . . . . 18 + 8.3. New DOIC report types . . . . . . . . . . . . . . . . . . 18 + 9. Security Considerations . . . . . . . . . . . . . . . . . . . 19 + 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 19 + 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 19 + 11.1. Normative References . . . . . . . . . . . . . . . . . . 19 + 11.2. Informative References . . . . . . . . . . . . . . . . . 19 + Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 20 1. Introduction This document defines a new Diameter overload control abatement algorithm, the "rate" algorithm. The base Diameter overload specification [RFC7683] defines the "loss" algorithm as the default Diameter overload abatement algorithm. The loss algorithm allows a reporting node (see Section 2) to instruct a reacting node (see Section 2) to reduce the amount of traffic sent to the reporting node by abating (diverting or throttling) a percentage of requests sent to the server. While this can effectively decrease the load handled by the server, it does not directly address cases where the rate of arrival of service requests changes quickly. For instance, if the service requests that result in Diameter transactions increase quickly then the loss algorithm cannot guarantee the load presented to the server remains below a specific - rate level. The loss algorithm can be slow to protect the stability - of reporting nodes when subjected with rapidly changing loads. The - "loss" algorithm errs both in throttling TOO MUCH when there is a dip - in offered load, and throttling NOT ENOUGH when there is a spike in + rate level. 
The loss algorithm can be slow to ensure the stability + of reporting nodes when subjected to rapidly changing loads. The + "loss" algorithm errs both in throttling too much when there is a dip + in offered load, and throttling not enough when there is a spike in offered load. Consider the case where a reacting node is handling 100 service requests per second, where each of these service requests results in one Diameter transaction being sent to a reporting node. If the reporting node is approaching an overload state, or is already in an overload state, it will send a Diameter overload report requesting a percentage reduction in traffic sent when the loss algorithm is used as the Diameter overload abatement algorithm. Assume for this discussion that the reporting node requests a 10% reduction. The reacting node @@ -149,20 +151,41 @@ One of the benefits of a rate based algorithm over the loss algorithm is that it better handles spikes in traffic. Instead of sending a request to reduce traffic by a percentage, the rate approach allows the reporting node to specify the maximum number of Diameter requests per second that can be sent to the reporting node. For instance, in this example, the reporting node could send a rate-based request specifying the maximum transactions per second to be 90. The reacting node will send the 90 regardless of whether it is receiving 100 or 1000 service requests per second. + It should be noted that one of the implications of the rate based + algorithm is that the reporting node needs to determine how it wants + to distribute its load over the set of reacting nodes from which it + is receiving traffic. For instance, if the reporting node is + receiving Diameter traffic from 10 reacting nodes and has a capacity + of 100 transactions per second then the reporting node could choose + to set the rate for each of the reacting nodes to 10 transactions per + second.
This, of course, is assuming that each of the reacting nodes + has equal performance characteristics. The reporting node could also + choose to have a high capacity reacting node send 55 transactions per + second and the remaining 9 low capacity reacting nodes send 5 + transactions per second. The ability of the reporting node to + specify the amount of traffic on a per reacting node basis implies + that the reporting node must maintain state for each of the reacting + nodes. This state includes the current allocation of Diameter + traffic to that reacting node. If the number of reacting nodes + changes, either because new nodes are added, nodes are removed from + service or nodes fail, then the reporting node will need to + redistribute the maximum Diameter transactions over the new set of + reacting nodes. + This document extends the base Diameter Overload Indication Conveyance (DOIC) solution [RFC7683] to add support for the rate based overload abatement algorithm. This document draws heavily on work in the SIP Overload Control working group. The definition of the rate abatement algorithm is copied almost verbatim from the SIP Overload Control (SOC) document [RFC7415], with changes focused on making the wording consistent with the DOIC solution and the Diameter protocol. @@ -260,22 +283,22 @@ This is different from the behavior defined in [RFC7683] where a single loss percentage is sent to all reacting nodes. A reporting node SHOULD maintain OCS entries when using the rate abatement algorithm per supported Diameter application, per targeted reacting node and per report type. A rate OCS entry is identified by the tuple of Application-Id, report type and DiameterIdentity of the target of the rate OLR. - The rate OCS entery SHOULD include the rate allocated to each - reacting note. + The rate OCS entry SHOULD include the rate allocated to the reacting + node.
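The per-reacting-node rate state described above can be illustrated with a small Python sketch. It assumes the reporting node already knows its total capacity and gives each reacting node a relative weight. The function name allocate_rates, the weighting scheme, the string used for the report type, and the example Application-Id and host names are all hypothetical and are not taken from the draft.

```python
from typing import Dict, Tuple

# Key mirrors the rate OCS tuple named in the text:
# (Application-Id, report type, DiameterIdentity of the reacting node).
OcsKey = Tuple[int, str, str]


def allocate_rates(capacity: float, weights: Dict[OcsKey, float]) -> Dict[OcsKey, float]:
    """Split capacity (requests/second) across reacting nodes in proportion to weight.

    Hypothetical helper: the draft leaves the distribution policy to the
    reporting node, so proportional weighting is only one possible choice.
    """
    if capacity < 0 or any(w < 0 for w in weights.values()):
        raise ValueError("capacity and weights must be non-negative")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one reacting node needs a positive weight")
    return {key: capacity * weight / total for key, weight in weights.items()}


APP_ID = 16777238  # illustrative Application-Id value only

# Ten equal reacting nodes sharing 100 transactions per second.
equal = allocate_rates(
    100.0, {(APP_ID, "host", f"client{i}.example.com"): 1.0 for i in range(10)}
)

# One high capacity node plus nine low capacity nodes (weights 11 and 1).
skewed_weights = {(APP_ID, "host", "big.example.com"): 11.0}
skewed_weights.update(
    {(APP_ID, "host", f"small{i}.example.com"): 1.0 for i in range(9)}
)
skewed = allocate_rates(100.0, skewed_weights)
```

When a reacting node is added or removed, calling allocate_rates again with the updated weight table yields the values that would go into the next OC-Maximum-Rate AVP sent to each node.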
A reporting node that has selected the rate overload abatement algorithm MUST indicate the rate requested to be applied by DOIC reacting nodes in the OC-Maximum-Rate AVP included in the OC-OLR AVP. All other elements for the OCS defined in [RFC7683] and [I-D.ietf-dime-agent-overload] also apply to the reporting node's OCS when using the rate abatement algorithm. 5.2. Reacting Node Overload Control State @@ -312,20 +335,27 @@ o For Host reports, the target is the DiameterIdentity contained in the Origin-Host AVP received in the request. o For Realm reports, the target is the DiameterIdentity contained in the Origin-Realm AVP received in the request. o For Peer reports, the target is the DiameterIdentity of the Diameter Peer from which the request was received. + A reporting node that receives a capability announcement from a new + reacting node, meaning a reacting node for which it does not have an + OCS entry, and that chooses the rate algorithm for that + reacting node, may need to recalculate the rate to be allocated to all + reacting nodes. Any changed rate values will be communicated in the + next OLR sent to each reacting node. + 5.4. Reacting Node Maintenance of Overload Control State When receiving an answer message indicating that the reporting node has selected the rate algorithm, a reacting node MUST indicate the rate abatement algorithm in the reacting node OCS entry for the reporting node. A reacting node receiving an overload report for the rate abatement algorithm MUST save the rate received in the OC-Maximum-Rate AVP contained in the OC-OLR AVP in the reacting node OCS entry. @@ -349,23 +379,23 @@ Maximum-Rate AVP MUST be included in the OC-OLR AVP and the OC-Reduction-Percentage AVP MUST NOT be included. 5.6.
Reacting Node Behavior for Rate Abatement Algorithm When determining if abatement treatment should be applied to a request being sent to a reporting node that has selected the rate overload abatement algorithm, the reacting node MAY use the algorithm detailed in Section 7. - Note: Other algorithms for controlling the rate can be implemented - by the reacting node as long as they result in the correct rate of - traffic being sent to the reporting node. + Other algorithms for controlling the rate MAY be implemented by + the reacting node. Any algorithm implemented MUST result in the + correct rate of traffic being sent to the reporting node. Once a determination is made by the reacting node that an individual Diameter request is to be subjected to abatement treatment then the procedures for throttling and diversion defined in [RFC7683] and [I-D.ietf-dime-agent-overload] apply. 6. Rate Abatement Algorithm AVPs 6.1. OC-Supported-Features AVP @@ -433,22 +464,21 @@ This section is pulled from [RFC7415], with minor changes needed to make it apply to the Diameter protocol. 7.1. Overview The reporting node is the one protected by the overload control algorithm defined here. The reacting node is the one that abates traffic towards the server. Following the procedures defined in [RFC7683], the reacting node and - reporting node signal one another support for rate-based overload - control. + reporting node signal their support for rate-based overload control. Then periodically, the reporting node relies on internal measurements (e.g. CPU utilization or queuing delay) to evaluate its overload state and estimate a target maximum Diameter request rate in number of requests per second (as opposed to target percent reduction in the case of loss-based abatement). When in an overloaded state, the reporting node uses the OC-OLR AVP to inform reacting nodes of its overload state and of the target Diameter request rate. 
@@ -487,36 +517,62 @@ Note that the value of OC-Maximum-Rate AVP (in request messages per second) for the rate algorithm provides an upper bound on the traffic sent by the reacting node to the reporting node. In other words, when multiple reacting nodes are being controlled by an overloaded reporting node, at any given time, some reacting nodes may receive requests at a rate below their target maximum Diameter request rate while others receive requests above that target rate. But the resulting request rate presented to the overloaded reporting node will converge - towards the target Diameter request rate. + towards the target Diameter request rate or a lower rate. Upon detection of overload, and the determination to invoke overload - controls, the reporting node MUST follow the specifications in - [RFC7683] to notify its clients of the allocated target maximum - Diameter request rate and to notify them that the rate overload - abatement is in effect. + controls, the reporting node follows the specifications in [RFC7683] + to notify its clients of the allocated target maximum Diameter + request rate and to notify them that the rate overload abatement is + in effect. - The reporting node MUST use the OC-Maximum-Rate AVP defined in this + The reporting node uses the OC-Maximum-Rate AVP defined in this specification to communicate a target maximum Diameter request rate to each of its clients. 7.3. Reacting Node Behavior 7.3.1. Default Algorithm for Rate-based Control + A reference algorithm is shown below.
+ + No priority case: + + // T: inter-transmission interval, set to 1 / OC-Maximum-Rate + // TAU: tolerance parameter + // ta: arrival time of the most recent arrival + // LCT: arrival time of last Diameter request that + // was sent to the server + // (initialized to the first arrival time) + // X: current value of the leaky bucket counter (initialized to + // TAU0) + + // After most recent arrival, calculate auxiliary variable Xp + Xp = X - (ta - LCT); + + if (Xp <= TAU) { + // Transmit Diameter request + // Update X and LCT + X = max (0, Xp) + T; + LCT = ta; + } else { + // Reject Diameter request + // Do not update X and LCT + } + In determining whether or not to transmit a specific message, the reacting node can use any algorithm that limits the message rate to the OC-Maximum-Rate AVP value in units of messages per second. For ease of discussion, we define T = 1/[OC-Maximum-Rate] as the target inter-Diameter request interval. It may be strictly deterministic, or it may be probabilistic. It may, or may not, have a tolerance factor, to allow for short bursts, as long as the long term rate remains below 1/T. The algorithm may have provisions for prioritizing traffic. @@ -572,23 +628,24 @@ tolerance to deviation of the inter-arrival time from T (the larger TAU the more tolerance to deviations from the inter-departure interval T). This deviation from the inter-departure interval influences the admitted rate burstiness, or the number of consecutive Diameter requests forwarded to the reporting node (burst size proportional to TAU over the difference between 1/T and the arrival rate). In situations where reacting nodes are configured with some knowledge - about the reporting node (e.g., operator pre-provisioning), it can be - beneficial to choose a value of TAU based on how many reacting nodes - will be sending requests to the reporting node.
+ about the reporting node and other traffic sources (e.g., operator + pre-provisioning), it can be beneficial to choose a value of TAU + based on how many reacting nodes will be sending requests to the + reporting node. Reporting nodes with a very large number of reacting nodes, each with a relatively small arrival rate, will generally benefit from a smaller value for TAU in order to limit queuing (and hence response times) at the reporting node when subjected to a sudden surge of traffic from all reacting nodes. Conversely, a reporting node with a relatively small number of reacting nodes, each with proportionally larger arrival rate, will benefit from a larger value of TAU. Once the control has been activated, at the arrival time of the k-th @@ -616,69 +673,71 @@ TAU can assume any positive real number value and is not necessarily bounded by T. TAU=4*T is a reasonable compromise between burst size and abatement rate adaptation at low offered rate. Note that specification of a value for TAU, and any communication or coordination between servers, is beyond the scope of this document. +7.3.2. Priority Treatment + A reference algorithm is shown below. 
- No priority case: + Priority case: // T: inter-transmission interval, set to 1 / OC-Maximum-Rate - // TAU: tolerance parameter + // TAU1: tolerance parameter of no priority Diameter requests + // TAU2: tolerance parameter of priority Diameter requests // ta: arrival time of the most recent arrival - // LCT: arrival time of last SIP request that was sent to the server + // LCT: arrival time of last Diameter request that + // was sent to the server // (initialized to the first arrival time) // X: current value of the leaky bucket counter (initialized to // TAU0) // After most recent arrival, calculate auxiliary variable Xp Xp = X - (ta - LCT); - if (Xp <= TAU) { - // Transmit SIP request + if (AnyRequestReceived && Xp <= TAU1) || (PriorityRequestReceived && + Xp <= TAU2 && Xp > TAU1) { + // Transmit Diameter request // Update X and LCT X = max (0, Xp) + T; LCT = ta; } else { - // Reject SIP request + // Apply abatement treatment to Diameter request // Do not update X and LCT } -7.3.2. Priority Treatment - The reacting node is responsible for applying message priority and for maintaining two categories of requests: Request candidates for reduction, requests not subject to reduction (except under extenuating circumstances when there aren't any messages in the first category that can be reduced). Accordingly, the proposed Leaky bucket implementation is modified to support priority using two thresholds for Diameter requests in the set of request candidates for reduction. With two priorities, the proposed Leaky bucket requires two thresholds TAU1 < TAU2: o All new requests would be admitted when the leaky bucket counter is at or below TAU1, o Only higher priority requests would be admitted when the leaky bucket counter is between TAU1 and TAU2, o All requests would be rejected when the bucket counter is above TAU2. - This can be generalized to n priorities using n thresholds for n>2 in - the obvious way. 
+ This can be generalized to n priorities using n thresholds for n>2. With a priority scheme that relies on two tolerance parameters (TAU2 influences the priority traffic, TAU1 influences the non-priority traffic), always set TAU1 <= TAU2 (TAU is replaced by TAU1 and TAU2). Setting both tolerance parameters to the same value is equivalent to having no priority. TAU1 influences the admitted rate the same way as TAU does when no priority is set. And the larger the difference between TAU1 and TAU2, the closer the control is to strict priority queuing. @@ -690,47 +749,20 @@ o TAU0 = 0, o TAU1 = 1/2 * TAU2, and o TAU2 = 10 * T. Note that specification of a value for TAU1 and TAU2, and any communication or coordination between servers, is beyond the scope of this document. - A reference algorithm is shown below. - - Priority case: - - // T: inter-transmission interval, set to 1 / OC-Maximum-Rate - // TAU1: tolerance parameter of no priority Diameter requests - // TAU2: tolerance parameter of priority Diameter requests - // ta: arrival time of the most recent arrival - // LCT: arrival time of last Diameter request that was sent to the server - // (initialized to the first arrival time) - // X: current value of the leaky bucket counter (initialized to - // TAU0) - - // After most recent arrival, calculate auxiliary variable Xp - Xp = X - (ta - LCT); - - if (AnyRequestReceived && Xp <= TAU1) || (PriorityRequestReceived && - Xp <= TAU2 && Xp > TAU1) { - // Transmit Diameter request - // Update X and LCT - X = max (0, Xp) + T; - LCT = ta; - } else { - // Apply abatement treatment to Diameter request - // Do not update X and LCT - } - 7.3.3. Optional Enhancement: Avoidance of Resonance As the number of reacting node sources of traffic increases and the throughput of the reporting node decreases, the maximum rate admitted by each reacting node needs to decrease, and therefore the value of T becomes larger. Under some circumstances, e.g. 
if the traffic arises very quickly simultaneously at many sources, the occupancies of each bucket can become synchronized, resulting in the admissions from each source being close in time and batched or very 'peaky' arrivals at the reporting node, which not only gives rise to control instability, @@ -742,23 +774,23 @@ two appropriate points -- at the activation of control and whenever the bucket empties -- as described below. After updating the value of the leaky bucket to X', generate a value u as follows: if X' > 0, then u=0 else if X' <= 0, then let u be set to a random value uniformly distributed between -1/2 and +1/2 - Then (only) if the arrival is admitted, increase the bucket by an - amount T + uT, which will therefore be just T if the bucket hadn't - emptied, or lie between T/2 and 3T/2 if it had. + Then (only) if the arrival is admitted, increase the bucket content + by an amount T + uT, which will therefore be just T if the bucket + hadn't emptied, or lie between T/2 and 3T/2 if it had. This randomization should also be done when control is activated, i.e. instead of simply initializing the leaky bucket counter to TAU0, initialize it to TAU0 + uT, where u is uniformly distributed as above. Since activation would have been a result of response to a request sent by the reacting node, the second term in this expression can be interpreted as being the bucket increment following that admission. This method has the following characteristics: @@ -812,35 +844,34 @@ [I-D.ietf-dime-agent-overload] Donovan, S., "Diameter Agent Overload", draft-ietf-dime-agent-overload-00 (work in progress), December 2014. [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997. - [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an - IANA Considerations Section in RFCs", RFC 5226, - DOI 10.17487/RFC5226, May 2008. [RFC6733] Fajardo, V., Ed., Arkko, J., Loughney, J., and G.
Zorn, Ed., "Diameter Base Protocol", RFC 6733, DOI 10.17487/RFC6733, October 2012. [RFC7683] Korhonen, J., Ed., Donovan, S., Ed., Campbell, B., and L. Morand, "Diameter Overload Indication Conveyance", RFC 7683, DOI 10.17487/RFC7683, October 2015. + [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC + 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, + May 2017. + 11.2. Informative References [Erramilli] Erramilli, A. and L. Forys, "Traffic Synchronization Effects In Teletraffic Systems", 1991. [RFC7415] Noel, E. and P. Williams, "Session Initiation Protocol (SIP) Rate Control", RFC 7415, DOI 10.17487/RFC7415, February 2015.
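As a companion to the reference pseudocode in Sections 7.3.1 and 7.3.2, the following Python sketch shows one way the leaky bucket with two tolerance thresholds might be written. It is an illustration under stated assumptions, not the draft's normative algorithm: the class name LeakyBucket and the method name admit are hypothetical, times are plain floating-point seconds supplied by the caller, and the optional resonance-avoidance randomization of Section 7.3.3 is deliberately left out. Because TAU1 <= TAU2, the draft's two-clause admission test reduces to comparing Xp against TAU2 for priority requests and against TAU1 for all others.

```python
from typing import Optional


class LeakyBucket:
    """Sketch of the rate abatement leaky bucket with two priority thresholds."""

    def __init__(self, max_rate: float, tau1: float, tau2: float, tau0: float = 0.0):
        if max_rate <= 0:
            raise ValueError("max_rate (OC-Maximum-Rate) must be positive")
        if tau1 > tau2:
            raise ValueError("TAU1 must not exceed TAU2")
        self.t = 1.0 / max_rate            # T: target inter-request interval
        self.tau1 = tau1                   # tolerance for non-priority requests
        self.tau2 = tau2                   # tolerance for priority requests
        self.x = tau0                      # X: bucket counter, initialized to TAU0
        self.lct: Optional[float] = None   # LCT: time of last admitted request

    def admit(self, ta: float, priority: bool = False) -> bool:
        """Return True to transmit the request arriving at time ta, False to abate it."""
        if self.lct is None:
            self.lct = ta                  # LCT is initialized to the first arrival time
        xp = self.x - (ta - self.lct)      # auxiliary variable Xp
        threshold = self.tau2 if priority else self.tau1
        if xp <= threshold:
            self.x = max(0.0, xp) + self.t  # update X and LCT only on admission
            self.lct = ta
            return True
        return False                       # abated: X and LCT are left unchanged


# Example: 4 requests/second (T = 0.25 s), TAU1 = 0.5 s, TAU2 = 1.0 s, TAU0 = 0.
bucket = LeakyBucket(max_rate=4.0, tau1=0.5, tau2=1.0)
burst = [bucket.admit(0.0) for _ in range(4)]
priority_burst = [bucket.admit(0.0, priority=True) for _ in range(3)]
later = bucket.admit(1.0)
```

In this example a simultaneous burst admits three non-priority requests before the counter passes TAU1, priority requests continue to be admitted until the counter passes TAU2, and a non-priority request arriving one second later is admitted again because the bucket has drained.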