draft-ietf-dime-doic-rate-control-04.txt   draft-ietf-dime-doic-rate-control-05.txt 
Diameter Maintenance and Extensions (DIME) S. Donovan, Ed. Diameter Maintenance and Extensions (DIME) S. Donovan, Ed.
Internet-Draft Oracle Internet-Draft Oracle
Intended status: Standards Track E. Noel Intended status: Standards Track E. Noel
Expires: April 7, 2017 AT&T Labs Expires: August 20, 2017 AT&T Labs
October 4, 2016 February 16, 2017
Diameter Overload Rate Control Diameter Overload Rate Control
draft-ietf-dime-doic-rate-control-04.txt draft-ietf-dime-doic-rate-control-05.txt
Abstract Abstract
This specification documents an extension to the Diameter Overload This specification documents an extension to the Diameter Overload
Indication Conveyance (DOIC) [RFC7683] base solution. This extension Indication Conveyance (DOIC) [RFC7683] base solution. This extension
adds a new overload control abatement algorithm. This abatement adds a new overload control abatement algorithm. This abatement
algorithm allows for a DOIC reporting node to specify a maximum rate algorithm allows for a DOIC reporting node to specify a maximum rate
at which a DOIC reacting node sends Diameter requests to the DOIC at which a DOIC reacting node sends Diameter requests to the DOIC
reporting node. reporting node.
skipping to change at page 1, line 42 skipping to change at page 1, line 42
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on April 7, 2017. This Internet-Draft will expire on August 20, 2017.
Copyright Notice Copyright Notice
Copyright (c) 2016 IETF Trust and the persons identified as the Copyright (c) 2017 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Terminology and Abbreviations . . . . . . . . . . . . . . . . 4 2. Terminology and Abbreviations . . . . . . . . . . . . . . . . 4
3. Interaction with DOIC report types . . . . . . . . . . . . . 4 3. Interaction with DOIC report types . . . . . . . . . . . . . 5
4. Capability Announcement . . . . . . . . . . . . . . . . . . . 5 4. Capability Announcement . . . . . . . . . . . . . . . . . . . 5
5. Overload Report Handling . . . . . . . . . . . . . . . . . . 6 5. Overload Report Handling . . . . . . . . . . . . . . . . . . 6
5.1. Reporting Node Overload Control State . . . . . . . . . . 6 5.1. Reporting Node Overload Control State . . . . . . . . . . 6
5.2. Reacting Node Overload Control State . . . . . . . . . . 6 5.2. Reacting Node Overload Control State . . . . . . . . . . 6
5.3. Reporting Node Maintenance of Overload Control State . . 7 5.3. Reporting Node Maintenance of Overload Control State . . 7
5.4. Reacting Node Maintenance of Overload Control State . . . 7 5.4. Reacting Node Maintenance of Overload Control State . . . 7
5.5. Reporting Node Behavior for Rate Abatement Algorithm . . 7 5.5. Reporting Node Behavior for Rate Abatement Algorithm . . 7
5.6. Reacting Node Behavior for Rate Abatement Algorithm . . . 8 5.6. Reacting Node Behavior for Rate Abatement Algorithm . . . 8
6. Rate Abatement Algorithm AVPs . . . . . . . . . . . . . . . . 8 6. Rate Abatement Algorithm AVPs . . . . . . . . . . . . . . . . 8
6.1. OC-Supported-Features AVP . . . . . . . . . . . . . . . . 8 6.1. OC-Supported-Features AVP . . . . . . . . . . . . . . . . 8
6.1.1. OC-Feature-Vector AVP . . . . . . . . . . . . . . . . 8 6.1.1. OC-Feature-Vector AVP . . . . . . . . . . . . . . . . 8
6.2. OC-OLR AVP . . . . . . . . . . . . . . . . . . . . . . . 8 6.2. OC-OLR AVP . . . . . . . . . . . . . . . . . . . . . . . 9
6.2.1. OC-Maximum-Rate AVP . . . . . . . . . . . . . . . . . 9 6.2.1. OC-Maximum-Rate AVP . . . . . . . . . . . . . . . . . 9
6.3. Attribute Value Pair flag rules . . . . . . . . . . . . . 9 6.3. Attribute Value Pair flag rules . . . . . . . . . . . . . 9
7. Rate Based Abatement Algorithm . . . . . . . . . . . . . . . 9 7. Rate Based Abatement Algorithm . . . . . . . . . . . . . . . 10
7.1. Overview . . . . . . . . . . . . . . . . . . . . . . . . 10 7.1. Overview . . . . . . . . . . . . . . . . . . . . . . . . 10
7.2. Reporting Node Behavior . . . . . . . . . . . . . . . . . 10 7.2. Reporting Node Behavior . . . . . . . . . . . . . . . . . 10
7.3. Reacting Node Behavior . . . . . . . . . . . . . . . . . 11 7.3. Reacting Node Behavior . . . . . . . . . . . . . . . . . 11
7.3.1. Default algorithm . . . . . . . . . . . . . . . . . . 11 7.3.1. Default Algorithm . . . . . . . . . . . . . . . . . . 11
7.3.2. Priority treatment . . . . . . . . . . . . . . . . . 14 7.3.2. Priority Treatment . . . . . . . . . . . . . . . . . 14
7.3.3. Optional enhancement: avoidance of resonance . . . . 16 7.3.3. Optional Enhancement: Avoidance of Resonance . . . . 16
8. IANA Consideration . . . . . . . . . . . . . . . . . . . . . 17 8. IANA Consideration . . . . . . . . . . . . . . . . . . . . . 17
8.1. AVP codes . . . . . . . . . . . . . . . . . . . . . . . . 17 8.1. AVP codes . . . . . . . . . . . . . . . . . . . . . . . . 17
8.2. New registries . . . . . . . . . . . . . . . . . . . . . 17 8.2. New registries . . . . . . . . . . . . . . . . . . . . . 17
9. Security Considerations . . . . . . . . . . . . . . . . . . . 17 9. Security Considerations . . . . . . . . . . . . . . . . . . . 17
10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 18 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 18
11. References . . . . . . . . . . . . . . . . . . . . . . . . . 18 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 18
11.1. Normative References . . . . . . . . . . . . . . . . . . 18 11.1. Normative References . . . . . . . . . . . . . . . . . . 18
11.2. Informative References . . . . . . . . . . . . . . . . . 18 11.2. Informative References . . . . . . . . . . . . . . . . . 18
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 18 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 18
skipping to change at page 3, line 18 skipping to change at page 3, line 18
algorithm. algorithm.
The base Diameter overload specification [RFC7683] defines the loss The base Diameter overload specification [RFC7683] defines the loss
algorithm as the default Diameter overload abatement algorithm. The algorithm as the default Diameter overload abatement algorithm. The
loss algorithm allows a reporting node to instruct a reacting node to loss algorithm allows a reporting node to instruct a reacting node to
reduce the amount of traffic sent to the reporting node by abating reduce the amount of traffic sent to the reporting node by abating
(diverting or throttling) a percentage of requests sent to the (diverting or throttling) a percentage of requests sent to the
server. While this can effectively decrease the load handled by the server. While this can effectively decrease the load handled by the
server, it does not directly address cases where the rate of arrival server, it does not directly address cases where the rate of arrival
of service requests increases quickly. If the service requests that of service requests increases quickly. If the service requests that
result in Diameter transactions increases quickly then the loss result in Diameter transactions increase quickly then the loss
algorithm cannot guarantee the load presented to the server remains algorithm cannot guarantee the load presented to the server remains
below a specific rate level. The loss algorithm can be slow to below a specific rate level. The loss algorithm can be slow to
protect the stability of reporting nodes when subjected to rapidly protect the stability of reporting nodes when subjected to rapidly
changing loads. changing loads.
Consider the case where a reacting node is handling 100 service Consider the case where a reacting node is handling 100 service
requests per second, where each of these service requests results in requests per second, where each of these service requests results in
one Diameter transaction being sent to a reacting node. If the one Diameter transaction being sent to a reporting node. If the
reacting node is approaching an overload state, or is already in an reporting node is approaching an overload state, or is already in an
overload state, it will send a Diameter overload report requesting a overload state, it will send a Diameter overload report requesting a
percentage reduction in traffic sent. Assume for this discussion percentage reduction in traffic sent. Assume for this discussion
that the reporting node requests a 10% reduction. The reacting node that the reporting node requests a 10% reduction. The reacting node
will then abate (diverting or throttling) ten Diameter transactions a will then abate (diverting or throttling) ten Diameter transactions a
second, sending the remaining 90 transactions per second to the second, sending the remaining 90 transactions per second to the
reacting node. reporting node.
Now assume that the reacting node's service requests spike to 1000 Now assume that the reacting node's service requests spike to 1000
requests per second. The reacting node will continue to honor the requests per second. The reacting node will continue to honor the
reporting nodes request for a 10% reduction in traffic. This reporting node's request for a 10% reduction in traffic. This
results, in this example, in the reacting node sending 900 Diameter results, in this example, in the reacting node sending 900 Diameter
transactions per second, abating the remaining 100 transactions per transactions per second, abating the remaining 100 transactions per
second. This spike in traffic is significantly higher than the second. This spike in traffic is significantly higher than the
reporting node is expecting to handle and can result in negative reporting node is expecting to handle and can result in negative
impacts to the stability of the reporting node. impacts to the stability of the reporting node.
The reporting node can, and likely would, send another overload The reporting node can, and likely would, send another overload
report requesting that the reacting node abate 91% of requests to get report requesting that the reacting node abate 91% of requests to get
back to the desired 90 transactions per second. However, once the back to the desired 90 transactions per second. However, once the
spike has abated and the reacting node handled service requests spike has abated and the reacting node handled service requests
returns to 100 per second, this will result in just 9 transactions returns to 100 per second, this will result in just 9 transactions
per second being sent to the reporting node, requiring a new overload per second being sent to the reporting node, requiring a new overload
report setting the reduction percentage back to 10%. This control report setting the reduction percentage back to 10%. This control
feedback loop has the potential to make the situation worse. feedback loop has the potential to make the situation worse by
causing wide fluctuations in traffic on multiple nodes in the
Diameter network.
One of the benefits of a rate based algorithm is that it better One of the benefits of a rate based algorithm is that it better
handles spikes in traffic. Instead of sending a request to reduce handles spikes in traffic. Instead of sending a request to reduce
traffic by a percentage, the rate approach allows the reporting node traffic by a percentage, the rate approach allows the reporting node
to specify the maximum number of Diameter requests per second that to specify the maximum number of Diameter requests per second that
can be sent to the reporting node. For instance, in this example, can be sent to the reporting node. For instance, in this example,
the reporting node could send a rate-based request specifying the the reporting node could send a rate-based request specifying the
maximum transactions per second to be 90. The reacting node will maximum transactions per second to be 90. The reacting node will
send the 90 regardless of whether it is receiving 100 or 1000 service send the 90 regardless of whether it is receiving 100 or 1000 service
requests per second. requests per second.
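The arithmetic in the example above can be checked with a short sketch. The function names are hypothetical and not part of the specification; the sketch assumes whole requests per second and an idealized reacting node that applies the requested abatement exactly.

```python
def loss_admitted(offered_rate: int, reduction_percent: int) -> int:
    """Requests per second that reach the reporting node under the loss
    algorithm, given the offered rate and the requested reduction."""
    return offered_rate * (100 - reduction_percent) // 100


def rate_admitted(offered_rate: int, maximum_rate: int) -> int:
    """Requests per second that reach the reporting node under the rate
    algorithm: never more than the requested maximum rate."""
    return min(offered_rate, maximum_rate)


# Loss algorithm: the admitted load follows the offered load.
print(loss_admitted(100, 10))    # normal load, 10% reduction
print(loss_admitted(1000, 10))   # spike, still 10% reduction
print(loss_admitted(1000, 91))   # spike, after a second overload report
print(loss_admitted(100, 91))    # spike over, 91% reduction still in force

# Rate algorithm: the admitted load is capped regardless of the spike.
print(rate_admitted(100, 90))
print(rate_admitted(1000, 90))
```

Under the loss algorithm the admitted load swings between 9 and 900 transactions per second as the offered load and the reduction percentage change, while the rate cap holds it at 90 in both cases.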
skipping to change at page 6, line 26 skipping to change at page 6, line 32
A reporting node that uses the rate abatement algorithm SHOULD A reporting node that uses the rate abatement algorithm SHOULD
maintain reporting node Overload Control State (OCS) for each maintain reporting node Overload Control State (OCS) for each
reacting node to which it sends a rate Overload Report (OLR). reacting node to which it sends a rate Overload Report (OLR).
This is different from the behavior defined in [RFC7683] where a This is different from the behavior defined in [RFC7683] where a
single loss percentage is sent to all reacting nodes. single loss percentage is sent to all reacting nodes.
A reporting node SHOULD maintain OCS entries when using the rate A reporting node SHOULD maintain OCS entries when using the rate
abatement algorithm per supported Diameter application, per targeted abatement algorithm per supported Diameter application, per targeted
reacting node and per report-type. reacting node and per report type.
A rate OCS entry is identified by the tuple of Application-Id, A rate OCS entry is identified by the tuple of Application-Id, report
report-type and DiameterID of the target of the rate OLR. type and DiameterIdentity of the target of the rate OLR.
A reporting node that supports the rate abatement algorithm MUST A reporting node that supports the rate abatement algorithm MUST
include the rate of its abatement algorithm in the OC-Maximum-Rate include the rate of its abatement algorithm in the OC-Maximum-Rate
AVP when sending a rate OLR. AVP when sending a rate OLR.
All other elements for the OCS defined in [RFC7683] and All other elements for the OCS defined in [RFC7683] and
[I-D.ietf-dime-agent-overload] also apply to the reporting node's OCS [I-D.ietf-dime-agent-overload] also apply to the reporting node's OCS
when using the rate abatement algorithm. when using the rate abatement algorithm.
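One possible way to hold the per-reacting-node state described above is a table keyed by the identifying tuple. This is only a sketch: the class names, method names, and example values are hypothetical, and a real OCS entry would also carry the other elements defined in [RFC7683].

```python
from typing import Dict, NamedTuple, Optional


class RateOcsKey(NamedTuple):
    """Identifies one rate OCS entry (hypothetical structure)."""
    application_id: int      # Diameter Application-Id
    report_type: str         # for example "host" or "realm"
    diameter_identity: str   # DiameterIdentity of the target of the rate OLR


class ReportingNodeOcs:
    """Minimal per-target store of the rate sent in OC-Maximum-Rate."""

    def __init__(self) -> None:
        self._entries: Dict[RateOcsKey, int] = {}

    def set_maximum_rate(self, key: RateOcsKey, maximum_rate: int) -> None:
        if maximum_rate < 0:
            raise ValueError("OC-Maximum-Rate is an Unsigned32 value")
        self._entries[key] = maximum_rate

    def maximum_rate_for(self, key: RateOcsKey) -> Optional[int]:
        # None means no rate OLR is in effect for this target.
        return self._entries.get(key)


# Illustrative values only: two reacting nodes receive different rates.
ocs = ReportingNodeOcs()
key_a = RateOcsKey(4, "host", "client-a.example.com")
key_b = RateOcsKey(4, "host", "client-b.example.com")
ocs.set_maximum_rate(key_a, 90)
ocs.set_maximum_rate(key_b, 30)
```

Keeping one entry per target is what lets the reporting node give each reacting node its own rate, in contrast to the single loss percentage of the base specification.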
5.2. Reacting Node Overload Control State 5.2. Reacting Node Overload Control State
skipping to change at page 7, line 50 skipping to change at page 8, line 8
5.5. Reporting Node Behavior for Rate Abatement Algorithm 5.5. Reporting Node Behavior for Rate Abatement Algorithm
When in an overload condition with rate selected as the overload When in an overload condition with rate selected as the overload
abatement algorithm and when handling a request that contained an OC- abatement algorithm and when handling a request that contained an OC-
Supported-Features AVP that indicated support for the rate abatement Supported-Features AVP that indicated support for the rate abatement
algorithm, a reporting node SHOULD include an OC-OLR AVP for the rate algorithm, a reporting node SHOULD include an OC-OLR AVP for the rate
algorithm using the parameters stored in the reporting node OCS for algorithm using the parameters stored in the reporting node OCS for
the target of the overload report. the target of the overload report.
When sending an overload report for the Rate algorithm, the OC- When sending an overload report for the rate algorithm, the OC-
Maximum-Rate AVP is included and the OC-Reduction-Percentage AVP is Maximum-Rate AVP MUST be included and the OC-Reduction-Percentage AVP
not included. MUST NOT be included.
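The AVP constraint above can be expressed as a simple check. The sketch assumes, purely for illustration, that the contents of an OC-OLR AVP are available as a mapping from AVP name to value; the function name is hypothetical.

```python
def is_well_formed_rate_olr(olr_avps: dict) -> bool:
    """Return True if a rate overload report carries OC-Maximum-Rate and
    does not carry OC-Reduction-Percentage."""
    has_rate = "OC-Maximum-Rate" in olr_avps
    has_percentage = "OC-Reduction-Percentage" in olr_avps
    return has_rate and not has_percentage


print(is_well_formed_rate_olr({"OC-Maximum-Rate": 90}))
print(is_well_formed_rate_olr({"OC-Reduction-Percentage": 10}))
```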
5.6. Reacting Node Behavior for Rate Abatement Algorithm 5.6. Reacting Node Behavior for Rate Abatement Algorithm
When determining if abatement treatment should be applied to a When determining if abatement treatment should be applied to a
request being sent to a reporting node that has selected the rate request being sent to a reporting node that has selected the rate
overload abatement algorithm, the reacting node MAY use the algorithm overload abatement algorithm, the reacting node MAY use the algorithm
detailed in Section 7. detailed in Section 7.
Note: Other algorithms for controlling the rate can be implemented Note: Other algorithms for controlling the rate can be implemented
by the reacting node as long as they result in the correct rate of by the reacting node as long as they result in the correct rate of
skipping to change at page 9, line 25 skipping to change at page 9, line 30
the OC-OLR AVP. the OC-OLR AVP.
This extension does not define new overload report types. The This extension does not define new overload report types. The
existing report types of host and realm defined in [RFC7683] apply to existing report types of host and realm defined in [RFC7683] apply to
the rate control algorithm. The peer report type defined in the rate control algorithm. The peer report type defined in
[I-D.ietf-dime-agent-overload] also applies to the rate control [I-D.ietf-dime-agent-overload] also applies to the rate control
algorithm. algorithm.
6.2.1. OC-Maximum-Rate AVP 6.2.1. OC-Maximum-Rate AVP
The OC-Maximum-Rate AVP (AVP code TBD1) is type of Unsigned32 and The OC-Maximum-Rate AVP (AVP code TBD1) is of type Unsigned32 and
describes the maximum rate that that the sender is requested to send describes the maximum rate that the sender is requested to send
traffic. This is specified in terms of requests per second. traffic. This is specified in terms of requests per second.
A value of zero indicates that no traffic is to be sent. A value of zero indicates that no traffic is to be sent.
6.3. Attribute Value Pair flag rules 6.3. Attribute Value Pair flag rules
+---------+ +---------+
|AVP flag | |AVP flag |
|rules | |rules |
+----+----+ +----+----+
skipping to change at page 11, line 32 skipping to change at page 11, line 38
[RFC7683] to notify its clients of the allocated target maximum [RFC7683] to notify its clients of the allocated target maximum
Diameter request rate and to notify them that the rate overload Diameter request rate and to notify them that the rate overload
abatement is in effect. abatement is in effect.
The reporting node MUST use the OC-Maximum-Rate AVP defined in this The reporting node MUST use the OC-Maximum-Rate AVP defined in this
specification to communicate a target maximum Diameter request rate specification to communicate a target maximum Diameter request rate
to each of its clients. to each of its clients.
7.3. Reacting Node Behavior 7.3. Reacting Node Behavior
7.3.1. Default algorithm 7.3.1. Default Algorithm
In determining whether or not to transmit a specific message, the In determining whether or not to transmit a specific message, the
reacting node can use any algorithm that limits the message rate to reacting node can use any algorithm that limits the message rate to
the OC-Maximum-Rate AVP value in units of messages per second. For the OC-Maximum-Rate AVP value in units of messages per second. For
ease of discussion, we define T = 1/[OC-Maximum-Rate] as the target ease of discussion, we define T = 1/[OC-Maximum-Rate] as the target
inter-Diameter request interval. It may be strictly deterministic, inter-Diameter request interval. It may be strictly deterministic,
or it may be probabilistic. It may, or may not, have a tolerance or it may be probabilistic. It may, or may not, have a tolerance
factor, to allow for short bursts, as long as the long term rate factor, to allow for short bursts, as long as the long term rate
remains below 1/T. remains below 1/T.
skipping to change at page 14, line 33 skipping to change at page 14, line 36
if (Xp <= TAU) { if (Xp <= TAU) {
// Transmit Diameter request // Transmit Diameter request
// Update X and LCT // Update X and LCT
X = max (0, Xp) + T; X = max (0, Xp) + T;
LCT = ta; LCT = ta;
} else { } else {
// Apply abatement treatment to Diameter request // Apply abatement treatment to Diameter request
// Do not update X and LCT // Do not update X and LCT
} }
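The fragment above can be turned into a small runnable sketch. The step that computes Xp is not shown in this excerpt; the sketch assumes the usual leaky bucket form Xp = X - (ta - LCT), with X and LCT starting at zero. The class and method names are hypothetical, and times are in seconds.

```python
class LeakyBucket:
    """Sketch of a leaky bucket limiter for Diameter requests.

    T is the target interval between requests (1 / OC-Maximum-Rate) and
    TAU is the tolerance that allows short bursts.
    """

    def __init__(self, maximum_rate: int, tau: float) -> None:
        # A maximum rate of zero means that no traffic is to be sent.
        self.blocked = maximum_rate == 0
        self.T = 0.0 if self.blocked else 1.0 / maximum_rate
        self.TAU = tau
        self.X = 0.0     # bucket fill, expressed in seconds
        self.LCT = 0.0   # last conformance time

    def admit(self, ta: float) -> bool:
        """Return True if a request arriving at time ta may be sent."""
        if self.blocked:
            return False
        # Assumed step: drain the bucket by the time elapsed since LCT.
        Xp = self.X - (ta - self.LCT)
        if Xp <= self.TAU:
            # Transmit the request and update X and LCT.
            self.X = max(0.0, Xp) + self.T
            self.LCT = ta
            return True
        # Apply abatement treatment; X and LCT are left unchanged.
        return False


# Maximum rate of 4 requests per second (T = 0.25 s), no burst tolerance.
bucket = LeakyBucket(maximum_rate=4, tau=0.0)
decisions = [bucket.admit(t) for t in (0.0, 0.125, 0.25, 1.0)]
```

With TAU set to zero the limiter admits at most one request per interval T; a larger TAU lets a short burst through while the long term rate stays bounded by 1/T.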
7.3.2. Priority treatment 7.3.2. Priority Treatment
The reacting node is responsible for applying message priority and The reacting node is responsible for applying message priority and
for maintaining two categories of requests: Request candidates for for maintaining two categories of requests: Request candidates for
reduction, requests not subject to reduction (except under reduction, requests not subject to reduction (except under
extenuating circumstances when there aren't any messages in the first extenuating circumstances when there aren't any messages in the first
category that can be reduced). category that can be reduced).
Accordingly, the proposed Leaky bucket implementation is modified to Accordingly, the proposed Leaky bucket implementation is modified to
support priority using two thresholds for Diameter requests in the support priority using two thresholds for Diameter requests in the
set of request candidates for reduction. With two priorities, the set of request candidates for reduction. With two priorities, the
skipping to change at page 16, line 28 skipping to change at page 16, line 28
Xp <= TAU2 && Xp > TAU1) { Xp <= TAU2 && Xp > TAU1) {
// Transmit Diameter request // Transmit Diameter request
// Update X and LCT // Update X and LCT
X = max (0, Xp) + T; X = max (0, Xp) + T;
LCT = ta; LCT = ta;
} else { } else {
// Apply abatement treatment to Diameter request // Apply abatement treatment to Diameter request
// Do not update X and LCT // Do not update X and LCT
} }
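A two-threshold variant can be sketched in the same way. The full admission condition is only partly visible in this excerpt, so the sketch assumes the following reading: any request is admitted while Xp <= TAU1, and a request that is not a candidate for reduction is also admitted while TAU1 < Xp <= TAU2. As before, Xp = X - (ta - LCT) is assumed, and all names are hypothetical.

```python
class PriorityLeakyBucket:
    """Sketch of a leaky bucket with two thresholds, TAU1 <= TAU2."""

    def __init__(self, maximum_rate: int, tau1: float, tau2: float) -> None:
        if maximum_rate <= 0:
            raise ValueError("this sketch requires a positive maximum rate")
        if tau1 > tau2:
            raise ValueError("TAU1 must not exceed TAU2")
        self.T = 1.0 / maximum_rate
        self.TAU1 = tau1
        self.TAU2 = tau2
        self.X = 0.0     # bucket fill, expressed in seconds
        self.LCT = 0.0   # last conformance time

    def admit(self, ta: float, reducible: bool) -> bool:
        """Return True if the request arriving at time ta may be sent.

        reducible is True for a request that is a candidate for reduction.
        """
        Xp = self.X - (ta - self.LCT)
        below_first = Xp <= self.TAU1
        between = self.TAU1 < Xp <= self.TAU2
        if below_first or (not reducible and between):
            # Transmit the request and update X and LCT.
            self.X = max(0.0, Xp) + self.T
            self.LCT = ta
            return True
        # Apply abatement treatment; X and LCT are left unchanged.
        return False


# 4 requests per second (T = 0.25 s), extra headroom of one interval
# reserved for requests that are not candidates for reduction.
pb = PriorityLeakyBucket(maximum_rate=4, tau1=0.0, tau2=0.25)
results = [
    pb.admit(0.0, reducible=True),     # bucket empty: admitted
    pb.admit(0.125, reducible=True),   # above TAU1: abated
    pb.admit(0.125, reducible=False),  # within TAU2: admitted
    pb.admit(0.125, reducible=False),  # above TAU2: abated
]
```

The gap between TAU1 and TAU2 acts as capacity that only requests not subject to reduction can use.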
7.3.3. Optional enhancement: avoidance of resonance 7.3.3. Optional Enhancement: Avoidance of Resonance
As the number of reacting node sources of traffic increases and the As the number of reacting node sources of traffic increases and the
throughput of the reporting node decreases, the maximum rate admitted throughput of the reporting node decreases, the maximum rate admitted
by each reacting node needs to decrease, and therefore the value of T by each reacting node needs to decrease, and therefore the value of T
becomes larger. Under some circumstances, e.g. if the traffic arises becomes larger. Under some circumstances, e.g. if the traffic arises
very quickly simultaneously at many sources, the occupancies of each very quickly simultaneously at many sources, the occupancies of each
bucket can become synchronized, resulting in the admissions from each bucket can become synchronized, resulting in the admissions from each
source being close in time and batched or very 'peaky' arrivals at source being close in time and batched or very 'peaky' arrivals at
the reporting node, which not only gives rise to control instability, the reporting node, which not only gives rise to control instability,
but also very poor delays and even lost messages. An appropriate but also very poor delays and even lost messages. An appropriate
 End of changes. 20 change blocks. 
28 lines changed or deleted 30 lines changed or added
