This specification documents a Diameter Overload Control (DOC) base solution and the dissemination of the overload report information.

Requirements

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

1. Introduction

This specification defines a base solution for Diameter Overload Control (DOC), referred to as Diameter Overload Indication Conveyance (DOIC). The requirements for the solution are described and discussed in the corresponding design requirements document [RFC7068]. Note that the overload control solution defined in this specification does not address all the requirements listed in [RFC7068]. A number of overload control related features are left for future specifications.

The solution defined in this specification addresses Diameter overload control between two endpoints (see Section 3.1). Furthermore, the solution is designed to apply to existing and future Diameter applications, requires no changes to the Diameter base protocol [RFC6733] and is deployable in environments where some Diameter nodes do not implement the Diameter overload control solution defined in this specification.

2. Terminology and Abbreviations

3. Solution Overview

The Diameter Overload Indication Conveyance (DOIC) mechanism allows Diameter nodes to request other nodes to perform overload abatement actions, that is, actions to reduce the load offered to the overloaded node or realm.

A Diameter node that supports DOIC is known as a "DOIC endpoint". Any Diameter node can act as a DOIC endpoint, including clients, servers, and agents. DOIC endpoints are further divided into "Reporting Nodes" and "Reacting Nodes." A reporting node requests overload abatement by sending an Overload Report (OLR) to one or more reacting nodes.

A reacting node consumes OLRs and performs whatever actions are needed to fulfill the abatement requests included in the OLRs. A reporting node may report overload on its own behalf, or on behalf of other (typically upstream) nodes. Likewise, a reacting node may perform overload abatement on its own behalf, or on behalf of other (typically downstream) nodes.

A node's role as a DOIC endpoint is independent of its Diameter role. For example, Diameter relay and proxy agents may act as DOIC endpoints, even though they are not endpoints in the Diameter sense. Since Diameter enables bi-directional applications, where Diameter servers can send requests towards Diameter clients, a given Diameter node can simultaneously act as a reporting node and a reacting node.

Likewise, a relay or proxy agent may act as a reacting node from the perspective of upstream nodes, and a reporting node from the perspective of downstream nodes.

DOIC endpoints do not generate new messages to carry DOIC related information. Rather, they "piggyback" DOIC information over existing Diameter messages by inserting new AVPs into existing Diameter requests and responses. Nodes indicate support for DOIC, and any needed DOIC parameters, by inserting an OC-Supported-Features AVP (Section 6.2) into existing requests and responses. Reporting nodes send OLRs by inserting OC-OLR AVPs (Section 6.3).

A given OLR applies to the Diameter realm and application of the Diameter message that carries it. If a reporting node supports more than one realm and/or application, it reports independently for each combination of realm and application. Similarly, OC-Feature-Vector AVPs apply to the realm and application of the enclosing message. This implies that a node may support DOIC for one application and/or realm, but not another, and may indicate different DOIC parameters for each application and realm for which it supports DOIC.
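
The scoping rule above can be pictured as a lookup table held by a reacting node, keyed by the realm and application of the message that carried each report. The following is a minimal sketch only; the class name, the dictionary representation of an OLR, and the application identifiers used in the usage note are illustrative assumptions, not part of this specification.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical key type: (realm, Diameter application id).
ScopeKey = Tuple[str, int]


@dataclass
class OverloadState:
    """Per-scope DOIC state held by a reacting node (illustrative only)."""

    reports: Dict[ScopeKey, dict] = field(default_factory=dict)

    def store(self, realm: str, app_id: int, olr: dict) -> None:
        # An OLR applies to the realm and application of the message
        # that carried it, so both form the storage key.
        self.reports[(realm, app_id)] = olr

    def lookup(self, realm: str, app_id: int) -> Optional[dict]:
        # A report stored for one realm/application pair is never
        # returned for a different pair.
        return self.reports.get((realm, app_id))
```

For example, a report stored with store("example.com", 4, olr) is returned by lookup("example.com", 4) but not by a lookup for another realm or another application identifier.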

Reacting nodes perform overload abatement according to an agreed-upon abatement algorithm. An abatement algorithm defines the meaning of the parameters of an OLR, and the procedures required for overload abatement. This document specifies a single must-support algorithm, namely the "loss" algorithm (Section 5). Future specifications may introduce new algorithms.

Overload conditions may vary in scope. For example, a single Diameter node may be overloaded, in which case reacting nodes may reasonably attempt to send throttled requests to other destinations or via other agents. On the other hand, an entire Diameter realm may be overloaded, in which case such attempts would do harm. DOIC OLRs have a concept of "report type" (Section 6.6), where the type defines such behaviors. Report types are extensible. This document defines report types for overload of a specific server, and for overload of an entire realm.

While a reporting node sends OLRs to "adjacent" reacting nodes, nodes that are "adjacent" for DOIC purposes may not be adjacent from a Diameter, or transport, perspective. For example, one or more Diameter agents that do not support DOIC may exist between a given pair of reporting and reacting nodes, as long as those agents pass unknown AVPs through unchanged. The report types described in this document can safely pass through non-supporting agents. This may not be true for report types defined in future specifications. Documents that introduce new report types MUST describe any limitations on their use across non-supporting agents.

3.1. Overload Control Endpoints (Non normative)

The overload control solution can be considered as an overlay on top of an arbitrary Diameter network. The overload control information is exchanged over a "DOIC association" established between two communication endpoints. The endpoints, namely the "reacting node" and the "reporting node", do not need to be adjacent Diameter peer nodes, nor do they need to be the end-to-end Diameter nodes in a typical "client-server" deployment with multiple intermediate Diameter agent nodes in between. The overload control endpoints are the two Diameter nodes that decide to exchange overload control information between each other. How the endpoints are determined is specific to a deployment, a Diameter node role in that deployment and local configuration.

The following diagrams illustrate the concept of Diameter Overload Endpoints and how they differ from the standard [RFC6733] defined client, server and agent Diameter nodes. The following is the key to the elements in the diagrams:

Figure 1 illustrates the most basic configuration where a client is connected directly to a server. In this case, the Diameter session and the DOIC association are both between the client and server.

   +-----+            +-----+
   |  C  |            |  S  |
   +-----+            +-----+
   | DEP |            | DEP |
   +--+--+            +--+--+
      |                  |
      |                  |
      |{Diameter Session}|
      |                  |
      |{DOIC Association}|
      |                  |

Figure 1: Basic DOIC deployment

In Figure 2 there is an agent that is not participating directly in the exchange of overload reports. As a result, the Diameter session and the DOIC association are still established between the client and the server.

   +-----+            +-----+            +-----+
   |  C  |            |  A  |            |  S  |
   +-----+            +--+--+            +-----+
   | DEP |               |               | DEP |
   +--+--+               |               +--+--+
      |                  |                  |
      |                  |                  |
      |----------{Diameter Session}---------|
      |                  |                  |
      |----------{DOIC Association}---------|
      |                  |                  |

Figure 2: DOIC deployment with non participating agent

Figure 3 illustrates the case where the client does not support Diameter overload. In this case, the DOIC association is between the agent and the server. The agent handles the role of the reacting node for overload reports generated by the server.

   +-----+            +-----+            +-----+
   |  C  |            |  A  |            |  S  |
   +--+--+            +-----+            +-----+
      |               | DEP |            | DEP |
      |               +--+--+            +--+--+
      |                  |                  |
      |                  |                  |
      |----------{Diameter Session}---------|
      |                  |                  |
      |                  |{DOIC Association}|
      |                  |                  |

Figure 3: DOIC deployment with non-DOIC client and DOIC enabled agent

In Figure 4 there is a DOIC association between the client and the agent and a second DOIC association between the agent and the server. One use case requiring this configuration is when the agent is serving as a Server Front End (SFE) for a set of servers.

   +-----+            +-----+            +-----+
   |  C  |            |  A  |            |  S  |
   +-----+            +-----+            +-----+
   | DEP |            | DEP |            | DEP |
   +--+--+            +--+--+            +--+--+
      |                  |                  |
      |                  |                  |
      |----------{Diameter Session}---------|
      |                  |                  |
      |{DOIC Association}|{DOIC Association}|
      |                  |                and/or
      |----------{DOIC Association}---------|
      |                  |                  |

Figure 4: A deployment where all nodes support DOIC

Figure 5 illustrates a deployment where some clients support Diameter overload control and some do not. In this case the agent must support Diameter overload control for the non-supporting client. It might also need to have a DOIC association with the server, as shown here, to handle overload for a server farm and/or for managing realm overload.

   +-----+            +-----+            +-----+            +-----+
   | C1  |            | C2  |            |  A  |            |  S  |
   +-----+            +--+--+            +-----+            +-----+
   | DEP |               |               | DEP |            | DEP |
   +--+--+               |               +--+--+            +--+--+
      |                  |                  |                  |
      |                  |                  |                  |
      |-------------------{Diameter Session}-------------------|
      |                  |                  |                  |
      |                  |--------{Diameter Session}-----------|
      |                  |                  |                  |
      |---------{DOIC Association}----------|{DOIC Association}|
      |                  |                  |                and/or
      |-------------------{DOIC Association}-------------------|
      |                  |                  |                  |

Figure 5: A deployment with DOIC and non-DOIC supporting clients

Editor's note: Propose to remove C1, which is already shown in a previous figure. Have this focus just on the non supporting client scenario.

Figure 6 illustrates a deployment where some agents support Diameter overload control and others do not.

   +-----+            +-----+            +-----+            +-----+
   |  C  |            |  A  |            |  A  |            |  S  |
   +-----+            +--+--+            +-----+            +-----+
   | DEP |               |               | DEP |            | DEP |
   +--+--+               |               +--+--+            +--+--+
      |                  |                  |                  |
      |                  |                  |                  |
      |-------------------{Diameter Session}-------------------|
      |                  |                  |                  |
      |                  |                  |                  |
      |---------{DOIC Association}----------|{DOIC Association}|
      |                  |                  |                and/or
      |-------------------{DOIC Association}-------------------|
      |                  |                  |                  |

Figure 6: A deployment with DOIC and non-DOIC supporting agents

Editor's note: Propose to add a non supporting server scenario.

3.2. Piggybacking Principle (Non normative)

The overload control AVPs defined in this specification have been designed to be piggybacked on top of existing application message exchanges. This is made possible by adding overload control top level AVPs, the OC-OLR AVP and the OC-Supported-Features AVP as optional AVPs into existing commands when the corresponding Command Code Format (CCF) specification allows adding new optional AVPs (see Section 1.3.4 of [RFC6733]).

Reacting nodes indicate support for DOIC by including the OC-Supported-Features AVP in all request messages originated or relayed by the Diameter node.

Reporting nodes indicate support for DOIC by including the OC-Supported-Features AVP in all answer messages originated or relayed by the Diameter node. Reporting nodes also include overload reports using the OC-OLR AVP in answer messages.

Note: There is no new Diameter application defined to carry overload related AVPs. The DOIC AVPs are carried in existing Diameter application messages.

Note that the overload control solution does not have fixed server and client roles. The endpoint role is determined based on the message type: whether the message is a request (i.e. sent by a "reacting node") or an answer (i.e. sent by a "reporting node"). Therefore, in a typical "client-server" deployment, the "client" MAY report its overload condition to the "server" for any server initiated message exchange. An example of such is the server requesting a re-authentication from a client.
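
The piggybacking principle can be sketched as two small helpers, one applied to outgoing requests and one to outgoing answers. This is an illustration under stated assumptions: messages are modeled as plain dictionaries keyed by AVP name, the function names are hypothetical, the contents of the OLR are left opaque, and the check that the request carried the OC-Supported-Features AVP follows the behavior described in Section 3.3.

```python
from typing import Optional


def add_doic_to_request(request: dict, features: int) -> dict:
    """Reacting node role: advertise DOIC support on an outgoing request."""
    out = dict(request)
    # Do not overwrite an AVP that an earlier DOIC node already inserted.
    out.setdefault("OC-Supported-Features", {"OC-Feature-Vector": features})
    return out


def add_doic_to_answer(request: dict, answer: dict, features: int,
                       olr: Optional[dict] = None) -> dict:
    """Reporting node role: piggyback DOIC AVPs on an outgoing answer."""
    out = dict(answer)
    # Only answer with DOIC AVPs when the request advertised DOIC support.
    if "OC-Supported-Features" not in request:
        return out
    out["OC-Supported-Features"] = {"OC-Feature-Vector": features}
    if olr is not None:
        out["OC-OLR"] = olr  # overload report carried in the same answer
    return out
```

Because the role follows the message direction, the same node can call the first helper for requests it sends and the second for answers it returns, which matches the absence of fixed client and server roles.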

3.3. DOIC Capability Announcement (Non normative)

The DOIC solution supports the ability for Diameter nodes to determine if other nodes in the path of a request support the solution. This capability is referred to as DOIC Capability Announcement (DCA) and is separate from Diameter Capability Exchange.

The DCA mechanism is built around the piggybacking principle used for transporting Diameter overload AVPs. This includes both DCA AVPs and AVPs associated with Diameter overload reports. This allows for the DCA AVPs to be carried across Diameter nodes that do not support the DOIC solution.

The DCA mechanism uses the OC-Supported-Features AVPs to indicate the Diameter overload features supported.

The first node in the path of a Diameter request that supports the DOIC solution inserts the OC-Supported-Features AVP in the request message. This includes an indication that it supports the loss overload abatement algorithm defined in this specification (see Section 5). This ensures that there is at least one commonly supported overload abatement algorithm between the reporting node and the reacting nodes in the path of the request.

DOIC must support deployments where Diameter clients and/or Diameter servers do not support the DOIC solution. In this scenario, it is assumed that Diameter agents that support the DOIC solution will handle overload abatement for the non-supporting clients. In this case the DOIC agent will insert the OC-Supported-Features AVP in requests that do not already contain one, telling the reporting node that there is a DOIC node that will handle overload abatement.

The reporting node inserts the OC-Supported-Features AVP in all answer messages to requests that contained the OC-Supported-Features AVP. The contents of the reporting node's OC-Supported-Features AVP indicate the set of Diameter overload features supported by the reporting node with one exception.

The reporting node only includes an indication of support for one overload abatement algorithm. This is the algorithm that the reporting node intends to use should it enter an overload condition. Reacting nodes can use the indicated overload abatement algorithm to prepare for possible overload reports.
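
The selection of a single algorithm by the reporting node can be sketched as follows. The bit values, the second algorithm, and the preference order are assumptions made for illustration; this document only requires that the loss algorithm is always available as a common choice.

```python
LOSS = 0x1               # assumed feature bit for the loss algorithm
HYPOTHETICAL_ALGO = 0x2  # stand-in for an algorithm from a future extension

# Algorithms this (hypothetical) reporting node implements.
SUPPORTED_HERE = LOSS | HYPOTHETICAL_ALGO


def select_algorithm(advertised: int,
                     preference=(HYPOTHETICAL_ALGO, LOSS)) -> int:
    """Return the single algorithm bit to place in the answer."""
    common = advertised & SUPPORTED_HERE
    for algo in preference:  # preference order is local policy
        if common & algo:
            return algo
    # No common algorithm: not expected, since loss is must-support.
    return 0
```

A request advertising only the loss bit yields the loss algorithm; a request advertising both bits yields whichever algorithm local policy prefers.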

Note that the loss algorithm defined in this document is a stateless abatement algorithm. As a result it does not require any actions by reacting nodes prior to the receipt of an overload report. Stateful abatement algorithms that base the abatement logic on a history of request messages sent might require reacting nodes to maintain state to ensure that overload reports can be properly handled.

The individual features supported by the DOIC nodes are indicated in the OC-Feature-Vector AVP. Any semantics associated with the features will be defined in extension specifications that introduce the features.

The DCA mechanism must also support the scenario where the sets of features supported by the sender of a request and by agents in the path of a request differ. In this case, the agent updates the OC-Supported-Features AVP to reflect the mixture of the two sets of supported features.

The logic to determine the content of the modified OC-Supported-Features AVP is out-of-scope for this specification and is left to implementation decisions. Care must be taken in doing so not to introduce interoperability issues for downstream or upstream DOIC nodes.

3.4. DOIC Overload Condition Reporting (Non normative)

As with DOIC Capability Announcement, Overload Condition Reporting uses new AVPs (Section 6.3) to indicate an overload condition.

The OC-OLR AVP is referred to as an overload report. The OC-OLR AVP includes the type of report, an overload report ID, the length of time that the report is valid and abatement algorithm specific AVPs.

Two types of overload reports are defined in this document, host reports and realm reports.

Host reports apply to traffic that is sent to a specific Diameter host. This applies to requests that contain a Destination-Host AVP whose DiameterIdentity matches that of the overload report. These requests are referred to as host-routed requests. A host report also applies to realm-routed requests, that is, requests that do not have a Destination-Host AVP, when the selected route for the request is a connection to the impacted host.

Realm reports apply to realm-routed requests for a specific realm as indicated in the Destination-Realm AVP.
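
The applicability rules for host and realm reports can be expressed as a single predicate. This is a sketch under assumptions: the report is modeled as a dictionary with hypothetical "type", "host" and "realm" keys, and the selected_peer argument stands for the host that local routing has chosen for a realm-routed request.

```python
from typing import Optional


def olr_applies(report: dict, request: dict,
                selected_peer: Optional[str] = None) -> bool:
    """Decide whether an overload report covers a given request."""
    dest_host = request.get("Destination-Host")
    dest_realm = request.get("Destination-Realm")

    if report["type"] == "host":
        if dest_host is not None:
            # Host-routed request: compare against the reported host.
            return dest_host == report["host"]
        # Realm-routed request: covered only when the selected route
        # is a connection to the impacted host.
        return selected_peer == report["host"]

    if report["type"] == "realm":
        # Realm reports cover realm-routed requests for the realm.
        return dest_host is None and dest_realm == report["realm"]

    return False  # unknown report type: treated as not applicable here
```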

Reporting nodes are responsible for determining the need for a reduction of traffic. The method for making this determination is implementation specific and depends on the type of overload report being generated. A host report, for instance, will generally be generated by tracking utilization of resources required by the host to handle transactions for the Diameter application. A realm report will generally impact the traffic sent to multiple hosts and, as such, will typically require tracking the capacity of the servers able to handle realm-routed requests for the application.

Once a reporting node determines the need for a reduction in traffic, it uses the DOIC defined AVPs to report on the condition. These AVPs are included in answer messages sent or relayed by the reporting node. The reporting node indicates the overload abatement algorithm that is to be used to handle the traffic reduction in the OC-Supported-Features AVP. The OC-OLR AVP is used to communicate information about the requested reduction.

Reacting nodes, upon receipt of an overload report, are responsible for applying the abatement algorithm to traffic impacted by the overload report. The method used for that abatement is dependent on the abatement algorithm. The loss abatement algorithm is defined in this document (Section 5). Other abatement algorithms can be defined in extensions to the DOIC solution.

As the conditions that lead to the generation of the overload report change, the reporting node can send new overload reports requesting greater reduction if the condition gets worse or less reduction if the condition improves. The reporting node sends an overload report with a duration of zero to indicate that the overload condition has ended and that use of the abatement algorithm is no longer needed.

The reacting node also determines when the overload report expires based on the OC-Validity-Duration AVP in the overload report and stops applying the abatement algorithm when the report expires.
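
The report lifetime handling can be sketched with a small timer object on the reacting node. The class name is hypothetical, and expressing the validity duration in seconds is an assumption of this sketch; the normative definition of the AVP is given elsewhere in this document.

```python
import time
from typing import Optional


class ActiveReport:
    """Tracks how long a received overload report remains in force."""

    def __init__(self, validity_seconds: float, now: Optional[float] = None):
        start = time.monotonic() if now is None else now
        # A validity of zero expires immediately, which models the
        # "overload condition has ended" signal.
        self.expires_at = start + validity_seconds

    def is_active(self, now: Optional[float] = None) -> bool:
        current = time.monotonic() if now is None else now
        return current < self.expires_at
```

A reacting node would replace the stored object whenever a newer report arrives and would stop applying the abatement algorithm as soon as is_active() returns False. The optional now argument exists only to make the behavior easy to test.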

3.5. DOIC Extensibility (Non normative)

The DOIC solution is designed to be extensible. This extensibility is based on existing Diameter based extensibility mechanisms.

There are multiple categories of extensions that are expected. This includes the definition of new overload abatement algorithms, the definition of new report types and new definitions of the scope of messages impacted by an overload report.

The DOIC solution uses the OC-Supported-Features AVP for DOIC nodes to communicate supported features. The specific features supported by the DOIC node are indicated in the OC-Feature-Vector AVP. DOIC extensions must define new values for the OC-Feature-Vector AVP. DOIC extensions also have the ability to add new AVPs to the OC-Supported-Features AVP, if additional information about the new feature is required to be communicated.

Overload abatement algorithms use the OC-OLR AVP to communicate overload occurrences. This AVP can also be extended to add new AVPs allowing a reporting node to communicate additional information about handling an overload condition.

If necessary, new extensions can also define new top-level AVPs. It is, however, recommended that DOIC extensions use the OC-Supported-Features and OC-OLR AVPs to carry all DOIC related AVPs.

3.6. Simplified Example Architecture (Non normative)

Figure 7 illustrates the simplified architecture for Diameter overload information conveyance. See Section 3.1 for more discussion and details how different Diameter nodes fit into the architecture from the DOIC point of view.


 Realm X                                  Same or other Realms
<--------------------------------------> <---------------------->


   +--^-----+                 : (optional) :
   |Diameter|                 :            :
   |Server A|--+     .--.     : +---^----+ :     .--.
   +--------+  |   _(    `.   : |Diameter| :   _(    `.   +---^----+
               +--(        )--:-|  Agent |-:--(        )--|Diameter|
   +--------+  | ( `  .  )  ) : +-----^--+ : ( `  .  )  ) | Client |
   |Diameter|--+  `--(___.-'  :            :  `--(___.-'  +-----^--+
   |Server B|                 :            :
   +---^----+                 :            :

                       End-to-end Overload Indication
          1)  <----------------------------------------------->
                          Diameter Application Y

               Overload Indication A    Overload Indication A'
          2)  <----------------------> <---------------------->
              standard base protocol   standard base protocol

Figure 7: Simplified architecture choices for overload indication delivery

In Figure 7, the Diameter overload indication can be conveyed (1) end-to-end between servers and clients or (2) between servers and a Diameter agent inside the realm and then between the Diameter agent and the clients, when the Diameter agent acts as a back-to-back agent for DOIC purposes.

3.7. Considerations for Applications Integrating the DOIC Solution (Non normative)

This section outlines considerations to be taken into account when integrating the DOIC solution into Diameter applications.

3.7.1. Application Classification (Non normative)

The following is a classification of Diameter applications and requests. This discussion is meant to document factors that play into decisions made by the Diameter entity responsible for handling overload reports.

Section 8.1 of [RFC6733] defines two state machines that imply two types of applications, session-less and session-based applications. The primary difference between these types of applications is the lifetime of Session-Ids.

For session-based applications, the Session-Id is used to tie multiple requests into a single session.

In session-less applications, the lifetime of the Session-Id is a single Diameter transaction, i.e. the session is implicitly terminated after a single Diameter transaction and a new Session-Id is generated for each Diameter request.

For the purposes of this discussion, session-less applications are further divided into two types of applications:

Stateless applications:: Requests within a stateless application have no relationship to each other. The 3GPP defined S13 application [S13] is an example of a stateless application, where only a Diameter command is defined between a client and a server and no state is maintained between two consecutive transactions.
Pseudo-session applications:: Applications that do not rely on the Session-Id AVP for correlation of application messages related to the same session but use other session-related information in the Diameter requests for this purpose. The 3GPP defined Cx application [Cx] is an example of a pseudo-session application.

The Credit-Control application defined in [RFC4006] is an example of a Diameter session-based application.

The handling of overload reports must take the type of application into consideration, as discussed in Section 3.7.2.

3.7.2. Application Type Overload Implications (Non normative)

This section discusses considerations for mitigating overload reported by a Diameter entity. This discussion focuses on the type of application. Section 3.7.3 discusses considerations for handling various request types when the target server is known to be in an overloaded state.

These discussions assume that the strategy for mitigating the reported overload is to reduce the overall workload sent to the overloaded entity. The concept of applying overload treatment to requests targeted for an overloaded Diameter entity is inherent to this discussion. The method used to reduce offered load is not specified here but could include routing requests to another Diameter entity known to be able to handle them, or it could mean rejecting certain requests. For a Diameter agent, rejecting requests will usually mean generating appropriate Diameter error responses. For a Diameter client, rejecting requests will depend upon the application. For example, it could mean giving an indication to the entity requesting the Diameter service that the network is busy and to try again later.

Stateless applications:: By definition there is no relationship between individual requests in a stateless application. As a result, when a request is sent or relayed to an overloaded Diameter entity - either a Diameter Server or a Diameter Agent - the sending or relaying entity can choose to apply the overload treatment to any request targeted for the overloaded entity.
Pseudo-session applications:: For pseudo-session applications, there is an implied ordering of requests. As a result, decisions about which requests towards an overloaded entity to reject could take the command code of the request into consideration. This generally means that transactions later in the sequence of transactions should be given more favorable treatment than messages earlier in the sequence. This is because more work has already been done by the Diameter network for those transactions that occur later in the sequence. Rejecting them could result in increasing the load on the network as the transactions earlier in the sequence might also need to be repeated.
Session-based applications:: Overload handling for session-based applications must take into consideration the work load associated with setting up and maintaining a session. As such, the entity sending requests towards an overloaded Diameter entity for a session-based application might tend to reject new session requests prior to rejecting intra-session requests. In addition, session ending requests might be given a lower probability of being rejected, as rejecting session ending requests could result in session status being out of sync between the Diameter clients and servers. Application designers who decide to reject mid-session requests will need to consider whether the rejection invalidates the session and any resulting session clean-up procedures.

3.7.3. Request Transaction Classification (Non normative)

Independent Request:: An independent request is not correlated to any other requests and, as such, the lifetime of the session-id is constrained to an individual transaction.
Session-Initiating Request:: A session-initiating request is the initial message that establishes a Diameter session. The ACR message defined in [RFC6733] is an example of a session-initiating request.
Correlated Session-Initiating Request:: There are cases when multiple session-initiating requests must be correlated and managed by the same Diameter server. It is notably the case in the 3GPP PCC architecture [PCC], where multiple apparently independent Diameter application sessions are actually correlated and must be handled by the same Diameter server.
Intra-Session Request:: An intra-session request is a request that uses the same Session-Id as the one used in a previous request. An intra-session request generally needs to be delivered to the server that handled the session creating request for the session. The STR message defined in [RFC6733] is an example of an intra-session request.
Pseudo-Session Requests:: Pseudo-session requests are independent requests and do not use the same Session-Id but are correlated by other session-related information contained in the request. There exist Diameter applications that define an expected ordering of transactions. This sequencing of independent transactions results in a pseudo session. The AIR, MAR and SAR requests in the 3GPP defined Cx [Cx] application are examples of pseudo-session requests.

3.7.4. Request Type Overload Implications (Non normative)

The request classes identified in Section 3.7.3 have implications on decisions about which requests should be throttled first. The following list of request treatment regarding throttling is provided as guidelines for application designers when implementing the Diameter overload control mechanism described in this document. The exact behavior regarding throttling is a matter of local policy, unless specifically defined for the application.

Independent requests:: Independent requests can be given equal treatment when making throttling decisions.
Session-initiating requests:: Session-initiating requests represent more work than independent or intra-session requests. Moreover, session-initiating requests are typically followed by other session-related requests. As such, as the main objective of the overload control is to reduce the total number of requests sent to the overloaded entity, throttling decisions might favor allowing intra-session requests over session-initiating requests. Individual session-initiating requests can be given equal treatment when making throttling decisions.
Correlated session-initiating requests:: A request that results in a new binding, where the binding is used for routing of subsequent session-initiating requests to the same server, represents more work load than other requests. As such, these requests might be throttled more frequently than other request types.
Pseudo-session requests:: Throttling decisions for pseudo-session requests can take into consideration where individual requests fit into the overall sequence of requests within the pseudo session. Requests that are earlier in the sequence might be throttled more aggressively than requests that occur later in the sequence.
Intra-session requests:: There are two classes of intra-session requests. The first class consists of requests that terminate a session. The second one contains the set of requests that are used by the Diameter client and server to maintain the ongoing session state. Session terminating requests should be throttled less aggressively in order to gracefully terminate sessions, allow clean-up of the related resources (e.g. session state) and remove the need for other intra-session requests, reducing the session management impact on the overloaded entity. The default handling of other intra-session requests might be to treat them equally when making throttling decisions. There might also be application level considerations whether some request types are favored over others.

4. Solution Procedures (Normative)

This section outlines the normative behavior associated with the DOIC solution.

4.1. Capability Announcement (Normative)

This section defines DOIC Capability Announcement (DCA) behavior.

The DCA procedures are used to indicate support for DOIC and support for DOIC features. The DOIC features include the overload abatement algorithms supported. They might also include new report types or other extensions documented in the future.

Diameter nodes indicate support for DOIC by including the OC-Supported-Features AVP in messages sent or handled by the node.

Diameter agents that support DOIC MUST ensure that all messages have the OC-Supported-Features AVP. If a message handled by the DOIC agent does not include the OC-Supported-Features AVP then the DOIC agent inserts the AVP. If the message already has the AVP then the agent either leaves it unchanged in the relayed message or modifies it to reflect a mixed set of DOIC features.

4.1.1. Reacting Node Behavior (Normative)

A reacting node MUST include the OC-Supported-Features AVP in all request messages.

A reacting node MUST include the OC-Feature-Vector AVP with an indication of the loss algorithm.

A reacting node SHOULD indicate support for all other DOIC features it supports.

An OC-Supported-Features AVP in answer messages indicates there is a reporting node for the transaction. The reacting node MAY take action based on the features indicated in the OC-Feature-Vector AVP.

Note that the loss abatement algorithm is the only feature described in this document and it does not require action to be taken by the reacting node except when the answer message also has an overload report. This behavior is described in Section 4.2 and Section 5.

4.1.2. Reporting Node Behavior (Normative)

Upon receipt of a request message, a reporting node determines if there is a reacting node for the transaction based on the presence of the OC-Supported-Features AVP.

Based on the content of the OC-Supported-Features AVP in the request message, the reporting node knows what overload control functionality is supported by the reacting node(s). The reporting node then acts accordingly for the subsequent answer messages it initiates.

If the request message contains an OC-Supported-Features AVP then the reporting node MUST include the OC-Supported-Features AVP in the answer message for that transaction.

The reporting node MUST indicate support for one and only one abatement algorithm in the OC-Feature-Vector AVP. The abatement algorithm included MUST be from the set of abatement algorithms contained in the request message's OC-Supported-Features AVP. The abatement algorithm included indicates the abatement algorithm the reporting node wants the reacting node to use when the reporting node enters an overload condition.

The reporting node MUST NOT change the selected algorithm during a period of time that it is in an overload condition and, as a result, is sending OC-OLR AVPs in answer messages.

The reporting node SHOULD indicate support for other DOIC features it supports and that apply to the transaction.

Note that not all DOIC features will apply to all Diameter applications or deployment scenarios. The features included in the OC-Feature-Vector AVP are based on local reporting node policy.

The reporting node MUST NOT include the OC-Supported-Features AVP, OC-OLR AVP or any other overload control AVPs defined in extension drafts in response messages for transactions where the request message does not include the OC-Supported-Features AVP. Lack of the OC-Supported-Features AVP in the request message indicates that there is no reacting node for the transaction.
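The selection rules above can be illustrated with a minimal sketch. The function name, the `preference` parameter and the bit value 0x2 used in the usage note are hypothetical and are not defined by this document; only OLR_DEFAULT_ALGO comes from Section 6.2. The sketch covers abatement algorithm bits only and ignores other feature bits.

```python
from typing import Optional, Sequence

OLR_DEFAULT_ALGO = 0x0000000000000001  # loss algorithm bit (Section 6.2)


def select_answer_algorithm(
    has_supported_features: bool,
    request_vector: Optional[int],
    preference: Sequence[int] = (OLR_DEFAULT_ALGO,),
) -> Optional[int]:
    """Pick the single algorithm bit to announce in the answer.

    Returns None when no DOIC AVPs may be added to the answer.
    """
    if not has_supported_features:
        # No OC-Supported-Features AVP in the request: no reacting node.
        return None
    # An absent OC-Feature-Vector means "default (loss) algorithm only".
    offered = OLR_DEFAULT_ALGO if request_vector is None else request_vector
    for bit in preference:  # local policy order (hypothetical parameter)
        if offered & bit:
            return bit
    return None
```

For instance, with a hypothetical second algorithm bit 0x2 preferred by local policy, a request advertising 0x3 would yield 0x2, while a request advertising only 0x1 would fall back to the loss algorithm.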

An agent MAY modify the OC-Supported-Features AVP carried in answer messages.

4.1.3. Agent Behavior (Normative)

Editor's note -- Need to add this section.

4.2. Overload Report Processing (Normative)

4.2.1. Overload Control State (Normative)

Both reacting and reporting nodes maintain an overload control state (OCS) for each endpoint (a host or a realm) with which they communicate, provided that both endpoints have announced support for DOIC. See Sections 6.1 and 4.1 for discussion about how the support for DOIC is determined.

4.2.1.1. Overload Control State for Reacting Nodes

A reacting node maintains the following OCS per supported Diameter application:

A host-type Overload Control State for each Destination-Host towards which it sends host-type requests and
A realm-type Overload Control State for each Destination-Realm towards which it sends realm-type requests.

A host-type Overload Control State may be identified by the pair of Application-Id and Destination-Host. A realm-type Overload Control State may be identified by the pair of Application-Id and Destination-Realm. The host-type/realm-type Overload Control State for a given pair of Application and Destination-Host / Destination-Realm could include the following information:

Sequence number (as received in OC-OLR)
Time of expiry (derived from validity duration as received in OC-OLR and time of reception)
Selected Abatement Algorithm (as received in OC-Supported-Features)
Algorithm specific input data (as received within OC-OLR, e.g. Reduction Percentage for Loss)
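As an illustration only, the information listed above could be held in a table keyed by state type, application and destination. All class, function and field names below are hypothetical; the explicit "host"/"realm" tag in the key is an assumption made to keep the two state types apart.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class ReactingOcs:
    sequence_number: int        # as received in OC-OLR
    expiry_time: float          # reception time + validity duration (seconds)
    algorithm: int              # selected abatement algorithm (feature bit)
    reduction_percentage: int   # algorithm specific input data for loss


# Key: ("host" | "realm", Application-Id, Destination-Host or Destination-Realm)
OcsKey = Tuple[str, int, str]


def store_report(table: Dict[OcsKey, ReactingOcs], kind: str, app_id: int,
                 identity: str, seq: int, received_at: float, validity: int,
                 algorithm: int, reduction: int) -> OcsKey:
    """Create or overwrite the state for one (type, application, destination)."""
    key = (kind, app_id, identity)
    table[key] = ReactingOcs(seq, received_at + validity, algorithm, reduction)
    return key
```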

4.2.1.2. Overload Control States for Reporting Nodes

A reporting node maintains an Overload Control State per supported Diameter application and per supported (and eventually selected) Abatement Algorithm.

An Overload Control State may be identified by the pair of Application-Id and supported Abatement Algorithm.

The Overload Control State for a given pair of Application and Abatement Algorithm could include the information:

Sequence number
Validity Duration and Expiry Time
Algorithm specific input data (e.g. Reduction Percentage for Loss)

Overload Control States for reporting nodes containing a validity duration of 0 sec. should not expire before any previously sent (stale) OLR has timed out at any reacting node.

Editor's note: This statement is unclear and contradictory with other statements. A validity timer of zero seconds indicates that the overload condition has ended and abatement is no longer requested.

4.2.1.3. Maintaining Overload Control State

Reacting nodes create a host-type OCS identified by OCS-Id = (app-id,host-id) when receiving an answer message of application app-id containing an Origin-Host of host-id and a host-type OC-OLR AVP unless such host-type OCS already exists.

Reacting nodes create a realm-type OCS identified by OCS-Id = (app-id,realm-id) when receiving an answer message of application app-id containing an Origin-Realm of realm-id and a realm-type OC-OLR AVP unless such realm-type OCS already exists.

Reacting nodes delete an OCS when it expires (i.e. when current time minus reception time is greater than validity duration).

Editor's note: Reacting nodes also delete an OCS when an updated OLR is received with a validity duration of zero.

Reacting nodes update the host-type OCS identified by OCS-Id = (app-id,host-id) when receiving an answer message of application app-id containing an Origin-Host of host-id and a host-type OC-OLR AVP with a sequence number higher than the stored sequence number.

Reacting nodes update the realm-type OCS identified by OCS-Id = (app-id,realm-id) when receiving an answer message of application app-id containing an Origin-Realm of realm-id and a realm-type OC-OLR AVP with a sequence number higher than the stored sequence number.

Reacting nodes do not delete an OCS when receiving an answer message that does not contain an OC-OLR AVP (i.e. absence of OLR means “no change”).

Reporting nodes create an OCS identified by OCS-Id = (app-id,Alg) when receiving a request of application app-id containing an OC-Supported-Features AVP indicating support of the Abatement Algorithm Alg (which the reporting node selects) while being overloaded, unless such OCS already exists.

Reporting nodes delete an OCS when it expires.

Editor's note: Reporting nodes should send updated overload reports with a validity duration of zero for a period of time after an OCS expires or is removed due to the overload condition ending.

Reporting nodes update the OCS identified by OCS-Id = (app-id,Alg) when they detect the need to modify the requested amount of application app-id traffic reduction.

4.2.2. Reacting Node Behavior (Normative)

Once a reacting node receives an OC-OLR AVP from a reporting node, it applies traffic abatement based on the selected algorithm with the reporting node and the current overload condition. The reacting node learns the reporting node supported abatement algorithms directly from the received answer message containing the OC-Supported-Features AVP.

The received OC-Supported-Features AVP does not change the existing overload condition and/or traffic abatement algorithm settings if the OC-Sequence-Number AVP contains a value that is equal to the previously received/recorded value. If the OC-Supported-Features AVP is received for the first time for the reporting node or the OC-Sequence-Number AVP value is less than the previously received/recorded value (and is outside the valid overflow window), then the sequence number is stale (e.g. an intentional or unintentional replay) and SHOULD be silently discarded.

As described in Section 6.3, the OC-OLR AVP contains the necessary information for the overload condition on the reporting node.

From the OC-Report-Type AVP contained in the OC-OLR AVP, the reacting node learns whether the overload condition report concerns a specific host (as identified by the Origin-Host AVP of the answer message containing the OC-OLR AVP) or the entire realm (as identified by the Origin-Realm AVP of the answer message containing the OC-OLR AVP). The reacting node learns the Diameter application to which the overload report applies from the Application-ID of the answer message containing the OC-OLR AVP. The reacting node MUST use this information as an input for its traffic abatement algorithm. The idea is that the reacting node applies different handling of the traffic abatement, depending on whether sent request messages are targeted to a specific host (identified by the Destination-Host AVP in the request) or to any host in a realm (when only the Destination-Realm AVP is present in the request). Note that future specifications MAY define new OC-Report-Type AVP values that imply different handling of the OC-OLR AVP, for example, in the form of new additional AVPs inside the Grouped OC-OLR AVP that would define the report target at a finer granularity than just a host.

Editor's note: The above behavior for Realm reports is inconsistent with the definition of realm reports in Section 6.6.

If the OC-OLR AVP is received for the first time, the reacting node MUST create overload control state associated with the related realm or a specific host in the realm identified in the message carrying the OC-OLR AVP, as described in Section 4.2.1.

If the value of the OC-Sequence-Number AVP contained in the received OC-OLR AVP is equal to or less than the value stored in an existing overload control state, the received OC-OLR AVP SHOULD be silently discarded. If the value of the OC-Sequence-Number AVP contained in the received OC-OLR AVP is greater than the value stored in an existing overload control state or there is no previously recorded sequence number, the reacting node MUST update the overload control state associated with the realm or the specific node in the realm.

When an overload control state is created or updated, the reacting node MUST apply the traffic abatement requested in the OC-OLR AVP using the algorithm announced in the OC-Supported-Features AVP contained in the received answer message along with the OC-OLR AVP.

The validity duration of the overload information contained in the OC-OLR AVP is either explicitly indicated in the OC-Validity-Duration AVP or implicitly equals the default value (5 seconds) if the OC-Validity-Duration AVP is absent. The reacting node MUST maintain the validity duration in the overload control state. Once the validity duration times out, the reacting node MUST assume the overload condition reported in a previous OC-OLR AVP has ended.

A value of zero ("0") received in the OC-Validity-Duration in an updated overload report indicates that the overload condition has ended and that the overload state is no longer valid.

In the case that the validity duration expires or is explicitly signaled as being no longer valid the state associated with the overload report MUST be removed and any abatement associated with the overload report MUST be ended in a controlled fashion. After removing the overload state the sequence number MUST NOT be used for future comparisons of sequence numbers.
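The report handling rules of this section can be summarized in a small sketch. The function name, the dictionary layout and the returned strings are hypothetical; sequence number overflow handling is deliberately left out, and the state is a plain dictionary rather than any structure mandated by this document.

```python
from typing import Dict, Hashable, Optional

DEFAULT_VALIDITY = 5  # seconds, used when OC-Validity-Duration is absent


def process_olr(states: Dict[Hashable, dict], key: Hashable, seq: int,
                now: float, validity: Optional[int] = None,
                reduction: int = 0) -> str:
    """Apply one received OC-OLR to the reacting node's state table."""
    duration = DEFAULT_VALIDITY if validity is None else validity
    current = states.get(key)
    if current is not None and now > current["expiry"]:
        # Validity timed out: the state and its sequence number are dropped.
        del states[key]
        current = None
    if current is not None and seq <= current["seq"]:
        return "discarded"  # stale or repeated report
    if duration == 0:
        # Explicit end of the overload condition.
        states.pop(key, None)
        return "removed"
    states[key] = {"seq": seq, "expiry": now + duration, "reduction": reduction}
    return "created" if current is None else "updated"
```

Note that after a state is removed, a later report is accepted regardless of its sequence number, since the old sequence number is no longer available for comparison.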

4.2.3. Reporting Node Behavior (Normative)

A reporting node is a Diameter node inserting an OC-OLR AVP in a Diameter message in order to inform a reacting node about an overload condition and request Diameter traffic abatement.

The operation on the reporting node is straightforward. The reporting node learns the capabilities of the reacting node when it receives the OC-Supported-Features AVP as part of any Diameter request message. If the reporting node shares at least one common feature with the reacting node, then DOIC can be enabled between these two endpoints. See Section 4.1 for further discussion on the capability and feature announcement between two endpoints.

When a traffic reduction is required due to an overload condition and the overload control solution is supported by the sender of the Diameter request, the reporting node MUST include an OC-Supported-Features AVP and an OC-OLR AVP in the corresponding Diameter answer. The OC-OLR AVP contains the required traffic reduction and the OC-Supported-Features AVP indicates the traffic abatement algorithm to apply. This algorithm MUST be one of the algorithms advertised by the request sender.

A reporting node MAY rely on the OC-Validity-Duration AVP values for the implicit overload control state cleanup on the reacting node. However, it is RECOMMENDED that the reporting node always explicitly indicates the end of an overload condition.

The reporting node SHOULD indicate the end of an overload occurrence by sending a new OLR with OC-Validity-Duration set to a value of zero ("0"). The reporting node SHOULD ensure that all reacting nodes receive the updated overload report.

4.2.4. Agent Behavior (Normative)

Editor's note -- Need to add this section.

4.3. Protocol Extensibility (Normative)

The overload control solution can be extended, e.g. with new traffic abatement algorithms, new report types or other new functionality.

When defining a new extension a new feature bit MUST be defined for the OC-Feature-Vector. This feature bit is used to communicate support for the new feature.

The extension may also define new AVPs for use in DOIC Capability Announcement and for use in DOIC Overload reporting. These new AVPs should be defined to be extensions to the OC-Supported-Features and OC-OLR AVPs defined in this document.

It should be noted that [RFC6733] defined Grouped AVP extension mechanisms apply. This allows, for example, defining a new feature that is mandatory to be understood even when piggybacked on an existing application. More specifically, the sub-AVPs inside the OC-Supported-Features and OC-OLR AVP MAY have the M-bit set. However, when overload control AVPs are piggybacked on top of an existing application, setting the M-bit in sub-AVPs is NOT RECOMMENDED.

The handling of feature bits in the OC-Feature-Vector AVP that are not associated with overload abatement algorithms MUST be specified by the extensions that define the features.

When defining new report type values, the corresponding specification MUST define the semantics of the new report types and how they affect the OC-OLR AVP handling. The specification MUST also reserve a corresponding new feature, see the OC-Supported-Features and OC-Feature-Vector AVPs.

The OC-OLR AVP can be expanded with optional sub-AVPs only if a legacy implementation can safely ignore them without breaking backward compatibility for the given OC-Report-Type AVP value implied report handling semantics. If the new sub-AVPs imply new semantics for handling the indicated report type, then a new OC-Report-Type AVP value MUST be defined.

New features (feature bits in the OC-Feature-Vector AVP) and report types (in the OC-Report-Type AVP) MUST be registered with IANA. As with any Diameter specification, new AVPs MUST also be registered with IANA. See Section 8 for the required procedures.

5. Loss Algorithm (Normative)

This section documents the Diameter overload loss abatement algorithm.

5.1. Overview (Non normative)

The DOIC specification supports the ability for multiple overload abatement algorithms to be specified. The abatement algorithm used for any instance of overload is determined by the Diameter Overload Capability Announcement process documented in Section 4.1.

The loss algorithm described in this section is the default algorithm that must be supported by all Diameter nodes that support DOIC.

The loss algorithm is designed to be a straightforward and stateless overload abatement algorithm. It is used by reporting nodes to request a percentage reduction in the amount of traffic sent. The traffic impacted by the requested reduction depends on the type of overload report.

Reacting nodes use a strategy of applying abatement logic to the requested percentage of request messages sent (or handled in the case of agents) by the reacting node that are impacted by the overload report.

From a conceptual level, the logic at the reacting node could be outlined as follows. In this discussion assume that the reacting node is also the sending node.

An overload report is received and the associated overload state is saved by the reacting node.
A new Diameter request is generated by the application running on the reacting node.
The reacting node determines that an active overload report applies to the request.
The reacting node determines if abatement should be applied to the request. One approach that could be taken would be to select a random number between 1 and 100. If the random number is less than or equal to the indicated reduction percentage then the request is given abatement treatment, otherwise the request is given normal routing treatment.
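A minimal sketch of the random selection approach follows. The function name and the optional `rng` parameter are hypothetical. The sketch draws from 0..99 and uses a strict comparison, which is an implementation choice that abates exactly the requested share of requests on average, including the boundary cases of 0 (never) and 100 (always).

```python
import random
from typing import Optional


def needs_abatement(reduction_percentage: int,
                    rng: Optional[random.Random] = None) -> bool:
    """Decide statelessly whether one request gets abatement treatment."""
    source = rng if rng is not None else random
    # randrange(100) is uniform over 0..99, so the comparison is true
    # with probability reduction_percentage / 100.
    return source.randrange(100) < reduction_percentage
```

Because each decision is independent, the share of abated requests only approaches the requested percentage over many requests; it is not guaranteed for any short run.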

5.2. Use of OC-Reduction-Percentage AVP

A reporting node using the loss algorithm must use the OC-Reduction-Percentage AVP (Section 6.7) to indicate the desired percentage of traffic reduction.

Editor's note: The above duplicates what is in the OC-Reduction-Percentage AVP section and can probably be removed.

5.3. Reporting Node Behavior (Normative)

The method a reporting node uses to determine the amount of traffic reduction required to address an overload condition is an implementation decision.

When a reporting node that has selected the loss abatement algorithm determines the need to request a traffic reduction it must include an OC-OLR AVP in all response messages.

The reporting node must indicate a percentage reduction in the OC-Reduction-Percentage AVP.

The reporting node may change the reduction percentage in subsequent overload reports. When doing so the reporting node must conform to the overload report handling specified in Section 4.2.3.

When the reporting node determines it no longer needs a reduction in traffic the reporting node should send an overload report indicating the overload report is no longer valid, as specified in Section 4.2.3.

5.4. Reacting Node Behavior (Normative)

The method a reacting node uses to determine which request messages are given abatement treatment is an implementation decision.

When receiving an OC-OLR in an answer message where the algorithm indicated in the OC-Supported-Features AVP is the loss algorithm, the reacting node must attempt to apply abatement treatment to the requested percentage of request messages sent.

Note: the loss algorithm is a stateless algorithm. As a result, the reacting node does not guarantee that there will be an absolute reduction in traffic sent. Rather, it guarantees that the requested percentage of new requests will be given abatement treatment.

If the reacting node comes out of the 100 percent traffic reduction as a result of the overload report timing out, the following concerns are RECOMMENDED to be applied. The reacting node sending the traffic should be conservative and, for example, first send "probe" messages to learn the overload condition of the overloaded node before converging to any traffic amount/rate decided by the sender. Similar concerns apply in all cases when the overload report times out unless the previous overload report stated 0 percent reduction.

Editor's note: Need to add additional guidance to slowly increase the rate of traffic sent to avoid a sudden spike in traffic, as the spike in traffic could result in oscillation of the need for overload control.

If the reacting node does not receive an OLR in response to messages sent to the formerly overloaded node then the reacting node should slowly increase the rate of traffic sent to that node.

It is suggested that the reacting node decrease the amount of traffic given abatement treatment by 20% each second until the reduction is completely removed and no traffic is given abatement treatment.

The goal of this behavior is to reduce the probability of overload condition thrashing where an immediate transition from 100% reduction to 0% reduction results in the reporting node moving quickly back into an overload condition.
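The suggested gradual recovery can be sketched as follows. The function name and the `step` parameter are hypothetical, and the sketch assumes that "20%" means 20 percentage points of the reduction value per second, since that reading reaches zero in a bounded number of steps.

```python
from typing import List


def abatement_ramp(initial_percentage: int, step: int = 20) -> List[int]:
    """Return the reduction percentage to apply in each successive second."""
    levels = []
    current = initial_percentage
    while current > 0:
        current = max(0, current - step)  # never go below "no abatement"
        levels.append(current)
    return levels
```

Starting from a 100 percent reduction, `abatement_ramp(100)` gives 80, 60, 40, 20 and finally 0, so normal traffic resumes after five seconds.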

6. Attribute Value Pairs (Normative)

This section describes the encoding and semantics of the Diameter Overload Indication Attribute Value Pairs (AVPs) defined in this document.

When added to existing commands, both OC-Feature-Vector and OC-OLR AVPs SHOULD have the M-bit flag cleared to avoid backward compatibility issues.

A new application specification can incorporate the overload control mechanism specified in this document by making it mandatory to implement for the application and referencing this specification normatively. In such a case, the OC-Feature-Vector and OC-OLR AVPs reused in newly defined Diameter applications SHOULD have the M-bit flag set. However, it is the responsibility of the Diameter application designers to define how overload control mechanisms work on that application.

6.1. OC-Supported-Features AVP

The OC-Supported-Features AVP (AVP code TBD1) is of type Grouped and serves two purposes. First, it announces a node's support for DOIC in general. Second, it contains the description of the supported DOIC features of the sending node. The OC-Supported-Features AVP MUST be included in every Diameter message a DOIC supporting node sends.

   OC-Supported-Features ::= < AVP Header: TBD1 >
                             [ OC-Feature-Vector ]
                           * [ AVP ]

The OC-Feature-Vector sub-AVP is used to announce the DOIC features supported by the endpoint, in the form of a flag bits field in which each bit announces one feature or capability supported by the node (see Section 6.2). The absence of the OC-Feature-Vector AVP indicates that only the default traffic abatement algorithm described in this specification is supported.

A reacting node includes this AVP to indicate its capabilities to a reporting node. For example, the endpoint (reacting node) may indicate which (future defined) traffic abatement algorithms it supports in addition to the default.

During the message exchange the overload control endpoints express their common set of supported capabilities. The reacting node includes the OC-Supported-Features AVP that announces what it supports. The reporting node that sends the answer also includes the OC-Supported-Features AVP that describes the capabilities it supports. The set of capabilities advertised by the reporting node depends on local policies. At least one of the announced capabilities MUST match. If there is no single matching capability the reacting node MUST act as if it does not implement DOIC and cease inserting any DOIC related AVPs into any Diameter messages exchanged with this specific reporting node.

Editor's note: The last sentence conflicts with the last sentence two paragraphs up. In reality, there will always be at least one matching capability as all nodes supporting DOIC must support the loss algorithm. Suggest removing the last sentence.

6.2. OC-Feature-Vector AVP

The OC-Feature-Vector AVP (AVP code TBD6) is of type Unsigned64 and contains a 64-bit flags field of announced capabilities of an overload control endpoint. The value of zero (0) is reserved.

The following capabilities are defined in this document:

OLR_DEFAULT_ALGO (0x0000000000000001): When this flag is set by the overload control endpoint it means that the default traffic abatement (loss) algorithm is supported.
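As an illustration, the flags field can be handled as a 64-bit unsigned integer in network byte order, which is how Diameter encodes Unsigned64 data. The function names are hypothetical, and the sketch covers only the AVP data portion, not the AVP header.

```python
import struct

OLR_DEFAULT_ALGO = 0x0000000000000001  # loss algorithm bit


def encode_feature_vector(flags: int) -> bytes:
    """Encode the flags as 8 bytes, most significant byte first."""
    return struct.pack("!Q", flags)


def decode_feature_vector(data: bytes) -> int:
    """Decode 8 bytes of AVP data back into the flags value."""
    (flags,) = struct.unpack("!Q", data)
    return flags


def supports_loss(flags: int) -> bool:
    """Check whether the default (loss) algorithm bit is set."""
    return bool(flags & OLR_DEFAULT_ALGO)
```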

6.3. OC-OLR AVP

The OC-OLR AVP (AVP code TBD2) is of type Grouped and contains the necessary information to convey an overload report. The OC-OLR AVP does not explicitly contain all information needed by the reacting node to decide whether a subsequent request must undergo a throttling process with the received reduction percentage. The value of the OC-Report-Type AVP within the OC-OLR AVP indicates which implicit information is relevant for this decision (see Section 6.6). The application the OC-OLR AVP applies to is the same as the Application-Id found in the Diameter message header. The identity the OC-OLR AVP concerns is determined from the Origin-Host AVP (and the Origin-Realm AVP as well) found in the encapsulating Diameter command. The OC-OLR AVP is intended to be sent only by a reporting node.

   OC-OLR ::= < AVP Header: TBD2 >
              < OC-Sequence-Number >
              < OC-Report-Type >
              [ OC-Reduction-Percentage ]
              [ OC-Validity-Duration ]
            * [ AVP ]

The OC-Validity-Duration AVP indicates the validity time of the overload report associated with a specific sequence number, measured after reception of the OC-OLR AVP. The validity time MUST NOT be updated after reception of subsequent OC-OLR AVPs with the same sequence number. The default value for the OC-Validity-Duration AVP value is 5 (i.e., 5 seconds). When the OC-Validity-Duration AVP is not present in the OC-OLR AVP, the default value applies.

Note that if a Diameter command were to contain multiple OC-OLR AVPs they all MUST have different OC-Report-Type AVP values. OC-OLR AVPs with unknown values SHOULD be silently discarded and the event SHOULD be logged.

Editor's note: Need to specify what happens when two reports of the same type are received.

6.4. OC-Sequence-Number AVP

The OC-Sequence-Number AVP (AVP code TBD3) is of type Unsigned64. Its usage in the context of overload control is described in Section 4.2.

From the functionality point of view, the OC-Sequence-Number AVP MUST be used as a non-volatile increasing counter between two overload control endpoints. The sequence number is only required to be unique between two overload control endpoints. Sequence numbers are treated in a uni-directional manner, i.e. two sequence numbers on each direction between two endpoints are not related or correlated.

When generating sequence numbers, the new sequence number MUST be greater than any sequence number in an active overload report previously sent by the reporting node. This property MUST hold over a reboot of the reporting node.
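One possible way to satisfy the monotonicity requirement, including across a reboot, is to derive sequence numbers from wall-clock time. This is an assumption made for illustration and is not mandated by this document; the class name and the `clock` parameter are hypothetical. The approach relies on the clock not moving backwards across restarts and on fewer reports being generated than seconds elapsed before a restart.

```python
import time
from typing import Callable


class SequenceGenerator:
    """Produce strictly increasing sequence numbers seeded from a clock."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._last = 0

    def next(self) -> int:
        # Use the current time in seconds, but never repeat or go
        # backwards within the lifetime of this generator.
        self._last = max(int(self._clock()), self._last + 1)
        return self._last
```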

6.5. OC-Validity-Duration AVP

The OC-Validity-Duration AVP (AVP code TBD4) is of type Unsigned32 and indicates in seconds the validity time of the overload report. The number of seconds is measured after reception of the first OC-OLR AVP with a given value of OC-Sequence-Number AVP. The default value for the OC-Validity-Duration AVP is 5 (i.e., 5 seconds). When the OC-Validity-Duration AVP is not present in the OC-OLR AVP, the default value applies. Validity durations with values above 86400 (i.e., 24 hours) MUST NOT be used. Invalid duration values are treated as if the OC-Validity-Duration AVP were not present and result in the default value being used.
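The defaulting rules for the validity duration can be expressed in a few lines. The function and constant names are hypothetical; `None` stands for an absent OC-Validity-Duration AVP.

```python
from typing import Optional

DEFAULT_VALIDITY_SECONDS = 5
MAX_VALIDITY_SECONDS = 86400  # 24 hours


def effective_validity(value: Optional[int]) -> int:
    """Return the validity duration a reacting node should apply."""
    if value is None or value > MAX_VALIDITY_SECONDS:
        # Absent or invalid: fall back to the default value.
        return DEFAULT_VALIDITY_SECONDS
    return value  # includes 0, which signals the end of the overload
```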

A timeout of the overload report has specific concerns that need to be taken into account by the endpoint acting on the earlier received overload report(s). Section 5.4 discusses the impacts of timeout in the scope of the loss abatement algorithm.

When a reporting node has recovered from overload, it SHOULD invalidate any existing overload reports in a timely manner. This can be achieved by sending an updated overload report (meaning the OLR contains a new sequence number) with the OC-Validity-Duration AVP value set to zero ("0"). If the overload report is about to expire naturally, the reporting node MAY choose to simply let it do so.

A reacting node MUST invalidate and remove an overload report that expires without an explicit overload report containing an OC-Validity-Duration value set to zero ("0").

6.6. OC-Report-Type AVP

The OC-Report-Type AVP (AVP code TBD5) is of type Enumerated. The value of the AVP describes what the overload report concerns. The following values are initially defined:

0: A host report. The overload treatment should apply to requests for which all of the following conditions are true:
: Either the Destination-Host AVP is present in the request and its value matches the value of the Origin-Host AVP of the received message that contained the OC-OLR AVP; or the Destination-Host is not present in the request but the value of peer identity associated with the connection used to send the request matches the value of the Origin-Host AVP of the received message that contained the OC-OLR AVP.
: The value of the Destination-Realm AVP in the request matches the value of the Origin-Realm AVP of the received message that contained the OC-OLR AVP.
: The value of the Application-ID in the Diameter Header of the request matches the value of the Application-ID of the Diameter Header of the received message that contained the OC-OLR AVP.
1: A realm report. The overload treatment should apply to requests for which all of the following conditions are true:
: The Destination-Host AVP is absent in the request.
: The value of the Destination-Realm AVP in the request matches the value of the Origin-Realm AVP of the received message that contained the OC-OLR AVP.
: The value of the Application-ID in the Diameter Header of the request matches the value of the Application-ID of the Diameter Header of the received message that contained the OC-OLR AVP.
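The matching conditions listed above can be sketched as a single predicate. The class, field and function names are hypothetical, identities are compared as exact strings for simplicity, and the realm report branch follows the conditions as currently listed, which remain subject to the open issue on realm report definitions.

```python
from dataclasses import dataclass
from typing import Optional

HOST_REPORT = 0
REALM_REPORT = 1


@dataclass
class Report:
    report_type: int
    origin_host: str      # Origin-Host of the answer that carried the OC-OLR
    origin_realm: str     # Origin-Realm of that answer
    application_id: int   # Application-ID of that answer's header


@dataclass
class Request:
    application_id: int
    destination_realm: str
    destination_host: Optional[str] = None
    peer_identity: Optional[str] = None  # peer on the selected connection


def report_applies(report: Report, request: Request) -> bool:
    """Return True if overload treatment applies to this request."""
    if request.application_id != report.application_id:
        return False
    if request.destination_realm != report.origin_realm:
        return False
    if report.report_type == HOST_REPORT:
        if request.destination_host is not None:
            return request.destination_host == report.origin_host
        return request.peer_identity == report.origin_host
    if report.report_type == REALM_REPORT:
        return request.destination_host is None
    return False  # unknown report type
```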

Editor's note: There is still an open issue on the definition of Realm reports and on which report types should be supported. There is consensus that host reports should be supported. There is discussion on Realm reports and Realm-Routed-Request reports. The above definition applies to Realm-Routed-Request reports, whereas Realm reports are defined to apply to all requests that match the realm, independent of the presence, absence or value of the Destination-Host AVP.

The default value of the OC-Report-Type AVP is 0 (i.e. the host report).

The OC-Report-Type AVP is envisioned to be useful for situations where a reacting node needs to apply different overload treatments for different "types" of overload. For example, the reacting node(s) might need to throttle requests sent to a specific server (identified by the Destination-Host AVP in the request) differently from requests that can be handled by any server in a realm. The example in Appendix B.1 illustrates this usage.
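The matching conditions for the two report types can be sketched as a single predicate. This is a minimal, non-normative illustration: the class and field names are hypothetical, and identities are compared as exact strings, which is an assumption of the sketch rather than a rule stated in this section.

```python
from dataclasses import dataclass
from typing import Optional

HOST_REPORT = 0   # OC-Report-Type value 0
REALM_REPORT = 1  # OC-Report-Type value 1


@dataclass(frozen=True)
class Request:
    # Hypothetical view of an outgoing request.
    destination_realm: str
    application_id: int
    destination_host: Optional[str] = None  # None: Destination-Host AVP absent
    peer_identity: Optional[str] = None     # peer on the connection used to send


@dataclass(frozen=True)
class OverloadReport:
    # Values taken from the message that carried the OC-OLR AVP.
    report_type: int
    origin_host: str
    origin_realm: str
    application_id: int


def report_applies(report: OverloadReport, request: Request) -> bool:
    """Return True if the overload treatment applies to the request."""
    # Both report types require matching realm and Application-ID.
    if request.destination_realm != report.origin_realm:
        return False
    if request.application_id != report.application_id:
        return False
    if report.report_type == HOST_REPORT:
        if request.destination_host is not None:
            return request.destination_host == report.origin_host
        # No Destination-Host: fall back to the identity of the peer.
        return request.peer_identity == report.origin_host
    if report.report_type == REALM_REPORT:
        return request.destination_host is None
    return False  # unknown report types match nothing in this sketch
```

A reacting node would evaluate this predicate against each active report before sending a request and apply the abatement of the report that matches.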

6.7. OC-Reduction-Percentage AVP

The OC-Reduction-Percentage AVP (AVP code TBD8) is of type Unsigned32 and describes the percentage of the traffic that the sender is requested to reduce, compared to what it otherwise would send. The OC-Reduction-Percentage AVP applies to the default (loss) algorithm specified in this specification. However, the AVP can be reused for future abatement algorithms, if its semantics fit into the new algorithm.

The value of the OC-Reduction-Percentage AVP is between zero (0) and one hundred (100). Values greater than 100 are ignored. The value of 100 means that all traffic is to be throttled, i.e. the reporting node is under a severe load and ceases to process any new messages. The value of 0 means that the reporting node is in a stable state and has no need for the other endpoint to apply any traffic abatement. The default value of the OC-Reduction-Percentage AVP is 0. When the OC-Reduction-Percentage AVP is not present in the overload report, the default value applies.
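The value rules above might be handled by a reacting node roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical, "ignored" is interpreted here as keeping the previously applied value, and the per-request random drop is only one possible way to realize a loss-style reduction.

```python
import random
from typing import Optional

DEFAULT_REDUCTION = 0  # default when the AVP is absent


def effective_reduction(avp_value: Optional[int],
                        previous: int = DEFAULT_REDUCTION) -> int:
    """Map a received OC-Reduction-Percentage value to the value to apply."""
    if avp_value is None:
        return DEFAULT_REDUCTION  # AVP not present: default applies
    if avp_value > 100:
        # Assumption of this sketch: an ignored value leaves the
        # previously applied reduction unchanged.
        return previous
    return avp_value


def should_throttle(reduction: int, rng: random.Random) -> bool:
    """Decide per request whether to throttle, given a reduction in percent."""
    # rng.random() is in [0.0, 1.0), so 0 never throttles and 100 always does.
    return rng.random() * 100 < reduction
```

With a reduction of 25, roughly one request in four is throttled over a long run; the exact sequence depends on the random source.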

6.8. Attribute Value Pair flag rules

                                                      +---------+
                                                      |AVP flag |
                                                      |rules    |
                                                      +----+----+
                           AVP   Section              |    |MUST|
    Attribute Name         Code  Defined  Value Type  |MUST| NOT|
   +--------------------------------------------------+----+----+
   |OC-Supported-Features  TBD1  x.x      Grouped     |    | V  |
   +--------------------------------------------------+----+----+
   |OC-OLR                 TBD2  x.x      Grouped     |    | V  |
   +--------------------------------------------------+----+----+
   |OC-Sequence-Number     TBD3  x.x      Unsigned64  |    | V  |
   +--------------------------------------------------+----+----+
   |OC-Validity-Duration   TBD4  x.x      Unsigned32  |    | V  |
   +--------------------------------------------------+----+----+
   |OC-Report-Type         TBD5  x.x      Enumerated  |    | V  |
   +--------------------------------------------------+----+----+
   |OC-Reduction                                      |    |    |
   |  -Percentage          TBD8  x.x      Unsigned32  |    | V  |
   +--------------------------------------------------+----+----+
   |OC-Feature-Vector      TBD6  x.x      Unsigned64  |    | V  |
   +--------------------------------------------------+----+----+

As described in the Diameter base protocol [RFC6733], the M-bit setting for a given AVP is relevant to an application and each command within that application that includes the AVP.

The Diameter overload control AVPs SHOULD always be sent with the M-bit cleared when used within existing Diameter applications to avoid backward compatibility issues. Otherwise, when reused in newly defined Diameter applications, the DOC related AVPs SHOULD have the M-bit set.
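As an illustration of the flag rules, the following sketch encodes an Unsigned32 AVP using the AVP header layout of the Diameter base protocol [RFC6733] (32-bit code, 8-bit flags, 24-bit length). The AVP code used is a placeholder, since the real codes are still TBD in this document, and the function name is hypothetical.

```python
import struct

FLAG_V = 0x80  # Vendor-Specific bit; MUST NOT be set for the DOIC AVPs
FLAG_M = 0x40  # Mandatory bit

# Hypothetical placeholder only: the actual AVP codes are TBD.
PLACEHOLDER_CODE = 0xFFFFFF


def encode_unsigned32_avp(code: int, value: int, mandatory: bool = False) -> bytes:
    """Encode an Unsigned32 AVP without a Vendor-ID field."""
    flags = FLAG_M if mandatory else 0  # V-bit is never set here
    length = 8 + 4                      # 8-byte header plus 4 bytes of data
    header = struct.pack("!IB", code, flags) + length.to_bytes(3, "big")
    return header + struct.pack("!I", value)
```

Within an existing application the AVP would be encoded with `mandatory=False`, so that nodes that do not understand the AVP are not forced to reject the message.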

7. Error Response Codes

Editor's note: This section depends on resolution of issue #27.

8. IANA Considerations

8.1. AVP codes

New AVPs defined by this specification are listed in Section 6. All AVP codes are allocated from the 'Authentication, Authorization, and Accounting (AAA) Parameters' AVP Codes registry.

8.2. New registries

Two new registries are needed under the 'Authentication, Authorization, and Accounting (AAA) Parameters' registry.

Section 6.2 defines a new "Overload Control Feature Vector" registry, including its initial assignments. New values can be added to the registry using the Specification Required policy [RFC5226].

Section 6.6 defines a new "Overload Report Type" registry with its initial assignments. New types can be added using the Specification Required policy [RFC5226].

9. Security Considerations

This mechanism gives Diameter nodes the ability to request that downstream nodes send fewer Diameter requests. Nodes do this by exchanging overload reports that directly affect this reduction. This exchange is potentially subject to multiple methods of attack, and has the potential to be used as a Denial-of-Service (DoS) attack vector.

Overload reports may contain information about the topology and current status of a Diameter network. This information is potentially sensitive. Network operators may wish to control disclosure of overload reports to unauthorized parties to avoid its use for competitive intelligence or to target attacks.

Diameter does not include features to provide end-to-end authentication, integrity protection, or confidentiality. This may cause complications when sending overload reports between non-adjacent nodes.

9.1. Potential Threat Modes

The Diameter protocol involves transactions in the form of requests and answers exchanged between clients and servers. These clients and servers may be peers, that is, they may share a direct transport (e.g. TCP or SCTP) connection, or the messages may traverse one or more intermediaries, known as Diameter Agents. Diameter nodes use TLS, DTLS, or IPsec to authenticate peers, and to provide confidentiality and integrity protection of traffic between peers. Nodes can make authorization decisions based on the peer identities authenticated at the transport layer.

When agents are involved, this presents an effectively hop-by-hop trust model. That is, a Diameter client or server can authorize an agent for certain actions, but it must trust that agent to make appropriate authorization decisions about its peers, and so on.

Since confidentiality and integrity protection occur at the transport layer, agents can read, and perhaps modify, any part of a Diameter message, including an overload report.

There are several ways an attacker might attempt to exploit the overload control mechanism. An unauthorized third party might inject an overload report into the network. If this third party is upstream of an agent, and that agent fails to apply proper authorization policies, downstream nodes may mistakenly trust the report. This attack is at least partially mitigated by the assumption that nodes include overload reports in Diameter answers but not in requests. This requires an attacker to have knowledge of the original request in order to construct a response. Therefore, implementations SHOULD validate that an answer containing an overload report is a properly constructed response to a pending request prior to acting on the overload report.
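The recommended validation can be sketched as a table of pending requests that is consulted before an overload report is acted upon. The class and method names are hypothetical. The sketch assumes that an answer is correlated with its request by the sending peer, the Hop-by-Hop Identifier, the End-to-End Identifier, the command code and the Application-ID; a real implementation may check more.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class PendingRequest:
    command_code: int
    application_id: int
    end_to_end_id: int


class PendingTable:
    """Tracks requests that are still waiting for an answer."""

    def __init__(self) -> None:
        self._pending: Dict[Tuple[str, int], PendingRequest] = {}

    def sent(self, peer: str, hop_by_hop_id: int, request: PendingRequest) -> None:
        self._pending[(peer, hop_by_hop_id)] = request

    def accept_answer(self, peer: str, hop_by_hop_id: int, command_code: int,
                      application_id: int, end_to_end_id: int) -> bool:
        """Return True only if the answer matches a pending request."""
        expected = self._pending.get((peer, hop_by_hop_id))
        if expected is None:
            return False  # unsolicited answer: do not act on its report
        answer = PendingRequest(command_code, application_id, end_to_end_id)
        if answer != expected:
            return False  # header fields do not match the request
        del self._pending[(peer, hop_by_hop_id)]
        return True
```

An overload report carried in an answer would be processed only when `accept_answer` returns True.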

A similar attack involves an otherwise authorized Diameter node that sends an inappropriate overload report. For example, a server for the realm "example.com" might send an overload report indicating that a competitor's realm "example.net" is overloaded. If other nodes act on the report, they may falsely believe that "example.net" is overloaded, effectively reducing that realm's capacity. Therefore, it's critical that nodes validate that an overload report received from a peer actually falls within that peer's responsibility before acting on the report or forwarding the report to other peers. For example, an overload report from a peer that applies to a realm not handled by that peer is suspect.
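A responsibility check of this kind could look like the following sketch. The mapping from peers to the realms they are allowed to report on is assumed to come from local configuration; its name and shape are hypothetical.

```python
from typing import Dict, Set


def report_within_responsibility(peer: str, report_realm: str,
                                 realms_by_peer: Dict[str, Set[str]]) -> bool:
    """Return True if the peer is configured as responsible for the realm."""
    # Unknown peers are responsible for nothing (default deny).
    return report_realm in realms_by_peer.get(peer, set())
```

For the example above, a report about "example.net" received from a peer configured only for "example.com" would be rejected and not forwarded.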

An attacker might use the information in an overload report to assist in certain attacks. For example, an attacker could use information about current overload conditions to time a DoS attack for maximum effect, or use subsequent overload reports as a feedback mechanism to learn the results of a previous or ongoing attack.

9.2. Denial of Service Attacks

Diameter overload reports can cause a node to cease sending some or all Diameter requests for an extended period. This makes them a tempting vector for DoS attacks. Furthermore, since Diameter is almost always used in support of other protocols, a DoS attack on Diameter is likely to impact those protocols as well. Therefore, Diameter nodes MUST NOT honor or forward overload reports from unauthorized or otherwise untrusted sources.

9.3. Non-Compliant Nodes

When a Diameter node sends an overload report, it cannot assume that all nodes will comply. A non-compliant node might continue to send requests with no reduction in load. Requirement 28 [RFC7068] indicates that the overload control solution cannot assume that all Diameter nodes in a network are necessarily trusted, and that malicious nodes must not be allowed to take advantage of the overload control mechanism to get more than their fair share of service.

In the absence of an overload control mechanism, Diameter nodes need to implement strategies to protect themselves from floods of requests, and to make sure that a disproportionate load from one source does not prevent other sources from receiving service. For example, a Diameter server might reject a certain percentage of requests from sources that exceed certain limits. Overload control can be thought of as an optimization for such strategies, where downstream nodes never send the excess requests in the first place. However, the presence of an overload control mechanism does not remove the need for these other protection strategies.
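One such protection strategy can be sketched as a per-source limit over fixed time windows. This is only an illustration under simple assumptions: the class name, the window scheme and the limit are hypothetical, every request above the limit in a window is rejected, and old windows are never purged.

```python
from collections import defaultdict
from typing import DefaultDict, Tuple


class PerSourceLimiter:
    """Rejects requests from a source that exceeds its per-window limit."""

    def __init__(self, limit_per_window: int, window_seconds: float = 1.0) -> None:
        self.limit = limit_per_window
        self.window = window_seconds
        # Request count per (source, window index); not purged in this sketch.
        self._counts: DefaultDict[Tuple[str, int], int] = defaultdict(int)

    def admit(self, source: str, now: float) -> bool:
        """Count the request and return False once the source is over limit."""
        key = (source, int(now // self.window))
        self._counts[key] += 1
        return self._counts[key] <= self.limit
```

Because each source is counted separately, a flood from one source does not consume the allowance of the others.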

9.4. End-to-End Security Issues

The lack of end-to-end security features makes it far more difficult to establish trust in overload reports that originate from non-adjacent nodes. Any agents in the message path may insert or modify overload reports. Nodes must trust that their adjacent peers perform proper checks on overload reports from their peers, and so on, creating a transitive-trust requirement extending for potentially long chains of nodes. Network operators must determine if this transitive trust requirement is acceptable for their deployments. Nodes supporting Diameter overload control MUST give operators the ability to select which peers are trusted to deliver overload reports, and whether they are trusted to forward overload reports from non-adjacent nodes.

The lack of end-to-end confidentiality protection means that any Diameter agent in the path of an overload report can view the contents of that report. In addition to the requirement to select which peers are trusted to send overload reports, operators MUST be able to select which peers are authorized to receive reports. A node MUST NOT send an overload report to a peer not authorized to receive it. Furthermore, an agent MUST remove any overload reports that might have been inserted by other nodes before forwarding a Diameter message to a peer that is not authorized to receive overload reports.
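The operator controls required in this section could be modelled as a per-peer policy, as in the following sketch. The policy fields and function names are hypothetical, a message is represented simply as a list of (AVP name, value) pairs, and all permissions default to denied.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple

OC_OLR = "OC-OLR"


@dataclass
class PeerPolicy:
    may_send_reports: bool = False     # trusted to deliver its own reports
    may_forward_reports: bool = False  # trusted to relay non-adjacent reports
    may_receive_reports: bool = False  # authorized to receive reports


def accept_report(policy: PeerPolicy, peer: str, report_origin_host: str) -> bool:
    """Decide whether a report delivered by the peer may be acted upon."""
    if not policy.may_send_reports:
        return False
    if report_origin_host != peer and not policy.may_forward_reports:
        return False  # report originated elsewhere, relaying not trusted
    return True


def strip_reports(avps: List[Tuple[str, Any]],
                  policy: PeerPolicy) -> List[Tuple[str, Any]]:
    """Remove overload reports before forwarding to an unauthorized peer."""
    if policy.may_receive_reports:
        return list(avps)
    return [(name, value) for name, value in avps if name != OC_OLR]
```

An agent would call `strip_reports` with the policy of the next-hop peer just before forwarding a message.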

At the time of this writing, the DIME working group is studying requirements for adding end-to-end security [I-D.ietf-dime-e2e-sec-req] features to Diameter. These features, when they become available, might make it easier to establish trust in non-adjacent nodes for overload control purposes. Readers should be reminded, however, that the overload control mechanism encourages Diameter agents to modify AVPs in, or insert additional AVPs into, existing messages that are originated by other nodes. If end-to-end security is enabled, there is a risk that such modification could violate integrity protection. The details of using any future Diameter end-to-end security mechanism with overload control will require careful consideration, and are beyond the scope of this document.

10. Contributors

The following people contributed substantial ideas, feedback, and discussion to this document:

Eric McMurry
Hannes Tschofenig
Ulrich Wiehe
Jean-Jacques Trottin
Maria Cruz Bartolome
Martin Dolly
Nirav Salot
Susan Shishufeng

11. References

11.1. Normative References

[RFC2119]	Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC5226]	Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 5226, May 2008.
[RFC5905]	Mills, D., Martin, J., Burbank, J. and W. Kasch, "Network Time Protocol Version 4: Protocol and Algorithms Specification", RFC 5905, June 2010.
[RFC6733]	Fajardo, V., Arkko, J., Loughney, J. and G. Zorn, "Diameter Base Protocol", RFC 6733, October 2012.

11.2. Informative References

[Cx]	3GPP, "ETSI TS 129 229 V11.4.0", August 2013.
[I-D.ietf-dime-e2e-sec-req]	Tschofenig, H., Korhonen, J., Zorn, G. and K. Pillay, "Diameter AVP Level Security: Scenarios and Requirements", Internet-Draft draft-ietf-dime-e2e-sec-req-00, September 2013.
[PCC]	3GPP, "ETSI TS 123 203 V11.12.0", December 2013.
[RFC4006]	Hakala, H., Mattila, L., Koskinen, J-P., Stura, M. and J. Loughney, "Diameter Credit-Control Application", RFC 4006, August 2005.
[RFC5729]	Korhonen, J., Jones, M., Morand, L. and T. Tsou, "Clarifications on the Routing of Diameter Requests Based on the Username and the Realm", RFC 5729, December 2009.
[RFC7068]	McMurry, E. and B. Campbell, "Diameter Overload Control Requirements", RFC 7068, November 2013.
[S13]	3GPP, "ETSI TS 129 272 V11.9.0", December 2012.

Appendix A. Issues left for future specifications

The base solution for the overload control does not cover all possible use cases. A number of solution aspects were intentionally left for future specification and protocol work.

A.1. Additional traffic abatement algorithms

This specification describes only the means for a simple loss-based algorithm. Future algorithms can be added using the designed solution extension mechanism. The new algorithms need to be registered with IANA. See Sections 6.1 and 8 for the required IANA steps.

A.2. Agent Overload

This specification focuses on Diameter endpoint (server or client) overload. A separate extension will be required to outline the handling of agent overload.

A.3. DIAMETER_TOO_BUSY clarifications

The current [RFC6733] behavior in the case of DIAMETER_TOO_BUSY is somewhat underspecified. For example, there is no information on how long the specific Diameter node is willing to be unavailable. A specification updating [RFC6733] should clarify the handling of DIAMETER_TOO_BUSY both from the point of view of the Diameter node that initiates the error answer and from the point of view of the Diameter node that initiated the original request. Further, the inclusion of AVPs that provide additional information should be discussed and possibly recommended.

Appendix B. Examples

B.1. Mix of Destination-Realm routed requests and Destination-Host routed requests

Diameter allows a client to optionally select the destination server of a request, even if there are agents between the client and the server. The client does this using the Destination-Host AVP. In cases where the client does not care if a specific server receives the request, it can omit Destination-Host and route the request using the Destination-Realm and Application Id, effectively letting an agent select the server.

Clients commonly send mixtures of Destination-Host and Destination-Realm routed requests. For example, in an application that uses user sessions, a client typically won't care which server handles a session-initiating request. But once the session is initiated, the client will send all subsequent requests in that session to the same server. Therefore it would send the initial request with no Destination-Host AVP. If it receives a successful answer, the client would copy the Origin-Host value from the answer message into a Destination-Host AVP in each subsequent request in the session.
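The client behavior described above can be sketched as a small session object. The class is hypothetical, and messages are reduced to dictionaries of AVP names and values for the purpose of illustration.

```python
from typing import Dict, Optional


class Session:
    """Tracks which server, if any, a user session is bound to."""

    def __init__(self, realm: str) -> None:
        self.realm = realm
        self.server: Optional[str] = None  # unknown until the first answer

    def next_request(self) -> Dict[str, str]:
        """Build the routing AVPs for the next request in the session."""
        avps = {"Destination-Realm": self.realm}
        if self.server is not None:
            # Session established: pin all later requests to the same server.
            avps["Destination-Host"] = self.server
        return avps

    def on_success(self, answer: Dict[str, str]) -> None:
        """Record the answering server after a successful initial answer."""
        if self.server is None:
            self.server = answer["Origin-Host"]
```

The first request is realm-routed; every request built after `on_success` carries the Destination-Host AVP.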

An agent has very limited options in applying overload abatement to requests that contain Destination-Host AVPs. It typically cannot route the request to a different server than the one identified in Destination-Host. Its only remaining options are to throttle such requests locally, or to send an overload report back towards the client so the client can throttle the requests. The second choice is usually more efficient, since it prevents any throttled requests from being sent in the first place, and removes the agent's need to send errors back to the client for each dropped request.

On the other hand, an agent has much more leeway to apply overload abatement for requests that do not contain Destination-Host AVPs. If the agent has multiple servers in its peer table for the given realm and application, it can route such requests to other, less overloaded servers.

If the overload severity increases, the agent may reach a point where there is not sufficient capacity across all servers to handle even realm-routed requests. In this case, the realm itself can be considered overloaded. The agent may need the client to throttle realm-routed requests in addition to Destination-Host routed requests. The overload severity may be different for each server, and the severity for the realm is likely to be different than for any specific server. Therefore, an agent may need to forward, or originate, multiple overload reports with differing ReportType and Reduction-Percentage values.

Figure 8 illustrates such a mixed-routing scenario. In this example, the servers S1, S2, and S3 handle requests for the realm "realm". Any of the three can handle requests that are not part of a user session (i.e. routed by Destination-Realm). But once a session is established, all requests in that session must go to the same server.

     Client     Agent      S1        S2        S3
        |         |         |         |         |
        |(1) Request (DR:realm)       |         |
        |-------->|         |         |         |
        |         |         |         |         |
        |         |         |         |         |
        |         |Agent selects S1   |         |
        |         |         |         |         |
        |         |         |         |         |
        |         |         |         |         |
        |         |(2) Request (DR:realm)       |
        |         |-------->|         |         |
        |         |         |         |         |
        |         |         |         |         |
        |         |         |S1 overloaded, returns OLR
        |         |         |         |         |
        |         |         |         |         |
        |         |         |         |         |
        |         |(3) Answer (OR:realm,OH:S1,OLR:RT=DH)
        |         |<--------|         |         |
        |         |         |         |         |
        |         |         |         |         |
        |         |sees OLR,routes DR traffic to S2&S3
        |         |         |         |         |
        |         |         |         |         |
        |         |         |         |         |
        |(4) Answer (OR:realm,OH:S1, OLR:RT=DH) |
        |<--------|         |         |         |
        |         |         |         |         |
        |         |         |         |         |
        |Client throttles requests with DH:S1   |
        |         |         |         |         |
        |         |         |         |         |
        |         |         |         |         |
        |(5) Request (DR:realm)       |         |
        |-------->|         |         |         |
        |         |         |         |         |
        |         |         |         |         |
        |         |Agent selects S2   |         |
        |         |         |         |         |
        |         |         |         |         |
        |         |         |         |         |
        |         |(6) Request (DR:realm)       |
        |         |------------------>|         |
        |         |         |         |         |
        |         |         |         |         |
        |         |         |         |S2 is overloaded...
        |         |         |         |         |
        |         |         |         |         |
        |         |         |         |         |
        |         |(7) Answer (OH:S2, OLR:RT=DH)|
        |         |<------------------|         |
        |         |         |         |         |
        |         |         |         |         |
        |         |Agent sees OLR, realm now overloaded
        |         |         |         |         |
        |         |         |         |         |
        |         |         |         |         |
        |(8) Answer (OR:realm,OH:S2, OLR:RT=DH, OLR: RT=R)
        |<--------|         |         |         |
        |         |         |         |         |
        |         |         |         |         |
        |Client throttles DH:S1, DH:S2, and DR:realm
        |         |         |         |         |
        |         |         |         |         |
        |         |         |         |         |
        |         |         |         |         |
        |         |         |         |         |

Figure 8: Mix of Destination-Host and Destination-Realm Routed Requests

1. The client sends a request with no Destination-Host AVP (that is, a Destination-Realm routed request).
2. The agent follows local policy to select a server from its peer table. In this case, the agent selects S1 and forwards the request.
3. S1 is overloaded. It sends an answer indicating success, but also includes an overload report. Since the overload report only applies to S1, the ReportType is "Destination-Host".
4. The agent sees the overload report, and records that S1 is overloaded by the value in the Reduction-Percentage AVP. It begins diverting the indicated percentage of realm-routed traffic from S1 to S2 and S3. Since it can't divert Destination-Host routed traffic, it forwards the overload report to the client. This effectively delegates the throttling of traffic with Destination-Host:S1 to the client.
5. The client sends another Destination-Realm routed request.
6. The agent selects S2, and forwards the request.
7. It turns out that S2 is also overloaded, perhaps due to all that traffic it took over for S1. S2 returns a successful answer containing an overload report. Since this report only applies to S2, the ReportType is "Destination-Host".
8. The agent sees that S2 is also overloaded by the value in Reduction-Percentage. This value is probably different than the value from S1's report. The agent diverts the remaining traffic to S3 as best as it can, but it calculates that the remaining capacity across all three servers is no longer sufficient to handle all of the realm-routed traffic. This means the realm itself is overloaded. The realm's overload percentage is most likely different than that for either S1 or S2. The agent forwards S2's report back to the client in the Diameter answer. Additionally, the agent generates a new report for the realm of "realm", and inserts that report into the answer. The client throttles requests with Destination-Host:S1 at one rate, requests with Destination-Host:S2 at another rate, and requests with no Destination-Host AVP at yet a third rate. (Since S3 has not indicated overload, the client does not throttle requests with Destination-Host:S3.)
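This specification does not define how an agent calculates a realm-level reduction. Purely as an illustration, the following sketch derives one from per-server capacities and the reduction percentages the servers reported. Everything in it is an assumption: the capacity figures, the function names, and the simplification that only realm-routed traffic competes for the remaining capacity.

```python
import math
from typing import Dict


def remaining_capacity(capacity: Dict[str, float],
                       reduction: Dict[str, int]) -> float:
    """Sum the capacity left after each server's requested reduction."""
    return sum(cap * (100 - reduction.get(server, 0)) / 100
               for server, cap in capacity.items())


def realm_reduction(offered_realm_load: float,
                    capacity: Dict[str, float],
                    reduction: Dict[str, int]) -> int:
    """Return a reduction percentage for realm-routed traffic (0 if none)."""
    remaining = remaining_capacity(capacity, reduction)
    if offered_realm_load <= remaining:
        return 0  # the servers can still absorb the realm-routed traffic
    shortfall = offered_realm_load - remaining
    # Round up so the requested reduction covers the whole shortfall.
    return math.ceil(100 * shortfall / offered_realm_load)
```

With three servers of equal nominal capacity, S1 reporting 50 and S2 reporting 20, the agent in this sketch would originate a realm report only once the offered realm-routed load exceeds the combined remaining capacity.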

Appendix C. Restructuring of -02 version of the draft

This section captures the initial plan for restructuring the DOIC specification from the -02 version to the new -03 version.

   1. Introduction (non normative)
      -- Existing Text from section 1. --
   2. Terminology and Abbreviations (non normative)
      -- Existing Text from section 2. --
   3. Solution Overview (Non normative)
      -- Existing text from section 3. --
     3.1 Overload Control Endpoints (Non normative)
         -- New text leveraging text from existing section 5.1 --
     3.2 Piggybacking Principle (Non normative)
         -- Existing text from existing section 5.2, with enhancements --
     3.3 DOIC Capability Discovery (Non normative)
         -- New text leveraging text from existing section 5.3 --
     3.4 DOIC Overload Condition Reporting (Non normative)
         -- New text --
     3.5 DOIC Extensibility (Non normative)
         -- New text leveraging text from existing Section 5.4 --
     3.5 Simplified Example Architecture (Non normative)
         -- Existing text from section 3.1.6, with enhancements --
     3.6 Considerations for Applications Integrating the DOIC Solution (Non normative)
         -- New text --
       3.6.1. Application Classification  (Non normative)
              -- Existing text from section 3.1.1 --
       3.6.2. Application Type Overload Implications  (Non normative)
              -- Existing text from section 3.1.2 --
       3.6.3. Request Transaction Classification  (Non normative)
              -- Existing text from section 3.1.3 --
       3.6.4. Request Type Overload Implications  (Non normative)
              -- Existing text from section 3.1.4 --
   4. Solution Procedures (Normative)
     4.1 Capability Announcement (Normative)
        -- Existing text from section 5.3 --
       4.1.1. Reacting Node Behavior (Normative)
            -- Existing text from section 5.3.1 --
       4.1.2. Reporting Node Behavior  (Normative)
            -- Existing text from section 5.3.2 --
       4.1.3. Agent Behavior  (Normative)
            -- Existing text from section 5.3.3 --
     4.2. Overload Report Processing (Normative)
       4.2.1. Overload Control State (Normative)
            -- Existing text from section 5.5.1 --
       4.2.2. Reacting Node Behavior  (Normative)
            -- Existing text from section 5.5.2 --
       4.2.3. Reporting Node Behavior  (Normative)
            -- Existing text from section 5.5.3 --
       4.2.4. Agent Behavior  (Normative)
            -- Existing text from section 5.5.4 --
     4.3. Protocol Extensibility (Normative)
        -- Existing text from section 5.4 --
   5. Loss Algorithm (Normative)
      -- New text pulling from information spread through the document --
     5.1. Overview (Non normative)
          -- New text pulling from information spread through the document --
     5.2. Reporting Node Behavior (Normative)
          -- New text pulling from information spread through the document --
     5.3. Reacting Node Behavior (Normative)
          -- New text pulling from information spread through the document --
   6. Attribute Value Pairs (Normative)
      -- Existing text from section 4. --
     6.1. OC-Supported-Features AVP
          -- Existing text from section 4.1 --
     6.2. OC-Feature-Vector AVP
          -- Existing text from section 4.2 --
     6.3. OC-OLR AVP
          -- Existing text from section 4.3 --
     6.4. OC-Sequence-Number AVP
          -- Existing text from section 4.4 --
     6.5. OC-Validity-Duration AVP
          -- Existing text from section 4.5 --
     6.6. OC-Report-Type AVP
          -- Existing text from section 4.6 --
     6.7. OC-Reduction-Percentage AVP
          -- Existing text from section 4.7 --
     6.8. Attribute Value Pair flag rules
          -- Existing text from section 4.8 --
   7. Error Response Codes
          -- New text based on resolution of issue --
   8. IANA Considerations
      -- Existing text from section 7. --
     8.1. AVP codes
          -- Existing text from section 7.1 --
     8.2. New registries
          -- Existing text from section 7.2 --
   9. Security Considerations
       -- Existing text from section 8. --
     9.1. Potential Threat Modes
           -- Existing text from section 8.1 --
     9.2. Denial of Service Attacks
           -- Existing text from section 8.2 --
     9.3. Non-Compliant Nodes
           -- Existing text from section 8.3 --
     9.4. End-to-End Security Issues
           -- Existing text from section 8.4 --
   10. Contributors
   11. References
     11.1. Normative References
     11.2. Informative References
   Appendix A. Issues left for future specifications
     A.1. Additional traffic abatement algorithms
     A.2. Agent Overload
     A.3. DIAMETER_TOO_BUSY clarifications
     A.4. Per reacting node reports
   Appendix B. Examples
     B.1. Mix of Destination-Realm routed requests and
           Destination-Host routed requests
   Authors' Addresses

Authors' Addresses

Jouni Korhonen (editor)
Broadcom
Porkkalankatu 24
Helsinki, FIN-00180
Finland
EMail: jouni.nospam@gmail.com

Steve Donovan (editor)
Oracle
7460 Warren Parkway
Frisco, Texas 75034
United States
EMail: srdonovan@usdonovans.com

Ben Campbell
Oracle
7460 Warren Parkway
Frisco, Texas 75034
United States
EMail: ben@nostrum.com

Lionel Morand
Orange Labs
38/40 rue du General Leclerc
Issy-Les-Moulineaux Cedex 9, 92794
France
Phone: +33145296257
EMail: lionel.morand@orange.com