Internet DRAFT - draft-campbell-dime-overload-issues
Network Working Group B. Campbell
Internet-Draft Tekelec
Intended status: Informational July 15, 2013
Expires: January 16, 2014
Diameter Overload Control Solution Issues
draft-campbell-dime-overload-issues-01
Abstract
The Diameter Maintenance and Extensions (DIME) working group has
undertaken an "overload control" work item, with the goal of
standardizing a mechanism to allow Diameter nodes to report overload
information among themselves. Requirements currently include, among
others, the need to accurately report the scope of overload
conditions, and the ability to report overload information between
nodes that are not directly connected at the transport layer. These
requirements introduce complex issues. This document describes those
issues, in the hope that it will assist the working group's decision
process.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 16, 2014.
Copyright Notice
Copyright (c) 2013 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
Campbell Expires January 16, 2014 [Page 1]
Internet-Draft Diameter Overload Control Solution Issues July 2013
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Document Conventions . . . . . . . . . . . . . . . . . . . . 4
3. Non-adjacent Overload Information . . . . . . . . . . . . . . 4
3.1. Use-Cases for Non-adjacent Overload Control . . . . . . . 5
3.1.1. Interconnect . . . . . . . . . . . . . . . . . . . . 5
3.1.2. Non-Supporting Agents . . . . . . . . . . . . . . . . 6
3.2. Issues with Non-Adjacent Overload Control . . . . . . . . 6
3.2.1. Topology Issues . . . . . . . . . . . . . . . . . . . 6
3.2.2. Support Negotiation . . . . . . . . . . . . . . . . . 7
3.2.3. Overload Report Delivery . . . . . . . . . . . . . . 8
3.2.4. Non-Adjacent Overload Scopes . . . . . . . . . . . . 9
3.3. Non-adjacent Overload Control Recommendations . . . . . . 11
4. Overload Scopes . . . . . . . . . . . . . . . . . . . . . . . 12
4.1. Explicit vs Implicit Indication of Scopes . . . . . . . . 13
4.2. Types of Overload Scopes . . . . . . . . . . . . . . . . 14
4.2.1. Connection Scope-Type . . . . . . . . . . . . . . . . 14
4.2.2. Peer Scope-Type . . . . . . . . . . . . . . . . . . . 15
4.2.3. Destination-Host Scope-Type . . . . . . . . . . . . . 15
4.2.4. Origin-Host Scope-Type . . . . . . . . . . . . . . . 16
4.2.5. Diameter-Application Scope-Type . . . . . . . . . . . 16
4.2.6. Destination-Realm Scope-Type . . . . . . . . . . . . 16
4.2.7. Session Scope-Type . . . . . . . . . . . . . . . . . 17
4.2.8. Session-Group Scope-Type . . . . . . . . . . . . . . 18
4.3. Scope Values . . . . . . . . . . . . . . . . . . . . . . 18
4.4. Combining Scopes . . . . . . . . . . . . . . . . . . . . 18
4.5. Scope Extensibility . . . . . . . . . . . . . . . . . . . 19
4.6. Scope Recommendations . . . . . . . . . . . . . . . . . . 19
5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 19
6. Security Considerations . . . . . . . . . . . . . . . . . . . 19
7. References . . . . . . . . . . . . . . . . . . . . . . . . . 20
7.1. Normative References . . . . . . . . . . . . . . . . . . 20
7.2. Informative References . . . . . . . . . . . . . . . . . 20
Appendix A. Contributors . . . . . . . . . . . . . . . . . . . . 20
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 20
1. Introduction
When a Diameter [RFC6733] server or agent becomes overloaded, it
needs to be able to gracefully reduce its load, typically by
requesting other nodes to reduce the number of Diameter requests for
some period of time.
The Diameter Overload Control Requirements
[I-D.ietf-dime-overload-reqs] describe requirements for overload
control mechanisms. Requirement 31 states that Diameter nodes must
be able to report overload with sufficient granularity to avoid
forcing available capacity to go unused. Requirement 34 requires the
ability to report overload across Diameter nodes that do not support
the mechanism. These requirements introduce significant and
interrelated complexities to potential solutions. This document
describes the related issues. The author hopes that this document
will assist the working group's decision process related to these
requirements.
At the time of this writing, there have been two proposals for
Diameter overload control solutions. "A Mechanism for Diameter
Overload Control" (MDOC) [I-D.roach-dime-overload-ctrl] defines a
solution that piggybacks overload and load state information over
existing Diameter messages. "The Diameter Overload Control
Application" (DOCA) [I-D.korhonen-dime-ovl] defines a solution that
uses a new dedicated Diameter application to communicate similar
information.
While there are significant differences between the two proposals,
they carry similar information. In many ways, the issues related
to Requirements 31 and 34 apply to both proposals. This
discussion is not specific to one proposal or the other, unless
explicitly mentioned.
This document serves two purposes. The primary purpose is to explore
the issues related to Requirement 34, that is, the requirement for
the overload control mechanism to support sending load and overload
information across intermediaries that do not support the mechanism
(referred to herein as "non-adjacent" overload reporting). The
document describes two use cases for non-adjacent overload reporting.
It does not, however, attempt to describe the use cases for Diameter
agents in general. For a more thorough treatment of Diameter agent
use cases in the context of overload control, please see
[I-D.ietf-dime-overload-reqs].
The secondary purpose is to help the reader understand the concept of
overload scopes, and make recommendations about what kinds of
overload scope should be supported by the mechanism. These purposes
are interrelated, since an understanding of overload scopes is
necessary to fully understand some of the issues with non-adjacent
overload reporting.
2. Document Conventions
This document uses terms defined in [RFC6733] and
[I-D.ietf-dime-overload-reqs]. In particular, the terms "client",
"server", "upstream", and "downstream" are used as defined in RFC
6733. In addition, this document uses the following terms:
Overload: A condition where a Diameter node needs a reduction in the
number of requests that it must handle.
Overload Report: A request to reduce traffic that contributes to an
overload condition.
Overload Scope: A classifier that defines the set of requests that
may contribute to particular overload conditions.
Alternatively, the purposes for which a node may be
overloaded. For example, if a server is overloaded for the
purposes of one Diameter application but not another, the
overload condition can be considered "scoped" to that
application.
Reporting Node: The node that sends an overload report. Also known
as an "overloaded node".
Reacting Node: A node that consumes and possibly acts on an overload
report.
Adjacent Overload Reporting: Overload reports exchanged between
adjacent Diameter peers.
Non-Adjacent Overload Reporting: Overload reports sent between
Diameter nodes separated by one or more intermediate
Diameter agents (i.e. relays or proxies).
Piggybacked Overload Reporting: The inclusion of overload reports in
existing Diameter messages.
Application-Based Overload Reporting: The sending of overload
reports in a separate, dedicated Diameter application.
3. Non-adjacent Overload Information
Requirement 34 of [I-D.ietf-dime-overload-reqs] says that the
selected Diameter overload control mechanism "SHOULD" be able to
communicate overload and load information across intermediaries that
do not support the mechanism. This requirement introduces a number
of complications to the solution effort, affecting how Diameter
nodes negotiate support for overload control, address and route
overload reports to the right places, and act on received overload
reports.
While the requirement does not explicitly say it, we interpret
"intermediaries" in this context to mean Diameter agents. The
requirement is irrelevant for lower layer intermediaries (e.g.
routers), and cannot be reasonably applied for non-Diameter entities,
or hybrid entities such as gateways between Diameter and other
protocols.
The requirement to traverse non-supporting intermediaries is not
necessarily the same thing as a requirement for end-to-end
communication of overload reports between Diameter clients and
servers. Non-adjacent reporting can include client-to-server
scenarios. It can also include server-to-agent scenarios and
agent-to-client scenarios. All such scenarios may include one or
more intervening agents. Since Diameter allows transactions to be
sent from server to client, all scenarios may be reversed.
Therefore, we refer to this requirement as "Non-adjacent Overload
Control".
3.1. Use-Cases for Non-adjacent Overload Control
There are two primary use-cases for non-adjacent overload control.
3.1.1. Interconnect
The first significant non-adjacent use-case is the interconnect
scenario described in section 2.3 of the overload control
requirements [I-D.ietf-dime-overload-reqs]. Two or more Diameter
network operators communicate with each other across a third-party
interconnect provider that brokers Diameter traffic between the
operators. Figure 1 illustrates the interconnect use case.
            +-------------------------------------------+
            |               Interconnect                |
            |                                           |
            |   +--------------+      +--------------+  |
            |   |    Agent     |------|    Agent     |  |
            |   +--------------+      +--------------+  |
            |         .'                      `.        |
            +------.-'--------------------------`.------+
                 .'                               `.
              .-'                                   `.
------------.'-----+                             +----`.------------
    +----------+   |                             |   +----------+
    |Edge Agent|   |                             |   |Edge Agent|
    +----------+   |                             |   +----------+
                   |                             |
    Operator 1     |                             |   Operator 2
-------------------+                             +------------------
Figure 1: Two Operator Interconnect Scenario
If the interconnect provider does not support Diameter overload
control, each operator network becomes an island of overload control,
similar to those in the non-supporting agent use-case
(Section 3.1.2). Even if the interconnect provider does support
overload control, the operators may not trust it to generate and act
on overload reports on the operators' behalf, and may prefer to
exchange overload and load information directly with each other.
The interconnect use-case may introduce additional security concerns.
While the non-supporting agent use case typically (but not
necessarily) occurs inside a single administrative domain, the
interconnect case will almost always involve sending overload reports
across multiple administrative domains. Since a malicious or
incorrect overload report can effectively shut down Diameter
processing, the current lack of a viable solution for end-to-end
integrity protection of Diameter messages may be a problem.
3.1.2. Non-Supporting Agents
[I-D.ietf-dime-overload-reqs] requires the solution to function in
networks where not all Diameter elements support it. That is, the
solution must allow gradual deployment, and must not require a flag-
day cutover. If non-adjacent overload control is not supported, one
or more non-supporting Diameter Agents can divide a network into
overload control islands, where overload information is communicated
inside each island, but not among separate islands.
In the author's strictly personal opinion, the non-supporting
agent use case is less compelling than the interconnect case. The
non-supporting agent case would typically occur inside one
administrative domain. The operator of that domain has
considerably more control over the implementations used in the
domain than it might have for third-party domains.
3.2. Issues with Non-Adjacent Overload Control
3.2.1. Topology Issues
Many of the issues with non-adjacent overload control derive from the
fact that a Diameter node is unlikely to know the topology of the
Diameter network past its immediate peers. In a trivial topology,
that is, a Diameter network with only clients and servers, this is
not a problem. But if the immediate peer is a Diameter agent, a node
is unlikely to know what next hop the agent will select for a given
Diameter message. This is particularly difficult if the agent hides
topology in either direction, or uses dynamic peer discovery. While
a node may be able to infer the path a given message will take in
some specific cases (e.g. for mid-session messages), it cannot do
this in general. And even those specific cases may fail if an agent
on the message path performs topology hiding.
This lack of topology knowledge impacts the way that nodes can
negotiate overload-control support, the ways they send overload
reports, and the ways a reacting node can act to mitigate overload.
A non-adjacent overload-control mechanism will need to solve the
topology issues, either by offering ways to discover non-adjacent
topologies, or offering ways to constrain overload-control relevant
parts of such topologies in ways where a node could reasonably know
them in advance.
3.2.2. Support Negotiation
Diameter nodes need to negotiate or otherwise indicate their support
for overload control to other nodes. This includes indicating
support for overload control in general, as well as potentially
indicating support of certain parameters of the overload control
solution. For example, a node may need to indicate which overload
algorithms it supports. This becomes complex if two non-adjacent
nodes need to negotiate support.
In a Diameter application-based solution, support for the overload
control application would occur during the capabilities exchange
between peers. Diameter capabilities exchange occurs strictly
between peers; Diameter offers no mechanism for indicating support of
a given Application-ID between non-adjacent nodes.
Diameter allows non-negotiated use of an arbitrary Application-Id
between non-adjacent nodes across Diameter agents that implement the
Diameter Relay application. In theory, this means that an
application-based, non-adjacent overload control could only traverse
Diameter relays, or Diameter proxies that explicitly support the
overload-control Application-Id. In the latter case, we assume that
a proxy will not indicate support for the overload-control
Application-Id unless it supports the overload-control mechanism;
such a proxy cannot be considered a non-supporting agent.
In practice, a Diameter agent can act as a proxy for some purposes
and a relay for others. If a Diameter proxy indicates support for
the Diameter relay application, we assume that it will relay any
arbitrary application. This means it can be considered a relay for
the purposes of overload control.
For both application-based and piggybacked solutions, a supporting
node needs to know the other nodes with which it should negotiate. For
overload-control between Diameter peers, this is easy; a node
exchanges support information with its immediate peers. But for non-
adjacent overload control, this is more difficult for reasons
discussed in Section 3.2.1.
Therefore, for non-adjacent overload control negotiation, each
supporting node either needs advance knowledge of all nodes with
which it may negotiate overload-control support, or it needs a
mechanism for discovering that knowledge dynamically.
3.2.3. Overload Report Delivery
With adjacent overload control reporting, overload report addressing
and delivery is relatively simple. A node sends overload reports
directly to its peers. This becomes more complex for non-adjacent
overload-control.
For application-based overload control, nodes could address overload
reports to specific endpoint nodes using the Destination-Host AVP.
Doing so would be subject to the same non-adjacent topology issues
described in Section 3.2.1. That is, a node can only send overload
reports to non-adjacent clients or servers that it knows about,
either from prior knowledge (i.e. provisioning) or from which it has
observed previous Diameter messages.
An application-based mechanism could possibly address reports to non-
adjacent Diameter agents using the Destination-Host AVP. This would
effectively make the agent into an endpoint for the overload-control
application.
A piggy-backed mechanism will have more difficulty addressing non-
adjacent overload reports. A piggy-backed mechanism sends overload
reports in already existing Diameter requests; that is, requests that
have their own purposes and destinations independent of the overload-
report. Thus, nodes can only select the destination of an overload
report by bundling it into a Diameter message that was already going
to that destination. While a piggy-backed mechanism might be able to
send overload-reports across quiescent transport connections using
watchdog (DWR/DWA) messages, these messages cannot be exchanged
between non-adjacent nodes.
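The delivery constraint on a piggy-backed mechanism can be
illustrated with a minimal Python sketch. The class, method, and
field names below are hypothetical and are not taken from MDOC or
DOCA; the sketch only shows that a pending report stays queued until
some existing message happens to be sent to the same destination.

```python
from collections import defaultdict, deque


class PiggybackSender:
    """Holds overload reports until a message to the same destination is sent."""

    def __init__(self):
        # Pending reports, keyed by the destination they are meant for.
        self.pending = defaultdict(deque)

    def queue_report(self, destination, report):
        """Remember a report; it cannot be delivered on its own."""
        self.pending[destination].append(report)

    def send(self, destination, message):
        """Return the outgoing message, with any pending reports attached."""
        outgoing = dict(message)
        reports = self.pending.pop(destination, None)
        if reports:
            outgoing["overload_reports"] = list(reports)
        return outgoing


sender = PiggybackSender()
sender.queue_report("client1.example.com", {"reduction": 0.5})

# A message bound for another node cannot carry the report for client1.
to_other = sender.send("client2.example.com", {"command": "existing-message"})

# The report leaves only when a message to client1 happens to be sent.
to_client1 = sender.send("client1.example.com", {"command": "existing-message"})
```

If no message is ever sent to "client1.example.com", the queued
report is never delivered, which is the quiescent-connection problem
discussed in this section.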
In some cases, the limit of sending overload reports to
destinations to which existing traffic is bound may be acceptable.
If a node is contributing to an overload condition, then it's
reasonable to assume that node is regularly exchanging traffic
with the overloaded node. However, there may be cases where an
overload report causes a connection to become quiescent. If the
reporting node needed to tell a reacting node that the condition
has resolved or improved, it would need to send a new report
across the now quiescent connection. There may also be cases
where a reacting node redirects traffic along a different path,
causing a previously quiescent node to suddenly start sending
requests to the overloaded node. Thus, without careful selection
of the overload report scope, an overloaded node may find itself
engaged in a game of Whack-a-Mole [Whac-a-Mole] with previously
quiescent non-adjacent nodes.
For both piggy-backed and application-based solutions, non-adjacent
overload control introduces a need to identify the sender of a
report, or at least determine whether the report is from an adjacent
or non-adjacent node. This is not required for purely adjacent
solutions, since the sender could always be assumed to be the peer.
For example, a non-adjacent report with a "Connection" scope does not
make sense. If a node receives one, it should ignore it. But in
order to make that decision, it must be able to distinguish a non-
adjacent report from an adjacent one. For example, in an
application-based mechanism,
3.2.4. Non-Adjacent Overload Scopes
A reacting node will typically attempt to mitigate an overload
condition by either reducing the number of requests that contribute
to the condition, or by rerouting part of that traffic to avoid the
problem. In both cases, the reacting node is limited by its
ability to determine which Diameter requests contribute to the
overload condition in the first place. The overload scope concept
(Section 4) offers a way for overloaded nodes to indicate what
traffic is likely to contribute to an overload condition and should
be abated.
Not all of the scope-types described in Section 4 make sense for non-
adjacent overload control. The "Connection" scope-type is an obvious
example, since the reacting node will never share a transport
connection with a non-adjacent node; this is the very definition of
non-adjacent nodes.
Since a Diameter node cannot control how requests are forwarded to
non-adjacent nodes, the "Peer" scope-type also does not work well,
especially when there are multiple possible destinations up or
downstream from the adjacent peer. For example in Figure 2, Node A
sends Diameter requests to Nodes B and C across a non-supporting
agent. If Node B becomes overloaded but Node C does not, Node A
cannot reroute requests to Node C, since it has very little way to
influence where the agent will forward any given request. If Node A
tries to reduce traffic by 50%, the agent will likely still send half
of the remaining traffic to Node B. If B and C are endpoints, Node A
may in some cases be able to use the Destination-Host AVP for this
purpose (in which case the "Destination-Host" scope-type would be
more appropriate), but this does not help if B and C are also agents
rather than servers.
+--------+       +--------+
| Node B |       | Node C |
+----+---+       +---+----+
     |               |
     +-------+-------+
             |
     +-------+--------+
     | Non-Supporting |
     |     Agent      |
     +-------+--------+
             |
             |
        +----+----+
        | Node A  |
        +---------+
Figure 2: Non-Adjacent Routing
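The effect described for Figure 2 can be sketched numerically. The
sketch below assumes, purely for illustration, that the
non-supporting agent splits requests evenly between Nodes B and C
and that Node A offers 100 requests per interval; neither assumption
comes from the requirements, and the function name is hypothetical.

```python
def agent_forward(requests, next_hops):
    """Split requests evenly across next hops (assumed agent behavior)."""
    share, remainder = divmod(requests, len(next_hops))
    return {
        hop: share + (1 if index < remainder else 0)
        for index, hop in enumerate(next_hops)
    }


offered = 100  # requests per interval from Node A (illustrative value)

# Before Node A reacts, the agent spreads the load over both nodes.
before = agent_forward(offered, ["B", "C"])

# Node A halves its traffic in response to Node B's overload report,
# but the agent still sends half of what remains to Node B.
after = agent_forward(offered // 2, ["B", "C"])
```

Under these assumptions, Node C loses as much traffic as Node B even
though it is not overloaded, because Node A has no way to steer the
remaining requests toward it.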
Scope-types that classify traffic by origin or final destination,
such as "Origin-Host", "Destination-Realm", "Application-ID", and
"Destination-Host", can be used for non-adjacent overload control. In
general, scope-types that may denote non-adjacent intermediary
devices, such as "Peer", cannot, nor can scope-types that refer only
to peers, e.g. "Connection".
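A reacting node could apply this rule with a simple lookup, as in
the following sketch. The table reflects only the scope-types named
in this section. Treating every scope-type as acceptable in an
adjacent report, and ignoring any unlisted scope-type in a
non-adjacent report, are assumptions made for the example; the
function and variable names are hypothetical.

```python
# Scope-types this section says can (True) or cannot (False) be used
# for non-adjacent overload control.
NON_ADJACENT_USABLE = {
    "Origin-Host": True,
    "Destination-Realm": True,
    "Application-ID": True,
    "Destination-Host": True,
    "Peer": False,
    "Connection": False,
}


def should_act(scope_type, report_is_adjacent):
    """Decide whether a reacting node should act on a report."""
    if report_is_adjacent:
        # Assumption: every scope-type is acceptable from a direct peer.
        return True
    # Assumption: unlisted scope-types are ignored when non-adjacent.
    return NON_ADJACENT_USABLE.get(scope_type, False)


ignore_connection = not should_act("Connection", report_is_adjacent=False)
accept_host = should_act("Destination-Host", report_is_adjacent=False)
```

Note that the check depends on the reacting node already knowing
whether the report came from an adjacent node.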
Even for destination-oriented scope-types, the sender of an overload
report must be authoritative for the indicated scope. That is, it
must have full knowledge of the congestion state for the scope. For
example, if Node B and C both serve the realm "example.com", and B
becomes 50% overloaded while C does not, B cannot simply report 50%
overload at realm scope. If it did, Node A would reduce its
generated traffic by 50%. Assuming B and C normally share the load
equally, the realm as a whole needs only a 25% reduction, so this
would leave the realm operating beneath available capacity.
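The arithmetic in this example can be checked with a short sketch.
It assumes, for illustration only, that Nodes B and C have equal
capacity and normally receive equal shares of the realm's traffic.
Under those assumptions, B's 50% overload corresponds to a 25%
realm-wide reduction, which leaves the realm able to handle 75% of
its normal traffic. The function name is hypothetical.

```python
def realm_reduction(needed, share):
    """Realm-wide reduction implied by per-node reductions.

    needed: fraction of its own traffic each node must shed
    share:  fraction of the realm's traffic each node normally receives
    """
    return sum(needed[node] * share[node] for node in needed)


needed = {"B": 0.50, "C": 0.00}  # B is 50% overloaded, C is not
share = {"B": 0.50, "C": 0.50}   # assumed equal split of realm traffic

realm_wide = realm_reduction(needed, share)

# If B reported 50% at realm scope, Node A would shed this much more
# traffic than the realm as a whole actually requires.
excess = 0.50 - realm_wide
```

With a different traffic split the numbers change, but a single
node's own overload level still differs from the realm-wide figure
whenever the other nodes serving the realm are loaded differently.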
The need to be authoritative for an indicated scope is also true
for strictly adjacent reporting mechanisms. But in an adjacent
mechanism, it is easier for an intervening agent to learn the
overload state of upstream nodes. In the example, if the agent
supported the overload control mechanism, it would most likely
receive reports from Nodes B and C, and could then construct
downstream reports that incorporate the state of B, C, and its own
local state. This contrasts with the non-adjacent case where B
must understand the current state of C even though it is not in
the path of overload reports from C.
Therefore, a given node must only report overload for scopes for
which it has full knowledge of the load and overload state. That is,
it must be a "scope authority" for any scope it reports. In the
example, nodes B and C (and any other nodes serving "example.com")
would be required to share current load and overload state. The
state-sharing requirement could be substantial for high-capacity
nodes.
When a node reports overload for a certain scope, reacting nodes will
treat the overload condition as uniform across the entire scope. For
example, if a node reports overload for an entire realm, reacting
nodes will reduce traffic equally for all servers that serve that
realm. If the servers are unequally overloaded, they must use a more
granular scope-type, for example, "Destination-Host".
3.3. Non-adjacent Overload Control Recommendations
An adjacent reporting mechanism allows for very flexible and fine-
grained overload control. It solves or simplifies a number of
issues, such as negotiation of support and parameters, requirements
for topology knowledge, end-to-end security, etc., by avoiding them in
the first place. Adding non-adjacent support to such a mechanism
would complicate it considerably.
Non-adjacent overload control mechanisms are better for connecting
islands of overload control. Such a mechanism works well for larger
scopes and relatively static topologies.
The author believes that we are unlikely to find a single solution
that works well for both adjacent and non-adjacent overload control.
While a single solution is more desirable in general, a single
solution that works well for both cases is likely to be extremely
complicated. Therefore, the working group should consider a separate
mechanism for the non-adjacent delivery of overload reports.
If the group chooses to accept two separate solutions, we should be
able to specify a single data model and set of AVPs that work for
both, with some restrictions. (For example, the non-adjacent
solution would likely forbid the use of the "Connection" scope-type.)
If the working group chooses to add non-adjacent features to MDOC or
DOCA, we will need to change the support negotiation mechanisms to
allow for the non-adjacent case, specify how a node can determine
whether a report is adjacent or non-adjacent, and state what subset
of scope-types are allowed in non-adjacent reports. We will also
need to study how we can meet the security-related requirements
[I-D.ietf-dime-overload-reqs] given the current lack of end-to-end
security features in Diameter.
4. Overload Scopes
Diameter overload does not necessarily affect all kinds of Diameter
traffic. A node may become overloaded for some requests but not
others. For example, a Diameter agent may handle requests for more
than one Diameter Application, and may route requests to a different
set of servers for each application. If one server set becomes
overloaded, but the other does not, then the agent itself is
effectively overloaded for one application, but can process the other
at normal capacity.
The Diameter overload requirements [I-D.ietf-dime-overload-reqs] list
several scenarios that illustrate overload that affects some requests
but not others. We refer to the set of requests affected by a
particular overload event as the "scope" of the overload event. The
overload requirements require the mechanism to be able to send
overload reports that are "scoped" to (that is, they affect requests
targeted to) a particular Diameter node, a Realm, or a Diameter
Application.
The concept of scope may also be useful when applied to reported
load even without an overload condition. This usage is out of
"scope" for this document.
A scope indication in an overload report is a set of classifiers that
identify requests likely to contribute to the overload condition. In
general, this could include any aspect of a Diameter message that a
reacting node can observe. For example, requests could be classified
by Attribute Value Pair (AVP) values or next-hop routing decisions.
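As a minimal illustration of a classifier, the sketch below matches
requests against a single scope consisting of a scope-type and a
value. Requests are modeled as plain dictionaries keyed by AVP name,
which is an assumption made for the example and not a wire format.
How several classifiers combine is not covered here, so the sketch
handles only one. The function name is hypothetical.

```python
def matches_scope(request, scope_type, scope_value):
    """True if the request carries the named AVP with the given value."""
    return scope_type in request and request[scope_type] == scope_value


requests = [
    {"Destination-Realm": "example.com", "Destination-Host": "b.example.com"},
    {"Destination-Realm": "example.com", "Destination-Host": "c.example.com"},
    {"Destination-Realm": "example.net"},
]

# Requests that fall within a "Destination-Host" scope for Node B.
host_scoped = [
    r for r in requests
    if matches_scope(r, "Destination-Host", "b.example.com")
]

# Requests that fall within a "Destination-Realm" scope for the realm.
realm_scoped = [
    r for r in requests
    if matches_scope(r, "Destination-Realm", "example.com")
]
```

The realm scope matches more requests than the host scope, which is
why the choice of scope-type determines how much traffic a reacting
node abates.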
The ability to express the scope of an overload condition is only
useful when reacting nodes can act on the information. There are
only a small number of actions a reacting node may take to mitigate
overload. Essentially these actions boil down to reducing the number
of requests that "match" the scope, either by sending fewer requests
in the first place, or by routing around the problem. The former is
limited by the node's ability to distinguish between requests that
match the overload scope, and requests that do not. The latter is
limited by the node's ability to predict or influence how a request
will be routed.
Reacting nodes will most likely take additional application-specific
actions to mitigate overload conditions. If a client reduces the
number of messages it sends, it almost certainly has to take
additional application-specific steps that affect its own client
application. Depending on the application, it might refuse some
client application requests, redirect some of its own clients to
different services (e.g. offloading mobile data sessions to local
WiFi networks), or assert an overload condition in the client
application protocol (e.g. the Session Initiation Protocol (SIP)).
This section discusses the meanings of the required scope-types, and
analyses their implications for the selected mechanism.
4.1. Explicit vs Implicit Indication of Scopes
Both MDOC and DOCA use explicit scope indication. That is, the scope
of an overload report is not, in general, implied by the type of
message that carries the report. For example, if an overload report
is scoped to a particular Diameter Application-Id, the report
explicitly indicates the affected Application-Id, rather than leaving
the reacting node to infer the Application-Id based on that of the
message that carries the report. There are a few exceptions to this;
for example MDOC supports a "Connection" scope that, when specified,
pertains to requests to be sent over the same transport connection
over which the overload report arrived.
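The difference between explicit and implicit indication can be
sketched as follows. The report fields, the function name, and the
example Application-Id and connection values are hypothetical and
are not taken from MDOC or DOCA.

```python
def resolve_scope(report, carrier_application_id, arrival_connection):
    """Return (scope_type, value) for a received overload report."""
    if report["scope_type"] == "Connection":
        # The implicit case: the value is the transport connection on
        # which the report arrived, not something carried in the report.
        return ("Connection", arrival_connection)
    # Explicit indication: use the value named in the report itself.
    # The Application-Id of the carrying message
    # (carrier_application_id) is deliberately not consulted.
    return (report["scope_type"], report["scope_value"])


# A report carried in a message of application 1 that is explicitly
# scoped to application 4 (arbitrary example values).
report = {"scope_type": "Application-Id", "scope_value": 4}
app_scope = resolve_scope(report, carrier_application_id=1,
                          arrival_connection="conn-7")

# A "Connection" report takes its value from where it arrived.
conn_scope = resolve_scope({"scope_type": "Connection"},
                           carrier_application_id=1,
                           arrival_connection="conn-7")
```

In the first case the reacting node abates traffic for application
4, even though the report arrived in a message belonging to
application 1.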
List discussions have shown a common assumption that overload
reports sent over a piggy-backed solution such as MDOC would only
affect requests associated with the same Diameter Application-Id.
For MDOC, this is a false assumption. MDOC's explicit use of
scopes allows overload reports sent over one application to affect
requests for any arbitrary application. On the other hand,
solutions that use a dedicated Application-Id (such as DOCA)
necessarily require the ability to report overload for arbitrary
applications; otherwise it would only be possible for an overload
control application to report overload on itself.
Some list participants have suggested that the solution include a
concept of a default scope, that is, a scope that is implied if no
other scope is explicitly indicated. The concept of default or
implicit scopes requires further study by the working group.
4.2. Types of Overload Scopes
There are several different kinds, or types, of overload scopes. The
type of a scope defines how the reacting node interprets it. Table 1
gives a summary of the scope types discussed in this document. The
"Scope Type" column gives the name of the scope. The "Affected
Traffic" column describes what Diameter requests are impacted by the
scope-type. The "Reacting-Node" column describes which Diameter
nodes may be able to take action on an overload report with the
respective scope-type. Finally, the "Draft" column describes which
proposed solution includes the respective scope-type.
+------------------+-----------------------+---------------+--------+
| Scope Type       | Affected Traffic      | Reacting-Node | Draft  |
+------------------+-----------------------+---------------+--------+
| Connection       | Requests sent         | Adjacent Peer | MDOC,  |
|                  | directly to the       |               | DOCA   |
|                  | reporting-node on a   |               |        |
|                  | particular transport  |               |        |
|                  | connection            |               |        |
| Peer             | Requests routed       | Adjacent Peer | MDOC,  |
|                  | directly to           |               | DOCA   |
|                  | reporting-node.       |               |        |
| Destination-Host | Requests with a       | Any           | MDOC   |
|                  | matching Destination- |               |        |
|                  | Host AVP              |               |        |
| Origin-Host      | Requests including a  | Any           | DOCA?  |
|                  | matching Origin-Host  |               |        |
|                  | AVP                   |               |        |
| Diameter         | Requests with a       | Any           | MDOC,  |
| Application      | matching Application- |               | DOCA   |
|                  | Id AVP                |               |        |
| Destination-     | Requests with a       | Any           | MDOC,  |
| Realm            | matching Destination- |               | DOCA   |
|                  | Realm AVP             |               |        |
| Session          | Requests with a       | Any           | MDOC   |
|                  | matching Session-Id   |               |        |
|                  | AVP                   |               |        |
| Session-Group    | Requests belonging to | Any           | MDOC   |
|                  | sessions assigned     |               |        |
|                  | matching labels       |               |        |
+------------------+-----------------------+---------------+--------+
Table 1: Summary of Overload Scope Types
4.2.1. Connection Scope-Type
The "Connection" scope-type indicates that the reacting node should
reduce traffic sent on the transport connection on which it received
the overload report. A Connection scope does not include an
explicit value; rather, it implies "this connection".
4.2.2. Peer Scope-Type
The "Peer" scope-type indicates that a particular Diameter node is
overloaded. Other nodes should mitigate the overload by reducing the
number of requests that will land on the overloaded node, either by
sending fewer requests, or by attempting to route requests around the
overloaded node.
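As an informal illustration of the second option, the following
Python sketch picks a next hop that has not reported overload. The
function name, the peer identifiers, and the notion of an ordered
candidate list are assumptions made for this example; neither MDOC
nor DOCA defines such a procedure.

```python
def select_next_hop(candidates, overloaded_peers):
    """Return the first candidate peer not marked overloaded, or None.

    candidates: peers able to handle the request, in preference order
    (hypothetical routing-table output).
    overloaded_peers: set of peers with an active Peer-scoped report.
    """
    for peer in candidates:
        if peer not in overloaded_peers:
            return peer
    # No alternative route exists; the caller has to send fewer requests.
    return None


overloaded = {"agent1.example.com"}
primary = select_next_hop(
    ["agent1.example.com", "agent2.example.com"], overloaded)
fallback = select_next_hop(["agent1.example.com"], overloaded)
```

A result of None corresponds to the case where rerouting is not
possible and the only remaining mitigation is to reduce the number
of requests sent.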
In both MDOC and DOCA, the "Peer" scope-type is named "Host". In
practice, only immediate peers can act as the reacting node for a
Host scoped overload report. This is due to the fact that non-
adjacent nodes have limited ability to influence routing decisions
beyond the immediate next hop. This document uses the term "Peer"
to illustrate that fact.
Large-scale Diameter nodes are often implemented as clusters of IP
hosts, which may or may not share their knowledge about upstream
overload conditions. Certain IP hosts in a cluster could become
overloaded when others do not. Furthermore, if the reacting-node is
also clustered, it may be difficult for the cluster members to share
real-time knowledge of the reporting-node's overload state. This can
make it difficult for a node to know conclusively whether any two
connections that appear to connect to the same peer can be treated as
such for the purposes of overload control. The working group should
study whether the Peer scope-type should be deprecated in favor of
the "Connection" scope-type.
4.2.3. Destination-Host Scope-Type
The "Destination-Host" scope type pertains to requests that contain a
Destination-Host AVP that matches the indicated Destination-Host
value. Destination-Host always refers to the endpoint for a given
Diameter request.
The best the reacting node can do is reduce the number of requests
that contain a Destination-Host AVP that matches the overloaded node.
Rerouting will not help in general, since the requests will simply
take different routes to arrive at the same overloaded server.
Unless the destination node is also a direct peer, the reacting node
cannot do much about requests that don't contain a Destination-Host
AVP in the first place, since it cannot predict whether these
requests will land on the overloaded endpoint. The Destination-Host
scope type is useful for requests bound to a particular server, for
example, mid-session requests for a session-stateful application.
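The decision described above might be sketched as follows. This is
a minimal illustration only: the dictionary representation of a
request, the field names, and the function name are hypothetical and
do not come from any of the proposed solutions.

```python
def is_throttle_candidate(request, overloaded_host, next_hop=None):
    """Decide whether a request counts against a Destination-Host report.

    request: dict of AVP-like fields (hypothetical representation).
    next_hop: the peer this node would send the request to, if known.
    """
    destination = request.get("destination_host")
    if destination is not None:
        # Explicitly bound to a server: match on the AVP value.
        return destination == overloaded_host
    # No Destination-Host AVP: the outcome is predictable only when the
    # overloaded node is the direct peer the request would be sent to.
    return next_hop == overloaded_host


bound = {"destination_host": "server1.example.com"}
unbound = {"destination_realm": "example.com"}
```

With these inputs, the bound request is a candidate for reduction
when "server1.example.com" reports overload, while the unbound
request is a candidate only if that server is also the next hop.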
4.2.4. Origin-Host Scope-Type
While most scope-types refer to where a request is likely to go, the
"Origin-Host" scope-type refers to where the request originates.
That is, any request with a matching Origin-Host AVP would match.
The Origin-Host scope type is useful for situations where a specific
client or set of clients sends an excessive number of requests. An
overload report with an Origin-Host scope would tell matching clients
to reduce traffic, or agents to throttle requests that came from
matching clients.
Note that the Origin-Host scope-type is not explicitly mentioned
in the requirements document. The authors include it here because
others have mentioned the need in conversation.
4.2.5. Diameter-Application Scope-Type
The "Diameter Application" scope-type indicates overload for a
particular Diameter application. That is, it impacts all requests
with the matching value in an Application-Id AVP.
The Diameter Application scope-type is useful for declaring an
overload condition that affects a specific Diameter service,
typically, but not necessarily, in a specific realm.
Since the Diameter Application scope-type indicates overload for an
entire application, reacting nodes should reduce the number of
requests sent for that application. Similarly to the Realm scope-
type, it will rarely if ever make sense for a Diameter node to
reroute traffic to a different Diameter application.
4.2.6. Destination-Realm Scope-Type
The "Destination-Realm" scope-type indicates overload for all servers
that handle requests for the particular Diameter realm. That is, it
impacts all requests with the particular realm in the Destination-
Realm AVP.
The Realm scope-type is useful for declaring a global overload
condition within a network serving a single realm. It is also useful
for requesting third-parties to reduce Diameter traffic sent to a
particular realm, for example, in roaming scenarios.
Since the Realm scope-type indicates overload for an entire realm,
reacting nodes should reduce the number of messages sent for the
realm. Rerouting traffic does not make sense for the Realm scope
type, since it would probably never be useful for Diameter nodes to
reroute traffic destined for an overloaded realm to a different, non-
overloaded realm. Client applications might, however, be able to
choose to use services from a different operator if the Diameter
realm of one operator reports an overload condition.
MDOC currently makes the Realm scope-type mandatory to implement.
List participants have indicated that there may be use cases where
all Diameter traffic on a network uses the same Realm, and that the
use of the Realm scope-type would be redundant in such networks.
Whether the Realm scope-type should remain mandatory or become
optional to implement requires further study.
4.2.7. Session Scope-Type
MDOC currently includes a "Session" scope-type. This scope-type
refers to messages that include a matching Session-Id. Conceptually,
this applies to all requests that are part of a previously
established session. This scope-type could potentially be useful for
a session-stateful agent that assigns session-establishing requests
to a certain server, and then sends all future requests in that
session to the same server. If that server became overloaded, the
agent could send an overload report scoped to the assigned session.
However, the Session scope-type will become unwieldy for anything
other than very small-scale installations. The number of sessions
assigned to any specific server is likely to be quite large, so an
overload report for that server would need a correspondingly large
number of Session scope values. The working group should consider
deprecating the Session scope-type. For agents that do not hide
topology, the Destination-Host scope-type can be used to affect all
sessions assigned to a particular server. For topology-hiding
agents, the session-group mechanism can do the same.
4.2.8. Session-Group Scope-Type
Diameter agents that implement certain topology-hiding schemes may
modify Origin-Host AVPs inserted by servers, and use some local
mechanism to bind sessions to specific servers. The "Destination-
Host" type may not function correctly in this case. MDOC specifies a
"session-group" scope-type, where an agent or server can assign a
common identifier to sessions that are fate-shared in some way, such
as being bound to the same server. If that server becomes
overloaded, the agent can send an overload report that matches
requests in all sessions with the matching identifier.
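The following Python sketch shows one way the labeling idea could
work. It is illustrative only: the class and function names, the
"group-N" label format, and the way the client stores labels are all
assumptions, and MDOC may encode session-groups quite differently.

```python
class LabelingAgent:
    """Toy agent that labels each session with an opaque group name."""

    def __init__(self):
        self._label_of_server = {}

    def bind(self, server):
        """Return the group label for sessions bound to this server."""
        if server not in self._label_of_server:
            label = f"group-{len(self._label_of_server) + 1}"
            self._label_of_server[server] = label
        return self._label_of_server[server]

    def overload_scope(self, server):
        """Build a (scope-type, scope-value) pair for an overloaded server."""
        return ("Session-Group", self._label_of_server[server])


def affected_sessions(session_labels, scope):
    """Client side: list Session-Ids whose label matches the scope value."""
    scope_type, label = scope
    if scope_type != "Session-Group":
        return []
    return sorted(sid for sid, lab in session_labels.items() if lab == label)


agent = LabelingAgent()
# The client records the label it was given for each session.
session_labels = {
    "sess-1": agent.bind("server-a"),
    "sess-2": agent.bind("server-b"),
    "sess-3": agent.bind("server-a"),
}
scope = agent.overload_scope("server-a")
```

A single scope value covers every session bound to the overloaded
server, and the label does not reveal the server's identity to the
client.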
This scope-type may be useful under certain circumstances, but may
also be complex to implement. Since the mechanism is required to
allow extensible scope-types, session-groups could still be added in
the future. The working group should therefore study whether the
Session-Group scope-type should be included in the base overload
control solution, or omitted from it and left as a potential future
extension.
4.3. Scope Values
Scope labels in an overload report will typically take the form of a
scope-type and a value. For example, if the "example.com" realm is
overloaded for all services, the overload report would indicate a
scope-type of "Realm" and a scope-value of "example.com".
The Connection scope-type is an exception. Since an overload report
with a Connection scope is only actionable by one of the peers
connected via the specified connection, it makes sense to treat the
Connection scope-type as always having a value of "this connection".
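A scope label of this form could be modeled as shown below. This is
a sketch under stated assumptions: the class name, the use of None
to stand for "this connection", and the validation rules are
illustrative choices, not part of any proposed encoding.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScopeLabel:
    """A scope-type plus a value; Connection carries no explicit value."""

    scope_type: str
    value: Optional[str] = None  # None stands for "this connection"

    def __post_init__(self):
        if self.scope_type == "Connection" and self.value is not None:
            raise ValueError("Connection scope carries no explicit value")
        if self.scope_type != "Connection" and self.value is None:
            raise ValueError(f"{self.scope_type} scope requires a value")


realm_scope = ScopeLabel("Realm", "example.com")
connection_scope = ScopeLabel("Connection")
```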
4.4. Combining Scopes
Diameter nodes will commonly need to construct overload reports that
apply to a combination of scopes. For example, if a given realm is
overloaded for a subset of the applications it supports, the report
might indicate both a realm scope and one or more Diameter
application scopes.
Logically, combining multiple scopes of different types reduces the
overall set of requests to which the overload report would apply.
Combining multiple scopes of the same type increases the applicable
set. A function that determines the requests affected by an overload
report could model this as a logical "and" or "intersection" operator
for combining scopes of different types, and a logical "or" or
"union" operator for combining scopes of the same type.
The working group should study whether all possible combinations
should be allowed. For example, it may or may not make sense to
combine a "Connection" scope with other scopes, or to allow more than
one "Connection" scope-value for a single overload report.
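The combination rule described in this section can be sketched in
Python as follows. The request representation, the field names, the
scope-type names, and the function name are assumptions made for the
example. The Connection scope-type is deliberately left out, since
whether it may be combined with other scopes is an open question.

```python
from collections import defaultdict

# Hypothetical mapping from scope-type name to the request field it matches.
SCOPE_FIELD = {
    "Destination-Realm": "destination_realm",
    "Application": "application_id",
    "Destination-Host": "destination_host",
}


def report_applies(scopes, request):
    """Return True if the request falls within the combined scopes.

    scopes: list of (scope_type, value) pairs. Values of the same type
    are OR-ed (union); different types are AND-ed (intersection).
    An empty list matches every request; that is an artifact of this
    sketch, not a statement about default scopes.
    """
    values_by_type = defaultdict(set)
    for scope_type, value in scopes:
        values_by_type[scope_type].add(value)
    return all(
        request.get(SCOPE_FIELD[scope_type]) in values
        for scope_type, values in values_by_type.items()
    )


# One realm, overloaded for two applications (arbitrary example ids).
scopes = [
    ("Destination-Realm", "example.com"),
    ("Application", 4),
    ("Application", 7),
]
```

A request for realm "example.com" matches only if its application
id is 4 or 7; a request for either application in a different realm
does not match.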
4.5. Scope Extensibility
[I-D.ietf-dime-overload-reqs] requires scope-types to be extensible.
This requirement implies that the chosen mechanism or mechanisms must
discuss how new scope-types can be added, how support for specific
scope-types should be declared or negotiated, and which scope-types
might be mandatory to support.
4.6. Scope Recommendations
In the author's opinion, the selected solution or solutions should
support, at a minimum, the "Connection", "Destination-Host", "Realm"
and "Application-ID" scope-types. The working group should consider
also adding the "Origin-Host" scope-type.
The working group should consider whether the advantages of the
"session-group" concept and scope-type are worth the complexity. The
group should also study whether the Peer scope-type adds sufficient
utility over the Connection scope-type to warrant its inclusion.
5. IANA Considerations
This draft makes no requests of IANA.
6. Security Considerations
Overload reports induce Diameter nodes to reduce or reroute traffic.
For large scopes, a single erroneous or malicious overload report
could effectively shut down Diameter processing for an entire realm.
A Diameter overload control solution needs mechanisms to ensure that
overload reports are only accepted from trusted sources, and that
nothing tampers with the reports en route.
For adjacent approaches, the transport connection can be protected
with TLS or IPsec. But this will not help for non-adjacent
reporting, since no such transport connection exists.
While such work is in progress in the DIME working group, Diameter
has no currently viable mechanism for end-to-end authentication and
integrity protection. The working group should consider either
making non-adjacent overload control contingent on a generic Diameter
end-to-end protection mechanism, or adding a specialized protection
mechanism to any resulting non-adjacent overload control solution.
7. References
7.1. Normative References
[RFC6733] Fajardo, V., Arkko, J., Loughney, J., and G. Zorn,
"Diameter Base Protocol", RFC 6733, October 2012.
[I-D.ietf-dime-overload-reqs]
McMurry, E. and B. Campbell, "Diameter Overload Control
Requirements", draft-ietf-dime-overload-reqs-07 (work in
progress), June 2013.
7.2. Informative References
[I-D.roach-dime-overload-ctrl]
Roach, A. and E. McMurry, "A Mechanism for Diameter
Overload Control", draft-roach-dime-overload-ctrl-03 (work
in progress), May 2013.
[I-D.korhonen-dime-ovl]
Korhonen, J. and H. Tschofenig, "The Diameter Overload
Control Application (DOCA)", draft-korhonen-dime-ovl-01
(work in progress), February 2013.
[Whac-a-Mole]
"Whack-a-Mole Colloquial Usage",
<http://en.wikipedia.org/wiki/Whack-a-mole#Colloquial_usage>.
Appendix A. Contributors
Eric McMurry and Robert Sparks made significant contributions to the
concepts in this draft.
Author's Address
Ben Campbell
Tekelec
17210 Campbell Rd.
Suite 250
Dallas, TX 75252
US
Email: ben@nostrum.com