campbell-dime-overload-issues-01.txt

Internet DRAFT - draft-campbell-dime-overload-issues
draft-campbell-dime-overload-issues

Last Version: draft-campbell-dime-overload-issues-01.txt
Date: 16-Jul-2013
Disposition: expired
Previous Versions: draft-campbell-dime-overload-issues-00.txt - 07-Jun-2013





Network Working Group                                        B. Campbell
Internet-Draft                                                   Tekelec
Intended status: Informational                             July 15, 2013
Expires: January 16, 2014


               Diameter Overload Control Solution Issues
                 draft-campbell-dime-overload-issues-01

Abstract

   The Diameter Maintenance and Extensions (DIME) working group has
   undertaken an "overload control" work item, with the goal of
   standardizing a mechanism to allow Diameter nodes to report overload
   information among themselves.  Requirements currently include, among
   others, the need to accurately report the scope of overload
   conditions, and the ability to report overload information between
   nodes that are not directly connected at the transport layer.  These
   requirements introduce complex issues.  This document describes those
   issues, in the hope that it will assist the working group's decision
   process.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 16, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents



Campbell                Expires January 16, 2014                [Page 1]

Internet-Draft  Diameter Overload Control Solution Issues      July 2013


   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Document Conventions
   3.  Non-adjacent Overload Information
     3.1.  Use-Cases for Non-adjacent Overload Control
       3.1.1.  Interconnect
       3.1.2.  Non-Supporting Agents
     3.2.  Issues with Non-Adjacent Overload Control
       3.2.1.  Topology Issues
       3.2.2.  Support Negotiation
       3.2.3.  Overload Report Delivery
       3.2.4.  Non-Adjacent Overload Scopes
     3.3.  Non-adjacent Overload Control Recommendations
   4.  Overload Scopes
     4.1.  Explicit vs Implicit Indication of Scopes
     4.2.  Types of Overload Scopes
       4.2.1.  Connection Scope-Type
       4.2.2.  Peer Scope-Type
       4.2.3.  Destination-Host Scope-Type
       4.2.4.  Origin-Host Scope-Type
       4.2.5.  Diameter-Application Scope-Type
       4.2.6.  Destination-Realm Scope-Type
       4.2.7.  Session Scope-Type
       4.2.8.  Session-Group Scope-Type
     4.3.  Scope Values
     4.4.  Combining Scopes
     4.5.  Scope Extensibility
     4.6.  Scope Recommendations
   5.  IANA Considerations
   6.  Security Considerations
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Appendix A.  Contributors
   Author's Address









1.  Introduction

   When a Diameter [RFC6733] server or agent becomes overloaded, it
   needs to be able to gracefully reduce its load, typically by
   requesting other nodes to reduce the number of Diameter requests for
   some period of time.

   The Diameter Overload Control Requirements
   [I-D.ietf-dime-overload-reqs] describe requirements for overload
   control mechanisms.  Requirement 31 states that Diameter nodes must
   be able to report overload with sufficient granularity to avoid
   forcing available capacity to go unused.  Requirement 34 requires the
   ability to report overload across Diameter nodes that do not support
   the mechanism.  These requirements introduce significant and
   interrelated complexities to potential solutions.  This document
   describes the related issues.  The author hopes that this document
   will assist the working group's decision process related to these
   requirements.

   At the time of this writing, there have been two proposals for
   Diameter overload control solutions.  "A Mechanism for Diameter
   Overload Control" (MDOC) [I-D.roach-dime-overload-ctrl] defines a
   solution that piggybacks overload and load state information over
   existing Diameter messages.  "The Diameter Overload Control
   Application" (DOCA) [I-D.korhonen-dime-ovl] defines a solution that
   uses a new dedicated Diameter application to communicate similar
   information.

      While there are significant differences between the two proposals,
      they carry similar information.  In many ways, the issues related
      to Requirements 31 and 34 apply to both proposals.  This
      discussion is not specific to one proposal or the other, unless
      explicitly mentioned.

   This document serves two purposes.  The primary purpose is to explore
   the issues related to Requirement 34, that is, the requirement for
   the overload control mechanism to support sending load and overload
   information across intermediaries that do not support the mechanism
   (referred to herein as "non-adjacent" overload reporting.)  The
   document describes two use cases for non-adjacent overload reporting.
   It does not, however, attempt to describe the use cases for Diameter
   agents in general.  For a more thorough treatment of Diameter agent
   use cases in the context of overload control, please see
   [I-D.ietf-dime-overload-reqs].

   The secondary purpose is to help the reader understand the concept of
   overload scopes, and make recommendations about what kinds of
   overload scope should be supported by the mechanism.  These purposes
   are interrelated, since an understanding of overload scopes is
   necessary to fully understand some of the issues with non-adjacent
   overload reporting.

2.  Document Conventions

   This document uses terms defined in [RFC6733] and
   [I-D.ietf-dime-overload-reqs].  In particular, the terms "client",
   "server", "upstream", and "downstream" are used as defined in RFC
   6733.  In addition, this document uses the following terms:

   Overload: A condition where a Diameter node needs a reduction in the
             number of requests that it must handle.

   Overload Report:  A request to reduce traffic that contributes to an
             overload condition.

   Overload Scope:  A classifier that defines the set of requests that
             may contribute to particular overload conditions.
             Alternatively, the purposes for which a node may be
             overloaded.  For example, if a server is overloaded for the
             purposes of one Diameter application but not another, the
             overload condition can be considered "scoped" to that
             application.

   Reporting Node:  The node that sends an overload report.  Also known
             as an "overloaded node".

   Reacting Node:  A node that consumes and possibly acts on an overload
             report.

   Adjacent Overload Reporting:  Overload reports exchanged between
             adjacent Diameter peers.

   Non-Adjacent Overload Reporting:  Overload reports sent between
             Diameter nodes separated by one or more intermediate
             Diameter agents (i.e. relays or proxies).

   Piggybacked Overload Reporting:  The inclusion of overload reports in
             existing Diameter messages.

   Application-Based Overload Reporting:  The sending of overload
             reports in a separate, dedicated Diameter application.

3.  Non-adjacent Overload Information

   Requirement 34 of [I-D.ietf-dime-overload-reqs] says that the
   selected Diameter overload control mechanism "SHOULD" be able to
   communicate overload and load information across intermediaries that
   do not support the mechanism.  This requirement complicates the
   solution effort in several ways: it affects how Diameter nodes
   negotiate support for overload control, how they address and route
   overload reports to the right places, and how they act on received
   overload reports.

   While the requirement does not explicitly say it, we interpret
   "intermediaries" in this context to mean Diameter agents.  The
   requirement is irrelevant for lower layer intermediaries (e.g.
   routers), and cannot be reasonably applied for non-Diameter entities,
   or hybrid entities such as gateways between Diameter and other
   protocols.

   The requirement to traverse non-supporting intermediaries is not
   necessarily the same thing as a requirement for end-to-end
   communication of overload reports between Diameter clients and
   servers.  Non-adjacent reporting can include client-to-server
   scenarios.  It can also include server-to-agent scenarios and
   agent-to-client scenarios.  All such scenarios may include one or
   more intervening agents.  Since Diameter allows transactions to be
   sent from server to client, all scenarios may be reversed.
   Therefore, we refer to this requirement as "Non-adjacent Overload
   Control".

3.1.  Use-Cases for Non-adjacent Overload Control

   There are two primary use-cases for non-adjacent overload control.

3.1.1.  Interconnect

   The first significant non-adjacent use-case is the interconnect
   scenario described in section 2.3 of the overload control
   requirements [I-D.ietf-dime-overload-reqs].  Two or more Diameter
   network operators communicate with each other across a third-party
   interconnect provider that brokers Diameter traffic between the
   operators.  Figure 1 illustrates the interconnect use case.

                +-------------------------------------------+
                |               Interconnect                |
                |                                           |
                |   +--------------+      +--------------+  |
                |   |     Agent    |------|     Agent    |  |
                |   +--------------+      +--------------+  |
                |         .'                      `.        |
                +------.-'--------------------------`.------+
                     .'                               `.
                  .-'                                   `.
    ------------.'-----+                             +----`.------------
          +----------+ |                             | +----------+
          |Edge Agent| |                             | |Edge Agent|
          +----------+ |                             | +----------+
                       |                             |
            Operator 1 |                             |  Operator 2
    -------------------+                             +------------------

               Figure 1: Two Operator Interconnect Scenario

   If the interconnect provider does not support Diameter overload
   control, each operator network becomes an island of overload control,
   similar to those in the non-supporting agent use-case
   (Section 3.1.2).  Even if the interconnect provider does support
   overload control, the operators may not trust it to generate and act
   on overload reports on their behalf, and may prefer to exchange
   overload and load information directly with each other.

   The interconnect use-case may introduce additional security concerns.
   While the non-supporting agent use case typically (but not
   necessarily) occurs inside a single administrative domain, the
   interconnect case will almost always involve sending overload reports
   across multiple administrative domains.  Since a malicious or
   incorrect overload report can effectively shut down Diameter
   processing, the current lack of a viable solution for end-to-end
   integrity protection of Diameter messages may be a problem.

3.1.2.  Non-Supporting Agents

   [I-D.ietf-dime-overload-reqs] requires the solution to function in
   networks where not all Diameter elements support it.  That is, the
   solution must allow gradual deployment, and must not require a flag-
   day cutover.  If non-adjacent overload control is not supported, one
   or more non-supporting Diameter Agents can divide a network into
   overload control islands, where overload information is communicated
   inside each island, but not among separate islands.

      In the author's strictly personal opinion, the non-supporting
      agent use case is less compelling than the interconnect case.  The
      non-supporting agent case would typically occur inside one
      administrative domain.  The operator of that domain has
      considerably more control over the implementations used in the
      domain than it might have for third-party domains.

3.2.  Issues with Non-Adjacent Overload Control

3.2.1.  Topology Issues




   Many of the issues with non-adjacent overload control derive from the
   fact that a Diameter node is unlikely to know the topology of the
   Diameter network past its immediate peers.  In a trivial topology,
   that is, a Diameter network with only clients and servers, this is
   not a problem.  But if the immediate peer is a Diameter agent, a node
   is unlikely to know what next hop the agent will select for a given
   Diameter message.  This is particularly difficult if the agent hides
   topology in either direction, or uses dynamic peer discovery.  While
   a node may be able to infer the path a given message will take in
   some specific cases (e.g. for mid-session messages), it cannot do
   this in general.  And even those specific cases may fail if an agent
   on the message path performs topology hiding.

   This lack of topology knowledge impacts the way that nodes can
   negotiate overload-control support, the ways they send overload
   reports, and the ways a reacting node can act to mitigate overload.
   A non-adjacent overload-control mechanism will need to solve the
   topology issues, either by offering ways to discover non-adjacent
   topologies, or offering ways to constrain overload-control relevant
   parts of such topologies in ways where a node could reasonably know
   them in advance.

3.2.2.  Support Negotiation

   Diameter nodes need to negotiate or otherwise indicate their support
   for overload control to other nodes.  This includes indicating
   support for overload control in general, as well as potentially
   indicating support of certain parameters of the overload control
   solution.  For example, a node may need to indicate which overload
   algorithms it supports.  This becomes complex if two non-adjacent
   nodes need to negotiate support.

   In a Diameter application-based solution, support for the overload
   control application would occur during the capabilities exchange
   between peers.  Diameter capabilities exchange occurs strictly
   between peers; Diameter offers no mechanism for indicating support of
   a given Application-ID between non-adjacent nodes.

   Diameter allows non-negotiated use of an arbitrary Application-Id
   between non-adjacent nodes across Diameter agents that implement the
   Diameter Relay application.  In theory, this means that an
   application-based, non-adjacent overload control could only traverse
   Diameter relays, or Diameter proxies that explicitly support the
   overload-control Application-Id.  In the latter case, we assume that
   a proxy will not indicate support for the overload-control
   Application-Id unless it supports the overload-control mechanism;
   such a proxy cannot be considered a non-supporting agent.




   In practice, a Diameter agent can act as a proxy for some purposes
   and a relay for others.  If a Diameter proxy indicates support for
   the Diameter relay application, we assume that it will relay any
   arbitrary application.  This means it can be considered a relay for
   the purposes of overload control.

   For both application-based and piggybacked solutions, a supporting
   node needs to know the other nodes with which it should negotiate.
   For overload-control between Diameter peers, this is easy; a node
   exchanges support information with its immediate peers.  But for
   non-adjacent overload control, this is more difficult for reasons
   discussed in Section 3.2.1.

   Therefore, for non-adjacent overload control negotiation, each
   supporting node either needs advance knowledge of all nodes with
   which it may negotiate overload-control support, or it needs a
   mechanism for discovering that knowledge dynamically.

3.2.3.  Overload Report Delivery

   With adjacent overload control reporting, overload report addressing
   and delivery is relatively simple.  A node sends overload reports
   directly to its peers.  This becomes more complex for non-adjacent
   overload-control.

   For application-based overload control, nodes could address overload
   reports to specific endpoint nodes using the Destination-Host AVP.
   Doing so would be subject to the same non-adjacent topology issues
   described in Section 3.2.1.  That is, a node can only send overload
   reports to non-adjacent clients or servers that it knows about,
   either from prior knowledge (i.e. provisioning) or because it has
   observed previous Diameter messages from them.

   An application-based mechanism could possibly address reports to non-
   adjacent Diameter agents using the Destination-Host AVP.  This would
   effectively make the agent into an endpoint for the overload-control
   application.

   A piggy-backed mechanism will have more difficulty addressing non-
   adjacent overload reports.  A piggy-backed mechanism sends overload
   reports in already existing Diameter requests; that is, requests that
   have their own purposes and destinations independent of the overload-
   report.  Thus, nodes can only select the destination of an overload
   report by bundling it into a Diameter message that was already going
   to that destination.  While a piggy-backed mechanism might be able to
   send overload-reports across quiescent transport connections using
   watchdog (DWR/DWA) messages, these messages cannot be exchanged
   between non-adjacent nodes.
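
   As a rough illustration of this constraint, the following Python
   sketch models a piggy-backed sender that can deliver a report only
   when an unrelated message happens to be heading to the same
   destination.  The class name, the "pending" table, and the
   "overload_report" key are hypothetical; they are not taken from MDOC
   or DOCA.

```python
class PiggybackSender:
    """Hypothetical model: a report waits until traffic to its target appears."""

    def __init__(self):
        # destination host -> most recent undelivered overload report
        self.pending = {}

    def queue_report(self, destination, report):
        # A newer report for the same destination replaces the older one.
        self.pending[destination] = report

    def send(self, destination, message):
        # Attach a waiting report only if this message is already going
        # to the destination the report is intended for.
        outgoing = dict(message)
        report = self.pending.pop(destination, None)
        if report is not None:
            outgoing["overload_report"] = report
        return outgoing


sender = PiggybackSender()
sender.queue_report("client1.example.com", {"reduction_percent": 50})

# Traffic to some other node cannot carry the report.
other = sender.send("client2.example.com", {"command": "answer"})
# The report goes out only when a message to client1 comes along.
first = sender.send("client1.example.com", {"command": "answer"})
```

   If no message to "client1.example.com" is ever sent, the report stays
   in the pending table indefinitely, which is the delivery gap the
   paragraph above describes.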



      In some cases, the limit of sending overload reports to
      destinations to which existing traffic is bound may be acceptable.
      If a node is contributing to an overload condition, then it's
      reasonable to assume that node is regularly exchanging traffic
      with the overloaded node.  However, there may be cases where an
      overload report causes a connection to become quiescent.  If the
      reporting node needed to tell a reacting node that the condition
      has resolved or improved, it would need to send a new report
      across the now quiescent connection.  There may also be cases
      where a reacting node redirects traffic along a different path,
      causing a previously quiescent node to suddenly start sending
      requests to the overloaded node.  Thus, without careful selection
      of the overload report scope, an overloaded node may find itself
      engaged in a game of Whack-a-Mole [Whac-a-Mole] with previously
      quiescent non-adjacent nodes.

   For both piggy-backed and application-based solutions, non-adjacent
   overload control introduces a need to identify the sender of a
   report, or at least determine whether the report is from an adjacent
   or non-adjacent node.  This is not required for purely adjacent
   solutions, since the sender could always be assumed to be the peer.

   For example, a non-adjacent report with a "Connection" scope does not
   make sense.  If a node receives one, it should ignore it.  But in
   order to make that decision, it must be able to distinguish a non-
   adjacent report from an adjacent one.  For example, in an
   application-based mechanism,

3.2.4.  Non-Adjacent Overload Scopes

   A reacting node will typically attempt to mitigate an overload
   condition by either reducing the number of requests that contribute
   to the condition, or by rerouting part of that traffic to avoid the
   problem.  In both cases, the reacting node is limited by its ability
   to determine which Diameter requests contribute to the overload
   condition in the first place.  The overload scope concept
   (Section 4) offers a way for overloaded nodes to indicate what
   traffic is likely to contribute to an overload condition and should
   be abated.

   Not all of the scope-types described in Section 4 make sense for non-
   adjacent overload control.  The "Connection" scope-type is an obvious
   example, since the reacting node will never share a transport
   connection with a non-adjacent node; this is the very definition of
   non-adjacent nodes.

   Since a Diameter node cannot control how requests are forwarded to
   non-adjacent nodes, the "Peer" scope-type also does not work well,
   especially when there are multiple possible destinations up or
   downstream from the adjacent peer.  For example, in Figure 2, Node A
   sends Diameter requests to Nodes B and C across a non-supporting
   agent.  If Node B becomes overloaded but Node C does not, Node A
   cannot reroute requests to Node C, since it has little ability to
   influence where the agent will forward any given request.  If Node A
   tries to reduce traffic by 50%, the agent will likely still send half
   of the remaining traffic to Node B.  If B and C are endpoints, Node A
   may in some cases be able to use the Destination-Host AVP for this
   purpose (in which case the "Destination-Host" scope-type would be
   more appropriate), but this does not help if B and C are also agents
   rather than servers.

                      +--------+       +--------+
                      | Node B |       | Node C |
                      +----+---+       +---+----+
                           |               |
                           +-------+-------+
                                   |
                           +-------+--------+
                           | Non-Supporting |
                           |  Agent         |
                           +-------+--------+
                                   |
                                   |
                              +----+----+
                              | Node  A |
                              +---------+

                      Figure 2: Non-Adjacent Routing
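
   The arithmetic behind the Figure 2 example can be sketched in Python.
   The sketch assumes that the non-supporting agent splits requests
   evenly between Nodes B and C, and the rate of 1000 requests per
   second is an arbitrary illustrative number; neither value comes from
   the text, and the function name is hypothetical.

```python
def per_node_load(offered_rate, reduction, split_to_b=0.5):
    """Return (load on B, load on C) after Node A throttles its traffic.

    Assumes the agent forwards a fixed share of requests to B regardless
    of B's overload state, because it does not support overload control.
    """
    remaining = offered_rate * (1.0 - reduction)
    to_b = remaining * split_to_b
    to_c = remaining - to_b
    return to_b, to_c


before_b, before_c = per_node_load(1000, reduction=0.0)
after_b, after_c = per_node_load(1000, reduction=0.5)

print(before_b, before_c)
print(after_b, after_c)
```

   Under these assumptions Node B still receives half of whatever Node A
   sends, and Node C, which has spare capacity, sees its traffic cut by
   the same amount as Node B.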

   Scope-types that classify traffic by origin or final destinations,
   such as "Origin-Host", "Destination-Realm", "Application-ID", and
   "Destination-Host" can be used for non-adjacent overload control.  In
   general, scope-types that may denote non-adjacent intermediary
   devices, such as "Peer", cannot, nor can scope-types that refer only
   to peers, e.g. "Connection".
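
   A reacting node could apply this rule with a simple filter, sketched
   here in Python.  The sketch assumes the node already knows whether a
   report came from an adjacent peer, which is itself an open issue (see
   Section 3.2.3).  The function name is hypothetical, and dropping any
   scope-type not named above is a choice of this sketch, not something
   the text requires.

```python
# Scope-types the text considers usable for non-adjacent reporting.
NON_ADJACENT_SCOPE_TYPES = {
    "Origin-Host",
    "Destination-Realm",
    "Application-ID",
    "Destination-Host",
}


def usable_scopes(scope_types, adjacent):
    """Return the scope-types a reacting node should honor.

    For a report from an adjacent peer every scope-type is kept.  For a
    non-adjacent report, peer-oriented types such as "Connection" and
    "Peer" are dropped.
    """
    if adjacent:
        return list(scope_types)
    return [s for s in scope_types if s in NON_ADJACENT_SCOPE_TYPES]


reported = ["Connection", "Destination-Realm", "Peer"]
print(usable_scopes(reported, adjacent=True))
print(usable_scopes(reported, adjacent=False))
```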

   Even for destination-oriented scope-types, the sender of an overload
   report must be authoritative for the indicated scope.  That is, it
   must have full knowledge of the congestion state for the scope.  For
   example, if Nodes B and C both serve the realm "example.com", and B
   becomes 50% overloaded while C does not, B cannot simply report 50%
   overload at realm scope.  If it did, Node A would reduce its
   generated traffic by 50%.  Assuming B and C normally carry equal
   shares of the traffic, the realm as a whole needs only a 25%
   reduction, so this would leave the realm operating beneath available
   capacity.
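
   The gap between a node-level and a realm-level figure in this example
   can be checked with a small calculation.  The Python sketch below
   assumes that Nodes B and C normally carry equal shares of the realm's
   traffic, which the text does not state; the function name
   realm_reduction is hypothetical.

```python
def realm_reduction(nodes):
    """Traffic-weighted reduction needed across all nodes in a realm.

    `nodes` maps a node name to (share of realm traffic, reduction that
    node needs).  Shares are assumed to sum to 1.
    """
    return sum(share * reduction for share, reduction in nodes.values())


# Assumed equal split: B needs a 50% reduction, C needs none.
example_com = {"B": (0.5, 0.5), "C": (0.5, 0.0)}

needed = realm_reduction(example_com)
sustainable = 1.0 - needed
print(needed, sustainable)
```

   Under this assumption the realm can sustain 75% of its offered
   traffic and so needs only a 25% reduction; a 50% reduction applied at
   realm scope would discard traffic that Node C could have served.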




      The need to be authoritative for an indicated scope is also true
      for strictly adjacent reporting mechanisms.  But in an adjacent
      mechanism, it is easier for an intervening agent to learn the
      overload state of upstream nodes.  In the example, if the agent
      supported the overload control mechanism, it would most likely
      receive reports from Nodes B and C, and could then construct
      downstream reports that incorporate the state of B, C, and its own
      local state.  This contrasts with the non-adjacent case where B
      must understand the current state of C even though it is not in
      the path of overload reports from C.

   Therefore, a given node must only report overload for scopes for
   which it has full knowledge of the load and overload state.  That is,
   it must be a "scope authority" for any scope it reports.  In the
   example, nodes B and C (and any other nodes serving "example.com")
   would be required to share current load and overload state.  The
   state-sharing requirement could be substantial for high-capacity
   nodes.

   When a node reports overload for a certain scope, reacting nodes will
   treat the overload condition as uniform across the entire scope.  For
   example, if a node reports overload for an entire realm, reacting
   nodes will reduce traffic equally for all servers that serve that
   realm.  If the servers are unequally overloaded, they must use a more
   granular scope-type, for example, "Destination-Host".

3.3.  Non-adjacent Overload Control Recommendations

   An adjacent reporting mechanism allows for very flexible and fine-
   grained overload control.  It solves or simplifies a number of
   issues, such as negotiation of support and parameters, requirements
   for topology knowledge, end-to-end security, etc, by avoiding them in
   the first place.  Adding non-adjacent support to such a mechanism
   would complicate it considerably.

   Non-adjacent overload control mechanisms are better suited to
   connecting islands of overload control.  Such mechanisms work well
   for larger scopes and relatively static topologies.

   The author believes that we are unlikely to find a single solution
   that works well for both adjacent and non-adjacent overload control.
   While a single solution is more desirable in general, a single
   solution that works well for both cases is likely to be extremely
   complicated.  Therefore, the working group should consider a separate
   mechanism for the non-adjacent delivery of overload reports.

   If the group chooses to accept two separate solutions, we should be
   able to specify a single data model and set of AVPs that work for
   both, with some restrictions.  (For example, the non-adjacent
   solution would likely forbid the use of the "Connection" scope-type.)

   If the working group chooses to add non-adjacent features to MDOC or
   DOCA, we will need to change the support negotiation mechanisms to
   allow for the non-adjacent case, specify how a node can determine
   whether a report is adjacent or non-adjacent, and state what subset
   of scope-types are allowed in non-adjacent reports.  We will also
   need to study how we can meet the security-related requirements
   [I-D.ietf-dime-overload-reqs] given the current lack of end-to-end
   security features in Diameter.

4.  Overload Scopes

   Diameter overload does not necessarily affect all kinds of Diameter
   traffic.  A node may become overloaded for some requests but not
   others.  For example, a Diameter agent may handle requests for more
   than one Diameter Application, and may route requests to a different
   set of servers for each application.  If one server set becomes
   overloaded, but the other does not, then the agent itself is
   effectively overloaded for one application, but can process the other
   at normal capacity.

   The Diameter overload requirements [I-D.ietf-dime-overload-reqs] list
   several scenarios that illustrate overload that affects some requests
   but not others.  We refer to the set of requests affected by a
   particular overload event as the "scope" of the overload event.  The
   overload requirements require the mechanism to be able to send
   overload reports that are "scoped" to (that is, they affect requests
   targeted to) a particular Diameter node, a Realm, or a Diameter
   Application.

      The concept of scope may also be useful when applied to reported
      load even without an overload condition.  This usage is out of
      "scope" for this document.

   A scope indication in an overload report is a set of classifiers that
   identify requests likely to contribute to the overload condition.  In
   general, this could include any aspect of a Diameter message that a
   reacting node can observe.  For example, requests could be classified
   by Attribute Value Pair (AVP) values or next-hop routing decisions.
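
   One simple way to picture such a set of classifiers is as a table of
   AVP names and required values, as in the Python sketch below.
   Treating the scope as a conjunction of exact-match classifiers, and
   representing requests as plain dictionaries, are assumptions of this
   sketch; the AVP values shown are arbitrary examples.

```python
def matches_scope(request, scope):
    """True if every classifier in the scope equals the request's value.

    `request` and `scope` are plain dicts keyed by AVP name.
    """
    return all(request.get(avp) == value for avp, value in scope.items())


scope = {"Destination-Realm": "example.com", "Auth-Application-Id": 4}

req1 = {"Destination-Realm": "example.com", "Auth-Application-Id": 4,
        "Session-Id": "a;1"}
req2 = {"Destination-Realm": "example.com", "Auth-Application-Id": 5}

print(matches_scope(req1, scope))
print(matches_scope(req2, scope))
```

   Only the first request matches, because the second carries a
   different application identifier even though it targets the same
   realm.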

   The ability to express the scope of an overload condition is only
   useful when reacting nodes can act on the information.  There are
   only a small number of actions a reacting node may take to mitigate
   overload.  Essentially these actions boil down to reducing the number
   of requests that "match" the scope, either by sending fewer requests
   in the first place, or by routing around the problem.  The former is
   limited by the node's ability to distinguish between requests that
   match the overload scope and requests that do not.  The latter is
   limited by the node's ability to predict or influence how a request
   will be routed.
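
   As a rough illustration of the first action, the following Python
   sketch withholds a fraction of the requests that match a scope.  The
   request model, the "reduction" fraction, and all function names are
   assumptions made for this example; they are not taken from MDOC or
   DOCA.

```python
import random
from typing import Callable, Mapping

# Hypothetical request model: AVP name -> value.
Request = Mapping[str, str]


def should_send(request: Request,
                in_scope: Callable[[Request], bool],
                reduction: float,
                rng: random.Random) -> bool:
    """Return True if the reacting node should send this request.

    `reduction` is an assumed fraction (0.0 to 1.0) of in-scope
    requests to withhold; out-of-scope requests always pass.
    """
    if not in_scope(request):
        return True
    # rng.random() is uniform on [0.0, 1.0), so about `reduction` of
    # the in-scope requests fail this comparison and are withheld.
    return rng.random() >= reduction


def to_example_realm(request: Request) -> bool:
    """Example classifier: requests addressed to the example.com realm."""
    return request.get("Destination-Realm") == "example.com"
```

   The sketch only works if the classifier can be evaluated from what
   the reacting node observes in the request, which is the limitation
   described above.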

      Reacting nodes will most likely need to take additional
      application-specific actions to mitigate overload conditions.  If
      a client reduces the number of messages it sends, it almost
      certainly has to take additional application-specific steps that
      affect its own client application.  Depending on the application,
      it might refuse some client application requests, redirect some
      of its own clients to different services (e.g., offloading mobile
      data sessions to local WiFi networks), or assert an overload
      condition in the client application protocol (e.g., the Session
      Initiation Protocol (SIP)).

   This section discusses the meanings of the required scope-types, and
   analyses their implications for the selected mechanism.

4.1.  Explicit vs Implicit Indication of Scopes

   Both MDOC and DOCA use explicit scope indication.  That is, the scope
   of an overload report is not, in general, implied by the type of
   message that carries the report.  For example, if an overload report
   is scoped to a particular Diameter Application-Id, the report
   explicitly indicates the affected Application-Id, rather than
   leaving the reacting node to infer the Application-Id from that of
   the message that carries the report.  There are a few exceptions;
   for example, MDOC supports a "Connection" scope that, when
   specified, pertains to requests to be sent over the same transport
   connection over which the overload report arrived.

      List discussions have shown a common assumption that overload
      reports sent over a piggy-backed solution such as MDOC would only
      affect requests associated with the same Diameter Application-Id.
      For MDOC, this is a false assumption.  MDOC's explicit use of
      scopes allows overload reports sent over one application to affect
      requests for any arbitrary application.  On the other hand,
      solutions that use a dedicated Application-Id (such as DOCA)
      necessarily require the ability to report overload for arbitrary
      applications; otherwise it would only be possible for an overload
      control application to report overload on itself.

   Some list participants have suggested that the solution include a
   concept of a default scope, that is, a scope that is implied if no
   other scope is explicitly indicated.  The concept of default or
   implicit scopes requires further study by the working group.

4.2.  Types of Overload Scopes

   There are several different kinds, or types, of overload scopes.  The
   type of a scope defines how the reacting node interprets it.  Table 1
   gives a summary of the scope types discussed in this document.  The
   "Scope Type" column gives the name of the scope.  The "Affected
   Traffic" column describes what Diameter requests are impacted by the
   scope-type.  The "Reacting-Node" column describes which Diameter
   nodes may be able to take action on an overload report with the
   respective scope-type.  Finally, the "Draft" column describes which
   proposed solution includes the respective scope-type.

   +------------------+-----------------------+---------------+--------+
   | Scope Type       | Affected Traffic      | Reacting-Node | Draft  |
   +------------------+-----------------------+---------------+--------+
   | Connection       | Requests sent         | Adjacent Peer | MDOC,  |
   |                  | directly to the       |               | DOCA   |
   |                  | reporting-node on a   |               |        |
   |                  | particular transport  |               |        |
   |                  | connection            |               |        |
   | Peer             | Requests routed       | Adjacent Peer | MDOC,  |
   |                  | directly to the       |               | DOCA   |
   |                  | reporting-node        |               |        |
   | Destination-Host | Requests with a       | Any           | MDOC   |
   |                  | matching Destination- |               |        |
   |                  | Host AVP              |               |        |
   | Origin-Host      | Requests including a  | Any           | DOCA?  |
   |                  | matching Origin-Host  |               |        |
   |                  | AVP                   |               |        |
   | Diameter         | Requests with a       | Any           | MDOC,  |
   | Application      | matching Application- |               | DOCA   |
   |                  | Id AVP                |               |        |
   | Destination      | Requests with a       | Any           | MDOC,  |
   | Realm            | matching Destination- |               | DOCA   |
   |                  | Realm AVP             |               |        |
   | Session          | Requests with a       | Any           | MDOC   |
   |                  | matching Session-Id   |               |        |
   |                  | AVP                   |               |        |
   | Session-Group    | Requests belonging to | Any           | MDOC   |
   |                  | sessions assigned     |               |        |
   |                  | matching labels       |               |        |
   +------------------+-----------------------+---------------+--------+

                 Table 1: Summary of Overload Scope Types

4.2.1.  Connection Scope-Type

   The "Connection" scope-type indicates that the reacting node should
   reduce traffic sent on the transport connection on which it received
   the overload report.  A Connection scope indication does not include
   an explicit value; rather it implies "this connection".

4.2.2.  Peer Scope-Type

   The "Peer" scope-type indicates that a particular Diameter node is
   overloaded.  Other nodes should mitigate the overload by reducing the
   number of requests that will land on the overloaded node, either by
   sending fewer requests, or by attempting to route requests around the
   overloaded node.

      In both MDOC and DOCA, the "Peer" scope-type is named "Host".  In
      practice, only immediate peers can act as the reacting node for a
      Host-scoped overload report, because non-adjacent nodes have
      limited ability to influence routing decisions beyond the
      immediate next hop.  This document uses the term "Peer" to
      illustrate that fact.

   Large-scale Diameter nodes are often implemented as clusters of IP
   hosts, which may or may not share their knowledge about upstream
   overload conditions.  Certain IP hosts in a cluster could become
   overloaded when others do not.  Furthermore, if the reacting-node is
   also clustered, it may be difficult for the cluster members to share
   real-time knowledge of the reporting-node's overload state.  This can
   make it difficult for a node to know conclusively whether any two
   connections that appear to connect to the same peer can be treated as
   such for the purposes of overload control.  The working group should
   study whether the Peer scope-type should be deprecated in favor of
   the "Connection" scope-type.

4.2.3.  Destination-Host Scope-Type

   The "Destination-Host" scope type pertains to requests that contain a
   Destination-Host AVP that matches the indicated Destination-Host
   value.  Destination-Host always refers to the endpoint for a given
   Diameter request.

   The best the reacting node can do is reduce the number of requests
   that contain a Destination-Host AVP that matches the overloaded
   node.  Rerouting will not help in general, since the requests will
   simply take different routes to arrive at the same overloaded
   server.  Unless the destination node is also a direct peer, the
   reacting node cannot do much about requests that don't contain a
   Destination-Host AVP in the first place, since it cannot predict
   whether these requests will land on the overloaded endpoint.  The
   Destination-Host scope type is useful for requests bound to a
   particular server, for example, mid-session requests for a
   session-stateful application.
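
   The three cases above (matching AVP, non-matching AVP, and no AVP at
   all) can be sketched in Python as a three-valued classifier.  The
   request model and the names below are hypothetical, and the exact
   string comparison is a simplifying assumption of this example.

```python
from enum import Enum
from typing import Mapping


class Match(Enum):
    YES = "yes"          # request is bound for the overloaded host
    NO = "no"            # request is bound for some other host
    UNKNOWN = "unknown"  # no Destination-Host AVP; outcome unpredictable


def destination_host_match(request: Mapping[str, str],
                           overloaded_host: str) -> Match:
    """Classify a request against a Destination-Host scoped report.

    `request` is a hypothetical mapping of AVP names to values.
    """
    dest = request.get("Destination-Host")
    if dest is None:
        # The reacting node cannot tell where this request will land.
        return Match.UNKNOWN
    return Match.YES if dest == overloaded_host else Match.NO
```

   A reacting node would reduce only the requests classified as YES;
   what, if anything, to do with UNKNOWN requests is left open here.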

   Section 4.2.7 and Section 4.2.8 cover the related "Session" and
   "Session-Group" scope-types, including the case for removing the
   Session scope-type.

4.2.4.  Origin-Host Scope-Type

   While most scope-types refer to where a request is likely to go, the
   "Origin-Host" scope-type refers to where the request originates.
   That is, any request with a matching Origin-Host AVP would match.
   The Origin-Host scope type is useful for situations where a specific
   client or set of clients sends an excessive number of requests.  An
   overload report with an Origin-Host scope would tell matching clients
   to reduce traffic, or agents to throttle requests that came from
   matching clients.

      Note that the Origin-Host scope-type is not explicitly mentioned
      in the requirements document.  The author includes it here
      because others have mentioned the need in conversation.

4.2.5.  Diameter-Application Scope-Type

   The "Diameter Application" scope-type indicates overload for a
   particular Diameter application.  That is, it impacts all requests
   with the matching value in an Application-Id AVP.

   The Diameter Application scope-type is useful for declaring an
   overload condition that affects a specific Diameter service,
   typically, but not necessarily, in a specific realm.

   Since the Diameter Application scope-type indicates overload for an
   entire application, reacting nodes should reduce the number of
   requests sent for that application.  As with the Realm scope-type,
   it will rarely if ever make sense for a Diameter node to reroute
   traffic to a different Diameter application.

4.2.6.  Destination-Realm Scope-Type

   The "Destination-Realm" scope-type indicates overload for all servers
   that handle requests for the particular Diameter realm.  That is, it
   impacts all requests with the particular realm in the Destination-
   Realm AVP.

   The Realm scope-type is useful for declaring a global overload
   condition within a network serving a single realm.  It is also useful
   for requesting third-parties to reduce Diameter traffic sent to a
   particular realm, for example, in roaming scenarios.

   Since the Realm scope-type indicates overload for an entire realm,
   reacting nodes should reduce the number of messages sent for the
   realm.  Rerouting traffic does not make sense for the Realm scope
   type, since it would probably never be useful for Diameter nodes to
   reroute traffic destined for an overloaded realm to a different, non-
   overloaded realm.  Client applications might, however, be able to
   choose to use services from a different operator if the Diameter
   realm of one operator reports an overload condition.

   MDOC currently makes the Realm scope-type mandatory to implement.
   List participants have indicated that there may be use cases where
   all Diameter traffic on a network uses the same Realm, and that the
   use of the Realm scope-type would be redundant in such networks.
   Whether the Realm scope-type should remain mandatory or become
   optional to implement requires further study.

4.2.7.  Session Scope-Type

   MDOC currently includes a "Session" scope-type.  This scope-type
   refers to messages that include a matching Session-Id.  Conceptually,
   this applies to all requests that are part of a previously
   established session.  This scope-type could potentially be useful for
   a session-stateful agent that assigns session-establishing requests
   to a certain server, and then sends all future requests in that
   session to the same server.  If that server became overloaded, the
   agent could send an overload report scoped to the assigned session.

   However, the Session scope-type will become unwieldy for anything
   other than very small-scale installations.  The number of sessions
   assigned to any specific server is likely to be quite large.
   Therefore, the number of Session scope values would probably become
   quite large.  The working group should consider deprecating the
   Session scope-type.  For agents that do not hide topology, the
   Destination-Host scope-type can be used to affect all sessions
   assigned to a particular server.  For topology-hiding agents, the
   session-group mechanism can do the same.

4.2.8.  Session-Group Scope-Type

   Diameter agents that implement certain topology-hiding schemes may
   modify Origin-Host AVPs inserted by servers, and use some local
   mechanism to bind sessions to specific servers.  The "Destination-
   Host" type may not function correctly in this case.  MDOC specifies a
   "session-group" scope-type, where an agent or server can assign a
   common identifier to sessions that are fate-shared in some way, such
   as being bound to the same server.  If that server becomes
   overloaded, the agent can send an overload report that matches
   requests in all sessions with the matching identifier.
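
   The following Python sketch illustrates the idea under simplifying
   assumptions: an agent hands out one opaque label per server, the
   client remembers the label for each session, and an overload report
   names only the label.  The class, the label format, and the way the
   label reaches the client are all hypothetical and are not taken
   from MDOC.

```python
from typing import Dict, Set


class TopologyHidingAgent:
    """Hypothetical agent binding sessions to servers via opaque labels."""

    def __init__(self) -> None:
        self._label_for_server: Dict[str, str] = {}

    def bind(self, session_id: str, server: str) -> str:
        """Bind a session to a server; return the label the client sees."""
        next_label = f"group-{len(self._label_for_server) + 1}"
        # Reuse the existing label if this server already has one.
        return self._label_for_server.setdefault(server, next_label)

    def overload_label(self, server: str) -> str:
        """Label to place in a report when `server` is overloaded."""
        return self._label_for_server[server]


def sessions_in_group(labels_by_session: Dict[str, str],
                      label: str) -> Set[str]:
    """Client side: sessions whose stored label matches the report."""
    return {sid for sid, lbl in labels_by_session.items() if lbl == label}
```

   The client never learns which server is overloaded; it only learns
   which of its sessions share fate with it.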

   This scope-type may be useful under certain circumstances, but may
   also be complex to implement.  The working group should study
   whether the Session-Group scope-type should be included in the base
   overload control solution or left out.  Since the mechanism is
   required to allow extensible scope-types, session-groups could still
   be added later as an extension scope-type.

4.3.  Scope Values

   Scope labels in an overload report will typically take the form of a
   scope-type and a value.  For example, if the "example.com" realm is
   overloaded for all services, the overload report would indicate a
   scope-type of "Realm" and a scope-value of "example.com".

   The Connection scope-type is an exception.  Since an overload report
   with a Connection scope is only actionable by one of the peers
   connected via the specified connection, it makes sense to treat the
   Connection scope-type as always having a value of "this connection".
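
   One possible in-memory representation, sketched in Python, is a
   type/value pair in which the Connection scope-type is the only one
   allowed to omit the value.  The class name and the validation rules
   are assumptions made for this example, not part of either proposal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Scope:
    """Hypothetical scope entry: a scope-type plus an optional value."""

    scope_type: str
    value: Optional[str] = None  # None means "this connection"

    def __post_init__(self) -> None:
        if self.scope_type == "Connection":
            # The value is implied by the connection carrying the report.
            if self.value is not None:
                raise ValueError("Connection scope takes no explicit value")
        elif not self.value:
            raise ValueError(f"{self.scope_type} scope requires a value")
```

   For example, Scope("Realm", "example.com") and Scope("Connection")
   are accepted, while Scope("Realm") is rejected.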

4.4.  Combining Scopes

   Diameter nodes will commonly need to construct overload reports that
   apply to a combination of scopes.  For example, if a given realm is
   overloaded for a subset of the applications it supports, a report
   might indicate both a realm scope and one or more Diameter
   application scopes.

   Logically, combining multiple scopes of different types reduces the
   overall set of requests to which the overload report would apply.
   Combining multiple scopes of the same type increases the applicable
   set.  A function that determines the requests affected by an overload
   report could model this as a logical "and" or "intersection" operator
   for combining scopes of different types, and a logical "or" or
   "union" operator for combining scopes of the same type.
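
   A minimal Python sketch of such a function follows.  It assumes that
   scopes arrive as (scope-type, value) pairs and that a request is
   modeled as a mapping keyed by the same scope-type names; both are
   simplifications made for this example.  The sketch rejects an empty
   scope list, since the meaning of a report with no explicit scope is
   an open question.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Set, Tuple

# Hypothetical representation: (scope-type, scope-value).
Scope = Tuple[str, str]


def report_applies(scopes: Iterable[Scope],
                   request: Mapping[str, str]) -> bool:
    """Return True if the request falls within the combined scopes."""
    by_type: Dict[str, Set[str]] = defaultdict(set)
    for scope_type, value in scopes:
        by_type[scope_type].add(value)
    if not by_type:
        raise ValueError("no scopes given; default scope is undefined")
    # "or" within a scope-type (set membership),
    # "and" across scope-types (all must match).
    return all(request.get(scope_type) in values
               for scope_type, values in by_type.items())
```

   With one "Realm" scope and two "Application" scopes, a request
   matches only if it is for that realm and for either of the two
   applications.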



   The working group should study whether all possible combinations
   should be allowed.  For example, it may or may not make sense to
   combine a "Connection" scope with other scopes, or to allow more than
   one "Connection" scope-value for a single overload report.

4.5.  Scope Extensibility

   [I-D.ietf-dime-overload-reqs] requires scope-types to be extensible.
   This requirement implies that the chosen mechanism or mechanisms must
   discuss how new scope-types can be added, how support for specific
   scope-types should be declared or negotiated, and which scope-types
   might be mandatory to support.
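
   One way an implementation might structure this internally is a
   registry of scope-type matchers, sketched below in Python.  The
   registry, the function names, and the choice to raise an error for
   an unsupported scope-type are all assumptions of this example; how
   a node should treat a scope-type it does not support is one of the
   questions the chosen mechanism would need to answer.

```python
from typing import Callable, Dict, List, Mapping

# A matcher takes (scope-value, request) and returns True on a match.
Matcher = Callable[[str, Mapping[str, str]], bool]

_MATCHERS: Dict[str, Matcher] = {}


def register_scope_type(name: str, matcher: Matcher) -> None:
    """Add a scope-type; extensions would call this as well."""
    if name in _MATCHERS:
        raise ValueError(f"scope-type already registered: {name}")
    _MATCHERS[name] = matcher


def supported_scope_types() -> List[str]:
    """Scope-types this node could declare during negotiation."""
    return sorted(_MATCHERS)


def matches(scope_type: str, value: str,
            request: Mapping[str, str]) -> bool:
    """Evaluate one scope against a request (AVP name -> value)."""
    if scope_type not in _MATCHERS:
        raise LookupError(f"unsupported scope-type: {scope_type}")
    return _MATCHERS[scope_type](value, request)


register_scope_type(
    "Realm", lambda v, r: r.get("Destination-Realm") == v)
register_scope_type(
    "Origin-Host", lambda v, r: r.get("Origin-Host") == v)
```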

4.6.  Scope Recommendations

   In the author's opinion, the selected solution or solutions should
   support, at a minimum, the "Connection", "Destination-Host", "Realm"
   and "Application-ID" scope-types.  The working group should consider
   also adding the "Origin-Host" scope-type.

   The working group should consider whether the advantages of the
   "session-group" concept and scope-type are worth the complexity.  The
   group should also study whether the Peer scope-type adds sufficient
   utility over the Connection scope-type to warrant its inclusion.

5.  IANA Considerations

   This draft makes no requests of IANA.

6.  Security Considerations

   Overload reports induce Diameter nodes to reduce or reroute traffic.
   For large scopes, a single erroneous or malicious overload report
   could effectively shut down Diameter processing for an entire realm.
   A Diameter overload control solution needs mechanisms to ensure that
   overload reports are only accepted from trusted sources, and that
   nothing tampers with the reports en route.

   For adjacent approaches, the transport connection can be protected
   with TLS or IPsec.  This will not help for non-adjacent reporting,
   however, since no such transport connection exists.

   Although work on such a mechanism is in progress in the DIME working
   group, Diameter currently has no viable mechanism for end-to-end
   authentication and integrity protection.  The working group should
   consider either making non-adjacent overload control contingent on a
   generic Diameter end-to-end protection mechanism, or adding a
   specialized protection mechanism to any resulting non-adjacent
   overload control solution.

7.  References

7.1.  Normative References

   [RFC6733]  Fajardo, V., Arkko, J., Loughney, J., and G. Zorn,
              "Diameter Base Protocol", RFC 6733, October 2012.

   [I-D.ietf-dime-overload-reqs]
              McMurry, E. and B. Campbell, "Diameter Overload Control
              Requirements", draft-ietf-dime-overload-reqs-07 (work in
              progress), June 2013.

7.2.  Informative References

   [I-D.roach-dime-overload-ctrl]
              Roach, A. and E. McMurry, "A Mechanism for Diameter
              Overload Control", draft-roach-dime-overload-ctrl-03 (work
              in progress), May 2013.

   [I-D.korhonen-dime-ovl]
              Korhonen, J. and H. Tschofenig, "The Diameter Overload
              Control Application (DOCA)", draft-korhonen-dime-ovl-01
              (work in progress), February 2013.

   [Whac-a-Mole]
              "Whack-a-Mole Colloquial Usage",
              <http://en.wikipedia.org/wiki/Whack-a-mole#Colloquial_usage>.

Appendix A.  Contributors

   Eric McMurry and Robert Sparks made significant contributions to the
   concepts in this draft.

Author's Address

   Ben Campbell
   Tekelec
   17210 Campbell Rd.
   Suite 250
   Dallas, TX  75252
   US

   Email: ben@nostrum.com