TOC |
|
Overload occurs in Session Initiation Protocol (SIP) networks when SIP servers have insufficient resources to handle all SIP messages they receive. Even though the SIP protocol provides a limited overload control mechanism through its 503 (Service Unavailable) response code, SIP servers are still vulnerable to overload. This document defines an overload control mechanism for SIP.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”
This Internet-Draft will expire on December 16, 2010.
Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
1.
Introduction
2.
Terminology
3.
Via Header Parameters for Overload Control
3.1.
The 'oc_accept' Parameter
3.2.
Creating the 'oc' Parameter
3.3.
Determining the 'oc' Parameter Value
3.4.
Processing the 'oc' Parameter
3.5.
Using the 'oc' Parameter Value
3.6.
Self-Limiting
4.
Responding to an Overload Indication
4.1.
Message Prioritization
4.2.
Rejecting Requests
5.
Syntax
6.
Design Considerations
6.1.
SIP Mechanism
6.1.1.
SIP Response Header
6.1.2.
SIP Event Package
6.2.
Backwards Compatibility
7.
Security Considerations
8.
IANA Considerations
9.
References
9.1.
Normative References
9.2.
Informative References
Appendix A.
Acknowledgements
§
Authors' Addresses
TOC |
As with any network element, a Session Initiation Protocol (SIP) [RFC3261] (Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, “SIP: Session Initiation Protocol,” June 2002.) server can suffer from overload when the number of SIP messages it receives exceeds the number of messages it can process. Overload can pose a serious problem for a network of SIP servers. During periods of overload, the throughput of a network of SIP servers can be significantly degraded. In fact, overload may lead to a situation in which the throughput drops down to a small fraction of the original processing capacity. This is often called congestion collapse.
Overload is said to occur if a SIP server does not have sufficient resources to process all incoming SIP messages. These resources may include CPU processing capacity, memory, network bandwidth, input/output, or disk resources.
For overload control, we only consider failure cases where SIP servers are unable to process all SIP requests due to resource constraints. There are other cases where a SIP server can successfully process incoming requests but has to reject them due to failure conditions unrelated to the SIP server being overloaded. For example, a PSTN gateway that runs out of trunk lines but still has plenty of capacity to process SIP messages should reject incoming INVITEs using a 488 (Not Acceptable Here) response [RFC4412] (Schulzrinne, H. and J. Polk, “Communications Resource Priority for the Session Initiation Protocol (SIP),” February 2006.). Similarly, a SIP registrar that has lost connectivity to its registration database but is still capable of processing SIP messages should reject REGISTER requests with a 500 (Server Error) response [RFC3261] (Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, “SIP: Session Initiation Protocol,” June 2002.). Overload control does not apply to these cases and SIP provides appropriate response codes for them.
The SIP protocol provides a limited mechanism for overload control through its 503 (Service Unavailable) response code. However, this mechanism cannot prevent overload of a SIP server and it cannot prevent congestion collapse. In fact, the use of the 503 (Service Unavailable) response code may cause traffic to oscillate and to shift between SIP servers and thereby worsen an overload condition. A detailed discussion of the SIP overload problem, the problems with the 503 (Service Unavailable) response code and the requirements for a SIP overload control mechanism can be found in [RFC5390] (Rosenberg, J., “Requirements for Management of Overload in the Session Initiation Protocol,” December 2008.).
TOC |
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 (Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” March 1997.) [RFC2119].
TOC |
Because overload control is best performed hop-by-hop, the Via parameter is attractive since it allows two adjacent SIP entities to indicate support for, and exchange information associated with overload control. This document defines three new parameters for the SIP Via header for overload control. These parameters provide a SIP mechanism for conveying overload control information between adjacent SIP entities.) The three parameters are:
TOC |
A SIP server that supports this specification MUST add an "oc_accept" parameter to the Via headers it inserts into SIP requests. This provides an indication to downstream neighbors that this server supports overload control.
A SIP server MUST remove the 'oc_accept' parameter from the topmost Via header in a response before forwarding the response. The topmost Via header is determined after the SIP server has removed its own Via header. The topmost Via header corresponds to the upstream neighbor of the SIP server processing the response, and the 'oc-accept' parameter was added by the upstream neighbor when the request was proxied by the upstream neighbor.
- Removing the 'oc-accept' parameter from the topmost Via header before forwarding the response upstream has the property that the upstream entity knows that the downstream SIP server is capable of supporting overload control as specified by this document. This is true even if the downstream SIP server is currently not experiencing an overload situation and therefore did not insert other overload related parameters.
TOC |
A SIP server can provide overload control feedback to its upstream neighbors by adding the 'oc' parameter to the topmost Via header field of a SIP response. The 'oc' parameter is a new Via header parameter defined in this specification. When an 'oc' parameter is added to a response, it MUST be inserted into the topmost Via header. It MUST NOT be added to any other Via header in the response. The topmost Via header is determined after the SIP server has removed its own Via header; i.e., it is the Via header that was generated by the upstream neighbor.
Since the topmost Via header of a response will be removed by an upstream neighbor after processing it, overload control feedback contained in the 'oc' parameter will not travel beyond the upstream SIP server. A Via header parameter therefore provides hop-by-hop semantics for overload control feedback (see [I‑D.ietf‑sipping‑overload‑design] (Hilt, V., Noel, E., Shen, C., and A. Abdelal, “Design Considerations for Session Initiation Protocol (SIP) Overload Control,” July 2009.)) even if the next hop neighbor does not support this specification.
The 'oc' parameter can be used in all response types, including provisional, success and failure responses. A SIP server MAY add the 'oc' parameter to all responses it is sending. A SIP server SHOULD add an 'oc' parameter to responses, that contain an 'oc_accept' parameter in the topmost Via header. A SIP server MUST add an 'oc' parameter to responses when the transmission of overload control feedback is required by the overload control algorithm to limit the traffic received by the server. I.e., a SIP server MUST insert the 'oc' parameter when the overload control algorithm sets the 'oc' parameter to a value different than the default value.
A SIP server that has added an 'oc' parameter to Via header SHOULD also add a 'oc_validity' parameter to the same Via header. The 'oc_validity' parameter defines the time in milliseconds during which the content (i.e., the overload control feedback) of the 'oc' parameter is valid. The default value of the 'oc_validity' parameter is 500. A SIP server SHOULD use a shorter 'oc_validity' time if its overload status varies quickly and MAY use a longer 'oc_validity' time if this status is more stable. If the 'oc_validity' parameter is not present, its default value is used. The 'oc_validity' parameter MUST NOT be used in a Via header without an 'oc' parameter and MUST be ignored if it appears in a Via header without 'oc' parameter.
When a SIP server retransmits a response, it MUST use the 'oc' parameter value and 'oc-validity' parameter value consistent with the overload state at the time the retransmitted response is sent. This implies that the values in the 'oc' and 'oc-validity' parameters may be different then the ones used in previous retransmissions of the response.
A SIP server MAY forward the content of an 'oc' parameter it has received from a downstream neighbor on to its upstream neighbor. However, forwarding the content of the 'oc' parameter is generally NOT RECOMMENDED and should only be performed if permitted by the configuration of SIP servers. For example, a SIP server that only relays messages between exactly two SIP servers may forward an 'oc' parameter. The 'oc' parameter is forwarded by copying it from the Via in which it was received into the next Via header (i.e., the Via header that will be on top after processing the response). If an 'oc_validity' parameter is present, MUST be copied along with the 'oc' parameter.
The 'oc' and 'oc_validity' Via header parameters are only defined in SIP responses and MUST NOT be used in SIP requests. These parameters are only useful to the upstream neighbor of a SIP server (i.e., the entity that is sending requests to the SIP server) since this is the entity that can offload traffic by redirecting/rejecting new requests. If requests are forwarded in both directions between two SIP servers (i.e., the roles of upstream/downstream neighbors change), there are also responses flowing in both directions. Thus, both SIP servers can exchange overload information.
While adding 'oc' and 'oc_validity' parameters to requests may increase the frequency with which overload information is exchanged in these scenarios, this increase will rarely provide benefits and does not justify the added overhead and complexity needed.
Since overload control protects a SIP server from overload, it is RECOMMENDED that a SIP server generally inserts 'oc' and 'oc_validity' parameters into responses to all SIP servers. However, if a SIP server wanted to limit its overload control capability for privacy reasons, it MAY decide to add 'oc' and 'oc_validity' parameters only to responses that are sent via a secured transport channel such as TLS. The SIP server can use transport level authentication to identify the SIP servers, to which responses with these parameters are sent. This enables a SIP server to protect overload control information and ensure that it is only visible to trusted parties.
TOC |
The value of the 'oc' parameter is determined by an overload control algorithm (see [I‑D.ietf‑sipping‑overload‑design] (Hilt, V., Noel, E., Shen, C., and A. Abdelal, “Design Considerations for Session Initiation Protocol (SIP) Overload Control,” July 2009.)). This specification does not mandate the use of a specific overload control algorithm. However, the output of an overload control algorithm MUST be compliant to the semantics of this Via header parameter.
The 'oc' parameter value specifies the percentage by which the load forwarded to this SIP server should be reduced. Possible values range from 0 (the traffic forwarded is reduced by 0%, i.e., all traffic is forwarded) to 100 (the traffic forwarded is reduced by 100%, i.e., no traffic forwarded). The default value of this parameter is 0.
OPEN ISSUE: the 'oc' parameter value specified in this document is defined to contain a loss rate. However, other types of overload control feedback exist, for example, a target rate for rate-based overload control or message confirmations and window-size for window-based overload control.
While it would in theory be possible to allow multiple types of overload control feedback to co-exist (e.g., by using different parameters for the different feedback types) it is very problematic for interoperability purposes and would require SIP servers to implement multiple overload control mechanisms.
TOC |
A SIP entity compliant to this specification SHOULD remove 'oc' and 'oc_validity' parameters from all Via headers of a response received, except for the topmost Via header. This prevents 'oc'/'oc_validity' parameters that were accidentally or maliciously inserted into Via headers by a downstream SIP server from traveling upstream.
A SIP server maintains the 'oc' parameter values received along with the address of the SIP servers from which they were received for the duration specified in the 'oc_validity' parameter or the default duration. Each time a SIP server receives a response with an 'oc' parameter from a SIP server, it overwrites the 'oc' value it has currently stored for this server with the new value received. The SIP server restarts the validity period of an 'oc' parameter each time a response with an 'oc' parameter is received from this server. A stored 'oc' parameter value MUST be discarded once it has reached the end of its validity.
TOC |
A SIP server compliant to this specification MUST honor 'oc' parameter values it receives from downstream neighbors. The SIP server MUST NOT forward more messages to a SIP server than allowed by the current 'oc' parameter value from this server.
When forwarding a SIP request, a SIP entity uses the SIP procedures to determine the next hop SIP server as, e.g., described in [RFC3261] (Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, “SIP: Session Initiation Protocol,” June 2002.) and [RFC3263] (Rosenberg, J. and H. Schulzrinne, “Session Initiation Protocol (SIP): Locating SIP Servers,” June 2002.). After selecting the next hop server, the SIP server MUST determine if it has an 'oc' parameter value for this server. If it has a non-expired 'oc' parameter value, the SIP server MUST determine if it can or cannot forward the current request within the current conditions.
The SIP server SHOULD use the following algorithm to determine if it can forward the request. The SIP server draws a random number between 1 and 100 for the current request. If the random number is less than or equal to the 'oc' parameter value, the request is not forwarded. Otherwise, the request is forwarded. Any other algorithms that approximate the random number algorithm may be used as well.
TOC |
In some cases, a SIP server may not receive a response from a downstream neighbor when sending a request. RFC3261 (Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, “SIP: Session Initiation Protocol,” June 2002.) [RFC3261] defines that when a timeout error is received from the transaction layer, it MUST be treated as if a 408 (Request Timeout) status code has been received. If a fatal transport error is reported by the transport layer, it MUST be treated as a 503 (Service Unavailable) status code.
In these cases, a SIP server MUST stop sending requests to this server. The SIP server SHOULD occasionally forward a single request to probe if the downstream neighbor is alive. Once a SIP server has successfully transmitted a request to the downstream neighbor, it can resume normal transmission of requests. It should, of course, honor an 'oc' parameters it may receive. This avoids that a SIP server, which is unable to respond to incoming requests, is overloaded with additional requests.
TOC |
An element can receive overload control feedback indicating that it needs to reduce the traffic it sends to its downstream neighbor. An element can accomplish this task by sending some of the requests that would have gone to the overloaded element to a different destination. It needs to ensure, however, that this destination is not in overload and capable of processing the extra load. An element can also buffer requests in the hope that the overload condition will resolve quickly and the requests still can be forwarded in time. In many cases, however, it will need to reject these requests.
TOC |
During an overload condition, a SIP server needs to prioritize messages and select messages that need to be rejected or redirected. While this selection is largely a matter of local policy, certain heuristics can be suggested. One, during overload control, the SIP server should preserve existing dialogs as much as possible. This suggests that mid-dialog requests MAY be given preferential treatment. Similarly, requests that result in releasing resources (such as a BYE) MAY also be given preferential treatment.
A SIP server SHOULD honor a request containing the Resource-Priority header field RFC4412 (Schulzrinne, H. and J. Polk, “Communications Resource Priority for the Session Initiation Protocol (SIP),” February 2006.) [RFC4412]. Resource-Priority header field enables a proxy to identify high-priority requests, such as emergency service requests, and preserve them as much as possible during times of overload.
TOC |
A SIP server that rejects a request because of overload MUST reject this request with a 503 (Service Unavailable) response code. This response code indicates that the request did not succeed because the SIP servers processing the request are under overload.
A SIP server that is under overload and has started to throttle incoming traffic SHOULD use 503 responses without Retry-After header to reject a fraction of requests from upstream neighbors that do not include the 'oc_accept' parameter in their Via headers. These neighbors do not support this specification and will not respond to overload control feedback in the 'oc' parameter. If the upstream server does not support "oc" then the downstream server must bear the cost of rejecting some session requests as well as the cost of processing other requests to completion. It would be fair to devote the same amount of processing at the overloaded server to the combination of rejection and processing, as the overloaded server would devote to processing requests from an overload compliant upstream server (note to editor: reword later.) This is to ensure that SIP servers, which do not support this specification, don't receive an unfair advantage over those that do.
A SIP server that has reached overload (i.e., a load close to 100) SHOULD start using 503 responses in addition to using the 'oc' parameter for all upstream neighbors. If the proxy has reached a load close to 100, it needs to protect itself against overload. Also, it is likely that upstream proxies have ignored overload feedback and do not support this specification.
TOC |
This section defines the syntax of three new Via header parameters: 'oc', 'oc_validity' and 'oc_accept'. These Via header parameters are used to implement an overload control feedback loop between neighboring SIP servers.
The 'oc' and 'oc_validity' parameters are only defined in the topmost Via header of a response. They MUST NOT be used in the Via headers of requests and MUST NOT be used in other Via headers of a response. The 'oc' and 'oc_validity' parameters MUST be ignored if received outside of the topmost Via header of a response. The 'oc_accept' parameter MAY appear in all Via headers.
The 'oc' Via header parameter contains a number between 0 and 100. It describes the percentage by which the traffic to the SIP server from which the response has been received should be reduced. The default value for this parameter is 0.
The 'oc_validity' Via header parameter contains the time during which the corresponding 'oc' Via header parameter is valid. The 'oc_validity' parameter can only be present in a Via header in conjunction with an 'oc' parameter.
The 'oc_accept' Via header parameter indicates that the SIP server, which has created this Via header, supports overload control.
oc = "oc" [EQUAL 0-100]
oc-validity = "oc_validity" [EQUAL delta-ms]
oc-accept = "oc_accept"
This extends the existing definition of the Via header field parameters, so that its BNF now looks like:
via-params = via-ttl / via-maddr / via-received / via-branch / oc / oc-validity / oc-accept / via-extension
Example:
Via: SIP/2.0/TCP ss1.atlanta.example.com:5060 ;branch=z9hG4bK2d4790.1 ;received=192.0.2.111 ;oc=20;oc_validity=500
TOC |
This section discusses specific design considerations for the mechanism described in this document. General design considerations for SIP overload control can be found in [I‑D.ietf‑sipping‑overload‑design] (Hilt, V., Noel, E., Shen, C., and A. Abdelal, “Design Considerations for Session Initiation Protocol (SIP) Overload Control,” July 2009.).
TOC |
A SIP mechanism is needed to convey overload feedback from the receiving to the sending SIP entity. A number of different alternatives exist to implement such a mechanism.
TOC |
Overload control information can be transmitted using a new Via header field parameter for overload control. A SIP server can add this header parameter to the responses it is sending upstream to provide overload control feedback to its upstream neighbors. This approach has the following characteristics:
TOC |
Overload control information can also be conveyed from a receiver to a sender using a new event package. Such an event package enables a sending entity to subscribe to the overload status of its downstream neighbors and receive notifications of overload control status changes in NOTIFY requests. This approach has the following characteristics:
This document defines a new Via header field parameter for overload control.
TOC |
An new overload control mechanism needs to be backwards compatible so that it can be gradually introduced into a network and functions properly if only a fraction of the servers support it.
Hop-by-hop overload control (see [I‑D.ietf‑sipping‑overload‑design] (Hilt, V., Noel, E., Shen, C., and A. Abdelal, “Design Considerations for Session Initiation Protocol (SIP) Overload Control,” July 2009.)) has the advantage that it does not require that all SIP entities in a network support it. It can be used effectively between two adjacent SIP servers if both servers support overload control and does not depend on the support from any other server or user agent. The more SIP servers in a network support hop-by-hop overload control, the better protected the network is against occurrences of overload.
A SIP server may have multiple upstream neighbors from which only some may support overload control. If a server would simply use this overload control mechanism, only those that support it would reduce traffic. Others would keep sending at the full rate and benefit from the throttling by the servers that support overload control. In other words, upstream neighbors that do not support overload control would be better off than those that do.
A SIP server should therefore use 5xx responses towards upstream neighbors that do not support overload control. The server should reject the same amount of requests with 5xx responses that would be otherwise be rejected/redirected by the upstream neighbor if it would support overload control. If the load condition on the server does not permit the creation of 5xx responses, the server should drop all requests from servers that do not support overload control.
TOC |
Overload control mechanisms can be used by an attacker to conduct a denial-of-service attack on a SIP entity if the attacker can pretend that the SIP entity is overloaded. When such a forged overload indication is received by an upstream SIP entity, it will stop sending traffic to the victim. Thus, the victim is subject to a denial-of-service attack.
An attacker can create forged overload feedback by inserting itself into the communication between the victim and its upstream neighbors. The attacker would need to add overload feedback indicating a high load to the responses passed from the victim to its upstream neighbor. Proxies can prevent this attack by communicating via TLS. Since overload feedback has no meaning beyond the next hop, there is no need to secure the communication over multiple hops.
Another way to conduct an attack is to send a message containing a high overload feedback value through a proxy that does not support this extension. If this feedback is added to the second Via headers (or all Via headers), it will reach the next upstream proxy. If the attacker can make the recipient believe that the overload status was created by its direct downstream neighbor (and not by the attacker further downstream) the recipient stops sending traffic to the victim. A precondition for this attack is that the victim proxy does not support this extension since it would not pass through overload control feedback otherwise.
A malicious SIP entity could gain an advantage by pretending to support this specification but never reducing the amount of traffic it forwards to the downstream neighbor. If its downstream neighbor receives traffic from multiple sources which correctly implement overload control, the malicious SIP entity would benefit since all other sources to its downstream neighbor would reduce load.
OPEN ISSUE: the solution to this problem depends on the overload control method. For rate-based and window-based overload control, it is very easy for a downstream entity to monitor if the upstream neighbor throttles traffic forwarded as directed. For percentage throttling this is not always obvious since the load forwarded depends on the load received by the upstream neighbor.
TOC |
[TBD.]
TOC |
TOC |
[RFC2119] | Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” BCP 14, RFC 2119, March 1997 (TXT, HTML, XML). |
[RFC3261] | Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, “SIP: Session Initiation Protocol,” RFC 3261, June 2002 (TXT). |
[RFC3263] | Rosenberg, J. and H. Schulzrinne, “Session Initiation Protocol (SIP): Locating SIP Servers,” RFC 3263, June 2002 (TXT). |
[RFC4412] | Schulzrinne, H. and J. Polk, “Communications Resource Priority for the Session Initiation Protocol (SIP),” RFC 4412, February 2006 (TXT). |
TOC |
[I-D.ietf-sipping-overload-design] | Hilt, V., Noel, E., Shen, C., and A. Abdelal, “Design Considerations for Session Initiation Protocol (SIP) Overload Control,” draft-ietf-sipping-overload-design-02 (work in progress), July 2009 (TXT). |
[RFC5390] | Rosenberg, J., “Requirements for Management of Overload in the Session Initiation Protocol,” RFC 5390, December 2008 (TXT). |
TOC |
Many thanks to Rich Terpstra, Daryl Malas, Jonathan Rosenberg and Charles Shen for their contributions to this specification.
TOC |
Vijay K. Gurbani (editor) | |
Bell Laboratories, Alcatel-Lucent | |
1960 Lucent Lane, Rm 9C-533 | |
Naperville, IL 60563 | |
USA | |
Email: | vkg@bell-labs.com |
Volker Hilt | |
Bell Labs/Alcatel-Lucent | |
791 Holmdel-Keyport Rd | |
Holmdel, NJ 07733 | |
USA | |
Email: | volkerh@bell-labs.com |
Henning Schulzrinne | |
Columbia University/Department of Computer Science | |
450 Computer Science Building | |
New York, NY 10027 | |
USA | |
Phone: | +1 212 939 7004 |
Email: | hgs@cs.columbia.edu |
URI: | http://www.cs.columbia.edu |