Internet DRAFT - draft-fan-opsawg-transmission-interruption
draft-fan-opsawg-transmission-interruption
Operations and Management Area Working Group P. Fan
Internet-Draft L. Li
Intended status: Informational China Mobile
Expires: January 16, 2014 July 15, 2013
Requirements for IP/MPLS network transmission interruption duration
draft-fan-opsawg-transmission-interruption-03
Abstract
The transmission performance of IP/MPLS network affects upper layer
services and networks, but there is no consensus in the industry on
transmission interruption for IP/MPLS network up to now. This memo
studies requirements for the interruption duration criteria in
several service scenarios.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 16, 2014.
Copyright Notice
Copyright (c) 2013 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Fan & Li Expires January 16, 2014 [Page 1]
Internet-Draft IP/MPLS transmission interruption July 2013
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Services and Performance Criteria . . . . . . . . . . . . . . 3
2.1. Softswitch . . . . . . . . . . . . . . . . . . . . . . . 3
2.2. SS7 transport . . . . . . . . . . . . . . . . . . . . . . 5
2.3. LTE Backhaul . . . . . . . . . . . . . . . . . . . . . . 6
2.4. Ethernet VPN . . . . . . . . . . . . . . . . . . . . . . 6
2.5. IPTV . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3. Other considerations . . . . . . . . . . . . . . . . . . . . 7
4. Security Considerations . . . . . . . . . . . . . . . . . . . 7
5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 7
6. Appendix: Impact Analysis on Transmission Quality of IP
Carried Softswitch Voice . . . . . . . . . . . . . . . . . . 7
7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 10
8. Informative References . . . . . . . . . . . . . . . . . . . 10
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 11
1. Introduction
Today's IP/MPLS network is widely used as a bearer network to carry
diversified packet switched services. The transmission qualities of
these services are closely related to the performance of bearer
layers, as network failure, delay, congestion and other abnormities
will inevitably bring about service interruption and user perception
degradation. However, there is no consensus in the industry on
transmission interruption for IP/MPLS network up to now. This memo
studies relationships between service performance and transmission
interruption duration in several scenarios, and is intended to reach
a list of requirements for these interruption duration criteria.
For a long time the industry has been aspiring for the so-called
golden standard for network resilience, that is the 50-millisecond
recovery threshold. [HeavyReading] gives us a basic introduction to
the origin of this fast protection legacy which can date back to
1980s. The 50ms threshold was established informally in the early
1980s, and then formally through standardization of [G.841]
recommendation on SDH network protection architects. The specific
requirement shows a maximum threshold for detecting and restoring a
fault of 60ms, which adds up fault detection duration of less than
10ms and protection switching time of less than 50ms. The report
also mentions original concerns that the threshold results from. The
voice channel banks deployed in early 1980s had limited fault
tolerance. Failures that lasted longer than 200ms would generate a
Carrier Group Alarm (CGA) which caused the channel bank to terminate
all connections over that given TDM line. So an outage budget was
developed by carriers and the 50ms standard was employed to protect
voice services. However newer channel banks at that time had started
Fan & Li Expires January 16, 2014 [Page 2]
Internet-Draft IP/MPLS transmission interruption July 2013
to implement a CGA timer of 2s, so the 50ms protection was adopted to
protect a small and diminishing fraction of digital network.
Historically this 50ms fast protection speed has been achieved by SDH
network. Using various fast convergence technics, IP/MPLS is also
able to react within 50ms. As for network applications that are
carried by optical or packet core, changes have been made through the
past decades, accompanied by the continuing questions about needs for
50ms protection. Here we list three basic considerations about
services and their requirement for IP/MPLS: for services like TDM
over IP/MPLS, the traditional 50ms guarantee should be kept and met;
for current IP services (e.g. voice, internet), experiences or
experiments are to be provided for guidance; for services in future,
we are supposed to propose requirement early and give consideration
to IP/MPLS.
2. Services and Performance Criteria
Services delivered by IP/MPLS network have different transmission
quality requirements, thus introduce different performance criteria
for the bearing IP/MPLS network. We believe there are two principles
that need to be considered during network and service design,
configuration and operation. The IP/MPLS bearer should satisfy
quality requirements of upper level services and applications, while
services and applications should also take into account the intrinsic
IP capabilities. In this section we will describe concerns on IP/
MPLS and service mutual adaptation from aspects of several kinds of
service scenarios.
2.1. Softswitch
From the softswitch point of view, the IP carrying nature imposes
certain influence to the service quality. Especially when speech is
delivered by IP, the communication quality of voice is impaired, and
in turn makes higher requirements for the transmission performance of
IP. The following table gives a list of criteria regarding
transmission quality of a typical GSM network as well as impacting
factors brought by IP bearer.
+-----------------------------------------+------------------------+
| Criteria of GSM | Impacting Factors |
| Transmission Quality | Brought by IP Bearer |
+----------+------------------------------+------------------------+
| |Call loss of wireless channel | None |
| +------------------------------+------------------------+
| | Call loss between switches | Failure of Nc/Mc |
|Call Loss | (typical value: <=1%) | interface carried by IP|
| +------------------------------+------------------------+
Fan & Li Expires January 16, 2014 [Page 3]
Internet-Draft IP/MPLS transmission interruption July 2013
| | Call loss between switch and | None |
| | BSC (typical value: <=0.5%) | |
+----------+------------------------------+------------------------+
| Call | Call cut-off rate | Failure of Nc/Mc |
| Cut-off | (typical value: <1%) | interface carried by IP|
+----------+------------------------------+------------------------+
| | Service providing delay | None |
| +------------------------------+------------------------+
|Connection| Calling party connection | IP carried signaling |
| Delay | delay (typical value: <=4s) | delay |
| +------------------------------+------------------------+
| |Called party connection delay | None |
| | (typical value: <=4s) | |
+----------+------------------------------+------------------------+
If voice is carried by IP, communication quality criteria of call
loss, call cut-off and connection delay are likely to be influenced.
This subsection focuses on the three criteria and their impacting
factors to give requirements for softswitch and IP bearer networks,
with detailed analysis described in the appendix. Note that the
current discussion on softswitch is focused on quality of
transmission while not on quality of voice. In another word, the
scope of discussion is limited to network related QoS aspect, while
subjective QoE criteria such as PESQ (Perceptual Evaluation of Speech
Quality) and MOS (Mean Opinion Score) are left to later revisions.
Call loss related requirement: The duration of SCTP interface
association timer should be shorter than that of the state machine
message timer of upper layer protocols, and this duration is
further recommended to be no longer than 6 seconds in order to
maintain detection sensitivity; the interruption duration of IP
bearer network should be as short as possible to avoid call loss,
and this duration is further recommended to be no longer than 5
seconds.
Call cut-off related requirement: The SCTP association should be
guaranteed during IP layer interruption to avoid interface
breakoff alert. The requirements are the same as those related to
call loss.
Connection delay related requirement: The IP convergence time should
be no longer than 3 seconds to ensure that connection delay is
shorter than 4 seconds.
The overall requirement for IP/MPLS interruption duration is no
longer than 3 seconds.
Fan & Li Expires January 16, 2014 [Page 4]
Internet-Draft IP/MPLS transmission interruption July 2013
2.2. SS7 transport
The Signaling System No. 7 (SS7/C7) network is one of the examples of
the principle that services should take into account the ability of
IP. The bearer of SS7 protocol stack has been experiencing evolution
from TDM to IP. Traditionally the user parts of SS7 (including MAP,
CAP, BSSAP+, ISUP, etc.) are carried by MTP layers, but the bearer
has gradually been evolved into a packetized form with SIGTRAN
(including M2PA, M2UA, M3UA, etc.) using SCTP associations over IP.
The change requires transport layer to take mechanisms to meet demand
of SCN signaling, and more importantly it requires protocols to make
adaption to the "best effort" fact of IP.
The SIGTRAN uses an architecture that can be described as standard IP
plus unified transport plus diversified adaption units. It
introduces SCTP to realize reliable signaling transport over IP. The
SCTP itself provides reliable transmission mechanisms, such as path
selection and monitoring, validation and acknowledgment mechanisms,
and retransmission timing management.
The unreliable nature of IP makes it necessary for the upper-level
protocols to be more tolerable to the possible instability of bearer.
Once a service request from a UE is accepted, the system allocates
resources and establishes paths for the user. A breakoff caused by
IP will result in signaling disconnection or rerouting. Signaling
transmission path may also be switched back after IP layer restores.
Frequent switchovers and disconnections lead to unnecessary system
cost and service interruption, so parameters should be configured a
little bit "insensitive" to try to sustain connections on control
plane.
One of the examples of parameter configuration is the timer value.
The following gives two cases about SCTP on transport layer and M2PA
on adaption layer. The values should not be set very small to
prevent unnecessary disconnection caused by IP instability. However,
because upper services of SS7 may also have timeout rules, values
should not be set very large too to avoid violating the rules.
1) SCTP
SCTP uses RTO to manage timeout duration for retransmission in case
of feedback missing. The RTO is given an initial, a max and a min
value, and is calculated instantaneously with a set of management
rules. Many other parameters are used for fault detection in SCTP.
Association.Max.Retrans is used to indicate the upper limit of number
of possible retransmission without considering endpoint down.
Path.Max.Retrans is a similar value to detect path failure. The
parameters together characterize the ability of SCTP to tolerate
Fan & Li Expires January 16, 2014 [Page 5]
Internet-Draft IP/MPLS transmission interruption July 2013
bearer downwards and provide reliable SS7 transport upwards. The
typical values of the parameters are RTO.Initial = 0.5 sec, RTO.MIN =
0.5 sec, RTO.MAX = 1.5 sec, Path.Max.Retrans = 5, Assoc.Max.Retrans =
10.
2) M2PA
Although protocols like H.248 and BICC can be carried directly upon
SCTP, the user part protocols of SS7 usually have to be carried by
SCTP/IP with the help of different adaption layers. In this case,
the attributes of adaption layers, e.g. M2PA used between STPs, are
more important to SS7. M2PA uses a T7 timer to indicate the maximum
delay of acknowledgement and start T7 at the time of data
transmission. If no message is acknowledged after the maximum
waiting time, T7 expires and M2PA sends a message of out of service
to the peer end. Because propagation delays in IP networks are more
variable than in traditional SS7 networks, the value of T7 should be
set considering IP propagation delays, as well as acknowledgement
time, SCTP slow-start algorithms, upper service timers and other
factors. Typical value of T7 is 7~10 sec.
Parameter configuration induced tolerance to bearer may have some
influence on service, but it avoids service cut-off or severe user
perception degradation. For services like SMS or route lookup,
possible latency may be introduced, but operations can still be
completed after short delay. Because SMS has no strict requirement
for instantaneity, impact on service is limited. If route lookup
takes more time due to IP interruption and convergence, user may
experience longer setup delay when dialing. For service of location
update, even if operation fails because bearer is interrupted for too
long, UE has the mechanism to initiate request again.
2.3. LTE Backhaul
To be further analyzed.
2.4. Ethernet VPN
Ethernet VPNs (e.g. VPLS) are used to provide transparent Ethernet
type layer 2 connections for customers. Ethernet frames are treated
as service payload and encapsulated and transported in providers MPLS
network. The interruption criteria of IP/MPLS bearer should
guarantee continuity of Ethernet service, and IP/MPLS failover is not
supposed to generate outage of Ethernet service.
[Y.1731] and [IEEE802.1ag] describe in detail OAM functions and
mechanisms for Ethernet, with specific recommendation on connectivity
fault management. Ethernet uses continuity check function to detect
Fan & Li Expires January 16, 2014 [Page 6]
Internet-Draft IP/MPLS transmission interruption July 2013
loss of continuity between any pair of MEPs in a MEG, and this
function is realized by sending CCMs (connectivity check messages)
between peer MEPs. When a MEP does not receive CCM from a peer MEP
within a certain interval, it detects loss of continuity to that peer
MEP. The threshold interval is specified as 3.5 times the CCM
transmission period, which corresponds to a loss of three consecutive
CCMs from the peer MEP, and the CCM transmission period is
recommended to be the default value of 1 second. So the interruption
duration of IP/MPLS for Ethernet VPN services should be less than 3
seconds.
2.5. IPTV
To be further analyzed.
3. Other considerations
So far this document has focused on use cases and their requirement
for IP/MPLS, and other practical issues are not included in this
version. For example, an IP/MPLS packet core is expected to carry a
variety of services, so the requirement for IP/MPLS may have to
include additional concerns on this multi-service co-existence
scenario. A simple and straight-forward way may be to satisfy the
most critical need for protection time required by the services.
Another issue is related to service awareness. Whether service type
is or can be known by IP/MPLS would influence the ability of IP/MPLS
to provide reliability guarantee accordingly. It seems to be easier
to perform service identification on edge devices than network core.
We believe these kinds of issues need to be taken into account, and
currently we will just leave them to be updated in future revisions.
4. Security Considerations
TBD
5. IANA Considerations
This memo includes no request to IANA.
6. Appendix: Impact Analysis on Transmission Quality of IP Carried
Softswitch Voice
This section describes impact on transmission quality of softswitch
voice when carried by IP and requirements for IP bearer convergence
time.
1) Call Loss
Fan & Li Expires January 16, 2014 [Page 7]
Internet-Draft IP/MPLS transmission interruption July 2013
Call loss is used to describe the circumstance where a phone call
fails to establish after initiated by a subscriber due to network
faults. In the practical network, the call loss rate is mainly
associated by the factors as follows:
1. Interfaces, including Nc, Mc and interface between MSS and SG.
2. State machine message timer. If a timeout takes place, the state
machine releases signaling messages, producing a call loss.
Typical value of BICC timer is 10~15 seconds and value of DTAP
timer about 15 seconds.
3. Interface association timer. Associations breaks off at the
expiration of timer.
4. Bearer network convergence time.
If the configured timer duration of a state machine is shorter than
the timer duration of interface association, then although interface
association may not be broken off, call loss is still possible to
occur due to message timer expiration. If the association timer
duration is shorter than IP routing convergence time, the association
is considered broken off by SCTP, hence message loss at interface
between MSS and SG as well as interface Nc results in massive call
loss, and new calling request cannot be satisfied because of
interface Mc breakoff. In this case, the call loss rate can be
calculated as
Call Loss Rate = ( IP Convergence Time + Association Restoration
Time ) * CAPS / BHCA.
However, if the association timer duration is longer than IP routing
convergence time, then the association is considered normal by SCTP,
and data will be retransmitted. Although this may cause buffer
overflow leading to call loss, the call loss rate is possible to
achieve approximately zero if buffer is big enough.
From the analysis above and practical operation experience, the
requirements for softswitch and IP bearer are as follows: the
duration of SCTP interface association timer should be shorter than
that of the state machine message timer, and this duration is further
recommended to be no longer than 6 seconds in order to maintain
detection sensitivity; the interruption duration of IP bearer network
should be as short as possible to avoid call loss during the IP layer
interruption period, and this duration is further recommended to be
no longer than 5 seconds.
Fan & Li Expires January 16, 2014 [Page 8]
Internet-Draft IP/MPLS transmission interruption July 2013
2) Call Cut-off
Call cut-off is referred to the abnormal release during a phone call
due to reasons other than intentional release by any of the parties
involved in the call. The call cut-off rate is related with:
1. Interfaces, including Nc and interface between MSS and SG.
2. Interface association timer.
3. Bearer network convergence time.
If the association timer duration is shorter than IP routing
convergence time, established phone calls will be released once
interruption of interface Nc or interface connecting MSS and SG is
detected. In the case of association breakoff, call cut-off rate can
be calculated as
Call Cut-off Rate = ( CAPS * Call Duration ) * Busy Hour Association
Breakoffs / BHCA.
While if the association is not interrupted, the call cut-off rate
can be approximately zero.
In conclusion, the SCTP association should be guaranteed during IP
layer interruption to avoid interface breakoff alert. The
requirements for softswitch and IP bearer are the same as those
related to call loss.
3) Connection Delay
The connection delay from a call initiation by a calling party to
PLMN should be no longer than 4 seconds. This delay is affected by
factors below:
1. RRC connection setup delay (irrelevant to whether service is
carried by IP or not).
2. Core network signaling interaction delay. The message number at
interface Nc/Nb is 6, and is 8 (calling side) or 16 (called side,
in case of IP-IP) at interface Mc. Each message is with a delay
of no longer than 50 milliseconds. Calling message delay at
interface Nc is no longer than 300 milliseconds. If long
distance call is made though CMN, the message delay is to be
increased by transmission delay of 5 msec/km and CMN process
delay. So the message delay is likely to be 400 milliseconds.
Fan & Li Expires January 16, 2014 [Page 9]
Internet-Draft IP/MPLS transmission interruption July 2013
3. IP bearer network QoS and load.
The connection delay is influenced by the delay criterion defined in
the IP bearer network QoS, and is raised by delay, jitter, packet
loss caused by network overload. In addition, if the configured
timer duration of interface association is too long, the SCTP
sensitivity to the retransmitted messages after packet loss will be
decreased, which increases connection delay.
Connection delay is generally expressed as
Connection Delay = IP convergence time + RRC connection setup delay
+ Signaling Interaction Delay,
and is no longer than 4 seconds. So the IP network in normal working
state should be constrained within a certain range of load to ensure
that delay is shorter than 50 milliseconds, while in interruption
state the IP convergence time should be no longer than 3 seconds to
ensure that connection delay is shorter than 4 seconds.
From the analysis of IP/MPLS performance according to the three
criteria above, we suggest the transmission interruption duration of
IP/MPLS network for softswitch service should be no longer than 3
seconds.
7. Acknowledgements
The authors would like to thank Chris Donley and Melinda Shore for
their kind help in content enrichment, and Christopher Liljenstolpe,
Andrew Malis and Adrian Farrel for their helpful comments on the
document.
8. Informative References
[G.841] ITU-T Recommendation G.841, ., "Types and characteristics
of SDH network protection architectures", October 1998.
[HeavyReading]
Bennett, G., "Resilience Reliability and OAM in Converged
Network", Heavy Reading, Vol. 2, No. 6, February 2004.
[IEEE802.1ag]
IEEE Std 802.1ag-2007, ., "IEEE Standard for Local and
metropolitan area networks, Virtual Bridged Local Area
Networks, Amendment 5: Connectivity Fault Management",
December 2007.
Fan & Li Expires January 16, 2014 [Page 10]
Internet-Draft IP/MPLS transmission interruption July 2013
[Y.1731] ITU-T Recommendation Y.1731, ., "OAM Functions and
Mechanisms for Ethernet based Networks", July 2011.
Authors' Addresses
Peng Fan
China Mobile
32 Xuanwumen West Street, Xicheng District
Beijing 100053
P.R. China
Email: fanpeng@chinamobile.com
Lianyuan Li
China Mobile
32 Xuanwumen West Street, Xicheng District
Beijing 100053
P.R. China
Email: lilianyuan@chinamobile.com
Fan & Li Expires January 16, 2014 [Page 11]