Network Working Group                                            X. Ding
Internet-Draft                                                     Q. Wu
Intended status: Standards Track                                  Huawei
Expires: January 4, 2018                                           R. Gu
                                                            China Mobile
                                                            July 3, 2017

An Enhanced Media Delivery Index (eMDI) based on TCP
draft-ding-tcp-emdi-00

Abstract

This document introduces an Enhanced Media Delivery Index (eMDI) that can be used as a diagnostic tool or a quality indicator for monitoring a network intended to deliver streaming media over TCP. It aims to address the limitations of RFC 4445 when measuring in environments where TCP is the dominant transport for streaming media.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on January 4, 2018.

Copyright Notice

Copyright (c) 2017 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.


1. Introduction

TCP is a major transport protocol in most IP networks and carries over 80 percent of all traffic (e.g., OTT traffic, IPTV VoD traffic) across the public Internet today. Packet loss ratio and latency are the two main network characteristics that affect the behavior of TCP. Poor TCP performance may also indicate an unacceptable level of end-user-perceived quality.

The Media Delivery Index (MDI) [RFC4445] is widely used as a diagnostic tool to measure both the instantaneous and longer-term behavior of networks carrying streaming media at the media layer. However, MDI is mostly applicable to streaming media carried over UDP and falls short when monitoring a network intended to deliver multimedia applications over TCP. In particular, the traditional MDI metrics deployed in network devices, especially the Media Loss Rate (MLR), have difficulty inferring packet loss when the TCP sender detects the loss and retransmits the missing packets. In addition, a TCP sender reduces its sending rate to lower the probability of further packet loss, so throughput declines as retransmissions add delay. Throughput can therefore be regarded as a quality indication for network monitoring and diagnosis.

This document introduces a new measurement method and associated metrics, i.e., downstream, upstream, and end-to-end throughput, to complement the methods defined in [RFC4445]. The new method is intended to help identify the root cause of QoS-related problems quickly and to improve the efficiency of network monitoring and troubleshooting.

2. Terminologies

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

This document uses the following terms:

Measurement point (MP):
A measurement point is a logical or physical location on the path of the TCP connection that acts as a source of information gathered for monitoring purposes.
Upstream packet loss ratio (UPLR):
UPLR is the ratio of the number of packets lost to the total number of packets sent from the server to the measurement point during a predefined measurement interval.
Downstream packet loss ratio (DPLR):
DPLR is the ratio of the number of packets lost to the total number of packets sent from the measurement point to the client during a predefined measurement interval.
Upstream average RTT (URTT):
URTT is the average RTT on the path from the server to the measurement point during a predefined measurement interval.
Downstream average RTT (DRTT):
DRTT is the average RTT on the path from the measurement point to the client during a predefined measurement interval.
End to end Throughput (E2ET):
E2ET is the rate of successful packet delivery over an end-to-end network path during a predefined measurement interval.
Downstream throughput (DT):
DT is measured by the number of packets received per second downstream of the measurement point during a predefined measurement interval.
Upstream throughput (UT):
UT is measured by the number of packets received per second upstream of the measurement point during a predefined measurement interval.

3. Measurement Setup

A stream of packets sent by the streaming media server passes through the MP (which can be a bridge, router, or gateway) and finally reaches the client (the destination endpoint). If a node A is placed between the server and the MP in the network, then A is an upstream node of the MP. Otherwise, A is a downstream node of the MP.

  +--------+                MP |                    +--------+
  | Server |------Upstream---->|-----Downstream---->| Client |
  +--------+                   |                    +--------+
                                                         

4. Measurement Method

The rationale of the measurement is to compare DT, UT, and E2ET with the data packet rate. If DT is less than the data packet rate and UT is greater than the data packet rate, the problem lies in the downstream network. Conversely, if UT is less than the data packet rate and DT is greater than it, the problem lies in the upstream network.

When packet loss occurs in the network, the packet loss probability imposes an additional limit on throughput besides the TCP receive window. Under light or moderate packet loss, when the TCP rate is governed by the congestion avoidance algorithm, DT can be calculated according to the following formula:

DT = MSS / ((DRTT + URTT) * DPLR^(1/2))

where MSS is the maximum segment size. Assuming the number of packets lost downstream during a predefined measurement interval is a, and the total number of packets sent by the MP is x, DPLR is calculated as follows:

DPLR = a / x

The average RTT of a set of packets (d1..dm) in the downstream direction is used to compute DRTT:

DRTT = sum(RTTdi) / m, i = 1..m

where RTTdi is the RTT of packet di in the downstream direction.

Similarly, the average RTT of a set of packets (u1..un) in the upstream direction is used to compute URTT:

URTT = sum(RTTui) / n, i = 1..n

where RTTui is the RTT of packet ui in the upstream direction.

UT can be calculated according to the formula:

UT = MSS / ((DRTT + URTT) * UPLR^(1/2))

Assuming the number of packets lost upstream during a predefined measurement interval is b, and the total number of packets sent by the server is y, UPLR is calculated as follows:

UPLR = b / y

E2ET can be calculated according to the formula:

E2ET = MSS / ((DRTT + URTT) * (UPLR + DPLR)^(1/2))
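As an illustration only, the calculations in this section could be sketched in Python as follows. The function and parameter names (loss_ratio, throughput, emdi) are hypothetical and not part of this document. The sketch assumes MSS is given in bytes and RTT samples in seconds, so results are in bytes per second, and it assumes that an interval with zero observed loss is reported as an unbounded value, a case this document does not define.

```python
import math
from statistics import mean


def loss_ratio(lost: int, sent: int) -> float:
    """Packet loss ratio for one interval (DPLR = a/x, UPLR = b/y)."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return lost / sent


def throughput(mss_bytes: float, rtt_s: float, plr: float) -> float:
    """Loss-based bound MSS / (RTT * PLR^(1/2)), in bytes per second.

    Assumption: with no observed loss the bound does not apply, so
    infinity is returned instead of dividing by zero.
    """
    if plr <= 0:
        return math.inf
    return mss_bytes / (rtt_s * math.sqrt(plr))


def emdi(mss_bytes, downstream_rtts, upstream_rtts, a, x, b, y):
    """Compute DT, UT, and E2ET for one measurement interval.

    downstream_rtts / upstream_rtts: RTT samples (seconds) for packets
    d1..dm and u1..un; a, x: downstream lost/sent; b, y: upstream lost/sent.
    """
    drtt = mean(downstream_rtts)   # DRTT = sum(RTTdi) / m
    urtt = mean(upstream_rtts)     # URTT = sum(RTTui) / n
    dplr = loss_ratio(a, x)        # DPLR = a / x
    uplr = loss_ratio(b, y)        # UPLR = b / y
    rtt = drtt + urtt
    return {
        "DT": throughput(mss_bytes, rtt, dplr),
        "UT": throughput(mss_bytes, rtt, uplr),
        "E2ET": throughput(mss_bytes, rtt, uplr + dplr),
    }
```

For instance, calling emdi(1460, [0.01, 0.03], [0.02, 0.04], a=1, x=100, b=4, y=100) uses DRTT = 20 ms, URTT = 30 ms, DPLR = 1% and UPLR = 4%, and yields a DT of roughly 292 kB/s and a UT of roughly 146 kB/s.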

5. Use Examples

5.1. Network Troubleshooting in VoD scenario

                      +--------+
        IPTV Platform +--------+----------^--------------
         /OTT/CDN     +--------+          |
                      +----+---+          |
                           |              |
                     //----+---\\         |
                 |///            \\\|     |
                |                    |    |URTT
                 |\\\            ///|     |
                     \\----+---//         |
                           |              |
                           |              |
                      +--------+          |
                CR    +--------+          |
                           |              |
                           ---------------V------------------
                           |                      ^
               BRAS  +-----+---+                  |
                     +---/---\-+                  | Downstream
                       //     \\                  |   Fixed
                     //         \                 |   Network
       OLT  +---------+        +-\------+         |    Latency
            +---------+        +--------+    DRTT |
                                                  |
          ----     ----         ----     ----  --------------
         /----\   /----\       /----\   /----\    |
         |    |   |    |       |    |   |    |    |Home Network
    Home |    |   |    |       |    |   |    |    |  Latency
   Network    |   |    |       |    |   |    |    |
         +----+   +----+       +----+   +----+----V------

Figure 1

The proposed measurement method can be applied when VoD streaming media running over TCP is delivered as a unicast stream from a VoD server in the operator network to end users in the home network. In some cases a fault in the home network degrades the user experience; in other cases the fault occurs in the operator network.

To pinpoint the location of the fault, an MP can be deployed on the ONT device of the home network. The home network is then downstream of the MP and the operator network is upstream of the MP. Suppose the media rate is v; DT, UT, and E2ET can then be compared with v. If DT<v and UT>v, the home network is the root cause of the streaming media quality degradation. If DT>v and UT<v, the operator network is the root cause. If DT>v, UT>v, and E2ET<v, both the home network and the operator network contribute to the streaming media quality degradation.
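The comparison rules above can be sketched as a small Python function. This is a minimal illustration, not a normative procedure: the name locate_fault and the returned labels are hypothetical, all rates are assumed to be in the same unit, and any combination not covered by the three rules in this section is reported as "undetermined" by assumption.

```python
def locate_fault(dt: float, ut: float, e2et: float, v: float) -> str:
    """Classify the fault location by comparing DT, UT, and E2ET with
    the media rate v, with the MP deployed on the ONT."""
    if dt < v and ut > v:
        return "home network"        # downstream of the MP
    if dt > v and ut < v:
        return "operator network"    # upstream of the MP
    if dt > v and ut > v and e2et < v:
        return "both"
    # Assumption: cases not listed in this section are left undecided.
    return "undetermined"
```

For example, with a media rate of 500 kB/s, measured values of DT = 300 kB/s and UT = 800 kB/s would point to the home network.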

5.2. WiFi Anomaly Analysis in the Home Network

WiFi latency is a key factor impacting the user experience of home network applications. [WIFI] shows that WiFi latency follows a long-tail distribution: its 50th, 90th, and 99th percentiles are around 3 ms, 20 ms, and 250 ms. If the WiFi network becomes congested, quality degrades in proportion to WiFi latency. To analyze the degree of WiFi anomaly in the home network (see Figure 1), we can calculate the cumulative distribution of WiFi latency based on measured values:

WiFi Latency = DRTT - Downstream Fixed Network Latency

and determine a threshold value for WiFi latency based on a periodically collected dataset, e.g.,

Threshold = UBV + coef * (UBV - LBV)

where UBV is the 75th percentile value, LBV is the 25th percentile value, and coef is a coefficient that can be set to 1.5.

By comparing the measured WiFi latency with the threshold value, we can decide whether a WiFi anomaly is the root cause of the network quality degradation.
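A minimal Python sketch of this threshold check is given below. The function names are hypothetical, latencies are assumed to be in milliseconds, and the choice of the "inclusive" quartile method from the Python standard library is an assumption, since this document does not specify how percentiles are interpolated.

```python
from statistics import quantiles


def wifi_latency(drtt_ms: float, fixed_latency_ms: float) -> float:
    """WiFi Latency = DRTT - Downstream Fixed Network Latency."""
    return drtt_ms - fixed_latency_ms


def anomaly_threshold(samples, coef: float = 1.5) -> float:
    """Threshold = UBV + coef * (UBV - LBV).

    UBV and LBV are the 75th and 25th percentiles of the periodically
    collected WiFi latency samples (at least two samples are needed).
    """
    lbv, _median, ubv = quantiles(samples, n=4, method="inclusive")
    return ubv + coef * (ubv - lbv)


def is_wifi_anomaly(latency_ms: float, threshold_ms: float) -> bool:
    """True when the measured WiFi latency exceeds the threshold."""
    return latency_ms > threshold_ms
```

A measured value would then be flagged by computing the threshold once per collection period and calling is_wifi_anomaly(wifi_latency(drtt, fixed), threshold) for each new DRTT measurement.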

6. Security Considerations

This document does not introduce security issues beyond those discussed in [RFC4445].

7. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997.
[RFC4445] Welch, J. and J. Clark, "A Proposed Media Delivery Index (MDI)", RFC 4445, DOI 10.17487/RFC4445, April 2006.
[WIFI] "Characterizing and Improving WiFi Latency in Large-Scale Operational Networks", 2016.

Authors' Addresses

Xiaojian Ding
Huawei
101 Software Avenue, Yuhua District
Nanjing, Jiangsu 210012
China

EMail: dingxiaojian1@huawei.com

Qin Wu
Huawei
101 Software Avenue, Yuhua District
Nanjing, Jiangsu 210012
China

EMail: bill.wu@huawei.com

Rong Gu
China Mobile
32 Xuanwumen West Ave, Xicheng District
Beijing, 100053
China

EMail: gurong_cmcc@outlook.com