          Impact analysis from IPv6 GTP-U checksum calculation


   This document describes the performance impact of calculating the
   UDP checksum when encapsulating a packet into IPv6 GTP-U.

1.  Introduction

   [RFC6935] allows a zero UDP checksum when IPv6 UDP is used to
   encapsulate a packet.  Eliminating the checksum calculation can
   significantly improve forwarding performance.

   3GPP also allows a zero UDP checksum for GTP-U over IPv6 UDP
   encapsulation from Release 16 onward [TS.29281].  However, the UDP
   checksum still appears to be calculated in implementations of GTP-U
   over IPv6/UDP encapsulation.  This can have a non-negligible
   performance impact on nodes that encapsulate packets into IPv6
   GTP-U, especially in an NFV environment.

   This document analyzes the network performance impact caused by the
   IPv6 UDP checksum calculation.  For the analysis, we measured the
   latency variation in three configurations: (1) zero UDP checksum,
   (2) UDP checksum calculated in software, and (3) UDP checksum
   calculation offloaded to the NIC.

   These latencies were measured on a VPP (Vector Packet Processing)
   instance while it encapsulated packets with an IPv6, a UDP, and a
   GTP-U header.

2.  Terminology

   GTP-U: GPRS Tunneling Protocol for User Plane

   VPP: Vector Packet Processing

   NIC: Network Interface Card

3.  Impact analysis from UDP checksum calculation

3.1.  Vector Packet Processing

   VPP processes received packets in batches.  It stores a number of
   received packets and processes them at once.  Hence, batching causes
   a small degradation in the latency of sending the packets out to the
   network.
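
   The latency cost of batching can be sketched with a simplified
   model of the figure below.  This is an illustration only, not VPP's
   actual scheduler: it assumes that processing starts when the last
   packet of the batch has arrived and that every packet leaves when
   the whole batch is done.  The function name and the example numbers
   are hypothetical.

```python
def batch_latencies(arrivals_ns, per_packet_ns):
    """Return the latency of each packet in one batch (simplified model).

    arrivals_ns   -- arrival time of each packet in the batch, in ns
    per_packet_ns -- assumed processing cost per packet, in ns
    """
    # Assumption: the batch is processed only after its last packet arrived.
    batch_start = max(arrivals_ns)
    # Assumption: all packets are released together when the batch is done.
    batch_done = batch_start + per_packet_ns * len(arrivals_ns)
    return [batch_done - t for t in arrivals_ns]


# Hypothetical batch of three packets arriving 100 ns apart, 50 ns each.
print(batch_latencies([0, 100, 200], per_packet_ns=50))
```

   In this model the earliest packet of a batch sees the highest
   latency, because it waits for both the rest of the batch and the
   processing of all n packets.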

INPUT: +--+ +--+ +--+     +--+
       |P1| |P2| |P3| ... |Pn|
       +--+ +--+ +--+     +--+                      Batching Process
                              |                        (n packets)
                              |                  |<------------------->|
OUTPUT:                       |                  +--+ +--+ +--+     +--+
                              |                  |P1| |P2| |P3| ... |Pn|
                              |                  +--+ +--+ +--+     +--+
                              |                                        |
                              |<------------ batching time ------------>

   Using VPP, we measured the latency impact of encapsulating packets
   into IPv6 GTP-U with software-based checksum calculation and with
   checksum calculation offload, in order to determine the impact on
   network performance.

   To simplify the impact analysis, only one CPU core is assigned to
   packet processing in VPP, and packets arrive on one interface and
   are sent out on another.  A traffic generator outside of VPP sends
   the packets and receives the IPv6 GTP-U encapsulated packets.  With
   this setup, the impact on latency is measured in each case.

3.2.  Software-based checksum calculation

   This section describes the performance impact analysis when using
   software-based checksum calculation.
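
   To show what the software path has to do for every packet, the
   following is a minimal sketch of the UDP checksum over the IPv6
   pseudo-header.  It is not the code used in the measurements.  The
   function names and the example addresses are hypothetical; the port
   2152 is the registered GTP-U port.

```python
import ipaddress
import struct

UDP_PROTO = 17    # IPv6 Next Header value for UDP
GTPU_PORT = 2152  # registered UDP port for GTP-U


def ones_complement_sum(data: bytes) -> int:
    """16-bit ones' complement sum; every byte of the input is touched."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold the carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def udp6_checksum(src: str, dst: str, udp_segment: bytes) -> int:
    """Checksum over the IPv6 pseudo-header plus UDP header and payload."""
    pseudo = (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        # 32-bit upper-layer length, 3 zero bytes, Next Header
        + struct.pack("!I3xB", len(udp_segment), UDP_PROTO)
    )
    csum = ~ones_complement_sum(pseudo + udp_segment) & 0xFFFF
    # A computed 0 is sent as 0xFFFF; 0 on the wire means "no checksum".
    return csum or 0xFFFF


def build_udp(src: str, dst: str, payload: bytes, zero_checksum: bool) -> bytes:
    """Build a UDP segment with either a zero or a calculated checksum."""
    header = struct.pack("!HHHH", GTPU_PORT, GTPU_PORT, 8 + len(payload), 0)
    if zero_checksum:
        return header + payload  # no pass over the payload is needed
    csum = udp6_checksum(src, dst, header + payload)
    return header[:6] + struct.pack("!H", csum) + payload
```

   The zero-checksum branch only writes the 8-byte header, whereas the
   calculated branch reads the whole payload once more.  That extra
   pass is the per-packet CPU cost examined in this section.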

   No checksum calculation: No packet loss
   |    Store-Forward     |    Store-Forward     |   Store-Forward    |
   |   Avg Latency (ns)   |   Min Latency (ns)   |  Max Latency (ns)  |
   |        15,336        |        14,535        |      124,397       |

   Software checksum calculation: No packet loss
   |    Store-Forward     |    Store-Forward     |   Store-Forward    |
   |   Avg Latency (ns)   |   Min Latency (ns)   |  Max Latency (ns)  |
   |        15,477        |        14,592        |      123,337       |

   INPUT: 100 pps, 1492 byte packet

   In this case, there is no impact on the network performance.  Since
   the incoming packet rate is low enough, the software checksum
   calculation can be completed within the batching time to process the
   packets.

   No checksum calculation: No packet loss
   |    Store-Forward     |    Store-Forward     |   Store-Forward    |
   |   Avg Latency (ns)   |   Min Latency (ns)   |  Max Latency (ns)  |
   |       120,005        |        62,650        |    1,300,217       |

   Software checksum calculation: 3.676% packet loss
   |    Store-Forward     |    Store-Forward     |   Store-Forward    |
   |   Avg Latency (ns)   |   Min Latency (ns)   |  Max Latency (ns)  |
   |     8,167,461        |        52,467        |    9,341,807       |

   INPUT: 275k pps, 1492 byte packet

   In this case, there is a large impact on the network performance.
   When the total CPU time required to calculate the UDP checksums
   exceeds the batching time to process the packets, latency increases
   sharply.  In addition, because the checksum calculation consumes CPU
   time, the software cannot obtain enough CPU time to process the
   packets, which causes significant packet loss.
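
   The time budget behind this result can be worked out from the input
   rate alone.  The sketch below computes the per-packet budget on a
   single core and the offered load for the 275k pps case.  The loss
   function is a rough deterministic model, and the service times
   passed to it are hypothetical values, not measurements.

```python
PPS = 275_000        # input rate from the measurement above
PACKET_BYTES = 1492  # packet size from the measurement above


def per_packet_budget_ns(pps: float) -> float:
    """Average time available per packet on one core, in nanoseconds."""
    return 1e9 / pps


def offered_load_gbps(pps: float, packet_bytes: int) -> float:
    """Offered load in Gbit/s, counting only the stated packet size."""
    return pps * packet_bytes * 8 / 1e9


def steady_state_loss(pps: float, service_ns: float) -> float:
    """Fraction of packets dropped once queues are full (rough model).

    Assumes a constant per-packet service time and a single core.
    """
    budget = per_packet_budget_ns(pps)
    if service_ns <= budget:
        return 0.0
    return 1.0 - budget / service_ns


print(f"budget: {per_packet_budget_ns(PPS):.0f} ns per packet")
print(f"load:   {offered_load_gbps(PPS, PACKET_BYTES):.2f} Gbit/s")
# Hypothetical service times, one below and one above the budget.
for service_ns in (3_000, 4_000):
    print(service_ns, f"{steady_state_loss(PPS, service_ns):.1%}")
```

   At 275k pps the core has roughly 3.6 microseconds per packet.  Any
   per-packet cost above that budget, including the checksum pass over
   a 1492-byte packet, leads first to queueing delay and then to loss.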

3.3.  Checksum calculation offload

   Some NICs support UDP checksum calculation offload.  When this
   function is enabled on the NIC, the UDP checksum is calculated by
   the NIC, so no CPU time is consumed for calculating it.  This section
   describes the performance impact analysis when UDP checksum offload
   is enabled on the NIC.

   No checksum calculation: No packet loss
   |    Store-Forward     |    Store-Forward     |   Store-Forward    |
   |   Avg Latency (ns)   |   Min Latency (ns)   |  Max Latency (ns)  |
   |       15,336         |        14,535        |      124,397       |

   UDP checksum offload: No packet loss
   |    Store-Forward     |    Store-Forward     |   Store-Forward    |
   |   Avg Latency (ns)   |   Min Latency (ns)   |  Max Latency (ns)  |
   |       15,349         |        14,537        |       63,742       |

   INPUT: 100 pps, 1492 byte packet

   In this case, there is no impact on the network performance.  Since
   the incoming packet rate is low enough, all received packets can be
   processed within the batching time.

   No checksum calculation: No packet loss
   |    Store-Forward     |    Store-Forward     |   Store-Forward    |
   |   Avg Latency (ns)   |   Min Latency (ns)   |  Max Latency (ns)  |
   |      120,005         |        62,650        |     1,380,217      |

   UDP checksum offload: No packet loss
   |    Store-Forward     |    Store-Forward     |   Store-Forward    |
   |   Avg Latency (ns)   |   Min Latency (ns)   |  Max Latency (ns)  |
   |      134,126         |        74,090        |     2,443,992      |

   INPUT: 275k pps, 1492 byte packet

   In this case, there is a small impact on the network performance.
   The UDP checksum calculation by the NIC cannot be completed within
   the batching time, and hence it causes a small degradation in
   latency.  However, whereas software-based checksum calculation
   causes significant packet loss under the same conditions, there is
   no packet loss when using UDP checksum offload.
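
   To put the three configurations side by side, the average latencies
   reported in Sections 3.2 and 3.3 can be compared against the
   no-checksum baseline.  The sketch below uses only the measured
   averages from the tables; the dictionary keys and the helper name
   are hypothetical.

```python
# Average store-forward latency in ns, copied from the tables above.
AVG_LATENCY_NS = {
    "100 pps": {"no_checksum": 15_336, "software": 15_477, "offload": 15_349},
    "275k pps": {"no_checksum": 120_005, "software": 8_167_461, "offload": 134_126},
}


def relative_increase(baseline: float, value: float) -> float:
    """Increase of value over baseline, as a fraction of the baseline."""
    return (value - baseline) / baseline


for rate, row in AVG_LATENCY_NS.items():
    base = row["no_checksum"]
    for mode in ("software", "offload"):
        change = relative_increase(base, row[mode])
        print(f"{rate:>8} {mode:>8}: {change:+.1%} vs. no checksum")
```

   At 100 pps both modes stay within about 1% of the baseline.  At 275k
   pps, offload adds roughly 12% to the average latency, while the
   software calculation raises it to about 68 times the baseline.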

4.  Security Considerations

   This document has no security considerations.

5.  IANA Considerations

   This document has no IANA considerations.

6.  Contributors

   In addition to the authors listed on the front page, the following
   individuals have also made significant contributions to the draft:

   (Artwork only available as : No external link available, see draft-
   murakami-dmm-udp-checksum-impact-gtpu-01.html for artwork.)

