Internet DRAFT - draft-elkins-v6ops-ipv6-packet-sequence-needed
INTERNET-DRAFT N. Elkins
Intended Status: Informational Inside Products
M. Ackermann
BCBS Michigan
W. Jouris
Inside Products
K. Haining
US Bank
S. Perdomo
DTCC
Expires: April 2014 October 3, 2013
IPv6 Packet Sequence Number Needed
draft-elkins-v6ops-ipv6-packet-sequence-needed-01
Abstract
To diagnose performance and connectivity problems, metrics on real
(non-synthetic) transmission are critical for timely end-to-end
problem resolution. Such diagnostics may be real-time or after the
fact, but must not impact an operational production network. The base
metrics are: packet sequence number and packet timestamp. Metrics
derived from these will be described separately. This document
provides the background and rationale for the packet sequence number
which is a part of the IPv6 Performance and Diagnostic Metrics
Destination Option (PDM).
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html
Elkins Expires April 14, 2014 [Page 1]
INTERNET DRAFT-elkins-v6ops-ipv6-packet-sequence-needed-01 October 2013
Copyright and License Notice
Copyright (c) 2013 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1 Background
1.1 Why Packet Sequence Number
1.2 IPv4 IPID : DeFacto Sequence Number
1.2.1 Description of IPID in IPv4
1.2.2 DeFacto Use of IPID
1.2.3 Merits of DeFacto Usage
1.2.4 Use Cases of IPv4 IPID in Diagnostics
1.3 TCP sequence number is not enough
1.4 Inadequacy of current measurement techniques
1.4.1 SNMP / CMIP Counters
1.4.2 Router / Firewall Logs
1.4.3 Netflow
1.4.4 Access to Intermediate Devices
1.4.5 Modifications to an Operational Production Network
2 Solution Parameters
2.1 Packet Trace Meets Criteria
2.1.1 Limitations of Packet Capture
2.1.2 Problem Scenario 1
2.1.3 Problem Scenario 2
3 Rationale for Proposed Solution (PDM)
4 Backward Compatibility
5 Security Considerations
6 IANA Considerations
7 References
7.1 Normative References
8 Acknowledgments
Authors' Addresses
1 Background
To diagnose performance and connectivity problems, metrics on real
(non-synthetic) transmission are critical for timely end-to-end
problem resolution. Such diagnostics may be real-time or after the
fact, but must not impact an operational production network. The base
metrics are: packet sequence number and packet timestamp. Metrics
derived from these will be described separately.
For background, please see draft-ackermann-tictoc-pdm-ntp-usage-00
[ACKPDM], draft-elkins-6man-ipv6-pdm-dest-option-02 [ELKPDM], draft-
elkins-v6ops-ipv6-end-to-end-rt-needed-01 [ELKRSP], draft-elkins-
v6ops-ipv6-pdm-recommended-usage-01 [ELKUSE] and draft-elkins-ippm-
pdm-metrics-00 [ELKIPPM]. These drafts are companions to this
document.
As discussed in the above Internet Drafts, current methods are
inadequate for these purposes because they assume unreasonable access
to intermediate devices, are cost prohibitive, require infeasible
changes to a running production network, or do not provide timely
data. The IPv6 Performance and Diagnostic Metrics destination option
(PDM) provides a solution to these problems. This document will
detail the background and need for the packet sequence number.
1.1 Why Packet Sequence Number
In many networks, during network diagnostics of an end-to-end
connection, it becomes necessary to find the device along the network
path creating problems. Diagnostic data may be collected at multiple
places along the path (if possible), or at the source and
destination. Then, the diagnostic data must be matched. Packet
sequence number is critical in this matching process. The timestamp
or even the IP addresses may be different at different devices. In
IPv4 networks, the IPID field was used as a de facto sequence number.
This will be discussed at greater length in section 1.2.
This method of data collection along the path is of special use on
large multi-tier networks to determine where packet loss or packet
corruption is happening. Multi-tier networks are those which have
multiple routers or switches on the path between the sender and the
receiver.
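The matching step described above can be sketched as follows. This is a minimal illustration, not part of the PDM specification: it assumes each capture point yields an ordered list of per-packet sequence numbers for one flow, and the names find_missing, locate_loss, and the capture point labels are hypothetical.

```python
def find_missing(upstream_ids, downstream_ids):
    """Return identifiers seen upstream but never seen downstream."""
    seen_downstream = set(downstream_ids)
    return [pid for pid in upstream_ids if pid not in seen_downstream]


def locate_loss(captures):
    """Compare each capture point with the next one along the path.

    captures: ordered list of (point_name, [packet sequence numbers]).
    Returns (from_point, to_point, missing_ids) for every hop where
    packets disappeared.
    """
    report = []
    for (name_a, ids_a), (name_b, ids_b) in zip(captures, captures[1:]):
        missing = find_missing(ids_a, ids_b)
        if missing:
            report.append((name_a, name_b, missing))
    return report


# Hypothetical traces of one flow taken at four points on the path.
captures = [
    ("source", [101, 102, 103, 104, 105]),
    ("core-switch", [101, 102, 103, 104, 105]),
    ("wan-router", [101, 103, 105]),
    ("destination", [101, 103, 105]),
]
loss_report = locate_loss(captures)
# loss_report == [("core-switch", "wan-router", [102, 104])]
```

In this sketch the loss is attributed to the segment between the last point that saw a packet and the first point that did not, which is the kind of localization the text describes.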
1.2 IPv4 IPID : DeFacto Sequence Number
With IPv4 networks, on many stack implementations, but not all, the
IPID field has the property of sequentiality.
1.2.1 Description of IPID in IPv4
In IPv4, the 16 bit IP Identification (IPID) field is located at an
offset of 4 bytes into the IPv4 header and is described in RFC0791
[RFC0791]. In IPv6, the IPID field is a 32-bit field contained in the
Fragment Header defined by section 4.5 of RFC2460 [RFC2460].
Unfortunately, unless fragmentation is being done by the source node,
the IPv6 packet will not contain this Fragment Header, and therefore
will have no Identification field.
The intended purpose of the IPID field, in both IPv4 and IPv6, is to
enable fragmentation and reassembly, and as currently specified is
required to be unique within the maximum segment lifetime (MSL) on
all datagrams. The MSL is often 2 minutes.
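As an illustration of the field layouts described above, the following sketch reads the Identification value from raw header bytes. It assumes the caller has already isolated the IPv4 header or the IPv6 Fragment Header; the function names and sample values are hypothetical.

```python
import struct


def ipv4_identification(header: bytes) -> int:
    """Read the 16-bit Identification field at byte offset 4."""
    if len(header) < 20:
        raise ValueError("IPv4 header must be at least 20 bytes")
    if header[0] >> 4 != 4:
        raise ValueError("not an IPv4 header")
    (ident,) = struct.unpack_from("!H", header, 4)
    return ident


def ipv6_fragment_identification(fragment_header: bytes) -> int:
    """Read the 32-bit Identification field of an IPv6 Fragment Header.

    Layout: Next Header (1 byte), Reserved (1 byte),
    Fragment Offset / Res / M flag (2 bytes), Identification (4 bytes).
    """
    if len(fragment_header) < 8:
        raise ValueError("Fragment Header must be 8 bytes")
    (ident,) = struct.unpack_from("!I", fragment_header, 4)
    return ident


# Hypothetical sample headers built only for this example.
sample_ipv4 = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 40,        # version/IHL, TOS, total length
    0x1C46, 0x4000,     # Identification, flags/fragment offset
    64, 6, 0,           # TTL, protocol (TCP), checksum (not computed)
    bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2]),
)
sample_fragment = struct.pack("!BBHI", 6, 0, 0, 0xDEADBEEF)

ipv4_id = ipv4_identification(sample_ipv4)
ipv6_id = ipv6_fragment_identification(sample_fragment)
```

An IPv6 packet that carries no Fragment Header gives the second function nothing to read, which is the gap this document describes.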
1.2.2 DeFacto Use of IPID
In many networks, the IPID field is used for more than fragmentation.
During network diagnostics, packet traces may be taken at multiple
places along the path, or at the source and destination. Then,
packets can be matched by looking at the IPID.
The inclusion of the IPID makes it easier for a device in the
middle of the network, or on the receiving end of the network, to
identify flows belonging to a single node, even if that node might
have a different IP address, for example when sessions go through a
NAT or proxy server.
For its de-facto diagnostic mode usage, the IPID field needs to be
available whether or not fragmentation occurs. It also needs to be
unique in the context of the session, and across all the connections
controlled by the stack. In IPv4, the IPID is in the main header, so
it is available for all packets. As it is a 16-bit field, it wrapped
during the course of the session and thus had some limitations.
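The wrap limitation can be made concrete with a little arithmetic. The sketch below assumes a stack that increments the IPID by one for every packet it sends, which is true of only some implementations; the function names are hypothetical.

```python
IPID_SPACE = 2 ** 16  # number of distinct values in a 16-bit field


def seconds_until_wrap(packets_per_second: float) -> float:
    """Time until the counter returns to a previously used value."""
    return IPID_SPACE / packets_per_second


def ipid_gap(earlier: int, later: int) -> int:
    """Forward distance from one IPID to another, modulo 2^16.

    Lets an analyst count intervening packets across a single wrap.
    """
    return (later - earlier) % IPID_SPACE


wrap_time = seconds_until_wrap(1000)   # 65.536 seconds at 1000 packets/s
gap_across_wrap = ipid_gap(65534, 2)   # 4
```

At 1000 packets per second the same value recurs in roughly 65 seconds, so a trace longer than that cannot rely on the IPID alone to identify a packet uniquely.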
Even with these limitations, the IPID has been valuable and useful in
IPv4 for diagnostics and problem resolution. It is a practical
solution that is 'good enough' in many instances. Not having it
available in IPv6 may be a major detriment to new IPv6 deployments
and contribute to protracted downtimes in existing IPv6 operations.
1.2.3 Merits of DeFacto Usage
As network technology evolves, the uses to which fields are put can
change as well. De-facto use is powerful, and should not be lightly
ignored. In fact, it is a testament to the power and pervasiveness
of the protocol that users create new uses for the original
technology.
For example, the use of the IPID goes beyond the vision of the
original authors. This sort of thing has happened with numerous
other technologies and protocols.
The traceroute command sends probe packets (UDP datagrams or ICMP
echo requests, depending on the implementation) with a varying TTL.
This is very useful for diagnostics yet departs from the original
purpose of TTL.
Similarly, cell phones have evolved to be more than just a means of
vocal communication, including Internet communications, photo-
sharing, stock exchange transactions, etc. Indeed, the Internet
itself has evolved, from a small network for researchers and the
military to share files into the pervasive global information
superhighway that it is today.
1.2.4 Use Cases of IPv4 IPID in Diagnostics
Use Case # 1 --- Large Insurance Company
- (estimated time saved by use of IPID: 7 hours)
Performance Tool produces extraneous packets
- Issue was whether a performance tool was accurately replicating
session flow during performance testing.
- Trace IPIDs showed more unique packets within same flow from
performance tool compared to IE Browser.
- Having the clear IPID sequence numbers also showed where and why
the extra packets were being generated.
- Solution: Problem rectified in subsequent version of performance
tool.
- Without IPID, it was not clear if there was an issue at all.
Use Case #2 --- Large Bank
- (estimated time saved by use of IPID: 4 hours)
Batch transfer duration increases 12x
- A data transfer which formerly took 30 minutes to complete
started taking 6-8 hours to complete.
- Was there packet loss? All the vendors said no.
- The other applications on the network did not report any
problems.
- 4 trace points were used, and the IPIDs in the packets were
compared.
- The comparison showed 7% packet loss.
- Solution: WAN hardware was replaced and problem fixed.
- Without IPID, no one would agree a problem existed.
Use Case #3 --- Large Bank
- (estimated time saved by use of IPID: 6 hours)
Very slow interactive performance
- All network links looked good.
- Traces showed duplicated small packets (which can be OK).
- We saw that the IPID was the same in both packets but the TTL
of one copy was always greater by 1.
- A network device was "splitting" only small packets over two
interfaces.
- The small packets were control info, telling other side to slow
down.
- It erroneously looked like network congestion.
- Solution: Network device replaced and good interactive
performance restored.
- Without IPID, flows would have appeared OK.
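The check used in Use Case #3 can be sketched as follows. This is an illustration only: it assumes the trace has been reduced to (IPID, TTL) pairs for one flow, and the function name and sample values are hypothetical.

```python
from collections import defaultdict


def same_ipid_different_ttl(packets):
    """Find IPIDs that appear with more than one TTL value.

    packets: iterable of (ipid, ttl) pairs from a single trace.
    Returns {ipid: sorted list of TTLs}. A copy with the same IPID
    but a different TTL took a different path through the network,
    so it is not a retransmission by the sender.
    """
    ttls_by_ipid = defaultdict(set)
    for ipid, ttl in packets:
        ttls_by_ipid[ipid].add(ttl)
    return {
        ipid: sorted(ttls)
        for ipid, ttls in ttls_by_ipid.items()
        if len(ttls) > 1
    }


# Hypothetical trace: packets 500 and 502 arrive twice.
trace = [(500, 57), (500, 58), (501, 57), (502, 57), (502, 58), (503, 57)]
suspects = same_ipid_different_ttl(trace)
# suspects == {500: [57, 58], 502: [57, 58]}
```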
Use Case #4 --- Large Government Agency
- (estimated time saved by use of IPID: 9 hours)
VPN drops
- Cell phone connections to law enforcement were being dropped.
The connections were going through a VPN.
- All parties (both sides of VPN connection, application, etc.) said
it was not their problem. The problem went on for weeks.
- Finally, we took a trace which showed packets with IPID and TTL
that did not match others in the flow AT ALL coming from the
router nearest the application server end of VPN.
- Solution: Provider for VPN for application server changed.
Problem resolved.
- Without IPID, much harder to diagnose problem.
- (Same case also happened with large corporation. Again, all
parties saying not their fault until proven via packet trace.)
1.3 TCP sequence number is not enough
TCP Sequence number is defined in RFC0793 [RFC0793]. Indeed, the TCP
Sequence Number along with the TCP Acknowledgment number can be used
to calculate dropped packets, duplicate packets, out-of-order packets
etc. That is, IF the packet flow itself reflects accurately what
happened on the wire!
See Scenario 1 (Section 2.1.2) and Scenario 2 (Section 2.1.3) for
what happens with packet trace capture in real networks.
The TCP Sequence Number is, obviously, available only for TCP and not
other transport protocols.
1.4 Inadequacy of current measurement techniques
The question arises of whether current methods of instrumentation
can be used without a change to the protocol. Current methods of
measuring network data, other than packet traces, are inadequate
because they assume unreasonable access to intermediate devices, are
cost prohibitive, require infeasible changes to a running production
network, or do not provide timely data. This section will discuss
each of these in detail.
Current methods include both instrumentation and third party
products. These include SNMP, CMIP, router logs, and firewall logs.
1.4.1 SNMP / CMIP Counters
The traditional network performance counters measured by SNMP or CMIP
do not provide information at the granularity desired on the behavior
of application flows across the network. The problem is that such
counters do not contain enough data to be able to provide a detailed
and realistic view of the end-to-end behavior of a connection.
1.4.2 Router / Firewall Logs
Router and firewall logs may provide some information for
diagnostics. But as discussed in section 1.4.5, routers and firewalls in a
production network are generally set to do minimal logging and
diagnostics to allow maximum efficiency and throughput. Such devices
cannot be asked to collect detailed data for an operational problem,
as this requires a change to a production network.
1.4.3 Netflow
Netflow is instrumentation which is available from some middle
devices. As discussed in detail in section 1.4.5, such devices are
generally set to do minimal logging and diagnostics to allow maximum
efficiency and throughput.
Correlations to produce some level of response time data may be
possible from Netflow. But, it is not an adequate picture of end-to-
end response time as Netflow is in an intermediate device and is not
in a position to know what has happened at a client.
1.4.4 Access to Intermediate Devices
The above current methods require access to the transport
infrastructure - that is, the routers, switches or other intermediate
devices. In some cases, this is possible; in others, the connections
in question may cross a number of administrative entities (both in
the transport and in the endpoints). When it is the enterprise at
the endpoint which is interested in the diagnostics, the
administrative entities who own the devices in the middle of the path
have no stake in operational measurement at the enterprise or
application level. They have no reason to provide the necessary
data or to impact the basic transport with the instrumentation
necessary to capture flow-oriented data as a continuous stream
suitable for general consumption.
In other words, if you don't own the path end-to-end, you will not be
able to get the data you need if you are required to get it from the
devices in the middle. Not only that, the devices in the middle do
not have the instrumentation necessary to make it easy to do end-to-
end diagnostics because they are not responsible for that and so do
not want to burden their devices with doing those kinds of functions.
Many Enterprise Data Center Operator (EDCO) networks may not own the
path end-to-end. They may be
working with a business partner's network or crossing the Internet.
1.4.5 Modifications to an Operational Production Network
Even when the enterprise does own all the devices along the entire
path, to get enough data to adequately resolve a problem means
changing the device configuration to do detailed diagnostics. In a
production network, devices are generally set to do minimal logging
and diagnostics. This is to allow maximum efficiency and throughput.
The more logging and diagnostics such devices do, the fewer resources
they have for actually transmitting traffic across the network.
So, if devices are to be asked to collect more data for an
operational problem, this requires a change to a production network.
This is generally not possible as it destabilizes a critical network
during business hours, thus potentially disrupting many customers.
Making changes is usually a lengthy process requiring change control,
testing on a test network, etc. On networks which are critical to
the business function, such as the networks we are discussing, it is
hardly likely that changing configuration "in flight" is an option.
2 Solution Parameters
What is needed is:
1) A method to identify and/or track the behavior of a connection
without assuming access to the transport devices.
2) A method to observe a connection in flight without introducing
agents at endpoints.
3) A method to observe arbitrary flows at multiple points within a
network and correlate the results of those observations in a
consistent manner.
4) A method to signal and correlate transport issues to application
end-to-end behavior.
5) A method which does not require changes to a production network in
real time.
6) Adequate granularity in the measurement technique to provide the
needed metrics.
2.1 Packet Trace Meets Criteria
The only instrumentation which provides enough detail to diagnose
end-to-end problems is a packet trace. Packet traces do not require
changes to devices in production mode because in many large EDCO
networks, products are available to capture packets in passive mode.
Such products continuously monitor network traffic. Often, they are
used not for diagnostic reasons but for regulatory reasons. For
example, there may be legal requirements to log all stock exchange
transactions.
Products for packet tracing are available freely and can be used at a
client host without disrupting major portions of the network.
2.1.1 Limitations of Packet Capture
Even though packet traces are the only reliable way to provide data
at the needed granularity, there are limitations with collecting them
in some situations. The following two scenarios illustrate them.
2.1.2 Problem Scenario 1
1. Packets are captured for analysis at places like large core
switches. All packets are kept. Again, not necessarily for
diagnostic reasons but for regulatory ones. For example, records of
all stock trades may need to be kept for a certain number of years.
2. When there is a problem, an analyst extracts the needed
information.
3. If the extract is done incorrectly, as often happens, or the
packet capture itself is incorrect, then there may be false duplicate
packets which can be quite misleading and can lead to wrong
conclusions. Are these real TCP duplicates? Is there congestion on
the subnet? Are these retransmissions? Situations have been seen
where routers incorrectly send two packets instead of one - is this
such a situation?
2.1.3 Problem Scenario 2
1. In this scenario, packets are captured for analysis at places like
a middleware box. It may be because problems are suspected with the
box itself or it is a central point of the suspected failure.
2. The box may not offer any way to tailor the packet capture. "You
will get what we give you, how we give it to you!" is their
philosophy.
3. The packet capture incorrectly duplicates only packets going to
certain nodes.
4. Again, there are false duplicate packets which can be misleading
and can lead to wrong conclusions. Are these real TCP duplicates? Is
there congestion on the subnet? Situations have been seen where
routers incorrectly send two packets instead of one - is this such a
situation?
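A per-packet sequence number helps answer the questions raised in both scenarios. The sketch below is an illustration under stated assumptions, not the PDM processing rules: it assumes each trace record carries a TCP sequence number plus a packet sequence number that the sending stack increments for every packet it transmits, and the function name and labels are hypothetical.

```python
from collections import defaultdict


def classify_duplicates(records):
    """Label TCP sequence numbers that appear more than once.

    records: iterable of (tcp_seq, packet_seq) from one flow in one
    trace. A sender retransmission is a new packet, so it carries a
    new packet_seq. Two records with identical packet_seq are the
    same packet seen twice (a capture artifact or duplication inside
    the network).
    """
    packet_seqs = defaultdict(list)
    for tcp_seq, packet_seq in records:
        packet_seqs[tcp_seq].append(packet_seq)

    labels = {}
    for tcp_seq, seqs in packet_seqs.items():
        if len(seqs) < 2:
            continue  # seen once, nothing to classify
        if len(set(seqs)) == 1:
            labels[tcp_seq] = "same packet seen twice"
        else:
            # At least one copy was sent as a separate packet.
            labels[tcp_seq] = "retransmission"
    return labels


# Hypothetical records: 2460 is a false duplicate, 3920 was resent.
records = [
    (1000, 7), (2460, 8), (2460, 8),
    (3920, 9), (5380, 10), (3920, 11),
]
labels = classify_duplicates(records)
```

With only the TCP sequence number available, both repeated entries in this example would look identical to the analyst.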
3 Rationale for Proposed Solution (PDM)
The current IPv6 specification does not provide a packet sequence
number or similar field in the IPv6 main header. One option might be
to force all IPv6 packets to contain a Fragment Header. In packets
which are entire in and of themselves, the Fragment Offset and M flag
would both be zero - that is, an atomic fragment. Why was a new
destination option header defined rather than recommending that the
Fragment Header be used?
Our reasoning was that the PDM destination option header would
provide multiple benefits: the packet sequence number and the
timestamp to calculate response time. See
draft-elkins-v6ops-ipv6-end-to-end-rt-needed-01 [ELKRSP].
4 Backward Compatibility
The scheme proposed in this document is backward compatible with all
the currently defined IPv6 extension headers. According to RFC2460
[RFC2460], if the destination node does not recognize this option, it
should skip over this option and continue processing the header.
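The skip behavior follows from the encoding of the Option Type octet in RFC2460 section 4.2, where the two highest-order bits tell a node what to do with an option it does not recognize. The following sketch decodes those bits; the function names are hypothetical and the sample option type values are chosen only for illustration, not taken from the PDM specification.

```python
# Action for an unrecognized option, keyed by the two highest-order
# bits of the Option Type octet (RFC 2460, section 4.2).
ACTIONS = {
    0b00: "skip over this option and continue processing the header",
    0b01: "discard the packet",
    0b10: "discard the packet and send an ICMP Parameter Problem, "
          "even if the destination address was multicast",
    0b11: "discard the packet and send an ICMP Parameter Problem "
          "only if the destination address was not multicast",
}


def unrecognized_option_action(option_type: int) -> str:
    """Return what a node does when it does not recognize the option."""
    if not 0 <= option_type <= 0xFF:
        raise ValueError("Option Type is a single octet")
    return ACTIONS[option_type >> 6]


def may_change_en_route(option_type: int) -> bool:
    """Third-highest-order bit: can the option data change en route?"""
    return bool(option_type & 0x20)


# Illustrative value whose two highest-order bits are 00.
action = unrecognized_option_action(0x1E)
```

An option that is to be ignored by nodes that do not implement it therefore needs an Option Type whose two highest-order bits are 00.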
5 Security Considerations
No security considerations are seen.
6 IANA Considerations
There are no IANA considerations.
7 References
7.1 Normative References
[RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, September
1981.
[RFC0793] Postel, J., "Transmission Control Protocol", STD 7,
RFC 793, September 1981.
[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6
(IPv6) Specification", RFC 2460, December 1998.
[ACKPDM] Ackermann, M., "draft-ackermann-tictoc-pdm-ntp-usage-00",
Internet Draft, September 2013.
[ELKPDM] Elkins, N., "draft-elkins-6man-ipv6-pdm-dest-option-02",
Internet Draft, September 2013.
[ELKRSP] Elkins, N., "draft-elkins-v6ops-ipv6-end-to-end-rt-needed-
01", Internet Draft, September 2013.
[ELKUSE] Elkins, N., "draft-elkins-v6ops-ipv6-pdm-recommended-usage-
01", Internet Draft, September 2013.
[ELKIPPM] Elkins, N., "draft-elkins-ippm-pdm-metrics-00", Internet
Draft, September 2013.
8 Acknowledgments
The authors would like to thank Rick Troth and Fred Baker
for their comments.
Authors' Addresses
Nalini Elkins
Inside Products, Inc.
36A Upper Circle
Carmel Valley, CA 93924
United States
Phone: +1 831 659 8360
Email: nalini.elkins@insidethestack.com
http://www.insidethestack.com
Michael S. Ackermann
Blue Cross Blue Shield of Michigan
P.O. Box 2888
Detroit, Michigan 48231
United States
Phone: +1 310 460 4080
Email: mackermann@bcbsmi.com
http://www.bcbsmi.com
Keven Haining
US Bank
16900 W Capitol Drive
Brookfield, WI 53005
United States
Phone: +1 262 790 3551
Email: keven.haining@usbank.com
http://www.usbank.com
Sigfrido Perdomo
Depository Trust and Clearing Corporation
55 Water Street
New York, NY 10055
United States
Phone: +1 917 842 7375
Email: s.perdomo@dtcc.com
http://www.dtcc.com
William Jouris
Inside Products, Inc.
36A Upper Circle
Carmel Valley, CA 93924
United States
Phone: +1 925 855 9512
Email: bill.jouris@insidethestack.com