Network Working Group T. Ionta
Internet-Draft A. Cappadona
Intended status: Standard Telecom Italia
Expires: Nov 15, 2016 May 16, 2016
A Performance Layer Inboard Computing method to support active, passive
and hybrid measurements
draft-ionta-plic-01.txt
Abstract
This document describes an innovative framework, called PLIC
("Performance Layer Inboard Computing"), able to support active,
passive or hybrid performance measurements on any kind of unicast and
multicast service passing through a network. It is based on an
algorithm which, through a distributed computation, floods the
performance measurements of each node to the rest of the network,
without the need for any external topology database or operation
support system. It has been invented and engineered in the Telecom
Italia Core Network.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on May 1, 2016.
Copyright Notice
Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document.
Ionta, et al. Expires May 16, 2016 [Page 1]
Internet-Draft Performance Layer Inboard Computing method October 2015
Table of Contents
1. Introduction
2. General Overview of the method
3. Detailed description of the method
3.1 Discovery process (Process 1)
3.2 PLIC-Neighborship building (Process 2)
3.3 PLIC Token getting (Process 3)
3.4 Local Performance parameters calculation (Process 4)
3.5 PLIC Token building (Process 5)
3.5.1 General info put in the PLIC Token
3.5.2 PLIC Token detailed structure
3.6 Processes sequencing during each single timeslot
4. Robustness of the method
5. Implementation and deployment (use case)
6. Security Considerations
7. IANA Considerations
8. Acknowledgments
9. References
9.1 Normative References
9.2 Informative References
Authors' Addresses
1. Introduction
This document provides a framework for performing measurements on data
flows transmitted in a communication network, which may be
implemented in a distributed way by the nodes themselves, without
requiring the intervention of any external management server
collecting measurements from the nodes. It is also capable of
automatically adapting, without needing to retrieve the topology from
any external topology database, to any possible changes of the
network topology and of the paths followed by the data flow across
the communication network, thus easing the operations of performance
monitoring and failure detection and management.
This framework is applicable to active, passive or hybrid performance
measurements, since it can support both active and passive probing, or
a mix of them.
It is also capable of supporting performance measurements on both
multicast and unicast services.
It is applicable everywhere across the network (Core, Edge, Access),
giving the possibility to obtain the results of the measurements
either from an "end-to-end" point of view or from a "node-by-node"
one.
2. General Overview of the method
The main idea behind this method is to find a way to pass node-by-node
the performance measurements (Packet Loss, Delay, CRC, Bandwidth,
passive/active measurement parameters or whatever else) taken on each
node in a way that lets any other node of the network:
- identify, automatically and dynamically, the list of all the upstream
nodes along the reverse path up to the source node of the service,
thus avoiding the need for an external topology database. The
topology discovery is performed and refreshed frequently (with a
period equal to or shorter than a single timeslot) using the method
described in Section 3: each node discovers just its neighbors and
passes this discovery information to the following node, so that each
node can build, in a distributed way, the topology of the reverse
path up to the source node of the service.
- know what happens on the upstream nodes in terms of performance
measurements.
The data structure devoted to the information passing is defined as
PLIC Token.
It is generated by each node during a specific timeslot and passed (in
a push or pull way, indifferently) to the following node during the
same timeslot, thus allowing all the nodes to update themselves during
the same timeslot.
The PLIC Token flows through the network using a multicast flow
(belonging either to a probe or to real traffic). The method is
applicable, at the same time, to one or multiple multicast trees,
regardless of the location of the sources and the receivers.
Note that the multicast transmission scenario is merely exemplary,
because the method is also applicable to unicast services, for
instance point-to-point transmission (e.g. MPLS-TE tunnels, ATM/FR
tunnels, etc.).
It is also applicable both to flow-based traffic (i.e. active
probing) and to link-based traffic (i.e. passive probing).
A notable consequence of the method is that no external Operation
Support System is needed to manage performance and topology data,
because, according to the method described in this document,
management is distributed on-board.
The communication protocol among nodes can be a new ad-hoc one (as
the one proposed in this document) or an extension of an existing one
(e.g. PIM).
3. Detailed description of the method
To achieve the above goals, and given a generic multicast tree, each
node N performs the same set of five processes, detailed below, during
each single timeslot (whose duration is freely settable by the
operator depending on the architecture of the network under
measurement) and repeats them during each following timeslot.
3.1 Discovery process (Process 1)
This discovery process is a local process (no external off-line
topology database is needed) and it aims to identify, for a
specific multicast flow entering a specific interface, the chain of the
upstream nodes along the reverse path from the current node up to the
source node.
Thus the current node, during the current timeslot "n" and before
performing the remaining processes, performs two actions (platform
dependent):
- Identify the specific interface where the multicast flow is entering
the node;
- Identify, based on the information inside its IGP database, the set
of all the upstream nodes along the reverse path from the current node
up to the source node. This set of nodes is the input for Process 2.
Note 1: since the topology must be refreshed at least once per
timeslot, this way of performing discovery is much lighter than using
an off-line topology database, where the impact of polling the nodes
would be too heavy.
Note 2: with the proposed algorithm, similarly to the OSPF approach,
the network topology information is distributed (on every single node)
instead of centralized (on an off-line topology database).
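The reverse-path identification performed by Process 1 can be
illustrated with a minimal sketch. It is not the deployed
implementation: it assumes that the IGP database has already been
reduced to a hypothetical `rpf_parent` mapping (node -> next upstream
node toward the source). The mapping, the node names and the function
name are assumptions introduced only for this example.

```python
def upstream_chain(rpf_parent, current, source):
    """Return the upstream nodes from `current` (excluded) up to `source`
    (included), closest node first.

    `rpf_parent` is a hypothetical node -> upstream-neighbor mapping that
    stands in for the information held in the IGP database.
    """
    chain = []
    seen = {current}
    node = current
    while node != source:
        parent = rpf_parent.get(node)
        if parent is None or parent in seen:
            # Broken or looping path: keep what has been discovered so far.
            break
        chain.append(parent)
        seen.add(parent)
        node = parent
    return chain


# Hypothetical four-node path A -> B -> C -> D, with A as the source.
rpf_parent = {"D": "C", "C": "B", "B": "A"}
print(upstream_chain(rpf_parent, "D", "A"))
```

The returned list, ordered from the closest upstream node to the
source, is the kind of input that Process 2 expects.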
3.2 PLIC-Neighborship building (Process 2)
Given the set (defined by Process 1) of the upstream nodes along the
reverse path from the current node up to the source node, the current
process, starting from the closest reachable upstream node, tries to
build a "PLIC-Neighborship" with it, based on the steps defined below.
If the PLIC-Neighborship building is unsuccessful with the closest
upstream node, the current process tries to build the
PLIC-Neighborship with the following upstream node, and so on, up
along the path toward the source.
As a first step the current node (N(m)) sends the following
"PLIC_Token_Status_Request message" to its closest upstream node along
the reverse path from the current node up to the source node:
"PLIC_Token_Status_Request message" definition:
- Node ID (of N(m))
- Request flag
The upstream node, regarding its own PLIC Token, can be in one of the
following situations:
1) No PLIC Token Tk(m-1): the upstream node has never generated the
PLIC Token since it has been up and running.
2) PLIC Token Tk(m-1) exists but it is "obsolete", that is, generated
before the previous timeslot (n-2 or older). Thus it is no longer
useful for node N(m).
3) PLIC Token Tk(m-1) has been generated during the previous timeslot
(n-1), thus the desired PLIC Token will likely be generated during the
current timeslot (n).
4) PLIC Token Tk(m-1) has been generated during the current timeslot
(n), thus it is exactly the PLIC Token node N(m) is looking for.
Thus, to let node N(m) distinguish among these situations, it is
enough for node N(m-1), while answering the
"PLIC_Token_Status_Request message" with the "PLIC_Token_Status
message", to include the Time-Stamp referring to when the PLIC Token
has been successfully generated by node N(m-1).
"PLIC_Token_Status message" definition:
- Node ID (of N(m-1))
- Ts(m-1)
If no PLIC_Token_Status message is received by N(m) after timeout,
PLIC-Neighborship between N(m) and N(m-1) will end.
Node N(m), depending on the timestamp contained in the
"PLIC_Token_Status message", will be able to distinguish among the
above possibilities, thus putting the upstream node under examination
in one of the following two possible statuses:
- "PLIC-UNAWARE": if Tk(m-1) is in the state belonging to instance 1
or 2 (as stated above)
- "PLIC-AWARE": if Tk(m-1) is in the state belonging to instance 3
or 4 (as stated above).
In case the upstream node under examination is put in "PLIC-AWARE"
status, it is also defined as "PLIC-AWARE PREDECESSOR node" (N(m-1))
of N(m) and a PLIC-Neighborship is built between N(m) and N(m-1).
Instead, if the upstream node under examination is put in
"PLIC-UNAWARE" status, the next upstream node will go under
examination: N(m) tries to build a PLIC-Neighborship with it by
sending a new "PLIC_Token_Status_Request message" and analyzing the
corresponding received "PLIC_Token_Status message", as stated above.
This process will continue iteratively until one of the following
conditions occurs:
- One PLIC-AWARE node is found along the reverse path from the current
node up to the source node. In this case Process 2 has succeeded.
- No PLIC-AWARE node is found along the reverse path from the
current node up to the source node. In this case Process 2 has not
succeeded and N(m) is named "Source Node" of the PLIC tree. Note that,
following the above algorithm, the "Source Node" of the PLIC tree
could differ from the multicast source node. In other words, the
"Source Node" of the PLIC tree is the PLIC-AWARE node closest to the
multicast source node.
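The neighborship search of Process 2 can be sketched as follows. This
is only an illustration under stated assumptions: node clocks are
assumed to be comparable, timeslot boundaries are assumed to be aligned
on multiples of the timeslot length, and an upstream node is treated as
PLIC-AWARE when its PLIC Token was generated during the current or the
previous timeslot. The `status_reply` callable (standing in for the
"PLIC_Token_Status_Request" / "PLIC_Token_Status" exchange) and all
function names are hypothetical.

```python
from enum import Enum


class Status(Enum):
    PLIC_AWARE = "PLIC-AWARE"
    PLIC_UNAWARE = "PLIC-UNAWARE"


def timeslot_index(timestamp, slot_len):
    """Index of the timeslot containing `timestamp` (assumed alignment)."""
    return int(timestamp // slot_len)


def classify(ts_upstream, now, slot_len):
    """Classify an upstream node from the Ts(m-1) it reported.

    `ts_upstream` is None when the node never generated a token or did
    not answer before the timeout.
    """
    if ts_upstream is None:
        return Status.PLIC_UNAWARE
    n = timeslot_index(now, slot_len)
    if timeslot_index(ts_upstream, slot_len) >= n - 1:
        # Token generated during timeslot n or n-1: usable predecessor.
        return Status.PLIC_AWARE
    # Token generated during timeslot n-2 or older: obsolete.
    return Status.PLIC_UNAWARE


def find_predecessor(chain, status_reply, now, slot_len):
    """Walk the upstream chain (closest node first) and return the first
    PLIC-AWARE node, or None when no such node exists."""
    for node in chain:
        if classify(status_reply(node), now, slot_len) is Status.PLIC_AWARE:
            return node
    return None


# Hypothetical replies with a 300-second timeslot; "now" is in timeslot 2.
replies = {"C": None, "B": 100.0, "A": 620.0}
print(find_predecessor(["C", "B", "A"], replies.get, now=650.0, slot_len=300.0))
```

When `find_predecessor` returns None, the current node has no
PLIC-AWARE predecessor and acts as the "Source Node" of the PLIC tree.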
3.3 PLIC Token getting (Process 3)
As soon as Process 2 succeeds, thus designating the
PLIC-AWARE PREDECESSOR node (called N(m-1)) and building a
PLIC-Neighborship with it, the current node N(m) asks it, for the
first time, for the PLIC Token Tk(m-1) by sending the
"Get_PLIC_Token message" with the Request Flag set.
"Get_PLIC_Token message"
Node ID (referred to N(m))
Request Flag
Instead, if no PLIC-AWARE node is found along the reverse
path from the current node up to the source node, the current process
is skipped by node N(m).
After sending this first request, a "request time counter" is
switched on by N(m).
Now one of the following occurrences takes place:
1) Node N(m-1) doesn't answer immediately.
2) Node N(m-1) answers immediately, sending its own PLIC Token Tk(m-1)
(by using the "Send_PLIC Token message", composed of the fields
detailed in Process 5). In this situation there are two possible
occurrences, which node N(m) distinguishes based on the Ts(m-1)
contained in the "Send_PLIC Token message" from N(m-1):
2.a) PLIC Token Tk(m-1) has been generated during the previous
timeslot (n-1), thus the desired PLIC Token will likely be generated
during the current timeslot (cf. instance 3 of Process 2).
2.b) PLIC Token Tk(m-1) has been generated during the current
timeslot (n), thus it is exactly the PLIC Token node N(m) is looking
for (cf. instance 4 of Process 2).
In occurrence 2.b the current process succeeds and thus Process 5 can
immediately start.
Instead, in occurrence 1 or 2.a, node N(m), after a predefined
incremental quantum of time, will repeat the request of Process 3
again and again until one of the following two occurrences takes
place:
- occurrence 2.b occurs, thus Process 5 can start, using PLIC Token
Tk(m-1) to build Tk(m).
- the "request time counter" exceeds a predefined timeout, thus
Process 5 can start, but without using PLIC Token Tk(m-1) to build
Tk(m).
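The request loop of Process 3 can be sketched as follows. The sketch
rests on assumptions not stated in this document: the `request`
callable is a hypothetical stand-in for the "Get_PLIC_Token" exchange
and returns either None (no answer) or a pair (timeslot of generation,
token), and the "incremental quantum" is interpreted as a wait that
grows by one quantum at every retry.

```python
import time


def get_plic_token(request, current_slot, timeout, quantum,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll the PLIC-AWARE predecessor until it returns a token generated
    during `current_slot`, or until `timeout` seconds have elapsed.

    Returns the token, or None when the request time counter expires.
    """
    start = clock()          # the "request time counter" is switched on
    wait = quantum
    while True:
        reply = request()
        if reply is not None:
            generated_slot, token = reply
            if generated_slot == current_slot:
                return token             # occurrence 2.b
        # Occurrence 1 (no answer) or 2.a (token from the previous slot).
        if clock() - start >= timeout:
            return None                  # build Tk(m) without Tk(m-1)
        sleep(wait)
        wait += quantum                  # assumed meaning of "incremental"


# Hypothetical predecessor that answers at once with a current-slot token.
print(get_plic_token(lambda: (5, {"hostname": "N(m-1)"}),
                     current_slot=5, timeout=2.0, quantum=0.1))
```

The `sleep` and `clock` parameters exist only so that the loop can be
exercised without real waiting.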
3.4 Local Performance parameters calculation (Process 4)
This process is a local process and it is devoted to calculating, on
node N(m), the data of interest (e.g. PLR, CRC, Delay, etc.), based
on measurements performed during the current or previous timeslot
(depending on the kind of parameter measured). These data will be
put in PLIC Token Tk(m) during Process 5. Each performance parameter
is calculated depending on the chosen method.
3.5 PLIC Token building (Process 5)
This process is a local process and it is devoted to generating a new
PLIC Token Tk(m), available for node N(m+1).
This process fills the PLIC Token Tk(m) with the general information
and the corresponding detailed fields described in the following
paragraphs.
3.5.1 General info put in the PLIC Token
The following general info is put in the PLIC Token:
- Info about local node N(m) (e.g. hostname of N(m))
- Locally performed measurements (e.g. number of packets, CRC, etc.)
calculated by Process 4
- If, and only if, Process 3 succeeded (PLIC Token Tk(m-1) has been
obtained):
  - Data contained in Tk(m-1) (e.g. list of hostnames of the PLIC-AWARE
  nodes along the upstream reverse path from the local node N(m) up to
  the source)
  - Results of calculations performed by merging the data coming from
  Tk(m-1) and the locally performed measurements calculated by
  Process 4 (e.g. Packet Loss calculated as the difference between the
  number of packets measured in N(m-1), and thus contained in Tk(m-1),
  and the number of packets measured locally during Process 4).
As detailed in the above PLIC-Neighborship building process, please
remember that PLIC Token Tk(m-1) has not been obtained by N(m) if one
of the following cases occurs:
- case 1: no PLIC-AWARE node has been found during the
PLIC-Neighborship building process
- case 2: the PLIC-AWARE PREDECESSOR node (called N(m-1)) has not been
able to generate PLIC Token Tk(m-1), or N(m) has not been able to get
it, within the timeout.
In this situation no info about the upstream path can be put in PLIC
Token Tk(m).
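The Packet Loss example mentioned above (difference between the packet
count carried in Tk(m-1) and the locally measured packet count) can be
written out as a small sketch. The function name and the sample
counters are hypothetical, and the counters are assumed to refer to the
same flow and the same measurement interval.

```python
from typing import Optional, Tuple


def loss_between(tx_packets: Optional[int],
                 rx_packets: int) -> Optional[Tuple[int, float]]:
    """Packets lost between an upstream counter and the local counter.

    Returns (lost_packets, loss_ratio), or None when no upstream counter
    is available because Tk(m-1) was not obtained.
    """
    if tx_packets is None:
        return None
    lost = tx_packets - rx_packets
    ratio = lost / tx_packets if tx_packets > 0 else 0.0
    return lost, ratio


# Hypothetical counters for one timeslot.
source_packets = 1000        # counted at the PLIC Source Node
predecessor_packets = 990    # counted at N(m-1), carried in Tk(m-1)
local_packets = 985          # counted at N(m) by Process 4

node_level = loss_between(predecessor_packets, local_packets)
cumulative = loss_between(source_packets, local_packets)
print(node_level, cumulative)
```

The first result corresponds to a node-level measure (between N(m) and
its predecessor), the second to a cumulative measure (between N(m) and
the source).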
3.5.2 PLIC Token detailed structure
The fields of the PLIC Token are:
1. Hostname ID of N(m)
2. Interface ID where the PLIC multicast flow comes in
3. Time-Stamp referring to the successful creation of the PLIC Token
by N(m) during the current timeslot.
4. Flow ID 1: Main PLIC multicast flow identifier, e.g. the Multicast
Group under monitoring. It can belong to a customer or to a monitoring
probe. There is a different PLIC multicast flow for each Multicast
group under monitoring by PLIC.
5. (optional) Flow ID 2: Detailed multicast flow identifier, used in
case Flow ID 1 is not enough to identify a specific multicast flow
(e.g. different PLIC flows using the same Multicast group need
different DSCP values to be differentiated).
6. Number of Performance parameters, measured by Node N(m), contained
in the following part of the current PLIC Token Tk(m) (TLV approach)
7. Performance parameter 1 (e.g. PLR): node-level performance measure
(i.e. between the current node and its predecessor)
8. Performance parameter 1 (e.g. PLR): cumulative performance measure
(i.e. between the current node and the source)
9. Performance parameter 1 (e.g. PLR): additional info containing the
node-level measurements of the basic elements composing the
Performance parameter measurements (e.g. number of incoming packets
measured on the current Interface). These values will be useful for
the successor node to calculate its own node-level performance
measure.
10. (optional) analogue of 7 for Performance parameter 2 (e.g. Delay)
11. (optional) analogue of 8 for Performance parameter 2
12. (optional) analogue of 9 for Performance parameter 2
13. (optional) analogue of 7 for Performance parameter "n" (e.g. CRC)
14. (optional) analogue of 8 for Performance parameter "n"
15. (optional) analogue of 9 for Performance parameter "n"
16. (if and only if Process 3 succeeded) Number of records, each
containing the set of Performance parameters (analogue of the set
created by N(m) and put in the above fields 7-15) measured by the
upstream PLIC-AWARE nodes and all contained inside the PLIC Token
Tk(m-1) obtained by N(m) during Process 3
17. to n.: analogue of fields 6-16 for each record referring to each
of the upstream PLIC-AWARE nodes.
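The field list above can be pictured as a nested data structure. The
following sketch is only one possible in-memory representation, not a
wire format: all class and attribute names are hypothetical, the
parameter count (field 6) and the record count (field 16) are implied
by the lengths of the containers, and upstream records are assumed to
be ordered from the closest predecessor to the source.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ParameterValues:
    node_level: Optional[float]   # field 7: vs. the predecessor node
    cumulative: Optional[float]   # field 8: vs. the source node
    raw: float                    # field 9: basic element, e.g. packet count


@dataclass
class NodeRecord:
    hostname: str                         # field 1
    interface_id: str                     # field 2
    timestamp: float                      # field 3
    flow_id: str                          # field 4
    flow_detail: Optional[str] = None     # field 5 (optional)
    parameters: Dict[str, ParameterValues] = field(default_factory=dict)


@dataclass
class PlicToken:
    local: NodeRecord
    upstream: List[NodeRecord] = field(default_factory=list)


def build_token(local: NodeRecord,
                upstream_token: Optional[PlicToken] = None) -> PlicToken:
    """Build Tk(m) from the local record and, when available, Tk(m-1)."""
    if upstream_token is None:
        return PlicToken(local=local)
    records = [upstream_token.local, *upstream_token.upstream]
    return PlicToken(local=local, upstream=records)


def hostnames(token: PlicToken) -> List[str]:
    """Hostnames from the local node up to the PLIC Source Node."""
    return [token.local.hostname] + [r.hostname for r in token.upstream]


# Hypothetical three-node chain A -> B -> C on multicast group 239.1.1.1.
rec = lambda name: NodeRecord(name, "ge-0/0/0", 0.0, "239.1.1.1")
tk_a = build_token(rec("A"))
tk_b = build_token(rec("B"), tk_a)
tk_c = build_token(rec("C"), tk_b)
print(hostnames(tk_c))
```

Each node prepends its own record to what it received, so the token
grows by one record per PLIC-AWARE hop.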
3.6 Processes sequencing during each single timeslot
The following time diagram details one single timeslot:

Process | Process description     | Timeslot "N"
ID      |                         |                | PLIC Token
        |                         |                | propagation time
1       | Discovery               | XXX            |
2       | PLIC-Neighborship build |    XXX         |
3       | Get_PLIC_Token          |        XXX     |
4       | Local Performance       | XXX XXX XXX    |
        | parameters calcul.      |                |
5       | PLIC Token building     |            XXX |

Process 2 can start only after the end of Process 1 because it needs
to know, as input, the set of all the upstream nodes along the reverse
path from the current node up to the source node.
Process 3 can start only after the end of Process 2 because the PLIC
Token must be taken only from the upstream PLIC-AWARE node.
Process 4 has no dependencies on Processes 1, 2 and 3 (thus it can
start asynchronously with respect to them) because the local
performance parameters calculation, depending only on performance
parameters sampled on the local node, can be performed without the
PLIC Token from the upstream PLIC-AWARE node.
Process 5 can start only after the end of both Processes 3 and 4
because it depends on the output of both of them.
Process 5 must finish before the PLIC Token propagation time interval,
so as to leave enough time for the PLIC Token to propagate downstream.
During the PLIC Token propagation time interval in Timeslot N, the
successor node (N(m+1)) must get Tk(m,n) and create its own Tk(m+1,n)
(following the same sequence of processes as N(m)); during the same
interval, the next successor node (N(m+2)) must in turn get Tk(m+1,n)
and create its own Tk(m+2,n), and so on.
The following is a zoom on the PLIC Token propagation time interval
during a generic Timeslot "N":

Node m+1 |Get Tk(m,n)|
         |           |Create Tk(m+1,n)|
Node m+2 |           |                |Get Tk(m+1,n)|
         |           |                |             |Create Tk(m+2,n)|
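
The dependencies among the five processes within one timeslot can be
sketched with two concurrent tasks: the chain 1 -> 2 -> 3 and the
independent Process 4, both of which must complete before Process 5.
This is only an illustration using Python threads; the five callables
passed to `run_timeslot` are hypothetical placeholders for the
processes described in Section 3.

```python
from concurrent.futures import ThreadPoolExecutor


def run_timeslot(discover, build_neighborship, get_token, measure, build):
    """Run Processes 1-5 once, honoring their dependencies."""
    def upstream_side():
        chain = discover()                        # Process 1
        predecessor = build_neighborship(chain)   # Process 2
        if predecessor is None:
            return None                           # Process 3 is skipped
        return get_token(predecessor)             # Process 3

    with ThreadPoolExecutor(max_workers=2) as pool:
        local_future = pool.submit(measure)           # Process 4, independent
        token_future = pool.submit(upstream_side)     # Processes 1, 2, 3
        # Process 5 needs the output of both Process 3 and Process 4.
        return build(token_future.result(), local_future.result())


# Hypothetical placeholders returning fixed values.
result = run_timeslot(
    discover=lambda: ["B", "A"],
    build_neighborship=lambda chain: chain[0],
    get_token=lambda node: {"from": node},
    measure=lambda: {"packets": 985},
    build=lambda upstream, local: {"upstream": upstream, "local": local},
)
print(result)
```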
4. Robustness of the method
Based on the above statements, the proposed method is robust
against:
- any loss of measurement samples: even if the PLIC Token is not
generated by the upstream node N(m-1), node N(m) still generates
its own PLIC Token, which is available to downstream nodes.
- propagation drift of the PLIC Token during a single timeslot.
- network synchronization problems.
5. Implementation and deployment (use case)
The methodology has been implemented and delivered in Telecom Italia
by leveraging functions and tools available on IP routers and it is
currently being used to support monitoring of packet loss, CRC and
Bandwidth in some portions of Telecom Italia's network. The timeslot
length is 5 minutes.
In particular the packet loss measurement is performed based on the
method described in [I-D.draft-tempia-ippm-p3m-01.txt] and the
results are put inside the PLIC Token structure and delivered to the
following nodes.
6. Security Considerations
Implementation of this method must be mindful of security
and privacy concerns.
There are two types of security concerns: potential harm caused by
the measurements and potential harm to the measurements. Regarding
the first point, the measurements described in this document
are passive, so there are no packets injected into the network
causing potential harm to the network itself and to data traffic.
Nevertheless, the method implies modifications on the fly to the IP
header of data packets: this must be performed in a way that doesn't
alter the quality of service experienced by packets subject to
measurements and that preserves the stability and performance of
routers doing the measurements. The measurements themselves could be
harmed by routers altering the coloring of the packets, or by an
attacker injecting artificial traffic. Authentication techniques,
such as digital signatures, may be used where appropriate to guard
against injected traffic attacks.
The privacy concerns of network measurement are limited because the
method only relies on information contained in the IP header without
any release of user data.
7. IANA Considerations
There are no IANA actions required.
8. Acknowledgments
The authors would like to thank their workmates F. Moretti, A. Soldati
and A. Barbetti for their helpful collaboration.
Thanks also to F. Benedetti, G. Giammona, G. Mazzola, S. Salamone
and L. Tomasino for their support while implementing and deploying this
method, according to the rules stated in the job agreement with
Telecom Italia.
Special thanks to Daniela for her help while inspiring and
translating this memo.
9. References
9.1 Normative References
[RFC5357] Hedayat, K., Krzanowski, R., Morton, A., Yum, K., and J.
Babiarz, "A Two-Way Active Measurement Protocol (TWAMP)",
RFC 5357, October 2008.
[RFC5938] Morton, A. and M. Chiba, "Individual Session Control
Feature for the Two-Way Active Measurement Protocol
(TWAMP)", RFC 5938, August 2010.
9.2 Informative References
[I-D.tempia-opsawg-p3m] Capello, A., Cociglio, M., Castaldelli, L.,
and A. Bonda, "A packet based method for passive performance
monitoring", draft-tempia-opsawg-p3m-04 (work in
progress), February 2014.
[I-D.draft-tempia-ippm-p3m-01.txt] Capello, A., Cociglio, M.,
Castaldelli, L., Fioccola, G., and A. Bonda, "A packet based method
for passive performance monitoring", draft-tempia-ippm-p3m-01 (work
in progress), September 2015.
Authors' Addresses
Tiziano Ionta (editor)
Telecom Italia Labs
Via Valcannuta 250
00167 Rome
Italy
Phone: +39 06 3688 5600
Email: tiziano.ionta@telecomitalia.it
Antonio Cappadona (editor)
Telecom Italia Labs
Via Valcannuta 250
00167 Rome
Italy
Phone: +39 06 3688 7181
Email: antonio.cappadona@telecomitalia.it