Interdomain Working Group S. Litkowski
Internet-Draft Orange Business Service
Intended status: Standards Track K. Patel
Expires: September 24, 2015 Cisco Systems
J. Haas
Juniper Networks
March 23, 2015
Timestamp support for BGP paths
draft-litkowski-idr-bgp-timestamp-02
Abstract
BGP is increasingly used to transport routing information for
critical services. Some BGP updates need to be received as fast as
possible: for example, in a Layer 3 VPN scenario where a
dual-attached site loses its primary connection, the BGP withdraw
message should be propagated as fast as possible to restore the
service. The same urgency applies to other address families, such as
multicast VPNs, where "join" messages should also be propagated very
quickly.
Service provider experience shows that BGP path propagation time may
vary depending on network conditions (especially the load of the BGP
Speakers on the path) and that excessive propagation times affect
customer service.
It is important for service providers to keep track of BGP update
propagation time in order to monitor the quality of service delivered
to customers. It is also important to be able to identify the BGP
Speakers that are slowing down propagation.
This document presents a solution to transport timestamps along a BGP
path. The solution is intended to be used with specially identified
beacon prefixes that are single-homed.
Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Litkowski, et al. Expires September 24, 2015 [Page 1]
Internet-Draft bgp-timestamp March 2015
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on September 24, 2015.
Copyright Notice
Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Problem statement
2. Requirements for monitoring BGP path propagation time
2.1. Architecture
2.2. Measurement accuracy
2.2.1. Clock synchronization
2.2.2. Beacon accuracy
2.3. Churn
2.4. Path propagation complexity
3. Proposal
4. BGP timestamp attribute
5. Processing the BGP timestamp attribute
5.1. Inspection list
5.2. Originating a timestamped route in BGP
5.3. Receiving a timestamped route in BGP
5.4. Sending a timestamped route in BGP
5.4.1. Propagating the BGP Timestamp attribute
5.4.2. Setting the send timestamp
5.5. Limiting churn
5.6. Marking stale entries
5.7. Inter-AS considerations
5.7.1. Drop option
5.7.2. Drop AS option
5.7.3. Summary option
5.7.4. Propagate option
5.8. Retrieving timestamp vector
5.9. Handling malformed attribute
5.10. Impact on update packing
6. Compared to BMP
7. Deployment considerations
8. Security considerations
9. Acknowledgements
10. IANA Considerations
11. Normative References
Authors' Addresses
1. Problem statement
   CE3----PE3               PE4 --- CE4 (Source)
             \             /
              RR3       RR4
                 \     /
                   RR5
                 /     \
              RR1       RR2
             / |           \
            /  |            \
   CE1----PE1 PE5           PE2 --- CE2
               |
              CE5
Figure 1
Figure 1 describes a typical hierarchical route reflector (RR) design
in which PEs are meshed to local RRs and local RRs are meshed to more
central RRs. We consider a single multicast VPN between all CEs. CE4
is the source; all others may be receivers. The BGP control plane
also supports other BGP services, such as L3VPN.
We consider an event in the L3VPN service that leaves RR1 temporarily
overloaded (for example, RR1 is processing massive updates due to a
router failure, or is formatting updates for a route refresh). In the
same timeframe, CE1 wants to join the multicast flow from CE4. PE1
propagates the C-multicast route to RR1, but RR1 fails to propagate
the route to RR5 because it is busy processing L3VPN updates. Only
when RR1 finishes the L3VPN job does it send the C-multicast route to
RR5, after which the update is imported by PE4. The long time needed
to join the flow may cause CE1 to miss part of the multicast flow.
All BGP implementations differ in terms of internal processing within
an address family or between address families. The issue described
above is given only as an example, and this document does not presume
that all implementations suffer from this exact issue. Whatever the
implementation, however, there will always be cases where BGP path
propagation is delayed.
Service providers currently lack an efficient solution to keep track
of BGP path propagation time, as well as a solution to identify the
BGP Speakers causing issues.
BMP (BGP Monitoring Protocol) may be a solution but has several
drawbacks (see Section 6).
2. Requirements for monitoring BGP path propagation time
2.1. Architecture
--------- -------
/ \ / \
RTR_SRC1 ----- | AS1 | ----- | AS2 | ---- RTR_DST1
| \ / \ / |
Inject --------- --------- Sink point
point | |
| |
--------- -------
/ \ / \
RTR_DST2 ---- | AS4 | | AS3 | ---- RTR_SRC2_DST2
| \ / \ / |
Sink point --------- --------- Inject/Sink
point
Figure 2
Single AS
-------------------------------------------
/ \
| RR1 ---------- RR2 |
| / \ \ |
| RTR_SRC1 \ RTR_DST2 |
| | \ | |
| Inject RR3 Sink point |
| point | |
| RTR_DST1 |
| | |
\ Sink point /
-------------------------------------------
Figure 3
Figure 2 and Figure 3 describe an inter-AS and a single-AS scenario
in which a service provider wants to monitor BGP path propagation
time from one router to multiple routers. In Figure 2, multiple
probing routers are attached to multiple ASes. In Figure 3, all
probing routers are in the same AS.
The architecture requires some BGP Speakers to originate NLRIs within
the BGP control plane. In the diagrams above, they are identified as
"Inject point". In order to provide information about propagation
delays, the architecture requires the introduction of timestamp
information. The architecture also needs to identify the BGP Speakers
causing high propagation delays. As only specific advertisements
serve for measurement, the architecture requires BGP Speakers to
identify the NLRIs that must be timestamped. The architecture also
requires some BGP Speakers to serve as sink points, where timestamp
vector information can be retrieved. The timestamp vector must
contain propagation time information for all BGP Speakers that
participated in the BGP path. Each BGP Speaker along the path is
therefore required to add timestamp information. There may be
multiple sink points in the network, to perform measurements at
different locations, as well as multiple inject points. An external
tool may be connected to sink points to retrieve the timestamp
information, but this is out of scope of this document.
In the inter-AS case, for security reasons, the architecture MUST
support hiding detailed timestamp information from the other AS.
Example of usage:
An external tool commands RTR_SRC to originate a probing BGP NLRI.
All the BGP Speakers are configured to measure timestamps for this
NLRI. The BGP path propagates across the BGP Speakers. Each BGP
Speaker may provide timestamp information. An external tool
connected to the sink points retrieves the timestamp vector
information for the NLRI.
2.2. Measurement accuracy
2.2.1. Clock synchronization
For the solution to be accurate, the clocks of the BGP Speakers need
to be synchronized. This can be ensured easily within a single AS,
but in an inter-domain scenario it is hard to ensure that all
Speakers are synchronized to a good clock source.
The solution MUST include synchronization information associated with
each timestamp so that timestamps can be compared with one another.
2.2.2. Beacon accuracy
In order to be accurate, an implementation SHOULD:
o ensure that timestamped NLRIs are processed with the same priority
as non-timestamped NLRIs.
o ensure that the processing needed to add timestamp information is
as lightweight as possible. If limitations exist, the vendor
SHOULD document them.
Using a single special prefix advertisement from a single location to
evaluate propagation time will not provide a detailed view of min/max
propagation time values, because the user does not know where the
path for the prefix sits in a processing queue. On a BGP Speaker
handling high churn, the advertisement of the path for the special
prefix may land anywhere in the long processing queue of the churn,
depending on the implementation: it may be first, last, or somewhere
in the middle.
The user is required to perform sampling to establish propagation
time boundaries based on multiple advertisements. Repeated cycles of
advertisement followed by withdrawal may help with this. See
Section 7 for more details.
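As an illustration of the sampling described above, the following
Python sketch summarizes end-to-end propagation times collected over
repeated advertise/withdraw cycles of a beacon prefix. The function
name, the choice of microseconds as the unit, and the sample values
are assumptions made for this example; they are not defined by this
document.

```python
import statistics

def propagation_bounds(samples_us):
    """Summarize propagation-time samples (in microseconds)."""
    if not samples_us:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples_us)
    return {
        "min": ordered[0],                      # best case observed
        "max": ordered[-1],                     # worst case observed
        "median": statistics.median(ordered),   # typical case
    }

# Hypothetical samples from five beacon cycles.
samples = [1200, 950, 4800, 1100, 1020]
print(propagation_bounds(samples))
```

A single sample would only show where the beacon happened to sit in
the processing queue; the spread between "min" and "max" over many
cycles is what gives the boundaries.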
2.3. Churn
The target solution MUST NOT create more churn in the BGP
controlplane.
2.4. Path propagation complexity
When an NLRI is originated in BGP from a point, a BGP path is
created. Nothing ensures that all nodes within the BGP control plane
will receive this BGP path. When a competing path already exists for
the NLRI, the competing path may be preferred by some BGP Speakers,
which hides the new path. Moreover, even if the NLRI is originated
in BGP from a single point, multiple paths may be created within the
BGP control plane; this is inherent to the BGP meshing in place.
As soon as multiple BGP paths are involved, control plane convergence
may take multiple steps to find the final best path. This
convergence may involve multiple BGP path advertisements (replacing
each other) between peers.
The goal of our proposal is not to measure convergence time but to
focus on path propagation time. In a control plane convergence
involving multiple paths for an NLRI, the solution MUST identify the
timestamp of the event where the NLRI was seen for the first time on
a BGP Speaker.
Example:
Single AS
-------------------------------------------
/ RTR_SRC2- 10/8 \
| / |
| RR1 ---------- RR2 |
| / \ \ |
| RTR_SRC1 \ RTR_DST2 |
| | \ |
| 10/8 RR3 |
| | |
| RTR_DST1 |
| |
\ /
-------------------------------------------
Figure 4
In the figure above, consider that the service provider keeps track
of propagation time for real NLRIs (corresponding to customer
routes). All the BGP Speakers in the figure are configured to
inspect the NLRI 10/8, which is multihomed. We consider that the
network is starting and the NLRI has not been propagated yet.
RTR_SRC1 starts to propagate 10/8 within the BGP control plane. All
BGP Speakers consider the path as best, and this path is propagated
within the whole control plane. Each BGP Speaker adds its timestamp
information, and RTR_DST1 and RTR_DST2 are able to record the
timestamp vector. In this case, the timestamp vector is quite
accurate because it represents an end-to-end propagation.
Now RTR_SRC2 starts to propagate its own path. RR2 has two paths for
10/8 and chooses the best one. Let us consider that the RTR_SRC2
path is the best one: the RTR_SRC2 path is therefore propagated and
the timestamp vector is updated. RR1 also has two paths, and we
consider that RR1 prefers the RTR_SRC1 path, so the RTR_SRC2 path is
not propagated by RR1. In this situation, RTR_DST2 receives the path
from RR2 with accurate timestamps (end-to-end propagation), but
RTR_DST1 never receives it.
We could also consider a stable network situation, where both paths
have been advertised for a long time. A network event may occur
(e.g., an IGP metric change) that causes a BGP Speaker within a path
vector to change its best path. In Figure 4, an IGP event may cause
RR1 to change its decision and prefer the path originated by RTR_SRC2
as best; the path is then propagated with previously received
timestamp information that is no longer accurate. RTR_DST1 receives
a BGP timestamp vector containing stale (old) timestamp information
as well as new information.
3. Proposal
Our proposal is based on tagging an NLRI with timestamp values along
its BGP path propagation. Each BGP Speaker along the path adds
timestamp values, thus creating a timestamp vector. An ordered list
of timestamps is therefore built along the path.
       BGP Update       BGP Update       BGP Update       BGP Update
       10.0.0.0/8       10.0.0.0/8       10.0.0.0/8       10.0.0.0/8
       Timestamp:       Timestamp:       Timestamp:       Timestamp:
         R1:T1            R1:T1            R1:T1            R1:T1
                          R2:T2            R2:T2            R2:T2
                                           R3:T3            R3:T3
                                                            R4:T4
   R1 ------------> R2 ------------> R3 ------------> R4 ------------> R5
Using this mechanism, we can easily identify whether a hop within a
path is slowing down the propagation.
We propose to use a new BGP attribute, the BGP timestamp attribute,
to encode timestamp information.
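The following Python sketch illustrates how a tool reading a
timestamp vector at a sink point might locate a slow hop. It is only
an illustration: the Entry type, its field names, and the use of a
single microsecond counter per timestamp are assumptions of this
example. It computes, for each entry, the time the path spent inside
the speaker (send minus receive), which relies only on that speaker's
own clock.

```python
from typing import NamedTuple

class Entry(NamedTuple):
    router: str
    recv_us: int  # receive timestamp, microseconds; 0 = unavailable
    send_us: int  # send timestamp, microseconds; 0 = unavailable

def dwell_times(vector):
    """Return (router, send - receive) for entries with both timestamps."""
    return [(e.router, e.send_us - e.recv_us)
            for e in vector if e.recv_us and e.send_us]

def slowest_hop(vector):
    """Return the (router, dwell time) pair with the largest dwell time."""
    times = dwell_times(vector)
    return max(times, key=lambda item: item[1]) if times else None

# Hypothetical vector as seen by a sink point.
vector = [
    Entry("R1", 1_000_000, 1_000_200),
    Entry("R2", 1_000_900, 1_450_900),
    Entry("R3", 1_451_500, 1_451_800),
]
print(slowest_hop(vector))
```

Comparing the send timestamp of one entry with the receive timestamp
of the next would additionally require both speakers to be
synchronized (see Section 2.2.1).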
4. BGP timestamp attribute
The BGP timestamp (BGP-TS) Attribute is an optional transitive BGP
Path Attribute. The attribute type code is TBD.
The value field of the BGP timestamp attribute is defined as an
ordered list of timestamp entries, the first entry being the first
timestamp entry added (origin):
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Timestamp #1 (variable) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Timestamp #2 (variable) |
...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Timestamp #n (variable) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The timestamp entries are encoded as follows:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Receive Timestamp #x |
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Send Timestamp #x |
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ASN |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|T| Rsvd | SyncType | EntryType | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| |
| Optional variable field |
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
o Receive timestamp: the time at which the BGP path was received.
When originating a path in BGP, the timestamp is the originating
time. Expressed in seconds and microseconds since midnight (zero
hour), January 1, 1970 (UTC). If zero, the time is unavailable.
Precision of the timestamp is implementation-dependent.
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Timestamp (seconds) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Timestamp (microseconds) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
o Send timestamp : the time at which the BGP path was exported to
the peer. Expressed in seconds and microseconds since midnight
(zero hour), January 1, 1970 (UTC). If zero, the time is
unavailable. Precision of the timestamp is implementation-
dependent.
o ASN : AS Number of the local node creating the timestamp entry.
o Flags :
* T : Synchronized, if set, the BGP speaker clock is synchronized
to an external system.
o SyncType : defines the stratum as defined in [RFC5905].
o EntryType : defines the type of Timestamp entry, the following
types are defined :
* Type 0 : empty. There is no following variable field. This
type is to be used in case of timestamp summarization.
* Type 1 : IPv4 address, the following variable field will be 4
bytes long and will contain the IPv4 router ID of the local
node.
* Type 2 : IPv6 address, the following variable field will be 16
bytes long and will contain the IPv6 router ID of the local
node.
* Type 3 : Stale Indicator, Stale indicates that previous
timestamp entries are old. There is no following variable
field. The receive timestamp and send timestamp should be set
to zero. The ASN is set to the ASN of the local BGP Speaker.
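As an illustration of the encoding above, the following Python sketch
packs and unpacks a single timestamp entry. It reflects one reading
of the diagrams and is not normative: it assumes network byte order,
a 4-octet ASN, a flags octet in which T is the most significant bit,
and one octet each for SyncType and EntryType. The function and
field names are invented for this example.

```python
import ipaddress
import struct

# Fixed part: receive sec/usec, send sec/usec, ASN, flags, SyncType,
# EntryType (assumed widths: 4+4+4+4+4+1+1+1 = 23 octets).
_FIXED = struct.Struct("!IIIIIBBB")
_VAR_LEN = {0: 0, 1: 4, 2: 16, 3: 0}  # EntryType -> variable field length
T_FLAG = 0x80  # assumption: T is the most significant bit of the flags octet

def encode_entry(recv, send, asn, synced, sync_type, entry_type,
                 router_id=None):
    """recv and send are (seconds, microseconds) tuples."""
    flags = T_FLAG if synced else 0
    fixed = _FIXED.pack(recv[0], recv[1], send[0], send[1],
                        asn, flags, sync_type, entry_type)
    var = b""
    if _VAR_LEN[entry_type]:
        var = ipaddress.ip_address(router_id).packed
        if len(var) != _VAR_LEN[entry_type]:
            raise ValueError("router ID does not match the entry type")
    return fixed + var

def decode_entry(data, offset=0):
    """Return (entry dict, offset of the next entry)."""
    rs, ru, ss, su, asn, flags, sync_type, entry_type = \
        _FIXED.unpack_from(data, offset)
    var_len = _VAR_LEN.get(entry_type)
    if var_len is None:
        raise ValueError("unknown entry type")
    start = offset + _FIXED.size
    var = data[start:start + var_len]
    if len(var) != var_len:
        raise ValueError("truncated timestamp entry")
    entry = {
        "recv": (rs, ru), "send": (ss, su), "asn": asn,
        "synced": bool(flags & T_FLAG), "sync_type": sync_type,
        "entry_type": entry_type,
        "router_id": str(ipaddress.ip_address(var)) if var else None,
    }
    return entry, start + var_len

raw = encode_entry((1427100000, 250), (1427100000, 900), 64500,
                   True, 2, 1, "192.0.2.1")
print(decode_entry(raw))
```

Because each entry carries its own type, a decoder can walk the whole
attribute value by calling decode_entry repeatedly with the returned
offset until the end of the value is reached.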
5. Processing the BGP timestamp attribute
5.1. Inspection list
A BGP Speaker supporting BGP-TS can decide to timestamp only some
specific NLRIs. An inspection list (filter) may be configured by the
user to apply timestamping to a specific set of BGP NLRIs. By
default, we suggest that a BGP Speaker supporting BGP-TS SHOULD NOT
timestamp any BGP NLRIs.
Users of our proposal must be aware that using a complex policy to
express the inspection list may result in more processing, which will
influence the end-to-end propagation time. The inspection list
policy should be kept as simple as possible.
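A very simple inspection list could be sketched as follows in Python.
This is only an illustration under the assumption that the list holds
exact IP prefixes; the class and method names are invented for this
example, and a real implementation would use its own policy language.

```python
import ipaddress

class InspectionList:
    """Exact-match prefix filter; empty by default (nothing timestamped)."""

    def __init__(self, prefixes=()):
        self._prefixes = {ipaddress.ip_network(p) for p in prefixes}

    def matches(self, nlri):
        # A set lookup keeps the per-update cost small and constant.
        return ipaddress.ip_network(nlri) in self._prefixes

beacons = InspectionList(["10.0.0.0/8"])
print(beacons.matches("10.0.0.0/8"), beacons.matches("10.1.0.0/16"))
```

Keeping the match to a single lookup is in line with the advice above
that the inspection list should add as little processing as possible.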
5.2. Originating a timestamped route in BGP
When a BGP Speaker supporting BGP-TS originates a new path in BGP
that matches the inspection list, it MUST add the BGP-TS attribute to
the BGP path and MUST set the receive timestamp field to the time the
path was originated in BGP. At this stage of processing, the send
timestamp is set to zero. If the BGP Speaker is synchronized to an
external system when originating the route, the T-bit MUST be set in
the attribute and the SyncType MUST be set to the current stratum.
As mentioned above, the BGP path of the originated route has a send
timestamp value of zero in the BGP LOC-RIB.
5.3. Receiving a timestamped route in BGP
When a BGP Speaker supporting BGP-TS receives a BGP path that matches
the inspection list, the implementation MUST record the current time
associated with the received path.
The time recording MUST happen before the inbound routing policies
are applied.
Inspection
List
+------------+ +---+ No match +------------+
--> | Adj-RIB-in | --> | I | -------------> | Rtg pol in |
| Peer#1 | | n | | Peers#1 | ----->
+------------+ | s | +-------+ | |
| p | --> | AddTS |->| |
| e | +-------+ +------------+
| c | If match
| t |
| |
| l |
+------------+ | i | No match +------------+
--> | Adj-RIB-in | --> | s | -------------> | Rtg pol in |
| Peer#2 | | t | | Peers#2 | ----->
+------------+ | | +-------+ | |
| | --> | AddTS |->| |
| | +-------+ +------------+
| | If match
+---+
If the path matches the inspection list and does not contain a BGP-TS
attribute, the BGP Speaker MUST add a BGP-TS attribute with a
timestamp entry:
o The receive timestamp MUST be set to the recorded time for this
BGP path.
o If the BGP Speaker is synchronized to an external system when
receiving the route, the T-bit MUST be set in the attribute and
the SyncType MUST be set to the current stratum.
o The send timestamp MUST be set to zero.
If the path matches the inspection list and contains a BGP-TS
attribute, the BGP Speaker MUST append a new timestamp entry to the
existing attribute:
o The receive timestamp MUST be set to the recorded time for this
BGP path.
o If the BGP Speaker is synchronized to an external system when
receiving the route, the T-bit MUST be set in the attribute and
the SyncType MUST be set to the current stratum.
o The send timestamp MUST be set to zero.
The process of adding a timestamp entry or a BGP-TS attribute SHOULD
be as light as possible in order to influence the propagation time as
little as possible.
When a BGP Speaker supporting BGP-TS receives a BGP path that does
not match the inspection list and contains a BGP-TS attribute, it
MUST NOT change the existing attribute.
When a BGP Speaker not supporting BGP-TS receives a BGP path that
contains a BGP-TS attribute, it MUST follow the standard BGP
procedures described in [RFC4271].
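The receive procedure of this section can be sketched as follows in
Python. The sketch is illustrative only: paths are modeled as plain
dictionaries, the "bgp_ts" key and the entry field names are
assumptions of this example, and setting SyncType to zero when the
speaker is not synchronized is a choice made here rather than a rule
of this document.

```python
import time

def on_receive(path, matches_inspection_list, local_asn, router_id,
               synced=False, stratum=0, now_us=None):
    """Apply the receive procedure to a path modeled as a dict."""
    # Record the time first, before any inbound routing policy runs.
    recv_us = time.time_ns() // 1000 if now_us is None else now_us
    if not matches_inspection_list:
        return path  # an existing BGP-TS attribute is left unchanged
    entry = {
        "recv_us": recv_us,
        "send_us": 0,             # filled in later, at export time
        "asn": local_asn,
        "router_id": router_id,
        "synced": synced,         # T-bit
        "sync_type": stratum if synced else 0,
    }
    # Create the attribute if absent, otherwise append to the list.
    path["bgp_ts"] = list(path.get("bgp_ts") or []) + [entry]
    return path

path = on_receive({"nlri": "10.0.0.0/8"}, True, 64500, "192.0.2.1",
                  synced=True, stratum=2, now_us=100)
print(path["bgp_ts"])
```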
5.4. Sending a timestamped route in BGP
5.4.1. Propagating the BGP Timestamp attribute
For manageability and security purposes, the authors suggest that the
BGP timestamp attribute SHOULD NOT be sent to a peer unless this has
been explicitly configured for that peer. This prevents timestamp
and internal address information from being propagated to external
peers, for example. See Section 5.7 for more information.
If a BGP path containing a BGP-TS attribute must be sent to a peer
that is not configured with the BGP timestamp option, the BGP-TS
attribute should be dropped when the update message is sent to the
peer.
5.4.2. Setting the send timestamp
If sending the timestamp attribute is authorized for a specific peer,
and the path has a BGP-TS attribute, the outgoing BGP processing MUST
fill the send timestamp field when exporting the path to the peer.
The time recording MUST occur after all BGP filtering policies
(outgoing routing policies, ORF, ...) and after placing the path in
the Adj-RIB-Out. An implementation SHOULD set the timestamp at the
nearest possible step before sending the BGP Update to the peer.
Depending on the implementation, the timestamping may occur at
different stages of the outgoing BGP processing. Each implementer
SHOULD document their timestamping process so that users can
interpret timestamp values correctly. As most implementations use
the concept of peer groups, if the timestamp is set too early in the
outgoing BGP processing, all peers within a group may get the same
timestamp value. Implementations should avoid this.
The process of adding the send timestamp must be as light as possible
in order to influence the propagation time as little as possible.
+------+
| | +--------+ +-----+ +---+ +-------+ No TS
| | --> | Rtgpol | --> | ORF | --> |...|-->|Adj-RIB|-------------->
| | | Out | |P#1 | | | |Out | Send to peer
| | | Peer#1 | | | | | |Peer#1 | +-----+
| | | | | | | | | |-->|AddTS| --->
| | +--------+ +-----+ +---+ +-------+ +-----+
| | TS present
| BGP |
| LOC |
| RIB |
| |
| | +--------+ +-----+ +---+ +-------+ No TS
| | --> | Rtgpol | --> | ORF | --> |...|-->|Adj-RIB|-------------->
| | | Out | |P#2 | | | |Out | Send to peer
| | | Peer#2 | | | | | |Peer#2 | +-----+
| | | | | | | | | |-->|AddTS| --->
| | +--------+ +-----+ +---+ +-------+ +-----+
| | TS present
+------+
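The outgoing processing of Section 5.4 can be sketched per peer as
follows in Python. This is an illustration, not a prescribed
implementation: paths are modeled as dictionaries with a hypothetical
"bgp_ts" list, and the sketch assumes that the send timestamp to fill
is that of the last entry, and only when that entry was created by
the local speaker (identified here by a hypothetical router_id
field).

```python
import copy

def export_to_peer(path, peer_ts_enabled, local_router_id, now_us):
    """Return the per-peer copy of a path, as sent in the BGP Update."""
    out = copy.deepcopy(path)  # per-peer copy: LOC-RIB path is untouched
    vector = out.get("bgp_ts")
    if not vector:
        return out             # no BGP-TS attribute: nothing to do
    if not peer_ts_enabled:
        del out["bgp_ts"]      # peer not configured: drop the attribute
        return out
    last = vector[-1]
    if last.get("router_id") == local_router_id:
        last["send_us"] = now_us  # set as late as possible, per peer
    return out

loc_rib_path = {"nlri": "10.0.0.0/8",
                "bgp_ts": [{"router_id": "192.0.2.1",
                            "recv_us": 5, "send_us": 0}]}
print(export_to_peer(loc_rib_path, True, "192.0.2.1", 9))
print(export_to_peer(loc_rib_path, False, "192.0.2.1", 9))
```

Working on a per-peer copy means that two peers of the same peer
group can get different send timestamps, and that the path kept
locally retains a send timestamp of zero.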
5.5. Limiting churn
Adding timestamp information to BGP paths makes all received paths
unique.
             RR1
            /   \
   10/8 - R1     RR3 --- R3
            \   /
             RR2
In the figure above, we consider that RR1 and RR2 are part of the
same cluster (cluster ID: 1). RR3 is a client of RR1 and RR2. R3 is
a client of RR3, and R1 is a client of RR1 and RR2.
Without BGP timestamp, when R1 originates the BGP prefix 10/8, it
sends it to RR1 and RR2. Consider that RR3 receives the path from
RR1 first and reflects it to R3. When RR3 then receives the path
from RR2, it may consider that the path from RR2 is best (lowest
router ID), but as the BGP attributes of that path are exactly the
same as those of the RR1 path, there is no need to send an update to
R3.
With BGP timestamp, when R1 originates the BGP prefix 10/8, it sends
it to RR1 and RR2. Consider that RR3 receives the path from RR1
first and reflects it to R3. When RR3 then receives the path from
RR2, it may consider that the path from RR2 is best (lowest router
ID), but as the BGP attributes of the two paths are no longer equal
due to the timestamp difference, RR3 may need to advertise an update
to R3.
In order to avoid introducing more churn, we propose to modify the
behavior described in Section 9.2 of [RFC4271]. An implementation
MUST NOT consider the BGP-TS attribute when evaluating the need to
send a new update. As the BGP-TS attribute is purely informational,
even if BGP Speakers have a different view of the timestamp
attribute, there is no impact on routing.
In our example, when RR3 receives the path from RR2, even if it
considers the RR2 path as best, it does not send an update to R3, as
all the attributes except BGP-TS are equal.
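The comparison rule above can be sketched as follows in Python.
Attribute sets are modeled as dictionaries, and the "bgp_ts" key and
the function name are assumptions made for this illustration.

```python
def needs_update(advertised, candidate, ignored=("bgp_ts",)):
    """Return True if a new update must be sent for the candidate path."""
    def comparable(attrs):
        # Leave the BGP-TS attribute out of the comparison.
        return {k: v for k, v in attrs.items() if k not in ignored}
    return comparable(advertised) != comparable(candidate)

via_rr1 = {"next_hop": "R1", "origin": "IGP", "bgp_ts": ["R1:T1", "RR1:T2"]}
via_rr2 = {"next_hop": "R1", "origin": "IGP", "bgp_ts": ["R1:T1", "RR2:T3"]}
print(needs_update(via_rr1, via_rr2))
```

With this rule, RR3 in the example does not re-advertise 10/8 to R3
when it switches from the RR1 path to the RR2 path.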
5.6. Marking stale entries
Section 2.4 describes cases where advertised timestamp information is
no longer relevant because it is old, and also requires the
identification of first-propagation timestamps.
To do this, we propose to mark old entries by adding a Stale
Indicator within the timestamp vector. The presence of a Stale
Indicator must be interpreted as meaning that all previous timestamp
entries are old and must not be considered as a first propagation.
BGP-TS attribute example:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +
| Timestamp #1 (IPv4) | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Old
| Timestamp #2 (IPv4) | | entries
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| Timestamp #3 (IPv4) | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +
| Timestamp #4 (Stale Indicator) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +
| Timestamp #5 (IPv4) | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Usable
... ...entries
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| Timestamp #n (variable) | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +
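A tool reading such a vector could separate old entries from usable
ones as sketched below in Python. Entries are modeled as
dictionaries with a hypothetical "entry_type" key holding the
EntryType value defined in Section 4; the function name is also an
assumption of this example.

```python
STALE = 3  # EntryType of the Stale Indicator (Section 4)

def usable_entries(vector):
    """Return the entries that follow the last Stale Indicator."""
    last_stale = max(
        (i for i, e in enumerate(vector) if e["entry_type"] == STALE),
        default=-1,  # no Stale Indicator: every entry is usable
    )
    return vector[last_stale + 1:]

vector = [
    {"entry_type": 1, "router_id": "192.0.2.1"},
    {"entry_type": 1, "router_id": "192.0.2.2"},
    {"entry_type": STALE},
    {"entry_type": 1, "router_id": "192.0.2.3"},
]
print(usable_entries(vector))
```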
Insertion of Stale Indicator in a BGP-TS attribute may happen in the
following conditions :
o A path is received from a peer containing BGP-TS attribute or
originated locally, the path matches the inspection list, and the
decision process does not select the path as best path. Then the
Stale Indicator SHOULD be inserted after decision process
happened.
o A path is received from a peer containing BGP-TS attribute or
originated locally, the path matches the inspection list, and the
decision process does select the path as best path. The path is
exported to peers and then the Stale Indicator MUST be inserted.
The path MUST NOT be repropagated as per Section 5.5.
When inserting a Stale Indicator, if a Stale Indicator already exists
in the timestamp vector, the implementation SHOULD remove it before
adding the new one.
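The insertion rule can be sketched as follows in Python. As in other
illustrations, entries are modeled as dictionaries whose key names
are assumptions of this example; the Stale Indicator fields follow
the Type 3 definition in Section 4 (timestamps set to zero, ASN of
the local BGP Speaker).

```python
STALE = 3  # EntryType of the Stale Indicator (Section 4)

def mark_stale(vector, local_asn):
    """Return a new vector ending with a single Stale Indicator."""
    # Remove any existing Stale Indicator before adding the new one.
    kept = [e for e in vector if e["entry_type"] != STALE]
    kept.append({"entry_type": STALE, "asn": local_asn,
                 "recv_us": 0, "send_us": 0})
    return kept

vector = [
    {"entry_type": 1, "asn": 64500},
    {"entry_type": STALE, "asn": 64500, "recv_us": 0, "send_us": 0},
    {"entry_type": 1, "asn": 64501},
]
print(mark_stale(vector, 64502))
```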
BGP Update BGP Update
10/8 10/8
NH R2 NH=R1
ASP : 2 ASP : 1,2
Origin IGP Origin IGP
BGP-TS : BGP-TS :
[TS_entry1:IPv4] [TS_entry1:IPv4]
[TS_entry2:IPv4] [TS_entry2:IPv4]
[TS_entry3:Stale] [TS_entry3:Stale]
[TS_entry4:IPv4] [TS_entry4:IPv4]
[TS_entry5:IPv4] [TS_entry5:IPv4]
[TS_entry6:IPv4]
BGP BGP Speaker BGP Speaker
Speaker R1 R3
R2 +---------------------------+
-----------------> | | ------------>
| BGP Path |
| At reception |
| +-----------------------+ |
| | 10/8, from R2 | |
| | BGP-TS : | |
| | [TS_entry1:IPv4] | |
| | [TS_entry2:IPv4] | |
| | [TS_entry3:Stale] | |
| | [TS_entry4:IPv4] | |
| | [TS_entry5:IPv4] | |
| | [TS_entry6:IPv4]<-| | New timestamp entry
| +-----------------------+ | created by R1
| |
| BGP Path |
| after sending to peer |
| Stale state is added |
| +-----------------------+ |
| | 10/8, from R2 | |
| | BGP-TS : | |
| | [TS_entry1:IPv4] | |
| | [TS_entry2:IPv4] | |
| | [TS_entry4:IPv4] | |
| | [TS_entry5:IPv4] | |
| | [TS_entry6:IPv4]<-| | New timestamp entry
| | [TS_entry7:Stale] | | created by R1
| +-----------------------+ |
+---------------------------+
In the example above, R2 sends a BGP path with some existing stale
timestamps. When R1 receives the route, it creates a new timestamp
entry in the BGP-TS attribute. We consider that the decision process
decides that the path is best; the path is exported with the new
timestamp entry and the old timestamps coming from R2. R1 then
updates its local path by removing the previous Stale Indicator and
placing a new one at the last position, to mark that any further
advertisement of this path is no longer the first propagation.
Single AS
----------------------------
/ RTR_SRC2- 10/8 \
| / |
| RR1 |
| / \ |
| RTR_SRC1 \ |
| | \ |
| 10/8 RR3 |
| | |
| RTR_DST1 |
\ /
----------------------------
In the figure above, we consider that all BGP Speaker apply timestamp
for prefix 10/8. RTR_SRC1 originates 10/8 in BGP, the decision
process will decide that the path is best. RTR_SRC1 will export path
to RR1 and then it will add locally the Stale Indicator within the
timestamp vector. The path exported does not have the Stale
Indicator. RR1 will receive the path and add a timestamp entry, the
path is considered as best, RR1 will export it to RTR_SRC2 and RR3
and then it will add a stale indicator. RR3 will proceed in the same
way.
When RTR_SRC2 originates a new path for 10/8, if this new path is
best on RTR_SRC2, it exports the path to RR1 and then locally adds
the Stale Indicator to the path. When RR1 receives the route:
o If the path from RTR_SRC2 is best, RR1 exports the new path to
RTR_SRC1 and RR3 and then adds the Stale Indicator to the path. If
RTR_SRC2 fails after some time, RR1 picks the RTR_SRC1 path as
best and exports it to RR3. RR3 knows that the received
timestamp entries are stale thanks to the Stale Indicator.
o If the path from RTR_SRC2 is not best, RR1 adds the Stale
Indicator to the path. If RTR_SRC1 fails after some time, RR1
picks the RTR_SRC2 path as best and exports it to RR3.
RR3 knows that the received timestamp entries are stale thanks
to the Stale Indicator.
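The Stale Indicator handling described above can be sketched as
follows. This is only an illustration and not part of the
specification: the list representation of the timestamp vector, the
STALE marker and the function names (receive, mark_stale, export) are
assumptions made for the example. The real encoding is the one
defined for the BGP-TS attribute.

```python
# Hypothetical marker standing in for the Stale Indicator.
STALE = "STALE"


def receive(vector, local_entry):
    """Return the local copy of a received vector with our entry appended."""
    return list(vector) + [local_entry]


def mark_stale(vector):
    """Remove any previous Stale Indicator and place a new one last."""
    return [e for e in vector if e != STALE] + [STALE]


def export(local_vector):
    """Return (exported_vector, updated_local_vector).

    The path is exported as currently stored; only afterwards is the
    local copy marked stale, so a first propagation carries no marker
    for the entries added since the last marker.
    """
    return list(local_vector), mark_stale(local_vector)


# RTR_SRC1 originates the path, RR1 receives it and exports it twice.
src1_local = receive([], "RTR_SRC1")
sent_by_src1, src1_local = export(src1_local)

rr1_local = receive(sent_by_src1, "RR1")
first_export, rr1_local = export(rr1_local)    # no Stale Indicator
second_export, rr1_local = export(rr1_local)   # ends with Stale Indicator
```

In this sketch, a receiver such as RR3 can treat every entry located
before the Stale Indicator as stale, which matches the two cases
listed above. A path that is received but not selected as best would
simply go through mark_stale() without being exported.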
5.7. Inter-AS considerations
BGP update
10.0.0.0/8
TS:
AS3;CE1:rT1,sT2
CE1--------->R1 ------------> R2 ------------> R3 ------------> R4 -------> CE2
| | | |
| | | |
AS3 AS1 AS2 AS4
In the figure above, we consider that a customer wants to monitor
BGP update propagation time between its two sites.
If AS1 and AS2 BGP Speakers do not support BGP-TS, the attribute
will be transported transparently across AS1 and AS2 without any
processing. CE2 will therefore receive the BGP path with only a
single timestamp entry, the one from CE1.
If AS1 and AS2 BGP Speakers do support BGP-TS, four different
options are offered: drop, drop-as, summarize, propagate. Note that
the drop-as and summarize options may involve more processing and
so may impact the end-to-end propagation time.
5.7.1. Drop option
If AS1 and/or AS2 BGP Speakers support BGP-TS, they may not want to
expose any timestamp information to each other. If a service
provider does not want to propagate timestamp information to
external peers, it can decide not to activate the "timestamp" option
in the peer configuration, as explained in Section 5.4.
BGP update BGP update BGP update BGP update
10.0.0.0/8 10.0.0.0/8 10.0.0.0/8 10.0.0.0/8
TS: TS: TS:
AS3;CE1:rT1,sT2 AS3;CE1:rT1,sT2 AS2;R3:rT5,sT6
AS1;R1:rT3,sT4
CE1------------->R1 -----------------> R2 ---------------> R3 ------------> R4
| | no TS | |
| | | |
AS3 AS1 AS2
In the example above, CE1 is configured to send timestamps to R1,
and R1 to R2, but R2 does not want to send timestamps to R3.
When sending the BGP route for 10/8, CE1 adds the timestamp attribute
and a timestamp entry (AS3, entry type: IPv4=CE1_IP, receive
timestamp=T1, send timestamp=T2). R1 receives the path; assuming
the inspection list matches, R1 adds a timestamp entry. When sending
to R2, R1 sends the following information in its timestamp entry:
AS1, entry type: IPv4=R1_IP, receive timestamp=T3, send
timestamp=T4. As R2 is configured not to send timestamp information
to R3, it drops the BGP-TS attribute when sending to R3.
5.7.2. Drop AS option
If AS1 and/or AS2 BGP Speakers support BGP-TS, they may not want to
expose their timestamps or internal BGP topology to other ASes. If a
service provider does not want to propagate timestamp information
related to its local AS to external peers, it can use the "drop-as"
option towards the peer.
BGP update BGP update BGP update BGP update
10.0.0.0/8 10.0.0.0/8 10.0.0.0/8 10.0.0.0/8
TS: TS: TS: TS:
AS3;CE1:rT1,sT2 AS3;CE1:rT1,sT2 AS3;CE1:rT1,sT2 AS3;CE1:rT1,sT2
AS1;R1:rT3,sT4 AS2;R3:rT5,sT6
CE1------------->R1 -----------------> R2 ---------------> R3 ------------> R4
| | no TS | |
| | | |
AS3 AS1 AS2
In the example above, CE1 is configured to send timestamps to R1,
and R1 to R2, but R2 does not want to send AS1 internal
timestamps to R3. The "drop-as" option is configured on R2 towards R3.
When sending the BGP route for 10/8, CE1 adds the timestamp attribute
and a timestamp entry (AS3, entry type: IPv4=CE1_IP, receive
timestamp=T1, send timestamp=T2). R1 receives the path; assuming
the inspection list matches, R1 adds a timestamp entry. When sending
to R2, R1 sends the following information in its timestamp entry:
AS1, entry type: IPv4=R1_IP, receive timestamp=T3, send
timestamp=T4. As R2 is configured with the "drop-as" option towards
R3, it removes all timestamp entries whose ASN is equal to its own
autonomous system number and then sends the update to R3.
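The "drop-as" filtering can be sketched as a simple filter on the
timestamp vector. The TsEntry structure, its field names and the
numeric sample timestamps below are assumptions made for
illustration; they do not reflect the wire encoding of the BGP-TS
attribute.

```python
from typing import List, NamedTuple


class TsEntry(NamedTuple):
    """Hypothetical in-memory view of one timestamp entry."""
    asn: int        # AS number of the speaker that added the entry
    speaker: str    # speaker identifier (an IPv4 address in the draft)
    recv: float     # receive timestamp
    send: float     # send timestamp


def drop_as(vector: List[TsEntry], local_asn: int) -> List[TsEntry]:
    """Remove every entry whose ASN equals the local AS number."""
    return [entry for entry in vector if entry.asn != local_asn]


# Vector held by R2 (AS1) before exporting to R3, with sample times.
vector_on_r2 = [
    TsEntry(asn=3, speaker="CE1", recv=1.0, send=2.0),
    TsEntry(asn=1, speaker="R1", recv=3.0, send=4.0),
]
sent_to_r3 = drop_as(vector_on_r2, local_asn=1)
```

With these sample values, only the entry added by CE1 remains in the
vector sent to R3, as in the figure above.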
5.7.3. Summary option
If AS1 and/or AS2 BGP Speakers support BGP-TS, they may want to offer
a timestamp service to their customers while hiding their internal
topology. To achieve this behavior, AS1/AS2 can activate a timestamp
summary option on the external peer.
BGP update BGP update BGP update BGP update
10.0.0.0/8 10.0.0.0/8 10.0.0.0/8 10.0.0.0/8
TS: TS: TS: TS:
AS3;CE1:rT1,sT2 AS3;CE1:rT1,sT2 AS3;CE1:rT1,sT2 AS3;CE1:rT1,sT2
AS1;R1:rT3,sT4 AS1;rT3,sT5 AS1;rT3,sT5
AS2;R3:rT6,sT7
CE1------------->R1 -----------------> R2 ---------------> R3 ------------> R4
| | TS summary | |
| | | |
AS3 AS1 AS2
When using the summary option, the BGP-TS attribute is modified as
follows when exporting the route:
o All timestamp entries containing the local AS in AS field are
removed.
o A new timestamp entry is created and inserted in place of removed
entries (n entries replaced by 1).
o The new timestamp entry will use an entry type zero.
o The new timestamp entry MUST have the S bit set.
o The new timestamp entry MUST NOT have any EntryType.
o The receive timestamp of the new timestamp entry is the receive
timestamp of the first timestamp entry that has been removed.
o The send timestamp of the new timestamp entry is added as
usual.
In the example above, CE1 is configured to send timestamps to R1,
and R1 to R2, but R2 wants to summarize timestamp information
towards AS2.
When sending the BGP route for 10/8, CE1 adds the timestamp attribute
and a timestamp entry (AS3, entry type: IPv4=CE1_IP, receive
timestamp=T1, send timestamp=T2). R1 receives the path; assuming
the inspection list matches, R1 adds a timestamp entry. When sending
to R2, R1 sends the following information in its timestamp entry:
AS1, entry type: IPv4=R1_IP, receive timestamp=T3, send
timestamp=T4. As R2 is configured with the "summarize" option
towards R3, it removes all timestamp entries whose ASN is equal to
its own autonomous system number and adds a new timestamp entry with
an entry type zero. The receive timestamp is taken from the R1
timestamp entry.
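The summarization rules can be sketched as follows. The TsEntry
structure and its field names are assumptions made for the example,
as is the entry type value 1 used here for IPv4 identifiers. The case
where no entry of the local AS is present is not covered by the rules
above; the sketch leaves the vector unchanged in that case, which is
also an assumption.

```python
from typing import List, NamedTuple, Optional


class TsEntry(NamedTuple):
    """Hypothetical in-memory view of one timestamp entry."""
    asn: int
    entry_type: int           # 0 means no speaker identifier is carried
    speaker: Optional[str]
    s_bit: bool               # the S bit, set on summary entries
    recv: float
    send: float


def summarize(vector: List[TsEntry], local_asn: int,
              send_time: float) -> List[TsEntry]:
    """Replace all entries of the local AS by a single summary entry."""
    local_positions = [i for i, e in enumerate(vector)
                       if e.asn == local_asn]
    if not local_positions:
        return list(vector)   # assumption: nothing to summarize
    first = local_positions[0]
    summary = TsEntry(asn=local_asn, entry_type=0, speaker=None,
                      s_bit=True, recv=vector[first].recv,
                      send=send_time)
    kept = [e for e in vector if e.asn != local_asn]
    # Every entry located before `first` was kept, so the summary
    # entry takes the position of the first removed entry.
    return kept[:first] + [summary] + kept[first:]


# Vector held by R2 (AS1) before exporting to R3, with sample times.
vector_on_r2 = [
    TsEntry(3, 1, "CE1", False, 1.0, 2.0),
    TsEntry(1, 1, "R1", False, 3.0, 4.0),
]
sent_to_r3 = summarize(vector_on_r2, local_asn=1, send_time=5.0)
```

With these sample values, the vector sent to R3 contains the CE1
entry followed by one AS1 entry with receive timestamp 3.0 and send
timestamp 5.0, which corresponds to "AS1;rT3,sT5" in the figure.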
5.7.4. Propagate option
If AS1 and/or AS2 BGP Speakers support BGP-TS, they may want to offer
timestamp service to their customers with a full view. This MUST be
the default behavior when timestamp is activated on a peer.
BGP update BGP update BGP update BGP update
10.0.0.0/8 10.0.0.0/8 10.0.0.0/8 10.0.0.0/8
TS: TS: TS: TS:
AS3;CE1:rT1,sT2 AS3;CE1:rT1,sT2 AS3;CE1:rT1,sT2 AS3;CE1:rT1,sT2
AS1;R1:rT3,sT4 AS1;R1:rT3,sT4 AS1;R1:rT3,sT4
AS1;R2:rT5,sT6 AS1;R2:rT5,sT6
AS2;R3:rT6,sT7
CE1------------->R1 -----------------> R2 ---------------> R3 ------------> R4
| | TS propagate | |
| | | |
AS3 AS1 AS2
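When the full vector is propagated, a tool that retrieves it can
derive the time spent in each speaker and between speakers. The
sketch below is an illustration under assumptions: entries are plain
(asn, speaker, receive, send) tuples and the numeric timestamps are
made-up sample values. Differences between timestamps taken by two
different speakers are only meaningful if their clocks are
synchronized.

```python
def processing_times(vector):
    """Time spent inside each speaker: send minus receive timestamp."""
    return [(speaker, send - recv) for _asn, speaker, recv, send in vector]


def hop_delays(vector):
    """Time between one speaker's send and the next speaker's receive."""
    delays = []
    for previous, current in zip(vector, vector[1:]):
        delays.append((previous[1], current[1], current[2] - previous[3]))
    return delays


# Sample vector in path order: (asn, speaker, receive, send).
vector = [
    (3, "CE1", 1.0, 2.0),
    (1, "R1", 3.0, 4.5),
    (1, "R2", 5.0, 6.0),
]
per_speaker = processing_times(vector)
per_hop = hop_delays(vector)
```

A speaker with an unusually large processing time compared to the
others on the path is a candidate for the slow BGP Speaker that the
operator is looking for.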
5.8. Retrieving timestamp vector
The authors suggest that implementers use a local wrapping buffer on
each node and record an entry in the buffer each time a BGP path is
timestamped. An external tool should then retrieve timestamp
information from sink points. How the information is retrieved is
out of scope of this document, but we can imagine using:
o BMP from the external tool to the sink point.
o NETCONF get to retrieve wrapping buffer information.
o SNMP get to retrieve wrapping buffer information.
o CLI command to retrieve wrapping buffer information.
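A local wrapping buffer can be sketched with a bounded queue that
silently overwrites its oldest record when full. The TimestampLog
class, its method names and the record layout are assumptions made
for illustration; the retrieval mechanism itself stays out of scope.

```python
from collections import deque


class TimestampLog:
    """Fixed-size wrapping buffer of timestamped paths (hypothetical)."""

    def __init__(self, capacity):
        # A deque with maxlen discards the oldest record on overflow.
        self._records = deque(maxlen=capacity)

    def record(self, prefix, vector):
        """Store the timestamp vector seen for a prefix."""
        self._records.append((prefix, tuple(vector)))

    def dump(self):
        """Return the records, oldest first, for an external tool."""
        return list(self._records)


log = TimestampLog(capacity=2)
log.record("10.0.0.0/8", ["AS3;CE1:rT1,sT2"])
log.record("10.0.0.0/8", ["AS3;CE1:rT8,sT9"])
log.record("10.0.0.0/8", ["AS3;CE1:rT15,sT16"])  # overwrites the first
```

The capacity bounds the memory used on the node, at the cost of
losing old records if the external tool does not poll often enough.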
5.9. Handling malformed attribute
When receiving a BGP Update message containing a malformed BGP-TS
attribute, an "attribute discard" action MUST be applied as defined
in [I-D.ietf-idr-error-handling].
5.10. Impact on update packing
Introducing timestamp information will make update packing less
efficient for timestamped paths. In the deployment we are
targeting (Section 7), this is not considered an issue. In the
case where a site generates a special prefix with a timestamped path
and other prefixes that are not timestamped, these prefixes will not
be packed together, so two update messages will be generated. Even
if two updates are generated, we do not consider that the
propagation time will be significantly affected.
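The effect on update packing can be illustrated by grouping prefixes
by their exact set of path attributes, since only prefixes that share
the same attributes can be carried in the same update message. The
pack_updates function, the attribute tuples and their values below
are hypothetical and only serve the example.

```python
from collections import defaultdict


def pack_updates(routes):
    """Group prefixes sharing identical attributes into one update.

    `routes` is an iterable of (prefix, attributes) pairs, where
    `attributes` is a hashable value such as a tuple of pairs.
    """
    groups = defaultdict(list)
    for prefix, attributes in routes:
        groups[attributes].append(prefix)
    return list(groups.items())


common = (("AS_PATH", "65001"), ("NEXT_HOP", "192.0.2.1"))
beacon = common + (("BGP-TS", "AS3;CE1:rT1,sT2"),)
routes = [
    ("192.0.2.0/24", common),
    ("10.0.0.0/8", beacon),        # timestamped beacon prefix
    ("198.51.100.0/24", common),
]
updates = pack_updates(routes)     # two groups, so two update messages
```

Here the two non-timestamped prefixes are packed together while the
beacon prefix is sent alone, which gives the two update messages
mentioned above.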
6. Compared to BMP
BMP (BGP Monitoring Protocol) [I-D.ietf-grow-bmp] is a solution to
monitor BGP sessions and provides a convenient interface for
obtaining route views. BMP is a complete suite of messages to
exchange information regarding a BGP session.
BMP could be used as a solution to monitor BGP update
propagation time, but there are multiple drawbacks associated with
such a solution:
o BMP provides a dump of all received BGP updates (per peer). If we
are interested only in probing BGP routes, strong filtering of
information may be needed in BMP messages.
o BMP does not mandate timestamping of messages. As per
[I-D.ietf-grow-bmp] Section 5: "If the implementation is able to
provide information about when routes were received, it MAY
provide such information in the BMP timestamp field. Otherwise,
the BMP timestamp field MUST be set to zero, indicating that time
is not available."
o BMP may provide (if the implementation supports it) timestamp
information only from a single router's point of view. If we want
to retrieve the timestamps of all BGP Speakers on a path, a BMP
session is required to each BGP Speaker. Correlation (based on the
known design) is also required at the external tool to order the
timestamps from each BMP session.
o If BMP provides timestamp information, it does not provide
information on how the router clock is synchronized (free run,
NTP, GPS ...).
o BMP only provides Adj-RIB-in view and does not provide outgoing
information.
Using BMP to monitor BGP update propagation may complicate the design
of the monitoring solution. But as mentioned in Section 1, BMP can be
used on specific sink routers to retrieve the BGP-TS vector.
7. Deployment considerations
This solution is not intended to perform timestamp imposition on all
BGP prefixes.
The deployment scenario we are targeting is the monitoring of some
specific single-homed NLRIs identified by the service provider (see
Section 2 for an example).
These NLRIs may be advertised at some injection point in the network,
and the timestamp vector will be retrieved at some sink points. As
pointed out in Section 2.2.2, multiple measurement samples will be
necessary in order to evaluate the propagation time.
These NLRIs should be single-homed in order to ensure an end-to-end
propagation from injection point to sink point. Coordination
between injection and sink points based on an external tool is
necessary: once an NLRI to be monitored has been advertised, the tool
would retrieve the timestamp vector from the sink point.
A service provider may use real prefixes (used for routing) or special
prefixes (standard IP prefixes allocated for beaconing). When a
special prefix is used, the tool can, at regular intervals, trigger the
advertisement and withdrawal of the prefix. The tool must ensure
that it has retrieved the timestamp vector before withdrawing the
prefix, and must also wait for convergence after the withdrawal
before advertising the prefix again.
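One beaconing cycle of such a tool can be sketched as follows. All
names are hypothetical: advertise, withdraw and fetch_vector stand
for whatever mechanism the operator uses to drive the injection
point and to query the sink point, and the timer values are
arbitrary. Giving up and withdrawing after a timeout is a choice made
for this sketch only.

```python
import time


def beacon_cycle(advertise, withdraw, fetch_vector, convergence_wait,
                 poll_interval=1.0, timeout=30.0, sleep=time.sleep):
    """Run one advertise / retrieve / withdraw cycle for a beacon prefix.

    fetch_vector() returns the timestamp vector seen at the sink
    point, or None while the path has not been received yet.
    Returns the vector, or None if the timeout expired.
    """
    advertise()
    waited = 0.0
    vector = fetch_vector()
    # Retrieve the vector before withdrawing the prefix.
    while vector is None and waited < timeout:
        sleep(poll_interval)
        waited += poll_interval
        vector = fetch_vector()
    withdraw()
    # Wait for convergence before the prefix is advertised again.
    sleep(convergence_wait)
    return vector
```

The caller would invoke beacon_cycle() at regular intervals and store
each returned vector as one measurement sample.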
Users should keep the inspection list as small as possible in order
to avoid introducing processing overhead that would slow down
propagation.
8. Security considerations
Depending on the implementation and router capacity, adding
timestamps to BGP paths may consume some router resources. As
proposed in Section 5.1, by default a BGP Speaker will not timestamp
any path, and an inspection list should be configured to activate
timestamping on a subset of paths. Using this approach, we consider
that the overhead that may be introduced by timestamping BGP paths
is well controlled by operators. An external router cannot force an
internal router to timestamp.
Providing detailed timestamp information to other ASes may introduce
security issues by exposing internal data (part of the BGP topology,
IP addresses, internal performance) to external entities. The
proposal we make in Section 5.7 addresses this security issue by
giving operators flexibility on the level of information they want
to expose to external peers.
9. Acknowledgements
10. IANA Considerations
IANA shall assign a codepoint for the BGP Timestamp attribute. This
codepoint will come from the "BGP Path Attributes" registry.
11. Normative References
[I-D.ietf-grow-bmp]
Scudder, J., Fernando, R., and S. Stuart, "BGP Monitoring
Protocol", draft-ietf-grow-bmp-07 (work in progress),
October 2012.
[I-D.ietf-idr-error-handling]
Chen, E., Scudder, J., Mohapatra, P., and K. Patel,
"Revised Error Handling for BGP UPDATE Messages",
draft-ietf-idr-error-handling-18 (work in progress),
December 2014.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
Protocol 4 (BGP-4)", RFC 4271, January 2006.
[RFC5905] Mills, D., Martin, J., Burbank, J., and W. Kasch, "Network
Time Protocol Version 4: Protocol and Algorithms
Specification", RFC 5905, June 2010.
Authors' Addresses
Stephane Litkowski
Orange Business Service
Email: stephane.litkowski@orange.com
Keyur Patel
Cisco Systems
Email: keyupate@cisco.com
Jeff Haas
Juniper Networks
Email: jhaas@juniper.net