Internet DRAFT - draft-zhang-trill-multi-topo-rpfc
INTERNET-DRAFT Mingui Zhang
Intended Status: Proposed Standard Huawei
Expires: August 2, 2012 January 30, 2012
Reverse Path Forwarding Check under Multiple Topology TRILL
draft-zhang-trill-multi-topo-rpfc-00.txt
Abstract
Multi-homing (RBridge Aggregation) is a promising approach to
increase the reliability and access bandwidth of the TRILL edge.
Active-active forwarding in multi-homing allows multiple RBridges to
forward data frames for VLAN-x on a LAN link, which creates the
possibility that multicast frames from a specific ingress RBridge
arrive at multiple incoming ports of a remote RBridge. This violates
the Reverse Path Forwarding Check, and multicast frames that arrive
at unexpected incoming ports will be discarded by the remote RBridge.
This document makes use of multiple topology TRILL to solve this
problem. Multiple topology TRILL provides physical separation of
traffic from different members of the aggregation. Multicast frames
from aggregation members then comply with the Reverse Path Forwarding
Check, which is performed per topology.
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html
Copyright and License Notice
Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved.
Mingui Zhang Expires August 2, 2012 [Page 1]
INTERNET-DRAFT RPFC under Multi-Topology TRILL January 30, 2012
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
  1.1. Content
  1.2. Terminology
  1.3. Acronyms
2. RPFC Issue in Active-Active Multi-homing
3. Multi-Topology for Aggregation
  3.1. Multicast Ingressing
  3.2. Multicast Egressing
  3.3. Address Flip-Flop Avoidance by Asymmetric Topologies
  3.4. Tunneling Approach
4. Incremental Deployment
  4.1. Intra-Topology Communication
  4.2. Inter-Topology Communication
  4.3. A Hybrid Scenario
5. Security Considerations
6. IANA Considerations
7. Acknowledgements
8. References
  8.1. Normative References
  8.2. Informative References
Author's Addresses
1. Introduction
With the link state routing of IS-IS (Intermediate System to
Intermediate System), TRILL provides least-cost forwarding of data
frames and replaces the Spanning Tree Protocol (STP) running in
traditional bridged networks.
RBridge Aggregation provides active-active multi-homing at the edge
of TRILL [RBAgg]. It increases the access bandwidth and reliability
of the TRILL edge but creates the possibility that multiple RBridges
ingress/egress data frames for end-stations in VLAN-x on a LAN
link. A typical use of RBridge Aggregation is to represent a LAN
link with a single virtual RBridge. RBridges participating in the
aggregation ingress/egress data frames on behalf of this virtual
RBridge using a pseudonode nickname.
Reverse Path Forwarding Check (RPFC) is used by TRILL to suppress
forwarding loops of multicast frames. Based on a Distribution Tree
(DT), a multicast frame from a specific ingress RBridge is expected
to arrive on a single link of an RBridge. RBridges MUST drop
multicast frames that fail the RPFC [RFC6325]. When multiple RBridges
simultaneously ingress multicast frames for end-stations in VLAN-x on
a LAN link, there is no guarantee that these frames always arrive on
the expected link of a remote RBridge.
Multiple Topology (MT) TRILL provides a physical separation of
traffic [RFC5120] [MTc] [MTd]. An MT aware RBridge can participate
in data forwarding in multiple topologies at the same time. This
document uses this feature to resolve the issue that active-active
multi-homing may fail the RPFC. Each RBridge of the aggregation uses
an individual topology to ingress/egress data frames for the target
LAN link. Since distribution trees are calculated per topology by MT
aware RBridges [MTd], multicast frames are forwarded along these
distribution trees separately, which lets the arriving multicast
frames pass the RPFC. To be backward compatible, the solution in
this draft does not require all RBridges in a campus to be upgraded
to support multiple topology TRILL. Legacy RBridges that do not
support multiple topology TRILL can interoperate with the MT aware
RBridges participating in the RBridge Aggregation.
This document focuses on solving the RPFC issues caused by
active-active multi-homing. Other issues of multi-homing, such as
failure recovery and load balancing, are in the scope of RBridge
Aggregation [RBAgg]. One advantage of adopting multiple topology
TRILL is that failure recovery approaches developed for multiple
topology routing ([RFC5714]) can be reused in RBridge Aggregation
rather than being reinvented.
1.1. Content
Section 2 explains why active-active multi-homing may cause the
Reverse Path Forwarding Check of TRILL to fail.
Section 3 describes how edge RBridges are configured to achieve
RBridge Aggregation through multi-topology TRILL.
Backward compatibility is an essential requirement for
interoperation between legacy RBridges and RBridges participating in
aggregation. Section 4 describes solutions for three incremental
deployment scenarios.
1.2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
1.3. Acronyms
IS-IS - Intermediate System to Intermediate System
TRILL - TRansparent Interconnection of Lots of Links
STP - Spanning Tree Protocol
MT - Multiple Topology
DT - Distribution Tree
LAG - Link Aggregation Group
RPFC - Reverse Path Forwarding Check
2. RPFC Issue in Active-Active Multi-homing
                                 +-----+
        RBi                      | RBi |(Remote RBridge)
       /   \                     +-----+
    RB1     RB2                     |
    /                          /\/\/\/\/\/\
 RBv                          /  Transit   \
                             <   RBridges   >
 Distribution Tree(DT)        \   Campus   /
                               \/\/\/\/\/\/
                                 |      |
                              +-----+  +-----+
                              | RB1 |--| RB2 |(Aggregation Members)
                              +-----+  +-----+
                                  \      /
                                  *******
                                  * RBv * (Virtual RBridge)
                                  *******
                                    ||(LAG)
                                   +----+
                               +---| CE |---+
                               |   +----+   |
                               |            |
                               |[H1][H2]... |
                               +------------+
                                   VLAN-x
Figure 2.1: An Example Topology of RBridge Aggregation
RBridge Aggregation was first proposed in [RBAgg]. It enables
active-active multi-homing for LAN links: several RBridges can
ingress/egress data frames for end-stations of one VLAN on a LAN
link, which increases the access bandwidth and reliability of the
TRILL edge.
Figure 2.1 shows an example topology of RBridge Aggregation, with
the distribution tree on the left (assuming that the transit RBridge
campus is empty). Based on the distribution tree, multicast frames
from RBv to RBi are expected to be received at the port attached to
RB1.
Under RBridge Aggregation, RB2 can also ingress native data frames
from the LAG links, so multicast frames from RBv to RBi may
legitimately be received at the port attached to RB2. These frames
will be discarded according to the rule of the Reverse Path
Forwarding Check [RFC6325]. Active-active forwarding of multicast
frames is the root cause of this issue. The rest of this document
will make use of
multiple topology TRILL to solve this problem.
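The failure described in this section can be illustrated with a
minimal Python sketch. It assumes the empty-transit case of Figure
2.1 and models the RPFC state at RBi as a table that maps an ingress
nickname to the single expected neighbor. The names expected_port and
rpfc_accepts are hypothetical and are not taken from [RFC6325].

```python
# RPFC state at RBi for the single distribution tree of Figure 2.1:
# frames whose ingress nickname is RBv are expected only via RB1.
# (Hypothetical model; real RBridges key this on tree and port.)
expected_port = {"RBv": "RB1"}


def rpfc_accepts(ingress_nickname, arrival_port):
    """Return True if the frame arrived on the one expected port."""
    return expected_port.get(ingress_nickname) == arrival_port


# RB1 ingresses a multicast frame on behalf of RBv: accepted by RBi.
via_rb1 = rpfc_accepts("RBv", "RB1")
# RB2 ingresses another frame on behalf of RBv: dropped by RBi.
via_rb2 = rpfc_accepts("RBv", "RB2")
```

With one tree and one nickname for RBv, only one of the two
aggregation members can deliver multicast frames that RBi accepts.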
3. Multi-Topology for Aggregation
Documents [MTc] and [MTd] define the protocol extensions, data plane
encoding and procedures to make use of the multiple topology routing
supported by IS-IS. Multiple topology routing provides physical
traffic segregation to TRILL, which is used here to solve the RPFC
issue caused by RBridge Aggregation. The RPFC is performed on
distribution trees, which MT aware RBridges calculate per topology
abbreviation.
Topology IDs are used to identify the aggregation members. If the
number of available topologies is greater than the number of
aggregation members, several topology IDs can be assigned to one
aggregation member, which can use these topologies for
load-balancing. If there are fewer available topologies than
aggregation members, some members get no topology ID. These standby
aggregation members can use the tunneling approach defined in
Section 3.4 to redirect arriving data frames to other members for
forwarding.
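The assignment rule above can be sketched in Python as follows. The
round-robin policy and the function name assign_topologies are
assumptions made for illustration; this document does not mandate any
particular assignment policy.

```python
def assign_topologies(members, topology_ids):
    """Spread topology IDs over aggregation members round-robin.

    Returns (assignment, standby): assignment maps each member to the
    list of topology IDs it owns; standby lists the members that were
    left without any topology ID.
    """
    if not members:
        return {}, []
    assignment = {m: [] for m in members}
    for i, tid in enumerate(topology_ids):
        assignment[members[i % len(members)]].append(tid)
    standby = [m for m in members if not assignment[m]]
    return assignment, standby


# More topologies than members: RB1 owns two and can load-balance.
extra, no_standby = assign_topologies(["RB1", "RB2"], [1, 2, 3])
# Fewer topologies than members: RB5 becomes a standby member.
_, standby = assign_topologies(
    ["RB1", "RB2", "RB3", "RB4", "RB5"], [1, 2, 3, 4])
```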
Table 3.1: A Sample Configuration for Aggregation
+------------+---------+-------+
|Aggregation | RBv's | LAG |
| Members |Nicknames|Members|
+------------+---------+-------+
| RB1 |...001...|RB1-RBv|
+------------+---------+-------+
| RB2 |...010...|RB2-RBv|
+------------+---------+-------+
| RB3 |...011...|RB3-RBv|
+------------+---------+-------+
| RB4 |...100...|RB4-RBv|
+------------+---------+-------+
Since multiple topology TRILL identifies a topology using the ingress
nickname [MTd], the topology assignment among aggregation members is
embodied in the nickname configuration of RBv. Table 3.1 shows
a typical configuration of RBridge Aggregation with 4 members. Each
aggregation member ingresses native frames using one nickname of RBv.
These frames are confined to the topology that the nickname
indicates. For example, when RB1 ingresses native frames from the
local link, it uses RBv001 as the ingress nickname and these
frames are forwarded in topology 1.
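As a rough illustration of how a nickname can carry the topology, the
Python sketch below embeds a 3-bit topology abbreviation in a 16-bit
nickname. The bit position, the field width and the base value
RBV_BASE are assumptions chosen for the example; the actual encoding
is defined in [MTd].

```python
# Hypothetical layout: the topology abbreviation occupies the low
# 3 bits of the 16-bit nickname. The real position and width are
# defined by [MTd], not by this sketch.
TOPOLOGY_BITS = 3
TOPOLOGY_MASK = (1 << TOPOLOGY_BITS) - 1


def make_nickname(base, topology):
    """Embed a topology abbreviation into RBv's base nickname."""
    return (base & ~TOPOLOGY_MASK & 0xFFFF) | (topology & TOPOLOGY_MASK)


def topology_of(nickname):
    """Recover the topology abbreviation from an ingress nickname."""
    return nickname & TOPOLOGY_MASK


RBV_BASE = 0x1230  # hypothetical base nickname for RBv
# Table 3.1: RB1..RB4 use the abbreviations 001, 010, 011 and 100.
rbv_nicknames = {f"RB{n}": make_nickname(RBV_BASE, n) for n in range(1, 5)}
```

A remote RBridge that applies topology_of() to the ingress nickname
of a received frame learns which distribution tree to use for the
RPFC.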
The rest of this section discusses multicast forwarding. For the
details of unicast forwarding, see [RBAgg].
3.1. Multicast Ingressing
        RBi                    RBi
       /   \                  /   \
    RB1     RB2            RB1     RB2
    /                                \
 RBv001                             RBv010
 DT for Topology_1         DT for Topology_2
Figure 3.1: Sample Distribution Trees for Topology 1 and 2
A LAG may use any of its links as the active link to send frames to
a member of the RBridge Aggregation. The receiver SHOULD encapsulate
the native frames on behalf of RBv. In Figure 3.1, for example, RB1
and RB2 encapsulate native frames using RBv001 and RBv010 as their
ingress nicknames respectively. If these frames are multicast frames,
they are forwarded according to the distribution trees calculated
per topology. Since RBi calculates two different distribution trees
for RBv001 and RBv010, multicast frames arriving at the ports
attached to RB1 and RB2 can all pass the RPFC.
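The per-topology check can be sketched in Python as below. The table
mirrors the two distribution trees of Figure 3.1 as seen from RBi;
the table layout and the name rpfc_accepts are illustrative
assumptions rather than a prescribed data structure.

```python
# Expected arrival port at RBi per ingress nickname. Each nickname of
# RBv selects its own topology and thus its own distribution tree.
expected_port = {
    "RBv001": "RB1",  # DT for Topology_1
    "RBv010": "RB2",  # DT for Topology_2
}


def rpfc_accepts(ingress_nickname, arrival_port):
    """Return True if the frame arrived on the port its tree expects."""
    return expected_port.get(ingress_nickname) == arrival_port


# (ingress nickname, port at RBi on which the frame arrives)
frames = [("RBv001", "RB1"), ("RBv010", "RB2"), ("RBv001", "RB2")]
results = [rpfc_accepts(nick, port) for nick, port in frames]
```

Frames ingressed by RB1 and RB2 both pass, while a frame that carries
RBv001 but arrives from RB2 is still dropped, so the loop protection
of the RPFC is kept within each topology.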
3.2. Multicast Egressing
Since distribution trees are built per topology, a multicast frame
will be received by only one aggregation member. This member should
egress the multicast frame to the local link on behalf of RBv.
Remote RBridges are not aware that RBv does not actually exist; all
aggregation members act as penultimate hops to RBv in the campus.
3.3. Address Flip-Flop Avoidance by Asymmetric Topologies
+-------+------+------+      +-------+------+------+
|VLAN ID|MacDA |Egress|      |VLAN ID|MacDA |Egress|
+-------+------+------+ ---> +-------+------+------+
|VLAN-x |Mac_H1|RBv001|      |VLAN-x |Mac_H1|RBv010|
+-------+------+------+      +-------+------+------+
Figure 3.2: An Example of MAC Address Flip-Flop
In the above ingressing procedure, native data frames from one end
station may be ingressed to the campus by different aggregation
members. Current RBridges do not have the topology abbreviation as a
separate column in their MAC tables. Therefore, when a remote RBridge
receives multicast frames with the same source MAC address from
different aggregation members, these multicast frames will create
only one entry in the MAC table of this remote RBridge.
As illustrated in Figure 3.2, when frames originating from H1 are
sent to RBi from RB1, RBi learns that the egress RBridge nickname for
Mac_H1 is RBv001. Afterwards, if RB2 sends frames originating from H1
to RBi, the egress RBridge nickname changes to RBv010. The use of
multiple topology TRILL therefore appears to introduce a MAC address
flip-flop issue. If RBv001 and RBv010 are regarded as two different
egress RBridges and RBi computes paths to them separately, RBi may
obtain different forwarding paths. In other words, RBi will use
different forwarding paths in different topologies for data frames
destined to the same end-station, which may cause packet
reordering.
However, MT aware RBridges support asymmetric use of topologies
[MTd]. In the above example, RBi can send data frames to Mac_H1
according to topology 1 even if it learns Mac_H1 from the data flow
in topology 2. That is, RBi can always send return data frames to
RBv001. In practice, remote RBridges SHOULD keep to one specific
topology when sending return data frames destined to a
specific MAC address.
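The difference between plain learning and keeping to one topology can
be sketched in Python as follows. The MacTable class and its
pin_first_egress option are hypothetical; in particular, pinning to
the first learned nickname is only one possible way to choose the
specific topology, which this document leaves open.

```python
class MacTable:
    """MAC table keyed by (VLAN, MAC), with no topology column."""

    def __init__(self, pin_first_egress=False):
        self.entries = {}
        self.pin_first_egress = pin_first_egress

    def learn(self, vlan, mac, ingress_nickname):
        key = (vlan, mac)
        if self.pin_first_egress and key in self.entries:
            return  # keep sending return traffic in the first topology
        self.entries[key] = ingress_nickname

    def egress_for(self, vlan, mac):
        return self.entries.get((vlan, mac))


def replay(table):
    """Feed H1's frames alternately via RB1 (RBv001) and RB2 (RBv010)."""
    seen = []
    for nickname in ["RBv001", "RBv010", "RBv001"]:
        table.learn("VLAN-x", "Mac_H1", nickname)
        seen.append(table.egress_for("VLAN-x", "Mac_H1"))
    return seen


flip_flop = replay(MacTable())                    # entry keeps changing
pinned = replay(MacTable(pin_first_egress=True))  # entry stays RBv001
```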
3.4. Tunneling Approach
If there are fewer available topologies than aggregation members,
there will be standby members that get no topology ID. These members
can still ingress native frames from the LAG directly, but they
should redirect them to other members through the following
tunneling approach.
Suppose RB5 is a standby member of the aggregation, so it is not a
parent of RBv on the distribution tree of any topology. Assume RB5
tunnels native frames from the LAG to RB1, which is the parent of RBv
in topology 1. RB5 should encapsulate the native frame, setting the
egress nickname to RB1 and the ingress nickname to RBv001, the
nickname of RBv used by RB1. RB5 then sends this frame as a unicast
frame to RB1. When RB1 receives this unicast frame, it can tell from
the ingress nickname that the frame should actually be ingressed by
RB1. Therefore, RB1 decapsulates the frame and re-encapsulates it as
if it had been received from the LAG link RBv-RB1.
For the sake of load-balancing and resilience, it is recommended that
standby RBridges tunnel their multicast frames evenly among the
aggregation members that have topology IDs. The optimization of the
tunneling configuration is outside the scope of this document. The
tunneling approach can also be used for other purposes such as
fail-over. However, in this document, tunneling is used only to
redirect ingress multicast frames so that they pass the RPFC in
TRILL.
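The two tunneling steps can be sketched in Python as below. The
TrillFrame class is a simplified stand-in for the TRILL header, and
the tree root name ROOT1 is a placeholder; both are assumptions for
illustration only. For multi-destination frames the egress nickname
field carries the distribution tree root [RFC6325].

```python
from dataclasses import dataclass


@dataclass
class TrillFrame:
    ingress: str      # ingress nickname in the TRILL header
    egress: str       # egress nickname (tree root for multicast)
    multicast: bool
    payload: bytes


def tunnel_from_standby(native, active_member, rbv_nickname):
    """Standby member (e.g. RB5): unicast the frame to an active member."""
    return TrillFrame(ingress=rbv_nickname, egress=active_member,
                      multicast=False, payload=native)


def reingress(frame, own_name, own_rbv_nickname, tree_root):
    """Active member (e.g. RB1): decapsulate and re-encapsulate as if
    the native frame had arrived on its own LAG link."""
    if frame.egress != own_name or frame.ingress != own_rbv_nickname:
        return None  # not a tunneled frame meant for this member
    return TrillFrame(ingress=own_rbv_nickname, egress=tree_root,
                      multicast=True, payload=frame.payload)


tunneled = tunnel_from_standby(b"native", "RB1", "RBv001")
final = reingress(tunneled, "RB1", "RBv001", tree_root="ROOT1")
```

After re-encapsulation the frame is indistinguishable from one that
RB1 ingressed itself, so it follows the distribution tree of topology
1 and passes the RPFC at remote RBridges.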
4. Incremental Deployment
When RBridge Aggregation is put to use in a TRILL campus, it is
probable that MT unaware RBridges have already been deployed in this
campus. It is therefore necessary to enable the interoperation of
these two types of RBridges. On one hand, MT aware aggregation
members MUST be backward compatible with legacy MT unaware
RBridges. On the other hand, legacy RBridges need not make any change
in order to communicate with aggregation members.
With multi-topology TRILL, RBridge Aggregation can be incrementally
deployed in an RBridge campus. The rest of this section provides
approaches for three incremental deployment scenarios: (1)
aggregation members need not talk with MT unaware RBridges; (2)
aggregation members need to communicate with MT unaware RBridges; (3)
a combination of the above two scenarios.
4.1. Intra-Topology Communication
+------------------------+
|Topology_0     RBx      |
|                        |
+------------------------+
|Topology_1     RBi      |
|              /   \     |
|           RB1     RB2  |
|           /            |
|       RBv001           |
+------------------------+
|Topology_2     RBi      |
|              /   \     |
|           RB1     RB2  |
|                     \  |
|                  RBv010|
+------------------------+
Figure 4.1: Aggregation Members Talk with MT aware RBridges Only
If MT aware aggregated RBridges do not talk with MT unaware RBridges,
aggregation traffic can be confined to non-zero topologies. This kind
of traffic segregation is achieved through multi-topology routing. As
illustrated in Figure 4.1, when RB1 and RB2 forward multicast frames
to RBi according to distribution trees for topology 1 and topology 2
respectively, the MT unaware RBx will not receive these frames from
RBv. When RB1 and RB2 advertise LSPs in the base topology, they will
not include their adjacencies to RBv001 and RBv010, therefore RBx
will not be aware of RBv001 and RBv010. In particular, nickname
RBv000 SHOULD be reserved and not used in aggregation configuration
so that, even if RBx can reach RBv000, they will not talk with each
other.
4.2. Inter-Topology Communication
+------------------------+
|Topology_0              |
|               RBi      |
|              /   \     |
|           RB1     RB2  |
|           /         \  |
|       RBv001     RBv010|
+------------------------+
Figure 4.2: MT aware RBridges Need to Talk with MT unaware RBridges
If MT aware aggregated RBridges need to talk with MT unaware
RBridges, the traffic segregation method in Section 4.1 can no longer
be used. Since MT unaware RBridges only understand the base
topology, all aggregated RBridges advertise their connections to RBv
in the base topology. Figure 4.2 illustrates this approach. Assume
RBi is the MT unaware RBridge. RB1 and RB2 advertise their
adjacencies to RBv, i.e., "RB1-RBv001" and "RB2-RBv010", in their
LSPs. When RBi calculates the distribution tree, the result should be
as shown in Figure 4.2. RBv001 and RBv010 are regarded as two
different RBridges by RBi.
Hash functions are widely used in LAG for the purpose of load
balancing. In a corner case, native data packets from one end-station
may be mapped to any aggregated member. As in the example shown
in Figure 3.2, the egress of Mac_H1 at RBi may change back and forth
between RBv001 and RBv010. When MAC address flip-flop happens, the MT
unaware RBi is unable to use asymmetric topologies to send return
TRILL data frames back to aggregated members. TRILL data frames
destined to RBv001 and RBv010 may go through two different forwarding
paths. Although this kind of MAC flip-flop is rare in real TRILL
campuses, it is recommended that the hash function be configured to
avoid it completely. The configuration of the LAG should guarantee
that native data frames from one end-station are mapped to only one
aggregated member (one active link in the LAG). The destination
IP/MAC address fields of a native data frame SHOULD NOT be used as
input to the hash function.
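A LAG hash that satisfies this recommendation can be sketched in
Python as follows. The use of CRC-32 and the function name
pick_lag_link are assumptions for illustration; real LAG
implementations choose their own hash, and the MAC address shown is
made up.

```python
import zlib


def pick_lag_link(src_mac, links):
    """Choose a LAG link from the source MAC only, so that all frames
    from one end-station reach the same aggregation member."""
    digest = zlib.crc32(src_mac.lower().encode("ascii"))
    return links[digest % len(links)]


links = ["RB1", "RB2"]
h1 = "00:11:22:33:44:55"  # hypothetical MAC address of H1
# Destination fields are never consulted, so every frame from H1
# maps to the same member regardless of where it is going.
choices = {pick_lag_link(h1, links) for _ in range(4)}
```

Because the destination address is not part of the hash input,
traffic from H1 to different destinations cannot be split across RB1
and RB2, and RBi sees a single egress nickname for Mac_H1.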
4.3. A Hybrid Scenario
It is allowable that some aggregated members report their connections
to RBv in the base topology while others do not. For aggregated
members that do not report their connections to RBv in the base
topology, they need to tunnel multicast frames to those members that
report their connections to RBv in the base topology in order to
communicate with MT unaware RBridges. For example, RB1 and RB2
advertise their connections to RBv001 and RBv010 in topology 1 and
2, while RB0 advertises the adjacency to RBv000 in topology 0. Assume
RBi is an MT unaware RBridge. The distribution tree calculated by RBi
will include RBv000 but not RBv001 or RBv010. RB0 can
talk with RBi directly on behalf of RBv000. When RB1 and RB2
communicate with MT aware RBridges, they can confine the traffic to
topology 1 and 2. If RB1 and RB2 need to send TRILL data frames to MT
unaware RBridges, such as RBi, they should redirect these frames to
RB0 using the tunneling approach described in Section 3.4. RB0 will
send these frames with RBv000 as their ingress nickname.
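The decision made by an aggregation member in the hybrid scenario can
be sketched in Python as below. The function next_step and its
parameters are hypothetical, and the sketch assumes that the member
already knows whether the destination RBridge is MT aware; how that
knowledge is obtained is not covered here.

```python
def next_step(sender, dest_is_mt_aware, base_members, tunnel_target="RB0"):
    """Decide how an aggregation member handles a frame it ingresses.

    base_members: members that advertise their adjacency to RBv in
    the base topology (topology 0).
    Returns ("ingress", member) or ("tunnel", member).
    """
    if dest_is_mt_aware or sender in base_members:
        return ("ingress", sender)
    # An MT unaware destination only knows the base topology, so the
    # frame is redirected to a member that is visible there.
    return ("tunnel", tunnel_target)


base = {"RB0"}
to_mt_aware = next_step("RB1", True, base)
to_legacy = next_step("RB1", False, base)
from_rb0 = next_step("RB0", False, base)
```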
5. Security Considerations
This document raises no new security issues for IS-IS.
6. IANA Considerations
This document requests no new registry or assignment from IANA.
7. Acknowledgements
Discussions with the authors and contributors of [Pseudo] and [CMT]
greatly helped the writing of this draft. This document is by no
means intended to replace such solutions for relaxing the RPFC.
Those solutions are designed for the TRILL base topology and can be
used in the same RBridge campus in parallel with the solution
presented in this document.
8. References
8.1. Normative References
[RBAgg] M. Zhang, D. Eastlake, et al., "RBridge Aggregation",
draft-zhang-trill-aggregation-01.txt, work in progress.
[RFC6325] R. Perlman, D. Eastlake, et al., "Routing Bridges
(RBridges): Base Protocol Specification", RFC 6325, July 2011.
[MTc] V. Manral, D. Eastlake, et al., "Multiple Topology
Routing Extensions for Transparent Interconnection of Lots
of Links (TRILL)", draft-manral-isis-trill-multi-topo-03.txt,
work in progress.
[MTd] D. Eastlake, M. Zhang, et al., "Multiple Topology TRILL",
draft-eastlake-trill-rbridge-multi-topo-02.txt, work in
progress.
8.2. Informative References
[RFC5120] Przygienda, T., Shen, N., and N. Sheth, "M-ISIS: Multi
Topology (MT) Routing in Intermediate System to
Intermediate Systems (IS-ISs)", RFC 5120, February 2008.
[RFC5714] Shand, M. and S. Bryant, "IP Fast Reroute Framework", RFC
5714, January 2010.
[Pseudo] H. Zhai, F. Hu, et al., "RBridge: Pseudonode Nickname",
draft-hu-trill-pseudonode-nickname-01.txt, work in
progress.
[CMT] T. Senevirathne, J. Pathangi, et al., "Coordinated Multicast
Trees (CMT) for TRILL", draft-tissa-trill-cmt-00.txt,
work in progress.
Author's Addresses
Mingui Zhang
Huawei Technologies Co., Ltd.
Huawei Building, No.156 Beiqing Rd.
Z-park, Shi-Chuang-Ke-Ji-Shi-Fan-Yuan, Hai-Dian District,
Beijing 100095 P.R. China
Email: zhangmingui@huawei.com