Routing Area Working Group G. Enyedi, Ed.
Internet-Draft Ericsson
Intended status: Standards Track October 21, 2013
Expires: April 24, 2014

Artificial MRT Islands for Keeping Detours Local
draft-enyedi-rtgwg-mrt-local-detour-00

Abstract

IP and LDP Fast ReRoute using Maximally Redundant trees was defined in [I-D.ietf-rtgwg-mrt-frr-architecture]. In this document we add a simple extension to that technique, which can guarantee to keep detours in the part of the network, where the failure happened.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on April 24, 2014.

Copyright Notice

Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.


Table of Contents

1. Introduction

In the case of failure, Fast ReRoute using Maximally Redundant Trees (MRT-FRR) [I-D.ietf-rtgwg-mrt-frr-architecture] use detours defined by two maximally redundant trees, which are not related to shortest paths at all. Although there are heuristics for decreasing path lengths, which are sufficient in almost all situations, there is no strict guarantee to keep failure handling local. This means that detours may cause temporal congestion even in those parts of the network, which are far from the original failure.

This document defines a possible solution by using multiple pairs of maximally redundant trees in an IGP area. Our solution introduce artificial "subareas", each having its own recovery, and use them to provide the best possible protection. If both the Point of Local Repair (PLR) and the destination are in the same subarea, the detour can simply use one of the trees of that subarea, in this way never leaving the surroundings of the failure.

2. Terminology and Definitions

Maximally Redundant Trees (MRT):
A pair of trees where the path from any node X to the root R along the first tree and the path from the same node X to the root along the second tree share the minimum number of nodes and the minimum number of links. Each such shared node is a cut-node. Any shared links are cut-links.
2-connected:
A graph that has no cut-nodes. This is a graph that requires at least two nodes to be removed before gets partitioned.
block:
Either a maximally 2-connected (induced) subgraph, a cut-link with with its endpoints, or an isolated node.
DAG:
Directed Acyclic Graph - a digraph containing no directed cycle.
ADAG:
Almost Directed Acyclic Graph - a digraph that can be transformed into a DAG whith removing a single node (the root node).
GADAG:
Generalized ADAG - a digraph, which has only ADAGs as all of its blocks.
PLR:
Point of Local Repair - the node neighboring the failed resource (which can be a node or a link), and which do the rerouting.
Cut-node:
A node is a cut-node, if removing it partitions the network.
Cut-link:
A link is a cut-link, if removing it partitions the network.

3. Problem Description

Consider the network and GADAG depicted in Figure 1 (for GADAG computation and finding FRR paths using a GADAG consult [I-D.enyedi-rtgwg-mrt-frr-algorithm]). Suppose that node H wants to send some packets to node I, but the link between them went down. Since node H is definitely lesser than node I, the detour must be the one that goes through R, i.e. H->G->C->B->A->R->F->E->J->I, even if there was a much shorter one through node D. The problem here is not that such path is long, but that the traffic may get far away from the failure, thus congestion may occur in any part of the network.

----F   J----         ----F    J<---
|   |   |   |         |   ^    |   |
|   |   |   |         V   |    |   |
R   --E--   I         R   --E<--   I
|     |     |         |     ^      ^
|     |     |         |     |      |
|     D     |         |     D      |
|     |     |         |     ^      |
|     |     |         V     |      |
A   --C--   H         A   ->C--    H
|   |   |   |         |   |   |    ^
|   |   |   |         |   |   V    |
----B   G----         --->B   G-----
  
  A network       The GADAG rooted at R
            

Network with GADAG rooted at R.

Figure 1

There are already possibilities to mitigate the problem presented above. First, there are heuristics that can be applied in order to decrease path lengths, thus paths in real networks are usually using detours not much longer than the shortest paths. Even when heuristics cannot help (like in the network above), there is still some room for optimization by selecting the GADAG root better. E.g. in the previous example selecting node D as a GADAG root can solve the problem (however it cannot solve anything if we add one more ring connected to A and R). Finally, note that congestion can be caused only while IGP is doing the recovery, which is quite fast in most of the cases, in this way reducing the severity of this situation.

This document describes a mechanism to give strict guarantees that detour does not get far from the failure. This can be applied for special situations, when detours would be too long otherwise or in networks, where bandwidth guarantees are needed to be fulfilled in all cases.

4. Artificial Islands

MRT-FRR capable routers can handle multiple MRT profiles. MRT profiles was introduced for handling routers with different capabilities, even those, which do not support MRT-FRR at all. Routers supporting the same profile create an MRT-FRR island in the IGP area.

Each such island has its GADAG and its own redundant trees, which are only valid in that island. If a packet gets out from an island, it gets back to the shortest path. If the destination (or the area/AS border router) is inside the island where the failure happened, it is guaranteed that packet will never leave the island. Basically islands are subareas with their own protection.

Currently, islands are there only for handling capability differences between routers. In this document, we introduce artificial islands, which are limiting packet detours to a part of the network. In order to define such an island, operator needs only to configure routers in the desired subarea to advertise one more profile, which is is not supported by any other router in the network. Naturally, this profile does not need to describe new capabilities, but it can differ from other profiles just in some extra ID field. Therefore profile descriptor must be extended with such ID field.

Operators may define islands arbitrary; the only restriction is that such islands must be connected, otherwise they would be considered as multiple connected islands. Similarly, if an island is split into disjunct parts due to some failure, such parts can be handled as disjunct islands. As an example, operators can define one island containing the whole IGP area, and some smaller ones for keeping up local failure handled when needed. When a failure can be handled locally, a "small" island is used, while there is still the "big" island containing the whole area for the remaining cases.

As an example of this scheme, consider the the previous network, and define artificial islands in it in order to keep packets inside the ring where the failure has happened. The network and the two islands defined are depicted below. Note that there should be a third island that contains all the nodes to handle failures that cannot be handled locally; this third island is not depicted in Figure 2. Also note that having this third island is not mandatory, it can be not configured, if operator wants to disable global protection for some reason.

************     ************
*           *   *           *
*  ----F     * *     J----  *
*  |   |      *      |   |  *
*  |   |     * *     |   |  *
*  R   -------E-------   I  *
*  |        * | *        |  *
*  |        * | *        |  *
*  |        * D *        |  *
*  |        * | *        |  *
*  |        * | *        |  *
*  A   -------C-------   H  *
*  |   |     * *     |   |  *
*  |   |      *      |   |  *
*  ----B     * *     G----  *
*           *   *           *
*  Island1  *   *  Island2  *
*           *   *           *
*************   *************
  
            

A network made up by two islands

Figure 2

Observe that Island1 and Island2 are not disjunct, instead both of them are containing node C, D and E, in this way making both islands 2-connected. This overlapping is the main difference compared to the situation when an area is split into two using MRT-ineligible links; if we would only disable the MRT capability for some links in this network, at least one of the resulting "subareas" (islands) would be not 2-connected, thus protection would be impossible in at least one of them. Moreover thanks to overlapping, it is possible to define the third island and use it to provide protection when the PLR and the destination are not in the same ring (i.e. there is no local detour).

Although it is useful for protection, note that operator do not always need to find 2-connected artificial islands, if there are considerations other than maximum protection. Consider Figure 3 and suppose that in this network link L-F is not wanted to be used for local protection for some reason. In this case, the two artificial islands for local protection are selected as depicted below. If node M is going down, Island2 is split into two, so no local protection is possible in this case. (Naturally, selecting Island2 is not pointless, if any link or any other node than M is going down, there is still local protection.)

		         ************
       --------------L--K   *
       |         *   | /    *
*******|****     *   M      *
*      |    *   *    |      *
*  ----F     * *     J----  *
*  |   |      *      |   |  *
*  |   |     * *     |   |  *
*  R   -------E-------   I  *
*  |        * | *        |  *
*  |        * | *        |  *
*  |        * D *        |  *
*  |        * | *        |  *
*  |        * | *        |  *
*  A   -------C-------   H  *
*  |   |     * *     |   |  *
*  |   |      *      |   |  *
*  ----B     * *     G----  *
*           *   *           *
*  Island1  *   *  Island2  *
*           *   *           *
*************   *************
  
            

A network made up by two islands and a cut-node

Figure 3

If a router is in multiple islands, selecting one for a concrete failure case is a local decision of the node and not defined in this document. Vendors may make it possible to assign priority at each router for the artificial islands created. Moreover, routers may take other differences into consideration as well, e.g. if there is node protecting path in one island but only link protecting in another one.

Selecting the endpoint of detour is a local decision of MRT-FRR capable routers, it is not needed to select always the destination/border router as the endpoint, especially when not all the routers are supporting MRT-FRR and islands are formed (for details see [I-D.ietf-rtgwg-mrt-frr-architecture]). If there are artificial islands the only difference is that a router may belong to multiple islands, so it may take into consideration all of those islands and select the best for that failure with respect to arbitrary local preference.

For MRT-FRR, each router must have two extra addresses/labels per profile it supports. The situation is the same if there are artificial islands applied, since a router in multiple islands must compute the MRTs for each of those islands, and it must be able to decide which of these trees is used.

5. IANA Considerations

This document includes no request to IANA.

6. Security Considerations

This architecture is not currently believed to introduce new security concerns.

7. Normative References

[I-D.ietf-rtgwg-mrt-frr-architecture] Atlas, A., Kebler, R., Envedi, G., Csaszar, A., Tantsura, J., Konstantynowicz, M., White, R. and M. Shand, "An Architecture for IP/LDP Fast-Reroute Using Maximally Redundant Trees", Internet-Draft draft-ietf-rtgwg-mrt-frr-architecture-02, February 2013.
[I-D.enyedi-rtgwg-mrt-frr-algorithm] Atlas, A., Envedi, G., Csaszar, A. and A. Gopalan, "Algorithms for computing Maximally Redundant Trees for IP/LDP Fast- Reroute", Internet-Draft draft-enyedi-rtgwg-mrt-frr-algorithm-02, October 2012.

Author's Address

Gábor Sándor Enyedi (editor) Ericsson Konyves Kalman krt 11 Budapest, 1097 Hungary EMail: Gabor.Sandor.Enyedi@ericsson.com