Network Working Group Shankar Raman Internet-draft Balaji Venkat Venkataswami Intended Status: Standards Track Gaurav Raina Expires: September 2012 I.I.T Madras. March 25, 2012 PIM ECMP Redirect based on Linecard Replication Capacity and Power draft-mjsraman-pim-ecmp-redirect-power-replic-cap-00 Abstract This work derives itself from [1] which proposes a ECMP redirect from a PIM upstream neighbor that instructs or advices the PIM downstream neighbor to choose another of its own ECMP links between the former and the latter. What we propose in this document is a criterion based on power consumed in the linecards that comprise the ECMP links between the former and the latter. Also the multicast replication capacity available within the said linecards which form the ECMP links between the two is taken into consideration while making a ECMP redirect. A PIM router uses RPF procedure to select an upstream interface and router to build forwarding state. When there are equal cost multiple paths (ECMP), existing implementations often use hash algorithms to select a path. Such algorithms do not allow the spread of traffic among the ECMPs according to administrative metrics. This usually leads to inefficient or ineffective use of network resources. This document introduces the ECMP Redirect, a mechanism to improve the RPF procedure over ECMPs. It allows ECMP path selection to be based on administratively selected metrics, such as data transmission delays, path preferences and routing metrics. Status of this Memo This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Shankar Raman et.al Expires September 2012 [Page 1] INTERNET DRAFT Replication capability based PIM redirect March 2012 Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/1id-abstracts.html The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html Copyright and License Notice Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.1 Terminology . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Conditions where this mechanism applies. . . . . . . . . . . . 3 3 Security Considerations . . . . . . . . . . . . . . . . . . . . 6 4 IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 6 5 References . . . . . . . . . . . . . . . . . . . . . . . . . . 6 5.1 Normative References . . . . . . . . . . . . . . . . . . . 6 5.2 Informative References . . . . . . . . . . . . . . . . . . 6 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 6 Shankar Raman et.al Expires September 2012 [Page 2] INTERNET DRAFT Replication capability based PIM redirect March 2012 1 Introduction This work derives itself from [1] which proposes a ECMP redirect from a PIM upstream neighbor that instructs or advices the PIM downstream neighbor to choose another of its own ECMP links between the former and the latter. What we propose in this document is a criterion based on power consumed in the linecards that comprise the ECMP links between the former and the latter. Also the multicast replication capacity available within the said linecards which form the ECMP links between the two is taken into consideration while making a ECMP redirect. A PIM router uses RPF procedure to select an upstream interface and router to build forwarding state. When there are equal cost multiple paths (ECMP), existing implementations often use hash algorithms to select a path. Such algorithms do not allow the spread of traffic among the ECMPs according to administrative metrics. This usually leads to inefficient or ineffective use of network resources. This document introduces the ECMP Redirect, a mechanism to improve the RPF procedure over ECMPs. It allows ECMP path selection to be based on administratively selected metrics, such as data transmission delays, path preferences and routing metrics. As mentioned earlier this document also proposes the use of the power being consumed by the linecards (if the ECMP links fall on multiple linecards with respect to their ECMP link ports) and the available replication capacity of the linecard within its ASICs as a measure of which ECMP link to which the PIM-Join is to be redirected to. If for example there exist multiple replication engines within different linecards then the lightly loaded replication engine and its corresponding ECMP link on that linecard can be recommended to the PIM downstream neighbor which sends the upstream neighbor a PIM join for a specific (S,G) or (*,G) group. 1.1 Terminology The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119]. 2. Conditions where this mechanism applies. Assume there are multiple ECMP links between a PIM upstream and Shankar Raman et.al Expires September 2012 [Page 3] INTERNET DRAFT Replication capability based PIM redirect March 2012 downstream router. Lets assume they fall on different linecards on the upstream neighbor with respect to the port placement of these ECMP links. Assume one linecard A, which is already carrying multiple data streams in either the incoming direction on its ports and replicating it to multiple other outgoing linecards through the switch fabric. Assume another linecard B that is lightly loaded with respect to multicast traffic and heavily loaded with respect to the unicast traffic. Assume another linecard C which is lightly loaded with respect to both. Now also assume that these linecards A,B and C are all members of the group of linecards whose ports place themselves within the ECMP links between the upstream and downstream neighbor. It is possible to decipher from this is that linecard C would be a better point of placement of the replication for the group for which the PIM-Join comes from the downstream neighbor. Assume now that the downstream neighbor selects the linecard A. It is now possible as a result of using the mechanism dictated to in [1], to send a redirect to the downstream neighbor that it would be a better choice to choose linecard C. This it does by recommending the neighbor address field as the port IP address which falls on linecard C. The PDU format for the ECMP redirect specified in [1] and the procedures that go along with it follow. It is also possible to make this decision in combination with the above or solely on a metric which we will call PWR-REPLIC-CAP. This metric is derived as follows... PWR-REPLIC-CAP = Power Consumed on that linecard ------------------------------- Available replication capacity on the linecard. In the metric case the lowest PWR-REPLIC-CAP metric is chosen. This optimizes on the power being spent on replication to the extent possible. It is important to note that a linecard may have multiple replication engines and ports assigned to each replication engine or to all of them. It is possible that the ECMP links and their ports fall on the same linecard and in that case it would be possible to choose a specific replication engine from among the multiple replication engines available. that has better available replication capacity by choosing a neighbor address belonging to the port that falls in a set that belongs to the superior replication engine (with respect to the available replication capacity at that point in time). If there is no distinction amongst ports with respect to the multiple replication then it actually makes no difference since it would be an internal decision as to where the replication for that port actually happens. Shankar Raman et.al Expires September 2012 [Page 4] INTERNET DRAFT Replication capability based PIM redirect March 2012 It is also important to note here that the linecards that are multicast capable have well advertised replication capacities which the vendors advice in the linecard data sheets. This could be placed in a variable that can be monitored for shifts within intervals of threshold values. The same goes for the power consumed by the linecards as well. Pseudo code for the steps to be followed to implement this scheme is as follows... if (multiple ECMP links exist to the PIM neighbor from which PIM-Join was received) then Get the list of ECMP ports which are members of the ECMP links; Get the list of linecards on which these ECMP ports are placed; LCA = Consider the least used linecard with respect to the replication capacity; if (port on LCA is the same on which PIM-Join was received) do nothing; return; else if (choice is to be made on replication capacity) LCA = Consider the least used linecard with respect to replication capacity; else if (choice is to be made on PWR-REPLIC-CAP) LCA = Consider the linecard with the best PWR-REPLIC-CAP metric; end if Send PIM redirect to PIM downstream neighbor recommending LCA ECMP link; end if Shankar Raman et.al Expires September 2012 [Page 5] INTERNET DRAFT Replication capability based PIM redirect March 2012 3 Security Considerations 4 IANA Considerations 5 References 5.1 Normative References [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. [RFC1776] Crocker, S., "The Address is the Message", RFC 1776, April 1 1995. [TRUTHS] Callon, R., "The Twelve Networking Truths", RFC 1925, April 1 1996. 5.2 Informative References [1] Yiqun, Cai et.al, Protocol Independent Multicast ECMP Redirect, "draft-ietf-pim-ecmp-02.txt", Work in Progress, October 2011. [2] Shankar Raman, et.al, Building power optimal Multicast Trees, "draft-mjsraman-rtgwg-pim-power-01.txt", Work in Progress, February 2011. [EVILBIT] Bellovin, S., "The Security Flag in the IPv4 Header", RFC 3514, April 1 2003. [RFC5513] Farrel, A., "IANA Considerations for Three Letter Acronyms", RFC 5513, April 1 2009. [RFC5514] Vyncke, E., "IPv6 over Social Networks", RFC 5514, April 1 2009. Authors' Addresses Shankar Raman et.al Expires September 2012 [Page 6] INTERNET DRAFT Replication capability based PIM redirect March 2012 Shankar Raman Department of Computer Science and Engineering I.I.T Madras, Chennai - 600036 TamilNadu, India. EMail: mjsraman@cse.iitm.ac.in Balaji Venkat Venkataswami Department of Electrical Engineering, I.I.T Madras, Chennai - 600036, TamilNadu, India. EMail: balajivenkat299@gmail.com Prof.Gaurav Raina Department of Electrical Engineering, I.I.T Madras, Chennai - 600036, TamilNadu, India. EMail: gaurav@ee.iitm.ac.in Shankar Raman et.al Expires September 2012 [Page 7]