Internet Engineering Task Force                            J. Huang, Ed.
Internet-Draft                                                  Q. Zhong
Intended status: Informational                                    Huawei
Expires: March 4, 2017                                   August 31, 2016
Framework and Requirements for GMPLS-based Control of Flexible Ethernet Network
draft-huang-flexe-framework-00
This memo provides background information on Flexible Ethernet (FlexE), explains the related terminology and use cases, and derives requirements for a GMPLS-based control plane.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on March 4, 2017.
Copyright (c) 2016 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119] .
Ethernet started at 10M and evolved to 100M, 1000M, 10G, and beyond. As line rates increase, it becomes more and more difficult for manufacturing technology to support this tenfold evolution, and developing each new standard takes longer. For example, IEEE started standardization work on 100G Ethernet in December 2007 (the initial discussion had begun about two years earlier) and finished the first part of the 100G standard, 802.3ba, in June 2010. Almost 10 years after that beginning, work on additional 100G models is still ongoing. Work on 400G Ethernet started in March 2013 and is expected to be finished by the end of 2017, for an interface that can cover a distance of 10 km. It will take several more years before 400G modules are commonly available at a reasonable price. There is no consensus yet on what the next rate beyond that will be, 800G or 1T.
If operators want to use a next-generation, higher-speed Ethernet interface, e.g., for inter-DC connections where high-speed interfaces are desired because of large and rapidly growing traffic volumes, they will need to wait some years for the new interface modules. A possible mitigation is to use a bonding technology such as 802.1AX link aggregation, but this suffers from the hash issue: traffic may not be evenly distributed over the multiple links. FlexE addresses this problem by distributing traffic over multiple physical links in 64/66B blocks in a round-robin manner. This is one of the key reasons FlexE was invented.
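As a rough illustration of the round-robin distribution mentioned above, the following sketch spreads a sequence of blocks over the physical links of a group in strict rotation. The function name `distribute_round_robin` and the use of plain integers to stand in for 64/66B blocks are assumptions made for illustration only; the FlexE calendar specified in [FlexE1.0] is not modelled here.

```python
from typing import List, Sequence


def distribute_round_robin(blocks: Sequence[int], num_links: int) -> List[List[int]]:
    """Assign blocks to links in strict rotation (hypothetical helper)."""
    if num_links <= 0:
        raise ValueError("num_links must be positive")
    links: List[List[int]] = [[] for _ in range(num_links)]
    for index, block in enumerate(blocks):
        # Block i goes to link (i mod N), so link loads differ by at most one.
        links[index % num_links].append(block)
    return links


if __name__ == "__main__":
    per_link = distribute_round_robin(list(range(10)), num_links=2)
    print([len(link) for link in per_link])
```

Unlike a hash over flow identifiers, this assignment does not depend on the traffic content, so the per-link load differs by at most one block.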
Besides bonding, FlexE also provides sub-rate and channelization capabilities. In FlexE, a 100G physical link can be divided into 20 slots of 5G each. A subset of these 20 slots can be grouped together to provide a virtual link with a bandwidth of 5G*N. FlexE also allows slots on multiple physical links to be grouped together to provide a bandwidth that is not an integral multiple of a physical link rate, such as 150G over two 100G physical links. This is called channelization.
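The slot arithmetic above can be sketched as follows. The 5G slot rate and the 20 slots per 100G link come from the text; the function names and the "fill links in order" allocation policy are assumptions chosen for illustration, not something this document or [FlexE1.0] mandates.

```python
from typing import List

SLOT_RATE_G = 5            # bandwidth of one slot in Gbit/s (from the text)
SLOTS_PER_100G_LINK = 20   # slots on one 100G physical link (from the text)


def slots_needed(bandwidth_g: int) -> int:
    """Number of 5G slots required for a bandwidth of 5G*N."""
    if bandwidth_g <= 0 or bandwidth_g % SLOT_RATE_G != 0:
        raise ValueError("bandwidth must be a positive multiple of 5G")
    return bandwidth_g // SLOT_RATE_G


def allocate_slots(bandwidth_g: int, num_links: int) -> List[int]:
    """Slots used on each link, filling links in order (assumed policy)."""
    remaining = slots_needed(bandwidth_g)
    if remaining > num_links * SLOTS_PER_100G_LINK:
        raise ValueError("bandwidth exceeds the capacity of the group")
    allocation = []
    for _ in range(num_links):
        used = min(remaining, SLOTS_PER_100G_LINK)
        allocation.append(used)
        remaining -= used
    return allocation


if __name__ == "__main__":
    print(slots_needed(150), allocate_slots(150, num_links=2))
```

For the 150G example, 30 slots are needed, which under this policy occupy all 20 slots of the first link and 10 slots of the second.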
According to sections 82 and 83 of [IEEE802.3] and [FlexE1.0], the Ethernet and FlexE layering is shown in the figure below.
+------------------+
|    L2 & Above    |
+------------------+
|  RECONCILIATION  |
+------------------+
     |      | MII
+-------------------------+
|           PCS           |
| +---------------------+ |
| |  64/66B En/Decode   | | / +------------------+
| +---------------------+ |/  | Idle Add/Remove  |
| |     FlexE Shim      | |   +------------------+
| +---------------------+ |\  |  FlexE Calendar  |
| |     De/scramble     | | \ +------------------+
| +---------------------+ |
| |   AM Add / Remove   | |
| | block Distribution  | |  /+------------------+
| +---------------------+ | / |       PMA        |
+-------------------------+/  +------------------+
|           PMA           |   | RS-FEC(Optional) |
+-------------------------+\  +------------------+
|           PMD           | \ |       PMA        |
+-------------------------+  \+------------------+
     |      | MDI
+--------------------+
|       MEDIUM       |
+--------------------+
Figure 1: Standard Ethernet and FlexE
There are three typical FlexE transport use cases as depicted in [FlexE1.0], and correspondingly there are several network layering and mapping options.
+-----------------+   +-----------------+   +-----------------+
|   L2 & Above    |   |   L2 & Above    |   |   L2 & Above    |
+-----------------+   +-----------------+   +-----------------+
|   PCS (Upper)   |   |   PCS (Upper)   |   |   PCS (Upper)   |
+-----------------+   +-----------------+   +-----------------+
|     64/66B      |   |     64/66B      |   |     64/66B      |
|  Encode/Decode  |   |  Encode/Decode  |   |  Encode/Decode  |
+-----------------+   +-----------------+   +-----------------+
| Idle Add/Remove |   | Idle Add/Remove |   |                 |
+-----------------+   +-----------------+   |                 |
| FlexE Calendar  |   | FlexE Calendar  |   |                 |
+-----------------+   +-----------------+   |                 |
|   PCS (Lower)   |   |                 |   |                 |
+-----------------+   |                 |   |                 |
|                 |   |                 |   |                 |
|                 |   |                 |   |                 |
|                 |   |                 |   |                 |
+-----------------+   +-----------------+   +-----------------+
|    OTN G.709    |   |    OTN G.709    |   |    OTN G.709    |
+-----------------+   +-----------------+   +-----------------+
   FlexE Unaware          FlexE Aware        FlexE Termination
Figure 2: FlexE Transport Mappings
In FlexE unaware mode, the FlexE traffic is treated as a bit stream on a per-physical-link basis, rather than on a FlexE group or FlexE client basis. FlexE encapsulation and signaling are transparent to the transport network, which does not try to interpret the bit stream. If the transport network covers a very long distance, skew might be a problem for FlexE, and a large skew value should be taken into account.
If there are multiple physical links in a FlexE group, it may be necessary to carry the traffic of the various links along the same transport network path to mitigate the skew issue.
+------------------+
|    L2 & Above    |
+------------------+
|  RECONCILIATION  |
+------------------+
     |      | MII
+-------------------------+
|           PCS           |
| +---------------------+ |
| |  64/66B En/Decode   | |
| +---------------------+ |
| |     FlexE Shim      | |
| +---------------------+ |
| |     De/scramble     | |
| +---------------------+ |
| |   AM Add / Remove   | |
| | block Distribution  | |           +-------------------------+
| +---------------------+ |    L1     |    PCS Layer & Above    |
+-------------------------+  <----->  +-------------------------+
|           PMA           |           |                         |
+-------------------------+           |                         |
|           PMD           |           |                         |
+-------------------------+           |                         |
     |      | MDI                     |                         |
+--------------------+                +-------------------------+
|       MEDIUM       |                |        OTN G.709        |
+--------------------+                +-------------------------+
Figure 3: L1 Connectivity
According to section 17.7.5.1 and Annex E of [G.709], the data stream at the interface between PCS and PMA should be carried over OTN, as shown in the above figure.
This mode has an efficiency problem. Because the transport network cannot know which FlexE slots are in use and which are not, it has to carry all the traffic in a FlexE group. For example, if a FlexE group consists of two 100G links and the configured FlexE bandwidth is 150G (100G over one link and 50G over the other), the transport network still has to carry the full 200G, because it cannot remove the unused 50G of slots.
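The arithmetic of this example can be sketched as below, comparing the bandwidth the transport network has to carry when it cannot see slot usage with the bandwidth it carries when unused slots can be removed (the FlexE aware case described in this document). The function name and the `flexe_aware` flag are hypothetical, and FlexE overhead is ignored for simplicity.

```python
LINK_RATE_G = 100  # rate of one physical link in Gbit/s (from the example)


def transport_load_g(num_links: int, configured_g: int, flexe_aware: bool) -> int:
    """Bandwidth carried by the transport network, ignoring overhead."""
    group_capacity = num_links * LINK_RATE_G
    if not 0 <= configured_g <= group_capacity:
        raise ValueError("configured bandwidth must fit in the group")
    # Without slot visibility every link is carried in full; with it,
    # only the configured (in-use) bandwidth needs to be carried.
    return configured_g if flexe_aware else group_capacity


if __name__ == "__main__":
    unaware = transport_load_g(2, 150, flexe_aware=False)
    aware = transport_load_g(2, 150, flexe_aware=True)
    print(unaware, aware, unaware - aware)
```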
In FlexE aware mode, the transport network understands the FlexE protocol and removes the unused slots before carrying FlexE traffic. At the egress point of the transport network, the FlexE traffic is mapped to the same number of slots, leaving some slots in the FlexE group unused if necessary. This saves bandwidth in the transport network. In this mode, the transport network interworks with FlexE below the FlexE layer; assuming FlexE is L1.5, most of the FlexE overhead is transported over the transport network. The session management channel in the FlexE overhead is terminated and replaced with idle control blocks, as specified in section 8.3 of [FlexE1.0].
The data stream to be transported is above L1 and below the FlexE shim (L1.5); it is referred to as the L1.25 data stream.
+------------------+
|    L2 & Above    |
+------------------+
|  RECONCILIATION  |
+------------------+
     |      | MII
+-------------------------+
|           PCS           |
| +---------------------+ |
| |  64/66B En/Decode   | |
| +---------------------+ |             +-------------------------+
| |     FlexE Shim      | |    L1.25    |   FlexE Shim & Above    |
| +---------------------+ |  <------->  +-------------------------+
| |     De/scramble     | |             |                         |
| +---------------------+ |             |                         |
| |   AM Add / Remove   | |             |                         |
| | block Distribution  | |             |                         |
| +---------------------+ |             |                         |
+-------------------------+             |                         |
|           PMA           |             |                         |
+-------------------------+             |                         |
|           PMD           |             |                         |
+-------------------------+             |                         |
     |      | MDI                       |                         |
+--------------------+                  +-------------------------+
|       MEDIUM       |                  |        OTN G.709        |
+--------------------+                  +-------------------------+
Figure 4: L1.25 Connectivity
The FlexE traffic is interleaved before transport, so there is no skew issue. OTN uses the BGMP algorithm to map the FlexE traffic into ODUk or ODUflex, as specified in section 17.11 of [G.709].
[Note: The case where the FlexE traffic exceeds the rate of a single link or a WDM wavelength is not yet considered by [G.709].]
This mode is usually used for a point-to-point path rather than a P2MP or MP2MP case. The traffic stream is the valid traffic of a whole FlexE group.
This is also called FlexE termination transport. The FlexE traffic over multiple slots is aggregated into a 64/66B stream, and the FlexE overhead is removed before the FlexE traffic is carried over the transport network. At the egress of the transport network, the FlexE overhead is added back when the traffic is converted into FlexE mode again.
+------------------+
|    L2 & Above    |
+------------------+
|  RECONCILIATION  |
+------------------+
     |      | MII
+-------------------------+
|           PCS           |
| +---------------------+ |            +--------------------------+
| |  64/66B En/Decode   | |    L1.5    | 64/66B En/Decode & Above |
| +---------------------+ |  <------>  +--------------------------+
| |     FlexE Shim      | |            |                          |
| +---------------------+ |            |                          |
| |     De/scramble     | |            |                          |
| +---------------------+ |            |                          |
| |   AM Add / Remove   | |            |                          |
| | block Distribution  | |            |                          |
| +---------------------+ |            |                          |
+-------------------------+            |                          |
|           PMA           |            |                          |
+-------------------------+            |                          |
|           PMD           |            |                          |
+-------------------------+            |                          |
     |      | MDI                      |                          |
+--------------------+                 +--------------------------+
|       MEDIUM       |                 |        OTN G.709         |
+--------------------+                 +--------------------------+
Figure 5: L1.5 Connectivity
This mode does not have the skew issue: because the FlexE overhead is removed and the concept of FlexE slots does not exist while the traffic is in the transport network, alignment between slots is not necessary.
The traffic in a FlexE group can be divided into multiple flows in the transport network, each of which can be identified by the FlexE client ID, or by the FlexE client ID plus the FlexE group number. These flows can be routed through different ODUk or ODUflex connections and consequently may traverse different link paths to different end stations.
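One way to picture the flow identification described above is a table keyed by the pair (FlexE group number, FlexE client ID), where each entry names the ODUk or ODUflex that carries the flow. This is only a sketch: the names `FlowKey` and `assign_flow` and the ODU identifiers used in the example are hypothetical and do not come from any specification.

```python
from typing import Dict, Tuple

# (FlexE group number, FlexE client ID); both fields are plain integers here.
FlowKey = Tuple[int, int]


def assign_flow(table: Dict[FlowKey, str], group: int, client: int, odu: str) -> None:
    """Bind one FlexE client flow to a transport container (hypothetical)."""
    key = (group, client)
    if key in table:
        raise ValueError(f"flow {key} is already mapped to {table[key]}")
    table[key] = odu


if __name__ == "__main__":
    flows: Dict[FlowKey, str] = {}
    # Two clients of the same group take different ODUflex connections.
    assign_flow(flows, group=1, client=10, odu="ODUflex-A")
    assign_flow(flows, group=1, client=11, odu="ODUflex-B")
    print(flows)
```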
As shown in Figure 5, the FlexE overhead is removed from the FlexE traffic, and the same number of 64/66B idle blocks is inserted at the IPG position between Ethernet frames when the FlexE payload is Ethernet. The case may differ if the FlexE payload is not Ethernet. The traffic then takes the form of a 64/66B block stream on a per-FlexE-client basis, and it is a CBR stream. This CBR stream is mapped over OTN using the IMP algorithm, as specified in section 17.11 of [G.709].
This is another type of FlexE termination transport. The FlexE overhead is terminated and the traffic is converted into L2/L2.5/L3 packet streams on a per-FlexE-client basis; these are VBR streams rather than the 64/66B block streams of the previous case.
This enables some new application scenarios: for example, the transport network may be used to provide a VPLS-like VPN service, or to support network virtualization and network slicing. A FlexE client can be modeled in a system as a virtual interface, and a set of virtual interfaces can form an L2 or L3 forwarding instance.
+------------------+                   +--------------------------+
|    L2 & Above    |                   |        L2 & Above        |
+------------------+    <--------->    +--------------------------+
|  RECONCILIATION  |                   |                          |
+------------------+                   |                          |
     |      | MII                      |                          |
+-------------------------+            |                          |
|           PCS           |            |                          |
| +---------------------+ |            +--------------------------+
| |  64/66B En/Decode   | |            |  64/66B Encode / Decode  |
| +---------------------+ |            +--------------------------+
| |     FlexE Shim      | |            |    Idle Add / Remove     |
| +---------------------+ |            +--------------------------+
| |     De/scramble     | |            |                          |
| +---------------------+ |            |                          |
| |   AM Add / Remove   | |            |                          |
| | block Distribution  | |            |                          |
| +---------------------+ |            |                          |
+-------------------------+            |                          |
|           PMA           |            |                          |
+-------------------------+            |                          |
|           PMD           |            |                          |
+-------------------------+            |                          |
     |      | MDI                      |                          |
+--------------------+                 +--------------------------+
|       MEDIUM       |                 |        OTN G.709         |
+--------------------+                 +--------------------------+
Figure 6: L2-L2.5-L3 Connectivity Option1
+------------------+                   +--------------------------+
|    L2 & Above    |                   |        L2 & Above        |
+------------------+    <--------->    +--------------------------+
|  RECONCILIATION  |                   |                          |
+------------------+                   |                          |
     |      | MII                      |                          |
+-------------------------+            +--------------------------+
|           PCS           |            |          GFP-F           |
| +---------------------+ |            +--------------------------+
| |  64/66B En/Decode   | |            |                          |
| +---------------------+ |            |                          |
| |     FlexE Shim      | |            |                          |
| +---------------------+ |            |                          |
| |     De/scramble     | |            |                          |
| +---------------------+ |            |                          |
| |   AM Add / Remove   | |            |                          |
| | block Distribution  | |            |                          |
| +---------------------+ |            |                          |
+-------------------------+            |                          |
|           PMA           |            |                          |
+-------------------------+            |                          |
|           PMD           |            |                          |
+-------------------------+            |                          |
     |      | MDI                      |                          |
+--------------------+                 +--------------------------+
|       MEDIUM       |                 |        OTN G.709         |
+--------------------+                 +--------------------------+
Figure 7: L2-L2.5-L3 Connectivity Option2
[G.709] provides two methods to map a packet client signal into OPUk, as specified in sections 17.10 and 17.11 of [G.709]: either map the packet flow into GFP-F encapsulation and then into OPUk or OPUflex, or map the packet flow into a FlexE client signal in the form of 64/66B blocks and then into OPUflex using the Idle Mapping Procedure (IMP), as specified in section 17.11 of [G.709].
The following parameters are applicable to FlexE Aware L1.25 Relay, FlexE Termination L1.5 Relay, and FlexE Termination L2/L2.5/L3 Relay.
SENDER_TSPEC object: Class = 12 [RFC2205], C-Type = TBD.
FLOWSPEC object: Class = 9 [RFC2205], C-Type = TBD.
Traffic parameters will be carried in the SENDER_TSPEC object of the Path message and the FLOWSPEC object of the Resv message to specify the traffic characteristics of a flow. Section 4.2 of [I-D.du-ccamp-flexe-channel] already provides a traffic parameter definition.
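For orientation, the sketch below packs the common RSVP object header defined in [RFC2205] (a 16-bit length, an 8-bit Class-Num, and an 8-bit C-Type) in front of an opaque body. The Class-Num values are the ones listed above. Because the C-Type is still TBD, `PLACEHOLDER_C_TYPE` is an arbitrary stand-in, and the all-zero body is not a real FlexE traffic parameter encoding; the helper name is likewise hypothetical.

```python
import struct

SENDER_TSPEC_CLASS = 12     # Class-Num from the text [RFC2205]
FLOWSPEC_CLASS = 9          # Class-Num from the text [RFC2205]
PLACEHOLDER_C_TYPE = 0xFF   # hypothetical stand-in: the C-Type is TBD


def pack_rsvp_object(class_num: int, c_type: int, body: bytes = b"") -> bytes:
    """Prefix a body with the 4-byte RSVP object header."""
    if len(body) % 4 != 0:
        raise ValueError("object body must be a multiple of 4 bytes")
    length = 4 + len(body)  # total object length in bytes, header included
    return struct.pack("!HBB", length, class_num, c_type) + body


if __name__ == "__main__":
    tspec = pack_rsvp_object(SENDER_TSPEC_CLASS, PLACEHOLDER_C_TYPE, bytes(4))
    print(tspec.hex())
```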
Switching Type
Value   Type
-----   -----
TBD1    FlexE
The Generalized Label object is carried in the Resv message [RFC3209]. Section 3.1 of [I-D.hussain-ccamp-flexe-signaling-extensions] provides a label definition for FlexE that can support future links other than 100G and possible new granularities, and that also supports heterogeneous links in a FlexE group. If a simplified version is desired, the one in section 3.2 of [I-D.wang-ccamp-flexe-signaling] can be considered.
This is actually a traditional mode, listed here for the purpose of completeness only.
[Note: Should the traffic of multiple links in a FlexE group be carried along the same transport network path? TBD]
The LSP Encoding Type, Switching Type, and G-PID will be carried in the Generalized Label Request object [RFC3473].
LSP Encoding Type [RFC3471]
Value   Type
-----   -----
9       Fiber
Switching Type [RFC3471]
Value   Type
-----   --------------------------
200     Fiber-Switch Capable (FSC)
Generalized PID (G-PID)
Value   G-PID Type   LSP Encoding Type
-----   ----------   ------------------------
TBD2    100GE PCS    Fiber, G.709 OCh, Lambda
A new G-PID type is required, although this is not FlexE related.
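To make the three fields concrete, the sketch below packs and unpacks the 32-bit body of a Generalized Label Request as laid out in [RFC3471]: an 8-bit LSP Encoding Type, an 8-bit Switching Type, and a 16-bit G-PID. The values 9 (Fiber) and 200 (FSC) are taken from the tables above. Since TBD2 has not been assigned, `PLACEHOLDER_GPID` is an arbitrary value used only so the example runs, and the function names are hypothetical.

```python
import struct
from typing import Tuple

LSP_ENC_FIBER = 9           # LSP Encoding Type "Fiber" [RFC3471]
SWITCHING_FSC = 200         # Switching Type "Fiber-Switch Capable" [RFC3471]
PLACEHOLDER_GPID = 0xFFFF   # hypothetical stand-in for the unassigned TBD2


def pack_label_request(lsp_enc: int, switching: int, gpid: int) -> bytes:
    """Encode LSP Enc. Type (8 bits), Switching Type (8 bits), G-PID (16 bits)."""
    return struct.pack("!BBH", lsp_enc, switching, gpid)


def unpack_label_request(data: bytes) -> Tuple[int, int, int]:
    """Decode the 4-byte body back into its three fields."""
    return struct.unpack("!BBH", data)


if __name__ == "__main__":
    body = pack_label_request(LSP_ENC_FIBER, SWITCHING_FSC, PLACEHOLDER_GPID)
    print(body.hex(), unpack_label_request(body))
```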
Label Format [RFC3471] will be carried in the Generalized Label Object.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             Label                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 8: Port Label Format (RFC3471)
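A minimal sketch of this format: the port label occupies a single 32-bit field, encoded here in network byte order. The helper name `pack_port_label` and the example port number are illustrative assumptions.

```python
import struct


def pack_port_label(label: int) -> bytes:
    """Encode a port label as one 32-bit big-endian field."""
    if not 0 <= label <= 0xFFFFFFFF:
        raise ValueError("label must fit in 32 bits")
    return struct.pack("!I", label)


def unpack_port_label(data: bytes) -> int:
    """Decode a 4-byte label field back into an integer."""
    (label,) = struct.unpack("!I", data)
    return label


if __name__ == "__main__":
    encoded = pack_port_label(5)  # port number 5 is an arbitrary example
    print(encoded.hex(), unpack_port_label(encoded))
```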
A new traffic parameter definition for FlexE unaware mode may not be necessary since the transport network is not supposed to know FlexE. A possible option is to reuse the traffic parameter definition in section 3.2.2 of [RFC2210].
LSP Encoding Type
Value   Type
-----   -----------------
TBD3    FlexE Aware Group
Generalized PID (G-PID)
Value   G-PID Type    LSP Encoding Type
-----   -----------   -----------------
TBD4    FlexE Aware   FlexE Aware Group
OTN will use the BGMP algorithm to map the FlexE signal over the transport network.
LSP Encoding Type
Value   Type
-----   ------------
TBD5    FlexE Client
Generalized PID (G-PID)
Value   G-PID Type     LSP Encoding Type
-----   ------------   -----------------
TBD6    FlexE Client   FlexE Client
LSP Encoding Type for FlexE client (packet).
Value   Type
-----   -------------------
TBD7    FlexE Client Packet
Generalized PID (G-PID)
Value   G-PID Type            LSP Encoding Type
-----   -------------------   -------------------
TBD8    FlexE Client Packet   FlexE Client Packet
TBD.
TBD.
SENDER_TSPEC object: Class = 12 [RFC2205], C-Type = TBD.
FLOWSPEC object: Class = 9 [RFC2205], C-Type = TBD.
LSP Encoding Type for FlexE aware group.
Value   Type
-----   -----------------
TBD     FlexE Aware Group
LSP Encoding Type for FlexE client (L1.5).
Value   Type
-----   ------------
TBD     FlexE Client
LSP Encoding Type for FlexE client (packet).
Value   Type
-----   -------------------
TBD     FlexE Client Packet
Switching Type for FlexE.
Value   Type
-----   -----
TBD     FlexE
Generalized PID (G-PID) for 100GE PCS.
Value   G-PID Type   LSP Encoding Type
-----   ----------   ------------------------
TBD     100GE PCS    Fiber, G.709 OCh, Lambda
Generalized PID (G-PID) for FlexE aware transport.
Value   G-PID Type    LSP Encoding Type
-----   -----------   -----------------
TBD     FlexE Aware   FlexE Aware Group
Generalized PID (G-PID) for FlexE client (L1.5).
Value   G-PID Type     LSP Encoding Type
-----   ------------   -----------------
TBD     FlexE Client   FlexE Client
Generalized PID (G-PID) for FlexE client (packet).
Value   G-PID Type            LSP Encoding Type
-----   -------------------   -------------------
TBD     FlexE Client Packet   FlexE Client Packet
TBD.
[RFC2205]  Braden, R., Zhang, L., Berson, S., Herzog, S. and S. Jamin, "Resource ReSerVation Protocol (RSVP) -- Version 1 Functional Specification", RFC 2205, DOI 10.17487/RFC2205, September 1997.
[RFC2210]  Wroclawski, J., "The Use of RSVP with IETF Integrated Services", RFC 2210, DOI 10.17487/RFC2210, September 1997.
[RFC2629]  Rose, M., "Writing I-Ds and RFCs using XML", RFC 2629, DOI 10.17487/RFC2629, June 1999.
[RFC3209]  Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V. and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP Tunnels", RFC 3209, DOI 10.17487/RFC3209, December 2001.