Internet DRAFT - draft-maloy-tipc-tml
Network Working Group J. Maloy
Internet-Draft Ericsson
Expires: December 3, 2005 J. Hadi Salim
Znyx
H. Khosravi
Intel
F. Ansari
Lucent
C. Shuchi
Intel
June 2005
TIPC based TML for the ForCES protocol
draft-maloy-tipc-tml-00.txt
Status of this Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on December 3, 2005.
Copyright Notice
Copyright (C) The Internet Society (2005).
Conventions used in this document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
Maloy, et al. Expires December 3, 2005 [Page 1]
Internet-Draft TIPC June 2005
Abstract
This document describes a ForCES [ForCES] Transport Mapping layer
(TML) based on the Transparent Inter Process Communication service
[TIPC]. It is intended to be used when the ForCES protocol is
transported over L2 carriers such as Ethernet, RapidIO or PCI-
Express. TIPC has been specially designed for efficient and easy-to-
use communication over L2 carriers, and is typically used to define
clusters of loosely coupled nodes in such environments.
Table of Contents
1. Requirements notation
2. Introduction
2.1 TIPC Summary
2.2 Rationale for a TIPC based TML
2.3 Architectural Overview
2.4 The PL Layer
2.5 The TML Layer
2.6 Terminology
3. TIPC TML overview
3.1 Separate Control and Data channels
3.1.1 Data Channel
3.1.2 Control Channel
3.1.3 Reliability
3.1.4 Congestion Control
3.1.5 Security
3.1.6 Addressing
3.1.7 Timeliness
3.1.8 Prioritization
3.1.9 HA Decisions
3.1.10 Encapsulations Used
3.1.11 TML Messaging
3.1.12 Protocol Initialization and Shutdown Model
3.1.13 Protocol Initialization
3.1.14 Protocol Shutdown
3.1.15 Multicast Model
3.1.16 Broadcast Model
3.1.17 Security Considerations
3.2 IANA Considerations
3.3 Manageability
4. References
Authors' Addresses
Intellectual Property and Copyright Statements
1. Requirements notation
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
2. Introduction
The ForCES (Forwarding and Control Element Separation) working group
in the IETF is defining the architecture and protocol for separation
of control and forwarding elements in network elements such as
routers. [RFC3654] and [RFC3746] define architectural and protocol
requirements for the communication between CE and FE. The ForCES
protocol layer [ForCES] describes the protocol specification. It is
envisioned that the ForCES protocol will be independent of the
interconnect technology between the CE and FE and can run over
multiple transport technologies and protocols. Thus a Transport
Mapping Layer (TML) has been defined in the protocol framework that
will take care of mapping the protocol messages to specific
transports. This document defines a TIPC based TML for the ForCES
protocol layer. It also addresses all the requirements for the TML
including security, reliability, etc.
2.1 TIPC Summary
For reference, this section gives a brief introduction to the
services provided by TIPC, as well as some basic concepts needed to
understand the rest of this document. For more in-depth information,
see [TIPC].
TIPC is a transport protocol with selectable reliability, typically
operating on top of L2 packet networks such as Ethernet. If IP-
routability and RFC2309 compliant congestion control are required,
the protocol can also be carried over higher-level protocols such as
DCCP, TCP, or SCTP.
TIPC offers the following services to its users:
o A functional addressing scheme providing full addressing
transparency over the whole cluster.
o A topology information and subscription service, providing up-to-
date information about functional and physical topology.
o Lightweight, highly reactive connections reporting errors or
destination unreachability within a fraction of a second.
o A reliable multicast service, based on functional addressing, but
using the underlying network multicast service when possible.
o Acknowledged, loss-free, error-free, non-duplicated transfer of
user data, both in connectionless and connection-oriented mode.
o Configurable congestion control at bearer, link, and connection
level.
o Data fragmentation conforming to discovered carrier MTU size.
o Bundling of multiple user messages into a single TIPC packet in
situations where messages cannot be sent immediately, i.e. during
network congestion.
o Transparent, link-level load sharing and redundancy, through
support of heterogeneous multi-homing.
2.2 Rationale for a TIPC based TML
[RFC3654] states a set of basic requirements (loss-free, ordered,
non-corrupted delivery of messages, congestion control, scalability,
etc.) which are all met by TIPC. In addition, since TIPC constitutes
just a thin protocol layer on top of an L2 carrier, it is very
efficient when used in closed LANs, which we can assume will be a
very common environment for the type of routers we discuss here.
TIPC's location-transparent addressing scheme also makes it
particularly fit for carrying the ForCES PL protocol; the latter's
addressing scheme can be directly mapped onto TIPC functional
addresses, making any form of address configuration or translation in
the TML layer superfluous. Furthermore, the topology subscription
service provided by TIPC makes it extremely easy for both the PL
layer and other functions to keep track of changes in the physical
and functional topology of the router.
2.3 Architectural Overview
The reader is referred to the Framework document [RFC3746], and in
particular sections 3 and 4, for an architectural overview and for
where and how the ForCES protocol fits in. There may be some content
overlap between the ForCES protocol draft [ForCES] and this section
in order to provide clarity. The ForCES protocol consists of two
distinct parts: the PL and the TML layer. This is depicted in the
figure below.
 -------------                                      -------------
|    CE/PL    |                                    |    FE/PL    |
|    Layer    |                                    |    Layer    |
|-------------|                                    |-------------|
|   CE/TML    |                                    |   FE/TML    |
|    Layer    |                                    |    Layer    |
|-------------|                                    |-------------|
|  Transport  |         ForCES PL messages         |  Transport  |
|   Service   |<---------------------------------->|   Service   |
 -------------     encapsulated in TML packets      -------------
Figure 1: Architectural view of ForCES protocol
The PL layer is in fact the ForCES protocol. Its semantics and
message layout are defined in [ForCES]. The TML Layer is necessary
to connect two ForCES PL layers as shown in Figure 1 above. Both the
PL and TML layers are standardized by the IETF. While only one PL
layer is defined, different TMLs are expected to be standardized. To
interoperate, the TML layers at the CE and FE are expected to be of
the same definition. On transmit, the PL layer delivers its messages
to the TML layer. The TML layer delivers the message to the
destination TML layer(s). On reception, the TML delivers the message
to its destination PL layer(s).
2.4 The PL Layer
The PL is common to all implementations of ForCES and is standardized
by the IETF [ForCES]. The PL layer is responsible for associating an
FE or CE to an NE. It is also responsible for tearing down such
associations. An FE uses the PL layer to send various subscribed-to
events to the CE PL layer as well as to respond to various status
requests issued from the CE PL. The CE configures both the FE and
the associated LFB attributes using the PL layer. In addition, the
CE may send various requests to the FE to activate or deactivate it,
reconfigure its HA parameterization, subscribe to specific events,
etc.
2.5 The TML Layer
The TML layer is essentially responsible for transport of the PL
layer messages. The TML is where the issues of how to achieve
transport level reliability, congestion control, multicast, ordering,
etc. are handled. It is expected that more than one TML will be
standardized. The different TMLs could each implement things
differently based on the capabilities of the underlying media and
transport. However, since each
TML is standardized, interoperability is guaranteed as long as both
endpoints support the same TML. All ForCES Protocol Layer
implementations should be portable across all TMLs, because all TMLs
have the same top edge semantics.
2.6 Terminology
o ForCES Protocol: While there may be multiple protocols used within
the overall ForCES architecture, the term "ForCES protocol" refers
only to the protocol used at the Fp reference point in the ForCES
Framework in RFC3746 [RFC3746]. This protocol does not apply to
CE-to-CE communication, FE-to-FE communication, or to
communication between FE and CE managers. Basically, the ForCES
protocol works in a master-slave mode in which FEs are slaves and
CEs are masters.
o ForCES Protocol Layer (ForCES PL): A layer in ForCES protocol
architecture that defines the ForCES protocol messages, the
protocol state transfer scheme, as well as the ForCES protocol
architecture itself (including requirements of ForCES TML (see
below)). Specifications of ForCES PL are defined in [ForCES].
o ForCES Protocol Transport Mapping Layer (ForCES TML): A layer in
ForCES protocol architecture that specifically addresses the
protocol message transportation issues, such as how the protocol
messages are mapped to different transport media (like TCP, IP,
ATM, Ethernet, etc), and how to achieve and implement reliability,
multicast, ordering, etc. This document defines a TIPC based
ForCES TML.
o Port: The endpoint of all TIPC user communication. On Unix it
typically takes the shape of a socket.
o Zone: A "super-cluster" of clusters interconnected via TIPC.
o Cluster: A part of a zone where all nodes are directly
interconnected (fully meshed) via TIPC.
o Node: A physical computer within a cluster, identified by a TIPC
address.
o System Node: A node having direct links to all other system nodes
in the cluster, and a TIPC address defined within a certain range.
When using the term 'node' in the remainder of this document we
normally mean 'system node', unless the context makes a different
interpretation obvious.
o Secondary Node: A node identified by a TIPC address within a
certain range, and potentially having limited physical
connectivity to the rest of the cluster. Secondary nodes can
communicate with all system nodes in the cluster, and vice versa,
but the messages may have to pass via a system node acting as
router. Secondary nodes cannot communicate with each other.
o Link: A signalling link connecting two nodes, performing tasks
such as message transfer, sequence ordering, retransmission etc.
A node pair may be interconnected by 1 or 2 parallel links, in
load sharing or active/standby configuration.
o Bearer: A generic term for an instance of a physical or logical
transport medium, such as Ethernet, ATM/AAL or DCCP.
o Network Address: A TIPC internal node identifier. It is in
reality a 32 bit integer, subdivided into three fields (8/12/12),
representing zone, cluster and node number respectively. Normally
depicted as <Z.C.N>.
o Network Identity: A TIPC internal identifier, used to keep
different TIPC networks separated from each other, e.g. on a LAN
in a lab environment.
o Location transparency, sometimes called addressing transparency,
is the ability to let processes communicate within a cluster
without either of them knowing the physical location of their
peer.
o Port Name: (or just Name) A persistent functional address
identifying a port within a zone. A port may move between nodes
while retaining its name. For load sharing and redundancy
purposes several ports may bind to the same name.
o Port Identity: A volatile address identifying a unique physical
port within a zone. Once a physical port is deleted its identity
will not be reused for a very long time.
o Message: The unit of data delivered from one user to another, i.e.
between ports.
o Connection: A logical channel for passing messages between two
ports. Once a connection is established no address need be
indicated when sending a message from any of the endpoints. A
connection also implies automatic supervision of the endpoints'
existence and state.
o Message Bundling: The act of bundling several messages into one
bearer level packet, typically an Ethernet frame. TIPC bundles
messages e.g. during media congestion.
o Message Fragmentation: Dividing a long message into several
bearer-level packets, and reassembling the fragments at the
receiving end.
o Link Failover: Moving all traffic from a failing link/media to the
remaining link, while retaining original sequence order and
cardinality.
o Naming Table: A TIPC internal table which keeps track of the
mapping between port names and corresponding port identities. It
performs an on-the-fly translation from the one to the other
during the message transfer phase.
o Packet: The unit of data sent over a bearer. It may contain one
or more complete TIPC messages, as well as fragments of a message.
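The Network Address entry in the list above describes a 32-bit
integer subdivided 8/12/12 into zone, cluster and node. The following
Python sketch shows one way such an address could be packed and
unpacked. It assumes that the zone occupies the most significant
bits, which the definition above does not state explicitly; the
function names are illustrative only and are not part of any TIPC
API.

```python
from typing import Tuple

ZONE_BITS, CLUSTER_BITS, NODE_BITS = 8, 12, 12  # the 8/12/12 split


def pack_network_address(zone: int, cluster: int, node: int) -> int:
    """Pack <Z.C.N> into one 32-bit integer (zone in the top bits: an assumption)."""
    if not 0 <= zone < (1 << ZONE_BITS):
        raise ValueError("zone must fit in 8 bits")
    if not 0 <= cluster < (1 << CLUSTER_BITS):
        raise ValueError("cluster must fit in 12 bits")
    if not 0 <= node < (1 << NODE_BITS):
        raise ValueError("node must fit in 12 bits")
    return (zone << (CLUSTER_BITS + NODE_BITS)) | (cluster << NODE_BITS) | node


def unpack_network_address(addr: int) -> Tuple[int, int, int]:
    """Split a 32-bit network address back into (zone, cluster, node)."""
    zone = (addr >> (CLUSTER_BITS + NODE_BITS)) & ((1 << ZONE_BITS) - 1)
    cluster = (addr >> NODE_BITS) & ((1 << CLUSTER_BITS) - 1)
    node = addr & ((1 << NODE_BITS) - 1)
    return zone, cluster, node


def format_network_address(addr: int) -> str:
    """Render the address in the <Z.C.N> notation used in this document."""
    zone, cluster, node = unpack_network_address(addr)
    return f"<{zone}.{cluster}.{node}>"
```

For example, under this assumed layout, format_network_address(
pack_network_address(1, 1, 10)) yields the string "<1.1.10>".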
3. TIPC TML overview
The TIPC TML consists of two TIPC connections between the CE and FE
over which the protocol messages are exchanged. One of the
connections is called the control channel, over which control
messages are exchanged; the other is called the data channel, over
which external protocol packets, such as routing packets, will be
exchanged. The TIPC connections will use unique server port names
for each of the channels. In addition, this TML will use the
kernel-level mechanism provided by TIPC to prioritize messages over
the different channels. Some of the rationale for this approach, as
well as how it meets the TML requirements, is explained below.
3.1 Separate Control and Data channels
The ForCES NEs are subject to Denial of Service (DoS) attacks
[Requirements Section 7 15]. A malicious system in the network can
flood a ForCES NE with bogus control packets such as spurious RIP or
OSPF packets in an attempt to disrupt the operation of and the
communication between the CEs and FEs. In order to protect against
this situation, the TML uses separate control and data channels for
communication between the CEs and FEs.
CE
+-------------------+
| CE: PL |
+-------------------+
| CE: TML |
+-------------------+
| CE: TIPC |
+-------------------+
| | | | | |
| . | . | .
| | | | | |
| . | . | .
| | | | | |
| . | . | .
+-Cc1-----+ | | | | +-.-.-.-.-.Cdn.-+
| +-Cd1-.-.+ | . +--------Ccn---+ |
| | Cc2 Cd2 | .
| . | | | |
+-----------+ +-----------+ +-----------+
| FE: TIPC | | FE: TIPC | . . . | FE: TIPC |
+-----------+ +-----------+ +-----------+
| FE: TML | | FE: TML | | FE: TML |
+-----------+ +-----------+ +-----------+
| FE: PL | | FE: PL | | FE: PL |
+-----------+ +-----------+ +-----------+
FE1 FE2 FEn
\-------------V------------/
Legend:
---- Cc : Reliable Unicast Control Channel between CE and FE
-.-. Cd : Best Effort Unicast Data Channel between CE and FE
Figure 2: CE-FE Communication Channels
3.1.1 Data Channel
The data channel carries the control protocol packets, such as RIP
and OSPF messages, as outlined in Requirements [RFC3654] section
7.10, which are carried in ForCES Packet Redirect messages [RFC3746]
between the CEs and FEs. The reliability requirements for the data
channel messages are different from those of the control messages
[RFC3654], i.e. they don't require strict reliability in terms of
retransmission, etc. However, congestion control is important for
the data channel because in case of DoS attacks, if an unreliable
transport such as UDP is used for the data traffic, it can more
easily overflow the physical connection, overwhelming the control
traffic with congestion. Thus we need a transport protocol that
provides congestion control but does not necessarily provide full
reliability. Therefore, the data channel is established as a
connection with "best effort" properties in both directions. The
channel is set up by using the port name [CETYPE_DATA,CE-id] from the
FE.
3.1.2 Control Channel
All the other ForCES messages, which are used for configuration/
capability exchanges, event notification, etc, are carried over the
control channel. The control channel is mapped to a TIPC connection
which is "reliable" in both directions, and the data channel is set
up only after the control channel is set up. The control channel is
set up by using the port name [CETYPE_CONTROL,CE-id] from the FE.
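The channel definitions in Sections 3.1.1 and 3.1.2 can be
summarized in a small Python sketch: each channel is identified by a
port name (type, instance) whose instance is the CE id, the control
channel is reliable, the data channel is best effort, and the control
channel is set up first. The numeric values chosen for CETYPE_CONTROL
and CETYPE_DATA are placeholders invented for this illustration, as
are the class and function names.

```python
from typing import List, NamedTuple, Tuple

# Placeholder type values; this document does not assign numbers.
CETYPE_CONTROL = 0x4000
CETYPE_DATA = 0x4001


class ChannelSpec(NamedTuple):
    label: str                  # "control" or "data"
    port_name: Tuple[int, int]  # TIPC port name as (type, instance)
    reliable: bool              # True: "reliable", False: "best effort"


def channels_for_ce(ce_id: int) -> List[ChannelSpec]:
    """Return the channels an FE sets up towards a CE, in setup order."""
    return [
        # The control channel comes first and is reliable in both directions.
        ChannelSpec("control", (CETYPE_CONTROL, ce_id), reliable=True),
        # The data channel follows and has best-effort properties.
        ChannelSpec("data", (CETYPE_DATA, ce_id), reliable=False),
    ]
```

Returning the channels as an ordered list keeps the "control before
data" rule in one place, so a caller only has to iterate in order.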
3.1.2.1 Multicast Channel
Multicast groups are joined at the FE-side by binding to the port
name [McId,FE-id]. Messages are sent from CE to a multicast group by
using the port name sequence [McId,0,0xffffffff], which will
automatically cover all members of the group.
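How a port name sequence covers every member of a multicast group can
be illustrated with a toy name table in Python. This is only a model
of the matching rule described above, not TIPC's naming table
implementation; the class, its methods and the MC_ID value are
hypothetical.

```python
from typing import List, Tuple

MC_ID = 0x5000  # placeholder multicast group type (McId), chosen for illustration


class McastNameTable:
    """Toy model: maps bound port names (type, instance) to ports."""

    def __init__(self) -> None:
        self._bindings: List[Tuple[int, int, str]] = []

    def bind(self, name_type: int, instance: int, port: str) -> None:
        """FE side: join a group by binding to [McId, FE-id]."""
        self._bindings.append((name_type, instance, port))

    def resolve_sequence(self, name_type: int, lower: int, upper: int) -> List[str]:
        """CE side: find every port whose instance lies in [lower, upper]."""
        return [
            port
            for (bound_type, instance, port) in self._bindings
            if bound_type == name_type and lower <= instance <= upper
        ]


table = McastNameTable()
table.bind(MC_ID, 1, "FE1-port")
table.bind(MC_ID, 5, "FE5-port")
table.bind(MC_ID + 1, 1, "other-group-port")

# The sequence [McId, 0, 0xffffffff] spans all possible FE ids.
members = table.resolve_sequence(MC_ID, 0, 0xFFFFFFFF)
```

Here members contains the two ports bound under MC_ID, while the port
bound to a different type is not matched.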
3.1.3 Reliability
TIPC provides the reliability (no losses, no data corruption, no re-
ordering of data) required for ForCES protocol control messages.
Furthermore, TIPC guarantees this property even when control traffic
is transparently load shared over more than one physical medium, such
as two parallel Ethernets. This guarantee is valid even in
transition phases when one of the networks fails or is started.
Optionally, an individual socket can be set to be "best effort",
meaning that any message sent from that socket may be dropped if
there is network congestion or target node overload.
3.1.4 Congestion Control
Inside a LAN, TIPC alone provides congestion control adequate to
satisfy this requirement [RFC3654]. There are three levels of
congestion control, as described in sections 3.5.5, 3.7.6, 3.7.7 and
3.9.6 of [TIPC]. The ForCES PL may receive an indication of destination
socket or node congestion when setting up a channel. Once a channel
is established, socket level congestion is handled transparently by
the TIPC connection flow control scheme, while destination node
overload will result in an aborted channel if the connection is set
to "reliable". Since the direction FE->CE on the data channel is set
to "best effort", congestion or CE overload will NOT result in an
aborted channel. TIPC will also inform the TML layer about the
reason for such channel abortion, to help the TML decide what
recovery measures to take. It is possible to use TIPC and TIPC/TML
even over IP-based networks, but in such cases congestion control
must be guaranteed by the carrying transport protocol, e.g. TCP or
DCCP. In such cases TIPC will short-circuit the concerned parts of
its own transport protocol layer to avoid duplicate functionality.
3.1.5 Security
TIPC can only guarantee message and endpoint authenticity for closed
networks, e.g. a trusted LAN or bus. Since no router can yet forward
TIPC/Ethernet packets, it is impossible to inject spoofed packets
into such a network from outside. When needed, additional security
can be achieved by carrying TIPC over an IP-based protocol with the
required properties. Both TLS and IPsec fulfil the requirements
stated in [RFC3654].
3.1.6 Addressing
There are at least two possible distribution models for TML CE-FE
channels. One such model assumes that there is only one set of data/
control channels between each CE-FE pair. A multiplexing/
demultiplexing step is then assumed at the PL layer in the ForCES
stack. The following figure illustrates this model.
------------------------------------------------------------------
| CE |
| |
| -------------- ----------------- -------------- |
| | XXX LFB | | CE Protocol LFB | | YYY LFB | |
| | | | | | | |
| | |<------+------->X<-------+----->| | |
| | | | | | | | |
| | | | | | | | |
| -------------- --------+-------- -------------- |
| | |
| | |
---------------------------------|--------------------------------
|
| Control/Data/Multicast
| Channels
|
---------------------------------|--------------------------------
| FE | |
| | |
| -------------- --------+-------- -------------- |
| | | | | | | | |
| | | | | | | | |
| | |<------+------->X<-------+----->| | |
| | | | | | | |
| | MMM LFB | | FE Protocol LFB | | FFF LFB | |
| -------------- ----------------- -------------- |
| |
| |
------------------------------------------------------------------
Figure 3: Channel model with explicit multi/demultiplexing
Another possible model is one where any CE LFB communicates directly
with any LFB on the FE side, and vice versa, without sending the
messages via any PL layer multiplexer step. In reality, it is the
transport protocol itself that performs the necessary multiplexing,
invisible to the upper layers. The following figure illustrates
this. In this model, the Protocol LFBs only serve the role of
configuring the transport protocol, on behalf of the CEM or FEM,
giving all existing channel pairs equal properties, and supervising
the availability of the peer FE-CE. There is one PL-protocol
termination (in practice, a library) per LFB, terminating all
messages from any other LFB it may communicate with. The main
advantage of this model is performance and simplicity, but it
requires a PL layer providing and assuming a connectionless
communication model.
------------------------------------------------------------------
| CE |
| |
| -------------- ----------------- -------------- |
| | XXX LFB | | CE Protocol LFB | | YYY LFB | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| ------+------- --------+-------- -------+------ |
| | | | |
| | | | |
--------|------------------------|-----------------------|--------
| | |
<---+---> Reliable | Supervised <---+--->
connectionless | Control/Data
unicast or | Channel pair
<---+---> multicast | <---+--->
| | |
--------|------------------------|-----------------------|--------
| FE | | | |
| | | | |
| ------+------- --------+-------- -------+------ |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | MMM LFB | | FE Protocol LFB | | FFF LFB | |
| -------------- ----------------- -------------- |
| |
| |
------------------------------------------------------------------
Figure 4: Channel model with direct LFB-LFB communication
[ForCES] describes the first model, not taking into account that the
second one is possible if there is a reliable connectionless protocol
at hand. For now, we will therefore assume the first model, without
excluding the second should the PL/TML model be modified to allow it
at a later stage. Since TIPC has a functional addressing scheme, FE ids
as well as LFB ids can be mapped directly down to TIPC port names and
port name sequences. For TIPC/TML, a destination address is just an
opaque 4-byte integer pair. Irrespective of the PL-layer's
interpretation of that number, TIPC commits to reliably deliver
messages from any sender socket using that number-pair to any
destination socket bound to that same number-pair. When connection-
oriented messaging is wanted, the same address structure serves as
connect address, making TIPC basically behave like TCP, with some
additional properties to be activated on demand. For unicast
addressing/delivery, this TML uses the requested TIPC connection
between the CE and FE for control messaging. For multicast/broadcast
addressing/delivery of control messages, this TML uses TIPC multicast
from the CE to the FEs. The following example illustrates the
address mapping:
------------------------------------------------
| CE 8 |
| ----------------------------------- |
| | CE Protocol LFB | |
| | | |
| | | |
| | | |
| | | | |
| | (PL) | tmlInit(CeId = 8) | |
| | | | |
| | | | |
| | ------+----- TML API ----- | |
| | | | |
| | (TML) | bind(type = CETYPE, | |
| | | inst = 8) | |
| | V | |
| ----------------- TIPC API -------- |
| (TIPC) |
------------------------------------------------
------------------------------------------------
| FE 5 |
| (TIPC) |
| ----------------- TIPC API -------- |
| | A | |
| | | | |
| | | | |
| | | | |
| | (TML) | connect(type = CETYPE,| |
| | | inst = 8) | |
| | | | |
| | ------+----- TML API ----- | |
| | | | |
| | (PL) | tmlOpen(CeId = 8) | |
| | | | |
| | | |
| | FE Protocol LFB | |
| ----------------------------------- |
| |
------------------------------------------------
Figure 5: ForCES/PL to TIPC/TML address mapping
A CE address is represented by a CeId (a 32-bit number) in the PL
space. This address can be represented as a port name in the TIPC
addressing space, so that the type value is set to be CETYPE (a
well-known, reserved 32-bit number) and the instance value is the
CeId. When the PL layer initializes the TML layer on the CE side, it
gives it the CeId. The TML layer then creates a socket and binds it
to a port name containing the CeId. On the FE side, the tmlOpen()
call provides the TML with the targeted CE's identity. The TML layer
uses this to construct a port name the same way as above, creates a
communication socket, and connects it to the CE's socket by using the
new port name. As we can see from this scenario, the CEM and FEM
don't need to configure any addresses at the TML level; the addresses
provided by the PL layer can be directly mapped down to corresponding
TIPC addresses.
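The mapping from CeId to port name described above can be sketched in
Python without any real sockets. The sketch models the naming table
as a dictionary; tml_init() and tml_open() mirror the tmlInit() and
tmlOpen() calls of Figure 5 in spirit only. The CETYPE value, the
port identifiers and all function signatures are assumptions made for
the illustration.

```python
from typing import Dict, Optional, Tuple

CETYPE = 0x4000  # placeholder for the well-known, reserved type value

PortName = Tuple[int, int]  # (type, instance)


class NamingTable:
    """Toy model of the name-to-port translation performed by TIPC."""

    def __init__(self) -> None:
        self._names: Dict[PortName, str] = {}

    def bind(self, name: PortName, port_id: str) -> None:
        self._names[name] = port_id

    def lookup(self, name: PortName) -> Optional[str]:
        return self._names.get(name)


def ce_port_name(ce_id: int) -> PortName:
    """Map a 32-bit CeId directly onto a port name; no configuration needed."""
    if not 0 <= ce_id <= 0xFFFFFFFF:
        raise ValueError("CeId must be a 32-bit number")
    return (CETYPE, ce_id)


def tml_init(table: NamingTable, ce_id: int, port_id: str) -> None:
    """CE side: bind the TML's socket (modelled as port_id) to the CE's name."""
    table.bind(ce_port_name(ce_id), port_id)


def tml_open(table: NamingTable, ce_id: int) -> str:
    """FE side: build the same name from the CeId and connect to whoever bound it."""
    port_id = table.lookup(ce_port_name(ce_id))
    if port_id is None:
        raise ConnectionError(f"no CE bound to CeId {ce_id}")
    return port_id
```

With table = NamingTable() and tml_init(table, 8, "ce8-port"), a
later tml_open(table, 8) resolves to "ce8-port", while tml_open(
table, 9) raises ConnectionError because no CE has bound that name.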
3.1.7 Timeliness
Messages are delivered over L2 networks without any added delay.
With Ethernet this will in practice mean a process-to-process
delivery time on the order of 100 microseconds for a typical
one-packet message. TIPC does not allow obsoleting messages.
3.1.8 Prioritization
TIPC provides four message importance priorities, instead of eight,
as required in [ForCES]. The rationale for requiring as many as
eight levels is weak; extensive experience from use of TIPC indicates
that four levels are perfectly adequate. If it is decided that the
ForCES PL must have eight levels, those will have to be mapped down
2-to-1 to the TIPC priorities by the TML layer. We suggest that the
data connection is set to TIPC_LOW in both directions, while the
control channel and multicast sockets get the priority
TIPC_MEDIUM.
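A 2-to-1 mapping from eight PL priority levels to four TIPC levels
could look like the following Python sketch. It assumes that PL
priorities are numbered 0 to 7 with higher numbers meaning higher
priority, which this document does not specify. TIPC_LOW and
TIPC_MEDIUM are the names used above; TIPC_HIGH and TIPC_CRITICAL,
and all numeric values, are placeholders for the two remaining
levels.

```python
# Four TIPC importance levels, lowest first (values are placeholders).
TIPC_LOW, TIPC_MEDIUM, TIPC_HIGH, TIPC_CRITICAL = range(4)


def pl_to_tipc_priority(pl_priority: int) -> int:
    """Map a PL priority in 0..7 onto a TIPC level in 0..3, two PL levels per TIPC level."""
    if not 0 <= pl_priority <= 7:
        raise ValueError("PL priority must be in the range 0..7")
    return pl_priority // 2
```

Under these assumptions PL priorities 0 and 1 map to TIPC_LOW, 2 and
3 to TIPC_MEDIUM, and so on, so the relative ordering of priorities
is preserved even though adjacent pairs become indistinguishable.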
3.1.9 HA Decisions
L2 link failure detection and failover is handled transparently by
TIPC, by moving traffic over to the redundant link when one such is
available. This does not affect the PL layer, since it will have no
knowledge about the lower layer links. In case of complete
communication failure between CE and FE, the PL layer must be
informed. Non-delivered messages will not be returned to
the sending PL, but the failure reason will, as stated in [ForCES].
There is no support for heartbeat messages between peer TML layers.
The availability of a peer node is supervised by TIPC, using its own
heartbeat scheme, and indications of communication failure are
received by the TML via the topology subscription service. Failure
detection time can be configured per node (FE/CE), so a requested
heartbeat interval from CEM/FEM or PL layer can be translated into a
corresponding neighbour failure detection time per CE or FE. The TML
is responsible for keeping the control and data communication
channels up. However, it does not have the authority to decide which
CE to set up the channels with. If an FE-CE communication channel
goes down or connectivity is lost, the following steps are taken by
the TML layer:
o If the error code from TIPC differs from TIPC_NO_NODE, the FE TML
attempts to reestablish the communication channel.
o If the FE TML is unable to reestablish the channel (after some
configured number of retries/timeout), it notifies the FE PL that
the channel is down.
o The CE TML waits for the channel to be reestablished (since only
the FE can reestablish it) for some configured timeout prior to
notifying the CE PL that the channel is down.
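The recovery steps above can be expressed as a small decision sketch
in Python. The callbacks try_reconnect and wait_for_channel stand in
for whatever mechanism an implementation uses and are hypothetical,
as is the numeric value of TIPC_NO_NODE. The text above does not say
what the FE TML does when the error code equals TIPC_NO_NODE; the
sketch assumes that it reports the channel as down without retrying.

```python
from typing import Callable

TIPC_NO_NODE = 3  # placeholder value for the error code named in the text


def fe_handle_channel_down(
    error_code: int,
    try_reconnect: Callable[[], bool],
    max_retries: int,
) -> str:
    """FE TML: retry unless the peer node is gone, then give up after max_retries."""
    if error_code == TIPC_NO_NODE:
        # Assumption: with no peer node there is nothing to reconnect to.
        return "notify_pl"
    for _ in range(max_retries):
        if try_reconnect():
            return "reestablished"
    return "notify_pl"


def ce_handle_channel_down(
    wait_for_channel: Callable[[float], bool],
    timeout_s: float,
) -> str:
    """CE TML: only the FE can reconnect, so wait up to timeout_s and then report."""
    if wait_for_channel(timeout_s):
        return "reestablished"
    return "notify_pl"
```

The returned strings are just labels for the two outcomes; a real
TML would instead resume traffic or invoke its notification path
towards the PL.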
3.1.10 Encapsulations Used
There is no further message encapsulation of control and data
messages done at the TML layer. The PL generated control messages
are transported as is by the TML layer. All ForCES protocol control
and data messages are encapsulated with a TIPC header.
3.1.11 TML Messaging
TBD.
3.1.12 Protocol Initialization and Shutdown Model
In order for the peer PL Layers to communicate, the control and data
channels must be set up. This section defines a model for the setup
of the channels, using the TML interface defined in [TMLAPI]. In
this model, the peer TML Layers may establish the control and data
channels between the FE and the CE without the involvement of the PL
Layers, or if desired, the PL Layer may trigger the setup of the
channels; this is left as an implementation decision. Both modes may
also be supported within an implementation.
3.1.13 Protocol Initialization
The control channel must be established between the FE TML and the CE
TML for establishment of association to proceed. This channel will
be used for messages related to the association setup and capability
query. The data channel must be established no later than the
response from the FE to the CE Topology query message.

The following are the significant aspects associated with channel
setup:

o  A single call by the PL layer sets up the communication channels
   for both control and data, or distinct channels for control and
   data.

o  The TML sets up the appropriate channels and allocates the
   required descriptors for the channels.

o  The TML layer maintains a mapping between the Unicast FE/CE Id and
   the corresponding connection. There is no need for channel
   descriptors to be returned to the PL layer at either the FE or the
   CE.

o  The PL Layer only uses the Unicast FE/CE Id for read/write calls
   and specifies the type of message (control versus data) to be
   read/written.

o  When channel setup completes, the TML layer returns a status that
   specifies which channels were set up successfully and which were
   not.

Figure 6 illustrates the initialization model where the PL layer, via
the TML API, triggers the setup of the control and data channels.
      FE1 PL          FE1 TML                 CE TML           CE PL
         |               |                       |               | \
       / |               |                       | TBD:tmlInit() | |
FE     | |               |                       |<--------------| > CE Init/
Init/  < |               |                       |               | | Bootup
Bootup | |               |                       |               | /
       \ |               |                       |               |
         | tmlOpen(CeId) |                       |               |
         |-------------->|                       |               | \
         |               |CtrlChan(Cc) Setup     |               | | Setup control
         |               |~~~~~~~~~~~~~~~~~~~~~~>|               | | channel if not
         |               |             FeId -> [CcDes<ctrl>]     | | setup. TML
         |               |                       |               | > has mapping
         |               |CtrlChan(Cc) Setup Rsp |               | | from PL Layer
         |               |<~~~~~~~~~~~~~~~~~~~~~~|               | | Id to channel
         |     CeId -> [CcDes<ctrl>]             |               | | descriptor and
         |               |                       |               | / channel type.
         |               |                       |               | \
         |               |DataChan(Cd) Setup     |               | | Setup data
         |               |~~~~~~~~~~~~~~~~~~~~~~>|               | | channel if not
         |               |             FeId -> [CcDes<ctrl>,     | | setup. TML
         |               |                      CdDes<data>]     | | updates
         |               |                       |               | > mapping from
         |               |DataChan(Cd) Setup Rsp |               | | PL Layer
         |               |<~~~~~~~~~~~~~~~~~~~~~~|               | | Id to channel
         |     CeId -> [CdDes<data>]             |               | | descriptor and
         |               |                       |               | / channel type.
         |               |                       |               |
         |  <-- status   |                       |               |
         |               |                       |               |
         |tmlEvent(ChUp) |                       |tmlEvent(ChUp) |
         |<--.--.--.--.--|                       |--.--.--.--.-->|
         |               |                       |               |
         |               |    Asso Setup Req     |               |
         |---------------|-----------------------|-------------->|
         |               |    Asso Setup Rsp     |               |
         |<--------------|-----------------------|---------------|
         |               |                       |               |
         |               |   Capability Query    |               |
         |<--------------|-----------------------|---------------|
         |               | Capability Query Rsp  |               |
         |---------------|-----------------------|-------------->|
         |               |                       |               |
         |               |    Topology Query     |               |
         |<--------------|-----------------------|---------------|
         |               |  Topology Query Rsp   |               |
         |---------------|-----------------------|-------------->|
         |               |                       |               |
         |               |STEADY STATE OPERATION |               |

Legend:
   PL  --------> PL : Protocol layer messaging
   PL  --------> TML: TML API
   TML --.--.--> PL : Events/Notifications/Upcalls
   TML ~~~~~~~~> TML: Internal protocol communication

Figure 6: PL-controlled Protocol Initialization
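
The mapping kept by the TML during channel setup can be sketched as
below. This is an illustration under stated assumptions, not the
[TMLAPI] definition: TmlChannelTable, tml_open, descriptor and the
setup callback are hypothetical names, descriptors are modelled as
small integers, and the status is modelled as one boolean per channel
type.

```python
from itertools import count

CONTROL, DATA = "ctrl", "data"


class TmlChannelTable:
    """Maps a unicast peer Id to its control and data channel descriptors."""

    def __init__(self, setup=lambda peer_id, kind: True):
        # 'setup' is a hypothetical hook standing in for the TML-to-TML
        # channel setup exchange; it returns True when the channel came up.
        self._setup = setup
        self._next_descriptor = count(1)
        self._channels = {}

    def tml_open(self, peer_id):
        """Set up any channel not yet set up; report status per channel."""
        channels = self._channels.setdefault(peer_id, {})
        for kind in (CONTROL, DATA):
            if kind not in channels and self._setup(peer_id, kind):
                channels[kind] = next(self._next_descriptor)
        return {kind: kind in channels for kind in (CONTROL, DATA)}

    def descriptor(self, peer_id, kind):
        """Internal lookup; descriptors are never handed to the PL."""
        return self._channels[peer_id][kind]
```

The PL only ever passes the peer Id and the message type; the
descriptor lookup stays inside the TML, which matches the aspect
listed above that no channel descriptors are returned to the PL.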
3.1.14 Protocol Shutdown
 FE PL          FE TML                  CE TML           CE PL
   |               |                       |               |
   |               |STEADY STATE OPERATION |               |
   |<--------------|-----------------------|-------------->|
   |               |    Config Request     |               |
   |<--------------|-----------------------|---------------|
   |               |    Config Response    |               |
   |---------------|-----------------------|-------------->|
   |               |                       |               |
   |               | Association Teardown  |               |
   |<--------------|-----------------------|---------------|
   |               |                       |               |
   |               |                       |               | \
   |tmlClose(CeId) |                       |               | | FE initiated:
   |-------------->|                       |               | > FE specifies CE
   |  <-- status   |                       |               | | Id associated
   |               |                       |               | / with channel.

Figure 7: FE Initiated Shutdown
 FE PL          FE TML                  CE TML           CE PL
   |               |                       |               |
   |               |STEADY STATE OPERATION |               |
   |<--------------|-----------------------|-------------->|
   |               |    Config Request     |               |
   |<--------------|-----------------------|---------------|
   |               |    Config Response    |               |
   |---------------|-----------------------|-------------->|
   |               |                       |               |
   |               | Association Teardown  |               |
   |<--------------|-----------------------|---------------|
   |               |                       |               |
   |               |                       |               | \
   |               |                       |tmlClose(FeId) | | CE initiated:
   |               |                       |<--------------| > CE specifies FE
   |  <-- status   |                       |  status -->   | | Id associated
   |               |                       |               | / with channel.

Legend:
   PL  --------> PL : Protocol layer messaging
   PL  --------> TML: TML API
   TML --.--.--> PL : Events/Notifications/Upcalls
   TML ~~~~~~~~> TML: Internal protocol communication

Figure 8: CE Initiated Shutdown
3.1.15 Multicast Model
TIPC provides functional multicast, and broadcast as a special case
of that, to the PL layer. This function takes advantage of any
broadcast transport facility in the L2 bearer, such as Ethernet, and
will use replicated unicast if such a feature is missing.
Accordingly, the TIPC/TML layer provides support for multicast. In
the ForCES model, support is required to multicast to the FEs from a
CE; in this case, the CE is the source or root of the multicast and
the FEs are the leaves. Once the unicast control channel is open, a
CE may request FEs to join and leave specified multicast groups.
Multicast support is CE-initiated. FEs can join a multicast group
only if the CE requests them to join the group. TIPC/TML needs no
mapping between PL layer IDs and channel descriptors for multicast;
it can directly use the multicast group id provided by the PL layer.

The following are the significant steps for adding or removing
members from a multicast group:

o  CE PL communicates with FE PL to request that the FE join or
   leave a multicast group.

o  FE PL informs FE TML of the join or leave request.

o  FE TML creates a new socket and calls "bind()" to bind the socket
   to the multicast group requested. The multicast group id is used
   directly as the type field in the bound address.

o  FE PL responds to CE PL, informing it of the status of the join
   or leave request.

o  If the join or leave request was successful, CE PL informs CE TML
   of the update to the multicast group membership.

o  There is no need for any descriptors to be returned to the PL
   layer at either the FE or the CE. The PL Layer only uses the
   Multicast FE Id for write calls and specifies the type of message
   (control versus data) to be written.

o  A tmlWrite() on a unicast FE Id results in a unicast message being
   sent to the FE associated with the channel. A tmlWrite() on a
   multicast FE Id results in multicast messaging.

The figures below illustrate multicast scenarios with 2 FEs, FE1 and
FE2. In Figure 9, the CE requests FE1 to join a multicast group.
Although not shown as a separate figure, if FE2 were to join the same
group, the join procedure would be the same as in Figure 9; it would
result in the multicast group membership being updated at the TML
layer on the CE to include FE2 in the group. In Figure 10, the CE
requests FE1 to leave the multicast group, thus resulting in only FE2
being a member of the multicast group.

Multicast Scenario with FE1 joining group: New group created

     FE1 PL          FE1 TML         CE TML           CE PL
        |               |               |               |
        |               |               |               | \
        |    MC Grp Join Req (McId)     |               | |
        |<--------------|---------------|---------------| | CE:PL Level multicast group
[TML    | tmlJoin(McId) |               |               | | join request sent to each
updates |-------------->|               |               | | FE:PL that needs to be part
MC grp  |      McId = {FE1_ChDes}       |               | > of a multicast group, McId,
info]   |               |               |               | | where McId specifies a
        |  <-- status   |               |               | | multicast group Id at the
        |               |               |               | | PL layer.
        |   MC Grp Join Rsp (status)    |               | |
        |---------------|---------------|-------------->| /
        |               |               |               |
        |               |               |               | \
        |               |               | tmlJoin(McId) | | TML updates multicast
        |               |               |<--------------| | group membership. PL is
        |               |      McId = {FE1_ChDes}       | > only aware of PL layer
        |               |               |               | | multicast group Id, that is,
        |               |               |  status -->   | | McId.
        |               |               |               | /

Figure 9: FE Joining Multicast Group
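
The FE-side part of the join can be sketched as follows. This is a
simplified model rather than real socket code: TipcNameSeq mirrors
the type/lower/upper fields of a TIPC name sequence, FeTmlMulticast
and its methods are hypothetical names, and using the FE Id as the
instance range is an assumption made for the example; the text only
states that the multicast group id is used as the type field.

```python
from typing import Dict, NamedTuple


class TipcNameSeq(NamedTuple):
    """Fields of a TIPC name sequence as used in a bind() call."""
    type: int    # the multicast group id is used directly here
    lower: int   # instance range; assumption: the FE's own Id
    upper: int


class FeTmlMulticast:
    """Tracks which multicast groups this FE TML is bound to."""

    def __init__(self, fe_id: int) -> None:
        self.fe_id = fe_id
        self.bound: Dict[int, TipcNameSeq] = {}

    def tml_join(self, mc_id: int) -> bool:
        # A real TML would create a socket and bind() it to this address.
        if mc_id in self.bound:
            return False  # assumption: joining twice is reported as failure
        self.bound[mc_id] = TipcNameSeq(mc_id, self.fe_id, self.fe_id)
        return True

    def tml_leave(self, mc_id: int) -> bool:
        # A real TML would close the socket bound for this group.
        return self.bound.pop(mc_id, None) is not None
```

The boolean result stands in for the "status" returned to the FE PL,
which the FE PL then reports to the CE PL in the join or leave
response.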
Multicast Scenario with FE1 leaving group: Group membership updated
to exclude FE1
     FE1 PL          FE1 TML             CE TML           CE PL
        |               |                   |               |
        |               |                   |               | \
        | MC Grp Leave Req (McId, FE1)      |               | |
        |<--------------|-------------------|---------------| | CE:PL Level multicast group
[TML    | tmlLeave(McId)|                   |               | | leave request sent to FE1:PL
removes |-------------->|                   |               | | that needs to be removed
MC grp  |           McId = {}               |               | > from multicast group, McId,
info]   |               |                   |               | | where McId specifies a
        |  <-- status   |                   |               | | multicast group Id at the
        |               |                   |               | | PL layer.
        |   MC Grp Leave Rsp (status)       |               | |
        |---------------|-------------------|-------------->| /
        |               |                   |               |
        |               |                   | tmlLeave(McId)| \ TML removes FE1 from
        |               |                   |<--------------| | multicast group McId.
        |               |          McId = {FE2_ChDes}       | > That leaves only FE2
        |               |                   |               | | in the group.
        |               |                   |  status -->   | |
        |               |                   |               | /

Figure 10: FE Leaving Multicast Group
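
The CE-side bookkeeping shown in Figures 9 and 10, together with the
unicast versus multicast behaviour of tmlWrite(), can be sketched as
below. The class CeTmlGroups and its method signatures are
hypothetical; in particular, passing the FE Id to tml_join and
tml_leave, and treating any Id that names a known group as multicast,
are assumptions made to keep the example self-contained.

```python
from typing import Dict, Hashable, List, Set, Tuple


class CeTmlGroups:
    """CE TML view of multicast group membership (illustrative only)."""

    def __init__(self) -> None:
        self.members: Dict[Hashable, Set[Hashable]] = {}

    def tml_join(self, mc_id: Hashable, fe_id: Hashable) -> None:
        # Called by the CE PL after the FE reported a successful join.
        self.members.setdefault(mc_id, set()).add(fe_id)

    def tml_leave(self, mc_id: Hashable, fe_id: Hashable) -> None:
        # Called by the CE PL after the FE reported a successful leave.
        self.members.get(mc_id, set()).discard(fe_id)

    def tml_write(self, dest_id: Hashable,
                  message: bytes) -> List[Tuple[Hashable, bytes]]:
        """Return the (FE, message) deliveries this write would cause."""
        if dest_id in self.members:
            # Multicast Id: one delivery per current group member.
            return [(fe, message) for fe in sorted(self.members[dest_id])]
        # Unicast Id: a single delivery to the FE behind that channel.
        return [(dest_id, message)]
```

For example, after FE1 and FE2 join group 100 and FE1 then leaves, a
tml_write(100, msg) reaches only FE2, while tml_write("FE1", msg)
still reaches FE1 over its unicast channel.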
3.1.16 Broadcast Model
3.1.17 Security Considerations
If the CE and FEs are in a single box and the network operator is
running them in a secured environment, TIPC can be run over raw
Ethernet without any security mechanisms activated. When the CEs and
FEs are running over IP networks or in an insecure environment, the
use of TIPC is not recommended for now.
3.2 IANA Considerations
3.3 Manageability
TBD: What needs to be added here ?
4. References
[ForCES] Doria, A., et al., "ForCES Protocol Specification",
September 2004,
<http://www.ietf.org/internet-drafts/draft-ietf-forces-protocol-00.txt>.
[RFC2026] Bradner, S., "The Internet Standards Process -- Revision
3", RFC 2026, BCP 9, October 1996,
<http://www.rfc-editor.org/rfc/rfc2026.txt>.
[RFC2104] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed-
Hashing for Message Authentication", RFC 2104,
February 1997,
<http://www.rfc-editor.org/rfc/rfc2104.txt>.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC2406] Kent, S. and R. Atkinson, "IP Encapsulating Security
Payload (ESP)", RFC 2406, November 1998,
<http://www.rfc-editor.org/rfc/rfc2406.txt>.
[RFC2408] Maughan, D., Schertler, M., Schneider, M., and J. Turner,
"Internet Security Association and Key Management
Protocol", RFC 2408, November 1998,
<http://www.rfc-editor.org/rfc/rfc2408.txt>.
[RFC2434] Narten, T. and H. Alvestrand, "Guidelines for Writing an
IANA Considerations Section in RFCs", RFC 2434, BCP 26,
October 1998, <http://www.rfc-editor.org/rfc/rfc2434.txt>.
[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6
(IPv6) Specification", RFC 2460, December 1998,
<http://www.rfc-editor.org/rfc/rfc2460.txt>.
[RFC2581] Allman, M., Paxson, V., and W. Stevens, "TCP Congestion
Control", RFC 2581, April 1999,
<http://www.rfc-editor.org/rfc/rfc2581.txt>.
[RFC2960] Stewart, R., Xie, Q., Morneault, K., Sharp, C.,
Schwarzbauer, H., Taylor, T., Rytina, I., Kalla, M.,
Zhang, L., and V. Paxson, "Stream Control Transmission
Protocol", RFC 2960, October 2000,
<http://www.rfc-editor.org/rfc/rfc2960.txt>.
[RFC3654] Khosravi, H., "Requirements for Separation of IP Control
and Forwarding", RFC 3654, November 2003,
<http://www.ietf.org/rfc/rfc3654.txt>.
[RFC3746] Yang, L., "Forwarding and Control Element Separation
(ForCES) Framework", RFC 3746, April 2004,
<http://www.ietf.org/rfc/rfc3746.txt>.
[RFC768] Postel, J., "User Datagram Protocol", RFC 768, STD 6,
August 1980, <http://www.rfc-editor.org/rfc/rfc768.txt>.
[RFC793] Postel, J., "Transmission Control Protocol", RFC 793,
STD 7, September 1981,
<http://www.rfc-editor.org/rfc/rfc793.txt>.
[TIPC] Maloy, J., "Transparent Inter Process Communication",
April 2004, <http://tipc.sourceforge.net>.
[TMLAPI] Salim, J., et al., "ForCES Transport Mapping Layer (TML)
Service Primitives and Encapsulations",
draft-jhs-forces-tmlapi-00.txt (work in progress),
April 2005.
Authors' Addresses
Jon Paul Maloy
Ericsson
Research Canada
8400, boul. Decarie
Ville Mont-Royal, Quebec H4P 2N2
Canada
Phone: +1 514 576-2150
Email: jon.maloy@ericsson.com
Jamal Hadi Salim
Znyx
195 Staford Road West,
Suite 104
Nepean, ON K2H 9C1
Canada
Phone: +1 613 596-1138
Email: hadi@znyx.com
Hormuzd M. Khosravi
Intel
2111 NE 25th Avenue,
Hillsboro, OR 97124
USA
Phone: +1 503 264-0334
Email: hormuzd.m.khosravi@intel.com
Furquan Ansari
Lucent
101 Crawford Corner Road,
Holmdel, NJ 07733
USA
Phone: +1 732 949-5249
Email: furquan@lucent.com
Chawla Suchi
Intel
2111 NE 25th Avenue,
Hillsboro, OR 97124
USA
Phone: +1 503 712-4539
Email: suchi.chawla@intel.com
Intellectual Property Statement
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Disclaimer of Validity
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Copyright Statement
Copyright (C) The Internet Society (2005). This document is subject
to the rights, licenses and restrictions contained in BCP 78, and
except as set forth therein, the authors retain all their rights.
Acknowledgment
Funding for the RFC Editor function is currently provided by the
Internet Society.