This document proposes minor backward-compatible changes to BGP that enable multiple sessions to be established between a pair of BGP speakers for different AFI/SAFI or groups thereof. It also describes a mechanism for handling each session in a separate process (for information only). This memo updates the BGP and Multiprotocol Extensions for BGP-4 specifications.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”
This Internet-Draft will expire on October 11, 2010.
Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
1. Introduction
2. Terminology
3. Interoperability with legacy implementations
4. Conformance requirement
5. Overview of operations
6. OPEN message handling
7. BGP connection collision handling modification
8. AFI/SAFI grouping
9. Discussion
10. Security considerations
11. IANA Considerations
12. Acknowledgments
13. References
13.1. Normative References
13.2. Informative References
Appendix A. Example of session handover
Author's Address
1. Introduction
Both network operators and vendors wish to separate the exchange of information for different AFI/SAFI into multiple sessions between the same pair of BGP speakers, particularly to allow various AFI/SAFI or groups thereof to be handled in different system processes. Two methods have already been proposed to achieve the desired functionality, as described in [I-D.ietf-idr-bgp-multisession] and [I-D.raszuk-ti-bgp]. Both of these methods rely on the introduction of new TCP ports in addition to the existing well-known BGP port, which is neither desirable from an operational perspective nor necessary from an implementation perspective (at least, no sufficient evidence has been made available to justify the need).
This memo describes an alternative approach that achieves the desired functionality without requiring additional TCP ports, and solicits discussion from the community with respect to the proposed changes.
2. Terminology
In addition to the commonly used keywords, this memo uses the following terminology:
3. Interoperability with legacy implementations
The mechanisms described in this memo rely on a backward-compatible modification of the Multiprotocol Extensions capability and a backward-compatible modification of BGP connection collision detection. When a BGP speaker detects a legacy peer (through information in the OPEN message), it simply behaves as a legacy BGP speaker towards that peer.
4. Conformance requirement
Implementations conforming to this specification MUST NOT enable the changes described here by default. They MUST provide a configuration option to explicitly enable the new functionality; the option SHOULD be available on a per-peer basis. Conforming implementations MUST implement both the new format of the multiprotocol extensions capability and the new procedure for connection collision detection.
5. Overview of operations
A BGP speaker initiates a session using the modified multiprotocol extensions capability in the OPEN message. If the session is the only session between a particular pair of BGP speakers (whether new or legacy), both sides continue according to the legacy specifications.
If a new session comes up when a session already exists, the BGP speaker attempts to detect whether the peer is a new or a legacy implementation (by looking at the OPEN message from the peer). If the peer is a new implementation, the BGP speaker continues behaving as a new implementation; otherwise it reverts to legacy behaviour. After connection collision detection has been performed and the BGP session established, the BGP speaker proceeds to exchange BGP messages with the peer in the same way as a legacy implementation.
Once a session is established, the BGP speaker is free to handle it within the same process as the one that accepted the connection, transfer it to another already running process, or create a new process and transfer the session to it. The exact mechanism of such handover is up to the implementation and is not part of this specification. However, if session handover is used, the implementation MUST (or SHOULD?) ensure the liveness of the process handling a particular session and restart the session (or the process, or both) if a problem is detected.
Note that there is no restriction on whether the initial session setup is performed by a single multiplexing process or by a process dedicated to a particular AFI/SAFI or group thereof. Appendix A (Example of session handover) of this memo contains information on how session handover could be implemented, but an actual implementation MAY choose a different approach.
6. OPEN message handling
The optional parameter relevant to multisession is the Multiprotocol Extensions capability, whose Capability Value field is defined by [RFC4760] as:
     0       7      15      23      31
     +-------+-------+-------+-------+
     |      AFI      | Res.  | SAFI  |
     +-------+-------+-------+-------+

Figure 1: Capability Value field
Currently, Section 8 of [RFC4760] defines the Reserved field as:
Res. - Reserved (8 bit) field. SHOULD be set to 0 by the sender and ignored by the receiver. Note that not setting the field value to 0 may create issues for a receiver not ignoring the field. In addition, this definition is problematic if it is ever attempted to redefine the field.
This document proposes the following modification:
The highest-order bit of the Reserved field MUST be set to 1 if the implementation supports multisession. An old implementation is expected to ignore the value of this field and will set it to zero when sending its own OPEN message.
If a multisession-capable router detects that the peer is an old implementation (Reserved field set to zero), the multisession-capable router MUST revert to the old behaviour (i.e. single session only) by using the connection collision detection procedure described in Section 6.8 of [RFC4271]. Note that reverting to the old behaviour means multisession is not possible between the given pair of BGP speakers.
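As an illustration only, the proposed use of the Reserved octet could be encoded and checked as in the following Python sketch. The function names and the constant MULTISESSION_BIT are hypothetical and do not come from this memo or from any BGP implementation. The rule in peer_is_legacy, which treats the peer as legacy as soon as any advertised AFI/SAFI has the bit clear, is an assumption; the memo does not say how mixed values should be handled.

```python
import struct

# Highest-order bit of the 8-bit Reserved field (hypothetical constant name).
MULTISESSION_BIT = 0x80


def encode_mp_capability_value(afi: int, safi: int, multisession: bool) -> bytes:
    """Build the 4-octet Capability Value: AFI (16 bits), Res. (8), SAFI (8)."""
    reserved = MULTISESSION_BIT if multisession else 0
    return struct.pack("!HBB", afi, reserved, safi)


def decode_mp_capability_value(value: bytes) -> tuple[int, int, bool]:
    """Return (AFI, SAFI, multisession flag) from a 4-octet Capability Value."""
    afi, reserved, safi = struct.unpack("!HBB", value)
    return afi, safi, bool(reserved & MULTISESSION_BIT)


def peer_is_legacy(values: list[bytes]) -> bool:
    """Assumption: any capability without the bit marks the peer as legacy."""
    return not all(decode_mp_capability_value(v)[2] for v in values)
```

A legacy sender produces a zero Reserved octet, so decoding its capability yields a false flag and the speaker falls back to the procedure of Section 6.8 of [RFC4271].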
7. BGP connection collision handling modification
The modifications described in this section apply only when a router operates as a new implementation towards a particular peer; otherwise the original Section 6.8 of [RFC4271] applies.
If the AFI/SAFI lists of the OPEN messages do not overlap, the new session proceeds as if no other session existed, i.e. the new session is established.
If the AFI/SAFI lists of the OPEN messages match exactly, the original procedure from Section 6.8 of [RFC4271] MUST be used.
If the AFI/SAFI list of a new incoming OPEN message is a subset of that of an existing session, even one in the Established state (i.e. the lists overlap but the new list contains fewer AFI/SAFI), then the behaviour is as follows:
Alternative 1: The receiving side drops the incoming session and initiates a new session for those AFI/SAFI which are found in the just-dropped session but not in the old one. The receiver of this session (i.e. the originator of the just-collided session) MAY accept this proposal or drop it. The receiver SHOULD NOT attempt to send an OPEN message with the list that caused the original collision (this would cause unnecessary stress). As a special case, this scenario may result in keeping the old session and blocking the new one until the operator intervenes.
Alternative 2: The receiving side drops the incoming session and sends an OPEN for the combined AFI/SAFI of both the existing and the just-dropped session. If the originator of the just-collided session accepts this proposal, the existing session is dropped but already exchanged prefixes are retained (possibly via graceful restart).
Alternative 3: The receiving side accepts the proposal and gracefully transfers the overlapping AFI/SAFI from the old session to the new one (using the graceful restart mechanism).
While alternatives 2 and 3 may sound outlandish, they may in fact come in handy when an operator needs to regroup AFI/SAFI, as they avoid an outage during the migration.
If the AFI/SAFI list of a new incoming OPEN message is a superset of that of an existing session, even one in the Established state (i.e. the lists overlap but the new list contains more AFI/SAFI), and the receiving BGP speaker is configured to support all AFI/SAFI involved, then the behaviour is as follows:
The receiving side proceeds with the new session, then gracefully transfers routing information from the old session to the new one and closes the old connection.
If the AFI/SAFI list of a new incoming OPEN message partially overlaps with that of an existing session, even one in the Established state, but is neither a subset nor a superset of it, then the behaviour is as follows:
Alternative 1: The receiver accepts the proposal and gracefully transfers the overlapping AFI/SAFI from the old session to the new one (possibly via graceful restart).
Alternative 2: The receiver drops the incoming session and instead proposes three separate sessions: the first with the AFI/SAFI unique to the existing session, the second with the AFI/SAFI common to both sessions, and the third with the AFI/SAFI unique to the just-dropped session.
Alternative 3: The receiver drops the incoming session. The originator of the just-dropped session SHOULD NOT make further attempts to establish a session with the same AFI/SAFI list until the operator intervenes.
DISCUSSION: The choice of alternatives is open for discussion, particularly whether a single method should be chosen or a mechanism should be introduced to signal which method is in use.
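The five cases distinguished in this section can be told apart with plain set comparisons. The following Python sketch is illustrative only: the names Overlap, classify and three_way_split are hypothetical, an AFI/SAFI pair is represented as an (AFI, SAFI) tuple, and the sketch only identifies which case applies without choosing among the alternatives listed for that case.

```python
from enum import Enum


class Overlap(Enum):
    DISJOINT = "no overlap: establish the new session"
    EQUAL = "exact match: use RFC 4271 Section 6.8"
    SUBSET = "new list is a subset of the existing one"
    SUPERSET = "new list is a superset of the existing one"
    PARTIAL = "partial overlap, neither subset nor superset"


def classify(new: frozenset, existing: frozenset) -> Overlap:
    """Compare the AFI/SAFI set of an incoming OPEN with an existing session."""
    if new.isdisjoint(existing):
        return Overlap.DISJOINT
    if new == existing:
        return Overlap.EQUAL
    if new < existing:  # proper subset
        return Overlap.SUBSET
    if new > existing:  # proper superset
        return Overlap.SUPERSET
    return Overlap.PARTIAL


def three_way_split(new: frozenset, existing: frozenset):
    """Sets for the three sessions of Alternative 2 in the partial-overlap case:
    unique to the existing session, common to both, unique to the new one."""
    return existing - new, existing & new, new - existing
```

For example, with an existing session carrying IPv4 and IPv6 unicast, (1, 1) and (2, 1), an incoming OPEN listing only (2, 1) is classified as a subset, while one listing (2, 1) and (1, 128) is a partial overlap.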
8. AFI/SAFI grouping
This specification does not explicitly define a grouping capability. However, none is required for a conforming implementation to achieve grouping, as long as the (S)AFI combinations in groups are not hardcoded. The grouping capability is implied by specifying more than one AFI/SAFI: if a BGP speaker supports and is configured to group a particular set of AFI/SAFI, it simply advertises only that set in the OPEN message, and each session advertises a different set of AFI/SAFI.
In order to ensure interoperability, a BGP implementation SHOULD NOT impose a particular (S)AFI grouping at coding time; otherwise it SHOULD provide at least a one-(S)AFI-per-session alternative.
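A configuration-time grouping could be turned into per-session advertisements roughly as follows. This is a sketch under assumptions: the function name plan_sessions and the group names are hypothetical, and the check that no AFI/SAFI appears in two groups is an assumption drawn from the statement that each session advertises a different set, not an explicit requirement of this memo.

```python
Family = tuple[int, int]  # (AFI, SAFI), e.g. (1, 1) for IPv4 unicast


def plan_sessions(groups: dict[str, set[Family]]) -> dict[str, list[Family]]:
    """Return, per configured group, the AFI/SAFI list its session advertises.

    Raises ValueError if an AFI/SAFI is configured in more than one group
    (an assumed restriction, see the lead-in).
    """
    owner: dict[Family, str] = {}
    for name, families in groups.items():
        for family in families:
            if family in owner:
                raise ValueError(
                    f"{family} is in both {owner[family]!r} and {name!r}"
                )
            owner[family] = name
    # Each session advertises only the families of its own group.
    return {name: sorted(families) for name, families in groups.items()}
```

Because the groups are ordinary configuration data, regrouping requires no code change, which is in line with the recommendation not to fix the grouping at coding time.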
9. Discussion
This specification allows per-AFI/SAFI or per-group BGP sessions to be implemented using different host system processes without the introduction of new TCP ports.
Some parts of this specification describe several alternative behaviours. The community is requested to discuss which of the alternatives should be adopted, and whether or not this specification should contain a formal definition of the BGP FSM changes in the format used by [RFC4271].
This specification relies on the availability of a mechanism to pass a TCP session from one process to another. Many modern operating systems provide such functionality. The mechanism may differ significantly from system to system, but from the protocol perspective they all achieve the desired result. The author of this memo does not rule out the possibility that a BGP implementation exists that uses an operating system without such facilities. In such cases, it is believed that the BGP implementer should work together with the operating system implementers to provide the required functionality, because this approach avoids penalising other protocol implementers and users.
Some network operators or BGP implementers may wish to restrict which AFI/SAFI are handled by a specific process or how AFI/SAFI are grouped into multiple sessions. This specification allows this without imposing a restriction on the protocol itself, while permitting interoperability between implementations that impose restrictions and implementations that do not. However, if two implementations insist on non-matching groupings of AFI/SAFI, then a BGP session cannot be established. Therefore it is recommended that AFI/SAFI grouping be left to configuration time. The community is requested to review whether this poses a practical problem and how alternative multisession implementations would handle such a situation.
10. Security considerations
The changes proposed in this document do not introduce any new security concerns to BGP itself.
If a BGP implementation involves handover of BGP session(s) between processes, the interprocess communication is subject to the security model and threats of the host operating system.
11. IANA Considerations
The changes to BGP proposed by this memo do not require any new allocations from IANA.
12. Acknowledgments
The author would like to thank Robert Raszuk for valuable comments and for help with translation of the document to XML format.
13. References
13.1. Normative References
[RFC2119] Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” BCP 14, RFC 2119, March 1997.
[RFC4271] Rekhter, Y., Li, T., and S. Hares, “A Border Gateway Protocol 4 (BGP-4),” RFC 4271, January 2006.
[RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter, “Multiprotocol Extensions for BGP-4,” RFC 4760, January 2007.
13.2. Informative References
[I-D.ietf-idr-bgp-multisession] Scudder, J. and C. Appanna, “Multisession BGP,” draft-ietf-idr-bgp-multisession-05 (work in progress), March 2010.
[I-D.raszuk-ti-bgp] Raszuk, R. and K. Patel, “Transport Instance BGP,” draft-raszuk-ti-bgp-01 (work in progress), March 2010.
Appendix A. Example of session handover
When a BGP implementation runs on a host operating system that is similar to BSD, the session handover can use the socket-passing technique described by W. Richard Stevens.
The basic idea is to establish a Unix-domain socket connection between the process that initially accepts or creates the TCP connection and the process that will actually handle it. For a BGP implementation, the following approach could be used (though others may exist):
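Purely as an illustration of descriptor passing over a Unix-domain socket, and not as the procedure intended by this memo, the handover might be sketched in Python (3.9 or later, Unix-like systems only) as shown here. The functions hand_over, take_over and demo are hypothetical. To stay self-contained, demo runs in a single process and uses a socket pair as a stand-in for the accepted TCP connection; in a real speaker the descriptor would belong to a connection on the BGP port and the receiving end would live in a separate process.

```python
import socket


def hand_over(channel: socket.socket, conn: socket.socket, note: bytes) -> None:
    """Send conn's descriptor, plus a short note, over a Unix-domain socket."""
    socket.send_fds(channel, [note], [conn.fileno()])
    conn.close()  # the receiver holds its own copy of the descriptor


def take_over(channel: socket.socket) -> tuple[bytes, socket.socket]:
    """Receive one descriptor and wrap it in a socket object again."""
    note, fds, _flags, _addr = socket.recv_fds(channel, 1024, 1)
    return note, socket.socket(fileno=fds[0])


def demo() -> bytes:
    # Channel between the accepting process and the handling process.
    acceptor, handler = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    # Stand-in for the BGP connection: 'conn' is our end, 'peer' the remote end.
    conn, peer = socket.socketpair()
    hand_over(acceptor, conn, b"ipv4-unicast")
    note, session = take_over(handler)
    session.sendall(b"handled: " + note)  # the new owner talks to the peer
    reply = peer.recv(64)
    for s in (session, peer, acceptor, handler):
        s.close()
    return reply
```

The note sent alongside the descriptor is one possible way to tell the receiving process which AFI/SAFI group the session belongs to; any state already learned from the OPEN exchange would have to be conveyed in a similar manner.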
Author's Address
Ilya Varlashkin
Easynet Global Services
Harburger Schlossstr. 1
Hamburg, 21079
Germany
Email: ilya.varlashkin@de.easynet.net