INTERNET-DRAFT expires 6/1/2006
draft-kilsdonk-redundant-tcp-01.txt
Network Working Group D. Kilsdonk
Internet-Draft:
June 13, 2005
Redundant TCP
By submitting this Internet-Draft, each author represents that any applicable patent or
other IPR claims of which he or she is aware have been or will be disclosed, and any of
which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups. Note that other groups
may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months and
may be updated, replaced, or obsoleted by other documents at any time. It
is inappropriate to use Internet-Drafts as reference material or to cite
them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html
Abstract
This memo documents a process which can be used to avoid network
outages during the upgrading of a router's software. In this way,
disruptions to the network community are avoided. The upgrade process
definition comprises requirements for inserting a replacement router
or upgrading the software in a working router within an operational
network. To do this, it maximizes the use of accepted internet
standards with minimal alteration.
RFC DONK Redundant TCP June 2005
Table of Contents
1.0 INTRODUCTION
2.0 MINIMUM REQUIREMENTS
2.1 REDUNDANT CONFIGURATION
2.2 AWARENESS OF REDUNDANT CONFIGURATION
2.3 AWARENESS OF STATE OF REDUNDANT COUNTERPART
3.0 REDUNDANT BOOTUP
4.0 TIME SYNCHRONIZED CASE
4.1 REDUNDANT OPERATIONS
4.1.1 OPERATOR INITIATED CONFIGURATION CHANGES
4.1.2 BACKPLANE INITIATED CONFIGURATION CHANGES
4.1.2.1 OPERATIONAL-STATION DISTINCT PROCESSING
4.1.2.2 STANDBY-STATION DISTINCT PROCESSING
4.1.2.2.1 STANDBY MAINTENANCE OF TCP REDUNDANCY
5.0 UNSYNCHRONIZED CASE
5.1 STANDBY DATA SYNCHRONIZATION
5.2 STANDBY TIMING SYNCHRONIZATION
6.0 ALTERNATIVES TO THIS APPROACH
7.0 OTHER BENEFITS
8.0 CRYPTOGRAPHY
9.0 REFERENCES
10.0 DEFINITIONS OF TERMS
11.0 AUTHOR'S ADDRESS
APPENDIX A: GLOSSARY OF ACRONYMS
1. INTRODUCTION
Disruption in the network community arises whenever a router or
endstation crashes or is brought down for maintenance, usually to
perform a software upgrade.
In many cases, this disruption propagates through the network,
causing other disturbances. To minimize these disruptions, this memo
presents a practical operational framework based on a generally
accepted model of a simple, generic router, making maximal use of
existing protocol specifications with minimal change.
2. MINIMUM REQUIREMENTS
This approach uses redundancy to support ongoing operations while
critical equipment is upgraded or replaced. To facilitate this, a
minimum level of functionality must be present beyond simply having
two like-designed pieces of equipment. To the extent that these
requirements are met, the switchover will be seamless. Where they are
only partly met, the switchover disruption is reduced in proportion
to the degree to which they are met.
2.1 REDUNDANT CONFIGURATION
To support switchover and redundancy in a meaningful sense, the two
pieces of equipment must share physical connectivity with all
neighbors and must have the same IP and machine addresses for all
general traffic ports. This is, in effect, the definition of a
redundant configuration. Each station also has a distinct management
IP address of which the other is made aware (in-band management). One
station is designated "operational," and its redundant counterpart is
henceforth in "standby" mode. (Where redundant addressing cannot be
achieved at the machine layer, the standby station must be put in
promiscuous mode and, after switchover, must transmit an unsolicited
ARP response correlating the extant IP address with the new machine
address for the benefit of all listening peers.)
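The unsolicited ARP response mentioned in the parenthetical above can
be sketched as follows. This is a minimal illustration for Ethernet
and IPv4 only; the function name and the choice of a broadcast target
address are assumptions of this example, not requirements of this
memo. The sketch only builds the frame; placing it on the wire (for
example through a raw socket) is not shown.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6


def unsolicited_arp_reply(new_mac, shared_ip):
    """Build an Ethernet frame carrying an unsolicited ARP reply.

    new_mac   -- 6-byte machine address of the former standby
    shared_ip -- dotted-quad IPv4 address shared by both stations
    """
    ip = socket.inet_aton(shared_ip)
    # Ethernet header: destination, source, EtherType 0x0806 (ARP).
    ethernet = struct.pack("!6s6sH", BROADCAST_MAC, new_mac, 0x0806)
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,        # hardware type: Ethernet
        0x0800,   # protocol type: IPv4
        6, 4,     # hardware and protocol address lengths
        2,        # opcode: reply
        new_mac, ip,          # sender: new machine address, extant IP
        BROADCAST_MAC, ip,    # target: all listening peers (assumed)
    )
    return ethernet + arp
```

The sender fields carry the pairing that listening peers are expected
to record: the extant IP address together with the new machine address.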
The degree to which both stations exhibit truly redundant operations
can be gauged by how well they share current information. In lieu of
shared memory, this approach relies on the information already
available on the network to keep both stations current.
2.2 AWARENESS OF REDUNDANT CONFIGURATION
The operational and standby equipment must support two indicators.
One indicates that a redundant configuration exists (i.e., a redundant
counterpart is present), while the other indicates that the station is
functioning properly.
For example, a station may indicate that a backup station exists
through the human interface, whether by a command issued at the
command-line interface (CLI) or by an SNMP setting, or more simply
through a bit set to "one" by the station (I_AM_HERE). In the latter
case, the bit must be readable by the station's redundant counterpart.
2.3 AWARENESS OF STATE OF REDUNDANT COUNTERPART
A generic router typically has an operational-signal (LED) which
indicates that its software is actively performing the routing function.
The router's software turns this signal off during exception handling,
such as after a software crash, thereby indicating a software
malfunction (possibly brought on by a hardware fault). An
operator-initiated deactivation also resets this operational-signal.
This indicator must be readable by the station's redundant counterpart.
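The two indicators of paragraphs 2.2 and 2.3 can be pictured as bits
in a small status register that each station exposes to its
counterpart. The following sketch is illustrative only. The register
layout, the OPERATIONAL bit name, and the class and method names are
assumptions of this example; only I_AM_HERE is named in this memo.

```python
# Illustrative sketch; the layout and all names other than I_AM_HERE
# are assumptions made for this example.
I_AM_HERE = 0x01      # set by a station to announce its presence (2.2)
OPERATIONAL = 0x02    # mirrors the operational-signal (LED) of 2.3


class StatusRegister:
    """Status bits that a station's redundant counterpart can read."""

    def __init__(self):
        self.bits = 0

    def announce_presence(self):
        self.bits |= I_AM_HERE

    def set_operational(self, on):
        # Cleared by the exception handler on a crash, or by an
        # operator-initiated deactivation.
        if on:
            self.bits |= OPERATIONAL
        else:
            self.bits &= ~OPERATIONAL

    def is_present(self):
        return bool(self.bits & I_AM_HERE)

    def is_operational(self):
        return bool(self.bits & OPERATIONAL)
```

A counterpart that reads is_present() as true and is_operational() as
false would conclude that its partner exists but is not performing the
routing function.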
3. REDUNDANT BOOTUP
As each station boots, if it detects the presence of a redundant
counterpart, it immediately signals that it is operationally unready.
It then waits until it detects the same signal from its counterpart.
Once it does, it signals readiness and waits to detect readiness from
its counterpart, until either readiness is detected or a timeout
expires. This timeout may be as great as the station's default TCP
timeout, but for practical purposes it should be set to two minutes.
If this step times out, the station is designated "standby" and
"unsynchronized" but may otherwise proceed with its bootup.
If this step does not time out, the two stations are time-synchronized
with each other, and each may proceed with its bootup in much the same
way it would without redundancy concerns. The station with the lower
management IP address is designated "operational". The station with
the higher management IP address is designated "standby."
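The readiness wait and the role assignment described above can be
sketched as follows. This is a minimal illustration, not a definitive
implementation: the function names, the polling interval, and the
peer_is_ready callable (standing in for a read of the counterpart's
readiness indicator) are assumptions of this example. Only the final
readiness wait is modeled.

```python
import ipaddress
import time

BOOT_TIMEOUT_SECONDS = 120  # the two-minute value suggested above


def wait_for(condition, timeout, poll_interval=0.1):
    """Poll condition() until it is true or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


def elect_role(own_mgmt_ip, peer_mgmt_ip):
    """The lower management IP address becomes the operational station."""
    own = ipaddress.ip_address(own_mgmt_ip)
    peer = ipaddress.ip_address(peer_mgmt_ip)
    if own == peer:
        raise ValueError("management IP addresses must be distinct")
    return "operational" if own < peer else "standby"


def boot_state(own_mgmt_ip, peer_mgmt_ip, peer_is_ready,
               timeout=BOOT_TIMEOUT_SECONDS):
    """Return (role, synchronization state) at the end of the handshake."""
    if not wait_for(peer_is_ready, timeout):
        # The counterpart never signalled readiness: late-joiner case.
        return ("standby", "unsynchronized")
    return (elect_role(own_mgmt_ip, peer_mgmt_ip), "synchronized")
```

The addresses are compared numerically rather than as strings, so that
10.0.0.2 is treated as lower than 10.0.0.10.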
4. TIME SYNCHRONIZED CASE
4.1 REDUNDANT OPERATIONS
Operations at the station generally consist of two parts. The first
is operator-initiated configuration changes, such as adding a static
route. To maintain pure redundancy, it is imperative that such a
change be instantiated at both stations at the same time. The second
part comprises configuration changes, such as routing table updates,
that originate from external sources (peer routers) connected on the
general switching ports (backplane redundancy).
4.1.1 OPERATOR INITIATED CONFIGURATION CHANGES
Configuration changes and commands issued at the station must be
conveyed to the other station. For a command-line interface, this is
done simply by establishing a telnet session to the redundant
counterpart's management IP address and echoing all commands as they
are entered. The responses to these commands may be ignored.
Configuration changes and commands issued directly to the station via
SNMP may be conveyed by copying each SNMP packet upon its reception by
the station and sending the copy to the redundant counterpart (via
its distinct management IP address).
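The echoing of operator commands can be sketched as a thin wrapper
around the station's own command handler. The class and parameter
names are hypothetical. The send_to_peer callable stands in for a
write to the telnet session held open to the counterpart's management
IP address, and the choice to continue locally when the counterpart
is unreachable is an assumption of this example, not a rule stated in
this memo.

```python
class CommandMirror:
    """Echo operator commands to the redundant counterpart."""

    def __init__(self, execute_locally, send_to_peer):
        self._execute_locally = execute_locally
        self._send_to_peer = send_to_peer

    def run(self, command):
        # Forward first so both stations see the command at nearly the
        # same time; the counterpart's response is deliberately ignored.
        try:
            self._send_to_peer(command)
        except OSError:
            # Assumption: an unreachable counterpart does not block
            # the operator's command on this station.
            pass
        return self._execute_locally(command)
```

The same wrapper shape would apply to SNMP, with the received packet
copied to the counterpart before it is processed locally.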
4.1.2 BACKPLANE INITIATED CONFIGURATION CHANGES
The processing of backplane-initiated configuration changes is
predicated on the operational-signal of the redundant counterpart.
Each time a packet is sent, this bit is read to determine whether the
counterpart is functioning as the operational station. If this
station's redundant counterpart is not the "standby" station, it may
be safely assumed that this station is the "standby".
4.1.2.1 OPERATIONAL-STATION DISTINCT PROCESSING
The operational station operates in the same manner as in the
non-redundant configuration case. The only modification to the
operational software is to detect, with each packet to be transmitted,
whether or not it is in operational mode. Because its software is
minimally altered, its inherent reliability is not diminished by the
introduction of redundancy functionality.
4.1.2.2 STANDBY-STATION DISTINCT PROCESSING
The standby station executes the same software as the operational
station. However, in layer one of network handling, it does not
transmit its packet but returns a status indicating that the packet
was transmitted successfully. The net effect is the same, inasmuch as
the operational station DID send the packet. Both stations may now
await an acknowledgment (in TCP parlance).
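The per-packet check and the suppressed transmission of the standby
can be sketched as a single send hook at the lowest layer. The class
name, the peer_is_operational callable (standing in for a read of the
counterpart's operational-signal), and the suppressed counter are
assumptions of this example.

```python
class TransmitGate:
    """Lowest-layer send hook shared by both stations."""

    def __init__(self, peer_is_operational, wire_send):
        self._peer_is_operational = peer_is_operational
        self._wire_send = wire_send
        self.suppressed = 0   # packets withheld while in standby

    def send(self, packet):
        # Read the counterpart's operational-signal on every packet.
        if self._peer_is_operational():
            # This station is the standby: withhold the packet but
            # report success so the upper layers proceed unchanged.
            self.suppressed += 1
            return True
        # The counterpart is not operational, so this station
        # transmits the packet itself.
        self._wire_send(packet)
        return True
```

Because the indicator is read on every call, a station that was
withholding packets begins transmitting with the first packet sent
after its counterpart's operational-signal is turned off.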
4.1.2.2.1 STANDBY MAINTENANCE OF TCP REDUNDANCY
The station may be seen as an Input/Output (I/O) engine whose behavior
depends only on packet timing. We may then work at that timing
granularity. Initial timing synchronization was achieved in paragraph
3. Packets sent from backplane stations will tend to re-calibrate the
two stations to the same packet timing. Finally, the TCP timeout window
is usually very large, which adds to our confidence that the stations'
configurations can be kept redundant.
Since the timing granularity is only held to within one packet time,
it is possible that an acknowledgment is seen at the standby before the
sequence to be acknowledged has been "sent." Therefore, the TCP-input
algorithm (but not its interface) must be modified so that, if this
condition is detected, the acknowledgment packet is buffered and
recycled into the input stream after TCP outputs the sequence that
the acknowledgment covers.
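The buffering of an early acknowledgment can be sketched as follows.
This is a simplified illustration under stated assumptions: the class
and method names are hypothetical, process_ack stands in for the
unmodified TCP-input routine, and sequence numbers are treated as
plain integers, so 32-bit wraparound is ignored.

```python
class StandbyTcpInput:
    """Hold acknowledgments that arrive before their data is 'sent'."""

    def __init__(self, process_ack, initial_seq=0):
        self._process_ack = process_ack   # unmodified TCP-input routine
        self.snd_nxt = initial_seq        # next sequence number to output
        self._held = []                   # early ACK numbers, in order

    def on_ack(self, ack_no):
        if ack_no > self.snd_nxt:
            # Acknowledges data this station has not yet "sent".
            self._held.append(ack_no)
        else:
            self._process_ack(ack_no)

    def on_output(self, length):
        # Called after TCP outputs `length` bytes of sequence space.
        self.snd_nxt += length
        ready = [a for a in self._held if a <= self.snd_nxt]
        self._held = [a for a in self._held if a > self.snd_nxt]
        for ack_no in ready:
            # Recycle the held acknowledgment into the input stream.
            self._process_ack(ack_no)
```

Only the internals of the input path change in this sketch; callers
still hand acknowledgments to on_ack in the usual way.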
Now, should the operational station become defective, its exception
handler will mark it as "standby". Since its redundant counterpart
checks this indicator every time it sends a backplane packet,
switchover of operations to what was the standby can proceed within
one packet time. All TCP-based operations will continue, since
protocols such as BGP run on TCP.
The crashed or operator-terminated station may now have its software
upgraded. When it reboots, it will attempt the synchronization steps
of paragraph 3 and will time out (since the other station is
continuing in operational mode), causing the newly introduced station
to be designated "unsynchronized." As an unsynchronized standby
processor, its transmissions are never allowed onto the backplane
connections.
5. UNSYNCHRONIZED CASE
In the unsynchronized case, the station has failed to complete the
steps of paragraph 3. It can be assumed that the failure occurred
because this station's redundant counterpart was already operational
and therefore did not collaborate in the time synchronization and
testing (toggling) of its operational-indicator. This is the case of
a station joining the redundancy configuration late and is therefore
regarded as the "late-joiner" case.
The same requirements apply to the late-joiner. The operational
station can detect the presence of the late-joiner through the means of
paragraph 2.3. What remains is for the standby station to achieve data
and timing synchronization. These steps may be processor-intensive
and should therefore be scheduled for a time at which critical network
operations are expected to be at a minimum. In this way,
configuration changes during the steps of time and data
synchronization are also minimized.
5.1 STANDBY DATA SYNCHRONIZATION
The standby continues its bootup and then issues an SNMP MIB WALK
(GET), starting at object number one, to the operational station's
management IP address. This returns all objects comprising the
operational station's MIB definition. On reception, the standby must
convert the GET_RESULT to a SET command and issue it to itself.
Typically, routers do not issue SNMP-GETs to other routers. Nothing in
the SNMP RFC precludes this, however. The operational station may
assume that an SNMP-MIB WALK ALL from the standby station indicates
that a late-joiner synchronization is in process.
Also, some SNMP MIB objects are read-only. This prevents operators
from setting data that reflect configuration items not under operator
control, such as the system-uptime of the router. While this
protection is worthwhile in itself, the standby may trust that the
operational station reports this data faithfully. The data at the
operational station therefore has the same integrity as if the standby
had obtained it directly from its source. Accordingly, during data
synchronization by the standby, SNMP MIB objects that are read-only
may be written in the standby. This is a manageable, very specific
software change. The result for system-uptime, for example, is that
the uptime reflects the uptime of the system as opposed to that of the
individual station.
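The conversion of walk results into SETs, including the relaxation
for read-only objects during synchronization, can be sketched as
follows. No SNMP library is used here: the walk result is modeled as
plain (OID, value) pairs, and the class, method, and flag names are
assumptions of this example.

```python
def walk_to_sets(walk_result):
    """Convert (oid, value) pairs from a MIB walk into SET requests."""
    return [("SET", oid, value) for oid, value in walk_result]


class StandbyMib:
    """Toy MIB store for the standby station."""

    def __init__(self, read_only_oids):
        self._read_only = set(read_only_oids)
        self.values = {}

    def apply_set(self, oid, value, from_sync=False):
        # Read-only objects reject ordinary SETs but accept them
        # during late-joiner data synchronization.
        if oid in self._read_only and not from_sync:
            raise PermissionError(oid + " is read-only")
        self.values[oid] = value

    def synchronize(self, walk_result):
        for _, oid, value in walk_to_sets(walk_result):
            self.apply_set(oid, value, from_sync=True)
```

For example, a walk result containing sysUpTime (1.3.6.1.2.1.1.3.0)
would be accepted by synchronize(), while an ordinary apply_set() on
the same object would still be refused.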
Finally, files used to contain configuration information may now be
obtained by the standby station by issuing appropriate FTP commands
to the operational station's management IP address. All that remains
for complete redundancy at this point is timing synchronization of
the backplane.
5.2 STANDBY TIMING SYNCHRONIZATION
Once data synchronization is achieved, the two stations re-establish
telnet connectivity for in-band management redundancy. The first
command sent initiates timing synchronization. At this point, both
stations state that timing synchronization does not exist, and both
are therefore considered operationally unready. Since neither is
operational during timing resynchronization, neither transmits packets
to the backplane. Furthermore, TCP sessions are brought down and
re-established so that initial sequence numbers may be established
consistently in both stations. Initial TCP sequence numbers can be
unrandomized and statically based on the IP addresses of the router
peer to which connectivity is sought. This is because peer routers are
typically in the same IP domain, usually point-to-point connected, and
seldom more than one hop away. Because no routing is needed between
them, no routing loops arise and no transient packets are left
wandering the network, especially during a catastrophic failure, and
thus the need to randomize the initial sequence numbers is obviated.
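One way to derive a static initial sequence number that both stations
compute identically is sketched below. This memo does not specify the
derivation. The use of a SHA-256 digest over the local and peer
addresses, and the function name, are assumptions of this example.

```python
import hashlib
import ipaddress


def static_isn(local_ip, peer_ip):
    """Derive a 32-bit initial sequence number from two IP addresses."""
    material = (ipaddress.ip_address(local_ip).packed
                + ipaddress.ip_address(peer_ip).packed)
    digest = hashlib.sha256(material).digest()
    # Keep the first four bytes so the result fits TCP's 32-bit
    # sequence number space.
    return int.from_bytes(digest[:4], "big")
```

Because the two redundant stations share the same address on their
general traffic ports, each computes the same value for a given peer
without exchanging any state.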
Alternatively, during the SNMP-GET-ALL step of data synchronization
(paragraph 5.1), should the MIB for TCP contain current sequence
numbers, backplane activity can be suspended until the operational
station performs an SNMP-GET of the standby's TCP sequence numbers to
ensure that the two sets are coherent before allowing network activity
on the backplane to proceed.
6. ALTERNATIVES TO THIS APPROACH
To preclude outages due to a hardware-oriented catastrophic event,
designers have produced dual-station systems that stay synchronized by
using a common clock and shared memory. This is a very expensive
solution, and it does not address problems due to a hardware
malfunction of the clock. If a hardware-oriented problem manifests
itself as a memory location or semaphore "stuck high" in the shared
memory, it becomes a problem for the standby almost immediately.
Finally, approaches of this kind require timing synchronization in
the megahertz range, when the inputs and outputs of the system need
only be true at packet-time granularity.
7. OTHER BENEFITS
Given inherently reliable hardware and software, the investment in
redundancy is, one hopes, never needed; in a theoretical 100%
reliability case, the standby station would never be used. The
standby can therefore be put to other uses, such as backplane
debugging, in which the standby acts as a sniffer to promiscuously
record and display packets appearing on backplane links. This can be a
valuable asset in tuning and debugging the network. If the station
has enough throughput, the sniffing can be done simultaneously with
the functionality already presented.
8. CRYPTOGRAPHY
When cryptography is imposed by the station operator, it will likewise
be invoked on the redundant station by way of the telnet echo of the
same commands. Specialized cryptography may also be created to
facilitate this echoing and the echoing of SNMP management requests to
the management port.
9. REFERENCES
o Structure and Identification of Management Information for
TCP/IP-based Internets, RFC 1155, M. Rose, K. McCloghrie
o Management Information Base for Network Management of TCP/IP-based
Internets, RFC 1156, K. McCloghrie, M. Rose
o SNMPv2 Management Information, RFC 2013, K. McCloghrie
o Simple Network Management Protocol (SNMP), RFC 1157, J. Case,
M. Fedor, M. Schoffstall, C. Davin
o Management Information Base for Network Management of TCP/IP-based
Internets: MIB-II, RFC 1158, M. Rose
o Transmission Control Protocol, RFC 793, J. Postel
o Requirements for Internet Hosts--Communication Layers, RFC 1122,
R. Braden, ed.
o TCP Extensions for High Performance, RFC 1323, V. Jacobson, R. Braden,
D. Borman
o Internet Protocol, RFC 791, J. Postel
o Ethernet Address Resolution Protocol, RFC 826, D. Plummer
10. DEFINITIONS OF TERMS
Internet address: A 32-bit address assigned to hosts using TCP/IP.
Operational station: Of two completely redundant stations, the station
actually transmitting packets.
protocol: A formal description of messages to be exchanged and rules
to be followed for two or more systems to exchange information.
router: A system responsible for making decisions about which of
several paths network (or Internet) traffic will follow. To do this
it uses a routing protocol to gain information about the network, and
algorithms to choose the best route based on several criteria known
as "routing metrics." In OSI terminology, a router is a Network
Layer intermediate system.
Standby Station: Of two completely redundant stations, the station which
behaves as though it is transmitting packets but is prevented from
doing so at the network driver layer. It relies on the operational
station's transmission.
Telnet: The virtual terminal protocol in the Internet suite of
protocols. Allows users of one host to log into a remote host and
interact as normal terminal users of that host.
Unsynchronized station: A station unable to complete time synchronization
with its redundant counterpart.
11. AUTHOR'S ADDRESS
Daniel D. Kilsdonk
65 Lake Shore Drive North
Westford, MA 01886
Phone: (978) 692-3383
EMail: dan@prospeed.net
APPENDIX A: GLOSSARY OF ACRONYMS
FTP: File Transfer Protocol. The Internet protocol (and program)
used to transfer files between hosts. See FTAM.
IP: Internet Protocol. The network layer protocol for the Internet
protocol suite.
IS-IS: Intermediate system to Intermediate system protocol. The OSI
protocol by which intermediate systems exchange routing information.
MIB: Management Information Base. A collection of objects that can
be accessed via a network management protocol.
OSPF: Open Shortest Path First. A "Proposed Standard" IGP for the
Internet. See IGP.
SNMP: Simple Network Management Protocol. The network management
protocol of choice for TCP/IP-based internets.
TCP: Transmission Control Protocol. The major transport protocol in
the Internet suite of protocols providing reliable, connection-
oriented, full-duplex streams. Uses IP for delivery. See TP4.
INTERNET-DRAFT expires 6/13/2006
Copyright (C) The Internet Society (2005).
This document is subject to the rights, licenses and restrictions contained in BCP 78,
and except as set forth therein, the authors retain all their rights.
This document and the information contained herein are provided on an "AS IS" basis and
THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE
INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR
FITNESS FOR A PARTICULAR PURPOSE.