DHCWG L. Zhang
Internet-Draft W. Wang
Intended status: Informational BUPT University
Expires: August 1, 2018 Y. Chen
Tsinghua University
L. Sun
BUPT University
January 28, 2018
Detection of Primary Server Failure in DHCPv6 Failover
draft-zhang-dhc-dhcpv6-failure-detection-02
Abstract
In DHCPv6 failover and other multi-server deployment scenarios, an
automatic failure detection capability may be desirable. This
document describes a method by which the secondary server can detect
a link failure between the primary server and clients. This document
does not define any protocol details.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on August 1, 2018.
Copyright Notice
Copyright (c) 2018 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
Zhang, et al. Expires August 1, 2018 [Page 1]
Internet-Draft DHCPv6 Server Failure Detection January 2018
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
2. Requirements Language
3. Problem Statement and Applicability
4. Detection of Primary Server Failure
5. Security Considerations
6. IANA Considerations
7. Normative References
Authors' Addresses
1. Introduction
[RFC7031] describes the requirements for DHCPv6 failover, and
[RFC6853] discusses simpler redundancy deployment considerations for
DHCPv6. Both scenarios employ multi-server deployments to improve
the reliability and availability of DHCPv6. In such scenarios, two
categories of DHCPv6 servers, primary and secondary, serve the
clients in the domain. Both servers should provide essential DHCPv6
service and maintain consistent configurations and lease information.
The primary server should be responsible for answering clients'
requests, while the secondary server is expected to become responsive
if the primary server fails.
Popular implementations of failover and redundancy designs typically
allow one server to detect its partner's failure. This can be
achieved through various mechanisms, such as timer-based solutions.
However, such failure detection methods are not sufficient, because
they do not cover the situation in which the connection between the
primary and secondary servers is normal while the link between the
primary server and clients is down. Under these circumstances, it
would be desirable for the secondary server to detect such a failure
automatically and take over responsibility for providing DHCPv6
service.
This document describes a method for the secondary server to detect
such a failure between the primary server and clients in an ordinary
multi-server deployment. It also discusses the potential preference
conflict between the responsive secondary server and the primary
server.
2. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
3. Problem Statement and Applicability
[RFC3315] allows multiple servers to work in one domain for high
availability and other benefits. One of the main purposes of
multi-server DHCPv6 deployment and failover is to eliminate the
single point of failure. Server failures can be divided into two
categories: the first is a failure between the primary server and the
secondary server, and the second is a failure between the primary
server and clients. Existing failover implementations have focused
mainly on the first category and already provide several automatic
detection methods.
A common scenario for the second category is a (physical) link
failure between the primary server and clients. Such a link failure
may not harm the primary server itself, but it makes the primary
server unreachable for clients. If the secondary server is not able
to detect such a failure, it will assume that everything is normal
and will not provide DHCPv6 service for redundancy.
Section 5.1.1 of [RFC7031] illustrates the first kind of server
failure and states that the secondary server could easily detect
such a failure from the lack of responses from the primary server.
However, this method does not work for the second kind of failure
discussed in this document, because the primary server keeps
responding to the secondary server. Thus, this document proposes a
new method to automatically discover a failure between the primary
server and clients.
4. Detection of Primary Server Failure
The failure detection method described in this document is based on
the following assumptions.
o The secondary server is reachable by clients while the primary
server is not (at least by some of the clients).
o The primary server is not down and the link between the primary
and secondary servers is normal.
Based on the assumptions above, if the primary server is not
reachable for a client, the client may keep retransmitting SOLICIT or
REQUEST messages (if stateless DHCPv6 is used, the client may keep
sending INFORMATION-REQUEST messages).
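As an illustration only, the secondary server could recognize such
repeated messages by deriving a key from each incoming client message.
The sketch below assumes the [RFC3315] wire format (a 1-byte message
type, a 3-byte transaction-id, then options with 2-byte code and
length fields) and relies on the fact that a client leaves the
transaction-id unchanged when it retransmits. The function name
retransmission_key is hypothetical and is not defined by this
document.

```python
import struct

# Message type codes and option code from RFC 3315.
SOLICIT, REQUEST, INFORMATION_REQUEST = 1, 3, 11
OPTION_CLIENTID = 1  # option carrying the client's DUID


def retransmission_key(packet: bytes):
    """Return (client DUID, msg-type, transaction-id), or None.

    Hypothetical helper: two messages that yield the same key are
    treated as duplicates sent by the same client.
    """
    if len(packet) < 4:
        return None  # too short to hold msg-type and transaction-id
    msg_type = packet[0]
    if msg_type not in (SOLICIT, REQUEST, INFORMATION_REQUEST):
        return None  # not one of the messages discussed above
    xid = packet[1:4]  # unchanged across retransmissions
    offset = 4
    while offset + 4 <= len(packet):
        code, length = struct.unpack_from("!HH", packet, offset)
        offset += 4
        if offset + length > len(packet):
            return None  # truncated option
        if code == OPTION_CLIENTID:
            return (packet[offset:offset + length], msg_type, xid)
        offset += length
    return None  # no Client Identifier option present
```

A message without a Client Identifier option yields None in this
sketch, so a deployment that wants to cover such clients would need a
different key, for example the source address.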
To achieve automatic detection, the secondary server should
implement an internal counter. This counter is incremented each time
the secondary server receives a duplicated message (e.g. a SOLICIT
message) from the same client. A threshold value and a time period
should also be configured on the secondary server. If the count
exceeds the threshold within the configured time period, and the
secondary server cannot find anything wrong with the primary server
(i.e. responses from the primary server are regular), it will
conclude that there is a failure between the primary server and
clients. If the count does not reach the threshold within the time
period, the counter is cleared. The threshold and time period may
differ between deployments; thus their specific values and the
detailed implementation of the counter are out of scope of this
document.
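The counter logic described above can be sketched as follows. This is
a minimal illustration under stated assumptions, not a recommended
implementation: the class name DuplicateCounter, the per-client
window, the injectable clock, and the primary_looks_healthy flag
(standing for "responses from the primary server are regular") are
all hypothetical and are not defined by this document.

```python
import time


class DuplicateCounter:
    """Per-client count of duplicated messages within a time period."""

    def __init__(self, threshold, period, clock=time.monotonic):
        self.threshold = threshold  # deployment-specific value
        self.period = period        # seconds, deployment-specific
        self.clock = clock          # injectable so the sketch is testable
        self._windows = {}          # client id -> (window start, count)

    def record_duplicate(self, client_id, primary_looks_healthy):
        """Count one duplicate; return True if a failure is inferred."""
        now = self.clock()
        start, count = self._windows.get(client_id, (now, 0))
        if now - start > self.period:
            # Threshold not reached within the period: clear the counter.
            start, count = now, 0
        count += 1
        self._windows[client_id] = (start, count)
        # A failure between the primary server and clients is inferred
        # only if the primary still looks normal to the secondary.
        return count > self.threshold and primary_looks_healthy
```

With threshold=2 and period=10, a third duplicate from the same
client within ten seconds makes record_duplicate return True,
provided the primary server still appears healthy to the secondary
server.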
The detection method described in this document is likely to lead to
a situation in which both the primary server and the secondary server
are responsive, at least for the clients whose link to the primary
server is not down. The reason is that the primary server cannot
detect that there is a failure between itself and some of the
clients. Thus it will continue to provide its DHCPv6 service, which
may cause a conflict with the secondary server. As a result, some
clients may receive responses from both servers and be unable to
decide which one to use.
One possible solution is that every time the secondary server decides
to take responsibility for providing DHCPv6 service as a responsive
server, it should inform the primary server. Such a notification
should be sent regardless of whether the primary server is available
or not, since the purpose is to make sure that two servers do not
offer service at the same time.
Once the primary server failure is detected and the notification
process is finished, the secondary server may start to serve as a
responsive server, or it may just report the condition and do nothing
else.
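The sequence of detection, notification, and the choice between
serving and reporting might be sketched as below. Since this document
does not define any protocol details, the notification is modelled as
an opaque callback; notify_primary, take_over, log, and the Action
type are hypothetical names introduced only for this illustration.

```python
from enum import Enum


class Action(Enum):
    SERVE = "serve"              # act as a responsive server
    REPORT_ONLY = "report-only"  # report the condition, do nothing else


def on_failure_detected(notify_primary, take_over, log):
    """Run the steps that follow a detected primary-to-client failure."""
    log("failure between primary server and clients detected")
    try:
        # The notification is attempted regardless of whether the
        # primary server is currently available.
        notify_primary()
    except OSError as exc:
        log(f"notification to primary server failed: {exc}")
    # take_over is an operator choice in this sketch.
    return Action.SERVE if take_over else Action.REPORT_ONLY
```

In this sketch a failed notification is only logged, and whether the
secondary server should still start serving in that case is left to
the deployment.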
5. Security Considerations
A malicious client can perform a kind of DoS attack by flooding the
network with SOLICIT messages, thereby making the secondary server
become responsive while the primary server is actually responsive to
the other clients.
Further security considerations are TBD.
6. IANA Considerations
This document does not include an IANA request.
7. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
[RFC3315] Droms, R., Ed., Bound, J., Volz, B., Lemon, T., Perkins,
C., and M. Carney, "Dynamic Host Configuration Protocol
for IPv6 (DHCPv6)", RFC 3315, DOI 10.17487/RFC3315, July
2003, <https://www.rfc-editor.org/info/rfc3315>.
[RFC6853] Brzozowski, J., Tremblay, J., Chen, J., and T. Mrugalski,
"DHCPv6 Redundancy Deployment Considerations", BCP 180,
RFC 6853, DOI 10.17487/RFC6853, February 2013,
<https://www.rfc-editor.org/info/rfc6853>.
[RFC7031] Mrugalski, T. and K. Kinnear, "DHCPv6 Failover
Requirements", RFC 7031, DOI 10.17487/RFC7031, September
2013, <https://www.rfc-editor.org/info/rfc7031>.
Authors' Addresses
Lanshan Zhang
BUPT University
Beijing University of Posts and Telecommunications (BUPT)
Beijing 100876
P.R. China
Phone: +86-13146885878
Email: zls326@sina.com
Wendong Wang
BUPT University
Beijing University of Posts and Telecommunications (BUPT)
Beijing 100876
P.R. China
Email: wdwang@bupt.edu.cn
Yuchi Chen
Tsinghua University
Beijing 100084
P.R. China
Phone: +86-10-6278-5822
Email: chenycmx@gmail.com
Linhui Sun
BUPT University
Beijing University of Posts and Telecommunications (BUPT)
Beijing 100084
P.R. China
Email: sunlinhui@bupt.edu.cn