draft-ietf-dhc-failover-01.txt   draft-ietf-dhc-failover-02.txt 
Network Working Group Greg Rabil
INTERNET DRAFT Mike Dooley Network Working Group Ralph Droms
Obsoletes: draft-ietf-dhc-failover-00.txt Arun Kapur INTERNET DRAFT Bucknell University
Greg Rabil
Mike Dooley
Arun Kapur
Quadritek Systems Quadritek Systems
Ralph Droms Kim Kinnear
Bucknell University American Internet
February 1998 Steve Gonczi
Expires August 1998 Bernie Volz
Process Software
August 1998
Expires March 1999
DHCP Failover Protocol DHCP Failover Protocol
<draft-ietf-dhc-failover-01.txt> <draft-ietf-dhc-failover-02.txt>
Status of this Memo Status of this Memo
This document is an Internet-Draft. Internet-Drafts are working This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts. working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as ``work in progress.'' material or to cite them other than as ``work in progress.''
To learn the current status of any Internet-Draft, please check the To learn the current status of any Internet-Draft, please check the
``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or munnari.oz.au (Pacific Rim), ftp.ietf.org (US East Coast), or
ftp.isi.edu (US West Coast). ftp.isi.edu (US West Coast).
Abstract Abstract
DHCP [RFC 2131] allows for multiple servers to be operating on a DHCP [RFC 2131] allows for multiple servers to be operating on a
single network. Some sites are interested in running multiple servers single network. Some sites are interested in running multiple servers
in such a way so as to provide redundancy in case of server failure. in such a way so as to provide redundancy in case of server failure.
In order for this to work reliably, the servers must maintain a In order for this to work reliably, the cooperating Primary and
consistent database of the lease information. This implies that Secondary servers must maintain a consistent database of the lease
servers will need to coordinate any and all lease activity so that
this information is synchronized in case of failover. DRAFT January 1998
information. This implies that servers will need to coordinate any
and all lease activity so that this information is synchronized in
case of failover.
This document defines a protocol to provide this synchronization This document defines a protocol to provide this synchronization
between two servers. One server is designated the "primary" server, between two servers. One server is designated the "Primary" server,
the other is the "secondary" server. Additionally, this document the other is the "Secondary" server. Additionally, this document
describes a protocol for the automatic transfer of control from the describes a protocol for the automatic transfer of control from the
primary to the secondary in the case of failure (failover), as well Primary to the Secondary in the case of failure (failover), as well
as a network partition.
DRAFT DHCP Failover Protocol November 1997
as the re-establishment of control by the primary server. This document is a merge of draft-ietf-dhc-failover-01.txt and
draft-ietf-dhc-safe-failover-proto-00.txt, along with substantial
changes to each. Unfortunately, this merge was not completed with
sufficient time to allow review by any of the authors of draft-ietf-
dhc-failover-01.txt, and so it may well not reflect their views even
though their names appear as authors. See Section 11, issue #1 and
Section 12 for more details.
1.0 Introduction 1. Introduction
As the use of DHCP servers in networked environments grows, the As the use of DHCP servers in networked environments grows, the
dependency of those networks on the DHCP server increases. This is dependency of those networks on the DHCP server increases. This is
particularly true of the hosts that receive their configuration particularly true of the hosts that receive their configuration
information from the DHCP server. Therefore, it is very important to information from the DHCP server. Therefore, it is very important to
be able to provide reliable, continuous availability of DHCP be able to provide reliable, continuous availability of DHCP ser-
services. vices.
This specification describes a protocol to support automatic failover This specification describes a protocol to support automatic failover
from a primary to its secondary server. The failover mechanism from a primary to its secondary server. The failover mechanism
allows the secondary server to perform DHCP actions while the primary allows the secondary server to perform DHCP actions while the primary
is down. Additionally, the protocol defines how control is passed is down, or when a network failure prevents the primary and secondary
back to the primary when it becomes operational again. from communicating. The protocol also specifies how reintegration is
achieved when the primary again becomes operational or when the pri-
mary and secondary can again communicate.
In providing the specification for the failover, the protocol In providing the specification for the failover, the protocol speci-
specifies how to guarantee reliable delivery of changes to the fies how to guarantee reliable delivery of changes to the secondary.
secondary. This is required to synchronize the secondary's lease This is required to synchronize the secondary's lease data with that
data with that of the primary. The protocol further specifies a of the primary. The protocol further specifies a mechanism to allow
mechanism for determining the state (operational or not) of the the secondary to determine if it can communicate with the primary
primary server. The secondary will be able to automatically service server. The secondary will automatically begin to service DHCP
DHCP requests upon failover. When the primary server becomes requests whenever it cannot communicate with the primary. When the
available again, the secondary will convey any changes that occurred primary server becomes available again, the secondary will convey any
since the time of failover back to the primary prior to the primary changes that occurred since the time of failover back to the primary.
becoming operational.
1.1 Requirements Terminology Through careful control of the difference between the lease times
offered to DHCP clients and the lease time known by the secondary
server, the protocol allows the primary to communicate with the
secondary after the primary has completed communication with the DHCP
client (a technique known as "lazy" update) and still guarantee that
duplicate IP address allocations do not occur. Thus, the protocol
does not directly impact the ability of a DHCP server to respond to
DHCP client requests.
1.1. Requirements Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY" and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC 2119]. document are to be interpreted as described in RFC 2119 [RFC 2119].
1.2 DHCP Terminology 1.2. DHCP Terminology
This document uses the following terms: This document uses the following terms:
o "DHCP client" or "client" o "DHCP client" or "client"
A DHCP client is an Internet host using DHCP to obtain A DHCP client is an Internet host using DHCP to obtain confi-
configuration parameters such as a network address. guration parameters such as a network address.
o "DHCP server" or "server" o "DHCP server" or "server"
A DHCP server is an Internet host that returns configuration A DHCP server is an Internet host that returns configuration
parameters to DHCP clients.
o "binding"
parameters to DHCP clients. A binding is a collection of configuration parameters, includ-
ing at least an IP address, associated with or "bound to" a
DHCP client. Bindings are managed by DHCP servers.
o "binding database"
The collection of bindings managed by a primary and secondary.
o "subnet address pool"
A subnet address pool is the set of IP addresses associated
with a particular network number and subnet mask. In
the simple case, there is a single network number and subnet
mask and a set of IP addresses. In the more complex case
(sometimes called "secondary subnets", sometimes "super-
scopes"), several (apparently unrelated) network number and
subnet mask combinations with their associated IP addresses
may all be configured together into one subnet address pool.
o "primary server" or "primary" o "primary server" or "primary"
A DHCP server configured to provide primary service to a set of A DHCP server configured to provide primary service to a set
DHCP clients. of DHCP clients for a particular set of subnet address pools.
o "secondary server" or "secondary" o "secondary server" or "secondary"
A DHCP server configured to act as a backup to a primary server; A DHCP server configured to act as backup to a primary server
the secondary answers requests from DHCP clients only if its for a particular set of subnet address pools.
primary is unable to respond.
o "bindings database"
The collection of bindings managed by a primary and secondary. o "stable storage"
1.3 Requirements for this protocol Every DHCP server is assumed to have some form of what is
called "stable storage". Stable storage is used to hold
information concerning IP address bindings (among other
things) so that this information is not lost in the event of a
server failure which requires restart of the server.
The following requirements must be met by this protocol. 1.3. Requirements for this protocol
o Implementations of this protocol must work with existing DHCP The following list of goals must be (and are) achieved by this proto-
clients. col.
o Implementations of this protocol must work with existing BOOTP 1. Implementations of this protocol must work with existing DHCP
relay agents. client implementations based on the DHCP protocol [1].
o The protocol must provide failover redundancy between servers that 2. Implementations of the protocol must work with existing BOOTP
are not located on the same subnet. relay implementations.
1.4 Goals of this protocol 3. The protocol must provide failover redundancy between servers
that are not located on the same subnet.
o Provide for continued service to DHCP clients through an automated 1.4. Goals for this protocol
mechanism in the event of failure of the primary server.
o Minimize the possibility of assigning an IP address to two 1. Provide for continued service to DHCP clients through an
different clients simultaneously. automated mechanism in the event of failure of the Primary
Server.
o Minimize any need for manual administrative intervention. 2. Avoid binding an IP address to a client while that binding is
currently valid for another client. In other words, don't
allocate the same IP address to two clients.
o Introduce no additional delays in server response time as a result 3. Minimize any need for manual administrative intervention.
of inter-server communication.
o Share IP address ranges between primary and secondary servers;
i.e., impose no requirement that the pool of available IP addresses
be divided between servers.
o Continue to meet the goals and objectives of this protocol in the 4. Introduce no additional delays in server response time as a
result of inter-server communication.
5. Share IP address ranges between primary and secondary
servers; i.e., impose no requirement that the pool of avail-
able addresses be divided between servers.
event of server failure or network partition. 6. Continue to meet the goals and objectives of this protocol in
the event of server failure or network partition.
o Provide graceful reintegration of full protocol service after 7. Provide graceful reintegration of full protocol service after
server failure or network partition. server failure or network partition.
o Allow for one computer to act as a secondary server for multiple 8. Allow for one computer to act as a Secondary Server for mul-
primary servers. Where possible, primary and secondary servers tiple Primary Servers. Other topologies (e.g.: mesh) are also
should be "logical" servers and not necessarily physical computers. possible. Primary and Secondary Servers SHOULD be viewed as
"logical" servers and not necessarily physical computers.
1.4 Limitations to this protocol 9. Ensure that an existing client can keep its existing IP
address binding if it can communicate with either the Primary
or Secondary DHCP server implementing this protocol - not
just whichever server that originally offered it the binding.
o Under normal operation, only one server at a time will service DHCP 10.Ensure that a new client can get an IP address from some
client requests; this protocol provides reliability through server. Ensure that in the face of partition, where servers
redundancy but not load balancing. continue to run but cannot communicate with each other, the
above goals and requirements may be met. In addition, when
the partition condition is removed, allow graceful automatic
re-integration without requiring human intervention.
o This protocol provides only one level of redundancy through a 11.If either Primary or Secondary Server loses all of the infor-
single secondary server for each primary server. mation that it has stored in stable storage, it should be
able to refresh its stable storage from the other server.
o Under certain combinations of failures, both a primary and 1.5. Limitations of this Protocol
secondary server may be active and assign the same IP address to
different DHCP clients.
DISCUSSION: The following are explicit limitations of this protocol.
The details of this failure mode are discussed in section X. In 1. Under normal operation, only one server at a time will ser-
summary, for duplicate address allocation to occur, a network vice DHCP client requests; this protocol provides reliability
partition must occur that prevents the servers from exchanging through redundancy but not load balancing.
messages and a subnet partition must occur so that some DHCP
clients on the subnet can only reach the server while other
clients on that same subnet can only reach the secondary.
o Primary and secondary servers require external configuration to 2. This protocol provides only one level of redundancy through a
acquire server addresses, available IP address ranges and other single Secondary Server for each Primary Server.
client configuration information.
DISCUSSION: 3. The protocol provides a way to detect when the primary and
secondary server cannot communicate, but once this condition
This protocol assumes external configuration of primaries and
secondaries; e.g., through an independent internet configuration
management tool.
o The primary and secondary servers must be synchronized before an has been detected, does not (indeed, cannot) provide any way
address with an expired lease can be reassigned to a new client. to further distinguish between network failure and failure of
one of the servers.
o The primary and secondary servers must halt all DHCP transaction 4. A small number of IP addresses are reserved for Secondary
processing while resynchronizing after a system failure. Server use. In order to handle the failure case where both
servers are able to communicate with DHCP clients, but unable
to communicate with each other, a small number of IP
addresses must be set aside as a private address pool for the
Secondary Server. The Secondary can use these to service
newly arrived DHCP clients during such a period. The size of
this private pool SHOULD be based only on the arrival rate of
new DHCP clients and the length of expected downtime, and is
not influenced in any way by the total number of DHCP clients
supported by the server pair.
5. The Primary and Secondary Servers SHOULD pause normal DHCP
transaction processing while resynchronizing, after a system
failure.
2. Protocol Operations
The protocol messages needed to provide redundant/failover servers can
be grouped into three areas:
o Messages to keep the Secondary Server's lease data synchron-
ized with that of the Primary so that when failover occurs,
there is no degradation of service.
o Messages that allow the Secondary to determine the operational
state of the Primary, so as to know when to start servicing
DHCP traffic.
o Messages that are used to coordinate the Primary regaining
control when it has become available again.
2.1. Time synchronization between communicating servers
Each Binding update message carries a "sent time stamp" (the time
when the message was sent in GMT). This provides a simple mechanism
to determine any "time drift" between communicating servers.
DISCUSSION: DISCUSSION:
Presumably, unless the primary and secondary servers have been If a UDP packet is successfully transmitted (i.e.: it does not
out of communication for an extended period, the servers will get lost), the packet travel time is negligible in the framework
have only a small amount of information to exchange. Thus, the
time during which the servers are not available to answer DHCP
requests will be minimal and should be bridged by the normal
DHCP client retransmission mechanism.
2.0 Protocol Summary
The protocol necessary in providing redundant/failover servers can be of DHCP leases. By providing a GMT "sent time" stamp, the reci-
grouped in three areas: pient can compare this with its notion of the current GMT time at
the time it receives the packet. The difference (plus the packet
travel time, which we ignore) is the time drift. The recipient
can use this time drift value to bias all "absolute time" values
it receives from the sender.
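The time drift calculation described in the DISCUSSION above can be
sketched as follows. This is only an illustration: the function names
and the sign convention (receiver clock minus sender clock) are
assumptions, not part of the draft, and packet travel time is ignored
as the text suggests.

```python
def time_drift(sent_time: int, received_time: int) -> int:
    """Estimate the drift between the two servers' clocks, in seconds.

    sent_time is the GMT "sent time stamp" carried in the message;
    received_time is the recipient's own GMT time on receipt.
    Sign convention (an assumption): positive means the recipient's
    clock is ahead of the sender's. Packet travel time is ignored.
    """
    return received_time - sent_time


def bias_absolute_time(sender_time: int, drift: int) -> int:
    """Translate an absolute time from the sender into the recipient's clock."""
    return sender_time + drift
```

For example, a message stamped 1000 and received at local time 1007
gives a drift of 7 seconds, so a lease expiry the sender reports as
5000 would be treated as 5007 by the recipient.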
o Messages to keep the secondary server's lease data synchronized 2.2. Failover Protocol Messages
with that of the primary so that when failover occurs, there is no
degradation of service
o Messages that allow the secondary to determine the operational The Failover Protocol messages are encoded using a packet format
state of the primary, so as to know when to start servicing DHCP specific to the Failover Protocol. To allow easy recognition of
traffic Failover Protocol messages, BOOTP packet "op" field values 3..14 are
proposed to mark various Failover Protocol messages. A Failover Pro-
tocol message is always unicast from the source to the destination.
The sender, and never the recipient is responsible for reliable re-
transmission.
o Messages that are used to coordinate the primary regaining control 2.3. Failover Protocol packet header format
when it has become available again.
2.1 Primary keeps secondary lease data synchronized 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| op (1) | rev (1) | payload offset (2) |
+---------------+---------------+---------------+---------------+
| xid (4) |
+---------------------------------------------------------------+
| 0 or more additional header bytes (variable) |
+---------------------------------------------------------------+
| Payload data, formatted as DHCP-style options |
| (although using a unique option number space) |
| (variable) |
+---------------------------------------------------------------+
The messages for keeping the secondary's lease data up to date op - 1 byte
include the following:
DHCPBNDADD - Primary notifies secondary of new binding These values extend the number space of the existing BOOTP message
DHCPBNDUPD - Primary notifies secondary of modified binding type "Op" field. The following types are defined:
(e.g., extended lease)
DHCPBNDDEL - Primary notifies secondary of deleted binding
(e.g., expired or released lease)
In response to any of the above messages, the secondary server will
respond to the primary with a message describing the status of the
binding addition, modification, or deletion.
DHCPBNDACK - Positive acknowledgment of binding change 3 DHCPPOOLREQ
DHCPBNDNAK - Negative acknowledgment of binding change 4 DHCPPOOLRESP
5 DHCPBNDUPD
6 DHCPBNDACK
7 DHCPPOLL
8 DHCPPRPL
9 DHCPCTLREQ
10 DHCPCTLRET
11 DHCPCTLACK
12 DHCPCTLACKACK
13 DHCPREQUEREQ
14 DHCPREQUERESP
2.2 Determination of operational state of a server rev - 1 byte
In order to determine the state of a given server, a participant can Failover protocol version supported. Set to 1 for the Failover Proto-
use the following message to poll (or "ping") the server: col described in this draft.
payload offset - 2 bytes, network byte order
DHCPPOLL - Check if server is operational The byte offset of the Payload area, from the beginning of the Fail-
over packet header. The value for the current protocol version is 8.
xid - 4 bytes, network byte order
The sender of a failover protocol packet is responsible for setting
this number, and the receiver of the packet copies the number over
into any response packet. To the receiver it is opaque. The sender
SHOULD ensure that every packet sent to a particular IP address and
port combination has a unique transaction id unless that packet is a
re-transmission.
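As an illustration of the header layout in section 2.3, the following
sketch packs and unpacks the 8-byte fixed header. The "op" values are
the ones listed above; the helper names (FailoverOp, pack_header,
unpack_header) are hypothetical and chosen only for this example.

```python
import struct
from enum import IntEnum


class FailoverOp(IntEnum):
    """"op" values proposed for Failover Protocol messages."""
    DHCPPOOLREQ = 3
    DHCPPOOLRESP = 4
    DHCPBNDUPD = 5
    DHCPBNDACK = 6
    DHCPPOLL = 7
    DHCPPRPL = 8
    DHCPCTLREQ = 9
    DHCPCTLRET = 10
    DHCPCTLACK = 11
    DHCPCTLACKACK = 12
    DHCPREQUEREQ = 13
    DHCPREQUERESP = 14


# op (1 byte), rev (1 byte), payload offset (2 bytes), xid (4 bytes),
# all in network byte order.
HEADER = struct.Struct("!BBHI")
PROTOCOL_REV = 1      # version described in this draft
PAYLOAD_OFFSET = 8    # value for the current protocol version


def pack_header(op: FailoverOp, xid: int) -> bytes:
    """Build the fixed header; no additional header bytes are added."""
    return HEADER.pack(op, PROTOCOL_REV, PAYLOAD_OFFSET, xid)


def unpack_header(packet: bytes):
    """Return (op, rev, payload_offset, xid, payload) from a packet."""
    op, rev, offset, xid = HEADER.unpack_from(packet)
    # The payload starts "payload offset" bytes from the start of the
    # header, which skips any additional header bytes.
    return FailoverOp(op), rev, offset, xid, packet[offset:]
```

Reading the payload from the stated offset, rather than from a fixed
position, is what lets a receiver tolerate additional header bytes.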
2.4. DHCPPOOLREQ and DHCPPOOLRESP:
Whenever the Secondary server transitions into NORMAL mode, it first
sends a DHCPPOOLREQ message to initiate a transfer of a small range
of IP addresses that will serve as its private address pool.
This is necessary, because initially the Secondary server has no such
address pool, and its pool gets depleted when it hands out addresses
in COMMUNICATION-INTERRUPTED mode. This is why the request is sent
every time the Secondary server transitions into NORMAL mode. The
DHCPPOOLREQ message does not carry any payload data. When the Primary
Server gets a DHCPPOOLREQ message, it computes which addresses should
be transferred to the Secondary, and queues up DHCPBNDUPD transac-
tions, setting the Status of these bindings to "BACKUP". Having done
this, it sends a DHCPPOOLRESP message. The DHCPPOOLRESP message
carries the "Number of addresses transferred" as its payload.
The Secondary server keeps sending DHCPPOOLREQ messages until it
receives a DHCPPOOLRESP with "Number of addresses transferred" = 0,
or it decides that the partner is not responding. Each one of these
messages MUST have the same transaction ID. If a new transaction ID
is used in one of these messages, the receiving server will begin the
transmission of the DHCPBNDUPD messages all over again. To be clear,
if the Secondary Server receives a DHCPPOOLRESP message with "Number
of addresses transferred" > 0, it MUST send another DHCPPOOLREQ mes-
sage. This mechanism makes it possible for the Primary Server to pace
the transfer (e.g., it could generate all addresses all at once, or
one-by-one).
The Primary Server must respond to each DHCPPOOLREQ message it
receives. If it has already generated all private addresses, or it
has no available addresses, it MUST send DHCPPOOLRESP with "Number
of addresses transferred" = 0.
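The request loop described in this section might look like the sketch
below from the Secondary's side. It is not a definitive implementation:
send_poolreq is a hypothetical callable that sends one DHCPPOOLREQ with
the given xid and returns the "Number of addresses transferred" from
the DHCPPOOLRESP, or None on timeout; max_rounds is an assumed safety
limit that the draft does not define.

```python
from typing import Callable, Optional


def request_private_pool(
    send_poolreq: Callable[[int], Optional[int]],
    xid: int,
    max_rounds: int = 100,
) -> Optional[int]:
    """Repeat DHCPPOOLREQ until the Primary reports 0 addresses transferred.

    Returns the total number of addresses reported across all responses,
    or None if the partner is judged not to be responding.
    """
    total = 0
    for _ in range(max_rounds):
        # Every request in the sequence reuses the same xid; a new xid
        # would make the Primary start the transfer over again.
        transferred = send_poolreq(xid)
        if transferred is None:
            return None          # partner not responding
        if transferred == 0:
            return total         # transfer complete
        total += transferred     # more may follow, so ask again
    return None                  # assumed limit reached
```

Because the Secondary simply asks again after every non-zero response,
the Primary is free to hand over the pool in one batch or address by
address.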
2.5. DHCPREQUEREQ and DHCPREQUERESP:
Whenever either server wishes to be updated with the information that
the other server knows and has not yet transmitted to it, it will send
a DHCPREQUEREQ.
The DHCPREQUEREQ message does not carry any payload data. When
either server gets a DHCPREQUEREQ message, it computes which updates
should be transferred to the requesting server, and queues up
DHCPBNDUPD transactions as appropriate. Having done this, it sends a
DHCPREQUERESP message. The DHCPREQUERESP message carries the "Number
of addresses queued up" as its payload. The set of binding updates
queued up will depend on the requesting server's state. (The state
has already been communicated via prior DHCPPOLL/DHCPPRPL messages.)
The Secondary server keeps sending DHCPREQUEREQ messages until it
receives a DHCPREQUERESP with "Number of addresses queued up" = 0,
or it decides that the partner is not responding. This is the same
approach used for the DHCPPOOLREQ/DHCPPOOLRESP messages. Each
one of these DHCPREQUEREQ messages MUST have the same transaction ID.
Use of a new transaction ID will cause re-building of the outgoing
binding update queue.
The Primary Server must respond to each DHCPREQUEREQ message it
receives. If it has already queued up all of the previously unsent
binding updates, then it MUST send DHCPREQUERESP with "Number of
addresses queued up" = 0.
2.6. DHCPBNDUPD
The Primary notifies the Secondary (or the other way around) of a
binding state and data change.
In response to a binding update, the recipient server MUST respond
with a DHCPBNDACK message. Multiple binding updates can be batched
up, and sent in one Failover Protocol message.
2.7. DHCPBNDACK
This message implements a positive or negative acknowledgement of
one or more binding updates.
A binding update (or a batch of binding updates sent as one message)
is matched up with its associated acknowledgment by having the
same Xid field value in the message header.
The server sending a DHCPBNDACK message MAY include any of the
options that are acceptable in a DHCPBNDUPD message when the
DHCPBNDACK message is returned to the sender. If any of this
information differs from the information in the DHCPBNDUPD message, the
receiver SHOULD update its bindings database with that information
upon receipt of the DHCPBNDACK message.
The DHCPBNDACK MAY selectively reject one or more updates by includ-
ing one or more IP address - Reject Reason option pairs in the mes-
sage body.
The DHCPBNDACK implicitly acknowledges any binding updates it replies
to, except those it enumerates using Reject Reason Codes.
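The implicit-acknowledgement rule can be illustrated with a small
sketch. It assumes the sender remembers which IP addresses it put in
the DHCPBNDUPD with a given Xid, and that the rejected addresses and
their reason codes have already been extracted from the DHCPBNDACK.
The function name and the reason code value used in the test are
hypothetical.

```python
from typing import Dict, List, Tuple


def apply_bndack(
    sent_addrs: List[str],
    rejected: Dict[str, int],
) -> Tuple[List[str], Dict[str, int]]:
    """Split a batch of updates into acknowledged and rejected addresses.

    sent_addrs: IP addresses carried in the DHCPBNDUPD for this Xid.
    rejected:   IP address -> Reject Reason code, as listed in the
                matching DHCPBNDACK.
    Every address that is not explicitly rejected is acknowledged.
    """
    acked = [addr for addr in sent_addrs if addr not in rejected]
    # Ignore rejections for addresses that were never in this batch.
    refused = {a: r for a, r in rejected.items() if a in sent_addrs}
    return acked, refused
```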
2.8. DHCPPOLL
In order to determine the state of a given server, or to communicate
a critical change in its own status, a participant can use the
DHCPPOLL message.
This message inquires about the current state of the recipient, and
tells the recipient what state the sender is in.
In response to the DHCPPOLL message, the participant will listen for In response to the DHCPPOLL message, the participant will listen for
the following: a DHCPPRPL message.
DHCPPRPL - Poll reply
2.3 Primary requests control from the secondary 2.9. DHCPPRPL
After a failover, when the primary server is restarted, the following This message replies to the DHCPPOLL message (PRPL=Poll reply). The
messages are used to coordinate the primary taking control back from DHCPPRPL also carries server status information (see message payload
the secondary: details below).
After a failover, when the Primary Server is restarted, the following
messages are used to coordinate the Primary taking control back from
the Secondary:
DHCPCTLREQ - Request for control DHCPCTLREQ - Request for control
DHCPCTLRET - Return of control initiated DHCPCTLRET - Return of control initiated
DHCPCTLACK - Return of control completed DHCPCTLACK - Return of control completed
DHCPCTLACKACK - Return of control completed message acknowledged.
3 Message formats and semantics The Primary Server sends a DHCPCTLREQ message, indicating that it
would like to take control of the bindings database. The Secondary
Server replies with a DHCPCTLRET message, which serves as a signal to
the Primary "Stand by to receive binding updates". This message then
is followed by a set of binding updates from the secondary to the
primary. When all updates have been transmitted (and acknowledged)
from Secondary to Primary, a DHCPCTLACK message is sent from the
Secondary to the Primary, to signal that "all updates from the Secon-
dary are now completed".
The failover protocol messages are encoded as a DHCP/BOOTP option in DISCUSSION:
a DHCP message. A DHCP message carrying a failover protocol message
carries only the failover protocol message option and no other
options. The DHCP message is unicast from the source to the
destination.
The option code for these messages is TBD. Within each failover Note, that the DHCPCTLACK message type must be transmitted reli-
protocol message, the specific message type is indicated by an option ably, as the Primary Server will not start servicing clients,
subcode in the first octet of the data area of the option. The 'len' until it has received the DHCPCTLACK message. To provide this
field includes the number of octets in the option subcode byte and in reliability, the DHCPCTLACKACK message is provided. This provides
any additional data carried in the failover protocol message. an acknowledgment of the DHCPCTLACK message, and the DHCPCTLACK
Bindings are encoded in a format that is TBD. message will be periodically re-sent until it is acknowledged. We
could just periodically re-send the DHCPCTLACK message until we
start receiving binding updates from the Primary, but the Primary
may not have any updates to send at all, hence the need for an
explicit DHCPCTLACKACK message.
DISCUSSION The Primary Server transitions into NORMAL state upon receiving a
DHCPCTLACK from the secondary, when the secondary has completed send-
ing all of its updates during synchronization. The DHCPCTLACKACK
message is needed to prevent the primary from waiting and not servic-
ing clients if the DHCPCTLACK message got lost. The Secondary server
will keep re-sending the DHCPCTLACK message, until:
The use of the REQUEST/REPLY field in the DHCP message header and 1. It decides that the primary is not responding, so the Secon-
the UDP port to be used needs to be considered. dary server goes into COMMUNICATION-INTERRUPTED mode.
The use of existing DHCP options and header fields to encode
bindings needs to be considered.
The sender places a 32-bit number in the DHCP header 'xid' field to 2. It receives a DHCPCTLACKACK or a DHCPBNDUPD message from the
uniquely identify each failover protocol message. The receiver primary. The Primary's DHCPBNDUPD messages would start
copies the contents of the 'xid' field into any reply or arriving at the Secondary server, if the Primary did get the
acknowledgment message. DHCPCTLACK, but the DHCPCTLACKACK message got lost.
The sender is responsible for reliable transmission and any 3. Protocol Payload Data Format
retransmission.
Payload data is encoded as a set of flexible DHCP/BOOTP style
options (the usual 1 byte option code, 1 byte length, and "length"
bytes of data). The options are placed after the header, after skipping
PayloadOffset bytes. The payload data options are not preceded by a
"cookie" value.
3.1 Binding Information Since the packet is NOT a DHCP/BOOTP protocol packet, the options
used here do not conflict with any existing "proper" DHCP/BOOTP
options. In fact, these options are allocated in relationship to the
DHCP option space in the following way. In cases where the syntax
and semantics of a Failover Payload Option is identical to that of a
DHCP/BOOTP option, the same option number is used. For
options unique to the Failover protocol, option numbers starting at
230 are used.
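
As a minimal sketch of the encoding described above (1 byte option
code, 1 byte length, then "length" bytes of data), the following
Python fragment packs and unpacks a payload. The function names and
the FAILOVER_OPTION_BASE constant are illustrative assumptions and do
not appear in the draft. Header parsing, PayloadOffset handling, and
any pad or end options are deliberately left out.

```python
import struct

# Assumption: name chosen for illustration. The value comes from the text
# above ("option numbers starting at 230").
FAILOVER_OPTION_BASE = 230


def encode_option(code: int, data: bytes) -> bytes:
    """Pack one option as code (1 byte), length (1 byte), then data."""
    if not 0 <= code <= 255:
        raise ValueError("option code must fit in one byte")
    if len(data) > 255:
        raise ValueError("option data must be at most 255 bytes")
    return struct.pack("!BB", code, len(data)) + data


def decode_options(payload: bytes) -> list[tuple[int, bytes]]:
    """Unpack a payload into (code, data) pairs, preserving wire order."""
    options = []
    pos = 0
    while pos < len(payload):
        if pos + 2 > len(payload):
            raise ValueError("truncated option header")
        code, length = struct.unpack_from("!BB", payload, pos)
        pos += 2
        if pos + length > len(payload):
            raise ValueError("truncated option data")
        options.append((code, payload[pos:pos + length]))
        pos += length
    return options
```

Keeping the decoder order-preserving matters because the only ordering
rule in the payload depends on where each Option 50 appears.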
Maintaining consistent binding information between the primary and Thus, all new Failover Protocol option numbers are assigned from a
secondary servers is a high priority of this protocol. Both the continuous range beginning with 230. This number is shown as an X in
primary and secondary must be synchronized in order for the failover the tables below.
operation to occur smoothly. The DHCPBNDADD, DHCPBNDUPD, and
DHCPBNDDEL messages described below require the following binding
information:
hType 1 byte The protocol is permissive in allowing various other DHCP options in
hLen 1 byte binding updates. As long as the sender wishes to use an option, it
chAddr 16 bytes MAY include it. On the other hand, the recipient MUST ignore any
ipAddr 4 bytes option it is not expecting.
grantTime 4 bytes
expireTime 4 bytes
clientIdentifierLen 2 bytes
clientIdentifierData clientIdentifierLen bytes
status 2 bytes
hostNameLen 2 bytes
hostNameData hostNameLen bytes
domainNameLen 2 bytes
domainNameData domainNameLen bytes
The minimum size of the binding information is 32 bytes. Multiple DHCPBNDUPD transactions can be batched together in one UDP
packet. The option set for each individual transaction MUST begin
with the IP address (Option 50). This is the only restriction on
payload item ordering; all other payload data items can be
included in any desired order.
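
The batching rule above can be sketched as follows: a receiver walks
the decoded options and starts a new transaction at every Option 50.
The function name and the (code, data) tuple representation are
assumptions made for illustration; the draft does not prescribe a
parsing strategy.

```python
ASSIGNED_IP_OPTION = 50  # each transaction's option set begins with Option 50


def split_transactions(options):
    """Group (code, data) pairs into one list per DHCPBNDUPD transaction.

    A new group starts at every Option 50. Options that appear before the
    first Option 50 violate the ordering rule and are rejected.
    """
    transactions = []
    for code, data in options:
        if code == ASSIGNED_IP_OPTION:
            transactions.append([(code, data)])
        elif not transactions:
            raise ValueError("option set does not begin with Option 50")
        else:
            transactions[-1].append((code, data))
    return transactions
```

Within each group the remaining options are kept in arrival order,
since the text allows them in any order.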
Note that the use of the client hardware address (hType, hLen, and In case an implementation chooses to use the DHCPBNDNAK mechanism,
chAddr) are in order to facilitate servers which support both the the DHCPBNDNAK message SHOULD contain one or more Option 50s from the
Bootp and DHCP protocols. Since most, if not all, Bootp clients do NAK-ed message, to indicate which specific update items are being
not send a 'client identifier' option, it seems appropriate to use NAK-ed.
this combination of fields of the Bootp packet to uniquely identify
the client within the primary and secondary servers' respective
bindings.
The 'ipaddr' is the IP address that the primary server has leased to While the synchronization is in progress, the secondary MUST NOT
the client. The 'grantTime' and 'expireTime' fields are represented accept client requests, and the primary MUST NOT send any updates to
as seconds since Jan 1, 1970 (i.e. ANSI C time_t time value the secondary. This is necessary to allow the Primary to be the sole
representation). An 'expireTime' of -1 (ffffffff) indicates an arbitrator of any conflicting updates.
infinite lease.
If available for the individual binding, the 'clientIdentifier'
fields SHOULD be provided by the primary server. These fields
correspond to the DHCP vendor extension option number 61. If such
information is provided, then the secondary SHOULD use this data to
uniquely identify the client within its bindings database as
discussed in RFC 2132 Section 9.14.
The 'status' field is used to convey the status of a particular 3.1. DHCP Server Status
binding to the secondary server. The status may indicate that a
This option is used to convey the current state of a server.
particular lease has expired, or that an address Code Len Type
+--+---+------+
| X| 1 | 1-15 |
+--+---+------+
The 'hostName' and 'domainName' fields can be used to maintain Allowed values for this option:
information required for Dynamic DNS updates. These fields
correspond to the DHCP vendor extension option number 12 and 15,
respectively.
DISCUSSION Value Message Type
----- ------------
1 UNKNOWN-STATE
2 PRIMARY-NORMAL Normal state
3 BACKUP-NORMAL
4 PRIMARY-COMINT Communication interrupted (safe)
5 BACKUP-COMINT
6 PRIMARY-PARTNERDOWN Partner down (unsafe
mode)
7 BACKUP-PARTNERDOWN
8 PRIMARY-CONFLICT Synchronizing, after a
"Partner-Down"
divergence
9 PRIMARY-SYNC Synchronizing, after a
"communications-
interrupted"
divergence.
10 BACKUP-SYNC
11 PRIMARY-RECOVER Recovering ALL
bindings from partner
12 BACKUP-RECOVER
13 FAILOVER-DISABLED The server is running
with the failover
protocol disabled.
(standalone)
The complete list of fields that may be required in the binding 14 SERVER-PAUSED The server is inactive,
information is still under discussion. Based upon such shutting down for a short period.
discussions and other requirements, the information may be 15 SERVER-SHUTDOWN The server is inactive,
expanded or scaled back. shutting down for an extended period.
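
For illustration only, the DHCP Server Status values tabulated above
could be represented as an enumeration. The class and function names
are assumptions; the numeric values and state names are those listed
in the table.

```python
from enum import IntEnum


class ServerStatus(IntEnum):
    """Values of the DHCP Server Status option, as tabulated above."""
    UNKNOWN_STATE = 1
    PRIMARY_NORMAL = 2
    BACKUP_NORMAL = 3
    PRIMARY_COMINT = 4
    BACKUP_COMINT = 5
    PRIMARY_PARTNERDOWN = 6
    BACKUP_PARTNERDOWN = 7
    PRIMARY_CONFLICT = 8
    PRIMARY_SYNC = 9
    BACKUP_SYNC = 10
    PRIMARY_RECOVER = 11
    BACKUP_RECOVER = 12
    FAILOVER_DISABLED = 13
    SERVER_PAUSED = 14
    SERVER_SHUTDOWN = 15


def parse_server_status(data: bytes) -> ServerStatus:
    """Decode the 1-byte option data; values outside 1-15 raise ValueError."""
    if len(data) != 1:
        raise ValueError("server status option must carry exactly one byte")
    return ServerStatus(data[0])
```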
3.2 Primary keeps secondary lease data synchronized When a server is being re-started, it should send a DHCPPOLL message
to its partner, reporting its status (SERVER-PAUSED). In response,
the recipient SHOULD go into COMMUNICATION-INTERRUPTED mode.
DHCPBNDADD
------------------------------------------ When a server is being shut down, it should send a DHCPPOLL message
| XX | len | 1 | Binding information to its partner, reporting its status (SERVER-SHUTDOWN).
------------------------------------------
The primary sends a DHCPBNDADD message to inform the secondary of a In response, the recipient SHOULD go into PARTNER-DOWN mode.
binding that has been added to the primary's set of bindings.
DHCPBNDUPD 3.2. DHCP Binding Status
------------------------------------------ This option is used to convey the current state of a binding. This
| XX | len | 2 | Binding information option is mandatory for DHCPBNDUPD messages.
------------------------------------------
The primary sends a DHCPBNDUPD message to inform the secondary of a Code Len Type
binding that has been changed in the primary's set of bindings. +-----+-----+-----+
| X+1 | 1 | 1-7 |
+-----+-----+-----+
DHCPBNDDEL Legal values for this option are:
------------------------------------------ Value Message Type
| XX | len | 3 | Binding information ----- ------------
------------------------------------------ 1 FREE The lease has never been used
2 ACTIVE assigned to a client *
3 EXPIRED
4 RELEASED A client released the lease
5 ABANDONED A server or client flagged address
as not usable.
6 RESET Lease was freed by some
external agent.
7 BACKUP Lease is set aside for Secondary
server's private address pool.
The primary sends a DHCPBNDDEL message to inform the secondary of a 3.3. Assigned IP address
binding that has been deleted from the primary's set of bindings.
Uses identical code and format to DHCP Option 50 (requested IP
address).
DHCPBNDACK Code Len Address
+-----+-----+-----+-----+-----+-----+
| 50 | 4 | a1 | a2 | a3 | a4 |
+-----+-----+-----+-----+-----+-----+
--------------
| XX | 1 | 4 |
--------------
The secondary sends a DHCPBNDACK message to the primary to inform the 3.4. Lease grant time
primary that the binding change request identified by the 'xid' field
has successfully been completed.
DHCPBNDNAK An absolute, GMT time value for this option, as time synchronization
has already been achieved between the source and the target server
using the Sent Time Stamp option. Represented as seconds since Jan
1, 1970 (i.e. ANSI C time_t time value representation).
-------------- Code Len Time
| XX | 1 | 5 | +------+-----+-----+-----+-----+-----+
-------------- | X+2 | 4 | t1 | t2 | t3 | t4 |
+------+-----+-----+-----+-----+-----+
The secondary sends a DHCPBNDNAK message to the primary to inform the 3.5. Sent Time Stamp
primary that the secondary could not complete the binding change
request. For example, the secondary would send a DHCPBNDNAK in
response to a DHCPBNDUPD request for which the secondary had no
recorded binding.
DISCUSSION A time stamp using GMT, when the packet was sent. It is used to
determine the time drift between the sender and the recipient. The
time drift is defined as the difference between "Arrive Time (GMT)"
and "Send Time (GMT)". The actual packet travel time is assumed to
be negligible in this context. All Date-Time values contained in
Failover messages will be corrected by the time drift before being
stored by the recipient.
The use of an additional field to indicate the reason for the Code Len Time
DHCPBNDNAK message should be considered. +-----+-----+-----+-----+-----+-----+
| X+3 | 4 | t1 | t2 | t3 | t4 |
+-----+-----+-----+-----+-----+-----+
3.3 Determination of operational state of a server The time is a 32 bit unsigned long in network byte order, in units of
seconds (GMT since EPOCH).
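
A minimal sketch of how a recipient might apply the Sent Time Stamp
follows. It decodes the 32-bit network-byte-order value, computes the
drift as arrive time minus send time, and shifts a received time value
into the recipient's clock. The function names are hypothetical, and
reading "corrected by the time drift" as "add the drift" is an
assumption about the intended sign convention.

```python
import struct


def parse_time_option(data: bytes) -> int:
    """Decode a 4-byte unsigned seconds-since-epoch value (network byte order)."""
    if len(data) != 4:
        raise ValueError("time option must carry exactly four bytes")
    (seconds,) = struct.unpack("!I", data)
    return seconds


def time_drift(sent_time: int, arrive_time: int) -> int:
    """Drift as defined above: arrive time minus send time (transit ignored)."""
    return arrive_time - sent_time


def correct_time(remote_value: int, drift: int) -> int:
    """Translate a sender-clock time value into the recipient's clock.

    Assumption: correcting by the drift means adding it, so a sender whose
    clock runs behind the recipient has its values moved forward.
    """
    return remote_value + drift
```

For example, a packet stamped 900000000 that arrives when the local
clock reads 900000010 yields a drift of 10 seconds, and a grant time
of 900003600 in that packet would be stored as 900003610.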
DHCPPOLL 3.6. Number of addresses transferred to Secondary Server
---------------------- A 32 bit unsigned long in network byte order. Reports the number of
| XX | 2 | 6 | flags | addresses transferred by the Primary to the Secondary Server
---------------------- (addresses to be used for the Secondary Server's private address
pool)
A DHCP participant sends a DHCPPOLL message to a server to determine
whether that server is currently operational.
A DHCP secondary periodically sends a DHCPPOLL to its primary to Code Len Time
determine if the primary is currently operational. +-----+-----+-----+-----+-----+-----+
| X+4 | 4 | t1 | t2 | t3 | t4 |
+-----+-----+-----+-----+-----+-----+
A DHCP primary sends a DHCPPOLL to its secondary if the primary needs 3.7. Lease Duration
to determine if the secondary is operational.
A DHCP client sends a DHCPPOLL to a DHCP server to determine if the Uses the format and code of the standard DHCP IP Address Lease Time
server is currently operational. option. It is used by the DHCP protocol in the exact same way by the
DHCPOFFER message. The time is in units of seconds, and is specified
as a 32-bit unsigned integer. A Lease Duration of 0xFFFFFFFF indi-
cates an infinite lease.
The flags octet is defined as follows: CRRRRRRR, where the secondary Code Len Lease Time
+-----+-----+-----+-----+-----+-----+
| 51 | 4 | t1 | t2 | t3 | t4 |
+-----+-----+-----+-----+-----+-----+
3.8. Client Identifier
sets the 'C' bit to 1 to indicate that it has taken control of the The format, code and conventions used are identical to DHCP option
bindings database, and the 'R' bits are reserved for future use. 61.
DHCPPRPL Code Len Type Client-Identifier
+-----+-----+-----+-----+-----+---
| 61 | n | t1 | i1 | i2 | ...
+-----+-----+-----+-----+-----+---
---------------------- 3.9. Client Hardware Address
| XX | 2 | 7 | flags |
----------------------
A DHCP participant replies to a DHCPPOLL message with a DHCPPRPL The format is similar to DHCP option 61. T1 (type) MUST be set to the
message. The sender copies the 'xid' field from the DHCPPOLL message proper ARP hardware address code ( it MUST NOT be zero!) TBD: Refer-
header into the 'xid' field in the DHCPPRPL message, ence the ARP document here.
The flags octet is defined as follows: ERRRRRRR, where the primary
sets the 'E' bit to 1 (in response to a DHCPPOLL message with the 'C'
bit set to 1) to indicate to the secondary that the primary has not
relinquished control of the database. See section 4 for additional
details.
DISCUSSION Code Len Type Client-Identifier
+-----+-----+-----+-----+-----+---
| X+5 | n | t1 | i1 | i2 | ...
+-----+-----+-----+-----+-----+---
The DHCPPOLL and DHCPPRPL messages might also be useful to DHCP Either Client Id, Client Hardware Address or BOTH MAY be present in
clients to aid in determining the availability of specific DHCP binding update transactions. At least one of them MUST be present.
servers. Such use would avoid overloading the DHCPDISCOVER If both are present, the Client Id MUST be used to uniquely identify
message. the owner of the binding (exactly as in RFC 2131).
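
The identification rule above (at least one of Client Id and Client
Hardware Address present, Client Id preferred when both are) can be
sketched as a small lookup-key helper. The function name and the
tagged-tuple key format are assumptions made for this example.

```python
from typing import Optional


def binding_owner_key(client_id: Optional[bytes],
                      hardware_address: Optional[bytes]) -> tuple[str, bytes]:
    """Return the key that identifies the owner of a binding.

    The Client Id wins whenever it is present; the hardware address is
    used only as a fallback. A transaction carrying neither is invalid.
    """
    if client_id is not None:
        return ("client-id", client_id)
    if hardware_address is not None:
        return ("hw-addr", hardware_address)
    raise ValueError(
        "binding update carries neither Client Id nor Client Hardware Address")
```

Tagging the key with its source keeps a Client Id from colliding with
a hardware address that happens to have the same bytes.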
3.4 Primary requests control from the secondary 3.10. Host Name
DHCPCTLREQ Uses the format and code of DHCP option 12.
-------------- Code Len Host Name
| XX | 1 | 8 | +-----+-----+-----+-----+-----+-----+-----+-----+--
-------------- | 12 | n | h1 | h2 | h3 | h4 | h5 | h6 | ...
+-----+-----+-----+-----+-----+-----+-----+-----+--
A primary sends a DHCPCTLREQ message to its secondary to request 3.11. Domain Name
control of the bindings database from the secondary.
DHCPCTLRET Uses the format and code of DHCP option 15.
-------------- Code Len Domain Name
| XX | 1 | 9 | +-----+-----+-----+-----+-----+-----+--
-------------- | 15 | n | d1 | d2 | d3 | d4 | ...
+-----+-----+-----+-----+-----+-----+--
A secondary sends a DHCPCTLRET to its primary to begin the process of 3.12. Reject Reason Code
returning control of the bindings database to the secondary. After
sending the DHCPCTLRET message, the secondary sends a sequence of
DHCPBNDADD, DHCPBNDUPD and DHCPBNDDEL messages to synchronize the
primary's bindings database with the secondary's database.
This option is used to selectively reject binding updates. It MAY be
used in a DHCPBNDACK message, always following an option 50. (The option
50 contains the IP address of the specific update being rejected.)
DHCPCTLACK
--------------- Code Len Reason code
| XX | 1 | 10 | +-----+-----+-----+
--------------- | X+6 | 1 | R1 |
+-----+-----+-----+
A secondary sends a DHCPCTLACK to its primary to indicate that the Reason codes :
secondary has finished returning control to the primary.
DISCUSSION 1 Illegal IP address (not part of any address pool)
2 Fatal conflict exists: address in use by other client.
Primary and secondary servers may need to exchange some additional 3.13. MDLI
information in DHCPCTLREQ, DHCPCTLRET and DHCPCTLACK messages.
This information would be encoded in an additional 'flags' or
'data' field added to the control messages.
The synchronization essentially requires a reliable transmission Maximum Delta Lease Interval, in seconds. A 32 bit integer value,
protocol using DHCPBND* and DHCPBNDACK messages. An alternative in network byte order.
to using DHCPBND* messages to transfer bindings updates to the
primary would be to devise a separate transfer protocol based on
TCP.
4 Exchange of control between primary and secondary Code Len Time
+------+-----+-----+-----+-----+-----+
| X+7 | 4 | t1 | t2 | t3 | t4 |
+------+-----+-----+-----+-----+-----+
The primary and secondary servers coordinate the exchange control 4. Exchange of control between Primary and Secondary
The Primary and Secondary Servers coordinate the exchange of control
over the bindings database through the use of DHCPPOLL and DHCPCTLREQ over the bindings database through the use of DHCPPOLL and DHCPCTLREQ
messages. In normal operation: messages. In normal operation:
o the primary sends notification of each change to its bindings The Primary sends notification of each change to its bindings data-
database to the secondary, and the secondary keeps its bindings base to the Secondary, and the Secondary keeps its bindings database
database synchronized with the primary's database synchronized with the Primary's database.
o the secondary periodically sends DHCPPOLL messages to the primary, The Secondary periodically sends DHCPPOLL messages to the Primary,
and the primary responds to each DHCPPOLL message with a DHCPPRPL and the Primary responds to each DHCPPOLL message with a DHCPPRPL
message message. If the Secondary does not receive a DHCPPRPL response mes-
sage, the Secondary takes control of the bindings database and begins
answering requests from DHCP clients. Note that the Secondary should
be able to be configured to not perform the automatic switch-over.
If the secondary does not receive a DHCPPRPL response message, the The conditions under which a Secondary takes control of the bindings
secondary takes control of the bindings database and begins database, e.g., the number of consecutive missing acknowledgments,
answering requests from DHCP clients. Note that the secondary should be configurable in the Secondary by the DHCP administrator.
should be able to be configured to not perform the automatic
switchover.
DISCUSSION
The conditions under which a secondary takes control of the The Secondary records any changes it makes to the bindings database
bindings database, e.g., the number of consecutive missing while it has control. The Secondary continues to send DHCPPOLL mes-
acknowledgments, should be configurable in the secondary by the sages to the Primary. The DHCPPOLL messages also carry information
DHCP administrator. on the state of the Secondary Server.
To regain control of the bindings database, e.g., after the Primary
Server has recovered from a failure, or a partitioned network condi-
tion, the Primary sends a DHCPCTLREQ message to the Secondary. The
Secondary stops answering DHCP client requests, and responds to its
Primary with a DHCPCTLRET message. After sending the DHCPCTLRET mes-
sage, the Secondary sends DHCPBNDUPD messages for each of the changes
it has made to the bindings database.
The secondary records any changes it makes to the bindings database The Primary sends a DHCPBNDACK for each DHCPBNDUPD message it
while it has control. The secondary continues to send DHCPPOLL receives. The Secondary completes the transfer of control by sending
messages to the primary, with the 'D' bit set. a DHCPCTLACK message to the Primary as soon as all of its updates
were acknowledged.
To regain control of the bindings database, e.g., after the primary Note, that the Primary SHOULD NOT send any DHCPBNDUPD messages while
server has failed, the primary sends a DHCPCTLREQ message to the synchronization is in progress with the Secondary.
secondary. The secondary stops answering DHCP client requests, and
responds to its primary with a DHCPCTLRET message. After sending
the DHCPCTLRET message, the secondary sends DHCPBND* messages for
each of the changes it has made to the bindings database. The
primary sends a DHCPBNDACK for each of the DHCPBND* messages it
receives. The secondary completes the transfer of control by
sending a DHCPCTLACK message to its primary.
If the primary server has not failed and has been answering DHCP Once the synchronization is completed, and the Primary transitions
client requests, and receives a DHCPPOLL message from its secondary into NORMAL state, and starts sending DHCPBNDUPD transactions on any
with the 'D' bit set, then both the primary and the secondary have accumulated binding changes it may have.
been answering DHCP client requests, and their bindings databases
may be unsynchronized. In this situation, the primary responds to
the secondary with a DHCPPRPL message with the 'E' bit set. Both
the primary and secondary servers notify a network administrator,
who must take steps to manually resynchronize the two bindings
databases.
DISCUSSION 5. Duplicate address assignment scenarios
It may be appropriate to state that, under administrator In the following two scenarios, the protocol could end up allocating
control, the primary and secondary both stop some or all DHCP duplicate IP addresses, unless the measures recommended in Section 6.
services when the servers discover that both have been are taken:
allocating DHCP addresses simultaneously and their databases are
potentially unsynchronized.
4.1 Minimizing the potential for duplicate bindings Primary Server crash before "lazy" update: In the case where the Pri-
mary Server sends an ACK to a client for a newly allocated IP address
and then crashes prior to sending the corresponding update to the
Secondary Server, the Secondary Server will have no record of the IP
address allocation. When the Secondary Server takes over, it may
well try to allocate that IP address to a different client. In the
case where the first client to receive the IP address is not on the
net at the time (yet while there was still time to run on its lease),
an ICMP echo (i.e., ping) will not prevent the Secondary Server from
allocating that IP address to a different client.
One of the goals outlined in section 1.4 of this draft is to A more likely and subtle version of this problem is where the Primary
minimize the possibility of assigning an IP address to two Server crashes after extending a client's lease time, and before
different clients simultaneously. This possibility can occur only updating the Secondary with a new time using a lazy update. After the
if both the primary and secondary servers are handling requests Secondary takes over, if the client is not connected to the network
from the same subnet at the same time. Since the basis for this the Secondary will believe the client's lease has expired when, in
protocol is that the secondary only becomes "active" in the case fact, it has not. In this case as well, the IP address might be
that it has determined that the primary is no longer operational,
the situation in which both are operational at the same time can
occur only if there exists a failure in the mechanism for
determining the status of the primary. This failure could occur,
for example, if all routes between the primary and secondary were
unavailable such that secondary would not get a response to a poll,
even though the primary is still operational. In such
circumstances, if so configured on the secondary (see section 4
above), manual intervention could be required to OK or disallow the reallocated to a different client while the first client is still
switchover. using it.
If such an option is not configured, it is still possible for the Network partition where servers can't communicate but each can talk
secondary to become active and servicing the same subnet(s) as the to clients: Several conditions are required for this situation to
primary. In this case, two clients could potentially get the same occur. First, due to a network failure, the Primary and Secondary
IP address, but only if both clients are on the same subnet. This Servers cannot communicate. As well, some of the DHCP clients must
situation could only occur if one of the client's packets went to be able to communicate with the Primary Server, and some of the
the primary and the other client's packets went to the secondary clients must now only be able to communicate with the Secondary
server. This would be a very rare situation. However, as rare as Server. When this condition occurs, both Primary and Secondary
this may be, the potential exists, so another mechanism is needed Servers could attempt to allocate IP addresses for new clients from
to ensure that this does not occur. Therefore, it is a requirement the same pool of available addresses. At some point, then, two
of this protocol that each server participating in the failover MUST clients will end up being allocated the same IP address. This will
ping an address prior to offering that address. This should cause potentially serious problems when the network failure that
eliminate virtually any possibility of duplicate addresses being created this situation is corrected.
offered to clients from the participating servers.
5 Acknowledgments The next section details how the Failover Protocol prevents either of
the above scenarios (and other related scenarios) from causing dupli-
cate IP address allocation.
6 References 6. Duplicate Address Assignment Control
[RFC 2119] Bradner, S., "Key words for use in RFCs to Indicate There are several ways that the Failover protocol avoids the possi-
Requirement Levels", RFC 2119, March 1997. bility of duplicate address assignment.
[RFC 2131] Droms, R., "Dynamic Host Configuration Protocol", 6.1. Control of lease time
RFC2131, March 1997.
[RFC 2132] Droms, R., "DHCP Options and BOOTP Vendor Extensions", The key problem with lazy update is that when the primary server
RFC2132, March 1997. fails after updating a client with a particular lease time and before
updating the secondary server, the secondary server will believe that
a lease has expired even though the client still retains a valid
lease on that IP address.
7 Security Considerations In order to handle this problem, a period of time known as the "max-
imum delta lease interval" (MDLI) is defined and must be known to
both the primary and secondary servers. Proper use of this time
interval places an upper bound on the difference allowed between the
lease time provided to a DHCP client and the lease time known by the
secondary server. In order that this is not the maximum lease time
that the primary can ever provide to a client, during a lazy update
the primary typically updates the secondary with lease time informa-
tion which is longer than the lease time previously given to the
client.
8 Authors' Addresses In the case where the secondary needs to take over from the primary,
the secondary will not reallocate any IP addresses from one client to
a different client. When transitioning to the PARTNER-DOWN state
(where the secondary is allowed to reallocate IP addresses), the
secondary will wait the maximum-delta-lease-interval before
completing the state transition. Thus, any clients which have a lease
on an IP address with a lease time greater than that known by the
secondary will either have contacted the secondary during that time or
their lease will have expired.
This protocol requires a DHCP server to deal with several different
lease intervals and places specific restrictions on their relation-
ships. The purpose of these restrictions is to allow the other server
in the pair to be able to make certain assumptions in the absence of
an ability to communicate between servers.
The different lease times are:
o desired client lease interval
The desired client lease interval is the lease interval that
the DHCP server would like to give to the DHCP client in the
absence of any restrictions imposed by the Failover Protocol.
Its determination is outside of the scope of this protocol.
Typically this is the result of external configuration of a
DHCP server.
o actual client lease interval
The actual client lease interval is the lease interval that
the DHCP server gives out to the DHCP client. It may be
shorter than the desired client lease interval (as explained
below).
o Primary Server lease interval
The Primary Server lease interval is the interval after which
the Primary Server believes that the DHCP client's lease will
expire.
o desired Secondary Server lease interval
The desired Secondary Server lease interval is the interval,
communicated by the Primary Server to the Secondary Server, after
which the lease will expire.
o acknowledged Secondary Server lease interval
The acknowledged Secondary Server lease interval is the inter-
val the Secondary Server has most recently acknowledged. The
key restriction (and guarantee) that the Primary Server makes
with respect to lease intervals is that the actual client
lease interval never exceeds the acknowledged Secondary Server
lease interval (if any) by more than a fixed amount. This
fixed amount is called the "maximum delta lease interval"
(MDLI).
The MDLI MAY be configurable, but for correct server operation it
MUST be known to both the Primary and Secondary Servers.
The Primary Server MUST record in its state both the Primary Server
lease interval and the most recently acknowledged Secondary Server
lease interval. It is assumed that the desired client lease interval
can be determined through techniques outside of the scope of this
protocol.
The above lease time descriptions are written for the case where the
Primary server is operating and in communication with the Secondary
server. In the case where the Secondary server is operating out of
communication with the Primary server, the relationships must hold in
the other direction.
The fundamental relationship among these times which MUST be main-
tained is:
actual client lease interval <
( acknowledged other server lease interval + MDLI )
The "acknowledged other server lease interval" is the acknowledged
secondary server lease interval for the Primary server, and it would
be the acknowledged primary server lease interval for the Secondary
server when it is operating out of contact with the Primary server.
DISCUSSION:
This protocol mandates no particular detailed algorithms concerning
these lease intervals, as long as the above fundamental relationship
is preserved.
In the interests of clarity, however, let's examine a specific
example. The MDLI in this case is 1 hour. The desired client
lease interval is 3 days. In operation this might work as fol-
lows:
When a Primary Server makes an offer for a new lease on an IP
address to a DHCP client, it determines the desired client lease
interval (in this case, 3 days). It then examines the ack-
nowledged Secondary lease interval (which in this case is zero).
Since the actual client lease interval can not be allowed to
exceed the current Secondary lease interval by more than the MDLI,
the offer made to the DHCP client (the actual client lease inter-
val) is for (essentially) the MDLI, 1 hour.
Once the Primary Server has performed the ACK to the DHCP client,
it will update the Secondary Server with the lease information.
However, the Secondary Server lease interval will be composed of
the current actual client lease interval + ( 1.5 * desired client
lease interval). Thus, the Secondary Server is updated with a
lease interval of 4.5 days + 1 hour.
When the Primary Server receives an ACK to its update of the
Secondary Server's lease interval, it records that as the ack-
nowledged Secondary Server lease interval. The Primary Server
MUST ensure that the Secondary Server has received and recorded in
its stable storage the Secondary Server lease interval.
When the DHCP client attempts to renew at T1 (approximately one
half an hour from the start of the lease), the Primary Server
again determines the desired client lease time, which is still 3
days. It then compares this with the remaining acknowledged
Secondary Server lease interval (adjusting for the time passed
since the Secondary Server was last updated), which is 4.5 days +
1/2 hour. The actual client lease interval is set to the desired
client lease interval, as it is less than the acknowledged Secondary
lease interval.
When the Primary DHCP server updates the Secondary DHCP server
after the DHCP client's renewal ACK is complete, it will calculate
the Secondary Server lease interval as the actual client lease
interval (3 days this time) + 0.5 * the desired client lease interval
(1.5 days). In this way, the Primary attempts to have the Secondary
always "lead" the client in its understanding of the client's lease
interval.
Once the initial actual client lease interval of the MDLI is past,
the protocol operates effectively like the DHCP protocol does
today in its behavior concerning lease intervals. However, the
guarantee that the actual client lease interval will never exceed
the acknowledged Secondary Server lease interval by more than the
MDLI allows full recovery from failures in lazy update.
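
The arithmetic of this example can be sketched in a few lines of
Python, working in whole seconds. The function names are hypothetical.
Choosing the largest whole-second value below the limit is one way,
assumed here, of satisfying the strict inequality given earlier. The
lead factors 1.5 and 0.5 are the ones used in the example and are not
mandated by the protocol.

```python
HOUR = 3600
DAY = 24 * HOUR


def actual_client_lease(desired: int, acked_other_remaining: int, mdli: int) -> int:
    """Largest whole-second interval that keeps
    actual < acknowledged other server lease interval + MDLI,
    capped at the desired client lease interval."""
    return min(desired, acked_other_remaining + mdli - 1)


def secondary_update_interval(actual: int, desired: int, lead_factor: float) -> int:
    """Interval sent to the partner: actual plus lead_factor * desired."""
    return actual + int(lead_factor * desired)


mdli = 1 * HOUR
desired = 3 * DAY

# New lease: the Secondary has acknowledged nothing yet.
first_actual = actual_client_lease(desired, 0, mdli)
first_update = secondary_update_interval(first_actual, desired, 1.5)

# Renewal roughly half an hour later, after first_update was acknowledged.
elapsed = HOUR // 2
remaining = first_update - elapsed
renew_actual = actual_client_lease(desired, remaining, mdli)
renew_update = secondary_update_interval(renew_actual, desired, 0.5)
```

The first lease comes out one second short of the MDLI, which matches
the "(essentially) the MDLI" wording in the example, and the renewal
is granted the full desired interval of 3 days.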
6.2. Controlled re-allocation of IP addresses
When the servers cannot communicate, neither server will allow an IP
address previously used by one client to be offered to a different
client. As a corollary, during normal operations the primary server
must update the secondary server whenever a lease expires or an IP
address is released, and must receive acknowledgement of that update
before offering that IP address to a different client.
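
One way to picture this rule is a per-address record that refuses to
hand the address to a new client until the partner has acknowledged
the expiry or release. The class and method names below are
assumptions made for illustration; a real server would tie these
transitions to its binding database and to the DHCPBNDUPD and
DHCPBNDACK exchange.

```python
class AddressRecord:
    """Tracks whether one IP address may be offered to a given client."""

    def __init__(self) -> None:
        self.last_client = None      # last client that held the address
        self.leased = False          # a lease is currently outstanding
        self.release_acked = True    # partner acknowledged the latest expiry/release

    def may_offer_to(self, client: str) -> bool:
        if self.leased:
            # Only the current holder may renew an outstanding lease.
            return client == self.last_client
        if self.last_client is None or client == self.last_client:
            # Never used, or being handed back to its previous holder.
            return True
        # A different client must wait for the partner's acknowledgement.
        return self.release_acked

    def lease_to(self, client: str) -> None:
        if not self.may_offer_to(client):
            raise RuntimeError("address may not be offered to this client yet")
        self.last_client = client
        self.leased = True

    def expire_or_release(self) -> None:
        # The update is on its way to the partner but not yet acknowledged.
        self.leased = False
        self.release_acked = False

    def partner_acked_release(self) -> None:
        self.release_acked = True
```

Note that the previous holder can always get its own address back,
so only reallocation to a different client waits on the partner.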
7. Server States
The following server states are defined:
NORMAL State:
NORMAL state is the state used by a server when it can communicate
with the other server in the Primary-Secondary Server pair. When in
this state, the Primary responds to DHCP client requests, while the
Secondary does not.
COMMUNICATION-INTERRUPTED state:
A server goes into this state whenever it is unable to communicate
with the other server. Both the Primary and Secondary Servers can go
into this state, although the behavior changes that result are dif-
ferent. Primary and Secondary Servers cycle automatically (without
administrative intervention) between NORMAL and COMMUNICATION-
INTERRUPTED state as the network connection between them fails and
recovers, or as the partner server cycles between operational and
non-operational. No duplicate IP address allocation can occur while
the servers cycle between these states. In this state both servers
may respond to DHCP client requests. When allocating new IP
addresses, each server allocates from a different pool. When respond-
ing to renewal requests, each server will allow continued renewal of
a DHCP client's current lease on an IP address.
PARTNER-DOWN state:
PARTNER-DOWN state is a state either server can enter. Once a server
has entered NORMAL state, the PARTNER-DOWN state is entered only on
command of an external agency (typically an administrator of some
sort) or after the expiration of an externally configured minimum
safe-time after the beginning of COMMUNICATION-INTERRUPTED state.
When in this state, the server no longer assumes that the other
server could still be operational and servicing a different set of
clients, but instead assumes that it is the only server operating.
Only one server should be operating in this state at a time. The
server in this state will respond to DHCP client requests. It will
allow renewal of all outstanding leases on IP addresses, and will
allocate IP addresses from its own pool, and after a fixed period of
time, it will allocate IP addresses from the set of all available IP
addresses. The server will transition out of PARTNER-DOWN state after
automatic re-integration with the companion server is complete. This
automatic re-integration will typically be initiated by the restart
of the server which was down.
POTENTIAL-CONFLICT state:
This state indicates that the two servers are attempting to rein-
tegrate with each other, but at least one of them was running in a
state that did not guarantee automatic reintegration would be possi-
ble. In POTENTIAL-CONFLICT state the servers may determine that the
same IP address has been offered and accepted by two different DHCP
clients.
RECOVER state:
This state indicates that the server has no information in its stable
storage. A server in this state will attempt to refresh its stable
storage from the other server.
SYNC state:
In this state, the Secondary Server attempts to synchronize its
stable storage with the Primary Server. Both the Primary and Secon-
dary may have information that the other lacks.
8. Primary Server Operation
This section discusses the operation of the primary server using the
state transition diagram in Figure 8.2-1.
8.1. Primary Server Initialization
When the Primary Server starts, there are three possibilities: it
has never started before and therefore has no record of any previous
state nor of any client binding information; it has started before
and has a record of a previous state and possibly of some client
binding information; it has started before, but failed catastrophi-
cally, and now has no record of any previous state (nor of any client
binding information).
When the Primary Server starts, if it has any record of a previous
state, then if that state was NORMAL or COMMUNICATION-INTERRUPTED it
moves to COMMUNICATION-INTERRUPTED state. If that state was
PARTNER-DOWN or POTENTIAL-CONFLICT, then it moves to PARTNER-DOWN
state. If that state was RECOVER, then the Primary Server moves into
the RECOVER state.
If it has no record of any previous state, then either this is an
initial startup, or a recovery from a catastrophic failure where
stable storage and all client binding information were lost. The two
cases are distinguished by external configuration: a recovery from a
catastrophic failure is indicated to the Primary Server by some
external configuration setting.
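The startup rules above can be sketched as a small lookup. The
function name and the recover_requested flag (standing in for the
external configuration indication) are hypothetical. Mapping an
initial startup with no recorded state to PARTNER-DOWN follows the
"<none>" entry in Figure 8.2-1, since the text above does not name
that state explicitly.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    NORMAL = "NORMAL"
    COMMUNICATION_INTERRUPTED = "COMMUNICATION-INTERRUPTED"
    PARTNER_DOWN = "PARTNER-DOWN"
    POTENTIAL_CONFLICT = "POTENTIAL-CONFLICT"
    RECOVER = "RECOVER"


def primary_startup_state(previous: Optional[State],
                          recover_requested: bool = False) -> State:
    """State the Primary enters at startup, given its recorded state."""
    if previous is None:
        # No record: externally indicated recovery, else initial startup.
        return State.RECOVER if recover_requested else State.PARTNER_DOWN
    if previous in (State.NORMAL, State.COMMUNICATION_INTERRUPTED):
        return State.COMMUNICATION_INTERRUPTED
    if previous in (State.PARTNER_DOWN, State.POTENTIAL_CONFLICT):
        return State.PARTNER_DOWN
    return State.RECOVER  # the previous state was RECOVER
```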
8.2. Primary Server State Transitions
Figure 8.2-1 is the diagram of the Primary Server's state transi-
tions. The remainder of this section contains information important
to the understanding of that diagram.
The server stays in the current state until all of the actions
specified on the state transition are complete. If communication fails
during one of the actions, the server simply stays in the current
state and attempts a transition whenever the conditions for a transi-
tion are later fulfilled.
In the state transition diagram below, the "+" or "-" in the upper
right corner of each state is a notation about whether communication
is ongoing with the Secondary Server. The legend "responsive" and
"unresponsive" in each state indicates whether the Primary Server is
responsive to DHCP client requests in the respective state.
In the state transition diagram below, when communication is
reestablished between the Primary and Secondary Server, the Primary
Server must record the state of the Secondary Server at the time
communication was reestablished.
If the state of the Secondary Server changes while communicating,
then the Primary Server moves through the communications-failed tran-
sition, and into whatever state results. It then immediately moves
through whatever state transition is appropriate given the current
state of the Secondary Server.
DISCUSSION:
The point of this technique is simplicity, both in explanation of
the protocol and in its implementation. The alternative to this
technique of memory of partner state and automatic state transi-
tion on change of partner state is to have every state in the fol-
lowing diagram have a state transition for every possible state of
the partner. With the approach adopted, only the states in which
communications are reestablished require a state transition for
each possible partner state.
All state transitions of the Primary Server must be recorded in its
stable storage, and thus be available to the server after a server
restart.
Previous Primary State:
NORMAL or RECOVER PARTNER DOWN
COMMUNICATION <ext. cmd> POTENTIAL CONFLICT
INTERRUPTED | <none>
+---+ V |
| +----------------+ +-----------------+
| | - | | - |
| | RECOVER | | PARTNER DOWN |<-----+
| | (unresponsive) | | (responsive) | |
| +----------------+ +-----------------+ |
| | | | ^ |
| Comm. OK | Comm. OK | |
| Sec. State: | Sec. State: Comm. |
| | | V All Others Failed |
| | RECOVER +<---+ V | |
| All | | +-------------+ |
| Others | Comm. OK | POTENTIAL +| |
| | Note Sec. State: | CONFLICT | |
| | Poss. RECOVER |(responsive) |<---- | --+
| V Error NORMAL +-------------+ | |
| Sec->Pri | Pri->Sec | | |
| Sync | Sync. Resolve Conflict | |
| | | V V | |
| Wait MDLI | +-----------------+ | |
| from Fail. | | + | External | |
| V V | NORMAL |-Command-->+ |
| +-----++------>| (responsive) | | |
| ^ +-----------------+ | |
| | | | |
| Pri<->Sec Comm. External |
| Sync Failed Command |
| | | or |
| Comm. OK | "Safe Period" |
| Sec. State: V expiration |
| NORMAL +-----------------+ | |
| COMM. INT. | - |---------->+ |
| RECOVER------| COMMUNICATIONS | |
| | INTERRUPTED | Comm. OK |
+------------------>| (responsive) |--Sec. State:--+
+-----------------+ All Others
Figure 8.2-1: Primary Server state diagram.
8.3. Primary Server in PARTNER-DOWN state
When it is in PARTNER-DOWN state, the Primary Server operates largely
as does a normal DHCP server, with none of the special algorithms
described below. In PARTNER-DOWN state the Primary Server MUST
respond to DHCP client requests.
Any available IP address tagged as belonging to the Secondary Server
(at entry to PARTNER-DOWN state) MUST NOT be used until the MDLI
beyond the entry into PARTNER-DOWN state has elapsed.
The Primary Server MUST NOT allocate an IP address to a DHCP client
different from that to which it was allocated at the entrance to
PARTNER-DOWN state until the MDLI beyond its expiration time has
elapsed. If this time would be earlier than the current time plus
the MDLI, then the current time plus the MDLI is used.
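The two timing rules above reduce to simple arithmetic, sketched
below. The function names are hypothetical, times are assumed to be
seconds on a common clock, and "now" is assumed to be the time at
which the calculation is made.

```python
def partner_pool_available_at(entry_time, mdli):
    """Earliest time an available address tagged as belonging to the
    partner may be used: the MDLI after entry to PARTNER-DOWN state."""
    return entry_time + mdli


def earliest_reallocation_time(lease_expiration, now, mdli):
    """Earliest time an address may go to a different client: the MDLI
    beyond lease expiration, but never earlier than now plus the MDLI."""
    return max(lease_expiration + mdli, now + mdli)
```

For example, with an MDLI of 3600 seconds, a lease that expired at
time 1000 and a current time of 5000, the address may not be given to
a different client before time 8600.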
Two options exist for lease times, with different ramifications flow-
ing from each.
If the Primary Server wishes the Failover Protocol to protect it from
loss of stable storage in any state, then it should ensure that the
MDLI based lease time restrictions in Section 6.1 are maintained,
even in PARTNER-DOWN state.
If the Primary Server wishes to forego the protection of the Failover
Protocol in the event of loss of stable storage, then it need recog-
nize no restrictions on actual client lease times while in PARTNER-
DOWN state.
The Primary Server MUST poll the Secondary Server and attempt to
establish communications and synchronization with it.
Once the Primary succeeds in contacting the Secondary Server, the
Primary examines the state of the Secondary Server. If the state of
the Secondary Server is RECOVER or NORMAL, then both servers have
been running in such a way that duplicate IP address allocations were
inhibited. In this case, the Primary Server updates the Secondary
Server with its client binding information, and moves into the NORMAL
state.
Once contact has been established, if the state of the Secondary
Server is anything other than RECOVER or NORMAL then the Primary
Server moves into the POTENTIAL-CONFLICT state.
8.4. Primary Server in RECOVER state
When the Primary Server is initialized in the RECOVER state it expects to
refresh its stable storage from an existing Secondary Server. In
this state the Primary Server MUST NOT respond to DHCP client
requests.
When the Primary Server succeeds in contacting the Secondary Server,
if it determines that the Secondary Server is itself in the RECOVER
state (which indicates that the Secondary Server has no existing
client binding information), the Primary Server will move directly
into NORMAL state after signaling some kind of an error (since some
person had to explicitly start the Primary Server in RECOVER state to
refresh its lost client binding information from the Secondary, and
the Secondary had no state).
If the Primary Server determines that the Secondary Server is in any
state other than RECOVER, then the Secondary Server has some client
binding information that the Primary Server needs before it moves
into the NORMAL state. The Primary Server will attempt to refresh
its state from the Secondary Server, and it will remain in the
RECOVER state until it is successful in doing so.
The Primary Server MUST remain in RECOVER state until a period of at
least the MDLI has passed since the Primary Server was known to have
failed. This allows any DHCP clients holding IP addresses that were
allocated by the Primary Server prior to the loss of its client
binding information in stable storage either to contact the Secondary
Server or to have their leases time out.
DISCUSSION:
The actual requirement on this wait period in RECOVER is that it
start when the Primary Server went down, not necessarily when it
came back up. If the time when the Primary Server failed is
known, then it could be communicated to the recovering server, and
the wait period could be reduced to the MDLI less the difference
between the current time and the time the server failed. In this
way, the waiting period could be minimized.
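The reduced wait period described in this discussion can be sketched
as below. The function name is hypothetical, and treating an unknown
failure time as requiring the full MDLI from the current time is an
assumption made for this illustration.

```python
def remaining_recover_wait(now, mdli, failure_time=None):
    """Seconds the Primary must still wait in RECOVER state."""
    if failure_time is None:
        # Failure time unknown: assume the full MDLI must still elapse.
        return mdli
    # MDLI less the time already elapsed since the failure.
    return max(0, mdli - (now - failure_time))
```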
8.5. Primary Server in NORMAL state
When in NORMAL state, the Primary Server takes the following actions
to implement the Safe Failover Protocol:
o Lease Time Calculations
As discussed in Section 6.1, "Control of lease time", the
lease interval given to a DHCP client can never exceed the
acknowledged Secondary Server lease interval by more than the
maximum delta lease interval (MDLI).
As long as the Primary Server adheres to this constraint, the
specifics of the lease intervals that it gives to either the
DHCP client or the Secondary DHCP server are implementation
dependent. One possible approach is shown in Section 6.1, but
that particular approach is in no way required by this proto-
col.
o Lazy Update of Secondary Server
After an ACK of an IP address binding, the Primary Server
attempts to update the Secondary with the binding information.
The lease time used in the update of the Secondary MUST be at
least that given to the DHCP client in the DHCPACK. It MAY,
however, be longer.
o Reallocation of IP Addresses Between Clients
Whenever a client binding is released, a DHCPBNDUPD message
must be sent to the Secondary Server, setting the binding
state to RELEASED. However, until a DHCPBNDACK is received for
this message, the IP address cannot be allocated to another
client.
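The reallocation rule in the last item above can be sketched as a
small binding record. The class, its fields, and the send_update
callback are hypothetical. Only the DHCPBNDUPD and DHCPBNDACK message
names and the RELEASED binding state come from the text.

```python
class Binding:
    """One IP address binding as tracked by the Primary Server."""

    def __init__(self, address, client_id):
        self.address = address
        self.client_id = client_id
        self.state = "ACTIVE"
        self.release_acked = False

    def release(self, send_update):
        """Mark the binding RELEASED and send a DHCPBNDUPD."""
        self.state = "RELEASED"
        self.release_acked = False
        send_update(self.address, "RELEASED")

    def on_bndack(self):
        """Record the DHCPBNDACK for the RELEASED update."""
        self.release_acked = True

    def may_offer_to(self, client_id):
        """The previous client may always be offered the address; any
        other client only after the release has been acknowledged."""
        if client_id == self.client_id:
            return True
        return self.state == "RELEASED" and self.release_acked
```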
8.6. Primary Server in COMMUNICATION-INTERRUPTED state
When in COMMUNICATION-INTERRUPTED state the Primary Server operates
in such a way that correct operation is ensured even if the Secondary
Server is still up and operational, but unable to communicate with the
Primary Server. When communications are reestablished between the
Primary and Secondary Servers, if both are still in COMMUNICATION-
INTERRUPTED state, then the re-integration of their operation will
proceed automatically and without human intervention. The protocol
is designed to ensure that reintegration will proceed in an error
free manner and that no actions taken by either server while in
COMMUNICATION-INTERRUPTED state will cause problems during reintegra-
tion.
The Primary Server operates in COMMUNICATION-INTERRUPTED state as it
does in NORMAL state.
However, since it cannot communicate with the Secondary in this
state, the acknowledged-Secondary-lease-time will not be updated in
any new bindings. This is likely to eventually cause the actual-
client-lease-times to be the current-time plus the MDLI (unless this
is greater than the desired-client-lease-time).
The Primary Server can simply queue updates to the Secondary on com-
munication interruption and stay in the NORMAL state. If, at the time
communication with the Secondary is reestablished, the Secondary
remains in the NORMAL state as well, then the queued updates for the
Secondary will simply be processed.
COMMUNICATION-INTERRUPTED state for the Primary Server is a signal
that it has stopped queuing updates to the Secondary, and is able to
respond to a variety of possible Secondary states.
It is anticipated that some alarm condition would be raised upon the
transition from NORMAL state to COMMUNICATION-INTERRUPTED state. Once
the Primary Server has been in COMMUNICATION-INTERRUPTED state for a
period equal to the safe-period, then it can (if configured to do so)
transition into the PARTNER-DOWN state. An external command may also
force a transition to PARTNER-DOWN state.
9. Secondary Server Operation
The Secondary Server responds to DHCP client requests only in the
PARTNER-DOWN and COMMUNICATION-INTERRUPTED states.
9.1. Secondary Server Initialization
When the Secondary Server starts, there are three possibilities: it
has never started before and therefore has no record of any previous
state nor of any client binding information; it has started before
and has a record of a previous state and possibly of some client
binding information; it has started before, but failed catastrophi-
cally, and now has no record of any previous state (nor of any client
binding information).
When the Secondary Server starts, if it has any record of a previous
state, then if that state was NORMAL, COMMUNICATION-INTERRUPTED, or
SYNC, it moves to COMMUNICATION-INTERRUPTED state. If that state was
PARTNER-DOWN or POTENTIAL-CONFLICT, then it moves to PARTNER-DOWN
state. In all other cases (both other previous states and the cases
where there is no record of a previous state), the Secondary Server
moves into the RECOVER state.
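The Secondary Server's startup rules can be sketched in the same way.
The function name and the use of plain strings for state names are
choices made for this illustration only.

```python
from typing import Optional


def secondary_startup_state(previous: Optional[str]) -> str:
    """State the Secondary enters at startup, given its recorded state
    (None when there is no record of a previous state)."""
    if previous in ("NORMAL", "COMMUNICATION-INTERRUPTED", "SYNC"):
        return "COMMUNICATION-INTERRUPTED"
    if previous in ("PARTNER-DOWN", "POTENTIAL-CONFLICT"):
        return "PARTNER-DOWN"
    # All other previous states, and no record at all.
    return "RECOVER"
```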
9.2. Secondary Server State Transitions
The server stays in the current state until all of the actions
specified on the state transition are complete. If communication fails
during one of the actions, the server simply stays in the current
state and attempts a transition whenever the conditions for a
transition are later fulfilled.
In the state transition diagram below, the "+" or "-" in the upper
right corner of each state is a notation about whether communication
is ongoing with the Primary Server. The legend "responsive" and
"unresponsive" in each state indicates whether the Secondary Server
is responsive to DHCP client requests in the respective state.
In the state transition diagram below, when communication is
reestablished between the Secondary and Primary Server, the Secondary
Server must record the state of the Primary Server at the time
communication was reestablished. If the state of the Primary Server
changes while communicating, then the Secondary Server moves through
the communications-failed transition, and into whatever state
results. It then immediately moves through whatever state transition
is appropriate for the current state of the Primary Server.
All state transitions of the Secondary Server must be recorded in its
stable storage, and thus be available to the server after a server
restart.
Previous Secondary State:
NORMAL RECOVER PARTNER DOWN
COMM. INT. <none> POTENTIAL CONFLICT
SYNC | |
+---+ V V
| +----------------+ +-----------------+
| | RECOVER - | | PARTNER DOWN - |<-----+
| | (unresponsive) | | (responsive) | |
| +----------------+ +-----------------+ |
| | | | ^ |
| Comm. OK | Comm. OK | |
| Pri. State: | Pri. State: Comm. |
| | | V All Others Failed |
| | RECOVER +<---+ V | |
| | | | +--------------+ |
| | | Comm. OK | POTENTIAL + | |
| All | Pri. State: | CONFLICT | |
| Others | RECOVER |(unresponsive)|<--- | --+
| | Note | +--------------+ | |
| | Poss. Sec->Pri | | |
| V Error Sync. Resolve Conflict | |
| Pri->Sec | V V | |
| Sync | +-----------------+ | |
| V V | NORMAL + |-External->+ |
| +-----++------>| (unresponsive) | Command | |
| ^ +-----------------+ | |
| Pri<->Sec | ^ | |
| Sync | Start Alloc Timer | |
| | | Sec->Pri | |
| +--------------+ | Sync | |
| | + |--->+ | External |
| | SYNC | Comm. Comm. OK Command |
| | unresponsive | Failed Pri. State: or |
| +--------------+ | RECOVER "Safe Period" |
| ^ V | expiration |
| | +------------------+ | |
| Comm. OK | COMMUNICATIONS - |---------->+ |
| Pri. State: | INTERRUPTED | Comm. OK |
| NORMAL-----| (responsive) |--Pri. State:--+
| COMM. INT. +------------------+ All Others
| ^
+---------------------+
Figure 9.2-1: Secondary Server State Diagram.
9.3. Secondary Server in RECOVER state
The Secondary DHCP server comes up in the RECOVER state when it has
no record of any previous state (or that previous state was RECOVER).
It stays in this state until it establishes communication with the
Primary Server, and is unresponsive to DHCP client requests in this
state. Essentially it is idle until it can contact the Primary
Server.
When it establishes communication with the Primary Server, it
attempts to load its client binding database from that of the Primary
Server using the techniques specified in section 6.
Once the Secondary Server's client binding database is refreshed from
that of the Primary, the Secondary Server moves into NORMAL state.
9.4. Secondary Server in NORMAL state
In NORMAL state, the Secondary Server receives state updates from the
Primary Server in DHCPBNDUPD messages. It records these in its
client binding database in stable storage and then sends the
corresponding DHCPBNDACK message to the Primary Server.
While in NORMAL state, the Secondary Server MUST also acquire a
series of IP addresses from the Primary Server to be used to satisfy
DHCPDISCOVER requests from DHCP clients when in COMMUNICATION-
INTERRUPTED state. See Section 2.2.2 for details of this acquisition
process.
The Secondary Server periodically polls the Primary Server with the
DHCPPOLL message. If it fails to receive a DHCPPRPL message in reply
after a configured number of retries or some administratively deter-
mined time, the Secondary Server transitions into COMMUNICATION-
INTERRUPTED state. Both the DHCPPOLL and DHCPPRPL messages carry the
current status of the sender.
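The polling behavior above can be sketched as follows. The send_poll
callback is hypothetical: it is assumed to send one DHCPPOLL and
return the partner state carried in the DHCPPRPL, or None on timeout.
Counting one initial attempt plus max_retries retries is likewise an
assumption.

```python
def poll_partner(send_poll, max_retries):
    """Poll the partner; return its state, or None if every attempt
    (the first one plus max_retries retries) goes unanswered."""
    for _ in range(max_retries + 1):
        reply = send_poll()
        if reply is not None:
            return reply
    return None


def next_state_after_poll(current_state, reply):
    """A Secondary in NORMAL state that gets no reply moves to
    COMMUNICATION-INTERRUPTED; otherwise the state is unchanged."""
    if current_state == "NORMAL" and reply is None:
        return "COMMUNICATION-INTERRUPTED"
    return current_state
```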
If an external command is received by the Secondary Server, it can
move from NORMAL to PARTNER-DOWN state directly. Such a command
might be sent when the Primary Server was removed from service, and an
operator wanted the Secondary Server to take over immediately and
completely from the Primary Server. (Note that the Secondary Server
takes over from the Primary Server when in COMMUNICATION-INTERRUPTED
state, but less completely than in PARTNER-DOWN state.)
9.5. Secondary Server in COMMUNICATION-INTERRUPTED state
When in COMMUNICATION-INTERRUPTED state the Secondary Server operates
in such a way that correct operation is ensured even if the Primary
Server is still up and operational, but unable to communicate with the
Secondary Server. When communications are reestablished between the
Primary and Secondary Servers, if both are still in COMMUNICATION-
INTERRUPTED state, then the re-integration of their operation will
proceed automatically and without human intervention. The protocol
is designed to ensure that reintegration will proceed in an error
free manner and that no actions taken by either server while in
COMMUNICATION-INTERRUPTED state will cause any conflicts to occur
during re-integration.
In COMMUNICATION-INTERRUPTED state, the Secondary Server responds to
DHCP client requests.
When processing a DHCPREQUEST from a DHCP client, the Secondary
Server MUST ensure that the client-lease-time is never more than the
maximum-delta-lease-interval from the current-time, independent of
the desired-client-lease-time.
When processing a DHCPRELEASE request from a DHCP client or the
expiration of a lease, the Secondary Server must not reallocate the
IP address to a different client. If the same client subsequently
performs a DHCPDISCOVER request, the Secondary Server SHOULD offer it
the previously used IP address.
When processing a DHCPDISCOVER request from a DHCP client, the
Secondary Server MUST allocate IP addresses from the list of IP
addresses that it acquired from the Primary Server while in NORMAL
state (see Section 9.4). When it exhausts this list, it MUST stop
responding to DHCPDISCOVER requests (except those it can satisfy by
offering expired or released IP addresses to their previously bound
clients).
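The DHCPDISCOVER handling above can be sketched as below. The function
name and data structures are hypothetical: previous_bindings maps a
client identifier to its expired or released address, and private_pool
is the list of addresses acquired from the Primary Server.

```python
def choose_offer(client_id, previous_bindings, private_pool):
    """Return the address to offer, or None to stay silent."""
    if client_id in previous_bindings:
        # Offer the previously used address back to the same client.
        return previous_bindings[client_id]
    if private_pool:
        # A new client is served from the privately held list only.
        return private_pool.pop()
    # List exhausted: do not respond to this DHCPDISCOVER.
    return None
```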
The Secondary Server MUST continue to send DHCPPOLL messages to the
Primary Server when in COMMUNICATION-INTERRUPTED state. If it
receives a DHCPPRPL message in reply, the Secondary Server determines
the state of the Primary Server. If the Primary Server is in NORMAL
or COMMUNICATION-INTERRUPTED state, then the Secondary Server moves
into the SYNC state.
If, however, the Primary Server is in RECOVER state, then the Secon-
dary Server updates the Primary Server with its known client binding
information, and moves into NORMAL state upon completion of that
update.
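The reaction to a DHCPPRPL reply in COMMUNICATION-INTERRUPTED state
can be sketched as a single function. The function name is
hypothetical. The result for Primary states other than those named in
the text above follows the "All Others" arc of Figure 9.2-1.

```python
def secondary_next_state_from_comm_int(primary_state):
    """Next state for a Secondary in COMMUNICATION-INTERRUPTED state
    once it learns the state of the Primary Server."""
    if primary_state in ("NORMAL", "COMMUNICATION-INTERRUPTED"):
        return "SYNC"
    if primary_state == "RECOVER":
        # Entered only after the Primary has been updated with the
        # Secondary's known client binding information.
        return "NORMAL"
    return "POTENTIAL-CONFLICT"
```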
If instructed to by an outside agency (e.g., an administrator), the
Secondary Server SHOULD move into PARTNER-DOWN state. Once the
Secondary Server has been in COMMUNICATION-INTERRUPTED state for a
period equal to the safe-period, then it may (if configured to do so)
transition into the PARTNER-DOWN state in the absence of an external
command.
9.6. Secondary Server in SYNC state
The Secondary Server does not respond to DHCP client requests when in
SYNC state.
DISCUSSION:
This is the entire reason for this state's existence; otherwise the
activities specified for this state could happen as part of a
state transition from the COMMUNICATION-INTERRUPTED state to the
NORMAL state. However, in the COMMUNICATION-INTERRUPTED state the
Secondary Server responds to DHCP client requests. Having the
Secondary Server respond to DHCP client requests during the syn-
chronization process (and thus taking actions requiring further
synchronization) seemed like a bad idea.
The Secondary Server synchronizes its information with the Primary
Server while in SYNC state. Both Primary and Secondary Servers may
have information the other lacks because of operations performed
while communications were interrupted.
During the synchronization process, the Secondary Server continues to
poll the Primary Server with DHCPPOLL messages. If it fails to
receive a reply, it moves back into COMMUNICATION-INTERRUPTED state.
When synchronization is complete, the Secondary Server moves into
NORMAL state.
9.7. Secondary Server in PARTNER-DOWN state
The Secondary Server responds to DHCP client requests when in
PARTNER-DOWN state.
Any available IP address which does not belong to the private pool
established by the Secondary Server (at entry to PARTNER-DOWN state)
MUST NOT be used until the MDLI beyond the entry into PARTNER-DOWN
state has elapsed.
The Secondary Server MUST NOT allocate an IP address to a DHCP client
different from that to which it was allocated at the entrance to
PARTNER-DOWN state until the MDLI beyond its expiration time has
elapsed. If this time would be earlier than the current time plus the
MDLI, then the current time plus the MDLI is used.
Two options exist for lease times, with different ramifications flow-
ing from each.
If the Secondary Server wishes the Failover Protocol to protect it
from loss of stable storage in any state, then it should ensure that
the MDLI based lease time restrictions in Section 6.1 are maintained,
even in PARTNER-DOWN state.
If the Secondary Server wishes to forego the protection of the safe
Failover Protocol in the event of loss of stable storage, then it MAY
recognize no restrictions on actual client lease times while in
PARTNER-DOWN state.
The Secondary Server continues to poll the Primary Server with
DHCPPOLL messages. If the Secondary Server receives a reply, and the
Primary Server is in the RECOVER state, the Secondary Server updates
the Primary Server with all of the Secondary's client binding infor-
mation, and then moves into the NORMAL state.
If communications with the Primary Server are reestablished, and the
Primary Server is in any other state but RECOVER, the Secondary
Server moves into the POTENTIAL-CONFLICT state (as does the Primary
Server).
9.8. Secondary Server in POTENTIAL-CONFLICT state
The Secondary Server enters POTENTIAL-CONFLICT state when the
combination of its state and that of the Primary Server indicates
that a potential conflict of IP address allocation has occurred.
There is no guarantee that such a conflict has occurred -- just the
possibility. In this state each server compares its client binding
information with that of the other server and any conflicts are
resolved in an implementation-dependent manner.
When (and if) the resolution process completes, each server moves
into the NORMAL state.
10. Safe Period
Due to the restrictions imposed on each server while in
COMMUNICATION-INTERRUPTED state, long-term operation in this state is
not feasible for either server. One reason that this state exists at
all is to allow the servers to easily survive transient network
communications failures of a few minutes to a few days (although the
actual time periods will depend a great deal on the DHCP activity of
the network in terms of arrival and departure of DHCP clients on the
network).
Eventually, when the servers are unable to communicate, they will
have to move into a state where they can no longer re-integrate
without some possibility of a duplicate IP address allocation.
There are two ways that they can move into this state (known as
PARTNER-DOWN).
First, they can be informed by external command that the partner
server is, indeed, down. In this case, there is no difficulty in
moving into the PARTNER-DOWN state, since it is an accurate reflection
of reality and the protocol has been designed to operate correctly
(even during reintegration) if, when in PARTNER-DOWN state, the
partner is, indeed, down.
Second, the servers may be running unattended for extended periods.
For this case the option is provided to configure a "safe-period"
into each server. This OPTIONAL safe-period is the period after which
either the Primary or Secondary Server will automatically transition
to PARTNER-DOWN from COMMUNICATION-INTERRUPTED state. If this
transition is completed and the partner is not down, then the
possibility of duplicate IP address allocations will exist.
The goal of the "safe-period" is to allow network operations staff
some time to react to a server moving into COMMUNICATION-INTERRUPTED
state. During the safe-period the only requirement is that the net-
work operations staff determine if both servers are still running --
and if they are, to either fix the network communications failure
between them, or to take one of the servers down before the expira-
tion of the safe-period.
The length of the safe-period is installation dependent, and depends
in large part on the number of unallocated IP addresses within the
subnet address pool and the expected frequency of arrival of previ-
ously unknown DHCP clients requiring IP addresses. Many environments
should be able to support safe-periods of several days.
During this safe period, either server will allow renewals from any
existing client. The only limitation concerns the need for IP
addresses for the DHCP server to hand out to new DHCP clients and the
need to re-allocate IP addresses to different DHCP clients.
The number of "extra" IP addresses required is equal to the expected
total number of new DHCP clients encountered during the safe period.
This is dependent only on the arrival rate of new DHCP clients, not
the total number of outstanding leases on IP addresses.
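This sizing rule can be sketched as a one-line estimate. The function
name, the choice of hours as the unit, and rounding up to a whole
address are assumptions made for this illustration.

```python
import math


def extra_addresses_needed(arrival_rate_per_hour, safe_period_hours):
    """Expected number of new DHCP clients during the safe period,
    rounded up to a whole number of addresses."""
    return math.ceil(arrival_rate_per_hour * safe_period_hours)
```

For example, an arrival rate of 2 new clients per hour over a
safe-period of 72 hours calls for 144 extra addresses.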
In the unlikely event that a relatively short safe period of an hour
is all that can be used (given a dearth of IP addresses or a very
high arrival rate of new DHCP clients), even that can provide
substantial benefits in allowing the DHCP subsystem to ride through
minor problems that could occur and be fixed within that hour. In
these cases, no possibility of duplicate IP address allocation
exists, and re-integration after the failure is solved will be
automatic and require no operator intervention.
11. Open Issues
A number of details remain to be worked out. They are as follows:
1. Level of Agreement and Completion
This draft is incomplete in two senses. First, none of the
authors agree with everything written, and quite a number of
issues remain to be worked out among the various authors (to say
nothing about the rest of the community). Second, this draft is
not yet complete enough to support creation of inter-operable
implementations.
However, we believe that even though this draft is very much a
work in progress, there is value in sharing it with the rest
of the DHCP community in its current form.
2. Failover Port
We need to resolve whether the Failover protocol runs with the
same or a different port as the DHCP protocol. In the interests
of allowing implementation of the Failover protocol by a dif-
ferent process or sub-process, having it use a different port
seems reasonable.
3. High Level Operations
While the detailed operations are beginning to come together,
the higher level operations (like reintegration) are, as yet,
incompletely specified. This will be rectified in a later
revision.
4. Option Spaces
The draft currently reflects some rather fuzzy goals of using
DHCP options where they apply but also defining new options. It
uses the "user defined option space" for this, which is probably
not a good idea. Perhaps the DHCP Panel will produce a larger
option space in which all of these options can be defined, or
perhaps (as it is written in the draft) this protocol will just
have to define entirely unique options.
5. Subnet Level Granularity
This protocol talks about a server being in one state or
another; however, the desire is for this protocol to operate
independently in each address pool for which a primary and
secondary server is defined. In this way, the "server" state
really refers to the "subnet" state. Once the protocol is vali-
dated, the editing work to make it operate at subnet granularity
will be performed.
6. Secondary Server Communications with DHCP Clients
There are two situations where we may want to allow the secon-
dary server to communicate with DHCP clients even though the
secondary can communicate with the primary and would normally be
unresponsive to DHCP client requests.
The first situation which deserves consideration is where the
secondary has given a DHCP client a lease on an IP address when
it was not able to communicate with the primary, and then subse-
quently the secondary becomes able to communicate with the pri-
mary. When the client unicasts its DHCPREQUEST to the secondary
to renew its lease, the secondary will not be able to communi-
cate with the client (as this protocol is defined). Should we
allow the Secondary to extend the lease for the DHCP client and
then inform the primary of that extension using the DHCPBNDUPD
message in the same way as the Primary uses that message?
The second situation arises where a client can only communicate
with the secondary due to some network failure, but the primary
and secondary server can communicate. As written, the protocol
will not allow the secondary to offer a lease to the DHCP
client, but it would be straightforward to modify the protocol
to allow the secondary to do so. The only difficult part of
this change to the protocol would be to suggest how the secon-
dary would know that the DHCP client could talk only to the
secondary. But, given that if the DHCP primary could talk to
the DHCP client, the secondary would expect to hear about it in
DHCPBNDUPD messages at some point, the absence of such messages
could be used as a signal to communicate to the DHCP client in
question.
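One possible shape for the heuristic in the second situation is
sketched below. It assumes the secondary remembers when it first
heard a request from each client, and treats the continued absence
of a DHCPBNDUPD for that client as the signal to answer. The class
name, the method names, and the silence threshold are assumptions
made for illustration and are not part of the protocol.

```python
class SecondaryObserver:
    """Tracks clients heard by the secondary but not reported by the primary."""

    def __init__(self, silence_threshold):
        # Seconds to wait for a DHCPBNDUPD before answering (hypothetical knob).
        self.silence_threshold = silence_threshold
        # Client id -> time the secondary first heard an unanswered request.
        self.first_heard = {}

    def on_client_request(self, client_id, now):
        """Return True if the secondary should answer this client itself."""
        first = self.first_heard.setdefault(client_id, now)
        return now - first >= self.silence_threshold

    def on_bndupd(self, client_id):
        """The primary reported this client, so the primary can reach it."""
        self.first_heard.pop(client_id, None)
```

A client that keeps retransmitting for longer than the threshold
without any DHCPBNDUPD arriving would then be served by the
secondary. A DHCPBNDUPD for that client resets the timer.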
7. UDP or TCP
There has been much debate about the utility of using UDP for
the failover protocol, since it doesn't supply guaranteed
delivery. Certainly rebuilding TCP out of UDP would be a mis-
take. Some factors to consider in this debate are as follows:
First, it is important to recognize that mere receipt of a
packet by the other server in the pair (e.g., receipt of a
DHCPBNDUPD packet by the secondary server) is not sufficient for
the primary to update its own bindings database with new
information about what the secondary knows. In all cases of
transfers of bindings information, the receiver of a DHCPBNDUPD
message MUST update its own stable storage prior to replying
with a DHCPBNDACK message (except in the marginal case where all
of the updates are rejected). An action is required by the
receiving server and an explicit ACK is needed by the sending
server to ensure the integrity of the protocol. So, just know-
ing that the other server has received a Failover protocol
packet is not intrinsically interesting.
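The ordering requirement in the first factor (stable storage first,
DHCPBNDACK second) can be sketched as follows. This is a minimal
sketch, assuming a JSON file stands in for stable storage. The
function name and the message fields "ip", "client", and "lease_end"
are hypothetical and do not reflect the message format of this
draft.

```python
import json
import os
import tempfile


def handle_bndupd(db_path, bindings, update):
    """Apply one binding update, persist it, and only then build the ACK.

    The message fields ("ip", "client", "lease_end") are hypothetical.
    """
    bindings[update["ip"]] = {
        "client": update["client"],
        "lease_end": update["lease_end"],
    }

    # Write the whole database to a temporary file and force it to disk.
    directory = os.path.dirname(os.path.abspath(db_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(bindings, f)
        f.flush()
        os.fsync(f.fileno())

    # Atomically replace the old database with the new one.
    os.replace(tmp_path, db_path)

    # Stable storage is updated, so it is now safe to acknowledge.
    return {
        "type": "DHCPBNDACK",
        "ip": update["ip"],
        "client": update["client"],
        "lease_end": update["lease_end"],
    }
```

Because the ACK is built only after the write completes, a transport
that merely guarantees delivery of the DHCPBNDUPD would not remove
the need for this application-level acknowledgment.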
Second, the DHCP protocol, both the client and server side, is
being implemented in progressively smaller and smaller machines.
While this progression is most evident in DHCP clients, there
exist implementations today of DHCP servers embedded in devices
that are by no stretch of the imagination traditional "servers"
running mainstream operating systems. In many ways, the Fail-
over protocol is very well suited to such devices. Adding addi-
tional protocol infrastructure requirements to implement the
Failover protocol could easily prevent its implementation in
devices that in some ways need it most.
Third, there are only a few cases where the Failover protocol
requires guaranteed delivery of packets. In particular, the
normal Primary to Secondary DHCPBNDUPD message does not have to
be delivered reliably. The consequences of lost DHCPBNDUPD
messages are handled by the use of the MDLI, for the simple reason
that since these messages are "lazy", they may not get delivered
because of a server failover prior to their transmission. Given
that the protocol is robust in the face of loss of either a
DHCPBNDUPD message or a DHCPBNDACK message, a technique known as
"fire and forget" may be used with this protocol and two
cooperating implementations. If the DHCPBNDACK message contains
all of the information originally in the DHCPBNDUPD message,
then the DHCPBNDUPD message may be transmitted and forgotten by
the sending server (typically the primary). When and if the
secondary receives the DHCPBNDUPD and replies with a DHCPBNDACK
message and the primary receives it, the primary will update its
stable storage with a new picture of what the secondary knows
about the lease time. If either of these messages is lost, the
only downside is that the DHCP client associated with the bind-
ing in question may receive a shorter lease for one lease period
than it would otherwise. This "fire and forget" technique
could substantially ease both the complexity of implementation
and memory requirements of an implementation of the Failover
protocol, especially where two servers were communicating over a
very slow link.
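The "fire and forget" technique might look roughly like the sketch
below. It assumes the DHCPBNDACK echoes every field of the
DHCPBNDUPD, so the sender keeps no record of outstanding updates
and runs no retransmission timer. The class name, the message
fields, and the in-memory dictionary standing in for stable storage
are all hypothetical.

```python
class FireAndForgetPrimary:
    """Sends lazy updates without tracking which ones are outstanding."""

    def __init__(self, send):
        # Callable that transmits a message; it may silently lose it.
        self.send = send
        # IP address -> lease end time the partner is known to have stored.
        # A real server would keep this in stable storage.
        self.partner_lease_end = {}

    def lazy_update(self, ip, client, lease_end):
        # Transmit and forget: no pending-request table, no retry timer.
        self.send({
            "type": "DHCPBNDUPD",
            "ip": ip,
            "client": client,
            "lease_end": lease_end,
        })

    def on_bndack(self, ack):
        # The ACK carries everything needed, so no request lookup is required.
        self.partner_lease_end[ack["ip"]] = ack["lease_end"]
```

If either message is lost, on_bndack is simply never called for
that update, and the sender's picture of what its partner knows
stays at the older, shorter lease time.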
12. Acknowledgments
Ralph Droms started it all, by sketching out an initial interserver
draft that embodied ideas from several past IETF meetings. In that
draft, he acknowledged contributions by Jeff Mogul, Greg Minshall,
Rob Stevens, Walt Wimer, Ted Lemon, and the DHC working group.
Kim Kinnear and Bob Cole each extended that draft, separately and
then together, until they created an interserver draft that supported
any number of servers. The complexity of that approach was just too
great, and led to a much simpler approach embodied in the first
Failover draft by Greg Rabil, Mike Dooley, Arun Kapur, and Ralph
Droms. This draft posited only two servers -- a primary and a
secondary. Kim Kinnear then wrote the Safe Failover draft to layer on
top of the Failover draft and increase its robustness in the face of
certain rare network failures. At the spring 1998 IETF meeting in LA,
the DHC working group said that they wanted a merged Failover and
Safe Failover draft. Steve Gonczi and Bernie Volz stepped up and
produced the raw material for such a merged draft, along with a new
message format designed around DHCP options and other extensions and
clarifications. Kim Kinnear edited their work into draft format and
made other changes, and that is what you have in your hands.
Many people have reviewed the various drafts that went into this
result. At American Internet, ideas have been contributed by Mark
Stapp, Brad Parker, and Ellen Garvey. Glenn Waters of Bay Networks
contributed ideas and enthusiasm to make a Failover protocol that was
both "safe" and "lazy".
13. References
[1] Droms, R., "Dynamic Host Configuration Protocol", RFC 2131,
March 1997.
[2] Alexander, S., Droms, R., "DHCP Options and BOOTP Vendor
Extensions", RFC 2132, March 1997.
[3] Rabil, G., Dooley, M., Kapur, A., Droms, R., "DHCP Failover
Protocol", draft-ietf-dhc-failover-00.txt.
[4] Gudmundsson, Olafur, "Security Architecture for DHCP",
draft-ietf-dhc-security-arch-00.txt.
14. Authors' information
Ralph Droms
323 Dana Engineering
Bucknell University
Lewisburg, PA 17837
Phone: (717) 524-1145
EMail: droms@bucknell.edu
Greg Rabil, Mike Dooley, Arun Kapur
Quadritek Systems, Inc.
10 Valley Stream Parkway, Suite 240
Malvern, PA 19355
Phone: (800) 208-2747
EMail: grabil@quadritek.com
mdooley@quadritek.com
akapur@quadritek.com
Kim Kinnear
American Internet Corporation
4 Preston Ct.
Bedford, MA 01730-2334
Phone: (781) 276-4587
EMail: kinnear@american.com
Steve Gonczi, Bernie Volz
Process Software Corporation
959 Concord St.
Framingham, MA 01701
Phone: (508) 879-6994
EMail: gonczi@process.com
volz@process.com