draft-ietf-dhc-failover-02.txt   draft-ietf-dhc-failover-03.txt 
Network Working Group Ralph Droms Network Working Group Ralph Droms
INTERNET DRAFT Bucknell University INTERNET DRAFT Bucknell University
Greg Rabil Greg Rabil
Mike Dooley Mike Dooley
Arun Kapur Arun Kapur
Quadritek Systems Quadritek Systems
Kim Kinnear Kim Kinnear
American Internet Mark Stapp
Cisco Systems
Steve Gonczi Steve Gonczi
Bernie Volz Bernie Volz
Process Software Process Software
August 1998 November 1998
Expires March 1999 Expires June 1999
DHCP Failover Protocol DHCP Failover Protocol
<draft-ietf-dhc-failover-02.txt> <draft-ietf-dhc-failover-03.txt>
Status of this Memo Status of this Memo
This document is an Internet-Draft. Internet-Drafts are working This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts. working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as ``work in progress.'' material or to cite them other than as "work in progress."
To learn the current status of any Internet-Draft, please check the To view the entire list of current Internet-Drafts, please check the
``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), Directories on ftp.is.co.za (Africa), ftp.nordu.net (Northern
munnari.oz.au (Pacific Rim), ftp.ietf.org (US East Coast), or Europe), ftp.nic.it (Southern Europe), munnari.oz.au (Pacific Rim),
ftp.isi.edu (US West Coast). ftp.ietf.org (US East Coast), or ftp.isi.edu (US West Coast).
Abstract Abstract
DHCP [RFC 2131] allows for multiple servers to be operating on a DHCP [RFC 2131] allows for multiple servers to be operating on a
single network. Some sites are interested in running multiple servers single network. Some sites are interested in running multiple servers
in such a way so as to provide redundancy in case of server failure. in such a way so as to provide redundancy in case of server failure.
In order for this to work reliably, the cooperating Primary and In order for this to work reliably, the cooperating primary and
Secondary servers must maintain a consistent database of the lease
DRAFT January 1998 DRAFT November 1998
secondary servers must maintain a consistent database of the lease
information. This implies that servers will need to coordinate any information. This implies that servers will need to coordinate any
and all lease activity so that this information is synchronized in and all lease activity so that this information is synchronized in
case of failover. case of failover.
This document defines a protocol to provide this synchronization This document defines a protocol to provide this synchronization
between two servers. One server is designated the "Primary" server, between two servers. One server is designated the "Primary" server,
the other is the "Secondary" server. Additionally, this document the other is the "Secondary" server. Additionally, this document
describes a protocol for the automatic transfer of control from the describes a protocol for the automatic transfer of control from the
Primary to the Secondary in the case of failure (failover), as well primary to the secondary in the case of failure (failover), as well
as a network partition. as a network partition.
This document is a merge of draft-ietf-dhc-failover-01.txt and This document further develops the concepts presented in draft-ietf-
draft-ietf-dhc-safe-failover-proto-00.txt, along with substantial dhc-failover-02.txt.
changes to each. Unfortunately, this merge was not completed with
sufficient time to allow review by any of the authors of draft-ietf-
dhc-failover-01.txt, and so it may well not reflect their views even
though their names appear as authors. See Section 11, issue #1 and
Section 12 for more details.
1. Introduction 1. Introduction
As the use of DHCP servers in networked environments grows, the As the use of DHCP servers in networked environments grows, the
dependency of those networks on the DHCP server increases. This is dependency of those networks on the DHCP server increases. This is
particularly true of the hosts that receive their configuration particularly true of the hosts that receive their configuration
information from the DHCP server. Therefore, it is very important to information from the DHCP server. Therefore, it is very important to
be able to provide reliable, continuous availability of DHCP ser- be able to provide reliable, continuous availability of DHCP ser-
vices. vices.
This specification describes a protocol to support automatic failover This specification describes a protocol to support automatic failover
from a primary to its secondary server. The failover mechanism from a primary to its secondary server. The failover mechanism
allows the secondary server to perform DHCP actions while the primary allows the secondary server to perform DHCP actions while the primary
is down, or when a network failure prevents the primary and secondary is down, or when a network failure prevents the primary and secondary
from communicating. The protocol also specifies how reintegration is from communicating. The protocol also specifies how reintegration is
achieved when the primary again becomes operational or when the pri- achieved when the primary again becomes operational or when the pri-
mary and secondary can again communicate. mary and secondary can again communicate.
In providing the specification for the failover, the protocol speci- In providing the specification for the failover, the protocol speci-
fies how to guarantee reliable delivery of changes to the secondary. fies how to guarantee reliable delivery of binding changes to the
This is required to synchronize the secondary's lease data with that partner server. This is required to synchronize lease data between
of the primary. The protocol further specifies a mechanism to allow the primary and the secondary. The protocol further specifies a
the secondary to determine if it can communicate with the primary mechanism to allow either server to determine if it can communicate
server. The secondary will automatically begin to service DHCP with its partner. The secondary will automatically begin to service
requests whenever it cannot communicate with the primary. When the DHCP requests whenever it cannot communicate with the primary. When
primary server becomes available again, the secondary will convey any the primary server becomes available again, the secondary will convey
changes that occurred since the time of failover back to the primary. any changes that occurred since the time of failover back to the pri-
mary.
Through careful control of the difference between the lease times Through careful control of the difference between the lease times
offered to DHCP clients and the lease time known by the secondary offered to DHCP clients and the lease time known by the secondary
server, the protocol allows the primary to communicate with the server, the protocol allows the primary to communicate with the
secondary after the primary has completed communication with the DHCP secondary after the primary has completed communication with the DHCP
client (a technique known as "lazy" update) and still guarantee that client (a technique known as "lazy" update) and still guarantee that
duplicate IP address allocations do not occur. Thus, the protocol duplicate IP address allocations do not occur. Thus, the protocol
does not directly impact the ability of a DHCP server to respond to does not directly impact the ability of a DHCP server to respond to
DHCP client requests. DHCP client requests.
1.1. Requirements Terminology 1.1. Requirements Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC 2119]. document are to be interpreted as described in RFC 2119 [RFC 2119].
skipping to change at page 3, line 37 skipping to change at page 3, line 33
A DHCP client is an Internet host using DHCP to obtain confi- A DHCP client is an Internet host using DHCP to obtain confi-
guration parameters such as a network address. guration parameters such as a network address.
o "DHCP server" or "server" o "DHCP server" or "server"
A DHCP server is an Internet host that returns configuration A DHCP server is an Internet host that returns configuration
parameters to DHCP clients. parameters to DHCP clients.
o "binding" o "binding"
A binding is a collection of configuration parameters, includ- A binding is a collection of configuration parameters, including
ing at least an IP address, associated with or "bound to" a at least an IP address, associated with or "bound to" a DHCP
DHCP client. Bindings are managed by DHCP servers. client. Bindings are managed by DHCP servers.
o "binding database" o "binding database"
The collection of bindings managed by a primary and secondary. The collection of bindings managed by a primary and secondary.
o "subnet address pool" o "subnet address pool"
A subnet address pool is the set of IP addresses which is asso- A subnet address pool is the set of IP addresses which is associ-
ciated with a particular network number and subnet mask. In ated with a particular network number and subnet mask. In the
the simple case, there is a single network number and subnet simple case, there is a single network number and subnet mask
mask and a set of IP addresses. In the more complex case and a set of IP addresses. In the more complex case (sometimes
(sometimes called "secondary subnets", sometimes "super- called "secondary subnets", sometimes "superscopes"), several
scopes"), several (apparently unrelated) network number and (apparently unrelated) network number and subnet mask combina-
subnet mask combinations with their associated IP addresses tions with their associated IP addresses may all be configured
together into one subnet address pool.
may all be configured together into one subnet address pool. o "Primary server" or "Primary"
o "primary server" or "primary"
A DHCP server configured to provide primary service to a set A DHCP server configured to provide primary service to a set of
of DHCP clients for a particular set of subnet address pools. DHCP clients for a particular set of subnet address pools.
o "secondary server" or "secondary" o "Secondary server" or "Secondary"
A DHCP server configured to act as backup to a primary server A DHCP server configured to act as backup to a primary server
for a particular set of subnet address pools. for a particular set of subnet address pools.
o "stable storage" o "stable storage"
Every DHCP server is assumed to have some form of what is Every DHCP server is assumed to have some form of what is called
called "stable storage". Stable storage is used to hold "stable storage". Stable storage is used to hold information
information concerning IP address bindings (among other concerning IP address bindings (among other things) so that this
things) so that this information is not lost in the event of a information is not lost in the event of a server failure which
server failure which requires restart of the server. requires restart of the server.
1.3. Requirements for this protocol 1.3. Requirements for this protocol
The following list of goals must be (and are) achieved by this proto- The following list of goals must be (and are) achieved by this proto-
col. col.
1. Implementations of this protocol must work with existing DHCP 1. Implementations of this protocol must work with existing DHCP
client implementations based on the DHCP protocol [1]. client implementations based on the DHCP protocol [RFC 2131].
2. Implementations of the protocol must work with existing BOOTP 2. Implementations of the protocol must work with existing BOOTP
relay implementations. relay implementations.
3. The protocol must provide failover redundancy between servers 3. The protocol must provide failover redundancy between servers
that are not located on the same subnet. that are not located on the same subnet.
1.4. Goals for this protocol 1.4. Goals for this protocol
1. Provide for continued service to DHCP clients through an 1. Provide for continued service to DHCP clients through an
automated mechanism in the event of failure of the Primary automated mechanism in the event of failure of the primary
Server. server.
2. Avoid binding an IP address to a client while that binding is 2. Avoid binding an IP address to a client while that binding is
currently valid for another client. In other words, don't currently valid for another client. In other words, do not
allocate the same IP address to two clients. allocate the same IP address to two clients.
3. Minimize any need for manual administrative intervention. 3. Minimize any need for manual administrative intervention.
4. Introduce no additional delays in server response time as a 4. Introduce no additional delays in server response time as a
result of inter-server communication. result of the communications required to implement the Fail-
over protocol.
5. Share IP address ranges between primary and secondary
servers; i.e., impose no requirement that the pool of avail-
able addresses be divided between servers. 5. Share IP address ranges between primary and secondary servers;
i.e., impose no requirement that the pool of available
addresses be divided between servers.
6. Continue to meet the goals and objectives of this protocol in 6. Continue to meet the goals and objectives of this protocol in
the event of server failure or network partition. the event of server failure or network partition.
7. Provide graceful reintegration of full protocol service after 7. Provide graceful reintegration of full protocol service after
server failure or network partition. server failure or network partition.
8. Allow for one computer to act as a Secondary Server for mul- 8. Allow for one computer to act as a secondary server for multi-
tiple Primary Servers. Other topologies (e.g., mesh) are also ple primary servers. Other topologies (e.g., mesh) are also
possible. Primary and Secondary Servers SHOULD be viewed as possible. Primary and secondary servers SHOULD be viewed as
"logical" servers and not necessarily physical computers. "logical" servers and not necessarily physical computers.
9. Ensure that an existing client can keep its existing IP 9. Ensure that an existing client can keep its existing IP
address binding if it can communicate with either the Primary address binding if it can communicate with either the primary
or Secondary DHCP server implementing this protocol - not or secondary DHCP server implementing this protocol - not just
just whichever server that originally offered it the binding. whichever server that originally offered it the binding.
10. Ensure that a new client can get an IP address from some 10. Ensure that a new client can get an IP address from some
server. Ensure that in the face of partition, where servers server. Ensure that in the face of partition, where servers
continue to run but cannot communicate with each other, the continue to run but cannot communicate with each other, the
above goals and requirements may be met. In addition, when above goals and requirements may be met. In addition, when the
the partition condition is removed, allow graceful automatic partition condition is removed, allow graceful automatic re-
re-integration without requiring human intervention. integration without requiring human intervention.
11. If either Primary or Secondary Server loses all of the infor- 11. If either primary or secondary server loses all of the infor-
mation that it has stored in stable storage, it should be mation that it has stored in stable storage, it should be able
able to refresh its stable storage from the other server. to refresh its stable storage from the other server.
1.5. Limitations of this Protocol 1.5. Limitations of this Protocol
The following are explicit limitations of this protocol. The following are explicit limitations of this protocol.
1. Under normal operation, only one server at a time will ser- 1. Under normal operation, only one server at a time will hand
vice DHCP client requests; this protocol provides reliability out new IP addresses, but client lease renewals are serviced
through redundancy but not load balancing. by both servers; the protocol provides reliability through
redundancy and some degree of load balancing of lease
renewals.
2. This protocol provides only one level of redundancy through a 2. This protocol provides only one level of redundancy through a
single Secondary Server for each Primary Server. single secondary server for each primary server.
3. The protocol provides a way to detect when the primary and 3. The protocol provides a way to detect when the primary and
secondary server cannot communicate, but once this condition secondary server cannot communicate, but once this condition
has been detected, does not (indeed, cannot) provide any way
has been detected, does not (indeed, cannot) provide any way
to further distinguish between network failure and failure of to further distinguish between network failure and failure of
one of the servers. one of the servers. The protocol allows detection of an ord-
erly shutdown of a participating server.
4. A small number of IP addresses are reserved for Secondary 4. A subset of the address pool is reserved for secondary server
Server use. In order to handle the failure case where both use. In order to handle the failure case where both servers
servers are able to communicate with DHCP clients, but unable are able to communicate with DHCP clients, but unable to com-
to communicate with each other, a small number of IP municate with each other, a subset of the IP address pool must
addresses must be set aside as a private address pool for the be set aside as a private address pool for the secondary
Secondary Server. The Secondary can use these to service server. The secondary can use these to service newly arrived
newly arrived DHCP clients during such a period. The size of DHCP clients during such a period. The size of this private
this private pool SHOULD be based only on the arrival rate of pool SHOULD be based only on the arrival rate of new DHCP
new DHCP clients and the length of expected downtime, and is clients and the length of expected down-time, and is not
not influenced in any way by the total number of DHCP clients influenced in any way by the total number of DHCP clients sup-
supported by the server pair. ported by the server pair.
5. The Primary and Secondary Servers SHOULD pause normal DHCP 5. The primary and secondary servers do not respond to client
transaction processing while resynchronizing, after a system requests at all while recovering from a failure that could
failure. have resulted in duplicate IP assignments. (When synchroniz-
ing in POTENTIAL-CONFLICT state).
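Limitation 4 above says the private pool SHOULD be sized from the arrival rate of new clients and the expected downtime only. A minimal Python sketch of one plausible reading follows. The function name, the units, and the simple product rule are assumptions made for illustration and are not defined by either draft.

```python
import math


def private_pool_size(arrivals_per_hour: float, expected_downtime_hours: float) -> int:
    """Estimate how many addresses the secondary needs in its private pool.

    ASSUMPTION: pool size = arrival rate x expected downtime, rounded up.
    The total number of clients served by the pair is deliberately not an input.
    """
    if arrivals_per_hour < 0 or expected_downtime_hours < 0:
        raise ValueError("rate and downtime must be non-negative")
    return math.ceil(arrivals_per_hour * expected_downtime_hours)
```

For instance, 5 new clients per hour over an expected 8-hour outage would call for 40 reserved addresses under this rule, whether the pair serves 500 clients or 50,000.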
2. Protocol Operations 2. Protocol Operations
The protocol necessary in providing redundant/failover servers can be The protocol features a small number of messages to communicate bind-
grouped in three areas: ing information, operational status and to manage various
disconnect-reconnect scenarios between servers.
o Messages to keep the Secondary Server's lease data synchron- 2.1. Message Addressing and Configuration granularity
ized with that of the Primary so that when failover occurs,
there is no degradation of service.
o Messages that allow the Secondary to determine the operational When discussing messages, an important question is "to whom are mes-
state of the Primary, so as to know when to start servicing sages sent" and "from whom are messages sent". What is the address-
DHCP traffic. able entity from which and to which messages are sent?
o Messages that are used to coordinate the Primary regaining At one level, this would seem to be a single DHCP server, but in fact
control when it has become available again. there are many situations where additional flexibility in configura-
tion is useful. For instance, there might be several servers which
are each primary for a distinct set of address pools, and one server
which is secondary for all of those address pools. The situation
with the primaries is straightforward, but the secondary will need to
maintain a separate failover state, partner state, and communications
up/down status for each of the separate primary servers for which it
is acting as a secondary.
2.1. Time synchronization between communicating servers The protocol allows for there to be a unique failover entity per
partner per role (where role is primary or secondary). This failover
entity can take actions and hold unique states. There are thus a
maximum of two failover entities per partner (one for the partner as
a primary and one for that same partner as a secondary.)
Thus, in the case where there are two primary servers A and B each
backed up by a single common secondary server C, there is one fail-
over entity on each of A and B, and two different failover entities
on C. The two different failover entities on C each have unique
states and message xid ranges. As far as the protocol described in
this draft is concerned, they constitute different "servers",
although they are certainly part of one server (as the term is com-
monly used) if they reside in the same process.
It is not the case that there is subnet granularity for each failover
entity. On one server, there is one failover entity per "partner-
role", regardless of how many subnets or address pools are managed by
that combination of partner and role. Conversely, any given subnet
or pool will be associated with exactly one failover entity on a sin-
gle server (but it will also be associated with the corresponding
partner's failover entity.)
When a message is received from the partner, the unique failover
entity to which the message is directed is determined solely by the
IP address of the partner and the setting of the SECONDARY bit in the
'flags' field of the message header.
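The dispatch rule in the preceding paragraph can be sketched in Python as a table keyed by partner IP address and SECONDARY bit. This is an illustration only: the bit position of SECONDARY within 'flags' is not given in this section, so SECONDARY_FLAG below is a placeholder, and the class, method, and label names are hypothetical.

```python
from ipaddress import IPv4Address
from typing import Dict, Optional, Tuple

# ASSUMPTION: placeholder bit position; this section does not define it.
SECONDARY_FLAG = 0x01


class EntityTable:
    """Maps (partner IP, SECONDARY bit) to a failover entity label."""

    def __init__(self) -> None:
        self._entities: Dict[Tuple[IPv4Address, bool], str] = {}

    def add(self, partner_ip: str, secondary_bit: bool, label: str) -> None:
        # At most two entries can exist per partner: one per bit setting.
        self._entities[(IPv4Address(partner_ip), secondary_bit)] = label

    def lookup(self, source_ip: str, flags: int) -> Optional[str]:
        # Only the partner address and the SECONDARY bit select the entity.
        key = (IPv4Address(source_ip), bool(flags & SECONDARY_FLAG))
        return self._entities.get(key)


# Server C from the example above: one entity for each of primaries A and B.
table = EntityTable()
table.add("192.0.2.1", False, "entity for primary A")
table.add("192.0.2.2", False, "entity for primary B")
```

A message arriving from 192.0.2.1 with the SECONDARY bit clear selects the first entity. The same address with the bit set selects nothing here, because C holds no entity for A in the other role.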
Throughout this document, the states and actions taken by "servers"
are described. The terms "server", "primary server", and "secondary
server" are commonly used to describe the entity taking these states
and taking actions. This description is wholly accurate only for the
simplest of cases, where all of the address pools on one server are
backed up by all of the address pools on another server. In this
case, there is a "true" primary and secondary server. In all other
cases, the term "server" is used to describe one of the two possible
failover entities per partner.
2.2. Packet transport
All messages sent by this protocol are sent in UDP packets. All mes-
sages are unicast from the sender to the receiver. The next section
discusses the port to use when sending DHCP failover UDP packets.
DISCUSSION:
See section 8, Extended discussion #1, for a discussion of the
reasons to use UDP as the protocol.
2.3. Port usage
Compliant servers SHOULD use port 647 (assigned to dhcp-failover by
IANA) for sending and receiving Failover protocol messages, though
they MAY be configured to use a different port (including ports 67 or
68).
Since the use of port 67 and 68 is allowed, the messages are format-
ted in such a way that they can be distinguished from DHCP or BOOTP
messages by the use of distinct message 'op' codes. Note that send-
ing failover messages on port 67 to servers not designed to support
them may not only not work, but may cause those servers to operate
incorrectly or to crash.
DISCUSSION:
Some implementors have a strong requirement for using a separate
port for the Failover protocol, and the use of the allocated port
647 will accommodate them. Some other implementors seem equally
committed to allowing failover packets to be sent to the standard
DHCP port, port 67. The above language strongly suggests that the
failover port be used (by using SHOULD), but leaves open the pos-
sibility of using the standard DHCP port (or any other) for
servers designed to operate in that fashion.
2.4. Time synchronization between communicating servers
Each Binding update message carries a "sent time stamp" (the time Each Binding update message carries a "sent time stamp" (the time
when the message was sent in GMT). This provides a simple mechanism when the message was sent in GMT). This provides a simple mechanism
to determine any "time drift" between communicating servers. to determine any "time drift" between communicating servers.
DISCUSSION: DISCUSSION:
If an UDP packet is successfully transmitted (i.e.: it does not If a UDP packet is successfully transmitted (i.e.: it does not get
get lost), the packet travel time is negligible in the framework lost), the packet travel time is negligible in the framework of
DHCP leases. By providing a GMT "sent time" stamp, the recipient
can compare this with its notion of the current GMT time at the
time it receives the packet. The difference (plus the packet
travel time, which we ignore) is the time drift. The recipient
MUST use this time drift value to bias "absolute time" values it
receives from the sender.
2.5. Failover Protocol Messages
of DHCP leases. By providing a GMT "sent time" stamp, the reci- The Failover protocol messages are sent using UDP and encoded using a
pient can compare this with its notion of the current GMT time at packet format specific to the Failover protocol. To allow easy
the time it receives the packet. The difference (plus the packet recognition of and separation of Failover protocol messages from
travel time, which we ignore) is the time drift. The recipient
can use this time drift value to bias all "absolute time" values
it receives from the sender.
2.2. Failover Protocol Messages
The Failover Protocol messages are encoded using a packet format BOOTP and DHCP messages, BOOTP packet 'op' field values 3..11 are
specific to the Failover Protocol. To allow easy recognition of used to indicate various Failover protocol message types. A Failover
Failover Protocol messages, BOOTP packet "op" field values 3..14 are protocol message is always unicast from the source to the destination
proposed to mark various Failover Protocol messages. A Failover Pro- using the port defined in section 2.3. The sender, and never the
tocol message is always unicast from the source to the destination. recipient, is responsible for retransmission when necessary.
The sender, and never the recipient, is responsible for reliable re-
transmission.
2.3. Failover Protocol packet header format 2.6. Failover protocol packet header format
All of the fields in the fixed portion of the packet MUST be filled
with correct data in every message sent.
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| op (1) | rev (1) | payload offset (2) | | op (1) | rev (1) | payload offset (2) |
+---------------+---------------+---------------+---------------+ +---------------+---------------+---------------+---------------+
| xid (4) | | xid (4) |
+---------------------------------------------------------------+ +---------------------------------------------------------------+
| sending server ID ( IP address ) (4) |
+---------------------------------------------------------------+
| time stamp (4) |
+---------------------------------------------------------------+
| state (1) | flags(1) | reserved (2) |
+---------------+---------------+---------------+---------------+
| 0 or more additional header bytes (variable) | | 0 or more additional header bytes (variable) |
+---------------------------------------------------------------+ +---------------------------------------------------------------+
| Payload data, formatted as DHCP-style options | | Payload Data, formatted as DHCP-style options |
| (although using a unique option number space) | | (although using a unique option number space) |
| (variable) | | (variable) |
+---------------------------------------------------------------+ +---------------------------------------------------------------+
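As an illustration, the 20-byte fixed header shown in the right-hand (-03) diagram could be parsed in Python as sketched below. The field widths and network byte order come from the diagram and field descriptions; the names FailoverHeader and parse_header are hypothetical, and the sketch assumes no additional header bytes beyond what 'payload offset' implies.

```python
import struct
from ipaddress import IPv4Address
from typing import NamedTuple, Tuple

# op(1) rev(1) payload offset(2) xid(4) server ID(4) time stamp(4)
# state(1) flags(1) reserved(2), all in network byte order: 20 bytes.
HEADER = struct.Struct("!BBHI4sIBBH")


class FailoverHeader(NamedTuple):
    op: int
    rev: int
    payload_offset: int
    xid: int
    server_id: IPv4Address
    timestamp: int
    state: int
    flags: int
    reserved: int


def parse_header(packet: bytes) -> Tuple[FailoverHeader, bytes]:
    """Split a packet into its fixed header and its option payload."""
    if len(packet) < HEADER.size:
        raise ValueError("packet shorter than fixed header")
    op, rev, offset, xid, sid, ts, state, flags, reserved = HEADER.unpack_from(packet)
    # The payload starts at 'payload offset', which may exceed 20 if
    # additional header bytes are present.
    if offset < HEADER.size or offset > len(packet):
        raise ValueError("payload offset out of range")
    header = FailoverHeader(op, rev, offset, xid, IPv4Address(sid), ts, state, flags, reserved)
    return header, packet[offset:]
```

Reading the payload from 'payload offset' rather than from a fixed position lets a receiver skip header bytes it does not understand.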
op - 1 byte op - 1 byte
These values extend the number space of the existing BOOTP message These values extend the number space of the existing BOOTP message
type "Op" field. The following types are defined: type "Op" field.
The following message types are defined:
3 DHCPPOOLREQ          Value Message Type
4 DHCPPOOLRESP         ----- ------------
5 DHCPBNDUPD           0     reserved to BOOTP/DHCP, unused by failover
6 DHCPBNDACK           1     BOOTREQUEST (reserved to BOOTP/DHCP, unused by failover)
7 DHCPPOLL             2     BOOTREPLY (reserved to BOOTP/DHCP, unused by failover)
8 DHCPPRPL             3     DHCPPOOLREQ      request allocation of addresses
9 DHCPCTLREQ           4     DHCPPOOLRESP     respond with allocation count
10 DHCPCTLRET          5     DHCPBNDUPD       update partner with binding info
11 DHCPCTLACK          6     DHCPBNDACK       acknowledge receipt of binding update
12 DHCPCTLACKACK       7     DHCPPOLL         probe partner for comm. integrity
13 DHCPREQUEREQ        8     DHCPPRPL         acknowledge comm. integrity
14 DHCPREQUERESP       9     DHCPUPDATEREQALL request full transfer of binding info
                       10    DHCPUPDATEDONE   ack send and ack of req'd binding info
                       11    DHCPUPDATEREQ    req transfer of un-acked binding info
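Because failover messages may share port 67 with BOOTP and DHCP traffic, a receiver can separate them by the 'op' value alone. The Python sketch below uses the -03 (right-hand column) values 3..11; the names FailoverOp and classify are hypothetical.

```python
from enum import IntEnum


class FailoverOp(IntEnum):
    """Failover 'op' values as listed for draft -03."""
    DHCPPOOLREQ = 3
    DHCPPOOLRESP = 4
    DHCPBNDUPD = 5
    DHCPBNDACK = 6
    DHCPPOLL = 7
    DHCPPRPL = 8
    DHCPUPDATEREQALL = 9
    DHCPUPDATEDONE = 10
    DHCPUPDATEREQ = 11


def classify(op: int) -> str:
    """Return 'bootp', 'failover', or 'unknown' for a first-byte op value."""
    if op in (1, 2):  # BOOTREQUEST / BOOTREPLY
        return "bootp"
    try:
        FailoverOp(op)
    except ValueError:
        return "unknown"  # includes 0 and the -02 values 12..14
    return "failover"
```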
rev - 1 byte rev - 1 byte
Failover protocol version supported. Set to 1 for the Failover Proto- Failover protocol version supported. Set to 1 for the Failover
col described in this draft. protocol described in this draft. The value 255 is reserved for
experimental implementations. Such implementations SHOULD use the
DHCP Vendor Class option to recognize a partner server which is using
the same vendor's experimental implementation.
payload offset - 2 bytes, network byte order payload offset - 2 bytes, network byte order
The byte offset of the Payload area, from the beginning of the Fail- The byte offset of the Payload area, from the beginning of the
over packet header. The value for the current protocol version is 8. Failover packet header. The value for the current protocol version is
20.
xid - 4 bytes, network byte order xid - 4 bytes, network byte order
The sender of a failover protocol packet is responsible for setting The sender of a Failover protocol packet is responsible for setting
this number, and the receiver of the packet copies the number over this number, and the receiver of the packet copies the number over
into any response packet. To the receiver it is opaque. The sender into any response packet, treating it as opaque data. The sender
SHOULD ensure that every packet sent to a particular IP address and SHOULD ensure that every packet sent to a particular IP address and
port combination has a unique transaction id unless that packet is a port combination has a unique transaction id unless that packet is a
re-transmission. re-transmission.
2.4. DHCPPOOLREQ and DHCPPOOLRESP: DRAFT November 1998
Whenever the Secondary server transitions into NORMAL mode, it first sending server ID - 4 bytes, network byte order
sends a DHCPPOOLREQ message to initiate a transfer of a small range
of IP addresses that will serve as its private address pool.
This is necessary, because initially the Secondary server has no such The IP address of the sending server. In conjunction with the
address pool, and its pool gets depleted when it hands out addresses setting of the SECONDARY flag, this uniquely determines the failover
in COMMUNICATION-INTERRUPTED mode. This is why the request is sent entity sending the message as well as that destined to receive the
every time the Secondary server transitions into NORMAL mode. The message.
DHCPPOOLREQ message does not carry any payload data. When the Primary
Server gets a DHCPPOOLREQ message, it computes which addresses should
be transferred to the Secondary, and queues up DHCPBNDUPD transac-
tions, setting the Status of these bindings to "BACKUP". Having done
this, it sends a DHCPPOOLRESP message. The DHCPPOOLRESP message
DRAFT January 1998 This is placed in the packet instead of being recovered from the IP
header for security purposes (see section 8).
carries the "Number of addresses transferred" as its payload. time stamp - 4 bytes, unsigned, network byte order
The Secondary server keeps sending DHCPPOOLREQ messages until it A time stamp, indicating the time when the packet was sent. The time
is a 32 bit unsigned long value in network byte order, in units of
seconds (GMT since EPOCH).
It is used to determine the time drift between the sender and the
recipient. The time drift is defined as the difference between
"Arrive Time (GMT)" and "Send Time (GMT)". The actual packet travel
time is assumed to be negligible in this context. All Date-Time
values contained in Failover messages MUST be corrected by the time
drift before being stored by the recipient.
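As a rough illustration of the drift correction described above, the following Python sketch assumes the receiver records its own clock at packet arrival. The function names are hypothetical, and the sign convention (adding the drift to a received value) is one reading of "corrected by the time drift", not something the draft spells out.

```python
def time_drift(arrive_time: int, send_time: int) -> int:
    """Drift in seconds: receiver's arrival time minus the sender's time stamp."""
    return arrive_time - send_time


def correct_time(remote_value: int, drift: int) -> int:
    """Translate a Date-Time value from the sender's clock to the receiver's clock.

    Assumption: a value expressed on the sender's clock is shifted by the drift
    before the receiver stores it.
    """
    return remote_value + drift
```

For example, a packet stamped 990 that arrives at local time 1000 yields a drift of 10 seconds, so a lease time of 5000 carried in that packet would be stored as 5010.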
state - 1 byte
This field indicates the state of the sender, at the time the packet
was sent. The field MUST be set in every Failover message. The
server state value can be one of the following:
Value Server State
----- -------------------------------------------------------------
0 NO-STATE May only occur in POLL messages.
The partner should reply, but
should not react with any state
transition.
1 STARTUP Startup state (1)
2 NORMAL Normal state
3 COMMUNICATIONS-INTERRUPTED Communication interrupted (safe)
4 PARTNER-DOWN Partner down (unsafe mode)
5 POTENTIAL-CONFLICT Synchronizing
6 RECOVER Recovering bindings from partner
7 PAUSED Shutting down for a short period.
8 SHUTDOWN Shutting down for an extended
period.
9 RECOVER-DONE Interlock state prior to NORMAL
Note 1: The STARTUP state is never set in the State field of the mes-
sage, but rather is represented by the setting of the STARTUP flag
(see the description of the Flags field immediately below). When the
server is in the STARTUP state, the state transmitted in the State
byte is the PREVIOUS state (usually, but not always, the last
recorded in stable storage prior to a server going down -- see sec-
tion 6.3 for details.)
flags - 1 byte
Currently, bits 7 (MSB), 6, and 5 are defined. All other bits are
reserved, and must be set to 0.
o SECONDARY
Bit 7 is the SECONDARY flag and defines the server role. Bit 7
is 0 if the sender is a primary server, 1 if it is a secondary
server. Note that this role is fixed for the duration of the
relationship between primary and secondary server. In particu-
lar, it does not change when and if the secondary server "takes
over" for the primary server when it enters COMMUNICATIONS-
INTERRUPTED or PARTNER-DOWN state -- each server retains its
role throughout all of its state transitions.
o RESTART
Bit 6 is the RESTART flag. If bit 6 is 1, the sender is res-
tarting. A server MUST set this bit every time it is re-
started, and it MUST clear the bit upon receiving the first
DHCPPRPL to a DHCPPOLL message it has sent with the bit set.
Whenever a DHCPPOLL message is sent with the RESTART bit set in
the 'flags' field, the MCLT Option, Option 235, MUST be
included.
Whenever a message with the RESTART bit is received by a server,
it MUST transition through the communications failed state tran-
sition. The RESTART bit signals that the partner server has
been restarted, and if communications is already considered to
have failed, then nothing need be done. If, however, the
partner server appeared to be operating correctly, then it was
able to restart without the receiving server noticing that it
was ever gone. The communications failed transition is forced
in this case to restart any on-going resynchronization processes
that were operating with the partner server. See section 6.3
for additional information.
Whenever a DHCPPOLL message is sent with the RESTART bit set,
the server SHOULD include a Vendor Class Identifier, Option 60,
in the message to identify the server to its partner.
o STARTUP
Bit 5 is the STARTUP flag. Bit 5 MUST be set to 1 whenever the
server is in STARTUP state, and set to 0 otherwise. (Note that
when in STARTUP state, the state transmitted in the 'state'
field is usually the last recorded state from stable storage,
but see section 6.3 for details.)
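The flag bits described above could be decoded along the following lines. This is a minimal Python sketch; the constant and function names are hypothetical, and bit 7 is taken to be the most significant bit of the byte, as the text states.

```python
SECONDARY = 0x80      # bit 7 (MSB): sender is the secondary server
RESTART = 0x40        # bit 6: sender is restarting
STARTUP = 0x20        # bit 5: sender is in STARTUP state
RESERVED_MASK = 0x1F  # bits 4..0 are reserved and expected to be 0


def decode_flags(flags: int) -> dict:
    """Split the 1-byte flags field into its three defined bits."""
    return {
        "secondary": bool(flags & SECONDARY),
        "restart": bool(flags & RESTART),
        "startup": bool(flags & STARTUP),
        "reserved_ok": (flags & RESERVED_MASK) == 0,
    }
```

A flags byte of 0xA0, for instance, would describe a secondary server that is in STARTUP state but has not set the RESTART bit.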
reserved - 2 bytes
2 filler bytes, reserved.
2.7. DHCPPOOLREQ and DHCPPOOLRESP:
A secondary server requests addresses for its unique use from the
primary server by using the DHCPPOOLREQ message. The primary is in
complete charge of how many addresses the secondary receives.
The primary server will allocate IP addresses to the secondary server
upon receipt of a DHCPPOOLREQ message and inform the secondary server
of the number of additional addresses allocated in this allocation
cycle by sending the number in the DHCPPOOLRESP message.
When the primary server gets a DHCPPOOLREQ message, it computes which
addresses should be transferred to the secondary, and queues up
DHCPBNDUPD transactions by setting the Status of the selected
addresses to "BACKUP". Having done this, it sends a DHCPPOOLRESP
message. The DHCPPOOLRESP message carries the "Number of addresses
transferred" as its payload. The primary server does not have to
wait until all the above binding updates have been acknowledged.
The secondary server keeps sending DHCPPOOLREQ messages until it
receives a DHCPPOOLRESP with "Number of addresses transferred" = 0, receives a DHCPPOOLRESP with "Number of addresses transferred" = 0,
or it decides that the partner is not responding. Each one of these or it decides that the partner is not responding.
messages MUST have the same transaction ID. If a new transaction ID
is used in one of these messages, the receiving server will begin the If the secondary server receives a DHCPPOOLRESP message with "Number
transmission of the DHCPBNDUPD messages all over again. To be clear,
if the Secondary Server receives a DHCPPOOLRESP message with "Number
of addresses transferred" > 0, it MUST send another DHCPPOOLREQ mes- of addresses transferred" > 0, it MUST send another DHCPPOOLREQ mes-
sage. This mechanism makes it possible for the Primary Server to pace sage, since additional addresses may still be waiting for it. How-
the transfer (e.g., it could generate all addresses all at once, or ever, the time at which it sends subsequent DHCPPOOLREQ messages is
one-by-one). implementation dependent. This mechanism makes it possible for the
primary server to pace the transfer (e.g., it could generate all
addresses all at once, or one-by-one) and to some degree for the
secondary to pace their receipt.
The Primary Server must respond to each DHCPPOOLREQ message it DRAFT November 1998
The primary server MUST respond to each DHCPPOOLREQ message it
receives. If it has already generated all private addresses, or it receives. If it has already generated all private addresses, or it
has no available addresses, it MUST send DHCPPOOLRESP with "Number has no available addresses, it MUST send DHCPPOOLRESP with "Number
of addresses transferred" = 0. of addresses transferred" = 0.
2.5. DHCPREQUEREQ and DHCPREQUERESP: The secondary server MAY send a DHCPPOOLREQ message at any time, and
although the primary server is under no obligation to allocate any
additional addresses, it MUST respond with a DHCPPOOLRESP indicating
how many new addresses it has allocated or 0 if no new addresses were
allocated.
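The secondary's side of this exchange can be sketched as a simple loop. The Python below is illustrative only: send_poolreq is a hypothetical callable standing in for "send a DHCPPOOLREQ and wait for the DHCPPOOLRESP", returning the "Number of addresses transferred" or None when the partner is judged not to be responding. The max_rounds limit is an added safety assumption, not part of the draft.

```python
from typing import Callable, Optional


def request_pool(send_poolreq: Callable[[], Optional[int]],
                 max_rounds: int = 100) -> int:
    """Keep requesting addresses until the primary reports 0 or stops responding.

    Returns the total number of addresses the primary reported as transferred.
    """
    total = 0
    for _ in range(max_rounds):
        count = send_poolreq()
        if count is None:   # partner considered unresponsive
            break
        if count == 0:      # primary has nothing more to transfer
            break
        total += count      # count > 0: another DHCPPOOLREQ is required
    return total
```

When each request is sent is left to the implementation, so a real server would likely schedule the calls rather than loop tightly as this sketch does.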
Whenever either server wishes to be updated with the information that 2.8. DHCPUPDATEREQ, DHCPUPDATEREQALL and DHCPUPDATEDONE:
the other server knows and has not yet transmitted to it, it will send a
DHCPREQUEREQ.
The DHCPREQUEREQ message does not carry any payload data. When the Whenever either server wishes to be updated with information the
either server gets a DHCPREQUEREQ message, it computes which updates other server knows but has not yet transmitted, it will send a
should be transferred to the Secondary, and queues up DHCPBNDUPD DHCPUPDATEREQ or DHCPUPDATEREQALL message.
transactions as appropriate. Having done this, it sends a
DHCPREQUERESP message. The DHCPREQUERESP message carries the "Number of
addresses queued up" as its payload. The set of binding updates
queued up will depend on the requesting server's state. (The state
has already been communicated via prior DHCPPOLL/DHCPPRPL messages)
The Secondary server keeps sending DHCPREQUEREQ messages until it When either server gets a DHCPUPDATEREQ or DHCPUPDATEREQALL message,
receives a DHCPREQUERESP with "Number of addresses queued up" = 0, it computes which updates should be transferred to the partner, and
or it decides that the partner is not responding. This is the same queues up DHCPBNDUPD transactions as appropriate. Once all such
approach as used in the DHCPPOOLREQ/DHCPPOOLRESP messages. Each updates have been acknowledged, it sends a DHCPUPDATEDONE message.
one of these DHCPREQUEREQ messages MUST have the same transaction ID.
Use of a new transaction ID will cause re-building of the outgoing
binding update queue.
The Primary Server must respond to each DHCPREQUEREQ message it If the message that initiated this process was a DHCPUPDATEREQ mes-
receives. If it has already queued up all of the previously unsent sage, the receiving server will transmit only DHCPBNDUPD messages for
binding updates, then it MUST send DHCPREQUERESP with "Number of IP addresses which its information indicates that its partner has not
addresses queued up" = 0. acked.
DRAFT January 1998 If, however, the message that initiated this process was a DHCPUP-
DATEREQALL message, the receiving server will transmit DHCPBNDUPD
messages for all IP addresses involved in failover with this partner
in this role.
2.6. DHCPBNDUPD The secondary server periodically re-transmits the DHCPUPDATEREQ mes-
sage, until it receives a DHCPUPDATEDONE message with a matching
'xid' field, or until it decides that the partner is not responding.
The Primary notifies Secondary (or the other way around) of a binding This approach is similar to the DHCPPOOLREQ/DHCPPOOLRESP message
state and data change. exchange, with one critical difference: the DHCPPOOLRESP is sent as
soon as the binding updates are queued up, but the DHCPUPDATEDONE
message is deferred until all of the sender's DHCPBNDUPD messages
have been successfully transmitted and a corresponding DHCPBNDACK
message has been received for each of them.
In response to a binding update, the recipient server MUST respond The server processing a DHCPUPDATEREQ message MUST NOT send a
with a DHCPBNDACK message. Multiple binding updates can be batched corresponding DHCPUPDATEDONE message until all of the DHCPBNDUPD mes-
up, and sent in one Failover Protocol message. sages have been acked by the partner with a DHCPBNDACK message.
2.7. DHCPBNDACK DRAFT November 1998
This message implements a positive, or negative acknowledgement of Any retransmissions of the DHCPUPDATEREQ message MUST have the same
one or more binding updates. transaction ID. Use of a new transaction ID may cause rebuilding of
the outgoing binding update queue or other processing in the server
with a negative effect on performance.
A binding update (or a batch of binding updates sent as one message) 2.9. DHCPBNDUPD
is matched up with its associated acknowledgment by having the
same Xid field value in the message header.
The server sending a DHCPBNDACK message MAY include any of the One server notifies its partner of a binding state change by using
options that are acceptable in a DHCPBNDUPD message when the the DHCPBNDUPD message.
DHCPBNDACK message returned to the sender. If any of this informa-
tion differs from the information in the DHCPBNDUPD message, the
receiver SHOULD update its bindings database with that information
upon receipt of the DHCPBNDACK message.
The DHCPBNDACK MAY selectively reject one or more updates by includ- Every DHCPBNDUPD message MUST contain:
ing one or more IP address - Reject Reason option pairs in the mes-
sage body.
The DHCPBNDACK implicitly acknowledges any binding updates it replies o An Assigned IP Address Option (Option 50).
to, except those it enumerates using Reject Reason Codes.
2.8. DHCPPOLL o A DHCP Binding Status (Option X).
In order to determine the state of a given server, or to communicate o Where the Binding Status is ACTIVE, EXPIRED, RELEASED, or RESET,
a critical change in its own status, a participant can use the above it MUST also contain one or both of the Client Identifier
message. (Option 61) and the Client Hardware Address (Option X+3). In the
case where the Binding Status is ACTIVE, it MUST contain the
Lease Duration, Option 51.
This message inquires about the current state of the recipient, and o Where dynamic DNS updates are being used by the sending server,
tells the recipient what state the sender is. the Client FQDN Option, Option 81, is used by the sender to
communicate the status of the binding update to its partner.
In response to a binding update, the recipient server MUST respond
with a DHCPBNDACK message.
In response to the DHCPPOLL message, the participant will listen for Multiple binding updates MAY be batched up, and sent in one Failover
a DHCPPRPL message. protocol message (see section 3.1).
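The mandatory-option rules for a DHCPBNDUPD listed above lend themselves to a small check. The Python sketch below is one possible reading, assuming X is 230 (so Binding Status is option 230 and Client Hardware Address is option 233) and using the Binding Status values from section 3.2. The function name and the dictionary representation of an update are hypothetical.

```python
# Option codes taken from the text; X is assumed to be 230.
ASSIGNED_IP = 50
LEASE_DURATION = 51
CLIENT_ID = 61
BINDING_STATUS = 230
CLIENT_HW_ADDR = 233

# Binding Status values that require a client identity.
ACTIVE, EXPIRED, RELEASED, RESET = 2, 3, 4, 6


def bndupd_problems(update: dict) -> list:
    """Return descriptions of mandatory options missing from one binding update.

    `update` maps an option code to its raw data bytes.
    """
    problems = []
    if ASSIGNED_IP not in update:
        problems.append("missing Assigned IP Address (50)")
    if BINDING_STATUS not in update or len(update[BINDING_STATUS]) != 1:
        problems.append("missing DHCP Binding Status (230)")
        return problems  # remaining rules depend on the status value
    status = update[BINDING_STATUS][0]
    if status in (ACTIVE, EXPIRED, RELEASED, RESET):
        if CLIENT_ID not in update and CLIENT_HW_ADDR not in update:
            problems.append(
                "missing Client Identifier (61) or Client Hardware Address (233)")
    if status == ACTIVE and LEASE_DURATION not in update:
        problems.append("missing Lease Duration (51)")
    return problems
```

An ACTIVE update that carries a Client Identifier but no Lease Duration would be reported with a single problem, while a FREE update needs only options 50 and 230.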
DRAFT January 1998 2.10. DHCPBNDACK
2.9. DHCPPRPL This message implements either a positive or negative acknowledgment
of one or more binding updates.
This message replies to the DHCPPOLL message (PRPL=Poll reply). The A binding update (or a batch of binding updates sent as one message)
DHCPPRPL also carries server status information (see message payload is matched up with its associated acknowledgment by having the
details below). same 'xid' field value in the message header.
After a failover, when the Primary Server is restarted, the following The server sending a DHCPBNDACK message MAY include any of the
messages are used to coordinate the Primary taking control back from options that are acceptable in a DHCPBNDUPD message when the
the Secondary: DHCPBNDACK message is returned to the sender. It MUST include at
least the Assigned IP Address Option.
DHCPCTLREQ - Request for control If any of this information differs from the information in the
DHCPCTLRET - Return of control initiated DHCPBNDUPD message, the receiver MUST NOT update its bindings
DHCPCTLACK - Return of control completed
DHCPCTLACKACK - Return of control completed message acknowledged.
The Primary Server sends a DHCPCTLREQ message, indicating that it DRAFT November 1998
would like to take control of the bindings database. The Secondary
Server replies with a DHCPCTLRET message, which serves as a signal to
the Primary "Stand by to receive binding updates". This message then
is followed by a set of binding updates from the secondary to the
primary. When all updates have been transmitted (and acknowledged)
from Secondary to Primary, a DHCPCTLACK message is sent from the
Secondary to the Primary, to signal that "all updates from the Secon-
dary are now completed".
DISCUSSION: database with that information upon receipt of the DHCPBNDACK mes-
sage, since the sender will have no way of knowing if the receiver
actually received the message.
Note, that the DHCPCTLACK message type must be transmitted reli- The DHCPBNDACK MAY selectively reject one or more updates, by includ-
ably, as the Primary Server will not start servicing clients, ing one or more IP address - Reject Reason option pairs in the mes-
until it has received the DHCPCTLACK message. To provide this sage body.
reliability, the DHCPCTLACKACK message is provided. This provides
an acknowledgment of the DHCPCTLACK message, and the DHCPCTLACK
message will be periodically re-sent until it is acknowledged. We
could just periodically re-send the DHCPCTLACK message until we
start receiving binding updates from the Primary, but the Primary
may not have any updates to send at all, hence the need for an
explicit DHCPCTLACKACK message.
The Primary Server transitions into NORMAL state upon receiving a The DHCPBNDACK implicitly acknowledges any binding updates it replies
DHCPCTLACK from the secondary, when the secondary has completed send- to, except those it enumerates using Reject Reason Codes.
ing all of its updates during synchronization. The DHCPCTLACKACK
message is needed to prevent the primary from waiting and not servic-
ing clients if the DHCPCTLACK message got lost. The Secondary server
will keep re-sending the DHCPCTLACK message, until:
1. It decides that the primary is not responding, so the Secon- Implementations of this protocol MAY send batched updates, and they
dary server goes into COMMUNICATION-INTERRUPTED mode. MUST be prepared to receive batched updates.
DRAFT January 1998 2.11. DHCPPOLL
2. It receives a DHCPCTLACKACK or a DHCPBNDUPD message from the In the absence of other messages, a DHCPPOLL message is used to
primary. The Primary's DHCPBNDUPD messages would start verify the communications integrity of the link between the primary
arriving at the Secondary server, if the Primary did get the and secondary servers. It is used by either server whenever there is
DHCPCTLACK, but the DHCPCTLACKACK message got lost. some question about either the communications integrity or running
status of the other server.
Since current state and other status information is transmitted in
every DHCPPOLL and in every DHCPPRPL message, the DHCPPOLL and
DHCPPRPL exchange can also be used to signal a change in status by a
server or as a way to request an update of the status of its partner.
Whenever a DHCPPOLL message is generated it MUST have a unique value
in the 'xid' field, unless it is a retransmission of a previously
un-acked DHCPPOLL message.
2.12. DHCPPRPL
This message simply replies to the DHCPPOLL message (PRPL = Poll
reply). Like all messages, it needs to have all of the fixed
portions of the failover packet header filled in, including the state
and the flags fields.
3. Protocol Payload Data Format 3. Protocol Payload Data Format
Payload data is encoded as a set of flexible DHCP/BOOTP style Payload data is encoded as a set of flexible DHCP/BOOTP style options
options. (The usual 1 byte option code, 1 byte length, and "length" [RFC 2132]. (The usual 1 byte option code, 1 byte length, and
bytes of data). The options are placed after the header, after skip- "length" bytes of data). The options are placed after the header,
ping PayloadOffset bytes. The payload data options are not preceded after skipping PayloadOffset bytes. The payload data options are not
"cookie" value. preceded by a "cookie" value.
Since the packet is NOT a DHCP/BOOTP protocol packet, the options Since the packet is NOT a DHCP/BOOTP protocol packet, the options
used here do not conflict with any existing "proper" DHCP/BOOTP used here do not conflict with any existing "proper" DHCP/BOOTP
options. In fact, these options are allocated in relationship to the options. In fact, these options are allocated in relationship to the
DHCP option space in the following way. In cases where the syntax DHCP option space in the following way.
and semantics of a Failover Payload Option is identical to that of a
DHCP/BOOTP option, the same number option number is used. For
options unique to the Failover protocol, options numbers starting at
230 are used.
Thus, all new Failover Protocol option numbers are assigned from a In cases where the syntax and semantics of a Failover Payload Option
continuous range beginning with 230. This number is shown as an X in is identical to that of a DHCP/BOOTP option, the same option number
the tables below. is used. For options unique to the Failover protocol, option numbers
starting at 230 are used.
Thus, all new Failover protocol option numbers are assigned from a
continuous range beginning with 230.
The protocol is permissive in allowing various other DHCP options in The protocol is permissive in allowing various other DHCP options in
binding updates. As long as the sender wishes to use an option, it binding updates. As long as the sender wishes to use an option, it
MAY include it. On the other hand, the recipient MUST ignore any MAY include it. On the other hand, the recipient MUST ignore any
option it is not expecting. option it is not prepared to process.
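The option encoding described in this section (1 byte code, 1 byte length, then "length" bytes of data, starting at the payload offset with no cookie) can be walked with a few lines of code. The following Python sketch is illustrative; the function name is hypothetical, and it assumes every option carries a length byte, so the DHCP pad (0) and end (255) options are not given special treatment.

```python
def parse_options(packet: bytes, payload_offset: int) -> list:
    """Return the payload options as a list of (code, data) tuples, in order."""
    options = []
    pos = payload_offset
    while pos < len(packet):
        if pos + 2 > len(packet):
            raise ValueError("truncated option header")
        code = packet[pos]
        length = packet[pos + 1]
        data = packet[pos + 2:pos + 2 + length]
        if len(data) != length:
            raise ValueError("truncated option data")
        options.append((code, data))
        pos += 2 + length
    return options
```

A receiver following the rule above would then simply skip any (code, data) pair whose code it is not prepared to process.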
Multiple DHCPBNDUPD transactions can be batched together in one UDP
packet. Option sets for individual transaction MUST always begin
with the IP address (Option 50) . This is the only restriction on
payload item ordering. In any other case, payload data items can be
included in any desired order.
In case an implementation chooses to use the DHCPBNDNAK mechanism,
the DHCPBNDNAK message SHOULD contain one or more Option 50s from the
NAK-ed message, to indicate which specific update items are being
NAK-ed.
While the synchronization is in progress, the secondary MUST NOT
accept client requests, and the primary MUST NOT send any updates to
the secondary. This is necessary to allow the Primary to be the sole
arbitrator of any conflicting updates.
3.1. DHCP Server Status
This option is used to convey the current state of a server.
Code Len Type 3.1. Batching multiple binding updates in one packet
+--+---+------+
| X| 1 | 1-15 |
+--+---+------+
Allowed values for this option: Implementations of this protocol MAY send batched updates, and they
MUST be prepared to receive batched updates.
Value Message Type Multiple DHCPBNDUPD transactions MAY be batched together in one
----- ------------ protocol message. Data sets for individual transactions MUST always
1 UNKNOWN-STATE begin with the Assigned IP Address (Option 50). Option ordering
2 PRIMARY-NORMAL Normal state between the Assigned IP Address options is not significant.
3 BACKUP-NORMAL
4 PRIMARY-COMINT Communication interrupted (safe)
5 BACKUP-COMINT
6 PRIMARY-PARTNERDOWN Partner down (unsafe
mode)
7 BACKUP-PARTNERDOWN
8 PRIMARY-CONFLICT Synchronizing, after a
"Partner-Down"
divergence
9 PRIMARY-SYNC Synchronizing, after a
"communications-
interrupted"
divergence.
10 BACKUP-SYNC
11 PRIMARY-RECOVER Recovering ALL
bindings from partner
12 BACKUP-RECOVER
13 FAILOVER-DISABLED The server is running
with the failover
protocol disabled.
(standalone)
14 SERVER-PAUSED The server is inactive, If batched updates are sent, they MUST be formatted as follows:
shutting down for a short period.
15 SERVER-SHUTDOWN The server is inactive,
shutting down for an extended period.
When a server is being re-started, it should send a DHCPPOLL message Non-IP Address/Non-client specific options first
to its partner, reporting its status (SERVER-PAUSED). In response, Assigned IP address option (50) for the first address
the recipient SHOULD go into COMMUNICATION-INTERRUPTED mode. Options pertaining to first address, including
at least DHCP Binding Status (230)
Assigned IP address option (50) for the second address
Options pertaining to second address, including
at least DHCP Binding Status (230)
...
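The batch layout shown above suggests a straightforward way for a receiver to split a message into individual updates. This Python sketch assumes the options have already been parsed into (code, data) tuples in wire order; the function name is hypothetical.

```python
ASSIGNED_IP = 50  # each Assigned IP Address option starts a new binding update


def split_batch(options: list) -> tuple:
    """Split parsed options into (common_options, per_address_updates).

    Options seen before the first option 50 are treated as not specific to
    any address; each update is the list of options from one option 50 up to
    (but not including) the next.
    """
    common = []
    updates = []
    for code, data in options:
        if code == ASSIGNED_IP:
            updates.append([(code, data)])
        elif updates:
            updates[-1].append((code, data))
        else:
            common.append((code, data))
    return common, updates
```

Because option ordering between the Assigned IP Address options is not significant, each per-address list can be turned into a dictionary before further processing.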
DRAFT January 1998 In case an implementation chooses to reject some or all of the IP
address binding information in a DHCPBNDUPD message in a DHCPBNDACK
reply, the DHCPBNDACK message MUST contain one or more Assigned IP
Address (Option 50) / Reject Reason Code pairs to indicate that the
updates for the address(es) were not accepted. The Assigned IP
Address option communicates which updates out of the batch are being
rejected, and the Reject Reason Code indicates why. Any IP addresses
When a server is being shut down, it should send a DHCPPOLL message DRAFT November 1998
to its partner, reporting its status (SERVER-SHUTDOWN).
In response, the recipient SHOULD go into PARTNER-DOWN mode. present in the DHCPBNDUPD message without corresponding Option 50/
Reject Reason Code pairs in the DHCPBNDACK message are implicitly
acked by the DHCPBNDACK message. If the DHCPBNDUPD message only con-
tains one binding update and that update is rejected, a DHCPBNDACK
with a single Assigned IP Address / Reject Reason Code pair MUST be
sent.
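The implicit-acknowledgment rule can be illustrated from the point of view of the server that sent the DHCPBNDUPD. The Python sketch below assumes the DHCPBNDACK options are available as (code, data) tuples in wire order and that the Reject Reason Code is option 234, as given in section 3.12. The function names are hypothetical.

```python
ASSIGNED_IP = 50
REJECT_REASON = 234


def rejected_addresses(ack_options: list) -> dict:
    """Map each rejected IP address (raw 4 bytes) to its reason code."""
    rejected = {}
    current_ip = None
    for code, data in ack_options:
        if code == ASSIGNED_IP:
            current_ip = data
        elif code == REJECT_REASON and current_ip is not None:
            rejected[current_ip] = data[0]
    return rejected


def acked_addresses(sent_ips: list, ack_options: list) -> list:
    """Addresses from the DHCPBNDUPD that the DHCPBNDACK implicitly acknowledges."""
    rejected = rejected_addresses(ack_options)
    return [ip for ip in sent_ips if ip not in rejected]
```

Any address sent in the update but absent from the rejected map is treated as acknowledged.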
3.2. DHCP Binding Status 3.2. DHCP Binding Status
This option is used to convey the current state of a binding. This This option is used to convey the current state of a binding. This
option is mandatory for DHCPBNDUPD messages. option is mandatory for DHCPBNDUPD messages.
Code Len Type Code Len Type
+-----+-----+-----+ +-----+-----+-----+
| X+1 | 1 | 1-7 | | 230 | 1 | 1-7 |
+-----+-----+-----+ +-----+-----+-----+
Legal values for this option are: Legal values for this option are:
Value Message Type Value Binding Status
----- ------------ ----- ------------------------------------------------
1 FREE The lease has never been used 1 FREE Lease has never been used
2 ACTIVE assigned to a client * 2 ACTIVE Lease is assigned to a client
3 EXPIRED 3 EXPIRED Lease has expired
4 RELEASED A client released the lease 4 RELEASED Lease has been released by client
5 ABANDONED A server or client flagged address 5 ABANDONED A server, or client flagged address as unusable
as not usable. 6 RESET Lease was freed by some external agent
6 RESET Lease was freed by some 7 BACKUP Lease belongs to secondary's private address pool
external agent.
7 BACKUP Lease is set aside for Secondary
server's private address pool.
3.3. Assigned IP address 3.3. Assigned IP address
Uses identical code and format to DHCP Option 50 (requested IP Uses identical code and format to DHCP Option 50 (requested IP
address). address). This option is mandatory for DHCPBNDUPD messages and in
any DHCPBNDACK message where a Reject Reason Code option appears.
Code Len Address Code Len Address
+-----+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+-----+
| 50 | 4 | a1 | a2 | a3 | a4 | | 50 | 4 | a1 | a2 | a3 | a4 |
+-----+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+-----+
DRAFT January 1998 DRAFT November 1998
3.4. Lease grant time 3.4. Absolute time
This absolute time is used for the lease grant time as well as the
partner-down time. When used in a DHCPBNDUPD or DHCPBNDACK
message, it represents the lease grant time. When used in a DHCPPOLL
message, it represents the partner-down time.
An absolute, GMT time value for this option, as time synchronization An absolute, GMT time value for this option, as time synchronization
has already been achieved between the source and the target server has already been achieved between the source and the target server
using the Sent Time Stamp option. Represented as seconds since Jan using the time field in the message. Represented as seconds elapsed
1, 1970 (i.e. ANSI C time_t time value representation). since Jan 1, 1970 (i.e. ANSI C time_t time value representation).
Note that this is (at present) a signed field.
Code Len Time Code Len Time
+------+-----+-----+-----+-----+-----+ +------+-----+-----+-----+-----+-----+
| X+2 | 4 | t1 | t2 | t3 | t4 | | 231 | 4 | t1 | t2 | t3 | t4 |
+------+-----+-----+-----+-----+-----+ +------+-----+-----+-----+-----+-----+
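For illustration, the Absolute time option could be encoded and decoded as follows. This Python sketch assumes option code 231, a signed 32-bit value as noted above, and network byte order for the time bytes (an assumption carried over from the other multi-byte fields in this document). The function names are hypothetical.

```python
import struct

ABSOLUTE_TIME = 231  # option code for Absolute time


def encode_absolute_time(seconds: int) -> bytes:
    """Build the 6-byte option: code, length 4, signed 32-bit seconds since 1970."""
    return struct.pack("!BBi", ABSOLUTE_TIME, 4, seconds)


def decode_absolute_time(option: bytes) -> int:
    """Return the time value from a 6-byte Absolute time option."""
    code, length, seconds = struct.unpack("!BBi", option)
    if code != ABSOLUTE_TIME or length != 4:
        raise ValueError("not an Absolute time option")
    return seconds
```

struct.pack raises struct.error for values outside the signed 32-bit range, which keeps an out-of-range time from being silently truncated.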
3.5. Sent Time Stamp 3.5. Number of addresses transferred to Secondary Server
A time stamp using GMT, when the packet was sent. It is used to
determine the time drift between the sender and the recipient. The
time drift is defined as the difference between "Arrive Time (GMT)"
and (Send Time (GMT)" . The actual packet travel time is assumed to
be negligible in this context. All Date-Time values contained in
Failover messages will be corrected by the time drift before being
stored by the recipient.
Code Len Time
+-----+-----+-----+-----+-----+-----+
| X+3 | 4 | t1 | t2 | t3 | t4 |
+-----+-----+-----+-----+-----+-----+
The time is a 32 bit unsigned long in network byte order, in units of
seconds (GMT since EPOCH).
3.6. Number of addresses transferred to Secondary Server
A 32 bit unsigned long in network byte order. Reports the number of A 32 bit unsigned long in network byte order. Reports the number of
addresses transferred by the Primary to the Secondary Server addresses transferred by the primary to the secondary server
(addresses to be used for the Secondary Server's private address (addresses to be used for the secondary server's private address
pool) pool)
DRAFT January 1998 Code Len Number of Addresses
Code Len Time
+-----+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+-----+
| X+4 | 4 | t1 | t2 | t3 | t4 | | 232 | 4 | n1 | n2 | n3 | n4 |
+-----+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+-----+
3.7. Lease Duration 3.6. Lease Duration
Uses the format and code of the standard DHCP IP Address Lease Time Uses the format and code of the standard DHCP IP Address Lease Time
option. It is used by the DHCP protocol in the exact same way by the option (51). The time is in units of seconds, and is specified as a
DHCPOFFER message. The time is in units of seconds, and is specified 32-bit unsigned integer. A Lease Duration of 0xFFFFFFFF indicates an
as a 32-bit unsigned integer. A Lease Duration of 0xFFFFFFFF indi- infinite lease.
cates an infinite lease.
Code Len Lease Time Code Len Lease Time
+-----+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+-----+
| 51 | 4 | t1 | t2 | t3 | t4 | | 51 | 4 | t1 | t2 | t3 | t4 |
+-----+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+-----+
3.8. Client Identifier DRAFT November 1998
3.7. Client Identifier
The format, code and conventions used are identical to DHCP option The format, code and conventions used are identical to DHCP option
61. 61.
Code Len Type Client-Identifier Code Len Type Client-Identifier
+-----+-----+-----+-----+-----+--- +-----+-----+-----+-----+-----+---
| 61 | n | t1 | i1 | i2 | ... | 61 | n | t1 | i1 | i2 | ...
+-----+-----+-----+-----+-----+--- +-----+-----+-----+-----+-----+---
3.9. Client Hardware Address 3.8. Client Hardware Address
The format is similar to DHCP option 61. T1 (type) MUST be set to the The format is similar to DHCP option 61. T1 (type) MUST be set to the
proper ARP hardware address code ( it MUST NOT be zero!) TBD: Refer- proper ARP hardware address code, as defined in the ARP section of
ence the ARP document here. RFC 1700 (it MUST NOT be zero!)
Code Len Type Client-Identifier Code Len Type MAC address
+-----+-----+-----+-----+-----+--- +-----+-----+-----+-----+-----+---
| X+5 | n | t1 | i1 | i2 | ... | 233 | n | t1 | m1 | m2 | ...
+-----+-----+-----+-----+-----+--- +-----+-----+-----+-----+-----+---
Either Client Id, Client Hardware Address or BOTH MAY be present in Either Client Id, Client Hardware Address or BOTH MAY be present in
binding update transactions. At least one of them MUST be present. binding update transactions. At least one of them MUST be present.
If both are present, the Client Id MUST be used to uniquely identify If both are present, the Client Id MUST be used to uniquely identify
the owner of the binding (exactly as in RFC 2131). the owner of the binding (exactly as in RFC 2131).
3.10. Host Name 3.9. Host Name
Uses the format and code of DHCP option 12. Uses the format and code of DHCP option 12.
Code Len Host Name Code Len Host Name
+-----+-----+-----+-----+-----+-----+-----+-----+-- +-----+-----+-----+-----+-----+-----+-----+-----+--
| 12 | n | h1 | h2 | h3 | h4 | h5 | h6 | ... | 12 | n | h1 | h2 | h3 | h4 | h5 | h6 | ...
+-----+-----+-----+-----+-----+-----+-----+-----+-- +-----+-----+-----+-----+-----+-----+-----+-----+--
3.11. Domain Name 3.10. Domain Name
Uses the format and code of DHCP option 15. Uses the format and code of DHCP option 15.
Code Len Domain Name Code Len Domain Name
+-----+-----+-----+-----+-----+-----+-- +-----+-----+-----+-----+-----+-----+--
| 15 | n | d1 | d2 | d3 | d4 | ... | 15 | n | d1 | d2 | d3 | d4 | ...
+-----+-----+-----+-----+-----+-----+-- +-----+-----+-----+-----+-----+-----+--
3.11. Client FQDN
If an implementation supports Dynamic DNS updates, this option can be
used to communicate the DNS name that was set. Uses the format and
code of the Client FQDN option (81) as described in <draft-ietf-dhc-
dhcp-dns-08.txt>.
Code Len Flags Rcode1 Rcode2 Domain Name
+-----+-----+-----+------+------+-----+------
| 81 | n | f | r1 | r2 | d1 | d2...
+-----+-----+-----+------+------+-----+------
3.12. Reject Reason Code 3.12. Reject Reason Code
This option is used to selectively reject binding updates. It MAY be This option is used to selectively reject binding updates. It MAY be
used in a DHCPBNDACK message, always following an option 50. (The option used in a DHCPBNDACK message, always following an option 50. Option 50
50 contains the IP address of the specific update being rejected). contains the IP address of the specific update being rejected.
Note that a Message option, DHCP Option 56, may be included to give a
human readable error indication along with the Reject Reason Code.
Code Len Reason code Code Len Reason code
+-----+-----+-----+ +-----+-----+----------+
| X+6 | 1 | R1 | | 234 | 1 | R1 |
+-----+-----+-----+- +-----+-----+----------+
Reason codes : Reason codes :
0 Reserved
1 Illegal IP address (not part of any address pool) 1 Illegal IP address (not part of any address pool)
2 Fatal conflict exists: address in use by other client. 2 Fatal conflict exists: address in use by other client.
3 - 253 Reserved for new Reason Codes.
254 Unknown: Error occurred but does not match any reason code
255 Reserved for code expansion
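The layout above can be sketched in code. The following is a minimal
illustration, not part of either draft: the function and constant
names are hypothetical, and the option code 234 is the value shown in
the -03 column (the -02 column still lists it as "X+6").

```python
import struct

# Assumption: option code 234 as shown in the -03 column of this diff.
REJECT_REASON_CODE = 234

# Reason code values taken from the list above.
REASON_ILLEGAL_ADDRESS = 1
REASON_FATAL_CONFLICT = 2
REASON_UNKNOWN = 254


def encode_reject_reason(reason: int) -> bytes:
    """Encode a Reject Reason Code option as Code, Len (always 1), R1."""
    # 0 and 255 are listed as reserved, so this sketch refuses them.
    if not 1 <= reason <= 254:
        raise ValueError("reason code must be in the range 1..254")
    return struct.pack("!BBB", REJECT_REASON_CODE, 1, reason)
```

A receiver would expect this option to follow the option 50 that
names the rejected IP address.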
3.13. MDLI
Maximum Delta Lease Interval, in seconds. A 32 bit integer value, 3.13. Message
in network byte order.
This option is used to supply a human readable message. It may be
used in association with the Reject Reason Code to provide a human
readable error message for the reject.
Code Len Text
+-----+-----+------+-----+--
| 56 | n | c1 | c2 | ...
+-----+-----+------+-----+--
3.14. MCLT - Maximum Client Lead Time
Maximum Client Lead Time, in seconds. A 32 bit integer value, in
network byte order. This option MUST be used in DHCPPOLL and DHCPPRPL
messages, when the server is NOT in normal state.
Code Len Time Code Len Time
+------+-----+-----+-----+-----+-----+ +------+-----+-----+-----+-----+-----+
| X+7 | 4 | t1 | t2 | t3 | t4 | | 235 | 4 | t1 | t2 | t3 | t4 |
+------+-----+-----+-----+-----+-----+ +------+-----+-----+-----+-----+-----+
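As an illustration of the "32 bit integer value, in network byte
order" encoding, the sketch below packs and unpacks the option. The
helper names are hypothetical, and the code 235 is the value shown in
the -03 column (the -02 column lists it as "X+7").

```python
import struct

# Assumption: option code 235 as shown in the -03 column of this diff.
MCLT_OPTION_CODE = 235


def encode_mclt(seconds: int) -> bytes:
    """Encode Code, Len (always 4), then the time as a big-endian uint32."""
    return struct.pack("!BBI", MCLT_OPTION_CODE, 4, seconds)


def decode_mclt(data: bytes) -> int:
    """Return the time in seconds from a 6-byte encoded option."""
    code, length, seconds = struct.unpack("!BBI", data[:6])
    if code != MCLT_OPTION_CODE or length != 4:
        raise ValueError("not a well-formed MCLT option")
    return seconds
```

The "!" prefix selects network (big-endian) byte order with no
padding, so the encoded option is exactly six bytes long.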
4. Exchange of control between Primary and Secondary 3.15. Vendor Class Identifier
The Primary and Secondary Servers coordinate the exchange of control
over the bindings database through the use of DHCPPOLL and DHCPCTLREQ
messages. In normal operation:
The Primary sends notification of each change to its bindings data- A string which identifies the vendor of the failover protocol
base to the Secondary, and the Secondary keeps its bindings database implementation.
synchronized with the Primary's database.
The Secondary periodically sends DHCPPOLL messages to the Primary, The code for this option is 60, and its minimum length is 1.
and the Primary responds to each DHCPPOLL message with a DHCPPRPL
message. If the Secondary does not receive a DHCPPRPL response mes-
sage, the Secondary takes control of the bindings database and begins
answering requests from DHCP clients. Note that the Secondary should
be able to be configured to not perform the automatic switch-over.
The conditions under which a Secondary takes control of the bindings Code Len Vendor Class Identifier
database, e.g., the number of consecutive missing acknowledgments, +-----+-----+-----+-----+-----+--
should be configurable in the Secondary by the DHCP administrator. | 60 | n | i1 | i2 | i3 | ...
+-----+-----+-----+-----+-----+--
4. Challenging scenarios for a Failover protocol
The Secondary records any changes it makes to the bindings database There exist a number of failure scenarios which will challenge the
while it has control. The Secondary continues to send DHCPPOLL mes- correctness guarantees of the Failover protocol. Two of the
sages to the Primary. The DHCPPOLL messages also carry information scenarios that the Failover protocol was specifically designed to
on the state of the Secondary Server. handle correctly are detailed in this section in order to motivate
some of the more unusual aspects of the protocol's operations.
To regain control of the bindings database, e.g., after the Primary
Server has recovered from a failure, or a partitioned network condi-
tion, the Primary sends a DHCPCTLREQ message to the Secondary. The
Secondary stops answering DHCP client requests, and responds to its
Primary with a DHCPCTLRET message. After sending the DHCPCTLRET mes-
sage, the Secondary sends DHCPBNDUPD messages for each of the changes
it has made to the bindings database.
The Primary sends a DHCPBNDACK for each DHCPBNDUPD message it 4.1. Primary Server crash before "lazy" update:
receives. The Secondary completes the transfer of control by sending
a DHCPCTLACK message to the Primary as soon as all of its updates
were acknowledged.
Note, that the Primary SHOULD NOT send any DHCPBNDUPD messages while In the case where the primary server sends a DHCPACK to a client for
synchronization is in progress with the Secondary. a newly allocated IP address and then crashes prior to sending the
corresponding update to the secondary server, the secondary server
will have no record of the IP address allocation. When the secondary
server takes over, it may well try to allocate that IP address to a
different client. In the case where the first client to receive the
IP address is not on the net at the time (yet while there was still
time to run on its lease), an ICMP echo (i.e., ping) will not prevent
the secondary server from allocating that IP address to a different
client.
Once the synchronization is completed, and the Primary transitions This is handled in the protocol by having the primary and secondary
into NORMAL state, and starts sending DHCPBNDUPD transactions on any allocate addresses for new clients from distinct address pools.
accumulated binding changes it may have.
5. Duplicate address assignment scenarios A more likely (in that DHCPRENEWs are presumably more common than
DHCPDISCOVERs) and more subtle version of this problem is where the
primary server crashes after extending a client's lease time, and
before updating the secondary with a new time using a lazy update.
After the secondary takes over, if the client is not connected to the
network the secondary will believe the client's lease has expired
when, in fact, it has not. In this case as well, the IP address
might be reallocated to a different client while the first client is
still using it.
In the following two scenarios, the protocol could end up allocating This scenario is handled by the Failover protocol through control of
duplicate IP addresses, unless the measures recommended in Section 6. the lease time and the use of the maximum client lead time (MCLT).
are taken: See the next section for details.
Primary Server crash before "lazy" update: In the case where the Pri- 4.2. Network partition where servers can't communicate but each can
mary Server sends an ACK to a client for a newly allocated IP address talk to clients:
and then crashes prior to sending the corresponding update to the
Secondary Server, the Secondary Server will have no record of the IP
address allocation. When the Secondary Server takes over, it may
well try to allocate that IP address to a different client. In the
case where the first client to receive the IP address is not on the
net at the time (yet while there was still time to run on its lease),
an ICMP echo (i.e., ping) will not prevent the Secondary Server from
allocating that IP address to a different client.
A more likely and subtle version of this problem is where the Primary Several conditions are required for this situation to occur. First,
Server crashes after extending a client's lease time, and before due to a network failure, the primary and secondary servers cannot
updating the Secondary with a new time using a lazy update. After the communicate. As well, some of the DHCP clients must be able to
Secondary takes over, if the client is not connected to the network communicate with the primary server, and some of the clients must now
the Secondary will believe the client's lease has expired when, in only be able to communicate with the secondary server. When this
fact, it has not. In this case as well, the IP address might be condition occurs, both primary and secondary servers could attempt to
allocate IP addresses for new clients from the same pool of available
addresses. At some point, then, two clients will end up being
allocated the same IP address. This will cause potentially serious
problems when the network failure that created this situation is
corrected.
This is handled in the protocol by having the primary and secondary
servers allocate addresses for new clients from distinct address
reallocated to a different client while the first client is still
using it.
Network partition where servers can't communicate but each can talk pools.
to clients: Several conditions are required for this situation to
occur. First, due to a network failure, the Primary and Secondary
Servers cannot communicate. As well, some of the DHCP clients must
be able to communicate with the Primary Server, and some of the
clients must now only be able to communicate with the Secondary
Server. When this condition occurs, both Primary and Secondary
Servers could attempt to allocate IP addresses for new clients from
the same pool of available addresses. At some point, then, two
clients will end up being allocated the same IP address. This will
cause potentially serious problems when the network failure that
created this situation is corrected.
The next section details how the Failover Protocol prevents either of The specifics of how these two scenarios are handled are supplied in
the above scenarios (and other related scenarios) from causing dupli- the next section.
cate IP address allocation.
6. Duplicate Address Assignment Control 5. Duplicate Address Assignment Control
There are several ways that the Failover protocol avoids the possi- There are several ways that the Failover protocol avoids the possi-
bility of duplicate address assignment. bility of duplicate address assignment.
6.1. Control of lease time 5.1. Control of lease time
The key problem with lazy update is that when the primary server The key problem with lazy update is that when a server fails
fails after updating a client with a particular lease time and before after updating a client with a particular lease time and before
updating the secondary server, the secondary server will believe that updating its partner, the partner will believe that a lease has
a lease has expired even though the client still retains a valid expired even though the client still retains a valid lease on that IP
lease on that IP address. address.
In order to handle this problem, a period of time known as the "max- In order to handle this problem, a period of time known as the "Max-
imum delta lease interval" (MDLI) is defined and must be known to imum Client Lead Time" (MCLT) is defined and must be known to both
both the primary and secondary servers. Proper use of this time the primary and secondary servers. Proper use of this time interval
interval places an upper bound on the difference allowed between the places an upper bound on the difference allowed between the lease
lease time provided to a DHCP client and the lease time known by the time provided to a DHCP client by a server and the lease time known
secondary server. In order that this is not the maximum lease time by that server's partner. In order that this is not the maximum
that the primary can ever provide to a client, during a lazy update lease time that a server can ever provide to a client, during a lazy
the primary typically updates the secondary with lease time informa- update the updating server typically updates its partner with lease
tion which is longer than the lease time previously given to the time information which is longer than the lease time previously given
client. to the client. This allows that server to give a longer lease time
to the client the next time the client renews its lease.
In the case where the secondary needs to take over from the primary, When moving to the PARTNER-DOWN state (where a server is allowed to
the secondary will not reallocate any IP addresses from one client to reallocate the partner's IP addresses), a server will wait the Max-
a different client. When transitioning to the PARTNER-DOWN state imum Client Lead Time before allocating any IP addresses from its
(where the secondary is allowed to reallocate IP addresses), the partner's pool to any new DHCP clients. Thus, any clients which have
a lease on an IP address with a lease time greater than that known by
the server moving into PARTNER-DOWN state will either have contacted
that server during the MCLT period or their leases will have expired.
When a server has transitioned to PARTNER-DOWN state, it MUST NOT
reallocate an IP address from one client to another client until an
additional maximum client lead time interval after the lease on the
first client expires. (Actually, until the maximum client lead time
after what it believes to be the lease expiration time of the first
client.)
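The PARTNER-DOWN reallocation rule can be written as a one-line
check. This is a sketch only: the function name and the use of plain
integer timestamps in seconds are assumptions, and the draft does not
say whether reallocation is allowed at exactly the boundary instant
(this sketch allows it).

```python
def may_reallocate(now: int, believed_expiry: int, mclt: int) -> bool:
    """True once the MCLT has elapsed after the believed lease expiry.

    believed_expiry is the expiration time this server has recorded
    for the first client, which may be earlier than what the client
    itself was told by the failed partner.
    """
    return now >= believed_expiry + mclt
```

Waiting the extra MCLT covers the case where the partner extended the
lease by up to the MCLT without completing its lazy update.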
secondary will wait the maximum-delta-lease-interval before complet- The fundamental relationship on which much of the correctness of this
ing the state transition. Thus, any clients which have a lease on an protocol depends is that the lease expiration time known to a DHCP
IP address with a lease time greater than that known by the secondary client MUST NOT be more than the maximum client lead time greater
will either have contacted the secondary during that time or
their lease will have expired.
than the lease expiration time known to a server's partner.
The remainder of this section makes the above fundamental relation-
ship more explicit.
This protocol requires a DHCP server to deal with several different This protocol requires a DHCP server to deal with several different
lease intervals and places specific restrictions on their relation- lease intervals and places specific restrictions on their relation-
ships. The purpose of these restrictions is to allow the other server ships. The purpose of these restrictions is to allow the other server
in the pair to be able to make certain assumptions in the absence of in the pair to be able to make certain assumptions in the absence of
an ability to communicate between servers. an ability to communicate between servers.
The different lease times are: The different lease times are:
o desired client lease interval o desired client lease interval
The desired client lease interval is the lease interval that The desired client lease interval is the lease interval that a
the DHCP server would like to give to the DHCP client in the DHCP server would like to give to a DHCP client in the absence
absence of any restrictions imposed by the Failover Protocol. of any restrictions imposed by the Failover protocol. Its
Its determination is outside of the scope of this protocol. determination is outside of the scope of this protocol. Typi-
Typically this is the result of external configuration of a cally this is the result of external configuration of a DHCP
DHCP server. server.
o actual client lease interval o actual client lease interval
The actual client lease interval is the lease interval that The actual client lease interval is the lease interval that a
the DHCP server gives out to the DHCP client. It may be DHCP server gives out to a DHCP client. It may be shorter than
shorter than the desired client lease interval (as explained the desired client lease interval (as explained below).
below).
o Primary Server lease interval
The Primary Server lease interval is the interval after which o desired partner server lease interval
the Primary Server believes that DHCP client's lease will
expire.
o desired Secondary Server lease interval The desired partner server lease interval is the lease expira-
tion interval the local server tells to its partner.
The desired Secondary Server lease interval is the interval o acknowledged partner server lease interval
the Primary Server tells to the Secondary Server after which
the lease will expire.
o acknowledged Secondary Server lease interval The acknowledged partner server lease interval is the interval
the partner server has most recently acknowledged.
The acknowledged Secondary Server lease interval is the inter- The key restriction (and guarantee) that any server makes with
val the Secondary Server has most recently acknowledged. The respect to lease intervals is that the actual client lease interval
key restriction (and guarantee) that the Primary Server makes never exceeds the acknowledged partner server lease interval (if any)
with respect to lease intervals is that the actual client by more than a fixed amount. This fixed amount is called the "Max-
imum Client Lead Time" (MCLT).
The MCLT MAY be configurable, but for correct server operation it
MUST be the same and known to both the primary and secondary servers.
lease interval never exceeds the acknowledged Secondary Server It is transmitted from the primary to the secondary in every message
lease interval (if any) by more than a fixed amount. This
fixed amount is called the "maximum delta lease interval"
(MDLI).
The MDLI MAY be configurable, but for correct server operation it
MUST be known to both the Primary and Secondary Servers.
The Primary Server MUST record in its state both the Primary Server sent with the RESTART bit set, and also in every poll and poll reply
lease interval and the most recently acknowledged Secondary Server message. The secondary MUST ensure that its value agrees with that
lease interval. It is assumed that the desired client lease interval of the primary. See section 3.14 concerning the MCLT Option.
can be determined through techniques outside of the scope of this
protocol.
The above lease time descriptions are written for the case where the A server MUST record in its stable storage both the local server
Primary server is operating and in communication with the lease interval and the most recently acknowledged partner server
Secondary server. In the case where the Secondary server is operat- lease interval for each IP address binding. It is assumed that the
ing out of communications with the Primary server, then the relation- desired client lease interval can be determined through techniques
ships must hold in the other direction. outside of the scope of this protocol.
The fundamental relationship among these times which MUST be main- Again, the fundamental relationship among these times which MUST be
tained is: maintained is:
actual client lease interval < actual client lease interval <
( acknowledged other server lease interval + MDLI ) ( acknowledged partner lease interval + MCLT )
The "acknowledged other server lease interval" is the acknowledged The "acknowledged partner lease interval" is the acknowledged secon-
secondary server lease interval for the Primary server, and it would dary server lease interval for the primary server, and it would be
be the acknowledged primary server lease interval for the Secondary the acknowledged primary server lease interval for the secondary
server when it is operating out of contact with the Primary server. server when it is operating out of contact with the primary server.
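The relationship above can be expressed as a simple predicate. This
is an illustrative sketch with hypothetical names; all values are in
seconds. The inequality is printed with "<", but the surrounding
prose says the client interval never exceeds the acknowledged
interval "by more than" the fixed amount, and the worked example in
the DISCUSSION grants exactly that amount on a new lease, so the
sketch accepts equality.

```python
def relationship_holds(actual_client: int,
                       acked_partner: int,
                       lead_time: int) -> bool:
    """Check the fundamental lease-interval relationship.

    actual_client: interval given to the DHCP client
    acked_partner: interval most recently acknowledged by the partner
                   (zero if the partner has acknowledged nothing)
    lead_time:     the MCLT (called MDLI in the -02 text)
    """
    return actual_client <= acked_partner + lead_time
```

A server could evaluate this check before sending each DHCPACK to
confirm that the lease it is about to grant is safe.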
Figure 5.1-1 illustrates an initial lease to a client using the rules
discussed in the example which follows it.
DHCP Primary Secondary
Client Server Server
| | |
| >-DHCPDISCOVER-> | |
| <---DHCPOFFER-< | |
| | |
| >-DHCPREQUEST-> | |
| (selecting) | |
| | |
| <--------DHCPACK-< | |
| ^ (MCLT) | |
| : | >-DHCPBNDUPD--> |
| : | (1/2 MCLT + X ) |
| : | |
| : | <-DHCPBNDACK-< |
| MCLT / 2 | |
... : ... ...
| : | |
| V | |
| >-DHCPREQUEST-> | |
| (renew) | |
| | |
| <--------DHCPACK-< | |
| ^ (X) | |
| : | >-DHCPBNDUPD--> |
| : | ( 1/2 X + X ) |
| : | |
| : | <-DHCPBNDACK-< |
| X / 2 | |
| : | |
... ... ... ...
Figure 5.1-1: Lazy Update Message Traffic
X = Desired Client Lease Interval
DISCUSSION: DISCUSSION:
This protocol mandates no particular detailed algorithms concern- This protocol mandates no algorithm concerning these lease inter-
ing these lease intervals, as long as the above fundamental relation- vals, as long as the above fundamental relationship is preserved.
ship is preserved.
In the interests of clarity, however, let's examine a specific In the interests of clarity, however, let's examine a specific
example. The MDLI in this case is 1 hour. The desired client example. The MCLT in this case is 1 hour. The desired client
lease interval is 3 days. In operation this might work as fol- lease interval is 3 days, and its renewal time is half the lease
lows: interval.
When a Primary Server makes an offer for a new lease on an IP
address to a DHCP client, it determines the desired client lease
interval (in this case, 3 days). It then examines the ack-
nowledged Secondary lease interval (which in this case is zero).
The rules for this example are:
Since the actual client lease interval can not be allowed to o What to tell the client:
exceed the current Secondary lease interval by more than the MDLI,
the offer made to the DHCP client (the actual client lease inter-
val) is for (essentially) the MDLI, 1 hour.
Once the Primary Server has performed the ACK to the DHCP client, Take the remainder of the acknowledged partner server lease
it will update the Secondary Server with the lease information. interval. If this is a new lease, then this value will be zero.
However, the Secondary Server lease interval will be composed of If this remainder plus the MCLT is greater than the desired
the current actual client lease interval + ( 1.5 * desired client client lease interval, give the client the desired client lease
lease interval). Thus, the Secondary Server is updated with a interval else give the client the remainder plus the MCLT.
lease interval of 4.5 days + 1 hour.
When the Primary Server receives an ACK to its update of the o What to tell the failover partner server:
Secondary Server's lease interval, it records that as the ack-
nowledged Secondary Server lease interval. The Primary Server
MUST ensure that the Secondary Server has received and recorded in
its stable storage the Secondary Server lease interval.
When the DHCP client attempts to renew at T2 (approximately one Take the renewal interval (typically half of the actual client
half an hour from the start of the lease), the Primary Server lease interval), and add to it the desired client lease inter-
again determines the desired client lease time, which is still 3 val.
days. It then compares this with the remaining acknowledged
Secondary Server lease interval (adjusting for the time passed
since the Secondary Server was last updated), which is 4.5 days +
to the desired client lease interval as it is less than the ack-
nowledged Secondary lease interval.
When the Primary DHCP server updates the Secondary DHCP server In operation this might work as follows:
after the DHCP client's renewal ACK is complete, it will calculate
the Secondary Server lease interval as the actual client lease
interval (3 days this time) + .5 the desired client lease interval
(1.5 days). In this way, the Primary attempts to have the Secon-
dary always "lead" the client in its understanding of the client's
lease interval.
Once the initial actual client lease interval of the MDLI is past, When a primary server makes an offer for a new lease on an IP
the protocol operates effectively like the DHCP protocol does address to a DHCP client, it determines the desired client lease
today in its behavior concerning lease intervals. However, the interval (in this case, 3 days). It then examines the ack-
guarantee that the actual client lease interval will never exceed nowledged partner lease interval (which in this case is zero) and
the acknowledged Secondary Server lease interval by more than the determines the remainder of the time left to run, which is also
MDLI allows full recovery from failures in lazy update. zero. To this it adds the MCLT. Since the actual client
lease interval cannot be allowed to exceed the remainder of the
current partner lease interval plus the MCLT, the offer made to
the client is for the remainder of the current partner lease
interval (i.e., zero) plus the MCLT. Thus, the actual client
lease interval is 1 hour.
6.2. Controlled re-allocation of IP addresses Once the primary server has performed the ACK to the DHCP client,
it will update the secondary server with the lease information.
However, the desired partner server lease interval will be com-
posed of one half of the current actual client lease interval
added to the desired client lease interval. Thus, the secondary
server is updated with a DHCPBNDUPD with a lease interval of 3
days + 1/2 hour specified in the Lease Duration Option (Option
51).
When the servers cannot communicate neither server will allow an IP When the primary server receives an ACK to its update of the
address previously used by one client to be offered to a different secondary server's (partner's) lease interval, it records that as
client. As a corollary, during normal operations the primary server the acknowledged partner server lease interval. A server MUST NOT
send a DHCPBNDACK in response to a DHCPBNDUPD message until it is
sure that the information in the DHCPBNDUPD message resides in its
stable storage. Thus, the primary server in this case can be sure
that the secondary server has recorded the desired partner server
lease interval in its stable storage when the primary server
receives a DHCPBNDACK message from the secondary server.
must update the secondary server whenever a lease expires or an IP When the DHCP client attempts to renew at T1 (approximately one
address is released, and must receive acknowledgement of that update half an hour from the start of the lease), the primary server
before offering the IP address of the expired or released IP address again determines the desired client lease interval, which is still
to a different client. 3 days. It then compares this with the remaining acknowledged
partner server lease interval (3 days + 1/2 hour) and adjusts for
the time passed since the secondary was last updated (1/2 hour).
Thus the remaining time on the acknowledged partner server lease
interval is 3 days. Adding the MCLT to this yields 3 days plus 1
hour, which is greater than the desired client lease interval of 3
days. So the client is renewed for the desired client lease
interval -- 3 days.
7. Server States When the primary DHCP server updates the secondary DHCP server
after the DHCP client's renewal ACK is complete, it will calculate
the desired partner server lease interval as the T1 fraction of
the actual client lease interval (1/2 of 3 days this time = 1.5
days). To this it will add the desired client lease interval of 3
days, yielding a total desired partner server lease interval of
4.5 days. In this way, the primary attempts to have the secondary
always "lead" the client in its understanding of the client's
lease interval so as to be able to always offer the client the
desired client lease interval.
The following server states are defined: Once the initial actual client lease interval of the MCLT is past,
the protocol operates effectively like the DHCP protocol does
today in its behavior concerning lease intervals. However, the
guarantee that the actual client lease interval will never exceed
the remaining acknowledged partner server lease interval by more
than the MCLT allows full recovery from a variety of failures.
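The two example rules ("what to tell the client" and "what to tell
the failover partner server") and the numbers in the walk-through can
be reproduced with the sketch below. The function names are
hypothetical, times are in seconds, and the renewal time is assumed
to be half the actual client lease interval, as in the example. The
protocol itself mandates no particular algorithm.

```python
HOUR = 3600
DAY = 24 * HOUR


def client_lease(acked_partner_remaining: int, desired: int, mclt: int) -> int:
    """What to tell the client.

    If the remaining acknowledged partner interval plus the MCLT is
    greater than the desired interval, grant the desired interval;
    otherwise grant the remainder plus the MCLT.
    """
    limit = acked_partner_remaining + mclt
    return desired if limit > desired else limit


def partner_lease(actual_client: int, desired: int) -> int:
    """What to tell the partner: renewal interval plus desired interval."""
    renewal = actual_client // 2  # assumption: T1 at half the lease
    return renewal + desired


mclt, desired = 1 * HOUR, 3 * DAY

# New lease: nothing acknowledged yet, so the client gets only the MCLT.
first = client_lease(0, desired, mclt)        # 1 hour
acked = partner_lease(first, desired)         # 3 days + 1/2 hour

# Renewal at T1, half an hour later: 3 days remain acknowledged.
remaining = acked - first // 2
second = client_lease(remaining, desired, mclt)   # 3 days
acked2 = partner_lease(second, desired)           # 4.5 days
```

After the first renewal the acknowledged partner interval leads the
client's lease, so every later renewal can be granted the full
desired client lease interval.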
NORMAL State: 5.2. Controlled re-allocation of IP addresses
NORMAL state is the state used by a server when it can communicate When in PARTNER-DOWN state (after a period defined in detail in sec-
with the other server in the Primary-Secondary Server pair. When in tion 6.5.2 has passed), there are no restrictions on reallocating a
this state, the Primary responds to DHCP clients requests, while the lease from one client to another.
Secondary does not.
COMMUNICATION-INTERRUPTED state: In any other state, a server cannot reallocate an address from one
client to another without first notifying (through a DHCPBNDUPD mes-
sage) and receiving acknowledgement (through a DHCPBNDACK message)
that its partner is aware that that first client is not using the
address.
A server goes into this state whenever it is unable to communicate This could be modeled in the following way (though this specific
with the other server. Both the Primary and Secondary Servers can go implementation is in no way required). An "available" IP address on
into this state, although the behavior changes that result are dif- a server may be allocated to any client. An IP address which was
ferent. Primary and Secondary Servers cycle automatically (without leased to a client and which expired or was released by that client
administrative intervention) between NORMAL and COMMUNICATION- would take on a new state, say "pending-available". When an IP
INTERRUPTED state as the network connection between them fails and address became "pending-available", the partner server would be
recovers, or as the partner server cycles between operational and
non-operational. No duplicate IP address allocation can occur while
the servers cycle between these states. In this state both servers
may respond to DHCP client requests. When allocating new IP
addresses, each server allocates from a different pool. When respond-
ing to renewal requests, each server will allow continued renewal of
a DHCP client's current lease on an IP address.
PARTNER-DOWN state:
PARTNER-DOWN state is a state either server can enter. Once a server notified that this IP address was "available" through a DHCPBNDUPD.
has entered NORMAL state, the PARTNER-DOWN state is entered only on When the sending server received the DHCPBNDACK for that IP address
command of an external agency (typically an administrator of some showing it was "available", it would move the IP address from
sort) or after the expiration of an externally configured minimum "pending-available" to "available", and it would be available for
safe-time after the beginning of COMMUNICATION-INTERRUPTED state. allocation to any clients.
When in this state, the server no longer assumes that the other
server could still be operational and servicing a different set of
clients, but instead assumes that it is the only server operating.
Only one server should be operating in this state at a time. The
server in this state will respond to DHCP client requests. It will
allow renewal of all outstanding leases on IP addresses, and will
allocate IP addresses from its own pool, and after a fixed period of
time, it will allocate IP addresses from the set of all available IP
DRAFT January 1998 A server MAY reallocate an IP address in "pending-available" state to
the same client with no restrictions.
addresses. The server will transition out of PARTNER-DOWN state after 5.3. Secondary renewal of leases
automatic re-integration with the companion server is complete. This
automatic re-integration will typically be initiated by the restart
of the server which was down.
POTENTIAL-CONFLICT state: When operating in NORMAL state, a secondary server MAY process
DHCPREQUEST messages for renewal or rebinding leases. In this case,
the requirements for control of lease time and re-allocation of IP
addresses are the same as that of the primary server.
This state indicates that the two servers are attempting to rein- 6. Server Operation
tegrate with each other, but at least one of them was running in a
state that did not guarantee automatic reintegration would be possi-
ble. In POTENTIAL-CONFLICT state the servers may determine that the
same IP address has been offered and accepted by two different DHCP
clients.
RECOVER state: This section discusses the operation of a server implementing the
Failover protocol using the state transition diagram in Figure 6.2-1.
This is the common state transition diagram for both servers in a
pair.
This state indicates that the server has no information in its stable 6.1. Server Initialization
storage. A server in this state will attempt to refresh its stable
storage from the other server.
SYNC state: When a server starts it starts out in STARTUP state. See section 6.4
below for details.
In this state, the Secondary Server attempts to synchronize its 6.2. Establishing Communications Integrity
stable storage with the Primary Server. Both the Primary and Secon-
dary may have information that the other lacks.
8. Primary Server Operation Central to the operation of the Failover protocol is a notion of
"communications okay" or "communications failed". State transitions
are taken in many cases when the status of communications with the
partner changes.
This section discusses the operation of the primary server using the A specific discipline exists for establishing and verifying communi-
state transition diagram in Figure 8.2-1. cations integrity. Communications is set to "okay" whenever a mes-
sage sent is acked by the partner. After an implementation dependent
length of time from the communications "okay" event the communica-
tions with the partner are deemed to have "failed" if no subsequent
acknowledgments have been received. Whenever a DHCPPRPL, DHCPUP-
DATEDONE, DHCPPOOLRESP or DHCPBNDACK is received this time period is
restarted.
8.1. Primary Server Initialization Obviously, as the time period elapses, a server SHOULD send DHCPPOLL
messages in order to elicit a DHCPPRPL message in reply, which will
When the Primary Server starts, there are three possibilities: it DRAFT November 1998
has never started before and therefore has no record of any previous
state nor of any client binding information; it has started before
and has a record of a previous state and possibly of some client
binding information; it has started before, but failed catastrophi-
cally, and now has no record of any previous state (nor of any client
binding information).
When the Primary Server starts, if it has any record of a previous reset the time period.
state, then if that state was NORMAL or COMMUNICATION-INTERRUPTED it
moves to COMMUNICATION-INTERRUPTED state. If that state was
PARTNER-DOWN or POTENTIAL-CONFLICT, then it moves to PARTNER-DOWN
state. If that state was RECOVER, then the Primary Server moves into
the RECOVER state.
DRAFT January 1998 While an implementation SHOULD restart this time period on every
DHCPUPDATEDONE, DHCPPOOLRESP, DHCPBNDACK or DHCPPRPL, it MAY choose
to only restart it on a DHCPPRPL.
If it has no record of any previous state, then either this is an This technique ensures that two-way communications integrity exists
initial startup, or a recovery from a catastrophic failure where between the servers. Were the timeout period to be reset on the
stable storage and all client binding information was lost. These are receipt of any message from the partner, a network failure where one
distinguished by recovery from a catastrophic failure being indicated server could send but not receive messages to the partner could lead
by some external configuration indication to the Primary Server. to failure of the entire redundant DHCP subsystem. For example, in a
situation where the primary could send but not receive any messages,
the secondary would never take over from the primary and yet DHCP
clients would not receive any service.
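The acknowledgment-driven timer described in section 6.2 can be
sketched as follows. This is a minimal illustration, not part of the
protocol: the class name, method names, and the choice to treat
communications as "failed" before the first acknowledgment are all
assumptions made for the example.

```python
# Hypothetical sketch; CommsMonitor and its methods are not defined by the draft.
# Message types that count as acknowledgments from the partner (section 6.2).
ACK_TYPES = {"DHCPPRPL", "DHCPUPDATEDONE", "DHCPPOOLRESP", "DHCPBNDACK"}


class CommsMonitor:
    def __init__(self, timeout):
        self.timeout = timeout   # implementation-dependent period, in seconds
        self.last_ack = None     # time of the most recent acknowledgment

    def on_message(self, msg_type, now):
        # Only acknowledgments restart the period. A DHCPPOLL or DHCPBNDUPD
        # from the partner shows one-way connectivity only, so it is ignored.
        if msg_type in ACK_TYPES:
            self.last_ack = now

    def status(self, now):
        # Assumption: no acknowledgment yet is reported as "failed".
        if self.last_ack is not None and now - self.last_ack < self.timeout:
            return "okay"
        return "failed"
```

Because on_message() ignores everything except acknowledgments, a
server that can send but not receive never sees its partner's
timer-driven view stay at "okay", which is the failure case the text
above is guarding against.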
8.2. Primary Server State Transitions 6.3. Server State Transitions
Figure 8.2-1 is the diagram of the Primary Server's state transi- Figure 6.2-1 is the diagram of the server state transitions. The
tions. The remainder of this section contains information important remainder of this section contains information important to the
to the understanding of that diagram. understanding of that diagram.
The server stays in the current state until all of the actions speci- The server stays in the current state until all of the actions speci-
fied on the state transition are complete. If communications fails fied on the state transition are complete. If communications fails
during one of the actions, the server simply stays in the current during one of the actions, the server simply stays in the current
state and attempts a transition whenever the conditions for a transi- state and attempts a transition whenever the conditions for a transi-
tion are later fulfilled. tion are later fulfilled.
In the state transition diagram below, the "+" or "-" in the upper In the state transition diagram below, the "+" or "-" in the upper
right corner of each state is a notation about whether communication right corner of each state is a notation about whether communication
is ongoing with the Secondary Server. The legend "responsive" and is ongoing with the other server.
"unresponsive" in each state indicates whether the Primary Server is
responsive to DHCP client requests in the respective state.
In the state transition diagram below, when communication is The legend "responsive", "partially-responsive", or "unresponsive" in
reestablished between the Primary and Secondary Server, the Primary each state indicates whether the server is responsive to DHCP client
server must record the state of the Secondary Server when the commun- requests in the respective state. The terms "responsive" and
ication was reestablished. "unresponsive" have the obvious meanings, while "partially-
responsive" means that a DHCP server may respond to DHCPREQUEST mes-
sages that are RENEWAL or REBINDING, but to no other messages.
If the state of the Secondary Server changes while communicating, In the state transition diagram below, when communication is reesta-
then the Primary Server moves through the communications-failed tran- blished between the two servers, each must record the state of the
sition, and into whatever state results. It then immediately moves partner when communication was restored. State transitions on one
through whatever state transition is appropriate given the current server in some cases imply state transitions on the partner server,
state of the Secondary Server. so a record of the current state of the partner server must be kept
by each server.
If a message is received from a partner with the state equal to zero
(0), then the receiving server should respond to that message with a
DHCPPRPL if it was a DHCPPOLL, but under no circumstances should it
consider communications to be "okay", nor take any state transitions
based on receipt of that message.
If the state of the partner changes while communicating, a server
moves through the communications-failed transition and into whatever
state results. It then immediately moves through whatever state
transition is appropriate given the current state of the partner
server.
DISCUSSION: DISCUSSION:
The point of this technique is simplicity, both in explanation of The point of this technique is simplicity, both in explanation of
the protocol and in its implementation. The alternative to this the protocol and in its implementation. The alternative to this
technique of memory of partner state and automatic state transi- technique of memory of partner state and automatic state transi-
tion on change of partner state is to have every state in the fol- tion on change of partner state is to have every state in the fol-
lowing diagram have a state transition for every possible state of lowing diagram have a state transition for every possible state of
the partner. With the approach adopted, only the states in which the partner. With the approach adopted, only the states in which
communications are reestablished require a state transition for communications are reestablished require a state transition for
each possible partner state. each possible partner state.
All state transitions of the Primary Server must be recorded in its The current state of a server must be recorded in stable storage and
stable storage, and thus be available to the server after a server thus be available to the server after a server restart.
DRAFT January 1998 DRAFT November 1998
restart. +---------------+ V +--------------+
| RECOVER - | | | STARTUP - |
|(unresponsive) | +->|(unresponsive)|
+---------------+ +--------------+
Comm. OK +-----------------+
Other State:-RECOVER | PARTNER DOWN - |<-----+
| | | (responsive) | |
All POTENTIAL- +-----------------+ |
Others CONFLICT------------ | --------+ ^(see |
| Comm. OK | | 6.93) |
UPDATEREQ(ALL) Other State: | +-----+ |
Wait UPDATEDONE | | | Comm. | |
Wait MCLT from fail RECOVER All Others| Failed | |
+--------------+ | V V | | |
|RECOVER-DONE +| +--+ +--------------+ | |
|(unresponsive)| | | POTENTIAL + |<--+ |
+--------------+ Wait for +>| CONFLICT | |
Comm. OK Other | |(unresponsive)|<--- | --+
+--Other State:-+ State: | +--------------+ | |
| | | RECOVER | | | |
| All POTENT. DONE | Resolve Conflict | |
| Others: CONFLICT-- | ----+ (see 6.9) | |
| Wait for V V | |
| Other State: NORMAL +-----------------+ | |
| V | NORMAL + | External | |
| +--+----------+-->|(see 6.72, 6.73) |-Command-->+ |
| ^ ^ +-----------------+ | |
| | | | | |
| Wait for Comm. OK Comm. External |
| Other Other Failed Command |
| State: State: | or | |
|RECOVER-DONE NORMAL Start Safe Safe | |
| | COMM. INT. Period Timer Period | |
| Comm. OK. | V expiration |
| Other State: | +------------------+ | |
| RECOVER +--| COMMUNICATIONS - |-----------+ |
V +-------------| INTERRUPTED | Comm. OK |
RECOVER | (responsive) |--Other State:-+
RECOVER-DONE--------->+------------------+ All Others
Previous Primary State: Figure 6.2-1: Server state diagram.
NORMAL or RECOVER PARTNER DOWN DRAFT November 1998
COMMUNICATION <ext. cmd> POTENTIAL CONFLICT
INTERRUPTED | <none>
+---+ V |
| +----------------+ +-----------------+
| | - | | - |
| | RECOVER | | PARTNER DOWN |<-----+
| | (unresponsive) | | (responsive) | |
| +----------------+ +-----------------+ |
| | | | ^ |
| Comm. OK | Comm. OK | |
| Sec. State: | Sec. State: Comm. |
| | | V All Others Failed |
| | RECOVER +<---+ V | |
| All | | +-------------+ |
| Others | Comm. OK | POTENTIAL +| |
| | Note Sec. State: | CONFLICT | |
| | Poss. RECOVER |(responsive) |<---- | --+
| V Error NORMAL +-------------+ | |
| Sec->Pri | Pri->Sec | | |
| Sync | Sync. Resolve Conflict | |
| | | V V | |
| Wait MDLI | +-----------------+ | |
| from Fail. | | + | External | |
| V V | NORMAL |-Command-->+ |
| +-----++------>| (responsive) | | |
| ^ +-----------------+ | |
| | | | |
| Pri<->Sec Comm. External |
| Sync Failed Command |
| | | or |
| Comm. OK | "Safe Period" |
| Sec. State: V expiration |
| NORMAL +-----------------+ | |
| COMM. INT. | - |---------->+ |
| RECOVER------| COMMUNICATIONS | |
| | INTERRUPTED | Comm. OK |
+------------------>| (responsive) |--Sec. State:--+
+-----------------+ All Others
Figure 8.2-1: Primary Server state diagram. 6.4. STARTUP state
DRAFT January 1998 The STARTUP state affords an opportunity for a server to probe its
partner server, before starting to service DHCP clients.
8.3. Primary Server in PARTNER-DOWN state DISCUSSION:
When it is in PARTNER-DOWN state, the Primary Server operates largely Without the STARTUP state, a server would likely start in a state
as does a normal DHCP server, with none of the special algorithms derived from its previously stored state (held in stable storage),
described below. In PARTNER-DOWN state the Primary Server MUST if any. However, this may be inconsistent with the current state
respond to DHCP client requests. of the partner. The STARTUP state affords the opportunity for a
server to potentially learn the partner's state and determine if
that state is consistent with its derived starting state or
whether some significant state change has occurred at the partner
that forces the server to start in another state. This is
especially critical if significant time has elapsed while the
server was down.
Any available IP address tagged as belonging to the Secondary Server 6.4.1. Operation while in STARTUP state
(at entry to PARTNER-DOWN state) MUST NOT be used until the MDLI
beyond the entry into PARTNER-DOWN state has elapsed.
The Primary Server MUST NOT allocate an IP address to a DHCP client Whenever a server is in STARTUP state, it MUST be unresponsive to
different from that to which it was allocated at the entrance to DHCP client requests, and so the time spent in the STARTUP state is
PARTNER-DOWN state until the MDLI beyond the its expiration time has necessarily short, typically on the order of a few seconds to a few
elapsed. If this time would be earlier than the current time plus tens of seconds. The exact time spent in the STARTUP state is imple-
the MDLI, then the current time plus the MDLI is used. mentation dependent, and the primary and secondary server are not
required to spend the same amount of time in the STARTUP state.
Two options exist for lease times, with different ramifications flow- Whenever any message is sent to the partner while in STARTUP state
ing from each. the STARTUP bit MUST be set in the 'flags' field of the message
header.
If the Primary Server wishes the Failover Protocol to protect it from 6.4.2. Transition out of STARTUP state
loss of stable storage in any state, then it should ensure that the
MDLI based lease time restrictions in Section 6.1 are maintained,
even in PARTNER-DOWN state.
If the Primary Server wishes to forego the protection of the Failover Each server starts out in startup state every time it initializes
Protocol in the event of loss of stable storage, then it need recog- itself, and performs the following algorithm as part of its initiali-
nize no restrictions on actual client lease times while in PARTNER- zation:
DOWN state.
The Primary Server MUST poll the Secondary Server and attempt to 1. Ensure that the RESTART bit is set in the 'flags' field of the
establish communications and synchronization with it. failover message header. Once set, the RESTART bit must
remain set in all failover messages sent by the server to the
partner until the first acknowledgment of a message is
received from that partner. This is required to assure that
the partner knows that the server has restarted, even if the
partner itself is unreachable for a long while.
Once the Primary succeeds in contacting the Secondary Server, the DRAFT November 1998
Primary examines the state of the Secondary Server. If the state of
the Secondary Server is RECOVER or NORMAL, then both servers have Do not send any messages until step 5.
been running in such a way that duplicate IP address allocations were
inhibited. In this case, the Primary Server updates the Secondary 2. Is there any record in stable storage of a previous failover
Server with its client binding information, and moves into the NORMAL state? If yes, set previous-state to the last recorded state
in stable storage, and continue with step 3.
Is there any configuration information that indicates that
this server was previously running but lost its stable
storage? Such information must typically come from some
administrative intervention, since it is difficult for a
server to distinguish first startup from a startup after it
has lost its stable storage. If yes, then set the previous-
state to RECOVER, and set the time-of-failure to whatever time
was configured, and go on to step 3. This time-of-failure
will be used in the transition out of the RECOVER state into
the RECOVER-DONE state, below.
If there is no record of any previous failover state in stable
storage nor of any previous operational activity for this
server, then set the previous-state to RECOVER and set the
time-of-failure to a time more than the maximum-client-lead-time
before now. If using standard POSIX times, 0 would typically
do quite well.
3. Is the previous-state NORMAL? If yes, set the previous-state
to COMMUNICATIONS-INTERRUPTED.
4. Start the STARTUP state timer. The time that a server remains
in the STARTUP state (absent any communications with its
partner) is implementation dependent (and would typically be
configurable). It should be long enough to poll several times
and stand a good chance to receive a response to at least one
poll from a heavily loaded partner across a slow network.
5. Start sending DHCPPOLL messages (with both the RESTART and
STARTUP bits set in the 'flags' field).
6. Wait for "communications okay", i.e., the receipt of a
DHCPPRPL message.
When a DHCPPRPL message is received, clear the RESTART flag,
clear the STARTUP flag, and set the current state to the
previous-state.
If the partner is in PARTNER-DOWN state, and if its partner-
down time (received in the DHCPPRPL message in the Absolute
Time Option) is later than the last recorded time of operation
of this server, then set the current state to RECOVER.
Then, transition to the current state and take the "communica-
tions okay" state transition based on the current state of
this server and the partner.
7. If the startup time expires, take an implementation dependent
action: The server MAY go to the previous-state, or the
server MAY wait.
Reasons to go to previous-state and begin processing:
If the current server is the only operational server, then if
it waits, there will be no operational DHCP servers. This
situation could occur very easily where one server fails and
then the other crashes and reboots. If the rebooting server
doesn't start processing DHCP client requests without first
being in communication with the other server, then the level
of DHCP redundancy is not particularly high. This is an
appropriate approach if the possibility of partition is low,
or if the safe period expiration time is well beyond the time
at which an operator would notice and react to a partition
situation. It is also quite appropriate if the safe period
will never expire.
Reasons to wait:
If the current server has been down for longer than the
maximum-client-lead-time, and it is partitioned from the other
server, then when it returns it will attempt to use its own
available addresses to allocate to new DHCP clients, and the
other server may well be in PARTNER-DOWN state and may have
already allocated some of those available addresses to DHCP
clients. In cases where the possibility of partition is high,
and the safe period expiration time is less than the likely
operator reaction time, this is a good approach to use.
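Steps 2, 3 and 6 of the algorithm above reduce to two small decisions,
sketched here. The function names, parameter names, and the use of
None for "not known" are assumptions made for the illustration; times
are taken to be POSIX seconds.

```python
# Hypothetical sketch of the STARTUP decisions; names are not from the draft.

def derive_previous_state(stored_state, lost_storage_time=None):
    """Steps 2 and 3: return (previous_state, time_of_failure)."""
    if stored_state is not None:
        # A failover state was found in stable storage.
        previous, time_of_failure = stored_state, None
    elif lost_storage_time is not None:
        # Configuration says stable storage was lost at the given time.
        previous, time_of_failure = "RECOVER", lost_storage_time
    else:
        # First startup: POSIX time 0 is earlier than now minus the MCLT.
        previous, time_of_failure = "RECOVER", 0
    if previous == "NORMAL":
        previous = "COMMUNICATIONS-INTERRUPTED"
    return previous, time_of_failure


def state_after_prpl(previous, partner_state, partner_down_time,
                     last_operation_time):
    """Step 6: choose the current state once a DHCPPRPL is received."""
    if (partner_state == "PARTNER-DOWN"
            and partner_down_time is not None
            and last_operation_time is not None
            and partner_down_time > last_operation_time):
        # The partner entered PARTNER-DOWN after this server last ran.
        return "RECOVER"
    return previous
```

For example, a server whose stored state is NORMAL and whose partner
reports a PARTNER-DOWN time later than this server's last recorded
time of operation ends up in RECOVER rather than
COMMUNICATIONS-INTERRUPTED.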
6.5. PARTNER-DOWN state
PARTNER-DOWN state is a state either server can enter. When in this
state, the server does not assume that the other server could still
be operating and servicing a different set of clients, but instead
assumes that it is the only server operating. For this reason, only
one server should be operating in this state at a time.
6.5.1. Upon Entry to PARTNER-DOWN state
When entering PARTNER-DOWN state a server MUST record the time of
entry, and MUST transmit it in every DHCPPOLL or DHCPPRPL
message sent while in PARTNER-DOWN state.
6.5.2. Operation while in PARTNER-DOWN state
A server in PARTNER-DOWN state MUST respond to DHCP client requests.
It will allow renewal of all outstanding leases on IP addresses, and
will allocate IP addresses from its own pool, and after a fixed
period of time (the MCLT interval) has elapsed from entry into
PARTNER-DOWN state, it will allocate IP addresses from the set of all
available IP addresses.
Once a server has entered NORMAL state, the PARTNER-DOWN state is
entered only on command of an external agency (typically an adminis-
trator of some sort) or after the expiration of an externally config-
ured minimum safe-time after the beginning of COMMUNICATIONS-
INTERRUPTED state.
Any available IP address tagged as belonging to the other server (at
entry to PARTNER-DOWN state) MUST NOT be used until the maximum-
client-lead-time beyond the entry into PARTNER-DOWN state has
elapsed.
A server in PARTNER-DOWN state MUST NOT allocate an IP address to a
DHCP client different from that to which it was allocated at the
entrance to PARTNER-DOWN state until the maximum-client-lead-time
beyond its expiration time has elapsed. If this time would be
earlier than the current time plus the maximum-client-lead-time, then
the current time plus the maximum-client-lead-time is used.
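The two waiting rules above amount to simple time arithmetic, sketched
below. The function names are hypothetical, and the sketch assumes
that "the current time" in the second rule is evaluated once, at entry
into PARTNER-DOWN state; the text does not say so explicitly.

```python
# Hypothetical sketch; function names are not from the draft.
# All times are in seconds; mclt is the maximum-client-lead-time.

def partner_pool_usable_at(entry_time, mclt):
    """Earliest time an available address tagged as the partner's may be used."""
    return entry_time + mclt


def reallocatable_at(lease_expiration, entry_time, mclt):
    """Earliest time a leased address may be given to a different client.

    Assumption: 'current time' is read as the PARTNER-DOWN entry time.
    """
    return max(lease_expiration + mclt, entry_time + mclt)
```

With an entry time of 1000 and an MCLT of 3600, a lease that expires
at 2000 may be reallocated at 5600, while a lease that had already
expired at 500 must still wait until 4600.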
Two options exist for lease times given out while in PARTNER-DOWN
state, with different ramifications flowing from each.
If the server wishes the Failover protocol to protect it from loss of
stable storage in PARTNER-DOWN state, then it should ensure that the
MCLT based lease time restrictions in Section 5.1 are maintained,
even in PARTNER-DOWN state.
If the server wishes to forego the protection of the Failover proto-
col in the event of loss of stable storage, then it need recognize no
restrictions on actual client lease times while in PARTNER-DOWN
state. state.
Once contact has been established, if the state of the Secondary A server in PARTNER-DOWN state MUST poll its partner and attempt to
Server is anything other than RECOVER or NORMAL then the Primary establish communications and synchronization.
Server moves into the POTENTIAL-CONFLICT state.
8.4. Primary Server in RECOVER state While a server is in PARTNER-DOWN state, it MUST send the absolute
time of entry into PARTNER-DOWN using the absolute time option in
When Primary Server is initialized in the RECOVER state it expects to DRAFT November 1998
DRAFT January 1998 every DHCPPOLL and DHCPPRPL message sent.
refresh its stable storage from an existing Secondary Server. In 6.5.3. Transitions out of PARTNER-DOWN state
this state the Primary Server MUST NOT respond to DHCP client
requests.
When the Primary Server succeeds in contacting the Secondary Server, When a server in PARTNER-DOWN state succeeds in contacting its
if it determines that the Secondary Server is itself in the RECOVER partner, its actions are conditional on the state and flags received
state (which indicates that the Secondary Server has no existing in the message from the other server.
client binding information), the Primary Server will move directly
into NORMAL state after signaling some kind of an error (since some
person had to explicitly start the Primary Server in RECOVER state to
refresh its lost client binding information from the Secondary, and
the Secondary had no state).
If the Primary Server determines that the Secondary Server is in any If the STARTUP bit is set in the 'flags' field of a received DHCPPOLL
state other than RECOVER, then the Secondary Server has some client message, the server in PARTNER-DOWN state will send a DHCPPRPL mes-
binding information that the Primary Server needs before it moves sage with its current state (and with the absolute PARTNER-DOWN time
into the NORMAL state. The Primary Server will attempt to refresh in the DHCPPRPL). A server in PARTNER-DOWN state MUST NOT take any
its state from the Secondary Server, and it will remain in the state transitions based on reestablishing communications if the
RECOVER state until it is successful in doing so. STARTUP bit is set in the 'flags' field of the messages that reesta-
blished communications.
The Primary Server MUST remain in RECOVER state until a period of at If the STARTUP bit is not set in the 'flags' field then a server in
least the MDLI has passed since the Primary Server was known to have PARTNER-DOWN state will move into POTENTIAL-CONFLICT state if the
failed. This is to allow any IP addresses that were allocated by the other server is in the NORMAL, COMMUNICATIONS-INTERRUPTED, PARTNER-
Primary Server prior to loss of Primary Server client binding infor- DOWN, or POTENTIAL-CONFLICT state.
mation in stable storage to contact the Secondary Server or to time
out. If the STARTUP bit is not set in the 'flags' field, then a server in
PARTNER-DOWN state will stay in PARTNER-DOWN state if it detects that
the other server is in RECOVER state.
If the STARTUP bit is not set in the 'flags' field, then a server in
PARTNER-DOWN state moves into NORMAL state if it detects that the
other server is in RECOVER-DONE state.
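The transitions out of PARTNER-DOWN state listed in section 6.5.3 can
be summarized as a single lookup, sketched here. The function name is
hypothetical, and raising an error for a partner state that the text
does not mention is an assumption of the sketch.

```python
# Hypothetical sketch; partner_down_next_state is not defined by the draft.
CONFLICT_STATES = {
    "NORMAL",
    "COMMUNICATIONS-INTERRUPTED",
    "PARTNER-DOWN",
    "POTENTIAL-CONFLICT",
}


def partner_down_next_state(partner_state, startup_bit):
    """Next state for a server in PARTNER-DOWN once contact is restored."""
    if startup_bit:
        # No transition while the partner is still probing in STARTUP.
        return "PARTNER-DOWN"
    if partner_state in CONFLICT_STATES:
        return "POTENTIAL-CONFLICT"
    if partner_state == "RECOVER":
        return "PARTNER-DOWN"
    if partner_state == "RECOVER-DONE":
        return "NORMAL"
    # Assumption: partner states not covered by the text are rejected.
    raise ValueError("unexpected partner state: " + partner_state)
```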
6.6. RECOVER state
This state indicates that the server has no information in its stable
storage or that it is re-integrating with a server in PARTNER-DOWN
state after it has been down. A server in this state will attempt to
refresh its stable storage from the other server.
6.6.1. Operation in RECOVER state
A server in RECOVER state MUST NOT respond to DHCP client requests.
A server in RECOVER state will attempt to reestablish communications
with the other server.
6.6.2. Transitions out of RECOVER state
If the other server is in POTENTIAL-CONFLICT state when communica-
tions are reestablished, then the server in RECOVER state will move
to POTENTIAL-CONFLICT state itself.
If the other server is in RECOVER state, then this server SHOULD sig-
nal an error and halt processing.
If the other server is in any other state, then the server in RECOVER
state will request an update of missing binding information. If the
server has been configured to indicate that it has lost its stable
storage, it will send an UPDATEREQALL message; otherwise it will send
an UPDATEREQ message.
It will wait for an UPDATEDONE message, and upon receipt of that
message it will start a timer whose expiration is set to a time equal
to the time the server went down (if known) or the current time (if
the down-time is unknown) plus the maximum-client-lead-time. When
this timer goes off, the server will go into RECOVER-DONE state.
This is to allow any IP addresses that were allocated by this server
prior to loss of its client binding information in stable storage to
contact the other server or to time out.
See Figure 6.6-1.
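The expiration of the RECOVER timer described above can be sketched as
follows. The function names and the use of None for an unknown
down-time are assumptions made for the illustration; times are in
seconds.

```python
# Hypothetical sketch; function names are not from the draft.

def recover_done_time(now, mclt, down_time=None):
    """Absolute time at which the server may enter RECOVER-DONE.

    now is the time the UPDATEDONE message was received; down_time is the
    time the server went down, or None if that is unknown.
    """
    start = down_time if down_time is not None else now
    return start + mclt


def remaining_wait(now, mclt, down_time=None):
    """Seconds still to wait; zero if the MCLT has already elapsed."""
    return max(0, recover_done_time(now, mclt, down_time) - now)
```

A server that knows it went down at time 8000 and receives UPDATEDONE
at time 10000 with an MCLT of 3600 waits only 1600 more seconds; with
an unknown down-time it waits the full 3600.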
DISCUSSION: DISCUSSION:
The actual requirement on this wait period in RECOVER is that it The actual requirement on this wait period in RECOVER is that it
start when the Primary Server went down, not necessarily when it start when the recovering server went down, not necessarily when
came back up. If the time when the Primary Server failed is it came back up. If the time when the recovering server failed is
known, then it could be communicated to the recovering server, and known, then it could be communicated to the recovering server, and
the wait period could be reduced to the MDLI less the difference the wait period could be reduced to the maximum-client-lead-time
between the current time and the time the server failed. In this less the difference between the current time and the time the
way, the waiting period could be minimized. server failed. In this way, the waiting period could be minimized.
8.5. Primary Server in NORMAL state If an UPDATEDONE message isn't received within an implementation
dependent amount of time, and no DHCPBNDUPD messages are being
received, then the UPDATEREQ(ALL) message will be re-transmitted.
When in NORMAL state, the Primary Server takes the following actions DRAFT November 1998
to implement the Safe Failover Protocol:
o Lease Time Calculations A B
Server Server
As discussed in Section 6.1, "Control of lease time", the | |
lease interval given to a DHCP client can never be more than RECOVER PARTNER-DOWN
the maximum delta lease interval greater than the acknowledged | |
| >--DHCPUPDATEREQ-------------> |
| |
| <-----------------DHCPBNDUPD--< |
| >--DHCPBNDACK----------------> |
... ...
| |
| <-----------------DHCPBNDUPD--< |
| >--DHCPBNDACK----------------> |
| |
| <-------------DHCPUPDATEDONE--< |
| |
Wait MCLT from last known |
time of operation |
| |
RECOVER-DONE |
| |
| >--DHCPPOLL-(RECOVER-DONE)---> |
| <-------------------DHCPPRPL--< |
| |
| NORMAL
| |
| <----------(NORMAL)-DHCPPOLL--< |
| >--DHCPPRPL------------------> |
| |
NORMAL |
| |
| |
DRAFT January 1998 Figure 6.6-1: Transition out of RECOVER state
Secondary Server lease interval. DRAFT November 1998
As long as the Primary Server adheres to this constraint, the 6.7. NORMAL state
NORMAL state is the state used by a server when it can communicate
with the other server. When in this state, the primary responds to
all DHCP client requests, while the secondary responds only to the
renewal or rebinding requests which it receives. This is one of the
few states where the operation of the primary and secondary servers
is quite different.
6.7.1. Upon Entry to NORMAL state
When entering NORMAL state, a server will send to the other server
all currently unacknowledged DHCPBNDUPD messages.
When the above process is complete, if the server entering NORMAL
state is a secondary server, then it will request IP addresses
for allocation using the DHCPPOOLREQ message and the techniques
described in section 2.5.
6.7.2. Operation in NORMAL state: Primary Server
When in NORMAL state, the primary server takes the following actions
to implement the Failover protocol:
o Lease Time Calculations
As discussed in section 5.1, "Control of lease time", the lease
interval given to a DHCP client can never be more than the
maximum-client-lead-time greater than the acknowledged partner-
server-lease-interval.
As long as the primary server adheres to this constraint, the
specifics of the lease intervals that it gives to either the specifics of the lease intervals that it gives to either the
DHCP client or the Secondary DHCP server are implementation DHCP client or the secondary DHCP server are implementation
dependent. One possible approach is shown in Section 6.1, but dependent. One possible approach is shown in section 5.1, but
that particular approach is in no way required by this proto- that particular approach is in no way required by this protocol.
col.
o Lazy Update of Secondary Server o Lazy Update of Secondary Server
After an ACK of an IP address binding, the Primary Server After an ACK of an IP address binding, the primary server
attempts to update the Secondary with the binding information. attempts to update the secondary with the binding information.
The lease time used in the update of the Secondary MUST be at The lease time used in the update of the secondary MUST be at
least that given to the DHCP client in the DHCPACK. It MAY, least that given to the DHCP client in the DHCPACK. It MAY,
however, be longer. however, be longer.
DRAFT November 1998
o Reallocation of IP Addresses Between Clients o Reallocation of IP Addresses Between Clients
Whenever a client binding is released, a DHCPBNDUPD message Whenever a client binding is released, a DHCPBNDUPD message must
must be sent to the Secondary Server, setting the binding be sent to the secondary server, setting the binding state to
state to RELEASED. However, until a DHCPBNDACK is received for RELEASED. However, until a DHCPBNDACK is received for this mes-
this message, the IP address cannot be allocated to another sage, the IP address cannot be allocated to another client. It
client. can be allocated to the same client again.
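The reallocation rule above can be sketched as a simple check. This is only an illustration: the ReleasedBinding record and the may_allocate function are hypothetical names, not part of the protocol, and the sketch assumes the server tracks whether the DHCPBNDACK for the RELEASED update has arrived.

```python
from dataclasses import dataclass


@dataclass
class ReleasedBinding:
    """Hypothetical record kept by the primary for a released address."""
    ip: str
    last_client: str
    release_acked: bool = False  # True once the partner's DHCPBNDACK arrives


def may_allocate(binding: ReleasedBinding, client_id: str) -> bool:
    """Decide whether a released address may be given to client_id."""
    # The client that previously held the address may get it again at any time.
    if client_id == binding.last_client:
        return True
    # Any other client has to wait until the RELEASED update is acknowledged.
    return binding.release_acked
```

With this shape, the only state the server needs per released address is the identity of the last holder and a single acknowledgment flag.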
8.6. Primary Server in COMMUNICATION-INTERRUPTED Mode 6.7.3. Operation in NORMAL state: Secondary Server
When in COMMUNICATION-INTERRUPTED state the Primary Server operates In normal state, the secondary server receives binding updates from
in such a way that correct operation is ensured even if the Secondary the primary server in DHCPBNDUPD messages. It records these in its
Server is still up and operational, but unable to communicate to the client binding database in stable storage and then sends the
Secondary Server. When communications are reestablished between the corresponding DHCPBNDACK message to the primary server. It MUST
Primary and Secondary Servers, if both are still in COMMUNICATION- ensure that the information is recorded in stable storage prior to
INTERRUPTED state, then the re-integration of their operation will sending the DHCPBNDACK message back to the primary server.
proceed automatically and without human intervention. The protocol
is designed to ensure that reintegration will proceed in an error
free manner and that no actions taken by either server while in
COMMUNICATION-INTERRUPTED state will cause problems during reintegra-
tion.
The Primary Server operates in COMMUNICATION-INTERRUPTED state as it While in NORMAL state, the secondary server MUST also acquire a
does in NORMAL state. series of IP addresses from the primary server to be used to satisfy
DHCPDISCOVER requests from DHCP clients when in COMMUNICATIONS-
INTERRUPTED state. See section 2.5 for details of this acquisition
process.
However, since it cannot communicate with the Secondary in this The secondary server periodically polls the primary server with the
state, the acknowledged-Secondary-lease-time will not be updated in DHCPPOLL message. If it fails to receive a DHCPPRPL message in reply
any new bindings. This is likely to eventually cause the actual- after a configured number of retries or some administratively deter-
client-lease-times to be the current-time plus the MDLI (unless this mined time, the secondary server transitions into COMMUNICATIONS-
is greater than the desired-client-lease-time). INTERRUPTED state. Both the DHCPPOLL and DHCPPRPL messages carry the
current state of the sender.
DRAFT January 1998 When in normal state, a secondary server is responsive to DHCP client
requests if they are RENEWAL or REBINDING. Any changes it makes to
any leases based on these responses should be sent to the primary
server using DHCPBNDUPD messages.
The Primary Server can simply queue updates to the Secondary on com- 6.7.4. Transitions out of NORMAL state
munication interruption and stay in the NORMAL state. If, at the time
communication with the Secondary is reestablished, the Secondary
remains in the NORMAL state as well, then the queued updates for the
Secondary will simply be processed.
COMMUNICATION-INTERRUPTED state for the Primary Server is a signal If an external command is received by a server in NORMAL state
that it has stopped queuing updates to the Secondary, and is able to informing it that its partner is down, then transition into PARTNER-
respond to a variety of possible Secondary states. DOWN state.
If a server in NORMAL state fails to receive acks to any messages
sent to its partner for an implementation dependent period of time,
it will move into COMMUNICATIONS-INTERRUPTED state. (See section
6.2).
If a server in NORMAL state receives any messages from its partner
where the partner has changed state from that expected by the server
in NORMAL state, then the server should transition into
COMMUNICATIONS-INTERRUPTED state and take the appropriate state tran-
sition from there. For example, it would be expected for the partner
to transition from POTENTIAL-CONFLICT into NORMAL state, but not for
the partner to transition from NORMAL into POTENTIAL-CONFLICT state.
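The transitions out of NORMAL state described above can be summarized in a short sketch. The State and Event names and the leave_normal function are assumptions made for illustration; the draft defines the states but no such event names or programming interface.

```python
from enum import Enum, auto


class State(Enum):
    NORMAL = auto()
    COMMUNICATIONS_INTERRUPTED = auto()
    PARTNER_DOWN = auto()


class Event(Enum):
    # Hypothetical event names; not defined by the protocol.
    PARTNER_DOWN_COMMAND = auto()      # external command: partner is down
    ACK_TIMEOUT = auto()               # no acks for an implementation dependent period
    UNEXPECTED_PARTNER_STATE = auto()  # partner reported a state not expected in NORMAL


def leave_normal(event: Event) -> State:
    """Return the state a server in NORMAL state moves to for an event."""
    if event is Event.PARTNER_DOWN_COMMAND:
        return State.PARTNER_DOWN
    # Both remaining events lead to COMMUNICATIONS-INTERRUPTED; after an
    # unexpected partner state, further transitions are taken from there.
    return State.COMMUNICATIONS_INTERRUPTED
```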
6.8. COMMUNICATIONS-INTERRUPTED State
A server goes into COMMUNICATIONS-INTERRUPTED state whenever it is
unable to communicate with the other server. Primary and secondary
servers cycle automatically (without administrative intervention)
between NORMAL and COMMUNICATIONS-INTERRUPTED state as the network
connection between them fails and recovers, or as the partner server
cycles between operational and non-operational. No duplicate IP
address allocation can occur while the servers cycle between these
states.
6.8.1. Upon Entry to COMMUNICATIONS-INTERRUPTED state
When a server enters COMMUNICATIONS-INTERRUPTED state, if it has been
configured to support an automatic transition out of COMMUNICATIONS-
INTERRUPTED state and into PARTNER-DOWN state, then a timer MUST be
started for an implementation dependent period.
It is anticipated that some alarm condition would be raised upon the It is anticipated that some alarm condition would be raised upon the
transition from NORMAL state to COMMUNICATION-INTERRUPTED state. Once transition from NORMAL state to COMMUNICATIONS-INTERRUPTED state.
the Primary Server has been in COMMUNICATION-INTERRUPTED state for a
period equal to the safe-period, then it can (if configured to do so)
transition into the PARTNER-DOWN state. An external command may also
force a transition to PARTNER-DOWN state.
9. Secondary Server Operation 6.8.2. Operation in COMMUNICATIONS-INTERRUPTED State
The Secondary Server responds to DHCP client requests only in the In this state a server may respond to DHCP client requests. When
PARTNER-DOWN and COMMUNICATION-INTERRUPTED states. allocating new IP addresses, each server allocates from its own IP
address pool. When responding to renewal requests, each server will
allow continued renewal of a DHCP client's current lease on an IP
address, although the renewal period MUST NOT exceed the maximum
client lead time (MCLT) beyond the lease time already acknowledged by
the other server.
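As an illustration of the renewal limit, the following sketch computes the latest lease expiration a server could grant. The function and parameter names are hypothetical, all times are absolute timestamps in seconds, and the sketch assumes that a partner-acknowledged time that is already in the past is treated as the current time.

```python
def actual_client_lease_expiry(now: float,
                               desired_expiry: float,
                               acked_partner_expiry: float,
                               mclt: float) -> float:
    """Return the lease expiration time that may be given to a client.

    The lease may run at most `mclt` (maximum client lead time) seconds
    beyond the lease time already acknowledged by the other server.
    """
    # Assumption: an acknowledged time in the past counts as "now".
    base = max(now, acked_partner_expiry)
    limit = base + mclt
    # Never grant more than the client would have received anyway.
    return min(desired_expiry, limit)
```

For example, with a one-hour MCLT and a partner acknowledgment that expires in ten minutes, a client asking for a one-day lease would receive one that expires in 70 minutes.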
9.1. Secondary Server Initialization A server operates in COMMUNICATIONS-INTERRUPTED state as the primary
server does in NORMAL state.
When the Secondary Server starts, there are three possibilities: it However, since the server cannot communicate with its partner in this
has never started before and therefore has no record of any previous state, the acknowledged-partner-lease-time will not be updated in any
state nor of any client binding information; it has started before new bindings. This is likely to eventually cause the actual-client-
and has a record of a previous state and possibly of some client lease-times to be the current-time plus the maximum-client-lead-time
binding information; it has started before, but failed catastrophi-
cally, and now has no record of any previous state (nor of any client
binding information).
When the Secondary Server starts, if it has any record of a previous DRAFT November 1998
state, then if that state was NORMAL, COMMUNICATION-INTERRUPTED, or
SYNC, it moves to COMMUNICATION-INTERRUPTED state. If that state was
PARTNER-DOWN or POTENTIAL-CONFLICT, then it moves to PARTNER-DOWN
state. In all other cases (both other previous states and the cases
where there is no record of a previous state), the Secondary Server
moves into the RECOVER state.
9.2. Secondary Server State Transitions (unless this is greater than the desired-client-lease-time).
The server stays in the current state until all of the actions speci- 6.8.3. Transition out of COMMUNICATIONS-INTERRUPTED State
fied on the state transition are complete. If communications fails
during one of the actions, the server simply stays in the current
state and attempts a transition whenever the conditions for a
DRAFT January 1998 If the safe period timer expires while a server is in the
COMMUNICATIONS-INTERRUPTED state, it will go immediately into
PARTNER-DOWN state.
transition are later fulfilled. If an external command is received by a server in COMMUNICATIONS-
INTERRUPTED state informing it that its partner is down, it will go
immediately into PARTNER-DOWN state.
In the state transition diagram below, the "+" or "-" in the upper If communications is restored with the other server, then the server
right corner of each state is a notation about whether communication in COMMUNICATIONS-INTERRUPTED state will go into another state based
is ongoing with the Primary Server. The legend "responsive" and on the state of the partner:
"unresponsive" in each state indicates whether the Secondary Server
is responsive to DHCP client requests in the respective state.
In the state transition diagram below, when communication is reesta- o partner in NORMAL or COMMUNICATIONS-INTERRUPTED
blished between the Secondary and Primary Server, the Secondary
Server must record the state of the Primary Server when the communi-
cations was reestablished. If the state of the Primary Server changes
while communicating, then the Secondary Server moves through the
communications-interrupted transition, and into whatever state
results. At that time, it then immediately moves through whatever
state transition is appropriate for the current state of the Primary
Server.
All state transitions of the Secondary Server must be recorded in its The server will transition into the NORMAL state.
stable storage, and thus be available to the server after a server
restart.
DRAFT January 1998 o partner in RECOVER
Previous Secondary State: Stay in COMMUNICATIONS-INTERRUPTED state.
NORMAL RECOVER PARTNER DOWN o partner in RECOVER-DONE
COMM. INT. <none> POTENTIAL CONFLICT
SYNC | |
+---+ V V
| +----------------+ +-----------------+
| | RECOVER - | | PARTNER DOWN - |<-----+
| | (unresponsive) | | (responsive) | |
| +----------------+ +-----------------+ |
| | | | ^ |
| Comm. OK | Comm. OK | |
| Pri. State: | Pri. State: Comm. |
| | | V All Others Failed |
| | RECOVER +<---+ V | |
| | | | +--------------+ |
| | | Comm. OK | POTENTIAL + | |
| All | Pri. State: | CONFLICT | |
| Others | RECOVER |(unresponsive)|<--- | --+
| | Note | +--------------+ | |
| | Poss. Sec->Pri | | |
| V Error Sync. Resolve Conflict | |
| Pri->Sec | V V | |
| Sync | +-----------------+ | |
| V V | NORMAL + |-External->+ |
| +-----++------>| (unresponsive) | Command | |
| ^ +-----------------+ | |
| Pri<->Sec | ^ | |
| Sync | Start Alloc Timer | |
| | | Sec->Pri | |
| +--------------+ | Sync | |
| | + |--->+ | External |
| | SYNC | Comm. Comm. OK Command |
| | unresponsive | Failed Pri. State: or |
| +--------------+ | RECOVER "Safe Period" |
| ^ V | expiration |
| | +------------------+ | |
| Comm. OK | COMMUNICATIONS - |---------->+ |
| Pri. State: | INTERRUPTED | Comm. OK |
| NORMAL-----| (responsive) |--Pri. State:--+
| COMM. INT. +------------------+ All Others
| ^
+---------------------+
Figure 9.2-1: Secondary Server State Diagram. Transition into NORMAL state.
DRAFT January 1998 o partner in PARTNER-DOWN or POTENTIAL-CONFLICT
9.3. Secondary Server in RECOVER state Transition into POTENTIAL-CONFLICT state.
The Secondary DHCP server comes up in the RECOVER state when it has o partner in PAUSED
no record of any previous state (or that previous state was RECOVER).
It stays in this state until it establishes communication with the Stay in COMMUNICATIONS-INTERRUPTED state.
Primary Server, and is unresponsive to DHCP client requests in this
state. Essentially it is idle until it can contact the Primary
Server.
When it establishes communication with the Primary Server, it o partner in SHUTDOWN
attempts to load its client binding database from that of the Primary
Server using the techniques specified in section 6.
Once the Secondary Server's client binding database is refreshed from Transition into PARTNER-DOWN state.
that of the Primary, the Secondary Server moves into NORMAL state.
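The partner-state cases listed for leaving COMMUNICATIONS-INTERRUPTED state once communications is restored can be written as a lookup table. This is a sketch only: the table and function names are hypothetical, and the state names are plain strings chosen to match the text.

```python
CI = "COMMUNICATIONS-INTERRUPTED"

# Partner state observed when communications is restored -> this server's
# next state. Entries that map to CI mean "stay in the current state".
NEXT_STATE_ON_RECONNECT = {
    "NORMAL": "NORMAL",
    CI: "NORMAL",
    "RECOVER": CI,
    "RECOVER-DONE": "NORMAL",
    "PARTNER-DOWN": "POTENTIAL-CONFLICT",
    "POTENTIAL-CONFLICT": "POTENTIAL-CONFLICT",
    "PAUSED": CI,
    "SHUTDOWN": "PARTNER-DOWN",
}


def next_state_on_reconnect(partner_state: str) -> str:
    """Look up the next state for a server in COMMUNICATIONS-INTERRUPTED."""
    return NEXT_STATE_ON_RECONNECT[partner_state]
```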
9.4. Secondary Server in NORMAL state DRAFT November 1998
In normal state, the Secondary Server receives state updates from the Primary Secondary
Primary Server in DHCPBNDUPD messages. It records these in its Server Server
client binding database in stable storage and then sends the
corresponding DHCPBNDACK message to the Primary Server.
While in NORMAL state, the Secondary Server MUST also acquire a NORMAL NORMAL
series of IP addresses from the Primary Server to be used to satisfy | >--DHCPPOLL----->: |
DHCPDISCOVER requests from DHCP clients when in COMMUNICATION- INTER- | :<--------DHCPPOLL--< |
RUPTED state. See Section 2.2.2 for details of this acquisition pro- | : |
cess. COMMUNICATIONS : COMMUNICATIONS
INTERRUPTED : INTERRUPTED
| : |
| >--DHCPPOLL------------------> |
| <-------------------DHCPPRPL--< |
NORMAL |
| |
| >--DHCPBNDUPD----------------> |
| <-----------------DHCPBNDACK--< |
| |
| <-------------------DHCPPOLL--< |
| >--DHCPPRPL------------------> |
| NORMAL
| |
| <-----------------DHCPBNDUPD--< |
| >--DHCPBNDACK----------------> |
... ...
| |
| <----------------DHCPPOOLREQ--< |
| >--DHCPPOOLRESP-(2)----------> |
| |
| >--DHCPBNDUPD-(#1)-----------> |
| <-----------------DHCPBNDACK--< |
| |
| <----------------DHCPPOOLREQ--< |
| >--DHCPPOOLRESP-(0)----------> |
| |
| >--DHCPBNDUPD-(#2)-----------> |
| <-----------------DHCPBNDACK--< |
| |
The Secondary Server periodically polls the Primary Server with the Figure 6.8-1: Transition from NORMAL to COMMUNICATIONS-
DHCPPOLL message. If it fails to receive a DHCPPRPL message in reply INTERRUPTED and back (example with 2
after a configured number of retries or some administratively deter- addresses allocated to secondary)
mined time, the Secondary Server transitions into COMMUNICATION-
INTERRUPTED state. Both the DHCPPOLL and DHCPPRPL messages carry the
current status of the sender.
If an external command is received by the Secondary Server, it can DRAFT November 1998
move from NORMAL to PARTNER-DOWN state directly. Such a command
might be sent when the Primary Server was removed from service, and an
operator wanted the Secondary Server to take over immediately and
completely from the Primary Server. (Note that the Secondary Server
takes over from the Primary Server when in COMMUNICATION-INTERRUPTED
state, but less completely than in PARTNER-DOWN state.)
DRAFT January 1998 6.9. POTENTIAL-CONFLICT state
9.5. Secondary Server in COMMUNICATION-INTERRUPTED state This state indicates that the two servers are attempting to re-
integrate with each other, but at least one of them was running in a
state that did not guarantee automatic reintegration would be
possible. In POTENTIAL-CONFLICT state the servers may determine that
the same IP address has been offered and accepted by two different
DHCP clients.
When in COMMUNICATION-INTERRUPTED state the Secondary Server operates It is a goal of this protocol to minimize the possibility that
in such a way that correct operation is ensured even if the Primary POTENTIAL-CONFLICT state is ever entered.
Server is still up and operational, but unable to communicate to the
Secondary Server. When communications are reestablished between the
Primary and Secondary Servers, if both are still in COMMUNICATION-
INTERRUPTED state, then the re-integration of their operation will
proceed automatically and without human intervention. The protocol
is designed to ensure that reintegration will proceed in an error
free manner and that no actions taken by either server while in
COMMUNICATION-INTERRUPTED state will cause any conflicts to occur
during re-integration.
In COMMUNICATION-INTERRUPTED state, the Secondary Server responds to 6.9.1. Upon Entry to POTENTIAL-CONFLICT
DHCP client requests.
When processing a DHCPREQUEST from a DHCP client, the Secondary When a primary server enters POTENTIAL-CONFLICT state it should
Server MUST ensure that the client-lease-time is never more than the request that the secondary send it all updates of which it is
maximum-delta-lease-interval from the current-time, independent of currently unaware by sending an UPDATEREQ message to the secondary
the desired-client-lease-time. server.
When processing a DHCPRELEASE request from a DHCP client or the A secondary server entering POTENTIAL-CONFLICT state will wait for
expiration of a lease, the Secondary Server must not reallocate the the primary to send it an UPDATEREQ message.
IP address to a different client. If the same client subsequently
performs a DHCPDISCOVER request, the Secondary Server SHOULD offer it
the previously used IP address.
When processing a DHCPDISCOVER request from a DHCP client, the secon- 6.9.2. Operation in POTENTIAL-CONFLICT state
dary MUST allocate IP addresses from the list of IP addresses that it
acquired from the Primary Server in RECOVER state. When it exhausts
this list, it MUST stop responding to DHCPDISCOVER requests (except
those it can satisfy by offering expired or released IP addresses to
their previously bound clients).
The Secondary Server MUST continue to send DHCPPOLL messages to the Any server in POTENTIAL-CONFLICT state MUST be unresponsive to incom-
Primary Server when in COMMUNICATION-INTERRUPTED state. If it ing DHCP requests.
receives a DHCPPRPL message in reply, the Secondary Server determines
the state of the Primary Server. If the Primary Server is in NORMAL
or COMMUNICATION-INTERRUPTED state, then the Secondary Server moves
into the SYNC state.
If, however, the Primary Server is in RECOVER state, then the Secon- 6.9.3. Transitions out of POTENTIAL-CONFLICT state
dary Server updates the Primary Server with its known client binding
information, and moves into NORMAL state upon completion of that
update.
If instructed to by an outside agency (e.g., an administrator), the If communications fails with the partner while in POTENTIAL-CONFLICT
state, then a primary server will transition to PARTNER-DOWN state
and a secondary server will stay in POTENTIAL-CONFLICT state.
DRAFT January 1998 Whenever either server receives an UPDATEDONE message from its
partner, it MUST transition to NORMAL state. This will cause the
primary server to leave POTENTIAL-CONFLICT state prior to the secon-
dary, since the primary sends an UPDATEREQ message and receives an
UPDATEDONE before the secondary sends an UPDATEREQ message and
receives its UPDATEDONE message.
Secondary Server SHOULD move into PARTNER-DOWN state. Once the When a secondary server receives an indication that the primary
Secondary Server has been in COMMUNICATION-INTERRUPTED state for a server has transitioned from POTENTIAL-CONFLICT to NORMAL state, it
period equal to the safe-period, then it may (if configured to do so) SHOULD send an UPDATEREQ message to the primary server.
transition into the PARTNER-DOWN state in the absence of an external
command.
9.6. Secondary Server in SYNCH state DRAFT November 1998
The Secondary Server does not respond to DHCP client requests when in Primary Secondary
SYNCH state. Server Server
DISCUSSION: | |
POTENTIAL-CONFLICT POTENTIAL-CONFLICT
| |
| >--DHCPUPDATEREQ-------------> |
| |
| <-----------------DHCPBNDUPD--< |
| >--DHCPBNDACK----------------> |
... ...
| |
| <-----------------DHCPBNDUPD--< |
| >--DHCPBNDACK----------------> |
| |
| <-------------DHCPUPDATEDONE--< |
NORMAL |
| >--DHCPPOLL--(NORMAL) -------> |
| <-------------------DHCPPRPL--< |
| |
| <--------------DHCPUPDATEREQ--< |
| |
| >--DHCPBNDUPD----------------> |
| <-----------------DHCPBNDACK--< |
... ...
| |
| >--DHCPBNDUPD----------------> |
| <-----------------DHCPBNDACK--< |
| |
| >--DHCPUPDATEDONE------------> |
| |
| NORMAL
| |
| <----------------DHCPPOOLREQ--< |
| >--DHCPPOOLRESP--------------> |
| |
This is the entire reason for this state's existence, otherwise the Figure 6.9-1: Transition out of POTENTIAL-CONFLICT
activities specified for this state could happen as part of a
state transition from the COMMUNICATION-INTERRUPTED state to the
NORMAL state. However, in the COMMUNICATION-INTERRUPTED state the
Secondary Server responds to DHCP client requests. Having the
Secondary Server respond to DHCP client requests during the syn-
chronization process (and thus taking actions requiring further
synchronization) seemed like a bad idea.
The Secondary Server synchronizes its information with the Primary DRAFT November 1998
Server while in SYNCH state. Both Primary and Secondary Servers may
have information the other lacks because of operations performed
while communications were interrupted.
During the synchronization process, the Secondary Server continues to 6.10. RECOVER-DONE state
poll the Primary Server with DHCPPOLL messages. If it fails to
receive a reply, it moves back into COMMUNICATION-INTERRUPTED state.
When synchronization is complete, the Secondary Server moves into This state exists to allow an interlocked transition for one server
NORMAL state. from RECOVER state and another server from PARTNER-DOWN or
COMMUNICATIONS-INTERRUPTED state into NORMAL state.
9.7. Secondary Server in PARTNER-DOWN state 6.10.1. Operation in RECOVER-DONE state
The Secondary Server responds to DHCP client requests when in A server in RECOVER-DONE state is responsive only to RENEWAL and
PARTNER-DOWN state. REBINDING DHCP messages.
Any available IP address which does not belong to the private pool 6.10.2. Transitions out of RECOVER-DONE state
established by the Secondary Server (at entry to PARTNER-DOWN state)
MUST NOT be used until the MDLI beyond the entry into PARTNER-DOWN
state has elapsed.
The Secondary Server MUST NOT allocate an IP address to a DHCP client When a server in RECOVER-DONE state determines that its partner
different from that to which it was allocated at the entrance to server has entered NORMAL state, then it will transition into NORMAL
state as well.
DRAFT January 1998 6.11. PAUSED state
PARTNER-DOWN state until the MDLI beyond its expiration time has This state exists to allow one server to inform another that it will
elapsed. If this time would be earlier than the current time plus the be out of service for what is predicted to be a relatively short
MDLI, then the current time plus the MDLI is used. time, and to allow the other server to transition to COMMUNICATIONS-
INTERRUPTED state immediately and (if it is a secondary server) to
begin servicing clients with no interruption.
Two options exist for lease times, with different ramifications flow- A server which is aware that it is shutting down temporarily SHOULD
ing from each. send one or more DHCPPOLL messages with the 'state' field containing
PAUSED.
If the Secondary Server wishes the Failover Protocol to protect it While a server may or may not transition internally into PAUSED
from loss of stable storage in any state, then it should ensure that state, the 'previous' state determined when it is restarted MUST be
the MDLI based lease time restrictions in Section 6.1 are maintained, the state the server was in prior to receiving the command to shut-
even in PARTNER-DOWN state. down and restart and its entry into the PAUSED state.
If the Secondary Server wishes to forego the protection of the safe 6.11.1. Upon entry to PAUSED state
Failover Protocol in the event of loss of stable storage, then it MAY
recognize no restrictions on actual client lease times while in
PARTNER-DOWN state.
The Secondary Server continues to poll the Primary Server with When entering PAUSED state, the server MUST remember the previous
DHCPPOLL messages. If the Secondary Server receives a reply, and the state, and use that state as the previous state when it is restarted.
Primary Server is in the RECOVER state, the Secondary Server updates
the Primary Server with all of the Secondary's client binding infor-
mation, and then moves into the NORMAL state.
If communications with the Primary Server are reestablished, and the 6.11.2. Transitions out of PAUSED state
Primary Server is in any other state but RECOVER, the Secondary
Server moves into the POTENTIAL-CONFLICT state (as does the Primary
Server).
9.8. Secondary Server in POTENTIAL-CONFLICT state A server transitions out of PAUSED state by being restarted. At that
time, the previous state MUST be the state the server was in prior to
entering the PAUSED state.
The secondary server enters POTENTIAL-CONFLICT state when the combi- DRAFT November 1998
nation of its state and that of the primary indicate that a potential
conflict of IP address allocation has occurred. There is no guaran-
tee that such a conflict has occurred -- just the possibility. In
this state each server compares its client binding information with
that of the other server and any conflicts are resolved in an imple-
mentation dependent manner.
When (and if) the resolution process completes, each server moves 6.12. SHUTDOWN state
into the NORMAL state.
10. Safe Period This state exists to allow one server to inform another that it will
be out of service for what is predicted to be a relatively long time,
and to allow the other server to transition immediately to PARTNER-
DOWN state, and take over completely for the server going down.
A server which is aware that it is shutting down SHOULD send one or
more DHCPPOLL messages with the 'state' field containing SHUTDOWN.
While a server may or may not transition internally into SHUTDOWN
state, the 'previous' state determined when it is restarted MUST be
the state active prior to the command to shut down unless the server
detects that its partner has moved to PARTNER-DOWN, in which case it
MUST be RECOVER.
6.12.1. Upon entry to SHUTDOWN state
When entering SHUTDOWN state, the server MUST record the previous
state in stable storage for use when the server is restarted. It
also MUST record the current time as the last time operational.
A DHCPPOLL message SHOULD be sent to the partner with the 'state'
field containing SHUTDOWN state.
6.12.2. Operation in SHUTDOWN state
A server in SHUTDOWN state MUST be unresponsive to DHCP client input.
If a server receives any message indicating that the partner has
moved to PARTNER-DOWN state while it is in SHUTDOWN state (e.g., in
response to the DHCPPOLL it sent containing SHUTDOWN state), then it
MUST record RECOVER state as the previous state to be used when it is
restarted.
A server SHOULD wait for a few seconds after informing the partner of
entry into SHUTDOWN state (if communications are okay) to determine
whether the partner will enter PARTNER-DOWN state.
6.12.3. Transitions out of SHUTDOWN state
A server transitions out of SHUTDOWN state by being restarted.
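The bookkeeping required around SHUTDOWN state can be sketched as follows. The ShutdownRecord class and its method names are hypothetical, and a plain dictionary stands in for real stable storage; the sketch only shows which values are recorded and when the recorded previous state is replaced by RECOVER.

```python
import time


class ShutdownRecord:
    """Hypothetical helper tracking what a server records while shutting down."""

    def __init__(self, stable_storage: dict):
        # Assumption: a dict stands in for the server's stable storage.
        self.storage = stable_storage

    def enter_shutdown(self, current_state: str, now: float = None) -> None:
        # Record the previous state and the last time operational.
        self.storage["previous_state"] = current_state
        self.storage["last_operational"] = time.time() if now is None else now

    def on_partner_state(self, partner_state: str) -> None:
        # If the partner has moved to PARTNER-DOWN, restart in RECOVER.
        if partner_state == "PARTNER-DOWN":
            self.storage["previous_state"] = "RECOVER"

    def state_at_restart(self) -> str:
        return self.storage["previous_state"]
```

Waiting a few seconds after announcing SHUTDOWN gives on_partner_state a chance to run before the process exits, which is why the text recommends the delay.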
7. Safe Period
Due to the restrictions imposed on each server while in Due to the restrictions imposed on each server while in
COMMUNICATION-INTERRUPTED state, long-term operation in this state is COMMUNICATIONS-INTERRUPTED state, long-term operation in this state
not feasible for either server. One reason that these states exist at
all, is to allow the servers to easily survive transient network
communications failures of a few minutes to a few days (although the is not feasible for either server. One reason that these states
actual time periods will depend a great deal on the DHCP activity of exist at all, is to allow the servers to easily survive transient
the network in terms of arrival and departure of DHCP clients on the network communications failures of a few minutes to a few days
network). (although the actual time periods will depend a great deal on the
DHCP activity of the network in terms of arrival and departure of
DHCP clients on the network).
Eventually, when the servers are unable to communicate, they will Eventually, when the servers are unable to communicate, they will
have to move into a state where they no longer can re-integrate have to move into a state where they no longer can re-integrate
without some possibility of a duplicate IP address allocation. without some possibility of a duplicate IP address allocation.
There are two ways that they can move into this state (known as There are two ways that they can move into this state (known as
PARTNER-DOWN). PARTNER-DOWN).
They can either be informed by external command that, indeed, the They can either be informed by external command that, indeed, the
partner server is down. In this case, there is no difficulty in mov- partner server is down. In this case, there is no difficulty in mov-
ing into the PARTNER-DOWN state since it is an accurate reflection of ing into the PARTNER-DOWN state since it is an accurate reflection of
reality and the protocol has been designed to operate correctly (even reality and the protocol has been designed to operate correctly (even
during reintegration) if, when in PARTNER-DOWN state the partner is, during reintegration) if, when in PARTNER-DOWN state the partner is,
indeed, down. indeed, down.
The other difficulty is when the servers are running unattended for The more difficult scenario is when the servers are running unat-
extended periods, and in this case the option is provided to config- tended for extended periods, and in this case an option is provided
ure something called a "safe- period" into each server. This OPTIONAL to configure something called a "safe-period" into each server. This
safe-period is the period after which either the Primary or Secondary OPTIONAL safe-period is the period after which either the primary or
Server will automatically transition to PARTNER-DOWN from secondary server will automatically transition to PARTNER-DOWN from
COMMUNICATION-INTERRUPTED state. If this transition is completed and COMMUNICATIONS-INTERRUPTED state. If this transition is completed
the partner is not down, then the possibility of duplicate IP address and the partner is not down, then the possibility of duplicate IP
allocations will exist. address allocations will exist.
The goal of the "safe-period" is to allow network operations staff The goal of the "safe-period" is to allow network operations staff
some time to react to a server moving into COMMUNICATION-INTERRUPTED some time to react to a server moving into COMMUNICATIONS-INTERRUPTED
state. During the safe-period the only requirement is that the net- state. During the safe-period the only requirement is that the net-
work operations staff determine if both servers are still running -- work operations staff determine if both servers are still running --
and if they are, to either fix the network communications failure and if they are, to either fix the network communications failure
between them, or to take one of the servers down before the expira- between them, or to take one of the servers down before the expira-
tion of the safe-period. tion of the safe-period.
The length of the safe-period is installation dependent, and depends The length of the safe-period is installation dependent, and depends
in large part on the number of unallocated IP addresses within the in large part on the number of unallocated IP addresses within the
subnet address pool and the expected frequency of arrival of previ- subnet address pool and the expected frequency of arrival of previ-
ously unknown DHCP clients requiring IP addresses. Many environments ously unknown DHCP clients requiring IP addresses. Many environments
should be able to support safe-periods of several days. should be able to support safe-periods of several days.
During this safe period, either server will allow renewals from any During this safe period, either server will allow renewals from any
existing client. The only limitation concerns the need for IP existing client. The only limitation concerns the need for IP
addresses for the DHCP server to hand out to new DHCP clients and the addresses for the DHCP server to hand out to new DHCP clients and the
need to re-allocate IP addresses to different DHCP clients. need to re-allocate IP addresses to different DHCP clients.
The number of "extra" IP addresses required is equal to the expected The number of "extra" IP addresses required is equal to the expected
total number of new DHCP clients encountered during the safe period. total number of new DHCP clients encountered during the safe period.
This is dependent only on the arrival rate of new DHCP clients, not This is dependent only on the arrival rate of new DHCP clients, not
the total number of outstanding leases on IP addresses. the total number of outstanding leases on IP addresses.
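The sizing rule above is simple arithmetic, shown here as a sketch. The function name and the hourly units are assumptions made for illustration, and rounding up to a whole address is a choice of this sketch rather than something the text specifies.

```python
import math


def extra_addresses_needed(new_clients_per_hour: float,
                           safe_period_hours: float) -> int:
    """Estimate the number of "extra" IP addresses for a safe period.

    The estimate depends only on the arrival rate of new DHCP clients,
    not on the number of outstanding leases.
    """
    return math.ceil(new_clients_per_hour * safe_period_hours)
```

For example, a site that sees two previously unknown clients per hour and wants a three-day (72 hour) safe period would need about 144 unallocated addresses.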
In the unlikely event that a relatively short safe period of an hour In the unlikely event that a relatively short safe period of an hour
is all that can be used (given a dearth of IP addresses or a very is all that can be used (given a dearth of IP addresses or a very
high arrival rate of new DHCP clients), even that can provide sub- high arrival rate of new DHCP clients), even that can provide sub-
stantial benefits in allowing the DHCP subsystem to ride through a stantial benefits in allowing the DHCP subsystem to ride through
minor problems that could occur and be fixed within that hour. In minor problems that could occur and be fixed within that hour. In
these cases, no possibility of duplicate IP address allocation these cases, no possibility of duplicate IP address allocation
exists, and re-integration after the failure is solved will be exists, and re-integration after the failure is solved will be
automatic and require no operator intervention. automatic and require no operator intervention.
11. Open Issues 8. Security
A number of details remain to be worked out. They are as follows:
1. Level of Agreement and Completion
This draft is incomplete in two senses. First, none of the
authors agree with everything written, and quite a number of
issues remain to be worked out among the various authors (to say
nothing about the rest of the community). Second, this draft is
not yet complete enough to support creation of inter-operable
implementations.
However, we believe that even though this draft is very much a
work in progress, there is value in sharing it with the rest
of the DHCP community in its current form.
2. Failover Port
We need to resolve whether the Failover protocol runs with the
same or a different port as the DHCP protocol. In the interests
of allowing implementation of the Failover protocol by a dif-
ferent process or sub-process, having it use a different port
seems reasonable.
3. High Level Operations
While the detailed operations are beginning to come together,
the higher level operations (like reintegration) are, as yet,
incompletely specified. This will be rectified in a later
revision.
4. Option Spaces
The draft currently reflects some rather fuzzy goals of using
DHCP options where they apply but also defining new options. It
uses the "user defined option space" for this, which is probably
not a good idea. Perhaps the DHCP Panel will produce a larger
option space in which all of these options can be defined, or
perhaps (as it is written in the draft) this protocol will just
have to define entirely unique options.
5. Subnet Level Granularity
This protocol talks about a server being in one state or
another, however the desire is for this protocol to operate
independently in each address pool for which a primary and
secondary server is defined. In this way, the "server" state
really refers to the "subnet" state. Once the protocol is vali-
dated, the editing work to make it operate at subnet granularity
will be performed.
6. Secondary Server Communications with DHCP Clients
There are two situations where we may want to allow the secon- The Failover protocol MAY be secured with a simple shared secret mes-
dary server to communicate with DHCP clients even though the sage digest which covers each message. Since there are a number of
secondary can communicate with the primary and would normally be configuration parameters that must be the same on each server in a
unresponsive to DHCP client requests. pair, it is not unreasonable to require a shared secret be configured
as well.
The first situation which deserves consideration is where the Only information within the packet and covered by the message digest
secondary has given a DHCP client a lease on an IP address when is used for operation of the protocol. It is for this reason that
it was not able to communicate with the primary, and then subse- the IP address of the sending server is sent in the 'sending server
quently the secondary becomes able to communicate with the pri- id' field of the fixed header of the failover message when it might
mary. When the client unicasts its DHCPREQUEST to the secondary seem that the same information could be recovered from the source
to renew its lease, the secondary will not be able to communi- address of the IP packet.
cate with the client (as this protocol is defined). Should we
allow the Secondary to extend the lease for the DHCP client and
then inform the primary of that extension using the DHCPBNDUPD
message in the same way as the Primary uses that message?
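The shared secret message digest described in the Security section can be sketched as follows. The draft does not name a digest algorithm or a wire layout here, so the use of HMAC-SHA256, the 4-byte 'sending server id' prefix, and the function names are all assumptions made for illustration. The point shown is that the sender's address travels inside the bytes covered by the digest, so the receiver never has to trust the source address of the IP packet.

```python
import hashlib
import hmac
import ipaddress
from typing import Optional, Tuple

DIGEST_LEN = hashlib.sha256().digest_size  # 32 bytes (assumed algorithm)


def sign_message(secret: bytes, sending_server_id: str, payload: bytes) -> bytes:
    """Build header + payload and append a digest covering both."""
    header = ipaddress.IPv4Address(sending_server_id).packed  # 4 bytes
    covered = header + payload
    digest = hmac.new(secret, covered, hashlib.sha256).digest()
    return covered + digest


def verify_message(secret: bytes, packet: bytes) -> Optional[Tuple[str, bytes]]:
    """Return (sending_server_id, payload) if the digest checks out, else None."""
    if len(packet) < 4 + DIGEST_LEN:
        return None
    covered, digest = packet[:-DIGEST_LEN], packet[-DIGEST_LEN:]
    expected = hmac.new(secret, covered, hashlib.sha256).digest()
    if not hmac.compare_digest(digest, expected):
        return None
    # The sender identity comes from the covered bytes, not the IP header.
    return str(ipaddress.IPv4Address(covered[:4])), covered[4:]
```

Both servers in a pair would be configured with the same secret, alongside the other parameters that must already match.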
The second situation arises where a client can only communicate 9. Extended Discussion
with the secondary due to some network failure, but the primary
and secondary server can communicate. As written, the protocol
will not allow the secondary to offer a lease to the DHCP
client, but it would be straightforward to modify the protocol
to allow the secondary to do so. The only difficult part of
this change to the protocol would be to suggest how the secon-
dary would know that the DHCP client could talk only to the
secondary. But, given that if the DHCP primary could talk to
the DHCP client, the secondary would expect to hear about it in
DHCPBNDUPD messages at some point, the absence of such messages
could be used as a signal to communicate to the DHCP client in
question.
Some areas in the draft above warranted more extended discussion than
was feasible to insert directly into the text.
7. UDP or TCP 1. UDP or TCP
There has been much debate about the utility of using UDP for There has been debate about the utility of using UDP for the
the failover protocol, since it doesn't supply guaranteed Failover protocol, since it doesn't supply guaranteed
delivery. Certainly rebuilding TCP out of UDP would be a mis- delivery. UDP has been chosen as the protocol of choice for
take. Some factors to consider in this debate are as follows: the failover protocol due to the following factors:
First, it is important to recognize that mere receipt of a First, it is important to recognize that mere receipt of a
packet by the other server in the pair (e.g., receipt of a packet by the other server in the pair (e.g., receipt of a
DHCPBNDUPD packet by the secondary server) is not sufficient for DHCPBNDUPD packet by the secondary server) is not sufficient
the primary to update its own bindings database with new infor- for the primary to update its own bindings database with new
mation about what the secondary knows. In all cases of information about what the secondary knows. In all cases of
transfers of bindings information, the server of a DHCPBNDUPD
transfers of binding information, the server of a DHCPBNDUPD
message MUST update its own stable storage prior to replying message MUST update its own stable storage prior to replying
with a DHCPBNDACK message (except in the marginal case where all with a DHCPBNDACK message (except in the marginal case where
of the updates are rejected). An action is required by the all of the updates are rejected). An action is required by
receiving server and an explicit ACK is needed by the sending the receiving server and an explicit ACK is needed by the
server to ensure the integrity of the protocol. So, just know- sending server to ensure the integrity of the protocol. So,
ing that the other server has received a Failover protocol just knowing that the other server has received a Failover
packet is not intrinsically interesting. protocol packet is not intrinsically interesting.
Second, the DHCP protocol, both the client and server side, is Second, the DHCP protocol, both the client and server side, is
being implemented in progressively smaller and smaller machines. being implemented in progressively smaller and smaller
While this progression is most evident in DHCP clients, there machines. While this progression is most evident in DHCP
exist implementations today of DHCP servers embedded in devices clients, there exist implementations today of DHCP servers
that are by no stretch of the imagination traditional "servers" embedded in devices that are by no stretch of the imagination
running mainstream operating systems. In many ways, the Fail- traditional "servers" running mainstream operating systems.
over protocol is very well suited to such devices. Adding addi- In many ways, the Failover protocol is very well suited to
tional protocol infrastructure requirements to implement the such devices. Adding additional protocol infrastructure
Failover protocol could easily prevent its implementation in requirements to implement the Failover protocol might prevent
devices that in some ways need it most. its implementation in devices that in some ways need it most
(devices with limited stable storage of their own).
Third, there are only a few cases where the Failover protocol Third, there are only a few cases where the Failover protocol
requires guaranteed delivery of packets. In particular, the requires guaranteed delivery of packets. In particular, the
normal Primary to Secondary DHCPBNDUPD message to not have to be normal Primary to Secondary DHCPBNDUPD message do not have to
delivered reliably. The consequences of lost DHCPBNDUPD mes- be delivered reliably. The consequences of lost DHCPBNDUPD
sages are handled by the use of the MDLI, for the simple reason messages are handled by the use of the MCLT, for the simple
that since these messages are "lazy", they may not get delivered reason that since these messages are "lazy", they may not get
because of a server failover prior to their transmission. Given delivered because of a server Failover prior to their
that the protocol is robust in the face of loss of either a transmission. The protocol is robust in the face of loss of
DHCPBNDUPD message or a DHCPBNDACK message, a technique known as either a DHCPBNDUPD message or a DHCPBNDACK message.
"fire and forget" may be used with this protocol and two
cooperating implementations. If the DHCPBNDACK message contains
all of the information originally in the DHCPBNDUPD message,
then the DHCPBNDUPD message may be transmitted and forgotten by
the sending server (typically the primary). When and if the
secondary receives the DHCPBNDUPD and replies with a DHCPBNDACK
message and the primary receives it, the primary will update its
stable storage with a new picture of what the secondary knows Furthermore, a technique known as "fire and forget" may be
about the lease time. If either of these messages is lost, the used with this protocol and two cooperating implementations.
only downside is that the DHCP client associated with the bind- If the DHCPBNDACK message contains all of the information ori-
ing in question may receive a shorter lease for one lease period ginally in the DHCPBNDUPD message, then the DHCPBNDUPD message
than it would otherwise. This "fire and forget" technique may be transmitted and forgotten by the sending server (typi-
could substantially ease both the complexity of implementation cally the primary). When and if the secondary receives the
and memory requirements of an implementation of the Failover DHCPBNDUPD and replies with a DHCPBNDACK message and the pri-
protocol, especially where two servers were communicating over a mary receives it, the primary will update its stable storage
with a new picture of what the secondary knows about the lease
time. If either of these messages is lost, the only downside
is that the DHCP client associated with the binding in ques-
tion may receive a shorter lease for one lease period than it
would otherwise. This "fire and forget" technique could sub-
stantially ease both the complexity of implementation and
memory requirements of an implementation of the Failover pro-
tocol, especially where two servers were communicating over a
very slow link. very slow link.
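The "fire and forget" technique discussed above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions: the class names, the Binding fields, and the callable standing in for the network link are hypothetical, and real servers would exchange these messages asynchronously. The sketch shows that the sender keeps no pending-request state, because the DHCPBNDACK repeats everything that was in the DHCPBNDUPD.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Binding:
    """Fields carried in both DHCPBNDUPD and DHCPBNDACK (hypothetical layout)."""
    ip: str
    client_id: str
    lease_expiry: int  # seconds, arbitrary epoch


@dataclass
class Secondary:
    stable_storage: Dict[str, Binding] = field(default_factory=dict)

    def on_bndupd(self, upd: Binding) -> Binding:
        # Commit to stable storage BEFORE replying with the ack.
        self.stable_storage[upd.ip] = upd
        # The ack echoes all of the information from the update.
        return upd


@dataclass
class Primary:
    # What the primary believes the secondary knows: ip -> lease expiry.
    partner_view: Dict[str, int] = field(default_factory=dict)

    def fire(self, upd: Binding,
             link: Callable[[Binding], Optional[Binding]]) -> None:
        # The update is sent and forgotten: no retransmission queue is kept.
        ack = link(upd)  # None models a lost update or a lost ack
        if ack is not None:
            self.on_bndack(ack)

    def on_bndack(self, ack: Binding) -> None:
        # The ack is self-describing, so no lookup of a pending request is needed.
        self.partner_view[ack.ip] = ack.lease_expiry


secondary = Secondary()
primary = Primary()
primary.fire(Binding("10.0.0.5", "client-a", 1000), secondary.on_bndupd)
primary.fire(Binding("10.0.0.6", "client-b", 2000), lambda upd: None)  # lost
print(primary.partner_view)  # {'10.0.0.5': 1000}
```

For the lost message, the primary's picture of the secondary is simply left unchanged, which corresponds to the client receiving a shorter lease for one lease period.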
12. Acknowledgments
10. Acknowledgments
Ralph Droms started it all, by sketching out an initial interserver Ralph Droms started it all, by sketching out an initial interserver
draft that embodied ideas from several past IETF meetings. In that draft that embodied ideas from several past IETF meetings. In that
draft, he acknowledged contributions by Jeff Mogul, Greg Minshall, draft, he acknowledged contributions by Jeff Mogul, Greg Minshall,
Rob Stevens, Walt Wimer, Ted Lemon, and the DHC working group. Rob Stevens, Walt Wimer, Ted Lemon, and the DHC working group.
Kim Kinnear and Bob Cole each extended that draft, separately and Kim Kinnear and Bob Cole each extended that draft, separately and
then together, until they created an interserver draft that supported then together, until they created an interserver draft that supported
any number of servers. The complexity of that approach was just too any number of servers. The complexity of that approach was just too
great, and led to a much simpler approach embodied in the first Fail- great, and that draft wasn't greeted with enthusiasm by many, includ-
over draft by Greg Rabil, Mike Dooley, and Arun Kapur and Ralph ing its authors.
It did however lead to a much simpler approach embodied in the first
Failover draft by Greg Rabil, Mike Dooley, Arun Kapur and Ralph
Droms. This draft posited only two servers -- a primary and a secon- Droms. This draft posited only two servers -- a primary and a secon-
dary. Kim Kinnear then wrote the Safe Failover draft to layer on top dary.
of the Failover Draft and increase its the robustness in the face of
certain rare network failures. At the spring 1998 IETF meeting in LA,
the DHC working group said that they wanted a merged Failover and
Safe Failover draft. Steve Gonczi and Bernie Volz stepped up and
produced the raw material for such a merged draft, along with a new
message format designed around DHCP options and other extensions and
clarifications. Kim Kinnear edited their work into draft format and
made other changes, and that is what you have in your hands.
Many people have reviewed the various drafts that went into this Kim Kinnear then wrote the Safe Failover draft to layer on top of the
result. At American Internet, ideas have been contributed by Mark Failover Draft and increase its robustness in the face of certain
Stapp, Brad Parker, and Ellen Garvey. Glenn Waters of Bay Networks rare network failures.
contributed ideas and enthusiasm to make a Failover protocol that was
both "safe" and "lazy".
13. References At the spring 1998 IETF meeting in LA, the DHC working group said
that they wanted a merged Failover and Safe Failover draft. Steve
Gonczi and Bernie Volz stepped up and produced the raw material for
such a merged draft, along with a new message format designed around
DHCP options and other extensions and clarifications. Kim Kinnear
edited their work into draft format and made other changes in time
for the Summer Chicago IETF meeting.
[1] Droms, R., "Dynamic Host Configuration Protocol", RFC 2131, During the summer and fall of 1998, two groups have been working on
March 1997. separate implementations of the evolving draft. Bernie Volz and
Steve Gonczi constitute one group, and Kim Kinnear, Mark Stapp and
Paul Fox make up the other. These two groups have worked together to
produce considerable changes and simplifications of the protocol dur-
ing this period, and Steve Gonczi and Kim Kinnear have edited these
changes into this latest revision in time for submission to the
December 1998 Orlando IETF meeting.
[2] Alexander, S., Droms, R., "DHCP Options and BOOTP Vendor These most recent changes have been reviewed by Ralph Droms, Greg
Extensions", Internet RFC 2132, March 1997. Rabil, Bernie Volz, Steve Gonczi, Mark Stapp, Paul Fox, and Kim Kin-
near. This does not preclude any of these people from expressing
disagreement with what is contained in this draft at any future time.
Many people have reviewed the various earlier drafts that went into
this result. At American Internet, ideas were contributed by Brad
Parker. At Cisco Systems, Paul Fox, and Ellen Garvey have contri-
buted greatly to the form of the protocol. Glenn Waters of Bay
[3] Rabil, G., Dooley, M., Kapur, A., Droms, R., "DHCP Failover
Protocol", draft-ietf-dhc-failover-00.txt.
[4] Gudmundsson, Olafur, "Security Architecture for DHCP", Networks contributed ideas and enthusiasm to make a Failover protocol
draft-ietf-dhc-security-arch-00.txt. that was both "safe" and "lazy".
14. Author's information 11. References
[RFC 2131] Droms, R., "Dynamic Host Configuration Protocol", RFC
2131, March 1997.
[RFC 2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", RFC 2119, March 1997.
[RFC 2132] Alexander, S., Droms, R., "DHCP Options and BOOTP Vendor
Extensions", Internet RFC 2132, March 1997.
12. Author's information
Ralph Droms Ralph Droms
323 Dana Engineering 323 Dana Engineering
Bucknell University Bucknell University
Lewisburg, PA 17837 Lewisburg, PA 17837
Phone: (717) 524-1145 Phone: (717) 524-1145
EMail: droms@bucknell.edu EMail: droms@bucknell.edu
Greg Rabil, Mike Dooley, Arun Kapur Greg Rabil, Mike Dooley, Arun Kapur
Quadritek Systems, Inc. Lucent Technologies (Quadritek)
10 Valley Stream Parkway, Suite 240 10 Valley Stream Parkway, Suite 240
Malvern, PA 19355 Malvern, PA 19355
Phone: (800) 208-2747 Phone: (800) 208-2747
EMail: grabil@quadritek.com EMail: grabil@lucent.com
mdooley@quadritek.com mdooley@lucent.com
akapur@quadritek.com akapur@lucent.com
Kim Kinnear Kim Kinnear
American Internet Corporation Mark Stapp
4 Preston Ct. Cisco Systems
Bedford, MA 01730-2334 250 Apollo Drive
Chelmsford, MA 01824
Phone: (781) 276-4587 Phone: (978) 244-8000
EMail: kinnear@american.com
EMail: kkinnear@cisco.com
mjs@cisco.com
Steve Gonczi, Bernie Volz Steve Gonczi, Bernie Volz
Process Software Corporation Process Software Corporation
959 Concord St. 959 Concord St.
Framingham, MA 01701 Framingham, MA 01701
Phone: (508) 879-6994 Phone: (508) 879-6994
EMail: gonczi@process.com EMail: gonczi@process.com
volz@process.com volz@process.com
 End of changes. 

This html diff was produced by rfcdiff 1.23, available from http://www.levkowetz.com/ietf/tools/rfcdiff/