draft-ietf-dhc-failover-04.txt   draft-ietf-dhc-failover-05.txt 
skipping to change at page 1, line 16 skipping to change at page 1, line 17
Mark Stapp Mark Stapp
Cisco Systems Cisco Systems
Bernie Volz Bernie Volz
Steve Gonczi Steve Gonczi
Process Software Process Software
Greg Rabil Greg Rabil
Mike Dooley Mike Dooley
Arun Kapur Arun Kapur
Quadritek Systems Lucent Technologies
June 1999 October 1999
Expires December 1999 Expires April 2000
DHCP Failover Protocol DHCP Failover Protocol
<draft-ietf-dhc-failover-04.txt> <draft-ietf-dhc-failover-05.txt>
Status of this Memo Status of this Memo
This document is an Internet-Draft and is in full conformance with This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026. all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet- other groups may also distribute working documents as Internet-
Drafts. Drafts.
skipping to change at page 2, line 8 skipping to change at page 2, line 8
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
Copyright Notice Copyright Notice
Copyright (C) The Internet Society (1999). All Rights Reserved. Copyright (C) The Internet Society (1999). All Rights Reserved.
Abstract Abstract
DHCP [RFC 2131] allows for multiple servers to be operating on a DHCP [RFC 2131] allows for multiple servers to be operating on a
single network. Some sites are interested in running multiple servers single network. Some sites are interested in running multiple
in such a way so as to provide redundancy in case of server failure. servers in such a way so as to provide redundancy in case of server
In order for this to work reliably, the cooperating primary and failure. In order for this to work reliably, the cooperating primary
secondary servers must maintain a consistent database of the lease and secondary servers must maintain a consistent database of the
information. This implies that servers will need to coordinate any lease information. This implies that servers will need to coordinate
and all lease activity so that this information is synchronized in any and all lease activity so that this information is synchronized
case of failover. in case of failover.
This document defines a protocol to provide this synchronization This document defines a protocol to provide this synchronization
between two servers. One server is designated the "primary" server, between two servers. One server is designated the "primary" server,
the other is the "secondary" server. Additionally, this document the other is the "secondary" server. This document also describes a
describes a protocol which allows each server to determine to which way to integrate the failover protocol with the DHCP loadbalancing
DHCP clients it should provide service when both servers are approach.
operating in order to support load balancing as well as when on one
server has failed in order to support increased DHCP service
availability.
This document is a complete rewrite of draft-ietf-dhc-failover- This document is a significant revision of draft-ietf-dhc-failover-
03.txt. That earlier draft described a UDP based failover protocol, 04.txt.
and this draft describes a closely related protocol which uses TCP as
a transport and includes new load-balancing and security
capabilities.
Table of Contents Table of Contents
1. Introduction................................................. 4 1. Introduction................................................. 4
2. Terminology.................................................. 5 2. Terminology.................................................. 5
2.1. Requirements terminology................................... 5 2.1. Requirements terminology................................... 5
2.2. DHCP and failover terminology.............................. 5 2.2. DHCP and failover terminology.............................. 5
3. Background and External Requirements......................... 7 3. Background and External Requirements......................... 8
3.1. Key aspects of the DHCP protocol........................... 7 3.1. Key aspects of the DHCP protocol........................... 8
3.2. BOOTP relay agent implementation........................... 9 3.2. BOOTP relay agent implementation........................... 10
3.3. What does it mean if a server can't communicate with its partner? 10 3.3. What does it mean if a server can't communicate with its partner? 11
3.4. Challenging scenarios for a Failover protocol.............. 12
3.4. Challenging scenarios for a Failover protocol............. 10 3.5. Using TCP to detect partner server failure................. 13
3.5. Using TCP to detect partner server failure................ 11 4. Design Goals................................................. 14
4. Design Goals................................................ 13 4.1. Design requirements for this protocol...................... 14
4.1. Design requirements for this protocol..................... 13 4.2. Goals for this protocol.................................... 15
4.2. Goals for this protocol................................... 13 4.3. Limitations of this Protocol............................... 16
4.3. Limitations of this Protocol.............................. 14 5. Protocol Overview............................................ 16
5. Protocol Overview........................................... 15 5.1. Messages and States........................................ 17
5.1. Messages and States....................................... 15 5.2. Fundamental restrictions................................... 19
5.2. Fundamental restrictions.................................. 18 5.3. Load balancing............................................. 26
5.3. Load balancing............................................ 24 5.4. Operating in NORMAL state.................................. 27
5.4. Operating in NORMAL state................................. 25 5.5. Operating in COMMUNICATIONS-INTERRUPTED state.............. 27
5.5. Operating in COMMUNICATIONS-INTERRUPTED state............. 25 5.6. Operating in PARTNER-DOWN state............................ 27
5.6. Operating in PARTNER-DOWN state........................... 25 5.7. Operating in RECOVER state................................. 28
5.7. Operating in RECOVER state................................ 26 5.8. Operating in STARTUP state................................. 28
6. Packet Formats.............................................. 26 5. Protocol Overview (continued)
6.1. Common message format..................................... 26 5.9. Time synchronization between servers....................... 28
6.2. Common option format...................................... 28 5.10. IP address binding-status................................. 29
6.3. BNDUPD message format..................................... 40 5.11. DNS dynamic update considerations......................... 34
6.4. BNDACK message format..................................... 42 5.12. Reservations and failover................................. 38
6.5. Bulking for BNDUPD and BNDACK messages.................... 44 5.13. Dynamic BOOTP and failover................................ 39
6.6. UPDREQ message format..................................... 44 5.14. Guidelines for selecting MCLT............................. 39
6.7. UPDREQALL message format.................................. 44 6. Packet Formats............................................... 40
6.8. UPDDONE message format.................................... 44 6.1. Common message format...................................... 40
6.9. POOLREQ message format.................................... 45 6.2. Common option format....................................... 43
6.10. POOLRESP message format.................................. 45 6.3. BNDUPD message format...................................... 55
6.11. CONNECT message format................................... 46 6.4. BNDACK message format...................................... 58
6.12. CONNECTACK message format................................ 46 6.5. Bulking for BNDUPD and BNDACK messages..................... 59
6.13. STATE message format..................................... 47 6.6. UPDREQ message format...................................... 60
6.14. CONTACT message format................................... 48 6.7. UPDREQALL message format................................... 60
7. Protocol Messages........................................... 48 6.8. UPDDONE message format..................................... 60
7.1. BNDUPD message............................................ 48 6.9. POOLREQ message format..................................... 61
7.2. BNDACK message............................................ 57 6.10. POOLRESP message format................................... 61
7.3. UPDREQ message............................................ 58 6.11. CONNECT message format.................................... 62
7.4. UPDREQALL message......................................... 59 6.12. CONNECTACK message format................................. 62
7.5. UPDDONE message........................................... 60 6.13. STATE message format...................................... 63
7.6. POOLREQ message........................................... 60 6.14. CONTACT message format.................................... 64
7.7. POOLRESP message.......................................... 61 6.15. DISCONNECT message format................................. 64
7.8. CONNECT message........................................... 62 7. Protocol Messages............................................ 64
7.9. CONNECTACK message........................................ 65 7.1. BNDUPD message............................................. 64
7.10. STATE message............................................ 68 7.2. BNDACK message............................................. 75
7.11. CONTACT message.......................................... 69 7.3. UPDREQ message............................................. 76
8. Connection Management....................................... 70 7.4. UPDREQALL message.......................................... 78
8.1. Connection granularity.................................... 70 7.5. UPDDONE message............................................ 79
8.2. Creating the TCP connection............................... 70 7.6. POOLREQ message............................................ 80
8.3. Using the TCP connection for determining communications status. 71 7.7. POOLRESP message........................................... 81
8.4. Using the TCP connection for binding data................. 73 7.8. CONNECT message............................................ 81
8.5. Using the TCP connection for control messages............. 73 7.9. CONNECTACK message......................................... 85
8.6. Losing the TCP connection................................. 73 7.10. STATE message............................................. 88
9. Protocol States............................................. 73 7.11. CONTACT message........................................... 89
9.1. Server Initialization..................................... 74 7.12. DISCONNECT message........................................ 89
9.2. Server State Transitions.................................. 74 8. Connection Management........................................ 90
9.3. STARTUP state............................................. 77 8.1. Connection granularity..................................... 90
9.4. PARTNER-DOWN state........................................ 79 8.2. Creating the TCP connection................................ 90
9.5. RECOVER state............................................. 81 8.3. Using the TCP connection for determining communications status 91
9.6. NORMAL state.............................................. 83 8.4. Using the TCP connection for binding data.................. 93
9.7. COMMUNICATIONS-INTERRUPTED State.......................... 86 8.5. Using the TCP connection for control messages.............. 94
9.8. POTENTIAL-CONFLICT state.................................. 89 8.6. Losing the TCP connection.................................. 94
9.9. RECOVER-DONE state........................................ 90 9. Protocol States.............................................. 94
9.10. PAUSED state............................................. 91 9.1. Server Initialization...................................... 95
9.11. SHUTDOWN state........................................... 91 9.2. Server State Transitions................................... 95
10. Safe Period................................................ 92 9.3. STARTUP state.............................................. 98
11. Security................................................... 94 9.4. PARTNER-DOWN state......................................... 100
11.1. Simple shared secret..................................... 94 9.5. RECOVER state.............................................. 102
11.2. TLS...................................................... 94 9.6. NORMAL state............................................... 104
12. Hash algorithm for load balancing.......................... 95 9.7. COMMUNICATIONS-INTERRUPTED State........................... 107
13. Acknowledgments............................................ 96 9.8. POTENTIAL-CONFLICT state................................... 110
14. References................................................. 97 9.9. RESOLUTION-INTERRUPTED state............................... 111
15. Author's information....................................... 98 9.10. RECOVER-DONE state........................................ 112
16. Full Copyright Statement................................... 99 9.11. PAUSED state.............................................. 113
9.12. SHUTDOWN state............................................ 113
10. Safe Period................................................. 114
11. Security.................................................... 116
11.1. Simple shared secret...................................... 116
11.2. TLS....................................................... 117
12. Acknowledgments............................................. 117
13. References.................................................. 119
14. Author's information........................................ 120
15. Full Copyright Statement.................................... 121
1. Introduction 1. Introduction
DHCP [RFC 2131] allows for multiple servers to be operating on a sin- DHCP [RFC 2131] allows for multiple servers to be operating on a sin-
gle network. Some sites are interested in running multiple servers gle network. Some sites are interested in running multiple servers
in such a way so as to provide redundancy in case of server failure in such a way so as to provide redundancy in case of server failure
since the DHCP subsystem is in many cases a critical part of the net- since the DHCP subsystem is in many cases a critical part of the net-
work infrastructure. work infrastructure.
This document defines a protocol to provide synchronization between This document defines a protocol to provide synchronization between
skipping to change at page 5, line 24 skipping to change at page 5, line 24
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC 2119]. document are to be interpreted as described in RFC 2119 [RFC 2119].
2.2. DHCP and failover terminology 2.2. DHCP and failover terminology
This document uses the following terms: This document uses the following terms:
o "DHCP client" or "client" o "DHCP client" or "client"
A DHCP client is an Internet host using DHCP to obtain confi- A DHCP client is an Internet host using DHCP to obtain confi-
guration parameters such as a network address. guration parameters such as a network address. The term
"client" used within this document always means a DHCP client,
and never one of the two failover servers.
o "DHCP server" or "server" o "DHCP server" or "server"
A DHCP server is an Internet host that returns configuration A DHCP server is an Internet host that returns configuration
parameters to DHCP clients. parameters to DHCP clients.
o "binding" o "binding"
A binding is a collection of configuration parameters, including A binding is a collection of configuration parameters, including
at least an IP address, associated with or "bound to" a DHCP at least an IP address, associated with or "bound to" a DHCP
skipping to change at page 6, line 49 skipping to change at page 7, line 4
o "stable storage" o "stable storage"
Every DHCP server is assumed to have some form of what is called Every DHCP server is assumed to have some form of what is called
"stable storage". Stable storage is used to hold information "stable storage". Stable storage is used to hold information
concerning IP address bindings (among other things) so that this concerning IP address bindings (among other things) so that this
information is not lost in the event of a server failure which information is not lost in the event of a server failure which
requires restart of the server. requires restart of the server.
o "MCLT" o "MCLT"
The MCLT refers to maximum client lead time. This time is con- The MCLT refers to maximum client lead time. This time is con-
figured on the primary server and transmitted from the primary figured on the primary server and transmitted from the primary
to the secondary server in the CONNECT message. It is the max- to the secondary server in the CONNECT message. It is the max-
imum amount of time that one server can give to a client for a imum amount of time that one server can give to a client for a
binding beyond that known and ACKed by the partner server. See binding beyond that known and ACKed by the partner server. See
section 5.2.1 for details. section 5.2.1 for details.
o "DNS"
An abbreviation for "Domain Name System", a scheme where a cen-
tral name repository is used to map names to IP addresses and IP
addresses to names.
o "FQDN"
An FQDN is a "fully qualified domain name". A fully qualified
domain name generally is a host name with at least one zone
name, for example "www.dhcp.org" is a fully qualified domain
name.
o "partner"
A "partner", for the purposes of this document, refers to a
failover server, typically the other failover server. In many
(if not most) cases, the failover protocol is symmetric with
respect to the primary or secondary nature of the servers, and
so it is often appropriate to discuss "updating the partner
server", since it could be a primary server updating a secondary
server or a secondary server updating a primary server.
o "RR"
"RR" is an abbreviation for "resource record". All records in
the DNS are resource records. The resource records of most
relevance to this document are the "A" resource record, which
maps a DNS name to a particular IP address, the "PTR" resource
record, which allows a "reverse map", from the IP address back
to a DNS name, and the "KEY" resource record, which is used in
ways defined in [DDNS] to tag a DNS name with the identity of
the DHCP client with which it is associated.
o "DDNS"
An abbreviation for "Dynamic DNS", which refers to the capabil-
ity to update a DNS server's name (actually resource record)
database using an on-the-wire protocol defined in [RFC2136].
o "binding-status"
The binding-status is the status of an IP address with respect
to its association with a client. There are specific binding-
status values defined for use by the failover protocol, e.g.,
ACTIVE, FREE, RELEASED, ABANDONED, etc. These are designed to
map more or less directly onto the binding-status values used
internally in most DHCP server implementations. The term
binding-status refers to the concept also sometimes known as
"lease state" or "IP address state", but in this document the
term "state" is reserved for the failover state of a failover
endpoint, and binding-status is always used to refer to the
state associated with an IP address or lease.
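As an illustration of the "MCLT" entry in the list above, the following minimal sketch shows how a server might cap the lease it offers a client. It is not taken from the draft: the function names, the use of seconds since an arbitrary epoch, and the choice to measure the lead time from "now" when the partner has acknowledged nothing (or only an expiry already in the past) are all assumptions made for the example. Section 5.2.1 remains the authoritative description.

```python
from typing import Optional


def max_client_expiry(now: float, acked_expiry: Optional[float], mclt: float) -> float:
    """Latest expiry a server may give a client (hypothetical helper).

    acked_expiry is the expiry time the partner has ACKed for this binding,
    or None if the partner has ACKed nothing.
    """
    # Assumption: with no usable ACKed expiry, lead time is measured from now.
    base = now if acked_expiry is None else max(now, acked_expiry)
    return base + mclt


def grant_expiry(now: float, desired_seconds: float,
                 acked_expiry: Optional[float], mclt: float) -> float:
    """Expiry actually offered: the desired lease, capped by the MCLT bound."""
    return min(now + desired_seconds, max_client_expiry(now, acked_expiry, mclt))
```

With an MCLT of 3600 seconds and no acknowledged expiry, a client that asks for a one-day lease at time 1000 would be offered an expiry of 4600 under these assumptions; a longer lease only becomes possible once the partner has ACKed a later expiry.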
3. Background and External Requirements 3. Background and External Requirements
This section highlights key aspects of the DHCP protocol on which the This section highlights key aspects of the DHCP protocol on which the
failover protocol depends. It also discusses the requirements that failover protocol depends. It also discusses the requirements that
the failover protocol places on other aspects of the network infras- the failover protocol places on other aspects of the network infras-
tructure, and some general issues surrounding server failure detec- tructure, and some general issues surrounding server failure detec-
tion. Some failure scenarios that provide particular challenges to a tion. Some failure scenarios that provide particular challenges to a
failover protocol are discussed. Finally, the challenges inherent in failover protocol are discussed. Finally, the challenges inherent in
using a TCP connection as a means to detect failure of a partner using a TCP connection as a means to detect failure of a partner
server are elaborated. server are elaborated.
skipping to change at page 8, line 33 skipping to change at page 9, line 40
to the IP address from which the RENEW or REBINDING originated. to the IP address from which the RENEW or REBINDING originated.
Given the existing responsibility placed on the client to only use an Given the existing responsibility placed on the client to only use an
IP address when the lease is valid, and to only send in a RENEW or IP address when the lease is valid, and to only send in a RENEW or
REBINDING if the lease is valid, the failover protocol relies on DHCP REBINDING if the lease is valid, the failover protocol relies on DHCP
clients to perform responsibly and will, in the absence of conflict- clients to perform responsibly and will, in the absence of conflict-
ing information, believe a DHCP client that is attempting to RENEW or ing information, believe a DHCP client that is attempting to RENEW or
REBIND a lease on an IP address is the legitimate owner of that IP REBIND a lease on an IP address is the legitimate owner of that IP
address. address.
If clients do not follow these rules, it is possible for an address
to be in use by more than one client. For a single server, this hap-
pens because the server has leased the expired address to another
client and the original client is also attempting to use the address.
The server would NAK the renewal request. This is made slightly worse
in the failover protocol if the two servers are unable to communicate
with each other and one server leases an available address to a new
client while the other server receives a renewal from a different
client. In this case, both servers lease the same address to dif-
ferent clients for the MCLT time.
One troublesome issue is that of the DHCP client responsibility when One troublesome issue is that of the DHCP client responsibility when
sending in DHCPREQUEST/INIT-REBOOT requests. While the original DHCP sending in DHCPREQUEST/INIT-REBOOT requests. While the original DHCP
RFC was written to require a DHCP client to have time left to run on RFC was written to require a DHCP client to have time left to run on
the lease for an IP address if the client is sending an INIT-REBOOT the lease for an IP address if the client is sending an INIT-REBOOT
request, it was sufficiently unclear that some client vendors didn't request, it was sufficiently unclear that some client vendors didn't
realize this until recently. Since the INIT-REBOOT request was sent realize this until recently. Since the INIT-REBOOT request was sent
with the IP address in the dhcp-requested-address option and not in with the IP address in the dhcp-requested-address option and not in
the ciaddr (for perfectly good reasons), the similarity to the RENEW the ciaddr (for perfectly good reasons), the similarity to the RENEW
and REBINDING case was lost on many people. and REBINDING case was lost on many people.
skipping to change at page 15, line 18 skipping to change at page 16, line 33
are able to communicate with DHCP clients, but unable to com- are able to communicate with DHCP clients, but unable to com-
municate with each other, a subset of the IP address pool must municate with each other, a subset of the IP address pool must
be set aside as a private address pool for the secondary be set aside as a private address pool for the secondary
server. The secondary can use these to service newly arrived server. The secondary can use these to service newly arrived
DHCP clients during such a period. The size of this private DHCP clients during such a period. The size of this private
pool SHOULD be based only on the arrival rate of new DHCP pool SHOULD be based only on the arrival rate of new DHCP
clients and the length of expected downtime, and is not influ- clients and the length of expected downtime, and is not influ-
enced in any way by the total number of DHCP clients supported enced in any way by the total number of DHCP clients supported
by the server pair. by the server pair.
The failover protocol can be used in a mode where both the
primary and secondary servers can share the load between them
when both are operating. In this loadbalancing mode, the
addresses allocated by the primary server to the secondary
server are not unused, but are used instead to service the
portion of the client base to which the secondary server
is required to respond. See section 5.3 for more information
on loadbalancing.
3. The primary and secondary servers do not respond to client 3. The primary and secondary servers do not respond to client
requests at all while recovering from a failure that could requests at all while recovering from a failure that could
have resulted in duplicate IP assignments. (When synchroniz- have resulted in duplicate IP assignments. (When synchroniz-
ing in POTENTIAL-CONFLICT state). ing in POTENTIAL-CONFLICT state).
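The sizing guidance in item 2 above (the private pool depends only on the arrival rate of new DHCP clients and the expected downtime) can be sketched as a simple calculation. The function name and the choice of hours as the time unit are assumptions made for this example; the draft does not prescribe a formula.

```python
import math


def private_pool_size(new_clients_per_hour: float, expected_downtime_hours: float) -> int:
    """Estimate how many addresses the secondary needs in its private pool.

    Hypothetical helper: the total number of clients served by the pair
    is deliberately not an input, matching the guidance in the text.
    """
    if new_clients_per_hour < 0 or expected_downtime_hours < 0:
        raise ValueError("rate and downtime must be non-negative")
    # Round up so a fractional client still gets an address.
    return math.ceil(new_clients_per_hour * expected_downtime_hours)
```

For example, a site that sees about 12 new clients per hour and plans for 4 hours of lost communication would set aside 48 addresses, whether the server pair serves a few hundred clients or many thousands.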
5. Protocol Overview 5. Protocol Overview
This section will discuss the failover protocol at a relatively high This section will discuss the failover protocol at a relatively high
level level of detail. In the event that a description in this sec- level of detail. In the event that a description in this section
tion conflicts (or appears to conflict due to the overview nature of conflicts (or appears to conflict due to the overview nature of this
this section) with information in later sections of this draft, the section) with information in later sections of this draft, the infor-
information in the later sections should be considered authoritative. mation in the later sections should be considered authoritative.
5.1. Messages and States 5.1. Messages and States
This protocol is centered around the message exchange used by one This protocol is centered around the message exchange used by one
server to update the other server of binding database changes result- server to update the other server of binding database changes result-
ing from DHCP client activity: ing from DHCP client activity:
o Communication of binding database changes o Communication of binding database changes
The binding update (BNDUPD) message is used to send the binding The binding update (BNDUPD) message is used to send the binding
database changes to the partner server, and the partner server database changes to the partner server, and the partner server
responds with a binding acknowledgement (BNDACK) message when it responds with a binding acknowledgement (BNDACK) message when it
has successfully committed those changes to its own stable has successfully committed those changes to its own stable
storage. storage.
All of the other messages are involve ancillary issues: All of the other messages involve ancillary issues:
o Management of available IP addresses o Management of available IP addresses
The pool request (POOLREQ) is used by the secondary server to The pool request (POOLREQ) is used by the secondary server to
request an allocation of IP addresses from the primary server. request an allocation of IP addresses from the primary server.
The pool response (POOLRESP) is used by the primary server to The pool response (POOLRESP) is used by the primary server to
inform the secondary server how many IP addresses it was allo- inform the secondary server how many IP addresses were allocated
cated as the result of a pool request. to the secondary server as the result of the pool request.
o Synchronization of the binding databases between the servers o Synchronization of the binding databases between the servers
after they've been out of communications after they've been out of communications
The update request (UPDREQ) message is used by one server to The update request (UPDREQ) message is used by one server to
request that its partner send it all binding database informa- request that its partner send it all binding database informa-
tion that it has not already seen. The update request all tion that it has not already seen. The update request all
(UPDREQALL) message is used by one server to request that all (UPDREQALL) message is used by one server to request that all
binding database information be sent in order to recover from a binding database information be sent in order to recover from a
total loss of its lease state database by the requesting server. total loss of its binding database by the requesting server.
The update done (UPDDONE) message is used by the responding The update done (UPDDONE) message is used by the responding
server to indicate that all requested updates have been sent by the server to indicate that all requested updates have been sent by the
responding server and acked by the requesting server. responding server and acked by the requesting server.
o Connection establishment o Connection establishment
The connect (CONNECT) message is used by either server to estab- The connect (CONNECT) message is used by the primary server to
lish a high level connection with the other server, and to establish a high level connection with the other server, and to
transmit several important configuration data items between the transmit several important configuration data items between the
servers. The connect acknowledgement message (CONNECTACK) is servers. The connect acknowledgement message (CONNECTACK) is
used to respond to a CONNECT message from another server. used by the secondary server to respond to a CONNECT message
from the primary server. The disconnect (DISCONNECT) message is
used by either server when closing a connection.
o Server synchronization o Server synchronization
The state change (STATE) message is used by either server to The state change (STATE) message is used by either server to
inform the other server of a change of failover state. inform the other server of a change of failover state.
o Connection integrity management o Connection integrity management
The contact (CONTACT) message is used by either server to ensure The contact (CONTACT) message is used by either server to ensure
that the other server continues to see the connection as opera- that the other server continues to see the connection as opera-
skipping to change at page 16, line 52 skipping to change at page 18, line 29
5.1.1. Failover endpoints 5.1.1. Failover endpoints
The proper operation of the failover protocol requires more than the The proper operation of the failover protocol requires more than the
transmission of messages between one server and the other. Each end- transmission of messages between one server and the other. Each end-
point might seem to be a single DHCP server, but in fact there are point might seem to be a single DHCP server, but in fact there are
many situations where additional flexibility in configuration is use- many situations where additional flexibility in configuration is use-
ful. ful.
For instance, there might be several servers which are each primary For instance, there might be several servers which are each primary
for a distinct set of address pools, and one server which is for a distinct set of address pools, and one server which is secon-
secondary for all of those address pools. The situation with the dary for all of those address pools. The situation with the pri-
primaries is straightforward, but the secondary will need to maintain maries is straightforward, but the secondary will need to maintain a
a separate failover state, partner state, and communications up/down separate failover state, partner state, and communications up/down
status for each of the separate primary servers for which it is act- status for each of the separate primary servers for which it is act-
ing as a secondary. ing as a secondary.
The failover protocol calls for there to be a unique failover end- The failover protocol calls for there to be a unique failover end-
point per partner per role (where role is primary or secondary). point per partner per role (where role is primary or secondary).
This failover endpoint can take actions and hold unique states. This failover endpoint can take actions and hold unique states.
There are thus a maximum of two failover endpoints per partner (one There are thus a maximum of two failover endpoints per partner (one
for the partner as a primary and one for that same partner as a for the partner as a primary and one for that same partner as a
secondary.) secondary.)
skipping to change at page 17, line 34 skipping to change at page 19, line 12
mary and secondary servers, not primary and secondary failover end- mary and secondary servers, not primary and secondary failover end-
points. However, it is important to remember that every 'server' points. However, it is important to remember that every 'server'
described in this document is in reality a failover endpoint that described in this document is in reality a failover endpoint that
resides in a particular process, and that many failover endpoints may resides in a particular process, and that many failover endpoints may
reside in the same process. reside in the same process.
It is not the case that there is a unique failover endpoint for each It is not the case that there is a unique failover endpoint for each
subnet that participates in a failover relationship. On one server, subnet that participates in a failover relationship. On one server,
there is one failover endpoint per partner per role, regardless of there is one failover endpoint per partner per role, regardless of
how many subnets or address pools are managed by that combination of how many subnets or address pools are managed by that combination of
partner and role. Conversely, any given subnet or pool will be asso- partner and role. Conversely, on a particular server, any given sub-
ciated with exactly one failover endpoint on a single server. net or pool will be associated with exactly one failover endpoint.
When a connection is received from the partner, the unique failover When a connection is received from the partner, the unique failover
endpoint to which the message is directed is determined solely by the endpoint to which the message is directed is determined solely by the
IP address of the partner and the setting of the SECONDARY bit in the IP address of the partner and the setting of the SECONDARY bit in the
'flags' field of the contact message. 'flags' field of the CONTACT message.
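As an illustration only, the endpoint selection described above can be sketched in Python as a lookup keyed on the partner's IP address and the value of the SECONDARY bit. The names FailoverEndpoint, register and find_endpoint are hypothetical, and the default state value is an assumption made for the sketch; none of them are defined by this document.

```python
from dataclasses import dataclass, field


@dataclass
class FailoverEndpoint:
    """One endpoint per partner per role (hypothetical structure)."""
    partner_ip: str
    secondary_bit: bool          # SECONDARY bit value that selects this endpoint
    state: str = "STARTUP"       # assumption: initial failover state
    pools: list = field(default_factory=list)


# At most two entries exist per partner: one for each SECONDARY bit value.
endpoints = {}


def register(endpoint):
    """Record an endpoint under its (partner IP, SECONDARY bit) key."""
    endpoints[(endpoint.partner_ip, endpoint.secondary_bit)] = endpoint


def find_endpoint(partner_ip, secondary_bit):
    """Return the endpoint for an incoming connection, or None if unknown."""
    return endpoints.get((partner_ip, secondary_bit))
```

Note that the key does not include a subnet or address pool, which mirrors the statement that there is one endpoint per partner per role regardless of how many pools it manages.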
Throughout this document, the states and actions taken by "servers" Throughout this document, the states and actions taken by "servers"
are described. The terms "server", "primary server", and "secondary are described. The terms "server", "primary server", and "secondary
server" are commonly used to describe the failover endpoint taking server" are commonly used to describe the failover endpoint taking
these states and performing these actions. This description is these states and performing these actions. This description is
wholly accurate only for the simplest of cases, where all of the wholly accurate only for the simplest of cases, where all of the
address pools on one server are backed up by all of the address pools address pools on one server are backed up by all of the address pools
on another server. In this case, there is a single failover endpoint on another server. In this case, there is a single failover endpoint
in each server. In all other cases, the term "server" is used to in each server. In all other cases, the term "server" is used to
describe one of the two possible failover endpoints per partner. describe one of the two possible failover endpoints per partner.
5.2. Fundamental restrictions 5.2. Fundamental restrictions
There are several fundamental restrictions this protocol places on what There are several fundamental restrictions this protocol places on what
one server an do in the absence of knowledge of the other server, and one server can do in the absence of knowledge of the other server,
these restrictions are key to the correct operation of the protocol. and these restrictions are key to the correct operation of the proto-
col.
5.2.1. Control of lease time 5.2.1. Control of lease time
The key problem with lazy update is that when a server fails The key problem with lazy update is that when a server fails
after updating a client with a particular lease time and before after updating a client with a particular lease time and before
updating its partner, the partner will believe that a lease has updating its partner, the partner will believe that a lease has
expired even though the client still retains a valid lease on that IP expired even though the client still retains a valid lease on that IP
address. address.
In order to handle this problem, a period of time known as the "Max- In order to handle this problem, a period of time known as the "Max-
skipping to change at page 18, line 37 skipping to change at page 20, line 15
the configured lease time to a client. During a lazy update the the configured lease time to a client. During a lazy update the
updating server typically updates its partner with a potential updating server typically updates its partner with a potential
expiration time which is longer than the lease time previously given expiration time which is longer than the lease time previously given
to the client and which is longer than the lease time that the server to the client and which is longer than the lease time that the server
has been configured to give a client. This allows that server to has been configured to give a client. This allows that server to
give a longer lease time to the client the next time the client give a longer lease time to the client the next time the client
renews its lease, since the time that it will give to the client will renews its lease, since the time that it will give to the client will
not exceed the MCLT beyond the potential expiration time acknowledged not exceed the MCLT beyond the potential expiration time acknowledged
by the partner. by the partner.
When moving to the PARTNER-DOWN state (where a server is allowed to The PARTNER-DOWN state exists so that a server can be sure that its
reallocate the partner's IP addresses), a server will wait the Max- partner is, indeed, down. Correct operation while in that state
imum Client Lead Time before allocating any IP addresses from its requires (generally) that the server wait the MCLT after anything
partner's pool to any new DHCP clients. Thus, any clients which have that happened prior to its transition into PARTNER-DOWN state (or,
a lease on an IP address with a lease time greater than that known by more accurately, when the other server went down if that is known).
the server moving into PARTNER-DOWN state will either have contacted Thus, the server MUST wait the Maximum Client Lead Time after the
that server during the MCLT period or their leases will have expired. partner server went down before allocating any of the partner's FREE
addresses. In the event the partner was not in communication prior
to going down, it might have allocated one or more of its FREE
addresses to a DHCP client and been unable to inform the server
entering PARTNER-DOWN prior to going down itself. By waiting the
MCLT after the time the partner went down, the server in PARTNER-DOWN
state ensures that any clients which have a lease on one of the
partner's FREE addresses will either time out or contact the server
in PARTNER-DOWN by the time that period ends.
When a server has transitioned to PARTNER-DOWN state, it MUST NOT In addition, once a server has transitioned to PARTNER-DOWN state, it
reallocate an IP address from one client to another client until an MUST NOT reallocate an IP address from one client to another client
additional maximum client lead time interval after the lease by the until an additional MCLT interval after the lease by the original
original client expires. (Actually, until the maximum client lead client expires. (Actually, until the maximum client lead time after
time after what it believes to be the lease expiration time of the what it believes to be the lease expiration time of the first
first client.) client.)
Some optimizations exist for this restriction, in that it only Some optimizations exist for this restriction, in that it only
applies to leases that were issued BEFORE entering PARTNER-DOWN. Once applies to leases that were issued BEFORE entering PARTNER-DOWN. Once
a server has entered PARTNER-DOWN and it leases out an address, it a server has entered PARTNER-DOWN and it leases out an address, it
need not wait this time as long as it has never communicated with the need not wait this time as long as it has never communicated with the
partner since the lease was given out. partner since the lease was given out.
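The two waiting rules for PARTNER-DOWN state can be sketched as follows. This is a minimal Python sketch under stated assumptions: all times are seconds on a single local clock, and the function and parameter names are hypothetical rather than taken from this document.

```python
def earliest_partner_free_allocation(partner_down_time, mclt):
    """Earliest time the partner's FREE addresses may be allocated.

    The server waits one MCLT after the partner went down (or, if that
    time is unknown, after its own transition into PARTNER-DOWN).
    """
    return partner_down_time + mclt


def may_reallocate(now, believed_lease_expiry, mclt,
                   leased_in_partner_down=False):
    """Decide whether an address may be given to a different client.

    believed_lease_expiry is what this server believes to be the lease
    expiration time of the first client.  leased_in_partner_down is True
    only if this server issued the lease after entering PARTNER-DOWN and
    has not communicated with the partner since; in that case the extra
    MCLT wait is not needed.
    """
    if leased_in_partner_down:
        return now >= believed_lease_expiry
    return now >= believed_lease_expiry + mclt
```

With an MCLT of 3600 seconds and a lease believed to expire at time 10000, the address becomes available to another client at time 13600 unless the optimization applies.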
The fundamental relationship on which much of the correctness of this The fundamental relationship on which much of the correctness of this
protocol depends is that the lease expiration time known to a DHCP protocol depends is that the lease expiration time known to a DHCP
client MUST NOT be more than the maximum client lead time greater client MUST NOT be more than the maximum client lead time greater
skipping to change at page 19, line 48 skipping to change at page 21, line 36
lease interval (as explained below). lease interval (as explained below).
o potential lease interval o potential lease interval
The potential lease interval is the lease expiration interval The potential lease interval is the lease expiration interval
the local server tells to its partner in the potential- the local server tells to its partner in the potential-
expiration-time option of a BNDUPD message. expiration-time option of a BNDUPD message.
o acknowledged potential lease interval o acknowledged potential lease interval
The acknowledged potential lease interval is the potential least The acknowledged potential lease interval is the potential lease
interval the partner server has most recently acknowledged in interval the partner server has most recently acknowledged in
the potential-expiration-time option of a BNDACK message. the potential-expiration-time option of a BNDACK message.
The key restriction (and guarantee) that any server makes with The key restriction (and guarantee) that any server makes with
respect to lease intervals is that the actual client lease interval respect to lease intervals is that the actual client lease interval
never exceeds the acknowledged potential lease interval (if any) by never exceeds the acknowledged potential lease interval (if any) by
more than a fixed amount. This fixed amount is called the "Maximum more than a fixed amount. This fixed amount is called the "Maximum
Client Lead Time" (MCLT). Client Lead Time" (MCLT).
The MCLT MAY be configurable on the primary server, but for correct The MCLT MAY be configurable on the primary server, but for correct
server operation it MUST be the same and known to both the primary server operation it MUST be the same and known to both the primary
and secondary servers. The secondary server determines the MCLT from and secondary servers. The secondary server determines the MCLT from
the MCLT option sent from the primary server to the secondary server the MCLT option sent from the primary server to the secondary server
in the CONNECT or CONNECTACK message. in the CONNECT message.
A server MUST record in its stable storage both the actual lease A server MUST record in its stable storage both the actual lease
interval and the most recently acknowledged potential lease interval interval and the most recently acknowledged potential lease interval
for each IP address binding. It is assumed that the desired client for each IP address binding. It is assumed that the desired client
lease interval can be determined through techniques outside of the lease interval can be determined through techniques outside of the
scope of this protocol. scope of this protocol. See section 7.1.4 for more details concern-
ing the times that the server MUST record in its stable storage and
the way that they interact with the lease time that may be offered to
a DHCP client.
Again, the fundamental relationship among these times which MUST be Again, the fundamental relationship among these times which MUST be
maintained is: maintained is:
actual lease interval < actual lease interval <
( acknowledged potential lease interval + MCLT ) ( acknowledged potential lease interval + MCLT )
Figure 5.1-1 illustrates an initial lease to a client using the rules Figure 5.1-1 illustrates an initial lease to a client using the rules
discussed in the example which follows it. discussed in the example which follows it.
skipping to change at page 22, line 45 skipping to change at page 24, line 45
plus the MCLT, the offer made to the client is for the remainder plus the MCLT, the offer made to the client is for the remainder
of the current acknowledged potential lease interval (i.e., zero) of the current acknowledged potential lease interval (i.e., zero)
plus the MCLT. Thus, the actual lease interval is 1 hour. plus the MCLT. Thus, the actual lease interval is 1 hour.
Once the server has performed the ACK to the DHCP client, it will Once the server has performed the ACK to the DHCP client, it will
update the secondary server with the lease information. However, update the secondary server with the lease information. However,
the desired potential lease interval will be composed of the one the desired potential lease interval will be composed of the one
half of the current actual lease interval added to the desired half of the current actual lease interval added to the desired
lease interval. Thus, the secondary server is updated with a lease interval. Thus, the secondary server is updated with a
BNDUPD with a lease interval of 3 days + 1/2 hour specified in the BNDUPD with a lease interval of 3 days + 1/2 hour specified in the
IP Address Lease Time Option (Option 51). potential-expiration-time option.
When the primary server receives an ACK to its update of the When the primary server receives an ACK to its update of the
secondary server's (partner's) potential lease interval, it secondary server's (partner's) potential lease interval, it
records that as the acknowledged potential lease interval. A records that as the acknowledged potential lease interval. A
server MUST NOT send a BNDACK in response to a BNDUPD message server MUST NOT send a BNDACK in response to a BNDUPD message
until it is sure that the information in the BNDUPD message until it is sure that the information in the BNDUPD message
resides in its stable storage. Thus, the primary server in this resides in its stable storage. Thus, the primary server in this
case can be sure that the secondary server has recorded the poten- case can be sure that the secondary server has recorded the poten-
tial lease interval in its stable storage when the primary server tial lease interval in its stable storage when the primary server
receives a BNDACK message from the secondary server. receives a BNDACK message from the secondary server.
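The arithmetic of this example can be sketched in Python. The sketch assumes an MCLT of 1 hour and a desired lease interval of 3 days, as implied by the figures quoted above; the function names are hypothetical and intervals are expressed in seconds.

```python
HOUR = 3600
DAY = 24 * HOUR


def actual_lease_interval(desired, acked_potential_remaining, mclt):
    """Interval offered to the client.

    It is capped at the remainder of the acknowledged potential lease
    interval plus the MCLT, following the worked example in the text.
    """
    return min(desired, acked_potential_remaining + mclt)


def potential_lease_interval(desired, actual):
    """Interval sent to the partner in the potential-expiration-time option.

    Half of the current actual lease interval is added to the desired
    lease interval, as described in the example.
    """
    return desired + actual // 2


mclt = 1 * HOUR
desired = 3 * DAY

# First contact: nothing has been acknowledged yet, so the remainder is 0.
actual = actual_lease_interval(desired, 0, mclt)
potential = potential_lease_interval(desired, actual)
```

Here actual evaluates to 1 hour and potential to 3 days plus 1/2 hour, matching the values in the example.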
skipping to change at page 24, line 38 skipping to change at page 26, line 38
receipt of a DHCP client request whether it is to service this receipt of a DHCP client request whether it is to service this
request or to ignore it in order to allow the other server to service request or to ignore it in order to allow the other server to service
the request. the request.
In addition, it should be possible to configure the percentage of In addition, it should be possible to configure the percentage of
clients which will be serviced by either the primary or secondary clients which will be serviced by either the primary or secondary
server. This configuration should be more or less continuous, from server. This configuration should be more or less continuous, from
all serviced by the primary through an even split with half serviced all serviced by the primary through an even split with half serviced
by each, to all serviced by the secondary. by each, to all serviced by the secondary.
The technique chosen to support these goals is to define a hash func- The technique chosen to support these goals is described in [LOADB].
tion which must be applied to the client-identifier or to the htype When using the load balancing algorithm in [LOADB] among two servers
concatenated with the chaddr if no client-identifier is specified. implementing the failover protocol, both servers MUST use the same
The results of this hash function yields a number between 0 and 255 information from the DHCP client packet as the Request ID for the
which maps into one of 256 "hash-buckets". Each hash bucket is load balancing algorithm. Both servers MUST use the dhcp-client-
assigned to one server or the other by the primary server whenever a identifier (if it appears), and the client-hardware-address if the
connection is established, through use of the hash-bucket-assignment dhcp-client-identifier does not. The client-hardware-address is con-
option. structed from the htype and chaddr fields of the DHCP client request
in the same manner as described for creation of the client-hardware-
The hash-bucket-assignment option uses a 32 octet value field (con- address option in section 6.2.
taining 256 bits), with one bit associated with each possible hash
bucket. If the bit corresponding to a hash bucket is a 1 in the
hash-bucket-assignment option, then the secondary server is required
to service all DHCP client requests that map into that hash bucket
when in NORMAL state.
For example, if the primary server sends a hash-bucket-assignment
option to the secondary with the following 32 octets:
buckets
FF FF FF FF FF FF FF FF ( 0 - 63 )
FF FF FF FF FF FF FF FF ( 64 - 127 )
00 00 00 00 00 00 00 00 ( 128 - 191 )
00 00 00 00 00 00 00 00 ( 192 - 255 )
then the secondary MUST service any DHCP client requests where the A bitmap-style Hash Bucket Assignment (as described in section 5.2 of
client-identifier or htype concatenated with the chaddr hashs into [LOADB]) is sent by the primary server to the secondary server when-
the bucket values of 0 through 127. ever a connection is established, using the hash-bucket-assignment
option defined in section 6.2. This Hash Bucket Assignment is used
by the secondary server to decide which packets to process when in
NORMAL state.
See section 12 for the code to implement the hash bucket algorithm. The way in which either primary or secondary servers determine the
Each server MUST implement this same algorithm in order for all hash bucket assignment for it to use when in other than NORMAL state
clients to get service. is outside of the scope of this document. Note, however, that the
primary and secondary servers MUST use identical hash bucket assign-
ments when not in NORMAL state. This common hash bucket assignment
MAY be for all of the hash buckets, indicating that there is no other
DHCP server sharing the load with this failover pair, or it MAY be
for a subset of the hash buckets, which would indicate that there
exists another server or server pair with which this DHCP server pair
is sharing the load.
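A bitmap-style assignment carries 256 bits, one per hash bucket, where a 1 bit means the secondary services that bucket in NORMAL state. The Python sketch below shows one way to test a bucket against such a bitmap. The function name is hypothetical, the hash that maps a client to a bucket is not shown, and the bit order within each octet (most significant bit first) is an assumption; consult [LOADB] for the authoritative encoding.

```python
def secondary_serves(hba: bytes, bucket: int) -> bool:
    """Return True if the bucket's bit is set in the 32-octet assignment."""
    if len(hba) != 32:
        raise ValueError("hash bucket assignment must be 32 octets")
    if not 0 <= bucket <= 255:
        raise ValueError("bucket must be in the range 0..255")
    octet = hba[bucket // 8]
    bit = 7 - (bucket % 8)       # assumption: MSB-first within an octet
    return bool((octet >> bit) & 1)


# Sixteen 0xFF octets followed by sixteen 0x00 octets:
# the secondary services buckets 0 through 127.
example_hba = b"\xff" * 16 + b"\x00" * 16
```

For this example bitmap the result does not depend on the assumed bit order, because every octet is either all ones or all zeros.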
5.4. Operating in NORMAL state 5.4. Operating in NORMAL state
When in NORMAL state, each server services DHCPDISCOVER's and all When in NORMAL state, each server services DHCPDISCOVER's and all
other DHCP requests other than DHCPREQUEST/RENEWAL or other DHCP requests other than DHCPREQUEST/RENEWAL or
DHCPREQUEST/REBINDING from the client set defined by the load balanc- DHCPREQUEST/REBINDING from the client set defined by the load balanc-
ing algorithm. Each server services DHCPREQUEST/RENEWAL or ing algorithm. Each server services DHCPREQUEST/RENEWAL or
DHCPREQUEST/REBINDING requests from any client. DHCPREQUEST/REBINDING requests from any client.
In general, whenever the binding database is changed in stable In general, whenever the binding database is changed in stable
skipping to change at page 25, line 45 skipping to change at page 27, line 41
storage and replies with a BNDACK message. storage and replies with a BNDACK message.
5.5. Operating in COMMUNICATIONS-INTERRUPTED state 5.5. Operating in COMMUNICATIONS-INTERRUPTED state
When operating in COMMUNICATIONS-INTERRUPTED state, each server is When operating in COMMUNICATIONS-INTERRUPTED state, each server is
operating independently, but does not assume that its partner is not operating independently, but does not assume that its partner is not
operating. The partner server might be operating and simply unable operating. The partner server might be operating and simply unable
to communicate with this server, or might not be operating. to communicate with this server, or might not be operating.
Each server responds to the full range of DHCP client messages that Each server responds to the full range of DHCP client messages that
it receives, but in such a way that graceful reintegration is alway it receives, but in such a way that graceful reintegration is always
possible when its partner comes back into contact with it. possible when its partner comes back into contact with it.
5.6. Operating in PARTNER-DOWN state 5.6. Operating in PARTNER-DOWN state
When operating in PARTNER-DOWN state, a server assumes that its When operating in PARTNER-DOWN state, a server assumes that its
partner is not currently operating, but does make allowances for the partner is not currently operating, but does make allowances for the
possibility that that server was operating in the past. It responds possibility that that server was operating in the past, though possi-
to all DHCP client requests in PARTNER-DOWN state. bly out of communications with this server. It responds to all DHCP
client requests in PARTNER-DOWN state.
Any transactions that the partner server may have had with DHCP
clients but been unable to communicate to this server are allowed for
in the algorithms that are used to gradually take over full control
of all of the addresses configured into the server.
5.7. Operating in RECOVER state 5.7. Operating in RECOVER state
A server operating in RECOVER state assumes that it is reintegrating A server operating in RECOVER state assumes that it is reintegrating
with a server that has been operating in PARTNER-DOWN state, and that with a server that has been operating in PARTNER-DOWN state, and that
it needs to update its bindings database before it services DHCP it needs to update its bindings database before it services DHCP
client requests. client requests.
A server may also operate in RECOVER state in order to fully recover A server may also operate in RECOVER state in order to fully recover
its bindings database from its partner server. its bindings database from its partner server.
5.8. Operating in STARTUP state
A server operating in STARTUP state assumes that failover is opera-
tional, and it spends a short time whenever it comes up attempting to
contact the partner. During this time (generally a few seconds), the
server is unresponsive to DHCP client requests. This period exists
in order to give a server a chance to determine that its partner has
changed state since it was last in communications, and to react to
that changed state (if any) prior to responding to DHCP client
requests.
The period of time a server remains in STARTUP state SHOULD be long
enough to ensure that it will connect to the other server if that
server is available for connections.
5.9. Time synchronization between servers
The failover protocol is designed to operate between two servers
which have time values which differ by an arbitrarily large amount.
A particular implementation MAY choose to only support servers whose
time values differ by an arbitrarily small amount.
In any event, whether large or only small differences in time values
are supported, every message that is received MUST be tagged with a
time value as soon as possible after receipt. This time value is
used along with the time value that is sent in every message between
the failover partners to develop a delta time between the servers.
This delta time is used during the connection process to establish a
baseline delta time between the servers, and upon receipt of each
message, the delta time for that message is used to refine the delta
time for the server pair.
While the algorithm for this refinement of delta time is not speci-
fied as part of this protocol, a server SHOULD allow the delta time
value for a pair of failover servers to be periodically updated to
account for time drift. In addition, the delta time value between
servers SHOULD be smoothed in some fashion, so that transient network
delays will not cause it to vary wildly.
A server SHOULD recognize a drastic change in the delta time value as
an event to be signaled to a network administrator.
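Since the refinement algorithm is left unspecified, the following Python sketch shows only one possible approach: an exponentially weighted moving average of the per-message delta, with a simple threshold for flagging drastic changes. The class name, the smoothing factor and the alarm threshold are all assumptions chosen for illustration.

```python
class DeltaTracker:
    """Smoothed estimate of (local clock - partner clock), in seconds."""

    def __init__(self, alpha=0.125, alarm_threshold=60.0):
        self.alpha = alpha                    # weight given to each new sample
        self.alarm_threshold = alarm_threshold
        self.delta = None                     # no baseline until first message

    def update(self, local_receive_time, partner_send_time):
        """Fold one message into the estimate.

        Returns True if the sample differs from the current estimate by
        more than the alarm threshold, which a server could report to a
        network administrator.
        """
        sample = local_receive_time - partner_send_time
        if self.delta is None:
            self.delta = sample               # baseline from the first message
            return False
        drastic = abs(sample - self.delta) > self.alarm_threshold
        self.delta += self.alpha * (sample - self.delta)
        return drastic
```

A small smoothing factor keeps transient network delays from moving the estimate far, while repeated samples still let it follow slow clock drift.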
5.10. IP address binding-status
In most DHCP servers an IP address can take on several different
binding-status values, sometimes also called states. While probably
no two DHCP servers have exactly the same possible binding-status
values, the DHCP RFC enforces some commonality among the general
semantics of the binding-status values used by various DHCP server
implementations.
In order to transmit binding database updates between one server and
another using the failover protocol, some common denominator
binding-status values must be defined. It is not expected that these
binding-status values correspond with any actual implementation of
the DHCP protocol in a DHCP server, but rather that the binding-
status values defined in this document should be a superset of most
if not all DHCP server implementations. It is a goal of this proto-
col that any DHCP server can map the various IP address binding-
status values that it uses internally into these failover IP address
binding-status values on transmission of binding database updates to
its partner, and likewise that it can map any failover IP address
binding-status values into its internal IP address binding-status
values upon receipt of a binding database update.
The IP address binding-status values defined for the failover proto-
col are:
o FREE
Lease may be allocated to any DHCP client.
o ACTIVE
Lease is assigned to a client. It MUST have client information
associated with it.
o EXPIRED
Lease has expired. It may be allocated to the same client.
o RELEASED
Lease has been released by client. It may be allocated to the
same client.
o ABANDONED
A server or client flagged the address as unusable.
o RESET
Lease was freed by some external agent.
o BACKUP
Lease belongs to secondary's private address pool.
These binding-status values are communicated from one failover
partner to another using the binding-status option; see section 6.2
for details of this option. Unless otherwise noted above, there MAY
be client information associated with each of these binding-status
values.
Again, note that a DHCP server implementing the failover protocol
does not have to either implement this state machine or use these
particular binding-status values in its normal operation of
allocating IP addresses to DHCP clients. It only needs to map its
internal binding-status values onto these "standard" binding-status
values, and map these "standard" binding-status values back into its
internal binding-status values. In particular, a server which
implements a grace period for an IP address binding SHOULD simply
wait to update its partner server until the grace period on that
binding has run out.
The process of setting an IP address to FREE deserves some detailed
discussion. When an IP address is moved to the EXPIRED, RELEASED, or
RESET binding-status on a server, that server will send a BNDUPD with
the binding-status of EXPIRED, RELEASED, or RESET to its partner. If
its partner agrees that this is acceptable (see sections 7.1.2 and
7.13 concerning why a server might not accept a BNDUPD) it will return a
BNDACK with no reject-reason, signifying that it accepted the update.
As part of the BNDUPD processing, the server returning the BNDACK
will set the binding-status of the IP address to FREE, and upon
receipt of the BNDACK the server which sent the BNDUPD will set the
binding-status of the IP address to FREE. Thus, the EXPIRED,
RELEASED, or RESET binding-status is something of a transitory state.
This process is encoded in the transition diagram below by "Comm
w/Partner".
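The freeing step described above can be sketched in Python. The enumeration uses the binding-status names from this section with string values chosen only for the sketch (they are not the option's wire encoding), and the function name is hypothetical.

```python
from enum import Enum


class BindingStatus(Enum):
    FREE = "FREE"
    ACTIVE = "ACTIVE"
    EXPIRED = "EXPIRED"
    RELEASED = "RELEASED"
    ABANDONED = "ABANDONED"
    RESET = "RESET"
    BACKUP = "BACKUP"


# Statuses that become FREE once the partner has accepted the update.
TRANSITORY = {BindingStatus.EXPIRED, BindingStatus.RELEASED,
              BindingStatus.RESET}


def status_after_accepted_update(status):
    """Status recorded by either server once the BNDUPD is accepted.

    The receiver applies this while processing the BNDUPD; the sender
    applies it on receipt of a BNDACK that carries no reject-reason.
    """
    return BindingStatus.FREE if status in TRANSITORY else status
```

The same function serves both servers because each ends up recording FREE for the address, only at different points in the exchange.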
An IP address will move between these lease binding-status values
using the following state transition diagram:
                             DHCP client DECLINE or
                             server detected problem
                                 from any state
                     +----------+       V   +---------+
      External >---->|  RESET   |       |   |ABANDONED|
      command        |          |       +-->|         |
                     +----------+           +---------+
                          |
                    Comm w/Partner
                          V
+---------+  Comm    +----------+   Comm    +---------+
| EXPIRED |--------->|   FREE   |<----------| RELEASED|
|         | w/Partner|          | w/Partner |         |
+---------+          +----------+           +---------+
  ^     ^              |      |                   ^
  | Exp. grace  IP address IP addr alloc.         |
  | period ends leased by  to secondary           |
  |     |       primary       V                   |
  |     |              |  +----------+            |
  |     |              |  |  BACKUP  |            |
  | wait for           |  |          |            |
  | grace period       |  +----------+            |
  |     |              |      |                   |
  |     |              |  IP addr leased by       |
  | Expired grace      |  secondary               |
  | period exists      V      V                   |
  |     |            +----------+                 |
  |     | Lease on   |  ACTIVE  |  DHCPRELEASE    |
  +-----+-IP addr----|          |-----------------+
          expires    +----------+
Figure 5.10-1: Transitions between binding-status values.
If a server receives a binding-status that it doesn't implement
internally, it should do something reasonable. A server which doesn't
support an ABANDONED binding-status could set the IP address ACTIVE
and belonging to a client which will never be seen in a DHCP request.
5.10.1. IP address binding-status changes from BNDUPD messages
IP addresses undergo binding status changes for several reasons,
including receipt and processing of DHCP client requests, administra-
tive inputs and receipt of BNDUPD messages. Every DHCP server needs
to respond to DHCP client requests and administrative inputs with
changes to its internal record of the binding-status of an IP
address, and this response is not in the scope of the failover proto-
col. However, the receipt of BNDUPD messages implies at least a pos-
sible change of the binding-status for an IP address, and must be
discussed here. See section 7.1.2 for general actions to take upon
receipt of a BNDUPD message.
When receiving a BNDUPD message, it is important to note that it may
not be current, in that the server receiving the BNDUPD message may
have had a more recent interaction with the DHCP client than its
partner who sent the BNDUPD message. In this case, the receiving
server MUST reject the BNDUPD message. In addition, it is worth not-
ing that two (and possibly three) binding-status values are the
direct result of interaction with a DHCP client, ACTIVE and RELEASED
(and possibly ABANDONED). All other binding-status values are either
the result of the expiration of a time period or interaction with an
external agency (e.g., a network administrator).
Every BNDUPD message SHOULD contain a client-last-transaction-time
option, which MUST, if it appears, be the time that the server last
interacted with the DHCP client. It MUST NOT be, for instance, the
time that the lease on an IP address expired. If there has been no
interaction with the DHCP client in question (or there is no DHCP
client presently associated with this IP address), then there will be
no client-last-transaction-time option in the BNDUPD message.
The following list is indexed by the binding-status that a server
receives in a BNDUPD message. In many cases, the binding-status of
an IP address within the receiving server's data storage will have an
effect upon the checks performed prior to accepting the new binding-
status in a BNDUPD message.
In the following list, to "accept" a BNDUPD means to update the
server's bindings database with the information contained in the
BNDUPD and once that update is complete, send a BNDACK message
corresponding to the BNDUPD message. To "reject" a BNDUPD means to
respond to the BNDUPD with a BNDACK with a reject-reason option
included.
When interpreting the rules in the following list, if a BNDUPD
doesn't have a client-last-transaction-time value, then it MUST NOT
be considered later than the client-last-transaction-time in the
receiving server's binding. If the BNDUPD contains a client-last-
transaction-time value and the receiving server's binding does not,
then the client-last-transaction-time value in the BNDUPD MUST be
considered later than the server's.
The second rule concerns clients and IP addresses. If the client in
a BNDUPD message and the client in the receiving server's binding
both exist and differ, and the binding-status is ACTIVE both in the
receiving server's binding and in the BNDUPD, then accept the BNDUPD
if the receiving server is a secondary server; else reject it.
Otherwise, look up the binding-status in the BNDUPD in this list:
o ACTIVE in BNDUPD
If the receiving server's binding-status is ACTIVE, FREE, or
BACKUP, then accept it.
If the receiving server's binding-status is ABANDONED or RESET,
then reject it.
If the receiving server's binding-status is RELEASED or EXPIRED,
then if the client-last-transaction-time in the BNDUPD is later
than the client-last-transaction-time in the receiving server's
binding, accept it, else reject it.
o EXPIRED in BNDUPD
If the receiving server's binding-status is ACTIVE, then if the
current time is later than the receiving server's
lease-expiration-time, accept it, else reject it.
If the receiving server's binding-status is ABANDONED or RESET,
reject it.
If the receiving server's binding-status is FREE or BACKUP,
accept it.
If the receiving server's binding-status is RELEASED, then if
the client-last-transaction-time is greater in the BNDUPD than
in the receiving server's binding, then accept it, else reject
it.
o RELEASED in BNDUPD
If the receiving server's binding-status is ACTIVE, then if the
client-last-transaction-time in the BNDUPD is later than the
client-last-transaction-time in the receiving server's binding,
accept it, else reject it.
If the receiving server's binding-status is RELEASED, FREE or
BACKUP, accept it.
If the receiving server's binding-status is ABANDONED or RESET,
reject it.
o FREE or BACKUP in BNDUPD
If the receiving server's binding-status is ACTIVE and the
current time is later than the lease-expiration-time accept it,
else reject it.
If the receiving server's binding-status is ABANDONED, reject
it.
If the receiving server's binding-status is FREE or BACKUP or
RESET, accept it.
o RESET or ABANDONED in BNDUPD
Accept the new binding-status under all circumstances.
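
The rules above can be read as a decision table. The following is a
minimal sketch in Python, not a normative implementation. The names
Status, update_is_later and decide are hypothetical, client identities
are modelled as plain values, and times are plain integers. The
numeric status values are the ones listed for the binding-status
option in section 6.2.1. Combinations that the list does not mention
(for example, EXPIRED received for a binding that is already EXPIRED)
return None here instead of guessing a result.

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    FREE = 1
    ACTIVE = 2
    EXPIRED = 3
    RELEASED = 4
    ABANDONED = 5
    RESET = 6
    BACKUP = 7


def update_is_later(upd_cltt: Optional[int], own_cltt: Optional[int]) -> bool:
    """Compare client-last-transaction-time values, handling missing ones."""
    if upd_cltt is None:
        return False  # a BNDUPD without the option is never "later"
    if own_cltt is None:
        return True   # a value in the BNDUPD beats a missing local value
    return upd_cltt > own_cltt


def decide(upd, own, *, upd_cltt=None, own_cltt=None,
           upd_client=None, own_client=None,
           receiver_is_secondary=False,
           now=0, own_lease_expiry=0) -> Optional[bool]:
    """Return True (accept), False (reject), or None (no rule in the list)."""
    S = Status
    # Client rule: different clients, ACTIVE on both sides.
    if (upd_client is not None and own_client is not None
            and upd_client != own_client
            and upd is S.ACTIVE and own is S.ACTIVE):
        return receiver_is_secondary
    if upd in (S.RESET, S.ABANDONED):
        return True  # accepted under all circumstances
    later = update_is_later(upd_cltt, own_cltt)
    lease_over = now > own_lease_expiry
    if upd is S.ACTIVE:
        table = {S.ACTIVE: True, S.FREE: True, S.BACKUP: True,
                 S.ABANDONED: False, S.RESET: False,
                 S.RELEASED: later, S.EXPIRED: later}
    elif upd is S.EXPIRED:
        table = {S.ACTIVE: lease_over, S.ABANDONED: False, S.RESET: False,
                 S.FREE: True, S.BACKUP: True, S.RELEASED: later}
    elif upd is S.RELEASED:
        table = {S.ACTIVE: later, S.RELEASED: True, S.FREE: True,
                 S.BACKUP: True, S.ABANDONED: False, S.RESET: False}
    else:  # FREE or BACKUP in the BNDUPD
        table = {S.ACTIVE: lease_over, S.ABANDONED: False,
                 S.FREE: True, S.BACKUP: True, S.RESET: True}
    return table.get(own)  # None when the list gives no rule
```

A caller would send a plain BNDACK when decide() returns True and a
BNDACK carrying a reject-reason option when it returns False. How to
treat None is left to the implementation.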
5.11. DNS dynamic update considerations
DHCP servers (and clients) can use DNS Dynamic Updates as described
in [RFC2136] to maintain DNS name-mappings as they maintain DHCP
leases. Many different administrative models for DHCP-DNS integra-
tion are possible. Descriptions of several of these models, and
guidelines that DHCP servers and clients should follow in carrying
them out, are laid out in [DDNS]. The nature of the DHCP failover
protocol introduces some issues concerning dynamic DNS updates that
are not part of non-failover DHCP environments. This section
describes these issues, and defines the information which failover
partners should exchange and the protocol which they should follow in
order to ensure consistent behavior. The presence of this section
should not be interpreted as requiring that implementations of the
DHCP failover protocol must also support DDNS updates. The purpose
of this discussion is to clarify the areas where the DHCP failover
and DHCP-DDNS protocols intersect for the benefit of implementations
which support both protocols, not to introduce a new requirement into
the DHCP failover protocol. Thus, a DHCP server which implements the
failover protocol MAY also support dynamic DNS updates, but if it
does support dynamic DNS updates it SHOULD utilize the techniques
described here in order to correctly distribute them between the
failover partners.
5.11.1. Relationship between failover and dynamic DNS update
The failover protocol describes the conditions under which each fail-
over server may renew a lease to its current DHCP client, and
describes the conditions under which it may grant a lease to a new
DHCP client. An analogous set of conditions determines when a fail-
over server should initiate a DDNS update, and when it should attempt
to remove records from the DNS. The failover protocol's conditions
are based on the desired external behavior: avoiding duplicate
address assignments; allowing clients to continue using leases which
they obtained from one failover partner even if they can only commun-
icate with the other partner; allowing the backup DHCP server to
grant new leases even if it is unable to communicate with the primary
server. The desired external DDNS behavior for DHCP failover servers
is:
1. Allow timely DDNS updates from the server which grants a
client a lease. Recognize that there is often a DDNS update
lifecycle which parallels the DHCP lease lifecycle. This is
likely to include the addition of records when the lease is
granted, and the removal of DNS records when the lease is sub-
sequently made available for allocation to a different client.
2. Communicate enough information between the two failover
servers to allow one to complete the DDNS update 'lifecycle'
even if the other server originally granted the lease.
3. Avoid redundant or overlapping DDNS updates, where both fail-
over servers are attempting to perform DDNS updates for the
same lease-client binding. Avoid situations where one partner
is attempting to add RRs related to a lease binding while the
other partner is attempting to remove RRs related to the same
lease binding.
5.11.2. Use of the DDNS option
In order for either server to be able to complete a DDNS update, or
to remove DNS records which were added by its partner, both servers
need to know the FQDN associated with the lease-client binding. The
FQDN associated with the client's A RR and PTR RR SHOULD be communi-
cated from the server which adds records into the DNS to its partner.
The initiating server SHOULD use the DDNS option in the BNDUPD mes-
sages to inform the partner server of the status of any DDNS updates
associated with a lease binding. Failover servers MAY choose not to
include the DDNS option in BNDUPD messages if there has been no
change in the status of any DDNS update related to the lease binding.
The partner server receiving BNDUPD messages containing the DDNS
option SHOULD compare the status flags and the FQDN contained in the
option data with the current DDNS information it has associated with
the lease binding, and update its notion of the DDNS status accord-
ingly.
The initiating server MAY send a BNDUPD to its partner before the
DDNS update has been successfully completed. If it does so, it SHOULD
leave the 'C' bit in the Flags field clear, to indicate to the
partner that the DDNS update may not be complete. When the DDNS
update has been successfully acknowledged by the DNS server, the ini-
tiating DHCP server SHOULD include the DDNS option in its next BNDUPD
message about the binding, so that the partner server will be able to
record the final status of the DDNS update. The initiating server
SHOULD set the 'C' bit in the DDNS option if the DDNS update was suc-
cessfully accepted by the DNS server.
Some implementations will choose to send a BNDUPD without waiting for
the DDNS update to complete, and then will send a second BNDUPD once
the DDNS update is complete. Other implementations will delay sending
the partner a BNDUPD until the DDNS update has been acknowledged by
the DNS server, or until some time-limit has elapsed, in order to
avoid sending a second BNDUPD.
The Domain Name field in the DDNS option contains the FQDN that will
be associated with the A RR (if the server is performing an A RR
update for the client) and the PTR RR. This FQDN may be composed in
any of several ways, depending on server configuration and the infor-
mation provided by the client in its DHCP messages. The client may
supply a hostname which it would like the server to use in forming
the FQDN, or it may supply the entire FQDN. The server may be config-
ured to attempt to use the information the client supplies, it may be
configured with an FQDN to use for the client, or it may be config-
ured to synthesize an FQDN. The responsive server SHOULD include the
FQDN that it will be using in DDNS updates it initiates when it sends
the DDNS option.
Since the responsive server may not have completed the DDNS update at
the time it sends the first BNDUPD about the lease binding, there may
be cases where the FQDN in later BNDUPD messages does not match the
FQDN included in earlier messages. For example, the responsive server
may be configured to handle situations where two or more DHCP client
FQDNs are identical by modifying the most-specific label in the FQDNs
of some of the clients in an attempt to generate unique FQDNs for
them. Alternatively, at sites which use some or all of the informa-
tion which clients supply to form the FQDN, it's possible that a
client's configuration may be changed so that it begins to supply new
data. The responsive server may react by removing the DNS records
which it originally added for the client, and replacing them with
records that refer to the client's new FQDN. In such cases, the
responsive server SHOULD include the actual FQDN that was used in
subsequent DDNS options. The responsive server SHOULD include
relevant client-option data in the client-request-options option in
its BNDUPD messages. This information may be necessary in order to
allow the non-responsive partner to detect client configuration
changes that change the hostname or FQDN data which the client
includes in its DHCP requests.
5.11.3. Adding RRs to the DNS
A failover server which is going to perform DDNS updates SHOULD ini-
tiate the DDNS update when it grants a new lease to a client. The
non-responsive partner SHOULD NOT initiate a DDNS update when it
receives the BNDUPD after the lease has been granted. The failover
protocol ensures that only one of the partners will grant a lease to
any individual client, so it follows that this requirement will
prevent both partners from initiating updates simultaneously. The
server initiating the update SHOULD follow the protocol in [DDNS].
The server may be configured to perform an A RR update on behalf of
its clients, or not. Ordinarily, a failover server will not initiate
DDNS updates when it renews leases. In two cases, however, a failover
server MAY initiate a DDNS update when it renews a lease to its
existing client:
1. When the lease was granted before the server was configured to
perform DDNS updates, the server MAY be configured to perform
updates when it next renews existing leases. Since both
servers are responsive to renewals in NORMAL state, it is not
enough to simply require the non-responsive server to avoid a
DNS update in this case. The server which would be responsive
to a DHCPDISCOVER from this client (even though the current
request is a DHCPREQUEST/RENEW) is the server which should
initiate the DDNS update.
2. If a server is in PARTNER-DOWN state, it can conclude that its
partner is no longer attempting to perform an update for the
existing client. If the remaining server has not recorded that
an update for the binding has been successfully completed, the
server MAY initiate a DDNS update. It MAY initiate this
update immediately upon entry to PARTNER-DOWN state, it may
perform this in the background, or it MAY initiate this update
upon next hearing from the DHCP client.
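
The conditions in this section can be summarized as a single
predicate. The sketch below is illustrative only: the function name,
the event strings and the boolean inputs are all hypothetical, and a
real server would derive them from its own configuration and lease
records. A True result means the text above permits (or, for a new
grant, recommends) starting a DDNS update.

```python
def may_initiate_ddns_add(event: str, *,
                          ddns_enabled: bool = True,
                          state: str = "NORMAL",
                          update_recorded_complete: bool = False,
                          lease_predates_ddns_config: bool = False,
                          would_answer_discover: bool = False) -> bool:
    """Decide whether this server may start a DDNS add for a binding."""
    if not ddns_enabled:
        return False
    if event == "grant":
        # The server that grants a new lease initiates the update.
        return True
    if event == "bndupd_received":
        # The partner that merely learns of the lease does not.
        return False
    if event == "renew":
        # Case 1: lease is older than the DDNS configuration, and this
        # server is the one that would answer a DHCPDISCOVER.
        case1 = lease_predates_ddns_config and would_answer_discover
        # Case 2: partner is down and no completed update is recorded.
        case2 = state == "PARTNER-DOWN" and not update_recorded_complete
        return case1 or case2
    return False
```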
5.11.4. Deleting RRs from the DNS
The failover server which makes a lease FREE SHOULD initiate any DDNS
deletes, if it has recorded that DNS records were added on behalf of
the client.
A server "makes a lease FREE" when it initiates a BNDUPD with a
binding-status of FREE, EXPIRED, or RELEASED. Its partner confirms
this status by acking that BNDUPD, and upon receipt of the ACK the
server has "made the address FREE". It is at this point that it
should initiate the DDNS operations to delete RRs from the DNS. Its
partner SHOULD NOT initiate DDNS deletes for DNS records related to
the lease binding as part of sending the BNDACK message. The
partner MAY have issued BNDUPD messages with a binding-status of
FREE, EXPIRED, or RELEASED previously, but the other server will have
NAKed these BNDUPD messages.
The failover protocol ensures that only one of the two partner
servers will be able to make a lease FREE. The server making the
lease FREE may be doing so while it is in NORMAL communication with
its partner, or it may be in PARTNER-DOWN state. If a server is in
PARTNER-DOWN state, it may be performing DDNS deletes for RRs which
its partner added originally. This allows a single remaining partner
server to assume responsibility for all of the DDNS activity which
the two servers were undertaking.
Another implication of this approach is that no DDNS RR deletes will
be performed while either server is in COMMUNICATIONS-INTERRUPTED
state, since no IP addresses are moved into the FREE state during
that period.
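
As a rough illustration of the rule in this section, the check below
expresses when a server would start DDNS deletes. The function and
parameter names are hypothetical. It assumes the server tracks, for
each binding, whether it sent the BNDUPD, whether the partner has
acknowledged it, and whether DNS records were recorded as added.

```python
# Binding-status values that count as "making a lease FREE" when sent
# in a BNDUPD, per the text above.
FREEING_STATUSES = {"FREE", "EXPIRED", "RELEASED"}


def should_delete_rrs(sent_bndupd: bool, bndupd_status: str,
                      acked_by_partner: bool, records_added: bool) -> bool:
    """True when this server should initiate DDNS deletes for a binding."""
    if not sent_bndupd:
        # The partner that only sends the BNDACK does not delete.
        return False
    return (bndupd_status in FREEING_STATUSES
            and acked_by_partner
            and records_added)
```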
5.12. Reservations and failover
Some DHCP servers support a capability to offer specific
pre-configured IP addresses to DHCP clients. These are real DHCP
clients that carry out the entire DHCP protocol, but these servers
always offer the client a specific pre-configured IP address -- and
they offer that IP address to no other clients. Such a capability has
several names, but it is sometimes called a "reservation", in that
the IP address is reserved for a particular DHCP client.
In a situation where there are two DHCP servers serving the same
subnet without using failover, the two DHCP servers need to have
disjoint IP address pools, but identical reservations for the DHCP
clients.
In a failover context, both servers need to be configured with the
proper reservations in an identical manner, but if we stop there
problems can occur around the edge conditions where reservations are
made for an IP address that has already been leased to a different
client. Different servers handle this conflict in different ways,
but the goal of the failover protocol is to allow correct operation
with any server's approach to the normal processing of the DHCP pro-
tocol.
The general solution with regard to reservations is as follows.
Whenever a reserved IP address becomes FREE (i.e., when first config-
ured or whenever a client frees it or it expires or is reset), the
primary server MUST show that IP address as FREE (and thus available
for its own allocation) and it MUST send it to the secondary server
as BACKUP, in order that the secondary server be able to allocate it
as well.
5.13. Dynamic BOOTP and failover
Some DHCP servers support a capability to offer IP addresses to BOOTP
clients without having a particular address previously allocated for
those clients. This capability is often called something like
"dynamic BOOTP". It is not a capability explicitly discussed in
either the DHCP or BOOTP RFC's, but rather a pragmatic capability
which can work reasonably well for a small set of legacy BOOTP dev-
ices.
This capability has a negative interaction with the fundamental ele-
ments of the failover protocol, in that an address handed out to a
BOOTP device has no term (or effectively no term, in that usually
they are considered leases for "forever"). There is no opportunity
to hand out a lease which is only the MCLT long when first hearing
from a BOOTP device, because they may only interact once with the
DHCP server and they have no notion of a lease expiration time. Thus
the entire concept of the MCLT and waiting the MCLT after entering
PARTNER-DOWN state is broken when dealing with BOOTP devices.
With some restrictions, however, dynamic BOOTP devices can be sup-
ported in a server on a subnet where failover is supported. The only
restriction (and it is not small) is that on any portion of the sub-
net (in any address pool) where dynamic BOOTP devices can be allo-
cated IP addresses, a DHCP server MUST NOT ever use any of the IP
addresses which were previously available for allocation by its fail-
over partner. Thus, the addresses allocated by the primary to the
secondary for allocation MUST NOT ever be used by the primary server
even if it is in PARTNER-DOWN state and has waited the MCLT after
entering that state. The reason for this is that one of those IP
addresses could have been allocated by the secondary server to a BOOTP
device, and the primary server would have no way of ever knowing that
this had happened.
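
The restriction can be sketched as a guard on address allocation. This
is only an illustration: the function and its parameters are
hypothetical, and all other allocation checks a server performs are
omitted. It assumes the usual failover behavior in which a server in
PARTNER-DOWN state may use its partner's addresses once the MCLT has
passed.

```python
def may_allocate(addr_was_partners: bool,
                 pool_serves_dynamic_bootp: bool,
                 state: str,
                 waited_mclt: bool) -> bool:
    """Guard for allocating an address under the dynamic BOOTP rule."""
    if not addr_was_partners:
        # Addresses never available to the partner are unaffected.
        return True
    if pool_serves_dynamic_bootp:
        # The partner may have handed this address to a BOOTP device
        # "forever", so it is never taken over.
        return False
    # Ordinary failover rule for the partner's addresses.
    return state == "PARTNER-DOWN" and waited_mclt
```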
5.14. Guidelines for selecting MCLT
There is no one correct value for the MCLT. There is an explicit
tradeoff between various factors in selecting an MCLT value.
5.14.1. Short MCLT
A short MCLT value will mean that after entering PARTNER-DOWN state,
a server will only have to wait a short time before it can start
allocating its partner's IP addresses to DHCP clients. Furthermore,
it will only have to wait a short time after the expiration of a
lease on an IP address before it can reallocate that IP address to
another DHCP client.
However, the downside of a short MCLT value is that the initial lease
interval that will be offered to every new DHCP client will be short,
which will cause increased traffic as those clients will need to send
their first renewal after half of that short MCLT. In addition,
the lease extensions that a server in COMMUNICATIONS-INTERRUPTED
state can give will be only the MCLT after the server has been in
COMMUNICATIONS-INTERRUPTED for around the desired client lease
period. If a server stays in COMMUNICATIONS-INTERRUPTED for that
long, then the leases it hands out will be short and that will
increase the load on that server, possibly causing difficulty.
5.14.2. Long MCLT
A long MCLT value will mean that the initial lease period will be
longer and the time that a server in COMMUNICATIONS-INTERRUPTED state
will be able to extend leases (after it has been in COMMUNICATIONS-
INTERRUPTED state for around the desired client lease period) will be
longer.
However, a server entering PARTNER-DOWN state will have to wait the
longer MCLT before being able to allocate its partner's IP addresses
to new DHCP clients. This may mean that additional IP addresses are
required in order to cover this time period. Further, the server in
PARTNER-DOWN will have to wait the longer MCLT from every lease
expiration before it can reallocate an IP address to a different DHCP
client.
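
The tradeoff can be made concrete with a little arithmetic. The helper
below is a hypothetical sketch that restates the quantities discussed
in sections 5.14.1 and 5.14.2. Times are in seconds, and the example
MCLT values (300 and 3600) are arbitrary illustrations, not
recommendations.

```python
def mclt_timeline(mclt: int, partner_down_entered: int,
                  lease_expiry: int) -> dict:
    """Collect the times that depend directly on the chosen MCLT."""
    return {
        # A new client's first lease is only the MCLT long.
        "initial_lease": mclt,
        # The client renews after half of that first lease.
        "first_renewal_after": mclt / 2,
        # Partner's addresses become usable MCLT after PARTNER-DOWN.
        "partner_addresses_usable_at": partner_down_entered + mclt,
        # An expired address can be reallocated MCLT after expiry.
        "expired_address_reusable_at": lease_expiry + mclt,
    }


short = mclt_timeline(300, partner_down_entered=1000, lease_expiry=2000)
long_ = mclt_timeline(3600, partner_down_entered=1000, lease_expiry=2000)
```

With the short value the first renewal arrives after 150 seconds and
the partner's addresses are usable at time 1300. With the long value
the first renewal arrives after 1800 seconds, but the partner's
addresses are not usable until time 4600.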
6. Packet Formats 6. Packet Formats
This section discusses the common message format that all failover This section discusses the common message format that all failover
messages have in common, and then defines option used in the failover messages have in common, and then defines option used in the failover
protocol. protocol.
6.1. Common message format 6.1. Common message format
All failover protocol messages are sent over the TCP connection All failover protocol messages are sent over the TCP connection
between failover endpoints and encoded using a packet format specific between failover endpoints and encoded using a message format
to the failover protocol. specific to the failover protocol.
There exists a common message format for all failover messages, which There exists a common message format for all failover messages, which
utilizes the options in a way similar to the DHCP protocol. For each utilizes the options in a way similar to the DHCP protocol. For each
message type, some options are required and some are optional. In message type, some options are required and some are optional. In
addition, when a message is received any options that are not under- addition, when a message is received any options that are not
stood by the receiving server MUST be ignored. understood by the receiving server MUST be ignored.
All of the fields in the fixed portion of the packet MUST be filled All of the fields in the fixed portion of the message MUST be filled
with correct data in every message sent. with correct data in every message sent.
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| packet length (2) | msg type (1) |payload off (1)| | message length (2) | msg type (1) |payload off (1)|
+---------------+---------------+---------------+---------------+ +---------------+---------------+---------------+---------------+
| time (4) |
+---------------------------------------------------------------+
| xid (4) | | xid (4) |
+---------------------------------------------------------------+ +---------------------------------------------------------------+
| 0 or more additional header bytes (variable) | | 0 or more additional header bytes (variable) |
+---------------------------------------------------------------+ +---------------------------------------------------------------+
| payload data (variable) | | payload data (variable) |
| | | |
| formatted as DHCP-style options | | formatted as DHCP-style options |
| using a unique option number space in the ?R6? | | using a unique option number space in the RFC TBD |
| format defined by [NAMESPACE] | | format defined by [NAMESPACE] |
+---------------------------------------------------------------+ +---------------------------------------------------------------+
packet length - 2 bytes, network byte order message length - 2 bytes, network byte order
This is the length of the packet. It includes the two byte packet This is the length of the message. It includes the two byte message
length itself. length itself. The maximum length is 2048 bytes.
msg type - 1 byte msg type - 1 byte
The message type field is used to distinguish between messages. The message type field is used to distinguish between messages.
The following message types are defined: The following message types are defined:
Value Message Type Value Message Type
----- ------------ ----- ------------
0 reserved not used 0 reserved not used
1 POOLREQ request allocation of addresses 1 POOLREQ request allocation of addresses
2 POOLRESP respond with allocation count 2 POOLRESP respond with allocation count
3 BNDUPD update partner with binding info 3 BNDUPD update partner with binding info
4 BNDACK acknowledge receipt of binding update 4 BNDACK acknowledge receipt of binding update
5 CONNECT establish connection with partner 5 CONNECT establish connection with the secondary
6 CONNECTACK respond to attempt to establish contact with partner 6 CONNECTACK respond to attempt to establish connection with partner
7 UPDREQALL request full transfer of binding info 7 UPDREQALL request full transfer of binding info
8 UPDDONE ack send and ack of req'd binding info 8 UPDDONE ack send and ack of req'd binding info
9 UPDREQ req transfer of un-acked binding info 9 UPDREQ req transfer of un-acked binding info
10 STATE inform partner of current state or state change 10 STATE inform partner of current state or state change
11 CONTACT probe communications integrity with partner 11 CONTACT probe communications integrity with partner
12 DISCONNECT close a connection
New message types should be defined in one of two ranges, 0-127 or New message types should be defined in one of two ranges, 0-127 or
129-255. The range of 0-127 is used for messages that MUST be 129-255. The range of 0-127 is used for messages that MUST be sup-
supported by every server, and if a server receives a message in the ported by every server, and if a server receives a message in the
range of 0-127 that it doesn't understand, it MUST drop the TCP con- range of 0-127 that it doesn't understand, it MUST close the TCP con-
nection. The range of 128-255 is used for messages which MAY be sup- nection. The range of 128-255 is used for messages which MAY be sup-
ported but are not required, and if a server receives a message in ported but are not required, and if a server receives a message in
this range that it does not understand it SHOULD ignore the message. this range that it does not understand it SHOULD ignore the message.
payload offset - 1 byte payload offset - 1 byte
The byte offset of the Payload Data, from the beginning of the The byte offset of the Payload Data, from the beginning of the
failover packet header. The value for the current protocol version is failover message header. The value for the current protocol version
8. is 8.
time - 4 bytes, network byte order
The absolute time in GMT when the message was transmitted,
represented as seconds elapsed since Jan 1, 1970 (i.e., similar to
the ANSI C time_t time value representation). While the ANSI C
time_t value is signed, the value used in this specification is
unsigned.
A server SHOULD set this time as close to the actual transmission of
the message as possible.
xid - 4 bytes, network byte order xid - 4 bytes, network byte order
This is the transaction id of the failover packet. The sender of a This is the transaction id of the failover message. The sender of a
failover protocol packet is responsible for setting this number, and failover protocol message is responsible for setting this number, and
the receiver of the packet copies the number over into any response the receiver of the message copies the number over into any response
packet, treating it as opaque data. The sender SHOULD ensure that message, treating it as opaque data. The sender SHOULD ensure that
every packet sent from a particular failover endpoint over the every message sent from a particular failover endpoint over the
associated TCP connection has a unique transaction id unless that associated TCP connection has a unique transaction id unless that
packet is a re-transmission. message is a re-transmission.
payload data - variable length payload data - variable length
The options are placed after the header, after skipping payload The options are placed after the header, after skipping payload
offset bytes from beginning of the packet. The payload data options offset bytes from beginning of the message. The payload data options
are not preceded by a "cookie" value. are not preceded by a "cookie" value.
The payload data is formatted as DHCP style options using the two The payload data is formatted as DHCP style options using the two
byte option number and two byte option length format as specified in byte option number and two byte option length format as specified in
the recommendations of the DHCP panel in [NAMESPACE]. the recommendations of the DHCP panel in [NAMESPACE].
The maximum length of the payload data in octets is 2048 less the The maximum length of the payload data in octets is 2048 less the
size of the header, i.e., the maximum packet length is 2048 octets. size of the header, i.e., the maximum message length is 2048 octets.
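
For illustration, the newer of the two header layouts shown above
(message length, msg type, payload offset, time, xid) could be decoded
roughly as follows. This is a sketch, not a reference implementation.
The function names and the returned dictionary are hypothetical. The
fixed fields as drawn occupy 12 bytes, while the text gives 8 as the
payload offset, so the sketch trusts the offset carried in the message
and only checks that it does not fall inside the fixed fields. Options
are kept in order of appearance, since order can be significant.

```python
import struct

# message length (2), msg type (1), payload offset (1), time (4), xid (4),
# all in network byte order.
FIXED = struct.Struct("!HBBII")
MAX_MESSAGE_LENGTH = 2048


def parse_message(buf: bytes) -> dict:
    """Decode the fixed header and the ordered list of options."""
    if len(buf) < FIXED.size:
        raise ValueError("buffer shorter than the fixed header")
    length, msg_type, payload_off, sent, xid = FIXED.unpack_from(buf)
    limit = min(len(buf), MAX_MESSAGE_LENGTH)
    if not (FIXED.size <= payload_off <= length <= limit):
        raise ValueError("inconsistent length or payload offset")
    options = []
    pos = payload_off
    while pos < length:
        if pos + 4 > length:
            raise ValueError("truncated option header")
        # Two byte option number, two byte option length.
        code, opt_len = struct.unpack_from("!HH", buf, pos)
        pos += 4
        if pos + opt_len > length:
            raise ValueError("truncated option data")
        options.append((code, buf[pos:pos + opt_len]))
        pos += opt_len
    return {"length": length, "msg_type": msg_type, "time": sent,
            "xid": xid, "options": options}


def on_unknown_type(msg_type: int) -> str:
    """Action for a message type the receiver does not understand."""
    return "close-connection" if msg_type <= 127 else "ignore"
```

As a usage example, a BNDUPD (type 3) carrying one assigned-IP-address
option (code 2, length 4) could be built and parsed like this:

```python
msg = (struct.pack("!HBBII", 20, 3, 12, 940000000, 7)
       + struct.pack("!HH", 2, 4) + bytes([10, 0, 0, 1]))
parsed = parse_message(msg)
```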
6.2. Common option format 6.2. Common option format
The options contained in the payload data section of the failover The options contained in the payload data section of the failover
packet all use the two byte option number and two byte length format message all use the two byte option number and two byte length format
as specified by the recommendations of the DHCP panel in [NAMESPACE]. as specified by the recommendations of the DHCP panel in [NAMESPACE].
The option numbers are drawn from an option number space unique to The option numbers are drawn from an option number space unique to
the failover protocol. All of the message types share a common the failover protocol. All of the message types share a common
option number space and common options definitions, though not all option number space and common options definitions, though not all
options are required or meaningful for every message. options are required or meaningful for every message.
In contrast to the options which appear in DHCP client and server In contrast to the options which appear in DHCP client and server
packets, the options in failover message are ordered. That is, for messages, the options in failover message are ordered. That is, for
some messages the order in which the options appear in the payload some messages the order in which the options appear in the payload
data area is significant. The messages for which this is the case data area is significant. The messages for which this is the case
spell it out in detail. spell it out in detail.
For all options which refer to time, they all use an absolute time in For all options which refer to time, they all use an absolute time in
GMT. Time synchronization has already been achieved between the GMT. Time synchronization has already been achieved between the
source and the target server using the CONNECT message. All time source and the target server using the CONNECT message and is updated
fields in the options defined below use a time represented as seconds using the time in every packet. All time fields in the options
elapsed since Jan 1, 1970 (i.e. ANSI C time_t time value representa- defined below use a time represented as seconds elapsed since Jan 1,
tion). Note that this is (at present) a signed field. 1970 (i.e. ANSI C time_t time value representation). Note that this
is (at present) a signed field.
Additional options can be defined for intervendor or vendor specific Additional options can be defined for intervendor or vendor specific
use with limited difficulty due to the large number of option numbers use with limited difficulty due to the large number of option numbers
available. available.
6.2.1. binding-status 6.2.1. binding-status
This option is used to convey the current state of a binding. This option is used to convey the current state of a binding.
Code Len Type Code Len Type
skipping to change at page 29, line 42 skipping to change at page 44, line 29
Value Binding Status Value Binding Status
----- ------------------------------------------------ ----- ------------------------------------------------
1 FREE Lease has never been used 1 FREE Lease has never been used
2 ACTIVE Lease is assigned to a client 2 ACTIVE Lease is assigned to a client
3 EXPIRED Lease has expired 3 EXPIRED Lease has expired
4 RELEASED Lease has been released by client 4 RELEASED Lease has been released by client
5 ABANDONED A server, or client flagged address as unusable 5 ABANDONED A server, or client flagged address as unusable
6 RESET Lease was freed by some external agent 6 RESET Lease was freed by some external agent
7 BACKUP Lease belongs to secondary's private address pool 7 BACKUP Lease belongs to secondary's private address pool
8 EXPIRED-GRACE Lease will become available after this period
9 RELEASED-GRACE Lease will become available after this period
6.2.2. assigned-IP-address 6.2.2. assigned-IP-address
The IP address to which this message refers. The IP address to which this message refers.
Code Len Address Code Len Address
+-----+-----+------+-----+----+-----+-----+-----+ +-----+-----+------+-----+----+-----+-----+-----+
| 0 | 2 | 0 | 4 | a1 | a2 | a3 | a4 | | 0 | 2 | 0 | 4 | a1 | a2 | a3 | a4 |
+-----+-----+------+-----+----+-----+-----+-----+ +-----+-----+------+-----+----+-----+-----+-----+
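For illustration only (this is not text from either draft revision), the two-byte code, two-byte length layout in the diagram above could be sketched in Python as follows. The helper names encode_option and decode_options are hypothetical, and the address 192.0.2.10 is just a documentation example.

```python
import socket
import struct

ASSIGNED_IP_ADDRESS = 2  # option code taken from the diagram above


def encode_option(code: int, payload: bytes) -> bytes:
    """Pack one option: 2-byte code, 2-byte length, then data (network order)."""
    return struct.pack("!HH", code, len(payload)) + payload


def decode_options(buf: bytes):
    """Yield (code, payload) pairs from a buffer of concatenated options."""
    offset = 0
    while offset < len(buf):
        if offset + 4 > len(buf):
            raise ValueError("truncated option header")
        code, length = struct.unpack_from("!HH", buf, offset)
        offset += 4
        if offset + length > len(buf):
            raise ValueError("truncated option data")
        yield code, buf[offset:offset + length]
        offset += length


# Hypothetical example: an assigned-IP-address option carrying 192.0.2.10.
wire = encode_option(ASSIGNED_IP_ADDRESS, socket.inet_aton("192.0.2.10"))
```

The decoder checks both the header and the data length before slicing, so a short read raises an error instead of silently returning a partial option.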
skipping to change at page 31, line 11 skipping to change at page 45, line 33
+-----+-----+------+-----+----+-----+--- +-----+-----+------+-----+----+-----+---
| 0 | 5 | 0 | n | i1 | i2 | ... | 0 | 5 | 0 | n | i1 | i2 | ...
+-----+-----+------+-----+----+-----+-- +-----+-----+------+-----+----+-----+--
6.2.6. client-hardware-address 6.2.6. client-hardware-address
The format is similar to DHCP option 61. Byte t1 (type) MUST be set The format is similar to DHCP option 61. Byte t1 (type) MUST be set
to the proper ARP hardware address code, as defined in the ARP to the proper ARP hardware address code, as defined in the ARP
section of RFC 1700 (it MUST NOT be zero!) section of RFC 1700 (it MUST NOT be zero!)
Code Len MAC address Code Len htype chaddr
+-----+-----+------+-----+----+-----+-----+--- +-----+-----+------+-----+----+-----+-----+---
| 0 | 6 | 0 | n | t1 | m1 | m2 | ... | 0 | 6 | 0 | n | t1 | c1 | c2 | ...
+-----+-----+------+-----+----+-----+-----+--- +-----+-----+------+-----+----+-----+-----+---
Either Client Id, Client Hardware Address or BOTH MAY be present in Either client-identifier, client-hardware-address or BOTH MAY be
binding update transactions. At least one of them MUST be present. present in binding update transactions. At least one of them MUST be
If both are present, the Client Id MUST be used to uniquely identify present. If both are present, the client-identifier MUST be used to
the owner of the binding (exactly as in RFC 2131). uniquely identify the owner of the binding (exactly as in RFC 2131).
6.2.7. client-FQDN 6.2.7. DDNS
If an implementation supports Dynamic DNS updates, this option can be If an implementation supports Dynamic DNS updates, this option is
used to communicate the DNS name that was set. Uses the format of the used to communicate the status of the DDNS update associated with a
Client FQDN option (81) as described in [DDNS] and extended to fit in particular lease binding. The Flags field conveys the types of DNS
the two byte code and length approach of the DHCP panel. RRs that are to be updated by the DHCP server, and the status of the
DDNS update. The Domain Name field conveys the DNS FQDN that the
DHCP server is using to refer to the client, in DNS encoding as
specified in [RFC1035].
Code Len Flags Rcode1 Rcode2 Domain Name Code Len Flags Domain Name
+-----+-----+------+-----+-----+------+------+-----+------ +-----+-----+------+-----+-----+------+------+-----+------
| 0 | 7 | 0 | n | f | r1 | r2 | d1 | d2... | 0 | 7 | 0 | n | flags | d1 | d2 | ...
+-----+-----+------+-----+-----+------+------+-----+------ +-----+-----+------+-----+-----+------+------+-----+------
The Flags field is a 16-bit field; several bit positions are
specified here.
15 7 0
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| MBZ |P|D|A|C|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The bits (numbered from the least-significant bit in network
byte-order) are used as follows:
0 (C): A RR update successfully completed
1 (A): Server is controlling A RR on behalf of the client
2 (D): PTR RR update successfully completed (Done)
3 (P): Server is controlling PTR RR on behalf of the client
4-15 : Must be zero
All of the unspecified bit positions SHOULD be set to 0 by servers
sending the Failover-DDNS option, and they MUST be ignored by servers
receiving the option.
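As a rough sketch of the -05 Flags layout described above (not part of the draft), a receiver might decode the option payload as shown below. The function name parse_ddns_option and the dictionary keys are hypothetical; the bit positions follow the list above, and undefined bits are masked off because receivers are told to ignore them.

```python
import struct

# Bit masks for the 16-bit Flags field, numbered from the least-significant bit.
DDNS_C = 1 << 0  # A RR update successfully completed
DDNS_A = 1 << 1  # server is controlling the A RR on behalf of the client
DDNS_D = 1 << 2  # PTR RR update successfully completed
DDNS_P = 1 << 3  # server is controlling the PTR RR on behalf of the client
DDNS_DEFINED = DDNS_C | DDNS_A | DDNS_D | DDNS_P


def parse_ddns_option(payload: bytes) -> dict:
    """Split a DDNS option payload into its flag bits and the encoded name."""
    if len(payload) < 2:
        raise ValueError("DDNS option needs at least the 2-byte Flags field")
    (raw,) = struct.unpack_from("!H", payload, 0)
    flags = raw & DDNS_DEFINED  # ignore bits 4-15 on receipt
    return {
        "a_rr_done": bool(flags & DDNS_C),
        "a_rr_controlled": bool(flags & DDNS_A),
        "ptr_rr_done": bool(flags & DDNS_D),
        "ptr_rr_controlled": bool(flags & DDNS_P),
        "domain_name": payload[2:],  # left in DNS wire encoding, undecoded
    }
```

The domain name is returned undecoded here; interpreting the RFC 1035 label encoding is left out of the sketch.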
6.2.8. reject-reason 6.2.8. reject-reason
This option is used to selectively reject binding updates. It MAY be This option is used to selectively reject binding updates. It MAY be
used in BNDACK message, always associated with an assigned-IP-address used in BNDACK message, always associated with an assigned-IP-address
option, which contains the IP address of the update being rejected. option, which contains the IP address of the update being rejected.
Code Len Reason Code Code Len Reason Code
+-----+-----+------+-----+----------+ +-----+-----+------+-----+----------+
| 0 | 8 | 0 | 1 | R1 | | 0 | 8 | 0 | 1 | R1 |
+-----+-----+------+-----+----------+ +-----+-----+------+-----+----------+
skipping to change at page 32, line 34 skipping to change at page 47, line 34
6 Connection rejected, unknown reason. 6 Connection rejected, unknown reason.
7 Connection rejected, duplicate connection. 7 Connection rejected, duplicate connection.
8 Connection rejected, invalid failover partner. 8 Connection rejected, invalid failover partner.
9 TLS not supported 9 TLS not supported
10 TLS supported but not configured 10 TLS supported but not configured
11 TLS required but not supported by partner 11 TLS required but not supported by partner
12 Message digest not supported 12 Message digest not supported
13 Message digest not configured 13 Message digest not configured
14 Protocol version mismatch 14 Protocol version mismatch
15 Missing binding information 15 Missing binding information
16 Outdata binding information 16 Outdated binding information
17 Less critical binding information 17 Less critical binding information
18-253, reserved. 18 No traffic within sufficient time
19 Hash bucket assignment conflict
20-253, reserved.
254 Unknown: Error occurred but does not match any reason code 254 Unknown: Error occurred but does not match any reason code
255 Reserved for code expansion 255 Reserved for code expansion
6.2.9. message 6.2.9. message
This option is used to supply a human readable message. It may be This option is used to supply a human readable message. It may be
used in association with the Reject Reason Code to provide a human used in association with the Reject Reason Code to provide a human
readable error message for the reject. readable error message for the reject.
Code Len Text Code Len Text
+-----+-----+------+-----+------+-----+-- +-----+-----+------+-----+------+-----+--
| 0 | 9 | 0 | n | c1 | c2 | ... | 0 | 9 | 0 | n | c1 | c2 | ...
+-----+-----+------+-----+------+-----+-- +-----+-----+------+-----+------+-----+--
6.2.10. MCLT 6.2.10. MCLT
Maximum Client Lead Time, in seconds. A 32 bit integer value, in Maximum Client Lead Time, in seconds. A 32 bit integer value, in
network byte order. T network byte order.
Code Len Time Code Len Time
+-----+-----+------+-----+----+-----+-----+-----+ +-----+-----+------+-----+----+-----+-----+-----+
| 0 | 10 | 0 | 4 | t1 | t2 | t3 | t4 | | 0 | 10 | 0 | 4 | t1 | t2 | t3 | t4 |
+-----+-----+------+-----+----+-----+-----+-----+ +-----+-----+------+-----+----+-----+-----+-----+
6.2.11. vendor-class-identifier 6.2.11. vendor-class-identifier
A string which identifies the vendor of the failover protocol A string which identifies the vendor of the failover protocol
implementation. implementation.
The code for this option is 60, and its minimum length is 1. The code for this option is 60, and its minimum length is 1.
Code Len vendor class string Code Len vendor class string
+-----+-----+------+-----+----+-----+--- +-----+-----+------+-----+----+-----+---
| 0 | 11 | 0 | n | c1 | c2 | ... | 0 | 11 | 0 | n | c1 | c2 | ...
+-----+-----+------+-----+----+-----+--- +-----+-----+------+-----+----+-----+---
6.2.12. current-time 6.2.12. lease-expiration-time
The current time expressed as an absolute time in GMT represented as
seconds elapsed since Jan 1, 1970 (i.e. ANSI C time_t time value
representation).
Code Len Current Time
+-----+-----+------+-----+----+-----+-----+-----+
| 0 | 12 | 0 | 4 | t1 | t2 | t3 | t4 |
+-----+-----+------+-----+----+-----+-----+-----+
6.2.13. lease-expiration-time
The lease expiration time expressed as an absolute time in GMT The lease expiration time expressed as an absolute time in GMT
represented as seconds elapsed since Jan 1, 1970 (i.e. ANSI C time_t represented as seconds elapsed since Jan 1, 1970 (i.e. ANSI C time_t
time value representation). time value representation).
The lease expiration time is the time that a server has ACKed to a The lease expiration time is the time that a server has ACKed to a
DHCP client. DHCP client.
Code Len Time Code Len Time
+-----+-----+------+-----+----+-----+-----+-----+ +-----+-----+------+-----+----+-----+-----+-----+
| 0 | 13 | 0 | 4 | t1 | t2 | t3 | t4 | | 0 | 13 | 0 | 4 | t1 | t2 | t3 | t4 |
+-----+-----+------+-----+----+-----+-----+-----+ +-----+-----+------+-----+----+-----+-----+-----+
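The time-valued options all share the four-byte layout shown above. The following Python sketch (an illustration, not draft text) packs an absolute GMT time into such an option. It assumes the value is carried as a signed 32-bit integer, following the note at the start of section 6.2 that the field is (at present) signed; encode_time_option and decode_time are hypothetical names.

```python
import calendar
import struct

LEASE_EXPIRATION_TIME = 13  # option code taken from the diagram above


def encode_time_option(code: int, when: int) -> bytes:
    """Pack an absolute time (seconds since Jan 1, 1970 GMT) as a 4-byte option.

    Assumption: signed 32-bit value in network byte order.
    """
    return struct.pack("!HHi", code, 4, when)


def decode_time(payload: bytes) -> int:
    """Unpack the 4-byte time value from an option payload."""
    if len(payload) != 4:
        raise ValueError("time options carry exactly 4 bytes")
    (when,) = struct.unpack("!i", payload)
    return when


# Hypothetical lease expiring at 2000-04-01 00:00:00 GMT.
expiry = calendar.timegm((2000, 4, 1, 0, 0, 0))
wire = encode_time_option(LEASE_EXPIRATION_TIME, expiry)
```

calendar.timegm is used instead of time.mktime so that the conversion does not depend on the local time zone of the machine running the sketch.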
6.2.14. potential-expiration-time 6.2.13. potential-expiration-time
The potential expiration time expressed as an absolute time in GMT The potential expiration time expressed as an absolute time in GMT
represented as seconds elapsed since Jan 1, 1970 (i.e. ANSI C time_t represented as seconds elapsed since Jan 1, 1970 (i.e. ANSI C time_t
time value representation). time value representation).
The potential expiration time is the time that one server tells The potential expiration time is the time that one server tells
another server that it may ACK to a client. another server that it may ACK to a client.
Code Len Time Code Len Time
+-----+-----+------+-----+----+-----+-----+-----+ +-----+-----+------+-----+----+-----+-----+-----+
| 0 | 14 | 0 | 4 | t1 | t2 | t3 | t4 | | 0 | 14 | 0 | 4 | t1 | t2 | t3 | t4 |
+-----+-----+------+-----+----+-----+-----+-----+ +-----+-----+------+-----+----+-----+-----+-----+
6.2.15. grace-expiration-time 6.2.14. grace-expiration-time
The grace expiration time expressed as an absolute time in GMT The grace expiration time expressed as an absolute time in GMT
represented as seconds elapsed since Jan 1, 1970 (i.e. ANSI C time_t represented as seconds elapsed since Jan 1, 1970 (i.e. ANSI C time_t
time value representation). time value representation).
The grace expiration time is the time that a grace period will The grace expiration time is the time that a grace period will
expire. expire.
Code Len Time Code Len Time
+-----+-----+------+-----+----+-----+-----+-----+ +-----+-----+------+-----+----+-----+-----+-----+
| 0 | 15 | 0 | 4 | t1 | t2 | t3 | t4 | | 0 | 15 | 0 | 4 | t1 | t2 | t3 | t4 |
+-----+-----+------+-----+----+-----+-----+-----+ +-----+-----+------+-----+----+-----+-----+-----+
6.2.16. client-last-transaction-time 6.2.15. client-last-transaction-time
The time at which this server last received a DHCP request from a The time at which this server last received a DHCP request from a
particular client expressed as an absolute time in GMT represented as particular client expressed as an absolute time in GMT represented as
seconds elapsed since Jan 1, 1970 (i.e. ANSI C time_t time value seconds elapsed since Jan 1, 1970 (i.e. ANSI C time_t time value
representation). representation).
Code Len Partner Down Time Code Len Partner Down Time
+-----+-----+------+-----+----+-----+-----+-----+ +-----+-----+------+-----+----+-----+-----+-----+
| 0 | 16 | 0 | 4 | t1 | t2 | t3 | t4 | | 0 | 16 | 0 | 4 | t1 | t2 | t3 | t4 |
+-----+-----+------+-----+----+-----+-----+-----+ +-----+-----+------+-----+----+-----+-----+-----+
6.2.17. start-time-of-state 6.2.16. start-time-of-state
The time at which the state contained in this message began, The time at which the state contained in this message began,
expressed as an absolute time in GMT represented as seconds elapsed expressed as an absolute time in GMT represented as seconds elapsed
since Jan 1, 1970 (i.e. ANSI C time_t time value representation). since Jan 1, 1970 (i.e. ANSI C time_t time value representation).
This option is used for different states in different messages. In a This option is used for different states in different messages. In a
BNDUPD message it represents the start time of the state of the lease BNDUPD message it represents the start time of the state of the lease
in the BNDUPD message. In a STATE message, it represents the start in the BNDUPD message. In a STATE message, it represents the start
time of the partner server's failover state. time of the partner server's failover state.
Code Len Start Time of State Code Len Start Time of State
+-----+-----+------+-----+----+-----+-----+-----+ +-----+-----+------+-----+----+-----+-----+-----+
| 0 | 17 | 0 | 4 | t1 | t2 | t3 | t4 | | 0 | 17 | 0 | 4 | t1 | t2 | t3 | t4 |
+-----+-----+------+-----+----+-----+-----+-----+ +-----+-----+------+-----+----+-----+-----+-----+
6.2.18. server-state 6.2.17. server-state
This option is used to convey the current state of the failover This option is used to convey the current state of the failover
endpoint in the sending server. endpoint in the sending server.
Code Len Server State Code Len Server State
+-----+-----+------+-----+-----+ +-----+-----+------+-----+-----+
| 0 | 18 | 0 | 1 | 1-9 | | 0 | 18 | 0 | 1 | 1-9 |
+-----+-----+------+-----+-----+ +-----+-----+------+-----+-----+
Legal values for this option are: Legal values for this option are:
skipping to change at page 36, line 31 skipping to change at page 51, line 31
2 NORMAL Normal state 2 NORMAL Normal state
3 COMMUNICATIONS-INTERRUPTED Communication interrupted (safe) 3 COMMUNICATIONS-INTERRUPTED Communication interrupted (safe)
4 PARTNER-DOWN Partner down (unsafe mode) 4 PARTNER-DOWN Partner down (unsafe mode)
5 POTENTIAL-CONFLICT Synchronizing 5 POTENTIAL-CONFLICT Synchronizing
6 RECOVER Recovering bindings from partner 6 RECOVER Recovering bindings from partner
7 PAUSED Shutting down for a short period. 7 PAUSED Shutting down for a short period.
8 SHUTDOWN Shutting down for an extended 8 SHUTDOWN Shutting down for an extended
period. period.
9 RECOVER-DONE Interlock state prior to NORMAL 9 RECOVER-DONE Interlock state prior to NORMAL
6.2.19. server-flags 6.2.18. server-flags
This option is used to convey the current flags of the failover This option is used to convey the current flags of the failover
endpoint in the sending server. endpoint in the sending server.
Code Len Server Flags Code Len Server Flags
+-----+-----+------+-----+-------+ +-----+-----+------+-----+-------+
| 0 | 19 | 0 | 1 | flags | | 0 | 19 | 0 | 1 | flags |
+-----+-----+------+-----+-------+ +-----+-----+------+-----+-------+
Legal values for this option are: Legal values for this option are:
skipping to change at page 37, line 8 skipping to change at page 52, line 8
are reserved, and must be set to 0. are reserved, and must be set to 0.
o STARTUP o STARTUP
Bit 5 is the STARTUP flag. Bit 5 MUST be set to 1 whenever the Bit 5 is the STARTUP flag. Bit 5 MUST be set to 1 whenever the
server is in STARTUP state, and set to 0 otherwise. (Note that server is in STARTUP state, and set to 0 otherwise. (Note that
when in STARTUP state, the state transmitted in the server-state when in STARTUP state, the state transmitted in the server-state
option is usually the last recorded state from stable storage, option is usually the last recorded state from stable storage,
but see section 9.3 for details.) but see section 9.3 for details.)
6.2.20. vendor-specific-options 6.2.19. vendor-specific-options
This option is used to convey options specific to a particular This option is used to convey options specific to a particular
vendor's implementation. The vendor class identifier is used to vendor's implementation. The vendor class identifier is used to
specify which option space the embedded options are drawn from. specify which option space the embedded options are drawn from.
It functions similarly to the vendor class identifier and vendor It functions similarly to the vendor class identifier and vendor
specific options in the DHCP protocol. specific options in the DHCP protocol.
This option contains other options in the same two byte code, two This option contains other options in the same two byte code, two
byte length format. If this option appears in a message without a byte length format. If this option appears in a message without a
corresponding vendor class identifier, it MUST be ignored. corresponding vendor class identifier, it MUST be ignored.
Code Len Embedded options Code Len Embedded options
+-----+-----+------+-----+----+-----+--- +-----+-----+------+-----+----+-----+---
| 0 | 20 | 0 | n | c1 | c2 | ... | 0 | 20 | 0 | n | c1 | c2 | ...
+-----+-----+------+-----+----+-----+--- +-----+-----+------+-----+----+-----+---
6.2.21. max-unacked-bndupd 6.2.20. max-unacked-bndupd
The maximum number of BNDUPD messages that this server is prepared to The maximum number of BNDUPD messages that this server is prepared to
accept over the TCP connection without causing the TCP connection to accept over the TCP connection without causing the TCP connection to
block. block.
Code Len Maximum Unacked BNDUPD Code Len Maximum Unacked BNDUPD
+-----+-----+------+-----+----+-----+-----+-----+ +-----+-----+------+-----+----+-----+-----+-----+
| 0 | 21 | 0 | 4 | n1 | n2 | n3 | n4 | | 0 | 21 | 0 | 4 | n1 | n2 | n3 | n4 |
+-----+-----+------+-----+----+-----+-----+-----+ +-----+-----+------+-----+----+-----+-----+-----+
6.2.22. server-role 6.2.21. receive-timer
This option is used to convey the role of the failover endpoint in
the sending server.
Code Len Role
+-----+-----+------+-----+-------+
| 0 | 22 | 0 | 1 | r1 |
+-----+-----+------+-----+-------+
A value of 0 indicates that the failover endpoint is a primary server
and a value of 1 indicates that it is a secondary server.
6.2.23. receive-timer
The number of seconds within which the server must receive a packet The number of seconds within which the server must receive a message
from its partner, or it will assume that the partner is down or the from its partner, or it will assume that the partner is down or the
communication path to the partner has failed. communication path to the partner has failed.
Code Len Receive Timer Code Len Receive Timer
+-----+-----+------+-----+----+-----+-----+-----+ +-----+-----+------+-----+----+-----+-----+-----+
| 0 | 23 | 0 | 4 | s1 | s2 | s3 | s4 | | 0 | 23 | 0 | 4 | s1 | s2 | s3 | s4 |
+-----+-----+------+-----+----+-----+-----+-----+ +-----+-----+------+-----+----+-----+-----+-----+
6.2.24. hash-bucket-assignment 6.2.22. hash-bucket-assignment
The set of hash values to which the receiving server MUST respond. The set of hash values to which the receiving server MUST respond.
See section 5.3 for more information on how this option is used. See section 5.3 for more information on how this option is used.
This option consists of a set of 32 bytes, in network byte order, The format and usage of the data in this option is defined in
where each bit corresponds to one of 256 possible hash bucket values. [LOADB].
If a bit is set to 1, the recipient is required to service the
requests whose client-identifier or htype concatenated with the
chaddr (if no client-identifier exists) map into the corresponding
hash bucket.
Code Len Hash Buckets Code Len Hash Buckets
+-----+-----+------+-----+----+-----+-----+-----+ +-----+-----+------+-----+----+-----+-----+-----+
| 0 | 24 | 0 | 32 | b1 | b2 | ... | b32 | | 0 | 24 | 0 | 32 | b1 | b2 | ... | b32 |
+-----+-----+------+-----+----+-----+-----+-----+ +-----+-----+------+-----+----+-----+-----+-----+
6.2.25. message-digest 6.2.23. message-digest
The message digest for this message. The message digest for this message.
This option consists of a variable number of bytes which contain the This option consists of a variable number of bytes which contain the
message digest of the message prior to the inclusion of this option. message digest of the message prior to the inclusion of this option.
When this option appears in a message, it MUST appear as the last When this option appears in a message, it MUST appear as the last
option in the message. option in the message.
Code Len Message Digest Code Len Message Digest
+-----+-----+------+-----+----+-----+----- +-----+-----+------+-----+----+-----+-----
| 0 | 25 | 0 | n | d1 | d2 | ... | 0 | 25 | 0 | n | d1 | d2 | ...
+-----+-----+------+-----+----+-----+----- +-----+-----+------+-----+----+-----+-----
6.2.26. protocol-version 6.2.24. protocol-version
The protocol version being used by the server. It is only sent in the The protocol version being used by the server. It is only sent in the
CONNECT and CONNECTACK messages. CONNECT and CONNECTACK messages.
Code Len Version Code Len Version
+-----+-----+------+-----+----+ +-----+-----+------+-----+----+
| 0 | 26 | 0 | 1 | v1 | | 0 | 26 | 0 | 1 | v1 |
+-----+-----+------+-----+----+ +-----+-----+------+-----+----+
6.2.27. TLS-request 6.2.25. TLS-request
This option contains information relating to TLS security This option contains information relating to TLS security
negotiation. It is sent in a CONNECT message. negotiation. It is sent in a CONNECT message.
The first byte, req, is the TLS request from this server. A value of The first byte, req, is the TLS request from this server. A value of
0 indicates no TLS operation, a value of 1 indicates that TLS 0 indicates no TLS operation, a value of 1 indicates that TLS
operation is desired, and a value of 2 indicates that TLS operation operation is desired, and a value of 2 indicates that TLS operation
is required to establish communications with this server. is required to establish communications with this server.
The second byte, acc, is what this server will accept for TLS The second byte, acc, is what this server will accept for TLS
operation. A value of 0 means that this server will not accept TLS operation. A value of 0 means that this server will not accept TLS
connections. A value of 1 means that this server will accept TLS connections. A value of 1 means that this server will accept TLS
connections. connections.
If req is not zero, then acc MUST be 1. If req is not zero, then acc MUST be 1.
This allows a server which is not configured for TLS support to This allows a server which is not configured to require TLS support
inform its partner that it will accept a TLS connection although it to inform its partner that it will accept a TLS connection although
does not desire one, for instance. it does not desire one, for instance.
Code Len request acccept Code Len request accept
+-----+-----+------+-----+----+----+ +-----+-----+------+-----+----+----+
| 0 | 27 | 0 | 2 | req| acc| | 0 | 27 | 0 | 2 | req| acc|
+-----+-----+------+-----+----+----+ +-----+-----+------+-----+----+----+
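The constraint between req and acc described above can be checked before the option is sent. This is a minimal Python sketch under the value ranges given in the text; the function name encode_tls_request is hypothetical and not part of the draft.

```python
import struct

TLS_REQUEST = 27  # option code taken from the diagram above


def encode_tls_request(req: int, acc: int) -> bytes:
    """Build a TLS-request option, enforcing the req/acc rules in the text."""
    if req not in (0, 1, 2):
        raise ValueError("req must be 0 (none), 1 (desired) or 2 (required)")
    if acc not in (0, 1):
        raise ValueError("acc must be 0 (will not accept) or 1 (will accept)")
    if req != 0 and acc != 1:
        raise ValueError("if req is not zero, acc MUST be 1")
    return struct.pack("!HHBB", TLS_REQUEST, 2, req, acc)


# A server that does not ask for TLS but will accept it from its partner.
wire = encode_tls_request(0, 1)
```

The combination req=0, acc=1 is the case the text singles out: the sender does not desire TLS but signals that it will accept it.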
6.2.28. TLS-reply 6.2.26. TLS-reply
This option contains information relating to TLS security This option contains information relating to TLS security
negotiation. It is sent in a CONNECTACK message. negotiation. It is sent in a CONNECTACK message.
The value of 0 indicates no TLS operation, a value of 1 indicates The value of 0 indicates no TLS operation, a value of 1 indicates
that TLS operation is required. that TLS operation is required.
Code Len TLS Code Len TLS
+-----+-----+------+-----+----+ +-----+-----+------+-----+----+
| 0 | 28 | 0 | 1 | t1 | | 0 | 28 | 0 | 1 | t1 |
+-----+-----+------+-----+----+ +-----+-----+------+-----+----+
6.2.27. client-request-options
This option contains options from a DHCP client's request. It is
sent in a BNDUPD message. The first 4 bytes of the option contain
the "magic number" of the option area from which the DHCP client's
request options were taken and serves to define the format of the
rest of the sub-options contained in this option. After the magic
number, the options included are in the normal options format
appropriate for that magic number.
A server SHOULD NOT include all of the options in a DHCP client
request in this option, but rather a server SHOULD include only those
options which are of likely interest to its partner server. See
section 7.1 for details.
Code Len Magic Number Embedded options
+-----+-----+------+-----+----+----+----+----+----+----+--
| 0 | 29 | 0 | n | m1 | m2 | m3 | m4 | b1 | b2 | ...
+-----+-----+------+-----+----+----+----+----+----+----+--
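To make the nesting concrete, here is a hedged Python sketch (not draft text) that wraps a selected subset of a client's DHCP options in a client-request-options option. It assumes the option area is the standard DHCP one, whose magic cookie in RFC 2131 is 99.130.83.99, so the embedded options use the one-byte code, one-byte length DHCP format. The helper names and the host name value are hypothetical.

```python
import struct

CLIENT_REQUEST_OPTIONS = 29  # option code taken from the diagram above
DHCP_MAGIC_COOKIE = bytes([99, 130, 83, 99])  # RFC 2131 options magic cookie


def dhcp_option(code: int, data: bytes) -> bytes:
    """Encode one DHCP option: 1-byte code, 1-byte length, then data."""
    if len(data) > 255:
        raise ValueError("DHCP option data is limited to 255 bytes")
    return bytes([code, len(data)]) + data


def encode_client_request_options(selected) -> bytes:
    """Wrap (code, data) pairs behind the magic number in one failover option."""
    body = DHCP_MAGIC_COOKIE + b"".join(dhcp_option(c, d) for c, d in selected)
    return struct.pack("!HH", CLIENT_REQUEST_OPTIONS, len(body)) + body


# Forward only the client's host name option (DHCP option 12) to the partner.
wire = encode_client_request_options([(12, b"host1")])
```

Passing only the pairs of interest mirrors the guidance above that a server SHOULD NOT forward every option from the client's request.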
6.2.28. client-reply-options
This option contains options from a DHCP server's reply to a DHCP
client request. It is sent in a BNDUPD message. The first 4 bytes
of the option contain the "magic number" of the option area from
which the DHCP reply options were taken and serves to define the
format of the rest of the sub-options contained in this option.
After the magic number, the options included are in the normal
options format appropriate for that magic number.
A server SHOULD NOT include all of the options in a DHCP server's
reply to a client's request in this option, but rather a server
SHOULD include only those options which are of likely interest to its
partner server. See section 7.1 for details.
Code Len Magic Number Embedded options
+-----+-----+------+-----+----+----+----+----+----+----+--
| 0 | 30 | 0 | n | m1 | m2 | m3 | m4 | b1 | b2 | ...
+-----+-----+------+-----+----+----+----+----+----+----+--
6.3. BNDUPD message format 6.3. BNDUPD message format
The binding update (BNDUPD) message is used to send the binding data- The binding update (BNDUPD) message is used to send the binding data-
base changes to the partner server. base changes to the partner server.
The message type for the BNDUPD message is 3. The message type for the BNDUPD message is 3.
The xid of the BNDUPD MUST be unique with respect to other failover The xid of the BNDUPD MUST be unique with respect to other failover
messages transmitted from this failover endpoint. messages transmitted from this failover endpoint.
skipping to change at page 41, line 23 skipping to change at page 57, line 17
Option ACTIVE EXPIRED RELEASED FREE Option ACTIVE EXPIRED RELEASED FREE
------ ------ ------- -------- ---- ------ ------ ------- -------- ----
assigned-IP-address MUST MUST MUST MUST assigned-IP-address MUST MUST MUST MUST
binding-status MUST MUST MUST MUST binding-status MUST MUST MUST MUST
client-identifier MAY MAY MAY MAY client-identifier MAY MAY MAY MAY
client-hardware-address MUST MUST MUST MAY client-hardware-address MUST MUST MUST MAY
lease-expiration-time MUST MUST NOT MUST NOT MUST NOT lease-expiration-time MUST MUST NOT MUST NOT MUST NOT
potential-expiration-time MUST MUST NOT MUST NOT MUST NOT potential-expiration-time MUST MUST NOT MUST NOT MUST NOT
grace-expiration-time MUST NOT MUST NOT MUST NOT MUST NOT grace-expiration-time MUST NOT MUST NOT MUST NOT MUST NOT
start-time-of-state SHOULD SHOULD SHOULD SHOULD start-time-of-state SHOULD SHOULD SHOULD SHOULD
client-last-trans.-time SHOULD SHOULD SHOULD MAY client-last-trans.-time MUST SHOULD MUST MAY
client-FQDN(1) SHOULD SHOULD SHOULD SHOULD DDNS(1) SHOULD SHOULD SHOULD SHOULD
client-request-options SHOULD SHOULD NOT SHOULD SHOULD NOT
client-reply-options SHOULD SHOULD NOT SHOULD SHOULD NOT
all others MAY MAY MAY MAY all others MAY MAY MAY MAY
binding-status binding-status
BACKUP BACKUP
EXPIRED- RELEASED- RESET RESET
Option GRACE GRACE ABANDONED Option ABANDONED
------ ------ ----- --------- ------ ---------
assigned-IP-address MUST MUST MUST assigned-IP-address MUST
binding-status MUST MUST MUST binding-status MUST
client-identifier MAY MAY MAY(2) client-identifier MAY(2)
client-hardware-address MAY MAY MAY(2) client-hardware-address MAY(2)
lease-expiration-time MUST NOT MUST NOT MUST NOT lease-expiration-time MUST NOT
potential-expiration-time MUST NOT MUST NOT MUST NOT potential-expiration-time MUST NOT
grace-expiration-time MUST MUST MUST NOT grace-expiration-time MUST NOT
start-time-of-state SHOULD SHOULD SHOULD start-time-of-state SHOULD
client-last-trans.-time SHOULD SHOULD MAY client-last-trans.-time MAY
client-FQDN(1) SHOULD SHOULD SHOULD DDNS(1) SHOULD
all others MAY MAY MAY client-request-options SHOULD NOT
client-reply-options SHOULD NOT
all others MAY
(1) Only SHOULD appear if client supplies a host name and dynamic DNS (1) Only SHOULD appear if server supports dynamic DNS.
is used.
(2) MUST NOT if binding-status is ABANDONED. (2) MUST NOT if binding-status is ABANDONED.
Table 6.3-1: Options used in a BNDACK message Table 6.3-1: Options used in a BNDUPD message
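As an illustration of how the MUST and MUST NOT entries in the table above might be enforced, the Python sketch below encodes the -05 column for the ACTIVE, EXPIRED, RELEASED and FREE states only. It is not part of the draft; check_bndupd and the rule table are hypothetical, the SHOULD and MAY rows are deliberately left out, and option names are plain strings rather than option codes.

```python
BASE = frozenset({"assigned-IP-address", "binding-status"})
TIME_OPTIONS = frozenset({
    "lease-expiration-time",
    "potential-expiration-time",
    "grace-expiration-time",
})

# status -> (options that MUST appear, options that MUST NOT appear)
RULES = {
    "ACTIVE": (
        BASE | {"client-hardware-address", "lease-expiration-time",
                "potential-expiration-time", "client-last-transaction-time"},
        frozenset({"grace-expiration-time"}),
    ),
    "EXPIRED": (BASE | {"client-hardware-address"}, TIME_OPTIONS),
    "RELEASED": (
        BASE | {"client-hardware-address", "client-last-transaction-time"},
        TIME_OPTIONS,
    ),
    "FREE": (BASE, TIME_OPTIONS),
}


def check_bndupd(status: str, present) -> list:
    """Return a sorted list of rule violations for one binding update."""
    must, must_not = RULES[status]
    present = set(present)
    problems = [f"missing {name}" for name in sorted(must - present)]
    problems += [f"forbidden {name}" for name in sorted(must_not & present)]
    return problems
```

An empty list means the update satisfies the hard requirements for that binding-status; the SHOULD rows would need a separate, softer check.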
6.4. BNDACK message format 6.4. BNDACK message format
A server sends a binding acknowledgement (BNDACK) message when it has A server sends a binding acknowledgement (BNDACK) message when it has
successfully committed binding database changes received from a fail- successfully committed binding database changes received from a fail-
over partner in a BNDUPD message to its own stable storage. over partner in a BNDUPD message to its own stable storage.
The message type for the BNDACK message is 4. The message type for the BNDACK message is 4.
The xid in a BNDACK MUST be the same as the xid of the corresponding The xid in a BNDACK MUST be the same as the xid of the corresponding
skipping to change at page 43, line 20 skipping to change at page 59, line 20
binding-status MUST MUST MUST MUST binding-status MUST MUST MUST MUST
client-identifier MAY MAY MAY MAY client-identifier MAY MAY MAY MAY
client-hardware-address MUST MUST MUST MAY client-hardware-address MUST MUST MUST MAY
reject-reason MAY MAY MAY MAY reject-reason MAY MAY MAY MAY
message MAY MAY MAY MAY message MAY MAY MAY MAY
lease-expiration-time MUST MUST NOT MUST NOT MUST NOT lease-expiration-time MUST MUST NOT MUST NOT MUST NOT
potential-expiration-time MUST MUST NOT MUST NOT MUST NOT potential-expiration-time MUST MUST NOT MUST NOT MUST NOT
grace-expiration-time MUST NOT MUST NOT MUST NOT MUST NOT grace-expiration-time MUST NOT MUST NOT MUST NOT MUST NOT
start-time-of-state SHOULD SHOULD SHOULD SHOULD start-time-of-state SHOULD SHOULD SHOULD SHOULD
client-last-trans.-time SHOULD SHOULD SHOULD MAY client-last-trans.-time SHOULD SHOULD SHOULD MAY
client-FQDN(1) SHOULD SHOULD SHOULD SHOULD DDNS(1) SHOULD SHOULD SHOULD SHOULD
all others MAY MAY MAY MAY all others MAY MAY MAY MAY
binding-status binding-status
BACKUP BACKUP
EXPIRED- RELEASED- RESET RESET
Option GRACE GRACE ABANDONED Option ABANDONED
------ ------ ----- --------- ------ ---------
assigned-IP-address MUST MUST MUST assigned-IP-address MUST
binding-status MUST MUST MUST binding-status MUST
client-identifier MAY MAY MAY client-identifier MAY
client-hardware-address MAY MAY MAY(2) client-hardware-address MAY(2)
reject-reason MAY MAY MAY reject-reason MAY
message MAY MAY MAY message MAY
lease-expiration-time MUST NOT MUST NOT MUST NOT lease-expiration-time MUST NOT
potential-expiration-time MUST NOT MUST NOT MUST NOT potential-expiration-time MUST NOT
grace-expiration-time MUST MUST MUST NOT grace-expiration-time MUST NOT
start-time-of-state SHOULD SHOULD SHOULD start-time-of-state SHOULD
client-last-trans.-time SHOULD SHOULD MAY client-last-trans.-time MAY
client-FQDN(1) SHOULD SHOULD SHOULD DDNS(1) SHOULD
all others MAY MAY MAY all others MAY
(1) Only SHOULD appear if client supplies a host name and dynamic DNS (1) Only SHOULD appear if the server supports dynamic DNS.
is used.
(2) MUST NOT if binding-status is ABANDONED. (2) MUST NOT if binding-status is ABANDONED.
Table 6.4-1: Options used in a BNDACK message Table 6.4-1: Options used in a BNDACK message
6.5. Bulking for BNDUPD and BNDACK messages 6.5. Bulking for BNDUPD and BNDACK messages
DISCUSSION: DISCUSSION:
Bulking is planned for this protocol, but it hasn't been specified Bulking is planned for this protocol, but it hasn't been specified
in this revision of the draft. Once the draft settles down, we in this revision of the draft. Once the draft settles down, we
will specify the bulking approach in detail. will specify the bulking approach in detail.
6.6. UPDREQ message format 6.6. UPDREQ message format
skipping to change at page 44, line 25 skipping to change at page 60, line 22
The update request (UPDREQ) message is used by one server to request The update request (UPDREQ) message is used by one server to request
that its partner send it all binding database information that it has that its partner send it all binding database information that it has
not already seen. not already seen.
The message type for the UPDREQ message is 9. The message type for the UPDREQ message is 9.
The xid in a UPDREQ message MUST be unique among messages transmitted The xid in a UPDREQ message MUST be unique among messages transmitted
from this failover endpoint during the life of this connection. from this failover endpoint during the life of this connection.
There are no options that MUST appear in a UPDREQ message. Any There are no options that MUST appear in a UPDREQ message. Any
option MAY appear. option MAY appear, though very few will likely be useful.
6.7. UPDREQALL message format 6.7. UPDREQALL message format
The update request all (UPDREQALL) message is used by one server to The update request all (UPDREQALL) message is used by one server to
request that all binding database information be sent in order to request that all binding database information be sent in order to
recover from a total loss of its lease state database by the request- recover from a total loss of its binding database by the requesting
ing server. server.
The message type for the UPDREQALL message is 7. The message type for the UPDREQALL message is 7.
The xid in a UPDREQALL message MUST be unique among messages The xid in a UPDREQALL message MUST be unique among messages
transmitted from this failover endpoint during the life of this con- transmitted from this failover endpoint during the life of this con-
nection. nection.
There are no options that MUST appear in an UPDREQALL message. Any There are no options that MUST appear in an UPDREQALL message. Any
option MAY appear. option MAY appear, though very few will likely be useful.
6.8. UPDDONE message format 6.8. UPDDONE message format
The update done (UPDDONE) message is used by the responding server to The update done (UPDDONE) message is used by the responding server to
indicate that all requested updates have been sent by the responding indicate that all requested updates have been sent by the responding
server as BNDUPD messages and acked by the requesting server using server as BNDUPD messages and responded to by the requesting server
BNDACK messages. While a BNDACK message MUST have been received for using BNDACK messages. While a BNDACK message MUST have been
each IP address that was sent in a BNDUPD message, the BNDACK message received for each BNDUPD message prior to the transmission of the
could have contained a reject-reason in order to NAK that specific UPDDONE message, this doesn't necessarily mean that all of the BNDUPD
update. messages were accepted, only that all of them were responded to with
a BNDACK message. Thus, a NAK (comprised of a BNDACK message con-
Thus, this message confirms that the requesting server has received taining a reject-reason option) could be used to reject a BNDUPD, but
and responded to a BNDUPD message for all of the requested updates, for the purposes of the UPDDONE message, such NAK would count as a
but it does require the requesting server to accept all of the response to the associated BNDUPD message, and would not block the
offered updates. eventual transmission of the UPDDONE message.
The message type for the UPDDONE message is 8. The message type for the UPDDONE message is 8.
The xid in an UPDDONE message MUST be identical to the xid in the The xid in an UPDDONE message MUST be identical to the xid in the
UPDREQ or UPDREQALL message that initiated the update process. UPDREQ or UPDREQALL message that initiated the update process.
There are no options that MUST appear in an UPDDONE message. Any There are no options that MUST appear in an UPDDONE message. Any
option MAY appear. option MAY appear, though very few will likely be useful.
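The completion rule for UPDDONE can be illustrated with a small sketch. It is not part of either draft; the class and method names are hypothetical, and it assumes each BNDUPD is tracked by its own xid. The point it shows is that a BNDACK carrying a reject-reason still counts as a response, so a NAK does not block the UPDDONE.

```python
class UpdateResponder:
    """Hypothetical bookkeeping for answering one UPDREQ or UPDREQALL."""

    def __init__(self, request_xid):
        self.request_xid = request_xid  # xid of the UPDREQ/UPDREQALL being served
        self.outstanding = set()        # xids of BNDUPDs not yet answered
        self.rejected = {}              # BNDUPD xid -> reject-reason from a NAK
        self.all_sent = False

    def bndupd_sent(self, xid):
        self.outstanding.add(xid)

    def finished_sending(self):
        # Called once every requested update has gone out as a BNDUPD.
        self.all_sent = True

    def bndack_received(self, xid, reject_reason=None):
        # Accepting and rejecting BNDACKs both count as a response.
        self.outstanding.discard(xid)
        if reject_reason is not None:
            self.rejected[xid] = reject_reason

    def upddone_xid(self):
        """Return the xid to use in UPDDONE, or None if it cannot be sent yet."""
        if self.all_sent and not self.outstanding:
            return self.request_xid  # MUST equal the initiating request's xid
        return None
```

Recording the rejected updates separately keeps the "responded to" test independent of whether the partner accepted each update.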
6.9. POOLREQ message format 6.9. POOLREQ message format
The pool request (POOLREQ) is used by the secondary server to request The pool request (POOLREQ) is used by the secondary server to request
an allocation of IP addresses from the primary server. an allocation of IP addresses from the primary server.
The message type for the POOLREQ message is 1. The message type for the POOLREQ message is 1.
The xid in a POOLREQ message MUST be unique among messages transmit- The xid in a POOLREQ message MUST be unique among messages transmit-
ted from this failover endpoint during the life of this connection. ted from this failover endpoint during the life of this connection.
There are no options that MUST appear in a POOLREQ message. Any There are no options that MUST appear in a POOLREQ message. Any
option MAY appear. option MAY appear.
6.10. POOLRESP message format 6.10. POOLRESP message format
The pool response (POOLRESP) is used by the primary server to inform The pool response (POOLRESP) is used by the primary server to inform
the secondary server how many IP addresses it was allocated as the the secondary server how many IP addresses were allocated to the
result of a pool request. secondary server as the result of the pool request.
The message type for the POOLRESP message is 2. The message type for the POOLRESP message is 2.
The xid in the POOLRESP message MUST be identical to the xid in the The xid in the POOLRESP message MUST be identical to the xid in the
POOLREQ message for which this POOLRESP is a response. POOLREQ message for which this POOLRESP is a response.
The following table shows the options that MUST appear in a POOLRESP The following table shows the options that MUST appear in a POOLRESP
message: message:
Option Option
------ ------
addresses-transferred MUST addresses-transferred MUST
Table 6.10-1: Options used in a STATE message Table 6.10-1: Options used in a POOLREQ message
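The xid rules for request and response pairs such as POOLREQ and POOLRESP can be sketched as follows. This is an illustration only: the class name and methods are hypothetical, and it assumes a simple increasing counter that never wraps during the life of a connection, which neither draft requires.

```python
import itertools


class FailoverEndpoint:
    """Hypothetical per-connection xid bookkeeping."""

    def __init__(self):
        # A fresh counter per connection keeps xids unique for its lifetime
        # (assumption: the counter does not wrap while the connection is up).
        self._next_xid = itertools.count(1)
        self._pending = {}  # xid -> name of the request awaiting a response

    def send_request(self, name):
        xid = next(self._next_xid)
        self._pending[xid] = name
        return xid

    def match_response(self, xid):
        # A response carries the xid of the request it answers; an xid that
        # matches no pending request yields None.
        return self._pending.pop(xid, None)
```

For example, a POOLRESP whose xid matches no outstanding POOLREQ would return None from match_response and could be logged or dropped by the caller.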
6.11. CONNECT message format 6.11. CONNECT message format
The connect (CONNECT) message is used by either server to establish a The connect (CONNECT) message is used by the primary server to estab-
high level connection with the other server, and to transmit several lish a high level connection with the other server, and to transmit
important configuration data items between the servers. several important configuration data items between the servers.
The message type for the CONNECT message is 5. The message type for the CONNECT message is 5.
The xid in a CONNECT message MUST be unique among messages transmit- The xid in a CONNECT message MUST be unique among messages transmit-
ted from this failover endpoint during the life of this connection. ted from this failover endpoint during the life of this connection.
The CONNECT message MUST be the first message sent down a newly esta- The CONNECT message MUST be the first message sent down a newly esta-
blished connection. blished connection.
The following table summarizes the options that are associated with The following table summarizes the options that are associated with
the CONNECT message: the CONNECT message:
role Option
------
Option primary secondary sending-server-IP-address MUST
------ ------ --------- max-unacked-bndupd MUST
sending-server-IP-address MUST MUST receive-timer MUST
server-role MUST MUST vendor-class-identifier MUST
max-unacked-bndupd MUST MUST protocol-version MUST
receive-timer MUST MUST TLS-request MUST
current-time MUST MUST MCLT MUST
vendor-class-identifier MUST MUST hash-bucket-assignment MUST
protocol-version MUST MUST all others MAY
TLS-request MUST(1) MUST(1)
MCLT MUST MUST NOT
hash-bucket-assignment MUST MUST NOT
all others MAY MAY
(1) If the CONNECT message is being sent on a TLS secured connection,
then there MUST NOT be a TLS-request option.
Table 6.11-1: Options used in a CONNECT message Table 6.11-1: Options used in a CONNECT message
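A receiver could check an incoming CONNECT against the table above along the following lines. The sketch follows the right-hand (-05) column, where every listed option is a MUST; representing options as a dictionary keyed by option name is an assumption made for illustration, not the wire format.

```python
# Options marked MUST for CONNECT in the -05 column of the table.
CONNECT_REQUIRED = frozenset({
    "sending-server-IP-address",
    "max-unacked-bndupd",
    "receive-timer",
    "vendor-class-identifier",
    "protocol-version",
    "TLS-request",
    "MCLT",
    "hash-bucket-assignment",
})


def missing_connect_options(options):
    """Return the sorted names of required options absent from `options`.

    `options` is a hypothetical mapping of option name -> decoded value.
    Any other options MAY appear and are ignored here.
    """
    return sorted(CONNECT_REQUIRED - set(options))
```

An empty result means the message carries every required option; a non-empty result could feed a reject-reason in the CONNECTACK.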
6.12. CONNECTACK message format 6.12. CONNECTACK message format
The connect response (CONNECTACK) message is used by a server to The connect response (CONNECTACK) message is used by a secondary
respond to the receipt of a CONNECT message. server to respond to the receipt of a CONNECT message from the pri-
mary server.
The message type for the CONNECTACK message is 6. The message type for the CONNECTACK message is 6.
The xid in the CONNECTACK message MUST be identical to the xid in the The xid in the CONNECTACK message MUST be identical to the xid in the
CONNECT message for which this CONNECTACK is a response. CONNECT message for which this CONNECTACK is a response.
The following table summarizes the options associated with the CON- The following table summarizes the options associated with the CON-
NECTACK message: NECTACK message:
Option Option
------ ------
sending-server-IP-address MUST sending-server-IP-address MUST
server-role MUST
max-unacked-bndupd MUST max-unacked-bndupd MUST
receive-timer MUST receive-timer MUST
current-time MUST
vendor-class-identifier MUST vendor-class-identifier MUST
protocol-version MUST protocol-version MUST
TLS-reply MUST(1) TLS-request MUST
reject-reason MAY(2) reject-reason MAY(1)
message MAY message MAY
MCLT MUST NOT
hash-bucket-assignment MUST NOT
(1) If the CONNECTACK is being sent over an already TLS secured (1) Indicates a rejection of the CONNECT message.
connection, then the TLS-reply option MUST NOT appear.
(2) Indicates a rejection of the CONNECT message.
Table 6.12-1: Options used in a CONNECTACK message Table 6.12-1: Options used in a CONNECTACK message
6.13. STATE message format 6.13. STATE message format
The state (STATE) message is used by either server to communicate the The state (STATE) message is used by either server to communicate the
current state of the failover endpoint with the other server. It current state of the failover endpoint with the other server. It
MUST be sent immediately after a connection is established with MUST be sent immediately after connection negotiation completes with
another server, and it MUST be sent whenever the server's state the other server, and it MUST be sent whenever the server's state
changes. changes.
The message type for the STATE message is 10. The message type for the STATE message is 10.
The xid in a STATE message MUST be unique among messages transmitted The xid in a STATE message MUST be unique among messages transmitted
from this failover endpoint during the life of this connection. from this failover endpoint during the life of this connection.
The following table shows the options that MUST appear in a STATE The following table shows the options that MUST appear in a STATE
message: message:
skipping to change at page 48, line 26 skipping to change at page 64, line 15
6.14. CONTACT message format 6.14. CONTACT message format
The contact (CONTACT) message is used by either server to verify that The contact (CONTACT) message is used by either server to verify that
the connection is operational to the other server. the connection is operational to the other server.
The message type for the CONTACT message is 11. The message type for the CONTACT message is 11.
The xid in a CONTACT message MUST be unique among messages transmit- The xid in a CONTACT message MUST be unique among messages transmit-
ted from this failover endpoint during the life of this connection. ted from this failover endpoint during the life of this connection.
The following table shows the options that MUST appear in a CONTACT There are no options that MUST be used in a CONTACT message.
message:
6.15. DISCONNECT message format
The disconnect (DISCONNECT) message is used by either server just
prior to closing a connection.
The message type for the DISCONNECT message is 12.
The xid in a DISCONNECT message MUST be unique among messages
transmitted from this failover endpoint during the life of this con-
nection.
The DISCONNECT message MUST be the last message sent down a connec-
tion before it is closed.
The following table summarizes the options that are associated with
the DISCONNECT message:
Option Option
------ ------
current-time MUST reject-reason MUST
message SHOULD
Table 6.14-1: Options used in a CONTACT message Table 6.15-1: Options used in a DISCONNECT message
7. Protocol Messages 7. Protocol Messages
This section contains the detailed definition of the protocol mes- This section contains the detailed definition of the protocol mes-
sages, including the information to include when sending the message, sages, including the information to include when sending the message,
as well as the actions to take upon receiving the message. as well as the actions to take upon receiving the message.
7.1. BNDUPD message 7.1. BNDUPD message
The binding update (BNDUPD) message is used to send the binding data- The binding update (BNDUPD) message is used to send the binding data-
base changes to the partner server, and the partner server responds base changes to the partner server, and the partner server responds
with a binding acknowledgement (BNDACK) message when it has success- with a binding acknowledgement (BNDACK) message when it has success-
fully commited those changes to its own stable storage. fully committed those changes to its own stable storage.
The rest of the failover protocol exists to determine whether the The rest of the failover protocol exists to determine whether the
partner server is able to communicate or not, and to enable the partner server is able to communicate or not, and to enable the
partners to exchange BNDUPD/BNDACK messages in order to keep their partners to exchange BNDUPD/BNDACK messages in order to keep their
binding databases in stable storage synchronized. binding databases in stable storage synchronized.
7.1.1. Sending the BNDUPD message 7.1.1. Sending the BNDUPD message
A BNDUPD message SHOULD be generated whenever any binding changes. A A BNDUPD message SHOULD be generated whenever any binding changes. A
change might be in the binding-status, the lease-expiration-time, or change might be in the binding-status, the lease-expiration-time, or
skipping to change at page 49, line 41 skipping to change at page 65, line 43
mation that should go into the BNDUPD message because of them. mation that should go into the BNDUPD message because of them.
o ACTIVE o ACTIVE
Indicates that the IP address is currently leased to a DHCP Indicates that the IP address is currently leased to a DHCP
client. client.
client-hardware-address client-hardware-address
The client-hardware-address option MUST appear, and be set from The client-hardware-address option MUST appear, and be set from
the MAC address of the DHCP client to which this IP address is the htype and chaddr of the DHCP client to which this IP address
leased. is leased.
client-identifier client-identifier
If the DHCP client to which this IP address is leased used a If the DHCP client to which this IP address is leased used a
client-identifier option to identify itself, then the client- client-identifier option to identify itself, then the client-
identifier MUST appear in the BNDUPD message, else it MUST NOT identifier MUST appear in the BNDUPD message, else it MUST NOT
appear. appear.
lease-expiration-time lease-expiration-time
The lease-expiration-time option MUST appear, and be set to the The lease-expiration-time option MUST appear, and be set to the
expiration time most recently ACKed to the DHCP client. Note expiration time most recently ACKed to the DHCP client. Note
that the time ACKed to a DHCP client is a lease duration in that the time ACKed to a DHCP client is a lease duration in
seconds, while the lease-expiration-time option in a BNDUPD mes- seconds, while the lease-expiration-time option in a BNDUPD mes-
sage is an absolute time value. sage is an absolute time value.
potential-expiration-time potential-expiration-time
The potential-expiration-time option MUST appear, and be set to The potential-expiration-time option MUST appear, and be set to
a value beyond that of the lease-expiration time. This is the a value beyond that of the lease-expiration time. This is the
skipping to change at page 50, line 36 skipping to change at page 66, line 39
an IP address has expired and the server does not wish to imple- an IP address has expired and the server does not wish to imple-
ment an expired-grace period. When the partner server ACK's the ment an expired-grace period. When the partner server ACK's the
BNDUPD of an EXPIRED IP address, the server sets its internal BNDUPD of an EXPIRED IP address, the server sets its internal
state to FREE. It is then available for allocation to any client state to FREE. It is then available for allocation to any client
of the primary server. of the primary server.
client-hardware-address client-hardware-address
There SHOULD be a DHCP client associated with the IP address There SHOULD be a DHCP client associated with the IP address
whose binding has expired. If there is, then the client- whose binding has expired. If there is, then the client-
hardware-address option MUST appear, and be set from the MAC hardware-address option MUST appear, and be set from the htype
address of the DHCP client to which this IP address was leased. and chaddr of the DHCP client to which this IP address was
leased.
client-identifier client-identifier
There SHOULD be a DHCP client associated with the IP address There SHOULD be a DHCP client associated with the IP address
whose binding has expired. If there is, then if the DHCP client whose binding has expired. If there is, then if the DHCP client
to which this IP address was leased used a client-identifier to which this IP address was leased used a client-identifier
option to identify itself, then the client-identifier MUST option to identify itself, then the client-identifier MUST
appear in the BNDUPD message, else it MUST NOT appear. appear in the BNDUPD message, else it MUST NOT appear.
o RELEASED o RELEASED
skipping to change at page 51, line 12 skipping to change at page 67, line 15
a DHCPRELEASE message and the server does not wish to implement a DHCPRELEASE message and the server does not wish to implement
a released-grace period. When the partner server ACK's the a released-grace period. When the partner server ACK's the
BNDUPD of a RELEASED IP address, the server sets its internal BNDUPD of a RELEASED IP address, the server sets its internal
state to FREE, and it is available for allocation by the primary state to FREE, and it is available for allocation by the primary
server to any DHCP client. server to any DHCP client.
client-hardware-address client-hardware-address
There SHOULD be a DHCP client associated with the IP address There SHOULD be a DHCP client associated with the IP address
whose binding has been released. If there is, then the client- whose binding has been released. If there is, then the client-
hardware-address option MUST appear, and be set from the MAC hardware-address option MUST appear, and be set from the htype
address of the DHCP client which released this IP address. and chaddr of the DHCP client which released this IP address.
client-identifier client-identifier
There SHOULD be a DHCP client associated with the IP address There SHOULD be a DHCP client associated with the IP address
whose binding has been released. If there is, then if the DHCP whose binding has been released. If there is, then if the DHCP
client which released this IP address used a client-identifier client which released this IP address used a client-identifier
option to identify itself, then the client-identifier MUST option to identify itself, then the client-identifier MUST
appear in the BNDUPD message, else it MUST NOT appear. appear in the BNDUPD message, else it MUST NOT appear.
o FREE o FREE
skipping to change at page 51, line 37 skipping to change at page 67, line 40
another server, but it was not just released, expired, or reset another server, but it was not just released, expired, or reset
by a network administrator. When the partner server ACK's the by a network administrator. When the partner server ACK's the
BNDUPD of a FREE IP address, the server sets its internal state BNDUPD of a FREE IP address, the server sets its internal state
such that it is available for allocation by any DHCP client. such that it is available for allocation by any DHCP client.
client-hardware-address client-hardware-address
There MAY be a DHCP client associated with the IP address whose There MAY be a DHCP client associated with the IP address whose
binding is now desired to be FREE. If there is, then the binding is now desired to be FREE. If there is, then the
client-hardware-address option MUST appear, and be set from the client-hardware-address option MUST appear, and be set from the
MAC address of the DHCP client which released this IP address. htype and chaddr of the DHCP client which released this IP
address.
client-identifier client-identifier
There MAY be a DHCP client associated with the IP address whose There MAY be a DHCP client associated with the IP address whose
binding is now desired to be FREE. If there is, then if the binding is now desired to be FREE. If there is, then if the
DHCP client which released this IP address used a client- DHCP client which released this IP address used a client-
identifier option to identify itself, then the client-identifier identifier option to identify itself, then the client-identifier
MUST appear in the BNDUPD message, else it MUST NOT appear. MUST appear in the BNDUPD message, else it MUST NOT appear.
o EXPIRED-GRACE
Some servers support a grace period after lease expiration, to
handle clock speed differences between clients and servers as
well as to limit the number of times names are removed and
subsequently added to dynamic DNS.
client-hardware-address client-hardware-address
There MAY be a DHCP client associated with the IP address whose There MAY be a DHCP client associated with the IP address whose
binding has now expired. If there is, then the client- binding has now expired. If there is, then the client-
hardware-address option MUST appear, and be set from the MAC hardware-address option MUST appear, and be set from the htype
address of the DHCP client which released this IP address. and chaddr of the DHCP client which released this IP address.
client-identifier client-identifier
There MAY be a DHCP client associated with the IP address whose There MAY be a DHCP client associated with the IP address whose
binding hs now expired. If there is, then if the DHCP client binding has now expired. If there is, then if the DHCP client
which most recently leased this IP address used a client- which most recently leased this IP address used a client-
identifier option to identify itself, then the client-identifier identifier option to identify itself, then the client-identifier
MUST appear in the BNDUPD message, else it MUST NOT appear. MUST appear in the BNDUPD message, else it MUST NOT appear.
grace-expiration-time grace-expiration-time
The grace-expiration-time option MUST appear, and is the length The grace-expiration-time option MUST appear, and is the length
of time that this server will wait before trying to make the IP of time that this server will wait before trying to make the IP
address available after the lease has expired for this IP address available after the lease has expired for this IP
address. address.
o RELEASED-GRACE
Some servers support a grace period after lease release by a
DHCP client, to handle clock speed differences between clients
and servers as well as to limit the number of times names are
removed and subsequently added to dynamic DNS.
client-hardware-address client-hardware-address
There MAY be a DHCP client associated with the IP address whose There MAY be a DHCP client associated with the IP address whose
binding has now been released by sending a DHCPRELEASE. If binding has now been released by sending a DHCPRELEASE. If
there is, then the client-hardware-address option MUST appear, there is, then the client-hardware-address option MUST appear,
and be set from the MAC address of the DHCP client which and be set from the htype and chaddr of the DHCP client which
released this IP address. released this IP address.
client-identifier client-identifier
There MAY be a DHCP client associated with the IP address whose There MAY be a DHCP client associated with the IP address whose
binding has been released. If there is, then if the DHCP client binding has been released. If there is, then if the DHCP client
which most recently leased this IP address used a client- which most recently leased this IP address used a client-
identifier option to identify itself, then the client-identifier identifier option to identify itself, then the client-identifier
MUST appear in the BNDUPD message, else it MUST NOT appear. MUST appear in the BNDUPD message, else it MUST NOT appear.
skipping to change at page 53, line 4 skipping to change at page 68, line 41
client-identifier client-identifier
There MAY be a DHCP client associated with the IP address whose There MAY be a DHCP client associated with the IP address whose
binding has been released. If there is, then if the DHCP client binding has been released. If there is, then if the DHCP client
which most recently leased this IP address used a client- which most recently leased this IP address used a client-
identifier option to identify itself, then the client-identifier identifier option to identify itself, then the client-identifier
MUST appear in the BNDUPD message, else it MUST NOT appear. MUST appear in the BNDUPD message, else it MUST NOT appear.
client-hardware-address client-hardware-address
There MAY be a DHCP client associated with the IP address whose There MAY be a DHCP client associated with the IP address whose
binding is now desired to be FREE. If there is, then the binding is now desired to be FREE. If there is, then the
client-hardware-address option MUST appear, and be set from the client-hardware-address option MUST appear, and be set from the
MAC address of the DHCP client which released this IP address. htype and chaddr of the DHCP client which released this IP
address.
client-identifier client-identifier
There MAY be a DHCP client associated with the IP address whose There MAY be a DHCP client associated with the IP address whose
binding is now desired to be FREE. If there is, then if the binding is now desired to be FREE. If there is, then if the
DHCP client which released this IP address used a client- DHCP client which released this IP address used a client-
identifier option to identify itself, then the client-identifier identifier option to identify itself, then the client-identifier
MUST appear in the BNDUPD message, else it MUST NOT appear. MUST appear in the BNDUPD message, else it MUST NOT appear.
grace-expiration-time grace-expiration-time
skipping to change at page 54, line 9 skipping to change at page 69, line 47
o BACKUP o BACKUP
The BACKUP value of binding-status indicates that this IP The BACKUP value of binding-status indicates that this IP
address belongs to the secondary server, and can be allocated by address belongs to the secondary server, and can be allocated by
that server to a DHCP client at any time. that server to a DHCP client at any time.
client-hardware-address client-hardware-address
There MAY be a DHCP client associated with a BACKUP IP address. There MAY be a DHCP client associated with a BACKUP IP address.
If there is, the client-hardware-address option MUST appear, and If there is, the client-hardware-address option MUST appear, and
be set from the MAC address of the DHCP client to which this IP be set from the htype and chaddr of the DHCP client to which
address was most recently associated. this IP address was most recently associated.
client-identifier client-identifier
There MAY be a DHCP client associated with this IP address. If There MAY be a DHCP client associated with this IP address. If
the DHCP client to which this IP address is leased used a the DHCP client to which this IP address is leased used a
client-identifier option to identify itself, then the client- client-identifier option to identify itself, then the client-
identifier MUST appear in the BNDUPD message, else it MUST NOT identifier MUST appear in the BNDUPD message, else it MUST NOT
appear. appear.
The following option information is generic to all BNDUPD messages, The following option information is generic to all BNDUPD messages,
regardless of the value of the binding-status. regardless of the value of the binding-status.
o start-time-of-state o start-time-of-state
skipping to change at page 54, line 37 skipping to change at page 70, line 27
the current value of binding-status. the current value of binding-status.
o last-transaction-time o last-transaction-time
The last-transaction-time value SHOULD appear. This is the time at The last-transaction-time value SHOULD appear. This is the time at
which this DHCP server last received a packet from the DHCP client which this DHCP server last received a packet from the DHCP client
referenced by the client-identifier or client-hardware-address that referenced by the client-identifier or client-hardware-address that
was associated with the IP address referenced by the assigned-IP- was associated with the IP address referenced by the assigned-IP-
address. address.
o client-FQDN o DDNS
If the DHCP server is performing dynamic DNS operations on behalf If the DHCP server is performing dynamic DNS operations on behalf
of the DHCP client represented by the client-identifier or client- of the DHCP client represented by the client-identifier or client-
hardware-address, then it should include a client-FQDN option con- hardware-address, then it should include a DDNS option containing
taining the host name, domain name, and status of any dynamic DNS the host name, domain name, and status of any dynamic DNS opera-
operations enabled. tions enabled.
o client-request-options
If the BNDUPD was triggered by a request from a DHCP client (typi-
cally those with binding-status of ACTIVE and RELEASED), then the
server SHOULD include options of interest to a failover partner
from the client's request packet in the client-request-options for
transmission to its partner.
A server sending a BNDUPD need not remember the "interesting"
options or the information that would appear in an "interesting"
option for transmission at a time when the BNDUPD is not closely
associated with a DHCP client request.
A server SHOULD send the following "interesting" options. It MAY
send any DHCP client options. As new options are defined, the RFC
defining these options SHOULD include information that they are
"interesting to failover servers" if they should be sent as part of
a BNDUPD.
option option
number name
-----------------------------------------
12 host-name
81 client-FQDN [DDNS]
82 relay-agent-information [AGENTINFO]
TBD user-class [USERCLASS]
60 vendor-class-identifier
Table 7.1.1-1: Options which SHOULD be sent in
the client-request-options option in a BNDUPD message.
o client-reply-options
If the BNDUPD was triggered by a request from a DHCP client (typi-
cally those with binding-status of ACTIVE and RELEASED), then the
server SHOULD include options of interest to a failover partner
from the server's DHCP reply packet in the client-reply-options for
transmission to its partner.
A server sending a BNDUPD need not remember the "interesting"
options or the information that would appear in an "interesting"
option for transmission at a time when the BNDUPD is not closely
associated with a DHCP client request.
A server SHOULD send the following "interesting" options. It MAY
send any DHCP client options. As new options are defined, the RFC
defining these options SHOULD include information that they are
"interesting to failover servers" if they should be sent as part of
a BNDUPD.
option option
number name
-----------------------------------------
58 renewal-time
59 rebinding-time
Table 7.1.1-2: Options which SHOULD be sent in
the client-reply-options option in a BNDUPD message.
The BNDUPD message SHOULD be sent as soon as possible from the time The BNDUPD message SHOULD be sent as soon as possible from the time
that the DHCP client received a response and the lease bindings data- that the DHCP client received a response and the lease bindings data-
base is written on stable storage. base is written on stable storage.
7.1.2. Receiving the BNDUPD message 7.1.2. Receiving the BNDUPD message
When a server receives a BNDUPD message, it needs to decide how to When a server receives a BNDUPD message, it needs to decide how to
process the message and whether the message represents a conflict process the message and whether the message represents a conflict
of any sort. The conflict resolution process is used on the receipt of any sort. The conflict resolution process SHOULD be used on the
of every BNDUPD message, not just those that are received while in receipt of every BNDUPD message, not just those that are received
POTENTIAL-CONFLICT state, in order to increase the robustness of the while in POTENTIAL-CONFLICT state, in order to increase the robust-
protocol. ness of the protocol.
There are two sorts of conflict. The first, more major conflict, is There are three sorts of conflicts:
when a server receives a BNDUPD message from its partner for an
ACTIVE IP address and finds that the client specified in the BNDUPD
message is different from the client associated with this ACTIVE IP
address in this server's bindings database.
The second sort of conflict is where the receiving server has in its o Two clients one IP address conflict
bindings database the client specified in the BNDUPD message associ-
ated with a different IP address.
These two conflict cases can both occur together with the same BNDUPD This is the duplicate IP address allocation conflict. There are
message. two different clients each allocated the same address. There
cannot be a client conflict unless there is a client specified
in the BNDUPD message. See section 5.10.1 for how to resolve
this conflict.
When receiving a BNDUPD message, the server first determines the IP o Two IP addresses one client conflict
address from the assigned-IP-address option, and then determines if
there was any client associated with this IP address by looking for
the client-identifier option. If there is no client-identifier
option, then the server looks for a client-hardware-address option,
and ultimately determines the client's identity specified in the
BNDUPD.
The client specified in the BNDUPD message is compared to the client This conflict exists when a client on one server is associated
currently associated with the IP address in this server's bindings with one IP address, and on the other server with a different
database. If they are the same, continue. If there is no client in IP address in the same or a related subnet. This does not refer
this server's binding database, continue. If there is a client in to the case where a single client has addresses in multiple dif-
this server's bindings database, and it is different from that speci- ferent subnets or administrative domains, but rather the case
fied in the BNDUPD message, a 'client conflict' exists. See the sec- where on the same subnet the client has as lease on one IP
tion below on conflict resolution. If the client specified in the address in one server and on a different IP address on the other
BNDUPD message is associated with a different IP address in this server.
server's bindings database in the same subnet, then an 'IP address
conflict' exists. This does not refer to the case where a single
client has addresses in multiple different subnets or administrative
domains, but rather the case where in the same subnet the client has
a lease on one IP address in one server and on a different IP
address on the other server. See the section below on conflict reso-
lution.
If none of the conflicts mentioned above exist, then develop a time This conflict may or may not be a problem for a given DHCP
for both the BNDUPD message and the server's information. server implementation. In the event that a DHCP server requires
that a DHCP client have only one outstanding lease for an IP
address on one subnet, this conflict should be resolved by
accepting the update which has the latest client-last-
transaction-time.
The time for both the BNDUPD and the server's information are o binding-status conflict
developed independently in the following way: If there is a client-
last-transaction time, use that. If there isn't, but there is a
start-time-of-state, use that. If there isn't, but there is a
client-expiration-time, use that. If there isn't, then use the time
the BNDUPD message was received for a BNDUPD message, and the current
time for the server's information.
Then the server determines the binding-status in the BNDUPD, and This is the normal conflict, where one server is updating the other
takes the following actions based on binding-status: with newer information. See section 5.10.1 for details of how
to resolve these conflicts.
(In the following list, to "accept" a BNDUPD means to update the See section 5.10.1 for details of how to process binding-status
server's bindings database with the information contained in the changes in BNDUPD messages.
BNDUPD and once that update is complete, send a BNDACK message
corresponding to the BNDUPD message).
o ACTIVE in BNDUPD 7.1.3. Accepting the BNDUPD message
If the BNDUPD is LATER than the server's information, accept it, When accepting a BNDUPD message, the information contained in the
else reject it. client-request-options and client-reply-options SHOULD be examined
for any information of interest to this server. For instance, a
server which wished to detect changes in client specified host names
might want to examine and save information from the host-name or
client-FQDN options. Servers which expect to utilize information
from the relay-agent-information option would want to store this
information.
o EXPIRED or EXPIRED-GRACE in BNDUPD 7.1.4. Time values related to the BNDUPD message
If the binding-status in the receiving server's bindings data- There are three time values that may be sent in a BNDUPD message.
base is ACTIVE, then reject the BNDUPD. Otherwise, accept the
BNDUPD.
If the binding-status in the BNDUPD is EXPIRED-GRACE and the o lease-expiration-time
server receiving the BNDUPD does not implement a grace period
for expired leases, then the server MUST set its lease expira-
tion to value held in the grace-expiration in the BNDUPD.
o RELEASED or RELEASED-GRACE in BNDUPD The time that the server gave to the client, i.e., the time that
the server believes that the client's lease will expire.
If the BNDUPD is LATER than the server's information, accept it, o potential-expiration-time
else reject it.
If the binding-status in the BNDUPD is RELEASED-GRACE and the The time that the server wants to be sure its partner waits
server receiving the BNDUPD does not implement a grace period (added to the MCLT) before assuming that this lease has expired.
for released leases, then the server MUST set its lease expira- Typically some time beyond the desired client lease time.
tion to value held in the grace-expiration in the BNDUPD.
o FREE or BACKUP in BNDUPD o client-last-transaction-time
If the binding-status in the receiving server's database is The time that the client last interacted with this server.
ACTIVE and the lease-expiration-time has not yet been reached,
reject it, else accept it.
o RESET or ABANDONED in BNDUPD As discussed in section 5.2, each server knows what its partner has
ACKed with regard to potential-expiration time. In addition, each
server needs to remember what it has told its partner as the
potential-expiration-time. Moreover, each server must remember what
it has acked to the *other* server as the most recent potential-
expiration-time from that server.
Accept it under all circumstances. Remember that each server sends a potential-expiration-time and
receives an ACK for that as well as receiving a potential-
expiration-time and needing to remember what it has acked for that.
7.1.3. Conflict resolution when receiving the BNDUPD message While they don't have to be named in any particular way, the times
that a server needs to remember for every IP address in order to
implement the failover protocol are:
When a either of the following conflicts exists between the informa- o lease-expiration-time
tion in a BNDUPD message and the information held in the receiving The time that this server gave to the DHCP client. A DHCP
server's bindings database, it should be resolved in the following server needs to remember this time already, just to be a DHCP
manner: server.
o client conflict o sent-potential-expiration-time
This is the duplicate IP address allocation conflict. There are The latest time sent to the partner for a potential-expiration-
two different clients each allocated the same address. time.
If times for both exist, use the LATER update, else use the o acked-potential-expiration-time
information from the primary server.
o IP address conflict The latest time that the partner has acked for a potential
expiration time. Typically the same as sent-potential-
expiration-time if there is not a BNDUPD outstanding.
An IP address conflict exists when a client on one server is o received-potential-expiration-time
associated with a one IP address, and on the other server with a
different IP address in the same or a related subnet. If one The latest time that this server has ever received as a
binding-status is ACTIVE and the other is anything but ACTIVE, potential-expiration-time from its partner in a BNDUPD that this
then the information in the ACTIVE binding SHOULD be used. Oth- server ACKed.
erwise, if times exist, then the LATER SHOULD be used. Other-
wise, if times do not exist, then the information from the pri- So, a server has to remember two additional times concerning BNDUPD
mary server should be used. messages that it has initiated, and one additional time concerning
BNDUPD messages that it has received. How are these times used?
First, let's look at the time that a DHCP server can offer to a DHCP
client. A server can offer to a DHCP client a time that is no
longer than the MCLT beyond the max( received-potential-expiration-
time, acked-potential-expiration-time). One might think that the
server should be able to offer only the MCLT beyond the acked-
potential-expiration-time, and while that is certainly simple and
easy to understand, it has negative consequences in actual operation.
To illustrate this, in the simple case where the primary updates the
secondary for a while and then fails, if the secondary can then renew
the client for only the MCLT beyond the acked-potential-expiration-
time, then the secondary will only be able to renew the client for
the MCLT, because the secondary has never sent a BNDUPD packet to the
primary concerning this IP address and client, and so its acked-
potential-expiration-time is zero.
However, if we allow the secondary to renew the client with the MCLT
beyond the max( received-potential-expiration-time, acked-potential-
expiration-time), then the secondary can usually renew the client for
the full lease period, at least for the first renew it sees from the
client, since the received-potential-expiration-time is generally
longer than the client's desired lease interval. The difference in
renew times could make a big difference in server load on the
secondary in this case.
What are the consequences of allowing a server to offer a DHCP client
a lease term of the MCLT beyond the max( received-potential-
expiration-time, acked-potential-expiration-time)? The consequences
appear whenever a server enters PARTNER-DOWN state, and affect how
long that server has to wait before reallocating expired leases.
With this approach, when a server goes into PARTNER-DOWN state, it
must wait the MCLT beyond the max( lease-expiration-time, sent-
potential-expiration-time, acked-potential-expiration-time,
received-potential-expiration-time ) for each IP address before it
can reallocate that IP address to another DHCP client. One might
normally think that it needed to wait only the MCLT beyond the max(
lease-expiration-time, received-potential-expiration-time ), i.e.,
beyond what it has told the client and what it has explicitly acked
to the other server. But with the optimization discussed above --
where either server can offer the DHCP client a lease term of the
MCLT beyond the max( received-potential-expiration-time, acked-
potential-expiration-time), then the additional times sent-
potential-expiration-time and acked-potential-expiration-time must be
added into the expression, since the partner could have used those
times as part of its own lease time calculation.
Thus this optimization may require a longer waiting time when enter-
ing PARTNER-DOWN state, but will generally allow servers to operate
considerably more effectively when running in COMMUNICATIONS-
INTERRUPTED state.
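The two expressions discussed in this section can be sketched in
Python as follows. This is only an illustration: the LeaseTimes class
and both function names are invented for the sketch, times are taken
to be absolute values in seconds with 0 meaning "never set", and the
use of the current time (now) as a floor in latest_offer_time is an
assumption, made so that a server with no recorded times can still
offer the MCLT as described above.

```python
from dataclasses import dataclass


@dataclass
class LeaseTimes:
    """Times remembered per IP address (absolute seconds; 0 means never set)."""
    lease_expiration_time: int = 0
    sent_potential_expiration_time: int = 0
    acked_potential_expiration_time: int = 0
    received_potential_expiration_time: int = 0


def latest_offer_time(t: LeaseTimes, mclt: int, now: int) -> int:
    """Latest expiration this server may offer to a DHCP client.

    Using `now` as a floor is an assumption of this sketch.
    """
    return mclt + max(now,
                      t.received_potential_expiration_time,
                      t.acked_potential_expiration_time)


def partner_down_wait_until(t: LeaseTimes, mclt: int) -> int:
    """Earliest time the address may be reallocated in PARTNER-DOWN state."""
    return mclt + max(t.lease_expiration_time,
                      t.sent_potential_expiration_time,
                      t.acked_potential_expiration_time,
                      t.received_potential_expiration_time)
```

For a secondary that has only ever received updates, the
received-potential-expiration-time dominates the first expression,
which is what lets it renew the client for more than the MCLT.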
7.2. BNDACK message 7.2. BNDACK message
Every BNDUPD message that is received by a server MUST be responded Every BNDUPD message that is received by a server MUST be responded
to with a corresponding BNDUPD message. The receiving server SHOULD to with a corresponding BNDACK message. The receiving server SHOULD
respond quickly to every BNDUPD message but it MAY choose to respond respond quickly to every BNDUPD message but it MAY choose to respond
preferentially to DHCP client requests instead of BNDUPD messages, preferentially to DHCP client requests instead of BNDUPD messages,
since there is no absolute time period within which a BNDACK must be since there is no absolute time period within which a BNDACK must be
sent in response to a BNDUPD message, and DHCP clients frequently do sent in response to a BNDUPD message, and DHCP clients frequently do
have time constraints that must be met. have time constraints that must be met.
A BNDACK message can only be sent in response to a BNDUPD message
using the same TCP connection from which the BNDUPD message was
received, since the XIDs in BNDUPD messages are guaranteed unique
only during the life of a single TCP connection. When a connection
to a partner server goes down, a server with unprocessed BNDUPD mes-
sages MAY simply drop all of those messages, since it can be sure
that the partner will retransmit them when they are next in communi-
cations. A server with unprocessed BNDUPD messages when a TCP con-
nection goes down MAY instead choose to process those BNDUPD mes-
sages, but it MUST NOT send any BNDACK messages in response (again
because of the issues surrounding XID uniqueness).
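One way to respect the connection-scoped XID rule is to keep the table
of unprocessed BNDUPD messages on the connection object itself, so
that it cannot outlive the TCP connection. The sketch below assumes a
hypothetical FailoverConnection class and a dictionary message format;
it takes the simpler of the two permitted choices and drops
unprocessed messages when the connection goes down.

```python
class FailoverConnection:
    """Tracks BNDUPD messages received on one TCP connection (hypothetical)."""

    def __init__(self):
        self.open = True
        self.unprocessed = {}  # xid -> BNDUPD options awaiting processing

    def receive_bndupd(self, xid, options):
        self.unprocessed[xid] = options

    def make_bndack(self, xid):
        """Build a BNDACK for `xid`; only legal while this connection is up."""
        if not self.open:
            raise RuntimeError("MUST NOT send BNDACK after the connection went down")
        options = self.unprocessed.pop(xid)
        return {"type": "BNDACK", "xid": xid, "options": options}

    def connection_lost(self):
        """Drop unprocessed BNDUPDs; the partner is expected to retransmit them."""
        self.open = False
        dropped = list(self.unprocessed)
        self.unprocessed.clear()
        return dropped
```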
7.2.1. Sending the BNDACK message 7.2.1. Sending the BNDACK message
The BNDACK message MUST contain the same xid as the corresponding The BNDACK message MUST contain the same xid as the corresponding
BNDUPD message. BNDUPD message.
All of the options which appear in the BNDUPD message MUST be All of the options which appear in the BNDUPD message MUST be
included in the BNDACK message. The values in the options MAY be included in the BNDACK message. The values in the options MAY be
updated to reflect current information on the server sending the updated to reflect current information on the server sending the
BNDACK. Note that update of this information may be used for infor- BNDACK. Note that update of this information may be used for infor-
mational purposes, but MUST NOT be assumed to necessarily be recorded mational purposes, but MUST NOT be assumed to necessarily be recorded
in the stable storage of the server who sent the BNDUPD message in the stable storage of the server who sent the BNDUPD message
because there is not corresponding ACK of the BNDACK message. Any because there is no corresponding ACK of the BNDACK message. Any
information that SHOULD be recorded in the partner server's stable information that SHOULD be recorded in the partner server's stable
storage MUST be transmitted in a subsequent BNDUPD. storage MUST be transmitted in a subsequent BNDUPD.
If the server is accepting the BNDUPD, the BNDACK message includes If the server is accepting the BNDUPD, the BNDACK message includes
only those options that appears in the BNDUPD message. If the server only those options that appeared in the BNDUPD message. If the server
is rejecting the BNDUPD, the additional option reject-reason MUST is rejecting the BNDUPD, the additional option reject-reason MUST
appear in the BNDACK message, and the message option SHOULD appear in appear in the BNDACK message, and the message option SHOULD appear in
this case containing a human-readable error message describing in this case containing a human-readable error message describing in
some detail the reason for the rejection of the BNDUPD message. some detail the reason for the rejection of the BNDUPD message.
If the server rejects the BNDUPD message with a BNDACK and a reject-
reason option, it may be because the server believes that it has
binding information that the other server should know. A server
which is rejecting a BNDUPD may initiate a BNDUPD of its own in order
to update its partner with what it believes is better binding infor-
mation, but it MUST ensure through some means that it will not end up
in a situation where each server is sending BNDUPD messages as fast as
possible because they can't agree on which server has better binding
data. Placing a reasonable delay on the initiation of a BNDUPD mes-
sage after sending a BNDACK with a reject-reason would be one way to
ensure this situation doesn't occur.
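The delay suggested above might be implemented along these lines. The
RejectBackoff class, its method names, and the 5 second default are
all assumptions of this sketch; the protocol does not specify a value
or a mechanism.

```python
class RejectBackoff:
    """Delays a counter-BNDUPD after this server rejected its partner's BNDUPD."""

    def __init__(self, delay_seconds=5.0):
        # The 5 second default is an arbitrary value chosen for illustration.
        self.delay_seconds = delay_seconds
        self.last_reject = {}  # IP address -> time the rejecting BNDACK was sent

    def note_reject(self, ip, now):
        self.last_reject[ip] = now

    def may_send_bndupd(self, ip, now):
        """True once the delay since the last reject for `ip` has elapsed."""
        sent = self.last_reject.get(ip)
        return sent is None or now - sent >= self.delay_seconds
```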
7.2.2. Receiving the BNDACK message 7.2.2. Receiving the BNDACK message
When a server receives a BNDACK message, if it doesn't contain a When a server receives a BNDACK message, if it doesn't contain a
reject-reason option that means that the BNDUPD message was accepted, reject-reason option that means that the BNDUPD message was accepted,
and the server which sent the BNDUPD MUST update its stable storage and the server which sent the BNDUPD MUST update its stable storage
with the potential-expiration-time value sent in the BNDUPD message with the potential-expiration-time value sent in the BNDUPD message
and returned in the BNDACK message. Other values sent in the BNDUPD and returned in the BNDACK message. Other values sent in the BNDUPD
message MAY be used as desired. message MAY be used as desired.
7.3. UPDREQ message 7.3. UPDREQ message
skipping to change at page 58, line 40 skipping to change at page 77, line 18
ACKed binding database information held by the other server by using ACKed binding database information held by the other server by using
the UPDREQ message. the UPDREQ message.
The UPDREQ message is used whenever the sending server cannot proceed The UPDREQ message is used whenever the sending server cannot proceed
before it has processed all previously un-ACKed binding update infor- before it has processed all previously un-ACKed binding update infor-
mation, since the UPDREQ message should yield a corresponding UPDDONE mation, since the UPDREQ message should yield a corresponding UPDDONE
message. The UPDDONE message is not sent until the server that sent message. The UPDDONE message is not sent until the server that sent
the UPDREQ message has responded to all of the BNDUPD messages gen- the UPDREQ message has responded to all of the BNDUPD messages gen-
erated by the UPDREQ message with BNDACK messages. Thus, the sender erated by the UPDREQ message with BNDACK messages. Thus, the sender
of the UPDREQ message can be sure upon receipt of an UPDDONE message of the UPDREQ message can be sure upon receipt of an UPDDONE message
that it has received and commited to stable storage all outstanding that it has received and committed to stable storage all outstanding
binding database updates. binding database updates.
See section 9, Protcol state transitions, for the details of when the See section 9, Protocol state transitions, for the details of when
UPDREQ message is sent. the UPDREQ message is sent.
7.3.1. Sending the UPDREQ message 7.3.1. Sending the UPDREQ message
There are no options for the UPDREQ message. There are no options for the UPDREQ message.
The UPDREQ message is sent with a unique xid. The UPDREQ message is sent with a unique xid.
7.3.2. Receiving the UPDREQ message 7.3.2. Receiving the UPDREQ message
A server receiving an UPDREQ message MUST send all binding database A server receiving an UPDREQ message MUST send all binding database
changes that have not yet been ACKed by the sending server. These changes that have not yet been ACKed by the sending server. These
changes are sent as undistinguished BNDUPD messages. changes are sent as undistinguished BNDUPD messages.
However, the server which received and is processing the UPDREQ mes- However, the server which received and is processing the UPDREQ mes-
sage MUST track the BNDACK messages that correspond to the BNDUPD sage MUST track the BNDACK messages that correspond to the BNDUPD
messages triggered by the UPDREQ message and, when they are all messages triggered by the UPDREQ message and, when they are all
received, the server MUST send an UPDDONE message. received, the server MUST send an UPDDONE message.
The server processing the UPDREQ message and sending BNDUPD messages
to its partner SHOULD only track the BNDUPD and BNDACK message pairs
for unACKed binding database changes that were present upon the
receipt of the UPDREQ message. A server which has received an UPDREQ
message SHOULD send BNDUPD messages for binding database changes that
occur after receipt of the UPDREQ message, but it SHOULD NOT include
those additional BNDUPD messages and their corresponding BNDACK mes-
sages in the accounting necessary to consider the UPDREQ complete and
subsequently send the UPDDONE message. If some additional binding
database changes end up becoming part of the set of BNDUPD messages
considered as part of the UPDREQ (due to whatever algorithm the
server uses to scan its bindings database for unacked changes) it
will probably not cause any difficulty, but a server MUST NOT attempt
to include all such later BNDUPD messages in the accounting for the
UPDREQ in order to be able to transmit an UPDDONE message.
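The accounting described above can be sketched as a snapshot of the
BNDUPD messages that were unACKed when the UPDREQ arrived. The
UpdreqTracker class and the dictionary message format are hypothetical
names chosen for this sketch. The UPDDONE reuses the XID of the
UPDREQ, as section 7.5.1 requires.

```python
class UpdreqTracker:
    """Tracks the BNDUPD/BNDACK pairs that belong to one UPDREQ."""

    def __init__(self, updreq_xid, unacked_bndupd_xids):
        self.updreq_xid = updreq_xid
        # Snapshot taken on receipt of the UPDREQ; later changes are not added.
        self.pending = set(unacked_bndupd_xids)
        self.done_sent = False

    def on_bndack(self, bndupd_xid):
        """Record a BNDACK; xids outside the snapshot are simply ignored."""
        self.pending.discard(bndupd_xid)
        return self.check_done()

    def check_done(self):
        """Return an UPDDONE message once, when the snapshot is fully ACKed."""
        if self.pending or self.done_sent:
            return None
        self.done_sent = True
        return {"type": "UPDDONE", "xid": self.updreq_xid}
```

Calling check_done() right after construction covers the case where
nothing was unACKed when the UPDREQ was received.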
When queuing up the BNDUPD messages for transmission to the sender of When queuing up the BNDUPD messages for transmission to the sender of
the UPDREQ message, the receiving server MUST honor the value the UPDREQ message, the server processing the UPDREQ message MUST
returned in the max-unacked-bndupd option in the CONNECT or CONNEC- honor the value returned in the max-unacked-bndupd option in the CON-
TACK message that set up the connection with the sending server. It NECT or CONNECTACK message that set up the connection with the send-
MUST NOT send more BNDUPD messages without receiving corresponding ing server. It MUST NOT send more BNDUPD messages without receiving
BNDACKs than the value returned in max-unacked-bndupd. corresponding BNDACKs than the value returned in max-unacked-bndupd.
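The max-unacked-bndupd limit described above behaves like a simple
sending window. The following sketch assumes a hypothetical
BndupdWindow class that is handed the queued BNDUPD XIDs; it is one
possible arrangement, not a prescribed one.

```python
from collections import deque


class BndupdWindow:
    """Releases queued BNDUPDs without exceeding max-unacked-bndupd."""

    def __init__(self, max_unacked_bndupd, queued_xids):
        self.limit = max_unacked_bndupd
        self.queue = deque(queued_xids)
        self.in_flight = set()  # xids sent but not yet ACKed

    def next_to_send(self):
        """Return the xids that may be sent now, in queue order."""
        out = []
        while self.queue and len(self.in_flight) < self.limit:
            xid = self.queue.popleft()
            self.in_flight.add(xid)
            out.append(xid)
        return out

    def on_bndack(self, xid):
        """A BNDACK frees one slot in the window."""
        self.in_flight.discard(xid)
```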
7.4. UPDREQALL message 7.4. UPDREQALL message
The update request all (UPDREQALL) message is used by one server to The update request all (UPDREQALL) message is used by one server to
request that its partner send it all of the binding database informa- request that its partner send it all of the binding database informa-
tion. This message is used to allow one server to recover from a tion. This message is used to allow one server to recover from a
failure of stable storage and to restore its binding database in its failure of stable storage and to restore its binding database in its
entirety from the other server. entirety from the other server.
A server which sends an UPDREQALL message cannot proceed until all of A server which sends an UPDREQALL message cannot proceed until all of
its binding update information is restored, and it knows that all of its binding update information is restored, and it knows that all of
that information is restored when an UPDDONE message is received. that information is restored when an UPDDONE message is received.
See section 9, Protcol state transitions, for the details of when the See section 9, Protocol state transitions, for the details of when
UPDREQALL message is sent. the UPDREQALL message is sent.
7.4.1. Sending the UPDREQALL message 7.4.1. Sending the UPDREQALL message
There are no options for the UPDREQALL message. There are no options for the UPDREQALL message.
The UPDREQALL message is sent with a unique xid. The UPDREQALL message is sent with a unique xid.
7.4.2. Receiving the UPDREQALL message 7.4.2. Receiving the UPDREQALL message
A server receiving an UPDREQALL message MUST send all binding data- A server receiving an UPDREQALL message MUST send all binding data-
base information to the sending server. These changes are sent as base information to the sending server. These changes are sent as
undistinguished BNDUPD messages. undistinguished BNDUPD messages.
However, the server receiving the UPDREQALL message MUST track the However, the server processing the UPDREQALL message MUST track the
BNDACK messages that correspond to the BNDUPD messages triggered by BNDACK messages that correspond to the BNDUPD messages triggered by
the UPDREQ message and, when they are all received, the server MUST the UPDREQALL message and, when they are all received, the server
send an UPDDONE message. MUST send an UPDDONE message.
Just as specified for the processing of the UPDREQ message, the
server processing the UPDREQALL message and sending BNDUPD messages
to its partner SHOULD only track the BNDUPD and BNDACK message pairs
for unACKed binding database changes that were present upon the
receipt of the UPDREQALL message. A server which has received an
UPDREQALL message SHOULD send BNDUPD messages for binding database
changes that occur after receipt of the UPDREQALL message, but it SHOULD
NOT include those additional BNDUPD messages and their corresponding
BNDACK messages in the accounting necessary to consider the UPDREQALL
complete and subsequently send the UPDDONE message. If some addi-
tional binding database changes end up becoming part of the set of
BNDUPD messages considered as part of the UPDREQALL (due to whatever
algorithm the server uses to scan its bindings database for unacked
changes) it will probably not cause any difficulty, but a server MUST
NOT attempt to include all such later BNDUPD messages in the account-
ing for the UPDREQALL in order to be able to transmit an UPDDONE mes-
sage.
When queuing up the BNDUPD messages for transmission to the sender of When queuing up the BNDUPD messages for transmission to the sender of
the UPDREQALL message, the receiving server MUST honor the value the UPDREQALL message, the server processing the UPDREQALL MUST honor
returned in the max-unacked-bndupd option in the CONNECT or CONNEC- the value returned in the max-unacked-bndupd option in the CONNECT or
TACK message that set up the connection with the sending server. It CONNECTACK message that set up the connection with the sending
MUST NOT send more BNDUPD messages without receiving corresponding server. It MUST NOT send more BNDUPD messages without receiving
BNDACKs than the value returned in max-unacked-bndupd. corresponding BNDACKs than the value returned in max-unacked-bndupd.
7.5. UPDDONE message 7.5. UPDDONE message
The update done (UPDDONE) message is used by a server receiving an The update done (UPDDONE) message is used by a server receiving an
UPDREQ or UPDREQALL message to signify that it has sent all of the UPDREQ or UPDREQALL message to signify that it has sent all of the
BNDUPD messages requested by the UPDREQ or UPDREQALL request and that BNDUPD messages requested by the UPDREQ or UPDREQALL request and that
it has received a BNDACK for each of those messages. it has received a BNDACK for each of those messages.
7.5.1. Sending the UPDDONE message 7.5.1. Sending the UPDDONE message
The UPDDONE message SHOULD be sent as soon as the last BNDACK message The UPDDONE message SHOULD be sent as soon as the last BNDACK message
corresponding to a BNDUPD message requested by the UPDREQ or corresponding to a BNDUPD message requested by the UPDREQ or
UPDREQALL is received from the server which sent the UPDREQ or UPDREQALL is received from the server which sent the UPDREQ or
UPDREQALL. UPDREQALL. The XID of the UPDDONE message MUST be the same as the
XID of the corresponding UPDREQ or UPDREQALL message.
7.5.2. Receiving the UPDDONE message 7.5.2. Receiving the UPDDONE message
A server receiving the UPDDONE message knows that all of the informa- A server receiving the UPDDONE message knows that all of the informa-
tion that it requested by sending an UPDREQ or UPDREQALL message has tion that it requested by sending an UPDREQ or UPDREQALL message has
now been sent and that it has recorded this information in its stable now been sent and that it has recorded this information in its stable
storage. It typically uses the receipt of an UPDDONE message to storage. It typically uses the receipt of an UPDDONE message to
move to a different failover state. See sections 9.5.2 and 9.8.3 for move to a different failover state. See sections 9.5.2 and 9.8.3 for
details. details.
skipping to change at page 60, line 50 skipping to change at page 80, line 18
request an allocation of IP addresses from the primary server. It request an allocation of IP addresses from the primary server. It
MUST be sent by a secondary server to a primary server to request IP MUST be sent by a secondary server to a primary server to request IP
address allocation by the primary. The IP addresses allocated are address allocation by the primary. The IP addresses allocated are
transmitted using normal BNDUPD messages from the primary to the transmitted using normal BNDUPD messages from the primary to the
secondary. secondary.
The POOLREQ message SHOULD be sent from the secondary to the primary The POOLREQ message SHOULD be sent from the secondary to the primary
whenever the secondary transitions into NORMAL state. It SHOULD whenever the secondary transitions into NORMAL state. It SHOULD
periodically be resent in order that any change in the number of periodically be resent in order that any change in the number of
available IP addresses on the primary be reflected in the pool on the available IP addresses on the primary be reflected in the pool on the
secondary. secondary. The period may be influenced by the secondary server's
leasing activity.
7.6.1. Sending the POOLREQ message 7.6.1. Sending the POOLREQ message
The POOLREQ message has no options. It must be sent with a unique The POOLREQ message has no options. It must be sent with a unique
xid. xid.
7.6.2. Receiving the POOLREQ message 7.6.2. Receiving the POOLREQ message
When a primary server receives a POOLREQ message it SHOULD examine When a primary server receives a POOLREQ message it SHOULD examine
the binding database and determine how many IP addresses the secon- the binding database and determine how many IP addresses the secon-
skipping to change at page 62, line 32 skipping to change at page 81, line 48
option is non-zero. option is non-zero.
Typically, no other action is taken on the reception of a POOLRESP Typically, no other action is taken on the reception of a POOLRESP
message. message.
7.8. CONNECT message 7.8. CONNECT message
The connect message is used to establish an applications level con- The connect message is used to establish an applications level con-
nection over a newly created TCP connection. It gives the source nection over a newly created TCP connection. It gives the source
information for the connection, and some important configuration information for the connection, and some important configuration
information. It may be sent by either primary or secondary server. information. It MUST be sent only by the primary server. Either
It is sent by the initiator of a TCP connection. server can initiate a TCP connection, but the CONNECT message is only
sent by the primary server.
7.8.1. Sending the CONNECT message 7.8.1. Sending the CONNECT message
The CONNECT message MUST be the first message sent by the initiator The CONNECT message MUST be the first message sent by the primary
of a TCP connection after the establishment of a new TCP connection server after the establishment of a new TCP connection with a secon-
with another server participating in the failover protocol. dary server participating in the failover protocol.
The xid of the CONNECT message must be unique. The xid of the CONNECT message must be unique.
The IP address of the sending server MUST be placed in the sending- The IP address of the primary server MUST be placed in the sending-
server-IP-address option. This information is placed in an option server-IP-address option. This information is placed in an option
inside of the packet in order to allow the identity of the sender to inside of the message in order to allow the identity of the sender to
be covered by a shared secret. be covered by a shared secret.
The role of the sending failover endpoint (i.e., either primary or The number of BNDUPD messages the primary server can accept without
secondary) MUST be placed in the server-role option. blocking the TCP connection MUST be placed in the max-unacked-bndupd
option. This MUST be a number equal to or greater than 1, SHOULD be
The current time MUST be placed in the current-time option. a number greater than 10, and SHOULD be a number less than 100.
The number of BNDUPD messages the server can accept without blocking
the TCP connection MUST be placed in the max-unacked-bndupd option.
This MUST be a number equal to or greater than 1, SHOULD be a number
greater than 10, and SHOULD be a number less than 100.
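The limits on max-unacked-bndupd stated above can be sketched as a small check. This is an illustrative sketch only: the function name and the choice to return warnings for the SHOULD bounds are assumptions, not part of the draft.

```python
def check_max_unacked_bndupd(value: int) -> list:
    """Check a max-unacked-bndupd value against the limits in the text.

    Hypothetical helper: a MUST violation raises ValueError, while
    SHOULD violations are returned as advisory warning strings.
    """
    if value < 1:
        # "This MUST be a number equal to or greater than 1"
        raise ValueError("max-unacked-bndupd MUST be >= 1")
    warnings = []
    if value <= 10:
        # "SHOULD be a number greater than 10"
        warnings.append("max-unacked-bndupd SHOULD be greater than 10")
    if value >= 100:
        # "SHOULD be a number less than 100"
        warnings.append("max-unacked-bndupd SHOULD be less than 100")
    return warnings
```

A value such as 50 passes with no warnings, while 5 is legal but draws one advisory warning.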
The length of the receive timer (tReceive, see section 8.3) MUST be The length of the receive timer (tReceive, see section 8.3) MUST be
placed in the receive-timer option. placed in the receive-timer option.
If the sending server is a primary server, then the MCLT MUST be The MCLT MUST be placed in the MCLT option.
placed in the MCLT option.
If the sending server is a primary server, then the hash-bucket- The hash-bucket-assignment option MUST be included in the CONNECT
assignment option MUST be included in the CONNECT message. The value message. In the event that load balancing is not configured for this
of the hash-bucket-assignment option is determined from the specific server, the hash-bucket-assignment option will indicate that. The
buckets that the primary server has determined that the secondary value of the hash-bucket-assignment option is determined from the
server MUST service as part of the load-balancing algorithm. The way specific buckets that the primary server has determined that the
in which the primary server determines this information is outside secondary server MUST service as part of the load-balancing algo-
the scope of this protocol definition. The primary server is SHOULD rithm. The way in which the primary server determines this informa-
be able to be configured with a percentage of clients that the secon- tion is outside the scope of this protocol definition. The primary
dary server will be instructed to service, and the primary server server SHOULD be configured with a percentage of clients that the
SHOULD convert that percentage value into a corresponding set of bits secondary server will be instructed to service, and the primary
in the hash-bucket-assignment option that are set to a 1, indicating server SHOULD use the algorithm in [LOADB] to generate a Hash Bucket
that the secondary server MUST service clients which map to those Assignment which it sends to the secondary server.
hash buckets.
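As an illustration of the percentage-to-bits conversion described in the older text, the sketch below sets one bit per hash bucket assigned to the secondary. It is not the [LOADB] algorithm. The bucket count of 256, the most-significant-bit-first layout, and the choice of the lowest-numbered buckets are all assumptions made for the example.

```python
NUM_BUCKETS = 256  # assumption: the draft text here does not state the bucket count


def buckets_for_percentage(percent: float, num_buckets: int = NUM_BUCKETS) -> bytes:
    """Return a bitmap with a 1 bit for each bucket the secondary services.

    Hypothetical helper: assigns the lowest-numbered buckets first and
    packs bucket 0 into the most significant bit of the first byte.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if num_buckets % 8 != 0:
        raise ValueError("num_buckets must be a multiple of 8")
    count = round(num_buckets * percent / 100)
    bits = bytearray(num_buckets // 8)
    for bucket in range(count):
        bits[bucket // 8] |= 0x80 >> (bucket % 8)
    return bytes(bits)
```

With these assumptions, a configured share of 50 percent sets the first 128 bits and leaves the rest clear.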
The vendor class identifier MUST be placed in the vendor-class- The vendor class identifier MUST be placed in the vendor-class-
identifier option. identifier option.
The protocol-version option MUST be included in every CONNECT mes- The protocol-version option MUST be included in every CONNECT mes-
sage. The current value of the protocol version is 1. sage. The current value of the protocol version is 1.
The TLS-request option MUST be sent and contains the desired TLS con- The TLS-request option MUST be sent and contains the desired TLS con-
nection request as well as information concerning whether TLS is sup- nection request as well as information concerning whether TLS is sup-
ported. If this CONNECT message is being sent over an already ported. If this CONNECT message is being sent over an already
created TLS connection, the TLS-request MUST NOT appear. created TLS connection, the TLS-request MUST NOT appear.
7.8.2. Receiving the CONNECT message 7.8.2. Receiving the CONNECT message
When a server receives a TCP connection on the failover port, it When a server receives a TCP connection on the failover port, if it
should wait for a CONNECT message. is a PRIMARY server it should send a CONNECT message, and if it is a
secondary server it should wait for a CONNECT message.
When a server receives a CONNECT message it should: When a secondary server receives a CONNECT message it should:
1. Record the time at which the message was received. 1. Record the time at which the message was received.
2. Examine the protocol-version option, and decide if this server 2. Examine the protocol-version option, and decide if this server
is capable of interoperating with another server running that is capable of interoperating with another server running that
protocol version. If not, then send the CONNECTACK message protocol version. If not, send the CONNECTACK message with
with the appropriate reject-reason. The server MUST include the appropriate reject-reason. The server MUST include its
its protocol-version in the CONNECTACK message. protocol-version in the CONNECTACK message.
3. Examine the TLS-request option. Figure out the TLS-reply 3. Examine the TLS-request option. Figure out the TLS-reply
value based on the capabilities and configuration of this value based on the capabilities and configuration of this
server, and save it for the CONNECTACK message. If the server, and save it for the CONNECTACK message. If the
results of the TLS negotiation result in a connection rejec- results of the TLS negotiation result in a connection rejec-
tion, then go immediately to send the CONNECTACK message. tion, then go immediately to send the CONNECTACK message.
The possibilities are: The possibilities are:
CONNECT CONNECTACK CONNECT CONNECTACK
skipping to change at page 64, line 37 skipping to change at page 84, line 6
2 0 - request doesn't make sense 2 0 - request doesn't make sense
2 1 0 9 or 10 receiver won't do TLS 2 1 0 9 or 10 receiver won't do TLS
2 1 1 2 1 1
4. Check to see if there is a message-digest option in the CON- 4. Check to see if there is a message-digest option in the CON-
NECT message. If there was, and the server does not support NECT message. If there was, and the server does not support
message-digests, then reject the connection with the appropri- message-digests, then reject the connection with the appropri-
ate reject-reason in the CONNECTACK. ate reject-reason in the CONNECTACK.
5. Determine if the sender (from the sending-server-IP-address 5. Determine if the sender (from the sending-server-IP-address
option) and the role of the sender (from the server-role) option) and the implicit role of the sender (i.e., primary)
option represents a server with which the receiver was config- represents a server with which the receiver was configured to
ured to engage in failover activity. engage in failover activity. This is performed after any
TLS processing so that it occurs after a secure connection is
created, to ensure that there is no tampering with the IP
address of the partner.
If not, then the receiving server should reject the CONNECT If not, then the receiving server should reject the CONNECT
request by sending a CONNECTACK message with a reject-reason request by sending a CONNECTACK message with a reject-reason
value of: 8, invalid failover partner. value of: 8, invalid failover partner.
If it is, then the receiving failover endpoint should be If it is, then the receiving failover endpoint should be
determined. determined.
6. Decide if the time delta between the sending of the packet, in 6. Decide if the time delta between the sending of the message,
the current-time option, and the receipt of the packet, in the time field, and the receipt of the message, recorded in
recorded in step 1 above, is acceptable. A server MAY require step 1 above, is acceptable. A server MAY require an arbi-
an arbitrarily small delta in time values in order to set up a trarily small delta in time values in order to set up a fail-
failover connection with another server. over connection with another server. See section 5.9 for
information on time synchronization.
If the delta between the time values is too great, the server If the delta between the time values is too great, the server
should reject the CONNECT request by sending a CONNECTACK mes- should reject the CONNECT request by sending a CONNECTACK mes-
sage with a reject-reason of 4, time mismatch too great. sage with a reject-reason of 4, time mismatch too great.
If the time mismatch is not considered too great then the If the time mismatch is not considered too great then the
receiving server MUST record the delta between the servers. receiving server MUST record the delta between the servers.
The receiving server MUST use this delta to correct all of the The receiving server MUST use this delta to correct all of the
absolute times received from the other server in all time- absolute times received from the other server in all time-
valued options. Note that server's can participate in fail- valued options. Note that server's can participate in fail-
skipping to change at page 65, line 29 skipping to change at page 84, line 48
7. If the receiving server is a secondary server, it MUST examine 7. If the receiving server is a secondary server, it MUST examine
the MCLT option in the CONNECT request and use the value of the MCLT option in the CONNECT request and use the value of
the MCLT as the MCLT for this failover endpoint. the MCLT as the MCLT for this failover endpoint.
A receiving secondary server SHOULD be able to operate with A receiving secondary server SHOULD be able to operate with
any MCLT sent by the primary, but if it cannot, then it any MCLT sent by the primary, but if it cannot, then it
should send a CONNECTACK with a reject-reason of 5, MCLT should send a CONNECTACK with a reject-reason of 5, MCLT
mismatch. mismatch.
8. The receiving server MAY use the vendor-class-identifier to do 8. The server MUST store hash-bucket-assignment option for use
during processing during NORMAL state. If this hash bucket
assignment conflicts with the secondary server's configured
hash bucket assignment for use in other than NORMAL state, the
secondary server should send a CONNECTACK with a reject reason
of 19, Hash bucket assignment conflict.
9. The receiving server MAY use the vendor-class-identifier to do
vendor specific processing. vendor specific processing.
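The partner, time, MCLT, and hash bucket checks in the steps above can be sketched as follows. This is a minimal sketch under stated assumptions: the class and field names are hypothetical, protocol-version and TLS handling are omitted, and a hash bucket "conflict" is modeled simply as inequality with a locally configured assignment, which the draft does not define. The reject-reason values 4, 5, 8, and 19 are taken from the text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

REJECT_TIME_MISMATCH = 4
REJECT_MCLT_MISMATCH = 5
REJECT_INVALID_PARTNER = 8
REJECT_HASH_BUCKET_CONFLICT = 19


@dataclass
class Connect:
    """Hypothetical view of the CONNECT fields used by these checks."""
    sender_ip: str
    sent_time: int
    mclt: int
    hash_buckets: bytes


@dataclass
class SecondaryConfig:
    """Hypothetical local configuration of the receiving secondary."""
    partner_ip: str
    max_time_delta: int
    min_mclt: int
    max_mclt: int
    configured_buckets: Optional[bytes] = None


def evaluate_connect(msg: Connect, cfg: SecondaryConfig,
                     received_time: int) -> Tuple[Optional[int], Optional[int]]:
    """Return (reject_reason, time_delta); reject_reason is None on accept."""
    # Step 5: is the sender a configured failover partner?
    if msg.sender_ip != cfg.partner_ip:
        return REJECT_INVALID_PARTNER, None
    # Step 6: compare the sender's time with the recorded receipt time.
    delta = received_time - msg.sent_time
    if abs(delta) > cfg.max_time_delta:
        return REJECT_TIME_MISMATCH, None
    # Step 7: the secondary adopts the primary's MCLT if it can.
    if not cfg.min_mclt <= msg.mclt <= cfg.max_mclt:
        return REJECT_MCLT_MISMATCH, delta
    # Step 8: assumption, a conflict means the assignments differ.
    if cfg.configured_buckets is not None and cfg.configured_buckets != msg.hash_buckets:
        return REJECT_HASH_BUCKET_CONFLICT, delta
    return None, delta
```

On acceptance the caller would record the returned delta and use it to correct absolute times received from the partner.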
7.9. CONNECTACK message 7.9. CONNECTACK message
The CONNECTACK message is sent to accept or reject a CONNECT message. The CONNECTACK message is sent to accept or reject a CONNECT message.
It is sent by the server which accepted the TCP connection and It is sent by the secondary server which received a CONNECT message.
received a CONNECT message.
Attempting immediately to reconnect after either receiving a CONNEC-
TACK with a reject-reason or after sending a CONNECTACK with a
reject-reason could yield unwanted looping behavior, since the reason
that the connection was rejected may well not have changed since the
last attempt. A simple suggested solution is to wait a minute or two
after sending or receiving a CONNECTACK message with a reject-reason
before attempting to reestablish communication.
7.9.1. Sending the CONNECTACK message 7.9.1. Sending the CONNECTACK message
The xid of the CONNECTACK message must be that of the corresponding The xid of the CONNECTACK message MUST be that of the corresponding
CONNECT message. CONNECT message.
The IP address of the sending server MUST be placed in the sending- The IP address of the sending server MUST be placed in the sending-
server-IP-address option. This information is placed in an option server-IP-address option. This information is placed in an option
inside of the packet in order to allow the identity of the sender to inside of the message in order to allow the identity of the sender to
be covered by a shared secret. be covered by a shared secret.
The role of the sending failover endpoint (i.e., either primary or
secondary) MUST be placed in the server-role option.
The current time MUST be placed in the current-time option.
The protocol-version option MUST be included in every CONNECTACK mes- The protocol-version option MUST be included in every CONNECTACK mes-
sage. The current value of the protocol version is 1. sage. The current value of the protocol version is 1.
If the connection has been rejected, the reject-reason option MUST be If the connection has been rejected, the reject-reason option MUST be
placed in the CONNECTACK message with an appropriate reason, and a placed in the CONNECTACK message with an appropriate reason, and a
message option SHOULD be included with a human-readable error message message option SHOULD be included with a human-readable error message
describing the reason for the rejection in some detail. If the describing the reason for the rejection in some detail. If the
reject-reason option appears, then the remaining options listed below reject-reason option appears, then the remaining options listed below
do not appear. do not appear. The sending server should close the connection after
sending the CONNECTACK if the connection was rejected.
The results of the TLS negotiation MUST be placed in the TLS-reply The results of the TLS negotiation MUST be placed in the TLS-reply
option. If this CONNECTACK message is being sent over an already TLS option. If this CONNECTACK message is being sent over an already TLS
secured connection, then there MUST NOT be a TLS-reply option. secured connection, then there MUST NOT be a TLS-reply option.
If there was a message-digest option in the CONNECT message, then If there was a message-digest option in the CONNECT message, then
there MUST be a message-digest in the CONNECTACK message if it does there MUST be a message-digest in the CONNECTACK message and any sub-
not contain a reject-reason. sequent messages if the CONNECTACK does not contain a reject-reason.
The number of BNDUPD messages the server can accept without blocking The number of BNDUPD messages the server can accept without blocking
the TCP connection MUST be placed in the max-unacked-bndupd option. the TCP connection MUST be placed in the max-unacked-bndupd option.
This SHOULD be a number greater than 10, and SHOULD be a number less This SHOULD be a number greater than 10, and SHOULD be a number less
than 100. than 100.
The length of the receive timer (tReceive, see section 8.3) MUST be The length of the receive timer (tReceive, see section 8.3) MUST be
placed in the receive-timer option. placed in the receive-timer option.
If the sending server is a primary server, then the MCLT MUST be
placed in the MCLT option.
The vendor class identifier MUST be placed in the vendor-class- The vendor class identifier MUST be placed in the vendor-class-
identifier option. identifier option.
If the server is rejecting the CONNECT message, then the reject- If the server is rejecting the CONNECT message, then the reject-
reason option MUST appear. A message option MAY appear to give a reason option MUST appear. A message option SHOULD appear to give a
human readable version of the rejection reason. human readable version of the rejection reason.
After sending a CONNECTACK message, the server MUST send a STATE mes- After a connection is created (either by sending a CONNECTACK message
sage. to the first CONNECT message, or sending a CONNECTACK message to a
CONNECT message received over a TLS connection), the server MUST send
a STATE message.
After sending a CONNECTACK message, the server MUST start two timers After a connection is created, the server MUST start two timers for
for the connection: tSend and tReceive. The tSend timer SHOULD be the connection: tSend and tReceive. The tSend timer SHOULD be
approximately 20 percent of the time in the receiver-timer option in approximately 33 percent of the time in the receiver-timer option in
the corresponding CONNECT message. The tReceive timer SHOULD be the the corresponding CONNECT message. The tReceive timer SHOULD be the
time sent in the receiver-timer option in the CONNECTACK message. time sent in the receiver-timer option in the CONNECTACK message.
The tReceive timer is reset whenever a message is received from this The tReceive timer is reset whenever a message is received from this
TCP connection. If it ever expires, the TCP connection is dropped TCP connection. If it ever expires, the TCP connection is dropped
and communications with this partner is considered not ok. and communications with this partner is considered not ok.
The tSend timer is reset whenever a packet is sent over this connec- The tSend timer is reset whenever a message is sent over this connec-
tion. When it expires, a CONTACT message MUST be sent. tion. When it expires, a CONTACT message MUST be sent.
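The behavior of the two timers can be sketched with a small polling class. The class and method names are hypothetical, and the caller is assumed to supply the current time in seconds. The send fraction defaults to one third, following the newer text, and can be set to 0.2 to match the older text.

```python
class ConnectionTimers:
    """Sketch of tSend and tReceive handling for one failover connection."""

    def __init__(self, peer_receive_timer: float, own_receive_timer: float,
                 now: float, send_fraction: float = 1 / 3):
        # tSend is a fraction of the receive timer advertised by the partner.
        self.t_send = peer_receive_timer * send_fraction
        # tReceive is the receive timer this server advertised.
        self.t_receive = own_receive_timer
        self.last_sent = now
        self.last_received = now

    def on_send(self, now: float) -> None:
        """Reset tSend whenever a message is sent on this connection."""
        self.last_sent = now

    def on_receive(self, now: float) -> None:
        """Reset tReceive whenever a message is received on this connection."""
        self.last_received = now

    def poll(self, now: float) -> str:
        """Return 'drop', 'send-contact', or 'idle' for the current time."""
        if now - self.last_received >= self.t_receive:
            return "drop"          # tReceive expired: drop the TCP connection
        if now - self.last_sent >= self.t_send:
            return "send-contact"  # tSend expired: a CONTACT message is due
        return "idle"
```

A caller would invoke `on_send` after transmitting the CONTACT message so that tSend starts over.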
7.9.2. Receiving the CONNECTACK message 7.9.2. Receiving the CONNECTACK message
If a CONNECTACK message is received with a different XID from the one
in the CONNECT that was sent, it SHOULD be ignored.
When a CONNECTACK message is received, the following actions should When a CONNECTACK message is received, the following actions should
be taken: be taken:
1. Record the time the packet was received. 1. Record the time the message was received.
2. Check to see if there is a reject-reason option in the CONNEC- 2. Check to see if there is a reject-reason option in the CONNEC-
TACK message. If not, continue with step 3. If there is a TACK message. If not, continue with step 3. If there is a
reject-reason option, the server SHOULD report the error code. reject-reason option, the server SHOULD report the error code.
If a message option appears a server SHOULD display the string If a message option appears a server SHOULD display the string
from the message option in a user visible way. The server from the message option in a user visible way. The server
MUST close the connection if a reject-reason option appears. MUST close the connection if a reject-reason option appears.
3. Check to see if the xid on the CONNECTACK matches an outstand- 3. Check to see if the xid on the CONNECTACK matches an outstand-
ing CONNECT message on this TCP connection. ing CONNECT message on this TCP connection.
4. Check the value of the TLS-reply option, and if it was 1, then 4. Check the value of the TLS-reply option, and if it was 1, then
skip processing of the rest of the CONNECTACK message, and skip processing of the rest of the CONNECTACK message, and
immediately enter into TLS connection setup. immediately enter into TLS connection setup.
If it does not, a server SHOULD report an error. If it does not, a server SHOULD report an error.
This step occurs prior to steps 5 and 6 in order to allow
creation of a secure connection (if required) prior to pro-
cessing the protocol version and IP address information.
5. Examine the value of the protocol-version option. If this 5. Examine the value of the protocol-version option. If this
server is able to establish connections with another server server is able to establish connections with another server
running this protocol version, then continue, else close the running this protocol version, then continue, else close the
connection. connection.
6. Check to see if the sending-server-IP-address and server-role 6. Decide if the time delta between the sending of the message,
in the CONNECTACK message correspond to the failover endpoint in the time field, and the receipt of the message, recorded in
for which this TCP connection was created. step 1 above, is acceptable. A server MAY require an arbi-
trarily small delta in time values in order to set up a fail-
If it was not, the server MUST drop the TCP connection and over connection with another server.
SHOULD report an error.
7. Decide if the time delta between the sending of the packet, in
the current-time option, and the receipt of the packet,
recorded in step 1 above, is acceptable. A server MAY require
an arbitrarily small delta in time values in order to set up a
failover connection with another server.
If the delta between the time values is too great, the server If the delta between the time values is too great, the server
should drop the TCP connection. should drop the TCP connection.
If the time mismatch is not considered too great then the If the time mismatch is not considered too great then the
receiving server MUST record the delta between the servers. receiving server MUST record the delta between the servers.
The receiving server MUST use this delta to correct all of the The receiving server MUST use this delta to correct all of the
absolute times received from the other server in all time- absolute times received from the other server in all time-
valued options. Note that the failover protocol is con- valued options. Note that the failover protocol is con-
structed so that two servers can be failover partners with structed so that two servers can be failover partners with
arbitrarily great time mismatches. arbitrarily great time mismatches.
8. If the receiving server is a secondary server, it MUST examine 7. If the receiving server is a secondary server, it MUST examine
the MCLT option in the CONNECT request and use the value of the MCLT option in the CONNECT request and use the value of
the MCLT as the MCLT for this failover endpoint. the MCLT as the MCLT for this failover endpoint.
A receiving secondary server SHOULD be able to operate with A receiving secondary server SHOULD be able to operate with
any MCLT sent by the primary, but if it cannot, then it MUST any MCLT sent by the primary, but if it cannot, then it MUST
drop the TCP connection. drop the TCP connection.
8. If the receiving server is a secondary server, it MUST store
the hash-bucket-assignment option for use during processing
during NORMAL state. If this hash bucket assignment conflicts
with the server's configured hash bucket assignment for use in
other than NORMAL state, the secondary server should send a
CONNECTACK with a reject reason of 19, Hash bucket assignment
conflict.
9. The receiving server MAY use the vendor-class-identifier to do 9. The receiving server MAY use the vendor-class-identifier to do
vendor specific processing. vendor specific processing.
10. After accepting a CONNECTACK message, the server MUST send a 10. After accepting a CONNECTACK message, the server MUST send a
STATE message. STATE message.
After receiving a CONNECTACK message, the server MUST start After receiving a CONNECTACK message, the server MUST start
two timers for the connection: tSend and tReceive. The tSend two timers for the connection: tSend and tReceive. The tSend
timer SHOULD be approximately 20 percent of the time in the timer SHOULD be approximately 20 percent of the time in the
receiver-timer option in the corresponding CONNECTACK message. receiver-timer option in the corresponding CONNECTACK message.
The tReceive timer SHOULD be set to the time sent in the The tReceive timer SHOULD be set to the time sent in the
receiver-timer option in the CONNECT message. receiver-timer option in the CONNECT message.
The tReceive timer is reset whenever a message is received The tReceive timer is reset whenever a message is received
from this TCP connection. If it ever expires, the TCP connec- from this TCP connection. If it ever expires, the TCP connec-
tion is dropped and communications with this partner is con- tion is dropped and communications with this partner is con-
sidered not ok. sidered not ok.
The tSend timer is reset whenever a packet is sent over this The tSend timer is reset whenever a message is sent over this
connection. When it expires, a CONTACT message MUST be sent. connection. When it expires, a CONTACT message MUST be sent.
7.10. STATE message 7.10. STATE message
The state (STATE) message is used to communicate the current failover The state (STATE) message is used to communicate the current failover
state to the partner server. state to the partner server.
The STATE message MUST be sent after sending a CONNECTACK message The STATE message MUST be sent after sending a CONNECTACK message
that didn't contain a reject-reason option, and MUST be sent after that didn't contain a reject-reason option, and MUST be sent after
receiving a CONNECTACK message without a reject-reason option. receiving a CONNECTACK message without a reject-reason option.
skipping to change at page 69, line 40 skipping to change at page 89, line 27
7.11. CONTACT message 7.11. CONTACT message
The contact (CONTACT) message is sent to verify communications The contact (CONTACT) message is sent to verify communications
integrity with a failover partner. The CONTACT message is sent when integrity with a failover partner. The CONTACT message is sent when
no messages have been sent to the failover partner for a specified no messages have been sent to the failover partner for a specified
period of time. This is determined by the tSend timer expiring (see period of time. This is determined by the tSend timer expiring (see
section 8.3). section 8.3).
7.11.1. Sending the CONTACT message 7.11.1. Sending the CONTACT message
The current time is placed in the current-time option, and the CON- The CONTACT message is sent.
TACT message is sent.
7.11.2. Receiving the CONTACT message 7.11.2. Receiving the CONTACT message
When a CONTACT message is received, the tReceive timer is reset (as When a CONTACT message is received, the tReceive timer is reset (as
it is with any message that is received). it is with any message that is received).
A server MAY use the time in the current-time option and the time A server MAY use the time in the time field and the time recorded
recorded above to refine the delta time calculations between the above to refine the delta time calculations between the servers.
servers.
7.12. DISCONNECT message
The DISCONNECT is the last message sent over a connection before
dropping an established connection.
After sending or receiving a DISCONNECT message, a server needs to
have some mechanism to prevent an error loop. Simply reconnecting to
the partner immediately is not the best option, especially after
several consecutive attempts.
A simple suggested solution is to wait a minute or two after sending
or receiving a DISCONNECT before attempting to reestablish communica-
tion.
7.12.1. Sending the DISCONNECT message
The DISCONNECT message MUST be the last message sent by a server
which is dropping a TCP connection.
The xid of the DISCONNECT message must be unique.
The reject-reason option MUST appear giving a reason why the connec-
tion was dropped. A message option SHOULD appear giving a human
readable error message with possibly more details.
7.12.2. Receiving the DISCONNECT message
When a server receives a DISCONNECT message it should log the message
if there was one and possibly raise an alarm of some sort if the
reject reason was one that was sufficiently serious.
8. Connection Management 8. Connection Management
Servers participating in the failover protocol communicate over TCP Servers participating in the failover protocol communicate over TCP
connections. These TCP connections are used both to transmit bind- connections. These TCP connections are used both to transmit bind-
ing information from one server to another as well as to allow each ing information from one server to another as well as to allow each
server to determine whether communications is possible with the other server to determine whether communications is possible with the other
server. server.
Central to the operation of the failover protocol is a notion of Central to the operation of the failover protocol is a notion of
skipping to change at page 70, line 43 skipping to change at page 91, line 11
8.2. Creating the TCP connection 8.2. Creating the TCP connection
Every server implementing the failover protocol MUST listen on port Every server implementing the failover protocol MUST listen on port
647 for incoming failover TCP connections. The source port of the 647 for incoming failover TCP connections. The source port of the
TCP connection is unimportant. TCP connection is unimportant.
Every server implementing the failover protocol SHOULD attempt to Every server implementing the failover protocol SHOULD attempt to
connect to all of its partners periodically, where the period is connect to all of its partners periodically, where the period is
implementation dependent and SHOULD be configurable. In the event implementation dependent and SHOULD be configurable. In the event
that a connection has been rejected by a CONNECTACK message with a that a connection has been rejected by a CONNECTACK message with a
reject-reason option contained in it, a server SHOULD reduce the fre- reject-reason option contained in it or a DISCONNECT message, a
quency with which it attempts to connect to that server but it SHOULD server SHOULD reduce the frequency with which it attempts to connect
continue to attempt to connect periodically. to that server but it SHOULD continue to attempt to connect periodi-
cally.
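One way to reduce the connection frequency after a rejection while still retrying periodically is a capped backoff. The doubling policy, the one-hour cap, and the function name below are assumptions chosen for illustration. The draft only asks that the frequency be reduced and that attempts continue.

```python
def next_retry_delay(base_period: float, rejections: int,
                     max_period: float = 3600.0) -> float:
    """Return seconds to wait before the next connection attempt.

    Hypothetical policy: double the configured period for each
    consecutive rejection, never exceeding max_period.
    """
    if rejections <= 0:
        return base_period
    # Clamp the exponent so the multiplication cannot overflow.
    exponent = min(rejections, 32)
    return min(base_period * (2 ** exponent), max_period)
```

Because the delay is capped rather than unbounded, the server keeps attempting to connect periodically, as the text requires.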
Once a connection is established, the first message sent across the Once a connection is established, the primary server MUST send a CON-
connection MUST be a CONNECT message. This message establishes the NECT message across the connection. A secondary server MUST wait for
identity of the failover endpoint making the connection. the CONNECT message from a primary server.
Every CONNECT message includes a TLS-request option, and if the CON- Every CONNECT message includes a TLS-request option, and if the CON-
NECTACK message does not reject the CONNECT message and the TLS-reply NECTACK message does not reject the CONNECT message and the TLS-reply
option says TLS MUST be used, then the servers will enter into TLS option says TLS MUST be used, then the servers will immediately enter
negotiation. into TLS negotiation.
Once that negotiation is complete, then the server MUST resend the Once TLS negotiation is complete, the primary server MUST resend the
CONNECT message on the newly secured TLS connection and then wait for CONNECT message on the newly secured TLS connection and then wait for
the CONNECTACK message in response. The TLS-request and TLS-reply the CONNECTACK message in response. The TLS-request and TLS-reply
options MUST have the same values in this second CONNECT and CONNEC- options MUST have the same values in this second CONNECT and CONNEC-
TACK message has they had in the first messages. TACK message as they had in the first messages.
The second message sent over a new connection is a STATE message. The second message sent over a new connection (either a bare TCP con-
Upon the receipt of this message, the receiver can consider communi- nection or a connection utilizing TLS) is a STATE message. Upon the
cations up. receipt of this message, the receiver can consider communications up.
It is entirely possible that two servers will attempt to make connec- It is entirely possible that two servers will attempt to make connec-
tions to each other essentially simultaneously, and then each will tions to each other essentially simultaneously, and in this case the
send a CONNECT message down the new connection. In this case each secondary server will be waiting for a CONNECT message on each con-
server will receive a CONNECT message on one connection having nection. The primary server MUST send a CONNECT message over one
already sent a CONNECT message on the other connection. In the event connection and it MUST close the other connection.
that the primary server receives a CONNECT message from the secondary
server either while waiting for a CONNECTACK message from a secondary A secondary server MUST NOT respond to the closing of a TCP connec-
server or when it has a valid connection open to a secondary server, tion with a blind attempt to reconnect -- there may be another TCP
it will close the connection on which the CONNECT message was connection to the same failover partner already in use.
received.
8.3. Using the TCP connection for determining communications status 8.3. Using the TCP connection for determining communications status
The TCP connection is used to determine the communications status of The TCP connection is used to determine the communications status of
the other server, i.e., communications-ok, or communications- the other server, i.e., communications-ok, or communications-
interrupted. interrupted.
Three things must happen for a server to consider that communications Three things must happen for a server to consider that communications
are ok with respect to another server: are ok with respect to another server:
skipping to change at page 72, line 9 skipping to change at page 92, line 25
3. A STATE message must be received from the other server over 3. A STATE message must be received from the other server over
the connection. This STATE message initializes important the connection. This STATE message initializes important
information necessary to the operation of the state machine information necessary to the operation of the state machine
that governs the behavior of this failover endpoint. that governs the behavior of this failover endpoint.
There are two ways that a server can determine that communications There are two ways that a server can determine that communications
has failed: has failed:
1. The TCP connection can go down, yielding an error when 1. The TCP connection can go down, yielding an error when
attempting to send a message. This will happen at least as attempting to send or receive a message. This will happen at
often as the period of the tSend timer. least as often as the period of the tSend timer.
2. The tReceive timer can expire. 2. The tReceive timer can expire.
In either of these cases, communications is considered interrupted. In either of these cases, communications is considered interrupted.
Several difficulties arise when trying to use one TCP connection for Several difficulties arise when trying to use one TCP connection for
both bulk data transfer as well as to sense the communications status both bulk data transfer as well as to sense the communications status
of the other server. One aspect of the problem stems from the dif- of the other server. One aspect of the problem stems from the dif-
ferent requirements of both uses. The bulk data transfer is of ferent requirements of both uses. The bulk data transfer is of
course critically important to the protocol, but the speed with which course critically important to the protocol, but the speed with which
skipping to change at page 72, line 33 skipping to change at page 92, line 49
such an occasional delay doesn't compromise the correctness of the such an occasional delay doesn't compromise the correctness of the
protocol. However, the speed with which one server detects the other protocol. However, the speed with which one server detects the other
server is up (or, more importantly, down) is more highly constrained. server is up (or, more importantly, down) is more highly constrained.
Generally one server should be able to detect that the other server Generally one server should be able to detect that the other server
is not communicating within a minute or less. is not communicating within a minute or less.
These differing time constraints make it difficult to use the same These differing time constraints make it difficult to use the same
TCP connection for data transfer as well as to sense communications TCP connection for data transfer as well as to sense communications
integrity. See section 3.5 for additional details on TCP. integrity. See section 3.5 for additional details on TCP.
The solution to this problem is to require a that some message be The solution to this problem is to require that some message be
received by each end of the connection within a limited time or that received by each end of the connection within a limited time or that
the connection will be considered down. If no messages have been the connection will be considered down. If no messages have been
sent recently, then a CONTACT message is sent. sent recently, then a CONTACT message is sent.
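The interplay of the two timers described above can be sketched as follows. This is only an illustrative sketch, not text from either draft: the class name `ContactMonitor`, its method names, and the numeric timer values used in any example are assumptions. It shows a CONTACT message becoming due once nothing has been sent for the tSend period, and communications being treated as interrupted when the tReceive period passes without any message arriving or when a send fails.

```python
class ContactMonitor:
    """Hypothetical helper tracking tSend and tReceive for one connection."""

    def __init__(self, t_send, t_receive, now):
        self.t_send = t_send          # max quiet time before sending CONTACT
        self.t_receive = t_receive    # max quiet time tolerated from partner
        self.last_sent = now
        self.last_received = now

    def on_message_sent(self, now):
        # Any message sent (BNDUPD, STATE, CONTACT, ...) restarts tSend.
        self.last_sent = now

    def on_message_received(self, now):
        # Any message received restarts tReceive.
        self.last_received = now

    def contact_due(self, now):
        # True when nothing has been sent recently, so a CONTACT is needed.
        return now - self.last_sent >= self.t_send

    def communications_interrupted(self, now, send_error=False):
        # Either failure condition is enough: a TCP error or tReceive expiry.
        return send_error or (now - self.last_received >= self.t_receive)
```

Times are plain numbers in arbitrary units here; a real server would presumably drive these checks from its event loop clock.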
In the case where there is no data queued to be sent, this is not a In the case where there is no data queued to be sent, this is not a
problem, but in the case where there is data queued to be sent to the problem, but in the case where there is data queued to be sent to the
partner, then the CONTACT message will not actually be transmitted partner, then the CONTACT message will not actually be transmitted
until the queued data is sent. Section 3.5 explains why waiting for until the queued data is sent. Section 3.5 explains why waiting for
TCP to determine that the connection is down is not acceptable, and TCP to determine that the connection is down is not acceptable, and
leads to a requirement that the receiving server never block the sending leads to a requirement that the receiving server never block the sending
server from sending CONTACT packets. server from sending CONTACT messages.
In order to meet this requirement, each server tells the other server In order to meet this requirement, each server tells the other server
the number of outstanding BNDUPD messages that it will accept. The the number of outstanding BNDUPD messages that it will accept. The
receiving server is required to always be able to accept that many receiving server is required to always be able to accept that many
BNDUPD messages off of the connection's input queue even if it cannot BNDUPD messages off of the connection's input queue even if it cannot
process them immediately, and to accept all other messages immedi- process them immediately, and to accept all other messages immedi-
ately. ately.
Thus, the sending server's TCP is never blocked from sending a mes- Thus, the sending server's TCP is never blocked from sending a mes-
sage except for very short periods, less than a few seconds unless sage except for very short periods, less than a few seconds unless
the network connection itself has problems. In this case, if the the network connection itself has problems. In this case, if the
CONTACT messages don't make it to the partner then the partner will CONTACT messages don't make it to the partner then the partner will
close the connection. close the connection.
DISCUSSION:
When implementing this capability, one needs to be careful when
sending any message on the TCP connection as TCP can easily block
the server if the local TCP send buffers are full. This can't be
prevented because if the receiver is not reachable (via the net-
work), the sending TCP can't send and thus it will be unable to
empty the local TCP send buffers. So, all send operations either
need to assume they may block for some time or non-blocking sends
must be used.
8.4. Using the TCP connection for binding data 8.4. Using the TCP connection for binding data
Binding data, in the form of BNDUPD messages and BNDACK messages to Binding data, in the form of BNDUPD messages and BNDACK messages to
respond to them, are sent across the TCP connection. respond to them, are sent across the TCP connection.
In order to support timely detection of any failure in the partner In order to support timely detection of any failure in the partner
server, the TCP connection MUST NOT block for more than a very short server, the TCP connection MUST NOT block for more than a very short
time, on the order of a few seconds. Therefore, a server that is time, on the order of a few seconds. Therefore, a server that is
sending BNDUPD messages MUST send only a restricted number before sending BNDUPD messages MUST send only a restricted number before
receiving BNDACK messages about previous messages sent. receiving BNDACK messages about previous messages sent.
The number of outstanding BNDUPD messages that each server will The number of outstanding BNDUPD messages that each server will
accept without causing TCP to block transmission of additional data accept without causing TCP to block transmission of additional data
(i.e., CONTACT messages) is sent by each server in the CONNECT and (i.e., CONTACT messages) is sent by each server in the CONNECT and
CONNECTACK messages in the max-unacked-bndupd option. CONNECTACK messages in the max-unacked-bndupd option.
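One way to honor the limit on outstanding BNDUPD messages is a simple send window keyed by XID, sketched below. The class `BndupdSender` and its methods are hypothetical names introduced only for illustration; the draft specifies the max-unacked-bndupd option but not any particular data structure.

```python
from collections import deque


class BndupdSender:
    """Hypothetical send window limited by the partner's max-unacked-bndupd."""

    def __init__(self, peer_max_unacked):
        self.peer_max_unacked = peer_max_unacked  # value received from partner
        self.pending = deque()                    # XIDs queued, not yet sent
        self.unacked = set()                      # XIDs sent, awaiting BNDACK

    def queue(self, xid):
        self.pending.append(xid)

    def sendable(self):
        # Release queued BNDUPDs only while the window has room, so that
        # CONTACT messages are never stuck behind unaccepted bulk data.
        released = []
        while self.pending and len(self.unacked) < self.peer_max_unacked:
            xid = self.pending.popleft()
            self.unacked.add(xid)
            released.append(xid)
        return released

    def on_bndack(self, xid):
        # A BNDACK frees one slot in the window.
        self.unacked.discard(xid)
```

With a window of 2 and three queued updates, the first call to `sendable()` releases two XIDs, and the third is released only after a BNDACK arrives.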
8.5. Using the TCP connection for control messages 8.5. Using the TCP connection for control messages
The TCP connection is used for control messages: POOLREQ, UPDREQ, The TCP connection is used for control messages: POOLREQ, UPDREQ,
STATE, UPDREQALL and the corresponding reply messages: POOLRESP, STATE, CONTACT, UPDREQALL and the corresponding reply messages: POOL-
UPDDONE. A server MUST immediately accept all of these messages from RESP, UPDDONE. A server MUST immediately accept all of these mes-
the TCP connection. A server MUST immediately accept any BNDACK sages from the TCP connection. A server MUST immediately accept any
which is received as well. BNDACK which is received as well.
8.6. Losing the TCP connection 8.6. Losing the TCP connection
When the TCP connection is lost, then communications is not ok with When the TCP connection is lost, then communications is not ok with
the other server. A server which has lost communications SHOULD the other server. A server which has lost communications SHOULD
immediately attempt to reconnect to the other server, and should immediately attempt to reconnect to the other server, and should
retry these connection attempts periodically. retry these connection attempts periodically.
Any BNDUPD or other messages that have been received but not yet pro- A BNDACK message can only be sent in response to a BNDUPD message
cessed from the partner SHOULD be processed as soon as possible. using the same TCP connection from which the BNDUPD message was
received, since the XID's in BNDUPD messages are guaranteed unique
only during the life of a single TCP connection. When a connection
to a partner server goes down, a server with unprocessed BNDUPD mes-
sages MAY simply drop all of those messages, since it can be sure
that the partner will retransmit them when they are next in communi-
cations. A server with unprocessed BNDUPD messages when a TCP con-
nection goes down MAY instead choose to process those BNDUPD mes-
sages, but it MUST NOT send any BNDACK messages in response (again
because of the issues surrounding XID uniqueness).
When the TCP connection is closed explicitly, the DISCONNECT message
with a reject-reason option (and, ideally, a message option) MUST be
sent over the TCP connection.
9. Protocol States 9. Protocol States
This section discusses the various states that a failover endpoint may This section discusses the various states that a failover endpoint
take, and the server actions required when entering the state, operating may take, and the server actions required when entering the state,
in the state, and leaving the state, as well as the events that cause operating in the state, and leaving the state, as well as the events
transitions out of the state into another state. that cause transitions out of the state into another state.
The state transition diagram in Figure 9.2-1 is relevant for this The state transition diagram in Figure 9.2-1 is relevant for this
section. This is the common state transition diagram for both servers
section. In the event that the textual description of a state differs in a failover pair. In the event that the textual description of a
from the state transition diagram, the textual description is to be con- state differs from the state transition diagram, the textual descrip-
sidered authoritative. This is the common state transition diagram for tion is to be considered authoritative.
both servers in a failover pair.
9.1. Server Initialization 9.1. Server Initialization
When a server starts it starts out in STARTUP state. See section 9.4 When a server starts it starts out in STARTUP state. See section 9.4
below for details. below for details.
9.2. Server State Transitions 9.2. Server State Transitions
Whenever a server transitions into a new state, it MUST record the Whenever a server transitions into a new state, it MUST record the
state and the time at which it entered that state in stable storage. state and the time at which it entered that state in stable storage.
skipping to change at page 75, line 9 skipping to change at page 95, line 52
blished between the two servers, each must record the state of the blished between the two servers, each must record the state of the
partner when communication was restored. State transitions on one partner when communication was restored. State transitions on one
server in some cases imply state transitions on the partner server, server in some cases imply state transitions on the partner server,
so a record of the current state of the partner server must be kept so a record of the current state of the partner server must be kept
by each server. by each server.
If the state of the partner changes while communicating a server If the state of the partner changes while communicating a server
moves through the communications-failed transition and into whatever moves through the communications-failed transition and into whatever
state results. It then immediately moves through whatever state state results. It then immediately moves through whatever state
transition is appropriate given the current state of the partner transition is appropriate given the current state of the partner
server. A server performing this operation SHOULD NOT drop the TCP server. A server performing this operation SHOULD NOT close the TCP
connection to its partner. connection to its partner.
DISCUSSION: DISCUSSION:
The point of this technique is simplicity, both in explanation of The point of this technique is simplicity, both in explanation of
the protocol and in its implementation. The alternative to this the protocol and in its implementation. The alternative to this
technique of memory of partner state and automatic state transi- technique of memory of partner state and automatic state transi-
tion on change of partner state is to have every state in the fol- tion on change of partner state is to have every state in the fol-
lowing diagram have a state transition for every possible state of lowing diagram have a state transition for every possible state of
the partner. With the approach adopted, only the states in which the partner. With the approach adopted, only the states in which
skipping to change at page 76, line 10 skipping to change at page 97, line 10
each possible partner state. each possible partner state.
The current state of a server MUST be recorded in stable storage and The current state of a server MUST be recorded in stable storage and
thus be available to the server after a server restart. thus be available to the server after a server restart.
+---------------+ V +--------------+ +---------------+ V +--------------+
| RECOVER - | | | STARTUP - | | RECOVER - | | | STARTUP - |
|(unresponsive) | +->|(unresponsive)| |(unresponsive) | +->|(unresponsive)|
+---------------+ +--------------+ +---------------+ +--------------+
Comm. OK +-----------------+ Comm. OK +-----------------+
Other State:-RECOVER | PARTNER DOWN - |<-----+ Other State:-RECOVER | PARTNER DOWN - |<-----------------+
| | | (responsive) | | | | | (responsive) | |
All POTENTIAL- +-----------------+ | All POTENTIAL- +-----------------+ +--------------+ |
Others CONFLICT------------ | --------+ ^(see | Others CONFLICT------------ | --------+ | RESOLUTION | |
| Comm. OK | | 9.8.3)| | Comm. OK | | INTERRUPTED | |
UPDREQ(ALL) Other State: | +-----+ | UPDREQ(ALL) Other State: | +-| (responsive) | |
Wait UPDDONE | | | Comm. | | Wait UPDDONE | | | | +--------------+ |
Wait MCLT from fail RECOVER All Others| Failed | | Wait MCLT from fail RECOVER All Others| Comm. OK ^ | |
+--------------+ | V V | | | +--------------+ | V V V | Ext. |
|RECOVER-DONE +| +--+ +--------------+ | | |RECOVER-DONE +| +--+ +--------------+ Comm. Cmd. |
|(unresponsive)| | | POTENTIAL + |<--+ | |(unresponsive)| | | POTENTIAL + | Failed | |
+--------------+ Wait for +>| CONFLICT | | +--------------+ Wait for +>| CONFLICT |------+ +-->|
Comm. OK Other | |(unresponsive)|<--- | --+ Comm. OK Other | |(unresponsive)|<--------+ |
+--Other State:-+ State: | +--------------+ | | +--Other State:-+ State: | +--------------+ | |
| | | RECOVER | | | | | | | RECOVER | | | |
| All POTENT. DONE | Resolve Conflict | | | All POTENT. DONE | Resolve Conflict | |
| Others: CONFLICT-- | ----+ (see 9.8) | | | Others: CONFLICT-- | ----+ (see 9.8) | |
| Wait for V V | | | Wait for V V | |
| Other State: NORMAL +-----------------+ | | | Other State: NORMAL +-----------------+ | |
| V | NORMAL + | External | | | V | NORMAL + | External | |
| +--+----------+-->| (balanced) |-Command-->+ | | +--+----------+-->| (balanced) |-Command---+-- | -----+
| ^ ^ +-----------------+ | | | ^ ^ +-----------------+ | |
| | | | | | | | | | | |
| Wait for Comm. OK Comm. External | | Wait for Comm. OK Comm. External |
| Other Other Failed Command | | Other Other Failed Command |
| State: State: | or | | | State: State: | or | |
|RECOVER-DONE NORMAL Start Safe Safe | | |RECOVER-DONE NORMAL Start Safe Safe | |
| | COMM. INT. Period Timer Period | | | | COMM. INT. Period Timer Period | |
| Comm. OK. | V expiration | | Comm. OK. | V expiration |
| Other State: | +------------------+ | | | Other State: | +------------------+ | |
| RECOVER +--| COMMUNICATIONS - |-----------+ | | RECOVER +--| COMMUNICATIONS - |-----------+ |
skipping to change at page 77, line 43 skipping to change at page 98, line 43
state the STARTUP bit MUST be set in the server-flags option and the state the STARTUP bit MUST be set in the server-flags option and the
previously recorded failover state MUST be placed in the server-state previously recorded failover state MUST be placed in the server-state
option. option.
9.3.2. Transition out of STARTUP state 9.3.2. Transition out of STARTUP state
Each server starts out in startup state every time it initializes Each server starts out in startup state every time it initializes
itself, and performs the following algorithm as part of its initiali- itself, and performs the following algorithm as part of its initiali-
zation: zation:
1. Do not send any messages until step 5. 1. Is there any record in stable storage of a previous failover
2. Is there any record in stable storage of a previous failover
state? If yes, set previous-state to the last recorded state state? If yes, set previous-state to the last recorded state
in stable storage, and continue with step 3. in stable storage, and continue with step 2.
Is there any configuration information that indicates that Is there any configuration information that indicates that
this server was previously running but lost its stable this server was previously running but lost its stable
storage? Such information must typically come from some storage? Such information must typically come from some
administrative intervention, since it is difficult for a administrative intervention, since it is difficult for a
server to distinguish first startup from a startup after it server to distinguish first startup from a startup after it
has lost its stable storage. If yes, then set the previous- has lost its stable storage. If yes, then set the previous-
state to RECOVER, and set the time-of-failure to whatever time state to RECOVER, and set the time-of-failure to whatever time
was configured, and go on to step 3. This time-of-failure was configured, and go on to step 2. This time-of-failure
will be used in the transition out of the RECOVER state into will be used in the transition out of the RECOVER state into
the RECOVER-DONE state, below. the RECOVER-DONE state, below.
If there is no record of any previous failover state in stable If there is no record of any previous failover state in stable
storage nor of any previous operational activity for this storage nor of any previous operational activity for this
server, then set the previous-state to PARTNER-DOWN if this server, then set the previous-state to PARTNER-DOWN if this
server is a primary and RECOVER if this server is a secondary, server is a primary and RECOVER if this server is a secondary,
and set the time-of-failure to a time before the maximum- and set the time-of-failure to a time before the maximum-
client-lead-time before now. If using standard Posix times, 0 client-lead-time before now. If using standard Posix times, 0
would typically do quite well. would typically do quite well.
3. Is the previous-state NORMAL? If yes, set the previous-state 2. Is the previous-state NORMAL? If yes, set the previous-state
to COMMUNICATIONS-INTERRUPTED. to COMMUNICATIONS-INTERRUPTED.
4. Start the STARTUP state timer. The time that a server remains 3. Start the STARTUP state timer. The time that a server remains
in the STARTUP state (absent any communications with its in the STARTUP state (absent any communications with its
partner) is implementation dependent and SHOULD be configurable. partner) is implementation dependent and SHOULD be configurable.
It SHOULD be long enough for a TCP connection to be It SHOULD be long enough for a TCP connection to be
created to a heavily loaded partner across a slow network. created to a heavily loaded partner across a slow network.
5. Attempt to create a TCP connection to the failover partner. 4. Attempt to create a TCP connection to the failover partner.
See section 8.2. See section 8.2.
6. Wait for "communications okay", i.e., the process discussed in 5. Wait for "communications okay", i.e., the process discussed in
section 8.2 "Creating the TCP Connection", to complete, section 8.2 "Creating the TCP Connection", to complete,
including the receipt of a STATE message from the partner. including the receipt of a STATE message from the partner.
When and if communications become "okay", clear the STARTUP When and if communications become "okay", clear the STARTUP
flag, and set the current state to the previous-state. flag, and set the current state to the previous-state.
If the partner is in PARTNER-DOWN state, and if the time at If the partner is in PARTNER-DOWN state, and if the time at
which it entered PARTNER-DOWN state (as receive in the start- which it entered PARTNER-DOWN state (as received in the
time-of-state option in the STATE message) is later than the start-time-of-state option in the STATE message) is later than
last recorded time of operation of this server, then set the the last recorded time of operation of this server, then set
current state to RECOVER. the current state to RECOVER. If the time at which it entered
PARTNER-DOWN state is earlier than the last recorded time of
operation of this server, then set the current state to
POTENTIAL-CONFLICT.
Then, transition to the current state and take the "communica- Then, transition to the current state and take the "communica-
tions okay" state transition based on the current state of tions okay" state transition based on the current state of
this server and the partner. this server and the partner.
7. If the startup time expires, take an implementation dependent 7. If the startup time expires, take an implementation dependent
action: The server MAY go to the previous-state, or the action: The server MAY go to the previous-state, or the
server MAY wait. server MAY wait.
Reasons to go to previous-state and begin processing: Reasons to go to previous-state and begin processing:
skipping to change at page 79, line 39 skipping to change at page 100, line 41
already allocated some of those available addresses to DHCP already allocated some of those available addresses to DHCP
clients. In cases where the possibility of partition is high, clients. In cases where the possibility of partition is high,
and the safe period expiration time is less than the likely and the safe period expiration time is less than the likely
operator reaction time, this is a good approach to use. operator reaction time, this is a good approach to use.
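The state selection performed by the STARTUP algorithm above can be sketched as two small functions. The function and parameter names are hypothetical, the time-of-failure bookkeeping is omitted, and the POTENTIAL-CONFLICT branch follows the wording that appears only in the -05 column. The drafts do not say what happens when the two times are exactly equal; this sketch assumes the previous-state is kept in that case.

```python
def initial_previous_state(recorded_state, lost_stable_storage, is_primary):
    """Choose previous-state from stable storage or configuration."""
    if recorded_state is not None:
        previous = recorded_state
    elif lost_stable_storage:
        # Configuration says the server ran before but lost its storage.
        previous = "RECOVER"
    else:
        # First startup ever: primary and secondary start differently.
        previous = "PARTNER-DOWN" if is_primary else "RECOVER"
    # A server that was in NORMAL cannot assume it still is.
    if previous == "NORMAL":
        previous = "COMMUNICATIONS-INTERRUPTED"
    return previous


def state_after_contact(previous, partner_state, partner_state_start,
                        last_operation_time):
    """Choose the current state once communications become okay."""
    if partner_state == "PARTNER-DOWN":
        if partner_state_start > last_operation_time:
            return "RECOVER"
        if partner_state_start < last_operation_time:
            return "POTENTIAL-CONFLICT"   # -05 wording only
    return previous                        # equal times: assumption
```

The caller would then take the "communications okay" transition for the returned state and the partner's state.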
9.4. PARTNER-DOWN state 9.4. PARTNER-DOWN state
PARTNER-DOWN state is a state either server can enter. When in this PARTNER-DOWN state is a state either server can enter. When in this
state, the server does not assume that the other server could still state, the server does not assume that the other server could still
be operating and servicing a different set of clients, but instead be operating and servicing a different set of clients, but instead
assumes that it is the only server operating. For this reason, only assumes that it is the only server operating. If one server is in
one server should be operating in this state at a time. PARTNER-DOWN state, the other server MUST NOT be operating.
9.4.1. Upon entry to PARTNER-DOWN state 9.4.1. Upon entry to PARTNER-DOWN state
No special actions are required when entering PARTNER-DOWN state. No special actions are required when entering PARTNER-DOWN state.
The server should continue to attempt to connect to the partner The server should continue to attempt to connect to the partner
periodically. periodically.
9.4.2. Operation while in PARTNER-DOWN state 9.4.2. Operation while in PARTNER-DOWN state
skipping to change at page 80, line 28 skipping to change at page 101, line 28
INTERRUPTED state. INTERRUPTED state.
Any available IP address tagged as belonging to the other server (at Any available IP address tagged as belonging to the other server (at
entry to PARTNER-DOWN state) MUST NOT be used until the maximum- entry to PARTNER-DOWN state) MUST NOT be used until the maximum-
client-lead-time beyond the entry into PARTNER-DOWN state has client-lead-time beyond the entry into PARTNER-DOWN state has
elapsed. elapsed.
A server in PARTNER-DOWN state MUST NOT allocate an IP address to a A server in PARTNER-DOWN state MUST NOT allocate an IP address to a
DHCP client different from that to which it was allocated at the DHCP client different from that to which it was allocated at the
entrance to PARTNER-DOWN state until the maximum-client-lead-time entrance to PARTNER-DOWN state until the maximum-client-lead-time
beyond the its expiration time has elapsed. If this time would be beyond the maximum of the following times: client expiration time,
earlier than the current time plus the maximum-client-lead-time, then most recently transmitted potential-expiration-time, most recently
the current time plus the maximum-client-lead-time is used. received ack of potential-expiration-time from the partner, and most
recently acked potential-expiration-time to the partner. See section
7.1.4 for details. If this time would be earlier than the current
time plus the maximum-client-lead-time, then the time the server
entered PARTNER-DOWN state plus the maximum-client-lead-time is used.
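A literal reading of the -05 wording of this rule can be sketched as below. The function name and parameters are hypothetical, and the fallback branch reflects only one interpretation of the sentence beginning "If this time would be earlier"; the -04 wording uses the current time plus the maximum-client-lead-time instead.

```python
def earliest_reallocation(times, mclt, now, entered_partner_down):
    """Earliest time an address may be given to a different client.

    times: the four candidate times named in the text (client expiration
    time and the three potential-expiration-time values).
    """
    candidate = max(times) + mclt
    if candidate < now + mclt:
        # All recorded times are already in the past, so fall back to the
        # time PARTNER-DOWN was entered plus MCLT (per the -05 wording).
        return entered_partner_down + mclt
    return candidate
```

For example, with a maximum-client-lead-time of 60 and a latest recorded time of 120, the address stays reserved until time 180.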
Two options exist for lease times given out while in PARTNER-DOWN Two options exist for lease times given out while in PARTNER-DOWN
state, with different ramifications flowing from each. state, with different ramifications flowing from each.
If the server wishes the Failover protocol to protect it from loss of If the server wishes the Failover protocol to protect it from loss of
stable storage in PARTNER-DOWN state, then it should ensure that the stable storage in PARTNER-DOWN state, then it should ensure that the
MCLT based lease time restrictions in Section 5.1 are maintained, MCLT based lease time restrictions in Section 5.1 are maintained,
even in PARTNER-DOWN state. even in PARTNER-DOWN state.
If the server wishes to forego the protection of the Failover proto- If the server wishes to forego the protection of the Failover proto-
col in the event of loss of stable storage, then it need recognize no col in the event of loss of stable storage, then it need recognize no
restrictions on actual client lease times while in PARTNER-DOWN restrictions on actual client lease times while in PARTNER-DOWN
state. state.
A server in PARTNER-DOWN state attempt to establish communications A server in PARTNER-DOWN state MUST continue to attempt to establish
and synchronization with its partner. communications and synchronization with its partner.
9.4.3. Transitions out of PARTNER-DOWN state 9.4.3. Transitions out of PARTNER-DOWN state
When a server in PARTNER-DOWN state succeeds in establishing a con- When a server in PARTNER-DOWN state succeeds in establishing a con-
nection to its partner, its actions are conditional on the state and nection to its partner, its actions are conditional on the state and
flags received in the STATE message from the other server as part of flags received in the STATE message from the other server as part of
the process of establishing the connection. the process of establishing the connection.
If the STARTUP bit is set in the server-flags option of a received If the STARTUP bit is set in the server-flags option of a received
STATE message, a server in PARTNER-DOWN state MUST NOT take any state STATE message, a server in PARTNER-DOWN state MUST NOT take any state
transitions based on reestablishing communications. Essentially, if a transitions based on reestablishing communications. Essentially, if a
server is in PARTNER-DOWN state, it ignores all STATE messages from server is in PARTNER-DOWN state, it ignores all STATE messages from
its partner that have the STARTUP bit set in the server-flags option its partner that have the STARTUP bit set in the server-flags option
of the STATE message. of the STATE message.
If the STARTUP bit is not set in the server-flags option of a STATE If the STARTUP bit is not set in the server-flags option of a STATE
message received from its partner, then a server in PARTNER-DOWN message received from its partner, then a server in PARTNER-DOWN
state take the following actions based on the value of the server- state takes the following actions based on the value of the server-
state option in the received STATE message: state option in the received STATE message:
o partner in NORMAL, COMMUNICATIONS-INTERRUPTED, PARTNER-DOWN or o partner in NORMAL, COMMUNICATIONS-INTERRUPTED, PARTNER-DOWN or
POTENTIAL-CONFLICT state POTENTIAL-CONFLICT state
transition to POTENTIAL-CONFLICT state transition to POTENTIAL-CONFLICT state
o partner in RECOVER state o partner in RECOVER state
stay in PARTNER-DOWN state stay in PARTNER-DOWN state
skipping to change at page 81, line 51 skipping to change at page 103, line 11
A server in RECOVER state will attempt to reestablish communications A server in RECOVER state will attempt to reestablish communications
with the other server. with the other server.
9.5.2. Transitions out of RECOVER state 9.5.2. Transitions out of RECOVER state
If the other server is in POTENTIAL-CONFLICT state when communica- If the other server is in POTENTIAL-CONFLICT state when communica-
tions are reestablished, then the server in RECOVER state will move tions are reestablished, then the server in RECOVER state will move
to POTENTIAL-CONFLICT state itself. to POTENTIAL-CONFLICT state itself.
If the other server is in RECOVER state, then this server SHOULD If the other server is in RECOVER state, then this server SHOULD sig-
signal an error and halt processing. nal an error and halt processing.
If the other server is in any other state, then the server in RECOVER If the other server is in any other state, then the server in RECOVER
state will request an update of missing binding information by send- state will request an update of missing binding information by send-
ing an UPDREQ message. If the server has been instructed (through ing an UPDREQ message. If the server has been instructed (through
configuration or other external agency) that it has lost its stable configuration or other external agency) that it has lost its stable
storage, it MUST send an UPDREQALL message, otherwise it MUST send an storage, it MUST send an UPDREQALL message, otherwise it MUST send an
UPDREQ message. UPDREQ message.
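The three cases above might be sketched as follows. The enum names and the lost_stable_storage flag are assumptions made for illustration; the draft does not say how a server learns that its stable storage was lost beyond "configuration or other external agency".

```c
#include <stdbool.h>

/* Hypothetical identifiers for the failover states named in this section. */
enum fo_state {
    FO_NORMAL,
    FO_COMMUNICATIONS_INTERRUPTED,
    FO_PARTNER_DOWN,
    FO_POTENTIAL_CONFLICT,
    FO_RECOVER
};

/* What a server in RECOVER state does once communications are
   reestablished (names are illustrative only). */
enum recover_action {
    ACT_ENTER_POTENTIAL_CONFLICT,
    ACT_HALT_WITH_ERROR,
    ACT_SEND_UPDREQ,
    ACT_SEND_UPDREQALL
};

enum recover_action recover_on_connect(enum fo_state partner,
                                       bool lost_stable_storage)
{
    if (partner == FO_POTENTIAL_CONFLICT)
        return ACT_ENTER_POTENTIAL_CONFLICT;
    if (partner == FO_RECOVER)
        return ACT_HALT_WITH_ERROR;      /* both servers recovering */
    /* Any other partner state: request the missing binding data. */
    return lost_stable_storage ? ACT_SEND_UPDREQALL : ACT_SEND_UPDREQ;
}
```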
It will wait for an UPDDONE message, and upon receipt of that message It will wait for an UPDDONE message, and upon receipt of that message
it will start a timer whose expiration is set to a time equal to the it will start a timer whose expiration is set to a time equal to the
skipping to change at page 82, line 29 skipping to change at page 103, line 37
prior to loss of its client binding information in stable storage to prior to loss of its client binding information in stable storage to
contact the other server or to time out. contact the other server or to time out.
See Figure 9.5.2-1. See Figure 9.5.2-1.
DISCUSSION: DISCUSSION:
The actual requirement on this wait period in RECOVER is that it The actual requirement on this wait period in RECOVER is that it
start when the recovering server went down, not necessarily when start when the recovering server went down, not necessarily when
it came back up. If the time when the recovering server failed is it came back up. If the time when the recovering server failed is
known, then it could be communicated to the recovering server, and known, it could be communicated to the recovering server (perhaps
the wait period could be reduced to the maximum-client-lead-time through actions of the network administrator), and the wait period
less the difference between the current time and the time the could be reduced to the maximum-client-lead-time less the differ-
server failed. In this way, the waiting period could be minimized. ence between the current time and the time the server failed. In
this way, the waiting period could be minimized.
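The reduced wait period described in this discussion is simple arithmetic. A minimal sketch, assuming the failure time is somehow known to the recovering server (the function name and parameters are hypothetical):

```c
#include <time.h>

/* Remaining RECOVER wait in seconds: the maximum-client-lead-time less
   the time that has already elapsed since the server failed. The result
   is never negative. */
long recover_wait_seconds(long mclt, time_t failed_at, time_t now)
{
    long elapsed = (long)difftime(now, failed_at);
    long remaining;

    if (elapsed < 0)
        elapsed = 0;    /* assumption: treat clock skew as "just failed" */
    remaining = mclt - elapsed;
    return remaining > 0 ? remaining : 0;
}
```

For example, with an MCLT of 3600 seconds and a server that failed 600 seconds ago, the remaining wait would be 3000 seconds.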
If an UPDDONE message isn't received within an If an UPDDONE message isn't received within an
implementation-dependent amount of time, and no BNDUPD messages are implementation-dependent amount of time, and no BNDUPD messages are
being received, then the UPDREQ(ALL) message will be re-transmitted. being received, then the UPDREQ(ALL) message will be re-transmitted.
A B A B
Server Server Server Server
| | | |
RECOVER PARTNER-DOWN RECOVER PARTNER-DOWN
skipping to change at page 83, line 38 skipping to change at page 104, line 38
| NORMAL | NORMAL
| <-------------(NORMAL)-STATE--< | | <-------------(NORMAL)-STATE--< |
NORMAL | NORMAL |
| | | |
| | | |
Figure 9.5.2-1: Transition out of RECOVER state Figure 9.5.2-1: Transition out of RECOVER state
9.6. NORMAL state 9.6. NORMAL state
NORMAL state is the state used by a server when it can communicate NORMAL state is the state used by a server when it is communicating
with the other server. with the other server, and any required resynchronization has been
performed. While some bindings database synchronization is performed
in NORMAL state, potential conflicts are resolved prior to entry into
NORMAL state as is binding database data loss.
9.6.1. Upon Entry to NORMAL state 9.6.1. Upon Entry to NORMAL state
When entering NORMAL state, a server will send to the other server When entering NORMAL state, a server will send to the other server
all currently unacknowledged binding updates as BNDUPD messages. all currently unacknowledged binding updates as BNDUPD messages.
When the above process is complete, if the server entering NORMAL When the above process is complete, if the server entering NORMAL
state is a secondary server, then it will request IP addresses for state is a secondary server, then it will request IP addresses for
allocation using the POOLREQ message. allocation using the POOLREQ message.
9.6.2. Processing DHCP client requests and load balancing 9.6.2. Processing DHCP client requests and load balancing
When in NORMAL state, each server MUST process all requests from some When in NORMAL state, each server MUST process all requests from some
DHCP clients, and MUST NOT process any request other than a DHCP clients, and MUST NOT process any request other than a
DHCPREQUEST/RENEWAL or a DHCPREQUEST/REBINDING request from some DHCPREQUEST/RENEWAL or a DHCPREQUEST/REBINDING request from some
other DHCP clients. The load balancing algorithm determines into other DHCP clients.
which set a particular DHCP client falls.
As discussed in section 5.3, each server will take the client- However, if the load balancing algorithm specified in [LOADB] is used
identifier from each DHCP client request (or the htype concatenated with a pair of servers implementing the failover protocol, then each
to the front of the chaddr if no client-identifier is present in the server needs to test each incoming DHCP client request to see if it
request), and hash it with the algorithm given in section 12. The should process that request.
results of this hash algorithm yields a number between 0 and 255.
This number is used to index into the bit array received by a server
in the hash-bucket-assignment option (if the server is a secondary),
or into the inverse of the bit array sent to the secondary in the
hash-bucket-assignment option if the server is a primary.
If the bit found from this indexing process is a 1 bit, then the As discussed in section 5.3, each server will take the client-
server MUST process this DHCP request. identifier from each DHCP client request (or the client-hardware-
address, i.e., the htype concatenated to the front of the chaddr if
no client-identifier is present in the request) and use it as the
'Request ID' specified in [LOADB]. After applying the algorithm
specified in [LOADB] and comparing the result with the hash bucket
assignment (performed during connect processing between failover
servers), each failover server will be able to unambiguously
determine if it should process the DHCP client request.
In NORMAL state, a server MUST processes every DHCPREQUEST/RENEWAL or In NORMAL state, a server MUST process every DHCPREQUEST/RENEWAL or
DHCPREQUEST/REBINDING request it receives. DHCPREQUEST/REBINDING request it receives.
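The per-request test described in this section might look like the sketch below. It takes the hash bucket (0 to 255) as an input rather than computing it, and it assumes a 256-bit bucket assignment in which bucket 0 is the most significant bit of the first octet; that bit ordering, and all names here, are assumptions for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return the bit for 'bucket' in a 256-bit hash bucket assignment.
   Assumption: bucket 0 is the most significant bit of hba[0]. */
bool hba_bit(const uint8_t hba[32], uint8_t bucket)
{
    return (hba[bucket / 8] >> (7 - bucket % 8)) & 1u;
}

/* Decide whether this server processes a DHCP client request while in
   NORMAL state. 'hba' is the assignment sent to the secondary: the
   secondary uses it directly and the primary uses its inverse. */
bool should_process(const uint8_t hba[32], bool is_primary,
                    uint8_t bucket, bool renew_or_rebind)
{
    bool bit;

    if (renew_or_rebind)
        return true;    /* RENEWAL and REBINDING are always processed */
    bit = hba_bit(hba, bucket);
    return is_primary ? !bit : bit;
}
```

Because the primary uses the inverse of the array it sent, exactly one of the two servers answers any given non-renewal request.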
9.6.3. Operation in NORMAL state 9.6.3. Operation in NORMAL state
When in NORMAL state, for every DHCP client request that it When in NORMAL state, for every DHCP client request that it
processes, as determined by the algorithm described in section 9.6.2, processes, as determined by the algorithm described in section 9.6.2,
above, a server will operate in the following manner: above, a server will operate in the following manner:
o Lease time calculations o Lease time calculations
As discussed in section 5.2.1, "Control of lease time", the As discussed in section 5.2.1, "Control of lease time", the
lease interval given to a DHCP client can never be more than the lease interval given to a DHCP client can never be more than the
MCLT greater than the most recently received potential- MCLT greater than the most recently received potential-
expiration-time from the failover partner or the current time, expiration-time from the failover partner or the current time,
whichever is later. whichever is later.
As long as a server adheres to this constraint, the specifics of As long as a server adheres to this constraint, the specifics of
the lease interval that it gives to a DHCP client or the value the lease interval that it gives to a DHCP client or the value
of the potential-expiration-time sent to its failover partner of the potential-expiration-time sent to its failover partner
are implementation dependent. One possible approach is dis- are implementation dependent. One possible approach is
cussed in section 5.2.1, but that particular approach is in no discussed in section 5.2.1, but that particular approach is in
way required by this protocol. no way required by this protocol.
See section 7.1.4 for details concerning the storage of times
associated with IP addresses and how to use these times when
calculating lease times for DHCP clients.
o Lazy update of partner server o Lazy update of partner server
After an ACK of an IP address binding, the server servicing a After an ACK of an IP address binding, the server servicing a
DHCP client request attempts to update its partner with the new DHCP client request attempts to update its partner with the new
binding information. The lease time used in the update of the binding information. The lease time used in the update of the
secondary MUST be that given to the DHCP client in the secondary MUST be that given to the DHCP client in the
DHCPACK, and the potential-expiration-time MUST be at least the DHCPACK, and the potential-expiration-time MUST be at least the
lease time, and SHOULD be longer. lease time, and SHOULD be longer.
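The constraint in the "Lease time calculations" item above can be expressed as a cap on the lease interval. This is a sketch of that constraint only, not of the approach in section 5.2.1; the function name and parameters are hypothetical.

```c
#include <time.h>

/* Largest lease interval (in seconds) that may be given to a client:
   no more than the MCLT beyond the later of the current time and the
   most recently received potential-expiration-time from the partner.
   Pass 0 for partner_pot_exp if none has been received. */
long capped_lease_seconds(long desired, long mclt,
                          time_t now, time_t partner_pot_exp)
{
    time_t base = partner_pot_exp > now ? partner_pot_exp : now;
    long limit = (long)difftime(base, now) + mclt;

    return desired < limit ? desired : limit;
}
```

With an MCLT of 3600 seconds and nothing yet received from the partner, a desired lease of one day would be capped at 3600 seconds.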
skipping to change at page 86, line 47 skipping to change at page 108, line 8
renewal of a DHCP client's current lease on an IP address irrespec- renewal of a DHCP client's current lease on an IP address irrespec-
tive of whether that lease was given out by the receiving server or tive of whether that lease was given out by the receiving server or
not, although the renewal period MUST NOT exceed the maximum client not, although the renewal period MUST NOT exceed the maximum client
lead time (MCLT) beyond the potential-expiration-time already ack- lead time (MCLT) beyond the potential-expiration-time already ack-
nowledged by the other server or the lease-expiration-time or nowledged by the other server or the lease-expiration-time or
potential-expiration-time received from the partner server. potential-expiration-time received from the partner server.
However, since the server cannot communicate with its partner in this However, since the server cannot communicate with its partner in this
state, the acknowledged-potential-expiration time will not be updated state, the acknowledged-potential-expiration time will not be updated
in any new bindings. This is likely to eventually cause the actual- in any new bindings. This is likely to eventually cause the actual-
client-lease-times to be the current-time plus the maximum-client- client-lease-times to be the current time plus the maximum-client-
lead-time (unless this is greater than the desired-client-lease- lead-time (unless this is greater than the desired-client-lease-
time). time).
9.7.3. Transition out of COMMUNICATIONS-INTERRUPTED State 9.7.3. Transition out of COMMUNICATIONS-INTERRUPTED State
If the safe period timer expires while a server is in the If the safe period timer expires while a server is in the
COMMUNICATIONS-INTERRUPTED state, it will transition immediately into COMMUNICATIONS-INTERRUPTED state, it will transition immediately into
PARTNER-DOWN state. PARTNER-DOWN state.
If an external command is received by a server in COMMUNICATIONS- If an external command is received by a server in COMMUNICATIONS-
INTERRUPTED state informing it that its partner is down, it will INTERRUPTED state informing it that its partner is down, it will
transition immediately into PARTNER-DOWN state. transition immediately into PARTNER-DOWN state.
If communications are restored with the other server, then the server If communications are restored with the other server, then the server
in COMMUNICATIONS-INTERRUPTED state will transition into another in COMMUNICATIONS-INTERRUPTED state will transition into another
state based on the state of the partner: state based on the state of the partner:
o partner in NORMAL or COMMUNICATIONS-INTERRUPTED o partner in NORMAL or COMMUNICATIONS-INTERRUPTED
The partner really SHOULD NOT be in NORMAL state here, since
upon restoration of communications it MUST have created a new
TCP connection which would have forced it into COMMUNICATIONS-
INTERRUPTED state. Still, we should account for every state
just in case.
Transition into the NORMAL state. Transition into the NORMAL state.
o partner in RECOVER o partner in RECOVER
Stay in COMMUNICATIONS-INTERRUPTED state. Stay in COMMUNICATIONS-INTERRUPTED state.
o partner in RECOVER-DONE o partner in RECOVER-DONE
Transition into NORMAL state. Transition into NORMAL state.
skipping to change at page 90, line 35 skipping to change at page 111, line 35
| >--BNDUPD--------------------> | | >--BNDUPD--------------------> |
| <---------------------BNDACK--< | | <---------------------BNDACK--< |
... ... ... ...
| >--BNDUPD--------------------> | | >--BNDUPD--------------------> |
| <---------------------BNDACK--< | | <---------------------BNDACK--< |
| | | |
| >--UPDDONE-------------------> | | >--UPDDONE-------------------> |
| NORMAL | NORMAL
| | | |
| <--------------------POOLREQ--< | | <--------------------POOLREQ--< |
| >------POOLRESP-(?)----------> | | >------POOLRESP-(n)----------> |
| | | addresses |
Figure 9.8.3-1: Transition out of POTENTIAL-CONFLICT Figure 9.8.3-1: Transition out of POTENTIAL-CONFLICT
9.9. RECOVER-DONE state 9.9. RESOLUTION-INTERRUPTED state
This state indicates that the two servers were attempting to re-
integrate with each other in POTENTIAL-CONFLICT state, but
communications failed prior to completion of re-integration.
If the servers remained in POTENTIAL-CONFLICT while communications
were interrupted, neither server would be responsive to DHCP client
requests, and if one server had crashed, then there might be no
server able to process DHCP requests.
9.9.1. Upon Entry to RESOLUTION-INTERRUPTED state
When a server enters RESOLUTION-INTERRUPTED state, it SHOULD raise an
alarm condition to alert administrative staff of a problem in the DHCP
subsystem.
9.9.2. Operation in RESOLUTION-INTERRUPTED state
In this state a server MUST respond to all DHCP client requests, and
any load balancing (described in section 5.3) MUST NOT be used. When
allocating new IP addresses, each server SHOULD allocate from its own
IP address pool (if that can be determined), where the primary MUST
allocate only FREE IP addresses, and the secondary MUST allocate only
BACKUP IP addresses. When responding to renewal requests, each
server will allow continued renewal of a DHCP client's current lease
on an IP address irrespective of whether that lease was given out by
the receiving server or not, although the renewal period MUST NOT
exceed the maximum client lead time (MCLT) beyond the potential-
expiration-time already acknowledged by the other server or the
lease-expiration-time or potential-expiration-time received from the
partner server.
However, since the server cannot communicate with its partner in this
state, the acknowledged-potential-expiration time will not be updated
in any new bindings.
9.9.3. Transitions out of RESOLUTION-INTERRUPTED state
If an external command is received by a server in RESOLUTION-
INTERRUPTED state informing it that its partner is down, it will
transition immediately into PARTNER-DOWN state.
If communications are restored with the other server, then the server
in RESOLUTION-INTERRUPTED state will transition into POTENTIAL-
CONFLICT state.
9.10. RECOVER-DONE state
This state exists to allow an interlocked transition for one server This state exists to allow an interlocked transition for one server
from RECOVER state and another server from PARTNER-DOWN or from RECOVER state and another server from PARTNER-DOWN or
COMMUNICATIONS-INTERRUPTED state into NORMAL state. COMMUNICATIONS-INTERRUPTED state into NORMAL state.
9.9.1. Operation in RECOVER-DOWN state 9.10.1. Operation in RECOVER-DONE state
A server in RECOVER-DONE state MUST respond only to A server in RECOVER-DONE state MUST respond only to
DHCPREQUEST/RENEWAL and DHCPREQUEST/REBINDING DHCP messages. DHCPREQUEST/RENEWAL and DHCPREQUEST/REBINDING DHCP messages.
9.9.2. Transitions out of RECOVER-DONE state 9.10.2. Transitions out of RECOVER-DONE state
When a server in RECOVER-DONE state determines that its partner When a server in RECOVER-DONE state determines that its partner
server has entered NORMAL state, then it will transition into NORMAL server has entered NORMAL state, then it will transition into NORMAL
state as well. state as well.
9.10. PAUSED state 9.11. PAUSED state
This state exists to allow one server to inform another that it will This state exists to allow one server to inform another that it will
be out of service for what is predicted to be a relatively short be out of service for what is predicted to be a relatively short
time, and to allow the other server to transition to COMMUNICATIONS- time, and to allow the other server to transition to COMMUNICATIONS-
INTERRUPTED state immediately and to begin servicing all DHCP clients INTERRUPTED state immediately and to begin servicing all DHCP clients
with no interruption in service to new DHCP clients. with no interruption in service to new DHCP clients.
A server which is aware that it is shutting down temporarily SHOULD A server which is aware that it is shutting down temporarily SHOULD
send a STATE message with the server-state option containing PAUSED send a STATE message with the server-state option containing PAUSED
state. state and close the TCP connection.
While a server may or may not transition internally into PAUSED While a server may or may not transition internally into PAUSED
state, the 'previous' state determined when it is restarted MUST be state, the 'previous' state determined when it is restarted MUST be
the state the server was in prior to receiving the command to shut- the state the server was in prior to receiving the command to shut-
down and restart and which precedes its entry into the PAUSED state. down and restart and which precedes its entry into the PAUSED state.
See section 9.3.2 concerning the use of the previous state upon See section 9.3.2 concerning the use of the previous state upon
server restart. server restart.
9.10.1. Upon entry to PAUSED state 9.11.1. Upon entry to PAUSED state
When entering PAUSED state, the server MUST store the previous state When entering PAUSED state, the server MUST store the previous state
in stable storage, and use that state as the previous state when it in stable storage, and use that state as the previous state when it
is restarted. is restarted.
9.10.2. Transitions out of PAUSED state 9.11.2. Transitions out of PAUSED state
A server transitions out of PAUSED state by being restarted. At that A server transitions out of PAUSED state by being restarted. At that
time, the previous state MUST be the state the server was in prior to time, the previous state MUST be the state the server was in prior to
entering the PAUSED state. entering the PAUSED state.
9.11. SHUTDOWN state 9.12. SHUTDOWN state
This state exists to allow one server to inform another that it will This state exists to allow one server to inform another that it will
be out of service for what is predicted to be a relatively long time, be out of service for what is predicted to be a relatively long time,
and to allow the other server to transition immediately to PARTNER- and to allow the other server to transition immediately to PARTNER-
DOWN state, and take over completely for the server going down. DOWN state, and take over completely for the server going down.
A server which is aware that it is shutting down SHOULD send a STATE A server which is aware that it is shutting down SHOULD send a STATE
message with the server-state field containing SHUTDOWN. message with the server-state field containing SHUTDOWN.
While a server may or may not transition internally into SHUTDOWN While a server may or may not transition internally into SHUTDOWN
state, the 'previous' state determined when it is restarted MUST be state, the 'previous' state determined when it is restarted MUST be
the state active prior to the command to shutdown. See section 9.3.2 the state active prior to the command to shutdown. See section 9.3.2
concerning the use of the previous state upon server restart. concerning the use of the previous state upon server restart.
9.11.1. Upon entry to SHUTDOWN state 9.12.1. Upon entry to SHUTDOWN state
When entering SHUTDOWN state, the server MUST record the previous When entering SHUTDOWN state, the server MUST record the previous
state in stable storage for use when the server is restarted. It state in stable storage for use when the server is restarted. It
also MUST record the current time as the last time operational. also MUST record the current time as the last time operational.
A server which is aware that it is shutting down SHOULD send a STATE A server which is aware that it is shutting down SHOULD send a STATE
message with the server-state field containing SHUTDOWN. message with the server-state field containing SHUTDOWN.
9.11.2. Operation in SHUTDOWN state 9.12.2. Operation in SHUTDOWN state
A server in SHUTDOWN state MUST NOT respond to any DHCP client input. A server in SHUTDOWN state MUST NOT respond to any DHCP client input.
If a server receives any message indicating that the partner has If a server receives any message indicating that the partner has
moved to PARTNER-DOWN state while it is in SHUTDOWN state then it moved to PARTNER-DOWN state while it is in SHUTDOWN state then it
MUST record RECOVER state as the previous state to be used when it is MUST record RECOVER state as the previous state to be used when it is
restarted. restarted.
A server SHOULD wait for a few seconds after informing the partner of A server SHOULD wait for a few seconds after informing the partner of
entry into SHUTDOWN state (if communications are okay) to determine entry into SHUTDOWN state (if communications are okay) to determine
if it will enter PARTNER-DOWN state. if it will enter PARTNER-DOWN state.
9.11.3. Transitions out of SHUTDOWN state 9.12.3. Transitions out of SHUTDOWN state
A server transitions out of SHUTDOWN state by being restarted. A server transitions out of SHUTDOWN state by being restarted.
10. Safe Period 10. Safe Period
Due to the restrictions imposed on each server while in Due to the restrictions imposed on each server while in
COMMUNICATIONS-INTERRUPTED state, long-term operation in this state COMMUNICATIONS-INTERRUPTED state, long-term operation in this state
is not feasible for either server. One reason that these states is not feasible for either server. One reason that these states
exist at all, is to allow the servers to easily survive transient exist at all, is to allow the servers to easily survive transient
network communications failures of a few minutes to a few days network communications failures of a few minutes to a few days
(although the actual time periods will depend a great deal on the (although the actual time periods will depend a great deal on the
DHCP activity of the network in terms of arrival and departure of DHCP activity of the network in terms of arrival and departure of
DHCP clients on the network). DHCP clients on the network).
Eventually, when the servers are unable to communicate, they will Eventually, when the servers are unable to communicate, they will
have to move into a state where they no longer can re-integrate have to move into a state where they no longer can re-integrate
without the some possibility of a duplicate IP address allocation. without some possibility of a duplicate IP address allocation. There
There are two ways that they can move into this state (known as are two ways that they can move into this state (known as PARTNER-
PARTNER-DOWN). DOWN).
They can either be informed by external command that, indeed, the They can either be informed by external command that, indeed, the
partner server is down. In this case, there is no difficulty in mov- partner server is down. In this case, there is no difficulty in mov-
ing into the PARTNER-DOWN state since it is an accurate reflection of ing into the PARTNER-DOWN state since it is an accurate reflection of
reality and the protocol has been designed to operate correctly (even reality and the protocol has been designed to operate correctly (even
during reintegration) if, when in PARTNER-DOWN state the partner is, during reintegration) if, when in PARTNER-DOWN state the partner is,
indeed, down. indeed, down.
The more difficult scenario is when the servers are running unat- The more difficult scenario is when the servers are running unat-
tended for extended periods, and in this case an option is provided tended for extended periods, and in this case an option is provided
skipping to change at page 94, line 7 skipping to change at page 116, line 7
is all that can be used (given a dearth of IP addresses or a very is all that can be used (given a dearth of IP addresses or a very
high arrival rate of new DHCP clients), even that can provide sub- high arrival rate of new DHCP clients), even that can provide sub-
stantial benefits in allowing the DHCP subsystem to ride through stantial benefits in allowing the DHCP subsystem to ride through
minor problems that could occur and be fixed within that hour. In minor problems that could occur and be fixed within that hour. In
these cases, no possibility of duplicate IP address allocation these cases, no possibility of duplicate IP address allocation
exists, and re-integration after the failure is solved will be exists, and re-integration after the failure is solved will be
automatic and require no operator intervention. automatic and require no operator intervention.
11. Security 11. Security
It is very desirable to assure the integrity of failover partners and The Failover protocol communicates DHCP lease activity and this data
to thus ensure proper operation of the servers. For example, denial is generally easily discovered via other means, such as by pinging
of service attacks are possible by the communication of invalid state addresses and doing DNS lookups. Therefore, the need to encrypt the
information to both servers. data over the wire is likely not great (though some sites may feel
differently).
The Failover protocol MAY be secured either by using a simple shared However, it is very desirable to assure the integrity of failover
secret message digest which covers each message or by using TLS [TLS] partners and to thus ensure proper operation of the servers. For
(Transport Layer Security). example, denial of service attacks are possible by the communication
of invalid state information to one or both servers.
Therefore, the Failover protocol MUST be capable of being secured by
using a simple shared secret message digest which covers each mes-
sage. This provides authentication of the servers, but does not pro-
vide encryption of the data exchange.
The Failover protocol MAY also be secured by using TLS [TLS] (Tran-
sport Layer Security) if encryption of the data exchange is desired.
The use of the shared secret or TLS will not protect against TCP or
IP layer attacks (such as someone sending fake TCP RST segments).
IPsec SHOULD be used to protect against most (if not all) of these
kinds of attacks.
11.1. Simple shared secret 11.1. Simple shared secret
A simple shared secret message digest MAY be used to cover each mes- Messages between the failover partners are authenticated through the
sage. Since there are a number of configuration parameters that must use of a shared secret, which is never sent over the network and must
already be the same on each server in a pair, it is not unreasonable be known by each server. How each server is told about this shared
to require a shared secret to be configured as well. secret and secures its storage of the shared secret is outside the
scope of this document. If a server is configured with a shared
secret for a partner, it MUST send the message-digest option in ALL
messages to that partner and it MUST treat any messages received from
that partner without a message-digest option as failing authentica-
tion.
Only information within the packet and covered by the message digest If a server is not configured with a shared secret for a partner, it
is used for operation of the protocol. It is for this reason that the MUST NOT send the message-digest option in any message to that
IP address of the sending server is sent in the sending-server-IP- partner and it MUST treat any messages received from that partner
address option of the CONNECT and CONNECTACK messages. with a message-digest option as failing authentication.
This message digest is placed in the message-digest option. The dig- The shared secret is used to calculate a 16 octet message-digest
est covers the message prior to the inclusion of the message-digest which is sent in every failover message as the message-digest option.
option. See section 6.2.25. The message-digest contains a one-way 16 octet
MD5 [MD5] hash calculated over a stream of octets consisting of the
entire message concatenated with the shared secret.
For calculation, the message includes the message-digest option with
the message-digest data zeroed (16-octets of zero). Once the calcula-
tion is complete, these 16 octets of zero are replaced by the 16-
octet MD5 hash and the message is sent.
For verification, the 16-octet message-digest is saved and replaced
with 16-octets of zero and calculated per above. The resulting MD5
hash is compared to the received hash and if they match, the message
is assumed authenticated.
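The calculation and verification steps above can be sketched as follows. To stay self-contained, this sketch takes the hash as a callback instead of implementing MD5, and it includes a toy digest that is NOT cryptographically secure and stands in for MD5 only so that the example runs. All names, and the idea of passing the offset of the 16 digest octets, are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DIGEST_LEN 16

/* Hypothetical callback: writes a 16-octet digest of buf[0..len) to out.
   A real implementation would supply MD5 here. */
typedef void (*digest_fn)(const uint8_t *buf, size_t len,
                          uint8_t out[DIGEST_LEN]);

/* Toy stand-in for MD5. NOT secure; for demonstration only. */
void toy_digest(const uint8_t *buf, size_t len, uint8_t out[DIGEST_LEN])
{
    size_t i;

    memset(out, 0, DIGEST_LEN);
    for (i = 0; i < len; i++)
        out[i % DIGEST_LEN] ^= (uint8_t)(buf[i] + i);
}

/* Digest over (message with zeroed digest data) || shared secret.
   digest_off is the offset of the 16 digest octets within msg. */
static int compute_digest(const uint8_t *msg, size_t msg_len,
                          size_t digest_off, const uint8_t *secret,
                          size_t secret_len, digest_fn md,
                          uint8_t out[DIGEST_LEN])
{
    uint8_t *buf;

    if (digest_off > msg_len || msg_len - digest_off < DIGEST_LEN)
        return -1;
    buf = malloc(msg_len + secret_len);
    if (buf == NULL)
        return -1;
    memcpy(buf, msg, msg_len);
    memset(buf + digest_off, 0, DIGEST_LEN);   /* 16 octets of zero */
    if (secret_len > 0)
        memcpy(buf + msg_len, secret, secret_len);
    md(buf, msg_len + secret_len, out);
    free(buf);
    return 0;
}

/* Sender: fill in the digest data in place. Returns 0 on success. */
int sign_message(uint8_t *msg, size_t msg_len, size_t digest_off,
                 const uint8_t *secret, size_t secret_len, digest_fn md)
{
    uint8_t d[DIGEST_LEN];

    if (compute_digest(msg, msg_len, digest_off, secret, secret_len,
                       md, d) != 0)
        return -1;
    memcpy(msg + digest_off, d, DIGEST_LEN);
    return 0;
}

/* Receiver: returns 1 if the received digest matches, else 0. */
int verify_message(const uint8_t *msg, size_t msg_len, size_t digest_off,
                   const uint8_t *secret, size_t secret_len, digest_fn md)
{
    uint8_t d[DIGEST_LEN];

    if (compute_digest(msg, msg_len, digest_off, secret, secret_len,
                       md, d) != 0)
        return 0;
    return memcmp(d, msg + digest_off, DIGEST_LEN) == 0;
}
```

Working on a copy of the message keeps the received digest intact for the comparison, so the verifier never has to restore the zeroed octets.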
A failover partner that fails to authenticate a received message or
receives a message without a message-digest option when configured
with a shared secret MUST close the connection immediately and take
steps to notify operators.
This use of the shared secret is very similar to that used for RADIUS
Accounting [RADIUS].
11.2. TLS 11.2. TLS
TLS, Transport Layer Security, as specified in [TLS] MAY be used. The TLS, Transport Layer Security, as specified in [TLS] MAY be used.
use of TLS would be similar to the way it is used with SMTP [SMTPTLS] The use of TLS would be similar to the way it is used with SMTP
and IMAP/POP3/ACAP [IPAMTLS]. [SMTPTLS] and IMAP/POP3/ACAP [IPAMTLS].
To request the use TLS, the server that successfully opened a connec- To request the use of TLS, the server that successfully opened a con-
tion to its peer MUST send the TLS option as part of the CONNECT mes- nection to its peer MUST send the TLS option as part of the CONNECT
sage. The server receiving the TLS option MUST respond with a TLS- message. The server receiving the TLS option MUST respond with a
reply option indicating its acceptace or rejection of the TLS-request TLS-reply option indicating its acceptance or rejection of the TLS-
in the CONNECT message. request in the CONNECT message.
If the CONNECTACK message contained a TLS-reply of 1, then both If the CONNECTACK message contained a TLS-reply of 1, then both
servers begin TLS negotiation. servers begin TLS negotiation.
Upon completion of this negotiation, the server which originally sent Upon completion of this negotiation, the server which originally sent
the CONNECT message MUST resend its CONNECT message without any the CONNECT message MUST resend its CONNECT message without any
TLS-request, and must wait for a corresponding CONNECTACK. TLS-request, and must wait for a corresponding CONNECTACK.
Implementation of the TLS_DHE_DSS_WITH_3DES_EDE_CBC_SHA [TLS] cipher Implementation of the TLS_DHE_DSS_WITH_3DES_EDE_CBC_SHA [TLS] cipher
suite is REQUIRED in Failover servers supporting TLS. This is suite is REQUIRED in Failover servers supporting TLS. This is impor-
important as it assures that any two compliant implementations can be tant as it assures that any two compliant implementations can be con-
configured to interoperate. figured to interoperate.
12. Hash algorithm for load balancing
The following hash function is an implementation of the algorithm known
as "Pearson's hash". Pearson's hash algorithm was originally published
in the Communications of the ACM, Vol. 33, No. 6 (June 1990), pp.
677-680. The author, Peter K. Pearson, has kindly granted his permission
to use this algorithm, free of any encumbrances.
To make Primary-backup load balancing possible, both servers MUST use
the same hash function.
/* A "mixing table" of 256 distinct values, in pseudo-random order. */
unsigned char failover_hash_mx_tbl[256] =
{
251, 175, 119, 215, 81, 14, 79, 191, 103, 49,
181, 143, 186, 157, 0, 232, 31, 32, 55, 60,
152, 58, 17, 237, 174, 70, 160, 144, 220, 90,
57, 223, 59, 3, 18, 140, 111, 166, 203, 196,
134, 243, 124, 95, 222, 179, 197, 65, 180, 48,
36, 15, 107, 46, 233, 130, 165, 30, 123, 161,
209, 23, 97, 16, 40, 91, 219, 61, 100, 10,
210, 109, 250, 127, 22, 138, 29, 108, 244, 67,
207, 9, 178, 204, 74, 98, 126, 249, 167, 116,
34, 77, 193, 200, 121, 5, 20, 113, 71, 35,
128, 13, 182, 94, 25, 226, 227, 199, 75, 27,
41, 245, 230, 224, 43, 225, 177, 26, 155, 150,
212, 142, 218, 115, 241, 73, 88, 105, 39, 114,
62, 255, 192, 201, 145, 214, 168, 158, 221, 148,
154, 122, 12, 84, 82, 163, 44, 139, 228, 236,
205, 242, 217, 11, 187, 146, 159, 64, 86, 239,
195, 42, 106, 198, 118, 112, 184, 172, 87, 2,
173, 117, 176, 229, 247, 253, 137, 185, 99, 164,
102, 147, 45, 66, 231, 52, 141, 211, 194, 206,
246, 238, 56, 110, 78, 248, 63, 240, 189, 93,
92, 51, 53, 183, 19, 171, 72, 50, 33, 104,
101, 69, 8, 252, 83, 120, 76, 135, 85, 54,
202, 125, 188, 213, 96, 235, 136, 208, 162, 129,
190, 132, 156, 38, 47, 1, 7, 254, 24, 4,
216, 131, 89, 21, 28, 133, 37, 153, 149, 80,
170, 68, 6, 169, 234, 151
};
unsigned char failover_p_hash(
    unsigned char *key,  /* The key to be hashed (e.g., MAC address) */
    int len              /* Length of key in bytes */ )
{
    /* Seed the hash with the key length (truncated to 8 bits). */
    unsigned char hash = len;
    int i;
    /* Mix in the key bytes, from the last byte to the first. */
    for( i=len ; i > 0 ; )
    {
        hash = failover_hash_mx_tbl [ hash ^ key[ --i ] ];
    }
    return( hash );
}
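This section does not show how the 8-bit result divides clients between the two servers. As a minimal sketch, assuming a hypothetical threshold named split (not defined in this section) below which the primary answers, the decision might look like the following. The enum, failover_pick_server(), and split are illustrative names only.

```c
#include <assert.h>

/* Hypothetical roles; the names are illustrative, not taken from the draft. */
enum failover_role { FAILOVER_PRIMARY, FAILOVER_SECONDARY };

/*
 * hash:  value returned by failover_p_hash() for the client's key.
 * split: assumed threshold in the range 0..256.  Hash values below it are
 *        served by the primary, all others by the secondary.
 */
enum failover_role failover_pick_server(unsigned char hash, unsigned int split)
{
    assert(split <= 256);
    return (hash < split) ? FAILOVER_PRIMARY : FAILOVER_SECONDARY;
}
```

Because both servers compute the same hash for a given key, each can evaluate this test independently and reach the same answer. Under this assumption a split of 128 divides the hash space evenly, while 256 sends every client to the primary.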
13. Acknowledgments 12. Acknowledgments
Ralph Droms started it all, by sketching out an initial interserver Ralph Droms started it all, by sketching out an initial interserver
draft that embodied ideas from several past IETF meetings. In that draft that embodied ideas from several past IETF meetings. In that
draft, he acknowledged contributions by Jeff Mogul, Greg Minshall, draft, he acknowledged contributions by Jeff Mogul, Greg Minshall,
Rob Stevens, Walt Wimer, Ted Lemon, and the DHC working group.