draft-ietf-dhc-failover-05.txt   draft-ietf-dhc-failover-06.txt 
skipping to change at page 1, line 19 skipping to change at page 1, line 19
Bernie Volz Bernie Volz
Steve Gonczi Steve Gonczi
Process Software Process Software
Greg Rabil Greg Rabil
Mike Dooley Mike Dooley
Arun Kapur Arun Kapur
Lucent Technologies Lucent Technologies
October 1999 March 2000
Expires April 2000 Expires September 2000
DHCP Failover Protocol DHCP Failover Protocol
<draft-ietf-dhc-failover-05.txt> <draft-ietf-dhc-failover-06.txt>
Status of this Memo Status of this Memo
This document is an Internet-Draft and is in full conformance with This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026. all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet- other groups may also distribute working documents as Internet-
Drafts. Drafts.
skipping to change at page 1, line 48 skipping to change at page 1, line 48
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
Copyright Notice Copyright Notice
Copyright (C) The Internet Society (1999). All Rights Reserved. Copyright (C) The Internet Society (2000). All Rights Reserved.
Abstract Abstract
DHCP [RFC 2131] allows for multiple servers to be operating on a DHCP [RFC 2131] allows for multiple servers to be operating on a
single network. Some sites are interested in running multiple single network. Some sites are interested in running multiple
servers in such a way so as to provide redundancy in case of server servers in such a way so as to provide redundancy in case of server
failure. In order for this to work reliably, the cooperating primary failure. In order for this to work reliably, the cooperating primary
and secondary servers must maintain a consistent database of the and secondary servers must maintain a consistent database of the
lease information. This implies that servers will need to coordinate lease information. This implies that servers will need to coordinate
any and all lease activity so that this information is synchronized any and all lease activity so that this information is synchronized
in case of failover. in case of failover.
This document defines a protocol to provide this synchronization This document defines a protocol to provide such synchronization
between two servers. One server is designated the "primary" server, between two servers. One server is designated the "primary" server,
the other is the "secondary" server. This document also describes a the other is the "secondary" server. This document also describes a
way to integrate the failover protocol with the DHCP loadbalancing way to integrate the failover protocol with the DHCP loadbalancing
approach. approach.
This document is a significant revision of draft-ietf-dhc-failover- This document is a substantial reorganization as well as a technical
04.txt. and editorial revision of draft-ietf-dhc-failover-05.txt.
Table of Contents Table of Contents
1. Introduction................................................. 4 1. Introduction................................................. 4
2. Terminology.................................................. 5 2. Terminology.................................................. 5
2.1. Requirements terminology................................... 5 2.1. Requirements terminology................................... 5
2.2. DHCP and failover terminology.............................. 5 2.2. DHCP and failover terminology.............................. 5
3. Background and External Requirements......................... 8 3. Background and External Requirements......................... 8
3.1. Key aspects of the DHCP protocol........................... 8 3.1. Key aspects of the DHCP protocol........................... 9
3.2. BOOTP relay agent implementation........................... 10 3.2. BOOTP relay agent implementation........................... 11
3.3. What does it mean if a server can't communicate with its partner? 11 3.3. What does it mean if a server can't communicate with its partner? 12
3.4. Challenging scenarios for a Failover protocol.............. 12 3.4. Challenging scenarios for a Failover protocol.............. 12
3.5. Using TCP to detect partner server failure................. 13 3.5. Using TCP to detect partner server failure................. 13
4. Design Goals................................................. 14 4. Design Goals................................................. 15
4.1. Design requirements for this protocol...................... 14 4.1. Design goals for this protocol............................. 15
4.2. Goals for this protocol.................................... 15 4.2. Limitations of this protocol............................... 16
4.3. Limitations of this Protocol............................... 16 5. Protocol Overview............................................ 17
5. Protocol Overview............................................ 16
5.1. Messages and States........................................ 17 5.1. Messages and States........................................ 17
5.2. Fundamental restrictions................................... 19 5.2. Fundamental guarantees..................................... 19
5.3. Load balancing............................................. 26 5.3. Load balancing............................................. 26
5.4. Operating in NORMAL state.................................. 27 5.4. Operating in NORMAL state.................................. 27
5.5. Operating in COMMUNICATIONS-INTERRUPTED state.............. 27 5.5. Operating in COMMUNICATIONS-INTERRUPTED state.............. 27
5.6. Operating in PARTNER-DOWN state............................ 27 5.6. Operating in PARTNER-DOWN state............................ 27
5.7. Operating in RECOVER state................................. 28 5.7. Operating in RECOVER state................................. 27
5.8. Operating in STARTUP state................................. 28 5.8. Operating in STARTUP state................................. 28
5.9. Time synchronization between servers....................... 28 5.9. Time synchronization between servers....................... 28
5.10. IP address binding-status................................. 29 5.10. IP address binding-status................................. 29
5.11. DNS dynamic update considerations......................... 34 5.11. DNS dynamic update considerations......................... 32
5.12. Reservations and failover................................. 38 5.12. Reservations and failover................................. 36
5.13. Dynamic BOOTP and failover................................ 39 5.13. Dynamic BOOTP and failover................................ 37
5.14. Guidelines for selecting MCLT............................. 39 5.14. Guidelines for selecting MCLT............................. 38
6. Packet Formats............................................... 40 6. Common Message Format........................................ 39
6.1. Common message format...................................... 40 6.1. Message header format...................................... 39
6.2. Common option format....................................... 43 6.2. Common option format....................................... 42
6.3. BNDUPD message format...................................... 55 6.3. Batching multiple binding update transactions in one BNDUPD mes- 42
6.4. BNDACK message format...................................... 58 7. Protocol Messages............................................ 44
6.5. Bulking for BNDUPD and BNDACK messages..................... 59 7.1. BNDUPD message............................................. 44
6.6. UPDREQ message format...................................... 60 7.2. BNDACK message............................................. 54
6.7. UPDREQALL message format................................... 60 7.3. UPDREQ message............................................. 57
6.8. UPDDONE message format..................................... 60 7.4. UPDREQALL message.......................................... 59
6.9. POOLREQ message format..................................... 61 7.5. UPDDONE message............................................ 59
6.10. POOLRESP message format................................... 61 7.6. POOLREQ message............................................ 60
6.11. CONNECT message format.................................... 62 7.7. POOLRESP message........................................... 61
6.12. CONNECTACK message format................................. 62 7.8. CONNECT message............................................ 62
6.13. STATE message format...................................... 63 7.9. CONNECTACK message......................................... 66
6.14. CONTACT message format.................................... 64 7.10. STATE message............................................. 70
6.15. DISCONNECT message format................................. 64 7.11. CONTACT message........................................... 71
7. Protocol Messages............................................ 64 7.12. DISCONNECT message........................................ 72
7.1. BNDUPD message............................................. 64 8. Connection Management........................................ 73
7.2. BNDACK message............................................. 75 8.1. Connection granularity..................................... 73
7.3. UPDREQ message............................................. 76 8.2. Creating the TCP connection................................ 73
7.4. UPDREQALL message.......................................... 78 8.3. Using the TCP connection for determining communications status 75
7.5. UPDDONE message............................................ 79 8.4. Using the TCP connection for binding data.................. 77
7.6. POOLREQ message............................................ 80 8.5. Using the TCP connection for control messages.............. 77
7.7. POOLRESP message........................................... 81 8.6. Losing the TCP connection.................................. 77
7.8. CONNECT message............................................ 81 9. Failover Endpoint States..................................... 78
7.9. CONNECTACK message......................................... 85 9.1. Server Initialization...................................... 78
7.10. STATE message............................................. 88 9.2. Server State Transitions................................... 78
7.11. CONTACT message........................................... 89 9.3. STARTUP state.............................................. 81
7.12. DISCONNECT message........................................ 89 9.4. PARTNER-DOWN state......................................... 83
8. Connection Management........................................ 90 9.5. RECOVER state.............................................. 85
8.1. Connection granularity..................................... 90 9.6. NORMAL state............................................... 88
8.2. Creating the TCP connection................................ 90 9.7. COMMUNICATIONS-INTERRUPTED State........................... 90
8.3. Using the TCP connection for determining communications status 91 9.8. POTENTIAL-CONFLICT state................................... 94
8.4. Using the TCP connection for binding data.................. 93 9.9. RESOLUTION-INTERRUPTED state............................... 95
8.5. Using the TCP connection for control messages.............. 94 9.10. RECOVER-DONE state........................................ 96
8.6. Losing the TCP connection.................................. 94 9.11. PAUSED state.............................................. 97
9. Protocol States.............................................. 94 9.12. SHUTDOWN state............................................ 97
9.1. Server Initialization...................................... 95 10. Safe Period................................................. 98
9.2. Server State Transitions................................... 95 11. Security.................................................... 100
9.3. STARTUP state.............................................. 98 11.1. Simple shared secret...................................... 100
9.4. PARTNER-DOWN state......................................... 100 11.2. TLS....................................................... 101
9.5. RECOVER state.............................................. 102 12. Failover Options............................................ 102
9.6. NORMAL state............................................... 104 12.1. addresses-transferred..................................... 102
9.7. COMMUNICATIONS-INTERRUPTED State........................... 107 12.2. assigned-IP-address....................................... 102
9.8. POTENTIAL-CONFLICT state................................... 110 12.3. binding-status............................................ 103
9.9. RESOLUTION-INTERRUPTED state............................... 111 12.4. client-identifier......................................... 103
9.10. RECOVER-DONE state........................................ 112 12.5. client-hardware-address................................... 103
9.11. PAUSED state.............................................. 113 12.6. client-last-transaction-time.............................. 104
9.12. SHUTDOWN state............................................ 113 12.7. client-reply-options...................................... 104
10. Safe Period................................................. 114 12.8. client-request-options.................................... 105
11. Security.................................................... 116 12.9. DDNS...................................................... 106
11.1. Simple shared secret...................................... 116 12.10. hash-bucket-assignment................................... 107
11.2. TLS....................................................... 117 12.11. lease-expiration-time.................................... 107
12. Acknowledgments............................................. 117 12.12. max-unacked-bndupd....................................... 107
13. References.................................................. 119 12.13. MCLT..................................................... 108
14. Author's information........................................ 120 12.14. message.................................................. 108
15. Full Copyright Statement.................................... 121 12.15. message-digest........................................... 108
12.16. potential-expiration-time................................ 109
12.17. receive-timer............................................ 109
12.18. protocol-version......................................... 109
12.19. reject-reason............................................ 110
12.20. sending-server-IP-address................................ 111
12.21. server-flags............................................. 111
12.22. server-state............................................. 112
12.23. start-time-of-state...................................... 112
12.24. TLS-reply................................................ 113
12.25. TLS-request.............................................. 113
12.26. vendor-class-identifier.................................. 113
12.27. vendor-specific-options.................................. 114
13. IANA Considerations......................................... 114
14. Acknowledgments............................................. 114
15. References.................................................. 116
16. Author's information........................................ 117
17. Full Copyright Statement.................................... 118
1. Introduction 1. Introduction
DHCP [RFC 2131] allows for multiple servers to be operating on a sin- DHCP [RFC 2131] allows for multiple servers to be operating on a sin-
gle network. Some sites are interested in running multiple servers gle network. Some sites are interested in running multiple servers
in such a way so as to provide redundancy in case of server failure in such a way so as to provide redundancy in case of server failure
since the DHCP subsystem is in many cases a critical part of the net- since the DHCP subsystem is in many cases a critical part of the net-
work infrastructure. work infrastructure.
This document defines a protocol to provide synchronization between This document defines a protocol to provide synchronization between
two servers in order that each can take over for the other should two servers in order that each can take over for the other should
either one fail or become unreachable. either one fail or become unreachable.
One server is designated the "primary" server, the other is the One server is designated the "primary" server, the other is the
"secondary" server, and all DHCP client requests are sent to each "secondary" server, and all DHCP client requests are sent to each
server. server.
In order to provide a high availability DHCP service, these In order to provide a high availability DHCP service, these
cooperating primary and secondary servers must maintain a consistent cooperating primary and secondary servers must maintain a consistent
database of lease information. This implies that servers will need database of lease information. This implies that servers will need
to coordinate any and all lease activity so that this information is to coordinate all lease activity so that this information is syn-
synchronized in case failover is required. The protocol messages and chronized in case failover is required. The protocol messages and
processing techniques required to maintain a consistent database are processing techniques required to maintain a consistent database are
specified in the protocol described here. specified in the protocol described here.
The failover protocol also contains an algorithm which allows each The failover protocol also contains a way to integrate the DHCP load-
server to determine to which DHCP clients it should provide service balancing algorithm described in [LOADB] with the failover protocol.
when both servers are operating normally, and this capability can be
used to support load balancing.
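As a rough illustration of how each server might decide which clients to serve while both are operating normally, the sketch below hashes a client identifier into a bucket and checks whether that bucket belongs to the local server. Everything in it is an assumption made for illustration: the names bucket_for and should_serve, the count of 256 buckets, the even split between primary and secondary, and the use of SHA-256 as a stand-in hash. The actual algorithm is the one specified in [LOADB], not this one.

```python
import hashlib

NUM_BUCKETS = 256  # assumed bucket count, for illustration only


def bucket_for(client_id: bytes) -> int:
    """Map a client identifier to a bucket in 0..255.

    SHA-256 is a stand-in here; it is NOT the hash defined in [LOADB].
    """
    return hashlib.sha256(client_id).digest()[0]


def should_serve(client_id: bytes, my_buckets: frozenset) -> bool:
    """Return True if this server is responsible for the client's bucket."""
    return bucket_for(client_id) in my_buckets


# Hypothetical even split: each bucket belongs to exactly one server.
PRIMARY_BUCKETS = frozenset(range(0, 128))
SECONDARY_BUCKETS = frozenset(range(128, NUM_BUCKETS))
```

Because the two bucket sets are disjoint and together cover every bucket, exactly one of the two servers answers any given client while both are running normally.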
2. Terminology 2. Terminology
This section discusses both the generic requirements terminology com- This section discusses both the generic requirements terminology com-
mon to many IETF protocol specifications as well as specialized DHCP mon to many IETF protocol specifications as well as specialized DHCP
and failover protocol specific terminology. and failover protocol specific terminology.
2.1. Requirements terminology 2.1. Requirements terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC 2119]. document are to be interpreted as described in RFC 2119 [RFC 2119].
2.2. DHCP and failover terminology 2.2. DHCP and failover terminology
This document uses the following terms: This document uses the following terms:
o "binding"
A binding is a collection of configuration parameters, includ-
ing at least an IP address, associated with or "bound to" a
DHCP client. Bindings are managed by DHCP servers.
o "binding database"
The collection of bindings managed by a primary and secondary.
o "binding update transaction"
A binding update transaction refers to the set of information
(contained in options) necessary to perform a binding update
for a single IP address. It comprises the
assigned-IP-address option and the binding-status option, along
with other options as appropriate.
o "binding-status"
The binding-status is the status of an IP address with respect
to its association with a client. There are specific binding-
status values defined for use by the failover protocol, e.g.,
ACTIVE, FREE, RELEASED, ABANDONED, etc. These are designed to
map more or less directly onto the binding-status values used
internally in most DHCP server implementations. The term
binding-status refers to the concept also sometimes known as
"lease state" or "IP address state", but in this document the
term "state" is reserved for the failover state of a failover
endpoint, and binding-status is always used to refer to the
state associated with an IP address or lease.
o "DHCP client" or "client" o "DHCP client" or "client"
A DHCP client is an Internet host using DHCP to obtain confi- A DHCP client is an Internet host using DHCP to obtain confi-
guration parameters such as a network address. The term guration parameters such as a network address. The term
"client" used within this document always means a DHCP client, "client" used within this document always means a DHCP client,
and never one of the two failover servers. and never one of the two failover servers.
o "DHCP server" or "server" o "DHCP server" or "server"
A DHCP server is an Internet host that returns configuration A DHCP server is an Internet host that returns configuration
parameters to DHCP clients. parameters to DHCP clients.
o "binding" o "DDNS"
A binding is a collection of configuration parameters, including An abbreviation for "Dynamic DNS", which refers to the capabil-
at least an IP address, associated with or "bound to" a DHCP ity to update a DNS server's name (actually resource record)
client. Bindings are managed by DHCP servers. database using an on-the-wire protocol defined in [RFC 2136].
o "binding database" o "DNS"
The collection of bindings managed by a primary and secondary. An abbreviation for "Domain Name System", a scheme where a cen-
tral name repository is used to map names to IP addresses and IP
addresses to names.
o "failover endpoint" o "failover endpoint"
The failover protocol allows for there to be a unique failover The failover protocol allows for there to be a unique failover
endpoint per partner per role (where role is primary or secon- endpoint per partner per role (where role is primary or secon-
dary). This failover endpoint can take actions and hold unique dary). This failover endpoint can take actions and hold unique
states. There are thus a maximum of two failover endpoints per states. There are thus a maximum of two failover endpoints per
server per partner (one for each partner as a primary and one server per partner (one for each partner as a primary and one
for that same partner as a secondary.) for that same partner as a secondary.)
o "FQDN"
An FQDN is a "fully qualified domain name". A fully qualified
domain name generally is a host name with at least one zone
name; for example, "www.dhcp.org" is a fully qualified domain
name.
o "lazy update" o "lazy update"
Lazy update refers to the requirement placed on a server imple- Lazy update refers to the requirement placed on a server imple-
menting a failover protocol to update its failover partner when- menting a failover protocol to update its failover partner when-
ever the binding database changes. A failover protocol which ever the binding database changes. A failover protocol which
didn't support lazy update would require the failover partner didn't support lazy update would require the failover partner
update to be complete before a DHCP server could respond to a update to be complete before a DHCP server could respond to a
DHCP client request with a DHCPACK. A failover protocol which DHCP client request with a DHCPACK. A failover protocol which
does support lazy update places no such restriction on the does support lazy update places no such restriction on the
update of the failover partner server, and so a server can allo- update of the failover partner server, and so a server can allo-
cate an IP address or extend a lease on an IP address and then cate an IP address or extend a lease on an IP address and then
update its failover partner as time permits. A failover proto- update its failover partner as time permits. A failover proto-
col which supports lazy update not only removes the requirement col which supports lazy update not only removes the requirement
to update the failover partner prior to responding to a DHCP to update the failover partner prior to responding to a DHCP
client with a DHCPACK, but also allows gathering up batches of client with a DHCPACK, but also allows gathering up batches of
updates from one failover server to its partner. updates from one failover server to its partner.
o "subnet address pool"
A subnet address pool is the set of IP addresses associ-
ated with a particular network number and subnet mask. In the
simple case, there is a single network number and subnet mask
and a set of IP addresses. In the more complex case (sometimes
called "secondary subnets", sometimes "superscopes"), several
(apparently unrelated) network number and subnet mask combina-
tions with their associated IP addresses may all be configured
together into one subnet address pool.
o "Primary server" or "Primary"
A DHCP server configured to provide primary service to a set of
DHCP clients for a particular set of subnet address pools.
o "Secondary server" or "Secondary"
A DHCP server configured to act as backup to a primary server
for a particular set of subnet address pools.
o "stable storage"
Every DHCP server is assumed to have some form of what is called
"stable storage". Stable storage is used to hold information
concerning IP address bindings (among other things) so that this
information is not lost in the event of a server failure which
requires restart of the server.
o "MCLT" o "MCLT"
The MCLT refers to maximum client lead time. This time is con- The MCLT refers to maximum client lead time. This time is con-
figured on the primary server and transmitted from the primary figured on the primary server and transmitted from the primary
to the secondary server in the CONNECT message. It is the max- to the secondary server in the CONNECT message. It is the max-
imum amount of time that one server can give to a client for a imum amount of time that one server can extend a lease for a
binding beyond that known and ACKed by the partner server. See client's binding beyond the time known by the partner server.
section 5.2.1 for details. See section 5.2.1 for details.
o "DNS"
An abbreviation for "Domain Name System", a scheme where a cen-
tral name repository is used to map names to IP addresses and IP
addresses to names.
o "FQDN"
An FQDN is a "fully qualified domain name". A fully qualified
domain name generally is a host name with at least one zone
name; for example, "www.dhcp.org" is a fully qualified domain
name.
o "partner" o "partner"
A "partner", for the purposes of this document, refers to a A "partner", for the purposes of this document, refers to a
failover server, typically the other failover server. In many failover server, typically the other failover server. In many
(if not most) cases, the failover protocol is symmetric with (if not most) cases, the failover protocol is symmetric with
respect to the primary or secondary nature of the servers, and respect to the primary or secondary nature of the servers, and
so it is often appropriate to dicuss "updating the partner so it is often appropriate to discuss "updating the partner
server", since it could be a primary server updating a secondary server", since it could be a primary server updating a secondary
server or a secondary server updating a primary server. server or a secondary server updating a primary server.
o "RR" o "Primary server" or "Primary"
A DHCP server configured to provide primary service to a set of
DHCP clients for a particular set of subnet address pools.
o "RR"
"RR" is an abbreviation for "resource record". All records in "RR" is an abbreviation for "resource record". All records in
the DNS are resource records. The resource records of most the DNS are resource records. The resource records of most
relevance to this document are the "A" resource record, which relevance to this document are the "A" resource record, which
maps a DNS name to a particular IP address, the "PTR" resource maps a DNS name to a particular IP address, the "PTR" resource
record, which allows a "reverse map", from the IP address back record, which allows a "reverse map", from the IP address back
to a DNS name, and the "KEY" resource record, which is used in to a DNS name, and the "KEY" resource record, which is used in
ways defined in [DDNS] to tag a DNS name with the identity of ways defined in [DDNS] to tag a DNS name with the identity of
the DHCP client with which it is associated. the DHCP client with which it is associated.
o "DDNS" o "Secondary server" or "Secondary"
An abbreviation for "Dynamic DNS", which refers to the capabil- A DHCP server configured to act as backup to a primary server
ity to update a DNS server's name (actually resource record) for a particular set of subnet address pools.
database using an on-the-wire protocol defined in [RFC2136].
o "binding-status" o "stable storage"
The binding-status is the status of an IP address with respect
to its association with a client. There are specific binding- Every DHCP server is assumed to have some form of what is called
status values defined for use by the failover protocol, e.g., "stable storage". Stable storage is used to hold information
ACTIVE, FREE, RELEASED, ABANDONED, etc. These are designed to concerning IP address bindings (among other things) so that this
map more or less directly onto the binding-status values used information is not lost in the event of a server failure which
internally in most DHCP server implementations. The term requires restart of the server.
binding-status refers to the concept also sometimes known as
"lease state" or "IP address state", but in this document the o "state"
term "state" is reserved for the failover state of a failover
endpoint, and binding-status is always used to refer to the In this document, the term "state" refers exclusively to the
state associated with an IP address or lease. state of a failover endpoint, for example: NORMAL,
COMMUNICATIONS-INTERRUPTED, PARTNER-DOWN. It is not used to
refer to any attributes of an IP address or a binding of an IP
address. See "binding-status".
o "subnet address pool"
A subnet address pool is the set of IP addresses associ-
ated with a particular network number and subnet mask. In the
simple case, there is a single network number and subnet mask
and a set of IP addresses. In the more complex case (sometimes
called "secondary subnets", sometimes "superscopes"), several
(apparently unrelated) network number and subnet mask combina-
tions with their associated IP addresses may all be configured
together into one subnet address pool.
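The "MCLT" definition above can be read as a simple bound on lease expiration times. The following is a minimal sketch of that bound, not the normative procedure (see section 5.2.1 for that). The function name grant_lease and its parameters are hypothetical, and the sketch assumes that for a binding the partner has never been told about, the time known by the partner is taken to be the current time.

```python
from datetime import datetime, timedelta


def grant_lease(now: datetime,
                desired_lease: timedelta,
                partner_known_expiry: datetime,
                mclt: timedelta) -> datetime:
    """Return the latest expiration time this server may give the client.

    The lease may not run more than MCLT beyond the expiration time
    that the partner server already knows about.
    """
    desired_expiry = now + desired_lease
    limit = partner_known_expiry + mclt
    return min(desired_expiry, limit)


# Illustrative values only (assumptions, not taken from the draft).
now = datetime(2000, 3, 1, 12, 0, 0)
mclt = timedelta(hours=1)
desired = timedelta(days=3)

# First allocation: the partner knows nothing yet, so the lease is capped at MCLT.
first = grant_lease(now, desired, partner_known_expiry=now, mclt=mclt)

# Later renewal: the partner has since learned of an expiration three days
# after the first allocation, so the full desired lease now fits in the bound.
renew_time = now + timedelta(minutes=30)
renewed = grant_lease(renew_time, desired,
                      partner_known_expiry=now + timedelta(days=3), mclt=mclt)
```

Under these assumptions a new client first receives a lease no longer than the MCLT, and can receive the full configured lease time only once the partner has been updated.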
3. Background and External Requirements 3. Background and External Requirements
This section highlights key aspects of the DHCP protocol on which the This section highlights key aspects of the DHCP protocol on which the
failover protocol depends. It also discusses the requirements that failover protocol depends. It also discusses the requirements that
the failover protocol places on other aspects of the network infras- the failover protocol places on other aspects of the network infras-
tructure, and some general issues surrounding server failure detec- tructure, and some general issues surrounding server failure
tion. Some failure scenarios that provide particular challenges to a detection. Some failure scenarios that provide particular challenges
failover protocol are discussed. Finally, the challenges inherent in to a failover protocol are discussed. Finally, the challenges
using a TCP connection as a means to detect failure of a partner inherent in using a TCP connection as a means to detect failure of a
server are elaborated. partner server are elaborated.
3.1. Key aspects of the DHCP protocol 3.1. Key aspects of the DHCP protocol
The failover protocol is designed to augment the DHCP protocol as The failover protocol is designed to augment the DHCP protocol as
described in RFC 2131 [RFC 2131]. There are several key aspects of described in RFC 2131 [RFC 2131]. There are several key aspects of
the DHCP protocol which are required by the failover protocol in the DHCP protocol which are required by the failover protocol in
order to successfully meet its design goals. order to successfully meet its design goals.
3.1.1. Broadcast behavior 3.1.1. Broadcast behavior
skipping to change at page 9, line 10 skipping to change at page 9, line 39
DHCP client uses to extend its lease. It is unicast to the DHCP DHCP client uses to extend its lease. It is unicast to the DHCP
server from which it acquired the lease. However, the DHCP protocol server from which it acquired the lease. However, the DHCP protocol
(in a farsighted move), was explicitly designed so that in the event (in a farsighted move), was explicitly designed so that in the event
that a DHCP client cannot contact the server from which it received a that a DHCP client cannot contact the server from which it received a
lease on an IP address using a DHCPREQUEST/RENEW, the client is lease on an IP address using a DHCPREQUEST/RENEW, the client is
required to broadcast its renewal using a DHCPREQUEST/REBINDING to required to broadcast its renewal using a DHCPREQUEST/REBINDING to
any available DHCP server. Since all DHCP clients were required to any available DHCP server. Since all DHCP clients were required to
implement this algorithm, the failover protocol can have a different implement this algorithm, the failover protocol can have a different
server from the one that initially granted a lease be the server to server from the one that initially granted a lease be the server to
renew a lease. Thus, one server can take over for another with no renew a lease. Thus, one server can take over for another with no
interruption in the service as experience by the DHCP client or its interruption in the service as experienced by the DHCP client or its
associated applications software. associated applications software.
3.1.2. Client responsibility 3.1.2. Client responsibility
In the DHCP protocol the DHCP clients are entrusted with a consider- In the DHCP protocol the DHCP clients are entrusted with a consider-
able responsibility. In particular, after they are granted a lease able responsibility. In particular, after they are granted a lease
on an IP address, they are enjoined to only use that IP address while on an IP address, they are enjoined to only use that IP address while
their lease is valid. Every DHCP client is expected to stop using an their lease is valid. Every DHCP client is expected to stop using an
IP address if the expiration time on the lease has passed and if it IP address if the expiration time on the lease has passed and if it
cannot get an extension on the lease for that IP address from some cannot get an extension on the lease for that IP address from some
skipping to change at page 10, line 22 skipping to change at page 10, line 51
At present, the failover protocol does not assume that a client send- At present, the failover protocol does not assume that a client send-
ing in an INIT-REBOOT request necessarily has a valid lease on the IP ing in an INIT-REBOOT request necessarily has a valid lease on the IP
address appearing in the dhcp-requested-address option in the INIT- address appearing in the dhcp-requested-address option in the INIT-
REBOOT request. REBOOT request.
The implications of this are as follows: Assume that there is a DHCP The implications of this are as follows: Assume that there is a DHCP
client that gets a lease from one server while that server is unable client that gets a lease from one server while that server is unable
to communicate with its failover partner. Then, assume that after to communicate with its failover partner. Then, assume that after
that client reboots it is able only to communicate with the other that client reboots it is able only to communicate with the other
failover server. If the failover servers have not been able to com- failover server. If the failover servers have not been able to
municate with each other during this process, then the DHCP client communicate with each other during this process, then the DHCP client
will get a new IP address instead of being able to continue to use will get a new IP address instead of being able to continue to use
its existing IP address. This will affect no applications on the DHCP its existing IP address. This will affect no applications on the DHCP
client, since it is rebooting. However, it will use up an additional client, since it is rebooting. However, it will use up an additional
IP address in this marginal case. IP address in this marginal case.
3.1.3. Stable storage update before DHCPACK 3.1.3. Stable storage update before DHCPACK
The DHCP protocol allocates resources, and in order to operate The DHCP protocol allocates resources, and in order to operate
correctly it requires that a DHCP server update some form of stable correctly it requires that a DHCP server update some form of stable
storage prior to sending a DHCPACK to a DHCP client in order to grant storage prior to sending a DHCPACK to a DHCP client in order to grant
that client a lease on an IP address. that client a lease on an IP address.
One of the goals of the failover protocol is that it not add signifi- One of the goals of the failover protocol is that it not add signifi-
cant additional time to this already time consuming requirement to cant additional time to this already time consuming requirement to
update stable storage prior to a DHCPACK. In particular, adding a update stable storage prior to a DHCPACK. In particular, adding a
requirement to communicate with another server prior to sending a requirement to communicate with another server prior to sending a
DHCPACK would simplify the failover protocol, but it would limit the DHCPACK would greatly simplify the failover protocol, but it would
potential scalability of any DHCP server which employed the failover limit the potential scalability of any DHCP server which employed the
protocol in an unacceptable manner. failover protocol in an unacceptable manner.
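The ordering constraint described above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory Server class (the names stable_storage, pending_updates, and handle_request are illustrative only and do not come from the draft). The binding is committed before the DHCPACK is produced, and the update to the partner is only queued, so no partner round trip sits between the request and the DHCPACK.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    # Hypothetical stand-ins: a dict for stable storage, a list for the
    # queue of updates that will be sent to the partner later.
    stable_storage: dict = field(default_factory=dict)
    pending_updates: list = field(default_factory=list)
    log: list = field(default_factory=list)

    def handle_request(self, client_id, ip, lease_seconds):
        # 1. Commit the binding to stable storage first.
        self.stable_storage[ip] = (client_id, lease_seconds)
        self.log.append("commit")
        # 2. Only after the commit is the DHCPACK produced.
        self.log.append("ack")
        # 3. The partner update is queued, not awaited, so it adds no
        #    delay between the DHCPREQUEST and the DHCPACK.
        self.pending_updates.append((ip, client_id, lease_seconds))
        self.log.append("queue-update")
        return ("DHCPACK", ip, lease_seconds)


server = Server()
reply = server.handle_request("client-1", "10.0.0.5", 3600)
```

A real server would replace the dict with durable storage and drain pending_updates over the failover connection; the sketch shows only the order of the three steps.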
3.2. BOOTP relay agent implementation 3.2. BOOTP relay agent implementation
Many DHCP clients are not resident on the same network segment as a Many DHCP clients are not resident on the same network segment as a
DHCP server. In order to support this form of network architecture, DHCP server. In order to support this form of network architecture,
most contemporary routers implement something known as a BOOTP Relay most contemporary routers implement something known as a BOOTP Relay
Agent. This capability inside of a router listens for all broadcasts Agent. This capability inside of a router listens for all broadcasts
at the DHCP port, port 67, and will relay any broadcasts that it at the DHCP port, port 67, and will relay any broadcasts that it
receives on to a DHCP server. The IP address of the DHCP server must receives on to a DHCP server. The IP address of the DHCP server must
have been previously configured into the router. As part of the have been previously configured into the router. As part of the
skipping to change at page 11, line 21 skipping to change at page 11, line 50
not local to the DHCP server, the BOOTP relay agent on the router not local to the DHCP server, the BOOTP relay agent on the router
closest to the DHCP client must be configured to point at more than closest to the DHCP client must be configured to point at more than
one DHCP server. one DHCP server.
Most BOOTP relay agent implementations allow this duplication of Most BOOTP relay agent implementations allow this duplication of
packets. packets.
If this is not possible, an administrator might be able to configure If this is not possible, an administrator might be able to configure
the relay agent with a subnet broadcast address, but in this case the the relay agent with a subnet broadcast address, but in this case the
primary and secondary DHCP servers in a failover pair must both primary and secondary DHCP servers in a failover pair must both
reside on the same subnet. While this is a realistic configuration, reside on the same subnet.
it is not the one that most people will use.
3.3. What does it mean if a server can't communicate with its partner? 3.3. What does it mean if a server can't communicate with its partner?
In any protocol designed to allow one server to take over some In any protocol designed to allow one server to take over some
responsibilities from a partner server in the event of "failure" of responsibilities from a partner server in the event of "failure" of
that partner server, there is an inherent difficulty in determining that partner server, there is an inherent difficulty in determining
when that partner server has failed. when that partner server has failed.
In fact, it is fundamentally impossible for one server to distinguish In fact, it is fundamentally impossible for one server to distinguish
a network communications failure from the outright failure of the a network communications failure from the outright failure of the
skipping to change at page 12, line 5 skipping to change at page 12, line 34
sider themselves operational, and any server which can't communicate sider themselves operational, and any server which can't communicate
to a majority of other servers must immediately cease operations. to a majority of other servers must immediately cease operations.
While this technique works in some domains, having the only server to While this technique works in some domains, having the only server to
which a DHCP client can communicate voluntarily shut itself down which a DHCP client can communicate voluntarily shut itself down
seems like something worth avoiding. seems like something worth avoiding.
The failover protocol will operate correctly while both servers are The failover protocol will operate correctly while both servers are
unable to communicate, whether they are both running or not. At some unable to communicate, whether they are both running or not. At some
point there may be resource contention, and if one of the servers is point there may be resource contention, and if one of the servers is
actually down, then the operator can inform the other server and the actually down, then the operator can inform the operational server
operational server will be able to use all of the downed server's and the operational server will be able to use all of the failed
resources. server's resources.
The protocol also allows detection of an orderly shutdown of a parti- The protocol also allows detection of an orderly shutdown of a parti-
cipating server. cipating server.
3.4. Challenging scenarios for a Failover protocol 3.4. Challenging scenarios for a Failover protocol
There exist two failure scenarios which provide particular challenges There exist two failure scenarios which provide particular challenges
the correctness guarantees of a failover protocol. to the correctness guarantees of a failover protocol.
3.4.1. Primary Server crash before "lazy" update: 3.4.1. Primary Server crash before "lazy" update:
In the case where the primary server sends a DHCPACK to a client for In the case where the primary server sends a DHCPACK to a client for
a newly allocated IP address and then crashes prior to sending the a newly allocated IP address and then crashes prior to sending the
corresponding update to the secondary server, the secondary server corresponding update to the secondary server, the secondary server
will have no record of the IP address allocation. When the secondary will have no record of the IP address allocation. When the secondary
server takes over, it may well try to allocate that IP address to a server takes over, it may well try to allocate that IP address to a
different client. In the case where the first client to receive the different client. In the case where the first client to receive the
IP address is not on the net at the time (yet while there was still IP address is not on the net at the time (yet while there was still
skipping to change at page 14, line 12 skipping to change at page 14, line 40
Thus, we can ensure that the TCP connection has messages flowing Thus, we can ensure that the TCP connection has messages flowing
periodically across the connection fairly easily. The question periodically across the connection fairly easily. The question
remains as to what TCP will do if the other end of the connection remains as to what TCP will do if the other end of the connection
fails to respond (either because of network partition or because the fails to respond (either because of network partition or because the
receiving server crashes). TCP will attempt to retransmit a message receiving server crashes). TCP will attempt to retransmit a message
with an exponential backoff, and will eventually timeout that with an exponential backoff, and will eventually timeout that
retransmission. However, the length of that timeout cannot, in gen- retransmission. However, the length of that timeout cannot, in gen-
eral, be set on a per-connection basis, and is frequently as long as eral, be set on a per-connection basis, and is frequently as long as
nine minutes, though in some cases it may be as short as two minutes. nine minutes, though in some cases it may be as short as two minutes.
One some systems it can be set system-wide, while on some systems it On some systems it can be set system-wide, while on other systems it
cannot be changed at all. cannot be changed at all.
A value for this timeout that would be appropriate for the failover A value for this timeout that would be appropriate for the failover
protocol, say less than 1 minute, could have unpleasant side-effects protocol, say less than 1 minute, could have unpleasant side-effects
on other applications running on the same server, assuming that it on other applications running on the same server, assuming that it
could be changed at all on the host operating system. could be changed at all on the host operating system.
Nine minutes is a long time for the DHCP service to be unavailable to Nine minutes is a long time for the DHCP service to be unavailable to
any new clients that were being served by the server which has any new clients that were being served by the server which has
crashed, when there is another server running that could respond to crashed, when there is another server running that could respond to
them immediately as soon as it determines that its partner is not them as soon as it determines that its partner is not operational.
operational.
The conclusion drawn from this analysis is that TCP provides very The conclusion drawn from this analysis is that TCP provides very
useful support for the failover protocol in the areas of reliable and useful support for the failover protocol in the areas of reliable and
ordered message delivery, but cannot by itself be relied upon to ordered message delivery, but cannot by itself be relied upon to
detect partner server failure in a fashion acceptable to the needs of detect partner server failure in a fashion acceptable to the needs of
the failover protocol. Additional failover protocol capabilities the failover protocol. Additional failover protocol capabilities
will need to be created to support timely detection of partner server have been created to support timely detection of partner server
failure. See section 8.3 for details on this mechanism. failure. See section 8.3 for details on this mechanism.
4. Design Goals 4. Design Goals
This section lists the design requirements, the design goals, and the This section lists the design goals and the limitations of the
limitations of the failover protocol. failover protocol.
4.1. Design requirements for this protocol 4.1. Design goals for this protocol
The following list of requirements must be (and are) met by this pro- The following goals are met by this protocol. They are
tocol. They are listed in priority order. listed in priority order.
1. Implementations of this protocol must work with existing DHCP 1. Implementations of this protocol must work with existing DHCP
client implementations based on the DHCP protocol [1]. client implementations based on the DHCP protocol [1].
2. Implementations of the protocol must work with existing BOOTP 2. Implementations of the protocol must work with existing BOOTP
relay agent implementations. relay agent implementations.
3. The protocol must provide failover redundancy between servers 3. The protocol must provide failover redundancy between servers
that are not located on the same subnet. that are not located on the same subnet.
4.2. Goals for this protocol 4. Provide for continued service to DHCP clients through an
The following goals are met by this protocol as well, though they are
less important than the requirements listed above. These goals are
listed in priority order.
1. Provide for continued service to DHCP clients through an
automated mechanism in the event of failure of the primary automated mechanism in the event of failure of the primary
server. server.
2. Avoid binding an IP address to a client while that binding is 5. Avoid binding an IP address to a client while that binding is
currently valid for another client. In other words, do not currently valid for another client. In other words, do not
allocate the same IP address to two clients. allocate the same IP address to two clients.
3. Minimize any need for manual administrative intervention. 6. Minimize any need for manual administrative intervention.
4. Introduce no additional delays in server response time as a 7. Introduce no additional delays in server response time as a
result of the network communications required to implement the result of the network communications required to implement the
failover protocol, i.e., don't require communications with the failover protocol, i.e., don't require communications with the
partner between the receipt of a DHCPREQUEST and the partner between the receipt of a DHCPREQUEST and the
corresponding DHCPACK. corresponding DHCPACK.
5. Share IP address ranges between primary and secondary servers; 8. Share IP address ranges between primary and secondary servers;
i.e., impose no requirement that the pool of available i.e., impose no requirement that the pool of available
addresses be divided between servers. addresses be manually or permanently divided between servers.
6. Continue to meet the goals and objectives of this protocol in 9. Continue to meet the goals and objectives of this protocol in
the event of server failure or network partition. the event of server failure or network partition.
7. Provide graceful reintegration of full protocol service after 10. Provide graceful reintegration of full protocol service after
server failure or network partition. server failure or network partition.
8. Allow for one computer to act as a secondary server for multi- 11. Allow for one computer to act as a secondary server for multi-
ple primary servers. Other topologies (e.g.: mesh) are also ple primary servers. The protocol must allow failover primary
possible. Primary and secondary servers SHOULD be viewed as and secondary configuration choices to be made at a granular-
"logical" servers and not necessarily physical computers. ity smaller than "all of the subnets served by a single
server", though individual implementations may choose not to
allow such flexibility.
9. Ensure that an existing client can keep its existing IP 12. Ensure that an existing client can keep its existing IP
address binding if it can communicate with either the primary address binding if it can communicate with either the primary
or secondary DHCP server implementing this protocol - not just or secondary DHCP server implementing this protocol - not just
whichever server that originally offered it the binding. whichever server that originally offered it the binding.
10. Ensure that a new client can get an IP address from some 13. Ensure that a new client can get an IP address from some
server. Ensure that in the face of partition, where servers server. Ensure that in the face of partition, where servers
continue to run but cannot communicate with each other, the continue to run but cannot communicate with each other, the
above goals and requirements may be met. In addition, when the above goals and requirements may be met. In addition, when
partition condition is removed, allow graceful automatic re- the partition condition is removed, allow graceful automatic
integration without requiring human intervention. re-integration without requiring human intervention.
11. If either primary or secondary server loses all of the infor- 14. If either primary or secondary server loses all of the infor-
mation that it has stored in stable storage, it should be able mation that it has stored in stable storage, ensure that it be
to refresh its stable storage from the other server. able to refresh its stable storage from the other server.
12. Support load balancing between the primary and secondary 15. Support load balancing between the primary and secondary
servers, and allow configuration of the percentage of the servers, and allow configuration of the percentage of the
client population served by each with a moderately fine granu- client population served by each with a moderately fine granu-
larity. larity.
4.3. Limitations of this Protocol 4.2. Limitations of this protocol
The following are explicit limitations of this protocol. The following are explicit limitations of this protocol.
1. This protocol provides only one level of redundancy through a 1. This protocol provides only one level of redundancy through a
single secondary server for each primary server. single secondary server for each primary server.
2. A subset of the address pool is reserved for secondary server 2. A subset of the address pool is reserved for secondary server
use. In order to handle the failure case where both servers use. In order to handle the failure case where both servers
are able to communicate with DHCP clients, but unable to com- are able to communicate with DHCP clients, but unable to com-
municate with each other, a subset of the IP address pool must municate with each other, a subset of the IP address pool must
be set aside as a private address pool for the secondary be set aside as a private address pool for the secondary
server. The secondary can use these to service newly arrived server. The secondary can use these to service newly arrived
DHCP clients during such a period. The size of this private DHCP clients during such a period. The required size of this
pool SHOULD be based only on the arrival rate of new DHCP private pool is based only on the arrival rate of new DHCP
clients and the length of expected downtime, and is not influ- clients and the length of expected downtime, and is not influ-
enced in any way by the total number of DHCP clients supported enced in any way by the total number of DHCP clients supported
by the server pair. by the server pair.
The failover protocol can be used in a mode where both the The failover protocol can be used in a mode where both the
primary and secondary servers can share the load between them primary and secondary servers can share the load between them
when both are operating. In this loadbalancing mode, the when both are operating. In this loadbalancing mode, the
addresses allocated by the primary server to the secondary addresses allocated by the primary server to the secondary
server are not unused, but are used instead to service the server are not unused, but are used instead to service the
portion of the client base which to which the secondary server portion of the client base to which the secondary server is
is required to respond. See section 5.3 for more information required to respond. See section 5.3 for more information on
on loadbalancing. load balancing.
3. The primary and secondary servers do not respond to client 3. The primary and secondary servers do not respond to client
requests at all while recovering from a failure that could requests at all while recovering from a failure that could
have resulted in duplicate IP assignments. (When synchroniz- have resulted in duplicate IP assignments. (When synchroniz-
ing in POTENTIAL-CONFLICT state). ing in POTENTIAL-CONFLICT state).
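Limitation 2 above says that the size of the secondary's private pool depends only on the arrival rate of new DHCP clients and the expected downtime. One way to read that is as a simple product, sketched below. The function name and the choice of a plain product rounded up are assumptions made for illustration; the draft does not give a formula here.

```python
import math


def private_pool_size(new_clients_per_hour, expected_downtime_hours):
    """Estimate how many addresses the secondary needs in its private pool.

    Assumption: new clients arrive at a steady rate, so the estimate is
    rate * downtime, rounded up. The total number of clients served by
    the pair does not appear in the calculation at all.
    """
    return math.ceil(new_clients_per_hour * expected_downtime_hours)


small_site = private_pool_size(12, 4)
fractional = private_pool_size(2.5, 3)
```

With 12 new clients per hour and 4 hours of expected downtime the estimate is 48 addresses, whether the server pair supports five hundred clients or fifty thousand.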
5. Protocol Overview 5. Protocol Overview
This section will discuss the failover protocol at a relatively high This section will discuss the failover protocol at a relatively high
level of detail. In the event that a description in this section level of detail. In the event that a description in this section
skipping to change at page 18, line 49 skipping to change at page 19, line 26
There are thus a maximum of two failover endpoints per partner (one There are thus a maximum of two failover endpoints per partner (one
for the partner as a primary and one for that same partner as a for the partner as a primary and one for that same partner as a
secondary.) secondary.)
Thus, in the case where there are two primary servers A and B each Thus, in the case where there are two primary servers A and B each
backed up by a single common secondary server C, there is one fail- backed up by a single common secondary server C, there is one fail-
over endpoint on each of A and B, and two different failover end- over endpoint on each of A and B, and two different failover end-
points on C. The two different failover endpoints on C each have points on C. The two different failover endpoints on C each have
unique states and independent TCP connections. unique states and independent TCP connections.
This document describes the behavior of the protocol in terms of pri- This document frequently describes the behavior of the protocol in
mary and secondary servers, not primary and secondary failover end- terms of primary and secondary servers, not primary and secondary
points. However, it is important to remember that every 'server' failover endpoints. However, it is important to remember that every
described in this document is in reality a failover endpoint that 'server' described in this document is in reality a failover endpoint
resides in a particular process, and that many failover endpoints may that resides in a particular process, and that many failover end-
reside in the same process. points may reside in the same process.
It is not the case that there is a unique failover endpoint for each It is not the case that there is a unique failover endpoint for each
subnet that participates in a failover relationship. On one server, subnet address pool that participates in a failover relationship. On
there is one failover endpoint per partner per role, regardless of one server, there is one failover endpoint per partner per role,
how many subnets or address pools are managed by that combination of regardless of how many subnet address pools are managed by that com-
partner and role. Conversely, on a particular server, any given sub- bination of partner and role. Conversely, on a particular server,
net or pool will be associated with exactly one failover endpoint. any given subnet address pool will be associated with exactly one
failover endpoint.
When a connection is received from the partner, the unique failover When a connection is received from the partner, the unique failover
endpoint to which the message is directed is determined solely by the endpoint to which the message is directed is determined solely by the
IP address of the partner and the setting of the SECONDARY bit in the IP address of the partner and the port to which the connection is
'flags' field of the CONTACT message. directed by the partner. See section 8.2.
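The relationship between partners, roles, subnet address pools, and failover endpoints can be sketched as a small lookup table. This is an illustration only: the class and method names are hypothetical, and the table is keyed on (partner IP address, local role) to reflect "one failover endpoint per partner per role". How an incoming connection is mapped to a role (the SECONDARY bit in one version of the draft, the destination port in the other) is left outside the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class FailoverEndpoint:
    partner_ip: str
    role: str                      # local role: "primary" or "secondary"
    state: str = "NORMAL"          # each endpoint has its own failover state
    pools: list = field(default_factory=list)


class EndpointTable:
    def __init__(self):
        self._by_key = {}          # (partner_ip, role) -> endpoint
        self._by_pool = {}         # subnet address pool -> endpoint

    def add_pool(self, partner_ip, role, pool):
        # A given pool is associated with exactly one failover endpoint.
        if pool in self._by_pool:
            raise ValueError("pool already belongs to an endpoint: " + pool)
        key = (partner_ip, role)
        if key not in self._by_key:
            self._by_key[key] = FailoverEndpoint(partner_ip, role)
        endpoint = self._by_key[key]
        endpoint.pools.append(pool)
        self._by_pool[pool] = endpoint
        return endpoint

    def endpoint_for(self, partner_ip, role):
        return self._by_key[(partner_ip, role)]

    def endpoint_count(self):
        return len(self._by_key)


# Server C acting as secondary for primaries A (192.0.2.1) and B (192.0.2.2).
table_c = EndpointTable()
ep_a1 = table_c.add_pool("192.0.2.1", "secondary", "10.1.0.0/24")
ep_a2 = table_c.add_pool("192.0.2.1", "secondary", "10.2.0.0/24")
ep_b = table_c.add_pool("192.0.2.2", "secondary", "10.3.0.0/24")
```

In this example server C ends up with two endpoints, one per primary, even though it manages three pools; the two pools shared with A map to the same endpoint.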
Throughout this document, the states and actions taken by "servers"
are described. The terms "server", "primary server", and "secondary
server" are commonly used to described the failover endpoint taking
these states and performing these actions. This description is
wholly accurate only for the simplest of cases, where all of the
address pools on one server are backed up by all of the address pools
on another server. In this case, there is a single failover endpoint
in each server. In all other cases, the term "server" is used to
describe one of the two possible failover endpoints per partner.
5.2. Fundamental restrictions 5.2. Fundamental guarantees
There are several fundamental restrictions this protocol places on what There are several fundamental restrictions this protocol places on what
one server can do in the absence of knowledge of the other server, one server can do in the absence of knowledge of the other server.
and these restrictions are key to the correct operation of the proto- Operating within these restrictions allows certain guarantees to be
col. made to the partner server, and these are key to the correct opera-
tion of the protocol.
5.2.1. Control of lease time 5.2.1. Control of lease time
The key problem with lazy update is that when the a server fails The key problem with lazy update is that when a server fails after
after updating a client with a particular lease time and before updating a client with a particular lease time and before updating
updating its partner, the partner will believe that a lease has its partner, the partner will believe that a lease has expired even
expired even though the client still retains a valid lease on that IP though the client still retains a valid lease on that IP address.
address.
In order to handle this problem, a period of time known as the "Max- In order to handle this problem, a period of time known as the "Max-
imum Client Lead Time" (MCLT) is defined and must be known to both imum Client Lead Time" (MCLT) is defined and must be known to both
the primary and secondary servers. Proper use of this time interval the primary and secondary servers. Proper use of this time interval
places an upper bound on the difference allowed between the lease places an upper bound on the difference allowed between the lease
time provided to a DHCP client by a server and the lease time known time provided to a DHCP client by a server and the lease time known
by that server's partner. However, the MCLT is typically much less by that server's partner. However, the MCLT is typically much less
than the lease time that a server has been configured to offer a than the lease time that a server has been configured to offer a
client, and so some strategy must exist to allow a server to offer client, and so some strategy must exist to allow a server to offer
the configured lease time to a client. During a lazy update the the configured lease time to a client. During a lazy update the
updating server typically updates its partner with a potential updating server typically updates its partner with a potential
expiration time which is longer than the lease time previously given expiration time which is longer than the lease time previously given
to the client and which is longer than the lease time that the server to the client and which is longer than the lease time that the server
has been configured to give a client. This allows that server to has been configured to give a client. This allows that server to
give a longer lease time to the client the next time the client give a longer lease time to the client the next time the client
renews its lease, since the time that it will give to the client will renews its lease, since the time that it will give to the client will
not exceed the MCLT beyond the potential expiration time acknowledged not exceed the MCLT beyond the potential expiration time acknowledged
by the partner. by its partner.
The PARTNER-DOWN state exists so that a server can be sure that its The PARTNER-DOWN state exists so that a server can be sure that its
partner is, indeed, down. Correct operation while in that state partner is, indeed, down. Correct operation while in that state
requires (generally) that the server wait the MCLT after anything requires (generally) that the server wait the MCLT after anything
that happened prior to its transition into PARTNER-DOWN state (or, that happened prior to its transition into PARTNER-DOWN state (or,
more accurately, when the other server went down if that is known). more accurately, when the other server went down if that is known).
Thus, the server MUST wait the Maximum Client Lead Time after the Thus, the server MUST wait the MCLT after the partner server went
partner server went down before allocating any of the partner's FREE down before allocating any of the partner's addresses which were
addresses. In the event the partner was not in communication prior available for allocation. In the event the partner was not in com-
to going down, it might have allocated one or more of its FREE munication prior to going down, it might have allocated one or more
addresses to a DHCP client and been unable to inform the server of its FREE addresses to a DHCP client and been unable to inform the
entering PARTNER-DOWN prior to going down itself. By waiting the server entering PARTNER-DOWN prior to going down itself. By waiting
MCLT after the time the partner went down, the server in PARTNER-DOWN the MCLT after the time the partner went down, the server in
state ensures that any clients which have a lease on one of the PARTNER-DOWN state ensures that any clients which have a lease on one
partner's FREE addresses will either time out or contact the server of the partner's FREE addresses will either time out or contact the
in PARTNER-DOWN by the time that period ends. server in PARTNER-DOWN by the time that period ends.
In addition, once a server has transitioned to PARTNER-DOWN state, it In addition, once a server has transitioned to PARTNER-DOWN state, it
MUST NOT reallocate an IP address from one client to another client MUST NOT reallocate an IP address from one client to another client
until an additional MCLT interval after the lease by the original until an additional MCLT interval after the lease by the original
client expires. (Actually, until the maximum client lead time after client expires. (Actually, until the maximum client lead time after
what it believes to be the lease expiration time of the first what it believes to be the lease expiration time of the client.)
client.)
Some optimizations exist for this restriction, in that it only Some optimizations exist for this restriction, in that it only
applies to leases that were issued BEFORE entering PARTNER-DOWN. Once applies to leases that were issued BEFORE entering PARTNER-DOWN. Once
a server has entered PARTNER-DOWN and it leases out an address, it a server has entered PARTNER-DOWN and it leases out an address, it
need not wait this time as long as it has never communicated with the need not wait this time as long as it has never communicated with the
partner since the lease was given out. partner since the lease was given out.
The fundamental relationship on which much of the correctness of this The fundamental relationship on which much of the correctness of this
protocol depends is that the lease expiration time known to a DHCP protocol depends is that the lease expiration time known to a DHCP
client MUST NOT be more than the maximum client lead time greater client MUST NOT be more than the maximum client lead time greater
skipping to change at page 22, line 7 skipping to change at page 22, line 21
The MCLT MAY be configurable on the primary server, but for correct The MCLT MAY be configurable on the primary server, but for correct
server operation it MUST be the same and known to both the primary server operation it MUST be the same and known to both the primary
and secondary servers. The secondary server determines the MCLT from and secondary servers. The secondary server determines the MCLT from
the MCLT option sent from the primary server to the secondary server the MCLT option sent from the primary server to the secondary server
in the CONNECT message. in the CONNECT message.
A server MUST record in its stable storage both the actual lease A server MUST record in its stable storage both the actual lease
interval and the most recently acknowledged potential lease interval interval and the most recently acknowledged potential lease interval
for each IP address binding. It is assumed that the desired client for each IP address binding. It is assumed that the desired client
lease interval can be determined through techniques outside of the lease interval can be determined through techniques outside of the
scope of this protocol. See section 7.1.4 for more details concern- scope of this protocol. See section 7.1.5 for more details concern-
ing the times that the server MUST record in its stable storage and ing the times that the server MUST record in its stable storage and
the way that they interact with the lease time that may be offered to the way that they interact with the lease time that may be offered to
a DHCP client. a DHCP client.
Again, the fundamental relationship among these times which MUST be Again, the fundamental relationship among these times which MUST be
maintained is: maintained is:
actual lease interval < actual lease interval <
( acknowledged potential lease interval + MCLT ) ( acknowledged potential lease interval + MCLT )
Figure 5.1-1 illustrates an initial lease to a client using the rules Figure 5.2.1-1 illustrates an initial lease to a client using the
discussed in the example which follows it. rules discussed in the example which follows it. Note that this is
only one example -- as long as the fundamental relationship is
preserved, the actual times used could be quite different.
DHCP Primary Secondary DHCP Primary Secondary
time Client Server Server time Client Server Server
| (time in intervals) | (absolute time) | | (time in intervals) | (absolute time) |
| | | | | |
| >-DHCPDISCOVER-> | | | >-DHCPDISCOVER-> | |
| <---DHCPOFFER-< | | | <---DHCPOFFER-< | |
| | | | | |
| >-DHCPREQUEST-> | | | >-DHCPREQUEST-> | |
skipping to change at page 23, line 39 skipping to change at page 23, line 39
t1 | <--------DHCPACK-< | | t1 | <--------DHCPACK-< | |
| lease-time=X | | | lease-time=X | |
| | >-BNDUPD--> | | | >-BNDUPD--> |
| | lease-expiration=t1+X | | lease-expiration=t1+X
| | potential-expiration=t1+(X/2)+X | | potential-expiration=t1+(X/2)+X
| | | | | |
| | <-BNDACK-< | | | <-BNDACK-< |
| | potential-expiration=t1+(X/2)+X | | potential-expiration=t1+(X/2)+X
... ... ... ... ... ...
Figure 5.1-1: Lazy Update Message Traffic Figure 5.2.1-1: Lazy Update Message Traffic
X = Desired Lease Interval X = Desired Lease Interval
Assumes renewal interval = lease interval / 2
DISCUSSION: DISCUSSION:
This protocol mandates no algorithm concerning these lease inter- This protocol mandates only that the above fundamental relation-
vals, as long as above fundamental relationship is preserved. ship concerning lease intervals is preserved.
In the interests of clarity, however, let's examine a specific In the interests of clarity, however, let's examine a specific
example. The MCLT in this case is 1 hour. The desired lease example. The MCLT in this case is 1 hour. The desired lease
interval is 3 days, and its renewal time is half the lease inter- interval is 3 days, and its renewal time is half the lease
val. interval.
The rules for this example are: The rules for this example are:
o What to tell the client: o What to tell the client:
Take the remainder of the acknowledged potential lease interval. Take the remainder of the acknowledged potential lease interval.
If this is a new lease, then this value will be zero. If this If this is a new lease, then this value will be zero. If this
remainder plus the MCLT is greater than the desired lease inter- remainder plus the MCLT is greater than the desired lease inter-
val, give the client the desired lease interval else give the val, give the client the desired lease interval else give the
client the remainder plus the MCLT. client the remainder plus the MCLT.
skipping to change at page 24, line 39 skipping to change at page 24, line 40
DHCP client, it determines the desired lease interval (in this DHCP client, it determines the desired lease interval (in this
case, 3 days). It then examines the acknowledged potential lease case, 3 days). It then examines the acknowledged potential lease
interval (which in this case is zero) and determines the remainder interval (which in this case is zero) and determines the remainder
of the time left to run, which is also zero. To this it adds the of the time left to run, which is also zero. To this it adds the
MCLT. Since the actual lease interval cannot be allowed to exceed MCLT. Since the actual lease interval cannot be allowed to exceed
the remainder of the current acknowledged potential lease interval the remainder of the current acknowledged potential lease interval
plus the MCLT, the offer made to the client is for the remainder plus the MCLT, the offer made to the client is for the remainder
of the current acknowledged potential lease interval (i.e., zero) of the current acknowledged potential lease interval (i.e., zero)
plus the MCLT. Thus, the actual lease interval is 1 hour. plus the MCLT. Thus, the actual lease interval is 1 hour.
Once the server has performed the ACK to the DHCP client, it will Once the server has performed the BNDACK to the DHCP client, it
update the secondary server with the lease information. However, will update the secondary server with the lease information. How-
the desired potential lease interval will be composed of the one ever, the desired potential lease interval will be composed of the
half of the current actual lease interval added to the desired one half of the current actual lease interval added to the desired
lease interval. Thus, the secondary server is updated with a lease interval. Thus, the secondary server is updated with a
BNDUPD with a lease interval of 3 days + 1/2 hour specified in the BNDUPD with a lease interval of 3 days + 1/2 hour specified in the
potential-expiration-time option. potential-expiration-time option.
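The arithmetic of this example can be sketched as follows. This is only an illustration of the example's rules, not a mandated algorithm; the function names are hypothetical, and the renewal at half the lease interval is the assumption stated for the figure above.

```python
from datetime import timedelta

MCLT = timedelta(hours=1)           # value used in the example above
DESIRED_LEASE = timedelta(days=3)   # value used in the example above


def client_lease_interval(ack_potential_remaining,
                          desired=DESIRED_LEASE, mclt=MCLT):
    """Lease interval to give the client: the desired interval, unless that
    exceeds the remainder of the acknowledged potential interval plus MCLT."""
    return min(desired, ack_potential_remaining + mclt)


def potential_lease_interval(actual, desired=DESIRED_LEASE):
    """Potential interval sent to the partner in this example: half of the
    current actual lease interval plus the desired lease interval."""
    return actual / 2 + desired


# New lease: nothing has been acknowledged by the partner yet.
first_lease = client_lease_interval(timedelta(0))
first_potential = potential_lease_interval(first_lease)

# Assumed renewal at half of the first lease: what is left of the
# acknowledged potential interval now covers the desired lease.
renewal_lease = client_lease_interval(first_potential - first_lease / 2)
```

Under these assumptions first_lease is 1 hour and first_potential is 3 days + 1/2 hour, matching the text; the renewal then receives the full desired interval.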
When the primary server receives an ACK to its update of the When the primary server receives an ACK to its update of the
secondary server's (partner's) potential lease interval, it secondary server's (partner's) potential lease interval, it
records that as the acknowledged potential lease interval. A records that as the acknowledged potential lease interval. A
server MUST NOT send a BNDACK in response to a BNDUPD message server MUST NOT send a BNDACK in response to a BNDUPD message
until it is sure that the information in the BNDUPD message until it is sure that the information in the BNDUPD message
resides in its stable storage. Thus, the primary server in this resides in its stable storage. Thus, the primary server in this
skipping to change at page 25, line 49 skipping to change at page 25, line 51
When in PARTNER-DOWN state there is a waiting period after which an When in PARTNER-DOWN state there is a waiting period after which an
IP address can be re-allocated to another client. For leases which IP address can be re-allocated to another client. For leases which
are available when the server enters PARTNER-DOWN state, the period are available when the server enters PARTNER-DOWN state, the period
is the MCLT from entry into PARTNER-DOWN state. For IP addresses is the MCLT from entry into PARTNER-DOWN state. For IP addresses
which are not available when the server enters PARTNER-DOWN state, which are not available when the server enters PARTNER-DOWN state,
the period is the MCLT after the lease becomes available. See sec- the period is the MCLT after the lease becomes available. See sec-
tion 9.4.2 for more details. tion 9.4.2 for more details.
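A minimal sketch of this waiting period follows, assuming times are expressed in seconds on a single clock. The function name is hypothetical, and the optimization for leases issued after entry into PARTNER-DOWN state is not modelled.

```python
def earliest_reallocation(entered_partner_down, available_at, mclt):
    """Earliest time an IP address may be given to a different client while
    in PARTNER-DOWN state.

    entered_partner_down: time the server entered PARTNER-DOWN state
    available_at:         time the address became (or becomes) available
    mclt:                 Maximum Client Lead Time, in the same units
    """
    # Addresses already available on entry wait the MCLT from entry;
    # addresses that become available later wait the MCLT from that time.
    return max(entered_partner_down, available_at) + mclt
```

For example, with an MCLT of 3600 seconds and entry into PARTNER-DOWN at time 1000, an address that was already available at time 400 may be reallocated at 4600, while one whose lease runs out at time 5000 may be reallocated at 8600.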
In any other state, a server cannot reallocate an address from one In any other state, a server cannot reallocate an address from one
client to another without first notifying its partner (through a client to another without first notifying its partner (through a
BNDUPD message) and receiving acknowledgement (through a BNDACK mes- BNDUPD message) and receiving acknowledgement (through a BNDACK
sage) that its partner is aware that that first client is not using message) that its partner is aware that that first client is not
the address. using the address.
This could be modeled in the following way. Though this specific This could be modeled in the following way. Though this specific
implementation is in no way required, it may serve to better illus- implementation is in no way required, it may serve to better illus-
trate the concept. trate the concept.
An "available" IP address on a server may be allocated to any client. An "available" IP address on a server may be allocated to any client.
An IP address which was leased to a client and which expired or was An IP address which was leased to a client and which expired or was
released by that client would take on a new state, EXPIRED or released by that client would take on a new state, EXPIRED or
RELEASED respectively. The partner server would then be notified RELEASED respectively. The partner server would then be notified
that this IP address was EXPIRED or RELEASED through a BNDUPD. When that this IP address was EXPIRED or RELEASED through a BNDUPD. When
the sending server received the BNDACK for that IP address showing it the sending server received the BNDACK for that IP address showing it
was FREE, it would move the IP address from EXPIRED or RELEASED to was FREE, it would move the IP address from EXPIRED or RELEASED to
FREE, and it would be available for allocation by the primary server FREE, and it would be available for allocation by the primary server
to any clients. to any clients.
A server MAY reallocate an IP address in the EXPIRED or RELEASED A server MAY reallocate an IP address in the EXPIRED or RELEASED
state to the same client with no restrictions. state to the same client with no restrictions provided it has not
sent a BNDUPD message to its partner. This situation would exist if
the lease expired or was released after the transition into PARTNER-
DOWN state, for instance.
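The model described above might be sketched as follows. As with the model itself, this is in no way required; the function names are hypothetical, only the FREE, EXPIRED and RELEASED values discussed above are modelled, and the restriction on BNDUPD follows the newer wording of the preceding paragraph.

```python
def on_bndack(status):
    """Status after the partner acknowledges the BNDUPD for an address."""
    if status in ("EXPIRED", "RELEASED"):
        return "FREE"
    return status


def may_allocate(status, same_client, bndupd_sent):
    """Whether the address may be allocated to the requesting client.

    same_client: the requester is the client that last held the address
    bndupd_sent: a BNDUPD for the EXPIRED/RELEASED status was already sent
    """
    if status == "FREE":
        return True                 # available to any client
    if status in ("EXPIRED", "RELEASED"):
        # Only the previous holder, and only before the partner was told.
        return same_client and not bndupd_sent
    return False                    # other values are not modelled here
```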
5.3. Load balancing 5.3. Load balancing
In order to implement load balancing between a primary and secondary In order to implement load balancing between a primary and secondary
server pair, each server must respond to DHCPDISCOVER requests from server pair, each server must respond to DHCPDISCOVER requests from
some clients and not from other clients. In order to do this suc- some clients and not from other clients. In order to do this suc-
cessfully, each server must be able to determine immediately upon cessfully, each server must be able to determine immediately upon
receipt of a DHCP client request whether it is to service this receipt of a DHCP client request whether it is to service this
request or to ignore it in order to allow the other server to service request or to ignore it in order to allow the other server to service
the request. the request.
In addition, it should be possible to configure the percentage of In addition, it should be possible to configure the percentage of
clients which will be serviced by either the primary or secondary clients which will be serviced by either the primary or secondary
server. This configuration should be more or less continuous, from server. This configuration should be more or less continuous, from
all serviced by the primary through an even split with half serviced all clients serviced by the primary through an even split with half
by each, to all serviced by the secondary. serviced by each, to all clients serviced by the secondary.
The technique chosen to support these goals is described in [LOADB]. The technique chosen to support these goals is described in [LOADB].
When using the load balancing algorithm in [LOADB] among two servers
implementing the failover protocol, both servers MUST use the same
information from the DHCP client packet as the Request ID for the
load balancing algorithm. Both servers MUST use the dhcp-client-
identifier (if it appears), and the client-hardware-address if the
dhcp-client-identifier does not. The client-hardware-address is con-
structed from the htype and chaddr fields of the DHCP client request
in the same manner as described for creation of the client-hardware-
address option in section 6.2.
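The choice of Request ID could be sketched as below. The byte layout of the client-hardware-address (htype followed by the hardware address) is an assumption made for illustration, since the construction is defined in another section; the function name is hypothetical.

```python
def request_id(client_identifier, htype, chaddr):
    """Bytes that both servers feed to the load balancing hash.

    client_identifier: dhcp-client-identifier option value, or None if absent
    htype, chaddr:     fields of the DHCP client request
    """
    if client_identifier is not None:
        return client_identifier
    # Assumed layout: one htype octet followed by the hardware address.
    return bytes([htype]) + chaddr
```

Because both servers apply the same rule to the same packet, they hash the same bytes and therefore agree on which of them is to service the client.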
A bitmap-style Hash Bucket Assignment (as described in section 5.2 of A bitmap-style Hash Bucket Assignment (as described in [LOADB]) is
[LOADB]) is sent by the primary server to the secondary server when- used to determine which DHCP clients can be processed. There are two
ever a connection is established, using the hash-bucket-assignment potential HBA's in a failover server -- a server HBA and a failover
option defined in section 6.2. This Hash Bucket Assignment is used HBA. The way that a server acquires a server HBA is outside of the
by the secondary server to decide which packets to process when in scope of the failover protocol, but both servers in a failover pair
NORMAL state. MUST have the same server HBA. The failover HBA is sent by the
primary server to the secondary server whenever a connection is esta-
blished, using the hash-bucket-assignment option defined in section
12.10.
The way in which either primary or secondary servers determine the When using the server HBA (if any) and the failover HBA (if any), to
hash bucket assignment for it to use when in other than NORMAL state decide whether to process a DHCP request, the server HBA always
is outside of the scope of this document. Note, however, that the applies in every failover state, and the failover HBA (which MUST be
primary and secondary servers MUST use identical hash bucket assign- a subset of the server HBA) is used by the secondary server to decide
ments when not in NORMAL state. This common hash bucket assignment which packets to process when in NORMAL state.
MAY be for all of the hash buckets, indicating that there is no other
DHCP server sharing the load with this failover pair, or it MAY be
for a subset of the hash buckets, which would indicate that there
exists another server or server pair with which this DHCP server pair
is sharing the load.
5.4. Operating in NORMAL state 5.4. Operating in NORMAL state
When in NORMAL state, each server services DHCPDISCOVER's and all When in NORMAL state, each server services DHCPDISCOVER's and all
other DHCP requests other than DHCPREQUEST/RENEWAL or other DHCP requests other than DHCPREQUEST/RENEWAL or
DHCPREQUEST/REBINDING from the client set defined by the load balanc- DHCPREQUEST/REBINDING from the client set defined by the load balanc-
ing algorithm. Each server services DHCPREQUEST/RENEWAL or ing algorithm. Each server services DHCPREQUEST/RENEWAL or
DHCPREQUEST/REBINDING requests from any client. DHCPREQUEST/REBINDING requests from any client.
In general, whenever the binding database is changed in stable In general, whenever the binding database is changed in stable
skipping to change at page 29, line 6 skipping to change at page 28, line 49
time for the server pair. time for the server pair.
While the algorithm for this refinement of delta time is not speci- While the algorithm for this refinement of delta time is not speci-
fied as part of this protocol, a server SHOULD allow the delta time fied as part of this protocol, a server SHOULD allow the delta time
value for a pair of failover servers to be periodically updated to value for a pair of failover servers to be periodically updated to
account for time drift. In addition, the delta time value between account for time drift. In addition, the delta time value between
servers SHOULD be smoothed in some fashion, so that transient network servers SHOULD be smoothed in some fashion, so that transient network
delays will not cause it to vary wildly. delays will not cause it to vary wildly.
A server SHOULD recognize a drastic change in the delta time value as A server SHOULD recognize a drastic change in the delta time value as
an event to be signaled to a network administrator. an event to be signaled to a network administrator, as well as reset-
ting the time delta between the failover partners.
The specific definitions of a minor or drastic change in delta time
as well as the algorithm used to smooth minor changes into the run-
ning delta time are implementation issues and are not further
addressed in this document.
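Purely as an illustration of one possible implementation choice, the sketch below smooths the delta time with an exponentially weighted moving average and treats a large jump as a drastic change. The class name, the smoothing factor and the threshold are all assumptions; this protocol specifies none of them.

```python
class DeltaTime:
    """Running estimate of the clock difference between failover partners."""

    def __init__(self, initial, alpha=0.125, drastic=60.0):
        self.value = float(initial)   # current smoothed delta, in seconds
        self.alpha = alpha            # weight given to each new sample
        self.drastic = drastic        # jump size treated as drastic

    def update(self, sample):
        """Fold in a new measurement; return True if it was drastic."""
        if abs(sample - self.value) > self.drastic:
            # Drastic change: reset the delta; the caller should also
            # signal a network administrator.
            self.value = float(sample)
            return True
        # Minor change: move a fraction of the way towards the sample.
        self.value += self.alpha * (sample - self.value)
        return False
```

With these assumed parameters a transient network delay shifts the estimate only slightly, while a sustained or very large change replaces it outright.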
5.10. IP address binding-status 5.10. IP address binding-status
In most DHCP servers an IP address can take on several different In most DHCP servers an IP address can take on several different
binding-status values, sometimes also called states. While no two binding-status values, sometimes also called states. While no two
DHCP servers probably have exactly the same possible binding-status DHCP servers probably have exactly the same possible binding-status
values the DHCP RFC enforces some commonality among the general values, the DHCP RFC enforces some commonality among the general
semantics of the binding-status values used by various DHCP server semantics of the binding-status values used by various DHCP server
implementations. implementations.
In order to transmit binding database updates between one server and In order to transmit binding database updates between one server and
another using the failover protocol, some common denominator another using the failover protocol, some common denominator
binding-status values must be defined. It is not expected that these binding-status values must be defined. It is not expected that these
binding-status-values correspond with any actual implementation of binding-status-values correspond with any actual implementation of
the DHCP protocol in a DHCP server, but rather that the binding- the DHCP protocol in a DHCP server, but rather that the binding-
status values defined in this document should be a superset of most status values defined in this document should be a common denominator
if not all DHCP server implementations. It is a goal of this proto- of those in use by many DHCP server implementations. It is a goal of
col that any DHCP server can map the various IP address binding- this protocol that any DHCP server can map the various IP address
status values that it uses internally into these failover IP address binding-status values that it uses internally into these failover IP
binding-status values on transmission of binding database updates to address binding-status values on transmission of binding database
its partner, and likewise that it can map any failover IP address updates to its partner, and likewise that it can map any failover IP
binding-status values into its internal IP address binding-status address binding-status values it received in a binding update into
values upon receipt of a binding database update. its internal IP address binding-status values.
The IP address binding-status values defined for the failover proto- The IP address binding-status values defined for the failover proto-
col are: col are:
o FREE o FREE
Lease may be allocated to any DHCP client. IP address may be allocated by the primary to any DHCP client.
When the MCLT has passed after its time of entry into PARTNER-
DOWN state, the IP address may be allocated by the secondary to
any DHCP client.
o ACTIVE o ACTIVE
Lease is assigned to a client. It MUST have client information Lease is assigned to a client. A client MUST appear.
associated with it.
o EXPIRED
Lease has expired. It may be allocated to the same client.
o RELEASED
Lease has been released by client. It may be allocated to the o EXPIRED -- indicates that a client's binding on an IP address
same client. has expired. When the partner server ACK's the BNDUPD of an
EXPIRED IP address, the server sets its internal state to FREE.
It is then available for allocation to any client of the primary
server. It may be allocated to the same client if a BNDUPD has
not yet been sent to the partner. A client SHOULD appear.
o ABANDONED o RELEASED -- indicates that a DHCP client sent in a DHCPRELEASE
A server, or client flagged address as unusable. message. When the partner server ACK's the BNDUPD of a
RELEASED IP address, the server sets its internal state to FREE,
and it is available for allocation by the primary server to any
DHCP client. It may be allocated to the same client if a BNDUPD
has not yet been sent to the partner. A client SHOULD appear.
o RESET o FREE -- is used when a DHCP server needs to communicate that an
IP address is unused by any DHCP client, but it was not just
released, expired, or reset by a network administrator. When
the partner server ACK's the BNDUPD of a FREE IP address, the
server sets its internal state such that it is available for
allocation by the primary DHCP server to any DHCP client. (Note
that in PARTNER-DOWN state, after waiting the MCLT, the IP
address MAY be allocated to a DHCP client by the secondary
server.) A client MAY appear.
Lease was freed by some external agent. o ABANDONED -- indicates that an IP address is considered unusable
by the DHCP subsystem. An IP address for which a valid PING
response was received SHOULD be set to ABANDONED. An IP address
for which a DHCPDECLINE was received should be set to ABANDONED.
A client MUST NOT appear.
o BACKUP o RESET -- indicates that this IP address was made available by
operator command. A client MAY appear.
Lease belongs to secondary's private address pool. o BACKUP -- indicates that this IP address can be allocated by the
secondary server to a DHCP client at any time. When the MCLT has
passed after its time of entry into PARTNER-DOWN state, the IP
address may be allocated by the primary to any DHCP client. A
client MAY appear.
These binding-status values are communicated from one failover These binding-status values are communicated from one failover
partner to another using the binding-status option, see section 6.2 partner to another using the binding-status option, see section 12.3
for details of this option. Unless otherwise noted above there MAY for details of this option. Unless otherwise noted above there MAY
be client information associated with each of these binding-status be client information associated with each of these binding-status
values. values.
Again, note that a DHCP server implementing the failover protocol An IP address will move between these binding-status values using the
does not have to implement either this state machine or use these following state transition diagram:
particular binding-status values in its normal operation of allocat-
ing IP addresses to DHCP clients. It only needs to map its internal
binding-status-values onto these "standard" binding-status values,
and map these "standard" binding-status values back into its internal
binding-status values. In particular, a server which implements a
grace period for a IP address binding SHOULD simply wait to update
its partner server until the grace period on that binding has run
out.
The process of setting an IP address to FREE deserves some detailed
discussion. When an IP address is moved to the EXPIRED, RELEASED, or
RESET binding-status on a server, it will send a BNDUPD with the
binding-status of EXPIRED, RELEASED, or RESET to its partner. If its
partner agrees that is acceptable (see sections 7.1.2 and 7.13 con-
cerning why a server might not accept a BNDUPD) it will return a
BNDACK with no reject-reason, signifying that it accepted the update.
As part of the BNDUPD processing, the server returning the BNDACK
will set the binding-status of the IP address to FREE, and upon
receipt of the BNDACK the server which sent the BNDUPD will set the
binding-status of the IP address to FREE. Thus, the EXPIRED,
RELEASED, or RESET binding-status is something of a transitory state.
This process is encoded in the transition diagram below by "Comm
w/Partner".
An IP address will move between these lease binding-status values
using the following state transition diagram:
DHCP client DECLINE or DHCP client DECLINE or
server detected problem server detected problem
from any state from any state
+----------+ V +---------+ +----------+ V +---------+
External >---->| RESET | | |ABANDONED| External >---->| RESET | | |ABANDONED|
command | | +-->| | command | | +-->| |
+----------+ +---------+ +----------+ +---------+
| |
Comm w/Partner Comm w/Partner(1)
V V
+---------+ Comm +----------+ Comm +---------+ +---------+ Comm(1) +----------+ Comm(1) +---------+
| EXPIRED |--------->| FREE |<----------| RELEASED| | EXPIRED |--------->| FREE |<----------| RELEASED|
| | w/Partner | | w/Partner | | | | w/Partner | | w/Partner | |
+---------+ +----------+ +---------+ +---------+ +----------+ +---------+
^ ^ | | ^ ^ ^ | | ^
| Exp. grace IP address IP addr alloc. | | Exp. grace IP address IP addr alloc. |
| period ends leased by to secondary | | period ends leased by to secondary(2) |
| | primary V | | | primary V |
| | | +----------+ | | | | +----------+ |
| | | | BACKUP | | | | | | BACKUP | |
| wait for | | | | | wait for | | | |
| grace period | +----------+ | | grace period | +----------+ |
| | | | | | | | | |
| | | IP addr leased by | | | | IP addr leased by |
| Expired grace | secondary | | Expired grace | secondary |
| period exists V V | | period exists V V |
| | +----------+ | | | +----------+ |
| | Lease on | ACTIVE | DHCPRELEASE | | | Lease on | ACTIVE | DHCPRELEASE |
+-----+-IP addr---| |------------------+ +-----+-IP addr---| |------------------+
expires +----------+ expires +----------+
Figure 5.10-1: Transitions between binding-status values. Figure 5.10-1: Transitions between binding-status values.
If a server receives a binding-status that it doesn't implement (1) This transition MAY also occur if the server is in
internally, it should do something reasonable. A server which doesn't PARTNER-DOWN state and the MCLT has passed since the entry
support an ABANDONED binding-status could set the IP address ACTIVE in the RELEASED, EXPIRED, or RESET states.
and belonging to a client which will never be seen in a DHCP request.
5.10.1. IP address binding-status changes from BNDUPD messages
IP addresses undergo binding status changes for several reasons,
including receipt and processing of DHCP client requests, administra-
tive inputs and receipt of BNDUPD messages. Every DHCP server needs
to respond to DHCP client request and administrative inputs with
changes to its internal record of the binding-status of an IP
address, and this response is not in the scope of the failover proto-
col. However, the receipt of BNDUPD messages implies at least a pos-
sible change of the binding-status for an IP address, and must be
discussed here. See section 7.1.2 for general actions to take upon
receipt of a BNDUPD message.
When receiving a BNDUPD message, it is important to note that it may
not be current, in that the server receiving the BNDUPD message may
have had a more recent interaction with the DHCP client than its
partner who sent the BNDUPD message. In this case, the receiving
server MUST reject the BNDUPD message. In addition, it is worth not-
ing that two (and possibly three) binding-status values are the
direct result of interaction with a DHCP client, ACTIVE and RELEASED
(and possibly ABANDONED). All other binding-status values are either
the result of the expiration of a time period or interaction with an
external agency (e.g., a network administrator).
Every BNDUPD message SHOULD contain a client-last-transaction-time
option, which MUST, if it appears, be the time that the server last
interacted with the DHCP client. It MUST NOT be, for instance, the
time that the lease on an IP address expired. If there has been no
interaction with the DHCP client in question (or there is no DHCP
client presently associated with this IP address), then there will be
no client-last-transaction-time option in the BNDUPD message.
The following list is indexed by the binding-status that a server
receives in a BNDUPD message. In many cases, the binding-status of
an IP address within the receiving server's data storage will have an
effect upon the checks performed prior to accepting the new binding-
status in a BNDUPD message.
In the following list, to "accept" a BNDUPD means to update the
server's bindings database with the information contained in the
BNDUPD and once that update is complete, send a BNDACK message
corresponding to the BNDUPD message. To "reject" a BNDUPD means to
respond to the BNDUPD with a BNDACK with a reject-reason option
included.
When interpreting the rules in the following list, if a BNDUPD
doesn't have a client-last-transaction-time value, then it MUST NOT
be considered later than the client-last-transaction-time in the
receiving server's binding. If the BNDUPD contains a client-last-
transaction-time value and the receiving server's binding does not,
then the client-last-transaction-time value in the BNDUPD MUST be
considered later than the server's.
The second rule concerns clients and IP addresses. If the client in
a BNDUPD message and the client in a receiving server's binding both
exist and if they differ, then if the receiving server's binding-
status is ACTIVE and the binding-status in the BNDUPD is ACTIVE, then
if the receiving server is a secondary server accept it, else reject
it.
Otherwise, look up the binding-status in the BNDUPD in this list:
o ACTIVE in BNDUPD
If the receiving server's binding-status is ACTIVE, FREE, or
BACKUP, then accept it.
If the receiving server's binding-status is ABANDONED or RESET,
then reject it.
If the receiving server's binding-status is RELEASED or EXPIRED,
then if the client-last-transaction-time in the BNDUPD is later
than the client-last-transaction-time in the receiving server's
binding, accept it, else reject it.
o EXPIRED in BNDUPD
If the receiving server's binding-status is ACTIVE, then if the current
time is later than the receiving server's lease-expiration-time,
accept it, else reject it.
If the receiving server's binding-status is ABANDONED or RESET,
reject it.
If the receiving server's binding-status is FREE or BACKUP,
accept it.
If the receiving server's binding-status is RELEASED, then if
the client-last-transaction-time is greater in the BNDUPD than
in the receiving server's binding, then accept it, else reject
it.
o RELEASED in BNDUPD
If the receiving server's binding-status is ACTIVE, then if the
client-last-transaction-time is greater than the client-last-
transaction-time in the receiving server's binding, accept it,
else reject it.
If the receiving server's binding-status is RELEASED, FREE or
BACKUP, accept it.
If the receiving server's binding-status is ABANDONED or RESET,
reject it.
o FREE or BACKUP in BNDUPD
If the receiving server's binding-status is ACTIVE and the
current time is later than the lease-expiration-time accept it,
else reject it.
If the receiving server's binding-status is ABANDONED, reject
it.
If the receiving server's binding-status is FREE or BACKUP or (2) This transition MAY occur if the server is the secondary
RESET, accept it. and the MCLT has passed since its entry into PARTNER-DOWN state.
o RESET or ABANDONED in BNDUPD Again, note that a DHCP server implementing the failover protocol
does not have to implement either this state machine or use these
particular binding-status values in its normal operation of
allocating IP addresses to DHCP clients. It only needs to map its
internal binding-status-values onto these "standard" binding-status
values, and map these "standard" binding-status values back into its
internal binding-status values. For example, a server which
implements a grace period for an IP address binding SHOULD simply wait
to update its partner server until the grace period on that binding
has run out.
Accept the new binding-status under all circumstances. The process of setting an IP address to FREE deserves some detailed
discussion. When an IP address is moved to the EXPIRED, RELEASED, or
RESET binding-status on a server, it will send a BNDUPD with the
binding-status of EXPIRED, RELEASED, or RESET to its partner. If its
partner agrees that is acceptable (see sections 7.1.2 and 7.1.3 con-
cerning why a server might not accept a BNDUPD) it will return a
BNDACK with no reject-reason, signifying that it accepted the update.
As part of the BNDUPD processing, the server returning the BNDACK
will set the binding-status of the IP address to FREE, and upon
receipt of the BNDACK the server which sent the BNDUPD will set the
binding-status of the IP address to FREE. Thus, the EXPIRED,
RELEASED, or RESET binding-status is something of a transitory state.
This process is encoded in the transition diagram above by "Comm
w/Partner".
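The BNDUPD acceptance rules listed in the -05 column above can be sketched as a single decision function. This is a minimal illustration and not part of either draft: the `Binding` record, its field names, and the function names are hypothetical. It reads the client-mismatch rule as applying only when both binding-status values are ACTIVE, and it returns None for combinations that the list does not cover.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Binding:
    """Hypothetical record of one IP address binding (names are assumptions)."""
    status: str                          # e.g. "ACTIVE", "FREE", "BACKUP", ...
    client: Optional[str] = None         # client identity, if any
    clt: Optional[int] = None            # client-last-transaction-time
    lease_expiry: Optional[int] = None   # lease-expiration-time


def clt_is_later(upd: Binding, local: Binding) -> bool:
    """True if the BNDUPD's client-last-transaction-time counts as later."""
    if upd.clt is None:
        return False   # a missing value MUST NOT be considered later
    if local.clt is None:
        return True    # present in the BNDUPD only: MUST be considered later
    return upd.clt > local.clt


def accept_bndupd(upd: Binding, local: Binding, now: int,
                  is_secondary: bool) -> Optional[bool]:
    """Return True to accept, False to reject, None if the list is silent."""
    clients_differ = (upd.client is not None and local.client is not None
                      and upd.client != local.client)
    if clients_differ and upd.status == "ACTIVE" and local.status == "ACTIVE":
        return is_secondary   # only the secondary accepts a conflicting ACTIVE

    lease_over = local.lease_expiry is not None and now > local.lease_expiry
    later = clt_is_later(upd, local)

    if upd.status in ("RESET", "ABANDONED"):
        return True           # accepted under all circumstances
    if upd.status == "ACTIVE":
        if local.status in ("ACTIVE", "FREE", "BACKUP"):
            return True
        if local.status in ("ABANDONED", "RESET"):
            return False
        if local.status in ("RELEASED", "EXPIRED"):
            return later
    elif upd.status == "EXPIRED":
        if local.status == "ACTIVE":
            return lease_over
        if local.status in ("ABANDONED", "RESET"):
            return False
        if local.status in ("FREE", "BACKUP"):
            return True
        if local.status == "RELEASED":
            return later
    elif upd.status == "RELEASED":
        if local.status == "ACTIVE":
            return later
        if local.status in ("RELEASED", "FREE", "BACKUP"):
            return True
        if local.status in ("ABANDONED", "RESET"):
            return False
    elif upd.status in ("FREE", "BACKUP"):
        if local.status == "ACTIVE":
            return lease_over
        if local.status == "ABANDONED":
            return False
        if local.status in ("FREE", "BACKUP", "RESET"):
            return True
    return None               # combination not covered by the listed rules
```

A caller would send a BNDACK without a reject-reason option when the result is True and a BNDACK with a reject-reason option when it is False. How to treat a None result is left open by the list itself.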
5.11. DNS dynamic update considerations 5.11. DNS dynamic update considerations
DHCP servers (and clients) can use DNS Dynamic Updates as described DHCP servers (and clients) can use DNS Dynamic Updates as described
in [RFC2136] to maintain DNS name-mappings as they maintain DHCP in [RFC2136] to maintain DNS name-mappings as they maintain DHCP
leases. Many different administrative models for DHCP-DNS integra- leases. Many different administrative models for DHCP-DNS integra-
tion are possible. Descriptions of several of these models, and tion are possible. Descriptions of several of these models, and
guidelines that DHCP servers and clients should follow in carrying guidelines that DHCP servers and clients should follow in carrying
them out, are laid out in [DDNS]. The nature of the DHCP failover them out, are laid out in [DDNS]. The nature of the DHCP failover
protocol introduces some issues concerning dynamic DNS updates that protocol introduces some issues concerning dynamic DNS updates that
skipping to change at page 35, line 45 skipping to change at page 33, line 51
In order for either server to be able to complete a DDNS update, or In order for either server to be able to complete a DDNS update, or
to remove DNS records which were added by its partner, both servers to remove DNS records which were added by its partner, both servers
need to know the FQDN associated with the lease-client binding. The need to know the FQDN associated with the lease-client binding. The
FQDN associated with the client's A RR and PTR RR SHOULD be communi- FQDN associated with the client's A RR and PTR RR SHOULD be communi-
cated from the server which adds records into the DNS to its partner. cated from the server which adds records into the DNS to its partner.
The initiating server SHOULD use the DDNS option in the BNDUPD mes- The initiating server SHOULD use the DDNS option in the BNDUPD mes-
sages to inform the partner server of the status of any DDNS updates sages to inform the partner server of the status of any DDNS updates
associated with a lease binding. Failover servers MAY choose not to associated with a lease binding. Failover servers MAY choose not to
include the DDNS option in BNDUPD messages if there has been no include the DDNS option in BNDUPD messages if there has been no
change in the status of any DDNS update related to the lease binding. change in the status of any DDNS update related to the lease binding.
The partner server receiving BNDUPD messages containing the ddn The partner server receiving BNDUPD messages containing the DDNS
option SHOULD compare the status flags and the FQDN contained in the option SHOULD compare the status flags and the FQDN contained in the
option data with the current DDNS information it has associated with option data with the current DDNS information it has associated with
the lease binding, and update its notion of the DDNS status accord- the lease binding, and update its notion of the DDNS status accord-
ingly. ingly.
The initiating server MAY send a BNDUPD to its partner before the The initiating server MAY send a BNDUPD to its partner before the
DDNS update has been successfully completed. If it does so, it SHOULD DDNS update has been successfully completed. If it does so, it SHOULD
leave the 'C' bit in the Flags field clear, to indicate to the leave the 'C' bit in the Flags field clear, to indicate to the
partner that the DDNS update may not be complete. When the DDNS partner that the DDNS update may not be complete. When the DDNS
update has been successfully acknowledged by the DNS server, the ini- update has been successfully acknowledged by the DNS server, the ini-
skipping to change at page 36, line 37 skipping to change at page 34, line 42
the FQDN, or it may supply the entire FQDN. The server may be config- the FQDN, or it may supply the entire FQDN. The server may be config-
ured to attempt to use the information the client supplies, it may be ured to attempt to use the information the client supplies, it may be
configured with an FQDN to use for the client, or it may be config- configured with an FQDN to use for the client, or it may be config-
ured to synthesize an FQDN. The responsive server SHOULD include the ured to synthesize an FQDN. The responsive server SHOULD include the
FQDN that it will be using in DDNS updates it initiates when it sends FQDN that it will be using in DDNS updates it initiates when it sends
the DDNS option. the DDNS option.
Since the responsive server may not have completed the DDNS update at Since the responsive server may not have completed the DDNS update at
the time it sends the first BNDUPD about the lease binding, there may the time it sends the first BNDUPD about the lease binding, there may
be cases where the FQDN in later BNDUPD messages does not match the be cases where the FQDN in later BNDUPD messages does not match the
FQDN included in earlier messages. For example, the responsive server FQDN included in earlier messages. For example, the responsive
may be configured to handle situations where two or more DHCP client server may be configured to handle situations where two or more DHCP
FQDNs are identical by modifying the most-specific label in the FQDNs client FQDNs are identical by modifying the most-specific label in
of some of the clients in an attempt to generate unique FQDNs for the FQDNs of some of the clients in an attempt to generate unique
them. Alternatively, at sites which use some or all of the informa- FQDNs for them (a process sometimes called "disambiguation"). Alter-
tion which clients supply to form the FQDN, it's possible that a natively, at sites which use some or all of the information which
client's configuration may be changed so that it begins to supply new clients supply to form the FQDN, it's possible that a client's confi-
data. The responsive server may react by removing the DNS records guration may be changed so that it begins to supply new data. The
which it originally added for the client, and replacing them with responsive server may react by removing the DNS records which it ori-
records that refer to the client's new FQDN. In such cases, the ginally added for the client, and replacing them with records that
responsive server SHOULD include the actual FQDN that was used in refer to the client's new FQDN. In such cases, the responsive server
subsequent DDNS options. The responsive server SHOULD include SHOULD include the actual FQDN that was used in subsequent DDNS
relevant client-option data in the client-request-options option in options. The responsive server SHOULD include relevant client-option
its BNDUPD messages. This information may be necessary in order to data in the client-request-options option in its BNDUPD messages.
allow the non-responsive partner to detect client configuration This information may be necessary in order to allow the non-
changes that change the hostname or FQDN data which the client responsive partner to detect client configuration changes that change
includes in its DHCP requests. the hostname or FQDN data which the client includes in its DHCP
requests.
5.11.3. Adding RRs to the DNS 5.11.3. Adding RRs to the DNS
A failover server which is going to perform DDNS updates SHOULD ini- A failover server which is going to perform DDNS updates SHOULD ini-
tiate the DDNS update when it grants a new lease to a client. The tiate the DDNS update when it grants a new lease to a client. The
non-responsive partner SHOULD NOT initiate a DDNS update when it non-responsive partner SHOULD NOT initiate a DDNS update when it
receives the BNDUPD after the lease has been granted. The failover receives the BNDUPD after the lease has been granted. The failover
protocol ensures that only one of the partners will grant a lease to protocol ensures that only one of the partners will grant a lease to
any individual client, so it follows that this requirement will any individual client, so it follows that this requirement will
prevent both partners from initiating updates simultaneously. The prevent both partners from initiating updates simultaneously. The
skipping to change at page 37, line 44 skipping to change at page 35, line 50
partner is no longer attempting to perform an update for the partner is no longer attempting to perform an update for the
existing client. If the remaining server has not recorded that existing client. If the remaining server has not recorded that
an update for the binding has been successfully completed, the an update for the binding has been successfully completed, the
server MAY initiate a DDNS update. It MAY initiate this server MAY initiate a DDNS update. It MAY initiate this
update immediately upon entry to PARTNER-DOWN state, it may update immediately upon entry to PARTNER-DOWN state, it may
perform this in the background, or it MAY initiate this update perform this in the background, or it MAY initiate this update
upon next hearing from the DHCP client. upon next hearing from the DHCP client.
5.11.4. Deleting RRs from the DNS 5.11.4. Deleting RRs from the DNS
The failover server which makes a lease FREE SHOULD initiate any DDNS The failover server which makes an IP address FREE SHOULD initiate
deletes, if it has recorded that DNS records were added on behalf of any DDNS deletes, if it has recorded that DNS records were added on
the client. behalf of the client.
A server "makes a lease FREE" when it initiates a BNDUPD with a A server not in PARTNER-DOWN state "makes an IP address FREE" when it
binding-status of FREE, EXPIRED, or RELEASED. Its partner confirms initiates a BNDUPD with a binding-status of FREE, EXPIRED, or
this status by acking that BNDUPD, and upon receipt of the ACK the RELEASED. Its partner confirms this status by acking that BNDUPD,
server has "made the address FREE". It is at this point that it and upon receipt of the ACK the server has "made the IP address
should initiate the DDNS operations to delete RRs from the DDNS. Its FREE". Conversely, a server in PARTNER-DOWN state "makes an IP
partner SHOULD NOT initiate DDNS deletes for DNS records related to address FREE" when it sets the binding-status to FREE, since in
the lease binding as part of sending the BNDACK message. The PARTNER-DOWN state no communication is required with the partner.
partner MAY have issued BNDUPD messages with a binding-status of
FREE, EXPIRED, or RELEASED previously, but the other server will have It is at this point that it should initiate the DDNS operations to
NAKed these BNDUPD messages. delete RRs from the DDNS. Its partner SHOULD NOT initiate DDNS
deletes for DNS records related to the lease binding as part of send-
ing the BNDACK message. The partner MAY have issued BNDUPD messages
with a binding-status of FREE, EXPIRED, or RELEASED previously, but
the other server will have NAKed these BNDUPD messages.
The failover protocol ensures that only one of the two partner The failover protocol ensures that only one of the two partner
servers will be able to make a lease FREE. The server making the servers will be able to make a lease FREE. The server making the
lease FREE may be doing so while it is in NORMAL communication with lease FREE may be doing so while it is in NORMAL communication with
its partner, or it may be in PARTNER-DOWN state. If a server is in its partner, or it may be in PARTNER-DOWN state. If a server is in
PARTNER-DOWN state, it may be performing DDNS deletes for RRs which PARTNER-DOWN state, it may be performing DDNS deletes for RRs which
its partner added originally. This allows a single remaining partner its partner added originally. This allows a single remaining partner
server to assume responsibility for all of the DDNS activity which server to assume responsibility for all of the DDNS activity which
the two servers were undertaking. the two servers were undertaking.
skipping to change at page 39, line 9 skipping to change at page 37, line 19
tocol. tocol.
The general solution with regards to reservations is as follows. The general solution with regards to reservations is as follows.
Whenever a reserved IP address becomes FREE (i.e., when first config- Whenever a reserved IP address becomes FREE (i.e., when first config-
ured or whenever a client frees it or it expires or is reset), the ured or whenever a client frees it or it expires or is reset), the
primary server MUST show that IP address as FREE (and thus available primary server MUST show that IP address as FREE (and thus available
for its own allocation) and it MUST send it to the secondary server for its own allocation) and it MUST send it to the secondary server
as BACKUP, in order that the secondary server be able to allocate it as BACKUP, in order that the secondary server be able to allocate it
as well. as well.
When the reservation on an IP address is cancelled, if the IP address
is currently FREE and the server is the primary, or BACKUP and the
server is the secondary, the server MUST send a BNDUPD to the other
server with the binding-status FREE.
5.13. Dynamic BOOTP and failover 5.13. Dynamic BOOTP and failover
Some DHCP servers support a capability to offer IP addresses to BOOTP Some DHCP servers support a capability to offer IP addresses to BOOTP
clients without having a particular address previously allocated for clients without having a particular address previously allocated for
those clients. This capability is often called something like those clients. This capability is often called something like
"dynamic BOOTP". It is not a capability explicitly discussed in "dynamic BOOTP". It is discussed briefly in RFC 1534 [RFC 1534].
either the DHCP or BOOTP RFC's, but rather a pragmatic capability
which can work reasonably well for a small set of legacy BOOTP dev-
ices.
This capability has a negative interaction with the fundamental ele- This capability has a negative interaction with the fundamental ele-
ments of the failover protocol, in that an address handed out to a ments of the failover protocol, in that an address handed out to a
BOOTP device has no term (or effectively no term, in that usually BOOTP device has no term (or effectively no term, in that usually
they are considered leases for "forever"). There is no opportunity they are considered leases for "forever"). There is no opportunity
to hand out a lease which is only the MCLT long when first hearing to hand out a lease which is only the MCLT long when first hearing
from a BOOTP device, because they may only interact once with the from a BOOTP device, because they may only interact once with the
DHCP server and they have no notion of a lease expiration time. Thus DHCP server and they have no notion of a lease expiration time. Thus
the entire concept of the MCLT and waiting the MCLT after entering the entire concept of the MCLT and waiting the MCLT after entering
PARTNER-DOWN state is broken when dealing with BOOTP devices. PARTNER-DOWN state is defeated when dealing with BOOTP devices.
With some restrictions, however, dynamic BOOTP devices can be sup- With some restrictions, however, dynamic BOOTP devices can be sup-
ported in a server on a subnet where failover is supported. The only ported in a server on a subnet where failover is supported. The only
restriction (and it is not small) is that on any portion of the sub- restriction (and it is not small) is that on any portion of the sub-
net (in any address pool) where dynamic BOOTP devices can be allo- net (in any address pool) where dynamic BOOTP devices can be allo-
cated IP addresses, a DHCP server MUST NOT ever use any of the IP cated IP addresses, a DHCP server MUST NOT ever use any of the IP
addresses which were previously available for allocation by its fail- addresses which were previously available for allocation by its fail-
over partner. Thus, the addresses allocated by the primary to the over partner. Thus, the addresses allocated by the primary to the
secondary for allocation MUST NOT ever be used by the primary server secondary for allocation that might have been allocated to BOOTP dev-
even if it is in PARTNER-DOWN state and has waited the MCLT after ices MUST NOT ever be used by the primary server even if it is in
entering that state. The reason for this is because one of those IP PARTNER-DOWN state and has waited the MCLT after entering that state.
addresses could have been allocated by the secondary server to a BOOTP Conversely, addresses available for allocation by the primary MUST
device, and the primary server would have no way of ever knowing that NOT be used by the secondary even if it is in PARTNER-DOWN state. The
happened. reason for this is because one of those IP addresses could have been
allocated by the secondary server to a BOOTP device, and the primary
server would have no way of ever knowing that happened.
5.14. Guidelines for selecting MCLT 5.14. Guidelines for selecting MCLT
There is no one correct value for the MCLT. There is an explicit There is no one correct value for the MCLT. There is an explicit
tradeoff between various factors in selecting an MCLT value. tradeoff between various factors in selecting an MCLT value.
5.14.1. Short MCLT 5.14.1. Short MCLT
A short MCLT value will mean that after entering PARTNER-DOWN state, A short MCLT value will mean that after entering PARTNER-DOWN state,
a server will only have to wait a short time before it can start a server will only have to wait a short time before it can start
skipping to change at page 40, line 36 skipping to change at page 39, line 5
longer. longer.
However, a server entering PARTNER-DOWN state will have to wait the However, a server entering PARTNER-DOWN state will have to wait the
longer MCLT before being able to allocate its partner's IP addresses longer MCLT before being able to allocate its partner's IP addresses
to new DHCP clients. This may mean that additional IP addresses are to new DHCP clients. This may mean that additional IP addresses are
required in order to cover this time period. Further, the server in required in order to cover this time period. Further, the server in
PARTNER-DOWN will have to wait the longer MCLT from every lease PARTNER-DOWN will have to wait the longer MCLT from every lease
expiration before it can reallocate an IP address to a different DHCP expiration before it can reallocate an IP address to a different DHCP
client. client.
6. Packet Formats 6. Common Message Format
This section discusses the common message format that all failover This section discusses the common message format that all failover
messages have in common, and then defines the options used in the failover messages have in common, including the message header format as well
protocol. as the common option format. See section 12 for the definitions
of the specific options used in the failover protocol.
6.1. Common message format 6.1. Message header format
The options contained in the payload data section of the failover
message
All failover protocol messages are sent over the TCP connection All failover protocol messages are sent over the TCP connection
between failover endpoints and encoded using a message format between failover endpoints and encoded using a message format
specific to the failover protocol. specific to the failover protocol.
There exists a common message format for all failover messages, which There exists a common message format for all failover messages, which
utilizes the options in a way similar to the DHCP protocol. For each utilizes the options in a way similar to the DHCP protocol. For each
message type, some options are required and some are optional. In message type, some options are required and some are optional. In
addition, when a message is received any options that are not addition, when a message is received any options that are not under-
understood by the receiving server MUST be ignored. stood by the receiving server MUST be ignored.
All of the fields in the fixed portion of the message MUST be filled All of the fields in the fixed portion of the message MUST be filled
with correct data in every message sent. with correct data in every message sent.
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| message length (2) | msg type (1) |payload off (1)| | message length (2) | msg type (1) |payload off (1)|
+---------------+---------------+---------------+---------------+ +---------------+---------------+---------------+---------------+
| time (4) | | time (4) |
+---------------------------------------------------------------+ +---------------------------------------------------------------+
| xid (4) | | xid (4) |
+---------------------------------------------------------------+ +---------------------------------------------------------------+
| 0 or more additional header bytes (variable) | | 0 or more additional header bytes (variable) |
+---------------------------------------------------------------+ +---------------------------------------------------------------+
| payload data (variable) | | payload data (variable) |
| | | |
| formatted as DHCP-style options | | formatted as DHCP-style options |
| using a unique option number space in the RFC TBD | | using a two byte option code and two byte length |
| format defined by [NAMESPACE] | | See section 6.2 for details. |
+---------------------------------------------------------------+ +---------------------------------------------------------------+
message length - 2 bytes, network byte order message length - 2 bytes, network byte order
This is the length of the message. It includes the two byte message This is the length of the message. It includes the two byte message
length itself. The maximum length is 2048 bytes. length itself. The maximum length is 2048 bytes. The minimum length
is 12.
msg type - 1 byte msg type - 1 byte
The message type field is used to distinguish between messages. The message type field is used to distinguish between messages.
The following message types are defined: The following message types are defined:
Value Message Type Value Message Type
----- ------------ ----- ------------
0 reserved not used 0 reserved not used
skipping to change at page 42, line 39 skipping to change at page 40, line 44
ported by every server, and if a server receives a message in the ported by every server, and if a server receives a message in the
range of 0-127 that it doesn't understand, it MUST close the TCP con- range of 0-127 that it doesn't understand, it MUST close the TCP con-
nection. The range of 128-255 is used for messages which MAY be sup- nection. The range of 128-255 is used for messages which MAY be sup-
ported but are not required, and if a server receives a message in ported but are not required, and if a server receives a message in
this range that it does not understand it SHOULD ignore the message. this range that it does not understand it SHOULD ignore the message.
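The handling of message types that a server does not understand, as described above, can be sketched as follows. The function name and the returned strings are hypothetical labels chosen for this illustration only.

```python
def unknown_message_action(msg_type: int) -> str:
    """Action for a message type the receiving server does not understand."""
    if not 0 <= msg_type <= 255:
        raise ValueError("msg type is a single byte")
    if msg_type <= 127:
        # Types 0-127: the server MUST close the TCP connection.
        return "close-connection"
    # Types 128-255: the server SHOULD ignore the message.
    return "ignore"
```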
payload offset - 1 byte payload offset - 1 byte
The byte offset of the Payload Data, from the beginning of the The byte offset of the Payload Data, from the beginning of the
failover message header. The value for the current protocol version failover message header. The value for the current protocol version
is 8. (version 1) is 8.
time - 4 bytes, network byte order time - 4 bytes, network byte order
The absolute time in GMT when the message was transmitted, The absolute time in GMT when the message was transmitted,
represented as seconds elapsed since Jan 1, 1970 (i.e., similar to represented as seconds elapsed since Jan 1, 1970 (i.e., similar to
the ANSI C time_t time value representation). While the ANSI C the ANSI C time_t time value representation). While the ANSI C
time_t value is signed, the value used in this specification is time_t value is signed, the value used in this specification is
unsigned. unsigned.
A server SHOULD set this time as close to the actual transmission of A server SHOULD set this time as close to the actual transmission of
the message as possible. the message as possible.
xid - 4 bytes, network byte order xid - 4 bytes, network byte order
This is the transaction id of the failover message. The sender of a This is the transaction id of the failover message. The sender of a
failover protocol message is responsible for setting this number, and failover protocol message is responsible for setting this number, and
the receiver of the message copies the number over into any response the receiver of the message copies the number over into any response
message, treating it as opaque data. The sender SHOULD ensure that message, treating it as opaque data. The sender MUST ensure that
every message sent from a particular failover endpoint over the every message sent from a particular failover endpoint over the
associated TCP connection has a unique transaction id unless that associated TCP connection has a unique transaction id.
message is a re-transmission.
For failover messages that have no corresponding response message,
the XID value is meaningless, but MUST be supplied. The XID value is
used solely by the receiver of a response message to determine the
corresponding request message.
Request messages where the XID is used in the corresponding response
messages are: POOLREQ, BNDUPD, CONNECT, UPDREQALL, and UPDREQ. The
corresponding response messages are POOLRESP, BNDACK, CONNECTACK,
UPDDONE, and UPDDONE, respectively.
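The request/response pairing listed above could be tracked by the sender along these lines. This is a sketch only: the `PendingRequests` class and its method names are hypothetical, and message types are represented here as plain strings rather than their numeric values.

```python
from typing import Dict, Optional

# Request type -> response type, as listed in the text above.
RESPONSE_FOR = {
    "POOLREQ": "POOLRESP",
    "BNDUPD": "BNDACK",
    "CONNECT": "CONNECTACK",
    "UPDREQALL": "UPDDONE",
    "UPDREQ": "UPDDONE",
}


class PendingRequests:
    """Hypothetical per-connection table of requests awaiting a response."""

    def __init__(self) -> None:
        self._by_xid: Dict[int, str] = {}

    def sent(self, xid: int, request_type: str) -> None:
        """Record a request; only types that expect a response are tracked."""
        if request_type not in RESPONSE_FOR:
            return
        if xid in self._by_xid:
            raise ValueError("xid already in use on this connection")
        self._by_xid[xid] = request_type

    def received(self, xid: int, response_type: str) -> Optional[str]:
        """Return the matching request type, or None if nothing matches."""
        request_type = self._by_xid.get(xid)
        if request_type is None or RESPONSE_FOR[request_type] != response_type:
            return None
        del self._by_xid[xid]
        return request_type
```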
As requests/responses don't survive connection reestablishment, XIDs
only need
payload data - variable length payload data - variable length
The options are placed after the header, after skipping payload The options are placed after the header, after skipping payload
offset bytes from beginning of the message. The payload data options offset bytes from beginning of the message. The payload data options
are not preceded by a "cookie" value. are not preceded by a "cookie" value.
The payload data is formatted as DHCP style options using the two The payload data is formatted as DHCP style options using two byte
byte option number and two byte option length format as specified in option codes and two byte option lengths. The option codes are in a
the recommendations of the DHCP panel in [NAMESPACE]. namespace which is unique to the failover protocol.
The maximum length of the payload data in octets is 2048 less the The maximum length of the payload data in octets is 2048 less the
size of the header, i.e., the maximum message length is 2048 octets. size of the header, i.e., the maximum message length is 2048 octets.
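As an illustration of the payload layout described above, the following sketch walks a payload made of two byte codes and two byte lengths. It is not a reference implementation: the function names are hypothetical, and big-endian (network) byte order for the code and length fields is inferred from the option diagrams rather than stated in this paragraph.

```python
import struct


def encode_option(code, value):
    """Build one option: two byte code, two byte length, then the value."""
    return struct.pack("!HH", code, len(value)) + value


def parse_options(payload):
    """Split payload data into (code, value) pairs, preserving wire order."""
    options = []
    pos = 0
    while pos < len(payload):
        if pos + 4 > len(payload):
            raise ValueError("truncated option header")
        code, length = struct.unpack_from("!HH", payload, pos)
        pos += 4
        if pos + length > len(payload):
            raise ValueError("option value runs past end of payload")
        options.append((code, payload[pos:pos + length]))
        pos += length
    return options
```

Order is preserved in the returned list because, for some failover messages, the order of the options is significant.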
6.2. Common option format 6.2. Common option format
The options contained in the payload data section of the failover The options contained in the payload data section of the failover
message all use the two byte option number and two byte length format message all use a two byte option number and two byte length format.
as specified by the recommendations of the DHCP panel in [NAMESPACE].
The option numbers are drawn from an option number space unique to The option numbers are drawn from an option number space unique to
the failover protocol. All of the message types share a common the failover protocol. All of the message types share a common
option number space and common options definitions, though not all option number space and common options definitions, though not all
options are required or meaningful for every message. options are required or meaningful for every message.
In contrast to the options which appear in DHCP client and server In contrast to the options which appear in DHCP client and server
messages, the options in failover message are ordered. That is, for messages, the options in failover message are ordered. That is, for
some messages the order in which the options appear in the payload some messages the order in which the options appear in the payload
data area is significant. The messages for which this is the case data area is significant. The messages for which option ordering is
spell it out in detail. significant explicitly describe the ordering requirements. If no
ordering requirements are mentioned, then the order is not signifi-
cant for that message.
For all options which refer to time, they all use an absolute time in For all options which refer to time, they all use an absolute time in
GMT. Time synchronization has already been achieved between the GMT. Time synchronization has already been achieved between the
source and the target server using the CONNECT message and is updated source and the target server using the CONNECT message and is updated
using the time in every packet. All time fields in the options and refined using the time in every packet.
defined below use a time represented as seconds elapsed since Jan 1,
1970 (i.e. ANSI C time_t time value representation). Note that this
is (at present) a signed field.
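A short sketch of how such a time field might be packed into the four octets shown in the option diagrams. It follows the -06 wording that appears later in this comparison (an unsigned 32 bit integer in network byte order, convertible to an NTP timestamp by adding 2208988800); the function names are hypothetical.

```python
import struct

NTP_OFFSET = 2208988800  # value to add to obtain an NTP timestamp, per the -06 text


def encode_time(seconds):
    """Pack seconds since 00:00 UTC, 1 January 1970 as four octets."""
    if not 0 <= seconds <= 0xFFFFFFFF:
        raise ValueError("time does not fit in an unsigned 32 bit field")
    return struct.pack("!I", seconds)


def decode_time(value):
    """Unpack a four octet time field into seconds since 1 January 1970."""
    (seconds,) = struct.unpack("!I", value)
    return seconds


def to_ntp_seconds(seconds):
    """Convert a failover time value to the seconds part of an NTP timestamp."""
    return seconds + NTP_OFFSET
```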
Additional options can be defined for intervendor or vendor specific
use with limited difficulty due to the large number of option numbers
available.
6.2.1. binding-status
This option is used to convey the current state of a binding.
Code Len Type
+-----+-----+------+-----+-----+
| 0 | 1 | 0 | 1 | 1-7 |
+-----+-----+------+-----+-----+
Legal values for this option are:
Value Binding Status
----- ------------------------------------------------
1 FREE Lease has never been used
2 ACTIVE Lease is assigned to a client
3 EXPIRED Lease has expired
4 RELEASED Lease has been released by client
5 ABANDONED A server, or client flagged address as unusable
6 RESET Lease was freed by some external agent
7 BACKUP Lease belongs to secondary's private address pool
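The table above maps naturally onto an enumeration. The sketch below encodes and decodes the binding-status option using the code (1) and length (1) shown in the diagram; the Python names are hypothetical.

```python
import struct
from enum import IntEnum

BINDING_STATUS_CODE = 1  # option code shown in the diagram above


class BindingStatus(IntEnum):
    FREE = 1
    ACTIVE = 2
    EXPIRED = 3
    RELEASED = 4
    ABANDONED = 5
    RESET = 6
    BACKUP = 7


def encode_binding_status(status):
    """Return the complete option: code, length, and one value octet."""
    return struct.pack("!HHB", BINDING_STATUS_CODE, 1, BindingStatus(status))


def decode_binding_status(value):
    """Decode the value octet; values outside 1-7 raise ValueError."""
    if len(value) != 1:
        raise ValueError("binding-status value must be exactly one octet")
    return BindingStatus(value[0])
```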
6.2.2. assigned-IP-address
The IP address to which this message refers.
Code Len Address
+-----+-----+------+-----+----+-----+-----+-----+
| 0 | 2 | 0 | 4 | a1 | a2 | a3 | a4 |
+-----+-----+------+-----+----+-----+-----+-----+
6.2.3. sending-server-IP-address
The IP address of the server sending this message.
Code Len Address
+-----+-----+------+-----+----+-----+-----+-----+
| 0 | 3 | 0 | 4 | a1 | a2 | a3 | a4 |
+-----+-----+------+-----+----+-----+-----+-----+
6.2.4. addresses-transferred
A 32 bit unsigned long in network byte order. Reports the number of
addresses transferred by the primary to the secondary server
(addresses to be used for the secondary server's private address
pool).
Code Len Number of Addresses
+-----+-----+------+-----+----+-----+-----+-----+
| 0 | 4 | 0 | 4 | n1 | n2 | n3 | n4 |
+-----+-----+------+-----+----+-----+-----+-----+
6.2.5. client-identifier
The format, code and conventions used are identical to DHCP option
61.
Code Len Client Identifier
+-----+-----+------+-----+----+-----+---
| 0 | 5 | 0 | n | i1 | i2 | ...
+-----+-----+------+-----+----+-----+--
6.2.6. client-hardware-address
The format is similar to DHCP option 61. Byte t1 (type) MUST be set
to the proper ARP hardware address code, as defined in the ARP
section of RFC 1700 (it MUST NOT be zero!)
Code Len htype chaddr
+-----+-----+------+-----+----+-----+-----+---
| 0 | 6 | 0 | n | t1 | c1 | c2 | ...
+-----+-----+------+-----+----+-----+-----+---
Either client-identifier, client-hardware-address or BOTH MAY be
present in binding update transactions. At least one of them MUST be
present. If both are present, the client-identifier MUST be used to
uniquely identify the owner of the binding (exactly as in RFC 2131).
6.2.7. DDNS
If an implementation supports Dynamic DNS updates, this option is
used to communicate the status of the DDNS update associated with a
particular lease binding. The Flags field conveys the types of DNS
RRs that are to be updated by the DHCP server, and the status of the
DDNS update. The Domain Name field conveys the DNS FQDN that the
DHCP server is using to refer to the client, in DNS encoding as
specified in [RFC1035].
Code Len Flags Domain Name
+-----+-----+------+-----+-----+------+------+-----+------
| 0 | 7 | 0 | n | flags | d1 | d2 | ...
+-----+-----+------+-----+-----+------+------+-----+------
The Flags field is a 16-bit field; several bit positions are
specified here.
15 7 0
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| MBZ |P|D|A|C|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The bits (numbered from the least-significant bit in network
byte-order) are used as follows:
0 (C): A RR update successfully completed
1 (A): Server is controlling A RR on behalf of the client
2 (D): PTR RR update successfully completed (Done)
3 (P): Server is controlling PTR RR on behalf of the client
4-15 : Must be zero
All of the unspecified bit positions SHOULD be set to 0 by servers
sending the Failover-DDNS option, and they MUST be ignored by servers
receiving the option.
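The bit assignments above can be sketched as follows. The constant and function names are hypothetical; the domain name is left as raw bytes in DNS encoding rather than decoded. The sender clears the unspecified bits and the receiver ignores them, as the text requires.

```python
import struct

DDNS_C = 1 << 0  # A RR update successfully completed
DDNS_A = 1 << 1  # server is controlling A RR on behalf of the client
DDNS_D = 1 << 2  # PTR RR update successfully completed
DDNS_P = 1 << 3  # server is controlling PTR RR on behalf of the client
DEFINED_BITS = DDNS_C | DDNS_A | DDNS_D | DDNS_P


def encode_ddns(flags, domain_name):
    """Build the option value: 16-bit flags followed by the encoded name."""
    return struct.pack("!H", flags & DEFINED_BITS) + domain_name


def decode_ddns(value):
    """Decode the option value, ignoring unspecified flag bits."""
    if len(value) < 2:
        raise ValueError("DDNS option is shorter than its Flags field")
    (flags,) = struct.unpack_from("!H", value)
    return {
        "a_rr_done": bool(flags & DDNS_C),
        "server_controls_a": bool(flags & DDNS_A),
        "ptr_rr_done": bool(flags & DDNS_D),
        "server_controls_ptr": bool(flags & DDNS_P),
        "domain_name": value[2:],
    }
```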
6.2.8. reject-reason
This option is used to selectively reject binding updates. It MAY be
used in BNDACK message, always associated with an assigned-IP-address
option, which contains the IP address of the update being rejected.
Code Len Reason Code
+-----+-----+------+-----+----------+
| 0 | 8 | 0 | 1 | R1 |
+-----+-----+------+-----+----------+
Reason codes :
0 Reserved
1 Illegal IP address (not part of any address pool)
2 Fatal conflict exists: address in use by other client.
3 Missing binding information.
4 Connection rejected, time mismatch too great.
5 Connection rejected, invalid MCLT.
6 Connection rejected, unknown reason.
7 Connection rejected, duplicate connection.
8 Connection rejected, invalid failover partner.
9 TLS not supported
10 TLS supported but not configured
11 TLS required but not supported by partner
12 Message digest not supported
13 Message digest not configured
14 Protocol version mismatch
15 Missing binding information
16 Outdated binding information
17 Less critical binding information
18 No traffic within sufficient time
19 Hash bucket assignment conflict
20-253, reserved.
254 Unknown: Error occurred but does not match any reason code
255 Reserved for code expansion
6.2.9. message
This option is used to supply a human readable message. It may be
used in association with the Reject Reason Code to provide a human
readable error message for the reject.
Code Len Text
+-----+-----+------+-----+------+-----+--
| 0 | 9 | 0 | n | c1 | c2 | ...
+-----+-----+------+-----+------+-----+--
6.2.10. MCLT
Maximum Client Lead Time, in seconds. A 32 bit integer value, in
network byte order.
Code Len Time
+-----+-----+------+-----+----+-----+-----+-----+
| 0 | 10 | 0 | 4 | t1 | t2 | t3 | t4 |
+-----+-----+------+-----+----+-----+-----+-----+
6.2.11. vendor-class-identifier
A string which identifies the vendor of the failover protocol
implementation.
The code for this option is 11, and its minimum length is 1.
Code Len vendor class string
+-----+-----+------+-----+----+-----+---
| 0 | 11 | 0 | n | c1 | c2 | ...
+-----+-----+------+-----+----+-----+---
6.2.12. lease-expiration-time
The lease expiration time expressed as an absolute time in GMT
represented as seconds elapsed since Jan 1, 1970 (i.e. ANSI C time_t
time value representation).
The lease expiration time is the time that a server has ACKed to a
DHCP client.
Code Len Time
+-----+-----+------+-----+----+-----+-----+-----+
| 0 | 13 | 0 | 4 | t1 | t2 | t3 | t4 |
+-----+-----+------+-----+----+-----+-----+-----+
6.2.13. potential-expiration-time
The potential expiration time expressed as an absolute time in GMT
represented as seconds elapsed since Jan 1, 1970 (i.e. ANSI C time_t
time value representation).
The potential expiration time is the time that one server tells
another server that it may ACK to a client.
Code Len Time
+-----+-----+------+-----+----+-----+-----+-----+
| 0 | 14 | 0 | 4 | t1 | t2 | t3 | t4 |
+-----+-----+------+-----+----+-----+-----+-----+
6.2.14. grace-expiration-time
The grace expiration time expressed as an absolute time in GMT
represented as seconds elapsed since Jan 1, 1970 (i.e. ANSI C time_t
time value representation).
The grace expiration time is the time that a grace period will
expire.
Code Len Time
+-----+-----+------+-----+----+-----+-----+-----+
| 0 | 15 | 0 | 4 | t1 | t2 | t3 | t4 |
+-----+-----+------+-----+----+-----+-----+-----+
6.2.15. client-last-transaction-time
The time at which this server last received a DHCP request from a
particular client expressed as an absolute time in GMT represented as
seconds elapsed since Jan 1, 1970 (i.e. ANSI C time_t time value
representation).
Code Len Client Last Transaction Time
+-----+-----+------+-----+----+-----+-----+-----+
| 0 | 16 | 0 | 4 | t1 | t2 | t3 | t4 |
+-----+-----+------+-----+----+-----+-----+-----+
6.2.16. start-time-of-state
The time at which the state contained in this message began,
expressed as an absolute time in GMT represented as seconds elapsed
since Jan 1, 1970 (i.e. ANSI C time_t time value representation).
This option is used for different states in different messages. In a
BNDUPD message it represents the start time of the state of the lease
in the BNDUPD message. In a STATE message, it represents the start
time of the partner server's failover state.
Code Len Start Time of State
+-----+-----+------+-----+----+-----+-----+-----+
| 0 | 17 | 0 | 4 | t1 | t2 | t3 | t4 |
+-----+-----+------+-----+----+-----+-----+-----+
6.2.17. server-state
This option is used to convey the current state of the failover
endpoint in the sending server.
Code Len Server State
+-----+-----+------+-----+-----+
| 0 | 18 | 0 | 1 | 1-9 |
+-----+-----+------+-----+-----+
Legal values for this option are:
Value Server State
----- -------------------------------------------------------------
0 reserved
1 STARTUP Startup state (1)
2 NORMAL Normal state
3 COMMUNICATIONS-INTERRUPTED Communication interrupted (safe)
4 PARTNER-DOWN Partner down (unsafe mode)
5 POTENTIAL-CONFLICT Synchronizing
6 RECOVER Recovering bindings from partner
7 PAUSED Shutting down for a short period.
8 SHUTDOWN Shutting down for an extended
period.
9 RECOVER-DONE Interlock state prior to NORMAL
6.2.18. server-flags
This option is used to convey the current flags of the failover
endpoint in the sending server.
Code Len Server Flags
+-----+-----+------+-----+-------+
| 0 | 19 | 0 | 1 | flags |
+-----+-----+------+-----+-------+
Legal values for this option are:
Currently, bit 5 is defined. All other bits
are reserved, and must be set to 0.
o STARTUP
Bit 5 is the STARTUP flag. Bit 5 MUST be set to 1 whenever the
server is in STARTUP state, and set to 0 otherwise. (Note that
when in STARTUP state, the state transmitted in the server-state
option is usually the last recorded state from stable storage,
but see section 9.3 for details.)
6.2.19. vendor-specific-options
This option is used to convey options specific to a particular
vendor's implementation. The vendor class identifier is used to
specify which option space the embedded options are drawn from.
It functions similarly to the vendor class identifier and vendor
specific options in the DHCP protocol.
This option contains other options in the same two byte code, two
byte length format. If this option appears in a message without a
corresponding vendor class identifier, it MUST be ignored.
Code Len Embedded options
+-----+-----+------+-----+----+-----+---
| 0 | 20 | 0 | n | c1 | c2 | ...
+-----+-----+------+-----+----+-----+---
6.2.20. max-unacked-bndupd
The maximum number of BNDUPD messages that this server is prepared to
accept over the TCP connection without causing the TCP connection to
block.
Code Len Maximum Unacked BNDUPD
+-----+-----+------+-----+----+-----+-----+-----+
| 0 | 21 | 0 | 4 | n1 | n2 | n3 | n4 |
+-----+-----+------+-----+----+-----+-----+-----+
6.2.21. receive-timer
The number of seconds within which the server must receive a message
from its partner, or it will assume that the partner is down or the
communication path to the partner has failed.
Code Len Receive Timer
+-----+-----+------+-----+----+-----+-----+-----+
| 0 | 23 | 0 | 4 | s1 | s2 | s3 | s4 |
+-----+-----+------+-----+----+-----+-----+-----+
6.2.22. hash-bucket-assignment
The set of hash values to which the receiving server MUST respond.
See section 5.3 for more information on how this option is used.
The format and usage of the data in this option is defined in
[LOADB].
Code Len Hash Buckets
+-----+-----+------+-----+----+-----+-----+-----+
| 0 | 24 | 0 | 32 | b1 | b2 | ... | b32 |
+-----+-----+------+-----+----+-----+-----+-----+
6.2.23. message-digest
The message digest for this message.
This option consists of a variable number of bytes which contain the
message digest of the message prior to the inclusion of this option.
When this option appears in a message, it MUST appear as the last
option in the message.
Code Len Message Digest
+-----+-----+------+-----+----+-----+-----
| 0 | 25 | 0 | n | d1 | d2 | ...
+-----+-----+------+-----+----+-----+-----
6.2.24. protocol-version
The protocol version being used by the server. It is only sent in the
CONNECT and CONNECTACK messages.
Code Len Version
+-----+-----+------+-----+----+
| 0 | 26 | 0 | 1 | v1 |
+-----+-----+------+-----+----+
6.2.25. TLS-request
This option contains information relating to TLS security
negotiation. It is sent in a CONNECT message.
The first byte, req, is the TLS request from this server. A value of
0 indicates no TLS operation, a value of 1 indicates that TLS
operation is desired, and a value of 2 indicates that TLS operation
is required to establish communications with this server.
The second byte, acc, is what this server will accept for TLS The time value is an unsigned 32 bit integer in network byte order
operation. A value of 0 means that this server will not accept TLS giving the number of seconds since 00:00 UTC, 1st January 1970. This
connections. A value of 1 means that this server will accept TLS can be converted to an NTP timestamp by adding decimal 2208988800.
connections. This time format will not wrap until the year 2106. Until sometime
in 2038, it is equal to the ANSI C time_t value (which is a signed 32
bit value and will overflow into a negative number in 2038).
If req is not zero, then acc MUST be 1. Options should appear once only in each message (except for BNDUPD
and BNDACK messages where bulking is used, see section 6.3 for
details.) An option that appears twice is not concatenated, but
treated as an error.
This allows a server which is not configured to require TLS support Specific option values are described in section 12.
to inform its partner that it will accept a TLS connection although
it does not desire one, for instance.
Code Len request accept See section 13 for how to define additional options.
+-----+-----+------+-----+----+----+
| 0 | 27 | 0 | 2 | req| acc|
+-----+-----+------+-----+----+----+
6.2.26. TLS-reply 6.3. Batching multiple binding update transactions in one BNDUPD mes-
sage
This option contains information relating to TLS security Implementations of this protocol MAY send multiple binding update
negotiation. It is sent in a CONNECTACK message transactions in one BNDUPD message, where a binding update transac-
tion is defined as the set of options which are associated with the
update of a single IP address. All implementations of this protocol
MUST be prepared to receive BNDUPD messages which contain multiple
binding update transactions and respond correctly to them, including
replying with a BNDACK message which contains status for the multiple
binding update transactions contained in the BNDUPD message.
The value of 0 indicates no TLS operation, a value of 1 indicates In the discussion of sending and receiving BNDUPD messages in section
that TLS operation is required. 7.1 and BNDACK messages in section 7.2, each BNDUPD message and
BNDACK message is assumed to contain a single binding update transac-
tion in order to reduce the complexity of the discussions in section
7.
Code Len TLS Multiple binding update transactions MAY be batched together in one
+-----+-----+------+-----+----+ BNDUPD protocol message with the data sets for the individual tran-
| 0 | 28 | 0 | 1 | t1 | sactions delimited by the assigned-IP-address option, which MUST
+-----+-----+------+-----+----+ appear first in the option set for each transaction. Ordering of
options between the assigned-IP-address options is not significant.
This is illustrated in the following schematic representation:
6.2.27. client-request-options Non-IP Address/Non-client specific options first
assigned-IP-address option for the first IP address
Options pertaining to first address, including
at least the binding-status option and others as
required.
assigned-IP-address option for the second IP address
Options pertaining to second address, including
at least the binding-status option and others as
required.
...
This option contains options from a DHCP client's request. It is There MUST be a one-to-one correspondence between BNDUPD and BNDACK
sent in a BNDUPD message. The first 4 bytes of the option contain messages, and every BNDACK message MUST contain status for all of the
the "magic number" of the option area from which the DHCP client's binding update transactions in the corresponding BNDUPD message.
request options were taken and serves to define the format of the
rest of the sub-options contained in this option. After the magic
number, the options included are in the normal options format
appropriate for that magic number.
A server SHOULD NOT include all of the options in a DHCP client The BNDACK message corresponding to a BNDUPD message MUST contain
request in this option, but rather a server SHOULD include only those assigned-IP-address options for all of the binding update transac-
options which are of likely interest to its partner server. See tions in the BNDUPD message. Thus, every BNDACK message contains
section 7.1 for details. exactly the same assigned-IP-address options as does its
corresponding BNDUPD message. The order of the assigned-IP-address
options MAY, however, be different.
Code Len Magic Number Embedded options In case the server chooses to reject some or all of the IP address
+-----+-----+------+-----+----+----+----+----+----+----+-- binding information in a BNDUPD message in a BNDACK reply, the BNDACK
| 0 | 29 | 0 | n | m1 | m2 | m3 | m4 | b1 | b2 | ... message MUST contain a reject-reason option following every
+-----+-----+------+-----+----+----+----+----+----+----+-- assigned-IP-address option in order to indicate that the binding
update transaction for that IP address was not accepted and why. As
with a BNDACK message containing a single binding update transaction,
an assigned-IP-address option without any associated reject-reason
option indicates a successful binding update transaction.
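The batching rules in the -06 text of section 6.3 can be sketched as follows: options are grouped into binding update transactions at each assigned-IP-address option, and a BNDACK is checked against its BNDUPD. The function names are hypothetical; options are represented as (code, value) pairs in wire order, with code 2 for assigned-IP-address and code 8 for reject-reason as shown in the option diagrams.

```python
ASSIGNED_IP_ADDRESS = 2
REJECT_REASON = 8


def split_transactions(options):
    """Return (common_options, [(address, options_for_address), ...])."""
    common, transactions = [], []
    for code, value in options:
        if code == ASSIGNED_IP_ADDRESS:
            transactions.append((value, []))  # starts a new transaction
        elif transactions:
            transactions[-1][1].append((code, value))
        else:
            common.append((code, value))  # appears before any address
    return common, transactions


def bndack_results(bndupd_options, bndack_options):
    """Map each address to its reject reason, or None if it was accepted."""
    _, sent = split_transactions(bndupd_options)
    _, acked = split_transactions(bndack_options)
    # Same addresses are required, but their order may differ.
    if sorted(a for a, _ in sent) != sorted(a for a, _ in acked):
        raise ValueError("BNDACK addresses do not match the BNDUPD")
    results = {}
    for address, opts in acked:
        reasons = [v[0] for c, v in opts if c == REJECT_REASON and v]
        results[address] = reasons[0] if reasons else None
    return results
```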
6.2.28. client-reply-options 7. Protocol Messages
This option contains options from a DHCP server's reply to a DHCP This section contains the detailed definition of the protocol mes-
client request. It is sent in a BNDUPD message. The first 4 bytes sages, including the information to include when sending the message,
of the option contain the "magic number" of the option area from as well as the actions to take upon receiving the message.
which the DHCP reply options were taken and serves to define the
format of the rest of the sub-options contained in this option.
After the magic number, the options included are in the normal
options format appropriate for that magic number.
A server SHOULD NOT include all of the options in a DHCP server's 7.1. BNDUPD message
reply to a client's request in this option, but rather a server
SHOULD include only those options which are of likely interest to its
partner server. See section 7.1 for details.
Code Len Magic Number Embedded options The binding update (BNDUPD) message is used to send the binding data-
+-----+-----+------+-----+----+----+----+----+----+----+-- base changes (known as binding update transactions) to the partner
| 0 | 30 | 0 | n | m1 | m2 | m3 | m4 | b1 | b2 | ... server, and the partner server responds with a binding acknowledge-
+-----+-----+------+-----+----+----+----+----+----+----+-- ment (BNDACK) message when it has successfully committed those
changes to its own stable storage.
6.3. BNDUPD message format The rest of the failover protocol exists to determine whether the
partner server is able to communicate or not, and to enable the
partners to exchange BNDUPD/BNDACK messages in order to keep their
binding databases in stable storage synchronized.
The binding update (BNDUPD) message is used to send the binding data- The rest of this section is written as though every BNDUPD message
base changes to the partner server. contains only a single binding update transaction in order to reduce
the complexity of the discussion. See section 6.3 for information on
how to create and process BNDUPD and BNDACK messages which contain
multiple binding update transactions. Note that while a server MAY
generate BNDUPD messages with multiple binding update transactions,
every server MUST be able to process a BNDUPD message which contains
multiple binding update transactions and generate the corresponding
BNDACK messages with status for multiple binding update transactions.
The message type for the BNDUPD message is 3. The message type for the BNDUPD message is 3.
The xid of the BNDUPD MUST be unique with respect to other failover
messages transmitted from this failover endpoint.
The following table summarizes the various options for the BNDUPD The following table summarizes the various options for the BNDUPD
message. message.
binding-status binding-status BACKUP
RESET
ABANDONED
Option ACTIVE EXPIRED RELEASED FREE Option ACTIVE EXPIRED RELEASED FREE
------ ------ ------- -------- ---- ------ ------ ------- -------- ----
assigned-IP-address MUST MUST MUST MUST assigned-IP-address MUST MUST MUST MUST
binding-status MUST MUST MUST MUST binding-status MUST MUST MUST MUST
client-identifier MAY MAY MAY MAY client-identifier MAY MAY MAY MAY(2)
client-hardware-address MUST MUST MUST MAY client-hardware-address MUST MUST MUST MAY(2)
lease-expiration-time MUST MUST NOT MUST NOT MUST NOT lease-expiration-time MUST MUST NOT MUST NOT MUST NOT
potential-expiration-time MUST MUST NOT MUST NOT MUST NOT potential-expiration-time MUST MUST NOT MUST NOT MUST NOT
grace-expiration-time MUST NOT MUST NOT MUST NOT MUST NOT
start-time-of-state SHOULD SHOULD SHOULD SHOULD start-time-of-state SHOULD SHOULD SHOULD SHOULD
client-last-trans.-time MUST SHOULD MUST MAY client-last-trans.-time MUST SHOULD MUST MAY
DDNS(1) SHOULD SHOULD SHOULD SHOULD DDNS(1) SHOULD SHOULD SHOULD SHOULD
client-request-options SHOULD SHOULD NOT SHOULD SHOULD NOT client-request-options SHOULD SHOULD NOT SHOULD SHOULD NOT
client-reply-options SHOULD SHOULD NOT SHOULD SHOULD NOT client-reply-options SHOULD SHOULD NOT SHOULD NOT SHOULD NOT
all others MAY MAY MAY MAY
binding-status
BACKUP
RESET
Option ABANDONED
------ ---------
assigned-IP-address MUST
binding-status MUST
client-identifier MAY(2)
client-hardware-address MAY(2)
lease-expiration-time MUST NOT
potential-expiration-time MUST NOT
grace-expiration-time MUST NOT
start-time-of-state SHOULD
client-last-trans.-time MAY
DDNS(1) SHOULD
client-request-options SHOULD NOT
client-reply-options SHOULD NOT
all others MAY
(1) Only SHOULD appear if server supports dynamic DNS. (1) Only SHOULD appear if server supports dynamic DNS.
(2) MUST NOT if binding-status is ABANDONED. (2) MUST NOT if binding-status is ABANDONED.
Table 6.3-1: Options used in a BNDUPD message Table 7.1-1: Options used in a BNDUPD message
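The MUST and MUST NOT entries of the BNDUPD table lend themselves to a table-driven check. The sketch below covers only the rows on which both revisions agree (it omits grace-expiration-time, which appears only in the -05 table, and all SHOULD and MAY entries); the names are hypothetical.

```python
ALWAYS = {"assigned-IP-address", "binding-status"}
EXPIRY = {"lease-expiration-time", "potential-expiration-time"}

# Options that MUST appear, by binding-status.
MUST = {
    "ACTIVE": ALWAYS | EXPIRY
    | {"client-hardware-address", "client-last-transaction-time"},
    "EXPIRED": ALWAYS | {"client-hardware-address"},
    "RELEASED": ALWAYS
    | {"client-hardware-address", "client-last-transaction-time"},
    "FREE": ALWAYS,
    "BACKUP": ALWAYS,
    "RESET": ALWAYS,
    "ABANDONED": ALWAYS,
}

# Options that MUST NOT appear, by binding-status (note (2) for ABANDONED).
MUST_NOT = {
    "ACTIVE": set(),
    "EXPIRED": EXPIRY,
    "RELEASED": EXPIRY,
    "FREE": EXPIRY,
    "BACKUP": EXPIRY,
    "RESET": EXPIRY,
    "ABANDONED": EXPIRY | {"client-identifier", "client-hardware-address"},
}


def check_bndupd(binding_status, present):
    """Return a list of violations for one binding update transaction."""
    present = set(present)
    problems = [f"missing {n}" for n in sorted(MUST[binding_status] - present)]
    problems += [
        f"forbidden {n}" for n in sorted(MUST_NOT[binding_status] & present)
    ]
    return problems
```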
6.4. BNDACK message format
A server sends a binding acknowledgement (BNDACK) message when it has
successfully committed binding database changes received from a fail-
over partner in a BNDUPD message to its own stable storage.
The message type for the BNDACK message is 4.
The xid in a BNDACK MUST be the same as the xid of the corresponding
BNDUPD.
The following table summarizes the options for the BNDACK message.
binding-status
Option ACTIVE EXPIRED RELEASED FREE
------ ------ ------- -------- ----
assigned-IP-address MUST MUST MUST MUST
binding-status MUST MUST MUST MUST
client-identifier MAY MAY MAY MAY
client-hardware-address MUST MUST MUST MAY
reject-reason MAY MAY MAY MAY
message MAY MAY MAY MAY
lease-expiration-time MUST MUST NOT MUST NOT MUST NOT
potential-expiration-time MUST MUST NOT MUST NOT MUST NOT
grace-expiration-time MUST NOT MUST NOT MUST NOT MUST NOT
start-time-of-state SHOULD SHOULD SHOULD SHOULD
client-last-trans.-time SHOULD SHOULD SHOULD MAY
DDNS(1) SHOULD SHOULD SHOULD SHOULD
all others MAY MAY MAY MAY
binding-status
BACKUP
RESET
Option ABANDONED
------ ---------
assigned-IP-address MUST
binding-status MUST
client-identifier MAY
client-hardware-address MAY(2)
reject-reason MAY
message MAY
lease-expiration-time MUST NOT
potential-expiration-time MUST NOT
grace-expiration-time MUST NOT
start-time-of-state SHOULD
client-last-trans.-time MAY
DDNS(1) SHOULD
all others MAY
(1) Only SHOULD appear if the server supports dynamic DNS.
(2) MUST NOT if binding-status is ABANDONED.
Table 6.4-1: Options used in a BNDACK message
6.5. Bulking for BNDUPD and BNDACK messages
DISCUSSION:
Bulking is planned for this protocol, but it hasn't been specified
in this revision of the draft. Once the draft settles down, we
will specify the bulking approach in detail.
6.6. UPDREQ message format
The update request (UPDREQ) message is used by one server to request
that its partner send it all binding database information that it has
not already seen.
The message type for the UPDREQ message is 9.
The xid in a UPDREQ message MUST be unique among messages transmitted
from this failover endpoint during the life of this connection.
There are no options that MUST appear in an UPDREQ message. Any
option MAY appear, though very few will likely be useful.
6.7. UPDREQALL message format
The update request all (UPDREQALL) message is used by one server to
request that all binding database information be sent in order to
recover from a total loss of its binding database by the requesting
server.
The message type for the UPDREQALL message is 7.
The xid in a UPDREQALL message MUST be unique among messages
transmitted from this failover endpoint during the life of this con-
nection.
There are no options that MUST appear in an UPDREQALL message. Any
option MAY appear, though very few will likely be useful.
6.8. UPDDONE message format
The update done (UPDDONE) message is used by the responding server to
indicate that all requested updates have been sent by the responding
server as BNDUPD messages and responded to by the requesting server
using BNDACK messages. While a BNDACK message MUST have been
received for each BNDUPD message prior to the transmission of the
UPDDONE message, this doesn't necessarily mean that all of the BNDUPD
messages were accepted, only that all of them were responded to with
a BNDACK message. Thus, a NAK (comprised of a BNDACK message con-
taining a reject-reason option) could be used to reject a BNDUPD, but
for the purposes of the UPDDONE message, such a NAK would count as a
response to the associated BNDUPD message, and would not block the
eventual transmission of the UPDDONE message.
The message type for the UPDDONE message is 8.
The xid in an UPDDONE message MUST be identical to the xid in the
UPDREQ or UPDREQALL message that initiated the update process.
There are no options that MUST appear in an UPDDONE message. Any
option MAY appear, though very few will likely be useful.
6.9. POOLREQ message format
The pool request (POOLREQ) is used by the secondary server to request
an allocation of IP addresses from the primary server.
The message type for the POOLREQ message is 1.
The xid in a POOLREQ message MUST be unique among messages transmit-
ted from this failover endpoint during the life of this connection.
There are no options that MUST appear in a POOLREQ message. Any
option MAY appear.
6.10. POOLRESP message format
The pool response (POOLRESP) is used by the primary server to inform
the secondary server how many IP addresses were allocated to the
secondary server as the result of the pool request.
The message type for the POOLRESP message is 2.
The xid in the POOLRESP message MUST be identical to the xid in the
POOLREQ message for which this POOLRESP is a response.
The following table shows the options that MUST appear in a POOLRESP
message:
Option
------
addresses-transferred MUST
Table 6.10-1: Options used in a POOLRESP message
6.11. CONNECT message format
The connect (CONNECT) message is used by the primary server to estab-
lish a high level connection with the other server, and to transmit
several important configuration data items between the servers.
The message type for the CONNECT message is 5.
The xid in a CONNECT message MUST be unique among messages transmit-
ted from this failover endpoint during the life of this connection.
The CONNECT message MUST be the first message sent down a newly esta-
blished connection.
The following table summarizes the options that are associated with
the CONNECT message:
Option
------
sending-server-IP-address MUST
max-unacked-bndupd MUST
receive-timer MUST
vendor-class-identifier MUST
protocol-version MUST
TLS-request MUST
MCLT MUST
hash-bucket-assignment MUST
all others MAY
Table 6.11-1: Options used in a CONNECT message
6.12. CONNECTACK message format
The connect response (CONNECTACK) message is used by a secondary
server to respond to the receipt of a CONNECT message from the pri-
mary server.
The message type for the CONNECTACK message is 6.
The xid in the CONNECTACK message MUST be identical to the xid in the
CONNECT message for which this CONNECTACK is a response.
The following table summarizes the options associated with the CON-
NECTACK message:
Option
------
sending-server-IP-address MUST
max-unacked-bndupd MUST
receive-timer MUST
vendor-class-identifier MUST
protocol-version MUST
TLS-reply MUST
reject-reason MAY(1)
message MAY
MCLT MUST NOT
hash-bucket-assignment MUST NOT
(1) Indicates a rejection of the CONNECT message.
Table 6.12-1: Options used in a CONNECTACK message
6.13. STATE message format
The state (STATE) message is used by either server to communicate the
current state of the failover endpoint with the other server. It
MUST be sent immediately after connection negotiation completes with
the other server, and it MUST be sent whenever the server's state
changes.
The message type for the STATE message is 10.
The xid in a STATE message MUST be unique among messages transmitted
from this failover endpoint during the life of this connection.
The following table shows the options that MUST appear in a STATE
message:
Option
------
sending-state MUST
server-flags MUST
start-time-of-state MUST
Table 6.13-1: Options used in a STATE message
6.14. CONTACT message format
The contact (CONTACT) message is used by either server to verify that
the connection is operational to the other server.
The message type for the CONTACT message is 11.
The xid in a CONTACT message MUST be unique among messages transmit-
ted from this failover endpoint during the life of this connection.
There are no options that MUST be used in a CONTACT message.
6.15. DISCONNECT message format
The disconnect (DISCONNECT) message is used by either server just
prior to closing a connection.
The message type for the DISCONNECT message is 12.
The xid in a DISCONNECT message MUST be unique among messages
transmitted from this failover endpoint during the life of this con-
nection.
The DISCONNECT message MUST be the last message sent down a connec-
tion before it is closed.
The following table summarizes the options that are associated with
the DISCONNECT message:
Option
------
reject-reason MUST
message SHOULD
Table 6.15-1: Options used in a DISCONNECT message
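The option tables in sections 6.11 through 6.15 can be checked mechanically. The following is a minimal sketch, not part of the protocol: the message type numbers and option names are taken from the tables above, while the dictionary layout and the function name check_options are hypothetical and chosen only for illustration.

```python
# Message type numbers as stated in sections 6.11 through 6.15.
MESSAGE_TYPES = {
    "CONNECT": 5,
    "CONNECTACK": 6,
    "STATE": 10,
    "CONTACT": 11,
    "DISCONNECT": 12,
}

# Options marked MUST in each table. SHOULD and MAY options are not checked.
REQUIRED = {
    "CONNECT": {
        "sending-server-IP-address", "max-unacked-bndupd", "receive-timer",
        "vendor-class-identifier", "protocol-version", "TLS-request",
        "MCLT", "hash-bucket-assignment",
    },
    "CONNECTACK": {
        "sending-server-IP-address", "max-unacked-bndupd", "receive-timer",
        "vendor-class-identifier", "protocol-version", "TLS-request",
    },
    "STATE": {"sending-state", "server-flags", "start-time-of-state"},
    "CONTACT": set(),
    "DISCONNECT": {"reject-reason"},
}

# Options marked MUST NOT (only CONNECTACK has any).
FORBIDDEN = {"CONNECTACK": {"MCLT", "hash-bucket-assignment"}}


def check_options(message_name, options):
    """Return a list of problems; an empty list means the option set passes."""
    present = set(options)
    problems = []
    for name in sorted(REQUIRED[message_name] - present):
        problems.append(f"missing MUST option: {name}")
    for name in sorted(FORBIDDEN.get(message_name, set()) & present):
        problems.append(f"MUST NOT option present: {name}")
    return problems
```

A receiver could run such a check before acting on a message; for example, check_options("DISCONNECT", ["message"]) reports that reject-reason is missing.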
7. Protocol Messages
This section contains the detailed definition of the protocol mes-
sages, including the information to include when sending the message,
as well as the actions to take upon receiving the message.
7.1. BNDUPD message
The binding update (BNDUPD) message is used to send the binding data-
base changes to the partner server, and the partner server responds
with a binding acknowledgement (BNDACK) message when it has success-
fully committed those changes to its own stable storage.
The rest of the failover protocol exists to determine whether the
partner server is able to communicate or not, and to enable the
partners to exchange BNDUPD/BNDACK messages in order to keep their
binding databases in stable storage synchronized.
7.1.1. Sending the BNDUPD message 7.1.1. Sending the BNDUPD message
A BNDUPD message SHOULD be generated whenever any binding changes. A A BNDUPD message SHOULD be generated whenever any binding changes. A
change might be in the binding-status, the lease-expiration-time, or change might be in the binding-status, the lease-expiration-time, or
even just the last-transaction-time. In general, any time a DHCP even just the last-transaction-time. In general, any time a DHCP
client sends in a packet that results in a DHCP server writing to its server writes its stable storage, a BNDUPD message SHOULD be gen-
stable storage, a BNDUPD message SHOULD be generated. erated. This will often be the result of the processing of a DHCP
client request, but it might also be the result of a successful
The BNDUPD (and BNDACK) messages refer to the binding-status of the dynamic DNS update operation.
IP address, and this protocol defines a series of binding-statuses,
discussed in more detail below. Some servers may not support all of
these binding-statuses, and so in those cases they will not be sent,
and upon receipt a reasonable interpretation should be made.
All BNDUPD messages MUST contain the IP address in the assigned-IP-
address option, and it contains the IP address about which the BNDUPD
message is being sent.
All BNDUPD messages MUST contain the binding-status option, and it
will have one of the values in the following list. This list
discusses the meanings of the various binding-statuses and the infor-
mation that should go into the BNDUPD message because of them.
o ACTIVE BNDUPD (and BNDACK) messages refer to the binding-status of the IP
address, and this protocol defines a series of binding-statuses, dis-
cussed in more detail below. Some servers may not support all of
these binding-statuses, and so in those cases they will not be sent.
Upon receipt of a BNDUPD message which contains an unsupported
binding-status, a reasonable interpretation should be made (see sec-
tion 5.10).
Indicates that the IP address is currently leased to a DHCP All BNDUPD messages MUST contain the IP address of the binding update
client. transaction in the assigned-IP-address option.
client-hardware-address All binding update transactions contain a binding-status option, and
it will have one of the values found in section 5.10. Client infor-
mation consists of client-hardware-address and possibly a client-
identifier, and is explained in more detail later in this section.
The following table indicates whether client information should or
should not appear with each binding-status in a binding update tran-
saction:
The client-hardware-address option MUST appear, and be set from binding-status includes client information
the htype and chaddr of the DHCP client to which this IP address ------------------------------------------------
is leased. ACTIVE MUST
EXPIRED SHOULD
RELEASED SHOULD
FREE MAY
ABANDONED MUST NOT
RESET MAY
BACKUP MAY
client-identifier Table 7.1.1-1: Client information required by various
binding-status values.
If the DHCP client to which this IP address is leased used a The ACTIVE binding-status requires some options to indicate the
client-identifier option to identify itself, then the client- length of the binding:
identifier MUST appear in the BNDUPD message, else it MUST NOT
appear.
lease-expiration-time o lease-expiration-time
The lease-expiration-time option MUST appear, and be set to the The lease-expiration-time option MUST appear, and be set to the
expiration time most recently ACKed to the DHCP client. Note expiration time most recently ACKed to the DHCP client. Note
that the time ACKed to a DHCP client is a lease duration in that the time ACKed to a DHCP client is a lease duration in
seconds, while the lease-expiration-time option in a BNDUPD mes- seconds, while the lease-expiration-time option in a BNDUPD mes-
sage is an absolute time value. sage is an absolute time value.
potential-expiration-time o potential-expiration-time
The potential-expiration-time option MUST appear, and be set to The potential-expiration-time option MUST appear, and be set to
a value beyond that of the lease-expiration time. This is the a value beyond that of the lease-expiration time. This is the
value that is ACKed by the BNDACK message. A server sending a value that is ACKed by the BNDACK message. A server sending a
BNDUPD message MUST be able to recover the potential- BNDUPD message MUST be able to recover the potential-
expiration-time sent in every BNDUPD, not just those that expiration-time sent in every BNDUPD, not just those that
receive a corresponding BNDACK, in order to be able to protect receive a corresponding BNDACK, in order to be able to protect
against possible duplicate allocation of IP addresses after against possible duplicate allocation of IP addresses after
transitioning to PARTNER-DOWN state. See section 5.2.1 for transitioning to PARTNER-DOWN state. See section 5.2.1 for
details as to why the potential-expiration-time exists and details as to why the potential-expiration-time exists and
guidelines for how to decide the value. guidelines for how to decide on the value.
o EXPIRED
A binding-status of EXPIRED is used when a client's binding on
an IP address has expired and the server does not wish to imple-
ment an expired-grace period. When the partner server ACK's the
BNDUPD of an EXPIRED IP address, the server sets its internal
state to FREE. It is then available for allocation to any client
of the primary server.
client-hardware-address
There SHOULD be a DHCP client associated with the IP address
whose binding has expired. If there is, then the client-
hardware-address option MUST appear, and be set from the htype
and chaddr of the DHCP client to which this IP address was
leased.
client-identifier
There SHOULD be a DHCP client associated with the IP address
whose binding has expired. If there is, then if the DHCP client
to which this IP address was leased used a client-identifier
option to identify itself, then the client-identifier MUST
appear in the BNDUPD message, else it MUST NOT appear.
o RELEASED
A binding-status of RELEASED is used when a DHCP client sends in
a DHCPRELEASE message and the server does not wish to implement
a released-grace period. When the partner server ACK's the
BNDUPD of a RELEASED IP address, the server sets its internal
state to FREE, and it is available for allocation by the primary
server to any DHCP client.
client-hardware-address
There SHOULD be a DHCP client associated with the IP address
whose binding has been released. If there is, then the client-
hardware-address option MUST appear, and be set from the htype
and chaddr of the DHCP client which released this IP address.
client-identifier
There SHOULD be a DHCP client associated with the IP address
whose binding has been released. If there is, then if the DHCP
client which released this IP address used a client-identifier
option to identify itself, then the client-identifier MUST
appear in the BNDUPD message, else it MUST NOT appear.
o FREE
A binding-status of FREE is used when a DHCP server needs to
communicate that an IP address is available for allocation to
another server, but it was not just released, expired, or reset
by a network administrator. When the partner server ACK's the
BNDUPD of a FREE IP address, the server sets its internal state
such that it is available for allocation by any DHCP client.
client-hardware-address
There MAY be a DHCP client associated with the IP address whose
binding is now desired to be FREE. If there is, then the
client-hardware-address option MUST appear, and be set from the
htype and chaddr of the DHCP client which released this IP
address.
client-identifier
There MAY be a DHCP client associated with the IP address whose
binding is now desired to be FREE. If there is, then if the
DHCP client which released this IP address used a client-
identifier option to identify itself, then the client-identifier
MUST appear in the BNDUPD message, else it MUST NOT appear.
client-hardware-address
There MAY be a DHCP client associated with the IP address whose
binding has now expired. If there is, then the client-
hardware-address option MUST appear, and be set from the htype
and chaddr of the DHCP client which released this IP address.
client-identifier
There MAY be a DHCP client associated with the IP address whose
binding has now expired. If there is, then if the DHCP client
which most recently leased this IP address used a client-
identifier option to identify itself, then the client-identifier
MUST appear in the BNDUPD message, else it MUST NOT appear.
grace-expiration-time
The grace-expiration-time option MUST appear, and is the length
of time that this server will wait before trying to make the IP
address available after the lease has expired for this IP
address.
client-hardware-address
There MAY be a DHCP client associated with the IP address whose
binding has now been released by sending a DHCPRELEASE. If
there is, then the client-hardware-address option MUST appear,
and be set from the htype and chaddr of the DHCP client which
released this IP address.
client-identifier
There MAY be a DHCP client associated with the IP address whose
binding has been released. If there is, then if the DHCP client
which most recently leased this IP address used a client-
identifier option to identify itself, then the client-identifier
MUST appear in the BNDUPD message, else it MUST NOT appear.
client-hardware-address
There MAY be a DHCP client associated with the IP address whose
binding is now desired to be FREE. If there is, then the
client-hardware-address option MUST appear, and be set from the
htype and chaddr of the DHCP client which released this IP
address.
client-identifier
There MAY be a DHCP client associated with the IP address whose
binding is now desired to be FREE. If there is, then if the
DHCP client which released this IP address used a client-
identifier option to identify itself, then the client-identifier
MUST appear in the BNDUPD message, else it MUST NOT appear.
grace-expiration-time
The grace-expiration-time MUST appear, and is the length of time
that this server will wait before trying to make the IP address
available after the lease was released for this IP address.
o ABANDONED
An ABANDONED IP address is one that has been considered unusable
by the DHCP subsystem. An IP address for which a valid PING
response was received SHOULD be set to ABANDONED.
client-hardware-address
There SHOULD NOT be a DHCP client associated with an ABANDONED
IP address. The client-hardware-address option MUST NOT appear
in the BNDUPD message.
client-identifier
There SHOULD NOT be a DHCP client associated with the IP address
whose binding has now been ABANDONED. The client-identifier
option MUST NOT appear in the BNDUPD message.
o RESET
The RESET value of the binding-status is used to indicate that
this IP address was made available by operator command.
o BACKUP
The BACKUP value of binding-status indicates that this IP
address belongs to the secondary server, and can be allocated by
that server to a DHCP client at any time.
client-hardware-address The following option information applies to all BNDUPD messages,
regardless of the value of the binding-status, unless otherwise
noted.
There MAY be a DHCP client associated with an BACKUP IP address. o Identifying the client
If there is, the client-hardware-address option MUST appear, and
be set from the htype and chaddr of the DHCP client to which
this IP address was most recently associated.
client-identifier For many of the binding-status values a client MUST appear while
There MAY be a DHCP client associated with this IP address. If for others a client MAY appear, and for some a client MUST NOT
the DHCP client to which this IP address is leased used a
client-identifier option to identify itself, then the client-
identifier MUST appear in the BNDUPD message, else it MUST NOT
appear. appear.
The following option information is generic to all BNDUPD messages, A client is identified in a BNDUPD message by at least one and pos-
regardless of the value of the binding-status. sibly two options. The client-hardware-address option MUST appear
any time that a client appears in a BNDUPD message, and contains
the hardware type and chaddr information from the DHCP request
packet. A failover client-identifier option MUST appear any time
that a client appears in a BNDUPD message if and only if that
client used a DHCP client-identifier option when communicating with
the DHCP server. See section 12.7 and 12.8 for details of how to
construct these two options from a DHCP request packet.
o start-time-of-state o start-time-of-state
The start-time-of-state SHOULD appear. It is set to the time at The start-time-of-state SHOULD appear. It is set to the time at
which this IP address first took on the state that corresponds to which this IP address first took on the state that corresponds to
the current value of binding-status. the current value of binding-status.
o last-transaction-time o last-transaction-time
The last-transaction-time value SHOULD appear. This is the time at The last-transaction-time value SHOULD appear. This is the time at
skipping to change at page 70, line 43 skipping to change at page 48, line 5
tions enabled. tions enabled.
o client-request-options o client-request-options
If the BNDUPD was triggered by a request from a DHCP client (typi- If the BNDUPD was triggered by a request from a DHCP client (typi-
cally those with binding-status of ACTIVE and RELEASED), then the cally those with binding-status of ACTIVE and RELEASED), then the
server SHOULD include options of interest to a failover partner server SHOULD include options of interest to a failover partner
from the client's request packet in the client-request-options for from the client's request packet in the client-request-options for
transmission to its partner. transmission to its partner.
A server sending a BNDUPD need not remember the "interesting" A server sending a BNDUPD SHOULD remember the "interesting" options
options or the information that would appear in an "interesting" or the information that would appear in an "interesting" option for
option for transmission at a time when the BNDUPD is not closely transmission at a time when the BNDUPD is not closely associated
associated with a DHCP client request. with a DHCP client request.
A server SHOULD send the following "interesting" options. It MAY A server SHOULD send the following "interesting" options. It MAY
send any DHCP client options. As new options are defined, the RFC send any DHCP client options. As new options are defined, the RFC
defining these options SHOULD include information that they are defining these options SHOULD include information that they are
"interesting to failover servers" if they should be sent as part of "interesting to failover servers" if they should be sent as part of
a BNDUPD. a BNDUPD.
option option option option
number name number name
----------------------------------------- -----------------------------------------
12 host-name 12 host-name
81 client-FQDN [DDNS] 81 client-FQDN [DDNS]
82 relay-agent-information [AGENTINFO] 82 relay-agent-information [AGENTINFO]
TBD user-class [USERCLASS] TBD user-class [USERCLASS]
60 vendor-class-identifier 60 vendor-class-identifier
Table 7.1.1-1: Options which SHOULD be sent in Table 7.1.1-2: Options which SHOULD be sent in
the client-request-options option in a BNDUPD message. the client-request-options option in a BNDUPD message.
o client-reply-options o client-reply-options
If the BNDUPD was triggered by a request from a DHCP client (typi- If the BNDUPD was triggered by a request from a DHCP client (typi-
cally those with binding-status of ACTIVE and RELEASED), then the cally those with binding-status of ACTIVE and RELEASED), then the
server SHOULD include options of interest to a failover partner server SHOULD include options of interest to a failover partner
from the server's DHCP reply packet in the client-reply-options for from the server's DHCP reply packet in the client-reply-options for
transmission to its partner. transmission to its partner.
A server sending a BNDUPD need not remember the "interesting" A server sending a BNDUPD SHOULD remember the "interesting" options
options or the information that would appear in an "interesting" or the information that would appear in an "interesting" option for
option for transmission at a time when the BNDUPD is not closely transmission at a time when the BNDUPD is not closely associated
associated with a DHCP client request. with a DHCP client request.
A server SHOULD send the following "interesting" options. It MAY A server SHOULD send the following "interesting" options. It MAY
send any DHCP client options. As new options are defined, the RFC send any DHCP client options. As new options are defined, the RFC
defining these options SHOULD include information that they are defining these options SHOULD include information that they are
"interesting to failover servers" if they should be sent as part of "interesting to failover servers" if they should be sent as part of
a BNDUPD. a BNDUPD.
option option option option
number name number name
----------------------------------------- -----------------------------------------
58 renewal-time 58 renewal-time
59 rebinding-time 59 rebinding-time
Table 7.1.1-2: Options which SHOULD be sent in Table 7.1.1-3: Options which SHOULD be sent in
the client-reply-options option in a BNDUPD message. the client-reply-options option in a BNDUPD message.
The BNDUPD message SHOULD be sent as soon as possible from the time The BNDUPD message SHOULD be sent as soon as possible from the time
that the DHCP client received a response and the lease bindings data- that the DHCP client received a response and the lease bindings data-
base is written on stable storage. base is written on stable storage.
7.1.2. Receiving the BNDUPD message 7.1.2. Receiving the BNDUPD message
When a server receives a BNDUPD message, it needs to decide how to When a server receives a BNDUPD message, it needs to decide how to
process the message and whether the message represents a conflict process the binding update transaction it contains and whether that
of any sort. The conflict resolution process SHOULD be used on the transaction represents a conflict of any sort. The conflict resolu-
receipt of every BNDUPD message, not just those that are received tion process MUST be used on the receipt of every BNDUPD message, not
while in POTENTIAL-CONFLICT state, in order to increase the robust- just those that are received while in POTENTIAL-CONFLICT state, in
ness of the protocol. order to increase the robustness of the protocol.
There are three sorts of conflicts: There are three sorts of conflicts:
o Two clients one IP address conflict o Two clients, one IP address conflict
This is the duplicate IP address allocation conflict. There are This is the duplicate IP address allocation conflict. There are
two different clients each allocated the same address. There two different clients each allocated the same address. See sec-
cannot be a client conflict unless there is a client specified tion 7.1.3 for how to resolve this conflict.
in the BNDUPD message. See section 5.10.1 for how to resolve
this conflict.
o Two IP addresses one client conflict o Two IP addresses, one client conflict
This conflict exists when a client on one server is associated This conflict exists when a client on one server is associated
with one IP address, and on the other server with a different with one IP address, and on the other server with a different
IP address in the same or a related subnet. This does not refer IP address in the same or a related subnet. This does not refer
to the case where a single client has addresses in multiple dif- to the case where a single client has addresses in multiple dif-
ferent subnets or administrative domains, but rather the case ferent subnets or administrative domains, but rather the case
where on the same subnet the client has a lease on one IP where on the same subnet the client has a lease on one IP
address in one server and on a different IP address on the other address in one server and on a different IP address on the other
server. server.
This conflict may or may not be a problem for a given DHCP This conflict may or may not be a problem for a given DHCP
server implementation. In the event that a DHCP server requires server implementation. In the event that a DHCP server requires
that a DHCP client have only one outstanding lease for an IP that a DHCP client have only one outstanding lease for an IP
address on one subnet, this conflict should be resolved by address on one subnet, this conflict should be resolved by
accepting the update which has the latest client-last- accepting the update which has the latest client-last-
transaction-time. transaction-time.
o binding-status conflict o binding-status conflict
This is the normal conflict, where one server is updating the other This is the normal conflict, where one server is updating the other
with newer information. See section 5.10.1 for details of how with newer information. See section 7.1.3 for details of how to
to resolve these conflicts. resolve these conflicts.
See section 5.10.1 for details of how to process binding-status 7.1.3. Deciding whether to accept the binding update transaction in a
changes in BNDUPD messages. BNDUPD message
7.1.3. Accepting the BNDUPD message IP addresses undergo binding status changes for several reasons,
including receipt and processing of DHCP client requests, administra-
tive inputs and receipt of BNDUPD messages. Every DHCP server needs
to respond to DHCP client requests and administrative inputs with
changes to its internal record of the binding-status of an IP
address, and this response is not in the scope of the failover proto-
col. However, the receipt of BNDUPD messages implies at least a pos-
sible change of the binding-status for an IP address, and must be
discussed here. See section 7.1.2 for general actions to take upon
receipt of a BNDUPD message.
When receiving a BNDUPD message, it is important to note that it may
not be current, in that the server receiving the BNDUPD message may
have had a more recent interaction with the DHCP client than its
partner who sent the BNDUPD message. In this case, the receiving
server MUST reject the BNDUPD message. In addition, it is worth not-
ing that two (and possibly three) binding-status values are the
direct result of interaction with a DHCP client, ACTIVE and RELEASED
(and possibly ABANDONED). All other binding-status values are either
the result of the expiration of a time period or interaction with an
external agency (e.g., a network administrator).
Every BNDUPD message SHOULD contain a client-last-transaction-time
option, which MUST, if it appears, be the time that the server last
interacted with the DHCP client. It MUST NOT be, for instance, the
time that the lease on an IP address expired. If there has been no
interaction with the DHCP client in question (or there is no DHCP
client presently associated with this IP address), then there will be
no client-last-transaction-time option in the BNDUPD message.
The following list is indexed by the binding-status that a server
receives in a BNDUPD message. In many cases, the binding-status of
an IP address within the receiving server's data storage will have an
effect upon the checks performed prior to accepting the new binding-
status in a BNDUPD message.
In the following list, to "accept" a BNDUPD means to update the
server's bindings database with the information contained in the
BNDUPD and once that update is complete, send a BNDACK message
corresponding to the BNDUPD message. To "reject" a BNDUPD means to
respond to the BNDUPD with a BNDACK with a reject-reason option
included.
When interpreting the rules in the following list, if a BNDUPD
doesn't have a client-last-transaction-time value, then it MUST NOT
be considered later than the client-last-transaction-time in the
receiving server's binding. If the BNDUPD contains a client-last-
transaction-time value and the receiving server's binding does not,
then the client-last-transaction-time value in the BNDUPD MUST be
considered later than the server's.
The second rule concerns clients and IP addresses. If the client in
a BNDUPD message differs from the client in the receiving server's
binding, and the binding-status is ACTIVE both in the receiving
server's binding and in the BNDUPD, then the receiving server accepts
the BNDUPD if it is a secondary server and rejects it otherwise.
                      binding-status in received BNDUPD
 binding-status
 in receiving                                  FREE      RESET
 server          ACTIVE    EXPIRED   RELEASED  BACKUP    ABANDONED
 ACTIVE          accept    time(2)   time(1)   time(2)   accept
 EXPIRED         time(1)   accept    accept    accept    accept
 RELEASED        time(1)   time(1)   accept    accept    accept
 FREE/BACKUP     accept    accept    accept    accept    accept
 RESET           time(3)   accept    accept    accept    accept
 ABANDONED       reject    reject    reject    reject    accept
time(1): If the client-last-transaction-time in the BNDUPD
is later than the client-last-transaction-time in the
receiving server's binding, accept it, else reject it.
time(2): If the current time is later than the receiving
server's lease-expiration-time, accept it, else reject it.
time(3): If the client-last-transaction-time in the BNDUPD
is later than the start-time-of-state in the receiving server's
binding, accept it, else reject it.
Figure 7.1.3-1: Accepting BNDUPD messages
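The rules and the table above can be read as a single decision function. The following is a sketch under stated assumptions rather than a normative algorithm: the Binding record, its field names, and the function names are hypothetical, times are plain integers, and a missing client-last-transaction-time under time(3) is treated as "not later", which the text above does not state explicitly.

```python
from dataclasses import dataclass
from typing import Optional

ACCEPT, REJECT, T1, T2, T3 = "accept", "reject", "time1", "time2", "time3"

# Column index for the binding-status carried in the received BNDUPD.
COLUMN = {"ACTIVE": 0, "EXPIRED": 1, "RELEASED": 2,
          "FREE": 3, "BACKUP": 3, "RESET": 4, "ABANDONED": 4}

# Rows are keyed by the binding-status held by the receiving server.
ROWS = {
    "ACTIVE":    (ACCEPT, T2, T1, T2, ACCEPT),
    "EXPIRED":   (T1, ACCEPT, ACCEPT, ACCEPT, ACCEPT),
    "RELEASED":  (T1, T1, ACCEPT, ACCEPT, ACCEPT),
    "FREE":      (ACCEPT,) * 5,
    "BACKUP":    (ACCEPT,) * 5,
    "RESET":     (T3, ACCEPT, ACCEPT, ACCEPT, ACCEPT),
    "ABANDONED": (REJECT, REJECT, REJECT, REJECT, ACCEPT),
}


@dataclass
class Binding:
    """Hypothetical record for one IP address; not a wire format."""
    status: str
    client: Optional[str] = None
    cltt: Optional[int] = None          # client-last-transaction-time
    lease_expiration: int = 0           # 0 stands for "no lease recorded"
    start_time_of_state: Optional[int] = None


def later(update_time, local_time):
    """First rule: a missing time in the BNDUPD is never considered later;
    a present time beats a missing local time."""
    if update_time is None:
        return False
    if local_time is None:
        return True
    return update_time > local_time


def accept_bndupd(local, update, now, receiver_is_secondary):
    """Return True to accept the BNDUPD, False to reject it."""
    # Second rule: differing clients with both sides ACTIVE.
    if (local.status == "ACTIVE" and update.status == "ACTIVE"
            and local.client != update.client):
        return receiver_is_secondary
    rule = ROWS[local.status][COLUMN[update.status]]
    if rule == ACCEPT:
        return True
    if rule == REJECT:
        return False
    if rule == T1:
        return later(update.cltt, local.cltt)
    if rule == T2:
        return now > local.lease_expiration
    return later(update.cltt, local.start_time_of_state)  # time(3)
```

A caller would send a BNDACK without a reject-reason when the function returns True, and a BNDACK carrying a reject-reason option when it returns False.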
7.1.4. Accepting the BNDUPD message
When accepting a BNDUPD message, the information contained in the When accepting a BNDUPD message, the information contained in the
client-request-options and client-reply-options SHOULD be examined client-request-options and client-reply-options SHOULD be examined
for any information of interest to this server. For instance, a for any information of interest to this server. For instance, a
server which wished to detect changes in client specified host names server which wished to detect changes in client specified host names
might want examine and save information from the host-name or might want to examine and save information from the host-name or
client-FQDN options. Servers which expect to utilize information client-FQDN options. Servers which expect to utilize information
from the relay-agent-information option would want to store this from the relay-agent-information option would want to store this
information. information.
7.1.4. Time values related to the BNDUPD message 7.1.5. Time values related to the BNDUPD message
There are three time values that may be sent in a BNDUPD message. There are four time values that MAY be sent in a BNDUPD message.
o lease-expiration-time o lease-expiration-time
The time that the server gave to the client, i.e., the time that The time that the server gave to the client, i.e., the time that
the server believes that the client's lease will expire. the server believes that the client's lease will expire.
o potential-expiration-time o potential-expiration-time
The time that the server wants to be sure its partner waits The time that the server wants to be sure its partner waits
(added to the MCLT) before assuming that this lease has expired. (added to the MCLT) before assuming that this lease has expired.
Typically some time beyond the desired client lease time. Typically some time beyond the desired client lease time.
o client-last-transaction-time o client-last-transaction-time
The time that the client last interacted with this server. The time that the client last interacted with this server.
o start-time-of-state
The time at which the binding first went into the current state.
As discussed in section 5.2, each server knows what its partner has As discussed in section 5.2, each server knows what its partner has
ACKed with regard to potential-expiration time. In addition, each ACKed with regard to potential-expiration time. In addition, each
server needs to remember what it has told its partner as the server needs to remember what it has told its partner as the
potential-expiration-time. Moreover, each server must remember what potential-expiration-time. Moreover, each server must remember what
it has acked to the *other* server as the most recent potential- it has acked to the *other* server as the most recent potential-
expiration-time from that server. expiration-time from that server.
Remember that each server sends a potential-expiration-time and Remember that each server sends a potential-expiration-time and
receives an ACK for that as well as receiving a potential- receives an ACK for that as well as receiving a potential-
expiration-time and needing to remember what it has acked for that. expiration-time and needing to remember what it has acked for that.
skipping to change at page 74, line 45 skipping to change at page 53, line 48
easy to understand, it has negative consequences in actual operation. easy to understand, it has negative consequences in actual operation.
To illustrate this, in the simple case where the primary updates the To illustrate this, in the simple case where the primary updates the
secondary for a while and then fails, if the secondary can then renew secondary for a while and then fails, if the secondary can then renew
the client for only the MCLT beyond the acked-potential-expiration- the client for only the MCLT beyond the acked-potential-expiration-
time, then the secondary will only be able to renew the client for time, then the secondary will only be able to renew the client for
the MCLT, because the secondary has never sent a BNDUPD packet to the the MCLT, because the secondary has never sent a BNDUPD packet to the
primary concerning this IP address and client, and so its acked- primary concerning this IP address and client, and so its acked-
potential-expiration-time is zero. potential-expiration-time is zero.
However, if we allow the secondary to renew the client with the MCLT However, since the secondary is allowed to renew the client with the
beyond the max( received-potential-expiration-time, acked-potential- MCLT beyond the max( received-potential-expiration-time, acked-
expiration-time), then the secondary can usually renew the client for potential-expiration-time), then the secondary can usually renew the
the full lease period, at least for the first renew it sees from the client for the full lease period, at least for the first renew it
client, since the received-potential-expiration-time is generally sees from the client, since the received-potential-expiration-time is
longer than the client's desired lease interval. The difference in generally longer than the client's desired lease interval. The
renew times could make a big difference in server load on the difference in renew times could make a big difference in server load
secondary in this case. on the secondary in this case.
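The renewal limit described above can be sketched as follows. This is a minimal illustration, not protocol text: the function and parameter names are hypothetical, times are plain integer seconds, and the floor at the current time is an assumption drawn from the statement that a server with a zero acked-potential-expiration-time can still renew for the MCLT.

```python
def renewal_limit(now: int, received_pet: int, acked_pet: int, mclt: int) -> int:
    """Latest expiration time a server may offer a client.

    The limit is the MCLT beyond max(received-potential-expiration-time,
    acked-potential-expiration-time). Flooring at `now` is an assumption:
    it models "the secondary can still renew for the MCLT".
    """
    return max(now, received_pet, acked_pet) + mclt


def grant_expiry(now: int, desired_lease: int, received_pet: int,
                 acked_pet: int, mclt: int) -> int:
    """Expiration actually granted: the client's desired lease, capped by the limit."""
    return min(now + desired_lease, renewal_limit(now, received_pet, acked_pet, mclt))
```

With a received-potential-expiration-time well beyond the desired lease, the secondary grants the full lease on the first renewal; with neither value recorded, the grant shrinks to the MCLT.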
What are the consequences of allowing a server to offer a DHCP client What are the consequences of allowing a server to offer a DHCP client
a lease term of the MCLT beyond the max( received-potential- a lease term of the MCLT beyond the max( received-potential-
expiration-time, acked-potential-expiration-time)? The consequences expiration-time, acked-potential-expiration-time)? The consequences
appear whenever a server enters PARTNER-DOWN state, and affect how appear whenever a server enters PARTNER-DOWN state, and affect how
long that server has to wait before reallocating expired leases. long that server has to wait before reallocating expired leases.
With this approach, when a server goes into PARTNER-DOWN state, it With this approach, when a server goes into PARTNER-DOWN state, it
must wait the MCLT beyond the max( lease-expiration-time, sent- must wait the MCLT beyond the max( lease-expiration-time, sent-
potential-expiration-time, acked-potential-expiration-time, potential-expiration-time, acked-potential-expiration-time,
received-potential-expiration-time ) for each IP address before it received-potential-expiration-time ) for each IP address before it
skipping to change at page 75, line 34 skipping to change at page 54, line 36
added into the expression, since the partner could have used those added into the expression, since the partner could have used those
times as part of its own lease time calculation. times as part of its own lease time calculation.
Thus this optimization may require a longer waiting time when enter- Thus this optimization may require a longer waiting time when enter-
ing PARTNER-DOWN state, but will generally allow servers to operate ing PARTNER-DOWN state, but will generally allow servers to operate
considerably more effectively when running in COMMUNICATIONS- considerably more effectively when running in COMMUNICATIONS-
INTERRUPTED state. INTERRUPTED state.
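The PARTNER-DOWN waiting rule above can be sketched as a per-address calculation. The record type and field names below are hypothetical, times are integer seconds, and treating the boundary instant as already permitted is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class BindingTimes:
    """Per-address times a server has recorded (field names are illustrative)."""
    lease_expiration: int
    sent_potential_expiration: int
    acked_potential_expiration: int
    received_potential_expiration: int


def earliest_reallocation(b: BindingTimes, mclt: int) -> int:
    """MCLT beyond the largest of the four recorded times for this address."""
    return max(
        b.lease_expiration,
        b.sent_potential_expiration,
        b.acked_potential_expiration,
        b.received_potential_expiration,
    ) + mclt


def may_reallocate(b: BindingTimes, mclt: int, now: int) -> bool:
    """True once the waiting period for this address has elapsed."""
    return now >= earliest_reallocation(b, mclt)
```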
7.2. BNDACK message 7.2. BNDACK message
A server sends a binding acknowledgement (BNDACK) message when it has
processed a BNDUPD message and after it has successfully committed to
stable storage any binding database changes made as a result of
processing the BNDUPD message. A BNDACK message is used either to
accept or to reject a BNDUPD message. A BNDACK message which
contains a reject-reason option is a rejection of the corresponding
BNDUPD message.
In order to reduce the complexity of the discussion, the rest of this
section is written as though every BNDUPD message contains only a
single binding update transaction and thus every corresponding BNDACK
message would also contain reply information about only a single
binding update transaction. See section 6.3 for information on how
to create and process BNDUPD and BNDACK messages which contain multi-
ple binding update transactions.
Note that while a server MAY generate BNDUPD messages with multiple
binding update transactions, every server MUST be able to process a
BNDUPD message which contains multiple binding update transactions
and generate the corresponding BNDACK messages with status for
multiple binding update transactions. If a server does not ever
create BNDUPD messages which contain multiple binding update
transactions, then it does not need to be able to process a received
BNDACK message with multiple binding update transactions. However,
all servers MUST be able to create BNDACK messages which deal with
multiple binding update transactions received in a BNDUPD message.
Every BNDUPD message that is received by a server MUST be responded Every BNDUPD message that is received by a server MUST be responded
to with a corresponding BNDACK message. The receiving server SHOULD to with a corresponding BNDACK message. The receiving server SHOULD
respond quickly to every BNDUPD message but it MAY choose to respond respond quickly to every BNDUPD message but it MAY choose to respond
preferentially to DHCP client requests instead of BNDUPD messages, preferentially to DHCP client requests instead of BNDUPD messages,
since there is no absolute time period within which a BNDACK must be since there is no absolute time period within which a BNDACK must be
sent in response to a BNDUPD message, and DHCP clients frequently do sent in response to a BNDUPD message, while DHCP clients frequently
have time constraints that must be met. have strict time constraints.
A BNDACK message can only be sent in response to a BNDUPD message A BNDACK message can only be sent in response to a BNDUPD message
using the same TCP connection from which the BNDUPD message was using the same TCP connection from which the BNDUPD message was
received, since the XID's in BNDUPD messages are guaranteed unique received, since the XID's in BNDUPD messages are guaranteed unique
only during the life of a single TCP connection. When a connection only during the life of a single TCP connection. When a connection
to a partner server goes down, a server with unprocessed BNDUPD mes- to a partner server goes down, a server with unprocessed BNDUPD mes-
sages MAY simply drop all of those messages, since it can be sure sages MAY simply drop all of those messages, since it can be sure
that the partner will retransmit them when they are next in communi- that the partner will resend them when they are next in communica-
cations. A server with unprocessed BNDUPD messages when a TCP con- tions, albeit with a different XID. A server with unprocessed BNDUPD
nection goes down MAY instead choose to process those BNDUPD mes- messages when a TCP connection goes down MAY instead choose to pro-
sages, but it MUST NOT send any BNDACK messages in response (again cess those BNDUPD messages, but it MUST NOT send any BNDACK messages
because of the issues surrounding XID uniqueness). in response (again because of the issues surrounding XID uniqueness).
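Because XIDs are only meaningful within one TCP connection, pending BNDUPD state is naturally scoped to the connection object. The sketch below is one way to model that; the class, its methods, and the dictionary message shape are all hypothetical.

```python
class FailoverConnection:
    """Tracks received-but-unanswered BNDUPD messages for one TCP connection."""

    def __init__(self) -> None:
        self.is_open = True
        self.unprocessed = {}  # xid -> BNDUPD payload awaiting a BNDACK

    def receive_bndupd(self, xid: int, update: dict) -> None:
        self.unprocessed[xid] = update

    def build_bndack(self, xid: int) -> dict:
        """A BNDACK may only be built while the originating connection is up."""
        if not self.is_open:
            raise RuntimeError("MUST NOT send a BNDACK after the connection went down")
        self.unprocessed.pop(xid)
        return {"type": "BNDACK", "xid": xid}

    def close(self) -> list:
        """On connection loss, drop pending updates; the partner resends with new XIDs."""
        self.is_open = False
        dropped = sorted(self.unprocessed)
        self.unprocessed.clear()
        return dropped
```

This sketch takes the "drop" option; a server that instead processes the pending updates after the connection is lost would still have to skip `build_bndack`.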
The message type for the BNDACK message is 4.
The following table summarizes the options for the BNDACK message.
                           binding-status                  BACKUP
                                                           RESET
                                                           ABANDONED
Option                     ACTIVE    EXPIRED    RELEASED   FREE
------                     ------    -------    --------   ----
assigned-IP-address        MUST      MUST       MUST       MUST
binding-status             MUST      MUST       MUST       MUST
client-identifier          MAY       MAY        MAY        MAY
client-hardware-address    MUST      MUST       MUST       MAY(2)
reject-reason              MAY       MAY        MAY        MAY
message                    MAY       MAY        MAY        MAY
lease-expiration-time      MUST      MUST NOT   MUST NOT   MUST NOT
potential-expiration-time  MUST      MUST NOT   MUST NOT   MUST NOT
start-time-of-state        SHOULD    SHOULD     SHOULD     SHOULD
client-last-trans.-time    SHOULD    SHOULD     SHOULD     MAY
DDNS(1)                    SHOULD    SHOULD     SHOULD     SHOULD

(1) Only SHOULD appear if the server supports dynamic DNS.
(2) MUST NOT if binding-status is ABANDONED.
Table 7.2-1: Options used in a BNDACK message
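The MUST and MUST NOT entries of the table can be checked mechanically. The following is a rough sketch under the assumption that the last table column covers the BACKUP, RESET, ABANDONED and FREE binding states; the function name and the string form of the results are hypothetical, and SHOULD and MAY entries are deliberately not checked.

```python
ALWAYS_REQUIRED = {"assigned-IP-address", "binding-status"}
ACTIVE_ONLY = {"lease-expiration-time", "potential-expiration-time"}
HW_REQUIRED_STATES = {"ACTIVE", "EXPIRED", "RELEASED"}
HW_OPTION = "client-hardware-address"


def bndack_violations(binding_status: str, options: set) -> list:
    """Return MUST / MUST NOT violations for the options present in a message."""
    problems = []
    for name in sorted(ALWAYS_REQUIRED - options):
        problems.append(f"missing {name}")
    for name in sorted(ACTIVE_ONLY):
        if binding_status == "ACTIVE" and name not in options:
            problems.append(f"missing {name}")
        elif binding_status != "ACTIVE" and name in options:
            problems.append(f"forbidden {name}")
    if binding_status in HW_REQUIRED_STATES and HW_OPTION not in options:
        problems.append(f"missing {HW_OPTION}")
    if binding_status == "ABANDONED" and HW_OPTION in options:  # footnote (2)
        problems.append(f"forbidden {HW_OPTION}")
    return problems
```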
7.2.1. Sending the BNDACK message 7.2.1. Sending the BNDACK message
The BNDACK message MUST contain the same xid as the corresponding The BNDACK message MUST contain the same xid as the corresponding
BNDUPD message. BNDUPD message.
All of the options which appear in the BNDUPD message MUST be All of the options which appear in the BNDUPD message MUST be
included in the BNDACK message. The values in the options MAY be included in the BNDACK message. The values in the options MAY be
updated to reflect current information on the server sending the updated to reflect current information on the server sending the
BNDACK. Note that update of this information may be used for infor- BNDACK. Note that update of this information may be used for infor-
skipping to change at page 76, line 34 skipping to change at page 57, line 11
appear in the BNDACK message, and the message option SHOULD appear in appear in the BNDACK message, and the message option SHOULD appear in
this case containing a human-readable error message describing in this case containing a human-readable error message describing in
some detail the reason for the rejection of the BNDUPD message. some detail the reason for the rejection of the BNDUPD message.
If the server rejects the BNDUPD message with a BNDACK and a reject- If the server rejects the BNDUPD message with a BNDACK and a reject-
reason option, it may be because the server believes that it has reason option, it may be because the server believes that it has
binding information that the other server should know. A server binding information that the other server should know. A server
which is rejecting a BNDUPD may initiate a BNDUPD of its own in order which is rejecting a BNDUPD may initiate a BNDUPD of its own in order
to update its partner with what it believes is better binding infor- to update its partner with what it believes is better binding infor-
mation, but it MUST ensure through some means that it will not end up mation, but it MUST ensure through some means that it will not end up
a situation where each server is sending BNDUPD messages as fast as in a situation where each server is sending BNDUPD messages as fast
possible because they can't agree on which server has better binding as possible because they can't agree on which server has better bind-
data. Placing a reasonable delay on the initiation of a BNDUPD mes- ing data. Placing a considerable delay on the initiation of a BNDUPD
sage after sending a BNDACK with a reject-reason would be one way to message after sending a BNDACK with a reject-reason would be one way
ensure this situation doesn't occur. to ensure this situation doesn't occur.
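One way to impose such a delay is a per-address gate that is armed whenever a rejecting BNDACK is sent. This is only a sketch: the class name, its methods, and the delay value are hypothetical, and the text above does not prescribe a particular delay.

```python
class CounterUpdateGate:
    """Holds back counter-BNDUPDs for an address after a rejection was sent."""

    def __init__(self, delay_seconds: int) -> None:
        self.delay_seconds = delay_seconds  # chosen by the implementer
        self._not_before = {}               # ip address -> earliest send time

    def note_rejection(self, ip: str, now: int) -> None:
        """Call when a BNDACK carrying a reject-reason is sent for `ip`."""
        self._not_before[ip] = now + self.delay_seconds

    def may_send_bndupd(self, ip: str, now: int) -> bool:
        """True if a BNDUPD for `ip` may be initiated at time `now`."""
        return now >= self._not_before.get(ip, 0)
```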
7.2.2. Receiving the BNDACK message 7.2.2. Receiving the BNDACK message
When a server receives a BNDACK message, if it doesn't contain a When a server receives a BNDACK message, if it doesn't contain a
reject-reason option that means that the BNDUPD message was accepted, reject-reason option that means that the BNDUPD message was accepted,
and the server which sent the BNDUPD MUST update its stable storage and the server which sent the BNDUPD SHOULD update its stable storage
with the potential-expiration-time value sent in the BNDUPD message with the potential-expiration-time value sent in the BNDUPD message
and returned in the BNDACK message. Other values sent in the BNDUPD and returned in the BNDACK message. Other values sent in the BNDUPD
message MAY be used as desired. message MAY be used as desired.
If the BNDACK message contains a reject-reason option, that means
that the BNDUPD was rejected. There SHOULD be a message option in
the BNDACK giving a text reason for the rejection, and the server
SHOULD log the message in some way. The server MUST NOT immediately
try to resend the BNDUPD message as there is no reason to believe the
partner won't reject it a second time. However, a server MAY choose
to send another BNDUPD at some future time, for instance when the
server next processes an update request from its partner.
7.3. UPDREQ message 7.3. UPDREQ message
The update request (UPDREQ) message is used by one server to request The update request (UPDREQ) message is used by one server to request
that its partner send it all of the binding database information that that its partner send it all of the binding database information that
it has not already seen. Since each server is required to keep it has not already seen. Since each server is required to keep
track at all times of the binding information the other server has track at all times of the binding information the other server has
received and ACKed, one server can request transmission of all un- received and ACKed, one server can request transmission of all un-
ACKed binding database information held by the other server by using ACKed binding database information held by the other server by using
the UPDREQ message. the UPDREQ message.
The UPDREQ message is used whenever the sending server cannot proceed The UPDREQ message is used whenever the sending server cannot proceed
before it has processed all previously un-ACKed binding update infor- before it has processed all previously un-ACKed binding update infor-
mation, since the UPDREQ message should yield a corresponding UPDDONE mation, since the UPDREQ message should yield a corresponding UPDDONE
message. The UPDDONE message is not sent until the server that sent message. The UPDDONE message is not sent until the server that sent
the UPDREQ message has responded to all of the BNDUPD messages gen- the UPDREQ message has responded to all of the BNDUPD messages gen-
erated by the UPDREQ message with BNDACK messages. Thus, the sender erated by the UPDREQ message with BNDACK messages (they may either be
of the UPDREQ message can be sure upon receipt of an UPDDONE message accepted or rejected by the BNDACK messages, but they MUST have been
that it has received and committed to stable storage all outstanding responded to). Thus, the sender of the UPDREQ message can be sure
binding database updates. upon receipt of an UPDDONE message that it has received and committed
to stable storage all outstanding binding database updates.
See section 9, Protocol state transitions, for the details of when See section 9, Failover Endpoint States, for the details of when the
the UPDREQ message is sent. UPDREQ message is sent.
7.3.1. Sending the UPDREQ message 7.3.1. Sending the UPDREQ message
There are no options for the UPDREQ message. The message type for the UPDREQ message is 9.
The UPDREQ message is sent with a unique xid. The UPDREQ message has no message specific options.
7.3.2. Receiving the UPDREQ message 7.3.2. Receiving the UPDREQ message
A server receiving an UPDREQ message MUST send all binding database A server receiving an UPDREQ message MUST send all binding database
changes that have not yet been ACKed by the sending server. These changes that have not yet been ACKed by the sending server. These
changes are sent as undistinguished BNDUPD messages. changes are sent as undistinguished BNDUPD messages.
However, the server which received and is processing the UPDREQ mes- However, the server which received and is processing the UPDREQ mes-
sage MUST track the BNDACK messages that correspond to the BNDUPD sage MUST track the BNDACK messages that correspond to the BNDUPD
messages triggered by the UPDREQ message and, when they are all messages triggered by the UPDREQ message and, when they are all
skipping to change at page 78, line 31 skipping to change at page 59, line 20
failure of stable storage and to restore its binding database in its failure of stable storage and to restore its binding database in its
entirety from the other server. entirety from the other server.
A server which sends an UPDREQALL message cannot proceed until all of A server which sends an UPDREQALL message cannot proceed until all of
its binding update information is restored, and it knows that all of its binding update information is restored, and it knows that all of
that information is restored when an UPDDONE message is received. that information is restored when an UPDDONE message is received.
See section 9, Protocol state transitions, for the details of when See section 9, Protocol state transitions, for the details of when
the UPDREQALL message is sent. the UPDREQALL message is sent.
7.4.1. Sending the UPDREQALL message The message type for the UPDREQALL message is 7.
There are no options for the UPDREQALL message. The UPDREQALL message has no message specific options.
The UPDREQALL message is sent with a unique xid. 7.4.1. Sending the UPDREQALL message
The UPDREQALL is sent.
7.4.2. Receiving the UPDREQALL message 7.4.2. Receiving the UPDREQALL message
A server receiving an UPDREQALL message MUST send all binding data- A server receiving an UPDREQALL message MUST send all binding data-
base information to the sending server. These changes are sent as base information to the sending server. These changes are sent as
undistinguished BNDUPD messages. undistinguished BNDUPD messages. Otherwise the processing is the same
as for the UPDREQ message. See section 7.3.2 for details.
However, the server processing the UPDREQALL message MUST track the
BNDACK messages that correspond to the BNDUPD messages triggered by
the UPDREQALL message and, when they are all received, the server
MUST send an UPDDONE message.
Just as specified for the processing of the UPDREQ message, the
server processing the UPDREQALL message and sending BNDUPD messages
to its partner SHOULD only track the BNDUPD and BNDACK message pairs
for unACKed binding database changes that were present upon the
receipt of the UPDREQALL message. A server which has received an
UPDREQALL message SHOULD send BNDUPD messages for binding database
changes that occur after receipt of the UPDREQALL message, but it
SHOULD NOT include those additional BNDUPD messages and their
corresponding BNDACK messages in the accounting necessary to consider
the UPDREQALL complete and subsequently send the UPDDONE message. If
some additional binding database changes end up becoming part of the
set of BNDUPD messages considered as part of the UPDREQALL (due to
whatever algorithm the server uses to scan its bindings database for
unacked changes) it will probably not cause any difficulty, but a
server MUST NOT attempt to include all such later BNDUPD messages in
the accounting for the UPDREQALL in order to be able to transmit an
UPDDONE message.
When queuing up the BNDUPD messages for transmission to the sender of
the UPDREQALL message, the server processing the UPDREQALL MUST honor
the value returned in the max-unacked-bndupd option in the CONNECT or
CONNECTACK message that set up the connection with the sending
server. It MUST NOT send more BNDUPD messages without receiving
corresponding BNDACKs than the value returned in max-unacked-bndupd.
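The accounting described above (a snapshot taken on receipt of the request, a cap on unacknowledged BNDUPD messages, and an UPDDONE once every snapshot update has been answered) can be sketched as below. The class and method names and the dictionary message shape are hypothetical, and for simplicity the window here counts only snapshot updates rather than every BNDUPD on the connection.

```python
from collections import deque


class UpdateRequestTracker:
    """Accounting for one UPDREQ or UPDREQALL on the server that answers it."""

    def __init__(self, request_xid: int, snapshot: list, max_unacked: int) -> None:
        self.request_xid = request_xid
        self._queue = deque(snapshot)    # bindings present when the request arrived
        self._in_flight = set()          # sent as BNDUPD, no BNDACK yet
        self._max_unacked = max_unacked  # from the max-unacked-bndupd option
        self.done_sent = False

    def next_to_send(self) -> list:
        """Bindings that may be sent now without exceeding the unacked window."""
        batch = []
        while self._queue and len(self._in_flight) < self._max_unacked:
            binding = self._queue.popleft()
            self._in_flight.add(binding)
            batch.append(binding)
        return batch

    def on_bndack(self, binding):
        """Record a BNDACK (accept or reject); return UPDDONE when all are answered."""
        self._in_flight.discard(binding)
        if not self._queue and not self._in_flight and not self.done_sent:
            self.done_sent = True
            return {"type": "UPDDONE", "xid": self.request_xid}
        return None
```

Updates for changes made after the snapshot are simply never added to the tracker, so they cannot delay the UPDDONE. An empty snapshot would need separate handling, since no BNDACK would ever arrive to trigger completion.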
7.5. UPDDONE message 7.5. UPDDONE message
The update done (UPDDONE) message is used by a server receiving an The update done (UPDDONE) message is used by a server receiving an
UPDREQ or UPDREQALL message to signify that it has sent all of the UPDREQ or UPDREQALL message to signify that it has sent all of the
BNDUPD messages requested by the UPDREQ or UPDREQALL request and that BNDUPD messages requested by the UPDREQ or UPDREQALL request and that
it has received a BNDACK for each of those messages. it has received a BNDACK for each of those messages.
While a BNDACK message MUST have been received for each BNDUPD
message prior to the transmission of the UPDDONE message, this doesn't
necessarily mean that all of the BNDUPD messages were accepted, only
that all of them were responded to with a BNDACK message. Thus, a
NAK (consisting of a BNDACK message containing a reject-reason option)
could be used to reject a BNDUPD, but for the purposes of the UPDDONE
message, such a NAK would count as a response to the associated BNDUPD
message, and would not block the eventual transmission of the UPDDONE
message.
The message type for the UPDDONE message is 8.
The xid in an UPDDONE message MUST be identical to the xid in the
UPDREQ or UPDREQALL message that initiated the update process.
The UPDDONE message has no message specific options.
7.5.1. Sending the UPDDONE message 7.5.1. Sending the UPDDONE message
The UPDDONE message SHOULD be sent as soon as the last BNDACK message The UPDDONE message SHOULD be sent as soon as the last BNDACK message
corresponding to a BNDUPD message requested by the UPDREQ or corresponding to a BNDUPD message requested by the UPDREQ or
UPDREQALL is received from the server which sent the UPDREQ or UPDREQALL is received from the server which sent the UPDREQ or
UPDREQALL. The XID of the UPDDONE message MUST be the same as the UPDREQALL. The XID of the UPDDONE message MUST be the same as the
XID of the corresponding UPDREQ or UPDREQALL message. XID of the corresponding UPDREQ or UPDREQALL message.
7.5.2. Receiving the UPDDONE message 7.5.2. Receiving the UPDDONE message
skipping to change at page 80, line 21 skipping to change at page 60, line 43
transmitted using normal BNDUPD messages from the primary to the transmitted using normal BNDUPD messages from the primary to the
secondary. secondary.
The POOLREQ message SHOULD be sent from the secondary to the primary The POOLREQ message SHOULD be sent from the secondary to the primary
whenever the secondary transitions into NORMAL state. It SHOULD whenever the secondary transitions into NORMAL state. It SHOULD
periodically be resent in order that any change in the number of periodically be resent in order that any change in the number of
available IP addresses on the primary be reflected in the pool on the available IP addresses on the primary be reflected in the pool on the
secondary. The period may be influenced by the secondary server's secondary. The period may be influenced by the secondary server's
leasing activity. leasing activity.
The message type for the POOLREQ message is 1.
The POOLREQ message has no message specific options.
7.6.1. Sending the POOLREQ message 7.6.1. Sending the POOLREQ message
The POOLREQ message has no options. It must be sent with a unique The POOLREQ message is sent.
xid.
7.6.2. Receiving the POOLREQ message 7.6.2. Receiving the POOLREQ message
When a primary server receives a POOLREQ message it SHOULD examine When a primary server receives a POOLREQ message it SHOULD examine
the binding database and determine how many IP addresses the secon- the binding database and determine how many IP addresses the secon-
dary server should have, and set these IP addresses to BACKUP state. dary server should have, and set these IP addresses to BACKUP state.
It SHOULD then send BNDUPD messages concerning all of these IP It SHOULD then send BNDUPD messages concerning all of these IP
addresses to the secondary server. addresses to the secondary server.
Servers frequently have several kinds of IP addresses available on a Servers frequently have several kinds of IP addresses available on a
skipping to change at page 80, line 50 skipping to change at page 61, line 29
tion of available IP addresses of each kind, and the secondary server tion of available IP addresses of each kind, and the secondary server
is responsible for being configured in such a way that it can tell is responsible for being configured in such a way that it can tell
the kind of every IP address based solely on the IP address itself. the kind of every IP address based solely on the IP address itself.
A primary server MUST keep track of how many IP addresses were allo- A primary server MUST keep track of how many IP addresses were allo-
cated as a result of processing the POOLREQ message, and send that cated as a result of processing the POOLREQ message, and send that
number in the POOLRESP message. number in the POOLRESP message.
A primary server MAY choose to defer processing a POOLREQ message A primary server MAY choose to defer processing a POOLREQ message
until a more convenient time to process it, but it should not depend until a more convenient time to process it, but it should not depend
on the secondary server to retransmit the POOLREQ message in that on the secondary server to resend the POOLREQ message in that case.
case.
If a secondary server receives a POOLREQ message it SHOULD report an If a secondary server receives a POOLREQ message it SHOULD report an
error. error.
7.7. POOLRESP message 7.7. POOLRESP message
A primary server sends a POOLRESP message to a secondary server after A primary server sends a POOLRESP message to a secondary server after
the allocation process for available addresses to the secondary the allocation process for available addresses to the secondary
server is complete. Typically this message will precede some of the server is complete. Typically this message will precede some of the
BNDUPD messages that the primary uses to send the actual allocated IP BNDUPD messages that the primary uses to send the actual allocated IP
addresses to the secondary. addresses to the secondary.
The message type for the POOLRESP message is 2.
The xid in the POOLRESP message MUST be identical to the xid in the
POOLREQ message for which this POOLRESP is a response.
7.7.1. Sending the POOLRESP message 7.7.1. Sending the POOLRESP message
The POOLRESP message MUST contain the same xid as the corresponding The POOLRESP message MUST contain the same xid as the corresponding
POOLREQ message. POOLREQ message.
The only option which MUST appear in a POOLRESP message is: Only one option MUST appear in a POOLRESP message:
o addressed-transferred o addresses-transferred
The number of addresses allocated to the secondary server by the The number of addresses allocated to the secondary server by the
primary server as a result of a POOLREQ is contained in the primary server as a result of a POOLREQ is contained in the
addresses-transferred option in a POOLRESP message. Note this addresses-transferred option in a POOLRESP message. Note this
is the number of addresses that are transferred to the secondary is the number of addresses that are transferred to the secondary
in the primary's binding database as a result of the correspond- in the primary's binding database as a result of the correspond-
ing POOLREQ message, and that it may be some time before they ing POOLREQ message, and that it may be some time before they
can all be transmitted to the secondary server through the use can all be transmitted to the secondary server through the use
of BNDUPD messages. of BNDUPD messages.
skipping to change at page 81, line 47 skipping to change at page 62, line 31
another POOLRESP message if the value of the addresses-transferred another POOLRESP message if the value of the addresses-transferred
option is non-zero. option is non-zero.
Typically, no other action is taken on the reception of a POOLRESP Typically, no other action is taken on the reception of a POOLRESP
message. message.
7.8. CONNECT message 7.8. CONNECT message
The connect message is used to establish an applications level con- The connect message is used to establish an applications level con-
nection over a newly created TCP connection. It gives the source nection over a newly created TCP connection. It gives the source
information for the connection, and some important configuration information for the connection, and critical configuration informa-
information. It MUST be sent only by the primary server. Either tion. It MUST be sent only by the primary server. Either server can
server can initiate a TCP connection, but the CONNECT message is only initiate a TCP connection, but the CONNECT message is only sent by
sent by the primary server. the primary server.
The message type for the CONNECT message is 5.
The CONNECT message MUST be the first message sent down a newly esta-
blished connection, and it MUST be sent only by the primary server.
The following table summarizes the options that are associated with
the CONNECT message:
Option
------
sending-server-IP-address MUST
max-unacked-bndupd MUST
receive-timer MUST
vendor-class-identifier MUST
protocol-version MUST
TLS-request MUST
MCLT MUST
hash-bucket-assignment MUST
Table 7.8-1: Options used in a CONNECT message
7.8.1. Sending the CONNECT message 7.8.1. Sending the CONNECT message
The CONNECT message MUST be the first message sent by the primary The CONNECT message MUST be the first message sent by the primary
server after the establishment of a new TCP connection with a secon- server after the establishment of a new TCP connection with a secon-
dary server participating in the failover protocol. dary server participating in the failover protocol.
The xid of the CONNECT message must be unique. The xid of the CONNECT message must be unique.
The IP address of the primary server MUST be placed in the sending- The IP address of the primary server MUST be placed in the sending-
skipping to change at page 83, line 9 skipping to change at page 64, line 23
The TLS-request option MUST be sent and contains the desired TLS The TLS-request option MUST be sent and contains the desired TLS
connection request as well as information concerning whether TLS is connection request as well as information concerning whether TLS is
supported. If this CONNECT message is being sent over an already supported. If this CONNECT message is being sent over an already
created TLS connection, the TLS-request MUST NOT appear. created TLS connection, the TLS-request MUST NOT appear.
7.8.2. Receiving the CONNECT message 7.8.2. Receiving the CONNECT message
When a server receives a TCP connection on the failover port, if it When a server receives a TCP connection on the failover port, if it
is a PRIMARY server it should send a CONNECT message, and if it is a is a PRIMARY server it should send a CONNECT message, and if it is a
secondary server it should wait for a CONNECT message. secondary server it should wait for a CONNECT message before sending
any messages. To avoid denial of service attacks, a secondary should
only wait for a CONNECT message on a new connection for a limited
amount of time and close the connection if none is received during
that time.
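The limited wait for a CONNECT message can be approximated with a socket timeout. This is a minimal sketch: the function name and buffer size are hypothetical, the timeout value is left to the implementer, and real code would go on to frame and parse the failover message rather than return raw bytes.

```python
import socket
from typing import Optional


def first_message_or_close(conn: socket.socket, timeout_seconds: float) -> Optional[bytes]:
    """Wait a bounded time for the first bytes on a new connection.

    Returns the bytes received, or None after closing the connection if
    nothing arrived in time or the peer closed first.
    """
    conn.settimeout(timeout_seconds)
    try:
        data = conn.recv(4096)
    except socket.timeout:
        conn.close()  # no CONNECT within the allowed time
        return None
    if not data:      # peer closed without sending anything
        conn.close()
        return None
    return data
```

Note that the timeout here bounds a single `recv` call; an implementation that must bound the arrival of a complete CONNECT message would track an overall deadline instead.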
When a secondary server receives a CONNECT message it should: When a secondary server receives a CONNECT message it should:
1. Record the time at which the message was received. 1. Record the time at which the message was received.
2. Examine the protocol-version option, and decide if this server 2. Examine the protocol-version option, and decide if this server
is capable of interoperating with another server running that is capable of interoperating with another server running that
protocol version. If not, send the CONNECTACK message with protocol version. If not, send the CONNECTACK message with
the appropriate reject-reason. The server MUST include its the appropriate reject-reason. The server MUST include its
protocol-version in the CONNECTACK message. protocol-version in the CONNECTACK message.
3. Examine the TLS-request option. Figure out the TLS-reply 3. Examine the TLS-request option. Figure out the TLS-reply
value based on the capabilities and configuration of this value based on the capabilities and configuration of this
server, and save it for the CONNECTACK message. If the server. If the result for the TLS-reply value is a 1 and the
results of the TLS negotiation result in a connection rejec- connection is accepted, indicating use of TLS, then immedi-
tion, then go immediately to send the CONNECTACK message. ately send the CONNECTACK message and go into TLS negotiation.
If the TLS-reply value implies rejection of the connection,
then immediately send the CONNECTACK message with the TLS-
reply value and the appropriate reject-reason option value.
In all other cases, save the TLS-reply option information for
the eventual CONNECTACK message.
The possibilities are: The possibilities for TLS-request and TLS-reply are:
  CONNECT      CONNECTACK                                  CONNECT  CONNECTACK
TLS-request    TLS-reply                                     TLS       TLS
                                                           request    reply
                       Reject                                                  Reject
req   acc      t1      Reason  Comments                      t1        t1      Reason  Comments
---   ---      --      ------  --------                      --        --      ------  --------
 0     0       0                                              0         0              no TLS used
 0     0       1       11      receiver requires TLS          0         1      11      primary won't use TLS, secondary requires TLS
 0     1       0                                              1         0              primary desires TLS, secondary doesn't
 0     1       1                                              1         1              primary desires TLS, secondary will use TLS
 1     0       -               request doesn't make sense     2         0      9, 10   primary requires TLS and secondary won't
 1     1       0                                              2         1              primary requires TLS and secondary will use TLS
 1     1       1
 2     0       -               request doesn't make sense
 2     1       0       9 or 10 receiver won't do TLS
 2     1       1
4. Check to see if there is a message-digest option in the CON- 4. Check to see if there is a message-digest option in the CON-
NECT message. If there was, and the server does not support NECT message. If there was, and the server does not support
message-digests, then reject the connection with the appropri- message-digests, then reject the connection with the appropri-
ate reject-reason in the CONNECTACK. ate reject-reason in the CONNECTACK. If the server does sup-
port message-digests, then check this message for validity
based on the message-digest, and reject it if the digest indi-
cates the message was altered.
5. Determine if the sender (from the sending-server-IP-address 5. Determine if the sender (from the sending-server-IP-address
option) and the implicit role of the sender (i.e., primary) option) and the implicit role of the sender (i.e., primary)
represents a server with which the receiver was configured to represents a server with which the receiver was configured to
engage in failover activity. This is performed after any engage in failover activity. This is performed after any
TLS processing so that it occurs after a secure connection is TLS or message digest processing so that it occurs after a
created, to ensure that there is no tampering with the IP secure connection is created, to ensure that there is no
address of the partner. tampering with the IP address of the partner.
If not, then the receiving server should reject the CONNECT If not, then the receiving server should reject the CONNECT
request by sending a CONNECTACK message with a reject-reason request by sending a CONNECTACK message with a reject-reason
value of: 8, invalid failover partner. value of: 8, invalid failover partner.
If it is, then the receiving failover endpoint should be If it is, then the receiving failover endpoint should be
determined. determined.
6. Decide if the time delta between the sending of the message, 6. Decide if the time delta between the sending of the message,
in the time field, and the receipt of the message, recorded in in the time field, and the receipt of the message, recorded in
step 1 above, is acceptable. A server MAY require an arbi- step 1 above, is acceptable. A server MAY require an arbi-
trarily small delta in time values in order to set up a fail- trarily small delta in time values in order to set up a fail-
over connection with another server. See section 5.9 for over connection with another server. See section 5.9 for
information on time synchronization. information on time synchronization.
If the delta between the time values is too great, the server If the delta between the time values is too great, the server
should reject the CONNECT request by sending a CONNECTACK mes- should reject the CONNECT request by sending a CONNECTACK
sage with a reject-reason of 4, time mismatch too great. message with a reject-reason of 4, time mismatch too great.
If the time mismatch is not considered too great then the If the time mismatch is not considered too great then the
receiving server MUST record the delta between the servers. receiving server MUST record the delta between the servers.
The receiving server MUST use this delta to correct all of the The receiving server MUST use this delta to correct all of the
absolute times received from the other server in all absolute times received from the other server in all
time-valued options. Note that servers can participate in time-valued options. Note that servers can participate in
failover with arbitrarily great time mismatches, as long as failover with arbitrarily great time mismatches, as long as
the mismatch is more or less constant. the mismatch is more or less constant.
7. If the receiving server is a secondary server, it MUST examine 7. If the receiving server is a secondary server, it MUST examine
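The TLS-request and TLS-reply possibilities tabulated in step 3 (the -06 column) can be sketched as a small lookup. This is an illustrative sketch only: the names `TLS_TABLE` and `tls_outcome` are hypothetical and not part of the draft, and the entry "9, 10" is read here as "either reject-reason 9 or 10 applies".

```python
# Keys are (TLS-request in CONNECT, TLS-reply in CONNECTACK).
# Values are (possible reject-reason values, comment from the table).
# An empty tuple of reasons means the combination is not a rejection.
TLS_TABLE = {
    (0, 0): ((), "no TLS used"),
    (0, 1): ((11,), "primary won't use TLS, secondary requires TLS"),
    (1, 0): ((), "primary desires TLS, secondary doesn't"),
    (1, 1): ((), "primary desires TLS, secondary will use TLS"),
    (2, 0): ((9, 10), "primary requires TLS and secondary won't"),
    (2, 1): ((), "primary requires TLS and secondary will use TLS"),
}


def tls_outcome(tls_request, tls_reply):
    """Return (rejected, start_tls, reject_reasons) for one table row.

    start_tls is True only when the reply is 1 and the connection is
    accepted, which is the case where step 3 says to send the
    CONNECTACK immediately and enter TLS negotiation.
    """
    reasons, _comment = TLS_TABLE[(tls_request, tls_reply)]
    rejected = bool(reasons)
    start_tls = (tls_reply == 1) and not rejected
    return rejected, start_tls, reasons
```

How a secondary arrives at its TLS-reply value from its own capabilities and configuration is left to the implementation; the sketch only classifies the resulting pair.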
skipping to change at page 85, line 22 skipping to change at page 66, line 46
It is sent by the secondary server which received a CONNECT message. It is sent by the secondary server which received a CONNECT message.
Attempting immediately to reconnect after either receiving a CONNEC- Attempting immediately to reconnect after either receiving a CONNEC-
TACK with a reject-reason or after sending a CONNECTACK with a TACK with a reject-reason or after sending a CONNECTACK with a
reject-reason could yield unwanted looping behavior, since the reason reject-reason could yield unwanted looping behavior, since the reason
that the connection was rejected may well not have changed since the that the connection was rejected may well not have changed since the
last attempt. A simple suggested solution is to wait a minute or two last attempt. A simple suggested solution is to wait a minute or two
after sending or receiving a CONNECTACK message with a reject-reason after sending or receiving a CONNECTACK message with a reject-reason
before attempting to reestablish communication. before attempting to reestablish communication.
The message type for the CONNECTACK message is 6.
The following table summarizes the options associated with the CON-
NECTACK message:
Option
------
sending-server-IP-address MUST
max-unacked-bndupd MUST
receive-timer MUST
vendor-class-identifier MUST
protocol-version MUST
TLS-request MUST
reject-reason MAY(1)
message MAY
MCLT MUST NOT
hash-bucket-assignment MUST NOT
(1) Indicates a rejection of the CONNECT message.
Table 7.9-1: Options used in a CONNECTACK message
7.9.1. Sending the CONNECTACK message 7.9.1. Sending the CONNECTACK message
The xid of the CONNECTACK message MUST be that of the corresponding The xid of the CONNECTACK message MUST be that of the corresponding
CONNECT message. CONNECT message.
The IP address of the sending server MUST be placed in the sending- The IP address of the sending server MUST be placed in the sending-
server-IP-address option. This information is placed in an option server-IP-address option. This information is placed in an option
inside of the message in order to allow the identity of the sender to inside of the message in order to allow the identity of the sender to
be covered by a shared secret. be covered by a shared secret.
skipping to change at page 88, line 42 skipping to change at page 70, line 44
The STATE message MUST be sent after sending a CONNECTACK message The STATE message MUST be sent after sending a CONNECTACK message
that didn't contain a reject-reason option, and MUST be sent after that didn't contain a reject-reason option, and MUST be sent after
receiving a CONNECTACK message without a reject-reason option. receiving a CONNECTACK message without a reject-reason option.
A STATE message MUST be sent whenever the failover endpoint changes A STATE message MUST be sent whenever the failover endpoint changes
its failover state and a connection exists to the partner. its failover state and a connection exists to the partner.
The STATE message requires no response from the failover partner. The STATE message requires no response from the failover partner.
The message type for the STATE message is 10.
The following table shows the options that MUST appear in a STATE
message:
Option
------
sending-state MUST
server-flags MUST
start-time-of-state MUST
Table 7.10-1: Options used in a STATE message
7.10.1. Sending the STATE message 7.10.1. Sending the STATE message
The current failover state is placed in the server-state option and The current failover state is placed in the server-state option and
the current state of the STARTUP flag is placed in the server-flags the current state of the STARTUP flag is placed in the server-flags
option. option.
The message is sent with a unique xid. The message is sent with a unique xid.
A server SHOULD only send the STATE message either when the A server SHOULD only send the STATE message either when the
connection is created (i.e., after sending or receiving a CONNECTACK message connection is created (i.e., after sending or receiving a CONNECTACK message
skipping to change at page 89, line 25 skipping to change at page 71, line 44
No response to a STATE message is required. No response to a STATE message is required.
7.11. CONTACT message 7.11. CONTACT message
The contact (CONTACT) message is sent to verify communications The contact (CONTACT) message is sent to verify communications
integrity with a failover partner. The CONTACT message is sent when integrity with a failover partner. The CONTACT message is sent when
no messages have been sent to the failover partner for a specified no messages have been sent to the failover partner for a specified
period of time. This is determined by the tSend timer expiring (see period of time. This is determined by the tSend timer expiring (see
section 8.3). section 8.3).
The message type for the CONTACT message is 11.
The CONTACT message has no message specific options.
7.11.1. Sending the CONTACT message 7.11.1. Sending the CONTACT message
The CONTACT message is sent. The CONTACT message is sent.
7.11.2. Receiving the CONTACT message 7.11.2. Receiving the CONTACT message
When a CONTACT message is received, the tReceive timer is reset (as When a CONTACT message is received, the tReceive timer is reset (as
it is with any message that is received). it is with any message that is received).
A server MAY use the time in the time field and the time recorded A server MAY use the time in the time field and the time recorded
skipping to change at page 90, line 5 skipping to change at page 72, line 31
After sending or receiving a DISCONNECT message, a server needs to After sending or receiving a DISCONNECT message, a server needs to
have some mechanism to prevent an error loop. Simply reconnecting to have some mechanism to prevent an error loop. Simply reconnecting to
the partner immediately is not the best option, especially after the partner immediately is not the best option, especially after
several consecutive attempts. several consecutive attempts.
A simple suggested solution is to wait a minute or two after sending A simple suggested solution is to wait a minute or two after sending
or receiving a DISCONNECT before attempting to reestablish communica- or receiving a DISCONNECT before attempting to reestablish communica-
tion. tion.
The message type for the DISCONNECT message is 12.
The DISCONNECT message MUST be the last message sent down a connec-
tion before it is closed.
The following table summarizes the options that are associated with
the DISCONNECT message:
Option
------
reject-reason MUST
message SHOULD
Table 7.12-1: Options used in a DISCONNECT message
7.12.1. Sending the DISCONNECT message 7.12.1. Sending the DISCONNECT message
The DISCONNECT message MUST be the last message sent by a server The DISCONNECT message MUST be the last message sent by a server
which is dropping a TCP connection. which is dropping a TCP connection.
The xid of the DISCONNECT message must be unique. The xid of the DISCONNECT message must be unique.
The reject-reason option MUST appear giving a reason why the connec- The reject-reason option MUST appear giving a reason why the connec-
tion was dropped. A message option SHOULD appear giving a human tion was dropped. A message option SHOULD appear giving a human
readable error message with possibly more details. readable error message with possibly more details.
skipping to change at page 90, line 42 skipping to change at page 73, line 37
transitions are taken in many cases when the status of communications transitions are taken in many cases when the status of communications
with the partner changes, and the existence or non-existence of a TCP with the partner changes, and the existence or non-existence of a TCP
connection between failover endpoints is used to determine if connection between failover endpoints is used to determine if
communications is "okay" or "failed". communications is "okay" or "failed".
A single TCP connection exists which connects two failover endpoints. A single TCP connection exists which connects two failover endpoints.
8.1. Connection granularity 8.1. Connection granularity
There exists one TCP connection between each set of failover end- There exists one TCP connection between each set of failover end-
points. See section 5.1.1 for an explanation of failover endpoint. points. See section 5.1.1 for an explanation of failover endpoints.
There are a maximum of two TCP connections between any two servers There are a maximum of two TCP connections between any two servers
implementing the failover protocol, one for each of the possible implementing the failover protocol, one for each of the possible
failover endpoints between these two servers. There is a minimum of failover endpoints between these two servers. There is a minimum of
one TCP connection between one server and every other failover server one TCP connection between one server and every other failover server
with which it implements the failover protocol. with which it implements the failover protocol.
8.2. Creating the TCP connection 8.2. Creating the TCP connection
Every server implementing the failover protocol MUST listen on port There are two ports used for initiating TCP connections, correspond-
647 for incoming failover TCP connections. The source port of the ing to the two roles that a server can fill with respect to another
TCP connection is unimportant. server. Every server implementing the failover protocol MUST listen
on at least one of these ports. Port 647 is the port to which pri-
mary servers will attempt a connection, and port TBD is the port to
which secondary servers will attempt a connection. When a connection
attempt is received on port 647 it is therefore from a primary
server, and it is attempting to connect to this server as a secondary
server. Likewise, when an attempt to connect is received on port TBD
the connection attempt is from a secondary server, and it is attempt-
ing to connect to this server as a primary server. The source port
of any TCP connection is unimportant.
Every server implementing the failover protocol SHOULD attempt to Every server implementing the failover protocol SHOULD attempt to
connect to all of its partners periodically, where the period is connect to all of its partners periodically, where the period is
implementation dependent and SHOULD be configurable. In the event implementation dependent and SHOULD be configurable. In the event
that a connection has been rejected by a CONNECTACK message with a that a connection has been rejected by a CONNECTACK message with a
reject-reason option contained in it or a DISCONNECT message, a reject-reason option contained in it or a DISCONNECT message, a
server SHOULD reduce the frequency with which it attempts to connect server SHOULD reduce the frequency with which it attempts to connect
to that server but it SHOULD continue to attempt to connect periodi- to that server but it SHOULD continue to attempt to connect periodi-
cally. cally.
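The retry behavior described above might be sketched as follows. The draft leaves the period implementation dependent, so every number here is an assumption: the multiplier `rejected_factor` and the 60-second floor (taken loosely from the "wait a minute or two" suggestion for rejected connections) are illustrative, and `next_attempt_delay` is a hypothetical name.

```python
def next_attempt_delay(base_period_s, was_rejected,
                       rejected_factor=4.0, rejected_floor_s=60.0):
    """Seconds to wait before the next connection attempt to a partner.

    base_period_s is the configurable periodic retry interval.
    After a CONNECTACK carrying a reject-reason, or a DISCONNECT, the
    attempts continue but less frequently (assumed policy: multiply the
    period and never retry sooner than rejected_floor_s).
    """
    if base_period_s <= 0:
        raise ValueError("base_period_s must be positive")
    if not was_rejected:
        return float(base_period_s)
    return max(base_period_s * rejected_factor, rejected_floor_s)
```

The point of the sketch is only that a rejection lengthens the interval without ever stopping the attempts.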
If a connection attempt has been received from another server in a
particular role (i.e., from a specific failover endpoint) then the
receiving server MUST NOT initiate a connection attempt to the
partner server in that same role.
If both servers happen to attempt to connect simultaneously, the
secondary server MUST drop its attempt in favor of the primary's
attempt. Thus, in the event that a secondary server receives a con-
nection attempt to port 647 from a primary server when it has already
initiated a connection attempt to port TBD on the same primary
server, it MUST accept the connection to port 647 and it MUST drop
the connection attempt to port TBD. In the event that a primary
server receives a connection attempt to port TBD from a secondary
server when it has already initiated a connection attempt to port 647
on that same server, it MUST reject the connection attempt to port
TBD and continue to pursue the connection attempt on port 647.
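The simultaneous-connect rule in the -06 text can be sketched as a small decision function. The enum and function names are hypothetical, and the sketch deliberately works in terms of roles rather than port numbers, since the second port is still listed as TBD.

```python
from enum import Enum


class Role(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"


class Action(Enum):
    ACCEPT_INCOMING = "accept incoming"
    ACCEPT_INCOMING_DROP_OUTGOING = "accept incoming, drop own attempt"
    REJECT_INCOMING_KEEP_OUTGOING = "reject incoming, keep own attempt"


def on_incoming_connection(my_role, outgoing_attempt_pending):
    """Decide what to do when the partner connects to this endpoint.

    my_role is the role this server plays for the failover endpoint in
    question. When both sides attempt to connect at once, the
    secondary yields to the primary's attempt.
    """
    if not outgoing_attempt_pending:
        return Action.ACCEPT_INCOMING
    if my_role is Role.SECONDARY:
        return Action.ACCEPT_INCOMING_DROP_OUTGOING
    return Action.REJECT_INCOMING_KEEP_OUTGOING
```

Either way exactly one connection survives, and it is always the one the primary initiated.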
Once a connection is established, the primary server MUST send a CON- Once a connection is established, the primary server MUST send a CON-
NECT message across the connection. A secondary server MUST wait for NECT message across the connection. A secondary server MUST wait for
the CONNECT message from a primary server. the CONNECT message from a primary server.
Every CONNECT message includes a TLS-request option, and if the CON- Every CONNECT message includes a TLS-request option, and if the CON-
NECTACK message does not reject the CONNECT message and the TLS-reply NECTACK message does not reject the CONNECT message and the TLS-reply
option says TLS MUST be used, then the servers will immediately enter option says TLS MUST be used, then the servers will immediately enter
into TLS negotiation. into TLS negotiation.
Once TLS negotiation is complete, the primary server MUST resend the Once TLS negotiation is complete, the primary server MUST resend the
skipping to change at page 94, line 23 skipping to change at page 77, line 43
sages from the TCP connection. A server MUST immediately accept any sages from the TCP connection. A server MUST immediately accept any
BNDACK which is received as well. BNDACK which is received as well.
8.6. Losing the TCP connection 8.6. Losing the TCP connection
When the TCP connection is lost, then communications is not ok with When the TCP connection is lost, then communications is not ok with
the other server. A server which has lost communications SHOULD the other server. A server which has lost communications SHOULD
immediately attempt to reconnect to the other server, and should immediately attempt to reconnect to the other server, and should
retry these connection attempts periodically. retry these connection attempts periodically.
A BNDACK message can only be sent in response to a BNDUPD message An acknowledgement message (BNDACK, POOLRESP, UPDDONE) can
using the same TCP connection from which the BNDUPD message was only be sent in response to a request message (BNDUPD, POOLREQ,
received, since the XID's in BNDUPD messages are guaranteed unique UPDREQ, UPDREQALL) on the same TCP connection from which the request
only during the life of a single TCP connection. When a connection was received, in part since the XID's in the request messages are
to a partner server goes down, a server with unprocessed BNDUPD mes- guaranteed unique only during the life of a single TCP connection.
sages MAY simply drop all of those messages, since it can be sure
that the partner will retransmit them when they are next in communi- When a connection to a partner server goes down, a server with unpro-
cations. A server with unprocessed BNDUPD messages when a TCP con- cessed request messages MAY simply drop all of those messages, since
nection goes down MAY instead choose to process those BNDUPD mes- it can be sure that the partner will resend them when they are next
sages, but it MUST NOT send any BNDACK messages in response (again in communications. A server with unprocessed BNDUPD messages when a
TCP connection goes down MAY instead choose to process those BNDUPD
messages, but it MUST NOT send any BNDACK messages in response (again
because of the issues surrounding XID uniqueness). because of the issues surrounding XID uniqueness).
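Because XIDs are only unique within one TCP connection, an implementation will likely scope its table of outstanding requests to the connection object. The following is a minimal sketch of that idea; the class and method names are hypothetical and message parsing is omitted.

```python
class FailoverConnection:
    """Tracks requests received on a single TCP connection."""

    def __init__(self):
        self.is_open = True
        self.pending = {}  # xid -> request type, e.g. "BNDUPD"

    def receive_request(self, xid, kind):
        self.pending[xid] = kind

    def may_acknowledge(self, xid):
        # An acknowledgement is only valid on the connection the
        # request arrived on, and only while that connection is up.
        return self.is_open and xid in self.pending

    def acknowledge(self, xid):
        if not self.may_acknowledge(xid):
            raise RuntimeError("cannot acknowledge on this connection")
        return self.pending.pop(xid)

    def connection_lost(self):
        """Mark the connection down and hand back unprocessed requests.

        The caller may discard them (the partner will resend) or may
        still process BNDUPDs, but no acknowledgement can be sent.
        """
        self.is_open = False
        leftovers, self.pending = self.pending, {}
        return leftovers
```

Keeping the table per connection means a stale XID from an earlier connection can never be matched against a request on a new one.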
When the TCP connection is closed explicitly, the DISCONNECT message When the TCP connection is closed explicitly, the DISCONNECT message
with a reject-reason option (and, ideally, a message option) MUST be with a reject-reason option (and, ideally, a message option) MUST be
sent over the TCP connection. sent over the TCP connection.
9. Protocol States 9. Failover Endpoint States
This section discusses the various states that a failover endpoint This section discusses the various states that a failover endpoint
may take, and the server actions required when entering the state, may take, and the server actions required when entering the state,
operating in the state, and leaving the state, as well as the events operating in the state, and leaving the state, as well as the events
that cause transitions out of the state into another state. that cause transitions out of the state into another state.
The state transition diagram in Figure 9.2-1 is relevant for this The state transition diagram in Figure 9.2-1 is relevant for this
section. This is the common state transition diagram for both servers section. This is the common state transition diagram for both servers
in a failover pair. In the event that the textual description of a in a failover pair. In the event that the textual description of a
state differs from the state transition diagram, the textual descrip- state differs from the state transition diagram, the textual descrip-
tion is to be considered authoritative. tion is to be considered authoritative.
9.1. Server Initialization 9.1. Server Initialization
When a server starts it starts out in STARTUP state. See section 9.4 When a server starts it starts out in STARTUP state. See section 9.3
below for details. below for details.
9.2. Server State Transitions 9.2. Server State Transitions
Whenever a server transitions into a new state, it MUST record the Whenever a server transitions into a new state, it MUST record the
state and the time at which it entered that state in stable storage. state and the time at which it entered that state in stable storage.
If communications is "ok", it MUST also send a STATE message to its If communications is "ok", it MUST also send a STATE message to its
failover partner. failover partner.
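The two obligations on a state transition (record to stable storage, then notify the partner if communications is "ok") could be sketched as below. The JSON file format, the `storage_path` argument, and the `send_state` callback are all assumptions made for illustration; the draft does not prescribe how stable storage is implemented.

```python
import json
import os
import time


def enter_state(new_state, storage_path, comm_ok, send_state, now=None):
    """Record a state transition durably, then tell the partner.

    send_state is a hypothetical callback that transmits a STATE
    message; it is invoked only when communications is ok.
    """
    entered_at = time.time() if now is None else now
    record = {"state": new_state, "entered_at": entered_at}

    # Write to a temporary file, force it to disk, then rename, so a
    # crash never leaves a half-written record behind.
    tmp_path = storage_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(record, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, storage_path)

    if comm_ok:
        send_state(new_state, entered_at)
    return record
```

Writing before sending means the partner is never told about a state that this server could forget after a restart.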
Figure 9.2-1 is the diagram of the server state transitions. The Figure 9.2-1 is the diagram of the server state transitions. The
skipping to change at page 97, line 13 skipping to change at page 80, line 13
thus be available to the server after a server restart. thus be available to the server after a server restart.
+---------------+ V +--------------+ +---------------+ V +--------------+
| RECOVER - | | | STARTUP - | | RECOVER - | | | STARTUP - |
|(unresponsive) | +->|(unresponsive)| |(unresponsive) | +->|(unresponsive)|
+---------------+ +--------------+ +---------------+ +--------------+
Comm. OK +-----------------+ Comm. OK +-----------------+
Other State:-RECOVER | PARTNER DOWN - |<-----------------+ Other State:-RECOVER | PARTNER DOWN - |<-----------------+
| | | (responsive) | | | | | (responsive) | |
All POTENTIAL- +-----------------+ +--------------+ | All POTENTIAL- +-----------------+ +--------------+ |
Others CONFLICT------------ | --------+ | RESOLUTION | | Others CONFLICT------------ | --------+ | RESOLUTION -| |
| Comm. OK | | INTERRUPTED | | | Comm. OK | | INTERRUPTED | |
UPDREQ(ALL) Other State: | +-| (responsive) | | UPDREQ(ALL) Other State: | +-| (responsive) | |
Wait UPDDONE | | | | +--------------+ | Wait UPDDONE | | | | +--------------+ |
Wait MCLT from fail RECOVER All Others| Comm. OK ^ | | Wait MCLT from fail RECOVER All Others| Comm. OK ^ | |
+--------------+ | V V V | Ext. | +--------------+ | V V V | Ext. |
|RECOVER-DONE +| +--+ +--------------+ Comm. Cmd. | |RECOVER-DONE +| +--+ +--------------+ Comm. Cmd. |
|(unresponsive)| | | POTENTIAL + | Failed | | |(unresponsive)| | | POTENTIAL + | Failed | |
+--------------+ Wait for +>| CONFLICT |------+ +-->| +--------------+ Wait for +>| CONFLICT |------+ +-->|
Comm. OK Other | |(unresponsive)|<--------+ | Comm. OK Other | |(unresponsive)|<--------+ |
+--Other State:-+ State: | +--------------+ | | +--Other State:-+ State: | +--------------+ | |
skipping to change at page 101, line 20 skipping to change at page 84, line 20
period of time (the MCLT interval) has elapsed from entry into period of time (the MCLT interval) has elapsed from entry into
PARTNER-DOWN state, it will allocate IP addresses from the set of all PARTNER-DOWN state, it will allocate IP addresses from the set of all
available IP addresses. available IP addresses.
Once a server has entered NORMAL state, the PARTNER-DOWN state is Once a server has entered NORMAL state, the PARTNER-DOWN state is
entered only on command of an external agency (typically an adminis- entered only on command of an external agency (typically an adminis-
trator of some sort) or after the expiration of an externally config- trator of some sort) or after the expiration of an externally config-
ured minimum safe-time after the beginning of COMMUNICATIONS- ured minimum safe-time after the beginning of COMMUNICATIONS-
INTERRUPTED state. INTERRUPTED state.
Any available IP address tagged as belonging to the other server (at Any available IP address tagged as available for allocation by the
entry to PARTNER-DOWN state) MUST NOT be used until the maximum- other server (at entry to PARTNER-DOWN state) MUST NOT be allocated
client-lead-time beyond the entry into PARTNER-DOWN state has to a new client until the maximum-client-lead-time beyond the entry
elapsed. into PARTNER-DOWN state has elapsed.
A server in PARTNER-DOWN state MUST NOT allocate an IP address to a A server in PARTNER-DOWN state MUST NOT allocate an IP address to a
DHCP client different from that to which it was allocated at the DHCP client different from that to which it was allocated at the
entrance to PARTNER-DOWN state until the maximum-client-lead-time entrance to PARTNER-DOWN state until the maximum-client-lead-time
beyond the maximum of the following times: client expiration time, beyond the maximum of the following times: client expiration time,
most recently transmitted potential-expiration-time, most recently most recently transmitted potential-expiration-time, most recently
received ack of potential-expiration-time from the partner, and most received ack of potential-expiration-time from the partner, and most
recently acked potential-expiration-time to the partner. See section recently acked potential-expiration-time to the partner. See section
7.1.4 for details. If this time would be earlier than the current 7.1.5 for details. If this time would be earlier than the current
time plus the maximum-client-lead-time, then the time the server time plus the maximum-client-lead-time, then the time the server
entered PARTNER-DOWN state plus the maximum-client-lead-time is used. entered PARTNER-DOWN state plus the maximum-client-lead-time is used.
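The reallocation rule above can be worked through numerically. The sketch below follows the wording literally, reading "this time" as the computed threshold (the latest of the four times plus the maximum-client-lead-time); that reading is an assumption, as are the function and parameter names. All times are seconds on a common clock.

```python
def earliest_reallocation_time(now, partner_down_entry, mclt,
                               client_expiration,
                               sent_potential_expiration,
                               received_ack_potential_expiration,
                               acked_potential_expiration):
    """Earliest time an address may be given to a different client
    while this server is in PARTNER-DOWN state."""
    latest = max(client_expiration,
                 sent_potential_expiration,
                 received_ack_potential_expiration,
                 acked_potential_expiration)
    candidate = latest + mclt
    # Fallback from the text: if the computed time is earlier than
    # now + MCLT, use the PARTNER-DOWN entry time plus MCLT instead.
    if candidate < now + mclt:
        return partner_down_entry + mclt
    return candidate
```

For example, with an MCLT of 3600, entry into PARTNER-DOWN at 900 and the current time at 1000, a binding whose latest recorded time is 600 falls back to 900 + 3600 = 4500, while one whose latest recorded time is 2000 yields 2000 + 3600 = 5600.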
Two options exist for lease times given out while in PARTNER-DOWN Two options exist for lease times given out while in PARTNER-DOWN
state, with different ramifications flowing from each. state, with different ramifications flowing from each.
If the server wishes the Failover protocol to protect it from loss of If the server wishes the Failover protocol to protect it from loss of
stable storage in PARTNER-DOWN state, then it should ensure that the stable storage in PARTNER-DOWN state, then it should ensure that the
MCLT based lease time restrictions in Section 5.1 are maintained, MCLT based lease time restrictions in Section 5.1 are maintained,
even in PARTNER-DOWN state. even in PARTNER-DOWN state.
skipping to change at page 102, line 41 skipping to change at page 85, line 41
stay in PARTNER-DOWN state stay in PARTNER-DOWN state
o partner in RECOVER-DONE state o partner in RECOVER-DONE state
transition into NORMAL state transition into NORMAL state
9.5. RECOVER state 9.5. RECOVER state
This state indicates that the server has no information in its stable This state indicates that the server has no information in its stable
storage or that it is re-integrating with a server in PARTNER-DOWN storage or that it is re-integrating with a server in PARTNER-DOWN
state after it has been down. A server in this state will attempt to state after it has been down. A server in this state MUST attempt to
refresh its stable storage from the other server. refresh its stable storage from the other server.
9.5.1. Operation in RECOVER state 9.5.1. Operation in RECOVER state
A server in RECOVER MUST NOT respond to DHCP client requests. A server in RECOVER MUST NOT respond to DHCP client requests.
A server in RECOVER state will attempt to reestablish communications A server in RECOVER state will attempt to reestablish communications
with the other server. with the other server.
9.5.2. Transitions out of RECOVER state 9.5.2. Transitions out of RECOVER state
skipping to change at page 103, line 18 skipping to change at page 86, line 18
tions are reestablished, then the server in RECOVER state will move tions are reestablished, then the server in RECOVER state will move
to POTENTIAL-CONFLICT state itself. to POTENTIAL-CONFLICT state itself.
If the other server is in RECOVER state, then this server SHOULD sig- If the other server is in RECOVER state, then this server SHOULD sig-
nal an error and halt processing. nal an error and halt processing.
If the other server is in any other state, then the server in RECOVER If the other server is in any other state, then the server in RECOVER
state will request an update of missing binding information by send- state will request an update of missing binding information by send-
ing an UPDREQ message. If the server has been instructed (through ing an UPDREQ message. If the server has been instructed (through
configuration or other external agency) that it has lost its stable configuration or other external agency) that it has lost its stable
storage, it MUST send an UPDREQALL message, otherwise it MUST send an storage, or if it has deduced that from the fact that it has no
UPDREQ message. record of ever having talked to its partner, while its partner does
have a record of communicating with it, it MUST send an UPDREQALL
message, otherwise it MUST send an UPDREQ message.
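The choice between UPDREQ and UPDREQALL, as worded in the -06 text, reduces to a short predicate. The three boolean inputs and the function name are hypothetical; how a server learns whether its partner has a record of earlier communication is not shown.

```python
def recover_request_type(told_stable_storage_lost,
                         has_record_of_partner,
                         partner_has_record_of_us):
    """Pick the update request a server in RECOVER state sends."""
    # Loss is deduced when this server remembers nothing about the
    # partner although the partner remembers talking to this server.
    deduced_loss = (not has_record_of_partner) and partner_has_record_of_us
    if told_stable_storage_lost or deduced_loss:
        return "UPDREQALL"
    return "UPDREQ"
```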
It will wait for an UPDDONE message, and upon receipt of that message It will wait for an UPDDONE message, and upon receipt of that message
it will start a timer whose expiration is set to a time equal to the it will start a timer whose expiration is set to a time equal to the
time the server went down (if known) or the current time (if the time the server went down (if known) or the current time (if the
down-time is unknown) plus the maximum-client-lead-time. When this down-time is unknown) plus the maximum-client-lead-time. When this
timer goes off, the server will transition into RECOVER-DONE state. timer goes off, the server will transition into RECOVER-DONE state.
This is to allow any IP addresses that were allocated by this server This is to allow any IP addresses that were allocated by this server
prior to loss of its client binding information in stable storage to prior to loss of its client binding information in stable storage to
contact the other server or to time out. contact the other server or to time out.
skipping to change at page 103, line 44 skipping to change at page 86, line 46
The actual requirement on this wait period in RECOVER is that it The actual requirement on this wait period in RECOVER is that it
start when the recovering server went down, not necessarily when start when the recovering server went down, not necessarily when
it came back up. If the time when the recovering server failed is it came back up. If the time when the recovering server failed is
known, it could be communicated to the recovering server (perhaps known, it could be communicated to the recovering server (perhaps
through actions of the network administrator), and the wait period through actions of the network administrator), and the wait period
could be reduced to the maximum-client-lead-time less the differ- could be reduced to the maximum-client-lead-time less the differ-
ence between the current time and the time the server failed. In ence between the current time and the time the server failed. In
this way, the waiting period could be minimized. this way, the waiting period could be minimized.
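As a worked illustration of the wait period, the remaining time after UPDDONE is received can be computed as follows. This is a sketch under the assumption that all times are seconds on one clock; `recover_wait_remaining` is a hypothetical name.

```python
def recover_wait_remaining(now, mclt, down_time=None):
    """Seconds left to wait in RECOVER before moving to RECOVER-DONE.

    The timer expires at down_time + MCLT when the failure time is
    known, otherwise at now + MCLT.
    """
    start = now if down_time is None else down_time
    return max(0.0, float(start + mclt - now))
```

With an MCLT of 3600 and an unknown failure time, the full 3600 seconds remain; if the server is known to have failed 600 seconds ago, only 3000 remain; if it failed more than an MCLT ago, no wait remains.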
If an UPDDONE message isn't received within an If an UPDDONE message isn't received within an
implementation dependent amount of time, and no BNDUPD messages are implementation dependent amount of time, and no BNDUPD messages are
being received, then the UPDREQ(ALL) message will be re-transmitted. being received, the connection SHOULD be dropped.
A B A B
Server Server Server Server
| | | |
RECOVER PARTNER-DOWN RECOVER PARTNER-DOWN
| | | |
| >--UPDREQ--------------------> | | >--UPDREQ--------------------> |
| | | |
| <---------------------BNDUPD--< | | <---------------------BNDUPD--< |
skipping to change at page 104, line 31 skipping to change at page 87, line 31
| | | |
Wait MCLT from last known | Wait MCLT from last known |
time of operation | time of operation |
| | | |
RECOVER-DONE | RECOVER-DONE |
| | | |
| >--STATE-(RECOVER-DONE)------> | | >--STATE-(RECOVER-DONE)------> |
| NORMAL | NORMAL
| <-------------(NORMAL)-STATE--< | | <-------------(NORMAL)-STATE--< |
NORMAL | NORMAL |
| >---- State-(NORMAL)--------------->
| | | |
| | | |
Figure 9.5.2-1: Transition out of RECOVER state Figure 9.5.2-1: Transition out of RECOVER state
9.6. NORMAL state 9.6. NORMAL state
NORMAL state is the state used by a server when it is communicating NORMAL state is the state used by a server when it is communicating
with the other server, and any required resynchronization has been with the other server, and any required resynchronization has been
performed. While some bindings database synchronization is performed performed. While some bindings database synchronization is performed
in NORMAL state, potential conflicts are resolved prior to entry into in NORMAL state, potential conflicts are resolved prior to entry into
NORMAL state as is binding database data loss. NORMAL state as is binding database data loss.
9.6.1. Upon Entry to NORMAL state 9.6.1. Upon entry to NORMAL state
When entering NORMAL state, a server will send to the other server When entering NORMAL state, a server will send to the other server
all currently unacknowledged binding updates as BNDUPD messages. all currently unacknowledged binding updates as BNDUPD messages.
When the above process is complete, if the server entering NORMAL When the above process is complete, if the server entering NORMAL
state is a secondary server, then it will request IP addresses for state is a secondary server, then it will request IP addresses for
allocation using the POOLREQ message. allocation using the POOLREQ message.
9.6.2. Processing DHCP client requests and load balancing 9.6.2. Processing DHCP client requests and load balancing
When in NORMAL state, each server MUST process all requests from some In NORMAL state, a server MUST process every DHCPREQUEST/RENEWAL or
DHCP clients, and MUST NOT process any request other than a DHCPREQUEST/REBINDING request it receives. And, it processes other
DHCPREQUEST/RENEWAL or a DHCPREQUEST/REBINDING request from some requests only for those clients as dictated by the load balancing
other DHCP clients. algorithm specified in [LOADB].
However, if the load balancing algorithm specified in [LOADB] is used
with a pair of servers implementing the failover protocol, then each
server needs to test each incoming DHCP client request to see if it
should process that request.
As discussed in section 5.3, each server will take the client- As discussed in section 5.3, each server will take the client-
identifier from each DHCP client request (or the client-hardware- identifier from each DHCP client request (or the client-hardware-
address, i.e., the htype concatenated to the front of the chaddr if address, i.e., the htype concatenated to the front of the chaddr if
no client-identifier is present in the request) and use it as the no client-identifier is present in the request) and use it as the
'Request ID' specified in [LOADB]. After applying the algorithm 'Request ID' specified in [LOADB]. After applying the algorithm
specified in [LOADB] and comparing the result with the hash bucket specified in [LOADB] and comparing the result with the hash bucket
assignment (performed during connect processing between failover assignment (performed during connect processing between failover
servers), each failover server will be able to unambiguously determine servers), each failover server will be able to unambiguously determine
if it should process the DHCP client request. if it should process the DHCP client request.
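A minimal sketch of the Request ID selection and bucket test described above. The hash itself is defined in [LOADB] and is not reproduced here; `hash_fn` is a hypothetical stand-in passed by the caller, and `my_buckets` is an assumed representation of the hash bucket assignment made during connect processing.

```python
def request_id(client_identifier, htype, chaddr):
    """Pick the octets used as the 'Request ID' for load balancing."""
    if client_identifier:
        # Prefer the client-identifier when the request carries one.
        return bytes(client_identifier)
    # Otherwise use htype concatenated to the front of chaddr.
    return bytes([htype]) + bytes(chaddr)


def should_process(req_id, my_buckets, hash_fn):
    """Return True if this server owns the bucket the Request ID hashes to.

    hash_fn is a placeholder for the [LOADB] algorithm (assumption).
    """
    return hash_fn(req_id) in my_buckets
```

Both servers would have to apply the same hash and hold complementary bucket sets for the decision to be unambiguous.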
skipping to change at page 105, line 41 skipping to change at page 89, line 4
In NORMAL state, a server MUST process every DHCPREQUEST/RENEWAL or In NORMAL state, a server MUST process every DHCPREQUEST/RENEWAL or
DHCPREQUEST/REBINDING request it receives. DHCPREQUEST/REBINDING request it receives.
9.6.3. Operation in NORMAL state 9.6.3. Operation in NORMAL state
When in NORMAL state, for every DHCP client request that it When in NORMAL state, for every DHCP client request that it
processes, as determined by the algorithm described in section 9.6.2, processes, as determined by the algorithm described in section 9.6.2,
above, a server will operate in the following manner: above, a server will operate in the following manner:
o Lease time calculations o Lease time calculations
As discussed in section 5.2.1, "Control of lease time", the As discussed in section 5.2.1, "Control of lease time", the
lease interval given to a DHCP client can never be more than the lease interval given to a DHCP client can never be more than the
MCLT greater than the most recently received potential- MCLT greater than the most recently received potential-
expiration-time from the failover partner or the current time, expiration-time from the failover partner or the current time,
whichever is later. whichever is later.
As long as a server adheres to this constraint, the specifics of As long as a server adheres to this constraint, the specifics of
the lease interval that it gives to a DHCP client or the value the lease interval that it gives to a DHCP client or the value
of the potential-expiration-time sent to its failover partner of the potential-expiration-time sent to its failover partner
are implementation dependent. One possible approach is are implementation dependent. One possible approach is dis-
discussed in section 5.2.1, but that particular approach is in cussed in section 5.2.1, but that particular approach is in no
no way required by this protocol. way required by this protocol.
See section 7.1.4 for details concerning the storage of time See section 7.1.5 for details concerning the storage of time
associated IP addresses and how to use these times when calcu- associated IP addresses and how to use these times when calcu-
lating lease times for DHCP clients. lating lease times for DHCP clients.
o Lazy update of partner server o Lazy update of partner server
After an ACK of an IP address binding, the server servicing a After an ACK of an IP address binding, the server servicing a
DHCP client request attempts to update its partner with the new DHCP client request attempts to update its partner with the new
binding information. The lease time used in the update of the binding information. The lease time used in the update of the
secondary MUST be at least that given to the DHCP client in the secondary MUST be at least that given to the DHCP client in the
DHCPACK, and the potential-expiration-time MUST be at least the DHCPACK, and the potential-expiration-time MUST be at least the
lease time, and SHOULD be longer. lease time, and SHOULD be considerably longer.
o Reallocation of IP addresses between clients o Reallocation of IP addresses between clients
Whenever a client binding is released or expires, a BNDUPD message Whenever a client binding is released or expires, a BNDUPD message
must be sent to the partner, setting the binding state to must be sent to the partner, setting the binding state to
RELEASED or EXPIRED. However, until a BNDACK is received for RELEASED or EXPIRED. However, until a BNDACK is received for
this message, the IP address cannot be allocated to another this message, the IP address cannot be allocated to another
client. It can be allocated to the same client again. client. It can be allocated to the same client again.
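The lease time constraint in the first item above can be illustrated with a short sketch. The function and parameter names are hypothetical, times are assumed to be in seconds, and a missing partner value is treated as "no potential-expiration-time received yet".

```python
def max_lease_expiration(now, mclt, partner_potential_expiration=None):
    """Latest absolute expiration time that may be given to a client."""
    base = now
    if partner_potential_expiration is not None:
        # Use whichever is later: the partner's value or the current time.
        base = max(now, partner_potential_expiration)
    return base + mclt


def clamp_lease_seconds(desired, now, mclt, partner_potential_expiration=None):
    """Limit a desired lease interval so it respects the MCLT constraint."""
    limit = max_lease_expiration(now, mclt, partner_potential_expiration) - now
    return min(desired, limit)
```

With no value from the partner, a client asking for one day would be limited to the MCLT; once the partner has supplied a later potential-expiration-time, longer intervals become possible.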
In normal state, the each server receives binding updates from its In normal state, each server receives binding updates from its
partner server in BNDUPD messages. It records these in its client partner server in BNDUPD messages. It records these in its client
binding database in stable storage and then sends a corresponding binding database in stable storage and then sends a corresponding
BNDACK message to the primary server. It MUST ensure that the infor- BNDACK message to the primary server. It MUST ensure that the infor-
mation is recorded in stable storage prior to sending the BNDACK mes- mation is recorded in stable storage prior to sending the BNDACK mes-
sage back to the primary server. sage back to the primary server.
9.6.4. Transitions out of NORMAL state 9.6.4. Transitions out of NORMAL state
If an external command is received by a server in NORMAL state If an external command is received by a server in NORMAL state
informing it that its partner is down, then transition into PARTNER- informing it that its partner is down, then transition into PARTNER-
DOWN state. DOWN state. Generally, this would be an unusual situation, where
some external agency knew the partner server was down. Using the
command in this case would be appropriate if the polling interval and
timeout were long.
If a server in NORMAL state fails to receive acks to messages sent to If a server in NORMAL state fails to receive acks to messages sent to
its partner for an implementation dependent period of time, it MAY its partner for an implementation dependent period of time, it MAY
move into COMMUNICATIONS-INTERRUPTED state. This situation might move into COMMUNICATIONS-INTERRUPTED state. This situation might
occur if the partner server was capable of maintaining the TCP con- occur if the partner server was capable of maintaining the TCP con-
nection between the server and also capable of sending a CONTACT mes- nection between the server and also capable of sending a CONTACT mes-
sage every tSend seconds, but was (for some reason) incapable of pro- sage every tSend seconds, but was (for some reason) incapable of pro-
cessing BNDUPD messages. cessing BNDUPD messages.
If the communications is determined to not be "ok" (as defined in If the communications is determined to not be "ok" (as defined in
skipping to change at page 107, line 24 skipping to change at page 90, line 37
A server goes into COMMUNICATIONS-INTERRUPTED state whenever it is A server goes into COMMUNICATIONS-INTERRUPTED state whenever it is
unable to communicate with the other server. Primary and secondary unable to communicate with the other server. Primary and secondary
servers cycle automatically (without administrative intervention) servers cycle automatically (without administrative intervention)
between NORMAL and COMMUNICATIONS-INTERRUPTED state as the network between NORMAL and COMMUNICATIONS-INTERRUPTED state as the network
connection between them fails and recovers, or as the partner server connection between them fails and recovers, or as the partner server
cycles between operational and non-operational. No duplicate IP cycles between operational and non-operational. No duplicate IP
address allocation can occur while the servers cycle between these address allocation can occur while the servers cycle between these
states. states.
9.7.1. Upon Entry to COMMUNICATIONS-INTERRUPTED state 9.7.1. Upon entry to COMMUNICATIONS-INTERRUPTED state
When a server enters COMMUNICATIONS-INTERRUPTED state, if it has been When a server enters COMMUNICATIONS-INTERRUPTED state, if it has been
configured to support an automatic transition out of COMMUNICATIONS- configured to support an automatic transition out of COMMUNICATIONS-
INTERRUPTED state and into PARTNER-DOWN state (i.e., a "safe period" INTERRUPTED state and into PARTNER-DOWN state (i.e., a "safe period"
has been configured, see section 10), then a timer MUST be started has been configured, see section 10), then a timer MUST be started
for a the length of the configured safe period. for the length of the configured safe period.
A server transitioning into the COMMUNICATIONS-INTERRUPTED state from A server transitioning into the COMMUNICATIONS-INTERRUPTED state from
the NORMAL state SHOULD raise some alarm condition to alert adminis- the NORMAL state SHOULD raise some alarm condition to alert adminis-
trative staff to a potential problem in the DHCP subsystem. trative staff to a potential problem in the DHCP subsystem.
9.7.2. Operation in COMMUNICATIONS-INTERRUPTED State 9.7.2. Operation in COMMUNICATIONS-INTERRUPTED State
In this state a server MUST respond to all DHCP client requests, and In this state a server MUST respond to all DHCP client requests, and
the algorithm for load balancing described in section 5.3 MUST NOT be the algorithm for load balancing described in section 5.3 MUST NOT be
used. When allocating new IP addresses, each server allocates from used. When allocating new IP addresses, each server allocates from
skipping to change at page 108, line 28 skipping to change at page 91, line 43
If an external command is received by a server in COMMUNICATIONS- If an external command is received by a server in COMMUNICATIONS-
INTERRUPTED state informing it that its partner is down, it will INTERRUPTED state informing it that its partner is down, it will
transition immediately into PARTNER-DOWN state. transition immediately into PARTNER-DOWN state.
If communications is restored with the other server, then the server If communications is restored with the other server, then the server
in COMMUNICATIONS-INTERRUPTED state will transition into another in COMMUNICATIONS-INTERRUPTED state will transition into another
state based on the state of the partner: state based on the state of the partner:
o partner in NORMAL or COMMUNICATIONS-INTERRUPTED o partner in NORMAL or COMMUNICATIONS-INTERRUPTED
The partner really SHOULD NOT be in NORMAL state here, since The partner SHOULD NOT be in NORMAL state here, since upon res-
upon restoration of communications is MUST have created a new toration of communications it MUST have created a new TCP con-
TCP connection which would have forced it into COMMUNICATIONS- nection which would have forced it into COMMUNICATIONS-
INTERRUPTED state. Still, we should account for every state INTERRUPTED state. Still, we should account for every state
just in case. just in case.
Transition into the NORMAL state. Transition into the NORMAL state.
o partner in RECOVER o partner in RECOVER
Stay in COMMUNICATIONS-INTERRUPTED state. Stay in COMMUNICATIONS-INTERRUPTED state.
o partner in RECOVER-DONE o partner in RECOVER-DONE
Transition into NORMAL state. Transition into NORMAL state.
o partner in PARTNER-DOWN or POTENTIAL-CONFLICT o partner in PARTNER-DOWN or POTENTIAL-CONFLICT
Transition into POTENTIAL-CONFLICT state. Transition into POTENTIAL-CONFLICT state.
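The transitions listed above can be summarized as a lookup table. This sketch covers only the partner states shown in this excerpt; the names are hypothetical, and returning None for any other partner state is an assumption rather than protocol behavior.

```python
# Next state for a server in COMMUNICATIONS-INTERRUPTED once communications
# are restored, keyed by the partner's reported state.
COMM_INT_TRANSITIONS = {
    "NORMAL": "NORMAL",
    "COMMUNICATIONS-INTERRUPTED": "NORMAL",
    "RECOVER": "COMMUNICATIONS-INTERRUPTED",  # stay put
    "RECOVER-DONE": "NORMAL",
    "PARTNER-DOWN": "POTENTIAL-CONFLICT",
    "POTENTIAL-CONFLICT": "POTENTIAL-CONFLICT",
}


def next_state_on_reconnect(partner_state):
    """Return the new local state, or None if the excerpt does not cover it."""
    return COMM_INT_TRANSITIONS.get(partner_state)
```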
skipping to change at page 110, line 17 skipping to change at page 94, line 17
This state indicates that the two servers are attempting to re- This state indicates that the two servers are attempting to re-
integrate with each other, but at least one of them was running in a integrate with each other, but at least one of them was running in a
state that did not guarantee automatic reintegration would be state that did not guarantee automatic reintegration would be
possible. In POTENTIAL-CONFLICT state the servers may determine that possible. In POTENTIAL-CONFLICT state the servers may determine that
the same IP address has been offered and accepted by two different the same IP address has been offered and accepted by two different
DHCP clients. DHCP clients.
It is a goal of this protocol to minimize the possibility that It is a goal of this protocol to minimize the possibility that
POTENTIAL-CONFLICT state is ever entered. POTENTIAL-CONFLICT state is ever entered.
9.8.1. Upon Entry to POTENTIAL-CONFLICT 9.8.1. Upon entry to POTENTIAL-CONFLICT state
When a primary server enters POTENTIAL-CONFLICT state it should When a primary server enters POTENTIAL-CONFLICT state it should
request that the secondary send it all updates of which it is request that the secondary send it all updates of which it is
currently unaware by sending an UPDREQ message to the secondary currently unaware by sending an UPDREQ message to the secondary
server. server.
A secondary server entering POTENTIAL-CONFLICT state will wait for A secondary server entering POTENTIAL-CONFLICT state will wait for
the primary to send it an UPDREQ message. the primary to send it an UPDREQ message.
9.8.2. Operation in POTENTIAL-CONFLICT state 9.8.2.
Any server in POTENTIAL-CONFLICT state MUST NOT process any incoming Any server in POTENTIAL-CONFLICT state MUST NOT process any incoming
DHCP requests. DHCP requests.
9.8.3. Transitions out of POTENTIAL-CONFLICT state 9.8.3. Transitions out of POTENTIAL-CONFLICT state
If communications fails with the partner while in POTENTIAL-CONFLICT If communications fails with the partner while in POTENTIAL-CONFLICT
state, then a primary server will transition to PARTNER-DOWN state state, then the server will transition to RESOLUTION-INTERRUPTED
and a secondary server will stay in POTENTIAL-CONFLICT state. state.
Whenever either server receives an UPDDONE message from its partner Whenever either server receives an UPDDONE message from its partner
while in POTENTIAL-CONFLICT state, it MUST transition to NORMAL while in POTENTIAL-CONFLICT state, it MUST transition to NORMAL
state. This will cause the primary server to leave POTENTIAL- state. This will cause the primary server to leave POTENTIAL-
CONFLICT state prior to the secondary, since the primary sends an CONFLICT state prior to the secondary, since the primary sends an
UPDREQ message and receives an UPDDONE before the secondary sends an UPDREQ message and receives an UPDDONE before the secondary sends an
UPDREQ message and receives its UPDDONE message. UPDREQ message and receives its UPDDONE message.
When a secondary server receives an indication that the primary When a secondary server receives an indication that the primary
server has transitioned from POTENTIAL-CONFLICT to NORMAL state, it server has transitioned from POTENTIAL-CONFLICT to NORMAL state, it
skipping to change at page 112, line 5 skipping to change at page 96, line 5
This state indicates that the two servers were attempting to re- This state indicates that the two servers were attempting to re-
integrate with each other in POTENTIAL-CONFLICT state, but integrate with each other in POTENTIAL-CONFLICT state, but
communications failed prior to completion of re-integration. communications failed prior to completion of re-integration.
If the servers remained in POTENTIAL-CONFLICT while communications If the servers remained in POTENTIAL-CONFLICT while communications
was interrupted, neither server would be responsive to DHCP client was interrupted, neither server would be responsive to DHCP client
requests, and if one server had crashed, then there might be no requests, and if one server had crashed, then there might be no
server able to process DHCP requests. server able to process DHCP requests.
9.9.1. Upon Entry to RESOLUTION-INTERRUPTED state 9.9.1. Upon entry to RESOLUTION-INTERRUPTED state
When a server enters RESOLUTION-INTERRUPTED SHOULD raise an alarm When a server enters RESOLUTION-INTERRUPTED state it SHOULD raise an
condition to alert administrative staff of a problem in the DHCP sub- alarm condition to alert administrative staff of a problem in the
system. DHCP subsystem.
9.9.2. Operation in RESOLUTION-INTERRUPTED state 9.9.2. Operation in RESOLUTION-INTERRUPTED state
In this state a server MUST respond to all DHCP client requests, and In this state a server MUST respond to all DHCP client requests, and
any load balancing (described in section 5.3) MUST NOT be used. When any load balancing (described in section 5.3) MUST NOT be used. When
allocating new IP addresses, each server SHOULD allocate from its own allocating new IP addresses, each server SHOULD allocate from its own
IP address pool (if that can be determined), where the primary MUST IP address pool (if that can be determined), where the primary SHOULD
allocate only FREE IP addresses, and the secondary MUST allocate only allocate only FREE IP addresses, and the secondary SHOULD allocate
BACKUP IP addresses. When responding to renewal requests, each only BACKUP IP addresses. When responding to renewal requests, each
server will allow continued renewal of a DHCP client's current lease server will allow continued renewal of a DHCP client's current lease
on an IP address irrespective of whether that lease was given out by on an IP address irrespective of whether that lease was given out by
the receiving server or not, although the renewal period MUST NOT the receiving server or not, although the renewal period MUST NOT
exceed the maximum client lead time (MCLT) beyond the potential- exceed the maximum client lead time (MCLT) beyond the potential-
expiration-time already acknowledged by the other server or the expiration-time already acknowledged by the other server or the
lease-expiration-time or potential-expiration-time received from the lease-expiration-time or potential-expiration-time received from the
partner server. partner server.
However, since the server cannot communicate with its partner in this However, since the server cannot communicate with its partner in this
state, the acknowledged-potential-expiration time will not be updated state, the acknowledged-potential-expiration time will not be updated
skipping to change at page 113, line 11 skipping to change at page 97, line 11
A server in RECOVER-DONE state MUST respond only to A server in RECOVER-DONE state MUST respond only to
DHCPREQUEST/RENEWAL and DHCPREQUEST/REBINDING DHCP messages. DHCPREQUEST/RENEWAL and DHCPREQUEST/REBINDING DHCP messages.
9.10.2. Transitions out of RECOVER-DONE state 9.10.2. Transitions out of RECOVER-DONE state
When a server in RECOVER-DONE state determines that its partner When a server in RECOVER-DONE state determines that its partner
server has entered NORMAL state, then it will transition into NORMAL server has entered NORMAL state, then it will transition into NORMAL
state as well. state as well.
If communications fails while in RECOVER-DONE state, a server will
stay in RECOVER-DONE state.
9.11. PAUSED state 9.11. PAUSED state
This state exists to allow one server to inform another that it will This state exists to allow one server to inform another that it will
be out of service for what is predicted to be a relatively short be out of service for what is predicted to be a relatively short
time, and to allow the other server to transition to COMMUNICATIONS- time, and to allow the other server to transition to COMMUNICATIONS-
INTERRUPTED state immediately and to begin servicing all DHCP clients INTERRUPTED state immediately and to begin servicing all DHCP clients
with no interruption in service to new DHCP clients. with no interruption in service to new DHCP clients.
A server which is aware that it is shutting down temporarily SHOULD A server which is aware that it is shutting down temporarily SHOULD
send a STATE message with the server-state option containing PAUSED send a STATE message with the server-state option containing PAUSED
skipping to change at page 114, line 30 skipping to change at page 98, line 33
A server in SHUTDOWN state MUST NOT respond to any DHCP client input. A server in SHUTDOWN state MUST NOT respond to any DHCP client input.
If a server receives any message indicating that the partner has If a server receives any message indicating that the partner has
moved to PARTNER-DOWN state while it is in SHUTDOWN state then it moved to PARTNER-DOWN state while it is in SHUTDOWN state then it
MUST record RECOVER state as the previous state to be used when it is MUST record RECOVER state as the previous state to be used when it is
restarted. restarted.
A server SHOULD wait for a few seconds after informing the partner of A server SHOULD wait for a few seconds after informing the partner of
entry into SHUTDOWN state (if communications are okay) to determine entry into SHUTDOWN state (if communications are okay) to determine
if it will enter PARTNER-DOWN state. if the partner entered PARTNER-DOWN state.
9.12.3. Transitions out of SHUTDOWN state 9.12.3. Transitions out of SHUTDOWN state
A server transitions out of SHUTDOWN state by being restarted. A server transitions out of SHUTDOWN state by being restarted.
10. Safe Period 10. Safe Period
Due to the restrictions imposed on each server while in Due to the restrictions imposed on each server while in
COMMUNICATIONS-INTERRUPTED state, long-term operation in this state COMMUNICATIONS-INTERRUPTED state, long-term operation in this state
is not feasible for either server. One reason that these states is not feasible for either server. One reason that these states
skipping to change at page 115, line 9 skipping to change at page 99, line 12
Eventually, when the servers are unable to communicate, they will Eventually, when the servers are unable to communicate, they will
have to move into a state where they no longer can re-integrate have to move into a state where they no longer can re-integrate
without some possibility of a duplicate IP address allocation. There without some possibility of a duplicate IP address allocation. There
are two ways that they can move into this state (known as PARTNER- are two ways that they can move into this state (known as PARTNER-
DOWN). DOWN).
They can either be informed by external command that, indeed, the They can either be informed by external command that, indeed, the
partner server is down. In this case, there is no difficulty in mov- partner server is down. In this case, there is no difficulty in mov-
ing into the PARTNER-DOWN state since it is an accurate reflection of ing into the PARTNER-DOWN state since it is an accurate reflection of
reality and the protocol has been designed to operate correctly (even reality and the protocol has been designed to operate correctly (even
during reintegration) if, when in PARTNER-DOWN state the partner is, during reintegration) as long as, when in PARTNER-DOWN state the
indeed, down. partner is, indeed, down.
The more difficult scenario is when the servers are running unat- The more difficult scenario is when the servers are running unat-
tended for extended periods, and in this case an option is provided tended for extended periods, and in this case an option is provided
to configure something called a "safe-period" into each server. This to configure something called a "safe-period" into each server. This
OPTIONAL safe-period is the period after which either the primary or OPTIONAL safe-period is the period after which either the primary or
secondary server will automatically transition to PARTNER-DOWN from secondary server will automatically transition to PARTNER-DOWN from
COMMUNICATIONS-INTERRUPTED state. If this transition is completed COMMUNICATIONS-INTERRUPTED state. If this transition is completed
and the partner is not down, then the possibility of duplicate IP and the partner is not down, then the possibility of duplicate IP
address allocations will exist. address allocations will exist.
skipping to change at page 116, line 23 skipping to change at page 100, line 27
However, it is very desirable to assure the integrity of failover However, it is very desirable to assure the integrity of failover
partners and to thus ensure proper operation of the servers. For partners and to thus ensure proper operation of the servers. For
example, denial of service attacks are possible by the communication example, denial of service attacks are possible by the communication
of invalid state information to one or both servers. of invalid state information to one or both servers.
Therefore, the Failover protocol MUST be capable of being secured by Therefore, the Failover protocol MUST be capable of being secured by
using a simple shared secret message digest which covers each mes- using a simple shared secret message digest which covers each mes-
sage. This provides authentication of the servers, but does not pro- sage. This provides authentication of the servers, but does not pro-
vide encryption of the data exchange. vide encryption of the data exchange.
The Failover protocol MAY also be secured by using TLS [TLS] (Tran- The Failover protocol MAY also be secured by using TLS [RFC 2246]
sport Layer Security) if encryption of the data exchange is desired. (Transport Layer Security) if encryption of the data exchange is
The use of the shared secret or TLS will not protect against TCP or desired. The use of the shared secret or TLS will not protect
IP layer attacks (such as someone sending fake TCP RST segments). against TCP or IP layer attacks (such as someone sending fake TCP RST
IPsec SHOULD be used to protect against most (if not all) of these segments). IPsec SHOULD be used to protect against most (if not all)
kinds of attacks. of these kinds of attacks.
11.1. Simple shared secret 11.1. Simple shared secret
Messages between the failover partners are authenticated through the Messages between the failover partners are authenticated through the
use of a shared secret, which is never sent over the network and must use of a shared secret, which is never sent over the network and must
be known by each server. How each server is told about this shared be known by each server. How each server is told about this shared
secret and secures its storage of the shared secret is outside the secret and secures its storage of the shared secret is outside the
scope of this document. If a server is configured with a shared scope of this document. If a server is configured with a shared
secret for a partner, it MUST send the message-digest option in ALL secret for a partner, it MUST send the message-digest option in ALL
messages to that partner and it MUST treat any messages received from messages to that partner and it MUST treat any messages received from
that partner without a message-digest option as failing authentica- that partner without a message-digest option as failing authentica-
tion. tion.
If a server is not configured with a shared secret for a partner, it If a server is not configured with a shared secret for a partner, it
MUST NOT send the message-digest option in any message to that MUST NOT send the message-digest option in any message to that
partner and it MUST treat any messages received from that partner partner and it MUST treat any messages received from that partner
with a message-digest option as failing authentication. with a message-digest option as failing authentication.
The shared secret is used to calculate a 16 octet message-digest The shared secret is used to calculate a 16 octet message-digest
which is sent in every failover message as the message-digest option. which is sent in every failover message as the message-digest option.
See section 6.2.25. The message-digest contains a one-way 16 octet See section 12.15. The message-digest contains a one-way 16 octet MD5
MD5 [MD5] hash calculated over a stream of octets consisting of the [RFC 1321] hash calculated over a stream of octets consisting of the
entire message concatenated with the shared secret. entire message concatenated with the shared secret.
For calculation, the message includes the message-digest option with For calculation, the message includes the message-digest option with
the message-digest data zeroed (16 octets of zero). Once the the message-digest data zeroed (16 octets of zero). Once the
calculation is complete, these 16 octets of zero are replaced by the calculation is complete, these 16 octets of zero are replaced by the
16-octet MD5 hash and the message is sent. 16-octet MD5 hash and the message is sent.
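The digest calculation above might be sketched as follows. The function names and the `digest_offset` parameter (the position of the 16 digest octets within the encoded message) are hypothetical, since the message layout is defined elsewhere. The `verify` helper assumes that verification mirrors the calculation by zeroing the received digest before hashing.

```python
import hashlib
import hmac

DIGEST_LEN = 16  # MD5 output size in octets


def compute_digest(message, digest_offset, shared_secret):
    """MD5 over the message (digest field zeroed) followed by the secret."""
    zeroed = (message[:digest_offset]
              + bytes(DIGEST_LEN)
              + message[digest_offset + DIGEST_LEN:])
    return hashlib.md5(zeroed + shared_secret).digest()


def sign(message, digest_offset, shared_secret):
    """Return the message with the zeroed digest field replaced by the hash."""
    digest = compute_digest(message, digest_offset, shared_secret)
    return (message[:digest_offset]
            + digest
            + message[digest_offset + DIGEST_LEN:])


def verify(message, digest_offset, shared_secret):
    """Recompute the digest and compare it with the one in the message."""
    received = message[digest_offset:digest_offset + DIGEST_LEN]
    expected = compute_digest(message, digest_offset, shared_secret)
    # Constant-time comparison (a design choice of this sketch).
    return hmac.compare_digest(received, expected)
```

The shared secret never appears in the transmitted octets; only the 16-octet hash does.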
For verification, the 16-octet message-digest is saved and replaced For verification, the 16-octet message-digest is saved and replaced<