Network Working Group                                        Ralph Droms
INTERNET DRAFT                                       Bucknell University

                                                             Kim Kinnear
                                                              Mark Stapp
                                                           Cisco Systems

                                                             Bernie Volz
                                                            Steve Gonczi
                                                        Process Software

                                                              Greg Rabil
                                                             Mike Dooley
                                                              Arun Kapur
                                                     Lucent Technologies

                                                            October 1999
                                                      Expires April 2000

                         DHCP Failover Protocol
                    <draft-ietf-dhc-failover-05.txt>

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Copyright Notice

   Copyright (C) The Internet Society (1999). All Rights Reserved.

Abstract

   DHCP [RFC 2131] allows for multiple servers to be operating on a
   single network.  Some sites are interested in running multiple
   servers in such a way so as to provide redundancy in case of server
   failure.  In order for this to work reliably, the cooperating primary
   and secondary servers must maintain a consistent database of the
   lease information.  This implies that servers will need to coordinate
   any and all lease activity so that this information is synchronized
   in case of failover.

   This document defines a protocol to provide this synchronization
   between two servers.  One server is designated the "primary" server,
   the other is the "secondary" server.  This document also describes a
   way to integrate the failover protocol with the DHCP load balancing
   approach.

   This document is a significant revision of
   draft-ietf-dhc-failover-04.txt.

Table of Contents

    1.  Introduction
    2.  Terminology
    2.1.  Requirements terminology
    2.2.  DHCP and failover terminology
    3.  Background and External Requirements
    3.1.  Key aspects of the DHCP protocol
    3.2.  BOOTP relay agent implementation
    3.3.  What does it mean if a server can't communicate with its partner?
    3.4.  Challenging scenarios for a Failover protocol
    3.5.  Using TCP to detect partner server failure
    4.  Design Goals
    4.1.  Design requirements for this protocol
    4.2.  Goals for this protocol
    4.3.  Limitations of this Protocol
    5.  Protocol Overview
    5.1.  Messages and States
    5.2.  Fundamental restrictions
    5.3.  Load balancing
    5.4.  Operating in NORMAL state
    5.5.  Operating in COMMUNICATIONS-INTERRUPTED state
    5.6.  Operating in PARTNER-DOWN state
    5.7.  Operating in RECOVER state
    5.8.  Operating in STARTUP state
    5.9.  Time synchronization between servers
    5.10.  IP address binding-status
    5.11.  DNS dynamic update considerations
    5.12.  Reservations and failover
    5.13.  Dynamic BOOTP and failover
    5.14.  Guidelines for selecting MCLT
    6.  Packet Formats
    6.1.  Common message format
    6.2.  Common option format
    6.3.  BNDUPD message format
    6.4.  BNDACK message format
    6.5.  Bulking for BNDUPD and BNDACK messages
    6.6.  UPDREQ message format
    6.7.  UPDREQALL message format
    6.8.  UPDDONE message format
    6.9.  POOLREQ message format
    6.10.  POOLRESP message format
    6.11.  CONNECT message format
    6.12.  CONNECTACK message format
    6.13.  STATE message format
    6.14.  CONTACT message format
    6.15.  DISCONNECT message format
    7.  Protocol Messages
    7.1.  BNDUPD message
    7.2.  BNDACK message
    7.3.  UPDREQ message
    7.4.  UPDREQALL message
    7.5.  UPDDONE message
    7.6.  POOLREQ message
    7.7.  POOLRESP message
    7.8.  CONNECT message
    7.9.  CONNECTACK message
    7.10.  STATE message
    7.11.  CONTACT message
    7.12.  DISCONNECT message
    8.  Connection Management
    8.1.  Connection granularity
    8.2.  Creating the TCP connection
    8.3.  Using the TCP connection for determining communications status
    8.4.  Using the TCP connection for binding data
    8.5.  Using the TCP connection for control messages
    8.6.  Losing the TCP connection
    9.  Protocol States
    9.1.  Server Initialization
    9.2.  Server State Transitions
    9.3.  STARTUP state
    9.4.  PARTNER-DOWN state
    9.5.  RECOVER state
    9.6.  NORMAL state
    9.7.  COMMUNICATIONS-INTERRUPTED State
    9.8.  POTENTIAL-CONFLICT state
    9.9.  RESOLUTION-INTERRUPTED state
    9.10.  RECOVER-DONE state
    9.11.  PAUSED state
    9.12.  SHUTDOWN state
    10.  Safe Period
    11.  Security
    11.1.  Simple shared secret
    11.2.  TLS
    12.  Acknowledgments
    13.  References
    14.  Author's information
    15.  Full Copyright Statement

1.  Introduction

   DHCP [RFC 2131] allows for multiple servers to be operating on a sin-
   gle network.  Some sites are interested in running multiple servers
   in such a way so as to provide redundancy in case of server failure
   since the DHCP subsystem is in many cases a critical part of the net-
   work infrastructure.

   This document defines a protocol to provide synchronization between
   two servers in order that each can take over for the other should
   either one fail or become unreachable.

   One server is designated the "primary" server, the other is the
   "secondary" server, and all DHCP client requests are sent to each
   server.

   In order to provide a high availability DHCP service, these
   cooperating primary and secondary servers must maintain a consistent
   database of lease information.  This implies that servers will need
   to coordinate any and all lease activity so that this information is
   synchronized in case failover is required.  The protocol messages and
   processing techniques required to maintain a consistent database are
   specified in the protocol described here.

   The failover protocol also contains an algorithm which allows each
   server to determine to which DHCP clients it should provide service
   when both servers are operating normally, and this capability can be
   used to support load balancing.

2.  Terminology

   This section discusses both the generic requirements terminology com-
   mon to many IETF protocol specifications as well as specialized DHCP
   and failover protocol specific terminology.

2.1.  Requirements terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC 2119].

2.2.  DHCP and failover terminology

   This document uses the following terms:

      o "DHCP client" or "client"

        A DHCP client is an Internet host using DHCP to obtain confi-
        guration parameters such as a network address.  The term
        "client" used within this document always means a DHCP client,
        and never one of the two failover servers.

      o "DHCP server" or "server"

        A DHCP server is an Internet host that returns configuration
        parameters to DHCP clients.

      o "binding"

        A binding is a collection of configuration parameters, including
        at least an IP address, associated with or "bound to" a DHCP
        client.  Bindings are managed by DHCP servers.

      o "binding database"

        The collection of bindings managed by a primary and secondary.

      o "failover endpoint"

        The failover protocol allows for there to be a unique failover
        endpoint per partner per role (where role is primary or secon-
        dary).  This failover endpoint can take actions and hold unique
        states.  There are thus a maximum of two failover endpoints per
        server per partner (one for each partner as a primary and one
        for that same partner as a secondary).

      o "lazy update"

        Lazy update refers to the requirement placed on a server imple-
        menting a failover protocol to update its failover partner when-
        ever the binding database changes.  A failover protocol which
        didn't support lazy update would require the failover partner
        update to be complete before a DHCP server could respond to a
        DHCP client request with a DHCPACK.  A failover protocol which
        does support lazy update places no such restriction on the
        update of the failover partner server, and so a server can allo-
        cate an IP address or extend a lease on an IP address and then
        update its failover partner as time permits.  A failover proto-
        col which supports lazy update not only removes the requirement
        to update the failover partner prior to responding to a DHCP
        client with a DHCPACK, but also allows gathering up batches of
        updates from one failover server to its partner.

      o "subnet address pool"

        A subnet address pool is the set of IP addresses associated
        with a particular network number and subnet mask.  In the
        simple case, there is a single network number and subnet mask
        and a set of IP addresses.  In the more complex case (sometimes
        called "secondary subnets", sometimes "superscopes"), several
        (apparently unrelated) network number and subnet mask combina-
        tions with their associated IP addresses may all be configured
        together into one subnet address pool.

      o "Primary server" or "Primary"

        A DHCP server configured to provide primary service to a set of
        DHCP clients for a particular set of subnet address pools.

      o "Secondary server" or "Secondary"

        A DHCP server configured to act as backup to a primary server
        for a particular set of subnet address pools.

      o "stable storage"

        Every DHCP server is assumed to have some form of what is called
        "stable storage".  Stable storage is used to hold information
        concerning IP address bindings (among other things) so that this
        information is not lost in the event of a server failure which
        requires restart of the server.

      o "MCLT"

        The MCLT refers to maximum client lead time.  This time is
        configured on the primary server and transmitted from the
        primary to the secondary server in the CONNECT message.  It is
        the maximum amount of time that one server can give to a client
        for a binding beyond that known and ACKed by the partner
        server.  See section 5.2.1 for details.

      o "DNS"

        An abbreviation for "Domain Name System", a scheme where a
        central name repository is used to map names to IP addresses
        and IP addresses to names.

      o "FQDN"

        An FQDN is a "fully qualified domain name".  A fully qualified
        domain name generally is a host name with at least one zone
        name; for example, "www.dhcp.org" is a fully qualified domain
        name.

      o "partner"

        A "partner", for the purposes of this document, refers to a
        failover server, typically the other failover server.  In many
        (if not most) cases, the failover protocol is symmetric with
        respect to the primary or secondary nature of the servers, and
        so it is often appropriate to discuss "updating the partner
        server", since it could be a primary server updating a
        secondary server or a secondary server updating a primary
        server.

      o "RR"

        "RR" is an abbreviation for "resource record".  All records in
        the DNS are resource records.  The resource records of most
        relevance to this document are the "A" resource record, which
        maps a DNS name to a particular IP address, the "PTR" resource
        record, which allows a "reverse map", from the IP address back
        to a DNS name, and the "KEY" resource record, which is used in
        ways defined in [DDNS] to tag a DNS name with the identity of
        the DHCP client with which it is associated.

      o "DDNS"

        An abbreviation for "Dynamic DNS", which refers to the
        capability to update a DNS server's name (actually resource
        record) database using an on-the-wire protocol defined in
        [RFC2136].

      o "binding-status"

        The binding-status is the status of an IP address with respect
        to its association with a client.  There are specific
        binding-status values defined for use by the failover protocol,
        e.g., ACTIVE, FREE, RELEASED, ABANDONED, etc.  These are
        designed to map more or less directly onto the binding-status
        values used internally in most DHCP server implementations.
        The term binding-status refers to the concept also sometimes
        known as "lease state" or "IP address state", but in this
        document the term "state" is reserved for the state of a
        failover endpoint, and binding-status is always used to refer
        to the state associated with an IP address or lease.
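
   The "lazy update" and "MCLT" definitions in the terminology list can
   be made concrete with a small sketch.  It is illustrative only and
   is not the procedure of section 5.2.1: the names Binding,
   grantable_expiry and LazyUpdater are hypothetical, times are plain
   seconds, and the partner is assumed to ACK exactly the expiry that
   was handed to the client.

```python
from dataclasses import dataclass, field


@dataclass
class Binding:
    """Lease data held by one failover server (hypothetical structure)."""
    ip: str
    client_expiry: float = 0.0         # expiry most recently given to the client
    partner_acked_expiry: float = 0.0  # latest expiry the partner has ACKed


def grantable_expiry(binding, now, desired_lease, mclt):
    """Cap a lease so it never runs more than MCLT past what the partner ACKed.

    Assumption: if the partner has ACKed nothing later than `now`, then
    `now` is the baseline, so the grant is at most MCLT seconds long.
    """
    baseline = max(binding.partner_acked_expiry, now)
    return min(now + desired_lease, baseline + mclt)


@dataclass
class LazyUpdater:
    """Queues partner updates so the client can be answered first."""
    pending: list = field(default_factory=list)

    def ack_client(self, binding, now, desired_lease, mclt):
        # The client is answered immediately; the partner update is only queued.
        binding.client_expiry = grantable_expiry(binding, now, desired_lease, mclt)
        self.pending.append((binding.ip, binding.client_expiry))
        return binding.client_expiry

    def flush(self, bindings):
        # Stand-in for sending one batch of updates and receiving the ACKs.
        batch, self.pending = self.pending, []
        for ip, expiry in batch:
            bindings[ip].partner_acked_expiry = expiry
        return batch
```

   With an MCLT of 3600 seconds and a desired lease of one day, the
   first grant at time 1000 is capped at 4600.  Only after flush() has
   recorded the partner's acknowledgement can a later renewal extend
   the lease further, again by at most MCLT beyond the acknowledged
   time.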

3.  Background and External Requirements

   This section highlights key aspects of the DHCP protocol on which the
   failover protocol depends.  It also discusses the requirements that
   the failover protocol places on other aspects of the network
   infrastructure, and some general issues surrounding server failure
   detection.  Some failure scenarios that provide particular challenges
   to a failover protocol are discussed.  Finally, the challenges
   inherent in using a TCP connection as a means to detect failure of a
   partner server are elaborated.

3.1.  Key aspects of the DHCP protocol

   The failover protocol is designed to augment the DHCP protocol as
   described in RFC 2131 [RFC 2131].  There are several key aspects of
   the DHCP protocol which are required by the failover protocol in
   order to successfully meet its design goals.

3.1.1.  Broadcast behavior

   There are two aspects of the broadcast behavior of the DHCP protocol
   which are key to making the failover protocol operate successfully.
   The first is simply that the DHCP protocol requires a DHCP client to
   broadcast all DHCPDISCOVER and DHCPREQUEST/INIT-REBOOT messages.
   Because of this requirement, a DHCP client who was communicating with
   one server will automatically be able to communicate with another
   server if one is available.

   The second aspect of broadcast behavior is similar to the first, but
   involves the distinction between a DHCPREQUEST/RENEW and
   DHCPREQUEST/REBINDING.  A DHCPREQUEST/RENEW is the message that a
   DHCP client uses to extend its lease.  It is unicast to the DHCP
   server from which it acquired the lease.  However, the DHCP protocol
   (in a farsighted move) was explicitly designed so that in the event
   that a DHCP client cannot contact the server from which it received a
   lease on an IP address using a DHCPREQUEST/RENEW, the client is
   required to broadcast its renewal using a DHCPREQUEST/REBINDING to
   any available DHCP server.  Since all DHCP clients were required to
   implement this algorithm, the failover protocol can have a different
   server from the one that initially granted a lease be the server to
   renew a lease.  Thus, one server can take over for another with no
   interruption in the service as experienced by the DHCP client or its
   associated applications software.
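
   The RENEW-then-REBINDING fallback described above can be sketched as
   follows.  This is a toy model, not a DHCP client: the Server class
   and extend_lease function are hypothetical names, the lease timers
   that govern when a real client switches from RENEW to REBINDING are
   omitted, and a "broadcast" is modelled as simply asking every
   server on the list.

```python
class Server:
    """Stand-in for a DHCP server; `reachable` simulates an outage."""

    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable

    def handle_request(self, ip):
        # Returns the responder's name in place of a DHCPACK, or None
        # when the server cannot be reached.
        return self.name if self.reachable else None


def extend_lease(ip, granting_server, all_servers):
    """Try a unicast RENEW first, then fall back to a broadcast REBINDING."""
    reply = granting_server.handle_request(ip)   # DHCPREQUEST/RENEW (unicast)
    if reply is not None:
        return ("RENEW", reply)
    for server in all_servers:                   # DHCPREQUEST/REBINDING (broadcast)
        reply = server.handle_request(ip)
        if reply is not None:
            return ("REBINDING", reply)
    return ("NO-REPLY", None)                    # no server answered


primary = Server("primary", reachable=False)
secondary = Server("secondary")
result = extend_lease("10.0.0.5", primary, [primary, secondary])
```

   Here the primary granted the lease but is unreachable, so the
   renewal is answered by the secondary through the REBINDING path.
   This is the property the failover protocol relies on.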

3.1.2.  Client responsibility

   In the failover DHCP protocol does not assume that the DHCP clients are entrusted with a client send-
   ing in an INIT-REBOOT request necessarily has consider-
   able responsibility.  In particular, after they are granted a valid lease
   on the an IP
   address appearing in the dhcp-requested-address option in the INIT-
   REBOOT request.

   The implications of this address, they are as follows: Assume enjoined to only use that there IP address while
   their lease is a valid.  Every DHCP client that gets a lease from one server while that server is unable expected to communicate with its failover partner.  Then, assume that after stop using an
   IP address if the expiration time on the lease has passed and if it
   cannot get an extension on the lease for that IP address from some
   DHCP server.  Thus, the correct behavior of every DHCP client reboots it in this
   regard is able only required to communicate with ensure the other
   failover server.  If integrity of the DHCP service.  On
   the failover servers have not been able to com-
   municate with each other during hand, incorrect behavior by a client in this process, then the area will tend
   to adversely affect at most one other DHCP client.

   Furthermore, any DHCP client
   will get which sends in a new IP address instead of being able DHCPREQUEST/RENEW or
   DHCPREQUEST/REBINDING to continue a DHCP server (either unicast for a RENEW or
   broadcast for a REBINDING) MUST still have time to use
   its existing run on the lease
   for that IP address. This will affect no applications  The DHCP server sends the DHCPACK back unicast
   to the IP address from which the RENEW or REBINDING originated.

   Given the existing responsibility placed on the DHCP
   client, since it is rebooting.  However, it will client to only use up an additional
   IP address in this marginal case.

3.1.3.  Stable storage update before DHCPACK

   The DHCP protocol allocates resources, when the lease is valid, and in order to operate
   correctly it requires that only send in a RENEW or
   REBINDING if the lease is valid, the failover protocol relies on DHCP server update some form of stable
   storage prior to sending a DHCPACK
   clients to perform responsibly and will, in the absence of conflict-
   ing information, believe a DHCP client in order to grant that client is attempting to RENEW or
   REBIND a lease on an IP address.

   One of address is the goals legitimate owner of the failover protocol is that it IP
   address.

   If clients do not follow these rules, it is possible for an address
   to be in use by more than one client.  For a single server, this
   happens because the server has leased the expired address to another
   client and the original client is also attempting to use the address.
   The server would NAK the renewal request.  This is made slightly
   worse in the failover protocol if the two servers are unable to
   communicate with each other and one server leases an available
   address to a new client while the other server receives a renewal
   from a different client.  In this case, both servers lease the same
   address to different clients for the duration of the MCLT.

   One troublesome issue is that of client responsibility when sending
   in DHCPREQUEST/INIT-REBOOT requests.  While the original DHCP RFC was
   written to require a DHCP client to have time left to run on the
   lease for an IP address if the client is sending an INIT-REBOOT
   request, it was sufficiently unclear that some client vendors didn't
   realize this until recently.  Since the INIT-REBOOT request was sent
   with a broadcast address, with the IP address in the
   dhcp-requested-address option and not in the ciaddr (for perfectly
   good reasons), the similarity to the RENEW and REBINDING case was
   lost on many people.

   At present, the failover protocol does not assume that a client
   sending in an INIT-REBOOT request necessarily has a valid lease on
   the IP address appearing in the dhcp-requested-address option in the
   INIT-REBOOT request.

   The implications of this are as follows: Assume that there is a DHCP
   client that gets a lease from one server while that server is unable
   to communicate with its failover partner.  Then, assume that after
   that client reboots it is able only to communicate with the other
   failover server.  If the failover servers have not been able to
   communicate with each other during this process, then the DHCP client
   will get a new IP address instead of being able to continue to use
   its existing IP address.  This will affect no applications on the
   DHCP client, since it is rebooting.  However, it will use up an
   additional IP address in this marginal case.

3.1.3.  Stable storage update before DHCPACK

   The DHCP protocol allocates resources, and in order to operate
   correctly it requires that a DHCP server update some form of stable
   storage prior to sending a DHCPACK to a DHCP client in order to grant
   that client a lease on an IP address.

   One of the goals of the failover protocol is that it not add
   significant additional time to this already time consuming
   requirement to update stable storage prior to a DHCPACK.  In
   particular, adding a requirement to communicate with another server
   prior to sending a DHCPACK would simplify the failover protocol, but
   it would limit the potential scalability of any DHCP server which
   employed the failover protocol in an unacceptable manner.
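   The ordering described in this section can be illustrated with a
   minimal sketch.  It is not taken from this draft: the StableStore
   class, the grant_lease function, the record layout, and the queue
   used for later partner updates are all hypothetical names chosen for
   illustration.  The point it shows is only that the local commit
   happens before the DHCPACK is produced, and that nothing is sent to
   the partner in between.

```python
import json
import os
from collections import deque


class StableStore:
    """Append-only lease log standing in for 'stable storage' (hypothetical)."""

    def __init__(self, path):
        self.path = path

    def commit(self, record):
        # Write the record and force it to disk before returning.
        with open(self.path, "a", encoding="utf-8") as log:
            log.write(json.dumps(record) + "\n")
            log.flush()
            os.fsync(log.fileno())


def grant_lease(store, partner_queue, client_id, ip, lease_seconds):
    """Commit locally, build the DHCPACK, and defer the partner update."""
    record = {"client": client_id, "ip": ip, "lease": lease_seconds}
    store.commit(record)                # 1. update stable storage first
    ack = dict(record, type="DHCPACK")  # 2. only then build the reply
    partner_queue.append(record)        # 3. the partner is told later
    return ack
```

   Because step 3 only queues the record, the time between receiving a
   request and returning the DHCPACK does not depend on the partner
   being reachable.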

3.2.  BOOTP relay agent implementation

   Many DHCP clients are not resident on the same network segment as a
   DHCP server.  In order to support this form of network architecture,
   most contemporary routers implement something known as a BOOTP Relay
   Agent.  This capability inside of a router listens for all broadcasts
   at the DHCP port, port 67, and will relay any broadcasts that it
   receives on to a DHCP server.  The IP address of the DHCP server must
   have been previously configured into the router.  As part of the
   relay process, the relay agent will place the address of the
   interface on which it received the broadcast into the giaddr field of
   the DHCP packet.

   Since the failover protocol requires two DHCP servers to receive any
   broadcast DHCP messages, in order to work with DHCP clients which are
   not local to the DHCP server, the BOOTP relay agent on the router
   closest to the DHCP client must be configured to point at more than
   one DHCP server.

   Most BOOTP relay agent implementations allow this duplication of
   packets.

   If this is not possible, an administrator might be able to configure
   the relay agent with a subnet broadcast address, but in this case the
   primary and secondary DHCP servers in a failover pair must both
   reside on the same subnet.  While this is a realistic configuration,
   it is not the one that most people will use.
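   As a rough illustration of the relay behavior described above, the
   following sketch models a relay agent that stamps giaddr and emits
   one copy of a client broadcast per configured server.  The
   DhcpPacket type, the relay function, and the addresses are
   hypothetical and greatly simplified; filling giaddr only when it is
   still zero is an assumption based on common relay agent practice and
   is not stated in this section.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DhcpPacket:
    """A drastically simplified DHCP message (hypothetical layout)."""
    op: str
    chaddr: str
    giaddr: str = "0.0.0.0"


def relay(packet, interface_addr, servers):
    """Return one (server, packet) pair per configured DHCP server."""
    if packet.giaddr == "0.0.0.0":
        # Record the address of the interface the broadcast arrived on.
        packet = replace(packet, giaddr=interface_addr)
    # Duplicate the packet so that both failover servers receive it.
    return [(server, packet) for server in servers]


discover = DhcpPacket(op="DHCPDISCOVER", chaddr="00:00:5e:00:53:01")
copies = relay(discover, "192.0.2.1", ["198.51.100.10", "198.51.100.20"])
```

   With two servers configured, each client broadcast yields two
   relayed copies carrying the same giaddr, which is what allows both
   members of a failover pair to see the request.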

3.3.  What does it mean if a server can't communicate with its partner?

   In any protocol designed to allow one server to take over some
   responsibilities from a partner server in the event of "failure" of
   that partner server, there is an inherent difficulty in determining
   when that partner server has failed.

   In fact, it is fundamentally impossible for one server to distinguish
   a network communications failure from the outright failure of the
   server with which it is trying to communicate.  In the case where
   each server is handing out resources (in this case IP addresses) to a
   client community, mistaking an inability to communicate with a
   partner server for failure of that partner server could easily cause
   both servers to be handing out the same IP addresses to different
   clients.

   One way that this is sometimes handled is for there to be more than
   two servers.  In the case of an odd number of servers, the servers
   that can still communicate with a majority of other servers will
   consider themselves operational, and any server which can't
   communicate with a majority of other servers must immediately cease
   operations.

   While this technique works in some domains, having the only server
   with which a DHCP client can communicate voluntarily shut itself down
   seems like something worth avoiding.

   The failover protocol will operate correctly while both servers are
   unable to communicate, whether they are both running or not.  At some
   point there may be resource contention, and if one of the servers is
   actually down, then the operator can inform the other server and the
   operational server will be able to use all of the downed server's
   resources.

   The protocol also allows detection of an orderly shutdown of a
   participating server.

3.4.  Challenging scenarios for a Failover protocol

   There exist two failure scenarios which provide particular challenges
   to the correctness guarantees of a failover protocol.

3.4.1.  Primary Server crash before "lazy" update:

   In the case where the primary server sends a DHCPACK to a client for
   a newly allocated IP address and then crashes prior to sending the
   corresponding update to the secondary server, the secondary server
   will have no record of the IP address allocation.  When the secondary
   server takes over, it may well try to allocate that IP address to a
   different client.  In the case where the first client to receive the
   IP address is not on the net at the time (yet while there was still
   time to run on its lease), an ICMP echo (i.e., ping) will not prevent
   the secondary server from allocating that IP address to a different
   client.

   The failover protocol deals with this situation by having the primary
   and secondary servers allocate addresses for new clients from
   disjoint address pools.  See section 5.4 for details.

   A more likely (in that DHCPRENEWs are presumably more common than
   DHCPDISCOVERs) and more subtle version of this problem is where the
   primary server crashes after extending a client's lease time, and
   before updating the secondary with a new time using a lazy update.
   After the secondary takes over, if the client is not connected to the
   network the secondary will believe the client's lease has expired
   when, in fact, it has not.  In this case as well, the IP address
   might be reallocated to a different client while the first client is
   still using it.

   This scenario is handled by the failover protocol through control of
   the lease time and the use of the maximum client lead time (MCLT).
   See section 5.2.1 for details.
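   The details of lease time control are given in section 5.2.1 and are
   not reproduced here.  The sketch below only illustrates the general
   idea and rests on an assumption: that a server never lets a client's
   lease run more than the MCLT beyond the expiration time its partner
   has acknowledged (or beyond the present time, if the partner has
   acknowledged nothing).  The function name and its parameters are
   hypothetical.

```python
def lease_time_to_grant(desired, now, partner_acked_expiry, mclt):
    """Cap a lease so it never runs more than MCLT past what the partner knows.

    All values are in seconds.  `partner_acked_expiry` is the absolute
    expiry time the partner last acknowledged for this address, or None
    if the partner has acknowledged nothing for it.
    """
    if partner_acked_expiry is None:
        known_to_partner = now
    else:
        known_to_partner = max(now, partner_acked_expiry)
    # The client may hold the address at most MCLT beyond that point.
    return min(desired, known_to_partner + mclt - now)


# A new client asks for a day, but the partner knows nothing yet,
# so the first lease is limited to the MCLT (one hour here).
first = lease_time_to_grant(desired=86400, now=1000,
                            partner_acked_expiry=None, mclt=3600)
```

   Under this assumption, a secondary that takes over after a crash
   knows that no lease can extend more than the MCLT beyond the last
   expiration time it acknowledged, even if a lazy update was lost.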

3.4.2.  Network partition where DHCP servers can't communicate but each
can talk to clients:

   Several conditions are required for this situation to occur.  First,
   due to a network failure, the primary and secondary servers cannot
   communicate.  As well, some of the DHCP clients must be able to
   communicate with the primary server, and some of the clients must now
   only be able to communicate with the secondary server.  When this
   condition occurs, both primary and secondary servers could attempt to
   allocate IP addresses for new clients from the same pool of available
   addresses.  At some point, then, two clients will end up being
   allocated the same IP address.  This will cause problems when the
   network failure that created this situation is corrected.

   The failover protocol deals with this situation by having the primary
   and secondary servers allocate addresses for new clients from
   disjoint address pools.  See section 5.4 for details.
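   The effect of disjoint pools can be sketched as follows.  This is an
   illustration only and not the allocation mechanism of section 5.4:
   the split_free_addresses and allocate helpers, and the choice to
   give the secondary the highest-numbered free addresses, are
   assumptions made for the example.

```python
import ipaddress


def split_free_addresses(free, secondary_count):
    """Partition free addresses into disjoint primary and secondary pools."""
    ordered = sorted(free)
    secondary = set(ordered[-secondary_count:]) if secondary_count else set()
    primary = set(ordered) - secondary
    return primary, secondary


def allocate(pool):
    """Take the lowest free address from a server's own pool, if any."""
    if not pool:
        return None
    address = min(pool)
    pool.remove(address)
    return address


free = list(ipaddress.ip_network("192.0.2.0/28").hosts())
primary_pool, secondary_pool = split_free_addresses(free, secondary_count=4)

# During a partition each server allocates only from its own pool,
# so a new client can never be handed an address the other server
# might also give out.
from_primary = allocate(primary_pool)
from_secondary = allocate(secondary_pool)
```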

3.5.  Using TCP to detect partner server failure

   There are several characteristics of TCP that are important to the
   functioning of the failover protocol, which uses one TCP connection
   for both bulk data transfer as well as to assess communications
   integrity with the other server.  Reliable and ordered message
   delivery are chief among these important characteristics.

   It would be nice to use the capabilities built in to TCP to allow it
   to determine if communications integrity exists to the failover
   partner but this strategy contains some problems which require
   analysis.  There exist three fundamental cases for an open TCP
   connection that must be examined.

      1.  When no data is being sent then no messages are traveling
          across the TCP connection.

      2.  When data is queued to be sent, and the receiver has not
          blocked the sending of additional data, then messages are
          flowing across the TCP connection containing the application's
          data.

      3.  When data is queued to be sent, and the receiver has blocked
          the transmission of additional data, then persist messages are
          flowing from the receiver to the sender to ensure that the
          sender doesn't miss the receiver opening the window for
          further transmissions.

   The first case can be turned into the second case by sending
   application-level keep-alive messages periodically when there is no
   other data queued to be sent.  Note TCP keep-alive messages might be
   used as well, but they present additional problems.

   Thus, we can ensure that the TCP connection has messages flowing
   periodically across the connection fairly easily.  The question
   remains as to what TCP will do if the other end of the connection
   fails to respond (either because of network partition or because the
   receiving server crashes).  TCP will attempt to retransmit a message
   with an exponential backoff, and will eventually time out that
   retransmission.  However, the length of that timeout cannot, in
   general, be set on a per-connection basis, and is frequently as long
   as nine minutes, though in some cases it may be as short as two
   minutes.  On some systems it can be set system-wide, while on some
   systems it cannot be changed at all.

   A value for this timeout that would be appropriate for the failover
   protocol, say less than 1 minute, could have unpleasant side-effects
   on other applications running on the same server, assuming that it
   could be changed at all on the host operating system.

   Nine minutes is a long time for the DHCP service to be unavailable to
   any new clients that were being served by the server which has
   crashed, when there is another server running that could respond to
   them immediately as soon as it determines that its partner is not
   operational.

   The conclusion drawn from this analysis is that TCP provides very
   useful support for the failover protocol in the areas of reliable and
   ordered message delivery, but cannot by itself be relied upon to
   detect partner server failure in a fashion acceptable to the needs of
   the failover protocol.  Additional failover protocol capabilities
   will need to be created to support timely detection of partner server
   failure.  See section 8.3 for details on this mechanism.
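   The kind of application-level check this analysis points toward can
   be sketched as a pair of timers kept above TCP.  This is not the
   mechanism of section 8.3; the PartnerLink class, its method names,
   and the interval and timeout values are hypothetical and chosen only
   to show how a silent partner could be noticed well before a TCP
   retransmission timeout would fire.

```python
class PartnerLink:
    """Tracks message flow on one failover connection (times in seconds)."""

    def __init__(self, keepalive_interval, timeout, now):
        self.keepalive_interval = keepalive_interval
        self.timeout = timeout
        self.last_sent = now
        self.last_received = now

    def note_sent(self, now):
        self.last_sent = now

    def note_received(self, now):
        self.last_received = now

    def keepalive_due(self, now):
        # Nothing has been sent for a while: send a keep-alive so that
        # the idle case becomes the "data is flowing" case.
        return now - self.last_sent >= self.keepalive_interval

    def partner_silent(self, now):
        # Nothing has arrived within the timeout: treat communications
        # as interrupted without waiting for TCP to give up.
        return now - self.last_received > self.timeout


link = PartnerLink(keepalive_interval=10, timeout=30, now=0)
```

   Passing the current time explicitly keeps the sketch deterministic;
   a real server would read a monotonic clock instead.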

4.  Design Goals

   This section lists the design requirements, the design goals, and the
   limitations of the failover protocol.

4.1.  Design requirements for this protocol

   The following list of requirements must be (and are) met by this
   protocol.  They are listed in priority order.

      1.  Implementations of this protocol must work with existing DHCP
          client implementations based on the DHCP protocol [1].

      2.  Implementations of the protocol must work with existing BOOTP
          relay agent implementations.

      3.  The protocol must provide failover redundancy between servers
          that are not located on the same subnet.

4.2.  Goals for this protocol

   The following goals are met by this protocol as well, though they are
   less important than the requirements listed above.  These goals are
   listed in priority order.

      1.  Provide for continued service to DHCP clients through an
          automated mechanism in the event of failure of the primary
          server.

      2.  Avoid binding an IP address to a client while that binding is
          currently valid for another client.  In other words, do not
          allocate the same IP address to two clients.

      3.  Minimize any need for manual administrative intervention.

      4.  Introduce no additional delays in server response time as a
          result of the network communications required to implement the
          failover protocol, i.e., don't require communications with the
          partner between the receipt of a DHCPREQUEST and the
          corresponding DHCPACK.

      5.  Share IP address ranges between primary and secondary servers;
          i.e., impose no requirement that the pool of available
          addresses be divided between servers.

      6.  Continue to meet the goals and objectives of this protocol in
          the event of server failure or network partition.

      7.  Provide graceful reintegration of full protocol service after
          server failure or network partition.

      8.  Allow for one computer to act as a secondary server for
          multiple primary servers.  Other topologies (e.g., mesh) are
          also possible.  Primary and secondary servers SHOULD be viewed
          as "logical" servers and not necessarily physical computers.

      9.  Ensure that an existing client can keep its existing IP
          address binding if it can communicate with either the primary
          or secondary DHCP server implementing this protocol, not just
          whichever server originally offered it the binding.

      10. Ensure that a new client can get an IP address from some
          server.  Ensure that in the face of partition, where servers
          continue to run but cannot communicate with each other, the
          above goals and requirements may be met.  In addition, when
          the partition condition is removed, allow graceful automatic
          re-integration without requiring human intervention.

      11. If either primary or secondary server loses all of the
          information that it has stored in stable storage, it should be
          able to refresh its stable storage from the other server.

      12. Support load balancing between the primary and secondary
          servers, and allow configuration of the percentage of the
          client population served by each with a moderately fine
          granularity.

4.3.  Limitations of this Protocol

   The following are explicit limitations of this protocol.

      1.  This protocol provides only one level of redundancy through a
          single secondary server for each primary server.

      2.  A subset of the address pool is reserved for secondary server
          use.  In order to handle the failure case where both servers
          are able to communicate with DHCP clients, but unable to
          communicate with each other, a subset of the IP address pool
          must be set aside as a private address pool for the secondary
          server.  The secondary can use these to service newly arrived
          DHCP clients during such a period.  The size of this private
          pool SHOULD be based only on the arrival rate of new DHCP
          clients and the length of expected downtime, and is not
          influenced in any way by the total number of DHCP clients
          supported by the server pair.

          The failover protocol can be used in a mode where both the
          primary and secondary servers can share the load between them
          when both are operating.  In this loadbalancing mode, the
          addresses allocated by the primary server to the secondary
          server are not unused, but are used instead to service the
          portion of the client base to which the secondary server is
          required to respond.  See section 5.3 for more information
          on loadbalancing.

      3.  The primary and secondary servers do not respond to client
          requests at all while recovering from a failure that could
          have resulted in duplicate IP assignments (that is, when
          synchronizing in POTENTIAL-CONFLICT state).
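   The sizing guidance in limitation 2 above amounts to simple
   arithmetic, sketched below.  The function name, its parameters, and
   the example figures are assumptions made for illustration; the draft
   itself states only that the size depends on the arrival rate of new
   clients and the expected downtime.

```python
import math


def private_pool_size(new_clients_per_hour, expected_downtime_hours):
    """Addresses the secondary needs to serve new clients during an outage."""
    return math.ceil(new_clients_per_hour * expected_downtime_hours)


# 20 new clients per hour over an expected 4 hour outage.
size = private_pool_size(new_clients_per_hour=20, expected_downtime_hours=4)
```

   Note that the total number of clients served by the pair does not
   appear anywhere in the calculation, which matches the statement in
   limitation 2.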

5.  Protocol Overview

   This section will discuss the failover protocol at a relatively high
   level of detail.  In the event that a description in this section
   conflicts (or appears to conflict due to the overview nature of this
   section) with information in later sections of this draft, the infor-
   mation in the later sections should be considered authoritative.

5.1.  Messages and States

   This protocol is centered around the message exchange used by one
   server to update the other server of binding database changes
   resulting from DHCP client activity:

      o Communication of binding database changes

        The binding update (BNDUPD) message is used to send the binding
        database changes to the partner server, and the partner server
        responds with a binding acknowledgement (BNDACK) message when it
        has successfully committed those changes to its own stable
        storage.

   All of the other messages involve ancillary issues:

      o Management of available IP addresses

        The pool request (POOLREQ) message is used by the secondary
        server to request an allocation of IP addresses from the primary
        server.  The pool response (POOLRESP) message is used by the
        primary server to inform the secondary server how many IP
        addresses were allocated to the secondary server as the result
        of the pool request.

      o Synchronization of binding databases between the servers
        after they've been out of communications

        The update request (UPDREQ) message is used by one server to
        request that its partner send it all binding database informa-
        tion that it has not already seen.  The update request all
        (UPDREQALL) message is used by one server to request that all
        binding database information be sent, in order to recover from
        a total loss of the requesting server's binding database.

        The update done (UPDDONE) message is used by the responding
        server to indicate that all requested updates have been sent by
        the responding server and acked by the requesting server.

      o Connection establishment

        The connect (CONNECT) message is used by the primary server to
        establish a high level connection with the other server, and to
        transmit several important configuration data items between the
        servers.  The connect acknowledgement (CONNECTACK) message is
        used by the secondary server to respond to a CONNECT message
        from the primary server.  The disconnect (DISCONNECT) message is
        used by either server when closing a connection.

      o Server synchronization

        The state change (STATE) message is used by either server to
        inform the other server of a change of failover state.

      o Connection integrity management

        The contact (CONTACT) message is used by either server to ensure
        that the other server continues to see the connection as opera-
        tional.  It MUST be transmitted periodically over every estab-
        lished connection if other message traffic is not flowing, and
        it MAY be sent at any time.

5.1.1.  Failover endpoints

   The proper operation of the failover protocol requires more than the
   transmission of messages between one server and the other.  Each end-
   point might seem to be a single DHCP server, but in fact there are
   many situations where additional flexibility in configuration is use-
   ful.

   For instance, there might be several servers which are each primary
   for a distinct set of address pools, and one server which is secon-
   dary for all of those address pools.  The situation with the pri-
   maries is straightforward, but the secondary will need to maintain a
   separate failover state, partner state, and communications up/down
   status for each of the separate primary servers for which it is act-
   ing as a secondary.

   The failover protocol calls for there to be a unique failover end-
   point per partner per role (where role is primary or secondary).
   This failover endpoint can take actions and hold unique states.
   There are thus a maximum of two failover endpoints per partner (one
   for the partner as a primary and one for that same partner as a
   secondary).

   Thus, in the case where there are two primary servers A and B each
   backed up by a single common secondary server C, there is one fail-
   over endpoint on each of A and B, and two different failover end-
   points on C.  The two different failover endpoints on C each have
   unique states and independent TCP connections.

   This document describes the behavior of the protocol in terms of pri-
   mary and secondary servers, not primary and secondary failover end-
   points.  However, it is important to remember that every 'server'
   described in this document is in reality a failover endpoint that
   resides in a particular process, and that many failover endpoints may
   reside in the same process.

   It is not the case that there is a unique failover endpoint for each
   subnet that participates in a failover relationship.  On one server,
   there is one failover endpoint per partner per role, regardless of
   how many subnets or address pools are managed by that combination of
   partner and role.  Conversely, on a particular server, any given sub-
   net or pool will be associated with exactly one failover endpoint.

   When a connection is received from the partner, the unique failover
   endpoint to which the message is directed is determined solely by the
   IP address of the partner and the setting of the SECONDARY bit in the
   'flags' field of the CONTACT message.

   Throughout this document, the states and actions taken by "servers"
   are described.  The terms "server", "primary server", and "secondary
   server" are commonly used to describe the failover endpoint taking
   these states and performing these actions.  This description is
   wholly accurate only for the simplest of cases, where all of the
   address pools on one server are backed up by all of the address pools
   on another server.  In this case, there is a single failover endpoint
   in each server.  In all other cases, the term "server" is used to
   describe one of the two possible failover endpoints per partner.
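
   As an illustration only, the endpoint rules above can be sketched as
   a small lookup table keyed by partner address and role.  The names
   Role, FailoverEndpoint and EndpointTable are hypothetical and do not
   come from this protocol, and the sketch assumes that a set SECONDARY
   bit means the sending partner is acting as a secondary.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional, Tuple


class Role(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"


@dataclass
class FailoverEndpoint:
    partner_ip: str
    partner_role: Role            # role the partner plays toward this endpoint
    state: Optional[str] = None   # each endpoint holds its own failover state


class EndpointTable:
    """Holds at most one endpoint per (partner, role) pair."""

    def __init__(self) -> None:
        self._endpoints: Dict[Tuple[str, Role], FailoverEndpoint] = {}

    def add(self, partner_ip: str, partner_role: Role) -> FailoverEndpoint:
        key = (partner_ip, partner_role)
        if key in self._endpoints:
            raise ValueError("endpoint already configured for this partner and role")
        endpoint = FailoverEndpoint(partner_ip, partner_role)
        self._endpoints[key] = endpoint
        return endpoint

    def lookup(self, partner_ip: str, secondary_bit: bool) -> FailoverEndpoint:
        # Assumption: a set SECONDARY bit means the sender acts as secondary.
        role = Role.SECONDARY if secondary_bit else Role.PRIMARY
        return self._endpoints[(partner_ip, role)]

    def __len__(self) -> int:
        return len(self._endpoints)


# Secondary server C backing up primaries A and B: two endpoints on C.
table_on_c = EndpointTable()
endpoint_a = table_on_c.add("192.0.2.1", Role.PRIMARY)
endpoint_b = table_on_c.add("192.0.2.2", Role.PRIMARY)
```

   Keying on the pair rather than on a subnet reflects the statement
   that there is one endpoint per partner per role, however many
   subnets or address pools that relationship covers.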

5.2.  Fundamental restrictions

   There are several fundamental restrictions this protocol places on
   what one server can do in the absence of knowledge of the other
   server, and these restrictions are key to the correct operation of
   the protocol.

5.2.1.  Control of lease time

   The key problem with lazy update is that when a server fails after
   updating a client with a particular lease time and before updating
   its partner, the partner will believe that a lease has expired even
   though the client still retains a valid lease on that IP address.

   In order to handle this problem, a period of time known as the "Max-
   imum Client Lead Time" (MCLT) is defined and must be known to both
   the primary and secondary servers.  Proper use of this time interval
   places an upper bound on the difference allowed between the lease
   time provided to a DHCP client by a server and the lease time known
   by that server's partner.  However, the MCLT is typically much less
   than the lease time that a server has been configured to offer a
   client, and so some strategy must exist to allow a server to offer
   the configured lease time to a client.  During a lazy update the
   updating server typically updates its partner with a potential
   expiration time which is longer than the lease time previously given
   to the client and which is longer than the lease time that the server
   has been configured to give a client.  This allows that server to
   give a longer lease time to the client the next time the client
   renews its lease, since the time that it will give to the client will
   not exceed the MCLT beyond the potential expiration time acknowledged
   by the partner.

   The PARTNER-DOWN state exists so that a server can be sure that its
   partner is, indeed, down.  Correct operation while in that state
   requires (generally) that the server wait the MCLT after anything
   that happened prior to its transition into PARTNER-DOWN state (or,
   more accurately, when the other server went down if that is known).
   Thus, the server MUST wait the Maximum Client Lead Time after the
   partner server went down before allocating any of the partner's FREE
   addresses.  In the event the partner was not in communication prior
   to going down, it might have allocated one or more of its FREE
   addresses to a DHCP client and been unable to inform the server
   entering PARTNER-DOWN prior to going down itself.  By waiting the
   MCLT after the time the partner went down, the server in PARTNER-DOWN
   state ensures that any clients which have a lease on one of the
   partner's FREE addresses will either time out or contact the server
   in PARTNER-DOWN by the time that period ends.

   In addition, once a server has transitioned to PARTNER-DOWN state, it
   MUST NOT reallocate an IP address from one client to another client
   until an additional MCLT interval after the lease held by the
   original client expires.  (More precisely, until the maximum client
   lead time after what it believes to be the lease expiration time of
   the first client.)

   Some optimizations exist for this restriction, in that it only
   applies to leases that were issued BEFORE entering PARTNER-DOWN. Once
   a server has entered PARTNER-DOWN and it leases out an address, it
   need not wait this time as long as it has never communicated with the
   partner since the lease was given out.
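
   The two PARTNER-DOWN waiting periods and the optimization described
   above might be computed as in the following sketch.  The function
   and parameter names are hypothetical, and the sketch assumes the
   server knows (or has estimated) the time at which its partner went
   down.

```python
from datetime import datetime, timedelta


def partner_free_address_usable_at(partner_down_at: datetime,
                                   mclt: timedelta) -> datetime:
    """Earliest time the partner's FREE addresses may be allocated."""
    return partner_down_at + mclt


def reallocation_allowed_at(lease_expires_at: datetime,
                            mclt: timedelta,
                            issued_in_partner_down: bool = False,
                            contacted_partner_since: bool = False) -> datetime:
    """Earliest time a leased address may be given to a different client."""
    # Optimization: a lease issued after entering PARTNER-DOWN needs no
    # extra wait, provided the partner has not been contacted since.
    if issued_in_partner_down and not contacted_partner_since:
        return lease_expires_at
    # Otherwise wait one MCLT past the expiration time this server knows.
    return lease_expires_at + mclt
```

   The extra MCLT covers the case where the partner extended the lease
   shortly before it failed and was unable to report that extension.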

   The fundamental relationship on which much of the correctness of this
   protocol depends is that the lease expiration time known to a DHCP
   client MUST NOT be more than the maximum client lead time greater
   than the potential expiration time known to a server's partner.

   The remainder of this section makes the above fundamental relation-
   ship more explicit.

   This protocol requires a DHCP server to deal with several different
   lease intervals and places specific restrictions on their relation-
   ships.  The purpose of these restrictions is to allow the other
   server in the pair to be able to make certain assumptions in the
   absence of an ability to communicate between servers.

   The different lease times are:

      o desired lease interval

        The desired lease interval is the lease interval that a DHCP
        server would like to give to a DHCP client in the absence of any
        restrictions imposed by the failover protocol.  Its determina-
        tion is outside of the scope of this protocol.  Typically this
        is the result of external configuration of a DHCP server.

      o actual lease interval

        The actual lease interval is the lease interval that a DHCP
        server gives out to a DHCP client in the dhcp-lease-time option
        of a DHCPACK packet.  It may be shorter than the desired client
        lease interval (as explained below).

      o potential lease interval

        The potential lease interval is the lease expiration interval
        that the local server reports to its partner in the potential-
        expiration-time option of a BNDUPD message.

      o acknowledged potential lease interval

        The acknowledged potential lease interval is the potential lease
        interval the partner server has most recently acknowledged in
        the potential-expiration-time option of a BNDACK message.

   The key restriction (and guarantee) that any server makes with
   respect to lease intervals is that the actual client lease interval
   never exceeds the acknowledged potential lease interval (if any) by
   more than a fixed amount.  This fixed amount is called the "Maximum
   Client Lead Time" (MCLT).

   The MCLT MAY be configurable on the primary server, but for correct
   server operation it MUST be the same and known to both the primary
   and secondary servers.  The secondary server determines the MCLT from
   the MCLT option sent from the primary server to the secondary server
   in the CONNECT message.

   A server MUST record in its stable storage both the actual lease
   interval and the most recently acknowledged potential lease interval
   for each IP address binding.  It is assumed that the desired client
   lease interval can be determined through techniques outside of the
   scope of this protocol.  See section 7.1.4 for more details concern-
   ing the times that the server MUST record in its stable storage and
   the way that they interact with the lease time that may be offered to
   a DHCP client.

   Again, the fundamental relationship among these times which MUST be
   maintained is:

       actual lease interval <
       ( acknowledged potential lease interval + MCLT )
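
   A server could test this relationship before answering a client, as
   in the minimal sketch below.  The function name is hypothetical.
   The comparison is written as "<=" because the prose restriction
   ("never exceeds ... by more than") and a first lease of exactly one
   MCLT both allow equality; that is one reading of the text, not
   something this section settles.

```python
from datetime import timedelta


def lease_within_bound(actual: timedelta,
                       acknowledged_potential: timedelta,
                       mclt: timedelta) -> bool:
    """True if the actual lease does not lead the partner by more than MCLT."""
    # Assumption: equality is permitted (see the note above the code).
    return actual <= acknowledged_potential + mclt
```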

   Figure 5.1-1 illustrates an initial lease to a client using the rules
   discussed in the example which follows it.

              DHCP                 Primary             Secondary
       time   Client               Server               Server

                | (time in intervals) |  (absolute time)   |
                |                     |                    |
                | >-DHCPDISCOVER->    |                    |
                |     <---DHCPOFFER-< |                    |
                |                     |                    |
                | >-DHCPREQUEST->     |                    |
                |   (selecting)       |                    |
                |                     |                    |
         t      |  <--------DHCPACK-< |                    |
                |  lease-time=MCLT    |                    |
                |                     |    >-BNDUPD-->     |
                |                     |  lease-expiration=t+MCLT
                |                     |  potential-expiration=t+(MCLT/2)+X
                |                     |                    |
                |                     |     <-BNDACK-<     |
                |                     |  potential-expiration=t+(MCLT/2)+X
               ...                   ...                  ...
                |                     |                    |
      t+MCLT/2  | >-DHCPREQUEST->     |                    |
                |      (renew)        |                    |
                |                     |                    |
         t1     |  <--------DHCPACK-< |                    |
                |   lease-time=X      |                    |
                |                     |    >-BNDUPD-->     |
                |                     |  lease-expiration=t1+X
                |                     |  potential-expiration=t1+(X/2)+X
                |                     |                    |
                |                     |     <-BNDACK-<     |
                |                     |  potential-expiration=t1+(X/2)+X
               ...                   ...                  ...

           Figure 5.1-1:  Lazy Update Message Traffic
                          X = Desired Lease Interval

   DISCUSSION:

      This protocol mandates no algorithm concerning these lease inter-
      vals, as long as the above fundamental relationship is preserved.

      In the interests of clarity, however, let's examine a specific
      example.  The MCLT in this case is 1 hour.  The desired lease
      interval is 3 days, and its renewal time is half the lease inter-
      val.

      The rules for this example are:

      o What to tell the client:

        Take the remainder of the acknowledged potential lease interval.
        If this is a new lease, then this value will be zero.  If this
        remainder plus the MCLT is greater than the desired lease inter-
        val, give the client the desired lease interval; otherwise give
        the client the remainder plus the MCLT.

      o What to tell the failover partner server:

        Take the renewal interval (typically half of the actual client
        lease interval), add to it the desired lease interval, and add
        it to the current time to yield the value that goes into the
        potential-expiration-time option.

        Also tell the failover partner server the actual lease interval
        by adding it to the current time to yield the value that goes
        into the lease-expiration option.
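
      The two rules above can be sketched as follows.  This is only
      one possible algorithm, since the protocol mandates none.  The
      function names and the dictionary keys are hypothetical, and the
      renewal interval is assumed to be half of the actual lease
      interval.

```python
from datetime import datetime, timedelta
from typing import Optional


def actual_lease_interval(desired: timedelta,
                          acked_potential_expiry: Optional[datetime],
                          now: datetime,
                          mclt: timedelta) -> timedelta:
    """What to tell the client."""
    if acked_potential_expiry is None:
        remainder = timedelta(0)        # new lease: nothing acknowledged yet
    else:
        remainder = max(acked_potential_expiry - now, timedelta(0))
    limit = remainder + mclt
    return desired if limit > desired else limit


def partner_update(now: datetime,
                   actual: timedelta,
                   desired: timedelta,
                   renewal_fraction: float = 0.5) -> dict:
    """What to tell the failover partner in the BNDUPD message."""
    renewal = actual * renewal_fraction
    return {
        "lease_expiration": now + actual,
        "potential_expiration": now + renewal + desired,
    }


MCLT = timedelta(hours=1)
DESIRED = timedelta(days=3)

t = datetime(1999, 10, 1, 12, 0)
first = actual_lease_interval(DESIRED, None, t, MCLT)
first_update = partner_update(t, first, DESIRED)

# The client renews half way through its first lease.
t1 = t + first / 2
second = actual_lease_interval(
    DESIRED, first_update["potential_expiration"], t1, MCLT)
second_update = partner_update(t1, second, DESIRED)
```

      With an MCLT of 1 hour and a desired lease interval of 3 days,
      the first lease is 1 hour and the renewal receives the full 3
      days, matching lease-time=MCLT and lease-time=X in Figure 5.1-1.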

      In operation this might work as follows:

      When a server makes an offer for a new lease on an IP address to a
      DHCP client, it determines the desired lease interval (in this
      case, 3 days).  It then examines the acknowledged potential lease
      interval (which in this case is zero) and determines the remainder
      of the time left to run, which is also zero.  To this it adds the
      MCLT.  Since the actual lease interval cannot be allowed to exceed
      the remainder of the current acknowledged potential lease interval
      plus the MCLT, the offer made to the client is for the remainder
      of the current acknowledged potential lease interval (i.e., zero)
      plus the MCLT.  Thus, the actual lease interval is 1 hour.

      Once the concept.

   An "available" IP address on a server may be allocated to any client.
   An IP address which was leased has performed the ACK to a client and which expired or was
   released by that client would take on a new state, EXPIRED or
   RELEASED respectively.  The partner server would then be notified
   that this IP address was EXPIRED or RELEASED through a BNDUPD.  When the sending DHCP client, it will
      update the secondary server received with the BNDACK for that IP address showing it
   was FREE, it would move lease information. However,
      the IP address from EXPIRED or RELEASED to
   FREE, and it would desired potential lease interval will be available for allocation by composed of the primary server
   to any clients.

   A server MAY reallocate an IP address in one
      half of the EXPIRED or RELEASED
   state current actual lease interval added to the same client with no restrictions.

5.3.  Load balancing

   In order to implement load balancing between a primary and desired
      lease interval. Thus, the secondary server pair, each is updated with a
      BNDUPD with a lease interval of 3 days + 1/2 hour specified in the
      potential-expiration-time option.

      When the primary server must respond to DHCPDISCOVER requests from
   some clients and not from other clients.  In order receives an ACK to do this suc-
   cessfully, each its update of the
      secondary server's (partner's) potential lease interval, it
      records that as the acknowledged potential lease interval.  A
      server must be able MUST NOT send a BNDACK in response to determine immediately upon
   receipt of a DHCP client request whether BNDUPD message
      until it is to service this
   request or to ignore it sure that the information in order to allow the other server to service BNDUPD message
      resides in its stable storage.  Thus, the request.

   In addition, it should primary server in this
      case can be possible to configure sure that the percentage of
   clients which will be serviced by either secondary server has recorded the poten-
      tial lease interval in its stable storage when the primary or server
      receives a BNDACK message from the secondary server.  This configuration should be more or less continuous, from
   all serviced by

      When the primary through an even split with half serviced
   by each, DHCP client attempts to all serviced by renew at T1 (approximately one
      half an hour from the secondary.

   The technique chosen to support these goals is to define a hash func-
   tion which must be applied to start of the client-identifier or to lease), the htype
   concatenated with primary server
      again determines the chaddr if no client-identifier desired lease interval, which is specified.
   The results of still 3
      days.  It then compares this hash function yields a number between 0 with the remaining acknowledged
      potential lease interval (3 days + 1/2 hour) and 255
   which maps into one of 256 "hash-buckets".  Each hash bucket is
   assigned to one server or adjusts for the other by
      time passed since the primary server whenever a
   connection is established, through use secondary was last updated (1/2 hour).  Thus
      the time remaining of the hash-bucket-assignment
   option.

   The hash-bucket-assignment option uses a 32 octet value field (con-
   taining 256 bits), with one bit associated with each possible hash
   bucket.  If acknowledged potential lease interval is
      3 days.  Adding the bit corresponding MCLT to a hash bucket is a this yields 3 days plus 1 in hour, which
      is more than the
   hash-bucket-assignment option, then desired lease interval of 3 days.  So the secondary server is required
   to service all DHCP client requests that map into that hash bucket
   when in NORMAL state.

   For example, if
      is renewed for the desired lease interval -- 3 days.

      When the primary DHCP server sends a hash-bucket-assignment
   option to the secondary with the following 32 octets:

                                  buckets
       FF FF FF FF FF FF FF FF  ( 0   - 63 )
       FF FF FF FF FF FF FF FF  ( 64  - 127 )
       00 00 00 00 00 00 00 00  ( 128 - 191 )
       00 00 00 00 00 00 00 00  ( 192 - 255 )

   then updates the secondary MUST service any DHCP client requests where server
      after the
   client-identifier or htype concatenated with DHCP client's renewal ACK is complete, it will calculate
      the chaddr hashs into desired potential lease interval as the bucket values T1 fraction of 0 through 127.

   See section 12 for the code to implement the hash bucket algorithm.
   Each server MUST implement
      actual client lease interval (1/2 of 3 days this same algorithm in order for all
   clients to get service.

5.4.  Operating in NORMAL state

   When in NORMAL state, each server services DHCPDISCOVER's and all
   other DHCP requests other than DHCPREQUEST/RENEWAL or
   DHCPREQUEST/REBINDING from time = 1.5 days).
      To this it will add the desired client set defined by the load balanc-
   ing algorithm.  Each lease interval of 3 days,
      yielding a total desired partner server services DHCPREQUEST/RENEWAL or
   DHCPDISCOVER/REBINDING requests from any client. lease interval of 4.5
      days.  In general, whenever the binding database is changed in stable
   storage, then a BNDUPD message is sent with this way, the contents of that
   change primary attempts to have the partner server.  The partner server then writes secondary
      always "lead" the
   information about that binding in its bindings database in stable
   storage and replies with a BNDACK message.

5.5.  Operating in COMMUNICATIONS-INTERRUPTED state

   When operating client in COMMUNICATIONS-INTERRUPTED state, each server is
   operating independently, but does not assume that its partner is not
   operating.  The partner server might be operating and simply unable understanding of the client's
      lease interval so as to communicate with this server, or might not be operating.

   Each server responds able to always offer the full range of DHCP client messages that
   it receives, but in such a way that graceful reintegration the
      desired client lease interval.

      Once the initial actual client lease interval of the MCLT is alway
   possible when its partner comes back into contact with it.

5.6.  Operating in PARTNER-DOWN state

   When operating past,
      the protocol operates effectively like the DHCP protocol does
      today in PARTNER-DOWN state, a server assumes that its
   partner is not currently operating, but does make allowances for behavior concerning lease intervals. However, the
   possibility
      guarantee that that server was operating in the past.  It responds
   to all DHCP actual client requests in PARTNER-DOWN state.

   Any transactions that lease interval will never exceed
      the remaining acknowledged partner server may have had with DHCP
   clients but been unable to communicate to this server are allowed for
   in lease interval by more
      than the algorithms that are used to gradually take over MCLT allows full control recovery from a variety of all failures.
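
   The arithmetic of the worked example above can be sketched as
   follows.  This is only an illustration: the function and variable
   names are hypothetical, times are plain seconds, the MCLT is taken
   to be 1 hour as in the example, and T1 is assumed to be one half of
   the actual lease interval.

```python
HOUR = 3600
DAY = 24 * HOUR


def actual_lease_interval(desired, ack_potential_remaining, mclt):
    """Interval given to the client: never more than the remaining
    acknowledged potential lease interval plus the MCLT."""
    return min(desired, ack_potential_remaining + mclt)


def potential_lease_interval(actual, desired, t1_fraction=0.5):
    """Interval sent to the partner in a BNDUPD: the T1 fraction of
    the actual interval plus the desired interval (assumed T1 = 1/2)."""
    return actual * t1_fraction + desired


mclt = 1 * HOUR        # assumed MCLT, as in the example
desired = 3 * DAY      # desired lease interval

# First lease: nothing has been acknowledged by the partner yet.
first_actual = actual_lease_interval(desired, 0, mclt)
first_potential = potential_lease_interval(first_actual, desired)

# Renewal at T1: half of the first actual interval has elapsed.
elapsed = first_actual * 0.5
remaining = first_potential - elapsed
renew_actual = actual_lease_interval(desired, remaining, mclt)
renew_potential = potential_lease_interval(renew_actual, desired)
```

   With these inputs the first lease is 1 hour with a potential
   interval of 3 days + 1/2 hour, and the renewal is for 3 days with a
   potential interval of 4.5 days, matching the example.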

5.2.2.  Controlled re-allocation of IP addresses

   When in PARTNER-DOWN state there is a waiting period after which an
   IP address can be re-allocated to another client.  For leases which
   are available when the server enters PARTNER-DOWN state, the period
   is the MCLT from entry into PARTNER-DOWN state.  For IP addresses
   which are not available when the server enters PARTNER-DOWN state,
   the period is the MCLT after the lease becomes available.  See
   section 9.4.2 for more details.

   In any other state, a server cannot reallocate an address from one
   client to another without first notifying its partner (through a
   BNDUPD message) and receiving acknowledgement (through a BNDACK
   message) that its partner is aware that the first client is not
   using the address.

   This could be modeled in the following way.  Though this specific
   implementation is in no way required, it may serve to better
   illustrate the concept.

   An "available" IP address on a server may be allocated to any client.
   An IP address which was leased to a client and which expired or was
   released by that client would take on a new state, EXPIRED or
   RELEASED respectively.  The partner server would then be notified
   that this IP address was EXPIRED or RELEASED through a BNDUPD.  When
   the sending server received the BNDACK for that IP address showing it
   was FREE, it would move the IP address from EXPIRED or RELEASED to
   FREE, and it would be available for allocation by the primary server
   to any clients.

   A server MAY reallocate an IP address in the EXPIRED or RELEASED
   state to the same client with no restrictions.
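
   The PARTNER-DOWN waiting period described in this section can be
   sketched as a single calculation.  This is a minimal illustration
   only; the function and parameter names are hypothetical and all
   times are seconds on one server's clock.

```python
def earliest_reallocation_time(partner_down_since, lease_available_at, mclt):
    """Earliest time at which an address may be given to a different
    client while in PARTNER-DOWN state.

    partner_down_since -- time this server entered PARTNER-DOWN
    lease_available_at -- time the lease became (or becomes) available
    mclt               -- maximum client lead time
    """
    if lease_available_at <= partner_down_since:
        # Lease was already available on entry into PARTNER-DOWN.
        return partner_down_since + mclt
    # Lease became available later: wait the MCLT from that point.
    return lease_available_at + mclt
```

   Re-allocation to the same client is not subject to this wait.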

5.3.  Load balancing

   In order to implement load balancing between a primary and secondary
   server pair, each server must respond to DHCPDISCOVER requests from
   some clients and not from other clients.  In order to do this
   successfully, each server must be able to determine immediately upon
   receipt of a DHCP client request whether it is to service this
   request or to ignore it in order to allow the other server to service
   the request.

   In addition, it should be possible to configure the percentage of
   clients which will be serviced by either the primary or secondary
   server.  This configuration should be more or less continuous, from
   all serviced by the primary through an even split with half serviced
   by each, to all serviced by the secondary.

   The technique chosen to support these goals is described in [LOADB].
   When using the load balancing algorithm in [LOADB] among two servers
   implementing the failover protocol, both servers MUST use the same
   information from the DHCP client packet as the Request ID for the
   load balancing algorithm.  Both servers MUST use the
   dhcp-client-identifier (if it appears), and the
   client-hardware-address if the dhcp-client-identifier does not.  The
   client-hardware-address is constructed from the htype and chaddr
   fields of the DHCP client request in the same manner as described
   for creation of the client-hardware-address option in section 6.2.

   A bitmap-style Hash Bucket Assignment (as described in section 5.2 of
   [LOADB]) is sent by the primary server to the secondary server
   whenever a connection is established, using the
   hash-bucket-assignment option defined in section 6.2.  This Hash
   Bucket Assignment is used by the secondary server to decide which
   packets to process when in NORMAL state.

   The way in which either primary or secondary servers determine the
   hash bucket assignment for it to use when in other than NORMAL state
   is outside of the scope of this document.  Note, however, that the
   primary and secondary servers MUST use identical hash bucket
   assignments when not in NORMAL state.  This common hash bucket
   assignment MAY be for all of the hash buckets, indicating that there
   is no other DHCP server sharing the load with this failover pair, or
   it MAY be for a subset of the hash buckets, which would indicate that
   there exists another server or server pair with which this DHCP
   server pair is sharing the load.
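
   As a rough illustration of how a server might consult a 32-octet
   (256-bit) bitmap-style Hash Bucket Assignment, consider the sketch
   below.  It does not implement the hash function itself, which is
   defined in [LOADB]; the bucket number is taken as an input.  The
   function names are hypothetical, the htype-then-chaddr concatenation
   is an assumption, and the bit order within each octet (most
   significant bit first) is also an assumption that must be checked
   against [LOADB].

```python
def request_id(client_identifier, htype, chaddr):
    """Choose the Request ID: the client identifier if the client
    sent one, otherwise htype followed by chaddr (assumed layout)."""
    if client_identifier:
        return bytes(client_identifier)
    return bytes([htype]) + bytes(chaddr)


def bucket_assigned(hba, bucket):
    """Return True if `bucket` (0..255) is set in the 32-octet
    bitmap `hba`.  ASSUMPTION: most significant bit first."""
    if len(hba) != 32:
        raise ValueError("hash bucket assignment must be 32 octets")
    if not 0 <= bucket <= 255:
        raise ValueError("bucket must be in the range 0..255")
    octet = hba[bucket // 8]
    mask = 0x80 >> (bucket % 8)
    return bool(octet & mask)


# Illustrative assignment: buckets 0-127 set, buckets 128-255 clear.
half_and_half = bytes([0xFF] * 16 + [0x00] * 16)
```

   With this bitmap, a secondary in NORMAL state would process requests
   whose Request ID hashes into buckets 0 through 127 and leave the
   rest to the primary.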

5.4.  Operating in NORMAL state

   When in NORMAL state, each server services DHCPDISCOVER's and all
   other DHCP requests other than DHCPREQUEST/RENEWAL or
   DHCPREQUEST/REBINDING from the client set defined by the load
   balancing algorithm.  Each server services DHCPREQUEST/RENEWAL or
   DHCPREQUEST/REBINDING requests from any client.

   In general, whenever the binding database is changed in stable
   storage, then a BNDUPD message is sent with the contents of that
   change to the partner server.  The partner server then writes the
   information about that binding in its bindings database in stable
   storage and replies with a BNDACK message.

5.5.  Operating in COMMUNICATIONS-INTERRUPTED state

   When operating in COMMUNICATIONS-INTERRUPTED state, each server is
   operating independently, but does not assume that its partner is not
   operating.  The partner server might be operating and simply unable
   to communicate with this server, or might not be operating.

   Each server responds to the full range of DHCP client messages that
   it receives, but in such a way that graceful reintegration is always
   possible when its partner comes back into contact with it.

5.6.  Operating in PARTNER-DOWN state

   When operating in PARTNER-DOWN state, a server assumes that its
   partner is not currently operating, but does make allowances for the
   possibility that the partner was operating in the past, though
   possibly out of communications with this server.  It responds to all
   DHCP client requests in PARTNER-DOWN state.

5.7.  Operating in RECOVER state

   A server operating in RECOVER state assumes that it is reintegrating
   with a server that has been operating in PARTNER-DOWN state, and that
   it needs to update its bindings database before it services DHCP
   client requests.

   A server may also operate in RECOVER state in order to fully recover
   its bindings database from its partner server.

5.8.  Operating in STARTUP state

   A server operating in STARTUP state assumes that failover is
   operational, and it spends a short time whenever it comes up
   attempting to contact the partner.  During this time (generally a
   few seconds), the server is unresponsive to DHCP client requests.
   This period exists in order to give a server a chance to determine
   that its partner has changed state since it was last in
   communications, and to react to that changed state (if any) prior to
   responding to DHCP client requests.

   The period of time a server remains in STARTUP state SHOULD be long
   enough to ensure that it will connect to the other server if that
   server is available for connections.

5.9.  Time synchronization between servers

   The failover protocol is designed to operate between two servers
   which have time values which differ by an arbitrarily large amount.
   A particular implementation MAY choose to only support servers whose
   time values differ by an arbitrarily small amount.

   In any event, whether large or only small differences in time values
   are supported, every message that is received MUST be tagged with a
   time value as soon as possible after receipt.  This time value is
   used along with the time value that is sent in every message between
   the failover partners to develop a delta time between the servers.
   This delta time is used during the connection process to establish a
   baseline delta time between the servers, and upon receipt of each
   message, the delta time for that message is used to refine the delta
   time for the server pair.

   While the algorithm for this refinement of delta time is not speci-
   fied as part of this protocol, a server SHOULD allow the delta time
   value for a pair of failover servers to be periodically updated to
   account for time drift.  In addition, the delta time value between
   servers SHOULD be smoothed in some fashion, so that transient network
   delays will not cause it to vary wildly.

   A server SHOULD recognize a drastic change in the delta time value as
   an event to be signaled to a network administrator.
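
   The protocol does not specify how the delta time is refined.  One
   possible approach, shown here purely as an assumption and not as a
   required algorithm, is an exponentially weighted moving average
   with a threshold for flagging drastic changes.  The class name,
   parameter names and default values are all hypothetical.

```python
class DeltaTimeEstimator:
    """Smoothed estimate of (local receipt time - partner send time)."""

    def __init__(self, alpha=0.125, alarm_threshold=60.0):
        self.alpha = alpha                      # weight of each new sample
        self.alarm_threshold = alarm_threshold  # seconds; "drastic" change
        self.delta = None                       # no baseline yet

    def update(self, sent_time, received_time):
        """Fold one message into the estimate.

        Returns True if the sample differs from the current estimate
        by more than the alarm threshold, so that the caller can
        notify a network administrator.
        """
        sample = received_time - sent_time
        if self.delta is None:
            # First message establishes the baseline delta time.
            self.delta = float(sample)
            return False
        drastic = abs(sample - self.delta) > self.alarm_threshold
        # Move only part of the way toward the new sample, so that a
        # transient network delay does not swing the estimate wildly.
        self.delta += self.alpha * (sample - self.delta)
        return drastic
```

   A smaller alpha gives more smoothing at the cost of reacting more
   slowly to genuine clock drift.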

5.10.  IP address binding-status

   In most DHCP servers an IP address can take on several different
   binding-status values, sometimes also called states.  While probably
   no two DHCP servers have exactly the same possible binding-status
   values, the DHCP RFC enforces some commonality among the general
   semantics of the binding-status values used by various DHCP server
   implementations.

   In order to transmit binding database updates between one server and
   another using the failover protocol, some common denominator
   binding-status values must be defined.  It is not expected that these
   binding-status values correspond with any actual implementation of
   the DHCP protocol in a DHCP server, but rather that the binding-
   status values defined in this document should be a superset of most
   if not all DHCP server implementations.  It is a goal of this proto-
   col that any DHCP server can map the various IP address binding-
   status values that it uses internally into these failover IP address
   binding-status values on transmission of binding database updates to
   its partner, and likewise that it can map any failover IP address
   binding-status values into its internal IP address binding-status
   values upon receipt of a binding database update.

   The IP address binding-status values defined for the failover proto-
   col are:

      o FREE

        Lease may be allocated to any DHCP client.

      o ACTIVE

        Lease is assigned to a client.  It MUST have client information
        associated with it.

      o EXPIRED

        Lease has expired.  It may be allocated to the same client.

      o RELEASED

        Lease has been released by client.  It may be allocated to the
        same client.

      o ABANDONED

        A server or client flagged the address as unusable.

      o RESET

        Lease was freed by some external agent.

      o BACKUP

        Lease belongs to secondary's private address pool.

   These binding-status values are communicated from one failover
   partner to another using the binding-status option; see section 6.2
   for details of this option.  Unless otherwise noted above there MAY
   be client information associated with each of these binding-status
   values.

   Again, note that a DHCP server implementing the failover protocol
   does not have to either implement this state machine or use these
   particular binding-status values in its normal operation of
   allocating IP addresses to DHCP clients.  It only needs to map its
   internal binding-status values onto these "standard" binding-status
   values, and map these "standard" binding-status values back into its
   internal binding-status values.  In particular, a server which
   implements a grace period for an IP address binding SHOULD simply
   wait to update its partner server until the grace period on that
   binding has run out.

   The process of setting an IP address to FREE deserves some detailed
   discussion.  When an IP address is moved to the EXPIRED, RELEASED, or
   RESET binding-status on a server, the server will send a BNDUPD with
   the binding-status of EXPIRED, RELEASED, or RESET to its partner.  If
   its partner agrees that this is acceptable (see sections 7.1.2 and
   7.13 concerning why a server might not accept a BNDUPD) it will
   return a BNDACK with no reject-reason, signifying that it accepted
   the update.
   As part of the BNDUPD processing, the server returning the BNDACK
   will set the binding-status of the IP address to FREE, and upon
   receipt of the BNDACK the server which sent the BNDUPD will set the
   binding-status of the IP address to FREE.  Thus, the EXPIRED,
   RELEASED, or RESET binding-status is something of a transitory state.
   This process is encoded in the transition diagram below by "Comm
   w/Partner".

   An IP address will move between these lease binding-status values
   using the following state transition diagram:

                                        DHCP client DECLINE or
                                        server detected problem
                                        from any state
                          +----------+     V   +---------+
         External   >---->|   RESET  |     |   |ABANDONED|
         command          |          |     +-->|         |
                          +----------+         +---------+
                               |
                           Comm w/Partner
                               V
     +---------+  Comm    +----------+   Comm    +---------+
     | EXPIRED |--------->|  FREE    |<----------| RELEASED|
     |         | w/Partner|          | w/Partner |         |
     +---------+          +----------+           +---------+
       ^     ^             |        |                  ^
       | Exp. grace   IP address  IP addr alloc.       |
       | period ends  leased by   to secondary         |
       |     |        primary       V                  |
       |     |             |      +----------+         |
       |     |             |      |  BACKUP  |         |
       |   wait for        |      |          |         |
       |  grace period     |      +----------+         |
       |     |             |       |                   |
       |     |             |    IP addr leased by      |
       |  Expired grace    |       secondary           |
       |  period exists    V       V                   |
       |     |           +----------+                  |
       |     | Lease on  |  ACTIVE  | DHCPRELEASE      |
       +-----+-IP addr---|          |------------------+
               expires   +----------+

       Figure 5.10-1:  Transitions between binding-status values.

   If a server receives a binding-status that it doesn't implement
   internally, it should do something reasonable. A server which doesn't
   support an ABANDONED binding-status could set the IP address ACTIVE
   and belonging to a client which will never be seen in a DHCP request.

5.10.1.  IP address binding-status changes from BNDUPD messages

   IP addresses undergo binding status changes for several reasons,
   including receipt and processing of DHCP client requests, administra-
   tive inputs and receipt of BNDUPD messages.  Every DHCP server needs
   to respond to DHCP client requests and administrative inputs with
   changes to its internal record of the binding-status of an IP
   address, and this response is not in the scope of the failover proto-
   col.  However, the receipt of BNDUPD messages implies at least a pos-
   sible change of the binding-status for an IP address, and must be
   discussed here.  See section 7.1.2 for general actions to take upon
   receipt of a BNDUPD message.

   When receiving a BNDUPD message, it is important to note that it may
   not be current, in that the server receiving the BNDUPD message may
   have had a more recent interaction with the DHCP client than its
   partner who sent the BNDUPD message.  In this case, the receiving
   server MUST reject the BNDUPD message.  In addition, it is worth not-
   ing that two (and possibly three) binding-status values are the
   direct result of interaction with a DHCP client, ACTIVE and RELEASED
   (and possibly ABANDONED).  All other binding-status values are either
   the result of the expiration of a time period or interaction with an
   external agency (e.g., a network administrator).

   Every BNDUPD message SHOULD contain a client-last-transaction-time
   option, which MUST, if it appears, be the time that the server last
   interacted with the DHCP client.  It MUST NOT be, for instance, the
   time that the lease on an IP address expired.  If there has been no
   interaction with the DHCP client in question (or there is no DHCP
   client presently associated with this IP address), then there will be
   no client-last-transaction-time option in the BNDUPD message.

   The following list is indexed by the binding-status that a server
   receives in a BNDUPD message.  In many cases, the binding-status of
   an IP address within the receiving server's data storage will have an
   effect upon the checks performed prior to accepting the new
   binding-status in a BNDUPD message.

   In the following list, to "accept" a BNDUPD means to update the
   server's bindings database with the information contained in the
   BNDUPD and once that update is complete, send a BNDACK message
   corresponding to the BNDUPD message.  To "reject" a BNDUPD means to
   respond to the BNDUPD with a BNDACK with a reject-reason option
   included.

   When interpreting the rules in the following list, if a BNDUPD
   doesn't have a client-last-transaction-time value, then it MUST NOT
   be considered later than the client-last-transaction-time in the
   receiving server's binding.   If the BNDUPD contains a client-last-
   transaction-time value and the receiving server's binding does not,
   then the client-last-transaction-time value in the BNDUPD MUST be
   considered later than the server's.

   The second rule concerns clients and IP addresses.  If the client in
   a BNDUPD message and the client in a receiving server's binding both
   exist and if they differ, then if the receiving server's
   binding-status is ACTIVE and the binding-status in the BNDUPD is
   ACTIVE, then if the receiving server is a secondary server accept
   it, else reject it.

   Otherwise, look up the binding-status in the BNDUPD in this list:

      o ACTIVE in BNDUPD

        If the receiving server's binding-status is ACTIVE, FREE, or
        BACKUP, then accept it.

        If the receiving server's binding-status is ABANDONED or RESET,
        then reject it.

        If the receiving server's binding-status is RELEASED or EXPIRED,
        then if the client-last-transaction-time in the BNDUPD is later
        than the client-last-transaction-time in the receiving server's
        binding, accept it, else reject it.

      o EXPIRED in BNDUPD

        If the receiving server's binding-status is ACTIVE, then if the
        current time is later than the receiving server's
        lease-expiration-time, accept it, else reject it.

        If the receiving server's binding-status is ABANDONED or RESET,
        reject it.

        If the receiving server's binding-status is FREE or BACKUP,
        accept it.

        If the receiving server's binding-status is RELEASED, then if
        the client-last-transaction-time is greater in the BNDUPD than
        in the receiving server's binding, then accept it, else reject
        it.

      o RELEASED in BNDUPD

        If the receiving server's binding-status is ACTIVE, then if the
        client-last-transaction-time in the BNDUPD is greater than the
        client-last-transaction-time in the receiving server's binding,
        accept it, else reject it.

        If the receiving server's binding-status is RELEASED, FREE or
        BACKUP, accept it.

        If the receiving server's binding-status is ABANDONED or RESET,
        reject it.

      o FREE or BACKUP in BNDUPD

        If the receiving server's binding-status is ACTIVE and the
        current time is later than the lease-expiration-time accept it,
        else reject it.

        If the receiving server's binding-status is ABANDONED, reject
        it.

        If the receiving server's binding-status is FREE or BACKUP or
        RESET, accept it.

      o RESET or ABANDONED in BNDUPD

        Accept the new binding-status under all circumstances.
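
   The second rule and the binding-status list above can be sketched as
   a lookup.  This is an illustrative sketch only: the function and
   parameter names are hypothetical, any rule stated before this point
   is not modeled, and combinations that the list does not mention
   (for example EXPIRED in the BNDUPD against a local EXPIRED binding)
   return None rather than a guessed answer.

```python
ACCEPT, REJECT = "accept", "reject"


def is_later(upd_cltt, local_cltt):
    """Compare client-last-transaction-time values; None means 'absent'."""
    if upd_cltt is None:
        return False   # absent in BNDUPD: MUST NOT be considered later
    if local_cltt is None:
        return True    # present only in BNDUPD: MUST be considered later
    return upd_cltt > local_cltt


def resolve(upd_status, local_status, *, upd_cltt=None, local_cltt=None,
            now=0, local_lease_expiration=0,
            clients_differ=False, receiver_is_secondary=False):
    """Return ACCEPT, REJECT, or None when the list gives no rule.

    clients_differ (hypothetical flag) means both clients exist and differ.
    """
    by_cltt = ACCEPT if is_later(upd_cltt, local_cltt) else REJECT
    by_expiry = ACCEPT if now > local_lease_expiration else REJECT

    # Second rule: differing clients with ACTIVE on both sides.
    if clients_differ and upd_status == "ACTIVE" and local_status == "ACTIVE":
        return ACCEPT if receiver_is_secondary else REJECT

    # RESET or ABANDONED in the BNDUPD is accepted under all circumstances.
    if upd_status in ("RESET", "ABANDONED"):
        return ACCEPT

    rules = {
        "ACTIVE": {"ACTIVE": ACCEPT, "FREE": ACCEPT, "BACKUP": ACCEPT,
                   "ABANDONED": REJECT, "RESET": REJECT,
                   "RELEASED": by_cltt, "EXPIRED": by_cltt},
        "EXPIRED": {"ACTIVE": by_expiry, "ABANDONED": REJECT,
                    "RESET": REJECT, "FREE": ACCEPT, "BACKUP": ACCEPT,
                    "RELEASED": by_cltt},
        "RELEASED": {"ACTIVE": by_cltt, "RELEASED": ACCEPT, "FREE": ACCEPT,
                     "BACKUP": ACCEPT, "ABANDONED": REJECT, "RESET": REJECT},
        "FREE": {"ACTIVE": by_expiry, "ABANDONED": REJECT, "FREE": ACCEPT,
                 "BACKUP": ACCEPT, "RESET": ACCEPT},
    }
    rules["BACKUP"] = rules["FREE"]  # FREE and BACKUP share one list entry
    return rules.get(upd_status, {}).get(local_status)
```

   For instance, resolve("ACTIVE", "EXPIRED", upd_cltt=200,
   local_cltt=None) accepts the update, because a client-last-
   transaction-time present only in the BNDUPD is treated as later.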

5.11.  DNS dynamic update considerations

   DHCP servers (and clients) can use DNS Dynamic Updates as described
   in [RFC2136] to maintain DNS name-mappings as they maintain DHCP
   leases.  Many different administrative models for DHCP-DNS integra-
   tion are possible.  Descriptions of several of these models, and
   guidelines that DHCP servers and clients should follow in carrying
   them out, are laid out in [DDNS].  The nature of the DHCP failover
   protocol introduces some issues concerning dynamic DNS updates that
   are not part of non-failover DHCP environments.  This section
   describes these issues, and defines the information which failover
   partners should exchange and the protocol which they should follow in
   order to ensure consistent behavior.  The presence of this section
   should not be interpreted as requiring that implementations of the
   DHCP failover protocol must also support DDNS updates.  The purpose
   of this discussion is to clarify the areas where the DHCP failover
   and DHCP-DDNS protocols intersect for the benefit of implementations
   which support both protocols, not to introduce a new requirement into
   the DHCP failover protocol.  Thus, a DHCP server which implements the
   failover protocol MAY also support dynamic DNS updates, but if it
   does support dynamic DNS updates it SHOULD utilize the techniques
   described here in order to correctly distribute them between the
   failover partners.

5.11.1.  Relationship between failover and dynamic DNS update

   The failover protocol describes the conditions under which each fail-
   over server may renew a lease to its current DHCP client, and
   describes the conditions under which it may grant a lease to a new
   DHCP client.  An analogous set of conditions determines when a fail-
   over server should initiate a DDNS update, and when it should attempt
   to remove records from the DNS. The failover protocol's conditions
   are based on the desired external behavior: avoiding duplicate
   address assignments; allowing clients to continue using leases which
   they obtained from one failover partner even if they can only commun-
   icate with the other partner; allowing the backup DHCP server to
   grant new leases even if it is unable to communicate with the primary
   server.  The desired external DDNS behavior for DHCP failover servers
   is:

      1.  Allow timely DDNS updates from the server which grants a
          client a lease. Recognize that there is often a DDNS update
          lifecycle which parallels the DHCP lease lifecycle. This is
          likely to include the addition of records when the lease is
          granted, and the removal of DNS records when the lease is sub-
          sequently made available for allocation to a different client.

      2.  Communicate enough information between the two failover
          servers to allow one to complete the DDNS update 'lifecycle'
          even if the other server originally granted the lease.

      3.  Avoid redundant or overlapping DDNS updates, where both fail-
          over servers are attempting to perform DDNS updates for the
          same lease-client binding. Avoid situations where one partner
          is attempting to add RRs related to a lease binding while the
          other partner is attempting to remove RRs related to the same
          lease binding.

5.11.2.  Use of the DDNS option

   In order for either server to be able to complete a DDNS update, or
   to remove DNS records which were added by its partner, both servers
   need to know the FQDN associated with the lease-client binding. The
   FQDN associated with the client's A RR and PTR RR SHOULD be communi-
   cated from the server which adds records into the DNS to its partner.
   The initiating server SHOULD use the DDNS option in the BNDUPD mes-
   sages to inform the partner server of the status of any DDNS updates
   associated with a lease binding. Failover servers MAY choose not to
   include the DDNS option in BNDUPD messages if there has been no
   change in the status of any DDNS update related to the lease binding.
   The partner server receiving BNDUPD messages containing the DDNS
   option SHOULD compare the status flags and the FQDN contained in the
   option data with the current DDNS information it has associated with
   the lease binding, and update its notion of the DDNS status accord-
   ingly.

   The initiating server MAY send a BNDUPD to its partner before the
   DDNS update has been successfully completed. If it does so, it SHOULD
   leave the 'C' bit in the Flags field clear, to indicate to the
   partner that the DDNS update may not be complete. When the DDNS
   update has been successfully acknowledged by the DNS server, the ini-
   tiating DHCP server SHOULD include the DDNS option in its next BNDUPD
   message about the binding, so that the partner server will be able to
   record the final status of the DDNS update. The initiating server
   SHOULD set the 'C' bit in the DDNS option if the DDNS update was suc-
   cessfully accepted by the DNS server.

   Some implementations will choose to send a BNDUPD without waiting for
   the DDNS update to complete, and then will send a second BNDUPD once
   the DDNS update is complete. Other implementations will delay sending
   the partner a BNDUPD until the DDNS update has been acknowledged by
   the DNS server, or until some time-limit has elapsed, in order to
   avoid sending a second BNDUPD.

   The Domain Name field in the DDNS option contains the FQDN that will
   be associated with the A RR (if the server is performing an A RR
   update for the client) and the PTR RR. This FQDN may be composed in
   any of several ways, depending on server configuration and the infor-
   mation provided by the client in its DHCP messages. The client may
   supply a hostname which it would like the server to use in forming
   the FQDN, or it may supply the entire FQDN. The server may be config-
   ured to attempt to use the information the client supplies, it may be
   configured with an FQDN to use for the client, or it may be config-
   ured to synthesize an FQDN. The responsive server SHOULD include the
   FQDN that it will be using in DDNS updates it initiates when it sends
   the DDNS option.

   Since the responsive server may not have completed the DDNS update at
   the time it sends the first BNDUPD about the lease binding, there may
   be cases where the FQDN in later BNDUPD messages does not match the
   FQDN included in earlier messages. For example, the responsive server
   may be configured to handle situations where two or more DHCP client
   FQDNs are identical by modifying the most-specific label in the FQDNs
   of some of the clients in an attempt to generate unique FQDNs for
   them. Alternatively, at sites which use some or all of the informa-
   tion which clients supply to form the FQDN, it's possible that a
   client's configuration may be changed so that it begins to supply new
   data. The responsive server may react by removing the DNS records
   which it originally added for the client, and replacing them with
   records that refer to the client's new FQDN. In such cases, the
   responsive server SHOULD include the actual FQDN that was used in
   subsequent DDNS options. The responsive server SHOULD include
   relevant client-option data in the client-request-options option in
   its BNDUPD messages. This information may be necessary in order to
   allow the non-responsive partner to detect client configuration
   changes that change the hostname or FQDN data which the client
   includes in its DHCP requests.

5.11.3.  Adding RRs to the DNS

   A failover server which is going to perform DDNS updates SHOULD ini-
   tiate the DDNS update when it grants a new lease to a client. The
   non-responsive partner SHOULD NOT initiate a DDNS update when it
   receives the BNDUPD after the lease has been granted. The failover
   protocol ensures that only one of the partners will grant a lease to
   any individual client, so it follows that this requirement will
   prevent both partners from initiating updates simultaneously. The
   server initiating the update SHOULD follow the protocol in [DDNS].
   The server may be configured to perform an A RR update on behalf of
   its clients, or not. Ordinarily, a failover server will not initiate
   DDNS updates when it renews leases. In two cases, however, a failover
   server MAY initiate a DDNS update when it renews a lease to its
   existing client:

      1.  When the lease was granted before the server was configured to
          perform DDNS updates, the server MAY be configured to perform
          updates when it next renews existing leases. Since both
          servers are responsive to renewals in NORMAL state, it is not
          enough to simply require the non-responsive server to avoid a
          DNS update in this case.  The server which would be responsive
          to a DHCPDISCOVER from this client (even though the current
          request is a DHCPREQUEST/RENEW) is the server which should
          initiate the DDNS update.

      2.  If a server is in PARTNER-DOWN state, it can conclude that its
          partner is no longer attempting to perform an update for the
          existing client. If the remaining server has not recorded that
          an update for the binding has been successfully completed, the
          server MAY initiate a DDNS update.  It MAY initiate this
          update immediately upon entry to PARTNER-DOWN state, it may
          perform this in the background, or it MAY initiate this update
          upon next hearing from the DHCP client.

5.11.4.  Deleting RRs from the DNS

   The failover server which makes a lease FREE SHOULD initiate any DDNS
   deletes, if it has recorded that DNS records were added on behalf of
   the client.

   A server "makes a lease FREE" when it initiates a BNDUPD with a
   binding-status of FREE, EXPIRED, or RELEASED.  Its partner confirms
   this status by acking that BNDUPD, and upon receipt of the ACK the
   server has "made the address FREE". It is at this point that it
   should initiate the DDNS operations to delete RRs from the DNS.  Its
   partner SHOULD NOT initiate DDNS deletes for DNS records related to
   the lease binding as part of sending the BNDACK message.   The
   partner MAY have issued BNDUPD messages with a binding-status of
   FREE, EXPIRED, or RELEASED previously, but the other server will have
   NAKed these BNDUPD messages.

   The failover protocol ensures that only one of the two partner
   servers will be able to make a lease FREE. The server making the
   lease FREE may be doing so while it is in NORMAL communication with
   its partner, or it may be in PARTNER-DOWN state. If a server is in
   PARTNER-DOWN state, it may be performing DDNS deletes for RRs which
   its partner added originally. This allows a single remaining partner
   server to assume responsibility for all of the DDNS activity which
   the two servers were undertaking.

   Another implication of this approach is that no DDNS RR deletes will
   be performed while either server is in COMMUNICATIONS-INTERRUPTED
   state, since no IP addresses are moved into the FREE state during
   that period.

5.12.  Reservations and failover

   Some DHCP servers support a capability to offer specific
   pre-configured IP addresses to DHCP clients.  These are real DHCP
   clients that carry out the entire DHCP protocol, but these servers
   always offer the client a specific pre-configured IP address -- and
   they offer that IP address to no other clients.  Such a capability
   has several names, but it is sometimes called a "reservation", in
   that the IP address is reserved for a particular DHCP client.

   In a situation where there are two DHCP servers serving the same
   subnet without using failover, the two DHCP servers need to have
   disjoint IP address pools, but identical reservations for the DHCP
   clients.

   In a failover context, both servers need to be configured with the
   proper reservations in an identical manner, but if we stop there
   problems can occur around the edge conditions where reservations are
   made for an IP address that has already been leased to a different
   client.  Different servers handle this conflict in different ways,
   but the goal of the failover protocol is to allow correct operation
   with any server's approach to the normal processing of the DHCP pro-
   tocol.

   The general solution with regard to reservations is as follows.
   Whenever a reserved IP address becomes FREE (i.e., when first config-
   ured or whenever a client frees it or it expires or is reset), the
   primary server MUST show that IP address as FREE (and thus available
   for its own allocation) and it MUST send it to the secondary server
   as BACKUP, in order that the secondary server be able to allocate it
   as well.

5.13.  Dynamic BOOTP and failover

   Some DHCP servers support a capability to offer IP addresses to BOOTP
   clients without having a particular address previously allocated for
   those clients.  This capability is often called something like
   "dynamic BOOTP".  It is not a capability explicitly discussed in
   either the DHCP or BOOTP RFCs, but rather a pragmatic capability
   which can work reasonably well for a small set of legacy BOOTP dev-
   ices.

   This capability has a negative interaction with the fundamental ele-
   ments of the failover protocol, in that an address handed out to a
   BOOTP device has no term (or effectively no term, in that usually
   they are considered leases for "forever").  There is no opportunity
   to hand out a lease which is only the MCLT long when first hearing
   from a BOOTP device, because they may only interact once with the
   DHCP server and they have no notion of a lease expiration time.  Thus
   the entire concept of the MCLT and waiting the MCLT after entering
   PARTNER-DOWN state is broken when dealing with BOOTP devices.

   With some restrictions, however, dynamic BOOTP devices can be sup-
   ported in a server on a subnet where failover is supported.  The only
   restriction (and it is not small) is that on any portion of the sub-
   net (in any address pool) where dynamic BOOTP devices can be allo-
   cated IP addresses, a DHCP server MUST NOT ever use any of the IP
   addresses which were previously available for allocation by its fail-
   over partner.  Thus, the addresses allocated by the primary to the
   secondary for allocation MUST NOT ever be used by the primary server
   even if it is in PARTNER-DOWN state and has waited the MCLT after
   entering that state.  The reason is that one of those IP addresses
   could have been allocated by the secondary server to a BOOTP device,
   and the primary server would have no way of ever knowing that
   happened.

5.14.  Guidelines for selecting MCLT

   There is no one correct value for the MCLT.  There is an explicit
   tradeoff between various factors in selecting an MCLT value.

5.14.1.  Short MCLT

   A short MCLT value will mean that after entering PARTNER-DOWN state,
   a server will only have to wait a short time before it can start
   allocating its partner's IP addresses to DHCP clients.  Furthermore,
   it will only have to wait a short time after the expiration of a
   lease on an IP address before it can reallocate that IP address to
   another DHCP client.

   However, the downside of a short MCLT value is that the initial lease
   interval that will be offered to every new DHCP client will be short,
   which will cause increased traffic as those clients will need to send
   in their first renewal after half of the short MCLT.  In addition,
   the lease extensions that a server in COMMUNICATIONS-INTERRUPTED
   state can give will be only the MCLT after the server has been in
   COMMUNICATIONS-INTERRUPTED for around the desired client lease
   period.  If a server stays in COMMUNICATIONS-INTERRUPTED for that
   long, then the leases it hands out will be short and that will
   increase the load on that server, possibly causing difficulty.

5.14.2.  Long MCLT

   A long MCLT value will mean that the initial lease period will be
   longer and the time that a server in COMMUNICATIONS-INTERRUPTED state
   will be able to extend leases (after it has been in COMMUNICATIONS-
   INTERRUPTED state for around the desired client lease period) will be
   longer.

   However, a server entering PARTNER-DOWN state will have to wait the
   longer MCLT before being able to allocate its partner's IP addresses
   to new DHCP clients.  This may mean that additional IP addresses are
   required in order to cover this time period.  Further, the server in
   PARTNER-DOWN will have to wait the longer MCLT from every lease
   expiration before it can reallocate an IP address to a different DHCP
   client.
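
   The tradeoff can be made concrete with a little arithmetic.  The
   sketch below is illustrative only: the function names are
   hypothetical, the initial lease is assumed to equal the MCLT, and
   the renewal point is assumed to fall at half of the lease interval,
   as in the discussion of the short MCLT above.

```python
def first_renewal_delay(mclt_seconds, t1_fraction=0.5):
    """Seconds until a new client first renews, assuming the initial
    lease equals the MCLT and renewal happens at t1_fraction of it."""
    return mclt_seconds * t1_fraction


def earliest_reallocation(lease_expiration, mclt_seconds):
    """Earliest absolute time at which a server in PARTNER-DOWN state
    may give an expired address to a different client."""
    return lease_expiration + mclt_seconds


# Compare a 10 minute MCLT with a 1 hour MCLT.
for mclt in (600, 3600):
    print(mclt, first_renewal_delay(mclt), earliest_reallocation(0, mclt))
```

   With a 600 second MCLT a new client renews after about 300 seconds,
   while a 3600 second MCLT delays reallocation of an expired address
   by a full hour.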

6.  Packet Formats

   This section describes the message format common to all failover
   messages, and then defines the options used in the failover
   protocol.

6.1.  Common message format

   All failover protocol messages are sent over the TCP connection
   between failover endpoints and encoded using a message format
   specific to the failover protocol.

   There exists a common message format for all failover messages, which
   utilizes the options in a way similar to the DHCP protocol.  For each
   message type, some options are required and some are optional.  In
   addition, when a message is received any options that are not
   understood by the receiving server MUST be ignored.

   All of the fields in the fixed portion of the message MUST be filled
   with correct data in every message sent.

   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        message length (2)     | msg type (1)  |payload off (1)|
   +---------------+---------------+---------------+---------------+
   |                            time (4)                           |
   +---------------------------------------------------------------+
   |                            xid (4)                            |
   +---------------------------------------------------------------+
   |     0 or more additional header bytes  (variable)             |
   +---------------------------------------------------------------+
   |                    payload data  (variable)                   |
   |                                                               |
   |               formatted as DHCP-style options                 |
   |         using a unique option number space in the RFC TBD     |
   |                   format defined by [NAMESPACE]               |
   +---------------------------------------------------------------+

   message length - 2 bytes, network byte order

   This is the length of the message.  It includes the two byte message
   length itself.  The maximum length is 2048 bytes.

   msg type - 1 byte

   The message type field is used to distinguish between messages.

   The following message types are defined:

   Value   Message Type
   -----   ------------
   0       reserved    not used
   1       POOLREQ     request allocation of addresses
   2       POOLRESP    respond with allocation count
   3       BNDUPD      update partner with binding info
   4       BNDACK      acknowledge receipt of binding update
   5       CONNECT     establish connection with the secondary
   6       CONNECTACK  respond to attempt to establish connection with partner
   7       UPDREQALL   request full transfer of binding info
   8       UPDDONE     ack send and ack of req'd binding info
   9       UPDREQ      req transfer of un-acked binding info
   10      STATE       inform partner of current state or state change
   11      CONTACT     probe communications integrity with partner
   12      DISCONNECT  close a connection

   New message types should be defined in one of two ranges, 0-127 or
   128-255.  The range of 0-127 is used for messages that MUST be
   supported by every server, and if a server receives a message in the
   range of 0-127 that it doesn't understand, it MUST close the TCP
   connection.  The range of 128-255 is used for messages which MAY be
   supported but are not required, and if a server receives a message in
   this range that it does not understand it SHOULD ignore the message.
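
   The handling of message types that a server does not understand
   might be sketched as follows.  The enumeration mirrors the table
   above; the function name and the returned strings are hypothetical.

```python
from enum import IntEnum


class MsgType(IntEnum):
    POOLREQ = 1
    POOLRESP = 2
    BNDUPD = 3
    BNDACK = 4
    CONNECT = 5
    CONNECTACK = 6
    UPDREQALL = 7
    UPDDONE = 8
    UPDREQ = 9
    STATE = 10
    CONTACT = 11
    DISCONNECT = 12


KNOWN_TYPES = {m.value for m in MsgType}


def disposition(msg_type):
    """Decide what to do with a received message type (one byte)."""
    if not 0 <= msg_type <= 255:
        raise ValueError("msg type must fit in one byte")
    if msg_type in KNOWN_TYPES:
        return "process"
    if msg_type <= 127:
        return "close-connection"   # mandatory range, not understood
    return "ignore"                 # optional range, not understood
```

   Note that type 0 is reserved and unused, so in this sketch it falls
   into the mandatory range and causes the connection to be closed.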

   payload offset - 1 byte

   The byte offset of the Payload Data, from the beginning of the
   failover message header. The value for the current protocol version
   is 8.

   time - 4 bytes, network byte order

   The absolute time in GMT when the message was transmitted,
   represented as seconds elapsed since Jan 1, 1970 (i.e., similar to
   the ANSI C time_t time value representation).  While the ANSI C
   time_t value is signed, the value used in this specification is
   unsigned.

   A server SHOULD set this time as close to the actual transmission of
   the message as possible.

   xid - 4 bytes, network byte order

   This is the transaction id of the failover message.  The sender of a
   failover protocol message is responsible for setting this number, and
   the receiver of the message copies the number over into any response
   message, treating it as opaque data.  The sender SHOULD ensure that
   every message sent from a particular failover endpoint over the
   associated TCP connection has a unique transaction id unless that
   message is a re-transmission.

   payload data - variable length

   The options are placed after the header, after skipping payload
   offset bytes from beginning of the message.  The payload data options
   are not preceded by a "cookie" value.

   The payload data is formatted as DHCP style options using the two
   byte option number and two byte option length format as specified in
   the recommendations of the DHCP panel in [NAMESPACE].

   The maximum length of the payload data in octets is 2048 less the
   size of the header, i.e., the maximum message length is 2048 octets.
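
   A minimal sketch of packing and unpacking the fixed header with
   Python's struct module follows.  The function names are
   hypothetical.  The sketch assumes that there are no additional
   header bytes and that the payload offset is counted from the first
   byte of the message, so it writes the size of the fixed fields drawn
   in the diagram as the offset; an implementation should use whatever
   offset value its partner expects.

```python
import struct

# message length (2), msg type (1), payload offset (1), time (4), xid (4),
# all in network byte order with the time treated as unsigned.
HEADER = struct.Struct("!HBBII")
MAX_MESSAGE_LENGTH = 2048


def pack_message(msg_type, timestamp, xid, payload=b""):
    offset = HEADER.size                 # assumption: no extra header bytes
    length = offset + len(payload)       # length includes the length field
    if length > MAX_MESSAGE_LENGTH:
        raise ValueError("message exceeds 2048 bytes")
    return HEADER.pack(length, msg_type, offset, timestamp, xid) + payload


def unpack_message(data):
    if len(data) < HEADER.size:
        raise ValueError("short header")
    length, msg_type, offset, timestamp, xid = HEADER.unpack_from(data)
    if length > MAX_MESSAGE_LENGTH or length != len(data):
        raise ValueError("bad message length")
    if not HEADER.size <= offset <= length:
        raise ValueError("bad payload offset")
    # Skip any additional header bytes by honoring the offset field.
    return {"type": msg_type, "time": timestamp, "xid": xid,
            "payload": data[offset:length]}
```

   Reading the payload through the offset field, rather than assuming a
   constant, lets a receiver skip header bytes it does not know about.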

6.2.  Common option format

   The options contained in the payload data section of the failover
   message all use the two byte option number and two byte length format
   as specified by the recommendations of the DHCP panel in [NAMESPACE].

   The option numbers are drawn from an option number space unique to
   the failover protocol.  All of the message types share a common
   option number space and common options definitions, though not all
   options are required or meaningful for every message.

   In contrast to the options which appear in DHCP client and server
   messages, the options in failover messages are ordered.  That is, for
   some messages the order in which the options appear in the payload
   data area is significant.  The messages for which this is the case
   spell it out in detail.

   All options which refer to time use an absolute time in GMT.  Time
   synchronization has already been achieved between the
   source and the target server using the CONNECT message and is updated
   using the time in every packet.  All time fields in the options
   defined below use a time represented as seconds elapsed since Jan 1,
   1970 (i.e. ANSI C time_t time value representation).  Note that this
   is (at present) a signed field.

   Additional options can be defined for intervendor or vendor specific
   use with limited difficulty due to the large number of option numbers
   available.
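
   The two byte option number and two byte length layout can be
   illustrated with a small encoder and parser.  This is a sketch with
   hypothetical function names; it returns the options as a list so
   that the order in which they appeared in the payload is preserved.

```python
import struct


def build_option(code, value):
    """Encode one option: 2-byte code, 2-byte length, then the value."""
    return struct.pack("!HH", code, len(value)) + value


def parse_options(payload):
    """Return [(code, value), ...] in the order found in the payload."""
    options = []
    pos = 0
    while pos < len(payload):
        if len(payload) - pos < 4:
            raise ValueError("truncated option header")
        code, length = struct.unpack_from("!HH", payload, pos)
        pos += 4
        if pos + length > len(payload):
            raise ValueError("option value runs past end of payload")
        options.append((code, payload[pos:pos + length]))
        pos += length
    return options
```

   A list of pairs is used instead of a dictionary because a dictionary
   keyed by option number would collapse repeated options and obscure
   their relative position.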

6.2.1.  binding-status

   This option is used to convey the current state of a binding.

       Code          Len     Type
   +-----+-----+------+-----+-----+
   |  0  |  1  |   0  |  1  | 1-7 |
   +-----+-----+------+-----+-----+

   Legal values for this option are:

   Value Binding Status
   ----- ------------------------------------------------
   1     FREE           Lease has never been used
   2     ACTIVE         Lease is assigned to a client
   3     EXPIRED        Lease has expired
   4     RELEASED       Lease has been released by client
   5     ABANDONED      A server or client flagged the address as unusable
   6     RESET          Lease was freed by some external agent
   7     BACKUP         Lease belongs to secondary's private address pool
   8     EXPIRED-GRACE  Lease will become available after this period
   9     RELEASED-GRACE Lease will become available after this period

6.2.2.  assigned-IP-address

   The IP address to which this message refers.

        Code         Len          Address
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  2  |   0  |  4  | a1 |  a2 |  a3 |  a4 |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.3.  sending-server-IP-address

   The IP address of the server sending this message.

        Code         Len          Address
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  3  |   0  |  4  | a1 |  a2 |  a3 |  a4 |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.4.  addresses-transferred

   A 32 bit unsigned long in network byte order. Reports the number of
   addresses transferred by the primary to the secondary server
   (addresses to be used for the secondary server's private address
   pool).

        Code         Len       Number of Addresses
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  4  |   0  |  4  | n1 |  n2 |  n3 |  n4 |
   +-----+-----+------+-----+----+-----+-----+-----+
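
   The fixed-length options defined so far can be encoded as shown in
   the following sketch.  The helper names are hypothetical; the option
   codes and lengths are taken from the diagrams above.

```python
import ipaddress
import struct


def _option(code, value):
    # 2-byte option code, 2-byte length, then the value bytes.
    return struct.pack("!HH", code, len(value)) + value


def binding_status(status):
    return _option(1, struct.pack("!B", status))


def assigned_ip_address(addr):
    return _option(2, ipaddress.IPv4Address(addr).packed)


def sending_server_ip_address(addr):
    return _option(3, ipaddress.IPv4Address(addr).packed)


def addresses_transferred(count):
    # 32-bit unsigned value in network byte order.
    return _option(4, struct.pack("!I", count))
```

   The address 192.0.2.1 used when exercising such helpers is a
   documentation address and carries no meaning for the protocol.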

6.2.5.  client-identifier

   The format, code and conventions used are identical to DHCP option
   61.

        Code         Len       Client Identifier
   +-----+-----+------+-----+----+-----+---
   |  0  |  5  |   0  |  n  | i1 |  i2 | ...
   +-----+-----+------+-----+----+-----+---

6.2.6.  client-hardware-address

   The format is similar to DHCP option 61. Byte t1 (type) MUST be set
   to the proper ARP hardware address code, as defined in the ARP
   section of RFC 1700 (it MUST NOT be zero).

        Code         Len      htype   chaddr
   +-----+-----+------+-----+----+-----+-----+---
   |  0  |  6  |   0  |  n  | t1 |  c1 |  c2 | ...
   +-----+-----+------+-----+----+-----+-----+---

   Either client-identifier, client-hardware-address or both MAY be
   present in binding update transactions. At least one of them MUST be
   present.  If both are present, the client-identifier MUST be used to
   uniquely identify the owner of the binding (exactly as in RFC 2131).
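
   The choice of which option identifies the owner of a binding might
   be sketched as follows.  The function name and the tuple it returns
   are hypothetical; the values are the raw option contents.

```python
def binding_owner_key(client_identifier=None, client_hardware_address=None):
    """Pick the value that identifies the owner of a binding."""
    if client_identifier is not None:
        # When both are present, the client-identifier takes precedence.
        return ("client-identifier", client_identifier)
    if client_hardware_address is not None:
        # The first byte is the ARP hardware type and must not be zero.
        if not client_hardware_address or client_hardware_address[0] == 0:
            raise ValueError("hardware type byte missing or zero")
        return ("client-hardware-address", client_hardware_address)
    raise ValueError("at least one client option must be present")
```

   Tagging the key with the option name keeps a client-identifier from
   colliding with a hardware address that happens to have the same
   bytes.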

6.2.7.  DDNS

   If an implementation supports Dynamic DNS updates, this option is
   used to communicate the status of the DDNS update associated with a
   particular lease binding.  The Flags field conveys the types of DNS
   RRs that are to be updated by the DHCP server, and the status of the
   DDNS update.  The Domain Name field conveys the FQDN that the DHCP
   server is using to refer to the client, in DNS encoding as specified
   in [RFC1035].

       Code         Len        Flags       Domain Name
   +-----+-----+------+-----+-----+------+------+-----+------
   |  0  |  7  |   0  |  n  |   flags    |  d1  |  d2 | ...
   +-----+-----+------+-----+-----+------+------+-----+------

   The Flags field is a 16-bit field; several bit positions are
   specified here.

   15               7             0
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           MBZ         |P|D|A|C|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The bits (numbered from the least-significant bit in network
   byte-order) are used as follows:

   0 (C): A RR update successfully completed
   1 (A): Server is controlling A RR on behalf of the client
   2 (D): PTR RR update successfully completed (Done)
   3 (P): Server is controlling PTR RR on behalf of the client
   4-15 : Must be zero

   All of the unspecified bit positions SHOULD be set to 0 by servers
   sending the Failover-DDNS option, and they MUST be ignored by servers
   receiving the option.
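
   The 16-bit Flags field can be modeled as a set of bit flags.  The
   sketch below covers the Flags field only; the class and function
   names are hypothetical.  Unspecified bits are sent as zero and
   masked off on receipt.

```python
import struct
from enum import IntFlag


class DDNSFlags(IntFlag):
    C = 0x0001  # bit 0: A RR update successfully completed
    A = 0x0002  # bit 1: server is controlling A RR for the client
    D = 0x0004  # bit 2: PTR RR update successfully completed
    P = 0x0008  # bit 3: server is controlling PTR RR for the client


SPECIFIED_BITS = 0x000F


def pack_flags(flags):
    """Encode the flags as 2 bytes in network byte order."""
    return struct.pack("!H", int(flags) & SPECIFIED_BITS)


def unpack_flags(data):
    """Decode 2 bytes of flags, ignoring unspecified bit positions."""
    (raw,) = struct.unpack_from("!H", data)
    return DDNSFlags(raw & SPECIFIED_BITS)
```

   A server that has sent a BNDUPD before its update completed would
   send A and P without C, and a receiver can test for that with an
   expression such as DDNSFlags.C in unpack_flags(data).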

6.2.8.  reject-reason

   This option is used to selectively reject binding updates. It MAY be
   used in a BNDACK message, always associated with an assigned-IP-
   address option, which contains the IP address of the update being
   rejected.

        Code         Len     Reason Code
   +-----+-----+------+-----+----------+
   |  0  |  8  |   0  |  1  |    R1    |
   +-----+-----+------+-----+----------+

   Reason codes :

   0   Reserved
   1   Illegal IP address (not part of any address pool)
   2   Fatal conflict exists: address in use by other client.
   3   Missing binding information.
   4   Connection rejected, time mismatch too great.
   5   Connection rejected, invalid MCLT.
   6   Connection rejected, unknown reason.
   7   Connection rejected, duplicate connection.
   8   Connection rejected, invalid failover partner.
   9   TLS not supported
   10  TLS supported but not configured
   11  TLS required but not supported by partner
   12  Message digest not supported
   13  Message digest not configured
   14  Protocol version mismatch
   15  Missing binding information
   16  Outdated binding information
   17  Less critical binding information
   18  No traffic within sufficient time
   19  Hash bucket assignment conflict
   20-253, reserved.
   254 Unknown: Error occurred but does not match any reason code
   255 Reserved for code expansion
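
   A receiver that wants to log a rejection can map the reason code to
   text.  The following is a sketch with a hypothetical function name;
   the strings are copied from the list above, and codes in the
   reserved ranges are reported as "reserved".

```python
REJECT_REASONS = {
    1: "Illegal IP address (not part of any address pool)",
    2: "Fatal conflict exists: address in use by other client",
    3: "Missing binding information",
    4: "Connection rejected, time mismatch too great",
    5: "Connection rejected, invalid MCLT",
    6: "Connection rejected, unknown reason",
    7: "Connection rejected, duplicate connection",
    8: "Connection rejected, invalid failover partner",
    9: "TLS not supported",
    10: "TLS supported but not configured",
    11: "TLS required but not supported by partner",
    12: "Message digest not supported",
    13: "Message digest not configured",
    14: "Protocol version mismatch",
    15: "Missing binding information",
    16: "Outdated binding information",
    17: "Less critical binding information",
    18: "No traffic within sufficient time",
    19: "Hash bucket assignment conflict",
    254: "Unknown: Error occurred but does not match any reason code",
}


def describe_reject_reason(code):
    """Return display text for a one-byte reject-reason code."""
    if not 0 <= code <= 255:
        raise ValueError("reason code must fit in one byte")
    # Codes 0, 20-253 and 255 are reserved in the list above.
    return REJECT_REASONS.get(code, "reserved")
```

   The accompanying message option, when present, can be logged
   alongside this text to give the operator more detail.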

6.2.9.  message

   This option is used to supply a human readable message.  It may be
   used in association with the Reject Reason Code to provide a human
   readable error message for the reject.

        Code         Len         Text
   +-----+-----+------+-----+------+-----+--
   |  0  |  9  |   0  |  n  |  c1  | c2  | ...
   +-----+-----+------+-----+------+-----+--

6.2.10.  MCLT

   Maximum Client Lead Time, in seconds.  A 32 bit integer value, in
   network byte order.

        Code         Len             Time
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  10 |   0  |  4  | t1 |  t2 |  t3 |  t4 |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.11.  vendor-class-identifier

   A string which identifies the vendor of the failover protocol
   implementation.

   The code for this option is 60, and its minimum length is 1.

        Code         Len           vendor class string
   +-----+-----+------+-----+----+-----+---
   |  0  |  11 |   0  |  n  | c1 |  c2 |  ...
   +-----+-----+------+-----+----+-----+---

6.2.12.  current-time

   The current time expressed as an absolute time in GMT represented as
   seconds elapsed since Jan 1, 1970 (i.e.  ANSI C time_t time value
   representation).

        Code         Len          Current Time
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  12 |   0  |  4  | t1 |  t2 |  t3 |  t4 |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.13.  lease-expiration-time

   The lease expiration time expressed as an absolute time in GMT
   represented as seconds elapsed since Jan 1, 1970 (i.e.  ANSI C time_t
   time value representation).

   The lease expiration time is the time that a server has ACKed to a
   DHCP client.

        Code         Len          Time
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  13 |   0  |  4  | t1 |  t2 |  t3 |  t4 |
   +-----+-----+------+-----+----+-----+-----+-----+
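   A DHCP client is told a lease duration, while this option carries an
   absolute time.  A minimal Python sketch of the conversion follows;
   the function name and parameters are hypothetical.

```python
import struct


def lease_expiration_payload(acked_at: int, lease_seconds: int) -> bytes:
    """Return the absolute expiry as 4 bytes in network byte order.

    acked_at is the time the lease was ACKed, in seconds since
    Jan 1, 1970 GMT; lease_seconds is the duration given to the client.
    """
    expires = acked_at + lease_seconds
    if not 0 <= expires <= 0xFFFFFFFF:
        raise ValueError("expiration does not fit in 32 bits")
    return struct.pack("!I", expires)
```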

6.2.14.  potential-expiration-time

   The potential expiration time expressed as an absolute time in GMT
   represented as seconds elapsed since Jan 1, 1970 (i.e.  ANSI C time_t
   time value representation).

   The potential expiration time is the time that one server tells
   another server that it may ACK to a client.

        Code         Len          Time
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  14 |   0  |  4  | t1 |  t2 |  t3 |  t4 |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.15.  grace-expiration-time

   The grace expiration time expressed as an absolute time in GMT
   represented as seconds elapsed since Jan 1, 1970 (i.e.  ANSI C time_t
   time value representation).

   The grace expiration time is the time that a grace period will
   expire.

        Code         Len          Time
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  15 |   0  |  4  | t1 |  t2 |  t3 |  t4 |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.16.  client-last-transaction-time

   The time at which this server last received a DHCP request from a
   particular client expressed as an absolute time in GMT represented as
   seconds elapsed since Jan 1, 1970 (i.e.  ANSI C time_t time value
   representation).

        Code         Len   Client Last Transaction Time
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  16 |   0  |  4  | t1 |  t2 |  t3 |  t4 |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.17.  start-time-of-state

   The time at which the state contained in this message began,
   expressed as an absolute time in GMT represented as seconds elapsed
   since Jan 1, 1970 (i.e.  ANSI C time_t time value representation).

   This option is used for different states in different messages.  In a
   BNDUPD message it represents the start time of the state of the lease
   in the BNDUPD message.  In a STATE message, it represents the start
   time of the partner server's failover state.

        Code         Len      Start Time of State
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  17 |   0  |  4  | t1 |  t2 |  t3 |  t4 |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.18.  server-state

   This option is used to convey the current state of the failover
   endpoint in the sending server.

       Code          Len     Server State
   +-----+-----+------+-----+-----+
   |  0  |  18 |   0  |  1  | 1-9 |
   +-----+-----+------+-----+-----+

   Legal values for this option are:

   Value   Server State
   -----   -------------------------------------------------------------
   0       reserved
   1       STARTUP                      Startup state (1)
   2       NORMAL                       Normal state
   3       COMMUNICATIONS-INTERRUPTED   Communication interrupted (safe)
   4       PARTNER-DOWN                 Partner down (unsafe mode)
   5       POTENTIAL-CONFLICT           Synchronizing
   6       RECOVER                      Recovering bindings from partner
   7       PAUSED                       Shutting down for a short period.
   8       SHUTDOWN                     Shutting down for an extended
                                        period.
   9       RECOVER-DONE                 Interlock state prior to NORMAL
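
   The legal values above map naturally onto an enumeration.  The
   following Python sketch is illustrative only; the class and function
   names are hypothetical.

```python
from enum import IntEnum


class ServerState(IntEnum):
    """Values of the server-state option; 0 is reserved."""
    STARTUP = 1
    NORMAL = 2
    COMMUNICATIONS_INTERRUPTED = 3
    PARTNER_DOWN = 4
    POTENTIAL_CONFLICT = 5
    RECOVER = 6
    PAUSED = 7
    SHUTDOWN = 8
    RECOVER_DONE = 9


def parse_server_state(payload: bytes) -> ServerState:
    """Decode the one-byte payload; reserved or unknown values raise."""
    if len(payload) != 1:
        raise ValueError("server-state payload must be 1 byte")
    return ServerState(payload[0])
```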

6.2.19.  server-flags

   This option is used to convey the current flags of the failover
   endpoint in the sending server.

       Code          Len     Server Flags
   +-----+-----+------+-----+-------+
   |  0  |  19 |   0  |  1  | flags |
   +-----+-----+------+-----+-------+

   Legal values for this option are:

   Currently, bit 5 is defined.  All other bits
   are reserved, and must be set to 0.

      o STARTUP

        Bit 5 is the STARTUP flag.  Bit 5 MUST be set to 1 whenever the
        server is in STARTUP state, and set to 0 otherwise.  (Note that
        when in STARTUP state, the state transmitted in the server-state
        option is usually the last recorded state from stable storage,
        but see section 9.3 for details.)

6.2.20.  vendor-specific-options

   This option is used to convey options specific to a particular
   vendor's implementation.  The vendor class identifier is used to
   specify which option space the embedded options are drawn from.

   It functions similarly to the vendor class identifier and vendor
   specific options in the DHCP protocol.

   This option contains other options in the same two byte code, two
   byte length format.  If this option appears in a message without a
   corresponding vendor class identifier, it MUST be ignored.

        Code         Len        Embedded options
   +-----+-----+------+-----+----+-----+---
   |  0  |  20 |   0  |  n  | c1 |  c2 |  ...
   +-----+-----+------+-----+----+-----+---
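
   Because the embedded options use the same two byte code, two byte
   length format, one parser can walk both a message's option area and
   the payload of this option.  A minimal Python sketch, with a
   hypothetical function name, might look like this:

```python
import struct
from typing import List, Tuple


def parse_options(buf: bytes) -> List[Tuple[int, bytes]]:
    """Split buf into (code, payload) pairs; raise on truncation."""
    options = []
    offset = 0
    while offset < len(buf):
        if len(buf) - offset < 4:
            raise ValueError("truncated option header")
        code, length = struct.unpack_from("!HH", buf, offset)
        offset += 4
        if len(buf) - offset < length:
            raise ValueError("truncated option payload")
        options.append((code, buf[offset:offset + length]))
        offset += length
    return options
```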

6.2.21.  max-unacked-bndupd

   The maximum number of BNDUPD messages that this server is prepared to
   accept over the TCP connection without causing the TCP connection to
   block.

        Code         Len     Maximum Unacked BNDUPD
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  21 |   0  |  4  | n1 |  n2 |  n3 |  n4 |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.22.  server-role

   This option is used to convey the role of the failover endpoint in
   the sending server.

       Code          Len      Role
   +-----+-----+------+-----+-------+
   |  0  |  22 |   0  |  1  |   r1  |
   +-----+-----+------+-----+-------+

   A value of 0 indicates that the failover endpoint is a primary server
   and a value of 1 indicates that it is a secondary server.

6.2.23.  receive-timer

   The number of seconds within which the server must receive a message
   from its partner, or it will assume that the partner is down or the
   communication path to the partner has failed.

        Code         Len         Receive Timer
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  23 |   0  |  4  | s1 |  s2 |  s3 |  s4 |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.24.  hash-bucket-assignment

   The set of hash values to which the receiving server MUST respond.
   See section 5.3 for more information on how this option is used.

   This option consists of a set of 32 bytes, in network byte order,
   where each bit corresponds to one of 256 possible hash bucket values.
   If a bit is set to 1, the recipient is required to service the
   requests whose client-identifier, or htype concatenated with the
   chaddr (if no client-identifier exists), maps into the corresponding
   hash bucket.  The format and usage of the data in this option are
   defined in [LOADB].

        Code         Len        Hash Buckets
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  24 |   0  |  32 | b1 |  b2 | ... | b32 |
   +-----+-----+------+-----+----+-----+-----+-----+
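
   Treating the 32 bytes as a 256 bit map, a membership test can be
   sketched in Python as follows.  The function name is hypothetical,
   and the bit ordering within each byte is an assumption made for
   illustration; [LOADB] is authoritative for the actual layout.

```python
def services_bucket(hash_buckets: bytes, bucket: int) -> bool:
    """Return True when the bit for bucket (0-255) is set in the map.

    ASSUMPTION: bucket k lives in byte k // 8 at bit k % 8, counted
    from the least significant bit.  Check [LOADB] before relying on it.
    """
    if len(hash_buckets) != 32:
        raise ValueError("hash-bucket-assignment payload must be 32 bytes")
    if not 0 <= bucket <= 255:
        raise ValueError("bucket must be in the range 0..255")
    return bool(hash_buckets[bucket // 8] & (1 << (bucket % 8)))
```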

6.2.25.  message-digest

   The message digest for this message.

   This option consists of a variable number of bytes which contain the
   message digest of the message prior to the inclusion of this option.

   When this option appears in a message, it MUST appear as the last
   option in the message.

        Code         Len       Message Digest
   +-----+-----+------+-----+----+-----+-----
   |  0  |  25 |   0  |  n  | d1 |  d2 | ...
   +-----+-----+------+-----+----+-----+-----

6.2.26.  protocol-version

   The protocol version being used by the server. It is only sent in the
   CONNECT and CONNECTACK messages.

        Code         Len    Version
   +-----+-----+------+-----+----+
   |  0  |  26 |   0  |  1  | v1 |
   +-----+-----+------+-----+----+

6.2.27.  TLS-request

   This option contains information relating to TLS security
   negotiation.  It is sent in a CONNECT message.

   The first byte, req, is the TLS request from this server.  A value of
   0 indicates no TLS operation, a value of 1 indicates that TLS
   operation is desired, and a value of 2 indicates that TLS operation
   is required to establish communications with this server.

   The second byte, acc, is what this server will accept for TLS
   operation.  A value of 0 means that this server will not accept TLS
   connections.  A value of 1 means that this server will accept TLS
   connections.

   If req is not zero, then acc MUST be 1.

   This allows a server which is not configured to require TLS support
   to inform its partner that it will accept a TLS connection although
   it does not desire one, for instance.

        Code         Len  request accept
   +-----+-----+------+-----+----+----+
   |  0  |  27 |   0  |  2  | req| acc|
   +-----+-----+------+-----+----+----+
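
   The constraint between req and acc can be checked when the option is
   received.  The following Python sketch is illustrative; the function
   name is hypothetical.

```python
from typing import Tuple


def parse_tls_request(payload: bytes) -> Tuple[int, int]:
    """Return (req, acc) after checking the rules stated for the option."""
    if len(payload) != 2:
        raise ValueError("TLS-request payload must be 2 bytes")
    req, acc = payload[0], payload[1]
    if req not in (0, 1, 2):
        raise ValueError("req must be 0, 1 or 2")
    if acc not in (0, 1):
        raise ValueError("acc must be 0 or 1")
    if req != 0 and acc != 1:
        raise ValueError("acc MUST be 1 when req is not zero")
    return req, acc
```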

6.2.28.  TLS-reply

   This option contains information relating to TLS security
   negotiation.  It is sent in a CONNECTACK message.

   A value of 0 indicates no TLS operation, and a value of 1 indicates
   that TLS operation is required.

        Code         Len     TLS
   +-----+-----+------+-----+----+
   |  0  |  28 |   0  |  1  | t1 |
   +-----+-----+------+-----+----+

6.2.29.  client-request-options

   This option contains options from a DHCP client's request.  It is
   sent in a BNDUPD message.  The first 4 bytes of the option contain
   the "magic number" of the option area from which the DHCP client's
   request options were taken and serves to define the format of the
   rest of the sub-options contained in this option.  After the magic
   number, the options included are in the normal options format
   appropriate for that magic number.

   A server SHOULD NOT include all of the options in a DHCP client
   request in this option, but rather a server SHOULD include only those
   options which are of likely interest to its partner server.  See
   section 7.1 for details.

        Code         Len         Magic Number      Embedded options
   +-----+-----+------+-----+----+----+----+----+----+----+--
   |  0  |  29 |   0  |  n  | m1 | m2 | m3 | m4 | b1 | b2 |  ...
   +-----+-----+------+-----+----+----+----+----+----+----+--

6.2.30.  client-reply-options

   This option contains options from a DHCP server's reply to a DHCP
   client request.  It is sent in a BNDUPD message.  The first 4 bytes
   of the option contain the "magic number" of the option area from
   which the DHCP reply options were taken and serves to define the
   format of the rest of the sub-options contained in this option.
   After the magic number, the options included are in the normal
   options format appropriate for that magic number.

   A server SHOULD NOT include all of the options in a DHCP server's
   reply to a client's request in this option, but rather a server
   SHOULD include only those options which are of likely interest to its
   partner server.  See section 7.1 for details.

        Code         Len         Magic Number      Embedded options
   +-----+-----+------+-----+----+----+----+----+----+----+--
   |  0  |  30 |   0  |  n  | m1 | m2 | m3 | m4 | b1 | b2 |  ...
   +-----+-----+------+-----+----+----+----+----+----+----+--

6.3.  BNDUPD message format

   The binding update (BNDUPD) message is used to send the binding data-
   base changes to the partner server.

   The message type for the BNDUPD message is 3.

   The xid of the BNDUPD MUST be unique with respect to other failover
   messages transmitted from this failover endpoint.

   The following table summarizes the various options for the BNDUPD
   message.

                                        binding-status

   Option                        ACTIVE     EXPIRED    RELEASED   FREE
   ------                        ------     -------    --------   ----
   assigned-IP-address           MUST       MUST       MUST       MUST
   binding-status                MUST       MUST       MUST       MUST
   client-identifier             MAY        MAY        MAY        MAY
   client-hardware-address       MUST       MUST       MUST       MAY
   lease-expiration-time         MUST       MUST NOT   MUST NOT   MUST NOT
   potential-expiration-time     MUST       MUST NOT   MUST NOT   MUST NOT
   grace-expiration-time         MUST NOT   MUST NOT   MUST NOT   MUST NOT
   start-time-of-state           SHOULD     SHOULD     SHOULD     SHOULD
   client-last-trans.-time       MUST       SHOULD     MUST       MAY
   DDNS(1)                       SHOULD     SHOULD     SHOULD     MAY
   client-request-options        SHOULD     SHOULD NOT SHOULD     SHOULD NOT
   client-reply-options          SHOULD     SHOULD NOT SHOULD     SHOULD NOT
   all others                    MAY        MAY        MAY        MAY

                                        binding-status

                                BACKUP
                                RESET
   Option                       ABANDONED
   ------                       ---------
   assigned-IP-address          MUST
   binding-status               MUST
   client-identifier            MAY(2)
   client-hardware-address      MAY(2)
   lease-expiration-time        MUST NOT
   potential-expiration-time    MUST NOT
   grace-expiration-time        MUST NOT
   start-time-of-state          SHOULD
   client-last-trans.-time      MAY
   DDNS(1)                      SHOULD
   client-request-options       SHOULD NOT
   client-reply-options         SHOULD NOT
   all others                   MAY

   (1) SHOULD appear only if the client supplies a host name and the
       server supports dynamic DNS.

   (2) MUST NOT if binding-status is ABANDONED.

              Table 6.3-1: Options used in a BNDUPD message
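
   A receiver can apply the MUST and MUST NOT entries of the first table
   above mechanically.  The Python sketch below covers only a subset of
   the rows, for the ACTIVE, EXPIRED, RELEASED and FREE statuses; the
   dictionary and function names are hypothetical.

```python
from typing import Dict, List, Set

COMMON = {"assigned-IP-address", "binding-status"}
TIMES = {"lease-expiration-time", "potential-expiration-time"}
GRACE = {"grace-expiration-time"}

# Options that MUST appear, keyed by binding-status (subset of the table).
MUST: Dict[str, Set[str]] = {
    "ACTIVE": COMMON | {"client-hardware-address"} | TIMES,
    "EXPIRED": COMMON | {"client-hardware-address"},
    "RELEASED": COMMON | {"client-hardware-address"},
    "FREE": set(COMMON),
}

# Options that MUST NOT appear, keyed by binding-status.
MUST_NOT: Dict[str, Set[str]] = {
    "ACTIVE": set(GRACE),
    "EXPIRED": TIMES | GRACE,
    "RELEASED": TIMES | GRACE,
    "FREE": TIMES | GRACE,
}


def check_bndupd(status: str, present: Set[str]) -> List[str]:
    """Return a list of violations; an empty list means no problem found."""
    problems = [f"missing {name}" for name in sorted(MUST[status] - present)]
    problems += [f"forbidden {name}"
                 for name in sorted(MUST_NOT[status] & present)]
    return problems
```

   Keeping the rules in data rather than in code makes it easier to
   follow the table as it changes between revisions.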

6.4.  BNDACK message format

   A server sends a binding acknowledgement (BNDACK) message when it has
   successfully committed binding database changes received from a fail-
   over partner in a BNDUPD message to its own stable storage.

   The message type for the BNDACK message is 4.

   The xid in a BNDACK MUST be the same as the xid of the corresponding
   BNDUPD.

   The following table summarizes the options for the BNDACK message.

                                        binding-status

   Option                        ACTIVE     EXPIRED    RELEASED   FREE
   ------                        ------     -------    --------   ----
   assigned-IP-address           MUST       MUST       MUST       MUST
   binding-status                MUST       MUST       MUST       MUST
   client-identifier             MAY        MAY        MAY        MAY
   client-hardware-address       MUST       MUST       MUST       MAY
   reject-reason                 MAY        MAY        MAY        MAY
   message                       MAY        MAY        MAY        MAY
   lease-expiration-time         MUST       MUST NOT   MUST NOT   MUST NOT
   potential-expiration-time     MUST       MUST NOT   MUST NOT   MUST NOT
   grace-expiration-time         MUST NOT   MUST NOT   MUST NOT   MUST NOT
   start-time-of-state           SHOULD     SHOULD     SHOULD     SHOULD
   client-last-trans.-time       SHOULD     SHOULD     SHOULD     MAY
   DDNS(1)                       SHOULD     SHOULD     SHOULD     SHOULD
   all others                    MAY        MAY        MAY        MAY

                                        binding-status

                                BACKUP
                                RESET
   Option                       ABANDONED
   ------                       ---------
   assigned-IP-address          MUST
   binding-status               MUST
   client-identifier            MAY
   client-hardware-address      MAY(2)
   reject-reason                MAY
   message                      MAY
   lease-expiration-time        MUST NOT
   potential-expiration-time    MUST NOT
   grace-expiration-time        MUST NOT
   start-time-of-state          SHOULD
   client-last-trans.-time      MAY
   DDNS(1)                      SHOULD
   all others                   MAY

   (1) SHOULD appear only if the client supplies a host name and the
       server supports dynamic DNS.

   (2) MUST NOT if binding-status is ABANDONED.

              Table 6.4-1: Options used in a BNDACK message

6.5.  Bulking for BNDUPD and BNDACK messages

   DISCUSSION:

      Bulking is planned for this protocol, but it hasn't been specified
      in this revision of the draft.  Once the draft settles down, we
      will specify the bulking approach in detail.

6.6.  UPDREQ message format

   The update request (UPDREQ) message is used by one server to request
   that its partner send it all binding database information that it has
   not already seen.

   The message type for the UPDREQ message is 9.

   The xid in a UPDREQ message MUST be unique among messages transmitted
   from this failover endpoint during the life of this connection.

   There are no options that MUST appear in an UPDREQ message.  Any
   option MAY appear, though very few will likely be useful.

6.7.  UPDREQALL message format

   The update request all (UPDREQALL) message is used by one server to
   request that all binding database information be sent in order to
   recover from a total loss of its binding database by the requesting
   server.

   The message type for the UPDREQALL message is 7.

   The xid in a UPDREQALL message MUST be unique among messages
   transmitted from this failover endpoint during the life of this con-
   nection.

   There are no options that MUST appear in an UPDREQALL message.  Any
   option MAY appear, though very few will likely be useful.

6.8.  UPDDONE message format

   The update done (UPDDONE) message is used by the responding server to
   indicate that all requested updates have been sent by the responding
   server as BNDUPD messages and responded to by the requesting server
   using BNDACK messages.  While a BNDACK message MUST have been
   received for each BNDUPD message prior to the transmission of the
   UPDDONE message, this doesn't necessarily mean that all of the BNDUPD
   messages were accepted, only that all of them were responded to with
   a BNDACK message.  Thus, a NAK (comprised of a BNDACK message
   containing a reject-reason option) could be used to reject a BNDUPD,
   but for the purposes of the UPDDONE message, such a NAK would count
   as a response to the associated BNDUPD message, and would not block
   the eventual transmission of the UPDDONE message.

   The message type for the UPDDONE message is 8.

   The xid in an UPDDONE message MUST be identical to the xid in the
   UPDREQ or UPDREQALL message that initiated the update process.
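
   One way a responding server might track when UPDDONE may be sent, and
   with which xid, is sketched below in Python.  The class and method
   names are hypothetical; the sketch treats a BNDACK carrying a
   reject-reason the same as any other BNDACK.

```python
from typing import Optional, Set


class UpdateSession:
    """Tracks one UPDREQ or UPDREQALL exchange on the responding server."""

    def __init__(self, request_xid: int) -> None:
        self.request_xid = request_xid      # xid of the UPDREQ/UPDREQALL
        self.outstanding: Set[int] = set()  # xids of unanswered BNDUPDs
        self.all_sent = False

    def bndupd_sent(self, xid: int) -> None:
        self.outstanding.add(xid)

    def bndack_received(self, xid: int, rejected: bool = False) -> None:
        # A rejection still counts as a response to the BNDUPD.
        self.outstanding.discard(xid)

    def finish_sending(self) -> None:
        self.all_sent = True

    def upddone_xid(self) -> Optional[int]:
        """Return the xid to use for UPDDONE, or None if not yet due."""
        if self.all_sent and not self.outstanding:
            return self.request_xid
        return None
```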

   There are no options that MUST appear in an UPDDONE message.  Any
   option MAY appear, though very few will likely be useful.

6.9.  POOLREQ message format

   The pool request (POOLREQ) is used by the secondary server to request
   an allocation of IP addresses from the primary server.

   The message type for the POOLREQ message is 1.

   The xid in a POOLREQ message MUST be unique among messages transmit-
   ted from this failover endpoint during the life of this connection.

   There are no options that MUST appear in a POOLREQ message.  Any
   option MAY appear.

6.10.  POOLRESP message format

   The pool response (POOLRESP) is used by the primary server to inform
   the secondary server how many IP addresses were allocated to the
   secondary server as the result of the pool request.

   The message type for the POOLRESP message is 2.

   The xid in the POOLRESP message MUST be identical to the xid in the
   POOLREQ message for which this POOLRESP is a response.

   The following table shows the options that MUST appear in a POOLRESP
   message:

           Option
           ------
           addresses-transferred       MUST

              Table 6.10-1: Options used in a POOLRESP message

6.11.  CONNECT message format

   The connect (CONNECT) message is used by the primary server to
   establish a high level connection with the other server, and to
   transmit several important configuration data items between the
   servers.

   The message type for the CONNECT message is 5.

   The xid in a CONNECT message MUST be unique among messages transmit-
   ted from this failover endpoint during the life of this connection.

   The CONNECT message MUST be the first message sent down a newly esta-
   blished connection.

   The following table summarizes the options that are associated with
   the CONNECT message:

                                      role

   Option                      primary       secondary
   ------                      ------        ---------
   sending-server-IP-address   MUST          MUST
   server-role                 MUST          MUST
   max-unacked-bndupd          MUST          MUST
   receive-timer               MUST          MUST
   current-time                MUST          MUST
   vendor-class-identifier     MUST          MUST
   protocol-version            MUST          MUST
   TLS-request                 MUST(1)       MUST(1)
   MCLT                        MUST          MUST NOT
   hash-bucket-assignment      MUST          MUST NOT
   all others                  MAY           MAY

   (1) If the CONNECT message is being sent on a TLS secured connection,
   then there MUST NOT be a TLS-request option.

              Table 6.11-1: Options used in a CONNECT message

6.12.  CONNECTACK message format

   The connect response (CONNECTACK) message is used by a secondary
   server to respond to the receipt of a CONNECT message from the
   primary server.

   The message type for the CONNECTACK message is 6.

   The xid in the CONNECTACK message MUST be identical to the xid in the
   CONNECT message for which this CONNECTACK is a response.

   The following table summarizes the options associated with the CON-
   NECTACK message:

   Option
   ------
   sending-server-IP-address   MUST
   server-role                 MUST
   max-unacked-bndupd          MUST
   receive-timer               MUST
   current-time                MUST
   vendor-class-identifier     MUST
   protocol-version            MUST
   TLS-reply                   MUST(1)
   reject-reason               MAY(2)
   message                     MAY

   (1) If the CONNECTACK is being sent over an already TLS secured
       connection, then the TLS-reply option MUST NOT appear.

   (2) Indicates a rejection of the CONNECT message.

              Table 6.12-1: Options used in a CONNECTACK message

6.13.  STATE message format

   The state (STATE) message is used by either server to communicate the
   current state of the failover endpoint to the other server.  It MUST
   be sent immediately after connection negotiation completes with the
   other server, and it MUST be sent whenever the server's state
   changes.

   The message type for the STATE message is 10.

   The xid in a STATE message MUST be unique among messages transmitted
   from this failover endpoint during the life of this connection.

   The following table shows the options that MUST appear in a STATE
   message:

           Option
           ------
           server-state                MUST
           server-flags                MUST
           start-time-of-state         MUST

                      Table 6.13-1: Options used in a STATE message

6.14.  CONTACT message format

   The contact (CONTACT) message is used by either server to verify that
   the connection is operational to the other server.

   The message type for the CONTACT message is 11.

   The xid in a CONTACT message MUST be unique among messages transmit-
   ted from this failover endpoint during the life of this connection.

   There are no options that MUST be used in a CONTACT message.

6.15.  DISCONNECT message format

   The disconnect (DISCONNECT) message is used by either server just
   prior to closing a connection.

   The message type for the DISCONNECT message is 12.

   The xid in a DISCONNECT message MUST be unique among messages
   transmitted from this failover endpoint during the life of this con-
   nection.

   The DISCONNECT message MUST be the last message sent down a connec-
   tion before it is closed.

   The following table summarizes the options that are associated with
   the DISCONNECT message:

   Option
   ------
   reject-reason               MUST
   message                     SHOULD

              Table 6.15-1: Options used in a DISCONNECT message

7.  Protocol Messages

   This section contains the detailed definition of the protocol mes-
   sages, including the information to include when sending the message,
   as well as the actions to take upon receiving the message.

7.1.  BNDUPD message

   The binding update (BNDUPD) message is used to send the binding data-
   base changes to the partner server, and the partner server responds
   with a binding acknowledgement (BNDACK) message when it has success-
   fully committed those changes to its own stable storage.

   The rest of the failover protocol exists to determine whether the
   partner server is able to communicate or not, and to enable the
   partners to exchange BNDUPD/BNDACK messages in order to keep their
   binding databases in stable storage synchronized.

7.1.1.  Sending the BNDUPD message

   A BNDUPD message SHOULD be generated whenever any binding changes.  A
   change might be in the binding-status, the lease-expiration-time, or
   even just the last-transaction-time.  In general, any time a DHCP
   client sends in a packet that results in a DHCP server writing to its
   stable storage, a BNDUPD message SHOULD be generated.

   The BNDUPD (and BNDACK) messages refer to the binding-status of the
   IP address, and this protocol defines a series of binding-statuses,
   discussed in more detail below.  Some servers may not support all of
   these binding-statuses, and so in those cases they will not be sent,
   and upon receipt a reasonable interpretation should be made.

   All BNDUPD messages MUST contain the assigned-IP-address option,
   which carries the IP address about which the BNDUPD message is being
   sent.

   All BNDUPD messages MUST contain the binding-status option, and it
   will have one of the values in the following list.  This list
   discusses the meanings of the various binding-statuses and the infor-
   mation that should go into the BNDUPD message because of them.

      o ACTIVE

        Indicates that the IP address is currently leased to a DHCP
        client.

        client-hardware-address

        The client-hardware-address option MUST appear, and be set from
        the htype and chaddr of the DHCP client to which this IP address
        is leased.

        client-identifier

        If the DHCP client to which this IP address is leased used a
        client-identifier option to identify itself, then the client-
        identifier MUST appear in the BNDUPD message, else it MUST NOT
        appear.

        lease-expiration-time

        The lease-expiration-time option MUST appear, and be set to the
        expiration time most recently ACKed to the DHCP client.  Note
        that the time ACKed to a DHCP client is a lease duration in
        seconds, while the lease-expiration-time option in a BNDUPD mes-
        sage is an absolute time value.

        potential-expiration-time

        The potential-expiration-time option MUST appear, and be set to
        a value beyond that of the lease-expiration time.  This is the
        value that is ACKed by the BNDACK message.  A server sending a
        BNDUPD message MUST be able to recover the potential-
        expiration-time sent in every BNDUPD, not just those that
        receive a corresponding BNDACK, in order to be able to protect
        against possible duplicate allocation of IP addresses after
        transitioning to PARTNER-DOWN state. See section 5.2.1 for
        details as to why the potential-expiration-time exists and
        guidelines for how to decide the value.

      o EXPIRED

        A binding-status of EXPIRED is used when a client's binding on
        an IP address has expired and the server does not wish to
        implement an expired-grace period.  When the partner server ACKs
        the BNDUPD of an EXPIRED IP address, the server sets its
        internal state to FREE.  The IP address is then available for
        allocation to any client of the primary server.

        client-hardware-address

        There SHOULD be a DHCP client associated with the IP address
        whose binding has expired.  If there is, then the client-
        hardware-address option MUST appear, and be set from the htype
        and chaddr of the DHCP client to which this IP address was
        leased.

        client-identifier

        There SHOULD be a DHCP client associated with the IP address
        whose binding has expired.  If there is, then if the DHCP client
        to which this IP address was leased used a client-identifier
        option to identify itself, then the client-identifier MUST
        appear in the BNDUPD message, else it MUST NOT appear.

      o RELEASED

        A binding-status of RELEASED is used when a DHCP client sends in
        a DHCPRELEASE message and the server does not wish to implement
        a released-grace period.  When the partner server ACKs the
        BNDUPD of a RELEASED IP address, the server sets its internal
        state to FREE, and it is available for allocation by the primary
        server to any DHCP client.

        client-hardware-address

        There SHOULD be a DHCP client associated with the IP address
        whose binding has been released.  If there is, then the client-
        hardware-address option MUST appear, and be set from the htype
        and chaddr of the DHCP client which released this IP address.

        client-identifier

        There SHOULD be a DHCP client associated with the IP address
        whose binding has been released.  If there is, then if the DHCP
        client which released this IP address used a client-identifier
        option to identify itself, then the client-identifier MUST
        appear in the BNDUPD message, else it MUST NOT appear.

      o FREE

        A binding-status of FREE is used when a DHCP server needs to
        communicate that an IP address is available for allocation to
        another server, but it was not just released, expired, or reset
        by a network administrator.  When the partner server ACK's the
        BNDUPD of a FREE IP address, the server sets its internal state
        such that it is available for allocation to any DHCP client.

        client-hardware-address

        There MAY be a DHCP client associated with the IP address whose
        binding is now desired to be FREE.  If there is, then the
        client-hardware-address option MUST appear, and be set from the
        htype and chaddr of the DHCP client which released this IP
        address.

        client-identifier

        There MAY be a DHCP client associated with the IP address whose
        binding is now desired to be FREE.  If there is, then if the
        DHCP client which released this IP address used a client-
        identifier option to identify itself, then the client-identifier
        MUST appear in the BNDUPD message, else it MUST NOT appear.

      o EXPIRED-GRACE

        Some servers support a grace period after lease expiration, to
        handle clock speed differences between clients and servers as
        well as to limit the number of times names are removed and
        subsequently added to dynamic DNS.

        client-hardware-address

        There MAY be a DHCP client associated with the IP address whose
        binding has now expired.  If there is, then the client-
        hardware-address option MUST appear, and be set from the htype
        and chaddr of the DHCP client which most recently leased this IP
        address.

        client-identifier

        There MAY be a DHCP client associated with the IP address whose
        binding has now expired.  If there is, then if the DHCP client
        which most recently leased this IP address used a client-
        identifier option to identify itself, then the client-identifier
        MUST appear in the BNDUPD message, else it MUST NOT appear.

        grace-expiration-time

        The grace-expiration-time option MUST appear, and is the length
        of time that this server will wait before trying to make the IP
        address available after the lease has expired for this IP
        address.

      o RELEASED-GRACE

        Some servers support a grace period after lease release by a
        DHCP client, to handle clock speed differences between clients
        and servers as well as to limit the number of times names are
        removed and subsequently added to dynamic DNS.

        client-hardware-address

        There MAY be a DHCP client associated with the IP address whose
        binding has now been released by sending a DHCPRELEASE.  If
        there is, then the client-hardware-address option MUST appear,
        and be set from the htype and chaddr of the DHCP client which
        released this IP address.

        client-identifier

        There MAY be a DHCP client associated with the IP address whose
        binding has been released.  If there is, then if the DHCP client
        which most recently leased this IP address used a client-
        identifier option to identify itself, then the client-identifier
        MUST appear in the BNDUPD message, else it MUST NOT appear.

        grace-expiration-time

        The grace-expiration-time option MUST appear, and is the length
        of time that this server will wait before trying to make the IP
        address available after the lease was released for this IP
        address.

      o ABANDONED

        An ABANDONED IP address is one that has been considered unusable
        by the DHCP subsystem.  An IP address for which a valid PING
        response was received SHOULD be set to ABANDONED.

        client-hardware-address

        There SHOULD NOT be a DHCP client associated with an ABANDONED
        IP address.  The client-hardware-address option MUST NOT appear
        in the BNDUPD message.

        client-identifier

        There SHOULD NOT be a DHCP client associated with the IP address
        whose binding has now been ABANDONED.  The client-identifier
        option MUST NOT appear in the BNDUPD message.

      o RESET

        The RESET value of the binding-status is used to indicate that
        this IP address was made available by operator command.

      o BACKUP

        The BACKUP value of binding-status indicates that this IP
        address belongs to the secondary server, and can be allocated by
        that server to a DHCP client at any time.

        client-hardware-address

        There MAY be a DHCP client associated with a BACKUP IP address.
        If there is, the client-hardware-address option MUST appear, and
        be set from the htype and chaddr of the DHCP client to which
        this IP address was most recently associated.

        client-identifier

        There MAY be a DHCP client associated with this IP address.  If
        the DHCP client to which this IP address is leased used a
        client-identifier option to identify itself, then the client-
        identifier MUST appear in the BNDUPD message, else it MUST NOT
        appear.
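
   As an informal illustration only, the per-status option rules above
   can be collected into a small checker.  The sketch below is not part
   of the protocol: the function name, the set names, the dict model of
   a BNDUPD, and the has_client and client_used_identifier flags are
   all assumptions made for this example.  It covers only the statuses
   listed above whose client option rules are spelled out, and it does
   not attempt the ACTIVE or RESET cases.

```python
# Hypothetical sketch: check some per-status option rules for a BNDUPD.
# The BNDUPD options are modelled as a dict keyed by option name.

CLIENT_RULE_STATES = {"EXPIRED", "RELEASED", "FREE", "EXPIRED-GRACE",
                      "RELEASED-GRACE", "BACKUP"}
GRACE_STATES = {"EXPIRED-GRACE", "RELEASED-GRACE"}


def check_bndupd_options(status, options, has_client=True,
                         client_used_identifier=False):
    """Return a list of rule violations (empty if none were found)."""
    problems = []
    has_chaddr = "client-hardware-address" in options
    has_clientid = "client-identifier" in options

    if status == "ABANDONED":
        # Neither client option may appear for an ABANDONED address.
        if has_chaddr:
            problems.append("client-hardware-address MUST NOT appear")
        if has_clientid:
            problems.append("client-identifier MUST NOT appear")
    elif status in CLIENT_RULE_STATES and has_client:
        # With an associated client, the hardware address is required and
        # the identifier appears only if the client itself used one.
        if not has_chaddr:
            problems.append("client-hardware-address MUST appear")
        if client_used_identifier and not has_clientid:
            problems.append("client-identifier MUST appear")
        if has_clientid and not client_used_identifier:
            problems.append("client-identifier MUST NOT appear")

    if status in GRACE_STATES and "grace-expiration-time" not in options:
        problems.append("grace-expiration-time MUST appear")
    return problems
```

   A sending server could run such a check just before transmitting a
   BNDUPD, and a receiving server could use it to decide whether a
   message is well formed before looking for conflicts.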

   The following option information is generic to all BNDUPD messages,
   regardless of the value of the binding-status.

   o start-time-of-state

     The start-time-of-state SHOULD appear.  It is set to the time at
     which this IP address first took on the state that corresponds to
     the current value of binding-status.

   o last-transaction-time

     The last-transaction-time value SHOULD appear.  This is the time at
     which this DHCP server last received a packet from the DHCP client
     referenced by the client-identifier or client-hardware-address that
     was associated with the IP address referenced by the assigned-IP-
     address.

   o DDNS

     If the DHCP server is performing dynamic DNS operations on behalf
     of the DHCP client represented by the client-identifier or client-
     hardware-address, then it should include a DDNS option containing
     the host name, domain name, and status of any dynamic DNS opera-
     tions enabled.

   o client-request-options

     If the BNDUPD was triggered by a request from a DHCP client (typi-
     cally those with binding-status of ACTIVE and RELEASED), then the
     server SHOULD include options of interest to a failover partner
     from the client's request packet in the client-request-options for
     transmission to its partner.

     A server sending a BNDUPD need not remember the "interesting"
     options or the information that would appear in an "interesting"
     option for transmission at a time when the BNDUPD is not closely
     associated with a DHCP client request.

     A server SHOULD send the following "interesting" options.  It MAY
     send any DHCP client options.  As new options are defined, the RFC
     defining these options SHOULD include information that they are
     "interesting to failover servers" if they should be sent as part of
     a BNDUPD.

         option          option
         number          name
         -----------------------------------------

         12              host-name
         81              client-FQDN [DDNS]
         82              relay-agent-information [AGENTINFO]
         TBD             user-class [USERCLASS]
         60              vendor-class-identifier

           Table 7.1.1-1: Options which SHOULD be sent in
           the client-request-options option in a BNDUPD message.

   o client-reply-options

     If the BNDUPD was triggered by a request from a DHCP client (typi-
     cally those with binding-status of ACTIVE and RELEASED), then the
     server SHOULD include options of interest to a failover partner
     from the server's DHCP reply packet in the client-reply-options for
     transmission to its partner.

     A server sending a BNDUPD need not remember the "interesting"
     options or the information that would appear in an "interesting"
     option for transmission at a time when the BNDUPD is not closely
     associated with a DHCP client request.

     A server SHOULD send the following "interesting" options.  It MAY
     send any DHCP client options.  As new options are defined, the RFC
     defining these options SHOULD include information that they are
     "interesting to failover servers" if they should be sent as part of
     a BNDUPD.

         option          option
         number          name
         -----------------------------------------

         58              renewal-time
         59              rebinding-time

           Table 7.1.1-2: Options which SHOULD be sent in
           the client-reply-options option in a BNDUPD message.
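
   The selection of "interesting" options amounts to filtering the DHCP
   packet's options by option number.  The following Python sketch
   shows one way to do that, using the numbers from Tables 7.1.1-1 and
   7.1.1-2.  The function and constant names are assumptions made for
   this example, and user-class is left out because its option number is
   still listed as TBD.

```python
# Hypothetical sketch: pick the options worth forwarding in a BNDUPD.
# Option numbers come from Tables 7.1.1-1 and 7.1.1-2; user-class is
# omitted because its number is not yet assigned.

INTERESTING_REQUEST_OPTIONS = {
    12: "host-name",
    60: "vendor-class-identifier",
    81: "client-FQDN",
    82: "relay-agent-information",
}

INTERESTING_REPLY_OPTIONS = {
    58: "renewal-time",
    59: "rebinding-time",
}


def select_options(packet_options, interesting):
    """Return the subset of packet_options whose numbers are interesting.

    packet_options maps DHCP option numbers to raw option values.
    """
    return {code: value for code, value in packet_options.items()
            if code in interesting}
```

   A server that chooses to forward additional options, which the text
   permits, would simply pass a larger table.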

   The BNDUPD message SHOULD be sent as soon as possible after the DHCP
   client has received a response and the lease bindings database has
   been written to stable storage.

7.1.2.  Receiving the BNDUPD message

   When a server receives a BNDUPD message, it needs to decide how to
   process the message and whether the message represents a conflict
   of any sort. The conflict resolution process SHOULD be used on the
   receipt of every BNDUPD message, not just those that are received
   while in POTENTIAL-CONFLICT state, in order to increase the robust-
   ness of the protocol.

   There are three sorts of conflicts:

      o Two clients one IP address conflict

        This is the duplicate IP address allocation conflict. There are
        two different clients each allocated the same address.  There
        cannot be a client conflict unless there is a client specified
        in the BNDUPD message.  See section 5.10.1 for how to resolve
        this conflict.

      o Two IP addresses one client conflict

        This conflict exists when a client on one server is associated
        with one IP address, and on the other server with a different
        IP address in the same or a related subnet. This does not refer
        to the case where a single client has addresses in multiple dif-
        ferent subnets or administrative domains, but rather the case
        where on the same subnet the client has a lease on one IP
        address in one server and on a different IP address on the other
        server.

        This conflict may or may not be a problem for a given DHCP
        server implementation.  In the event that a DHCP server requires
        that a DHCP client have only one outstanding lease for an IP
        address on one subnet, this conflict should be resolved by
        accepting the update which has the latest client-last-
        transaction-time.

      o binding-status conflict

        This is the normal conflict, where one server is updating the
        other
        with newer information.  See section 5.10.1 for details of how
        to resolve these conflicts.

      See section 5.10.1 for details of how to process binding-status
      changes in BNDUPD messages.
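
   The three sorts of conflict can be told apart by comparing the
   client and IP address in the BNDUPD with the receiver's own
   bindings.  The Python sketch below is a simplified illustration
   under stated assumptions: bindings are plain dicts, "same or a
   related subnet" is reduced to equality of a "subnet" key, and the
   function names and the "cltt" key (standing for client-last-
   transaction-time) are invented for this example.  The actual
   resolution rules are those of section 5.10.1, which this sketch does
   not reproduce.

```python
# Hypothetical sketch: name the kinds of conflict a BNDUPD raises.
# update and each local binding are dicts with the keys "ip", "subnet",
# "client" (None when no client is associated) and "status".


def classify_conflicts(update, bindings):
    """Return the set of conflict kinds found against local bindings."""
    kinds = set()
    for local in bindings:
        both_clients = (update["client"] is not None
                        and local["client"] is not None)
        if local["ip"] == update["ip"]:
            if both_clients and local["client"] != update["client"]:
                kinds.add("two-clients-one-ip")
            elif local["status"] != update["status"]:
                # Same address, no client clash: the normal case.
                kinds.add("binding-status")
        elif (both_clients and local["client"] == update["client"]
              and local["subnet"] == update["subnet"]):
            kinds.add("two-ips-one-client")
    return kinds


def prefer_latest_transaction(update, local):
    """Pick the binding with the later client-last-transaction-time.

    On a tie the local binding is kept (a choice made for this sketch).
    """
    return update if update["cltt"] > local["cltt"] else local
```

   The second function applies only to servers that require a client to
   hold a single lease per subnet, as described for the two IP
   addresses one client case.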

7.1.3.  Accepting the BNDUPD message

   When accepting a BNDUPD message, the information contained in the
   client-request-options and client-reply-options SHOULD be examined
   for any information of interest to this server.  For instance, a
   server which wishes to detect changes in client specified host names
   might want to examine and save information from the host-name or
   client-FQDN options.  Servers which expect to utilize information
   from the relay-agent-information option would want to store this
   information.

7.1.4.  Time values related to the BNDUPD message

   There are three time values that may be sent in a BNDUPD message.

      o lease-expiration-time

        The time that the server gave to the client, i.e., the time that
        the server believes that the client's lease will expire.

      o potential-expiration-time

        The time that the server wants to be sure its partner waits
        (added to the MCLT) before assuming that this lease has expired.
        Typically some time beyond the desired client lease time.

      o client-last-transaction-time

        The time that the client last interacted with this server.

   As discussed in section 5.2, each server knows what its partner has
   ACKed with regard to potential-expiration time.  In addition, each
   server needs to remember what it has told its partner as the
   potential-expiration-time.  Moreover, each server must remember what
   it has acked to the *other* server as the most recent potential-
   expiration-time from that server.

   Remember that each server sends a potential-expiration-time and
   receives an ACK for that, as well as receiving a potential-
   expiration-time and needing to remember what it has acked for that.

   While they don't have to be named in any particular way, the times
   that a server needs to remember for every IP address in order to
   implement the failover protocol are:

      o lease-expiration-time

        The time that this server gave to the DHCP client.  A DHCP
        server needs to remember this time already, just to be a DHCP
        server.

      o sent-potential-expiration-time

        The latest time sent to the partner for a potential-expiration-
        time.

      o acked-potential-expiration-time

        The latest time that the partner has acked for a potential-
        expiration-time.  Typically the same as sent-potential-
        expiration-time if there is not a BNDUPD outstanding.

      o received-potential-expiration-time

        The latest time that this server has ever received as a
        potential-expiration-time from its partner in a BNDUPD that this
        server ACKed.

   So, a server has to remember two additional times concerning BNDUPD
   messages that it has initiated, and one additional time concerning
   BNDUPD messages that it has received.  How are these times used?

   First, let's look at the time that a DHCP server can offer to a DHCP
   client.  A server can offer to a DHCP client a time that is no
   longer than the MCLT beyond the max( received-potential-expiration-
   time, acked-potential-expiration-time).  One might think that the
   server should be able to offer only the MCLT beyond the acked-
   potential-expiration-time, and while that is certainly simple and
   easy to understand, it has negative consequences in actual operation.

   To illustrate this, in the simple case where the primary updates the
   secondary for a while and then fails, if the secondary can then renew
   the client for only the MCLT beyond the acked-potential-expiration-
   time, then the secondary will only be able to renew the client for
   the MCLT, because the secondary has never sent a BNDUPD packet to the
   primary concerning this IP address and client, and so its acked-
   potential-expiration-time is zero.

   However, if we allow the secondary to renew the client with the MCLT
   beyond the max( received-potential-expiration-time, acked-potential-
   expiration-time), then the secondary can usually renew the client for
   the full lease period, at least for the first renew it sees from the
   client, since the received-potential-expiration-time is generally
   longer than the client's desired lease interval.  The difference in
   renew times could make a big difference in server load on the
   secondary in this case.

   What are the consequences of allowing a server to offer a DHCP client
   a lease term of the MCLT beyond the max( received-potential-
   expiration-time, acked-potential-expiration-time)?  The consequences
   appear whenever a server enters PARTNER-DOWN state, and affect how
   long that server has to wait before reallocating expired leases.
   With this approach, when a server goes into PARTNER-DOWN state, it
   must wait the MCLT beyond the max( lease-expiration-time, sent-
   potential-expiration-time, acked-potential-expiration-time,
   received-potential-expiration-time ) for each IP address before it
   can reallocate that IP address to another DHCP client.   One might
   normally think that it needed to wait only the MCLT beyond the max(
   lease-expiration-time, received-potential-expiration-time ), i.e.,
   beyond what it has told the client and what it has explicitly acked
   to the other server.  But with the optimization discussed above,
   where either server can offer the DHCP client a lease term of the
   MCLT beyond the max( received-potential-expiration-time, acked-
   potential-expiration-time), the sent-potential-expiration-time and
   acked-potential-expiration-time must be added into the expression,
   since the partner could have used those times as part of its own
   lease time calculation.
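
   The two max() expressions discussed in this section can be written
   down directly.  The Python sketch below is illustrative only: the
   class and method names are assumptions, times are absolute seconds
   with 0 meaning "never recorded", and treating any recorded time that
   lies in the past as the current time is this sketch's reading of the
   example in which a zero acked-potential-expiration-time yields a
   lease of exactly the MCLT.

```python
# Hypothetical sketch of the per-address times a failover server keeps
# and the two limits derived from them.

from dataclasses import dataclass


@dataclass
class AddressTimes:
    """Per-address times in absolute seconds; 0 means never recorded."""
    lease_expiration: int = 0
    sent_potential_expiration: int = 0
    acked_potential_expiration: int = 0
    received_potential_expiration: int = 0

    def latest_offerable_expiration(self, now, mclt):
        """Latest expiration this server may offer a DHCP client."""
        # Assumption: a recorded time in the past counts as "now".
        base = max(now, self.received_potential_expiration,
                   self.acked_potential_expiration)
        return base + mclt

    def earliest_reallocation(self, mclt):
        """Earliest time the address may be reused in PARTNER-DOWN."""
        return mclt + max(self.lease_expiration,
                          self.sent_potential_expiration,
                          self.acked_potential_expiration,
                          self.received_potential_expiration)
```

   For example, with an MCLT of 3600 seconds, a secondary that has
   received a potential-expiration-time of 87400 but has never sent a
   BNDUPD of its own may offer an expiration as late as 91000, whereas
   with no recorded times at all it may offer only the current time
   plus the MCLT.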

   Thus this optimization may require a longer waiting time when enter-
   ing PARTNER-DOWN state, but will generally allow servers to operate
   considerably more effectively when running in COMMUNICATIONS-
   INTERRUPTED state.

7.2.  BNDACK message

   Every BNDUPD message that is received by a server MUST be responded
   to with a corresponding BNDACK message.  The receiving server SHOULD
   respond quickly to every BNDUPD message but it MAY choose to respond
   preferentially to DHCP client requests instead of BNDUPD messages,
   since there is no absolute time period within which a BNDACK must be
   sent in response to a BNDUPD message, and DHCP clients frequently do
   have time constraints that must be met.

   A BNDACK message can only be sent in response to a BNDUPD message
   using the same TCP connection from which the BNDUPD message was
   received, since the XID's in BNDUPD messages are guaranteed unique
   only during the life of a single TCP connection.  When a connection
   to a partner server goes down, a server with unprocessed BNDUPD mes-
   sages MAY simply drop all of those messages, since it can be sure
   that the partner will retransmit them when they are next in communi-
   cations.  A server with unprocessed BNDUPD messages when a TCP con-
   nection goes down MAY instead choose to process those BNDUPD mes-
   sages, but it MUST NOT send any BNDACK messages in response (again
   because of the issues surrounding XID uniqueness).

7.2.1.  Sending the BNDACK message

   The BNDACK message MUST contain the same xid as the corresponding
   BNDUPD message.

   All of the options which appear in the BNDUPD message MUST be
   included in the BNDACK message.  The values in the options MAY be
   updated to reflect current information on the server sending the
   BNDACK.   Note that update of this information may be used for infor-
   mational purposes, but MUST NOT be assumed to necessarily be recorded
   in the stable storage of the server who sent the BNDUPD message
   because there is no corresponding ACK of the BNDACK message.  Any
   information that SHOULD be recorded in the partner server's stable
   storage MUST be transmitted in a subsequent BNDUPD.

   If the server is accepting the BNDUPD, the BNDACK message includes
   only those options that appeared in the BNDUPD message. If the server
   is rejecting the BNDUPD, the additional option reject-reason MUST
   appear in the BNDACK message, and the message option SHOULD appear in
   this case containing a human-readable error message describing in
   some detail the reason for the rejection of the BNDUPD message.
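
   The rules for constructing a BNDACK can be sketched as follows.
   This Python fragment is illustrative only: messages are modelled as
   dicts with "type", "xid" and "options" keys, and the function name
   and parameters are assumptions made for this example.  It echoes the
   BNDUPD options unchanged, although the text permits a server to
   update their values.

```python
# Hypothetical sketch: build the BNDACK for a received BNDUPD.


def build_bndack(bndupd, reject_reason=None, message=None):
    """Return a BNDACK dict answering the given BNDUPD dict.

    reject_reason is None when the BNDUPD is accepted.  message is an
    optional human-readable explanation used only when rejecting.
    """
    # Every option that appeared in the BNDUPD is echoed in the BNDACK.
    options = dict(bndupd["options"])
    if reject_reason is not None:
        options["reject-reason"] = reject_reason
        if message is not None:
            options["message"] = message
    # The BNDACK carries the same xid as the BNDUPD it answers.
    return {"type": "BNDACK", "xid": bndupd["xid"], "options": options}
```

   Copying the options into a new dict keeps the stored BNDUPD intact,
   which matters if the server later wants to compare what it received
   with what it acknowledged.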

   If the server rejects the BNDUPD message with a BNDACK and a reject-
   reason option, it may be because the server believes that it has
   binding information that the other server should know.  A server
   which is rejecting a BNDUPD may initiate a BNDUPD of its own in order
   to update its partner with what it believes is better binding infor-
   mation, but it MUST ensure through some means that it will not end up
   in a situation where each server is sending BNDUPD messages as fast
   as possible because they can't agree on which server has better
   binding data.  Placing a reasonable delay on the initiation of a
   BNDUPD message after sending a BNDACK with a reject-reason would be
   one way to ensure this situation doesn't occur.
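
   One possible form of the delay suggested above is a per-address
   hold-off timer.  The Python sketch below is only one way to do this;
   the class name, its methods, and the choice of a fixed delay are
   assumptions made for this example and are not specified by the
   protocol.

```python
# Hypothetical sketch: delay a counter-BNDUPD after rejecting one.


class CounterUpdateLimiter:
    """Hold off BNDUPDs for an address after a reject was sent for it."""

    def __init__(self, min_delay):
        self.min_delay = min_delay      # seconds to wait after a reject
        self._not_before = {}           # IP address -> earliest send time

    def note_reject_sent(self, ip, now):
        """Record that a BNDACK with reject-reason was just sent for ip."""
        self._not_before[ip] = now + self.min_delay

    def may_send_bndupd(self, ip, now):
        """Return True if a BNDUPD for ip may be initiated at time now."""
        return now >= self._not_before.get(ip, now)
```

   Addresses for which no reject has been sent are never delayed, so
   the limiter affects only the disputed bindings.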

7.2.2.  Receiving the BNDACK message

   When a server receives a BNDACK message, if it doesn't contain a
   reject-reason option that means that the BNDUPD message was accepted,
   and the server which sent the BNDUPD MUST update its stable storage
   with the potential-expiration-time value sent in the BNDUPD message
   and returned in the BNDACK message.  Other values sent in the BNDUPD
   message MAY be used as desired.
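
   The handling of a received BNDACK can be sketched as follows.  This
   Python fragment is an illustration under stated assumptions:
   messages are dicts with "xid" and "options" keys, pending maps the
   xids of BNDUPDs sent on the current connection to those messages,
   and stable_store is a dict standing in for stable storage.  All of
   these names are invented for the example.

```python
# Hypothetical sketch: process a BNDACK received from the partner.


def handle_bndack(bndack, pending, stable_store):
    """Record the acked potential-expiration-time; return True if accepted.

    pending maps xid -> BNDUPD sent on this connection.  stable_store maps
    IP address -> acked potential-expiration-time.
    """
    bndupd = pending.pop(bndack["xid"], None)
    if bndupd is None:
        return False                    # no matching BNDUPD on this connection
    if "reject-reason" in bndack["options"]:
        return False                    # partner rejected the update
    sent = bndupd["options"]
    if "potential-expiration-time" in sent:
        ip = sent["assigned-IP-address"]
        stable_store[ip] = sent["potential-expiration-time"]
    return True
```

   A real server would commit the new value to disk before treating
   the acknowledgment as complete; the dict here only marks where that
   write would occur.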

7.3.  UPDREQ message

   The update request (UPDREQ) message is used by one server to request
   that its partner send it all of the binding database information that
   it has not already seen.   Since each server is required to keep
   track at all times of the binding information the other server has
   received and ACKed, one server can request transmission of all un-
   ACKed binding database information held by the other server by using
   the UPDREQ message.

   The UPDREQ message is used whenever the sending server cannot proceed
   before it has processed all previously un-ACKed binding update infor-
   mation, since the UPDREQ message should yield a corresponding UPDDONE
   message.  The UPDDONE message is not sent until the server that sent
   the UPDREQ message has responded to all of the BNDUPD messages gen-
   erated by the UPDREQ message with BNDACK messages. Thus, the sender
   of the UPDREQ message can be sure upon receipt of an UPDDONE message
   that it has received and committed to stable storage all outstanding
   binding database updates.

   See section 9, Protocol state transitions, for the details of when
   the UPDREQ message is sent.

7.3.1.  Sending the UPDREQ message

   There are no options for the UPDREQ message.

   The UPDREQ message is sent with a unique xid.

7.3.2.  Receiving the UPDREQ message

   A server receiving an UPDREQ message MUST send all binding database
   changes that have not yet been ACKed by the sending server.   These
   changes are sent as undistinguished BNDUPD messages.

   However, the server which received and is processing the UPDREQ mes-
   sage MUST track the BNDACK messages that correspond to the BNDUPD
   messages triggered by the UPDREQ message and, when they are all
   received, the server MUST send an UPDDONE message.

   The server processing the UPDREQ message and sending BNDUPD messages
   to its partner SHOULD only track the BNDUPD and BNDACK message pairs
   for unACKed binding database changes that were present upon the
   receipt of the UPDREQ message.  A server which has received an UPDREQ
   message SHOULD send BNDUPD messages for binding database changes that
   occur after receipt of the UPDREQ message, but it SHOULD NOT include
   those additional BNDUPD messages and their corresponding BNDACK mes-
   sages in the accounting necessary to consider the UPDREQ complete and
   subsequently send the UPDDONE message.  If some additional binding
   database changes end up becoming part of the set of BNDUPD messages
   considered as part of the UPDREQ (due to whatever algorithm the
   server uses to scan its bindings database for unacked changes) it
   will probably not cause any difficulty, but a server MUST NOT attempt
   to include all such later BNDUPD messages in the accounting for the
   UPDREQ in order to be able to transmit an UPDDONE message.

   When queuing up the BNDUPD messages for transmission to the sender of
   the UPDREQ message, the server processing the UPDREQ message MUST
   honor the value returned in the max-unacked-bndupd option in the
   CONNECT or CONNECTACK message that set up the connection with the
   sending server.  It MUST NOT send more BNDUPD messages without
   receiving corresponding BNDACKs than the value returned in
   max-unacked-bndupd.
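
   The accounting described in this section can be sketched as a small
   tracker.  The Python class below is illustrative only: its name and
   methods are assumptions, BNDUPD messages are represented by their
   xids, and it counts only the BNDUPDs belonging to the UPDREQ against
   the window, whereas a real server would have to count every
   outstanding BNDUPD on the connection against max-unacked-bndupd.

```python
# Hypothetical sketch: track the BNDUPDs triggered by one UPDREQ.

from collections import deque


class UpdreqSession:
    """Accounting for one UPDREQ, from receipt to UPDDONE."""

    def __init__(self, unacked_snapshot, max_unacked_bndupd):
        # Only changes unACKed when the UPDREQ arrived are accounted for.
        self.queue = deque(unacked_snapshot)
        self.window = max_unacked_bndupd
        self.in_flight = set()

    def next_to_send(self):
        """Return the xids that may be sent now without exceeding the window."""
        sendable = []
        while self.queue and len(self.in_flight) < self.window:
            xid = self.queue.popleft()
            self.in_flight.add(xid)
            sendable.append(xid)
        return sendable

    def complete(self):
        """Return True once an UPDDONE may be sent."""
        return not self.queue and not self.in_flight

    def on_bndack(self, xid):
        """Record a BNDACK for xid; return True if the UPDREQ is complete."""
        self.in_flight.discard(xid)
        return self.complete()
```

   BNDUPDs for changes made after the UPDREQ arrived would be sent
   outside this tracker, so they do not delay the UPDDONE.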

7.4.  UPDREQALL message

   The update request all (UPDREQALL) message is used by one server to
   request that its partner send it all of the binding database informa-
   tion.  This message is used to allow one server to recover from a
   failure of stable storage and to restore its binding database in its
   entirety from the other server.

   A server which sends an UPDREQALL message cannot proceed until all of
   its binding update information is restored, and it knows that all of
   that information is restored when an UPDDONE message is received.

   See section 9, Protocol state transitions, for the details of when
   the UPDREQALL message is sent.

7.4.1.  Sending the UPDREQALL message

   There are no options for the UPDREQALL message.

   The UPDREQALL message is sent with a unique xid.

7.4.2.  Receiving the UPDREQALL message

   A server receiving an UPDREQALL message MUST send all binding data-
   base information to the sending server.  These changes are sent as
   undistinguished BNDUPD messages.

   However, the server processing the UPDREQALL message MUST track the
   BNDACK messages that correspond to the BNDUPD messages triggered by
   the UPDREQALL message and, when they are all received, the server
   MUST send an UPDDONE message.

   Just as specified for the processing of the UPDREQ message, the
   server processing the UPDREQALL message and sending BNDUPD messages
   to its partner SHOULD only track the BNDUPD and BNDACK message pairs
   for unACKed binding database changes that were present upon the
   receipt of the UPDREQALL message.  A server which has received an
   UPDREQALL message SHOULD send BNDUPD messages for binding database
   changes that occur after receipt of the UPDREQALL message, but it
   SHOULD NOT include those additional BNDUPD messages and their
   corresponding BNDACK messages in the accounting necessary to consider
   the UPDREQALL complete and subsequently send the UPDDONE message.  If
   some additional binding database changes end up becoming part of the
   set of BNDUPD messages considered as part of the UPDREQALL (due to
   whatever algorithm the server uses to scan its bindings database for
   unacked changes) it will probably not cause any difficulty, but a
   server MUST NOT attempt to include all such later BNDUPD messages in
   the accounting for the UPDREQALL in order to be able to transmit an
   UPDDONE message.

   When queuing up the BNDUPD messages for transmission to the sender of
   the UPDREQALL message, the server processing the UPDREQALL MUST honor
   the value returned in the max-unacked-bndupd option in the CONNECT or
   CONNECTACK message that set up the connection with the sending
   server.  It MUST NOT send more BNDUPD messages without receiving
   corresponding BNDACKs than the value returned in max-unacked-bndupd.

7.5.  UPDDONE message

   The update done (UPDDONE) message is used by a server receiving an
   UPDREQ or UPDREQALL message to signify that it has sent all of the
   BNDUPD messages requested by the UPDREQ or UPDREQALL request and that
   it has received a BNDACK for each of those messages.

7.5.1.  Sending the UPDDONE message

   The UPDDONE message SHOULD be sent as soon as the last BNDACK message
   corresponding to a BNDUPD message requested by the UPDREQ or
   UPDREQALL is received from the server which sent the UPDREQ or
   UPDREQALL.  The XID of the UPDDONE message MUST be the same as the
   XID of the corresponding UPDREQ or UPDREQALL message.
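
   The accounting that leads to an UPDDONE message can be sketched as
   follows.  This is a minimal illustration, assuming that the server
   takes a snapshot of the BNDUPD xids that belong to the request when
   the UPDREQ or UPDREQALL arrives; the class name and its interface are
   hypothetical.

```python
class UpdateRequestTracker:
    """Tracks the BNDUPD/BNDACK pairs that count toward one UPDREQ or
    UPDREQALL (hypothetical helper, not from the draft)."""

    def __init__(self, request_xid, snapshot_bndupd_xids):
        self.request_xid = request_xid
        # Only changes present when the request arrived are accounted.
        self.outstanding = set(snapshot_bndupd_xids)
        self.done_sent = False

    def on_bndack(self, bndupd_xid):
        """Returns the xid to use for UPDDONE once the last tracked
        BNDACK arrives, otherwise None."""
        self.outstanding.discard(bndupd_xid)
        if not self.outstanding and not self.done_sent:
            self.done_sent = True
            return self.request_xid
        return None


tracker = UpdateRequestTracker(request_xid=700, snapshot_bndupd_xids=[11, 12])
results = [
    tracker.on_bndack(99),  # later change, not part of the accounting
    tracker.on_bndack(11),
    tracker.on_bndack(12),  # last tracked BNDACK: UPDDONE is due
    tracker.on_bndack(13),  # UPDDONE is not repeated
]
```

   BNDUPD messages for later changes are still sent and acknowledged,
   but they do not delay the UPDDONE.  A real implementation would also
   have to handle a request whose snapshot is empty.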

7.5.2.  Receiving the UPDDONE message

   A server receiving the UPDDONE message knows that all of the informa-
   tion that it requested by sending an UPDREQ or UPDREQALL message has
   now been sent and that it has recorded this information in its stable
   storage.  It typically uses the receipt of an UPDDONE message to
   move to a different failover state.  See sections 9.5.2 and 9.8.3 for
   details.

7.6.  POOLREQ message

   The pool request (POOLREQ) message is used by the secondary server to
   request an allocation of IP addresses from the primary server.   It
   MUST be sent by a secondary server to a primary server to request IP
   address allocation by the primary.  The IP addresses allocated are
   transmitted using normal BNDUPD messages from the primary to the
   secondary.

   The POOLREQ message SHOULD be sent from the secondary to the primary
   whenever the secondary transitions into NORMAL state.  It SHOULD
   periodically be resent in order that any change in the number of
   available IP addresses on the primary be reflected in the pool on the
   secondary.  The period may be influenced by the secondary server's
   leasing activity.

7.6.1.  Sending the POOLREQ message

   The POOLREQ message has no options.  It must be sent with a unique
   xid.

7.6.2.  Receiving the POOLREQ message

   When a primary server receives a POOLREQ message it SHOULD examine
   the binding database and determine how many IP addresses the secon-
   dary server should have, and set these IP addresses to BACKUP state.
   It SHOULD then send BNDUPD messages concerning all of these IP
   addresses to the secondary server.

   Servers frequently have several kinds of IP addresses available on a
   particular network segment.  The failover protocol assumes that both
   primary and secondary servers are configured in such a way that each
   knows the type and number of IP addresses on every network segment
   participating in the failover protocol.  The primary server is
   responsible for allocating the secondary server the correct propor-
   tion of available IP addresses of each kind, and the secondary server
   is responsible for being configured in such a way that it can tell
   the kind of every IP address based solely on the IP address itself.

   A primary server MUST keep track of how many IP addresses were allo-
   cated as a result of processing the POOLREQ message, and send that
   number in the POOLRESP message.

   A primary server MAY choose to defer processing a POOLREQ message
   until a more convenient time to process it, but it should not depend
   on the secondary server to retransmit the POOLREQ message in that
   case.

   If a secondary server receives a POOLREQ message it SHOULD report an
   error.

7.7.  POOLRESP message

   A primary server sends a POOLRESP message to a secondary server after
   the allocation process for available addresses to the secondary
   server is complete.  Typically this message will precede some of the
   BNDUPD messages that the primary uses to send the actual allocated IP
   addresses to the secondary.

7.7.1.  Sending the POOLRESP message

   The POOLRESP message MUST contain the same xid as the corresponding
   POOLREQ message.

   The only option which MUST appear in a POOLRESP message is:

      o addresses-transferred

        The number of addresses allocated to the secondary server by the
        primary server as a result of a POOLREQ is contained in the
        addresses-transferred option in a POOLRESP message.  Note this
        is the number of addresses that are transferred to the secondary
        in the primary's binding database as a result of the correspond-
        ing POOLREQ message, and that it may be some time before they
        can all be transmitted to the secondary server through the use
        of BNDUPD messages.

7.7.2.  Receiving the POOLRESP message

   When a secondary server receives a POOLRESP message, it SHOULD send
   another POOLREQ message if the value of the addresses-transferred
   option is non-zero.

   Typically, no other action is taken on the reception of a POOLRESP
   message.
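
   The pool exchange can be illustrated with a small simulation.  The
   protocol does not say how the primary decides how many addresses the
   secondary should have; the 50 percent share used below, and all of
   the function names, are assumptions made only for this sketch.

```python
def handle_poolreq(free, backup, secondary_share=0.5):
    """Primary side: move addresses from free to backup until the
    secondary holds its share; return the addresses-transferred count.
    The 50 percent share is an assumed configuration value."""
    target = int((len(free) + len(backup)) * secondary_share)
    transferred = 0
    while len(backup) < target and free:
        backup.add(free.pop())
        transferred += 1
    return transferred


def secondary_should_resend(addresses_transferred):
    """Secondary side: the pool request is repeated when the POOLRESP
    reports a non-zero addresses-transferred value."""
    return addresses_transferred != 0


free = {"10.0.0.%d" % i for i in range(1, 11)}  # ten available addresses
backup = set()
first = handle_poolreq(free, backup)    # pool is rebalanced
second = handle_poolreq(free, backup)   # nothing left to transfer
```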

7.8.  CONNECT message

   The connect message is used to establish an applications level con-
   nection over a newly created TCP connection.  It gives the source
   information for the connection, and some important configuration
   information.  It MUST be sent only by the primary server.

7.8.1.  Sending the CONNECT message

   The CONNECT message MUST be the first message sent by the primary
   server after the establishment of a new TCP connection with a
   secondary server participating in the failover protocol.

   The xid of the CONNECT message must be unique.

   The IP address of the primary server MUST be placed in the
   sending-server-IP-address option.  This information is placed in an
   option inside of the message in order to allow the identity of the
   sender to be covered by a shared secret.

   The role of the sending failover endpoint (i.e., either primary or
   secondary) MUST be placed in the server-role option.

   The current time MUST be placed in the current-time option.

   The number of BNDUPD messages the primary server can accept without
   blocking the TCP connection MUST be placed in the max-unacked-bndupd
   option.  This MUST be a number equal to or greater than 1, SHOULD be
   a number greater than 10, and SHOULD be a number less than 100.

   The length of the receive timer (tReceive, see section 8.3) MUST be
   placed in the receive-timer option.

   The MCLT MUST be placed in the MCLT option.

   The hash-bucket-assignment option MUST be included in the CONNECT
   message.  In the event that load balancing is not configured for this
   server, the hash-bucket-assignment option will indicate that.  The
   value of the hash-bucket-assignment option is determined from the
   specific buckets that the primary server has determined that the
   secondary server MUST service as part of the load-balancing
   algorithm.  The way in which the primary server determines this
   information is outside the scope of this protocol definition.  The
   primary server SHOULD be able to be configured with a percentage of
   clients that the secondary server will be instructed to service, and
   the primary server SHOULD use the algorithm in [LOADB] to generate a
   Hash Bucket Assignment which it sends to the secondary server.
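
   One way to picture the relationship between a configured percentage
   and a hash bucket assignment is sketched below.  This is not the
   algorithm in [LOADB]; it is a simplified illustration that assumes
   256 hash buckets represented as a 32-octet bitmap, assumes a bit
   ordering, and simply assigns the lowest-numbered buckets to the
   secondary.  The function name is hypothetical.

```python
def hash_bucket_assignment(secondary_percent, buckets=256):
    """Return a bytes bitmap with one bit per hash bucket; a bit set
    to 1 means the secondary services that bucket.  Assumes 256
    buckets and assigns the lowest-numbered buckets, which is only
    one possible policy."""
    if not 0 <= secondary_percent <= 100:
        raise ValueError("percentage must be between 0 and 100")
    count = round(buckets * secondary_percent / 100)
    bitmap = bytearray(buckets // 8)
    for bucket in range(count):
        # Most significant bit of the first octet is bucket 0 (assumed).
        bitmap[bucket // 8] |= 0x80 >> (bucket % 8)
    return bytes(bitmap)


hba = hash_bucket_assignment(50)
set_bits = sum(bin(octet).count("1") for octet in hba)
```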

   The vendor class identifier MUST be placed in the vendor-class-
   identifier option.

   The protocol-version option MUST be included in every CONNECT mes-
   sage.  The current value of the protocol version is 1.

   The TLS-request option MUST be sent and contains the desired TLS
   connection request as well as information concerning whether TLS is
   supported.  If this CONNECT message is being sent over an already
   created TLS connection, the TLS-request MUST NOT appear.

7.8.2.  Receiving the CONNECT message

   When a server receives a TCP connection on the failover port, if it
   is a primary server it should send a CONNECT message, and if it is a
   secondary server it should wait for a CONNECT message.

   When a secondary server receives a CONNECT message it should:

      1.  Record the time at which the message was received.

      2.  Examine the protocol-version option, and decide if this server
          is capable of interoperating with another server running that
          protocol version.  If not, then send the CONNECTACK message with
          the appropriate reject-reason.  The server MUST include its
          protocol-version in the CONNECTACK message.

      3.  Examine the TLS-request option.  Figure out the TLS-reply
          value based on the capabilities and configuration of this
          server, and save it for the CONNECTACK message.  If the
          results of the TLS negotiation result in a connection rejec-
          tion, then go immediately to send the CONNECTACK message.

          The possibilities are:

               CONNECT        CONNECTACK
             TLS-request       TLS-reply

                                    Reject
              req acc          t1   Reason   Comments
              --- ---          --   ------   --------
              0   0            0
              0   0            1    11       receiver requires TLS
              0   1            0
              0   1            1
              1   0            -             request doesn't make sense
              1   1            0
              1   1            1
              2   0            -             request doesn't make sense
              2   1            0    9 or 10  receiver won't do TLS
              2   1            1

      4.  Check to see if there is a message-digest option in the CON-
          NECT message.  If there was, and the server does not support
          message-digests, then reject the connection with the appropri-
          ate reject-reason in the CONNECTACK.

      5.  Determine if the sender (from the sending-server-IP-address
          option) and the implicit role of the sender (i.e., primary)
          represents a server with which the receiver was configured to
          engage in failover activity.  This is performed after any TLS
          processing so that it occurs after a secure connection is
          created, to ensure that there is no tampering with the IP
          address of the partner.

          If not, then the receiving server should reject the CONNECT
          request by sending a CONNECTACK message with a reject-reason
          value of: 8, invalid failover partner.

          If it is, then the receiving failover endpoint should be
          determined.

      6.  Decide if the time delta between the sending of the message,
          in the time field, and the receipt of the message, recorded in
          step 1 above, is acceptable.  A server MAY require an
          arbitrarily small delta in time values in order to set up a
          failover connection with another server.  See section 5.9 for
          information on time synchronization.

          If the delta between the time values is too great, the server
          should reject the CONNECT request by sending a CONNECTACK mes-
          sage with a reject-reason of 4, time mismatch too great.

          If the time mismatch is not considered too great then the
          receiving server MUST record the delta between the servers.
          The receiving server MUST use this delta to correct all of the
          absolute times received from the other server in all
          time-valued options.  Note that servers can participate in
          failover with arbitrarily great time mismatches, as long as
          the mismatch is more or less constant.

      7.  If the receiving server is a secondary server, it MUST examine
          the MCLT option in the CONNECT request and use the value of
          the MCLT as the MCLT for this failover endpoint.

          A receiving secondary server SHOULD be able to operate with
          any MCLT sent by the primary,  but if it cannot, then it
          should send a CONNECTACK with a reject-reason of 5, MCLT
          mismatch.

      8.  The server MUST store the hash-bucket-assignment option for
          use during processing in NORMAL state.  If this hash bucket
          assignment conflicts with the secondary server's configured
          hash bucket assignment for use in other than NORMAL state, the
          secondary server should send a CONNECTACK with a reject reason
          of 19, Hash bucket assignment conflict.

      9.  The receiving server MAY use the vendor-class-identifier to do
          vendor specific processing.
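
   The TLS table in step 3 above can be expressed as a lookup keyed on
   the three values in its columns.  The sketch below copies the rows of
   that table; the dictionary and function names are hypothetical, and
   the third column of the table is treated here as the TLS-reply value.

```python
# (req, acc, tls_reply) -> candidate reject-reason codes, copied from
# the table in step 3.  An empty tuple means the combination is accepted.
TLS_OUTCOMES = {
    (0, 0, 0): (),
    (0, 0, 1): (11,),      # receiver requires TLS
    (0, 1, 0): (),
    (0, 1, 1): (),
    (1, 1, 0): (),
    (1, 1, 1): (),
    (2, 1, 0): (9, 10),    # receiver won't do TLS
    (2, 1, 1): (),
}


def tls_reject_reasons(req, acc, tls_reply):
    """Return the reject-reason codes the table lists for this
    combination.  Raises ValueError for requests the table marks as
    not making sense (req 1 or 2 with acc 0) or that it does not list."""
    try:
        return TLS_OUTCOMES[(req, acc, tls_reply)]
    except KeyError:
        raise ValueError("TLS-request/TLS-reply combination not in table")
```

   A caller would send the CONNECTACK with one of the returned codes
   when the tuple is not empty, and continue with step 4 otherwise.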

7.9.  CONNECTACK message

   The CONNECTACK message is sent to accept or reject a CONNECT message.
   It is sent by the secondary server which received a CONNECT message.

   Attempting immediately to reconnect after either receiving a CONNEC-
   TACK with a reject-reason or after sending a CONNECTACK with a
   reject-reason could yield unwanted looping behavior, since the reason
   that the connection was rejected may well not have changed since the
   last attempt.  A simple suggested solution is to wait a minute or two
   after sending or receiving a CONNECTACK message with a reject-reason
   before attempting to reestablish communication.

7.9.1.  Sending the CONNECTACK message

   The xid of the CONNECTACK message MUST be that of the corresponding
   CONNECT message.

   The IP address of the sending server MUST be placed in the
   sending-server-IP-address option.  This information is placed in an
   option inside of the message in order to allow the identity of the
   sender to be covered by a shared secret.

   The role of the sending failover endpoint (i.e., either primary or
   secondary) MUST be placed in the server-role option.

   The current time MUST be placed in the current-time option.

   The protocol-version option MUST be included in every CONNECTACK mes-
   sage.  The current value of the protocol version is 1.

   If the connection has been rejected, the reject-reason option MUST be
   placed in the CONNECTACK message with an appropriate reason, and a
   message option SHOULD be included with a human-readable error message
   describing the reason for the rejection in some detail.  If the
   reject-reason option appears, then the remaining options listed below
   do not appear.  The sending server should close the connection after
   sending the CONNECTACK if the connection was rejected.

   The results of the TLS negotiation MUST be placed in the TLS-reply
   option.  If this CONNECTACK message is being sent over an already TLS
   secured connection, then there MUST NOT be a TLS-reply option.

   If there was a message-digest option in the CONNECT message, then
   there MUST be a message-digest in the CONNECTACK message and any
   subsequent messages if the CONNECTACK does not contain a
   reject-reason.

   The number of BNDUPD messages the server can accept without blocking
   the TCP connection MUST be placed in the max-unacked-bndupd option.
   This SHOULD be a number greater than 10, and SHOULD be a number less
   than 100.

   The length of the receive timer (tReceive, see section 8.3) MUST be
   placed in the receive-timer option.

   If the sending server is a primary server, then the MCLT MUST be
   placed in the MCLT option.

   The vendor class identifier MUST be placed in the vendor-class-
   identifier option.

   If the server is rejecting the CONNECT message, then the
   reject-reason option MUST appear.  A message option SHOULD appear to
   give a human readable version of the rejection reason.

   After a connection is created (either by sending a CONNECTACK message
   to the first CONNECT message, or sending a CONNECTACK message to a
   CONNECT message received over a TLS connection), the server MUST send
   a STATE message.

   After a connection is created, the server MUST start two timers for
   the connection: tSend and tReceive.  The tSend timer SHOULD be
   approximately 33 percent of the time in the receive-timer option in
   the corresponding CONNECT message.  The tReceive timer SHOULD be the
   time sent in the receive-timer option in the CONNECTACK message.

   The tReceive timer is reset whenever a message is received from this
   TCP connection.  If it ever expires, the TCP connection is dropped
   and communications with this partner is considered not ok.

   The tSend timer is reset whenever a message is sent over this
   connection.  When it expires, a CONTACT message MUST be sent.
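
   The behavior of the two timers can be sketched as follows.  This is
   an illustration only: tSend is derived as a fraction of the receive
   timer advertised by the partner, tReceive is the value this server
   advertised, and the fraction is left as a parameter (0.33 is merely
   the default chosen here).  The class and method names are
   hypothetical.

```python
class ConnectionTimers:
    """tSend/tReceive bookkeeping for one connection (hypothetical
    helper).  Times are in seconds on any monotonic clock."""

    def __init__(self, now, own_receive_timer, partner_receive_timer,
                 send_fraction=0.33):
        # tReceive: the value this server advertised.
        self.t_receive = own_receive_timer
        # tSend: a fraction of the value the partner advertised.
        self.t_send = partner_receive_timer * send_fraction
        self.last_sent = now
        self.last_received = now

    def on_send(self, now):
        self.last_sent = now       # any message sent resets tSend

    def on_receive(self, now):
        self.last_received = now   # any message received resets tReceive

    def contact_due(self, now):
        """True when tSend has expired and a CONTACT must be sent."""
        return now - self.last_sent >= self.t_send

    def connection_dead(self, now):
        """True when tReceive has expired and the connection is dropped."""
        return now - self.last_received >= self.t_receive


timers = ConnectionTimers(now=0.0, own_receive_timer=60.0,
                          partner_receive_timer=60.0)
```

   Because tSend is shorter than the partner's tReceive, an otherwise
   idle connection still carries a CONTACT message before the partner's
   receive timer can expire.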

7.9.2.  Receiving the CONNECTACK message

   If a CONNECTACK message is received with a different XID from the one
   in the CONNECT that was sent, it SHOULD be ignored.

   When a CONNECTACK message is received, the following actions should
   be taken:

      1.  Record the time the message was received.

      2.  Check to see if there is a reject-reason option in the CONNEC-
          TACK message.  If not, continue with step 3.  If there is a
          reject-reason option, the server SHOULD report the error code.
          If a message option appears a server SHOULD display the string
          from the message option in a user visible way.  The server
          MUST close the connection if a reject-reason option appears.

      3.  Check to see if the xid on the CONNECTACK matches an
          outstanding CONNECT message on this TCP connection.  If it
          does not, a server SHOULD report an error.

      4.  Check the value of the TLS-reply option, and if it was 1, then
          skip processing of the rest of the CONNECTACK message, and
          immediately enter into TLS connection setup.

          This step occurs prior to steps 5 and 6 in order to allow
          creation of a secure connection (if required) prior to pro-
          cessing the protocol version and IP address information.

      5.  Examine the value of the protocol-version option.  If this
          server is able to establish connections with another server
          running this protocol version, then continue, else close the
          connection.

      6.  Check to see if the sending-server-IP-address and server-role
          in the CONNECTACK message correspond to the failover endpoint
          for which this TCP connection was created.

          If it was not, the server MUST drop the TCP connection and
          SHOULD report an error.

      7.  Decide if the time delta between the sending of the message,
          in the time field, and the receipt of the message, recorded in
          step 1 above, is acceptable.  A server MAY require an
          arbitrarily small delta in time values in order to set up a
          failover connection with another server.

          If the delta between the time values is too great, the server
          should drop the TCP connection.

          If the time mismatch is not considered too great then the
          receiving server MUST record the delta between the servers.
          The receiving server MUST use this delta to correct all of the
          absolute times received from the other server in all time-
          valued options.  Note that the failover protocol is con-
          structed so that two servers can be failover partners with
          arbitrarily great time mismatches.

      8.  If the receiving server is a secondary server, it MUST examine
          the MCLT option in the CONNECT request and use the value of
          the MCLT as the MCLT for this failover endpoint.

          A receiving secondary server SHOULD be able to operate with
          any MCLT sent by the primary, but if it cannot, then it MUST
          drop the TCP connection.

      9.  If the receiving server is a secondary server, it MUST store
          the hash-bucket-assignment option for use during processing
          in NORMAL state.  If this hash bucket assignment conflicts
          with the server's configured hash bucket assignment for use in
          other than NORMAL state, the secondary server should send a
          CONNECTACK with a reject reason of 19, Hash bucket assignment
          conflict.

      10. The receiving server MAY use the vendor-class-identifier to do
          vendor specific processing.

      11. After accepting a CONNECTACK message, the server MUST send a
          STATE message.

          After receiving a CONNECTACK message, the server MUST start
          two timers for the connection: tSend and tReceive.  The tSend
          timer SHOULD be approximately 20 percent of the time in the
          receive-timer option in the corresponding CONNECTACK message.
          The tReceive timer SHOULD be set to the time sent in the
          receive-timer option in the CONNECT message.

          The tReceive timer is reset whenever a message is received
          from this TCP connection.  If it ever expires, the TCP connec-
          tion is dropped and communications with this partner is con-
          sidered not ok.

          The tSend timer is reset whenever a message is sent over this
          connection.  When it expires, a CONTACT message MUST be sent.
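
   The time delta handling in the steps above can be sketched as
   follows.  This is a simplified illustration that ignores network
   transit time; the function names and the max_delta policy parameter
   are assumptions, and times are plain seconds.

```python
def clock_delta(partner_time, receipt_time, max_delta=None):
    """Return the delta between the partner's clock and the local
    clock, or None if it exceeds max_delta (an assumed local policy
    setting, in seconds; None means any mismatch is accepted)."""
    delta = receipt_time - partner_time
    if max_delta is not None and abs(delta) > max_delta:
        return None
    return delta


def to_local_time(partner_absolute_time, delta):
    """Correct an absolute time received from the partner."""
    return partner_absolute_time + delta


delta = clock_delta(partner_time=1000, receipt_time=1120)
lease_expiry_local = to_local_time(5000, delta)
rejected = clock_delta(1000, 1120, max_delta=60)
```

   A result of None corresponds to the case where the server considers
   the mismatch too great and does not keep the connection.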

7.10.  STATE message

   The state (STATE) message is used to communicate the current failover
   state to the partner server.

   The STATE message MUST be sent after sending a CONNECTACK message
   that didn't contain a reject-reason option, and MUST be sent after
   receiving a CONNECTACK message without a reject-reason option.

   A STATE message MUST be sent whenever the failover endpoint changes
   its failover state and a connection exists to the partner.

   The STATE message requires no response from the failover partner.

7.10.1.  Sending the STATE message

   The current failover state is placed in the server-state option and
   the current state of the STARTUP flag is placed in the server-flags
   option.

   The message is sent with a unique xid.

   A server SHOULD only send the STATE message either when the
   connection is created (i.e., after sending or receiving a CONNECTACK
   message with no reject-reason option), or when there is a change from
   the values sent in a previous STATE message.

7.10.2.  Receiving the STATE message

   Every STATE message SHOULD indicate a change in state or a change in
   the flags.

   When a STATE message is received, any state transitions specified in
   section 9 are taken.

   No response to a STATE message is required.

7.11.  CONTACT message

   The contact (CONTACT) message is sent to verify communications
   integrity with a failover partner. The CONTACT message is sent when
   no messages have been sent to the failover partner for a specified
   period of time.  This is determined by the tSend timer expiring (see
   section 8.3).

7.11.1.  Sending the CONTACT message

   The CONTACT message is sent.

7.11.2.  Receiving the CONTACT message

   When a CONTACT message is received, the tReceive timer is reset (as
   it is with any message that is received).

   A server MAY use the time in the time field and the time recorded
   above to refine the delta time calculations between the servers.

7.12.  DISCONNECT message

   The DISCONNECT is the last message sent over a connection before
   dropping an established connection.

   After sending or receiving a DISCONNECT message, a server needs to
   have some mechanism to prevent an error loop. Simply reconnecting to
   the partner immediately is not the best option, especially after
   several consecutive attempts.

   A simple suggested solution is to wait a minute or two after sending
   or receiving a DISCONNECT before attempting to reestablish communica-
   tion.

7.12.1.  Sending the DISCONNECT message

   The DISCONNECT message MUST be the last message sent by a server
   which is dropping a TCP connection.

   The xid of the DISCONNECT message must be unique.

   The reject-reason option MUST appear giving a reason why the
   connection was dropped.  A message option SHOULD appear giving a
   human readable error message with possibly more details.

7.12.2.  Receiving the DISCONNECT message

   When a server receives a DISCONNECT message it should log the message
   if there was one and possibly raise an alarm of some sort if the
   reject reason was one that was sufficiently serious.

8.  Connection Management

   Servers participating in the failover protocol communicate over TCP
   connections.   These TCP connections are used both to transmit bind-
   ing information from one server to another as well as to allow each
   server to determine whether communications is possible with the other
   server.

   Central to the operation of the failover protocol is a notion of
   "communications okay" or "communications failed".  Failover state
   transitions are taken in many cases when the status of communications
   with the partner changes, and the existence or non-existence of a TCP
   connection between failover endpoints is used to determine if
   communications is "okay" or "failed".

   A single TCP connection exists which connects two failover endpoints.

8.1.  Connection granularity

   There exists one TCP connection between each set of failover end-
   points.  See section 5.1.1 for an explanation of failover endpoint.

   There are a maximum of two TCP connections between any two servers
   implementing the failover protocol, one for each of the possible
   failover endpoints between these two servers.  There is a minimum of
   one TCP connection between one server and every other failover server
   with which it implements the failover protocol.

8.2.  Creating the TCP connection

   Every server implementing the failover protocol MUST listen on port
   647 for incoming failover TCP connections.  The source port of the
   TCP connection is unimportant.

   Every server implementing the failover protocol SHOULD attempt to
   connect to all of its partners periodically, where the period is
   implementation dependent and SHOULD be configurable.  In the event
   that a connection has been rejected by a CONNECTACK message with a
   reject-reason option contained in it or a DISCONNECT message, a
   server SHOULD reduce the frequency with which it attempts to connect
   to that server but it SHOULD continue to attempt to connect
   periodically.
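
   One possible way to reduce the connection frequency after a rejection
   is a capped exponential backoff, sketched below.  The protocol does
   not mandate any particular scheme; the function name, the 30 second
   base period and the 600 second ceiling are all assumptions made for
   illustration.

```python
def next_retry_interval(current, rejected, base=30.0, ceiling=600.0):
    """Return the number of seconds to wait before the next connection
    attempt.  A rejection (CONNECTACK with reject-reason, or DISCONNECT)
    doubles the interval up to a ceiling; otherwise the base period is
    used.  All numbers here are assumed, configurable values."""
    if not rejected:
        return base
    return min(max(current, base) * 2, ceiling)


interval = 30.0
history = []
for _ in range(6):
    interval = next_retry_interval(interval, rejected=True)
    history.append(interval)
```

   The server keeps trying indefinitely, only less often, and returns to
   the base period once an attempt is no longer rejected.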

   Once a connection is established, the primary server MUST send a
   CONNECT message across the connection.  A secondary server MUST wait
   for the CONNECT message from a primary server.

   Every CONNECT message includes a TLS-request option, and if the CON-
   NECTACK message does not reject the CONNECT message and the TLS-reply
   option says TLS MUST be used, then the servers will immediately enter
   into TLS negotiation.

   Once that TLS negotiation is complete, the primary server MUST resend
   the CONNECT message on the newly secured TLS connection and then wait
   for the CONNECTACK message in response.  The TLS-request and
   TLS-reply options MUST have the same values in this second CONNECT
   and CONNECTACK message as they had in the first messages.

   The second message sent over a new connection (either a bare TCP
   connection or a connection utilizing TLS) is a STATE message.  Upon
   the receipt of this message, the receiver can consider
   communications up.

   It is entirely possible that two servers will attempt to make
   connections to each other essentially simultaneously, and in this
   case the secondary server will be waiting for a CONNECT message on
   each connection.  The primary server MUST send a CONNECT message over
   one connection and it MUST close the other connection.

   A secondary server MUST NOT respond to the closing of a TCP
   connection with a blind attempt to reconnect -- there may be another
   TCP connection to the same failover partner already in use.

8.3.  Using the TCP connection for determining communications status

   The TCP connection is used to determine the communications status of
   the other server, i.e., communications-ok, or communications-
   interrupted.

   Three things must happen for a server to consider that communications
   are ok with respect to another server:

      1.  A TCP connection must be established to the other server.

      2.  A CONNECT message must be received and a CONNECTACK message
          sent in response.  The CONNECT message is used to determine
          the identity of the failover endpoint at the other end of the
          TCP connection -- without it, the failover endpoint cannot be
          uniquely determined.  Without knowledge of the failover end-
          point, then the entity with which communications is ok is
          undetermined.

      3.  A STATE message must be received from the other server over
          the connection.  This STATE message initializes important
          information necessary to the operation of the state machine
          that governs the behavior of this failover endpoint.

   There are two ways that a server can determine that communications
   has failed:

      1.  The TCP connection can go down, yielding an error when
          attempting to send or receive a message. This will happen at
          least as often as the period of the tSend timer.

      2.  The tReceive timer can expire.

   In either of these cases, communications is considered interrupted.
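
   The conditions above can be sketched as a small status tracker.  This
   is only an illustration under stated assumptions: the class name, the
   method names and the injectable clock are hypothetical and are not
   defined by this protocol; only the three "ok" conditions and the two
   failure cases come from the text above.

```python
import time


class CommStatus:
    """Tracks communications status with the partner (illustrative only)."""

    def __init__(self, t_receive, clock=time.monotonic):
        self.t_receive = t_receive    # tReceive timeout, in seconds
        self.clock = clock            # injectable clock, eases testing
        self.connected = False        # 1. TCP connection established
        self.handshake_done = False   # 2. CONNECT/CONNECTACK exchanged
        self.state_received = False   # 3. STATE message received
        self.last_received = None     # time the last message arrived

    def on_tcp_established(self):
        self.connected = True
        self.last_received = self.clock()

    def on_handshake_complete(self):
        self.handshake_done = True

    def on_message(self, msg_type):
        # Any received message restarts the tReceive interval.
        self.last_received = self.clock()
        if msg_type == "STATE":
            self.state_received = True

    def on_tcp_error(self):
        # Failure case 1: an error while sending or receiving.
        self.connected = False
        self.handshake_done = False
        self.state_received = False

    def is_ok(self):
        if not (self.connected and self.handshake_done
                and self.state_received):
            return False
        # Failure case 2: the tReceive timer has expired.
        return self.clock() - self.last_received < self.t_receive
```

   A caller would poll is_ok() (or arm a real timer) and treat a False
   result as communications-interrupted.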

   Several difficulties arise when trying to use one TCP connection both
   for bulk data transfer and to sense the communications status of the
   other server.  One aspect of the problem stems from the different
   requirements of the two uses.  The bulk data transfer is of
   course critically important to the protocol, but the speed with which
   it is processed is not terribly significant.  It might well be
   minutes before a BNDUPD message is processed, and while not optimal,
   such an occasional delay doesn't compromise the correctness of the
   protocol. However, the speed with which one server detects the other
   server is up (or, more importantly, down) is more highly constrained.
   Generally one server should be able to detect that the other server
   is not communicating within a minute or less.

   These differing time constraints make it difficult to use the same
   TCP connection for data transfer as well as to sense communications
   integrity.   See section 3.5 for additional details on TCP.

   The solution to this problem is to require that some message be
   received by each end of the connection within a limited time;
   otherwise the connection will be considered down.  If no messages
   have been sent recently, then a CONTACT message is sent.

   In the case where there is no data queued to be sent, this is not a
   problem, but in the case where there is data queued to be sent to the
   partner, then the CONTACT message will not actually be transmitted
   until the queued data is sent.  Section 3.5 explains why waiting for
   TCP to determine that the connection is down is not acceptable, and
   leads to a requirement that the receiving server never block the
   sending server from sending CONTACT messages.

   In order to meet this requirement, each server tells the other server
   the number of outstanding BNDUPD messages that it will accept.  The
   receiving server is required to always be able to accept that many
   BNDUPD messages off of the connection's input queue even if it cannot
   process them immediately, and to accept all other messages immedi-
   ately.

   Thus, the sending server's TCP is never blocked from sending a mes-
   sage except for very short periods, less than a few seconds unless
   the network connection itself has problems.  In this case, if the
   CONTACT messages don't make it to the partner then the partner will
   close the connection.

   DISCUSSION:

      When implementing this capability, one needs to be careful when
      sending any message on the TCP connection as TCP can easily block
      the server if the local TCP send buffers are full.  This can't be
      prevented because if the receiver is not reachable (via the net-
      work), the sending TCP can't send and thus it will be unable to
      empty the local TCP send buffers.  So, all send operations either
      need to assume they may block for some time or non-blocking sends
      must be used.
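
      One way to follow the non-blocking option is sketched below.  It
      is a minimal illustration, not a required design: the class name
      NonBlockingSender and its methods are hypothetical.  It relies
      only on the standard socket behavior that a non-blocking send
      raises BlockingIOError when the local send buffer is full.

```python
import socket


class NonBlockingSender:
    """Buffers outbound failover messages without ever blocking."""

    def __init__(self, sock):
        sock.setblocking(False)
        self.sock = sock
        self.pending = bytearray()   # bytes not yet accepted by TCP

    def queue(self, data):
        """Add data to the outbound buffer and try to send it."""
        self.pending += data
        return self.flush()

    def flush(self):
        """Send what the kernel accepts; return bytes still pending."""
        while self.pending:
            try:
                sent = self.sock.send(self.pending)
            except BlockingIOError:
                break   # local TCP send buffer is full; retry later
            del self.pending[:sent]
        return len(self.pending)
```

      The server would call flush() again when the socket becomes
      writable, and can use a non-zero return value as a hint that the
      partner is not draining the connection.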

8.4.  Using the TCP connection for binding data

   Binding data, in the form of BNDUPD messages and BNDACK messages to
   respond to them, are sent across the TCP connection.

   In order to support timely detection of any failure in the partner
   server, the TCP connection MUST NOT block for more than a very short
   time, on the order of a few seconds.  Therefore, a server that is
   sending BNDUPD messages MUST send only a restricted number before
   receiving BNDACK messages about previous messages sent.

   The number of outstanding BNDUPD messages that each server will
   accept without causing TCP to block transmission of additional data
   (i.e., CONTACT messages) is sent by each server in the CONNECT and
   CONNECTACK messages in the max-unacked-bndupd option.
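
   The restriction on outstanding BNDUPD messages behaves like a send
   window.  The following sketch assumes a hypothetical BndupdWindow
   class; only the limit itself (the partner's max-unacked-bndupd
   value) comes from this protocol.

```python
from collections import deque


class BndupdWindow:
    """Limits the number of BNDUPD messages awaiting a BNDACK."""

    def __init__(self, max_unacked_bndupd):
        self.limit = max_unacked_bndupd   # value advertised by partner
        self.unacked = set()              # XIDs sent, not yet acked
        self.backlog = deque()            # updates waiting for space

    def submit(self, xid, update):
        """Queue an update; return the (xid, update) pairs to send now."""
        self.backlog.append((xid, update))
        return self._drain()

    def on_bndack(self, xid):
        """Record a BNDACK; return any updates that may now be sent."""
        self.unacked.discard(xid)
        return self._drain()

    def _drain(self):
        ready = []
        while self.backlog and len(self.unacked) < self.limit:
            xid, update = self.backlog.popleft()
            self.unacked.add(xid)
            ready.append((xid, update))
        return ready
```

   Because the backlog is held by the sender rather than in the TCP
   send buffer, a CONTACT message can always be written promptly.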

8.5.  Using the TCP connection for control messages

   The TCP connection is used for control messages: POOLREQ, UPDREQ,
   STATE, CONTACT, UPDREQALL and the corresponding reply messages:
   POOLRESP, UPDDONE.  A server MUST immediately accept all of these
   messages from the TCP connection.  A server MUST immediately accept
   any BNDACK which is received as well.

8.6.  Losing the TCP connection

   When the TCP connection is lost, then communications is not ok with
   the other server.  A server which has lost communications SHOULD
   immediately attempt to reconnect to the other server, and should
   retry these connection attempts periodically.

   A BNDACK message can only be sent in response to a BNDUPD message
   using the same TCP connection from which the BNDUPD message was
   received, since the XIDs in BNDUPD messages are guaranteed unique
   only during the life of a single TCP connection.  When a connection
   to a partner server goes down, a server with unprocessed BNDUPD mes-
   sages MAY simply drop all of those messages, since it can be sure
   that the partner will retransmit them when they are next in communi-
   cations.  A server with unprocessed BNDUPD messages when a TCP con-
   nection goes down MAY instead choose to process those BNDUPD mes-
   sages, but it MUST NOT send any BNDACK messages in response (again
   because of the issues surrounding XID uniqueness).
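
   The two permitted behaviors can be sketched as follows.  The
   Connection class and its methods are hypothetical names used only
   for illustration; the rule they encode is that a BNDACK is only ever
   produced for a BNDUPD whose originating connection is still up.

```python
class Connection:
    """BNDUPD bookkeeping for a single TCP connection (illustrative)."""

    def __init__(self):
        self.up = True
        self.unprocessed = []   # received BNDUPDs awaiting processing

    def receive_bndupd(self, xid, binding):
        self.unprocessed.append((xid, binding))

    def process(self, apply_binding):
        """Apply queued BNDUPDs; return the XIDs that may be BNDACKed."""
        acks = []
        for xid, binding in self.unprocessed:
            apply_binding(binding)
            if self.up:
                acks.append(xid)   # BNDACK goes out on this connection
        self.unprocessed.clear()
        return acks

    def on_lost(self, keep_unprocessed=False):
        """Mark the connection down; drop or keep the queued BNDUPDs."""
        self.up = False
        if not keep_unprocessed:
            # The partner will retransmit these on the next connection.
            self.unprocessed.clear()
```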

   When the TCP connection is closed explicitly, the DISCONNECT message
   with a reject-reason option (and, ideally, a message option) MUST be
   sent over the TCP connection.

9.  Protocol States

   This section discusses the various states that a failover endpoint
   may take, and the server actions required when entering the state,
   operating in the state, and leaving the state, as well as the events
   that cause transitions out of the state into another state.

   The state transition diagram in Figure 9.2-1 is relevant for this
   section.  This is the common state transition diagram for both
   servers in a failover pair.  In the event that the textual
   description of a state differs from the state transition diagram,
   the textual description is to be considered authoritative.

9.1.  Server Initialization

   When a server starts, it starts out in STARTUP state.  See section
   9.3 below for details.

9.2.  Server State Transitions

   Whenever a server transitions into a new state, it MUST record the
   state and the time at which it entered that state in stable storage.
   If communications is "ok", it MUST also send a STATE message to its
   failover partner.
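
   A minimal sketch of this rule follows.  The function name
   enter_state, the JSON file format and the send_state callback are
   assumptions made for the example; the protocol only requires that
   the state and its start time reach stable storage and that a STATE
   message be sent when communications is "ok".

```python
import json
import os
import time


def enter_state(path, new_state, comm_ok, send_state, now=time.time):
    """Record new_state and its start time, then notify the partner."""
    record = {"state": new_state, "start_time": now()}
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(record, f)
        f.flush()
        os.fsync(f.fileno())   # force the record to stable storage
    os.replace(tmp, path)      # atomically replace the old record
    if comm_ok:
        send_state(record)     # STATE message to the failover partner
    return record
```

   Writing to a temporary file and renaming it keeps the previous
   record intact if the server crashes in the middle of the update.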

   Figure 9.2-1 is the diagram of the server state transitions. The
   remainder of this section contains information important to the
   understanding of that diagram.

   The server stays in the current state until all of the actions speci-
   fied on the state transition are complete.  If communications fails
   during one of the actions, the server simply stays in the current
   state and attempts a transition whenever the conditions for a transi-
   tion are later fulfilled.

   In the state transition diagram below, the "+" or "-" in the upper
   right corner of each state is a notation about whether communication
   is ongoing with the other server.

   The legend "responsive", "balanced", or "unresponsive" in each state
   indicates whether the server is responsive to all DHCP client
   requests, running in load balanced mode, or totally unresponsive in
   the respective state.  The terms "responsive" and "unresponsive" have
   the obvious meanings, while "balanced" means that a DHCP server may
   respond to all DHCPREQUEST messages that are RENEWAL or REBINDING,
   and to all other messages from clients to which the load balancing
   algorithm indicates that it MUST respond.  See sections 5.3 and
   9.6.2 for details on load balancing.

   In the state transition diagram below, when communication is reesta-
   blished between the two servers, each must record the state of the
   partner when communication was restored.  State transitions on one
   server in some cases imply state transitions on the partner server,
   so a record of the current state of the partner server must be kept
   by each server.

   If the state of the partner changes while communicating, a server
   moves through the communications-failed transition and into whatever
   state results.  It then immediately moves through whatever state
   transition is appropriate given the current state of the partner
   server.  A server performing this operation SHOULD NOT close the TCP
   connection to its partner.

   DISCUSSION:

      The point of this technique is simplicity, both in explanation of
      the protocol and in its implementation.  The alternative to this
      technique of memory of partner state and automatic state transi-
      tion on change of partner state is to have every state in the fol-
      lowing diagram have a state transition for every possible state of
      the partner.  With the approach adopted, only the states in which
      communications are reestablished require a state transition for
      each possible partner state.

   The current state of a server MUST be recorded in stable storage and
   thus be available to the server after a server restart.

        +---------------+  V  +--------------+
        |    RECOVER  - |  |  |   STARTUP  - |
        |(unresponsive) |  +->|(unresponsive)|
        +---------------+     +--------------+
           Comm. OK            +-----------------+
          Other State:-RECOVER |  PARTNER DOWN - |<-----------------+
          |      |             | (responsive)    |                  |
         All   POTENTIAL-      +-----------------+ +--------------+ |
       Others  CONFLICT------------ | --------+    |  RESOLUTION  | |
          |                     Comm. OK      |    |  INTERRUPTED | |
         UPDREQ(ALL)          Other State:    |  +-| (responsive) | |
       Wait UPDDONE            |        |     |  | +--------------+ |
     Wait MCLT from fail   RECOVER  All Others| Comm. OK  ^     |   |
      +--------------+         |        V     V  V        |    Ext. |
      |RECOVER-DONE +|      +--+    +--------------+    Comm.  Cmd. |
      |(unresponsive)|      |       |  POTENTIAL + |    Failed  |   |
      +--------------+   Wait for +>|  CONFLICT    |------+     +-->|
         Comm. OK         Other   | |(unresponsive)|<--------+      |
     +--Other State:-+    State:  | +--------------+         |      |
     |   |           |   RECOVER  |         |                |      |
     |   All      POTENT.  DONE   | Resolve Conflict         |      |
     |  Others:  CONFLICT-- | ----+     (see 9.8)            |      |
     | Wait for             V               V                |      |
     | Other State: NORMAL +-----------------+               |      |
     |   V                 |     NORMAL    + | External      |      |
     |   +--+----------+-->|   (balanced)    |-Command---+-- | -----+
     |      ^          ^   +-----------------+           |   |
     |      |          |            |                    |   |
     |  Wait for   Comm. OK       Comm.            External  |
     |   Other      Other        Failed            Command   |
     |   State:     State:          |                or  |   |
     |RECOVER-DONE  NORMAL     Start Safe        Safe    |   |
     |      |     COMM. INT.  Period Timer       Period  |   |
     |   Comm. OK.     |            V            expiration  |
     |  Other State:   |  +------------------+           |   |
     |    RECOVER      +--| COMMUNICATIONS - |-----------+   |
     V      +-------------|   INTERRUPTED    |   Comm. OK    |
    RECOVER               |  (responsive)    |--Other State:-+
    RECOVER-DONE--------->+------------------+   All Others

           Figure 9.2-1:  Server state diagram.

9.3.  STARTUP state

   The STARTUP state affords an opportunity for a server to probe its
   partner server, before starting to service DHCP clients.

   DISCUSSION:

      Without the STARTUP state, a server would likely start in a state
      derived from its previously stored state (held in stable storage),
      if any.  However, this may be inconsistent with the current state
      of the partner.  The STARTUP state affords the opportunity for a
      server to potentially learn the partner's state and determine if
      that state is consistent with its derived starting state or
      whether some significant state change has occurred at the partner
      that forces the server to start in another state.  This is
      especially critical if significant time has elapsed while the
      server was down.

9.3.1.  Operation while in STARTUP state

   Whenever a server is in STARTUP state, it MUST be unresponsive to
   DHCP client requests, and so the time spent in the STARTUP state is
   necessarily short, typically on the order of a few seconds to a few
   tens of seconds.  The exact time spent in the STARTUP state is imple-
   mentation dependent, and the primary and secondary server are not
   required to spend the same amount of time in the STARTUP state.

   Whenever a STATE message is sent to the partner while in STARTUP
   state the STARTUP bit MUST be set in the server-flags option and the
   previously recorded failover state MUST be placed in the server-state
   option.

9.3.2.  Transition out of STARTUP state

   Each server starts out in startup state every time it initializes
   itself, and performs the following algorithm as part of its initiali-
   zation:

      1.  Do not send any messages until step 5.

      2.  Is there any record in stable storage of a previous failover
          state?  If yes, set previous-state to the last recorded state
          in stable storage, and continue with step 3.

          Is there any configuration information that indicates that
          this server was previously running but lost its stable
          storage?  Such information must typically come from some
          administrative intervention, since it is difficult for a
          server to distinguish first startup from a startup after it
          has lost its stable storage.  If yes, then set the previous-
          state to RECOVER, and set the time-of-failure to whatever time
          was configured, and go on to step 3.  This time-of-failure
          will be used in the transition out of the RECOVER state into
          the RECOVER-DONE state, below.

          If there is no record of any previous failover state in stable
          storage nor of any previous operational activity for this
          server, then set the previous-state to PARTNER-DOWN if this
          server is a primary and RECOVER if this server is a secondary,
          and set the time-of-failure to a time before the maximum-
          client-lead-time before now.  If using standard Posix times, 0
          would typically do quite well.

      3.  Is the previous-state NORMAL?  If yes, set the previous-state
          to COMMUNICATIONS-INTERRUPTED.

      4.  Start the STARTUP state timer.  The time that a server remains
          in the STARTUP state (absent any communications with its
          partner) is implementation dependent and SHOULD be configur-
          able.  It SHOULD be long enough for a TCP connection to be
          created to a heavily loaded partner across a slow network.

      5.  Attempt to create a TCP connection to the failover partner.
          See section 8.2.

      6.  Wait for "communications okay", i.e., the process discussed in
          section 8.2 "Creating the TCP Connection", to complete,
          including the receipt of a STATE message from the partner.

          When and if communications become "okay", clear the STARTUP
          flag, and set the current state to the previous-state.

          If the partner is in PARTNER-DOWN state, and if the time at
          which it entered PARTNER-DOWN state (as received in the
          start-time-of-state option in the STATE message) is later than
          the last recorded time of operation of this server, then set
          the current state to RECOVER.  If the time at which it entered
          PARTNER-DOWN state is earlier than the last recorded time of
          operation of this server, then set the current state to
          POTENTIAL-CONFLICT.

          Then, transition to the current state and take the "communica-
          tions okay" state transition based on the current state of
          this server and the partner.

      7.  If the startup time expires, take an implementation dependent
          action:  The server MAY go to the previous-state, or the
          server MAY wait.

          Reasons to go to previous-state and begin processing:

          If the current server is the only operational server, then if
          it waits, there will be no operational DHCP servers.  This
          situation could occur very easily where one server fails and
          then the other crashes and reboots.  If the rebooting server
          doesn't start processing DHCP client requests without first
          being in communication with the other server, then the level
          of DHCP redundancy is not particularly high.  This is an
          appropriate approach if the possibility of partition is low,
          or if the safe period expiration time is well beyond the time
          at which an operator would notice and react to a partition
          situation.  It is also quite appropriate if the safe period
          will never expire.

          Reasons to wait:

          If the current server has been down for longer than the
          maximum-client-lead-time, and it is partitioned from the other
          server, then when it returns it will attempt to use its own
          available addresses to allocate to new DHCP clients, and the
          other server may well be in PARTNER-DOWN state and may have
          already allocated some of those available addresses to DHCP
          clients.  In cases where the possibility of partition is high,
          and the safe period expiration time is less than the likely
          operator reaction time, this is a good approach to use.
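
   The state selection performed by the algorithm above can be sketched
   as two small functions.  The function and parameter names are
   hypothetical; state names are written as strings for brevity.  The
   case where the two compared times are exactly equal is not covered
   by the text, and this sketch simply keeps the previous-state there.

```python
def derive_previous_state(stored_state, lost_storage_time, is_primary):
    """Return (previous_state, time_of_failure) at server startup.

    stored_state: last state found in stable storage, or None.
    lost_storage_time: configured time-of-failure when an administrator
        has indicated that stable storage was lost, else None.
    """
    time_of_failure = None
    if stored_state is not None:
        previous = stored_state
    elif lost_storage_time is not None:
        previous, time_of_failure = "RECOVER", lost_storage_time
    else:
        # First startup: Posix time 0 is comfortably earlier than
        # "now minus MCLT" for any realistic MCLT.
        previous = "PARTNER-DOWN" if is_primary else "RECOVER"
        time_of_failure = 0
    if previous == "NORMAL":
        previous = "COMMUNICATIONS-INTERRUPTED"
    return previous, time_of_failure


def state_after_contact(previous, partner_state, partner_state_start,
                        last_operation_time):
    """Choose the current state once communications become okay."""
    if partner_state == "PARTNER-DOWN":
        if partner_state_start > last_operation_time:
            return "RECOVER"
        if partner_state_start < last_operation_time:
            return "POTENTIAL-CONFLICT"
    return previous
```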

9.4.  PARTNER-DOWN state

   PARTNER-DOWN state is a state either server can enter.  When in this
   state, the server does not assume that the other server could still
   be operating and servicing a different set of clients, but instead
   assumes that it is the only server operating.  If one server is in
   PARTNER-DOWN state, the other server MUST NOT be operating.

9.4.1.  Upon entry to PARTNER-DOWN state

   No special actions are required when entering PARTNER-DOWN state.

   The server should continue to attempt to connect to the partner
   periodically.

9.4.2.  Operation while in PARTNER-DOWN state

   A server in PARTNER-DOWN state MUST respond to DHCP client requests.
   It will allow renewal of all outstanding leases on IP addresses, and
   will allocate IP addresses from its own pool, and after a fixed
   period of time (the MCLT interval) has elapsed from entry into
   PARTNER-DOWN state, it will allocate IP addresses from the set of all
   available IP addresses.

   Once a server has entered NORMAL state, the PARTNER-DOWN state is
   entered only on command of an external agency (typically an adminis-
   trator of some sort) or after the expiration of an externally config-
   ured minimum safe-time after the beginning of COMMUNICATIONS-
   INTERRUPTED state.

   Any available IP address tagged as belonging to the other server (at
   entry to PARTNER-DOWN state) MUST NOT be used until the maximum-
   client-lead-time beyond the entry into PARTNER-DOWN state has
   elapsed.

   A server in PARTNER-DOWN state MUST NOT allocate an IP address to a
   DHCP client different from that to which it was allocated at the
   entrance to PARTNER-DOWN state until the maximum-client-lead-time
   beyond the maximum of the following times: client expiration time,
   most recently transmitted potential-expiration-time, most recently
   received ack of potential-expiration-time from the partner, and most
   recently acked potential-expiration-time to the partner.  See section
   7.1.4 for details.  If this time would be earlier than the current
   time plus the maximum-client-lead-time, then the time the server
   entered PARTNER-DOWN state plus the maximum-client-lead-time is used.
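
   As a worked illustration, the computation might look like the sketch
   below.  The function name and parameters are hypothetical, all times
   are in seconds, and the final rule is interpreted here as a floor of
   "entry into PARTNER-DOWN plus MCLT", which is an assumption about
   the intent of the last sentence above.

```python
def earliest_reallocation_time(client_expiration,
                               sent_potential_expiration,
                               acked_by_partner,
                               acked_to_partner,
                               partner_down_entry,
                               mclt):
    """Earliest time an address may be given to a different client."""
    latest = max(client_expiration, sent_potential_expiration,
                 acked_by_partner, acked_to_partner)
    # Never earlier than MCLT after the entry into PARTNER-DOWN state.
    return max(latest, partner_down_entry) + mclt
```

   For example, with an MCLT of 3600 seconds, recorded times of 100,
   150, 120 and 130, and entry into PARTNER-DOWN state at time 140, the
   address may not go to a different client before time 3750.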

   Two options exist for lease times given out while in PARTNER-DOWN
   state, with different ramifications flowing from each.

   If the server wishes the Failover protocol to protect it from loss of
   stable storage in PARTNER-DOWN state, then it should ensure that the
   MCLT based lease time restrictions in Section 5.1 are maintained,
   even in PARTNER-DOWN state.

   If the server wishes to forego the protection of the Failover proto-
   col in the event of loss of stable storage, then it need recognize no
   restrictions on actual client lease times while in PARTNER-DOWN
   state.

   A server in PARTNER-DOWN state MUST continue to attempt to establish
   communications and synchronization with its partner.

9.4.3.  Transitions out of PARTNER-DOWN state

   When a server in PARTNER-DOWN state succeeds in establishing a con-
   nection to its partner, its actions are conditional on the state and
   flags received in the STATE message from the other server as part of
   the process of establishing the connection.

   If the STARTUP bit is set in the server-flags option of a received
   STATE message, a server in PARTNER-DOWN state MUST NOT take any state
   transitions based on reestablishing communications. Essentially, if a
   server is in PARTNER-DOWN state, it ignores all STATE messages from
   its partner that have the STARTUP bit set in the server-flags option
   of the STATE message.

   If the STARTUP bit is not set in the server-flags option of a STATE
   message received from its partner, then a server in PARTNER-DOWN
   state takes the following actions based on the value of the
   server-state option in the received STATE message:

      o partner in NORMAL, COMMUNICATIONS-INTERRUPTED, PARTNER-DOWN or
        POTENTIAL-CONFLICT state

        transition to POTENTIAL-CONFLICT state

      o partner in RECOVER state

        stay in PARTNER-DOWN state

      o partner in RECOVER-DONE state

        transition into NORMAL state
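
   The rules above amount to a small lookup, sketched here with a
   hypothetical function name and with state names written as strings.

```python
def partner_down_next_state(partner_state, startup_flag):
    """Next state for a server in PARTNER-DOWN on receipt of STATE."""
    if startup_flag:
        # STATE messages with the STARTUP bit set are ignored.
        return "PARTNER-DOWN"
    if partner_state in ("NORMAL", "COMMUNICATIONS-INTERRUPTED",
                         "PARTNER-DOWN", "POTENTIAL-CONFLICT"):
        return "POTENTIAL-CONFLICT"
    if partner_state == "RECOVER":
        return "PARTNER-DOWN"
    if partner_state == "RECOVER-DONE":
        return "NORMAL"
    raise ValueError("no transition listed for " + partner_state)
```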

9.5.  RECOVER state

   This state indicates that the server has no information in its stable
   storage or that it is re-integrating with a server in PARTNER-DOWN
   state after it has been down.  A server in this state will attempt to
   refresh its stable storage from the other server.

9.5.1.  Operation in RECOVER state

   A server in RECOVER MUST NOT respond to DHCP client requests.

   A server in RECOVER state will attempt to reestablish communications
   with the other server.

9.5.2.  Transitions out of RECOVER state

   If the other server is in POTENTIAL-CONFLICT state when communica-
   tions are reestablished, then the server in RECOVER state will move
   to POTENTIAL-CONFLICT state itself.

   If the other server is in RECOVER state, then this server SHOULD
   signal an error and halt processing.

   If the other server is in any other state, then the server in RECOVER
   state will request an update of missing binding information by send-
   ing an UPDREQ message.  If the server has been instructed (through
   configuration or other external agency) that it has lost its stable
   storage, it MUST send an UPDREQALL message, otherwise it MUST send an
   UPDREQ message.

   It will wait for an UPDDONE message, and upon receipt of that message
   it will start a timer whose expiration is set to a time equal to the
   time the server went down (if known) or the current time (if the
   down-time is unknown) plus the maximum-client-lead-time.  When this
   timer goes off, the server will transition into RECOVER-DONE state.
   This is to allow any clients holding IP addresses that were allocated
   by this server prior to loss of its client binding information in
   stable storage to contact the other server or to time out.

   See Figure 9.5.2-1.

   DISCUSSION:

      The actual requirement on this wait period in RECOVER is that it
      start when the recovering server went down, not necessarily when
      it came back up.  If the time when the recovering server failed is
      known, then it could be communicated to the recovering server
      (perhaps through actions of the network administrator), and the
      wait period could be reduced to the maximum-client-lead-time less
      the difference between the current time and the time the server
      failed.  In this way, the waiting period could be minimized.

   If an UPDDONE message isn't received within an implementation depen-
   dent amount of time, and no BNDUPD messages are being received, then
   the UPDREQ(ALL) message will be re-transmitted.

                A                                        B
              Server                                  Server

                |                                        |
             RECOVER                               PARTNER-DOWN
                |                                        |
                | >--UPDREQ-------------------->         |
                |                                        |
                |        <---------------------BNDUPD--< |
                | >--BNDACK-------------------->         |
               ...                                      ...
                |                                        |
                |        <---------------------BNDUPD--< |
                | >--BNDACK-------------------->         |
                |                                        |
                |        <--------------------UPDDONE--< |
                |                                        |
       Wait MCLT from last known                         |
          time of operation                              |
                |                                        |
           RECOVER-DONE                                  |
                |                                        |
                | >--STATE-(RECOVER-DONE)------>         |
                |                                     NORMAL
                |        <-------------(NORMAL)-STATE--< |
             NORMAL                                      |
                |                                        |
                |                                        |

              Figure 9.5.2-1:  Transition out of RECOVER state
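
   The decisions described in section 9.5.2 can be sketched as follows.
   The function names and the tuple-style return values are
   hypothetical conveniences for the example; times are in seconds.

```python
def recover_action(partner_state, lost_stable_storage):
    """Action for a server in RECOVER once communications are back."""
    if partner_state == "POTENTIAL-CONFLICT":
        return ("transition", "POTENTIAL-CONFLICT")
    if partner_state == "RECOVER":
        return ("halt", None)   # signal an error and halt processing
    request = "UPDREQALL" if lost_stable_storage else "UPDREQ"
    return ("send", request)


def recover_wait_remaining(now, mclt, down_time=None):
    """Seconds to wait after UPDDONE before entering RECOVER-DONE.

    The timer expires at down_time + MCLT when the time of failure is
    known, otherwise at now + MCLT.
    """
    start = down_time if down_time is not None else now
    return max(0, start + mclt - now)
```

   With an MCLT of 3600 seconds, a server that failed 600 seconds ago
   waits 3000 seconds; a server whose down-time is unknown waits the
   full 3600 seconds.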

9.6.  NORMAL state

   NORMAL state is the state used by a server when it is communicating
   with the other server, and any required resynchronization has been
   performed.  While some binding database synchronization is performed
   in NORMAL state, potential conflicts are resolved prior to entry into
   NORMAL state, as is binding database data loss.

9.6.1.  Upon Entry to NORMAL state

   When entering NORMAL state, a server will send to the other server
   all currently unacknowledged binding updates as BNDUPD messages.

   When the above process is complete, if the server entering NORMAL
   state is a secondary server, then it will request IP addresses for
   allocation using the POOLREQ message.

9.6.2.  Processing DHCP client requests and load balancing

   When in NORMAL state, each server MUST process all requests from some
   DHCP clients, and MUST NOT process any request other than a
   DHCPREQUEST/RENEWAL or a DHCPREQUEST/REBINDING request from some
   other DHCP clients.

   If the load balancing algorithm specified in [LOADB] is used with a
   pair of servers implementing the failover protocol, then each server
   needs to test each incoming DHCP client request to see if it should
   process that request.

   As discussed in section 5.3, each server will take the client-
   identifier from each DHCP client request (or the client-hardware-
   address, i.e., the htype concatenated to the front of the chaddr if
   no client-identifier is present in the request) and use it as the
   'Request ID' specified in [LOADB].  After applying the algorithm
   specified in [LOADB] and comparing the result with the hash bucket
   assignment (performed during connect processing between failover
   servers), each failover server will be able to unambiguously deter-
   mine if it should process the DHCP client request.

   In NORMAL state, a server MUST process every DHCPREQUEST/RENEWAL or
   DHCPREQUEST/REBINDING request it receives.
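
   A minimal sketch of the request test follows.  The function names
   and the message_kind labels are hypothetical.  The in_my_buckets
   callable is a stand-in for the [LOADB] algorithm combined with the
   hash bucket assignment; it is deliberately not implemented here.

```python
def request_id(client_identifier, htype, chaddr):
    """Build the 'Request ID' for load balancing from a client request."""
    if client_identifier:
        return bytes(client_identifier)
    # No client-identifier: htype concatenated to the front of chaddr.
    return bytes([htype]) + bytes(chaddr)


def should_process(message_kind, rid, in_my_buckets):
    """Decide whether this server processes a request in NORMAL state."""
    if message_kind in ("RENEWAL", "REBINDING"):
        return True            # always processed in NORMAL state
    return in_my_buckets(rid)  # placeholder for the [LOADB] test
```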

9.6.3.  Operation in NORMAL state

   When in NORMAL state, for every DHCP client request that it
   processes, as determined by the algorithm described in section 9.6.2,
   above, a server will operate in the following manner:

      o Lease time calculations

        As discussed in section 5.2.1, "Control of lease time", the
        lease interval given to a DHCP client can never be more than the
        MCLT greater than the most recently received potential-
        expiration-time from the failover partner or the current time,
        whichever is later.

        As long as a server adheres to this constraint, the specifics of
        the lease interval that it gives to a DHCP client or the value
        of the potential-expiration-time sent to its failover partner
        are implementation dependent.  One possible approach is
        discussed in section 5.2.1, but that particular approach is in
        no way required by this protocol.

        See section 7.1.4 for details concerning the storage of times
        associated with IP addresses and how to use these times when
        calculating lease times for DHCP clients.

      o Lazy update of partner server

        After an ACK of an IP address binding, the server servicing a
        DHCP client request attempts to update its partner with the new
        binding information.  The lease time used in the update of the
        secondary MUST be at least that given to the DHCP client in the
        DHCPACK, and the potential-expiration-time MUST be at least the
        lease time, and SHOULD be longer.

      o Reallocation of IP addresses between clients

        Whenever a client binding is released or expires, a BNDUPD
        message must be sent to the partner, setting the binding state
        to RELEASED or EXPIRED.  However, until a BNDACK is received for
        this message, the IP address cannot be allocated to another
        client.  It can be allocated to the same client again.
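
   The lease time constraint in the first item above reduces to a small
   calculation.  The following C sketch is one way to express it and is
   not required by this protocol.  The function names are hypothetical,
   all times are assumed to be absolute times in seconds, and
   partner_pet stands for the most recently received
   potential-expiration-time from the failover partner.

```c
#include <assert.h>
#include <time.h>

/* Latest expiration time a server may give to a DHCP client:
 * MCLT beyond the later of the partner's potential-expiration-time
 * and the current time. */
time_t failover_max_lease_expiry(time_t now, time_t partner_pet,
                                 time_t mclt)
{
    assert(mclt >= 0);
    time_t base = (partner_pet > now) ? partner_pet : now;
    return base + mclt;
}

/* Clamp the expiration time the server would like to give
 * (desired_expiry) to the limit computed above. */
time_t failover_client_lease_expiry(time_t now, time_t desired_expiry,
                                    time_t partner_pet, time_t mclt)
{
    time_t limit = failover_max_lease_expiry(now, partner_pet, mclt);
    return (desired_expiry < limit) ? desired_expiry : limit;
}
```

   For a new binding the partner has not yet acknowledged any
   potential-expiration-time, so the first lease is limited to the
   current time plus the MCLT.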

   In NORMAL state, each server receives binding updates from its
   partner server in BNDUPD messages.  It records these in its client
   binding database in stable storage and then sends a corresponding
   BNDACK message to the partner server.  It MUST ensure that the
   information is recorded in stable storage prior to sending the BNDACK
   message back to the partner server.

9.6.4.  Transitions out of NORMAL state

   If an external command is received by a server in NORMAL state
   informing it that its partner is down, then transition into PARTNER-
   DOWN state.

   If a server in NORMAL state fails to receive acks to messages sent to
   its partner for an implementation dependent period of time, it MAY
   move into COMMUNICATIONS-INTERRUPTED state.  This situation might
   occur if the partner server was capable of maintaining the TCP
   connection between the servers and also capable of sending a CONTACT
   message every tSend seconds, but was (for some reason) incapable of
   processing BNDUPD messages.

   If the communications is determined to not be "ok" (as defined in
   section 8), then transition into COMMUNICATIONS-INTERRUPTED state.

   If a server in NORMAL state receives any messages from its partner
   where the partner has changed state from that expected by the server
   in NORMAL state, then the server should transition into
   COMMUNICATIONS-INTERRUPTED state and take the appropriate state tran-
   sition from there.  For example, it would be expected for the partner
   to transition from POTENTIAL-CONFLICT into NORMAL state, but not for
   the partner to transition from NORMAL into POTENTIAL-CONFLICT state.

9.7.  COMMUNICATIONS-INTERRUPTED State

   A server goes into COMMUNICATIONS-INTERRUPTED state whenever it is
   unable to communicate with the other server.  Primary and secondary
   servers cycle automatically (without administrative intervention)
   between NORMAL and COMMUNICATIONS-INTERRUPTED state as the network
   connection between them fails and recovers, or as the partner server
   cycles between operational and non-operational.  No duplicate IP
   address allocation can occur while the servers cycle between these
   states.

9.7.1.  Upon Entry to COMMUNICATIONS-INTERRUPTED state

   When a server enters COMMUNICATIONS-INTERRUPTED state, if it has been
   configured to support an automatic transition out of COMMUNICATIONS-
   INTERRUPTED state and into PARTNER-DOWN state (i.e., a "safe period"
   has been configured, see section 10), then a timer MUST be started
   for the length of the configured safe period.

   A server transitioning into the COMMUNICATIONS-INTERRUPTED state from
   the NORMAL state SHOULD raise some alarm condition to alert adminis-
   trative staff to a potential problem in the DHCP subsystem.

9.7.2.  Operation in COMMUNICATIONS-INTERRUPTED State

   In this state a server MUST respond to all DHCP client requests, and
   the algorithm for load balancing described in section 5.3 MUST NOT be
   used.  When allocating new IP addresses, each server allocates from
   its own IP address pool, where the primary MUST allocate only FREE IP
   addresses, and the secondary MUST allocate only BACKUP IP addresses.
   When responding to renewal requests, each server will allow continued
   renewal of a DHCP client's current lease on an IP address
   irrespective of whether that lease was given out by the receiving
   server or not, although the renewal period MUST NOT exceed the
   maximum client lead time (MCLT) beyond the potential-expiration-time
   already acknowledged by the other server or the lease-expiration-time
   or potential-expiration-time received from the partner server.

   However, since the server cannot communicate with its partner in this
   state, the acknowledged-potential-expiration time will not be updated
   in any new bindings.  This is likely to eventually cause the actual-
   client-lease-times to be the current time plus the maximum-client-
   lead-time (unless this is greater than the desired-client-lease-
   time).

9.7.3.  Transition out of COMMUNICATIONS-INTERRUPTED State

   If the safe period timer expires while a server is in the
   COMMUNICATIONS-INTERRUPTED state, it will transition immediately into
   PARTNER-DOWN state.

   If an external command is received by a server in COMMUNICATIONS-
   INTERRUPTED state informing it that its partner is down, it will
   transition immediately into PARTNER-DOWN state.

   If communications is restored with the other server, then the server
   in COMMUNICATIONS-INTERRUPTED state will transition into another
   state based on the state of the partner:

      o partner in NORMAL or COMMUNICATIONS-INTERRUPTED

        The partner really SHOULD NOT be in NORMAL state here, since
        upon restoration of communications it MUST have created a new
        TCP connection which would have forced it into COMMUNICATIONS-
        INTERRUPTED state.  Still, we should account for every state
        just in case.

        Transition into the NORMAL state.

      o partner in RECOVER

        Stay in COMMUNICATIONS-INTERRUPTED state.

      o partner in RECOVER-DONE

        Transition into NORMAL state.

      o partner in PARTNER-DOWN or POTENTIAL-CONFLICT

        Transition into POTENTIAL-CONFLICT state.

      o partner in PAUSED

        Stay in COMMUNICATIONS-INTERRUPTED state.

      o partner in SHUTDOWN

        Transition into PARTNER-DOWN state.
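
   The partner-state cases listed above map naturally onto a lookup
   function.  The C sketch below is illustrative only: the fo_state
   enumeration and the function name are hypothetical, and the
   enumerator values are not the wire values of the server-state
   option.

```c
#include <assert.h>

/* Hypothetical state names for this sketch (not wire values). */
typedef enum {
    FO_NORMAL,
    FO_COMMUNICATIONS_INTERRUPTED,
    FO_PARTNER_DOWN,
    FO_POTENTIAL_CONFLICT,
    FO_RECOVER,
    FO_RECOVER_DONE,
    FO_PAUSED,
    FO_SHUTDOWN
} fo_state;

/* Next state for a server in COMMUNICATIONS-INTERRUPTED state once
 * communications is restored and the partner's state is known. */
fo_state comm_int_next_state(fo_state partner)
{
    assert(partner <= FO_SHUTDOWN);
    switch (partner) {
    case FO_NORMAL:
    case FO_COMMUNICATIONS_INTERRUPTED:
    case FO_RECOVER_DONE:
        return FO_NORMAL;
    case FO_RECOVER:
    case FO_PAUSED:
        return FO_COMMUNICATIONS_INTERRUPTED;   /* stay put */
    case FO_PARTNER_DOWN:
    case FO_POTENTIAL_CONFLICT:
        return FO_POTENTIAL_CONFLICT;
    case FO_SHUTDOWN:
        return FO_PARTNER_DOWN;
    }
    /* ASSUMPTION: an unknown partner state leaves this server where
     * it is; the list above does not cover this case. */
    return FO_COMMUNICATIONS_INTERRUPTED;
}
```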

   The following figure illustrates the transition from NORMAL to
   COMMUNICATIONS-INTERRUPTED state and then back to NORMAL state again.

             Primary                                Secondary
              Server                                  Server

              NORMAL                                  NORMAL
                | >--CONTACT------------------->         |
                |        <--------------------CONTACT--< |
                |         [TCP connection broken]        |
           COMMUNICATIONS          :              COMMUNICATIONS
             INTERRUPTED           :                INTERRUPTED
                |      [attempt new TCP connection]      |
                |         [connection succeeds]          |
                |                                        |
                | >--CONNECT------------------->         |
                |        <-----------------CONNECTACK--< |
                |        <-------------------STATE-----< |
                |                                     NORMAL
                | >--STATE--------------------->         |
              NORMAL                                     |
                | >--BNDUPD-------------------->         |
                |        <---------------------BNDACK--< |
                |                                        |
                |        <---------------------BNDUPD--< |
                | >------BNDACK---------------->         |
               ...                                      ...
                |                                        |
                |        <--------------------POOLREQ--< |
                | >--POOLRESP-(2)-------------->         |
                |                                        |
                | >--BNDUPD-(#1)--------------->         |
                |        <---------------------BNDACK--< |
                |                                        |
                |        <--------------------POOLREQ--< |
                | >--POOLRESP-(0)-------------->         |
                |                                        |
                | >--BNDUPD-(#2)--------------->         |
                |        <---------------------BNDACK--< |
                |                                        |

       Figure 9.7.3-1:  Transition from NORMAL to COMMUNICATIONS-
                        INTERRUPTED and back (example with 2
                        addresses allocated to secondary)

9.8.  POTENTIAL-CONFLICT state

   This state indicates that the two servers are attempting to
   re-integrate with each other, but at least one of them was running
   in a state that did not guarantee automatic reintegration would be
   possible.  In POTENTIAL-CONFLICT state the servers may determine that
   the same IP address has been offered and accepted by two different
   DHCP clients.

   It is a goal of this protocol to minimize the possibility that
   POTENTIAL-CONFLICT state is ever entered.

9.8.1.  Upon Entry to POTENTIAL-CONFLICT state

   When a primary server enters POTENTIAL-CONFLICT state it should
   request that the secondary send it all updates of which it is
   currently unaware by sending an UPDREQ message to the secondary
   server.

   A secondary server entering POTENTIAL-CONFLICT state will wait for
   the primary to send it an UPDREQ message.

9.8.2.  Operation in POTENTIAL-CONFLICT state

   Any server in POTENTIAL-CONFLICT state MUST NOT process any incoming
   DHCP requests.

9.8.3.  Transitions out of POTENTIAL-CONFLICT state

   If communications fails with the partner while in POTENTIAL-CONFLICT
   state, then a primary server will transition to PARTNER-DOWN state
   and a secondary server will stay in POTENTIAL-CONFLICT state.

   Whenever either server receives an UPDDONE message from its partner
   while in POTENTIAL-CONFLICT state, it MUST transition to NORMAL
   state.  This will cause the primary server to leave POTENTIAL-
   CONFLICT state prior to the secondary, since the primary sends an
   UPDREQ message and receives an UPDDONE before the secondary sends an
   UPDREQ message and receives its UPDDONE message.

   When a secondary server receives an indication that the primary
   server has transitioned from POTENTIAL-CONFLICT to NORMAL state, it
   SHOULD send an UPDREQ message to the primary server.

              Primary                                Secondary
              Server                                  Server

                |                                        |
         POTENTIAL-CONFLICT                    POTENTIAL-CONFLICT
                |                                        |
                | >--UPDREQ-------------------->         |
                |                                        |
                |        <---------------------BNDUPD--< |
                | >--BNDACK-------------------->         |
               ...                                      ...
                |                                        |
                |        <---------------------BNDUPD--< |
                | >--BNDACK-------------------->         |
                |                                        |
                |        <--------------------UPDDONE--< |
              NORMAL                                     |
                | >--STATE--(NORMAL)----------->         |
                |        <---------------------UPDREQ--< |
                |                                        |
                | >--BNDUPD-------------------->         |
                |        <---------------------BNDACK--< |
               ...                                      ...
                | >--BNDUPD-------------------->         |
                |        <---------------------BNDACK--< |
                |                                        |
                | >--UPDDONE------------------->         |
                |                                     NORMAL
                |                                        |
                |        <--------------------POOLREQ--< |
                | >------POOLRESP-(n)---------->         |
                |              addresses                 |

           Figure 9.8.3-1:  Transition out of POTENTIAL-CONFLICT

9.9.  RESOLUTION-INTERRUPTED state

   This state indicates that the two servers were attempting to
   re-integrate with each other in POTENTIAL-CONFLICT state, but
   communications failed prior to completion of re-integration.

   If the servers remained in POTENTIAL-CONFLICT while communications
   was interrupted, neither server would be responsive to DHCP client
   requests, and if one server had crashed, then there might be no
   server able to process DHCP requests.

9.9.1.  Upon Entry to RESOLUTION-INTERRUPTED state

   When a server enters RESOLUTION-INTERRUPTED state it SHOULD raise an
   alarm condition to alert administrative staff of a problem in the
   DHCP subsystem.

9.9.2.  Operation in RESOLUTION-INTERRUPTED state

   In this state a server MUST respond to all DHCP client requests, and
   any load balancing (described in section 5.3) MUST NOT be used.  When
   allocating new IP addresses, each server SHOULD allocate from its own
   IP address pool (if that can be determined), where the primary MUST
   allocate only FREE IP addresses, and the secondary MUST allocate only
   BACKUP IP addresses.  When responding to renewal requests, each
   server will allow continued renewal of a DHCP client's current lease
   on an IP address irrespective of whether that lease was given out by
   the receiving server or not, although the renewal period MUST NOT
   exceed the maximum client lead time (MCLT) beyond the potential-
   expiration-time already acknowledged by the other server or the
   lease-expiration-time or potential-expiration-time received from the
   partner server.

   However, since the server cannot communicate with its partner in this
   state, the acknowledged-potential-expiration time will not be updated
   in any new bindings.

9.9.3.  Transitions out of RESOLUTION-INTERRUPTED state

   If an external command is received by a server in RESOLUTION-
   INTERRUPTED state informing it that its partner is down, it will
   transition immediately into PARTNER-DOWN state.

   If communications is restored with the other server, then the server
   in RESOLUTION-INTERRUPTED state will transition into POTENTIAL-
   CONFLICT state.

9.10.  RECOVER-DONE state

   This state exists to allow an interlocked transition for one server
   from RECOVER state and another server from PARTNER-DOWN or
   COMMUNICATIONS-INTERRUPTED state into NORMAL state.

9.10.1.  Operation in RECOVER-DONE state

   A server in RECOVER-DONE state MUST respond only to
   DHCPREQUEST/RENEWAL and DHCPREQUEST/REBINDING DHCP messages.

9.10.2.  Transitions out of RECOVER-DONE state

   When a server in RECOVER-DONE state determines that its partner
   server has entered NORMAL state, then it will transition into NORMAL
   state as well.

9.11.  PAUSED state

   This state exists to allow one server to inform another that it will
   be out of service for what is predicted to be a relatively short
   time, and to allow the other server to transition to COMMUNICATIONS-
   INTERRUPTED state immediately and to begin servicing all DHCP clients
   with no interruption in service to new DHCP clients.

   A server which is aware that it is shutting down temporarily SHOULD
   send a STATE message with the server-state option containing PAUSED
   state and close the TCP connection.

   While a server may or may not transition internally into PAUSED
   state, the 'previous' state determined when it is restarted MUST be
   the state the server was in prior to receiving the command to shut-
   down and restart and which precedes its entry into the PAUSED state.
   See section 9.3.2 concerning the use of the previous state upon
   server restart.

9.11.1.  Upon entry to PAUSED state

   When entering PAUSED state, the server MUST store the previous state
   in stable storage, and use that state as the previous state when it
   is restarted.

9.11.2.  Transitions out of PAUSED state

   A server transitions out of PAUSED state by being restarted.  At that
   time, the previous state MUST be the state the server was in prior to
   entering the PAUSED state.

9.12.  SHUTDOWN state

   This state exists to allow one server to inform another that it will
   be out of service for what is predicted to be a relatively long time,
   and to allow the other server to transition immediately to PARTNER-
   DOWN state, and take over completely for the server going down.

   A server which is aware that it is shutting down SHOULD send a STATE
   message with the server-state field containing SHUTDOWN.

   While a server may or may not transition internally into SHUTDOWN
   state, the 'previous' state determined when it is restarted MUST be
   the state active prior to the command to shutdown.  See section 9.3.2
   concerning the use of the previous state upon server restart.

9.12.1.  Upon entry to SHUTDOWN state

   When entering SHUTDOWN state, the server MUST record the previous
   state in stable storage for use when the server is restarted.  It
   also MUST record the current time as the last time operational.

   A server which is aware that it is shutting down SHOULD send a STATE
   message with the server-state field containing SHUTDOWN.

9.12.2.  Operation in SHUTDOWN state

   A server in SHUTDOWN state MUST NOT respond to any DHCP client input.

   If a server receives any message indicating that the partner has
   moved to PARTNER-DOWN state while it is in SHUTDOWN state then it
   MUST record RECOVER state as the previous state to be used when it is
   restarted.

   A server SHOULD wait for a few seconds after informing the partner of
   entry into SHUTDOWN state (if communications are okay) to determine
   if the partner will enter PARTNER-DOWN state.

9.12.3.  Transitions out of SHUTDOWN state

   A server transitions out of SHUTDOWN state by being restarted.
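
   The bookkeeping described for SHUTDOWN state can be sketched in C as
   follows.  This is a minimal illustration under stated assumptions:
   the fo_stable structure, the fo_state enumeration and both function
   names are hypothetical, and the structure merely stands in for
   whatever stable storage an implementation uses (the write to disk
   is not shown).

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical state names for this sketch (not wire values). */
typedef enum {
    FO_NORMAL,
    FO_COMMUNICATIONS_INTERRUPTED,
    FO_PARTNER_DOWN,
    FO_RECOVER,
    FO_SHUTDOWN
} fo_state;

/* Stand-in for the record an implementation keeps in stable storage. */
struct fo_stable {
    fo_state previous_state;     /* state to resume from on restart */
    time_t   last_operational;   /* last time the server was operational */
};

/* Upon entry to SHUTDOWN: remember the state that was active before
 * the shutdown command, and the current time. */
void shutdown_enter(struct fo_stable *st, fo_state current, time_t now)
{
    assert(st != NULL);
    st->previous_state = current;
    st->last_operational = now;
}

/* While in SHUTDOWN: if the partner reports PARTNER-DOWN, the state
 * used on restart becomes RECOVER. */
void shutdown_partner_state(struct fo_stable *st, fo_state partner)
{
    assert(st != NULL);
    if (partner == FO_PARTNER_DOWN)
        st->previous_state = FO_RECOVER;
}
```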

10.  Safe Period

   Due to the restrictions imposed on each server while in
   COMMUNICATIONS-INTERRUPTED state, long-term operation in this state
   is not feasible for either server.  One reason that these states
   exist at all, is to allow the servers to easily survive transient
   network communications failures of a few minutes to a few days
   (although the actual time periods will depend a great deal on the
   DHCP activity of the network in terms of arrival and departure of
   DHCP clients on the network).

   Eventually, when the servers are unable to communicate, they will
   have to move into a state where they no longer can re-integrate
   without some possibility of a duplicate IP address allocation.  There
   are two ways that they can move into this state (known as
   PARTNER-DOWN).

   They can either be informed by external command that, indeed, the
   partner server is down.  In this case, there is no difficulty in mov-
   ing into the PARTNER-DOWN state since it is an accurate reflection of
   reality and the protocol has been designed to operate correctly (even
   during reintegration) if, when in PARTNER-DOWN state the partner is,
   indeed, down.

   The more difficult scenario is when the servers are running unat-
   tended for extended periods, and in this case an option is provided
   to configure something called a "safe-period" into each server.  This
   OPTIONAL safe-period is the period after which either the primary or
   secondary server will automatically transition to PARTNER-DOWN from
   COMMUNICATIONS-INTERRUPTED state.  If this transition is completed
   and the partner is not down, then the possibility of duplicate IP
   address allocations will exist.

   The goal of the "safe-period" is to allow network operations staff
   some time to react to a server moving into COMMUNICATIONS-INTERRUPTED
   state.  During the safe-period the only requirement is that the
   network operations staff determine if both servers are still running
   and, if they are, either fix the network communications failure
   between them or take one of the servers down before the expiration
   of the safe-period.

   The length of the safe-period is installation dependent, and depends
   in large part on the number of unallocated IP addresses within the
   subnet address pool and the expected frequency of arrival of previ-
   ously unknown DHCP clients requiring IP addresses.  Many environments
   should be able to support safe-periods of several days.

   During this safe period, either server will allow renewals from any
   existing client.  The only limitation concerns the need for IP
   addresses for the DHCP server to hand out to new DHCP clients and the
   need to re-allocate IP addresses to different DHCP clients.

   The number of "extra" IP addresses required is equal to the expected
   total number of new DHCP clients encountered during the safe period.
   This is dependent only on the arrival rate of new DHCP clients, not
   the total number of outstanding leases on IP addresses.
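
   As a worked illustration of this sizing rule, the following C sketch
   computes the number of extra addresses needed for a given safe
   period, and the longest safe period a given number of spare
   addresses can support.  The function names and the choice of hours
   as the time unit are assumptions made for the example, and a
   constant arrival rate is assumed.

```c
#include <assert.h>

/* Extra addresses needed = arrival rate of new clients * safe period. */
double safe_period_extra_addresses(double new_clients_per_hour,
                                   double safe_period_hours)
{
    assert(new_clients_per_hour >= 0.0 && safe_period_hours >= 0.0);
    return new_clients_per_hour * safe_period_hours;
}

/* Longest safe period (in hours) that the spare addresses can cover. */
double safe_period_max_hours(double spare_addresses,
                             double new_clients_per_hour)
{
    assert(spare_addresses >= 0.0 && new_clients_per_hour > 0.0);
    return spare_addresses / new_clients_per_hour;
}
```

   For example, at two previously unknown clients per hour, a safe
   period of 48 hours calls for 96 extra addresses.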

   In the unlikely event that a relatively short safe period of an hour
   is all that can be used (given a dearth of IP addresses or a very
   high arrival rate of new DHCP clients), even that can provide sub-
   stantial benefits in allowing the DHCP subsystem to ride through
   minor problems that could occur and be fixed within that hour.  In
   these cases, no possibility of duplicate IP address allocation
   exists, and re-integration after the failure is solved will be
   automatic and require no operator intervention.

11.  Security

   The Failover protocol communicates DHCP lease activity and this data
   is generally easily discovered via other means, such as by pinging
   addresses and doing DNS lookups. Therefore, the need to encrypt the
   data over the wire is likely not great (though some sites may feel
   differently).

   However, it is very desirable to assure the integrity of failover
   partners and to thus ensure proper operation of the servers. For
   example, denial of service attacks are possible by the communication
   of invalid state information to one or both servers.

   Therefore, the Failover protocol MUST be capable of being secured by
   using a simple shared secret message digest which covers each
   message.  This provides authentication of the servers, but does not
   provide encryption of the data exchange.

   The Failover protocol MAY also be secured by using TLS [TLS]
   (Transport Layer Security) if encryption of the data exchange is
   desired.  The use of the shared secret message digest or TLS will
   not protect against TCP or IP layer attacks (such as someone sending
   fake TCP RST segments).  IPsec SHOULD be used to protect against
   most (if not all) of these kinds of attacks.

11.1.  Simple shared secret

   Messages between the failover partners are a number authenticated through the
   use of configuration parameters that a shared secret, which is never sent over the network and must
   already
   be the same on known by each server. How each server in is told about this shared
   secret and secures its storage of the shared secret is outside the
   scope of this document.  If a pair, server is configured with a shared
   secret for a partner, it MUST send the message-digest option in ALL
   messages to that partner and it MUST treat any messages received from
   that partner without a message-digest option as failing authentica-
   tion.

   If a server is not unreasonable configured with a shared secret for a partner, it
   MUST NOT send the message-digest option in any message to require that
   partner and it MUST treat any messages received from that partner
   with a message-digest option as failing authentication.

   The shared secret is used to calculate a 16 octet message-digest
   which is sent in every failover message as the message-digest option.
   See section 6.2.25.  The message-digest contains a one-way 16 octet
   MD5 [MD5] hash calculated over a stream of octets consisting of the
   entire message concatenated with the shared secret.

   For calculation, the message includes the message-digest option with
   the message-digest data zeroed (16 octets of zero).  Once the
   calculation is complete, these 16 octets of zero are replaced by the
   16 octet MD5 hash and the message is sent.

   For verification, the 16 octet message-digest is saved and replaced
   with 16 octets of zero, and the hash is calculated as described
   above.  The resulting MD5 hash is compared to the received hash and,
   if they match, the message is assumed authenticated.

   A failover partner that fails to authenticate a received message or
   receives a message without a message-digest option when configured
   with a shared secret MUST close the connection immediately and take
   steps to notify operators.

   This use of the shared secret is very similar to that used for RADIUS
   Accounting [RADIUS].
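
   The zero-fill, hash, and compare steps for the message-digest option
   might be sketched as follows.  This is an illustration only, not part
   of the protocol definition: the names failover_sign, failover_verify,
   digest_fn and the digest_off parameter (the offset of the 16 data
   octets of the message-digest option within the message) are
   assumptions made for this example.  The hash itself sits behind a
   caller-supplied function so that an MD5 routine can be plugged in;
   toy_digest is a stand-in that is NOT MD5 and is not secure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DIGEST_LEN 16  /* size of the message-digest data, in octets */

/* Hypothetical hook: hash `msg` followed by `secret` into `out`.
   A real server would supply an MD5 routine here. */
typedef void (*digest_fn)(const unsigned char *msg, size_t msg_len,
                          const unsigned char *secret, size_t secret_len,
                          unsigned char out[DIGEST_LEN]);

/* Stand-in digest, NOT MD5 and NOT secure; it only lets the sketch run. */
static void toy_digest(const unsigned char *msg, size_t msg_len,
                       const unsigned char *secret, size_t secret_len,
                       unsigned char out[DIGEST_LEN])
{
    size_t i;

    memset(out, 0, DIGEST_LEN);
    for (i = 0; i < msg_len; i++)
        out[i % DIGEST_LEN] =
            (unsigned char)(out[i % DIGEST_LEN] * 31 + msg[i]);
    for (i = 0; i < secret_len; i++)
        out[(msg_len + i) % DIGEST_LEN] =
            (unsigned char)(out[(msg_len + i) % DIGEST_LEN] * 31 + secret[i]);
}

/* Sender: zero the digest data, hash message + secret, store the hash. */
void failover_sign(unsigned char *msg, size_t msg_len, size_t digest_off,
                   const unsigned char *secret, size_t secret_len,
                   digest_fn md)
{
    unsigned char hash[DIGEST_LEN];

    assert(digest_off + DIGEST_LEN <= msg_len);
    memset(msg + digest_off, 0, DIGEST_LEN);
    md(msg, msg_len, secret, secret_len, hash);
    memcpy(msg + digest_off, hash, DIGEST_LEN);
}

/* Receiver: save the received digest, zero the field, recompute, and
   compare.  Returns 1 if the message authenticates, 0 otherwise.  The
   received digest is put back so the message is left unchanged. */
int failover_verify(unsigned char *msg, size_t msg_len, size_t digest_off,
                    const unsigned char *secret, size_t secret_len,
                    digest_fn md)
{
    unsigned char received[DIGEST_LEN];
    unsigned char computed[DIGEST_LEN];

    assert(digest_off + DIGEST_LEN <= msg_len);
    memcpy(received, msg + digest_off, DIGEST_LEN);
    memset(msg + digest_off, 0, DIGEST_LEN);
    md(msg, msg_len, secret, secret_len, computed);
    memcpy(msg + digest_off, received, DIGEST_LEN);
    return memcmp(received, computed, DIGEST_LEN) == 0;
}
```

   A receiver for which failover_verify returns 0 would then close the
   connection and notify operators, as required above.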

11.2.  TLS

   TLS, Transport Layer Security, as specified in [TLS] MAY be used.
   The use of TLS would be similar to the way it is used with SMTP
   [SMTPTLS] and IMAP/POP3/ACAP [IMAPTLS].

   To request the use of TLS, the server that successfully opened a
   connection to its peer MUST send the TLS option as part of the
   CONNECT message.  The server receiving the TLS option MUST respond
   with a TLS-reply option indicating its acceptance or rejection of the
   TLS-request in the CONNECT message.

   If the CONNECTACK message contained a TLS-reply of 1, then both
   servers begin TLS negotiation.

   Upon completion of this negotiation, the server which originally sent
   the CONNECT message MUST resend its CONNECT message without any
   TLS-request, and must wait for a corresponding CONNECTACK.

   Implementation of the TLS_DHE_DSS_WITH_3DES_EDE_CBC_SHA [TLS] cipher
   suite is REQUIRED in Failover servers supporting TLS. This is
   important as it assures that any two compliant implementations can be
   configured to interoperate.
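
   The message sequence described in this section can be pictured, from
   the point of view of the server that opened the connection, as a
   small state machine.  The sketch below is only an illustration: the
   state and event names and the function tls_setup_next are invented
   for this example, and the ST_NO_TLS outcome for a rejected
   TLS-request is an assumption, since the text above does not say what
   follows a rejection.

```c
#include <assert.h>

/* Hypothetical setup states for the server that sent the CONNECT. */
enum tls_setup_state {
    ST_SENT_CONNECT_WITH_TLS_REQUEST, /* CONNECT with TLS-request sent  */
    ST_TLS_NEGOTIATING,               /* TLS-reply of 1 was received    */
    ST_SENT_CONNECT_PLAIN,            /* CONNECT resent, no TLS-request */
    ST_CONNECTED,                     /* corresponding CONNECTACK seen  */
    ST_NO_TLS                         /* assumption: request rejected   */
};

/* Hypothetical events seen by that server. */
enum tls_setup_event {
    EV_CONNECTACK_TLS_ACCEPTED, /* CONNECTACK carrying TLS-reply of 1   */
    EV_CONNECTACK_TLS_REJECTED, /* CONNECTACK carrying another reply    */
    EV_TLS_NEGOTIATION_DONE,    /* TLS negotiation completed            */
    EV_CONNECTACK               /* CONNECTACK for the resent CONNECT    */
};

/* Return the next state; events that do not fit the current state
   leave the state unchanged. */
enum tls_setup_state tls_setup_next(enum tls_setup_state s,
                                    enum tls_setup_event e)
{
    switch (s) {
    case ST_SENT_CONNECT_WITH_TLS_REQUEST:
        if (e == EV_CONNECTACK_TLS_ACCEPTED)
            return ST_TLS_NEGOTIATING;
        if (e == EV_CONNECTACK_TLS_REJECTED)
            return ST_NO_TLS;
        break;
    case ST_TLS_NEGOTIATING:
        if (e == EV_TLS_NEGOTIATION_DONE)
            return ST_SENT_CONNECT_PLAIN; /* resend CONNECT here */
        break;
    case ST_SENT_CONNECT_PLAIN:
        if (e == EV_CONNECTACK)
            return ST_CONNECTED;
        break;
    default:
        break;
    }
    return s;
}
```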

12.  Hash algorithm for load balancing

The following hash function is an implementation of the algorithm known
as "Pearson's hash".  The Pearson's hash algorithm was originally pub-
lished in the Communications of the ACM, Vol. 33, No. 6 (June 1990), pp.
677-680.  The author, Peter K. Pearson, has kindly granted his permis-
sion to use this algorithm, free of any encumbrances.

To make primary-backup load balancing possible, both servers MUST use
the same hash function.

    /* A "mixing table" of 256 distinct values, in pseudo-random order. */

    unsigned char failover_hash_mx_tbl[256] =
    {
    251, 175, 119, 215,  81,  14,  79, 191, 103,  49,
    181, 143, 186, 157,   0, 232,  31,  32,  55,  60,
    152,  58,  17, 237, 174,  70, 160, 144, 220,  90,
    57,  223,  59,   3,  18, 140, 111, 166, 203, 196,
    134, 243, 124,  95, 222, 179, 197,  65, 180,  48,
     36,  15, 107,  46, 233, 130, 165,  30, 123, 161,
    209,  23,  97,  16,  40,  91, 219,  61, 100,  10,
    210, 109, 250, 127,  22, 138,  29, 108, 244,  67,
    207,   9, 178, 204,  74,  98, 126, 249, 167, 116,
    34,   77, 193, 200, 121,   5,  20, 113,  71,  35,
    128,  13, 182,  94,  25, 226, 227, 199,  75,  27,
     41, 245, 230, 224,  43, 225, 177,  26, 155, 150,
    212, 142, 218, 115, 241,  73,  88, 105,  39, 114,
     62, 255, 192, 201, 145, 214, 168, 158, 221, 148,
    154, 122,  12,  84,  82, 163,  44, 139, 228, 236,
    205, 242, 217,  11, 187, 146, 159,  64,  86, 239,
    195,  42, 106, 198, 118, 112, 184, 172,  87,   2,
    173, 117, 176, 229, 247, 253, 137, 185,  99, 164,
    102, 147,  45,  66, 231,  52, 141, 211, 194, 206,
    246, 238,  56, 110,  78, 248,  63, 240, 189,  93,
     92,  51,  53, 183,  19, 171,  72,  50,  33, 104,
    101,  69,   8, 252,  83, 120,  76, 135,  85,  54,
    202, 125, 188, 213,  96, 235, 136, 208, 162, 129,
    190, 132, 156,  38,  47,   1,   7, 254,  24,   4,
    216, 131,  89,  21,  28, 133,  37, 153, 149,  80,
    170,  68,   6, 169, 234, 151
    };

    unsigned char failover_p_hash(
            const unsigned char *key, /* The key to be hashed
                                         (e.g., MAC address) */
            int len                   /* Length of key in bytes */
            )
    {
        /* Seed the hash with the key length, truncated to one octet. */
        unsigned char hash = (unsigned char)len;
        int i;

        /* Walk the key from its last octet to its first, mixing each
           octet into the running hash through the mixing table. */
        for (i = len; i > 0; )
        {
            hash = failover_hash_mx_tbl[hash ^ key[--i]];
        }
        return hash;
    }
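
The text above does not say how the one-octet hash value is turned into
a decision about which server answers a client.  One possible
arrangement, shown here purely as an assumption, is a 256-bit map with
one bit per hash value: each server holds its own map and serves a
client only if the bit for that client's hash is set.  The names
HBA_LEN, hba_set and hba_serves, and the bit ordering within each octet,
are invented for this sketch.  Because both servers compute the same
hash for the same key, complementary maps give every client exactly one
serving server.

```c
#include <assert.h>
#include <string.h>

#define HBA_LEN 32  /* 256 hash values, one bit each */

/* Mark hash value `bucket` as served by the owner of `hba`. */
void hba_set(unsigned char hba[HBA_LEN], unsigned char bucket)
{
    hba[bucket / 8] |= (unsigned char)(1u << (bucket % 8));
}

/* Return 1 if the owner of `hba` serves hash value `bucket`, else 0. */
int hba_serves(const unsigned char hba[HBA_LEN], unsigned char bucket)
{
    return (hba[bucket / 8] >> (bucket % 8)) & 1;
}
```

A server would pass the result of failover_p_hash() on the client's key
as the bucket argument.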

13.  Acknowledgments

   Ralph Droms started it all, by sketching out an initial interserver
   draft that embodied ideas from several past IETF meetings.  In that
   draft, he acknowledged contributions by Jeff Mogul, Greg Minshall,
   Rob Stevens, Walt Wimer, Ted Lemon, and the DHC working group.

   Kim Kinnear and Bob Cole each extended that draft, separately and
   then together, until they created an interserver draft that supported
   any number of servers.  The complexity of that approach was just too
   great, and that draft wasn't greeted with enthusiasm by many, includ-
   ing its authors.

   It did however lead to a much simpler approach embodied in the first
   Failover draft by Greg Rabil, Mike Dooley, Arun Kapur and Ralph
   Droms.  This draft posited only two servers -- a primary and a secon-
   dary.

   Kim Kinnear then wrote the Safe Failover draft to layer on top of the
   Failover Draft and increase its robustness in the face of certain
   rare network failures.

   At the spring 1998 IETF meeting in LA, the DHC working group said
   that they wanted a merged Failover and Safe Failover draft.  Steve
   Gonczi and Bernie Volz stepped up and produced the raw material for
   such a merged draft, along with a new message format designed around
   DHCP options and other extensions and clarifications.  Kim Kinnear
   edited their work into draft format and made other changes in time
   for the Summer Chicago IETF meeting.

   During the summer and fall of 1998, two groups worked on separate
   implementations of the UDP failover draft.  Bernie Volz and Steve
   Gonczi constituted one group, and Kim Kinnear, Mark Stapp and Paul
   Fox made up the other.  These two groups worked together to produce
   considerable changes and simplifications of the protocol during that
   period, and Steve Gonczi and Kim Kinnear edited those changes into
   the -03 draft in time for submission to the December 1998 Orlando
   IETF meeting.

   In February of 1999 Kim Kinnear and Mark Stapp hosted a meeting of
   people interested in the failover draft.  During that meeting a gen-
   eral agreement was reached to recast the failover protocol to use TCP
   instead of UDP.  In addition, the group together brainstormed a work-
   able load-balancing technique.  Kim Kinnear rewrote the entire draft
   to include the changes made at that meeting as well as to restructure
   the draft along guidelines suggested by Thomas Narten.  The result
   was the -04 draft, submitted prior to the Oslo IETF meeting.

   The initial idea for a hash-based load balancing approach was offered
   by Ted Lemon, and the determination of an algorithm and its integra-
   tion into the draft was done by Steve Gonczi.  The security section
   was spearheaded by Bernie Volz.  Both contributed considerably to the
   ideas and text in the rest of the draft with several reviews.

   In early October of 1999, three conference calls were held to discuss
   the -04 draft.  The current draft (-05) includes changes as a result
   of those calls, perhaps the largest of which was to move the load-
   balancing approach into a separate draft.  Thanks to all of the many
   people who participated in the conference calls.  This current draft
   was changed because of contributions by: Ted Lemon, David Erdmann,
   Richard Jones, Rob Stevens, Thomas Narten, Diana Lane, and Andre Kos-
   tur.

   These most recent changes have been widely circulated among the other
   authors, but that does not preclude any of them from expressing
   disagreement with what is contained in this draft at any future time.

   Many people have reviewed the various earlier drafts that went into
   this result.  At American Internet, ideas were contributed by Brad
   Parker.  At Cisco Systems, Paul Fox and Ellen Garvey contributed to
   the design of the protocol.

   Glenn Waters of Nortel Networks contributed ideas and enthusiasm to
   make a Failover protocol that was both "safe" and "lazy".

   Many thanks to Peter K. Pearson, the author of Pearson's hash, who
   has kindly granted his permission to use this algorithm for DHCP load
   balancing, free of any encumbrances.

14.  References

   [RFC 2131] Droms, R., "Dynamic Host Configuration Protocol", RFC
      2131, March 1997.

   [RFC 2119] Bradner, S., "Key words for use in RFCs to Indicate
      Requirement Levels", RFC 2119, March 1997.

   [RFC 2132] Alexander, S., Droms, R., "DHCP Options and BOOTP Vendor
      Extensions", RFC 2132, March 1997.

   [TLS] Dierks, T., "The TLS Protocol, Version 1.0", RFC 2246, January
      1999.

   [SMTPTLS] Hoffman, P., "SMTP Service Extension for Secure SMTP over
      TLS", RFC 2487, January 1999.

   [IMAPTLS] Newman, C., "Using TLS with IMAP, POP3, and ACAP", RFC
      2595, June 1999.

   [NAMESPACE] Carney, M., "draft-ietf-dhc-option_review_and_namespace-
      00.txt", June 1999.

   [DDNS] Rekhter, Y., Stapp, M., "draft-ietf-dhc-dhcp-dns-11.txt",
      October, 1999.

   [MD5] Rivest, R., and Dusse, S., "The MD5 Message-Digest Algorithm",
      RFC 1321, MIT Laboratory for Computer Science, RSA Data Security
      Inc., April 1992.

   [RADIUS] Rigney, C., "RADIUS Accounting", RFC 2139, Livingston Enter-
      prises, April 1997.

   [LOADB] Volz, B., Gonczi, S., Lemon, T., Stevens, R., "draft-ietf-
      dhc-loadb-00.txt", October, 1999.

   [RFC1035] Mockapetris, P., "Domain Names - Implementation and Specif-
      ication", RFC 1035, November, 1987.

   [AGENTINFO] Patrick, M., "draft-ietf-dhc-agent-options-07.txt",
      August, 1999.

   [USERCLASS] Stump, G., Droms, R., "draft-ietf-dhc-userclass-04.txt",
      October, 1999.

   [RFC2136] Vixie, P., Thomson, S., Rekhter, Y., Bound, J., "Dynamic
      Updates in the Domain Name System (DNS UPDATE)", RFC 2136, April
      1997.

15.  Author's information

      Ralph Droms
      323 Dana Engineering
      Bucknell University
      Lewisburg, PA  17837

      Phone: (717) 524-1145
      EMail: droms@bucknell.edu

      Greg Rabil, Mike Dooley, Arun Kapur
      Lucent Technologies (Quadritek)
      10 Valley Stream Parkway, Suite 240
      Malvern, PA 19355

      Phone: (800) 208-2747

      EMail: grabil@lucent.com
             mdooley@lucent.com
             akapur@lucent.com

      Kim Kinnear
      Mark Stapp
      Cisco Systems
      250 Apollo Drive
      Chelmsford, MA  01824

      Phone: (978) 244-8000

      EMail: kkinnear@cisco.com
             mjs@cisco.com

      Bernie Volz
      Steve Gonczi
      Process Software Corporation
      959 Concord St.
      Framingham, MA  01701

      Phone: (508) 879-6994

      EMail: volz@process.com
             gonczi@process.com

16.  Full Copyright Statement

Copyright (C) The Internet Society (1999). All Rights Reserved.

This document and translations of it may be copied and furnished to oth-
ers, and derivative works that comment on or otherwise explain it or
assist in its implementation may be prepared, copied, published and dis-
tributed, in whole or in part, without restriction of any kind, provided
that the above copyright notice and this paragraph are included on all
such copies and derivative works.  However, this document itself may not
be modified in any way, such as by removing the copyright notice or
references to the Internet Society or other Internet organizations,
except as needed for the purpose of developing Internet standards in
which case the procedures for copyrights defined in the Internet Stan-
dards process must be followed, or as required to translate it into
languages other than English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS
IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK
FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT
LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT
INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FIT-
NESS FOR A PARTICULAR PURPOSE.

Open Issues

   These issues need to be resolved:

      1.  Need to figure out how to get 16 bit options without referenc-
          ing the [NAMESPACE] draft, since it doesn't really define them
          anymore.

      2.  We need to deal with the option space, and the procedures for
          managing it.  Probably IANA.

      3.  Figure out a better way to identify vendors.  How about an
          SNMP Enterprise MIB value?

      4.  Need more clarity in the conflict resolution section, probably
          backed up by real implementation experience.  Learned a lot
          from the UDP implementation and experience with it in the real
          world, and need equivalent learning from a TCP implementation
          with no messages out of order or lost.

      5.  Need to tie reject-reasons to text of draft, remove obsolete
          reject-reasons.

      6.  Using tables, compress description of sending BNDUPD message
          to save duplicated words, enhance description of differences.