Network Working Group                                        Ralph Droms
INTERNET DRAFT                                       Bucknell University

                                                              Greg Rabil
                                                             Mike Dooley
                                                              Arun Kapur
                                                       Quadritek Systems

                                                             Kim Kinnear
                                                              Mark Stapp
                                                           Cisco Systems

                                                             Bernie Volz
                                                            Steve Gonczi
                                                        Process Software

                                                               June 1999
                                                   Expires December 1999

                         DHCP Failover Protocol
                    <draft-ietf-dhc-failover-04.txt>

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Copyright Notice

   Copyright (C) The Internet Society (1999). All Rights Reserved.

Abstract

   DHCP [RFC 2131] allows for multiple servers to be operating on a
   single network. Some sites are interested in running multiple servers
   in such a way so as to provide redundancy in case of server failure.
   In order for this to work reliably, the cooperating primary and
   secondary servers must maintain a consistent database of the lease
   information.  This implies that servers will need to coordinate any
   and all lease activity so that this information is synchronized in
   case of failover.

   This document defines a protocol to provide this synchronization
   between two servers.  One server is designated the "primary" server,
   the other is the "secondary" server.  Additionally, this document
   describes a protocol which allows each server to determine to which
   DHCP clients it should provide service when both servers are
   operating (in order to support load balancing) as well as when one
   server has failed (in order to support increased DHCP service
   availability).

   This document is a complete rewrite of
   draft-ietf-dhc-failover-03.txt.  That earlier draft described a UDP
   based failover protocol, and this draft describes a closely related
   protocol which uses TCP as a transport and includes new
   load-balancing and security capabilities.

Table of Contents

    1.  Introduction................................................. 4
    2.  Terminology.................................................. 5
    2.1.  Requirements terminology................................... 5
    2.2.  DHCP and failover terminology.............................. 5
    3.  Background and External Requirements......................... 7
    3.1.  Key aspects of the DHCP protocol........................... 7
    3.2.  BOOTP relay agent implementation........................... 9
    3.3.  What does it mean if a server can't communicate with its partner?
10
    3.4.  Challenging scenarios for a Failover protocol............. 10
    3.5.  Using TCP to detect partner server failure................ 11
    4.  Design Goals................................................ 13
    4.1.  Design requirements for this protocol..................... 13
    4.2.  Goals for this protocol................................... 13
    4.3.  Limitations of the difference between the lease times
   offered to DHCP clients this Protocol.............................. 14
    5.  Protocol Overview........................................... 15
    5.1.  Messages and States....................................... 15
    5.2.  Fundamental restrictions.................................. 18
    5.3.  Load balancing............................................ 24
    5.4.  Operating in NORMAL state................................. 25
    5.5.  Operating in COMMUNICATIONS-INTERRUPTED state............. 25
    5.6.  Operating in PARTNER-DOWN state........................... 25
    5.7.  Operating in RECOVER state................................ 26
    6.  Packet Formats.............................................. 26
    6.1.  Common message format..................................... 26
    6.2.  Common option format...................................... 28
    6.3.  BNDUPD message format..................................... 40
    6.4.  BNDACK message format..................................... 42
    6.5.  Bulking for BNDUPD and BNDACK messages.................... 44
    6.6.  UPDREQ message format..................................... 44
    6.7.  UPDREQALL message format.................................. 44
    6.8.  UPDDONE message format.................................... 44
    6.9.  POOLREQ message format.................................... 45
    6.10.  POOLRESP message format.................................. 45
    6.11.  CONNECT message format................................... 46
    6.12.  CONNECTACK message format................................ 46
    6.13.  STATE message format..................................... 47
    6.14.  CONTACT message format................................... 48
    7.  Protocol Messages........................................... 48
    7.1.  BNDUPD message............................................ 48
    7.2.  BNDACK message............................................ 57
    7.3.  UPDREQ message............................................ 58
    7.4.  UPDREQALL message......................................... 59
    7.5.  UPDDONE message........................................... 60
    7.6.  POOLREQ message........................................... 60
    7.7.  POOLRESP message.......................................... 61
    7.8.  CONNECT message........................................... 62
    7.9.  CONNECTACK message........................................ 65
    7.10.  STATE message............................................ 68
    7.11.  CONTACT message.......................................... 69
    8.  Connection Management....................................... 70
    8.1.  Connection granularity.................................... 70
    8.2.  Creating the TCP connection............................... 70
    8.3.  Using the TCP connection for determining communications status. 71
    8.4.  Using the TCP connection for binding data................. 73
    8.5.  Using the TCP connection for control messages............. 73
    8.6.  Losing the TCP connection................................. 73
    9.  Protocol States............................................. 73
    9.1.  Server Initialization..................................... 74
    9.2.  Server State Transitions.................................. 74
    9.3.  STARTUP state............................................. 77
    9.4.  PARTNER-DOWN state........................................ 79
    9.5.  RECOVER state............................................. 81
    9.6.  NORMAL state.............................................. 83
    9.7.  COMMUNICATIONS-INTERRUPTED State.......................... 86
    9.8.  POTENTIAL-CONFLICT state.................................. 89
    9.9.  RECOVER-DONE state........................................ 90
    9.10.  PAUSED state............................................. 91
    9.11.  SHUTDOWN state........................................... 91
    10.  Safe Period................................................ 92
    11.  Security................................................... 94
    11.1.  Simple shared secret..................................... 94
    11.2.  TLS...................................................... 94
    12.  Hash algorithm for load balancing.......................... 95
    13.  Acknowledgments............................................ 96
    14.  References................................................. 97
    15.  Author's information....................................... 98
    16.  Full Copyright Statement................................... 99

1.  Introduction

   DHCP [RFC 2131] allows for multiple servers to be operating on a
   single network.  Some sites are interested in running multiple
   servers in such a way so as to provide redundancy in case of server
   failure, since the DHCP subsystem is in many cases a critical part of
   the network infrastructure.

   This document defines a protocol to provide synchronization between
   two servers in order that each can take over for the other should
   either one fail or become unreachable.

   One server is designated the "primary" server, the other is the
   "secondary" server, and all DHCP client requests are sent to each
   server.

   In order to provide a high availability DHCP service, these
   cooperating primary and secondary servers must maintain a consistent
   database of lease information.  This implies that servers will need
   to coordinate any and all lease activity so that this information is
   synchronized in case failover is required.  The protocol messages and
   processing techniques required to maintain a consistent database are
   specified in the protocol described here.

   The failover protocol also contains an algorithm which allows each
   server to determine to which DHCP clients it should provide service
   when both servers are operating normally, and this capability can be
   used to support load balancing.
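
   As an illustration only, the idea of each server independently
   deciding which clients it serves can be sketched as follows.  The
   sketch assumes that both servers hash the client identifier and
   split the hash space in half, and that a server answers every client
   when its partner is unreachable.  The function names, the use of
   SHA-256, and the even split are assumptions made for this example;
   they are not the hash algorithm specified by this document.

```python
import hashlib


def serving_role(client_id: bytes) -> str:
    """Return the role ("primary" or "secondary") that serves this client.

    Hypothetical rule: hash the client identifier and split on the first
    byte of the digest. Both servers compute the same answer on their own.
    """
    digest = hashlib.sha256(client_id).digest()
    return "primary" if digest[0] < 128 else "secondary"


def should_respond(my_role: str, client_id: bytes, partner_up: bool) -> bool:
    """Decide whether this server answers the client's request."""
    # Assumption: with the partner unreachable, this server answers everyone.
    if not partner_up:
        return True
    return serving_role(client_id) == my_role
```

   With both servers running, exactly one of them answers any given
   client; no per-request communication between the servers is needed
   to reach that decision.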

2.  Terminology

   This section discusses both the generic requirements terminology com-
   mon to many IETF protocol specifications as well as specialized DHCP
   and failover protocol specific terminology.

2.1.  Requirements terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC 2119].

2.2.  DHCP and failover terminology

   This document uses the following terms:

      o "DHCP client" or "client"

        A DHCP client is an Internet host using DHCP to obtain confi-
        guration parameters such as a network address.

      o "DHCP server" or "server"

        A DHCP server is an Internet host that returns configuration
        parameters to DHCP clients.

      o "binding"

        A binding is a collection of configuration parameters, including
        at least an IP address, associated with or "bound to" a DHCP
        client.  Bindings are managed by DHCP servers.

      o "binding database"

        The collection of bindings managed by a primary and secondary.

      o "failover endpoint"

        The failover protocol allows for there to be a unique failover
        endpoint per partner per role (where role is primary or
        secondary).  This failover endpoint can take actions and hold
        unique states.  There are thus a maximum of two failover
        endpoints per server per partner (one for each partner as a
        primary and one for that same partner as a secondary).

      o "lazy update"

        Lazy update refers to the requirement placed on a server imple-
        menting a failover protocol to update its failover partner when-
        ever the binding database changes.  A failover protocol which
        didn't support lazy update would require the failover partner
        update to be complete before a DHCP server could respond to a
        DHCP client request with a DHCPACK.  A failover protocol which
        does support lazy update places no such restriction on the
        update of the failover partner server, and so a server can allo-
        cate an IP address or extend a lease on an IP address and then
        update its failover partner as time permits.  A failover proto-
        col which supports lazy update not only removes the requirement
        to update the failover partner prior to responding to a DHCP
        client with a DHCPACK, but also allows gathering up batches of
        updates from one failover server to its partner.

      o "subnet address pool"

        A subnet address pool is the set of IP addresses which are
        associated with a particular network number and subnet mask.
        In the
        simple case, there is a single network number and subnet mask
        and a set of IP addresses.  In the more complex case (sometimes
        called "secondary subnets", sometimes "superscopes"), several
        (apparently unrelated) network number and subnet mask combina-
        tions with their associated IP addresses may all be configured
        together into one subnet address pool.

      o "Primary server" or "Primary"

        A DHCP server configured to provide primary service to a set of
        DHCP clients for a particular set of subnet address pools.

      o "Secondary server" or "Secondary"

        A DHCP server configured to act as backup to a primary server
        for a particular set of subnet address pools.

      o "stable storage"

        Every DHCP server is assumed to have some form of what is called
        "stable storage".  Stable storage is used to hold information
        concerning IP address bindings (among other things) so that this
        information is not lost in the event of a server failure which
        requires restart of the server.

      o "MCLT"

        The MCLT refers to maximum client lead time.  This time is
        configured on the primary server and transmitted from the
        primary to the secondary server in the CONNECT message.  It is
        the maximum amount of time by which one server can extend a
        client's binding beyond the time known and ACKed by the partner
        server.  See section 5.2.1 for details.
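
   The MCLT definition above can be read as a simple bound on the lease
   time a server may hand out.  The following is a minimal sketch of
   that reading, not the procedure of section 5.2.1: it assumes times
   are plain seconds, and the function and parameter names are
   hypothetical.

```python
from typing import Optional


def grantable_lease_seconds(now: int,
                            desired_seconds: int,
                            partner_acked_expiry: Optional[int],
                            mclt: int) -> int:
    """Cap a lease so it ends at most MCLT past what the partner has ACKed.

    partner_acked_expiry is the expiration time the partner has
    acknowledged, or None if the partner knows of no binding.
    """
    # A missing or already-passed acknowledged expiry gives no head start.
    if partner_acked_expiry is None or partner_acked_expiry < now:
        base = now
    else:
        base = partner_acked_expiry
    limit = (base + mclt) - now
    return min(desired_seconds, limit)
```

   Under these assumptions a client that the partner has never heard of
   receives at most MCLT seconds, while a client whose longer lease the
   partner has already acknowledged can be given its full desired time.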

3.  Background and External Requirements

   This section highlights key aspects of the DHCP protocol on which
   the failover protocol depends.  It also discusses the requirements
   that the failover protocol places on other aspects of the network
   infrastructure, and some general issues surrounding server failure
   detection.  Some failure scenarios that provide particular
   challenges to a failover protocol are discussed.  Finally, the
   challenges inherent in using a TCP connection as a means to detect
   failure of a partner server are elaborated.

3.1.  Key aspects of the DHCP protocol

   The failover protocol is designed to augment the DHCP protocol as
   described in RFC 2131 [RFC 2131].  There are several key aspects of
   the DHCP protocol which are required by the failover protocol in
   order to successfully meet its design goals.

3.1.1.  Broadcast behavior

   There are two aspects of the broadcast behavior of the DHCP protocol
   which are key to making the failover protocol operate successfully.
   The first is simply that the DHCP protocol requires a DHCP client to
   broadcast all DHCPDISCOVER and DHCPREQUEST/INIT-REBOOT messages.
   Because of this requirement, a DHCP client that was communicating
   with one server will automatically be able to communicate with
   another server if one is available.

   The second aspect of broadcast behavior is similar to the first, but
   involves the distinction between a DHCPREQUEST/RENEW and
   DHCPREQUEST/REBINDING.  A DHCPREQUEST/RENEW is the message that a
   DHCP client uses to extend its lease.  It is unicast to the DHCP
   server from which it acquired the lease.  However, the DHCP protocol
   (in a farsighted move) was explicitly designed so that in the event
   that a DHCP client cannot contact the server from which it received
   a lease on an IP address using a DHCPREQUEST/RENEW, the client is
   required to broadcast its renewal using a DHCPREQUEST/REBINDING to
   any available DHCP server.  Since all DHCP clients were required to
   implement this algorithm, the failover protocol can have a different
   server from the one that initially granted a lease be the server to
   renew a lease.  Thus, one server can take over for another with no
   interruption in the service as experienced by the DHCP client or its
   associated applications software.
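
   The client-side fallback from RENEW to REBINDING can be sketched as
   a function of how much of the lease has elapsed.  The sketch uses
   the default T1 (0.5) and T2 (0.875) fractions of the lease time from
   RFC 2131; the function name and the tuple it returns are assumptions
   made for this example.

```python
def next_request(elapsed: float, lease_seconds: float,
                 t1_fraction: float = 0.5, t2_fraction: float = 0.875):
    """Return (message, delivery) for a client that holds a lease.

    Before T1 the client sends nothing. Between T1 and T2 it unicasts a
    RENEW to the granting server. From T2 until expiry it broadcasts a
    REBINDING that any available server may answer.
    """
    if elapsed >= lease_seconds:
        return ("EXPIRED", None)  # the client must stop using the address
    if elapsed >= t2_fraction * lease_seconds:
        return ("DHCPREQUEST/REBINDING", "broadcast")
    if elapsed >= t1_fraction * lease_seconds:
        return ("DHCPREQUEST/RENEW", "unicast")
    return (None, None)
```

   The broadcast REBINDING stage is what gives a failover partner the
   opportunity to renew a lease that it did not originally grant.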

3.1.2.  Client responsibility

   In the DHCP protocol the DHCP clients are entrusted with a
   considerable responsibility.  In particular, after they are granted
   a lease on an IP address, they are enjoined to only use that IP
   address while their lease is valid.  Every DHCP client is expected
   to stop using an IP address if the expiration time on the lease has
   passed and if it cannot get an extension on the lease for that IP
   address from some DHCP server.  Thus, the correct behavior of every
   DHCP client in this regard is required to ensure the integrity of
   the DHCP service.  On the other hand, incorrect behavior by a client
   in this area will tend to adversely affect at most one other DHCP
   client.

   Furthermore, any DHCP client which sends in a DHCPREQUEST/RENEW or
   DHCPREQUEST/REBINDING to a DHCP server (either unicast for a RENEW or
   broadcast for a REBINDING) MUST still have time to run on the lease
   for that IP address.  The DHCP server sends the DHCPACK back unicast
   to the IP address from which the RENEW or REBINDING originated.

   Given the existing responsibility placed on the client to only use
   an IP address when the lease is valid, and to only send in a RENEW
   or REBINDING if the lease is valid, the failover protocol relies on
   DHCP clients to perform responsibly and will, in the absence of
   conflicting information, believe that a DHCP client that is
   attempting to RENEW or REBIND a lease on an IP address is the
   legitimate owner of that IP address.

   One troublesome issue is that of the DHCP client responsibility when
   sending in DHCPREQUEST/INIT-REBOOT requests.  While the original DHCP
   RFC was written to require a DHCP client to have time left to run on
   the lease for an IP address if the client is sending an INIT-REBOOT
   request, it was sufficiently unclear that some client vendors didn't
   realize this until recently.  Since the INIT-REBOOT request was sent
   with the IP address in the dhcp-requested-address option and not in
   the ciaddr (for perfectly good reasons), the similarity to the RENEW
   and REBINDING case was lost on many people.

   At present, the failover protocol does not assume that a client
   sending in an INIT-REBOOT request necessarily has a valid lease on
   the IP address appearing in the dhcp-requested-address option in the
   INIT-REBOOT request.
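
   The trust rules of this section can be summarized as a small
   decision function.  This is only a sketch of the text above: the
   function name, the message labels, and the single boolean standing
   in for "conflicting information" are assumptions made for
   illustration.

```python
def believes_client_owns_address(message_kind: str,
                                 has_conflicting_info: bool) -> bool:
    """Should a server take the client's word that its lease is valid?

    RENEW and REBINDING are believed unless the server holds
    conflicting information. INIT-REBOOT is never believed on the
    client's word alone, so the server must rely on its own records.
    """
    if message_kind in ("RENEW", "REBINDING"):
        return not has_conflicting_info
    if message_kind == "INIT-REBOOT":
        return False
    raise ValueError("unexpected message kind: " + message_kind)
```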

   The implications of this are as follows: Assume that there is a DHCP
   client that gets a lease from one server while that server is unable
   to communicate with its failover partner.  Then, assume that after
   that client reboots it is able only to communicate with the other
   failover server.  If the servers have not been able to communicate
   with each other during this process, then the DHCP client will get a
   new IP address instead of being able to continue to use its existing
   IP address.  This will affect no applications on the DHCP client,
   since it is rebooting.  However, it will use up an additional IP
   address in the marginal case.

3.1.3.  Stable storage update before DHCPACK

   The DHCP protocol allocates resources, and in order to operate
   correctly it requires that a DHCP server update some form of stable
   storage prior to sending a DHCPACK to a DHCP client in order to grant
   that client a lease on an IP address.

   One of the goals of the failover protocol is that it not add
   significant additional time to this already time consuming
   requirement to update stable storage prior to a DHCPACK.  In
   particular, adding a requirement to communicate with another server
   prior to sending a DHCPACK would simplify the failover protocol, but
   it would limit the potential scalability of any DHCP server which
   employed the failover protocol in an unacceptable manner.
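
   The ordering this section describes (stable storage first, then the
   DHCPACK, with the partner updated later and in batches) can be
   sketched as below.  The class, its in-memory "stable storage", and
   the send callback are stand-ins invented for this example; a real
   server would write to disk and send protocol messages to its
   partner.

```python
class LazyUpdateServer:
    """Toy server that ACKs clients before updating its failover partner."""

    def __init__(self):
        self.stable_storage = {}   # stand-in for durable storage
        self.partner_queue = []    # binding changes not yet sent to partner
        self.events = []           # records the order in which steps ran

    def handle_request(self, ip, client_id, expiry):
        # Step 1: record the binding before answering the client.
        self.stable_storage[ip] = (client_id, expiry)
        self.events.append("stable-storage-write")
        # Step 2: queue the partner update; do not wait for the partner.
        self.partner_queue.append((ip, client_id, expiry))
        # Step 3: answer the client.
        self.events.append("DHCPACK")
        return ("DHCPACK", ip, expiry)

    def update_partner(self, send):
        """Send all queued binding changes as one batch, as time permits."""
        batch, self.partner_queue = self.partner_queue, []
        if batch:
            send(batch)
            self.events.append("partner-update")
        return len(batch)
```

   Because the partner update is outside the request path, the client
   sees no extra delay, and several binding changes can travel to the
   partner together.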

3.2.  BOOTP relay agent implementation

   Many DHCP clients are not resident on the same network segment as a
   DHCP server.  In order to support this form of network architecture,
   most contemporary routers implement something known as a BOOTP Relay
   Agent.  This capability inside of a router listens for all broadcasts
   at the DHCP port, port 67, and will relay any broadcasts that it
   receives on to a DHCP server.  The IP address of the DHCP server must
   have been previously configured into the router.  As part of the
   relay process, the relay agent will place the address of the
   interface on which it received the broadcast into the giaddr field of
   the DHCP packet.

   Since the failover protocol requires two DHCP servers to receive any
   broadcast DHCP messages, in order to work with DHCP clients which are
   not local to the DHCP server, the BOOTP relay agent on the router
   closest to the DHCP client must be configured to point at more than
   one DHCP server.

   Most BOOTP relay agent implementations allow this duplication of
   packets.

   If this is not possible, an administrator might be able to configure
   the relay agent with a subnet broadcast address, but in this case the
   primary and secondary DHCP servers in a failover pair must both
   reside on the same subnet.  While this is a realistic configuration,
   it is not the one that most people will use.

3.3.  What does it mean if a server can't communicate with its partner?

   In any protocol designed to allow one server to take over some
   responsibilities from a partner server in the event of "failure" of
   that partner server, there is an inherent difficulty in determining
   when that partner server has failed.

   In fact, it is fundamentally impossible for one server to distinguish
   a network communications failure from the outright failure of the
   server to which it is trying to communicate.  In the case where each
   server is handing out resources (in this case IP addresses) to a
   client community, mistaking an inability to communicate with a
   partner server for failure of that partner server could easily cause
   both servers to be handing out the same IP addresses to different
   clients.

   One way that this is sometimes handled is for there to be more than
   two servers.  In the case of an odd number of servers, the servers
   that can still communicate with a majority of other servers will
   consider themselves operational, and any server which can't
   communicate to a majority of other servers must immediately cease
   operations.

   While this technique works in some domains, having the only server to
   which a DHCP client can communicate voluntarily shut itself down
   seems like something worth avoiding.

   The failover protocol will operate correctly while both servers are
   unable to communicate, whether they are both running or not.  At some
   point there may be resource contention, and if one of the servers is
   actually down, then the operator can inform the other server and the
   operational server will be able to use all of the downed server's
   resources.

   The protocol also allows detection of an orderly shutdown of a parti-
   cipating server.

3.4.  Challenging scenarios for a Failover protocol

   There exist two failure scenarios which provide particular challenges
   to the correctness guarantees of a failover protocol.

3.4.1.  Primary Server crash before "lazy" update:

   In the case where the primary server sends a DHCPACK to a client for
   a newly allocated IP address and then crashes prior to sending the
   corresponding update to the secondary server, the secondary server
   will have no record of the IP address allocation.  When the secondary
   server takes over, it may well try to allocate that IP address to a
   different client.  In the case where the first client to receive the
   IP address is not on the net at the time (yet while there was still
   time to run on its lease), an ICMP echo (i.e., ping) will not prevent
   the secondary server from allocating that IP address to a different
   client.

   The failover protocol deals with this situation by having the primary
   and secondary servers allocate addresses for new clients from dis-
   joint address pools.  See section 5.4 for details.

   A more likely (in that DHCPRENEWs are presumably more common than
   DHCPDISCOVERs) and more subtle version of this problem is where the
   primary server crashes after extending a client's lease time, and
   before updating the secondary with a new time using a lazy update.
   After the secondary takes over, if the client is not connected to the
   network the secondary will believe the client's lease has expired
   when, in fact, it has not.  In this case as well, the IP address
   might be reallocated to a different client while the first client is
   still using it.

   This scenario is handled by the failover protocol through control of
   the lease time and the use of the maximum client lead time (MCLT).
   See section 5.2.1  for details.

3.4.2.  Network partition where DHCP servers can't communicate but each
can talk to clients:

   Several conditions are required for this situation to occur.  First,
   due to a network failure, the primary and secondary servers cannot
   communicate.  As well, some of the DHCP clients must be able to com-
   municate with the primary server, and some of the clients must now
   only be able to communicate with the secondary server.  When this
   condition occurs, both primary and secondary servers could attempt to
   allocate IP addresses for new clients from the same pool of available
   addresses.  At some point, then, two clients will end up being allo-
   cated the same IP address.  This will cause problems when the network
   failure that created this situation is corrected.

   The failover protocol deals with this situation by having the primary
   and secondary servers allocate addresses for new clients from dis-
   joint address pools.  See section 5.4 for details.

3.5.  Using TCP to detect partner server failure

   There are several characteristics of TCP that are important to the
   functioning of the failover protocol, which uses one TCP connection
   both for bulk data transfer and to assess communications integrity
   with the other server.  Reliable and ordered message delivery are
   chief among these important characteristics.

   It would be nice to use the capabilities built in to TCP to allow it
   to determine if communications integrity exists to the failover
   partner but this strategy contains some problems which require
   analysis.  There exist three fundamental cases for an open TCP con-
   nection that must be examined.

      1.  When no data is being sent then no messages are traveling
          across the TCP connection.

      2.  When data is queued to be sent, and the receiver has not
          blocked the sending of additional data, then messages are
          flowing across the TCP connection containing the application's
          data.

      3.  When data is queued to be sent, and the receiver has blocked
          the transmission of additional data, then persist messages are
          flowing from the receiver to the sender to ensure that the
          sender doesn't miss the receiver opening the window for
          further transmissions.

   The first case can be turned into the second case by sending
   application-level keep-alive messages periodically when there is no
   other data queued to be sent.  Note that TCP keep-alive messages
   might be used as well, but they present additional problems.

   Thus, we can ensure that the TCP connection has messages flowing
   periodically across the connection fairly easily.  The question
   remains as to what TCP will do if the other end of the connection
   fails to respond (either because of network partition or because the
   receiving server crashes).  TCP will attempt to retransmit a message
   with an exponential backoff, and will eventually time out that
   retransmission.  However, the length of that timeout cannot, in
   general, be set on a per-connection basis, and is frequently as long
   as nine minutes, though in some cases it may be as short as two
   minutes.  On some systems it can be set system-wide, while on some
   systems it cannot be changed at all.

   A value for this timeout that would be appropriate for the failover
   protocol, say less than 1 minute, could have unpleasant side-effects
   on other applications running on the same server, assuming that it
   could be changed at all on the host operating system.

   Nine minutes is a long time for the DHCP service to be unavailable to
   any new clients that were being served by the server which has
   crashed, when there is another server running that could respond to
   them immediately as soon as it determines that its partner is not
   operational.

   The conclusion drawn from this analysis is that TCP provides very
   useful support for the failover protocol in the areas of reliable and
   ordered message delivery, but cannot by itself be relied upon to
   detect partner server failure in a fashion acceptable to the needs of
   the failover protocol.  Additional failover protocol capabilities
   will need to be created to support timely detection of partner server
   failure.  See section 8.3 for details on this mechanism.
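
   The general idea of application-level failure detection can be
   sketched as follows.  This is only an illustration and not the
   mechanism defined in section 8.3: the class name ContactMonitor, the
   max_silence parameter, and the injectable clock are all assumptions
   introduced for this example.

```python
import time
from typing import Callable


class ContactMonitor:
    """Tracks how long it has been since the partner was last heard from."""

    def __init__(self, max_silence: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        # max_silence is a hypothetical, application-chosen limit in seconds.
        self.max_silence = max_silence
        self._clock = clock
        self._last_heard = clock()

    def message_received(self) -> None:
        """Call for every message from the partner, including keep-alives."""
        self._last_heard = self._clock()

    def partner_unreachable(self) -> bool:
        """True once the partner has been silent for longer than max_silence."""
        return self._clock() - self._last_heard > self.max_silence
```

   Because the limit is held by the application rather than by the TCP
   stack, it can be set well below the system-wide retransmission
   timeout without affecting other applications on the same host.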

4.  Design Goals

   This section lists the design requirements, the design goals, and the
   limitations of the failover protocol.

4.1.  Design requirements for this protocol

   The following list of requirements must be (and are) met by this pro-
   tocol.  They are listed in priority order.

      1.  Implementations of this protocol must work with existing DHCP
          client implementations based on the DHCP protocol [1].

      2.  Implementations of the protocol must work with existing BOOTP
          relay agent implementations.

      3.  The protocol must provide failover redundancy between servers
          that are not located on the same subnet.

4.2.  Goals for this protocol

   The following goals are met by this protocol as well, though they are
   less important than the requirements listed above. These goals are
   listed in priority order.

      1.  Provide for continued service to DHCP clients through an
          automated mechanism in the event of failure of the primary
          server.

      2.  Avoid binding an IP address to a client while that binding is
          currently valid for another client.  In other words, do not
          allocate the same IP address to two clients.

      3.  Minimize any need for manual administrative intervention.

      4.  Introduce no additional delays in server response time as a
          result of the network communications required to implement the
          failover protocol, i.e., don't require communications with the
          partner between the receipt of a DHCPREQUEST and the
          corresponding DHCPACK.

      5.  Share IP address ranges between primary and secondary servers;
          i.e., impose no requirement that the pool of available
          addresses be divided between servers.

      6.  Continue to meet the goals and objectives of this protocol in
          the event of server failure or network partition.

      7.  Provide graceful reintegration of full protocol service after
          server failure or network partition.

      8.  Allow for one computer to act as a secondary server for
          multiple primary servers.  Other topologies (e.g., mesh) are
          also possible.  Primary and secondary servers SHOULD be viewed
          as "logical" servers and not necessarily physical computers.

      9.  Ensure that an existing client can keep its existing IP
          address binding if it can communicate with either the primary
          or secondary DHCP server implementing this protocol, not just
          the server that originally offered it the binding.

      10. Ensure that a new client can get an IP address from some
          server. Ensure that in the face of partition, where servers
          continue to run but cannot communicate with each other, the
          above goals and requirements may be met. In addition, when the
          partition condition is removed, allow graceful automatic re-
          integration without requiring human intervention.

      11. If either primary or secondary server loses all of the
          information that it has stored in stable storage, it should be
          able to refresh its stable storage from the other server.

      12. Support load balancing between the primary and secondary
          servers, and allow configuration of the percentage of the
          client population served by each with a moderately fine granu-
          larity.

4.3.  Limitations of this Protocol

   The following are explicit limitations of this protocol.

      1.  This protocol provides only one level of redundancy through a
          single secondary server for each primary server.

      2.  A subset of the address pool is reserved for secondary server
          use.  In order to handle the failure case where both servers
          are able to communicate with DHCP clients, but unable to com-
          municate with each other, a subset of the IP address pool must
          be set aside as a private address pool for the secondary
          server. The secondary can use these to service newly arrived
          DHCP clients during such a period.  The size of this private
          pool SHOULD be based only on the arrival rate of new DHCP
          clients and the length of expected downtime, and is not influ-
          enced in any way by the total number of DHCP clients supported
          by the server pair.

      3.  The primary and secondary servers do not respond to client
          requests at all while recovering from a failure that could
          have resulted in duplicate IP assignments (that is, while
          synchronizing in POTENTIAL-CONFLICT state).
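
   As a rough illustration of limitation 2, the private pool can be
   sized from the two quantities named there.  The draft does not give
   a formula; multiplying the arrival rate by the expected downtime is
   an assumption made for this sketch, as are the function and parameter
   names.

```python
import math


def private_pool_size(new_clients_per_hour: float,
                      expected_downtime_hours: float) -> int:
    """Estimate how many addresses the secondary needs in its private pool.

    Assumed formula: arrival rate of new clients times expected downtime,
    rounded up.  The total client population deliberately plays no part.
    """
    return math.ceil(new_clients_per_hour * expected_downtime_hours)
```

   For example, 5 new clients per hour over an expected 8 hours of
   downtime suggests a private pool of 40 addresses, whether the server
   pair supports a few hundred clients or many thousands.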

5.  Protocol Overview

   This section will discuss the failover protocol at a relatively high
   level of detail.  In the event that a description in this section
   conflicts (or appears to conflict due to the overview nature of this
   section) with information in later sections of this draft, the
   information in the later sections should be considered authoritative.

5.1.  Messages and States

   This protocol is centered around the message exchange used by one
   server to update the other server of binding database changes result-
   ing from DHCP client activity:

      o Communication of binding database changes

        The binding update (BNDUPD) message is used to send the binding
        database changes to the partner server, and the partner server
        responds with a binding acknowledgement (BNDACK) message when it
        has successfully committed those changes to its own stable
        storage.

   All of the other messages involve ancillary issues:

      o Management of available IP addresses

        The pool request (POOLREQ) is used by the secondary server to
        request an allocation of IP addresses from the primary server.

        The pool response (POOLRESP) is used by the primary server to
        inform the secondary server how many IP addresses it was allo-
        cated as the result of a pool request.

      o Synchronization of the binding databases between the servers
        after they've been out of communications

        The update request (UPDREQ) message is used by one server to
        request that its partner send it all binding database informa-
        tion that it has not already seen.  The update request all
        (UPDREQALL) message is used by one server to request that all
        binding database information be sent in order to recover from a
        total loss of its lease state database by the requesting server.
        The update done (UPDDONE) message is used by the responding
        server to indicate that all requested updates have been sent by
        the responding server and acked by the requesting server.

      o Connection establishment

        The connect (CONNECT) message is used by either server to estab-
        lish a high level connection with the other server, and to
        transmit several important configuration data items between the
        servers.  The connect acknowledgement message (CONNECTACK) is
        used to respond to a CONNECT message from another server.

      o Server synchronization

        The state change (STATE) message is used by either server to
        inform the other server of a change of failover state.

      o Connection integrity management

        The contact (CONTACT) message is used by either server to ensure
        that the other server continues to see the connection as opera-
        tional.  It MUST be transmitted periodically over every esta-
        blished connection if other message traffic is not flowing, and
        it MAY be sent at any time.
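
   The message set above can be summarized in a small table.  The
   sketch below is illustrative only: it uses the message names from
   this section but assigns no wire-format codes, and the names
   FailoverMessage, ANSWERED_BY, and answered_by are assumptions made
   for the example.

```python
from enum import Enum
from typing import Dict, Optional


class FailoverMessage(Enum):
    """Message names from section 5.1 (values are descriptions, not codes)."""
    BNDUPD = "binding update"
    BNDACK = "binding acknowledgement"
    POOLREQ = "pool request"
    POOLRESP = "pool response"
    UPDREQ = "update request"
    UPDREQALL = "update request all"
    UPDDONE = "update done"
    CONNECT = "connect"
    CONNECTACK = "connect acknowledgement"
    STATE = "state change"
    CONTACT = "contact"


# Which message concludes an exchange started by the key message.
# UPDDONE is sent only after the requested updates have been sent and acked.
ANSWERED_BY: Dict[FailoverMessage, FailoverMessage] = {
    FailoverMessage.BNDUPD: FailoverMessage.BNDACK,
    FailoverMessage.POOLREQ: FailoverMessage.POOLRESP,
    FailoverMessage.UPDREQ: FailoverMessage.UPDDONE,
    FailoverMessage.UPDREQALL: FailoverMessage.UPDDONE,
    FailoverMessage.CONNECT: FailoverMessage.CONNECTACK,
}


def answered_by(message: FailoverMessage) -> Optional[FailoverMessage]:
    """Return the concluding message for an exchange, or None if there is none."""
    return ANSWERED_BY.get(message)
```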

5.1.1.  Failover endpoints

   The proper operation of the failover protocol requires more than the
   transmission of messages between one server and the other.  Each end-
   point might seem to be a single DHCP server, but in fact there are
   many situations where additional flexibility in configuration is use-
   ful.

   For instance, there might be several servers which are each primary
   for a distinct set of address pools, and one server which is
   secondary for all of those address pools.  The situation with the
   primaries is straightforward, but the secondary will need to maintain
   a separate failover state, partner state, and communications up/down
   status for each of the separate primary servers for which it is act-
   ing as a secondary.

   The failover protocol calls for there to be a unique failover end-
   point per partner per role (where role is primary or secondary).
   This failover endpoint can take actions and hold unique states.
   There are thus a maximum of two failover endpoints per partner (one
   for the partner as a primary and one for that same partner as a
   secondary.)

   Thus, in the case where there are two primary servers A and B each
   backed up by a single common secondary server C, there is one fail-
   over endpoint on each of A and B, and two different failover end-
   points on C.  The two different failover endpoints on C each have
   unique states and independent TCP connections.

   This document describes the behavior of the protocol in terms of pri-
   mary and secondary servers, not primary and secondary failover end-
   points.  However, it is important to remember that every 'server'
   described in this document is in reality a failover endpoint that
   resides in a particular process, and that many failover endpoints may
   reside in the same process.

   It is not the case that there is a unique failover endpoint for each
   subnet that participates in a failover relationship.  On one server,
   there is one failover endpoint per partner per role, regardless of
   how many subnets or address pools are managed by that combination of
   partner and role.  Conversely, any given subnet or pool will be asso-
   ciated with exactly one failover endpoint on a single server.

   When a connection is received from the partner, the unique failover
   endpoint to which the message is directed is determined solely by the
   IP address of the partner and the setting of the SECONDARY bit in the
   'flags' field of the contact message.
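
   One way to picture the endpoint selection just described is a table
   keyed by partner address and partner role.  This is a minimal sketch
   under stated assumptions: the bit value SECONDARY_FLAG, the reading
   that a set bit means the sending partner is acting as secondary, and
   all class and method names are hypothetical and not taken from the
   packet format defined elsewhere in this draft.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional, Tuple


class Role(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"


SECONDARY_FLAG = 0x01  # hypothetical bit position in the 'flags' field


@dataclass
class FailoverEndpoint:
    partner_ip: str
    partner_role: Role            # role the partner plays toward this server
    state: Optional[str] = None   # each endpoint holds its own failover state


class EndpointTable:
    """At most two endpoints per partner: one per partner role."""

    def __init__(self) -> None:
        self._endpoints: Dict[Tuple[str, Role], FailoverEndpoint] = {}

    def add(self, endpoint: FailoverEndpoint) -> None:
        self._endpoints[(endpoint.partner_ip, endpoint.partner_role)] = endpoint

    def lookup(self, partner_ip: str, flags: int) -> FailoverEndpoint:
        # Assumption: a set SECONDARY bit means the sender acts as secondary.
        role = Role.SECONDARY if flags & SECONDARY_FLAG else Role.PRIMARY
        return self._endpoints[(partner_ip, role)]
```

   In the A, B, C example above, server C would hold one entry for
   partner A as primary and one for partner B as primary, each with its
   own state.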

   Throughout this document, the states and actions taken by "servers"
   are described.  The terms "server", "primary server", and "secondary
   server" are commonly used to describe the failover endpoint taking
   these states and performing these actions.  This description is
   wholly accurate only for the simplest of cases, where all of the
   address pools on one server are backed up by all of the address pools
   on another server.  In this case, there is a single failover endpoint
   in each server.  In all other cases, the term "server" is used to
   describe one of the two possible failover endpoints per partner.

5.2.  Fundamental restrictions

   There are several fundamental restrictions this protocol places on
   what one server can do in the absence of knowledge of the other
   server, and these restrictions are key to the correct operation of
   the protocol.

5.2.1.  Control of lease time

   The key problem with lazy update is that when a server fails
   after updating a client with a particular lease time and before
   updating its partner, the partner will believe that a lease has
   expired even though the client still retains a valid lease on that IP
   address.

   In order to handle this problem, a period of time known as the "Max-
   imum Client Lead Time" (MCLT) is defined and must be known to both
   the primary and secondary servers.  Proper use of this time interval
   places an upper bound on the difference allowed between the lease
   time provided to a DHCP client by a server and the lease time known
   by that server's partner. However, the MCLT is typically much less
   than the lease time that a server has been configured to offer a
   client, and so some strategy must exist to allow a server to offer
   the configured lease time to a client.  During a lazy update the
   updating server typically updates its partner with a potential
   expiration time which is longer than the lease time previously given
   to the client and which is longer than the lease time that the server
   has been configured to give a client.  This allows that server to
   give a longer lease time to the client the next time the client
   renews its lease, since the time that it will give to the client will
   not exceed the MCLT beyond the potential expiration time acknowledged
   by the partner.

   When moving to the PARTNER-DOWN state (where a server is allowed to
   reallocate the partner's IP addresses), a server will wait the Max-
   imum Client Lead Time before allocating any IP addresses from its
   partner's pool to any new DHCP clients.  Thus, any clients which have
   a lease on an IP address with a lease time greater than that known by
   the server moving into PARTNER-DOWN state will either have contacted
   that server during the MCLT period or their leases will have expired.

   When a server has transitioned to PARTNER-DOWN state, it MUST NOT
   reallocate an IP address from one client to another client until an
   additional maximum client lead time interval after the lease by the
   original client expires. (Actually, until the maximum client lead
   time after what it believes to be the lease expiration time of the
   first client.)

   Some optimizations exist for this restriction, in that it only
   applies to leases that were issued BEFORE entering PARTNER-DOWN. Once
   a server has entered PARTNER-DOWN and it leases out an address, it
   need not wait this time as long as it has never communicated with the
   partner since the lease was given out.
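
   The PARTNER-DOWN waiting rules above amount to simple time
   arithmetic.  The following is a sketch under stated assumptions: the
   function and parameter names are hypothetical, times are plain
   seconds, and the flag issued_in_partner_down is taken to mean that
   the lease was issued after entering PARTNER-DOWN with no
   communication with the partner since.

```python
def partner_pool_usable_at(partner_down_entered: float, mclt: float) -> float:
    """Earliest time the partner's addresses may be given to new clients."""
    return partner_down_entered + mclt


def earliest_reallocation(believed_expiration: float, mclt: float,
                          issued_in_partner_down: bool) -> float:
    """Earliest time an address may be moved from one client to another.

    believed_expiration is what this server believes to be the lease
    expiration time of the first client.
    """
    if issued_in_partner_down:
        # Optimization: this server alone knows the lease, so no extra wait.
        return believed_expiration
    return believed_expiration + mclt
```

   With an MCLT of one hour, a lease that this server believes expires
   at time 10000 and that was issued before PARTNER-DOWN could not be
   reallocated before time 13600.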

   The fundamental relationship on which much of the correctness of this
   protocol depends is that the lease expiration time known to a DHCP
   client MUST NOT be more than the maximum client lead time greater
   than the potential expiration time known to a server's partner.

   The remainder of this section makes the above fundamental relation-
   ship more explicit.

   This protocol requires a DHCP server to deal with several different
   lease intervals and places specific restrictions on their relation-
   ships. The purpose of these restrictions is to allow the other server
   in the pair to be able to make certain assumptions in the absence of
   an ability to communicate between servers.

   The different lease times are:

      o desired lease interval

        The desired lease interval is the lease interval that a DHCP
        server would like to give to a DHCP client in the absence of any
        restrictions imposed by the Failover protocol.  Its determina-
        tion is outside of the scope of this protocol. Typically this is
        the result of external configuration of a DHCP server.

      o actual lease interval

        The actual lease interval is the lease interval that a DHCP
        server gives out to a DHCP client in the dhcp-lease-time option
        of a DHCPACK packet.  It may be shorter than the desired client
        lease interval (as explained below).

      o potential lease interval

        The potential lease interval is the lease expiration interval
        the local server tells to its partner in the potential-
        expiration-time option of a BNDUPD message.

      o acknowledged potential lease interval

        The acknowledged potential lease interval is the potential lease
        interval the partner server has most recently acknowledged in
        the potential-expiration-time option of a BNDACK message.

   The key restriction (and guarantee) that any server makes with
   respect to lease intervals is that the actual client lease interval
   never exceeds the acknowledged potential lease interval (if any) by
   more than a fixed amount.  This fixed amount is called the "Maximum
   Client Lead Time" (MCLT).

   The MCLT MAY be configurable on the primary server, but for correct
   server operation it MUST be the same and known to both the primary
   and secondary servers.  The secondary server determines the MCLT from
   the MCLT option sent from the primary server to the secondary server
   in the CONNECT or CONNECTACK message.

   A server MUST record in its stable storage both the actual lease
   interval and the most recently acknowledged potential lease interval
   for each IP address binding.  It is assumed that the desired client
   lease interval can be determined through techniques outside of the
   scope of this protocol.

   Again, the fundamental relationship among these times which MUST be
   maintained is:

       actual lease interval <
       ( acknowledged potential lease interval + MCLT )

   Figure 5.1-1 illustrates an initial lease to a client using the rules
   discussed in the example which follows it.

              DHCP                 Primary             Secondary
       time   Client               Server               Server

                | (time in intervals) |  (absolute time)   |
                |                     |                    |
                | >-DHCPDISCOVER->    |                    |
                |     <---DHCPOFFER-< |                    |
                |                     |                    |
                | >-DHCPREQUEST->     |                    |
                |   (selecting)       |                    |
                |                     |                    |
         t      |  <--------DHCPACK-< |                    |
                |  lease-time=MCLT    |                    |
                |                     |    >-BNDUPD-->     |
                |                     |  lease-expiration=t+MCLT
                |                     |  potential-expiration=t+(MCLT/2)+X
                |                     |                    |
                |                     |     <-BNDACK-<     |
                |                     |  potential-expiration=t+(MCLT/2)+X
               ...                   ...                  ...
                |                     |                    |
      t+MCLT/2  | >-DHCPREQUEST->     |                    |
                |      (renew)        |                    |
                |                     |                    |
         t1     |  <--------DHCPACK-< |                    |
                |   lease-time=X      |                    |
                |                     |    >-BNDUPD-->     |
                |                     |  lease-expiration=t1+X
                |                     |  potential-expiration=t1+(X/2)+X
                |                     |                    |
                |                     |     <-BNDACK-<     |
                |                     |  potential-expiration=t1+(X/2)+X
               ...                   ...                  ...

           Figure 5.1-1:  Lazy Update Message Traffic
                          X = Desired Lease Interval

   DISCUSSION:

      This protocol mandates no algorithm concerning these lease
      intervals, as long as the above fundamental relationship is
      preserved.

      In the interests of clarity, however, let's examine a specific
      example.  The MCLT in this case is 1 hour.  The desired lease
      interval is 3 days, and its renewal time is half the lease inter-
      val.

      The rules for this example are:

      o What to tell the client:

        Take the remainder of the acknowledged potential lease interval.
        If this is a new lease, then this value will be zero.  If this
        remainder plus the MCLT is greater than the desired lease
        interval, give the client the desired lease interval; otherwise
        give the client the remainder plus the MCLT.

      o What to tell the failover partner server:

        Take the renewal interval (typically half of the actual client
        lease interval), add to it the desired lease interval, and add
        it to the current time to yield the value that goes into the
        potential-expiration-time option.

        Also tell the failover partner the actual lease interval by
        adding it to the current time to yield the value that goes into
        the lease-expiration option.

      In operation this might work as follows:

      When a server makes an offer for a new lease on an IP address to a
      DHCP client, it determines the desired lease interval (in this
      case, 3 days).  It then examines the acknowledged potential lease
      interval (which in this case is zero) and determines the remainder
      of the time left to run, which is also zero.  To this it adds the
      MCLT.  Since the actual lease interval cannot be allowed to exceed
      the remainder of the current acknowledged potential lease interval
      plus the MCLT, the offer made to the client is for the remainder
      of the current acknowledged potential lease interval (i.e., zero)
      plus the MCLT.  Thus, the actual lease interval is 1 hour.

      Once the server has performed the ACK to the DHCP client, it will
      update the secondary server with the lease information. However,
      the desired potential lease interval will be composed of the one
      half of the current actual lease interval added to the desired
      lease interval. Thus, the secondary server is updated with a
      BNDUPD with a lease interval of 3 days + 1/2 hour specified in the
      IP Address Lease Time Option (Option 51).

      When the primary server receives an ACK to its update of the
      secondary server's (partner's) potential lease interval, it
      records that as the acknowledged potential lease interval.  A
      server MUST NOT send a BNDACK in response to a BNDUPD message
      until it is sure that the information in the BNDUPD message
      resides in its stable storage.  Thus, the primary server in this
      case can be sure that the secondary server has recorded the poten-
      tial lease interval in its stable storage when the primary server
      receives a BNDACK message from the secondary server.

      When the DHCP client attempts to renew at T1 (approximately one
      half an hour from the start of the lease), the primary server
      again determines the desired lease interval, which is still 3
      days.  It then compares this with the remaining acknowledged
      potential lease interval (3 days + 1/2 hour) and adjusts for the
      time passed since the secondary was last updated (1/2 hour).  Thus
      the time remaining of the acknowledged potential lease interval is
      3 days.  Adding the MCLT to this yields 3 days plus 1 hour, which
      is more than the desired lease interval of 3 days.  So the client
      is renewed for the desired lease interval -- 3 days.

      When the primary DHCP server updates the secondary DHCP server
      after the DHCP client's renewal ACK is complete, it will calculate
      the desired potential lease interval as the T1 fraction of the
      actual client lease interval (1/2 of 3 days this time = 1.5 days).
      To this it will add the desired client lease interval of 3 days,
      yielding a total desired partner server lease interval of 4.5
      days.  In this way, the primary attempts to have the secondary
      always "lead" the client in its understanding of the client's
      lease interval so as to be able to always offer the client the
      desired client lease interval.
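
      The arithmetic of this example can be sketched as follows.  This
      is a minimal illustration only, not a mandated algorithm; the
      function names and the use of plain integer seconds are
      assumptions made for the sketch, and the renewal interval is
      assumed to be half of the actual client lease interval.

```python
HOUR = 3600
DAY = 24 * HOUR


def remaining_potential(acked_potential_expiration, now):
    """Remainder of the acknowledged potential lease interval.

    A new lease has no acknowledged potential expiration (None), so the
    remainder is zero.
    """
    if acked_potential_expiration is None:
        return 0
    return max(0, acked_potential_expiration - now)


def client_lease_interval(desired, remaining, mclt):
    """What to tell the client: the desired interval if the remainder plus
    the MCLT exceeds it, otherwise the remainder plus the MCLT."""
    limit = remaining + mclt
    return desired if limit > desired else limit


def partner_update_times(now, actual, desired, renewal=None):
    """What to tell the partner, as absolute times.

    Returns (lease_expiration, potential_expiration).  The renewal
    interval defaults to half of the actual client lease interval.
    """
    if renewal is None:
        renewal = actual // 2
    lease_expiration = now + actual
    potential_expiration = now + renewal + desired
    return lease_expiration, potential_expiration
```

      With an MCLT of 1 hour and a desired lease interval of 3 days,
      this sketch gives a first lease of 3600 seconds and a potential
      lease interval of 3 days + 1/2 hour; at the renewal half an hour
      later it gives the full 3 days and a potential lease interval of
      4.5 days, matching the figures in the discussion above.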

      Once the initial actual client lease interval of the MCLT is past,
      the protocol operates effectively like the DHCP protocol does
      today in its behavior concerning lease intervals. However, the
      guarantee that the actual client lease interval will never exceed
      the remaining acknowledged partner server lease interval by more
      than the MCLT allows full recovery from a variety of failures.

5.2.2.  Controlled re-allocation of IP addresses

   When in PARTNER-DOWN state there is a waiting period after which an
   IP address can be re-allocated to another client.  For leases which
   are available when the server enters PARTNER-DOWN state, the period
   is the MCLT from entry into PARTNER-DOWN state.  For IP addresses
   which are not available when the server enters PARTNER-DOWN state,
   the period is the MCLT after the lease becomes available.  See sec-
   tion 9.4.2 for more details.
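
   As a rough illustration of the two cases above, the earliest time at
   which an address may be re-allocated to another client in
   PARTNER-DOWN state could be computed as below.  The function and
   parameter names are hypothetical, and all times are assumed to be
   absolute times in seconds; section 9.4.2 remains the authoritative
   description.

```python
def earliest_reallocation_time(partner_down_entry, mclt, became_available=None):
    """Earliest time an address may go to a different client in PARTNER-DOWN.

    became_available is None for a lease that was already available when
    the server entered PARTNER-DOWN state; otherwise it is the time at
    which the lease became available after entry.
    """
    if became_available is None or became_available <= partner_down_entry:
        # Available on entry: wait the MCLT from entry into PARTNER-DOWN.
        return partner_down_entry + mclt
    # Not available on entry: wait the MCLT after the lease becomes available.
    return became_available + mclt
```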

   In any other state, a server cannot reallocate an address from one
   client to another without first notifying its partner (through a
   BNDUPD message) and receiving acknowledgement (through a BNDACK mes-
   sage) that its partner is aware that that first client is not using
   the address.

   This could be modeled in the following way. Though this specific
   implementation is in no way required, it may serve to better illus-
   trate the concept.

   An "available" IP address on a server may be allocated to any client.
   An IP address which was leased to a client and which expired or was
   released by that client would take on a new state, EXPIRED or
   RELEASED respectively.  The partner server would then be notified
   that this IP address was EXPIRED or RELEASED through a BNDUPD.  When
   the sending server received the BNDACK for that IP address showing it
   was FREE, it would move the IP address from EXPIRED or RELEASED to
   FREE, and it would be available for allocation by the primary server
   to any clients.

   A server MAY reallocate an IP address in the EXPIRED or RELEASED
   state to the same client with no restrictions.
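
   The model described above might be sketched as a small state holder.
   This is an illustration only: the class and method names are
   hypothetical, the state names are the ones used in the text, and the
   exchange of BNDUPD and BNDACK messages is reduced to method calls.

```python
class Binding:
    """One IP address as seen by a server outside PARTNER-DOWN state."""

    def __init__(self):
        self.state = "FREE"
        self.client = None

    def may_allocate_to(self, client):
        """A FREE address may go to any client; an EXPIRED or RELEASED
        address (or one still ACTIVE) only to the client that held it."""
        if self.state == "FREE":
            return True
        return client == self.client

    def lease(self, client):
        if not self.may_allocate_to(client):
            raise ValueError("address not yet available to this client")
        self.state = "ACTIVE"
        self.client = client

    def expire(self):
        """Lease ran out; returns the state to report in a BNDUPD."""
        self.state = "EXPIRED"
        return self.state

    def release(self):
        """Client released the lease; returns the state for the BNDUPD."""
        self.state = "RELEASED"
        return self.state

    def on_bndack(self):
        """Partner acknowledged the EXPIRED or RELEASED notification."""
        if self.state in ("EXPIRED", "RELEASED"):
            self.state = "FREE"
            self.client = None
```

   The point of the sketch is that the move to FREE happens only in
   on_bndack, so a different client can never receive the address
   before the partner has acknowledged that the first client is no
   longer using it.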

5.3.  Load balancing

   In order to implement load balancing between a primary and secondary
   server pair, each server must respond to DHCPDISCOVER requests from
   some clients and not from other clients.  In order to do this suc-
   cessfully, each server must be able to determine immediately upon
   receipt of a DHCP client request whether it is to service this
   request or to ignore it in order to allow the other server to service
   the request.

   In addition, it should be possible to configure the percentage of
   clients which will be serviced by either the primary or secondary
   server.  This configuration should be more or less continuous, from
   all serviced by the primary through an even split with half serviced
   by each, to all serviced by the secondary.

   The technique chosen to support these goals is to define a hash func-
   tion which must be applied to the client-identifier or to the htype
   concatenated with the chaddr if no client-identifier is specified.
   The result of this hash function is a number between 0 and 255
   which maps into one of 256 "hash-buckets".  Each hash bucket is
   assigned to one server or the other by the primary server whenever a
   connection is established, through use of the hash-bucket-assignment
   option.

   The hash-bucket-assignment option uses a 32 octet value field (con-
   taining 256 bits), with one bit associated with each possible hash
   bucket.  If the bit corresponding to a hash bucket is a 1 in the
   hash-bucket-assignment option, then the secondary server is required
   to service all DHCP client requests that map into that hash bucket
   when in NORMAL state.

   For example, if the primary server sends a hash-bucket-assignment
   option to the secondary with the following 32 octets:

                                  buckets
       FF FF FF FF FF FF FF FF  ( 0   - 63 )
       FF FF FF FF FF FF FF FF  ( 64  - 127 )
       00 00 00 00 00 00 00 00  ( 128 - 191 )
       00 00 00 00 00 00 00 00  ( 192 - 255 )

   then the secondary MUST service any DHCP client requests where the
   client-identifier or htype concatenated with the chaddr hashes into
   the bucket values of 0 through 127.

   See section 12 for the code to implement the hash bucket algorithm.
   Each server MUST implement this same algorithm in order for all
   clients to get service.
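
   The bucket lookup itself (not the hash function, which is given in
   section 12) might look like the sketch below.  The example above
   fixes the order of the octets; the order of bits within an octet is
   an assumption here (most significant bit first).  Both function
   names are hypothetical.

```python
def secondary_serves(bucket_bits, bucket):
    """True if the given hash bucket is assigned to the secondary server.

    bucket_bits is the 32 octet value of the hash-bucket-assignment
    option; bucket is the hash result, 0 through 255.
    """
    if len(bucket_bits) != 32:
        raise ValueError("hash-bucket-assignment must be 32 octets")
    if not 0 <= bucket <= 255:
        raise ValueError("bucket out of range")
    octet = bucket_bits[bucket // 8]
    mask = 0x80 >> (bucket % 8)  # assumption: bucket 0 is the high-order bit
    return bool(octet & mask)


def assign_first_buckets(count):
    """Build an assignment giving buckets 0..count-1 to the secondary."""
    if not 0 <= count <= 256:
        raise ValueError("count out of range")
    bits = bytearray(32)
    for bucket in range(count):
        bits[bucket // 8] |= 0x80 >> (bucket % 8)
    return bytes(bits)
```

   Under these assumptions, assign_first_buckets(128) reproduces the 32
   octets of the example above, and varying the count from 0 to 256
   moves the split from all clients serviced by the primary to all
   serviced by the secondary.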

5.4.  Operating in NORMAL state

   When in NORMAL state, each server services DHCPDISCOVERs and all
   other DHCP requests, other than DHCPREQUEST/RENEWAL or
   DHCPREQUEST/REBINDING, from the client set defined by the load
   balancing algorithm.  Each server services DHCPREQUEST/RENEWAL or
   DHCPREQUEST/REBINDING requests from any client.

   In general, whenever the binding database is changed in stable
   storage, then a BNDUPD message is sent with the contents of that
   change to the partner server.  The partner server then writes the
   information about that binding in its bindings database in stable
   storage and replies with a BNDACK message.

5.5.  Operating in COMMUNICATIONS-INTERRUPTED state

   When operating in COMMUNICATIONS-INTERRUPTED state, each server is
   operating independently, but does not assume that its partner is not
   operating.  The partner server might be operating and simply unable
   to communicate with this server, or might not be operating.

   Each server responds to the full range of DHCP client messages that
   it receives, but in such a way that graceful reintegration is always
   possible when its partner comes back into contact with it.

5.6.  Operating in PARTNER-DOWN state

   When operating in PARTNER-DOWN state, a server assumes that its
   partner is not currently operating, but does make allowances for the
   possibility that that server was operating in the past.  It responds
   to all DHCP client requests in PARTNER-DOWN state.

   Any transactions that the partner server may have had with DHCP
   clients but been unable to communicate to this server are allowed for
   in the algorithms that are used to gradually take over full control
   of all of the addresses configured into the server.

5.7.  Operating in RECOVER state

   A server operating in RECOVER state assumes that it is reintegrating
   with a server that has been operating in PARTNER-DOWN state, and that
   it needs to update its bindings database before it services DHCP
   client requests.

   A server may also operate in RECOVER state in order to fully recover
   its bindings database from its partner server.

6.  Packet Formats

   This section discusses the message format common to all failover
   messages, and then defines the options used in the failover
   protocol.

6.1.  Common message format

   All failover protocol messages are sent over the TCP connection
   between failover endpoints and encoded using a packet format specific
   to the failover protocol.

   There exists a common message format for all failover messages, which
   utilizes the options in a way similar to the DHCP protocol.  For each
   message type, some options are required and some are optional.  In
   addition, when a message is received any options that are not under-
   stood by the receiving server MUST be ignored.

   All of the fields in the fixed portion of the packet MUST be filled
   with correct data in every message sent.

   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         packet length (2)     | msg type (1)  |payload off (1)|
   +---------------+---------------+---------------+---------------+
   |                            xid (4)                            |
   +---------------------------------------------------------------+
   |     0 or more additional header bytes  (variable)             |
   +---------------------------------------------------------------+
   |                    payload data  (variable)                   |
   |                                                               |
   |               formatted as DHCP-style options                 |
   |         using a unique option number space in the ?R6?        |
   |                   format defined by [NAMESPACE]               |
   +---------------------------------------------------------------+

   packet length - 2 bytes, network byte order

   This is the length of the packet.  It includes the two byte packet
   length itself.

   msg type - 1 byte

   The message type field is used to distinguish between messages.

   The following message types are defined:

   Value   Message Type
   -----   ------------
   0       reserved    not used
   1       POOLREQ     request allocation of addresses
   2       POOLRESP    respond with allocation count
   3       BNDUPD      update partner with binding info
   4       BNDACK      acknowledge receipt of binding update
   5       CONNECT     establish connection with partner
   6       CONNECTACK  respond to attempt to establish contact with partner
   7       UPDREQALL   request full transfer of binding info
   8       UPDDONE     ack send and ack of req'd binding info
   9       UPDREQ      req transfer of un-acked binding info
   10      STATE       inform partner of current state or state change
   11      CONTACT     probe communications integrity with partner

   New message types should be defined in one of two ranges, 0-127 or
   128-255.  The range of 0-127 is used for messages that MUST be
   supported by every server, and if a server receives a message in the
   range of 0-127 that it doesn't understand, it MUST drop the TCP con-
   nection.  The range of 128-255 is used for messages which MAY be sup-
   ported but are not required, and if a server receives a message in
   this range that it does not understand it SHOULD ignore the message.
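
   A receiver's handling of the msg type field might be sketched as
   follows.  The table of known types is copied from the list above;
   the function name and the returned strings are hypothetical, and
   treating the reserved value 0 as an unknown type in the mandatory
   range is an interpretation rather than something stated in the text.

```python
KNOWN_TYPES = {
    1: "POOLREQ", 2: "POOLRESP", 3: "BNDUPD", 4: "BNDACK",
    5: "CONNECT", 6: "CONNECTACK", 7: "UPDREQALL", 8: "UPDDONE",
    9: "UPDREQ", 10: "STATE", 11: "CONTACT",
}


def disposition(msg_type):
    """Decide what to do with a received message type."""
    if not 0 <= msg_type <= 255:
        raise ValueError("msg type is a single byte")
    if msg_type in KNOWN_TYPES:
        return "process"
    if msg_type <= 127:
        # Unknown type in the range every server MUST support.
        return "drop-connection"
    # Unknown type in the optional range: SHOULD be ignored.
    return "ignore"
```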

   payload offset - 1 byte

   The byte offset of the Payload Data, from the beginning of the
   failover packet header. The value for the current protocol version is
   8.

   xid - 4 bytes, network byte order

   This is the transaction id of the failover packet.  The sender of a
   failover protocol packet is responsible for setting this number, and
   the receiver of the packet copies the number over into any response
   packet, treating it as opaque data.  The sender SHOULD ensure that
   every packet sent from a particular failover endpoint over the
   associated TCP connection has a unique transaction id unless that
   packet is a re-transmission.

   payload data - variable length

   The options are placed after the header, after skipping payload
   offset bytes from beginning of the packet.  The payload data options
   are not preceded by a "cookie" value.

   The payload data is formatted as DHCP style options using the two
   byte option number and two byte option length format as specified in
   the recommendations of the DHCP panel in [NAMESPACE].

   The maximum length of the payload data in octets is 2048 less the
   size of the header, i.e., the maximum packet length is 2048 octets.
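
   The fixed header described above can be packed and parsed with a few
   lines of code.  The sketch below assumes the current header layout
   (payload offset of 8, no additional header bytes when sending); the
   function names are hypothetical.

```python
import struct

# packet length (2), msg type (1), payload offset (1), xid (4),
# all in network byte order.
HEADER = struct.Struct("!HBBI")
MAX_PACKET_LENGTH = 2048


def build_packet(msg_type, xid, payload=b""):
    """Return a complete failover packet with an 8 byte header."""
    length = HEADER.size + len(payload)
    if length > MAX_PACKET_LENGTH:
        raise ValueError("packet exceeds 2048 octets")
    return HEADER.pack(length, msg_type, HEADER.size, xid) + payload


def parse_packet(data):
    """Return (msg_type, xid, payload) from one complete packet.

    The payload is located by skipping payload offset bytes from the
    beginning of the packet, so any additional header bytes are skipped.
    """
    if len(data) < HEADER.size:
        raise ValueError("short packet")
    length, msg_type, offset, xid = HEADER.unpack_from(data)
    if length != len(data) or length > MAX_PACKET_LENGTH:
        raise ValueError("bad packet length")
    if offset < HEADER.size or offset > length:
        raise ValueError("bad payload offset")
    return msg_type, xid, data[offset:length]
```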

6.2.  Common option format

   The options contained in the payload data section of the failover
   packet all use the two byte option number and two byte length format
   as specified by the recommendations of the DHCP panel in [NAMESPACE].

   The option numbers are drawn from an option number space unique to
   the failover protocol.  All of the message types share a common
   option number space and common options definitions, though not all
   options are required or meaningful for every message.

   In contrast to the options which appear in DHCP client and server
   packets, the options in failover messages are ordered.  That is, for
   some messages the order in which the options appear in the payload
   data area is significant.  The messages for which this is the case
   spell it out in detail.

   All options which refer to time use an absolute time in
   GMT.  Time synchronization has already been achieved between the
   source and the target server using the CONNECT message.  All time
   fields in the options defined below use a time represented as seconds
   elapsed since Jan 1, 1970 (i.e. ANSI C time_t time value representa-
   tion).  Note that this is (at present) a signed field.

   Additional options can be defined for intervendor or vendor specific
   use with limited difficulty due to the large number of option numbers
   available.
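
   The two byte code, two byte length encoding, and the 4 byte time
   representation, might be handled as in the sketch below.  The
   function names are hypothetical; the decoder returns a list rather
   than a dictionary so that the order of the options is preserved, and
   time values are packed as signed 32 bit integers as noted above.

```python
import struct


def encode_option(code, value):
    """Encode one option: 2 byte code, 2 byte length, then the value."""
    return struct.pack("!HH", code, len(value)) + value


def encode_time_option(code, seconds_since_1970):
    """Encode a time option as a signed 32 bit count of seconds."""
    return encode_option(code, struct.pack("!i", seconds_since_1970))


def decode_options(payload):
    """Return the options as an ordered list of (code, value) pairs."""
    options = []
    pos = 0
    while pos < len(payload):
        if pos + 4 > len(payload):
            raise ValueError("truncated option header")
        code, length = struct.unpack_from("!HH", payload, pos)
        pos += 4
        if pos + length > len(payload):
            raise ValueError("truncated option value")
        options.append((code, payload[pos:pos + length]))
        pos += length
    return options
```

   For instance, a binding-status option (code 1) carrying the value 2
   encodes to the five bytes 00 01 00 01 02.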

6.2.1.  binding-status

   This option is used to convey the current state of a binding.

       Code          Len     Type
   +-----+-----+------+-----+-----+
   |  0  |  1  |   0  |  1  | 1-9 |
   +-----+-----+------+-----+-----+

   Legal values for this option are:

   Value Binding Status
   ----- ------------------------------------------------
   1     FREE           Lease has never been used
   2     ACTIVE         Lease is assigned to a client
   3     EXPIRED        Lease has expired
   4     RELEASED       Lease has been released by client
   5     ABANDONED      A server or client flagged the address as unusable
   6     RESET          Lease was freed by some external agent
   7     BACKUP         Lease belongs to secondary's private address pool
   8     EXPIRED-GRACE  Lease will become available after this period
   9     RELEASED-GRACE Lease will become available after this period

6.2.2.  assigned-IP-address

   The IP address to which this message refers.

        Code         Len          Address
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  2  |   0  |  4  | a1 |  a2 |  a3 |  a4 |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.3.  sending-server-IP-address

   The IP address of the server sending this message.

        Code         Len          Address
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  3  |   0  |  4  | a1 |  a2 |  a3 |  a4 |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.4.  addresses-transferred

   A 32 bit unsigned long in network byte order. Reports the number of
   addresses transferred by the primary to the secondary server
   (addresses to be used for the secondary server's private address
   pool).

        Code         Len       Number of Addresses
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  4  |   0  |  4  | n1 |  n2 |  n3 |  n4 |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.5.  client-identifier

   The format, code and conventions used are identical to DHCP option
   61.

        Code         Len       Client Identifier
   +-----+-----+------+-----+----+-----+---
   |  0  |  5  |   0  |  n  | i1 |  i2 | ...
   +-----+-----+------+-----+----+-----+--

6.2.6.  client-hardware-address

   The format is similar to DHCP option 61. Byte t1 (type) MUST be set
   to the proper ARP hardware address code, as defined in the ARP
   section of RFC 1700 (it MUST NOT be zero!)

        Code         Len      MAC address
   +-----+-----+------+-----+----+-----+-----+---
   |  0  |  6  |   0  |  n  | t1 |  m1 |  m2 | ...
   +-----+-----+------+-----+----+-----+-----+---

   Either Client Id, Client Hardware Address or BOTH MAY be present in
   binding update transactions. At least one of them MUST be present.
   If both are present, the Client Id MUST be used to uniquely identify
   the owner of the binding (exactly as in RFC 2131).

6.2.7.  client-FQDN

   If an implementation supports Dynamic DNS updates, this option can be
   used to communicate the DNS name that was set. Uses the format of the
   Client FQDN option (81) as described in [DDNS] and extended to fit in
   the two byte code and length approach of the DHCP panel.

        Code         Len     Flags Rcode1 Rcode2 Domain Name
   +-----+-----+------+-----+-----+------+------+-----+------
   |  0  |  7  |   0  |  n  |  f  |  r1  |  r2  |  d1 | d2...
   +-----+-----+------+-----+-----+------+------+-----+------

6.2.8.  reject-reason

   This option is used to selectively reject binding updates. It MAY be
   used in BNDACK message, always associated with an assigned-IP-address
   option, which contains the IP address of the update being rejected.

        Code         Len     Reason Code
   +-----+-----+------+-----+----------+
   |  0  |  8  |   0  |  1  |    R1    |
   +-----+-----+------+-----+----------+

   Reason codes :

   0   Reserved
   1   Illegal IP address (not part of any address pool)
   2   Fatal conflict exists: address in use by other client.
   3   Missing binding information.
   4   Connection rejected, time mismatch too great.
   5   Connection rejected, invalid MCLT.
   6   Connection rejected, unknown reason.
   7   Connection rejected, duplicate connection.
   8   Connection rejected, invalid failover partner.
   9   TLS not supported
   10  TLS supported but not configured
   11  TLS required but not supported by partner
   12  Message digest not supported
   13  Message digest not configured
   14  Protocol version mismatch
   15  Missing binding information
   16  Outdated binding information
   17  Less critical binding information
   18-253, reserved.
   254 Unknown: Error occurred but does not match any reason code
   255 Reserved for code expansion

6.2.9.  message

   This option is used to supply a human readable message.  It may be
   used in association with the Reject Reason Code to provide a human
   readable error message for the reject.

        Code         Len         Text
   +-----+-----+------+-----+------+-----+--
   |  0  |  9  |   0  |  n  |  c1  | c2  | ...
   +-----+-----+------+-----+------+-----+--

6.2.10.  MCLT

   Maximum Client Lead Time, in seconds.  A 32 bit integer value, in
   network byte order.

        Code         Len             Time
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  10 |   0  |  4  | t1 |  t2 |  t3 |  t4 |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.11.  vendor-class-identifier

   A string which identifies the vendor of the failover protocol
   implementation.

   The code for this option is 11, and its minimum length is 1.

        Code         Len           vendor class string
   +-----+-----+------+-----+----+-----+---
   |  0  |  11 |   0  |  n  | c1 |  c2 |  ...
   +-----+-----+------+-----+----+-----+---

6.2.12.  current-time

   The current time expressed as an absolute time in GMT represented as
   seconds elapsed since Jan 1, 1970 (i.e.  ANSI C time_t time value
   representation).

        Code         Len          Current Time
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  12 |   0  |  4  | t1 |  t2 |  t3 |  t4 |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.13.  lease-expiration-time

   The lease expiration time expressed as an absolute time in GMT
   represented as seconds elapsed since Jan 1, 1970 (i.e.  ANSI C time_t
   time value representation).

   The lease expiration time is the time that a server has ACKed to a
   DHCP client.

        Code         Len          Time
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  13 |   0  |  4  | t1 |  t2 |  t3 |  t4 |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.14.  potential-expiration-time

   The potential expiration time expressed as an absolute time in GMT
   represented as seconds elapsed since Jan 1, 1970 (i.e.  ANSI C time_t
   time value representation).

   The potential expiration time is the time that one server tells
   another server that it may ACK to a client.

        Code         Len          Time
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  14 |   0  |  4  | t1 |  t2 |  t3 |  t4 |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.15.  grace-expiration-time

   The grace expiration time expressed as an absolute time in GMT
   represented as seconds elapsed since Jan 1, 1970 (i.e.  ANSI C time_t
   time value representation).

   The grace expiration time is the time that a grace period will
   expire.

        Code         Len          Time
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  15 |   0  |  4  | t1 |  t2 |  t3 |  t4 |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.16.  client-last-transaction-time

   The time at which this server last received a DHCP request from a
   particular client expressed as an absolute time in GMT represented as
   seconds elapsed since Jan 1, 1970 (i.e.  ANSI C time_t time value
   representation).

        Code         Len    Client Last Trans. Time
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  16 |   0  |  4  | t1 |  t2 |  t3 |  t4 |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.17.  start-time-of-state

   The time at which the state contained in this message began,
   expressed as an absolute time in GMT represented as seconds elapsed
   since Jan 1, 1970 (i.e.  ANSI C time_t time value representation).

   This option is used for different states in different messages.  In a
   BNDUPD message it represents the start time of the state of the lease
   in the BNDUPD message.  In a STATE message, it represents the start
   time of the partner server's failover state.

        Code         Len      Start Time of State
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  17 |   0  |  4  | t1 |  t2 |  t3 |  t4 |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.18.  server-state

   This option is used to convey the current state of the failover
   endpoint in the sending server.

       Code          Len     Server State
   +-----+-----+------+-----+-----+
   |  0  |  18 |   0  |  1  | 1-9 |
   +-----+-----+------+-----+-----+

   Legal values for this option are:

   Value   Server State
   -----   -------------------------------------------------------------
   0       reserved
   1       STARTUP                      Startup state (1)
   2       NORMAL                       Normal state
   3       COMMUNICATIONS-INTERRUPTED   Communication interrupted (safe)
   4       PARTNER-DOWN                 Partner down (unsafe mode)
   5       POTENTIAL-CONFLICT           Synchronizing
   6       RECOVER                      Recovering bindings from partner
   7       PAUSED                       Shutting down for a short period.
   8       SHUTDOWN                     Shutting down for an extended
                                        period.
   9       RECOVER-DONE                 Interlock state prior to NORMAL

6.2.19.  server-flags

   This option is used to convey the current flags of the failover
   endpoint in the sending server.

       Code          Len     Server Flags
   +-----+-----+------+-----+-------+
   |  0  |  19 |   0  |  1  | flags |
   +-----+-----+------+-----+-------+

   Legal values for this option are:

   Currently, bit 5 is defined.  All other bits
   are reserved, and must be set to 0.

      o STARTUP

        Bit 5 is the STARTUP flag.  Bit 5 MUST be set to 1 whenever the
        server is in STARTUP state, and set to 0 otherwise.  (Note that
        when in STARTUP state, the state transmitted in the server-state
        option is usually the last recorded state from stable storage,
        but see section 9.3 for details.)

6.2.20.  vendor-specific-options

   This option is used to convey options specific to a particular
   vendor's implementation.  The vendor class identifier is used to
   specify which option space the embedded options are drawn from.

   It functions similarly to the vendor class identifier and vendor
   specific options in the DHCP protocol.

   This option contains other options in the same two byte code, two
   byte length format.  If this option appears in a message without a
   corresponding vendor class identifier, it MUST be ignored.

        Code         Len        Embedded options
   +-----+-----+------+-----+----+-----+---
   |  0  |  20 |   0  |  n  | c1 |  c2 |  ...
   +-----+-----+------+-----+----+-----+---

6.2.21.  max-unacked-bndupd

   The maximum number of BNDUPD messages that this server is prepared to
   accept over the TCP connection without causing the TCP connection to
   block.

        Code         Len     Maximum Unacked BNDUPD
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  21 |   0  |  4  | n1 |  n2 |  n3 |  n4 |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.22.  server-role

   This option is used to convey the role of the failover endpoint in
   the sending server.

       Code          Len      Role
   +-----+-----+------+-----+-------+
   |  0  |  22 |   0  |  1  |   r1  |
   +-----+-----+------+-----+-------+

   A value of 0 indicates that the failover endpoint is a primary server
   and a value of 1 indicates that it is a secondary server.

6.2.23.  receive-timer

   The number of seconds within which the server must receive a packet
   from its partner, or it will assume that the partner is down or the
   communication path to the partner has failed.

        Code         Len         Receive Timer
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  23 |   0  |  4  | s1 |  s2 |  s3 |  s4 |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.24.  hash-bucket-assignment

   The set of hash values to which the receiving server MUST respond.
   See section 5.3 for more information on how this option is used.

   This option consists of a set of 32 bytes, in network byte order,
   where each bit corresponds to one of 256 possible hash bucket values.
   If a bit is set to 1, the recipient is required to service the
   requests whose client-identifier or htype concatenated with the
   chaddr (if no client-identifier exists) map into the corresponding
   hash bucket.

        Code         Len        Hash Buckets
   +-----+-----+------+-----+----+-----+-----+-----+
   |  0  |  24 |   0  |  32 | b1 |  b2 | ... | b32 |
   +-----+-----+------+-----+----+-----+-----+-----+

6.2.25.  message-digest

   The message digest for this message.

   This option consists of a variable number of bytes which contain the
   message digest of the message prior to the inclusion of this option.

   When this option appears in a message, it MUST appear as the last
   option in the message.

        Code         Len       Message Digest
   +-----+-----+------+-----+----+-----+-----
   |  0  |  25 |   0  |  n  | d1 |  d2 | ...
   +-----+-----+------+-----+----+-----+-----

6.2.26.  protocol-version

   The protocol version being used by the server. It is only sent in the
   CONNECT and CONNECTACK messages.

        Code         Len    Version
   +-----+-----+------+-----+----+
   |  0  |  26 |   0  |  1  | v1 |
   +-----+-----+------+-----+----+

6.2.27.  TLS-request

   This option contains information relating to TLS security
   negotiation.  It is sent in a CONNECT message.

   The first byte, req, is the TLS request from this server.  A value of
   0 indicates no TLS operation, a value of 1 indicates that TLS
   operation is desired, and a value of 2 indicates that TLS operation
   is required to establish communications with this server.

   The second byte, acc, is what this server will accept for TLS
   operation.  A value of 0 means that this server will not accept TLS
   connections.  A value of 1 means that this server will accept TLS
   connections.

   If req is not zero, then acc MUST be 1.

   This allows a server which is not configured for TLS support to
   inform its partner that it will accept a TLS connection although it
   does not desire one, for instance.
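
   A sender might enforce these constraints when building the option,
   as in the sketch below.  The option code 27 and the two value bytes
   are read from the option layout in this section; the function name
   is hypothetical.

```python
import struct

TLS_REQUEST_CODE = 27


def encode_tls_request(req, acc):
    """Encode the TLS-request option, checking the rules on req and acc.

    req: 0 = no TLS operation, 1 = TLS desired, 2 = TLS required.
    acc: 0 = will not accept TLS connections, 1 = will accept them.
    """
    if req not in (0, 1, 2):
        raise ValueError("req must be 0, 1 or 2")
    if acc not in (0, 1):
        raise ValueError("acc must be 0 or 1")
    if req != 0 and acc != 1:
        raise ValueError("if req is not zero, acc MUST be 1")
    # 2 byte code, 2 byte length (two value bytes), then req and acc.
    return struct.pack("!HHBB", TLS_REQUEST_CODE, 2, req, acc)
```

   The case described in the last paragraph above, a server that does
   not desire TLS but will accept it, is encode_tls_request(0, 1).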

        Code         Len  request accept
   +-----+-----+------+-----+----+----+
   |  0  |  27 |   0  |  2  | req| acc|
   +-----+-----+------+-----+----+----+

6.2.28.  TLS-reply

   This option contains information relating to TLS security
   negotiation.  It is sent in a CONNECTACK message.

   A value of 0 indicates no TLS operation, a value of 1 indicates
   that TLS operation is required.

        Code         Len     TLS
   +-----+-----+------+-----+----+
   |  0  |  28 |   0  |  1  | t1 |
   +-----+-----+------+-----+----+

6.3.  BNDUPD message format

   The binding update (BNDUPD) message is used to send the binding data-
   base changes to the partner server.

   The message type for the BNDUPD message is 3.

   The xid of the BNDUPD MUST be unique with respect to other failover
   messages transmitted from this failover endpoint.

   The following table summarizes the various options for the BNDUPD
   message.

                                        binding-status

   Option                        ACTIVE     EXPIRED    RELEASED   FREE
   ------                        ------     -------    --------   ----
   assigned-IP-address           MUST       MUST       MUST       MUST
   binding-status                MUST       MUST       MUST       MUST
   client-identifier             MAY        MAY        MAY        MAY
   client-hardware-address       MUST       MUST       MUST       MAY
   lease-expiration-time         MUST       MUST NOT   MUST NOT   MUST NOT
   potential-expiration-time     MUST       MUST NOT   MUST NOT   MUST NOT
   grace-expiration-time         MUST NOT   MUST NOT   MUST NOT   MUST NOT
   start-time-of-state           SHOULD     SHOULD     SHOULD     SHOULD
   client-last-trans.-time       SHOULD     SHOULD     SHOULD     MAY
   client-FQDN(1)                SHOULD     SHOULD     SHOULD     SHOULD
   all others                    MAY        MAY        MAY        MAY

                                        binding-status
                                                          BACKUP
                                EXPIRED-     RELEASED-    RESET
   Option                       GRACE        GRACE        ABANDONED
   ------                       ------       -----        ---------
   assigned-IP-address          MUST         MUST         MUST
   binding-status               MUST         MUST         MUST
   client-identifier            MAY          MAY          MAY(2)
   client-hardware-address      MAY          MAY          MAY(2)
   lease-expiration-time        MUST NOT     MUST NOT     MUST NOT
   potential-expiration-time    MUST NOT     MUST NOT     MUST NOT
   grace-expiration-time        MUST         MUST         MUST NOT
   start-time-of-state          SHOULD       SHOULD       SHOULD
   client-last-trans.-time      SHOULD       SHOULD       MAY
   client-FQDN(1)               SHOULD       SHOULD       SHOULD
   all others                   MAY          MAY          MAY

   (1) Only SHOULD appear if client supplies a host name and dynamic DNS
       is used.

   (2) MUST NOT if binding-status is ABANDONED.

                 Table 6.3-1: Options used in a BNDUPD message
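
   The MUST and MUST NOT entries of the BNDUPD option table above can be
   checked mechanically.  The following is a minimal sketch under stated
   assumptions: it covers only the rows that carry MUST or MUST NOT
   entries plus client-identifier, it treats SHOULD and MAY rows as
   optional, and the names requirement and check_bndupd are
   hypothetical.

```python
ALWAYS = ("assigned-IP-address", "binding-status")
ACTIVE_ONLY = ("lease-expiration-time", "potential-expiration-time")
GRACE_STATES = ("EXPIRED-GRACE", "RELEASED-GRACE")
HW_REQUIRED_STATES = ("ACTIVE", "EXPIRED", "RELEASED")

# Options examined by this sketch, in the order errors are reported.
CHECKED = ALWAYS + ACTIVE_ONLY + (
    "grace-expiration-time",
    "client-hardware-address",
    "client-identifier",
)


def requirement(option: str, status: str) -> str:
    """Return "MUST", "MUST NOT", or "MAY" for an option and binding-status."""
    if option in ALWAYS:
        return "MUST"
    if option in ACTIVE_ONLY:
        return "MUST" if status == "ACTIVE" else "MUST NOT"
    if option == "grace-expiration-time":
        return "MUST" if status in GRACE_STATES else "MUST NOT"
    if option == "client-hardware-address":
        if status in HW_REQUIRED_STATES:
            return "MUST"
        return "MUST NOT" if status == "ABANDONED" else "MAY"
    if option == "client-identifier":
        return "MUST NOT" if status == "ABANDONED" else "MAY"
    return "MAY"  # SHOULD rows and "all others" are not enforced here


def check_bndupd(status: str, present: set) -> list:
    """List the table violations for the set of option names present."""
    errors = []
    for option in CHECKED:
        level = requirement(option, status)
        if level == "MUST" and option not in present:
            errors.append("MUST appear: " + option)
        elif level == "MUST NOT" and option in present:
            errors.append("MUST NOT appear: " + option)
    return errors
```

   An empty list means that the option set satisfies the rows checked
   here; it does not validate option contents.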

6.4.  BNDACK message format

   A server sends a binding acknowledgement (BNDACK) message when it has
   successfully committed binding database changes received from a fail-
   over partner in a BNDUPD message to its own stable storage.

   The message type for the BNDACK message is 4.

   The xid in a BNDACK MUST be the same as the xid of the corresponding
   BNDUPD.

   The following table summarizes the options for the BNDACK message.

                                        binding-status

   Option                        ACTIVE     EXPIRED    RELEASED   FREE
   ------                        ------     -------    --------   ----
   assigned-IP-address           MUST       MUST       MUST       MUST
   binding-status                MUST       MUST       MUST       MUST
   client-identifier             MAY        MAY        MAY        MAY
   client-hardware-address       MUST       MUST       MUST       MAY
   reject-reason                 MAY        MAY        MAY        MAY
   message                       MAY        MAY        MAY        MAY
   lease-expiration-time         MUST       MUST NOT   MUST NOT   MUST NOT
   potential-expiration-time     MUST       MUST NOT   MUST NOT   MUST NOT
   grace-expiration-time         MUST NOT   MUST NOT   MUST NOT   MUST NOT
   start-time-of-state           SHOULD     SHOULD     SHOULD     SHOULD
   client-last-trans.-time       SHOULD     SHOULD     SHOULD     MAY
   client-FQDN(1)                SHOULD     SHOULD     SHOULD     SHOULD
   all others                    MAY        MAY        MAY        MAY

                                        binding-status
                                                          BACKUP
                                EXPIRED-     RELEASED-    RESET
   Option                       GRACE        GRACE        ABANDONED
   ------                       ------       -----        ---------
   assigned-IP-address          MUST         MUST         MUST
   binding-status               MUST         MUST         MUST
   client-identifier            MAY          MAY          MAY
   client-hardware-address      MAY          MAY          MAY(2)
   reject-reason                MAY          MAY          MAY
   message                      MAY          MAY          MAY
   lease-expiration-time        MUST NOT     MUST NOT     MUST NOT
   potential-expiration-time    MUST NOT     MUST NOT     MUST NOT
   grace-expiration-time        MUST         MUST         MUST NOT
   start-time-of-state          SHOULD       SHOULD       SHOULD
   client-last-trans.-time      SHOULD       SHOULD       MAY
   client-FQDN(1)               SHOULD       SHOULD       SHOULD
   all others                   MAY          MAY          MAY

   (1) Only SHOULD appear if client supplies a host name and dynamic DNS
       is used.

   (2) MUST NOT if binding-status is ABANDONED.

                  Table 6.4-1: Options used in a BNDACK message

6.5.  Bulking for BNDUPD and BNDACK messages

   DISCUSSION:

      Bulking is planned for this protocol, but it hasn't been specified
      in this revision of the draft.  Once the draft settles down, we
      will specify the bulking approach in detail.

6.6.  UPDREQ message format

   The update request (UPDREQ) message is used by one server to request
   that its partner send it all binding database information that it has
   not already seen.

   The message type for the UPDREQ message is 9.

   The xid in a UPDREQ message MUST be unique among messages transmitted
   from this failover endpoint during the life of this connection.

   There are no options that MUST appear in an UPDREQ message.  Any
   option MAY appear.

6.7.  UPDREQALL message format

   The update request all (UPDREQALL) message is used by one server to
   request that all binding database information be sent in order to
   recover from a total loss of its lease state database by the request-
   ing server.

   The message type for the UPDREQALL message is 7.

   The xid in a UPDREQALL message MUST be unique among messages
   transmitted from this failover endpoint during the life of this con-
   nection.

   There are no options that MUST appear in an UPDREQALL message.  Any
   option MAY appear.

6.8.  UPDDONE message format

   The update done (UPDDONE) message is used by the responding server to
   indicate that all requested updates have been sent by the responding
   server as BNDUPD messages and acked by the requesting server using
   BNDACK messages.  While a BNDACK message MUST have been received for
   each IP address that was sent in a BNDUPD message, the BNDACK message
   could have contained a reject-reason in order to NAK that specific
   update.

   Thus, this message confirms that the requesting server has received
   and responded to a BNDUPD message for all of the requested updates,
   but it does not require the requesting server to accept all of the
   offered updates.

   The message type for the UPDDONE message is 8.

   The xid in an UPDDONE message MUST be identical to the xid in the
   UPDREQ or UPDREQALL message that initiated the update process.

   There are no options that MUST appear in an UPDDONE message.  Any
   option MAY appear.

6.9.  POOLREQ message format

   The pool request (POOLREQ) is used by the secondary server to request
   an allocation of IP addresses from the primary server.

   The message type for the POOLREQ message is 1.

   The xid in a POOLREQ message MUST be unique among messages transmit-
   ted from this failover endpoint during the life of this connection.

   There are no options that MUST appear in a POOLREQ message.  Any
   option MAY appear.

6.10.  POOLRESP message format

   The pool response (POOLRESP) is used by the primary server to inform
   the secondary server how many IP addresses it was allocated as the
   result of a pool request.

   The message type for the POOLRESP message is 2.

   The xid in the POOLRESP message MUST be identical to the xid in the
   POOLREQ message for which this POOLRESP is a response.

   The following table shows the options that MUST appear in a POOLRESP
   message:

           Option
           ------
           addresses-transferred       MUST

                          Table 6.10-1: Options used in a POOLRESP message

6.11.  CONNECT message format

   The connect (CONNECT) message is used by either server to establish a
   high level connection with the other server, and to transmit several
   important configuration data items between the servers.

   The message type for the CONNECT message is 5.

   The xid in a CONNECT message MUST be unique among messages transmit-
   ted from this failover endpoint during the life of this connection.

   The CONNECT message MUST be the first message sent down a newly esta-
   blished connection.

   The following table summarizes the options that are associated with
   the CONNECT message:

                                      role

   Option                      primary       secondary
   ------                      ------        ---------
   sending-server-IP-address   MUST          MUST
   server-role                 MUST          MUST
   max-unacked-bndupd          MUST          MUST
   receive-timer               MUST          MUST
   current-time                MUST          MUST
   vendor-class-identifier     MUST          MUST
   protocol-version            MUST          MUST
   TLS-request                 MUST(1)       MUST(1)
   MCLT                        MUST          MUST NOT
   hash-bucket-assignment      MUST          MUST NOT
   all others                  MAY           MAY

   (1) If the CONNECT message is being sent on a TLS secured connection,
       then there MUST NOT be a TLS-request option.

                  Table 6.11-1: Options used in a CONNECT message
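
   The role and TLS dependencies in the CONNECT table above can be
   expressed as two small functions.  This is a sketch only; the
   function names are hypothetical, and the boolean tls_secured stands
   for "this CONNECT is being sent on a TLS secured connection".

```python
COMMON = [
    "sending-server-IP-address",
    "server-role",
    "max-unacked-bndupd",
    "receive-timer",
    "current-time",
    "vendor-class-identifier",
    "protocol-version",
]
PRIMARY_ONLY = ["MCLT", "hash-bucket-assignment"]


def _check_role(role: str) -> None:
    if role not in ("primary", "secondary"):
        raise ValueError("role must be 'primary' or 'secondary'")


def required_connect_options(role: str, tls_secured: bool) -> list:
    """Options that MUST appear in a CONNECT sent by a server in this role."""
    _check_role(role)
    required = list(COMMON)
    if not tls_secured:
        required.append("TLS-request")
    if role == "primary":
        required.extend(PRIMARY_ONLY)
    return required


def forbidden_connect_options(role: str, tls_secured: bool) -> list:
    """Options that MUST NOT appear in a CONNECT sent by this role."""
    _check_role(role)
    forbidden = []
    if tls_secured:
        forbidden.append("TLS-request")
    if role == "secondary":
        forbidden.extend(PRIMARY_ONLY)
    return forbidden
```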

6.12.  CONNECTACK message format

   The connect response (CONNECTACK) message is used by a server to
   respond to the receipt of a CONNECT message.

   The message type for the CONNECTACK message is 6.

   The xid in the CONNECTACK message MUST be identical to the xid in the
   CONNECT message for which this CONNECTACK is a response.

   The following table summarizes the options associated with the CON-
   NECTACK message:

   Option
   ------
   sending-server-IP-address   MUST
   server-role                 MUST
   max-unacked-bndupd          MUST
   receive-timer               MUST
   current-time                MUST
   vendor-class-identifier     MUST
   protocol-version            MUST
   TLS-reply                   MUST(1)
   reject-reason               MAY(2)
   message                     MAY

   (1) If the CONNECTACK is being sent over an already TLS secured
       connection, then the TLS-reply option MUST NOT appear.

   (2) Indicates a rejection of the CONNECT message.

                  Table 6.12-1: Options used in a CONNECTACK message

6.13.  STATE message format

   The state (STATE) message is used by either server to communicate the
   current state of the failover endpoint with the other server.  It
   MUST be sent immediately after a connection is established with
   another server, and it MUST be sent whenever the server's state
   changes.

   The message type for the STATE message is 10.

   The xid in a STATE message MUST be unique among messages transmitted
   from this failover endpoint during the life of this connection.

   The following table shows the options that MUST appear in a STATE
   message:

           Option
           ------
           sending-state               MUST
           server-flags                MUST
           start-time-of-state         MUST

                          Table 6.13-1: Options used in a STATE message

6.14.  CONTACT message format

   The contact (CONTACT) message is used by either server to verify that
   the connection is operational to the other server.

   The message type for the CONTACT message is 11.

   The xid in a CONTACT message MUST be unique among messages transmit-
   ted from this failover endpoint during the life of this connection.

   The following table shows the options that MUST appear in a CONTACT
   message:

           Option
           ------
           current-time                MUST

                          Table 6.14-1: Options used in a CONTACT message
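
   The xid rules in the message formats above follow one pattern:
   messages that start an exchange carry a fresh xid, and BNDACK,
   POOLRESP, CONNECTACK, and UPDDONE carry the xid of the message they
   answer.  The following sketch shows one way an endpoint might track
   this.  The class XidTracker and its methods are hypothetical, and a
   simple counter is only one possible source of unique values.

```python
import itertools

# Response type -> request types whose xid it is required to repeat.
RESPONSE_FOR = {
    "BNDACK": {"BNDUPD"},
    "POOLRESP": {"POOLREQ"},
    "CONNECTACK": {"CONNECT"},
    "UPDDONE": {"UPDREQ", "UPDREQALL"},
}


class XidTracker:
    """Allocate xids for one connection and match responses to requests."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._pending = {}  # xid -> type of the outstanding request

    def new_xid(self, msg_type: str) -> int:
        """Return an xid not yet used on this connection and remember it."""
        xid = next(self._counter)
        self._pending[xid] = msg_type
        return xid

    def match(self, response_type: str, xid: int):
        """Return the request type this response answers, or None."""
        expected = RESPONSE_FOR.get(response_type, set())
        request_type = self._pending.get(xid)
        if request_type not in expected:
            return None
        del self._pending[xid]
        return request_type
```

   A response whose xid is unknown, or whose type does not fit the
   outstanding request, yields None and leaves the pending table
   unchanged.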

7.  Protocol Messages

   This section contains the detailed definition of the protocol mes-
   sages, including the information to include when sending the message,
   as well as the actions to take upon receiving the message.

7.1.  BNDUPD message

   The binding update (BNDUPD) message is used to send the binding data-
   base changes to the partner server, and the partner server responds
   with a binding acknowledgement (BNDACK) message when it has
   successfully committed those changes to its own stable storage.

   The rest of the failover protocol exists to determine whether the
   partner server is able to communicate or not, and to enable the
   partners to exchange BNDUPD/BNDACK messages in order to keep their
   binding databases in stable storage synchronized.

7.1.1.  Sending the BNDUPD message

   A BNDUPD message SHOULD be generated whenever any binding changes.  A
   change might be in the binding-status, the lease-expiration-time, or
   even just the last-transaction-time.  In general, any time a DHCP
   client sends in a packet that results in a DHCP server writing to its
   stable storage, a BNDUPD message SHOULD be generated.

   The BNDUPD (and BNDACK) messages refer to the binding-status of the
   IP address, and this protocol defines a series of binding-statuses,
   discussed in more detail below.  Some servers may not support all of
   these binding-statuses, and so in those cases they will not be sent,
   and upon receipt a reasonable interpretation should be made.

   All BNDUPD messages MUST contain the IP address in the assigned-IP-
   address option, and it contains the IP address about which the BNDUPD
   message is being sent.

   All BNDUPD messages MUST contain the binding-status option, and it
   will have one of the values in the following list.  This list
   discusses the meanings of the various binding-statuses and the infor-
   mation that should go into the BNDUPD message because of them.

      o ACTIVE

        Indicates that the IP address is currently leased to a DHCP
        client.

        client-hardware-address

        The client-hardware-address option MUST appear, and be set from
        the MAC address of the DHCP client to which this IP address is
        leased.

        client-identifier

        If the DHCP client to which this IP address is leased used a
        client-identifier option to identify itself, then the client-
        identifier MUST appear in the BNDUPD message, else it MUST NOT
        appear.

        lease-expiration-time

        The lease-expiration-time option MUST appear, and be set to the
        expiration time most recently ACKed to the DHCP client.  Note
        that the time ACKed to a DHCP client is a lease duration in
        seconds, while the lease-expiration-time option in a BNDUPD mes-
        sage is an absolute time value.

        potential-expiration-time

        The potential-expiration-time option MUST appear, and be set to
        a value beyond that of the lease-expiration time.  This is the
        value that is ACKed by the BNDACK message.  A server sending a
        BNDUPD message MUST be able to recover the potential-
        expiration-time sent in every BNDUPD, not just those that
        receive a corresponding BNDACK, in order to be able to protect
        against possible duplicate allocation of IP addresses after
        transitioning to PARTNER-DOWN state. See section 5.2.1 for
        details as to why the potential-expiration-time exists and
        guidelines for how to decide the value.

      o EXPIRED

        A binding-status of EXPIRED is used when a client's binding on
        an IP address has expired and the server does not wish to imple-
        ment an expired-grace period.  When the partner server ACK's the
        BNDUPD of an EXPIRED IP address, the server sets its internal
        state to FREE.  It is then available for allocation to any
        client of the primary server.

        client-hardware-address

        There SHOULD be a DHCP client associated with the IP address
        whose binding has expired.  If there is, then the client-
        hardware-address option MUST appear, and be set from the MAC
        address of the DHCP client to which this IP address was leased.

        client-identifier

        There SHOULD be a DHCP client associated with the IP address
        whose binding has expired.  If there is, then if the DHCP client
        to which this IP address was leased used a client-identifier
        option to identify itself, then the client-identifier MUST
        appear in the BNDUPD message, else it MUST NOT appear.

      o RELEASED

        A binding-status of RELEASED is used when a DHCP client sends in
        a DHCPRELEASE message and the server does not wish to implement
        a released-grace period.  When the partner server ACK's the
        BNDUPD of a RELEASED IP address, the server sets its internal
        state to FREE, and it is available for allocation by the primary
        server to any DHCP client.

        client-hardware-address

        There SHOULD be a DHCP client associated with the IP address
        whose binding has been released.  If there is, then the client-
        hardware-address option MUST appear, and be set from the MAC
        address of the DHCP client which released this IP address.

        client-identifier

        There SHOULD be a DHCP client associated with the IP address
        whose binding has been released.  If there is, then if the DHCP
        client which released this IP address used a client-identifier
        option to identify itself, then the client-identifier MUST
        appear in the BNDUPD message, else it MUST NOT appear.

      o FREE

        A binding-status of FREE is used when a DHCP server needs to
        communicate that an IP address is available for allocation to
        another server, but it was not just released, expired, or reset
        by a network administrator.  When the partner server ACK's the
        BNDUPD of a FREE IP address, the server sets its internal state
        such that it is available for allocation to any DHCP client.

        client-hardware-address

        There MAY be a DHCP client associated with the IP address whose
        binding is now desired to be FREE.  If there is, then the
        client-hardware-address option MUST appear, and be set from the
        MAC address of the DHCP client which released this IP address.

        client-identifier

        There MAY be a DHCP client associated with the IP address whose
        binding is now desired to be FREE.  If there is, then if the
        DHCP client which released this IP address used a client-
        identifier option to identify itself, then the client-identifier
        MUST appear in the BNDUPD message, else it MUST NOT appear.

      o EXPIRED-GRACE

        Some servers support a grace period after lease expiration, to
        handle clock speed differences between clients and servers as
        well as to limit the number of times names are removed and
        subsequently added to dynamic DNS.

        client-hardware-address

        There MAY be a DHCP client associated with the IP address whose
        binding has now expired.  If there is, then the client-
        hardware-address option MUST appear, and be set from the MAC
        address of the DHCP client which released this IP address.

        client-identifier

        There MAY be a DHCP client associated with the IP address whose
        binding has now expired.  If there is, then if the DHCP client
        which most recently leased this IP address used a client-
        identifier option to identify itself, then the client-identifier
        MUST appear in the BNDUPD message, else it MUST NOT appear.

        grace-expiration-time

        The grace-expiration-time option MUST appear, and is the length
        of time that this server will wait before trying to make the IP
        address available after the lease has expired for this IP
        address.

      o RELEASED-GRACE

        Some servers support a grace period after lease release by a
        DHCP client, to handle clock speed differences between clients
        and servers as well as to limit the number of times names are
        removed and subsequently added to dynamic DNS.

        client-hardware-address

        There MAY be a DHCP client associated with the IP address whose
        binding has now been released by sending a DHCPRELEASE.  If
        there is, then the client-hardware-address option MUST appear,
        and be set from the MAC address of the DHCP client which
        released this IP address.

        client-identifier

        There MAY be a DHCP client associated with the IP address whose
        binding has been released.  If there is, then if the DHCP client
        which most recently leased this IP address used a client-
        identifier option to identify itself, then the client-identifier
        MUST appear in the BNDUPD message, else it MUST NOT appear.

        grace-expiration-time

        The grace-expiration-time MUST appear, and is the length of time
        that this server will wait before trying to make the IP address
        available after the lease was released for this IP address.

      o ABANDONED

        An ABANDONED IP address is one that has been considered unusable
        by the DHCP subsystem.  An IP address for which a valid PING
        response was received SHOULD be set to ABANDONED.

        client-hardware-address

        There SHOULD NOT be a DHCP client associated with an ABANDONED
        IP address.  The client-hardware-address option MUST NOT appear
        in the BNDUPD message.

        client-identifier

        There SHOULD NOT be a DHCP client associated with the IP address
        whose binding has now been ABANDONED.  The client-identifier
        option MUST NOT appear in the BNDUPD message.

      o RESET

        The RESET value of binding-status is used to indicate that
        this IP address was made available by operator command.

      o BACKUP

        The BACKUP value of binding-status indicates that this IP
        address belongs to the secondary server, and can be allocated by
        that server to a DHCP client at any time.

        client-hardware-address

        There MAY be a DHCP client associated with a BACKUP IP address.
        If there is, the client-hardware-address option MUST appear, and
        be set from the MAC address of the DHCP client to which this IP
        address was most recently associated.

        client-identifier

        There MAY be a DHCP client associated with this IP address.  If
        the DHCP client to which this IP address is leased used a
        client-identifier option to identify itself, then the client-
        identifier MUST appear in the BNDUPD message, else it MUST NOT
        appear.

   The following option information is generic to all BNDUPD messages,
   regardless of the value of the binding-status.

   o start-time-of-state

     The start-time-of-state SHOULD appear.  It is set to the time at
     which this IP address first took on the state that corresponds to
     the current value of binding-status.

   o last-transaction-time

     The last-transaction-time value SHOULD appear.  This is the time at
     which this DHCP server last received a packet from the DHCP client
     referenced by the client-identifier or client-hardware-address that
     was associated with the IP address referenced by the assigned-IP-
     address.

   o client-FQDN

     If the DHCP server is performing dynamic DNS operations on behalf
     of the DHCP client represented by the client-identifier or client-
     hardware-address, then it should include a client-FQDN option
     containing the host name, domain name, and status of any dynamic
     DNS operations enabled.

   The BNDUPD message SHOULD be sent as soon as possible from the time
   that the DHCP client received a response and the lease bindings
   database is written on stable storage.
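
   As a worked illustration of the two time options sent for an ACTIVE
   binding: the client is ACKed a lease duration in seconds, while the
   BNDUPD carries absolute times, and the potential-expiration-time
   lies beyond the lease-expiration-time.  In the sketch below, the
   function name and the extension parameter are hypothetical; how far
   beyond the lease expiration the potential expiration is placed is a
   policy decision that this sketch does not define.

```python
def active_binding_times(ack_time: int, lease_duration: int, extension: int) -> dict:
    """Compute absolute times for an ACTIVE BNDUPD.

    ack_time       -- absolute time, in seconds, at which the client was ACKed
    lease_duration -- lease duration in seconds as ACKed to the client
    extension      -- hypothetical number of seconds by which the potential
                      expiration exceeds the lease expiration (must be > 0)
    """
    if lease_duration < 0:
        raise ValueError("lease_duration must not be negative")
    if extension <= 0:
        raise ValueError("potential expiration must lie beyond lease expiration")
    lease_expiration = ack_time + lease_duration
    return {
        "lease-expiration-time": lease_expiration,
        "potential-expiration-time": lease_expiration + extension,
    }
```

   For example, a client ACKed at time 1000 with a 3600 second lease
   and an extension of 1800 seconds yields a lease-expiration-time of
   4600 and a potential-expiration-time of 6400.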

7.1.2.  Receiving the BNDUPD message

   When a server receives a BNDUPD message, it needs to decide how to
   process the message and whether the message represents a conflict
   of any sort.  The conflict resolution process is used on the receipt
   of every BNDUPD message, not just those that are received while in
   POTENTIAL-CONFLICT state, in order to increase the robustness of the
   protocol.

   There are two sorts of conflict.  The first, more major conflict, is
   when a server receives a BNDUPD message from its partner for an
   ACTIVE IP address and finds that the client specified in the BNDUPD
   message is different from the client associated with this ACTIVE IP
   address in this server's bindings database.

   The second sort of conflict is where the receiving server has in its
   bindings database the client specified in the BNDUPD message
   associated with a different IP address.

   These two conflict cases can both occur together with the same BNDUPD
   message.

   When receiving a BNDUPD message, the server first determines the IP
   address from the assigned-IP-address option, and then determines if
   there was any client associated with this IP address by looking for
   the client-identifier option.  If there is no client-identifier
   option, then the server looks for a client-hardware-address option,
   and ultimately determines the client's identity specified in the
   BNDUPD.

   The client specified in the BNDUPD message is compared to the client
   currently associated with the IP address in this server's bindings
   database.  If they are the same, continue.  If there is no client in
   this server's bindings database, continue.  If there is a client in
   this server's bindings database, and it is different from that
   specified in the BNDUPD message, a 'client conflict' exists.  See the
   section below on conflict resolution.  If the client specified in the
   BNDUPD message is associated with a different IP address in this
   server's bindings database in the same subnet, then an 'IP address
   conflict' exists.  This does not refer to the case where a single
   client has addresses in multiple different subnets or administrative
   domains, but rather the case where in the same subnet the client has
   a lease on one IP address in one server and on a different IP
   address on the other server.  See the section below on conflict
   resolution.

   If none of the conflicts mentioned above exist, then develop a time
   for both the BNDUPD message and the server's information.

   The times for the BNDUPD and for the server's information are
   developed independently in the following way:  If there is a
   client-last-transaction time, use that.  If there isn't, but there is
   a start-time-of-state, use that.  If there isn't, but there is a
   client-expiration-time, use that.  If there isn't, then use the time
   the BNDUPD message was received for a BNDUPD message, and the current
   time for the server's information.

   Then the server determines the binding-status in the BNDUPD, and
   takes the following actions based on binding-status:

   (In the following list, to "accept" a BNDUPD means to update the
   server's bindings database with the information contained in the
   BNDUPD and once that update is complete, send a BNDACK message
   corresponding to the BNDUPD message).

      o ACTIVE in BNDUPD

        If the BNDUPD is LATER than the server's information, accept it,
        else reject it.

      o EXPIRED or EXPIRED-GRACE in BNDUPD

        If the binding-status in the receiving server's bindings
        database is ACTIVE, then reject the BNDUPD.  Otherwise, accept
        the BNDUPD.

        If the binding-status in the BNDUPD is EXPIRED-GRACE and the
        server receiving the BNDUPD does not implement a grace period
        for expired leases, then the server MUST set its lease
        expiration to the value held in the grace-expiration in the
        BNDUPD.

      o RELEASED or RELEASED-GRACE in BNDUPD

        If the BNDUPD is LATER than the server's information, accept it,
        else reject it.

        If the binding-status in the BNDUPD is RELEASED-GRACE and the
        server receiving the BNDUPD does not implement a grace period
        for released leases, then the server MUST set its lease
        expiration to the value held in the grace-expiration in the
        BNDUPD.

      o FREE or BACKUP in BNDUPD

        If the binding-status in the receiving server's database is
        ACTIVE and the lease-expiration-time has not yet been reached,
        reject it, else accept it.

      o RESET or ABANDONED in BNDUPD

        Accept it under all circumstances.
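   The per-status rules in the list above can be summarized as a single
   decision function.  This is a sketch under stated assumptions:
   binding-status values are modeled as strings, times as integer
   seconds, "LATER" is taken to mean strictly greater, and the function
   and parameter names are hypothetical.  The grace-expiration handling
   for the -GRACE statuses is not modeled here.

```python
from typing import Optional


def should_accept(update_status: str, update_time: int,
                  local_status: Optional[str], local_time: int,
                  local_expiration: int, now: int) -> bool:
    """Return True if the BNDUPD should be accepted, False if rejected.

    `local_status` is None when this server holds no binding for the
    address.  Treating equal times as "not later" is an assumption.
    """
    if update_status in ("ACTIVE", "RELEASED", "RELEASED-GRACE"):
        # Accept only when the update is later than local information.
        return update_time > local_time
    if update_status in ("EXPIRED", "EXPIRED-GRACE"):
        # An ACTIVE local binding overrides a partner's expiration.
        return local_status != "ACTIVE"
    if update_status in ("FREE", "BACKUP"):
        # Reject only while the local lease is ACTIVE and unexpired.
        return not (local_status == "ACTIVE" and now < local_expiration)
    if update_status in ("RESET", "ABANDONED"):
        return True
    raise ValueError(f"unknown binding-status: {update_status!r}")
```

   The two time arguments would come from the time-development step
   described earlier in this section.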

7.1.3.  Conflict resolution when receiving the BNDUPD message

   When either of the following conflicts exists between the
   information in a BNDUPD message and the information held in the
   receiving server's bindings database, it should be resolved in the
   following manner:

      o client conflict

        This is the duplicate IP address allocation conflict. There are
        two different clients each allocated the same address.

        If times for both exist, use the LATER update, else use the
        information from the primary server.

      o IP address conflict

        An IP address conflict exists when a client on one server is
        associated with one IP address, and on the other server with a
        different IP address in the same or a related subnet.  If one
        binding-status is ACTIVE and the other is anything but ACTIVE,
        then the information in the ACTIVE binding SHOULD be used.
        Otherwise, if times exist, then the LATER SHOULD be used.
        Otherwise, if times do not exist, then the information from the
        primary server should be used.
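   The two resolution rules can be sketched as follows.  The Binding
   record, its field names, and the tie-break for equal times (fall back
   to the primary's information) are assumptions made for illustration
   only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Binding:
    """Hypothetical view of one side's information about a lease."""
    status: str              # binding-status, e.g. "ACTIVE"
    time: Optional[int]      # developed time, or None if unavailable
    from_primary: bool       # True if this is the primary's information


def _later_or_primary(a: Binding, b: Binding) -> Binding:
    # Use the LATER information when both times exist and differ;
    # otherwise fall back to the primary server's information.
    if a.time is not None and b.time is not None and a.time != b.time:
        return a if a.time > b.time else b
    return a if a.from_primary else b


def resolve_client_conflict(a: Binding, b: Binding) -> Binding:
    """Two different clients were allocated the same IP address."""
    return _later_or_primary(a, b)


def resolve_ip_address_conflict(a: Binding, b: Binding) -> Binding:
    """One client holds different addresses on the two servers."""
    if (a.status == "ACTIVE") != (b.status == "ACTIVE"):
        return a if a.status == "ACTIVE" else b
    return _later_or_primary(a, b)
```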

7.2.  BNDACK message

   Every BNDUPD message that is received by a server MUST be responded
   to with a corresponding BNDACK message.  The receiving server SHOULD
   respond quickly to every BNDUPD message but it MAY choose to respond
   preferentially to DHCP client requests instead of BNDUPD messages,
   since there is no absolute time period within which a BNDACK must be
   sent in response to a BNDUPD message, and DHCP clients frequently do
   have time constraints that must be met.

7.2.1.  Sending the BNDACK message

   The BNDACK message MUST contain the same xid as the corresponding
   BNDUPD message.

   All of the options which appear in the BNDUPD message MUST be
   included in the BNDACK message.  The values in the options MAY be
   updated to reflect current information on the server sending the
   BNDACK.  Note that update of this information may be used for
   informational purposes, but MUST NOT be assumed to necessarily be
   recorded in the stable storage of the server that sent the BNDUPD
   message because there is no corresponding ACK of the BNDACK message.
   Any information that SHOULD be recorded in the partner server's
   stable storage MUST be transmitted in a subsequent BNDUPD.

   If the server is accepting the BNDUPD, the BNDACK message includes
   only those options that appear in the BNDUPD message.  If the server
   is rejecting the BNDUPD, the additional option reject-reason MUST
   appear in the BNDACK message, and the message option SHOULD appear in
   this case containing a human-readable error message describing in
   some detail the reason for the rejection of the BNDUPD message.
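   A minimal sketch of building a BNDACK from a BNDUPD under the rules
   above.  Messages are modeled here as plain dictionaries, which is an
   assumption for illustration; the wire encoding of options and the
   numeric reject-reason codes are defined elsewhere and are not
   reproduced.

```python
from typing import Any, Dict, Optional


def build_bndack(bndupd: Dict[str, Any],
                 reject_reason: Optional[int] = None,
                 message: Optional[str] = None) -> Dict[str, Any]:
    """Build a BNDACK for `bndupd` (hypothetical dict representation).

    The xid and every option of the BNDUPD are echoed.  On rejection the
    reject-reason option is added, plus the human-readable message
    option when one is supplied.
    """
    ack = {
        "type": "BNDACK",
        "xid": bndupd["xid"],                # same xid as the BNDUPD
        "options": dict(bndupd["options"]),  # copy, do not alias
    }
    if reject_reason is not None:
        ack["options"]["reject-reason"] = reject_reason
        if message:
            ack["options"]["message"] = message
    return ack
```

   The sender of the BNDUPD can then treat the absence of reject-reason
   in the returned options as acceptance.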

7.2.2.  Receiving the BNDACK message

   When a server receives a BNDACK message, if it doesn't contain a
   reject-reason option that means that the BNDUPD message was accepted,
   and the server which sent the BNDUPD MUST update its stable storage
   with the potential-expiration-time value sent in the BNDUPD message
   and returned in the BNDACK message.  Other values sent in the BNDUPD
   message MAY be used as desired.

7.3.  UPDREQ message

   The update request (UPDREQ) message is used by one server to request
   that its partner send it all of the binding database information that
   it has not already seen.  Since each server is required to keep
   track at all times of the binding information the other server has
   received and ACKed, one server can request transmission of all
   un-ACKed binding database information held by the other server by
   using the UPDREQ message.

   The UPDREQ message is used whenever the sending server cannot proceed
   before it has processed all previously un-ACKed binding update
   information, since the UPDREQ message should yield a corresponding
   UPDDONE message.  The UPDDONE message is not sent until the server
   that sent the UPDREQ message has responded to all of the BNDUPD
   messages generated by the UPDREQ message with BNDACK messages.  Thus,
   the sender of the UPDREQ message can be sure upon receipt of an
   UPDDONE message that it has received and committed to stable storage
   all outstanding binding database updates.

   See section 9, Protocol state transitions, for the details of when
   the UPDREQ message is sent.

7.3.1.  Sending the UPDREQ message

   There are no options for the UPDREQ message.

   The UPDREQ message is sent with a unique xid.

7.3.2.  Receiving the UPDREQ message

   A server receiving an UPDREQ message MUST send all of the binding
   database changes that have not yet been ACKed by the sending server.
   These changes are sent as undistinguished BNDUPD messages.

   However, the server which received and is processing the UPDREQ
   message MUST track the BNDACK messages that correspond to the BNDUPD
   messages triggered by the UPDREQ message and, when they are all
   received, the server MUST send an UPDDONE message.

   When queuing up the BNDUPD messages for transmission to the sender of
   the UPDREQ message, the receiving server MUST honor the value
   returned in the max-unacked-bndupd option in the CONNECT or
   CONNECTACK message that set up the connection with the sending
   server.  It MUST NOT send more BNDUPD messages without receiving
   corresponding BNDACKs than the value returned in max-unacked-bndupd.

7.4.  UPDREQALL message

   The update request all (UPDREQALL) message is used by one server to
   request that its partner send it all of the binding database
   information.  This message is used to allow one server to recover
   from a failure of stable storage and to restore its binding database
   in its entirety from the other server.

   A server which sends an UPDREQALL message cannot proceed until all of
   its binding update information is restored, and it knows that all of
   that information is restored when an UPDDONE message is received.

   See section 9, Protocol state transitions, for the details of when
   the UPDREQALL message is sent.

7.4.1.  Sending the UPDREQALL message

   There are no options for the UPDREQALL message.

   The UPDREQALL message is sent with a unique xid.

7.4.2.  Receiving the UPDREQALL message

   A server receiving an UPDREQALL message MUST send all binding
   database information to the sending server.  These changes are sent
   as undistinguished BNDUPD messages.

   However, the server receiving the UPDREQALL message MUST track the
   BNDACK messages that correspond to the BNDUPD messages triggered by
   the UPDREQALL message and, when they are all received, the server
   MUST send an UPDDONE message.

   When queuing up the BNDUPD messages for transmission to the sender of
   the UPDREQALL message, the receiving server MUST honor the value
   returned in the max-unacked-bndupd option in the CONNECT or
   CONNECTACK message that set up the connection with the sending
   server.  It MUST NOT send more BNDUPD messages without receiving
   corresponding BNDACKs than the value returned in max-unacked-bndupd.

7.5.  UPDDONE message

   The update done (UPDDONE) message is used by a server receiving an
   UPDREQ or UPDREQALL message to signify that it has sent all of the
   BNDUPD messages requested by the UPDREQ or UPDREQALL request and that
   it has received a BNDACK for each of those messages.

7.5.1.  Sending the UPDDONE message

   The UPDDONE message SHOULD be sent as soon as the last BNDACK message
   corresponding to a BNDUPD message requested by the UPDREQ or
   UPDREQALL is received from the server which sent the UPDREQ or
   UPDREQALL.

7.5.2.  Receiving the UPDDONE message

   A server receiving the UPDDONE message knows that all of the
   information that it requested by sending an UPDREQ or UPDREQALL
   message has now been sent and that it has recorded this information
   in its stable storage.  It typically uses the receipt of an UPDDONE
   message to move to a different failover state.  See sections 9.5.2
   and 9.8.3 for details.

7.6.  POOLREQ message

   The pool request (POOLREQ) message is used by the secondary server to
   request an allocation of IP addresses from the primary server.  It
   MUST be sent by a secondary server to a primary server to request IP
   address allocation by the primary.  The IP addresses allocated are
   transmitted using normal BNDUPD messages from the primary to the
   secondary.

   The POOLREQ message SHOULD be sent from the secondary to the primary
   whenever the secondary transitions into NORMAL state.  It SHOULD
   periodically be resent in order that any change in the number of
   available IP addresses on the primary be reflected in the pool on the
   secondary.

7.6.1.  Sending the POOLREQ message

   The POOLREQ message has no options.  It must be sent with a unique
   xid.

7.6.2.  Receiving the POOLREQ message

   When a primary server receives a POOLREQ message it SHOULD examine
   the binding database and determine how many IP addresses the
   secondary server should have, and set these IP addresses to BACKUP
   state.  It SHOULD then send BNDUPD messages concerning all of these
   IP addresses to the secondary server.

   Servers frequently have several kinds of IP addresses available on a
   particular network segment.  The failover protocol assumes that both
   primary and secondary servers are configured in such a way that each
   knows the type and number of IP addresses on every network segment
   participating in the failover protocol.  The primary server is
   responsible for allocating the secondary server the correct
   proportion of available IP addresses of each kind, and the secondary
   server is responsible for being configured in such a way that it can
   tell the kind of every IP address based solely on the IP address
   itself.

   A primary server MUST keep track of how many IP addresses were
   allocated as a result of processing the POOLREQ message, and send
   that number in the POOLRESP message.

   A primary server MAY choose to defer processing a POOLREQ message
   until a more convenient time to process it, but it should not depend
   on the secondary server to retransmit the POOLREQ message in that
   case.

   If a secondary server receives a POOLREQ message it SHOULD report an
   error.

7.7.  POOLRESP message

   A primary server sends a POOLRESP message to a secondary server after
   the allocation process for available addresses to the secondary
   server is complete.  Typically this message will precede some of the
   BNDUPD messages that the primary uses to send the actual allocated IP
   addresses to the secondary.

7.7.1.  Sending the POOLRESP message

   The POOLRESP message MUST contain the same xid as the corresponding
   POOLREQ message.

   The only option which MUST appear in a POOLRESP message is:

      o addresses-transferred

        The number of addresses allocated to the secondary server by the
        primary server as a result of a POOLREQ is contained in the
        addresses-transferred option in a POOLRESP message.  Note this
        is the number of addresses that are transferred to the secondary
        in the primary's binding database as a result of the
        corresponding POOLREQ message, and that it may be some time
        before they can all be transmitted to the secondary server
        through the use of BNDUPD messages.

7.7.2.  Receiving the POOLRESP message

   When a secondary server receives a POOLRESP message, it SHOULD send
   another POOLREQ message if the value of the addresses-transferred
   option is non-zero.

   Typically, no other action is taken on the reception of a POOLRESP
   message.
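   The secondary's side of the POOLREQ/POOLRESP exchange can be sketched
   as a loop that keeps requesting while the primary reports a non-zero
   addresses-transferred value.  The `exchange` callable is a
   hypothetical stand-in for "send a POOLREQ with a unique xid and
   return the addresses-transferred value from the matching POOLRESP";
   the `max_rounds` guard is also an assumption, added so the sketch
   cannot loop forever.

```python
from typing import Callable


def request_pool(exchange: Callable[[], int], max_rounds: int = 100) -> int:
    """Repeat the POOLREQ/POOLRESP exchange until nothing is transferred.

    Returns the total number of addresses the primary reported as
    transferred.  The addresses themselves arrive separately in BNDUPD
    messages and are not handled here.
    """
    total = 0
    for _ in range(max_rounds):
        transferred = exchange()
        if transferred == 0:
            break          # primary allocated nothing more; stop asking
        total += transferred
    return total
```

   A real implementation would likely pace successive requests rather
   than issue them back to back.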

7.8.  CONNECT message

   The connect message is used to establish an applications level con-
   nection over a newly created TCP connection.  It gives the source
   information for the connection, and some important configuration
   information.  It may be sent by either primary or secondary server.
   It is sent by the initiator of a TCP connection.

7.8.1.  Sending the CONNECT message

   The CONNECT message MUST be the first message sent by the initiator
   of a TCP connection after the establishment of a new TCP connection
   with another server participating in the failover protocol.

   The xid of the CONNECT message must be unique.

   The IP address of the sending server MUST be placed in the
   sending-server-IP-address option.  This information is placed in an
   option inside of the packet in order to allow the identity of the
   sender to be covered by a shared secret.

   The role of the sending failover endpoint (i.e., either primary or
   secondary) MUST be placed in the server-role option.

   The current time MUST be placed in the current-time option.

   The number of BNDUPD messages the server can accept without blocking
   the TCP connection MUST be placed in the max-unacked-bndupd option.
   This MUST be a number equal to or greater than 1, SHOULD be a number
   greater than 10, and SHOULD be a number less than 100.

   The length of the receive timer (tReceive, see section 8.3) MUST be
   placed in the receive-timer option.

   If the sending server is a primary server, then the MCLT MUST be
   placed in the MCLT option.

   If the format and code of sending server is a primary server, then the standard DHCP IP Address Lease Time hash-bucket-
   assignment option (51).  The time is MUST be included in units of seconds, and is specified as a
   32-bit  unsigned integer. A Lease Duration of 0xFFFFFFFF indicates an
   infinite lease.

   Code   Len         Lease Time
   +-----+-----+-----+-----+-----+-----+
   |  51 |  4  |  t1 |  t2 |  t3 |  t4 |
   +-----+-----+-----+-----+-----+-----+

DRAFT                                                      November 1998

3.7.  Client Identifier the CONNECT message.  The format, code and conventions used are identical to DHCP value
   of the hash-bucket-assignment option
   61.

   Code   Len   Type  Client-Identifier
   +-----+-----+-----+-----+-----+---
   |  61 |  n  |  t1 |  i1 |  i2 | ...
   +-----+-----+-----+-----+-----+---

3.8.  Client Hardware Address is determined from the specific
   buckets that the primary server has determined that the secondary
   server MUST service as part of the load-balancing algorithm.  The format way
   in which the primary server determines this information is similar outside
   the scope of this protocol definition.  The primary server is SHOULD
   be able to DHCP option 61. T1 (type) MUST be set configured with a percentage of clients that the secon-
   dary server will be instructed to service, and the
   proper ARP hardware address code, as defined primary server
   SHOULD convert that percentage value into a corresponding set of bits
   in the ARP section of
   RFC 1700 (it hash-bucket-assignment option that are set to a 1, indicating
   that the secondary server MUST service clients which map to those
   hash buckets.

   The vendor class identifier MUST NOT be zero!)

   Code   Len   Type   MAC address
   +-----+-----+-----+-----+-----+---
   | 233 |  n  |  t1 |  m1 |  m2 | ...
   +-----+-----+-----+-----+-----+---

   Either Client Id, Client Hardware Address or BOTH MAY placed in the vendor-class-
   identifier option.

   The protocol-version option MUST be present included in
   binding update transactions. At least one every CONNECT mes-
   sage.  The current value of them the protocol version is 1.

   The TLS-request option MUST be present. sent and contains the desired TLS con-
   nection request as well as information concerning whether TLS is sup-
   ported.    If both are present, this CONNECT message is being sent over a already
   created TLS connection, the Client Id TLS-request MUST be used to uniquely identify NOT appear.
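
   The conversion of a configured percentage into bits of the
   hash-bucket-assignment option, described above for the CONNECT
   message, can be sketched as follows.  This is only an illustration:
   the bucket count of 256 (a 32-octet bitmap), the bit ordering, and
   the choice to assign the lowest-numbered buckets are assumptions, and
   the function names are hypothetical.  The protocol leaves the method
   to the primary server.

```python
BUCKET_COUNT = 256  # assumption: 256 hash buckets, i.e. a 32-octet bitmap


def percentage_to_hash_buckets(percent: float) -> bytes:
    """Return a bitmap whose 1 bits mark buckets the secondary services."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    assigned = round(BUCKET_COUNT * percent / 100)
    bitmap = bytearray(BUCKET_COUNT // 8)
    for bucket in range(assigned):
        # Assumption: bucket 0 is the high-order bit of the first octet.
        bitmap[bucket // 8] |= 0x80 >> (bucket % 8)
    return bytes(bitmap)


def secondary_services(bitmap: bytes, bucket: int) -> bool:
    """True if the given hash bucket is assigned to the secondary."""
    return bool(bitmap[bucket // 8] & (0x80 >> (bucket % 8)))
```

   With these assumptions, a setting of 50 percent marks buckets 0
   through 127 for the secondary and leaves the rest to the primary.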

7.8.2.  Receiving the CONNECT message

   When a server receives a TCP connection on the failover port, it
   should wait for a CONNECT message.

   When a server receives a CONNECT message it should:

      1.  Record the time at which the message was received.

      2.  Examine the protocol-version option, and decide if this server
          is capable of interoperating with another server running that
          protocol version.  If not, then send the CONNECTACK message
          with the appropriate reject-reason.  The server MUST include
          its protocol-version in the CONNECTACK message.

      3.  Examine the TLS-request option.  Figure out the TLS-reply
          value based on the capabilities and configuration of this
          server, and save it for the CONNECTACK message.  If the
          results of the TLS negotiation result in a connection
          rejection, then go immediately to send the CONNECTACK message.

          The possibilities are:

                CONNECT      CONNECTACK
              TLS-request     TLS-reply
                               Reject
              req acc     t1   Reason   Comments
              --- ---     --   ------   --------
              0   0       0
              0   0       1    11       receiver requires TLS
              0   1       0
              0   1       1
              1   0       -             request doesn't make sense
              1   1       0
              1   1       1
              2   0       -             request doesn't make sense
              2   1       0    9 or 10  receiver won't do TLS
              2   1       1

      4.  Check to see if there is a message-digest option in the
          CONNECT message.  If there was, and the server does not
          support message-digests, then reject the connection with the
          appropriate reject-reason in the CONNECTACK.

      5.  Determine if the sender (from the sending-server-IP-address
          option) and the role of the sender (from the server-role
          option) represent a server with which the receiver was
          configured to engage in failover activity.

          If not, then the receiving server should reject the CONNECT
          request by sending a CONNECTACK message with a reject-reason
          value of 8, invalid failover partner.

          If it is, then the receiving failover endpoint should be
          determined.

      6.  Decide if the time delta between the sending of the packet, in
          the current-time option, and the receipt of the packet,
          recorded in step 1 above, is acceptable.  A server MAY require
          an arbitrarily small delta in time values in order to set up a
          failover connection with another server.

          If the delta between the time values is too great, the server
          should reject the CONNECT request by sending a CONNECTACK
          message with a reject-reason of 4, time mismatch too great.

          If the time mismatch is not considered too great then the
          receiving server MUST record the delta between the servers.
          The receiving server MUST use this delta to correct all of the
          absolute times received from the other server in all
          time-valued options.  Note that servers can participate in
          failover with arbitrarily great time mismatches, as long as
          the mismatch is more or less constant.

      7.  If the receiving server is a secondary server, it MUST examine
          the MCLT option in the CONNECT request and use the value of
          the MCLT as the MCLT for this failover endpoint.

          A receiving secondary server SHOULD be able to operate with
          any MCLT sent by the primary, but if it cannot, then it
          should send a CONNECTACK with a reject-reason of 5, MCLT
          mismatch.

      8.  The receiving server MAY use the vendor-class-identifier to do
          vendor specific processing.

3.9.  Host Name

   Uses the format and code of DHCP option 12.

   Code   Len                 Host Name
   +-----+-----+-----+-----+-----+-----+-----+-----+--
   |  12 |  n  |  h1 |  h2 |  h3 |  h4 |  h5 |  h6 |  ...
   +-----+-----+-----+-----+-----+-----+-----+-----+--

3.10.  Domain Name

   Uses the format and code of DHCP option 15.

   Code   Len   Domain Name
   +-----+-----+-----+-----+-----+-----+--
   |  15 |  n  |  d1 |  d2 |  d3 |  d4 |  ...
   +-----+-----+-----+-----+-----+-----+--

3.11.  Client FQDN

   If an implementation supports Dynamic DNS updates, this option can be
   used to communicate the DNS name that was set.  Uses the format and
   code of the Client FQDN option (81) as described in
   <draft-ietf-dhc-dhcp-dns-08.txt>.

   Code   Len   Flags Rcode1 Rcode2 Domain Name
   +-----+-----+-----+------+------+-----+------
   |  81 |  n  |  f  |  r1  |  r2  |  d1 | d2...
   +-----+-----+-----+------+------+-----+------

3.12.  Reject Reason Code

   This option is used to selectively reject binding updates.  It MAY be
   used in DHCPBNDACK message, always following an option 50.  Option 50
   contains the IP address of the specific update being rejected.

   Note that a Message option, DHCP Option 56, may be included to give a
   human readable error indication along with the Reject Reason Code.

   Code   Len   Reason code
   +-----+-----+----------+
   | 234 |  1  |    R1    |
   +-----+-----+----------+

   Reason codes :
   0   Reserved
   1   Illegal IP address (not part of any address pool)
   2   Fatal conflict exists: address in use by other client.
   3 - 253 Reserved for new Reason Codes.
   254 Unknown: Error occurred but does not match any reason code
   255 Reserved for code expansion

3.13.  Message

   This option is used to supply a human readable message.  It may be
   used in association with the Reject Reason Code to provide a human
   readable error message for the reject.

   Code   Len      Text
   +-----+-----+------+-----+--
   | 56  |  1  |  c1  | c2  | ...
   +-----+-----+------+-----+--

3.14.  MCLT - Maximum Client Lead Time

   Maximum Client Lead Time, in seconds.  A 32 bit integer value, in
   network byte order.  This option MUST be used in DHCPPOLL and
   DHCPPRPL messages, when the server is NOT in normal state.

   Code   Len           Time
   +------+-----+-----+-----+-----+-----+
   | 235  |  4  |  t1 |  t2 |  t3 |  t4 |
   +------+-----+-----+-----+-----+-----+

3.15.  Vendor Class Identifier

   A string which identifies the vendor of the failover protocol
   implementation.

   The code for this option is 60, and its minimum length is 1.

   Code    Len    Vendor Class Identifier
   +-----+-----+-----+-----+-----+--
   | 60  |  n  |  i1 |  i2 |  i3 | ...
   +-----+-----+-----+-----+-----+--

4.  Challenging scenarios for a Failover protocol

   There exist a number of failure scenarios which will challenge the
   correctness guarantees of the Failover protocol.  Two of the
   scenarios that the Failover protocol was specifically designed to
   handle correctly are detailed in this section in order to motivate
   some of the more unusual aspects of the protocol's operations.

4.1.  Primary Server crash before "lazy" update:

   In the case where the primary server sends a DHCPACK to a client for
   a newly allocated IP address and then crashes prior to sending the
   corresponding update to the secondary server, the secondary server
   will have no record of the IP address allocation.  When the secondary
   server takes over, it may well try to allocate that IP address to a
   different client.

   In the case where the first client to receive the IP address is not
   on the net at the time (yet while there was still time to run on its
   lease), an ICMP echo (i.e., ping) will not prevent the secondary
   server from allocating the IP address to a different client.  This is
   handled in the protocol by having the primary and secondary allocate
   addresses for new clients from distinct address pools.

   A more likely (in that DHCPRENEWs are presumably more common than
   DHCPDISCOVERs) and more subtle version of this problem is where the
   primary server crashes after extending a client's lease time, and
   before updating the secondary with a new time using a lazy update.
   After the secondary takes over, if the client is not connected to the
   network the secondary will believe the client's lease has expired
   when, in fact, it has not.  In this case as well, the IP address
   might be reallocated to a different client while the first client is
   still using it.

   This scenario is handled by the Failover protocol through control of
   the lease time and the use of the maximum client lead time (MCLT).
   See the next section for details.

4.2.  Network partition where servers can't communicate but each server can
talk to clients:

   Several conditions are required for this situation to occur.  First,
   due to a network failure, the primary and secondary servers cannot
   communicate.  As well, some of the DHCP clients must be able to
   communicate with the primary server, and some of the clients must now
   only be able to communicate with the secondary server.  When this
   condition occurs, both primary and secondary servers could attempt to
   allocate IP addresses for new clients from the same pool of available
   addresses.  At some point, then, two clients will end up being
   allocated the same IP address.  This will cause potentially serious
   problems when the network failure that created this situation is
   corrected.

   This is handled in the protocol by having the primary and secondary
   servers allocate addresses for new clients from distinct address
   pools.  The specifics of how these two scenarios are handled are
   supplied in the next section.

5.  Duplicate Address Assignment Control

   There are several ways that the Failover protocol avoids the
   possibility of duplicate address assignment.

5.1.  Control of lease time

   The key problem with lazy update is that when a server fails after
   updating a client with a particular lease time and before updating
   its partner, the partner will believe that a lease has expired even
   though the client still retains a valid lease on that IP address.

7.9.  CONNECTACK message

   The CONNECTACK message is sent to accept or reject a CONNECT message.
   It is sent by the server which accepted the TCP connection and
   received a CONNECT message.

7.9.1.  Sending the CONNECTACK message

   The xid of the CONNECTACK message must be that of the corresponding
   CONNECT message.

   The IP address of the sending server MUST be placed in the
   sending-server-IP-address option.  This information is placed in an
   option inside of the packet in order to allow the identity of the
   sender to be covered by a shared secret.

   The role of the sending failover endpoint (i.e., either primary or
   secondary) MUST be placed in the server-role option.

   The current time MUST be placed in the current-time option.

   The protocol-version option MUST be included in every CONNECTACK
   message.  The current value of the protocol version is 1.

   If the connection has been rejected, the reject-reason option MUST be
   placed in the CONNECTACK message with an appropriate reason, and a
   message option SHOULD be included with a human-readable error message
   describing the reason for the rejection in some detail.  If the
   reject-reason option appears, then the remaining options listed below
   do not appear.

   The results of the TLS negotiation MUST be placed in the TLS-reply
   option.  If this CONNECTACK message is being sent over an already TLS
   secured connection, then there MUST NOT be a TLS-reply option.

   If there was a message-digest option in the CONNECT message, then
   there MUST be a message-digest in the CONNECTACK message if it does
   not contain a reject-reason.

   The number of BNDUPD messages the server can accept without blocking
   the TCP connection MUST be placed in the max-unacked-bndupd option.
   This SHOULD be a number greater than 10, and SHOULD be a number less
   than 100.

   The length of the receive timer (tReceive, see section 8.3) MUST be
   placed in the receive-timer option.

   If the sending server is a primary server, then the MCLT MUST be
   placed in the MCLT option.

   The vendor class identifier MUST be placed in the
   vendor-class-identifier option.

   If the server is rejecting the CONNECT message, then the
   reject-reason option MUST appear.  A message option MAY appear to
   give a human readable version of the rejection reason.

   After sending a CONNECTACK message, the server MUST send a STATE
   message.

   After sending a CONNECTACK message, the server MUST start two timers
   for the connection: tSend and tReceive.  The tSend timer SHOULD be
   approximately 20 percent of the time in the receive-timer option in
   the corresponding CONNECT message.  The tReceive timer SHOULD be the
   time sent in the receive-timer option in the CONNECTACK message.

   The tReceive timer is reset whenever a message is received from this
   TCP connection.  If it ever expires, the TCP connection is dropped
   and communications with this partner is considered not ok.

   The tSend timer is reset whenever a packet is sent over this
   connection.  When it expires, a CONTACT message MUST be sent.
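
   The interaction of the tSend and tReceive timers described above can
   be sketched as follows.  This is a minimal illustration, not a
   prescribed implementation: the class and method names are
   hypothetical, times are plain numbers of seconds supplied by the
   caller, and the receive timer is checked before the send timer as a
   design choice of the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class ConnectionTimers:
    """Hypothetical per-connection timer state for tSend and tReceive."""

    peer_receive_timer: float   # receive-timer value sent by the partner
    own_receive_timer: float    # receive-timer value this server sent
    started_at: float           # time at which the timers were started
    t_send: float = field(init=False)
    t_receive: float = field(init=False)
    last_sent: float = field(init=False)
    last_received: float = field(init=False)

    def __post_init__(self):
        # tSend is roughly 20 percent of the partner's receive timer.
        self.t_send = 0.2 * self.peer_receive_timer
        self.t_receive = self.own_receive_timer
        self.last_sent = self.started_at
        self.last_received = self.started_at

    def on_packet_sent(self, now: float) -> None:
        """Any packet sent over the connection resets tSend."""
        self.last_sent = now

    def on_message_received(self, now: float) -> None:
        """Any message received on the connection resets tReceive."""
        self.last_received = now

    def poll(self, now: float):
        """Return the action that is due at time 'now', if any."""
        if now - self.last_received >= self.t_receive:
            return "drop-connection"   # communications considered not ok
        if now - self.last_sent >= self.t_send:
            return "send-contact"      # a CONTACT message is due
        return None
```

   Because tSend is a fraction of the partner's tReceive, an idle but
   healthy connection carries several CONTACT messages per receive
   interval, so one delayed message does not cause the partner to drop
   the connection.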

7.9.2.  Receiving the CONNECTACK message

   When a CONNECTACK message is received, the following actions should
   be taken:

      1.  Record the time the packet was received.

      2.  Check to see if there is a reject-reason option in the
          CONNECTACK message.  If not, continue with step 3.  If there
          is a reject-reason option, the server SHOULD report the error
          code.  If a message option appears a server SHOULD display the
          string from the message option in a user visible way.  The
          server MUST close the connection if a reject-reason option
          appears.

      3.  Check to see if the xid on the CONNECTACK matches an
          outstanding CONNECT message on this TCP connection.

      4.  Check the value of the TLS-reply option, and if it was 1, then
          skip processing of the rest of the CONNECTACK message, and
          immediately enter into TLS connection setup.

          If it does not, a server SHOULD report an error.

      5.  Examine the value of the protocol-version option.  If this
          server is able to establish connections with another server
          running this protocol version, then continue, else close the
          connection.

      6.  Check to see if the sending-server-IP-address and server-role
          in the CONNECTACK message correspond to the failover endpoint
          for which this TCP connection was created.

          If it was not, the server MUST drop the TCP connection and
          SHOULD report an error.

      7.  Decide if the time delta between the sending of the packet, in
          the current-time option, and the receipt of the packet,
          recorded in step 1 above, is acceptable.  A server MAY require
          an arbitrarily small delta in time values in order to set up a
          failover connection with another server.

          If the delta between the time values is too great, the server
          should drop the TCP connection.

          If the time mismatch is not considered too great then the
          receiving server MUST record the delta between the servers.
          The receiving server MUST use this delta to correct all of the
          absolute times received from the other server in all
          time-valued options.  Note that the failover protocol is
          constructed so that two servers can be failover partners with
          arbitrarily great time mismatches.

      8.  If the receiving server is a secondary server, it MUST examine
          the MCLT option in the CONNECT request and use the value of
          the MCLT as the MCLT for this failover endpoint.

          A receiving secondary server SHOULD be able to operate with
          any MCLT sent by the primary, but if it cannot, then it MUST
          drop the TCP connection.

      9.  The receiving server MAY use the vendor-class-identifier to do
          vendor specific processing.

      10. After accepting a CONNECTACK message, the server MUST send a
          STATE message.

          After receiving a CONNECTACK message, the server MUST start
          two timers for the connection: tSend and tReceive.  The tSend
          timer SHOULD be approximately 20 percent of the time in the
          receive-timer option in the corresponding CONNECTACK message.
          The tReceive timer SHOULD be set to the time sent in the
          receive-timer option in the CONNECT message.

          The tReceive timer is reset whenever a message is received
          from this TCP connection.  If it ever expires, the TCP
          connection is dropped and communications with this partner is
          considered not ok.

          The tSend timer is reset whenever a packet is sent over this
          connection.  When it expires, a CONTACT message MUST be sent.

   In order to handle this problem, a period of time known as the
   "Maximum Client Lead Time" (MCLT) is defined and must be known to
   both the primary and secondary servers.  Proper use of this time
   interval places an upper bound on the difference allowed between the
   lease time provided to a DHCP client by a server and the lease time
   known by that server's partner.  In order that this is not the
   maximum lease time that a server can ever provide to a client, during
   a lazy update the updating server typically updates its partner with
   lease time information which is longer than the lease time previously
   given to the client.  This allows that server to give a longer lease
   time to the client the next time the client renews its lease.

   When moving to the PARTNER-DOWN state (where a server is allowed to
   reallocate the partner's IP addresses), a server will wait the
   Maximum Client Lead Time before allocating any IP addresses from its
   partner's pool to any new DHCP clients.  Thus, any clients which have
   a lease on an IP address with a lease time greater than that known by
   the server moving into PARTNER-DOWN state will either have contacted
   that server during the MCLT period or their leases will have expired.

   When a server has transitioned to PARTNER-DOWN state, it MUST NOT
   reallocate an IP address from one client to another client until an
   additional maximum client lead time interval after the lease on the
   first client expires.  (Actually, until the maximum client lead time
   after what it believes to be the lease expiration time of the first
   client.)

   The fundamental relationship on which much of the correctness of this
   protocol depends is that the lease expiration time known to a DHCP
   client MUST NOT be more than the maximum client lead time greater
   than the lease expiration time known to a server's partner.

   The remainder of this section makes the above fundamental
   relationship more explicit.

   This protocol requires a DHCP server to deal with several different
   lease intervals and places specific restrictions on their
   relationships.  The purpose of these restrictions is to allow the
   other server in the pair to be able to make certain assumptions in
   the absence of an ability to communicate between servers.

   The different lease times are:

      o desired client lease interval

        The desired client lease interval is the lease interval that a
        DHCP server would like to give to a DHCP client in the absence
        of any restrictions imposed by the Failover protocol.  Its
        determination is outside of the scope of this protocol.
        Typically this is the result of external configuration of a DHCP
        server.

      o actual client lease interval

        The actual client lease interval is the lease interval that a
        DHCP server gives out to a DHCP client.  It may be shorter than
        the desired client lease interval (as explained below).

      o desired partner server lease interval

        The desired partner server lease interval is the lease
        expiration interval the local server tells to its partner.

      o acknowledged partner server lease interval

        The acknowledged partner server lease interval is the interval
        the partner server has most recently acknowledged.

   The key restriction (and guarantee) that any server makes with
   respect to lease intervals is that the actual client lease interval
   never exceeds the acknowledged partner server lease interval (if any)
   by more than a fixed amount.  This fixed amount is called the
   "Maximum Client Lead Time" (MCLT).

   The MCLT MAY be configurable, but for correct server operation it
   MUST be the same and known to both the primary and secondary servers.
   It is transmitted from the primary to the secondary in every message
   sent with the RESTART bit set, and also in every poll and poll reply
   message.  The secondary MUST ensure that its value agrees with that
   of the primary.  See section 3.14 concerning the MCLT Option.

   A server MUST record in its stable storage both the local server
   lease interval and the most recently acknowledged partner server
   lease interval for each IP address binding.  It is assumed that the
   desired client lease interval can be determined through techniques
   outside of the scope of this protocol.

   Again, the fundamental relationship among these times which MUST be
   maintained is:

       actual client lease interval <
       ( acknowledged partner lease interval + MCLT )

   The "acknowledged partner lease interval" is the acknowledged
   secondary server lease interval for the primary server, and it would
   be the acknowledged primary server lease interval for the secondary
   server when it is operating out of contact with the primary server.

7.10.  STATE message

   The state (STATE) message is used to communicate the current failover
   state to the partner server.

   The STATE message MUST be sent after sending a CONNECTACK message
   that didn't contain a reject-reason option, and MUST be sent after
   receiving a CONNECTACK message without a reject-reason option.

   A STATE message MUST be sent whenever the failover endpoint changes
   its failover state and a connection exists to the partner.

   The STATE message requires no response from the failover partner.

7.10.1.  Sending the STATE message

   The current failover state is placed in the server-state option and
   the current state of the STARTUP flag is placed in the server-flags
   option.

   The message is sent with a unique xid.

   A server SHOULD only send the STATE message either when the
   connection is created (i.e., after sending or receiving a CONNECTACK
   message with no reject-reason option), or when there is a change from
   the values sent in a previous STATE message.

7.10.2.  Receiving the STATE message

   Every STATE message SHOULD indicate a change in state or a change in
   the flags.

   When a STATE message is received, any state transitions specified in
   section 9 are taken.

   No response to a STATE message is required.

7.11.  CONTACT message

   The contact (CONTACT) message is sent to verify communications
   integrity with a failover partner.  The CONTACT message is sent when
   no messages have been sent to the failover partner for a specified
   period of time.  This is determined by the tSend timer expiring (see
   section 8.3).

7.11.1.  Sending the CONTACT message

   The current time is placed in the current-time option, and the
   CONTACT message is sent.

7.11.2.  Receiving the CONTACT message

   When a CONTACT message is received, the tReceive timer is reset (as
   it is with any message that is received).

   A server MAY use the time in the current-time option and the time
   recorded above to refine the delta time calculations between the
   servers.
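
   The delta time handling used with the current-time option can be
   sketched as follows.  This is an illustration under stated
   assumptions: the class name, the optional max_delta limit, and the
   refinement policy are hypothetical.  The sketch keeps the smallest
   delta observed, on the reasoning that transit delay can only add to
   an observed delta; the protocol does not mandate any particular
   refinement method.

```python
class PartnerClock:
    """Hypothetical tracker for the clock delta to a failover partner."""

    def __init__(self, max_delta=None):
        self.max_delta = max_delta  # None: accept any mismatch
        self.delta = None           # local time minus partner time

    def observe(self, partner_current_time: int,
                local_receipt_time: int) -> bool:
        """Record one current-time observation.

        Returns False if the mismatch exceeds the configured limit.
        """
        observed = local_receipt_time - partner_current_time
        if self.max_delta is not None and abs(observed) > self.max_delta:
            return False
        # Transit delay only increases the observed value, so the
        # smallest observation is the closest to the real clock offset.
        if self.delta is None or observed < self.delta:
            self.delta = observed
        return True

    def to_local_time(self, partner_time: int) -> int:
        """Correct an absolute time received from the partner."""
        if self.delta is None:
            raise ValueError("no current-time observation recorded yet")
        return partner_time + self.delta
```

   The first observation would come from the CONNECT or CONNECTACK
   message, and later CONTACT messages can tighten the estimate.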

8.  Connection Management

   Servers participating in the failover protocol communicate over TCP
   connections.  These TCP connections are used both to transmit binding
   information from one server to another as well as to allow each
   server to determine whether communications is possible with the other
   server.

   Central to the operation of the failover protocol is a notion of
   "communications okay" or "communications failed".  Failover state
   transitions are taken in many cases when the status of communications
   with the partner changes, and the existence or non-existence of a TCP
   connection between failover endpoints is used to determine if
   communications is "okay" or "failed".

   A single TCP connection exists which connects two failover endpoints.

8.1.  Connection granularity

   There exists one TCP connection between each set of failover
   endpoints.  See section 5.1.1 for an explanation of failover
   endpoint.

   There is a maximum of two TCP connections between any two servers
   implementing the failover protocol, one for each of the possible
   failover endpoints between these two servers.  There is a minimum of
   one TCP connection between one server and every other failover server
   with which it implements the failover protocol.

8.2.  Creating the TCP connection

   Every server implementing the failover protocol MUST listen on port
   647 for incoming failover TCP connections.  The source port of the
   TCP connection is unimportant.

   Every server implementing the failover protocol SHOULD attempt to
   connect to all of its partners periodically, where the period is
   implementation dependent and SHOULD be configurable.  In the event
   that a connection has been rejected by a CONNECTACK message with a
   reject-reason option contained in it, a server SHOULD reduce the
   frequency with which it attempts to connect to that server but it
   SHOULD continue to attempt to connect periodically.

   Once a connection is established, the first message sent across the
   connection MUST be a CONNECT message.  This message establishes the
   identity of the failover endpoint making the connection.

   Every CONNECT message includes a TLS-request option, and if the
   CONNECTACK message does not reject the CONNECT message and the
   TLS-reply option says TLS MUST be used, then the servers will enter
   into TLS negotiation.

   Once that negotiation is complete, then the primary server MUST
   resend the CONNECT message on the newly secured TLS connection and
   then wait for the CONNECTACK message in response.  The TLS-request
   and TLS-reply options MUST have the same values in this second
   CONNECT and CONNECTACK message as they had in the first messages.

   The second message sent over a new connection is a STATE message.
   Upon the receipt of this message, the receiver can consider
   communications up.

   It is entirely possible that two servers will attempt to make
   connections to each other essentially simultaneously, and then each
   will send a CONNECT message down the new connection.  In this case
   each server will receive a CONNECT message on one connection having
   already sent a CONNECT message on the other connection.  In the event
   that the primary server receives a CONNECT message from the secondary
   server either while waiting for a CONNECTACK message from a secondary
   server or when it has a valid connection open to a secondary server,
   it will close the connection on which the CONNECT message was
   received.

   Figure 5.1-1 illustrates an initial lease to a client using the rules
   discussed in the example which follows it.

          DHCP                 Primary             Secondary
          Client               Server               Server

            |                     |                    |
            | >-DHCPDISCOVER->    |                    |
            |     <---DHCPOFFER-< |                    |
            |                     |                    |
            | >-DHCPREQUEST->     |                    |
            |   (selecting)       |                    |
            |                     |                    |
            |  <--------DHCPACK-< |                    |
            |      ^    (MCLT)    |                    |
            |      :              | >-DHCPBNDUPD-->    |
            |      :              |  (1/2 MCLT + X )   |
            |      :              |                    |
            |      :              |     <-DHCPBNDACK-< |
            |   MCLT / 2          |                    |
           ...     :             ...                  ...
            |      :              |                    |
            |      V              |                    |
            | >-DHCPREQUEST->     |                    |
            |      (renew)        |                    |
            |                     |                    |
            |  <--------DHCPACK-< |                    |
            |      ^    (X)       |                    |
            |      :              | >-DHCPBNDUPD-->    |
            |      :              |   ( 1/2 X + X )    |
            |      :              |                    |
            |      :              |     <-DHCPBNDACK-< |
            |    X / 2            |                    |
            |      :              |                    |
           ...    ...            ...                  ...

           Figure 5.1-1:  Lazy Update Message Traffic
                          X = Desired Client Lease Interval

   DISCUSSION:

      This protocol mandates no algorithm concerning these lease
      intervals, as long as the above fundamental relationship is
      preserved.

      In the interests of clarity, however, let's examine a specific
      example.  The MCLT in this case is 1 hour.  The desired client
      lease interval is 3 days, and its renewal time is half the lease
      interval.

      The rules for this example are:

      o What to tell the client:

        Take the remainder of the acknowledged partner server lease
        interval.  If this is a new lease, then this value will be zero.
        If this remainder plus the MCLT is greater than the desired
        client lease interval, give the client the desired client lease
        interval else give the client the remainder plus the MCLT.

      o What to tell the failover partner server:

        Take the renewal interval (typically half of the actual client
        lease interval), and add to it the desired client lease
        interval.

      In operation this might work as follows:

      When a primary server makes an offer for a new lease on an IP
      address to a DHCP client, it determines the desired client lease
      interval (in this case, 3 days).  It then examines the
      acknowledged partner lease interval (which in this case is zero)
      and determines the remainder of the time left to run, which is
      also zero.  To this it adds the MCLT.  Since the actual client
      lease interval cannot be allowed to exceed the remainder of the
      current partner lease interval plus the MCLT, the offer made to
      the client is the remainder of the current partner lease interval
      (i.e., zero) plus the MCLT.  Thus, the actual client lease
      interval is 1 hour.

8.3.  Using the TCP connection for determining communications status

   The TCP connection is used to determine the communications status of
   the other server, i.e., communications-ok or
   communications-interrupted.

   Three things must happen for a server to consider that communications
   are ok with respect to another server:

      1.  A TCP connection must be established to the other server.

      2.  A CONNECT message must be received and a CONNECTACK message
          sent in response.  The CONNECT message is used to determine
          the identity of the failover endpoint of the other end of the
          TCP connection -- without it, the failover endpoint cannot be
          uniquely determined.  Without knowledge of the failover
          endpoint, the entity with which communications is ok is
          undetermined.

      3.  A STATE message must be received from the other server over
          the connection.  This STATE message initializes important
          information necessary to the operation of the state machine
          that governs the behavior of this failover endpoint.

   There are two ways that a server can determine that communications
   has failed:

      1.  The TCP connection can go down, yielding an error when
          attempting to send a message.  This will happen at least as
          often as the period of the tSend timer.

      2.  The tReceive timer can expire.

   In either of these cases, communications is considered interrupted.
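
   The three conditions for "communications ok" and the two failure
   cases can be sketched as a small tracker.  This is only an
   illustration under stated assumptions: the class name, the method
   names, the injectable clock, and the 60 second tReceive default are
   hypothetical and are not defined by this protocol.

```python
import time


class CommsTracker:
    """Tracks whether communications with the partner are "ok"."""

    def __init__(self, t_receive=60.0, clock=time.monotonic):
        self.t_receive = t_receive    # assumed tReceive period, in seconds
        self.clock = clock            # injectable clock, eases testing
        self._reset()

    def _reset(self):
        self.tcp_up = False           # 1. TCP connection established
        self.connect_done = False     # 2. CONNECT/CONNECTACK exchanged
        self.state_seen = False       # 3. STATE received from partner
        self.last_rx = None

    def on_tcp_established(self):
        self.tcp_up = True
        self.last_rx = self.clock()

    def on_connect_exchanged(self):
        self.connect_done = True
        self.last_rx = self.clock()

    def on_state_received(self):
        self.state_seen = True
        self.last_rx = self.clock()

    def on_message(self):
        # Any message from the partner restarts the tReceive timer.
        self.last_rx = self.clock()

    def on_send_error(self):
        # Failure case 1: the TCP connection went down during a send.
        self._reset()

    def communications_ok(self):
        if not (self.tcp_up and self.connect_done and self.state_seen):
            return False
        if self.clock() - self.last_rx > self.t_receive:
            # Failure case 2: the tReceive timer expired.
            self._reset()
            return False
        return True
```

   Resetting all three flags on either failure reflects that a new
   connection must repeat the whole CONNECT and STATE exchange before
   communications are considered ok again.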

   Several difficulties arise when trying to use one TCP connection for
   both bulk data transfer as well as to sense the communications status
   of the other server.  One aspect of the problem stems from the
   different requirements of both uses.  The bulk data transfer is of
   course critically important to the protocol, but the speed with which
   it is processed is not terribly significant.  It might well be
   minutes before a BNDUPD message is processed, and while not optimal,
   such an occasional delay doesn't compromise the correctness of the
   DHCP protocol.  However, the speed with which one server detects the
   other server is up (or, more importantly, down) is more highly
   constrained.  Generally one server should be able to detect that the
   other server is not communicating within a minute or less.

   These differing time constraints make it difficult to use the same
   TCP connection for data transfer as well as to sense communications
   integrity.  See section 3.5 for additional details on TCP.

   The solution to this problem is to require that some message be
   received by each end of this connection within a limited time or that
   the connection will be considered down.  If no messages have been
   sent recently, then a CONTACT message is sent.

   In the case where there is no data queued to be sent, this is not a
   problem, but in the case where there is data queued to be sent to the
   partner, then the CONTACT message will not actually be transmitted
   until the queued data is sent.  Section 3.5 explains why waiting for
   TCP to determine that the connection is down is not acceptable, and
   leads to a requirement that the receiving server never block the
   sending server from sending CONTACT packets.

   In order to meet this requirement, each server tells the other server
   the number of outstanding BNDUPD messages that it will accept.  The
   receiving server is required to always be able to accept that many
   BNDUPD messages off of the connection's input queue even if it cannot
   process them immediately, and to accept all other messages
   immediately.

   Thus, the sending server's TCP is never blocked from sending a
   message except for very short periods, less than a few seconds unless
   the network connection itself has problems.  In this case, if the
   CONTACT messages don't make it to the partner then the partner will
   close the connection.

8.4.  Using the TCP connection for binding data

   Binding data, in the form of BNDUPD messages and BNDACK messages to
   respond to them, are sent across the TCP connection.

   In order to support timely detection of any failure in the partner
   server, the TCP connection MUST NOT block for more than a very short
   time, on the order of a few seconds.  Therefore, a server that is
   sending BNDUPD messages MUST send only a restricted number before
   receiving BNDACK messages about previous messages sent.

   The number of outstanding BNDUPD messages that each server will
   accept without causing TCP to block transmission of additional data
   (i.e., CONTACT messages) is sent by each server in the CONNECT and
   CONNECTACK messages in the max-unacked-bndupd option.
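
   The restriction on outstanding BNDUPD messages behaves like a simple
   send window.  The following is a minimal sketch, assuming a
   hypothetical BndupdSender class whose "sent" list stands in for the
   TCP connection; only the max-unacked-bndupd limit itself comes from
   this document.

```python
from collections import deque


class BndupdSender:
    """Limits unacknowledged BNDUPD messages to the partner's advertised
    max-unacked-bndupd value."""

    def __init__(self, max_unacked):
        self.max_unacked = max_unacked  # value learned from the partner
        self.pending = deque()          # updates waiting for window space
        self.unacked = 0                # BNDUPDs sent but not yet acked
        self.sent = []                  # stand-in for the TCP connection

    def queue(self, update):
        self.pending.append(update)
        self._pump()

    def on_bndack(self):
        # Each BNDACK frees one slot in the window.
        if self.unacked > 0:
            self.unacked -= 1
        self._pump()

    def _pump(self):
        # Send queued updates only while the window has room.
        while self.pending and self.unacked < self.max_unacked:
            self.sent.append(self.pending.popleft())
            self.unacked += 1
```

   Because updates beyond the window wait in a local queue rather than
   in the TCP send buffer, a CONTACT message written to the connection
   is not stuck behind data that the partner has not agreed to accept.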

8.5.  Using the TCP connection for control messages

   The TCP connection is used for control messages: POOLREQ, UPDREQ,
   STATE, UPDREQALL and the corresponding reply messages: POOLRESP,
   UPDDONE.  A server MUST immediately accept all of these messages from
   the TCP connection.  A server MUST immediately accept any BNDACK
   which is received as well.

8.6.  Losing the TCP connection

   When the TCP connection is lost, then communications is not ok with
   the other server.  A server which has lost communications SHOULD
   immediately attempt to reconnect to the other server, and should
   retry these connection attempts periodically.

   Any BNDUPD or other messages that have been received but not yet
   processed from the partner SHOULD be processed as soon as possible.

9.  Protocol States

   This section discusses the various states that a failover endpoint
   may take, and the actions required when entering the state, operating
   in the state, and leaving the state, as well as the events that cause
   transitions out of the state into another state.

   The state transition diagram in Figure 9.2-1 is relevant for this
   section.  In the event that the textual description of a state
   differs from the state transition diagram, the textual description is
   to be considered authoritative.  This is the common state transition
   diagram for both servers in a failover pair.

9.1.  Server Initialization

   When a server starts it starts out in STARTUP state.  See section 9.3
   below for details.

9.2.  Server State Transitions

   Whenever a server transitions into a new state, it MUST record the
   state and the time at which it entered that state in stable storage.
   If communications is "ok", it MUST also send a STATE message to its
   failover partner.

   Figure 9.2-1 is the diagram of the server state transitions. The
   remainder of this section contains information important to the
   understanding of that diagram.

   The server stays in the current state until all of the actions speci-
   fied on the state transition are complete.  If communications fails
   during one of the actions, the server simply stays in the current
   state and attempts a transition whenever the conditions for a transi-
   tion are later fulfilled.

   In the state transition diagram below, the "+" or "-" in the upper
   right corner of each state is a notation about whether communication
   is ongoing with the other server.

   The legend "responsive", "balanced", or "unresponsive" in each state
   indicates whether the server is responsive to all DHCP client
   requests, running in load balanced mode, or totally unresponsive in
   the respective state.  The terms "responsive" and "unresponsive" have
   the obvious meanings, while "balanced" means that a DHCP server may
   respond to all DHCPREQUEST messages that are RENEWAL or REBINDING,
   and to all other messages from clients for which the load balancing
   algorithm indicates that it MUST respond.  See sections 5.3 and 9.6.2
   for details on load balancing.

   In the state transition diagram below, when communication is reesta-
   blished between the two servers, each must record the state of the
   partner when communication was restored.  State transitions on one
   server in some cases imply state transitions on the partner server,
   so a record of the current state of the partner server must be kept
   by each server.

   If a message is received from a partner with the state equal to zero
   (0), then the receiving server should respond to that message with a
   DHCPPRPL if it was a DHCPPOLL, but under no circumstances should it
   consider communications to be "okay", nor take any state transitions
   based on receipt of that message.

   If the state of the partner changes while communicating, a server
   moves through the communications-failed transition and into whatever
   state results.  It then immediately moves through whatever state
   transition is appropriate given the current state of the partner
   server.  A server performing this operation SHOULD NOT drop the TCP
   connection to its partner.

   DISCUSSION:

      The point of this technique is simplicity, both in explanation of
      the protocol and in its implementation.  The alternative to this
      technique of memory of partner state and automatic state transi-
      tion on change of partner state is to have every state in the fol-
      lowing diagram have a state transition for every possible state of
      the partner.  With the approach adopted, only the states in which
      communications are reestablished require a state transition for
      each possible partner state.

   The current state of a server MUST be recorded in stable storage and
   thus be available to the server after a server restart.

        +---------------+  V  +--------------+
        |    RECOVER  - |  |  |   STARTUP  - |
        |(unresponsive) |  +->|(unresponsive)|
        +---------------+     +--------------+
           Comm. OK             +-----------------+
          Other State:-RECOVER  |  PARTNER DOWN - |<-----+
          |      |              | (responsive)    |      |
         All   POTENTIAL-       +-----------------+      |
       Others  CONFLICT------------ | --------+  ^(see   |
          |                     Comm. OK      |  | 9.8.3)|
         UPDREQ(ALL)          Other State:    |  +-----+ |
       Wait UPDDONE            |        |     | Comm.  | |
     Wait MCLT from fail   RECOVER  All Others| Failed | |
      +--------------+         |        V     V  |     | |
      |RECOVER-DONE +|      +--+    +--------------+   | |
      |(unresponsive)|      |       |  POTENTIAL + |<--+ |
      +--------------+   Wait for +>|  CONFLICT    |     |
         Comm. OK         Other   | |(unresponsive)|<--- | --+
     +--Other State:-+    State:  | +--------------+     |   |
     |   |           |   RECOVER  |         |            |   |
     |   All      POTENT.  DONE   | Resolve Conflict     |   |
     |  Others:  CONFLICT-- | ----+     (see 9.8)        |   |
     | Wait for             V               V            |   |
     | Other State: NORMAL +-----------------+           |   |
     |   V                 |     NORMAL    + | External  |   |
     |   +--+----------+-->|   (balanced)    |-Command-->+   |
     |      ^          ^   +-----------------+           |   |
     |      |          |            |                    |   |
     |  Wait for   Comm. OK       Comm.            External  |
     |   Other      Other        Failed            Command   |
     |   State:     State:          |                or  |   |
     |RECOVER-DONE  NORMAL     Start Safe        Safe    |   |
     |      |     COMM. INT.  Period Timer       Period  |   |
     |   Comm. OK.     |            V            expiration  |
     |  Other State:   |  +------------------+           |   |
     |    RECOVER      +--| COMMUNICATIONS - |-----------+   |
     V      +-------------|   INTERRUPTED    |   Comm. OK    |
    RECOVER               |  (responsive)    |--Other State:-+
    RECOVER-DONE--------->+------------------+   All Others

           Figure 9.2-1:  Server state diagram.

9.3.  STARTUP state

   The STARTUP state affords an opportunity for a server to probe its
   partner server, before starting to service DHCP clients.

   DISCUSSION:

      Without the STARTUP state, a server would likely start in a state
      derived from its previously stored state (held in stable storage),
      if any.  However, this may be inconsistent with the current state
      of the partner.  The STARTUP state affords the opportunity for a
      server to potentially learn the partner's state and determine if
      that state is consistent with its derived starting state or
      whether some significant state change has occurred at the partner
      that forces the server to start in another state.  This is
      especially critical if significant time has elapsed while the
      server was down.

9.3.1.  Operation while in STARTUP state

   Whenever a server is in STARTUP state, it MUST be unresponsive to
   DHCP client requests, and so the time spent in the STARTUP state is
   necessarily short, typically on the order of a few seconds to a few
   tens of seconds.  The exact time spent in the STARTUP state is imple-
   mentation dependent, and the primary and secondary server are not
   required to spend the same amount of time in the STARTUP state.

   Whenever a STATE message is sent to the partner while in STARTUP
   state the STARTUP bit MUST be set in the server-flags option and the
   previously recorded failover state MUST be placed in the server-state
   option.

9.3.2.  Transition out of STARTUP state

   Each server starts out in startup state every time it initializes
   itself, and performs the following algorithm as part of its initiali-
   zation:

      1.  Ensure that the RESTART bit is set in the 'flags' field of the
          failover message header.  Once set, the RESTART bit must
          remain set in all failover messages sent by the server to the
          partner until the first acknowledgment of a message is
          received from that partner.  This is required to assure that
          the partner knows that the server has restarted, even if the
          partner itself is unreachable for a long while.  Do not send
          any messages until step 5.

      2.  Is there any record in stable storage of a previous failover
          state?  If yes, set previous-state to the last recorded state
          in stable storage, and continue with step 3.

          Is there any configuration information that indicates that
          this server was previously running but lost its stable
          storage?  Such information must typically come from some
          administrative intervention, since it is difficult for a
          server to distinguish first startup from a startup after it
          has lost its stable storage.  If yes, then set the previous-
          state to RECOVER, and set the time-of-failure to whatever time
          was configured, and go on to step 3.  This time-of-failure
          will be used in the transition out of the RECOVER state into
          the RECOVER-DONE state, below.

          If there is no record of any previous failover state in stable
          storage nor of any previous operational activity for this
          server, then set the previous-state to PARTNER-DOWN if this
          server is a primary and RECOVER if this server is a secondary,
          and set the time-of-failure to a time before the
          maximum-client-lead-time before now.  If using standard POSIX
          times, 0 would typically do quite well.

      3.  Is the previous-state NORMAL?  If yes, set the previous-state
          to COMMUNICATIONS-INTERRUPTED.

      4.  Start the STARTUP state timer.  The time that a server remains
          in the STARTUP state (absent any communications with its
          partner) is implementation dependent and SHOULD be
          configurable.  It SHOULD be long enough for a TCP connection
          to be created to a heavily loaded partner across a slow
          network.

      5.  Attempt to create a TCP connection to the failover partner.
          See section 8.2.

      6.  Wait for "communications okay", i.e., the process discussed in
          section 8.2 "Creating the TCP Connection", to complete,
          including the receipt of a STATE message from the partner.

          When and if communications become "okay", clear the STARTUP
          flag, and set the current state to the previous-state.

          If the partner is in PARTNER-DOWN state, and if the time at
          which it entered PARTNER-DOWN state (as received in the
          start-time-of-state option in the STATE message) is later than
          the last recorded time of operation of this server, then set
          the current state to RECOVER.

          Then, transition to the current state and take the "communica-
          tions okay" state transition based on the current state of
          this server and the partner.

      7.  If the STARTUP state timer expires, take an implementation
          dependent action:  The server MAY go to the previous-state, or
          the server MAY wait.

          Reasons to go to previous-state and begin processing:

          If the current server is the only operational server, then if
          it waits, there will be no operational DHCP servers.  This
          situation could occur very easily where one server fails and
          then the other crashes and reboots.  If the rebooting server
          doesn't start processing DHCP client requests without first
          being in communication with the other server, then the level
          of DHCP redundancy is not particularly high.  This is an
          appropriate approach if the possibility of partition is low,
          or if the safe period expiration time is well beyond the time
          at which an operator would notice and react to a partition
          situation.  It is also quite appropriate if the safe period
          will never expire.

          Reasons to wait:

          If the current server has been down for longer than the
          maximum-client-lead-time, and it is partitioned from the other
          server, then when it returns it will attempt to use its own
          available addresses to allocate to new DHCP clients, and the
          other server may well be in PARTNER-DOWN state and may have
          already allocated some of those available addresses to DHCP
          clients.  In cases where the possibility of partition is high,
          and the safe period expiration time is less than the likely
          operator reaction time, this is a good approach to use.
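
   The choice of previous-state in steps 2 and 3, and the override
   applied in step 6 once a STATE message arrives, can be sketched as
   two small functions.  This is an illustrative sketch only: the
   function and parameter names are hypothetical, states are
   represented as plain strings, and times are assumed to be comparable
   numbers such as POSIX timestamps.

```python
def startup_previous_state(stored_state, lost_stable_storage, is_primary):
    """Steps 2 and 3: choose the previous-state used during STARTUP."""
    if stored_state is not None:
        # A previous failover state was found in stable storage.
        previous = stored_state
    elif lost_stable_storage:
        # Configuration says this server ran before but lost its storage.
        previous = "RECOVER"
    else:
        # First-ever startup of this server.
        previous = "PARTNER-DOWN" if is_primary else "RECOVER"
    if previous == "NORMAL":
        # Step 3: never resume directly in NORMAL.
        previous = "COMMUNICATIONS-INTERRUPTED"
    return previous


def state_after_contact(previous_state, partner_state,
                        partner_state_start, last_operation_time):
    """Step 6: choose the current state once communications are okay."""
    if (partner_state == "PARTNER-DOWN"
            and partner_state_start > last_operation_time):
        # The partner went PARTNER-DOWN after this server last ran.
        return "RECOVER"
    return previous_state
```

   For example, a server that stored NORMAL before crashing would resume
   with COMMUNICATIONS-INTERRUPTED as its previous-state, and would move
   to RECOVER instead if its partner reports a later entry into
   PARTNER-DOWN state.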

9.4.  PARTNER-DOWN state

   PARTNER-DOWN state is a state either server can enter.  When in this
   state, the server does not assume that the other server could still
   be operating and servicing a different set of clients, but instead
   assumes that it is the only server operating.  For this reason, only
   one server should be operating in this state at a time.

9.4.1.  Upon entry to PARTNER-DOWN state

   No special actions are required when entering PARTNER-DOWN state.

   The server should continue to attempt to connect to the partner
   periodically.

9.4.2.  Operation while in PARTNER-DOWN state

   A server in PARTNER-DOWN state MUST respond to DHCP client requests.
   It will allow renewal of all outstanding leases on IP addresses, and
   will allocate IP addresses from its own pool, and after a fixed
   period of time (the MCLT interval) has elapsed from entry into
   PARTNER-DOWN state, it will allocate IP addresses from the set of all
   available IP addresses.

   Once a server has entered NORMAL state, the PARTNER-DOWN state is
   entered only on command of an external agency (typically an adminis-
   trator of some sort) or after the expiration of an externally config-
   ured minimum safe-time after the beginning of COMMUNICATIONS-
   INTERRUPTED state.

   Any available IP address tagged as belonging to the other server (at
   entry to PARTNER-DOWN state) MUST NOT be used until the maximum-
   client-lead-time beyond the entry into PARTNER-DOWN state has
   elapsed.

   A server in PARTNER-DOWN state MUST NOT allocate an IP address to a
   DHCP client different from that to which it was allocated at the
   entrance to PARTNER-DOWN state until the maximum-client-lead-time
   beyond its expiration time has elapsed.  If this time would be
   earlier than the current time plus the maximum-client-lead-time, then
   the current time plus the maximum-client-lead-time is used.
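
   The two waiting rules above reduce to simple time arithmetic.  The
   sketch below is a hedged illustration: the function names are
   hypothetical, times are plain numbers in seconds, and "the current
   time" in the rule is assumed to mean the time of entry into
   PARTNER-DOWN state, which is one reading of the text rather than a
   confirmed interpretation.

```python
def may_use_partner_addresses(now, entered_partner_down, mclt):
    """Available addresses tagged as the partner's may be used only once
    the MCLT has elapsed since entry into PARTNER-DOWN state."""
    return now >= entered_partner_down + mclt


def earliest_reallocation(lease_expiry, entered_partner_down, mclt):
    """Earliest time an address may be given to a different client.

    Assumption: the later of (lease expiry + MCLT) and
    (entry into PARTNER-DOWN + MCLT) is used.
    """
    return max(lease_expiry, entered_partner_down) + mclt
```

   With an MCLT of 3600 seconds, a lease that expired 1000 seconds
   before entry into PARTNER-DOWN state could be reallocated 3600
   seconds after entry, not 2600 seconds after entry.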

   Two options exist for lease times given out while in PARTNER-DOWN
   state, with different ramifications flowing from each.

   If the server wishes the Failover protocol to protect it from loss of
   stable storage in PARTNER-DOWN state, then it should ensure that the
   MCLT based lease time restrictions in Section 5.1 are maintained,
   even in PARTNER-DOWN state.

   If the server wishes to forego the protection of the Failover proto-
   col in the event of loss of stable storage, then it need recognize no
   restrictions on actual client lease times while in PARTNER-DOWN
   state.

   A server in PARTNER-DOWN state MUST attempt to establish
   communications and synchronization with its partner.

9.4.3.  Transitions out of PARTNER-DOWN state

   When a server in PARTNER-DOWN state succeeds in establishing a
   connection to its partner, its actions are conditional on the state
   and flags received in the STATE message from the other server as part
   of the process of establishing the connection.

   If the STARTUP bit is set in the server-flags option of a received
   STATE message, the server in PARTNER-DOWN state MUST NOT take any
   state transitions based on reestablishing communications.
   Essentially, if a server is in PARTNER-DOWN state, it ignores all
   STATE messages from its partner that have the STARTUP bit set in the
   server-flags option of the STATE message.

   If the STARTUP bit is not set in the server-flags option of a STATE
   message received from its partner, then a server in PARTNER-DOWN
   state takes the following actions based on the value of the
   server-state option in the received STATE message:

      o partner in NORMAL, COMMUNICATIONS-INTERRUPTED, PARTNER-DOWN or
        POTENTIAL-CONFLICT state

        transition to POTENTIAL-CONFLICT state

      o partner in RECOVER state

        stay in PARTNER-DOWN state

      o partner in RECOVER-DONE state

        transition into NORMAL state
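
   The transitions out of PARTNER-DOWN state can be written as a single
   lookup.  This is a sketch for illustration: the function name and
   the use of strings for states are assumptions, and partner states
   not listed in this section are rejected rather than guessed at.

```python
def partner_down_transition(partner_state, startup_bit):
    """Return the next state for a server currently in PARTNER-DOWN,
    given the server-state and STARTUP bit from the partner's STATE
    message."""
    if startup_bit:
        # STATE messages with the STARTUP bit set are ignored.
        return "PARTNER-DOWN"
    if partner_state in ("NORMAL", "COMMUNICATIONS-INTERRUPTED",
                         "PARTNER-DOWN", "POTENTIAL-CONFLICT"):
        return "POTENTIAL-CONFLICT"
    if partner_state == "RECOVER":
        return "PARTNER-DOWN"
    if partner_state == "RECOVER-DONE":
        return "NORMAL"
    raise ValueError("partner state not covered here: %r" % partner_state)
```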

9.5.  RECOVER state

   This state indicates that the server has no information in its stable
   storage or that it is re-integrating with a server in PARTNER-DOWN
   state after it has been down.  A server in this state will attempt to
   refresh its stable storage from the other server.

9.5.1.  Operation in RECOVER state

   A server in RECOVER state MUST NOT respond to DHCP client requests.

   A server in RECOVER state will attempt to reestablish communications
   with the other server.

9.5.2.  Transitions out of RECOVER state

   If the other server is in POTENTIAL-CONFLICT state when communica-
   tions are reestablished, then the server in RECOVER state will move
   to POTENTIAL-CONFLICT state itself.

   If the other server is in RECOVER state, then this server SHOULD
   signal an error and halt processing.

   If the other server is in any other state, then the server in RECOVER
   state will request an update of missing binding information by
   sending an UPDREQ message.  If the server has been instructed
   (through configuration or other external agency) that it has lost its
   stable storage, it MUST send an UPDREQALL message, otherwise it MUST
   send an UPDREQ message.

   It will wait for an UPDDONE message, and upon receipt of that message
   it will start a timer whose expiration is set to a time equal to the
   time the server went down (if known) or the current time (if the
   down-time is unknown) plus the maximum-client-lead-time.  When this
   timer goes off, the server will transition into RECOVER-DONE state.
   This wait allows clients holding IP addresses that this server
   allocated before it lost its client binding information in stable
   storage to contact the other server, or their leases to time out.

   See Figure 9.5.2-1.

   DISCUSSION:

      The actual requirement on this wait period in RECOVER is that it
      start when the recovering server went down, not necessarily when
      it came back up.  If the time when the recovering server failed is
      known, then it could be communicated to the recovering server, and
      the wait period could be reduced to the maximum-client-lead-time
      less the difference between the current time and the time the
      server failed. In this way, the waiting period could be minimized.
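
      As a worked sketch of this timer, assuming times are expressed
      in seconds on a common clock (the function and parameter names
      below are hypothetical and not part of the protocol):

```python
def recover_done_time(now, mclt, down_time=None):
    """Time at which the RECOVER wait ends.

    The wait starts when the server went down if that time is known,
    otherwise at the current time, and lasts for the MCLT.
    """
    start = down_time if down_time is not None else now
    return start + mclt


def remaining_wait(now, mclt, down_time=None):
    """Seconds still to wait; never negative."""
    return max(0, recover_done_time(now, mclt, down_time) - now)
```

      For example, with an MCLT of 3600 seconds, a server that failed
      600 seconds ago only has to wait another 3000 seconds, while a
      server that does not know when it failed waits the full 3600.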

   If an UPDDONE message isn't received within an implementation
   dependent amount of time, and no BNDUPD messages are being received,
   then the UPDREQ(ALL) message will be re-transmitted.

                A                                        B
              Server                                  Server

                |                                        |
             RECOVER                               PARTNER-DOWN
                |                                        |
                | >--UPDREQ-------------------->         |
                |                                        |
                |        <---------------------BNDUPD--< |
                | >--BNDACK-------------------->         |
               ...                                      ...
                |                                        |
                |        <---------------------BNDUPD--< |
                | >--BNDACK-------------------->         |
                |                                        |
                |        <--------------------UPDDONE--< |
                |                                        |
       Wait MCLT from last known                         |
          time of operation                              |
                |                                        |
           RECOVER-DONE                                  |
                |                                        |
                | >--STATE-(RECOVER-DONE)------>         |
                |                                     NORMAL
                |                                        |
                |        <-------------(NORMAL)-STATE--< |
             NORMAL                                      |
                |                                        |
                |                                        |

              Figure 9.5.2-1:  Transition out of RECOVER state

9.6.  NORMAL state

   NORMAL state is the state used by a server when it can communicate
   with the other server.  When in this state, the primary responds to
   all client requests while the secondary only responds to renewal or
   rebinding requests which it receives.  This is one of the few states
   where the operation of the primary and secondary servers is quite
   different.

9.6.1.  Upon Entry to NORMAL state

   When entering NORMAL state, a server will send to the other server
   all currently unacknowledged binding updates as BNDUPD messages.

   When the above process is complete, if the server entering NORMAL
   state is a secondary server, then it will request IP addresses for
   allocation using the POOLREQ message.

9.6.2.  Processing DHCP client requests and load balancing

   When in NORMAL state, each server MUST process all requests from some
   DHCP clients, and MUST NOT process any request other than a
   DHCPREQUEST/RENEWAL or a DHCPREQUEST/REBINDING request from some
   other DHCP clients.  The load balancing algorithm determines into
   which set a particular DHCP client falls.

   As discussed in section 5.3, each server will take the
   client-identifier from each DHCP client request (or the htype
   concatenated to the front of the chaddr if no client-identifier is
   present in the DHCP request), and hash it with the algorithm given in
   section 12.  The result of this hash algorithm is a number between 0
   and 255.  This number is used to index into the bit array received by
   a server in the hash-bucket-assignment option (if the server is a
   secondary), or into the inverse of the bit array sent to the
   secondary in the hash-bucket-assignment option (if the server is a
   primary).

   If the bit found from this indexing process is a 1 bit, then the
   server MUST process this DHCP request.

   In NORMAL state, a server processes every DHCPREQUEST/RENEWAL or
   DHCPREQUEST/REBINDING request it receives.
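
   The indexing step described above might be sketched as follows.
   This is a minimal illustration, not a definitive implementation:
   the function names are hypothetical, the hash of section 12 is not
   reproduced (its 0..255 result is taken as an input), and the bit
   ordering inside the 32-octet array (bucket 0 is the most significant
   bit of the first octet) is an assumption made for this example.

```python
def hash_input(client_id, htype, chaddr):
    """Bytes to be hashed: the client-identifier, else htype followed by chaddr."""
    if client_id:
        return bytes(client_id)
    return bytes([htype]) + bytes(chaddr)


def bucket_bit(bit_array, bucket):
    """Return the bit for a hash bucket in a 256-bit (32-octet) array.

    Assumption: bucket 0 is the most significant bit of octet 0.
    """
    if len(bit_array) != 32:
        raise ValueError("hash-bucket-assignment bit array must be 32 octets")
    if not 0 <= bucket <= 255:
        raise ValueError("bucket must be between 0 and 255")
    octet = bit_array[bucket // 8]
    return bool((octet >> (7 - bucket % 8)) & 1)


def must_process(bit_array, bucket, is_primary):
    """True if this server is the one that must process the request.

    The secondary uses the array as received; the primary uses its inverse.
    """
    bit = bucket_bit(bit_array, bucket)
    return (not bit) if is_primary else bit
```

   Because the primary indexes the inverse of the array it sent, exactly
   one of the two servers processes any given hash bucket.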

9.6.3.  Operation in NORMAL state

   When in NORMAL state, for every DHCP client request that it
   processes, as determined by the algorithm described in section 9.6.2
   above, a server will operate in the following manner:

      o Lease time calculations

        As discussed in section 5.2.1, "Control of lease time", the
        lease interval given to a DHCP client can never be more than the
        MCLT greater than the most recently received
        potential-expiration-time from the failover partner or the
        current time, whichever is later.

        As long as a server adheres to this constraint, the specifics of
        the lease interval that it gives to a DHCP client or the value
        of the potential-expiration-time sent to its failover partner
        are implementation dependent.  One possible approach is
        discussed in section 5.2.1, but that particular approach is in
        no way required by this protocol.

      o Lazy update of partner server

        After an ACK of an IP address binding, the server servicing a
        DHCP client request attempts to update its partner with the new
        binding information.  The lease time used in the update of the
        partner MUST be at least that given to the DHCP client in the
        DHCPACK, and the potential-expiration-time MUST be at least the
        lease time, and SHOULD be longer.

      o Reallocation of IP addresses between clients

        Whenever a client binding is released or expires, a BNDUPD
        message must be sent to the partner, setting the binding state
        to RELEASED or EXPIRED.  However, until a BNDACK is received for
        this message, the IP address cannot be allocated to another
        client.  It can be allocated to the same client again.

   In NORMAL state, each server receives binding updates from its
   partner server in BNDUPD messages.  It records these in its client
   binding database in stable storage and then sends a corresponding
   BNDACK message to the partner server.  It MUST ensure that the
   information is recorded in stable storage prior to sending the
   BNDACK message back to the partner server.
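
   The lease time constraint for NORMAL state can be illustrated with a
   short calculation.  This is a sketch under assumptions: times are in
   seconds, and the function and parameter names are hypothetical.  It
   shows one way to clamp a desired lease so that it never ends more
   than the MCLT beyond the later of the current time and the most
   recently received potential-expiration-time.

```python
def latest_allowed_expiration(now, mclt, partner_potential_expiration=None):
    """Latest lease expiration the constraint permits."""
    if partner_potential_expiration is None:
        base = now
    else:
        # "whichever is later" of the partner's value and the current time
        base = max(now, partner_potential_expiration)
    return base + mclt


def grant_expiration(now, desired_lease_time, mclt,
                     partner_potential_expiration=None):
    """Expiration actually given to the client: the desired one, clamped."""
    limit = latest_allowed_expiration(now, mclt, partner_potential_expiration)
    return min(now + desired_lease_time, limit)
```

   With an MCLT of 3600 seconds and no value yet received from the
   partner, a client asking for a one day lease at time 1000 is given a
   lease ending at 4600; once the partner has supplied a
   potential-expiration-time of 50000, the same request may run until
   53600.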

9.6.4.  Transitions out of NORMAL state

   If an external command is received by a server in NORMAL state
   informing it that its partner is down, then it transitions into
   PARTNER-DOWN state.

   If a server in NORMAL state fails to receive acks to messages sent to
   its partner for an implementation dependent period of time, it MAY
   move into COMMUNICATIONS-INTERRUPTED state.  This situation might
   occur if the partner server was capable of maintaining the TCP
   connection between the servers and also capable of sending a CONTACT
   message every tSend seconds, but was (for some reason) incapable of
   processing BNDUPD messages.

   If communications is determined to not be "ok" (as defined in
   section 8), then the server transitions into
   COMMUNICATIONS-INTERRUPTED state.

   If a server in NORMAL state receives any messages from its partner
   where the partner has changed state from that expected by the server
   in NORMAL state, then the server should transition into
   COMMUNICATIONS-INTERRUPTED state and take the appropriate state
   transition from there.  For example, it would be expected for the
   partner to transition from POTENTIAL-CONFLICT into NORMAL state, but
   not for the partner to transition from NORMAL into POTENTIAL-CONFLICT
   state.

9.7.  COMMUNICATIONS-INTERRUPTED State

   A server goes into COMMUNICATIONS-INTERRUPTED state whenever it is
   unable to communicate with the other server.  Primary and secondary
   servers cycle automatically (without administrative intervention)
   between NORMAL and COMMUNICATIONS-INTERRUPTED state as the network
   connection between them fails and recovers, or as the partner server
   cycles between operational and non-operational.  No duplicate IP
   address allocation can occur while the servers cycle between these
   states.

9.7.1.  Upon Entry to COMMUNICATIONS-INTERRUPTED state

   When a server enters COMMUNICATIONS-INTERRUPTED state, if it has been
   configured to support an automatic transition out of
   COMMUNICATIONS-INTERRUPTED state and into PARTNER-DOWN state (i.e., a
   "safe period" has been configured, see section 10), then a timer MUST
   be started for the length of the configured safe period.

   A server transitioning into the COMMUNICATIONS-INTERRUPTED state from
   the NORMAL state SHOULD raise some alarm condition to alert
   administrative staff to a potential problem in the DHCP subsystem.

9.7.2.  Operation in COMMUNICATIONS-INTERRUPTED State

   In this state a server MUST respond to all DHCP client requests, and
   the algorithm for load balancing described in section 5.3 MUST NOT be
   used.  When allocating new IP addresses, each server allocates from
   its own IP address pool, where the primary MUST allocate only FREE IP
   addresses, and the secondary MUST allocate only BACKUP IP addresses.
   When responding to renewal requests, each server will allow continued
   renewal of a DHCP client's current lease on an IP address
   irrespective of whether that lease was given out by the receiving
   server or not, although the renewal period MUST NOT exceed the
   maximum client lead time (MCLT) beyond the potential-expiration-time
   already acknowledged by the other server or the lease-expiration-time
   or potential-expiration-time received from the partner server.

   However, since the server cannot communicate with its partner in this
   state, the acknowledged-potential-expiration-time will not be updated
   in any new bindings.  This is likely to eventually cause the
   actual-client-lease-times to be the current-time plus the
   maximum-client-lead-time (unless this is greater than the
   desired-client-lease-time).
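
   The allocation rule for COMMUNICATIONS-INTERRUPTED state (the
   primary allocates only FREE addresses, the secondary only BACKUP
   addresses) could be sketched as below.  The data layout, a mapping
   from address to binding state, and the function name are assumptions
   made for illustration; the addresses come from a documentation
   range.

```python
def allocatable_addresses(bindings, is_primary):
    """Addresses this server may hand to new clients while partitioned.

    bindings maps an IP address string to its binding state string.
    """
    wanted = "FREE" if is_primary else "BACKUP"
    return sorted(ip for ip, state in bindings.items() if state == wanted)


example = {
    "192.0.2.10": "FREE",
    "192.0.2.11": "BACKUP",
    "192.0.2.12": "ACTIVE",
    "192.0.2.13": "FREE",
}
primary_pool = allocatable_addresses(example, is_primary=True)
secondary_pool = allocatable_addresses(example, is_primary=False)
```

   Keeping the two pools disjoint is what lets both servers allocate
   new addresses without being able to talk to each other.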

9.7.3.  Transition out of COMMUNICATIONS-INTERRUPTED State

   If the safe period timer expires while a server is in the
   COMMUNICATIONS-INTERRUPTED state, it will transition immediately into
   PARTNER-DOWN state.

   If an external command is received by a server in
   COMMUNICATIONS-INTERRUPTED state informing it that its partner is
   down, it will transition immediately into PARTNER-DOWN state.

   If communications is restored with the other server, then the server
   in COMMUNICATIONS-INTERRUPTED state will transition into another
   state based on the state of the partner:

      o partner in NORMAL or COMMUNICATIONS-INTERRUPTED

        Transition into the NORMAL state.

      o partner in RECOVER

        Stay in COMMUNICATIONS-INTERRUPTED state.

      o partner in RECOVER-DONE

        Transition into NORMAL state.

      o partner in PARTNER-DOWN or POTENTIAL-CONFLICT

        Transition into POTENTIAL-CONFLICT state.

      o partner in PAUSED

        Stay in COMMUNICATIONS-INTERRUPTED state.

      o partner in SHUTDOWN

        Transition into PARTNER-DOWN state.
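
   The reconnection rules in the list above can be sketched as a lookup
   table.  This is only an illustration; the table and function names
   are hypothetical, and states are written as the strings used in this
   document.

```python
# Partner state on reconnect -> state the server in
# COMMUNICATIONS-INTERRUPTED state moves to (or stays in).
AFTER_RECONNECT = {
    "NORMAL": "NORMAL",
    "COMMUNICATIONS-INTERRUPTED": "NORMAL",
    "RECOVER": "COMMUNICATIONS-INTERRUPTED",
    "RECOVER-DONE": "NORMAL",
    "PARTNER-DOWN": "POTENTIAL-CONFLICT",
    "POTENTIAL-CONFLICT": "POTENTIAL-CONFLICT",
    "PAUSED": "COMMUNICATIONS-INTERRUPTED",
    "SHUTDOWN": "PARTNER-DOWN",
}


def state_after_reconnect(partner_state):
    """Look up the next state; unknown partner states raise KeyError."""
    return AFTER_RECONNECT[partner_state]
```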

   The following figure illustrates the transition from NORMAL to
   COMMUNICATIONS-INTERRUPTED state and then back to NORMAL state again.

             Primary                                Secondary
              Server                                  Server

              NORMAL                                  NORMAL
                | >--CONTACT------------------->         |
                |        <--------------------CONTACT--< |
                |         [TCP connection broken]        |
           COMMUNICATIONS          :              COMMUNICATIONS
             INTERRUPTED           :                INTERRUPTED
                |      [attempt new TCP connection]      |
                |         [connection succeeds]          |
                |                                        |
                | >--CONNECT------------------->         |
                |        <-----------------CONNECTACK--< |
                |        <-------------------STATE-----< |
                |                                     NORMAL
                | >--STATE--------------------->         |
              NORMAL                                     |
                | >--BNDUPD-------------------->         |
                |        <---------------------BNDACK--< |
                |                                        |
                |        <---------------------BNDUPD--< |
                | >------BNDACK---------------->         |
               ...                                      ...
                |                                        |
                |        <--------------------POOLREQ--< |
                | >--POOLRESP-(2)-------------->         |
                |                                        |
                | >--BNDUPD-(#1)--------------->         |
                |        <---------------------BNDACK--< |
                |                                        |
                |        <--------------------POOLREQ--< |
                | >--POOLRESP-(0)-------------->         |
                |                                        |
                | >--BNDUPD-(#2)--------------->         |
                |        <---------------------BNDACK--< |
                |                                        |

       Figure 9.7.3-1:  Transition from NORMAL to COMMUNICATIONS-
                        INTERRUPTED and back (example with 2
                        addresses allocated to secondary)

9.8.  POTENTIAL-CONFLICT state

   This state indicates that the two servers are attempting to
   reintegrate with each other, but at least one of them was running in
   a state that did not guarantee automatic reintegration would be
   possible.  In POTENTIAL-CONFLICT state the servers may determine that
   the same IP address has been offered and accepted by two different
   DHCP clients.

   It is a goal of this protocol to minimize the possibility that
   POTENTIAL-CONFLICT state is ever entered.

9.8.1.  Upon Entry to POTENTIAL-CONFLICT state

   When a primary server enters POTENTIAL-CONFLICT state it should
   request that the secondary send it all updates of which it is
   currently unaware by sending an UPDREQ message to the secondary
   server.

   A secondary server entering POTENTIAL-CONFLICT state will wait for
   the primary to send it an UPDREQ message.

9.8.2.  Operation in POTENTIAL-CONFLICT state

   Any server in POTENTIAL-CONFLICT state MUST NOT process any incoming
   DHCP requests.

9.8.3.  Transitions out of POTENTIAL-CONFLICT state

   If communications fails with the partner while in POTENTIAL-CONFLICT
   state, then a primary server will transition to PARTNER-DOWN state
   and a secondary server will stay in POTENTIAL-CONFLICT state.

   Whenever either server receives an UPDDONE message from its partner
   while in POTENTIAL-CONFLICT state, it MUST transition to NORMAL
   state.  This will cause the primary server to leave
   POTENTIAL-CONFLICT state prior to the secondary, since the primary
   sends an UPDREQ message and receives an UPDDONE before the secondary
   sends an UPDREQ message and receives its UPDDONE message.

   When a secondary server receives an indication that the primary
   server has transitioned from POTENTIAL-CONFLICT to NORMAL state, it
   SHOULD send an UPDREQ message to the primary server.

              Primary                                Secondary
              Server                                  Server

                |                                        |
         POTENTIAL-CONFLICT                    POTENTIAL-CONFLICT
                |                                        |
                | >--UPDREQ-------------------->         |
                |                                        |
                |        <---------------------BNDUPD--< |
                | >--BNDACK-------------------->         |
               ...                                      ...
                |                                        |
                |        <---------------------BNDUPD--< |
                | >--BNDACK-------------------->         |
                |                                        |
                |        <--------------------UPDDONE--< |
              NORMAL                                     |
                | >--STATE--(NORMAL)----------->         |
                |        <---------------------UPDREQ--< |
                |                                        |
                | >--BNDUPD-------------------->         |
                |        <---------------------BNDACK--< |
               ...                                      ...
                | >--BNDUPD-------------------->         |
                |        <---------------------BNDACK--< |
                |                                        |
                | >--UPDDONE------------------->         |
                |                                     NORMAL
                |                                        |
                |        <--------------------POOLREQ--< |
                | >------POOLRESP-(?)---------->         |
                |                                        |

           Figure 9.8.3-1:  Transition out of POTENTIAL-CONFLICT
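
   The transitions out of POTENTIAL-CONFLICT state described in this
   section can be summarized as a small decision function.  The
   following is only an illustrative sketch under stated assumptions:
   the enumerations, their member names, and the function
   pc_next_state() are hypothetical and are not defined by this
   protocol.

```c
/* Hypothetical names, for illustration only. */
enum pc_role  { PC_PRIMARY, PC_SECONDARY };
enum pc_state { PC_POTENTIAL_CONFLICT, PC_NORMAL, PC_PARTNER_DOWN };
enum pc_event { PC_EV_COMM_FAILED, PC_EV_UPDDONE_RECEIVED };

/*
 * Return the state that a server currently in POTENTIAL-CONFLICT
 * moves to when the given event occurs.
 */
enum pc_state pc_next_state(enum pc_role role, enum pc_event ev)
{
    switch (ev) {
    case PC_EV_UPDDONE_RECEIVED:
        /* Either server: an UPDDONE from the partner leads to NORMAL. */
        return PC_NORMAL;
    case PC_EV_COMM_FAILED:
        /* The primary moves to PARTNER-DOWN; the secondary stays put. */
        return role == PC_PRIMARY ? PC_PARTNER_DOWN
                                  : PC_POTENTIAL_CONFLICT;
    }
    /* Any other event leaves the server in POTENTIAL-CONFLICT. */
    return PC_POTENTIAL_CONFLICT;
}
```

   Because the primary sends its UPDREQ first, it is normally the first
   of the two servers to see PC_EV_UPDDONE_RECEIVED in this sketch.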

9.9.  RECOVER-DONE state

   This state exists to allow an interlocked transition for one server
   from RECOVER state and another server from PARTNER-DOWN or
   COMMUNICATIONS-INTERRUPTED state into NORMAL state.

9.9.1.  Operation in RECOVER-DONE state

   A server in RECOVER-DONE state MUST respond only to
   DHCPREQUEST/RENEWAL and DHCPREQUEST/REBINDING messages.

9.9.2.  Transitions out of RECOVER-DONE state

   When a server in RECOVER-DONE state determines that its partner
   server has entered NORMAL state, then it will transition into NORMAL
   state as well.

9.10.  PAUSED state

   This state exists to allow one server to inform another that it will
   be out of service for what is predicted to be a relatively short
   time, and to allow the other server to transition to COMMUNICATIONS-
   INTERRUPTED state immediately and to begin servicing all DHCP clients
   with no interruption in service to new DHCP clients.

   A server which is aware that it is shutting down temporarily SHOULD
   send a STATE message with the server-state option containing PAUSED
   state.

   While a server may or may not transition internally into PAUSED
   state, the 'previous' state determined when it is restarted MUST be
   the state the server was in prior to receiving the command to shut
   down and restart and which precedes its entry into the PAUSED state.
   See section 9.3.2 concerning the use of the previous state upon
   server restart.

9.10.1.  Upon entry to PAUSED state

   When entering PAUSED state, the server MUST store the previous state
   in stable storage, and use that state as the previous state when it
   is restarted.

9.10.2.  Transitions out of PAUSED state

   A server transitions out of PAUSED state by being restarted.  At that
   time, the previous state MUST be the state the server was in prior to
   entering the PAUSED state.

9.11.  SHUTDOWN state

   This state exists to allow one server to inform another that it will
   be out of service for what is predicted to be a relatively long time,
   and to allow the other server to transition immediately to PARTNER-
   DOWN state, and take over completely for the server going down.

   A server which is aware that it is shutting down SHOULD send a STATE
   message with the server-state field containing SHUTDOWN.

   While a server may or may not transition internally into SHUTDOWN
   state, the 'previous' state determined when it is restarted MUST be
   the state active prior to the command to shut down.  See section
   9.3.2 concerning the use of the previous state upon server restart.

9.11.1.  Upon entry to SHUTDOWN state

   When entering SHUTDOWN state, the server MUST record the previous
   state in stable storage for use when the server is restarted.  It
   also MUST record the current time as the last time operational.

   A server which is aware that it is shutting down SHOULD send a STATE
   message with the server-state field containing SHUTDOWN.

9.11.2.  Operation in SHUTDOWN state

   A server in SHUTDOWN state MUST NOT respond to any DHCP client input.

   If a server receives any message indicating that the partner has
   moved to PARTNER-DOWN state while it is in SHUTDOWN state then it
   MUST record RECOVER state as the previous state to be used when it is
   restarted.

   A server SHOULD wait for a few seconds after informing the partner of
   entry into SHUTDOWN state (if communications are okay) to determine
   if it will enter PARTNER-DOWN state.

9.11.3.  Transitions out of SHUTDOWN state

   A server transitions out of SHUTDOWN state by being restarted.
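
   The bookkeeping required around SHUTDOWN state can be sketched as
   follows.  This is a minimal illustration under stated assumptions,
   not a definitive implementation: the structure sd_record stands in
   for whatever stable storage an implementation uses, and every name
   in the sketch (sd_state, sd_record, sd_enter_shutdown,
   sd_partner_in_partner_down) is hypothetical.

```c
#include <time.h>

/* Hypothetical names, for illustration only. */
enum sd_state {
    SD_NORMAL,
    SD_COMMUNICATIONS_INTERRUPTED,
    SD_PARTNER_DOWN,
    SD_RECOVER
};

/* Stand-in for the record an implementation keeps in stable storage. */
struct sd_record {
    enum sd_state previous_state;        /* state to resume from on restart */
    time_t        last_time_operational; /* when the server stopped serving */
};

/*
 * Called when the server enters SHUTDOWN state: remember the state
 * that was active before the shutdown command and the current time.
 */
void sd_enter_shutdown(struct sd_record *rec,
                       enum sd_state state_before_shutdown,
                       time_t now)
{
    rec->previous_state = state_before_shutdown;
    rec->last_time_operational = now;
}

/*
 * Called if, while in SHUTDOWN state, a message shows that the partner
 * has moved to PARTNER-DOWN: the recorded previous state becomes
 * RECOVER.  The last-time-operational value is left untouched.
 */
void sd_partner_in_partner_down(struct sd_record *rec)
{
    rec->previous_state = SD_RECOVER;
}
```

   A real implementation would write the record to stable storage
   before the process exits; the in-memory structure here only shows
   which values are recorded and when.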

10.  Safe Period

   Due to the restrictions imposed on each server while in
   COMMUNICATIONS-INTERRUPTED state, long-term operation in this state
   is not feasible for either server.  One reason that these states
   exist at all is to allow the servers to easily survive transient
   network communications failures of a few minutes to a few days
   (although the actual time periods will depend a great deal on the
   DHCP activity of the network in terms of arrival and departure of
   DHCP clients on the network).

   Eventually, when the servers are unable to communicate, they will
   have to move into a state where they no longer can re-integrate
   without some possibility of a duplicate IP address allocation.
   There are two ways that they can move into this state (known as
   PARTNER-DOWN).

   They can either be informed by external command that, indeed, the
   partner server is down.  In this case, there is no difficulty in
   moving into the PARTNER-DOWN state since it is an accurate reflection
   of reality and the protocol has been designed to operate correctly
   (even during reintegration) if, when in PARTNER-DOWN state the
   partner is, indeed, down.

   The more difficult scenario is when the servers are running
   unattended for extended periods, and in this case an option is
   provided to configure something called a "safe-period" into each
   server.  This OPTIONAL safe-period is the period after which either
   the primary or secondary server will automatically transition to
   PARTNER-DOWN from COMMUNICATIONS-INTERRUPTED state.  If this
   transition is completed and the partner is not down, then the
   possibility of duplicate IP address allocations will exist.

   The goal of the "safe-period" is to allow network operations staff
   some time to react to a server moving into COMMUNICATIONS-INTERRUPTED
   state.  During the safe-period the only requirement is that the
   network operations staff determine if both servers are still running
   -- and if they are, to either fix the network communications failure
   between them, or to take one of the servers down before the
   expiration of the safe-period.

   The length of the safe-period is installation dependent, and depends
   in large part on the number of unallocated IP addresses within the
   subnet address pool and the expected frequency of arrival of
   previously unknown DHCP clients requiring IP addresses.  Many
   environments should be able to support safe-periods of several days.

   During this safe period, either server will allow renewals from any
   existing client.  The only limitation concerns the need for IP
   addresses for the DHCP server to hand out to new DHCP clients and the
   need to re-allocate IP addresses to different DHCP clients.

   The number of "extra" IP addresses required is equal to the expected
   total number of new DHCP clients encountered during the safe period.
   This is dependent only on the arrival rate of new DHCP clients, not
   the total number of outstanding leases on IP addresses.

   In the unlikely event that a relatively short safe period of an hour
   is all that can be used (given a dearth of IP addresses or a very
   high arrival rate of new DHCP clients), even that can provide
   substantial benefits in allowing the DHCP subsystem to ride through
   minor problems that could occur and be fixed within that hour.  In
   these cases, no possibility of duplicate IP address allocation
   exists, and re-integration after the failure is solved will be
   automatic and require no operator intervention.
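
   The sizing rule above (extra addresses needed equals the number of
   new clients expected during the safe period) amounts to simple
   arithmetic, sketched below.  The function names, the choice of hours
   as the time unit, and the rounding up to a whole address are
   assumptions made for illustration; the protocol does not define
   them.

```c
/*
 * Expected number of "extra" addresses consumed during the safe
 * period: arrival rate of new clients times the safe-period length,
 * rounded up to a whole address.  Non-positive inputs yield 0.
 */
unsigned long safe_period_extra_addresses(double new_clients_per_hour,
                                          double safe_period_hours)
{
    double expected;
    unsigned long whole;

    if (new_clients_per_hour <= 0.0 || safe_period_hours <= 0.0)
        return 0;

    expected = new_clients_per_hour * safe_period_hours;
    whole = (unsigned long) expected;          /* truncates toward zero */
    return (expected > (double) whole) ? whole + 1 : whole;
}

/* Non-zero if the unallocated addresses cover the expected demand. */
int safe_period_fits(unsigned long unallocated_addresses,
                     double new_clients_per_hour,
                     double safe_period_hours)
{
    return safe_period_extra_addresses(new_clients_per_hour,
                                       safe_period_hours)
           <= unallocated_addresses;
}
```

   For instance, with an assumed arrival rate of 2 new clients per hour
   and a 72-hour safe period, the sketch yields 144 addresses.  Note
   that the number of outstanding leases does not enter the
   calculation.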

11.  Security

   It is very desirable to assure the integrity of failover partners and
   to thus ensure proper operation of the servers.  For example, denial
   of service attacks are possible by the communication of invalid state
   information to both servers.

   The Failover protocol MAY be secured either by using a simple shared
   secret message digest which covers each message or by using TLS [TLS]
   (Transport Layer Security).

11.1.  Simple shared secret

   A simple shared secret message digest MAY be used to cover each
   message.  Since there are a number of configuration parameters that
   must already be the same on each server in a pair, it is not
   unreasonable to require a shared secret to be configured as well.

   Only information within the packet and covered by the message digest
   is used for operation of the protocol.  It is for this reason that
   the IP address of the sending server is sent in the
   sending-server-IP-address option of the CONNECT and CONNECTACK
   messages.

   This message digest is placed in the message-digest option.  The
   digest covers the message prior to the inclusion of the
   message-digest option.

11.2.  TLS

   TLS, Transport Layer Security, as specified in [TLS] MAY be used.
   The use of TLS would be similar to the way it is used with SMTP
   [SMTPTLS] and IMAP/POP3/ACAP [IMAPTLS].

   To request the use of TLS, the server that successfully opened a
   connection to its peer MUST send the TLS option as part of the
   CONNECT message.  The server receiving the TLS option MUST respond
   with a TLS-reply option indicating its acceptance or rejection of the
   TLS-request in the CONNECT message.

   If the CONNECTACK message contained a TLS-reply of 1, then both
   servers begin TLS negotiation.

   Upon completion of this negotiation, the server which originally sent
   the CONNECT message MUST resend its CONNECT message without any
   TLS-request, and must wait for a corresponding CONNECTACK.

   Implementation of the TLS_DHE_DSS_WITH_3DES_EDE_CBC_SHA [TLS] cipher
   suite is REQUIRED in Failover servers supporting TLS.  This is
   important as it assures that any two compliant implementations can be
   configured to interoperate.

12.  Hash algorithm for load balancing

   The following hash function is an implementation of the algorithm
   known as "Pearson's hash".  The Pearson's hash algorithm was
   originally published in the Communications of the ACM, Vol. 33,
   No. 6 (June 1990), pp. 677-680.  The author, Peter K. Pearson, has
   kindly granted his permission to use this algorithm, free of any
   encumbrances.

   To make primary-backup load balancing possible, both servers MUST use
   the same hash function.

    /* A "mixing table" of 256 distinct values, in pseudo-random order. */

    unsigned char failover_hash_mx_tbl[256] =
    {
    251, 175, 119, 215,  81,  14,  79, 191, 103,  49,
    181, 143, 186, 157,   0, 232,  31,  32,  55,  60,
    152,  58,  17, 237, 174,  70, 160, 144, 220,  90,
     57, 223,  59,   3,  18, 140, 111, 166, 203, 196,
    134, 243, 124,  95, 222, 179, 197,  65, 180,  48,
     36,  15, 107,  46, 233, 130, 165,  30, 123, 161,
    209,  23,  97,  16,  40,  91, 219,  61, 100,  10,
    210, 109, 250, 127,  22, 138,  29, 108, 244,  67,
    207,   9, 178, 204,  74,  98, 126, 249, 167, 116,
     34,  77, 193, 200, 121,   5,  20, 113,  71,  35,
    128,  13, 182,  94,  25, 226, 227, 199,  75,  27,
     41, 245, 230, 224,  43, 225, 177,  26, 155, 150,
    212, 142, 218, 115, 241,  73,  88, 105,  39, 114,
     62, 255, 192, 201, 145, 214, 168, 158, 221, 148,
    154, 122,  12,  84,  82, 163,  44, 139, 228, 236,
    205, 242, 217,  11, 187, 146, 159,  64,  86, 239,
    195,  42, 106, 198, 118, 112, 184, 172,  87,   2,
    173, 117, 176, 229, 247, 253, 137, 185,  99, 164,
    102, 147,  45,  66, 231,  52, 141, 211, 194, 206,
    246, 238,  56, 110,  78, 248,  63, 240, 189,  93,
     92,  51,  53, 183,  19, 171,  72,  50,  33, 104,
    101,  69,   8, 252,  83, 120,  76, 135,  85,  54,
    202, 125, 188, 213,  96, 235, 136, 208, 162, 129,
    190, 132, 156,  38,  47,   1,   7, 254,  24,   4,
    216, 131,  89,  21,  28, 133,  37, 153, 149,  80,
    170,  68,   6, 169, 234, 151
    };
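
    /*
     * key: the key to be hashed (e.g., MAC address)
     * len: length of key in bytes
     */
    unsigned char failover_p_hash( const unsigned char *key, int len )
    {
        /* Seed the hash with the key length (truncated to 8 bits). */
        unsigned char hash = (unsigned char) len;
        int i;

        /* Mix in the key bytes, from the last byte to the first. */
        for( i = len ; i > 0 ; )
        {
            hash = failover_hash_mx_tbl[ hash ^ key[ --i ] ];
        }
        return( hash );
    }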
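
   This section does not itself say how the 8-bit hash value is turned
   into a decision about which server answers a client.  The sketch
   below shows one possible mapping, purely as an assumption: a
   configured threshold ("split", in the range 0 to 256) below which
   the primary answers.  The names lb_primary_serves,
   lb_secondary_serves and split are hypothetical; hash_value is
   assumed to be the result of the hash function above applied to the
   client's identifying key.

```c
/*
 * Non-zero if the primary should answer a client whose key hashed to
 * hash_value.  A split of 0 sends every client to the secondary; a
 * split of 256 sends every client to the primary.
 */
int lb_primary_serves(unsigned char hash_value, unsigned int split)
{
    return hash_value < split;
}

/* The secondary answers exactly those clients the primary does not. */
int lb_secondary_serves(unsigned char hash_value, unsigned int split)
{
    return !lb_primary_serves(hash_value, split);
}
```

   Because both servers compute the same hash value for a given client
   and apply the same threshold, each client is claimed by exactly one
   server without any per-request coordination, which is why the hash
   function must be identical on both.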

13.  Acknowledgments

   Ralph Droms started it all, by sketching out an initial interserver
   draft that embodied ideas from several past IETF meetings.  In that
   draft, he acknowledged contributions by Jeff Mogul, Greg Minshall,
   Rob Stevens, Walt Wimer, Ted Lemon, and the DHC working group.

   Kim Kinnear and Bob Cole each extended that draft, separately and
   then together, until they created an interserver draft that supported
   any number of servers.  The complexity of that approach was just too
   great, and that draft wasn't greeted with enthusiasm by many, includ-
   ing its authors.

   It did however lead to a much simpler approach embodied in the first
   Failover draft by Greg Rabil, Mike Dooley, Arun Kapur and Ralph
   Droms.  This draft posited only two servers -- a primary and a secon-
   dary.

   Kim Kinnear then wrote the Safe Failover draft to layer on top of the
   Failover Draft and increase its robustness in the face of certain
   rare network failures.

   At the spring 1998 IETF meeting in LA, the DHC working group said
   that they wanted a merged Failover and Safe Failover draft.  Steve
   Gonczi and Bernie Volz stepped up and produced the raw material for
   such a merged draft, along with a new message format designed around
   DHCP options and other extensions and clarifications.  Kim Kinnear
   edited their work into draft format and made other changes in time
   for the Summer Chicago IETF meeting.

   During the summer and fall of 1998, two groups worked on separate
   implementations of the evolving UDP failover draft.  Bernie Volz and
   Steve Gonczi constituted one group, and Kim Kinnear, Mark Stapp and
   Paul Fox made up the other.  These two groups worked together to
   produce considerable changes and simplifications of the protocol
   during that period, and Steve Gonczi and Kim Kinnear edited those
   changes into the -03 draft in time for submission to the December
   1998 Orlando IETF meeting.

   In February of 1999 Kim Kinnear and Mark Stapp hosted a meeting of
   people interested in the failover draft.  During that meeting a gen-
   eral agreement was reached to recast the failover protocol to use TCP
   instead of UDP.  In addition, the group together brainstormed a work-
   able load-balancing technique.  Kim Kinnear volunteered to rewrite
   the entire draft to include the changes made at that meeting as well
   as to restructure the draft along guidelines suggested by Thomas Nar-
   ten.  The current draft represents the results of that effort.

   The initial idea for a hash-based load balancing approach was offered
   by Ted Lemon, and the determination of an algorithm and its integra-
   tion into the draft was done by Steve Gonczi.  The security section
   was spearheaded by Bernie Volz.  Both contributed considerably to the
   ideas and text in the rest of the draft with several reviews.

   These most recent changes have been widely circulated among the other
   authors, but that does not preclude any of them from expressing
   disagreement with what is contained in this draft at any future time.

   Many people have reviewed the various earlier drafts that went into
   this result.  At American Internet, ideas were contributed by Brad
   Parker.  At Cisco Systems, Paul Fox and Ellen Garvey have
   contributed greatly to the form of the protocol.

   Glenn Waters of Bay Networks contributed ideas and enthusiasm to make
   a Failover protocol that was both "safe" and "lazy".

   Many thanks to Peter K. Pearson, the author of Pearson's hash, who
   has kindly granted his permission to use this algorithm for DHCP load
   balancing, free of any encumbrances.

14.  References

   [RFC 2131] Droms, R., "Dynamic Host Configuration Protocol", RFC
      2131, March 1997.

   [RFC 2119] Bradner, S., "Key words for use in RFCs to Indicate
      Requirement Levels", RFC 2119, March 1997.

   [RFC 2132] Alexander, S., Droms, R., "DHCP Options and BOOTP Vendor
      Extensions", RFC 2132, March 1997.

   [TLS] Dierks, T., "The TLS Protocol, Version 1.0", RFC 2246, January
      1999.

   [SMTPTLS] Hoffman, P., "SMTP Service Extension for Secure SMTP over
      TLS", RFC 2487, January 1999.

   [IMAPTLS] Newman, C., "Using TLS with IMAP, POP3, and ACAP", RFC
      2595, June 1999.

   [NAMESPACE] Carney, M., "draft-ietf-dhc-option_review_and_namespace-
      00.txt", June 1999.

   [DDNS] Rekhter, Y., Stapp, M., "draft-ietf-dhc-dhcp-dns-10.txt",
      June 1999.

15.  Author's information

      Ralph Droms
      323 Dana Engineering
      Bucknell University
      Lewisburg, PA  17837

      Phone: (717) 524-1145
      EMail: droms@bucknell.edu

      Greg Rabil, Mike Dooley, Arun Kapur
      Lucent Technologies (Quadritek)
      10 Valley Stream Parkway, Suite 240
      Malvern, PA 19355

      Phone: (800) 208-2747

      EMail: grabil@lucent.com
             mdooley@lucent.com
             akapur@lucent.com

      Kim Kinnear
      Mark Stapp
      Cisco Systems
      250 Apollo Drive
      Chelmsford, MA  01824
      Phone: (978) 244-8000

      EMail: kkinnear@cisco.com
             mjs@cisco.com

      Bernie Volz
      Steve Gonczi
      Process Software Corporation
      959 Concord St.
      Framingham, MA  01701

      Phone: (508) 879-6994

      EMail: volz@process.com
             gonczi@process.com

16.  Full Copyright Statement

Copyright (C) The Internet Society (1999). All Rights Reserved.

This document and translations of it may be copied and furnished to oth-
ers, and derivative works that comment on or otherwise explain it or
assist in its implementation may be prepared, copied, published and dis-
tributed, in whole or in part, without restriction of any kind, provided
that the above copyright notice and this paragraph are included on all
such copies and derivative works.  However, this document itself may not
be modified in any way, such as by removing the copyright notice or
references to the Internet Society or other Internet organizations,
except as needed for the  purpose of developing Internet standards in
which case the procedures for copyrights defined in the Internet Stan-
dards process must be followed, or as required to translate it into
languages other than English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS
IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK
FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT
LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT
INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FIT-
NESS FOR A PARTICULAR PURPOSE.

Open Issues

   These issues need to be resolved:

      1.  We need to deal with the option space, and the procedures for
          managing it.  Probably IANA.

      2.  Figure out a better way to identify vendors.  How about an
          SNMP Enterprise MIB value?

      3.  Need more clarity in the conflict resolution section, probably
          backed up by real implementation experience.  Learned a lot
          from the UDP implementation and experience with it in the real
          world, and need equivalent learning from a TCP implementation
          with no messages out of order or lost.