Network Working Group                                        Ralph Droms
INTERNET DRAFT                                       Bucknell University
Obsoletes: draft-ietf-dhc-failover-00.txt
                                                              Greg Rabil
                                                             Mike Dooley
                                                              Arun Kapur
                                                       Quadritek Systems

                                                             Kim Kinnear
                                                       American Internet

                                                            Steve Gonczi
                                                             Bernie Volz
                                                        Process Software

                                                             August 1998
                                                      Expires March 1999

                         DHCP Failover Protocol
                    <draft-ietf-dhc-failover-02.txt>

Status of this Memo

   This	document is an Internet-Draft. Internet-Drafts are working
   documents of	the Internet Engineering Task Force (IETF), its	areas,
   and its working groups. Note	that other groups may also distribute
   working documents as	Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to	use Internet-Drafts as reference
   material or to cite them other than as ``work in progress.''

   To learn the	current	status of any Internet-Draft, please check the
   ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ftp.ietf.org (US East Coast), or
   ftp.isi.edu (US West	Coast).

Abstract

   DHCP	[RFC 2131] allows for multiple servers to be operating on a
   single network. Some	sites are interested in	running	multiple servers
   in such a way so as to provide redundancy in	case of	server failure.
   In order for	this to	work reliably, the cooperating Primary and
   Secondary servers must maintain a consistent	database of the	lease

DRAFT							    January 1998

   information.	 This implies that servers will	need to	coordinate any
   and all lease activity so that this information is synchronized in
   case	of failover.

   This document defines a protocol to provide this synchronization
   between two servers. One server is designated the "Primary" server,
   the other is the "Secondary" server. Additionally, this document
   describes a protocol for the automatic transfer of control from the
   Primary to the Secondary in the case of failure (failover), as well
   as in the case of a network partition.

   This document is a merge of draft-ietf-dhc-failover-01.txt and
   draft-ietf-dhc-safe-failover-proto-00.txt, along with substantial
   changes to each.  Unfortunately, this merge was not completed with
   sufficient time to allow review by any of the authors of
   draft-ietf-dhc-failover-01.txt, and so it may well not reflect their
   views even though their names appear as authors.  See Section 11,
   issue #1 and Section 12 for more details.

1.  Introduction

   As the use of DHCP servers in networked environments	grows, the
   dependency of those networks	on the DHCP server increases.  This is
   particularly	true of	the hosts that receive their configuration
   information from the	DHCP server.  Therefore, it is very important to
   be able to provide reliable,	continuous availability	of DHCP
   services.

   This	specification describes	a protocol to support automatic	failover
   from	a primary to its secondary server.  The	failover mechanism
   allows the secondary	server to perform DHCP actions while the primary
   is down, or when a network failure prevents the primary and secondary
   from communicating.  The protocol also specifies how reintegration is
   achieved when the primary becomes operational again or when the
   primary and secondary can again communicate.

   In providing	the specification for the failover, the	protocol
   specifies how to guarantee reliable delivery of changes to the
   secondary.  This is required to synchronize the secondary's lease
   data with that of the primary.  The protocol further specifies a
   mechanism to allow the secondary to determine if it can communicate
   with the primary server.  The secondary will be able to automatically
   begin to service DHCP requests whenever it cannot communicate with
   the primary.  When the primary server becomes available again, the
   secondary will convey any changes that occurred since the time of
   failover back to the primary.

   Through careful control of the difference between the lease times
   offered to DHCP clients and the lease time known by the secondary
   server, the protocol allows the primary to communicate with the
   secondary after the primary has completed communication with the DHCP
   client (a technique known as	"lazy" update) and still guarantee that
   duplicate IP	address	allocations do not occur.  Thus, the protocol
   does	not directly impact the	ability	of a DHCP server to respond to
   DHCP	client requests.

1.1.  Requirements Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are	to be interpreted as described in RFC 2119 [RFC	2119].

1.2.  DHCP Terminology

   This	document uses the following terms:

	o "DHCP	client"	or "client"

	  A DHCP client	is an Internet host using DHCP to obtain
          configuration parameters such as a network address.

	o "DHCP	server"	or "server"

          A DHCP server is an Internet host that returns configuration
	  parameters to	DHCP clients.

        o "binding"

          A binding is a collection of configuration parameters,
          including at least an IP address, associated with or "bound
          to" a DHCP client.  Bindings are managed by DHCP servers.

        o "binding database"

          The collection of bindings managed by a primary and secondary.

	o "subnet address pool"

          A subnet address pool is the set of IP addresses which is
          associated with a particular network number and subnet mask.
          In the simple case, there is a single network number and
          subnet mask and a set of IP addresses.  In the more complex
          case (sometimes called "secondary subnets", sometimes
          "superscopes"), several (apparently unrelated) network number
          and subnet mask combinations with their associated IP
          addresses may all be configured together into one subnet
          address pool.

	o "primary server" or "primary"

	  A DHCP server	configured to provide primary service to a set
	  of DHCP clients for a	particular set of subnet address pools.

	o "secondary server" or	"secondary"

	  A DHCP server	configured to act as backup to a primary server
	  for a	particular set of subnet address pools.

	o "stable storage"

	  Every	DHCP server is assumed to have some form of what is
	  called "stable storage".  Stable storage is used to hold
	  information concerning IP address bindings (among other
	  things) so that this information is not lost in the event of a
	  server failure which requires	restart	of the server.

1.3.  Requirements for this protocol

   The following requirements must be (and are) met by this protocol.

        1. Implementations of this protocol must work with existing DHCP
           client implementations based on the DHCP protocol [1].

        2. Implementations of the protocol must work with existing BOOTP
           relay agent implementations.

	3. The protocol	must provide failover redundancy between servers
	   that	are not	located	on the same subnet.

1.4.  Goals for this protocol

        1. Provide for continued service to DHCP clients through an
           automated mechanism in the event of failure of the Primary
           Server.

        2. Avoid binding an IP address to a client while that binding is
           currently valid for another client.  In other words, don't
           allocate the same IP address to two clients.

	3. Minimize any	need for manual	administrative intervention.

	4. Introduce no	additional delays in server response time as a
	   result of inter-server communication.

        5. Share IP address ranges between primary and secondary
           servers; i.e., impose no requirement that the pool of
           available addresses be divided between servers.

        6. Continue to meet the goals and objectives of this protocol in
           the event of server failure or network partition.

	7. Provide graceful reintegration of full protocol service after
	   server failure or network partition.

        8. Allow for one computer to act as a Secondary Server for
           multiple Primary Servers. Other topologies (e.g.: mesh) are
           also possible.  Primary and Secondary Servers SHOULD be
           viewed as "logical" servers and not necessarily physical
           computers.

        9. Ensure that an existing client can keep its existing IP
           address binding if it can communicate with either the Primary
           or Secondary DHCP server implementing this protocol - not
           just whichever server originally offered it the binding.

        10. Ensure that a new client can get an IP address from some
            server. Ensure that in the face of partition, where servers
            continue to run but cannot communicate with each other, the
            above goals and requirements may be met. In addition, when
            the partition condition is removed, allow graceful automatic
            re-integration without requiring human intervention.

        11. If either Primary or Secondary Server loses all of the
            information that it has stored in stable storage, it should
            be able to refresh its stable storage from the other server.

1.5.  Limitations of this Protocol

   The following are explicit limitations of this protocol.

        1. Under normal operation, only one server at a time will
           service DHCP client requests; this protocol provides
           reliability through redundancy but not load balancing.

        2. This protocol provides only one level of redundancy through a
           single Secondary Server for each Primary Server.

        3. The protocol provides a way to detect when the primary and
           secondary server cannot communicate, but once this condition
           has been detected, does not (indeed, cannot) provide any way
           to further distinguish between network failure and failure of
           one of the servers.

        4. A small number of IP addresses are reserved for Secondary
           Server use.  In order to handle the failure case where both
           servers are able to communicate with DHCP clients, but unable
           to communicate with each other, a small number of IP
           addresses must be set aside as a private address pool for the
           Secondary Server. The Secondary can use these to service
           newly arrived DHCP clients during such a period.  The size of
           this private pool SHOULD be based only on the arrival rate of
           new DHCP clients and the length of expected downtime, and is
           not influenced in any way by the total number of DHCP clients
           supported by the server pair.

        5. The Primary and Secondary Servers SHOULD pause normal DHCP
           transaction processing while resynchronizing, after a system
           failure.

     DISCUSSION:

        Presumably, unless the primary and secondary servers have been
        out of communication for an extended period, the servers will
        have only a small amount of information to exchange.  Thus, the
        time during which the servers are not available to answer DHCP
        requests will be minimal and should be bridged by the normal
        DHCP client retransmission mechanism.

2.  Protocol Operations

   The protocol	necessary in providing redundant/failover servers can be
   grouped in three areas:

        o Messages to keep the Secondary Server's lease data
          synchronized with that of the Primary so that when failover
          occurs, there is no degradation of service.

        o Messages that allow the Secondary to determine the operational
          state of the Primary, so as to know when to start servicing
          DHCP traffic.

        o Messages that are used to coordinate the Primary regaining
          control when it has become available again.

2.1.  Time synchronization between communicating servers

   Each Binding update message carries a "sent time stamp" (the time
   when the message was sent in GMT). This provides a simple mechanism
   to determine any "time drift" between communicating servers.

   DISCUSSION:

      If a UDP packet is successfully transmitted (i.e.: it does not
      get lost), the packet travel time is negligible in the framework
      of DHCP leases. By providing a GMT "sent time" stamp, the
      recipient can compare this with its notion of the current GMT
      time at the time it receives the packet. The difference (plus the
      packet travel time, which we ignore) is the time drift.  The
      recipient can use this time drift value to bias all "absolute
      time" values it receives from the sender.
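
   The drift calculation described above can be sketched as follows.
   This is a minimal illustration, not part of the protocol: the
   function names are hypothetical, times are plain integer seconds,
   and packet travel time is ignored as the text allows.

```python
def time_drift(sent_time: int, arrive_time: int) -> int:
    """Return arrival time minus sent time, in seconds (the time drift)."""
    return arrive_time - sent_time


def bias_absolute_time(sender_time: int, drift: int) -> int:
    """Translate an absolute time from the sender into the recipient's clock."""
    return sender_time + drift


# The sender stamped the packet at 1000; the recipient's clock read 1007
# when the packet arrived.
drift = time_drift(sent_time=1000, arrive_time=1007)

# A lease grant time of 900 on the sender's clock, expressed on the
# recipient's clock.
local_grant_time = bias_absolute_time(900, drift)
```

   A positive drift means the recipient's clock runs ahead of the
   sender's, so absolute times from the sender are shifted forward
   before being stored.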

2.2.  Failover Protocol	Messages

   The Failover Protocol messages are encoded using a packet format
   specific to the Failover Protocol. To allow easy recognition of
   Failover Protocol messages, BOOTP packet "op" field values 3..14 are
   proposed to mark various Failover Protocol messages. A Failover
   Protocol message is always unicast from the source to the
   destination.  The sender, and never the recipient, is responsible for
   reliable re-transmission.

2.3.  Failover Protocol	packet header format

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    op (1)     |    rev (1)    |      payload offset (2)       |
   +---------------+---------------+---------------+---------------+
   |                            xid (4)                            |
   +---------------------------------------------------------------+
   |         0 or more additional header bytes (variable)          |
   +---------------------------------------------------------------+
   |         Payload data, formatted as DHCP-style options         |
   |         (although using a unique option number space)         |
   |                          (variable)                           |
   +---------------------------------------------------------------+

   op - 1 byte

   These values extend the number space of the existing BOOTP message
   type "Op" field.  The following types are defined:

   3		   DHCPPOOLREQ
   4		   DHCPPOOLRESP
   5		   DHCPBNDUPD
   6		   DHCPBNDACK
   7		   DHCPPOLL
   8		   DHCPPRPL
   9		   DHCPCTLREQ
   10		   DHCPCTLRET
   11		   DHCPCTLACK
   12		   DHCPCTLACKACK
   13		   DHCPREQUEREQ
   14		   DHCPREQUERESP

   rev - 1 byte

   Failover protocol version supported.  Set to 1 for the Failover
   Protocol described in this draft.

   payload offset - 2 bytes, network byte order

   The byte offset of the Payload area, from the beginning of the
   Failover packet header. The value for the current protocol version
   is 8.

   xid - 4 bytes, network byte order

   The sender of a failover protocol packet is responsible for setting
   this number, and the receiver of the packet copies the number over
   into any response packet.  To the receiver it is opaque.  The sender
   SHOULD ensure that every packet sent to a particular IP address and
   port combination has a unique transaction id unless that packet is a
   re-transmission.
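
   As an illustration of the header layout above, the following sketch
   packs and unpacks the fixed 8-byte header with Python's standard
   struct module.  The function names and the sample payload are
   assumptions made for this example; only the field order, sizes,
   byte order and the values rev = 1 and payload offset = 8 come from
   the text.

```python
import struct

# op (1 byte), rev (1 byte), payload offset (2 bytes), xid (4 bytes),
# all in network byte order.
HEADER_FORMAT = "!BBHI"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 8 bytes

DHCPBNDUPD = 5  # "op" value from the table of message types


def pack_header(op: int, xid: int, rev: int = 1,
                payload_offset: int = HEADER_SIZE) -> bytes:
    """Build the fixed part of a Failover Protocol packet header."""
    return struct.pack(HEADER_FORMAT, op, rev, payload_offset, xid)


def unpack_packet(packet: bytes):
    """Return (op, rev, xid, payload); the payload starts at payload offset."""
    if len(packet) < HEADER_SIZE:
        raise ValueError("packet shorter than the fixed header")
    op, rev, payload_offset, xid = struct.unpack_from(HEADER_FORMAT, packet)
    if payload_offset < HEADER_SIZE or payload_offset > len(packet):
        raise ValueError("payload offset out of range")
    return op, rev, xid, packet[payload_offset:]


# Hypothetical payload: one Assigned IP address option (code 50) for 10.0.0.1.
sample_payload = bytes([50, 4, 10, 0, 0, 1])
sample_packet = pack_header(DHCPBNDUPD, xid=0x12345678) + sample_payload
```

   Reading the payload from the payload offset rather than from a
   fixed position lets a receiver skip any additional header bytes a
   later protocol revision might add.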

2.4.  DHCPPOOLREQ and DHCPPOOLRESP:

   Whenever the Secondary server transitions into NORMAL mode, it first
   sends a DHCPPOOLREQ message to initiate a transfer of a small range
   of IP addresses that will serve as its private address pool.

   This is necessary, because initially the Secondary server has no such
   address pool, and its pool gets depleted when it hands out addresses
   in COMMUNICATION-INTERRUPTED mode. This is why the request is sent
   every time the Secondary server transitions into NORMAL mode.  The
   DHCPPOOLREQ message does not carry any payload data. When the Primary
   Server gets a DHCPPOOLREQ message, it computes which addresses should
   be transferred to the Secondary, and queues up DHCPBNDUPD
   transactions, setting the Status of these bindings to "BACKUP".
   Having done this, it sends a DHCPPOOLRESP message. The DHCPPOOLRESP
   message carries the "Number of addresses transferred" as its payload.

   The Secondary server	keeps sending DHCPPOOLREQ messages until it
   receives a  DHCPPOOLRESP with "Number of addresses transferred" = 0,
   or it decides that the partner is not responding.  Each one of these
   messages MUST have the same transaction ID.  If a new transaction ID
   is used in one of these messages, the receiving server will begin the
   transmission of the DHCPBNDUPD messages all over again.  To be clear,
   if the Secondary Server receives a DHCPPOOLRESP message with "Number
   of addresses transferred" > 0, it MUST send another DHCPPOOLREQ
   message. This mechanism makes it possible for the Primary Server to
   pace the transfer (e.g., it could generate all addresses at once, or
   one-by-one).

   The Primary Server must respond to each DHCPPOOLREQ message it
   receives. If	it has already generated all private addresses,	or it
   has no available addresses, it MUST send  DHCPPOOLRESP with "Number
   of addresses	transferred" = 0.
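
   The request/response loop described in this section can be sketched
   as below.  This is a simulation under stated assumptions: the
   SimulatedPrimary class, the function name and the max_requests
   limit (standing in for "decides that the partner is not
   responding") are hypothetical, and no packets are actually sent.

```python
class SimulatedPrimary:
    """Hypothetical stand-in for the Primary Server (no networking)."""

    def __init__(self, pool_size, per_response):
        self.pool_size = pool_size        # addresses to hand over in total
        self.per_response = per_response  # pacing chosen by the Primary
        self.xid = None
        self.remaining = 0

    def handle_poolreq(self, xid):
        """Return (xid, number of addresses transferred) for a DHCPPOOLREQ."""
        if xid != self.xid:
            # A new transaction ID restarts the transfer from the beginning.
            self.xid = xid
            self.remaining = self.pool_size
        count = min(self.per_response, self.remaining)
        self.remaining -= count
        return xid, count


def request_private_pool(primary, xid, max_requests=50):
    """Secondary side: repeat DHCPPOOLREQ with one xid until the count is 0."""
    received = 0
    for _ in range(max_requests):
        resp_xid, count = primary.handle_poolreq(xid)
        if resp_xid != xid:
            continue  # not a response to this transaction; ignore it
        if count == 0:
            return received
        received += count
    raise TimeoutError("partner is not responding")
```

   With a pool of 5 addresses paced 2 per response, the Secondary
   sends four requests (receiving 2, 2, 1 and finally 0) and ends up
   with all 5 addresses.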

2.5.  DHCPREQUEREQ and DHCPREQUERESP:

   Whenever either server wishes to be updated with the information that
   the other server knows and has not yet transmitted to it, it sends a
   DHCPREQUEREQ message.

   The DHCPREQUEREQ message does not carry any payload data. When
   either server gets a DHCPREQUEREQ message, it computes which updates
   should be transferred to the Secondary, and queues up DHCPBNDUPD
   transactions as appropriate.  Having done this, it sends a
   DHCPREQUERESP message. The DHCPREQUERESP message carries the "Number
   of addresses queued up" as its payload. The set of binding updates
   queued up will depend on the requesting server's state. (The state
   has already been communicated via prior DHCPPOLL/DHCPPRPL messages.)

   The Secondary server keeps sending DHCPREQUEREQ messages until it
   receives a DHCPREQUERESP with "Number of addresses queued up" = 0,
   or it decides that the partner is not responding.  This is the same
   approach as is used for the DHCPPOOLREQ/DHCPPOOLRESP messages.  Each
   one of these DHCPREQUEREQ messages MUST have the same transaction ID.
   Use of a new transaction ID will cause re-building of the outgoing
   binding update queue.

   The Primary Server must respond to each DHCPREQUEREQ message it
   receives. If it has already queued up all of the previously unsent
   binding updates, then it MUST send DHCPREQUERESP with "Number of
   addresses queued up" = 0.

2.6.  DHCPBNDUPD

   The Primary notifies the Secondary (or the other way around) of a
   change in binding state and data.

   In response to a binding update, the	recipient server MUST respond
   with	a  DHCPBNDACK message.	Multiple binding updates can be	batched
   up, and sent	in one Failover	Protocol message.

2.7.  DHCPBNDACK

   This message implements a positive or negative acknowledgment of
   one or more binding updates.

   A binding update (or a batch of binding updates sent as one message)
   is matched up with its associated acknowledgment by having the
   same Xid field value in the message header.

   The server sending a DHCPBNDACK message MAY include any of the
   options that are acceptable in a DHCPBNDUPD message when the
   DHCPBNDACK message is returned to the sender.  If any of this
   information differs from the information in the DHCPBNDUPD message,
   the receiver SHOULD update its bindings database with that
   information upon receipt of the DHCPBNDACK message.

   The DHCPBNDACK MAY selectively reject one or more updates by
   including one or more IP address - Reject Reason option pairs in the
   message body.

   The DHCPBNDACK implicitly acknowledges any binding updates it replies
   to, except those it enumerates using	Reject Reason Codes.
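
   The matching and selective-reject rules above might be applied by
   the sender of a batch roughly as follows.  This is only a sketch:
   the data structures, the function name and the numeric reject
   reason code are assumptions for illustration and are not defined in
   this section.

```python
def apply_bndack(pending, ack_xid, rejects):
    """Resolve a batch of binding updates against a DHCPBNDACK.

    pending: dict mapping xid -> list of IP addresses sent in that batch
    ack_xid: Xid copied into the DHCPBNDACK header
    rejects: dict mapping IP address -> reject reason code from the ACK
    Returns (accepted, rejected); rejected holds (address, reason) pairs.
    """
    batch = pending.pop(ack_xid, None)
    if batch is None:
        return [], []  # no outstanding batch carries this Xid
    # Every update in the batch is implicitly acknowledged unless the
    # ACK enumerates it with a Reject Reason.
    accepted = [ip for ip in batch if ip not in rejects]
    rejected = [(ip, rejects[ip]) for ip in batch if ip in rejects]
    return accepted, rejected


# One batch of three updates was sent with Xid 7 (hypothetical values).
outstanding = {7: ["10.0.0.1", "10.0.0.2", "10.0.0.3"]}
accepted, rejected = apply_bndack(outstanding, 7, {"10.0.0.2": 1})
```

   Removing the batch from the pending table once its Xid is
   acknowledged keeps a duplicate acknowledgment from being applied
   twice.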

2.8.  DHCPPOLL

   In order to determine the state of a given server, or to communicate
   a critical change in its own status, a participant can use the
   DHCPPOLL message.

   This message inquires about the current state of the recipient, and
   tells the recipient what state the sender is in.

   After sending a DHCPPOLL message, the participant will listen for
   a DHCPPRPL message.

2.9.  DHCPPRPL

   This	message	replies	to the DHCPPOLL	message	(PRPL=Poll reply). The
   DHCPPRPL also carries server	status information (see	message	payload
   details below).

   After a failover, when the Primary Server is	restarted, the following
   messages are	used to	coordinate the Primary taking control back from
   the Secondary:

   DHCPCTLREQ	  - Request for	control
   DHCPCTLRET	  - Return of control initiated
   DHCPCTLACK	  - Return of control completed
   DHCPCTLACKACK  - Return of control completed	message	acknowledged.

   The Primary Server sends a DHCPCTLREQ message, indicating that it
   would like to take control of the bindings database.	 The Secondary
   Server replies with a DHCPCTLRET message, which serves as a signal to
   the Primary "Stand by to receive binding updates".  This message then
   is followed by a set	of binding updates from	the secondary to the
   primary.  When all updates have been	transmitted (and acknowledged)
   from	Secondary to Primary,  a DHCPCTLACK message is sent from the
   Secondary to	the Primary, to	signal that "all updates from the Secon-
   dary	are now	completed".

   DISCUSSION:

      Note that the DHCPCTLACK message type must be transmitted
      reliably, as the Primary Server will not start servicing clients
      until it has received the DHCPCTLACK message.  To provide this
      reliability, the DHCPCTLACKACK message is provided. This provides
      an acknowledgment of the DHCPCTLACK message, and the DHCPCTLACK
      message will be periodically re-sent until it is acknowledged.  We
      could just periodically re-send the DHCPCTLACK message until we
      start receiving binding updates from the Primary, but the Primary
      may not have any updates to send at all, hence the need for an
      explicit DHCPCTLACKACK message.

   The Primary Server transitions into NORMAL state upon receiving a
   DHCPCTLACK from the secondary, when the secondary has completed
   sending all of its updates during synchronization. The DHCPCTLACKACK
   message is needed to prevent the primary from waiting and not
   servicing clients if the DHCPCTLACK message got lost.  The Secondary
   server will keep re-sending the DHCPCTLACK message, until:

        1. It decides that the primary is not responding, so the
           Secondary server goes into COMMUNICATION-INTERRUPTED mode.

        2. It receives a DHCPCTLACKACK or a DHCPBNDUPD message from the
           primary.  The Primary's DHCPBNDUPD messages would start
           arriving at the Secondary server, if the Primary did get the
           DHCPCTLACK, but the DHCPCTLACKACK message got lost.
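
   The two stopping conditions for re-sending DHCPCTLACK can be
   sketched as a small loop.  The send and receive callables, the
   attempt limit and the returned strings are hypothetical; a real
   implementation would use timers and its own state machine.

```python
DHCPBNDUPD = 5
DHCPCTLACK = 11
DHCPCTLACKACK = 12


def resend_ctlack(send, receive, max_attempts=5):
    """Secondary side: re-send DHCPCTLACK until it is known to have arrived.

    send(op) transmits a message with the given "op" value.
    receive() returns the "op" of the next message from the Primary,
    or None if nothing arrived before a timeout.
    """
    for _ in range(max_attempts):
        send(DHCPCTLACK)
        reply = receive()
        # Either an explicit DHCPCTLACKACK or the start of binding updates
        # shows that the Primary received the DHCPCTLACK.
        if reply in (DHCPCTLACKACK, DHCPBNDUPD):
            return "ACKNOWLEDGED"
    # The Primary is considered not to be responding.
    return "COMMUNICATION-INTERRUPTED"
```

   For example, if the first two DHCPCTLACK messages go unanswered and
   the third is answered with DHCPCTLACKACK, the loop sends three
   messages and stops.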

3.  Protocol Payload Data Format

   Payload data is encoded as a set of flexible DHCP/BOOTP style
   options (the usual 1 byte option code, 1 byte length, and "length"
   bytes of data).  The options are placed after the header, after
   skipping PayloadOffset bytes. The payload data options are not
   preceded by a "cookie" value.

   Since the packet is NOT a DHCP/BOOTP protocol packet, the options
   used here do not conflict with any existing "proper" DHCP/BOOTP
   options.  In fact, these options are allocated in relationship to the
   DHCP option space in the following way.  In cases where the syntax
   and semantics of a Failover Payload Option are identical to those of
   a DHCP/BOOTP option, the same option number is used.  For options
   unique to the Failover protocol, option numbers starting at 230 are
   used.

   Thus, all new Failover Protocol option numbers are assigned from a
   continuous range beginning with 230.	 This number is	shown as an X in
   the tables below.

   The protocol	is permissive in allowing various other	DHCP options in
   binding updates.  As	long as	the sender wishes to use an option, it
   MAY include it. On the other	hand, the recipient MUST ignore	any
   option it is	not expecting.
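
   A minimal sketch of the code/length/data encoding and of ignoring
   unexpected options follows.  The function names and the set of
   expected codes are assumptions for the example, and the sketch
   assumes every option carries a length byte (it makes no provision
   for single-byte pad or end options).

```python
def encode_option(code: int, data: bytes) -> bytes:
    """Encode one option as: 1 byte code, 1 byte length, then the data."""
    if not 0 <= code <= 255 or len(data) > 255:
        raise ValueError("code or length does not fit in one byte")
    return bytes([code, len(data)]) + data


def parse_options(payload: bytes):
    """Split a payload into a list of (code, data) pairs, in order."""
    options = []
    i = 0
    while i < len(payload):
        if i + 2 > len(payload):
            raise ValueError("truncated option header")
        code, length = payload[i], payload[i + 1]
        end = i + 2 + length
        if end > len(payload):
            raise ValueError("truncated option data")
        options.append((code, payload[i + 2:end]))
        i = end
    return options


def keep_expected(options, expected_codes):
    """Drop any option the recipient is not expecting."""
    return [(code, data) for code, data in options if code in expected_codes]


# Option 50 (IP address), a hypothetical unknown option 99, and
# option 51 (lease duration of 3600 seconds).
sample = (encode_option(50, bytes([10, 0, 0, 1]))
          + encode_option(99, b"zz")
          + encode_option(51, (3600).to_bytes(4, "big")))
parsed = parse_options(sample)
usable = keep_expected(parsed, {50, 51})
```

   Because every option carries its own length, an option with an
   unknown code can be skipped without understanding its contents.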

   Multiple DHCPBNDUPD transactions can be batched together in one UDP
   packet. The option set for each individual transaction MUST always
   begin with the IP address (Option 50). This is the only restriction
   on payload item ordering. In any other case, payload data items can
   be included in any desired order.
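
   Since each transaction's option set begins with Option 50, a
   receiver can split a batched payload at each occurrence of that
   option.  The sketch below works on an already parsed list of
   (code, data) pairs; the function name and sample values are
   hypothetical.

```python
ASSIGNED_IP_ADDRESS = 50


def split_transactions(options):
    """Group (code, data) pairs into per-binding transactions.

    A new transaction starts at every Option 50; all following options
    belong to it until the next Option 50.
    """
    transactions = []
    for code, data in options:
        if code == ASSIGNED_IP_ADDRESS:
            transactions.append([(code, data)])
        elif not transactions:
            raise ValueError("option set does not begin with Option 50")
        else:
            transactions[-1].append((code, data))
    return transactions


# Two batched updates: the first carries a lease duration (option 51),
# the second a client identifier (option 61).
batch = [
    (50, bytes([10, 0, 0, 1])), (51, (3600).to_bytes(4, "big")),
    (50, bytes([10, 0, 0, 2])), (61, b"\x01\xaa\xbb\xcc\xdd\xee\xff"),
]
groups = split_transactions(batch)
```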

   In case an implementation chooses to	use the	DHCPBNDNAK mechanism,
   the DHCPBNDNAK message SHOULD contain one or	more Option 50s	from the
   NAK-ed message, to indicate which specific update items are being
   NAK-ed.

   While the synchronization is	in progress, the secondary MUST	NOT
   accept client requests, and the primary MUST	NOT send any updates to
   the secondary. This is necessary to allow the Primary to be the sole
   arbitrator of any conflicting updates.

3.1.  DHCP Server Status

   This	option is used to convey the current state of a	server.

    Code  Len  Type
   +--+---+------+
   | X|	1 | 1-15 |
   +--+---+------+

   Allowed values for this option:

   Value  State                 Description
   -----  -----                 -----------
   1      UNKNOWN-STATE
   2      PRIMARY-NORMAL        Normal state
   3      BACKUP-NORMAL
   4      PRIMARY-COMINT        Communication interrupted (safe)
   5      BACKUP-COMINT
   6      PRIMARY-PARTNERDOWN   Partner down (unsafe mode)
   7      BACKUP-PARTNERDOWN
   8      PRIMARY-CONFLICT      Synchronizing, after a "Partner-Down"
                                divergence
   9      PRIMARY-SYNC          Synchronizing, after a
                                "communications-interrupted" divergence
   10     BACKUP-SYNC
   11     PRIMARY-RECOVER       Recovering ALL bindings from partner
   12     BACKUP-RECOVER
   13     FAILOVER-DISABLED     The server is running with the failover
                                protocol disabled (standalone)
   14     SERVER-PAUSED         The server is inactive, shutting down
                                for a short period
   15     SERVER-SHUTDOWN       The server is inactive, shutting down
                                for an extended period

   When	a server is being re-started, it should	send a DHCPPOLL	message
   to its partner, reporting its status	(SERVER-PAUSED).  In response,
   the recipient SHOULD	go into	COMMUNICATION-INTERRUPTED mode.

   When	a server is being shut down,  it should	send a DHCPPOLL	message
   to its partner, reporting its status	(SERVER-SHUTDOWN).

   In response,	the recipient SHOULD go	into PARTNER-DOWN mode.
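
   The status values and the two recommended reactions above can be
   captured in a small sketch.  The enumeration member names (with
   underscores in place of hyphens) and the function name are
   assumptions; the numeric values follow the table in this section.

```python
from enum import IntEnum


class ServerStatus(IntEnum):
    UNKNOWN_STATE = 1
    PRIMARY_NORMAL = 2
    BACKUP_NORMAL = 3
    PRIMARY_COMINT = 4
    BACKUP_COMINT = 5
    PRIMARY_PARTNERDOWN = 6
    BACKUP_PARTNERDOWN = 7
    PRIMARY_CONFLICT = 8
    PRIMARY_SYNC = 9
    BACKUP_SYNC = 10
    PRIMARY_RECOVER = 11
    BACKUP_RECOVER = 12
    FAILOVER_DISABLED = 13
    SERVER_PAUSED = 14
    SERVER_SHUTDOWN = 15


def mode_after_poll(reported: int):
    """Return the mode the recipient SHOULD enter, or None if unchanged."""
    status = ServerStatus(reported)  # raises ValueError outside 1..15
    if status is ServerStatus.SERVER_PAUSED:
        return "COMMUNICATION-INTERRUPTED"
    if status is ServerStatus.SERVER_SHUTDOWN:
        return "PARTNER-DOWN"
    return None
```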

3.2.  DHCP Binding Status

   This	option is used to convey the current state of a	binding. This
   option is mandatory for DHCPBNDUPD messages.

   Code	  Len  Type
   +-----+-----+-----+
   | X+1 |  1  | 1-7 |
   +-----+-----+-----+

   Legal values	for this option	are:

   Value   Status      Description
   -----   ------      -----------
   1       FREE        The lease has never been used
   2       ACTIVE      Assigned to a client *
   3       EXPIRED
   4       RELEASED    A client released the lease
   5       ABANDONED   A server or client flagged address as not usable
   6       RESET       Lease was freed by some external agent
   7       BACKUP      Lease is set aside for Secondary server's
                       private address pool

3.3.  Assigned IP address

   Uses	identical code and format to DHCP Option 50 (requested IP
   address).

   Code	  Len	       Address
   +-----+-----+-----+-----+-----+-----+
   |  50 |  4  |  a1 |	a2 |  a3 |  a4 |
   +-----+-----+-----+-----+-----+-----+
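
   Encoding this option is a matter of prefixing the four address
   bytes with code 50 and length 4.  The sketch below uses Python's
   standard ipaddress module; the function names are hypothetical.

```python
import ipaddress

ASSIGNED_IP_ADDRESS = 50


def encode_assigned_ip(address: str) -> bytes:
    """Encode a dotted-quad IPv4 address as option 50 (code, len, a1..a4)."""
    return bytes([ASSIGNED_IP_ADDRESS, 4]) + ipaddress.IPv4Address(address).packed


def decode_assigned_ip(option: bytes) -> str:
    """Decode a 6-byte option 50 back into a dotted-quad string."""
    if len(option) != 6 or option[0] != ASSIGNED_IP_ADDRESS or option[1] != 4:
        raise ValueError("not a well-formed Assigned IP address option")
    return str(ipaddress.IPv4Address(option[2:6]))


encoded_ip = encode_assigned_ip("10.0.0.1")
```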

3.4.  Lease grant time

   This option carries an absolute GMT time value.  An absolute value
   can be used because time synchronization has already been achieved
   between the source and the target server using the Sent Time Stamp
   option.  It is represented as seconds since Jan 1, 1970 (i.e., the
   ANSI C time_t time value representation).

   Code	  Len		Time
   +------+-----+-----+-----+-----+-----+
   | X+2  |  4	|  t1 |	 t2 |  t3 |  t4	|
   +------+-----+-----+-----+-----+-----+

3.5.  Sent Time	Stamp

   A time stamp using GMT, recording when the packet was sent. It is
   used to determine the time drift between the sender and the
   recipient. The time drift is defined as the difference between
   "Arrive Time (GMT)" and "Send Time (GMT)".  The actual packet travel
   time is assumed to be negligible in this context. All Date-Time
   values contained in Failover messages will be corrected by the time
   drift before being stored by the recipient.

   Code	  Len		Time
   +-----+-----+-----+-----+-----+-----+
   | X+3 |  4  |  t1 |	t2 |  t3 |  t4 |
   +-----+-----+-----+-----+-----+-----+

   The time is a 32 bit	unsigned long in network byte order, in	units of
   seconds (GMT	since EPOCH).
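
   As a rough illustration of the drift correction described above,
   the following Python sketch packs a time value as a 32 bit unsigned
   integer in network byte order and applies the drift to a received
   Date-Time value. The function names are hypothetical and are not
   defined by this protocol. The sign convention (adding "arrive minus
   send" to the sender's value) is an assumption, since the text only
   says that values are "corrected by the time drift".

```python
import struct


def encode_time(seconds):
    """Pack a time value as a 32 bit unsigned integer, network byte order."""
    return struct.pack("!I", seconds)


def decode_time(data):
    """Unpack the 4-octet time field (t1..t4) of a time-valued option."""
    return struct.unpack("!I", data)[0]


def time_drift(arrive_time, sent_time):
    """Drift = Arrive Time (GMT) - Send Time (GMT); travel time is ignored."""
    return arrive_time - sent_time


def correct_time(remote_time, drift):
    """Translate a sender's Date-Time value onto the recipient's clock.

    Assumption: the correction is applied by adding the drift.
    """
    return remote_time + drift
```

   For example, a recipient that receives a packet stamped 990 at local
   time 1000 would compute a drift of 10 seconds and store a received
   lease grant time of 5000 as 5010.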

3.6.  Number of	addresses transferred to Secondary Server

   A 32 bit unsigned long in network byte order. Reports the number of
   addresses transferred by the Primary to the Secondary Server
   (addresses to be used for the Secondary Server's private address
   pool).

   Code   Len           Count
   +-----+-----+-----+-----+-----+-----+
   | X+4 |  4  |  n1 |  n2 |  n3 |  n4 |
   +-----+-----+-----+-----+-----+-----+

3.7.  Lease Duration

   Uses the format and code of the standard DHCP IP Address Lease Time
   option. It is used in exactly the same way as in the DHCPOFFER
   message of the DHCP protocol. The time is in units of seconds, and
   is specified as a 32-bit unsigned integer. A Lease Duration of
   0xFFFFFFFF indicates an infinite lease.

   Code	  Len	      Lease Time
   +-----+-----+-----+-----+-----+-----+
   |  51 |  4  |  t1 |	t2 |  t3 |  t4 |
   +-----+-----+-----+-----+-----+-----+

3.8.  Client Identifier

   The format, code and	conventions used are identical to DHCP option
   61.

   Code	  Len	Type  Client-Identifier
   +-----+-----+-----+-----+-----+---
   |  61 |  n  |  t1 |	i1 |  i2 | ...
   +-----+-----+-----+-----+-----+---

3.9.  Client Hardware Address

   The format is similar to DHCP option 61. T1 (type) MUST be set to the
   proper ARP hardware address code (it MUST NOT be zero).  TBD:
   Reference the ARP document here.

   Code	  Len	Type  Client-Identifier
   +-----+-----+-----+-----+-----+---
   | X+5 |  n  |  t1 |	i1 |  i2 | ...
   +-----+-----+-----+-----+-----+---

   Either Client Id, Client Hardware Address or	BOTH MAY be present in
   binding update transactions.	At least one of	them MUST be present.
   If both are present,	the Client Id MUST be used to uniquely identify
   the owner of	the binding (exactly as	in RFC 2131).

3.10.  Host Name

   Uses	the format and code of DHCP option 12.

   Code	  Len		      Host Name
   +-----+-----+-----+-----+-----+-----+-----+-----+--
   |  12 |  n  |  h1 |	h2 |  h3 |  h4 |  h5 |	h6 |  ...
   +-----+-----+-----+-----+-----+-----+-----+-----+--

3.11.  Domain Name

   Uses	the format and code of DHCP option 15.

   Code	  Len	Domain Name
   +-----+-----+-----+-----+-----+-----+--
   |  15 |  n  |  d1 |	d2 |  d3 |  d4 |  ...
   +-----+-----+-----+-----+-----+-----+--

3.12.  Reject Reason Code

   This option is used to selectively reject binding updates. It MAY be
   used in a DHCPBNDACK message, always following an option 50. (The
   option 50 contains the IP address of the specific update being
   rejected.)

   Code	  Len	Reason code
   +-----+-----+-----+
   | X+6 |  1  |  R1 |
   +-----+-----+-----+

   Reason codes:

   1 Illegal IP	address	(not part of any address pool)
   2 Fatal conflict exists: address in use by other client.

3.13.  MDLI

   Maximum Delta Lease Interval, in seconds.  A 32 bit integer value,
   in network byte order.

   Code	  Len		Time
   +------+-----+-----+-----+-----+-----+
   | X+7  |  4	|  t1 |	 t2 |  t3 |  t4	|
   +------+-----+-----+-----+-----+-----+

4.  Exchange of	control	between	Primary	and Secondary

   The Primary and Secondary Servers coordinate the exchange of control
   over the bindings database through the use of DHCPPOLL and DHCPCTLREQ
   messages.  In normal operation:

   The Primary sends notification of each change to its	bindings data-
   base	to the Secondary, and the Secondary keeps its bindings database
   synchronized	with the Primary's database.

   The Secondary periodically sends DHCPPOLL messages to the Primary,
   and the Primary responds to each DHCPPOLL message with a DHCPPRPL
   message. If the Secondary does not receive a	DHCPPRPL response mes-
   sage, the Secondary takes control of	the bindings database and begins
   answering requests from DHCP	clients.  Note that the	Secondary should
   be able to be configured to not perform the automatic switch-over.

   The conditions under	which a	Secondary takes	control	of the bindings
   database, e.g., the number of consecutive missing acknowledgments,
   should be configurable in the Secondary by the DHCP administrator.
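
   The takeover condition above might be tracked along the following
   lines. This is only a sketch in Python: the class name, the
   max_missed threshold and the auto_switch flag are hypothetical
   illustrations of the "number of consecutive missing
   acknowledgments" and the option to disable automatic switch-over.
   The protocol does not prescribe them.

```python
class PollMonitor:
    """Counts consecutive DHCPPOLL messages that received no DHCPPRPL."""

    def __init__(self, max_missed, auto_switch=True):
        self.max_missed = max_missed    # configured by the administrator
        self.auto_switch = auto_switch  # False disables automatic takeover
        self.missed = 0

    def reply_received(self):
        """A DHCPPRPL arrived, so the run of missed replies is broken."""
        self.missed = 0

    def poll_timed_out(self):
        """Record a missed reply; return True if the Secondary should
        take control of the bindings database."""
        self.missed += 1
        return self.auto_switch and self.missed >= self.max_missed
```

   With max_missed set to 3, two missed replies followed by a
   successful reply reset the count, and takeover is signalled only
   after three further consecutive misses.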

   The Secondary records any changes it	makes to the bindings database
   while it has	control. The Secondary continues to send DHCPPOLL mes-
   sages to the	Primary.  The DHCPPOLL messages	also carry information
   on the state	of the Secondary Server.

   To regain control of	the bindings database, e.g., after the Primary
   Server has recovered	from a failure,	or a partitioned network condi-
   tion, the Primary sends a DHCPCTLREQ	message	to the Secondary.  The
   Secondary stops answering DHCP client requests, and responds	to its
   Primary with	a DHCPCTLRET message.  After sending the DHCPCTLRET mes-
   sage, the Secondary sends DHCPBNDUPD	messages for each of the changes
   it has made to the bindings database.

   The Primary sends a DHCPBNDACK for each DHCPBNDUPD message it
   receives.  The Secondary completes the transfer of control by sending
   a DHCPCTLACK message to the Primary as soon as all of its updates
   have been acknowledged.

   Note that the Primary SHOULD NOT send any DHCPBNDUPD messages while
   synchronization is in progress with the Secondary.

   Once the synchronization is completed, the Primary transitions into
   NORMAL state and starts sending DHCPBNDUPD transactions for any
   accumulated binding changes it may have.

5.  Duplicate address assignment scenarios

   In the following two scenarios, the protocol could end up allocating
   duplicate IP addresses unless the measures recommended in Section 6
   are taken:

   Primary Server crash before "lazy" update: In the case where the
   Primary Server sends an ACK to a client for a newly allocated IP
   address and then crashes prior to sending the corresponding update
   to the Secondary Server, the Secondary Server will have no record of
   the IP address allocation.  When the Secondary Server takes over, it
   may well try to allocate that IP address to a different client.  In
   the case where the first client to receive the IP address is not on
   the net at the time (even though time still remains on its lease),
   an ICMP echo (i.e., ping) will not prevent the Secondary Server from
   allocating that IP address to a different client.

   A more likely and subtle version of this problem is where the Primary
   Server crashes after extending a client's lease time, and before
   updating the Secondary with a new time using a lazy update. After the
   Secondary takes over, if the client is not connected to the network
   the Secondary will believe the client's lease has expired when, in
   fact, it has not.  In this case as well, the IP address might be
   reallocated to a different client while the first client is still
   using it.

   Network partition where servers can't communicate but each can talk
   to clients: Several conditions are required for this	situation to
   occur. First, due to	a network failure, the Primary and Secondary
   Servers cannot communicate.	As well, some of the DHCP clients must
   be able to communicate with the Primary Server, and some of the
   clients must	now only be able to communicate	with the Secondary
   Server.  When this condition	occurs,	both Primary and Secondary
   Servers could attempt to allocate IP	addresses for new clients from
   the same pool of available addresses. At some point,	then, two
   clients will	end up being allocated the same	IP address. This will
   cause potentially serious problems when the network failure that
   created this	situation is corrected.

   The next section details how	the Failover Protocol prevents either of
   the above scenarios (and other related scenarios) from causing dupli-
   cate	IP address allocation.

6.  Duplicate Address Assignment Control

   There are several ways that the Failover protocol avoids the	possi-
   bility of duplicate address assignment.

6.1.  Control of lease time

   The key problem with	lazy update is that when the primary server
   fails after updating	a client with a	particular lease time and before
   updating the	secondary server, the secondary	server will believe that
   a lease has expired even though the client still retains a valid
   lease on that IP address.

   In order to handle this problem, a period of	time known as the "max-
   imum	delta lease interval" (MDLI) is	defined	and must be known to
   both	the primary and	secondary servers.  Proper use of this time
   interval places an upper bound on the difference allowed between the
   lease time provided to a DHCP client	and the	lease time known by the
   secondary server.  In order that this is not	the maximum lease time
   that	the primary can	ever provide to	a client, during a lazy	update
   the primary typically updates the secondary with lease time informa-
   tion	which is longer	than the lease time previously given to	the
   client.

   In the case where the secondary needs to take over from the primary,
   the secondary will not reallocate any IP addresses from one client to
   a different client.  When transitioning to the PARTNER-DOWN state
   (where the secondary is allowed to reallocate IP addresses), the
   secondary will wait the maximum-delta-lease-interval before
   completing the state transition.  Thus, any clients which have a
   lease on an IP address with a lease time greater than that known by
   the secondary will either have contacted the secondary during that
   time or their lease will have expired.

   This	protocol requires a DHCP server	to deal	with several different
   lease intervals and places specific restrictions on their relation-
   ships. The purpose of these restrictions is to allow	the other server
   in the pair to be able to make certain assumptions in the absence of
   an ability to communicate between servers.

   The different lease times are:

	o desired client lease interval

	  The desired client lease interval is the lease interval that
	  the DHCP server would	like to	give to	the DHCP client	in the
	  absence of any restrictions imposed by the Failover Protocol.
	  Its determination is outside of the scope of this protocol.
	  Typically this is the	result of external configuration of a
	  DHCP server.

	o actual client	lease interval

          The actual client lease interval is the lease interval that
          the DHCP server gives out to the DHCP client.  It may be
          shorter than the desired client lease interval (as explained
          below).

	o Primary Server lease interval

	  The Primary Server lease interval is the interval after which
	  the Primary Server believes that DHCP	client's lease will
	  expire.

	o desired Secondary Server lease interval

          The desired Secondary Server lease interval is the interval
          that the Primary Server tells the Secondary Server, after
          which the lease will expire.

        o acknowledged Secondary Server lease interval

          The acknowledged Secondary Server lease interval is the
          interval the Secondary Server has most recently acknowledged.
          The key restriction (and guarantee) that the Primary Server
          makes with respect to lease intervals is that the actual
          client lease interval never exceeds the acknowledged
          Secondary Server lease interval (if any) by more than a fixed
          amount.  This fixed amount is called the "maximum delta lease
          interval" (MDLI).

   The MDLI MAY	be configurable, but for correct server	operation it
   MUST	be known to both the Primary and Secondary Servers.

   The Primary Server MUST record in its state both the	Primary	Server
   lease interval and the most recently	acknowledged Secondary Server
   lease interval. It is assumed that the desired client lease interval
   can be determined through techniques	outside	of the scope of	this
   protocol.

   The above lease time descriptions are written for the case where the
   Primary server is operating and in communication with the Secondary
   server.  In the case where the Secondary server is operating out of
   communication with the Primary server, the relationships must hold
   in the other direction.

   The fundamental relationship	among these times which	MUST be	main-
   tained is:

       actual client lease interval <
       ( acknowledged other server lease interval + MDLI )

   The "acknowledged other server lease	interval" is the acknowledged
   secondary server lease interval for the Primary server, and it would
   be the acknowledged primary server lease interval for the Secondary
   server when it is operating out of contact with the Primary server.
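
   One way a server might enforce this relationship is sketched below
   in Python. The function names are hypothetical, all intervals are
   assumed to be whole seconds, and capping at one second below the
   bound is simply one way to honor the strict "<" in the relationship
   above. It is not mandated by the protocol.

```python
def relationship_holds(actual, acked_other, mdli):
    """True if: actual client lease < acknowledged other server lease + MDLI."""
    return actual < acked_other + mdli


def actual_client_lease(desired, acked_other, mdli):
    """Largest lease, not exceeding the desired one, that keeps the
    fundamental relationship true (intervals in whole seconds)."""
    return min(desired, acked_other + mdli - 1)
```

   With an MDLI of 3600 seconds and no acknowledged interval yet, a
   desired lease of 3 days is cut back to 3599 seconds. Once the other
   server has acknowledged 4 days, the full 3 days can be granted.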

   DISCUSSION:

      This protocol mandates no particular detailed algorithms
      concerning these lease intervals, as long as the above fundamental
      relationship is preserved.

      In the interests of clarity, however, let's examine a specific
      example. The MDLI	in this	case is	1 hour.	 The desired client
      lease interval is	3 days.	 In operation this might work as fol-
      lows:

      When a Primary Server makes an offer for a new lease on an IP
      address to a DHCP	client,	it determines the desired client lease
      interval (in this	case, 3	days).	It then	examines the ack-
      nowledged	Secondary lease	interval (which	in this	case is	 zero).

      Since the	actual client lease interval can not be	allowed	to
      exceed the current Secondary lease interval by more than the MDLI,
      the offer	made to	the DHCP client	(the actual client lease inter-
      val) is for (essentially)	the MDLI, 1 hour.

      Once the Primary Server has performed the	ACK to the DHCP	client,
      it will update the Secondary Server with the lease information.
      However, the Secondary Server lease interval will	be composed of
      the current actual client	lease interval + ( 1.5 * desired client
      lease interval). Thus, the Secondary Server is updated with a
      lease interval of	4.5 days + 1 hour.

      When the Primary Server receives an ACK to its update of the
      Secondary	Server's lease interval, it records that as the	ack-
      nowledged	Secondary Server lease interval.  The Primary Server
      MUST ensure that the Secondary Server has	received and recorded in
      its stable storage the Secondary Server lease interval.

      When the DHCP client attempts to renew at T1 (approximately one
      half an hour from the start of the lease), the Primary Server
      again determines the desired client lease time, which is still 3
      days.  It then compares this with the remaining acknowledged
      Secondary Server lease interval (adjusting for the time passed
      since the Secondary Server was last updated), which is 4.5 days +
      1/2 hour.  The actual client lease interval is therefore set to
      the desired client lease interval, as it is less than the
      acknowledged Secondary lease interval.

      When the Primary DHCP server updates the Secondary DHCP server
      after the DHCP client's renewal ACK is complete, it will calculate
      the Secondary Server lease interval as the actual client lease
      interval (3 days this time) + 0.5 * the desired client lease
      interval (1.5 days).  In this way, the Primary attempts to have
      the Secondary always "lead" the client in its understanding of
      the client's lease interval.

      Once the initial actual client lease interval of the MDLI	is past,
      the protocol operates effectively	like the DHCP protocol does
      today in its behavior concerning lease intervals.	However, the
      guarantee	that the actual	client lease interval will never exceed
      the acknowledged Secondary Server	lease interval by more than the
      MDLI allows full recovery	from failures in lazy update.
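
      The arithmetic of this example can be traced with a short Python
      sketch. All names are hypothetical. Intervals are whole seconds,
      the cap is taken one second below the bound to keep the
      relationship strict (so "essentially the MDLI" becomes 3599
      seconds), and the 1.5 and 0.5 multipliers are the ones used in
      this example only.

```python
HOUR = 3600
DAY = 24 * HOUR
MDLI = 1 * HOUR
DESIRED = 3 * DAY


def offer(desired, acked_remaining, mdli):
    """Actual client lease: the desired lease, capped just below
    (remaining acknowledged Secondary interval + MDLI)."""
    return min(desired, acked_remaining + mdli - 1)


def secondary_update(actual, desired, factor):
    """Lease interval sent to the Secondary: actual + factor * desired."""
    return actual + int(factor * desired)


# New lease: nothing acknowledged yet, so the client gets about the MDLI.
first_actual = offer(DESIRED, 0, MDLI)
first_acked = secondary_update(first_actual, DESIRED, 1.5)

# Renewal about half an hour later: the acknowledged interval has aged.
remaining = first_acked - HOUR // 2
renew_actual = offer(DESIRED, remaining, MDLI)
second_acked = secondary_update(renew_actual, DESIRED, 0.5)
```

      Here first_actual is just under one hour, first_acked is about
      4.5 days + 1 hour, renew_actual is the full 3 days, and
      second_acked is 4.5 days, matching the figures in the discussion.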

6.2.  Controlled re-allocation of IP addresses

   When the servers cannot communicate, neither server will allow an IP
   address previously used by one client to be offered to a different
   client.  As a corollary, during normal operations the primary server
   must update the secondary server whenever a lease expires or an IP
   address is released, and must receive acknowledgement of that update
   before offering the expired or released IP address to a different
   client.

7.  Server States

   The following server	states are defined:

   NORMAL State:

   NORMAL state is the state used by a server when it can communicate
   with the other server in the Primary-Secondary Server pair. When in
   this state, the Primary responds to DHCP client requests, while the
   Secondary does not.

   COMMUNICATION-INTERRUPTED state:

   A server goes into this state whenever it is	unable to communicate
   with	the other server. Both the Primary and Secondary Servers can go
   into	this state, although the behavior changes that result are dif-
   ferent. Primary and Secondary Servers cycle automatically (without
   administrative intervention)	between	NORMAL and COMMUNICATION-
   INTERRUPTED state as	the network connection between them fails and
   recovers, or	as the partner server cycles between operational and
   non-operational. No duplicate IP address allocation can occur while
   the servers cycle between these states.  In this state both servers
   may respond to DHCP client requests.	 When allocating new IP
   addresses, each server allocates from a different pool. When	respond-
   ing to renewal requests, each server	will allow continued renewal of
   a DHCP client's current lease on an IP address.

   PARTNER-DOWN	state:

   PARTNER-DOWN	state is a state either	server can enter. Once a server
   has entered NORMAL state, the PARTNER-DOWN state is entered only on
   command of an external agency (typically an administrator of	some
   sort) or after the expiration of an externally configured minimum
   safe-time after the beginning of COMMUNICATION-INTERRUPTED state.
   When in this state, the server no longer assumes that the other
   server could still be operational and servicing a different set of
   clients, but instead assumes that it is the only server operating.
   Only one server should be operating in this state at a time. The
   server in this state will respond to DHCP client requests. It will
   allow renewal of all outstanding leases on IP addresses, and will
   allocate IP addresses from its own pool, and after a fixed period of
   time, it will allocate IP addresses from the set of all available IP
   addresses. The server will transition out of PARTNER-DOWN state after
   automatic re-integration with the companion server is complete.  This
   automatic re-integration will typically be initiated by the restart
   of the server which was down.

   POTENTIAL-CONFLICT state:

   This	state indicates	that the two servers are attempting to rein-
   tegrate with	each other, but	at least one of	them was running in a
   state that did not guarantee	automatic reintegration	would be possi-
   ble.	 In POTENTIAL-CONFLICT state the servers may determine that the
   same	IP address has been offered and	accepted by two	different DHCP
   clients.

   RECOVER state:

   This	state indicates	that the server	has no information in its stable
   storage. A server in	this state will	attempt	to refresh its stable
   storage from	the other server.

   SYNC	state:

   In this state, the Secondary	Server attempts	to synchronize its
   stable storage with the Primary Server.  Both the Primary and Secon-
   dary	may have information that the other lacks.

8.  Primary Server Operation

   This	section	discusses the operation	of the primary server using the
   state transition diagram in Figure 8.2-1.

8.1.  Primary Server Initialization

   When	the Primary Server starts, there are three possibilities:  it
   has never started before and	therefore has no record	of any previous
   state nor of	any client binding information;	it has started before
   and has a record of a previous state	and possibly of	some client
   binding information;	it has started before, but failed catastrophi-
   cally, and now has no record	of any previous	state (nor of any client
   binding information).

   When	the Primary Server starts, if it has any record	of a previous
   state, then if that state was NORMAL	or COMMUNICATION-INTERRUPTED it
   moves to COMMUNICATION-INTERRUPTED state.  If that state was
   PARTNER-DOWN	or POTENTIAL-CONFLICT, then it moves to	PARTNER-DOWN
   state.  If that state was RECOVER, then the Primary Server moves into
   the RECOVER state.

   If it has no record of any previous state, then either this is an
   initial startup, or a recovery from a catastrophic failure where
   stable storage and all client binding information were lost. The two
   cases are distinguished by some external configuration indication to
   the Primary Server that it is recovering from a catastrophic failure.
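
   The startup rules above could be expressed as the following Python
   sketch. The function and parameter names are hypothetical. The
   handling of the "no record" case (PARTNER-DOWN on initial startup,
   RECOVER when an external indication is given) is an assumption
   based on one reading of the "<none>" and "<ext. cmd>" entry labels
   in Figure 8.2-1, not something this section states outright.

```python
def primary_startup_state(previous_state, catastrophic_recovery=False):
    """Pick the Primary Server's initial state.

    previous_state is the state recorded in stable storage, or None if
    there is no record.  catastrophic_recovery stands for the external
    configuration indication that stable storage was lost.
    """
    if previous_state in ("NORMAL", "COMMUNICATION-INTERRUPTED"):
        return "COMMUNICATION-INTERRUPTED"
    if previous_state in ("PARTNER-DOWN", "POTENTIAL-CONFLICT"):
        return "PARTNER-DOWN"
    if previous_state == "RECOVER":
        return "RECOVER"
    if previous_state is None:
        # Assumption drawn from the entry labels of Figure 8.2-1.
        return "RECOVER" if catastrophic_recovery else "PARTNER-DOWN"
    raise ValueError("unknown previous state: %r" % (previous_state,))
```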

8.2.  Primary Server State Transitions

   Figure 8.2-1	is the diagram of the Primary Server's state transi-
   tions. The remainder	of this	section	contains information important
   to the understanding	of that	diagram.

   The server stays in the current state until all of the actions
   specified on the state transition are complete.  If communications
   fail during one of the actions, the server simply stays in the
   current state and attempts a transition whenever the conditions for a
   transition are later fulfilled.

   In the state	transition diagram below, the "+" or "-" in the	upper
   right corner	of each	state is a notation about whether communication
   is ongoing with the Secondary Server.  The legend "responsive" and
   "unresponsive" in each state	indicates whether the Primary Server is
   responsive to DHCP client requests in the respective	state.

   In the state transition diagram below, when communication is
   reestablished between the Primary and Secondary Server, the Primary
   server must record the state of the Secondary Server when the
   communication was reestablished.

   If the state	of the Secondary Server	changes	 while communicating,
   then	the Primary Server moves through the communications-failed tran-
   sition, and into whatever state results.  It	then immediately moves
   through whatever state transition is	appropriate given the current
   state of the	Secondary Server.

   DISCUSSION:

      The point	of this	technique is simplicity, both in explanation of
      the protocol and in its implementation.  The alternative to this
      technique	of memory of partner state and automatic state transi-
      tion on change of	partner	state is to have every state in	the fol-
      lowing diagram have a state transition for every possible	state of
      the partner.  With the approach adopted, only the	states in which
      communications are reestablished require a state transition for
      each possible partner state.

   All state transitions of the Primary Server must be recorded in its
   stable storage, and thus be available to the server after a server
   restart.

	       Previous	Primary	State:

	 NORMAL	or     RECOVER	       PARTNER DOWN
       COMMUNICATION  <ext. cmd>    POTENTIAL CONFLICT
	INTERRUPTED	  |		<none>
       +---+		  V		   |
       |     +----------------+	+-----------------+
       |     |		    - |	|		- |
       |     |	  RECOVER     |	|  PARTNER DOWN	  |<-----+
       |     | (unresponsive) |	|  (responsive)	  |	 |
       |     +----------------+	+-----------------+	 |
       |       |		 |	 |	 ^	 |
       |   Comm. OK		 |    Comm. OK	 |	 |
       |   Sec.	State:		 |  Sec. State:	Comm.	 |
       |    |	   |		 V  All	Others	Failed	 |
       |    |	RECOVER	    +<---+	 V	 |	 |
       |   All	   |	    |	    +-------------+	 |
       |  Others   |	 Comm. OK   |  POTENTIAL +|	 |
       |    |	  Note	Sec. State: |  CONFLICT	  |	 |
       |    |	  Poss.	 RECOVER    |(responsive) |<---- | --+
       |    V	  Error	  NORMAL    +-------------+	 |   |
       | Sec->Pri   |	 Pri->Sec	    |		 |   |
       |   Sync	    |	  Sync.	      Resolve Conflict	 |   |
       |    |	    |	    V		    V		 |   |
       | Wait MDLI  |	   +-----------------+		 |   |
       | from Fail. |	   |		   + | External	 |   |
       |    V	    V	   |	 NORMAL	     |-Command-->+   |
       |    +-----++------>|  (responsive)   |		 |   |
       |	  ^	   +-----------------+		 |   |
       |	  |		    |			 |   |
       |      Pri<->Sec		  Comm.		    External |
       |	Sync		 Failed		     Command |
       |	  |		    |			or   |
       |      Comm. OK		    |	       "Safe Period" |
       |     Sec. State:	    V		 expiration  |
       |       NORMAL	   +-----------------+		 |   |
       |     COMM. INT.	   |		   - |---------->+   |
       |      RECOVER------| COMMUNICATIONS  |		     |
       |		   |   INTERRUPTED   |	 Comm. OK    |
       +------------------>|  (responsive)   |--Sec. State:--+
			   +-----------------+	All Others

	   Figure 8.2-1:  Primary Server state diagram.

8.3.  Primary Server in	PARTNER-DOWN state

   When	it is in PARTNER-DOWN state, the Primary Server	operates largely
   as does a normal DHCP server, with none of the special algorithms
   described below.  In	PARTNER-DOWN state the Primary Server MUST
   respond to DHCP client requests.

   Any available IP address tagged as belonging	to the Secondary Server
   (at entry to	PARTNER-DOWN state) MUST NOT be	used until the MDLI
   beyond the entry into PARTNER-DOWN state has	elapsed.

   The Primary Server MUST NOT allocate an IP address to a DHCP client
   different from that to which it was allocated at the entrance to
   PARTNER-DOWN state until the MDLI beyond its expiration time has
   elapsed.  If this time would be earlier than the current time plus
   the MDLI, then the current time plus the MDLI is used.
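
   The two waiting rules above reduce to simple time arithmetic, shown
   here as a Python sketch. The function names are hypothetical, times
   are absolute seconds, and "now" is assumed to be the time at which
   the server evaluates the rule.

```python
def secondary_address_usable_at(partner_down_entry, mdli):
    """Earliest time an available address tagged as belonging to the
    Secondary Server may be used by the Primary."""
    return partner_down_entry + mdli


def earliest_reallocation_time(lease_expiration, now, mdli):
    """Earliest time an address may be given to a different client:
    the lease expiration plus the MDLI, but never earlier than the
    current time plus the MDLI."""
    return max(lease_expiration + mdli, now + mdli)
```

   For a lease that expired long ago, the second rule still forces a
   wait of one full MDLI from the current time.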

   Two options exist for lease times, with different ramifications flow-
   ing from each.

   If the Primary Server wishes	the Failover Protocol to protect it from
   loss	of stable storage in any state,	then it	should ensure that the
   MDLI	based lease time restrictions in Section 6.1 are maintained,
   even	in PARTNER-DOWN	state.

   If the Primary Server wishes	to forego the protection of the	Failover
   Protocol in the event of loss of stable storage, then it need recog-
   nize	no restrictions	on actual client lease times while in PARTNER-
   DOWN	state.

   The Primary Server MUST poll	the Secondary Server and attempt to
   establish communications and	synchronization	with it.

   Once	the Primary succeeds in	contacting the Secondary Server, the
   Primary examines the	state of the Secondary Server. If the state of
   the Secondary Server	is RECOVER or NORMAL, then both	servers	have
   been	running	in such	a way that duplicate IP	address	allocations were
   inhibited.  In this case, the Primary Server	updates	the Secondary
   Server with its client binding information, and moves into the NORMAL
   state.

   Once	contact	has been established, if the state of the Secondary
   Server is anything other than RECOVER or NORMAL then	the Primary
   Server moves	into the POTENTIAL-CONFLICT state.
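
   The decision taken once contact is reestablished can be summarized
   in a few lines of Python. This is a sketch only: the function name
   and the returned action string are hypothetical, and the state
   names are written as they appear in Section 7.

```python
def primary_partner_down_on_contact(secondary_state):
    """Return (next Primary state, action) when a Primary in
    PARTNER-DOWN state succeeds in contacting the Secondary."""
    if secondary_state in ("RECOVER", "NORMAL"):
        # Duplicate allocations were inhibited, so a one-way update is enough.
        return ("NORMAL", "update Secondary with client binding information")
    return ("POTENTIAL-CONFLICT", None)
```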

8.4.  Primary Server in	RECOVER	state

   When the Primary Server is initialized in the RECOVER state it
   expects to refresh its stable storage from an existing Secondary
   Server.  In this state the Primary Server MUST NOT respond to DHCP
   client requests.

   When	the Primary Server succeeds in contacting the Secondary	Server,
   if it determines that the Secondary Server is itself	in the RECOVER
   state (which	indicates that the Secondary Server has	no existing
   client binding information),	the Primary Server will	move directly
   into	NORMAL state after signaling some kind of an error (since some
   person had to explicitly start the Primary Server in	RECOVER	state to
   refresh its lost client binding information from the	Secondary, and
   the Secondary had no	state).

   If the Primary Server determines that the Secondary Server is in any
   state other than RECOVER, then the Secondary	Server has some	client
   binding information that the	Primary	Server needs before it moves
   into	the NORMAL state.  The Primary Server will attempt to refresh
   its state from the Secondary	Server,	and it will remain in the
   RECOVER state until it is successful	in doing so.

   The Primary Server MUST remain in RECOVER state until a period of at
   least the MDLI has passed since the Primary Server was known to have
   failed.  This is to allow any clients holding IP addresses that were
   allocated by the Primary Server prior to loss of Primary Server
   client binding information in stable storage to contact the Secondary
   Server or to have their leases time out.

   DISCUSSION:

      The actual requirement on	this wait period in RECOVER is that it
      start when the Primary Server went down, not necessarily when it
      came back	up.  If	the time when the Primary Server failed	is
      known, then it could be communicated to the recovering server, and
      the wait period could be reduced to the MDLI less	the difference
      between the current time and the time the	server failed. In this
      way, the waiting period could be minimized.
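
      The reduced wait period described here could be computed as in
      the following Python sketch. The function name is hypothetical,
      and how the failure time is learned is left open, as it is in
      the text above.

```python
def recover_wait_remaining(mdli, now, failure_time=None):
    """Seconds the Primary must still remain in RECOVER state.

    If the failure time is unknown, the full MDLI is waited from now.
    Otherwise the wait is the MDLI less the time already elapsed since
    the failure, and never negative.
    """
    if failure_time is None:
        return mdli
    return max(0, mdli - (now - failure_time))
```

      With an MDLI of 3600 seconds and a server that failed 1000
      seconds ago, 2600 seconds of waiting remain.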

8.5.  Primary Server in	NORMAL state

   When	in NORMAL state, the Primary Server takes the following	actions
   to implement	the Safe Failover Protocol:

	o Lease	Time Calculations

          As discussed in Section 6.1, "Control of lease time", the
          lease interval given to a DHCP client can never be more than
          the maximum delta lease interval greater than the acknowledged
          Secondary Server lease interval.

	  As long as the Primary Server	adheres	to this	constraint, the
	  specifics of the lease intervals that	it gives to either the
	  DHCP client or the Secondary DHCP server are implementation
	  dependent. One possible approach is shown in Section 6.1, but
	  that particular approach is in no way	required by this proto-
	  col.

	o Lazy Update of Secondary Server

          After an ACK of an IP address binding, the Primary Server
	  attempts to update the Secondary with	the binding information.
	  The lease time used in the update of the Secondary MUST be at
	  least	that given to the DHCP client in the DHCPACK.  It MAY,
	  however, be longer.

	o Reallocation of IP Addresses Between Clients

	  Whenever a client binding is released, a DHCPBNDUPD message
	  must be sent to the Secondary	Server,	setting	the binding
	  state	to RELEASED. However, until a DHCPBNDACK is received for
	  this message,	the IP address cannot be allocated to another
	  client.
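
   The reallocation rule in the last item might be tracked as in the
   Python sketch below. The class, its method names and the dictionary
   used to represent the DHCPBNDUPD message are all hypothetical. Only
   the rule itself (no reallocation until the DHCPBNDACK arrives)
   comes from the text above.

```python
class ReleaseTracker:
    """Holds released addresses back until the Secondary acknowledges."""

    def __init__(self):
        self.awaiting_ack = set()  # DHCPBNDUPD sent, no DHCPBNDACK yet
        self.reusable = set()      # acknowledged, may go to another client

    def release(self, ip):
        """A client released ip: build the update and hold the address."""
        self.reusable.discard(ip)
        self.awaiting_ack.add(ip)
        return {"type": "DHCPBNDUPD", "ip": ip, "status": "RELEASED"}

    def on_bndack(self, ip):
        """The Secondary acknowledged the RELEASED update for ip."""
        if ip in self.awaiting_ack:
            self.awaiting_ack.remove(ip)
            self.reusable.add(ip)

    def can_reallocate(self, ip):
        return ip in self.reusable
```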

8.6.  Primary Server in	COMMUNICATION-INTERRUPTED Mode

   When in COMMUNICATION-INTERRUPTED state the Primary Server operates
   in such a way that correct operation is ensured even if the Secondary
   Server is still up and operational, but unable to communicate with
   the Primary Server. When communications are reestablished between the
   Primary and Secondary Servers, if both are still in
   COMMUNICATION-INTERRUPTED state, then the re-integration of their
   operation will proceed automatically and without human intervention.
   The protocol is designed to ensure that reintegration will proceed
   in an error free manner and that no actions taken by either server
   while in COMMUNICATION-INTERRUPTED state will cause problems during
   reintegration.

   The Primary Server operates in COMMUNICATION-INTERRUPTED state as it
   does	in NORMAL state.

   However, since it cannot communicate	with the Secondary in this
   state, the acknowledged-Secondary-lease-time	will not be updated in
   any new bindings. This is likely to eventually cause	the actual-
   client-lease-times to be the	current-time plus the MDLI (unless this
   is greater than the desired-client-lease-time).
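
   A minimal sketch of the lease-time cap described above, in Python.
   The function name, the use of absolute expiry times in seconds, and
   the fallback to the current time when no acknowledgment is on record
   (or the acknowledged time is already past) are assumptions made for
   illustration; Section 6.1 remains the authority on the calculation.

```python
from typing import Optional


def actual_client_lease_expiry(now: int,
                               desired_expiry: int,
                               acked_secondary_expiry: Optional[int],
                               mdli: int) -> int:
    """Cap a client lease expiry; all values are in seconds."""
    # Assumption: with no acknowledgment on record, or one that is
    # already in the past, the cap is measured from the current time.
    if acked_secondary_expiry is None:
        base = now
    else:
        base = max(now, acked_secondary_expiry)
    # The client never gets more than MDLI beyond what the Secondary
    # has acknowledged, and never more than it asked for.
    return min(desired_expiry, base + mdli)
```

   With no acknowledgments arriving, every new binding is capped at the
   current time plus the MDLI, which is the behaviour the paragraph
   above anticipates for COMMUNICATION-INTERRUPTED state.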

   The Primary Server can simply queue updates to the Secondary	on com-
   munication interruption and stay in the NORMAL state. If, at	the time
   communication with the Secondary is reestablished, the Secondary
   remains in the NORMAL state as well,	then the queued	updates	for the
   Secondary will simply be processed.

   COMMUNICATION-INTERRUPTED state for the Primary Server is a signal
   that	it has stopped queuing updates to the Secondary, and is	able to
   respond to a	variety	of possible Secondary states.

   It is anticipated that some alarm condition would be	raised upon the
   transition from NORMAL state	to COMMUNICATION-INTERRUPTED state. Once
   the Primary Server has been in COMMUNICATION-INTERRUPTED state for a
   period equal	to the safe-period, then it can	(if configured to do so)
   transition into the PARTNER-DOWN state.  An external	command	may also
   force a transition to PARTNER-DOWN state.
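
   The exit conditions from COMMUNICATION-INTERRUPTED state can be
   sketched as follows. This is an illustration only: the function and
   parameter names are hypothetical, and a safe_period of None is used
   here to mean that the automatic transition is not configured.

```python
from typing import Optional


def state_after_interruption(entered_at: float,
                             now: float,
                             safe_period: Optional[float],
                             external_partner_down: bool) -> str:
    """Decide whether to leave COMMUNICATION-INTERRUPTED state."""
    # An external command forces the transition regardless of timing.
    if external_partner_down:
        return "PARTNER-DOWN"
    # The automatic transition applies only when a safe-period has
    # been configured and has fully elapsed.
    if safe_period is not None and now - entered_at >= safe_period:
        return "PARTNER-DOWN"
    return "COMMUNICATION-INTERRUPTED"
```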

9.  Secondary Server Operation

   The Secondary Server	responds to DHCP client	requests only in the
   PARTNER-DOWN	and COMMUNICATION-INTERRUPTED states.

9.1.  Secondary	Server Initialization

   When	the Secondary Server starts, there are three possibilities: it
   has never started before and	therefore has no record	of any previous
   state nor of	any client binding information;	it has started before
   and has a record of a previous state	and possibly of	some client
   binding information;	it has started before, but failed catastrophi-
   cally, and now has no record	of any previous	state (nor of any client
   binding information).

   When	the Secondary Server starts, if	it has any record of a previous
   state, then if that state was NORMAL, COMMUNICATION-INTERRUPTED, or
   SYNC, it moves to COMMUNICATION-INTERRUPTED state. If that state was
   PARTNER-DOWN	or POTENTIAL-CONFLICT, then it moves to	PARTNER-DOWN
   state. In all other cases (both other previous states and the cases
   where there is no record of a previous state), the Secondary	Server
   moves into the RECOVER state.
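
   The startup rule above amounts to a small lookup. A sketch in
   Python, where the function name and the use of None for "no record
   of a previous state" are illustrative assumptions:

```python
from typing import Optional


def secondary_startup_state(previous: Optional[str]) -> str:
    """Map the recorded previous state to the state entered at start."""
    if previous in ("NORMAL", "COMMUNICATION-INTERRUPTED", "SYNC"):
        return "COMMUNICATION-INTERRUPTED"
    if previous in ("PARTNER-DOWN", "POTENTIAL-CONFLICT"):
        return "PARTNER-DOWN"
    # Any other recorded state, or no record at all.
    return "RECOVER"
```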

9.2.  Secondary	Server State Transitions

   The server stays in the current state until all of the actions
   specified on the state transition are complete.  If communication
   fails during one of the actions, the server simply stays in the
   current state and attempts a transition whenever the conditions for
   a transition are later fulfilled.

   In the state transition diagram below, the "+" or "-" in the upper
   right corner of each state is a notation about whether communication
   is ongoing with the Primary Server. The legend "responsive" or
   "unresponsive" in each state indicates whether the Secondary Server
   is responsive to DHCP client requests in the respective state.

   In the state transition diagram below, when communication is
   reestablished between the Secondary and Primary Server, the Secondary
   Server must record the state of the Primary Server at the time
   communication was reestablished. If the state of the Primary Server
   changes while communicating, then the Secondary Server moves through
   the communications-interrupted transition, and into whatever state
   results.  At that time, it then immediately moves through whatever
   state transition is appropriate for the current state of the Primary
   Server.

   All state transitions of the	Secondary Server must be recorded in its
   stable storage, and thus be available to the	server after a server
   restart.

	       Previous	Secondary State:

	 NORMAL	   RECOVER	  PARTNER DOWN
       COMM. INT.   <none>	POTENTIAL CONFLICT
	  SYNC	      |		       |
       +---+	      V		       V
       |     +----------------+	+-----------------+
       |     |	  RECOVER   - |	|  PARTNER DOWN	- |<-----+
       |     | (unresponsive) |	|  (responsive)	  |	 |
       |     +----------------+	+-----------------+	 |
       |       |		 |	|	 ^	 |
       |   Comm. OK		 |   Comm. OK	 |	 |
       |   Pri.	State:		 |  Pri. State:	Comm.	 |
       |    |	   |		 V  All	Others	Failed	 |
       |    |	RECOVER	    +<---+	V	 |	 |
       |    |	   |	    |	    +--------------+	 |
       |    |	   |	 Comm. OK   |  POTENTIAL + |	 |
       |   All	   |	Pri. State: |  CONFLICT	   |	 |
       |  Others   |	 RECOVER    |(unresponsive)|<--- | --+
       |    |	  Note	    |	    +--------------+	 |   |
       |    |	  Poss.	 Sec->Pri	    |		 |   |
       |    V	  Error	  Sync.	      Resolve Conflict	 |   |
       | Pri->Sec  |	    V		    V		 |   |
       |   Sync	   |	   +-----------------+		 |   |
       |    V	   V	   |	 NORMAL	   + |-External->+   |
       |    +-----++------>| (unresponsive)  | Command	 |   |
       |	  ^	   +-----------------+		 |   |
       |      Pri<->Sec	      |	       ^		 |   |
       |	Sync	      |	 Start Alloc Timer	 |   |
       |	  |	      |	    Sec->Pri		 |   |
       |  +--------------+    |	      Sync		 |   |
       |  |	       + |--->+	       |	    External |
       |  |	SYNC	 |  Comm.   Comm. OK	     Command |
       |  | unresponsive | Failed  Pri.	State:		or   |
       |  +--------------+    |	     RECOVER   "Safe Period" |
       |	  ^	      V	       |	 expiration  |
       |	  |	  +------------------+		 |   |
       |      Comm. OK	  | COMMUNICATIONS - |---------->+   |
       |     Pri. State:  |    INTERRUPTED   |	 Comm. OK    |
       |       NORMAL-----|   (responsive)   |--Pri. State:--+
       |     COMM. INT.	  +------------------+	All Others
       |		     ^
       +---------------------+

	  Figure 9.2-1:	 Secondary Server State	Diagram.

9.3.  Secondary	Server in RECOVER state

   The Secondary DHCP server comes up in the RECOVER state when	it has
   no record of	any previous state (or that previous state was RECOVER).

   It stays in this state until	it establishes communication with the
   Primary Server, and is unresponsive to DHCP client requests in this
   state. Essentially it is idle until it can contact the Primary
   Server.

   When	it establishes communication with the Primary Server, it
   attempts to load its	client binding database	from that of the Primary
   Server using	the techniques specified in section 6.

   Once	the Secondary Server's client binding database is refreshed from
   that	of the Primary,	the Secondary Server moves into	NORMAL state.

9.4.  Secondary	Server in NORMAL state

   In NORMAL state, the Secondary Server receives state updates from the
   Primary Server in DHCPBNDUPD	messages.  It records these in its
   client binding database in stable storage and then sends the
   corresponding DHCPBNDACK message to the Primary Server.
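
   The ordering in this paragraph matters: the binding reaches stable
   storage before the acknowledgment is sent. A minimal sketch, in
   which the two callbacks (store and send_ack) are hypothetical
   stand-ins for an implementation's storage and transport layers:

```python
from typing import Callable, Dict


def handle_binding_update(ip_addr: str,
                          binding: Dict[str, object],
                          store: Callable[[str, Dict[str, object]], None],
                          send_ack: Callable[[str], None]) -> None:
    """Process one DHCPBNDUPD: commit first, acknowledge second."""
    # store() is assumed to return only once the record is durable.
    store(ip_addr, binding)
    # Only now may the DHCPBNDACK be sent to the Primary Server.
    send_ack(ip_addr)
```

   If store() raises, no acknowledgment is sent, so the Primary Server
   never treats an unrecorded update as acknowledged.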

   While in NORMAL state, the Secondary Server MUST also acquire a
   series of IP addresses from the Primary Server to be used to satisfy
   DHCPDISCOVER requests from DHCP clients when in
   COMMUNICATION-INTERRUPTED state. See Section 2.2.2 for details of
   this acquisition process.

   The Secondary Server	periodically polls the Primary Server with the
   DHCPPOLL message. If	it fails to receive a DHCPPRPL message in reply
   after a configured number of	retries	or some	administratively deter-
   mined time, the Secondary Server transitions	into COMMUNICATION-
   INTERRUPTED state. Both the DHCPPOLL	and DHCPPRPL messages carry the
   current status of the sender.
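
   A sketch of the polling rule, assuming a retry count as the
   configured limit. The send_poll callback is hypothetical: it is
   taken to send one DHCPPOLL and return the state carried in the
   DHCPPRPL, or None if no reply arrived in time.

```python
from typing import Callable, Optional


def poll_primary(send_poll: Callable[[], Optional[str]],
                 max_retries: int) -> Optional[str]:
    """Return the Primary's reported state, or None if every try failed."""
    # One initial attempt plus the configured number of retries.
    for _ in range(1 + max_retries):
        reply_state = send_poll()
        if reply_state is not None:
            return reply_state
    return None


def secondary_state_after_poll(current: str,
                               reply_state: Optional[str]) -> str:
    """Apply the NORMAL-state rule: no reply means interruption."""
    if current == "NORMAL" and reply_state is None:
        return "COMMUNICATION-INTERRUPTED"
    return current
```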

   If an external command is received by the Secondary Server, it can
   move from NORMAL to PARTNER-DOWN state directly.  Such a command
   might be sent when the Primary Server was removed from service, and
   an operator wanted the Secondary Server to take over immediately and
   completely from the Primary Server.  (Note that the Secondary Server
   also takes over from the Primary Server when in
   COMMUNICATION-INTERRUPTED state, but less completely than in
   PARTNER-DOWN state.)

9.5.  Secondary	Server in COMMUNICATION-INTERRUPTED state

   When in COMMUNICATION-INTERRUPTED state, the Secondary Server operates
   in such a way that correct operation is ensured even if the Primary
   Server is still up and operational but unable to communicate with the
   Secondary Server. When communications are reestablished between the
   Primary and Secondary Servers, if both are still in COMMUNICATION-
   INTERRUPTED state, then the re-integration of their operation will
   proceed automatically and without human intervention.  The protocol
   is designed to ensure that reintegration will proceed in an error
   free	manner and that	no actions taken by either server while	in
   COMMUNICATION-INTERRUPTED state will	cause any conflicts to occur
   during re-integration.

   In COMMUNICATION-INTERRUPTED state, the Secondary Server responds to
   DHCP client requests.

   When processing a DHCPREQUEST from a DHCP client, the Secondary
   Server MUST ensure that the client-lease-time is never more than the
   maximum-delta-lease-interval from the current-time, independent of
   the desired-client-lease-time.

   When	processing a DHCPRELEASE request from a	DHCP client or the
   expiration of a lease, the Secondary	Server must not	reallocate the
   IP address to a different client.  If the same client subsequently
   performs a DHCPDISCOVER request, the	Secondary Server SHOULD	offer it
   the previously used IP address.

   When	processing a DHCPDISCOVER request from a DHCP client, the secon-
   dary	MUST allocate IP addresses from	the list of IP addresses that it
   acquired from the Primary Server in RECOVER state.  When it exhausts
   this	list, it MUST stop responding to DHCPDISCOVER requests (except
   those it can	satisfy	by offering expired or released	IP addresses to
   their previously bound clients).
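
   The DHCPDISCOVER rule above can be sketched as follows. The data
   structures are illustrative assumptions: previous_bindings maps a
   client identifier to the expired or released address it last held,
   and private_pool is the list of addresses acquired from the Primary
   Server.

```python
from typing import Dict, List, Optional


def offer_for_discover(client_id: str,
                       previous_bindings: Dict[str, str],
                       private_pool: List[str]) -> Optional[str]:
    """Pick an address to offer, or None to stay silent."""
    # A returning client is offered its previous address; that address
    # is never handed to a different client in this state.
    if client_id in previous_bindings:
        return previous_bindings[client_id]
    # A new client is served from the private pool while it lasts.
    if private_pool:
        return private_pool.pop(0)
    # Pool exhausted: do not respond to this DHCPDISCOVER.
    return None
```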

   The Secondary Server MUST continue to send DHCPPOLL messages to the
   Primary Server when in COMMUNICATION-INTERRUPTED state.  If it
   receives a DHCPPRPL message in reply, the Secondary Server determines
   the state of the Primary Server.  If the Primary Server is in NORMAL
   or COMMUNICATION-INTERRUPTED state, then the Secondary Server moves
   into the SYNC state.

   If, however, the Primary Server is in RECOVER state, then the
   Secondary Server updates the Primary Server with its known client
   binding information, and moves into NORMAL state upon completion of
   that update.

   If instructed to do so by an outside agency (e.g., an administrator),
   the Secondary Server SHOULD move into PARTNER-DOWN state.  Once the
   Secondary Server has been in COMMUNICATION-INTERRUPTED state for a
   period equal to the safe-period, then it may (if configured to do so)
   transition into the PARTNER-DOWN state in the absence of an external
   command.
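
   A sketch of how a Secondary Server in COMMUNICATION-INTERRUPTED
   state might choose its next state from the Primary Server state
   reported in a DHCPPRPL. The function name is hypothetical, and the
   final branch follows the "All Others" arc of Figure 9.2-1 rather
   than any statement in the text of this section.

```python
def secondary_state_on_poll_reply(primary_state: str) -> str:
    """Next state for a Secondary leaving COMMUNICATION-INTERRUPTED."""
    if primary_state in ("NORMAL", "COMMUNICATION-INTERRUPTED"):
        # Both sides may hold changes the other lacks.
        return "SYNC"
    if primary_state == "RECOVER":
        # Entered only after the Primary has been updated with the
        # Secondary's known client binding information.
        return "NORMAL"
    # Assumption taken from the state diagram, not from the prose.
    return "POTENTIAL-CONFLICT"
```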

9.6.  Secondary Server in SYNCH state

   The Secondary Server does not respond to DHCP client requests when in
   SYNCH state.

   DISCUSSION:

      This is the entire reason for this state's existence; otherwise
      the activities specified for this state could happen as part of a
      state transition from the COMMUNICATION-INTERRUPTED state to the
      NORMAL state. However, in COMMUNICATION-INTERRUPTED state the
      Secondary Server responds to DHCP client requests. Having the
      Secondary Server respond to DHCP client requests during the
      synchronization process (and thus taking actions requiring further
      synchronization) seemed like a bad idea.

   The Secondary Server synchronizes its information with the Primary
   Server while in SYNCH state.  Both Primary and Secondary Servers may
   have information the other lacks because of operations performed
   while communications were interrupted.

   During the synchronization process, the Secondary Server continues to
   poll the Primary Server with DHCPPOLL messages.  If it fails to
   receive a reply, it moves back into COMMUNICATION-INTERRUPTED state.

   When synchronization is complete, the Secondary Server moves into
   NORMAL state.

9.7.  Secondary Server in PARTNER-DOWN state

   The Secondary Server responds to DHCP client requests when in
   PARTNER-DOWN state.

   Any available IP address which does not belong to the private pool
   established by the Secondary Server (at entry to PARTNER-DOWN state)
   MUST NOT be used until the MDLI beyond the entry into PARTNER-DOWN
   state has elapsed.

   The Secondary Server MUST NOT allocate an IP address to a DHCP client
   different from that to which it was allocated at the entrance to
   PARTNER-DOWN state until the MDLI beyond its expiration time has
   elapsed. If this time would be earlier than the current time plus the
   MDLI, then the current time plus the MDLI is used.
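
   The reallocation rule for PARTNER-DOWN state reduces to taking the
   later of two times. A sketch, with a hypothetical function name and
   with "current time" passed in as now (whether that means the time
   of entry to PARTNER-DOWN state or the time of evaluation is left to
   the caller, since the text does not say):

```python
def earliest_reallocation_time(expiry: int, now: int, mdli: int) -> int:
    """Earliest time an address may go to a different client (seconds)."""
    # MDLI beyond the lease expiry, but never sooner than MDLI from now.
    return max(expiry + mdli, now + mdli)
```

   For example, with an MDLI of 3600 seconds, a lease that expired at
   time 1000 and a current time of 5000, the address stays reserved
   until 8600.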

   Two options exist for lease times, with different ramifications
   flowing from each.

   If the Secondary Server wishes the Failover Protocol to protect it
   from loss of stable storage in any state, then it should ensure that
   the MDLI based lease time restrictions in Section 6.1 are maintained,
   even in PARTNER-DOWN state.

   If the Secondary Server wishes to forego the protection of the safe
   Failover Protocol in the event of loss of stable storage, then it MAY
   recognize no restrictions on actual client lease times while in
   PARTNER-DOWN state.

   The Secondary Server continues to poll the Primary Server with
   DHCPPOLL messages.  If the Secondary Server receives a reply, and the
   Primary Server is in the RECOVER state, the Secondary Server updates
   the Primary Server with all of the Secondary's client binding
   information, and then moves into the NORMAL state.

   If communications with the Primary Server are reestablished, and the
   Primary Server is in any other state but RECOVER, the Secondary
   Server moves into the POTENTIAL-CONFLICT state (as does the Primary
   Server).

9.8.  Secondary Server in POTENTIAL-CONFLICT state

   The Secondary Server enters POTENTIAL-CONFLICT state when the
   combination of its state and that of the Primary Server indicates
   that a potential conflict of IP address allocation has occurred.
   There is no guarantee that such a conflict has occurred -- just the
   possibility.  In this state each server compares its client binding
   information with that of the other server and any conflicts are
   resolved in an implementation dependent manner.

   When (and if) the resolution process completes, each server moves
   into the NORMAL state.

10.  Safe Period

   Due to the restrictions imposed on each server while in
   COMMUNICATION-INTERRUPTED state, long-term operation in this state is
   not feasible for either server. One reason that these states exist at
   all is to allow the servers to easily survive transient network
   communications failures of a few minutes to a few days (although the
   actual time periods will depend a great deal on the DHCP activity of
   the network in terms of arrival and departure of DHCP clients on the
   network).

   Eventually, when the servers are unable to communicate, they will
   have to move into a state where they no longer can re-integrate
   without some possibility of a duplicate IP address allocation.
   There are two ways that they can move into this state (known as
   PARTNER-DOWN).

   First, they can be informed by external command that the partner
   server is, indeed, down. In this case, there is no difficulty in
   moving into the PARTNER-DOWN state since it is an accurate reflection
   of reality and the protocol has been designed to operate correctly
   (even during reintegration) if, when in PARTNER-DOWN state, the
   partner is, indeed, down.

   The second way arises when the servers are running unattended for
   extended periods. For this case the option is provided to configure
   something called a "safe-period" into each server. This OPTIONAL
   safe-period is the period after which either the Primary or Secondary
   Server will automatically transition to PARTNER-DOWN from
   COMMUNICATION-INTERRUPTED state.  If this transition is completed and
   the partner is not down, then the possibility of duplicate IP address
   allocations will exist.

   The goal of the "safe-period" is to allow network operations staff
   some time to react to a server moving into COMMUNICATION-INTERRUPTED
   state.  During the safe-period the only requirement is that the
   network operations staff determine if both servers are still running
   -- and if they are, to either fix the network communications failure
   between them, or to take one of the servers down before the
   expiration of the safe-period.

   The length of the safe-period is installation dependent, and depends
   in large part on the number of unallocated IP addresses within the
   subnet address pool and the expected frequency of arrival of
   previously unknown DHCP clients requiring IP addresses.  Many
   environments should be able to support safe-periods of several days.

   During this safe period, either server will allow renewals from any
   existing client.  The only limitation concerns the need for IP
   addresses for the DHCP server to hand out to new DHCP clients and the
   need to re-allocate IP addresses to different DHCP clients.

   The number of "extra" IP addresses required is equal to the expected
   total number of new DHCP clients encountered during the safe period.
   This is dependent only on the arrival rate of new DHCP clients, not
   the total number of outstanding leases on IP addresses.
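
   As a worked illustration of this sizing rule, the number of "extra"
   addresses is the arrival rate of new clients multiplied by the
   length of the safe period. The function name and the example
   figures below are invented for illustration and are not taken from
   any measured network.

```python
import math


def extra_addresses_needed(new_clients_per_hour: float,
                           safe_period_hours: float) -> int:
    """Expected new clients during the safe period, rounded up."""
    return math.ceil(new_clients_per_hour * safe_period_hours)
```

   At 2.5 new clients per hour, a three-day (72 hour) safe period calls
   for 180 spare addresses, regardless of how many leases are already
   outstanding.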

   In the unlikely event that a relatively short safe period of an hour
   is all that can be used (given a dearth of IP addresses or a very
   high arrival rate of new DHCP clients), even that can provide
   substantial benefits in allowing the DHCP subsystem to ride through
   minor problems that could occur and be fixed within that hour.  In
   these cases, no possibility of duplicate IP address allocation
   exists, and re-integration after the failure is solved will be
   automatic and require no operator intervention.

11.  Open Issues

A number of details remain to be worked out.  They are as follows:

     1.	Level of Agreement and Completion

	This draft is incomplete in two senses.  First, none of the
	authors agree with everything written, and quite a number of
	issues remain to be worked out among the various authors (to say
	nothing about the rest of the DHCP community).  Second, this
	draft is not yet complete enough to support creation of
	inter-operable implementations.

	However, we believe that even though this draft is very much a
	work in progress, there is value in sharing it with the rest of
	the DHCP community in its current form.

     2.	Failover Port

	We need to resolve whether the Failover protocol runs with the
	same or a different port as the DHCP protocol.  In the interests
	of allowing implementation of the Failover protocol by a
	different process or sub-process, having it use a different port
	seems reasonable.

     3.	High Level Operations

	While the detailed operations are beginning to come together,
	the higher level operations (like reintegration) are, as yet,
	incompletely specified.  This will be rectified in a later
	revision.

     4.	Option Spaces

	The draft currently reflects some rather fuzzy goals of using
	DHCP options where they apply but also defining new options.  It
	uses the "user defined option space" for this, which is probably
	not a good idea.  Perhaps the DHCP Panel will produce a larger
	option space in which all of these options can be defined, or
	perhaps (as it is written in the draft) this protocol will just
	have to define entirely unique options.

     5.	Subnet Level Granularity

	This protocol talks about a server being in one state or
	another; however, the desire is for this protocol to operate
	independently in each address pool for which a primary and
	secondary server is defined.  In this way, the "server" state
	really refers to the "subnet" state.  Once the protocol is
	validated, the editing work to make it operate at subnet
	granularity will be performed.

     6.	Secondary Server Communications	with DHCP Clients

	There are two situations where we may want to transfer bindings updates allow the	secon-
	dary server to communicate with	DHCP clients even though the
	secondary can communicate with the primary and would normally be
	unresponsive to devise a separate transfer protocol based on
      TCP.

4 Exchange of control between primary and secondary	DHCP client requests.

	The primary and secondary servers coordinate the exchange control
	first situation which deserves consideration is where the
	secondary has given a DHCP client a lease on an IP address when
	it was not able to communicate with the primary, and then
	subsequently the secondary becomes able to communicate with the
	primary.  When the client unicasts its DHCPREQUEST to the
	secondary to renew its lease, the secondary will not be able to
	communicate with the client (as this protocol is defined).
	Should we allow the secondary to extend the lease for the DHCP
	client and then inform the primary of that extension using the
	DHCPBNDUPD message in the same way as the primary uses that
	message?

	The second situation arises where a client can only communicate
	with the secondary due to some network failure, but the primary
	and secondary servers can communicate.  As written, the protocol
	will not allow the secondary to offer a lease to the DHCP
	client, but it would be straightforward to modify the protocol
	to allow the secondary to do so.  The only difficult part of
	this change would be to suggest how the secondary would know
	that the DHCP client could talk only to the secondary.  However,
	if the primary could talk to the DHCP client, the secondary
	would expect to hear about it in DHCPBNDUPD messages at some
	point, so the absence of such messages could be used as a signal
	to communicate with the DHCP client in question.
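
	The heuristic suggested for the second situation can be sketched
	as follows.  This is only an illustration under stated
	assumptions: the class name, the quiet_period tunable, and the
	idea of timing from the first client request seen are all
	hypothetical and are not defined by this protocol.

```python
class SecondaryClientTracker:
    """Hypothetical helper: decide when a secondary might answer a client.

    Assumption: the secondary sees the client's requests and expects a
    DHCPBNDUPD from the primary if the primary is serving that client.
    """

    def __init__(self, quiet_period):
        # Hypothetical tunable (seconds) with no DHCPBNDUPD before answering.
        self.quiet_period = quiet_period
        # client id -> time the first still-unexplained request was seen
        self.first_request = {}

    def saw_bndupd(self, client_id, now):
        # The primary is evidently talking to this client; start over.
        self.first_request.pop(client_id, None)

    def should_answer(self, client_id, now):
        # Remember when this client was first heard without a matching update.
        start = self.first_request.setdefault(client_id, now)
        return now - start >= self.quiet_period
```

	Because DHCPBNDUPD messages are "lazy", any such quiet period
	would presumably need to be longer than the delay the primary is
	allowed before sending its updates.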

     7.	UDP or TCP

	There has been much debate about the utility of using UDP for
	the failover protocol, since it doesn't supply guaranteed
	delivery.  Certainly rebuilding TCP out of UDP would be a
	mistake.  Some factors to consider in this debate are as
	follows:

	First, it is important to recognize that mere receipt of a
	packet by the other server in the pair (e.g., receipt of a
	DHCPBNDUPD packet by the secondary server) is not sufficient for
	the primary to update its own bindings database with new
	information about what the secondary knows.  In all cases of
	transfers of bindings information, the receiver of a DHCPBNDUPD
	message MUST update its own stable storage prior to replying
	with a DHCPBNDACK message (except in the marginal case where all
	of the updates are rejected).  An action is required by the
	receiving server and an explicit ACK is needed by the sending
	server to ensure the integrity of the protocol.  So, just
	knowing that the other server has received a Failover protocol
	packet is not intrinsically interesting.
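
	The ordering requirement above (commit to stable storage, then
	acknowledge) can be sketched as follows.  This is a minimal
	illustration, not the message format of this draft: the class,
	the dictionary standing in for stable storage, and the
	"newer expiry wins" acceptance rule are all assumptions made for
	the example.

```python
class FailoverReceiver:
    """Hypothetical receiver of DHCPBNDUPD messages."""

    def __init__(self):
        self.stable_storage = {}  # stands in for a durable bindings database
        self.log = []             # records the order of actions, for illustration

    def _accept(self, ip, binding):
        # Assumed policy: reject an update older than what is already stored.
        current = self.stable_storage.get(ip)
        return current is None or binding["expiry"] >= current["expiry"]

    def handle_bndupd(self, updates):
        accepted = {ip: b for ip, b in updates.items() if self._accept(ip, b)}
        if accepted:
            # Commit first; the ACK below is only built after this completes.
            self.stable_storage.update(accepted)
            self.log.append("commit")
        # If every update was rejected there is nothing to commit.
        self.log.append("ack")
        return {"type": "DHCPBNDACK", "accepted": sorted(accepted)}
```

	The sender, for its part, would change its own picture of what
	the peer knows only when such an ACK arrives, never merely
	because the update was transmitted.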

	Second, the DHCP protocol, both the client and server side, is
	being implemented in progressively smaller and smaller machines.
	While this progression is most evident in DHCP clients, there
	exist implementations today of DHCP servers embedded in devices
	that are by no stretch of the imagination traditional "servers"
	running mainstream operating systems.  In many ways, the
	Failover protocol is very well suited to such devices.  Adding
	additional protocol infrastructure requirements to implement the
	Failover protocol could easily prevent its implementation in
	devices that in some ways need it most.

	Third, there are only a few cases where the Failover protocol
	requires guaranteed delivery of packets.  In particular, the
	normal Primary to Secondary DHCPBNDUPD message does not have to
	be delivered reliably.  The consequences of lost DHCPBNDUPD
	messages are handled by the use of the MDLI, for the simple
	reason that since these messages are "lazy", they may not get
	delivered because of a server failover prior to their
	transmission.  Given that the protocol is robust in the face of
	loss of either a DHCPBNDUPD message or a DHCPBNDACK message, a
	technique known as "fire and forget" may be used with this
	protocol and two cooperating implementations.  If the DHCPBNDACK
	message contains all of the information originally in the
	DHCPBNDUPD message, then the DHCPBNDUPD message may be
	transmitted and forgotten by the sending server (typically the
	primary).  When and if the secondary receives the DHCPBNDUPD and
	replies with a DHCPBNDACK message and the primary receives it,
	the primary will update its stable storage with a new picture of
	what the secondary knows about the lease time.  If either of
	these messages is lost, the only downside is that the DHCP
	client associated with the binding in question may receive a
	shorter lease for one lease period than it would otherwise.
	This "fire and forget" technique could substantially ease both
	the complexity of implementation and memory requirements of an
	implementation of the Failover protocol, especially where two
	servers were communicating over a very slow link.
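
	A minimal sketch of the "fire and forget" idea follows.  It
	assumes, as the text requires, that the DHCPBNDACK repeats every
	field of the DHCPBNDUPD.  The class, the field names, and the
	peer_ack helper are hypothetical and do not reflect the actual
	message encoding.

```python
def peer_ack(update):
    """Hypothetical peer: echo every field of the update back in the ACK."""
    ack = dict(update)
    ack["type"] = "DHCPBNDACK"
    return ack


class FireAndForgetSender:
    """Hypothetical sending server (typically the primary)."""

    def __init__(self, send):
        self.send = send             # callable that transmits one message
        self.peer_known_expiry = {}  # what the peer is known to have recorded

    def send_bndupd(self, ip, client_id, expiry):
        # No copy of the update is kept and no retransmit timer is started.
        self.send({"type": "DHCPBNDUPD", "ip": ip,
                   "client": client_id, "expiry": expiry})

    def on_bndack(self, ack):
        # The ACK is self-describing, so no pending-update state is consulted.
        self.peer_known_expiry[ack["ip"]] = ack["expiry"]


# Usage: one exchange that completes and one whose update is lost.
outbox = []
sender = FireAndForgetSender(outbox.append)
sender.send_bndupd("10.0.0.5", "client-a", 3600)
sender.on_bndack(peer_ack(outbox.pop()))
sender.send_bndupd("10.0.0.6", "client-b", 3600)
outbox.clear()  # simulate loss; the sender holds no state to clean up
```

	After the lost exchange the sender simply has no record that the
	peer knows about the second binding, which is the case the text
	describes as costing the client one shorter lease period.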

12.  Acknowledgments

   Ralph Droms started it all, by sketching out	an initial interserver
   draft that embodied ideas from several past IETF meetings.  In that
   draft, he acknowledged contributions	by Jeff	Mogul, Greg Minshall,
   Rob Stevens, Walt Wimer, Ted Lemon, and the DHC working group.

   Kim Kinnear and Bob Cole each extended that draft, separately and
   then	together, until	they created an	interserver draft that supported
   any number of servers.  The complexity of that approach was just too
   great, and led to a much simpler approach embodied in the first
   Failover draft by Greg Rabil, Mike Dooley, Arun Kapur, and Ralph
   Droms.  This draft posited only two servers -- a primary and a
   secondary.  Kim Kinnear then wrote the Safe Failover draft to layer
   on top of the Failover draft and increase its robustness in the face
   of certain rare network failures.  At the spring 1998 IETF meeting
   in LA, the DHC working group said that they wanted a merged Failover
   and Safe Failover draft.  Steve Gonczi and Bernie Volz stepped up
   and produced the raw material for such a merged draft, along with a
   new message format designed around DHCP options and other extensions
   and clarifications.  Kim Kinnear edited their work into draft format
   and made other changes, and that is what you have in your hands.

   Many people have reviewed the various drafts that went into this
   result.  At American Internet, ideas have been contributed by Mark
   Stapp, Brad Parker, and Ellen Garvey.  Glenn Waters of Bay Networks
   contributed ideas and enthusiasm to make a Failover protocol that
   was both "safe" and "lazy".

13.  References

     [RFC 2119] Bradner, S., "Key words for use in RFCs to Indicate
     Requirement Levels", RFC 2119, March 1997.

     [RFC 2131] Droms, R., "Dynamic Host Configuration Protocol",
     RFC 2131, March 1997.

     [RFC 2132] Alexander, S., Droms, R., "DHCP Options and BOOTP
     Vendor Extensions", RFC 2132, March 1997.

	[3] Rabil, G., Dooley, M., Kapur, A., Droms, R., "DHCP Failover
	    Protocol", draft-ietf-dhc-failover-00.txt.

	[4] Gudmundsson, Olafur, "Security Architecture	for DHCP",
	    draft-ietf-dhc-security-arch-00.txt.

14.  Author's information

      Ralph Droms
      323 Dana Engineering
      Bucknell University
      Lewisburg, PA  17837

      Phone: (717) 524-1145
      EMail: droms@bucknell.edu

      Greg Rabil, Mike Dooley, Arun Kapur
      Quadritek	Systems, Inc.
      10 Valley Stream Parkway, Suite 240
      Malvern, PA 19355

      Phone: (800) 208-2747
      EMail: grabil@quadritek.com
	     mdooley@quadritek.com
	     akapur@quadritek.com

      Kim Kinnear
      American Internet	Corporation
      4	Preston	Ct.
      Bedford, MA  01730-2334

      Phone: (781) 276-4587
      EMail: kinnear@american.com

      Steve Gonczi, Bernie Volz
      Process Software Corporation
      959 Concord St.
      Framingham, MA  01701

      Phone: (508) 879-6994

      EMail: gonczi@process.com
	     volz@process.com