--- 1/draft-ietf-mpls-lsp-ping-05.txt 2006-02-05 00:41:12.000000000 +0100 +++ 2/draft-ietf-mpls-lsp-ping-06.txt 2006-02-05 00:41:12.000000000 +0100 @@ -1,40 +1,40 @@ -Network Working Group K. Kompella (Juniper) -Internet Draft P. Pan (Ciena) -draft-ietf-mpls-lsp-ping-05.txt N. Sheth (Juniper) -Category: Standards Track D. Cooper (Global Crossing) -Expires: August 2004 G. Swallow (Cisco) - S. Wadhwa (Juniper) - R. Bonica (WorldCom) - February 2004 - Detecting MPLS Data Plane Failures +Network Working Group K. Kompella +Internet Draft Juniper Networks +Category: Standards Track G. Swallow +Expires: January 2005 Cisco Systems + July 2004 - + Detecting MPLS Data Plane Failures + draft-ietf-mpls-lsp-ping-06.txt + *** DRAFT *** Status of this Memo - This document is an Internet-Draft and is in full conformance with - all provisions of Section 10 of RFC2026. + By submitting this Internet-Draft, I certify that any applicable + patent or other IPR claims of which I am aware have been disclosed, + and any of which I become aware will be disclosed, in accordance with + RFC 3668. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference - material or to cite them other than as ``work in progress.'' + material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at - http://www.ietf.org/ietf/1id-abstracts.txt + http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. Copyright Notice Copyright (C) The Internet Society (2004). All Rights Reserved. 
Abstract @@ -42,37 +42,33 @@ used to detect data plane failures in Multi-Protocol Label Switching (MPLS) Label Switched Paths (LSPs). There are two parts to this document: information carried in an MPLS "echo request" and "echo reply" for the purposes of fault detection and isolation; and mechanisms for reliably sending the echo reply. Changes since last revision (This section to be removed before publication.) - *** Changed the format of an L2 circuit ID FEC. Added a sender's PE - address field to uniquely identify the VC ID *** - - Further clarified that an MPLS echo request/reply can be either an - IPv4 or an IPv6 packet. - - Added format pictures for LDP IPv4/IPv6 prefixes. + *** Changed the format of an L2 circuit ID FEC back to what it was, + on demand. Added a new FEC with sender's PE address field to + uniquely identify the VC ID *** - Clarified the section on Receiving an MPLS Echo Request. + *** Added a FEC TLV for "Labeled BGP IPv4" *** -Issues + Reformatted section on Downstream Mapping - (This section to be removed before publication.) + Described issue with (and solution to) problem with VPN IPv4/6 - Need to address issues with pinging L3VPN FECs. + Rephrased section on receiving an LSP ping - Need to add new FEC type for "type 129" L2 circuits. + Clarified "Expert Review" allocation policy. 1. Introduction This document describes a simple and efficient mechanism that can be used to detect data plane failures in MPLS LSPs. There are two parts to this document: information carried in an MPLS "echo request" and "echo reply"; and mechanisms for transporting the echo reply. The first part aims at providing enough information to check correct operation of the data plane, as well as a mechanism to verify the data plane against the control plane, and thereby localize faults. @@ -74,74 +70,82 @@ to this document: information carried in an MPLS "echo request" and "echo reply"; and mechanisms for transporting the echo reply. 
The first part aims at providing enough information to check correct operation of the data plane, as well as a mechanism to verify the data plane against the control plane, and thereby localize faults. The second part suggests two methods of reliable reply channels for the echo request message, for more robust fault isolation. An important consideration in this design is that MPLS echo requests follow the same data path that normal MPLS packets would traverse. - MPLS echo requests are meant primarily to validate the data plane, and secondarily to verify the data plane against the control plane. Mechanisms to check the control plane are valuable, but are not covered in this document. To avoid potential Denial of Service attacks, it is recommended to - regulate the MPLS ping traffic going to the control plane. A rate + regulate the LSP ping traffic going to the control plane. A rate limiter should be applied to the well-known UDP port defined below. 1.1. Conventions The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [KEYWORDS]. 1.2. Structure of this document The body of this memo contains four main parts: motivation, MPLS echo - request/reply packet format, MPLS ping operation, and a reliable + request/reply packet format, LSP ping operation, and a reliable return path. It is suggested that first-time readers skip the actual packet formats and read the Theory of Operation first; the document is structured the way it is to avoid forward references. - The last section (reliable return path for RSVP LSPs) may be removed - in a future revision. +1.3. Contributors + + The following made vital contributions to all aspects of this + document, and much of the material came out of debate and discussion + among this group. + + Ronald P. Bonica, MCI + Dave Cooper, Global Crossing + Ping Pan, Ciena + Nischal Sheth, Juniper Networks, Inc. 
+ Sanjay Wadhwa, Juniper Networks 2. Motivation When an LSP fails to deliver user traffic, the failure cannot always be detected by the MPLS control plane. There is a need to provide a tool that would enable users to detect such traffic "black holes" or misrouting within a reasonable period of time; and a mechanism to isolate faults. In this document, we describe a mechanism that accomplishes these goals. This mechanism is modeled after the ping/traceroute paradigm: ping (ICMP echo request [ICMP]) is used for connectivity checks, and traceroute is used for hop-by-hop fault localization as well as path tracing. This document specifies a "ping mode" and a "traceroute" mode for testing MPLS LSPs. - The basic idea is to test that packets that belong to a particular + The basic idea is to verify that packets that belong to a particular Forwarding Equivalence Class (FEC) actually end their MPLS path on an LSR that is an egress for that FEC. This document proposes that this test be carried out by sending a packet (called an "MPLS echo request") along the same data path as other packets belonging to this FEC. An MPLS echo request also carries information about the FEC whose MPLS path is being verified. This echo request is forwarded just like any other packet belonging to that FEC. In "ping" mode (basic connectivity check), the packet should reach the end of the path, at which point it is sent to the control plane of the egress - LSR, which then verifies that it is indeed an egress for the FEC. In - "traceroute" mode (fault isolation), the packet is sent to the + LSR, which then verifies whether it is indeed an egress for the FEC. 
+ In "traceroute" mode (fault isolation), the packet is sent to the control plane of each transit LSR, which performs various checks that it is indeed a transit LSR for this path; this LSR also returns further information that helps check the control plane against the data plane, i.e., that forwarding matches what the routing protocols determined as the path. One way these tools can be used is to periodically ping a FEC to ensure connectivity. If the ping fails, one can then initiate a traceroute to determine where the fault lies. One can also periodically traceroute FECs to verify that forwarding matches the @@ -297,36 +301,43 @@ 10 Mapping for this FEC is not the given label at stack depth 11 No label entry at stack-depth 12 Protocol not associated with interface at FEC stack depth + 13 Premature termination of ping due to label stack + shrinking to a single label + 3.2. Target FEC Stack A Target FEC Stack is a list of sub-TLVs. The number of elements is determined by looking at the sub-TLV length fields. Sub-Type # Length Value Field ---------- ------ ----------- 1 5 LDP IPv4 prefix 2 17 LDP IPv6 prefix 3 20 RSVP IPv4 Session Query 4 56 RSVP IPv6 Session Query 5 Reserved; see Appendix 6 13 VPN IPv4 prefix 7 25 VPN IPv6 prefix 8 14 L2 VPN endpoint - 9 10 L2 circuit ID + 9 10 "FEC 128" Pseudowire (old) + 10 14 "FEC 128" Pseudowire (new) + 11 13+ "FEC 129" Pseudowire + 12 10 BGP labeled IPv4 prefix + Other FEC Types will be defined as needed. Note that this TLV defines a stack of FECs, the first FEC element corresponding to the top of the label stack, etc. An MPLS echo request MUST have a Target FEC Stack that describes the FEC stack being tested. 
For example, if an LSR X has an LDP mapping for 192.168.1.1 (say label 1001), then to verify that label 1001 does indeed reach an egress LSR that announced this prefix via LDP, X can send an MPLS echo request with a FEC Stack TLV with one FEC in it, @@ -428,37 +439,39 @@ | | | | | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Must Be Zero | LSP ID | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 3.2.5. VPN IPv4 Prefix The value field consists of the Route Distinguisher advertised with - the VPN IPv4 prefix, the IPv4 prefix and a prefix length, as follows: + the VPN IPv4 prefix, the IPv4 prefix (with trailing 0 bits to make 32 + bits in all) and a prefix length, as follows: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Route Distinguisher | | (8 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IPv4 prefix | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Prefix Length | Must Be Zero | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 3.2.6. 
VPN IPv6 Prefix The value field consists of the Route Distinguisher advertised with - the VPN IPv6 prefix, the IPv6 prefix and a prefix length, as follows: + the VPN IPv6 prefix, the IPv6 prefix (with trailing 0 bits to make + 128 bits in all) and a prefix length, as follows: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Route Distinguisher | | (8 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IPv6 prefix | | | | | @@ -477,49 +490,116 @@ 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Route Distinguisher | | (8 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Sender's CE ID | Receiver's CE ID | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Encapsulation Type | Must Be Zero | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -3.2.8. L2 Circuit ID +3.2.8. FEC 128 Pseudowire (Deprecated) + + The value field consists of the remote PE address (the destination + address of the targetted LDP session), a VC ID and an encapsulation + type, as follows: + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Remote PE Address | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | VC ID | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Encapsulation Type | Must Be Zero | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + This FEC will be deprecated, and is retained only for backward + compatibility. Implementations of LSP ping SHOULD accept and process + this TLV, but SHOULD send LSP ping echo requests with the new TLV + (see next section), unless explicitly asked by configuration to use + the old TLV. 
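
As an illustration of the deprecated FEC 128 Pseudowire layout shown above, the following Python sketch packs the value field (Remote PE Address, VC ID, Encapsulation Type, Must Be Zero). The function name and the sample values are hypothetical and not part of the draft; the sketch assumes an IPv4 remote PE address and network byte order, and it does not emit the sub-TLV type and length header.

```python
import socket
import struct


def pack_fec128_old_value(remote_pe: str, vc_id: int, encap_type: int) -> bytes:
    """Pack the value field of the deprecated FEC 128 Pseudowire sub-TLV.

    Hypothetical helper for illustration only.
    """
    # "!" selects network byte order:
    #   4s = Remote PE Address (IPv4, 4 octets)
    #   I  = VC ID (32 bits)
    #   H  = Encapsulation Type (16 bits)
    #   H  = Must Be Zero (16 bits)
    return struct.pack("!4sIHH", socket.inet_aton(remote_pe), vc_id, encap_type, 0)


# Sample values chosen only for the example.
value = pack_fec128_old_value("192.0.2.1", 100, 5)
```

The packed value is 12 octets including the trailing Must Be Zero field, while the sub-TLV table lists a Length of 10 for this sub-type.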
+ + An LSR receiving this TLV SHOULD use the source IP address of the LSP + echo request to infer the Sender's PE Address. + +3.2.9. FEC 128 Pseudowire (Current) The value field consists of the sender's PE address (the source address of the targetted LDP session), the remote PE address (the destination address of the targetted LDP session), a VC ID and an encapsulation type, as follows: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Sender's PE Address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Remote PE Address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | VC ID | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Encapsulation Type | Must Be Zero | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +3.2.10. FEC 129 Pseudowire + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Sender's PE Address | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Remote PE Address | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | PW Type | AGI Length | SAII Length | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | TAII Length | AGI Value ... SAII Value ... TAII Value ... + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + . ... . + . . + . . + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + ... | 0-3 octets of zero padding | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + The Length of this TLV is 13 + AGI length + SAII length + TAII + length. Padding is used to make the total length a multiple of 4; + the length of the padding is not included in the Length field. + +3.2.11. 
BGP Labeled IPv4 Prefix + + The value field consists of the BGP Next Hop associated with the NLRI + advertising the prefix and label, the IPv4 prefix (with trailing 0 + bits to make 32 bits in all), and the prefix length, as follows: + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | BGP Next Hop | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | IPv4 Prefix | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Prefix Length | Must Be Zero | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + 3.3. Downstream Mapping The Downstream Mapping object is an optional TLV. Only one Downstream Mapping request may appear in an echo request. The presence of a Downstream Mapping object is a request that Downstream Mapping objects be included in the echo reply. If the replying router is the destination of the FEC, then a Downstream Mapping TLV SHOULD NOT be included in the echo reply. Otherwise the echo reply SHOULD include a Downstream Mapping object for each - interface over which this FEC could be forwarded. + interface over which this FEC could be forwarded. For a more precise + definition of the notion of "downstream", see the section named + "Downstream". The Length is 16 + M + 4*N octets, where M is the Multipath Length, and N is the number of Downstream Labels. 
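
The Length formula above can be checked with a small Python sketch. The function name is hypothetical; only the formula 16 + M + 4*N comes from the text.

```python
def downstream_mapping_length(multipath_length: int, num_labels: int) -> int:
    """Return the Downstream Mapping Length in octets.

    16 fixed octets, plus M octets of Multipath Information,
    plus 4 octets for each of the N Downstream Labels.
    """
    if multipath_length < 0 or num_labels < 0:
        raise ValueError("lengths must be non-negative")
    return 16 + multipath_length + 4 * num_labels


# No multipath information and a single downstream label.
single_label = downstream_mapping_length(0, 1)
# 8 octets of multipath information and a two-label stack.
two_labels = downstream_mapping_length(8, 2)
```

For example, a mapping with no Multipath Information and one Downstream Label has a Length of 20 octets.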
The Value field of a Downstream Mapping has the following format: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | MTU | Address Type | Resvd (SBZ) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ @@ -541,29 +621,31 @@ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Downstream Label | Protocol | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Maximum Transmission Unit (MTU) The MTU is the largest MPLS frame (including label stack) that fits on the interface to the Downstream LSR. Address Type - The Address Type indicates if the interface is numbered or unnumbered and is set to one of the following values: Type # Address Type ------ ------------ 1 IPv4 2 Unnumbered 3 IPv6 + + Reserved + The field marked SBZ SHOULD be set to zero when sending and SHOULD be ignored on receipt. Downstream IP Address and Downstream Interface Address If the interface to the downstream LSR is numbered, then the Address Type MUST be set to IPv4 or IPv6, the Downstream IP Address MUST be set to either the downstream LSR's Router ID or the interface address of the downstream LSR, and the Downstream Interface Address MUST be set to the downstream LSR's interface @@ -579,87 +661,41 @@ The length in octets of the Multipath Information. Downstream Label(s) The set of labels in the label stack as it would have appeared if this router were forwarding the packet through this interface. Any Implicit Null labels are explicitly included. Labels are treated as numbers, i.e. they are right justified in the field. - Protocol + A Downstream Label is 24 bits, in the same format as an MPLS + label minus the TTL field, i.e., the MSBit of the label is bit 0, + the LSbit is bit 19, the EXP bits are bits 20-22, and bit 23 is + the S bit. The replying router SHOULD fill in the EXP and S + bits; the LSR receiving the echo reply MAY choose to ignore these + bits. 
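
The 24-bit Downstream Label layout described above (20-bit label, 3 EXP bits, S bit), followed by the 8-bit Protocol field, can be sketched in Python as below. The function names and the sample label are hypothetical; the sketch assumes network byte order for the 32-bit word that holds one Downstream Label and its Protocol.

```python
import struct


def pack_downstream_label(label: int, exp: int, s: int, protocol: int) -> bytes:
    """Pack one Downstream Label (24 bits) and its Protocol (8 bits)."""
    if not (0 <= label < 2**20 and 0 <= exp < 8 and s in (0, 1)
            and 0 <= protocol < 256):
        raise ValueError("field out of range")
    # Bits 0-19: label value, bits 20-22: EXP, bit 23: S (bit 0 is the MSBit).
    label24 = (label << 4) | (exp << 1) | s
    return struct.pack("!I", (label24 << 8) | protocol)


def unpack_downstream_label(data: bytes):
    """Return (label, exp, s, protocol) from a 4-octet field."""
    (word,) = struct.unpack("!I", data)
    label24, protocol = word >> 8, word & 0xFF
    return label24 >> 4, (label24 >> 1) & 0x7, label24 & 0x1, protocol


# Label 1001 at the bottom of the stack (S = 1), Protocol 3.
packed = pack_downstream_label(1001, 0, 1, 3)
```

A receiver that chooses to ignore the EXP and S bits, as the text permits, would simply discard the second and third elements of the unpacked tuple.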
+ Protocol The Protocol is taken from the following table: Protocol # Signaling Protocol ---------- ------------------ 0 Unknown 1 Static 2 BGP 3 LDP 4 RSVP-TE 5 Reserved; see Appendix - The notion of "downstream router" and "downstream interface" - should be explained. Consider an LSR X. If a packet that was - originated with TTL n>1 arrived with outermost label L at LSR X, - X must be able to compute which LSRs could receive the packet if - it was originated with TTL=n+1, over which interface the request - would arrive and what label stack those LSRs would see. (It is - outside the scope of this document to specify how this - computation is done.) The set of these LSRs/interfaces are the - downstream routers/interfaces (and their corresponding labels) - for X with respect to L. Each pair of downstream router and - interface requires a separate Downstream Mapping to be added to - the reply. (Note that there are multiple Downstream Label fields - in each TLV as the incoming label L may be swapped with a label - stack.) - - The case where X is the LSR originating the echo request is a - special case. X needs to figure out what LSRs would receive the - MPLS echo request for a given FEC Stack that X originates with - TTL=1. - - The set of downstream routers at X may be alternative paths (see - the discussion below on ECMP) or simultaneous paths (e.g., for - MPLS multicast). In the former case, the Multipath sub-field is - used as a hint to the sender as to how it may influence the - choice of these alternatives. The "No of Multipaths" is the - number of IP Address/Next Label fields. 
The Hash Key Type is - taken from the following table: - - Key Type Multipath Information - --- ---------------- --------------------- - 0 no multipath (empty; M = 0) - 1 label labels - 2 IP address IP addresses - 3 label range low/high label pairs - 4 IP address range low/high address pairs - 5 no more labels (empty; M = 0) - 6 All IP addresses (empty; M = 0) - 7 no match (empty; M = 0) - 8 Bit-masked IPv4 IP address prefix and bit mask - address set - 9 Bit-masked label set Label prefix and bit mask - - Type 0 indicates that all packets will be forwarded out this one - interface. - - Types 1, 2, 3, 4, 8 and 9 specify that the supplied Multipath - Information will serve to exercise this path. - - Types 5 and 6 are TBD. - - Type 7 indicates that no matches are possible given the Multipath - Information in the received DS mapping information. - Depth Limit + The Depth Limit is applicable only to a label stack, and is the maximum number of labels considered in the hash; this SHOULD be set to zero if unspecified or unlimited. Multipath Information The multipath information encodes labels or addresses which will exercise this path. The multipath information depends on the hash key type. The contents of the field are shown in the table above. IP addresses are drawn from the range 127/8. Labels are @@ -678,35 +714,89 @@ Hash key 9 allows a denser encoding of Labels. The label prefix is formatted as a base label value with the non-prefix low order bits set to zero. The maximum prefix (including leading zeros due to encoding) length is 27. Following the prefix is a mask of length 2^(32-prefix length) bits. Each bit set to one represents a valid Label. The label is the base label plus the position of the bit in the mask where the bits are numbered left to right beginning with zero. 
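
The hash key 9 encoding described above can be illustrated with a short Python sketch that expands a base label and bit mask into the list of valid labels. The function name is hypothetical, and passing the prefix length as a separate argument is an assumption made for the example; the text shown here does not say how the prefix length is carried on the wire.

```python
from typing import List


def decode_bitmasked_labels(base_label: int, prefix_len: int, mask: bytes) -> List[int]:
    """Expand a bit-masked label set into the labels it represents."""
    if not 0 <= prefix_len <= 27:
        raise ValueError("prefix length must be at most 27")
    mask_bits = 2 ** (32 - prefix_len)
    if len(mask) * 8 != mask_bits:
        raise ValueError("mask must be exactly 2^(32 - prefix length) bits")
    if base_label & (mask_bits - 1):
        raise ValueError("non-prefix low-order bits of the base label must be zero")
    labels = []
    for position in range(mask_bits):
        # Bits are numbered left to right: position 0 is the MSBit of mask[0].
        byte_index, bit_index = divmod(position, 8)
        if mask[byte_index] & (0x80 >> bit_index):
            labels.append(base_label + position)
    return labels


# Prefix length 27 gives a 32-bit mask; bits 0 and 31 are set here.
labels = decode_bitmasked_labels(1024, 27, bytes([0x80, 0x00, 0x00, 0x01]))
```

With a prefix length of 27 the mask is 32 bits long, so the sample mask selects the base label itself and the label 31 above it.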
- If the received DS mapping information is non-null the labels and + If the received multipath information is non-null, the labels and IP addresses MUST be picked from the set provided or the Hash Key - Type MUST be set to 7. + Type MUST be set to 7. If the received multipath information is + null, the receiver simply returns null. For example, suppose LSR X at hop 10 has two downstream LSRs Y and Z for the FEC in question. X could return Hash Key Type 4, with low/high IP addresses of 1.1.1.1->1.1.1.255 for downstream LSR Y and 2.1.1.1->2.1.1.255 for downstream LSR Z. The head end reflects this information to LSR Y. Y, which has three downstream LSRs U, V and W, computes that 1.1.1.1->1.1.1.127 would go to U and 1.1.1.128->1.1.1.255 would go to V. Y would then respond with 3 Downstream Mappings: to U, with Hash Key Type 4 (1.1.1.1->1.1.1.127); to V, with Hash Key Type 4 (1.1.1.128->1.1.1.255); and to W, with Hash Key Type 7. +3.3.1. "Downstream" + + The notion of "downstream router" and "downstream interface" should + be explained. Consider an LSR X. If a packet that was originated + with TTL n>1 arrived with outermost label L at LSR X, X must be able + to compute which LSRs could receive the packet if it was originated + with TTL=n+1, over which interface the request would arrive and what + label stack those LSRs would see. (It is outside the scope of this + document to specify how this computation is done.) The set of these + LSRs/interfaces are the downstream routers/interfaces (and their + corresponding labels) for X with respect to L. Each pair of + downstream router and interface requires a separate Downstream + Mapping to be added to the reply. (Note that there are multiple + Downstream Label fields in each TLV as the incoming label L may be + swapped with a label stack.) + + The case where X is the LSR originating the echo request is a special + case. 
X needs to figure out what LSRs would receive the MPLS echo + request for a given FEC Stack that X originates with TTL=1. + + The set of downstream routers at X may be alternative paths (see the + discussion below on ECMP) or simultaneous paths (e.g., for MPLS + multicast). In the former case, the Multipath sub-field is used as a + hint to the sender as to how it may influence the choice of these + alternatives. The "No of Multipaths" is the number of IP + Address/Next Label fields. The Hash Key Type is taken from the + following table: + + Key Type Multipath Information + --- ---------------- --------------------- + 0 no multipath (empty; M = 0) + 1 label labels + 2 IP address IP addresses + 3 label range low/high label pairs + 4 IP address range low/high address pairs + 5 no more labels (empty; M = 0) + 6 All IP addresses (empty; M = 0) + 7 no match (empty; M = 0) + 8 Bit-masked IPv4 IP address prefix and bit mask + address set + 9 Bit-masked label set Label prefix and bit mask + + Type 0 indicates that all packets will be forwarded out this one + interface. + + Types 1, 2, 3, 4, 8 and 9 specify that the supplied Multipath + Information will serve to exercise this path. + + Types 5 and 6 are TBD. + + Type 7 indicates that no matches are possible given the Multipath + Information in the received DS mapping information. + 3.4. Pad TLV The value part of the Pad TLV contains a variable number (>= 1) of octets. The first octet takes values from the following table; all the other octets (if any) are ignored. The receiver SHOULD verify that the TLV is received in its entirety, but otherwise ignores the contents of this TLV, apart from the first octet. Value Meaning ----- ------- @@ -793,22 +883,22 @@ header is set as follows: the source IP address is a routable address of the sender; the destination IP address is a (randomly chosen) address from 127/8; the IP TTL is set to 1. 
The source UDP port is chosen by the sender; the destination UDP port is set to 3503 (assigned by IANA for MPLS echo requests). The Router Alert option is set in the IP header. If the echo request is labelled, one may (depending on what is being pinged) set the TTL of the innermost label to 1, to prevent the ping request going farther than it should. Examples of this include - pinging a VPN IPv4 or IPv6 prefix, an L2 VPN end point or an L2 - circuit ID. This can also be accomplished by inserting a router + pinging a VPN IPv4 or IPv6 prefix, an L2 VPN end point or a + pseudowire. This can also be accomplished by inserting a router alert label above this label; however, this may lead to the undesired side effect that MPLS echo requests take a different data path than actual data. In "ping" mode (end-to-end connectivity check), the TTL in the outermost label is set to 255. In "traceroute" mode (fault isolation mode), the TTL is set successively to 1, 2, .... The sender chooses a Sender's Handle, and a Sequence Number. When sending subsequent MPLS echo requests, the sender SHOULD increment @@ -848,24 +938,26 @@ echo was received, and the label stack with which it came. X matches up the labels in the received label stack with the FECs contained in the FEC stack. The matching is done beginning at the bottom of both stacks, and working up. For reporting purposes the bottom of stack is considered to be stack-depth of 1. This is to establish an absolute reference for the case where the stack may have more labels than are in the FEC stack. If there are more FECs than labels, the extra FECs are assumed to - correspond to Implicit Null Labels. Thus for the processing below, - there is never the case where there is a FEC with no corresponding - label. Further the label operation associated with an assumed Null - Label is 'pop and continue processing'. + correspond to Implicit Null Labels. 
That is, extra Implicit Null + Labels are added to the top of the received label stack and the stack + depth is set to the depth of the FEC stack. Thus for the processing + below, there is never the case where there is a FEC with no + corresponding label. Further, the label operation associated with an + assumed Null Label is 'pop and continue processing'. Note: in all the error codes listed in this draft a stack-depth of 0 means "no value specified". This allows compatibility with existing implementations which do not use the Return Subcode field. X sets a variable, call it current-stack-depth, to the number of labels in the received label stack. Processing now continues with the following steps: 1. Check if there is a FEC corresponding to the current-stack- @@ -919,21 +1011,21 @@ MPLS echo reply has either a Return Code of 8, or a Return Code of 9 with a Return Subcode of 1 then Downstream mapping TLVs SHOULD be included for each multipath. X uses the procedure in the next subsection to send the echo reply. 4.4. Sending an MPLS Echo Reply An MPLS echo reply is a UDP packet. It MUST ONLY be sent in response to an MPLS echo request. The source IP address is a routable address - of the replier; the source port is the well-known UDP port for MPLS + of the replier; the source port is the well-known UDP port for LSP ping. The destination IP address and UDP port are copied from the source IP address and UDP port of the echo request. The IP TTL is set to 255. If the Reply Mode in the echo request is "Reply via an IPv4 UDP packet with Router Alert", then the IP header MUST contain the Router Alert IP option. If the reply is sent over an LSP, the topmost label MUST in this case be the Router Alert label (1) (see [LABEL-STACK]). The format of the echo reply is the same as the echo request. The Sender's Handle, the Sequence Number and TimeStamp Sent are copied @@ -965,35 +1057,55 @@ otherwise, it checks the Sequence Number to see if it matches. 
Gaps in the Sequence Number MAY be logged and SHOULD be counted. Once an Echo Reply is received for a given Sequence Number (for a given UDP port and Handle), the Sequence Number for subsequent Echo Requests for that UDP port and Handle SHOULD be incremented. If the Echo Reply contains Downstream Mappings, and X wishes to traceroute further, it SHOULD copy the Downstream Mappings into its next Echo Request (with TTL incremented by one). -4.6. Non-compliant Routers +4.6. Issue with VPN IPv4 and IPv6 Prefixes + + Typically, an LSP ping for a VPN IPv4 or IPv6 prefix is sent with a + label stack of depth greater than 1, with the innermost label having + a TTL of 1. This is to terminate the ping at the egress PE, before + it gets sent to the customer device. However, under certain + circumstances, the label stack can shrink to a single label before + the ping hits the egress PE; this will result in the ping terminating + prematurely. One such scenario is a multi-AS Carrier's Carrier VPN. + + To get around this problem, one approach is for the LSR that receives + such a ping to realize that the ping terminated prematurely, and send + back error code 13. In that case, the initiating LSR can retry the + ping after incrementing the TTL on the VPN label. In this fashion, + the ingress LSR will sequentially try TTL values until it finds one + that allows the VPN ping to reach the egress PE. + +4.7. Non-compliant Routers If the egress for the FEC Stack being pinged does not support MPLS ping, then no reply will be sent, resulting in possible "false negatives". If in "traceroute" mode, a transit LSR does not support - MPLS ping, then no reply will be forthcoming from that LSR for some + LSP ping, then no reply will be forthcoming from that LSR for some TTL, say n. The LSR originating the echo request SHOULD try sending the echo request with TTL=n+1, n+2, ..., n+k in the hope that some transit LSR further downstream may support MPLS echo requests and reply. 
In such a case, the echo request for TTL>n MUST NOT have Downstream Mapping TLVs, until a reply is received with a Downstream Mapping. Normative References + [IANA] Narten, T. and H. Alvestrand, "Guidelines for IANA + Considerations", BCP 26, RFC 2434, October 1998. + [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. [LABEL-STACK] Rosen, E., et al, "MPLS Label Stack Encoding", RFC 3032, January 2001. [RSVP] Braden, R. (Editor), et al, "Resource ReSerVation protocol (RSVP) -- Version 1 Functional Specification," RFC 2205, September 1997. @@ -1020,41 +1132,42 @@ tampering with MPLS echo requests and replies. Authentication will help reduce the number of seemingly valid MPLS echo requests, and thus cut down the Denial of Service attacks; beyond that, each LSR must protect itself. Authentication sufficiently addresses spoofing, replay and most tampering attacks; one hopes to use some mechanism devised or suggested by the RPSec WG. It is not clear how to prevent hijacking (non-delivery) of echo requests or replies; however, if these - messages are indeed hijacked, MPLS ping will report that the data + messages are indeed hijacked, LSP ping will report that the data plane isn't working as it should. It doesn't seem vital (at this point) to secure the data carried in MPLS echo requests and replies, although knowledge of the state of the MPLS data plane may be considered confidential by some. 5. IANA Considerations The TCP and UDP port number 3503 has been allocated by IANA for LSP echo requests and replies. The following sections detail the new name spaces to be managed by IANA. For each of these name spaces, the space is divided into assignment ranges; the following terms are used in describing the procedures by which IANA allocates values: "Standards Action" (as defined in [IANA]); "Expert Review" and "Vendor Private Use". 
Values from "Expert Review" ranges MUST be registered with IANA, and MUST be accompanied by an Experimental RFC that describes the format - and procedures for using the code point. + and procedures for using the code point; the actual assignment is + made during the IANA actions for the RFC. Values from "Vendor Private" ranges MUST NOT be registered with IANA; however, the message MUST contain an enterprise code as registered with the IANA SMI Network Management Private Enterprise Codes. For each name space that has a Vendor Private range, it must be specified where exactly the SMI Enterprise Code resides; see below for examples. In this way, several enterprises (vendors) can use the same code point without fear of collision. 5.1. Message Types, Reply Modes, Return Codes @@ -1067,34 +1180,35 @@ Use, and MUST NOT be allocated. If any of these fields fall in the Vendor Private range, a top-level Vendor Enterprise Code TLV MUST be present in the message. 5.2. TLVs It is requested that IANA maintain registries for the Type field of top-level TLVs as well as for sub-TLVs. The valid range for each of these is 0-65535. Assignments in the range 0-32767 are made via - Standards Action; assignments in the range 32768-64511 are made via - Expert Review; values in the range 64512-65535 are for Vendor Private - Use, and MUST NOT be allocated. + Standards Action as defined in [IANA]; assignments in the range + 32768-64511 are made via Expert Review (see below); values in the + range 64512-65535 are for Vendor Private Use, and MUST NOT be + allocated. If a TLV or sub-TLV has a Type that falls in the range for Vendor Private Use, the Length MUST be at least 4, and the first four octets MUST be that vendor's SMI Enterprise Code, in network octet order. The rest of the Value field is private to the vendor.
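The TLV Type ranges and the Vendor Private layout in section 5.2 can be illustrated with a short, non-normative sketch. It assumes the Value field has already been extracted from the TLV as a byte string; the function names tlv_type_policy and split_vendor_value are hypothetical and do not appear in the draft.

```python
import struct


def tlv_type_policy(tlv_type: int) -> str:
    """Return the allocation policy for a TLV or sub-TLV Type (section 5.2)."""
    if not 0 <= tlv_type <= 65535:
        raise ValueError("TLV Type must be in the range 0-65535")
    if tlv_type <= 32767:
        return "Standards Action"
    if tlv_type <= 64511:
        return "Expert Review"
    return "Vendor Private Use"


def split_vendor_value(tlv_type: int, value: bytes):
    """Split a Vendor Private Value into (SMI Enterprise Code, private data).

    Hypothetical helper: 'value' is assumed to be the complete Value field.
    """
    if tlv_type_policy(tlv_type) != "Vendor Private Use":
        raise ValueError("Type is not in the Vendor Private Use range")
    if len(value) < 4:
        raise ValueError("Length MUST be at least 4 for Vendor Private TLVs")
    # The first four octets carry the enterprise code in network octet order.
    (enterprise_code,) = struct.unpack("!I", value[:4])
    return enterprise_code, value[4:]
```

A receiver might call tlv_type_policy first and hand the Value to split_vendor_value only for Types in the Vendor Private range, so that the remaining octets are interpreted only when the enterprise code is one it recognizes.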
Acknowledgments This document is the outcome of many discussions among many people, that include Manoj Leelanivas, Paul Traina, Yakov Rekhter, Der-Hwa - Gan, Brook Bailey, Eric Rosen and Ina Minei. + Gan, Brook Bailey, Eric Rosen, Ina Minei and Shivani Aggarwal. The description of the Multipath Information sub-field of the Downstream Mapping TLV was adapted from text suggested by Curtis Villamizar. Appendix This appendix specifies non-normative aspects of detecting MPLS data plane liveness. @@ -1122,63 +1236,34 @@ 5.2. Downstream Mapping for CR-LDP If a label in a Downstream Mapping was learned via CR-LDP, the Protocol field in the Mapping TLV can use the following entry: Protocol # Signaling Protocol ---------- ------------------ 5 CR-LDP -Authors' Addresses +Authors' Address Kireeti Kompella - Nischal Sheth Juniper Networks 1194 N.Mathilda Ave Sunnyvale, CA 94089 - e-mail: kireeti@juniper.net - e-mail: nsheth@juniper.net - - Ping Pan - Ciena - 10480 Ridgeview Court - Cupertino, CA 95014 - e-mail: ppan@ciena.com - phone: +1 408.366.4700 - - Dave Cooper - Global Crossing - 960 Hamlin Court - Sunnyvale, CA 94089 - email: dcooper@gblx.net - phone: +1 916.415.0437 + Email: kireeti@juniper.net George Swallow - Cisco Systems, Inc. - 250 Apollo Drive - Chelmsford, MA 01824 - e-mail: swallow@cisco.com - phone: +1 978.497.8143 - - Sanjay Wadhwa - Juniper Networks - 10 Technology Park Drive - Westford, MA 01886-3146 - email: swadhwa@unispherenetworks.com - phone: +1 978.589.0697 - Ronald P. 
Bonica - WorldCom - 22001 Loudoun County Pkwy - Ashburn, Virginia, 20147 - email: ronald.p.bonica@wcom.com - phone: +1 703.886.1681 + Cisco Systems + 1414 Massachusetts Ave, + Boxborough, MA 01719 + Phone: +1 978 936 1398 + Email: swallow@cisco.com Intellectual Property Rights Notices The IETF takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on the IETF's procedures with respect to rights in standards-track and @@ -1188,37 +1273,25 @@ obtain a general license or permission for the use of such proprietary rights by implementors or users of this specification can be obtained from the IETF Secretariat. The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights which may cover technology that may be required to practice this standard. Please address the information to the IETF Executive Director. -Full Copyright Statement - - Copyright (C) The Internet Society (2004). All Rights Reserved. +Disclaimer of Validity - This document and translations of it may be copied and furnished to - others, and derivative works that comment on or otherwise explain it - or assist in its implmentation may be prepared, copied, published and - distributed, in whole or in part, without restriction of any kind, - provided that the above copyright notice and this paragraph are - included on all such copies and derivative works. 
However, this - document itself may not be modified in any way, such as by removing - the copyright notice or references to the Internet Society or other - Internet organizations, except as needed for the purpose of - developing Internet standards in which case the procedures for - copyrights defined in the Internet Standards process must be - followed, or as required to translate it into languages other than - English. + This document and the information contained herein are provided on an + "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS + OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET + ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, + INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE + INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED + WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. - The limited permissions granted above are perpetual and will not be - revoked by the Internet Society or its successors or assigns. +Copyright Statement - This document and the information contained herein is provided on an - "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING - TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING - BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION - HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF - MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + Copyright (C) The Internet Society (2004). This document is subject + to the rights, licenses and restrictions contained in BCP 78, and + except as set forth therein, the authors retain all their rights.