draft-ietf-tcpm-early-rexmt-03.txt | draft-ietf-tcpm-early-rexmt-04.txt | |||
---|---|---|---|---|
Internet Engineering Task Force Mark Allman | Internet Engineering Task Force Mark Allman | |||
INTERNET DRAFT ICSI | INTERNET DRAFT ICSI | |||
File: draft-ietf-tcpm-early-rexmt-03.txt Konstantin Avrachenkov | File: draft-ietf-tcpm-early-rexmt-04.txt Konstantin Avrachenkov | |||
Intended Status: Experimental INRIA | Intended Status: Experimental INRIA | |||
Urtzi Ayesta | Urtzi Ayesta | |||
LAAS-CNRS | BCAM-IKERBASQUE and LAAS-CNRS | |||
Josh Blanton | Josh Blanton | |||
Ohio University | Ohio University | |||
Per Hurtig | Per Hurtig | |||
Karlstad University | Karlstad University | |||
November 2009 | January 2010 | |||
Expires: May 2010 | Expires: July 2010 | |||
Early Retransmit for TCP and SCTP | Early Retransmit for TCP and SCTP | |||
Status of this Memo | Status of this Memo | |||
This Internet-Draft is submitted to IETF in full conformance with | This Internet-Draft is submitted to IETF in full conformance with | |||
the provisions of BCP 78 and BCP 79. | the provisions of BCP 78 and BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
skipping to change at page 1, line 39 | skipping to change at page 1, line 39 | |||
months and may be updated, replaced, or obsoleted by other documents | months and may be updated, replaced, or obsoleted by other documents | |||
at any time. It is inappropriate to use Internet-Drafts as | at any time. It is inappropriate to use Internet-Drafts as | |||
reference material or to cite them other than as "work in progress." | reference material or to cite them other than as "work in progress." | |||
The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
http://www.ietf.org/ietf/1id-abstracts.txt. | http://www.ietf.org/ietf/1id-abstracts.txt. | |||
The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
This Internet-Draft will expire on May 18, 2010. | This Internet-Draft will expire on July 27, 2010. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2009 IETF Trust and the persons identified as the | Copyright (c) 2009 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
skipping to change at page 2, line 24 | skipping to change at page 2, line 24 | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
document are to be interpreted as described in RFC 2119 [RFC2119]. | document are to be interpreted as described in RFC 2119 [RFC2119]. | |||
The reader is expected to be familiar with the definitions given in | The reader is expected to be familiar with the definitions given in | |||
[RFC5681]. | [RFC5681]. | |||
1 Introduction | 1 Introduction | |||
Many researchers have studied problems with TCP [RFC793,RFC5681] | Many researchers have studied problems with TCP's loss recovery | |||
when the congestion window is small and have outlined possible | [RFC793,RFC5681] when the congestion window is small and have | |||
mechanisms to mitigate these problems | outlined possible mechanisms to mitigate these problems | |||
[Mor97,BPS+98,Bal98,LK98,RFC3150,AA02]. SCTP's [RFC4960] loss | [Mor97,BPS+98,Bal98,LK98,RFC3150,AA02]. SCTP's [RFC4960] loss | |||
recovery and congestion control mechanisms are based on TCP and | recovery and congestion control mechanisms are based on TCP and | |||
therefore the same problems impact the performance of SCTP | therefore the same problems impact the performance of SCTP | |||
connections. When the transport detects a missing segment, the | connections. When the transport detects a missing segment, the | |||
connection enters a loss recovery phase. There are several variants | connection enters a loss recovery phase. There are several variants | |||
of the loss recovery phase depending on the TCP implementation. TCP | of the loss recovery phase depending on the TCP implementation. TCP | |||
can use slow start based recovery or Fast Recovery [RFC5681], | can use slow start based recovery or Fast Recovery [RFC5681], | |||
NewReno [RFC3782], and loss recovery based on selective | NewReno [RFC3782], and loss recovery based on selective | |||
acknowledgments (SACKs) [RFC2018,FF96,RFC3517]. SCTP's loss | acknowledgments (SACKs) [RFC2018,FF96,RFC3517]. SCTP's loss | |||
recovery is not as varied due to the built-in selective | recovery is not as varied due to the built-in selective | |||
skipping to change at page 4, line 15 | skipping to change at page 4, line 15 | |||
use "Limited Transmit" to include both TCP and SCTP mechanisms for | use "Limited Transmit" to include both TCP and SCTP mechanisms for | |||
sending in response to the first two duplicate ACKs. By sending | sending in response to the first two duplicate ACKs. By sending | |||
these two new segments the sender is attempting to induce additional | these two new segments the sender is attempting to induce additional | |||
duplicate ACKs (if appropriate) so that Fast Retransmit will be | duplicate ACKs (if appropriate) so that Fast Retransmit will be | |||
triggered before the retransmission timeout expires. The | triggered before the retransmission timeout expires. The | |||
sender-side "Early Retransmit" mechanism outlined in this document | sender-side "Early Retransmit" mechanism outlined in this document | |||
covers the case when previously unsent data is not available for | covers the case when previously unsent data is not available for | |||
transmission (case (2) above) or cannot be transmitted due to an | transmission (case (2) above) or cannot be transmitted due to an | |||
advertised window limitation (case (3) above). | advertised window limitation (case (3) above). | |||
| | Note: This document is being published as an experimental RFC as | |||
| | part of the process for the TCPM WG and the IETF to assess whether | |||
| | the proposed change is useful and safe in the heterogeneous | |||
| | environments, including which variants of the mechanism are the most | |||
| | effective. In the future, this specification may be updated and put | |||
| | on the standards track if the safeness and efficacy can be | |||
| | demonstrated. | |||
2 Early Retransmit Algorithm | 2 Early Retransmit Algorithm | |||
The Early Retransmit algorithm calls for lowering the threshold for | The Early Retransmit algorithm calls for lowering the threshold for | |||
triggering Fast Retransmit when the amount of outstanding data is | triggering Fast Retransmit when the amount of outstanding data is | |||
small and when no previously unsent data can be transmitted (such | small and when no previously unsent data can be transmitted (such | |||
that Limited Transmit could be used). Duplicate ACKs are triggered | that Limited Transmit could be used). Duplicate ACKs are triggered | |||
by each arriving out-of-order segment. Therefore, Fast Retransmit | by each arriving out-of-order segment. Therefore, Fast Retransmit | |||
will not be invoked when there are less than four outstanding | will not be invoked when there are less than four outstanding | |||
segments (assuming only one segment loss in the window). However, | segments (assuming only one segment loss in the window). However, | |||
TCP and SCTP are not required to track the number of outstanding | TCP and SCTP are not required to track the number of outstanding | |||
skipping to change at page 5, line 25 | skipping to change at page 5, line 33 | |||
ER_thresh = ceiling (ownd/SMSS) - 1 (1) | ER_thresh = ceiling (ownd/SMSS) - 1 (1) | |||
duplicate ACKs, where ownd is in terms of bytes. We call this | duplicate ACKs, where ownd is in terms of bytes. We call this | |||
reduced ACK threshold enabling "Early Retransmission". | reduced ACK threshold enabling "Early Retransmission". | |||
When conditions (2.a) and (2.b) hold and a TCP connection does | When conditions (2.a) and (2.b) hold and a TCP connection does | |||
support SACK or SCTP is in use, Early Retransmit MUST be used only | support SACK or SCTP is in use, Early Retransmit MUST be used only | |||
when "ownd - SMSS" bytes have been SACKed. | when "ownd - SMSS" bytes have been SACKed. | |||
When conditions (2.a) and (2.b) do not hold, the transport MUST NOT | If either (or both) condition (2.a) or (2.b) does not hold, the | |||
use Early Retransmit, but rather prefer the standard mechanisms, | transport MUST NOT use Early Retransmit, but rather prefer the | |||
including Fast Retransmit and Limited Transmit. | standard mechanisms, including Fast Retransmit and Limited Transmit. | |||
As noted above, the drawback of this byte-based variant is precision | As noted above, the drawback of this byte-based variant is precision | |||
[HB08]. We illustrate this with two examples: | [HB08]. We illustrate this with two examples: | |||
+ Consider a non-SACK TCP sender that uses an SMSS of 1460 bytes | + Consider a non-SACK TCP sender that uses an SMSS of 1460 bytes | |||
and transmits three segments each with 400 bytes of payload. | and transmits three segments each with 400 bytes of payload. | |||
This is a case where Early Retransmit could aid loss recovery if | This is a case where Early Retransmit could aid loss recovery if | |||
one segment is lost. However, in this case ER_thresh will | one segment is lost. However, in this case ER_thresh will | |||
become zero, per equation (1), because the number of outstanding | become zero, per equation (1), because the number of outstanding | |||
bytes is a poor estimate of the number of outstanding segments. | bytes is a poor estimate of the number of outstanding segments. | |||
skipping to change at page 6, line 26 | skipping to change at page 6, line 34 | |||
segments. (We discuss tracking the number of outstanding segments | segments. (We discuss tracking the number of outstanding segments | |||
below.) We call this reduced ACK threshold enabling "Early | below.) We call this reduced ACK threshold enabling "Early | |||
Retransmission". | Retransmission". | |||
When conditions (3.a) and (3.b) hold and a TCP connection does | When conditions (3.a) and (3.b) hold and a TCP connection does | |||
support SACK or SCTP is in use, Early Retransmit MUST be used only | support SACK or SCTP is in use, Early Retransmit MUST be used only | |||
when "oseg - 1" segments have been SACKed. A segment is considered | when "oseg - 1" segments have been SACKed. A segment is considered | |||
to be SACKed when all its data bytes (TCP) or data chunks (SCTP) | to be SACKed when all its data bytes (TCP) or data chunks (SCTP) | |||
have been indicated as arrived by the receiver. | have been indicated as arrived by the receiver. | |||
When conditions (3.a) and (3.b) do not hold, the transport MUST NOT | If either (or both) conditions (3.a) or (3.b) does not hold, the | |||
use Early Retransmit, but rather prefer the standard mechanisms, | transport MUST NOT use Early Retransmit, but rather prefer the | |||
including Fast Retransmit and Limited Transmit. | standard mechanisms, including Fast Retransmit and Limited Transmit. | |||
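The gating rules for the segment-based variant can be sketched as a single predicate. This is a minimal illustration under stated assumptions: the definitions of conditions (3.a) and (3.b) fall outside this diff excerpt, so they are passed in as already-evaluated booleans, and the function and parameter names are hypothetical.

```python
def early_retransmit_permitted(cond_3a: bool, cond_3b: bool, oseg: int,
                               sack_in_use: bool,
                               sacked_segments: int = 0) -> bool:
    """Return True if the reduced threshold may be applied.

    cond_3a, cond_3b: results of conditions (3.a) and (3.b), evaluated
        elsewhere (their definitions are not shown in this excerpt).
    oseg: number of outstanding segments.
    sack_in_use: True for SACK-enabled TCP or for SCTP.
    sacked_segments: segments whose data has been fully SACKed.
    """
    # If either (or both) condition does not hold, fall back to the
    # standard mechanisms (Fast Retransmit and Limited Transmit).
    if not (cond_3a and cond_3b):
        return False
    # With SACK (or SCTP), require "oseg - 1" segments to have been SACKed.
    if sack_in_use:
        return sacked_segments >= oseg - 1
    # Without SACK, the lowered duplicate ACK threshold applies; counting
    # duplicate ACKs against it is outside this sketch.
    return True
```

For a non-SACK connection the predicate only says that the lowered threshold applies; the sender still has to count duplicate ACKs against that threshold before retransmitting.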
This version of Early Retransmit solves the precision issues | This version of Early Retransmit solves the precision issues | |||
discussed in the previous section. As noted previously, the cost is | discussed in the previous section. As noted previously, the cost is | |||
that the implementation will have to track segment boundaries to | that the implementation will have to track segment boundaries to | |||
form an understanding as to how many actual segments have been | form an understanding as to how many actual segments have been | |||
transmitted, but not acknowledged. This can be done by the sender | transmitted, but not acknowledged. This can be done by the sender | |||
tracking the boundaries of the three segments on the right side of | tracking the boundaries of the three segments on the right side of | |||
the current window (which involves tracking four sequence numbers in | the current window (which involves tracking four sequence numbers in | |||
TCP). This could be done by keeping a circular list of the segment | TCP). This could be done by keeping a circular list of the segment | |||
boundaries, for instance. Cumulative ACKs that do not fall within | boundaries, for instance. Cumulative ACKs that do not fall within | |||
End of changes. 8 change blocks. | ||||
14 lines changed or deleted | 22 lines changed or added | |||
This html diff was produced by rfcdiff 1.37c. |