draft-ietf-tcpm-tcp-security-02.txt   draft-ietf-tcpm-tcp-security-03.txt 
TCP Maintenance and Minor F. Gont TCP Maintenance and Minor Extensions F. Gont
Extensions (tcpm) UK CPNI (tcpm) UK CPNI
Internet-Draft January 21, 2011 Internet-Draft March 13, 2012
Intended status: BCP Intended status: Informational
Expires: July 25, 2011 Expires: September 14, 2012
Security Assessment of the Transmission Control Protocol (TCP) Survey of Security Hardening Methods for Transmission Control Protocol
draft-ietf-tcpm-tcp-security-02.txt (TCP) Implementations
draft-ietf-tcpm-tcp-security-03.txt
Abstract Abstract
This document contains a security assessment of the specifications of This document surveys methods to harden Transmission Control Protocol
the Transmission Control Protocol (TCP), and of a number of (TCP) implementations. It provides an overview of known attacks and
mechanisms and policies in use by popular TCP implementations. refers to the corresponding solutions in the TCP standards.
Additionally, it contains best current practices for hardening a TCP
implementation.
Status of this Memo Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on July 25, 2011. This Internet-Draft will expire on September 14, 2012.
Copyright Notice Copyright Notice
Copyright (c) 2011 IETF Trust and the persons identified as the Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1. Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 5 1.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . 5
1.2. Scope of this document . . . . . . . . . . . . . . . . . 6 1.2. Scope of this document . . . . . . . . . . . . . . . . . . 6
1.3. Organization of this document . . . . . . . . . . . . . . 8 1.3. Organization of this document . . . . . . . . . . . . . . 7
2. The Transmission Control Protocol . . . . . . . . . . . . . . 8 2. The Transmission Control Protocol . . . . . . . . . . . . . . 7
3. TCP header fields . . . . . . . . . . . . . . . . . . . . . . 9 3. TCP header fields . . . . . . . . . . . . . . . . . . . . . . 8
3.1. Source Port and Destination Port . . . . . . . . . . . . 10 3.1. Source Port and Destination Port . . . . . . . . . . . . . 8
3.2. Sequence number . . . . . . . . . . . . . . . . . . . . . 12 3.2. Sequence number . . . . . . . . . . . . . . . . . . . . . 9
3.3. Acknowledgement Number . . . . . . . . . . . . . . . . . 14 3.3. Acknowledgement Number . . . . . . . . . . . . . . . . . . 10
3.4. Data Offset . . . . . . . . . . . . . . . . . . . . . . . 15 3.4. Data Offset . . . . . . . . . . . . . . . . . . . . . . . 10
3.5. Control bits . . . . . . . . . . . . . . . . . . . . . . 15 3.5. Control bits . . . . . . . . . . . . . . . . . . . . . . . 10
3.5.1. Reserved (four bits) . . . . . . . . . . . . . . . . 15 3.5.1. Reserved (four bits) . . . . . . . . . . . . . . . . . 10
3.5.2. CWR (Congestion Window Reduced) . . . . . . . . . . . 16 3.5.2. CWR (Congestion Window Reduced) . . . . . . . . . . . 11
3.5.3. ECE (ECN-Echo) . . . . . . . . . . . . . . . . . . . 16 3.5.3. ECE (ECN-Echo) . . . . . . . . . . . . . . . . . . . . 11
3.5.4. URG . . . . . . . . . . . . . . . . . . . . . . . . . 17 3.5.4. URG . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.5.5. ACK . . . . . . . . . . . . . . . . . . . . . . . . . 17 3.5.5. ACK . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.5.6. PSH . . . . . . . . . . . . . . . . . . . . . . . . . 17 3.5.6. PSH . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.5.7. RST . . . . . . . . . . . . . . . . . . . . . . . . . 19 3.5.7. RST . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.5.8. SYN . . . . . . . . . . . . . . . . . . . . . . . . . 19 3.5.8. SYN . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.5.9. FIN . . . . . . . . . . . . . . . . . . . . . . . . . 20 3.5.9. FIN . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.6. Window . . . . . . . . . . . . . . . . . . . . . . . . . 20 3.6. Window . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.7. Checksum . . . . . . . . . . . . . . . . . . . . . . . . 22 3.6.1. Security implications arising from closed windows . . 14
3.8. Urgent pointer . . . . . . . . . . . . . . . . . . . . . 23 3.7. Checksum . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.9. Options . . . . . . . . . . . . . . . . . . . . . . . . . 24 3.8. Urgent pointer . . . . . . . . . . . . . . . . . . . . . . 16
3.10. Padding . . . . . . . . . . . . . . . . . . . . . . . . . 28 3.9. Options . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.11. Data . . . . . . . . . . . . . . . . . . . . . . . . . . 28 3.10. Padding . . . . . . . . . . . . . . . . . . . . . . . . . 19
4. Common TCP Options . . . . . . . . . . . . . . . . . . . . . 29 3.11. Data . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.1. End of Option List (Kind = 0) . . . . . . . . . . . . . . 29 4. Common TCP Options . . . . . . . . . . . . . . . . . . . . . . 19
4.2. No Operation (Kind = 1) . . . . . . . . . . . . . . . . . 29 4.1. End of Option List (Kind = 0) . . . . . . . . . . . . . . 19
4.3. Maximum Segment Size (Kind = 2) . . . . . . . . . . . . . 29 4.2. No Operation (Kind = 1) . . . . . . . . . . . . . . . . . 19
4.4. Selective Acknowledgement Option . . . . . . . . . . . . 32 4.3. Maximum Segment Size (Kind = 2) . . . . . . . . . . . . . 19
4.4.1. SACK-permitted Option (Kind = 4) . . . . . . . . . . 32 4.4. Selective Acknowledgement Option . . . . . . . . . . . . . 20
4.4.2. SACK Option (Kind = 5) . . . . . . . . . . . . . . . 33 4.4.1. SACK-permitted Option (Kind = 4) . . . . . . . . . . . 20
4.5. MD5 Option (Kind=19) . . . . . . . . . . . . . . . . . . 35 4.4.2. SACK Option (Kind = 5) . . . . . . . . . . . . . . . . 20
4.6. Window scale option (Kind = 3) . . . . . . . . . . . . . 36 4.5. MD5 Option (Kind=19) . . . . . . . . . . . . . . . . . . . 21
4.7. Timestamps option (Kind = 8) . . . . . . . . . . . . . . 37 4.6. Window scale option (Kind = 3) . . . . . . . . . . . . . . 21
4.7.1. Generation of timestamps . . . . . . . . . . . . . . 37 4.7. Timestamps option (Kind = 8) . . . . . . . . . . . . . . . 22
4.7.2. Vulnerabilities . . . . . . . . . . . . . . . . . . . 38 4.7.1. Generation of timestamps . . . . . . . . . . . . . . . 22
5. Connection-establishment mechanism . . . . . . . . . . . . . 39 4.7.2. Vulnerabilities . . . . . . . . . . . . . . . . . . . 22
5.1. SYN flood . . . . . . . . . . . . . . . . . . . . . . . . 40 5. Connection-establishment mechanism . . . . . . . . . . . . . . 24
5.2. Connection forgery . . . . . . . . . . . . . . . . . . . 44 5.1. SYN flood . . . . . . . . . . . . . . . . . . . . . . . . 24
5.3. Connection-flooding attack . . . . . . . . . . . . . . . 45 5.2. Connection forgery . . . . . . . . . . . . . . . . . . . . 28
5.3.1. Vulnerability . . . . . . . . . . . . . . . . . . . . 45 5.3. Connection-flooding attack . . . . . . . . . . . . . . . . 29
5.3.2. Countermeasures . . . . . . . . . . . . . . . . . . . 46 5.3.1. Vulnerability . . . . . . . . . . . . . . . . . . . . 29
5.4. Firewall-bypassing techniques . . . . . . . . . . . . . . 48 5.3.2. Countermeasures . . . . . . . . . . . . . . . . . . . 30
6. Connection-termination mechanism . . . . . . . . . . . . . . 49 5.4. Firewall-bypassing techniques . . . . . . . . . . . . . . 32
6.1. FIN-WAIT-2 flooding attack . . . . . . . . . . . . . . . 49
6.1.1. Vulnerability . . . . . . . . . . . . . . . . . . . . 49 6. Connection-termination mechanism . . . . . . . . . . . . . . . 32
6.1.2. Countermeasures . . . . . . . . . . . . . . . . . . . 50 6.1. FIN-WAIT-2 flooding attack . . . . . . . . . . . . . . . . 32
7. Buffer management . . . . . . . . . . . . . . . . . . . . . . 52 6.1.1. Vulnerability . . . . . . . . . . . . . . . . . . . . 32
7.1. TCP retransmission buffer . . . . . . . . . . . . . . . . 52 6.1.2. Countermeasures . . . . . . . . . . . . . . . . . . . 33
7.1.1. Vulnerability . . . . . . . . . . . . . . . . . . . . 52 7. Buffer management . . . . . . . . . . . . . . . . . . . . . . 35
7.1.2. Countermeasures . . . . . . . . . . . . . . . . . . . 53 7.1. TCP retransmission buffer . . . . . . . . . . . . . . . . 36
7.2. TCP segment reassembly buffer . . . . . . . . . . . . . . 56 7.1.1. Vulnerability . . . . . . . . . . . . . . . . . . . . 36
7.3. Automatic buffer tuning mechanisms . . . . . . . . . . . 59 7.1.2. Countermeasures . . . . . . . . . . . . . . . . . . . 37
7.3.1. Automatic send-buffer tuning mechanisms . . . . . . . 59 7.2. TCP segment reassembly buffer . . . . . . . . . . . . . . 40
7.3.2. Automatic receive-buffer tuning mechanism . . . . . . 61 7.3. Automatic buffer tuning mechanisms . . . . . . . . . . . . 42
8. TCP segment reassembly algorithm . . . . . . . . . . . . . . 63 7.3.1. Automatic send-buffer tuning mechanisms . . . . . . . 43
7.3.2. Automatic receive-buffer tuning mechanism . . . . . . 45
8. TCP segment reassembly algorithm . . . . . . . . . . . . . . . 47
8.1. Problems that arise from ambiguity in the reassembly 8.1. Problems that arise from ambiguity in the reassembly
process . . . . . . . . . . . . . . . . . . . . . . . . . 63 process . . . . . . . . . . . . . . . . . . . . . . . . . 47
9. TCP Congestion Control . . . . . . . . . . . . . . . . . . . 64 9. TCP Congestion Control . . . . . . . . . . . . . . . . . . . . 48
9.1. Congestion control with misbehaving receivers . . . . . . 66 9.1. Congestion control with misbehaving receivers . . . . . . 48
9.1.1. ACK division . . . . . . . . . . . . . . . . . . . . 66 9.1.1. ACK division . . . . . . . . . . . . . . . . . . . . . 48
9.1.2. DupACK forgery . . . . . . . . . . . . . . . . . . . 66 9.1.2. DupACK forgery . . . . . . . . . . . . . . . . . . . . 49
9.1.3. Optimistic ACKing . . . . . . . . . . . . . . . . . . 67 9.1.3. Optimistic ACKing . . . . . . . . . . . . . . . . . . 49
9.2. Blind DupACK triggering attacks against TCP . . . . . . . 68 9.2. Blind DupACK triggering attacks against TCP . . . . . . . 50
9.2.1. Blind throughput-reduction attack . . . . . . . . . . 70 9.2.1. Blind throughput-reduction attack . . . . . . . . . . 52
9.2.2. Blind flooding attack . . . . . . . . . . . . . . . . 70 9.2.2. Blind flooding attack . . . . . . . . . . . . . . . . 53
9.2.3. Difficulty in performing the attacks . . . . . . . . 71 9.2.3. Difficulty in performing the attacks . . . . . . . . . 53
9.2.4. Modifications to TCP's loss recovery algorithms . . . 72 9.2.4. Modifications to TCP's loss recovery algorithms . . . 54
9.2.5. Countermeasures . . . . . . . . . . . . . . . . . . . 74 9.2.5. Countermeasures . . . . . . . . . . . . . . . . . . . 55
9.3. TCP Explicit Congestion Notification (ECN) . . . . . . . 79 9.3. TCP Explicit Congestion Notification (ECN) . . . . . . . . 55
9.3.1. Possible attacks by a compromised router . . . . . . 79 10. TCP API . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
9.3.2. Possible attacks by a malicious TCP endpoint . . . . 80 10.1. Passive opens and binding sockets . . . . . . . . . . . . 56
10. TCP API . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 10.2. Active opens and binding sockets . . . . . . . . . . . . . 57
10.1. Passive opens and binding sockets . . . . . . . . . . . . 81 11. Blind in-window attacks . . . . . . . . . . . . . . . . . . . 59
10.2. Active opens and binding sockets . . . . . . . . . . . . 82 11.1. Blind TCP-based connection-reset attacks . . . . . . . . . 59
11. Blind in-window attacks . . . . . . . . . . . . . . . . . . . 84 11.1.1. RST flag . . . . . . . . . . . . . . . . . . . . . . . 60
11.1. Blind TCP-based connection-reset attacks . . . . . . . . 84 11.1.2. SYN flag . . . . . . . . . . . . . . . . . . . . . . . 60
11.1.1. RST flag . . . . . . . . . . . . . . . . . . . . . . 85 11.1.3. Security/Compartment . . . . . . . . . . . . . . . . . 60
11.1.2. SYN flag . . . . . . . . . . . . . . . . . . . . . . 86 11.1.4. Precedence . . . . . . . . . . . . . . . . . . . . . . 61
11.1.3. Security/Compartment . . . . . . . . . . . . . . . . 88 11.1.5. Illegal options . . . . . . . . . . . . . . . . . . . 61
11.1.4. Precedence . . . . . . . . . . . . . . . . . . . . . 89 11.2. Blind data-injection attacks . . . . . . . . . . . . . . . 61
11.1.5. Illegal options . . . . . . . . . . . . . . . . . . . 90 12. Information leaking . . . . . . . . . . . . . . . . . . . . . 62
11.2. Blind data-injection attacks . . . . . . . . . . . . . . 90
12. Information leaking . . . . . . . . . . . . . . . . . . . . . 91
12.1. Remote Operating System detection via TCP/IP stack 12.1. Remote Operating System detection via TCP/IP stack
fingerprinting . . . . . . . . . . . . . . . . . . . . . 91 fingerprinting . . . . . . . . . . . . . . . . . . . . . . 62
12.1.1. FIN probe . . . . . . . . . . . . . . . . . . . . . . 91 12.1.1. FIN probe . . . . . . . . . . . . . . . . . . . . . . 63
12.1.2. Bogus flag test . . . . . . . . . . . . . . . . . . . 92 12.1.2. Bogus flag test . . . . . . . . . . . . . . . . . . . 63
12.1.3. TCP ISN sampling . . . . . . . . . . . . . . . . . . 92 12.1.3. TCP ISN sampling . . . . . . . . . . . . . . . . . . . 63
12.1.4. TCP initial window . . . . . . . . . . . . . . . . . 92 12.1.4. TCP initial window . . . . . . . . . . . . . . . . . . 63
12.1.5. RST sampling . . . . . . . . . . . . . . . . . . . . 93 12.1.5. RST sampling . . . . . . . . . . . . . . . . . . . . . 64
12.1.6. TCP options . . . . . . . . . . . . . . . . . . . . . 94 12.1.6. TCP options . . . . . . . . . . . . . . . . . . . . . 65
12.1.7. Retransmission Timeout (RTO) sampling . . . . . . . . 94 12.1.7. Retransmission Timeout (RTO) sampling . . . . . . . . 65
12.2. System uptime detection . . . . . . . . . . . . . . . . . 94
13. Covert channels . . . . . . . . . . . . . . . . . . . . . . . 95 12.2. System uptime detection . . . . . . . . . . . . . . . . . 66
14. TCP Port scanning . . . . . . . . . . . . . . . . . . . . . . 95 13. Covert channels . . . . . . . . . . . . . . . . . . . . . . . 66
14.1. Traditional connect() scan . . . . . . . . . . . . . . . 96 14. TCP Port scanning . . . . . . . . . . . . . . . . . . . . . . 66
14.2. SYN scan . . . . . . . . . . . . . . . . . . . . . . . . 96 14.1. Traditional connect() scan . . . . . . . . . . . . . . . . 67
14.3. FIN, NULL, and XMAS scans . . . . . . . . . . . . . . . . 96 14.2. SYN scan . . . . . . . . . . . . . . . . . . . . . . . . . 67
14.4. Maimon scan . . . . . . . . . . . . . . . . . . . . . . . 98 14.3. FIN, NULL, and XMAS scans . . . . . . . . . . . . . . . . 68
14.5. Window scan . . . . . . . . . . . . . . . . . . . . . . . 98 14.4. Maimon scan . . . . . . . . . . . . . . . . . . . . . . . 69
14.6. ACK scan . . . . . . . . . . . . . . . . . . . . . . . . 99 14.5. Window scan . . . . . . . . . . . . . . . . . . . . . . . 69
15. Processing of ICMP error messages by TCP . . . . . . . . . . 99 14.6. ACK scan . . . . . . . . . . . . . . . . . . . . . . . . . 70
16. TCP interaction with the Internet Protocol (IP) . . . . . . . 99 15. Processing of ICMP error messages by TCP . . . . . . . . . . . 70
16.1. TCP-based traceroute . . . . . . . . . . . . . . . . . . 99 16. TCP interaction with the Internet Protocol (IP) . . . . . . . 70
16.2. Blind TCP data injection through fragmented IP traffic . 100 16.1. TCP-based traceroute . . . . . . . . . . . . . . . . . . . 71
16.3. Broadcast and multicast IP addresses . . . . . . . . . . 102 16.2. Blind TCP data injection through fragmented IP traffic . . 71
17. Security Considerations . . . . . . . . . . . . . . . . . . . 102 16.3. Broadcast and multicast IP addresses . . . . . . . . . . . 73
18. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 102 17. Security Considerations . . . . . . . . . . . . . . . . . . . 73
19. References . . . . . . . . . . . . . . . . . . . . . . . . . 103 18. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 73
20. References . . . . . . . . . . . . . . . . . . . . . . . . . 113 19. References (to be translated to xml) . . . . . . . . . . . . . 74
20.1. Normative References . . . . . . . . . . . . . . . . . . 113 20. References . . . . . . . . . . . . . . . . . . . . . . . . . . 84
20.2. Informative References . . . . . . . . . . . . . . . . . 113 20.1. Normative References . . . . . . . . . . . . . . . . . . . 84
Appendix A. TODO list . . . . . . . . . . . . . . . . . . . . . 113 20.2. Informative References . . . . . . . . . . . . . . . . . . 84
Appendix A. TODO list . . . . . . . . . . . . . . . . . . . . . . 85
Appendix B. Change log (to be removed by the RFC Editor Appendix B. Change log (to be removed by the RFC Editor
before publication of this document as an RFC) . . . 113 before publication of this document as an RFC) . . . 85
B.1. Changes from draft-ietf-tcpm-tcp-security-01 . . . . . . 113 B.1. Changes from draft-ietf-tcpm-tcp-security-02 . . . . . . . 85
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 114 B.2. Changes from draft-ietf-tcpm-tcp-security-01 . . . . . . . 86
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 86
1. Preface 1. Preface
1.1. Introduction 1.1. Introduction
The TCP/IP protocol suite was conceived in an environment that was The TCP/IP protocol suite was conceived in an environment that was
quite different from the hostile environment they currently operate quite different from the hostile environment they currently operate
in. However, the effectiveness of the protocols led to their early in. However, the effectiveness of the protocols led to their early
adoption in production environments, to the point that, to some adoption in production environments, to the point that, to some
extent, the current world's economy depends on them. extent, the current world's economy depends on them.
skipping to change at page 6, line 11 skipping to change at page 6, line 11
interoperability [Silbersack, 2005]. interoperability [Silbersack, 2005].
Producing a secure TCP/IP implementation nowadays is a very difficult Producing a secure TCP/IP implementation nowadays is a very difficult
task, in part because of the lack of a single document that serves as task, in part because of the lack of a single document that serves as
a security roadmap for the protocols. Implementers are faced with a security roadmap for the protocols. Implementers are faced with
the hard task of identifying relevant documentation and the hard task of identifying relevant documentation and
differentiating between that which provides correct advice, and that differentiating between that which provides correct advice, and that
which provides misleading advice based on inaccurate or wrong which provides misleading advice based on inaccurate or wrong
assumptions. assumptions.
There is a clear need for a companion document to the IETF
specifications that discusses the security aspects and implications
of the protocols, identifies the existing vulnerabilities, discusses
the possible countermeasures, and analyzes their respective
effectiveness.
This document is the result of a security assessment of the IETF This document is the result of a security assessment of the IETF
specifications of the Transmission Control Protocol (TCP), from a specifications of the Transmission Control Protocol (TCP), from a
security point of view. Possible threats are identified and, where security point of view. Possible threats are identified and, where
possible, countermeasures are proposed. Additionally, many possible, countermeasures are described. Additionally, many
implementation flaws that have led to security vulnerabilities have implementation flaws that have led to security vulnerabilities have
been referenced in the hope that future implementations will not been referenced in the hope that future implementations will not
incur the same problems. incur the same problems.
This document does not aim to be the final word on the security This document is based on the "Security Assessment of the
aspects of TCP. On the contrary, it aims to raise awareness about a
number of TCP vulnerabilities that have been faced in the past, those
that are currently being faced, and some of those that we may still
have to deal with in the future.
Feedback from the community is more than encouraged to help this
document be as accurate as possible and to keep it updated as new
vulnerabilities are discovered.
This document is heavily based on the "Security Assessment of the
Transmission Control Protocol (TCP)" released by the UK Centre for Transmission Control Protocol (TCP)" released by the UK Centre for
the Protection of National Infrastructure (CPNI), available at: http: the Protection of National Infrastructure (CPNI), available at: http:
//www.cpni.gov.uk/Products/technicalnotes/ //www.cpni.gov.uk/Products/technicalnotes/
Feb-09-security-assessment-TCP.aspx . Feb-09-security-assessment-TCP.aspx .
1.2. Scope of this document 1.2. Scope of this document
While there are a number of protocols that may affect the way TCP While there are a number of protocols that may affect the way TCP
operates, this document focuses only on the specifications of the operates, this document focuses only on the specifications of the
Transmission Control Protocol (TCP) itself. Transmission Control Protocol (TCP) itself.
The following IETF RFCs were selected for assessment as part of this The mechanisms described in the following documents were selected for
work: assessment as part of this work:
o RFC 793, "Transmission Control Protocol. DARPA Internet Program. o RFC 793, "Transmission Control Protocol. DARPA Internet Program.
Protocol Specification" (91 pages) Protocol Specification" (91 pages)
o RFC 1122, "Requirements for Internet Hosts -- Communication o RFC 1122, "Requirements for Internet Hosts -- Communication
Layers" (116 pages) Layers" (116 pages)
o RFC 1191, "Path MTU Discovery" (19 pages) o RFC 1191, "Path MTU Discovery" (19 pages)
o RFC 1323, "TCP Extensions for High Performance" (37 pages) o RFC 1323, "TCP Extensions for High Performance" (37 pages)
skipping to change at page 8, line 19 skipping to change at page 7, line 46
their security implications, and discusses the possible their security implications, and discusses the possible
countermeasures. The second part contains an analysis of the countermeasures. The second part contains an analysis of the
security implications of the mechanisms and policies implemented by security implications of the mechanisms and policies implemented by
TCP, and of a number of implementation strategies in use by a number TCP, and of a number of implementation strategies in use by a number
of popular TCP implementations. of popular TCP implementations.
2. The Transmission Control Protocol 2. The Transmission Control Protocol
The Transmission Control Protocol (TCP) is a connection-oriented The Transmission Control Protocol (TCP) is a connection-oriented
transport protocol that provides a reliable byte-stream data transfer transport protocol that provides a reliable byte-stream data transfer
service. service. Very few assumptions are made about the reliability of
underlying data transfer services below the TCP layer. Basically,
Very few assumptions are made about the reliability of underlying TCP assumes it can obtain a simple, potentially unreliable datagram
data transfer services below the TCP layer. Basically, TCP assumes service from the lower level protocols.
it can obtain a simple, potentially unreliable datagram service from
the lower level protocols. Figure 1 illustrates where TCP fits in
the DARPA reference model.
+---------------+
| Application |
+---------------+
| TCP |
+---------------+
| IP |
+---------------+
| Network |
+---------------+
Figure 1: TCP in the DARPA reference model
TCP provides facilities in the following areas:
o Basic Data Transfer
o Reliability
o Flow Control
o Multiplexing
o Connections
o Precedence and Security
o Congestion Control
The core TCP specification, RFC 793 [Postel, 1981c], dates back to The core TCP specification, RFC 793 [RFC0793], dates back to 1981 and
1981 and standardizes the basic mechanisms and policies of TCP. RFC standardizes the basic mechanisms and policies of TCP. RFC 1122
1122 [Braden, 1989] provides clarifications and errata for the [RFC1122] provides clarifications and errata for the original
original specification. RFC 2581 [Allman et al, 1999] specifies TCP specification. RFC 5681 [RFC5681] specifies TCP congestion control
congestion control and avoidance mechanisms, not present in the and avoidance mechanisms, not present in the original specification.
original specification. Other documents specify extensions and Other documents specify extensions and improvements for TCP.
improvements for TCP.
The large number of documents that specify extensions, improvements, The large number of documents that specify extensions, improvements,
or modifications to existing TCP mechanisms has led the IETF to or modifications to existing TCP mechanisms has led the IETF to
publish a roadmap for TCP, RFC 4614 [Duke et al, 2006], that publish a roadmap for TCP, RFC 4614 [Duke et al, 2006], that
clarifies the relevance of each of those documents. clarifies the relevance of each of those documents.
3. TCP header fields 3. TCP header fields
RFC 793 [Postel, 1981c] defines the syntax of a TCP segment, along RFC 793 [RFC0793] defines the syntax of a TCP segment, along with the
with the semantics of each of the header fields. Figure 2 semantics of each of the header fields.
illustrates the syntax of a TCP segment.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Port | Destination Port |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Acknowledgment Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data | |C|E|U|A|P|R|S|F| |
| Offset|Resrved|W|C|R|C|S|S|Y|I| Window |
| | |R|E|G|K|H|T|N|N| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Checksum | Urgent Pointer |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Options | Padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| data |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Note that one tick mark represents one bit position
Figure 2: Transmission Control Protocol header format
The minimum TCP header size is 20 bytes, and corresponds to a TCP The minimum TCP header size is 20 bytes, and corresponds to a TCP
segment with no options and no data. However, a TCP module might be segment with no options and no data. However, a TCP module might be
handed an (illegitimate) "TCP segment" of less than 20 bytes. handed an (illegitimate) "TCP segment" of less than 20 bytes.
Therefore, before doing any processing of the TCP header fields, the Therefore, before doing any processing of the TCP header fields, the
following check should be performed by TCP on the segments handed by following check should be performed by TCP on the segments handed by
the internet layer: the internet layer:
Segment.Size >= 20 Segment.Size >= 20
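The minimum-size check above can be sketched as follows. This is an illustrative fragment only; the names TCP_MIN_HEADER_SIZE and has_minimum_tcp_header are hypothetical and do not come from the TCP specifications.

```python
# Hypothetical constant: size in bytes of a TCP header with no options.
TCP_MIN_HEADER_SIZE = 20


def has_minimum_tcp_header(segment: bytes) -> bool:
    """Return True if the segment is long enough to hold a TCP header.

    Corresponds to the check "Segment.Size >= 20". A segment that fails
    this check should not have any of its header fields processed.
    """
    return len(segment) >= TCP_MIN_HEADER_SIZE
```

A receiver would perform this check before reading any header field, so that a truncated segment is never parsed.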
skipping to change at page 10, line 29 skipping to change at page 8, line 44
3.1. Source Port and Destination Port 3.1. Source Port and Destination Port
The Source Port field contains a 16-bit number that identifies the The Source Port field contains a 16-bit number that identifies the
TCP end-point that originated this TCP segment. The TCP Destination TCP end-point that originated this TCP segment. The TCP Destination
Port contains a 16-bit number that identifies the destination TCP Port contains a 16-bit number that identifies the destination TCP
end-point of this segment. In most of the discussion we refer to end-point of this segment. In most of the discussion we refer to
client-side (or "ephemeral") port-numbers and server-side port client-side (or "ephemeral") port-numbers and server-side port
numbers, since that distinction is what usually affects the numbers, since that distinction is what usually affects the
interpretation of a port number. interpretation of a port number.
TCP SHOULD randomize its ephemeral (client-side) ports, to improve Most active attacks against ongoing TCP connections require the
its resistance to off-path attacks. For the purpose of ephemeral attacker to guess or know the four-tuple that identifies the
port selection, the largest possible port range SHOULD be used connection. As a result, randomization of the TCP ephemeral ports
(ideally 1024-65535) [I-D.ietf-tsvwg-port-randomization]. provides a (partial) mitigation against off-path attacks. [RFC6056]
provides guidance in this area.
DISCUSSION:
[I-D.ietf-tsvwg-port-randomization] provides advice on port
randomization.
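As a rough illustration of ephemeral port randomization, the sketch below picks a random starting port in the range 1024-65535 and searches linearly for a free one. It is a simplified example, not a reproduction of any algorithm in the cited documents; the function name select_ephemeral_port and the in_use set (standing in for the ports already bound for the relevant connection identifiers) are assumptions made for this example.

```python
import secrets

# Range suggested in the text above (ideally 1024-65535).
MIN_EPHEMERAL = 1024
MAX_EPHEMERAL = 65535


def select_ephemeral_port(in_use):
    """Pick an unused ephemeral port, starting from a random candidate.

    in_use is a set of port numbers that are not available. Returns a
    port number, or None if every port in the range is taken.
    """
    num_ports = MAX_EPHEMERAL - MIN_EPHEMERAL + 1
    # Random starting point drawn from a cryptographically strong source.
    candidate = MIN_EPHEMERAL + secrets.randbelow(num_ports)
    for _ in range(num_ports):
        if candidate not in in_use:
            return candidate
        # Linear search, wrapping around at the top of the range.
        if candidate == MAX_EPHEMERAL:
            candidate = MIN_EPHEMERAL
        else:
            candidate += 1
    return None
```

Using the widest available range makes the selected port harder for an off-path attacker to guess.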
TCP MUST NOT allocate port number 0, as its use could lead to
interoperability problems. If a segment is received with port 0 as
the Source Port or the Destination Port, a RST segment SHOULD be sent
in response (provided that the incoming segment does not have the
RST flag set).
DISCUSSION:
While port 0 is a legitimate port number, it has a special meaning
in the UNIX Sockets API. For example, when a TCP port number of 0
is passed as an argument to the bind() function, rather than
binding port 0, an ephemeral port is selected for the
corresponding TCP end-point. As a result, the TCP port number 0
is never actually used in TCP segments.
Different implementations have been found to respond differently
to TCP segments that have a port number of 0 as the Source Port
and/or the Destination Port. As a result, TCP segments with a
port number of 0 are usually employed for remote OS detection via
TCP/IP stack fingerprinting [Jones, 2003].
Since in practice TCP port 0 is not used by any legitimate
application and is only used for fingerprinting purposes, a number
of host implementations already reject TCP segments that use 0 as
the Source Port and/or the Destination Port. Also, a number of
firewalls filter (by default) any TCP segments that contain a port
number of zero for the Source Port and/or the Destination Port.
We therefore recommend that TCP implementations respond to
incoming TCP segments that have a Source Port or a Destination
Port of 0 with an RST (provided these incoming segments do not
have the RST bit set).
Responding with an RST segment to incoming segments that have the
RST bit would open the door to RST-war attacks.
TCP MUST be able to gracefully handle the case where the source
end-point (IP Source Address, TCP Source Port) is the same as the
destination end-point (IP Destination Address, TCP Destination Port).
DISCUSSION:
Some systems have been found to be unable to process TCP segments
in which the source endpoint {Source Address, Source Port} is the
same as the destination end-point {Destination Address,
Destination Port}. Such TCP segments have been reported to cause
malfunction of a number of implementations [CERT, 1996], and have
been exploited in the past to perform Denial of Service (DoS)
attacks [Meltman, 1997]. While these packets are very
unlikely to exist in real and legitimate scenarios, TCP should
nevertheless be able to process them without the need of any
"extra" code.
A SYN segment in which the source end-point {Source Address,
Source Port} is the same as the destination end-point {Destination
Address, Destination Port} will result in a "simultaneous open"
scenario, such as the one described on page 32 of RFC 793 [Postel,
1981c]. Therefore, those TCP implementations that correctly
handle simultaneous opens should already be prepared to handle
these unusual TCP segments.
TCP SHOULD NOT allocate port numbers that are in use by a TCP that
is in the LISTEN or CLOSED states for use as ephemeral ports, as this
could allow attackers on the local system to "steal" incoming TCP
connections.
DISCUSSION:
While the only requirement for a selected ephemeral port is that Some implementations have been known to crash when a TCP segment in
the resulting four-tuple (connection-id) is unique (i.e., not which the source end-point (IP Source Address, TCP Source Port) is
currently in use by any other TCP connection), in practice it may the same as the destination end-point (IP Destination Address, TCP
be necessary to not allow the allocation of port numbers that are Destination Port) is received. [draft-gont-tcpm-tcp-mirrored-endpoints-00.txt]
in use by a TCP that is in the LISTEN or CLOSED states for use as describes this issue in detail and provides advice in this area.
ephemeral ports, as this might allow an attacker to "steal"
incoming connections from a local server application. Therefore,
TCP SHOULD NOT allocate port numbers that are in use by a TCP in
the LISTEN or CLOSED states for use as ephemeral ports. Section
10.2 of this document provides a detailed discussion of this
issue.
While some systems restrict use of the port numbers in the range While some systems restrict use of the port numbers in the range
0-1023 to privileged users, applications SHOULD NOT grant any trust 0-1023 to privileged users, applications should not grant any trust
based on the port numbers used for a TCP connection. based on the port numbers used for a TCP connection.
DISCUSSION:
Not all systems require superuser privileges to bind port numbers Not all systems require superuser privileges to bind port numbers
in that range. Besides, with desktop computers such "distinction" in that range. Besides, with desktop computers such "distinction"
has generally become irrelevant. has generally become irrelevant.
Middle-boxes such as packet filters MUST NOT assume that clients use Middle-boxes such as packet filters must not assume that clients use
port numbers from only the Dynamic or Registered port ranges. port numbers from only the Dynamic or Registered port ranges.
DISCUSSION:
It should also be noted that some clients, such as DNS resolvers, It should also be noted that some clients, such as DNS resolvers,
are known to use port numbers from the "Well Known Ports" range. are known to use port numbers from the "Well Known Ports" range.
Therefore, middle-boxes such as packet filters MUST NOT assume Therefore, middle-boxes such as packet filters MUST NOT assume
that clients use port numbers from only the Dynamic or Registered that clients use port numbers from only the Dynamic or Registered
port ranges. port ranges.
3.2. Sequence number 3.2. Sequence number
TCP SHOULD select its Initial Sequence Numbers (ISNs) with the Predictable sequence numbers allow a variety of attacks against TCP,
following expression: such as those described in Section 5.2 and Section 11 of this
document. This vulnerability was first described in [Morris1985],
ISN = M + F(localhost, localport, remotehost, remoteport, secret_key) and its exploitation was widely publicized about 10 years later
[Shimomura1995].
where M is a monotonically increasing counter maintained within TCP,
and F() is a Pseudo-Random Function (PRF). As it is vital that F()
not be computable from the outside, F() could be a PRF of the
connection-id and some secret data. HMAC-SHA-256 would be a good
choice for F().
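The expression above can be sketched as follows. This is only an
illustration under stated assumptions: M is modelled as a counter
that advances once every 4 microseconds, F() is HMAC-SHA-256
truncated to its first 32 bits, and the function names, the encoding
of the connection-id, and the "now_us" parameter are hypothetical
choices that do not come from the text.

```python
import hashlib
import hmac
import time

ISN_MODULUS = 2 ** 32  # sequence numbers are 32-bit values


def timer_component(now_us=None):
    """M: a counter that advances once every 4 microseconds (assumed rate)."""
    if now_us is None:
        now_us = time.monotonic_ns() // 1000
    return (now_us // 4) % ISN_MODULUS


def generate_isn(local_host, local_port, remote_host, remote_port,
                 secret_key, now_us=None):
    """ISN = M + F(localhost, localport, remotehost, remoteport, secret_key)."""
    # Hypothetical encoding of the connection-id; any unambiguous one works.
    conn_id = f"{local_host}|{local_port}|{remote_host}|{remote_port}".encode()
    digest = hmac.new(secret_key, conn_id, hashlib.sha256).digest()
    offset = int.from_bytes(digest[:4], "big")  # truncate F() to 32 bits
    return (timer_component(now_us) + offset) % ISN_MODULUS
```

For a fixed connection-id and secret key, the result advances with M
(modulo 2^32), while the offset contributed by F() differs for each
connection-id and cannot be computed without the secret key.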
DISCUSSION:
The choice of the Initial Sequence Number of a connection is not
arbitrary, but aims to minimize the chances of a stale segment
being accepted by a new incarnation of a previous connection.
RFC 793 [Postel, 1981c] suggests the use of a global 32-bit ISN
generator, whose lower bit is incremented roughly every 4
microseconds.
However, use of such an ISN generator makes it trivial to predict
the ISN that a TCP will use for new connections, thus allowing a
variety of attacks against TCP, such as those described in Section
5.2 and Section 11 of this document. This vulnerability was first
described in [Morris, 1985], and its exploitation was widely
publicized about 10 years later [Shimomura, 1995].
As a matter of fact, protection against old stale segments from a
previous incarnation of the connection comes from allowing the
creation of a new incarnation of a previous connection only after
2*MSL have passed since a segment corresponding to the old
incarnation was last seen. This is accomplished by the TIME-WAIT
state, and TCP's "quiet time" concept. However, as discussed in
Section 3.1 and Section 11.1.2 of this document, the ISN can be
used to perform some heuristics meant to avoid an interoperability
problem that may arise when two systems establish connections at a
high rate. In order for such heuristics to work, the ISNs
generated by a TCP should be monotonically increasing.
The ISN generation scheme recommended in this section was
originally proposed in RFC 1948 [Bellovin, 1996], such that the
chances of an attacker guessing the ISN of a TCP are reduced,
while still producing a monotonically-increasing sequence that
allows implementation of the optimization described in Section 3.1
and Section 11.1.2 of this document.
[CERT, 2001] and [US-CERT, 2001] are advisories about the security In order to mitigate these vulnerabilities, some implementations set
implications of weak ISN generators. [Zalewski, 2001a] and the TCP ISN to the output of a PRNG. However, this has been known to cause
[Zalewski, 2002] contain a detailed analysis of ISN generators, interoperability problems. [RFC6528] provides advice in this area.
and a survey of the algorithms in use by popular TCP
implementations.
Another security consideration that should be made about TCP Another security consideration that should be made about TCP sequence
sequence numbers is that they might allow an attacker to count the numbers is that they might allow an attacker to count the number of
number of systems behind a Network Address Translator (NAT) systems behind a Network Address Translator (NAT) [Srisuresh and
[Srisuresh and Egevang, 2001]. Depending on the ISN generators Egevang, 2001]. Depending on the ISN generators implemented by each
implemented by each of the systems behind the NAT, an attacker of the systems behind the NAT, an attacker might be able to count the
might be able to count the number of systems behind the NAT by number of systems behind the NAT by establishing a number of TCP
establishing a number of TCP connections (using the public address connections (using the public address of the NAT) and identifying
of the NAT) and identifying the number of different sequence the number of different sequence number "spaces". [Gont and
number "spaces". This information leakage could be eliminated by Srisuresh, 2008] provides a detailed discussion of the security
rewriting the contents of all those header fields and options that implications of NATs and of the possible mitigations for this and
make use of sequence numbers (such as the Sequence Number and the other issues.
Acknowledgement Number fields, and the SACK Option) at the NAT.
[Gont and Srisuresh, 2008] provides a detailed discussion of the
security implications of NATs and of the possible mitigations for
this and other issues.
3.3. Acknowledgement Number 3.3. Acknowledgement Number
TCP SHOULD set the Acknowledgement Number to zero when sending a TCP If the ACK bit is on, the Acknowledgement Number contains the value
segment that does not have the ACK bit set (i.e., a SYN segment). of the next sequence number the sender of this segment is expecting
to receive. According to RFC 793, the Acknowledgement Number is
TCP MUST check that, on segments that have the ACK bit set, the considered valid as long as it does not acknowledge the receipt of
Acknowledgment Number satisfies the expression: data that has not yet been sent.
SND.UNA - SND.MAX.WND <= SEG.ACK <= SND.NXT
If a TCP segment does not pass this check, the segment MUST be
dropped, and an ACK segment SHOULD be sent in response.
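A minimal sketch of this check, written with 32-bit modular
arithmetic so that it remains correct across sequence number
wrap-around. The function and parameter names are illustrative, and
the sketch assumes that SND.NXT lies less than 2^32 bytes ahead of
SND.UNA - SND.MAX.WND.

```python
SEQ_MOD = 2 ** 32  # sequence numbers are 32-bit values


def ack_is_acceptable(seg_ack, snd_una, snd_nxt, snd_max_wnd):
    """Return True if SND.UNA - SND.MAX.WND <= SEG.ACK <= SND.NXT (mod 2**32)."""
    lower = (snd_una - snd_max_wnd) % SEQ_MOD
    # Measure SEG.ACK and SND.NXT as distances from the lower bound, so
    # that the comparison still holds when the range wraps past 2**32.
    return (seg_ack - lower) % SEQ_MOD <= (snd_nxt - lower) % SEQ_MOD
```

A segment for which this function returns False would be dropped,
and an ACK segment would be sent in response.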
DISCUSSION:
If the ACK bit is on, the Acknowledgement Number contains the
value of the next sequence number the sender of this segment is
expecting to receive. According to RFC 793, the Acknowledgement
Number is considered valid as long as it does not acknowledge the
receipt of data that has not yet been sent.
However, as a result of recent concerns on forgery attacks against
TCP (see Section 11 of this document), ongoing work at the IETF
[Ramaiah et al, 2008] has proposed to enforce a more strict check
on the Acknowledgement Number of segments that have the ACK bit
set:
SND.UNA - SND.MAX.WND <= SEG.ACK <= SND.NXT However, as a result of recent concerns on forgery attacks against
TCP (see Section 11 of this document), [RFC5961] has proposed to
enforce a more strict check on the Acknowledgement Number of segments
that have the ACK bit set. See for more details.
If the ACK bit is off, the Acknowledgement Number field is not If the ACK bit is off, the Acknowledgement Number field is not valid.
valid. We recommend that TCP implementations set the We recommend that TCP implementations set the Acknowledgement Number to
Acknowledgement Number to zero when sending a TCP segment that zero when sending a TCP segment that does not have the ACK bit set
does not have the ACK bit set (i.e., a SYN segment). Some TCP (i.e., a SYN segment). Some TCP implementations have been known to
implementations have been known to fail to set the Acknowledgement fail to set the Acknowledgement Number to zero, thus leaking
Number to zero, thus leaking information. information.
TCP Acknowledgements are also used to perform heuristics for loss TCP Acknowledgements are also used to perform heuristics for loss
recovery and congestion control. Section 9 of this document recovery and congestion control. Section 9 of this document
describes a number of ways in which these mechanisms can be describes a number of ways in which these mechanisms can be
exploited. exploited.
3.4. Data Offset 3.4. Data Offset
TCP MUST enforce the following checks on the Data Offset field: [draft-gont-tcpm-tcp-sanity-checks-00.txt] specifies a number of
sanity checks that should be performed on the Data Offset field.
Data Offset >= 5
Data Offset * 4 <= TCP segment length
If a TCP segment does not pass these checks, it should be silently
dropped.
The TCP segment length should be obtained from the IP layer, as
TCP does not include a TCP segment length field.
DISCUSSION:
The Data Offset field indicates the length of the TCP header in
32-bit words. As the minimum TCP header size is 20 bytes, the
minimum legal value for this field is 5.
For obvious reasons, the TCP header cannot be larger than the
whole TCP segment it is part of.
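The two checks on the Data Offset field can be sketched as follows.
The sketch assumes that the caller passes exactly the bytes of the
TCP segment as delimited by the IP layer, so that the length of the
buffer is the TCP segment length; the function name is hypothetical.

```python
MIN_DATA_OFFSET = 5  # 5 * 4 = 20 bytes, the minimum TCP header size


def data_offset_is_sane(segment):
    """Return True if the segment passes both Data Offset checks.

    segment holds the raw TCP segment; its length is taken to be the
    TCP segment length reported by the IP layer.
    """
    if len(segment) < 13:
        # Too short to even contain the Data Offset field.
        return False
    data_offset = segment[12] >> 4  # upper four bits of byte 12
    return data_offset >= MIN_DATA_OFFSET and data_offset * 4 <= len(segment)
```

A segment for which this function returns False would be silently
dropped.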
3.5. Control bits 3.5. Control bits
The following subsections provide a discussion of the different The following subsections provide a discussion of the different
control bits in the TCP header. TCP segments with unusual control bits in the TCP header. TCP segments with unusual
combinations of flags set have been known in the past to cause combinations of flags set have been known in the past to cause
malfunction of some implementations, sometimes to the extent of malfunction of some implementations, sometimes to the extent of
causing them to crash [Postel, 1987] [Braden, 1992]. These packets causing them to crash [RFC1025] [RFC1379]. These packets are still
are still usually employed for the purpose of TCP/IP stack usually employed for the purpose of TCP/IP stack fingerprinting.
fingerprinting. Section 12.1 contains a discussion of TCP/IP stack Section 12.1 contains a discussion of TCP/IP stack fingerprinting.
fingerprinting.
3.5.1. Reserved (four bits) 3.5.1. Reserved (four bits)
TCP MUST ignore the Reserved field of incoming TCP segments. These four bits are reserved for future use, and must be zero. As
with virtually every field, the Reserved field could be used as a
DISCUSSION: covert channel. While there exist intermediate devices such as
protocol scrubbers that clear these bits, and firewalls that drop/
These four bits are reserved for future use, and must be zero. As reject segments with any of these bits set, these devices should
with virtually every field, the Reserved field could be used as a consider the impact of these policies on TCP interoperability. For
covert channel. While there exist intermediate devices such as example, as TCP continues to evolve, all or part of the bits in the
protocol scrubbers that clear these bits, and firewalls that drop/ Reserved field could be used to implement some new functionality. If
reject segments with any of these bits set, these devices should some middle-box or end-system implementation were to drop a TCP
consider the impact of these policies on TCP interoperability. segment merely because some of these bits are not set to zero,
For example, as TCP continues to evolve, all or part of the bits interoperability problems would arise.
in the Reserved field could be used to implement some new
functionality. If some middle-box or end-system implementation
were to drop a TCP segment merely because some of these bits are
not set to zero, interoperability problems would arise.
3.5.2. CWR (Congestion Window Reduced) 3.5.2. CWR (Congestion Window Reduced)
DISCUSSION: The CWR flag, defined in RFC 3168 [Ramakrishnan et al, 2001], is used
as part of the Explicit Congestion Notification (ECN) mechanism. For
The CWR flag, defined in RFC 3168 [Ramakrishnan et al, 2001], is connections in any of the synchronized states, this flag indicates,
used as part of the Explicit Congestion Notification (ECN) when set, that the TCP sending this segment has reduced its
mechanism. For connections in any of the synchronized states, congestion window.
this flag indicates, when set, that the TCP sending this segment
has reduced its congestion window.
An analysis of the security implications of ECN can be found in An analysis of the security implications of ECN can be found in
Section 9.3 of this document. Section 9.3 of this document.
3.5.3. ECE (ECN-Echo) 3.5.3. ECE (ECN-Echo)
DISCUSSION: The ECE flag, defined in RFC 3168 [Ramakrishnan et al, 2001], is used
as part of the Explicit Congestion Notification (ECN) mechanism.
The ECE flag, defined in RFC 3168 [Ramakrishnan et al, 2001], is
used as part of the Explicit Congestion Notification (ECN)
mechanism.
Once a TCP connection has been established, an ACK segment with
the ECE bit set indicates that congestion was encountered in the
network on the path from the sender to the receiver. This
indication of congestion should be treated just as a congestion
loss in non-ECN-capable TCP [Ramakrishnan et al, 2001].
Additionally, TCP should not increase the congestion window (cwnd)
in response to such an ACK segment that indicates congestion, and
should also not react to congestion indications more than once
every window of data (or once per round-trip time).
An analysis of the security implications of ECN can be found in An analysis of the security implications of ECN can be found in
Section 9.3 of this document. Section 9.3 of this document.
3.5.4. URG 3.5.4. URG
DISCUSSION: When the URG flag is set, the Urgent Pointer field contains the
current value of the urgent pointer.
When the URG flag is set, the Urgent Pointer field contains the
current value of the urgent pointer.
Receipt of an "urgent" indication generates, in a number of
implementations (such as those in UNIX-like systems), a software
interrupt (signal) that is delivered to the corresponding process.
In UNIX-like systems, receipt of an urgent indication causes a Receipt of an "urgent" indication generates, in a number of
SIGURG signal to be delivered to the corresponding process. implementations (such as those in UNIX-like systems), a software
interrupt (signal) that is delivered to the corresponding process.
In UNIX-like systems, receipt of an urgent indication causes a SIGURG
signal to be delivered to the corresponding process.
A number of applications handle TCP urgent indications by A number of applications handle TCP urgent indications by installing
installing a signal handler for the corresponding signal (e.g., a signal handler for the corresponding signal (e.g., SIGURG). As
SIGURG). As discussed in [Zalewski, 2001b], some signal handlers discussed in [Zalewski, 2001b], some signal handlers can be
can be maliciously exploited by an attacker, for example to gain maliciously exploited by an attacker, for example to gain remote
remote access to a system. While secure programming of signal access to a system. While secure programming of signal handlers is
handlers is out of the scope of this document, we nevertheless out of the scope of this document, we nevertheless raise awareness
raise awareness that TCP urgent indications might be exploited to that TCP urgent indications might be exploited to abuse poorly-
abuse poorly-written signal handlers. written signal handlers.
Section 3.9 discusses the security implications of the TCP urgent Section 3.9 discusses the security implications of the TCP urgent
mechanism. mechanism.
3.5.5. ACK 3.5.5. ACK
DISCUSSION: When the ACK bit is one, the Acknowledgment Number field contains the
next sequence number expected, cumulatively acknowledging the receipt
When the ACK bit is one, the Acknowledgment Number field contains of all data up to the sequence number in the Acknowledgement Number,
the next sequence number expected, cumulatively acknowledging the minus one. Section 3.3 of this document describes sanity checks that
receipt of all data up to the sequence number in the should be performed on the Acknowledgement Number field.
Acknowledgement Number, minus one. Section 3.3 of this document
describes sanity checks that should be performed on the
Acknowledgement Number field.
TCP Acknowledgements are also used to perform heuristics for loss TCP Acknowledgements are also used to perform heuristics for loss
recovery and congestion control. Section 9 of this document recovery and congestion control. Section 9 of this document
describes a number of ways in which these mechanisms can be describes a number of ways in which these mechanisms can be
exploited. exploited.
3.5.6. PSH 3.5.6. PSH
As a result of a SEND call, TCP SHOULD send all queued data (provided [draft-gont-tcpm-tcp-push-semantics-00.txt] describes a number of
that TCP's flow control and congestion control algorithms allow it). security issues that may arise as a result of the PUSH semantics, and
proposes a number of ways to mitigate these issues.
Received data SHOULD be immediately delivered to an application
calling the RECEIVE function, even if the data already available are
less than those requested by the application.
DISCUSSION:
RFC 793 [Postel, 1981c] contains (in pages 54-64) a functional
description of a TCP Application Programming Interface (API). One
of the parameters of the SEND function is the PUSH flag which,
when set, signals the local TCP that it must send all unsent data.
The TCP PSH (PUSH) flag will be set in the last outgoing segment,
to signal the push function to the receiving TCP. Upon receipt of
a segment with the PSH flag set, the receiving user's buffer is
returned to the user, without waiting for additional data to
arrive.
There are two security considerations arising from the PUSH
function. On the sending side, an attacker could cause a large
amount of data to be queued for transmission without setting the
PUSH flag in the SEND call. This would prevent the local TCP from
sending the queued data, causing system memory to be tied to those
data for an unnecessarily long period of time.
An analogous consideration should be made for the receiving TCP.
TCP is allowed to buffer incoming data until the receiving user's
buffer fills or a segment with the PSH bit set is received. If
the receiving TCP implements this policy, an attacker could send a
large amount of data, slightly less than the receiving user's
buffer size, to cause system memory to be tied to these data for
an unnecessarily long period of time. Both of these issues are
discussed in Section 4.2.2.2 of RFC 1122 [Braden, 1989].
In order to mitigate these potential vulnerabilities, we suggest
assuming an implicit "PUSH" in every SEND call. On the sending
side, this means that as a result of a SEND call TCP should try to
send all queued data (provided that TCP's flow control and
congestion control algorithms allow it). On the receiving side,
this means that the received data will be immediately delivered to
an application calling the RECEIVE function, even if the data
already available are less than those requested by the
application.
It is interesting to note that popular TCP APIs (such as
"sockets") do not provide a PUSH flag in any of the interfaces
they define, but rather perform some kind of "heuristics" to set
the PSH bit in outgoing segments. As a result, the value of the
PSH bit in the received TCP segments is usually a policy of the
sending TCP, rather than a policy of the sending application. All
robust applications that make use of those APIs (such as the
sockets API) properly handle the case of a RECEIVE call returning
less data (e.g., zero) than requested, usually by performing
subsequent RECEIVE calls.
Another potential malicious use of the PSH bit would be for an
attacker to send small TCP segments (probably with zero bytes of
data payload) to cause the receiving application to be
unnecessarily woken up (increasing the CPU load), or to cause
malfunction of poorly-written applications that may not handle
well the case of RECEIVE calls returning less data than requested.
3.5.7. RST 3.5.7. RST
TCP MUST process RST segments (i.e., segments with the RST bit set) The RST bit is used to request the abortion (abnormal close) of a TCP
as follows: connection. RFC 793 [RFC0793] suggests that an RST segment should be
considered valid if its Sequence Number is valid (i.e., falls within
o If the Sequence Number of the RST segment is not valid (i.e., the receive window). However, in response to the security concerns
falls outside of the receive window), silently drop the segment. raised by [Watson, 2004] and [NISCC, 2004], [RFC6429] proposed
stricter validity checks. Please see [RFC6429] for additional
o If the Sequence Number of the RST segment matches the next details.
expected sequence number (RCV.NXT), abort the corresponding
connection.
o If the Sequence Number is valid (i.e., falls within the receive
window) but is not exactly RCV.NXT, send an ACK segment (a
"challenge ACK") of the form: <SEQ=SND.NXT><ACK=RCV.NXT><CTL=ACK>.
TCP SHOULD rate-limit these challenge ACK segments.
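The three-way decision for RST segments described above can be
sketched as follows. The function name and the returned labels are
illustrative only, the comparison is done modulo 2^32 to cope with
sequence number wrap-around, and rate-limiting of challenge ACKs is
deliberately left out of the sketch.

```python
SEQ_MOD = 2 ** 32  # sequence numbers are 32-bit values


def classify_rst(seg_seq, rcv_nxt, rcv_wnd):
    """Return 'abort', 'challenge_ack' or 'drop' for an incoming RST segment."""
    if seg_seq == rcv_nxt:
        # Exact match with the next expected sequence number.
        return "abort"
    if (seg_seq - rcv_nxt) % SEQ_MOD < rcv_wnd:
        # Inside the receive window, but not exactly RCV.NXT.
        return "challenge_ack"
    # Outside the receive window.
    return "drop"
```

When the result is "challenge_ack", the ACK segment to be sent has
the form <SEQ=SND.NXT><ACK=RCV.NXT><CTL=ACK>.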
DISCUSSION:
The RST bit is used to request the abortion (abnormal close) of a
TCP connection. RFC 793 [Postel, 1981c] suggests that an RST
segment should be considered valid if its Sequence Number is valid
(i.e., falls within the receive window). However, in response to
the security concerns raised by [Watson, 2004] and [NISCC, 2004],
[Ramaiah et al, 2008] proposed the aforementioned stricter
validity checks.
Section 11.1 of this document describes TCP-based connection-reset Section 11.1 of this document describes TCP-based connection-reset
attacks, along with a number of countermeasures to mitigate their attacks, along with a number of countermeasures to mitigate their
impact. impact.
3.5.8. SYN 3.5.8. SYN
DISCUSSION: The SYN bit is used during the connection-establishment phase, to
request the synchronization of sequence numbers.
The SYN bit is used during the connection-establishment phase, to
request the synchronization of sequence numbers.
There are basically four different vulnerabilities that make use There are basically four different vulnerabilities that make use of
of the SYN bit: SYN-flooding attacks, connection forgery attacks, the SYN bit: SYN-flooding attacks, connection forgery attacks,
connection flooding attacks, and connection-reset attacks. They connection flooding attacks, and connection-reset attacks. They are
are described in Section 5.1, Section 5.2, Section 5.3, and described in Section 5.1, Section 5.2, Section 5.3, and Section
Section 11.1.2, respectively, along with the possible 11.1.2, respectively, along with the possible countermeasures.
countermeasures.
3.5.9. FIN 3.5.9. FIN
DISCUSSION: The FIN flag is used to signal the remote end-point the end of the
data transfer in this direction. Receipt of a valid FIN segment
The FIN flag is used to signal the remote end-point the end of the (i.e., a TCP segment with the FIN flag set) causes the transition in
data transfer in this direction. Receipt of a valid FIN segment the connection state, as part of what is usually referred to as the
(i.e., a TCP segment with the FIN flag set) causes the transition "connection termination phase".
in the connection state, as part of what is usually referred to as
the "connection termination phase".
The connection-termination phase can be exploited to perform a The connection-termination phase can be exploited to perform a number
number of resource-exhaustion attacks. Section 6 of this document of resource-exhaustion attacks. Section 6 of this document describes
describes a number of attacks that exploit the connection- a number of attacks that exploit the connection-termination phase
termination phase along with the possible countermeasures. along with the possible countermeasures.
3.6. Window 3.6. Window
DISCUSSION: The TCP Window field advertises how many bytes of data the remote
peer is allowed to send before a new advertisement is made.
The TCP Window field advertises how many bytes of data the remote Theoretically, the maximum transfer rate that can be achieved by TCP
peer is allowed to send before a new advertisement is made. is limited to:
Theoretically, the maximum transfer rate that can be achieved by
TCP is limited to:
Maximum Transfer Rate = Window / RTT Maximum Transfer Rate = Window / RTT
This means that, under ideal network conditions (e.g., no packet This means that, under ideal network conditions (e.g., no packet
loss), the TCP Window in use should be at least: loss), the TCP Window in use should be at least:
Window = 2 * Bandwidth * Delay Window = 2 * Bandwidth * Delay
Using a larger Window than that resulting from the previous Using a larger Window than that resulting from the previous equation
equation will not provide any improvements in terms of will not provide any improvements in terms of performance.
performance.
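The two expressions above can be turned into a small worked example.
The sketch assumes that "Delay" denotes the one-way delay (so that
2 * Delay equals the RTT), that bandwidth is expressed in bytes per
second, and that times are given in milliseconds; the function names
are hypothetical.

```python
def required_window_bytes(bandwidth_bytes_per_s, one_way_delay_ms):
    """Window = 2 * Bandwidth * Delay, with Delay taken as one-way delay."""
    return 2 * bandwidth_bytes_per_s * one_way_delay_ms / 1000


def max_transfer_rate(window_bytes, rtt_ms):
    """Maximum Transfer Rate = Window / RTT, in bytes per second."""
    return window_bytes * 1000 / rtt_ms
```

For instance, a 10 Mbit/s path (1,250,000 bytes per second) with a
one-way delay of 50 ms calls for a Window of 125,000 bytes, while a
Window of 4096 bytes over an RTT of 100 ms limits the transfer rate
to 40,960 bytes per second.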
In practice, selection of the most convenient Window size may also
depend on a number of other parameters, such as: packet loss rate,
loss recovery mechanisms in use, etc.
Security implications of the maximum TCP window size In practice, selection of the most convenient Window size may also
depend on a number of other parameters, such as: packet loss rate,
loss recovery mechanisms in use, etc.
An aspect of the TCP Window that is usually overlooked is the An aspect of the TCP Window that is usually overlooked is the
security implications of its size. Increasing the TCP window security implications of its size. Increasing the TCP window
increases the sequence number space that will be considered increases the sequence number space that will be considered "valid"
"valid" for incoming segments. Thus, use of unnecessarily large for incoming segments. Thus, use of unnecessarily large TCP Window
TCP Window sizes increases TCP's vulnerability to forgery attacks sizes increases TCP's vulnerability to forgery attacks unnecessarily.
unnecessarily.
In those scenarios in which the network conditions are known In those scenarios in which the network conditions are known and/or
and/or can be easily predicted, it is recommended that the TCP can be easily predicted, it is recommended that the TCP Window is
Window is never set to a value larger than that resulting from the never set to a value larger than that resulting from the equations
equations above. Additionally, the nature of the application above. Additionally, the nature of the application running on top of
running on top of TCP should be considered when tuning the TCP TCP should be considered when tuning the TCP window. As an example,
window. As an example, an H.245 signaling application certainly an H.245 signaling application certainly does not have high
does not have high requirements on throughput, and thus a window requirements on throughput, and thus a window size of around 4 KBytes
size of around 4 KBytes will usually fulfill its needs, while will usually fulfill its needs, while keeping TCP's resistance to
keeping TCP's resistance to off-path forgery attacks at a decent off-path forgery attacks at a decent level. Some rough measurements
level. Some rough measurements seem to indicate that a TCP window seem to indicate that a TCP window of 4 KBytes is common practice for
of 4 KBytes is common practice for TCP connections servicing TCP connections servicing applications such as BGP.
applications such as BGP.
In principle, a possible approach to avoid requiring In principle, a possible approach to avoid requiring administrators
administrators to manually set the TCP window would be to to manually set the TCP window would be to implement an automatic
implement an automatic buffer tuning mechanism, such as that buffer tuning mechanism, such as that described in [Heffner, 2002].
described in [Heffner, 2002]. However, as discussed in Section However, as discussed in Section 7.3.2 of this document, these
7.3.2 of this document, these mechanisms can be exploited to mechanisms can be exploited to perform other types of attacks.
perform other types of attacks.
Security implications arising from closed windows 3.6.1. Security implications arising from closed windows
The TCP window is a flow-control mechanism that prevents a fast When a TCP end-point is not willing to receive any more data (before
data sender application from overwhelming a "slow" receiver. When some of the data that have already been received are consumed), it
a TCP end-point is not willing to receive any more data (before will advertise a TCP window of zero bytes. This will effectively
some of the data that have already been received are consumed), it stop the sender from sending any new data to the TCP receiver.
will advertise a TCP window of zero bytes. This will effectively Transmission of new data will resume when the TCP receiver advertises
stop the sender from sending any new data to the TCP receiver. a nonzero TCP window, usually with a TCP segment that contains no
Transmission of new data will resume when the TCP receiver data ("an ACK").
advertises a nonzero TCP window, usually with a TCP segment that
contains no data ("an ACK").
This segment is usually referred to as a "window update", as the This segment is usually referred to as a "window update", as the
only purpose of this segment is to update the sender regarding the only purpose of this segment is to update the sender regarding the
new window. new window.
To accommodate those scenarios in which the ACK segment that To accommodate those scenarios in which the ACK segment that "opens"
"opens" the window is lost, TCP implements a "persist timer" that the window is lost, TCP implements a "persist timer" that causes the
causes the TCP sender to query the TCP receiver periodically if TCP sender to query the TCP receiver periodically if the last segment
the last segment received advertised a window of zero bytes. This received advertised a window of zero bytes. This probe simply
probe simply consists of sending one byte of new data that will consists of sending one byte of new data that will force the TCP
force the TCP receiver to send an ACK segment back to the TCP receiver to send an ACK segment back to the TCP sender, containing
sender, containing the current TCP window. Similarly to the the current TCP window. Similarly to the retransmission timeout
retransmission timeout timer, an exponential back-off is used when timer, an exponential back-off is used when calculating the
calculating the retransmission timer, so that the spacing between retransmission timer, so that the spacing between probes increases
probes increases exponentially. exponentially.
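As a rough illustration of the exponentially increasing probe spacing described above, the following Python sketch lists the intervals between successive zero-window probes. The initial interval (5 seconds) and the cap (60 seconds) are assumptions chosen for illustration only; they are not values taken from this document.

```python
def persist_probe_intervals(initial_s=5.0, max_s=60.0, probes=6):
    """Return the spacing (in seconds) before each successive zero-window probe.

    initial_s and max_s are hypothetical defaults, not values from the draft.
    """
    intervals = []
    interval = initial_s
    for _ in range(probes):
        intervals.append(interval)
        # Exponential back-off: double the spacing, up to the assumed cap.
        interval = min(interval * 2, max_s)
    return intervals
```

Note that the sketch bounds only the spacing between probes. It places no bound on how long the peer may keep advertising a zero window, which is the property discussed in the following paragraph.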
A fundamental difference between the "persist timer" and the A fundamental difference between the "persist timer" and the
retransmission timer is that there is no limit on the amount of retransmission timer is that there is no limit on the amount of time
time during which a TCP can advertise a zero window. This means during which a TCP can advertise a zero window. This means that a
that a TCP end-point could potentially advertise a zero window TCP end-point could potentially advertise a zero window forever, thus
forever, thus keeping kernel memory at the TCP sender tied to the keeping kernel memory at the TCP sender tied to the TCP
TCP retransmission buffer. This could clearly be exploited as a retransmission buffer. This could clearly be exploited as a vector
vector for performing a Denial of Service (DoS) attack against for performing a Denial of Service (DoS) attack against TCP, such as
TCP, such as that described in Section 7.1 of this document. that described in Section 7.1 of this document.
Section 7.1 of this document describes a Denial of Service attack Section 7.1 of this document describes a Denial of Service attack
that aims at exhausting the kernel memory used for the TCP that aims at exhausting the kernel memory used for the TCP
retransmission buffer, along with possible countermeasures. retransmission buffer, along with possible countermeasures.
3.7. Checksum 3.7. Checksum
Middleboxes that process TCP segments MUST validate the Checksum While in principle there should not be security implications arising
field, and silently discard the TCP segment if such validation fails. from the Checksum field, due to non-RFC-compliant implementations,
the Checksum can be exploited to detect firewalls, evade network
DISCUSSION: intrusion detection systems (NIDS), and/or perform Denial of Service
attacks.
The Checksum field is an error detection mechanism meant for the
contents of the TCP segment and a number of important fields of
the IP header. It is computed over the full TCP segment (header and
data), pre-pended
with a pseudo header that includes the IP Source Address, the IP
Destination Address, the Protocol number, and the TCP segment
length. While in principle there should not be security
implications arising from this field, due to non-RFC-compliant
implementations, the Checksum can be exploited to detect
firewalls, evade network intrusion detection systems (NIDS),
and/or perform Denial of Service attacks.
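A minimal sketch of how a middle-box might validate the Checksum field is shown below. It assumes IPv4 (so the pseudo header carries two 4-byte addresses and Protocol number 6) and that the caller passes the complete TCP segment, header and data. The function names are hypothetical.

```python
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum of data, zero-padded to an even length."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:                      # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def tcp_checksum(src_ip: bytes, dst_ip: bytes, segment: bytes) -> int:
    """Checksum over the IPv4 pseudo header followed by the TCP segment."""
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 6, len(segment))
    return ~ones_complement_sum(pseudo + segment) & 0xFFFF


def checksum_is_valid(src_ip: bytes, dst_ip: bytes, segment: bytes) -> bool:
    """A segment carrying a correct Checksum field sums to 0xFFFF."""
    return tcp_checksum(src_ip, dst_ip, segment) == 0
```

A device that creates connection state would call something like checksum_is_valid() first and silently discard the segment on failure, before any state is created.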
If a stateful firewall does not check the TCP Checksum in the If a stateful firewall does not check the TCP Checksum in the
segments it processes, an attacker can exploit this situation to segments it processes, an attacker can exploit this situation to
perform a variety of attacks. For example, he could send a flood perform a variety of attacks. For example, he could send a flood of
of TCP segments with invalid checksums, which would nevertheless TCP segments with invalid checksums, which would nevertheless create
create state information at the firewall. When each of these state information at the firewall. When each of these segments is
segments is received at its intended destination, the TCP checksum received at its intended destination, the TCP checksum will be found
will be found to be incorrect, and the corresponding segment will be to be incorrect, and the corresponding segment will be silently discarded.
silently discarded. As these segments will not elicit a response As these segments will not elicit a response (e.g., an RST segment)
(e.g., an RST segment) from the intended recipients, the from the intended recipients, the corresponding connection state
corresponding connection state entries at the firewall will not be entries at the firewall will not be removed. Therefore, an attacker
removed. Therefore, an attacker may end up tying all the state may end up tying all the state resources of the firewall to TCP
resources of the firewall to TCP connections that will never connections that will never complete or be terminated, probably
complete or be terminated, probably leading to a Denial of Service leading to a Denial of Service to legitimate users, or forcing the
to legitimate users, or forcing the firewall to randomly drop firewall to randomly drop connection state entries.
connection state entries.
If a NIDS does not check the Checksum of TCP segments, an attacker If a NIDS does not check the Checksum of TCP segments, an attacker
may send TCP segments with an invalid checksum to cause the NIDS may send TCP segments with an invalid checksum to cause the NIDS to
to obtain a TCP data stream different from that obtained by the obtain a TCP data stream different from that obtained by the system
system being monitored. In order to "confuse" the NIDS, the being monitored. In order to "confuse" the NIDS, the attacker would
attacker would send TCP segments with an invalid Checksum and a send TCP segments with an invalid Checksum and a Sequence Number that
Sequence Number that would overlap the sequence number space being would overlap the sequence number space being used for his malicious
used for his malicious activity. FTester [Barisani, 2006] is a activity. FTester [Barisani, 2006] is a tool that can be used to
tool that can be used to assess NIDS on this issue. assess NIDS on this issue.
Finally, an attacker performing port-scanning could potentially Finally, an attacker performing port-scanning could potentially
exploit intermediate systems that do not check the TCP Checksum to exploit intermediate systems that do not check the TCP Checksum to
detect whether a given TCP port is being filtered by an detect whether a given TCP port is being filtered by an intermediate
intermediate firewall, or the port is actually closed by the host firewall, or the port is actually closed by the host being port-
being port-scanned. If a given TCP port appeared to be closed, scanned. If a given TCP port appeared to be closed, the attacker
the attacker would then send a SYN segment with an invalid would then send a SYN segment with an invalid Checksum. If this
Checksum. If this segment elicited a response (either an ICMP segment elicited a response (either an ICMP error message or a TCP
error message or a TCP RST segment) to this packet, then that RST segment) to this packet, then that response should come from a
response should come from a system that does not check the TCP system that does not check the TCP checksum. Since normal host
checksum. Since normal host implementations of the TCP protocol implementations of the TCP protocol do check the TCP checksum, such a
do check the TCP checksum, such a response would most likely come response would most likely come from a firewall or some other middle-
from a firewall or some other middle-box. box.
[Ed3f, 2002] describes the exploitation of the TCP checksum for [Ed3f, 2002] describes the exploitation of the TCP checksum for
performing the above activities. [US-CERT, 2005d] provides an performing the above activities. [US-CERT, 2005d] provides an
example of a TCP implementation that failed to check the TCP example of a TCP implementation that failed to check the TCP
checksum. checksum.
3.8. Urgent pointer 3.8. Urgent pointer
Segment.Size - Data Offset * 4 > 0 Some implementations have been found to be unable to process TCP
urgent indications correctly. [Myst, 1997] originally described how
If a TCP segment with the URG bit set does not pass this check, it TCP urgent indications could be exploited to perform a Denial of
MUST be silently dropped. Service (DoS) attack against some TCP/IP implementations, usually
leading to a system crash.
For TCP segments that have the URG bit set to zero, the sending TCP
SHOULD set the Urgent Pointer to zero.
A receiving TCP MUST ignore the Urgent Pointer field of TCP segments
for which the URG bit is zero.
DISCUSSION:
Section 3.7 of RFC 793 [Postel, 1981c] states (on page 42) that to
send an urgent indication the user must also send at least one
byte of data.
If the URG bit is zero, the Urgent Pointer is not valid, and thus
should not be processed by the receiving TCP. Nevertheless, we
recommend that TCP implementations set the Urgent Pointer to zero
when sending a TCP segment that does not have the URG bit set, and
ignore the Urgent Pointer (as required by RFC 793) when the URG
bit is zero.
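The urgent-indication rules above can be sketched in Python as follows. This is an illustrative sketch only; the function names and the use of None to represent "no valid Urgent Pointer" are assumptions.

```python
def urgent_segment_acceptable(segment_size: int, data_offset: int,
                              urg_flag: bool) -> bool:
    """Receive side: a segment with URG set must carry at least one data byte.

    segment_size is the TCP segment length in bytes; data_offset is the
    Data Offset field, counted in 32-bit words.
    """
    if not urg_flag:
        return True
    return segment_size - data_offset * 4 > 0


def received_urgent_pointer(urg_flag: bool, urgent_pointer: int):
    """Receive side: the Urgent Pointer is ignored when URG is zero."""
    return urgent_pointer if urg_flag else None


def outgoing_urgent_pointer(urg_flag: bool, urgent_pointer: int) -> int:
    """Send side: zero the field when URG is not set, so that no stale
    memory contents leak into the header."""
    return urgent_pointer if urg_flag else 0
```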
Some stacks have been known to fail to set the Urgent Pointer to
zero when the URG bit is zero, thus leaking out the corresponding
system memory contents. [Zalewski, 2008] provides further details
about this issue.
Some implementations have been found to be unable to process TCP [draft-gont-tcpm-tcp-sanity-checks-00.txt] describes a number of
urgent indications correctly. [Myst, 1997] originally described sanity checks to be enforced on TCP segments regarding urgent
how TCP urgent indications could be exploited to perform a Denial indications. [RFC6093] deprecates the use of urgent indications in
of Service (DoS) attack against some TCP/IP implementations, new applications.
usually leading to a system crash.
3.9. Options 3.9. Options
[IANA, 2007] contains the official list of the assigned option [IANA, 2007] contains the official list of the assigned option
numbers. TCP Options have been specified in the past both within the numbers. TCP Options have been specified in the past both within the
IETF and by other groups. [Hnes, 2007] contains an un-official IETF and by other groups. [Hnes, 2007] contains an un-official
updated version of the IANA list of assigned option numbers. The updated version of the IANA list of assigned option numbers. The
following table contains a summary of the assigned TCP option following table contains a summary of the assigned TCP option
numbers, which is based on [Hnes, 2007]. numbers, which is based on [Hnes, 2007].
skipping to change at page 27, line 10 skipping to change at page 19, line 10
o Case 2: An option-kind byte, followed by an option-length byte, o Case 2: An option-kind byte, followed by an option-length byte,
and the actual option-data bytes. and the actual option-data bytes.
In options of the Case 2 above, the option-length byte counts the In options of the Case 2 above, the option-length byte counts the
option-kind byte and the option-length byte, as well as the actual option-kind byte and the option-length byte, as well as the actual
option-data bytes. option-data bytes.
All options except "End of Option List" (Kind = 0) and "No Operation" All options except "End of Option List" (Kind = 0) and "No Operation"
(Kind = 1), are of "Case 2". (Kind = 1), are of "Case 2".
For options that belong to the "Case 2" described above, the [draft-gont-tcpm-tcp-sanity-checks-00.txt] describes a number of
following checks MUST be performed: sanity checks that should be performed on TCP options.
option-length >= 2
option-offset + option-length <= Data Offset * 4
Where option-offset is the offset of the first byte of the option
within the TCP header, with the first byte of the TCP header being
assigned an offset of 0.
If a TCP segment fails to pass any of these checks, it SHOULD be
silently dropped.
TCP MUST ignore unknown TCP options, provided they pass the
validation checks specified above. In the same way, middle-boxes
such as packet filters SHOULD NOT reject TCP segments containing
"unknown" TCP options that pass the validation checks described
earlier in this Section.
DISCUSSION:
The value "2" in the first equation accounts for the option-kind
byte and the option-length byte, and assumes zero bytes of option-
data. This check prevents, among other things, loops in option
processing that may arise from incorrect option lengths.
The second equation takes into account the limit on the legitimate
option length imposed by the syntax of the TCP header, and is
meant to detect forged option-length values that might make an
option overlap with the TCP payload, or even go past the actual
end of the TCP segment carrying the option.
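The two option checks can be sketched as a small option walker. This is a minimal illustration, assuming the caller supplies the raw TCP header bytes; the function name is hypothetical, and raising ValueError stands in for "silently drop the segment".

```python
def parse_tcp_options(header: bytes):
    """Return [(kind, data), ...] or raise ValueError if a check fails."""
    if len(header) < 20:
        raise ValueError("truncated TCP header")
    header_len = (header[12] >> 4) * 4       # Data Offset * 4
    if header_len < 20 or header_len > len(header):
        raise ValueError("invalid Data Offset")
    options = []
    offset = 20                               # options follow the fixed header
    while offset < header_len:
        kind = header[offset]
        if kind == 0:                         # End of Option List
            break
        if kind == 1:                         # No Operation (single byte)
            offset += 1
            continue
        if offset + 1 >= header_len:
            raise ValueError("option-length byte missing")
        length = header[offset + 1]
        if length < 2:                        # would otherwise loop forever
            raise ValueError("option-length < 2")
        if offset + length > header_len:      # would overlap the payload
            raise ValueError("option overruns TCP header")
        options.append((kind, header[offset + 2:offset + length]))
        offset += length
    return options
```

Unknown option kinds are returned like any other option, so the caller can simply skip them, in line with the requirement to ignore unknown options that pass validation.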
Middle-boxes such as packet filters should not reject TCP segments
containing unknown options solely because these options have not been
present in the SYN/SYN-ACK handshake.
DISCUSSION:
There is renewed interest in defining new TCP options for purposes
like improved connection management and maintenance, advanced
congestion control schemes, and security features. The evolution
of the TCP/IP protocol suite would be severely impacted by
obstacles to deploying such new protocol mechanisms.
Middle-boxes such as packet filters SHOULD NOT reject TCP segments
containing unknown options solely because these options have not been
present in the SYN/SYN-ACK handshake.
DISCUSSION:
In the past, TCP enhancements based on TCP options regularly have
specified the exchange of a specific "enabling" option during the
initial SYN/SYN-ACK handshake. Due to the severely limited TCP
option space which has already become a concern, it should be
expected that future specifications might introduce new options
not negotiated or enabled in this way. Therefore, middle-boxes
such as packet filters should not reject TCP segments containing
unknown options solely because these options have not been present
in the SYN/SYN-ACK handshake.
TCP MUST NOT "echo" in any way unknown TCP options received in
inbound TCP segments.
DISCUSSION:
Some TCP implementations have been known to "echo" unknown TCP
options received in incoming segments. Here we stress that TCP
must not "echo" in any way unknown TCP options received in inbound
TCP segments. This is at the foundation for the introduction of
new TCP options, ensuring unambiguous behavior of systems not
supporting a new specification.
Section 4 discusses the security implications of common TCP options. Section 4 discusses the security implications of common TCP options.
3.10. Padding 3.10. Padding
The TCP header padding is used to ensure that the TCP header ends and The TCP header padding is used to ensure that the TCP header ends and
data begins on a 32-bit boundary. The padding is composed of zeros. data begins on a 32-bit boundary. The padding is composed of zeros.
3.11. Data 3.11. Data
The data field contains the upper-layer packet being transmitted by The data field contains the upper-layer packet being transmitted by
means of TCP. This payload is processed by the application process means of TCP. This payload is processed by the application process
making use of the transport services of TCP. Therefore, the security making use of the transport services of TCP. Therefore, the security
implications of this field are out of the scope of this document. implications of this field are out of the scope of this document.
4. Common TCP Options 4. Common TCP Options
4.1. End of Option List (Kind = 0) 4.1. End of Option List (Kind = 0)
TCP implementations MUST be able to gracefully handle those TCP This option indicates the "End of Options". As noted in
segments in which the End of Option List should have been present, [draft-gont-tcpm-tcp-sanity-checks-00.txt], some implementations pad
but is missing. the end of options with "No Operation" options rather than including
an "End of Options List" option.
DISCUSSION:
This option is used to indicate the "end of options" in those
cases in which the end of options would not coincide with the end
of the TCP header.
TCP implementations are required to ignore those options they do
not implement, and to be able to handle options with illegal
lengths. Therefore, TCP implementations should be able to
gracefully handle those TCP segments in which the End of Option
List should have been present, but is missing.
It is interesting to note that some TCP implementations do not use
the "End of Option List" option for indicating the "end of
options", but simply pad the TCP header with several "No
Operation" (Kind = 1) options to meet the header length specified
by the Data Offset header field.
4.2. No Operation (Kind = 1) 4.2. No Operation (Kind = 1)
The no-operation option is basically used to allow the sending system The no-operation option is basically used to allow the sending system
to align subsequent options in, for example, 32-bit boundaries. to align subsequent options in, for example, 32-bit boundaries.
This option does not have any known security implications. This option does not have any known security implications.
4.3. Maximum Segment Size (Kind = 2) 4.3. Maximum Segment Size (Kind = 2)
The Maximum Segment Size (MSS) option is used to indicate to the The Maximum Segment Size (MSS) option is used to indicate to the
remote TCP endpoint the maximum segment size this TCP is willing to remote TCP endpoint the maximum segment size this TCP is willing to
receive. receive.
The following check MUST be performed on a TCP segment that carries a The MSS option has been employed for performing DoS attacks, by
MSS option: advertising very small MSS values thus greatly increasing the packet-
rate used by the victim system.
SYN == 1 [draft-gont-tcpm-tcp-sanity-checks-00.txt] describes this issue, and
proposes sanity checks to mitigate it.
If the segment does not pass this check, it MUST be silently dropped.
DISCUSSION:
As stated in Section 3.1 of RFC 793 [Postel, 1981c], this option
can only be sent in the initial connection request (i.e., in
segments with the SYN control bit set).
TCP MUST check that the option length is 4. If the option does not
pass this check, it MUST be dropped.
The received MSS SHOULD be sanitized as follows:
Sanitized_MSS = max(MSS, 536)
This "sanitized" MSS value SHOULD be used to compute the "effective
send MSS" by the expression included in Section 4.2.2.6 of RFC 1122
[Braden, 1989], as follows:
Eff.snd.MSS = min(Sanitized_MSS+20, MMS_S) - TCPhdrsize - IPoptionsize
where:
Sanitized_MSS:
sanitized MSS value (the value received in the MSS option, with an
enforced minimum value)
MMS_S:
maximum size for a transport-layer message that TCP may send
TCPhdrsize:
size of the TCP header, which typically was 20, but may be larger
if TCP options are to be sent.
IPoptionsize:
size of any IP options that TCP will pass to the IP layer with the
current message.
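The two expressions above can be combined into a short Python sketch. The function and parameter names are hypothetical; the example MMS_S of 1480 bytes assumes a 1500-byte MTU and a 20-byte IPv4 header without options.

```python
MIN_MSS = 536  # enforced lower limit on the received MSS


def sanitize_mss(received_mss: int) -> int:
    """Sanitized_MSS = max(MSS, 536)"""
    return max(received_mss, MIN_MSS)


def effective_send_mss(received_mss: int, mms_s: int,
                       tcp_hdr_size: int = 20,
                       ip_option_size: int = 0) -> int:
    """Eff.snd.MSS = min(Sanitized_MSS+20, MMS_S) - TCPhdrsize - IPoptionsize"""
    return (min(sanitize_mss(received_mss) + 20, mms_s)
            - tcp_hdr_size - ip_option_size)
```

For instance, with an MMS_S of 1480 bytes, a peer advertising an MSS of 0 would still yield an effective send MSS of 536 bytes, while an advertised MSS of 1460 would be used unchanged.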
DISCUSSION:
The advertised maximum segment size may be the result of the
consideration of a number of factors. Firstly, if fragmentation
is employed, the size of the IP reassembly buffer may impose a
limit on the maximum TCP segment size that can be received.
Considering that the minimum IP reassembly buffer size is 576
bytes, if an MSS option is not included in the connection-
establishment phase, an MSS of 536 bytes should be assumed.
Secondly, if Path-MTU Discovery (specified in RFC 1191 [Mogul and
Deering, 1990] and RFC 1981 [McCann et al, 1996]) is expected to
be used for the connection, an artificial maximum segment size may
be enforced by a TCP to prevent the remote peer from sending TCP
segments which would be too large to be transmitted without
fragmentation. Finally, a system connected by a low-speed link
may choose to introduce an artificial maximum segment size to
enforce an upper limit on the network latency that would otherwise
negatively affect its interactive applications [Stevens, 1994].
The TCP specifications do not impose any requirements on the
maximum segment size value that is included in the MSS option.
However, there are a number of values that may cause undesirable
results. Firstly, an MSS of 0 could possibly "freeze" the TCP
connection, as it would not allow data to be included in the
payload of the TCP segments. Secondly, low values other than 0
would degrade the performance of the TCP connection (wasting more
bandwidth in protocol headers than in actual data), and could
potentially exhaust processing cycles at the sending TCP and/or
the receiving TCP by producing an increase in the interrupt rate
caused by the transmitted (or received) packets.
The problems that might arise from low MSS values were first
described by [Reed, 2001]. However, the community did not reach
consensus on how to deal with these issues at that point.
RFC 791 [Postel, 1981a] requires IP implementations to be able to
receive IP datagrams of at least 576 bytes. Assuming an IPv4
header of 20 bytes, and a TCP header of 20 bytes, there should be
room in each IP packet for 536 application data bytes.
There are two cases to analyze when considering the possible
interoperability impact of sanitizing the received MSS value: TCP
connections relying on IP fragmentation and TCP connections
implementing Path-MTU Discovery. In case the corresponding TCP
connection relies on IP fragmentation, given that the minimum
reassembly buffer size is required to be 576 bytes by RFC 791
[Postel, 1981a], the adoption of 536 bytes as a lower limit is
safe.
In case the TCP connection relies on Path-MTU Discovery, imposing
a lower limit on the adopted MSS may ignore the advice of the
remote TCP on the maximum segment size that can possibly be
transmitted without fragmentation. As a result, this could lead
to the first TCP data segment being larger than the Path-MTU.
However, in such a scenario, the TCP segment should elicit an ICMP
Unreachable "fragmentation needed and DF bit set" error message
that would cause the "effective send MSS" (E_MSS) to be decreased
appropriately. Thus, imposing a lower limit on the accepted MSS
will not cause any interoperability problems.
A possible scenario exists in which the proposed enforcement of a
lower limit in the received MSS might lead to an interoperability
problem. If a system was attached to the network by means of a
link with an MTU of less than 576 bytes, and some intermediate
system either silently dropped (i.e., without sending an ICMP
error message) those packets equal to or larger than 576 bytes,
or simply filtered ICMP "fragmentation needed and DF bit set"
error messages, the proposed behavior would lead to an
interoperability problem where communication could have otherwise
succeeded. However, the
interoperability problem would really be introduced by the network
setup (e.g., the middle-box silently dropping packets), rather
than by the mechanism proposed in this section. In any case, TCP
should nevertheless implement a mechanism such as that specified
by RFC 4821 [Mathis and Heffner, 2007] to deal with this type of
"network black-holes".
4.4. Selective Acknowledgement Option 4.4. Selective Acknowledgement Option
The Selective Acknowledgement option provides an extension to allow The Selective Acknowledgement option provides an extension to allow
the acknowledgement of individual segments, to enhance TCP's loss the acknowledgement of individual segments, to enhance TCP's loss
recovery. recovery.
Two options are involved in the SACK mechanism. The "Sack-permitted Two options are involved in the SACK mechanism. The "Sack-permitted
option" is sent during the connections-establishment phase, to option" is sent during the connections-establishment phase, to
advertise that SACK is supported. If both TCP peers agree to use advertise that SACK is supported. If both TCP peers agree to use
selective acknowledgements, the actual selective acknowledgements are selective acknowledgements, the actual selective acknowledgements are
sent, if needed, by means of "SACK options". sent, if needed, by means of "SACK options".
4.4.1. SACK-permitted Option (Kind = 4) 4.4.1. SACK-permitted Option (Kind = 4)
The SACK-permitted option is meant to advertise that the TCP sending [draft-gont-tcpm-tcp-sanity-checks-00.txt] describes sanity checks to
this segment supports Selective Acknowledgements. be performed on this option.
The following check MUST be performed on a TCP segment that carries a
SACK-permitted option:
SYN == 1
If a segment does not pass this check, it MUST be silently dropped.
DISCUSSION:
The SACK-permitted option can be sent only in SYN segments.
TCP MUST check that the option length is 2. If the option does not
pass this check it MUST be silently dropped.
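The SYN-only and fixed-length checks stated for the MSS option (Section 4.3) and the SACK-permitted option can be sketched together. This is an illustration only; the table, function name, and returned action strings are hypothetical.

```python
# Option kind -> required option-length, for options valid only in SYN segments.
SYN_ONLY_OPTIONS = {
    2: 4,  # Maximum Segment Size
    4: 2,  # SACK-permitted
}


def check_syn_only_option(kind: int, length: int, syn_flag: bool) -> str:
    """Return "accept", "drop_option" or "drop_segment" for one option."""
    required = SYN_ONLY_OPTIONS.get(kind)
    if required is None:
        return "accept"          # not covered by these particular checks
    if not syn_flag:
        return "drop_segment"    # the SYN == 1 check failed
    if length != required:
        return "drop_option"     # the fixed-length check failed
    return "accept"
```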
4.4.2. SACK Option (Kind = 5) 4.4.2. SACK Option (Kind = 5)
The SACK option is used to convey extended acknowledgment information The TCP receiving a SACK option is expected to keep track of the
from the receiver to the sender over an established TCP connection. selectively-acknowledged blocks. Even when space in the TCP header
The option consists of an option-kind byte (which must be 5), an is limited (and thus each TCP segment can selectively-acknowledge at
option-length byte, and a variable number of SACK blocks. most four blocks of data), an attacker could try to perform a buffer
overflow or a resource-exhaustion attack by sending a large number of
TCP MUST silently discard those TCP segments carrying a SACK option SACK options.
that does not pass the following check:
option-offset + option-length <= Data Offset * 4
TCP MUST silently discard those TCP segments carrying a SACK option
that does not pass the following check:
option-length >= 10
DISCUSSION:
A SACK Option with zero SACK blocks is nonsensical. The value
"10" accounts for the option-kind byte, the option-length byte, a
4-byte left-edge field, and a 4-byte right-edge field.
TCP MUST silently discard those TCP segments carrying a SACK option
that does not pass the following check:
(option-length - 2) % 8 == 0
DISCUSSION:
As stated in Section 3 of RFC 2018 [Mathis et al, 1996], a SACK
option that specifies n blocks will have a length of 8*n+2.
TCP MUST silently discard those TCP segments carrying a SACK option
that contains a SACK block that does not pass the following check:
Left Edge of Block < Right Edge of Block
As in all the other occurrences in this document, all comparisons
between sequence numbers should be performed using sequence number
arithmetic.
DISCUSSION:
Each block included in a SACK option represents a number of
received data bytes that are contiguous and isolated; that is, the
bytes just below the block, (Left Edge of Block - 1), and just
above the block, (Right Edge of Block), have not yet been
received.
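The SACK option checks listed above can be sketched as follows. The sketch assumes the caller passes the bytes from the option-kind byte up to the end of the TCP header, so that the header-boundary check becomes a comparison against the length of that buffer. The function names are hypothetical, and ValueError stands in for "silently discard the segment".

```python
import struct


def seq_lt(a: int, b: int) -> bool:
    """a < b in 32-bit sequence number arithmetic (handles wrap-around)."""
    return ((a - b) & 0xFFFFFFFF) > 0x7FFFFFFF


def parse_sack_option(option_area: bytes):
    """Return [(left_edge, right_edge), ...] or raise ValueError."""
    if len(option_area) < 2 or option_area[0] != 5:
        raise ValueError("not a SACK option")
    length = option_area[1]
    if length > len(option_area):     # option-offset + option-length check
        raise ValueError("option overruns TCP header")
    if length < 10:                   # at least one SACK block
        raise ValueError("option-length < 10")
    if (length - 2) % 8 != 0:         # n blocks occupy 8*n+2 bytes
        raise ValueError("option-length is not 8*n+2")
    blocks = []
    for pos in range(2, length, 8):
        left, right = struct.unpack_from("!II", option_area, pos)
        if not seq_lt(left, right):
            raise ValueError("Left Edge is not below Right Edge")
        blocks.append((left, right))
    return blocks
```

Because seq_lt() compares modulo 2^32, a block whose Left Edge is just below the wrap-around point and whose Right Edge is just above it is accepted.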
TCP MUST enforce a limit on the number of SACK blocks that a TCP will
store in memory for each connection at any time.
DISCUSSION:
The TCP receiving a SACK option is expected to keep track of the
selectively-acknowledged blocks. Even when space in the TCP
header is limited (and thus each TCP segment can selectively-
acknowledge at most four blocks of data), an attacker could try to
perform a buffer overflow or a resource-exhaustion attack by
sending a large number of SACK options.
For example, an attacker could send a large number of SACK For example, an attacker could send a large number of SACK options,
options, each of them acknowledging one byte of data. each of them acknowledging one byte of data. Additionally, for the
Additionally, for the purpose of wasting resources on the attacked purpose of wasting resources on the attacked system, each of these
system, each of these blocks would be separated from each other by blocks would be separated from each other by one byte, to prevent the
one byte, to prevent the attacked system from coalescing two (or attacked system from coalescing two (or more) contiguous SACK blocks
more) contiguous SACK blocks into a single SACK block. If the into a single SACK block. If the attacked system kept track of each
attacked system kept track of each SACKed block by storing both SACKed block by storing both the Left Edge and the Right Edge of the
the Left Edge and the Right Edge of the block, then for each block, then for each window of data, the attacker could waste up to 4
window of data, the attacker could waste up to 4 * Window bytes of * Window bytes of memory at the attacked TCP.
memory at the attacked TCP.
The value "4 * Window" results from the expression "(Window / 2) * The value "4 * Window" results from the expression "(Window / 2) *
8", in which the value "2" accounts for the 1-byte block 8", in which the value "2" accounts for the 1-byte block
selectively-acknowledged by each SACK block and 1 byte that would selectively-acknowledged by each SACK block and 1 byte that would
be used to separate SACK blocks from each other, and the be used to separate SACK blocks from each other, and the
value "8" accounts for the 8 bytes needed to store the Left Edge value "8" accounts for the 8 bytes needed to store the Left Edge
and the Right Edge of each SACKed block. and the Right Edge of each SACKed block.
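The arithmetic behind the "4 * Window" figure can be reproduced with a few lines of Python. This is only a worked example; the function name is hypothetical, and the optional max_blocks parameter is an assumption used to show the effect of capping the number of stored blocks.

```python
def sack_state_bytes(window: int, max_blocks=None,
                     block_bytes: int = 1, gap_bytes: int = 1,
                     bytes_per_block: int = 8) -> int:
    """Memory needed to store the edges of every SACK block in one window.

    Each 1-byte block is followed by a 1-byte gap, so a window holds
    window / 2 blocks, and each block costs 8 bytes (two 32-bit edges).
    """
    blocks = window // (block_bytes + gap_bytes)
    if max_blocks is not None:
        blocks = min(blocks, max_blocks)
    return blocks * bytes_per_block
```

With a 65536-byte window this gives 262144 bytes (4 * Window) per connection, whereas capping the stored blocks at 16 would bound the state at 128 bytes.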
Therefore, it is clear that a limit should be imposed on the [draft-gont-tcpm-tcp-sanity-checks-00.txt] describes sanity checks to
number of SACK blocks that a TCP will store in memory for each be performed on this option such that this and other possible issues
connection at any time. Measurements in [Dharmapurikar and are mitigated.
Paxson, 2005] indicate that in the vast majority of cases
connections have a single hole in the data stream at any given
time. Thus, a limit of 16 SACK blocks for each connection would
handle even most of the more unusual cases in which there is more
than one simultaneous hole at a time.
4.5. MD5 Option (Kind=19) 4.5. MD5 Option (Kind=19)
The TCP MD5 option provides a mechanism for authenticating TCP The TCP MD5 option provides a mechanism for authenticating TCP
segments with a 16-byte digest produced by the MD5 algorithm. The segments with a 16-byte digest produced by the MD5 algorithm. The
option consists of an option-kind byte (which must be 19), an option- option consists of an option-kind byte (which must be 19), an option-
length byte (which must be 18), and a 16-byte MD5 digest. length byte (which must be 18), and a 16-byte MD5 digest.
TCP MUST silently drop a TCP segment that carries a TCP MD5 option A basic weakness of the TCP MD5 option is that the MD5 algorithm
that does not pass the following checks: itself has been known (for a long time) to be vulnerable to collision
search attacks.
option-offset + option-length <= Data Offset * 4
option-length == 18
DISCUSSION:
The TCP MD5 option is of "Case 2", and has a fixed length.
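The two checks listed above can be expressed as a small predicate. This is only a sketch: the function and parameter names are hypothetical, option_offset is assumed to be the byte offset of the option-kind byte from the start of the TCP header, and data_offset is the Data Offset field in 32-bit words.

```python
TCPOPT_MD5_KIND = 19
TCPOPT_MD5_LEN = 18  # kind byte + length byte + 16-byte digest


def md5_option_is_valid(option_offset, option_length, data_offset):
    """Return True if the MD5 option passes both sanity checks.

    option_offset: byte offset of the option within the TCP header
                   (an assumed convention for this sketch).
    option_length: value of the option-length byte.
    data_offset:   TCP Data Offset field, in 32-bit words.
    """
    fits_in_header = option_offset + option_length <= data_offset * 4
    has_fixed_length = option_length == TCPOPT_MD5_LEN
    return fits_in_header and has_fixed_length
```

A caller would silently drop the segment whenever the predicate returns False.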
DISCUSSION:
A basic weakness of the TCP MD5 option is that the MD5 algorithm
itself has been known (for a long time) to be vulnerable to
collision search attacks.
[Bellovin, 2006] argues that it has two other weaknesses, namely [Bellovin, 2006] argues that it has two other weaknesses, namely that
that it does not provide a key identifier, and that it has no it does not provide a key identifier, and that it has no provision
provision for automated key management. However, it is generally for automated key management. However, it is generally accepted that
accepted that while a Key-ID field can be a good approach for while a Key-ID field can be a good approach for providing smooth key
providing smooth key rollover, it is not actually a requirement. rollover, it is not actually a requirement. For instance, most
For instance, most systems implementing the TCP MD5 option include systems implementing the TCP MD5 option include a "keychain"
a "keychain" mechanism that fully supports smooth key rollover. mechanism that fully supports smooth key rollover. Additionally,
Additionally, with some further work, ISAKMP/IKE could be used to with some further work, ISAKMP/IKE could be used to configure the MD5
configure the MD5 keys. keys.
It is interesting to note that while the TCP MD5 option, as It is interesting to note that while the TCP MD5 option, as specified
specified by RFC 2385 [Heffernan, 1998], addresses the TCP-based by RFC 2385 [Heffernan, 1998], addresses the TCP-based forgery
forgery attacks against TCP discussed in Section 11, it does not attacks against TCP discussed in Section 11, it does not address the
address the ICMP-based connection-reset attacks discussed in ICMP-based connection-reset attacks discussed in Section 15. As a
Section 15. As a result, while a TCP connection may be protected result, while a TCP connection may be protected from TCP-based
from TCP-based forgery attacks by means of the MD5 option, an forgery attacks by means of the MD5 option, an attacker might still
attacker might still be able to successfully perform the ICMP- be able to successfully perform the ICMP-based counterpart.
based counterpart.
The TCP MD5 option has been obsoleted by the TCP-AO. The TCP MD5 option has been obsoleted by the TCP-AO.
4.6. Window scale option (Kind = 3) 4.6. Window scale option (Kind = 3)
The window scale option provides a mechanism to expand the definition The window scale option provides a mechanism to expand the definition
of the TCP window to 32 bits, such that the performance of TCP can be of the TCP window to 32 bits, such that the performance of TCP can be
improved in some network scenarios. The Window scale option consists improved in some network scenarios. The Window scale option consists
of an option-kind byte (which must be 3), followed by an option- of an option-kind byte (which must be 3), followed by an option-
length byte (which must be 3), and a shift count (shift.cnt) byte length byte (which must be 3), and a shift count (shift.cnt) byte
(the actual option-data). (the actual option-data).
The option may be sent only in the initial SYN segment, but may also While there are no known security implications arising from the
be sent in a SYN/ACK segment if the option was received in the window scale mechanism itself, the size of the TCP window has a
initial SYN segment. If the option is received in any other segment, number of security implications. In general, larger window sizes
it MUST be silently dropped. increase the chances of an attacker successfully performing
forgery attacks against TCP, such as those described in Section 11 of
TCP MUST silently discard TCP segments that contain a Window scale this document. Additionally, large windows can exacerbate the impact
option whose option-length is not 3. of resource exhaustion attacks such as those described in Section 7
of this document.
DISCUSSION:
This option has a fixed length.
TCP MUST silently discard TCP segments that contain a Window scale
option that does not pass the following check:
shift.cnt <= 14
DISCUSSION:
As discussed in Section 2.3 of RFC 1323 [Jacobson et al, 1992], in
order to prevent new data from being mistakenly considered as old
and vice versa, the resulting window should be equal to or smaller
than 2^30.
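The length and shift-count checks above can be sketched as follows. The names are hypothetical; the second function only illustrates how the shift count is applied to the 16-bit window field.

```python
TCPOPT_WSCALE_KIND = 3
TCPOPT_WSCALE_LEN = 3
MAX_SHIFT_CNT = 14


def window_scale_is_valid(option_length, shift_cnt):
    """Return True if a Window scale option passes both checks."""
    return option_length == TCPOPT_WSCALE_LEN and 0 <= shift_cnt <= MAX_SHIFT_CNT


def scaled_window(window_field, shift_cnt):
    """Effective window for a 16-bit window field and a valid shift count."""
    return (window_field & 0xFFFF) << shift_cnt
```

With the largest permitted shift count, scaled_window(0xFFFF, 14) is the largest window a peer can advertise.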
DISCUSSION:
[Welzl, 2008] describes major problems with the use of the Window
scale option in the Internet due to faulty equipment.
While there are no known security implications arising from the
window scale mechanism itself, the size of the TCP window has a
number of security implications. In general, larger window sizes
increase the chances of an attacker successfully performing
forgery attacks against TCP, such as those described in Section 11
of this document. Additionally, large windows can exacerbate the
impact of resource exhaustion attacks such as those described in
Section 7 of this document.
Section 3.7 provides a general discussion of the security Section 3.7 provides a general discussion of the security
implications of the TCP window size. Section 7.3.2 discusses the implications of the TCP window size. Section 7.3.2 discusses the
security implications of Automatic receive-buffer tuning security implications of Automatic receive-buffer tuning mechanisms.
mechanisms.
4.7. Timestamps option (Kind = 8) 4.7. Timestamps option (Kind = 8)
The Timestamps option, specified in RFC 1323 [Jacobson et al, 1992], The Timestamps option, specified in RFC 1323 [Jacobson et al, 1992],
is used to perform two functions: Round-Trip Time Measurement (RTTM), is used to perform two functions: Round-Trip Time Measurement (RTTM),
and Protection Against Wrapped Sequence Numbers (PAWS). and Protection Against Wrapped Sequence Numbers (PAWS).
TCP MUST silently discard TCP segments that contain a Timestamps
option that does not pass the following check:
option-length == 10
DISCUSSION:
As specified by RFC 1323, the option-length must be 10.
4.7.1. Generation of timestamps 4.7.1. Generation of timestamps
TCP SHOULD generate timestamps with the following expression: For the purpose of PAWS, the timestamps sent on a connection are
required to be monotonically increasing. While there is no
timestamp = T() + F(localhost, localport, remotehost, remoteport, secret_key) requirement that timestamps are monotonically increasing across TCP
connections, the generation of timestamps such that they are
where the result of T() is a global system clock that complies with monotonically increasing across connections between the same two
the requirements of Section 4.2.2 of RFC 1323 [Jacobson et al, 1992], endpoints allows the use of timestamps for improving the handling of
and F() is a function that should not be computable from the outside. SYN segments that are received while the corresponding four-tuple is
Therefore, we suggest F() to be a cryptographic hash function of the in the TIME-WAIT state. This is discussed in Section 11.1.2 of this
connection-id and some secret data. document.
DISCUSSION:
For the purpose of PAWS, the timestamps sent on a connection are
required to be monotonically increasing. While there is no
requirement that timestamps are monotonically increasing across
TCP connections, the generation of timestamps such that they are
monotonically increasing across connections between the same two
endpoints allows the use of timestamps for improving the handling
of SYN segments that are received while the corresponding four-
tuple is in the TIME-WAIT state. This is discussed in Section
11.1.2 of this document.
F() provides an offset that will be the same for all incarnations
of a connection between the same two endpoints, while T() provides
the monotonically increasing values that are needed for PAWS.
Further discussion about this algorithm is available in
[I-D.gont-timestamps-generation].
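The expression timestamp = T() + F(...) can be sketched as follows. The text only requires F() to be a cryptographic hash of the connection-id and some secret data; the use of HMAC-SHA-256, the 1 ms clock granularity, the per-boot random secret, and all names below are assumptions made for illustration.

```python
import hashlib
import hmac
import os
import time

SECRET_KEY = os.urandom(16)  # hypothetical secret, generated once per boot
TICKS_PER_SECOND = 1000      # assumed 1 ms timestamp clock


def T():
    """Global timestamp clock, truncated to 32 bits."""
    return int(time.monotonic() * TICKS_PER_SECOND) & 0xFFFFFFFF


def F(localhost, localport, remotehost, remoteport, secret_key=SECRET_KEY):
    """Per-four-tuple offset that cannot be computed without the secret."""
    conn_id = f"{localhost}|{localport}|{remotehost}|{remoteport}".encode()
    digest = hmac.new(secret_key, conn_id, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def timestamp(localhost, localport, remotehost, remoteport):
    """timestamp = T() + F(...), modulo 2^32."""
    offset = F(localhost, localport, remotehost, remoteport)
    return (T() + offset) & 0xFFFFFFFF
```

Because F() depends only on the four-tuple and the secret, every incarnation of a connection between the same two endpoints gets the same offset, while T() supplies the increasing component needed for PAWS.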
TCP SHOULD NOT initialize a global timestamp counter to a fixed value
when the system is bootstrapped.
DISCUSSION:
Some implementations are known to initialize their global
timestamp clock to zero when the system is bootstrapped. This is
undesirable, as the timestamp clock would disclose the system
uptime.
TCP SHOULD set the Timestamp Echo Reply (TSecr) field to zero when
sending a TCP segment that does not have the ACK bit set (i.e., a SYN
segment).
DISCUSSION:
Some TCP implementations have been found to fail to set the Some implementations are known to initialize their global timestamp
Timestamp Echo Reply field (TSecr) to zero in TCP segments that do clock to zero when the system is bootstrapped. This is undesirable,
not have the ACK bit set, thus potentially leaking information. as the timestamp clock would disclose the system uptime.
[I-D.gont-timestamps-generation] discusses the generation of TCP
timestamps in detail.
4.7.2. Vulnerabilities 4.7.2. Vulnerabilities
Blind In-Window Attacks Blind In-Window Attacks
Segments that contain a timestamp option smaller than the last Segments that contain a timestamp option smaller than the last
timestamp option recorded by TCP are silently dropped. This allows timestamp option recorded by TCP are silently dropped. This allows
for a subtle attack against TCP that would allow an attacker to cause for a subtle attack against TCP that would allow an attacker to cause
one direction of data transfer of the attacked connection to freeze one direction of data transfer of the attacked connection to freeze
[US-CERT, 2005c]. An attacker could forge a TCP segment that [US-CERT, 2005c]. An attacker could forge a TCP segment that
skipping to change at page 40, line 7 skipping to change at page 24, line 14
proposes mitigations for this and other issues. proposes mitigations for this and other issues.
5. Connection-establishment mechanism 5. Connection-establishment mechanism
The following subsections describe a number of attacks that can be The following subsections describe a number of attacks that can be
performed against TCP by exploiting its connection-establishment performed against TCP by exploiting its connection-establishment
mechanism. mechanism.
5.1. SYN flood 5.1. SYN flood
TCP SHOULD implement (and enable by default) a syn-cache [Lemon, TCP uses a mechanism known as the "three-way handshake" for the
2002]. establishment of a connection between two TCP peers. RFC 793
[RFC0793] states that when a TCP that is in the LISTEN state receives
TCP SHOULD implement syn-cookies, and SHOULD enable them only after a a SYN segment (i.e., a TCP segment with the SYN flag set), it must
specified number of TCBs has been allocated for connections in the transition to the SYN-RECEIVED state, record the control information
SYN-RECEIVED state. (e.g., the ISN) contained in the SYN segment in a Transmission
Control Block (TCB), and respond with a SYN/ACK segment.
DISCUSSION:
TCP uses a mechanism known as the "three-way handshake" for the
establishment of a connection between two TCP peers. RFC 793
[Postel, 1981c] states that when a TCP that is in the LISTEN state
receives a SYN segment (i.e., a TCP segment with the SYN flag
set), it must transition to the SYN-RECEIVED state, record the
control information (e.g., the ISN) contained in the SYN segment
in a Transmission Control Block (TCB), and respond with a SYN/ACK
segment.
A Transmission Control Block is the data structure used to store A Transmission Control Block is the data structure used to store
(usually within the kernel) all the information relevant to a TCP (usually within the kernel) all the information relevant to a TCP
connection. The concept of "TCB" is introduced in the core TCP connection. The concept of "TCB" is introduced in the core TCP
specification RFC 793 [Postel, 1981c]. specification RFC 793 [RFC0793].
In practice, virtually all existing implementations do not modify In practice, virtually all existing implementations do not modify the
the state of the TCP that was in the LISTEN state, but rather state of the TCP that was in the LISTEN state, but rather create a
create a new TCP (i.e., a new "protocol machine"), and perform all new TCP (i.e., a new "protocol machine"), and perform all the state
the state transitions on this newly-created TCP. This allows the transitions on this newly-created TCP. This allows the application
application running on top of TCP to service more than one running on top of TCP to service more than one client at the same
client at the same time. As a result, each connection request time. As a result, each connection request results in the allocation
results in the allocation of system memory to store the TCB of system memory to store the TCB associated with the newly created
associated with the newly created TCP. TCP.
If TCP were implemented strictly as described in RFC 793, the If TCP were implemented strictly as described in RFC 793, the
application running on top of TCP would have to finish servicing application running on top of TCP would have to finish servicing the
the current client before being able to service the next one in current client before being able to service the next one in line, or
line, or would instead need to perform some kind of connection would instead need to perform some kind of connection hand-off.
hand-off.
An attacker could exploit TCP's connection-establishment mechanism An attacker could exploit TCP's connection-establishment mechanism to
to perform a Denial of Service (DoS) attack, by sending a large perform a Denial of Service (DoS) attack, by sending a large number
number of connection requests to the target system, with the of connection requests to the target system, with the intent of
intent of exhausting the system memory destined for storing TCBs exhausting the system memory destined for storing TCBs (or related
(or related kernel data structures), thus preventing the attacked kernel data structures), thus preventing the attacked system from
system from establishing new connections with legitimate users. establishing new connections with legitimate users. This attack is
This attack is widely known as "SYN flood", and received a lot widely known as "SYN flood", and received a lot of attention
of attention during the late 1990s [CERT, 1996]. during the late 1990s [CERT, 1996].
Given that the attacker does not need to complete the three-way Given that the attacker does not need to complete the three-way
handshake for the attacked system to tie system resources to the handshake for the attacked system to tie system resources to the
newly created TCBs, he will typically forge the source IP address newly created TCBs, he will typically forge the source IP address of
of the malicious SYN segments he sends, thus concealing his own IP the malicious SYN segments he sends, thus concealing his own IP
address. address.
If the forged IP addresses corresponded to some reachable system, If the forged IP addresses corresponded to some reachable system, the
the impersonated system would receive the SYN/ACK segment sent by impersonated system would receive the SYN/ACK segment sent by the
the attacked host (in response to the forged SYN segment), which attacked host (in response to the forged SYN segment), which would
would elicit an RST segment. This RST segment would be delivered elicit an RST segment. This RST segment would be delivered to the
to the attacked system, causing the corresponding connection to be attacked system, causing the corresponding connection to be aborted,
aborted, and the corresponding TCB to be removed. and the corresponding TCB to be removed.
As the impersonated host would not have any state information for As the impersonated host would not have any state information for the
the TCP connection being referred to by the SYN/ACK segment, it TCP connection being referred to by the SYN/ACK segment, it would
would respond with a RST segment, as specified by the TCP segment respond with a RST segment, as specified by the TCP segment
processing rules of RFC 793 [Postel, 1981c]. processing rules of RFC 793 [RFC0793].
However, if the forged IP source addresses were unreachable, the However, if the forged IP source addresses were unreachable, the
attacked TCP would continue retransmitting the SYN/ACK segment attacked TCP would continue retransmitting the SYN/ACK segment
corresponding to each connection request, until timing out and corresponding to each connection request, until timing out and
aborting the connection. For this reason, a number of widely aborting the connection. For this reason, a number of widely
available attack tools first check whether each of the (forged) IP available attack tools first check whether each of the (forged) IP
addresses is reachable by sending an ICMP echo request to it. addresses is reachable by sending an ICMP echo request to it. The
The receipt of an ICMP echo response is considered an indication receipt of an ICMP echo response is considered an indication of the
of the IP address being reachable (and thus results in the IP address being reachable (and thus results in the corresponding IP
corresponding IP address not being used for performing the address not being used for performing the attack), while the receipt
attack), while the receipt of an ICMP unreachable error message is of an ICMP unreachable error message is considered an indication of
considered an indication of the IP address being unreachable (and the IP address being unreachable (and thus results in the
thus results in the corresponding IP address being used for corresponding IP address being used for performing the attack).
performing the attack).
[Gont, 2008b] describes how the so-called ICMP soft errors could [Gont, 2008b] describes how the so-called ICMP soft errors could be
be used by TCP to abort connections in any of the non-synchronized used by TCP to abort connections in any of the non-synchronized
states. While implementation of the mechanism described in that states. While implementation of the mechanism described in that
document would certainly not eliminate the vulnerability of TCP to document would certainly not eliminate the vulnerability of TCP to
SYN flood attacks (as the attacker could use addresses that are SYN flood attacks (as the attacker could use addresses that are
simply "black-holed"), it provides an example of how signaling simply "black-holed"), it provides an example of how signaling
information such as that provided by means of ICMP error messages information such as that provided by means of ICMP error messages can
can provide valuable information that a transport protocol could provide valuable information that a transport protocol could use to
use to perform heuristics. perform heuristics.
In order to mitigate the impact of this attack, the amount of In order to mitigate the impact of this attack, the amount of
information stored for non-established connections should be information stored for non-established connections should be reduced
reduced (ideally, non-synchronized connections should not require (ideally, non-synchronized connections should not require any state
any state information to be maintained at the TCP performing the information to be maintained at the TCP performing the passive OPEN).
passive OPEN). There are basically two mitigation techniques for There are basically two mitigation techniques for this vulnerability:
this vulnerability: a syn-cache and syn-cookies. a syn-cache and syn-cookies.
[Borman, 1997] and RFC 4987 [Eddy, 2007] contain a general [Borman, 1997] and RFC 4987 [Eddy, 2007] contain a general discussion
discussion of SYN-flooding attacks and common mitigation of SYN-flooding attacks and common mitigation approaches.
approaches.
The syn-cache [Lemon, 2002] approach aims at reducing the amount The syn-cache [Lemon, 2002] approach aims at reducing the amount of
of state information that is maintained for connections in the state information that is maintained for connections in the SYN-
SYN-RECEIVED state, and allocates a full TCB only after the RECEIVED state, and allocates a full TCB only after the connection
connection has transitioned to the ESTABLISHED state. has transitioned to the ESTABLISHED state.
The syn-cookie [Bernstein, 1996] approach aims at completely The syn-cookie [Bernstein, 1996] approach aims at completely
eliminating the need to maintain state information at the TCP eliminating the need to maintain state information at the TCP
performing the passive OPEN, by encoding the most elementary performing the passive OPEN, by encoding the most elementary
information required to complete the three-way handshake in the information required to complete the three-way handshake in the
Sequence Number of the SYN/ACK segment that is sent in response to Sequence Number of the SYN/ACK segment that is sent in response to
the received SYN segment. Thus, TCP is relieved from keeping the received SYN segment. Thus, TCP is relieved from keeping state
state for connections in the SYN-RECEIVED state. for connections in the SYN-RECEIVED state.
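The idea of encoding handshake state in the Sequence Number of the SYN/ACK can be sketched as follows. This is a simplified illustration, not the exact encoding of [Bernstein, 1996] or of any implementation: the 5/3/24-bit layout, the MSS table, the HMAC-SHA-256 construction, the slowly incrementing counter supplied by the caller, and all names are assumptions.

```python
import hashlib
import hmac
import os

SECRET = os.urandom(16)              # hypothetical per-boot secret
MSS_TABLE = (536, 1024, 1440, 1460)  # hypothetical MSS values, ascending


def _mac24(four_tuple, client_isn, counter, mss_idx, secret):
    """24-bit keyed hash binding the cookie to this connection request."""
    msg = repr((four_tuple, client_isn, counter, mss_idx)).encode()
    digest = hmac.new(secret, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:3], "big")


def make_cookie(four_tuple, client_isn, mss, counter, secret=SECRET):
    """Build the 32-bit value to use as the ISN of the SYN/ACK."""
    # Largest table entry not exceeding the client's MSS (index 0 as floor).
    mss_idx = max(i for i, m in enumerate(MSS_TABLE) if i == 0 or m <= mss)
    mac = _mac24(four_tuple, client_isn, counter, mss_idx, secret)
    return ((counter & 0x1F) << 27) | (mss_idx << 24) | mac


def check_cookie(cookie, four_tuple, client_isn, counter_now,
                 secret=SECRET, max_age=2):
    """Return the encoded MSS if the cookie is valid and recent, else None.

    The caller derives `cookie` from the ACK field minus one and
    `client_isn` from the Sequence Number minus one.
    """
    t5, mss_idx, mac = cookie >> 27, (cookie >> 24) & 0x7, cookie & 0xFFFFFF
    if mss_idx >= len(MSS_TABLE):
        return None
    for age in range(max_age + 1):
        counter = counter_now - age
        if counter >= 0 and (counter & 0x1F) == t5:
            if mac == _mac24(four_tuple, client_isn, counter, mss_idx, secret):
                return MSS_TABLE[mss_idx]
    return None
```

No per-connection state is kept between make_cookie() and check_cookie(); only the secret and the current counter value are needed to validate the final ACK.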
The syn-cookie approach has a number of drawbacks: The syn-cookie approach has a number of drawbacks:
* Firstly, given the limited space in the Sequence Number field, o Firstly, given the limited space in the Sequence Number field, it
it is not possible to encode all the information included in is not possible to encode all the information included in the
the initial segment, such as, for example, support of Selective initial segment, such as, for example, support of Selective
Acknowledgements (SACK). Acknowledgements (SACK).
* Secondly, in the event that the Acknowledgement segment sent in o Secondly, in the event that the Acknowledgement segment sent in
response to the SYN/ACK sent by the TCP that performed the response to the SYN/ACK sent by the TCP that performed the passive
passive OPEN (i.e., the TCP server) were lost, the connection OPEN (i.e., the TCP server) were lost, the connection would end up
would end up in the ESTABLISHED state on the client-side, but in the ESTABLISHED state on the client-side, but in the CLOSED
in the CLOSED state on the server side. This scenario is state on the server side. This scenario is normally handled in
normally handled in TCP by having the TCP server retransmit its TCP by having the TCP server retransmit its SYN/ACK. However, if
SYN/ACK. However, if syn-cookies are enabled, there would be syn-cookies are enabled, there would be no connection state
no connection state information on the server side, and thus information on the server side, and thus the SYN/ACK would never
the SYN/ACK would never be retransmitted. This could lead to a be retransmitted. This could lead to a scenario in which the
scenario in which the connection could remain in the connection could remain in the ESTABLISHED state on the client
ESTABLISHED state on the client side, but in the CLOSED state side, but in the CLOSED state at the server side, indefinitely.
at the server side, indefinitely. If the application protocol If the application protocol was such that it required the client
was such that it required the client to wait for some data from to wait for some data from the server (e.g., a greeting message)
the server (e.g., a greeting message) before sending any data before sending any data to the server, a deadlock would take
to the server, a deadlock would take place, with the client place, with the client application waiting for such server data,
application waiting for such server data, and the server and the server waiting for the TCP three-way handshake to
waiting for the TCP three-way handshake to complete. complete.
* Thirdly, unless the function used to encode information in the o Thirdly, unless the function used to encode information in the
SYN/ACK packet is cryptographically strong, an attacker could SYN/ACK packet is cryptographically strong, an attacker could
forge TCP connections in the ESTABLISHED state by forging ACK forge TCP connections in the ESTABLISHED state by forging ACK
segments that would be considered as "legitimate" by the segments that would be considered as "legitimate" by the receiving
receiving TCP. TCP.
* Fourthly, in those scenarios in which establishment of new o Fourthly, in those scenarios in which establishment of new
connections is blocked by simply dropping segments with the SYN connections is blocked by simply dropping segments with the SYN
bit set, use of SYN cookies could allow an attacker to bypass bit set, use of SYN cookies could allow an attacker to bypass the
the firewall rules, as a connection could be established by firewall rules, as a connection could be established by forging an
forging an ACK segment with the correct values, without the ACK segment with the correct values, without the need to set
need to set the SYN bit. the SYN bit.
As a result, syn-cookies are usually not employed as a first line As a result, syn-cookies are usually not employed as a first line of
of defense against SYN-flood attacks, but are used only as the last defense against SYN-flood attacks, but are used only as the last resort to
resort to cope with them. For example, some TCP implementations cope with them. For example, some TCP implementations enable syn-
enable syn-cookies only after a certain number of TCBs has been cookies only after a certain number of TCBs has been allocated for
allocated for connections in the SYN-RECEIVED state. We recommend connections in the SYN-RECEIVED state. We recommend this
this implementation technique, with a syn-cache enabled by implementation technique, with a syn-cache enabled by default, and
default, and use of syn-cookies triggered, for example, when the use of syn-cookies triggered, for example, when the limit of TCBs for
limit of TCBs for non-synchronized connections with a given port non-synchronized connections with a given port number has been
number has been reached. reached.
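The recommended combination (syn-cache by default, syn-cookies once a per-port limit is reached) can be sketched as a small policy object. The class, method names, default limit, and the string return values are hypothetical; a real implementation would also expire stale entries.

```python
from collections import defaultdict


class SynPolicy:
    """Decide, per incoming SYN, between syn-cache and syn-cookie handling."""

    def __init__(self, per_port_limit=128):
        self.per_port_limit = per_port_limit  # assumed default
        # local port -> set of four-tuples currently in SYN-RECEIVED
        self.half_open = defaultdict(set)

    def on_syn(self, four_tuple):
        """four_tuple = (local host, local port, remote host, remote port)."""
        pending = self.half_open[four_tuple[1]]
        if four_tuple in pending:
            return "syn-cache"  # retransmitted SYN, state already held
        if len(pending) < self.per_port_limit:
            pending.add(four_tuple)
            return "syn-cache"
        # Limit reached for this port: fall back to stateless handling.
        return "syn-cookie"

    def on_handshake_done(self, four_tuple):
        """Release the entry when the connection is established or aborted."""
        self.half_open[four_tuple[1]].discard(four_tuple)
```

Keeping the counter per local port means that a flood aimed at one service does not push other listening ports into syn-cookie mode.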
It is interesting to note that a SYN-flood attack should only It is interesting to note that a SYN-flood attack should only affect
affect the establishment of new connections. A number of books the establishment of new connections. A number of books and online
and online documents seem to assume that TCP will not be able to documents seem to assume that TCP will not be able to respond to any
respond to any TCP segment that is meant for a TCP port that is TCP segment that is meant for a TCP port that is being SYN-flooded
being SYN-flooded (e.g., respond with an RST segment upon receipt (e.g., respond with an RST segment upon receipt of a TCP segment that
of a TCP segment that refers to a non-existent TCP connection). refers to a non-existent TCP connection). While SYN-flooding attacks
While SYN-flooding attacks have been successfully exploited in the have been successfully exploited in the past for achieving such a
past for achieving such a goal [Shimomura, 1995], as clarified by goal [Shimomura, 1995], as clarified by RFC 1948 [Bellovin, 1996] the
RFC 1948 [Bellovin, 1996] the effectiveness of SYN flood attacks effectiveness of SYN flood attacks to silence a TCP implementation
to silence a TCP implementation arose as a result of a bug in the arose as a result of a bug in the 4.4BSD TCP implementation [Wright
4.4BSD TCP implementation [Wright and Stevens, 1994], rather than and Stevens, 1994], rather than from a theoretical property of SYN-
from a theoretical property of SYN-flood attacks themselves. flood attacks themselves. Therefore, those TCP implementations that
Therefore, those TCP implementations that do not suffer from such do not suffer from such a bug should not be silenced as a result of a
a bug should not be silenced as a result of a SYN-flood attack. SYN-flood attack.
[Zquete, 2002] describes a mechanism that could theoretically [Zquete, 2002] describes a mechanism that could theoretically improve
improve the functionality of SYN cookies. It exploits the TCP the functionality of SYN cookies. It exploits the TCP "simultaneous
"simultaneous open" mechanism, as illustrated in Figure 5. open" mechanism, as illustrated in Figure 5.
See Figure 5, on page 46 of the UK CPNI document. See Figure 5, on page 46 of the UK CPNI document.
Use of TCP simultaneous open for handling SYN floods Use of TCP simultaneous open for handling SYN floods
In line 1, TCP A initiates the connection-establishment phase by In line 1, TCP A initiates the connection-establishment phase by
sending a SYN segment to TCP B. In line 2, TCP B creates a SYN sending a SYN segment to TCP B. In line 2, TCP B creates a SYN cookie
cookie as described by [Bernstein, 1996], but does not set the ACK as described by [Bernstein, 1996], but does not set the ACK bit of
bit of the segment it sends (thus really sending a SYN segment, the segment it sends (thus really sending a SYN segment, rather than
rather than a SYN/ACK). This "fools" TCP A into thinking that a SYN/ACK). This "fools" TCP A into thinking that both SYN segments
both SYN segments "have crossed each other in the network" as if a "have crossed each other in the network" as if a "simultaneous open"
"simultaneous open" scenario had taken place. As a result, in scenario had taken place. As a result, in line 3 TCP A sends a SYN/
line 3 TCP A sends a SYN/ACK segment containing the same options ACK segment containing the same options that were contained in the
that were contained in the original SYN segment. In line 4, upon original SYN segment. In line 4, upon receipt of this segment, TCP B
receipt of this segment, TCP B processes the cookie encoded in the processes the cookie encoded in the ACK field as if it had been the
ACK field as if it had been the result of a traditional SYN cookie result of a traditional SYN cookie scenario, and moves the connection
scenario, and moves the connection into the ESTABLISHED state. In into the ESTABLISHED state. In line 5, TCP B sends a SYN/ACK
line 5, TCP B sends a SYN/ACK segment, which causes the connection segment, which causes the connection at TCP A to move into the
at TCP A to move into the ESTABLISHED state. In line 6, TCP A ESTABLISHED state. In line 6, TCP A sends a data segment on the
sends a data segment on the connection. connection.
While this mechanism would work in theory, unfortunately there are While this mechanism would work in theory, unfortunately there are a
a number of factors that prevent it from being usable in real number of factors that prevent it from being usable in real network
network environments: environments:
* Some systems are not able to perform the "simultaneous open" o Some systems are not able to perform the "simultaneous open"
operation specified in RFC 793, and thus the connection operation specified in RFC 793, and thus the connection
establishment will fail. establishment will fail.
* Some firewalls might prevent the establishment of TCP o Some firewalls might prevent the establishment of TCP connections
connections that rely on the "simultaneous open" mechanism that rely on the "simultaneous open" mechanism (e.g., a given
(e.g., a given firewall might be allowing incoming SYN/ACK firewall might be allowing incoming SYN/ACK segments, but not
segments, but not outgoing SYN/ACK segments). outgoing SYN/ACK segments).
Therefore, we do not recommend implementation of this mechanism Therefore, we do not recommend implementation of this mechanism for
for mitigating SYN-flood attacks. mitigating SYN-flood attacks.
5.2. Connection forgery 5.2. Connection forgery
The process of causing a TCP connection to be illegitimately The process of causing a TCP connection to be illegitimately
established between two arbitrary remote peers is usually referred to established between two arbitrary remote peers is usually referred to
as "connection spoofing" or "connection forgery". This can have a as "connection spoofing" or "connection forgery". This can have a
great negative impact when systems establish some sort of trust great negative impact when systems establish some sort of trust
relationships based on the IP addresses used to establish a TCP relationships based on the IP addresses used to establish a TCP
connection [daemon9 et al, 1996]. connection [daemon9 et al, 1996].
skipping to change at page 45, line 24 skipping to change at page 29, line 22
recommended that systems disable IP Source Routing by default, or at recommended that systems disable IP Source Routing by default, or at
the very least, they disable source routing for IP packets that the very least, they disable source routing for IP packets that
encapsulate TCP segments. encapsulate TCP segments.
The IPv6 Routing Header Type 0, which provides a similar The IPv6 Routing Header Type 0, which provides a similar
functionality to that provided by IPv4 source routing, has been functionality to that provided by IPv4 source routing, has been
officially deprecated by RFC 5095 [Abley et al, 2007]. officially deprecated by RFC 5095 [Abley et al, 2007].
5.3. Connection-flooding attack 5.3. Connection-flooding attack
NOTE: THIS SECTION IS BEING EDITED. RFC2119-LANGUAGE IS BEING
REMOVED.
5.3.1. Vulnerability 5.3.1. Vulnerability
The creation and maintenance of a TCP connection requires system The creation and maintenance of a TCP connection requires system
memory to maintain shared state between the local and the remote TCP. memory to maintain shared state between the local and the remote TCP.
As system memory is a finite resource, there is a limit on the number As system memory is a finite resource, there is a limit on the number
of TCP connections that a system can maintain at any time. When the of TCP connections that a system can maintain at any time. When the
TCP API is employed to create a TCP connection with a remote peer, it TCP API is employed to create a TCP connection with a remote peer, it
allocates system memory for maintaining shared state with the remote allocates system memory for maintaining shared state with the remote
TCP peer, and thus the resulting connection would tie a similar TCP peer, and thus the resulting connection would tie a similar
amount of resources at the remote host as at the local host. amount of resources at the remote host as at the local host.
skipping to change at page 48, line 36 skipping to change at page 32, line 36
Some firewalls can be configured to limit the number of Some firewalls can be configured to limit the number of
simultaneous connections that any system can maintain with a simultaneous connections that any system can maintain with a
specific system and/or service at any given time. Limiting the specific system and/or service at any given time. Limiting the
number of simultaneous connections that each system can establish number of simultaneous connections that each system can establish
with a specific system and service would effectively limit the with a specific system and service would effectively limit the
possibility of an attacker that controls a single IP address to possibility of an attacker that controls a single IP address to
exhaust system resources at the attacked system/service. exhaust system resources at the attacked system/service.
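The per-source limit described above can be sketched as follows. This is a minimal illustration rather than the behaviour of any particular firewall; the class name PerSourceLimiter and the max_per_source parameter are hypothetical.

```python
class PerSourceLimiter:
    """Track concurrent connections per (source address, service port)."""

    def __init__(self, max_per_source):
        self.max_per_source = max_per_source  # hypothetical policy knob
        self.active = {}                      # (src_ip, port) -> open connections

    def try_accept(self, src_ip, service_port):
        """Return True and count the connection if the source is under its limit."""
        key = (src_ip, service_port)
        if self.active.get(key, 0) >= self.max_per_source:
            return False
        self.active[key] = self.active.get(key, 0) + 1
        return True

    def release(self, src_ip, service_port):
        """Call when a connection from this source is closed."""
        key = (src_ip, service_port)
        if key in self.active:
            self.active[key] -= 1
            if self.active[key] == 0:
                del self.active[key]
```

Such a limit constrains only an attacker that controls a single IP address; an attacker using many source addresses would need to be handled by other means.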
5.4. Firewall-bypassing techniques 5.4. Firewall-bypassing techniques
TCP MUST silently drop those TCP segments that have both the SYN and [draft-gont-tcpm-tcp-sanity-checks-00.txt] discusses how packets with
the RST flags set. both the SYN and RST bits set have been employed in the wild to
bypass firewall rules, and provides advice in this area.
DISCUSSION:
Some firewalls block incoming TCP connections by blocking only
incoming SYN segments. However, there are inconsistencies in how
different TCP implementations handle SYN segments that have
additional flags set, which may allow an attacker to bypass
firewall rules [US-CERT, 2003b].
For example, some firewalls have been known to mistakenly allow
incoming SYN segments if they also have the RST bit set. As some
TCP implementations will create a new connection in response to a
TCP segment with both the SYN and RST bits set, an attacker could
bypass the firewall rules and establish a connection with a
"protected" system by setting the RST bit in his SYN segments.
Here we advise TCP implementations to silently drop those TCP
segments that have both the SYN and the RST flags set.
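The check advised above can be sketched as a test on the TCP flags octet. The bit values for SYN (0x02) and RST (0x04) are those of the TCP header; the function name should_drop is an assumption made for illustration.

```python
TCP_FLAG_SYN = 0x02
TCP_FLAG_RST = 0x04


def should_drop(flags):
    """Return True for segments that have both the SYN and RST bits set."""
    both = TCP_FLAG_SYN | TCP_FLAG_RST
    return (flags & both) == both
```

The test ignores all other flag bits, so a segment that sets SYN and RST together with, for example, ACK is dropped as well.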
6. Connection-termination mechanism 6. Connection-termination mechanism
6.1. FIN-WAIT-2 flooding attack 6.1. FIN-WAIT-2 flooding attack
6.1.1. Vulnerability 6.1.1. Vulnerability
TCP implements a connection-termination mechanism that is employed TCP implements a connection-termination mechanism that is employed
for the graceful termination of a TCP connection. This mechanism for the graceful termination of a TCP connection. This mechanism
usually consists of the exchange of four segments. Figure 6 usually consists of the exchange of four segments. Figure 6
skipping to change at page 49, line 40 skipping to change at page 33, line 25
As a result, an attacker could establish a large number of As a result, an attacker could establish a large number of
connections with the target system, and cause it to close each of them. connections with the target system, and cause it to close each of them.
For each connection, once the target system has sent its FIN segment, For each connection, once the target system has sent its FIN segment,
the attacker would acknowledge the receipt of this segment, but would the attacker would acknowledge the receipt of this segment, but would
send no further segments on that connection. As a result, an send no further segments on that connection. As a result, an
attacker could tie up the corresponding system resources (e.g., the attacker could tie up the corresponding system resources (e.g., the
system memory used for storing the TCB) without the need to send any system memory used for storing the TCB) without the need to send any
further packets. further packets.
While the CLOSE command described in RFC 793 [Postel, 1981c] simply While the CLOSE command described in RFC 793 [RFC0793] simply signals
signals the remote TCP end-point that this TCP has finished sending the remote TCP end-point that this TCP has finished sending data
data (i.e., it closes only one direction of the data transfer), the (i.e., it closes only one direction of the data transfer), the
close() system-call available in most operating systems has different close() system-call available in most operating systems has different
semantics: it marks the corresponding file descriptor as closed (and semantics: it marks the corresponding file descriptor as closed (and
thus it is no longer usable), and assigns the operating system the thus it is no longer usable), and assigns the operating system the
responsibility to deliver any queued data to the remote TCP peer and responsibility to deliver any queued data to the remote TCP peer and
to terminate the TCP connection. This makes the FIN-WAIT-2 state to terminate the TCP connection. This makes the FIN-WAIT-2 state
particularly attractive for performing memory exhaustion attacks, as particularly attractive for performing memory exhaustion attacks, as
even if the application running on top of TCP were imposing limits on even if the application running on top of TCP were imposing limits on
the maximum number of ongoing connections, and/or time limits on the the maximum number of ongoing connections, and/or time limits on the
function calls performed on TCP connections, that application would function calls performed on TCP connections, that application would
be unable to enforce these limits on the FIN-WAIT-2 state. be unable to enforce these limits on the FIN-WAIT-2 state.
skipping to change at page 56, line 35 skipping to change at page 40, line 27
window to cause the target system to tie system memory to the TCP window to cause the target system to tie system memory to the TCP
retransmission buffer, it is hard to perform any useful statistics retransmission buffer, it is hard to perform any useful statistics
from the advertised window. While it is tempting to enforce a limit from the advertised window. While it is tempting to enforce a limit
on the length of the persist state (see Section 3.7.2 of this on the length of the persist state (see Section 3.7.2 of this
document), an attacker could simply open the window (i.e., advertise document), an attacker could simply open the window (i.e., advertise
a TCP window larger than zero) from time to time to prevent this a TCP window larger than zero) from time to time to prevent this
enforced limit from causing his malicious connections to be aborted. enforced limit from causing his malicious connections to be aborted.
7.2. TCP segment reassembly buffer 7.2. TCP segment reassembly buffer
TCP MAY discard out-of-order data when system-memory exhaustion is TCP buffers out-of-order segments to more efficiently handle the
imminent. occurrence of packet reordering and segment loss. When out-of-order
data are received, a "hole" momentarily exists in the data stream
DISCUSSION: which must be filled before the received data can be delivered to the
application making use of TCP's services. This situation can be
TCP buffers out-of-order segments to more efficiently handle the exploited by an attacker, who could intentionally create a hole in
occurrence of packet reordering and segment loss. When out-of- the data stream by sending a number of segments with a sequence
order data are received, a "hole" momentarily exists in the data number larger than the next sequence number expected (RCV.NXT) by the
stream which must be filled before the received data can be attacked TCP. Thus, the attacked TCP would tie system memory to
delivered to the application making use of TCP's services. This buffer the out-of-order segments, without being able to hand the
situation can be exploited by an attacker, who could received data to the corresponding application.
intentionally create a hole in the data stream by sending a number
of segments with a sequence number larger than the next sequence
number expected (RCV.NXT) by the attacked TCP. Thus, the attacked
TCP would tie system memory to buffer the out-of-order segments,
without being able to hand the received data to the corresponding
application.
If a large number of such connections were created, system memory If a large number of such connections were created, system memory
could be exhausted, precluding the attacked TCP from servicing new could be exhausted, precluding the attacked TCP from servicing new
connections and/or continuing to service TCP connections previously connections and/or continuing to service TCP connections previously
established. established.
Fortunately, these attacks can be easily mitigated, at the expense Fortunately, these attacks can be easily mitigated, at the expense of
of degrading the performance of possibly legitimate connections. degrading the performance of possibly legitimate connections. When
When out-of-order data is received, an Acknowledgement segment is out-of-order data is received, an Acknowledgement segment is sent
sent with the next sequence number expected (RCV.NXT). This means with the next sequence number expected (RCV.NXT). This means that
that receipt of the out-of-order data will not be actually receipt of the out-of-order data will not be actually acknowledged by
acknowledged by the TCP's cumulative Acknowledgement Number. As a the TCP's cumulative Acknowledgement Number. As a result, a TCP is
result, a TCP is free to discard any data that have been received free to discard any data that have been received out-of-order,
out-of-order, without affecting the reliability of the data without affecting the reliability of the data transfer. Given the
transfer. Given the performance implications of discarding out- performance implications of discarding out-of-order segments for
of-order segments for legitimate connections, this pruning policy legitimate connections, this pruning policy should be applied only if
should be applied only if memory exhaustion is imminent. memory exhaustion is imminent.
As a result of discarding the out-of-order data, these data will As a result of discarding the out-of-order data, these data will need
need to be unnecessarily retransmitted. Additionally, a loss to be unnecessarily retransmitted. Additionally, a loss event will
event will be detected by the sending TCP, and thus the slow start be detected by the sending TCP, and thus the slow start phase of
phase of TCP's congestion control will be entered, thus reducing TCP's congestion control will be entered, thus reducing the data
the data transfer rate of the connection. transfer rate of the connection.
It is interesting to note that this pruning policy could be It is interesting to note that this pruning policy could be applied
applied even if Selective Acknowledgements (SACK) (specified in even if Selective Acknowledgements (SACK) (specified in RFC 2018
RFC 2018 [Mathis et al, 1996]) are in use, as SACK provides only [Mathis et al, 1996]) are in use, as SACK provides only advisory
advisory information, and does not preclude the receiving TCP from information, and does not preclude the receiving TCP from discarding
discarding data that have been previously selectively-acknowledged data that have been previously selectively-acknowledged by means of
by means of TCP's SACK option, but not acknowledged by TCP's TCP's SACK option, but not acknowledged by TCP's cumulative
cumulative Acknowledgement Number. Acknowledgement Number.
There are a number of ways in which the pruning policy could be There are a number of ways in which the pruning policy could be
triggered. For example, when out-of-order data are received, a triggered. For example, when out-of-order data are received, a timer
timer could be set, and the sequence number of the out-of-order could be set, and the sequence number of the out-of-order data could
data could be recorded. If the hole were filled before the timer be recorded. If the hole were filled before the timer expires, the
expires, the timer would be turned off. However, if the timer timer would be turned off. However, if the timer expired before the
expired before the hole were filled, all the out-of-order segments hole were filled, all the out-of-order segments of the corresponding
of the corresponding connection would be discarded. This would be connection would be discarded. This would be a proactive counter-
a proactive counter-measure for attacks that aim at exhausting the measure for attacks that aim at exhausting the receive buffers.
receive buffers.
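The timer-based trigger described above might look like the following sketch. The class OutOfOrderQueue, the hole_timeout parameter, and the explicit "now" argument (used instead of a real timer so that the example is deterministic) are all assumptions made for illustration.

```python
class OutOfOrderQueue:
    """Per-connection out-of-order data with a proactive pruning timer."""

    def __init__(self, hole_timeout):
        self.hole_timeout = hole_timeout  # hypothetical limit, in seconds
        self.segments = {}                # sequence number -> payload
        self.deadline = None              # time at which buffered data is pruned

    def on_out_of_order(self, seq, payload, now):
        """Buffer the data; start the timer when the first hole appears."""
        if self.deadline is None:
            self.deadline = now + self.hole_timeout
        self.segments.setdefault(seq, payload)

    def on_hole_filled(self):
        """The hole was filled in time: the buffered data is modelled as
        handed to the application and the timer is turned off."""
        self.segments.clear()
        self.deadline = None

    def on_tick(self, now):
        """Discard all out-of-order data if the timer has expired.
        Returns the number of payload bytes freed."""
        if self.deadline is None or now < self.deadline:
            return 0
        freed = sum(len(p) for p in self.segments.values())
        self.segments.clear()
        self.deadline = None
        return freed
```

Discarding the data is safe for reliability because it was never covered by the cumulative Acknowledgement Number; the cost is the retransmission and rate reduction discussed earlier.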
In addition, an implementation could incorporate reactive In addition, an implementation could incorporate reactive mechanisms
mechanisms for more carefully controlling buffer allocation when for more carefully controlling buffer allocation when some predefined
some predefined buffer allocation threshold was reached. At that buffer allocation threshold was reached. At that point, pruning
point, pruning policies would be applied. policies would be applied.
A number of mechanisms can aid in the process of freeing system A number of mechanisms can aid in the process of freeing system
resources. For example, a table of network prefixes corresponding resources. For example, a table of network prefixes corresponding to
to the IP addresses of TCP peers that have ongoing TCP connections the IP addresses of TCP peers that have ongoing TCP connections could
could record the aggregate amount of out-of-order data currently record the aggregate amount of out-of-order data currently buffered
buffered for those connections. When the pruning policy was for those connections. When the pruning policy was triggered, TCP
triggered, TCP connections with hosts that have network prefixes connections with hosts that have network prefixes with large
with large aggregate out-of-order buffered data could be selected aggregate out-of-order buffered data could be selected first for
first for pruning the out-of-order segments. pruning the out-of-order segments.
Alternatively, if TCP segments were de-multiplexed by means of a Alternatively, if TCP segments were de-multiplexed by means of a hash
hash table (as is currently the case in many TCP table (as is currently the case in many TCP implementations), a
implementations), a counter could be held at each entry of the counter could be held at each entry of the hash table that would
hash table that would record the aggregate out-of-order data record the aggregate out-of-order data currently buffered for those
currently buffered for those connections belonging to that hash connections belonging to that hash table entry. When the pruning
table entry. When the pruning policy is triggered, the out-of- policy is triggered, the out-of-order data corresponding to those
order data corresponding to those connections linked by the hash connections linked by the hash table entry with the largest amount of
table entry with the largest amount of aggregate out-of-order data aggregate out-of-order data could be pruned first. It is important
could be pruned first. It is important that this hash is not that this hash is not computable by an attacker, as this would allow
computable by an attacker, as this would allow him to maliciously him to maliciously cause the performance of specific connections to
cause the performance of specific connections to be degraded. be degraded. That is, given a four-tuple that identifies a
That is, given a four-tuple that identifies a connection, an connection, an attacker should not be able to compute the
attacker should not be able to compute the corresponding hash corresponding hash value used by the target system to de-multiplex
value used by the target system to de-multiplex incoming TCP incoming TCP segments to that connection.
segments to that connection.
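One way to keep the hash from being computable by an attacker is to key it with a local secret. The sketch below uses keyed BLAKE2b from the Python standard library as a stand-in for whatever keyed hash an implementation would actually use; NUM_BUCKETS, SECRET_KEY, and the function names are hypothetical, and only IPv4 addresses are handled.

```python
import hashlib
import os
import socket
import struct

NUM_BUCKETS = 256             # hypothetical hash table size
SECRET_KEY = os.urandom(16)   # hypothetical secret, chosen at start-up

# Aggregate out-of-order bytes currently buffered, per hash table entry.
ooo_bytes = [0] * NUM_BUCKETS


def bucket_for(src_ip, src_port, dst_ip, dst_port, key=SECRET_KEY):
    """Map a connection four-tuple to a hash table entry using a keyed hash."""
    data = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!HH", src_port, dst_port))
    digest = hashlib.blake2b(data, key=key, digest_size=8).digest()
    return int.from_bytes(digest, "big") % NUM_BUCKETS


def account(four_tuple, nbytes):
    """Record nbytes of out-of-order data buffered for this connection."""
    ooo_bytes[bucket_for(*four_tuple)] += nbytes


def bucket_to_prune():
    """Entry holding the largest aggregate of out-of-order data."""
    return max(range(NUM_BUCKETS), key=lambda b: ooo_bytes[b])
```

Without knowledge of the secret, an attacker cannot choose a four-tuple that lands in the same entry as a victim connection in order to get that connection's data pruned.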
Another variant of a resource exhaustion attack against TCP's Another variant of a resource exhaustion attack against TCP's segment
segment reassembly mechanism would target the data structures used reassembly mechanism would target the data structures used to link
to link the different holes in a data stream. For example, an the different holes in a data stream. For example, an attacker could
attacker could send a burst of one-byte segments, leaving a one-byte send a burst of one-byte segments, leaving a one-byte hole between each
hole between each of the data bytes sent. Depending on the data of the data bytes sent. Depending on the data structures used for
structures used for holding and linking together each of the data holding and linking together each of the data segments, such an
segments, such an attack might waste a large amount of system attack might waste a large amount of system memory by exploiting the
memory by exploiting the overhead needed to store and link together overhead needed to store and link together each of these one-byte
each of these one-byte segments. segments.
For example, if a linked-list is used for holding and linking each For example, if a linked-list is used for holding and linking each of
of the data segments, each of the involved data structures could the data segments, each of the involved data structures could involve
involve one byte of kernel memory for storing the received data one byte of kernel memory for storing the received data byte (the TCP
byte (the TCP payload), plus 4 bytes (32 bits) for storing a payload), plus 4 bytes (32 bits) for storing a pointer to the next
pointer to the next node in the linked-list. Additionally, while node in the linked-list. Additionally, while such a data structure
such a data structure would require only a few bytes of kernel would require only a few bytes of kernel memory, it could result in
memory, it could result in the allocation of a whole memory page, the allocation of a whole memory page, thus consuming much more
thus consuming much more memory than expected. memory than expected.
Therefore, implementations should enforce a limit on the number of Therefore, implementations should enforce a limit on the number of
holes that are allowed in the received data stream at any given holes that are allowed in the received data stream at any given time.
time. When such a limit is reached, incoming TCP segments which When such a limit is reached, incoming TCP segments which would
would create new holes would be silently dropped. Measurements in create new holes would be silently dropped. Measurements in
[Dharmapurikar and Paxson, 2005] indicate that the vast [Dharmapurikar and Paxson, 2005] indicate that the vast majority
majority of TCP connections have at most a single hole at any of TCP connections have at most a single hole at any given time. A
given time. A limit of 16 holes for each connection would limit of 16 holes for each connection would accommodate even most of
accommodate even most of the very unusual cases in which there can the very unusual cases in which there can be more than one hole in the
be more than one hole in the data stream at a given time. data stream at a given time.
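A limit on the number of holes can be enforced while merging received ranges, as in the sketch below. It tracks only sequence ranges (not payload), ignores sequence number wrap-around, and uses hypothetical names (ReassemblyQueue, MAX_HOLES, receive); it is an illustration under those assumptions rather than a complete reassembly queue.

```python
MAX_HOLES = 16  # per-connection limit suggested in the text above


class ReassemblyQueue:
    def __init__(self, rcv_nxt, max_holes=MAX_HOLES):
        self.rcv_nxt = rcv_nxt
        self.max_holes = max_holes
        # Sorted, disjoint, non-adjacent [start, end) ranges beyond rcv_nxt.
        self.blocks = []

    def holes(self):
        # Every buffered block is preceded by exactly one hole.
        return len(self.blocks)

    @staticmethod
    def _merge(blocks, start, end):
        out = []
        for s, e in blocks:
            if e < start or s > end:      # neither overlapping nor adjacent
                out.append((s, e))
            else:                         # absorb into the new range
                start, end = min(s, start), max(e, end)
        out.append((start, end))
        out.sort()
        return out

    def receive(self, seq, length):
        """Return True if the segment is accepted, False if it is dropped."""
        start, end = max(seq, self.rcv_nxt), seq + length
        if end <= start:
            return True                   # nothing new in this segment
        merged = self._merge(self.blocks, start, end)
        if merged[0][0] == self.rcv_nxt:  # in-order data: advance RCV.NXT
            self.rcv_nxt = merged[0][1]
            self.blocks = merged[1:]
            return True
        if len(merged) > len(self.blocks) and len(merged) > self.max_holes:
            return False                  # would create a hole beyond the limit
        self.blocks = merged
        return True
```

Segments that fill or shrink an existing hole are always accepted, so the limit affects only data that would fragment the stream further.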
[US-CERT, 2004a] is a security advisory about a Denial of Service [US-CERT, 2004a] is a security advisory about a Denial of Service
vulnerability resulting from a TCP implementation that did not vulnerability resulting from a TCP implementation that did not
enforce limits on the number of segments stored in the TCP enforce limits on the number of segments stored in the TCP reassembly
reassembly buffer. buffer.
Section 8 of this document describes the security implications of Section 8 of this document describes the security implications of the
the TCP segment reassembly algorithm. TCP segment reassembly algorithm.
7.3. Automatic buffer tuning mechanisms 7.3. Automatic buffer tuning mechanisms
NOTE: THIS SECTION IS BEING EDITED. PLEASE DISREGARD THE RFC2119-
LANGUAGE RECOMMENDATIONS.
7.3.1. Automatic send-buffer tuning mechanisms 7.3.1. Automatic send-buffer tuning mechanisms
A TCP implementing an automatic send-buffer tuning mechanism SHOULD A TCP implementing an automatic send-buffer tuning mechanism SHOULD
enforce the following limit on the size of the send buffer of each enforce the following limit on the size of the send buffer of each
TCP connection: TCP connection:
send_buffer_size <= send_buffer_pool / (min_buffer_size * max_connections) send_buffer_size <= send_buffer_pool / (min_buffer_size * max_connections)
where where
skipping to change at page 63, line 37 skipping to change at page 47, line 28
It is worth noting that TCP Selective Acknowledgements (SACK) are It is worth noting that TCP Selective Acknowledgements (SACK) are
advisory, in the sense that a TCP that has SACKed (but not ACKed) advisory, in the sense that a TCP that has SACKed (but not ACKed)
a block of data is free to discard that block, and expect the TCP a block of data is free to discard that block, and expect the TCP
sender to retransmit them when the retransmission timer of the sender to retransmit them when the retransmission timer of the
peer TCP expires. peer TCP expires.
8. TCP segment reassembly algorithm 8. TCP segment reassembly algorithm
8.1. Problems that arise from ambiguity in the reassembly process 8.1. Problems that arise from ambiguity in the reassembly process
If a TCP segment is received containing some data bytes that had A security consideration that should be made for the TCP segment
already been received, the first copy of those data SHOULD be used reassembly algorithm is that of data stream consistency between the
for reassembling the application data stream. host performing the TCP segment reassembly, and a Network Intrusion
Detection System (NIDS) being employed to monitor the host in
DISCUSSION: question.
A security consideration that should be made for the TCP segment
reassembly algorithm is that of data stream consistency between
the host performing the TCP segment reassembly, and a Network
Intrusion Detection System (NIDS) being employed to monitor the
host in question.
In the event a TCP segment was unnecessarily retransmitted, or In the event a TCP segment was unnecessarily retransmitted, or there
there was packet duplication in any of the intervening networks, a was packet duplication in any of the intervening networks, a TCP
TCP might get more than one copy of the same data. Also, as TCP might get more than one copy of the same data. Also, as TCP segments
segments can be re-packetized when they are retransmitted, a given can be re-packetized when they are retransmitted, a given TCP segment
TCP segment might partially overlap data already received in might partially overlap data already received in earlier segments.
earlier segments. In all these cases, the question arises about In all these cases, the question arises about which of the copies of
which of the copies of the received data should be used when the received data should be used when reassembling the data stream.
reassembling the data stream. In legitimate and normal In legitimate and normal circumstances, all copies would be
circumstances, all copies would be identical, and the same data identical, and the same data stream would be obtained regardless of
stream would be obtained regardless of which copy of the data was which copy of the data was used. However, an attacker could
used. However, an attacker could maliciously send overlapping maliciously send overlapping segments containing different data, with
segments containing different data, with the intent of evading a the intent of evading a Network Intrusion Detection System (NIDS),
Network Intrusion Detection System (NIDS), which might reassemble which might reassemble the received TCP segments differently than the
the received TCP segments differently than the monitored system. monitored system. [Ptacek and Newsham, 1998] provides a detailed
[Ptacek and Newsham, 1998] provides a detailed discussion of these discussion of these issues.
issues.
As suggested in Section 3.9 of RFC 793 [Postel, 1981c], if a TCP As suggested in Section 3.9 of RFC 793 [RFC0793], if a TCP segment
segment arrives containing some data bytes that have already been arrives containing some data bytes that have already been received,
received, the first copy of those data should be used for the first copy of those data should be used for reassembling the
reassembling the application data stream. It should be noted that application data stream. It should be noted that while convergence
while convergence to this policy might prevent some cases of to this policy might prevent some cases of ambiguity in the
ambiguity in the reassembly process, there are a number of other reassembly process, there are a number of other techniques that an
techniques that an attacker could still exploit to evade a NIDS attacker could still exploit to evade a NIDS [CPNI, 2008]. These
[CPNI, 2008]. These techniques can generally be defeated if the techniques can generally be defeated if the NIDS is placed in-line
NIDS is placed in-line with the monitored system, thus allowing with the monitored system, thus allowing the NIDS to normalize the
the NIDS to normalize the network traffic or apply some other network traffic or apply some other policy that could ensure
policy that could ensure consistency between the result of the consistency between the result of the segment reassembly process
segment reassembly process obtained by the monitored host and that obtained by the monitored host and that obtained by the NIDS.
obtained by the NIDS.
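The "first copy" policy can be illustrated with a byte-granular sketch. The function name reassemble_first_copy and the isn parameter are assumptions, sequence numbers are treated as plain integers without wrap-around, and a real implementation would operate on ranges rather than individual bytes.

```python
def reassemble_first_copy(segments, isn=0):
    """Reassemble (seq, payload) pairs given in arrival order, keeping the
    first copy received of every byte. Returns the contiguous data that
    starts at sequence number isn."""
    stream = {}
    for seq, payload in segments:
        for offset, byte in enumerate(payload):
            # setdefault() leaves an already-received byte untouched.
            stream.setdefault(seq + offset, byte)
    out = bytearray()
    pos = isn
    while pos in stream:
        out.append(stream[pos])
        pos += 1
    return bytes(out)
```

For the arrival order [(0, b"GET /"), (3, b"XXindex")] this yields b"GET /index", whereas a monitor that preferred the last copy would see b"GETXXindex". This difference is precisely the kind of ambiguity an attacker can exploit.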
[CERT, 2003] and [CORE, 2003] are advisories about a heap buffer [CERT, 2003] and [CORE, 2003] are advisories about a heap buffer
overflow in a popular Network Intrusion Detection System resulting overflow in a popular Network Intrusion Detection System resulting
from incorrect sequence number calculations in its TCP stream- from incorrect sequence number calculations in its TCP stream-
reassembly module. reassembly module.
9. TCP Congestion Control 9. TCP Congestion Control
NOTE: THIS SECTION IS BEING EDITED.
TCP implements two algorithms, "slow start" and "congestion TCP implements two algorithms, "slow start" and "congestion
avoidance", for controlling the rate at which data is transmitted on avoidance", for controlling the rate at which data is transmitted on
a TCP connection [Allman et al, 1999]. These algorithms require the a TCP connection [RFC5681].
addition of two variables as part of TCP per-connection state: cwnd
and ssthresh.
The congestion window (cwnd) is a sender-side limit on the amount of
outstanding data that the sender can have at any time, while the
receiver's advertised window (rwnd) is a receiver-side limit on the
amount of outstanding data. The minimum of cwnd and rwnd governs
data transmission.
Another state variable, the slow-start threshold (ssthresh), is used
to determine whether it is the slow start or the congestion avoidance
algorithm that should control data transmission. When cwnd <
ssthresh, "slow start" governs data transmission, and the congestion
window (cwnd) is exponentially increased. When cwnd > ssthresh,
"congestion avoidance" governs data transmission, and the congestion
window (cwnd) is only linearly increased.
As specified in RFC 2581 [Allman et al, 1999], when cwnd and ssthresh
are equal the sender may use either slow start or congestion
avoidance.
During slow start, TCP increments cwnd by at most SMSS bytes for each
ACK received that acknowledges new data. During congestion
avoidance, cwnd is incremented by 1 full-sized segment per round-trip
time (RTT), until congestion is detected.
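The two growth regimes can be sketched as a per-ACK update. The congestion avoidance branch uses the common per-ACK approximation cwnd += SMSS*SMSS/cwnd given in RFC 5681, which adds roughly one SMSS per RTT; the function name on_new_ack is hypothetical, and the case cwnd == ssthresh is arbitrarily assigned to congestion avoidance here.

```python
def on_new_ack(cwnd, ssthresh, smss):
    """Return the new cwnd (in bytes) after an ACK that acknowledges new data."""
    if cwnd < ssthresh:
        # Slow start: at most one SMSS per ACK (exponential growth per RTT).
        return cwnd + smss
    # Congestion avoidance: about one SMSS per RTT (linear growth).
    return cwnd + max(1, smss * smss // cwnd)
```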
Additionally, TCP uses two algorithms, Fast Retransmit and Fast
Recovery, to mitigate the effects of packet loss. The "Fast
Retransmit" algorithm infers packet loss when three Duplicate
Acknowledgements (DupACKs) are received.
The value "three" is meant to allow for fast-retransmission of
"missing" data, while avoiding network packet reordering from
triggering loss recovery.
Once packet loss is detected by the receipt of three duplicate-ACKs,
the "Fast Recovery" algorithm governs the transfer of new data until
a non-duplicate ACK is received that acknowledges the receipt of new
data. The Fast Retransmit and Fast Recovery algorithms are usually
implemented together, as follows (from RFC 2581):
o When the third duplicate ACK is received, set ssthresh to no more
than the value given in the equation: ssthresh = max (FlightSize /
2, 2*SMSS)
o Retransmit the lost segment and set cwnd to ssthresh plus 3*SMSS.
This artificially "inflates" the congestion window by the number
of segments (three) that have left the network and which the
receiver has buffered.
o For each additional duplicate ACK received, increment cwnd by
SMSS. This artificially inflates the congestion window in order
to reflect the additional segment that has left the network.
o Transmit a segment, if allowed by the new value of cwnd and the
receiver's advertised window.
o When the next ACK arrives that acknowledges new data, set cwnd to
ssthresh (the value set in step 1). This is termed "deflating"
the window.
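The steps above can be sketched as a small state machine. The class and method names are hypothetical, the caller is assumed to supply FlightSize, and the transmission of new segments (the fourth step) is left out.

```python
class FastRecovery:
    def __init__(self, cwnd, ssthresh, smss):
        self.cwnd = cwnd
        self.ssthresh = ssthresh
        self.smss = smss
        self.dupacks = 0
        self.in_recovery = False

    def on_dup_ack(self, flight_size):
        """Process a duplicate ACK. Returns True when the lost segment
        should be retransmitted (i.e., on the third duplicate ACK)."""
        self.dupacks += 1
        if self.dupacks == 3 and not self.in_recovery:
            self.ssthresh = max(flight_size // 2, 2 * self.smss)
            self.cwnd = self.ssthresh + 3 * self.smss   # "inflate" the window
            self.in_recovery = True
            return True
        if self.in_recovery:
            self.cwnd += self.smss    # one more segment has left the network
        return False

    def on_new_ack(self):
        """Process an ACK that acknowledges new data."""
        if self.in_recovery:
            self.cwnd = self.ssthresh  # "deflate" the window
            self.in_recovery = False
        self.dupacks = 0
```

Note that every duplicate ACK received during recovery inflates cwnd without any check on how many segments are actually outstanding.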
9.1. Congestion control with misbehaving receivers 9.1. Congestion control with misbehaving receivers
[Savage et al, 1999] describes a number of ways in which TCP's [Savage et al, 1999] describes a number of ways in which TCP's
congestion control mechanisms can be exploited by a misbehaving TCP congestion control mechanisms can be exploited by a misbehaving TCP
receiver to obtain more than its fair share of bandwidth. The receiver to obtain more than its fair share of bandwidth. The
following subsections provide a brief discussion of these following subsections provide a brief discussion of these
vulnerabilities, along with the possible countermeasures. vulnerabilities, along with the possible countermeasures.
9.1.1. ACK division 9.1.1. ACK division
TCP SHOULD increase cwnd by one SMSS only when a valid ACK covers the Given that TCP updates cwnd based on the number of ACKs it
entire data segment sent receives, rather than on the amount of data that each ACK is actually
acknowledging, a malicious TCP receiver could cause the TCP sender to
(note: or should we recommend the other counter-measure (i.e., illegitimately increase its congestion window by acknowledging a data
implementation of ABC?) segment with a number of separate Acknowledgements, each covering a
distinct piece of the received data segment.
DISCUSSION:
Given that TCP updates cwnd based on the number of duplicate ACKs
it receives, rather than on the amount of data that each ACK is
actually acknowledging, a malicious TCP receiver could cause the
TCP sender to illegitimately increase its congestion window by
acknowledging a data segment with a number of separate
Acknowledgements, each covering a distinct piece of the received
data segment.
See Figure 7, in page 64 of the UK CPNI document. See Figure 7, in page 64 of the UK CPNI document.
ACK division attack ACK division attack
[Savage et al, 1999] describes two possible countermeasures for [Savage et al, 1999] describes two possible countermeasures for this
this vulnerability. One of them is to increment cwnd not by a vulnerability. One of them is to increment cwnd not by a full SMSS,
full SMSS, but proportionally to the amount of data being but proportionally to the amount of data being acknowledged by the
acknowledged by the received ACK, similarly to the policy received ACK, similarly to the policy described in RFC 3465 [Allman,
described in RFC 3465 [Allman, 2003]. Another alternative is to 2003]. Another alternative is to increase cwnd by one SMSS only when
increase cwnd by one SMSS only when a valid ACK covers the entire a valid ACK covers the entire data segment sent.
data segment sent.
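The difference between per-ACK and per-byte window growth can be illustrated as follows. This is a minimal sketch in the spirit of the first countermeasure, not the algorithm of RFC 3465 itself: the function names, the SMSS value, and the per-ACK cap are assumptions made for the example.

```python
SMSS = 1460  # assumed sender maximum segment size, in bytes


def cwnd_increase_per_ack(bytes_acked: int) -> int:
    """Naive policy: one full SMSS for any ACK that covers new data."""
    return SMSS if bytes_acked > 0 else 0


def cwnd_increase_byte_counting(bytes_acked: int, limit: int = SMSS) -> int:
    """Grow cwnd by the bytes actually acknowledged, capped at 'limit'."""
    return min(max(bytes_acked, 0), limit)


# A receiver that splits the acknowledgement of one 1460-byte segment
# into ten ACKs of 146 bytes each:
naive_growth = sum(cwnd_increase_per_ack(146) for _ in range(10))
counted_growth = sum(cwnd_increase_byte_counting(146) for _ in range(10))
```

Under the naive policy the ten partial ACKs grow cwnd by ten segments, whereas with byte counting they grow it by exactly the one segment that was delivered.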
9.1.2. DupACK forgery 9.1.2. DupACK forgery
TCP SHOULD keep track of the number of outstanding segments (o_seg), The second vulnerability discussed in [Savage et al, 1999] allows an
and accept only up to (o_seg -1) duplicate Acknowledgements. attacker to cause the TCP sender to illegitimately increase its
congestion window by forging a number of duplicate Acknowledgements
DISCUSSION: (DupACKs). Figure 8 shows a sample scenario. The first three
DupACKs trigger the Fast Recovery mechanism, while the rest of them
The second vulnerability discussed in [Savage et al, 1999] allows cause the congestion window at the TCP sender to be illegitimately
an attacker to cause the TCP sender to illegitimately increase its inflated. Thus, the attacker is able to illegitimately cause the TCP
congestion window by forging a number of duplicate sender to increase its data transmission rate.
Acknowledgements (DupACKs). Figure 8 shows a sample scenario.
The first three DupACKs trigger the Fast Recovery mechanism, while
the rest of them cause the congestion window at the TCP sender to
be illegitimately inflated. Thus, the attacker is able to
illegitimately cause the TCP sender to increase its data
transmission rate.
See Figure 8, in page 65 of the UK CPNI document. See Figure 8, in page 65 of the UK CPNI document.
DupACK forgery attack DupACK forgery attack
Fortunately, a number of sender-side heuristics can be implemented Fortunately, a number of sender-side heuristics can be implemented to
to mitigate this vulnerability. First, the TCP sender could keep mitigate this vulnerability. First, the TCP sender could keep track
track of the number of outstanding segments (o_seg), and accept of the number of outstanding segments (o_seg), and accept only up to
only up to (o_seg -1) DupACKs. Secondly, a TCP sender might, for (o_seg -1) DupACKs. Secondly, a TCP sender might, for example,
example, refuse to enter Fast Recovery multiple times in some refuse to enter Fast Recovery multiple times in some period of time
period of time (e.g., one RTT). (e.g., one RTT).
[Savage et al, 1999] also describes a modification to TCP to [Savage et al, 1999] also describes a modification to TCP to
implement a nonce protocol that would eliminate this implement a nonce protocol that would eliminate this vulnerability.
vulnerability. However, this would require modification of all However, this would require modification of all implementations,
implementations, which makes this counter-measure hard to deploy. which makes this counter-measure hard to deploy.
9.1.3. Optimistic ACKing 9.1.3. Optimistic ACKing
Another alternative for an attacker to exploit TCP's congestion Another alternative for an attacker to exploit TCP's congestion
control mechanisms is to acknowledge data that has not yet been control mechanisms is to acknowledge data that has not yet been
received, thus causing the congestion window at the TCP sender to be received, thus causing the congestion window at the TCP sender to be
incremented faster than it should. incremented faster than it should.
See Figure 9, in page 66 of the UK CPNI document. See Figure 9, in page 66 of the UK CPNI document.
skipping to change at page 68, line 31 skipping to change at page 50, line 37
TCP", the third duplicate-ACK will cause the "lost" segment to be TCP", the third duplicate-ACK will cause the "lost" segment to be
retransmitted, and each subsequent duplicate-ACK will cause cwnd to retransmitted, and each subsequent duplicate-ACK will cause cwnd to
be artificially inflated. Thus, the "sending TCP" might end up be artificially inflated. Thus, the "sending TCP" might end up
injecting more packets into the network than it really should, with injecting more packets into the network than it really should, with
the potential of causing network congestion. This is a potential the potential of causing network congestion. This is a potential
consequence of the "Duplicate-ACK spoofing attack" described in consequence of the "Duplicate-ACK spoofing attack" described in
[Savage et al, 1999]. [Savage et al, 1999].
Secondly, if bursts of three duplicate ACKs are sent to the TCP Secondly, if bursts of three duplicate ACKs are sent to the TCP
sender, the attacked system would infer packet loss, and ssthresh and sender, the attacked system would infer packet loss, and ssthresh and
cwnd would be reduced. As noted in RFC 2581 [Allman et al, 1999], cwnd would be reduced. As noted in RFC 5681 [RFC5681], causing two
causing two congestion control events back-to-back will often cut congestion control events back-to-back will often cut ssthresh and
ssthresh and cwnd to their minimum value of 2*SMSS, with the cwnd to their minimum value of 2*SMSS, with the connection
connection immediately entering the slower-performing congestion immediately entering the slower-performing congestion avoidance
avoidance phase. While it would not be attractive for an attacker to phase. While it would not be attractive for an attacker to perform
perform this attack against one of his TCP connections, the attack this attack against one of his TCP connections, the attack might be
might be attractive when the TCP connection to be attacked is attractive when the TCP connection to be attacked is established
established between two other parties. between two other parties.
It is usually assumed that in order for an off-path attacker to It is usually assumed that in order for an off-path attacker to
perform attacks against a third-party TCP connection, he should be perform attacks against a third-party TCP connection, he should be
able to guess a number of values, including a valid TCP Sequence able to guess a number of values, including a valid TCP Sequence
Number and a valid TCP Acknowledgement Number. While this is true if Number and a valid TCP Acknowledgement Number. While this is true if
the attacker tries to "inject" valid packets into the connection by the attacker tries to "inject" valid packets into the connection by
himself, a feature of TCP can be exploited to fool one of the TCP himself, a feature of TCP can be exploited to fool one of the TCP
endpoints to transmit valid duplicate Acknowledgements on behalf of endpoints to transmit valid duplicate Acknowledgements on behalf of
the attacker, hence relieving the attacker of the hard task of the attacker, hence relieving the attacker of the hard task of
forging valid values for the Sequence Number and Acknowledgement forging valid values for the Sequence Number and Acknowledgement
Number TCP header fields. Number TCP header fields.
Section 3.9 of RFC 793 [Postel, 1981c] describes the processing of Section 3.9 of RFC 793 [RFC0793] describes the processing of incoming
incoming TCP segments as a function of the connection state and the TCP segments as a function of the connection state and the contents
contents of the various header fields of the received segment. For of the various header fields of the received segment. For
connections in the ESTABLISHED state, the first check that is connections in the ESTABLISHED state, the first check that is
performed on incoming segments is that they contain "in window" data. performed on incoming segments is that they contain "in window" data.
That is, That is,
RCV.NXT <= SEG.SEQ <= RCV.NXT+RCV.WND, or RCV.NXT <= SEG.SEQ <= RCV.NXT+RCV.WND, or
RCV.NXT <= SEG.SEQ+SEG.LEN-1 < RCV.NXT+RCV.WND RCV.NXT <= SEG.SEQ+SEG.LEN-1 < RCV.NXT+RCV.WND
If a segment does not pass this check, it is dropped, and an If a segment does not pass this check, it is dropped, and an
Acknowledgement is sent in response: Acknowledgement is sent in response:
skipping to change at page 71, line 22 skipping to change at page 53, line 28
segments (in red) sent by the attacker causes the TCP sender to enter segments (in red) sent by the attacker causes the TCP sender to enter
the loss recovery phase and illegitimately inflate the congestion the loss recovery phase and illegitimately inflate the congestion
window, leading to an increase in the data transmission rate. Once a window, leading to an increase in the data transmission rate. Once a
segment that acknowledges new data is received by the TCP sender, the segment that acknowledges new data is received by the TCP sender, the
loss recovery phase ends, and the data transmission rate is reduced. loss recovery phase ends, and the data transmission rate is reduced.
See Figure 12, in page 70 of the UK CPNI document. See Figure 12, in page 70 of the UK CPNI document.
Blind flooding attack (time-line graph) Blind flooding attack (time-line graph)
Figure 13 is a time-sequence graph produced from packet logs obtained
from tests of the described attack in a real network. A burst of
segments is sent upon receipt of the burst of Duplicate
Acknowledgements illegitimately elicited by the attacker. Figure 14
is an averaged-throughput graphic for the same time frame, which
clearly shows the effect of the attack in terms of throughput.
See Figure 13, in page 71 of the UK CPNI document.
Blind flooding attack (time sequence graph)
See Figure 14, in page 71 of the UK CPNI document.
Blind flooding attack (averaged throughput graph)
These graphics were produced with Shawn Ostermann's tcptrace tool
[Ostermann, 2008]. An explanation of the format of the graphics can
be found in tcptrace's manual (available at the project's web site:
http://www.tcptrace.org).
9.2.3. Difficulty in performing the attacks 9.2.3. Difficulty in performing the attacks
In order to exploit the technique described in Section 9.2 of this In order to exploit the technique described in Section 9.2 of this
document, an attacker would need to know the four-tuple {IP Source document, an attacker would need to know the four-tuple {IP Source
Address, TCP Source Port, IP Destination Address, TCP Destination Address, TCP Source Port, IP Destination Address, TCP Destination
Port} that identifies the connection to be attacked. As discussed by Port} that identifies the connection to be attacked. As discussed by
[Watson, 2004] and RFC 4953 [Touch, 2007], there are a number of [Watson, 2004] and RFC 4953 [Touch, 2007], there are a number of
scenarios in which these values may be known or easily guessed. scenarios in which these values may be known or easily guessed.
It is interesting to note that the attacks described in Section 9.2 It is interesting to note that the attacks described in Section 9.2
skipping to change at page 73, line 10 skipping to change at page 54, line 43
interesting in the case of the blind-flooding attack, as the attack interesting in the case of the blind-flooding attack, as the attack
would elicit even more packets from the TCP sender. would elicit even more packets from the TCP sender.
Whether a full-window or just half a window of data is retransmitted Whether a full-window or just half a window of data is retransmitted
depends on the Acknowledgement policy at the TCP receiver. If the depends on the Acknowledgement policy at the TCP receiver. If the
TCP receiver sends an Acknowledgement (ACK) for every segment, a TCP receiver sends an Acknowledgement (ACK) for every segment, a
full-window of data will be retransmitted. If the TCP receiver sends full-window of data will be retransmitted. If the TCP receiver sends
an Acknowledgement (ACK) for every other segment, then only half a an Acknowledgement (ACK) for every other segment, then only half a
window of data will be retransmitted. window of data will be retransmitted.
Figure 15 is a time-sequence graph produced from packet logs obtained
from tests performed in a real network. Once loss recovery is
illegitimately triggered by the duplicate-ACKs elicited by the
attacker, an entire flight of data is unnecessarily retransmitted.
Figure 16 is an averaged-throughput graphic for the same time-frame,
which shows an increase in the throughput of the connection resulting
from the retransmission of segments governed by NewReno's loss
recovery.
See Figure 15, in page 73 of the UK CPNI document.
NewReno loss recovery (time-sequence graph)
See Figure 16, in page 74 of the UK CPNI document.
NewReno loss recovery (averaged throughput graph)
Limited Transmit Limited Transmit
RFC 3042 [Allman et al, 2001] proposes an enhancement to TCP to more RFC 3042 [Allman et al, 2001] proposes an enhancement to TCP to more
effectively recover lost segments when a connection's congestion effectively recover lost segments when a connection's congestion
window is small, or when a large number of segments are lost in a window is small, or when a large number of segments are lost in a
single transmission window. The "Limited Transmit" algorithm calls single transmission window. The "Limited Transmit" algorithm calls
for sending a new data segment in response to each of the first two for sending a new data segment in response to each of the first two
Duplicate Acknowledgements that arrive at the TCP sender. This would Duplicate Acknowledgements that arrive at the TCP sender. This would
provide two additional transmitted packets that may be useful for the provide two additional transmitted packets that may be useful for the
attacker in the case of the blind flooding attack described in attacker in the case of the blind flooding attack described in
Section 9.2.2. Section 9.2.2.
SACK-based loss recovery SACK-based loss recovery
RFC 3517 [Blanton et al, 2003] specifies a conservative loss-recovery [I-D.ietf-tcpm-3517bis] specifies a conservative loss-recovery
algorithm that is based on the use of the selective acknowledgement algorithm that is based on the use of the selective acknowledgement
(SACK) TCP option. The algorithm uses DupACKs as an indication of (SACK) TCP option. The algorithm uses DupACKs as an indication of
congestion, as specified in RFC 2581 [Allman et al, 1999]. However, congestion, as specified in RFC 2581 [RFC5681]. However, a
a difference between this algorithm and the basic algorithm described difference between this algorithm and the basic algorithm described
in RFC 2581 is that it clocks out segments only with the SACK in RFC 2581 is that it clocks out segments only with the SACK
information included in the DupACKs. That is, during the loss information included in the DupACKs. That is, during the loss
recovery phase, segments will be injected in the network only if the recovery phase, segments will be injected in the network only if the
SACK information included in the received DupACKs indicates that one SACK information included in the received DupACKs indicates that one
or more segments have left the network. As a result, those systems or more segments have left the network. As a result, those systems
that implement SACK-based loss recovery will not be vulnerable to the that implement SACK-based loss recovery will not be vulnerable to the
blind flooding attack described in Section 9.2.2. However, as RFC blind flooding attack described in Section 9.2.2. Additionally, as
3517 does not actually require DupACKs to include new SACK [I-D.ietf-tcpm-3517bis] requires DupACKs to include new SACK
information (corresponding to data that has not yet been acknowledged information (corresponding to data that has not yet been acknowledged
by TCP's cumulative Acknowledgement), systems that implement SACK- by TCP's cumulative Acknowledgement), systems that implement SACK-
based loss-recovery may still remain vulnerable to the blind based loss-recovery will not be vulnerable to the blind throughput-
throughput-reduction attack described in Section 9.2.1. SACK-based reduction attack described in Section 9.2.1.
loss recovery implementations should be updated to implement the
countermeasure ("Use of SACK information to validate DupACKs")
described in Section 9.2.5.
9.2.5. Countermeasures 9.2.5. Countermeasures
TCP SHOULD validate the Sequence Number of an incoming TCP segment [draft-gont-tcpm-limiting-aow-segments-00.txt] proposes to rate-limit
as follows: the reaction to out-of-window segments. This would mitigate the
attacks described earlier in this section.
RCV.NXT - MAX.RCV.WND <= SEG.SEQ <= RCV.NXT + RCV.WND
where MAX.RCV.WND is the largest TCP window that has so far been
advertised to the remote endpoint.
If a segment passes this check, the processing rules specified in RFC
793 [Postel, 1981c] MUST be applied. Otherwise, TCP SHOULD send an ACK
(as specified by the processing rules in RFC 793 [Postel, 1981c]),
applying rate-limiting to the Acknowledgement segments sent in
response to out-of-window segments.
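Because TCP sequence numbers wrap around at 2^32, the check above has to be evaluated in modular arithmetic. The following is a minimal sketch under that assumption; the function and parameter names are hypothetical, and it assumes that MAX.RCV.WND + RCV.WND is smaller than 2^32.

```python
MOD = 1 << 32  # size of the TCP sequence number space


def seq_in_range(low: int, seq: int, high: int) -> bool:
    """True if seq lies in [low, high] in 32-bit modular sequence space."""
    return (seq - low) % MOD <= (high - low) % MOD


def passes_sequence_check(seg_seq: int, rcv_nxt: int,
                          rcv_wnd: int, max_rcv_wnd: int) -> bool:
    """RCV.NXT - MAX.RCV.WND <= SEG.SEQ <= RCV.NXT + RCV.WND (mod 2^32)."""
    low = (rcv_nxt - max_rcv_wnd) % MOD
    high = (rcv_nxt + rcv_wnd) % MOD
    return seq_in_range(low, seg_seq, high)
```

A segment for which this function returns False would be answered with an ACK subject to rate-limiting, rather than being processed further.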
DISCUSSION:
As discussed in Section 9.2, TCP responds with an ACK when an out-
of-window segment is received, to accommodate those scenarios in
which the Acknowledgement segments that correspond to some
received data are lost in the network, and to help discover half-
open TCP connections.
However, it is possible to restrict the sequence numbers that are
considered acceptable, and have TCP respond with ACKs only when it
is strictly necessary.
A feature of TCP is that, in some scenarios, it can detect half-
open connections. If an implementation chose to silently drop
those TCP segments that do not pass the check enforced by the
equation above, it could prevent TCP from detecting half-open
connections. Figure 17 shows a scenario in which, provided that
"TCP B" behaves as specified in RFC 793, a half-open connection
would be discovered and aborted.
An established connection is said to be "half open" if one of the
TCPs has closed or aborted the connection at its end without the
knowledge of the other, or if the two ends of the connection have
become desynchronized owing to a crash that resulted in loss of
memory.
See Figure 17, in page 76 of the UK CPNI document.
Half-Open Connection Discovery
In the scenario illustrated by Figure 17, TCP A crashes losing the
connection-state information of the TCP connection with TCP B. In
line 3, TCP A tries to establish a new connection with TCP B,
using the same four-tuple {IP Source Address, TCP source port, IP
Destination Address, TCP destination port}. In line 4, as the SYN
segment is out of window, TCP B responds with an ACK. This ACK
elicits an RST segment from TCP A, which causes the half-open
connection at TCP B to be aborted.
If the SYN segment had been "in window", TCP B would have sent an
RST segment instead, which would have closed the half-open
connection. Ongoing work at the TCPM WG of the IETF proposes to
change this behavior, and make TCP respond to a SYN segment
received for any of the synchronized states with an ACK segment,
to prevent in-window SYN segments from being used to perform
connection-reset attacks [Ramaiah et al, 2008].
However, in case the out-of-window segment was silently dropped,
the scenario in Figure 17 would change into that in Figure 18.
See Figure 18, in page 76 of the UK CPNI document.
Half-Open Connection Discovery with the proposed counter-measure
In line 3, the SYN segment sent by TCP A is silently dropped by
TCP B because it does not pass the check enforced by the equation
above (i.e., it contains an out-of-window sequence number). As a
result, some time later (an RTO) TCP A retransmits its SYN
segment. Even after TCP A times out, the half-open connection at
TCP B will remain in the same state.
Thus, a conservative reaction to those segments that do not pass
the check enforced by the equation above would be to respond with
an Acknowledgement segment (as specified by RFC 793), applying
rate-limiting to those Acknowledgement segments sent in response
to segments that do not pass the check enforced by that equation.
An implementation might choose to enforce a rate-limit of, e.g.,
one ACK per five seconds, as a single ACK segment is needed for
the Half-Open Connection Discovery mechanism to work.
As the only reason to respond with an ACK to those segments that
do not pass the check enforced by the equation above is to allow
TCP to discover half-open connections, an aggressive rate-limit
can be enforced. As long as the rate-limit prevents out-of-window
segments from eliciting three Acknowledgment segments in a Round-
trip Time (RTT), an attacker would not be able to trigger TCP's
loss-recovery, and thus would not be able to perform the attacks
described in the previous sections.
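A rate-limit of this kind can be sketched as a simple per-connection timer. The class name, method name, and the use of a caller-supplied clock value are assumptions made for the example; the five-second default mirrors the example rate mentioned above and is not a recommended value.

```python
from typing import Optional


class OutOfWindowAckLimiter:
    """Allows at most one ACK per 'min_interval' seconds (hypothetical)."""

    def __init__(self, min_interval: float = 5.0) -> None:
        self.min_interval = min_interval
        self._last_ack_time: Optional[float] = None

    def should_send_ack(self, now: float) -> bool:
        """Decide whether an out-of-window segment may elicit an ACK."""
        if (self._last_ack_time is None
                or now - self._last_ack_time >= self.min_interval):
            self._last_ack_time = now
            return True
        # Within the interval: drop the segment without responding.
        return False
```

Any interval longer than one third of the RTT keeps an attacker from eliciting three ACKs within a single RTT; a much longer interval still leaves the single ACK needed for half-open connection discovery.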
It is interesting to note that RFC 793 [Postel, 1981c] itself
states that half-open connections are expected to be unusual.
Additionally, given that in many scenarios it may be unlikely for
a TCP connection request to be issued with the same four-tuple as
that of the half-open connection, a complete solution for the
discovery of half-open connections cannot rely on the mechanism
illustrated by Figure 17, either. Therefore, some implementations
might choose to sacrifice TCP's ability to detect half-open
connections, and have a more aggressive reaction to those segments
that do not pass the check enforced by the equation above by
silently dropping them.
This validation check can also help to avoid ACK wars in some
scenarios that may arise from the use of transparent proxies. In
those scenarios, when the transparent proxy fails to wire (i.e.,
is disabled), the sequence numbers of the two end-points of the
TCP connection become desynchronized, and both TCPs begin to send
duplicate Acknowledgements to each other, with the intention of
re-synchronizing them. As the sequence numbers never get re-
synchronized, the ACK war can only be stopped by an external
agent.
TCP SHOULD limit the number of duplicate acknowledgements it will
honour to:
Max_DupACKs = (FlightSize / SMSS) - 1
Where FlightSize and SMSS are the values defined in RFC 2581 [Allman
et al, 1999]. When more than Max_DupACKs duplicate acknowledgements
are received, the exceeding DupACKs should be silently dropped.
DISCUSSION:
Note that duplicate acknowledgements should be elicited by out-of-
order segments.
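The limit above could be applied as in the following sketch. The function names are hypothetical, and two details are assumptions that the text does not specify: the division is rounded down, and the result is floored at zero.

```python
def max_dupacks(flight_size: int, smss: int) -> int:
    """Max_DupACKs = (FlightSize / SMSS) - 1, using integer division."""
    return max(flight_size // smss - 1, 0)


def honour_dupack(dupacks_honoured: int, flight_size: int, smss: int) -> bool:
    """True if one more duplicate ACK may still be honoured."""
    return dupacks_honoured < max_dupacks(flight_size, smss)
```

For example, with ten full-sized segments in flight, at most nine duplicate ACKs would be honoured, since at least one segment must have been lost for any duplicate ACK to be legitimate.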
In the case of TCP connections that have agreed to employ SACK, TCP
SHOULD validate duplicate ACKs with the following criteria: Valid
Duplicate ACKs MUST contain new SACK information. The SACK
information MUST refer to data that has already been sent, but that
has not yet been acknowledged by TCP's cumulative Acknowledgement. A
TCP segment that does not pass this check SHOULD NOT be considered as
"duplicate Acknowledgement".
DISCUSSION:
SACK, specified in RFC 2018 [Mathis et al, 1996], provides a mechanism
for TCP to be able to acknowledge the receipt of out-of-order TCP
segments. For connections that have agreed to use SACK, each
legitimate DupACK will contain new SACK information that reflects
the data bytes contained in the out-of-order data segment that
elicited the DupACK.
RFC 3517 [Blanton et al, 2003] specifies a SACK-based loss
recovery algorithm for TCP. However, it does not recommend that TCP
implementations validate DupACKs by requiring that they contain
new SACK information. Results obtained from auditing a number of
TCP implementations seem to indicate that most TCP implementations
do not enforce this validation check on incoming DupACKs, either.
In the case of TCP connections that have agreed to use SACK, a
validation check should be performed on incoming ACK segments to
completely eliminate the attacks described in Section 9.2.1 and
Section 9.2.2 of this document: "Duplicate ACKs should contain new
SACK information. The SACK information should refer to data that
has already been sent, but that has not yet been acknowledged by
TCP's cumulative Acknowledgement".
Those ACK segments that do not comply with this validation check
should not be considered "duplicate ACKs", and thus should not
trigger the loss-recovery phase.
In case at least one segment in a window of data has been lost,
the successive segments will elicit the generation of Duplicate
ACKs containing new SACK information. This SACK information will
indicate the receipt of these successive segments by the TCP
receiver.
In the case of pure ACKs illegitimately elicited by out-of-window
segments, however, the ACKs will not contain any SACK information.
If DSACK (specified in RFC 2883 [Floyd et al, 2000]) were implemented
by the TCP receiver, then the illegitimately elicited DupACKs
might contain out-of-window SACK information if the sequence
number of the forged TCP segment (SEG.SEQ) is lower than the next
expected sequence number (RCV.NXT) at the TCP receiver. Such
segments should be considered to indicate the receipt of duplicate
data, rather than an indication of lost data, and therefore should
not trigger loss recovery.
Other possible general mitigations are discussed in the following
paragraphs:
TCP port number randomization
To perform the blind attacks described in Section 9.2.1 and Section
9.2.2, the attacker needs to know the TCP port numbers in use by the
connection to be attacked. Obfuscating the TCP source port used for
outgoing TCP connections therefore increases the number of packets
required to successfully perform these attacks. Section 3.1 of this
document discusses the use of port randomization.
It must be noted that given that these blind DupACK triggering
attacks do not require the attacker to forge valid TCP Sequence
numbers and TCP Acknowledgement numbers, port randomization should
not be relied upon as a first line of defense.
Ingress and Egress filtering
Ingress and Egress filtering reduces the number of systems in the
global Internet that can perform attacks that rely on forged source
IP addresses. While protection from the blind attacks discussed in
Section 9.2 should not rely only on Ingress and Egress filtering, its
deployment is recommended to help prevent all attacks that rely on
forged IP addresses. RFC 3704 [Baker and Savola, 2004], RFC 2827
[Ferguson and Senie, 2000], and [NISCC, 2006] provide advice on
Ingress and Egress filtering.
Generalized TTL Security Mechanism (GTSM)
RFC 5082 [Gill et al, 2007] proposes a check on the TTL field of the
IP packets that correspond to a given TCP connection to reduce the
number of systems that could successfully attack the protected TCP
connection. For the attacks discussed in this document, it provides
the same level of protection as for the attacks described in
[Watson, 2004] and RFC 4953 [Touch, 2007]. While implementation of
this mechanism may be useful in some scenarios, it should be clear
that countermeasures discussed in the previous sections provide a
more effective and simpler solution than that provided by the GTSM.
9.3. TCP Explicit Congestion Notification (ECN) 9.3. TCP Explicit Congestion Notification (ECN)
ECN (Explicit Congestion Notification) provides a mechanism for ECN (Explicit Congestion Notification) provides a mechanism for
intermediate systems to signal congestion to the communicating intermediate systems to signal congestion to the communicating
endpoints that in some scenarios can be used as an alternative to endpoints that in some scenarios can be used as an alternative to
dropping packets. dropping packets.
RFC 3168 [Ramakrishnan et al, 2001] contains a detailed discussion of RFC 3168 [Ramakrishnan et al, 2001] contains a detailed discussion of
the possible ways and scenarios in which ECN could be exploited by an the possible ways and scenarios in which ECN could be exploited by an
skipping to change at page 79, line 27 skipping to change at page 56, line 6
on nonces, that protects against accidental or malicious concealment on nonces, that protects against accidental or malicious concealment
of marked packets from the TCP sender. The specified mechanism of marked packets from the TCP sender. The specified mechanism
defines a "NS" ("Nonce Sum") field in the TCP header that makes use defines a "NS" ("Nonce Sum") field in the TCP header that makes use
of one bit from the Reserved field, and requires a modification in of one bit from the Reserved field, and requires a modification in
both of the endpoints of a TCP connection to process this new field. both of the endpoints of a TCP connection to process this new field.
This mechanism is still in "Experimental" status, and since it might This mechanism is still in "Experimental" status, and since it might
suffer from the behavior of some middle-boxes such as firewalls or suffer from the behavior of some middle-boxes such as firewalls or
packet-scrubbers, we defer a recommendation of this mechanism until packet-scrubbers, we defer a recommendation of this mechanism until
more experience is gained. more experience is gained.
There also is ongoing work in the research community and the IETF to There also is ongoing work in the research community and the IETF
define alternate semantics for the ECN field of the IP header (e.g., to define alternate semantics for the ECN field of the IP header
see [PCNWG, 2009]). (e.g., see [PCNWG, 2009]).
The following subsections try to summarize the security implications
of ECN.
9.3.1. Possible attacks by a compromised router
Firstly, a router controlled by a malicious user could erase the CE
codepoint (either by replacing it with the ECT(0), ECT(1), or non-ECT
codepoints), effectively eliminating the congestion indication. As a
result, the corresponding TCP sender would not reduce its data
transmission rate, possibly leading to network congestion. This
could also lead to unfairness, as this flow could experience better
performance than other flows for which the congestion indication is
not erased (and thus their transmission rate is reduced).
Secondly, a router controlled by a malicious user could
illegitimately set the CE codepoint, falsely indicating congestion,
to cause the TCP sender to reduce its data transmission rate.
However, this particular attack is no worse than the malicious router
simply dropping the packets rather than setting their CE codepoint.
Thirdly, a malicious router could turn off the ECT codepoint of a
packet, thus disabling ECN support. As a result, if the packet later
arrives at a router that is experiencing congestion, it may be
dropped rather than marked. As with the previous scenario, though,
this is no worse than the malicious router simply dropping the
corresponding packet.
It should be noted that a compromised on-path IP router could engage
in a much broader range of attacks, with broader impacts, and at much
lower attacker cost than the ones described here. Such a compromised
router is extremely unlikely to engage in the attack vectors
discussed in this section, given the existence of more effective
attack vectors that have lower attacker cost.
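For reference, the three manipulations described above all amount to rewriting the two-bit ECN field carried in the low-order bits of the IPv4 TOS (or IPv6 Traffic Class) octet. The sketch below uses the codepoint values defined in RFC 3168; the helper names and the sample octet are assumptions made for the example.

```python
# ECN codepoints as defined in RFC 3168.
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11
ECN_MASK = 0b11


def get_ecn(tos: int) -> int:
    """Extract the ECN codepoint from a TOS / Traffic Class octet."""
    return tos & ECN_MASK


def set_ecn(tos: int, codepoint: int) -> int:
    """Return the octet with its ECN field replaced by 'codepoint'."""
    return (tos & ~ECN_MASK & 0xFF) | codepoint


marked = set_ecn(0xB8, CE)            # packet marked by a congested router
erased = set_ecn(marked, ECT_0)       # first attack: CE indication erased
forged = set_ecn(0xB8 | ECT_0, CE)    # second attack: CE falsely set
disabled = set_ecn(0xB8 | ECT_0, NOT_ECT)  # third attack: ECT turned off
```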
9.3.2. Possible attacks by a malicious TCP endpoint
If a packet with the ECT codepoint set arrives at an ECN-capable
router that is experiencing moderate congestion, the router may
decide to set its CE codepoint instead of dropping it. If either of
the TCP endpoints does not honour the congestion indication provided by
an ECN-capable router, this would result in unfairness, as other
(legitimate) ECN-capable flows would still reduce their sending rate
in response to the ECN marking of packets. Furthermore, under
moderate congestion, non-ECN-capable flows would be subject to packet
drops by the same router. As a result, the flow with a malicious TCP
end-point would obtain better service than the legitimate flows.
As noted in RFC 3168 [Ramakrishnan et al, 2001], a TCP endpoint
falsely indicating ECN capability could lead to unfairness, allowing
the misbehaving flow to get more than its fair share of the
bandwidth. This could be the result of the misbehavior of either of
the TCP endpoints. For example, the sending TCP could indicate ECN
capability, but then send a CWR in response to an ECE without
actually reducing its congestion window. Alternatively (or in
addition), the receiving TCP could simply ignore those packets with
the CE codepoint set, thus preventing the sending TCP from receiving
the congestion indication.
In the case of the sending TCP ignoring the ECN congestion
indication, this would be no worse than the sending TCP ignoring the
congestion indication provided by a lost segment. However, the case
of a TCP receiver ignoring the CE codepoint allows the TCP receiver
to get more than its fair share of bandwidth in a way that was
previously unavailable. If congestion was kept "moderate", then the
malicious TCP receiver could maintain the unfairness, as the router
experiencing congestion would mark the offending packets of the
misbehaving flow rather than dropping them. At the same time,
legitimate ECN-capable flows would respond to the congestion
indication provided by the CE codepoint, while legitimate
non-ECN-capable flows would be subject to packet dropping. However,
if congestion became sufficiently heavy, the router experiencing
congestion would switch from marking packets to dropping packets, and
at that point the attack vector provided by ECN could no longer be
exploited (until congestion returned to a moderate state).
RFC 3168 [Ramakrishnan et al, 2001] provides a very thorough security
assessment of ECN. Among the possible mitigations, it describes the
use of "penalty boxes" which would act on flows that do not respond
appropriately to congestion indications. Section 10 of RFC 3168
suggests that a first action taken at a penalty box for an ECN-capable
flow would be to switch to dropping packets (instead of marking them),
and, if the flow does not respond appropriately to the congestion
indication, the penalty box could reset the misbehaving connection.
Here we discourage implementation of such a policy, as it would create
a vector for connection-reset attacks. For example, an attacker could
forge TCP segments with the same four-tuple as the targeted connection
and cause them to transit the penalty box. The penalty box would
first switch from marking to dropping packets. However, the attacker
would continue sending forged segments, at a steady rate. As a
result, if the penalty box implemented such a severe policy of
resetting connections for flows that still do not respond to
end-to-end congestion control after switching from marking to
dropping, the attacked connection would be reset.
10. TCP API
NOTE: THIS SECTION IS BEING EDITED.
Section 3.8 of RFC 793 [Postel, 1981c] describes the minimum set of
TCP User Commands required of all TCP Implementations. Most
operating systems provide an Application Programming Interface (API)
that allows applications to make use of the services provided by TCP.
One of the most popular APIs is the Sockets API, originally
introduced in the BSD networking package [McKusick et al, 1996].
10.1. Passive opens and binding sockets
When there is already a pending passive OPEN for some local port
number, TCP SHOULD NOT allow processes that do not belong to the same
user to "reuse" the local port for another passive OPEN.
Additionally, reuse of a local port SHOULD default to "off", and be
enabled only by an explicit command (e.g., the setsockopt() function
of the Sockets API).
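As an illustration of the "off by default, enabled only by explicit
command" behaviour, the following minimal sketch uses the Python
binding of the Sockets API. SO_REUSEADDR is the standard Sockets API
option; the helper name port_reuse_enabled is hypothetical, and the
default value shown is the one we would expect on common platforms,
not something this document guarantees.

```python
import socket


def port_reuse_enabled(sock):
    """Return True if local address/port reuse is enabled on 'sock'.

    Some platforms report the option as a non-zero value other than
    1, hence the comparison against zero.
    """
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR) != 0


s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Freshly created socket: reuse is expected to be "off".
default_reuse = port_reuse_enabled(s)

# Reuse is turned on only by an explicit request of the application.
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
explicit_reuse = port_reuse_enabled(s)

s.close()
```

Note that this sketch only shows the per-socket default. Whether a
different user may bind to the same port once the option is set is a
policy enforced by the TCP implementation itself.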
skipping to change at page 82, line 14 skipping to change at page 57, line 18
OPEN (local port, foreign socket, active/passive [, timeout] [,
precedence] [, security/compartment] [, options]) -> local
connection name
When this command is used to perform a passive open (i.e., the
active/passive flag is set to passive), the foreign socket
parameter may be either fully-specified (to wait for a particular
connection) or unspecified (to wait for any call).
As discussed in Section 2.7 of RFC 793 [Postel, 1981c], if there
are several passive OPENs with the same local socket (recorded in
the corresponding TCB), an incoming connection will be matched to
the TCB with the more specific foreign socket. This means that
when the foreign socket of a passive OPEN matches that of the
incoming connection request, that passive OPEN takes precedence
over those passive OPENs with an unspecified foreign socket.
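The matching rule described above can be sketched as follows. This
is a simplified model rather than the data structures of any real
stack: the PassiveOpen record and the match_passive_open function are
hypothetical names introduced only for this example.

```python
from typing import List, NamedTuple, Optional, Tuple

SocketAddr = Tuple[str, int]  # (IP address, TCP port)


class PassiveOpen(NamedTuple):
    owner: str                      # user that issued the passive OPEN
    local: SocketAddr               # local socket recorded in the TCB
    foreign: Optional[SocketAddr]   # None means "unspecified"


def match_passive_open(tcbs: List[PassiveOpen],
                       local: SocketAddr,
                       foreign: SocketAddr) -> Optional[PassiveOpen]:
    """Pick the TCB for an incoming connection request.

    A passive OPEN whose foreign socket matches the request takes
    precedence over one with an unspecified foreign socket.
    """
    for tcb in tcbs:
        if tcb.local == local and tcb.foreign == foreign:
            return tcb
    for tcb in tcbs:
        if tcb.local == local and tcb.foreign is None:
            return tcb
    return None
```

This is also why unrestricted port reuse matters: a second user that
is allowed to issue a more specific passive OPEN on the same local
socket would receive the connection instead of the original listener.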
Popular implementations such as the Sockets API let the user
specify the local socket as a fully-specified {local IP address,
local TCP port} pair, or as just the local TCP port (leaving the
local IP address unspecified). In the former case, only those
connection requests sent to {local IP address, local TCP port} will
be accepted. In the latter case, connection requests sent to any of
the system's IP addresses will be accepted. In a similar fashion
to the generic API described in Section 2.7 of RFC 793, if there
is a pending passive OPEN with a fully-specified local socket that
skipping to change at page 83, line 6 skipping to change at page 58, line 8
port" argument of the "OPEN" command.
An implementation MAY relax the aforementioned restriction when the
process or system user requesting allocation of such a port number is
the same as the process or system user controlling the TCP in the
CLOSED or LISTEN states with the same port number.
DISCUSSION:
As discussed in Section 10.1, the "OPEN" command specified in
Section 3.8 of RFC 793 [Postel, 1981c] can be used to perform
active opens. In case of active opens, the parameter "local port"
will contain a so-called "ephemeral port". While the only
requirement for such an ephemeral port is that the resulting
connection-id is unique, port numbers that are currently in use by
a TCP in the LISTEN state should not be allowed for use as
ephemeral ports. If this rule is not complied with, an attacker
could potentially "steal" an incoming connection to a local server
application by issuing a connection request to the victim client
at roughly the same time the client tries to connect to the victim
server application. If the SYN segment corresponding to the
attacker's connection request and the SYN segment corresponding to
the victim client "cross each other in the network", and provided
the attacker is able to know or guess the ephemeral port used by
the client, a TCP simultaneous open scenario would take place, and
the incoming connection request sent by the client would be
matched with the attacker's socket rather than with the victim
server application's socket.
As already noted, in order for this attack to succeed, the
attacker should be able to guess or know (in advance) the
ephemeral port selected by the victim client, and be able to know
the right moment to issue a connection request to the victim
client. While in many scenarios this may prove to be a difficult
task, some factors such as an inadequate ephemeral port selection
policy at the victim client could make this attack feasible.
It should be noted that most applications based on popular
skipping to change at page 84, line 13 skipping to change at page 59, line 13
ports.
An implementation might choose to relax the aforementioned
restriction when the process or system user requesting allocation
of such a port number is the same as the process or system user
controlling the TCP in the CLOSED or LISTEN states with the same
port number.
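The restriction on ephemeral ports discussed above can be sketched as
follows. This is a minimal illustration under stated assumptions: the
function name select_ephemeral_port is hypothetical, the default range
is the IANA dynamic port range (49152-65535) rather than anything
mandated here, and a real implementation would also check uniqueness
of the resulting connection-id.

```python
import random


def select_ephemeral_port(listening_ports, rng=random,
                          low=49152, high=65535):
    """Pick a random ephemeral port that no TCP in the LISTEN state
    is currently using.

    'listening_ports' is the set of local ports with a pending
    passive OPEN. Random (rather than sequential) selection makes
    the chosen port harder for an attacker to guess.
    """
    candidates = [port for port in range(low, high + 1)
                  if port not in listening_ports]
    if not candidates:
        raise RuntimeError("no ephemeral port available")
    return rng.choice(candidates)
```

For example, select_ephemeral_port({50000, 50001}) never returns
50000 or 50001, so a local server listening on either port cannot
have an incoming connection "stolen" through a simultaneous open.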
11. Blind in-window attacks
NOTE: THIS SECTION IS BEING EDITED.
In the last few years awareness has been raised about a number of
"blind" attacks that can be performed against TCP by forging TCP
segments that fall within the receive window [NISCC, 2004] [Watson,
2004].
The term "blind" refers to the fact that the attacker does not have
access to the packets that belong to the attacked connection.
The effects of these attacks range from connection resets to data
injection. While these attacks were known in the research community,
skipping to change at page 85, line 7 skipping to change at page 60, line 7
reset attacks against TCP. [Watson, 2004] and [NISCC, 2004] raised
awareness about connection-reset attacks that exploit the RST flag of
TCP segments. [Ramaiah et al, 2008] noted that carefully crafted SYN
segments could also be used to perform connection-reset attacks.
This document describes two further, previously undocumented vectors
for performing connection-reset attacks: the Precedence field of IP
packets that encapsulate TCP segments, and illegal TCP options.
11.1.1. RST flag
TCP SHOULD implement the mitigation for RST-based attacks specified
in [Ramaiah et al, 2008].
DISCUSSION:
The RST flag signals a TCP peer that the connection should be
aborted. In contrast with the FIN handshake (which gracefully
terminates a TCP connection), an RST segment causes the connection
to be abnormally closed.
As stated in Section 3.4 of RFC 793 [Postel, 1981c], all reset
segments are validated by checking their Sequence Numbers, with
the Sequence Number considered valid if it is within the receive
window. In the SYN-SENT state, however, an RST is valid if the
Acknowledgement Number acknowledges the SYN segment that
supposedly elicited the reset.
[Ramaiah et al, 2008] proposes a modification to TCP's transition
diagram to address this attack vector. The counter-measure is a
combination of enforcing a more strict validation check on the
sequence number of reset segments, and the addition of a
"challenge" mechanism. With the implementation of the proposed
mechanism, TCP would behave as follows:
If the Sequence Number of an RST segment is outside the receive
window, the segment is silently dropped (as stated by RFC 793).
That is, a reset segment is discarded unless it passes the
following check:
RCV.NXT <= Sequence Number < RCV.NXT+RCV.WND
If the sequence number falls exactly on the left-edge of the
receive window, the reset is honoured. That is, the connection is
reset if the following condition is true:
Sequence Number == RCV.NXT
If an RST segment passes the first check (i.e., it is within the
receive window) but does not pass the second check (i.e., it does
not fall exactly on the left edge of the receive window), an
Acknowledgement segment ("challenge ACK") is sent in response:
<SEQ=SND.NXT><ACK=RCV.NXT><CTL=ACK>
This Acknowledgement segment is referred to as a "challenge ACK"
as, in the event the RST segment that elicited it had been
legitimate (but silently dropped as a result of enforcing the
above checks), the challenge ACK would elicit a new reset segment
that would fall exactly on the left edge of the window and would
thus pass all the above checks, finally resetting the connection.
We recommend the implementation of this countermeasure. However,
we are aware of patent claims on this counter-measure, and suggest
that vendors research the consequences of the possible patents that
may apply.
[US-CERT, 2003a] is an advisory of a firewall system that was
found particularly vulnerable to reset attacks because of not
validating the TCP Sequence Number of RST segments. Clearly, all
TCPs (including those in middle-boxes) should validate RST
segments as discussed in this section.
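The RST processing rules described in this section can be summarized
with the following minimal sketch. It follows the checks exactly as
stated here (including the behaviour for a zero receive window, which
a real implementation may need to treat specially); the function names
and return values are hypothetical and chosen only for illustration.

```python
MOD = 2 ** 32  # TCP sequence numbers wrap modulo 2^32


def classify_rst(seg_seq, rcv_nxt, rcv_wnd):
    """Decide how to react to an incoming RST segment.

    Returns "drop" if the sequence number is outside the receive
    window, "reset" if it falls exactly on the left edge of the
    window, and "challenge_ack" if it is in-window but not exact.
    """
    # Distance from the left edge of the window, modulo 2^32.
    offset = (seg_seq - rcv_nxt) % MOD
    if offset >= rcv_wnd:
        return "drop"            # not in RCV.NXT <= SEQ < RCV.NXT+RCV.WND
    if offset == 0:
        return "reset"           # SEQ == RCV.NXT
    return "challenge_ack"


def challenge_ack(snd_nxt, rcv_nxt):
    """Build the <SEQ=SND.NXT><ACK=RCV.NXT><CTL=ACK> segment."""
    return {"seq": snd_nxt, "ack": rcv_nxt, "flags": "ACK"}
```

With this logic, a blind attacker must hit one specific sequence
number (RCV.NXT) rather than any of the RCV.WND values in the window,
while a legitimate peer simply answers the challenge ACK with an RST
that carries the exact sequence number.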
11.1.2. SYN flag
Processing of SYN segments received for connections in the
synchronized states SHOULD occur as follows:
o If a SYN segment is received for a connection in any synchronized
state other than TIME-WAIT, respond with an ACK, applying
rate-throttling. [Ramaiah et al, 2008]
o If the corresponding connection is in the TIME-WAIT state, then
process the incoming SYN as specified in
[I-D.ietf-tcpm-tcp-timestamps].
DISCUSSION:
Section 3.9 (page 71) of RFC 793 [Postel, 1981c] states that if a
SYN segment is received with a valid (i.e., "in window") Sequence
Number, an RST segment should be sent in response, and the
connection should be aborted.
The IETF has published an RFC, "Improving TCP's Resistance to
Blind In-Window Attacks" [Ramaiah et al, 2008] which addresses,
among others, this variant of TCP-based connection-reset attack.
This section describes the counter-measure proposed by the IETF, a
problem that may arise from the implementation of that solution,
and a workaround to it.
In order to mitigate this attack vector, [Ramaiah et al, 2008]
proposes to change TCP's reaction to SYN segments as follows.
When a SYN segment is received for a connection in any of the
synchronized states, an Acknowledgement (ACK) segment is sent in
response.
As discussed in [Ramaiah et al, 2008], there is a corner-case that
would not be properly handled by this mechanism. If a host (TCP
A) establishes a TCP connection with a remote peer (TCP B), and
then crashes, reboots and tries to initiate a new incarnation of
the same connection (i.e., a connection with the same four-tuple
as the previous connection) using an Initial Sequence Number equal
to the RCV.NXT value at the remote peer (TCP B), the ACK segment
sent by TCP B in response to the SYN segment would contain an
Acknowledgement number that would be considered valid by TCP A,
and thus an RST segment would not be sent in response to the
Acknowledgement (ACK) segment. As this ACK would not have the SYN
bit set, TCP A (being in the SYN-SENT state) would silently drop
it (as stated on page 68 of RFC 793). After a Retransmission
Timeout (RTO), TCP A would retransmit its SYN segment, which would
lead to the same sequence of events as before. Eventually, TCP A
would timeout, and the connection would be aborted. This is a
corner case in which the introduced change would lead to a non-
desirable behavior. However, we consider this scenario to be
extremely unlikely and, in the event it ever took place, the
connection would nevertheless be aborted after retrying for a
period of USER TIMEOUT seconds.
However, when this change is implemented exactly as described in
[Ramaiah et al, 2008], the potential of interoperability problems
is introduced, as a heuristic widely incorporated in many TCP
implementations is disabled.
In a number of scenarios a socket pair may need to be reused while
the corresponding four-tuple is still in the TIME-WAIT state in a
remote TCP peer. For example, a client accessing some service on
a host may try to create a new incarnation of a previous
connection, while the corresponding four-tuple is still in the
TIME-WAIT state at the remote TCP peer (the server). This may
happen if the ephemeral port numbers are being reused too quickly,
either because of a bad policy of selection of ephemeral ports, or
simply because of a high connection rate to the corresponding
service. In such scenarios, the establishment of new connections
that reuse a four-tuple that is in the TIME-WAIT state would fail.
In order to avoid this problem, RFC 1122 [Braden, 1989] states (in
Section 4.2.2.13) that when a connection request is received with
a four-tuple that is in the TIME-WAIT state, the connection
request could be accepted if the sequence number of the incoming
SYN segment is greater than the last sequence number seen on the
previous incarnation of the connection (for that direction of the
data transfer).
This requirement aims at preventing the sequence number spaces of
the new and old incarnations of the connection from overlapping,
thus preventing old segments from the previous incarnation of the
connection from being accepted as valid by the new connection.
The requirement in [Ramaiah et al, 2008] to disregard SYN segments
received for connections in any of the synchronized states forbids
the implementation of the heuristic described above. As a result,
we argue that the processing of SYN segments proposed in [Ramaiah
et al, 2008] should apply only for connections in any of the
synchronized states other than the TIME-WAIT state.
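The resulting SYN processing can be sketched as follows. This is a
simplified illustration, not a complete state machine: the function
names and return values are hypothetical, rate-throttling of the ACKs
is omitted, and the TIME-WAIT branch models only the sequence-number
heuristic of RFC 1122 described above (the "no_reuse" outcome stands
for whatever processing the implementation otherwise applies in
TIME-WAIT).

```python
MOD = 2 ** 32  # TCP sequence numbers wrap modulo 2^32

SYNCHRONIZED_STATES = {
    "ESTABLISHED", "FIN-WAIT-1", "FIN-WAIT-2", "CLOSE-WAIT",
    "CLOSING", "LAST-ACK", "TIME-WAIT",
}


def seq_greater(a, b):
    """True if sequence number 'a' is after 'b', modulo 2^32."""
    return a != b and ((a - b) % MOD) < 2 ** 31


def process_syn(state, seg_seq, last_seq_seen):
    """React to a SYN received for a synchronized connection."""
    if state not in SYNCHRONIZED_STATES:
        raise ValueError("not a synchronized state: %s" % state)
    if state == "TIME-WAIT":
        # RFC 1122 heuristic: allow a new incarnation only if its
        # sequence space cannot overlap with the previous one.
        if seq_greater(seg_seq, last_seq_seen):
            return "accept_new_incarnation"
        return "no_reuse"
    # Any other synchronized state: answer with an ACK instead of
    # resetting the connection.
    return "send_ack"
```

Keeping TIME-WAIT as a separate branch is what preserves the
four-tuple reuse heuristic while still denying a blind attacker the
ability to reset established connections with forged SYN segments.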
11.1.3. Security/Compartment
If the security/compartment field of an incoming TCP segment does not
match the value recorded in the corresponding TCB, TCP SHOULD NOT
abort the connection, but simply discard the corresponding packet.
Additionally, this whole event SHOULD be logged as a security
violation.
DISCUSSION:
Section 3.9 (page 71) of RFC 793 [Postel, 1981c] states that if
the IP security/compartment of an incoming segment does not
exactly match the security/compartment in the TCB, a RST segment
should be sent, and the connection should be aborted.
A discussion of the IP security options relevant to this section
can be found in Section 3.13.2.12, Section 3.13.2.13, and Section
3.13.2.14 of [CPNI, 2008].
This certainly provides another attack vector for performing
connection-reset attacks, as an attacker could forge TCP segments
with a security/compartment that is different from that recorded
in the corresponding TCB and, as a result, the attacked connection
would be reset.
It is interesting to note that for connections in the ESTABLISHED
state, this check is performed after validating the TCP Sequence
Number and checking the RST bit, but before validating the
Acknowledgement field. Therefore, even if the stricter validation
of the Acknowledgement field (described in Section 3.4) was
implemented, it would not help to mitigate this attack vector.
This attack vector can be easily mitigated by relaxing the
reaction to TCP segments with "incorrect" security/compartment
values as specified in this section.
[draft-gont-tcpm-tcp-seccomp-prec-00.txt] aims to update RFC 793 such
that this issue is eliminated.
11.1.4. Precedence
If the Precedence field of an incoming TCP segment does not match
the value recorded in the corresponding TCB, TCP MUST NOT abort the
connection, and MUST instead continue processing the segment as
specified by RFC 793.
DISCUSSION:
Section 3.9 (page 71) of RFC 793 [Postel, 1981c] states that if
the IP Precedence of an incoming segment does not exactly match
the Precedence recorded in the TCB, a RST segment should be sent,
and the connection should be aborted.
This certainly provides another attack vector for performing
connection-reset attacks, as an attacker could forge TCP segments
with a IP Precedence that is different from that recorded in the
corresponding TCB and, as a result, the attacked connection would
be reset.
It is interesting to note that for connections in the ESTABLISHED
state, this check is performed after validating the TCP Sequence
Number and checking the RST bit, but before validating the
Acknowledgement field. Therefore, even if the stricter validation
of the Acknowledgement field (described in Section 3.4) were
implemented, it would not help to mitigate this attack vector.
This attack vector can be easily mitigated by relaxing the
reaction to TCP segments with "incorrect" IP Precedence values.
That is, even if the Precedence field does not match the value
recorded in the corresponding TCB, TCP should not abort the
connection, and should instead continue processing the segment as
specified by RFC 793.
It is interesting to note that resetting a connection due to a
change in the Precedence value might have a negative impact on
interoperability. For example, the packets that correspond to the
connection could temporarily take a different internet path, in
which some middle-box could re-mark the Precedence field (due to
administration policies at the network to be transited). In such
a scenario, an implementation following the advice in RFC 793
would abort the connection, when the connection would have
probably survived.
While the IPv4 Type of Service field (and hence the Precedence
field) has been redefined by the Differentiated Services (DS)
field specified in RFC 2474 [Nichols et al, 1998], RFC 793
[Postel, 1981c] was never formally updated in this respect. We
note that both legacy systems that have not been upgraded to
implement the differentiated services architecture described in
RFC 2475 [Blake et al, 1998] and current implementations that have
extrapolated the discussion of the Precedence field to the
Differentiated Services field may still be vulnerable to the
connection reset vector discussed in this section.
[draft-gont-tcpm-tcp-seccomp-prec-00.txt] aims to update RFC 793 such
that this issue is eliminated.
11.1.5. Illegal options
TCP MUST silently drop those TCP segments that contain TCP options
with illegal option lengths.
DISCUSSION:
Section 4.2.2.5 of RFC 1122 [Braden, 1989] discusses the
processing of TCP options. It states that TCP must be able to
receive a TCP option in any segment, and must ignore without error
any option it does not implement. Additionally, it states that
TCP should be prepared to handle an illegal option length (e.g.,
zero) without crashing, and suggests handling such illegal options
by resetting the corresponding connection and logging the reason.
However, this suggested behavior could be exploited to perform
connection-reset attacks. Therefore, as discussed in Section 3.10
of this document, we advise TCP implementations to silently drop
those TCP segments that contain illegal option lengths.
[draft-gont-tcpm-tcp-illegal-option-lengths-00] aims at formally
updating RFC 1122, such that this issue is eliminated.
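A minimal sketch of option parsing that follows this advice is shown
below. It assumes the standard TCP option encoding (kind 0 is
End of Option List, kind 1 is No-Operation, every other option
carries a length octet that covers the kind and length octets
themselves); the function name parse_tcp_options and the use of None
to signal "silently drop the segment" are hypothetical choices made
for this example.

```python
def parse_tcp_options(raw):
    """Parse the options area of a TCP header.

    Returns a list of (kind, data) tuples, or None if an illegal
    option length is found, in which case the caller should
    silently drop the segment (and not reset the connection).
    """
    options = []
    i = 0
    while i < len(raw):
        kind = raw[i]
        if kind == 0:            # End of Option List
            break
        if kind == 1:            # No-Operation (single octet)
            i += 1
            continue
        if i + 1 >= len(raw):    # length octet is missing
            return None
        length = raw[i + 1]
        # The length covers the kind and length octets, so it can
        # never be smaller than 2, and it must fit in the header.
        if length < 2 or i + length > len(raw):
            return None
        options.append((kind, bytes(raw[i + 2:i + length])))
        i += length
    return options
```

For instance, parse_tcp_options(bytes([2, 4, 5, 0xB4])) yields a
single MSS option, whereas the same option with a length octet of
zero yields None and the segment is discarded without affecting the
state of the connection.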
11.2. Blind data-injection attacks 11.2. Blind data-injection attacks
An attacker could try to inject data in the stream of data being An attacker could try to inject data in the stream of data being
transferred on the connection. As with the other attacks described transferred on the connection. As with the other attacks described
in Section 11 of this document, in order to perform a blind data in Section 11 of this document, in order to perform a blind data
injection attack the attacker would need to know or guess the four- injection attack the attacker would need to know or guess the four-
tuple that identifies the TCP connection to be attacked. tuple that identifies the TCP connection to be attacked.
Additionally, he should be able to guess a valid ("in window") TCP Additionally, he should be able to guess a valid ("in window") TCP
Sequence Number, and a valid Acknowledgement Number. Sequence Number, and a valid Acknowledgement Number.
As discussed in Section 3.4 of this document, [Ramaiah et al, 2008] As discussed in Section 3.4 of this document, [Ramaiah et al, 2008]
proposes to enforce a more strict check on the Acknowledgement Number proposes to enforce a more strict check on the Acknowledgement Number
of incoming segments than that specified in RFC 793 [Postel, 1981c]. of incoming segments than that specified in RFC 793 [RFC0793].
Implementation of the proposed check requires more packets on the Implementation of the proposed check requires more packets on the
side of the attacker to successfully perform a blind data-injection side of the attacker to successfully perform a blind data-injection
attack. However, it should be noted that applications concerned with attack. However, it should be noted that applications concerned with
any of the attacks discussed in Section 11 of this document should any of the attacks discussed in Section 11 of this document should
make use of proper authentication techniques, such as those specified make use of proper authentication techniques, such as those specified
for IPsec in RFC 4301 [Kent and Seo, 2005]. for IPsec in RFC 4301 [Kent and Seo, 2005].
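As an illustration of the stricter Acknowledgement Number check, the following Python sketch assumes the acceptance range later published in RFC 5961, namely SND.UNA - MAX.SND.WND <= SEG.ACK <= SND.NXT, evaluated in modulo-2**32 sequence space. The function names are hypothetical and the fragment is not part of either draft.

```python
MOD = 2 ** 32


def seq_leq(a: int, b: int) -> bool:
    """Return True if a <= b in modulo-2**32 sequence space."""
    return ((b - a) % MOD) < 2 ** 31


def ack_is_acceptable(seg_ack: int, snd_una: int, snd_nxt: int,
                      max_snd_wnd: int) -> bool:
    """Assumed check: SND.UNA - MAX.SND.WND <= SEG.ACK <= SND.NXT."""
    lower = (snd_una - max_snd_wnd) % MOD
    return seq_leq(lower, seg_ack) and seq_leq(seg_ack, snd_nxt)
```

With such a check in place, a blind attacker has to guess an Acknowledgement Number that falls within a range bounded by the largest window the peer has advertised, rather than within half of the sequence space.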
12. Information leaking 12. Information leaking
NOTE: THIS SECTION IS BEING EDITED.
12.1. Remote Operating System detection via TCP/IP stack fingerprinting 12.1. Remote Operating System detection via TCP/IP stack fingerprinting
Clearly, remote Operating System (OS) detection is a useful tool for Clearly, remote Operating System (OS) detection is a useful tool for
attackers. Tools such as nmap [Fyodor, 2006b] can usually detect the attackers. Tools such as nmap [Fyodor, 2006b] can usually detect the
operating system type and version of a remote system with operating system type and version of a remote system with
remarkable accuracy. This information can in turn be used remarkable accuracy. This information can in turn be used
by attackers to tailor their exploits to the identified operating by attackers to tailor their exploits to the identified operating
system type and version. system type and version.
Evasion of OS fingerprinting can prove to be a very difficult task. Evasion of OS fingerprinting can prove to be a very difficult task.
skipping to change at page 92, line 6 skipping to change at page 63, line 15
12.1.1. FIN probe 12.1.1. FIN probe
TCP MUST silently drop any TCP segments received for a connection in TCP MUST silently drop any TCP segments received for a connection in
the LISTEN state that do not have the SYN, RST, or ACK flags set. In the LISTEN state that do not have the SYN, RST, or ACK flags set. In
the rest of the cases, the processing rules in RFC 793 MUST be the rest of the cases, the processing rules in RFC 793 MUST be
applied. applied.
DISCUSSION: DISCUSSION:
The attacker sends a FIN (or any packet without the SYN or the ACK The attacker sends a FIN (or any packet without the SYN or the ACK
flags set) to an open port. RFC 793 [Postel, 1981c] leaves the flags set) to an open port. RFC 793 [RFC0793] leaves the reaction
reaction to such segments unspecified. As a result, some to such segments unspecified. As a result, some implementations
implementations silently drop the received segment, while others silently drop the received segment, while others respond with a
respond with a RST. RST.
12.1.2. Bogus flag test 12.1.2. Bogus flag test
TCP MUST ignore any flags not supported, and MUST NOT reflect them if TCP MUST ignore any flags not supported, and MUST NOT reflect them if
a TCP segment is sent in response to the one just received. a TCP segment is sent in response to the one just received.
DISCUSSION: DISCUSSION:
The attacker sends a TCP segment setting at least one bit of the The attacker sends a TCP segment setting at least one bit of the
Reserved field. Some implementations ignore this field, while Reserved field. Some implementations ignore this field, while
skipping to change at page 93, line 41 skipping to change at page 64, line 49
DISCUSSION: DISCUSSION:
[Fyodor, 1998] reports that many implementations differ in the [Fyodor, 1998] reports that many implementations differ in the
Acknowledgement Number they use in response to segments received Acknowledgement Number they use in response to segments received
for connections in the CLOSED state. In particular, these for connections in the CLOSED state. In particular, these
implementations differ in the way they construct the RST segment implementations differ in the way they construct the RST segment
that is sent in response to those TCP segments received for that is sent in response to those TCP segments received for
connections in the CLOSED state. connections in the CLOSED state.
RFC 793 [Postel, 1981c] describes (in pages 36-37) how RST RFC 793 [RFC0793] describes (in pages 36-37) how RST segments are
segments are to be generated. According to this RFC, the ACK bit to be generated. According to this RFC, the ACK bit (and the
(and the Acknowledgment Number) is set in a RST only if the Acknowledgment Number) is set in a RST only if the incoming
incoming segment that elicited the RST did not have the ACK bit segment that elicited the RST did not have the ACK bit set (and
set (and thus the Sequence Number of the outgoing RST segment must thus the Sequence Number of the outgoing RST segment must be set
be set to zero). However, we recommend TCP implementations to set to zero). However, we recommend TCP implementations to set the
the ACK bit (and the Acknowledgement Number) in all outgoing RST ACK bit (and the Acknowledgement Number) in all outgoing RST
segments, as it allows for additional validation checks to be segments, as it allows for additional validation checks to be
enforced at the system receiving the segment. enforced at the system receiving the segment.
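The RFC 793 rules summarized above for segments received in the CLOSED state can be sketched as follows. This is a minimal Python illustration, not text from either draft: the Segment type and the function name are hypothetical, and flags are modeled as a set of strings. It shows the RFC 793 behavior only, not the recommendation to set the ACK bit in all outgoing RST segments.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    seq: int
    ack: int
    flags: frozenset
    payload_len: int = 0


def rst_for_closed(seg: Segment) -> Segment:
    """Build the RST that RFC 793 prescribes for a non-RST segment in CLOSED."""
    if "ACK" in seg.flags:
        # <SEQ=SEG.ACK><CTL=RST>: no ACK bit in the reply.
        return Segment(seq=seg.ack, ack=0, flags=frozenset({"RST"}))
    # SEG.LEN counts the SYN and FIN flags as one sequence number each.
    seg_len = seg.payload_len + ("SYN" in seg.flags) + ("FIN" in seg.flags)
    # <SEQ=0><ACK=SEG.SEQ+SEG.LEN><CTL=RST,ACK>
    return Segment(seq=0, ack=(seg.seq + seg_len) % 2 ** 32,
                   flags=frozenset({"RST", "ACK"}))
```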
12.1.6. TCP options 12.1.6. TCP options
Different implementations differ in the TCP options they enable by Different implementations differ in the TCP options they enable by
default. Additionally, they differ in the actual contents of the default. Additionally, they differ in the actual contents of the
options, and in the order in which the options are included in a TCP options, and in the order in which the options are included in a TCP
segment. There is currently no recommendation on the order in which segment. There is currently no recommendation on the order in which
to include TCP options in TCP segments. to include TCP options in TCP segments.
skipping to change at page 95, line 36 skipping to change at page 66, line 47
[Rowland, 1996] contains a discussion of covert channels in the [Rowland, 1996] contains a discussion of covert channels in the
TCP/IP protocol suite, with some TCP-based examples. [Giffin et al, TCP/IP protocol suite, with some TCP-based examples. [Giffin et al,
2002] describes the use of TCP timestamps for the establishment of 2002] describes the use of TCP timestamps for the establishment of
covert channels. [Zander, 2008] contains an extensive bibliography covert channels. [Zander, 2008] contains an extensive bibliography
of papers on covert channels, and a list of freely-available tools of papers on covert channels, and a list of freely-available tools
that implement covert channels with the TCP/IP protocol suite. that implement covert channels with the TCP/IP protocol suite.
14. TCP Port scanning 14. TCP Port scanning
NOTE: THIS SECTION IS BEING EDITED.
TCP port scanning aims at identifying TCP port numbers on which there TCP port scanning aims at identifying TCP port numbers on which there
is a process listening for incoming connections. That is, it aims at is a process listening for incoming connections. That is, it aims at
identifying TCPs at the target system that are in the LISTEN state. identifying TCPs at the target system that are in the LISTEN state.
The following subsections describe different TCP port scanning The following subsections describe different TCP port scanning
techniques that have been implemented in freely-available tools. techniques that have been implemented in freely-available tools.
These subsections focus only on those port scanning techniques that These subsections focus only on those port scanning techniques that
exploit features of TCP itself, and not of other communication exploit features of TCP itself, and not of other communication
protocols. protocols.
For example, the following subsections do not discuss the For example, the following subsections do not discuss the
skipping to change at page 97, line 5 skipping to change at page 68, line 17
scanning tool. scanning tool.
14.3. FIN, NULL, and XMAS scans 14.3. FIN, NULL, and XMAS scans
TCP SHOULD respond with an RST when a TCP segment is received for a TCP SHOULD respond with an RST when a TCP segment is received for a
connection in the LISTEN state, and the incoming segment has neither connection in the LISTEN state, and the incoming segment has neither
the SYN bit nor the RST bit set. the SYN bit nor the RST bit set.
DISCUSSION: DISCUSSION:
RFC 793 [Postel, 1981c] states, in page 65, that an incoming RFC 793 [RFC0793] states, in page 65, that an incoming segment
segment that does not have the RST bit set and that is received that does not have the RST bit set and that is received for a
for a connection in the fictional state CLOSED causes an RST to be connection in the fictional state CLOSED causes an RST to be sent
sent in response. Pages 65-66 of RFC 793 describe the processing in response. Pages 65-66 of RFC 793 describe the processing of
of incoming segments for connections in the state LISTEN, and incoming segments for connections in the state LISTEN, and
implicitly states that an incoming segment that does not have the implicitly states that an incoming segment that does not have the
ACK bit set (and is not a SYN or an RST) should be silently ACK bit set (and is not a SYN or an RST) should be silently
dropped. dropped.
As a result, an attacker can exploit this situation to perform a As a result, an attacker can exploit this situation to perform a
port scan by sending TCP segments that do not have the ACK bit set port scan by sending TCP segments that do not have the ACK bit set
to the target system. When a port is "open" (i.e., there is a TCP to the target system. When a port is "open" (i.e., there is a TCP
in the LISTEN state on the corresponding port), the target system in the LISTEN state on the corresponding port), the target system
will silently drop the probe segment. On the other hand, if the port will silently drop the probe segment. On the other hand, if the port
is "closed" (i.e., there is a TCP in the fictional state CLOSED) is "closed" (i.e., there is a TCP in the fictional state CLOSED)
skipping to change at page 97, line 45 skipping to change at page 69, line 9
It should be clear that while the aforementioned control-bits It should be clear that while the aforementioned control-bits
combinations are the most popular ones, other combinations could combinations are the most popular ones, other combinations could
be used to exploit this port-scanning vector. For example, the be used to exploit this port-scanning vector. For example, the
CWR, ECE, and/or any of the Reserved bits could be set in the CWR, ECE, and/or any of the Reserved bits could be set in the
probe segments. probe segments.
The advantage of this port-scanning technique is that it can The advantage of this port-scanning technique is that it can
bypass some stateless firewalls. However, the downside is that a bypass some stateless firewalls. However, the downside is that a
number of implementations do not comply strictly with RFC 793 number of implementations do not comply strictly with RFC 793
[Postel, 1981c], and thus always respond to the probe segments [RFC0793], and thus always respond to the probe segments with an
with an RST, regardless of whether the port is open or closed. RST, regardless of whether the port is open or closed.
This port-scanning vector can be easily defeated by responding This port-scanning vector can be easily defeated by responding
with an RST when a TCP segment is received for a connection in the with an RST when a TCP segment is received for a connection in the
LISTEN state, and the incoming segment has neither the SYN bit nor LISTEN state, and the incoming segment has neither the SYN bit nor
the RST bit set. the RST bit set.
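The effect of this countermeasure can be sketched with a small model of the LISTEN and CLOSED processing rules. The Python fragment below is an illustration only, not text from either draft: the function names, the string return values, and the "hardened" switch are all hypothetical.

```python
def listen_response(flags, hardened: bool) -> str:
    """Model the reaction of a TCP in the LISTEN state to an incoming segment."""
    flags = set(flags)
    if "RST" in flags:
        return "ignore"      # RFC 793: an incoming RST is ignored
    if "ACK" in flags:
        return "rst"         # RFC 793: any ACK in LISTEN elicits an RST
    if "SYN" in flags:
        return "syn-ack"     # normal connection establishment
    # Neither SYN nor RST (nor ACK): FIN, NULL, and XMAS probes land here.
    return "rst" if hardened else "drop"


def closed_response(flags) -> str:
    """Model the reaction of the fictional CLOSED state."""
    return "ignore" if "RST" in set(flags) else "rst"


PROBES = {"FIN": {"FIN"}, "NULL": set(), "XMAS": {"FIN", "PSH", "URG"}}
```

With hardened set to True, each probe in PROBES elicits the same response from an open port as from a closed one, so the response no longer reveals the state of the port.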
14.4. Maimon scan 14.4. Maimon scan
If a TCP that is in the CLOSED or LISTEN states receives a TCP If a TCP that is in the CLOSED or LISTEN states receives a TCP
segment with both the FIN and ACK bits set, it MUST respond with a segment with both the FIN and ACK bits set, it MUST respond with a
RST. RST.
DISCUSSION: DISCUSSION:
This port scanning technique was introduced in [Maimon, 1996] with This port scanning technique was introduced in [Maimon, 1996] with
the name "StealthScan" (method #1), and was later incorporated the name "StealthScan" (method #1), and was later incorporated
into the nmap tool [Fyodor, 2006b] as the "Maimon scan". into the nmap tool [Fyodor, 2006b] as the "Maimon scan".
This port scanning technique employs TCP segments that have both This port scanning technique employs TCP segments that have both
the FIN and ACK bits set as the probe segments. While according the FIN and ACK bits set as the probe segments. While according
to RFC 793 [Postel, 1981c] these segments should elicit an RST to RFC 793 [RFC0793] these segments should elicit an RST
regardless of whether the corresponding port is open or closed, a regardless of whether the corresponding port is open or closed, a
programming flaw found in a number of TCP implementations has programming flaw found in a number of TCP implementations has
caused some systems to silently drop the probe segment if the caused some systems to silently drop the probe segment if the
corresponding port was open (i.e., there was a TCP in the LISTEN corresponding port was open (i.e., there was a TCP in the LISTEN
state), and respond with an RST only if the port was closed. state), and respond with an RST only if the port was closed.
Therefore, an RST would indicate that the scanned port is closed, Therefore, an RST would indicate that the scanned port is closed,
while the absence of a response from the target system would while the absence of a response from the target system would
indicate that the scanned port is open. indicate that the scanned port is open.
skipping to change at page 99, line 18 skipping to change at page 70, line 33
implement this policy. implement this policy.
14.6. ACK scan 14.6. ACK scan
The so-called "ACK scan" is not really a port-scanning technique The so-called "ACK scan" is not really a port-scanning technique
(i.e., it does not aim at determining whether a specific port is open (i.e., it does not aim at determining whether a specific port is open
or closed), but rather aims at determining whether some intermediate or closed), but rather aims at determining whether some intermediate
system is filtering TCP segments sent to that specific port number. system is filtering TCP segments sent to that specific port number.
The probe packet is a TCP segment with the ACK bit set which, The probe packet is a TCP segment with the ACK bit set which,
according to RFC 793 [Postel, 1981c] should elicit an RST from the according to RFC 793 [RFC0793] should elicit an RST from the target
target system regardless of whether the corresponding TCP port is system regardless of whether the corresponding TCP port is open or
open or closed. If no response is received from the target system, closed. If no response is received from the target system, it is
it is assumed that some intermediate system is filtering the probe assumed that some intermediate system is filtering the probe packets
packets sent to the target system. sent to the target system.
It should be noted that this "port scanning" technique exploits It should be noted that this "port scanning" technique exploits
basic TCP processing rules, and therefore cannot be defeated at an basic TCP processing rules, and therefore cannot be defeated at an
end-system. end-system.
15. Processing of ICMP error messages by TCP 15. Processing of ICMP error messages by TCP
TCP SHOULD silently ignore received ICMP Source Quench messages. [RFC5927] analyzes a number of vulnerabilities based on crafted ICMP
messages, along with possible counter-measures.
TCP SHOULD process ICMP "hard errors" as "soft errors" when they are
received for connections that are in any of the synchronized states.
TCP SHOULD process ICMP "fragmentation needed and DF bit set" and
ICMPv6 "Packet Too Big" error messages as described in [RFC5927].
DISCUSSION:
[RFC5927] analyzes a number of vulnerabilities based on crafted
ICMP messages, along with possible counter-measures.
16. TCP interaction with the Internet Protocol (IP) 16. TCP interaction with the Internet Protocol (IP)
16.1. TCP-based traceroute 16.1. TCP-based traceroute
The traceroute tool is used to identify the intermediate systems between the The traceroute tool is used to identify the intermediate systems between the
local system and the destination system. It is usually implemented local system and the destination system. It is usually implemented
by sending "probe" packets with increasing IP Time to Live values by sending "probe" packets with increasing IP Time to Live values
(starting from 0), without maintaining any state with the final (starting from 0), without maintaining any state with the final
destination. destination.
Some traceroute implementations use ICMP "echo request" messages as Some traceroute implementations use ICMP "echo request" messages as
the probe packets, while others use UDP packets or TCP SYN segments. the probe packets, while others use UDP packets or TCP SYN segments.
skipping to change at page 102, line 28 skipping to change at page 73, line 36
This document provides a thorough security assessment of the This document provides a thorough security assessment of the
Transmission Control Protocol (TCP), identifies a number of Transmission Control Protocol (TCP), identifies a number of
vulnerabilities, and specifies possible counter-measures. vulnerabilities, and specifies possible counter-measures.
Additionally, it provides implementation guidance such that the Additionally, it provides implementation guidance such that the
resilience of TCP implementations is improved. resilience of TCP implementations is improved.
18. Acknowledgements 18. Acknowledgements
The author would like to thank (in alphabetical order) David Borman, The author would like to thank (in alphabetical order) David Borman,
Wesley Eddy, and Alfred Hoenes, for providing valuable feedback on Wesley Eddy, Alfred Hoenes, and Michael Scharf, for providing
earlier versions of this document. valuable feedback on earlier versions of this document.
This document is heavily based on the document "Security Assessment This document is heavily based on the document "Security Assessment
of the Transmission Control Protocol (TCP)" [CPNI, 2009] written by of the Transmission Control Protocol (TCP)" [CPNI, 2009] written by
Fernando Gont on behalf of CPNI (Centre for the Protection of Fernando Gont on behalf of CPNI (Centre for the Protection of
National Infrastructure). National Infrastructure).
The author would like to thank (in alphabetical order) Randall The author would like to thank (in alphabetical order) Randall
Atkinson, Guillermo Gont, Alfred Hoenes, Jamshid Mahdavi, Stanislav Atkinson, Guillermo Gont, Alfred Hoenes, Jamshid Mahdavi, Stanislav
Shalunov, Michael Welzl, Dan Wing, Andrew Yourtchenko, Michal Shalunov, Michael Welzl, Dan Wing, Andrew Yourtchenko, Michal
Zalewski, and Christos Zoulas, for providing valuable feedback on Zalewski, and Christos Zoulas, for providing valuable feedback on
skipping to change at page 103, line 6 skipping to change at page 74, line 13
Additionally, the author would like to thank (in alphabetical order) Additionally, the author would like to thank (in alphabetical order)
Mark Allman, David Black, Ethan Blanton, David Borman, James Chacon, Mark Allman, David Black, Ethan Blanton, David Borman, James Chacon,
John Heffner, Jerrold Leichter, Jamshid Mahdavi, Keith Scott, Bill John Heffner, Jerrold Leichter, Jamshid Mahdavi, Keith Scott, Bill
Squier, and David White, who generously answered a number of Squier, and David White, who generously answered a number of
questions that arose while the aforementioned document was being questions that arose while the aforementioned document was being
written. written.
Finally, the author would like to thank CPNI (formerly NISCC) for Finally, the author would like to thank CPNI (formerly NISCC) for
their continued support. their continued support.
19. References 19. References (to be translated to xml)
Abley, J., Savola, P., Neville-Neil, G. 2007. Deprecation of Type 0 Abley, J., Savola, P., Neville-Neil, G. 2007. Deprecation of Type 0
Routing Headers in IPv6. RFC 5095. Routing Headers in IPv6. RFC 5095.
Allman, M. 2003. TCP Congestion Control with Appropriate Byte Allman, M. 2003. TCP Congestion Control with Appropriate Byte
Counting (ABC). RFC 3465. Counting (ABC). RFC 3465.
Allman, M. 2008. Comments On Selecting Ephemeral Ports. Available Allman, M. 2008. Comments On Selecting Ephemeral Ports. Available
at: http://www.icir.org/mallman/share/ports-dec08.pdf at: http://www.icir.org/mallman/share/ports-dec08.pdf
skipping to change at page 108, line 13 skipping to change at page 79, line 22
Protocol. RFC 4301. Protocol. RFC 4301.
Klensin, J. 2008. Simple Mail Transfer Protocol. RFC 5321. Klensin, J. 2008. Simple Mail Transfer Protocol. RFC 5321.
Ko, Y., Ko, S., and Ko, M. 2001. NIDS Evasion Method named SeolMa. Ko, Y., Ko, S., and Ko, M. 2001. NIDS Evasion Method named SeolMa.
Phrack Magazine, Volume 0x0b, Issue 0x39, phile #0x03 of 0x12. Phrack Magazine, Volume 0x0b, Issue 0x39, phile #0x03 of 0x12.
Available at: http://www.phrack.org/issues.html?issue=57&id=3#article Available at: http://www.phrack.org/issues.html?issue=57&id=3#article
Lahey, K. 2000. TCP Problems with Path MTU Discovery. RFC 2923. Lahey, K. 2000. TCP Problems with Path MTU Discovery. RFC 2923.
Larsen, M., Gont, F. 2008. Port Randomization. IETF Internet-Draft
(draft-ietf-tsvwg-port-randomization-02), work in progress.
Lemon, 2002. Resisting SYN flood DoS attacks with a SYN cache. Lemon, 2002. Resisting SYN flood DoS attacks with a SYN cache.
Proceedings of the BSDCon 2002 Conference, pp 89-98. Proceedings of the BSDCon 2002 Conference, pp 89-98.
Maimon, U. 1996. Port Scanning without the SYN flag. Phrack Maimon, U. 1996. Port Scanning without the SYN flag. Phrack
Magazine, Volume Seven, Issue Fourty-Nine, phile #0x0f of 0x10. Magazine, Volume Seven, Issue Fourty-Nine, phile #0x0f of 0x10.
Available at: Available at:
http://www.phrack.org/issues.html?issue=49&id=15#article http://www.phrack.org/issues.html?issue=49&id=15#article
Mathis, M., Mahdavi, J., Floyd, S. Romanow, A. 1996. TCP Selective Mathis, M., Mahdavi, J., Floyd, S. Romanow, A. 1996. TCP Selective
Acknowledgment Options. RFC 2018. Acknowledgment Options. RFC 2018.
skipping to change at page 113, line 9 skipping to change at page 84, line 10
IFIP Communications and Multimedia Security Conference (CMS 2002). IFIP Communications and Multimedia Security Conference (CMS 2002).
Available at: http://www.ieeta.pt/~avz/pubs/CMS02.html Available at: http://www.ieeta.pt/~avz/pubs/CMS02.html
Zweig, J., Partridge, C. 1990. TCP Alternate Checksum Options. RFC Zweig, J., Partridge, C. 1990. TCP Alternate Checksum Options. RFC
1146. 1146.
20. References 20. References
20.1. Normative References 20.1. Normative References
[I-D.ietf-tcpm-tcp-timestamps] [RFC0793] Postel, J., "Transmission Control Protocol", STD 7,
Gont, F., "Reducing the TIME-WAIT state using TCP RFC 793, September 1981.
timestamps", draft-ietf-tcpm-tcp-timestamps-03 (work in
progress), December 2010.
[I-D.ietf-tsvwg-port-randomization] [RFC1122] Braden, R., "Requirements for Internet Hosts -
Larsen, M. and F. Gont, "Transport Protocol Port Communication Layers", STD 3, RFC 1122, October 1989.
Randomization Recommendations",
draft-ietf-tsvwg-port-randomization-09 (work in progress), [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
of Explicit Congestion Notification (ECN) to IP",
RFC 3168, September 2001.
[RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
Control", RFC 5681, September 2009.
[RFC5961] Ramaiah, A., Stewart, R., and M. Dalal, "Improving TCP's
Robustness to Blind In-Window Attacks", RFC 5961,
August 2010. August 2010.
[RFC6056] Larsen, M. and F. Gont, "Recommendations for Transport-
Protocol Port Randomization", BCP 156, RFC 6056,
January 2011.
[RFC6093] Gont, F. and A. Yourtchenko, "On the Implementation of the [RFC6093] Gont, F. and A. Yourtchenko, "On the Implementation of the
TCP Urgent Mechanism", RFC 6093, January 2011. TCP Urgent Mechanism", RFC 6093, January 2011.
[RFC6191] Gont, F., "Reducing the TIME-WAIT State Using TCP
Timestamps", BCP 159, RFC 6191, April 2011.
[RFC6528] Gont, F. and S. Bellovin, "Defending against Sequence
Number Attacks", RFC 6528, February 2012.
20.2. Informative References 20.2. Informative References
[I-D.gont-timestamps-generation] [I-D.gont-timestamps-generation]
Gont, F. and A. Oppermann, "On the generation of TCP Gont, F. and A. Oppermann, "On the generation of TCP
timestamps", draft-gont-timestamps-generation-00 (work in timestamps", draft-gont-timestamps-generation-00 (work in
progress), June 2010. progress), June 2010.
[I-D.ietf-tcpm-3517bis]
Blanton, E., Jarvinen, I., Wang, L., Allman, M., Kojo, M.,
and Y. Nishida, "A Conservative Selective Acknowledgment
(SACK)-based Loss Recovery Algorithm for TCP",
draft-ietf-tcpm-3517bis-01 (work in progress),
January 2012.
[Morris1985]
Morris, R., "A Weakness in the 4.2BSD UNIX TCP/IP
Software", CSTR 117, AT&T Bell Laboratories, Murray Hill,
NJ, 1985.
[RFC1025] Postel, J., "TCP and IP bake off", RFC 1025,
September 1987.
[RFC1379] Braden, B., "Extending TCP for Transactions -- Concepts",
RFC 1379, November 1992.
[RFC5927] Gont, F., "ICMP Attacks against TCP", RFC 5927, July 2010. [RFC5927] Gont, F., "ICMP Attacks against TCP", RFC 5927, July 2010.
[RFC6429] Bashyam, M., Jethanandani, M., and A. Ramaiah, "TCP Sender
Clarification for Persist Condition", RFC 6429,
December 2011.
[Shimomura1995]
Shimomura, T., "Technical details of the attack described
by Markoff in NYT",
http://www.gont.com.ar/docs/post-shimomura-usenet.txt,
Message posted in USENET's comp.security.misc newsgroup,
Message-ID: <3g5gkl$5j1@ariel.sdsc.edu>, 1995.
Appendix A. TODO list Appendix A. TODO list
A number of formatting issues still have to be fixed in this A number of formatting issues still have to be fixed in this
document. Among others are: document. Among others are:
o The ASCII-art corresponding to some figures is still missing. We o The ASCII-art corresponding to some figures is still missing. We
still have to convert the nice JPGs of the UK CPNI document into still have to convert the nice JPGs of the UK CPNI document into
ugly ASCII-art. ugly ASCII-art.
o The references have not yet been converted to xml, but are o The references have not yet been converted to xml, but are
hardcoded, instead. That's why they may not look as expected hardcoded, instead. That's why they may not look as expected
Appendix B. Change log (to be removed by the RFC Editor before Appendix B. Change log (to be removed by the RFC Editor before
publication of this document as an RFC) publication of this document as an RFC)
B.1. Changes from draft-ietf-tcpm-tcp-security-01 B.1. Changes from draft-ietf-tcpm-tcp-security-02
o Lots of text has been removed from the document.
o The document track has been changed from BCP to Informational
(RFC2119-language recommendations have been removed).
o Where necessary, stand-alone standards-track documents have been
produced.
B.2. Changes from draft-ietf-tcpm-tcp-security-01
A number of formatting issues still have to be fixed in this A number of formatting issues still have to be fixed in this
document. Among others are: document. Among others are:
o The whole document was reformatted with RFC 1122 style. o The whole document was reformatted with RFC 1122 style.
Author's Address Author's Address
Fernando Gont Fernando Gont
UK Centre for the Protection of National Infrastructure UK Centre for the Protection of National Infrastructure