draft-ietf-tcpm-tcp-security-02.txt | draft-ietf-tcpm-tcp-security-03.txt | |||
---|---|---|---|---|
TCP Maintenance and Minor F. Gont | TCP Maintenance and Minor Extensions F. Gont | |||
Extensions (tcpm) UK CPNI | (tcpm) UK CPNI | |||
Internet-Draft January 21, 2011 | Internet-Draft March 13, 2012 | |||
Intended status: BCP | Intended status: Informational | |||
Expires: July 25, 2011 | Expires: September 14, 2012 | |||
Security Assessment of the Transmission Control Protocol (TCP) | Survey of Security Hardening Methods for Transmission Control Protocol | |||
draft-ietf-tcpm-tcp-security-02.txt | (TCP) Implementations | |||
draft-ietf-tcpm-tcp-security-03.txt | ||||
Abstract | Abstract | |||
This document contains a security assessment of the specifications of | This document surveys methods to harden Transmission Control Protocol | |||
the Transmission Control Protocol (TCP), and of a number of | (TCP) implementations. It provides an overview of known attacks and | |||
mechanisms and policies in use by popular TCP implementations. | refers to the corresponding solutions in the TCP standards. | |||
Additionally, it contains best current practices for hardening a TCP | ||||
implementation. | ||||
Status of this Memo | Status of this Memo | |||
This Internet-Draft is submitted to IETF in full conformance with the | This Internet-Draft is submitted to IETF in full conformance with the | |||
provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on July 25, 2011. | This Internet-Draft will expire on September 14, 2012. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2011 IETF Trust and the persons identified as the | Copyright (c) 2012 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
described in the Simplified BSD License. | described in the Simplified BSD License. | |||
Table of Contents | Table of Contents | |||
1. Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 | 1. Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
1.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 5 | 1.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
1.2. Scope of this document . . . . . . . . . . . . . . . . . 6 | 1.2. Scope of this document . . . . . . . . . . . . . . . . . . 6 | |||
1.3. Organization of this document . . . . . . . . . . . . . . 8 | 1.3. Organization of this document . . . . . . . . . . . . . . 7 | |||
2. The Transmission Control Protocol . . . . . . . . . . . . . . 8 | 2. The Transmission Control Protocol . . . . . . . . . . . . . . 7 | |||
3. TCP header fields . . . . . . . . . . . . . . . . . . . . . . 9 | 3. TCP header fields . . . . . . . . . . . . . . . . . . . . . . 8 | |||
3.1. Source Port and Destination Port . . . . . . . . . . . . 10 | 3.1. Source Port and Destination Port . . . . . . . . . . . . . 8 | |||
3.2. Sequence number . . . . . . . . . . . . . . . . . . . . . 12 | 3.2. Sequence number . . . . . . . . . . . . . . . . . . . . . 9 | |||
3.3. Acknowledgement Number . . . . . . . . . . . . . . . . . 14 | 3.3. Acknowledgement Number . . . . . . . . . . . . . . . . . . 10 | |||
3.4. Data Offset . . . . . . . . . . . . . . . . . . . . . . . 15 | 3.4. Data Offset . . . . . . . . . . . . . . . . . . . . . . . 10 | |||
3.5. Control bits . . . . . . . . . . . . . . . . . . . . . . 15 | 3.5. Control bits . . . . . . . . . . . . . . . . . . . . . . . 10 | |||
3.5.1. Reserved (four bits) . . . . . . . . . . . . . . . . 15 | 3.5.1. Reserved (four bits) . . . . . . . . . . . . . . . . . 10 | |||
3.5.2. CWR (Congestion Window Reduced) . . . . . . . . . . . 16 | 3.5.2. CWR (Congestion Window Reduced) . . . . . . . . . . . 11 | |||
3.5.3. ECE (ECN-Echo) . . . . . . . . . . . . . . . . . . . 16 | 3.5.3. ECE (ECN-Echo) . . . . . . . . . . . . . . . . . . . . 11 | |||
3.5.4. URG . . . . . . . . . . . . . . . . . . . . . . . . . 17 | 3.5.4. URG . . . . . . . . . . . . . . . . . . . . . . . . . 11 | |||
3.5.5. ACK . . . . . . . . . . . . . . . . . . . . . . . . . 17 | 3.5.5. ACK . . . . . . . . . . . . . . . . . . . . . . . . . 12 | |||
3.5.6. PSH . . . . . . . . . . . . . . . . . . . . . . . . . 17 | 3.5.6. PSH . . . . . . . . . . . . . . . . . . . . . . . . . 12 | |||
3.5.7. RST . . . . . . . . . . . . . . . . . . . . . . . . . 19 | 3.5.7. RST . . . . . . . . . . . . . . . . . . . . . . . . . 12 | |||
3.5.8. SYN . . . . . . . . . . . . . . . . . . . . . . . . . 19 | 3.5.8. SYN . . . . . . . . . . . . . . . . . . . . . . . . . 12 | |||
3.5.9. FIN . . . . . . . . . . . . . . . . . . . . . . . . . 20 | 3.5.9. FIN . . . . . . . . . . . . . . . . . . . . . . . . . 12 | |||
3.6. Window . . . . . . . . . . . . . . . . . . . . . . . . . 20 | 3.6. Window . . . . . . . . . . . . . . . . . . . . . . . . . . 13 | |||
3.7. Checksum . . . . . . . . . . . . . . . . . . . . . . . . 22 | 3.6.1. Security implications arising from closed windows . . 14 | |||
3.8. Urgent pointer . . . . . . . . . . . . . . . . . . . . . 23 | 3.7. Checksum . . . . . . . . . . . . . . . . . . . . . . . . . 14 | |||
3.9. Options . . . . . . . . . . . . . . . . . . . . . . . . . 24 | 3.8. Urgent pointer . . . . . . . . . . . . . . . . . . . . . . 16 | |||
3.10. Padding . . . . . . . . . . . . . . . . . . . . . . . . . 28 | 3.9. Options . . . . . . . . . . . . . . . . . . . . . . . . . 16 | |||
3.11. Data . . . . . . . . . . . . . . . . . . . . . . . . . . 28 | 3.10. Padding . . . . . . . . . . . . . . . . . . . . . . . . . 19 | |||
4. Common TCP Options . . . . . . . . . . . . . . . . . . . . . 29 | 3.11. Data . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 | |||
4.1. End of Option List (Kind = 0) . . . . . . . . . . . . . . 29 | 4. Common TCP Options . . . . . . . . . . . . . . . . . . . . . . 19 | |||
4.2. No Operation (Kind = 1) . . . . . . . . . . . . . . . . . 29 | 4.1. End of Option List (Kind = 0) . . . . . . . . . . . . . . 19 | |||
4.3. Maximum Segment Size (Kind = 2) . . . . . . . . . . . . . 29 | 4.2. No Operation (Kind = 1) . . . . . . . . . . . . . . . . . 19 | |||
4.4. Selective Acknowledgement Option . . . . . . . . . . . . 32 | 4.3. Maximum Segment Size (Kind = 2) . . . . . . . . . . . . . 19 | |||
4.4.1. SACK-permitted Option (Kind = 4) . . . . . . . . . . 32 | 4.4. Selective Acknowledgement Option . . . . . . . . . . . . . 20 | |||
4.4.2. SACK Option (Kind = 5) . . . . . . . . . . . . . . . 33 | 4.4.1. SACK-permitted Option (Kind = 4) . . . . . . . . . . . 20 | |||
4.5. MD5 Option (Kind=19) . . . . . . . . . . . . . . . . . . 35 | 4.4.2. SACK Option (Kind = 5) . . . . . . . . . . . . . . . . 20 | |||
4.6. Window scale option (Kind = 3) . . . . . . . . . . . . . 36 | 4.5. MD5 Option (Kind=19) . . . . . . . . . . . . . . . . . . . 21 | |||
4.7. Timestamps option (Kind = 8) . . . . . . . . . . . . . . 37 | 4.6. Window scale option (Kind = 3) . . . . . . . . . . . . . . 21 | |||
4.7.1. Generation of timestamps . . . . . . . . . . . . . . 37 | 4.7. Timestamps option (Kind = 8) . . . . . . . . . . . . . . . 22 | |||
4.7.2. Vulnerabilities . . . . . . . . . . . . . . . . . . . 38 | 4.7.1. Generation of timestamps . . . . . . . . . . . . . . . 22 | |||
5. Connection-establishment mechanism . . . . . . . . . . . . . 39 | 4.7.2. Vulnerabilities . . . . . . . . . . . . . . . . . . . 22 | |||
5.1. SYN flood . . . . . . . . . . . . . . . . . . . . . . . . 40 | 5. Connection-establishment mechanism . . . . . . . . . . . . . . 24 | |||
5.2. Connection forgery . . . . . . . . . . . . . . . . . . . 44 | 5.1. SYN flood . . . . . . . . . . . . . . . . . . . . . . . . 24 | |||
5.3. Connection-flooding attack . . . . . . . . . . . . . . . 45 | 5.2. Connection forgery . . . . . . . . . . . . . . . . . . . . 28 | |||
5.3.1. Vulnerability . . . . . . . . . . . . . . . . . . . . 45 | 5.3. Connection-flooding attack . . . . . . . . . . . . . . . . 29 | |||
5.3.2. Countermeasures . . . . . . . . . . . . . . . . . . . 46 | 5.3.1. Vulnerability . . . . . . . . . . . . . . . . . . . . 29 | |||
5.4. Firewall-bypassing techniques . . . . . . . . . . . . . . 48 | 5.3.2. Countermeasures . . . . . . . . . . . . . . . . . . . 30 | |||
6. Connection-termination mechanism . . . . . . . . . . . . . . 49 | 5.4. Firewall-bypassing techniques . . . . . . . . . . . . . . 32 | |||
6.1. FIN-WAIT-2 flooding attack . . . . . . . . . . . . . . . 49 | ||||
6.1.1. Vulnerability . . . . . . . . . . . . . . . . . . . . 49 | 6. Connection-termination mechanism . . . . . . . . . . . . . . . 32 | |||
6.1.2. Countermeasures . . . . . . . . . . . . . . . . . . . 50 | 6.1. FIN-WAIT-2 flooding attack . . . . . . . . . . . . . . . . 32 | |||
7. Buffer management . . . . . . . . . . . . . . . . . . . . . . 52 | 6.1.1. Vulnerability . . . . . . . . . . . . . . . . . . . . 32 | |||
7.1. TCP retransmission buffer . . . . . . . . . . . . . . . . 52 | 6.1.2. Countermeasures . . . . . . . . . . . . . . . . . . . 33 | |||
7.1.1. Vulnerability . . . . . . . . . . . . . . . . . . . . 52 | 7. Buffer management . . . . . . . . . . . . . . . . . . . . . . 35 | |||
7.1.2. Countermeasures . . . . . . . . . . . . . . . . . . . 53 | 7.1. TCP retransmission buffer . . . . . . . . . . . . . . . . 36 | |||
7.2. TCP segment reassembly buffer . . . . . . . . . . . . . . 56 | 7.1.1. Vulnerability . . . . . . . . . . . . . . . . . . . . 36 | |||
7.3. Automatic buffer tuning mechanisms . . . . . . . . . . . 59 | 7.1.2. Countermeasures . . . . . . . . . . . . . . . . . . . 37 | |||
7.3.1. Automatic send-buffer tuning mechanisms . . . . . . . 59 | 7.2. TCP segment reassembly buffer . . . . . . . . . . . . . . 40 | |||
7.3.2. Automatic receive-buffer tuning mechanism . . . . . . 61 | 7.3. Automatic buffer tuning mechanisms . . . . . . . . . . . . 42 | |||
8. TCP segment reassembly algorithm . . . . . . . . . . . . . . 63 | 7.3.1. Automatic send-buffer tuning mechanisms . . . . . . . 43 | |||
7.3.2. Automatic receive-buffer tuning mechanism . . . . . . 45 | ||||
8. TCP segment reassembly algorithm . . . . . . . . . . . . . . . 47 | ||||
8.1. Problems that arise from ambiguity in the reassembly | 8.1. Problems that arise from ambiguity in the reassembly | |||
process . . . . . . . . . . . . . . . . . . . . . . . . . 63 | process . . . . . . . . . . . . . . . . . . . . . . . . . 47 | |||
9. TCP Congestion Control . . . . . . . . . . . . . . . . . . . 64 | 9. TCP Congestion Control . . . . . . . . . . . . . . . . . . . . 48 | |||
9.1. Congestion control with misbehaving receivers . . . . . . 66 | 9.1. Congestion control with misbehaving receivers . . . . . . 48 | |||
9.1.1. ACK division . . . . . . . . . . . . . . . . . . . . 66 | 9.1.1. ACK division . . . . . . . . . . . . . . . . . . . . . 48 | |||
9.1.2. DupACK forgery . . . . . . . . . . . . . . . . . . . 66 | 9.1.2. DupACK forgery . . . . . . . . . . . . . . . . . . . . 49 | |||
9.1.3. Optimistic ACKing . . . . . . . . . . . . . . . . . . 67 | 9.1.3. Optimistic ACKing . . . . . . . . . . . . . . . . . . 49 | |||
9.2. Blind DupACK triggering attacks against TCP . . . . . . . 68 | 9.2. Blind DupACK triggering attacks against TCP . . . . . . . 50 | |||
9.2.1. Blind throughput-reduction attack . . . . . . . . . . 70 | 9.2.1. Blind throughput-reduction attack . . . . . . . . . . 52 | |||
9.2.2. Blind flooding attack . . . . . . . . . . . . . . . . 70 | 9.2.2. Blind flooding attack . . . . . . . . . . . . . . . . 53 | |||
9.2.3. Difficulty in performing the attacks . . . . . . . . 71 | 9.2.3. Difficulty in performing the attacks . . . . . . . . . 53 | |||
9.2.4. Modifications to TCP's loss recovery algorithms . . . 72 | 9.2.4. Modifications to TCP's loss recovery algorithms . . . 54 | |||
9.2.5. Countermeasures . . . . . . . . . . . . . . . . . . . 74 | 9.2.5. Countermeasures . . . . . . . . . . . . . . . . . . . 55 | |||
9.3. TCP Explicit Congestion Notification (ECN) . . . . . . . 79 | 9.3. TCP Explicit Congestion Notification (ECN) . . . . . . . . 55 | |||
9.3.1. Possible attacks by a compromised router . . . . . . 79 | 10. TCP API . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 | |||
9.3.2. Possible attacks by a malicious TCP endpoint . . . . 80 | 10.1. Passive opens and binding sockets . . . . . . . . . . . . 56 | |||
10. TCP API . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 | 10.2. Active opens and binding sockets . . . . . . . . . . . . . 57 | |||
10.1. Passive opens and binding sockets . . . . . . . . . . . . 81 | 11. Blind in-window attacks . . . . . . . . . . . . . . . . . . . 59 | |||
10.2. Active opens and binding sockets . . . . . . . . . . . . 82 | 11.1. Blind TCP-based connection-reset attacks . . . . . . . . . 59 | |||
11. Blind in-window attacks . . . . . . . . . . . . . . . . . . . 84 | 11.1.1. RST flag . . . . . . . . . . . . . . . . . . . . . . . 60 | |||
11.1. Blind TCP-based connection-reset attacks . . . . . . . . 84 | 11.1.2. SYN flag . . . . . . . . . . . . . . . . . . . . . . . 60 | |||
11.1.1. RST flag . . . . . . . . . . . . . . . . . . . . . . 85 | 11.1.3. Security/Compartment . . . . . . . . . . . . . . . . . 60 | |||
11.1.2. SYN flag . . . . . . . . . . . . . . . . . . . . . . 86 | 11.1.4. Precedence . . . . . . . . . . . . . . . . . . . . . . 61 | |||
11.1.3. Security/Compartment . . . . . . . . . . . . . . . . 88 | 11.1.5. Illegal options . . . . . . . . . . . . . . . . . . . 61 | |||
11.1.4. Precedence . . . . . . . . . . . . . . . . . . . . . 89 | 11.2. Blind data-injection attacks . . . . . . . . . . . . . . . 61 | |||
11.1.5. Illegal options . . . . . . . . . . . . . . . . . . . 90 | 12. Information leaking . . . . . . . . . . . . . . . . . . . . . 62 | |||
11.2. Blind data-injection attacks . . . . . . . . . . . . . . 90 | ||||
12. Information leaking . . . . . . . . . . . . . . . . . . . . . 91 | ||||
12.1. Remote Operating System detection via TCP/IP stack | 12.1. Remote Operating System detection via TCP/IP stack | |||
fingerprinting . . . . . . . . . . . . . . . . . . . . . 91 | fingerprinting . . . . . . . . . . . . . . . . . . . . . . 62 | |||
12.1.1. FIN probe . . . . . . . . . . . . . . . . . . . . . . 91 | 12.1.1. FIN probe . . . . . . . . . . . . . . . . . . . . . . 63 | |||
12.1.2. Bogus flag test . . . . . . . . . . . . . . . . . . . 92 | 12.1.2. Bogus flag test . . . . . . . . . . . . . . . . . . . 63 | |||
12.1.3. TCP ISN sampling . . . . . . . . . . . . . . . . . . 92 | 12.1.3. TCP ISN sampling . . . . . . . . . . . . . . . . . . . 63 | |||
12.1.4. TCP initial window . . . . . . . . . . . . . . . . . 92 | 12.1.4. TCP initial window . . . . . . . . . . . . . . . . . . 63 | |||
12.1.5. RST sampling . . . . . . . . . . . . . . . . . . . . 93 | 12.1.5. RST sampling . . . . . . . . . . . . . . . . . . . . . 64 | |||
12.1.6. TCP options . . . . . . . . . . . . . . . . . . . . . 94 | 12.1.6. TCP options . . . . . . . . . . . . . . . . . . . . . 65 | |||
12.1.7. Retransmission Timeout (RTO) sampling . . . . . . . . 94 | 12.1.7. Retransmission Timeout (RTO) sampling . . . . . . . . 65 | |||
12.2. System uptime detection . . . . . . . . . . . . . . . . . 94 | ||||
13. Covert channels . . . . . . . . . . . . . . . . . . . . . . . 95 | 12.2. System uptime detection . . . . . . . . . . . . . . . . . 66 | |||
14. TCP Port scanning . . . . . . . . . . . . . . . . . . . . . . 95 | 13. Covert channels . . . . . . . . . . . . . . . . . . . . . . . 66 | |||
14.1. Traditional connect() scan . . . . . . . . . . . . . . . 96 | 14. TCP Port scanning . . . . . . . . . . . . . . . . . . . . . . 66 | |||
14.2. SYN scan . . . . . . . . . . . . . . . . . . . . . . . . 96 | 14.1. Traditional connect() scan . . . . . . . . . . . . . . . . 67 | |||
14.3. FIN, NULL, and XMAS scans . . . . . . . . . . . . . . . . 96 | 14.2. SYN scan . . . . . . . . . . . . . . . . . . . . . . . . . 67 | |||
14.4. Maimon scan . . . . . . . . . . . . . . . . . . . . . . . 98 | 14.3. FIN, NULL, and XMAS scans . . . . . . . . . . . . . . . . 68 | |||
14.5. Window scan . . . . . . . . . . . . . . . . . . . . . . . 98 | 14.4. Maimon scan . . . . . . . . . . . . . . . . . . . . . . . 69 | |||
14.6. ACK scan . . . . . . . . . . . . . . . . . . . . . . . . 99 | 14.5. Window scan . . . . . . . . . . . . . . . . . . . . . . . 69 | |||
15. Processing of ICMP error messages by TCP . . . . . . . . . . 99 | 14.6. ACK scan . . . . . . . . . . . . . . . . . . . . . . . . . 70 | |||
16. TCP interaction with the Internet Protocol (IP) . . . . . . . 99 | 15. Processing of ICMP error messages by TCP . . . . . . . . . . . 70 | |||
16.1. TCP-based traceroute . . . . . . . . . . . . . . . . . . 99 | 16. TCP interaction with the Internet Protocol (IP) . . . . . . . 70 | |||
16.2. Blind TCP data injection through fragmented IP traffic . 100 | 16.1. TCP-based traceroute . . . . . . . . . . . . . . . . . . . 71 | |||
16.3. Broadcast and multicast IP addresses . . . . . . . . . . 102 | 16.2. Blind TCP data injection through fragmented IP traffic . . 71 | |||
17. Security Considerations . . . . . . . . . . . . . . . . . . . 102 | 16.3. Broadcast and multicast IP addresses . . . . . . . . . . . 73 | |||
18. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 102 | 17. Security Considerations . . . . . . . . . . . . . . . . . . . 73 | |||
19. References . . . . . . . . . . . . . . . . . . . . . . . . . 103 | 18. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 73 | |||
20. References . . . . . . . . . . . . . . . . . . . . . . . . . 113 | 19. References (to be translated to xml) . . . . . . . . . . . . . 74 | |||
20.1. Normative References . . . . . . . . . . . . . . . . . . 113 | 20. References . . . . . . . . . . . . . . . . . . . . . . . . . . 84 | |||
20.2. Informative References . . . . . . . . . . . . . . . . . 113 | 20.1. Normative References . . . . . . . . . . . . . . . . . . . 84 | |||
Appendix A. TODO list . . . . . . . . . . . . . . . . . . . . . 113 | 20.2. Informative References . . . . . . . . . . . . . . . . . . 84 | |||
Appendix A. TODO list . . . . . . . . . . . . . . . . . . . . . . 85 | ||||
Appendix B. Change log (to be removed by the RFC Editor | Appendix B. Change log (to be removed by the RFC Editor | |||
before publication of this document as an RFC) . . . 113 | before publication of this document as an RFC) . . . 85 | |||
B.1. Changes from draft-ietf-tcpm-tcp-security-01 . . . . . . 113 | B.1. Changes from draft-ietf-tcpm-tcp-security-02 . . . . . . . 85 | |||
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 114 | B.2. Changes from draft-ietf-tcpm-tcp-security-01 . . . . . . . 86 | |||
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 86 | ||||
1. Preface | 1. Preface | |||
1.1. Introduction | 1.1. Introduction | |||
The TCP/IP protocol suite was conceived in an environment that was | The TCP/IP protocol suite was conceived in an environment that was | |||
quite different from the hostile environment they currently operate | quite different from the hostile environment they currently operate | |||
in. However, the effectiveness of the protocols led to their early | in. However, the effectiveness of the protocols led to their early | |||
adoption in production environments, to the point that, to some | adoption in production environments, to the point that, to some | |||
extent, the current world's economy depends on them. | extent, the current world's economy depends on them. | |||
skipping to change at page 6, line 11 | skipping to change at page 6, line 11 | |||
interoperability [Silbersack, 2005]. | interoperability [Silbersack, 2005]. | |||
Producing a secure TCP/IP implementation nowadays is a very difficult | Producing a secure TCP/IP implementation nowadays is a very difficult | |||
task, in part because of the lack of a single document that serves as | task, in part because of the lack of a single document that serves as | |||
a security roadmap for the protocols. Implementers are faced with | a security roadmap for the protocols. Implementers are faced with | |||
the hard task of identifying relevant documentation and | the hard task of identifying relevant documentation and | |||
differentiating between that which provides correct advice, and that | differentiating between that which provides correct advice, and that | |||
which provides misleading advice based on inaccurate or wrong | which provides misleading advice based on inaccurate or wrong | |||
assumptions. | assumptions. | |||
There is a clear need for a companion document to the IETF | ||||
specifications that discusses the security aspects and implications | ||||
of the protocols, identifies the existing vulnerabilities, discusses | ||||
the possible countermeasures, and analyzes their respective | ||||
effectiveness. | ||||
This document is the result of a security assessment of the IETF | This document is the result of a security assessment of the IETF | |||
specifications of the Transmission Control Protocol (TCP), from a | specifications of the Transmission Control Protocol (TCP), from a | |||
security point of view. Possible threats are identified and, where | security point of view. Possible threats are identified and, where | |||
possible, countermeasures are proposed. Additionally, many | possible, countermeasures are described. Additionally, many | |||
implementation flaws that have led to security vulnerabilities have | implementation flaws that have led to security vulnerabilities have | |||
been referenced in the hope that future implementations will not | been referenced in the hope that future implementations will not | |||
incur the same problems. | incur the same problems. | |||
This document does not aim to be the final word on the security | This document is based on the "Security Assessment of the | |||
aspects of TCP. On the contrary, it aims to raise awareness about a | ||||
number of TCP vulnerabilities that have been faced in the past, those | ||||
that are currently being faced, and some of those that we may still | ||||
have to deal with in the future. | ||||
Feedback from the community is more than encouraged to help this | ||||
document be as accurate as possible and to keep it updated as new | ||||
vulnerabilities are discovered. | ||||
This document is heavily based on the "Security Assessment of the | ||||
Transmission Control Protocol (TCP)" released by the UK Centre for | Transmission Control Protocol (TCP)" released by the UK Centre for | |||
the Protection of National Infrastructure (CPNI), available at: http: | the Protection of National Infrastructure (CPNI), available at: http: | |||
//www.cpni.gov.uk/Products/technicalnotes/ | //www.cpni.gov.uk/Products/technicalnotes/ | |||
Feb-09-security-assessment-TCP.aspx . | Feb-09-security-assessment-TCP.aspx . | |||
1.2. Scope of this document | 1.2. Scope of this document | |||
While there are a number of protocols that may affect the way TCP | While there are a number of protocols that may affect the way TCP | |||
operates, this document focuses only on the specifications of the | operates, this document focuses only on the specifications of the | |||
Transmission Control Protocol (TCP) itself. | Transmission Control Protocol (TCP) itself. | |||
The following IETF RFCs were selected for assessment as part of this | The mechanisms described in the following documents were selected for | |||
work: | assessment as part of this work: | |||
o RFC 793, "Transmission Control Protocol. DARPA Internet Program. | o RFC 793, "Transmission Control Protocol. DARPA Internet Program. | |||
Protocol Specification" (91 pages) | Protocol Specification" (91 pages) | |||
o RFC 1122, "Requirements for Internet Hosts -- Communication | o RFC 1122, "Requirements for Internet Hosts -- Communication | |||
Layers" (116 pages) | Layers" (116 pages) | |||
o RFC 1191, "Path MTU Discovery" (19 pages) | o RFC 1191, "Path MTU Discovery" (19 pages) | |||
o RFC 1323, "TCP Extensions for High Performance" (37 pages) | o RFC 1323, "TCP Extensions for High Performance" (37 pages) | |||
skipping to change at page 8, line 19 | skipping to change at page 7, line 46 | |||
their security implications, and discusses the possible | their security implications, and discusses the possible | |||
countermeasures. The second part contains an analysis of the | countermeasures. The second part contains an analysis of the | |||
security implications of the mechanisms and policies implemented by | security implications of the mechanisms and policies implemented by | |||
TCP, and of a number of implementation strategies in use by a number | TCP, and of a number of implementation strategies in use by a number | |||
of popular TCP implementations. | of popular TCP implementations. | |||
2. The Transmission Control Protocol | 2. The Transmission Control Protocol | |||
The Transmission Control Protocol (TCP) is a connection-oriented | The Transmission Control Protocol (TCP) is a connection-oriented | |||
transport protocol that provides a reliable byte-stream data transfer | transport protocol that provides a reliable byte-stream data transfer | |||
service. | service. Very few assumptions are made about the reliability of | |||
underlying data transfer services below the TCP layer. Basically, | ||||
Very few assumptions are made about the reliability of underlying | TCP assumes it can obtain a simple, potentially unreliable datagram | |||
data transfer services below the TCP layer. Basically, TCP assumes | service from the lower level protocols. | |||
it can obtain a simple, potentially unreliable datagram service from | ||||
the lower level protocols. Figure 1 illustrates where TCP fits in | ||||
the DARPA reference model. | ||||
+---------------+ | ||||
| Application | | ||||
+---------------+ | ||||
| TCP | | ||||
+---------------+ | ||||
| IP | | ||||
+---------------+ | ||||
| Network | | ||||
+---------------+ | ||||
Figure 1: TCP in the DARPA reference model | ||||
TCP provides facilities in the following areas: | ||||
o Basic Data Transfer | ||||
o Reliability | ||||
o Flow Control | ||||
o Multiplexing | ||||
o Connections | ||||
o Precedence and Security | ||||
o Congestion Control | ||||
The core TCP specification, RFC 793 [Postel, 1981c], dates back to | The core TCP specification, RFC 793 [RFC0793], dates back to 1981 and | |||
1981 and standardizes the basic mechanisms and policies of TCP. RFC | standardizes the basic mechanisms and policies of TCP. RFC 1122 | |||
1122 [Braden, 1989] provides clarifications and errata for the | [RFC1122] provides clarifications and errata for the original | |||
original specification. RFC 2581 [Allman et al, 1999] specifies TCP | specification. RFC 2581 [RFC5681] specifies TCP congestion control | |||
congestion control and avoidance mechanisms, not present in the | and avoidance mechanisms, not present in the original specification. | |||
original specification. Other documents specify extensions and | Other documents specify extensions and improvements for TCP. | |||
improvements for TCP. | ||||
The large number of documents that specify extensions, improvements, | The large number of documents that specify extensions, improvements, | |||
or modifications to existing TCP mechanisms has led the IETF to | or modifications to existing TCP mechanisms has led the IETF to | |||
publish a roadmap for TCP, RFC 4614 [Duke et al, 2006], that | publish a roadmap for TCP, RFC 4614 [Duke et al, 2006], that | |||
clarifies the relevance of each of those documents. | clarifies the relevance of each of those documents. | |||
3. TCP header fields | 3. TCP header fields | |||
RFC 793 [Postel, 1981c] defines the syntax of a TCP segment, along | RFC 793 [RFC0793] defines the syntax of a TCP segment, along with the | |||
with the semantics of each of the header fields. Figure 2 | semantics of each of the header fields. | |||
illustrates the syntax of a TCP segment. | ||||
0 1 2 3 | ||||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | ||||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||||
| Source Port | Destination Port | | ||||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||||
| Sequence Number | | ||||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||||
| Acknowledgment Number | | ||||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||||
| Data | |C|E|U|A|P|R|S|F| | | ||||
| Offset|Resrved|W|C|R|C|S|S|Y|I| Window | | ||||
| | |R|E|G|K|H|T|N|N| | | ||||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||||
| Checksum | Urgent Pointer | | ||||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||||
| Options | Padding | | ||||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||||
| data | | ||||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||||
Note that one tick mark represents one bit position | ||||
Figure 2: Transmission Control Protocol header format | ||||
The minimum TCP header size is 20 bytes, and corresponds to a TCP | The minimum TCP header size is 20 bytes, and corresponds to a TCP | |||
segment with no options and no data. However, a TCP module might be | segment with no options and no data. However, a TCP module might be | |||
handed an (illegitimate) "TCP segment" of less than 20 bytes. | handed an (illegitimate) "TCP segment" of less than 20 bytes. | |||
Therefore, before doing any processing of the TCP header fields, the | Therefore, before doing any processing of the TCP header fields, the | |||
following check should be performed by TCP on the segments handed by | following check should be performed by TCP on the segments handed by | |||
the internet layer: | the internet layer: | |||
Segment.Size >= 20 | Segment.Size >= 20 | |||
skipping to change at page 10, line 29 | skipping to change at page 8, line 44 | |||
3.1. Source Port and Destination Port | 3.1. Source Port and Destination Port | |||
The Source Port field contains a 16-bit number that identifies the | The Source Port field contains a 16-bit number that identifies the | |||
TCP end-point that originated this TCP segment. The TCP Destination | TCP end-point that originated this TCP segment. The TCP Destination | |||
Port contains a 16-bit number that identifies the destination TCP | Port contains a 16-bit number that identifies the destination TCP | |||
end-point of this segment. In most of the discussion we refer to | end-point of this segment. In most of the discussion we refer to | |||
client-side (or "ephemeral") port-numbers and server-side port | client-side (or "ephemeral") port-numbers and server-side port | |||
numbers, since that distinction is what usually affects the | numbers, since that distinction is what usually affects the | |||
interpretation of a port number. | interpretation of a port number. | |||
TCP SHOULD randomize its ephemeral (client-side) ports, to improve | Most active attacks against ongoing TCP connections require the | |||
its resistance to off-path attacks. For the purpose of ephemeral | attacker to guess or know the four-tuple that identifies the | |||
port selection, the largest possible port range SHOULD be used | connection. As a result, randomization of the TCP ephemeral ports | |||
(ideally 1024-65535) [I-D.ietf-tsvwg-port-randomization]. | provides a (partial) mitigation against off-path attacks. [RFC6056] | |||
provides guidance in this area. | ||||
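The following is a minimal sketch of randomized ephemeral port selection, assuming the 1024-65535 range mentioned in the text. It picks a random starting point and then searches linearly for a free port. The function name pick_ephemeral_port and the in_use set are hypothetical, introduced only for illustration; a real stack would check the full four-tuple rather than a set of local ports, and this is not claimed to be any specific algorithm from the referenced guidance.

```python
import secrets

EPHEMERAL_MIN = 1024   # lower bound of the range suggested in the text
EPHEMERAL_MAX = 65535  # upper bound of the 16-bit port space


def pick_ephemeral_port(in_use, low=EPHEMERAL_MIN, high=EPHEMERAL_MAX):
    """Return a port in [low, high] that is not in in_use.

    A random offset is chosen first; if that port is taken, the search
    continues linearly (wrapping around) until a free port is found.
    """
    count = high - low + 1
    start = secrets.randbelow(count)  # unpredictable starting offset
    for i in range(count):
        port = low + (start + i) % count
        if port not in in_use:
            return port
    raise RuntimeError("no ephemeral port available")
```

The cryptographically strong secrets module is used here because the goal is to make the chosen port hard for an off-path attacker to guess.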
DISCUSSION: | ||||
[I-D.ietf-tsvwg-port-randomization] provides advice on port | ||||
randomization. | ||||
TCP MUST NOT allocate port number 0, as its use could lead to | ||||
interoperability problems. If a segment is received with port 0 as | ||||
the Source Port or the Destination Port, a RST segment SHOULD be sent | ||||
in response (provided that the incoming segment does not have the | ||||
RST flag set). | ||||
DISCUSSION: | ||||
While port 0 is a legitimate port number, it has a special meaning | ||||
in the UNIX Sockets API. For example, when a TCP port number of 0 | ||||
is passed as an argument to the bind() function, rather than | ||||
binding port 0, an ephemeral port is selected for the | ||||
corresponding TCP end-point. As a result, the TCP port number 0 | ||||
is never actually used in TCP segments. | ||||
Different implementations have been found to respond differently | ||||
to TCP segments that have a port number of 0 as the Source Port | ||||
and/or the Destination Port. As a result, TCP segments with a | ||||
port number of 0 are usually employed for remote OS detection via | ||||
TCP/IP stack fingerprinting [Jones, 2003]. | ||||
Since in practice TCP port 0 is not used by any legitimate | ||||
application and is only used for fingerprinting purposes, a number | ||||
of host implementations already reject TCP segments that use 0 as | ||||
the Source Port and/or the Destination Port. Also, a number of | ||||
firewalls filter (by default) any TCP segments that contain a port | ||||
number of zero for the Source Port and/or the Destination Port. | ||||
We therefore recommend that TCP implementations respond to | ||||
incoming TCP segments that have a Source Port or a Destination | ||||
Port of 0 with an RST (provided these incoming segments do not | ||||
have the RST bit set). | ||||
Responding with an RST segment to incoming segments that have the | ||||
RST bit would open the door to RST-war attacks. | ||||
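The port 0 policy discussed above can be sketched as a small decision function. This is an illustrative sketch only: the function name and the string return values are hypothetical, and actually building and sending the RST segment is left out.

```python
def action_for_port_zero(src_port, dst_port, rst_flag):
    """Decide how to treat a segment with respect to the port 0 policy.

    Returns "process" when neither port is 0, "send_rst" when a port is 0
    and the segment does not carry RST, and "drop" when a port is 0 and
    the segment carries RST (replying would invite RST wars).
    """
    if src_port != 0 and dst_port != 0:
        return "process"
    return "drop" if rst_flag else "send_rst"
```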
TCP MUST be able to gracefully handle the case where the source end- | ||||
point (IP Source Address, TCP Source Port) is the same as the | ||||
destination end-point (IP Destination Address, TCP Destination Port). | ||||
DISCUSSION: | ||||
Some systems have been found to be unable to process TCP segments | ||||
in which the source endpoint {Source Address, Source Port} is the | ||||
same as the destination end-point {Destination Address, | ||||
Destination Port}. Such TCP segments have been reported to cause | ||||
malfunction of a number of implementations [CERT, 1996], and have | ||||
been exploited in the past to perform Denial of Service (DoS) | ||||
attacks [Meltman, 1997]. While these packets are very | ||||
unlikely to exist in real and legitimate scenarios, TCP should | ||||
nevertheless be able to process them without the need of any | ||||
"extra" code. | ||||
A SYN segment in which the source end-point {Source Address, | ||||
Source Port} is the same as the destination end-point {Destination | ||||
Address, Destination Port} will result in a "simultaneous open" | ||||
scenario, such as the one described in page 32 of RFC 793 [Postel, | ||||
1981c]. Therefore, those TCP implementations that correctly | ||||
handle simultaneous opens should already be prepared to handle | ||||
these unusual TCP segments. | ||||
TCP SHOULD NOT allocate port numbers that are in use by a TCP that | ||||
is in the LISTEN or CLOSED states for use as ephemeral ports, as this | ||||
could allow attackers on the local system to "steal" incoming TCP | ||||
connections. | ||||
DISCUSSION: | ||||
While the only requirement for a selected ephemeral port is that | Some implementations have been known to crash upon receipt of a TCP segment in | |||
the resulting four-tuple (connection-id) is unique (i.e., not | which the source end-point (IP Source Address, TCP Source Port) is | |||
currently in use by any other TCP connection), in practice it may | the same as the destination end-point (IP Destination Address, TCP | |||
be necessary to not allow the allocation of port numbers that are | Destination Port). [draft-gont-tcpm-tcp-mirrored-endpoints-00.txt] | |||
in use by a TCP that is in the LISTEN or CLOSED states for use as | describes this issue in detail and provides advice in this area. | |||
ephemeral ports, as this might allow an attacker to "steal" | ||||
incoming connections from a local server application. Therefore, | ||||
TCP SHOULD NOT allocate port numbers that are in use by a TCP in | ||||
the LISTEN or CLOSED states for use as ephemeral ports. Section | ||||
10.2 of this document provides a detailed discussion of this | ||||
issue. | ||||
While some systems restrict use of the port numbers in the range | While some systems restrict use of the port numbers in the range | |||
0-1023 to privileged users, applications SHOULD NOT grant any trust | 0-1023 to privileged users, applications should not grant any trust | |||
based on the port numbers used for a TCP connection. | based on the port numbers used for a TCP connection. | |||
DISCUSSION: | ||||
Not all systems require superuser privileges to bind port numbers | Not all systems require superuser privileges to bind port numbers | |||
in that range. Besides, with desktop computers such "distinction" | in that range. Besides, with desktop computers such "distinction" | |||
has generally become irrelevant. | has generally become irrelevant. | |||
Middle-boxes such as packet filters MUST NOT assume that clients use | Middle-boxes such as packet filters must not assume that clients use | |||
port numbers from only the Dynamic or Registered port ranges. | port numbers from only the Dynamic or Registered port ranges. | |||
DISCUSSION: | ||||
It should also be noted that some clients, such as DNS resolvers, | It should also be noted that some clients, such as DNS resolvers, | |||
are known to use port numbers from the "Well Known Ports" range. | are known to use port numbers from the "Well Known Ports" range. | |||
Therefore, middle-boxes such as packet filters MUST NOT assume | Therefore, middle-boxes such as packet filters MUST NOT assume | |||
that clients use port numbers from only the Dynamic or Registered | that clients use port numbers from only the Dynamic or Registered | |||
port ranges. | port ranges. | |||
3.2. Sequence number | 3.2. Sequence number | |||
TCP SHOULD select its Initial Sequence Numbers (ISNs) with the | Predictable sequence numbers allow a variety of attacks against TCP, | |||
following expression: | such as those described in Section 5.2 and Section 11 of this | |||
document. This vulnerability was first described in [Morris1985], | ||||
ISN = M + F(localhost, localport, remotehost, remoteport, secret_key) | and its exploitation was widely publicized about 10 years later | |||
[Shimomura1995]. | ||||
where M is a monotonically increasing counter maintained within TCP, | ||||
and F() is a Pseudo-Random Function (PRF). As it is vital that F() | ||||
not be computable from the outside, F() could be a PRF of the | ||||
connection-id and some secret data. HMAC-SHA-256 would be a good | ||||
choice for F() | ||||
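The ISN expression above can be sketched as follows, assuming HMAC-SHA-256 for F() as the text suggests. Several details are assumptions made only for illustration: M is derived from a clock that ticks every 4 microseconds (the rate attributed to RFC 793 in the discussion), the connection-id is serialized as a "|"-separated string, the PRF output is truncated to its first 32 bits, and the name generate_isn and the now parameter are hypothetical.

```python
import hashlib
import hmac
import time

MOD = 1 << 32  # sequence numbers are 32-bit values


def generate_isn(local_host, local_port, remote_host, remote_port,
                 secret_key, now=None):
    """Compute ISN = M + F(connection-id, secret_key) modulo 2**32."""
    if now is None:
        now = time.monotonic()
    # M: counter incremented roughly every 4 microseconds (assumption).
    m = int(now * 250_000) % MOD
    # F: HMAC-SHA-256 over the connection-id, keyed with the secret.
    conn_id = f"{local_host}|{local_port}|{remote_host}|{remote_port}"
    digest = hmac.new(secret_key, conn_id.encode(), hashlib.sha256).digest()
    f = int.from_bytes(digest[:4], "big")  # truncate to 32 bits
    return (m + f) % MOD
```

For a fixed connection-id the result increases with M, while the per-connection offset F() cannot be computed without the secret key.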
DISCUSSION: | ||||
The choice of the Initial Sequence Number of a connection is not | ||||
arbitrary, but aims to minimize the chances of a stale segment | ||||
being accepted by a new incarnation of a previous connection. | ||||
RFC 793 [Postel, 1981c] suggests the use of a global 32-bit ISN | ||||
generator, whose lower bit is incremented roughly every 4 | ||||
microseconds. | ||||
However, use of such an ISN generator makes it trivial to predict | ||||
the ISN that a TCP will use for new connections, thus allowing a | ||||
variety of attacks against TCP, such as those described in Section | ||||
5.2 and Section 11 of this document. This vulnerability was first | ||||
described in [Morris, 1985], and its exploitation was widely | ||||
publicized about 10 years later [Shimomura, 1995]. | ||||
As a matter of fact, protection against old stale segments from a | ||||
previous incarnation of the connection comes from allowing the | ||||
creation of a new incarnation of a previous connection only after | ||||
2*MSL have passed since a segment corresponding to the old | ||||
incarnation was last seen. This is accomplished by the TIME-WAIT | ||||
state, and TCP's "quiet time" concept. However, as discussed in | ||||
Section 3.1 and Section 11.1.2 of this document, the ISN can be | ||||
used to perform some heuristics meant to avoid an interoperability | ||||
problem that may arise when two systems establish connections at a | ||||
high rate. In order for such heuristics to work, the ISNs | ||||
generated by a TCP should be monotonically increasing. | ||||
The ISN generation scheme recommended in this section was | ||||
originally proposed in RFC 1948 [Bellovin, 1996], such that the | ||||
chances of an attacker guessing the ISN of a TCP are reduced, | ||||
while still producing a monotonically-increasing sequence that | ||||
allows implementation of the optimization described in Section 3.1 | ||||
and Section 11.1.2 of this document. | ||||
[CERT, 2001] and [US-CERT, 2001] are advisories about the security | In order to mitigate these vulnerabilities, some implementations set | |||
implications of weak ISN generators. [Zalewski, 2001a] and | the TCP ISN to the output of a PRNG. However, this has been known to cause | |||
[Zalewski, 2002] contain a detailed analysis of ISN generators, | interoperability problems. [RFC6528] provides advice in this area. | |||
and a survey of the algorithms in use by popular TCP | ||||
implementations. | ||||
Another security consideration that should be made about TCP | Another security consideration that should be made about TCP sequence | |||
sequence numbers is that they might allow an attacker to count the | numbers is that they might allow an attacker to count the number of | |||
number of systems behind a Network Address Translator (NAT) | systems behind a Network Address Translator (NAT) [Srisuresh and | |||
[Srisuresh and Egevang, 2001]. Depending on the ISN generators | Egevang, 2001]. Depending on the ISN generators implemented by each | |||
implemented by each of the systems behind the NAT, an attacker | of the systems behind the NAT, an attacker might be able to count the | |||
might be able to count the number of systems behind the NAT by | number of systems behind the NAT by establishing a number of TCP | |||
establishing a number of TCP connections (using the public address | connections (using the public address of the NAT) and identifying | |||
of the NAT) and identifying the number of different sequence | the number of different sequence number "spaces". [Gont and | |||
number "spaces". This information leakage could be eliminated by | Srisuresh, 2008] provides a detailed discussion of the security | |||
rewriting the contents of all those header fields and options that | implications of NATs and of the possible mitigations for this and | |||
make use of sequence numbers (such as the Sequence Number and the | other issues. | |||
Acknowledgement Number fields, and the SACK Option) at the NAT. | ||||
[Gont and Srisuresh, 2008] provides a detailed discussion of the | ||||
security implications of NATs and of the possible mitigations for | ||||
this and other issues. | ||||
3.3. Acknowledgement Number | 3.3. Acknowledgement Number | |||
TCP SHOULD set the Acknowledgement Number to zero when sending a TCP | If the ACK bit is on, the Acknowledgement Number contains the value | |||
segment that does not have the ACK bit set (i.e., a SYN segment). | of the next sequence number the sender of this segment is expecting | |||
to receive. According to RFC 793, the Acknowledgement Number is | ||||
TCP MUST check that, on segments that have the ACK bit set, the | considered valid as long as it does not acknowledge the receipt of | |||
Acknowledgment Number satisfies the expression: | data that has not yet been sent. | |||
SND.UNA - SND.MAX.WND <= SEG.ACK <= SND.NXT | ||||
If a TCP segment does not pass this check, the segment MUST be | ||||
dropped, and an ACK segment SHOULD be sent in response. | ||||
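The acknowledgment check above has to be evaluated in 32-bit modular arithmetic, since sequence numbers wrap around. The following sketch shows one way to do that; the function name and the snake_case variable names are hypothetical stand-ins for SEG.ACK, SND.UNA, SND.NXT and SND.MAX.WND.

```python
MOD = 1 << 32  # sequence space size


def ack_is_acceptable(seg_ack, snd_una, snd_nxt, snd_max_wnd):
    """Check SND.UNA - SND.MAX.WND <= SEG.ACK <= SND.NXT modulo 2**32."""
    lower = (snd_una - snd_max_wnd) % MOD   # lowest acceptable value
    span = (snd_nxt - lower) % MOD          # size of the acceptable range
    return (seg_ack - lower) % MOD <= span
```

Measuring both SEG.ACK and SND.NXT as distances from the lower bound keeps the comparison valid when the range straddles the wrap-around point.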
DISCUSSION: | ||||
If the ACK bit is on, the Acknowledgement Number contains the | ||||
value of the next sequence number the sender of this segment is | ||||
expecting to receive. According to RFC 793, the Acknowledgement | ||||
Number is considered valid as long as it does not acknowledge the | ||||
receipt of data that has not yet been sent. | ||||
However, as a result of recent concerns on forgery attacks against | ||||
TCP (see Section 11 of this document), ongoing work at the IETF | ||||
[Ramaiah et al, 2008] has proposed to enforce a more strict check | ||||
on the Acknowledgement Number of segments that have the ACK bit | ||||
set: | ||||
SND.UNA - SND.MAX.WND <= SEG.ACK <= SND.NXT | However, as a result of recent concerns on forgery attacks against | |||
TCP (see Section 11 of this document), [RFC5961] has proposed to | |||
enforce a more strict check on the Acknowledgement Number of segments | |||
that have the ACK bit set. See [RFC5961] for more details. | |||
If the ACK bit is off, the Acknowledgement Number field is not | If the ACK bit is off, the Acknowledgement Number field is not valid. | |||
valid. We recommend that TCP implementations set the | We recommend that TCP implementations set the Acknowledgement Number to | |||
Acknowledgement Number to zero when sending a TCP segment that | zero when sending a TCP segment that does not have the ACK bit set | |||
does not have the ACK bit set (i.e., a SYN segment). Some TCP | (i.e., a SYN segment). Some TCP implementations have been known to | |||
implementations have been known to fail to set the Acknowledgement | fail to set the Acknowledgement Number to zero, thus leaking | |||
Number to zero, thus leaking information. | information. | |||
TCP Acknowledgements are also used to perform heuristics for loss | TCP Acknowledgements are also used to perform heuristics for loss | |||
recovery and congestion control. Section 9 of this document | recovery and congestion control. Section 9 of this document | |||
describes a number of ways in which these mechanisms can be | describes a number of ways in which these mechanisms can be | |||
exploited. | exploited. | |||
3.4. Data Offset | 3.4. Data Offset | |||
TCP MUST enforce the following checks on the Data Offset field: | [draft-gont-tcpm-tcp-sanity-checks-00.txt] specifies a number of | |||
sanity checks that should be performed on the Data Offset field. | ||||
Data Offset >= 5 | ||||
Data Offset * 4 <= TCP segment length | ||||
If a TCP segment does not pass these checks, it should be silently | ||||
dropped. | ||||
The TCP segment length should be obtained from the IP layer, as | ||||
TCP does not include a TCP segment length field. | ||||
DISCUSSION: | ||||
The Data Offset field indicates the length of the TCP header in | ||||
32-bit words. As the minimum TCP header size is 20 bytes, the | ||||
minimum legal value for this field is 5. | ||||
For obvious reasons, the TCP header cannot be larger than the | ||||
whole TCP segment it is part of. | ||||
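The Data Offset checks, together with the minimum segment size check, can be sketched as below. The sketch assumes the segment is available as a bytes object whose length was supplied by the IP layer; the function name is hypothetical. The Data Offset is the upper four bits of the byte at offset 12 of the TCP header, as shown in Figure 2.

```python
TCP_MIN_HEADER_LEN = 20  # bytes: a header with no options


def data_offset_is_sane(segment: bytes) -> bool:
    """Apply the size and Data Offset sanity checks to a raw TCP segment."""
    if len(segment) < TCP_MIN_HEADER_LEN:
        return False                      # Segment.Size >= 20 fails
    data_offset = segment[12] >> 4        # header length in 32-bit words
    if data_offset < 5:
        return False                      # Data Offset >= 5 fails
    return data_offset * 4 <= len(segment)
```

A segment that fails any of these checks would be silently dropped before any other header field is examined.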
3.5. Control bits | 3.5. Control bits | |||
The following subsections provide a discussion of the different | The following subsections provide a discussion of the different | |||
control bits in the TCP header. TCP segments with unusual | control bits in the TCP header. TCP segments with unusual | |||
combinations of flags set have been known in the past to cause | combinations of flags set have been known in the past to cause | |||
malfunction of some implementations, sometimes to the extent of | malfunction of some implementations, sometimes to the extent of | |||
causing them to crash [Postel, 1987] [Braden, 1992]. These packets | causing them to crash [RFC1025] [RFC1379]. These packets are still | |||
are still usually employed for the purpose of TCP/IP stack | usually employed for the purpose of TCP/IP stack fingerprinting. | |||
fingerprinting. Section 12.1 contains a discussion of TCP/IP stack | Section 12.1 contains a discussion of TCP/IP stack fingerprinting. | |||
fingerprinting. | ||||
3.5.1. Reserved (four bits) | 3.5.1. Reserved (four bits) | |||
TCP MUST ignore the Reserved field of incoming TCP segments. | These four bits are reserved for future use, and must be zero. As | |||
with virtually every field, the Reserved field could be used as a | ||||
DISCUSSION: | covert channel. While there exist intermediate devices such as | |||
protocol scrubbers that clear these bits, and firewalls that drop/ | ||||
These four bits are reserved for future use, and must be zero. As | reject segments with any of these bits set, these devices should | |||
with virtually every field, the Reserved field could be used as a | consider the impact of these policies on TCP interoperability. For | |||
covert channel. While there exist intermediate devices such as | example, as TCP continues to evolve, all or part of the bits in the | |||
protocol scrubbers that clear these bits, and firewalls that drop/ | Reserved field could be used to implement some new functionality. If | |||
reject segments with any of these bits set, these devices should | some middle-box or end-system implementation were to drop a TCP | |||
consider the impact of these policies on TCP interoperability. | segment merely because some of these bits are not set to zero, | |||
For example, as TCP continues to evolve, all or part of the bits | interoperability problems would arise. | |||
in the Reserved field could be used to implement some new | ||||
functionality. If some middle-box or end-system implementation | ||||
were to drop a TCP segment merely because some of these bits are | ||||
not set to zero, interoperability problems would arise. | ||||
3.5.2. CWR (Congestion Window Reduced) | 3.5.2. CWR (Congestion Window Reduced) | |||
DISCUSSION: | The CWR flag, defined in RFC 3168 [Ramakrishnan et al, 2001], is used | |||
as part of the Explicit Congestion Notification (ECN) mechanism. For | ||||
The CWR flag, defined in RFC 3168 [Ramakrishnan et al, 2001], is | connections in any of the synchronized states, this flag indicates, | |||
used as part of the Explicit Congestion Notification (ECN) | when set, that the TCP sending this segment has reduced its | |||
mechanism. For connections in any of the synchronized states, | congestion window. | |||
this flag indicates, when set, that the TCP sending this segment | ||||
has reduced its congestion window. | ||||
An analysis of the security implications of ECN can be found in | An analysis of the security implications of ECN can be found in | |||
Section 9.3 of this document. | Section 9.3 of this document. | |||
3.5.3. ECE (ECN-Echo) | 3.5.3. ECE (ECN-Echo) | |||
DISCUSSION: | The ECE flag, defined in RFC 3168 [Ramakrishnan et al, 2001], is used | |||
as part of the Explicit Congestion Notification (ECN) mechanism. | ||||
The ECE flag, defined in RFC 3168 [Ramakrishnan et al, 2001], is | ||||
used as part of the Explicit Congestion Notification (ECN) | ||||
mechanism. | ||||
Once a TCP connection has been established, an ACK segment with | ||||
the ECE bit set indicates that congestion was encountered in the | ||||
network on the path from the sender to the receiver. This | ||||
indication of congestion should be treated just as a congestion | ||||
loss in non-ECN-capable TCP [Ramakrishnan et al, 2001]. | ||||
Additionally, TCP should not increase the congestion window (cwnd) | ||||
in response to such an ACK segment that indicates congestion, and | ||||
should also not react to congestion indications more than once | ||||
every window of data (or once per round-trip time). | ||||
An analysis of the security implications of ECN can be found in | An analysis of the security implications of ECN can be found in | |||
Section 9.3 of this document. | Section 9.3 of this document. | |||
3.5.4. URG | 3.5.4. URG | |||
DISCUSSION: | When the URG flag is set, the Urgent Pointer field contains the | |||
current value of the urgent pointer. | ||||
When the URG flag is set, the Urgent Pointer field contains the | ||||
current value of the urgent pointer. | ||||
Receipt of an "urgent" indication generates, in a number of | ||||
implementations (such as those in UNIX-like systems), a software | ||||
interrupt (signal) that is delivered to the corresponding process. | ||||
In UNIX-like systems, receipt of an urgent indication causes a | Receipt of an "urgent" indication generates, in a number of | |||
SIGURG signal to be delivered to the corresponding process. | implementations (such as those in UNIX-like systems), a software | |||
interrupt (signal) that is delivered to the corresponding process. | ||||
In UNIX-like systems, receipt of an urgent indication causes a SIGURG | ||||
signal to be delivered to the corresponding process. | ||||
A number of applications handle TCP urgent indications by | A number of applications handle TCP urgent indications by installing | |||
installing a signal handler for the corresponding signal (e.g., | a signal handler for the corresponding signal (e.g., SIGURG). As | |||
SIGURG). As discussed in [Zalewski, 2001b], some signal handlers | discussed in [Zalewski, 2001b], some signal handlers can be | |||
can be maliciously exploited by an attacker, for example to gain | maliciously exploited by an attacker, for example to gain remote | |||
remote access to a system. While secure programming of signal | access to a system. While secure programming of signal handlers is | |||
handlers is out of the scope of this document, we nevertheless | out of the scope of this document, we nevertheless raise awareness | |||
raise awareness that TCP urgent indications might be exploited to | that TCP urgent indications might be exploited to abuse poorly- | |||
abuse poorly-written signal handlers. | written signal handlers. | |||
Section 3.9 discusses the security implications of the TCP urgent | Section 3.9 discusses the security implications of the TCP urgent | |||
mechanism. | mechanism. | |||
3.5.5. ACK | 3.5.5. ACK | |||
DISCUSSION: | When the ACK bit is one, the Acknowledgment Number field contains the | |||
next sequence number expected, cumulatively acknowledging the receipt | ||||
When the ACK bit is one, the Acknowledgment Number field contains | of all data up to the sequence number in the Acknowledgement Number, | |||
the next sequence number expected, cumulatively acknowledging the | minus one. Section 3.3 of this document describes sanity checks that | |||
receipt of all data up to the sequence number in the | should be performed on the Acknowledgement Number field. | |||
Acknowledgement Number, minus one. Section 3.3 of this document | ||||
describes sanity checks that should be performed on the | ||||
Acknowledgement Number field. | ||||
TCP Acknowledgements are also used to perform heuristics for loss | TCP Acknowledgements are also used to perform heuristics for loss | |||
recovery and congestion control. Section 9 of this document | recovery and congestion control. Section 9 of this document | |||
describes a number of ways in which these mechanisms can be | describes a number of ways in which these mechanisms can be | |||
exploited. | exploited. | |||
3.5.6. PSH | 3.5.6. PSH | |||
As a result of a SEND call, TCP SHOULD send all queued data (provided | [draft-gont-tcpm-tcp-push-semantics-00.txt] describes a number of | |||
that TCP's flow control and congestion control algorithms allow it). | security issues that may arise as a result of the PUSH semantics, and | |||
proposes a number of ways to mitigate these issues. | ||||
Received data SHOULD be immediately delivered to an application | ||||
calling the RECEIVE function, even if the data already available are | ||||
less than those requested by the application. | ||||
DISCUSSION: | ||||
RFC 793 [Postel, 1981c] contains (in pages 54-64) a functional | ||||
description of a TCP Application Programming Interface (API). One | ||||
of the parameters of the SEND function is the PUSH flag which, | ||||
when set, signals the local TCP that it must send all unsent data. | ||||
The TCP PSH (PUSH) flag will be set in the last outgoing segment, | ||||
to signal the push function to the receiving TCP. Upon receipt of | ||||
a segment with the PSH flag set, the receiving user's buffer is | ||||
returned to the user, without waiting for additional data to | ||||
arrive. | ||||
There are two security considerations arising from the PUSH | ||||
function. On the sending side, an attacker could cause a large | ||||
amount of data to be queued for transmission without setting the | ||||
PUSH flag in the SEND call. This would prevent the local TCP from | ||||
sending the queued data, causing system memory to be tied to those | ||||
data for an unnecessarily long period of time. | ||||
An analogous consideration should be made for the receiving TCP. | ||||
TCP is allowed to buffer incoming data until the receiving user's | ||||
buffer fills or a segment with the PSH bit set is received. If | ||||
the receiving TCP implements this policy, an attacker could send a | ||||
large amount of data, slightly less than the receiving user's | ||||
buffer size, to cause system memory to be tied to these data for | ||||
an unnecessarily long period of time. Both of these issues are | ||||
discussed in Section 4.2.2.2 of RFC 1122 [Braden, 1989]. | ||||
In order to mitigate these potential vulnerabilities, we suggest | ||||
assuming an implicit "PUSH" in every SEND call. On the sending | ||||
side, this means that as a result of a SEND call TCP should try to | ||||
send all queued data (provided that TCP's flow control and | ||||
congestion control algorithms allow it). On the receiving side, | ||||
this means that the received data will be immediately delivered to | ||||
an application calling the RECEIVE function, even if the data | ||||
already available are less than those requested by the | ||||
application. | ||||
It is interesting to note that popular TCP APIs (such as | ||||
"sockets") do not provide a PUSH flag in any of the interfaces | ||||
they define, but rather perform some kind of "heuristics" to set | ||||
the PSH bit in outgoing segments. As a result, the value of the | ||||
PSH bit in the received TCP segments is usually a policy of the | ||||
sending TCP, rather than a policy of the sending application. All | ||||
robust applications that make use of those APIs (such as the | ||||
sockets API) properly handle the case of a RECEIVE call returning | ||||
less data (e.g., zero) than requested, usually by performing | ||||
subsequent RECEIVE calls. | ||||
Another potential malicious use of the PSH bit would be for an | ||||
attacker to send small TCP segments (probably with zero bytes of | ||||
data payload) to cause the receiving application to be | ||||
unnecessarily woken up (increasing the CPU load), or to cause | ||||
malfunction of poorly-written applications that may not handle | ||||
well the case of RECEIVE calls returning less data than requested. | ||||
3.5.7. RST | 3.5.7. RST | |||
TCP MUST process RST segments (i.e., segments with the RST bit set) | The RST bit is used to request the abortion (abnormal close) of a TCP | |||
as follows: | connection. RFC 793 [RFC0793] suggests that an RST segment should be | |||
considered valid if its Sequence Number is valid (i.e., falls within | ||||
o If the Sequence Number of the RST segment is not valid (i.e., | the receive window). However, in response to the security concerns | |||
falls outside of the receive window), silently drop the segment. | raised by [Watson, 2004] and [NISCC, 2004], [RFC6429] proposed | |||
stricter validity checks. Please see [RFC6429] for additional | ||||
o If the Sequence Number of the RST segment matches the next | details. | |||
expected sequence number (RCV.NXT), abort the corresponding | ||||
connection. | ||||
o If the Sequence Number is valid (i.e., falls within the receive | ||||
window) but is not exactly RCV.NXT, send an ACK segment (a | ||||
"challenge ACK") of the form: <SEQ=SND.NXT><ACK=RCV.NXT><CTL=ACK>. | ||||
TCP SHOULD rate-limit these challenge ACK segments. | ||||
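The three RST processing rules above can be sketched as a classification function. This is a simplified illustration: the function name and the string results are hypothetical, the window test uses modular arithmetic to cope with sequence number wrap-around, and the rate limiting of challenge ACKs is not shown.

```python
MOD = 1 << 32  # sequence space size


def classify_rst(seg_seq, rcv_nxt, rcv_wnd):
    """Classify an incoming RST segment by its Sequence Number.

    Returns "abort" for an exact match with RCV.NXT, "challenge_ack" for
    an in-window but inexact value, and "drop" for an out-of-window value.
    """
    if seg_seq == rcv_nxt:
        return "abort"
    if (seg_seq - rcv_nxt) % MOD < rcv_wnd:
        return "challenge_ack"  # reply <SEQ=SND.NXT><ACK=RCV.NXT><CTL=ACK>
    return "drop"
```

With these rules, an off-path attacker has to guess the exact value of RCV.NXT rather than any value inside the receive window.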
DISCUSSION: | ||||
The RST bit is used to request the abortion (abnormal close) of a | ||||
TCP connection. RFC 793 [Postel, 1981c] suggests that an RST | ||||
segment should be considered valid if its Sequence Number is valid | ||||
(i.e., falls within the receive window). However, in response to | ||||
the security concerns raised by [Watson, 2004] and [NISCC, 2004], | ||||
[Ramaiah et al, 2008] proposed the aforementioned stricter | ||||
validity checks. | ||||
Section 11.1 of this document describes TCP-based connection-reset | Section 11.1 of this document describes TCP-based connection-reset | |||
attacks, along with a number of countermeasures to mitigate their | attacks, along with a number of countermeasures to mitigate their | |||
impact. | impact. | |||
3.5.8. SYN | 3.5.8. SYN | |||
DISCUSSION: | The SYN bit is used during the connection-establishment phase, to | |||
request the synchronization of sequence numbers. | ||||
The SYN bit is used during the connection-establishment phase, to | ||||
request the synchronization of sequence numbers. | ||||
There are basically four different vulnerabilities that make use | There are basically four different vulnerabilities that make use of | |||
of the SYN bit: SYN-flooding attacks, connection forgery attacks, | the SYN bit: SYN-flooding attacks, connection forgery attacks, | |||
connection flooding attacks, and connection-reset attacks. They | connection flooding attacks, and connection-reset attacks. They are | |||
are described in Section 5.1, Section 5.2, Section 5.3, and | described in Section 5.1, Section 5.2, Section 5.3, and Section | |||
Section 11.1.2, respectively, along with the possible | 11.1.2, respectively, along with the possible countermeasures. | |||
countermeasures. | ||||
3.5.9. FIN | 3.5.9. FIN | |||
DISCUSSION: | The FIN flag is used to signal the remote end-point the end of the | |||
data transfer in this direction. Receipt of a valid FIN segment | ||||
The FIN flag is used to signal the remote end-point the end of the | (i.e., a TCP segment with the FIN flag set) causes the transition in | |||
data transfer in this direction. Receipt of a valid FIN segment | the connection state, as part of what is usually referred to as the | |||
(i.e., a TCP segment with the FIN flag set) causes the transition | "connection termination phase". | |||
in the connection state, as part of what is usually referred to as | ||||
the "connection termination phase". | ||||
The connection-termination phase can be exploited to perform a | The connection-termination phase can be exploited to perform a number | |||
number of resource-exhaustion attacks. Section 6 of this document | of resource-exhaustion attacks. Section 6 of this document describes | |||
describes a number of attacks that exploit the connection- | a number of attacks that exploit the connection-termination phase | |||
termination phase along with the possible countermeasures. | along with the possible countermeasures. | |||
3.6. Window | 3.6. Window | |||
DISCUSSION: | The TCP Window field advertises how many bytes of data the remote | |||
peer is allowed to send before a new advertisement is made. | ||||
The TCP Window field advertises how many bytes of data the remote | Theoretically, the maximum transfer rate that can be achieved by TCP | |||
peer is allowed to send before a new advertisement is made. | is limited to: | |||
Theoretically, the maximum transfer rate that can be achieved by | ||||
TCP is limited to: | ||||
Maximum Transfer Rate = Window / RTT | Maximum Transfer Rate = Window / RTT | |||
This means that, under ideal network conditions (e.g., no packet | This means that, under ideal network conditions (e.g., no packet | |||
loss), the TCP Window in use should be at least: | loss), the TCP Window in use should be at least: | |||
Window = 2 * Bandwidth * Delay | Window = 2 * Bandwidth * Delay | |||
Using a larger Window than that resulting from the previous | Using a larger Window than that resulting from the previous equation | |||
equation will not provide any improvements in terms of | will not provide any improvements in terms of performance. | |||
performance. | ||||
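The two equations above can be illustrated with a short worked example. This is only a sketch: the function names are hypothetical, "Delay" is assumed to be the one-way delay (so that 2 * Delay equals the RTT), and bandwidth is assumed to be expressed in bits per second.

```python
def required_window_bytes(bandwidth_bps: int, one_way_delay_ms: int) -> int:
    """Window = 2 * Bandwidth * Delay, converted from bits to bytes.

    Assumes Delay is the one-way delay in milliseconds, so that
    2 * Delay corresponds to the round-trip time.
    """
    return (2 * bandwidth_bps * one_way_delay_ms) // (8 * 1000)


def max_transfer_rate_bps(window_bytes: int, rtt_ms: int) -> int:
    """Maximum Transfer Rate = Window / RTT, returned in bits per second."""
    return (window_bytes * 8 * 1000) // rtt_ms


# A 10 Mbit/s path with a 50 ms one-way delay (100 ms RTT).
window = required_window_bytes(10_000_000, 50)
rate = max_transfer_rate_bps(window, 100)
```

With these inputs the window works out to 125000 bytes, and feeding that window back into the first equation returns the original 10 Mbit/s.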
In practice, selection of the most convenient Window size may also | ||||
depend on a number of other parameters, such as: packet loss rate, | ||||
loss recovery mechanisms in use, etc. | ||||
Security implications of the maximum TCP window size | In practice, selection of the most convenient Window size may also | |||
depend on a number of other parameters, such as: packet loss rate, | ||||
loss recovery mechanisms in use, etc. | ||||
An aspect of the TCP Window that is usually overlooked is the | An aspect of the TCP Window that is usually overlooked is the | |||
security implications of its size. Increasing the TCP window | security implications of its size. Increasing the TCP window | |||
increases the sequence number space that will be considered | increases the sequence number space that will be considered "valid" | |||
"valid" for incoming segments. Thus, use of unnecessarily large | for incoming segments. Thus, use of unnecessarily large TCP Window | |||
TCP Window sizes increases TCP's vulnerability to forgery attacks | sizes increases TCP's vulnerability to forgery attacks unnecessarily. | |||
unnecessarily. | ||||
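The relationship between window size and exposure to forgery can be made concrete with a rough estimate. The sketch below assumes an off-path attacker who already knows the connection's four-tuple and only has to place one forged segment inside the receive window, by sending segments spaced one window apart across the 32-bit sequence number space. The function name is hypothetical, and the model ignores any additional validity checks.

```python
SEQUENCE_SPACE = 2 ** 32  # size of the TCP sequence number space


def segments_to_cover_sequence_space(window_bytes: int) -> int:
    """Upper bound on forged segments needed to hit the receive window.

    Assumes segments are spaced exactly one window apart, so the count
    is ceil(2^32 / window).
    """
    if window_bytes <= 0:
        raise ValueError("window must be positive")
    return -(-SEQUENCE_SPACE // window_bytes)  # ceiling division


small = segments_to_cover_sequence_space(4096)
large = segments_to_cover_sequence_space(65535)
```

Under these assumptions a 4 KByte window requires about sixteen times as many forged segments as a 65535-byte window.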
In those scenarios in which the network conditions are known | In those scenarios in which the network conditions are known and/or | |||
and/or can be easily predicted, it is recommended that the TCP | can be easily predicted, it is recommended that the TCP Window is | |||
Window is never set to a value larger than that resulting from the | never set to a value larger than that resulting from the equations | |||
equations above. Additionally, the nature of the application | above. Additionally, the nature of the application running on top of | |||
running on top of TCP should be considered when tuning the TCP | TCP should be considered when tuning the TCP window. As an example, | |||
window. As an example, an H.245 signaling application certainly | an H.245 signaling application certainly does not have high | |||
does not have high requirements on throughput, and thus a window | requirements on throughput, and thus a window size of around 4 KBytes | |||
size of around 4 KBytes will usually fulfill its needs, while | will usually fulfill its needs, while keeping TCP's resistance to | |||
keeping TCP's resistance to off-path forgery attacks at a decent | off-path forgery attacks at a decent level. Some rough measurements | |||
level. Some rough measurements seem to indicate that a TCP window | seem to indicate that a TCP window of 4Kbytes is common practice for | |||
of 4Kbytes is common practice for TCP connections servicing | TCP connections servicing applications such as BGP. | |||
applications such as BGP. | ||||
In principle, a possible approach to avoid requiring | In principle, a possible approach to avoid requiring administrators | |||
administrators to manually set the TCP window would be to | to manually set the TCP window would be to implement an automatic | |||
implement an automatic buffer tuning mechanism, such as that | buffer tuning mechanism, such as that described in [Heffner, 2002]. | |||
described in [Heffner, 2002]. However, as discussed in Section | However, as discussed in Section 7.3.2 of this document these | |||
7.3.2 of this document these mechanisms can be exploited to | mechanisms can be exploited to perform other types of attacks. | |||
perform other types of attacks. | ||||
Security implications arising from closed windows | 3.6.1. Security implications arising from closed windows | |||
The TCP window is a flow-control mechanism that prevents a fast | When a TCP end-point is not willing to receive any more data (before | |||
data sender application from overwhelming a "slow" receiver. When | some of the data that have already been received are consumed), it | |||
a TCP end-point is not willing to receive any more data (before | will advertise a TCP window of zero bytes. This will effectively | |||
some of the data that have already been received are consumed), it | stop the sender from sending any new data to the TCP receiver. | |||
will advertise a TCP window of zero bytes. This will effectively | Transmission of new data will resume when the TCP receiver advertises | |||
stop the sender from sending any new data to the TCP receiver. | a nonzero TCP window, usually with a TCP segment that contains no | |||
Transmission of new data will resume when the TCP receiver | data ("an ACK"). | |||
advertises a nonzero TCP window, usually with a TCP segment that | ||||
contains no data ("an ACK"). | ||||
This segment is usually referred to as a "window update", as the | This segment is usually referred to as a "window update", as the | |||
only purpose of this segment is to update the TCP sender regarding the | only purpose of this segment is to update the TCP sender regarding the | |||
new window. | new window. | |||
To accommodate those scenarios in which the ACK segment that | To accommodate those scenarios in which the ACK segment that "opens" | |||
"opens" the window is lost, TCP implements a "persist timer" that | the window is lost, TCP implements a "persist timer" that causes the | |||
causes the TCP sender to query the TCP receiver periodically if | TCP sender to query the TCP receiver periodically if the last segment | |||
the last segment received advertised a window of zero bytes. This | received advertised a window of zero bytes. This probe simply | |||
probe simply consists of sending one byte of new data that will | consists of sending one byte of new data that will force the TCP | |||
force the TCP receiver to send an ACK segment back to the TCP | receiver to send an ACK segment back to the TCP sender, containing | |||
sender, containing the current TCP window. Similarly to the | the current TCP window. Similarly to the retransmission timeout | |||
retransmission timeout timer, an exponential back-off is used when | timer, an exponential back-off is used when calculating the | |||
calculating the retransmission timer, so that the spacing between | retransmission timer, so that the spacing between probes increases | |||
probes increases exponentially. | exponentially. | |||
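The exponential back-off of zero-window probes can be sketched as follows. The initial interval and the upper bound used here are illustrative assumptions, not values taken from this document, and the function name is hypothetical.

```python
def persist_probe_intervals(initial_s: int, max_s: int, probes: int) -> list:
    """Return the spacing (in seconds) before each zero-window probe.

    The interval doubles after every probe; max_s is an assumed cap on
    the spacing, since real implementations bound the back-off.
    """
    intervals = []
    interval = initial_s
    for _ in range(probes):
        intervals.append(min(interval, max_s))
        interval *= 2
    return intervals


# Six probes, starting at 5 seconds and capped at 60 seconds.
schedule = persist_probe_intervals(5, 60, 6)
```

Note that the sketch bounds only the spacing between probes, not the number of probes, which mirrors the absence of a limit discussed next.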
A fundamental difference between the "persist timer" and the | A fundamental difference between the "persist timer" and the | |||
retransmission timer is that there is no limit on the amount of | retransmission timer is that there is no limit on the amount of time | |||
time during which a TCP can advertise a zero window. This means | during which a TCP can advertise a zero window. This means that a | |||
that a TCP end-point could potentially advertise a zero window | TCP end-point could potentially advertise a zero window forever, thus | |||
forever, thus keeping kernel memory at the TCP sender tied to the | keeping kernel memory at the TCP sender tied to the TCP | |||
TCP retransmission buffer. This could clearly be exploited as a | retransmission buffer. This could clearly be exploited as a vector | |||
vector for performing a Denial of Service (DoS) attack against | for performing a Denial of Service (DoS) attack against TCP, such as | |||
TCP, such as that described in Section 7.1 of this document. | that described in Section 7.1 of this document. | |||
Section 7.1 of this document describes a Denial of Service attack | Section 7.1 of this document describes a Denial of Service attack | |||
that aims at exhausting the kernel memory used for the TCP | that aims at exhausting the kernel memory used for the TCP | |||
retransmission buffer, along with possible countermeasures. | retransmission buffer, along with possible countermeasures. | |||
3.7. Checksum | 3.7. Checksum | |||
Middleboxes that process TCP segments MUST validate the Checksum | While in principle there should not be security implications arising | |||
field, and silently discard the TCP segment if such validation fails. | from the Checksum field, due to non-RFC-compliant implementations, | |||
the Checksum can be exploited to detect firewalls, evade network | ||||
DISCUSSION: | intrusion detection systems (NIDS), and/or perform Denial of Service | |||
attacks. | ||||
The Checksum field is an error detection mechanism meant for the | ||||
contents of the TCP segment and a number of important fields of | ||||
the IP header. It is computed over the TCP header and data, pre-pended | ||||
with a pseudo header that includes the IP Source Address, the IP | ||||
Destination Address, the Protocol number, and the TCP segment | ||||
length. While in principle there should not be security | ||||
implications arising from this field, due to non-RFC-compliant | ||||
implementations, the Checksum can be exploited to detect | ||||
firewalls, evade network intrusion detection systems (NIDS), | ||||
and/or perform Denial of Service attacks. | ||||
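The validation that a middle-box or NIDS would need to perform can be sketched as follows for IPv4. The function names are hypothetical; the sketch assumes the caller has already extracted the source and destination addresses from the IP header and passes the complete TCP segment (header and payload).

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over the IPv4 pseudo header followed by the TCP segment."""
    pseudo_header = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BBH", 0, 6, len(segment))  # zero, protocol 6, length
    )
    return internet_checksum(pseudo_header + segment)


def tcp_checksum_ok(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """A segment carrying a correct Checksum field sums to zero."""
    return tcp_checksum(src_ip, dst_ip, segment) == 0
```

A device following this approach would silently discard any segment for which tcp_checksum_ok() returns False before creating or updating state.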
If a stateful firewall does not check the TCP Checksum in the | If a stateful firewall does not check the TCP Checksum in the | |||
segments it processes, an attacker can exploit this situation to | segments it processes, an attacker can exploit this situation to | |||
perform a variety of attacks. For example, he could send a flood | perform a variety of attacks. For example, he could send a flood of | |||
of TCP segments with invalid checksums, which would nevertheless | TCP segments with invalid checksums, which would nevertheless create | |||
create state information at the firewall. When each of these | state information at the firewall. When each of these segments is | |||
segments is received at its intended destination, the TCP checksum | received at its intended destination, the TCP checksum will be found | |||
will be found to be incorrect, and the corresponding segment will be | to be incorrect, and the corresponding segment will be silently discarded. | |||
silently discarded. As these segments will not elicit a response | As these segments will not elicit a response (e.g., an RST segment) | |||
(e.g., an RST segment) from the intended recipients, the | from the intended recipients, the corresponding connection state | |||
corresponding connection state entries at the firewall will not be | entries at the firewall will not be removed. Therefore, an attacker | |||
removed. Therefore, an attacker may end up tying all the state | may end up tying all the state resources of the firewall to TCP | |||
resources of the firewall to TCP connections that will never | connections that will never complete or be terminated, probably | |||
complete or be terminated, probably leading to a Denial of Service | leading to a Denial of Service to legitimate users, or forcing the | |||
to legitimate users, or forcing the firewall to randomly drop | firewall to randomly drop connection state entries. | |||
connection state entries. | ||||
If a NIDS does not check the Checksum of TCP segments, an attacker | If a NIDS does not check the Checksum of TCP segments, an attacker | |||
may send TCP segments with an invalid checksum to cause the NIDS | may send TCP segments with an invalid checksum to cause the NIDS to | |||
to obtain a TCP data stream different from that obtained by the | obtain a TCP data stream different from that obtained by the system | |||
system being monitored. In order to "confuse" the NIDS, the | being monitored. In order to "confuse" the NIDS, the attacker would | |||
attacker would send TCP segments with an invalid Checksum and a | send TCP segments with an invalid Checksum and a Sequence Number that | |||
Sequence Number that would overlap the sequence number space being | would overlap the sequence number space being used for his malicious | |||
used for his malicious activity. FTester [Barisani, 2006] is a | activity. FTester [Barisani, 2006] is a tool that can be used to | |||
tool that can be used to assess NIDS on this issue. | assess NIDS on this issue. | |||
Finally, an attacker performing port-scanning could potentially | Finally, an attacker performing port-scanning could potentially | |||
exploit intermediate systems that do not check the TCP Checksum to | exploit intermediate systems that do not check the TCP Checksum to | |||
detect whether a given TCP port is being filtered by an | detect whether a given TCP port is being filtered by an intermediate | |||
intermediate firewall, or the port is actually closed by the host | firewall, or the port is actually closed by the host being port- | |||
being port-scanned. If a given TCP port appeared to be closed, | scanned. If a given TCP port appeared to be closed, the attacker | |||
the attacker would then send a SYN segment with an invalid | would then send a SYN segment with an invalid Checksum. If this | |||
Checksum. If this segment elicited a response (either an ICMP | segment elicited a response (either an ICMP error message or a TCP | |||
error message or a TCP RST segment) to this packet, then that | RST segment) to this packet, then that response should come from a | |||
response should come from a system that does not check the TCP | system that does not check the TCP checksum. Since normal host | |||
checksum. Since normal host implementations of the TCP protocol | implementations of the TCP protocol do check the TCP checksum, such a | |||
do check the TCP checksum, such a response would most likely come | response would most likely come from a firewall or some other middle- | |||
from a firewall or some other middle-box. | box. | |||
[Ed3f, 2002] describes the exploitation of the TCP checksum for | [Ed3f, 2002] describes the exploitation of the TCP checksum for | |||
performing the above activities. [US-CERT, 2005d] provides an | performing the above activities. [US-CERT, 2005d] provides an | |||
example of a TCP implementation that failed to check the TCP | example of a TCP implementation that failed to check the TCP | |||
checksum. | checksum. | |||
3.8. Urgent pointer | 3.8. Urgent pointer | |||
Segment.Size - Data Offset * 4 > 0 | Some implementations have been found to be unable to process TCP | |||
urgent indications correctly. [Myst, 1997] originally described how | ||||
If a TCP segment with the URG bit set does not pass this check, it | TCP urgent indications could be exploited to perform a Denial of | |||
MUST be silently dropped. | Service (DoS) attack against some TCP/IP implementations, usually | |||
leading to a system crash. | ||||
For TCP segments that have the URG bit set to zero, the sending TCP | ||||
SHOULD set the Urgent Pointer to zero. | ||||
A receiving TCP MUST ignore the Urgent Pointer field of TCP segments | ||||
for which the URG bit is zero. | ||||
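The urgent-indication rules above can be sketched as three small helpers. The names are hypothetical; segment_size is assumed to be the total TCP segment length in bytes and data_offset the Data Offset field in 32-bit words.

```python
from typing import Optional


def urgent_segment_acceptable(segment_size: int, data_offset: int, urg: bool) -> bool:
    """Apply the check Segment.Size - Data Offset * 4 > 0 when URG is set.

    Returns False when the segment should be silently dropped.
    """
    if not urg:
        return True
    return segment_size - data_offset * 4 > 0


def outgoing_urgent_pointer(urg: bool, urgent_pointer: int) -> int:
    """Sender side: emit a zero Urgent Pointer when URG is not set."""
    return urgent_pointer if urg else 0


def received_urgent_pointer(urg: bool, urgent_pointer: int) -> Optional[int]:
    """Receiver side: ignore the Urgent Pointer when URG is not set."""
    return urgent_pointer if urg else None
```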
DISCUSSION: | ||||
Section 3.7 of RFC 793 [Postel, 1981c] states (on page 42) that to | ||||
send an urgent indication the user must also send at least one | ||||
byte of data. | ||||
If the URG bit is zero, the Urgent Pointer is not valid, and thus | ||||
should not be processed by the receiving TCP. Nevertheless, we | ||||
recommend TCP implementations to set the Urgent Pointer to zero | ||||
when sending a TCP segment that does not have the URG bit set, and | ||||
to ignore the Urgent Pointer (as required by RFC 793) when the URG | ||||
bit is zero. | ||||
Some stacks have been known to fail to set the Urgent Pointer to | ||||
zero when the URG bit is zero, thus leaking out the corresponding | ||||
system memory contents. [Zalewski, 2008] provides further details | ||||
about this issue. | ||||
Some implementations have been found to be unable to process TCP | [draft-gont-tcpm-tcp-sanity-checks-00.txt] describes a number of | |||
urgent indications correctly. [Myst, 1997] originally described | sanity checks to be enforced on TCP segments regarding urgent | |||
how TCP urgent indications could be exploited to perform a Denial | indications. [RFC6093] deprecates the use of urgent indications in | |||
of Service (DoS) attack against some TCP/IP implementations, | new applications. | |||
usually leading to a system crash. | ||||
3.9. Options | 3.9. Options | |||
[IANA, 2007] contains the official list of the assigned option | [IANA, 2007] contains the official list of the assigned option | |||
numbers. TCP Options have been specified in the past both within the | numbers. TCP Options have been specified in the past both within the | |||
IETF and by other groups. [Hnes, 2007] contains an un-official | IETF and by other groups. [Hnes, 2007] contains an un-official | |||
updated version of the IANA list of assigned option numbers. The | updated version of the IANA list of assigned option numbers. The | |||
following table contains a summary of the assigned TCP option | following table contains a summary of the assigned TCP option | |||
numbers, which is based on [Hnes, 2007]. | numbers, which is based on [Hnes, 2007]. | |||
skipping to change at page 27, line 10 | skipping to change at page 19, line 10 | |||
o Case 2: An option-kind byte, followed by an option-length byte, | o Case 2: An option-kind byte, followed by an option-length byte, | |||
and the actual option-data bytes. | and the actual option-data bytes. | |||
In options of the Case 2 above, the option-length byte counts the | In options of the Case 2 above, the option-length byte counts the | |||
option-kind byte and the option-length byte, as well as the actual | option-kind byte and the option-length byte, as well as the actual | |||
option-data bytes. | option-data bytes. | |||
All options except "End of Option List" (Kind = 0) and "No Operation" | All options except "End of Option List" (Kind = 0) and "No Operation" | |||
(Kind = 1), are of "Case 2". | (Kind = 1), are of "Case 2". | |||
For options that belong to the "Case 2" described above, the | [draft-gont-tcpm-tcp-sanity-checks-00.txt] describes a number of | |||
following checks MUST be performed: | sanity checks that should be performed on TCP options. | |||
option-length >= 2 | ||||
option-offset + option-length <= Data Offset * 4 | ||||
Where option-offset is the offset of the first byte of the option | ||||
within the TCP header, with the first byte of the TCP header being | ||||
assigned an offset of 0. | ||||
If a TCP segment fails to pass any of these checks, it SHOULD be | ||||
silently dropped. | ||||
TCP MUST ignore unknown TCP options, provided they pass the | ||||
validation checks specified above. In the same way, middle-boxes | ||||
such as packet filters SHOULD NOT reject TCP segments containing | ||||
"unknown" TCP options that pass the validation checks described | ||||
earlier in this Section. | ||||
DISCUSSION: | ||||
The value "2" in the first equation accounts for the option-kind | ||||
byte and the option-length byte, and assumes zero bytes of option- | ||||
data. This check prevents, among other things, loops in option | ||||
processing that may arise from incorrect option lengths. | ||||
The second equation takes into account the limit on the legitimate | ||||
option length imposed by the syntax of the TCP header, and is | ||||
meant to detect forged option-length values that might make an | ||||
option overlap with the TCP payload, or even go past the actual | ||||
end of the TCP segment carrying the option. | ||||
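The two checks can be sketched as an option-walking loop. The function name is hypothetical, and returning None is used here to mean "drop the segment". The extra test for a missing option-length byte is an assumption added so that the length can be read safely; it is implied by, but not spelled out in, the second equation.

```python
from typing import List, Optional, Tuple


def parse_tcp_options(tcp_header: bytes) -> Optional[List[Tuple[int, bytes]]]:
    """Walk the TCP options, returning (kind, data) pairs or None on failure."""
    if len(tcp_header) < 20:
        return None
    header_len = (tcp_header[12] >> 4) * 4  # Data Offset * 4
    if header_len < 20 or header_len > len(tcp_header):
        return None
    options = []
    offset = 20  # option-offset of the first option
    while offset < header_len:
        kind = tcp_header[offset]
        if kind == 0:  # End of Option List: stop processing
            break
        if kind == 1:  # No Operation: single byte, no length field
            offset += 1
            continue
        if offset + 1 >= header_len:  # option-length byte would be missing
            return None
        length = tcp_header[offset + 1]
        if length < 2:  # would otherwise loop forever on length 0
            return None
        if offset + length > header_len:  # would overlap the payload
            return None
        options.append((kind, tcp_header[offset + 2:offset + length]))
        offset += length
    return options
```

Unknown option kinds are collected like any other option here, which matches the requirement that they be ignored rather than rejected once they pass validation.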
Middle-boxes such as packet filters should not reject TCP segments | ||||
containing unknown options solely because these options have not been | ||||
present in the SYN/SYN-ACK handshake. | ||||
DISCUSSION: | ||||
There is renewed interest in defining new TCP options for purposes | ||||
like improved connection management and maintenance, advanced | ||||
congestion control schemes, and security features. The evolution | ||||
of the TCP/IP protocol suite would be severely impacted by | ||||
obstacles to deploying such new protocol mechanisms. | ||||
Middle-boxes such as packet filters SHOULD NOT reject TCP segments | ||||
containing unknown options solely because these options have not been | ||||
present in the SYN/SYN-ACK handshake. | ||||
DISCUSSION: | ||||
In the past, TCP enhancements based on TCP options regularly have | ||||
specified the exchange of a specific "enabling" option during the | ||||
initial SYN/SYN-ACK handshake. Due to the severely limited TCP | ||||
option space which has already become a concern, it should be | ||||
expected that future specifications might introduce new options | ||||
not negotiated or enabled in this way. Therefore, middle-boxes | ||||
such as packet filters should not reject TCP segments containing | ||||
unknown options solely because these options have not been present | ||||
in the SYN/SYN-ACK handshake. | ||||
TCP MUST NOT "echo" in any way unknown TCP options received in | ||||
inbound TCP segments. | ||||
DISCUSSION: | ||||
Some TCP implementations have been known to "echo" unknown TCP | ||||
options received in incoming segments. Here we stress that TCP | ||||
must not "echo" in any way unknown TCP options received in inbound | ||||
TCP segments. This is at the foundation for the introduction of | ||||
new TCP options, ensuring unambiguous behavior of systems not | ||||
supporting a new specification. | ||||
Section 4 discusses the security implications of common TCP options. | Section 4 discusses the security implications of common TCP options. | |||
3.10. Padding | 3.10. Padding | |||
The TCP header padding is used to ensure that the TCP header ends and | The TCP header padding is used to ensure that the TCP header ends and | |||
data begins on a 32-bit boundary. The padding is composed of zeros. | data begins on a 32-bit boundary. The padding is composed of zeros. | |||
3.11. Data | 3.11. Data | |||
The data field contains the upper-layer packet being transmitted by | The data field contains the upper-layer packet being transmitted by | |||
means of TCP. This payload is processed by the application process | means of TCP. This payload is processed by the application process | |||
making use of the transport services of TCP. Therefore, the security | making use of the transport services of TCP. Therefore, the security | |||
implications of this field are out of the scope of this document. | implications of this field are out of the scope of this document. | |||
4. Common TCP Options | 4. Common TCP Options | |||
4.1. End of Option List (Kind = 0) | 4.1. End of Option List (Kind = 0) | |||
TCP implementations MUST be able to gracefully handle those TCP | This option indicates the "End of Options". As noted in | |||
segments in which the End of Option List should have been present, | [draft-gont-tcpm-tcp-sanity-checks-00.txt], some implementations pad | |||
but is missing. | the end of options with "No Operation" options rather than including | |||
an "End of Options List" option. | ||||
DISCUSSION: | ||||
This option is used to indicate the "end of options" in those | ||||
cases in which the end of options would not coincide with the end | ||||
of the TCP header. | ||||
TCP implementations are required to ignore those options they do | ||||
not implement, and to be able to handle options with illegal | ||||
lengths. Therefore, TCP implementations should be able to | ||||
gracefully handle those TCP segments in which the End of Option | ||||
List should have been present, but is missing. | ||||
It is interesting to note that some TCP implementations do not use | ||||
the "End of Option List" option for indicating the "end of | ||||
options", but simply pad the TCP header with several "No | ||||
Operation" (Kind = 1) options to meet the header length specified | ||||
by the Data Offset header field. | ||||
4.2. No Operation (Kind = 1) | 4.2. No Operation (Kind = 1) | |||
The no-operation option is basically used to allow the sending system | The no-operation option is basically used to allow the sending system | |||
to align subsequent options in, for example, 32-bit boundaries. | to align subsequent options in, for example, 32-bit boundaries. | |||
This option does not have any known security implications. | This option does not have any known security implications. | |||
4.3. Maximum Segment Size (Kind = 2) | 4.3. Maximum Segment Size (Kind = 2) | |||
The Maximum Segment Size (MSS) option is used to indicate to the | The Maximum Segment Size (MSS) option is used to indicate to the | |||
remote TCP endpoint the maximum segment size this TCP is willing to | remote TCP endpoint the maximum segment size this TCP is willing to | |||
receive. | receive. | |||
The following check MUST be performed on a TCP segment that carries an | The MSS option has been employed for performing DoS attacks, by | |||
MSS option: | advertising very small MSS values thus greatly increasing the packet- | |||
rate used by the victim system. | ||||
SYN == 1 | [draft-gont-tcpm-tcp-sanity-checks-00.txt] describes this issue, and | |||
proposes sanity checks to mitigate it. | ||||
If the segment does not pass this check, it MUST be silently dropped. | ||||
DISCUSSION: | ||||
As stated in Section 3.1 of RFC 793 [Postel, 1981c], this option | ||||
can only be sent in the initial connection request (i.e., in | ||||
segments with the SYN control bit set). | ||||
TCP MUST check that the option length is 4. If the option does not | ||||
pass this check, it MUST be dropped. | ||||
The received MSS SHOULD be sanitized as follows: | ||||
Sanitized_MSS = max(MSS, 536) | ||||
This "sanitized" MSS value SHOULD be used to compute the "effective | ||||
send MSS" by the expression included in Section 4.2.2.6 of RFC 1122 | ||||
[Braden, 1989], as follows: | ||||
Eff.snd.MSS = min(Sanitized_MSS+20, MMS_S) - TCPhdrsize - IPoptionsize | ||||
where: | ||||
Sanitized_MSS: | ||||
sanitized MSS value (the value received in the MSS option, with an | ||||
enforced minimum value) | ||||
MMS_S: | ||||
maximum size for a transport-layer message that TCP may send | ||||
TCPhdrsize: | ||||
size of the TCP header, which typically was 20, but may be larger | ||||
if TCP options are to be sent. | ||||
IPoptionsize | ||||
size of any IP options that TCP will pass to the IP layer with the | ||||
current message. | ||||
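The sanitization step and the RFC 1122 expression above can be combined in a short sketch. The function names and default arguments are hypothetical; the example value of 1480 for MMS_S assumes an Ethernet MTU of 1500 bytes and a 20-byte IPv4 header.

```python
MIN_MSS = 536  # enforced lower limit on the received MSS


def sanitize_mss(received_mss: int) -> int:
    """Sanitized_MSS = max(MSS, 536)."""
    return max(received_mss, MIN_MSS)


def effective_send_mss(received_mss: int, mms_s: int,
                       tcp_hdr_size: int = 20, ip_option_size: int = 0) -> int:
    """Eff.snd.MSS = min(Sanitized_MSS+20, MMS_S) - TCPhdrsize - IPoptionsize."""
    return (min(sanitize_mss(received_mss) + 20, mms_s)
            - tcp_hdr_size - ip_option_size)


# A peer advertising an MSS of 1 byte is treated as if it had sent 536.
tiny = effective_send_mss(1, 1480)
# A typical Ethernet peer, with 12 bytes of TCP options in each segment.
typical = effective_send_mss(1460, 1480, tcp_hdr_size=32)
```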
DISCUSSION: | ||||
The advertised maximum segment size may be the result of the | ||||
consideration of a number of factors. Firstly, if fragmentation | ||||
is employed, the size of the IP reassembly buffer may impose a | ||||
limit on the maximum TCP segment size that can be received. | ||||
Considering that the minimum IP reassembly buffer size is 576 | ||||
bytes, if an MSS option is not included in the | ||||
connection-establishment phase, an MSS of 536 bytes should be assumed. | ||||
Secondly, if Path-MTU Discovery (specified in RFC 1191 [Mogul and | ||||
Deering, 1990] and RFC 1981 [McCann et al, 1996]) is expected to | ||||
be used for the connection, an artificial maximum segment size may | ||||
be enforced by a TCP to prevent the remote peer from sending TCP | ||||
segments which would be too large to be transmitted without | ||||
fragmentation. Finally, a system connected by a low-speed link | ||||
may choose to introduce an artificial maximum segment size to | ||||
enforce an upper limit on the network latency that would otherwise | ||||
negatively affect its interactive applications [Stevens, 1994]. | ||||
The TCP specifications do not impose any requirements on the | ||||
maximum segment size value that is included in the MSS option. | ||||
However, there are a number of values that may cause undesirable | ||||
results. Firstly, an MSS of 0 could possibly "freeze" the TCP | ||||
connection, as it would not allow data to be included in the | ||||
payload of the TCP segments. Secondly, low values other than 0 | ||||
would degrade the performance of the TCP connection (wasting more | ||||
bandwidth in protocol headers than in actual data), and could | ||||
potentially exhaust processing cycles at the sending TCP and/or | ||||
the receiving TCP by producing an increase in the interrupt rate | ||||
caused by the transmitted (or received) packets. | ||||
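The effect of low MSS values on packet rate and header overhead can be estimated with a rough model. This sketch assumes 40 bytes of headers per packet (a 20-byte IPv4 header and a 20-byte TCP header, with no options); the function names are hypothetical.

```python
from fractions import Fraction

HEADER_BYTES = 40  # assumed: 20-byte IPv4 header + 20-byte TCP header


def packets_needed(payload_bytes: int, mss: int) -> int:
    """Number of segments required to carry payload_bytes at a given MSS."""
    if mss <= 0:
        raise ValueError("an MSS of 0 cannot carry any data")
    return -(-payload_bytes // mss)  # ceiling division


def header_overhead(mss: int) -> Fraction:
    """Fraction of each full-sized packet that is protocol headers."""
    return Fraction(HEADER_BYTES, HEADER_BYTES + mss)


# Sending 1,000,000 bytes with a normal MSS and with an MSS of 1 byte.
normal = packets_needed(1_000_000, 1460)
hostile = packets_needed(1_000_000, 1)
```

Under this model an MSS of 1 byte turns the same transfer into a million packets, each of which is almost entirely headers.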
The problems that might arise from low MSS values were first | ||||
described by [Reed, 2001]. However, the community did not reach | ||||
consensus on how to deal with these issues at that point. | ||||
RFC 791 [Postel, 1981a] requires IP implementations to be able to | ||||
receive IP datagrams of at least 576 bytes. Assuming an IPv4 | ||||
header of 20 bytes, and a TCP header of 20 bytes, there should be | ||||
room in each IP packet for 536 application data bytes. | ||||
There are two cases to analyze when considering the possible | ||||
interoperability impact of sanitizing the received MSS value: TCP | ||||
connections relying on IP fragmentation and TCP connections | ||||
implementing Path-MTU Discovery. In case the corresponding TCP | ||||
connection relies on IP fragmentation, given that the minimum | ||||
reassembly buffer size is required to be 576 bytes by RFC 791 | ||||
[Postel, 1981a], the adoption of 536 bytes as a lower limit is | ||||
safe. | ||||
In case the TCP connection relies on Path-MTU Discovery, imposing | ||||
a lower limit on the adopted MSS may ignore the advice of the | ||||
remote TCP on the maximum segment size that can possibly be | ||||
transmitted without fragmentation. As a result, this could lead | ||||
to the first TCP data segment to be larger than the Path-MTU. | ||||
However, in such a scenario, the TCP segment should elicit an ICMP | ||||
Unreachable "fragmentation needed and DF bit set" error message | ||||
that would cause the "effective send MSS" (E_MSS) to be decreased | ||||
appropriately. Thus, imposing a lower limit on the accepted MSS | ||||
will not cause any interoperability problems. | ||||
A possible scenario exists in which the proposed enforcement of a | ||||
lower limit in the received MSS might lead to an interoperability | ||||
problem. If a system was attached to the network by means of a | ||||
link with an MTU of less than 576 bytes, and there was some | ||||
intermediate system that either silently dropped (i.e., without | ||||
sending an ICMP error message) those packets equal to or larger | ||||
than 576 bytes, or simply filtered | ||||
ICMP "fragmentation needed and DF bit set" error messages, the | ||||
proposed behavior would lead to an interoperability problem, | ||||
when communication could have otherwise succeeded. However, the | ||||
interoperability problem would really be introduced by the network | ||||
setup (e.g., the middle-box silently dropping packets), rather | ||||
than by the mechanism proposed in this section. In any case, TCP | ||||
should nevertheless implement a mechanism such as that specified | ||||
by RFC 4821 [Mathis and Heffner, 2007] to deal with this type of | ||||
"network black-holes". | ||||
4.4. Selective Acknowledgement Option | 4.4. Selective Acknowledgement Option | |||
The Selective Acknowledgement option provides an extension to allow | The Selective Acknowledgement option provides an extension to allow | |||
the acknowledgement of individual segments, to enhance TCP's loss | the acknowledgement of individual segments, to enhance TCP's loss | |||
recovery. | recovery. | |||
Two options are involved in the SACK mechanism. The "Sack-permitted | Two options are involved in the SACK mechanism. The "Sack-permitted | |||
option" is sent during the connection-establishment phase, to | option" is sent during the connection-establishment phase, to | |||
advertise that SACK is supported. If both TCP peers agree to use | advertise that SACK is supported. If both TCP peers agree to use | |||
selective acknowledgements, the actual selective acknowledgements are | selective acknowledgements, the actual selective acknowledgements are | |||
sent, if needed, by means of "SACK options". | sent, if needed, by means of "SACK options". | |||
4.4.1. SACK-permitted Option (Kind = 4) | 4.4.1. SACK-permitted Option (Kind = 4) | |||
The SACK-permitted option is meant to advertise that the TCP sending | [draft-gont-tcpm-tcp-sanity-checks-00.txt] to be performed on this | |||
this segment supports Selective Acknowledgements. | option. | |||
The following check MUST be performed on a TCP segment that carries a | ||||
SACK-permitted option: | ||||
SYN == 1 | ||||
If a segment does not pass this check, it MUST be silently dropped. | ||||
DISCUSSION: | ||||
The SACK-permitted option can be sent only in SYN segments. | ||||
TCP MUST check that the option length is 2. If the option does not | ||||
pass this check it MUST be silently dropped. | ||||
4.4.2. SACK Option (Kind = 5) | 4.4.2. SACK Option (Kind = 5) | |||
The SACK option is used to convey extended acknowledgment information | The TCP receiving a SACK option is expected to keep track of the | |||
from the receiver to the sender over an established TCP connection. | selectively-acknowledged blocks. Even when space in the TCP header | |||
The option consists of an option-kind byte (which must be 5), an | is limited (and thus each TCP segment can selectively-acknowledge at | |||
option-length byte, and a variable number of SACK blocks. | most four blocks of data), an attacker could try to perform a buffer | |||
overflow or a resource-exhaustion attack by sending a large number of | ||||
TCP MUST silently discard those TCP segments carrying a SACK option | SACK options. | |||
that does not pass the following check: | ||||
option-offset + option-length <= Data Offset * 4 | ||||
TCP MUST silently discard those TCP segments carrying a SACK option | ||||
that does not pass the following check: | ||||
option-length >= 10 | ||||
DISCUSSION: | ||||
A SACK Option with zero SACK blocks is nonsensical. The value | ||||
"10" accounts for the option-kind byte, the option-length byte, a | ||||
4-byte left-edge field, and a 4-byte right-edge field. | ||||
TCP MUST silently discard those TCP segments carrying a SACK option | ||||
that does not pass the following check: | ||||
(option-length - 2) % 8 == 0 | ||||
DISCUSSION: | ||||
As stated in Section 3 of RFC 2018 [Mathis et al, 1996], a SACK | ||||
option that specifies n blocks will have a length of 8*n+2. | ||||
TCP MUST silently discard those TCP segments carrying a SACK option | ||||
that contains a SACK block that does not pass the following check: | ||||
Left Edge of Block < Right Edge of Block | ||||
As in all the other occurrences in this document, all comparisons | ||||
between sequence numbers should be performed using sequence number | ||||
arithmetic. | ||||
DISCUSSION: | ||||
Each block included in a SACK option represents a number of | ||||
received data bytes that are contiguous and isolated; that is, the | ||||
bytes just below the block, (Left Edge of Block - 1), and just | ||||
above the block, (Right Edge of Block), have not yet been | ||||
received. | ||||
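The SACK option checks listed above can be sketched together in a single validation routine. The function names and the way the option bytes are passed in are assumptions made for illustration only; "option_offset" is taken here to be the byte offset of the option from the start of the TCP header.

```python
import struct


def seq_lt(a: int, b: int) -> bool:
    """True if sequence number a precedes b, using modulo-2**32 arithmetic."""
    return a != b and ((b - a) & 0xFFFFFFFF) < 0x80000000


def sack_option_is_valid(option: bytes, option_offset: int,
                         data_offset: int) -> bool:
    """Apply the sanity checks for a SACK option (Kind = 5).

    option        -- raw option bytes, starting at the option-kind byte
    option_offset -- assumed byte offset of the option within the TCP header
    data_offset   -- TCP Data Offset field (in 32-bit words)
    """
    if len(option) < 2 or option[0] != 5:
        return False
    option_length = option[1]
    # The option must fit inside the TCP header.
    if option_offset + option_length > data_offset * 4:
        return False
    # At least one SACK block: kind + length + left edge + right edge.
    if option_length < 10:
        return False
    # n blocks give a length of 8*n + 2.
    if (option_length - 2) % 8 != 0:
        return False
    if len(option) < option_length:
        return False
    # Each block must satisfy Left Edge < Right Edge (sequence arithmetic).
    for i in range(2, option_length, 8):
        left, right = struct.unpack_from("!II", option, i)
        if not seq_lt(left, right):
            return False
    return True
```

A segment whose SACK option fails any of these checks would be silently discarded by the caller.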
TCP MUST enforce a limit on the number of SACK blocks that a TCP will | ||||
store in memory for each connection at any time. | ||||
DISCUSSION: | ||||
The TCP receiving a SACK option is expected to keep track of the | ||||
selectively-acknowledged blocks. Even when space in the TCP | ||||
header is limited (and thus each TCP segment can selectively- | ||||
acknowledge at most four blocks of data), an attacker could try to | ||||
perform a buffer overflow or a resource-exhaustion attack by | ||||
sending a large number of SACK options. | ||||
For example, an attacker could send a large number of SACK | For example, an attacker could send a large number of SACK options, | |||
options, each of them acknowledging one byte of data. | each of them acknowledging one byte of data. Additionally, for the | |||
Additionally, for the purpose of wasting resources on the attacked | purpose of wasting resources on the attacked system, each of these | |||
system, each of these blocks would be separated from each other by | blocks would be separated from each other by one byte, to prevent the | |||
one byte, to prevent the attacked system from coalescing two (or | attacked system from coalescing two (or more) contiguous SACK blocks | |||
more) contiguous SACK blocks into a single SACK block. If the | into a single SACK block. If the attacked system kept track of each | |||
attacked system kept track of each SACKed block by storing both | SACKed block by storing both the Left Edge and the Right Edge of the | |||
the Left Edge and the Right Edge of the block, then for each | block, then for each window of data, the attacker could waste up to 4 | |||
window of data, the attacker could waste up to 4 * Window bytes of | * Window bytes of memory at the attacked TCP. | |||
memory at the attacked TCP. | ||||
The value "4 * Window" results from the expression "(Window / 2) * | The value "4 * Window" results from the expression "(Window / 2) * | |||
8", in which the value "2" accounts for the 1-byte block | 8", in which the value "2" accounts for the 1-byte block | |||
selectively-acknowledged by each SACK block and 1 byte that would | selectively-acknowledged by each SACK block and 1 byte that would | |||
be used to separate the SACK blocks from each other, and the | be used to separate the SACK blocks from each other, and the | |||
value "8" accounts for the 8 bytes needed to store the Left Edge | value "8" accounts for the 8 bytes needed to store the Left Edge | |||
and the Right Edge of each SACKed block. | and the Right Edge of each SACKed block. | |||
Therefore, it is clear that a limit should be imposed on the | [draft-gont-tcpm-tcp-sanity-checks-00.txt] describes sanity checks to | |||
number of SACK blocks that a TCP will store in memory for each | be performed on this option such that this and other possible issues | |||
connection at any time. Measurements in [Dharmapurikar and | are mitigated. | |||
Paxson, 2005] indicate that in the vast majority of cases | ||||
connections have a single hole in the data stream at any given | ||||
time. Thus, a limit of 16 SACK blocks for each connection would | ||||
handle even most of the more unusual cases in which there is more | ||||
than one simultaneous hole at a time. | ||||
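The "4 * Window" estimate and the effect of a per-connection limit can be illustrated with a small sketch. The class and function names are hypothetical, the limit of 16 is the value suggested above, and plain integer comparison is used instead of sequence number arithmetic to keep the example short.

```python
MAX_SACK_BLOCKS = 16  # per-connection limit suggested in the text


def worst_case_bytes(window: int) -> int:
    """Memory an attacker could tie up without a limit: (Window / 2) * 8."""
    return (window // 2) * 8


class SackScoreboard:
    """Bounded store of SACKed blocks as (left, right) pairs.

    Simplification: edges are compared as plain integers, so sequence
    number wrap-around is not handled in this sketch.
    """

    def __init__(self, limit: int = MAX_SACK_BLOCKS):
        self.limit = limit
        self.blocks = []  # sorted, non-overlapping, non-contiguous

    def add(self, left: int, right: int) -> bool:
        """Record a block, coalescing it with overlapping or contiguous
        blocks. Returns False (and stores nothing) if the limit would be
        exceeded."""
        merged = []
        for l, r in self.blocks:
            if r < left or right < l:
                merged.append((l, r))          # disjoint: keep as is
            else:
                left, right = min(l, left), max(r, right)  # coalesce
        merged.append((left, right))
        merged.sort()
        if len(merged) > self.limit:
            return False
        self.blocks = merged
        return True
```

With a 65535-byte window, worst_case_bytes() yields 262136 bytes per window of data, whereas the bounded scoreboard never holds more than 16 pairs of edges regardless of how many one-byte blocks the peer reports.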
4.5. MD5 Option (Kind=19) | 4.5. MD5 Option (Kind=19) | |||
The TCP MD5 option provides a mechanism for authenticating TCP | The TCP MD5 option provides a mechanism for authenticating TCP | |||
segments with a 16-byte digest produced by the MD5 algorithm. The | segments with a 16-byte digest produced by the MD5 algorithm. The | |||
option consists of an option-kind byte (which must be 19), an option- | option consists of an option-kind byte (which must be 19), an option- | |||
length byte (which must be 18), and a 16-byte MD5 digest. | length byte (which must be 18), and a 16-byte MD5 digest. | |||
TCP MUST silently drop a TCP segment that carries a TCP MD5 option | A basic weakness of the TCP MD5 option is that the MD5 algorithm | |||
that does not pass the following checks: | itself has been known (for a long time) to be vulnerable to collision | |||
search attacks. | ||||
option-offset + option-length <= Data Offset * 4 | ||||
option-length == 18 | ||||
DISCUSSION: | ||||
The TCP MD5 option is of "Case 2", and has a fixed length. | ||||
DISCUSSION: | ||||
A basic weakness of the TCP MD5 option is that the MD5 algorithm | ||||
itself has been known (for a long time) to be vulnerable to | ||||
collision search attacks. | ||||
[Bellovin, 2006] argues that it has two other weaknesses, namely | [Bellovin, 2006] argues that it has two other weaknesses, namely that | |||
that it does not provide a key identifier, and that it has no | it does not provide a key identifier, and that it has no provision | |||
provision for automated key management. However, it is generally | for automated key management. However, it is generally accepted that | |||
accepted that while a Key-ID field can be a good approach for | while a Key-ID field can be a good approach for providing smooth key | |||
providing smooth key rollover, it is not actually a requirement. | rollover, it is not actually a requirement. For instance, most | |||
For instance, most systems implementing the TCP MD5 option include | systems implementing the TCP MD5 option include a "keychain" | |||
a "keychain" mechanism that fully supports smooth key rollover. | mechanism that fully supports smooth key rollover. Additionally, | |||
Additionally, with some further work, ISAKMP/IKE could be used to | with some further work, ISAKMP/IKE could be used to configure the MD5 | |||
configure the MD5 keys. | keys. | |||
It is interesting to note that while the TCP MD5 option, as | It is interesting to note that while the TCP MD5 option, as specified | |||
specified by RFC 2385 [Heffernan, 1998], addresses the TCP-based | by RFC 2385 [Heffernan, 1998], addresses the TCP-based forgery | |||
forgery attacks against TCP discussed in Section 11, it does not | attacks against TCP discussed in Section 11, it does not address the | |||
address the ICMP-based connection-reset attacks discussed in | ICMP-based connection-reset attacks discussed in Section 15. As a | |||
Section 15. As a result, while a TCP connection may be protected | result, while a TCP connection may be protected from TCP-based | |||
from TCP-based forgery attacks by means of the MD5 option, an | forgery attacks by means of the MD5 option, an attacker might still | |||
attacker might still be able to successfully perform the ICMP- | be able to successfully perform the ICMP-based counter-part. | |||
based counter-part. | ||||
The TCP MD5 option has been obsoleted by the TCP-AO. | The TCP MD5 option has been obsoleted by the TCP-AO. | |||
4.6. Window scale option (Kind = 3) | 4.6. Window scale option (Kind = 3) | |||
The window scale option provides a mechanism to expand the definition | The window scale option provides a mechanism to expand the definition | |||
of the TCP window to 32 bits, such that the performance of TCP can be | of the TCP window to 32 bits, such that the performance of TCP can be | |||
improved in some network scenarios. The Window scale option consists | improved in some network scenarios. The Window scale option consists | |||
of an option-kind byte (which must be 3), followed by an option- | of an option-kind byte (which must be 3), followed by an option- | |||
length byte (which must be 3), and a shift count (shift.cnt) byte | length byte (which must be 3), and a shift count (shift.cnt) byte | |||
(the actual option-data). | (the actual option-data). | |||
The option may be sent only in the initial SYN segment, but may also | While there are no known security implications arising from the | |||
be sent in a SYN/ACK segment if the option was received in the | window scale mechanism itself, the size of the TCP window has a | |||
initial SYN segment. If the option is received in any other segment, | number of security implications. In general, larger window sizes | |||
it MUST be silently dropped. | increase the chances of an attacker successfully performing | |||
forgery attacks against TCP, such as those described in Section 11 of | ||||
TCP MUST silently discard TCP segments that contain a Window scale | this document. Additionally, large windows can exacerbate the impact | |||
option whose option-length is not 3. | of resource exhaustion attacks such as those described in Section 7 | |||
of this document. | ||||
DISCUSSION: | ||||
This option has a fixed length. | ||||
TCP MUST silently discard TCP segments that contain a Window scale | ||||
option that does not pass the following check: | ||||
shift.cnt <= 14 | ||||
DISCUSSION: | ||||
As discussed in Section 2.3 of RFC 1323 [Jacobson et al, 1992], in | ||||
order to prevent new data from being mistakenly considered as old | ||||
and vice versa, the resulting window should be smaller than 2^30. | ||||
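The Window scale checks above can be sketched as a single predicate. The function names and the boolean "syn" argument are assumptions made for illustration. The second helper shows why a shift count of 14 is the largest useful value: the largest 16-bit window field shifted by 14 is 65535 * 2^14, which stays below 2^30.

```python
MAX_SHIFT_CNT = 14  # upper bound on shift.cnt from RFC 1323


def window_scale_is_valid(option: bytes, syn: bool) -> bool:
    """Check a Window scale option (Kind = 3).

    option -- raw option bytes: kind, length, shift.cnt
    syn    -- True if the carrying segment has the SYN bit set
    """
    return (
        syn                              # only legal in SYN or SYN/ACK
        and len(option) == 3
        and option[0] == 3               # option-kind
        and option[1] == 3               # fixed option-length
        and option[2] <= MAX_SHIFT_CNT   # shift.cnt <= 14
    )


def scaled_window(window_field: int, shift_cnt: int) -> int:
    """Effective window obtained from the 16-bit field and the shift count."""
    return window_field << shift_cnt
```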
DISCUSSION: | ||||
[Welzl, 2008] describes major problems with the use of the Window | ||||
scale option in the Internet due to faulty equipment. | ||||
While there are no known security implications arising from the | ||||
window scale mechanism itself, the size of the TCP window has a | ||||
number of security implications. In general, larger window sizes | ||||
increase the chances of an attacker successfully performing | ||||
forgery attacks against TCP, such as those described in Section 11 | ||||
of this document. Additionally, large windows can exacerbate the | ||||
impact of resource exhaustion attacks such as those described in | ||||
Section 7 of this document. | ||||
Section 3.7 provides a general discussion of the security | Section 3.7 provides a general discussion of the security | |||
implications of the TCP window size. Section 7.3.2 discusses the | implications of the TCP window size. Section 7.3.2 discusses the | |||
security implications of Automatic receive-buffer tuning | security implications of Automatic receive-buffer tuning mechanisms. | |||
mechanisms. | ||||
4.7. Timestamps option (Kind = 8) | 4.7. Timestamps option (Kind = 8) | |||
The Timestamps option, specified in RFC 1323 [Jacobson et al, 1992], | The Timestamps option, specified in RFC 1323 [Jacobson et al, 1992], | |||
is used to perform two functions: Round-Trip Time Measurement (RTTM), | is used to perform two functions: Round-Trip Time Measurement (RTTM), | |||
and Protection Against Wrapped Sequence Numbers (PAWS). | and Protection Against Wrapped Sequence Numbers (PAWS). | |||
TCP MUST silently discard TCP segments that contain a Timestamps | ||||
option that does not pass the following check: | ||||
option-length == 10 | ||||
DISCUSSION: | ||||
As specified by RFC 1323, the option-length must be 10. | ||||
4.7.1. Generation of timestamps | 4.7.1. Generation of timestamps | |||
TCP SHOULD generate timestamps with the following expression: | For the purpose of PAWS, the timestamps sent on a connection are | |||
required to be monotonically increasing. While there is no | ||||
timestamp = T() + F(localhost, localport, remotehost, remoteport, secret_key) | requirement that timestamps are monotonically increasing across TCP | |||
connections, the generation of timestamps such that they are | ||||
where the result of T() is a global system clock that complies with | monotonically increasing across connections between the same two | |||
the requirements of Section 4.2.2 of RFC 1323 [Jacobson et al, 1992], | endpoints allows the use of timestamps for improving the handling of | |||
and F() is a function that should not be computable from the outside. | SYN segments that are received while the corresponding four-tuple is | |||
Therefore, we suggest F() to be a cryptographic hash function of the | in the TIME-WAIT state. This is discussed in Section 11.1.2 of this | |||
connection-id and some secret data. | document. | |||
DISCUSSION: | ||||
For the purpose of PAWS, the timestamps sent on a connection are | ||||
required to be monotonically increasing. While there is no | ||||
requirement that timestamps are monotonically increasing across | ||||
TCP connections, the generation of timestamps such that they are | ||||
monotonically increasing across connections between the same two | ||||
endpoints allows the use of timestamps for improving the handling | ||||
of SYN segments that are received while the corresponding four- | ||||
tuple is in the TIME-WAIT state. This is discussed in Section | ||||
11.1.2 of this document. | ||||
F() provides an offset that will be the same for all incarnations | ||||
of a connection between the same two endpoints, while T() provides | ||||
the monotonically increasing values that are needed for PAWS. | ||||
Further discussion about this algorithm is available in | ||||
[I-D.gont-timestamps-generation]. | ||||
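A minimal sketch of the expression timestamp = T() + F(...) follows. The choice of HMAC-SHA-256 for F(), the 1 ms clock tick for T(), and the way the four-tuple is serialized are all assumptions made for illustration; the text only requires that F() not be computable from the outside and that T() comply with RFC 1323.

```python
import hashlib
import hmac
import os
import time

# Assumption: a per-boot random secret; a real system might persist or rotate it.
SECRET_KEY = os.urandom(16)


def F(localhost: str, localport: int, remotehost: str, remoteport: int,
      secret_key: bytes) -> int:
    """Per-four-tuple offset that is not computable without the secret."""
    msg = f"{localhost}|{localport}|{remotehost}|{remoteport}".encode()
    digest = hmac.new(secret_key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def T() -> int:
    """Global monotonic clock, here ticking once per millisecond."""
    return int(time.monotonic() * 1000) & 0xFFFFFFFF


def timestamp(localhost: str, localport: int, remotehost: str,
              remoteport: int, secret_key: bytes = SECRET_KEY) -> int:
    """32-bit TSval: clock value plus the connection-specific offset."""
    offset = F(localhost, localport, remotehost, remoteport, secret_key)
    return (T() + offset) & 0xFFFFFFFF
```

Because F() depends only on the four-tuple and the secret, every incarnation of a connection between the same two endpoints gets the same offset, while different connections start from unrelated values.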
TCP SHOULD NOT initialize a global timestamp counter to a fixed value | ||||
when the system is bootstrapped. | ||||
DISCUSSION: | ||||
Some implementations are known to initialize their global | ||||
timestamp clock to zero when the system is bootstrapped. This is | ||||
undesirable, as the timestamp clock would disclose the system | ||||
uptime. | ||||
TCP SHOULD set the Timestamp Echo Reply (TSecr) field to zero when | ||||
sending a TCP segment that does not have the ACK bit set (i.e., a SYN | ||||
segment). | ||||
DISCUSSION: | ||||
Some TCP implementations have been found to fail to set the | Some implementations are known to initialize their global timestamp | |||
Timestamp Echo Reply field (TSecr) to zero in TCP segments that do | clock to zero when the system is bootstrapped. This is undesirable, | |||
not have the ACK bit set, thus potentially leaking information. | as the timestamp clock would disclose the system uptime. | |||
[I-D.gont-timestamps-generation] discusses the generation of TCP | ||||
timestamps in detail. | ||||
4.7.2. Vulnerabilities | 4.7.2. Vulnerabilities | |||
Blind In-Window Attacks | Blind In-Window Attacks | |||
Segments that contain a timestamp option smaller than the last | Segments that contain a timestamp option smaller than the last | |||
timestamp option recorded by TCP are silently dropped. This allows | timestamp option recorded by TCP are silently dropped. This allows | |||
for a subtle attack against TCP that would allow an attacker to cause | for a subtle attack against TCP that would allow an attacker to cause | |||
one direction of data transfer of the attacked connection to freeze | one direction of data transfer of the attacked connection to freeze | |||
[US-CERT, 2005c]. An attacker could forge a TCP segment that | [US-CERT, 2005c]. An attacker could forge a TCP segment that | |||
skipping to change at page 40, line 7 | skipping to change at page 24, line 14 | |||
proposes mitigations for this and other issues. | proposes mitigations for this and other issues. | |||
5. Connection-establishment mechanism | 5. Connection-establishment mechanism | |||
The following subsections describe a number of attacks that can be | The following subsections describe a number of attacks that can be | |||
performed against TCP by exploiting its connection-establishment | performed against TCP by exploiting its connection-establishment | |||
mechanism. | mechanism. | |||
5.1. SYN flood | 5.1. SYN flood | |||
TCP SHOULD implement (and enable by default) a syn-cache [Lemon, | TCP uses a mechanism known as the "three-way handshake" for the | |||
2002]. | establishment of a connection between two TCP peers. RFC 793 | |||
[RFC0793] states that when a TCP that is in the LISTEN state receives | ||||
TCP SHOULD implement syn-cookies, and SHOULD enable them only after a | a SYN segment (i.e., a TCP segment with the SYN flag set), it must | |||
specified number of TCBs has been allocated for connections in the | transition to the SYN-RECEIVED state, record the control information | |||
SYN-RECEIVED state. | (e.g., the ISN) contained in the SYN segment in a Transmission | |||
Control Block (TCB), and respond with a SYN/ACK segment. | ||||
DISCUSSION: | ||||
TCP uses a mechanism known as the "three-way handshake" for the | ||||
establishment of a connection between two TCP peers. RFC 793 | ||||
[Postel, 1981c] states that when a TCP that is in the LISTEN state | ||||
receives a SYN segment (i.e., a TCP segment with the SYN flag | ||||
set), it must transition to the SYN-RECEIVED state, record the | ||||
control information (e.g., the ISN) contained in the SYN segment | ||||
in a Transmission Control Block (TCB), and respond with a SYN/ACK | ||||
segment. | ||||
A Transmission Control Block is the data structure used to store | A Transmission Control Block is the data structure used to store | |||
(usually within the kernel) all the information relevant to a TCP | (usually within the kernel) all the information relevant to a TCP | |||
connection. The concept of "TCB" is introduced in the core TCP | connection. The concept of "TCB" is introduced in the core TCP | |||
specification RFC 793 [Postel, 1981c]. | specification RFC 793 [RFC0793]. | |||
In practice, virtually all existing implementations do not modify | In practice, virtually all existing implementations do not modify the | |||
the state of the TCP that was in the LISTEN state, but rather | state of the TCP that was in the LISTEN state, but rather create a | |||
create a new TCP (i.e., a new "protocol machine"), and perform all | new TCP (i.e., a new "protocol machine"), and perform all the state | |||
the state transitions on this newly-created TCP. This allows the | transitions on this newly-created TCP. This allows the application | |||
application running on top of TCP to service more than one | running on top of TCP to service more than one client at the same | |||
client at the same time. As a result, each connection request | time. As a result, each connection request results in the allocation | |||
results in the allocation of system memory to store the TCB | of system memory to store the TCB associated with the newly created | |||
associated with the newly created TCP. | TCP. | |||
If TCP was implemented strictly as described in RFC 793, the | If TCP was implemented strictly as described in RFC 793, the | |||
application running on top of TCP would have to finish servicing | application running on top of TCP would have to finish servicing the | |||
the current client before being able to service the next one in | current client before being able to service the next one in line, or | |||
line, or should instead be able to perform some kind of connection | should instead be able to perform some kind of connection hand-off. | |||
hand-off. | ||||
An attacker could exploit TCP's connection-establishment mechanism | An attacker could exploit TCP's connection-establishment mechanism to | |||
to perform a Denial of Service (DoS) attack, by sending a large | perform a Denial of Service (DoS) attack, by sending a large number | |||
number of connection requests to the target system, with the | of connection requests to the target system, with the intent of | |||
intent of exhausting the system memory destined for storing TCBs | exhausting the system memory destined for storing TCBs (or related | |||
(or related kernel data structures), thus preventing the attacked | kernel data structures), thus preventing the attacked system from | |||
system from establishing new connections with legitimate users. | establishing new connections with legitimate users. This attack is | |||
This attack is widely known as "SYN flood", and has received a lot | widely known as "SYN flood", and has received a lot of attention | |||
of attention during the late 90's [CERT, 1996]. | during the late 90's [CERT, 1996]. | |||
Given that the attacker does not need to complete the three-way | Given that the attacker does not need to complete the three-way | |||
handshake for the attacked system to tie system resources to the | handshake for the attacked system to tie system resources to the | |||
newly created TCBs, he will typically forge the source IP address | newly created TCBs, he will typically forge the source IP address of | |||
of the malicious SYN segments he sends, thus concealing his own IP | the malicious SYN segments he sends, thus concealing his own IP | |||
address. | address. | |||
If the forged IP addresses corresponded to some reachable system, | If the forged IP addresses corresponded to some reachable system, the | |||
the impersonated system would receive the SYN/ACK segment sent by | impersonated system would receive the SYN/ACK segment sent by the | |||
the attacked host (in response to the forged SYN segment), which | attacked host (in response to the forged SYN segment), which would | |||
would elicit an RST segment. This RST segment would be delivered | elicit an RST segment. This RST segment would be delivered to the | |||
to the attacked system, causing the corresponding connection to be | attacked system, causing the corresponding connection to be aborted, | |||
aborted, and the corresponding TCB to be removed. | and the corresponding TCB to be removed. | |||
As the impersonated host would not have any state information for | As the impersonated host would not have any state information for the | |||
the TCP connection being referred to by the SYN/ACK segment, it | TCP connection being referred to by the SYN/ACK segment, it would | |||
would respond with a RST segment, as specified by the TCP segment | respond with a RST segment, as specified by the TCP segment | |||
processing rules of RFC 793 [Postel, 1981c]. | processing rules of RFC 793 [RFC0793]. | |||
However, if the forged IP source addresses were unreachable, the | However, if the forged IP source addresses were unreachable, the | |||
attacked TCP would continue retransmitting the SYN/ACK segment | attacked TCP would continue retransmitting the SYN/ACK segment | |||
corresponding to each connection request, until timing out and | corresponding to each connection request, until timing out and | |||
aborting the connection. For this reason, a number of widely | aborting the connection. For this reason, a number of widely | |||
available attack tools first check whether each of the (forged) IP | available attack tools first check whether each of the (forged) IP | |||
addresses are reachable by sending an ICMP echo request to them. | addresses are reachable by sending an ICMP echo request to them. The | |||
The receipt of an ICMP echo response is considered an indication | receipt of an ICMP echo response is considered an indication of the | |||
of the IP address being reachable (and thus results in the | IP address being reachable (and thus results in the corresponding IP | |||
corresponding IP address not being used for performing the | address not being used for performing the attack), while the receipt | |||
attack), while the receipt of an ICMP unreachable error message is | of an ICMP unreachable error message is considered an indication of | |||
considered an indication of the IP address being unreachable (and | the IP address being unreachable (and thus results in the | |||
thus results in the corresponding IP address being used for | corresponding IP address being used for performing the attack). | |||
performing the attack). | ||||
[Gont, 2008b] describes how the so-called ICMP soft errors could | [Gont, 2008b] describes how the so-called ICMP soft errors could be | |||
be used by TCP to abort connections in any of the non-synchronized | used by TCP to abort connections in any of the non-synchronized | |||
states. While implementation of the mechanism described in that | states. While implementation of the mechanism described in that | |||
document would certainly not eliminate the vulnerability of TCP to | document would certainly not eliminate the vulnerability of TCP to | |||
SYN flood attacks (as the attacker could use addresses that are | SYN flood attacks (as the attacker could use addresses that are | |||
simply "black-holed"), it provides an example of how signaling | simply "black-holed"), it provides an example of how signaling | |||
information such as that provided by means of ICMP error messages | information such as that provided by means of ICMP error messages can | |||
can provide valuable information that a transport protocol could | provide valuable information that a transport protocol could use to | |||
use to perform heuristics. | perform heuristics. | |||
In order to mitigate the impact of this attack, the amount of | In order to mitigate the impact of this attack, the amount of | |||
information stored for non-established connections should be | information stored for non-established connections should be reduced | |||
reduced (ideally, non-synchronized connections should not require | (ideally, non-synchronized connections should not require any state | |||
any state information to be maintained at the TCP performing the | information to be maintained at the TCP performing the passive OPEN). | |||
passive OPEN). There are basically two mitigation techniques for | There are basically two mitigation techniques for this vulnerability: | |||
this vulnerability: a syn-cache and syn-cookies. | a syn-cache and syn-cookies. | |||
[Borman, 1997] and RFC 4987 [Eddy, 2007] contain a general | [Borman, 1997] and RFC 4987 [Eddy, 2007] contain a general discussion | |||
discussion of SYN-flooding attacks and common mitigation | of SYN-flooding attacks and common mitigation approaches. | |||
approaches. | ||||
The syn-cache [Lemon, 2002] approach aims at reducing the amount | The syn-cache [Lemon, 2002] approach aims at reducing the amount of | |||
of state information that is maintained for connections in the | state information that is maintained for connections in the SYN- | |||
SYN-RECEIVED state, and allocates a full TCB only after the | RECEIVED state, and allocates a full TCB only after the connection | |||
connection has transited to the ESTABLISHED state. | has transited to the ESTABLISHED state. | |||
The syn-cookie [Bernstein, 1996] approach aims at completely | The syn-cookie [Bernstein, 1996] approach aims at completely | |||
eliminating the need to maintain state information at the TCP | eliminating the need to maintain state information at the TCP | |||
performing the passive OPEN, by encoding the most elementary | performing the passive OPEN, by encoding the most elementary | |||
information required to complete the three-way handshake in the | information required to complete the three-way handshake in the | |||
Sequence Number of the SYN/ACK segment that is sent in response to | Sequence Number of the SYN/ACK segment that is sent in response to | |||
the received SYN segment. Thus, TCP is relieved from keeping | the received SYN segment. Thus, TCP is relieved from keeping state | |||
state for connections in the SYN-RECEIVED state. | for connections in the SYN-RECEIVED state. | |||
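The idea of encoding handshake state in the Sequence Number can be sketched as follows. This is a simplified illustration rather than the exact construction of [Bernstein, 1996]: the field widths (5-bit counter, 3-bit MSS index, 24-bit keyed hash), the MSS table, the use of HMAC-SHA-256, and all function names are assumptions.

```python
import hashlib
import hmac

# Hypothetical table: only 8 MSS values fit in the 3 bits reserved for them.
MSS_TABLE = [536, 1024, 1220, 1360, 1400, 1440, 1452, 1460]


def _mac24(secret: bytes, four_tuple: tuple, client_isn: int,
           counter: int) -> int:
    """24-bit keyed hash binding the cookie to the connection and time."""
    msg = repr((four_tuple, client_isn, counter)).encode()
    return int.from_bytes(hmac.new(secret, msg, hashlib.sha256).digest()[:3],
                          "big")


def make_cookie(secret: bytes, four_tuple: tuple, client_isn: int,
                client_mss: int, counter: int) -> int:
    """Build the 32-bit ISN to send in the SYN/ACK; no state is stored."""
    mss = max(client_mss, MSS_TABLE[0])
    idx = max(i for i, m in enumerate(MSS_TABLE) if m <= mss)
    return (((counter & 0x1F) << 27) | (idx << 24)
            | _mac24(secret, four_tuple, client_isn, counter))


def check_cookie(secret: bytes, four_tuple: tuple, client_isn: int,
                 cookie: int, counter: int, max_age: int = 2):
    """Validate the cookie echoed in the final ACK (its ACK number minus 1).

    Returns the recovered MSS, or None if the cookie is invalid or too old.
    """
    t, idx = cookie >> 27, (cookie >> 24) & 0x7
    for age in range(max_age):
        c = counter - age
        if c < 0:
            break
        if (c & 0x1F) == t and \
                _mac24(secret, four_tuple, client_isn, c) == (cookie & 0xFFFFFF):
            return MSS_TABLE[idx]
    return None
```

In this sketch the server would derive client_isn from the Sequence Number of the final ACK minus 1, and "counter" would be a slowly incrementing time value, so that cookies expire after max_age ticks.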
The syn-cookie approach has a number of drawbacks: | The syn-cookie approach has a number of drawbacks: | |||
* Firstly, given the limited space in the Sequence Number field, | o Firstly, given the limited space in the Sequence Number field, it | |||
it is not possible to encode all the information included in | is not possible to encode all the information included in the | |||
the initial segment, such as, for example, support of Selective | initial segment, such as, for example, support of Selective | |||
Acknowledgements (SACK). | Acknowledgements (SACK). | |||
* Secondly, in the event that the Acknowledgement segment sent in | o Secondly, in the event that the Acknowledgement segment sent in | |||
response to the SYN/ACK sent by the TCP that performed the | response to the SYN/ACK sent by the TCP that performed the passive | |||
passive OPEN (i.e., the TCP server) were lost, the connection | OPEN (i.e., the TCP server) were lost, the connection would end up | |||
would end up in the ESTABLISHED state on the client-side, but | in the ESTABLISHED state on the client-side, but in the CLOSED | |||
in the CLOSED state on the server side. This scenario is | state on the server side. This scenario is normally handled in | |||
normally handled in TCP by having the TCP server retransmit its | TCP by having the TCP server retransmit its SYN/ACK. However, if | |||
SYN/ACK. However, if syn-cookies are enabled, there would be | syn-cookies are enabled, there would be no connection state | |||
no connection state information on the server side, and thus | information on the server side, and thus the SYN/ACK would never | |||
the SYN/ACK would never be retransmitted. This could lead to a | be retransmitted. This could lead to a scenario in which the | |||
scenario in which the connection could remain in the | connection could remain in the ESTABLISHED state on the client | |||
ESTABLISHED state on the client side, but in the CLOSED state | side, but in the CLOSED state at the server side, indefinitely. | |||
at the server side, indefinitely. If the application protocol | If the application protocol was such that it required the client | |||
was such that it required the client to wait for some data from | to wait for some data from the server (e.g., a greeting message) | |||
the server (e.g., a greeting message) before sending any data | before sending any data to the server, a deadlock would take | |||
to the server, a deadlock would take place, with the client | place, with the client application waiting for such server data, | |||
application waiting for such server data, and the server | and the server waiting for the TCP three-way handshake to | |||
waiting for the TCP three-way handshake to complete. | complete. | |||
* Thirdly, unless the function used to encode information in the | o Thirdly, unless the function used to encode information in the | |||
SYN/ACK packet is cryptographically strong, an attacker could | SYN/ACK packet is cryptographically strong, an attacker could | |||
forge TCP connections in the ESTABLISHED state by forging ACK | forge TCP connections in the ESTABLISHED state by forging ACK | |||
segments that would be considered as "legitimate" by the | segments that would be considered as "legitimate" by the receiving | |||
receiving TCP. | TCP. | |||
* Fourthly, in those scenarios in which establishment of new | o Fourthly, in those scenarios in which establishment of new | |||
connections is blocked by simply dropping segments with the SYN | connections is blocked by simply dropping segments with the SYN | |||
bit set, use of SYN cookies could allow an attacker to bypass | bit set, use of SYN cookies could allow an attacker to bypass the | |||
the firewall rules, as a connection could be established by | firewall rules, as a connection could be established by forging an | |||
forging an ACK segment with the correct values, without the | ACK segment with the correct values, without the need of setting | |||
need of setting the SYN bit. | the SYN bit. | |||
As a result, syn-cookies are usually not employed as a first line | As a result, syn-cookies are usually not employed as a first line of | |||
of defense against SYN-flood attacks, but are used only as a last | defense against SYN-flood attacks, but are used only as a last resort to | |||
resort to cope with them. For example, some TCP implementations | cope with them. For example, some TCP implementations enable syn- | |||
enable syn-cookies only after a certain number of TCBs has been | cookies only after a certain number of TCBs has been allocated for | |||
allocated for connections in the SYN-RECEIVED state. We recommend | connections in the SYN-RECEIVED state. We recommend this | |||
this implementation technique, with a syn-cache enabled by | implementation technique, with a syn-cache enabled by default, and | |||
default, and use of syn-cookies triggered, for example, when the | use of syn-cookies triggered, for example, when the limit of TCBs for | |||
limit of TCBs for non-synchronized connections with a given port | non-synchronized connections with a given port number has been | |||
number has been reached. | reached. | |||
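The recommended policy could be sketched as follows. This is a simplified model under stated assumptions: the class name SynPolicy, the per-port limit parameter, and the string return values are hypothetical, and a real syn-cache would also store the state needed to retransmit the SYN/ACK and would expire stale entries.

```python
from collections import defaultdict


class SynPolicy:
    """Use a syn-cache by default; fall back to syn-cookies once the
    per-port limit of non-synchronized connections has been reached."""

    def __init__(self, per_port_limit):
        self.per_port_limit = per_port_limit
        # local port -> set of (remote address, remote port) in SYN-RECEIVED
        self.half_open = defaultdict(set)

    def on_syn(self, local_port, peer):
        entries = self.half_open[local_port]
        if peer in entries:
            return "syn-cache"          # retransmitted SYN, state already held
        if len(entries) < self.per_port_limit:
            entries.add(peer)
            return "syn-cache"          # normal case: keep minimal state
        return "syn-cookie"             # limit reached: answer statelessly

    def on_established(self, local_port, peer):
        # The handshake completed, so the entry no longer counts as half-open.
        self.half_open[local_port].discard(peer)
```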
It is interesting to note that a SYN-flood attack should only | It is interesting to note that a SYN-flood attack should only affect | |||
affect the establishment of new connections. A number of books | the establishment of new connections. A number of books and online | |||
and online documents seem to assume that TCP will not be able to | documents seem to assume that TCP will not be able to respond to any | |||
respond to any TCP segment that is meant for a TCP port that is | TCP segment that is meant for a TCP port that is being SYN-flooded | |||
being SYN-flooded (e.g., respond with an RST segment upon receipt | (e.g., respond with an RST segment upon receipt of a TCP segment that | |||
of a TCP segment that refers to a non-existent TCP connection). | refers to a non-existent TCP connection). While SYN-flooding attacks | |||
While SYN-flooding attacks have been successfully exploited in the | have been successfully exploited in the past for achieving such a | |||
past for achieving such a goal [Shimomura, 1995], as clarified by | goal [Shimomura, 1995], as clarified by RFC 1948 [Bellovin, 1996] the | |||
RFC 1948 [Bellovin, 1996] the effectiveness of SYN flood attacks | effectiveness of SYN flood attacks to silence a TCP implementation | |||
to silence a TCP implementation arose as a result of a bug in the | arose as a result of a bug in the 4.4BSD TCP implementation [Wright | |||
4.4BSD TCP implementation [Wright and Stevens, 1994], rather than | and Stevens, 1994], rather than from a theoretical property of SYN- | |||
from a theoretical property of SYN-flood attacks themselves. | flood attacks themselves. Therefore, those TCP implementations that | |||
Therefore, those TCP implementations that do not suffer from such | do not suffer from such a bug should not be silenced as a result of a | |||
a bug should not be silenced as a result of a SYN-flood attack. | SYN-flood attack. | |||
[Zquete, 2002] describes a mechanism that could theoretically | [Zquete, 2002] describes a mechanism that could theoretically improve | |||
improve the functionality of SYN cookies. It exploits the TCP | the functionality of SYN cookies. It exploits the TCP "simultaneous | |||
"simultaneous open" mechanism, as illustrated in Figure 5. | open" mechanism, as illustrated in Figure 5. | |||
See Figure 5, on page 46 of the UK CPNI document. | See Figure 5, on page 46 of the UK CPNI document. | |||
Use of TCP simultaneous open for handling SYN floods | Use of TCP simultaneous open for handling SYN floods | |||
In line 1, TCP A initiates the connection-establishment phase by | In line 1, TCP A initiates the connection-establishment phase by | |||
sending a SYN segment to TCP B. In line 2, TCP B creates a SYN | sending a SYN segment to TCP B. In line 2, TCP B creates a SYN cookie | |||
cookie as described by [Bernstein, 1996], but does not set the ACK | as described by [Bernstein, 1996], but does not set the ACK bit of | |||
bit of the segment it sends (thus really sending a SYN segment, | the segment it sends (thus really sending a SYN segment, rather than | |||
rather than a SYN/ACK). This "fools" TCP A into thinking that | a SYN/ACK). This "fools" TCP A into thinking that both SYN segments | |||
both SYN segments "have crossed each other in the network" as if a | "have crossed each other in the network" as if a "simultaneous open" | |||
"simultaneous open" scenario had taken place. As a result, in | scenario had taken place. As a result, in line 3 TCP A sends a SYN/ | |||
line 3 TCP A sends a SYN/ACK segment containing the same options | ACK segment containing the same options that were contained in the | |||
that were contained in the original SYN segment. In line 4, upon | original SYN segment. In line 4, upon receipt of this segment, TCP B | |||
receipt of this segment, TCP B processes the cookie encoded in the | processes the cookie encoded in the ACK field as if it had been the | |||
ACK field as if it had been the result of a traditional SYN cookie | result of a traditional SYN cookie scenario, and moves the connection | |||
scenario, and moves the connection into the ESTABLISHED state. In | into the ESTABLISHED state. In line 5, TCP B sends a SYN/ACK | |||
line 5, TCP B sends a SYN/ACK segment, which causes the connection | segment, which causes the connection at TCP A to move into the | |||
at TCP A to move into the ESTABLISHED state. In line 6, TCP A | ESTABLISHED state. In line 6, TCP A sends a data segment on the | |||
sends a data segment on the connection. | connection. | |||
While this mechanism would work in theory, unfortunately there are | While this mechanism would work in theory, unfortunately there are a | |||
a number of factors that prevent it from being usable in real | number of factors that prevent it from being usable in real network | |||
network environments: | environments: | |||
* Some systems are not able to perform the "simultaneous open" | o Some systems are not able to perform the "simultaneous open" | |||
operation specified in RFC 793, and thus the connection | operation specified in RFC 793, and thus the connection | |||
establishment will fail. | establishment will fail. | |||
* Some firewalls might prevent the establishment of TCP | o Some firewalls might prevent the establishment of TCP connections | |||
connections that rely on the "simultaneous open" mechanism | that rely on the "simultaneous open" mechanism (e.g., a given | |||
(e.g., a given firewall might be allowing incoming SYN/ACK | firewall might be allowing incoming SYN/ACK segments, but not | |||
segments, but not outgoing SYN/ACK segments). | outgoing SYN/ACK segments). | |||
Therefore, we do not recommend implementation of this mechanism | Therefore, we do not recommend implementation of this mechanism for | |||
for mitigating SYN-flood attacks. | mitigating SYN-flood attacks. | |||
5.2. Connection forgery | 5.2. Connection forgery | |||
The process of causing a TCP connection to be illegitimately | The process of causing a TCP connection to be illegitimately | |||
established between two arbitrary remote peers is usually referred to | established between two arbitrary remote peers is usually referred to | |||
as "connection spoofing" or "connection forgery". This can have a | as "connection spoofing" or "connection forgery". This can have a | |||
great negative impact when systems establish some sort of trust | great negative impact when systems establish some sort of trust | |||
relationships based on the IP addresses used to establish a TCP | relationships based on the IP addresses used to establish a TCP | |||
connection [daemon9 et al, 1996]. | connection [daemon9 et al, 1996]. | |||
skipping to change at page 45, line 24 | skipping to change at page 29, line 22 | |||
recommended that systems disable IP Source Routing by default, or at | recommended that systems disable IP Source Routing by default, or at | |||
the very least, they disable source routing for IP packets that | the very least, they disable source routing for IP packets that | |||
encapsulate TCP segments. | encapsulate TCP segments. | |||
The IPv6 Routing Header Type 0, which provides a similar | The IPv6 Routing Header Type 0, which provides a similar | |||
functionality to that provided by IPv4 source routing, has been | functionality to that provided by IPv4 source routing, has been | |||
officially deprecated by RFC 5095 [Abley et al, 2007]. | officially deprecated by RFC 5095 [Abley et al, 2007]. | |||
5.3. Connection-flooding attack | 5.3. Connection-flooding attack | |||
NOTE: THIS SECTION IS BEING EDITED. RFC2119-LANGUAGE IS BEING | ||||
REMOVED. | ||||
5.3.1. Vulnerability | 5.3.1. Vulnerability | |||
The creation and maintenance of a TCP connection requires system | The creation and maintenance of a TCP connection requires system | |||
memory to maintain shared state between the local and the remote TCP. | memory to maintain shared state between the local and the remote TCP. | |||
As system memory is a finite resource, there is a limit on the number | As system memory is a finite resource, there is a limit on the number | |||
of TCP connections that a system can maintain at any time. When the | of TCP connections that a system can maintain at any time. When the | |||
TCP API is employed to create a TCP connection with a remote peer, it | TCP API is employed to create a TCP connection with a remote peer, it | |||
allocates system memory for maintaining shared state with the remote | allocates system memory for maintaining shared state with the remote | |||
TCP peer, and thus the resulting connection would tie a similar | TCP peer, and thus the resulting connection would tie a similar | |||
amount of resources at the remote host as at the local host. | amount of resources at the remote host as at the local host. | |||
skipping to change at page 48, line 36 | skipping to change at page 32, line 36 | |||
Some firewalls can be configured to limit the number of | Some firewalls can be configured to limit the number of | |||
simultaneous connections that any system can maintain with a | simultaneous connections that any system can maintain with a | |||
specific system and/or service at any given time. Limiting the | specific system and/or service at any given time. Limiting the | |||
number of simultaneous connections that each system can establish | number of simultaneous connections that each system can establish | |||
with a specific system and service would effectively limit the | with a specific system and service would effectively limit the | |||
possibility of an attacker that controls a single IP address to | possibility of an attacker that controls a single IP address to | |||
exhaust system resources at the attacked system/service. | exhaust system resources at the attacked system/service. | |||
5.4. Firewall-bypassing techniques | 5.4. Firewall-bypassing techniques | |||
TCP MUST silently drop those TCP segments that have both the SYN and | [draft-gont-tcpm-tcp-sanity-checks-00.txt] discusses how packets with | |||
the RST flags set. | both the SYN and RST bits set have been employed in the wild to | |||
bypass firewall rules, and provides advice in this area. | ||||
DISCUSSION: | ||||
Some firewalls block incoming TCP connections by blocking only | ||||
incoming SYN segments. However, there are inconsistencies in how | ||||
different TCP implementations handle SYN segments that have | ||||
additional flags set, which may allow an attacker to bypass | ||||
firewall rules [US-CERT, 2003b]. | ||||
For example, some firewalls have been known to mistakenly allow | ||||
incoming SYN segments if they also have the RST bit set. As some | ||||
TCP implementations will create a new connection in response to a | ||||
TCP segment with both the SYN and RST bits set, an attacker could | ||||
bypass the firewall rules and establish a connection with a | ||||
"protected" system by setting the RST bit in his SYN segments. | ||||
Here we advise TCP implementations to silently drop those TCP | ||||
segments that have both the SYN and the RST flags set. | ||||
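The check advised above amounts to a single test on the TCP flags field. The sketch below assumes the flags are available as an integer holding the control bits of the TCP header; the bit values are those of RFC 793, while the function name should_drop is hypothetical.

```python
# Control bits of the TCP header, as defined in RFC 793.
TH_FIN = 0x01
TH_SYN = 0x02
TH_RST = 0x04
TH_ACK = 0x10


def should_drop(flags):
    """Return True for segments that have both the SYN and the RST bits set."""
    both = TH_SYN | TH_RST
    return (flags & both) == both
```

For example, should_drop(TH_SYN | TH_RST) is True, whereas a plain SYN or a RST/ACK segment is left to normal processing.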
6. Connection-termination mechanism | 6. Connection-termination mechanism | |||
6.1. FIN-WAIT-2 flooding attack | 6.1. FIN-WAIT-2 flooding attack | |||
6.1.1. Vulnerability | 6.1.1. Vulnerability | |||
TCP implements a connection-termination mechanism that is employed | TCP implements a connection-termination mechanism that is employed | |||
for the graceful termination of a TCP connection. This mechanism | for the graceful termination of a TCP connection. This mechanism | |||
usually consists of the exchange of four segments. Figure 6 | usually consists of the exchange of four segments. Figure 6 | |||
skipping to change at page 49, line 40 | skipping to change at page 33, line 25 | |||
As a result, an attacker could establish a large number of | As a result, an attacker could establish a large number of | |||
connections with the target system, and cause it to close each of them. | connections with the target system, and cause it to close each of them. | |||
For each connection, once the target system has sent its FIN segment, | For each connection, once the target system has sent its FIN segment, | |||
the attacker would acknowledge the receipt of this segment, but would | the attacker would acknowledge the receipt of this segment, but would | |||
send no further segments on that connection. As a result, an | send no further segments on that connection. As a result, an | |||
attacker could cause the corresponding system resources (e.g., the | attacker could cause the corresponding system resources (e.g., the | |||
system memory used for storing the TCB) to remain tied up without the | system memory used for storing the TCB) to remain tied up without the | |||
need to send any further packets. | need to send any further packets. | |||
While the CLOSE command described in RFC 793 [Postel, 1981c] simply | While the CLOSE command described in RFC 793 [RFC0793] simply signals | |||
signals the remote TCP end-point that this TCP has finished sending | the remote TCP end-point that this TCP has finished sending data | |||
data (i.e., it closes only one direction of the data transfer), the | (i.e., it closes only one direction of the data transfer), the | |||
close() system-call available in most operating systems has different | close() system-call available in most operating systems has different | |||
semantics: it marks the corresponding file descriptor as closed (and | semantics: it marks the corresponding file descriptor as closed (and | |||
thus it is no longer usable), and assigns the operating system the | thus it is no longer usable), and assigns the operating system the | |||
responsibility to deliver any queued data to the remote TCP peer and | responsibility to deliver any queued data to the remote TCP peer and | |||
to terminate the TCP connection. This makes the FIN-WAIT-2 state | to terminate the TCP connection. This makes the FIN-WAIT-2 state | |||
particularly attractive for performing memory exhaustion attacks, as | particularly attractive for performing memory exhaustion attacks, as | |||
even if the application running on top of TCP were imposing limits on | even if the application running on top of TCP were imposing limits on | |||
the maximum number of ongoing connections, and/or time limits on the | the maximum number of ongoing connections, and/or time limits on the | |||
function calls performed on TCP connections, that application would | function calls performed on TCP connections, that application would | |||
be unable to enforce these limits on the FIN-WAIT-2 state. | be unable to enforce these limits on the FIN-WAIT-2 state. | |||
skipping to change at page 56, line 35 | skipping to change at page 40, line 27 | |||
window to cause the target system to tie system memory to the TCP | window to cause the target system to tie system memory to the TCP | |||
retransmission buffer, it is hard to perform any useful statistics | retransmission buffer, it is hard to perform any useful statistics | |||
from the advertised window. While it is tempting to enforce a limit | from the advertised window. While it is tempting to enforce a limit | |||
on the length of the persist state (see Section 3.7.2 of this | on the length of the persist state (see Section 3.7.2 of this | |||
document), an attacker could simply open the window (i.e., advertise | document), an attacker could simply open the window (i.e., advertise | |||
a TCP window larger than zero) from time to time to prevent this | a TCP window larger than zero) from time to time to prevent this | |||
enforced limit from causing his malicious connections to be aborted. | enforced limit from causing his malicious connections to be aborted. | |||
7.2. TCP segment reassembly buffer | 7.2. TCP segment reassembly buffer | |||
TCP MAY discard out-of-order data when system-memory exhaustion is | TCP buffers out-of-order segments to more efficiently handle the | |||
imminent. | occurrence of packet reordering and segment loss. When out-of-order | |||
data are received, a "hole" momentarily exists in the data stream | ||||
DISCUSSION: | which must be filled before the received data can be delivered to the | |||
application making use of TCP's services. This situation can be | ||||
TCP buffers out-of-order segments to more efficiently handle the | exploited by an attacker, which could intentionally create a hole in | |||
occurrence of packet reordering and segment loss. When out-of- | the data stream by sending a number of segments with a sequence | |||
order data are received, a "hole" momentarily exists in the data | number larger than the next sequence number expected (RCV.NXT) by the | |||
stream which must be filled before the received data can be | attacked TCP. Thus, the attacked TCP would tie system memory to | |||
delivered to the application making use of TCP's services. This | buffer the out-of-order segments, without being able to hand the | |||
situation can be exploited by an attacker, which could | received data to the corresponding application. | |||
intentionally create a hole in the data stream by sending a number | ||||
of segments with a sequence number larger than the next sequence | ||||
number expected (RCV.NXT) by the attacked TCP. Thus, the attacked | ||||
TCP would tie system memory to buffer the out-of-order segments, | ||||
without being able to hand the received data to the corresponding | ||||
application. | ||||
If a large number of such connections were created, system memory | If a large number of such connections were created, system memory | |||
could be exhausted, precluding the attacked TCP from servicing new | could be exhausted, precluding the attacked TCP from servicing new | |||
connections and/or continuing to service TCP connections previously | connections and/or continuing to service TCP connections previously | |||
established. | established. | |||
Fortunately, these attacks can be easily mitigated, at the expense | Fortunately, these attacks can be easily mitigated, at the expense of | |||
of degrading the performance of possibly legitimate connections. | degrading the performance of possibly legitimate connections. When | |||
When out-of-order data is received, an Acknowledgement segment is | out-of-order data is received, an Acknowledgement segment is sent | |||
sent with the next sequence number expected (RCV.NXT). This means | with the next sequence number expected (RCV.NXT). This means that | |||
that receipt of the out-of-order data will not be actually | receipt of the out-of-order data will not be actually acknowledged by | |||
acknowledged by the TCP's cumulative Acknowledgement Number. As a | the TCP's cumulative Acknowledgement Number. As a result, a TCP is | |||
result, a TCP is free to discard any data that have been received | free to discard any data that have been received out-of-order, | |||
out-of-order, without affecting the reliability of the data | without affecting the reliability of the data transfer. Given the | |||
transfer. Given the performance implications of discarding out- | performance implications of discarding out-of-order segments for | |||
of-order segments for legitimate connections, this pruning policy | legitimate connections, this pruning policy should be applied only if | |||
should be applied only if memory exhaustion is imminent. | memory exhaustion is imminent. | |||
As a result of discarding the out-of-order data, these data will | As a result of discarding the out-of-order data, these data will need | |||
need to be unnecessarily retransmitted. Additionally, a loss | to be unnecessarily retransmitted. Additionally, a loss event will | |||
event will be detected by the sending TCP, and thus the slow start | be detected by the sending TCP, and thus the slow start phase of | |||
phase of TCP's congestion control will be entered, thus reducing | TCP's congestion control will be entered, thus reducing the data | |||
the data transfer rate of the connection. | transfer rate of the connection. | |||
It is interesting to note that this pruning policy could be | It is interesting to note that this pruning policy could be applied | |||
applied even if Selective Acknowledgements (SACK) (specified in | even if Selective Acknowledgements (SACK) (specified in RFC 2018 | |||
RFC 2018 [Mathis et al, 1996]) are in use, as SACK provides only | [Mathis et al, 1996]) are in use, as SACK provides only advisory | |||
advisory information, and does not preclude the receiving TCP from | information, and does not preclude the receiving TCP from discarding | |||
discarding data that have been previously selectively-acknowledged | data that have been previously selectively-acknowledged by means of | |||
by means of TCP's SACK option, but not acknowledged by TCP's | TCP's SACK option, but not acknowledged by TCP's cumulative | |||
cumulative Acknowledgement Number. | Acknowledgement Number. | |||
There are a number of ways in which the pruning policy could be | There are a number of ways in which the pruning policy could be | |||
triggered. For example, when out-of-order data are received, a | triggered. For example, when out-of-order data are received, a timer | |||
timer could be set, and the sequence number of the out-of-order | could be set, and the sequence number of the out-of-order data could | |||
data could be recorded. If the hole were filled before the timer | be recorded. If the hole were filled before the timer expires, the | |||
expires, the timer would be turned off. However, if the timer | timer would be turned off. However, if the timer expired before the | |||
expired before the hole were filled, all the out-of-order segments | hole were filled, all the out-of-order segments of the corresponding | |||
of the corresponding connection would be discarded. This would be | connection would be discarded. This would be a proactive counter- | |||
a proactive counter-measure for attacks that aim at exhausting the | measure for attacks that aim at exhausting the receive buffers. | |||
receive buffers. | ||||
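The timer-based trigger described above could be sketched as follows. This is a simplified model rather than the behaviour of any real stack: the class ReassemblyQueue and its methods are hypothetical, time is passed in explicitly, and sequence-number wrap-around, overlapping segments, and multiple independent holes are not handled.

```python
class ReassemblyQueue:
    """Out-of-order buffer that is pruned if a hole is not filled in time."""

    def __init__(self, rcv_nxt, hole_timeout):
        self.rcv_nxt = rcv_nxt            # next sequence number expected
        self.hole_timeout = hole_timeout  # assumption: seconds a hole may persist
        self.segments = {}                # sequence number -> payload (out-of-order only)
        self.deadline = None              # time at which the buffer is pruned

    def on_segment(self, seq, data, now):
        if seq == self.rcv_nxt:
            # In-order data: advance RCV.NXT and absorb any segments it uncovers.
            self.rcv_nxt += len(data)
            while self.rcv_nxt in self.segments:
                self.rcv_nxt += len(self.segments.pop(self.rcv_nxt))
            if not self.segments:
                self.deadline = None      # hole filled: turn the timer off
        elif seq > self.rcv_nxt:
            # Out-of-order data: buffer it and start the timer if not running.
            self.segments[seq] = data
            if self.deadline is None:
                self.deadline = now + self.hole_timeout

    def on_tick(self, now):
        """Discard all out-of-order segments if the timer has expired."""
        if self.deadline is not None and now >= self.deadline:
            self.segments.clear()
            self.deadline = None
            return True
        return False
```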
In addition, an implementation could incorporate reactive | In addition, an implementation could incorporate reactive mechanisms | |||
mechanisms for more carefully controlling buffer allocation when | for more carefully controlling buffer allocation when some predefined | |||
some predefined buffer allocation threshold was reached. At that | buffer allocation threshold was reached. At that point, pruning | |||
point, pruning policies would be applied. | policies would be applied. | |||
A number of mechanisms can aid in the process of freeing system | A number of mechanisms can aid in the process of freeing system | |||
resources. For example, a table of network prefixes corresponding | resources. For example, a table of network prefixes corresponding to | |||
to the IP addresses of TCP peers that have ongoing TCP connections | the IP addresses of TCP peers that have ongoing TCP connections could | |||
could record the aggregate amount of out-of-order data currently | record the aggregate amount of out-of-order data currently buffered | |||
buffered for those connections. When the pruning policy was | for those connections. When the pruning policy was triggered, TCP | |||
triggered, TCP connections with hosts that have network prefixes | connections with hosts that have network prefixes with large | |||
with large aggregate out-of-order buffered data could be selected | aggregate out-of-order buffered data could be selected first for | |||
first for pruning the out-of-order segments. | pruning the out-of-order segments. | |||
Alternatively, if TCP segments were de-multiplexed by means of a | Alternatively, if TCP segments were de-multiplexed by means of a hash | |||
hash table (as is currently the case in many TCP | table (as is currently the case in many TCP implementations), a | |||
implementations), a counter could be held at each entry of the | counter could be held at each entry of the hash table that would | |||
hash table that would record the aggregate out-of-order data | record the aggregate out-of-order data currently buffered for those | |||
currently buffered for those connections belonging to that hash | connections belonging to that hash table entry. When the pruning | |||
table entry. When the pruning policy is triggered, the out-of- | policy is triggered, the out-of-order data corresponding to those | |||
order data corresponding to those connections linked by the hash | connections linked by the hash table entry with largest amount of | |||
table entry with largest amount of aggregate out-of-order data | aggregate out-of-order data could be pruned first. It is important | |||
could be pruned first. It is important that this hash is not | that this hash is not computable by an attacker, as this would allow | |||
computable by an attacker, as this would allow him to maliciously | him to maliciously cause the performance of specific connections to | |||
cause the performance of specific connections to be degraded. | be degraded. That is, given a four-tuple that identifies a | |||
That is, given a four-tuple that identifies a connection, an | connection, an attacker should not be able to compute the | |||
attacker should not be able to compute the corresponding hash | corresponding hash value used by the target system to de-multiplex | |||
value used by the target system to de-multiplex incoming TCP | incoming TCP segments to that connection. | |||
segments to that connection. | ||||
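A sketch of the per-hash-entry accounting described above follows. It assumes a keyed hash (here BLAKE2b with a random secret key) so that an attacker who knows the four-tuple cannot predict the hash value; the class OooAccounting, its methods, and the table size are hypothetical.

```python
import hashlib
import os


class OooAccounting:
    """Aggregate out-of-order bytes per hash table entry."""

    def __init__(self, buckets=256, key=None):
        # Secret key, so that the hash is not computable by an attacker.
        self.key = key if key is not None else os.urandom(16)
        self.buckets = buckets
        self.bytes_per_bucket = [0] * buckets

    def bucket(self, four_tuple):
        """Map (src address, src port, dst address, dst port) to a table entry."""
        digest = hashlib.blake2b(repr(four_tuple).encode(),
                                 key=self.key, digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.buckets

    def add(self, four_tuple, nbytes):
        self.bytes_per_bucket[self.bucket(four_tuple)] += nbytes

    def remove(self, four_tuple, nbytes):
        self.bytes_per_bucket[self.bucket(four_tuple)] -= nbytes

    def first_to_prune(self):
        """Index of the entry holding the largest aggregate of out-of-order data."""
        return max(range(self.buckets), key=self.bytes_per_bucket.__getitem__)
```

When the pruning policy is triggered, the connections linked by the entry returned by first_to_prune() would have their out-of-order segments discarded first.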
Another variant of a resource exhaustion attack against TCP's | Another variant of a resource exhaustion attack against TCP's segment | |||
segment reassembly mechanism would target the data structures used | reassembly mechanism would target the data structures used to link | |||
to link the different holes in a data stream. For example, an | the different holes in a data stream. For example, an attacker could | |||
attacker could send a burst of 1 byte segments, leaving a one-byte | send a burst of 1 byte segments, leaving a one-byte hole between each | |||
hole between each of the data bytes sent. Depending on the data | of the data bytes sent. Depending on the data structures used for | |||
structures used for holding and linking together each of the data | holding and linking together each of the data segments, such an | |||
segments, such an attack might waste a large amount of system | attack might waste a large amount of system memory by exploiting the | |||
memory by exploiting the overhead needed to store and link together | overhead needed to store and link together each of these one-byte | |||
each of these one-byte segments. | segments. | |||
For example, if a linked-list is used for holding and linking each | For example, if a linked-list is used for holding and linking each of | |||
of the data segments, each of the involved data structures could | the data segments, each of the involved data structures could involve | |||
involve one byte of kernel memory for storing the received data | one byte of kernel memory for storing the received data byte (the TCP | |||
byte (the TCP payload), plus 4 bytes (32 bits) for storing a | payload), plus 4 bytes (32 bits) for storing a pointer to the next | |||
pointer to the next node in the linked-list. Additionally, while | node in the linked-list. Additionally, while such a data structure | |||
such a data structure would require only a few bytes of kernel | would require only a few bytes of kernel memory, it could result in | |||
memory, it could result in the allocation of a whole memory page, | the allocation of a whole memory page, thus consuming much more | |||
thus consuming much more memory than expected. | memory than expected. | |||
Therefore, implementations should enforce a limit on the number of | Therefore, implementations should enforce a limit on the number of | |||
holes that are allowed in the received data stream at any given | holes that are allowed in the received data stream at any given time. | |||
time. When such a limit is reached, incoming TCP segments which | When such a limit is reached, incoming TCP segments which would | |||
would create new holes would be silently dropped. Measurements in | create new holes would be silently dropped. Measurements in | |||
[Dharmapurikar and Paxson, 2005] indicate that the vast | [Dharmapurikar and Paxson, 2005] indicate that the vast majority | |||
majority of TCP connections have at most a single hole at any | of TCP connections have at most a single hole at any given time. A | |||
given time. A limit of 16 holes for each connection would | limit of 16 holes for each connection would accommodate even most of | |||
accommodate even most of the very unusual cases in which there can | the very unusual cases in which there can be more than one hole in the | |||
be more than one hole in the data stream at a given time. | data stream at a given time. | |||
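The hole limit described above can be sketched as follows. This is only an illustrative model, not code from any TCP implementation: the class name, the method names and the choice to track byte ranges rather than payload are assumptions, and sequence number wrap-around is deliberately ignored.

```python
MAX_HOLES = 16  # per-connection limit suggested in the text


class ReassemblyQueue:
    """Hypothetical model of a reassembly queue that tracks byte ranges only."""

    def __init__(self, rcv_nxt, max_holes=MAX_HOLES):
        self.rcv_nxt = rcv_nxt      # next in-order byte expected
        self.max_holes = max_holes
        self.blocks = []            # sorted, disjoint, non-adjacent [start, end)

    @staticmethod
    def _merge(blocks, start, end):
        """Insert [start, end) and coalesce overlapping or touching ranges."""
        merged = []
        for s, e in blocks:
            if e < start or s > end:
                merged.append((s, e))
            else:
                start, end = min(s, start), max(e, end)
        merged.append((start, end))
        merged.sort()
        return merged

    def accept(self, seq, length):
        """Return True if the segment is queued, False if silently dropped."""
        start, end = max(seq, self.rcv_nxt), seq + length
        if end <= start:
            return False            # empty segment or entirely old data
        candidate = self._merge(self.blocks, start, end)
        rcv_nxt = self.rcv_nxt
        if candidate[0][0] == rcv_nxt:
            # The first block is now in order: deliver it.
            rcv_nxt = candidate[0][1]
            candidate = candidate[1:]
        # Every block still queued is preceded by exactly one hole, so the
        # number of holes equals the number of queued blocks.
        if len(candidate) > len(self.blocks) and len(candidate) > self.max_holes:
            return False            # would create a hole beyond the limit
        self.blocks, self.rcv_nxt = candidate, rcv_nxt
        return True
```

With a limit of two holes, a third isolated one-byte segment is dropped, while a segment that fills an existing hole is still accepted.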
[US-CERT, 2004a] is a security advisory about a Denial of Service | [US-CERT, 2004a] is a security advisory about a Denial of Service | |||
vulnerability resulting from a TCP implementation that did not | vulnerability resulting from a TCP implementation that did not | |||
enforce limits on the number of segments stored in the TCP | enforce limits on the number of segments stored in the TCP reassembly | |||
reassembly buffer. | buffer. | |||
Section 8 of this document describes the security implications of | Section 8 of this document describes the security implications of the | |||
the TCP segment reassembly algorithm. | TCP segment reassembly algorithm. | |||
7.3. Automatic buffer tuning mechanisms | 7.3. Automatic buffer tuning mechanisms | |||
NOTE: THIS SECTION IS BEING EDITED. PLEASE DISREGARD THE RFC2119- | ||||
LANGUAGE RECOMMENDATIONS. | ||||
7.3.1. Automatic send-buffer tuning mechanisms | 7.3.1. Automatic send-buffer tuning mechanisms | |||
A TCP implementing an automatic send-buffer tuning mechanism SHOULD | A TCP implementing an automatic send-buffer tuning mechanism SHOULD | |||
enforce the following limit on the size of the send buffer of each | enforce the following limit on the size of the send buffer of each | |||
TCP connection: | TCP connection: | |||
send_buffer_size <= send_buffer_pool / (min_buffer_size * max_connections) | send_buffer_size <= send_buffer_pool / (min_buffer_size * max_connections) | |||
where | where | |||
skipping to change at page 63, line 37 | skipping to change at page 47, line 28 | |||
It is worth noting that TCP Selective Acknowledgements (SACK) are | It is worth noting that TCP Selective Acknowledgements (SACK) are | |||
advisory, in the sense that a TCP that has SACKed (but not ACKed) | advisory, in the sense that a TCP that has SACKed (but not ACKed) | |||
a block of data is free to discard that block, and expect the TCP | a block of data is free to discard that block, and expect the TCP | |||
sender to retransmit them when the retransmission timer of the | sender to retransmit them when the retransmission timer of the | |||
peer TCP expires. | peer TCP expires. | |||
8. TCP segment reassembly algorithm | 8. TCP segment reassembly algorithm | |||
8.1. Problems that arise from ambiguity in the reassembly process | 8.1. Problems that arise from ambiguity in the reassembly process | |||
If a TCP segment is received containing some data bytes that had | A security consideration that should be made for the TCP segment | |||
already been received, the first copy of those data SHOULD be used | reassembly algorithm is that of data stream consistency between the | |||
for reassembling the application data stream. | host performing the TCP segment reassembly, and a Network Intrusion | |||
Detection System (NIDS) being employed to monitor the host in | ||||
DISCUSSION: | question. | |||
A security consideration that should be made for the TCP segment | ||||
reassembly algorithm is that of data stream consistency between | ||||
the host performing the TCP segment reassembly, and a Network | ||||
Intrusion Detection System (NIDS) being employed to monitor the | ||||
host in question. | ||||
In the event a TCP segment was unnecessarily retransmitted, or | In the event a TCP segment was unnecessarily retransmitted, or there | |||
there was packet duplication in any of the intervening networks, a | was packet duplication in any of the intervening networks, a TCP | |||
TCP might get more than one copy of the same data. Also, as TCP | might get more than one copy of the same data. Also, as TCP segments | |||
segments can be re-packetized when they are retransmitted, a given | can be re-packetized when they are retransmitted, a given TCP segment | |||
TCP segment might partially overlap data already received in | might partially overlap data already received in earlier segments. | |||
earlier segments. In all these cases, the question arises about | In all these cases, the question arises about which of the copies of | |||
which of the copies of the received data should be used when | the received data should be used when reassembling the data stream. | |||
reassembling the data stream. In legitimate and normal | In legitimate and normal circumstances, all copies would be | |||
circumstances, all copies would be identical, and the same data | identical, and the same data stream would be obtained regardless of | |||
stream would be obtained regardless of which copy of the data was | which copy of the data was used. However, an attacker could | |||
used. However, an attacker could maliciously send overlapping | maliciously send overlapping segments containing different data, with | |||
segments containing different data, with the intent of evading a | the intent of evading a Network Intrusion Detection System (NIDS), | |||
Network Intrusion Detection System (NIDS), which might reassemble | which might reassemble the received TCP segments differently than the | |||
the received TCP segments differently than the monitored system. | monitored system. [Ptacek and Newsham, 1998] provides a detailed | |||
[Ptacek and Newsham, 1998] provides a detailed discussion of these | discussion of these issues. | |||
issues. | ||||
As suggested in Section 3.9 of RFC 793 [Postel, 1981c], if a TCP | As suggested in Section 3.9 of RFC 793 [RFC0793], if a TCP segment | |||
segment arrives containing some data bytes that have already been | arrives containing some data bytes that have already been received, | |||
received, the first copy of those data should be used for | the first copy of those data should be used for reassembling the | |||
reassembling the application data stream. It should be noted that | application data stream. It should be noted that while convergence | |||
while convergence to this policy might prevent some cases of | to this policy might prevent some cases of ambiguity in the | |||
ambiguity in the reassembly process, there are a number of other | reassembly process, there are a number of other techniques that an | |||
techniques that an attacker could still exploit to evade a NIDS | attacker could still exploit to evade a NIDS [CPNI, 2008]. These | |||
[CPNI, 2008]. These techniques can generally be defeated if the | techniques can generally be defeated if the NIDS is placed in-line | |||
NIDS is placed in-line with the monitored system, thus allowing | with the monitored system, thus allowing the NIDS to normalize the | |||
the NIDS to normalize the network traffic or apply some other | network traffic or apply some other policy that could ensure | |||
policy that could ensure consistency between the result of the | consistency between the result of the segment reassembly process | |||
segment reassembly process obtained by the monitored host and that | obtained by the monitored host and that obtained by the NIDS. | |||
obtained by the NIDS. | ||||
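The following sketch illustrates why the overlap policy matters. It reassembles the same sequence of segments under a "first copy wins" policy (the behaviour suggested by RFC 793) and under a "last copy wins" policy. The function name, the `keep_first` parameter and the sample payloads are hypothetical, and the model ignores windows and sequence number wrap-around.

```python
def reassemble(segments, start=0, keep_first=True):
    """Reassemble (seq, payload) pairs given in arrival order.

    keep_first=True  -> the first copy of each byte is retained.
    keep_first=False -> later copies overwrite earlier ones.
    Returns the contiguous data beginning at sequence number `start`.
    """
    received = {}
    for seq, data in segments:
        for offset, byte in enumerate(data):
            if keep_first:
                received.setdefault(seq + offset, byte)
            else:
                received[seq + offset] = byte
    out = bytearray()
    pos = start
    while pos in received:
        out.append(received[pos])
        pos += 1
    return bytes(out)


# Overlapping segments carrying different data for the same bytes.
segments = [(0, b"GET /"), (4, b"Xindex"), (3, b"!!")]
host_view = reassemble(segments, keep_first=True)    # b"GET /index"
nids_view = reassemble(segments, keep_first=False)   # b"GET!!index"
```

A monitored host and a NIDS that apply different policies end up with different data streams from the same packets, which is the inconsistency an attacker would try to exploit.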
[CERT, 2003] and [CORE, 2003] are advisories about a heap buffer | [CERT, 2003] and [CORE, 2003] are advisories about a heap buffer | |||
overflow in a popular Network Intrusion Detection System resulting | overflow in a popular Network Intrusion Detection System resulting | |||
from incorrect sequence number calculations in its TCP stream- | from incorrect sequence number calculations in its TCP stream- | |||
reassembly module. | reassembly module. | |||
9. TCP Congestion Control | 9. TCP Congestion Control | |||
NOTE: THIS SECTION IS BEING EDITED. | ||||
TCP implements two algorithms, "slow start" and "congestion | TCP implements two algorithms, "slow start" and "congestion | |||
avoidance", for controlling the rate at which data is transmitted on | avoidance", for controlling the rate at which data is transmitted on | |||
a TCP connection [Allman et al, 1999]. These algorithms require the | a TCP connection [RFC5681]. | |||
addition of two variables as part of TCP per-connection state: cwnd | ||||
and ssthresh. | ||||
The congestion window (cwnd) is a sender-side limit on the amount of | ||||
outstanding data that the sender can have at any time, while the | ||||
receiver's advertised window (rwnd) is a receiver-side limit on the | ||||
amount of outstanding data. The minimum of cwnd and rwnd governs | ||||
data transmission. | ||||
Another state variable, the slow-start threshold (ssthresh), is used | ||||
to determine whether it is the slow start or the congestion avoidance | ||||
algorithm that should control data transmission. When cwnd < | ||||
ssthresh, "slow start" governs data transmission, and the congestion | ||||
window (cwnd) is exponentially increased. When cwnd > ssthresh, | ||||
"congestion avoidance" governs data transmission, and the congestion | ||||
window (cwnd) is only linearly increased. | ||||
As specified in RFC 2581 [Allman et al, 1999], when cwnd and ssthresh | ||||
are equal the sender may use either slow start or congestion | ||||
avoidance. | ||||
During slow start, TCP increments cwnd by at most SMSS bytes for each | ||||
ACK received that acknowledges new data. During congestion | ||||
avoidance, cwnd is incremented by 1 full-sized segment per round-trip | ||||
time (RTT), until congestion is detected. | ||||
Additionally, TCP uses two algorithms, Fast Retransmit and Fast | ||||
Recovery, to mitigate the effects of packet loss. The "Fast | ||||
Retransmit" algorithm infers packet loss when three Duplicate | ||||
Acknowledgements (DupACKs) are received. | ||||
The value "three" is meant to allow for fast-retransmission of | ||||
"missing" data, while avoiding network packet reordering from | ||||
triggering loss recovery. | ||||
Once packet loss is detected by the receipt of three duplicate-ACKs, | ||||
the "Fast Recovery" algorithm governs the transfer of new data until | ||||
a non-duplicate ACK is received that acknowledges the receipt of new | ||||
data. The Fast Retransmit and Fast Recovery algorithms are usually | ||||
implemented together, as follows (from RFC 2581): | ||||
o When the third duplicate ACK is received, set ssthresh to no more | ||||
than the value given in the equation: ssthresh = max (FlightSize / | ||||
2, 2*SMSS) | ||||
o Retransmit the lost segment and set cwnd to ssthresh plus 3*SMSS. | ||||
This artificially "inflates" the congestion window by the number | ||||
of segments (three) that have left the network and which the | ||||
receiver has buffered. | ||||
o For each additional duplicate ACK received, increment cwnd by | ||||
SMSS. This artificially inflates the congestion window in order | ||||
to reflect the additional segment that has left the network. | ||||
o Transmit a segment, if allowed by the new value of cwnd and the | ||||
receiver's advertised window. | ||||
o When the next ACK arrives that acknowledges new data, set cwnd to | ||||
ssthresh (the value set in step 1). This is termed "deflating" | ||||
the window. | ||||
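The steps above can be modelled with a small state machine. This is a minimal sketch that only tracks cwnd and ssthresh; the class and method names are assumptions, and actual segment transmission, timers and window checks are omitted.

```python
class FastRecoverySender:
    """Hypothetical model of Fast Retransmit / Fast Recovery state."""

    def __init__(self, smss, cwnd, ssthresh):
        self.smss = smss
        self.cwnd = cwnd
        self.ssthresh = ssthresh
        self.dupacks = 0
        self.in_recovery = False

    def on_dupack(self, flight_size):
        """Process one duplicate ACK; return the action taken, if any."""
        self.dupacks += 1
        if self.dupacks == 3 and not self.in_recovery:
            # Third DupACK: halve ssthresh, retransmit, inflate by 3*SMSS.
            self.ssthresh = max(flight_size // 2, 2 * self.smss)
            self.cwnd = self.ssthresh + 3 * self.smss
            self.in_recovery = True
            return "retransmit"
        if self.in_recovery:
            # Each additional DupACK inflates cwnd by one SMSS.
            self.cwnd += self.smss
            return "inflate"
        return None

    def on_new_ack(self):
        """An ACK for new data deflates the window and ends recovery."""
        if self.in_recovery:
            self.cwnd = self.ssthresh
            self.in_recovery = False
        self.dupacks = 0
```

Note that the model inflates cwnd for every additional DupACK without bound, which is the behaviour the attacks discussed in Section 9.1.2 rely on.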
9.1. Congestion control with misbehaving receivers | 9.1. Congestion control with misbehaving receivers | |||
[Savage et al, 1999] describes a number of ways in which TCP's | [Savage et al, 1999] describes a number of ways in which TCP's | |||
congestion control mechanisms can be exploited by a misbehaving TCP | congestion control mechanisms can be exploited by a misbehaving TCP | |||
receiver to obtain more than its fair share of bandwidth. The | receiver to obtain more than its fair share of bandwidth. The | |||
following subsections provide a brief discussion of these | following subsections provide a brief discussion of these | |||
vulnerabilities, along with the possible countermeasures. | vulnerabilities, along with the possible countermeasures. | |||
9.1.1. ACK division | 9.1.1. ACK division | |||
TCP SHOULD increase cwnd by one SMSS only when a valid ACK covers the | Given that TCP updates cwnd based on the number of duplicate ACKs it | |||
entire data segment sent | receives, rather than on the amount of data that each ACK is actually | |||
acknowledging, a malicious TCP receiver could cause the TCP sender to | ||||
(note: or should we recommend the other counter-measure (i.e., | illegitimately increase its congestion window by acknowledging a data | |||
implementation of ABC?) | segment with a number of separate Acknowledgements, each covering a | |||
distinct piece of the received data segment. | ||||
DISCUSSION: | ||||
Given that TCP updates cwnd based on the number of duplicate ACKs | ||||
it receives, rather than on the amount of data that each ACK is | ||||
actually acknowledging, a malicious TCP receiver could cause the | ||||
TCP sender to illegitimately increase its congestion window by | ||||
acknowledging a data segment with a number of separate | ||||
Acknowledgements, each covering a distinct piece of the received | ||||
data segment. | ||||
See Figure 7, in page 64 of the UK CPNI document. | See Figure 7, in page 64 of the UK CPNI document. | |||
ACK division attack | ACK division attack | |||
[Savage et al, 1999] describes two possible countermeasures for | [Savage et al, 1999] describes two possible countermeasures for this | |||
this vulnerability. One of them is to increment cwnd not by a | vulnerability. One of them is to increment cwnd not by a full SMSS, | |||
full SMSS, but proportionally to the amount of data being | but proportionally to the amount of data being acknowledged by the | |||
acknowledged by the received ACK, similarly to the policy | received ACK, similarly to the policy described in RFC 3465 [Allman, | |||
described in RFC 3465 [Allman, 2003]. Another alternative is to | 2003]. Another alternative is to increase cwnd by one SMSS only when | |||
increase cwnd by one SMSS only when a valid ACK covers the entire | a valid ACK covers the entire data segment sent. | |||
data segment sent. | ||||
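As a rough illustration of the first countermeasure, the sketch below compares cwnd growth during slow start when each ACK is credited a full SMSS with growth when each ACK is credited only the bytes it acknowledges (capped at one SMSS, in the spirit of RFC 3465). The function name and the numeric values are assumptions chosen for the example.

```python
def slow_start_growth(cwnd, smss, ack_sizes, byte_counting):
    """Return cwnd after processing ACKs that each acknowledge new data.

    ack_sizes     -- number of previously unacknowledged bytes per ACK
    byte_counting -- if True, credit min(bytes_acked, smss) per ACK;
                     if False, credit a full SMSS per ACK.
    """
    for bytes_acked in ack_sizes:
        cwnd += min(bytes_acked, smss) if byte_counting else smss
    return cwnd


SMSS = 1460
divided_acks = [146] * 10   # one 1460-byte segment acknowledged in 10 pieces

per_ack = slow_start_growth(2 * SMSS, SMSS, divided_acks, byte_counting=False)
per_byte = slow_start_growth(2 * SMSS, SMSS, divided_acks, byte_counting=True)
# per_ack  == 17520 (cwnd grew by 10 * SMSS)
# per_byte == 4380  (cwnd grew by exactly one SMSS)
```

With byte counting, splitting an acknowledgement into many pieces yields no more window growth than a single ACK covering the whole segment.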
9.1.2. DupACK forgery | 9.1.2. DupACK forgery | |||
TCP SHOULD keep track of the number of outstanding segments (o_seg), | The second vulnerability discussed in [Savage et al, 1999] allows an | |||
and accept only up to (o_seg -1) duplicate Acknowledgements. | attacker to cause the TCP sender to illegitimately increase its | |||
congestion window by forging a number of duplicate Acknowledgements | ||||
DISCUSSION: | (DupACKs). Figure 8 shows a sample scenario. The first three | |||
DupACKs trigger the Fast Recovery mechanism, while the rest of them | ||||
The second vulnerability discussed in [Savage et al, 1999] allows | cause the congestion window at the TCP sender to be illegitimately | |||
an attacker to cause the TCP sender to illegitimately increase its | inflated. Thus, the attacker is able to illegitimately cause the TCP | |||
congestion window by forging a number of duplicate | sender to increase its data transmission rate. | |||
Acknowledgements (DupACKs). Figure 8 shows a sample scenario. | ||||
The first three DupACKs trigger the Fast Recovery mechanism, while | ||||
the rest of them cause the congestion window at the TCP sender to | ||||
be illegitimately inflated. Thus, the attacker is able to | ||||
illegitimately cause the TCP sender to increase its data | ||||
transmission rate. | ||||
See Figure 8, in page 65 of the UK CPNI document. | See Figure 8, in page 65 of the UK CPNI document. | |||
DupACK forgery attack | DupACK forgery attack | |||
Fortunately, a number of sender-side heuristics can be implemented | Fortunately, a number of sender-side heuristics can be implemented to | |||
to mitigate this vulnerability. First, the TCP sender could keep | mitigate this vulnerability. First, the TCP sender could keep track | |||
track of the number of outstanding segments (o_seg), and accept | of the number of outstanding segments (o_seg), and accept only up to | |||
only up to (o_seg - 1) DupACKs. Secondly, a TCP sender might, for | (o_seg - 1) DupACKs. Secondly, a TCP sender might, for example, | |||
example, refuse to enter Fast Recovery multiple times in some | refuse to enter Fast Recovery multiple times in some period of time | |||
period of time (e.g., one RTT). | (e.g., one RTT). | |||
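The first heuristic can be sketched as a small counter kept by the sender. The class and method names are hypothetical; the only logic taken from the text is the cap of (o_seg - 1) DupACKs, on the reasoning that at least one segment must have been lost for DupACKs to be legitimate, so at most o_seg - 1 segments can have triggered them.

```python
class DupAckFilter:
    """Hypothetical sender-side filter that caps the DupACKs honoured."""

    def __init__(self):
        self.counted = 0   # DupACKs accepted since the last new ACK

    def on_new_ack(self):
        """An ACK for new data resets the DupACK count."""
        self.counted = 0

    def on_dupack(self, outstanding_segments):
        """Return True if this DupACK may be acted upon, False to ignore it."""
        if self.counted >= outstanding_segments - 1:
            return False   # more DupACKs than segments that could cause them
        self.counted += 1
        return True
```

With four outstanding segments, only the first three DupACKs would be accepted; any further ones would not inflate cwnd.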
[Savage et al, 1999] also describes a modification to TCP to | [Savage et al, 1999] also describes a modification to TCP to | |||
implement a nonce protocol that would eliminate this | implement a nonce protocol that would eliminate this vulnerability. | |||
vulnerability. However, this would require modification of all | However, this would require modification of all implementations, | |||
implementations, which makes this counter-measure hard to deploy. | which makes this counter-measure hard to deploy. | |||
9.1.3. Optimistic ACKing | 9.1.3. Optimistic ACKing | |||
Another alternative for an attacker to exploit TCP's congestion | Another alternative for an attacker to exploit TCP's congestion | |||
control mechanisms is to acknowledge data that has not yet been | control mechanisms is to acknowledge data that has not yet been | |||
received, thus causing the congestion window at the TCP sender to be | received, thus causing the congestion window at the TCP sender to be | |||
incremented faster than it should. | incremented faster than it should. | |||
See Figure 9, in page 66 of the UK CPNI document. | See Figure 9, in page 66 of the UK CPNI document. | |||
skipping to change at page 68, line 31 | skipping to change at page 50, line 37 | |||
TCP", the third duplicate-ACK will cause the "lost" segment to be | TCP", the third duplicate-ACK will cause the "lost" segment to be | |||
retransmitted, and each subsequent duplicate-ACK will cause cwnd to | retransmitted, and each subsequent duplicate-ACK will cause cwnd to | |||
be artificially inflated. Thus, the "sending TCP" might end up | be artificially inflated. Thus, the "sending TCP" might end up | |||
injecting more packets into the network than it really should, with | injecting more packets into the network than it really should, with | |||
the potential of causing network congestion. This is a potential | the potential of causing network congestion. This is a potential | |||
consequence of the "Duplicate-ACK spoofing attack" described in | consequence of the "Duplicate-ACK spoofing attack" described in | |||
[Savage et al, 1999]. | [Savage et al, 1999]. | |||
Secondly, if bursts of three duplicate ACKs are sent to the TCP | Secondly, if bursts of three duplicate ACKs are sent to the TCP | |||
sender, the attacked system would infer packet loss, and ssthresh and | sender, the attacked system would infer packet loss, and ssthresh and | |||
cwnd would be reduced. As noted in RFC 2581 [Allman et al, 1999], | cwnd would be reduced. As noted in RFC 5681 [RFC5681], causing two | |||
causing two congestion control events back-to-back will often cut | congestion control events back-to-back will often cut ssthresh and | |||
ssthresh and cwnd to their minimum value of 2*SMSS, with the | cwnd to their minimum value of 2*SMSS, with the connection | |||
connection immediately entering the slower-performing congestion | immediately entering the slower-performing congestion avoidance | |||
avoidance phase. While it would not be attractive for an attacker to | phase. While it would not be attractive for an attacker to perform | |||
perform this attack against one of his TCP connections, the attack | this attack against one of his TCP connections, the attack might be | |||
might be attractive when the TCP connection to be attacked is | attractive when the TCP connection to be attacked is established | |||
established between two other parties. | between two other parties. | |||
It is usually assumed that in order for an off-path attacker to | It is usually assumed that in order for an off-path attacker to | |||
perform attacks against a third-party TCP connection, he should be | perform attacks against a third-party TCP connection, he should be | |||
able to guess a number of values, including a valid TCP Sequence | able to guess a number of values, including a valid TCP Sequence | |||
Number and a valid TCP Acknowledgement Number. While this is true if | Number and a valid TCP Acknowledgement Number. While this is true if | |||
the attacker tries to "inject" valid packets into the connection by | the attacker tries to "inject" valid packets into the connection by | |||
himself, a feature of TCP can be exploited to fool one of the TCP | himself, a feature of TCP can be exploited to fool one of the TCP | |||
endpoints into transmitting valid duplicate Acknowledgements on behalf of | endpoints into transmitting valid duplicate Acknowledgements on behalf of | |||
the attacker, hence relieving the attacker of the hard task of | the attacker, hence relieving the attacker of the hard task of | |||
forging valid values for the Sequence Number and Acknowledgement | forging valid values for the Sequence Number and Acknowledgement | |||
Number TCP header fields. | Number TCP header fields. | |||
Section 3.9 of RFC 793 [Postel, 1981c] describes the processing of | Section 3.9 of RFC 793 [RFC0793] describes the processing of incoming | |||
incoming TCP segments as a function of the connection state and the | TCP segments as a function of the connection state and the contents | |||
contents of the various header fields of the received segment. For | of the various header fields of the received segment. For | |||
connections in the ESTABLISHED state, the first check that is | connections in the ESTABLISHED state, the first check that is | |||
performed on incoming segments is that they contain "in window" data. | performed on incoming segments is that they contain "in window" data. | |||
That is, | That is, | |||
RCV.NXT <= SEG.SEQ <= RCV.NXT+RCV.WND, or | RCV.NXT <= SEG.SEQ <= RCV.NXT+RCV.WND, or | |||
RCV.NXT <= SEG.SEQ+SEG.LEN-1 < RCV.NXT+RCV.WND | RCV.NXT <= SEG.SEQ+SEG.LEN-1 < RCV.NXT+RCV.WND | |||
If a segment does not pass this check, it is dropped, and an | If a segment does not pass this check, it is dropped, and an | |||
Acknowledgement is sent in response: | Acknowledgement is sent in response: | |||
skipping to change at page 71, line 22 | skipping to change at page 53, line 28 | |||
segments (in red) sent by the attacker causes the TCP sender to enter | segments (in red) sent by the attacker causes the TCP sender to enter | |||
the loss recovery phase and illegitimately inflate the congestion | the loss recovery phase and illegitimately inflate the congestion | |||
window, leading to an increase in the data transmission rate. Once a | window, leading to an increase in the data transmission rate. Once a | |||
segment that acknowledges new data is received by the TCP sender, the | segment that acknowledges new data is received by the TCP sender, the | |||
loss recovery phase ends, and the data transmission rate is reduced. | loss recovery phase ends, and the data transmission rate is reduced. | |||
See Figure 12, in page 70 of the UK CPNI document. | See Figure 12, in page 70 of the UK CPNI document. | |||
Blind flooding attack (time-line graph) | Blind flooding attack (time-line graph) | |||
Figure 13 is a time-sequence graph produced from packet logs obtained | ||||
from tests of the described attack in a real network. A burst of | ||||
segments is sent upon receipt of the burst of Duplicate | ||||
Acknowledgements illegitimately elicited by the attacker. Figure 14 | ||||
is an averaged-throughput graphic for the same time frame, which | ||||
clearly shows the effect of the attack in terms of throughput. | ||||
See Figure 13, in page 71 of the UK CPNI document. | ||||
Blind flooding attack (time sequence graph) | ||||
See Figure 14, in page 71 of the UK CPNI document. | ||||
Blind flooding attack (averaged throughput graph) | ||||
These graphics were produced with Shawn Ostermann's tcptrace tool | ||||
[Ostermann, 2008]. An explanation of the format of the graphics can | ||||
be found in tcptrace's manual (available at the project's web site: | ||||
http://www.tcptrace.org). | ||||
9.2.3. Difficulty in performing the attacks | 9.2.3. Difficulty in performing the attacks | |||
In order to exploit the technique described in Section 9.2 of this | In order to exploit the technique described in Section 9.2 of this | |||
document, an attacker would need to know the four-tuple {IP Source | document, an attacker would need to know the four-tuple {IP Source | |||
Address, TCP Source Port, IP Destination Address, TCP Destination | Address, TCP Source Port, IP Destination Address, TCP Destination | |||
Port} that identifies the connection to be attacked. As discussed by | Port} that identifies the connection to be attacked. As discussed by | |||
[Watson, 2004] and RFC 4953 [Touch, 2007], there are a number of | [Watson, 2004] and RFC 4953 [Touch, 2007], there are a number of | |||
scenarios in which these values may be known or easily guessed. | scenarios in which these values may be known or easily guessed. | |||
It is interesting to note that the attacks described in Section 9.2 | It is interesting to note that the attacks described in Section 9.2 | |||
skipping to change at page 73, line 10 | skipping to change at page 54, line 43 | |||
interesting in the case of the blind-flooding attack, as the attack | interesting in the case of the blind-flooding attack, as the attack | |||
would elicit even more packets from the TCP sender. | would elicit even more packets from the TCP sender. | |||
Whether a full-window or just half a window of data is retransmitted | Whether a full-window or just half a window of data is retransmitted | |||
depends on the Acknowledgement policy at the TCP receiver. If the | depends on the Acknowledgement policy at the TCP receiver. If the | |||
TCP receiver sends an Acknowledgement (ACK) for every segment, a | TCP receiver sends an Acknowledgement (ACK) for every segment, a | |||
full-window of data will be retransmitted. If the TCP receiver sends | full-window of data will be retransmitted. If the TCP receiver sends | |||
an Acknowledgement (ACK) for every other segment, then only half a | an Acknowledgement (ACK) for every other segment, then only half a | |||
window of data will be retransmitted. | window of data will be retransmitted. | |||
Figure 15 is a time-sequence graph produced from packet logs obtained | ||||
from tests performed in a real network. Once loss recovery is | ||||
illegitimately triggered by the duplicate-ACKs elicited by the | ||||
attacker, an entire flight of data is unnecessarily retransmitted. | ||||
Figure 16 is an averaged-throughput graphic for the same time-frame, | ||||
which shows an increase in the throughput of the connection resulting | ||||
from the retransmission of segments governed by NewReno's loss | ||||
recovery. | ||||
See Figure 15, in page 73 of the UK CPNI document. | ||||
NewReno loss recovery (time-sequence graph) | ||||
See Figure 16, in page 74 of the UK CPNI document. | ||||
NewReno loss recovery (averaged throughput graph) | ||||
Limited Transmit | Limited Transmit | |||
RFC 3042 [Allman et al, 2001] proposes an enhancement to TCP to more | RFC 3042 [Allman et al, 2001] proposes an enhancement to TCP to more | |||
effectively recover lost segments when a connection's congestion | effectively recover lost segments when a connection's congestion | |||
window is small, or when a large number of segments are lost in a | window is small, or when a large number of segments are lost in a | |||
single transmission window. The "Limited Transmit" algorithm calls | single transmission window. The "Limited Transmit" algorithm calls | |||
for sending a new data segment in response to each of the first two | for sending a new data segment in response to each of the first two | |||
Duplicate Acknowledgements that arrive at the TCP sender. This would | Duplicate Acknowledgements that arrive at the TCP sender. This would | |||
provide two additional transmitted packets that may be useful for the | provide two additional transmitted packets that may be useful for the | |||
attacker if the blind flooding attack described in | attacker if the blind flooding attack described in | |||
Section 9.2.2 is performed. | Section 9.2.2 is performed. | |||
SACK-based loss recovery | SACK-based loss recovery | |||
RFC 3517 [Blanton et al, 2003] specifies a conservative loss-recovery | [I-D.ietf-tcpm-3517bis] specifies a conservative loss-recovery | |||
algorithm that is based on the use of the selective acknowledgement | algorithm that is based on the use of the selective acknowledgement | |||
(SACK) TCP option. The algorithm uses DupACKs as an indication of | (SACK) TCP option. The algorithm uses DupACKs as an indication of | |||
congestion, as specified in RFC 2581 [Allman et al, 1999]. However, | congestion, as specified in RFC 2581 [RFC5681]. However, a | |||
a difference between this algorithm and the basic algorithm described | difference between this algorithm and the basic algorithm described | |||
in RFC 2581 is that it clocks out segments only with the SACK | in RFC 2581 is that it clocks out segments only with the SACK | |||
information included in the DupACKs. That is, during the loss | information included in the DupACKs. That is, during the loss | |||
recovery phase, segments will be injected in the network only if the | recovery phase, segments will be injected in the network only if the | |||
SACK information included in the received DupACKs indicates that one | SACK information included in the received DupACKs indicates that one | |||
or more segments have left the network. As a result, those systems | or more segments have left the network. As a result, those systems | |||
that implement SACK-based loss recovery will not be vulnerable to the | that implement SACK-based loss recovery will not be vulnerable to the | |||
blind flooding attack described in Section 9.2.2. However, as RFC | blind flooding attack described in Section 9.2.2. Additionally, as | |||
3517 does not actually require DupACKs to include new SACK | [I-D.ietf-tcpm-3517bis] requires DupACKs to include new SACK | |||
information (corresponding to data that has not yet been acknowledged | information (corresponding to data that has not yet been acknowledged | |||
by TCP's cumulative Acknowledgement), systems that implement SACK- | by TCP's cumulative Acknowledgement), systems that implement SACK- | |||
based loss-recovery may still remain vulnerable to the blind | based loss-recovery will not be vulnerable to the blind throughput- | |||
throughput-reduction attack described in Section 9.2.1. SACK-based | reduction attack described in Section 9.2.1. | |||
loss recovery implementations should be updated to implement the | ||||
countermeasure ("Use of SACK information to validate DupACKs") | ||||
described in Section 9.2.5. | ||||
9.2.5. Countermeasures | 9.2.5. Countermeasures | |||
TCP SHOULD validate the Sequence Number of an incoming TCP segment | [draft-gont-tcpm-limiting-aow-segments-00.txt] proposes to rate-limit | |||
as follows: | the reaction to out-of-window segments. This would mitigate the | |||
attacks described earlier in this section. | ||||
RCV.NXT - MAX.RCV.WND <= SEG.SEQ <= RCV.NXT + RCV.WND | ||||
where MAX.RCV.WND is the largest TCP window that has so far been | ||||
advertised to the remote endpoint. | ||||
If a segment passes this check, the processing rules specified in RFC | ||||
793 [Postel, 1981c] MUST be applied. Otherwise, TCP SHOULD send an ACK | ||||
(as specified by the processing rules in RFC 793 [Postel, 1981c]), | ||||
applying rate-limiting to the Acknowledgement segments sent in | ||||
response to out-of-window segments. | ||||
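The check above can be sketched as follows. This is a minimal illustration rather than a reference implementation: the function and parameter names are hypothetical, and the modulo-2**32 comparison is an assumption added here so that the inequality still holds across sequence-number wrap-around.

```python
MOD = 1 << 32  # TCP sequence numbers are 32 bits wide and wrap around


def seq_acceptable(seg_seq, rcv_nxt, rcv_wnd, max_rcv_wnd):
    """Check RCV.NXT - MAX.RCV.WND <= SEG.SEQ <= RCV.NXT + RCV.WND.

    All comparisons are done modulo 2**32 (an assumption of this sketch).
    """
    lower_edge = (rcv_nxt - max_rcv_wnd) % MOD
    # Distance of SEG.SEQ from the lower edge of the acceptable range.
    offset = (seg_seq - lower_edge) % MOD
    # The acceptable range spans MAX.RCV.WND + RCV.WND sequence numbers.
    return offset <= max_rcv_wnd + rcv_wnd
```

A segment for which seq_acceptable() returns False would be answered with a rate-limited ACK rather than processed according to the rules of RFC 793.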
DISCUSSION: | ||||
As discussed in Section 9.2, TCP responds with an ACK when an out- | ||||
of-window segment is received, to accommodate those scenarios in | ||||
which the Acknowledgement segments that correspond to some | ||||
received data are lost in the network, and to help discover half- | ||||
open TCP connections. | ||||
However, it is possible to restrict the sequence numbers that are | ||||
considered acceptable, and have TCP respond with ACKs only when it | ||||
is strictly necessary. | ||||
A feature of TCP is that, in some scenarios, it can detect half- | ||||
open connections. If an implementation chose to silently drop | ||||
those TCP segments that do not pass the check enforced by the | ||||
equation above, it could prevent TCP from detecting half-open | ||||
connections. Figure 17 shows a scenario in which, provided that | ||||
"TCP B" behaves as specified in RFC 793, a half-open connection | ||||
would be discovered and aborted. | ||||
An established connection is said to be "half open" if one of the | ||||
TCPs has closed or aborted the connection at its end without the | ||||
knowledge of the other, or if the two ends of the connection have | ||||
become desynchronized owing to a crash that resulted in loss of | ||||
memory. | ||||
See Figure 17, on page 76 of the UK CPNI document. | ||||
Half-Open Connection Discovery | ||||
In the scenario illustrated by Figure 17, TCP A crashes losing the | ||||
connection-state information of the TCP connection with TCP B. In | ||||
line 3, TCP A tries to establish a new connection with TCP B, | ||||
using the same four-tuple {IP Source Address, TCP source port, IP | ||||
Destination Address, TCP destination port}. In line 4, as the SYN | ||||
segment is out of window, TCP B responds with an ACK. This ACK | ||||
elicits an RST segment from TCP A, which causes the half-open | ||||
connection at TCP B to be aborted. | ||||
If the SYN segment had been "in window", TCP B would have sent an | ||||
RST segment instead, which would have closed the half-open | ||||
connection. Ongoing work at the TCPM WG of the IETF proposes to | ||||
change this behavior, and make TCP respond to a SYN segment | ||||
received for any of the synchronized states with an ACK segment, | ||||
to avoid in-window SYN segments from being used to perform | ||||
connection-reset attacks [Ramaiah et al, 2008]. | ||||
However, in case the out-of-window segment was silently dropped, | ||||
the scenario in Figure 17 would change into that in Figure 18. | ||||
See Figure 18, on page 76 of the UK CPNI document. | ||||
Half-Open Connection Discovery with the proposed counter-measure | ||||
In line 3, the SYN segment sent by TCP A is silently dropped by | ||||
TCP B because it does not pass the check enforced by the equation | ||||
above (i.e., it contains an out-of-window sequence number). As a | ||||
result, some time later (an RTO) TCP A retransmits its SYN | ||||
segment. Even after TCP A times out, the half-open connection at | ||||
TCP B will remain in the same state. | ||||
Thus, a conservative reaction to those segments that do not pass | ||||
the check enforced by the equation above would be to respond with | ||||
an Acknowledgement segment (as specified by RFC 793), applying | ||||
rate-limiting to those Acknowledgement segments sent in response | ||||
to segments that do not pass the check enforced by that equation. | ||||
An implementation might choose to enforce a rate-limit of, e.g., | ||||
one ACK per five seconds, as a single ACK segment is needed for | ||||
the Half-Open Connection Discovery mechanism to work. | ||||
As the only reason to respond with an ACK to those segments that | ||||
do not pass the check enforced by the equation above is to allow | ||||
TCP to discover half-open connections, an aggressive rate-limit | ||||
can be enforced. As long as the rate-limit prevents out-of-window | ||||
segments from eliciting three Acknowledgment segments in a Round- | ||||
trip Time (RTT), an attacker would not be able to trigger TCP's | ||||
loss-recovery, and thus would not be able to perform the attacks | ||||
described in the previous sections. | ||||
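A per-connection rate limiter of the kind discussed above might look like the following sketch. The class and method names are hypothetical, and the five-second default simply mirrors the example value given in the text; the caller is assumed to supply a monotonic timestamp in seconds.

```python
class AckRateLimiter:
    """Limits ACKs sent in response to out-of-window segments."""

    def __init__(self, min_interval=5.0):
        # Minimum number of seconds between two such ACKs.
        self.min_interval = min_interval
        self.last_ack_time = None

    def should_send_ack(self, now):
        """Return True if an ACK may be sent at time `now` (seconds)."""
        if (self.last_ack_time is None
                or now - self.last_ack_time >= self.min_interval):
            self.last_ack_time = now
            return True
        # Within the rate-limit interval: drop silently.
        return False
```

With an interval of this size, out-of-window segments cannot elicit three ACKs within one RTT on any realistic path, so they cannot trigger loss recovery at the peer.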
It is interesting to note that RFC 793 [Postel, 1981c] itself | ||||
states that half-open connections are expected to be unusual. | ||||
Additionally, given that in many scenarios it may be unlikely for | ||||
a TCP connection request to be issued with the same four-tuple as | ||||
that of the half-open connection, a complete solution for the | ||||
discovery of half-open connections cannot rely on the mechanism | ||||
illustrated by Figure 17, either. Therefore, some implementations | ||||
might choose to sacrifice TCP's ability to detect half-open | ||||
connections, and have a more aggressive reaction to those segments | ||||
that do not pass the check enforced by the equation above by | ||||
silently dropping them. | ||||
This validation check can also help to avoid ACK wars in some | ||||
scenarios that may arise from the use of transparent proxies. In | ||||
those scenarios, when the transparent proxy fails to wire (i.e., | ||||
is disabled), the sequence numbers of the two end-points of the | ||||
TCP connection become desynchronized, and both TCPs begin to send | ||||
duplicate Acknowledgements to each other, with the intention of | ||||
re-synchronizing them. As the sequence numbers never get re- | ||||
synchronized, the ACK war can only be stopped by an external | ||||
agent. | ||||
TCP SHOULD limit the number of duplicate acknowledgements it will | ||||
honour to: | ||||
Max_DupACKs = (FlightSize / SMSS) - 1 | ||||
Where FlightSize and SMSS are the values defined in RFC 2581 [Allman | ||||
et al, 1999]. When more than Max_DupACKs duplicate acknowledgements | ||||
are received, the exceeding DupACKs should be silently dropped. | ||||
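As a worked sketch of the limit above: with FlightSize = 14600 bytes and SMSS = 1460 bytes, ten segments are outstanding, so at most nine of them can legitimately elicit a DupACK. The function names below are hypothetical, and the use of integer division and the clamp at zero are assumptions of this sketch, since the text does not say how to round.

```python
def max_dupacks(flight_size, smss):
    """Max_DupACKs = (FlightSize / SMSS) - 1, never below zero."""
    return max(flight_size // smss - 1, 0)


def honour_dupack(dupack_count, flight_size, smss):
    """Return True if the dupack_count-th DupACK (1-based) is honoured."""
    return dupack_count <= max_dupacks(flight_size, smss)
```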
DISCUSSION: | ||||
Note that duplicate acknowledgements should be elicited by out-of- | ||||
order segments. | ||||
In the case of TCP connections that have agreed to employ SACK, TCP | ||||
SHOULD validate duplicate ACKs with the following criteria: Valid | ||||
Duplicate ACKs MUST contain new SACK information. The SACK | ||||
information MUST refer to data that has already been sent, but that | ||||
has not yet been acknowledged by TCP's cumulative Acknowledgement. A | ||||
TCP segment that does not pass this check SHOULD NOT be considered as | ||||
"duplicate Acknowledgement". | ||||
DISCUSSION: | ||||
SACK, specified in RFC 2018 [Mathis et al, 1996], provides a mechanism | ||||
for TCP to be able to acknowledge the receipt of out-of-order TCP | ||||
segments. For connections that have agreed to use SACK, each | ||||
legitimate DupACK will contain new SACK information that reflects | ||||
the data bytes contained in the out-of-order data segment that | ||||
elicited the DupACK. | ||||
RFC 3517 [Blanton et al, 2003] specifies a SACK-based loss | ||||
recovery algorithm for TCP. However, it does not recommend that TCP | ||||
implementations validate DupACKs by requiring that they contain | ||||
new SACK information. Results obtained from auditing a number of | ||||
TCP implementations seem to indicate that most TCP implementations | ||||
do not enforce this validation check on incoming DupACKs, either. | ||||
In the case of TCP connections that have agreed to use SACK, a | ||||
validation check should be performed on incoming ACK segments to | ||||
completely eliminate the attacks described in Section 9.2.1 and | ||||
Section 9.2.2 of this document: "Duplicate ACKs should contain new | ||||
SACK information. The SACK information should refer to data that | ||||
has already been sent, but that has not yet been acknowledged by | ||||
TCP's cumulative Acknowledgement". | ||||
Those ACK segments that do not comply with this validation check | ||||
should not be considered "duplicate ACKs", and thus should not | ||||
trigger the loss-recovery phase. | ||||
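The validation check can be sketched as follows. This is a simplified illustration under stated assumptions: the names are hypothetical, the scoreboard is modelled as a plain set of (left edge, right edge) pairs that have already been reported, and a real implementation would merge overlapping ranges rather than compare exact blocks.

```python
MOD = 1 << 32  # sequence-number space


def is_valid_dupack(sack_blocks, snd_una, snd_nxt, scoreboard):
    """Return True if the ACK carries new SACK information.

    A block counts only if it covers data that has been sent
    (right edge <= SND.NXT) but not yet cumulatively acknowledged
    (left edge >= SND.UNA), and it has not been reported before.
    """
    outstanding = (snd_nxt - snd_una) % MOD
    for left, right in sack_blocks:
        lo = (left - snd_una) % MOD
        hi = (right - snd_una) % MOD
        in_range = lo < hi <= outstanding
        if in_range and (left, right) not in scoreboard:
            return True
    # No SACK option, stale blocks, or blocks outside the
    # outstanding data: do not treat this as a duplicate ACK.
    return False
```

An ACK for which this returns False would be processed normally but would not count towards the DupACK threshold that triggers loss recovery.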
In case at least one segment in a window of data has been lost, | ||||
the successive segments will elicit the generation of Duplicate | ||||
ACKs containing new SACK information. This SACK information will | ||||
indicate the receipt of these successive segments by the TCP | ||||
receiver. | ||||
In the case of pure ACKs illegitimately elicited by out-of-window | ||||
segments, however, the ACKs will not contain any SACK information. | ||||
If DSACK (specified in RFC 2883 [Floyd et al, 2000]) were implemented | ||||
by the TCP receiver, then the illegitimately elicited DupACKs | ||||
might contain out-of-window SACK information if the sequence | ||||
number of the forged TCP segment (SEG.SEQ) is lower than the next | ||||
expected sequence number (RCV.NXT) at the TCP receiver. Such | ||||
segments should be considered to indicate the receipt of duplicate | ||||
data, rather than an indication of lost data, and therefore should | ||||
not trigger loss recovery. | ||||
Other possible general mitigations are discussed in the following | ||||
paragraphs: | ||||
TCP port number randomization | ||||
As in order to perform the blind attacks described in Section 9.2.1 | ||||
and Section 9.2.2 the attacker needs to know the TCP port numbers in | ||||
use by the connection to be attacked, obfuscating the TCP source port | ||||
used for outgoing TCP connections will increase the number of packets | ||||
required to successfully perform these attacks. Section 3.1 of this | ||||
document discusses the use of port randomization. | ||||
It must be noted that given that these blind DupACK triggering | ||||
attacks do not require the attacker to forge valid TCP Sequence | ||||
numbers and TCP Acknowledgement numbers, port randomization should | ||||
not be relied upon as a first line of defense. | ||||
Ingress and Egress filtering | ||||
Ingress and Egress filtering reduces the number of systems in the | ||||
global Internet that can perform attacks that rely on forged source | ||||
IP addresses. While protection from the blind attacks discussed in | ||||
Section 9.2 should not rely only on Ingress and Egress filtering, its | ||||
deployment is recommended to help prevent all attacks that rely on | ||||
forged IP addresses. RFC 3704 [Baker and Savola, 2004], RFC 2827 | ||||
[Ferguson and Senie, 2000], and [NISCC, 2006] provide advice on | ||||
Ingress and Egress filtering. | ||||
Generalized TTL Security Mechanism (GTSM) | ||||
RFC 5082 [Gill et al, 2007] proposes a check on the TTL field of the | ||||
IP packets that correspond to a given TCP connection to reduce the | ||||
number of systems that could successfully attack the protected TCP | ||||
connection. It provides for the attacks discussed in this document | ||||
the same level of protection as for the attacks described in | ||||
[Watson, 2004] and RFC 4953 [Touch, 2007]. While implementation of | ||||
this mechanism may be useful in some scenarios, it should be clear | ||||
that countermeasures discussed in the previous sections provide a | ||||
more effective and simpler solution than that provided by the GTSM. | ||||
9.3. TCP Explicit Congestion Notification (ECN) | 9.3. TCP Explicit Congestion Notification (ECN) | |||
ECN (Explicit Congestion Notification) provides a mechanism for | ECN (Explicit Congestion Notification) provides a mechanism for | |||
intermediate systems to signal congestion to the communicating | intermediate systems to signal congestion to the communicating | |||
endpoints that in some scenarios can be used as an alternative to | endpoints that in some scenarios can be used as an alternative to | |||
dropping packets. | dropping packets. | |||
RFC 3168 [Ramakrishnan et al, 2001] contains a detailed discussion of | RFC 3168 [Ramakrishnan et al, 2001] contains a detailed discussion of | |||
the possible ways and scenarios in which ECN could be exploited by an | the possible ways and scenarios in which ECN could be exploited by an | |||
skipping to change at page 79, line 27 | skipping to change at page 56, line 6 | |||
on nonces, that protects against accidental or malicious concealment | on nonces, that protects against accidental or malicious concealment | |||
of marked packets from the TCP sender. The specified mechanism | of marked packets from the TCP sender. The specified mechanism | |||
defines a "NS" ("Nonce Sum") field in the TCP header that makes use | defines a "NS" ("Nonce Sum") field in the TCP header that makes use | |||
of one bit from the Reserved field, and requires a modification in | of one bit from the Reserved field, and requires a modification in | |||
both of the endpoints of a TCP connection to process this new field. | both of the endpoints of a TCP connection to process this new field. | |||
This mechanism is still in "Experimental" status, and since it might | This mechanism is still in "Experimental" status, and since it might | |||
suffer from the behavior of some middle-boxes such as firewalls or | suffer from the behavior of some middle-boxes such as firewalls or | |||
packet-scrubbers, we defer a recommendation of this mechanism until | packet-scrubbers, we defer a recommendation of this mechanism until | |||
more experience is gained. | more experience is gained. | |||
There also is ongoing work in the research community and the IETF to | There also is ongoing work in the research community and the IETF | |||
define alternate semantics for the ECN field of the IP header (e.g., | to define alternate semantics for the ECN field of the IP header | |||
see [PCNWG, 2009]). | (e.g., see [PCNWG, 2009]). | |||
The following subsections try to summarize the security implications | ||||
of ECN. | ||||
9.3.1. Possible attacks by a compromised router | ||||
Firstly, a router controlled by a malicious user could erase the CE | ||||
codepoint (by replacing it with the ECT(0), ECT(1), or non-ECT | ||||
codepoints), effectively eliminating the congestion indication. As a | ||||
result, the corresponding TCP sender would not reduce its data | ||||
transmission rate, possibly leading to network congestion. This | ||||
could also lead to unfairness, as this flow could experience better | ||||
performance than other flows for which the congestion indication is | ||||
not erased (and thus their transmission rate is reduced). | ||||
Secondly, a router controlled by a malicious user could | ||||
illegitimately set the CE codepoint, falsely indicating congestion, | ||||
to cause the TCP sender to reduce its data transmission rate. | ||||
However, this particular attack is no worse than the malicious router | ||||
simply dropping the packets rather than setting their CE codepoint. | ||||
Thirdly, a malicious router could turn off the ECT codepoint of a | ||||
packet, thus disabling ECN support. As a result, if the packet later | ||||
arrives at a router that is experiencing congestion, it may be | ||||
dropped rather than marked. As with the previous scenario, though, | ||||
this is no worse than the malicious router simply dropping the | ||||
corresponding packet. | ||||
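The three manipulations above all amount to rewriting the two low-order bits of the former IPv4 TOS octet, where RFC 3168 places the ECN field (Not-ECT = 00, ECT(1) = 01, ECT(0) = 10, CE = 11). The following sketch makes that concrete; the function names are hypothetical, and the choice of ECT(0) as the replacement codepoint in erase_ce() is arbitrary.

```python
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11
ECN_MASK = 0b11  # two low-order bits of the TOS / Traffic Class octet


def get_ecn(tos):
    return tos & ECN_MASK


def set_ecn(tos, codepoint):
    # Keep the six DSCP bits, replace only the ECN field.
    return (tos & 0xFC) | codepoint


def erase_ce(tos):
    """First attack: hide a congestion mark from the endpoints."""
    return set_ecn(tos, ECT_0) if get_ecn(tos) == CE else tos


def force_ce(tos):
    """Second attack: falsely signal congestion."""
    return set_ecn(tos, CE)


def strip_ect(tos):
    """Third attack: make the packet look non-ECN-capable."""
    return set_ecn(tos, NOT_ECT)
```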
It should be noted that a compromised on-path IP router could engage | ||||
in a much broader range of attacks, with broader impacts, and at much | ||||
lower attacker cost than the ones described here. Such a compromised | ||||
router is extremely unlikely to engage in the attack vectors | ||||
discussed in this section, given the existence of more effective | ||||
attack vectors that have lower attacker cost. | ||||
9.3.2. Possible attacks by a malicious TCP endpoint | ||||
If a packet with the ECT codepoint set arrives at an ECN-capable | ||||
router that is experiencing moderate congestion, the router may | ||||
decide to set its CE codepoint instead of dropping it. If either of | ||||
the TCP endpoints does not honour the congestion indication provided by | ||||
an ECN-capable router, this would result in unfairness, as other | ||||
(legitimate) ECN-capable flows would still reduce their sending rate | ||||
in response to the ECN marking of packets. Furthermore, under | ||||
moderate congestion, non-ECN-capable flows would be subject to packet | ||||
drops by the same router. As a result, the flow with a malicious TCP | ||||
end-point would obtain better service than the legitimate flows. | ||||
As noted in RFC 3168 [Ramakrishnan et al, 2001], a TCP endpoint | ||||
falsely indicating ECN capability could lead to unfairness, allowing | ||||
the misbehaving flow to get more than its fair share of the | ||||
bandwidth. This could be the result of the mis-behavior of either of | ||||
the TCP endpoints. For example, the sending TCP could indicate ECN | ||||
capability, but then send a CWR in response to an ECE without | ||||
actually reducing its congestion window. Alternatively (or in | ||||
addition), the receiving TCP could simply ignore those packets with | ||||
the CE codepoint set, thus preventing the sending TCP from receiving | ||||
the congestion indication. | ||||
In the case of the sending TCP ignoring the ECN congestion | ||||
indication, this would be no worse than the sending TCP ignoring the | ||||
congestion indication provided by a lost segment. However, the case | ||||
of a TCP receiver ignoring the CE codepoint allows the TCP receiver | ||||
to get more than its fair share of bandwidth in a way that was | ||||
previously unavailable. If congestion was kept "moderate", then the | ||||
malicious TCP receiver could maintain the unfairness, as the router | ||||
experiencing congestion would mark the offending packets of the | ||||
misbehaving flow rather than dropping them. At the same time, | ||||
legitimate ECN-capable flows would respond to the congestion | ||||
indication provided by the CE codepoint, while legitimate non-ECN- | ||||
capable flows would be subject to packet dropping. However, if | ||||
congestion became sufficiently heavy, the router experiencing | ||||
congestion would switch from marking packets to dropping packets, and | ||||
at that point the attack vector provided by ECN could no longer be | ||||
exploited (until congestion returns to moderate state). | ||||
RFC 3168 [Ramakrishnan et al, 2001] describes the use of "penalty | RFC 3168 [RFC3168] provides a very thorough security assessment of | |||
boxes" which would act on flows that do not respond appropriately to | ECN. Among the possible mitigations, it describes the use of | |||
congestion indications. Section 10 of RFC 3168 suggests that a first | "penalty boxes" which would act on flows that do not respond | |||
action taken at a penalty box for an ECN-capable flow would be to | appropriately to congestion indications. Section 10 of RFC 3168 | |||
switch to dropping packets (instead of marking them), and, if the | suggests that a first action taken at a penalty box for an ECN- | |||
flow does not respond appropriately to the congestion indication, the | capable flow would be to switch to dropping packets (instead of | |||
penalty box could reset the misbehaving connection. Here we | marking them), and, if the flow does not respond appropriately to the | |||
discourage implementation of such a policy, as it would create a | congestion indication, the penalty box could reset the misbehaving | |||
vector for connection-reset attacks. For example, an attacker could | connection. Here we discourage implementation of such a policy, as | |||
forge TCP segments with the same four-tuple as the targeted | it would create a vector for connection-reset attacks. For example, | |||
connection and cause them to transit the penalty box. The penalty | an attacker could forge TCP segments with the same four-tuple as the | |||
box would first switch from marking to dropping packets. However, | targeted connection and cause them to transit the penalty box. The | |||
the attacker would continue sending forged segments, at a steady | penalty box would first switch from marking to dropping packets. | |||
rate. As a result, if the penalty box implemented such a severe | However, the attacker would continue sending forged segments, at a | |||
policy of resetting connections for flows that still do not respond | steady rate. As a result, if the penalty box implemented such a | |||
to end-to-end congestion control after switching from marking to | severe policy of resetting connections for flows that still do not | |||
dropping, the attacked connection would be reset. | respond to end-to-end congestion control after switching from marking | |||
to dropping, the attacked connection would be reset. | ||||
10. TCP API | 10. TCP API | |||
Section 3.8 of RFC 793 [Postel, 1981c] describes the minimum set of | NOTE: THIS SECTION IS BEING EDITED. | |||
TCP User Commands required of all TCP Implementations. Most | ||||
operating systems provide an Application Programming Interface (API) | Section 3.8 of RFC 793 [RFC0793] describes the minimum set of TCP | |||
that allows applications to make use of the services provided by TCP. | User Commands required of all TCP Implementations. Most operating | |||
One of the most popular APIs is the Sockets API, originally | systems provide an Application Programming Interface (API) that | |||
introduced in the BSD networking package [McKusick et al, 1996]. | allows applications to make use of the services provided by TCP. One | |||
of the most popular APIs is the Sockets API, originally introduced in | ||||
the BSD networking package [McKusick et al, 1996]. | ||||
10.1. Passive opens and binding sockets | 10.1. Passive opens and binding sockets | |||
When there is already a pending passive OPEN for some local port | When there is already a pending passive OPEN for some local port | |||
number, TCP SHOULD NOT allow processes that do not belong to the same | number, TCP SHOULD NOT allow processes that do not belong to the same | |||
user to "reuse" the local port for another passive OPEN. | user to "reuse" the local port for another passive OPEN. | |||
Additionally, reuse of a local port SHOULD default to "off", and be | Additionally, reuse of a local port SHOULD default to "off", and be | |||
enabled only by an explicit command (e.g., the setsockopt() function | enabled only by an explicit command (e.g., the setsockopt() function | |||
of the Sockets API). | of the Sockets API). | |||
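The policy above can be sketched as a simple admission check. Everything here is illustrative: the Listener record, the may_bind() function, and the use of a numeric user id to represent "the same user" are hypothetical names and modelling choices, not part of any real Sockets API.

```python
from dataclasses import dataclass


@dataclass
class Listener:
    port: int
    uid: int  # id of the user that owns the pending passive OPEN


def may_bind(existing, port, uid, reuse_requested=False):
    """Decide whether a new passive OPEN on `port` is allowed.

    reuse_requested defaults to False, i.e. port reuse is "off"
    unless the caller enabled it explicitly.
    """
    for listener in existing:
        if listener.port != port:
            continue
        if listener.uid != uid:
            return False  # never share a port across users
        if not reuse_requested:
            return False  # same user, but reuse was not enabled
    return True
```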
skipping to change at page 82, line 14 | skipping to change at page 57, line 18 | |||
OPEN (local port, foreign socket, active/passive [, timeout] [, | OPEN (local port, foreign socket, active/passive [, timeout] [, | |||
precedence] [, security/compartment] [, options]) -> local | precedence] [, security/compartment] [, options]) -> local | |||
connection name | connection name | |||
When this command is used to perform a passive open (i.e., the | When this command is used to perform a passive open (i.e., the | |||
active/passive flag is set to passive), the foreign socket | active/passive flag is set to passive), the foreign socket | |||
parameter may be either fully-specified (to wait for a particular | parameter may be either fully-specified (to wait for a particular | |||
connection) or unspecified (to wait for any call). | connection) or unspecified (to wait for any call). | |||
As discussed in Section 2.7 of RFC 793 [Postel, 1981c], if there | As discussed in Section 2.7 of RFC 793 [RFC0793], if there are | |||
are several passive OPENs with the same local socket (recorded in | several passive OPENs with the same local socket (recorded in the | |||
the corresponding TCB), an incoming connection will be matched to | corresponding TCB), an incoming connection will be matched to the | |||
the TCB with the more specific foreign socket. This means that | TCB with the more specific foreign socket. This means that when | |||
when the foreign socket of a passive OPEN matches that of the | the foreign socket of a passive OPEN matches that of the incoming | |||
incoming connection request, that passive OPEN takes precedence | connection request, that passive OPEN takes precedence over those | |||
over those passive OPENs with an unspecified foreign socket. | passive OPENs with an unspecified foreign socket. | |||
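The precedence rule can be illustrated with the following sketch, in which each pending passive OPEN on a given local socket is modelled as a (foreign socket, name) pair and None stands for an unspecified foreign socket. The function name and data layout are hypothetical.

```python
def match_passive_open(tcbs, remote):
    """Return the name of the TCB that should receive the connection.

    tcbs:   list of (foreign_socket_or_None, name) pairs.
    remote: (ip, port) of the incoming connection request.
    """
    wildcard = None
    for foreign, name in tcbs:
        if foreign == remote:
            return name  # the more specific foreign socket wins
        if foreign is None and wildcard is None:
            wildcard = name  # remember the first wildcard listener
    return wildcard  # None if nothing matches
```

This precedence is what makes port reuse sensitive: a process that can register a more specific passive OPEN on an already bound port will receive the matching connections in place of the original listener.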
Popular implementations such as the Sockets API let the user | Popular implementations such as the Sockets API let the user | |||
specify the local socket as fully-specified {local IP address, | specify the local socket as fully-specified {local IP address, | |||
local TCP port} pair, or as just the local TCP port (leaving the | local TCP port} pair, or as just the local TCP port (leaving the | |||
local IP address unspecified). In the former case, only those | local IP address unspecified). In the former case, only those | |||
connection requests sent to {local port, local IP address} will be | connection requests sent to {local port, local IP address} will be | |||
accepted. In the latter case, connection requests sent to any of | accepted. In the latter case, connection requests sent to any of | |||
the system's IP addresses will be accepted. In a similar fashion | the system's IP addresses will be accepted. In a similar fashion | |||
to the generic API described in Section 2.7 of RFC 793, if there | to the generic API described in Section 2.7 of RFC 793, if there | |||
is a pending passive OPEN with a fully-specified local socket that | is a pending passive OPEN with a fully-specified local socket that | |||
skipping to change at page 83, line 6 | skipping to change at page 58, line 8 | |||
port" argument of the "OPEN" command. | port" argument of the "OPEN" command. | |||
An implementation MAY relax the aforementioned restriction when the | An implementation MAY relax the aforementioned restriction when the | |||
process or system user requesting allocation of such a port number is | process or system user requesting allocation of such a port number is | |||
the same as the process or system user controlling the TCP in the | the same as the process or system user controlling the TCP in the | |||
CLOSED or LISTEN states with the same port number. | CLOSED or LISTEN states with the same port number. | |||
DISCUSSION: | DISCUSSION: | |||
As discussed in Section 10.1, the "OPEN" command specified in | As discussed in Section 10.1, the "OPEN" command specified in | |||
Section 3.8 of RFC 793 [Postel, 1981c] can be used to perform | Section 3.8 of RFC 793 [RFC0793] can be used to perform active | |||
active opens. In case of active opens, the parameter "local port" | opens. In case of active opens, the parameter "local port" will | |||
will contain a so-called "ephemeral port". While the only | contain a so-called "ephemeral port". While the only requirement | |||
requirement for such an ephemeral port is that the resulting | for such an ephemeral port is that the resulting connection-id is | |||
connection-id is unique, port numbers that are currently in use by | unique, port numbers that are currently in use by a TCP in the | |||
a TCP in the LISTEN state should not be allowed for use as | LISTEN state should not be allowed for use as ephemeral ports. If | |||
ephemeral ports. If this rule is not complied with, an attacker could | this rule is not complied with, an attacker could potentially "steal" | |||
potentially "steal" an incoming connection to a local server | an incoming connection to a local server application by issuing a | |||
application by issuing a connection request to the victim client | connection request to the victim client at roughly the same time | |||
at roughly the same time the client tries to connect to the victim | the client tries to connect to the victim server application. If | |||
server application. If the SYN segment corresponding to the | the SYN segment corresponding to the attacker's connection request | |||
attacker's connection request and the SYN segment corresponding to | and the SYN segment corresponding to the victim client "cross each | |||
the victim client "cross each other in the network", and provided | other in the network", and provided the attacker is able to know | |||
the attacker is able to know or guess the ephemeral port used by | or guess the ephemeral port used by the client, a TCP simultaneous | |||
the client, a TCP simultaneous open scenario would take place, and | open scenario would take place, and the incoming connection | |||
the incoming connection request sent by the client would be | request sent by the client would be matched with the attacker's | |||
matched with the attacker's socket rather than with the victim | socket rather than with the victim server application's socket. | |||
server application's socket. | ||||
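An ephemeral port selector that honours this restriction might be sketched as follows. The function name and parameters are hypothetical; the default range of 49152 to 65535 and the random selection order are assumptions of this sketch rather than requirements stated in this section.

```python
import random


def pick_ephemeral_port(listening_ports, used_connection_ids,
                        local_ip, remote_ip, remote_port,
                        low=49152, high=65535, rng=None):
    """Return a usable ephemeral port, or None if none is free."""
    rng = rng or random.Random()
    candidates = list(range(low, high + 1))
    rng.shuffle(candidates)  # avoid a predictable selection order
    for port in candidates:
        if port in listening_ports:
            continue  # never reuse a port held by a TCP in LISTEN
        conn_id = (local_ip, port, remote_ip, remote_port)
        if conn_id in used_connection_ids:
            continue  # the resulting connection-id must be unique
        return port
    return None
```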
As already noted, in order for this attack to succeed, the | As already noted, in order for this attack to succeed, the | |||
attacker should be able to guess or know (in advance) the | attacker should be able to guess or know (in advance) the | |||
ephemeral port selected by the victim client, and be able to know | ephemeral port selected by the victim client, and be able to know | |||
the right moment to issue a connection request to the victim | the right moment to issue a connection request to the victim | |||
client. While in many scenarios this may prove to be a difficult | client. While in many scenarios this may prove to be a difficult | |||
task, some factors such as an inadequate ephemeral port selection | task, some factors such as an inadequate ephemeral port selection | |||
policy at the victim client could make this attack feasible. | policy at the victim client could make this attack feasible. | |||
It should be noted that most applications based on popular | It should be noted that most applications based on popular | |||
skipping to change at page 84, line 13 | skipping to change at page 59, line 13 | |||
ports. | ports. | |||
An implementation might choose to relax the aforementioned | An implementation might choose to relax the aforementioned | |||
restriction when the process or system user requesting allocation | restriction when the process or system user requesting allocation | |||
of such a port number is the same as the process or system user | of such a port number is the same as the process or system user | |||
controlling the TCP in the CLOSED or LISTEN states with the same | controlling the TCP in the CLOSED or LISTEN states with the same | |||
port number. | port number. | |||
11. Blind in-window attacks | 11. Blind in-window attacks | |||
NOTE: THIS SECTION IS BEING EDITED. | ||||
In the last few years awareness has been raised about a number of | In the last few years awareness has been raised about a number of | |||
"blind" attacks that can be performed against TCP by forging TCP | "blind" attacks that can be performed against TCP by forging TCP | |||
segments that fall within the receive window [NISCC, 2004] [Watson, | segments that fall within the receive window [NISCC, 2004] [Watson, | |||
2004]. | 2004]. | |||
The term "blind" refers to the fact that the attacker does not have | The term "blind" refers to the fact that the attacker does not have | |||
access to the packets that belong to the attacked connection. | access to the packets that belong to the attacked connection. | |||
The effects of these attacks range from connection resets to data | The effects of these attacks range from connection resets to data | |||
injection. While these attacks were known in the research community, | injection. While these attacks were known in the research community, | |||
skipping to change at page 85, line 7 | skipping to change at page 60, line 7 | |||
reset attacks against TCP. [Watson, 2004] and [NISCC, 2004] raised | reset attacks against TCP. [Watson, 2004] and [NISCC, 2004] raised | |||
awareness about connection-reset attacks that exploit the RST flag of | awareness about connection-reset attacks that exploit the RST flag of | |||
TCP segments. [Ramaiah et al, 2008] noted that carefully crafted SYN | TCP segments. [Ramaiah et al, 2008] noted that carefully crafted SYN | |||
segments could also be used to perform connection-reset attacks. | segments could also be used to perform connection-reset attacks. | |||
This document describes two further, previously undocumented vectors for | This document describes two further, previously undocumented vectors for | |||
performing connection-reset attacks: the Precedence field of IP | performing connection-reset attacks: the Precedence field of IP | |||
packets that encapsulate TCP segments, and illegal TCP options. | packets that encapsulate TCP segments, and illegal TCP options. | |||
11.1.1. RST flag | 11.1.1. RST flag | |||
TCP SHOULD implement the mitigation for RST-based attacks specified | The RST flag signals a TCP peer that the connection should be | |||
in [Ramaiah et al, 2008]. | aborted. In contrast with the FIN handshake (which gracefully | |||
terminates a TCP connection), an RST segment causes the connection to | ||||
DISCUSSION: | be abnormally closed. | |||
The RST flag signals a TCP peer that the connection should be | ||||
aborted. In contrast with the FIN handshake (which gracefully | ||||
terminates a TCP connection), an RST segment causes the connection | ||||
to be abnormally closed. | ||||
As stated in Section 3.4 of RFC 793 [Postel, 1981c], all reset | ||||
segments are validated by checking their Sequence Numbers, with | ||||
the Sequence Number considered valid if it is within the receive | ||||
window. In the SYN-SENT state, however, an RST is valid if the | ||||
Acknowledgement Number acknowledges the SYN segment that | ||||
supposedly elicited the reset. | ||||
[Ramaiah et al, 2008] proposes a modification to TCP's transition | ||||
diagram to address this attack vector. The counter-measure is a | ||||
combination of enforcing a more strict validation check on the | ||||
sequence number of reset segments, and the addition of a | ||||
"challenge" mechanism. With the implementation of the proposed | ||||
mechanism, TCP would behave as follows: | ||||
If the Sequence Number of an RST segment is outside the receive | ||||
window, the segment is silently dropped (as stated by RFC 793). | ||||
That is, a reset segment is discarded unless it passes the | ||||
following check: | ||||
RCV.NXT <= Sequence Number < RCV.NXT+RCV.WND | ||||
If the sequence number falls exactly on the left-edge of the | ||||
receive window, the reset is honoured. That is, the connection is | ||||
reset if the following condition is true: | ||||
Sequence Number == RCV.NXT | ||||
If an RST segment passes the first check (i.e., it is within the | ||||
receive window) but does not pass the second check (i.e., it does | ||||
not fall exactly on the left edge of the receive window), an | ||||
Acknowledgement segment ("challenge ACK") is sent in response: | ||||
<SEQ=SND.NXT><ACK=RCV.NXT><CTL=ACK> | As stated in Section 3.4 of RFC 793 [RFC0793], all reset segments are | |||
validated by checking their Sequence Numbers, with the Sequence | ||||
Number considered valid if it is within the receive window. In the | ||||
SYN-SENT state, however, an RST is valid if the Acknowledgement | ||||
Number acknowledges the SYN segment that supposedly elicited the | ||||
reset. | ||||
This Acknowledgement segment is referred to as a "challenge ACK" | [RFC5961] proposes a modification to TCP's transition diagram to | |||
as, in the event the RST segment that elicited it had been | address this attack vector. The counter-measure is a combination of | |||
legitimate (but silently dropped as a result of enforcing the | enforcing a more strict validation check on the sequence number of | |||
above checks), the challenge ACK would elicit a new reset segment | reset segments, and the addition of a "challenge" mechanism. | |||
that would fall exactly on the left edge of the window and would | ||||
thus pass all the above checks, finally resetting the connection. | ||||
We recommend the implementation of this countermeasure. However, | We note that we are aware of patent claims on this counter- | |||
we are aware of patent claims on this counter-measure, and suggest | measure, and suggest that vendors research the consequences of the | |||
that vendors research the consequences of the possible patents | possible patents that may apply. | |||
that may apply. | ||||
[US-CERT, 2003a] is an advisory about a firewall system that was | [US-CERT, 2003a] is an advisory about a firewall system that was found | |||
found particularly vulnerable to reset attacks because it did not | particularly vulnerable to reset attacks because it did not validate | |||
validate the TCP Sequence Number of RST segments. Clearly, all | the TCP Sequence Number of RST segments. Clearly, all TCPs | |||
TCPs (including those in middle-boxes) should validate RST | (including those in middle-boxes) should validate RST segments as | |||
segments as discussed in this section. | discussed in this section. | |||
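The RST validation described in this section (drop when outside the receive window, reset only on an exact match with RCV.NXT, challenge ACK otherwise) can be illustrated with a minimal sketch. The function names and return labels below are hypothetical and are not taken from any specification or implementation. Sequence numbers are compared modulo 2^32, and the zero-window case is deliberately left out.

```python
MOD = 1 << 32  # TCP sequence numbers are 32-bit and wrap around


def in_receive_window(seq, rcv_nxt, rcv_wnd):
    """RCV.NXT <= seq < RCV.NXT + RCV.WND, evaluated modulo 2^32."""
    return (seq - rcv_nxt) % MOD < rcv_wnd


def classify_rst(seq, rcv_nxt, rcv_wnd):
    """Return the (hypothetical) action label for an incoming RST segment."""
    if not in_receive_window(seq, rcv_nxt, rcv_wnd):
        return "drop"            # outside the window: silently discard
    if seq == rcv_nxt:
        return "reset"           # exactly on the left edge: honour the reset
    # In window but not on the left edge: reply with
    # <SEQ=SND.NXT><ACK=RCV.NXT><CTL=ACK> and keep the connection.
    return "challenge_ack"
```

With this check a blind attacker has to hit one exact sequence number rather than any value within the window, while a legitimate peer that receives the challenge ACK can answer with an RST that falls on the left edge.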
11.1.2. SYN flag | 11.1.2. SYN flag | |||
Processing of SYN segments received for connections in the | Section 3.9 (page 71) of RFC 793 [RFC0793] states that if a SYN | |||
synchronized states SHOULD occur as follows: | segment is received with a valid (i.e., "in window") Sequence Number, | |||
an RST segment should be sent in response, and the connection should | ||||
o If a SYN segment is received for a connection in any synchronized | be aborted. This could be leveraged to perform a blind connection- | |||
state other than TIME-WAIT, respond with an ACK, applying rate- | reset attack. [RFC5961] proposes a change in TCP's state diagram to | |||
throttling. [Ramaiah et al, 2008] | mitigate this attack vector. | |||
o If the corresponding connection is in the TIME-WAIT state, then | ||||
process the incoming SYN as specified in | ||||
[I-D.ietf-tcpm-tcp-timestamps]. | ||||
DISCUSSION: | ||||
Section 3.9 (page 71) of RFC 793 [Postel, 1981c] states that if a | ||||
SYN segment is received with a valid (i.e., "in window") Sequence | ||||
Number, an RST segment should be sent in response, and the | ||||
connection should be aborted. | ||||
The IETF has published an RFC, "Improving TCP's Resistance to | ||||
Blind In-Window Attacks" [Ramaiah et al, 2008] which addresses, | ||||
among others, this variant of TCP-based connection-reset attack. | ||||
This section describes the counter-measure proposed by the IETF, a | ||||
problem that may arise from the implementation of that solution, | ||||
and a workaround to it. | ||||
In order to mitigate this attack vector, [Ramaiah et al, 2008] | ||||
proposes to change TCP's reaction to SYN segments as follows. | ||||
When a SYN segment is received for a connection in any of the | ||||
synchronized states, an Acknowledgement (ACK) segment is sent in | ||||
response. | ||||
As discussed in [Ramaiah et al, 2008], there is a corner-case that | ||||
would not be properly handled by this mechanism. If a host (TCP | ||||
A) establishes a TCP connection with a remote peer (TCP B), and | ||||
then crashes, reboots and tries to initiate a new incarnation of | ||||
the same connection (i.e., a connection with the same four-tuple | ||||
as the previous connection) using an Initial Sequence Number equal | ||||
to the RCV.NXT value at the remote peer (TCP B), the ACK segment | ||||
sent by TCP B in response to the SYN segment would contain an | ||||
Acknowledgement number that would be considered valid by TCP A, | ||||
and thus an RST segment would not be sent in response to the | ||||
Acknowledgement (ACK) segment. As this ACK would not have the SYN | ||||
bit set, TCP A (being in the SYN-SENT state) would silently drop | ||||
it (as stated on page 68 of RFC 793). After a Retransmission | ||||
Timeout (RTO), TCP A would retransmit its SYN segment, which would | ||||
lead to the same sequence of events as before. Eventually, TCP A | ||||
would timeout, and the connection would be aborted. This is a | ||||
corner case in which the introduced change would lead to a non- | ||||
desirable behavior. However, we consider this scenario to be | ||||
extremely unlikely and, in the event it ever took place, the | ||||
connection would nevertheless be aborted after retrying for a | ||||
period of USER TIMEOUT seconds. | ||||
However, when this change is implemented exactly as described in | ||||
[Ramaiah et al, 2008], the potential of interoperability problems | ||||
is introduced, as a heuristic widely incorporated in many TCP | ||||
implementations is disabled. | ||||
In a number of scenarios a socket pair may need to be reused while | ||||
the corresponding four-tuple is still in the TIME-WAIT state in a | ||||
remote TCP peer. For example, a client accessing some service on | ||||
a host may try to create a new incarnation of a previous | ||||
connection, while the corresponding four-tuple is still in the | ||||
TIME-WAIT state at the remote TCP peer (the server). This may | ||||
happen if the ephemeral port numbers are being reused too quickly, | ||||
either because of a bad policy of selection of ephemeral ports, or | ||||
simply because of a high connection rate to the corresponding | ||||
service. In such scenarios, the establishment of new connections | ||||
that reuse a four-tuple that is in the TIME-WAIT state would fail. | ||||
In order to avoid this problem, RFC 1122 [Braden, 1989] states (in | ||||
Section 4.2.2.13) that when a connection request is received with | ||||
a four-tuple that is in the TIME-WAIT state, the connection | ||||
request could be accepted if the sequence number of the incoming | ||||
SYN segment is greater than the last sequence number seen on the | ||||
previous incarnation of the connection (for that direction of the | ||||
data transfer). | ||||
This requirement aims to prevent the sequence number spaces of the | ||||
new and old incarnations of the connection from overlapping, thus | ||||
preventing old segments from the previous incarnation of the | ||||
connection from being accepted as valid by the new connection. | ||||
The requirement in [Ramaiah et al, 2008] to disregard SYN segments | ||||
received for connections in any of the synchronized states forbids | ||||
the implementation of the heuristic described above. As a result, | ||||
we argue that the processing of SYN segments proposed in [Ramaiah | ||||
et al, 2008] should apply only for connections in any of the | ||||
synchronized states other than the TIME-WAIT state. | ||||
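The SYN processing argued for above can be sketched as follows. This is only an illustration under stated assumptions: the function names and action labels are hypothetical, the rate-throttling of ACKs is not modelled, and the case of a TIME-WAIT SYN whose sequence number is not greater than the last one seen is simply reported as deferred to other rules.

```python
MOD = 1 << 32  # 32-bit sequence space

# Synchronized states as listed in RFC 793.
SYNCHRONIZED = {
    "ESTABLISHED", "FIN-WAIT-1", "FIN-WAIT-2",
    "CLOSE-WAIT", "CLOSING", "LAST-ACK", "TIME-WAIT",
}


def seq_gt(a, b):
    """True if sequence number a is 'after' b, allowing for wraparound."""
    return 0 < (a - b) % MOD < (1 << 31)


def on_syn(state, syn_seq, last_seq_seen):
    """Return a (hypothetical) action label for a SYN in a synchronized state."""
    if state not in SYNCHRONIZED:
        raise ValueError("sketch only covers synchronized states")
    if state == "TIME-WAIT":
        # RFC 1122 heuristic: a new incarnation may be accepted if the SYN's
        # sequence number is greater than the last one seen on the old one.
        if seq_gt(syn_seq, last_seq_seen):
            return "accept_new_incarnation"
        return "defer_to_time_wait_rules"   # outside the scope of this sketch
    # Any other synchronized state: answer with an ACK instead of resetting.
    return "challenge_ack"
```

Keeping TIME-WAIT out of the challenge-ACK path is what allows a four-tuple to be reused quickly without the connection attempt failing.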
11.1.3. Security/Compartment | 11.1.3. Security/Compartment | |||
If the security/compartment field of an incoming TCP segment does not | Section 3.9 (page 71) of RFC 793 [RFC0793] states that if the IP | |||
match the value recorded in the corresponding TCB, TCP SHOULD NOT | security/compartment of an incoming segment does not exactly match | |||
abort the connection, but simply discard the corresponding packet. | the security/compartment in the TCB, a RST segment should be sent, | |||
Additionally, this whole event SHOULD be logged as a security | and the connection should be aborted. This certainly provides | |||
violation. | another attack vector for performing connection-reset attacks, as an | |||
attacker could forge TCP segments with a security/compartment that is | ||||
DISCUSSION: | different from that recorded in the corresponding TCB and, as a | |||
result, the attacked connection would be reset. | ||||
Section 3.9 (page 71) of RFC 793 [Postel, 1981c] states that if | ||||
the IP security/compartment of an incoming segment does not | ||||
exactly match the security/compartment in the TCB, a RST segment | ||||
should be sent, and the connection should be aborted. | ||||
A discussion of the IP security options relevant to this section | ||||
can be found in Section 3.13.2.12, Section 3.13.2.13, and Section | ||||
3.13.2.14 of [CPNI, 2008]. | ||||
This certainly provides another attack vector for performing | ||||
connection-reset attacks, as an attacker could forge TCP segments | ||||
with a security/compartment that is different from that recorded | ||||
in the corresponding TCB and, as a result, the attacked connection | ||||
would be reset. | ||||
It is interesting to note that for connections in the ESTABLISHED | ||||
state, this check is performed after validating the TCP Sequence | ||||
Number and checking the RST bit, but before validating the | ||||
Acknowledgement field. Therefore, even if the stricter validation | ||||
of the Acknowledgement field (described in Section 3.4) were | ||||
implemented, it would not help to mitigate this attack vector. | ||||
This attack vector can be easily mitigated by relaxing the | [draft-gont-tcpm-tcp-seccomp-prec-00.txt] aims to update RFC 793 such | |||
reaction to TCP segments with "incorrect" security/compartment | that this issue is eliminated. | |||
values as specified in this section. | ||||
11.1.4. Precedence | 11.1.4. Precedence | |||
If the Precedence field of an incoming TCP segment does not match | Section 3.9 (page 71) of RFC 793 [RFC0793] states that if the IP | |||
the value recorded in the corresponding TCB, TCP MUST NOT abort the | precedence of an incoming segment does not exactly match the | |||
connection, and MUST instead continue processing the segment as | precedence in the TCB, a RST segment should be sent, and the | |||
specified by RFC 793. | connection should be aborted. This certainly provides another attack | |||
vector for performing connection-reset attacks, as an attacker could | ||||
DISCUSSION: | forge TCP segments with a precedence that is different from that | |||
recorded in the corresponding TCB and, as a result, the attacked | ||||
Section 3.9 (page 71) of RFC 793 [Postel, 1981c] states that if | connection would be reset. | |||
the IP Precedence of an incoming segment does not exactly match | ||||
the Precedence recorded in the TCB, a RST segment should be sent, | ||||
and the connection should be aborted. | ||||
This certainly provides another attack vector for performing | ||||
connection-reset attacks, as an attacker could forge TCP segments | ||||
with an IP Precedence that is different from that recorded in the | ||||
corresponding TCB and, as a result, the attacked connection would | ||||
be reset. | ||||
It is interesting to note that for connections in the ESTABLISHED | ||||
state, this check is performed after validating the TCP Sequence | ||||
Number and checking the RST bit, but before validating the | ||||
Acknowledgement field. Therefore, even if the stricter validation | ||||
of the Acknowledgement field (described in Section 3.4) were | ||||
implemented, it would not help to mitigate this attack vector. | ||||
This attack vector can be easily mitigated by relaxing the | ||||
reaction to TCP segments with "incorrect" IP Precedence values. | ||||
That is, even if the Precedence field does not match the value | ||||
recorded in the corresponding TCB, TCP should not abort the | ||||
connection, and should instead continue processing the segment as | ||||
specified by RFC 793. | ||||
It is interesting to note that resetting a connection due to a | ||||
change in the Precedence value might have a negative impact on | ||||
interoperability. For example, the packets that correspond to the | ||||
connection could temporarily take a different internet path, in | ||||
which some middle-box could re-mark the Precedence field (due to | ||||
administration policies at the network to be transited). In such | ||||
a scenario, an implementation following the advice in RFC 793 | ||||
would abort the connection, when the connection would have | ||||
probably survived. | ||||
While the IPv4 Type of Service field (and hence the Precedence | [draft-gont-tcpm-tcp-seccomp-prec-00.txt] aims to update RFC 793 such | |||
field) has been redefined by the Differentiated Services (DS) | that this issue is eliminated. | |||
field specified in RFC 2474 [Nichols et al, 1998], RFC 793 | ||||
[Postel, 1981c] was never formally updated in this respect. We | ||||
note that both legacy systems that have not been upgraded to | ||||
implement the differentiated services architecture described in | ||||
RFC 2475 [Blake et al, 1998] and current implementations that have | ||||
extrapolated the discussion of the Precedence field to the | ||||
Differentiated Services field may still be vulnerable to the | ||||
connection reset vector discussed in this section. | ||||
11.1.5. Illegal options | 11.1.5. Illegal options | |||
TCP MUST silently drop those TCP segments that contain TCP options | Section 4.2.2.5 of RFC 1122 [RFC1122] discusses the processing of TCP | |||
with illegal option lengths. | options. It states that TCP should be prepared to handle an illegal | |||
option length (e.g., zero) without crashing, and suggests handling | ||||
DISCUSSION: | such illegal options by resetting the corresponding connection and | |||
logging the reason. However, this suggested behavior could be | ||||
exploited to perform connection-reset attacks. | ||||
Section 4.2.2.5 of RFC 1122 [Braden, 1989] discusses the | [draft-gont-tcpm-tcp-illegal-option-lengths-00] aims at formally | |||
processing of TCP options. It states that TCP must be able to | updating RFC 1122, such that this issue is eliminated. | |||
receive a TCP option in any segment, and must ignore without error | ||||
any option it does not implement. Additionally, it states that | ||||
TCP should be prepared to handle an illegal option length (e.g., | ||||
zero) without crashing, and suggests handling such illegal options | ||||
by resetting the corresponding connection and logging the reason. | ||||
However, this suggested behavior could be exploited to perform | ||||
connection-reset attacks. Therefore, as discussed in Section 3.10 | ||||
of this document, we advise TCP implementations to silently drop | ||||
those TCP segments that contain illegal option lengths. | ||||
11.2. Blind data-injection attacks | 11.2. Blind data-injection attacks | |||
An attacker could try to inject data in the stream of data being | An attacker could try to inject data in the stream of data being | |||
transferred on the connection. As with the other attacks described | transferred on the connection. As with the other attacks described | |||
in Section 11 of this document, in order to perform a blind data | in Section 11 of this document, in order to perform a blind data | |||
injection attack the attacker would need to know or guess the four- | injection attack the attacker would need to know or guess the four- | |||
tuple that identifies the TCP connection to be attacked. | tuple that identifies the TCP connection to be attacked. | |||
Additionally, the attacker would need to guess a valid ("in window") TCP | Additionally, the attacker would need to guess a valid ("in window") TCP | |||
Sequence Number, and a valid Acknowledgement Number. | Sequence Number, and a valid Acknowledgement Number. | |||
As discussed in Section 3.4 of this document, [Ramaiah et al, 2008] | As discussed in Section 3.4 of this document, [Ramaiah et al, 2008] | |||
proposes to enforce a more strict check on the Acknowledgement Number | proposes to enforce a more strict check on the Acknowledgement Number | |||
of incoming segments than that specified in RFC 793 [Postel, 1981c]. | of incoming segments than that specified in RFC 793 [RFC0793]. | |||
Implementation of the proposed check requires more packets on the | Implementation of the proposed check requires more packets on the | |||
side of the attacker to successfully perform a blind data-injection | side of the attacker to successfully perform a blind data-injection | |||
attack. However, it should be noted that applications concerned with | attack. However, it should be noted that applications concerned with | |||
any of the attacks discussed in Section 11 of this document should | any of the attacks discussed in Section 11 of this document should | |||
make use of proper authentication techniques, such as those specified | make use of proper authentication techniques, such as those specified | |||
for IPsec in RFC 4301 [Kent and Seo, 2005]. | for IPsec in RFC 4301 [Kent and Seo, 2005]. | |||
12. Information leaking | 12. Information leaking | |||
NOTE: THIS SECTION IS BEING EDITED. | ||||
12.1. Remote Operating System detection via TCP/IP stack fingerprinting | 12.1. Remote Operating System detection via TCP/IP stack fingerprinting | |||
Clearly, remote Operating System (OS) detection is a useful tool for | Clearly, remote Operating System (OS) detection is a useful tool for | |||
attackers. Tools such as nmap [Fyodor, 2006b] can usually detect the | attackers. Tools such as nmap [Fyodor, 2006b] can usually detect the | |||
operating system type and version of a remote system with | operating system type and version of a remote system with | |||
remarkable accuracy. This information can in turn be used | remarkable accuracy. This information can in turn be used | |||
by attackers to tailor their exploits to the identified operating | by attackers to tailor their exploits to the identified operating | |||
system type and version. | system type and version. | |||
Evasion of OS fingerprinting can prove to be a very difficult task. | Evasion of OS fingerprinting can prove to be a very difficult task. | |||
skipping to change at page 92, line 6 | skipping to change at page 63, line 15 | |||
12.1.1. FIN probe | 12.1.1. FIN probe | |||
TCP MUST silently drop any TCP segments received for a connection in | TCP MUST silently drop any TCP segments received for a connection in | |||
the LISTEN state that do not have the SYN, RST, or ACK flags set. In | the LISTEN state that do not have the SYN, RST, or ACK flags set. In | |||
the rest of the cases, the processing rules in RFC 793 MUST be | the rest of the cases, the processing rules in RFC 793 MUST be | |||
applied. | applied. | |||
DISCUSSION: | DISCUSSION: | |||
The attacker sends a FIN (or any packet without the SYN or the ACK | The attacker sends a FIN (or any packet without the SYN or the ACK | |||
flags set) to an open port. RFC 793 [Postel, 1981c] leaves the | flags set) to an open port. RFC 793 [RFC0793] leaves the reaction | |||
reaction to such segments unspecified. As a result, some | to such segments unspecified. As a result, some implementations | |||
implementations silently drop the received segment, while others | silently drop the received segment, while others respond with a | |||
respond with a RST. | RST. | |||
12.1.2. Bogus flag test | 12.1.2. Bogus flag test | |||
TCP MUST ignore any flags not supported, and MUST NOT reflect them if | TCP MUST ignore any flags not supported, and MUST NOT reflect them if | |||
a TCP segment is sent in response to the one just received. | a TCP segment is sent in response to the one just received. | |||
DISCUSSION: | DISCUSSION: | |||
The attacker sends a TCP segment setting at least one bit of the | The attacker sends a TCP segment setting at least one bit of the | |||
Reserved field. Some implementations ignore this field, while | Reserved field. Some implementations ignore this field, while | |||
skipping to change at page 93, line 41 | skipping to change at page 64, line 49 | |||
DISCUSSION: | DISCUSSION: | |||
[Fyodor, 1998] reports that many implementations differ in the | [Fyodor, 1998] reports that many implementations differ in the | |||
Acknowledgement Number they use in response to segments received | Acknowledgement Number they use in response to segments received | |||
for connections in the CLOSED state. In particular, these | for connections in the CLOSED state. In particular, these | |||
implementations differ in the way they construct the RST segment | implementations differ in the way they construct the RST segment | |||
that is sent in response to those TCP segments received for | that is sent in response to those TCP segments received for | |||
connections in the CLOSED state. | connections in the CLOSED state. | |||
RFC 793 [Postel, 1981c] describes (on pages 36-37) how RST | RFC 793 [RFC0793] describes (on pages 36-37) how RST segments are | |||
segments are to be generated. According to this RFC, the ACK bit | to be generated. According to this RFC, the ACK bit (and the | |||
(and the Acknowledgment Number) is set in a RST only if the | Acknowledgment Number) is set in a RST only if the incoming | |||
incoming segment that elicited the RST did not have the ACK bit | segment that elicited the RST did not have the ACK bit set (and | |||
set (and thus the Sequence Number of the outgoing RST segment must | thus the Sequence Number of the outgoing RST segment must be set | |||
be set to zero). However, we recommend that TCP implementations set | to zero). However, we recommend that TCP implementations set the | |||
the ACK bit (and the Acknowledgement Number) in all outgoing RST | ACK bit (and the Acknowledgement Number) in all outgoing RST | |||
segments, as it allows for additional validation checks to be | segments, as it allows for additional validation checks to be | |||
enforced at the system receiving the segment. | enforced at the system receiving the segment. | |||
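A minimal sketch of the two ways of constructing an RST for a segment received in the CLOSED state is given below. The function name, the dictionary layout, and the always_ack switch are hypothetical. In particular, the Acknowledgement Number used when always_ack is enabled (SEG.SEQ plus SEG.LEN) is an assumption of this sketch, since the text above does not spell out the value.

```python
MOD = 1 << 32  # 32-bit sequence space


def build_rst(seg_seq, seg_len, seg_ack=None, always_ack=False):
    """Describe the RST sent in reply to a segment for a CLOSED connection.

    seg_ack is None when the incoming segment did not have the ACK bit set.
    seg_len counts data octets plus one for each of SYN and FIN.
    """
    ack_value = (seg_seq + seg_len) % MOD
    if seg_ack is None:
        # RFC 793: <SEQ=0><ACK=SEG.SEQ+SEG.LEN><CTL=RST,ACK>
        return {"seq": 0, "ack": ack_value, "flags": {"RST", "ACK"}}
    if always_ack:
        # Recommended variant (assumed ACK value): set the ACK bit as well.
        return {"seq": seg_ack, "ack": ack_value, "flags": {"RST", "ACK"}}
    # RFC 793: <SEQ=SEG.ACK><CTL=RST>
    return {"seq": seg_ack, "ack": None, "flags": {"RST"}}
```

With always_ack enabled every RST carries an Acknowledgement Number, so the receiver has one more field it can validate before honouring the reset.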
12.1.6. TCP options | 12.1.6. TCP options | |||
Different implementations differ in the TCP options they enable by | Different implementations differ in the TCP options they enable by | |||
default. Additionally, they differ in the actual contents of the | default. Additionally, they differ in the actual contents of the | |||
options, and in the order in which the options are included in a TCP | options, and in the order in which the options are included in a TCP | |||
segment. There is currently no recommendation on the order in which | segment. There is currently no recommendation on the order in which | |||
to include TCP options in TCP segments. | to include TCP options in TCP segments. | |||
skipping to change at page 95, line 36 | skipping to change at page 66, line 47 | |||
[Rowland, 1996] contains a discussion of covert channels in the | [Rowland, 1996] contains a discussion of covert channels in the | |||
TCP/IP protocol suite, with some TCP-based examples. [Giffin et al, | TCP/IP protocol suite, with some TCP-based examples. [Giffin et al, | |||
2002] describes the use of TCP timestamps for the establishment of | 2002] describes the use of TCP timestamps for the establishment of | |||
covert channels. [Zander, 2008] contains an extensive bibliography | covert channels. [Zander, 2008] contains an extensive bibliography | |||
of papers on covert channels, and a list of freely-available tools | of papers on covert channels, and a list of freely-available tools | |||
that implement covert channels with the TCP/IP protocol suite. | that implement covert channels with the TCP/IP protocol suite. | |||
14. TCP Port scanning | 14. TCP Port scanning | |||
NOTE: THIS SECTION IS BEING EDITED. | ||||
TCP port scanning aims at identifying TCP port numbers on which there | TCP port scanning aims at identifying TCP port numbers on which there | |||
is a process listening for incoming connections. That is, it aims at | is a process listening for incoming connections. That is, it aims at | |||
identifying TCPs at the target system that are in the LISTEN state. | identifying TCPs at the target system that are in the LISTEN state. | |||
The following subsections describe different TCP port scanning | The following subsections describe different TCP port scanning | |||
techniques that have been implemented in freely-available tools. | techniques that have been implemented in freely-available tools. | |||
These subsections focus only on those port scanning techniques that | These subsections focus only on those port scanning techniques that | |||
exploit features of TCP itself, and not of other communication | exploit features of TCP itself, and not of other communication | |||
protocols. | protocols. | |||
For example, the following subsections do not discuss the | For example, the following subsections do not discuss the | |||
skipping to change at page 97, line 5 | skipping to change at page 68, line 17 | |||
scanning tool. | scanning tool. | |||
14.3. FIN, NULL, and XMAS scans | 14.3. FIN, NULL, and XMAS scans | |||
TCP SHOULD respond with an RST when a TCP segment is received for a | TCP SHOULD respond with an RST when a TCP segment is received for a | |||
connection in the LISTEN state, and the incoming segment has neither | connection in the LISTEN state, and the incoming segment has neither | |||
the SYN bit nor the RST bit set. | the SYN bit nor the RST bit set. | |||
DISCUSSION: | DISCUSSION: | |||
RFC 793 [Postel, 1981c] states, on page 65, that an incoming | RFC 793 [RFC0793] states, on page 65, that an incoming segment | |||
segment that does not have the RST bit set and that is received | that does not have the RST bit set and that is received for a | |||
for a connection in the fictional state CLOSED causes an RST to be | connection in the fictional state CLOSED causes an RST to be sent | |||
sent in response. Pages 65-66 of RFC 793 describe the processing | in response. Pages 65-66 of RFC 793 describe the processing of | |||
of incoming segments for connections in the state LISTEN, and | incoming segments for connections in the state LISTEN, and | |||
implicitly states that an incoming segment that does not have the | implicitly states that an incoming segment that does not have the | |||
ACK bit set (and is not a SYN or an RST) should be silently | ACK bit set (and is not a SYN or an RST) should be silently | |||
dropped. | dropped. | |||
As a result, an attacker can exploit this situation to perform a | As a result, an attacker can exploit this situation to perform a | |||
port scan by sending TCP segments that do not have the ACK bit set | port scan by sending TCP segments that do not have the ACK bit set | |||
to the target system. When a port is "open" (i.e., there is a TCP | to the target system. When a port is "open" (i.e., there is a TCP | |||
in the LISTEN state on the corresponding port), the target system | in the LISTEN state on the corresponding port), the target system | |||
will silently drop the probe segment. On the other hand, if the port | will silently drop the probe segment. On the other hand, if the port | |||
is "closed" (i.e., there is a TCP in the fictional state CLOSED) | is "closed" (i.e., there is a TCP in the fictional state CLOSED) | |||
skipping to change at page 97, line 45 | skipping to change at page 69, line 9 | |||
It should be clear that while the aforementioned control-bits | It should be clear that while the aforementioned control-bits | |||
combinations are the most popular ones, other combinations could | combinations are the most popular ones, other combinations could | |||
be used to exploit this port-scanning vector. For example, the | be used to exploit this port-scanning vector. For example, the | |||
CWR, ECE, and/or any of the Reserved bits could be set in the | CWR, ECE, and/or any of the Reserved bits could be set in the | |||
probe segments. | probe segments. | |||
The advantage of this port-scanning technique is that it can | The advantage of this port-scanning technique is that it can | |||
bypass some stateless firewalls. However, the downside is that a | bypass some stateless firewalls. However, the downside is that a | |||
number of implementations do not comply strictly with RFC 793 | number of implementations do not comply strictly with RFC 793 | |||
[Postel, 1981c], and thus always respond to the probe segments | [RFC0793], and thus always respond to the probe segments with an | |||
with an RST, regardless of whether the port is open or closed. | RST, regardless of whether the port is open or closed. | |||
This port-scanning vector can be easily defeated by responding | This port-scanning vector can be easily defeated by responding | |||
with an RST when a TCP segment is received for a connection in the | with an RST when a TCP segment is received for a connection in the | |||
LISTEN state, and the incoming segment has neither the SYN bit nor | LISTEN state, and the incoming segment has neither the SYN bit nor | |||
the RST bit set. | the RST bit set. | |||
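
The LISTEN/CLOSED behaviour discussed above can be sketched as a small decision function. This is a minimal illustration, not code from any TCP implementation: the names State, respond, and the hardened parameter are hypothetical, and the rule ordering is assumed to follow the RFC 793 "SEGMENT ARRIVES" processing for the CLOSED and LISTEN states.

```python
from enum import Enum

# TCP control bits (values as laid out in the TCP header flags byte).
FIN, SYN, RST, ACK = 0x01, 0x02, 0x04, 0x10


class State(Enum):
    CLOSED = "CLOSED"
    LISTEN = "LISTEN"


def respond(state, flags, hardened=False):
    """Return "RST", "SYN-ACK", or None (silent drop) for an incoming segment.

    hardened=True applies the countermeasure described in the text: a
    segment arriving in LISTEN with neither SYN nor RST set elicits an RST.
    """
    if flags & RST:
        return None                  # incoming RSTs are never answered
    if state is State.CLOSED:
        return "RST"                 # closed port: reset everything else
    if flags & ACK:
        return "RST"                 # LISTEN: an unexpected ACK is reset
    if flags & SYN:
        return "SYN-ACK"             # LISTEN: normal connection attempt
    # LISTEN, no SYN/ACK/RST: dropped per RFC 793, reset if hardened.
    return "RST" if hardened else None
```

With hardened=False, a FIN-only probe yields no reply from a listening port and an RST from a closed one, which is the difference the scan relies on. With hardened=True both cases return "RST", so the probe no longer distinguishes them.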
14.4. Maimon scan | 14.4. Maimon scan | |||
If a TCP that is in the CLOSED or LISTEN states receives a TCP | If a TCP that is in the CLOSED or LISTEN states receives a TCP | |||
segment with both the FIN and ACK bits set, it MUST respond with an | segment with both the FIN and ACK bits set, it MUST respond with an | |||
RST. | RST. | |||
DISCUSSION: | DISCUSSION: | |||
This port scanning technique was introduced in [Maimon, 1996] with | This port scanning technique was introduced in [Maimon, 1996] with | |||
the name "StealthScan" (method #1), and was later incorporated | the name "StealthScan" (method #1), and was later incorporated | |||
into the nmap tool [Fyodor, 2006b] as the "Maimon scan". | into the nmap tool [Fyodor, 2006b] as the "Maimon scan". | |||
This port scanning technique employs TCP segments that have both | This port scanning technique employs TCP segments that have both | |||
the FIN and ACK bits set as the probe segments. While according | the FIN and ACK bits set as the probe segments. While according | |||
to RFC 793 [Postel, 1981c] these segments should elicit an RST | to RFC 793 [RFC0793] these segments should elicit an RST | |||
regardless of whether the corresponding port is open or closed, a | regardless of whether the corresponding port is open or closed, a | |||
programming flaw found in a number of TCP implementations has | programming flaw found in a number of TCP implementations has | |||
caused some systems to silently drop the probe segment if the | caused some systems to silently drop the probe segment if the | |||
corresponding port was open (i.e., there was a TCP in the LISTEN | corresponding port was open (i.e., there was a TCP in the LISTEN | |||
state), and respond with an RST only if the port was closed. | state), and respond with an RST only if the port was closed. | |||
Therefore, an RST would indicate that the scanned port is closed, | Therefore, an RST would indicate that the scanned port is closed, | |||
while the absence of a response from the target system would | while the absence of a response from the target system would | |||
indicate that the scanned port is open. | indicate that the scanned port is open. | |||
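
As a rough illustration of the probe and of how its result is read, the sketch below packs a 20-byte TCP header with FIN and ACK set and maps the observed reply to the inference described above. The function names, the default window value, and the zeroed checksum are assumptions made for the example; a real probe would also need a valid checksum computed over the pseudo-header.

```python
import struct

FIN, RST, ACK = 0x01, 0x04, 0x10


def build_probe_header(src_port, dst_port, seq=0, ack=0, window=1024):
    """Pack a TCP header (no options) with the FIN and ACK bits set.

    The checksum and urgent pointer are left as zero in this sketch.
    """
    data_offset = 5 << 4  # header length of 5 32-bit words, upper nibble
    return struct.pack("!HHIIBBHHH", src_port, dst_port, seq, ack,
                       data_offset, FIN | ACK, window, 0, 0)


def interpret(response_flags):
    """Map the reply to a FIN+ACK probe onto the inferred port state.

    response_flags is the flags byte of the reply, or None if no reply
    arrived. The inference only holds for the flawed implementations
    described in the text; a compliant TCP answers with an RST either way.
    """
    if response_flags is None:
        return "open"
    if response_flags & RST:
        return "closed"
    return "unknown"
```

For example, build_probe_header(40000, 80) returns 20 bytes whose flags byte (offset 13) is 0x11.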
skipping to change at page 99, line 18 | skipping to change at page 70, line 33 | |||
implement this policy. | implement this policy. | |||
14.6. ACK scan | 14.6. ACK scan | |||
The so-called "ACK scan" is not really a port-scanning technique | The so-called "ACK scan" is not really a port-scanning technique | |||
(i.e., it does not aim at determining whether a specific port is open | (i.e., it does not aim at determining whether a specific port is open | |||
or closed), but rather aims at determining whether some intermediate | or closed), but rather aims at determining whether some intermediate | |||
system is filtering TCP segments sent to that specific port number. | system is filtering TCP segments sent to that specific port number. | |||
The probe packet is a TCP segment with the ACK bit set which, | The probe packet is a TCP segment with the ACK bit set which, | |||
according to RFC 793 [Postel, 1981c] should elicit an RST from the | according to RFC 793 [RFC0793] should elicit an RST from the target | |||
target system regardless of whether the corresponding TCP port is | system regardless of whether the corresponding TCP port is open or | |||
open or closed. If no response is received from the target system, | closed. If no response is received from the target system, it is | |||
it is assumed that some intermediate system is filtering the probe | assumed that some intermediate system is filtering the probe packets | |||
packets sent to the target system. | sent to the target system. | |||
It should be noted that this "port scanning" technique exploits | It should be noted that this "port scanning" technique exploits | |||
basic TCP processing rules, and therefore cannot be defeated at an | basic TCP processing rules, and therefore cannot be defeated at an | |||
end-system. | end-system. | |||
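
The interpretation of ACK-scan results can be sketched as follows. The function name and the shape of the observations mapping are hypothetical, chosen only to make the rule concrete: an RST means the probe reached the target, while silence is taken to mean an intermediate system filtered it.

```python
def filtered_ports(observations):
    """Return the sorted ports for which no RST came back.

    observations maps a port number to True if an RST was received in
    response to the ACK probe, or False if nothing was received.
    """
    return sorted(port for port, got_rst in observations.items()
                  if not got_rst)
```

Note that the result says nothing about whether a port is open or closed, since both cases elicit an RST.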
15. Processing of ICMP error messages by TCP | 15. Processing of ICMP error messages by TCP | |||
TCP SHOULD silently ignore received ICMP Source Quench messages. | [RFC5927] analyzes a number of vulnerabilities based on crafted ICMP | |||
messages, along with possible counter-measures. | ||||
TCP SHOULD process ICMP "hard errors" as "soft errors" when they are | ||||
received for connections that are in any of the synchronized states. | |||
TCP SHOULD process ICMP "fragmentation needed and DF bit set" and | ||||
ICMPv6 "Packet Too Big" error messages as described in [RFC5927]. | ||||
DISCUSSION: | ||||
[RFC5927] analyzes a number of vulnerabilities based on crafted | ||||
ICMP messages, along with possible counter-measures. | ||||
16. TCP interaction with the Internet Protocol (IP) | 16. TCP interaction with the Internet Protocol (IP) | |||
16.1. TCP-based traceroute | 16.1. TCP-based traceroute | |||
The traceroute tool is used to identify the intermediate systems between the | The traceroute tool is used to identify the intermediate systems between the | |||
local system and the destination system. It is usually implemented | local system and the destination system. It is usually implemented | |||
by sending "probe" packets with increasing IP Time to Live values | by sending "probe" packets with increasing IP Time to Live values | |||
(starting from 0), without maintaining any state with the final | (starting from 0), without maintaining any state with the final | |||
destination. | destination. | |||
Some traceroute implementations use ICMP "echo request" messages as | Some traceroute implementations use ICMP "echo request" messages as | |||
the probe packets, while others use UDP packets or TCP SYN segments. | the probe packets, while others use UDP packets or TCP SYN segments. | |||
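
The increasing-TTL procedure can be illustrated with a small simulation that sends no packets. Everything here is an assumption for illustration: the path is a hypothetical list of router names ending with the destination, each router is assumed to answer an expired probe, and the sketch starts at a TTL of 1 for simplicity.

```python
def trace(path, max_ttl=30):
    """Simulate traceroute over a known path.

    path lists the routers in order, with the destination as the last
    element. Returns (ttl, responder, reply_type) tuples, one per probe.
    """
    if not path:
        raise ValueError("path must contain at least the destination")
    hops = []
    for ttl in range(1, max_ttl + 1):
        if ttl < len(path):
            # The probe expires at the ttl-th router, which reports it.
            hops.append((ttl, path[ttl - 1], "time exceeded"))
        else:
            # The probe survives every router and reaches the destination.
            hops.append((ttl, path[-1], "destination reply"))
            break
    return hops
```

The probe type (ICMP echo request, UDP, or TCP SYN) only changes what the final "destination reply" looks like; the per-hop discovery is the same in each case.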
skipping to change at page 102, line 28 | skipping to change at page 73, line 36 | |||
This document provides a thorough security assessment of the | This document provides a thorough security assessment of the | |||
Transmission Control Protocol (TCP), identifies a number of | Transmission Control Protocol (TCP), identifies a number of | |||
vulnerabilities, and specifies possible counter-measures. | vulnerabilities, and specifies possible counter-measures. | |||
Additionally, it provides implementation guidance such that the | Additionally, it provides implementation guidance such that the | |||
resilience of TCP implementations is improved. | resilience of TCP implementations is improved. | |||
18. Acknowledgements | 18. Acknowledgements | |||
The author would like to thank (in alphabetical order) David Borman, | The author would like to thank (in alphabetical order) David Borman, | |||
Wesley Eddy, and Alfred Hoenes, for providing valuable feedback on | Wesley Eddy, Alfred Hoenes, and Michael Scharf, for providing | |||
earlier versions of this document. | valuable feedback on earlier versions of this document. | |||
This document is heavily based on the document "Security Assessment | This document is heavily based on the document "Security Assessment | |||
of the Transmission Control Protocol (TCP)" [CPNI, 2009] written by | of the Transmission Control Protocol (TCP)" [CPNI, 2009] written by | |||
Fernando Gont on behalf of CPNI (Centre for the Protection of | Fernando Gont on behalf of CPNI (Centre for the Protection of | |||
National Infrastructure). | National Infrastructure). | |||
The author would like to thank (in alphabetical order) Randall | The author would like to thank (in alphabetical order) Randall | |||
Atkinson, Guillermo Gont, Alfred Hoenes, Jamshid Mahdavi, Stanislav | Atkinson, Guillermo Gont, Alfred Hoenes, Jamshid Mahdavi, Stanislav | |||
Shalunov, Michael Welzl, Dan Wing, Andrew Yourtchenko, Michal | Shalunov, Michael Welzl, Dan Wing, Andrew Yourtchenko, Michal | |||
Zalewski, and Christos Zoulas, for providing valuable feedback on | Zalewski, and Christos Zoulas, for providing valuable feedback on | |||
skipping to change at page 103, line 6 | skipping to change at page 74, line 13 | |||
Additionally, the author would like to thank (in alphabetical order) | Additionally, the author would like to thank (in alphabetical order) | |||
Mark Allman, David Black, Ethan Blanton, David Borman, James Chacon, | Mark Allman, David Black, Ethan Blanton, David Borman, James Chacon, | |||
John Heffner, Jerrold Leichter, Jamshid Mahdavi, Keith Scott, Bill | John Heffner, Jerrold Leichter, Jamshid Mahdavi, Keith Scott, Bill | |||
Squier, and David White, who generously answered a number of | Squier, and David White, who generously answered a number of | |||
questions that arose while the aforementioned document was being | questions that arose while the aforementioned document was being | |||
written. | written. | |||
Finally, the author would like to thank CPNI (formerly NISCC) for | Finally, the author would like to thank CPNI (formerly NISCC) for | |||
their continued support. | their continued support. | |||
19. References | 19. References (to be translated to xml) | |||
Abley, J., Savola, P., Neville-Neil, G. 2007. Deprecation of Type 0 | Abley, J., Savola, P., Neville-Neil, G. 2007. Deprecation of Type 0 | |||
Routing Headers in IPv6. RFC 5095. | Routing Headers in IPv6. RFC 5095. | |||
Allman, M. 2003. TCP Congestion Control with Appropriate Byte | Allman, M. 2003. TCP Congestion Control with Appropriate Byte | |||
Counting (ABC). RFC 3465. | Counting (ABC). RFC 3465. | |||
Allman, M. 2008. Comments On Selecting Ephemeral Ports. Available | Allman, M. 2008. Comments On Selecting Ephemeral Ports. Available | |||
at: http://www.icir.org/mallman/share/ports-dec08.pdf | at: http://www.icir.org/mallman/share/ports-dec08.pdf | |||
skipping to change at page 108, line 13 | skipping to change at page 79, line 22 | |||
Protocol. RFC 4301. | Protocol. RFC 4301. | |||
Klensin, J. 2008. Simple Mail Transfer Protocol. RFC 5321. | Klensin, J. 2008. Simple Mail Transfer Protocol. RFC 5321. | |||
Ko, Y., Ko, S., and Ko, M. 2001. NIDS Evasion Method named SeolMa. | Ko, Y., Ko, S., and Ko, M. 2001. NIDS Evasion Method named SeolMa. | |||
Phrack Magazine, Volume 0x0b, Issue 0x39, phile #0x03 of 0x12. | Phrack Magazine, Volume 0x0b, Issue 0x39, phile #0x03 of 0x12. | |||
Available at: http://www.phrack.org/issues.html?issue=57&id=3#article | Available at: http://www.phrack.org/issues.html?issue=57&id=3#article | |||
Lahey, K. 2000. TCP Problems with Path MTU Discovery. RFC 2923. | Lahey, K. 2000. TCP Problems with Path MTU Discovery. RFC 2923. | |||
Larsen, M., Gont, F. 2008. Port Randomization. IETF Internet-Draft | ||||
(draft-ietf-tsvwg-port-randomization-02), work in progress. | ||||
Lemon, 2002. Resisting SYN flood DoS attacks with a SYN cache. | Lemon, 2002. Resisting SYN flood DoS attacks with a SYN cache. | |||
Proceedings of the BSDCon 2002 Conference, pp 89-98. | Proceedings of the BSDCon 2002 Conference, pp 89-98. | |||
Maimon, U. 1996. Port Scanning without the SYN flag. Phrack | Maimon, U. 1996. Port Scanning without the SYN flag. Phrack | |||
Magazine, Volume Seven, Issue Fourty-Nine, phile #0x0f of 0x10. | Magazine, Volume Seven, Issue Fourty-Nine, phile #0x0f of 0x10. | |||
Available at: | Available at: | |||
http://www.phrack.org/issues.html?issue=49&id=15#article | http://www.phrack.org/issues.html?issue=49&id=15#article | |||
Mathis, M., Mahdavi, J., Floyd, S. Romanow, A. 1996. TCP Selective | Mathis, M., Mahdavi, J., Floyd, S. Romanow, A. 1996. TCP Selective | |||
Acknowledgment Options. RFC 2018. | Acknowledgment Options. RFC 2018. | |||
skipping to change at page 113, line 9 | skipping to change at page 84, line 10 | |||
IFIP Communications and Multimedia Security Conference (CMS 2002). | IFIP Communications and Multimedia Security Conference (CMS 2002). | |||
Available at: http://www.ieeta.pt/~avz/pubs/CMS02.html | Available at: http://www.ieeta.pt/~avz/pubs/CMS02.html | |||
Zweig, J., Partridge, C. 1990. TCP Alternate Checksum Options. RFC | Zweig, J., Partridge, C. 1990. TCP Alternate Checksum Options. RFC | |||
1146. | 1146. | |||
20. References | 20. References | |||
20.1. Normative References | 20.1. Normative References | |||
[I-D.ietf-tcpm-tcp-timestamps] | [RFC0793] Postel, J., "Transmission Control Protocol", STD 7, | |||
Gont, F., "Reducing the TIME-WAIT state using TCP | RFC 793, September 1981. | |||
timestamps", draft-ietf-tcpm-tcp-timestamps-03 (work in | ||||
progress), December 2010. | ||||
[I-D.ietf-tsvwg-port-randomization] | [RFC1122] Braden, R., "Requirements for Internet Hosts - | |||
Larsen, M. and F. Gont, "Transport Protocol Port | Communication Layers", STD 3, RFC 1122, October 1989. | |||
Randomization Recommendations", | ||||
draft-ietf-tsvwg-port-randomization-09 (work in progress), | [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition | |||
of Explicit Congestion Notification (ECN) to IP", | ||||
RFC 3168, September 2001. | ||||
[RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion | ||||
Control", RFC 5681, September 2009. | ||||
[RFC5961] Ramaiah, A., Stewart, R., and M. Dalal, "Improving TCP's | ||||
Robustness to Blind In-Window Attacks", RFC 5961, | ||||
August 2010. | August 2010. | |||
[RFC6056] Larsen, M. and F. Gont, "Recommendations for Transport- | ||||
Protocol Port Randomization", BCP 156, RFC 6056, | ||||
January 2011. | ||||
[RFC6093] Gont, F. and A. Yourtchenko, "On the Implementation of the | [RFC6093] Gont, F. and A. Yourtchenko, "On the Implementation of the | |||
TCP Urgent Mechanism", RFC 6093, January 2011. | TCP Urgent Mechanism", RFC 6093, January 2011. | |||
[RFC6191] Gont, F., "Reducing the TIME-WAIT State Using TCP | ||||
Timestamps", BCP 159, RFC 6191, April 2011. | ||||
[RFC6528] Gont, F. and S. Bellovin, "Defending against Sequence | ||||
Number Attacks", RFC 6528, February 2012. | ||||
20.2. Informative References | 20.2. Informative References | |||
[I-D.gont-timestamps-generation] | [I-D.gont-timestamps-generation] | |||
Gont, F. and A. Oppermann, "On the generation of TCP | Gont, F. and A. Oppermann, "On the generation of TCP | |||
timestamps", draft-gont-timestamps-generation-00 (work in | timestamps", draft-gont-timestamps-generation-00 (work in | |||
progress), June 2010. | progress), June 2010. | |||
[I-D.ietf-tcpm-3517bis] | ||||
Blanton, E., Jarvinen, I., Wang, L., Allman, M., Kojo, M., | ||||
and Y. Nishida, "A Conservative Selective Acknowledgment | ||||
(SACK)-based Loss Recovery Algorithm for TCP", | ||||
draft-ietf-tcpm-3517bis-01 (work in progress), | ||||
January 2012. | ||||
[Morris1985] | ||||
Morris, R., "A Weakness in the 4.2BSD UNIX TCP/IP | ||||
Software", CSTR 117, AT&T Bell Laboratories, Murray Hill, | ||||
NJ, 1985. | ||||
[RFC1025] Postel, J., "TCP and IP bake off", RFC 1025, | ||||
September 1987. | ||||
[RFC1379] Braden, B., "Extending TCP for Transactions -- Concepts", | ||||
RFC 1379, November 1992. | ||||
[RFC5927] Gont, F., "ICMP Attacks against TCP", RFC 5927, July 2010. | [RFC5927] Gont, F., "ICMP Attacks against TCP", RFC 5927, July 2010. | |||
[RFC6429] Bashyam, M., Jethanandani, M., and A. Ramaiah, "TCP Sender | ||||
Clarification for Persist Condition", RFC 6429, | ||||
December 2011. | ||||
[Shimomura1995] | ||||
Shimomura, T., "Technical details of the attack described | ||||
by Markoff in NYT", | ||||
http://www.gont.com.ar/docs/post-shimomura-usenet.txt, | ||||
Message posted in USENET's comp.security.misc newsgroup, | ||||
Message-ID: <3g5gkl$5j1@ariel.sdsc.edu>, 1995. | ||||
Appendix A. TODO list | Appendix A. TODO list | |||
A number of formatting issues still have to be fixed in this | A number of formatting issues still have to be fixed in this | |||
document. Among others are: | document. Among others are: | |||
o The ASCII-art corresponding to some figures is still missing. We | o The ASCII-art corresponding to some figures is still missing. We | |||
still have to convert the nice JPGs of the UK CPNI document into | still have to convert the nice JPGs of the UK CPNI document into | |||
ugly ASCII-art. | ugly ASCII-art. | |||
o The references have not yet been converted to xml, but are | o The references have not yet been converted to xml, but are | |||
hardcoded, instead. That's why they may not look as expected. | hardcoded, instead. That's why they may not look as expected. | |||
Appendix B. Change log (to be removed by the RFC Editor before | Appendix B. Change log (to be removed by the RFC Editor before | |||
publication of this document as an RFC) | publication of this document as an RFC) | |||
B.1. Changes from draft-ietf-tcpm-tcp-security-01 | B.1. Changes from draft-ietf-tcpm-tcp-security-02 | |||
o Lots of text has been removed out of the document. | ||||
o The document track has been changed from BCP to Informational | |||
(RFC2119-language recommendations have been removed). | |||
o Where necessary, stand-alone std tracks documents have been | ||||
produced. | ||||
B.2. Changes from draft-ietf-tcpm-tcp-security-01 | ||||
A number of formatting issues still have to be fixed in this | A number of formatting issues still have to be fixed in this | |||
document. Among others are: | document. Among others are: | |||
o The whole document was reformatted with RFC 1122 style. | o The whole document was reformatted with RFC 1122 style. | |||
Author's Address | Author's Address | |||
Fernando Gont | Fernando Gont | |||
UK Centre for the Protection of National Infrastructure | UK Centre for the Protection of National Infrastructure | |||
End of changes. 192 change blocks. | ||||
2391 lines changed or deleted | 1059 lines changed or added | |||