idnits 2.17.00 (12 Aug 2021)

/tmp/idnits20482/draft-ietf-dccp-spec-11.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate. You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1.a on line 18.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 5828.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 5839.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 5846.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 5852.

  ** Found boilerplate matching RFC 3978, Section 5.4, paragraph 1 (on line
     5820), which is fine, but *also* found old RFC 2026, Section 10.4C,
     paragraph 1 text on line 40.

  ** The document seems to lack an RFC 3978 Section 5.1 IPR Disclosure
     Acknowledgement.

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead
     of the newer disclaimer which includes the IETF Trust according to RFC
     4748.

  ** The document uses RFC 3667 boilerplate or RFC 3978-like boilerplate
     instead of verbatim RFC 3978 boilerplate. After 6 May 2005, submission
     of drafts without verbatim RFC 3978 boilerplate is not accepted.

     The following non-3978 patterns matched text found in the document.
     That text should be removed or replaced:

       This document is an Internet-Draft and is subject to all provisions
       of Section 3 of RFC 3667.
       By submitting this Internet-Draft, each author represents that any
       applicable patent or other IPR claims of which he or she is aware
       have been or will be disclosed, and any of which he or she becomes
       aware will be disclosed, in accordance with Section 6 of BCP 79.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == Line 1842 has weird spacing: '...t value snd...'

  == Line 2404 has weird spacing: '...loseReq seq...'

  == The document seems to use 'NOT RECOMMENDED' as an RFC 2119 keyword, but
     does not include the phrase in its RFC 2119 key words list.

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to
     grant the BCP78 rights to the IETF Trust, then this is fine, and you
     can ignore this comment. If not, you may need to add the pre-RFC5378
     disclaimer. (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (10 March 2005) is 6280 days in the past. Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.
  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative
     references to lower-maturity documents in RFCs)

  == Missing Reference: 'CLOSED' is mentioned on line 851, but not defined

  == Missing Reference: 'LISTEN' is mentioned on line 851, but not defined

  == Missing Reference: 'TIMEWAIT' is mentioned on line 860, but not defined

  == Missing Reference: 'Nonce 0' is mentioned on line 4547, but not defined

  == Missing Reference: 'Nonce 1' is mentioned on line 4547, but not defined

  == Missing Reference: 'AWL' is mentioned on line 2367, but not defined

  == Missing Reference: 'AWH' is mentioned on line 2367, but not defined

  == Missing Reference: 'SWL' is mentioned on line 2367, but not defined

  == Missing Reference: 'SWH' is mentioned on line 2367, but not defined

  == Missing Reference: 'RFC TBA' is mentioned on line 3577, but not defined

  == Missing Reference: 'DrpCd' is mentioned on line 4305, but not defined

  == Missing Reference: 'E' is mentioned on line 5332, but not defined

  -- Looks like a reference, but probably isn't: '1' on line 5543

  -- Looks like a reference, but probably isn't: '0' on line 5526

  == Unused Reference: 'RFC 2119' is defined on line 5680, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC 2434' is defined on line 5683, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC 2460' is defined on line 5686, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC 1948' is defined on line 5739, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC 2960' is defined on line 5757, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 2434 (Obsoleted by RFC 5226)

  ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200)

  ** Obsolete normative reference: RFC 3309 (Obsoleted by RFC 4960)

  ** Obsolete normative reference: RFC 3775 (Obsoleted by RFC 6275)

  == Outdated reference: draft-ietf-pmtud-method has been published as RFC
     4821

  -- Obsolete informational reference (is this intentional?): RFC 1750
     (Obsoleted by RFC 4086)

  -- Obsolete informational reference (is this intentional?): RFC 1948
     (Obsoleted by RFC 6528)

  -- Obsolete informational reference (is this intentional?): RFC 2401
     (Obsoleted by RFC 4301)

  -- Obsolete informational reference (is this intentional?): RFC 2463
     (Obsoleted by RFC 4443)

  -- Obsolete informational reference (is this intentional?): RFC 2581
     (Obsoleted by RFC 5681)

  -- Obsolete informational reference (is this intentional?): RFC 2960
     (Obsoleted by RFC 4960)

  -- Obsolete informational reference (is this intentional?): RFC 3448
     (Obsoleted by RFC 5348)

     Summary: 10 errors (**), 0 flaws (~~), 23 warnings (==), 17 comments
     (--).

     Run idnits with the --verbose option for more detailed information
     about the items above.

--------------------------------------------------------------------------------

1 Internet Engineering Task Force                             Eddie Kohler
2 INTERNET-DRAFT                                                      UCLA
3 draft-ietf-dccp-spec-11.txt                                 Mark Handley
4 Expires: 10 September 2005                                           UCL
5                                                              Sally Floyd
6                                                                     ICIR
7                                                            10 March 2005

9              Datagram Congestion Control Protocol (DCCP)

11 Status of this Memo

13    This document is an Internet-Draft and is subject to all provisions
14    of section 3 of RFC 3667. By submitting this Internet-Draft, each
15    author represents that any applicable patent or other IPR claims of
16    which he or she is aware have been or will be disclosed, and any of
17    which he or she becomes aware will be disclosed, in accordance with
18    RFC 3668.

20    Internet-Drafts are working documents of the Internet Engineering
21    Task Force (IETF), its areas, and its working groups. Note that
22    other groups may also distribute working documents as Internet-
23    Drafts.
25 Internet-Drafts are draft documents valid for a maximum of six 26 months and may be updated, replaced, or obsoleted by other documents 27 at any time. It is inappropriate to use Internet-Drafts as 28 reference material or to cite them other than as "work in progress." 30 The list of current Internet-Drafts can be accessed at 31 http://www.ietf.org/ietf/1id-abstracts.txt. 33 The list of Internet-Draft Shadow Directories can be accessed at 34 http://www.ietf.org/shadow.html. 36 This Internet-Draft will expire on 10 September 2005. 38 Copyright Notice 40 Copyright (C) The Internet Society (2005). All Rights Reserved. 42 Abstract 44 The Datagram Congestion Control Protocol (DCCP) is a transport 45 protocol that provides bidirectional unicast connections of 46 congestion-controlled unreliable datagrams. DCCP is suitable for 47 applications that transfer fairly large amounts of data, but can 48 benefit from control over the tradeoff between timeliness and 49 reliability. 51 TO BE DELETED BY THE RFC EDITOR UPON PUBLICATION: 53 Changes since draft-ietf-dccp-spec-08.txt: 55 * Added minimum Sequence Window. 57 * Init Cookie implementation sketch. 59 * Include reasoning for ignoring options on DCCP-Data. 61 * More Aggression Penalty explanation. 63 * More explanation on Ack Vectors that report information on packets 64 that haven't been sent. 66 Changes since draft-ietf-dccp-spec-07.txt: 68 * Many changes, not listed here, for WGLC. 70 * The more stringent Sequence Number checks on DCCP-Sync and DCCP- 71 SyncAck packets become SHOULD, not MAY. 73 Changes since draft-ietf-dccp-spec-06.txt: 75 * Change extended sequence numbers. Now 48-bit sequence numbers are 76 MANDATORY, and all packet types except Data, Ack, and DataAck always 77 use 48-bit sequence numbers. This change improves DCCP's robustness 78 against blind attacks. 80 * Removed empty Change options. 
82 * Allow preference list changes during feature negotiations (this 83 seems easier to implement than the alternative). This required a 84 new feature negotiation state, UNSTABLE. 86 * Add Minimum Checksum Coverage feature. 88 * Add Reset Congestion State option. 90 * Simplify the implementation of CCID-specific option processing: no 91 need to check whether the CCID feature is being negotiated. 93 * Many more minor changes. 95 Changes since draft-ietf-dccp-spec-05.txt: 97 * Organization overhaul. 99 * Add pseudocode for event processing. 101 * Remove # NDP; replace with Ack Count. 103 * Remove Identification, Challenge, ID Regime, and Connection Nonce. 105 * Data Checksum (formerly Payload Checksum) uses a 32-bit CRC. 107 * Switch location of non-negotiable features to clarify 108 presentation; now the feature location controls its value. 110 * Rename "value type" to "reconciliation rule". 112 * Rename "Reset Reason" to "Reset Code". 114 * Mobility ID becomes 128 bits long. 116 * Add probabilities to Mobility ID discussion. 118 * Add SyncAck. 120 Table of Contents 122 1. Introduction. . . . . . . . . . . . . . . . . . . . . . . . . 10 123 2. Design Rationale. . . . . . . . . . . . . . . . . . . . . . . 11 124 3. Conventions and Terminology . . . . . . . . . . . . . . . . . 12 125 3.1. Numbers and Fields . . . . . . . . . . . . . . . . . . . 12 126 3.2. Parts of a Connection. . . . . . . . . . . . . . . . . . 13 127 3.3. Features . . . . . . . . . . . . . . . . . . . . . . . . 13 128 3.4. Round-Trip Times . . . . . . . . . . . . . . . . . . . . 14 129 3.5. Security Limitation. . . . . . . . . . . . . . . . . . . 14 130 3.6. Robustness Principle . . . . . . . . . . . . . . . . . . 14 131 4. Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . 15 132 4.1. Packet Types . . . . . . . . . . . . . . . . . . . . . . 15 133 4.2. Sequence Numbers . . . . . . . . . . . . . . . . . . . . 16 134 4.3. States . . . . . . . . . . . . . . . . . . . . . . . . . 
17 135 4.4. Congestion Control . . . . . . . . . . . . . . . . . . . 19 136 4.5. Features . . . . . . . . . . . . . . . . . . . . . . . . 20 137 4.6. Differences From TCP . . . . . . . . . . . . . . . . . . 21 138 4.7. Example Connection . . . . . . . . . . . . . . . . . . . 22 139 5. Packet Formats. . . . . . . . . . . . . . . . . . . . . . . . 23 140 5.1. Generic Header . . . . . . . . . . . . . . . . . . . . . 24 141 5.2. DCCP-Request Packets . . . . . . . . . . . . . . . . . . 27 142 5.3. DCCP-Response Packets. . . . . . . . . . . . . . . . . . 28 143 5.4. DCCP-Data, DCCP-Ack, and DCCP-DataAck Packets. . . . . . 29 144 5.5. DCCP-CloseReq and DCCP-Close Packets . . . . . . . . . . 30 145 5.6. DCCP-Reset Packets . . . . . . . . . . . . . . . . . . . 31 146 5.7. DCCP-Sync and DCCP-SyncAck Packets . . . . . . . . . . . 34 147 5.8. Options. . . . . . . . . . . . . . . . . . . . . . . . . 35 148 5.8.1. Padding Option. . . . . . . . . . . . . . . . . . . 36 149 5.8.2. Mandatory Option. . . . . . . . . . . . . . . . . . 37 150 6. Feature Negotiation . . . . . . . . . . . . . . . . . . . . . 38 151 6.1. Change Options . . . . . . . . . . . . . . . . . . . . . 38 152 6.2. Confirm Options. . . . . . . . . . . . . . . . . . . . . 39 153 6.3. Reconciliation Rules . . . . . . . . . . . . . . . . . . 39 154 6.3.1. Server-Priority . . . . . . . . . . . . . . . . . . 39 155 6.3.2. Non-Negotiable. . . . . . . . . . . . . . . . . . . 40 156 6.4. Feature Numbers. . . . . . . . . . . . . . . . . . . . . 40 157 6.5. Examples . . . . . . . . . . . . . . . . . . . . . . . . 41 158 6.6. Option Exchange. . . . . . . . . . . . . . . . . . . . . 42 159 6.6.1. Normal Exchange . . . . . . . . . . . . . . . . . . 43 160 6.6.2. Processing Received Options . . . . . . . . . . . . 43 161 6.6.3. Loss and Retransmission . . . . . . . . . . . . . . 45 162 6.6.4. Reordering. . . . . . . . . . . . . . . . . . . . . 46 163 6.6.5. Preference Changes. . . . . . . . . . . . . . . . . 47 164 6.6.6. 
Simultaneous Negotiation. . . . . . . . . . . . . . 47 165 6.6.7. Unknown Features. . . . . . . . . . . . . . . . . . 47 166 6.6.8. Invalid Options . . . . . . . . . . . . . . . . . . 48 167 6.6.9. Mandatory Feature Negotiation . . . . . . . . . . . 49 169 7. Sequence Numbers. . . . . . . . . . . . . . . . . . . . . . . 49 170 7.1. Variables. . . . . . . . . . . . . . . . . . . . . . . . 50 171 7.2. Initial Sequence Numbers . . . . . . . . . . . . . . . . 50 172 7.3. Quiet Time . . . . . . . . . . . . . . . . . . . . . . . 51 173 7.4. Acknowledgement Numbers. . . . . . . . . . . . . . . . . 52 174 7.5. Validity and Synchronization . . . . . . . . . . . . . . 52 175 7.5.1. Sequence and Acknowledgement Number 176 Windows. . . . . . . . . . . . . . . . . . . . . . . . . . 53 177 7.5.2. Sequence Window Feature . . . . . . . . . . . . . . 54 178 7.5.3. Sequence-Validity Rules . . . . . . . . . . . . . . 54 179 7.5.4. Handling Sequence-Invalid Packets . . . . . . . . . 56 180 7.5.5. Sequence Number Attacks . . . . . . . . . . . . . . 57 181 7.5.6. Examples. . . . . . . . . . . . . . . . . . . . . . 58 182 7.6. Short Sequence Numbers . . . . . . . . . . . . . . . . . 59 183 7.6.1. Allow Short Sequence Numbers Feature. . . . . . . . 60 184 7.6.2. When to Avoid Short Sequence Numbers. . . . . . . . 60 185 7.7. NDP Count and Detecting Application Loss . . . . . . . . 61 186 7.7.1. Usage Notes . . . . . . . . . . . . . . . . . . . . 62 187 7.7.2. Send NDP Count Feature. . . . . . . . . . . . . . . 62 188 8. Event Processing. . . . . . . . . . . . . . . . . . . . . . . 62 189 8.1. Connection Establishment . . . . . . . . . . . . . . . . 63 190 8.1.1. Client Request. . . . . . . . . . . . . . . . . . . 63 191 8.1.2. Service Codes . . . . . . . . . . . . . . . . . . . 64 192 8.1.3. Server Response . . . . . . . . . . . . . . . . . . 65 193 8.1.4. Init Cookie Option. . . . . . . . . . . . . . . . . 66 194 8.1.5. Handshake Completion. . . . . . . . . . . . . . . . 67 195 8.2. 
Data Transfer. . . . . . . . . . . . . . . . . . . . . . 67 196 8.3. Termination. . . . . . . . . . . . . . . . . . . . . . . 68 197 8.3.1. Abnormal Termination. . . . . . . . . . . . . . . . 70 198 8.4. DCCP State Diagram . . . . . . . . . . . . . . . . . . . 70 199 8.5. Pseudocode . . . . . . . . . . . . . . . . . . . . . . . 71 200 9. Checksums . . . . . . . . . . . . . . . . . . . . . . . . . . 75 201 9.1. Header Checksum Field. . . . . . . . . . . . . . . . . . 76 202 9.2. Header Checksum Coverage Field . . . . . . . . . . . . . 77 203 9.2.1. Minimum Checksum Coverage Feature . . . . . . . . . 78 204 9.3. Data Checksum Option . . . . . . . . . . . . . . . . . . 78 205 9.3.1. Check Data Checksum Feature . . . . . . . . . . . . 79 206 9.3.2. Usage Notes . . . . . . . . . . . . . . . . . . . . 79 207 10. Congestion Control . . . . . . . . . . . . . . . . . . . . . 80 208 10.1. TCP-like Congestion Control . . . . . . . . . . . . . . 81 209 10.2. TFRC Congestion Control . . . . . . . . . . . . . . . . 81 210 10.3. CCID-Specific Options, Features, and Reset 211 Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 212 10.4. CCID Profile Requirements . . . . . . . . . . . . . . . 84 213 10.5. Congestion State. . . . . . . . . . . . . . . . . . . . 84 214 11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 85 215 11.1. Acks of Acks and Unidirectional Connections . . . . . . 86 216 11.2. Ack Piggybacking. . . . . . . . . . . . . . . . . . . . 87 217 11.3. Ack Ratio Feature . . . . . . . . . . . . . . . . . . . 87 218 11.4. Ack Vector Options. . . . . . . . . . . . . . . . . . . 89 219 11.4.1. Ack Vector Consistency . . . . . . . . . . . . . . 91 220 11.4.2. Ack Vector Coverage. . . . . . . . . . . . . . . . 93 221 11.5. Send Ack Vector Feature . . . . . . . . . . . . . . . . 94 222 11.6. Slow Receiver Option. . . . . . . . . . . . . . . . . . 94 223 11.7. Data Dropped Option . . . . . . . . . . . . . . . . . . 95 224 11.7.1. 
Data Dropped and Normal Congestion 225 Response . . . . . . . . . . . . . . . . . . . . . . . . . 98 226 11.7.2. Particular Drop Codes. . . . . . . . . . . . . . . 98 227 12. Explicit Congestion Notification . . . . . . . . . . . . . . 99 228 12.1. ECN Incapable Feature . . . . . . . . . . . . . . . . . 100 229 12.2. ECN Nonces. . . . . . . . . . . . . . . . . . . . . . . 100 230 12.3. Aggression Penalties. . . . . . . . . . . . . . . . . . 101 231 13. Timing Options . . . . . . . . . . . . . . . . . . . . . . . 102 232 13.1. Timestamp Option. . . . . . . . . . . . . . . . . . . . 102 233 13.2. Elapsed Time Option . . . . . . . . . . . . . . . . . . 103 234 13.3. Timestamp Echo Option . . . . . . . . . . . . . . . . . 104 235 14. Maximum Packet Size. . . . . . . . . . . . . . . . . . . . . 105 236 14.1. Measuring PMTU. . . . . . . . . . . . . . . . . . . . . 105 237 14.2. Sender Behavior . . . . . . . . . . . . . . . . . . . . 107 238 15. Forward Compatibility. . . . . . . . . . . . . . . . . . . . 108 239 16. Middlebox Considerations . . . . . . . . . . . . . . . . . . 108 240 17. Relations to Other Specifications. . . . . . . . . . . . . . 110 241 17.1. RTP . . . . . . . . . . . . . . . . . . . . . . . . . . 110 242 17.2. Congestion Manager and Multiplexing . . . . . . . . . . 111 243 18. Security Considerations. . . . . . . . . . . . . . . . . . . 111 244 18.1. Security Considerations for Partial 245 Checksums . . . . . . . . . . . . . . . . . . . . . . . . . . 112 246 19. IANA Considerations. . . . . . . . . . . . . . . . . . . . . 113 247 19.1. Packet Types. . . . . . . . . . . . . . . . . . . . . . 113 248 19.2. Reset Codes . . . . . . . . . . . . . . . . . . . . . . 113 249 19.3. Option Types. . . . . . . . . . . . . . . . . . . . . . 114 250 19.4. Feature Numbers . . . . . . . . . . . . . . . . . . . . 114 251 19.5. Congestion Control Identifiers. . . . . . . . . . . . . 114 252 19.6. Ack Vector States . . . . . . . . . . . . . . . . . . . 115 253 19.7. 
Drop Codes. . . . . . . . . . . . . . . . . . . . . . . 115 254 19.8. Service Codes . . . . . . . . . . . . . . . . . . . . . 115 255 20. Thanks . . . . . . . . . . . . . . . . . . . . . . . . . . . 116 256 A. Appendix: Ack Vector Implementation Notes . . . . . . . . . . 116 257 A.1. Packet Arrival . . . . . . . . . . . . . . . . . . . . . 118 258 A.1.1. New Packets . . . . . . . . . . . . . . . . . . . . 118 259 A.1.2. Old Packets . . . . . . . . . . . . . . . . . . . . 119 260 A.2. Sending Acknowledgements . . . . . . . . . . . . . . . . 120 261 A.3. Clearing State . . . . . . . . . . . . . . . . . . . . . 121 262 A.4. Processing Acknowledgements. . . . . . . . . . . . . . . 122 263 B. Appendix: Partial Checksumming Design Motivation. . . . . . . 123 264 Normative References . . . . . . . . . . . . . . . . . . . . . . 124 265 Informative References . . . . . . . . . . . . . . . . . . . . . 125 266 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 127 267 Full Copyright Statement . . . . . . . . . . . . . . . . . . . . 127 268 Intellectual Property. . . . . . . . . . . . . . . . . . . . . . 128 269 List of Tables 271 Table 1: DCCP Packet Types . . . . . . . . . . . . . . . . . . . 26 272 Table 2: DCCP Reset Codes. . . . . . . . . . . . . . . . . . . . 33 273 Table 3: DCCP Options. . . . . . . . . . . . . . . . . . . . . . 35 274 Table 4: DCCP Feature Numbers. . . . . . . . . . . . . . . . . . 40 275 Table 5: DCCP Congestion Control Identifiers . . . . . . . . . . 80 276 Table 6: DCCP Ack Vector States. . . . . . . . . . . . . . . . . 90 277 Table 7: DCCP Drop Codes . . . . . . . . . . . . . . . . . . . . 96 279 1. Introduction 281 The Datagram Congestion Control Protocol (DCCP) is a transport 282 protocol that implements bidirectional, unicast connections of 283 congestion-controlled, unreliable datagrams. Specifically, DCCP 284 provides: 286 o Unreliable flows of datagrams, with acknowledgements. 
288 o Reliable handshakes for connection setup and teardown. 290 o Reliable negotiation of options, including negotiation of a 291 suitable congestion control mechanism. 293 o Mechanisms allowing servers to avoid holding state for 294 unacknowledged connection attempts and already-finished 295 connections. 297 o Congestion control incorporating Explicit Congestion Notification 298 (ECN) [RFC 3168] and the ECN Nonce [RFC 3540]. 300 o Acknowledgement mechanisms communicating packet loss and ECN 301 information. Acks are transmitted as reliably as the relevant 302 congestion control mechanism requires, possibly completely 303 reliably. 305 o Optional mechanisms that tell the sending application, with high 306 reliability, which data packets reached the receiver, and whether 307 those packets were ECN marked, corrupted, or dropped in the 308 receive buffer. 310 o Path Maximum Transmission Unit (PMTU) discovery [RFC 1191]. 312 o A choice of modular congestion control mechanisms. Two 313 mechanisms are currently specified, TCP-like Congestion Control 314 [CCID 2 PROFILE] and TFRC (TCP-Friendly Rate Control) Congestion 315 Control [CCID 3 PROFILE], but DCCP is easily extensible to 316 further forms of unicast congestion control. 318 DCCP is intended for applications such as streaming media that can 319 benefit from control over the tradeoffs between delay and reliable 320 in-order delivery. TCP is not well-suited for these applications, 321 since reliable in-order delivery and congestion control can cause 322 arbitrarily long delays. UDP avoids long delays, but UDP 323 applications that implement congestion control must do so on their 324 own. DCCP provides built-in congestion control, including ECN 325 support, for unreliable datagram flows, avoiding the arbitrary 326 delays associated with TCP. It also implements reliable connection 327 setup, teardown, and feature negotiation. 329 2. 
Design Rationale 331 One DCCP design goal was to give most streaming UDP applications 332 little reason not to switch to DCCP, once it is deployed. To 333 facilitate this, DCCP was designed to have as little overhead as 334 possible, both in terms of the packet header size and in terms of 335 the state and CPU overhead required at end hosts. Only the minimal 336 necessary functionality was included in DCCP, leaving other 337 functionality, such as forward error correction (FEC), semi- 338 reliability, and multiple streams, to be layered on top of DCCP as 339 desired. 341 Different forms of conformant congestion control are appropriate for 342 different applications. For example, on-line games might want to 343 make quick use of any available bandwidth, while streaming media 344 might trade off this responsiveness for a steadier, less bursty 345 rate. (Sudden rate changes can cause unacceptable UI glitches, such 346 as audible pauses or clicks in the playout stream.) DCCP thus 347 allows applications to choose from a set of congestion control 348 mechanisms. One alternative, TCP-like Congestion Control, halves 349 the congestion window in response to a packet drop or mark, as in 350 TCP. Applications using this congestion control mechanism will 351 respond quickly to changes in available bandwidth, but must tolerate 352 the abrupt changes in congestion window typical of TCP. A second 353 alternative, TCP-Friendly Rate Control (TFRC) [RFC 3448], a form of 354 equation-based congestion control, minimizes abrupt changes in the 355 sending rate while maintaining longer-term fairness with TCP. Other 356 alternatives can be added as future congestion control mechanisms 357 are standardized. 359 DCCP also lets unreliable traffic safely use ECN. A UDP kernel API 360 might not allow applications to set UDP packets as ECN-capable, 361 since the API could not guarantee the application would properly 362 detect or respond to congestion. 
DCCP kernel APIs will have no such 363 issues, since DCCP implements congestion control itself. 365 We chose not to require the use of the Congestion Manager [RFC 366 3124], which allows multiple concurrent streams between the same 367 sender and receiver to share congestion control. The current 368 Congestion Manager can only be used by applications that have their 369 own end-to-end feedback about packet losses, but this is not the 370 case for many of the applications currently using UDP. In addition, 371 the current Congestion Manager does not easily support multiple 372 congestion control mechanisms, or lend itself to the use of forms of 373 TFRC where the state about past packet drops or marks is maintained 374 at the receiver rather than at the sender. DCCP should be able to 375 make use of CM where desired by the application, but we do not see 376 any benefit in making the deployment of DCCP contingent on the 377 deployment of CM itself. 379 We intend for DCCP's protocol mechanisms, which are described in 380 this document, to suit any application desiring unicast congestion- 381 controlled streams of unreliable datagrams. The congestion control 382 mechanisms currently approved for use with DCCP, which are described 383 in separate Congestion Control ID Profiles [CCID 2 PROFILE, CCID 3 384 PROFILE], may, however, cause problems for some applications, 385 including high-bandwidth interactive video. These applications 386 should be able to use DCCP once suitable Congestion Control ID 387 Profiles are standardized. 389 3. Conventions and Terminology 391 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 392 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 393 document are to be interpreted as described in RFC 2119. 395 3.1. 
Numbers and Fields 397 All multi-byte numerical quantities in DCCP, such as port numbers, 398 Sequence Numbers, and arguments to options, are transmitted in 399 network byte order (most significant byte first). 401 We occasionally refer to the "left" and "right" sides of a bit 402 field. "Left" means towards the most significant bit, and "right" 403 means towards the least significant bit. 405 Random numbers in DCCP are used for their security properties, and 406 SHOULD be chosen according to the guidelines in RFC 1750. 408 All operations on DCCP sequence numbers, and comparisons such as 409 "greater" and "greatest", use circular arithmetic modulo 2**48. 410 This form of arithmetic preserves the relationships between sequence 411 numbers as they roll over from 2**48 - 1 to 0. Implementation 412 strategies for DCCP sequence numbers will resemble those for other 413 circular arithmetic spaces, including TCP's sequence numbers [RFC 414 793] and DNS's serial numbers [RFC 1982]. Note that the common 415 technique for implementing circular comparison using two's- 416 complement arithmetic, whereby A < B using circular arithmetic if 417 and only if (A - B) < 0 using conventional two's-complement 418 arithmetic, may be used for DCCP sequence numbers, provided they are 419 stored in the most significant 48 bits of 64-bit integers. 421 Reserved bitfields in DCCP packet headers MUST be set to zero by 422 senders, and MUST be ignored by receivers, unless otherwise 423 specified. This is to allow for future protocol extensions. In 424 particular, DCCP processors MUST NOT reset a DCCP connection simply 425 because a Reserved field has non-zero value [RFC 3360]. 427 3.2. Parts of a Connection 429 Each DCCP connection runs between two hosts, which we often name 430 DCCP A and DCCP B. Each connection is actively initiated by one of 431 the hosts, which we call the client; the other, initially passive 432 host is called the server. 
The term "DCCP endpoint" is used to
433    refer to either of the two hosts explicitly named by the connection
434    (the client and the server). The term "DCCP processor" refers more
435    generally to any host that might need to process a DCCP header; this
436    includes the endpoints and any middleboxes on the path, such as
437    firewalls and network address translators.

439    DCCP connections are bidirectional: data may pass from either
440    endpoint to the other. This means that data and acknowledgements
441    may be flowing in both directions simultaneously. Logically,
442    however, a DCCP connection consists of two separate unidirectional
443    connections, called half-connections. Each half-connection consists
444    of the application data sent by one endpoint and the corresponding
445    acknowledgements sent by the other endpoint. We can illustrate this
446    as follows:

448       +--------+  A-to-B half-connection:         +--------+
449       |        |  -->  application data  -->      |        |
450       |        |  <--  acknowledgements  <--      |        |
451       | DCCP A |                                  | DCCP B |
452       |        |  B-to-A half-connection:         |        |
453       |        |  <--  application data  <--      |        |
454       +--------+  -->  acknowledgements  -->      +--------+

456    Although they are logically distinct, in practice the half-
457    connections overlap; a DCCP-DataAck packet, for example, contains
458    application data relevant to one half-connection and acknowledgement
459    information relevant to the other.

461    In the context of a single half-connection, the terms "HC-Sender"
462    and "HC-Receiver" denote the endpoints sending application data and
463    acknowledgements, respectively. For example, DCCP A is the HC-
464    Sender and DCCP B is the HC-Receiver in the A-to-B half-connection.

466 3.3. Features

468    A DCCP feature is a connection attribute on whose value the two
469    endpoints agree. Many properties of a DCCP connection are
470    controlled by features, including the congestion control mechanisms
471    in use on the two half-connections.
The endpoints achieve agreement 472 through the exchange of feature negotiation options in DCCP headers. 474 DCCP features are identified by a feature number and an endpoint. 475 The notation "F/X" represents the feature with feature number F 476 located at DCCP endpoint X. Each valid feature number thus 477 corresponds to two features, which are negotiated separately and 478 need not have the same value. The two endpoints know, and agree on, 479 the value of every valid feature. DCCP A is the "feature location" 480 for all features F/A, and the "feature remote" for all features F/B. 482 3.4. Round-Trip Times 484 DCCP round-trip time measurements are performed by congestion 485 control mechanisms; different mechanisms may measure round-trip time 486 in different ways, or not measure it at all. However, the main DCCP 487 protocol does use round-trip times occasionally, such as in the 488 initial values for certain timers. Each DCCP implementation thus 489 defines a default round-trip time for use when no estimate is 490 available; this parameter should default to not less than 491 0.2 seconds, a reasonably conservative round-trip time for Internet 492 TCP connections. Protocol behavior specified in terms of "round- 493 trip time" values actually refers to "a current round-trip time 494 estimate taken by some CCID, or, if no estimate is available, the 495 default round-trip time parameter". 497 The maximum segment lifetime, or MSL, is the maximum length of time 498 a packet can survive in the network. The DCCP MSL should equal that 499 of TCP, which is normally two minutes. 501 3.5. Security Limitation 503 DCCP provides no protection against attackers who can snoop on a 504 connection in progress, or who can guess valid sequence numbers in 505 other ways. Applications desiring stronger security should use 506 IPsec [RFC 2401]; depending on the level of security required, 507 application-level cryptography may also suffice. 
These issues are 508 discussed further in Sections 18 and 7.5.5. 510 3.6. Robustness Principle 512 DCCP implementations will follow TCP's "general principle of 513 robustness": "be conservative in what you do, be liberal in what you 514 accept from others" [RFC 793]. 516 4. Overview 518 DCCP's high-level connection dynamics echo those of TCP. 519 Connections progress through three phases: initiation, including a 520 three-way handshake; data transfer; and termination. Data can flow 521 both ways over the connection. An acknowledgement framework lets 522 senders discover how much data has been lost, and thus avoid 523 unfairly congesting the network. Of course, DCCP provides 524 unreliable datagram semantics, not TCP's reliable bytestream 525 semantics. The application must package its data into explicit 526 frames, and must retransmit its own data as necessary. It may be 527 useful to think of DCCP as TCP minus bytestream semantics and 528 reliability, or as UDP plus congestion control, handshakes, and 529 acknowledgements. 531 4.1. Packet Types 533 Ten packet types implement DCCP's protocol functions. For example, 534 every new connection attempt begins with a DCCP-Request packet sent 535 by the client. A DCCP-Request packet thus resembles a TCP SYN; but 536 DCCP-Request is a packet type, not a flag, so there's no way to send 537 an unexpected combination such as TCP's SYN+FIN+ACK+RST. 539 Eight packet types occur during the progress of a typical 540 connection, shown here. Note the three-way handshakes during 541 initiation and termination. 543 Client Server 544 ------ ------ 545 (1) Initiation 546 DCCP-Request --> 547 <-- DCCP-Response 548 DCCP-Ack --> 549 (2) Data transfer 550 DCCP-Data, DCCP-Ack, DCCP-DataAck --> 551 <-- DCCP-Data, DCCP-Ack, DCCP-DataAck 552 (3) Termination 553 <-- DCCP-CloseReq 554 DCCP-Close --> 555 <-- DCCP-Reset 557 The two remaining packet types are used to resynchronize after 558 bursts of loss. 
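
As an informal illustration, not part of the protocol, the difference between a packet type field and a set of independent flags can be sketched in Python. The names TcpFlag, DccpType, and decode_type are hypothetical and chosen for this example only. The TCP flag bit values are those of the TCP header, and the DCCP type numbers are taken from Table 1 in Section 5.1.

```python
from enum import Enum, IntFlag


class TcpFlag(IntFlag):
    """A few TCP header flags; independent bits that can be combined freely."""
    FIN = 0x01
    SYN = 0x02
    RST = 0x04
    ACK = 0x10


class DccpType(Enum):
    """DCCP packet types; exactly one value per packet (Table 1)."""
    REQUEST = 0
    RESPONSE = 1
    DATA = 2
    ACK = 3
    DATAACK = 4
    CLOSEREQ = 5
    CLOSE = 6
    RESET = 7
    SYNC = 8
    SYNCACK = 9


# With flags, a nonsensical combination is still representable on the wire.
nonsense = TcpFlag.SYN | TcpFlag.FIN | TcpFlag.ACK | TcpFlag.RST


def decode_type(value: int) -> DccpType:
    """Map a 4-bit Type field to a packet type.

    Raises ValueError for the reserved values 10-15, which a receiver
    would ignore.
    """
    return DccpType(value)
```

A Type field holds a single value, so a decoder of this kind either yields one well-defined packet type or rejects the packet; there is no counterpart to the combined flag value above.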

   Every DCCP packet starts with a generic header of 12 or 16 bytes
   (Section 5.1). Particular packet types include additional
   fixed-size header data; for example, DCCP-Acks include an
   Acknowledgement Number. DCCP options and any application data
   follow the fixed-size header.

   The packet types are as follows:

   DCCP-Request
      Sent by the client to initiate a connection (the first part of
      the three-way initiation handshake).

   DCCP-Response
      Sent by the server in response to a DCCP-Request (the second
      part of the three-way initiation handshake).

   DCCP-Data
      Used to transmit application data.

   DCCP-Ack
      Used to transmit pure acknowledgements.

   DCCP-DataAck
      Used to transmit application data with piggybacked
      acknowledgements.

   DCCP-CloseReq
      Sent by the server to request that the client close the
      connection.

   DCCP-Close
      Used by the client or the server to close the connection;
      elicits a DCCP-Reset in response.

   DCCP-Reset
      Used to terminate the connection, either normally or abnormally.

   DCCP-Sync, DCCP-SyncAck
      Used to resynchronize sequence numbers after large bursts of
      loss.

4.2. Sequence Numbers

   Each DCCP packet carries a sequence number, so that losses can be
   detected and reported. Unlike TCP sequence numbers, which are
   byte-based, DCCP sequence numbers increment by one per packet. For
   example:

      DCCP A                                      DCCP B
      ------                                      ------
      DCCP-Data(seqno 1) -->
      DCCP-Data(seqno 2) -->
                         <-- DCCP-Ack(seqno 10, ackno 2)
      DCCP-DataAck(seqno 3, ackno 10) -->
                                  <-- DCCP-Data(seqno 11)

   Every DCCP packet increments the sequence number, whether or not it
   contains application data. DCCP-Ack pure acknowledgements increment
   the sequence number, for instance: DCCP B's second packet above uses
   sequence number 11, since sequence number 10 was used for an
   acknowledgement. This lets endpoints detect all packet loss,
   including acknowledgement loss.
It also means that endpoints can 621 get out of sync after long bursts of loss; the DCCP-Sync and DCCP- 622 SyncAck packet types are used to recover (Section 7.5). 624 Since DCCP provides unreliable semantics, there are no 625 retransmissions, and it doesn't make sense to have a TCP-style 626 cumulative acknowledgement field. DCCP's Acknowledgement Number 627 field equals the greatest sequence number received, rather than the 628 smallest sequence number not received. Separate options indicate 629 any intermediate sequence numbers that weren't received. 631 4.3. States 633 DCCP endpoints progress through different states during the course 634 of a connection, corresponding roughly to the three phases of 635 initiation, data transfer, and termination. The figure below shows 636 the typical progress through these states for a client and server. 638 Client Server 639 ------ ------ 640 (0) No connection 641 CLOSED LISTEN 643 (1) Initiation 644 REQUEST DCCP-Request --> 645 <-- DCCP-Response RESPOND 646 PARTOPEN DCCP-Ack or DCCP-DataAck --> 648 (2) Data transfer 649 OPEN <-- DCCP-Data, Ack, DataAck --> OPEN 651 (3) Termination 652 <-- DCCP-CloseReq CLOSEREQ 653 CLOSING DCCP-Close --> 654 <-- DCCP-Reset CLOSED 655 TIMEWAIT 656 CLOSED 658 The nine possible states are as follows. They are listed in 659 increasing order, so that "state >= CLOSEREQ" means the same as 660 "state = CLOSEREQ or state = CLOSING or state = TIMEWAIT". Section 661 8 describes the states in more detail. 663 CLOSED 664 Represents nonexistent connections. 666 LISTEN 667 Represents server sockets in the passive listening state. 668 LISTEN and CLOSED are not associated with any particular DCCP 669 connection. 671 REQUEST 672 A client socket enters this state, from CLOSED, after sending a 673 DCCP-Request packet to try to initiate a connection. 675 RESPOND 676 A server socket enters this state, from LISTEN, after receiving 677 a DCCP-Request from a client. 
679 PARTOPEN 680 A client socket enters this state, from REQUEST, after receiving 681 a DCCP-Response from the server. This state represents the 682 third phase of the three-way handshake. The client may send 683 application data in this state, but it MUST include an 684 Acknowledgement Number on all of its packets. 686 OPEN 687 The central, data transfer portion of a DCCP connection. Client 688 and server sockets enter this state from PARTOPEN and RESPOND, 689 respectively. Sometimes we speak of SERVER-OPEN and CLIENT-OPEN 690 states, corresponding to the server's OPEN state and the 691 client's OPEN state. 693 CLOSEREQ 694 A server socket enters this state, from SERVER-OPEN, to signal 695 that the connection is over, but the client must hold TIMEWAIT 696 state. 698 CLOSING 699 Server and client sockets can both enter this state to close the 700 connection. 702 TIMEWAIT 703 A server or client socket remains in this state for 2MSL (4 704 minutes) after the connection has been torn down, to prevent 705 mistakes due to the delivery of old packets. Only one of the 706 endpoints need enter TIMEWAIT state (the other can enter CLOSED 707 state immediately), and a server can request its client to hold 708 TIMEWAIT state using the DCCP-CloseReq packet type. 710 4.4. Congestion Control 712 DCCP connections are congestion controlled, but unlike in TCP, DCCP 713 applications have a choice of congestion control mechanism. In 714 fact, the two half-connections can be governed by different 715 mechanisms. Mechanisms are denoted by one-byte congestion control 716 identifiers, or CCIDs. The endpoints negotiate their CCIDs during 717 connection initiation. Each CCID describes how the HC-Sender limits 718 data packet rates, how the HC-Receiver sends congestion feedback via 719 acknowledgements, and so forth. CCIDs 2 and 3 are currently 720 defined; CCIDs 0, 1, and 4-255 are reserved. Other CCIDs may be 721 defined in the future. 
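
The state ordering given in Section 4.3 can be sketched as an ordered enumeration. This is a minimal illustration, not a normative definition: the numeric values are an assumption chosen only to preserve the listed order, since this document does not assign numbers to states, and the names State, is_closing_down, MSL_SECONDS, and TIMEWAIT_SECONDS are hypothetical.

```python
from enum import IntEnum


class State(IntEnum):
    """DCCP states in the increasing order listed in Section 4.3.

    The integer values are illustrative; only their relative order matters.
    """
    CLOSED = 0
    LISTEN = 1
    REQUEST = 2
    RESPOND = 3
    PARTOPEN = 4
    OPEN = 5
    CLOSEREQ = 6
    CLOSING = 7
    TIMEWAIT = 8


# Section 3.4: the MSL is "normally two minutes"; TIMEWAIT lasts 2MSL.
MSL_SECONDS = 120
TIMEWAIT_SECONDS = 2 * MSL_SECONDS


def is_closing_down(state: State) -> bool:
    """True when "state >= CLOSEREQ" in the sense of Section 4.3."""
    return state >= State.CLOSEREQ
```

With this ordering, the test "state >= CLOSEREQ" selects exactly CLOSEREQ, CLOSING, and TIMEWAIT, matching the equivalence stated in Section 4.3.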

   CCID 2 provides TCP-like Congestion Control, which is similar to
   that of TCP. The sender maintains a congestion window and sends
   packets until that window is full. Packets are acknowledged by the
   receiver. Dropped packets and ECN [RFC 3168] indicate congestion;
   the response to congestion is to halve the congestion window.
   Acknowledgements in CCID 2 contain the sequence numbers of all
   received packets within some window, similar to a selective
   acknowledgement (SACK) [RFC 2018].

   CCID 3 provides TFRC Congestion Control, an equation-based form of
   congestion control intended to respond to congestion more smoothly
   than CCID 2. The sender maintains a transmit rate, which it updates
   using the receiver's estimate of the packet loss and mark rate.
   Although CCID 3 behaves somewhat differently from TCP in the short
   term, it is designed to operate fairly with TCP over the long term.

   Section 10 describes DCCP's CCIDs in more detail. The behaviors of
   CCIDs 2 and 3 are fully defined in separate profile documents
   [CCID 2 PROFILE, CCID 3 PROFILE].

4.5. Features

   DCCP endpoints use Change and Confirm options to negotiate and agree
   on feature values. Feature negotiation will almost always happen on
   the connection initiation handshake, but it can begin at any time.

   There are four feature negotiation options in all: Change L,
   Confirm L, Change R, and Confirm R. The "L" options are sent by the
   feature location, and the "R" options are sent by the feature
   remote. A Change R option says to the feature location, "change
   this feature value as follows". The feature location responds with
   Confirm L, meaning "I've changed it". Some features allow Change R
   options to contain multiple values, sorted in preference order.
For 756 example: 758 Client Server 759 ------ ------ 760 Change R(CCID, 2) --> 761 <-- Confirm L(CCID, 2) 762 * agreement that CCID/Server = 2 * 764 Change R(CCID, 3 4) --> 765 <-- Confirm L(CCID, 4, 4 2) 766 * agreement that CCID/Server = 4 * 768 Both exchanges negotiate the CCID/Server feature's value, which is 769 the CCID in use on the server-to-client half-connection. In the 770 second exchange, the client requests that the server use either 771 CCID 3 or CCID 4, with 3 preferred; the server chooses 4 and 772 supplies its preference list, "4 2". 774 The Change L and Confirm R options are used for feature negotiations 775 initiated by the feature location. In the following example, the 776 server requests that CCID/Server be set to 3 or 2, with 3 preferred, 777 and the client agrees. 779 Client Server 780 ------ ------ 781 <-- Change L(CCID, 3 2) 782 Confirm R(CCID, 3, 3 2) --> 783 * agreement that CCID/Server = 3 * 785 Section 6 describes the feature negotiation options further, 786 including the retransmission strategies that make negotiation 787 reliable. 789 4.6. Differences From TCP 791 Differences between DCCP and TCP apart from those discussed so far 792 include: 794 o Copious space for options (up to 1008 bytes or the PMTU). 796 o Different acknowledgement formats. The CCID for a connection 797 determines how much acknowledgement information needs to be 798 transmitted. For example, in CCID 2 (TCP-like), this is about 799 one ack per 2 packets, and each ack must declare exactly which 800 packets were received; in CCID 3 (TFRC), it's about one ack per 801 round-trip time, and acks must declare at minimum just the 802 lengths of recent loss intervals. 804 o Denial-of-service (DoS) protection. Several mechanisms help 805 limit the amount of state possibly-misbehaving clients can force 806 DCCP servers to maintain. An Init Cookie option, analogous to 807 TCP's SYN Cookies [SYNCOOKIES], avoids SYN-flood-like attacks. 
808 Only one connection endpoint need hold TIMEWAIT state; the DCCP- 809 CloseReq packet, which may only be sent by the server, passes 810 that state to the client. Various rate limits let servers avoid 811 attacks that might force extensive computation or packet 812 generation. 814 o Distinguishing different kinds of loss. A Data Dropped option 815 (Section 11.7) lets an endpoint declare that a packet was dropped 816 because of corruption, because of receive buffer overflow, and so 817 on. This facilitates research into more appropriate rate-control 818 responses for these non-network-congestion losses (although 819 currently such losses will cause a congestion response). 821 o Acknowledgeability. In TCP, a packet may be acknowledged only 822 once the data is reliably queued for application delivery. This 823 does not make sense in DCCP, where an application might, for 824 example, request a drop-from-front receive buffer. A DCCP packet 825 may be acknowledged as soon as its header has been successfully 826 processed. Concretely, a packet becomes acknowledgeable at 827 Step 8 of Section 8.5's packet processing pseudocode. 828 Acknowledgeability does not guarantee data delivery, however: the 829 Data Dropped option may later report that the packet's 830 application data was discarded. 832 o No receive window. DCCP is a congestion control protocol, not a 833 flow control protocol. 835 o No simultaneous open. Every connection has one client and one 836 server. 838 o No half-closed states. DCCP has no states corresponding to TCP's 839 FINWAIT and CLOSEWAIT, where one half-connection is explicitly 840 closed while the other is still active. The Data Dropped 841 option's Drop Code 1, Application Not Listening (Section 11.7), 842 can achieve a similar effect, however. 844 4.7. Example Connection 846 The progress of a typical DCCP connection is as follows. (This 847 description is informative, not normative.) 849 Client Server 850 ------ ------ 851 0. 
[CLOSED] [LISTEN] 852 1. DCCP-Request --> 853 2. <-- DCCP-Response 854 3. DCCP-Ack --> 855 4. DCCP-Data, DCCP-Ack, DCCP-DataAck --> 856 <-- DCCP-Data, DCCP-Ack, DCCP-DataAck 857 5. <-- DCCP-CloseReq 858 6. DCCP-Close --> 859 7. <-- DCCP-Reset 860 8. [TIMEWAIT] 862 1. The client sends the server a DCCP-Request packet specifying the 863 client and server ports, the service being requested, and any 864 features being negotiated, including the CCID that the client 865 would like the server to use. The client may optionally 866 piggyback an application request on the DCCP-Request packet, 867 which the server may ignore. 869 2. The server sends the client a DCCP-Response packet indicating 870 that it is willing to communicate with the client. This 871 response indicates any features and options that the server 872 agrees to, begins other feature negotiations as desired, and 873 optionally includes an Init Cookie that wraps up all this 874 information and which must be returned by the client for the 875 connection to complete. 877 3. The client sends the server a DCCP-Ack packet that acknowledges 878 the DCCP-Response packet. This acknowledges the server's 879 initial sequence number and returns the Init Cookie if there was 880 one in the DCCP-Response. It may also continue feature 881 negotiation. The client may piggyback an application-level 882 request on its final ack, producing a DCCP-DataAck packet. 884 4. The server and client then exchange DCCP-Data packets, DCCP-Ack 885 packets acknowledging that data, and, optionally, DCCP-DataAck 886 packets containing data with piggybacked acknowledgements. If 887 the client has no data to send, then the server will send DCCP- 888 Data and DCCP-DataAck packets, while the client will send DCCP- 889 Acks exclusively. (However, the client may not send DCCP-Data 890 packets before receiving at least one non-DCCP-Response packet 891 from the server.) 893 5. The server sends a DCCP-CloseReq packet requesting a close. 895 6. 
The client sends a DCCP-Close packet acknowledging the close. 897 7. The server sends a DCCP-Reset packet with Reset Code 1, 898 "Closed", and clears its connection state. DCCP-Resets are part 899 of normal connection termination; see Section 5.6. 901 8. The client receives the DCCP-Reset packet and holds state for 902 two maximum segment lifetimes, or 2MSL, to allow any remaining 903 packets to clear the network. 905 An alternative connection closedown sequence is initiated by the 906 client: 908 5b. The client sends a DCCP-Close packet closing the connection. 910 6b. The server sends a DCCP-Reset packet with Reset Code 1, 911 "Closed", and clears its connection state. 913 7b. The client receives the DCCP-Reset packet and holds state for 914 2MSL to allow any remaining packets to clear the network. 916 5. Packet Formats 918 The DCCP header can be from 12 to 1020 bytes long. The initial 12 919 bytes of the header have the same semantics for all currently- 920 defined packet types. Following this comes any additional fixed- 921 length fields required by the packet type, and then a variable- 922 length list of options. The application data area follows the 923 header. In some packet types, this area contains data for the 924 application; in other packet types, its contents are ignored. 926 +---------------------------------------+ -. 927 | Generic Header | | 928 +---------------------------------------+ | 929 | Additional Fields (depending on type) | +- DCCP Header 930 +---------------------------------------+ | 931 | Options (optional) | | 932 +=======================================+ -' 933 | Application Data Area | 934 +---------------------------------------+ 936 5.1. Generic Header 938 The DCCP generic header takes different forms depending on the value 939 of X, the Extended Sequence Numbers bit. If X is one, the Sequence 940 Number field is 48 bits long and the generic header takes 16 bytes, 941 as follows. 
943 0 1 2 3 944 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 945 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 946 | Source Port | Dest Port | 947 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 948 | Data Offset | CCVal | CsCov | Checksum | 949 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 950 | | |X| | . 951 | Res | Type |=| Reserved | Sequence Number (high bits) . 952 | | |1| | . 953 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 954 . Sequence Number (low bits) | 955 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 957 If X is zero, only the low 24 bits of the Sequence Number are 958 transmitted, and the generic header is 12 bytes long. 960 0 1 2 3 961 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 962 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 963 | Source Port | Dest Port | 964 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 965 | Data Offset | CCVal | CsCov | Checksum | 966 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 967 | | |X| | 968 | Res | Type |=| Sequence Number (low bits) | 969 | | |0| | 970 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 972 The generic header fields are defined as follows. 974 Source and Destination Ports: 16 bits each 975 These fields identify the connection, similar to the 976 corresponding fields in TCP and UDP. The Source Port represents 977 the relevant port on the endpoint that sent this packet, the 978 Destination Port the relevant port on the other endpoint. When 979 initiating a connection, the client SHOULD choose its Source 980 Port randomly to reduce the likelihood of attack. 982 DCCP APIs should treat port numbers similarly to TCP and UDP 983 port numbers. For example, machines that distinguish between 984 "privileged" and "unprivileged" ports for TCP and UDP should do 985 the same for DCCP. 
987 Data Offset: 8 bits 988 The offset from the start of the packet's DCCP header to the 989 start of its application data area, in 32-bit words. The 990 receiver MUST ignore packets whose Data Offset is smaller than 991 the minimum-sized header for the given Type, or larger than the 992 DCCP packet itself. 994 CCVal: 4 bits 995 Used by the HC-Sender CCID. For example, the A-to-B CCID's 996 sender, which is active at DCCP A, MAY send 4 bits of 997 information per packet to its receiver by encoding that 998 information in CCVal. The sender MUST set CCVal to zero unless 999 its HC-Sender CCID specifies otherwise, and the receiver MUST 1000 ignore the CCVal field unless its HC-Receiver CCID specifies 1001 otherwise. 1003 Checksum Coverage (CsCov): 4 bits 1004 Checksum Coverage determines the parts of the packet that are 1005 covered by the Checksum field. This always includes the DCCP 1006 header and options, but some or all of the application data may 1007 be excluded. This can improve performance on noisy links for 1008 applications that can tolerate corruption. See Section 9. 1010 Checksum: 16 bits 1011 The Internet checksum of the packet's DCCP header (including 1012 options), a network-layer pseudoheader, and, depending on 1013 Checksum Coverage, all, some, or none of the application data. 1014 See Section 9. 1016 Reserved (Res): 3 bits 1017 Senders MUST set this field to all zeroes on generated packets, 1018 and receivers MUST ignore its value. 1020 Type: 4 bits 1021 The Type field specifies the type of the packet. The following 1022 values are defined: 1024 Type Meaning 1025 ---- ------- 1026 0 DCCP-Request 1027 1 DCCP-Response 1028 2 DCCP-Data 1029 3 DCCP-Ack 1030 4 DCCP-DataAck 1031 5 DCCP-CloseReq 1032 6 DCCP-Close 1033 7 DCCP-Reset 1034 8 DCCP-Sync 1035 9 DCCP-SyncAck 1036 10-15 Reserved 1038 Table 1: DCCP Packet Types 1040 Receivers MUST ignore any packets with reserved type. 
That is, 1041 packets with reserved type MUST NOT be processed and they MUST 1042 NOT be acknowledged as received. 1044 Extended Sequence Numbers (X): 1 bit 1045 Set to one to indicate the use of an extended generic header 1046 with 48-bit Sequence and Acknowledgement Numbers. DCCP-Data, 1047 DCCP-DataAck, and DCCP-Ack packets MAY set X to zero or one. 1048 All DCCP-Request, DCCP-Response, DCCP-CloseReq, DCCP-Close, 1049 DCCP-Reset, DCCP-Sync, and DCCP-SyncAck packets MUST set X to 1050 one; endpoints MUST ignore any such packets with X set to zero. 1051 High-rate connections SHOULD set X to one on all packets to gain 1052 increased protection against wrapped sequence numbers and 1053 attacks. See Section 7.6. 1055 Sequence Number: 48 or 24 bits 1056 Identifies the packet uniquely in the sequence of all packets 1057 the source sent on this connection. Sequence Number increases 1058 by one with every packet sent, including packets such as DCCP- 1059 Ack that carry no application data. See Section 7. 1061 All currently defined packet types except DCCP-Request and DCCP-Data 1062 carry an Acknowledgement Number Subheader in the four or eight bytes 1063 immediately following the generic header. When X=1, its format is: 1065 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1066 | Reserved | Acknowledgement Number . 1067 | | (high bits) . 1068 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1069 . 
Acknowledgement Number (low bits) | 1070 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1072 When X=0, only the low 24 bits of the Acknowledgement Number are 1073 transmitted, giving the Acknowledgement Number Subheader this 1074 format: 1076 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1077 | Reserved | Acknowledgement Number (low bits) | 1078 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1080 Reserved: 16 or 8 bits 1081 Senders MUST set this field to all zeroes on generated packets, 1082 and receivers MUST ignore its value. 1084 Acknowledgement Number: 48 or 24 bits 1085 Generally contains GSR, the Greatest Sequence Number Received on 1086 any acknowledgeable packet so far. A packet is acknowledgeable 1087 if and only if its header was successfully processed by the 1088 receiver; Section 7.4 describes this further. Options such as 1089 Ack Vector (Section 11.4) combine with the Acknowledgement 1090 Number to provide precise information about which packets have 1091 arrived. 1093 Acknowledgement Numbers on DCCP-Sync and DCCP-SyncAck packets 1094 need not equal GSR. See Section 5.7. 1096 5.2. DCCP-Request Packets 1098 A client initiates a DCCP connection by sending a DCCP-Request 1099 packet. These packets MAY contain application data, and MUST use 1100 48-bit sequence numbers (X=1). 
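
As an informal illustration of the generic header layout in Section 5.1, which every DCCP-Request begins with, the following Python sketch decodes the fields of both the 16-byte (X=1) and 12-byte (X=0) forms. The names GenericHeader and parse_generic_header are hypothetical. The sketch assumes the field positions shown in the Section 5.1 diagrams and does not verify the Checksum or validate Data Offset.

```python
import struct
from typing import NamedTuple


class GenericHeader(NamedTuple):
    src_port: int
    dst_port: int
    data_offset: int  # in 32-bit words, per Section 5.1
    ccval: int
    cscov: int
    checksum: int     # carried as-is; this sketch does not verify it
    ptype: int        # 4-bit Type field
    x: int            # Extended Sequence Numbers bit
    seqno: int        # 48 bits if x == 1, else the low 24 bits
    length: int       # size of the generic header in bytes: 16 or 12


def parse_generic_header(buf: bytes) -> GenericHeader:
    """Decode the generic header at the start of buf (hypothetical helper)."""
    if len(buf) < 12:
        raise ValueError("shorter than the 12-byte generic header")
    # Ports, Data Offset, CCVal|CsCov, Checksum, then the Res|Type|X byte.
    src, dst, doff, cc, csum, rtx = struct.unpack_from("!HHBBHB", buf, 0)
    ptype = (rtx >> 1) & 0x0F  # Res is the top 3 bits and is ignored
    x = rtx & 0x01
    if x:
        if len(buf) < 16:
            raise ValueError("X=1 but fewer than 16 bytes present")
        # Byte 9 is Reserved; bytes 10..15 hold the 48-bit Sequence Number.
        seqno, length = int.from_bytes(buf[10:16], "big"), 16
    else:
        # Bytes 9..11 hold the low 24 bits of the Sequence Number.
        seqno, length = int.from_bytes(buf[9:12], "big"), 12
    return GenericHeader(src, dst, doff, cc >> 4, cc & 0x0F, csum,
                         ptype, x, seqno, length)


# A DCCP-Request-style header: Type=0, X=1, Sequence Number 42.
# Data Offset 5 words = 16-byte generic header + 4-byte Service Code.
sample = struct.pack("!HHBBHBB", 1000, 2000, 5, 0, 0, (0 << 1) | 1, 0) \
    + (42).to_bytes(6, "big")
hdr = parse_generic_header(sample)
```

A real receiver would additionally apply the checks described in Section 5.1, such as ignoring packets whose Data Offset is too small for the given Type.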

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /            Generic DCCP Header with X=1 (16 bytes)            /
   /                  with Type=0 (DCCP-Request)                   /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Service Code                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                      Options and Padding                      /
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   /                       Application Data                        /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Service Code: 32 bits
      Describes the application-level service to which the client
      application wants to connect. Service Codes are intended to
      provide information about which application protocol a
      connection intends to use, thus aiding middleboxes and reducing
      reliance on globally well-known ports. See Section 8.1.2.

5.3. DCCP-Response Packets

   The server responds to valid DCCP-Request packets with DCCP-Response
   packets. This is the second phase of the three-way handshake.
   DCCP-Response packets MAY contain application data, and MUST use
   48-bit sequence numbers (X=1).
1130 0 1 2 3 1131 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1132 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1133 / Generic DCCP Header with X=1 (16 bytes) / 1134 / with Type=1 (DCCP-Response) / 1135 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1136 / Acknowledgement Number Subheader (8 bytes) / 1137 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1138 | Service Code | 1139 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1140 / Options and Padding / 1141 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1142 / Application Data / 1143 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1145 Acknowledgement Number: 48 bits 1146 Contains GSR. Since DCCP-Responses are only sent during 1147 connection initiation, this will always equal the Sequence 1148 Number on a received DCCP-Request. 1150 Service Code: 32 bits 1151 MUST equal the Service Code on the corresponding DCCP-Request. 1153 5.4. DCCP-Data, DCCP-Ack, and DCCP-DataAck Packets 1155 The central data transfer portion of every DCCP connection uses 1156 DCCP-Data, DCCP-Ack, and DCCP-DataAck packets. These packets MAY 1157 use 24-bit sequence numbers, depending on the value of the Allow 1158 Short Sequence Numbers feature (Section 7.6.1). DCCP-Data packets 1159 carry application data without acknowledgements. 1161 0 1 2 3 1162 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1163 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1164 / Generic DCCP Header (16 or 12 bytes) / 1165 / with Type=2 (DCCP-Data) / 1166 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1167 / Options and Padding / 1168 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1169 / Application Data / 1170 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1172 DCCP-Ack packets dispense with the data, but contain an 1173 Acknowledgement Number. 
They are used for pure acknowledgements. 1175 0 1 2 3 1176 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1177 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1178 / Generic DCCP Header (16 or 12 bytes) / 1179 / with Type=3 (DCCP-Ack) / 1180 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1181 / Acknowledgement Number Subheader (8 or 4 bytes) / 1182 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1183 / Options and Padding / 1184 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1185 / Application Data Area (Ignored) / 1186 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1188 DCCP-DataAck packets carry both application data and an 1189 Acknowledgement Number: acknowledgement information is piggybacked 1190 on a data packet. 1192 0 1 2 3 1193 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1194 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1195 / Generic DCCP Header (16 or 12 bytes) / 1196 / with Type=4 (DCCP-DataAck) / 1197 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1198 / Acknowledgement Number Subheader (8 or 4 bytes) / 1199 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1200 / Options and Padding / 1201 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1202 / Application Data / 1203 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1205 A DCCP-Data or DCCP-DataAck packet may have a zero-length 1206 application data area, which indicates that the application sent a 1207 zero-length datagram. This differs from DCCP-Request and DCCP- 1208 Response packets, where an empty application data area indicates the 1209 absence of application data (not the presence of zero-length 1210 application data). The API SHOULD report any received zero-length 1211 datagrams to the receiving application. 
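
The header sizes given in the diagrams of Sections 5.2 through 5.4 can be summarized in a short Python sketch. It computes the number of bytes that precede the options for the five packet types described so far, from the Type and X values. The function names fixed_header_len and min_data_offset are hypothetical, and the sketch assumes only the sizes stated in this document: a 16- or 12-byte generic header, an 8- or 4-byte Acknowledgement Number Subheader, and a 4-byte Service Code.

```python
# Packet type numbers from Table 1 (Section 5.1).
REQUEST, RESPONSE, DATA, ACK, DATAACK = 0, 1, 2, 3, 4


def fixed_header_len(ptype: int, x: int) -> int:
    """Bytes of fixed header before any options (hypothetical helper)."""
    if ptype not in (REQUEST, RESPONSE, DATA, ACK, DATAACK):
        raise ValueError("packet type not covered by this sketch")
    if ptype in (REQUEST, RESPONSE) and x != 1:
        # DCCP-Request and DCCP-Response MUST use X=1.
        raise ValueError("this packet type requires X=1")
    generic = 16 if x else 12
    # DCCP-Request and DCCP-Data carry no Acknowledgement Number Subheader.
    ack = 0 if ptype in (REQUEST, DATA) else (8 if x else 4)
    # Only DCCP-Request and DCCP-Response carry a Service Code.
    service = 4 if ptype in (REQUEST, RESPONSE) else 0
    return generic + ack + service


def min_data_offset(ptype: int, x: int) -> int:
    """Smallest valid Data Offset, in 32-bit words, for this Type and X."""
    return fixed_header_len(ptype, x) // 4
```

For example, a DCCP-Ack with X=0 has a 16-byte fixed header, so a receiver applying the Data Offset rule of Section 5.1 would ignore such a packet if its Data Offset were less than 4.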
1213 A DCCP-Ack packet MAY have a non-zero-length application data area, 1214 which essentially pads the DCCP-Ack to a desired length. Receivers 1215 MUST ignore the content of the application data area in DCCP-Ack 1216 packets. 1218 DCCP-Ack and DCCP-DataAck packets often include additional 1219 acknowledgement options, such as Ack Vector, as required by the 1220 congestion control mechanism in use. 1222 5.5. DCCP-CloseReq and DCCP-Close Packets 1224 DCCP-CloseReq and DCCP-Close packets begin the handshake that 1225 normally terminates a connection. Either client or server may send 1226 a DCCP-Close packet, which will elicit a DCCP-Reset packet. Only 1227 the server can send a DCCP-CloseReq packet, which indicates that the 1228 server wants to close the connection, but does not want to hold its 1229 TIMEWAIT state. Both packet types MUST use 48-bit sequence numbers 1230 (X=1). 1232 0 1 2 3 1233 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1234 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1235 / Generic DCCP Header with X=1 (16 bytes) / 1236 / with Type=5 (DCCP-CloseReq) or 6 (DCCP-Close) / 1237 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1238 / Acknowledgement Number Subheader (8 bytes) / 1239 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1240 / Options and Padding / 1241 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1242 / Application Data Area (Ignored) / 1243 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1245 As with DCCP-Ack packets, DCCP-CloseReq and DCCP-Close packets MAY 1246 have non-zero-length application data areas, whose contents 1247 receivers MUST ignore. 1249 5.6. DCCP-Reset Packets 1251 DCCP-Reset packets unconditionally shut down a connection. 
1252 Connections normally terminate with a DCCP-Reset, but resets may be 1253 sent for other reasons, including bad port numbers, bad option 1254 behavior, incorrect ECN Nonce Echoes, and so forth. DCCP-Resets 1255 MUST use 48-bit sequence numbers (X=1). 1257 0 1 2 3 1258 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1259 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1260 / Generic DCCP Header with X=1 (16 bytes) / 1261 / with Type=7 (DCCP-Reset) / 1262 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1263 / Acknowledgement Number Subheader (8 bytes) / 1264 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1265 | Reset Code | Data 1 | Data 2 | Data 3 | 1266 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1267 / Options and Padding / 1268 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1269 / Application Data Area (Error Text) / 1270 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1272 Reset Code: 8 bits 1273 Represents the reason that the sender reset the DCCP connection. 1275 Data 1, Data 2, and Data 3: 8 bits each 1276 The Data fields provide additional information about why the 1277 sender reset the DCCP connection. The meanings of these fields 1278 depend on the value of Reset Code. 1280 Application Data Area: Error Text 1281 If present, Error Text is a human-readable text string encoded 1282 in Unicode UTF-8, and preferably in English, that describes the 1283 error in more detail. For example, a DCCP-Reset with Reset Code 1284 11, "Aggression Penalty", might contain Error Text such as 1285 "Aggression Penalty: Received 3 bad ECN Nonce Echoes, assuming 1286 misbehavior". 1288 The following Reset Codes are currently defined. Unless otherwise 1289 specified, the Data 1, 2, and 3 fields MUST be set to 0 by the 1290 sender of the DCCP-Reset and ignored by its receiver. 
Section 1291 references describe concrete situations that will cause each Reset 1292 Code to be generated; they are not meant to be exhaustive. 1294 0, "Unspecified" 1295 Indicates the absence of a meaningful Reset Code. Use of Reset 1296 Code 0 is NOT RECOMMENDED: the sender should choose a Reset Code 1297 that more clearly defines why the connection is being reset. 1299 1, "Closed" 1300 Normal connection close. See Section 8.3. 1302 2, "Aborted" 1303 The sending endpoint gave up on the connection because of lack 1304 of progress. See Sections 8.1.1 and 8.1.5. 1306 3, "No Connection" 1307 No connection exists. See Section 8.3.1. 1309 4, "Packet Error" 1310 A valid packet arrived with unexpected type. For example, a 1311 DCCP-Data packet with valid header checksum and sequence numbers 1312 arrived at a connection in the REQUEST state. See Section 1313 8.3.1. The Data 1 field equals the offending packet type as an 1314 eight-bit number; thus, an offending packet with Type 2 will 1315 result in a Data 1 value of 2. 1317 5, "Option Error" 1318 An option was erroneous, and the error was serious enough to 1319 warrant resetting the connection. See Sections 6.6.7, 6.6.8, 1320 and 11.4. The Data 1 field equals the offending option type; 1321 Data 2 and Data 3 equal the first two bytes of option data (or 1322 zero if the option had less than two bytes of data). 1324 6, "Mandatory Error" 1325 The sending endpoint could not process an option O that was 1326 immediately preceded by Mandatory. The Data fields report the 1327 option type and data of option O, using the format of Reset Code 1328 5, "Option Error". See Section 5.8.2. 1330 7, "Connection Refused" 1331 The Destination Port didn't correspond to a port open for 1332 listening. Sent only in response to DCCP-Requests. See Section 1333 8.1.3. 1335 8, "Bad Service Code" 1336 The Service Code didn't equal the service code attached to the 1337 Destination Port. Sent only in response to DCCP-Requests. 
See
1338       Section 8.1.3.

1340    9, "Too Busy"
1341       The server is too busy to accept new connections. Sent only in
1342       response to DCCP-Requests. See Section 8.1.3.

1344    10, "Bad Init Cookie"
1345       The Init Cookie echoed by the client was incorrect or missing.
1346       See Section 8.1.4.

1348    11, "Aggression Penalty"
1349       This endpoint has detected congestion control-related
1350       misbehavior on the part of the other endpoint. See Section
1351       12.3.

1353    12-127, Reserved
1354       Receivers should treat these codes like Reset Code 0,
1355       "Unspecified".

1357    128-255, CCID-specific codes
1358       Semantics depend on the connection's CCIDs. See Section 10.3.
1359       Receivers should treat unknown CCID-specific Reset Codes like
1360       Reset Code 0, "Unspecified".

1362    The following table summarizes this information.

1364       Reset
1365       Code     Name                  Data 1      Data 2 & 3
1366       -----    ----                  ------      ----------
1367       0        Unspecified           0           0
1368       1        Closed                0           0
1369       2        Aborted               0           0
1370       3        No Connection         0           0
1371       4        Packet Error          pkt type    0
1372       5        Option Error          option #    option data
1373       6        Mandatory Error       option #    option data
1374       7        Connection Refused    0           0
1375       8        Bad Service Code      0           0
1376       9        Too Busy              0           0
1377       10       Bad Init Cookie       0           0
1378       11       Aggression Penalty    0           0
1379       12-127   Reserved
1380       128-255  CCID-specific codes

1382                    Table 2: DCCP Reset Codes

1384    Options on DCCP-Reset packets are processed before the connection is
1385    shut down. This means that certain combinations of options,
1386    particularly involving Mandatory, may cause an endpoint to respond
1387    to a valid DCCP-Reset with another DCCP-Reset. This cannot lead to
1388    a reset storm; since the first endpoint has already reset the
1389    connection, the second DCCP-Reset will be ignored.

1391 5.7. DCCP-Sync and DCCP-SyncAck Packets

1393    DCCP-Sync packets help DCCP endpoints recover synchronization after
1394    bursts of loss, or recover from half-open connections. Each valid
1395    received DCCP-Sync immediately elicits a DCCP-SyncAck.
Both packet 1396 types MUST use 48-bit sequence numbers (X=1). 1398 0 1 2 3 1399 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1400 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1401 / Generic DCCP Header with X=1 (16 bytes) / 1402 / with Type=8 (DCCP-Sync) or 9 (DCCP-SyncAck) / 1403 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1404 / Acknowledgement Number Subheader (8 bytes) / 1405 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1406 / Options and Padding / 1407 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1408 / Application Data Area (Ignored) / 1409 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1411 The Acknowledgement Number field has special semantics for DCCP-Sync 1412 and DCCP-SyncAck packets. First, the packet corresponding to a 1413 DCCP-Sync's Acknowledgement Number need not have been 1414 acknowledgeable. Thus, receivers MUST NOT assume that a packet was 1415 processed simply because it appears in the Acknowledgement Number 1416 field of a DCCP-Sync packet. This differs from all other packet 1417 types, where the Acknowledgement Number by definition corresponds to 1418 an acknowledgeable packet. Second, the Acknowledgement Number on 1419 any DCCP-SyncAck packet MUST correspond to the Sequence Number on an 1420 acknowledgeable DCCP-Sync packet. In the presence of reordering, 1421 this might not equal GSR. 1423 As with DCCP-Ack packets, DCCP-Sync and DCCP-SyncAck packets MAY 1424 have non-zero-length application data areas, whose contents 1425 receivers MUST ignore. Padded DCCP-Sync packets may be useful when 1426 performing Path MTU discovery; see Section 14. 1428 5.8. Options 1430 Any DCCP packet may contain options, which occupy space at the end 1431 of the DCCP header. Each option is a multiple of 8 bits in length. 1432 Individual options are not padded to multiples of 32 bits, and any 1433 option may begin on any byte boundary. 
However, the combination of
1434    all options MUST add up to a multiple of 32 bits; Padding options
1435    MUST be added as necessary to fill out option space to a word
1436    boundary. Any options present are included in the header checksum.

1438    The first byte of an option is the option type. Options with types
1439    0 through 31 are single-byte options. Other options are followed by
1440    a byte indicating the option's length. This length value includes
1441    the two bytes of option-type and option-length as well as any
1442    option-data bytes, and must therefore be greater than or equal to
1443    two.

1445    Options are processed sequentially, starting at the first option in
1446    the packet header. Options with unknown types MUST be ignored.
1447    Also, options with nonsensical lengths (length byte less than two or
1448    more than the remaining space in the options portion of the header)
1449    MUST be ignored, and any option space following an option with
1450    nonsensical length MUST likewise be ignored.

1452    The following options are currently defined:

1454                 Option                              DCCP-  Section
1455       Type      Length    Meaning                   Data?  Reference
1456       ----      ------    -------                   -----  ---------
1457       0         1         Padding                   Y      5.8.1
1458       1         1         Mandatory                 N      5.8.2
1459       2         1         Slow Receiver             Y      11.6
1460       3-31      1         Reserved
1461       32        variable  Change L                  N      6.1
1462       33        variable  Confirm L                 N      6.2
1463       34        variable  Change R                  N      6.1
1464       35        variable  Confirm R                 N      6.2
1465       36        variable  Init Cookie               N      8.1.4
1466       37        3-5       NDP Count                 Y      7.7
1467       38        variable  Ack Vector [Nonce 0]      N      11.4
1468       39        variable  Ack Vector [Nonce 1]      N      11.4
1469       40        variable  Data Dropped              N      11.7
1470       41        6         Timestamp                 Y      13.1
1471       42        6/8/10    Timestamp Echo            Y      13.3
1472       43        4/6       Elapsed Time              N      13.2
1473       44        6         Data Checksum             Y      9.3
1474       45-127    variable  Reserved
1475       128-255   variable  CCID-specific options     -      10.3

1477                        Table 3: DCCP Options

1479    Not all options are suitable for all packet types.
For example, 1480 since the Ack Vector option is interpreted relative to the 1481 Acknowledgement Number, it isn't suitable on DCCP-Request and DCCP- 1482 Data packets, which have no Acknowledgement Number. If an option 1483 occurs on an unexpected packet type, it MUST generally be ignored; 1484 any such restrictions are mentioned in each option's description. 1485 The table summarizes the most common restriction: when the DCCP- 1486 Data? column value is N, the corresponding option MUST be ignored 1487 when received on a DCCP-Data packet. (Section 7.5.5 describes why 1488 such options are ignored as opposed to, say, causing a reset.) 1490 Options with invalid values MUST be ignored unless otherwise 1491 specified. For example, any Data Checksum option with option length 1492 4 MUST be ignored, since all valid Data Checksum options have option 1493 length 6. 1495 This section describes two generic options, Padding and Mandatory. 1496 Other options are described later. 1498 5.8.1. Padding Option 1499 +--------+ 1500 |00000000| 1501 +--------+ 1502 Type=0 1504 Padding is a single-byte "no-operation" option used to pad between 1505 or after options. If the length of a packet's other options is not 1506 a multiple of 32 bits, then Padding options are REQUIRED to pad out 1507 the options area to the length implied by Data Offset. Padding may 1508 also be used between options -- for example, to align the beginning 1509 of a subsequent option on a 32-bit boundary. There is no guarantee 1510 that senders will use this option, so receivers must be prepared to 1511 process options even if they do not begin on a word boundary. 1513 5.8.2. Mandatory Option 1515 +--------+ 1516 |00000001| 1517 +--------+ 1518 Type=1 1520 Mandatory is a single-byte option that marks the immediately 1521 following option as mandatory. Say that the immediately following 1522 option is O. Then the Mandatory option has no effect if the 1523 receiving DCCP endpoint understands and processes O. 
If the 1524 endpoint does not understand or process O, however, then it MUST 1525 reset the connection using Reset Code 6, "Mandatory Error". For 1526 instance, the endpoint would reset the connection if it did not 1527 understand O's type; if it understood O's type, but not O's data; if 1528 O's data was invalid for O's type; if O was a feature negotiation 1529 option, and the endpoint did not understand the enclosed feature 1530 number; if the endpoint understood O, but chose not to perform the 1531 action O implies; and so forth. 1533 Mandatory options MUST NOT be sent on DCCP-Data packets, and any 1534 Mandatory options received on DCCP-Data packets MUST be ignored. 1536 The connection is in error and should be reset with Reset Code 5, 1537 "Option Error", if option O is absent (Mandatory was the last byte of 1538 the option list), or if option O equals Mandatory. However, the 1539 combination "Mandatory Padding" is valid, and MUST behave like two 1540 bytes of Padding. 1542 Section 6.6.9 describes the behavior of Mandatory feature 1543 negotiation options in more detail. 1545 6. Feature Negotiation 1547 Four DCCP options, Change L, Confirm L, Change R, and Confirm R, are 1548 used to negotiate feature values. Change options initiate a 1549 negotiation; Confirm options complete that negotiation. The "L" 1550 options are sent by the feature location, and the "R" options are 1551 sent by the feature remote. Change options are retransmitted to 1552 ensure reliability. 1554 All these options have the same format. The first byte of option 1555 data is the feature number, and the second and subsequent data bytes 1556 hold one or more feature values. The exact format of the feature 1557 value area depends on the feature type; see Section 6.3. 1559 +--------+--------+--------+--------+-------- 1560 | Type | Length |Feature#| Value(s) ...
1561 +--------+--------+--------+--------+-------- 1563 Together, the feature number and the option type ("L" or "R") 1564 uniquely identify the feature to which an option applies. The exact 1565 format of the Value(s) area depends on the feature number. 1567 Feature negotiation options MUST NOT be sent on DCCP-Data packets, 1568 and any feature negotiation options received on DCCP-Data packets 1569 MUST be ignored. 1571 6.1. Change Options 1573 Change L and Change R options initiate feature negotiation. The 1574 option to use depends on the relevant feature's location: To start a 1575 negotiation for feature F/A, DCCP A will send a Change L option; to 1576 start a negotiation for F/B, it will send a Change R option. Change 1577 options are retransmitted until some response is received. They 1578 contain at least one Value, and thus have length at least 4. 1580 +--------+--------+--------+--------+-------- 1581 Change L: |00100000| Length |Feature#| Value(s) ... 1582 +--------+--------+--------+--------+-------- 1583 Type=32 1585 +--------+--------+--------+--------+-------- 1586 Change R: |00100010| Length |Feature#| Value(s) ... 1587 +--------+--------+--------+--------+-------- 1588 Type=34 1590 6.2. Confirm Options 1592 Confirm L and Confirm R options complete feature negotiation, and 1593 are sent in response to Change R and Change L options, respectively. 1594 Confirm options MUST NOT be generated except in response to Change 1595 options. Confirm options need not be retransmitted, since Change 1596 options are retransmitted as necessary. The first byte of the 1597 Confirm option contains the feature number from the corresponding 1598 Change. Following this is the selected Value, and then possibly the 1599 sender's preference list. 1601 +--------+--------+--------+--------+-------- 1602 Confirm L: |00100001| Length |Feature#| Value(s) ... 
1603 +--------+--------+--------+--------+-------- 1604 Type=33 1606 +--------+--------+--------+--------+-------- 1607 Confirm R: |00100011| Length |Feature#| Value(s) ... 1608 +--------+--------+--------+--------+-------- 1609 Type=35 1611 If an endpoint receives an invalid Change option -- with an unknown 1612 feature number, or an invalid value -- it will respond with an empty 1613 Confirm option containing the problematic feature number, but no 1614 value. Such options have length 3. 1616 6.3. Reconciliation Rules 1618 Reconciliation rules determine how the two sets of preferences for a 1619 given feature are resolved into a unique result. The reconciliation 1620 rule depends only on the feature number. Each reconciliation rule 1621 must have the property that the result is uniquely determined given 1622 the contents of Change options sent by the two endpoints. 1624 All current DCCP features use one of two reconciliation rules, 1625 server-priority ("SP") and non-negotiable ("NN"). 1627 6.3.1. Server-Priority 1629 The feature value is a fixed-length byte string (length determined 1630 by the feature number). Each Change option contains a list of 1631 values ordered by preference, with the most preferred value coming 1632 first. Each Confirm option contains the confirmed value, followed 1633 by the confirmer's preference list. Thus, the feature's current 1634 value will generally appear twice in Confirm options' data, once as 1635 the current value and once in the confirmer's preference list. 1637 To reconcile the preference lists, select the first entry in the 1638 server's list that also occurs in the client's list. If there is no 1639 shared entry, the feature's value MUST NOT change, and the Confirm 1640 option will confirm the feature's previous value (unless the Change 1641 option was Mandatory; see Section 6.6.9). 1643 6.3.2. Non-Negotiable 1645 The feature value is a byte string. Each option contains exactly 1646 one feature value. 
The feature location signals a new value by
1647    sending a Change L option. The feature remote MUST accept any valid
1648    value, responding with a Confirm R option containing the new value,
1649    and it MUST send empty Confirm R options in response to invalid
1650    values (unless the Change L option was Mandatory; see Section
1651    6.6.9). Change R and Confirm L options MUST NOT be sent for non-
1652    negotiable features; see Section 6.6.8. Non-negotiable features use
1653    the feature negotiation mechanism to achieve reliability.

1655 6.4. Feature Numbers

1657    This document defines the following feature numbers.

1659                                              Rec'n  Initial         Section
1660       Number   Meaning                       Rule   Value    Req'd  Reference
1661       ------   -------                       -----  -------  -----  ---------
1662       0        Reserved
1663       1        Congestion Control ID (CCID)  SP     2        Y      10
1664       2        Allow Short Seqnos            SP     1        Y      7.6.1
1665       3        Sequence Window               NN     100      Y      7.5.2
1666       4        ECN Incapable                 SP     0        N      12.1
1667       5        Ack Ratio                     NN     2        N      11.3
1668       6        Send Ack Vector               SP     0        N      11.5
1669       7        Send NDP Count                SP     0        N      7.7.2
1670       8        Minimum Checksum Coverage     SP     0        N      9.2.1
1671       9        Check Data Checksum           SP     0        N      9.3.1
1672       10-127   Reserved
1673       128-255  CCID-specific features                               10.3

1675                     Table 4: DCCP Feature Numbers

1677    Rec'n Rule     The reconciliation rule used for the feature. SP is
1678                   server-priority and NN is non-negotiable.

1680    Initial Value  The initial value for the feature. Every feature has
1681                   a known initial value.

1683    Req'd          This column is "Y" if and only if every DCCP
1684                   implementation MUST understand the feature. If it is
1685                   "N", then the feature behaves like an extension (see
1686                   Section 15), and it is safe to respond to Change
1687                   options for the feature with empty Confirm options.
1688                   Of course, a CCID might require the feature; a DCCP
1689                   that implements CCID 2 MUST support Ack Ratio and
1690                   Send Ack Vector, for example.

1692 6.5.
Examples
1693    Here are three example feature negotiations for features located at
1694    the server, the first two for the Congestion Control ID feature, the
1695    last for the Ack Ratio.

1697          Client                                      Server
1698          ------                                      ------
1699       1. Change R(CCID, 2 3 1)          -->
1700          ("2 3 1" is client's preference list)
1701       2.                                <--    Confirm L(CCID, 3, 3 2 1)
1702                                                (3 is the negotiated value;
1703                                                "3 2 1" is server's pref list)
1704                  * agreement that CCID/Server = 3 *

1706       1.                   XXX          <--    Change L(CCID, 3 2 1)
1707       2.                                Retransmission:
1708                                         <--    Change L(CCID, 3 2 1)
1709       3. Confirm R(CCID, 3, 2 3 1)      -->
1710                  * agreement that CCID/Server = 3 *

1712       1.                                <--    Change L(Ack Ratio, 3)
1713       2. Confirm R(Ack Ratio, 3)        -->
1714                  * agreement that Ack Ratio/Server = 3 *

1716    This example shows a simultaneous negotiation.

1718          Client                                      Server
1719          ------                                      ------
1720      1a. Change R(CCID, 2 3 1)          -->
1721       b.                                <--    Change L(CCID, 3 2 1)
1722      2a.                                <--    Confirm L(CCID, 3, 3 2 1)
1723       b. Confirm R(CCID, 3, 2 3 1)      -->
1724                  * agreement that CCID/Server = 3 *

1726    Here are the byte encodings of several Change and Confirm options.
1727    Each option is sent by DCCP A.

1729    Change L(CCID, 2 3) = 32,5,1,2,3
1730       DCCP B should change CCID/A's value (feature number 1, a server-
1731       priority feature); DCCP A's preferred values are 2 and 3, in
1732       that preference order.

1734    Change L(Sequence Window, 1024) = 32,9,3,0,0,0,0,4,0
1735       DCCP B should change Sequence Window/A's value (feature number
1736       3, a non-negotiable feature) to the 6-byte string 0,0,0,0,4,0
1737       (the value 1024).

1739    Confirm L(CCID, 2, 2 3) = 33,6,1,2,2,3
1740       DCCP A has changed CCID/A's value to 2; its preferred values are
1741       2 and 3, in that preference order.

1743    Empty Confirm L(126) = 33,3,126
1744       DCCP A doesn't implement feature number 126, or DCCP B's
1745       proposed value for feature 126/A was invalid.

1747    Change R(CCID, 3 2) = 34,5,1,3,2
1748       DCCP B should change CCID/B's value; DCCP A's preferred values
1749       are 3 and 2, in that preference order.
1751 Confirm R(CCID, 2, 3 2) = 35,6,1,2,3,2 1752 DCCP A has changed CCID/B's value to 2; its preferred values 1753 were 3 and 2, in that preference order. 1755 Confirm R(Sequence Window, 1024) = 35,9,3,0,0,0,0,4,0 1756 DCCP A has changed Sequence Window/B's value to the 6-byte 1757 string 0,0,0,0,4,0 (the value 1024). 1759 Empty Confirm R(126) = 35,3,126 1760 DCCP A doesn't implement feature number 126, or DCCP B's 1761 proposed value for feature 126/B was invalid. 1763 6.6. Option Exchange 1765 A few basic rules govern feature negotiation option exchange. 1767 1. Every non-reordered Change option gets a Confirm option in 1768 response. 1770 2. Change options are retransmitted until a response for the latest 1771 Change is received. 1773 3. Feature negotiation options are processed in strictly increasing 1774 order by Sequence Number. 1776 The rest of this section describes the consequences of these rules 1777 in more detail. 1779 6.6.1. Normal Exchange 1781 Change options are generated when a DCCP endpoint wants to change 1782 the value of some feature. Generally, this will happen at the 1783 beginning of a connection, although it may happen at any time. We 1784 say the endpoint "generates" or "sends" a Change L or Change R 1785 option, but of course the option must be attached to a packet. The 1786 endpoint may attach the option to a packet it would have generated 1787 anyway (such as a DCCP-Request), or it may create a "feature 1788 negotiation packet", often a DCCP-Ack or DCCP-Sync, just to carry 1789 the option. Feature negotiation packets are controlled by the 1790 relevant congestion control mechanism. For example, DCCP A may send 1791 a DCCP-Ack or DCCP-Sync for feature negotiation only if the B-to-A 1792 CCID would allow sending a DCCP-Ack. In addition, an endpoint 1793 SHOULD generate at most one feature negotiation packet per round- 1794 trip time. 
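The byte encodings listed in Section 6.5 follow directly from the
common option format (type, length, feature number, value bytes). The
following Python sketch reproduces several of them. It is illustrative
only: the helper names encode_feature_option and sequence_window_value
are hypothetical and not part of DCCP, and the sketch covers encoding
only, not attaching the option to a packet or padding the option area.

```python
# Option type numbers taken from Table 3.
CHANGE_L, CONFIRM_L, CHANGE_R, CONFIRM_R = 32, 33, 34, 35


def encode_feature_option(opt_type, feature, value=b""):
    """Return the option bytes: type, length, feature number, value(s).

    Hypothetical helper for illustration; not defined by the draft.
    """
    if opt_type not in (CHANGE_L, CONFIRM_L, CHANGE_R, CONFIRM_R):
        raise ValueError("not a feature negotiation option type")
    if not 0 <= feature <= 255:
        raise ValueError("feature number must fit in one byte")
    value = bytes(value)
    if opt_type in (CHANGE_L, CHANGE_R) and not value:
        # Section 6.1: Change options contain at least one Value.
        raise ValueError("Change options need at least one value")
    # The length counts the type, length, and feature number bytes too.
    length = 3 + len(value)
    if length > 255:
        raise ValueError("option does not fit a one-byte length field")
    return bytes([opt_type, length, feature]) + value


def sequence_window_value(n):
    """Encode n as the 6-byte string used for Sequence Window here."""
    return n.to_bytes(6, "big")


if __name__ == "__main__":
    # Change L(CCID, 2 3)
    print(list(encode_feature_option(CHANGE_L, 1, [2, 3])))
    # Change L(Sequence Window, 1024)
    print(list(encode_feature_option(CHANGE_L, 3, sequence_window_value(1024))))
    # Empty Confirm L(126)
    print(list(encode_feature_option(CONFIRM_L, 126)))
```

An empty Confirm is simply a Confirm with no value bytes, which is why
its length is 3.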
1796 On receiving a Change L or Change R option, a DCCP endpoint examines 1797 the included preference list, reconciles that with its own 1798 preference list, calculates the new value, and sends back a 1799 Confirm R or Confirm L option, respectively, informing its peer of 1800 the new value or that the feature was not understood. Every non- 1801 reordered Change option MUST result in a corresponding Confirm 1802 option, and any packet including a Confirm option MUST carry an 1803 Acknowledgement Number. (Section 6.6.4 describes how Change 1804 reordering is detected and handled.) Generated Confirm options may 1805 be attached to packets that would have been sent anyway (such as 1806 DCCP-Response or DCCP-SyncAck), or to new feature negotiation 1807 packets, as described above. 1809 The Change-sending endpoint MUST wait to receive a corresponding 1810 Confirm option before changing its stored feature value. The 1811 Confirm-sending endpoint changes its stored feature value as soon as 1812 it sends the Confirm. 1814 A packet MAY contain more than one feature negotiation option, as 1815 long as no two options refer to the same feature. Note, however, 1816 that a packet is allowed to contain one L option and one R option 1817 with the same feature number, since the two options actually refer 1818 to different features (F/A and F/B). 1820 6.6.2. Processing Received Options 1822 DCCP endpoints exist in one of three states relative to each 1823 feature. STABLE is the normal state, where the endpoint knows the 1824 feature's value and thinks the other endpoint agrees. An endpoint 1825 enters the CHANGING state when it first sends a Change for the 1826 feature, and returns to STABLE once it receives a corresponding 1827 Confirm. The final state, UNSTABLE, indicates that an endpoint in 1828 CHANGING state changed its preference list, but has not yet 1829 transmitted a Change option with the new preference list. 
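As a rough aid to reading the three states, the transitions named in
the paragraph above can be written as a small lookup table. This is a
sketch under stated assumptions, not the specified algorithm: the event
names (send_change, recv_confirm, pref_list_changed) are hypothetical
labels, and sequence number checks, received Change options, and
option validity are deliberately left out.

```python
from enum import Enum


class FeatureState(Enum):
    STABLE = "STABLE"
    CHANGING = "CHANGING"
    UNSTABLE = "UNSTABLE"


# Event names are hypothetical labels for the situations in the text.
TRANSITIONS = {
    # First Change sent for the feature.
    (FeatureState.STABLE, "send_change"): FeatureState.CHANGING,
    # Corresponding Confirm received.
    (FeatureState.CHANGING, "recv_confirm"): FeatureState.STABLE,
    # Preference list changed while a negotiation is outstanding.
    (FeatureState.CHANGING, "pref_list_changed"): FeatureState.UNSTABLE,
    # Change carrying the new preference list has been transmitted.
    (FeatureState.UNSTABLE, "send_change"): FeatureState.CHANGING,
}


def next_state(state, event):
    """Return the state after event; unlisted pairs leave it unchanged."""
    return TRANSITIONS.get((state, event), state)


if __name__ == "__main__":
    state = FeatureState.STABLE
    for event in ("send_change", "pref_list_changed", "send_change",
                  "recv_confirm"):
        state = next_state(state, event)
        print(event, "->", state.name)
```

Treating every unlisted pair as "no change" keeps the table short; for
example, a retransmitted Change in the CHANGING state leaves the
endpoint in CHANGING.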
1831    Feature state transitions at a feature location are implemented
1832    according to this diagram. The diagram ignores sequence number and
1833    option validity issues; these are handled explicitly in the
1834    pseudocode that follows.

1836                                  timeout/
1837    rcv Confirm R      app/protocol evt : snd Change L      rcv non-ack
1838    : ignore      +---------------------------------------+ : snd Change L
1839         +----+   |                                       |   +----+
1840         |    v   |                   rcv Change R        v   |    v
1841      +------------+  rcv Confirm R   : calc new value, +------------+
1842      |            |  : accept value  snd Confirm L     |            |
1843      |   STABLE   |<-----------------------------------|  CHANGING  |
1844      |            |        rcv empty Confirm R         |            |
1845      +------------+        : revert to old value       +------------+
1846          |    ^                                            |    ^
1847          +----+                                  pref list |    | snd
1848      rcv Change R                                  changes |    | Change L
1849      : calc new value, snd Confirm L                       v    |
1850                                                        +------------+
1851                                                    +---|            |
1852                               rcv Confirm/Change R |   |  UNSTABLE  |
1853                               : ignore             +-->|            |
1854                                                        +------------+

1856    Feature locations SHOULD use the following pseudocode, which
1857    corresponds to the state diagram, to react to each feature
1858    negotiation option on each valid packet received. The pseudocode
1859    refers to "P.seqno" and "P.ackno", which are properties of the
1860    packet; "O.type", and "O.len", which are properties of the option;
1861    "FGSR" and "FGSS", which are properties of the connection, and
1862    handle reordering as described in Section 6.6.4; "F.state", which is
1863    the feature's state (STABLE, CHANGING, or UNSTABLE); and "F.value",
1864    which is the feature's value.
1866       First, check for unknown features (Section 6.6.7);
1867       If F is unknown,
1868          If the option was Mandatory,   /* Section 6.6.9 */
1869             Reset connection and return
1870          Otherwise, if O.type == Change R,
1871             Send Empty Confirm L on a future packet

1873          Return

1875       Second, check for reordering (Section 6.6.4);
1876       If F.state == UNSTABLE or P.seqno <= FGSR
1877             or (O.type == Confirm R and P.ackno < FGSS),
1878          Ignore option and return

1880       Third, process Change R options;
1881       If O.type == Change R,
1882          If the option's value is valid,   /* Section 6.6.8 */
1883             Calculate new value
1884             Send Confirm L on a future packet
1885             Set F.state := STABLE
1886          Otherwise, if the option was Mandatory,
1887             Reset connection and return
1888          Otherwise,
1889             Send Empty Confirm L on a future packet
1890             /* Remain in existing state. If that's CHANGING, this
1891                endpoint will retransmit its Change L option later. */

1893       Fourth, process Confirm R options (but only in CHANGING state).
1894       If F.state == CHANGING and O.type == Confirm R,
1895          If O.len > 3,   /* nonempty */
1896             If the option's value is valid,
1897                Set F.value := new value
1898             Otherwise,
1899                Reset connection and return
1900          Set F.state := STABLE

1902    Versions of this diagram and pseudocode are also used by feature
1903    remotes; simply switch the "L"s and "R"s, so that the relevant
1904    options are Change R and Confirm L.

1906 6.6.3. Loss and Retransmission

1908    Packets containing Change and Confirm options might be lost or
1909    delayed by the network. Therefore, Change options are repeatedly
1910    transmitted to achieve reliability. We refer to this as
1911    "retransmission", although of course there are no packet-level
1912    retransmissions in DCCP: a Change option that is sent again will be
1913    sent on a new packet with a new sequence number.

1915    A CHANGING endpoint transmits another Change option once it realizes
1916    that it has not heard back from the other endpoint.
The new Change 1917 option need not contain the same payload as the original; reordering 1918 protection will ensure that agreement is reached based on the most 1919 recently transmitted option. 1921 A CHANGING endpoint MUST continue retransmitting Change options 1922 until it gets some response or the connection terminates. 1924 Endpoints SHOULD use an exponential-backoff timer to decide when to 1925 retransmit Change options. (Endpoints that generate packets 1926 specifically for feature negotiation MUST use such a timer.) The 1927 timer interval is initially set to not less than one round-trip 1928 time, and should back off to not less than 64 seconds. The backoff 1929 protects against delayed agreement due to the reordering protection 1930 algorithms described in the next section. Again, endpoints may 1931 piggyback Change options on packets they would have sent anyway, or 1932 create new packets to carry the options; any such new packets are 1933 controlled by the relevant congestion-control mechanism. 1935 Confirm options are never retransmitted, but the Confirm-sending 1936 endpoint MUST generate a Confirm option after every non-reordered 1937 Change. 1939 6.6.4. Reordering 1941 Reordering might cause packets containing Change and Confirm options 1942 to arrive in an unexpected order. Endpoints MUST ignore feature 1943 negotiation options that do not arrive in strictly-increasing order 1944 by Sequence Number. The rest of this section presents two 1945 algorithms that fulfill this requirement. 1947 The first algorithm introduces two sequence number variables that 1948 each endpoint maintains for the connection. 1950 FGSR Feature Greatest Sequence Number Received: The greatest 1951 sequence number received, considering only valid packets 1952 that contained one or more feature negotiation options 1953 (Change and/or Confirm). This value is initialized to 1954 ISR - 1. 
1956 FGSS Feature Greatest Sequence Number Sent: The greatest 1957 sequence number sent, considering only packets that 1958 contained one or more non-retransmitted Change options. 1959 (Retransmitted Change options MUST have exactly the same 1960 contents as previously transmitted options, so limited 1961 reordering can safely be tolerated.) This value is 1962 initialized to ISS. 1964 Each endpoint checks two conditions on sequence numbers to decide 1965 whether to process received feature negotiation options. 1967 1. If a packet's Sequence Number is less than or equal to FGSR, 1968 then its Change options MUST be ignored. 1970 2. If a packet's Sequence Number is less than or equal to FGSR, OR 1971 it has no Acknowledgement Number, OR its Acknowledgement Number 1972 is less than FGSS, then its Confirm options MUST be ignored. 1974 Alternatively, an endpoint MAY maintain separate FGSR and FGSS 1975 values for every feature. FGSR(F/X) would equal the greatest 1976 sequence number received, considering only packets that contained 1977 Change or Confirm options applying to feature F/X; FGSS(F/X) would 1978 be defined similarly. This algorithm requires more state, but is 1979 slightly more forgiving to multiple overlapped feature negotiations. 1980 Either algorithm MAY be used; the first algorithm, with connection- 1981 wide FGSR and FGSS variables, is RECOMMENDED. 1983 One consequence of these rules is that a CHANGING endpoint will 1984 ignore any Confirm option that does not acknowledge the latest 1985 Change option sent. This ensures that agreement, once achieved, 1986 used the most recent available information about the endpoints' 1987 preferences. 1989 6.6.5. Preference Changes 1991 Endpoints are allowed to change their preference lists at any time. 1992 However, an endpoint that changes its preference list while in the 1993 CHANGING state MUST transition to the UNSTABLE state. 
It will 1994 transition back to CHANGING once it has transmitted a Change option 1995 with the new preference list. This ensures that agreement is based 1996 on active preference lists. Without the UNSTABLE state, 1997 simultaneous negotiation -- where the endpoints began independent 1998 negotiations for the same feature at the same time -- might lead to 1999 the negotiation terminating with the endpoints thinking the feature 2000 had different values. 2002 6.6.6. Simultaneous Negotiation 2004 The two endpoints might simultaneously open negotiation for the same 2005 feature, after which an endpoint in the CHANGING state will receive 2006 a Change option for the same feature. Such received Change options 2007 can act as responses to the original Change options. The CHANGING 2008 endpoint MUST examine the received Change's preference list, 2009 reconcile that with its own preference list (as expressed in its 2010 generated Change options), and generate the corresponding Confirm 2011 option. It can then transition to the STABLE state. 2013 6.6.7. Unknown Features 2015 Endpoints may receive Change options referring to feature numbers 2016 they do not understand -- for instance, when an extended DCCP 2017 converses with a non-extended DCCP. Endpoints MUST respond to 2018 unknown Change options with Empty Confirm options (that is, Confirm 2019 options containing no data), which inform the CHANGING endpoint that 2020 the feature was not understood. However, if the Change option was 2021 Mandatory, the connection MUST be reset; see Section 6.6.9. 2023 On receiving an empty Confirm option for some feature, the CHANGING 2024 endpoint MUST transition back to the STABLE state, leaving the 2025 feature's value unchanged. Section 15 suggests that the default 2026 value for any extension feature should correspond to "extension not 2027 available". 2029 Some features are required to be understood by all DCCPs (see 2030 Section 6.4). 
The CHANGING endpoint SHOULD reset the connection 2031 (with Reset Code 5, "Option Error") if it receives an empty Confirm 2032 option for such a feature. 2034 Since Confirm options are generated only in response to Change 2035 options, an endpoint should never receive a Confirm option referring 2036 to a feature number it does not understand. Nevertheless, endpoints 2037 MUST ignore any such options they receive. 2039 6.6.8. Invalid Options 2041 A DCCP endpoint might receive a Change or Confirm option that lists 2042 one or more values that it does not understand. Some, but not all, 2043 such options are invalid, depending on the relevant reconciliation 2044 rule (Section 6.3). For instance: 2046 o All features have length limitations, and options with invalid 2047 lengths are invalid. For example, the Ack Ratio feature takes 2048 16-bit values, so valid "Confirm R(Ack Ratio)" options have 2049 option length 5. 2051 o Some non-negotiable features have value limitations. The Ack 2052 Ratio feature takes two-byte, non-zero integer values, so a 2053 "Change L(Ack Ratio, 0)" option is never valid. Note that 2054 server-priority features do not have value limitations, since 2055 unknown values are handled as a matter of course. 2057 o Any Confirm option that selects the wrong value, based on the two 2058 preference lists and the relevant reconciliation rule, is 2059 invalid. 2061 o However, unexpected Confirm options -- that refer to unknown 2062 feature numbers, or that don't appear to be part of a current 2063 negotiation -- are considered valid, although they are ignored by 2064 the receiver. 2066 An endpoint receiving an invalid Change option MUST respond with the 2067 corresponding empty Confirm option. An endpoint receiving an 2068 invalid Confirm option MUST reset the connection, with Reset Code 5, 2069 "Option Error". 2071 6.6.9. Mandatory Feature Negotiation 2073 Change options may be preceded by Mandatory options (Section 5.8.2). 
2074 Mandatory Change options are processed like normal Change options, 2075 except that the following failure cases will cause the receiver to 2076 reset the connection with Reset Code 6, "Mandatory Failure", rather 2077 than send a Confirm option. The connection MUST be reset if: 2079 o The Change option's feature number was not understood; 2081 o The Change option's value was invalid, and the receiver would 2082 normally have sent an empty Confirm option in response; or 2084 o For server-priority features, there was no shared entry in the 2085 two endpoints' preference lists. 2087 There's no reason to mark Confirm options as Mandatory in this 2088 version of DCCP, since Confirm options are sent only in response to 2089 Change options and therefore can't mention potentially-invalid 2090 values or unexpected feature numbers. 2092 7. Sequence Numbers 2094 DCCP uses sequence numbers to arrange packets into sequence, detect 2095 losses and network duplicates, and protect against attackers, half- 2096 open connections, and the delivery of very old packets. Every 2097 packet carries a Sequence Number; most packet types carry an 2098 Acknowledgement Number as well. 2100 DCCP sequence numbers are packet-based. That is, the packets 2101 generated by each endpoint have Sequence Numbers that increase by 2102 one, modulo 2^48, for every packet. Even DCCP-Ack and DCCP-Sync 2103 packets, and other packets that don't carry user data, increment the 2104 Sequence Number. Since DCCP is an unreliable protocol, there are no 2105 true retransmissions; but effective retransmissions, such as 2106 retransmissions of DCCP-Request packets, also increment the Sequence 2107 Number. This lets DCCP implementations detect network duplication, 2108 retransmissions, and acknowledgement loss, and is a significant 2109 departure from TCP practice. 2111 7.1. Variables 2113 DCCP endpoints maintain a set of sequence number variables for each 2114 connection. 
2116 ISS The Initial Sequence Number Sent by this endpoint. This 2117 equals the Sequence Number of the first DCCP-Request or 2118 DCCP-Response sent. 2120 ISR The Initial Sequence Number Received from the other 2121 endpoint. This equals the Sequence Number of the first 2122 DCCP-Request or DCCP-Response received. 2124 GSS The Greatest Sequence Number Sent by this endpoint. Here, 2125 and elsewhere, "greatest" is measured in circular sequence 2126 space. 2128 GSR The Greatest Sequence Number Received from the other 2129 endpoint on an acknowledgeable packet. (Section 7.4 defines 2130 this term.) 2132 GAR The Greatest Acknowledgement Number Received from the other 2133 endpoint on an acknowledgeable packet that was not a DCCP- 2134 Sync. 2136 Some other variables are derived from these primitives. 2138 SWL and SWH 2139 (Sequence Number Window Low and High) The extremes of the 2140 validity window for received packets' Sequence Numbers. 2142 AWL and AWH 2143 (Acknowledgement Number Window Low and High) The extremes 2144 of the validity window for received packets' Acknowledgement 2145 Numbers. 2147 7.2. Initial Sequence Numbers 2149 The endpoints' initial sequence numbers are set by the first DCCP- 2150 Request and DCCP-Response packets sent. Initial sequence numbers 2151 MUST be chosen to avoid two problems: 2153 o Delivery of old packets, where packets lingering in the network 2154 from an old connection are delivered to a new connection with the 2155 same addresses and port numbers. 2157 o Sequence number attacks, where an attacker can guess the sequence 2158 numbers that a future connection would use [M85]. 2160 These problems are the same as problems faced by TCP, and DCCP 2161 implementations SHOULD use TCP's strategies to avoid them [RFC 793, 2162 RFC 1948]. The rest of this section explains these strategies in 2163 more detail. 
2165 To address the first problem, an implementation MUST ensure that the 2166 initial sequence number for a given 4-tuple doesn't overlap with 2168 recent sequence numbers on previous connections with the same 2169 4-tuple. ("Recent" means sent within 2 maximum segment lifetimes, 2170 or 4 minutes.) The implementation MUST additionally ensure that the 2171 lower 24 bits of the initial sequence number don't overlap with the 2172 lower 24 bits of recent sequence numbers (unless the implementation 2173 plans to avoid short sequence numbers; see Section 7.6). An 2174 implementation that has state for a recent connection with the same 2175 4-tuple can pick a good initial sequence number explicitly. 2176 Otherwise, it could tie initial sequence number selection to some 2177 clock, such as the 4-microsecond clock used by TCP [RFC 793]. Two 2178 separate clocks may be required, one for the upper 24 bits and one 2179 for the lower 24 bits. 2181 To address the second problem, an implementation MUST provide each 2182 4-tuple with an independent initial sequence number space. Then 2183 opening a connection doesn't provide any information about initial 2184 sequence numbers on other connections to the same host. RFC 1948 2185 achieves this by adding a cryptographic hash of the 4-tuple and a 2186 secret to each initial sequence number. For the secret, RFC 1948 2187 recommends a combination of some truly-random data [RFC 1750], an 2188 administratively-installed passphrase, the endpoint's IP address, 2189 and the endpoint's boot time, but truly-random data is sufficient. 2190 Care should be taken when changing the secret; such a change alters 2191 all initial sequence number spaces, which might make an initial 2192 sequence number for some 4-tuple equal a recently sent sequence 2193 number for the same 4-tuple. To avoid this problem, the endpoint 2194 might remember dead connection state for each 4-tuple or stay quiet 2195 for 2 maximum segment lifetimes around such a change. 
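The two strategies above can be combined as in the following non-normative sketch. It is an illustration under stated assumptions, not a required algorithm. The names SECRET, clock_component, and initial_sequence_number are hypothetical. SHA-256 stands in for the cryptographic hash (RFC 1948 itself describes MD5), and a single 4-microsecond clock is used, so this sketch does not by itself satisfy the separate requirement on the lower 24 bits; an implementation that allows short sequence numbers would need the second clock mentioned above.

```python
import hashlib
import os
import time

SEQ_BITS = 48
SEQ_MOD = 1 << SEQ_BITS

# Hypothetical per-host secret. The text notes that truly-random data
# is sufficient; os.urandom is used here as that source.
SECRET = os.urandom(16)


def clock_component(now=None):
    """Return a 4-microsecond tick count reduced mod 2^48.

    Assumption: wall-clock time stands in for the clock; a real
    implementation would more likely use a monotonic source.
    """
    if now is None:
        now = time.time()
    return int(now * 1_000_000 / 4) % SEQ_MOD


def initial_sequence_number(src_addr, src_port, dst_addr, dst_port,
                            secret=SECRET, now=None):
    """Clock value plus a keyed per-4-tuple offset, mod 2^48.

    The offset gives every 4-tuple its own initial sequence number
    space, so observing one connection's ISN reveals nothing about
    the ISNs of other 4-tuples unless the secret is known.
    """
    four_tuple = f"{src_addr}|{src_port}|{dst_addr}|{dst_port}".encode()
    digest = hashlib.sha256(secret + four_tuple).digest()
    offset = int.from_bytes(digest[:6], "big")  # 6 bytes = 48 bits
    return (clock_component(now) + offset) % SEQ_MOD
```

Because the offset for a given 4-tuple is fixed while the secret is unchanged, successive initial sequence numbers for that 4-tuple advance with the clock. This is also why changing the secret calls for the care described above.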
2197 7.3. Quiet Time 2199 DCCP endpoints, like TCP endpoints, must take care before initiating 2200 connections when they boot. In particular, they MUST NOT send 2201 packets whose sequence numbers are close to the sequence numbers of 2202 packets lingering in the network from before the boot. The simplest 2203 way to enforce this rule is for DCCP endpoints to avoid sending any 2204 packets until one maximum segment lifetime (2 minutes) after boot. 2205 Other enforcement mechanisms include remembering recent sequence 2206 numbers across boots, and reserving the upper 8 or so bits of 2207 initial sequence numbers for a persistent counter that decrements by 2208 two each boot. (The latter mechanism would require disallowing 2209 packets with short sequence numbers; see Section 7.6.1.) 2211 7.4. Acknowledgement Numbers 2213 Cumulative acknowledgements are meaningless in an unreliable 2214 protocol. Therefore, DCCP's Acknowledgement Number field has a 2215 different meaning than TCP's. 2217 A received packet is classified as acknowledgeable if and only if 2218 its header was successfully processed by the receiving DCCP. In 2219 terms of the pseudocode in Section 8.5, a received packet becomes 2220 acknowledgeable when the receiving endpoint reaches Step 8. This 2221 means, for example, that all acknowledgeable packets have valid 2222 header checksums and sequence numbers. The Acknowledgement Number 2223 MUST equal GSR, the Greatest Sequence Number Received on an 2224 acknowledgeable packet, for all packet types except DCCP-Sync and 2225 DCCP-SyncAck. 2227 "Acknowledgeable" does not refer to data processing. Even 2228 acknowledgeable packets may have their application data dropped, due 2229 to receive buffer overflow or corruption, for instance. Data 2230 Dropped options report these data losses when necessary, letting 2231 congestion control mechanisms distinguish between network losses and 2232 endpoint losses.
This issue is discussed further in Sections 11.4 2233 and 11.7. 2235 DCCP-Sync and DCCP-SyncAck packets' Acknowledgement Numbers differ 2236 as follows: The Acknowledgement Number on a DCCP-Sync packet 2237 corresponds to a received packet, but not necessarily an 2238 acknowledgeable packet; in particular, it might correspond to an 2239 out-of-sync packet whose options were not processed. The 2240 Acknowledgement Number on a DCCP-SyncAck packet always corresponds 2241 to an acknowledgeable DCCP-Sync packet; it might be less than GSR in 2242 the presence of reordering. 2244 7.5. Validity and Synchronization 2246 Any DCCP endpoint might receive packets that are not actually part 2247 of the current connection. For instance, the network might deliver 2248 an old packet, an attacker might attempt to hijack a connection, or 2249 the other endpoint might crash, causing a half-open connection. 2251 DCCP, like TCP, uses sequence number checks to detect these cases. 2252 Packets whose Sequence and/or Acknowledgement Numbers are out of 2253 range are called sequence-invalid, and are not processed normally. 2255 Unlike TCP, DCCP requires a synchronization mechanism to recover 2256 from large bursts of loss. One endpoint might send so many packets 2257 during a burst of loss that when one of its packets finally got 2258 through, the other endpoint would label its Sequence Number as 2259 invalid. A handshake of DCCP-Sync and DCCP-SyncAck packets recovers 2260 from this case. 2262 7.5.1. Sequence and Acknowledgement Number Windows 2264 Each DCCP endpoint defines sequence validity windows that are 2265 subsets of the Sequence and Acknowledgement Number spaces. These 2266 windows correspond to packets the endpoint expects to receive in the 2267 next few round-trip times. The Sequence and Acknowledgement Number 2268 windows always contain GSR and GSS, respectively. The window widths 2269 are controlled by Sequence Window features for the two half- 2270 connections. 
2272 The Sequence Number validity window for packets from DCCP B is [SWL, 2273 SWH]. This window always contains GSR, the Greatest Sequence Number 2274 Received on a sequence-valid packet from DCCP B. It is W packets 2275 wide, where W is the value of the Sequence Window/B feature. One- 2276 fourth of the sequence window, rounded down, is less than or equal 2277 to GSR, and three-fourths is greater than GSR. (This asymmetric 2278 placement assumes that bursts of loss are more common in the network 2279 than significant reordering.) 2281 invalid | valid Sequence Numbers | invalid 2282 <---------*|*===========*=======================*|*---------> 2283 GSR -|GSR + 1 - GSR GSR +|GSR + 1 + 2284 floor(W/4)|floor(W/4) ceil(3W/4)|ceil(3W/4) 2285 = SWL = SWH 2287 The Acknowledgement Number validity window for packets from DCCP B 2288 is [AWL, AWH]. The high end of the window, AWH, equals GSS, the 2289 Greatest Sequence Number Sent by DCCP A; the window is W' packets 2290 wide, where W' is the value of the Sequence Window/A feature. 2292 invalid | valid Acknowledgement Numbers | invalid 2293 <---------*|*===================================*|*---------> 2294 GSS - W'|GSS + 1 - W' GSS|GSS + 1 2295 = AWL = AWH 2297 SWL and AWL are initially adjusted so that they are not less than 2298 the initial Sequence Numbers received and sent, respectively: 2299 SWL := max(GSR + 1 - floor(W/4), ISR), 2300 AWL := max(GSS - W' + 1, ISS). 2301 These adjustments MUST be applied only at the beginning of the 2302 connection. (Long-lived connections may wrap sequence numbers so 2303 that they appear to be less than ISR or ISS; the adjustments MUST 2304 NOT be applied in that case.) 2306 7.5.2. Sequence Window Feature 2308 The Sequence Window/A feature determines the width of the Sequence 2309 Number validity window used by DCCP B, and the width of the 2310 Acknowledgement Number validity window used by DCCP A. 
DCCP A sends 2311 a "Change L(Sequence Window, W)" option to notify DCCP B that the 2312 Sequence Window/A value is W. 2314 Sequence Window has feature number 3, and is non-negotiable. It 2315 takes 48-bit (6-byte) integer values, like DCCP sequence numbers. 2316 Change and Confirm options for Sequence Window are therefore 9 bytes 2317 long. New connections start with Sequence Window 100 for both 2318 endpoints. The minimum valid Sequence Window value is Wmin = 32. 2319 The maximum valid Sequence Window value is Wmax = 2^46 - 1 = 2320 70368744177663; circular sequence number comparisons would stop 2321 working absent this constraint. Change options suggesting Sequence 2322 Window values out of this range are invalid and MUST be handled 2323 accordingly. 2325 A proper Sequence Window/A value must reflect the number of packets 2326 DCCP A expects to be in flight. Only DCCP A can anticipate this 2327 number. Values that are too small increase the risk of the 2328 endpoints getting out of sync after bursts of loss, and values that are 2329 much too small can prevent productive communication whether or not 2330 there is loss. On the other hand, too-large values increase the 2331 risk of connection hijacking; Section 7.5.5 quantifies this risk. 2332 One good guideline is for each endpoint to set Sequence Window to 2333 about five times the maximum number of packets it expects to send in 2334 a round-trip time. Endpoints SHOULD send Change L(Sequence Window) 2335 options as necessary as the connection progresses. Also, an 2336 endpoint MUST NOT persistently send more than its Sequence Window 2337 number of packets per round-trip time; that is, DCCP A MUST NOT 2338 persistently send more than Sequence Window/A packets per RTT. 2340 7.5.3. Sequence-Validity Rules 2342 Sequence-validity depends on the received packet's type.
This table 2343 shows the sequence and acknowledgement number checks applied to each 2344 packet; a packet is sequence-valid if it passes both tests, and 2345 sequence-invalid if it does not. Many of the checks refer to the 2346 sequence and acknowledgement number validity windows [SWL, SWH] and 2347 [AWL, AWH] defined in Section 7.5.1. 2349 Acknowledgement Number 2350 Packet Type Sequence Number Check Check 2351 ----------- --------------------- ---------------------- 2352 DCCP-Request SWL <= seqno <= SWH (*) N/A 2353 DCCP-Response SWL <= seqno <= SWH (*) AWL <= ackno <= AWH 2354 DCCP-Data SWL <= seqno <= SWH N/A 2355 DCCP-Ack SWL <= seqno <= SWH AWL <= ackno <= AWH 2356 DCCP-DataAck SWL <= seqno <= SWH AWL <= ackno <= AWH 2357 DCCP-CloseReq GSR < seqno <= SWH GAR <= ackno <= AWH 2358 DCCP-Close GSR < seqno <= SWH GAR <= ackno <= AWH 2359 DCCP-Reset GSR < seqno <= SWH GAR <= ackno <= AWH 2360 DCCP-Sync SWL <= seqno AWL <= ackno <= AWH 2361 DCCP-SyncAck SWL <= seqno AWL <= ackno <= AWH 2363 (*) Check not applied if connection is in LISTEN or REQUEST state. 2365 In general, packets are sequence-valid if their Sequence and 2366 Acknowledgement Numbers lie within the corresponding valid windows, 2367 [SWL, SWH] and [AWL, AWH]. The exceptions to this rule are as 2368 follows: 2370 o Since DCCP-CloseReq, DCCP-Close, and DCCP-Reset packets end a 2371 connection, they cannot have Sequence Numbers less than or equal 2372 to GSR, or Acknowledgement Numbers less than GAR. 2374 o DCCP-Sync and DCCP-SyncAck Sequence Numbers are not strongly 2375 checked. These packet types exist specifically to get the 2376 endpoints back into sync; checking their Sequence Numbers would 2377 eliminate their usefulness. 
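The table and exceptions above, together with the window definitions in Section 7.5.1, can be sketched as follows. This is a non-normative illustration under stated assumptions. The names SeqState, circ_le, in_window, windows, and sequence_valid are hypothetical. The sketch assumes an established connection, so it omits both the LISTEN/REQUEST exception marked (*) and the start-of-connection adjustment of SWL and AWL against ISR and ISS. Window membership is tested by circular offset from the low edge, which is one reasonable reading of the circular comparisons.

```python
from dataclasses import dataclass

SEQ_MOD = 1 << 48   # sequence numbers are 48 bits
HALF = 1 << 47


@dataclass
class SeqState:
    gsr: int            # Greatest Sequence Number Received
    gss: int            # Greatest Sequence Number Sent
    gar: int            # Greatest Acknowledgement Number Received
    w_peer: int = 100   # Sequence Window/B (width of [SWL, SWH])
    w_self: int = 100   # Sequence Window/A (width of [AWL, AWH])


def circ_le(a, b):
    """a <= b in circular 48-bit sequence space."""
    return (b - a) % SEQ_MOD < HALF


def in_window(lo, x, hi):
    """lo <= x <= hi, measured as circular offsets from lo."""
    return (x - lo) % SEQ_MOD <= (hi - lo) % SEQ_MOD


def windows(s):
    """Return (SWL, SWH, AWL, AWH) per Section 7.5.1."""
    swl = (s.gsr + 1 - s.w_peer // 4) % SEQ_MOD          # floor(W/4)
    swh = (s.gsr + (3 * s.w_peer + 3) // 4) % SEQ_MOD    # ceil(3W/4)
    awl = (s.gss + 1 - s.w_self) % SEQ_MOD
    awh = s.gss % SEQ_MOD
    return swl, swh, awl, awh


NO_ACK = {"Request", "Data"}               # no Acknowledgement Number check
CLOSING = {"CloseReq", "Close", "Reset"}   # checked against GSR and GAR
SYNC = {"Sync", "SyncAck"}                 # lenient Sequence Number check


def sequence_valid(ptype, seqno, ackno, s):
    """Apply the basic (non-stringent) checks from the table."""
    swl, swh, awl, awh = windows(s)
    if ptype in CLOSING:
        seq_ok = in_window((s.gsr + 1) % SEQ_MOD, seqno, swh)
        ack_ok = in_window(s.gar, ackno, awh)
    elif ptype in SYNC:
        seq_ok = circ_le(swl, seqno)       # no upper bound on seqno
        ack_ok = in_window(awl, ackno, awh)
    else:
        seq_ok = in_window(swl, seqno, swh)
        ack_ok = ptype in NO_ACK or in_window(awl, ackno, awh)
    return seq_ok and ack_ok
```

For instance, with the default window of 100 and GSR = 1, sequence_valid("Data", 101, None, SeqState(gsr=1, gss=10, gar=1)) is False, because SWH is 76.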
2379 The lenient checks on DCCP-Sync and DCCP-SyncAck packets allow 2380 continued operation after unusual events, such as endpoint crashes 2381 and large bursts of loss, but there's no need for leniency in the 2382 absence of unusual events -- that is, during ongoing successful 2383 communication. Therefore, DCCP implementations SHOULD use the 2384 following, more stringent checks for active connections, where a 2385 connection is considered active if it has received valid packets 2386 from the other endpoint within the last five round-trip times. 2388 Acknowledgement Number 2389 Packet Type Sequence Number Check Check 2390 ----------- --------------------- ---------------------- 2391 DCCP-Sync SWL <= seqno <= SWH AWL <= ackno <= AWH 2392 DCCP-SyncAck SWL <= seqno <= SWH AWL <= ackno <= AWH 2394 Finally, an endpoint MAY apply the following more stringent checks 2395 to DCCP-CloseReq, DCCP-Close, and DCCP-Reset packets, further 2396 lowering the probability of successful blind attacks using those 2397 packet types. Since these checks can cause extra synchronization 2398 overhead and delay connection closing when packets are lost, they 2399 should be considered experimental. 2401 Acknowledgement Number 2402 Packet Type Sequence Number Check Check 2403 ----------- --------------------- ---------------------- 2404 DCCP-CloseReq seqno == GSR + 1 GAR <= ackno <= AWH 2405 DCCP-Close seqno == GSR + 1 GAR <= ackno <= AWH 2406 DCCP-Reset seqno == GSR + 1 GAR <= ackno <= AWH 2408 Note that sequence-validity is only one of the validity checks 2409 applied to received packets. 2411 7.5.4. Handling Sequence-Invalid Packets 2413 Endpoints MUST ignore sequence-invalid DCCP-Sync and DCCP-SyncAck 2414 packets, and MUST respond to other sequence-invalid packets with 2415 (possibly rate-limited) DCCP-Sync packets. Each DCCP-Sync packet 2416 MUST acknowledge the corresponding sequence-invalid packet's 2417 Sequence Number, not GSR. 
The DCCP-Sync MUST use a new Sequence 2418 Number, and thus will increase GSS; GSR will not change, however, 2419 since the received packet was sequence-invalid. 2421 On receiving a sequence-valid DCCP-Sync packet, the peer endpoint 2422 (say, DCCP B) MUST update its GSR variable and reply with a DCCP- 2423 SyncAck packet. The DCCP-SyncAck packet's Acknowledgement Number 2424 will equal the DCCP-Sync's Sequence Number, not necessarily GSR. 2425 Upon receiving this DCCP-SyncAck, which will be sequence-valid since 2426 it acknowledges the DCCP-Sync, DCCP A will update its GSR variable, 2427 and the endpoints will be back in sync. As an exception, if the 2428 peer endpoint is in the REQUEST state, it MUST respond with a DCCP- 2429 Reset instead of a DCCP-SyncAck. This serves to clean up DCCP A's 2430 half-open connection. 2432 To protect against denial-of-service attacks, DCCP implementations 2433 SHOULD impose a rate limit on DCCP-Syncs sent in response to 2434 sequence-invalid packets, such as not more than eight DCCP-Syncs per 2435 second. 2437 DCCP endpoints MUST NOT process sequence-invalid packets except, 2438 perhaps, by generating a DCCP-Sync. For instance, options MUST NOT 2439 be processed. An endpoint MAY temporarily preserve sequence- 2440 invalid packets in case they become valid later, however; this can 2441 reduce the impact of bursts of loss by delivering more packets to 2442 the application. In particular, an endpoint MAY preserve sequence- 2443 invalid packets for up to 2 round-trip times. If, within that time, 2444 the relevant sequence windows change so that the packets become 2445 sequence-valid, the endpoint MAY process them again. 2447 Note that sequence-invalid DCCP-Reset packets cause DCCP-Syncs to be 2448 generated. This is because endpoints in an unsynchronized state 2449 (CLOSED, REQUEST, and LISTEN) might not have enough information to 2450 generate a proper DCCP-Reset on the first try.
For example, if a 2451 peer endpoint is in CLOSED state and receives a DCCP-Data packet, it 2452 cannot guess the right Sequence Number to use on the DCCP-Reset it 2453 generates (since the DCCP-Data packet has no Acknowledgement 2454 Number). The DCCP-Sync generated in response to this bad reset 2455 serves as a challenge, and contains enough information for the peer 2456 to generate a proper DCCP-Reset. However, the new DCCP-Reset may 2457 carry a different Reset Code than the original DCCP-Reset; probably 2458 the new Reset Code will be 3, "No Connection". The endpoint SHOULD 2459 use information from the original DCCP-Reset when possible. 2461 7.5.5. Sequence Number Attacks 2463 Sequence and Acknowledgement Numbers form DCCP's main line of 2464 defense against attackers. An attacker that cannot guess sequence 2465 numbers cannot easily manipulate or hijack a DCCP connection, and 2466 requirements like careful initial sequence number choice eliminate 2467 the most serious attacks. 2469 An attacker might still send many packets with randomly chosen 2470 Sequence and Acknowledgement Numbers, however. If one of those 2471 probes ends up sequence-valid, it may shut down the connection or 2472 otherwise cause problems. The easiest such attacks to execute are: 2474 o Send DCCP-Data packets with random Sequence Numbers. If one of 2475 these packets hits the valid sequence number window, the attack 2476 packet's application data may be inserted into the data stream. 2478 o Send DCCP-Sync packets with random Sequence and Acknowledgement 2479 Numbers. If one of these packets hits the valid acknowledgement 2480 number window, the receiver will shift its sequence number window 2481 accordingly, getting out of sync with the correct endpoint -- 2482 perhaps permanently. 2484 The attacker has to guess both Source and Destination Ports for any 2485 of these attacks to succeed. 
Additionally, the connection would 2486 have to be inactive for the DCCP-Sync attack to succeed, assuming 2487 the victim implemented the more stringent checks for active 2488 connections recommended in Section 7.5.3. 2490 To quantify the probability of success, let N be the number of 2491 attack packets the attacker is willing to send, W be the relevant 2492 sequence window width, and L be the length of sequence numbers (24 2493 or 48). The attacker's best strategy is to space the attack packets 2494 evenly over sequence space. Then the probability of hitting one 2495 sequence number window is P = WN/2^L. 2497 The success probability for a DCCP-Data attack using short sequence 2498 numbers thus equals P = WN/2^24. For W = 100, then, the attacker 2499 must send more than 83,000 packets to achieve a 50% chance of 2500 success. For reference, the easiest TCP attack -- sending a SYN 2501 with a random sequence number, which will cause a connection reset 2502 if it falls within the window -- has W = 8760 (a common default) and 2503 L = 32, and requires more than 245,000 packets to achieve a 50% 2504 chance of success. 2506 A fast connection's W will generally be high, increasing the attack 2507 success probability for fixed N. If this probability gets 2508 uncomfortably high with L = 24, the endpoint SHOULD prevent the use 2509 of short sequence numbers by manipulating the Allow Short Sequence 2510 Numbers feature (see Section 7.6.1). The probability limit depends 2511 on the application, however. Some applications, such as those 2512 already designed to handle corruption, are quite resilient to data 2513 injection attacks. 2515 The DCCP-Sync attack has L = 48, since DCCP-Sync packets use long 2516 sequence numbers exclusively; in addition, the success probability 2517 is halved, since only half the Sequence Number space is valid. 2518 Attacks have a correspondingly smaller probability of success. 
For 2519 a large W of 2000 packets, then, the attacker must send more than 2520 10^11 packets to achieve a 50% chance of success. 2522 Attacks involving DCCP-Ack, DCCP-DataAck, DCCP-CloseReq, DCCP-Close, 2523 and DCCP-Reset packets are more difficult, since Sequence and 2524 Acknowledgement Numbers must both be guessed. The probability of 2525 attack success for these packet types equals P = WXN/2^(2L), where W 2526 is the Sequence Number window, X is the Acknowledgement Number 2527 window, and N and L are as before. 2529 Since DCCP-Data attacks with short sequence numbers are relatively 2530 easy for attackers to execute, DCCP has been engineered to prevent 2531 these attacks from escalating to connection resets or other serious 2532 consequences. In particular, any options whose processing might 2533 cause the connection to be reset are ignored when they appear on 2534 DCCP-Data packets. 2536 7.5.6. Examples 2538 In the following example, DCCP A and DCCP B recover from a large 2539 burst of loss that runs DCCP A's sequence numbers out of DCCP B's 2540 appropriate sequence number window. 2542 DCCP A DCCP B 2543 (GSS=1,GSR=10) (GSS=10,GSR=1) 2544 --> DCCP-Data(seq 2) XXX 2545 ... 2546 --> DCCP-Data(seq 100) XXX 2547 --> DCCP-Data(seq 101) --> ??? 2548 seqno out of range; 2549 send Sync 2550 OK <-- DCCP-Sync(seq 11, ack 101) <-- 2551 (GSS=11,GSR=1) 2552 --> DCCP-SyncAck(seq 102, ack 11) --> OK 2553 (GSS=102,GSR=11) (GSS=11,GSR=102) 2555 In the next example, a DCCP connection recovers from a simple blind 2556 attack. 2558 DCCP A DCCP B 2559 (GSS=1,GSR=10) (GSS=10,GSR=1) 2560 *ATTACKER* --> DCCP-Data(seq 10^6) --> ??? 2561 seqno out of range; 2562 send Sync 2563 ??? <-- DCCP-Sync(seq 11, ack 10^6) <-- 2564 ackno out of range; ignore 2565 (GSS=1,GSR=10) (GSS=11,GSR=1) 2567 The final example demonstrates recovery from a half-open connection. 2569 DCCP A DCCP B 2570 (GSS=1,GSR=10) (GSS=10,GSR=1) 2571 (Crash) 2572 CLOSED OPEN 2573 REQUEST --> DCCP-Request(seq 400) --> ??? 
2574 !! <-- DCCP-Sync(seq 11, ack 400) <-- OPEN 2575 REQUEST --> DCCP-Reset(seq 401, ack 11) --> (Abort) 2576 REQUEST CLOSED 2577 REQUEST --> DCCP-Request(seq 402) --> ... 2579 7.6. Short Sequence Numbers 2581 DCCP sequence numbers are 48 bits long. This large sequence space 2582 protects DCCP connections against some blind attacks, such as the 2583 injection of DCCP-Resets into the connection. However, DCCP-Data, 2584 DCCP-Ack, and DCCP-DataAck packets, which make up the body of any 2585 DCCP connection, may reduce header space by transmitting only the 2586 lower 24 bits of the relevant Sequence and Acknowledgement Numbers. 2587 The receiving endpoint will extend these numbers to 48 bits using 2588 the following pseudocode: 2590 procedure Extend_Sequence_Number(S, REF) 2591 /* S is a 24-bit sequence number from the packet header. 2592 REF is the relevant 48-bit reference sequence number: 2593 GSS if S is an Acknowledgement Number, and GSR if S is a 2594 Sequence Number. */ 2595 Set REF_low := low 24 bits of REF 2596 Set REF_hi := high 24 bits of REF 2597 If REF_low (<) S /* circular comparison mod 2^24 */ 2598 && S |<| REF_low, /* conventional, non-circular 2599 comparison */ 2600 Return (((REF_hi + 1) mod 2^24) << 24) | S 2601 Otherwise, 2602 Return (REF_hi << 24) | S 2604 The two different kinds of comparison in the if statement detect 2605 when the low-order bits of the sequence space have wrapped. (The 2606 circular comparison "REF_low (<) S" returns true if and only if 2607 (S - REF_low), calculated using two's-complement arithmetic and then 2608 represented as an unsigned number, is less than or equal to 2^23 2609 (mod 2^24).) When this happens, the high-order bits are 2610 incremented. 2612 7.6.1. Allow Short Sequence Numbers Feature 2614 Endpoints can require that all packets use long sequence numbers by 2615 setting the Allow Short Sequence Numbers feature to false. 
This can 2616 reduce the risk that data will be inappropriately injected into the 2617 connection. DCCP A sends a "Change R(Allow Short Seqnos, 0)" option 2618 to ask DCCP B to send only long sequence numbers. 2620 Allow Short Sequence Numbers has feature number 2, and is server- 2621 priority. It takes one-byte Boolean values. DCCP B MUST NOT send 2622 packets with short sequence numbers when Allow Short Seqnos/B is 2623 zero. Values of two or more are reserved. New connections start 2624 with Allow Short Sequence Numbers 1 for both endpoints. 2626 7.6.2. When to Avoid Short Sequence Numbers 2628 Short sequence numbers reduce the rate DCCP connections can safely 2629 achieve, and increase the risks of certain kinds of attacks, 2630 including blind data injection. Very-high-rate DCCP connections, 2631 and connections with large sequence windows (Section 7.5.2), SHOULD 2632 NOT use short sequence numbers on their data packets. The attack 2633 risk issues have been discussed in Section 7.5.5; we discuss the 2634 rate limitation issue here. 2636 The sequence-validity mechanism assumes that the network does not 2637 deliver extremely old data. In particular, it assumes that the 2638 network must have dropped any packet by the time the connection 2639 wraps around and uses its sequence number again. This constraint 2640 limits the maximum connection rate that can be safely achieved. Let 2641 MSL equal the maximum segment lifetime, P equal the average DCCP 2642 packet size in bits, and L equal the length of sequence numbers (24 2643 or 48 bits). Then the maximum safe rate, in bits per second, is R = 2644 P*(2^L)/2MSL. 2646 For the default MSL of 2 minutes, 1500-byte DCCP packets, and short 2647 sequence numbers, the safe rate is therefore approximately 800 Mb/s. 
2648 Although 2 minutes is a very large MSL for any networks that could 2649 sustain that rate with such small packets, long sequence numbers 2650 allow much higher rates under the same constraints: up to 2651 14 petabits a second for 1500-byte packets and the default MSL. 2653 7.7. NDP Count and Detecting Application Loss 2655 DCCP's sequence numbers increment by one on every packet, including 2656 non-data packets (packets that don't carry application data). This 2657 makes DCCP sequence numbers suitable for detecting any network loss, 2658 but not for detecting the loss of application data. The NDP Count 2659 option reports the length of each burst of non-data packets. This 2660 lets the receiving DCCP reliably determine when a burst of loss 2661 included application data. 2663 +--------+--------+-------- ... --------+ 2664 |00100101| Length | NDP Count | 2665 +--------+--------+-------- ... --------+ 2666 Type=37 Len=3-5 (1-3 bytes) 2668 If a DCCP endpoint's Send NDP Count feature is one (see below), then 2669 that endpoint MUST send an NDP Count option on every packet whose 2670 immediate predecessor was a non-data packet. Non-data packets 2671 consist of DCCP packet types DCCP-Ack, DCCP-Close, DCCP-CloseReq, 2672 DCCP-Reset, DCCP-Sync, and DCCP-SyncAck. The other packet types, 2673 namely DCCP-Request, DCCP-Response, DCCP-Data, and DCCP-DataAck, are 2674 considered data packets, although not all DCCP-Request and DCCP- 2675 Response packets will actually carry application data. 2677 The value stored in NDP Count equals the number of consecutive non- 2678 data packets in the run immediately previous to the current packet. 2679 Packets with no NDP Count option are considered to have NDP Count 2680 zero. 2682 The NDP Count option can carry one to three bytes of data. The 2683 smallest option format that can hold the NDP Count SHOULD be used. 2685 With NDP Count, the receiver can reliably tell only whether a burst 2686 of loss contained at least one data packet. 
   For example, the receiver cannot always tell whether a burst of loss
   contained a non-data packet.

7.7.1. Usage Notes

   Say that K consecutive sequence numbers are missing in some burst of
   loss, and the Send NDP Count feature is on. Then some application
   data was lost within those sequence numbers unless the packet
   following the hole contains an NDP Count option whose value is
   greater than or equal to K.

   For example, say that an endpoint sent the following sequence of
   non-data packets (Nx) and data packets (Dx).

      N0  N1  D2  N3  D4  D5  N6  D7  D8  D9  D10  N11  N12  D13

   Those packets would have NDP Counts as follows.

      N0  N1  D2  N3  D4  D5  N6  D7  D8  D9  D10  N11  N12  D13
      -   1   2   -   1   -   -   1   -   -   -    -    1    2

   NDP Count is not useful for applications that include their own
   sequence numbers with their packet headers.

7.7.2. Send NDP Count Feature

   The Send NDP Count feature lets DCCP endpoints negotiate whether
   they should send NDP Count options on their packets. DCCP A sends a
   "Change R(Send NDP Count, 1)" option to ask DCCP B to send NDP Count
   options.

   Send NDP Count has feature number 7, and is server-priority. It
   takes one-byte Boolean values. DCCP B MUST send NDP Count options
   as described above when Send NDP Count/B is one, although it MAY
   send NDP Count options even when Send NDP Count/B is zero. Values
   of two or more are reserved. New connections start with Send NDP
   Count 0 for both endpoints.

8. Event Processing

   This section describes how DCCP connections move between states, and
   which packets are sent when. Note that feature negotiation takes
   place in parallel with the connection-wide state transitions
   described here.

8.1. Connection Establishment

   DCCP connections' initiation phase consists of a three-way
   handshake: an initial DCCP-Request packet sent by the client, a
   DCCP-Response sent by the server in reply, and finally an
   acknowledgement from the client, usually via a DCCP-Ack or
   DCCP-DataAck packet. The client moves from the REQUEST state to
   PARTOPEN, and finally to OPEN; the server moves from LISTEN to
   RESPOND, and finally to OPEN.

      Client State                             Server State
         CLOSED                                   LISTEN
   1.    REQUEST   -->       Request        -->
   2.              <--       Response       <--   RESPOND
   3.    PARTOPEN  -->     Ack, DataAck     -->
   4.              <--  Data, Ack, DataAck  <--   OPEN
   5.    OPEN      <->  Data, Ack, DataAck  <->   OPEN

8.1.1. Client Request

   When a client decides to initiate a connection, it enters the
   REQUEST state, chooses an initial sequence number (Section 7.2), and
   sends a DCCP-Request packet using that sequence number to the
   intended server.

   DCCP-Request packets will commonly carry feature negotiation options
   that open negotiations for various connection parameters, such as
   preferred congestion control IDs for each half-connection. They may
   also carry application data, but the client should be aware that the
   server may not accept such data.

   A client in the REQUEST state SHOULD use an exponential-backoff
   timer to send new DCCP-Request packets if no response is received.
   The first retransmission should occur after approximately one
   second, backing off to not less than one packet every 64 seconds; or
   the endpoint can use whatever retransmission strategy is followed
   for retransmitting TCP SYNs. Each new DCCP-Request MUST increment
   the Sequence Number by one, and MUST contain the same Service Code
   and application data as the original DCCP-Request.

   A client MAY give up on its DCCP-Requests after some time
   (3 minutes, for example).
   When it does, it SHOULD send a DCCP-Reset packet to the server with
   Reset Code 2, "Aborted", to clean up state in case one or more of
   the Requests actually arrived. A client in REQUEST state has never
   received an initial sequence number from its peer, so the
   DCCP-Reset's Acknowledgement Number MUST be set to zero.

   The client leaves the REQUEST state for PARTOPEN when it receives a
   DCCP-Response from the server.

8.1.2. Service Codes

   Each DCCP-Request contains a 32-bit Service Code, which identifies
   the application-level service to which the client application is
   trying to connect. Service Codes should correspond to application
   services and protocols. For example, there might be a Service Code
   for SIP control connections and one for RTP audio connections.
   Middleboxes, such as firewalls, can use the Service Code to identify
   the application running on a nonstandard port (assuming the DCCP
   header has not been encrypted).

   Endpoints MUST associate a Service Code with every DCCP socket, both
   actively and passively opened. The application will generally
   supply this Service Code. Each active socket MUST have exactly one
   Service Code. Passive sockets MAY, at the implementation's
   discretion, be associated with more than one Service Code; this
   might let multiple applications, or multiple versions of the same
   application, listen on the same port, differentiated by Service
   Code. If the DCCP-Request's Service Code doesn't match any of the
   server's Service Codes for the given port, the server MUST reject
   the request by sending a DCCP-Reset packet with Reset Code 8, "Bad
   Service Code". A middlebox MAY also send such a DCCP-Reset in
   response to packets whose Service Code is considered unsuitable.

   Service Codes are not intended to be DCCP-specific, and are
   allocated by IANA.
   Following the policies outlined in RFC 2434, most Service Codes are
   allocated First Come First Served, subject to the following
   guidelines.

   o  Service Codes are allocated one at a time, or in small blocks. A
      short English description of the intended service is REQUIRED to
      obtain a Service Code assignment, but no specification,
      standards-track or otherwise, is necessary. IANA maintains an
      association of Service Codes to the corresponding phrases.

   o  Users request specific Service Code values. We suggest that
      users request Service Codes that can be interpreted as meaningful
      four-byte ASCII strings. Thus, the "Frobodyne Plotz Protocol"
      might correspond to "fdpz", or the number 1717858426. The
      canonical interpretation of a Service Code field is numeric.

   o  Service Codes whose bytes each have values in the set {32, 45-57,
      65-90} use a Specification Required allocation policy. That is,
      these Service Codes are used for international standard or
      standards-track specifications, IETF or otherwise. (This set
      consists of the ASCII digits, uppercase letters, and characters
      space, '-', '.', and '/'.)

   o  Service Codes whose high-order byte equals 63 (ASCII '?') are
      reserved for Private Use.

   o  Service Code 0 represents the absence of a meaningful Service
      Code, and MUST NOT be allocated.

   This design for Service Code allocation is based on the allocation
   of 4-byte identifiers for Macintosh resources, PNG chunks, and
   TrueType and OpenType tables.

8.1.3. Server Response

   In the second phase of the three-way handshake, the server moves
   from the LISTEN state to RESPOND, and sends a DCCP-Response message
   to the client. In this phase, a server will often specify the
   features it would like to use, either from among those the client
   requested, or in addition to those.
   Among these options is the congestion control mechanism the server
   expects to use.

   The server MAY respond to a DCCP-Request packet with a DCCP-Reset
   packet to refuse the connection. Relevant Reset Codes for refusing
   a connection include 7, "Connection Refused", when the
   DCCP-Request's Destination Port did not correspond to a DCCP port
   open for listening; 8, "Bad Service Code", when the DCCP-Request's
   Service Code did not correspond to the service code registered with
   the Destination Port; and 9, "Too Busy", when the server is
   currently too busy to respond to requests. The server SHOULD limit
   the rate at which it generates these resets, for example to not more
   than 1024 per second.

   The server SHOULD NOT retransmit DCCP-Response packets; the client
   will retransmit the DCCP-Request if necessary. (Note that the
   "retransmitted" DCCP-Request will have, at least, a different
   sequence number from the "original" DCCP-Request. The server can
   thus distinguish true retransmissions from network duplicates.) The
   server will detect that the retransmitted DCCP-Request applies to an
   existing connection because of its Source and Destination Ports.
   Every valid DCCP-Request received while the server is in the RESPOND
   state MUST elicit a new DCCP-Response. Each new DCCP-Response MUST
   increment the server's Sequence Number by one, and MUST include the
   same application data, if any, as the original DCCP-Response.

   The server MUST NOT accept more than one piece of DCCP-Request
   application data per connection. In particular, the DCCP-Response
   sent in reply to a retransmitted DCCP-Request with application data
   SHOULD contain a Data Dropped option, in which the retransmitted
   DCCP-Request data is reported with Drop Code 0, Protocol
   Constraints.
   The original DCCP-Request SHOULD also be reported in the Data
   Dropped option, either in a Normal Block (if the server accepted the
   data, or there was no data), or in a Drop Code 0 Drop Block (if the
   server refused the data the first time as well).

   The Data Dropped and Init Cookie options are particularly useful for
   DCCP-Response packets (Sections 11.7 and 8.1.4).

   The server leaves the RESPOND state for OPEN when it receives a
   valid DCCP-Ack from the client, completing the three-way handshake.
   It MAY also leave the RESPOND state for CLOSED after a timeout of
   not less than 4MSL (8 minutes); when doing so, it SHOULD send a
   DCCP-Reset with Reset Code 2, "Aborted", to clean up state at the
   client.

8.1.4. Init Cookie Option

   +--------+--------+--------+--------+--------+--------
   |00100100| Length |         Init Cookie Value   ...
   +--------+--------+--------+--------+--------+--------
    Type=36

   The Init Cookie option lets a DCCP server avoid having to hold any
   state until the three-way connection setup handshake has completed,
   in a fashion similar to TCP SYN cookies [SYNCOOKIES]. The server
   wraps up the Service Code, server port, and any options it cares
   about from both the DCCP-Request and DCCP-Response in an opaque
   cookie. Typically the cookie will be encrypted using a secret known
   only to the server and include a cryptographic checksum or magic
   value so that correct decryption can be verified. When the server
   receives the cookie back in the response, it can decrypt the cookie
   and instantiate all the state it avoided keeping. In the meantime,
   it need not move from the LISTEN state.

   The Init Cookie option MUST NOT be sent on DCCP-Request or DCCP-Data
   packets, and any such options received on DCCP-Request or DCCP-Data
   packets MUST be ignored. The server MAY include an Init Cookie
   option in its DCCP-Response.
   If so, then the client MUST echo the same Init Cookie option in each
   succeeding DCCP packet until one of those packets is acknowledged,
   meaning the three-way handshake has completed, or the connection is
   reset. (As a result, the client MUST NOT use DCCP-Data packets
   until the three-way handshake completes or the connection is reset.)
   The server SHOULD design its Init Cookie format so that Init Cookies
   can be checked for tampering; it SHOULD respond to a tampered Init
   Cookie option by resetting the connection with Reset Code 10, "Bad
   Init Cookie".

   Init Cookie's precise implementation need not be specified here;
   since Init Cookies are opaque to the client, there are no
   interoperability concerns. An example cookie format might encrypt
   (using a secret key) the connection's initial sequence and
   acknowledgement numbers, ports, Service Code, any options included
   on the DCCP-Request packet and the corresponding DCCP-Response, a
   random salt, and a magic number. On receiving a reflected Init
   Cookie, the server would decrypt the cookie, validate it by checking
   its magic number, sequence numbers, and ports, and, if valid, create
   a corresponding socket using the options.

   Init Cookies are limited to at most 253 bytes in length.

8.1.5. Handshake Completion

   When the client receives a DCCP-Response from the server, it moves
   from the REQUEST state to PARTOPEN and completes the three-way
   handshake by sending a DCCP-Ack packet to the server. The client
   remains in PARTOPEN until it can be sure that the server has
   received some packet the client sent from PARTOPEN (either the
   initial DCCP-Ack or a later packet). Clients in the PARTOPEN state
   that want to send data MUST do so using DCCP-DataAck packets, not
   DCCP-Data packets.
   This is because DCCP-Data packets lack Acknowledgement Numbers, so
   the server can't tell from a DCCP-Data packet whether the client saw
   its DCCP-Response. Furthermore, if the DCCP-Response included an
   Init Cookie, that Init Cookie MUST be included on every packet sent
   in PARTOPEN.

   The single DCCP-Ack sent when entering the PARTOPEN state might, of
   course, be dropped by the network. The client SHOULD ensure that
   some packet gets through eventually. The preferred mechanism would
   be a roughly 200-millisecond timer, set every time a packet is
   transmitted in PARTOPEN. If this timer goes off and the client is
   still in PARTOPEN, the client generates another DCCP-Ack and backs
   off the timer. If the client remains in PARTOPEN for more than 4MSL
   (8 minutes), it SHOULD reset the connection with Reset Code 2,
   "Aborted".

   The client leaves the PARTOPEN state for OPEN when it receives a
   valid packet other than DCCP-Response, DCCP-Reset, or DCCP-Sync from
   the server.

8.2. Data Transfer

   In the central data transfer phase of the connection, both server
   and client are in the OPEN state.

   DCCP A sends DCCP-Data and DCCP-DataAck packets to DCCP B due to
   application events on host A. These packets are
   congestion-controlled by the CCID for the A-to-B half-connection.
   In contrast, DCCP-Ack packets sent by DCCP A are controlled by the
   CCID for the B-to-A half-connection. Generally, DCCP A will
   piggyback acknowledgement information on DCCP-Data packets when
   acceptable, creating DCCP-DataAck packets. DCCP-Ack packets are
   used when there is no data to send from DCCP A to DCCP B, or when
   the congestion state of the A-to-B CCID will not allow data to be
   sent.

   DCCP-Sync and DCCP-SyncAck packets may also occur in the data
   transfer phase. Some cases causing DCCP-Sync generation are
   discussed in Section 7.5.
   One important distinction between DCCP-Sync packets and other packet
   types is that DCCP-Sync elicits an immediate acknowledgement. On
   receiving a valid DCCP-Sync packet, a DCCP endpoint MUST immediately
   generate and send a DCCP-SyncAck response (subject to any
   implementation rate limits); the Acknowledgement Number on that
   DCCP-SyncAck MUST equal the Sequence Number of the DCCP-Sync.

   A particular DCCP implementation might decide to initiate feature
   negotiation only once the OPEN state was reached, in which case it
   might not allow data transfer until some time later. Data received
   during that time SHOULD be rejected and reported using a Data
   Dropped Drop Block with Drop Code 0, Protocol Constraints (see
   Section 11.7).

8.3. Termination

   DCCP connection termination uses a handshake consisting of an
   optional DCCP-CloseReq packet, a DCCP-Close packet, and a DCCP-Reset
   packet. The server moves from the OPEN state, possibly through the
   CLOSEREQ state, to CLOSED; the client moves from OPEN through
   CLOSING to TIMEWAIT, and after 2MSL wait time (4 minutes), to
   CLOSED.

   The sequence DCCP-CloseReq, DCCP-Close, DCCP-Reset is used when the
   server decides to close the connection, but doesn't want to hold
   TIMEWAIT state:

      Client State                             Server State
         OPEN                                     OPEN
   1.              <--       CloseReq       <--   CLOSEREQ
   2.    CLOSING   -->        Close         -->
   3.              <--        Reset         <--   CLOSED (LISTEN)
   4.    TIMEWAIT
   5.    CLOSED

   A shorter sequence occurs when the client decides to close the
   connection.

      Client State                             Server State
         OPEN                                     OPEN
   1.    CLOSING   -->        Close         -->
   2.              <--        Reset         <--   CLOSED (LISTEN)
   3.    TIMEWAIT
   4.    CLOSED

   Finally, the server can decide to hold TIMEWAIT state:

      Client State                             Server State
         OPEN                                     OPEN
   1.              <--        Close         <--   CLOSING
   2.    CLOSED    -->        Reset         -->
   3.                                             TIMEWAIT
   4.                                             CLOSED (LISTEN)

   In all cases, the receiver of the DCCP-Reset packet holds TIMEWAIT
   state for the connection. As in TCP, TIMEWAIT state, where an
   endpoint quietly preserves a socket for 2MSL (4 minutes) after its
   connection has closed, ensures that no connection duplicating the
   current connection's source and destination addresses and ports can
   start up while old packets might remain in the network.

   The termination handshake proceeds as follows. The receiver of a
   valid DCCP-CloseReq packet MUST respond with a DCCP-Close packet.
   The receiver of a valid DCCP-Close packet MUST respond with a
   DCCP-Reset packet, with Reset Code 1, "Closed". The receiver of a
   valid DCCP-Reset packet -- which is also the sender of the
   DCCP-Close packet (and possibly the receiver of the DCCP-CloseReq
   packet) -- will hold TIMEWAIT state for the connection.

   A DCCP-Reset packet completes every DCCP connection, whether the
   termination is clean (due to application close; Reset Code 1,
   "Closed") or unclean. Unlike TCP, which has two distinct
   termination mechanisms (FIN and RST), DCCP ends all connections in a
   uniform manner. This is justified because some aspects of
   connection termination are the same independent of whether
   termination was clean. For instance, the endpoint that receives a
   valid DCCP-Reset SHOULD hold TIMEWAIT state for the connection.
   Processors that must distinguish between clean and unclean
   termination can examine the Reset Code. DCCP-Reset packets MUST NOT
   be generated in response to received DCCP-Reset packets. DCCP
   implementations generally transition to the CLOSED state after
   sending a DCCP-Reset packet.

   Endpoints in the CLOSEREQ and CLOSING states MUST retransmit
   DCCP-CloseReq and DCCP-Close packets, respectively, until leaving
   those states.
   The retransmission timer should initially be set to go off in two
   round-trip times, and should back off to not less than once every
   64 seconds if no relevant response is received.

   Only the server can send a DCCP-CloseReq packet or enter the
   CLOSEREQ state. A server receiving a sequence-valid DCCP-CloseReq
   packet MUST respond with a DCCP-Sync packet, and otherwise ignore
   the DCCP-CloseReq.

   DCCP-Data, DCCP-DataAck, and DCCP-Ack packets received in CLOSEREQ
   or CLOSING states MAY be either processed or ignored.

8.3.1. Abnormal Termination

   DCCP endpoints generate DCCP-Reset packets to terminate connections
   abnormally; a DCCP-Reset packet may be generated from any state.
   Resets sent in the CLOSED, LISTEN, and TIMEWAIT states use Reset
   Code 3, "No Connection", unless otherwise specified. Resets sent in
   the REQUEST or RESPOND states use Reset Code 4, "Packet Error",
   unless otherwise specified.

   DCCP endpoints in CLOSED or LISTEN state may need to generate a
   DCCP-Reset packet in response to a packet received from a peer.
   Since these states have no associated sequence number variables, the
   Sequence and Acknowledgement Numbers on the DCCP-Reset packet R are
   taken from the received packet P, as follows.

   1. If P.ackno exists, then set R.seqno := P.ackno + 1. Otherwise,
      set R.seqno := 0.

   2. Set R.ackno := P.seqno.

   3. If the packet used short sequence numbers (P.X == 0), then set
      the upper 24 bits of R.seqno and R.ackno to 0.

8.4. DCCP State Diagram

   The most common state transitions discussed above can be summarized
   in the following state diagram. The diagram is illustrative; the
   text in Section 8.5 and elsewhere should be considered definitive.
   For example, there are arcs (not shown) from every state except
   CLOSED to TIMEWAIT, contingent on the receipt of a valid DCCP-Reset.

   +---------------------------+    +---------------------------+
   |                           v    v                           |
   |                        +----------+                        |
   |          +-------------+  CLOSED  +------------+           |
   |          | passive     +----------+  active    |           |
   |          |  open                      open     |           |
   |          |                         snd Request |           |
   |          v                                     v           |
   |     +----------+                          +----------+     |
   |     |  LISTEN  |                          | REQUEST  |     |
   |     +----+-----+                          +----+-----+     |
   |          | rcv Request            rcv Response |           |
   |          | snd Response             snd Ack    |           |
   |          v                                     v           |
   |     +----------+                          +----------+     |
   |     | RESPOND  |                          | PARTOPEN |     |
   |     +----+-----+                          +----+-----+     |
   |          | rcv Ack/DataAck         rcv packet  |           |
   |          |                                     |           |
   |          |             +----------+            |           |
   |          +------------>|   OPEN   |<-----------+           |
   |                        +--+-+--+--+                        |
   |       server active close | |  |   active close            |
   |           snd CloseReq    | |  | or rcv CloseReq           |
   |                           | |  |     snd Close             |
   |                           | |  |                           |
   |     +----------+          | |  |          +----------+     |
   |     | CLOSEREQ |<---------+ |  +--------->| CLOSING  |     |
   |     +----+-----+            |             +----+-----+     |
   |          | rcv Close        |        rcv Reset |           |
   |          | snd Reset        |                  |           |
   |<---------+                  |                  v           |
   |                             |             +----+-----+     |
   |                  rcv Close  |             | TIMEWAIT |     |
   |                  snd Reset  |             +----+-----+     |
   +-----------------------------+                  |           |
                                                    +-----------+
                                                 2MSL timer expires

8.5. Pseudocode

   This section presents an algorithm describing the processing steps a
   DCCP endpoint must go through when it receives a packet. A DCCP
   implementation need not implement the algorithm as it is described
   here, but any implementation MUST generate observable effects
   exactly as indicated by this pseudocode, except where allowed
   otherwise by another part of this document.

   The received packet is written as P, the socket as S.
   Packet variables P.seqno and P.ackno are 48-bit sequence numbers.
   Socket variables:
   S.SWL - sequence number window low
   S.SWH - sequence number window high
   S.AWL - acknowledgement number window low
   S.AWH - acknowledgement number window high
   S.ISS - initial sequence number sent
   S.ISR - initial sequence number received
   S.OSR - first OPEN sequence number received
   S.GSS - greatest sequence number sent
   S.GSR - greatest valid sequence number received
   S.GAR - greatest valid acknowledgement number received on a
           non-Sync; initialized to S.ISS
   "Send packet" actions always use, and increment, S.GSS.

   Step 1: Check header basics
      /* This step checks for malformed packets. Packets that fail
         these checks are ignored -- they do not receive Resets in
         response */
      If the packet is shorter than 12 bytes, drop packet and return
      If the packet type is not understood, drop packet and return
      If P.Data Offset is too small for packet type, or too large for
            packet, drop packet and return
      If P.type is not Data, Ack, or DataAck and P.X == 0 (the packet
            has short sequence numbers), drop packet and return
      If the header checksum is incorrect, drop packet and return
      If P.CsCov is too large for the packet size, drop packet and
            return

   Step 2: Check ports and process TIMEWAIT state
      Look up flow ID in table and get corresponding socket
      If no socket, or S.state == TIMEWAIT,
         Generate Reset(No Connection) unless P.type == Reset
         Drop packet and return

   Step 3: Process LISTEN state
      If S.state == LISTEN,
         If P.type == Request or P contains a valid Init Cookie option,
            /* Must scan the packet's options to check for an Init
               Cookie. Only the Init Cookie is processed here,
               however; other options are processed in Step 8. This
               scan need only be performed if the endpoint uses Init
               Cookies */
            /* Generate a new socket and switch to that socket */
            Set S := new socket for this port pair
            S.state = RESPOND
            Choose S.ISS (initial seqno) or set from Init Cookie
            Set S.ISR, S.GSR, S.SWL, S.SWH from packet or Init Cookie
            Continue with S.state == RESPOND
            /* A Response packet will be generated in Step 11 */
         Otherwise,
            Generate Reset(No Connection) unless P.type == Reset
            Drop packet and return

   Step 4: Prepare sequence numbers in REQUEST
      If S.state == REQUEST,
         If (P.type == Response or P.type == Reset)
               and S.AWL <= P.ackno <= S.AWH,
            /* Set sequence number variables corresponding to the
               other endpoint, so P will pass the tests in Step 6 */
            Set S.GSR, S.ISR, S.SWL, S.SWH
            /* Response processing continues in Step 10; Reset
               processing continues in Step 9 */
         Otherwise,
            /* Only Response and Reset are valid in REQUEST state */
            Generate Reset(Packet Error)
            Drop packet and return

   Step 5: Prepare sequence numbers for Sync
      If P.type == Sync or P.type == SyncAck,
         If S.AWL <= P.ackno <= S.AWH and P.seqno >= S.SWL,
            /* P is valid, so update sequence number variables
               accordingly. After this update, P will pass the tests
               in Step 6. A SyncAck is generated if necessary in
               Step 15 */
            Update S.GSR, S.SWL, S.SWH
         Otherwise,
            Drop packet and return

   Step 6: Check sequence numbers
      Let LSWL = S.SWL and LAWL = S.AWL
      If P.type == CloseReq or P.type == Close or P.type == Reset,
         LSWL := S.GSR + 1, LAWL := S.GAR
      If LSWL <= P.seqno <= S.SWH
            and (P.ackno does not exist or LAWL <= P.ackno <= S.AWH),
         Update S.GSR, S.SWL, S.SWH
         If P.type != Sync,
            Update S.GAR
      Otherwise,
         Send Sync packet acknowledging P.seqno
         Drop packet and return

   Step 7: Check for unexpected packet types
      If (S.is_server and P.type == CloseReq)
            or (S.is_server and P.type == Response)
            or (S.is_client and P.type == Request)
            or (S.state >= OPEN and P.type == Request
                and P.seqno >= S.OSR)
            or (S.state >= OPEN and P.type == Response
                and P.seqno >= S.OSR)
            or (S.state == RESPOND and P.type == Data),
         Send Sync packet acknowledging P.seqno
         Drop packet and return

   Step 8: Process options and mark acknowledgeable
      /* Option processing is not specifically described here.
         Certain options, such as Mandatory, may cause the connection
         to be reset, in which case Steps 9 and on are not executed */
      Mark packet as acknowledgeable (in Ack Vector terms, Received
            or Received ECN Marked)

   Step 9: Process Reset
      If P.type == Reset,
         Tear down connection
         S.state := TIMEWAIT
         Set TIMEWAIT timer
         Drop packet and return

   Step 10: Process REQUEST state (second part)
      If S.state == REQUEST,
         /* If we get here, P is a valid Response from the server (see
            Step 4), and we should move to PARTOPEN state. PARTOPEN
            means send an Ack, don't send Data packets, retransmit
            Acks periodically, and always include any Init Cookie from
            the Response */
         S.state := PARTOPEN
         Set PARTOPEN timer
         Continue with S.state == PARTOPEN
         /* Step 12 will send the Ack completing the three-way
            handshake */

   Step 11: Process RESPOND state
      If S.state == RESPOND,
         If P.type == Request,
            Send Response, possibly containing Init Cookie
            If Init Cookie was sent,
               Destroy S and return
               /* Step 3 will create another socket when the client
                  completes the three-way handshake */
         Otherwise,
            S.OSR := P.seqno
            S.state := OPEN

   Step 12: Process PARTOPEN state
      If S.state == PARTOPEN,
         If P.type == Response,
            Send Ack
         Otherwise, if P.type != Sync,
            S.OSR := P.seqno
            S.state := OPEN

   Step 13: Process CloseReq
      If P.type == CloseReq and S.state < CLOSEREQ,
         Generate Close
         S.state := CLOSING
         Set CLOSING timer

   Step 14: Process Close
      If P.type == Close,
         Generate Reset(Closed)
         Tear down connection
         Drop packet and return

   Step 15: Process Sync
      If P.type == Sync,
         Generate SyncAck

   Step 16: Process data
      /* At this point any application data on P can be passed to the
         application, except that the application MUST NOT receive
         data from more than one Request or Response */

9. Checksums

   DCCP uses a header checksum to protect its header against
   corruption. Generally, this checksum also covers any application
   data. DCCP applications can, however, request that the header
   checksum cover only part of the application data, or perhaps no
   application data at all. Link layers may then reduce their
   protection on unprotected parts of DCCP packets. For some noisy
   links, and applications that can tolerate corruption, this can
   greatly improve delivery rates and perceived performance.
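
As a rough illustration of partial coverage, the following sketch computes a one's complement checksum over the header material plus only the covered prefix of the application data, using the (CsCov-1)*4 rule defined in Section 9.2. The function names, and the choice to pass the pseudoheader and the DCCP header with options as pre-built byte strings (with the Checksum field already zeroed), are assumptions made for this example rather than anything the specification prescribes.

```python
def ones_complement_sum16(data: bytes) -> int:
    """One's complement sum of big-endian 16-bit words.

    An odd-length input is padded on the right with one zero byte,
    which is used only for the computation.
    """
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total


def covered_data(app_data: bytes, cscov: int) -> bytes:
    """Return the part of the application data the checksum covers."""
    if cscov == 0:
        return app_data                      # full coverage
    covered_len = (cscov - 1) * 4
    if not 1 <= cscov <= 15 or covered_len > len(app_data):
        raise ValueError("invalid CsCov for this packet")
    return app_data[:covered_len]


def header_checksum(pseudoheader: bytes, header_and_options: bytes,
                    app_data: bytes, cscov: int) -> int:
    """16-bit one's complement of the one's complement sum."""
    covered = pseudoheader + header_and_options + covered_data(app_data, cscov)
    return ~ones_complement_sum16(covered) & 0xFFFF


# Hypothetical 12-byte header with the Checksum field zeroed.
hdr = bytes(12)
# With CsCov = 1, no application data is covered, so changing the
# payload leaves the checksum unchanged.
same = header_checksum(b"", hdr, b"abcdefgh", 1) == \
    header_checksum(b"", hdr, b"XXXXXXXX", 1)
print(same)
```

A receiver would run the same computation over the received bytes, which is why corruption outside the covered prefix goes undetected by this checksum.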

   Checksum coverage may eventually impact congestion control
   mechanisms as well. A packet with corrupt application data and
   complete checksum coverage is treated as lost. This incurs a
   heavy-duty loss response from the sender's congestion control
   mechanism, which can unfairly penalize connections on links with
   high background corruption. The combination of reduced checksum
   coverage and Data Checksum options may let endpoints report packets
   as corrupt rather than dropped, using Data Dropped options and Drop
   Code 3 (see Section 11.7). This may eventually benefit
   applications. However, further research is required to determine an
   appropriate response to corruption, which can sometimes correlate
   with congestion. Corrupt packets currently incur a loss response.

   The Data Checksum option, which contains a strong CRC, lets
   endpoints detect application data corruption. An API can then be
   used to avoid delivering corrupt data to the application, even if
   links deliver corrupt data to the endpoint due to reduced checksum
   coverage. However, the use of reduced checksum coverage for
   applications that demand correct data is currently considered
   experimental. This is because the combined loss-plus-corruption
   rate for packets with reduced checksum coverage may be significantly
   higher than that for packets with full checksum coverage, although
   the loss rate will generally be lower. Actual behavior will depend
   on link design; further research and experience are required.

   Reduced checksum coverage introduces some security considerations;
   see Section 18.1. See Appendix B for further motivation and
   discussion. DCCP's implementation of reduced checksum coverage was
   inspired by UDP-Lite [RFC 3828].

9.1. Header Checksum Field

   DCCP uses the TCP/IP checksum algorithm.
   The Checksum field in the DCCP generic header (see Section 5.1)
   equals the 16 bit one's complement of the one's complement sum of
   all 16 bit words in the DCCP header, DCCP options, a pseudoheader
   taken from the network-layer header, and, depending on the value of
   the Checksum Coverage field, some or all of the application data.
   When calculating the checksum, the Checksum field itself is treated
   as 0. If a packet contains an odd number of header and payload
   bytes to be checksummed, 8 zero bits are added on the right to form
   a 16 bit word for checksum purposes. The pad byte is not
   transmitted as part of the packet.

   The pseudoheader is calculated as for TCP. For IPv4, it is 96 bits
   long, and consists of the IPv4 source and destination addresses, the
   IP protocol number for DCCP (padded on the left with 8 zero bits),
   and the DCCP length as a 16-bit quantity (the length of the DCCP
   header with options, plus the length of any data); see RFC 793
   (Section 3.1). For IPv6, it is 320 bits long, and consists of the
   IPv6 source and destination addresses, the DCCP length as a 32-bit
   quantity, and the IP protocol number for DCCP (padded on the left
   with 24 zero bits); see RFC 2460 (Section 8.1).

   Packets with invalid header checksums MUST be ignored. In
   particular, their options MUST NOT be processed.

9.2. Header Checksum Coverage Field

   The Checksum Coverage field in the DCCP generic header (see Section
   5.1) specifies what parts of the packet are covered by the Checksum
   field, as follows:

   CsCov = 0      The Checksum field covers the DCCP header, DCCP
                  options, network-layer pseudoheader, and all
                  application data in the packet, possibly padded on
                  the right with zeros to an even number of bytes.
3412 CsCov = 1-15 The Checksum field covers the DCCP header, DCCP 3413 options, network-layer pseudoheader, and the initial 3414 (CsCov-1)*4 bytes of the packet's application data. 3416 Thus, if CsCov is 1, none of the application data is protected by 3417 the header checksum. The value (CsCov-1)*4 MUST be less than or 3418 equal to the length of the application data. Packets with invalid 3419 CsCov values MUST be ignored; in particular, their options MUST NOT 3420 be processed. The meanings of values other than 0 and 1 should be 3421 considered experimental. 3423 Values other than 0 specify that corruption is acceptable in some or 3424 all of the DCCP packet's application data. In fact, DCCP cannot 3425 even detect corruption in areas not covered by the header checksum, 3426 unless the Data Checksum option is used. Applications should not 3427 make any assumptions about the correctness of received data not 3428 covered by the checksum, and should if necessary introduce their own 3429 validity checks. 3431 A DCCP application interface should let sending applications suggest 3432 a value for CsCov for sent packets, defaulting to 0 (full coverage). 3433 The Minimum Checksum Coverage feature, described below, lets an 3434 endpoint refuse delivery of application data on packets with partial 3435 checksum coverage; by default, only fully-covered application data 3436 is accepted. Lower layers that support partial error detection MAY 3437 use the Checksum Coverage field as a hint of where errors do not 3438 need to be detected. Lower layers MUST use a strong error detection 3439 mechanism to detect at least errors that occur in the sensitive part 3440 of the packet, and discard damaged packets. The sensitive part 3441 consists of the bytes between the first byte of the IP header and 3442 the last byte identified by Checksum Coverage. 3444 For more details on application and lower-layer interface issues 3445 relating to partial checksumming, see [RFC 3828]. 3447 9.2.1. 
Minimum Checksum Coverage Feature 3449 The Minimum Checksum Coverage feature lets a DCCP endpoint determine 3450 whether its peer is willing to accept packets with reduced Checksum 3451 Coverage. For example, DCCP A sends a "Change R(Minimum Checksum 3452 Coverage, 1)" option to DCCP B to check whether B is willing to 3453 accept packets with Checksum Coverage set to 1. 3455 Minimum Checksum Coverage has feature number 8, and is server- 3456 priority. It takes one-byte integer values between 0 and 15; values 3457 of 16 or more are reserved. Minimum Checksum Coverage/B reflects 3458 values of Checksum Coverage that DCCP B finds unacceptable. Say 3459 that the value of Minimum Checksum Coverage/B is MinCsCov. Then: 3461 o If MinCsCov = 0, then DCCP B only finds packets with CsCov = 0 3462 acceptable. 3464 o If MinCsCov > 0, then DCCP B additionally finds packets with 3465 CsCov >= MinCsCov acceptable. 3467 DCCP B MAY refuse to process application data from packets with 3468 unacceptable Checksum Coverage. Such packets SHOULD be reported 3469 using Data Dropped options (Section 11.7) with Drop Code 0, Protocol 3470 Constraints. New connections start with Minimum Checksum Coverage 0 3471 for both endpoints. 3473 9.3. Data Checksum Option 3475 The Data Checksum option holds a 32-bit CRC-32c cyclic redundancy- 3476 check code of a DCCP packet's application data. 3478 +--------+--------+--------+--------+--------+--------+ 3479 |00101100|00000110| CRC-32c | 3480 +--------+--------+--------+--------+--------+--------+ 3481 Type=44 Length=6 3483 The sending DCCP computes the CRC of the bytes comprising the 3484 application data area and stores it in the option data. The CRC-32c 3485 algorithm used for Data Checksum is the same as that used for SCTP 3486 [RFC 3309]; note that the CRC-32c of zero bytes of data equals zero. 3487 The DCCP header checksum will cover the Data Checksum option, so the 3488 data checksum must be computed before the header checksum. 
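As a non-normative illustration, the following Python sketch collects the checksum rules of Sections 9.1 through 9.3: the one's complement header checksum, the amount of application data selected by CsCov, the Minimum Checksum Coverage acceptability test, and a bitwise CRC-32c. All function names are hypothetical and are not part of this specification. The sketch assumes the caller supplies the pseudoheader bytes and a DCCP header whose Checksum field is already zero. It does not address the byte order in which the CRC is stored in the option, which follows [RFC 3309].

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad byte is used for the sum only, never transmitted
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def covered_data(app_data: bytes, cscov: int) -> bytes:
    """Application data protected by the header checksum for a given CsCov."""
    if not 0 <= cscov <= 15:
        raise ValueError("CsCov must be between 0 and 15")
    if cscov == 0:
        return app_data  # full coverage
    covered = (cscov - 1) * 4
    if covered > len(app_data):
        raise ValueError("invalid CsCov: covers more data than the packet has")
    return app_data[:covered]


def header_checksum(pseudoheader: bytes, header_and_options: bytes,
                    app_data: bytes, cscov: int) -> int:
    """Header checksum; header_and_options must carry a zero Checksum field."""
    return internet_checksum(
        pseudoheader + header_and_options + covered_data(app_data, cscov))


def cscov_acceptable(cscov: int, min_cscov: int) -> bool:
    """Minimum Checksum Coverage test applied by the receiving endpoint."""
    if cscov == 0:
        return True
    return min_cscov > 0 and cscov >= min_cscov


def crc32c(data: bytes) -> int:
    """Bitwise CRC-32c (Castagnoli polynomial, reflected form 0x82F63B78)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF
```

A sender following this sketch would call crc32c() on the application data first, place the result in the Data Checksum option, and only then call header_checksum(), since the header checksum covers the option.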
3490 A DCCP endpoint receiving a packet with a Data Checksum option 3491 SHOULD compute the received application data's CRC-32c, using the 3492 same algorithm as the sender, and compare the result with the Data 3493 Checksum value. (The endpoint can indicate its willingness to check 3494 Data Checksums using the Check Data Checksum feature, described 3495 below.) If the CRCs differ, the endpoint reacts in one of two ways. 3497 o The receiving application may have requested delivery of known- 3498 corrupt data via some optional API. In this case, the packet's 3499 data MUST be delivered to the application, with a note that it is 3500 known to be corrupt. Furthermore, the receiving endpoint MUST 3501 report the packet as delivered corrupt using a Data Dropped 3502 option (Drop Code 7, Delivered Corrupt). 3504 o Otherwise, the receiving endpoint MUST drop the application data, 3505 and report that data as dropped due to corruption using a Data 3506 Dropped option (Drop Code 3, Corrupt). 3508 In either case, the packet is considered acknowledgeable (since its 3509 header was processed), and will therefore be acknowledged using the 3510 equivalent of Ack Vector's Received or Received ECN Marked states. 3512 Although Data Checksum is intended for packets containing 3513 application data, it may be included on other packets, such as DCCP- 3514 Ack, DCCP-Sync, and DCCP-SyncAck. The receiver SHOULD calculate the 3515 application data area's CRC-32c on such packets, just as it does for 3516 DCCP-Data and similar packets; and if the CRCs differ, the packets 3517 similarly MUST be reported using Data Dropped options (Drop Code 3), 3518 although their application data areas would not be delivered to the 3519 application in any case. 3521 9.3.1. Check Data Checksum Feature 3523 The Check Data Checksum feature lets a DCCP endpoint determine 3524 whether its peer will definitely check Data Checksum options. 
3525 DCCP A sends a Mandatory "Change R(Check Data Checksum, 1)" option 3526 to DCCP B to require it to check Data Checksum options (the 3527 connection will be reset if it cannot). 3529 Check Data Checksum has feature number 9, and is server-priority. 3530 It takes one-byte Boolean values. DCCP B MUST check any received 3531 Data Checksum options when Check Data Checksum/B is one, although it 3532 MAY check them even when Check Data Checksum/B is zero. Values of 3533 two or more are reserved. New connections start with Check Data 3534 Checksum 0 for both endpoints. 3536 9.3.2. Usage Notes 3538 Internet links must normally apply strong integrity checks to the 3539 packets they transmit [RFC 3828, RFC 3819]. This is the default 3540 case when the DCCP header's Checksum Coverage value equals zero 3541 (full coverage). However, the DCCP Checksum Coverage value might 3542 not be zero. By setting partial Checksum Coverage, the application 3543 indicates that it can tolerate corruption in the unprotected part of 3544 the application data. Recognizing this, link layers may reduce 3545 error detection and/or correction strength when transmitting this 3546 unprotected part. This, in turn, can significantly increase the 3547 likelihood of the endpoint receiving corrupt data; Data Checksum 3548 lets the receiver detect that corruption with very high probability. 3550 10. Congestion Control 3552 Each congestion control mechanism supported by DCCP is assigned a 3553 congestion control identifier, or CCID: a number from 0 to 255. 3554 During connection setup, and optionally thereafter, the endpoints 3555 negotiate their congestion control mechanisms by negotiating the 3556 values for their Congestion Control ID features. Congestion Control 3557 ID has feature number 1. The CCID/A value equals the CCID in use 3558 for the A-to-B half-connection. DCCP B sends a "Change R(CCID, K)" 3559 option to ask DCCP A to use CCID K for its data packets. 
3561 CCID is a server-priority feature, so CCID negotiation options can 3562 list multiple acceptable CCIDs, sorted in descending order of 3563 priority. For example, the option "Change R(CCID, 2 3 4)" asks the 3564 receiver to use CCID 2 for its packets, although CCIDs 3 and 4 are 3565 also acceptable. (This corresponds to the bytes "35, 6, 1, 2, 3, 3566 4": Change R option (35), option length (6), feature ID (1), CCIDs 3567 (2, 3, 4).) Similarly, "Confirm L(CCID, 2, 2 3 4)" tells the 3568 receiver that the sender is using CCID 2 for its packets, but that 3569 CCIDs 3 and 4 might also be acceptable. 3571 Currently allocated CCIDs are as follows. 3573 CCID Meaning Reference 3574 ---- ------- --------- 3575 0-1 Reserved 3576 2 TCP-like Congestion Control [RFC TBA] 3577 3 TFRC Congestion Control [RFC TBA] 3578 4-255 Reserved 3580 Table 5: DCCP Congestion Control Identifiers 3582 New connections start with CCID 2 for both endpoints. If this is 3583 unacceptable for a DCCP endpoint, that endpoint MUST send Mandatory 3584 Change(CCID) options on its first packets. 3586 All CCIDs standardized for use with DCCP will correspond to 3587 congestion control mechanisms previously standardized by the IETF. 3588 We expect that for quite some time, all such mechanisms will be TCP- 3589 friendly, but TCP-friendliness is not an explicit DCCP requirement. 3591 A DCCP implementation intended for general use, such as an 3592 implementation in a general-purpose operating system kernel, SHOULD 3593 implement at least CCID 2. The intent is to make CCID 2 broadly 3594 available for interoperability, although particular applications 3595 might disallow its use. 3597 10.1. TCP-like Congestion Control 3599 CCID 2, TCP-like Congestion Control, denotes Additive Increase, 3600 Multiplicative Decrease (AIMD) congestion control with behavior 3601 modelled directly on TCP, including congestion window, slow start, 3602 timeouts, and so forth [RFC 2581].
CCID 2 achieves maximum 3603 bandwidth over the long term, consistent with the use of end-to-end 3604 congestion control, but halves its congestion window in response to 3605 each congestion event. This leads to the abrupt rate changes 3606 typical of TCP. Applications should use CCID 2 if they prefer 3607 maximum bandwidth utilization to steadiness of rate. This is often 3608 the case for applications that are not playing their data directly 3609 to the user. For example, a hypothetical application that 3610 transferred files over DCCP, using application-level retransmissions 3611 for lost packets, would prefer CCID 2 to CCID 3. On-line games may 3612 also prefer CCID 2. 3614 CCID 2 is further described in [CCID 2 PROFILE]. 3616 10.2. TFRC Congestion Control 3618 CCID 3 denotes TCP-Friendly Rate Control (TFRC), an equation-based 3619 rate-controlled congestion control mechanism. TFRC is designed to 3620 be reasonably fair when competing for bandwidth with TCP-like flows, 3621 where a flow is "reasonably fair" if its sending rate is generally 3622 within a factor of two of the sending rate of a TCP flow under the 3623 same conditions. However, TFRC has a much lower variation of 3624 throughput over time compared with TCP, which makes CCID 3 more 3625 suitable than CCID 2 for applications such as streaming media where a 3626 relatively smooth sending rate is of importance. 3628 CCID 3 is further described in [CCID 3 PROFILE]. The TFRC 3629 congestion control algorithms were initially described in RFC 3448. 3631 10.3. CCID-Specific Options, Features, and Reset Codes 3633 Half of the option types, feature numbers, and Reset Codes are 3634 reserved for CCID-specific use. CCIDs may often need new options, 3635 for communicating acknowledgement or rate information, for example; 3636 reserved option spaces let CCIDs create options at will without 3637 polluting the global option space.
Option 128 might have different 3638 meanings on a half-connection using CCID 4 and a half-connection 3639 using CCID 8. CCID-specific options and features will never 3640 conflict with global options and features introduced by later 3641 versions of this specification. 3643 Any packet may contain information meant for either half-connection, 3644 so CCID-specific option types, feature numbers, and Reset Codes 3645 explicitly signal the half-connection to which they apply. 3647 o Option numbers 128 through 191 are for options sent from the HC- 3648 Sender to the HC-Receiver; option numbers 192 through 255 are for 3649 options sent from the HC-Receiver to the HC-Sender. 3651 o Reset Codes 128 through 191 indicate that the HC-Sender reset the 3652 connection (most likely because of some problem with 3653 acknowledgements sent by the HC-Receiver); Reset Codes 192 3654 through 255 indicate that the HC-Receiver reset the connection 3655 (most likely because of some problem with data packets sent by 3656 the HC-Sender). 3658 o Finally, feature numbers 128 through 191 are used for features 3659 located at the HC-Sender; feature numbers 192 through 255 are for 3660 features located at the HC-Receiver. Since Change L and 3661 Confirm L options for a feature are sent by the feature location, 3662 we know that any Change L(128) option was sent by the HC-Sender, 3663 while any Change L(192) option was sent by the HC-Receiver. 3664 Similarly, Change R(128) options are sent by the HC-Receiver, 3665 while Change R(192) options are sent by the HC-Sender. 3667 For example, consider a DCCP connection where the A-to-B half- 3668 connection uses CCID 4 and the B-to-A half-connection uses CCID 5. 3669 Here is how a sampling of CCID-specific options are assigned to 3670 half-connections. 3672 Relevant Relevant 3673 Packet Option Half-conn. CCID 3674 ------ ------ ---------- ---- 3675 A > B 128 A-to-B 4 3676 A > B 192 B-to-A 5 3677 A > B Change L(128, ...) 
A-to-B 4 3678 A > B Change R(192, ...) A-to-B 4 3679 A > B Confirm L(128, ...) A-to-B 4 3680 A > B Confirm R(192, ...) A-to-B 4 3681 A > B Change R(128, ...) B-to-A 5 3682 A > B Change L(192, ...) B-to-A 5 3683 A > B Confirm R(128, ...) B-to-A 5 3684 A > B Confirm L(192, ...) B-to-A 5 3686 B > A 128 B-to-A 5 3687 B > A 192 A-to-B 4 3688 B > A Change L(128, ...) B-to-A 5 3689 B > A Change R(192, ...) B-to-A 5 3690 B > A Confirm L(128, ...) B-to-A 5 3691 B > A Confirm R(192, ...) B-to-A 5 3692 B > A Change R(128, ...) A-to-B 4 3693 B > A Change L(192, ...) A-to-B 4 3694 B > A Confirm R(128, ...) A-to-B 4 3695 B > A Confirm L(192, ...) A-to-B 4 3697 Using CCID-specific options and feature options during a negotiation 3698 for that CCID feature is NOT RECOMMENDED, since it is difficult to 3699 predict the CCID that will be in force when the option is processed. 3700 For example, if a DCCP-Request contains the option sequence 3701 "Change L(CCID, 3), 128", the CCID-specific option "128" may be 3702 processed either by CCID 3 (if the server supports CCID 3) or by the 3703 default CCID 2 (if it does not). However, it is safe to include 3704 CCID-specific options following certain Mandatory Change(CCID) 3705 options. For example, if a DCCP-Request contains the option 3706 sequence "Mandatory, Change L(CCID, 3), 128", then either the "128" 3707 option will be processed by CCID 3 or the connection will be reset. 3709 Servers that do not implement the default CCID 2 might nevertheless 3710 receive CCID 2-specific options on a DCCP-Request packet. (Such a 3711 server MUST send Mandatory Change(CCID) options on its DCCP- 3712 Response, so CCID-specific options on any other packet won't refer 3713 to CCID 2.) The server MUST treat such options as non-understood. 3714 Thus, it will reset the connection on encountering a Mandatory CCID- 3715 specific option, send an empty Confirm for a non-Mandatory Change 3716 option for a CCID-specific feature, and ignore other options. 
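The assignment rules of this section can be expressed as a small lookup. The following Python sketch is illustrative only: the function name and the "option", "L", and "R" labels are hypothetical, where "L" stands for Change L or Confirm L and "R" for Change R or Confirm R. It returns the half-connection to which a CCID-specific option or feature number applies, given the direction of the packet that carries it.

```python
def relevant_half_connection(src: str, dst: str, kind: str, number: int) -> str:
    """Half-connection for a CCID-specific number on a packet sent src > dst.

    kind is "option" for a plain CCID-specific option, "L" for Change L or
    Confirm L, and "R" for Change R or Confirm R.
    """
    if not 128 <= number <= 255:
        raise ValueError("not a CCID-specific number")
    low_range = number <= 191  # 128-191 belongs to the HC-Sender side

    if kind == "option":
        # 128-191: sent by the HC-Sender; 192-255: sent by the HC-Receiver.
        src_is_hc_sender = low_range
    elif kind == "L":
        # Change L / Confirm L are sent by the feature location.
        src_is_hc_sender = low_range
    elif kind == "R":
        # Change R / Confirm R are sent by the endpoint opposite the
        # feature location.
        src_is_hc_sender = not low_range
    else:
        raise ValueError("kind must be 'option', 'L', or 'R'")

    return f"{src}-to-{dst}" if src_is_hc_sender else f"{dst}-to-{src}"
```

For instance, relevant_half_connection("A", "B", "R", 192) yields "A-to-B", matching the "A > B Change R(192, ...)" row of the table above.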
3718 10.4. CCID Profile Requirements 3720 Each CCID Profile document MUST address at least the following 3721 requirements: 3723 o The profile MUST include the name and number of the CCID being 3724 described. 3726 o The profile MUST describe the conditions in which it is likely to 3727 be useful. Often the best way to do this is by comparison to 3728 existing CCIDs. 3730 o The profile MUST list and describe any CCID-specific options, 3731 features, and Reset Codes, and SHOULD list those general options 3732 and features described in this document that are especially 3733 relevant to the CCID. 3735 o Any newly defined acknowledgement mechanism MUST include a way to 3736 transmit ECN Nonce Echoes back to the sender. 3738 o The profile MUST describe the format of data packets, including 3739 any options that should be included and the setting of the CCval 3740 header field. 3742 o The profile MUST describe the format of acknowledgement packets, 3743 including any options that should be included. 3745 o The profile MUST define how data packets are congestion 3746 controlled. This includes responses to congestion events, idle 3747 and application-limited periods, and responses to the DCCP Data 3748 Dropped and Slow Receiver options. CCIDs that implement per- 3749 packet congestion control SHOULD discuss how packet size is 3750 factored in to congestion control decisions. 3752 o The profile MUST specify when acknowledgement packets are 3753 generated, and how they are congestion controlled. 3755 o The profile MUST define when a sender using the CCID is 3756 considered quiescent. 3758 o The profile MUST say whether its CCID's acknowledgements ever 3759 need to be acknowledged, and if so, how often. 3761 10.5. Congestion State 3763 Most congestion control algorithms depend on past history to 3764 determine the current allowed sending rate. 
In CCID 2, this 3765 congestion state includes a congestion window and a measurement of 3766 the number of packets outstanding in the network; in CCID 3, it 3767 includes the lengths of recent loss intervals; and both CCIDs use an 3768 estimate of the round-trip time. Congestion state depends on the 3769 network path, and is invalidated by path changes. Therefore, DCCP 3770 senders and receivers SHOULD reset their congestion state -- 3771 essentially restarting congestion control from "slow start" or 3772 equivalent -- on significant changes in end-to-end path. For 3773 example, an endpoint that sends or receives a Mobile IPv6 Binding 3774 Update message [RFC 3775] SHOULD reset its congestion state for any 3775 corresponding DCCP connections. 3777 A DCCP implementation MAY also reset its congestion state when a 3778 CCID changes (that is, a negotiation for the CCID feature completes 3779 successfully, and the new feature value differs from the old value). 3780 Thus, a connection in a heavily congested environment might evade 3781 end-to-end congestion control by frequently renegotiating a CCID, 3782 just as it could evade end-to-end congestion control by opening new 3783 connections for the same session. This behavior is prohibited. To 3784 prevent it, DCCP implementations MAY limit the rate at which CCID 3785 can be changed -- for instance, by refusing to change a CCID feature 3786 value more than once per minute. 3788 11. Acknowledgements 3790 Congestion control requires receivers to transmit information about 3791 packet losses and ECN marks to senders. DCCP receivers MUST report 3792 all congestion they see, as defined by the relevant CCID profile. 3793 Each CCID says when acknowledgements should be sent, what options 3794 they must use, and so on. 
DCCP acknowledgements are congestion 3795 controlled, although it is not required that the acknowledgement 3796 stream be more than very roughly TCP-friendly; each CCID defines how 3797 acknowledgements are congestion controlled. 3799 Most acknowledgements use DCCP options. For example, on a half- 3800 connection with CCID 2 (TCP-like), the receiver reports 3801 acknowledgement information using the Ack Vector option. This 3802 section describes common acknowledgement options and shows how acks 3803 using those options will commonly work. Full descriptions of the 3804 ack mechanisms used for each CCID are laid out in the CCID profile 3805 specifications. 3807 Acknowledgement options, such as Ack Vector, generally depend on the 3808 DCCP Acknowledgement Number, and are thus only allowed on packet 3809 types that carry that number (all packets except DCCP-Request and 3810 DCCP-Data). Detailed acknowledgement options are not necessarily 3811 required on every packet that carries an Acknowledgement Number, 3812 however. 3814 11.1. Acks of Acks and Unidirectional Connections 3816 DCCP was designed to work well for both bidirectional and 3817 unidirectional flows of data, and for connections that transition 3818 between these states. However, acknowledgements required for a 3819 unidirectional connection are very different from those required for 3820 a bidirectional connection. In particular, unidirectional 3821 connections need to worry about acks of acks. 3823 The ack-of-acks problem arises because some acknowledgement 3824 mechanisms are reliable. For example, an HC-Receiver using CCID 2, 3825 TCP-like Congestion Control, sends Ack Vectors containing completely 3826 reliable acknowledgement information. The HC-Sender should 3827 occasionally inform the HC-Receiver that it has received an ack. If 3828 it did not, the HC-Receiver might resend complete Ack Vector 3829 information, going back to the start of the connection, with every 3830 DCCP-Ack packet! 
However, note that acks-of-acks need not be 3831 reliable themselves: when an ack-of-acks is lost, the HC-Receiver 3832 will simply maintain, and periodically retransmit, old 3833 acknowledgement-related state for a little longer. Therefore, there 3834 is no need for acks-of-acks-of-acks. 3836 When communication is bidirectional, any required acks-of-acks are 3837 automatically contained in normal acknowledgements for data packets. 3838 On a unidirectional connection, however, the receiver DCCP sends no 3839 data, so the sender would not normally send acknowledgements. 3840 Therefore, the CCID in force on that half-connection must explicitly 3841 say whether, when, and how the HC-Sender should generate acks-of- 3842 acks. 3844 For example, consider a bidirectional connection where both half- 3845 connections use the same CCID (either 2 or 3), and where DCCP B goes 3846 "quiescent". This means that the connection becomes unidirectional: 3847 DCCP B stops sending data, and sends only DCCP-Ack packets to 3848 DCCP A. For example, in CCID 2, TCP-like Congestion Control, DCCP B 3849 uses Ack Vector to reliably communicate which packets it has 3850 received. As described above, DCCP A must occasionally acknowledge 3851 a pure acknowledgement from DCCP B, so that B can free old Ack 3852 Vector state. For instance, A might send a DCCP-DataAck packet 3853 every now and then, instead of DCCP-Data. In contrast, in CCID 3, 3854 TFRC Congestion Control, DCCP B's acknowledgements generally need 3855 not be reliable, since they contain cumulative loss rates; TFRC 3856 works even if every DCCP-Ack is lost. Therefore, DCCP A need never 3857 acknowledge an acknowledgement. 3859 When communication is unidirectional, a single CCID -- in the 3860 example, the A-to-B CCID -- controls both DCCPs' acknowledgements, 3861 in terms of their content, their frequency, and so forth.
For 3862 bidirectional connections, the A-to-B CCID governs DCCP B's 3863 acknowledgements (including its acks of DCCP A's acks), while the B- 3864 to-A CCID governs DCCP A's acknowledgements. 3866 DCCP A switches its ack pattern from bidirectional to unidirectional 3867 when it notices that DCCP B has gone quiescent. It switches from 3868 unidirectional to bidirectional when it must acknowledge even a 3869 single DCCP-Data or DCCP-DataAck packet from DCCP B. 3871 Each CCID defines how to detect quiescence on that CCID, and how 3872 that CCID handles acks-of-acks on unidirectional connections. The 3873 B-to-A CCID defines when DCCP B has gone quiescent. Usually, this 3874 happens when a period has passed without B sending any data packets; 3875 in CCID 2, for example, this period is the maximum of 0.2 seconds 3876 and two round-trip times. The A-to-B CCID defines how DCCP A 3877 handles acks-of-acks once DCCP B has gone quiescent. 3879 11.2. Ack Piggybacking 3881 Acknowledgements of A-to-B data MAY be piggybacked on data sent by 3882 DCCP B, as long as that does not delay the acknowledgement longer 3883 than the A-to-B CCID would find acceptable. However, data 3884 acknowledgements often require more than 4 bytes to express. A 3885 large set of acknowledgements prepended to a large data packet might 3886 exceed the allowed maximum packet size. In this case, DCCP B SHOULD 3887 send separate DCCP-Data and DCCP-Ack packets, or wait, but not too 3888 long, for a smaller datagram. 3890 Piggybacking is particularly common at DCCP A when the B-to-A half- 3891 connection is quiescent -- that is, when DCCP A is just 3892 acknowledging DCCP B's acknowledgements. There are three reasons to 3893 acknowledge DCCP B's acknowledgements: to allow DCCP B to free up 3894 information about previously acknowledged data packets from A; to 3895 shrink the size of future acknowledgements; and to manipulate the 3896 rate at which future acknowledgements are sent. 
Since these are 3897 secondary concerns, DCCP A can generally afford to wait indefinitely 3898 for a data packet to piggyback its acknowledgement onto; if DCCP B 3899 wants to elicit an acknowledgement, it can send a DCCP-Sync. 3901 Any restrictions on ack piggybacking are described in the relevant 3902 CCID's profile. 3904 11.3. Ack Ratio Feature 3906 The Ack Ratio feature lets HC-Senders influence the rate at which 3907 HC-Receivers generate DCCP-Ack packets, thus controlling reverse- 3908 path congestion. This differs from TCP, which presently has no 3909 congestion control for pure acknowledgement traffic. Ack Ratio 3910 reverse-path congestion control does not try to be TCP-friendly. It 3911 just tries to avoid congestion collapse, and to be somewhat better 3912 than TCP in the presence of a high packet loss or mark rate on the 3913 reverse path. 3915 Ack Ratio applies to CCIDs whose HC-Receivers clock acknowledgements 3916 off the receipt of data packets. The value of Ack Ratio/A equals 3917 the rough ratio of data packets sent by DCCP A to DCCP-Ack packets 3918 sent by DCCP B. Higher Ack Ratios correspond to lower DCCP-Ack 3919 rates; the sender raises Ack Ratio when the reverse path is 3920 congested and lowers Ack Ratio when it is not. Each CCID profile 3921 defines how it controls congestion on the acknowledgement path, and, 3922 particularly, whether Ack Ratio is used. CCID 2, for example, uses 3923 Ack Ratio for acknowledgement congestion control, but CCID 3 does 3924 not. However, each Ack Ratio feature has a value whether or not 3925 that value is used by the relevant CCID. 3927 Ack Ratio has feature number 5, and is non-negotiable. It takes 3928 two-byte integer values. An Ack Ratio/A value of four means that 3929 DCCP B will send at least one acknowledgement packet for every four 3930 data packets sent by DCCP A. DCCP A sends a "Change L(Ack Ratio)" 3931 option to notify DCCP B of its ack ratio. 
An Ack Ratio value of 3932 zero indicates that the relevant half-connection does not use an Ack 3933 Ratio to control its acknowledgement rate. New connections start 3934 with Ack Ratio 2 for both endpoints; this Ack Ratio results in 3935 acknowledgement behavior analogous to TCP's delayed acks. 3937 Ack Ratio should be treated as a guideline rather than a strict 3938 requirement. We intend Ack Ratio-controlled acknowledgement 3939 behavior to resemble TCP's acknowledgement behavior when there is no 3940 reverse-path congestion, and to be somewhat more conservative when 3941 there is reverse-path congestion. Following this intent is more 3942 important than implementing Ack Ratio precisely. In particular: 3944 o Receivers MAY piggyback acknowledgement information on data 3945 packets, creating DCCP-DataAck packets. The Ack Ratio does not 3946 apply to piggybacked acknowledgements. However, if the data 3947 packets are too big to carry acknowledgement information, or the 3948 data sending rate is lower than Ack Ratio would suggest, then 3949 DCCP B SHOULD send enough pure DCCP-Ack packets to maintain the 3950 rate of one acknowledgement per Ack Ratio received data packets. 3952 o Receivers MAY rate-pace their acknowledgements, rather than 3953 sending acknowledgements immediately upon the receipt of data 3954 packets. Receivers that rate-pace acknowledgements SHOULD pick a 3955 rate that approximates the effect of Ack Ratio, and SHOULD 3956 include Elapsed Time options (Section 13.2) to help the sender 3957 calculate round-trip times. 3959 o Receivers SHOULD implement delayed acknowledgement timers like 3960 TCP's, whereby any packet's acknowledgement is delayed by at most 3961 T seconds. This delay lets the receiver collect additional 3962 packets to acknowledge, and thus reduce the per-packet overhead 3963 of acknowledgements; but if T seconds have passed by and the ack 3964 is still around, it is sent out right away. 
The default value of 3965 T should be 0.2 seconds, as is common in TCP implementations. 3966 This may lead to sending more acknowledgement packets than Ack 3967 Ratio would suggest. 3969 o Receivers SHOULD send acknowledgements immediately on receiving 3970 packets marked ECN Congestion Experienced, or packets whose out- 3971 of-order sequence numbers potentially indicate loss. However, 3972 there is no need to send such immediate acknowledgements for 3973 marked packets more than once per round-trip time. 3975 o Receivers MAY ignore Ack Ratio if they perform their own 3976 congestion control on acknowledgements. For example, a receiver 3977 that knows the loss and mark rate for its DCCP-Ack packets might 3978 maintain a TCP-friendly acknowledgement rate on its own. Such a 3979 receiver MUST either ensure that it always obtains sufficient 3980 acknowledgement loss and mark information, or fall back to Ack 3981 Ratio when sufficient information is not available, as might 3982 happen during periods when the receiver is quiescent. 3984 11.4. Ack Vector Options 3986 The Ack Vector gives a run-length encoded history of data packets 3987 received at the client. Each byte of the vector gives the state of 3988 that data packet in the loss history, and the number of preceding 3989 packets with the same state. The option's data looks like this: 3991 +--------+--------+--------+--------+--------+-------- 3992 |0010011?| Length |SSLLLLLL|SSLLLLLL|SSLLLLLL| ... 3993 +--------+--------+--------+--------+--------+-------- 3994 Type=38/39 \___________ Vector ___________... 3996 The two Ack Vector options (option types 38 and 39) differ only in 3997 the values they imply for ECN Nonce Echo. Section 12.2 describes 3998 this further. 
4000 The vector itself consists of a series of bytes, each of whose 4001 encoding is: 4003 0 1 2 3 4 5 6 7 4004 +-+-+-+-+-+-+-+-+ 4005 |Sta| Run Length| 4006 +-+-+-+-+-+-+-+-+ 4008 Sta[te] occupies the most significant two bits of each byte, and can 4009 have one of four values, as follows. 4011 State Meaning 4012 ----- ------- 4013 0 Received 4014 1 Received ECN Marked 4015 2 Reserved 4016 3 Not Yet Received 4018 Table 6: DCCP Ack Vector States 4020 The term "ECN marked" refers to packets with ECN code point 11, CE 4021 (Congestion Experienced); packets received with this ECN code point 4022 MUST be reported using State 1, Received ECN Marked. Packets 4023 received with other ECN code points 00, 01, or 10 (Non-ECT, ECT(1), 4024 or ECT(0), respectively) MUST be reported using State 0, Received. 4026 Run Length, the least significant six bits of each byte, specifies 4027 how many consecutive packets have the given State. Run Length zero 4028 says the corresponding State applies to one packet only; Run Length 4029 63 says it applies to 64 consecutive packets. Runs of 65 or 4030 more packets must be encoded in multiple bytes. 4032 The first byte in the first Ack Vector option refers to the packet 4033 indicated in the Acknowledgement Number; subsequent bytes refer to 4034 older packets. (Ack Vector MUST NOT be sent on DCCP-Data and DCCP- 4035 Request packets, which lack an Acknowledgement Number.) An Ack 4036 Vector containing the decimal values 0,192,3,64,5, on a packet whose 4037 Acknowledgement Number is decimal 100, indicates that: 4039 Packet 100 was received (Acknowledgement Number 100, State 0, 4040 Run Length 0). 4042 Packet 99 was lost (State 3, Run Length 0). 4044 Packets 98, 97, 96 and 95 were received (State 0, Run Length 3). 4046 Packet 94 was ECN marked (State 1, Run Length 0). 4048 Packets 93, 92, 91, 90, 89, and 88 were received (State 0, Run 4049 Length 5). 4051 A single Ack Vector option can acknowledge up to 16192 data packets.
4052 Should more packets need to be acknowledged than can fit in 253 4053 bytes of Ack Vector, then multiple Ack Vector options can be sent; 4054 the second Ack Vector begins where the first left off, and so forth. 4056 Ack Vector states are subject to two general constraints. (These 4057 principles SHOULD also be followed for other acknowledgement 4058 mechanisms; referring to Ack Vector states simplifies their 4059 explanation.) 4061 1. Packets reported as State 0 or State 1 MUST be acknowledgeable: 4062 their options have been processed by the receiving DCCP stack. 4063 Any data on the packet need not have been delivered to the 4064 receiving application; in fact, the data may have been dropped. 4066 2. Packets reported as State 3 MUST NOT be acknowledgeable. 4067 Feature negotiations and options on such packets MUST NOT have 4068 been processed, and the Acknowledgement Number MUST NOT 4069 correspond to such a packet. 4071 Packets dropped in the application's receive buffer MUST be reported 4072 as Received or Received ECN Marked (States 0 and 1), depending on 4073 their ECN state; such packets' ECN Nonces MUST be included in the 4074 Nonce Echo. The Data Dropped option informs the sender that some 4075 packets reported as received actually had their application data 4076 dropped. 4078 One or more Ack Vector options that, together, report the status of 4079 a packet with sequence number less than ISN, the initial sequence 4080 number, SHOULD be considered invalid. The receiving DCCP SHOULD 4081 either ignore the options or reset the connection with Reset Code 5, 4082 "Option Error". No Ack Vector option can refer to a packet that has 4083 not yet been sent, as the Acknowledgement Number checks in Section 4084 7.5.3 ensure, but because of attack, implementation bug, or 4085 misbehavior, an Ack Vector option can claim that a packet was 4086 received before it is actually delivered; Section 12.2 describes how 4087 this is detected and how senders should react. 
Packets that haven't 4088 been included in any Ack Vector option SHOULD be treated as "not yet 4089 received" (State 3) by the sender. 4091 Appendix A provides a non-normative description of the details of 4092 DCCP acknowledgement handling, in the context of an abstract Ack 4093 Vector implementation. 4095 11.4.1. Ack Vector Consistency 4097 A DCCP sender will commonly receive multiple acknowledgements for 4098 some of its data packets. For instance, an HC-Sender might receive 4099 two DCCP-Acks with Ack Vectors, both of which contained information 4100 about sequence number 24. (Information about a sequence number is 4101 generally repeated in every ack until the HC-Sender acknowledges an 4102 ack. In this case, perhaps the HC-Receiver is sending acks faster 4103 than the HC-Sender is acknowledging them.) In a perfect world, the 4104 two Ack Vectors would always be consistent. However, there are many 4105 reasons why they might not be. For example: 4107 o The HC-Receiver received packet 24 between sending its acks, so 4108 the first ack said 24 was not received (State 3) and the second 4109 said it was received or ECN marked (State 0 or 1). 4111 o The HC-Receiver received packet 24 between sending its acks, and 4112 the network reordered the acks. In this case, the packet will 4113 appear to transition from State 0 or 1 to State 3. 4115 o The network duplicated packet 24, and one of the duplicates was 4116 ECN marked. This might show up as a transition between States 0 4117 and 1. 
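   Comparing two Ack Vectors for the same sequence number first
   requires expanding each vector into per-packet states. The
   following is a minimal, non-normative sketch of the run-length
   decoding described in Section 11.4. The function name
   decode_ack_vector and its arguments are hypothetical, the vector
   bytes are passed as plain integers rather than parsed from a real
   option, and sequence-number wraparound is ignored for brevity.

```python
from typing import Dict, List

# State names taken from Table 6 (DCCP Ack Vector States).
STATE_NAMES = {
    0: "Received",
    1: "Received ECN Marked",
    2: "Reserved",
    3: "Not Yet Received",
}


def decode_ack_vector(ack_number: int, vector: List[int]) -> Dict[int, int]:
    """Map each covered sequence number to its two-bit State.

    Hypothetical helper for illustration only: it ignores sequence
    number wraparound and does not validate the option header.
    """
    states: Dict[int, int] = {}
    seqno = ack_number  # the first byte refers to the Acknowledgement Number
    for byte in vector:
        state = (byte >> 6) & 0x3   # most significant two bits
        run_length = byte & 0x3F    # least significant six bits
        # Run Length 0 covers one packet, Run Length 63 covers 64.
        for _ in range(run_length + 1):
            states[seqno] = state
            seqno -= 1              # later entries refer to older packets
    return states


if __name__ == "__main__":
    decoded = decode_ack_vector(100, [0, 192, 3, 64, 5])
    for seqno in sorted(decoded, reverse=True):
        print(seqno, STATE_NAMES[decoded[seqno]])
```

   Run against the example values 0,192,3,64,5 with Acknowledgement
   Number 100, the sketch covers packets 100 down to 88 and reports
   packet 99 as State 3 and packet 94 as State 1.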
   To cope with these situations, HC-Sender DCCP implementations SHOULD
   combine multiple received Ack Vector states according to this table:

                                Received State
                                  0   1   3
                                +---+---+---+
                              0 | 0 |0/1| 0 |
                        Old     +---+---+---+
                              1 | 1 | 1 | 1 |
                       State    +---+---+---+
                              3 | 0 | 1 | 3 |
                                +---+---+---+

   To read the table, choose the row corresponding to the packet's old
   state and the column corresponding to the packet's state in the
   newly received Ack Vector, then read the packet's new state off the
   table. For an old state of 0 (received non-marked) and received
   state of 1 (received ECN marked), the packet's new state may be set
   to either 0 or 1. The HC-Sender implementation will be indifferent
   to ack reordering if it chooses new state 1 for that cell.

   The HC-Receiver should collect information about received packets,
   which it will eventually report to the HC-Sender on one or more
   acknowledgements, according to the following table:

                               Received Packet
                                  0   1   3
                                +---+---+---+
                              0 | 0 |0/1| 0 |
                      Stored    +---+---+---+
                              1 |0/1| 1 | 1 |
                       State    +---+---+---+
                              3 | 0 | 1 | 3 |
                                +---+---+---+

   This table equals the sender's table, except that when the stored
   state is 1 and the received state is 0, the receiver is allowed to
   switch its stored state to 0.

   An HC-Sender MAY choose to throw away old information gleaned from
   the HC-Receiver's Ack Vectors, in which case it MUST ignore newly
   received acknowledgements from the HC-Receiver for those old
   packets. It is often kinder to save recent Ack Vector information
   for a while, so that the HC-Sender can undo its reaction to presumed
   congestion when a "lost" packet unexpectedly shows up (the
   transition from State 3 to State 0).

11.4.2.
Ack Vector Coverage 4168 We can divide the packets that have been sent from an HC-Sender to 4169 an HC-Receiver into four roughly contiguous groups. From oldest to 4170 youngest, these are: 4172 1. Packets already acknowledged by the HC-Receiver, where the HC- 4173 Receiver knows that the HC-Sender has definitely received the 4174 acknowledgements. 4176 2. Packets already acknowledged by the HC-Receiver, where the HC- 4177 Receiver cannot be sure that the HC-Sender has received the 4178 acknowledgements. 4180 3. Packets not yet acknowledged by the HC-Receiver. 4182 4. Packets not yet received by the HC-Receiver. 4184 The union of groups 2 and 3 is called the Acknowledgement Window. 4185 Generally, every Ack Vector generated by the HC-Receiver will cover 4186 the whole Acknowledgement Window: Ack Vector acknowledgements are 4187 cumulative. (This simplifies Ack Vector maintenance at the HC- 4188 Receiver; see Appendix A, below.) As packets are received, this 4189 window both grows on the right and shrinks on the left. It grows 4190 because there are more packets, and shrinks because the data 4191 packets' Acknowledgement Numbers will acknowledge previous 4192 acknowledgements, moving packets from group 2 into group 1. 4194 11.5. Send Ack Vector Feature 4196 The Send Ack Vector feature lets DCCPs negotiate whether they should 4197 use Ack Vector options to report congestion. Ack Vector provides 4198 detailed loss information, and lets senders report back to their 4199 applications whether particular packets were dropped. Send Ack 4200 Vector is mandatory for some CCIDs, and optional for others. 4202 Send Ack Vector has feature number 6, and is server-priority. It 4203 takes one-byte Boolean values. DCCP A MUST send Ack Vector options 4204 on its acknowledgements when Send Ack Vector/A has value one, 4205 although it MAY send Ack Vector options even when Send Ack Vector/A 4206 is zero. Values of two or more are reserved. 
   New connections start with Send Ack Vector 0 for both endpoints.
   DCCP B sends a "Change R(Send Ack Vector, 1)" option to DCCP A to
   ask A to send Ack Vector options as part of its acknowledgement
   traffic.

11.6. Slow Receiver Option

   An HC-Receiver sends the Slow Receiver option to its sender to
   indicate that it is having trouble keeping up with the sender's
   data. The HC-Sender SHOULD NOT increase its sending rate for
   approximately one round-trip time after seeing a packet with a Slow
   Receiver option. After one round-trip time, the effect of Slow
   Receiver disappears and the HC-Sender may again increase its rate,
   so the HC-Receiver SHOULD continue to send Slow Receiver options if
   it needs to prevent the HC-Sender from going faster in the long
   term. The Slow Receiver option does not indicate congestion, and
   the HC-Sender need not reduce its sending rate. (If necessary, the
   receiver can force the sender to slow down by dropping packets, with
   or without Data Dropped, or reporting false ECN marks.) APIs should
   let receiver applications set Slow Receiver, and sending
   applications determine whether or not their receivers are Slow.

   Slow Receiver is a one-byte option.

   +--------+
   |00000010|
   +--------+
    Type=2

   Slow Receiver does not specify why the receiver is having trouble
   keeping up with the sender. Possible reasons include lack of buffer
   space, CPU overload, and application quotas. A sending application
   might react to Slow Receiver by reducing its sending rate, for
   example.

   The sending application should not react to Slow Receiver by sending
   more data, however.
The optimal response to a CPU-bound receiver 4243 might be to increase the sending rate, by switching to a less- 4244 compressed sending format, since a highly-compressed data format 4245 might overwhelm a slow CPU more seriously than the higher memory 4246 requirements of a less-compressed data format. This kind of format 4247 change should be requested at the application level, not via the 4248 Slow Receiver option. 4250 Slow Receiver implements a portion of TCP's receive window 4251 functionality. 4253 11.7. Data Dropped Option 4255 The Data Dropped option indicates that the application data on one 4256 or more received packets did not actually reach the application. 4257 Data Dropped additionally reports why the data was dropped: perhaps 4258 the data was corrupt, or perhaps the receiver cannot keep up with 4259 the sender's current rate and the data was dropped in some receive 4260 buffer. Using Data Dropped, DCCP endpoints can discriminate between 4261 different kinds of loss; this differs from TCP, in which all loss is 4262 reported the same way. 4264 Unless explicitly specified otherwise, DCCP congestion control 4265 mechanisms MUST react as if each Data Dropped packet was marked as 4266 ECN Congestion Experienced by the network. We intend for Data 4267 Dropped to enable research into richer congestion responses to 4268 corrupt and other endpoint-dropped packets, but DCCP CCIDs MUST 4269 react conservatively to Data Dropped until this behavior is 4270 standardized. Section 11.7.2, below, describes congestion responses 4271 for all current Drop Codes. 4273 If a received packet's application data is dropped for one of the 4274 reasons listed below, this SHOULD be reported using a Data Dropped 4275 option. Alternatively, the receiver MAY choose to report as 4276 "received" only those packets whose data were not dropped, subject 4277 to the constraint that packets not reported as received MUST NOT 4278 have had their options processed. 
   The option's data looks like this:

   +--------+--------+--------+--------+--------+--------
   |00101000| Length | Block  | Block  | Block  |  ...
   +--------+--------+--------+--------+--------+--------
    Type=40          \___________ Vector ___________ ...

   The Vector consists of a series of bytes, called Blocks, each of
   whose encoding corresponds to one of two choices:

    0 1 2 3 4 5 6 7                  0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+                +-+-+-+-+-+-+-+-+
   |0| Run Length  |       or       |1|DrpCd|Run Len|
   +-+-+-+-+-+-+-+-+                +-+-+-+-+-+-+-+-+
     Normal Block                      Drop Block

   The first byte in the first Data Dropped option refers to the packet
   indicated in the Acknowledgement Number; subsequent bytes refer to
   older packets. (Data Dropped MUST NOT be sent on DCCP-Data or
   DCCP-Request packets, which lack an Acknowledgement Number, and any
   Data Dropped options received on these packet types MUST be
   ignored.) Normal Blocks, which have high bit 0, indicate that any
   received packets in the Run Length had their data delivered to the
   application. Drop Blocks, which have high bit 1, indicate that
   received packets in the Run Len[gth] were not delivered as usual.
   The 3-bit Drop Code [DrpCd] field says what happened; generally, no
   data from that packet reached the application. Packets reported as
   "not yet received" MUST be included in Normal Blocks; packets not
   covered by any Data Dropped option are treated as if they were in a
   Normal Block. Defined Drop Codes for Drop Blocks are as follows.

                  Drop Code  Meaning
                  ---------  -------
                      0      Protocol Constraints
                      1      Application Not Listening
                      2      Receive Buffer
                      3      Corrupt
                     4-6     Reserved
                      7      Delivered Corrupt

                     Table 7: DCCP Drop Codes

   In more detail:

   0   The packet data was dropped due to protocol constraints.
       For example, the data was included on a DCCP-Request packet,
       but the receiving application does not allow such
       piggybacking; or the data was included on a packet with
       inappropriately low Checksum Coverage.

   1   The packet data was dropped because the application is no
       longer listening. See Section 11.7.2.

   2   The packet data was dropped in a receive buffer, probably
       because of receive buffer overflow. See Section 11.7.2.

   3   The packet data was dropped due to corruption. See Section
       9.3.

   7   The packet data was corrupted, but delivered to the
       application anyway. See Section 9.3.

   For example, assume a packet arrives with Acknowledgement Number
   100, an Ack Vector reporting all packets as received, and a Data
   Dropped option containing the decimal values 0,160,3,162. Then:

      Packet 100 was received (Acknowledgement Number 100, Normal
      Block, Run Length 0).

      Packet 99 was dropped in a receive buffer (Drop Block, Drop Code
      2, Run Length 0).

      Packets 98, 97, 96, and 95 were received (Normal Block, Run
      Length 3).

      Packets 94, 93, and 92 were dropped in the receive buffer (Drop
      Block, Drop Code 2, Run Length 2).

   Run lengths of more than 128 (for Normal Blocks) or 16 (for Drop
   Blocks) must be encoded in multiple Blocks. A single Data Dropped
   option can acknowledge up to 32384 Normal Block data packets,
   although the receiver SHOULD NOT send a Data Dropped option when all
   relevant packets fit into Normal Blocks. Should more packets need
   to be acknowledged than can fit in 253 bytes of Data Dropped, then
   multiple Data Dropped options can be sent. The second option will
   begin where the first left off, and so forth.
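   The Block encoding above can be sketched as follows. This is a
   non-normative illustration: the function name decode_data_dropped
   is hypothetical, the Blocks are passed as plain integers, and
   sequence-number wraparound is ignored. As in the worked example, a
   Run Length of zero is taken to cover exactly one packet.

```python
from typing import List, Optional, Tuple

# Drop Code names taken from Table 7 (codes 4-6 are reserved).
DROP_CODES = {
    0: "Protocol Constraints",
    1: "Application Not Listening",
    2: "Receive Buffer",
    3: "Corrupt",
    7: "Delivered Corrupt",
}


def decode_data_dropped(
    ack_number: int, blocks: List[int]
) -> List[Tuple[int, Optional[int]]]:
    """Return (sequence number, Drop Code or None) pairs, newest first.

    Hypothetical helper for illustration only. None marks a packet
    covered by a Normal Block.
    """
    result: List[Tuple[int, Optional[int]]] = []
    seqno = ack_number  # the first Block refers to the Acknowledgement Number
    for block in blocks:
        if block & 0x80:                     # Drop Block: high bit 1
            drop_code: Optional[int] = (block >> 4) & 0x7  # 3-bit Drop Code
            run_length = block & 0x0F        # 4-bit Run Length
        else:                                # Normal Block: high bit 0
            drop_code = None
            run_length = block & 0x7F        # 7-bit Run Length
        for _ in range(run_length + 1):      # Run Length 0 covers one packet
            result.append((seqno, drop_code))
            seqno -= 1                       # later Blocks cover older packets
    return result


if __name__ == "__main__":
    for seqno, code in decode_data_dropped(100, [0, 160, 3, 162]):
        label = "delivered" if code is None else DROP_CODES.get(code, "Reserved")
        print(seqno, label)
```

   Because each Block continues where the previous one stopped, the
   final Drop Block in the example values 0,160,3,162 covers the three
   packets immediately older than packet 95.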
4367 One or more Data Dropped options that, together, report the status 4368 of more packets than have been sent, or that change the status of a 4369 packet, or that disagree with Ack Vector or equivalent options (by 4370 reporting a "not yet received" packet as "dropped in the receive 4371 buffer", for example), SHOULD be considered invalid. The receiving 4372 DCCP SHOULD either ignore such options, or respond by resetting the 4373 connection with Reset Code 5, "Option Error". 4375 A DCCP application interface should let receiving applications 4376 specify the Drop Codes corresponding to received packets. For 4377 example, this would let applications calculate their own checksums, 4378 but still report "dropped due to corruption" packets via the Data 4379 Dropped option. The interface SHOULD NOT let applications reduce 4380 the "seriousness" of a packet's Drop Code; for example, the 4381 application should not be able to upgrade a packet from delivered 4382 corrupt (Drop Code 7) to delivered normally (no Drop Code). 4384 Data Dropped information is transmitted reliably. That is, 4385 endpoints SHOULD continue to transmit Data Dropped options until 4386 receiving an acknowledgement indicating that the relevant options 4387 have been processed. In Ack Vector terms, each acknowledgement 4388 should contain Data Dropped options that cover the whole 4389 Acknowledgement Window (Section 11.4.2), although when every packet 4390 in that window would be placed in a Normal Block no actual option is 4391 required. 4393 11.7.1. Data Dropped and Normal Congestion Response 4395 When deciding on a response to a particular acknowledgement or set 4396 of acknowledgements containing Data Dropped options, a congestion 4397 control mechanism MUST consider dropped packets and ECN Congestion 4398 Experienced marks (including marked packets that are included in 4399 Data Dropped), as well as the packets singled out in Data Dropped. 
4400 For window-based mechanisms, the valid response space is defined as 4401 follows. 4403 Assume an old window of W. Independently calculate a new window 4404 W_new1 that assumes no packets were Data Dropped (so W_new1 contains 4405 only the normal congestion response), and a new window W_new2 that 4406 assumes no packets were lost or marked (so W_new2 contains only the 4407 Data Dropped response). We are assuming that Data Dropped 4408 recommended a reduction in congestion window, so W_new2 < W. 4410 Then the actual new window W_new MUST NOT be larger than the minimum 4411 of W_new1 and W_new2; and the sender MAY combine the two responses, 4412 by setting 4413 W_new = W + min(W_new1 - W, 0) + min(W_new2 - W, 0). 4415 The details of how this is accomplished are specified in CCID 4416 profile documents. Non-window-based congestion control mechanisms 4417 MUST behave analogously; again, CCID profiles define how. 4419 11.7.2. Particular Drop Codes 4421 Drop Code 0, Protocol Constraints, does not indicate any kind of 4422 congestion, so the sender's CCID SHOULD react to packets with Drop 4423 Code 0 as if they were received (with or without ECN Congestion 4424 Experienced marks, as appropriate). However, the sending endpoint 4425 SHOULD NOT send data until it believes the protocol constraint no 4426 longer applies. 4428 Drop Code 1, Application Not Listening, means the application 4429 running at the endpoint that sent the option is no longer listening 4430 for data. For example, a server might close its receiving half- 4431 connection to new data after receiving a complete request from the 4432 client. This would limit the amount of state available at the 4433 server for incoming data, and thus reduce the potential damage from 4434 certain denial-of-service attacks. A Data Dropped option containing 4435 Drop Code 1 SHOULD be sent whenever received data is ignored due to 4436 a non-listening application. 
   Once an endpoint reports Drop Code 1 for a packet, it SHOULD report
   Drop Code 1 for every succeeding data packet on that
   half-connection; once an endpoint receives a Drop Code 1 report, it
   SHOULD expect that no more data will ever be delivered to the other
   endpoint's application, so it SHOULD NOT send more data.

   Drop Code 2, Receive Buffer, indicates congestion inside the
   receiving host. For instance, if a drop-from-tail kernel socket
   buffer is too full to accept a packet's application data, that
   packet should be reported as Drop Code 2. For a drop-from-head or
   more complex socket buffer, the dropped packet should be reported as
   Drop Code 2. DCCP implementations may also provide an API by which
   applications can mark received packets as Drop Code 2, indicating
   that the application ran out of space in its user-level receive
   buffer. (However, it is not generally useful to report packets as
   dropped due to Drop Code 2 after more than a couple of round-trip
   times have passed. The HC-Sender may have forgotten its
   acknowledgement state for the packet by that time, so the Data
   Dropped report will have no effect.) Every packet newly
   acknowledged as Drop Code 2 SHOULD reduce the sender's instantaneous
   rate by one packet per round-trip time, unless the sender is already
   sending one packet per RTT or less. Each CCID profile defines the
   CCID-specific mechanism by which this is accomplished.

   Currently, the other Drop Codes, namely Drop Code 3 (Corrupt), Drop
   Code 7 (Delivered Corrupt), and reserved Drop Codes 4-6, MUST cause
   the relevant CCID to behave as if the relevant packets were ECN
   marked (ECN Congestion Experienced).

12. Explicit Congestion Notification

   The DCCP protocol is fully ECN-aware [RFC 3168]. Each CCID
   specifies how its endpoints respond to ECN marks.
   Furthermore, DCCP, unlike TCP, allows senders to control the rate at
   which acknowledgements are generated (with options like Ack Ratio);
   since acknowledgements are congestion-controlled, they also qualify
   as ECN-Capable Transport.

   A CCID profile describes how that CCID interacts with ECN, both for
   data traffic and pure-acknowledgement traffic. A sender SHOULD set
   ECN-Capable Transport on its packets' IP headers, unless the
   receiver's ECN Incapable feature is on or the relevant CCID
   disallows it.

   The rest of this section describes the ECN Incapable feature and the
   interaction of the ECN Nonce with acknowledgement options such as
   Ack Vector.

12.1. ECN Incapable Feature

   DCCP endpoints are ECN-aware by default, but the ECN Incapable
   feature lets an endpoint reject the use of Explicit Congestion
   Notification. The use of this feature is NOT RECOMMENDED. ECN
   incapability both avoids ECN's possible benefits and prevents
   senders from using the ECN Nonce to check for receiver misbehavior.
   A DCCP stack MAY therefore leave the ECN Incapable feature
   unimplemented, acting as if all connections were ECN capable. It is
   worth noting that the inappropriate firewall interactions that
   dogged TCP's implementation of ECN [RFC 3360] involve TCP header
   bits, not the IP header's ECN bits; we know of no middlebox that
   would block ECN-capable DCCP packets, but allow ECN-incapable DCCP
   packets.

   ECN Incapable has feature number 4, and is server-priority. It
   takes one-byte Boolean values. DCCP A MUST be able to read ECN bits
   from received frames' IP headers when ECN Incapable/A is zero.
   (This is independent of whether it can set ECN bits on sent frames.)
   DCCP A thus sends a "Change L(ECN Incapable, 1)" option to DCCP B to
   inform it that A cannot read ECN bits.
If the ECN Incapable/A 4506 feature is one, then all of DCCP B's packets MUST be sent as ECN 4507 incapable. New connections start with ECN Incapable 0 (that is, ECN 4508 capable) for both endpoints. Values of two or more are reserved. 4510 If a DCCP is not ECN capable, it MUST send Mandatory "Change L(ECN 4511 Incapable, 1)" options to the other endpoint until acknowledged (by 4512 "Confirm R(ECN Incapable, 1)") or the connection closes. 4513 Furthermore, it MUST NOT accept any data until the other endpoint 4514 sends "Confirm R(ECN Incapable, 1)". It SHOULD send Data Dropped 4515 options on its acknowledgements, with Drop Code 0 ("protocol 4516 constraints"), if the other endpoint does send data inappropriately. 4518 12.2. ECN Nonces 4520 Congestion avoidance will not occur, and the receiver will sometimes 4521 get its data faster, if the sender isn't told about congestion 4522 events. Thus, the receiver has some incentive to falsify 4523 acknowledgement information, reporting that marked or dropped 4524 packets were actually received unmarked. This problem is more 4525 serious with DCCP than with TCP, since TCP provides reliable 4526 transport: it is more difficult with TCP to lie about lost packets 4527 without breaking the application. 4529 ECN Nonces are a general mechanism to prevent ECN cheating (or loss 4530 cheating). Two values for the two-bit ECN header field indicate 4531 ECN-Capable Transport, 01 and 10. The second code point, 10, is the 4532 ECN Nonce. In general, a protocol sender chooses between these code 4533 points randomly on its output packets, remembering the sequence it 4534 chose. The protocol receiver reports, on every acknowledgement, the 4535 number of ECN Nonces it has received thus far. This is called the 4536 ECN Nonce Echo. Since ECN marking and packet dropping both destroy 4537 the ECN Nonce, a receiver that lies about an ECN mark or packet drop 4538 has a 50% chance of guessing right and avoiding discipline. 
The 4539 sender may react punitively to an ECN Nonce mismatch, possibly up to 4540 dropping the connection. The ECN Nonce Echo field need not be an 4541 integer; one bit is enough to catch 50% of infractions, and the 4542 probability of success drops exponentially as more packets are sent 4543 [RFC 3540]. 4545 In DCCP, the ECN Nonce Echo field is encoded in acknowledgement 4546 options. For example, the Ack Vector option comes in two forms, Ack 4547 Vector [Nonce 0] (option 38) and Ack Vector [Nonce 1] (option 39), 4548 corresponding to the two values for a one-bit ECN Nonce Echo. The 4549 Nonce Echo for a given Ack Vector equals the one-bit sum (exclusive- 4550 or, or parity) of ECN nonces for packets reported by that Ack Vector 4551 as received and not ECN marked. Thus, only packets marked as State 4552 0 matter for this calculation (that is, valid received packets that 4553 were not ECN marked). Every Ack Vector option is detailed enough 4554 for the sender to determine what the Nonce Echo should have been. 4555 It can check this calculation against the actual Nonce Echo, and 4556 complain if there is a mismatch. (The Ack Vector could conceivably 4557 report every packet's ECN Nonce state, but this would severely limit 4558 its compressibility without providing much extra protection.) 4560 Each DCCP sender SHOULD set ECN Nonces on its packets, and remember 4561 which packets had nonces. When a sender detects an ECN Nonce Echo 4562 mismatch, it behaves as described in the next section. Each DCCP 4563 receiver MUST calculate and use the correct value for ECN Nonce Echo 4564 when sending acknowledgement options. 4566 ECN incapability, as indicated by the ECN Incapable feature, is 4567 handled as follows: An endpoint sending packets to an ECN-incapable 4568 receiver MUST send its packets as ECN incapable, and an ECN- 4569 incapable receiver MUST use the value zero for all ECN Nonce Echoes. 4571 12.3. 
Aggression Penalties 4573 DCCP endpoints have several mechanisms for detecting congestion- 4574 related misbehavior. For example: 4576 o A sender can detect an ECN Nonce Echo mismatch, indicating 4577 possible receiver misbehavior. 4579 o A receiver can detect whether the sender is responding to 4580 congestion feedback or Slow Receiver. 4582 o An endpoint may be able to detect that its peer is reporting 4583 inappropriately small Elapsed Time values (Section 13.2). 4585 An endpoint that detects possible congestion-related misbehavior 4586 SHOULD try to verify that its peer is truly misbehaving. For 4587 example, a sending endpoint might send a packet whose ECN header 4588 field is set to Congestion Experienced, 11; a receiver that doesn't 4589 report a corresponding mark is most likely misbehaving. 4591 Upon detecting possible misbehavior, a sender SHOULD respond as if 4592 the receiver had reported one or more recent packets as ECN-marked 4593 (instead of unmarked), while a receiver SHOULD report one or more 4594 recent non-marked packets as ECN-marked. Alternately, a sender 4595 might act as if the receiver had sent a Slow Receiver option, and a 4596 receiver might send Slow Receiver options. Other reactions that 4597 serve to slow the transfer rate are also acceptable. An entity that 4598 detects particularly egregious and ongoing misbehavior MAY also 4599 reset the connection with Reset Code 11, "Aggression Penalty". 4601 However, ECN Nonce mismatches and other warning signs can result 4602 from innocent causes, such as implementation bugs or attack. In 4603 particular, a successful DCCP-Data attack (Section 7.5.5) can cause 4604 the receiver to report an incorrect ECN Nonce Echo. Therefore, 4605 connection reset and other heavyweight mechanisms SHOULD be sent 4606 only as last resorts, after multiple round-trip times of verified 4607 aggression. 4609 13. 
Timing Options

   The Timestamp, Timestamp Echo, and Elapsed Time options help DCCP
   endpoints explicitly measure round-trip times.

13.1. Timestamp Option

   This option is permitted in any DCCP packet. The length of the
   option is 6 bytes.

   +--------+--------+--------+--------+--------+--------+
   |00101001|00000110|          Timestamp Value          |
   +--------+--------+--------+--------+--------+--------+
    Type=41  Length=6

   The four bytes of option data carry the timestamp of this packet.
   The timestamp is a 32-bit integer that increases monotonically with
   time, at a rate of 1 unit per 10 microseconds. At this rate,
   Timestamp Value will wrap approximately every 11.9 hours. Endpoints
   need not measure time at this fine granularity; for example, an
   endpoint that preferred to measure time at millisecond granularity
   might send Timestamp Values that were all multiples of 100. The
   precise time corresponding to Timestamp Value zero is not specified:
   Timestamp Values are only meaningful relative to other Timestamp
   Values sent on the same connection. A DCCP receiving a Timestamp
   option SHOULD respond with a Timestamp Echo option on the next
   packet it sends.

13.2. Elapsed Time Option

   This option is permitted in any DCCP packet that contains an
   Acknowledgement Number (such options received on other packet types
   MUST be ignored). It indicates how much time has elapsed, in
   hundredths of milliseconds (or, equivalently, multiples of
   10 microseconds), since the packet being acknowledged -- the packet
   with the given Acknowledgement Number -- was received. The option
   may take 4 or 6 bytes, depending on the size of the Elapsed Time
   value. Elapsed Time helps correct round-trip time estimates when
   the gap between receiving a packet and acknowledging that packet may
   be long -- in CCID 3, for example, where acknowledgements are sent
   infrequently.
   +--------+--------+--------+--------+
   |00101011|00000100|   Elapsed Time  |
   +--------+--------+--------+--------+
    Type=43    Len=4

   +--------+--------+--------+--------+--------+--------+
   |00101011|00000110|            Elapsed Time           |
   +--------+--------+--------+--------+--------+--------+
    Type=43    Len=6

   The option data, Elapsed Time, represents an estimated upper bound
   on the amount of time elapsed since the packet being acknowledged
   was received, with units of hundredths of milliseconds. If Elapsed
   Time is less than a half-second, the first, smaller form of the
   option SHOULD be used. Elapsed Times of more than 0.65535 seconds
   MUST be sent using the second form of the option. The special
   Elapsed Time value 4294967295, which corresponds to approximately
   11.9 hours, is used to represent any Elapsed Time greater than
   42949.67294 seconds. DCCP endpoints MUST NOT report Elapsed Times
   that are significantly larger than the true elapsed times. A
   connection MAY be reset with Reset Code 11, "Aggression Penalty", if
   one endpoint determines that the other is reporting a much-too-large
   Elapsed Time.

   Elapsed Time is measured in hundredths of milliseconds as a
   compromise between two conflicting goals. First, it provides enough
   granularity to reduce rounding error when measuring elapsed time
   over fast LANs; second, it allows many reasonable elapsed times to
   fit into two bytes of data.

13.3. Timestamp Echo Option

   This option is permitted in any DCCP packet, as long as at least one
   packet carrying the Timestamp option has been received. Generally,
   a DCCP endpoint should send one Timestamp Echo option for each
   Timestamp option it receives; and it should send that option as soon
   as is convenient. The length of the option is between 6 and 10
   bytes, depending on whether Elapsed Time is included and how large
   it is.
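   Since the same Elapsed Time value also appears inside Timestamp
   Echo, it may help to see the sizing rules of Section 13.2 in
   executable form. The following non-normative sketch encodes a
   standalone Elapsed Time option. The function name
   encode_elapsed_time is hypothetical; network byte order is assumed
   for the multi-byte field; and the sketch picks the 4-byte form
   whenever the value fits in two bytes, which is one choice
   consistent with the SHOULD and MUST above, not the only one.

```python
import struct

ELAPSED_TIME_TYPE = 43          # option type from Section 13.2
MAX_ELAPSED_UNITS = 0xFFFFFFFF  # 4294967295, the special saturating value


def encode_elapsed_time(elapsed_us: int) -> bytes:
    """Encode an Elapsed Time option from a duration in microseconds.

    Hypothetical helper for illustration only. The value is rounded up
    to whole 10-microsecond units, since Elapsed Time is described as
    an upper bound, and is clamped to the special maximum value.
    """
    if elapsed_us < 0:
        raise ValueError("elapsed time cannot be negative")
    units = -(-elapsed_us // 10)             # ceiling division by 10
    units = min(units, MAX_ELAPSED_UNITS)    # saturate very long times
    if units <= 0xFFFF:
        # Short form: Type, Len=4, two-byte Elapsed Time.
        return struct.pack("!BBH", ELAPSED_TIME_TYPE, 4, units)
    # Long form: Type, Len=6, four-byte Elapsed Time.
    return struct.pack("!BBI", ELAPSED_TIME_TYPE, 6, units)


if __name__ == "__main__":
    print(encode_elapsed_time(2_500).hex())      # 2.5 ms, short form
    print(encode_elapsed_time(1_000_000).hex())  # 1 second, long form
```

   Taking the duration as an integer number of microseconds avoids
   floating-point rounding at the 0.65535-second boundary between the
   two forms.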

   +--------+--------+--------+--------+--------+--------+
   |00101010|00000110|           Timestamp Echo          |
   +--------+--------+--------+--------+--------+--------+
    Type=42    Len=6

   +--------+--------+------- ... -------+--------+--------+
   |00101010|00001000|  Timestamp Echo   |   Elapsed Time  |
   +--------+--------+------- ... -------+--------+--------+
    Type=42    Len=8       (4 bytes)

   +--------+--------+------- ... -------+------- ... -------+
   |00101010|00001010|  Timestamp Echo   |   Elapsed Time    |
   +--------+--------+------- ... -------+------- ... -------+
    Type=42   Len=10       (4 bytes)           (4 bytes)

   The first four bytes of option data, Timestamp Echo, carry a
   Timestamp Value taken from a preceding received Timestamp option.
   Usually, this will be the last packet that was received -- the
   packet indicated by the Acknowledgement Number, if any -- but it
   might be a preceding packet. Each Timestamp received will generally
   result in exactly one Timestamp Echo transmitted. If an endpoint
   has received multiple Timestamp options since the last time it sent
   a packet, then it MAY ignore all Timestamp options but the one
   included on the packet with the greatest sequence number;
   alternatively, it MAY include multiple Timestamp Echo options in its
   response, each corresponding to a different Timestamp option.

   The Elapsed Time value, similar to that in the Elapsed Time option,
   indicates the amount of time elapsed since receiving the packet
   whose timestamp is being echoed. This time MUST be in hundredths of
   milliseconds. Elapsed Time is meant to help the Timestamp sender
   separate the network round-trip time from the Timestamp receiver's
   processing time. This may be particularly important for CCIDs where
   acknowledgements are sent infrequently, so that there might be
   considerable delay between receiving a Timestamp option and sending
   the corresponding Timestamp Echo.
   A missing Elapsed Time field is
   equivalent to an Elapsed Time of zero. The smallest version of the
   option SHOULD be used that can hold the relevant Elapsed Time value.

14. Maximum Packet Size

   A DCCP implementation MUST maintain the maximum packet size (MPS)
   allowed for each active DCCP session. The MPS is influenced by the
   maximum packet size allowed by the current congestion control
   mechanism (CCMPS), the maximum packet size supported by the path's
   links (PMTU, the Path Maximum Transmission Unit) [RFC 1191], and the
   lengths of the IP and DCCP headers.

   A DCCP application interface SHOULD let the application discover
   DCCP's current MPS. Generally, the DCCP implementation will refuse
   to send any packet bigger than the MPS, returning an appropriate
   error to the application. A DCCP interface MAY allow applications
   to request fragmentation for packets larger than PMTU, but not
   larger than CCMPS (packets larger than CCMPS MUST be rejected in any
   case). Fragmentation SHOULD NOT be the default, since it decreases
   robustness: an entire packet is discarded if even one of its
   fragments is lost. Applications can usually get better error
   tolerance by producing packets smaller than the PMTU.

   The MPS reported to the application SHOULD be influenced by the size
   expected to be required for DCCP headers and options. If the
   application provides data that, when combined with the options the
   DCCP implementation would like to include, would exceed the MPS, the
   implementation should either send the options on a separate packet
   (such as a DCCP-Ack) or lower the MPS, drop the data, and return an
   appropriate error to the application.

14.1. Measuring PMTU

   Each DCCP endpoint MUST keep track of the current PMTU for each
   connection, except that this is not required for IPv4 connections
   whose applications have requested fragmentation.
   The PMTU SHOULD be
   initialized from the interface MTU that will be used to send
   packets. The MPS will be initialized with the minimum of the PMTU
   and the CCMPS, if any.

   Classical PMTU discovery uses unfragmentable packets. In IPv4,
   these packets have the IP Don't Fragment (DF) bit set; in IPv6, all
   packets are unfragmentable once emitted by an end host. As
   specified in RFC 1191, when a router receives a packet with DF set
   that is larger than the next link's MTU, it sends an ICMP
   Destination Unreachable message back to the source whose Code
   indicates that an unfragmentable packet was too large to forward (a
   "Datagram Too Big" message). When a DCCP implementation receives a
   Datagram Too Big message, it decreases its PMTU to the Next-Hop MTU
   value given in the ICMP message. If the MTU given in the message is
   zero, the sender chooses a value for PMTU using the algorithm
   described in RFC 1191 (Section 7). If the MTU given in the message
   is greater than the current PMTU, the Datagram Too Big message is
   ignored, as described in RFC 1191. (We are aware that this may
   cause problems for DCCP endpoints behind certain firewalls.)

   A DCCP implementation may allow the application to occasionally
   request that PMTU discovery be performed again. This will reset the
   PMTU to the outgoing interface's MTU. Such requests SHOULD be rate
   limited, to one per two seconds, for example.

   A DCCP sender MAY treat the reception of an ICMP Datagram Too Big
   message as an indication that the packet being reported was not lost
   due to congestion, and so for the purposes of congestion control it
   MAY ignore the DCCP receiver's indication that this packet did not
   arrive.
   However, if this is done, then the DCCP sender MUST check
   the ECN bits of the IP header echoed in the ICMP message, and only
   perform this optimization if these ECN bits indicate that the packet
   did not experience congestion prior to reaching the router whose
   link MTU it exceeded.

   A DCCP implementation SHOULD ensure, as far as possible, that ICMP
   Datagram Too Big messages were actually generated by routers, so
   that attackers cannot drive the PMTU down to a falsely small value.
   The simplest way to do this is to verify that the Sequence Number on
   the ICMP error's encapsulated header corresponds to a Sequence
   Number that the implementation recently sent. (According to current
   specifications, routers should return the full DCCP header and
   payload up to a maximum of 576 bytes [RFC 1812] or the minimum IPv6
   MTU [RFC 2463], although they are not required to return more than
   64 bits [RFC 792]. Any amount greater than 128 bits will include
   the Sequence Number.) ICMP Datagram Too Big messages with incorrect
   or missing Sequence Numbers may be ignored, or the DCCP
   implementation may lower the PMTU only temporarily in response. If
   more than three such questionable Datagram Too Big messages are
   received and the other DCCP endpoint reports more than three lost
   packets, however, the DCCP implementation SHOULD assume the presence
   of a confused router, and either obey the ICMP messages' PMTU or (on
   IPv4 networks) switch to allowing fragmentation.

   DCCP also allows upward probing of the PMTU [PMTUD], where the DCCP
   endpoint begins by sending small packets with DF set, then gradually
   increases the packet size until a packet is lost. This mechanism
   does not require any ICMP error processing. DCCP-Sync packets are
   the best choice for upward probing, since DCCP-Sync probes do not
   risk application data loss.
   The DCCP implementation inserts
   arbitrary data into the DCCP-Sync application area, padding the
   packet to the right length; and since every valid DCCP-Sync
   generates an immediate DCCP-SyncAck in response, the endpoint will
   have a pretty good idea of when a probe is lost.

14.2. Sender Behavior

   A DCCP sender SHOULD send every packet as unfragmentable, as
   described above, with the following exceptions.

   o  On IPv4 connections whose applications have requested
      fragmentation, the sender SHOULD send packets with the DF bit not
      set.

   o  On IPv6 connections whose applications have requested
      fragmentation, the sender SHOULD use fragmentation extension
      headers to fragment packets larger than PMTU into suitably-sized
      chunks. (Those chunks are, of course, unfragmentable.)

   o  It is undesirable for PMTU discovery to occur on the initial
      connection setup handshake, as the connection setup process may
      not be representative of packet sizes used during the connection,
      and performing MTU discovery on the initial handshake might
      unnecessarily delay connection establishment. Thus, DCCP-Request
      and DCCP-Response packets SHOULD be sent as fragmentable. In
      addition, DCCP-Reset packets SHOULD be sent as fragmentable,
      although typically these would be small enough to not be a
      problem. For IPv4 connections, these packets SHOULD be sent with
      the DF bit not set; for IPv6 connections, they SHOULD be
      preemptively fragmented to a size not larger than the relevant
      interface MTU.

   If the DCCP implementation has decreased the PMTU, the sending
   application has not requested fragmentation, and the sending
   application attempts to send a packet larger than the new MPS, the
   API MUST refuse to send the packet and return an appropriate error
   to the application. The application should then use the API to
   query the new value of MPS.
   The kernel might have some packets
   buffered for transmission that are smaller than the old MPS, but
   larger than the new MPS. It MAY send these packets as fragmentable,
   or it MAY discard these packets; it MUST NOT send them as
   unfragmentable.

15. Forward Compatibility

   Future versions of DCCP may add new options and features. A few
   simple guidelines will let extended DCCPs interoperate with normal
   DCCPs.

   o  DCCP processors MUST NOT act punitively towards options and
      features they do not understand. For example, DCCP processors
      MUST NOT reset the connection if some field marked Reserved in
      this specification is non-zero; if some unknown option is
      present; or if some feature negotiation option mentions an
      unknown feature. Instead, DCCP processors MUST ignore these
      events. The Mandatory option is the single exception: if
      Mandatory precedes some unknown option or feature, the connection
      MUST be reset.

   o  DCCP processors MUST anticipate the possibility of unknown
      feature values, which might occur as part of a negotiation for a
      known feature. For server-priority features, unknown values are
      handled as a matter of course: since the non-extended DCCP's
      priority list will not contain unknown values, the result of the
      negotiation cannot be an unknown value. A DCCP SHOULD respond
      with an empty Confirm option if it is assigned an unacceptable
      value for some non-negotiable feature.

   o  Each DCCP extension SHOULD be controlled by some feature. The
      default value of this feature should correspond to "extension not
      available". If an extended DCCP wants to use the extension, it
      SHOULD attempt to change the feature's value using a Change L or
      Change R option. Any non-extended DCCP will ignore the option,
      thus leaving the feature value at its default, "extension not
      available".
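
   The first guideline above can be sketched as a small option-walking
   loop. The Python below is illustrative only: the function and
   exception names are hypothetical, options are assumed to be already
   parsed into (type, data) pairs in packet order, the value 1 for the
   Mandatory option type is taken from the option table (Section 5.8),
   and unknown features inside known negotiation options are not
   modelled.

```python
MANDATORY = 1   # assumed option type of Mandatory (option table, Section 5.8)


class ResetConnection(Exception):
    """Signals that Mandatory preceded an option the receiver does not know."""


def process_options(options, known_types):
    """Return the options to act on, in order.

    Unknown options are skipped without penalty. The only exception is
    an unknown option immediately preceded by Mandatory, which causes
    a reset.
    """
    accepted = []
    mandatory_pending = False
    for opt_type, data in options:
        if opt_type == MANDATORY:
            mandatory_pending = True
            continue
        if opt_type not in known_types:
            if mandatory_pending:
                raise ResetConnection(f"unknown mandatory option {opt_type}")
            continue                    # ignore, never punish
        accepted.append((opt_type, data))
        mandatory_pending = False       # Mandatory covers one option only
    return accepted
```

   With known types {41, 43}, an unknown option 99 is silently dropped,
   whereas the sequence Mandatory, 99 raises ResetConnection.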

   Section 19 lists DCCP assigned numbers reserved for experimental and
   testing purposes.

16. Middlebox Considerations

   This section describes properties of DCCP that firewalls, network
   address translators, and other middleboxes should consider,
   including parts of the packet that middleboxes should not change.
   The intent is to draw attention to aspects of DCCP that may be
   useful, or dangerous, for middleboxes, or that differ significantly
   from TCP.

   The Service Code field in DCCP-Request packets provides information
   that may be useful for stateful middleboxes. With Service Code, a
   middlebox can tell what protocol a connection will use without
   relying on port numbers. Middleboxes can disallow connections that
   attempt to access unexpected services by sending a DCCP-Reset with
   Reset Code 8, "Bad Service Code". Middleboxes should not modify the
   Service Code unless they are really changing the service a
   connection is accessing.

   The Source and Destination Port fields are in the same packet
   locations as the corresponding fields in TCP and UDP, which may
   simplify some middlebox implementations.

   The forward compatibility considerations in Section 15 apply to
   middleboxes as well. In particular, middleboxes generally shouldn't
   act punitively towards options and features they do not understand.

   Modifying DCCP Sequence Numbers and Acknowledgement Numbers is more
   tedious and dangerous than modifying TCP sequence numbers. A
   middlebox that added packets to, or removed packets from, a DCCP
   connection would have to modify acknowledgement options, such as Ack
   Vector, and CCID-specific options, such as TFRC's Loss Intervals, at
   minimum.
   On ECN-capable connections, the middlebox would have to
   keep track of ECN Nonce information for packets it introduced or
   removed, so that the relevant acknowledgement options continued to
   have correct ECN Nonce Echoes, or risk the connection being reset
   for "Aggression Penalty". We therefore recommend that middleboxes
   not modify packet streams by adding or removing packets.

   Note that there is less need to modify DCCP's per-packet sequence
   numbers than TCP's per-byte sequence numbers; for example, a
   middlebox can change the contents of a packet without changing its
   sequence number. (In TCP, sequence number modification is required
   to support protocols like FTP that carry variable-length addresses
   in the data stream. If such an application were deployed over DCCP,
   middleboxes would simply grow or shrink the relevant packets as
   necessary, without changing their sequence numbers. This might
   involve fragmenting the packet.)

   Middleboxes may, of course, reset connections in progress. Clearly
   this requires inserting a packet into one or both packet streams,
   but the difficult issues described above do not arise.

   DCCP is somewhat unfriendly to "connection splicing" [SHHP00], in
   which clients' connection attempts are intercepted, but possibly
   later "spliced in" to external server connections via sequence
   number manipulations. A connection splicer at minimum would have to
   ensure that the spliced connections agreed on all relevant feature
   values, which might take some renegotiation.

   The contents of this section should not be interpreted as a
   wholesale endorsement of stateful middleboxes.

17. Relations to Other Specifications

17.1. RTP

   The Real-Time Transport Protocol, RTP [RFC 3550], is currently used
   over UDP by many of DCCP's target applications (for instance,
   streaming media).
   Therefore, it is important to examine the
   relationship between DCCP and RTP, and in particular, the question
   of whether any changes in RTP are necessary or desirable when it is
   layered over DCCP instead of UDP.

   There are two potential sources of overhead in the RTP-over-DCCP
   combination: duplicated acknowledgement information and duplicated
   sequence numbers. We argue that, together, these sources of
   overhead add slightly more than 4 bytes per packet relative to
   RTP-over-UDP, and that eliminating the redundancy would not reduce
   the overhead.

   First, consider acknowledgements. Both RTP and DCCP report feedback
   about loss rates to data senders, via RTP Control Protocol Sender
   and Receiver Reports (RTCP SR/RR packets) and via DCCP
   acknowledgement options. These feedback mechanisms are potentially
   redundant. However, RTCP SR/RR packets contain information not
   present in DCCP acknowledgements, such as "interarrival jitter", and
   DCCP's acknowledgements contain information not transmitted by RTCP,
   such as the ECN Nonce Echo. Neither feedback mechanism makes the
   other redundant.

   Sending both types of feedback need not be particularly costly
   either. RTCP reports may be sent relatively infrequently: once
   every 5 seconds on average, for low-bandwidth flows. In DCCP, some
   feedback mechanisms are expensive -- Ack Vector, for example, is
   frequent and verbose -- but others are relatively cheap: CCID 3
   (TFRC) acknowledgements take between 16 and 32 bytes of options sent
   once per round-trip time. (Reporting less frequently than once per
   RTT would make congestion control less responsive to loss.) We
   therefore conclude that acknowledgement overhead in RTP-over-DCCP
   need not be significantly higher than for RTP-over-UDP, at least for
   CCID 3.

   One clear redundancy can be addressed at the application level.
   The
   verbose packet-by-packet loss reports sent in RTCP Extended Reports
   Loss RLE Blocks [RFC 3611] can be derived from DCCP's Ack Vector
   options. (The converse is not true, since Loss RLE Blocks contain
   no ECN information.) Since DCCP implementations should provide an
   API for application access to Ack Vector information, RTP-over-DCCP
   applications might request either DCCP Ack Vectors or RTCP Extended
   Report Loss RLE Blocks, but not both.

   Now consider sequence number redundancy on data packets. The
   embedded RTP header contains a 16-bit RTP sequence number. Most
   data packets will use the DCCP-Data type; DCCP-DataAck and DCCP-Ack
   packets need not usually be sent. The DCCP-Data header is 12 bytes
   long without options, including a 24-bit sequence number. This is 4
   bytes more than a UDP header. Any options required on data packets
   would add further overhead, although many CCIDs (for instance, CCID
   3, TFRC) don't require options on most data packets.

   The DCCP sequence number cannot be inferred from the RTP sequence
   number since it increments on non-data packets as well as data
   packets. The RTP sequence number cannot be inferred from the DCCP
   sequence number either [RFC 3550]. Furthermore, removing RTP's
   sequence number would not save any header space because of alignment
   issues. We therefore recommend that RTP transmitted over DCCP use
   the same headers currently defined. The 4-byte header cost is a
   reasonable tradeoff for DCCP's congestion control features and
   access to ECN. Truly bandwidth-starved endpoints should use some
   future header compression scheme.

17.2. Congestion Manager and Multiplexing

   Since DCCP doesn't provide reliable, ordered delivery, multiple
   application sub-flows may be multiplexed over a single DCCP
   connection with no inherent performance penalty.
   Thus, there is no
   need for DCCP to provide built-in, SCTP-style support for multiple
   sub-flows.

   Some applications might want to share congestion control state among
   multiple DCCP flows that share the same source and destination
   addresses. This functionality could be provided by the Congestion
   Manager [RFC 3124], a generic multiplexing facility. However, the
   CM would not fully support DCCP without change; it does not
   gracefully handle multiple congestion control mechanisms, for
   example.

18. Security Considerations

   DCCP does not provide cryptographic security guarantees.
   Applications desiring cryptographic security services (integrity,
   authentication, confidentiality, access control, and anti-replay
   protection) should use IPsec or end-to-end security of some kind;
   Secure RTP is one candidate protocol [RFC 3711].

   Nevertheless, DCCP is intended to protect against some classes of
   attackers: Attackers cannot hijack a DCCP connection (close the
   connection unexpectedly, or cause attacker data to be accepted by an
   endpoint as if it came from the sender) unless they can guess valid
   sequence numbers. Thus, as long as endpoints choose initial
   sequence numbers well, a DCCP attacker must snoop on data packets to
   get any reasonable probability of success. Sequence number validity
   checks provide this guarantee. Section 7.5.5 describes sequence
   number security further. This security property only holds assuming
   that DCCP's random numbers are chosen according to the guidelines in
   RFC 1750.

   DCCP also provides mechanisms to limit the potential impact of some
   denial-of-service attacks.
   These mechanisms include Init Cookie
   (Section 8.1.4), the DCCP-CloseReq packet (Section 5.5), the
   Application Not Listening Drop Code (Section 11.7.2), limitations on
   the processing of options that might cause connection reset (Section
   7.5.5), limitations on the processing of some ICMP messages (Section
   14.1), and various rate limits, which let servers avoid extensive
   computation or packet generation (Sections 7.5.3, 8.1.3, and
   others).

   DCCP provides no protection against attackers that can snoop on data
   packets.

18.1. Security Considerations for Partial Checksums

   The partial checksum facility has a separate security impact,
   particularly in its interaction with authentication and encryption
   mechanisms. The impact is the same in DCCP as in the UDP-Lite
   protocol, and what follows was adapted from the corresponding text
   in the UDP-Lite specification [RFC 3828].

   When a DCCP packet's Checksum Coverage field is not zero, the
   uncovered portion of a packet may change in transit. This is
   contrary to the idea behind most authentication mechanisms:
   authentication succeeds if the packet has not changed in transit.
   Unless authentication mechanisms that operate only on the sensitive
   part of packets are developed and used, authentication will always
   fail for partially-checksummed DCCP packets whose uncovered part has
   been damaged.

   The IPsec integrity check (Encapsulating Security Payload, ESP, or
   Authentication Header, AH) is applied (at least) to the entire IP
   packet payload. Corruption of any bit within that area will then
   result in the IP receiver discarding a DCCP packet, even if the
   corruption happened in an uncovered part of the DCCP application
   data.

   When IPsec is used with ESP payload encryption, a link cannot
   determine the specific transport protocol of a packet being
   forwarded by inspecting the IP packet payload. In this case, the
   link MUST provide a standard integrity check covering the entire IP
   packet and payload. DCCP partial checksums provide no benefit in
   this case.

   Encryption (e.g., at the transport or application levels) may be
   used. Note that omitting an integrity check can, under certain
   circumstances, compromise confidentiality [BEL98].

   If a few bits of an encrypted packet are damaged, the decryption
   transform will typically spread errors so that the packet becomes
   too damaged to be of use. Many encryption transforms today exhibit
   this behavior. There exist encryption transforms, such as stream
   ciphers, that do not cause error propagation. Proper use of stream
   ciphers can be quite difficult, especially when authentication
   checking is omitted [BB01]. In particular, an attacker can cause
   predictable changes to the ultimate plaintext, even without being
   able to decrypt the ciphertext.

19. IANA Considerations

   DCCP introduces eight sets of numbers whose values should be
   allocated by IANA. We refer to allocation policies, such as
   Standards Action, outlined in RFC 2434, and most registries reserve
   some values for experimental and testing use [RFC 3692]. In
   addition, DCCP requires a Protocol Number to be added to the
   registry of Assigned Internet Protocol Numbers. IANA is requested
   to assign IP Protocol Number 33 to DCCP; this number has already
   been informally made available for experimental DCCP use.

19.1. Packet Types

   Each entry in the DCCP Packet Type registry contains a packet type,
   which is a number in the range 0-15; a packet type name, such as
   DCCP-Request; and a reference to the RFC defining the packet type.
   The registry is initially populated using the values in Table 1
   (Section 5.1). This document allocates packet types 0-9, and packet
   type 14 is permanently reserved for experimental and testing use.
   Packet types 10-13 and 15 are currently reserved, and should be
   allocated with the Standards Action policy, which requires IESG
   review and approval and standards-track IETF RFC publication.

19.2. Reset Codes

   Each entry in the DCCP Reset Code registry contains a Reset Code,
   which is a number in the range 0-255; a short description of the
   Reset Code, such as "No Connection"; and a reference to the RFC
   defining the Reset Code. The registry is initially populated using
   the values in Table 2 (Section 5.6). This document allocates Reset
   Codes 0-11, and Reset Codes 120-126 are permanently reserved for
   experimental and testing use. Reset Codes 12-119 and 127 are
   currently reserved, and should be allocated with the IETF Consensus
   policy, requiring an IETF RFC publication (standards-track or not)
   with IESG review and approval. Reset Codes 128-255 are permanently
   reserved for CCID-specific registries; each CCID Profile document
   describes how the corresponding registry is managed.

19.3. Option Types

   Each entry in the DCCP option type registry contains an option type,
   which is a number in the range 0-255; the name of the option, such
   as "Slow Receiver"; and a reference to the RFC defining the option
   type. The registry is initially populated using the values in Table
   3 (Section 5.8). This document allocates option types 0-2 and
   32-44, and option types 31 and 120-126 are permanently reserved for
   experimental and testing use. Option types 3-30, 45-119, and 127
   are currently reserved, and should be allocated with the IETF
   Consensus policy, requiring an IETF RFC publication (standards-track
   or not) with IESG review and approval.
   Option types 128-255 are
   permanently reserved for CCID-specific registries; each CCID Profile
   document describes how the corresponding registry is managed.

19.4. Feature Numbers

   Each entry in the DCCP feature number registry contains a feature
   number, which is a number in the range 0-255; the name of the
   feature, such as "ECN Incapable"; and a reference to the RFC
   defining the feature number. The registry is initially populated
   using the values in Table 4 (Section 6). This document allocates
   feature numbers 0-9, and feature numbers 120-126 are permanently
   reserved for experimental and testing use. Feature numbers 10-119
   and 127 are currently reserved, and should be allocated with the
   IETF Consensus policy, requiring an IETF RFC publication
   (standards-track or not) with IESG review and approval. Feature
   numbers 128-255 are permanently reserved for CCID-specific
   registries; each CCID Profile document describes how the
   corresponding registry is managed.

19.5. Congestion Control Identifiers

   Each entry in the DCCP Congestion Control Identifier (CCID) registry
   contains a CCID, which is a number in the range 0-255; the name of
   the CCID, such as "TCP-like Congestion Control"; and a reference to
   the RFC defining the CCID. The registry is initially populated
   using the values in Table 5 (Section 10). CCIDs 2 and 3 are
   allocated by concurrently published profiles, and CCIDs 248-254 are
   permanently reserved for experimental and testing use. CCIDs 0, 1,
   4-247, and 255 are currently reserved, and should be allocated with
   the IETF Consensus policy, requiring an IETF RFC publication
   (standards-track or not) with IESG review and approval.

19.6. Ack Vector States

   Each entry in the DCCP Ack Vector State registry contains an Ack
   Vector State, which is a number in the range 0-3; the name of the
   State, such as "Received ECN Marked"; and a reference to the RFC
   defining the State. The registry is initially populated using the
   values in Table 6 (Section 11.4). This document allocates States 0,
   1, and 3. State 2 is currently reserved, and should be allocated
   with the Standards Action policy, which requires IESG review and
   approval and standards-track IETF RFC publication.

19.7. Drop Codes

   Each entry in the DCCP Drop Code registry contains a Data Dropped
   Drop Code, which is a number in the range 0-7; the name of the Drop
   Code, such as "Application Not Listening"; and a reference to the
   RFC defining the Drop Code. The registry is initially populated
   using the values in Table 7 (Section 11.7). This document allocates
   Drop Codes 0-3 and 7. Drop Codes 4-6 are currently reserved, and
   should be allocated with the Standards Action policy, which requires
   IESG review and approval and standards-track IETF RFC publication.

19.8. Service Codes

   Each entry in the Service Code registry contains a Service Code,
   which is a number in the range 0-4294967295; a short English
   description of the intended service; and an optional reference to an
   RFC or other publicly available specification defining the Service
   Code. The registry should list the Service Code's numeric value as
   a decimal number, but when each byte of the four-byte Service Code
   is in the range 32-127, the registry should also show a
   four-character ASCII interpretation of the Service Code. Thus, the
   number 1717858426 would additionally appear as "fdpz". Service
   Codes are not DCCP-specific.
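
   The display rule for Service Codes can be illustrated with a short
   Python sketch. The function name is hypothetical, and treating the
   first character as the most significant byte is an assumption; it
   is the interpretation under which 1717858426 reads as "fdpz".

```python
def service_code_ascii(code: int):
    """Return the four-character ASCII form of a Service Code, or None.

    The ASCII form is produced only when every byte of the four-byte
    value lies in the range 32-127, as described in the text.
    """
    if not 0 <= code <= 0xFFFFFFFF:
        raise ValueError("Service Code must fit in four bytes")
    raw = code.to_bytes(4, "big")       # assumption: high byte first
    if all(32 <= b <= 127 for b in raw):
        return raw.decode("ascii")
    return None
```

   Under these assumptions, service_code_ascii(1717858426) returns
   "fdpz", while a value such as 0 has no ASCII form.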
   This document does not allocate any Service Codes, but Service Code
   0 is permanently reserved (it represents the absence of a meaningful
   Service Code), and Service Codes 1056964608-1073741823 (high byte
   ASCII "?") are reserved for Private Use. Most of the remaining
   Service Codes are allocated First Come First Served, with no RFC
   publication required. Exceptions are listed in Section 8.1.2.

20. Thanks

   Thanks to Jitendra Padhye for his help with early versions of this
   specification.

   Thanks to Junwen Lai and Arun Venkataramani, who, as interns at
   ICIR, built a prototype DCCP implementation. In particular, Junwen
   Lai recommended that the old feature negotiation mechanism be
   scrapped and co-designed the current mechanism. Arun
   Venkataramani's feedback improved Appendix A.

   We thank the staff and interns of ICIR and, formerly, ACIRI, the
   members of the End-to-End Research Group, and the members of the
   Transport Area Working Group for their feedback on DCCP. We
   especially thank the DCCP expert reviewers: Greg Minshall, Eric
   Rescorla, and Magnus Westerlund for detailed written comments and
   problem spotting, and Rob Austein and Steve Bellovin for verbal
   comments and written notes.

   We also thank those who provided comments and suggestions via the
   DCCP BOF, Working Group, and mailing lists, including Damon
   Lanphear, Patrick McManus, Colin Perkins, Sara Karlberg, Kevin Lai,
   Bernard Aboba, Youngsoo Choi, Pengfei Di, Dan Duchamp, Gorry
   Fairhurst, Derek Fawcus, David Timothy Fleeman, John Loughney,
   Ghyslain Pelletier, Tom Phelan, Stanislav Shalunov, David Vos, Yufei
   Wang, and Michael Welzl. In particular, Colin Perkins provided
   extensive, detailed feedback, Michael Welzl suggested the Data
   Checksum option, and Gorry Fairhurst provided extensive feedback on
   various checksum issues.

A. Appendix: Ack Vector Implementation Notes

   This appendix discusses particulars of DCCP acknowledgement
   handling, in the context of an abstract implementation for Ack
   Vector. It is informative rather than normative.

   The first part of our implementation runs at the HC-Receiver, and
   therefore acknowledges data packets. It generates Ack Vector
   options. The implementation has the following characteristics:

   o  At most one byte of state per acknowledged packet.

   o  O(1) time to update that state when a new packet arrives (normal
      case).

   o  Cumulative acknowledgements.

   o  Quick removal of old state.

   The basic data structure is a circular buffer containing information
   about acknowledged packets. Each byte in this buffer contains a
   state and run length; the state can be 0 (packet received), 1
   (packet ECN marked), or 3 (packet not yet received). The buffer
   grows from right to left. The implementation maintains four
   variables, aside from the buffer contents:

   o  "buf_head" and "buf_tail", which mark the live portion of the
      buffer.

   o  "buf_ackno", the Acknowledgement Number of the most recent packet
      acknowledged in the buffer. This corresponds to the "head"
      pointer.

   o  "buf_nonce", the one-bit sum (exclusive-or, or parity) of the ECN
      Nonces received on all packets acknowledged by the buffer with
      State 0.

   We draw acknowledgement buffers like this:

      +---------------------------------------------------------------+
      |S,L|S,L|S,L|S,L|   |   |   |   |S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|
      +---------------------------------------------------------------+
                    ^                   ^
                 buf_tail     buf_head, buf_ackno = A     buf_nonce = E

                <=== buf_head and buf_tail move this way <===

   Each `S,L' represents a State/Run length byte.
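   One way to hold a State/Run length pair in a single byte is
   sketched below in Python. The bit layout (State in the two
   high-order bits, run length in the remaining six) is an assumption
   chosen to match the 0-3 State range; the Ack Vector option format in
   Section 11.4 is the authority. The helper names are hypothetical.

```python
STATE_RECEIVED = 0
STATE_ECN_MARKED = 1
STATE_NOT_RECEIVED = 3
MAX_RUN_LENGTH = 0x3F  # six bits, under the layout assumed here


def pack_cell(state: int, run_length: int) -> int:
    """Pack a state and run length into one byte (assumed layout)."""
    if state not in (STATE_RECEIVED, STATE_ECN_MARKED, STATE_NOT_RECEIVED):
        raise ValueError("state must be 0, 1, or 3")
    if not 0 <= run_length <= MAX_RUN_LENGTH:
        raise ValueError("run length out of range")
    return (state << 6) | run_length


def unpack_cell(byte: int) -> tuple:
    """Split a byte back into (state, run_length)."""
    return byte >> 6, byte & MAX_RUN_LENGTH


# A cell with run length L covers L + 1 consecutive packets.
print(hex(pack_cell(STATE_NOT_RECEIVED, 0)))  # 0xc0
print(unpack_cell(0x04))                      # (0, 4)
```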
   We will draw these buffers showing only their live portion, and
   will add an annotation showing the Acknowledgement Number for the
   last live byte in the buffer. For example:

        +-----------------------------------------------+
      A |S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L| T    BN[E]
        +-----------------------------------------------+

   Here, buf_nonce equals E and buf_ackno equals A.

   We will use this buffer as a running example.

         +---------------------------+
      10 |0,0|3,0|3,0|3,0|0,4|1,0|0,0| 0  BN[1]   [Example Buffer]
         +---------------------------+

   In concrete terms, its meaning is as follows:

      Packet 10 was received. (The head of the buffer has sequence
      number 10, state 0, and run length 0.)

      Packets 9, 8, and 7 have not yet been received. (The three
      bytes preceding the head each have state 3 and run length 0.)

      Packets 6, 5, 4, 3, and 2 were received.

      Packet 1 was ECN marked.

      Packet 0 was received.

      The one-bit sum of the ECN Nonces on packets 10, 6, 5, 4, 3, 2,
      and 0 equals 1.

   Additionally, the HC-Receiver must keep some information about the
   Ack Vectors it has recently sent. For each packet sent carrying an
   Ack Vector, it remembers four variables:

   o  "ack_seqno", the Sequence Number used for the packet. This is an
      HC-Receiver sequence number.

   o  "ack_ptr", the value of buf_head at the time of acknowledgement.

   o  "ack_ackno", the Acknowledgement Number used for the packet.
      This is an HC-Sender sequence number. Since acknowledgements are
      cumulative, this single number completely specifies all necessary
      information about the packets acknowledged by this Ack Vector.

   o  "ack_nonce", the one-bit sum of the ECN Nonces for all State 0
      packets in the buffer from buf_head to ack_ackno, inclusive.
      Initially, this equals the Nonce Echo of the acknowledgement's
      Ack Vector (or, if the ack packet contained more than one Ack
      Vector, the exclusive-or of the Nonce Echoes of all the
      acknowledgement's Ack Vectors). It changes as information about
      old acknowledgements is removed (so ack_ptr and buf_head
      diverge), and as old packets arrive (so they change from State 3
      or State 1 to State 0).

A.1. Packet Arrival

   This section describes how the HC-Receiver updates its
   acknowledgement buffer as packets arrive from the HC-Sender.

A.1.1. New Packets

   When a packet with Sequence Number greater than buf_ackno arrives,
   the HC-Receiver updates buf_head (by moving it to the left
   appropriately), buf_ackno (which is set to the new packet's Sequence
   Number), and possibly buf_nonce (if the packet arrived unmarked with
   ECN Nonce 1), in addition to the buffer itself. For example, if
   HC-Sender packet 11 arrived ECN marked, the Example Buffer above
   would enter this new state (changes are marked with stars):

      ** +***----------------------------+
      11 |1,0|0,0|3,0|3,0|3,0|0,4|1,0|0,0| 0  BN[1]
      ** +***----------------------------+

   If the packet's state equals the state at the head of the buffer,
   the HC-Receiver may choose to increment its run length (up to the
   maximum). For example, if HC-Sender packet 11 arrived without ECN
   marking and with ECN Nonce 0, the Example Buffer might enter this
   state instead:

      ** +--*------------------------+
      11 |0,1|3,0|3,0|3,0|0,4|1,0|0,0| 0  BN[1]
      ** +--*------------------------+

   Of course, the new packet's sequence number might not equal the
   expected sequence number. In this case, the HC-Receiver will enter
   the intervening packets as State 3.
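   The update described so far can be sketched in Python as follows.
   The buffer is modeled as a list of (state, run length) pairs ordered
   from head to tail rather than as a circular byte array, and the
   function name is hypothetical. The sketch also assumes that sequence
   numbers do not wrap, and it gives every packet its own pair, leaving
   out the optional run length increment.

```python
RECEIVED, ECN_MARKED, NOT_RECEIVED = 0, 1, 3


def on_new_packet(cells, buf_ackno, buf_nonce, seqno, ecn_marked, ecn_nonce):
    """Return updated (cells, buf_ackno, buf_nonce) for a new packet.

    cells is a list of (state, run_length) pairs, head first.
    """
    if seqno <= buf_ackno:
        raise ValueError("not a new packet; see the old-packet case")
    state = ECN_MARKED if ecn_marked else RECEIVED
    # Packets between the old head and the new packet are not yet received.
    missing = [(NOT_RECEIVED, 0)] * (seqno - buf_ackno - 1)
    new_cells = [(state, 0)] + missing + list(cells)
    # buf_nonce sums the ECN Nonces of State 0 packets only.
    if state == RECEIVED:
        buf_nonce ^= ecn_nonce
    return new_cells, seqno, buf_nonce


example = [(0, 0), (3, 0), (3, 0), (3, 0), (0, 4), (1, 0), (0, 0)]
cells, ackno, nonce = on_new_packet(example, 10, 1, 11, True, 0)
print(ackno, cells[:2], nonce)  # 11 [(1, 0), (0, 0)] 1
```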
   If several packets are missing, the HC-Receiver may prefer to enter
   multiple bytes with run length 0, rather than a single byte with a
   larger run length; this simplifies table updates if one of the
   missing packets arrives. For example, if HC-Sender packet 12
   arrived with ECN Nonce 1, the Example Buffer would enter this state:

      ** +*******----------------------------+       *
      12 |0,0|3,0|0,0|3,0|3,0|3,0|0,4|1,0|0,0| 0  BN[0]
      ** +*******----------------------------+       *

   Of course, the circular buffer may overflow, either when the
   HC-Sender is sending data at a very high rate, when the
   HC-Receiver's acknowledgements are not reaching the HC-Sender, or
   when the HC-Sender is failing to acknowledge those acks (so the
   HC-Receiver is unable to clean up old state). In this case, the
   HC-Receiver should either compress the buffer (by increasing run
   lengths when possible), transfer its state to a larger buffer, or,
   as a last resort, drop all received packets, without processing
   them whatsoever, until its buffer shrinks again.

A.1.2. Old Packets

   When a packet with Sequence Number S arrives, and S <= buf_ackno,
   the HC-Receiver will scan the table for the byte corresponding to S.
   (Indexing structures could reduce the complexity of this scan.) If
   S was previously lost (State 3), and it was stored in a byte with
   run length 0, the HC-Receiver can simply change the byte's state.
   For example, if HC-Sender packet 8 was received with ECN Nonce 0,
   the Example Buffer would enter this state:

         +--------*------------------+
      10 |0,0|3,0|0,0|3,0|0,4|1,0|0,0| 0  BN[1]
         +--------*------------------+

   If S was not marked as lost, or if it was not contained in the
   table, the packet is probably a duplicate, and should be ignored.
   (The new packet's ECN marking state might differ from the state in
   the buffer; Section 11.4.1 describes what is allowed then.) If S's
   buffer byte has a non-zero run length, then the buffer might need to
   be reshuffled to make space for one or two new bytes.

   The ack_nonce fields may also need manipulation when old packets
   arrive. In particular, when S transitions from State 3 or State 1
   to State 0, and S had ECN Nonce 1, then the implementation should
   flip the value of ack_nonce for every acknowledgement with ack_ackno
   >= S.

   It is impossible with this data structure to shift packets from
   State 0 to State 1, since the buffer doesn't store individual
   packets' ECN Nonces.

A.2. Sending Acknowledgements

   Whenever the HC-Receiver needs to generate an acknowledgement, the
   buffer's contents can simply be copied into one or more Ack Vector
   options. Copied Ack Vectors might not be maximally compressed; for
   example, the Example Buffer above contains three adjacent 3,0 bytes
   that could be combined into a single 3,2 byte. The HC-Receiver
   might, therefore, choose to compress the buffer in place before
   sending the option, or to compress the buffer while copying it;
   either operation is simple.

   Every acknowledgement sent by the HC-Receiver SHOULD include the
   entire state of the buffer. That is, acknowledgements are
   cumulative.

   If the acknowledgement fits in one Ack Vector, that Ack Vector's
   Nonce Echo simply equals buf_nonce. For multiple Ack Vectors, more
   care is required. The Ack Vectors should be split at points
   corresponding to previous acknowledgements, since the stored
   ack_nonce fields provide enough information to calculate correct
   Nonce Echoes. The implementation should therefore acknowledge data
   at least once per 253 bytes of buffer state. (Otherwise, there'd be
   no way to calculate a Nonce Echo.)
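   Compressing while copying can be sketched in Python as below. The
   list-of-pairs buffer model and the function name are assumptions
   made for illustration, as is the 63 limit on run lengths, which
   follows from assuming a six-bit run length field. Merging adjacent
   cells of equal state does not change which packets are in State 0,
   so the Nonce Echo is unaffected.

```python
MAX_RUN_LENGTH = 63  # assumption: six-bit run length field


def compress(cells):
    """Merge adjacent (state, run_length) cells that share a state.

    Returns a new list; the input buffer is left untouched.
    """
    out = []
    for state, run in cells:
        if out and out[-1][0] == state:
            # Two cells covering a+1 and b+1 packets merge into one cell
            # covering a+b+2 packets, that is, run length a+b+1.
            merged = out[-1][1] + run + 1
            if merged <= MAX_RUN_LENGTH:
                out[-1] = (state, merged)
                continue
        out.append((state, run))
    return out


example = [(0, 0), (3, 0), (3, 0), (3, 0), (0, 4), (1, 0), (0, 0)]
print(compress(example))  # [(0, 0), (3, 2), (0, 4), (1, 0), (0, 0)]
```

   Copying into a fresh list leaves the stored buffer with its run
   length 0 entries, which keeps updates simple if one of the missing
   packets arrives later.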
   For each acknowledgement it sends, the HC-Receiver will add an
   acknowledgement record. ack_seqno will equal the HC-Receiver
   sequence number it used for the ack packet; ack_ptr will equal
   buf_head; ack_ackno will equal buf_ackno; and ack_nonce will equal
   buf_nonce.

A.3. Clearing State

   Some of the HC-Sender's packets will include acknowledgement
   numbers, which ack the HC-Receiver's acknowledgements. When such an
   ack is received, the HC-Receiver finds the acknowledgement record R
   with the appropriate ack_seqno, then:

   o  Sets buf_tail to R.ack_ptr + 1.

   o  If R.ack_nonce is 1, it flips buf_nonce, and the value of
      ack_nonce for every later ack record.

   o  Throws away R and every preceding ack record.

   (The HC-Receiver may choose to keep some older information, in case
   a lost packet shows up late.) For example, say that the HC-Receiver
   storing the Example Buffer had sent two acknowledgements already:

   1. ack_seqno = 59, ack_ackno = 3, ack_nonce = 1.

   2. ack_seqno = 60, ack_ackno = 10, ack_nonce = 0.

   Say the HC-Receiver then received a DCCP-DataAck packet with
   Acknowledgement Number 59 from the HC-Sender. This informs the
   HC-Receiver that the HC-Sender received, and processed, all the
   information in HC-Receiver packet 59. This packet acknowledged
   HC-Sender packet 3, so the HC-Sender has now received HC-Receiver's
   acknowledgements for packets 0, 1, 2, and 3. The Example Buffer
   should enter this state:

         +------------------*+ *     *
      10 |0,0|3,0|3,0|3,0|0,2| 4  BN[0]
         +------------------*+ *     *

   The tail byte's run length was adjusted, since packet 3 was in the
   middle of that byte. Since R.ack_nonce was 1, the buf_nonce field
   was flipped, as were the ack_nonce fields for later acknowledgements
   (here, the HC-Receiver Ack 60 record, not shown, has its ack_nonce
   flipped to 1).
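   The three steps and the example above can be sketched in Python as
   follows. This is a simplified model, not a definitive
   implementation: the buffer is a list of (state, run length) pairs
   ordered from head to tail, the cut point is found from R.ack_ackno
   instead of a stored ack_ptr, sequence numbers are assumed not to
   wrap, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AckRecord:
    ack_seqno: int
    ack_ackno: int
    ack_nonce: int


def clear_state(cells, buf_ackno, buf_nonce, records, acked_seqno):
    """Return (cells, buf_nonce, records) after an ack of acked_seqno."""
    idx = next((i for i, r in enumerate(records)
                if r.ack_seqno == acked_seqno), None)
    if idx is None:
        return list(cells), buf_nonce, list(records)
    rec = records[idx]
    kept, seq = [], buf_ackno  # seq is the newest packet in the next cell
    for state, run in cells:
        if seq <= rec.ack_ackno:
            break  # this cell and all older ones are fully acknowledged
        low = seq - run  # oldest packet covered by this cell
        if low > rec.ack_ackno:
            kept.append((state, run))
        else:
            # The cut falls inside this cell: shorten its run length.
            kept.append((state, seq - rec.ack_ackno - 1))
        seq = low - 1
    later = records[idx + 1:]  # R and every preceding record are dropped
    if rec.ack_nonce == 1:
        buf_nonce ^= 1
        later = [AckRecord(r.ack_seqno, r.ack_ackno, r.ack_nonce ^ 1)
                 for r in later]
    return kept, buf_nonce, later


example = [(0, 0), (3, 0), (3, 0), (3, 0), (0, 4), (1, 0), (0, 0)]
sent = [AckRecord(59, 3, 1), AckRecord(60, 10, 0)]
kept, nonce, later = clear_state(example, 10, 1, sent, 59)
print(kept, nonce, later[0].ack_nonce)
# [(0, 0), (3, 0), (3, 0), (3, 0), (0, 2)] 0 1
```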
   The HC-Receiver can also throw away stored information about
   HC-Receiver Ack 59 and any earlier acknowledgements.

   A careful implementation might try to ensure reasonable robustness
   to reordering. Suppose that the Example Buffer is as before, but
   that packet 9 now arrives, out of sequence. The buffer would enter
   this state:

         +----*----------------------+
      10 |0,0|0,0|3,0|3,0|0,4|1,0|0,0| 0  BN[1]
         +----*----------------------+

   The danger is that the HC-Sender might acknowledge the HC-Receiver's
   previous acknowledgement (with sequence number 60), which says that
   Packet 9 was not received, before the HC-Receiver has a chance to
   send a new acknowledgement saying that Packet 9 actually was
   received. Therefore, when packet 9 arrived, the HC-Receiver might
   modify its acknowledgement record to:

   1. ack_seqno = 59, ack_ackno = 3, ack_nonce = 1.

   2. ack_seqno = 60, ack_ackno = 3, ack_nonce = 1.

   That is, Ack 60 is now treated like a duplicate of Ack 59. This
   would prevent the Tail pointer from moving past packet 9 until the
   HC-Receiver knows that the HC-Sender has seen an Ack Vector
   indicating that packet's arrival.

A.4. Processing Acknowledgements

   When the HC-Sender receives an acknowledgement, it generally cares
   about the number of packets that were dropped and/or ECN marked. It
   simply reads this off the Ack Vector. Additionally, it should check
   the ECN Nonce for correctness. (As described in Section 11.4.1, it
   may want to keep more detailed information about acknowledged
   packets in case packets change states between acknowledgements, or
   in case the application queries whether a packet arrived.)

   The HC-Sender must also acknowledge the HC-Receiver's
   acknowledgements so that the HC-Receiver can free old Ack Vector
   state.
   (Since Ack Vector acknowledgements are reliable, the HC-Receiver
   must maintain and resend Ack Vector information until it is sure
   that the HC-Sender has received that information.) A simple
   algorithm suffices: since Ack Vector acknowledgements are
   cumulative, a single acknowledgement number tells the HC-Receiver
   how much ack information has arrived. Assuming that the HC-Receiver
   sends no data, the HC-Sender can ensure that at least once a
   round-trip time, it sends a DCCP-DataAck packet acknowledging the
   latest DCCP-Ack packet it has received. Of course, the HC-Sender
   only needs to acknowledge the HC-Receiver's acknowledgements if the
   HC-Sender is also sending data. If the HC-Sender is not sending
   data, then the HC-Receiver's Ack Vector state is stable, and there
   is no need to shrink it. The HC-Sender must watch for drops and ECN
   marks on received DCCP-Ack packets so that it can adjust the
   HC-Receiver's ack-sending rate -- for example, with Ack Ratio -- in
   response to congestion.

   If the other half-connection is not quiescent -- that is, the
   HC-Receiver is sending data to the HC-Sender, possibly using another
   CCID -- then the acknowledgements on that half-connection are
   sufficient for the HC-Receiver to free its state.

B. Appendix: Partial Checksumming Design Motivation

   A great deal of discussion has taken place regarding the utility of
   allowing a DCCP sender to restrict the checksum so that it does not
   cover the complete packet. This section attempts to capture some of
   the rationale behind specific details of DCCP design.

   Many of the applications that we envisage using DCCP are resilient
   to some degree of data loss, or they would typically have chosen a
   reliable transport. Some of these applications may also be
   resilient to data corruption -- some audio payloads, for example.
   These resilient applications might prefer receiving corrupted data
   to having DCCP drop a corrupted packet. This is particularly
   because of congestion control: DCCP cannot tell the difference
   between packets dropped due to corruption and packets dropped due to
   congestion, and so it must reduce the transmission rate accordingly.
   This response may cause the connection to receive less bandwidth
   than it is due; corruption in some networking technologies is
   independent of, or at least not always correlated to, congestion.
   Therefore, corrupted packets do not need to cause as strong a
   reduction in transmission rate as the congestion response would
   dictate (so long as the DCCP header and options are not corrupt).

   Thus DCCP allows the checksum to cover all of the packet, just the
   DCCP header, or both the DCCP header and some number of bytes from
   the application data. If the application cannot tolerate any data
   corruption, then the checksum must cover the whole packet. If the
   application would prefer to tolerate some corruption rather than
   have the packet dropped, then it can set the checksum to cover only
   part of the packet (but always the DCCP header). In addition, if
   the application wishes to decouple checksumming of the DCCP header
   from checksumming of the application data, it may do so by including
   the Data Checksum option. This would allow DCCP to discard
   corrupted application data, but still not mistake the corruption for
   network congestion.

   Thus, from the application point of view, partial checksums seem to
   be a desirable feature. However, the usefulness of partial
   checksums depends on partially corrupted packets being delivered to
   the receiver.
   If the link-layer CRC always discards corrupted packets, then this
   will not happen, and so the usefulness of partial checksums would be
   restricted to corruption that occurred in routers and other places
   not covered by link CRCs. There does not appear to be consensus on
   how likely it is that future network links that suffer significant
   corruption will not cover the entire packet with a single strong
   CRC. DCCP makes it possible to tailor such links to the
   application, but it is difficult to predict if this will be
   compelling for future link technologies.

   In addition, partial checksums do not co-exist well with IP-level
   authentication mechanisms such as IPsec AH, which cover the entire
   packet with a cryptographic hash. Thus, if cryptographic
   authentication mechanisms are required to co-exist with partial
   checksums, the authentication must be carried in the application
   data. A possible mode of usage would appear to be similar to that
   of Secure RTP. However, such "application-level" authentication
   does not protect the DCCP option negotiation and state machine from
   forged packets. An alternative would be to use IPsec ESP, and use
   encryption to protect the DCCP headers against attack, while using
   the DCCP header validity checks to authenticate that the header is
   from someone who possessed the correct key. However, while this is
   resistant to replay (due to the DCCP sequence number), it is not by
   itself resistant to some forms of man-in-the-middle attacks because
   the application data is not tightly coupled to the packet header.
   Thus an application-level authentication probably needs to be
   coupled with IPsec ESP or a similar mechanism to provide a
   reasonably complete security solution. The overhead of such a
   solution might be unacceptable for some applications that would
   otherwise wish to use partial checksums.
   On balance, the authors believe that DCCP partial checksums have the
   potential to enable some future uses that would otherwise be
   difficult. As the cost and complexity of supporting them are small,
   it seems worth including them at this time. It remains to be seen
   whether they are useful in practice.

Normative References

   [RFC 793] J. Postel, editor. Transmission Control Protocol.
      RFC 793.

   [RFC 1191] J. C. Mogul and S. E. Deering. Path MTU Discovery.
      RFC 1191.

   [RFC 2119] S. Bradner. Key Words For Use in RFCs to Indicate
      Requirement Levels. RFC 2119.

   [RFC 2434] T. Narten and H. Alvestrand. Guidelines for Writing an
      IANA Considerations Section in RFCs. RFC 2434.

   [RFC 2460] S. Deering and R. Hinden. Internet Protocol, Version 6
      (IPv6) Specification. RFC 2460.

   [RFC 3168] K.K. Ramakrishnan, S. Floyd, and D. Black. The Addition
      of Explicit Congestion Notification (ECN) to IP. RFC 3168.

   [RFC 3309] J. Stone, R. Stewart, and D. Otis. Stream Control
      Transmission Protocol (SCTP) Checksum Change. RFC 3309.

   [RFC 3692] T. Narten. Assigning Experimental and Testing Numbers
      Considered Useful. RFC 3692.

   [RFC 3775] D. Johnson, C. Perkins, and J. Arkko. Mobility Support
      in IPv6. RFC 3775.

   [RFC 3828] L-A. Larzon, M. Degermark, S. Pink, L-E. Jonsson, editor,
      and G. Fairhurst, editor. The Lightweight User Datagram Protocol
      (UDP-Lite). RFC 3828.

Informative References

   [BB01] S.M. Bellovin and M. Blaze. Cryptographic Modes of Operation
      for the Internet. 2nd NIST Workshop on Modes of Operation,
      August 2001.

   [BEL98] S.M. Bellovin. Cryptography and the Internet. Proc. CRYPTO
      '98 (LNCS 1462), pp. 46-55, August 1998.

   [CCID 2 PROFILE] S. Floyd and E. Kohler. Profile for DCCP
      Congestion Control ID 2: TCP-like Congestion Control.
      draft-ietf-dccp-ccid2-10.txt, work in progress, March 2005.

   [CCID 3 PROFILE] S. Floyd, E. Kohler, and J. Padhye. Profile for
      DCCP Congestion Control ID 3: TFRC Congestion Control.
      draft-ietf-dccp-ccid3-11.txt, work in progress, March 2005.

   [M85] Robert T. Morris. A Weakness in the 4.2BSD Unix TCP/IP
      Software. Computer Science Technical Report 117, AT&T Bell
      Laboratories, Murray Hill, NJ, February 1985.

   [PMTUD] Matt Mathis, John Heffner, and Kevin Lahey. Path MTU
      Discovery. draft-ietf-pmtud-method-01.txt, work in progress,
      February 2004.

   [RFC 792] J. Postel, editor. Internet Control Message Protocol.
      RFC 792.

   [RFC 1750] D. Eastlake, S. Crocker, and J. Schiller. Randomness
      Recommendations for Security. RFC 1750.

   [RFC 1812] F. Baker, editor. Requirements for IP Version 4 Routers.
      RFC 1812.

   [RFC 1948] S. Bellovin. Defending Against Sequence Number Attacks.
      RFC 1948.

   [RFC 1982] R. Elz and R. Bush. Serial Number Arithmetic. RFC 1982.

   [RFC 2018] M. Mathis, J. Mahdavi, S. Floyd, and A. Romanow. TCP
      Selective Acknowledgement Options. RFC 2018.

   [RFC 2401] S. Kent and R. Atkinson. Security Architecture for the
      Internet Protocol. RFC 2401.

   [RFC 2463] A. Conta and S. Deering. Internet Control Message
      Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6)
      Specification. RFC 2463.

   [RFC 2581] M. Allman, V. Paxson, and W. Stevens. TCP Congestion
      Control. RFC 2581.

   [RFC 2960] R. Stewart, Q. Xie, K. Morneault, C. Sharp, H.
      Schwarzbauer, T. Taylor, I. Rytina, M. Kalla, L. Zhang, and V.
      Paxson. Stream Control Transmission Protocol. RFC 2960.

   [RFC 3124] H. Balakrishnan and S. Seshan. The Congestion Manager.
      RFC 3124.

   [RFC 3360] S. Floyd. Inappropriate TCP Resets Considered Harmful.
      RFC 3360.

   [RFC 3448] M. Handley, S. Floyd, J. Padhye, and J. Widmer. TCP
      Friendly Rate Control (TFRC): Protocol Specification. RFC 3448.

   [RFC 3540] N. Spring, D. Wetherall, and D. Ely. Robust Explicit
      Congestion Notification (ECN) Signaling with Nonces. RFC 3540.

   [RFC 3550] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson.
      RTP: A Transport Protocol for Real-Time Applications. STD 64.
      RFC 3550.

   [RFC 3611] T. Friedman, R. Caceres, and A. Clark, editors. RTP
      Control Protocol Extended Reports (RTCP XR). RFC 3611.

   [RFC 3711] M. Baugher, D. McGrew, M. Naslund, E. Carrara, and K.
      Norrman. The Secure Real-time Transport Protocol (SRTP).
      RFC 3711.

   [RFC 3819] P. Karn, editor, C. Bormann, G. Fairhurst, D. Grossman,
      R. Ludwig, J. Mahdavi, G. Montenegro, J. Touch, and L. Wood.
      Advice for Internet Subnetwork Designers. RFC 3819.

   [SHHP00] Oliver Spatscheck, Jorgen S. Hansen, John H. Hartman, and
      Larry L. Peterson. Optimizing TCP Forwarder Performance.
      IEEE/ACM Transactions on Networking 8(2):146-157, April 2000.

   [SYNCOOKIES] Daniel J. Bernstein. SYN Cookies.
      http://cr.yp.to/syncookies.html, as of July 2003.

Authors' Addresses

   Eddie Kohler
   4531C Boelter Hall
   UCLA Computer Science Department
   Los Angeles, CA 90095
   USA

   Mark Handley
   Department of Computer Science
   University College London
   Gower Street
   London WC1E 6BT
   UK

   Sally Floyd
   ICSI Center for Internet Research
   1947 Center Street, Suite 600
   Berkeley, CA 94704
   USA

Full Copyright Statement

   Copyright (C) The Internet Society 2005. This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.
   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
   INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.