idnits 2.17.00 (12 Aug 2021)

/tmp/idnits34198/draft-ietf-httpbis-p1-messaging-24.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The draft header indicates that this document obsoletes RFC2616, but the
     abstract doesn't seem to mention this, which it should.

  -- The draft header indicates that this document obsoletes RFC2145, but the
     abstract doesn't seem to mention this, which it should.

  -- The draft header indicates that this document updates RFC2817, but the
     abstract doesn't seem to mention this, which it should.

  -- The draft header indicates that this document updates RFC2818, but the
     abstract doesn't seem to mention this, which it should.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

     (Using the creation date from RFC2817, updated by this document, for
     RFC5378 checks: 1998-11-18)

     (Using the creation date from RFC2818, updated by this document, for
     RFC5378 checks: 1998-01-27)

  -- The document seems to contain a disclaimer for pre-RFC5378 work, and may
     have content which was first submitted before 10 November 2008. The
     disclaimer is necessary when there are original authors that you have
     been unable to contact, or if some do not wish to grant the BCP78 rights
     to the IETF Trust.
     If you are able to get all authors (current and original) to grant
     those rights, you can and should remove the disclaimer; otherwise, the
     disclaimer is needed and you can ignore this comment. (See the Legal
     Provisions document at https://trustee.ietf.org/license-info for more
     information.)

  -- The document date (September 25, 2013) is 3159 days in the past. Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'Part5' is defined on line 3256, but no explicit
     reference was found in the text

  == Unused Reference: 'Part7' is defined on line 3266, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC2145' is defined on line 3345, but no explicit
     reference was found in the text

  == Outdated reference: draft-ietf-httpbis-p2-semantics has been published
     as RFC 7231

  == Outdated reference: draft-ietf-httpbis-p4-conditional has been published
     as RFC 7232

  == Outdated reference: draft-ietf-httpbis-p5-range has been published as
     RFC 7233

  == Outdated reference: draft-ietf-httpbis-p6-cache has been published as
     RFC 7234

  == Outdated reference: draft-ietf-httpbis-p7-auth has been published as
     RFC 7235

  ** Downref: Normative reference to an Informational RFC: RFC 1950

  ** Downref: Normative reference to an Informational RFC: RFC 1951

  ** Downref: Normative reference to an Informational RFC: RFC 1952

  -- Possible downref: Non-RFC (?) normative reference: ref. 'USASCII'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'Welch'

  -- Obsolete informational reference (is this intentional?): RFC 4395 (ref.
     'BCP115') (Obsoleted by RFC 7595)

  -- Obsolete informational reference (is this intentional?): RFC 2068
     (Obsoleted by RFC 2616)

  -- Obsolete informational reference (is this intentional?): RFC 2145
     (Obsoleted by RFC 7230)

  -- Obsolete informational reference (is this intentional?): RFC 2616
     (Obsoleted by RFC 7230, RFC 7231, RFC 7232, RFC 7233, RFC 7234, RFC 7235)

  -- Obsolete informational reference (is this intentional?): RFC 5226
     (Obsoleted by RFC 8126)

  -- Obsolete informational reference (is this intentional?): RFC 5246
     (Obsoleted by RFC 8446)

     Summary: 3 errors (**), 0 flaws (~~), 9 warnings (==), 15 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

HTTPbis Working Group                                   R. Fielding, Ed.
Internet-Draft                                                     Adobe
Obsoletes: 2145,2616 (if approved)                       J. Reschke, Ed.
Updates: 2817,2818 (if approved)                              greenbytes
Intended status: Standards Track                      September 25, 2013
Expires: March 29, 2014

   Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing
                   draft-ietf-httpbis-p1-messaging-24

Abstract

   The Hypertext Transfer Protocol (HTTP) is an application-level
   protocol for distributed, collaborative, hypertext information
   systems. HTTP has been in use by the World Wide Web global
   information initiative since 1990. This document provides an
   overview of HTTP architecture and its associated terminology, defines
   the "http" and "https" Uniform Resource Identifier (URI) schemes,
   defines the HTTP/1.1 message syntax and parsing requirements, and
   describes general security concerns for implementations.

Editorial Note (To be removed by RFC Editor)

   Discussion of this draft takes place on the HTTPBIS working group
   mailing list (ietf-http-wg@w3.org), which is archived at .

   The current issues list is at and related documents (including fancy
   diffs) can be found at .

   The changes in this draft are summarized in Appendix C.4.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on March 29, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008. The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1.  Introduction
     1.1.  Requirement Notation
     1.2.  Syntax Notation
   2.  Architecture
     2.1.  Client/Server Messaging
     2.2.  Implementation Diversity
     2.3.  Intermediaries
     2.4.  Caches
     2.5.  Conformance and Error Handling
     2.6.  Protocol Versioning
     2.7.  Uniform Resource Identifiers
       2.7.1.  http URI scheme
       2.7.2.  https URI scheme
       2.7.3.  http and https URI Normalization and Comparison
   3.  Message Format
     3.1.  Start Line
       3.1.1.  Request Line
       3.1.2.  Status Line
     3.2.  Header Fields
       3.2.1.  Field Extensibility
       3.2.2.  Field Order
       3.2.3.  Whitespace
       3.2.4.  Field Parsing
       3.2.5.  Field Limits
       3.2.6.  Field value components
     3.3.  Message Body
       3.3.1.  Transfer-Encoding
       3.3.2.  Content-Length
       3.3.3.  Message Body Length
     3.4.  Handling Incomplete Messages
     3.5.  Message Parsing Robustness
   4.  Transfer Codings
     4.1.  Chunked Transfer Coding
       4.1.1.  Chunk Extensions
       4.1.2.  Chunked Trailer Part
       4.1.3.  Decoding Chunked
     4.2.  Compression Codings
       4.2.1.  Compress Coding
       4.2.2.  Deflate Coding
       4.2.3.  Gzip Coding
     4.3.  TE
     4.4.  Trailer
   5.  Message Routing
     5.1.  Identifying a Target Resource
     5.2.  Connecting Inbound
     5.3.  Request Target
     5.4.  Host
     5.5.  Effective Request URI
     5.6.  Associating a Response to a Request
     5.7.  Message Forwarding
       5.7.1.  Via
       5.7.2.  Transformations
   6.  Connection Management
     6.1.  Connection
     6.2.  Establishment
     6.3.  Persistence
       6.3.1.  Retrying Requests
       6.3.2.  Pipelining
     6.4.  Concurrency
     6.5.  Failures and Time-outs
     6.6.  Tear-down
     6.7.  Upgrade
   7.  ABNF list extension: #rule
   8.  IANA Considerations
     8.1.  Header Field Registration
     8.2.  URI Scheme Registration
     8.3.  Internet Media Type Registration
       8.3.1.  Internet Media Type message/http
       8.3.2.  Internet Media Type application/http
     8.4.  Transfer Coding Registry
       8.4.1.  Procedure
       8.4.2.  Registration
     8.5.  Upgrade Token Registry
       8.5.1.  Procedure
       8.5.2.  Upgrade Token Registration
   9.  Security Considerations
     9.1.  DNS-related Attacks
     9.2.  Intermediaries and Caching
     9.3.  Buffer Overflows
     9.4.  Message Integrity
     9.5.  Server Log Information
   10. Acknowledgments
   11. References
     11.1. Normative References
     11.2. Informative References
   Appendix A.  HTTP Version History
     A.1.  Changes from HTTP/1.0
       A.1.1.  Multi-homed Web Servers
       A.1.2.  Keep-Alive Connections
       A.1.3.  Introduction of Transfer-Encoding
     A.2.  Changes from RFC 2616
   Appendix B.  Collected ABNF
   Appendix C.  Change Log (to be removed by RFC Editor before
                publication)
     C.1.  Since RFC 2616
     C.2.  Since draft-ietf-httpbis-p1-messaging-21
     C.3.  Since draft-ietf-httpbis-p1-messaging-22
     C.4.  Since draft-ietf-httpbis-p1-messaging-23
   Index

1.  Introduction

   The Hypertext Transfer Protocol (HTTP) is an application-level
   request/response protocol that uses extensible semantics and
   self-descriptive message payloads for flexible interaction with
   network-based hypertext information systems.
   This document is the first in a
   series of documents that collectively form the HTTP/1.1
   specification:

      RFC xxx1: Message Syntax and Routing

      RFC xxx2: Semantics and Content

      RFC xxx3: Conditional Requests

      RFC xxx4: Range Requests

      RFC xxx5: Caching

      RFC xxx6: Authentication

   This HTTP/1.1 specification obsoletes and moves to historic status
   RFC 2616, its predecessor RFC 2068, and RFC 2145 (on HTTP
   versioning). This specification also updates the use of CONNECT to
   establish a tunnel, previously defined in RFC 2817, and defines the
   "https" URI scheme that was described informally in RFC 2818.

   HTTP is a generic interface protocol for information systems. It is
   designed to hide the details of how a service is implemented by
   presenting a uniform interface to clients that is independent of the
   types of resources provided. Likewise, servers do not need to be
   aware of each client's purpose: an HTTP request can be considered in
   isolation rather than being associated with a specific type of client
   or a predetermined sequence of application steps. The result is a
   protocol that can be used effectively in many different contexts and
   for which implementations can evolve independently over time.

   HTTP is also designed for use as an intermediation protocol for
   translating communication to and from non-HTTP information systems.
   HTTP proxies and gateways can provide access to alternative
   information services by translating their diverse protocols into a
   hypertext format that can be viewed and manipulated by clients in the
   same way as HTTP services.

   One consequence of this flexibility is that the protocol cannot be
   defined in terms of what occurs behind the interface. Instead, we
   are limited to defining the syntax of communication, the intent of
   received communication, and the expected behavior of recipients.
   If the communication is considered in isolation, then successful
   actions ought to be reflected in corresponding changes to the
   observable interface provided by servers. However, since multiple
   clients might act in parallel and perhaps at cross-purposes, we
   cannot require that such changes be observable beyond the scope of a
   single response.

   This document describes the architectural elements that are used or
   referred to in HTTP, defines the "http" and "https" URI schemes,
   describes overall network operation and connection management, and
   defines HTTP message framing and forwarding requirements. Our goal
   is to define all of the mechanisms necessary for HTTP message
   handling that are independent of message semantics, thereby defining
   the complete set of requirements for message parsers and
   message-forwarding intermediaries.

1.1.  Requirement Notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   Conformance criteria and considerations regarding error handling are
   defined in Section 2.5.

1.2.  Syntax Notation

   This specification uses the Augmented Backus-Naur Form (ABNF)
   notation of [RFC5234] with the list rule extension defined in
   Section 7. Appendix B shows the collected ABNF with the list rule
   expanded.

   The following core rules are included by reference, as defined in
   [RFC5234], Appendix B.1: ALPHA (letters), CR (carriage return), CRLF
   (CR LF), CTL (controls), DIGIT (decimal 0-9), DQUOTE (double quote),
   HEXDIG (hexadecimal 0-9/A-F/a-f), HTAB (horizontal tab), LF (line
   feed), OCTET (any 8-bit sequence of data), SP (space), and VCHAR (any
   visible [USASCII] character).

   As a convention, ABNF rule names prefixed with "obs-" denote
   "obsolete" grammar rules that appear for historical reasons.

2.  Architecture

   HTTP was created for the World Wide Web architecture and has evolved
   over time to support the scalability needs of a worldwide hypertext
   system. Much of that architecture is reflected in the terminology
   and syntax productions used to define HTTP.

2.1.  Client/Server Messaging

   HTTP is a stateless request/response protocol that operates by
   exchanging messages (Section 3) across a reliable transport or
   session-layer "connection" (Section 6). An HTTP "client" is a
   program that establishes a connection to a server for the purpose of
   sending one or more HTTP requests. An HTTP "server" is a program
   that accepts connections in order to service HTTP requests by sending
   HTTP responses.

   The terms client and server refer only to the roles that these
   programs perform for a particular connection. The same program might
   act as a client on some connections and a server on others. We use
   the term "user agent" to refer to any of the various client programs
   that initiate a request, including (but not limited to) browsers,
   spiders (web-based robots), command-line tools, native applications,
   and mobile apps. The term "origin server" is used to refer to the
   program that can originate authoritative responses to a request. For
   general requirements, we use the terms "sender" and "recipient" to
   refer to any component that sends or receives, respectively, a given
   message.

   HTTP relies upon the Uniform Resource Identifier (URI) standard
   [RFC3986] to indicate the target resource (Section 5.1) and
   relationships between resources. Messages are passed in a format
   similar to that used by Internet mail [RFC5322] and the Multipurpose
   Internet Mail Extensions (MIME) [RFC2045] (see Appendix A of [Part2]
   for the differences between HTTP and MIME messages).
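
   As a rough illustration of how a URI indicates the target resource,
   the sketch below splits an "http" URI into the two pieces a client
   commonly sends when talking directly to an origin server: a request
   target (path and query) and a Host field value (authority). This
   common case is an assumption drawn from Sections 5.3 and 5.4, and the
   function name origin_form_parts is hypothetical. It is not a full
   implementation of the routing rules in Section 5.

```python
from urllib.parse import urlsplit


def origin_form_parts(uri):
    """Return (request_target, host) for an "http" or "https" URI.

    Hypothetical helper for illustration only: it covers the common
    case of a request sent directly to an origin server.
    """
    parts = urlsplit(uri)
    # An empty path is sent as "/" in the request target.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # netloc is the authority component (host, plus port if present).
    return target, parts.netloc


target, host = origin_form_parts("http://www.example.com/hello.txt")
print(target)  # /hello.txt
print(host)    # www.example.com
```

   The fragment component, if any, is deliberately left out of the
   request target in this sketch.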

   Most HTTP communication consists of a retrieval request (GET) for a
   representation of some resource identified by a URI. In the simplest
   case, this might be accomplished via a single bidirectional
   connection (===) between the user agent (UA) and the origin server
   (O).

            request   >
       UA ======================================= O
                                   <   response

   A client sends an HTTP request to a server in the form of a request
   message, beginning with a request-line that includes a method, URI,
   and protocol version (Section 3.1.1), followed by header fields
   containing request modifiers, client information, and representation
   metadata (Section 3.2), an empty line to indicate the end of the
   header section, and finally a message body containing the payload
   body (if any, Section 3.3).

   A server responds to a client's request by sending one or more HTTP
   response messages, each beginning with a status line that includes
   the protocol version, a success or error code, and textual reason
   phrase (Section 3.1.2), possibly followed by header fields containing
   server information, resource metadata, and representation metadata
   (Section 3.2), an empty line to indicate the end of the header
   section, and finally a message body containing the payload body (if
   any, Section 3.3).

   A connection might be used for multiple request/response exchanges,
   as defined in Section 6.3.
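
   The message layout described above (start line, header fields, an
   empty line, then an optional body) can be sketched in a few lines of
   Python. This is a minimal illustration under simplifying assumptions,
   not a conforming parser: the names build_request and split_message
   are hypothetical, header fields are treated as plain name/value
   pairs, and none of the parsing requirements of Section 3 (whitespace
   rules, length limits, message body framing) are enforced.

```python
CRLF = "\r\n"


def build_request(method, target, headers, body=b""):
    """Serialize request-line, header fields, empty line, optional body."""
    lines = [f"{method} {target} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers]
    # The empty line (a bare CRLF) marks the end of the header section.
    head = CRLF.join(lines) + CRLF + CRLF
    return head.encode("ascii") + body


def split_message(raw):
    """Split a raw message into (start_line, header_pairs, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    start_line, *field_lines = head.decode("ascii").split(CRLF)
    headers = []
    for line in field_lines:
        name, _, value = line.partition(":")
        headers.append((name, value.strip()))
    return start_line, headers, body


request = build_request(
    "GET", "/hello.txt",
    [("Host", "www.example.com"), ("Accept-Language", "en, mi")],
)
print(split_message(request)[0])  # GET /hello.txt HTTP/1.1

# A response uses the same layout; only the start line differs.
response = b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nhi"
status_line, fields, payload = split_message(response)
version, code, reason = status_line.split(" ", 2)
print(version, code, reason)  # HTTP/1.1 200 OK
```

   Requests and responses share one splitting routine here because the
   two message types differ only in the start line. A real recipient
   would determine the body length from the header fields rather than
   taking everything after the empty line.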

   The following example illustrates a typical message exchange for a
   GET request on the URI "http://www.example.com/hello.txt":

   Client request:

     GET /hello.txt HTTP/1.1
     User-Agent: curl/7.16.3 libcurl/7.16.3 OpenSSL/0.9.7l zlib/1.2.3
     Host: www.example.com
     Accept-Language: en, mi

   Server response:

     HTTP/1.1 200 OK
     Date: Mon, 27 Jul 2009 12:28:53 GMT
     Server: Apache
     Last-Modified: Wed, 22 Jul 2009 19:15:56 GMT
     ETag: "34aa387-d-1568eb00"
     Accept-Ranges: bytes
     Content-Length: 51
     Vary: Accept-Encoding
     Content-Type: text/plain

     Hello World! My payload includes a trailing CRLF.

2.2.  Implementation Diversity

   When considering the design of HTTP, it is easy to fall into a trap
   of thinking that all user agents are general-purpose browsers and all
   origin servers are large public websites. That is not the case in
   practice. Common HTTP user agents include household appliances,
   stereos, scales, firmware update scripts, command-line programs,
   mobile apps, and communication devices in a multitude of shapes and
   sizes. Likewise, common HTTP origin servers include home automation
   units, configurable networking components, office machines,
   autonomous robots, news feeds, traffic cameras, ad selectors, and
   video delivery platforms.

   The term "user agent" does not imply that there is a human user
   directly interacting with the software agent at the time of a
   request. In many cases, a user agent is installed or configured to
   run in the background and save its results for later inspection (or
   save only a subset of those results that might be interesting or
   erroneous). Spiders, for example, are typically given a start URI
   and configured to follow certain behavior while crawling the Web as a
   hypertext graph.

   The implementation diversity of HTTP means that we cannot assume the
   user agent can make interactive suggestions to a user or provide
   adequate warning for security or privacy options. In the few cases
   where this specification requires reporting of errors to the user, it
   is acceptable for such reporting to only be observable in an error
   console or log file. Likewise, requirements that an automated action
   be confirmed by the user before proceeding might be met via advance
   configuration choices, run-time options, or simple avoidance of the
   unsafe action; confirmation does not imply any specific user
   interface or interruption of normal processing if the user has
   already made that choice.

2.3.  Intermediaries

   HTTP enables the use of intermediaries to satisfy requests through a
   chain of connections. There are three common forms of HTTP
   intermediary: proxy, gateway, and tunnel. In some cases, a single
   intermediary might act as an origin server, proxy, gateway, or
   tunnel, switching behavior based on the nature of each request.

            >             >             >             >
       UA =========== A =========== B =========== C =========== O
                  <             <             <             <

   The figure above shows three intermediaries (A, B, and C) between the
   user agent and origin server. A request or response message that
   travels the whole chain will pass through four separate connections.
   Some HTTP communication options might apply only to the connection
   with the nearest, non-tunnel neighbor, only to the end-points of the
   chain, or to all connections along the chain. Although the diagram
   is linear, each participant might be engaged in multiple,
   simultaneous communications. For example, B might be receiving
   requests from many clients other than A, and/or forwarding requests
   to servers other than C, at the same time that it is handling A's
   request.
   Likewise, later requests might be sent through a different
   path of connections, often based on dynamic configuration for load
   balancing.

   We use the terms "upstream" and "downstream" to describe various
   requirements in relation to the directional flow of a message: all
   messages flow from upstream to downstream. Likewise, we use the
   terms inbound and outbound to refer to directions in relation to the
   request path: "inbound" means toward the origin server and "outbound"
   means toward the user agent.

   A "proxy" is a message forwarding agent that is selected by the
   client, usually via local configuration rules, to receive requests
   for some type(s) of absolute URI and attempt to satisfy those
   requests via translation through the HTTP interface. Some
   translations are minimal, such as for proxy requests for "http" URIs,
   whereas other requests might require translation to and from entirely
   different application-level protocols. Proxies are often used to
   group an organization's HTTP requests through a common intermediary
   for the sake of security, annotation services, or shared caching.

   An HTTP-to-HTTP proxy is called a "transforming proxy" if it is
   designed or configured to modify request or response messages in a
   semantically meaningful way (i.e., modifications, beyond those
   required by normal HTTP processing, that change the message in a way
   that would be significant to the original sender or potentially
   significant to downstream recipients). For example, a transforming
   proxy might be acting as a shared annotation server (modifying
   responses to include references to a local annotation database), a
   malware filter, a format transcoder, or an intranet-to-Internet
   privacy filter. Such transformations are presumed to be desired by
   the client (or client organization) that selected the proxy and are
   beyond the scope of this specification.
   However, when a proxy is not
   intended to transform a given message, we use the term
   "non-transforming proxy" to target requirements that preserve HTTP
   message semantics. See Section 6.3.4 of [Part2] and Section 5.5 of
   [Part6] for status and warning codes related to transformations.

   A "gateway" (a.k.a., "reverse proxy") is an intermediary that acts as
   an origin server for the outbound connection, but translates received
   requests and forwards them inbound to another server or servers.
   Gateways are often used to encapsulate legacy or untrusted
   information services, to improve server performance through
   "accelerator" caching, and to enable partitioning or load balancing
   of HTTP services across multiple machines.

   All HTTP requirements applicable to an origin server also apply to
   the outbound communication of a gateway. A gateway communicates with
   inbound servers using any protocol that it desires, including private
   extensions to HTTP that are outside the scope of this specification.
   However, an HTTP-to-HTTP gateway that wishes to interoperate with
   third-party HTTP servers ought to conform to user agent requirements
   on the gateway's inbound connection.

   A "tunnel" acts as a blind relay between two connections without
   changing the messages. Once active, a tunnel is not considered a
   party to the HTTP communication, though the tunnel might have been
   initiated by an HTTP request. A tunnel ceases to exist when both
   ends of the relayed connection are closed. Tunnels are used to
   extend a virtual connection through an intermediary, such as when
   Transport Layer Security (TLS, [RFC5246]) is used to establish
   confidential communication through a shared firewall proxy.

   The above categories for intermediary only consider those acting as
   participants in the HTTP communication.
   There are also
   intermediaries that can act on lower layers of the network protocol
   stack, filtering or redirecting HTTP traffic without the knowledge or
   permission of message senders. Network intermediaries often
   introduce security flaws or interoperability problems by violating
   HTTP semantics. For example, an "interception proxy" [RFC3040] (also
   commonly known as a "transparent proxy" [RFC1919] or "captive
   portal") differs from an HTTP proxy because it is not selected by the
   client. Instead, an interception proxy filters or redirects outgoing
   TCP port 80 packets (and occasionally other common port traffic).
   Interception proxies are commonly found on public network access
   points, as a means of enforcing account subscription prior to
   allowing use of non-local Internet services, and within corporate
   firewalls to enforce network usage policies. They are
   indistinguishable from a man-in-the-middle attack.

   HTTP is defined as a stateless protocol, meaning that each request
   message can be understood in isolation. Many implementations depend
   on HTTP's stateless design in order to reuse proxied connections or
   dynamically load-balance requests across multiple servers. Hence, a
   server MUST NOT assume that two requests on the same connection are
   from the same user agent unless the connection is secured and
   specific to that agent. Some non-standard HTTP extensions (e.g.,
   [RFC4559]) have been known to violate this requirement, resulting in
   security and interoperability problems.

2.4.  Caches

   A "cache" is a local store of previous response messages and the
   subsystem that controls its message storage, retrieval, and deletion.
   A cache stores cacheable responses in order to reduce the response
   time and network bandwidth consumption on future, equivalent
   requests.
   Any client or server MAY employ a cache, though a cache
   cannot be used by a server while it is acting as a tunnel.

   The effect of a cache is that the request/response chain is shortened
   if one of the participants along the chain has a cached response
   applicable to that request. The following illustrates the resulting
   chain if B has a cached copy of an earlier response from O (via C)
   for a request that has not been cached by UA or A.

               >             >
          UA =========== A =========== B - - - - - - C - - - - - - O
                     <             <

   A response is "cacheable" if a cache is allowed to store a copy of
   the response message for use in answering subsequent requests. Even
   when a response is cacheable, there might be additional constraints
   placed by the client or by the origin server on when that cached
   response can be used for a particular request. HTTP requirements for
   cache behavior and cacheable responses are defined in Section 2 of
   [Part6].

   There are a wide variety of architectures and configurations of
   caches deployed across the World Wide Web and inside large
   organizations. These include national hierarchies of proxy caches to
   save transoceanic bandwidth, collaborative systems that broadcast or
   multicast cache entries, archives of pre-fetched cache entries for
   use in off-line or high-latency environments, and so on.

2.5.  Conformance and Error Handling

   This specification targets conformance criteria according to the role
   of a participant in HTTP communication. Hence, HTTP requirements are
   placed on senders, recipients, clients, servers, user agents,
   intermediaries, origin servers, proxies, gateways, or caches,
   depending on what behavior is being constrained by the requirement.
   Additional (social) requirements are placed on implementations,
   resource owners, and protocol element registrations when they apply
   beyond the scope of a single communication.

The verb "generate" is used instead of "send" where a requirement
differentiates between creating a protocol element and merely
forwarding a received element downstream.

An implementation is considered conformant if it complies with all of
the requirements associated with the roles it partakes in HTTP.

Conformance includes both the syntax and semantics of HTTP protocol
elements. A sender MUST NOT generate protocol elements that convey a
meaning that is known by that sender to be false. A sender MUST NOT
generate protocol elements that do not match the grammar defined by
the corresponding ABNF rules. Within a given message, a sender MUST
NOT generate protocol elements or syntax alternatives that are only
allowed to be generated by participants in other roles (i.e., a role
that the sender does not have for that message).

When a received protocol element is parsed, the recipient MUST be
able to parse any value of reasonable length that is applicable to
the recipient's role and matches the grammar defined by the
corresponding ABNF rules. Note, however, that some received protocol
elements might not be parsed. For example, an intermediary
forwarding a message might parse a header-field into generic
field-name and field-value components, but then forward the header
field without further parsing inside the field-value.

HTTP does not have specific length limitations for many of its
protocol elements because the lengths that might be appropriate will
vary widely, depending on the deployment context and purpose of the
implementation. Hence, interoperability between senders and
recipients depends on shared expectations regarding what is a
reasonable length for each protocol element.
Furthermore, what is
commonly understood to be a reasonable length for some protocol
elements has changed over the course of the past two decades of HTTP
use, and is expected to continue changing in the future.

At a minimum, a recipient MUST be able to parse and process protocol
element lengths that are at least as long as the values that it
generates for those same protocol elements in other messages. For
example, an origin server that publishes very long URI references to
its own resources needs to be able to parse and process those same
references when received as a request target.

A recipient MUST interpret a received protocol element according to
the semantics defined for it by this specification, including
extensions to this specification, unless the recipient has determined
(through experience or configuration) that the sender incorrectly
implements what is implied by those semantics. For example, an
origin server might disregard the contents of a received
Accept-Encoding header field if inspection of the User-Agent header
field indicates a specific implementation version that is known to
fail on receipt of certain content codings.

Unless noted otherwise, a recipient MAY attempt to recover a usable
protocol element from an invalid construct. HTTP does not define
specific error handling mechanisms except when they have a direct
impact on security, since different applications of the protocol
require different error handling strategies. For example, a Web
browser might wish to transparently recover from a response where the
Location header field doesn't parse according to the ABNF, whereas a
systems control client might consider any form of error recovery to
be dangerous.

2.6. Protocol Versioning

HTTP uses a "<major>.<minor>" numbering scheme to indicate versions
of the protocol. This specification defines version "1.1".
The
protocol version as a whole indicates the sender's conformance with
the set of requirements laid out in that version's corresponding
specification of HTTP.

The version of an HTTP message is indicated by an HTTP-version field
in the first line of the message. HTTP-version is case-sensitive.

     HTTP-version  = HTTP-name "/" DIGIT "." DIGIT
     HTTP-name     = %x48.54.54.50 ; "HTTP", case-sensitive

The HTTP version number consists of two decimal digits separated by a
"." (period or decimal point). The first digit ("major version")
indicates the HTTP messaging syntax, whereas the second digit ("minor
version") indicates the highest minor version within that major
version to which the sender is conformant and able to understand for
future communication. The minor version advertises the sender's
communication capabilities even when the sender is only using a
backwards-compatible subset of the protocol, thereby letting the
recipient know that more advanced features can be used in response
(by servers) or in future requests (by clients).

When an HTTP/1.1 message is sent to an HTTP/1.0 recipient [RFC1945]
or a recipient whose version is unknown, the HTTP/1.1 message is
constructed such that it can be interpreted as a valid HTTP/1.0
message if all of the newer features are ignored. This specification
places recipient-version requirements on some new features so that a
conformant sender will only use compatible features until it has
determined, through configuration or the receipt of a message, that
the recipient supports HTTP/1.1.

The interpretation of a header field does not change between minor
versions of the same major HTTP version, though the default behavior
of a recipient in the absence of such a field can change. Unless
specified otherwise, header fields defined in HTTP/1.1 are defined
for all versions of HTTP/1.x.
In particular, the Host and Connection
header fields ought to be implemented by all HTTP/1.x implementations
whether or not they advertise conformance with HTTP/1.1.

New header fields can be introduced without changing the protocol
version if their defined semantics allow them to be safely ignored by
recipients that do not recognize them. Header field extensibility is
discussed in Section 3.2.1.

Intermediaries that process HTTP messages (i.e., all intermediaries
other than those acting as tunnels) MUST send their own HTTP-version
in forwarded messages. In other words, they MUST NOT blindly forward
the first line of an HTTP message without ensuring that the protocol
version in that message matches a version to which that intermediary
is conformant for both the receiving and sending of messages.
Forwarding an HTTP message without rewriting the HTTP-version might
result in communication errors when downstream recipients use the
message sender's version to determine what features are safe to use
for later communication with that sender.

A client SHOULD send a request version equal to the highest version
to which the client is conformant and whose major version is no
higher than the highest version supported by the server, if this is
known. A client MUST NOT send a version to which it is not
conformant.

A client MAY send a lower request version if it is known that the
server incorrectly implements the HTTP specification, but only after
the client has attempted at least one normal request and determined
from the response status code or header fields (e.g., Server) that
the server improperly handles higher request versions.

A server SHOULD send a response version equal to the highest version
to which the server is conformant and whose major version is less
than or equal to the one received in the request.
A server MUST NOT
send a version to which it is not conformant. A server MAY send a
505 (HTTP Version Not Supported) response if it cannot send a
response using the major version used in the client's request.

A server MAY send an HTTP/1.0 response to a request if it is known or
suspected that the client incorrectly implements the HTTP
specification and is incapable of correctly processing later version
responses, such as when a client fails to parse the version number
correctly or when an intermediary is known to blindly forward the
HTTP-version even when it doesn't conform to the given minor version
of the protocol. Such protocol downgrades SHOULD NOT be performed
unless triggered by specific client attributes, such as when one or
more of the request header fields (e.g., User-Agent) uniquely match
the values sent by a client known to be in error.

The intention of HTTP's versioning design is that the major number
will only be incremented if an incompatible message syntax is
introduced, and that the minor number will only be incremented when
changes made to the protocol have the effect of adding to the message
semantics or implying additional capabilities of the sender.
However, the minor version was not incremented for the changes
introduced between [RFC2068] and [RFC2616], and this revision has
specifically avoided any such changes to the protocol.

When an HTTP message is received with a major version number that the
recipient implements, but a higher minor version number than what the
recipient implements, the recipient SHOULD process the message as if
it were in the highest minor version within that major version to
which the recipient is conformant.
A recipient can assume that a
message with a higher minor version, when sent to a recipient that
has not yet indicated support for that higher version, is
sufficiently backwards-compatible to be safely processed by any
implementation of the same major version.

2.7. Uniform Resource Identifiers

Uniform Resource Identifiers (URIs) [RFC3986] are used throughout
HTTP as the means for identifying resources (Section 2 of [Part2]).
URI references are used to target requests, indicate redirects, and
define relationships.

This specification adopts the definitions of "URI-reference",
"absolute-URI", "relative-part", "authority", "port", "host",
"path-abempty", "segment", "query", and "fragment" from the URI
generic syntax. In addition, we define an "absolute-path" rule (that
differs from RFC 3986's "path-absolute" in that it allows a leading
"//") and a "partial-URI" rule for protocol elements that allow a
relative URI but not a fragment.

     URI-reference = <URI-reference, see [RFC3986], Section 4.1>
     absolute-URI  = <absolute-URI, see [RFC3986], Section 4.3>
     relative-part = <relative-part, see [RFC3986], Section 4.2>
     authority     = <authority, see [RFC3986], Section 3.2>
     uri-host      = <host, see [RFC3986], Section 3.2.2>
     port          = <port, see [RFC3986], Section 3.2.3>
     path-abempty  = <path-abempty, see [RFC3986], Section 3.3>
     segment       = <segment, see [RFC3986], Section 3.3>
     query         = <query, see [RFC3986], Section 3.4>
     fragment      = <fragment, see [RFC3986], Section 3.5>

     absolute-path = 1*( "/" segment )
     partial-URI   = relative-part [ "?" query ]

Each protocol element in HTTP that allows a URI reference will
indicate in its ABNF production whether the element allows any form
of reference (URI-reference), only a URI in absolute form
(absolute-URI), only the path and optional query components, or some
combination of the above. Unless otherwise indicated, URI references
are parsed relative to the effective request URI (Section 5.5).

2.7.1. http URI scheme

The "http" URI scheme is hereby defined for the purpose of minting
identifiers according to their association with the hierarchical
namespace governed by a potential HTTP origin server listening for
TCP ([RFC0793]) connections on a given port.

     http-URI = "http:" "//" authority path-abempty [ "?" query ]
                [ "#" fragment ]

The HTTP origin server is identified by the generic syntax's
authority component, which includes a host identifier and optional
TCP port ([RFC3986], Section 3.2.2). The remainder of the URI,
consisting of both the hierarchical path component and optional query
component, serves as an identifier for a potential resource within
that origin server's name space.

A sender MUST NOT generate an "http" URI with an empty host
identifier. A recipient that processes such a URI reference MUST
reject it as invalid.

If the host identifier is provided as an IP address, then the origin
server is any listener on the indicated TCP port at that IP address.
If host is a registered name, then that name is considered an
indirect identifier and the recipient might use a name resolution
service, such as DNS, to find the address of a listener for that
host. If the port subcomponent is empty or not given, then TCP port
80 is assumed (the default reserved port for WWW services).

Regardless of the form of host identifier, access to that host is not
implied by the mere presence of its name or address. The host might
or might not exist and, even when it does exist, might or might not
be running an HTTP server or listening to the indicated port. The
"http" URI scheme makes use of the delegated nature of Internet names
and addresses to establish a naming authority (whatever entity has
the ability to place an HTTP server at that Internet name or address)
and allows that authority to determine which names are valid and how
they might be used.
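
The empty-host and default-port rules above can be illustrated with a
short Python sketch. It is not part of this specification: the
function name http_origin is hypothetical, and the sketch relies on
the Python standard library's urllib.parse.urlsplit for parsing rather
than on the ABNF given here, so it illustrates only these two rules
and not full URI validation.

```python
from urllib.parse import urlsplit

DEFAULT_HTTP_PORT = 80  # assumed when the port is empty or not given


def http_origin(uri):
    """Return (host, port) for an "http" URI, or raise ValueError.

    Hypothetical helper: it only illustrates the empty-host and
    default-port rules, not complete URI validation.
    """
    parts = urlsplit(uri)
    if parts.scheme != "http":
        raise ValueError("not an http URI")
    host = parts.hostname  # None when the host identifier is empty
    if not host:
        # A recipient rejects an "http" URI with an empty host.
        raise ValueError("empty host identifier")
    port = parts.port  # None when the port is empty or not given
    if port is None:
        port = DEFAULT_HTTP_PORT
    return host, port
```

Under these assumptions, "http://example.com/" and
"http://example.com:/" both map to port 80, while "http:///path" is
rejected because its host identifier is empty.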

When an "http" URI is used within a context that calls for access to
the indicated resource, a client MAY attempt access by resolving the
host to an IP address, establishing a TCP connection to that address
on the indicated port, and sending an HTTP request message
(Section 3) containing the URI's identifying data (Section 5) to the
server. If the server responds to that request with a non-interim
HTTP response message, as described in Section 6 of [Part2], then
that response is considered an authoritative answer to the client's
request.

Although HTTP is independent of the transport protocol, the "http"
scheme is specific to TCP-based services because the name delegation
process depends on TCP for establishing authority. An HTTP service
based on some other underlying connection protocol would presumably
be identified using a different URI scheme, just as the "https"
scheme (below) is used for resources that require an end-to-end
secured connection. Other protocols might also be used to provide
access to "http" identified resources -- it is only the authoritative
interface that is specific to TCP.

The URI generic syntax for authority also includes a deprecated
userinfo subcomponent ([RFC3986], Section 3.2.1) for including user
authentication information in the URI. Some implementations make use
of the userinfo component for internal configuration of
authentication information, such as within command invocation
options, configuration files, or bookmark lists, even though such
usage might expose a user identifier or password. A sender MUST NOT
generate the userinfo subcomponent (and its "@" delimiter) when an
"http" URI reference is generated within a message as a request
target or header field value.
Before making use of an "http" URI
reference received from an untrusted source, a recipient ought to
parse for userinfo and treat its presence as an error; it is likely
being used to obscure the authority for the sake of phishing attacks.

2.7.2. https URI scheme

The "https" URI scheme is hereby defined for the purpose of minting
identifiers according to their association with the hierarchical
namespace governed by a potential HTTP origin server listening to a
given TCP port for TLS-secured connections ([RFC0793], [RFC5246]).

All of the requirements listed above for the "http" scheme are also
requirements for the "https" scheme, except that a default TCP port
of 443 is assumed if the port subcomponent is empty or not given, and
the user agent MUST ensure that its connection to the origin server
is secured through the use of strong encryption, end-to-end, prior to
sending the first HTTP request.

     https-URI = "https:" "//" authority path-abempty [ "?" query ]
                 [ "#" fragment ]

Note that the "https" URI scheme depends on both TLS and TCP for
establishing authority. Resources made available via the "https"
scheme have no shared identity with the "http" scheme even if their
resource identifiers indicate the same authority (the same host
listening to the same TCP port). They are distinct name spaces and
are considered to be distinct origin servers. However, an extension
to HTTP that is defined to apply to entire host domains, such as the
Cookie protocol [RFC6265], can allow information set by one service
to impact communication with other services within a matching group
of host domains.

The process for authoritative access to an "https" identified
resource is defined in [RFC2818].

2.7.3. http and https URI Normalization and Comparison

Since the "http" and "https" schemes conform to the URI generic
syntax, such URIs are normalized and compared according to the
algorithm defined in [RFC3986], Section 6, using the defaults
described above for each scheme.

If the port is equal to the default port for a scheme, the normal
form is to omit the port subcomponent. When not being used in
absolute form as the request target of an OPTIONS request, an empty
path component is equivalent to an absolute path of "/", so the
normal form is to provide a path of "/" instead. The scheme and host
are case-insensitive and normally provided in lowercase; all other
components are compared in a case-sensitive manner. Characters other
than those in the "reserved" set are equivalent to their
percent-encoded octets (see [RFC3986], Section 2.1): the normal form
is to not encode them.

For example, the following three URIs are equivalent:

     http://example.com:80/~smith/home.html
     http://EXAMPLE.com/%7Esmith/home.html
     http://EXAMPLE.com:/%7esmith/home.html

3. Message Format

All HTTP/1.1 messages consist of a start-line followed by a sequence
of octets in a format similar to the Internet Message Format
[RFC5322]: zero or more header fields (collectively referred to as
the "headers" or the "header section"), an empty line indicating the
end of the header section, and an optional message body.

     HTTP-message   = start-line
                      *( header-field CRLF )
                      CRLF
                      [ message-body ]

The normal procedure for parsing an HTTP message is to read the
start-line into a structure, read each header field into a hash table
by field name until the empty line, and then use the parsed data to
determine if a message body is expected.
If a message body has been
indicated, then it is read as a stream until a number of octets
equal to the message body length is read or the connection is closed.

A recipient MUST parse an HTTP message as a sequence of octets in an
encoding that is a superset of US-ASCII [USASCII]. Parsing an HTTP
message as a stream of Unicode characters, without regard for the
specific encoding, creates security vulnerabilities due to the
varying ways that string processing libraries handle invalid
multibyte character sequences that contain the octet LF (%x0A).
String-based parsers can only be safely used within protocol elements
after the element has been extracted from the message, such as within
a header field-value after message parsing has delineated the
individual fields.

An HTTP message can be parsed as a stream for incremental processing
or forwarding downstream. However, recipients cannot rely on
incremental delivery of partial messages, since some implementations
will buffer or delay message forwarding for the sake of network
efficiency, security checks, or payload transformations.

A sender MUST NOT send whitespace between the start-line and the
first header field. A recipient that receives whitespace between the
start-line and the first header field MUST either reject the message
as invalid or consume each whitespace-preceded line without further
processing of it (i.e., ignore the entire line, along with any
subsequent lines preceded by whitespace, until a properly formed
header field is received or the header section is terminated).

The presence of such whitespace in a request might be an attempt to
trick a server into ignoring that field or processing the line after
it as a new request, either of which might result in a security
vulnerability if other implementations within the request chain
interpret the same message differently.
Likewise, the presence of
such whitespace in a response might be ignored by some clients or
cause others to cease parsing.

3.1. Start Line

An HTTP message can either be a request from client to server or a
response from server to client. Syntactically, the two types of
message differ only in the start-line, which is either a request-line
(for requests) or a status-line (for responses), and in the algorithm
for determining the length of the message body (Section 3.3).

In theory, a client could receive requests and a server could receive
responses, distinguishing them by their different start-line formats,
but in practice servers are implemented to only expect a request (a
response is interpreted as an unknown or invalid request method) and
clients are implemented to only expect a response.

     start-line     = request-line / status-line

3.1.1. Request Line

A request-line begins with a method token, followed by a single space
(SP), the request-target, another single space (SP), the protocol
version, and ending with CRLF.

     request-line   = method SP request-target SP HTTP-version CRLF

The method token indicates the request method to be performed on the
target resource. The request method is case-sensitive.

     method         = token

The request methods defined by this specification can be found in
Section 4 of [Part2], along with information regarding the HTTP
method registry and considerations for defining new methods.

The request-target identifies the target resource upon which to apply
the request, as defined in Section 5.3.

Recipients typically parse the request-line into its component parts
by splitting on whitespace (see Section 3.5), since no whitespace is
allowed in the three components.
Unfortunately, some user agents
fail to properly encode or exclude whitespace found in hypertext
references, resulting in those disallowed characters being sent in a
request-target.

Recipients of an invalid request-line SHOULD respond with either a
400 (Bad Request) error or a 301 (Moved Permanently) redirect with
the request-target properly encoded. A recipient SHOULD NOT attempt
to autocorrect and then process the request without a redirect, since
the invalid request-line might be deliberately crafted to bypass
security filters along the request chain.

HTTP does not place a pre-defined limit on the length of a
request-line. A server that receives a method longer than any that it
implements SHOULD respond with a 501 (Not Implemented) status code.
A server ought to be prepared to receive URIs of unbounded length, as
described in Section 2.5, and MUST respond with a 414 (URI Too Long)
status code if the received request-target is longer than the server
wishes to parse (see Section 6.5.12 of [Part2]).

Various ad-hoc limitations on request-line length are found in
practice. It is RECOMMENDED that all HTTP senders and recipients
support, at a minimum, request-line lengths of 8000 octets.

3.1.2. Status Line

The first line of a response message is the status-line, consisting
of the protocol version, a space (SP), the status code, another
space, a possibly-empty textual phrase describing the status code,
and ending with CRLF.

     status-line = HTTP-version SP status-code SP reason-phrase CRLF

The status-code element is a 3-digit integer code describing the
result of the server's attempt to understand and satisfy the client's
corresponding request. The rest of the response message is to be
interpreted in light of the semantics defined for that status code.
See Section 6 of [Part2] for information about the semantics of
status codes, including the classes of status code (indicated by the
first digit), the status codes defined by this specification,
considerations for the definition of new status codes, and the IANA
registry.

     status-code    = 3DIGIT

The reason-phrase element exists for the sole purpose of providing a
textual description associated with the numeric status code, mostly
out of deference to earlier Internet application protocols that were
more frequently used with interactive text clients. A client SHOULD
ignore the reason-phrase content.

     reason-phrase  = *( HTAB / SP / VCHAR / obs-text )

3.2. Header Fields

Each HTTP header field consists of a case-insensitive field name
followed by a colon (":"), optional leading whitespace, the field
value, and optional trailing whitespace.

     header-field   = field-name ":" OWS field-value OWS
     field-name     = token
     field-value    = *( field-content / obs-fold )
     field-content  = *( HTAB / SP / VCHAR / obs-text )
     obs-fold       = CRLF ( SP / HTAB )
                    ; obsolete line folding
                    ; see Section 3.2.4

The field-name token labels the corresponding field-value as having
the semantics defined by that header field. For example, the Date
header field is defined in Section 7.1.1.2 of [Part2] as containing
the origination timestamp for the message in which it appears.

3.2.1. Field Extensibility

Header fields are fully extensible: there is no limit on the
introduction of new field names, each presumably defining new
semantics, nor on the number of header fields used in a given
message. Existing fields are defined in each part of this
specification and in many other specifications outside the core
standard.

New header fields can be defined such that, when they are understood
by a recipient, they might override or enhance the interpretation of
previously defined header fields, define preconditions on request
evaluation, or refine the meaning of responses.

A proxy MUST forward unrecognized header fields unless the field-name
is listed in the Connection header field (Section 6.1) or the proxy
is specifically configured to block, or otherwise transform, such
fields. Other recipients SHOULD ignore unrecognized header fields.
These requirements allow HTTP's functionality to be enhanced without
requiring prior update of deployed intermediaries.

All defined header fields ought to be registered with IANA in the
Message Header Field Registry, as described in Section 8.3 of
[Part2].

3.2.2. Field Order

The order in which header fields with differing field names are
received is not significant. However, it is "good practice" to send
header fields that contain control data first, such as Host on
requests and Date on responses, so that implementations can decide
when not to handle a message as early as possible. A server MUST
wait until the entire header section is received before interpreting
a request message, since later header fields might include
conditionals, authentication credentials, or deliberately misleading
duplicate header fields that would impact request processing.

A sender MUST NOT generate multiple header fields with the same field
name in a message unless either the entire field value for that
header field is defined as a comma-separated list [i.e., #(values)]
or the header field is a well-known exception (as noted below).

A recipient MAY combine multiple header fields with the same field
name into one "field-name: field-value" pair, without changing the
semantics of the message, by appending each subsequent field value to
the combined field value in order, separated by a comma. The order
in which header fields with the same field name are received is
therefore significant to the interpretation of the combined field
value; a proxy MUST NOT change the order of these field values when
forwarding a message.

   Note: In practice, the "Set-Cookie" header field ([RFC6265]) often
   appears multiple times in a response message and does not use the
   list syntax, violating the above requirements on multiple header
   fields with the same name. Since it cannot be combined into a
   single field-value, recipients ought to handle "Set-Cookie" as a
   special case while processing header fields. (See Appendix A.2.3
   of [Kri2001] for details.)

3.2.3. Whitespace

This specification uses three rules to denote the use of linear
whitespace: OWS (optional whitespace), RWS (required whitespace), and
BWS ("bad" whitespace).

The OWS rule is used where zero or more linear whitespace octets
might appear. For protocol elements where optional whitespace is
preferred to improve readability, a sender SHOULD generate the
optional whitespace as a single SP; otherwise, a sender SHOULD NOT
generate optional whitespace except as needed to white-out invalid or
unwanted protocol elements during in-place message filtering.

The RWS rule is used when at least one linear whitespace octet is
required to separate field tokens. A sender SHOULD generate RWS as a
single SP.

The BWS rule is used where the grammar allows optional whitespace
only for historical reasons. A sender MUST NOT generate BWS in
messages.
A recipient MUST parse for such bad whitespace and remove
it before interpreting the protocol element.

     OWS            = *( SP / HTAB )
                    ; optional whitespace
     RWS            = 1*( SP / HTAB )
                    ; required whitespace
     BWS            = OWS
                    ; "bad" whitespace

3.2.4. Field Parsing

No whitespace is allowed between the header field-name and colon. In
the past, differences in the handling of such whitespace have led to
security vulnerabilities in request routing and response handling. A
server MUST reject any received request message that contains
whitespace between a header field-name and colon with a response code
of 400 (Bad Request). A proxy MUST remove any such whitespace from a
response message before forwarding the message downstream.

A field value is preceded by optional whitespace (OWS); a single SP
is preferred. The field value does not include any leading or
trailing whitespace: OWS occurring before the first non-whitespace
octet of the field value or after the last non-whitespace octet of
the field value ought to be excluded by parsers when extracting the
field value from a header field.

A recipient of field-content containing multiple sequential octets of
optional (OWS) or required (RWS) whitespace SHOULD either replace the
sequence with a single SP or transform any non-SP octets in the
sequence to SP octets before interpreting the field value or
forwarding the message downstream.

Historically, HTTP header field values could be extended over
multiple lines by preceding each extra line with at least one space
or horizontal tab (obs-fold). This specification deprecates such
line folding except within the message/http media type
(Section 8.3.1).
A sender MUST NOT generate a message that includes
line folding (i.e., that has any field-value that contains a match to
the obs-fold rule) unless the message is intended for packaging
within the message/http media type.

A server that receives an obs-fold in a request message that is not
within a message/http container MUST either reject the message by
sending a 400 (Bad Request), preferably with a representation
explaining that obsolete line folding is unacceptable, or replace
each received obs-fold with one or more SP octets prior to
interpreting the field value or forwarding the message downstream.

A proxy or gateway that receives an obs-fold in a response message
that is not within a message/http container MUST either discard the
message and replace it with a 502 (Bad Gateway) response, preferably
with a representation explaining that unacceptable line folding was
received, or replace each received obs-fold with one or more SP
octets prior to interpreting the field value or forwarding the
message downstream.

A user agent that receives an obs-fold in a response message that is
not within a message/http container MUST replace each received
obs-fold with one or more SP octets prior to interpreting the field
value.

Historically, HTTP has allowed field content with text in the
ISO-8859-1 [ISO-8859-1] charset, supporting other charsets only
through use of [RFC2047] encoding. In practice, most HTTP header
field values use only a subset of the US-ASCII charset [USASCII].
Newly defined header fields SHOULD limit their field values to
US-ASCII octets. A recipient SHOULD treat other octets in field
content (obs-text) as opaque data.

3.2.5. Field Limits

HTTP does not place a pre-defined limit on the length of each header
field or on the length of the header section as a whole, as described
in Section 2.5.
Various ad-hoc limitations on individual header 1201 field length are found in practice, often depending on the specific 1202 field semantics. 1204 A server ought to be prepared to receive request header fields of 1205 unbounded length and MUST respond with an appropriate 4xx (Client 1206 Error) status code if the received header field(s) are larger than 1207 the server wishes to process. 1209 A client ought to be prepared to receive response header fields of 1210 unbounded length. A client MAY discard or truncate received header 1211 fields that are larger than the client wishes to process if the field 1212 semantics are such that the dropped value(s) can be safely ignored 1213 without changing the message framing or response semantics. 1215 3.2.6. Field value components 1217 Many HTTP header field values consist of words (token or quoted- 1218 string) separated by whitespace or special characters. 1220 word = token / quoted-string 1222 token = 1*tchar 1224 tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*" 1225 / "+" / "-" / "." / "^" / "_" / "`" / "|" / "~" 1226 / DIGIT / ALPHA 1227 ; any VCHAR, except special 1229 special = "(" / ")" / "<" / ">" / "@" / "," 1230 / ";" / ":" / "\" / DQUOTE / "/" / "[" 1231 / "]" / "?" / "=" / "{" / "}" 1233 A string of text is parsed as a single word if it is quoted using 1234 double-quote marks. 1236 quoted-string = DQUOTE *( qdtext / quoted-pair ) DQUOTE 1237 qdtext = HTAB / SP / %x21 / %x23-5B / %x5D-7E / obs-text 1238 obs-text = %x80-FF 1240 The backslash octet ("\") can be used as a single-octet quoting 1241 mechanism within quoted-string constructs: 1243 quoted-pair = "\" ( HTAB / SP / VCHAR / obs-text ) 1245 Recipients that process the value of a quoted-string MUST handle a 1246 quoted-pair as if it were replaced by the octet following the 1247 backslash. 1249 A sender SHOULD NOT generate a quoted-pair in a quoted-string except 1250 where necessary to quote DQUOTE and backslash octets occurring within 1251 that string.
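
The quoted-string and quoted-pair rules above can be illustrated with a short, non-normative Python sketch. The function names unquote and quote, and the choice to raise ValueError on malformed input, are assumptions made for illustration only; they are not part of this specification.

```python
def _is_qdtext(octet):
    # qdtext = HTAB / SP / %x21 / %x23-5B / %x5D-7E / obs-text
    return (octet in (0x09, 0x20, 0x21) or 0x23 <= octet <= 0x5B
            or 0x5D <= octet <= 0x7E or octet >= 0x80)


def _is_quotable(octet):
    # Octets allowed after the backslash: HTAB / SP / VCHAR / obs-text
    return octet in (0x09, 0x20) or 0x21 <= octet <= 0x7E or octet >= 0x80


def unquote(quoted):
    """Return the value of a quoted-string (bytes), replacing each
    quoted-pair by the octet that follows the backslash."""
    if len(quoted) < 2 or quoted[0] != 0x22 or quoted[-1] != 0x22:
        raise ValueError("not a quoted-string")
    out = bytearray()
    i, end = 1, len(quoted) - 1
    while i < end:
        octet = quoted[i]
        if octet == 0x5C:  # backslash introduces a quoted-pair
            i += 1
            if i >= end or not _is_quotable(quoted[i]):
                raise ValueError("invalid quoted-pair")
            out.append(quoted[i])
        elif _is_qdtext(octet):
            out.append(octet)
        else:
            raise ValueError("octet not allowed in quoted-string")
        i += 1
    return bytes(out)


def quote(value):
    """Wrap value (bytes) in DQUOTEs, escaping only DQUOTE and backslash."""
    out = bytearray(b'"')
    for octet in value:
        if not _is_quotable(octet):
            raise ValueError("octet cannot be carried in a quoted-string")
        if octet in (0x22, 0x5C):
            out.append(0x5C)
        out.append(octet)
    out.append(0x22)
    return bytes(out)
```

In this sketch the sender side (quote) escapes only the two octets that need it, while the recipient side (unquote) accepts any well-formed quoted-pair.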
1253 Comments can be included in some HTTP header fields by surrounding 1254 the comment text with parentheses. Comments are only allowed in 1255 fields containing "comment" as part of their field value definition. 1257 comment = "(" *( ctext / quoted-cpair / comment ) ")" 1258 ctext = HTAB / SP / %x21-27 / %x2A-5B / %x5D-7E / obs-text 1260 The backslash octet ("\") can be used as a single-octet quoting 1261 mechanism within comment constructs: 1263 quoted-cpair = "\" ( HTAB / SP / VCHAR / obs-text ) 1265 A sender SHOULD NOT escape octets in comments that do not require 1266 escaping (i.e., other than the backslash octet "\" and the 1267 parentheses "(" and ")"). 1269 3.3. Message Body 1271 The message body (if any) of an HTTP message is used to carry the 1272 payload body of that request or response. The message body is 1273 identical to the payload body unless a transfer coding has been 1274 applied, as described in Section 3.3.1. 1276 message-body = *OCTET 1278 The rules for when a message body is allowed in a message differ for 1279 requests and responses. 1281 The presence of a message body in a request is signaled by a Content- 1282 Length or Transfer-Encoding header field. Request message framing is 1283 independent of method semantics, even if the method does not define 1284 any use for a message body. 1286 The presence of a message body in a response depends on both the 1287 request method to which it is responding and the response status code 1288 (Section 3.1.2). Responses to the HEAD request method never include 1289 a message body because the associated response header fields (e.g., 1290 Transfer-Encoding, Content-Length, etc.), if present, indicate only 1291 what their values would have been if the request method had been GET 1292 (Section 4.3.2 of [Part2]). 2xx (Successful) responses to CONNECT 1293 switch to tunnel mode instead of having a message body (Section 4.3.6 1294 of [Part2]). 
All 1xx (Informational), 204 (No Content), and 304 (Not 1295 Modified) responses do not include a message body. All other 1296 responses do include a message body, although the body might be of 1297 zero length. 1299 3.3.1. Transfer-Encoding 1301 The Transfer-Encoding header field lists the transfer coding names 1302 corresponding to the sequence of transfer codings that have been (or 1303 will be) applied to the payload body in order to form the message 1304 body. Transfer codings are defined in Section 4. 1306 Transfer-Encoding = 1#transfer-coding 1308 Transfer-Encoding is analogous to the Content-Transfer-Encoding field 1309 of MIME, which was designed to enable safe transport of binary data 1310 over a 7-bit transport service ([RFC2045], Section 6). However, safe 1311 transport has a different focus for an 8bit-clean transfer protocol. 1312 In HTTP's case, Transfer-Encoding is primarily intended to accurately 1313 delimit a dynamically generated payload and to distinguish payload 1314 encodings that are only applied for transport efficiency or security 1315 from those that are characteristics of the selected resource. 1317 A recipient MUST be able to parse the chunked transfer coding 1318 (Section 4.1) because it plays a crucial role in framing messages 1319 when the payload body size is not known in advance. A sender MUST 1320 NOT apply chunked more than once to a message body (i.e., chunking an 1321 already chunked message is not allowed). If any transfer coding 1322 other than chunked is applied to a request payload body, the sender 1323 MUST apply chunked as the final transfer coding to ensure that the 1324 message is properly framed. If any transfer coding other than 1325 chunked is applied to a response payload body, the sender MUST either 1326 apply chunked as the final transfer coding or terminate the message 1327 by closing the connection. 
1329 For example, 1331 Transfer-Encoding: gzip, chunked 1333 indicates that the payload body has been compressed using the gzip 1334 coding and then chunked using the chunked coding while forming the 1335 message body. 1337 Unlike Content-Encoding (Section 3.1.2.1 of [Part2]), Transfer- 1338 Encoding is a property of the message, not of the representation, and 1339 any recipient along the request/response chain MAY decode the 1340 received transfer coding(s) or apply additional transfer coding(s) to 1341 the message body, assuming that corresponding changes are made to the 1342 Transfer-Encoding field-value. Additional information about the 1343 encoding parameters MAY be provided by other header fields not 1344 defined by this specification. 1346 Transfer-Encoding MAY be sent in a response to a HEAD request or in a 1347 304 (Not Modified) response (Section 4.1 of [Part4]) to a GET 1348 request, neither of which includes a message body, to indicate that 1349 the origin server would have applied a transfer coding to the message 1350 body if the request had been an unconditional GET. This indication 1351 is not required, however, because any recipient on the response chain 1352 (including the origin server) can remove transfer codings when they 1353 are not needed. 1355 A server MUST NOT send a Transfer-Encoding header field in any 1356 response with a status code of 1xx (Informational) or 204 (No 1357 Content). A server MUST NOT send a Transfer-Encoding header field in 1358 any 2xx (Successful) response to a CONNECT request (Section 4.3.6 of 1359 [Part2]). 1361 Transfer-Encoding was added in HTTP/1.1. It is generally assumed 1362 that implementations advertising only HTTP/1.0 support will not 1363 understand how to process a transfer-encoded payload. 
A client MUST 1364 NOT send a request containing Transfer-Encoding unless it knows the 1365 server will handle HTTP/1.1 (or later) requests; such knowledge might 1366 be in the form of specific user configuration or by remembering the 1367 version of a prior received response. A server MUST NOT send a 1368 response containing Transfer-Encoding unless the corresponding 1369 request indicates HTTP/1.1 (or later). 1371 A server that receives a request message with a transfer coding it 1372 does not understand SHOULD respond with 501 (Not Implemented). 1374 3.3.2. Content-Length 1376 When a message does not have a Transfer-Encoding header field, a 1377 Content-Length header field can provide the anticipated size, as a 1378 decimal number of octets, for a potential payload body. For messages 1379 that do include a payload body, the Content-Length field-value 1380 provides the framing information necessary for determining where the 1381 body (and message) ends. For messages that do not include a payload 1382 body, the Content-Length indicates the size of the selected 1383 representation (Section 3 of [Part2]). 1385 Content-Length = 1*DIGIT 1387 An example is 1389 Content-Length: 3495 1391 A sender MUST NOT send a Content-Length header field in any message 1392 that contains a Transfer-Encoding header field. 1394 A user agent SHOULD send a Content-Length in a request message when 1395 no Transfer-Encoding is sent and the request method defines a meaning 1396 for an enclosed payload body. For example, a Content-Length header 1397 field is normally sent in a POST request even when the value is 0 1398 (indicating an empty payload body). A user agent SHOULD NOT send a 1399 Content-Length header field when the request message does not contain 1400 a payload body and the method semantics do not anticipate such a 1401 body. 
1403 A server MAY send a Content-Length header field in a response to a 1404 HEAD request (Section 4.3.2 of [Part2]); a server MUST NOT send 1405 Content-Length in such a response unless its field-value equals the 1406 decimal number of octets that would have been sent in the payload 1407 body of a response if the same request had used the GET method. 1409 A server MAY send a Content-Length header field in a 304 (Not 1410 Modified) response to a conditional GET request (Section 4.1 of 1411 [Part4]); a server MUST NOT send Content-Length in such a response 1412 unless its field-value equals the decimal number of octets that would 1413 have been sent in the payload body of a 200 (OK) response to the same 1414 request. 1416 A server MUST NOT send a Content-Length header field in any response 1417 with a status code of 1xx (Informational) or 204 (No Content). A 1418 server MUST NOT send a Content-Length header field in any 2xx 1419 (Successful) response to a CONNECT request (Section 4.3.6 of 1420 [Part2]). 1422 Aside from the cases defined above, in the absence of Transfer- 1423 Encoding, an origin server SHOULD send a Content-Length header field 1424 when the payload body size is known prior to sending the complete 1425 header section. This will allow downstream recipients to measure 1426 transfer progress, know when a received message is complete, and 1427 potentially reuse the connection for additional requests. 1429 Any Content-Length field value greater than or equal to zero is 1430 valid. Since there is no predefined limit to the length of a 1431 payload, a recipient SHOULD anticipate potentially large decimal 1432 numerals and prevent parsing errors due to integer conversion 1433 overflows (Section 9.3). 
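
As a non-normative illustration of the last paragraph, the following Python sketch validates a single Content-Length field value against the 1*DIGIT grammar before converting it. The name parse_content_length and the MAX_DIGITS limit are hypothetical local policy, not something this specification defines. Python integers do not overflow, so the digit limit stands in for the range check that a fixed-width integer type would need.

```python
MAX_DIGITS = 18  # hypothetical local policy; fits a signed 64-bit integer


def parse_content_length(value, max_digits=MAX_DIGITS):
    """Convert a Content-Length field value (str, surrounding whitespace
    already removed) to an int, or raise ValueError."""
    # int() alone is too lenient: it accepts signs, underscores,
    # surrounding whitespace, and non-ASCII digits.
    if not value or any(ch not in "0123456789" for ch in value):
        raise ValueError("Content-Length is not 1*DIGIT: %r" % (value,))
    # Leading zeros are permitted by the grammar and carry no magnitude.
    if len(value.lstrip("0")) > max_digits:
        raise ValueError("Content-Length exceeds the local limit")
    return int(value)
```

A caller would typically map the ValueError to the error handling that applies to its role (server, proxy, or user agent).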
1435 If a message is received that has multiple Content-Length header 1436 fields with field-values consisting of the same decimal value, or a 1437 single Content-Length header field with a field value containing a 1438 list of identical decimal values (e.g., "Content-Length: 42, 42"), 1439 indicating that duplicate Content-Length header fields have been 1440 generated or combined by an upstream message processor, then the 1441 recipient MUST either reject the message as invalid or replace the 1442 duplicated field-values with a single valid Content-Length field 1443 containing that decimal value prior to determining the message body 1444 length or forwarding the message. 1446 Note: HTTP's use of Content-Length for message framing differs 1447 significantly from the same field's use in MIME, where it is an 1448 optional field used only within the "message/external-body" media- 1449 type. 1451 3.3.3. Message Body Length 1453 The length of a message body is determined by one of the following 1454 (in order of precedence): 1456 1. Any response to a HEAD request and any response with a 1xx 1457 (Informational), 204 (No Content), or 304 (Not Modified) status 1458 code is always terminated by the first empty line after the 1459 header fields, regardless of the header fields present in the 1460 message, and thus cannot contain a message body. 1462 2. Any 2xx (Successful) response to a CONNECT request implies that 1463 the connection will become a tunnel immediately after the empty 1464 line that concludes the header fields. A client MUST ignore any 1465 Content-Length or Transfer-Encoding header fields received in 1466 such a message. 1468 3. If a Transfer-Encoding header field is present and the chunked 1469 transfer coding (Section 4.1) is the final encoding, the message 1470 body length is determined by reading and decoding the chunked 1471 data until the transfer coding indicates the data is complete. 
1473 If a Transfer-Encoding header field is present in a response and 1474 the chunked transfer coding is not the final encoding, the 1475 message body length is determined by reading the connection until 1476 it is closed by the server. If a Transfer-Encoding header field 1477 is present in a request and the chunked transfer coding is not 1478 the final encoding, the message body length cannot be determined 1479 reliably; the server MUST respond with the 400 (Bad Request) 1480 status code and then close the connection. 1482 If a message is received with both a Transfer-Encoding and a 1483 Content-Length header field, the Transfer-Encoding overrides the 1484 Content-Length. Such a message might indicate an attempt to 1485 perform request or response smuggling (bypass of security-related 1486 checks on message routing or content) and thus ought to be 1487 handled as an error. A sender MUST remove the received Content- 1488 Length field prior to forwarding such a message downstream. 1490 4. If a message is received without Transfer-Encoding and with 1491 either multiple Content-Length header fields having differing 1492 field-values or a single Content-Length header field having an 1493 invalid value, then the message framing is invalid and the 1494 recipient MUST treat it as an unrecoverable error to prevent 1495 request or response smuggling. If this is a request message, the 1496 server MUST respond with a 400 (Bad Request) status code and then 1497 close the connection. If this is a response message received by 1498 a proxy, the proxy MUST close the connection to the server, 1499 discard the received response, and send a 502 (Bad Gateway) 1500 response to the client. If this is a response message received 1501 by a user agent, the user agent MUST close the connection to the 1502 server and discard the received response. 1504 5. 
If a valid Content-Length header field is present without 1505 Transfer-Encoding, its decimal value defines the expected message 1506 body length in octets. If the sender closes the connection or 1507 the recipient times out before the indicated number of octets are 1508 received, the recipient MUST consider the message to be 1509 incomplete and close the connection. 1511 6. If this is a request message and none of the above are true, then 1512 the message body length is zero (no message body is present). 1514 7. Otherwise, this is a response message without a declared message 1515 body length, so the message body length is determined by the 1516 number of octets received prior to the server closing the 1517 connection. 1519 Since there is no way to distinguish a successfully completed, close- 1520 delimited message from a partially-received message interrupted by 1521 network failure, a server SHOULD generate encoding or length- 1522 delimited messages whenever possible. The close-delimiting feature 1523 exists primarily for backwards compatibility with HTTP/1.0. 1525 A server MAY reject a request that contains a message body but not a 1526 Content-Length by responding with 411 (Length Required). 1528 Unless a transfer coding other than chunked has been applied, a 1529 client that sends a request containing a message body SHOULD use a 1530 valid Content-Length header field if the message body length is known 1531 in advance, rather than the chunked transfer coding, since some 1532 existing services respond to chunked with a 411 (Length Required) 1533 status code even though they understand the chunked transfer coding. 1534 This is typically because such services are implemented via a gateway 1535 that requires a content-length in advance of being called and the 1536 server is unable or unwilling to buffer the entire request before 1537 processing. 
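
The order of precedence listed in this section can be sketched in Python as follows. This is a non-normative sketch under stated assumptions: the function name body_framing, the FramingError exception, the returned (kind, length) tuples, and the representation of header fields as a dict from lower-case field name to a list of field values are all hypothetical. Identical duplicate Content-Length values are collapsed rather than rejected, which is one of the two behaviors that Section 3.3.2 permits.

```python
from typing import Dict, List, Optional, Tuple


class FramingError(Exception):
    """Invalid framing; the caller picks 400, 502, or closing the connection."""


def _coding_names(values: List[str]) -> List[str]:
    # Flatten all Transfer-Encoding field values into lower-case coding
    # names, dropping any parameters after ";".
    names = []
    for value in values:
        for item in value.split(","):
            name = item.split(";", 1)[0].strip().lower()
            if name:
                names.append(name)
    return names


def body_framing(headers: Dict[str, List[str]], *, is_response: bool,
                 request_method: str = "GET",
                 status: Optional[int] = None) -> Tuple[str, Optional[int]]:
    """Return (kind, length); kind is one of
    'none', 'tunnel', 'chunked', 'length', or 'close'."""
    if is_response:
        if status is None:
            raise ValueError("status is required for responses")
        # Rule 1: responses that never carry a message body.
        if request_method == "HEAD" or status in (204, 304) or 100 <= status < 200:
            return ("none", 0)
        # Rule 2: successful CONNECT switches to tunnel mode.
        if request_method == "CONNECT" and 200 <= status < 300:
            return ("tunnel", None)
    # Rule 3: Transfer-Encoding overrides Content-Length.
    codings = _coding_names(headers.get("transfer-encoding", []))
    if codings:
        if codings[-1] == "chunked":
            return ("chunked", None)
        if is_response:
            return ("close", None)
        raise FramingError("request body length cannot be determined")
    # Rules 4 and 5: Content-Length without Transfer-Encoding.
    values = [v.strip() for field in headers.get("content-length", [])
              for v in field.split(",")]
    if values:
        if not all(v.isascii() and v.isdigit() for v in values):
            raise FramingError("invalid Content-Length")
        lengths = {int(v) for v in values}
        if len(lengths) != 1:
            raise FramingError("conflicting Content-Length values")
        return ("length", lengths.pop())
    # Rule 6 (request: no body) and rule 7 (response: read until close).
    return ("close", None) if is_response else ("none", 0)
```

For example, body_framing({"content-length": ["42, 42"]}, is_response=False) yields ("length", 42), while a request whose only transfer coding is gzip raises FramingError, which a server would answer with 400 (Bad Request).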
1539 A user agent that sends a request containing a message body MUST send 1540 a valid Content-Length header field if it does not know the server 1541 will handle HTTP/1.1 (or later) requests; such knowledge can be in 1542 the form of specific user configuration or by remembering the version 1543 of a prior received response. 1545 If the final response to the last request on a connection has been 1546 completely received and there remains additional data to read, a user 1547 agent MAY discard the remaining data or attempt to determine if that 1548 data belongs as part of the prior response body, which might be the 1549 case if the prior message's Content-Length value is incorrect. A 1550 client MUST NOT process, cache, or forward such extra data as a 1551 separate response, since such behavior would be vulnerable to cache 1552 poisoning. 1554 3.4. Handling Incomplete Messages 1556 A server that receives an incomplete request message, usually due to 1557 a canceled request or a triggered time-out exception, MAY send an 1558 error response prior to closing the connection. 1560 A client that receives an incomplete response message, which can 1561 occur when a connection is closed prematurely or when decoding a 1562 supposedly chunked transfer coding fails, MUST record the message as 1563 incomplete. Cache requirements for incomplete responses are defined 1564 in Section 3 of [Part6]. 1566 If a response terminates in the middle of the header section (before 1567 the empty line is received) and the status code might rely on header 1568 fields to convey the full meaning of the response, then the client 1569 cannot assume that meaning has been conveyed; the client might need 1570 to repeat the request in order to determine what action to take next. 1572 A message body that uses the chunked transfer coding is incomplete if 1573 the zero-sized chunk that terminates the encoding has not been 1574 received. 
A message that uses a valid Content-Length is incomplete 1575 if the size of the message body received (in octets) is less than the 1576 value given by Content-Length. A response that has neither chunked 1577 transfer coding nor Content-Length is terminated by closure of the 1578 connection, and thus is considered complete regardless of the number 1579 of message body octets received, provided that the header section was 1580 received intact. 1582 3.5. Message Parsing Robustness 1584 Older HTTP/1.0 user agent implementations might send an extra CRLF 1585 after a POST request as a workaround for some early server 1586 applications that failed to read message body content that was not 1587 terminated by a line-ending. An HTTP/1.1 user agent MUST NOT preface 1588 or follow a request with an extra CRLF. If terminating the request 1589 message body with a line-ending is desired, then the user agent MUST 1590 count the terminating CRLF octets as part of the message body length. 1592 In the interest of robustness, a server that is expecting to receive 1593 and parse a request-line SHOULD ignore at least one empty line (CRLF) 1594 received prior to the request-line. 1596 Although the line terminator for the start-line and header fields is 1597 the sequence CRLF, a recipient MAY recognize a single LF as a line 1598 terminator and ignore any preceding CR. 1600 Although the request-line and status-line grammar rules require that 1601 each of the component elements be separated by a single SP octet, 1602 recipients MAY instead parse on whitespace-delimited word boundaries 1603 and, aside from the CRLF terminator, treat any form of whitespace as 1604 the SP separator while ignoring preceding or trailing whitespace; 1605 such whitespace includes one or more of the following octets: SP, 1606 HTAB, VT (%x0B), FF (%x0C), or bare CR. 
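
The robustness allowances above (an ignored leading empty line, a bare LF as line terminator, and whitespace-delimited parsing of the request-line) can be sketched in Python. This is a non-normative sketch: the function names, the max_blank parameter, and the use of ValueError for malformed input are assumptions made for illustration.

```python
import io
import re

# SP, HTAB, VT (%x0B), FF (%x0C), or bare CR, as listed above.
_SEPARATOR = re.compile(rb"[ \t\x0b\x0c\r]+")


def parse_request_line(raw):
    """Split one request-line (bytes, terminator included) into
    (method, request-target, HTTP-version)."""
    if raw.endswith(b"\n"):
        raw = raw[:-1]              # accept a single LF as terminator
        if raw.endswith(b"\r"):
            raw = raw[:-1]          # and ignore the preceding CR
    # Empty strings from leading or trailing whitespace are dropped.
    parts = [part for part in _SEPARATOR.split(raw) if part]
    if len(parts) != 3:
        raise ValueError("malformed request-line")
    return tuple(parts)


def read_request_line(stream: io.BufferedIOBase, max_blank=1):
    """Read the request-line, ignoring up to max_blank empty lines first."""
    blanks = 0
    while True:
        raw = stream.readline()
        if not raw:
            raise EOFError("connection closed before a request-line arrived")
        if raw in (b"\r\n", b"\n"):
            blanks += 1
            if blanks > max_blank:
                raise ValueError("too many empty lines before request-line")
            continue
        return parse_request_line(raw)
```

A server using such a parser would still answer input that fails to parse with 400 (Bad Request); the leniency only widens what counts as a separator or terminator.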
1608 When a server listening only for HTTP request messages, or processing 1609 what appears from the start-line to be an HTTP request message, 1610 receives a sequence of octets that does not match the HTTP-message 1611 grammar aside from the robustness exceptions listed above, the server 1612 SHOULD respond with a 400 (Bad Request) response. 1614 4. Transfer Codings 1616 Transfer coding names are used to indicate an encoding transformation 1617 that has been, can be, or might need to be applied to a payload body 1618 in order to ensure "safe transport" through the network. This 1619 differs from a content coding in that the transfer coding is a 1620 property of the message rather than a property of the representation 1621 that is being transferred. 1623 transfer-coding = "chunked" ; Section 4.1 1624 / "compress" ; Section 4.2.1 1625 / "deflate" ; Section 4.2.2 1626 / "gzip" ; Section 4.2.3 1627 / transfer-extension 1628 transfer-extension = token *( OWS ";" OWS transfer-parameter ) 1630 Parameters are in the form of attribute/value pairs. 1632 transfer-parameter = attribute BWS "=" BWS value 1633 attribute = token 1634 value = word 1636 All transfer-coding names are case-insensitive and ought to be 1637 registered within the HTTP Transfer Coding registry, as defined in 1638 Section 8.4. They are used in the TE (Section 4.3) and Transfer- 1639 Encoding (Section 3.3.1) header fields. 1641 4.1. Chunked Transfer Coding 1643 The chunked transfer coding wraps the payload body in order to 1644 transfer it as a series of chunks, each with its own size indicator, 1645 followed by an OPTIONAL trailer containing header fields. Chunked 1646 enables content streams of unknown size to be transferred as a 1647 sequence of length-delimited buffers, which enables the sender to 1648 retain connection persistence and the recipient to know when it has 1649 received the entire message. 
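
As a non-normative illustration of how a sender might produce this framing, the Python sketch below wraps an iterable of byte strings in the chunked transfer coding, following the chunked-body grammar of this section. The name encode_chunked and the representation of trailer fields as (name, value) byte-string pairs are assumptions; chunk extensions are not generated.

```python
def encode_chunked(pieces, trailers=()):
    """Yield the chunked encoding of an iterable of bytes objects.

    trailers is an optional iterable of (field-name, field-value)
    pairs, both bytes, sent after the last chunk.
    """
    for piece in pieces:
        # An empty piece is skipped: a chunk-size of zero would be
        # read by the recipient as the last-chunk.
        if piece:
            # chunk-size is written in hexadecimal, then CRLF,
            # the chunk-data, and a closing CRLF.
            yield b"%X\r\n" % len(piece) + piece + b"\r\n"
    yield b"0\r\n"                      # last-chunk
    for name, value in trailers:        # optional trailer-part
        yield name + b": " + value + b"\r\n"
    yield b"\r\n"                       # empty line ends the chunked-body
```

Because the function is a generator, the sender does not need to know the total payload size in advance, which is the case chunked is designed for. The matching decoding procedure is given in Section 4.1.3.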
1651 chunked-body = *chunk 1652 last-chunk 1653 trailer-part 1654 CRLF 1656 chunk = chunk-size [ chunk-ext ] CRLF 1657 chunk-data CRLF 1658 chunk-size = 1*HEXDIG 1659 last-chunk = 1*("0") [ chunk-ext ] CRLF 1661 chunk-data = 1*OCTET ; a sequence of chunk-size octets 1663 The chunk-size field is a string of hex digits indicating the size of 1664 the chunk-data in octets. The chunked transfer coding is complete 1665 when a chunk with a chunk-size of zero is received, possibly followed 1666 by a trailer, and finally terminated by an empty line. 1668 A recipient MUST be able to parse and decode the chunked transfer 1669 coding. 1671 4.1.1. Chunk Extensions 1673 The chunked encoding allows each chunk to include zero or more chunk 1674 extensions, immediately following the chunk-size, for the sake of 1675 supplying per-chunk metadata (such as a signature or hash), mid- 1676 message control information, or randomization of message body size. 1678 chunk-ext = *( ";" chunk-ext-name [ "=" chunk-ext-val ] ) 1680 chunk-ext-name = token 1681 chunk-ext-val = token / quoted-str-nf 1683 quoted-str-nf = DQUOTE *( qdtext-nf / quoted-pair ) DQUOTE 1684 ; like quoted-string, but disallowing line folding 1685 qdtext-nf = HTAB / SP / %x21 / %x23-5B / %x5D-7E / obs-text 1687 The chunked encoding is specific to each connection and is likely to 1688 be removed or recoded by each recipient (including intermediaries) 1689 before any higher-level application would have a chance to inspect 1690 the extensions. Hence, use of chunk extensions is generally limited 1691 to specialized HTTP services such as "long polling" (where client and 1692 server can have shared expectations regarding the use of chunk 1693 extensions) or for padding within an end-to-end secured connection. 1695 A recipient MUST ignore unrecognized chunk extensions. 
A server 1696 ought to limit the total length of chunk extensions received in a 1697 request to an amount reasonable for the services provided, in the 1698 same way that it applies length limitations and timeouts for other 1699 parts of a message, and generate an appropriate 4xx (Client Error) 1700 response if that amount is exceeded. 1702 4.1.2. Chunked Trailer Part 1704 A trailer allows the sender to include additional fields at the end 1705 of a chunked message in order to supply metadata that might be 1706 dynamically generated while the message body is sent, such as a 1707 message integrity check, digital signature, or post-processing 1708 status. The trailer fields are identical to header fields, except 1709 they are sent in a chunked trailer instead of the message's header 1710 section. 1712 trailer-part = *( header-field CRLF ) 1714 A sender MUST NOT generate a trailer that contains a field which 1715 needs to be known by the recipient before it can begin processing the 1716 message body. For example, most recipients need to know the values 1717 of Content-Encoding and Content-Type in order to select a content 1718 handler, so placing those fields in a trailer would force the 1719 recipient to buffer the entire body before it could begin, greatly 1720 increasing user-perceived latency and defeating one of the main 1721 advantages of using chunked to send data streams of unknown length. 1722 A sender MUST NOT generate a trailer containing a Transfer-Encoding, 1723 Content-Length, or Trailer field. 1725 A server MUST generate an empty trailer with the chunked transfer 1726 coding unless at least one of the following is true: 1728 1. the request included a TE header field that indicates "trailers" 1729 is acceptable in the transfer coding of the response, as 1730 described in Section 4.3; or, 1732 2. 
the trailer fields consist entirely of optional metadata and the 1733 recipient could use the message (in a manner acceptable to the 1734 generating server) without receiving that metadata. In other 1735 words, the generating server is willing to accept the possibility 1736 that the trailer fields might be silently discarded along the 1737 path to the client. 1739 The above requirement prevents the need for an infinite buffer when a 1740 message is being received by an HTTP/1.1 (or later) proxy and 1741 forwarded to an HTTP/1.0 recipient. 1743 4.1.3. Decoding Chunked 1745 A process for decoding the chunked transfer coding can be represented 1746 in pseudo-code as: 1748 length := 0 1749 read chunk-size, chunk-ext (if any), and CRLF 1750 while (chunk-size > 0) { 1751 read chunk-data and CRLF 1752 append chunk-data to decoded-body 1753 length := length + chunk-size 1754 read chunk-size, chunk-ext (if any), and CRLF 1755 } 1756 read header-field 1757 while (header-field not empty) { 1758 append header-field to existing header fields 1759 read header-field 1760 } 1761 Content-Length := length 1762 Remove "chunked" from Transfer-Encoding 1763 Remove Trailer from existing header fields 1765 4.2. Compression Codings 1767 The codings defined below can be used to compress the payload of a 1768 message. 1770 4.2.1. Compress Coding 1772 The "compress" coding is an adaptive Lempel-Ziv-Welch (LZW) coding 1773 [Welch] that is commonly produced by the UNIX file compression 1774 program "compress". A recipient SHOULD consider "x-compress" to be 1775 equivalent to "compress". 1777 4.2.2. Deflate Coding 1779 The "deflate" coding is a "zlib" data format [RFC1950] containing a 1780 "deflate" compressed data stream [RFC1951] that uses a combination of 1781 the Lempel-Ziv (LZ77) compression algorithm and Huffman coding. 1783 Note: Some incorrect implementations send the "deflate" compressed 1784 data without the zlib wrapper. 1786 4.2.3. 
Gzip Coding 1788 The "gzip" coding is an LZ77 coding with a 32 bit CRC that is 1789 commonly produced by the gzip file compression program [RFC1952]. A 1790 recipient SHOULD consider "x-gzip" to be equivalent to "gzip". 1792 4.3. TE 1794 The "TE" header field in a request indicates what transfer codings, 1795 besides chunked, the client is willing to accept in response, and 1796 whether or not the client is willing to accept trailer fields in a 1797 chunked transfer coding. 1799 The TE field-value consists of a comma-separated list of transfer 1800 coding names, each allowing for optional parameters (as described in 1801 Section 4), and/or the keyword "trailers". A client MUST NOT send 1802 the chunked transfer coding name in TE; chunked is always acceptable 1803 for HTTP/1.1 recipients. 1805 TE = #t-codings 1806 t-codings = "trailers" / ( transfer-coding [ t-ranking ] ) 1807 t-ranking = OWS ";" OWS "q=" rank 1808 rank = ( "0" [ "." 0*3DIGIT ] ) 1809 / ( "1" [ "." 0*3("0") ] ) 1811 Three examples of TE use are below. 1813 TE: deflate 1814 TE: 1815 TE: trailers, deflate;q=0.5 1817 The presence of the keyword "trailers" indicates that the client is 1818 willing to accept trailer fields in a chunked transfer coding, as 1819 defined in Section 4.1.2, on behalf of itself and any downstream 1820 clients. For requests from an intermediary, this implies that 1821 either: (a) all downstream clients are willing to accept trailer 1822 fields in the forwarded response; or, (b) the intermediary will 1823 attempt to buffer the response on behalf of downstream recipients. 1824 Note that HTTP/1.1 does not define any means to limit the size of a 1825 chunked response such that an intermediary can be assured of 1826 buffering the entire response. 1828 When multiple transfer codings are acceptable, the client MAY rank 1829 the codings by preference using a case-insensitive "q" parameter 1830 (similar to the qvalues used in content negotiation fields, Section 1831 5.3.1 of [Part2]). 
The rank value is a real number in the range 0 1832 through 1, where 0.001 is the least preferred and 1 is the most 1833 preferred; a value of 0 means "not acceptable". 1835 If the TE field-value is empty or if no TE field is present, the only 1836 acceptable transfer coding is chunked. A message with no transfer 1837 coding is always acceptable. 1839 Since the TE header field only applies to the immediate connection, a 1840 sender of TE MUST also send a "TE" connection option within the 1841 Connection header field (Section 6.1) in order to prevent the TE 1842 field from being forwarded by intermediaries that do not support its 1843 semantics. 1845 4.4. Trailer 1847 When a message includes a message body encoded with the chunked 1848 transfer coding and the sender desires to send metadata in the form 1849 of trailer fields at the end of the message, the sender SHOULD 1850 generate a Trailer header field before the message body to indicate 1851 which fields will be present in the trailers. This allows the 1852 recipient to prepare for receipt of that metadata before it starts 1853 processing the body, which is useful if the message is being streamed 1854 and the recipient wishes to confirm an integrity check on the fly. 1856 Trailer = 1#field-name 1858 5. Message Routing 1860 HTTP request message routing is determined by each client based on 1861 the target resource, the client's proxy configuration, and 1862 establishment or reuse of an inbound connection. The corresponding 1863 response routing follows the same connection chain back to the 1864 client. 1866 5.1. Identifying a Target Resource 1868 HTTP is used in a wide variety of applications, ranging from general- 1869 purpose computers to home appliances. In some cases, communication 1870 options are hard-coded in a client's configuration. However, most 1871 HTTP clients rely on the same resource identification mechanism and 1872 configuration techniques as general-purpose Web browsers. 
1874 HTTP communication is initiated by a user agent for some purpose. 1875 The purpose is a combination of request semantics, which are defined 1876 in [Part2], and a target resource upon which to apply those 1877 semantics. A URI reference (Section 2.7) is typically used as an 1878 identifier for the "target resource", which a user agent would 1879 resolve to its absolute form in order to obtain the "target URI". 1880 The target URI excludes the reference's fragment component, if any, 1881 since fragment identifiers are reserved for client-side processing 1882 ([RFC3986], Section 3.5). 1884 5.2. Connecting Inbound 1886 Once the target URI is determined, a client needs to decide whether a 1887 network request is necessary to accomplish the desired semantics and, 1888 if so, where that request is to be directed. 1890 If the client has a cache [Part6] and the request can be satisfied by 1891 it, then the request is usually directed there first. 1893 If the request is not satisfied by a cache, then a typical client 1894 will check its configuration to determine whether a proxy is to be 1895 used to satisfy the request. Proxy configuration is implementation- 1896 dependent, but is often based on URI prefix matching, selective 1897 authority matching, or both, and the proxy itself is usually 1898 identified by an "http" or "https" URI. If a proxy is applicable, 1899 the client connects inbound by establishing (or reusing) a connection 1900 to that proxy. 1902 If no proxy is applicable, a typical client will invoke a handler 1903 routine, usually specific to the target URI's scheme, to connect 1904 directly to an authority for the target resource. How that is 1905 accomplished is dependent on the target URI scheme and defined by its 1906 associated specification, similar to how this specification defines 1907 origin server access for resolution of the "http" (Section 2.7.1) and 1908 "https" (Section 2.7.2) schemes. 
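The decision sequence described in Section 5.2 (cache first, then an applicable proxy, then a direct connection to the authority) can be sketched as follows. This is an illustration only and not part of the specification: the function name choose_next_hop, the representation of the cache as a set of URIs, and the proxy_rules mapping from URI prefix to proxy URI are all assumptions, since proxy configuration is implementation-dependent.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes handled in this sketch.
DEFAULT_PORTS = {"http": 80, "https": 443}

def choose_next_hop(target_uri, cached_uris, proxy_rules):
    """Return (kind, destination), where kind is "cache", "proxy" or "origin".

    cached_uris: set of URIs the local cache can satisfy (assumption).
    proxy_rules: dict mapping a URI prefix to a proxy URI (assumption).
    """
    # 1. A request that the cache can satisfy is directed there first.
    if target_uri in cached_uris:
        return ("cache", target_uri)
    # 2. Otherwise look for an applicable proxy; this sketch uses the
    #    longest matching URI prefix.
    matches = [prefix for prefix in proxy_rules if target_uri.startswith(prefix)]
    if matches:
        return ("proxy", proxy_rules[max(matches, key=len)])
    # 3. No proxy applies: connect directly to the target URI's authority.
    parts = urlsplit(target_uri)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return ("origin", (parts.hostname, port))
```

With an empty cache and no proxy rules, the target URI "http://www.example.org/where?q=now" resolves to a direct connection to port 80 of "www.example.org".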
1910 HTTP requirements regarding connection management are defined in 1911 Section 6. 1913 5.3. Request Target 1915 Once an inbound connection is obtained, the client sends an HTTP 1916 request message (Section 3) with a request-target derived from the 1917 target URI. There are four distinct formats for the request-target, 1918 depending on both the method being requested and whether the request 1919 is to a proxy. 1921 request-target = origin-form 1922 / absolute-form 1923 / authority-form 1924 / asterisk-form 1926 origin-form = absolute-path [ "?" query ] 1927 absolute-form = absolute-URI 1928 authority-form = authority 1929 asterisk-form = "*" 1931 origin-form 1933 The most common form of request-target is the origin-form. When 1934 making a request directly to an origin server, other than a CONNECT 1935 or server-wide OPTIONS request (as detailed below), a client MUST 1936 send only the absolute path and query components of the target URI as 1937 the request-target. If the target URI's path component is empty, 1938 then the client MUST send "/" as the path within the origin-form of 1939 request-target. A Host header field is also sent, as defined in 1940 Section 5.4. 1942 For example, a client wishing to retrieve a representation of the 1943 resource identified as 1945 http://www.example.org/where?q=now 1947 directly from the origin server would open (or reuse) a TCP 1948 connection to port 80 of the host "www.example.org" and send the 1949 lines: 1951 GET /where?q=now HTTP/1.1 1952 Host: www.example.org 1954 followed by the remainder of the request message. 1956 absolute-form 1958 When making a request to a proxy, other than a CONNECT or server-wide 1959 OPTIONS request (as detailed below), a client MUST send the target 1960 URI in absolute-form as the request-target. 
The proxy is requested 1961 to either service that request from a valid cache, if possible, or 1962 make the same request on the client's behalf to either the next 1963 inbound proxy server or directly to the origin server indicated by 1964 the request-target. Requirements on such "forwarding" of messages 1965 are defined in Section 5.7. 1967 An example absolute-form of request-line would be: 1969 GET http://www.example.org/pub/WWW/TheProject.html HTTP/1.1 1971 To allow for transition to the absolute-form for all requests in some 1972 future version of HTTP, a server MUST accept the absolute-form in 1973 requests, even though HTTP/1.1 clients will only send them in 1974 requests to proxies. 1976 authority-form 1978 The authority-form of request-target is only used for CONNECT 1979 requests (Section 4.3.6 of [Part2]). When making a CONNECT request 1980 to establish a tunnel through one or more proxies, a client MUST send 1981 only the target URI's authority component (excluding any userinfo and 1982 its "@" delimiter) as the request-target. For example, 1984 CONNECT www.example.com:80 HTTP/1.1 1986 asterisk-form 1988 The asterisk-form of request-target is only used for a server-wide 1989 OPTIONS request (Section 4.3.7 of [Part2]). When a client wishes to 1990 request OPTIONS for the server as a whole, as opposed to a specific 1991 named resource of that server, the client MUST send only "*" (%x2A) 1992 as the request-target. For example, 1994 OPTIONS * HTTP/1.1 1996 If a proxy receives an OPTIONS request with an absolute-form of 1997 request-target in which the URI has an empty path and no query 1998 component, then the last proxy on the request chain MUST send a 1999 request-target of "*" when it forwards the request to the indicated 2000 origin server. 
2002 For example, the request 2004 OPTIONS http://www.example.org:8001 HTTP/1.1 2006 would be forwarded by the final proxy as 2008 OPTIONS * HTTP/1.1 2009 Host: www.example.org:8001 2011 after connecting to port 8001 of host "www.example.org". 2013 5.4. Host 2015 The "Host" header field in a request provides the host and port 2016 information from the target URI, enabling the origin server to 2017 distinguish among resources while servicing requests for multiple 2018 host names on a single IP address. 2020 Host = uri-host [ ":" port ] ; Section 2.7.1 2022 A client MUST send a Host header field in all HTTP/1.1 request 2023 messages. If the target URI includes an authority component, then a 2024 client MUST send a field-value for Host that is identical to that 2025 authority component, excluding any userinfo subcomponent and its "@" 2026 delimiter (Section 2.7.1). If the authority component is missing or 2027 undefined for the target URI, then a client MUST send a Host header 2028 field with an empty field-value. 2030 Since the Host field-value is critical information for handling a 2031 request, a user agent SHOULD generate Host as the first header field 2032 following the request-line. 2034 For example, a GET request to the origin server for 2035 <http://www.example.org/pub/WWW/> would begin with: 2037 GET /pub/WWW/ HTTP/1.1 2038 Host: www.example.org 2040 A client MUST send a Host header field in an HTTP/1.1 request even if 2041 the request-target is in the absolute-form, since this allows the 2042 Host information to be forwarded through ancient HTTP/1.0 proxies 2043 that might not have implemented Host. 2045 When a proxy receives a request with an absolute-form of request- 2046 target, the proxy MUST ignore the received Host header field (if any) 2047 and instead replace it with the host information of the request- 2048 target.
A proxy that forwards such a request MUST generate a new 2049 Host field-value based on the received request-target rather than 2050 forward the received Host field-value. 2052 Since the Host header field acts as an application-level routing 2053 mechanism, it is a frequent target for malware seeking to poison a 2054 shared cache or redirect a request to an unintended server. An 2055 interception proxy is particularly vulnerable if it relies on the 2056 Host field-value for redirecting requests to internal servers, or for 2057 use as a cache key in a shared cache, without first verifying that 2058 the intercepted connection is targeting a valid IP address for that 2059 host. 2061 A server MUST respond with a 400 (Bad Request) status code to any 2062 HTTP/1.1 request message that lacks a Host header field and to any 2063 request message that contains more than one Host header field or a 2064 Host header field with an invalid field-value. 2066 5.5. Effective Request URI 2068 A server that receives an HTTP request message MUST reconstruct the 2069 user agent's original target URI, based on the pieces of information 2070 learned from the request-target, Host header field, and connection 2071 context, in order to identify the intended target resource and 2072 properly service the request. The URI derived from this 2073 reconstruction process is referred to as the "effective request URI". 2075 For a user agent, the effective request URI is the target URI. 2077 If the request-target is in absolute-form, then the effective request 2078 URI is the same as the request-target. Otherwise, the effective 2079 request URI is constructed as follows. 2081 If the request is received over a TLS-secured TCP connection, then 2082 the effective request URI's scheme is "https"; otherwise, the scheme 2083 is "http". 2085 If the request-target is in authority-form, then the effective 2086 request URI's authority component is the same as the request-target. 
2087 Otherwise, if a Host header field is supplied with a non-empty field- 2088 value, then the authority component is the same as the Host field- 2089 value. Otherwise, the authority component is the concatenation of 2090 the default host name configured for the server, a colon (":"), and 2091 the connection's incoming TCP port number in decimal form. 2093 If the request-target is in authority-form or asterisk-form, then the 2094 effective request URI's combined path and query component is empty. 2095 Otherwise, the combined path and query component is the same as the 2096 request-target. 2098 The components of the effective request URI, once determined as 2099 above, can be combined into absolute-URI form by concatenating the 2100 scheme, "://", authority, and combined path and query component. 2102 Example 1: the following message received over an insecure TCP 2103 connection 2105 GET /pub/WWW/TheProject.html HTTP/1.1 2106 Host: www.example.org:8080 2108 has an effective request URI of 2110 http://www.example.org:8080/pub/WWW/TheProject.html 2112 Example 2: the following message received over a TLS-secured TCP 2113 connection 2115 OPTIONS * HTTP/1.1 2116 Host: www.example.org 2118 has an effective request URI of 2120 https://www.example.org 2122 An origin server that does not allow resources to differ by requested 2123 host MAY ignore the Host field-value and instead replace it with a 2124 configured server name when constructing the effective request URI. 2126 Recipients of an HTTP/1.0 request that lacks a Host header field MAY 2127 attempt to use heuristics (e.g., examination of the URI path for 2128 something unique to a particular host) in order to guess the 2129 effective request URI's authority component. 2131 5.6. Associating a Response to a Request 2133 HTTP does not include a request identifier for associating a given 2134 request message with its corresponding one or more response messages. 
2135 Hence, it relies on the order of response arrival to correspond 2136 exactly to the order in which requests are made on the same 2137 connection. More than one response message per request only occurs 2138 when one or more informational responses (1xx, see Section 6.2 of 2139 [Part2]) precede a final response to the same request. 2141 A client that has more than one outstanding request on a connection 2142 MUST maintain a list of outstanding requests in the order sent and 2143 MUST associate each received response message on that connection to 2144 the highest ordered request that has not yet received a final (non- 2145 1xx) response. 2147 5.7. Message Forwarding 2149 As described in Section 2.3, intermediaries can serve a variety of 2150 roles in the processing of HTTP requests and responses. Some 2151 intermediaries are used to improve performance or availability. 2152 Others are used for access control or to filter content. Since an 2153 HTTP stream has characteristics similar to a pipe-and-filter 2154 architecture, there are no inherent limits to the extent an 2155 intermediary can enhance (or interfere) with either direction of the 2156 stream. 2158 An intermediary not acting as a tunnel MUST implement the Connection 2159 header field, as specified in Section 6.1, and exclude fields from 2160 being forwarded that are only intended for the incoming connection. 2162 An intermediary MUST NOT forward a message to itself unless it is 2163 protected from an infinite request loop. In general, an intermediary 2164 ought to recognize its own server names, including any aliases, local 2165 variations, or literal IP addresses, and respond to such requests 2166 directly. 2168 5.7.1. 
Via 2170 The "Via" header field indicates the presence of intermediate 2171 protocols and recipients between the user agent and the server (on 2172 requests) or between the origin server and the client (on responses), 2173 similar to the "Received" header field in email (Section 3.6.7 of 2174 [RFC5322]). Via can be used for tracking message forwards, avoiding 2175 request loops, and identifying the protocol capabilities of senders 2176 along the request/response chain. 2178 Via = 1#( received-protocol RWS received-by [ RWS comment ] ) 2180 received-protocol = [ protocol-name "/" ] protocol-version 2181 ; see Section 6.7 2182 received-by = ( uri-host [ ":" port ] ) / pseudonym 2183 pseudonym = token 2185 Multiple Via field values represent each proxy or gateway that has 2186 forwarded the message. Each intermediary appends its own information 2187 about how the message was received, such that the end result is 2188 ordered according to the sequence of forwarding recipients. 2190 A proxy MUST send an appropriate Via header field, as described 2191 below, in each message that it forwards. An HTTP-to-HTTP gateway 2192 MUST send an appropriate Via header field in each inbound request 2193 message and MAY send a Via header field in forwarded response 2194 messages. 2196 For each intermediary, the received-protocol indicates the protocol 2197 and protocol version used by the upstream sender of the message. 2198 Hence, the Via field value records the advertised protocol 2199 capabilities of the request/response chain such that they remain 2200 visible to downstream recipients; this can be useful for determining 2201 what backwards-incompatible features might be safe to use in 2202 response, or within a later request, as described in Section 2.6. 2203 For brevity, the protocol-name is omitted when the received protocol 2204 is HTTP. 
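As an illustration of the appending behavior described above, the following sketch shows how an intermediary might add its own entry to the Via field of a message it forwards. It is not a normative algorithm: the function name append_via and the representation of header fields as a list of (field-name, field-value) pairs are assumptions made for this example, and comments in Via entries are not handled.

```python
def append_via(headers, received_version, received_by, protocol_name="HTTP"):
    """Return a new header list with this intermediary's Via entry appended.

    headers: list of (field-name, field-value) pairs (assumed representation).
    received_version: protocol version of the message as received, e.g. "1.0".
    received_by: host[:port] of this intermediary, or a pseudonym.
    """
    # For brevity, the protocol-name is omitted when the protocol is HTTP.
    if protocol_name == "HTTP":
        entry = f"{received_version} {received_by}"
    else:
        entry = f"{protocol_name}/{received_version} {received_by}"
    # Gather existing Via values in order; field names are case-insensitive.
    existing = [value for (name, value) in headers if name.lower() == "via"]
    others = [(name, value) for (name, value) in headers if name.lower() != "via"]
    # The new entry goes last so the list reflects the forwarding sequence.
    return others + [("Via", ", ".join(existing + [entry]))]
```

Calling append_via first with ("1.0", "fred") and then with ("1.1", "p.example.net") on the same message yields the single field value "1.0 fred, 1.1 p.example.net".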
2206 The received-by field is normally the host and optional port number 2207 of a recipient server or client that subsequently forwarded the 2208 message. However, if the real host is considered to be sensitive 2209 information, a sender MAY replace it with a pseudonym. If a port is 2210 not provided, a recipient MAY interpret that as meaning it was 2211 received on the default TCP port, if any, for the received-protocol. 2213 A sender MAY generate comments in the Via header field to identify 2214 the software of each recipient, analogous to the User-Agent and 2215 Server header fields. However, all comments in the Via field are 2216 optional and a recipient MAY remove them prior to forwarding the 2217 message. 2219 For example, a request message could be sent from an HTTP/1.0 user 2220 agent to an internal proxy code-named "fred", which uses HTTP/1.1 to 2221 forward the request to a public proxy at p.example.net, which 2222 completes the request by forwarding it to the origin server at 2223 www.example.com. The request received by www.example.com would then 2224 have the following Via header field: 2226 Via: 1.0 fred, 1.1 p.example.net 2228 An intermediary used as a portal through a network firewall SHOULD 2229 NOT forward the names and ports of hosts within the firewall region 2230 unless it is explicitly enabled to do so. If not enabled, such an 2231 intermediary SHOULD replace each received-by host of any host behind 2232 the firewall by an appropriate pseudonym for that host. 2234 An intermediary MAY combine an ordered subsequence of Via header 2235 field entries into a single such entry if the entries have identical 2236 received-protocol values. For example, 2238 Via: 1.0 ricky, 1.1 ethel, 1.1 fred, 1.0 lucy 2240 could be collapsed to 2242 Via: 1.0 ricky, 1.1 mertz, 1.0 lucy 2244 A sender SHOULD NOT combine multiple entries unless they are all 2245 under the same organizational control and the hosts have already been 2246 replaced by pseudonyms. 
A sender MUST NOT combine entries that have 2247 different received-protocol values. 2249 5.7.2. Transformations 2251 Some intermediaries include features for transforming messages and 2252 their payloads. A transforming proxy might, for example, convert 2253 between image formats in order to save cache space or to reduce the 2254 amount of traffic on a slow link. However, operational problems 2255 might occur when these transformations are applied to payloads 2256 intended for critical applications, such as medical imaging or 2257 scientific data analysis, particularly when integrity checks or 2258 digital signatures are used to ensure that the payload received is 2259 identical to the original. 2261 If a proxy receives a request-target with a host name that is not a 2262 fully qualified domain name, it MAY add its own domain to the host 2263 name it received when forwarding the request. A proxy MUST NOT 2264 change the host name if it is a fully qualified domain name. 2266 A proxy MUST NOT modify the "absolute-path" and "query" parts of the 2267 received request-target when forwarding it to the next inbound 2268 server, except as noted above to replace an empty path with "/" or 2269 "*". 2271 A proxy MUST NOT modify header fields that provide information about 2272 the end points of the communication chain, the resource state, or the 2273 selected representation. A proxy MAY change the message body through 2274 application or removal of a transfer coding (Section 4). 2276 A non-transforming proxy MUST NOT modify the message payload (Section 2277 3.3 of [Part2]). A transforming proxy MUST NOT modify the payload of 2278 a message that contains the no-transform cache-control directive. 
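The no-transform restriction above can be checked with a few lines of code. The sketch below is a simplification under stated assumptions: the function name may_transform_payload and the (field-name, field-value) list are hypothetical, and the field value is split on commas without handling quoted strings that themselves contain commas.

```python
def may_transform_payload(headers):
    """Return False if any Cache-Control field carries no-transform.

    headers: list of (field-name, field-value) pairs (assumed representation).
    """
    for name, value in headers:
        if name.lower() != "cache-control":
            continue
        # Directive names are case-insensitive; drop any "=argument" part.
        directives = [d.strip().split("=", 1)[0].lower() for d in value.split(",")]
        if "no-transform" in directives:
            return False
    return True
```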
2280 A transforming proxy MAY transform the payload of a message that does 2281 not contain the no-transform cache-control directive; if the payload 2282 is transformed, the transforming proxy MUST add a Warning header 2283 field with the warn-code of 214 ("Transformation Applied") if one 2284 does not already appear in the message (see Section 5.5 of [Part6]). 2285 If the payload of a 200 (OK) response is transformed, the 2286 transforming proxy can also inform downstream recipients that a 2287 transformation has been applied by changing the response status code 2288 to 203 (Non-Authoritative Information) (Section 6.3.4 of [Part2]). 2290 6. Connection Management 2292 HTTP messaging is independent of the underlying transport or session- 2293 layer connection protocol(s). HTTP only presumes a reliable 2294 transport with in-order delivery of requests and the corresponding 2295 in-order delivery of responses. The mapping of HTTP request and 2296 response structures onto the data units of an underlying transport 2297 protocol is outside the scope of this specification. 2299 As described in Section 5.2, the specific connection protocols to be 2300 used for an HTTP interaction are determined by client configuration 2301 and the target URI. For example, the "http" URI scheme 2302 (Section 2.7.1) indicates a default connection of TCP over IP, with a 2303 default TCP port of 80, but the client might be configured to use a 2304 proxy via some other connection, port, or protocol. 2306 HTTP implementations are expected to engage in connection management, 2307 which includes maintaining the state of current connections, 2308 establishing a new connection or reusing an existing connection, 2309 processing messages received on a connection, detecting connection 2310 failures, and closing each connection. Most clients maintain 2311 multiple connections in parallel, including more than one connection 2312 per server endpoint. 
Most servers are designed to maintain thousands 2313 of concurrent connections, while controlling request queues to enable 2314 fair use and detect denial of service attacks. 2316 6.1. Connection 2318 The "Connection" header field allows the sender to indicate desired 2319 control options for the current connection. In order to avoid 2320 confusing downstream recipients, a proxy or gateway MUST remove or 2321 replace any received connection options before forwarding the 2322 message. 2324 When a header field aside from Connection is used to supply control 2325 information for or about the current connection, the sender MUST list 2326 the corresponding field-name within the "Connection" header field. A 2327 proxy or gateway MUST parse a received Connection header field before 2328 a message is forwarded and, for each connection-option in this field, 2329 remove any header field(s) from the message with the same name as the 2330 connection-option, and then remove the Connection header field itself 2331 (or replace it with the intermediary's own connection options for the 2332 forwarded message). 2334 Hence, the Connection header field provides a declarative way of 2335 distinguishing header fields that are only intended for the immediate 2336 recipient ("hop-by-hop") from those fields that are intended for all 2337 recipients on the chain ("end-to-end"), enabling the message to be 2338 self-descriptive and allowing future connection-specific extensions 2339 to be deployed without fear that they will be blindly forwarded by 2340 older intermediaries. 2342 The Connection header field's value has the following grammar: 2344 Connection = 1#connection-option 2345 connection-option = token 2347 Connection options are case-insensitive. 2349 A sender MUST NOT send a connection option corresponding to a header 2350 field that is intended for all recipients of the payload. 
For 2351 example, Cache-Control is never appropriate as a connection option 2352 (Section 5.2 of [Part6]). 2354 The connection options do not have to correspond to a header field 2355 present in the message, since a connection-specific header field 2356 might not be needed if there are no parameters associated with that 2357 connection option. Recipients that trigger certain connection 2358 behavior based on the presence of connection options MUST do so based 2359 on the presence of the connection-option rather than only the 2360 presence of the optional header field. In other words, if the 2361 connection option is received as a header field but not indicated 2362 within the Connection field-value, then the recipient MUST ignore the 2363 connection-specific header field because it has likely been forwarded 2364 by an intermediary that is only partially conformant. 2366 When defining new connection options, specifications ought to 2367 carefully consider existing deployed header fields and ensure that 2368 the new connection option does not share the same name as an 2369 unrelated header field that might already be deployed. Defining a 2370 new connection option essentially reserves that potential field-name 2371 for carrying additional information related to the connection option, 2372 since it would be unwise for senders to use that field-name for 2373 anything else. 2375 The "close" connection option is defined for a sender to signal that 2376 this connection will be closed after completion of the response. For 2377 example, 2379 Connection: close 2381 in either the request or the response header fields indicates that 2382 the sender is going to close the connection after the current 2383 request/response is complete (Section 6.6). 2385 A client that does not support persistent connections MUST send the 2386 "close" connection option in every request message. 
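The Connection processing required of a proxy or gateway in this section (parse the Connection field, remove every header field named by a connection-option, then remove Connection itself) can be sketched as follows. The function name strip_connection_fields and the (field-name, field-value) list are assumptions for illustration; a real intermediary would then add its own connection options for the forwarded message.

```python
def strip_connection_fields(headers):
    """Return headers without Connection and without any field it names.

    headers: list of (field-name, field-value) pairs (assumed representation).
    """
    # Collect connection options; they are case-insensitive tokens.
    options = set()
    for name, value in headers:
        if name.lower() == "connection":
            options.update(t.strip().lower() for t in value.split(",") if t.strip())
    # Drop Connection itself plus every field named by a connection option.
    return [(name, value) for (name, value) in headers
            if name.lower() != "connection" and name.lower() not in options]
```

For a message carrying "TE: trailers" and "Connection: TE, close", both fields are removed before forwarding, while end-to-end fields such as Host are preserved.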
2388 A server that does not support persistent connections MUST send the 2389 "close" connection option in every response message that does not 2390 have a 1xx (Informational) status code. 2392 6.2. Establishment 2394 It is beyond the scope of this specification to describe how 2395 connections are established via various transport or session-layer 2396 protocols. Each connection applies to only one transport link. 2398 6.3. Persistence 2400 HTTP/1.1 defaults to the use of "persistent connections", allowing 2401 multiple requests and responses to be carried over a single 2402 connection. The "close" connection-option is used to signal that a 2403 connection will not persist after the current request/response. HTTP 2404 implementations SHOULD support persistent connections. 2406 A recipient determines whether a connection is persistent or not 2407 based on the most recently received message's protocol version and 2408 Connection header field (if any): 2410 o If the close connection option is present, the connection will not 2411 persist after the current response; else, 2413 o If the received protocol is HTTP/1.1 (or later), the connection 2414 will persist after the current response; else, 2416 o If the received protocol is HTTP/1.0, the "keep-alive" connection 2417 option is present, the recipient is not a proxy, and the recipient 2418 wishes to honor the HTTP/1.0 "keep-alive" mechanism, the 2419 connection will persist after the current response; otherwise, 2421 o The connection will close after the current response. 2423 A server MAY assume that an HTTP/1.1 client intends to maintain a 2424 persistent connection until a close connection option is received in 2425 a request. 2427 A client MAY reuse a persistent connection until it sends or receives 2428 a close connection option or receives an HTTP/1.0 response without a 2429 "keep-alive" connection option. 
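The persistence rules listed in this section can be written as a single decision function. This is a sketch under assumptions, not a definitive implementation: the name connection_persists, the (major, minor) version tuple, and the two boolean flags describing the recipient are hypothetical.

```python
def connection_persists(version, connection_options,
                        is_proxy=False, honor_keep_alive=True):
    """Decide persistence from the most recently received message.

    version: (major, minor) of the received message, e.g. (1, 1).
    connection_options: iterable of tokens from the Connection field.
    is_proxy, honor_keep_alive: properties of the recipient (assumed flags).
    """
    options = {option.lower() for option in connection_options}
    # The close option always ends persistence.
    if "close" in options:
        return False
    # HTTP/1.1 or later persists by default.
    if version >= (1, 1):
        return True
    # HTTP/1.0 persists only with keep-alive, and never through a proxy.
    if (version == (1, 0) and "keep-alive" in options
            and not is_proxy and honor_keep_alive):
        return True
    # Otherwise the connection closes after the current response.
    return False
```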
2431 In order to remain persistent, all messages on a connection need to 2432 have a self-defined message length (i.e., one not defined by closure 2433 of the connection), as described in Section 3.3. A server MUST read 2434 the entire request message body or close the connection after sending 2435 its response, since otherwise the remaining data on a persistent 2436 connection would be misinterpreted as the next request. Likewise, a 2437 client MUST read the entire response message body if it intends to 2438 reuse the same connection for a subsequent request. 2440 A proxy server MUST NOT maintain a persistent connection with an 2441 HTTP/1.0 client (see Section 19.7.1 of [RFC2068] for information and 2442 discussion of the problems with the Keep-Alive header field 2443 implemented by many HTTP/1.0 clients). 2445 Clients and servers SHOULD NOT assume that a persistent connection is 2446 maintained for HTTP versions less than 1.1 unless it is explicitly 2447 signaled. See Appendix A.1.2 for more information on backward 2448 compatibility with HTTP/1.0 clients. 2450 6.3.1. Retrying Requests 2452 Connections can be closed at any time, with or without intention. 2453 Implementations ought to anticipate the need to recover from 2454 asynchronous close events. 2456 When an inbound connection is closed prematurely, a client MAY open a 2457 new connection and automatically retransmit an aborted sequence of 2458 requests if all of those requests have idempotent methods (Section 2459 4.2.2 of [Part2]). A proxy MUST NOT automatically retry non- 2460 idempotent requests. 2462 A user agent MUST NOT automatically retry a request with a non- 2463 idempotent method unless it has some means to know that the request 2464 semantics are actually idempotent, regardless of the method, or some 2465 means to detect that the original request was never applied. 
For 2466 example, a user agent that knows (through design or configuration) 2467 that a POST request to a given resource is safe can repeat that 2468 request automatically. Likewise, a user agent designed specifically 2469 to operate on a version control repository might be able to recover 2470 from partial failure conditions by checking the target resource 2471 revision(s) after a failed connection, reverting or fixing any 2472 changes that were partially applied, and then automatically retrying 2473 the requests that failed. 2475 A client SHOULD NOT automatically retry a failed automatic retry. 2477 6.3.2. Pipelining 2479 A client that supports persistent connections MAY "pipeline" its 2480 requests (i.e., send multiple requests without waiting for each 2481 response). A server MAY process a sequence of pipelined requests in 2482 parallel if they all have safe methods (Section 4.2.1 of [Part2]), 2483 but MUST send the corresponding responses in the same order that the 2484 requests were received. 2486 A client that pipelines requests SHOULD retry unanswered requests if 2487 the connection closes before it receives all of the corresponding 2488 responses. When retrying pipelined requests after a failed 2489 connection (a connection not explicitly closed by the server in its 2490 last complete response), a client MUST NOT pipeline immediately after 2491 connection establishment, since the first remaining request in the 2492 prior pipeline might have caused an error response that can be lost 2493 again if multiple requests are sent on a prematurely closed 2494 connection (see the TCP reset problem described in Section 6.6). 2496 Idempotent methods (Section 4.2.2 of [Part2]) are significant to 2497 pipelining because they can be automatically retried after a 2498 connection failure. 
A user agent SHOULD NOT pipeline requests after 2499 a non-idempotent method, until the final response status code for 2500 that method has been received, unless the user agent has a means to 2501 detect and recover from partial failure conditions involving the 2502 pipelined sequence. 2504 An intermediary that receives pipelined requests MAY pipeline those 2505 requests when forwarding them inbound, since it can rely on the 2506 outbound user agent(s) to determine what requests can be safely 2507 pipelined. If the inbound connection fails before receiving a 2508 response, the pipelining intermediary MAY attempt to retry a sequence 2509 of requests that have yet to receive a response if the requests all 2510 have idempotent methods; otherwise, the pipelining intermediary 2511 SHOULD forward any received responses and then close the 2512 corresponding outbound connection(s) so that the outbound user 2513 agent(s) can recover accordingly. 2515 6.4. Concurrency 2517 A client SHOULD limit the number of simultaneous open connections 2518 that it maintains to a given server. 2520 Previous revisions of HTTP gave a specific number of connections as a 2521 ceiling, but this was found to be impractical for many applications. 2522 As a result, this specification does not mandate a particular maximum 2523 number of connections, but instead encourages clients to be 2524 conservative when opening multiple connections. 2526 Multiple connections are typically used to avoid the "head-of-line 2527 blocking" problem, wherein a request that takes significant server- 2528 side processing and/or has a large payload blocks subsequent requests 2529 on the same connection. However, each connection consumes server 2530 resources. Furthermore, using multiple connections can cause 2531 undesirable side effects in congested networks. 2533 Note that servers might reject traffic that they deem abusive, 2534 including an excessive number of connections from a client. 2536 6.5. 
Failures and Time-outs 2538 Servers will usually have some time-out value beyond which they will 2539 no longer maintain an inactive connection. Proxy servers might make 2540 this a higher value since it is likely that the client will be making 2541 more connections through the same server. The use of persistent 2542 connections places no requirements on the length (or existence) of 2543 this time-out for either the client or the server. 2545 A client or server that wishes to time-out SHOULD issue a graceful 2546 close on the connection. Implementations SHOULD constantly monitor 2547 open connections for a received closure signal and respond to it as 2548 appropriate, since prompt closure of both sides of a connection 2549 enables allocated system resources to be reclaimed. 2551 A client, server, or proxy MAY close the transport connection at any 2552 time. For example, a client might have started to send a new request 2553 at the same time that the server has decided to close the "idle" 2554 connection. From the server's point of view, the connection is being 2555 closed while it was idle, but from the client's point of view, a 2556 request is in progress. 2558 A server SHOULD sustain persistent connections, when possible, and 2559 allow the underlying transport's flow control mechanisms to resolve 2560 temporary overloads, rather than terminate connections with the 2561 expectation that clients will retry. The latter technique can 2562 exacerbate network congestion. 2564 A client sending a message body SHOULD monitor the network connection 2565 for an error response while it is transmitting the request. If the 2566 client sees a response that indicates the server does not wish to 2567 receive the message body and is closing the connection, the client 2568 SHOULD immediately cease transmitting the body and close its side of 2569 the connection. 2571 6.6. 
Tear-down 2573 The Connection header field (Section 6.1) provides a "close" 2574 connection option that a sender SHOULD send when it wishes to close 2575 the connection after the current request/response pair. 2577 A client that sends a close connection option MUST NOT send further 2578 requests on that connection (after the one containing close) and MUST 2579 close the connection after reading the final response message 2580 corresponding to this request. 2582 A server that receives a close connection option MUST initiate a 2583 close of the connection (see below) after it sends the final response 2584 to the request that contained close. The server SHOULD send a close 2585 connection option in its final response on that connection. The 2586 server MUST NOT process any further requests received on that 2587 connection. 2589 A server that sends a close connection option MUST initiate a close 2590 of the connection (see below) after it sends the response containing 2591 close. The server MUST NOT process any further requests received on 2592 that connection. 2594 A client that receives a close connection option MUST cease sending 2595 requests on that connection and close the connection after reading 2596 the response message containing the close; if additional pipelined 2597 requests had been sent on the connection, the client SHOULD NOT 2598 assume that they will be processed by the server. 2600 If a server performs an immediate close of a TCP connection, there is 2601 a significant risk that the client will not be able to read the last 2602 HTTP response. 
If the server receives additional data from the 2603 client on a fully-closed connection, such as another request that was 2604 sent by the client before receiving the server's response, the 2605 server's TCP stack will send a reset packet to the client; 2606 unfortunately, the reset packet might erase the client's 2607 unacknowledged input buffers before they can be read and interpreted 2608 by the client's HTTP parser. 2610 To avoid the TCP reset problem, servers typically close a connection 2611 in stages. First, the server performs a half-close by closing only 2612 the write side of the read/write connection. The server then 2613 continues to read from the connection until it receives a 2614 corresponding close by the client, or until the server is reasonably 2615 certain that its own TCP stack has received the client's 2616 acknowledgement of the packet(s) containing the server's last 2617 response. Finally, the server fully closes the connection. 2619 It is unknown whether the reset problem is exclusive to TCP or might 2620 also be found in other transport connection protocols. 2622 6.7. Upgrade 2624 The "Upgrade" header field is intended to provide a simple mechanism 2625 for transitioning from HTTP/1.1 to some other protocol on the same 2626 connection. A client MAY send a list of protocols in the Upgrade 2627 header field of a request to invite the server to switch to one or 2628 more of those protocols, in order of descending preference, before 2629 sending the final response. A server MAY ignore a received Upgrade 2630 header field if it wishes to continue using the current protocol on 2631 that connection. 
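The following non-normative sketch illustrates one way a server might act on the invitation described above: it walks the client's Upgrade list in the order sent (descending preference) and picks the first protocol it implements, or returns None to signal that the field is being ignored and the current protocol continues. The function name choose_upgrade, the "supported" argument, and the use of case-insensitive matching are assumptions made for illustration; they are not defined by this specification.

```python
from typing import Iterable, Optional


def choose_upgrade(upgrade_value: str, supported: Iterable[str]) -> Optional[str]:
    """Pick the first client-listed protocol that the server implements.

    upgrade_value is the raw Upgrade field value, e.g. "HTTP/2.0, IRC/6.9".
    supported is a hypothetical collection of "name/version" strings.
    Returns None when nothing matches, meaning the Upgrade field is ignored.
    """
    # Assumption: compare case-insensitively on the whole "name/version" token.
    supported_folded = {p.casefold() for p in supported}
    for item in upgrade_value.split(","):
        protocol = item.strip()  # drop optional whitespace around each element
        if not protocol:
            continue  # skip empty list elements
        if protocol.casefold() in supported_folded:
            return protocol
    return None


if __name__ == "__main__":
    offered = "HTTP/2.0, SHTTP/1.3, IRC/6.9, RTA/x11"
    print(choose_upgrade(offered, ["IRC/6.9", "HTTP/2.0"]))
    print(choose_upgrade(offered, ["WebSocket/13"]))
```

Because the sketch only ever returns a value taken from the client's own list, it cannot select a protocol the client did not offer. Honoring the client's order is one possible policy; a server could apply its own ranking to the matching protocols instead.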
2633 Upgrade = 1#protocol 2635 protocol = protocol-name ["/" protocol-version] 2636 protocol-name = token 2637 protocol-version = token 2639 A server that sends a 101 (Switching Protocols) response MUST send an 2640 Upgrade header field to indicate the new protocol(s) to which the 2641 connection is being switched; if multiple protocol layers are being 2642 switched, the sender MUST list the protocols in layer-ascending 2643 order. A server MUST NOT switch to a protocol that was not indicated 2644 by the client in the corresponding request's Upgrade header field. A 2645 server MAY choose to ignore the order of preference indicated by the 2646 client and select the new protocol(s) based on other factors, such as 2647 the nature of the request or the current load on the server. 2649 A server that sends a 426 (Upgrade Required) response MUST send an 2650 Upgrade header field to indicate the acceptable protocols, in order 2651 of descending preference. 2653 A server MAY send an Upgrade header field in any other response to 2654 advertise that it implements support for upgrading to the listed 2655 protocols, in order of descending preference, when appropriate for a 2656 future request. 2658 The following is a hypothetical example sent by a client: 2660 GET /hello.txt HTTP/1.1 2661 Host: www.example.com 2662 Connection: upgrade 2663 Upgrade: HTTP/2.0, SHTTP/1.3, IRC/6.9, RTA/x11 2665 Upgrade cannot be used to insist on a protocol change; its acceptance 2666 and use by the server are optional. The capabilities and nature of 2667 the application-level communication after the protocol change are 2668 entirely dependent upon the new protocol(s) chosen.
However, 2669 immediately after sending the 101 response, the server is expected to 2670 continue responding to the original request as if it had received its 2671 equivalent within the new protocol (i.e., the server still has an 2672 outstanding request to satisfy after the protocol has been changed, 2673 and is expected to do so without requiring the request to be 2674 repeated). 2676 For example, if the Upgrade header field is received in a GET request 2677 and the server decides to switch protocols, it first responds with a 2678 101 (Switching Protocols) message in HTTP/1.1 and then immediately 2679 follows that with the new protocol's equivalent of a response to a 2680 GET on the target resource. This allows a connection to be upgraded 2681 to protocols with the same semantics as HTTP without the latency cost 2682 of an additional round-trip. A server MUST NOT switch protocols 2683 unless the received message semantics can be honored by the new 2684 protocol; an OPTIONS request can be honored by any protocol. 2686 The following is an example response to the above hypothetical 2687 request: 2689 HTTP/1.1 101 Switching Protocols 2690 Connection: upgrade 2691 Upgrade: HTTP/2.0 2693 [... data stream switches to HTTP/2.0 with an appropriate response 2694 (as defined by new protocol) to the "GET /hello.txt" request ...] 2696 When Upgrade is sent, the sender MUST also send a Connection header 2697 field (Section 6.1) that contains an "upgrade" connection option, in 2698 order to prevent Upgrade from being accidentally forwarded by 2699 intermediaries that might not implement the listed protocols. A 2700 server MUST ignore an Upgrade header field that is received in an 2701 HTTP/1.0 request. 2703 A client cannot begin using an upgraded protocol on the connection 2704 until it has completely sent the request message (i.e., the client 2705 can't change the protocol it is sending in the middle of a message). 
2706 If a server receives both Upgrade and an Expect header field with the 2707 "100-continue" expectation (Section 5.1.1 of [Part2]), the server 2708 MUST send a 100 (Continue) response before sending a 101 (Switching 2709 Protocols) response. 2711 The Upgrade header field only applies to switching protocols on top 2712 of the existing connection; it cannot be used to switch the 2713 underlying connection (transport) protocol, nor to switch the 2714 existing communication to a different connection. For those 2715 purposes, it is more appropriate to use a 3xx (Redirection) response 2716 (Section 6.4 of [Part2]). 2718 This specification only defines the protocol name "HTTP" for use by 2719 the family of Hypertext Transfer Protocols, as defined by the HTTP 2720 version rules of Section 2.6 and future updates to this 2721 specification. Additional tokens ought to be registered with IANA 2722 using the registration procedure defined in Section 8.5. 2724 7. ABNF list extension: #rule 2726 A #rule extension to the ABNF rules of [RFC5234] is used to improve 2727 readability in the definitions of some header field values. 2729 A construct "#" is defined, similar to "*", for defining comma- 2730 delimited lists of elements. The full form is "<n>#<m>element" 2731 indicating at least <n> and at most <m> elements, each separated by a 2732 single comma (",") and optional whitespace (OWS). 2734 Thus, a sender MUST expand the list construct as follows: 2736 1#element => element *( OWS "," OWS element ) 2738 and: 2740 #element => [ 1#element ] 2742 and for n >= 1 and m > 1: 2744 <n>#<m>element => element <n-1>*<m-1>( OWS "," OWS element ) 2746 For compatibility with legacy list rules, a recipient MUST parse and 2747 ignore a reasonable number of empty list elements: enough to handle 2748 common mistakes by senders that merge values, but not so much that 2749 they could be used as a denial of service mechanism.
In other words, 2750 a recipient MUST expand the list construct as follows: 2752 #element => [ ( "," / element ) *( OWS "," [ OWS element ] ) ] 2754 1#element => *( "," OWS ) element *( OWS "," [ OWS element ] ) 2756 Empty elements do not contribute to the count of elements present. 2757 For example, given these ABNF productions: 2759 example-list = 1#example-list-elmt 2760 example-list-elmt = token ; see Section 3.2.6 2762 Then the following are valid values for example-list (not including 2763 the double quotes, which are present for delimitation only): 2765 "foo,bar" 2766 "foo ,bar," 2767 "foo , ,bar,charlie " 2769 In contrast, the following values would be invalid, since at least 2770 one non-empty element is required by the example-list production: 2772 "" 2773 "," 2774 ", ," 2776 Appendix B shows the collected ABNF after the list constructs have 2777 been expanded, as described above, for recipients. 2779 8. IANA Considerations 2781 8.1. Header Field Registration 2783 HTTP header fields are registered within the Message Header Field 2784 Registry maintained at . 
2787 This document defines the following HTTP header fields, so their 2788 associated registry entries shall be updated according to the 2789 permanent registrations below (see [BCP90]): 2791 +-------------------+----------+----------+---------------+ 2792 | Header Field Name | Protocol | Status | Reference | 2793 +-------------------+----------+----------+---------------+ 2794 | Connection | http | standard | Section 6.1 | 2795 | Content-Length | http | standard | Section 3.3.2 | 2796 | Host | http | standard | Section 5.4 | 2797 | TE | http | standard | Section 4.3 | 2798 | Trailer | http | standard | Section 4.4 | 2799 | Transfer-Encoding | http | standard | Section 3.3.1 | 2800 | Upgrade | http | standard | Section 6.7 | 2801 | Via | http | standard | Section 5.7.1 | 2802 +-------------------+----------+----------+---------------+ 2804 Furthermore, the header field-name "Close" shall be registered as 2805 "reserved", since using that name as an HTTP header field might 2806 conflict with the "close" connection option of the "Connection" 2807 header field (Section 6.1). 2809 +-------------------+----------+----------+-------------+ 2810 | Header Field Name | Protocol | Status | Reference | 2811 +-------------------+----------+----------+-------------+ 2812 | Close | http | reserved | Section 8.1 | 2813 +-------------------+----------+----------+-------------+ 2815 The change controller is: "IETF (iesg@ietf.org) - Internet 2816 Engineering Task Force". 2818 8.2. URI Scheme Registration 2820 IANA maintains the registry of URI Schemes [BCP115] at 2821 . 
2823 This document defines the following URI schemes, so their associated 2824 registry entries shall be updated according to the permanent 2825 registrations below: 2827 +------------+------------------------------------+---------------+ 2828 | URI Scheme | Description | Reference | 2829 +------------+------------------------------------+---------------+ 2830 | http | Hypertext Transfer Protocol | Section 2.7.1 | 2831 | https | Hypertext Transfer Protocol Secure | Section 2.7.2 | 2832 +------------+------------------------------------+---------------+ 2834 8.3. Internet Media Type Registration 2836 This document serves as the specification for the Internet media 2837 types "message/http" and "application/http". The following is to be 2838 registered with IANA (see [BCP13]). 2840 8.3.1. Internet Media Type message/http 2842 The message/http type can be used to enclose a single HTTP request or 2843 response message, provided that it obeys the MIME restrictions for 2844 all "message" types regarding line length and encodings. 2846 Type name: message 2848 Subtype name: http 2850 Required parameters: none 2852 Optional parameters: version, msgtype 2854 version: The HTTP-version number of the enclosed message (e.g., 2855 "1.1"). If not present, the version can be determined from the 2856 first line of the body. 2858 msgtype: The message type -- "request" or "response". If not 2859 present, the type can be determined from the first line of the 2860 body. 2862 Encoding considerations: only "7bit", "8bit", or "binary" are 2863 permitted 2865 Security considerations: none 2867 Interoperability considerations: none 2869 Published specification: This specification (see Section 8.3.1). 2871 Applications that use this media type: 2873 Additional information: 2875 Magic number(s): none 2877 File extension(s): none 2879 Macintosh file type code(s): none 2881 Person and email address to contact for further information: See 2882 Authors Section. 
2884 Intended usage: COMMON 2885 Restrictions on usage: none 2887 Author: See Authors Section. 2889 Change controller: IESG 2891 8.3.2. Internet Media Type application/http 2893 The application/http type can be used to enclose a pipeline of one or 2894 more HTTP request or response messages (not intermixed). 2896 Type name: application 2898 Subtype name: http 2900 Required parameters: none 2902 Optional parameters: version, msgtype 2904 version: The HTTP-version number of the enclosed messages (e.g., 2905 "1.1"). If not present, the version can be determined from the 2906 first line of the body. 2908 msgtype: The message type -- "request" or "response". If not 2909 present, the type can be determined from the first line of the 2910 body. 2912 Encoding considerations: HTTP messages enclosed by this type are in 2913 "binary" format; use of an appropriate Content-Transfer-Encoding 2914 is required when transmitted via E-mail. 2916 Security considerations: none 2918 Interoperability considerations: none 2920 Published specification: This specification (see Section 8.3.2). 2922 Applications that use this media type: 2924 Additional information: 2926 Magic number(s): none 2928 File extension(s): none 2929 Macintosh file type code(s): none 2931 Person and email address to contact for further information: See 2932 Authors Section. 2934 Intended usage: COMMON 2936 Restrictions on usage: none 2938 Author: See Authors Section. 2940 Change controller: IESG 2942 8.4. Transfer Coding Registry 2944 The HTTP Transfer Coding Registry defines the name space for transfer 2945 coding names. It is maintained at 2946 . 2948 8.4.1. 
Procedure 2950 Registrations MUST include the following fields: 2952 o Name 2954 o Description 2956 o Pointer to specification text 2958 Names of transfer codings MUST NOT overlap with names of content 2959 codings (Section 3.1.2.1 of [Part2]) unless the encoding 2960 transformation is identical, as is the case for the compression 2961 codings defined in Section 4.2. 2963 Values to be added to this name space require IETF Review (see 2964 Section 4.1 of [RFC5226]), and MUST conform to the purpose of 2965 transfer coding defined in this specification. 2967 Use of program names for the identification of encoding formats is 2968 not desirable and is discouraged for future encodings. 2970 8.4.2. Registration 2972 The HTTP Transfer Coding Registry shall be updated with the 2973 registrations below: 2975 +------------+--------------------------------------+---------------+ 2976 | Name | Description | Reference | 2977 +------------+--------------------------------------+---------------+ 2978 | chunked | Transfer in a series of chunks | Section 4.1 | 2979 | compress | UNIX "compress" data format [Welch] | Section 4.2.1 | 2980 | deflate | "deflate" compressed data | Section 4.2.2 | 2981 | | ([RFC1951]) inside the "zlib" data | | 2982 | | format ([RFC1950]) | | 2983 | gzip | GZIP file format [RFC1952] | Section 4.2.3 | 2984 | x-compress | Deprecated (alias for compress) | Section 4.2.1 | 2985 | x-gzip | Deprecated (alias for gzip) | Section 4.2.3 | 2986 +------------+--------------------------------------+---------------+ 2988 8.5. Upgrade Token Registry 2990 The HTTP Upgrade Token Registry defines the name space for protocol- 2991 name tokens used to identify protocols in the Upgrade header field. 2992 The registry is maintained at 2993 . 2995 8.5.1. Procedure 2997 Each registered protocol name is associated with contact information 2998 and an optional set of specifications that details how the connection 2999 will be processed after it has been upgraded. 
3001 Registrations happen on a "First Come First Served" basis (see 3002 Section 4.1 of [RFC5226]) and are subject to the following rules: 3004 1. A protocol-name token, once registered, stays registered forever. 3006 2. The registration MUST name a responsible party for the 3007 registration. 3009 3. The registration MUST name a point of contact. 3011 4. The registration MAY name a set of specifications associated with 3012 that token. Such specifications need not be publicly available. 3014 5. The registration SHOULD name a set of expected "protocol-version" 3015 tokens associated with that token at the time of registration. 3017 6. The responsible party MAY change the registration at any time. 3018 The IANA will keep a record of all such changes, and make them 3019 available upon request. 3021 7. The IESG MAY reassign responsibility for a protocol token. This 3022 will normally only be used in the case when a responsible party 3023 cannot be contacted. 3025 This registration procedure for HTTP Upgrade Tokens replaces that 3026 previously defined in Section 7.2 of [RFC2817]. 3028 8.5.2. Upgrade Token Registration 3030 The HTTP Upgrade Token Registry shall be updated with the 3031 registration below: 3033 +-------+----------------------+----------------------+-------------+ 3034 | Value | Description | Expected Version | Reference | 3035 | | | Tokens | | 3036 +-------+----------------------+----------------------+-------------+ 3037 | HTTP | Hypertext Transfer | any DIGIT.DIGIT | Section 2.6 | 3038 | | Protocol | (e.g., "2.0") | | 3039 +-------+----------------------+----------------------+-------------+ 3041 The responsible party is: "IETF (iesg@ietf.org) - Internet 3042 Engineering Task Force". 3044 9. Security Considerations 3046 This section is meant to inform developers, information providers, 3047 and users of known security concerns relevant to HTTP/1.1 message 3048 syntax, parsing, and routing. 3050 9.1.
DNS-related Attacks 3052 HTTP clients rely heavily on the Domain Name Service (DNS), and are 3053 thus generally prone to security attacks based on the deliberate 3054 misassociation of IP addresses and DNS names not protected by DNSSEC. 3055 Clients need to be cautious in assuming the validity of an IP number/ 3056 DNS name association unless the response is protected by DNSSEC 3057 ([RFC4033]). 3059 9.2. Intermediaries and Caching 3061 By their very nature, HTTP intermediaries are men-in-the-middle, and 3062 represent an opportunity for man-in-the-middle attacks. Compromise 3063 of the systems on which the intermediaries run can result in serious 3064 security and privacy problems. Intermediaries have access to 3065 security-related information, personal information about individual 3066 users and organizations, and proprietary information belonging to 3067 users and content providers. A compromised intermediary, or an 3068 intermediary implemented or configured without regard to security and 3069 privacy considerations, might be used in the commission of a wide 3070 range of potential attacks. 3072 Intermediaries that contain a shared cache are especially vulnerable 3073 to cache poisoning attacks. 3075 Implementers need to consider the privacy and security implications 3076 of their design and coding decisions, and of the configuration 3077 options they provide to operators (especially the default 3078 configuration). 3080 Users need to be aware that intermediaries are no more trustworthy 3081 than the people who run them; HTTP itself cannot solve this problem. 3083 9.3. Buffer Overflows 3085 Because HTTP uses mostly textual, character-delimited fields, 3086 attackers can overflow buffers in implementations, and/or perform a 3087 Denial of Service against implementations that accept fields with 3088 unlimited lengths. 
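As a non-normative illustration of the risk described above, a recipient can refuse to buffer more than a fixed number of octets per line and more than a fixed number of header lines. The sketch below assumes a binary stream with a readline method; the names MAX_LINE_OCTETS, MAX_HEADER_LINES, FieldTooLarge, read_bounded_line, and read_header_lines, as well as the numeric caps, are hypothetical values chosen for the example and are not limits set by this specification.

```python
import io

MAX_LINE_OCTETS = 8000   # assumed per-line cap, for illustration only
MAX_HEADER_LINES = 100   # assumed cap on the number of header lines


class FieldTooLarge(Exception):
    """Raised when a peer exceeds one of the configured limits."""


def read_bounded_line(stream, limit=MAX_LINE_OCTETS):
    """Read one line, never buffering more than limit + 1 octets."""
    line = stream.readline(limit + 1)
    if len(line) > limit:
        raise FieldTooLarge("line exceeds %d octets" % limit)
    return line


def read_header_lines(stream, max_lines=MAX_HEADER_LINES):
    """Collect header lines up to the empty line that ends the section."""
    lines = []
    while True:
        line = read_bounded_line(stream)
        if line in (b"\r\n", b"\n", b""):
            return lines
        if len(lines) >= max_lines:
            raise FieldTooLarge("more than %d header lines" % max_lines)
        lines.append(line.rstrip(b"\r\n"))


if __name__ == "__main__":
    raw = io.BytesIO(b"Host: www.example.com\r\nConnection: close\r\n\r\n")
    print(read_header_lines(raw))
```

Passing the limit to readline, rather than reading a whole line and measuring it afterwards, is what keeps memory use bounded when a sender never terminates the line.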
3090 To promote interoperability, this specification makes specific 3091 recommendations for minimum size limits on request-line 3092 (Section 3.1.1) and header fields (Section 3.2). These are minimum 3093 recommendations, chosen to be supportable even by implementations 3094 with limited resources; it is expected that most implementations will 3095 choose substantially higher limits. 3097 This specification also provides a way for servers to reject messages 3098 that have request-targets that are too long (Section 6.5.12 of 3099 [Part2]) or request entities that are too large (Section 6.5 of 3100 [Part2]). Additional status codes related to capacity limits have 3101 been defined by extensions to HTTP [RFC6585]. 3103 Recipients ought to carefully limit the extent to which they read 3104 other fields, including (but not limited to) request methods, 3105 response status phrases, header field-names, and body chunks, so as 3106 to avoid denial of service attacks without impeding interoperability. 3108 9.4. Message Integrity 3110 HTTP does not define a specific mechanism for ensuring message 3111 integrity, instead relying on the error-detection ability of 3112 underlying transport protocols and the use of length or chunk- 3113 delimited framing to detect completeness. Additional integrity 3114 mechanisms, such as hash functions or digital signatures applied to 3115 the content, can be selectively added to messages via extensible 3116 metadata header fields. Historically, the lack of a single integrity 3117 mechanism has been justified by the informal nature of most HTTP 3118 communication. However, the prevalence of HTTP as an information 3119 access mechanism has resulted in its increasing use within 3120 environments where verification of message integrity is crucial. 
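As a non-normative sketch of the "hash function applied to the content" approach mentioned above, a sender could publish a digest of the message body in a metadata header field and a recipient could recompute it. This specification defines no such field; the function names content_digest and body_matches_digest, and the choice of SHA-256 with hexadecimal encoding, are assumptions made purely for illustration.

```python
import hashlib


def content_digest(body: bytes) -> str:
    """Return the SHA-256 digest of the body as lowercase hexadecimal."""
    return hashlib.sha256(body).hexdigest()


def body_matches_digest(body: bytes, declared_hex: str) -> bool:
    """Compare a recomputed digest with the value declared by the sender.

    declared_hex stands for the value of a hypothetical metadata header
    field; surrounding whitespace and letter case are ignored.
    """
    return content_digest(body) == declared_hex.strip().lower()


if __name__ == "__main__":
    payload = b"hello world\n"
    declared = content_digest(payload)
    print(body_matches_digest(payload, declared))       # intact body
    print(body_matches_digest(payload[:-1], declared))  # truncated body
```

A bare digest of this kind only helps detect accidental truncation or corruption. Anyone able to alter the body can usually alter the header field too, which is why the text above also mentions digital signatures.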
3122 User agents are encouraged to implement configurable means for 3123 detecting and reporting failures of message integrity such that those 3124 means can be enabled within environments for which integrity is 3125 necessary. For example, a browser being used to view medical history 3126 or drug interaction information needs to indicate to the user when 3127 such information is detected by the protocol to be incomplete, 3128 expired, or corrupted during transfer. Such mechanisms might be 3129 selectively enabled via user agent extensions or the presence of 3130 message integrity metadata in a response. At a minimum, user agents 3131 ought to provide some indication that allows a user to distinguish 3132 between a complete and incomplete response message (Section 3.4) when 3133 such verification is desired. 3135 9.5. Server Log Information 3137 A server is in the position to save personal data about a user's 3138 requests over time, which might identify their reading patterns or 3139 subjects of interest. In particular, log information gathered at an 3140 intermediary often contains a history of user agent interaction, 3141 across a multitude of sites, that can be traced to individual users. 3143 HTTP log information is confidential in nature; its handling is often 3144 constrained by laws and regulations. Log information needs to be 3145 securely stored and appropriate guidelines followed for its analysis. 3146 Anonymization of personal information within individual entries 3147 helps, but is generally not sufficient to prevent real log traces 3148 from being re-identified based on correlation with other access 3149 characteristics. As such, access traces that are keyed to a specific 3150 client are unsafe to publish even if the key is pseudonymous. 
3152 To minimize the risk of theft or accidental publication, log 3153 information ought to be purged of personally identifiable 3154 information, including user identifiers, IP addresses, and user- 3155 provided query parameters, as soon as that information is no longer 3156 necessary to support operational needs for security, auditing, or 3157 fraud control. 3159 10. Acknowledgments 3161 This edition of HTTP/1.1 builds on the many contributions that went 3162 into RFC 1945, RFC 2068, RFC 2145, and RFC 2616, including 3163 substantial contributions made by the previous authors, editors, and 3164 working group chairs: Tim Berners-Lee, Ari Luotonen, Roy T. Fielding, 3165 Henrik Frystyk Nielsen, Jim Gettys, Jeffrey C. Mogul, Larry Masinter, 3166 and Paul J. Leach. Mark Nottingham oversaw this effort as working 3167 group chair. 3169 Since 1999, the following contributors have helped improve the HTTP 3170 specification by reporting bugs, asking smart questions, drafting or 3171 reviewing text, and evaluating open issues: 3173 Adam Barth, Adam Roach, Addison Phillips, Adrian Chadd, Adrien W. de 3174 Croy, Alan Ford, Alan Ruttenberg, Albert Lunde, Alek Storm, Alex 3175 Rousskov, Alexandre Morgaut, Alexey Melnikov, Alisha Smith, Amichai 3176 Rothman, Amit Klein, Amos Jeffries, Andreas Maier, Andreas Petersson, 3177 Anil Sharma, Anne van Kesteren, Anthony Bryan, Asbjorn Ulsberg, Ashok 3178 Kumar, Balachander Krishnamurthy, Barry Leiba, Ben Laurie, Benjamin 3179 Carlyle, Benjamin Niven-Jenkins, Bil Corry, Bill Burke, Bjoern 3180 Hoehrmann, Bob Scheifler, Boris Zbarsky, Brett Slatkin, Brian Kell, 3181 Brian McBarron, Brian Pane, Brian Raymor, Brian Smith, Bryce Nesbitt, 3182 Cameron Heavon-Jones, Carl Kugler, Carsten Bormann, Charles Fry, 3183 Chris Newman, Cyrus Daboo, Dale Robert Anderson, Dan Wing, Dan 3184 Winship, Daniel Stenberg, Darrel Miller, Dave Cridland, Dave Crocker, 3185 Dave Kristol, Dave Thaler, David Booth, David Singer, David W. 
3186 Morris, Diwakar Shetty, Dmitry Kurochkin, Drummond Reed, Duane 3187 Wessels, Edward Lee, Eitan Adler, Eliot Lear, Eran Hammer-Lahav, Eric 3188 D. Williams, Eric J. Bowman, Eric Lawrence, Eric Rescorla, Erik 3189 Aronesty, Evan Prodromou, Felix Geisendoerfer, Florian Weimer, Frank 3190 Ellermann, Fred Akalin, Fred Bohle, Frederic Kayser, Gabor Molnar, 3191 Gabriel Montenegro, Geoffrey Sneddon, Gervase Markham, Gili Tzabari, 3192 Grahame Grieve, Greg Wilkins, Grzegorz Calkowski, Harald Tveit 3193 Alvestrand, Harry Halpin, Helge Hess, Henrik Nordstrom, Henry S. 3194 Thompson, Henry Story, Herbert van de Sompel, Herve Ruellan, Howard 3195 Melman, Hugo Haas, Ian Fette, Ian Hickson, Ido Safruti, Ilari 3196 Liusvaara, Ilya Grigorik, Ingo Struck, J. Ross Nicoll, James Cloos, 3197 James H. Manger, James Lacey, James M. Snell, Jamie Lokier, Jan 3198 Algermissen, Jeff Hodges (who came up with the term 'effective 3199 Request-URI'), Jeff Pinner, Jeff Walden, Jim Luther, Jitu Padhye, Joe 3200 D. Williams, Joe Gregorio, Joe Orton, John C. Klensin, John C. 3201 Mallery, John Cowan, John Kemp, John Panzer, John Schneider, John 3202 Stracke, John Sullivan, Jonas Sicking, Jonathan A. Rees, Jonathan 3203 Billington, Jonathan Moore, Jonathan Silvera, Jordi Ros, Joris 3204 Dobbelsteen, Josh Cohen, Julien Pierre, Jungshik Shin, Justin 3205 Chapweske, Justin Erenkrantz, Justin James, Kalvinder Singh, Karl 3206 Dubost, Keith Hoffman, Keith Moore, Ken Murchison, Koen Holtman, 3207 Konstantin Voronkov, Kris Zyp, Lisa Dusseault, Maciej Stachowiak, 3208 Manu Sporny, Marc Schneider, Marc Slemko, Mark Baker, Mark Pauley, 3209 Mark Watson, Markus Isomaki, Markus Lanthaler, Martin J. Duerst, 3210 Martin Musatov, Martin Nilsson, Martin Thomson, Matt Lynch, Matthew 3211 Cox, Max Clark, Michael Burrows, Michael Hausenblas, Michael Sweet, 3212 Michael Tuexen, Michael Welzl, Mike Amundsen, Mike Belshe, Mike 3213 Bishop, Mike Kelly, Mike Schinkel, Miles Sabin, Murray S. 
Kucherawy, 3214 Mykyta Yevstifeyev, Nathan Rixham, Nicholas Shanks, Nico Williams, 3215 Nicolas Alvarez, Nicolas Mailhot, Noah Slater, Osama Mazahir, Pablo 3216 Castro, Pat Hayes, Patrick R. McManus, Paul E. Jones, Paul Hoffman, 3217 Paul Marquess, Peter Lepeska, Peter Occil, Peter Saint-Andre, Peter 3218 Watkins, Phil Archer, Philippe Mougin, Phillip Hallam-Baker, Piotr 3219 Dobrogost, Poul-Henning Kamp, Preethi Natarajan, Rajeev Bector, Ray 3220 Polk, Reto Bachmann-Gmuer, Richard Cyganiak, Robby Simpson, Robert 3221 Brewer, Robert Collins, Robert Mattson, Robert O'Callahan, Robert 3222 Olofsson, Robert Sayre, Robert Siemer, Robert de Wilde, Roberto 3223 Javier Godoy, Roberto Peon, Roland Zink, Ronny Widjaja, Ryan 3224 Hamilton, S. Mike Dierken, Salvatore Loreto, Sam Johnston, Sam 3225 Pullara, Sam Ruby, Scott Lawrence (who maintained the original issues 3226 list), Sean B. Palmer, Sebastien Barnoud, Shane McCarron, Shigeki 3227 Ohtsu, Stefan Eissing, Stefan Tilkov, Stefanos Harhalakis, Stephane 3228 Bortzmeyer, Stephen Farrell, Stephen Ludin, Stuart Williams, Subbu 3229 Allamaraju, Sylvain Hellegouarch, Tapan Divekar, Tatsuhiro Tsujikawa, 3230 Tatsuya Hayashi, Ted Hardie, Thomas Broyer, Thomas Fossati, Thomas 3231 Maslen, Thomas Nordin, Thomas Roessler, Tim Bray, Tim Morgan, Tim 3232 Olsen, Tom Zhou, Travis Snoozy, Tyler Close, Vincent Murphy, Wenbo 3233 Zhu, Werner Baumann, Wilbur Streett, Wilfredo Sanchez Vega, William 3234 A. Rowe Jr., William Chan, Willy Tarreau, Xiaoshu Wang, Yaron Goland, 3235 Yngve Nysaeter Pettersen, Yoav Nir, Yogesh Bang, Yuchung Cheng, 3236 Yutaka Oiwa, Yves Lafon (long-time member of the editor team), Zed A. 3237 Shaw, and Zhong Yu. 3239 See Section 16 of [RFC2616] for additional acknowledgements from 3240 prior revisions. 3242 11. References 3244 11.1. Normative References 3246 [Part2] Fielding, R., Ed. and J. 
Reschke, Ed., "Hypertext 3247 Transfer Protocol (HTTP/1.1): Semantics and Content", 3248 draft-ietf-httpbis-p2-semantics-24 (work in progress), 3249 September 2013. 3251 [Part4] Fielding, R., Ed. and J. Reschke, Ed., "Hypertext 3252 Transfer Protocol (HTTP/1.1): Conditional Requests", 3253 draft-ietf-httpbis-p4-conditional-24 (work in 3254 progress), September 2013. 3256 [Part5] Fielding, R., Ed., Lafon, Y., Ed., and J. Reschke, Ed., 3257 "Hypertext Transfer Protocol (HTTP/1.1): Range 3258 Requests", draft-ietf-httpbis-p5-range-24 (work in 3259 progress), September 2013. 3261 [Part6] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, 3262 Ed., "Hypertext Transfer Protocol (HTTP/1.1): Caching", 3263 draft-ietf-httpbis-p6-cache-24 (work in progress), 3264 September 2013. 3266 [Part7] Fielding, R., Ed. and J. Reschke, Ed., "Hypertext 3267 Transfer Protocol (HTTP/1.1): Authentication", 3268 draft-ietf-httpbis-p7-auth-24 (work in progress), 3269 September 2013. 3271 [RFC0793] Postel, J., "Transmission Control Protocol", STD 7, 3272 RFC 793, September 1981. 3274 [RFC1950] Deutsch, L. and J-L. Gailly, "ZLIB Compressed Data 3275 Format Specification version 3.3", RFC 1950, May 1996. 3277 [RFC1951] Deutsch, P., "DEFLATE Compressed Data Format 3278 Specification version 1.3", RFC 1951, May 1996. 3280 [RFC1952] Deutsch, P., Gailly, J-L., Adler, M., Deutsch, L., and 3281 G. Randers-Pehrson, "GZIP file format specification 3282 version 4.3", RFC 1952, May 1996. 3284 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 3285 Requirement Levels", BCP 14, RFC 2119, March 1997. 3287 [RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, 3288 "Uniform Resource Identifier (URI): Generic Syntax", 3289 STD 66, RFC 3986, January 2005. 3291 [RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for 3292 Syntax Specifications: ABNF", STD 68, RFC 5234, 3293 January 2008. 
3295 [USASCII] American National Standards Institute, "Coded Character 3296 Set -- 7-bit American Standard Code for Information 3297 Interchange", ANSI X3.4, 1986. 3299 [Welch] Welch, T., "A Technique for High Performance Data 3300 Compression", IEEE Computer 17(6), June 1984. 3302 11.2. Informative References 3304 [BCP115] Hansen, T., Hardie, T., and L. Masinter, "Guidelines 3305 and Registration Procedures for New URI Schemes", 3306 BCP 115, RFC 4395, February 2006. 3308 [BCP13] Freed, N., Klensin, J., and T. Hansen, "Media Type 3309 Specifications and Registration Procedures", BCP 13, 3310 RFC 6838, January 2013. 3312 [BCP90] Klyne, G., Nottingham, M., and J. Mogul, "Registration 3313 Procedures for Message Header Fields", BCP 90, 3314 RFC 3864, September 2004. 3316 [ISO-8859-1] International Organization for Standardization, 3317 "Information technology -- 8-bit single-byte coded 3318 graphic character sets -- Part 1: Latin alphabet No. 3319 1", ISO/IEC 8859-1:1998, 1998. 3321 [Kri2001] Kristol, D., "HTTP Cookies: Standards, Privacy, and 3322 Politics", ACM Transactions on Internet 3323 Technology 1(2), November 2001, 3324 . 3326 [RFC1919] Chatel, M., "Classical versus Transparent IP Proxies", 3327 RFC 1919, March 1996. 3329 [RFC1945] Berners-Lee, T., Fielding, R., and H. Nielsen, 3330 "Hypertext Transfer Protocol -- HTTP/1.0", RFC 1945, 3331 May 1996. 3333 [RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet 3334 Mail Extensions (MIME) Part One: Format of Internet 3335 Message Bodies", RFC 2045, November 1996. 3337 [RFC2047] Moore, K., "MIME (Multipurpose Internet Mail 3338 Extensions) Part Three: Message Header Extensions for 3339 Non-ASCII Text", RFC 2047, November 1996. 3341 [RFC2068] Fielding, R., Gettys, J., Mogul, J., Nielsen, H., and 3342 T. Berners-Lee, "Hypertext Transfer Protocol -- 3343 HTTP/1.1", RFC 2068, January 1997. 3345 [RFC2145] Mogul, J., Fielding, R., Gettys, J., and H. 
Nielsen, 3346 "Use and Interpretation of HTTP Version Numbers", 3347 RFC 2145, May 1997. 3349 [RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., 3350 Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext 3351 Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999. 3353 [RFC2817] Khare, R. and S. Lawrence, "Upgrading to TLS Within 3354 HTTP/1.1", RFC 2817, May 2000. 3356 [RFC2818] Rescorla, E., "HTTP Over TLS", RFC 2818, May 2000. 3358 [RFC3040] Cooper, I., Melve, I., and G. Tomlinson, "Internet Web 3359 Replication and Caching Taxonomy", RFC 3040, 3360 January 2001. 3362 [RFC4033] Arends, R., Austein, R., Larson, M., Massey, D., and S. 3363 Rose, "DNS Security Introduction and Requirements", 3364 RFC 4033, March 2005. 3366 [RFC4559] Jaganathan, K., Zhu, L., and J. Brezak, "SPNEGO-based 3367 Kerberos and NTLM HTTP Authentication in Microsoft 3368 Windows", RFC 4559, June 2006. 3370 [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing 3371 an IANA Considerations Section in RFCs", BCP 26, 3372 RFC 5226, May 2008. 3374 [RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer 3375 Security (TLS) Protocol Version 1.2", RFC 5246, 3376 August 2008. 3378 [RFC5322] Resnick, P., "Internet Message Format", RFC 5322, 3379 October 2008. 3381 [RFC6265] Barth, A., "HTTP State Management Mechanism", RFC 6265, 3382 April 2011. 3384 [RFC6585] Nottingham, M. and R. Fielding, "Additional HTTP Status 3385 Codes", RFC 6585, April 2012. 3387 Appendix A. HTTP Version History 3389 HTTP has been in use by the World-Wide Web global information 3390 initiative since 1990. The first version of HTTP, later referred to 3391 as HTTP/0.9, was a simple protocol for hypertext data transfer across 3392 the Internet with only a single request method (GET) and no metadata. 3393 HTTP/1.0, as defined by [RFC1945], added a range of request methods 3394 and MIME-like messaging that could include metadata about the data 3395 transferred and modifiers on the request/response semantics. 
3396 However, HTTP/1.0 did not sufficiently take into consideration the 3397 effects of hierarchical proxies, caching, the need for persistent 3398 connections, or name-based virtual hosts. The proliferation of 3399 incompletely-implemented applications calling themselves "HTTP/1.0" 3400 further necessitated a protocol version change in order for two 3401 communicating applications to determine each other's true 3402 capabilities. 3404 HTTP/1.1 remains compatible with HTTP/1.0 by including more stringent 3405 requirements that enable reliable implementations, adding only those 3406 new features that will either be safely ignored by an HTTP/1.0 3407 recipient or only sent when communicating with a party advertising 3408 conformance with HTTP/1.1. 3410 It is beyond the scope of a protocol specification to mandate 3411 conformance with previous versions. HTTP/1.1 was deliberately 3412 designed, however, to make supporting previous versions easy. We 3413 would expect a general-purpose HTTP/1.1 server to understand any 3414 valid request in the format of HTTP/1.0 and respond appropriately 3415 with an HTTP/1.1 message that only uses features understood (or 3416 safely ignored) by HTTP/1.0 clients. Likewise, we would expect an 3417 HTTP/1.1 client to understand any valid HTTP/1.0 response. 3419 Since HTTP/0.9 did not support header fields in a request, there is 3420 no mechanism for it to support name-based virtual hosts (selection of 3421 resource by inspection of the Host header field). Any server that 3422 implements name-based virtual hosts ought to disable support for 3423 HTTP/0.9. Most requests that appear to be HTTP/0.9 are, in fact, 3424 badly constructed HTTP/1.x requests wherein a buggy client failed to 3425 properly encode linear whitespace found in a URI reference and placed 3426 in the request-target. 3428 A.1. Changes from HTTP/1.0 3430 This section summarizes major differences between versions HTTP/1.0 3431 and HTTP/1.1. 3433 A.1.1. 
Multi-homed Web Servers 3435 The requirements that clients and servers support the Host header 3436 field (Section 5.4), report an error if it is missing from an 3437 HTTP/1.1 request, and accept absolute URIs (Section 5.3) are among 3438 the most important changes defined by HTTP/1.1. 3440 Older HTTP/1.0 clients assumed a one-to-one relationship of IP 3441 addresses and servers; there was no other established mechanism for 3442 distinguishing the intended server of a request than the IP address 3443 to which that request was directed. The Host header field was 3444 introduced during the development of HTTP/1.1 and, though it was 3445 quickly implemented by most HTTP/1.0 browsers, additional 3446 requirements were placed on all HTTP/1.1 requests in order to ensure 3447 complete adoption. At the time of this writing, most HTTP-based 3448 services are dependent upon the Host header field for targeting 3449 requests. 3451 A.1.2. Keep-Alive Connections 3453 In HTTP/1.0, each connection is established by the client prior to 3454 the request and closed by the server after sending the response. 3455 However, some implementations implement the explicitly negotiated 3456 ("Keep-Alive") version of persistent connections described in Section 3457 19.7.1 of [RFC2068]. 3459 Some clients and servers might wish to be compatible with these 3460 previous approaches to persistent connections, by explicitly 3461 negotiating for them with a "Connection: keep-alive" request header 3462 field. However, some experimental implementations of HTTP/1.0 3463 persistent connections are faulty; for example, if an HTTP/1.0 proxy 3464 server doesn't understand Connection, it will erroneously forward 3465 that header field to the next inbound server, which would result in a 3466 hung connection. 3468 One attempted solution was the introduction of a Proxy-Connection 3469 header field, targeted specifically at proxies. 
In practice, this 3470 was also unworkable, because proxies are often deployed in multiple 3471 layers, bringing about the same problem discussed above. 3473 As a result, clients are encouraged not to send the Proxy-Connection 3474 header field in any requests. 3476 Clients are also encouraged to consider the use of Connection: keep- 3477 alive in requests carefully; while they can enable persistent 3478 connections with HTTP/1.0 servers, clients using them will need to 3479 monitor the connection for "hung" requests (which indicate that the 3480 client ought to stop sending the header field), and this mechanism ought 3481 not to be used by clients at all when a proxy is being used. 3483 A.1.3. Introduction of Transfer-Encoding 3485 HTTP/1.1 introduces the Transfer-Encoding header field 3486 (Section 3.3.1). Transfer codings need to be decoded prior to 3487 forwarding an HTTP message over a MIME-compliant protocol. 3489 A.2. Changes from RFC 2616 3491 HTTP's approach to error handling has been explained. (Section 2.5) 3493 The HTTP-version ABNF production has been clarified to be case- 3494 sensitive. Additionally, version numbers have been restricted to 3495 single digits, due to the fact that implementations are known to 3496 handle multi-digit version numbers incorrectly. (Section 2.6) 3498 Userinfo (i.e., username and password) are now disallowed in HTTP and 3499 HTTPS URIs, because of security issues related to their transmission 3500 on the wire. (Section 2.7.1) 3502 The HTTPS URI scheme is now defined by this specification; 3503 previously, it was defined in Section 2.4 of [RFC2818]. Furthermore, it 3504 implies end-to-end security. (Section 2.7.2) 3506 HTTP messages can be (and often are) buffered by implementations; 3507 despite it sometimes being available as a stream, HTTP is 3508 fundamentally a message-oriented protocol. Minimum supported sizes 3509 for various protocol elements have been suggested, to improve 3510 interoperability.
(Section 3) 3512 Invalid whitespace around field-names is now required to be rejected, 3513 because accepting it represents a security vulnerability. The ABNF 3514 productions defining header fields now only list the field value. 3515 (Section 3.2) 3517 Rules about implicit linear whitespace between certain grammar 3518 productions have been removed; now whitespace is only allowed where 3519 specifically defined in the ABNF. (Section 3.2.3) 3521 Header fields that span multiple lines ("line folding") are 3522 deprecated. (Section 3.2.4) 3524 The NUL octet is no longer allowed in comment and quoted-string text, 3525 and handling of backslash-escaping in them has been clarified. The 3526 quoted-pair rule no longer allows escaping control characters other 3527 than HTAB. Non-ASCII content in header fields and the reason phrase 3528 has been obsoleted and made opaque (the TEXT rule was removed). 3529 (Section 3.2.6) 3531 Bogus "Content-Length" header fields are now required to be handled 3532 as errors by recipients. (Section 3.3.2) 3534 The algorithm for determining the message body length has been 3535 clarified to indicate all of the special cases (e.g., driven by 3536 methods or status codes) that affect it, and that new protocol 3537 elements cannot define such special cases. CONNECT is a new, special 3538 case in determining message body length. "multipart/byteranges" is no 3539 longer a way of determining message body length. 3540 (Section 3.3.3) 3542 The "identity" transfer coding token has been removed. (Sections 3.3 3543 and 4) 3545 Chunk length does not include the count of the octets in the chunk 3546 header and trailer. Line folding in chunk extensions is disallowed. 3547 (Section 4.1) 3549 The meaning of the "deflate" content coding has been clarified. 3550 (Section 4.2.2) 3552 The segment + query components of RFC 3986 have been used to define 3553 the request-target, instead of abs_path from RFC 1808.
The asterisk- 3554 form of the request-target is only allowed with the OPTIONS method. 3556 (Section 5.3) 3558 The term "Effective Request URI" has been introduced. (Section 5.5) 3560 Gateways do not need to generate Via header fields anymore. 3561 (Section 5.7.1) 3563 Exactly when "close" connection options have to be sent has been 3564 clarified. Also, "hop-by-hop" header fields are required to appear 3565 in the Connection header field; just because they're defined as hop- 3566 by-hop in this specification doesn't exempt them. (Section 6.1) 3568 The limit of two connections per server has been removed. An 3569 idempotent sequence of requests is no longer required to be retried. 3570 The requirement to retry requests under certain circumstances when 3571 the server prematurely closes the connection has been removed. Also, 3572 some extraneous requirements about when servers are allowed to close 3573 connections prematurely have been removed. (Section 6.3) 3575 The semantics of the Upgrade header field are now defined in responses 3576 other than 101 (this was incorporated from [RFC2817]). Furthermore, 3577 the ordering in the field value is now significant. (Section 6.7) 3579 Empty list elements in list productions (e.g., a list header field 3580 containing ", ,") have been deprecated. (Section 7) 3582 Registration of Transfer Codings now requires IETF Review. 3583 (Section 8.4) 3585 This specification now defines the Upgrade Token Registry, previously 3586 defined in Section 7.2 of [RFC2817]. (Section 8.5) 3588 The expectation to support HTTP/0.9 requests has been removed. 3589 (Appendix A) 3591 Issues with the Keep-Alive and Proxy-Connection header fields in 3592 requests are pointed out, with use of the latter being discouraged 3593 altogether. (Appendix A.1.2) 3595 Appendix B.
Collected ABNF 3597 BWS = OWS 3599 Connection = *( "," OWS ) connection-option *( OWS "," [ OWS 3600 connection-option ] ) 3601 Content-Length = 1*DIGIT 3602 HTTP-message = start-line *( header-field CRLF ) CRLF [ message-body 3603 ] 3604 HTTP-name = %x48.54.54.50 ; HTTP 3605 HTTP-version = HTTP-name "/" DIGIT "." DIGIT 3606 Host = uri-host [ ":" port ] 3608 OWS = *( SP / HTAB ) 3610 RWS = 1*( SP / HTAB ) 3612 TE = [ ( "," / t-codings ) *( OWS "," [ OWS t-codings ] ) ] 3613 Trailer = *( "," OWS ) field-name *( OWS "," [ OWS field-name ] ) 3614 Transfer-Encoding = *( "," OWS ) transfer-coding *( OWS "," [ OWS 3615 transfer-coding ] ) 3617 URI-reference = 3618 Upgrade = *( "," OWS ) protocol *( OWS "," [ OWS protocol ] ) 3620 Via = *( "," OWS ) ( received-protocol RWS received-by [ RWS comment 3621 ] ) *( OWS "," [ OWS ( received-protocol RWS received-by [ RWS 3622 comment ] ) ] ) 3624 absolute-URI = 3625 absolute-form = absolute-URI 3626 absolute-path = 1*( "/" segment ) 3627 asterisk-form = "*" 3628 attribute = token 3629 authority = 3630 authority-form = authority 3632 chunk = chunk-size [ chunk-ext ] CRLF chunk-data CRLF 3633 chunk-data = 1*OCTET 3634 chunk-ext = *( ";" chunk-ext-name [ "=" chunk-ext-val ] ) 3635 chunk-ext-name = token 3636 chunk-ext-val = token / quoted-str-nf 3637 chunk-size = 1*HEXDIG 3638 chunked-body = *chunk last-chunk trailer-part CRLF 3639 comment = "(" *( ctext / quoted-cpair / comment ) ")" 3640 connection-option = token 3641 ctext = HTAB / SP / %x21-27 ; '!'-''' 3642 / %x2A-5B ; '*'-'[' 3643 / %x5D-7E ; ']'-'~' 3644 / obs-text 3646 field-content = *( HTAB / SP / VCHAR / obs-text ) 3647 field-name = token 3648 field-value = *( field-content / obs-fold ) 3649 fragment = 3650 header-field = field-name ":" OWS field-value OWS 3651 http-URI = "http://" authority path-abempty [ "?" query ] [ "#" 3652 fragment ] 3653 https-URI = "https://" authority path-abempty [ "?" 
query ] [ "#" 3654 fragment ] 3656 last-chunk = 1*"0" [ chunk-ext ] CRLF 3658 message-body = *OCTET 3659 method = token 3661 obs-fold = CRLF ( SP / HTAB ) 3662 obs-text = %x80-FF 3663 origin-form = absolute-path [ "?" query ] 3665 partial-URI = relative-part [ "?" query ] 3666 path-abempty = 3667 port = 3668 protocol = protocol-name [ "/" protocol-version ] 3669 protocol-name = token 3670 protocol-version = token 3671 pseudonym = token 3673 qdtext = HTAB / SP / "!" / %x23-5B ; '#'-'[' 3674 / %x5D-7E ; ']'-'~' 3675 / obs-text 3676 qdtext-nf = HTAB / SP / "!" / %x23-5B ; '#'-'[' 3677 / %x5D-7E ; ']'-'~' 3678 / obs-text 3679 query = 3680 quoted-cpair = "\" ( HTAB / SP / VCHAR / obs-text ) 3681 quoted-pair = "\" ( HTAB / SP / VCHAR / obs-text ) 3682 quoted-str-nf = DQUOTE *( qdtext-nf / quoted-pair ) DQUOTE 3683 quoted-string = DQUOTE *( qdtext / quoted-pair ) DQUOTE 3685 rank = ( "0" [ "." *3DIGIT ] ) / ( "1" [ "." *3"0" ] ) 3686 reason-phrase = *( HTAB / SP / VCHAR / obs-text ) 3687 received-by = ( uri-host [ ":" port ] ) / pseudonym 3688 received-protocol = [ protocol-name "/" ] protocol-version 3689 relative-part = 3690 request-line = method SP request-target SP HTTP-version CRLF 3691 request-target = origin-form / absolute-form / authority-form / 3692 asterisk-form 3694 segment = 3695 special = "(" / ")" / "<" / ">" / "@" / "," / ";" / ":" / "\" / 3696 DQUOTE / "/" / "[" / "]" / "?" / "=" / "{" / "}" 3697 start-line = request-line / status-line 3698 status-code = 3DIGIT 3699 status-line = HTTP-version SP status-code SP reason-phrase CRLF 3701 t-codings = "trailers" / ( transfer-coding [ t-ranking ] ) 3702 t-ranking = OWS ";" OWS "q=" rank 3703 tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "." 
/ 3704 "^" / "_" / "`" / "|" / "~" / DIGIT / ALPHA 3705 token = 1*tchar 3706 trailer-part = *( header-field CRLF ) 3707 transfer-coding = "chunked" / "compress" / "deflate" / "gzip" / 3708 transfer-extension 3709 transfer-extension = token *( OWS ";" OWS transfer-parameter ) 3710 transfer-parameter = attribute BWS "=" BWS value 3712 uri-host = 3714 value = word 3716 word = token / quoted-string 3718 Appendix C. Change Log (to be removed by RFC Editor before publication) 3720 C.1. Since RFC 2616 3722 Changes up to the first Working Group Last Call draft are summarized 3723 in . 3726 C.2. Since draft-ietf-httpbis-p1-messaging-21 3728 Closed issues: 3730 o : "Cite HTTPS 3731 URI scheme definition" (the spec now includes the HTTPs scheme 3732 definition and thus updates RFC 2818) 3734 o : "mention of 3735 'proxies' in section about caches" 3737 o : "use of ABNF 3738 terms from RFC 3986" 3740 o : "transferring 3741 URIs with userinfo in payload" 3743 o : "editorial 3744 improvements to message length definition" 3746 o : "Connection 3747 header field MUST vs SHOULD" 3749 o : "editorial 3750 improvements to persistent connections section" 3752 o : "URI 3753 normalization vs empty path" 3755 o : "p1 feedback" 3757 o : "is parsing 3758 OBS-FOLD mandatory?" 3760 o : "HTTPS and 3761 Shared Caching" 3763 o : "Requirements 3764 for recipients of ws between start-line and first header field" 3766 o : "SP and HT 3767 when being tolerant" 3769 o : "Message 3770 Parsing Strictness" 3772 o : "'Render'" 3774 o : "No-Transform" 3776 o : "p2 editorial 3777 feedback" 3779 o : "Content- 3780 Length SHOULD be sent" 3782 o : "origin-form 3783 does not allow path starting with "//"" 3785 o : "ambiguity in 3786 part 1 example" 3788 C.3. 
Since draft-ietf-httpbis-p1-messaging-22 3790 Closed issues: 3792 o : "Part1 should 3793 have a reference to TCP (RFC 793)" 3795 o : "media type 3796 registration template issues" 3798 o : P1 editorial 3799 nits 3801 o : "BWS" (vs 3802 conformance) 3804 o : "obs-fold 3805 language" 3807 o : "Ordering in 3808 Upgrade" 3810 o : "p1 editorial 3811 feedback" 3813 o : "HTTP and TCP 3814 name delegation" 3816 o : "Receiving a 3817 higher minor HTTP version number" 3819 o : "HTTP(S) URIs 3820 and fragids" 3822 o : "Registering 3823 x-gzip and x-deflate" 3825 o : "Via and 3826 gateways" 3828 o : "Mention 203 3829 Non-Authoritative Information in p1" 3831 o : "SHOULD and 3832 conformance" 3834 o : "Pipelining 3835 language" 3837 o : "proxy 3838 handling of a really bad Content-Length" 3840 C.4. Since draft-ietf-httpbis-p1-messaging-23 3842 Closed issues: 3844 o : "chunk- 3845 extensions" (un-deprecated and explained) 3847 o : "MUST fix 3848 Content-Length?" 3850 o : "list notation 3851 defined in appendix" 3853 o : "Fine-Tuning 3854 when Upgrade takes effect" 3856 Index 3858 A 3859 absolute-form (of request-target) 42 3860 accelerator 10 3861 application/http Media Type 62 3862 asterisk-form (of request-target) 42 3863 authority-form (of request-target) 42 3865 B 3866 browser 7 3868 C 3869 cache 11 3870 cacheable 12 3871 captive portal 11 3872 chunked (Coding Format) 28, 31, 35 3873 client 7 3874 close 49, 55 3875 compress (Coding Format) 38 3876 connection 7 3877 Connection header field 49, 55 3878 Content-Length header field 30 3880 D 3881 deflate (Coding Format) 38 3882 downstream 9 3884 E 3885 effective request URI 44 3887 G 3888 gateway 10 3889 Grammar 3890 absolute-form 41 3891 absolute-path 16 3892 absolute-URI 16 3893 ALPHA 6 3894 asterisk-form 41 3895 attribute 35 3896 authority 16 3897 authority-form 41 3898 BWS 24 3899 chunk 35-36 3900 chunk-data 35-36 3901 chunk-ext 35-36 3902 chunk-ext-name 35-36 3903 chunk-ext-val 35-36 3904 chunk-size 35-36 3905 chunked-body 
35-36 3906 comment 27 3907 Connection 50 3908 connection-option 50 3909 Content-Length 30 3910 CR 6 3911 CRLF 6 3912 ctext 27 3913 CTL 6 3914 date2 35 3915 date3 35 3916 DIGIT 6 3917 DQUOTE 6 3918 field-content 22 3919 field-name 22 3920 field-value 22 3921 fragment 16 3922 header-field 22 3923 HEXDIG 6 3924 Host 43 3925 HTAB 6 3926 HTTP-message 19 3927 HTTP-name 14 3928 http-URI 17 3929 HTTP-version 14 3930 https-URI 18 3931 last-chunk 35-36 3932 LF 6 3933 message-body 27 3934 method 21 3935 obs-fold 22 3936 obs-text 27 3937 OCTET 6 3938 origin-form 41 3939 OWS 24 3940 partial-URI 16 3941 port 16 3942 protocol-name 47 3943 protocol-version 47 3944 pseudonym 47 3945 qdtext 27 3946 qdtext-nf 35-36 3947 query 16 3948 quoted-cpair 27 3949 quoted-pair 27 3950 quoted-str-nf 35-36 3951 quoted-string 27 3952 rank 39 3953 reason-phrase 22 3954 received-by 47 3955 received-protocol 47 3956 request-line 21 3957 request-target 41 3958 RWS 24 3959 segment 16 3960 SP 6 3961 special 26 3962 start-line 21 3963 status-code 22 3964 status-line 22 3965 t-codings 39 3966 t-ranking 39 3967 tchar 26 3968 TE 39 3969 token 26 3970 Trailer 40 3971 trailer-part 35-37 3972 transfer-coding 35 3973 Transfer-Encoding 28 3974 transfer-extension 35 3975 transfer-parameter 35 3976 Upgrade 56 3977 uri-host 16 3978 URI-reference 16 3979 value 35 3980 VCHAR 6 3981 Via 47 3982 word 26 3983 gzip (Coding Format) 38 3985 H 3986 header field 19 3987 header section 19 3988 headers 19 3989 Host header field 43 3990 http URI scheme 17 3991 https URI scheme 18 3993 I 3994 inbound 9 3995 interception proxy 11 3996 intermediary 9 3998 M 3999 Media Type 4000 application/http 62 4001 message/http 61 4002 message 7 4003 message/http Media Type 61 4004 method 21 4006 N 4007 non-transforming proxy 10 4009 O 4010 origin server 7 4011 origin-form (of request-target) 41 4012 outbound 9 4014 P 4015 proxy 10 4017 R 4018 recipient 7 4019 request 7 4020 request-target 21 4021 resource 16 4022 response 7 4023 reverse proxy 
10 4025 S 4026 sender 7 4027 server 7 4028 spider 7 4030 T 4031 target resource 40 4032 target URI 40 4033 TE header field 38 4034 Trailer header field 40 4035 Transfer-Encoding header field 28 4036 transforming proxy 10 4037 transparent proxy 11 4038 tunnel 11 4040 U 4041 Upgrade header field 56 4042 upstream 9 4043 URI scheme 4044 http 17 4045 https 18 4046 user agent 7 4048 V 4049 Via header field 46 4051 Authors' Addresses 4053 Roy T. Fielding (editor) 4054 Adobe Systems Incorporated 4055 345 Park Ave 4056 San Jose, CA 95110 4057 USA 4059 EMail: fielding@gbiv.com 4060 URI: http://roy.gbiv.com/ 4062 Julian F. Reschke (editor) 4063 greenbytes GmbH 4064 Hafenweg 16 4065 Muenster, NW 48155 4066 Germany 4068 EMail: julian.reschke@greenbytes.de 4069 URI: http://greenbytes.de/tech/webdav/