idnits 2.17.00 (12 Aug 2021)

/tmp/idnits61362/draft-ietf-httpbis-p1-messaging-25.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The draft header indicates that this document obsoletes RFC2616, but
     the abstract doesn't seem to mention this, which it should.

  -- The draft header indicates that this document obsoletes RFC2145, but
     the abstract doesn't seem to mention this, which it should.

  -- The draft header indicates that this document updates RFC2817, but
     the abstract doesn't seem to mention this, which it should.

  -- The draft header indicates that this document updates RFC2818, but
     the abstract doesn't seem to mention this, which it should.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does
     not match the current year

     (Using the creation date from RFC2817, updated by this document, for
     RFC5378 checks: 1998-11-18)

     (Using the creation date from RFC2818, updated by this document, for
     RFC5378 checks: 1998-01-27)

  -- The document seems to contain a disclaimer for pre-RFC5378 work, and
     may have content which was first submitted before 10 November 2008.
     The disclaimer is necessary when there are original authors that you
     have been unable to contact, or if some do not wish to grant the
     BCP78 rights to the IETF Trust.
     If you are able to get all authors (current and original) to grant
     those rights, you can and should remove the disclaimer; otherwise,
     the disclaimer is needed and you can ignore this comment. (See the
     Legal Provisions document at https://trustee.ietf.org/license-info
     for more information.)

  -- The document date (November 17, 2013) is 3106 days in the past. Is
     this intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '' and '' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative
     references to lower-maturity documents in RFCs)

  == Unused Reference: 'Part5' is defined on line 3275, but no explicit
     reference was found in the text

  == Unused Reference: 'Part7' is defined on line 3285, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC2145' is defined on line 3364, but no explicit
     reference was found in the text

  == Outdated reference: draft-ietf-httpbis-p2-semantics has been
     published as RFC 7231

  == Outdated reference: draft-ietf-httpbis-p4-conditional has been
     published as RFC 7232

  == Outdated reference: draft-ietf-httpbis-p5-range has been published
     as RFC 7233

  == Outdated reference: draft-ietf-httpbis-p6-cache has been published
     as RFC 7234

  == Outdated reference: draft-ietf-httpbis-p7-auth has been published as
     RFC 7235

  ** Downref: Normative reference to an Informational RFC: RFC 1950

  ** Downref: Normative reference to an Informational RFC: RFC 1951

  ** Downref: Normative reference to an Informational RFC: RFC 1952

  -- Possible downref: Non-RFC (?) normative reference: ref. 'USASCII'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'Welch'

  -- Obsolete informational reference (is this intentional?): RFC 4395 (ref.
     'BCP115') (Obsoleted by RFC 7595)

  -- Obsolete informational reference (is this intentional?): RFC 2068
     (Obsoleted by RFC 2616)

  -- Obsolete informational reference (is this intentional?): RFC 2145
     (Obsoleted by RFC 7230)

  -- Obsolete informational reference (is this intentional?): RFC 2616
     (Obsoleted by RFC 7230, RFC 7231, RFC 7232, RFC 7233, RFC 7234,
     RFC 7235)

  -- Obsolete informational reference (is this intentional?): RFC 5226
     (Obsoleted by RFC 8126)

  -- Obsolete informational reference (is this intentional?): RFC 5246
     (Obsoleted by RFC 8446)

     Summary: 3 errors (**), 0 flaws (~~), 9 warnings (==), 15 comments (--).

     Run idnits with the --verbose option for more detailed information
     about the items above.

--------------------------------------------------------------------------------

HTTPbis Working Group                                   R. Fielding, Ed.
Internet-Draft                                                     Adobe
Obsoletes: 2145,2616 (if approved)                       J. Reschke, Ed.
Updates: 2817,2818 (if approved)                              greenbytes
Intended status: Standards Track                       November 17, 2013
Expires: May 21, 2014

   Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing
                   draft-ietf-httpbis-p1-messaging-25

Abstract

   The Hypertext Transfer Protocol (HTTP) is an application-level
   protocol for distributed, collaborative, hypertext information
   systems. HTTP has been in use by the World Wide Web global
   information initiative since 1990. This document provides an
   overview of HTTP architecture and its associated terminology, defines
   the "http" and "https" Uniform Resource Identifier (URI) schemes,
   defines the HTTP/1.1 message syntax and parsing requirements, and
   describes general security concerns for implementations.

Editorial Note (To be removed by RFC Editor)

   Discussion of this draft takes place on the HTTPBIS working group
   mailing list (ietf-http-wg@w3.org), which is archived at
   .

   The current issues list is at
   and related
   documents (including fancy diffs) can be found at
   .

   The changes in this draft are summarized in Appendix C.2.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 21, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008. The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1. Introduction
      1.1. Requirement Notation
      1.2. Syntax Notation
   2. Architecture
      2.1. Client/Server Messaging
      2.2. Implementation Diversity
      2.3. Intermediaries
      2.4. Caches
      2.5. Conformance and Error Handling
      2.6. Protocol Versioning
      2.7. Uniform Resource Identifiers
         2.7.1. http URI scheme
         2.7.2. https URI scheme
         2.7.3. http and https URI Normalization and Comparison
   3. Message Format
      3.1. Start Line
         3.1.1. Request Line
         3.1.2. Status Line
      3.2. Header Fields
         3.2.1. Field Extensibility
         3.2.2. Field Order
         3.2.3. Whitespace
         3.2.4. Field Parsing
         3.2.5. Field Limits
         3.2.6. Field value components
      3.3. Message Body
         3.3.1. Transfer-Encoding
         3.3.2. Content-Length
         3.3.3. Message Body Length
      3.4. Handling Incomplete Messages
      3.5. Message Parsing Robustness
   4. Transfer Codings
      4.1. Chunked Transfer Coding
         4.1.1. Chunk Extensions
         4.1.2. Chunked Trailer Part
         4.1.3. Decoding Chunked
      4.2. Compression Codings
         4.2.1. Compress Coding
         4.2.2. Deflate Coding
         4.2.3. Gzip Coding
      4.3. TE
      4.4. Trailer
   5. Message Routing
      5.1. Identifying a Target Resource
      5.2. Connecting Inbound
      5.3. Request Target
      5.4. Host
      5.5. Effective Request URI
      5.6. Associating a Response to a Request
      5.7. Message Forwarding
         5.7.1. Via
         5.7.2. Transformations
   6. Connection Management
      6.1. Connection
      6.2. Establishment
      6.3. Persistence
         6.3.1. Retrying Requests
         6.3.2. Pipelining
      6.4. Concurrency
      6.5. Failures and Time-outs
      6.6. Tear-down
      6.7. Upgrade
   7. ABNF list extension: #rule
   8. IANA Considerations
      8.1. Header Field Registration
      8.2. URI Scheme Registration
      8.3. Internet Media Type Registration
         8.3.1. Internet Media Type message/http
         8.3.2. Internet Media Type application/http
      8.4. Transfer Coding Registry
         8.4.1. Procedure
         8.4.2. Registration
      8.5. Content Coding Registration
      8.6. Upgrade Token Registry
         8.6.1. Procedure
         8.6.2. Upgrade Token Registration
   9. Security Considerations
      9.1. DNS-related Attacks
      9.2. Intermediaries and Caching
      9.3. Buffer Overflows
      9.4. Message Integrity
      9.5. Server Log Information
   10. Acknowledgments
   11. References
      11.1. Normative References
      11.2. Informative References
   Appendix A. HTTP Version History
      A.1. Changes from HTTP/1.0
         A.1.1. Multi-homed Web Servers
         A.1.2. Keep-Alive Connections
         A.1.3. Introduction of Transfer-Encoding
      A.2. Changes from RFC 2616
   Appendix B. Collected ABNF
   Appendix C. Change Log (to be removed by RFC Editor before
               publication)
      C.1. Since RFC 2616
      C.2. Since draft-ietf-httpbis-p1-messaging-24
   Index

1. Introduction

   The Hypertext Transfer Protocol (HTTP) is an application-level
   request/response protocol that uses extensible semantics and
   self-descriptive message payloads for flexible interaction with
   network-based hypertext information systems. This document is the
   first in a series of documents that collectively form the HTTP/1.1
   specification:

      RFC xxx1: Message Syntax and Routing

      RFC xxx2: Semantics and Content

      RFC xxx3: Conditional Requests

      RFC xxx4: Range Requests

      RFC xxx5: Caching

      RFC xxx6: Authentication

   This HTTP/1.1 specification obsoletes and moves to historic status
   RFC 2616, its predecessor RFC 2068, and RFC 2145 (on HTTP
   versioning).
   This specification also updates the use of CONNECT to
   establish a tunnel, previously defined in RFC 2817, and defines the
   "https" URI scheme that was described informally in RFC 2818.

   HTTP is a generic interface protocol for information systems. It is
   designed to hide the details of how a service is implemented by
   presenting a uniform interface to clients that is independent of the
   types of resources provided. Likewise, servers do not need to be
   aware of each client's purpose: an HTTP request can be considered in
   isolation rather than being associated with a specific type of client
   or a predetermined sequence of application steps. The result is a
   protocol that can be used effectively in many different contexts and
   for which implementations can evolve independently over time.

   HTTP is also designed for use as an intermediation protocol for
   translating communication to and from non-HTTP information systems.
   HTTP proxies and gateways can provide access to alternative
   information services by translating their diverse protocols into a
   hypertext format that can be viewed and manipulated by clients in the
   same way as HTTP services.

   One consequence of this flexibility is that the protocol cannot be
   defined in terms of what occurs behind the interface. Instead, we
   are limited to defining the syntax of communication, the intent of
   received communication, and the expected behavior of recipients. If
   the communication is considered in isolation, then successful actions
   ought to be reflected in corresponding changes to the observable
   interface provided by servers. However, since multiple clients might
   act in parallel and perhaps at cross-purposes, we cannot require that
   such changes be observable beyond the scope of a single response.

   This document describes the architectural elements that are used or
   referred to in HTTP, defines the "http" and "https" URI schemes,
   describes overall network operation and connection management, and
   defines HTTP message framing and forwarding requirements. Our goal
   is to define all of the mechanisms necessary for HTTP message
   handling that are independent of message semantics, thereby defining
   the complete set of requirements for message parsers and
   message-forwarding intermediaries.

1.1. Requirement Notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   Conformance criteria and considerations regarding error handling are
   defined in Section 2.5.

1.2. Syntax Notation

   This specification uses the Augmented Backus-Naur Form (ABNF)
   notation of [RFC5234] with the list rule extension defined in
   Section 7. Appendix B shows the collected ABNF with the list rule
   expanded.

   The following core rules are included by reference, as defined in
   [RFC5234], Appendix B.1: ALPHA (letters), CR (carriage return), CRLF
   (CR LF), CTL (controls), DIGIT (decimal 0-9), DQUOTE (double quote),
   HEXDIG (hexadecimal 0-9/A-F/a-f), HTAB (horizontal tab), LF (line
   feed), OCTET (any 8-bit sequence of data), SP (space), and VCHAR (any
   visible [USASCII] character).

   As a convention, ABNF rule names prefixed with "obs-" denote
   "obsolete" grammar rules that appear for historical reasons.

2. Architecture

   HTTP was created for the World Wide Web architecture and has evolved
   over time to support the scalability needs of a worldwide hypertext
   system. Much of that architecture is reflected in the terminology
   and syntax productions used to define HTTP.

2.1. Client/Server Messaging

   HTTP is a stateless request/response protocol that operates by
   exchanging messages (Section 3) across a reliable transport or
   session-layer "connection" (Section 6). An HTTP "client" is a
   program that establishes a connection to a server for the purpose of
   sending one or more HTTP requests. An HTTP "server" is a program
   that accepts connections in order to service HTTP requests by sending
   HTTP responses.

   The terms client and server refer only to the roles that these
   programs perform for a particular connection. The same program might
   act as a client on some connections and a server on others. We use
   the term "user agent" to refer to any of the various client programs
   that initiate a request, including (but not limited to) browsers,
   spiders (web-based robots), command-line tools, native applications,
   and mobile apps. The term "origin server" is used to refer to the
   program that can originate authoritative responses to a request. For
   general requirements, we use the terms "sender" and "recipient" to
   refer to any component that sends or receives, respectively, a given
   message.

   HTTP relies upon the Uniform Resource Identifier (URI) standard
   [RFC3986] to indicate the target resource (Section 5.1) and
   relationships between resources. Messages are passed in a format
   similar to that used by Internet mail [RFC5322] and the Multipurpose
   Internet Mail Extensions (MIME) [RFC2045] (see Appendix A of [Part2]
   for the differences between HTTP and MIME messages).

   Most HTTP communication consists of a retrieval request (GET) for a
   representation of some resource identified by a URI. In the simplest
   case, this might be accomplished via a single bidirectional
   connection (===) between the user agent (UA) and the origin server
   (O).

            request   >
       UA ======================================= O
                                   <   response

   A client sends an HTTP request to a server in the form of a request
   message, beginning with a request-line that includes a method, URI,
   and protocol version (Section 3.1.1), followed by header fields
   containing request modifiers, client information, and representation
   metadata (Section 3.2), an empty line to indicate the end of the
   header section, and finally a message body containing the payload
   body (if any, Section 3.3).

   A server responds to a client's request by sending one or more HTTP
   response messages, each beginning with a status line that includes
   the protocol version, a success or error code, and textual reason
   phrase (Section 3.1.2), possibly followed by header fields containing
   server information, resource metadata, and representation metadata
   (Section 3.2), an empty line to indicate the end of the header
   section, and finally a message body containing the payload body (if
   any, Section 3.3).

   A connection might be used for multiple request/response exchanges,
   as defined in Section 6.3.

   The following example illustrates a typical message exchange for a
   GET request on the URI "http://www.example.com/hello.txt":

   Client request:

     GET /hello.txt HTTP/1.1
     User-Agent: curl/7.16.3 libcurl/7.16.3 OpenSSL/0.9.7l zlib/1.2.3
     Host: www.example.com
     Accept-Language: en, mi

   Server response:

     HTTP/1.1 200 OK
     Date: Mon, 27 Jul 2009 12:28:53 GMT
     Server: Apache
     Last-Modified: Wed, 22 Jul 2009 19:15:56 GMT
     ETag: "34aa387-d-1568eb00"
     Accept-Ranges: bytes
     Content-Length: 51
     Vary: Accept-Encoding
     Content-Type: text/plain

     Hello World! My payload includes a trailing CRLF.

2.2. Implementation Diversity

   When considering the design of HTTP, it is easy to fall into a trap
   of thinking that all user agents are general-purpose browsers and all
   origin servers are large public websites. That is not the case in
   practice. Common HTTP user agents include household appliances,
   stereos, scales, firmware update scripts, command-line programs,
   mobile apps, and communication devices in a multitude of shapes and
   sizes. Likewise, common HTTP origin servers include home automation
   units, configurable networking components, office machines,
   autonomous robots, news feeds, traffic cameras, ad selectors, and
   video delivery platforms.

   The term "user agent" does not imply that there is a human user
   directly interacting with the software agent at the time of a
   request. In many cases, a user agent is installed or configured to
   run in the background and save its results for later inspection (or
   save only a subset of those results that might be interesting or
   erroneous). Spiders, for example, are typically given a start URI
   and configured to follow certain behavior while crawling the Web as a
   hypertext graph.

   The implementation diversity of HTTP means that we cannot assume the
   user agent can make interactive suggestions to a user or provide
   adequate warning for security or privacy options. In the few cases
   where this specification requires reporting of errors to the user, it
   is acceptable for such reporting to only be observable in an error
   console or log file. Likewise, requirements that an automated action
   be confirmed by the user before proceeding might be met via advance
   configuration choices, run-time options, or simple avoidance of the
   unsafe action; confirmation does not imply any specific user
   interface or interruption of normal processing if the user has
   already made that choice.

2.3. Intermediaries

   HTTP enables the use of intermediaries to satisfy requests through a
   chain of connections. There are three common forms of HTTP
   intermediary: proxy, gateway, and tunnel. In some cases, a single
   intermediary might act as an origin server, proxy, gateway, or
   tunnel, switching behavior based on the nature of each request.

            >             >             >             >
       UA =========== A =========== B =========== C =========== O
                  <             <             <             <

   The figure above shows three intermediaries (A, B, and C) between the
   user agent and origin server. A request or response message that
   travels the whole chain will pass through four separate connections.
   Some HTTP communication options might apply only to the connection
   with the nearest, non-tunnel neighbor, only to the end-points of the
   chain, or to all connections along the chain. Although the diagram
   is linear, each participant might be engaged in multiple,
   simultaneous communications. For example, B might be receiving
   requests from many clients other than A, and/or forwarding requests
   to servers other than C, at the same time that it is handling A's
   request. Likewise, later requests might be sent through a different
   path of connections, often based on dynamic configuration for load
   balancing.

   We use the terms "upstream" and "downstream" to describe various
   requirements in relation to the directional flow of a message: all
   messages flow from upstream to downstream. Likewise, we use the
   terms inbound and outbound to refer to directions in relation to the
   request path: "inbound" means toward the origin server and "outbound"
   means toward the user agent.

   A "proxy" is a message forwarding agent that is selected by the
   client, usually via local configuration rules, to receive requests
   for some type(s) of absolute URI and attempt to satisfy those
   requests via translation through the HTTP interface.
   Some translations are minimal, such as for proxy requests for "http"
   URIs, whereas other requests might require translation to and from
   entirely different application-level protocols. Proxies are often
   used to group an organization's HTTP requests through a common
   intermediary for the sake of security, annotation services, or shared
   caching.

   An HTTP-to-HTTP proxy is called a "transforming proxy" if it is
   designed or configured to modify request or response messages in a
   semantically meaningful way (i.e., modifications, beyond those
   required by normal HTTP processing, that change the message in a way
   that would be significant to the original sender or potentially
   significant to downstream recipients). For example, a transforming
   proxy might be acting as a shared annotation server (modifying
   responses to include references to a local annotation database), a
   malware filter, a format transcoder, or an intranet-to-Internet
   privacy filter. Such transformations are presumed to be desired by
   the client (or client organization) that selected the proxy and are
   beyond the scope of this specification. However, when a proxy is not
   intended to transform a given message, we use the term
   "non-transforming proxy" to target requirements that preserve HTTP
   message semantics. See Section 6.3.4 of [Part2] and Section 5.5 of
   [Part6] for status and warning codes related to transformations.

   A "gateway" (a.k.a., "reverse proxy") is an intermediary that acts as
   an origin server for the outbound connection, but translates received
   requests and forwards them inbound to another server or servers.
   Gateways are often used to encapsulate legacy or untrusted
   information services, to improve server performance through
   "accelerator" caching, and to enable partitioning or load balancing
   of HTTP services across multiple machines.

   All HTTP requirements applicable to an origin server also apply to
   the outbound communication of a gateway. A gateway communicates with
   inbound servers using any protocol that it desires, including private
   extensions to HTTP that are outside the scope of this specification.
   However, an HTTP-to-HTTP gateway that wishes to interoperate with
   third-party HTTP servers ought to conform to user agent requirements
   on the gateway's inbound connection.

   A "tunnel" acts as a blind relay between two connections without
   changing the messages. Once active, a tunnel is not considered a
   party to the HTTP communication, though the tunnel might have been
   initiated by an HTTP request. A tunnel ceases to exist when both
   ends of the relayed connection are closed. Tunnels are used to
   extend a virtual connection through an intermediary, such as when
   Transport Layer Security (TLS, [RFC5246]) is used to establish
   confidential communication through a shared firewall proxy.

   The above categories for intermediary only consider those acting as
   participants in the HTTP communication. There are also
   intermediaries that can act on lower layers of the network protocol
   stack, filtering or redirecting HTTP traffic without the knowledge or
   permission of message senders. Network intermediaries often
   introduce security flaws or interoperability problems by violating
   HTTP semantics. For example, an "interception proxy" [RFC3040] (also
   commonly known as a "transparent proxy" [RFC1919] or "captive
   portal") differs from an HTTP proxy because it is not selected by the
   client. Instead, an interception proxy filters or redirects outgoing
   TCP port 80 packets (and occasionally other common port traffic).
   Interception proxies are commonly found on public network access
   points, as a means of enforcing account subscription prior to
   allowing use of non-local Internet services, and within corporate
   firewalls to enforce network usage policies. They are
   indistinguishable from a man-in-the-middle attack.

   HTTP is defined as a stateless protocol, meaning that each request
   message can be understood in isolation. Many implementations depend
   on HTTP's stateless design in order to reuse proxied connections or
   dynamically load-balance requests across multiple servers. Hence, a
   server MUST NOT assume that two requests on the same connection are
   from the same user agent unless the connection is secured and
   specific to that agent. Some non-standard HTTP extensions (e.g.,
   [RFC4559]) have been known to violate this requirement, resulting in
   security and interoperability problems.

2.4. Caches

   A "cache" is a local store of previous response messages and the
   subsystem that controls its message storage, retrieval, and deletion.
   A cache stores cacheable responses in order to reduce the response
   time and network bandwidth consumption on future, equivalent
   requests. Any client or server MAY employ a cache, though a cache
   cannot be used by a server while it is acting as a tunnel.

   The effect of a cache is that the request/response chain is shortened
   if one of the participants along the chain has a cached response
   applicable to that request. The following illustrates the resulting
   chain if B has a cached copy of an earlier response from O (via C)
   for a request that has not been cached by UA or A.

               >             >
          UA =========== A =========== B - - - - - - C - - - - - - O
                     <             <

   A response is "cacheable" if a cache is allowed to store a copy of
   the response message for use in answering subsequent requests.
Even 527 when a response is cacheable, there might be additional constraints 528 placed by the client or by the origin server on when that cached 529 response can be used for a particular request. HTTP requirements for 530 cache behavior and cacheable responses are defined in Section 2 of 531 [Part6]. 533 There is a wide variety of architectures and configurations of 534 caches deployed across the World Wide Web and inside large 535 organizations. These include national hierarchies of proxy caches to 536 save transoceanic bandwidth, collaborative systems that broadcast or 537 multicast cache entries, archives of pre-fetched cache entries for 538 use in off-line or high-latency environments, and so on. 540 2.5. Conformance and Error Handling 542 This specification targets conformance criteria according to the role 543 of a participant in HTTP communication. Hence, HTTP requirements are 544 placed on senders, recipients, clients, servers, user agents, 545 intermediaries, origin servers, proxies, gateways, or caches, 546 depending on what behavior is being constrained by the requirement. 547 Additional (social) requirements are placed on implementations, 548 resource owners, and protocol element registrations when they apply 549 beyond the scope of a single communication. 551 The verb "generate" is used instead of "send" where a requirement 552 differentiates between creating a protocol element and merely 553 forwarding a received element downstream. 555 An implementation is considered conformant if it complies with all of 556 the requirements associated with the roles it partakes in HTTP. 558 Conformance includes both the syntax and semantics of protocol 559 elements. A sender MUST NOT generate protocol elements that convey a 560 meaning that is known by that sender to be false. A sender MUST NOT 561 generate protocol elements that do not match the grammar defined by 562 the corresponding ABNF rules.
Within a given message, a sender MUST 563 NOT generate protocol elements or syntax alternatives that are only 564 allowed to be generated by participants in other roles (i.e., a role 565 that the sender does not have for that message). 567 When a received protocol element is parsed, the recipient MUST be 568 able to parse any value of reasonable length that is applicable to 569 the recipient's role and matches the grammar defined by the 570 corresponding ABNF rules. Note, however, that some received protocol 571 elements might not be parsed. For example, an intermediary 572 forwarding a message might parse a header-field into generic field- 573 name and field-value components, but then forward the header field 574 without further parsing inside the field-value. 576 HTTP does not have specific length limitations for many of its 577 protocol elements because the lengths that might be appropriate will 578 vary widely, depending on the deployment context and purpose of the 579 implementation. Hence, interoperability between senders and 580 recipients depends on shared expectations regarding what is a 581 reasonable length for each protocol element. Furthermore, what is 582 commonly understood to be a reasonable length for some protocol 583 elements has changed over the course of the past two decades of HTTP 584 use, and is expected to continue changing in the future. 586 At a minimum, a recipient MUST be able to parse and process protocol 587 element lengths that are at least as long as the values that it 588 generates for those same protocol elements in other messages. For 589 example, an origin server that publishes very long URI references to 590 its own resources needs to be able to parse and process those same 591 references when received as a request target. 
593 A recipient MUST interpret a received protocol element according to 594 the semantics defined for it by this specification, including 595 extensions to this specification, unless the recipient has determined 596 (through experience or configuration) that the sender incorrectly 597 implements what is implied by those semantics. For example, an 598 origin server might disregard the contents of a received Accept- 599 Encoding header field if inspection of the User-Agent header field 600 indicates a specific implementation version that is known to fail on 601 receipt of certain content codings. 603 Unless noted otherwise, a recipient MAY attempt to recover a usable 604 protocol element from an invalid construct. HTTP does not define 605 specific error handling mechanisms except when they have a direct 606 impact on security, since different applications of the protocol 607 require different error handling strategies. For example, a Web 608 browser might wish to transparently recover from a response where the 609 Location header field doesn't parse according to the ABNF, whereas a 610 systems control client might consider any form of error recovery to 611 be dangerous. 613 2.6. Protocol Versioning 615 HTTP uses a "<major>.<minor>" numbering scheme to indicate versions 616 of the protocol. This specification defines version "1.1". The 617 protocol version as a whole indicates the sender's conformance with 618 the set of requirements laid out in that version's corresponding 619 specification of HTTP. 621 The version of an HTTP message is indicated by an HTTP-version field 622 in the first line of the message. HTTP-version is case-sensitive. 624 HTTP-version = HTTP-name "/" DIGIT "." DIGIT 625 HTTP-name = %x48.54.54.50 ; "HTTP", case-sensitive 627 The HTTP version number consists of two decimal digits separated by a 628 "." (period or decimal point).
The first digit ("major version") 629 indicates the HTTP messaging syntax, whereas the second digit ("minor 630 version") indicates the highest minor version within that major 631 version to which the sender is conformant and able to understand for 632 future communication. The minor version advertises the sender's 633 communication capabilities even when the sender is only using a 634 backwards-compatible subset of the protocol, thereby letting the 635 recipient know that more advanced features can be used in response 636 (by servers) or in future requests (by clients). 638 When an HTTP/1.1 message is sent to an HTTP/1.0 recipient [RFC1945] 639 or a recipient whose version is unknown, the HTTP/1.1 message is 640 constructed such that it can be interpreted as a valid HTTP/1.0 641 message if all of the newer features are ignored. This specification 642 places recipient-version requirements on some new features so that a 643 conformant sender will only use compatible features until it has 644 determined, through configuration or the receipt of a message, that 645 the recipient supports HTTP/1.1. 647 The interpretation of a header field does not change between minor 648 versions of the same major HTTP version, though the default behavior 649 of a recipient in the absence of such a field can change. Unless 650 specified otherwise, header fields defined in HTTP/1.1 are defined 651 for all versions of HTTP/1.x. In particular, the Host and Connection 652 header fields ought to be implemented by all HTTP/1.x implementations 653 whether or not they advertise conformance with HTTP/1.1. 655 New header fields can be introduced without changing the protocol 656 version if their defined semantics allow them to be safely ignored by 657 recipients that do not recognize them. Header field extensibility is 658 discussed in Section 3.2.1. 
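As an illustration of the HTTP-version grammar and the major/minor digits described in this section, the following is a minimal sketch in Python, not a normative algorithm. The function name parse_http_version is hypothetical. The sketch assumes the HTTP-version token has already been extracted from the first line of the message, and it operates on octets rather than decoded text.

```python
import re

# HTTP-version = HTTP-name "/" DIGIT "." DIGIT
# HTTP-name is case-sensitive, so the pattern is matched without re.IGNORECASE.
_HTTP_VERSION = re.compile(rb"HTTP/([0-9])\.([0-9])")


def parse_http_version(field: bytes):
    """Return (major, minor) for a syntactically valid HTTP-version.

    Hypothetical helper: raises ValueError when the octets do not match
    the grammar (wrong case, extra digits, trailing octets, and so on).
    """
    match = _HTTP_VERSION.fullmatch(field)
    if match is None:
        raise ValueError(f"not a valid HTTP-version: {field!r}")
    return int(match.group(1)), int(match.group(2))
```

Because each of the major and minor versions is a single DIGIT, a value such as "HTTP/1.10" is rejected by this sketch rather than read as minor version ten.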
660 Intermediaries that process HTTP messages (i.e., all intermediaries 661 other than those acting as tunnels) MUST send their own HTTP-version 662 in forwarded messages. In other words, they are not allowed to 663 blindly forward the first line of an HTTP message without ensuring 664 that the protocol version in that message matches a version to which 665 that intermediary is conformant for both the receiving and sending of 666 messages. Forwarding an HTTP message without rewriting the HTTP- 667 version might result in communication errors when downstream 668 recipients use the message sender's version to determine what 669 features are safe to use for later communication with that sender. 671 A client SHOULD send a request version equal to the highest version 672 to which the client is conformant and whose major version is no 673 higher than the highest version supported by the server, if this is 674 known. A client MUST NOT send a version to which it is not 675 conformant. 677 A client MAY send a lower request version if it is known that the 678 server incorrectly implements the HTTP specification, but only after 679 the client has attempted at least one normal request and determined 680 from the response status code or header fields (e.g., Server) that 681 the server improperly handles higher request versions. 683 A server SHOULD send a response version equal to the highest version 684 to which the server is conformant that has a major version less than 685 or equal to the one received in the request. A server MUST NOT send 686 a version to which it is not conformant. A server can send a 505 687 (HTTP Version Not Supported) response if it wishes, for any reason, 688 to refuse service of the client's major protocol version. 
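The server-side rule above (send the highest conformant version whose major version does not exceed the one received) can be sketched as follows. This is an illustrative sketch only: select_response_version and its arguments are hypothetical names, versions are modeled as (major, minor) tuples, and treating an empty candidate set as the case for a 505 (HTTP Version Not Supported) response is an assumption of the sketch rather than a requirement of this specification.

```python
from typing import Iterable, Optional, Tuple

Version = Tuple[int, int]  # (major, minor)


def select_response_version(
    request_version: Version, conformant: Iterable[Version]
) -> Optional[Version]:
    """Pick the version a server would send in its response.

    conformant lists the versions the server conforms to (assumed input).
    Returns the highest such version whose major number is less than or
    equal to the request's major number, or None when no version
    qualifies; in this sketch the caller would then answer with 505.
    """
    candidates = [v for v in conformant if v[0] <= request_version[0]]
    # Tuples compare lexicographically, so max() orders by major, then minor.
    return max(candidates) if candidates else None
```

Note that a server conformant with HTTP/1.0 and HTTP/1.1 answers an HTTP/1.0 request with an HTTP/1.1 response under this rule, since only the major version of the request bounds the choice.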
690 A server MAY send an HTTP/1.0 response to a request if it is known or 691 suspected that the client incorrectly implements the HTTP 692 specification and is incapable of correctly processing later version 693 responses, such as when a client fails to parse the version number 694 correctly or when an intermediary is known to blindly forward the 695 HTTP-version even when it doesn't conform to the given minor version 696 of the protocol. Such protocol downgrades SHOULD NOT be performed 697 unless triggered by specific client attributes, such as when one or 698 more of the request header fields (e.g., User-Agent) uniquely match 699 the values sent by a client known to be in error. 701 The intention of HTTP's versioning design is that the major number 702 will only be incremented if an incompatible message syntax is 703 introduced, and that the minor number will only be incremented when 704 changes made to the protocol have the effect of adding to the message 705 semantics or implying additional capabilities of the sender. 706 However, the minor version was not incremented for the changes 707 introduced between [RFC2068] and [RFC2616], and this revision has 708 specifically avoided any such changes to the protocol. 710 When an HTTP message is received with a major version number that the 711 recipient implements, but a higher minor version number than what the 712 recipient implements, the recipient SHOULD process the message as if 713 it were in the highest minor version within that major version to 714 which the recipient is conformant. A recipient can assume that a 715 message with a higher minor version, when sent to a recipient that 716 has not yet indicated support for that higher version, is 717 sufficiently backwards-compatible to be safely processed by any 718 implementation of the same major version. 720 2.7. 
Uniform Resource Identifiers 722 Uniform Resource Identifiers (URIs) [RFC3986] are used throughout 723 HTTP as the means for identifying resources (Section 2 of [Part2]). 724 URI references are used to target requests, indicate redirects, and 725 define relationships. 727 This specification adopts the definitions of "URI-reference", 728 "absolute-URI", "relative-part", "authority", "port", "host", "path- 729 abempty", "segment", "query", and "fragment" from the URI generic 730 syntax. In addition, we define an "absolute-path" rule (that differs 731 from RFC 3986's "path-absolute" in that it allows a leading "//") and 732 a "partial-URI" rule for protocol elements that allow a relative URI 733 but not a fragment. 735 URI-reference = <URI-reference, see [RFC3986], Section 4.1> 736 absolute-URI = <absolute-URI, see [RFC3986], Section 4.3> 737 relative-part = <relative-part, see [RFC3986], Section 4.2> 738 authority = <authority, see [RFC3986], Section 3.2> 739 uri-host = <host, see [RFC3986], Section 3.2.2> 740 port = <port, see [RFC3986], Section 3.2.3> 741 path-abempty = <path-abempty, see [RFC3986], Section 3.3> 742 segment = <segment, see [RFC3986], Section 3.3> 743 query = <query, see [RFC3986], Section 3.4> 744 fragment = <fragment, see [RFC3986], Section 3.5> 746 absolute-path = 1*( "/" segment ) 747 partial-URI = relative-part [ "?" query ] 749 Each protocol element in HTTP that allows a URI reference will 750 indicate in its ABNF production whether the element allows any form 751 of reference (URI-reference), only a URI in absolute form (absolute- 752 URI), only the path and optional query components, or some 753 combination of the above. Unless otherwise indicated, URI references 754 are parsed relative to the effective request URI (Section 5.5). 756 2.7.1. http URI scheme 758 The "http" URI scheme is hereby defined for the purpose of minting 759 identifiers according to their association with the hierarchical 760 namespace governed by a potential HTTP origin server listening for 761 TCP ([RFC0793]) connections on a given port. 763 http-URI = "http:" "//" authority path-abempty [ "?" query ] 764 [ "#" fragment ] 766 The HTTP origin server is identified by the generic syntax's 767 authority component, which includes a host identifier and optional 768 TCP port ([RFC3986], Section 3.2.2).
The remainder of the URI, 769 consisting of both the hierarchical path component and optional query 770 component, serves as an identifier for a potential resource within 771 that origin server's name space. 773 A sender MUST NOT generate an "http" URI with an empty host 774 identifier. A recipient that processes such a URI reference MUST 775 reject it as invalid. 777 If the host identifier is provided as an IP address, then the origin 778 server is any listener on the indicated TCP port at that IP address. 779 If host is a registered name, then that name is considered an 780 indirect identifier and the recipient might use a name resolution 781 service, such as DNS, to find the address of a listener for that 782 host. If the port subcomponent is empty or not given, then TCP port 783 80 is assumed (the default reserved port for WWW services). 785 Regardless of the form of host identifier, access to that host is not 786 implied by the mere presence of its name or address. The host might 787 or might not exist and, even when it does exist, might or might not 788 be running an HTTP server or listening to the indicated port. The 789 "http" URI scheme makes use of the delegated nature of Internet names 790 and addresses to establish a naming authority (whatever entity has 791 the ability to place an HTTP server at that Internet name or address) 792 and allows that authority to determine which names are valid and how 793 they might be used. 795 When an "http" URI is used within a context that calls for access to 796 the indicated resource, a client MAY attempt access by resolving the 797 host to an IP address, establishing a TCP connection to that address 798 on the indicated port, and sending an HTTP request message 799 (Section 3) containing the URI's identifying data (Section 5) to the 800 server. 
If the server responds to that request with a non-interim 801 HTTP response message, as described in Section 6 of [Part2], then 802 that response is considered an authoritative answer to the client's 803 request. 805 Although HTTP is independent of the transport protocol, the "http" 806 scheme is specific to TCP-based services because the name delegation 807 process depends on TCP for establishing authority. An HTTP service 808 based on some other underlying connection protocol would presumably 809 be identified using a different URI scheme, just as the "https" 810 scheme (below) is used for resources that require an end-to-end 811 secured connection. Other protocols might also be used to provide 812 access to "http" identified resources -- it is only the authoritative 813 interface that is specific to TCP. 815 The URI generic syntax for authority also includes a deprecated 816 userinfo subcomponent ([RFC3986], Section 3.2.1) for including user 817 authentication information in the URI. Some implementations make use 818 of the userinfo component for internal configuration of 819 authentication information, such as within command invocation 820 options, configuration files, or bookmark lists, even though such 821 usage might expose a user identifier or password. A sender MUST NOT 822 generate the userinfo subcomponent (and its "@" delimiter) when an 823 "http" URI reference is generated within a message as a request 824 target or header field value. Before making use of an "http" URI 825 reference received from an untrusted source, a recipient ought to 826 parse for userinfo and treat its presence as an error; it is likely 827 being used to obscure the authority for the sake of phishing attacks. 829 2.7.2. 
https URI scheme 831 The "https" URI scheme is hereby defined for the purpose of minting 832 identifiers according to their association with the hierarchical 833 namespace governed by a potential HTTP origin server listening to a 834 given TCP port for TLS-secured connections ([RFC0793], [RFC5246]). 836 All of the requirements listed above for the "http" scheme are also 837 requirements for the "https" scheme, except that a default TCP port 838 of 443 is assumed if the port subcomponent is empty or not given, and 839 the user agent MUST ensure that its connection to the origin server 840 is secured through the use of strong encryption, end-to-end, prior to 841 sending the first HTTP request. 843 https-URI = "https:" "//" authority path-abempty [ "?" query ] 844 [ "#" fragment ] 846 Note that the "https" URI scheme depends on both TLS and TCP for 847 establishing authority. Resources made available via the "https" 848 scheme have no shared identity with the "http" scheme even if their 849 resource identifiers indicate the same authority (the same host 850 listening to the same TCP port). They are distinct name spaces and 851 are considered to be distinct origin servers. However, an extension 852 to HTTP that is defined to apply to entire host domains, such as the 853 Cookie protocol [RFC6265], can allow information set by one service 854 to impact communication with other services within a matching group 855 of host domains. 857 The process for authoritative access to an "https" identified 858 resource is defined in [RFC2818]. 860 2.7.3. http and https URI Normalization and Comparison 862 Since the "http" and "https" schemes conform to the URI generic 863 syntax, such URIs are normalized and compared according to the 864 algorithm defined in [RFC3986], Section 6, using the defaults 865 described above for each scheme. 867 If the port is equal to the default port for a scheme, the normal 868 form is to omit the port subcomponent. 
When not being used in 869 absolute form as the request target of an OPTIONS request, an empty 870 path component is equivalent to an absolute path of "/", so the 871 normal form is to provide a path of "/" instead. The scheme and host 872 are case-insensitive and normally provided in lowercase; all other 873 components are compared in a case-sensitive manner. Characters other 874 than those in the "reserved" set are equivalent to their percent- 875 encoded octets (see [RFC3986], Section 2.1): the normal form is to 876 not encode them. 878 For example, the following three URIs are equivalent: 880 http://example.com:80/~smith/home.html 881 http://EXAMPLE.com/%7Esmith/home.html 882 http://EXAMPLE.com:/%7esmith/home.html 884 3. Message Format 886 All HTTP/1.1 messages consist of a start-line followed by a sequence 887 of octets in a format similar to the Internet Message Format 888 [RFC5322]: zero or more header fields (collectively referred to as 889 the "headers" or the "header section"), an empty line indicating the 890 end of the header section, and an optional message body. 892 HTTP-message = start-line 893 *( header-field CRLF ) 894 CRLF 895 [ message-body ] 897 The normal procedure for parsing an HTTP message is to read the 898 start-line into a structure, read each header field into a hash table 899 by field name until the empty line, and then use the parsed data to 900 determine if a message body is expected. If a message body has been 901 indicated, then it is read as a stream until an amount of octets 902 equal to the message body length is read or the connection is closed. 904 A recipient MUST parse an HTTP message as a sequence of octets in an 905 encoding that is a superset of US-ASCII [USASCII]. 
Parsing an HTTP 906 message as a stream of Unicode characters, without regard for the 907 specific encoding, creates security vulnerabilities due to the 908 varying ways that string processing libraries handle invalid 909 multibyte character sequences that contain the octet LF (%x0A). 910 String-based parsers can only be safely used within protocol elements 911 after the element has been extracted from the message, such as within 912 a header field-value after message parsing has delineated the 913 individual fields. 915 An HTTP message can be parsed as a stream for incremental processing 916 or forwarding downstream. However, recipients cannot rely on 917 incremental delivery of partial messages, since some implementations 918 will buffer or delay message forwarding for the sake of network 919 efficiency, security checks, or payload transformations. 921 A sender MUST NOT send whitespace between the start-line and the 922 first header field. A recipient that receives whitespace between the 923 start-line and the first header field MUST either reject the message 924 as invalid or consume each whitespace-preceded line without further 925 processing of it (i.e., ignore the entire line, along with any 926 subsequent lines preceded by whitespace, until a properly formed 927 header field is received or the header section is terminated). 929 The presence of such whitespace in a request might be an attempt to 930 trick a server into ignoring that field or processing the line after 931 it as a new request, either of which might result in a security 932 vulnerability if other implementations within the request chain 933 interpret the same message differently. Likewise, the presence of 934 such whitespace in a response might be ignored by some clients or 935 cause others to cease parsing. 937 3.1. Start Line 939 An HTTP message can either be a request from client to server or a 940 response from server to client. 
Syntactically, the two types of 941 message differ only in the start-line, which is either a request-line 942 (for requests) or a status-line (for responses), and in the algorithm 943 for determining the length of the message body (Section 3.3). 945 In theory, a client could receive requests and a server could receive 946 responses, distinguishing them by their different start-line formats, 947 but in practice servers are implemented to only expect a request (a 948 response is interpreted as an unknown or invalid request method) and 949 clients are implemented to only expect a response. 951 start-line = request-line / status-line 953 3.1.1. Request Line 955 A request-line begins with a method token, followed by a single space 956 (SP), the request-target, another single space (SP), the protocol 957 version, and ending with CRLF. 959 request-line = method SP request-target SP HTTP-version CRLF 961 The method token indicates the request method to be performed on the 962 target resource. The request method is case-sensitive. 964 method = token 966 The request methods defined by this specification can be found in 967 Section 4 of [Part2], along with information regarding the HTTP 968 method registry and considerations for defining new methods. 970 The request-target identifies the target resource upon which to apply 971 the request, as defined in Section 5.3. 973 Recipients typically parse the request-line into its component parts 974 by splitting on whitespace (see Section 3.5), since no whitespace is 975 allowed in the three components. Unfortunately, some user agents 976 fail to properly encode or exclude whitespace found in hypertext 977 references, resulting in those disallowed characters being sent in a 978 request-target. 980 Recipients of an invalid request-line SHOULD respond with either a 981 400 (Bad Request) error or a 301 (Moved Permanently) redirect with 982 the request-target properly encoded. 
A recipient SHOULD NOT attempt 983 to autocorrect and then process the request without a redirect, since 984 the invalid request-line might be deliberately crafted to bypass 985 security filters along the request chain. 987 HTTP does not place a pre-defined limit on the length of a request- 988 line. A server that receives a method longer than any that it 989 implements SHOULD respond with a 501 (Not Implemented) status code. 990 A server ought to be prepared to receive URIs of unbounded length, as 991 described in Section 2.5, and MUST respond with a 414 (URI Too Long) 992 status code if the received request-target is longer than the server 993 wishes to parse (see Section 6.5.12 of [Part2]). 995 Various ad-hoc limitations on request-line length are found in 996 practice. It is RECOMMENDED that all HTTP senders and recipients 997 support, at a minimum, request-line lengths of 8000 octets. 999 3.1.2. Status Line 1001 The first line of a response message is the status-line, consisting 1002 of the protocol version, a space (SP), the status code, another 1003 space, a possibly-empty textual phrase describing the status code, 1004 and ending with CRLF. 1006 status-line = HTTP-version SP status-code SP reason-phrase CRLF 1008 The status-code element is a 3-digit integer code describing the 1009 result of the server's attempt to understand and satisfy the client's 1010 corresponding request. The rest of the response message is to be 1011 interpreted in light of the semantics defined for that status code. 1012 See Section 6 of [Part2] for information about the semantics of 1013 status codes, including the classes of status code (indicated by the 1014 first digit), the status codes defined by this specification, 1015 considerations for the definition of new status codes, and the IANA 1016 registry. 
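As an illustration of the status-line grammar given above (a three-digit status-code and a possibly-empty reason-phrase), the following Python sketch splits a status-line into its three parts. It is a sketch under stated assumptions, not a complete parser: parse_status_line is a hypothetical name, the input is assumed to be the complete first line including its CRLF, and neither the HTTP-version nor the octets of the reason-phrase are validated here.

```python
def parse_status_line(line: bytes):
    """Split a status-line into (HTTP-version, status code, reason-phrase).

    Hypothetical helper. The reason-phrase is returned as opaque octets,
    since a client SHOULD ignore its content.
    """
    if not line.endswith(b"\r\n"):
        raise ValueError("status-line does not end with CRLF")
    # Split on the first two SP only; the reason-phrase may itself contain SP.
    parts = line[:-2].split(b" ", 2)
    if len(parts) != 3:
        raise ValueError("status-line needs two SP separators")
    version, code, reason = parts
    # status-code is exactly three ASCII digits.
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"invalid status-code: {code!r}")
    return version, int(code), reason
```

The sketch is strict about the second SP: a line such as "HTTP/1.1 200" followed directly by CRLF does not match the grammar and is rejected, whereas "HTTP/1.1 200 " (empty reason-phrase) is accepted.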
1018 status-code = 3DIGIT 1020 The reason-phrase element exists for the sole purpose of providing a 1021 textual description associated with the numeric status code, mostly 1022 out of deference to earlier Internet application protocols that were 1023 more frequently used with interactive text clients. A client SHOULD 1024 ignore the reason-phrase content. 1026 reason-phrase = *( HTAB / SP / VCHAR / obs-text ) 1028 3.2. Header Fields 1030 Each HTTP header field consists of a case-insensitive field name 1031 followed by a colon (":"), optional leading whitespace, the field 1032 value, and optional trailing whitespace. 1034 header-field = field-name ":" OWS field-value OWS 1035 field-name = token 1036 field-value = *( field-content / obs-fold ) 1037 field-content = *( HTAB / SP / VCHAR / obs-text ) 1038 obs-fold = CRLF ( SP / HTAB ) 1039 ; obsolete line folding 1040 ; see Section 3.2.4 1042 The field-name token labels the corresponding field-value as having 1043 the semantics defined by that header field. For example, the Date 1044 header field is defined in Section 7.1.1.2 of [Part2] as containing 1045 the origination timestamp for the message in which it appears. 1047 3.2.1. Field Extensibility 1049 Header fields are fully extensible: there is no limit on the 1050 introduction of new field names, each presumably defining new 1051 semantics, nor on the number of header fields used in a given 1052 message. Existing fields are defined in each part of this 1053 specification and in many other specifications outside the core 1054 standard. 1056 New header fields can be defined such that, when they are understood 1057 by a recipient, they might override or enhance the interpretation of 1058 previously defined header fields, define preconditions on request 1059 evaluation, or refine the meaning of responses. 
1061 A proxy MUST forward unrecognized header fields unless the field-name 1062 is listed in the Connection header field (Section 6.1) or the proxy 1063 is specifically configured to block, or otherwise transform, such 1064 fields. Other recipients SHOULD ignore unrecognized header fields. 1065 These requirements allow HTTP's functionality to be enhanced without 1066 requiring prior update of deployed intermediaries. 1068 All defined header fields ought to be registered with IANA in the 1069 Message Header Field Registry, as described in Section 8.3 of 1070 [Part2]. 1072 3.2.2. Field Order 1074 The order in which header fields with differing field names are 1075 received is not significant. However, it is "good practice" to send 1076 header fields that contain control data first, such as Host on 1077 requests and Date on responses, so that implementations can decide 1078 when not to handle a message as early as possible. A server MUST 1079 wait until the entire header section is received before interpreting 1080 a request message, since later header fields might include 1081 conditionals, authentication credentials, or deliberately misleading 1082 duplicate header fields that would impact request processing. 1084 A sender MUST NOT generate multiple header fields with the same field 1085 name in a message unless either the entire field value for that 1086 header field is defined as a comma-separated list [i.e., #(values)] 1087 or the header field is a well-known exception (as noted below). 1089 A recipient MAY combine multiple header fields with the same field 1090 name into one "field-name: field-value" pair, without changing the 1091 semantics of the message, by appending each subsequent field value to 1092 the combined field value in order, separated by a comma. 
The order 1093 in which header fields with the same field name are received is 1094 therefore significant to the interpretation of the combined field 1095 value; a proxy MUST NOT change the order of these field values when 1096 forwarding a message. 1098 Note: In practice, the "Set-Cookie" header field ([RFC6265]) often 1099 appears multiple times in a response message and does not use the 1100 list syntax, violating the above requirements on multiple header 1101 fields with the same name. Since it cannot be combined into a 1102 single field-value, recipients ought to handle "Set-Cookie" as a 1103 special case while processing header fields. (See Appendix A.2.3 1104 of [Kri2001] for details.) 1106 3.2.3. Whitespace 1108 This specification uses three rules to denote the use of linear 1109 whitespace: OWS (optional whitespace), RWS (required whitespace), and 1110 BWS ("bad" whitespace). 1112 The OWS rule is used where zero or more linear whitespace octets 1113 might appear. For protocol elements where optional whitespace is 1114 preferred to improve readability, a sender SHOULD generate the 1115 optional whitespace as a single SP; otherwise, a sender SHOULD NOT 1116 generate optional whitespace except as needed to white-out invalid or 1117 unwanted protocol elements during in-place message filtering. 1119 The RWS rule is used when at least one linear whitespace octet is 1120 required to separate field tokens. A sender SHOULD generate RWS as a 1121 single SP. 1123 The BWS rule is used where the grammar allows optional whitespace 1124 only for historical reasons. A sender MUST NOT generate BWS in 1125 messages. A recipient MUST parse for such bad whitespace and remove 1126 it before interpreting the protocol element. 1128 OWS = *( SP / HTAB ) 1129 ; optional whitespace 1130 RWS = 1*( SP / HTAB ) 1131 ; required whitespace 1132 BWS = OWS 1133 ; "bad" whitespace 1135 3.2.4. Field Parsing 1137 No whitespace is allowed between the header field-name and colon. 
In 1138 the past, differences in the handling of such whitespace have led to 1139 security vulnerabilities in request routing and response handling. A 1140 server MUST reject any received request message that contains 1141 whitespace between a header field-name and colon with a response code 1142 of 400 (Bad Request). A proxy MUST remove any such whitespace from a 1143 response message before forwarding the message downstream. 1145 A field value is preceded by optional whitespace (OWS); a single SP 1146 is preferred. The field value does not include any leading or 1147 trailing white space: OWS occurring before the first non-whitespace 1148 octet of the field value or after the last non-whitespace octet of 1149 the field value ought to be excluded by parsers when extracting the 1150 field value from a header field. 1152 A recipient of field-content containing multiple sequential octets of 1153 optional (OWS) or required (RWS) whitespace SHOULD either replace the 1154 sequence with a single SP or transform any non-SP octets in the 1155 sequence to SP octets before interpreting the field value or 1156 forwarding the message downstream. 1158 Historically, HTTP header field values could be extended over 1159 multiple lines by preceding each extra line with at least one space 1160 or horizontal tab (obs-fold). This specification deprecates such 1161 line folding except within the message/http media type 1162 (Section 8.3.1). A sender MUST NOT generate a message that includes 1163 line folding (i.e., that has any field-value that contains a match to 1164 the obs-fold rule) unless the message is intended for packaging 1165 within the message/http media type. 
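The field parsing rules in this section, as seen by a server handling a request, can be sketched as follows. This is a minimal Python illustration and not a definitive implementation: parse_header_field and BadRequest are hypothetical names, the input is assumed to be a single header-field line with its CRLF already removed and with no obs-fold present, and mapping BadRequest to a 400 (Bad Request) response is left to the caller.

```python
import re

# field-name = token = 1*tchar (see the tchar rule in Section 3.2.6).
_TCHAR = rb"!#$%&'*+\-.^_`|~0-9A-Za-z"
_FIELD_NAME = re.compile(rb"[" + _TCHAR + rb"]+")


class BadRequest(ValueError):
    """Hypothetical error that a server would map to 400 (Bad Request)."""


def parse_header_field(line: bytes):
    """Split one header-field line into (field-name, field-value).

    Leading and trailing OWS (SP / HTAB) is excluded from the value.
    The name is returned as received; it is case-insensitive, so a
    caller might lowercase it before using it as a lookup key.
    """
    name, colon, rest = line.partition(b":")
    if not colon:
        raise BadRequest("header field has no colon")
    if not _FIELD_NAME.fullmatch(name):
        # Also rejects whitespace between the field-name and the colon,
        # because SP and HTAB are not tchar octets.
        raise BadRequest(f"invalid field-name: {name!r}")
    return name, rest.strip(b" \t")
```

For example, under these assumptions the line "Host:  example.com" yields the value "example.com", while "Host : example.com" raises BadRequest because of the space before the colon.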
1167 A server that receives an obs-fold in a request message that is not 1168 within a message/http container MUST either reject the message by 1169 sending a 400 (Bad Request), preferably with a representation 1170 explaining that obsolete line folding is unacceptable, or replace 1171 each received obs-fold with one or more SP octets prior to 1172 interpreting the field value or forwarding the message downstream. 1174 A proxy or gateway that receives an obs-fold in a response message 1175 that is not within a message/http container MUST either discard the 1176 message and replace it with a 502 (Bad Gateway) response, preferably 1177 with a representation explaining that unacceptable line folding was 1178 received, or replace each received obs-fold with one or more SP 1179 octets prior to interpreting the field value or forwarding the 1180 message downstream. 1182 A user agent that receives an obs-fold in a response message that is 1183 not within a message/http container MUST replace each received obs- 1184 fold with one or more SP octets prior to interpreting the field 1185 value. 1187 Historically, HTTP has allowed field content with text in the ISO- 1188 8859-1 [ISO-8859-1] charset, supporting other charsets only through 1189 use of [RFC2047] encoding. In practice, most HTTP header field 1190 values use only a subset of the US-ASCII charset [USASCII]. Newly 1191 defined header fields SHOULD limit their field values to US-ASCII 1192 octets. A recipient SHOULD treat other octets in field content (obs- 1193 text) as opaque data. 1195 3.2.5. Field Limits 1197 HTTP does not place a pre-defined limit on the length of each header 1198 field or on the length of the header section as a whole, as described 1199 in Section 2.5. Various ad-hoc limitations on individual header 1200 field length are found in practice, often depending on the specific 1201 field semantics. 
1203 A server ought to be prepared to receive request header fields of 1204 unbounded length and MUST respond with an appropriate 4xx (Client 1205 Error) status code if the received header field(s) are larger than 1206 the server wishes to process. 1208 A client ought to be prepared to receive response header fields of 1209 unbounded length. A client MAY discard or truncate received header 1210 fields that are larger than the client wishes to process if the field 1211 semantics are such that the dropped value(s) can be safely ignored 1212 without changing the message framing or response semantics. 1214 3.2.6. Field value components 1216 Many HTTP header field values consist of words (token or quoted- 1217 string) separated by whitespace or special characters. 1219 word = token / quoted-string 1221 token = 1*tchar 1223 tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*" 1224 / "+" / "-" / "." / "^" / "_" / "`" / "|" / "~" 1225 / DIGIT / ALPHA 1226 ; any VCHAR, except special 1228 special = "(" / ")" / "<" / ">" / "@" / "," 1229 / ";" / ":" / "\" / DQUOTE / "/" / "[" 1230 / "]" / "?" / "=" / "{" / "}" 1232 A string of text is parsed as a single word if it is quoted using 1233 double-quote marks. 1235 quoted-string = DQUOTE *( qdtext / quoted-pair ) DQUOTE 1236 qdtext = HTAB / SP /%x21 / %x23-5B / %x5D-7E / obs-text 1237 obs-text = %x80-FF 1239 The backslash octet ("\") can be used as a single-octet quoting 1240 mechanism within quoted-string constructs: 1242 quoted-pair = "\" ( HTAB / SP / VCHAR / obs-text ) 1244 Recipients that process the value of a quoted-string MUST handle a 1245 quoted-pair as if it were replaced by the octet following the 1246 backslash. 1248 A sender SHOULD NOT generate a quoted-pair in a quoted-string except 1249 where necessary to quote DQUOTE and backslash octets occurring within 1250 that string. 1252 Comments can be included in some HTTP header fields by surrounding 1253 the comment text with parentheses. 
Comments are only allowed in 1254 fields containing "comment" as part of their field value definition. 1256 comment = "(" *( ctext / quoted-cpair / comment ) ")" 1257 ctext = HTAB / SP / %x21-27 / %x2A-5B / %x5D-7E / obs-text 1259 The backslash octet ("\") can be used as a single-octet quoting 1260 mechanism within comment constructs: 1262 quoted-cpair = "\" ( HTAB / SP / VCHAR / obs-text ) 1264 A sender SHOULD NOT escape octets in comments that do not require 1265 escaping (i.e., other than the backslash octet "\" and the 1266 parentheses "(" and ")"). 1268 3.3. Message Body 1270 The message body (if any) of an HTTP message is used to carry the 1271 payload body of that request or response. The message body is 1272 identical to the payload body unless a transfer coding has been 1273 applied, as described in Section 3.3.1. 1275 message-body = *OCTET 1277 The rules for when a message body is allowed in a message differ for 1278 requests and responses. 1280 The presence of a message body in a request is signaled by a Content- 1281 Length or Transfer-Encoding header field. Request message framing is 1282 independent of method semantics, even if the method does not define 1283 any use for a message body. 1285 The presence of a message body in a response depends on both the 1286 request method to which it is responding and the response status code 1287 (Section 3.1.2). Responses to the HEAD request method never include 1288 a message body because the associated response header fields (e.g., 1289 Transfer-Encoding, Content-Length, etc.), if present, indicate only 1290 what their values would have been if the request method had been GET 1291 (Section 4.3.2 of [Part2]). 2xx (Successful) responses to CONNECT 1292 switch to tunnel mode instead of having a message body (Section 4.3.6 1293 of [Part2]). All 1xx (Informational), 204 (No Content), and 304 (Not 1294 Modified) responses do not include a message body. 
All other 1295 responses do include a message body, although the body might be of 1296 zero length. 1298 3.3.1. Transfer-Encoding 1300 The Transfer-Encoding header field lists the transfer coding names 1301 corresponding to the sequence of transfer codings that have been (or 1302 will be) applied to the payload body in order to form the message 1303 body. Transfer codings are defined in Section 4. 1305 Transfer-Encoding = 1#transfer-coding 1307 Transfer-Encoding is analogous to the Content-Transfer-Encoding field 1308 of MIME, which was designed to enable safe transport of binary data 1309 over a 7-bit transport service ([RFC2045], Section 6). However, safe 1310 transport has a different focus for an 8bit-clean transfer protocol. 1311 In HTTP's case, Transfer-Encoding is primarily intended to accurately 1312 delimit a dynamically generated payload and to distinguish payload 1313 encodings that are only applied for transport efficiency or security 1314 from those that are characteristics of the selected resource. 1316 A recipient MUST be able to parse the chunked transfer coding 1317 (Section 4.1) because it plays a crucial role in framing messages 1318 when the payload body size is not known in advance. A sender MUST 1319 NOT apply chunked more than once to a message body (i.e., chunking an 1320 already chunked message is not allowed). If any transfer coding 1321 other than chunked is applied to a request payload body, the sender 1322 MUST apply chunked as the final transfer coding to ensure that the 1323 message is properly framed. If any transfer coding other than 1324 chunked is applied to a response payload body, the sender MUST either 1325 apply chunked as the final transfer coding or terminate the message 1326 by closing the connection. 1328 For example, 1330 Transfer-Encoding: gzip, chunked 1332 indicates that the payload body has been compressed using the gzip 1333 coding and then chunked using the chunked coding while forming the 1334 message body. 
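The framing rules above for Transfer-Encoding might be checked along the following lines. This is a sketch only: the name framing_for, its return values "chunked" and "close", and the use of ValueError are illustrative assumptions. Splitting on "," is a simplification that ignores commas inside quoted parameter values.

```python
def framing_for(field_value: str, is_request: bool) -> str:
    """Return "chunked" or "close" for a Transfer-Encoding field value.

    Hypothetical helper; raises ValueError when the rules of
    Section 3.3.1 are violated.
    """
    # Coding names are case-insensitive; drop any ";parameter" part.
    names = [item.split(";")[0].strip().lower()
             for item in field_value.split(",") if item.strip()]
    if not names:
        raise ValueError("Transfer-Encoding needs at least one coding")
    if names.count("chunked") > 1:
        raise ValueError("chunked applied more than once")
    if names[-1] == "chunked":
        return "chunked"
    if is_request:
        # A request with other codings must end in chunked.
        raise ValueError("request Transfer-Encoding must end in chunked")
    # A response without a final chunked coding is close-delimited.
    return "close"
```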
1336 Unlike Content-Encoding (Section 3.1.2.1 of [Part2]), Transfer- 1337 Encoding is a property of the message, not of the representation, and 1338 any recipient along the request/response chain MAY decode the 1339 received transfer coding(s) or apply additional transfer coding(s) to 1340 the message body, assuming that corresponding changes are made to the 1341 Transfer-Encoding field-value. Additional information about the 1342 encoding parameters MAY be provided by other header fields not 1343 defined by this specification. 1345 Transfer-Encoding MAY be sent in a response to a HEAD request or in a 1346 304 (Not Modified) response (Section 4.1 of [Part4]) to a GET 1347 request, neither of which includes a message body, to indicate that 1348 the origin server would have applied a transfer coding to the message 1349 body if the request had been an unconditional GET. This indication 1350 is not required, however, because any recipient on the response chain 1351 (including the origin server) can remove transfer codings when they 1352 are not needed. 1354 A server MUST NOT send a Transfer-Encoding header field in any 1355 response with a status code of 1xx (Informational) or 204 (No 1356 Content). A server MUST NOT send a Transfer-Encoding header field in 1357 any 2xx (Successful) response to a CONNECT request (Section 4.3.6 of 1358 [Part2]). 1360 Transfer-Encoding was added in HTTP/1.1. It is generally assumed 1361 that implementations advertising only HTTP/1.0 support will not 1362 understand how to process a transfer-encoded payload. A client MUST 1363 NOT send a request containing Transfer-Encoding unless it knows the 1364 server will handle HTTP/1.1 (or later) requests; such knowledge might 1365 be in the form of specific user configuration or by remembering the 1366 version of a prior received response. A server MUST NOT send a 1367 response containing Transfer-Encoding unless the corresponding 1368 request indicates HTTP/1.1 (or later). 
1370 A server that receives a request message with a transfer coding it 1371 does not understand SHOULD respond with 501 (Not Implemented). 1373 3.3.2. Content-Length 1375 When a message does not have a Transfer-Encoding header field, a 1376 Content-Length header field can provide the anticipated size, as a 1377 decimal number of octets, for a potential payload body. For messages 1378 that do include a payload body, the Content-Length field-value 1379 provides the framing information necessary for determining where the 1380 body (and message) ends. For messages that do not include a payload 1381 body, the Content-Length indicates the size of the selected 1382 representation (Section 3 of [Part2]). 1384 Content-Length = 1*DIGIT 1386 An example is 1388 Content-Length: 3495 1390 A sender MUST NOT send a Content-Length header field in any message 1391 that contains a Transfer-Encoding header field. 1393 A user agent SHOULD send a Content-Length in a request message when 1394 no Transfer-Encoding is sent and the request method defines a meaning 1395 for an enclosed payload body. For example, a Content-Length header 1396 field is normally sent in a POST request even when the value is 0 1397 (indicating an empty payload body). A user agent SHOULD NOT send a 1398 Content-Length header field when the request message does not contain 1399 a payload body and the method semantics do not anticipate such a 1400 body. 1402 A server MAY send a Content-Length header field in a response to a 1403 HEAD request (Section 4.3.2 of [Part2]); a server MUST NOT send 1404 Content-Length in such a response unless its field-value equals the 1405 decimal number of octets that would have been sent in the payload 1406 body of a response if the same request had used the GET method. 
1408 A server MAY send a Content-Length header field in a 304 (Not 1409 Modified) response to a conditional GET request (Section 4.1 of 1410 [Part4]); a server MUST NOT send Content-Length in such a response 1411 unless its field-value equals the decimal number of octets that would 1412 have been sent in the payload body of a 200 (OK) response to the same 1413 request. 1415 A server MUST NOT send a Content-Length header field in any response 1416 with a status code of 1xx (Informational) or 204 (No Content). A 1417 server MUST NOT send a Content-Length header field in any 2xx 1418 (Successful) response to a CONNECT request (Section 4.3.6 of 1419 [Part2]). 1421 Aside from the cases defined above, in the absence of Transfer- 1422 Encoding, an origin server SHOULD send a Content-Length header field 1423 when the payload body size is known prior to sending the complete 1424 header section. This will allow downstream recipients to measure 1425 transfer progress, know when a received message is complete, and 1426 potentially reuse the connection for additional requests. 1428 Any Content-Length field value greater than or equal to zero is 1429 valid. Since there is no predefined limit to the length of a 1430 payload, a recipient MUST anticipate potentially large decimal 1431 numerals and prevent parsing errors due to integer conversion 1432 overflows (Section 9.3). 
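A sketch of Content-Length parsing that follows the 1*DIGIT grammar and guards against oversized numerals is given below. The function name parse_content_length and the max_len cap (here the largest signed 64-bit value) are assumptions chosen for illustration; an implementation is free to pick a different limit.

```python
import re

_DIGITS = re.compile(rb"[0-9]+")

def parse_content_length(value: bytes, max_len: int = 2**63 - 1) -> int:
    """Parse a single Content-Length field value (hypothetical helper)."""
    # Only ASCII digits are valid: no sign, no whitespace, no empty value.
    if not _DIGITS.fullmatch(value):
        raise ValueError("Content-Length is not 1*DIGIT")
    digits = value.lstrip(b"0") or b"0"
    # Reject very long numerals before converting, so the conversion
    # itself cannot overflow or consume excessive resources.
    if len(digits) > len(str(max_len)):
        raise ValueError("Content-Length too large")
    length = int(digits)
    if length > max_len:
        raise ValueError("Content-Length too large")
    return length
```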
1434 If a message is received that has multiple Content-Length header 1435 fields with field-values consisting of the same decimal value, or a 1436 single Content-Length header field with a field value containing a 1437 list of identical decimal values (e.g., "Content-Length: 42, 42"), 1438 indicating that duplicate Content-Length header fields have been 1439 generated or combined by an upstream message processor, then the 1440 recipient MUST either reject the message as invalid or replace the 1441 duplicated field-values with a single valid Content-Length field 1442 containing that decimal value prior to determining the message body 1443 length or forwarding the message. 1445 Note: HTTP's use of Content-Length for message framing differs 1446 significantly from the same field's use in MIME, where it is an 1447 optional field used only within the "message/external-body" media- 1448 type. 1450 3.3.3. Message Body Length 1452 The length of a message body is determined by one of the following 1453 (in order of precedence): 1455 1. Any response to a HEAD request and any response with a 1xx 1456 (Informational), 204 (No Content), or 304 (Not Modified) status 1457 code is always terminated by the first empty line after the 1458 header fields, regardless of the header fields present in the 1459 message, and thus cannot contain a message body. 1461 2. Any 2xx (Successful) response to a CONNECT request implies that 1462 the connection will become a tunnel immediately after the empty 1463 line that concludes the header fields. A client MUST ignore any 1464 Content-Length or Transfer-Encoding header fields received in 1465 such a message. 1467 3. If a Transfer-Encoding header field is present and the chunked 1468 transfer coding (Section 4.1) is the final encoding, the message 1469 body length is determined by reading and decoding the chunked 1470 data until the transfer coding indicates the data is complete. 
1472 If a Transfer-Encoding header field is present in a response and 1473 the chunked transfer coding is not the final encoding, the 1474 message body length is determined by reading the connection until 1475 it is closed by the server. If a Transfer-Encoding header field 1476 is present in a request and the chunked transfer coding is not 1477 the final encoding, the message body length cannot be determined 1478 reliably; the server MUST respond with the 400 (Bad Request) 1479 status code and then close the connection. 1481 If a message is received with both a Transfer-Encoding and a 1482 Content-Length header field, the Transfer-Encoding overrides the 1483 Content-Length. Such a message might indicate an attempt to 1484 perform request or response smuggling (bypass of security-related 1485 checks on message routing or content) and thus ought to be 1486 handled as an error. A sender MUST remove the received Content- 1487 Length field prior to forwarding such a message downstream. 1489 4. If a message is received without Transfer-Encoding and with 1490 either multiple Content-Length header fields having differing 1491 field-values or a single Content-Length header field having an 1492 invalid value, then the message framing is invalid and the 1493 recipient MUST treat it as an unrecoverable error to prevent 1494 request or response smuggling. If this is a request message, the 1495 server MUST respond with a 400 (Bad Request) status code and then 1496 close the connection. If this is a response message received by 1497 a proxy, the proxy MUST close the connection to the server, 1498 discard the received response, and send a 502 (Bad Gateway) 1499 response to the client. If this is a response message received 1500 by a user agent, the user agent MUST close the connection to the 1501 server and discard the received response. 1503 5. 
If a valid Content-Length header field is present without 1504 Transfer-Encoding, its decimal value defines the expected message 1505 body length in octets. If the sender closes the connection or 1506 the recipient times out before the indicated number of octets are 1507 received, the recipient MUST consider the message to be 1508 incomplete and close the connection. 1510 6. If this is a request message and none of the above are true, then 1511 the message body length is zero (no message body is present). 1513 7. Otherwise, this is a response message without a declared message 1514 body length, so the message body length is determined by the 1515 number of octets received prior to the server closing the 1516 connection. 1518 Since there is no way to distinguish a successfully completed, close- 1519 delimited message from a partially-received message interrupted by 1520 network failure, a server SHOULD generate encoding or length- 1521 delimited messages whenever possible. The close-delimiting feature 1522 exists primarily for backwards compatibility with HTTP/1.0. 1524 A server MAY reject a request that contains a message body but not a 1525 Content-Length by responding with 411 (Length Required). 1527 Unless a transfer coding other than chunked has been applied, a 1528 client that sends a request containing a message body SHOULD use a 1529 valid Content-Length header field if the message body length is known 1530 in advance, rather than the chunked transfer coding, since some 1531 existing services respond to chunked with a 411 (Length Required) 1532 status code even though they understand the chunked transfer coding. 1533 This is typically because such services are implemented via a gateway 1534 that requires a content-length in advance of being called and the 1535 server is unable or unwilling to buffer the entire request before 1536 processing. 
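The order of precedence in Section 3.3.3 can be sketched as a single decision function. The name body_framing, its keyword parameters, and the returned (kind, length) tuples are illustrative assumptions, not part of this specification; ValueError stands in for the error handling that items 3 and 4 require (for example, a 400 response followed by closing the connection).

```python
def body_framing(*, is_response, status=None, request_method=None,
                 transfer_encoding=None, content_lengths=()):
    """Return (kind, length) with kind in none/tunnel/chunked/close/length."""
    if is_response:
        # 1. HEAD responses and 1xx, 204, 304 never have a body.
        if request_method == "HEAD" or 100 <= status < 200 or status in (204, 304):
            return ("none", 0)
        # 2. A 2xx response to CONNECT switches to tunnel mode.
        if request_method == "CONNECT" and 200 <= status < 300:
            return ("tunnel", None)
    # 3. Transfer-Encoding overrides any Content-Length.
    if transfer_encoding is not None:
        final = transfer_encoding.split(",")[-1].split(";")[0].strip().lower()
        if final == "chunked":
            return ("chunked", None)
        if is_response:
            return ("close", None)
        raise ValueError("request Transfer-Encoding must end in chunked")
    # 4. and 5. Content-Length values must be valid and must all agree.
    if content_lengths:
        values = set()
        for field in content_lengths:
            for item in field.split(","):
                item = item.strip()
                if not (item.isascii() and item.isdigit()):
                    raise ValueError("invalid Content-Length")
                values.add(int(item))
        if len(values) != 1:
            raise ValueError("conflicting Content-Length values")
        return ("length", values.pop())
    # 6. Request without framing: no body. 7. Response: read until close.
    return ("close", None) if is_response else ("none", 0)
```

The sketch accepts a message that carries both Transfer-Encoding and Content-Length and lets Transfer-Encoding win; a stricter recipient could treat that combination as an error, as item 3 suggests.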
1538 A user agent that sends a request containing a message body MUST send 1539 a valid Content-Length header field if it does not know the server 1540 will handle HTTP/1.1 (or later) requests; such knowledge can be in 1541 the form of specific user configuration or by remembering the version 1542 of a prior received response. 1544 If the final response to the last request on a connection has been 1545 completely received and there remains additional data to read, a user 1546 agent MAY discard the remaining data or attempt to determine if that 1547 data belongs as part of the prior response body, which might be the 1548 case if the prior message's Content-Length value is incorrect. A 1549 client MUST NOT process, cache, or forward such extra data as a 1550 separate response, since such behavior would be vulnerable to cache 1551 poisoning. 1553 3.4. Handling Incomplete Messages 1555 A server that receives an incomplete request message, usually due to 1556 a canceled request or a triggered time-out exception, MAY send an 1557 error response prior to closing the connection. 1559 A client that receives an incomplete response message, which can 1560 occur when a connection is closed prematurely or when decoding a 1561 supposedly chunked transfer coding fails, MUST record the message as 1562 incomplete. Cache requirements for incomplete responses are defined 1563 in Section 3 of [Part6]. 1565 If a response terminates in the middle of the header section (before 1566 the empty line is received) and the status code might rely on header 1567 fields to convey the full meaning of the response, then the client 1568 cannot assume that meaning has been conveyed; the client might need 1569 to repeat the request in order to determine what action to take next. 1571 A message body that uses the chunked transfer coding is incomplete if 1572 the zero-sized chunk that terminates the encoding has not been 1573 received. 
A message that uses a valid Content-Length is incomplete 1574 if the size of the message body received (in octets) is less than the 1575 value given by Content-Length. A response that has neither chunked 1576 transfer coding nor Content-Length is terminated by closure of the 1577 connection, and thus is considered complete regardless of the number 1578 of message body octets received, provided that the header section was 1579 received intact. 1581 3.5. Message Parsing Robustness 1583 Older HTTP/1.0 user agent implementations might send an extra CRLF 1584 after a POST request as a workaround for some early server 1585 applications that failed to read message body content that was not 1586 terminated by a line-ending. An HTTP/1.1 user agent MUST NOT preface 1587 or follow a request with an extra CRLF. If terminating the request 1588 message body with a line-ending is desired, then the user agent MUST 1589 count the terminating CRLF octets as part of the message body length. 1591 In the interest of robustness, a server that is expecting to receive 1592 and parse a request-line SHOULD ignore at least one empty line (CRLF) 1593 received prior to the request-line. 1595 Although the line terminator for the start-line and header fields is 1596 the sequence CRLF, a recipient MAY recognize a single LF as a line 1597 terminator and ignore any preceding CR. 1599 Although the request-line and status-line grammar rules require that 1600 each of the component elements be separated by a single SP octet, 1601 recipients MAY instead parse on whitespace-delimited word boundaries 1602 and, aside from the CRLF terminator, treat any form of whitespace as 1603 the SP separator while ignoring preceding or trailing whitespace; 1604 such whitespace includes one or more of the following octets: SP, 1605 HTAB, VT (%x0B), FF (%x0C), or bare CR. 
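The lenient, whitespace-delimited parsing that recipients are permitted to use for the request-line might look like the following sketch. The function name parse_request_line is an assumption for illustration; the input is assumed to be one line with its line terminator already removed, and ValueError stands in for a 400 (Bad Request) response.

```python
import re

# Octets a lenient recipient may treat like SP: SP, HTAB, VT, FF, bare CR.
_WS = b" \t\x0b\x0c\r"
_WS_RUN = re.compile(rb"[ \t\x0b\x0c\r]+")

def parse_request_line(line: bytes) -> tuple[bytes, bytes, bytes]:
    """Split a request-line into (method, request-target, HTTP-version)."""
    # Ignore preceding and trailing whitespace, then split on runs of
    # whitespace instead of insisting on exactly one SP.
    parts = _WS_RUN.split(line.strip(_WS))
    if len(parts) != 3:
        raise ValueError("request-line does not have three components")
    method, target, version = parts
    return method, target, version
```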
1607 When a server listening only for HTTP request messages, or processing 1608 what appears from the start-line to be an HTTP request message, 1609 receives a sequence of octets that does not match the HTTP-message 1610 grammar aside from the robustness exceptions listed above, the server 1611 SHOULD respond with a 400 (Bad Request) response. 1613 4. Transfer Codings 1615 Transfer coding names are used to indicate an encoding transformation 1616 that has been, can be, or might need to be applied to a payload body 1617 in order to ensure "safe transport" through the network. This 1618 differs from a content coding in that the transfer coding is a 1619 property of the message rather than a property of the representation 1620 that is being transferred. 1622 transfer-coding = "chunked" ; Section 4.1 1623 / "compress" ; Section 4.2.1 1624 / "deflate" ; Section 4.2.2 1625 / "gzip" ; Section 4.2.3 1626 / transfer-extension 1627 transfer-extension = token *( OWS ";" OWS transfer-parameter ) 1629 Parameters are in the form of attribute/value pairs. 1631 transfer-parameter = attribute BWS "=" BWS value 1632 attribute = token 1633 value = word 1635 All transfer-coding names are case-insensitive and ought to be 1636 registered within the HTTP Transfer Coding registry, as defined in 1637 Section 8.4. They are used in the TE (Section 4.3) and Transfer- 1638 Encoding (Section 3.3.1) header fields. 1640 4.1. Chunked Transfer Coding 1642 The chunked transfer coding wraps the payload body in order to 1643 transfer it as a series of chunks, each with its own size indicator, 1644 followed by an OPTIONAL trailer containing header fields. Chunked 1645 enables content streams of unknown size to be transferred as a 1646 sequence of length-delimited buffers, which enables the sender to 1647 retain connection persistence and the recipient to know when it has 1648 received the entire message. 
1650 chunked-body = *chunk 1651 last-chunk 1652 trailer-part 1653 CRLF 1655 chunk = chunk-size [ chunk-ext ] CRLF 1656 chunk-data CRLF 1657 chunk-size = 1*HEXDIG 1658 last-chunk = 1*("0") [ chunk-ext ] CRLF 1660 chunk-data = 1*OCTET ; a sequence of chunk-size octets 1662 The chunk-size field is a string of hex digits indicating the size of 1663 the chunk-data in octets. The chunked transfer coding is complete 1664 when a chunk with a chunk-size of zero is received, possibly followed 1665 by a trailer, and finally terminated by an empty line. 1667 A recipient MUST be able to parse and decode the chunked transfer 1668 coding. 1670 4.1.1. Chunk Extensions 1672 The chunked encoding allows each chunk to include zero or more chunk 1673 extensions, immediately following the chunk-size, for the sake of 1674 supplying per-chunk metadata (such as a signature or hash), mid- 1675 message control information, or randomization of message body size. 1677 chunk-ext = *( ";" chunk-ext-name [ "=" chunk-ext-val ] ) 1679 chunk-ext-name = token 1680 chunk-ext-val = token / quoted-str-nf 1682 quoted-str-nf = DQUOTE *( qdtext-nf / quoted-pair ) DQUOTE 1683 ; like quoted-string, but disallowing line folding 1684 qdtext-nf = HTAB / SP / %x21 / %x23-5B / %x5D-7E / obs-text 1686 The chunked encoding is specific to each connection and is likely to 1687 be removed or recoded by each recipient (including intermediaries) 1688 before any higher-level application would have a chance to inspect 1689 the extensions. Hence, use of chunk extensions is generally limited 1690 to specialized HTTP services such as "long polling" (where client and 1691 server can have shared expectations regarding the use of chunk 1692 extensions) or for padding within an end-to-end secured connection. 1694 A recipient MUST ignore unrecognized chunk extensions. 
A server 1695 ought to limit the total length of chunk extensions received in a 1696 request to an amount reasonable for the services provided, in the 1697 same way that it applies length limitations and timeouts for other 1698 parts of a message, and generate an appropriate 4xx (Client Error) 1699 response if that amount is exceeded. 1701 4.1.2. Chunked Trailer Part 1703 A trailer allows the sender to include additional fields at the end 1704 of a chunked message in order to supply metadata that might be 1705 dynamically generated while the message body is sent, such as a 1706 message integrity check, digital signature, or post-processing 1707 status. The trailer fields are identical to header fields, except 1708 they are sent in a chunked trailer instead of the message's header 1709 section. 1711 trailer-part = *( header-field CRLF ) 1713 A sender MUST NOT generate a trailer that contains a field which 1714 needs to be known by the recipient before it can begin processing the 1715 message body. For example, most recipients need to know the values 1716 of Content-Encoding and Content-Type in order to select a content 1717 handler, so placing those fields in a trailer would force the 1718 recipient to buffer the entire body before it could begin, greatly 1719 increasing user-perceived latency and defeating one of the main 1720 advantages of using chunked to send data streams of unknown length. 1721 A sender MUST NOT generate a trailer containing a Transfer-Encoding, 1722 Content-Length, or Trailer field. 1724 A server MUST generate an empty trailer with the chunked transfer 1725 coding unless at least one of the following is true: 1727 1. the request included a TE header field that indicates "trailers" 1728 is acceptable in the transfer coding of the response, as 1729 described in Section 4.3; or, 1731 2. 
the trailer fields consist entirely of optional metadata and the 1732 recipient could use the message (in a manner acceptable to the 1733 generating server) without receiving that metadata. In other 1734 words, the generating server is willing to accept the possibility 1735 that the trailer fields might be silently discarded along the 1736 path to the client. 1738 The above requirement prevents the need for an infinite buffer when a 1739 message is being received by an HTTP/1.1 (or later) proxy and 1740 forwarded to an HTTP/1.0 recipient. 1742 4.1.3. Decoding Chunked 1744 A process for decoding the chunked transfer coding can be represented 1745 in pseudo-code as: 1747 length := 0 1748 read chunk-size, chunk-ext (if any), and CRLF 1749 while (chunk-size > 0) { 1750 read chunk-data and CRLF 1751 append chunk-data to decoded-body 1752 length := length + chunk-size 1753 read chunk-size, chunk-ext (if any), and CRLF 1754 } 1755 read header-field 1756 while (header-field not empty) { 1757 append header-field to existing header fields 1758 read header-field 1759 } 1760 Content-Length := length 1761 Remove "chunked" from Transfer-Encoding 1762 Remove Trailer from existing header fields 1764 4.2. Compression Codings 1766 The codings defined below can be used to compress the payload of a 1767 message. 1769 4.2.1. Compress Coding 1771 The "compress" coding is an adaptive Lempel-Ziv-Welch (LZW) coding 1772 [Welch] that is commonly produced by the UNIX file compression 1773 program "compress". A recipient SHOULD consider "x-compress" to be 1774 equivalent to "compress". 1776 4.2.2. Deflate Coding 1778 The "deflate" coding is a "zlib" data format [RFC1950] containing a 1779 "deflate" compressed data stream [RFC1951] that uses a combination of 1780 the Lempel-Ziv (LZ77) compression algorithm and Huffman coding. 1782 Note: Some incorrect implementations send the "deflate" compressed 1783 data without the zlib wrapper. 1785 4.2.3. 
Gzip Coding 1787 The "gzip" coding is an LZ77 coding with a 32 bit CRC that is 1788 commonly produced by the gzip file compression program [RFC1952]. A 1789 recipient SHOULD consider "x-gzip" to be equivalent to "gzip". 1791 4.3. TE 1793 The "TE" header field in a request indicates what transfer codings, 1794 besides chunked, the client is willing to accept in response, and 1795 whether or not the client is willing to accept trailer fields in a 1796 chunked transfer coding. 1798 The TE field-value consists of a comma-separated list of transfer 1799 coding names, each allowing for optional parameters (as described in 1800 Section 4), and/or the keyword "trailers". A client MUST NOT send 1801 the chunked transfer coding name in TE; chunked is always acceptable 1802 for HTTP/1.1 recipients. 1804 TE = #t-codings 1805 t-codings = "trailers" / ( transfer-coding [ t-ranking ] ) 1806 t-ranking = OWS ";" OWS "q=" rank 1807 rank = ( "0" [ "." 0*3DIGIT ] ) 1808 / ( "1" [ "." 0*3("0") ] ) 1810 Three examples of TE use are below. 1812 TE: deflate 1813 TE: 1814 TE: trailers, deflate;q=0.5 1816 The presence of the keyword "trailers" indicates that the client is 1817 willing to accept trailer fields in a chunked transfer coding, as 1818 defined in Section 4.1.2, on behalf of itself and any downstream 1819 clients. For requests from an intermediary, this implies that 1820 either: (a) all downstream clients are willing to accept trailer 1821 fields in the forwarded response; or, (b) the intermediary will 1822 attempt to buffer the response on behalf of downstream recipients. 1823 Note that HTTP/1.1 does not define any means to limit the size of a 1824 chunked response such that an intermediary can be assured of 1825 buffering the entire response. 1827 When multiple transfer codings are acceptable, the client MAY rank 1828 the codings by preference using a case-insensitive "q" parameter 1829 (similar to the qvalues used in content negotiation fields, Section 1830 5.3.1 of [Part2]). 
The rank value is a real number in the range 0 1831 through 1, where 0.001 is the least preferred and 1 is the most 1832 preferred; a value of 0 means "not acceptable". 1834 If the TE field-value is empty or if no TE field is present, the only 1835 acceptable transfer coding is chunked. A message with no transfer 1836 coding is always acceptable. 1838 Since the TE header field only applies to the immediate connection, a 1839 sender of TE MUST also send a "TE" connection option within the 1840 Connection header field (Section 6.1) in order to prevent the TE 1841 field from being forwarded by intermediaries that do not support its 1842 semantics. 1844 4.4. Trailer 1846 When a message includes a message body encoded with the chunked 1847 transfer coding and the sender desires to send metadata in the form 1848 of trailer fields at the end of the message, the sender SHOULD 1849 generate a Trailer header field before the message body to indicate 1850 which fields will be present in the trailers. This allows the 1851 recipient to prepare for receipt of that metadata before it starts 1852 processing the body, which is useful if the message is being streamed 1853 and the recipient wishes to confirm an integrity check on the fly. 1855 Trailer = 1#field-name 1857 5. Message Routing 1859 HTTP request message routing is determined by each client based on 1860 the target resource, the client's proxy configuration, and 1861 establishment or reuse of an inbound connection. The corresponding 1862 response routing follows the same connection chain back to the 1863 client. 1865 5.1. Identifying a Target Resource 1867 HTTP is used in a wide variety of applications, ranging from general- 1868 purpose computers to home appliances. In some cases, communication 1869 options are hard-coded in a client's configuration. However, most 1870 HTTP clients rely on the same resource identification mechanism and 1871 configuration techniques as general-purpose Web browsers. 
1873 HTTP communication is initiated by a user agent for some purpose. 1874 The purpose is a combination of request semantics, which are defined 1875 in [Part2], and a target resource upon which to apply those 1876 semantics. A URI reference (Section 2.7) is typically used as an 1877 identifier for the "target resource", which a user agent would 1878 resolve to its absolute form in order to obtain the "target URI". 1879 The target URI excludes the reference's fragment component, if any, 1880 since fragment identifiers are reserved for client-side processing 1881 ([RFC3986], Section 3.5). 1883 5.2. Connecting Inbound 1885 Once the target URI is determined, a client needs to decide whether a 1886 network request is necessary to accomplish the desired semantics and, 1887 if so, where that request is to be directed. 1889 If the client has a cache [Part6] and the request can be satisfied by 1890 it, then the request is usually directed there first. 1892 If the request is not satisfied by a cache, then a typical client 1893 will check its configuration to determine whether a proxy is to be 1894 used to satisfy the request. Proxy configuration is implementation- 1895 dependent, but is often based on URI prefix matching, selective 1896 authority matching, or both, and the proxy itself is usually 1897 identified by an "http" or "https" URI. If a proxy is applicable, 1898 the client connects inbound by establishing (or reusing) a connection 1899 to that proxy. 1901 If no proxy is applicable, a typical client will invoke a handler 1902 routine, usually specific to the target URI's scheme, to connect 1903 directly to an authority for the target resource. How that is 1904 accomplished is dependent on the target URI scheme and defined by its 1905 associated specification, similar to how this specification defines 1906 origin server access for resolution of the "http" (Section 2.7.1) and 1907 "https" (Section 2.7.2) schemes. 
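As an illustration of the target URI derivation described in Section 5.1, which the routing steps above take as their starting point, the following Python sketch resolves a URI reference against a base URI and then drops any fragment. The function name target_uri and the base URI used in the usage note are hypothetical; only the standard library's urllib.parse is used, and its reference resolution is assumed to be close enough to [RFC3986] for this purpose.

```python
from urllib.parse import urljoin, urldefrag


def target_uri(base_uri: str, reference: str) -> str:
    """Resolve a URI reference to absolute form and strip its fragment.

    Hypothetical helper: the name and signature are not part of the
    specification.
    """
    # Resolve the (possibly relative) reference against the base URI.
    absolute = urljoin(base_uri, reference)
    # Fragment identifiers are reserved for client-side processing, so
    # they are excluded from the target URI.
    return urldefrag(absolute).url
```

For instance, resolving the reference "../where?q=now#frag" against the base "http://www.example.org/a/b" yields "http://www.example.org/where?q=now".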
1909 HTTP requirements regarding connection management are defined in 1910 Section 6. 1912 5.3. Request Target 1914 Once an inbound connection is obtained, the client sends an HTTP 1915 request message (Section 3) with a request-target derived from the 1916 target URI. There are four distinct formats for the request-target, 1917 depending on both the method being requested and whether the request 1918 is to a proxy. 1920 request-target = origin-form 1921 / absolute-form 1922 / authority-form 1923 / asterisk-form 1925 origin-form = absolute-path [ "?" query ] 1926 absolute-form = absolute-URI 1927 authority-form = authority 1928 asterisk-form = "*" 1930 origin-form 1932 The most common form of request-target is the origin-form. When 1933 making a request directly to an origin server, other than a CONNECT 1934 or server-wide OPTIONS request (as detailed below), a client MUST 1935 send only the absolute path and query components of the target URI as 1936 the request-target. If the target URI's path component is empty, 1937 then the client MUST send "/" as the path within the origin-form of 1938 request-target. A Host header field is also sent, as defined in 1939 Section 5.4. 1941 For example, a client wishing to retrieve a representation of the 1942 resource identified as 1944 http://www.example.org/where?q=now 1946 directly from the origin server would open (or reuse) a TCP 1947 connection to port 80 of the host "www.example.org" and send the 1948 lines: 1950 GET /where?q=now HTTP/1.1 1951 Host: www.example.org 1953 followed by the remainder of the request message. 1955 absolute-form 1957 When making a request to a proxy, other than a CONNECT or server-wide 1958 OPTIONS request (as detailed below), a client MUST send the target 1959 URI in absolute-form as the request-target. 
The proxy is requested 1960 to either service that request from a valid cache, if possible, or 1961 make the same request on the client's behalf to either the next 1962 inbound proxy server or directly to the origin server indicated by 1963 the request-target. Requirements on such "forwarding" of messages 1964 are defined in Section 5.7. 1966 An example absolute-form of request-line would be: 1968 GET http://www.example.org/pub/WWW/TheProject.html HTTP/1.1 1970 To allow for transition to the absolute-form for all requests in some 1971 future version of HTTP, a server MUST accept the absolute-form in 1972 requests, even though HTTP/1.1 clients will only send them in 1973 requests to proxies. 1975 authority-form 1977 The authority-form of request-target is only used for CONNECT 1978 requests (Section 4.3.6 of [Part2]). When making a CONNECT request 1979 to establish a tunnel through one or more proxies, a client MUST send 1980 only the target URI's authority component (excluding any userinfo and 1981 its "@" delimiter) as the request-target. For example, 1983 CONNECT www.example.com:80 HTTP/1.1 1985 asterisk-form 1987 The asterisk-form of request-target is only used for a server-wide 1988 OPTIONS request (Section 4.3.7 of [Part2]). When a client wishes to 1989 request OPTIONS for the server as a whole, as opposed to a specific 1990 named resource of that server, the client MUST send only "*" (%x2A) 1991 as the request-target. For example, 1993 OPTIONS * HTTP/1.1 1995 If a proxy receives an OPTIONS request with an absolute-form of 1996 request-target in which the URI has an empty path and no query 1997 component, then the last proxy on the request chain MUST send a 1998 request-target of "*" when it forwards the request to the indicated 1999 origin server. 
 2001 For example, the request 2003 OPTIONS http://www.example.org:8001 HTTP/1.1 2005 would be forwarded by the final proxy as 2007 OPTIONS * HTTP/1.1 2008 Host: www.example.org:8001 2010 after connecting to port 8001 of host "www.example.org". 2012 5.4. Host 2014 The "Host" header field in a request provides the host and port 2015 information from the target URI, enabling the origin server to 2016 distinguish among resources while servicing requests for multiple 2017 host names on a single IP address. 2019 Host = uri-host [ ":" port ] ; Section 2.7.1 2021 A client MUST send a Host header field in all HTTP/1.1 request 2022 messages. If the target URI includes an authority component, then a 2023 client MUST send a field-value for Host that is identical to that 2024 authority component, excluding any userinfo subcomponent and its "@" 2025 delimiter (Section 2.7.1). If the authority component is missing or 2026 undefined for the target URI, then a client MUST send a Host header 2027 field with an empty field-value. 2029 Since the Host field-value is critical information for handling a 2030 request, a user agent SHOULD generate Host as the first header field 2031 following the request-line. 2033 For example, a GET request to the origin server for 2034 <http://www.example.org/pub/WWW/> would begin with: 2036 GET /pub/WWW/ HTTP/1.1 2037 Host: www.example.org 2039 A client MUST send a Host header field in an HTTP/1.1 request even if 2040 the request-target is in the absolute-form, since this allows the 2041 Host information to be forwarded through ancient HTTP/1.0 proxies 2042 that might not have implemented Host. 2044 When a proxy receives a request with an absolute-form of request- 2045 target, the proxy MUST ignore the received Host header field (if any) 2046 and instead replace it with the host information of the request- 2047 target.
A proxy that forwards such a request MUST generate a new 2048 Host field-value based on the received request-target rather than 2049 forward the received Host field-value. 2051 Since the Host header field acts as an application-level routing 2052 mechanism, it is a frequent target for malware seeking to poison a 2053 shared cache or redirect a request to an unintended server. An 2054 interception proxy is particularly vulnerable if it relies on the 2055 Host field-value for redirecting requests to internal servers, or for 2056 use as a cache key in a shared cache, without first verifying that 2057 the intercepted connection is targeting a valid IP address for that 2058 host. 2060 A server MUST respond with a 400 (Bad Request) status code to any 2061 HTTP/1.1 request message that lacks a Host header field and to any 2062 request message that contains more than one Host header field or a 2063 Host header field with an invalid field-value. 2065 5.5. Effective Request URI 2067 A server that receives an HTTP request message MUST reconstruct the 2068 user agent's original target URI, based on the pieces of information 2069 learned from the request-target, Host header field, and connection 2070 context, in order to identify the intended target resource and 2071 properly service the request. The URI derived from this 2072 reconstruction process is referred to as the "effective request URI". 2074 For a user agent, the effective request URI is the target URI. 2076 If the request-target is in absolute-form, then the effective request 2077 URI is the same as the request-target. Otherwise, the effective 2078 request URI is constructed as follows. 2080 If the request is received over a TLS-secured TCP connection, then 2081 the effective request URI's scheme is "https"; otherwise, the scheme 2082 is "http". 2084 If the request-target is in authority-form, then the effective 2085 request URI's authority component is the same as the request-target. 
2086 Otherwise, if a Host header field is supplied with a non-empty field- 2087 value, then the authority component is the same as the Host field- 2088 value. Otherwise, the authority component is the concatenation of 2089 the default host name configured for the server, a colon (":"), and 2090 the connection's incoming TCP port number in decimal form. 2092 If the request-target is in authority-form or asterisk-form, then the 2093 effective request URI's combined path and query component is empty. 2094 Otherwise, the combined path and query component is the same as the 2095 request-target. 2097 The components of the effective request URI, once determined as 2098 above, can be combined into absolute-URI form by concatenating the 2099 scheme, "://", authority, and combined path and query component. 2101 Example 1: the following message received over an insecure TCP 2102 connection 2104 GET /pub/WWW/TheProject.html HTTP/1.1 2105 Host: www.example.org:8080 2107 has an effective request URI of 2109 http://www.example.org:8080/pub/WWW/TheProject.html 2111 Example 2: the following message received over a TLS-secured TCP 2112 connection 2114 OPTIONS * HTTP/1.1 2115 Host: www.example.org 2117 has an effective request URI of 2119 https://www.example.org 2121 An origin server that does not allow resources to differ by requested 2122 host MAY ignore the Host field-value and instead replace it with a 2123 configured server name when constructing the effective request URI. 2125 Recipients of an HTTP/1.0 request that lacks a Host header field MAY 2126 attempt to use heuristics (e.g., examination of the URI path for 2127 something unique to a particular host) in order to guess the 2128 effective request URI's authority component. 2130 5.6. Associating a Response to a Request 2132 HTTP does not include a request identifier for associating a given 2133 request message with its corresponding one or more response messages. 
 2134 Hence, it relies on the order of response arrival to correspond 2135 exactly to the order in which requests are made on the same 2136 connection. More than one response message per request only occurs 2137 when one or more informational responses (1xx, see Section 6.2 of 2138 [Part2]) precede a final response to the same request. 2140 A client that has more than one outstanding request on a connection 2141 MUST maintain a list of outstanding requests in the order sent and 2142 MUST associate each received response message on that connection to 2143 the highest ordered request that has not yet received a final (non- 2144 1xx) response. 2146 5.7. Message Forwarding 2148 As described in Section 2.3, intermediaries can serve a variety of 2149 roles in the processing of HTTP requests and responses. Some 2150 intermediaries are used to improve performance or availability. 2151 Others are used for access control or to filter content. Since an 2152 HTTP stream has characteristics similar to a pipe-and-filter 2153 architecture, there are no inherent limits to the extent an 2154 intermediary can enhance (or interfere with) either direction of the 2155 stream. 2157 An intermediary not acting as a tunnel MUST implement the Connection 2158 header field, as specified in Section 6.1, and exclude fields from 2159 being forwarded that are only intended for the incoming connection. 2161 An intermediary MUST NOT forward a message to itself unless it is 2162 protected from an infinite request loop. In general, an intermediary 2163 ought to recognize its own server names, including any aliases, local 2164 variations, or literal IP addresses, and respond to such requests 2165 directly. 2167 5.7.1.
Via 2169 The "Via" header field indicates the presence of intermediate 2170 protocols and recipients between the user agent and the server (on 2171 requests) or between the origin server and the client (on responses), 2172 similar to the "Received" header field in email (Section 3.6.7 of 2173 [RFC5322]). Via can be used for tracking message forwards, avoiding 2174 request loops, and identifying the protocol capabilities of senders 2175 along the request/response chain. 2177 Via = 1#( received-protocol RWS received-by [ RWS comment ] ) 2179 received-protocol = [ protocol-name "/" ] protocol-version 2180 ; see Section 6.7 2181 received-by = ( uri-host [ ":" port ] ) / pseudonym 2182 pseudonym = token 2184 Multiple Via field values represent each proxy or gateway that has 2185 forwarded the message. Each intermediary appends its own information 2186 about how the message was received, such that the end result is 2187 ordered according to the sequence of forwarding recipients. 2189 A proxy MUST send an appropriate Via header field, as described 2190 below, in each message that it forwards. An HTTP-to-HTTP gateway 2191 MUST send an appropriate Via header field in each inbound request 2192 message and MAY send a Via header field in forwarded response 2193 messages. 2195 For each intermediary, the received-protocol indicates the protocol 2196 and protocol version used by the upstream sender of the message. 2197 Hence, the Via field value records the advertised protocol 2198 capabilities of the request/response chain such that they remain 2199 visible to downstream recipients; this can be useful for determining 2200 what backwards-incompatible features might be safe to use in 2201 response, or within a later request, as described in Section 2.6. 2202 For brevity, the protocol-name is omitted when the received protocol 2203 is HTTP. 
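The Via grammar above can be illustrated with a small parser. This is a minimal Python sketch, not a conforming implementation: the names ViaEntry, split_outside_comments, and parse_via are hypothetical, quoted-pair escapes inside comments are not handled, and the received-by value is kept as an opaque string. It does reflect the rule that an omitted protocol-name means HTTP.

```python
from typing import List, NamedTuple, Optional


class ViaEntry(NamedTuple):
    protocol_name: str
    protocol_version: str
    received_by: str
    comment: Optional[str]


def split_outside_comments(value: str) -> List[str]:
    """Split a field value on commas that are not inside a (comment)."""
    items, current, depth = [], [], 0
    for ch in value:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)
        if ch == "," and depth == 0:
            items.append("".join(current))
            current = []
        else:
            current.append(ch)
    items.append("".join(current))
    # Empty list elements are dropped.
    return [item.strip() for item in items if item.strip()]


def parse_via(value: str) -> List[ViaEntry]:
    """Parse a Via field value into entries, in forwarding order."""
    entries = []
    for item in split_outside_comments(value):
        parts = item.split(None, 2)
        if len(parts) < 2:
            raise ValueError("Via entry needs received-protocol and received-by")
        received_protocol, received_by = parts[0], parts[1]
        comment = parts[2] if len(parts) == 3 else None
        name, sep, version = received_protocol.rpartition("/")
        if not sep:
            # The protocol-name is omitted when the received protocol is HTTP.
            name = "HTTP"
        entries.append(ViaEntry(name, version, received_by, comment))
    return entries
```

Because each intermediary appends its own entry, the returned list is ordered from the first forwarding recipient to the most recent one.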
2205 The received-by field is normally the host and optional port number 2206 of a recipient server or client that subsequently forwarded the 2207 message. However, if the real host is considered to be sensitive 2208 information, a sender MAY replace it with a pseudonym. If a port is 2209 not provided, a recipient MAY interpret that as meaning it was 2210 received on the default TCP port, if any, for the received-protocol. 2212 A sender MAY generate comments in the Via header field to identify 2213 the software of each recipient, analogous to the User-Agent and 2214 Server header fields. However, all comments in the Via field are 2215 optional and a recipient MAY remove them prior to forwarding the 2216 message. 2218 For example, a request message could be sent from an HTTP/1.0 user 2219 agent to an internal proxy code-named "fred", which uses HTTP/1.1 to 2220 forward the request to a public proxy at p.example.net, which 2221 completes the request by forwarding it to the origin server at 2222 www.example.com. The request received by www.example.com would then 2223 have the following Via header field: 2225 Via: 1.0 fred, 1.1 p.example.net 2227 An intermediary used as a portal through a network firewall SHOULD 2228 NOT forward the names and ports of hosts within the firewall region 2229 unless it is explicitly enabled to do so. If not enabled, such an 2230 intermediary SHOULD replace each received-by host of any host behind 2231 the firewall by an appropriate pseudonym for that host. 2233 An intermediary MAY combine an ordered subsequence of Via header 2234 field entries into a single such entry if the entries have identical 2235 received-protocol values. For example, 2237 Via: 1.0 ricky, 1.1 ethel, 1.1 fred, 1.0 lucy 2239 could be collapsed to 2241 Via: 1.0 ricky, 1.1 mertz, 1.0 lucy 2243 A sender SHOULD NOT combine multiple entries unless they are all 2244 under the same organizational control and the hosts have already been 2245 replaced by pseudonyms. 
A sender MUST NOT combine entries that have 2246 different received-protocol values. 2248 5.7.2. Transformations 2250 Some intermediaries include features for transforming messages and 2251 their payloads. A transforming proxy might, for example, convert 2252 between image formats in order to save cache space or to reduce the 2253 amount of traffic on a slow link. However, operational problems 2254 might occur when these transformations are applied to payloads 2255 intended for critical applications, such as medical imaging or 2256 scientific data analysis, particularly when integrity checks or 2257 digital signatures are used to ensure that the payload received is 2258 identical to the original. 2260 If a proxy receives a request-target with a host name that is not a 2261 fully qualified domain name, it MAY add its own domain to the host 2262 name it received when forwarding the request. A proxy MUST NOT 2263 change the host name if it is a fully qualified domain name. 2265 A proxy MUST NOT modify the "absolute-path" and "query" parts of the 2266 received request-target when forwarding it to the next inbound 2267 server, except as noted above to replace an empty path with "/" or 2268 "*". 2270 A proxy MUST NOT modify header fields that provide information about 2271 the end points of the communication chain, the resource state, or the 2272 selected representation. A proxy MAY change the message body through 2273 application or removal of a transfer coding (Section 4). 2275 A non-transforming proxy MUST NOT modify the message payload (Section 2276 3.3 of [Part2]). A transforming proxy MUST NOT modify the payload of 2277 a message that contains the no-transform cache-control directive. 
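The no-transform rule above amounts to a simple check before a proxy touches a payload. The following Python sketch is illustrative only: the function names are hypothetical, and the comma split is a simplification that would misread a comma or the word no-transform appearing inside a quoted-string directive argument.

```python
from typing import Iterable


def has_no_transform(cache_control_values: Iterable[str]) -> bool:
    """Return True if any Cache-Control field value carries no-transform."""
    for value in cache_control_values:
        for directive in value.split(","):
            # Directive names are compared case-insensitively; any
            # "=argument" part is ignored for this check.
            name = directive.split("=", 1)[0].strip().lower()
            if name == "no-transform":
                return True
    return False


def may_transform_payload(is_transforming_proxy: bool,
                          cache_control_values: Iterable[str]) -> bool:
    """A non-transforming proxy never modifies the payload; a transforming
    proxy does so only when no-transform is absent."""
    return is_transforming_proxy and not has_no_transform(cache_control_values)
```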
2279 A transforming proxy MAY transform the payload of a message that does 2280 not contain the no-transform cache-control directive; if the payload 2281 is transformed, the transforming proxy MUST add a Warning header 2282 field with the warn-code of 214 ("Transformation Applied") if one 2283 does not already appear in the message (see Section 5.5 of [Part6]). 2284 If the payload of a 200 (OK) response is transformed, the 2285 transforming proxy can also inform downstream recipients that a 2286 transformation has been applied by changing the response status code 2287 to 203 (Non-Authoritative Information) (Section 6.3.4 of [Part2]). 2289 6. Connection Management 2291 HTTP messaging is independent of the underlying transport or session- 2292 layer connection protocol(s). HTTP only presumes a reliable 2293 transport with in-order delivery of requests and the corresponding 2294 in-order delivery of responses. The mapping of HTTP request and 2295 response structures onto the data units of an underlying transport 2296 protocol is outside the scope of this specification. 2298 As described in Section 5.2, the specific connection protocols to be 2299 used for an HTTP interaction are determined by client configuration 2300 and the target URI. For example, the "http" URI scheme 2301 (Section 2.7.1) indicates a default connection of TCP over IP, with a 2302 default TCP port of 80, but the client might be configured to use a 2303 proxy via some other connection, port, or protocol. 2305 HTTP implementations are expected to engage in connection management, 2306 which includes maintaining the state of current connections, 2307 establishing a new connection or reusing an existing connection, 2308 processing messages received on a connection, detecting connection 2309 failures, and closing each connection. Most clients maintain 2310 multiple connections in parallel, including more than one connection 2311 per server endpoint. 
Most servers are designed to maintain thousands 2312 of concurrent connections, while controlling request queues to enable 2313 fair use and detect denial of service attacks. 2315 6.1. Connection 2317 The "Connection" header field allows the sender to indicate desired 2318 control options for the current connection. In order to avoid 2319 confusing downstream recipients, a proxy or gateway MUST remove or 2320 replace any received connection options before forwarding the 2321 message. 2323 When a header field aside from Connection is used to supply control 2324 information for or about the current connection, the sender MUST list 2325 the corresponding field-name within the "Connection" header field. A 2326 proxy or gateway MUST parse a received Connection header field before 2327 a message is forwarded and, for each connection-option in this field, 2328 remove any header field(s) from the message with the same name as the 2329 connection-option, and then remove the Connection header field itself 2330 (or replace it with the intermediary's own connection options for the 2331 forwarded message). 2333 Hence, the Connection header field provides a declarative way of 2334 distinguishing header fields that are only intended for the immediate 2335 recipient ("hop-by-hop") from those fields that are intended for all 2336 recipients on the chain ("end-to-end"), enabling the message to be 2337 self-descriptive and allowing future connection-specific extensions 2338 to be deployed without fear that they will be blindly forwarded by 2339 older intermediaries. 2341 The Connection header field's value has the following grammar: 2343 Connection = 1#connection-option 2344 connection-option = token 2346 Connection options are case-insensitive. 2348 A sender MUST NOT send a connection option corresponding to a header 2349 field that is intended for all recipients of the payload. 
For 2350 example, Cache-Control is never appropriate as a connection option 2351 (Section 5.2 of [Part6]). 2353 The connection options do not always correspond to a header field 2354 present in the message, since a connection-specific header field 2355 might not be needed if there are no parameters associated with a 2356 connection option. In contrast, a connection-specific header field 2357 that is received without a corresponding connection option usually 2358 indicates that the field has been improperly forwarded by an 2359 intermediary and ought to be ignored by the recipient. 2361 When defining new connection options, specification authors ought to 2362 survey existing header field names and ensure that the new connection 2363 option does not share the same name as an already deployed header 2364 field. Defining a new connection option essentially reserves that 2365 potential field-name for carrying additional information related to 2366 the connection option, since it would be unwise for senders to use 2367 that field-name for anything else. 2369 The "close" connection option is defined for a sender to signal that 2370 this connection will be closed after completion of the response. For 2371 example, 2373 Connection: close 2375 in either the request or the response header fields indicates that 2376 the sender is going to close the connection after the current 2377 request/response is complete (Section 6.6). 2379 A client that does not support persistent connections MUST send the 2380 "close" connection option in every request message. 2382 A server that does not support persistent connections MUST send the 2383 "close" connection option in every response message that does not 2384 have a 1xx (Informational) status code. 2386 6.2. Establishment 2388 It is beyond the scope of this specification to describe how 2389 connections are established via various transport or session-layer 2390 protocols. Each connection applies to only one transport link. 2392 6.3. 
Persistence 2394 HTTP/1.1 defaults to the use of "persistent connections", allowing 2395 multiple requests and responses to be carried over a single 2396 connection. The "close" connection-option is used to signal that a 2397 connection will not persist after the current request/response. HTTP 2398 implementations SHOULD support persistent connections. 2400 A recipient determines whether a connection is persistent or not 2401 based on the most recently received message's protocol version and 2402 Connection header field (if any): 2404 o If the close connection option is present, the connection will not 2405 persist after the current response; else, 2407 o If the received protocol is HTTP/1.1 (or later), the connection 2408 will persist after the current response; else, 2410 o If the received protocol is HTTP/1.0, the "keep-alive" connection 2411 option is present, the recipient is not a proxy, and the recipient 2412 wishes to honor the HTTP/1.0 "keep-alive" mechanism, the 2413 connection will persist after the current response; otherwise, 2415 o The connection will close after the current response. 2417 A server MAY assume that an HTTP/1.1 client intends to maintain a 2418 persistent connection until a close connection option is received in 2419 a request. 2421 A client MAY reuse a persistent connection until it sends or receives 2422 a close connection option or receives an HTTP/1.0 response without a 2423 "keep-alive" connection option. 2425 In order to remain persistent, all messages on a connection need to 2426 have a self-defined message length (i.e., one not defined by closure 2427 of the connection), as described in Section 3.3. A server MUST read 2428 the entire request message body or close the connection after sending 2429 its response, since otherwise the remaining data on a persistent 2430 connection would be misinterpreted as the next request. 
Likewise, a 2431 client MUST read the entire response message body if it intends to 2432 reuse the same connection for a subsequent request. 2434 A proxy server MUST NOT maintain a persistent connection with an 2435 HTTP/1.0 client (see Section 19.7.1 of [RFC2068] for information and 2436 discussion of the problems with the Keep-Alive header field 2437 implemented by many HTTP/1.0 clients). 2439 Clients and servers SHOULD NOT assume that a persistent connection is 2440 maintained for HTTP versions less than 1.1 unless it is explicitly 2441 signaled. See Appendix A.1.2 for more information on backward 2442 compatibility with HTTP/1.0 clients. 2444 6.3.1. Retrying Requests 2446 Connections can be closed at any time, with or without intention. 2447 Implementations ought to anticipate the need to recover from 2448 asynchronous close events. 2450 When an inbound connection is closed prematurely, a client MAY open a 2451 new connection and automatically retransmit an aborted sequence of 2452 requests if all of those requests have idempotent methods (Section 2453 4.2.2 of [Part2]). A proxy MUST NOT automatically retry non- 2454 idempotent requests. 2456 A user agent MUST NOT automatically retry a request with a non- 2457 idempotent method unless it has some means to know that the request 2458 semantics are actually idempotent, regardless of the method, or some 2459 means to detect that the original request was never applied. For 2460 example, a user agent that knows (through design or configuration) 2461 that a POST request to a given resource is safe can repeat that 2462 request automatically. Likewise, a user agent designed specifically 2463 to operate on a version control repository might be able to recover 2464 from partial failure conditions by checking the target resource 2465 revision(s) after a failed connection, reverting or fixing any 2466 changes that were partially applied, and then automatically retrying 2467 the requests that failed. 
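The retry rules above can be summarized as a small decision function. This Python sketch is an illustration under stated assumptions: the names IDEMPOTENT_METHODS and may_auto_retry are hypothetical, the method set lists the methods that [Part2] defines as idempotent, and the flag semantics_known_idempotent stands in for whatever design or configuration knowledge a user agent might have.

```python
from typing import Iterable

# Methods defined as idempotent in [Part2]: the safe methods plus PUT
# and DELETE. Method names are case-sensitive.
IDEMPOTENT_METHODS = frozenset(
    {"GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE"}
)


def may_auto_retry(aborted_methods: Iterable[str],
                   is_proxy: bool = False,
                   semantics_known_idempotent: bool = False) -> bool:
    """Decide whether an aborted request sequence may be retried
    automatically on a new connection."""
    if all(method in IDEMPOTENT_METHODS for method in aborted_methods):
        return True
    if is_proxy:
        # A proxy never retries non-idempotent requests automatically.
        return False
    # A user agent may retry only when it knows the request semantics
    # are idempotent or can detect that the request was never applied.
    return semantics_known_idempotent
```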
2469 A client SHOULD NOT automatically retry a failed automatic retry. 2471 6.3.2. Pipelining 2473 A client that supports persistent connections MAY "pipeline" its 2474 requests (i.e., send multiple requests without waiting for each 2475 response). A server MAY process a sequence of pipelined requests in 2476 parallel if they all have safe methods (Section 4.2.1 of [Part2]), 2477 but MUST send the corresponding responses in the same order that the 2478 requests were received. 2480 A client that pipelines requests SHOULD retry unanswered requests if 2481 the connection closes before it receives all of the corresponding 2482 responses. When retrying pipelined requests after a failed 2483 connection (a connection not explicitly closed by the server in its 2484 last complete response), a client MUST NOT pipeline immediately after 2485 connection establishment, since the first remaining request in the 2486 prior pipeline might have caused an error response that can be lost 2487 again if multiple requests are sent on a prematurely closed 2488 connection (see the TCP reset problem described in Section 6.6). 2490 Idempotent methods (Section 4.2.2 of [Part2]) are significant to 2491 pipelining because they can be automatically retried after a 2492 connection failure. A user agent SHOULD NOT pipeline requests after 2493 a non-idempotent method, until the final response status code for 2494 that method has been received, unless the user agent has a means to 2495 detect and recover from partial failure conditions involving the 2496 pipelined sequence. 2498 An intermediary that receives pipelined requests MAY pipeline those 2499 requests when forwarding them inbound, since it can rely on the 2500 outbound user agent(s) to determine what requests can be safely 2501 pipelined. 
If the inbound connection fails before receiving a 2502 response, the pipelining intermediary MAY attempt to retry a sequence 2503 of requests that have yet to receive a response if the requests all 2504 have idempotent methods; otherwise, the pipelining intermediary 2505 SHOULD forward any received responses and then close the 2506 corresponding outbound connection(s) so that the outbound user 2507 agent(s) can recover accordingly. 2509 6.4. Concurrency 2511 A client SHOULD limit the number of simultaneous open connections 2512 that it maintains to a given server. 2514 Previous revisions of HTTP gave a specific number of connections as a 2515 ceiling, but this was found to be impractical for many applications. 2516 As a result, this specification does not mandate a particular maximum 2517 number of connections, but instead encourages clients to be 2518 conservative when opening multiple connections. 2520 Multiple connections are typically used to avoid the "head-of-line 2521 blocking" problem, wherein a request that takes significant server- 2522 side processing and/or has a large payload blocks subsequent requests 2523 on the same connection. However, each connection consumes server 2524 resources. Furthermore, using multiple connections can cause 2525 undesirable side effects in congested networks. 2527 Note that servers might reject traffic that they deem abusive, 2528 including an excessive number of connections from a client. 2530 6.5. Failures and Time-outs 2532 Servers will usually have some time-out value beyond which they will 2533 no longer maintain an inactive connection. Proxy servers might make 2534 this a higher value since it is likely that the client will be making 2535 more connections through the same server. The use of persistent 2536 connections places no requirements on the length (or existence) of 2537 this time-out for either the client or the server. 
2539 A client or server that wishes to time-out SHOULD issue a graceful 2540 close on the connection. Implementations SHOULD constantly monitor 2541 open connections for a received closure signal and respond to it as 2542 appropriate, since prompt closure of both sides of a connection 2543 enables allocated system resources to be reclaimed. 2545 A client, server, or proxy MAY close the transport connection at any 2546 time. For example, a client might have started to send a new request 2547 at the same time that the server has decided to close the "idle" 2548 connection. From the server's point of view, the connection is being 2549 closed while it was idle, but from the client's point of view, a 2550 request is in progress. 2552 A server SHOULD sustain persistent connections, when possible, and 2553 allow the underlying transport's flow control mechanisms to resolve 2554 temporary overloads, rather than terminate connections with the 2555 expectation that clients will retry. The latter technique can 2556 exacerbate network congestion. 2558 A client sending a message body SHOULD monitor the network connection 2559 for an error response while it is transmitting the request. If the 2560 client sees a response that indicates the server does not wish to 2561 receive the message body and is closing the connection, the client 2562 SHOULD immediately cease transmitting the body and close its side of 2563 the connection. 2565 6.6. Tear-down 2567 The Connection header field (Section 6.1) provides a "close" 2568 connection option that a sender SHOULD send when it wishes to close 2569 the connection after the current request/response pair. 2571 A client that sends a close connection option MUST NOT send further 2572 requests on that connection (after the one containing close) and MUST 2573 close the connection after reading the final response message 2574 corresponding to this request. 
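The tear-down rules in this section depend on recognizing the close connection option, which is one input to the persistence decision listed in Section 6.3. A minimal Python sketch of that decision follows. The function names and the (major, minor) version tuple are assumptions of this example, and the simple comma split presumes that connection options are plain tokens.

```python
from typing import Iterable, Set, Tuple


def connection_options(connection_values: Iterable[str]) -> Set[str]:
    """Collect the connection options from all Connection field values.

    Connection options are case-insensitive, so they are lowercased.
    """
    options = set()
    for value in connection_values:
        for token in value.split(","):
            token = token.strip().lower()
            if token:
                options.add(token)
    return options


def will_persist(version: Tuple[int, int],
                 connection_values: Iterable[str],
                 is_proxy: bool = False,
                 honor_keep_alive: bool = True) -> bool:
    """Apply the ordered persistence rules to the most recently received
    message's protocol version and Connection header field."""
    options = connection_options(connection_values)
    if "close" in options:
        return False
    if version >= (1, 1):
        return True
    if (version == (1, 0) and "keep-alive" in options
            and not is_proxy and honor_keep_alive):
        return True
    return False
```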
2576 A server that receives a close connection option MUST initiate a 2577 close of the connection (see below) after it sends the final response 2578 to the request that contained close. The server SHOULD send a close 2579 connection option in its final response on that connection. The 2580 server MUST NOT process any further requests received on that 2581 connection. 2583 A server that sends a close connection option MUST initiate a close 2584 of the connection (see below) after it sends the response containing 2585 close. The server MUST NOT process any further requests received on 2586 that connection. 2588 A client that receives a close connection option MUST cease sending 2589 requests on that connection and close the connection after reading 2590 the response message containing the close; if additional pipelined 2591 requests had been sent on the connection, the client SHOULD NOT 2592 assume that they will be processed by the server. 2594 If a server performs an immediate close of a TCP connection, there is 2595 a significant risk that the client will not be able to read the last 2596 HTTP response. If the server receives additional data from the 2597 client on a fully-closed connection, such as another request that was 2598 sent by the client before receiving the server's response, the 2599 server's TCP stack will send a reset packet to the client; 2600 unfortunately, the reset packet might erase the client's 2601 unacknowledged input buffers before they can be read and interpreted 2602 by the client's HTTP parser. 2604 To avoid the TCP reset problem, servers typically close a connection 2605 in stages. First, the server performs a half-close by closing only 2606 the write side of the read/write connection. 
The server then 2607 continues to read from the connection until it receives a 2608 corresponding close by the client, or until the server is reasonably 2609 certain that its own TCP stack has received the client's 2610 acknowledgement of the packet(s) containing the server's last 2611 response. Finally, the server fully closes the connection. 2613 It is unknown whether the reset problem is exclusive to TCP or might 2614 also be found in other transport connection protocols. 2616 6.7. Upgrade 2618 The "Upgrade" header field is intended to provide a simple mechanism 2619 for transitioning from HTTP/1.1 to some other protocol on the same 2620 connection. A client MAY send a list of protocols in the Upgrade 2621 header field of a request to invite the server to switch to one or 2622 more of those protocols, in order of descending preference, before 2623 sending the final response. A server MAY ignore a received Upgrade 2624 header field if it wishes to continue using the current protocol on 2625 that connection. 2627 Upgrade = 1#protocol 2629 protocol = protocol-name ["/" protocol-version] 2630 protocol-name = token 2631 protocol-version = token 2633 A server that sends a 101 (Switching Protocols) response MUST send an 2634 Upgrade header field to indicate the new protocol(s) to which the 2635 connection is being switched; if multiple protocol layers are being 2636 switched, the sender MUST list the protocols in layer-ascending 2637 order. A server MUST NOT switch to a protocol that was not indicated 2638 by the client in the corresponding request's Upgrade header field. A 2639 server MAY choose to ignore the order of preference indicated by the 2640 client and select the new protocol(s) based on other factors, such as 2641 the nature of the request or the current load on the server. 2643 A server that sends a 426 (Upgrade Required) response MUST send an 2644 Upgrade header field to indicate the acceptable protocols, in order 2645 of descending preference. 
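The Upgrade grammar and the server-side selection rule above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not a reference implementation: the names parse_upgrade and select_protocol are hypothetical, the set of supported protocols is supplied by the caller, and protocols are matched by exact comparison as a simplification. The sketch follows the client's order of preference, although a server is free to choose on other grounds.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Characters allowed in a token (tchar, Section 3.2.6).
_TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")

# (protocol-name, protocol-version or None)
Protocol = Tuple[str, Optional[str]]


def parse_upgrade(value: str) -> List[Protocol]:
    """Parse an Upgrade field value (Upgrade = 1#protocol)."""
    protocols: List[Protocol] = []
    for element in value.split(","):
        element = element.strip(" \t")
        if not element:
            continue  # recipients ignore empty list elements (Section 7)
        name, slash, version = element.partition("/")
        if not _TOKEN.fullmatch(name):
            raise ValueError("invalid protocol-name: %r" % name)
        if slash and not _TOKEN.fullmatch(version):
            raise ValueError("invalid protocol-version: %r" % version)
        protocols.append((name, version if slash else None))
    if not protocols:
        raise ValueError("Upgrade requires at least one protocol")
    return protocols


def select_protocol(offered: List[Protocol],
                    supported: Iterable[Protocol]) -> Optional[Protocol]:
    """Pick the first client-offered protocol that the server supports.

    Only protocols listed by the client are candidates, so the result
    is never a protocol the client did not indicate. None means the
    server keeps using the current protocol.
    """
    supported_set = set(supported)
    for protocol in offered:
        if protocol in supported_set:
            return protocol
    return None
```

A server that gets None back from select_protocol would simply ignore the Upgrade header field and answer the request in HTTP/1.1.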
2647 A server MAY send an Upgrade header field in any other response to 2648 advertise that it implements support for upgrading to the listed 2649 protocols, in order of descending preference, when appropriate for a 2650 future request. 2652 The following is a hypothetical example sent by a client: 2654 GET /hello.txt HTTP/1.1 2655 Host: www.example.com 2656 Connection: upgrade 2657 Upgrade: HTTP/2.0, SHTTP/1.3, IRC/6.9, RTA/x11 2659 Upgrade cannot be used to insist on a protocol change; its acceptance 2660 and use by the server are optional. The capabilities and nature of 2661 the application-level communication after the protocol change are 2662 entirely dependent upon the new protocol(s) chosen. However, 2663 immediately after sending the 101 response, the server is expected to 2664 continue responding to the original request as if it had received its 2665 equivalent within the new protocol (i.e., the server still has an 2666 outstanding request to satisfy after the protocol has been changed, 2667 and is expected to do so without requiring the request to be 2668 repeated). 2670 For example, if the Upgrade header field is received in a GET request 2671 and the server decides to switch protocols, it first responds with a 2672 101 (Switching Protocols) message in HTTP/1.1 and then immediately 2673 follows that with the new protocol's equivalent of a response to a 2674 GET on the target resource. This allows a connection to be upgraded 2675 to protocols with the same semantics as HTTP without the latency cost 2676 of an additional round-trip. A server MUST NOT switch protocols 2677 unless the received message semantics can be honored by the new 2678 protocol; an OPTIONS request can be honored by any protocol. 2680 The following is an example response to the above hypothetical 2681 request: 2683 HTTP/1.1 101 Switching Protocols 2684 Connection: upgrade 2685 Upgrade: HTTP/2.0 2687 [...
data stream switches to HTTP/2.0 with an appropriate response 2688 (as defined by new protocol) to the "GET /hello.txt" request ...] 2690 When Upgrade is sent, the sender MUST also send a Connection header 2691 field (Section 6.1) that contains an "upgrade" connection option, in 2692 order to prevent Upgrade from being accidentally forwarded by 2693 intermediaries that might not implement the listed protocols. A 2694 server MUST ignore an Upgrade header field that is received in an 2695 HTTP/1.0 request. 2697 A client cannot begin using an upgraded protocol on the connection 2698 until it has completely sent the request message (i.e., the client 2699 can't change the protocol it is sending in the middle of a message). 2700 If a server receives both Upgrade and an Expect header field with the 2701 "100-continue" expectation (Section 5.1.1 of [Part2]), the server 2702 MUST send a 100 (Continue) response before sending a 101 (Switching 2703 Protocols) response. 2705 The Upgrade header field only applies to switching protocols on top 2706 of the existing connection; it cannot be used to switch the 2707 underlying connection (transport) protocol, nor to switch the 2708 existing communication to a different connection. For those 2709 purposes, it is more appropriate to use a 3xx (Redirection) response 2710 (Section 6.4 of [Part2]). 2712 This specification only defines the protocol name "HTTP" for use by 2713 the family of Hypertext Transfer Protocols, as defined by the HTTP 2714 version rules of Section 2.6 and future updates to this 2715 specification. Additional tokens ought to be registered with IANA 2716 using the registration procedure defined in Section 8.6. 2718 7. ABNF list extension: #rule 2720 A #rule extension to the ABNF rules of [RFC5234] is used to improve 2721 readability in the definitions of some header field values. 2723 A construct "#" is defined, similar to "*", for defining comma- 2724 delimited lists of elements. 
The full form is "<n>#<m>element" 2725 indicating at least <n> and at most <m> elements, each separated by a 2726 single comma (",") and optional whitespace (OWS). 2728 Thus, a sender MUST expand the list construct as follows: 2730 1#element => element *( OWS "," OWS element ) 2732 and: 2734 #element => [ 1#element ] 2736 and for n >= 1 and m > 1: 2738 <n>#<m>element => element <n-1>*<m-1>( OWS "," OWS element ) 2740 For compatibility with legacy list rules, a recipient MUST parse and 2741 ignore a reasonable number of empty list elements: enough to handle 2742 common mistakes by senders that merge values, but not so much that 2743 they could be used as a denial of service mechanism. In other words, 2744 a recipient MUST expand the list construct as follows: 2746 #element => [ ( "," / element ) *( OWS "," [ OWS element ] ) ] 2748 1#element => *( "," OWS ) element *( OWS "," [ OWS element ] ) 2750 Empty elements do not contribute to the count of elements present. 2751 For example, given these ABNF productions: 2753 example-list = 1#example-list-elmt 2754 example-list-elmt = token ; see Section 3.2.6 2756 Then the following are valid values for example-list (not including 2757 the double quotes, which are present for delimitation only): 2759 "foo,bar" 2760 "foo ,bar," 2761 "foo , ,bar,charlie " 2763 In contrast, the following values would be invalid, since at least 2764 one non-empty element is required by the example-list production: 2766 "" 2767 "," 2768 ", ," 2770 Appendix B shows the collected ABNF after the list constructs have 2771 been expanded, as described above, for recipients. 2773 8. IANA Considerations 2775 8.1. Header Field Registration 2777 HTTP header fields are registered within the Message Header Field 2778 Registry maintained at 2779 .
2781 This document defines the following HTTP header fields, so their 2782 associated registry entries shall be updated according to the 2783 permanent registrations below (see [BCP90]): 2785 +-------------------+----------+----------+---------------+ 2786 | Header Field Name | Protocol | Status | Reference | 2787 +-------------------+----------+----------+---------------+ 2788 | Connection | http | standard | Section 6.1 | 2789 | Content-Length | http | standard | Section 3.3.2 | 2790 | Host | http | standard | Section 5.4 | 2791 | TE | http | standard | Section 4.3 | 2792 | Trailer | http | standard | Section 4.4 | 2793 | Transfer-Encoding | http | standard | Section 3.3.1 | 2794 | Upgrade | http | standard | Section 6.7 | 2795 | Via | http | standard | Section 5.7.1 | 2796 +-------------------+----------+----------+---------------+ 2798 Furthermore, the header field-name "Close" shall be registered as 2799 "reserved", since using that name as an HTTP header field might 2800 conflict with the "close" connection option of the "Connection" 2801 header field (Section 6.1). 2803 +-------------------+----------+----------+-------------+ 2804 | Header Field Name | Protocol | Status | Reference | 2805 +-------------------+----------+----------+-------------+ 2806 | Close | http | reserved | Section 8.1 | 2807 +-------------------+----------+----------+-------------+ 2809 The change controller is: "IETF (iesg@ietf.org) - Internet 2810 Engineering Task Force". 2812 8.2. URI Scheme Registration 2814 IANA maintains the registry of URI Schemes [BCP115] at 2815 . 
2817 This document defines the following URI schemes, so their associated 2818 registry entries shall be updated according to the permanent 2819 registrations below: 2821 +------------+------------------------------------+---------------+ 2822 | URI Scheme | Description | Reference | 2823 +------------+------------------------------------+---------------+ 2824 | http | Hypertext Transfer Protocol | Section 2.7.1 | 2825 | https | Hypertext Transfer Protocol Secure | Section 2.7.2 | 2826 +------------+------------------------------------+---------------+ 2828 8.3. Internet Media Type Registration 2830 IANA maintains the registry of Internet media types [BCP13] at 2831 . 2833 This document serves as the specification for the Internet media 2834 types "message/http" and "application/http". The following is to be 2835 registered with IANA. 2837 8.3.1. Internet Media Type message/http 2839 The message/http type can be used to enclose a single HTTP request or 2840 response message, provided that it obeys the MIME restrictions for 2841 all "message" types regarding line length and encodings. 2843 Type name: message 2845 Subtype name: http 2847 Required parameters: none 2848 Optional parameters: version, msgtype 2850 version: The HTTP-version number of the enclosed message (e.g., 2851 "1.1"). If not present, the version can be determined from the 2852 first line of the body. 2854 msgtype: The message type -- "request" or "response". If not 2855 present, the type can be determined from the first line of the 2856 body. 2858 Encoding considerations: only "7bit", "8bit", or "binary" are 2859 permitted 2861 Security considerations: none 2863 Interoperability considerations: none 2865 Published specification: This specification (see Section 8.3.1). 
2867 Applications that use this media type: 2869 Additional information: 2871 Magic number(s): none 2873 File extension(s): none 2875 Macintosh file type code(s): none 2877 Person and email address to contact for further information: See 2878 Authors Section. 2880 Intended usage: COMMON 2882 Restrictions on usage: none 2884 Author: See Authors Section. 2886 Change controller: IESG 2888 8.3.2. Internet Media Type application/http 2890 The application/http type can be used to enclose a pipeline of one or 2891 more HTTP request or response messages (not intermixed). 2893 Type name: application 2895 Subtype name: http 2897 Required parameters: none 2899 Optional parameters: version, msgtype 2901 version: The HTTP-version number of the enclosed messages (e.g., 2902 "1.1"). If not present, the version can be determined from the 2903 first line of the body. 2905 msgtype: The message type -- "request" or "response". If not 2906 present, the type can be determined from the first line of the 2907 body. 2909 Encoding considerations: HTTP messages enclosed by this type are in 2910 "binary" format; use of an appropriate Content-Transfer-Encoding 2911 is required when transmitted via E-mail. 2913 Security considerations: none 2915 Interoperability considerations: none 2917 Published specification: This specification (see Section 8.3.2). 2919 Applications that use this media type: 2921 Additional information: 2923 Magic number(s): none 2925 File extension(s): none 2927 Macintosh file type code(s): none 2929 Person and email address to contact for further information: See 2930 Authors Section. 2932 Intended usage: COMMON 2934 Restrictions on usage: none 2936 Author: See Authors Section. 2938 Change controller: IESG 2940 8.4. Transfer Coding Registry 2942 The HTTP Transfer Coding Registry defines the name space for transfer 2943 coding names. It is maintained at 2944 . 2946 8.4.1. 
Procedure 2948 Registrations MUST include the following fields: 2950 o Name 2952 o Description 2954 o Pointer to specification text 2956 Names of transfer codings MUST NOT overlap with names of content 2957 codings (Section 3.1.2.1 of [Part2]) unless the encoding 2958 transformation is identical, as is the case for the compression 2959 codings defined in Section 4.2. 2961 Values to be added to this name space require IETF Review (see 2962 Section 4.1 of [RFC5226]), and MUST conform to the purpose of 2963 transfer coding defined in this specification. 2965 Use of program names for the identification of encoding formats is 2966 not desirable and is discouraged for future encodings. 2968 8.4.2. Registration 2970 The HTTP Transfer Coding Registry shall be updated with the 2971 registrations below: 2973 +------------+--------------------------------------+---------------+ 2974 | Name | Description | Reference | 2975 +------------+--------------------------------------+---------------+ 2976 | chunked | Transfer in a series of chunks | Section 4.1 | 2977 | compress | UNIX "compress" data format [Welch] | Section 4.2.1 | 2978 | deflate | "deflate" compressed data | Section 4.2.2 | 2979 | | ([RFC1951]) inside the "zlib" data | | 2980 | | format ([RFC1950]) | | 2981 | gzip | GZIP file format [RFC1952] | Section 4.2.3 | 2982 | x-compress | Deprecated (alias for compress) | Section 4.2.1 | 2983 | x-gzip | Deprecated (alias for gzip) | Section 4.2.3 | 2984 +------------+--------------------------------------+---------------+ 2986 8.5. Content Coding Registration 2988 IANA maintains the registry of HTTP Content Codings at 2989 . 
2991 The HTTP Content Codings Registry shall be updated with the 2992 registrations below: 2994 +------------+--------------------------------------+---------------+ 2995 | Name | Description | Reference | 2996 +------------+--------------------------------------+---------------+ 2997 | compress | UNIX "compress" data format [Welch] | Section 4.2.1 | 2998 | deflate | "deflate" compressed data | Section 4.2.2 | 2999 | | ([RFC1951]) inside the "zlib" data | | 3000 | | format ([RFC1950]) | | 3001 | gzip | GZIP file format [RFC1952] | Section 4.2.3 | 3002 | x-compress | Deprecated (alias for compress) | Section 4.2.1 | 3003 | x-gzip | Deprecated (alias for gzip) | Section 4.2.3 | 3004 +------------+--------------------------------------+---------------+ 3006 8.6. Upgrade Token Registry 3008 The HTTP Upgrade Token Registry defines the name space for protocol- 3009 name tokens used to identify protocols in the Upgrade header field. 3010 The registry is maintained at 3011 . 3013 8.6.1. Procedure 3015 Each registered protocol name is associated with contact information 3016 and an optional set of specifications that details how the connection 3017 will be processed after it has been upgraded. 3019 Registrations happen on a "First Come First Served" basis (see 3020 Section 4.1 of [RFC5226]) and are subject to the following rules: 3022 1. A protocol-name token, once registered, stays registered forever. 3024 2. The registration MUST name a responsible party for the 3025 registration. 3027 3. The registration MUST name a point of contact. 3029 4. The registration MAY name a set of specifications associated with 3030 that token. Such specifications need not be publicly available. 3032 5. The registration SHOULD name a set of expected "protocol-version" 3033 tokens associated with that token at the time of registration. 3035 6. The responsible party MAY change the registration at any time. 
3036 The IANA will keep a record of all such changes, and make them 3037 available upon request. 3039 7. The IESG MAY reassign responsibility for a protocol token. This 3040 will normally only be used in the case when a responsible party 3041 cannot be contacted. 3043 This registration procedure for HTTP Upgrade Tokens replaces that 3044 previously defined in Section 7.2 of [RFC2817]. 3046 8.6.2. Upgrade Token Registration 3048 The "HTTP" entry in the HTTP Upgrade Token Registry shall be updated 3049 with the registration below: 3051 +-------+----------------------+----------------------+-------------+ 3052 | Value | Description | Expected Version | Reference | 3053 | | | Tokens | | 3054 +-------+----------------------+----------------------+-------------+ 3055 | HTTP | Hypertext Transfer | any DIGIT.DIGIT | Section 2.6 | 3056 | | Protocol | (e.g., "2.0") | | 3057 +-------+----------------------+----------------------+-------------+ 3059 The responsible party is: "IETF (iesg@ietf.org) - Internet 3060 Engineering Task Force". 3062 9. Security Considerations 3064 This section is meant to inform developers, information providers, 3065 and users of known security concerns relevant to HTTP/1.1 message 3066 syntax, parsing, and routing. 3068 9.1. DNS-related Attacks 3070 HTTP clients rely heavily on the Domain Name System (DNS), and are 3071 thus generally prone to security attacks based on the deliberate 3072 misassociation of IP addresses and DNS names not protected by DNSSEC. 3073 Clients need to be cautious in assuming the validity of an IP number/ 3074 DNS name association unless the response is protected by DNSSEC 3075 ([RFC4033]). 3077 9.2. Intermediaries and Caching 3079 By their very nature, HTTP intermediaries are men-in-the-middle, and 3080 represent an opportunity for man-in-the-middle attacks. Compromise 3081 of the systems on which the intermediaries run can result in serious 3082 security and privacy problems.
Intermediaries have access to 3083 security-related information, personal information about individual 3084 users and organizations, and proprietary information belonging to 3085 users and content providers. A compromised intermediary, or an 3086 intermediary implemented or configured without regard to security and 3087 privacy considerations, might be used in the commission of a wide 3088 range of potential attacks. 3090 Intermediaries that contain a shared cache are especially vulnerable 3091 to cache poisoning attacks. 3093 Implementers need to consider the privacy and security implications 3094 of their design and coding decisions, and of the configuration 3095 options they provide to operators (especially the default 3096 configuration). 3098 Users need to be aware that intermediaries are no more trustworthy 3099 than the people who run them; HTTP itself cannot solve this problem. 3101 9.3. Buffer Overflows 3103 Because HTTP uses mostly textual, character-delimited fields, 3104 attackers can overflow buffers in implementations, and/or perform a 3105 Denial of Service against implementations that accept fields with 3106 unlimited lengths. 3108 To promote interoperability, this specification makes specific 3109 recommendations for minimum size limits on request-line 3110 (Section 3.1.1) and header fields (Section 3.2). These are minimum 3111 recommendations, chosen to be supportable even by implementations 3112 with limited resources; it is expected that most implementations will 3113 choose substantially higher limits. 3115 This specification also provides a way for servers to reject messages 3116 that have request-targets that are too long (Section 6.5.12 of 3117 [Part2]) or request entities that are too large (Section 6.5 of 3118 [Part2]). Additional status codes related to capacity limits have 3119 been defined by extensions to HTTP [RFC6585]. 
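The size limits discussed above might be enforced along the following lines. This is a hedged Python sketch, not a prescribed implementation: the names read_bounded_line and check_request_head and both limit values are assumptions made for illustration. As a simplification, any over-long request-line is mapped to 414 (URI Too Long), and an over-large header section is mapped to 431 (Request Header Fields Too Large), one of the extension status codes of [RFC6585].

```python
import io
from typing import BinaryIO, Optional

# Assumed limits, chosen for illustration only; see Sections 3.1.1 and
# 3.2 for the recommendations that apply to real implementations.
MAX_REQUEST_LINE = 8000
MAX_HEADER_BYTES = 16384


class LineTooLong(Exception):
    """Raised when a line exceeds the configured octet limit."""


def read_bounded_line(stream: BinaryIO, limit: int) -> bytes:
    """Read one line, refusing to accept more than `limit` octets."""
    # Ask for one octet more than the limit so that an over-long line
    # is detected without buffering the remainder of it.
    line = stream.readline(limit + 1)
    if len(line) > limit:
        raise LineTooLong("line exceeds %d octets" % limit)
    return line


def check_request_head(stream: BinaryIO) -> Optional[int]:
    """Return an error status code, or None if the head is within limits."""
    try:
        read_bounded_line(stream, MAX_REQUEST_LINE)
    except LineTooLong:
        return 414  # simplification: treat the whole line as the target
    remaining = MAX_HEADER_BYTES
    while True:
        try:
            line = read_bounded_line(stream, remaining)
        except LineTooLong:
            return 431
        if line in (b"\r\n", b"\n", b""):
            return None  # blank line (or end of input) ends the header section
        remaining -= len(line)  # line terminators count toward the limit


demo = io.BytesIO(b"GET / HTTP/1.1\r\nHost: www.example.com\r\n\r\n")
print(check_request_head(demo))  # prints None
```

Reading with an explicit bound, rather than reading a whole line and checking its length afterwards, keeps memory use fixed no matter how much data a sender supplies.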
3121 Recipients ought to carefully limit the extent to which they read 3122 other fields, including (but not limited to) request methods, 3123 response status phrases, header field-names, and body chunks, so as 3124 to avoid denial of service attacks without impeding interoperability. 3126 9.4. Message Integrity 3128 HTTP does not define a specific mechanism for ensuring message 3129 integrity, instead relying on the error-detection ability of 3130 underlying transport protocols and the use of length or chunk- 3131 delimited framing to detect completeness. Additional integrity 3132 mechanisms, such as hash functions or digital signatures applied to 3133 the content, can be selectively added to messages via extensible 3134 metadata header fields. Historically, the lack of a single integrity 3135 mechanism has been justified by the informal nature of most HTTP 3136 communication. However, the prevalence of HTTP as an information 3137 access mechanism has resulted in its increasing use within 3138 environments where verification of message integrity is crucial. 3140 User agents are encouraged to implement configurable means for 3141 detecting and reporting failures of message integrity such that those 3142 means can be enabled within environments for which integrity is 3143 necessary. For example, a browser being used to view medical history 3144 or drug interaction information needs to indicate to the user when 3145 such information is detected by the protocol to be incomplete, 3146 expired, or corrupted during transfer. Such mechanisms might be 3147 selectively enabled via user agent extensions or the presence of 3148 message integrity metadata in a response. At a minimum, user agents 3149 ought to provide some indication that allows a user to distinguish 3150 between a complete and incomplete response message (Section 3.4) when 3151 such verification is desired. 3153 9.5. 
Server Log Information 3155 A server is in the position to save personal data about a user's 3156 requests over time, which might identify their reading patterns or 3157 subjects of interest. In particular, log information gathered at an 3158 intermediary often contains a history of user agent interaction, 3159 across a multitude of sites, that can be traced to individual users. 3161 HTTP log information is confidential in nature; its handling is often 3162 constrained by laws and regulations. Log information needs to be 3163 securely stored and appropriate guidelines followed for its analysis. 3164 Anonymization of personal information within individual entries 3165 helps, but is generally not sufficient to prevent real log traces 3166 from being re-identified based on correlation with other access 3167 characteristics. As such, access traces that are keyed to a specific 3168 client are unsafe to publish even if the key is pseudonymous. 3170 To minimize the risk of theft or accidental publication, log 3171 information ought to be purged of personally identifiable 3172 information, including user identifiers, IP addresses, and user- 3173 provided query parameters, as soon as that information is no longer 3174 necessary to support operational needs for security, auditing, or 3175 fraud control. 3177 10. Acknowledgments 3179 This edition of HTTP/1.1 builds on the many contributions that went 3180 into RFC 1945, RFC 2068, RFC 2145, and RFC 2616, including 3181 substantial contributions made by the previous authors, editors, and 3182 working group chairs: Tim Berners-Lee, Ari Luotonen, Roy T. Fielding, 3183 Henrik Frystyk Nielsen, Jim Gettys, Jeffrey C. Mogul, Larry Masinter, 3184 and Paul J. Leach. Mark Nottingham oversaw this effort as working 3185 group chair. 
3187 Since 1999, the following contributors have helped improve the HTTP 3188 specification by reporting bugs, asking smart questions, drafting or 3189 reviewing text, and evaluating open issues: 3191 Adam Barth, Adam Roach, Addison Phillips, Adrian Chadd, Adrien W. de 3192 Croy, Alan Ford, Alan Ruttenberg, Albert Lunde, Alek Storm, Alex 3193 Rousskov, Alexandre Morgaut, Alexey Melnikov, Alisha Smith, Amichai 3194 Rothman, Amit Klein, Amos Jeffries, Andreas Maier, Andreas Petersson, 3195 Andrei Popov, Anil Sharma, Anne van Kesteren, Anthony Bryan, Asbjorn 3196 Ulsberg, Ashok Kumar, Balachander Krishnamurthy, Barry Leiba, Ben 3197 Laurie, Benjamin Carlyle, Benjamin Niven-Jenkins, Bil Corry, Bill 3198 Burke, Bjoern Hoehrmann, Bob Scheifler, Boris Zbarsky, Brett Slatkin, 3199 Brian Kell, Brian McBarron, Brian Pane, Brian Raymor, Brian Smith, 3200 Bryce Nesbitt, Cameron Heavon-Jones, Carl Kugler, Carsten Bormann, 3201 Charles Fry, Chris Newman, Cyrus Daboo, Dale Robert Anderson, Dan 3202 Wing, Dan Winship, Daniel Stenberg, Darrel Miller, Dave Cridland, 3203 Dave Crocker, Dave Kristol, Dave Thaler, David Booth, David Singer, 3204 David W. Morris, Diwakar Shetty, Dmitry Kurochkin, Drummond Reed, 3205 Duane Wessels, Edward Lee, Eitan Adler, Eliot Lear, Emile Stephan, 3206 Eran Hammer-Lahav, Eric D. Williams, Eric J. Bowman, Eric Lawrence, 3207 Eric Rescorla, Erik Aronesty, EungJun Yi, Evan Prodromou, Felix 3208 Geisendoerfer, Florian Weimer, Frank Ellermann, Fred Akalin, Fred 3209 Bohle, Frederic Kayser, Gabor Molnar, Gabriel Montenegro, Geoffrey 3210 Sneddon, Gervase Markham, Gili Tzabari, Grahame Grieve, Greg Wilkins, 3211 Grzegorz Calkowski, Harald Tveit Alvestrand, Harry Halpin, Helge 3212 Hess, Henrik Nordstrom, Henry S. Thompson, Henry Story, Herbert van 3213 de Sompel, Herve Ruellan, Howard Melman, Hugo Haas, Ian Fette, Ian 3214 Hickson, Ido Safruti, Ilari Liusvaara, Ilya Grigorik, Ingo Struck, J. 3215 Ross Nicoll, James Cloos, James H. 
Manger, James Lacey, James M. 3216 Snell, Jamie Lokier, Jan Algermissen, Jeff Hodges (who came up with 3217 the term 'effective Request-URI'), Jeff Pinner, Jeff Walden, Jim 3218 Luther, Jitu Padhye, Joe D. Williams, Joe Gregorio, Joe Orton, John 3219 C. Klensin, John C. Mallery, John Cowan, John Kemp, John Panzer, John 3220 Schneider, John Stracke, John Sullivan, Jonas Sicking, Jonathan A. 3221 Rees, Jonathan Billington, Jonathan Moore, Jonathan Silvera, Jordi 3222 Ros, Joris Dobbelsteen, Josh Cohen, Julien Pierre, Jungshik Shin, 3223 Justin Chapweske, Justin Erenkrantz, Justin James, Kalvinder Singh, 3224 Karl Dubost, Keith Hoffman, Keith Moore, Ken Murchison, Koen Holtman, 3225 Konstantin Voronkov, Kris Zyp, Leif Hedstrom, Lisa Dusseault, Maciej 3226 Stachowiak, Manu Sporny, Marc Schneider, Marc Slemko, Mark Baker, 3227 Mark Pauley, Mark Watson, Markus Isomaki, Markus Lanthaler, Martin J. 3228 Duerst, Martin Musatov, Martin Nilsson, Martin Thomson, Matt Lynch, 3229 Matthew Cox, Max Clark, Michael Burrows, Michael Hausenblas, Michael 3230 Scharf, Michael Sweet, Michael Tuexen, Michael Welzl, Mike Amundsen, 3231 Mike Belshe, Mike Bishop, Mike Kelly, Mike Schinkel, Miles Sabin, 3232 Murray S. Kucherawy, Mykyta Yevstifeyev, Nathan Rixham, Nicholas 3233 Shanks, Nico Williams, Nicolas Alvarez, Nicolas Mailhot, Noah Slater, 3234 Osama Mazahir, Pablo Castro, Pat Hayes, Patrick R. McManus, Paul E. 3235 Jones, Paul Hoffman, Paul Marquess, Peter Lepeska, Peter Occil, Peter 3236 Saint-Andre, Peter Watkins, Phil Archer, Philippe Mougin, Phillip 3237 Hallam-Baker, Piotr Dobrogost, Poul-Henning Kamp, Preethi Natarajan, 3238 Rajeev Bector, Ray Polk, Reto Bachmann-Gmuer, Richard Cyganiak, Robby 3239 Simpson, Robert Brewer, Robert Collins, Robert Mattson, Robert 3240 O'Callahan, Robert Olofsson, Robert Sayre, Robert Siemer, Robert de 3241 Wilde, Roberto Javier Godoy, Roberto Peon, Roland Zink, Ronny 3242 Widjaja, Ryan Hamilton, S. 
Mike Dierken, Salvatore Loreto, Sam 3243 Johnston, Sam Pullara, Sam Ruby, Saurabh Kulkarni, Scott Lawrence 3244 (who maintained the original issues list), Sean B. Palmer, Sebastien 3245 Barnoud, Shane McCarron, Shigeki Ohtsu, Stefan Eissing, Stefan 3246 Tilkov, Stefanos Harhalakis, Stephane Bortzmeyer, Stephen Farrell, 3247 Stephen Ludin, Stuart Williams, Subbu Allamaraju, Subramanian 3248 Moonesamy, Sylvain Hellegouarch, Tapan Divekar, Tatsuhiro Tsujikawa, 3249 Tatsuya Hayashi, Ted Hardie, Thomas Broyer, Thomas Fossati, Thomas 3250 Maslen, Thomas Nordin, Thomas Roessler, Tim Bray, Tim Morgan, Tim 3251 Olsen, Tom Zhou, Travis Snoozy, Tyler Close, Vincent Murphy, Wenbo 3252 Zhu, Werner Baumann, Wilbur Streett, Wilfredo Sanchez Vega, William 3253 A. Rowe Jr., William Chan, Willy Tarreau, Xiaoshu Wang, Yaron Goland, 3254 Yngve Nysaeter Pettersen, Yoav Nir, Yogesh Bang, Yuchung Cheng, 3255 Yutaka Oiwa, Yves Lafon (long-time member of the editor team), Zed A. 3256 Shaw, and Zhong Yu. 3258 See Section 16 of [RFC2616] for additional acknowledgements from 3259 prior revisions. 3261 11. References 3263 11.1. Normative References 3265 [Part2] Fielding, R., Ed. and J. Reschke, Ed., "Hypertext 3266 Transfer Protocol (HTTP/1.1): Semantics and Content", 3267 draft-ietf-httpbis-p2-semantics-25 (work in progress), 3268 November 2013. 3270 [Part4] Fielding, R., Ed. and J. Reschke, Ed., "Hypertext 3271 Transfer Protocol (HTTP/1.1): Conditional Requests", 3272 draft-ietf-httpbis-p4-conditional-25 (work in 3273 progress), November 2013. 3275 [Part5] Fielding, R., Ed., Lafon, Y., Ed., and J. Reschke, Ed., 3276 "Hypertext Transfer Protocol (HTTP/1.1): Range 3277 Requests", draft-ietf-httpbis-p5-range-25 (work in 3278 progress), November 2013. 3280 [Part6] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, 3281 Ed., "Hypertext Transfer Protocol (HTTP/1.1): Caching", 3282 draft-ietf-httpbis-p6-cache-25 (work in progress), 3283 November 2013. 3285 [Part7] Fielding, R., Ed. and J. 
Reschke, Ed., "Hypertext 3286 Transfer Protocol (HTTP/1.1): Authentication", 3287 draft-ietf-httpbis-p7-auth-25 (work in progress), 3288 November 2013. 3290 [RFC0793] Postel, J., "Transmission Control Protocol", STD 7, 3291 RFC 793, September 1981. 3293 [RFC1950] Deutsch, L. and J-L. Gailly, "ZLIB Compressed Data 3294 Format Specification version 3.3", RFC 1950, May 1996. 3296 [RFC1951] Deutsch, P., "DEFLATE Compressed Data Format 3297 Specification version 1.3", RFC 1951, May 1996. 3299 [RFC1952] Deutsch, P., Gailly, J-L., Adler, M., Deutsch, L., and 3300 G. Randers-Pehrson, "GZIP file format specification 3301 version 4.3", RFC 1952, May 1996. 3303 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 3304 Requirement Levels", BCP 14, RFC 2119, March 1997. 3306 [RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, 3307 "Uniform Resource Identifier (URI): Generic Syntax", 3308 STD 66, RFC 3986, January 2005. 3310 [RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for 3311 Syntax Specifications: ABNF", STD 68, RFC 5234, 3312 January 2008. 3314 [USASCII] American National Standards Institute, "Coded Character 3315 Set -- 7-bit American Standard Code for Information 3316 Interchange", ANSI X3.4, 1986. 3318 [Welch] Welch, T., "A Technique for High Performance Data 3319 Compression", IEEE Computer 17(6), June 1984. 3321 11.2. Informative References 3323 [BCP115] Hansen, T., Hardie, T., and L. Masinter, "Guidelines 3324 and Registration Procedures for New URI Schemes", 3325 BCP 115, RFC 4395, February 2006. 3327 [BCP13] Freed, N., Klensin, J., and T. Hansen, "Media Type 3328 Specifications and Registration Procedures", BCP 13, 3329 RFC 6838, January 2013. 3331 [BCP90] Klyne, G., Nottingham, M., and J. Mogul, "Registration 3332 Procedures for Message Header Fields", BCP 90, 3333 RFC 3864, September 2004. 

   [ISO-8859-1]  International Organization for Standardization,
                 "Information technology -- 8-bit single-byte coded
                 graphic character sets -- Part 1: Latin alphabet No.
                 1", ISO/IEC 8859-1:1998, 1998.

   [Kri2001]     Kristol, D., "HTTP Cookies: Standards, Privacy, and
                 Politics", ACM Transactions on Internet
                 Technology 1(2), November 2001, .

   [RFC1919]     Chatel, M., "Classical versus Transparent IP Proxies",
                 RFC 1919, March 1996.

   [RFC1945]     Berners-Lee, T., Fielding, R., and H. Nielsen,
                 "Hypertext Transfer Protocol -- HTTP/1.0", RFC 1945,
                 May 1996.

   [RFC2045]     Freed, N. and N. Borenstein, "Multipurpose Internet
                 Mail Extensions (MIME) Part One: Format of Internet
                 Message Bodies", RFC 2045, November 1996.

   [RFC2047]     Moore, K., "MIME (Multipurpose Internet Mail
                 Extensions) Part Three: Message Header Extensions for
                 Non-ASCII Text", RFC 2047, November 1996.

   [RFC2068]     Fielding, R., Gettys, J., Mogul, J., Nielsen, H., and
                 T. Berners-Lee, "Hypertext Transfer Protocol --
                 HTTP/1.1", RFC 2068, January 1997.

   [RFC2145]     Mogul, J., Fielding, R., Gettys, J., and H. Nielsen,
                 "Use and Interpretation of HTTP Version Numbers",
                 RFC 2145, May 1997.

   [RFC2616]     Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
                 Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
                 Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

   [RFC2817]     Khare, R. and S. Lawrence, "Upgrading to TLS Within
                 HTTP/1.1", RFC 2817, May 2000.

   [RFC2818]     Rescorla, E., "HTTP Over TLS", RFC 2818, May 2000.

   [RFC3040]     Cooper, I., Melve, I., and G. Tomlinson, "Internet Web
                 Replication and Caching Taxonomy", RFC 3040,
                 January 2001.

   [RFC4033]     Arends, R., Austein, R., Larson, M., Massey, D., and S.
                 Rose, "DNS Security Introduction and Requirements",
                 RFC 4033, March 2005.

   [RFC4559]     Jaganathan, K., Zhu, L., and J.
                 Brezak, "SPNEGO-based
                 Kerberos and NTLM HTTP Authentication in Microsoft
                 Windows", RFC 4559, June 2006.

   [RFC5226]     Narten, T. and H. Alvestrand, "Guidelines for Writing
                 an IANA Considerations Section in RFCs", BCP 26,
                 RFC 5226, May 2008.

   [RFC5246]     Dierks, T. and E. Rescorla, "The Transport Layer
                 Security (TLS) Protocol Version 1.2", RFC 5246,
                 August 2008.

   [RFC5322]     Resnick, P., "Internet Message Format", RFC 5322,
                 October 2008.

   [RFC6265]     Barth, A., "HTTP State Management Mechanism", RFC 6265,
                 April 2011.

   [RFC6585]     Nottingham, M. and R. Fielding, "Additional HTTP Status
                 Codes", RFC 6585, April 2012.

Appendix A.  HTTP Version History

   HTTP has been in use by the World-Wide Web global information
   initiative since 1990. The first version of HTTP, later referred to
   as HTTP/0.9, was a simple protocol for hypertext data transfer across
   the Internet with only a single request method (GET) and no metadata.
   HTTP/1.0, as defined by [RFC1945], added a range of request methods
   and MIME-like messaging that could include metadata about the data
   transferred and modifiers on the request/response semantics.
   However, HTTP/1.0 did not sufficiently take into consideration the
   effects of hierarchical proxies, caching, the need for persistent
   connections, or name-based virtual hosts. The proliferation of
   incompletely-implemented applications calling themselves "HTTP/1.0"
   further necessitated a protocol version change in order for two
   communicating applications to determine each other's true
   capabilities.

   HTTP/1.1 remains compatible with HTTP/1.0 by including more stringent
   requirements that enable reliable implementations, adding only those
   new features that will either be safely ignored by an HTTP/1.0
   recipient or only sent when communicating with a party advertising
   conformance with HTTP/1.1.
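
   As a rough illustration of "a party advertising conformance with
   HTTP/1.1", the sketch below parses an HTTP-version string following
   the collected ABNF (HTTP-name "/" DIGIT "." DIGIT, with a
   case-sensitive "HTTP") and reports whether the advertised version is
   1.1 or a later 1.x. The function names are hypothetical and chosen
   only for this example; this is one possible reading, not a normative
   algorithm.

```python
import re

# HTTP-version = HTTP-name "/" DIGIT "." DIGIT, where HTTP-name is the
# case-sensitive literal "HTTP" and each number is a single digit.
_HTTP_VERSION = re.compile(r"HTTP/([0-9])\.([0-9])")


def parse_http_version(text):
    """Return (major, minor) for a valid HTTP-version, else None."""
    match = _HTTP_VERSION.fullmatch(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def peer_supports_http11(advertised_version):
    """Hypothetical helper: True if the peer advertises HTTP/1.1 or a
    later 1.x minor version, so 1.1-only features could be sent."""
    version = parse_http_version(advertised_version)
    if version is None:
        return False
    major, minor = version
    return major == 1 and minor >= 1
```

   A sender could consult peer_supports_http11() before using a feature
   that an HTTP/1.0 recipient cannot safely ignore.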

   It is beyond the scope of a protocol specification to mandate
   conformance with previous versions. HTTP/1.1 was deliberately
   designed, however, to make supporting previous versions easy. We
   would expect a general-purpose HTTP/1.1 server to understand any
   valid request in the format of HTTP/1.0 and respond appropriately
   with an HTTP/1.1 message that only uses features understood (or
   safely ignored) by HTTP/1.0 clients. Likewise, we would expect an
   HTTP/1.1 client to understand any valid HTTP/1.0 response.

   Since HTTP/0.9 did not support header fields in a request, there is
   no mechanism for it to support name-based virtual hosts (selection of
   resource by inspection of the Host header field). Any server that
   implements name-based virtual hosts ought to disable support for
   HTTP/0.9. Most requests that appear to be HTTP/0.9 are, in fact,
   badly constructed HTTP/1.x requests wherein a buggy client failed to
   properly encode linear whitespace found in a URI reference and placed
   in the request-target.

A.1.  Changes from HTTP/1.0

   This section summarizes major differences between versions HTTP/1.0
   and HTTP/1.1.

A.1.1.  Multi-homed Web Servers

   The requirements that clients and servers support the Host header
   field (Section 5.4), report an error if it is missing from an
   HTTP/1.1 request, and accept absolute URIs (Section 5.3) are among
   the most important changes defined by HTTP/1.1.

   Older HTTP/1.0 clients assumed a one-to-one relationship of IP
   addresses and servers; there was no other established mechanism for
   distinguishing the intended server of a request than the IP address
   to which that request was directed.
   The Host header field was
   introduced during the development of HTTP/1.1 and, though it was
   quickly implemented by most HTTP/1.0 browsers, additional
   requirements were placed on all HTTP/1.1 requests in order to ensure
   complete adoption. At the time of this writing, most HTTP-based
   services are dependent upon the Host header field for targeting
   requests.

A.1.2.  Keep-Alive Connections

   In HTTP/1.0, each connection is established by the client prior to
   the request and closed by the server after sending the response.
   However, some implementations support the explicitly negotiated
   ("Keep-Alive") version of persistent connections described in Section
   19.7.1 of [RFC2068].

   Some clients and servers might wish to be compatible with these
   previous approaches to persistent connections, by explicitly
   negotiating for them with a "Connection: keep-alive" request header
   field. However, some experimental implementations of HTTP/1.0
   persistent connections are faulty; for example, if an HTTP/1.0 proxy
   server doesn't understand Connection, it will erroneously forward
   that header field to the next inbound server, which would result in a
   hung connection.

   One attempted solution was the introduction of a Proxy-Connection
   header field, targeted specifically at proxies. In practice, this
   was also unworkable, because proxies are often deployed in multiple
   layers, bringing about the same problem discussed above.

   As a result, clients are encouraged not to send the Proxy-Connection
   header field in any requests.
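
   The forwarding failure described above comes from an intermediary
   passing along header fields that were only meant for one connection.
   The following is a minimal sketch, assuming header fields are held
   as a list of (field-name, field-value) pairs, of how a proxy that
   does understand Connection might drop that field and every field it
   names before forwarding. The function name is hypothetical, and
   dropping Proxy-Connection as well is an assumption of this sketch
   rather than something this appendix requires.

```python
def strip_connection_fields(headers):
    """Return the header fields a proxy could forward.

    `headers` is a list of (field-name, field-value) pairs. Field names
    are compared case-insensitively.
    """
    # Collect the connection options named in any Connection field value.
    named = set()
    for name, value in headers:
        if name.lower() == "connection":
            for option in value.split(","):
                option = option.strip().lower()
                if option:
                    named.add(option)

    # Connection itself is not forwarded. Proxy-Connection is dropped
    # too, which is an assumption of this sketch.
    dropped = named | {"connection", "proxy-connection"}
    return [(n, v) for n, v in headers if n.lower() not in dropped]


request_headers = [
    ("Host", "example.com"),
    ("Connection", "keep-alive"),
    ("Keep-Alive", "timeout=5"),
    ("Proxy-Connection", "keep-alive"),
    ("Accept", "*/*"),
]
forwarded = strip_connection_fields(request_headers)
```

   With the sample request above, only the Host and Accept fields
   remain in the forwarded list.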

   Clients are also encouraged to consider the use of "Connection:
   keep-alive" in requests carefully; while it can enable persistent
   connections with HTTP/1.0 servers, clients using it will need to
   monitor the connection for "hung" requests (which indicate that the
   client ought to stop sending the header field), and this mechanism
   ought not to be used by clients at all when a proxy is being used.

A.1.3.  Introduction of Transfer-Encoding

   HTTP/1.1 introduces the Transfer-Encoding header field
   (Section 3.3.1). Transfer codings need to be decoded prior to
   forwarding an HTTP message over a MIME-compliant protocol.

A.2.  Changes from RFC 2616

   HTTP's approach to error handling has been explained. (Section 2.5)

   The HTTP-version ABNF production has been clarified to be
   case-sensitive. Additionally, version numbers have been restricted
   to single digits, because implementations are known to handle
   multi-digit version numbers incorrectly. (Section 2.6)

   Userinfo (i.e., username and password) are now disallowed in HTTP and
   HTTPS URIs, because of security issues related to their transmission
   on the wire. (Section 2.7.1)

   The HTTPS URI scheme is now defined by this specification;
   previously, it was done in Section 2.4 of [RFC2818]. Furthermore, it
   implies end-to-end security. (Section 2.7.2)

   HTTP messages can be (and often are) buffered by implementations;
   despite it sometimes being available as a stream, HTTP is
   fundamentally a message-oriented protocol. Minimum supported sizes
   for various protocol elements have been suggested, to improve
   interoperability. (Section 3)

   Invalid whitespace around field-names is now required to be rejected,
   because accepting it represents a security vulnerability. The ABNF
   productions defining header fields now only list the field value.
   (Section 3.2)

   Rules about implicit linear whitespace between certain grammar
   productions have been removed; now whitespace is only allowed where
   specifically defined in the ABNF. (Section 3.2.3)

   Header fields that span multiple lines ("line folding") are
   deprecated. (Section 3.2.4)

   The NUL octet is no longer allowed in comment and quoted-string text,
   and handling of backslash-escaping in them has been clarified. The
   quoted-pair rule no longer allows escaping control characters other
   than HTAB. Non-ASCII content in header fields and the reason phrase
   has been obsoleted and made opaque (the TEXT rule was removed).
   (Section 3.2.6)

   Bogus "Content-Length" header fields are now required to be handled
   as errors by recipients. (Section 3.3.2)

   The algorithm for determining the message body length has been
   clarified to indicate all of the special cases (e.g., driven by
   methods or status codes) that affect it, and that new protocol
   elements cannot define such special cases. CONNECT is a new, special
   case in determining message body length. "multipart/byteranges" is no
   longer a way of determining message body length.
   (Section 3.3.3)

   The "identity" transfer coding token has been removed. (Sections 3.3
   and 4)

   Chunk length does not include the count of the octets in the chunk
   header and trailer. Line folding in chunk extensions is disallowed.
   (Section 4.1)

   The meaning of the "deflate" content coding has been clarified.
   (Section 4.2.2)

   The segment + query components of RFC 3986 have been used to define
   the request-target, instead of abs_path from RFC 1808. The
   asterisk-form of the request-target is only allowed with the OPTIONS
   method. (Section 5.3)

   The term "Effective Request URI" has been introduced. (Section 5.5)

   Gateways do not need to generate Via header fields anymore.
   (Section 5.7.1)

   Exactly when "close" connection options have to be sent has been
   clarified. Also, "hop-by-hop" header fields are required to appear
   in the Connection header field; just because they're defined as
   hop-by-hop in this specification doesn't exempt them. (Section 6.1)

   The limit of two connections per server has been removed. An
   idempotent sequence of requests is no longer required to be retried.
   The requirement to retry requests under certain circumstances when
   the server prematurely closes the connection has been removed. Also,
   some extraneous requirements about when servers are allowed to close
   connections prematurely have been removed. (Section 6.3)

   The semantics of the Upgrade header field are now defined in
   responses other than 101 (this was incorporated from [RFC2817]).
   Furthermore, the ordering in the field value is now significant.
   (Section 6.7)

   Empty list elements in list productions (e.g., a list header field
   containing ", ,") have been deprecated. (Section 7)

   Registration of Transfer Codings now requires IETF Review.
   (Section 8.4)

   This specification now defines the Upgrade Token Registry, previously
   defined in Section 7.2 of [RFC2817]. (Section 8.6)

   The expectation to support HTTP/0.9 requests has been removed.
   (Appendix A)

   Issues with the Keep-Alive and Proxy-Connection header fields in
   requests are pointed out, with use of the latter being discouraged
   altogether. (Appendix A.1.2)

Appendix B.  Collected ABNF

   BWS = OWS

   Connection = *( "," OWS ) connection-option *( OWS "," [ OWS
    connection-option ] )
   Content-Length = 1*DIGIT

   HTTP-message = start-line *( header-field CRLF ) CRLF [ message-body
    ]
   HTTP-name = %x48.54.54.50 ; HTTP
   HTTP-version = HTTP-name "/" DIGIT "."
    DIGIT
   Host = uri-host [ ":" port ]

   OWS = *( SP / HTAB )

   RWS = 1*( SP / HTAB )

   TE = [ ( "," / t-codings ) *( OWS "," [ OWS t-codings ] ) ]
   Trailer = *( "," OWS ) field-name *( OWS "," [ OWS field-name ] )
   Transfer-Encoding = *( "," OWS ) transfer-coding *( OWS "," [ OWS
    transfer-coding ] )

   URI-reference =
   Upgrade = *( "," OWS ) protocol *( OWS "," [ OWS protocol ] )

   Via = *( "," OWS ) ( received-protocol RWS received-by [ RWS comment
    ] ) *( OWS "," [ OWS ( received-protocol RWS received-by [ RWS
    comment ] ) ] )

   absolute-URI =
   absolute-form = absolute-URI
   absolute-path = 1*( "/" segment )
   asterisk-form = "*"
   attribute = token
   authority =
   authority-form = authority

   chunk = chunk-size [ chunk-ext ] CRLF chunk-data CRLF
   chunk-data = 1*OCTET
   chunk-ext = *( ";" chunk-ext-name [ "=" chunk-ext-val ] )
   chunk-ext-name = token
   chunk-ext-val = token / quoted-str-nf
   chunk-size = 1*HEXDIG
   chunked-body = *chunk last-chunk trailer-part CRLF
   comment = "(" *( ctext / quoted-cpair / comment ) ")"
   connection-option = token
   ctext = HTAB / SP / %x21-27 ; '!'-'''
    / %x2A-5B ; '*'-'['
    / %x5D-7E ; ']'-'~'
    / obs-text

   field-content = *( HTAB / SP / VCHAR / obs-text )
   field-name = token
   field-value = *( field-content / obs-fold )
   fragment =

   header-field = field-name ":" OWS field-value OWS
   http-URI = "http://" authority path-abempty [ "?" query ] [ "#"
    fragment ]
   https-URI = "https://" authority path-abempty [ "?" query ] [ "#"
    fragment ]

   last-chunk = 1*"0" [ chunk-ext ] CRLF

   message-body = *OCTET
   method = token

   obs-fold = CRLF ( SP / HTAB )
   obs-text = %x80-FF
   origin-form = absolute-path [ "?" query ]

   partial-URI = relative-part [ "?"
    query ]
   path-abempty =
   port =
   protocol = protocol-name [ "/" protocol-version ]
   protocol-name = token
   protocol-version = token
   pseudonym = token

   qdtext = HTAB / SP / "!" / %x23-5B ; '#'-'['
    / %x5D-7E ; ']'-'~'
    / obs-text
   qdtext-nf = HTAB / SP / "!" / %x23-5B ; '#'-'['
    / %x5D-7E ; ']'-'~'
    / obs-text
   query =
   quoted-cpair = "\" ( HTAB / SP / VCHAR / obs-text )
   quoted-pair = "\" ( HTAB / SP / VCHAR / obs-text )
   quoted-str-nf = DQUOTE *( qdtext-nf / quoted-pair ) DQUOTE
   quoted-string = DQUOTE *( qdtext / quoted-pair ) DQUOTE

   rank = ( "0" [ "." *3DIGIT ] ) / ( "1" [ "." *3"0" ] )
   reason-phrase = *( HTAB / SP / VCHAR / obs-text )
   received-by = ( uri-host [ ":" port ] ) / pseudonym
   received-protocol = [ protocol-name "/" ] protocol-version
   relative-part =
   request-line = method SP request-target SP HTTP-version CRLF
   request-target = origin-form / absolute-form / authority-form /
    asterisk-form

   segment =
   special = "(" / ")" / "<" / ">" / "@" / "," / ";" / ":" / "\" /
    DQUOTE / "/" / "[" / "]" / "?" / "=" / "{" / "}"
   start-line = request-line / status-line
   status-code = 3DIGIT
   status-line = HTTP-version SP status-code SP reason-phrase CRLF

   t-codings = "trailers" / ( transfer-coding [ t-ranking ] )
   t-ranking = OWS ";" OWS "q=" rank
   tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "." /
    "^" / "_" / "`" / "|" / "~" / DIGIT / ALPHA
   token = 1*tchar
   trailer-part = *( header-field CRLF )
   transfer-coding = "chunked" / "compress" / "deflate" / "gzip" /
    transfer-extension
   transfer-extension = token *( OWS ";" OWS transfer-parameter )
   transfer-parameter = attribute BWS "=" BWS value

   uri-host =

   value = word

   word = token / quoted-string

Appendix C.  Change Log (to be removed by RFC Editor before publication)

C.1.
Since RFC 2616

   Changes up to the IETF Last Call draft are summarized in .

C.2.  Since draft-ietf-httpbis-p1-messaging-24

   Closed issues:

   o  : "APPSDIR review of draft-ietf-httpbis-p1-messaging-24"

   o  : "integer value parsing"

   o  : "move IANA registrations to correct draft"

Index

   A
      absolute-form (of request-target)  42
      accelerator  10
      application/http Media Type  61
      asterisk-form (of request-target)  42
      authority-form (of request-target)  42

   B
      browser  7

   C
      cache  11
      cacheable  12
      captive portal  11
      chunked (Coding Format)  28, 31, 35
      client  7
      close  49, 55
      compress (Coding Format)  38
      connection  7
      Connection header field  49, 55
      Content-Length header field  30

   D
      deflate (Coding Format)  38
      downstream  9

   E
      effective request URI  44

   G
      gateway  10
      Grammar
         absolute-form  41
         absolute-path  16
         absolute-URI  16
         ALPHA  6
         asterisk-form  41
         attribute  35
         authority  16
         authority-form  41
         BWS  24
         chunk  35-36
         chunk-data  35-36
         chunk-ext  35-36
         chunk-ext-name  35-36
         chunk-ext-val  35-36
         chunk-size  35-36
         chunked-body  35-36
         comment  27
         Connection  50
         connection-option  50
         Content-Length  30
         CR  6
         CRLF  6
         ctext  27
         CTL  6
         date2  35
         date3  35
         DIGIT  6
         DQUOTE  6
         field-content  22
         field-name  22
         field-value  22
         fragment  16
         header-field  22
         HEXDIG  6
         Host  43
         HTAB  6
         HTTP-message  19
         HTTP-name  14
         http-URI  17
         HTTP-version  14
         https-URI  18
         last-chunk  35-36
         LF  6
         message-body  27
         method  21
         obs-fold  22
         obs-text  27
         OCTET  6
         origin-form  41
         OWS  24
         partial-URI  16
         port  16
         protocol-name  47
         protocol-version  47
         pseudonym  47
         qdtext  27
         qdtext-nf  35-36
         query  16
         quoted-cpair  27
         quoted-pair  27
         quoted-str-nf  35-36
         quoted-string  27
         rank  39
         reason-phrase  22
         received-by  47
         received-protocol  47
         request-line  21
         request-target  41
         RWS  24
         segment  16
         SP  6
         special  26
         start-line  21
         status-code  22
         status-line  22
         t-codings  39
         t-ranking  39
         tchar  26
         TE  39
         token  26
         Trailer  40
         trailer-part  35-37
         transfer-coding  35
         Transfer-Encoding  28
         transfer-extension  35
         transfer-parameter  35
         Upgrade  56
         uri-host  16
         URI-reference  16
         value  35
         VCHAR  6
         Via  47
         word  26
      gzip (Coding Format)  38

   H
      header field  19
      header section  19
      headers  19
      Host header field  43
      http URI scheme  17
      https URI scheme  18

   I
      inbound  9
      interception proxy  11
      intermediary  9

   M
      Media Type
         application/http  61
         message/http  60
      message  7
      message/http Media Type  60
      method  21

   N
      non-transforming proxy  10

   O
      origin server  7
      origin-form (of request-target)  41
      outbound  9

   P
      proxy  10

   R
      recipient  7
      request  7
      request-target  21
      resource  16
      response  7
      reverse proxy  10

   S
      sender  7
      server  7
      spider  7

   T
      target resource  40
      target URI  40
      TE header field  38
      Trailer header field  40
      Transfer-Encoding header field  28
      transforming proxy  10
      transparent proxy  11
      tunnel  11

   U
      Upgrade header field  56
      upstream  9
      URI scheme
         http  17
         https  18
      user agent  7

   V
      Via header field  46

Authors' Addresses

   Roy T. Fielding (editor)
   Adobe Systems Incorporated
   345 Park Ave
   San Jose, CA 95110
   USA

   EMail: fielding@gbiv.com
   URI:   http://roy.gbiv.com/

   Julian F.
   Reschke (editor)
   greenbytes GmbH
   Hafenweg 16
   Muenster, NW 48155
   Germany

   EMail: julian.reschke@greenbytes.de
   URI:   http://greenbytes.de/tech/webdav/