idnits 2.17.00 (12 Aug 2021)

/tmp/idnits30304/draft-kucherawy-email-caps-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (April 25, 2014) is 2941 days in the past. Is this
     intentional?

  Checking references for intended status: Best Current Practice
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Downref: Normative reference to an Informational RFC: RFC 5598

  -- Obsolete informational reference (is this intentional?): RFC 733
     (Obsoleted by RFC 822)

  -- Obsolete informational reference (is this intentional?): RFC 821
     (Obsoleted by RFC 2821)

  == Outdated reference: draft-ietf-appsawg-rrvs-header-field has been
     published as RFC 7293

     Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group M.
Kucherawy
Internet-Draft                                            April 25, 2014
Intended status: BCP
Expires: October 27, 2014

              Architectural Approaches for Enhancing Email
                      draft-kucherawy-email-caps-02

Abstract

   This document provides guidance regarding architectural decisions
   made when developing enhancements to the Internet message service
   ("email").

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 27, 2014.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Architectural Guidance
   3.  Enhancement History
   4.  The Protocol
   5.  The Message
   6.  Header vs. Envelope
   7.  Deployment Observations and Results
   8.  Consequences of Faulty Design
   9.  Security Considerations
   10. IANA Considerations
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Appendix A.  Acknowledgments

1. Introduction

   The email service is fully described in [RFC5598]. It has two core
   components: the message payload, and the transfer protocol that
   conveys it.

   For various reasons discussed later in this document, it is common
   for enhancements to the service to be made in undesirable ways. This
   document first presents some basic architectural recommendations to
   be considered when enhancing the service, and then describes why
   these recommendations are apt and provides some history for them.

2. Architectural Guidance

   When enhancing the email service, it is critical to identify the
   precise nature of the enhancement. Specifically, an enhancement will
   affect either the message payload or the way the payload is
   transferred, but rarely both.

   Simply put:

   o  An enhancement that affects the content of the message in some way
      (such as metadata about how to display the payload, a digital
      signature ensuring payload integrity, indications of the handling
      history of the payload, etc.) is best implemented in ways that
      alter the content somehow, such as addition of header fields or
      addition of Multipurpose Internet Mail Extensions (MIME) metadata
      (see [RFC2045]).

   o  An enhancement that purely affects payload transport, and is not
      meant to be recorded beyond delivery of the message to a mailbox,
      is best implemented in a way that extends the delivery protocol
      itself and not in a way that alters the payload.

   Enhancements that affect both transport and content are rare, and
   require special attention to this important boundary.

3. Enhancement History

   As stated above, the email service is primarily composed of two
   specifications: the format of the payload, and the method by which
   the payload is transferred from one handling agent to the next.

   The message format was originally fully specified in [RFC0733],
   though it has some antecedents in the RFC archive. The current
   format specification is [RFC5322]. This format describes two
   sections: a "header" and a "body". Generally speaking, the body
   contains the primary content of the message itself, while the header
   conveys some metadata such as who sent it, to whom it was sent, where
   replies are to be directed, how it should be displayed, etc. One
   notable exception is the Subject: header field, which is essentially
   part of the content.

   The Simple Mail Transfer Protocol (SMTP) was originally fully
   specified in [RFC0821], though it too was based on some other
   previous work. The current specification is [RFC5321]. The protocol
   is essentially a simple ASCII dialog between a client system and the
   server system that exchanges a couple of identifiers -- who the
   message is "from" and who it is "to" -- and then the message itself,
   with status codes as the responses at each step.

   The partition between these two has often been blurred as a result of
   the original design and implementation of the service. It was simply
   not always made clear what the best way is to add extensions.

   A number of enhancements to both of these have appeared over the
   years, which are too numerous to list here.
   They range in popularity
   and deployment. Some of these are enhancements to format (such as
   the addition of multimedia support), others to the protocol (such as
   enhanced error handling), and a few have augmented both.

   The original and increased complexity of the service has led to a
   body of deployed code that has in turn had some impacts on the
   development of enhancements over time. This often leads to
   enhancements that are developed in ways that contradict the advice
   presented in Section 2. This can have unfortunate consequences, as
   described below.

4. The Protocol

   The Simple Mail Transfer Protocol (SMTP) is the language spoken by
   email clients and servers to exchange messages. The protocol is all
   in printable ASCII, which makes it easy for users to "speak" the
   protocol directly for the purposes of testing, debugging, or
   illustration.

   Essentially, the client introduces itself to the server, which
   replies with a similar greeting. The client declares that it has a
   message from a given party for delivery to one or more parties,
   followed by a declaration that it is ready to send the content. When
   the server is ready, the client relays its payload (the message) to
   the server. Finally, the server accepts the message, usually
   returning a code to the client that uniquely identifies this
   transaction so that later analysis of the specific transaction is
   possible. This sequence can repeat if the client has multiple
   messages to relay during the same SMTP session. When no more
   relaying is to be done, the two politely disconnect, and the dialog
   is complete.

   One could make the analogy of a person (perhaps a postal worker)
   speaking to another person (perhaps at home) and the former handing
   the latter a sealed envelope bearing a sender address and a recipient
   address.
   The contents of the envelope are not known to either of
   these parties at this stage; the exchange does not require it.

   An important point here is that once the exchange is complete, the
   first party no longer has the message. This is one of the
   intentional properties of the email service; the message always
   exists in exactly one place. The notable exception is the period
   where transmission of the message is complete but not acknowledged;
   for that brief period, the message exists in two places.

   An envelope, in this illustration, can name more than one recipient.
   An agent holding a message with such an envelope may find it must
   next relay the message to multiple independent servers to complete
   delivery to each recipient. In this case, that agent clones
   ("splits") the envelope, resulting in multiple envelopes each with a
   subset of the previous recipient set, but with identical content.

5. The Message

   The email message conveys the content of a message from one or more
   authors to one or more recipients. The message consists of a header
   and a body. The body is the primary content, and in modern terms it
   can contain unstructured plain text, structured multimedia, or
   nothing at all. The header consists of a set of header fields that
   include metadata about the content, such as identifying the party
   (or parties) that generated it, which agents handled it in transit,
   the date and time at which it was generated, the (apparent) set of
   intended recipients, etc. In the case of structured content, the
   header also contains the initial set of details needed to extract the
   structure.
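
   The header/body division described above can be illustrated with a
   minimal sketch using the Python standard library "email" package.
   The message text, the addresses, and the helper name split_message
   are assumptions made for illustration only; they do not come from
   any specification.

```python
from email import message_from_string
from email.policy import default

# Hypothetical message text; the addresses use reserved example domains.
RAW_MESSAGE = """\
From: Alice <alice@example.org>
To: Bob <bob@example.net>
Subject: Lunch

Shall we meet at noon?
"""


def split_message(raw):
    """Return (header_fields, body) for a simple, non-multipart message."""
    msg = message_from_string(raw, policy=default)
    # The header is an ordered set of (field name, field value) pairs.
    header_fields = [(name, str(value)) for name, value in msg.items()]
    # Everything after the first blank line is the body.
    body = msg.get_content()
    return header_fields, body


fields, body = split_message(RAW_MESSAGE)
for name, value in fields:
    print(f"{name}: {value}")
print("---")
print(body, end="")
```

   The blank line after the Subject field is the divider between the
   header and the body; the parser relies on nothing else to separate
   the two.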

   If one imagines a printed memo, with fields like "From", "To",
   "Subject", "Date", and perhaps "Cc", it is easy to envision a simple
   email message; these fields are at the top, separated by some kind of
   divider (which might be just an extra blank line or two) followed by
   the body of the memo. It is in this image that the email format was
   also created.

6. Header vs. Envelope

   It is useful to carefully distinguish the separation of function of
   the message header versus the SMTP envelope when considering the
   design of any enhancement to the email service. There are tradeoffs
   in the choice of enhancement approach. One tends to gain easier
   adoption, but has less handling control. The other is much more
   difficult for adoption, but offers much greater handling control.

   The most distinctive aspect of the separation is that the addresses
   in the envelope, used during transfer, can be entirely different from
   the addresses contained in the message header. So the SMTP return
   address (MAIL FROM) can be different from the message author (From:
   header field), and the list of SMTP recipient addresses (RCPT TO) can
   be entirely different from the recipients listed in the message
   header (To, Cc, and Bcc header fields).

   Thus, what's in that example printed memo in the previous section is
   completely independent of what was on the envelope that contained it.
   The memo might say "From: Alice" and "To: Bob", while the envelope
   said "From: Charlie" and "To: Deborah". More generally, there is no
   guarantee that the content and the transport have any relationship at
   all.

   An example of non-core material that is rightly a property of the
   message and not the envelope includes digital signatures of the
   payload. One might think of the mark or seal of a notary, which is
   meant to certify the content and not the envelope containing it.
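
   The independence of envelope and header addresses described earlier
   in this section, and the envelope "split" described in Section 4,
   can be sketched as follows. The Envelope class, the function names,
   and all addresses are hypothetical and exist only for illustration.
   Grouping recipients by domain is a simplifying assumption; a real
   agent would group by the next server it must contact.

```python
from dataclasses import dataclass
from email.message import EmailMessage


@dataclass(frozen=True)
class Envelope:
    """Hypothetical container for SMTP envelope data (not a library type)."""
    mail_from: str   # the MAIL FROM address
    rcpt_to: tuple   # the RCPT TO addresses


def build_memo():
    """Build the memo: the header says it is from Alice, to Bob."""
    msg = EmailMessage()
    msg["From"] = "alice@example.org"
    msg["To"] = "bob@example.net"
    msg["Subject"] = "Memo"
    msg.set_content("The memo text.")
    return msg


def addresses_agree(msg, env):
    """True only if the envelope happens to match the header fields."""
    return str(msg["From"]) == env.mail_from and str(msg["To"]) in env.rcpt_to


def split_by_domain(env):
    """Clone the envelope into one envelope per recipient domain.

    Assumption: one domain stands in for one next-hop server.
    """
    groups = {}
    for rcpt in env.rcpt_to:
        domain = rcpt.rsplit("@", 1)[1].lower()
        groups.setdefault(domain, []).append(rcpt)
    return [Envelope(env.mail_from, tuple(rcpts)) for rcpts in groups.values()]


memo = build_memo()
# The envelope says the message is from Charlie, to Deborah and others.
envelope = Envelope(
    mail_from="charlie@example.com",
    rcpt_to=("deborah@example.com", "erin@example.com", "frank@example.net"),
)
pieces = split_by_domain(envelope)
print(addresses_agree(memo, envelope))
print(pieces)
```

   Nothing in the memo changes when the envelope is split; each
   resulting envelope would accompany identical content. For
   comparison, Python's smtplib.SMTP.send_message() accepts from_addr
   and to_addrs arguments separately from the message object, which
   reflects the same separation.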

   SMTP also has the notion of "Trace Information" which is a record of
   the agents that handled the message prior to delivery and when they
   each processed the message. One might think of a premium package
   handling service that includes tracking as part of its product,
   showing through which stations the package was carried and a date/
   time at each. Email trace information fulfills the same goal, and is
   normally recorded as Received header fields.

   Also recorded in the header, at the time of delivery only, is the
   "from" portion of the envelope, to permit a reply to be sent to the
   correct place. This is recorded in a field called Return-Path.

   Any message can be forwarded by a user or a piece of software (such
   as a mailing list service). In this case, it is appropriate to think
   of the message as taking on a new life beyond its original delivery;
   that is, it is delivered to the entity that will forward it, and
   takes on a new life, with a new envelope and possibly a new or
   revised header, or even augmented content. Caution must be taken
   when constructing a new header so that information relevant only to
   the original delivery does not get forwarded; this leakage of
   information can lead to mishandling of the content or even leakage of
   private information to the new recipient(s). [RRVS] provides an
   example of such risks.

7. Deployment Observations and Results

   As the email service grew in popularity, it also became a popular
   target for abuse. In particular, it became a vector for delivery of
   unwanted commercial email ("spam") or even malicious active content
   ("malware", such as viruses or worms). These attempt to exploit user
   trust (and naivete) in order to deliver undesirable content. Among
   other things, false or misleading From and Subject fields on messages
   are commonplace.

   Mail User Agents (MUAs) retrieve messages from message stores, and
   not from the Message Transfer Agents (MTAs) or Message Delivery
   Agents (MDAs) that effect transport and delivery of messages. They
   do not have access to the parameters exchanged during the protocol
   sessions that resulted in the delivery. This led to various
   enhancements done as message header fields, rather than enhancements
   to SMTP, or to MUA access protocols such as the Internet Message
   Access Protocol (IMAP) or Post Office Protocol (POP).

   The rise in abusive emails, with the abuse almost entirely aimed at
   exploiting deficiencies in content handling and presentation (see
   [RFC7103]), produced a requirement for email-handling agents
   (primarily MTAs) to be enhanced with powerful mechanisms for
   analyzing and even modifying messages. Given the considerable range
   of different ad hoc enhancements that have been made to message
   formats, discussed above, this requires significant flexibility in
   the mechanisms for making decisions about, or even altering, header
   fields in messages as they are processed. By contrast, very little
   in the way of messaging abuse takes place via misuse of SMTP or its
   extensions.

   Furthermore, SMTP is the infrastructure mechanism for message
   handling, and infrastructures are always markedly more difficult to
   modify, especially when the infrastructure is under a series of
   independent administrative controls, but must somehow come to be
   coordinated in their enhancements. This again contrasts with the
   handling of the payload itself, where only the agent generating the
   content and the agent that will ultimately interpret it -- by
   presenting it to a user -- need to understand it.

   This has resulted in the current environment, in which it is often
   very easy to add, alter, remove, and analyze header fields on a
   message, and typically very difficult if not impossible to add or
   process an SMTP extension for which built-in support does not already
   exist.

   An MTA or MDA advertises the SMTP extensions it supports, through the
   EHLO command reply. A client that supports a particular extension
   can therefore easily determine its applicability with the server with
   which it is interacting. If that server does not include such
   support, the client must decide to do one of two things:

   a.  consider the delivery a failure, and begin processing it as an
       error; or

   b.  relay the message anyway, losing the capability afforded by the
       extension.

   In contrast to this negotiation mechanism at the level of SMTP, there
   is no control exchange for support of header field enhancements.
   They are present or not, and the client agent has no way to determine
   whether their semantics are supported by the next handling agent (or
   the recipient). However, an MTA or MDA that does not understand a
   particular header field will almost always simply ignore that header
   field and continue to relay it, usually unmodified, to software
   downstream that does recognize the field and how to use its contents.
   A good example of this is MIME, whose header fields are typically of
   use only to MUAs and are ignored by MTAs and MDAs. Moreover, MUAs
   typically do not include header fields they don't recognize in the
   material ultimately presented to the end user.

   Enhancements done using header fields can be enormously useful when
   one wishes to deploy a new capability that will not affect or be
   affected by non-participating agents and is not intended for direct
   human consumption.
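
   The EHLO-based negotiation described earlier in this section, and
   the two choices (a) and (b) open to a client whose desired extension
   is not offered, can be sketched as follows. The function names, the
   sample reply lines, and the keyword "EXAMPLEEXT" are hypothetical;
   the parsing is deliberately simplified and is not a complete
   implementation of the reply syntax in [RFC5321].

```python
def parse_ehlo_keywords(reply_lines):
    """Extract extension keywords from the lines of a 250 EHLO reply."""
    keywords = set()
    # The first line carries the server's greeting, not an extension.
    for line in reply_lines[1:]:
        text = line[4:].strip()  # drop the "250-" or "250 " prefix
        if text:
            # The keyword is the first token; parameters may follow it.
            keywords.add(text.split()[0].upper())
    return keywords


def choose_action(extension, reply_lines, must_honor):
    """Decide what a client does for a message that wants `extension`."""
    if extension.upper() in parse_ehlo_keywords(reply_lines):
        return "use-extension"
    # Choice (a): treat the delivery as a failure.
    # Choice (b): relay anyway and lose the capability.
    return "fail" if must_honor else "relay-without-extension"


# Hypothetical reply from a server that offers three extensions.
EHLO_REPLY = [
    "250-mx.example.net greets client.example.org",
    "250-8BITMIME",
    "250-SIZE 10485760",
    "250 DSN",
]

print(choose_action("DSN", EHLO_REPLY, must_honor=True))
print(choose_action("EXAMPLEEXT", EHLO_REPLY, must_honor=True))
print(choose_action("EXAMPLEEXT", EHLO_REPLY, must_honor=False))
```

   No comparable check exists for a header field: the client cannot
   ask, it can only include the field and hope that downstream software
   recognizes it. In Python's smtplib, a connected client can make the
   equivalent query with SMTP.ehlo() followed by SMTP.has_extn().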

   As a result, it is common to assume that adding a new capability to
   the email service is best accomplished by creating (and hopefully,
   registering) a new header field specific to that purpose, even if
   that capability would more properly be implemented as an SMTP
   extension.

   In a few very rare cases, new capabilities have even been developed
   that include both header field and SMTP extension forms. [RRVS]
   again serves as a useful example.

8. Consequences of Faulty Design

   Using the header for enhancements that do not fit the envelope vs.
   content model may be convenient given the current deployed
   environment, but it results in such issues as:

   o  inadvertent leakage of data not relevant to later message
      recipients if the message gets forwarded;

   o  no guarantee that any agent in the handling path understands the
      enhancement or the details associated with it, leading to
      unexpected results;

   o  for messages going to multiple recipients, the possible
      inadvertent revelation of private information when the message is
      "fanned out".

   For more discussion, see Section 7.2 of [RFC5321], Section 3.6.3 of
   [RFC5322], and Section 7 of [RRVS].

   MTA and MDA implementers need to ensure that SMTP extensions can be
   added and handled via the runtime environment as easily as they can
   be for header fields. This will ensure that more sound architectural
   decisions can be made by designers and operators of future
   enhancements.

9. Security Considerations

   An important observation is that the envelope and the header overlap
   in only a small number of key ways:

   o  The Return-Path header field, added at time of delivery, which
      includes the sender address as extracted from the message
      envelope; and

   o  The Received header field, which might contain the envelope
      recipient for messages addressed to a single mailbox.
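
   The two overlap points listed above can be observed in a delivered
   message with a minimal sketch such as the following, which uses the
   Python standard library "email" package. The sample message, its
   addresses and host names, and the helper name envelope_traces are
   assumptions made for illustration only.

```python
from email import message_from_string
from email.policy import default

# Hypothetical delivered message. The header says Alice wrote to Bob,
# while the fields added in transit record Charlie and Deborah.
DELIVERED = """\
Return-Path: <charlie@example.com>
Received: from client.example.org by mx.example.net
 for <deborah@example.com>; Fri, 25 Apr 2014 12:00:00 +0000
From: alice@example.org
To: bob@example.net
Subject: Memo

The memo text.
"""


def envelope_traces(raw):
    """Return (return_path, received_fields) found in a delivered message."""
    msg = message_from_string(raw, policy=default)
    return_path = msg.get("Return-Path")
    # A message can carry many Received fields, one per handling agent.
    received = [str(value) for value in msg.get_all("Received", [])]
    return (str(return_path) if return_path is not None else None, received)


return_path, received = envelope_traces(DELIVERED)
print(return_path)
for field in received:
    print(field)
```

   Anyone who later receives a copy of this message with these fields
   intact can read the envelope sender and, here, the envelope
   recipient as well.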

   Typically, all other envelope details are discarded upon delivery.
   Because of this, data about transport that should be ephemeral but
   are stored in header fields can fall into the wrong hands when the
   message is forwarded. Following the recommendations above can help
   to reduce this concern.

10. IANA Considerations

   This document contains no actions for IANA.

   [RFC Editor: Please remove this section prior to publication.]

11. References

11.1. Normative References

   [RFC5598]  Crocker, D., "Internet Mail Architecture", RFC 5598,
              July 2009.

11.2. Informative References

   [RFC0733]  Crocker, D., Vittal, J., Pogran, K., and D. Henderson,
              "Standard for the format of ARPA network text messages",
              RFC 733, November 1977.

   [RFC0821]  Postel, J., "Simple Mail Transfer Protocol", STD 10,
              RFC 821, August 1982.

   [RFC2045]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
              Extensions (MIME) Part One: Format of Internet Message
              Bodies", RFC 2045, November 1996.

   [RFC5321]  Klensin, J., "Simple Mail Transfer Protocol", RFC 5321,
              October 2008.

   [RFC5322]  Resnick, P., Ed., "Internet Message Format", RFC 5322,
              October 2008.

   [RFC7103]  Kucherawy, M., Shapiro, G., and N. Freed, "Advice for Safe
              Handling of Malformed Messages", RFC 7103, January 2014.

   [RRVS]     Mills, W. and M. Kucherawy, "The Require-Recipient-Valid-
              Since Header Field and SMTP Service Extension",
              draft-ietf-appsawg-rrvs-header-field (work in progress),
              April 2014.

Appendix A. Acknowledgments

   Dave Crocker and John Levine provided useful review comments during
   the development of this work.

Author's Address

   Murray S. Kucherawy
   270 Upland Drive
   San Francisco, CA 94127
   USA

   EMail: superuser@gmail.com