idnits 2.17.00 (12 Aug 2021)

/tmp/idnits30570/draft-kucherawy-dkim-list-canon-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (April 5, 2015) is 2596 days in the past. Is this
     intentional?

  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

     No issues found here.

     Summary: 0 errors (**), 0 flaws (~~), 1 warning (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                       M. Kucherawy
Internet-Draft                                             April 5, 2015
Intended status: Experimental
Expires: October 7, 2015

   A List-safe Canonicalization for DomainKeys Identified Mail (DKIM)
                   draft-kucherawy-dkim-list-canon-01

Abstract

   DomainKeys Identified Mail (DKIM) introduced a mechanism whereby a
   mail operator can affix a signature to a message that validates at
   the level of the signer's domain name. It specified two possible
   ways of converting the message body to a canonical form, one
   intolerant of changes and the other tolerant of simple changes to
   whitespace within the message body.

   The provided canonicalization schemes do not tolerate changes in a
   structured message such as conversion between transfer encodings or
   addition of new message parts. It is useful to have these
   capabilities to allow for transport through gateways, and also for
   transport through handlers (such as mailing list services) that might
   add content that would invalidate a signature generated using the
   existing canonicalization schemes.

   This document presents a mechanism for generating a canonicalization
   that allows easy detection of modified content while still being
   valid for the content it originally signed. It also presents a use
   profile of DKIM that takes advantage of this capability.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 7, 2015.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Background
   2.  Definitions
   3.  The 'list' Canonicalization Description
     3.1.  Preparing Content
   4.  The 'lh=' Signature Tag
   5.  Use Profile
   6.  Security Considerations
     6.1.  Imported from DKIM
     6.2.  Added Content May Not Be Safe
   7.  IANA Considerations
     7.1.  DKIM-Signature Canonicalization Body Registry
     7.2.  DKIM-Signature Tag Specifications Registry
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Appendix A.  Example
   Appendix B.  To-Do
   Appendix C.  Acknowledgements

1.  Background

   DomainKeys Identified Mail [RFC6376] (DKIM) defines a mechanism
   whereby a verified domain name can be attached to a message, or
   portion of a message, using a cryptographic signature. It presents
   two possible schemes for converting the header block to a canonical
   form, and similarly two schemes for canonicalizing the body. In each
In each 92 case, one scheme permits no changes whatsoever, and the other permits 93 limited changes restricted to areas such as whitespace munging, case 94 changing, and header field wrapping. 96 Some agents deliberately, but innocently, modify content in transit. 97 A prime example of this is mailing lists, which might add a prefix to 98 the Subject field of a message, add list-specific information to the 99 header (in the form of new header fields), or append administrivia to 100 the body of messages before they are re-mailed to the list 101 subscribers. Use of mailing lists with respect to DKIM, and a 102 discussion of related challenges, can be found in [RFC6377]. 104 There is a desire to have DKIM signatures survive transit through 105 lists. One way to do this is to make use of DKIM's "l=" tag which 106 limits the portion of the body that is signed. This exposes an 107 attack vector, however, since one can simply append any content to a 108 partly-signed message and the signature will continue to verify. 109 (See Section 8.2 of [RFC6376].) 111 This document defines a new body canonicalization for DKIM that 112 includes a partial signature for each message part in a message 113 structured using Multipurpose Internet Mail Extensions (MIME; see 114 [RFC2045]). This allows a clear delineation between the author- 115 generated content (which would be signed by the author) and content 116 added downstream (which would be signed by the other actor). A DKIM 117 verifier can then determine whether the author-generated content is 118 intact, and then identify and verify the content that was added 119 later. 121 The utility of this mechanism is predicated on the notion that agents 122 that modify signed messages will do so in ways compatible with MIME. 124 2. Definitions 126 Numerous terms used here, especially "Author", are defined in 127 [RFC5598]. 129 3. The 'list' Canonicalization Description 131 This section defines the 'list' body canonicalization algorithm. 

   Put simply, the list canonicalization constructs a hash tree of the
   MIME structure of the message after each part has been decoded (for
   those with a Content-Transfer-Encoding field). The hash used is
   implied by the signature algorithm to be used (see the DKIM "a="
   tag). Each of the hashes can be made a part of the signature to
   allow for more precise part validation, and identification of added
   content.

3.1.  Preparing Content

   A message is prepared for canonicalization by applying the following
   steps in order:

   1.  Create an empty tree. Each node of the tree includes the
       following components:

       A.  The MIME type and subtype of the part, expressed as would be
           found in a Content-Type header field, with no whitespace or
           comments;

       B.  The unencoded content represented by the MIME part at this
           node;

       C.  A series of octets that will contain a hash of the content;

       D.  A series of zero or more pointers to other (child) nodes.

   2.  If the message is not encoded using MIME, insert a node at the
       root of the tree using a type/subtype of "text/plain" and the
       full body content. The hash is not initialized.

   3.  If the message is encoded using MIME, then the tree is populated
       in a way that mirrors the MIME structure of the message. In
       particular, the outermost MIME object will appear at the root
       node, and the only nodes that have children are those with a MIME
       type of "multipart". The hashes are not initialized.

   4.  For each leaf node, compute a hash of the content of that node.
       Store the hash in the node.

   5.  For each non-leaf node, if all of its child nodes now have
       computed hashes, concatenate the hashes (with order preserved),
       and compute and store a hash of the concatenation.

   6.  Repeat the previous step until all hashes in the tree have been
       populated.

   When this canonicalization is in use, the "bh=" tag will contain the
   hash stored at the root of the tree. The processes for signing and
   verification are otherwise unchanged.

4.  The 'lh=' Signature Tag

   A signer can include an "lh=" tag, defined here, to make more than
   just the root hash information available to verifying agents. This
   permits identification of the specific part of the MIME structure
   that was modified, added or removed by an intermediary.

   The "lh=" tag is constructed by performing an in-order traversal of
   the canonicalization tree described in Section 3.1. At each node,
   each of the following is output, separated by a colon character
   (ASCII 0x3A):

   1.  A base64 expression of the hash at that node;

   2.  The MIME type of that node;

   3.  An integer expression of the number of children at that node.

   Between each node's output, a comma character (ASCII 0x2C) is output.

   Reconstruction of the MIME tree can be accomplished by the following
   steps:

   1.   Create a tree "T" containing a single empty node.

   2.   Create an empty node queue, "Q".

   3.   Create an information queue "I", containing the sequence of node
        information fields found in the "lh=" tag.

   4.   Select the root node of the tree. Call this node "N".

   5.   Extract the first batch of node information ("B") from "I".

   6.   Store the hash and MIME type from "B" into "N".

   7.   Enqueue the specified number of empty nodes into "Q", and attach
        them all as children of "N".

   8.   If "I" and "Q" are both empty, terminate. If one is empty and
        the other is not, an error has occurred.

   9.   Extract the next batch of node information from "I", as "B".

   10.  Dequeue the next node from "Q", as "N".

   11.  Return to step 6.

   By comparing the hashes in and structure of this tree to those in the
   canonicalized tree, a receiver can identify parts of the tree (or
   entire subtrees) that have been modified.
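
   The construction, reconstruction, and comparison described in this
   section can be sketched in Python as follows. This is an
   illustration only, and several points are assumptions rather than
   requirements of this document: the Node class and the emit_lh(),
   parse_lh(), and diff() names are hypothetical; node records are
   emitted in breadth-first order, since that is the order the
   queue-based reconstruction steps consume them; whitespace inside the
   tag value is discarded before parsing; and diff() pairs children by
   position, so a part inserted ahead of existing parts would be
   reported as a run of modifications.

```python
import base64
from collections import deque
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    mime_type: str = ""
    digest: bytes = b""
    children: List["Node"] = field(default_factory=list)


def emit_lh(root: Node) -> str:
    """Serialize the tree as "hash:type:children" records joined by commas."""
    records = []
    pending = deque([root])
    while pending:
        node = pending.popleft()
        b64 = base64.b64encode(node.digest).decode("ascii")
        records.append(f"{b64}:{node.mime_type}:{len(node.children)}")
        pending.extend(node.children)   # assumed breadth-first order
    return ",".join(records)


def parse_lh(value: str) -> Node:
    """Rebuild the tree following reconstruction steps 1-11."""
    info = deque("".join(value.split()).split(","))   # queue "I"
    root = Node()                                     # tree "T"
    queue = deque()                                   # queue "Q"
    node = root                                       # "N"
    batch = info.popleft()                            # "B"
    while True:
        b64, mime_type, count = batch.split(":")      # step 6
        node.digest = base64.b64decode(b64)
        node.mime_type = mime_type
        node.children = [Node() for _ in range(int(count))]
        queue.extend(node.children)                   # step 7
        if not info and not queue:                    # step 8
            return root
        if not info or not queue:
            raise ValueError("lh= records do not match child counts")
        batch = info.popleft()                        # step 9
        node = queue.popleft()                        # step 10


def diff(signed: Node, received: Node, path: str = "0") -> List[str]:
    """List parts whose hashes differ, matching children by position."""
    if signed.digest == received.digest:
        return []
    notes = [f"{path}: modified"]
    for i in range(max(len(signed.children), len(received.children))):
        child = f"{path}.{i}"
        if i >= len(signed.children):
            notes.append(f"{child}: not covered by signature")
        elif i >= len(received.children):
            notes.append(f"{child}: removed")
        else:
            notes.extend(diff(signed.children[i], received.children[i], child))
    return notes
```

   In this sketch, a verifier would call parse_lh() on the signer's tag
   value, build the tree for the received message as in Section 3.1,
   and pass both to diff() to obtain a list of affected parts.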
   Parts not covered by the signature can also be identified.

5.  Use Profile

   The intended use of this mechanism is to affix two DKIM signatures to
   a message. The first signature is added by the Author, and
   canonicalizes the original message in its entirety. The second
   signature is added by a modifying intermediary, such as a mailing
   list manager (MLM).

   When verifying, the Author signature on an unmodified message would
   pass verification. For a modified message, in the typical case, the
   verification step would observe that the Author signature failed but
   the intermediary's signature verified. When the "lh=" tag is
   present, it is possible to reconstruct the MIME structure of the
   signed message and compare it to that of the received message,
   including hashes of the content seen by each party. By comparing
   hash values at each node of the MIME structures, it is possible to
   determine in which MIME parts changes were made and/or new parts
   added or removed by the intermediary. The verifying agent can then
   determine whether those changes are acceptable before allowing the
   message to continue toward delivery.

   It is also possible to determine which agents in the handling chain
   took responsibility for which parts of the content. For example,
   while a Mediator's signature might indicate that the Mediator is
   responsible for the entire (rewritten) message, it might also be
   possible to determine that the Author takes responsibility for all
   but one part of the message as well. The excluded part would be the
   part added by the Mediator, and can be handled separately from the
   Author's content.

6.  Security Considerations

6.1.  Imported from DKIM

   Section 8 of [RFC6376] discusses numerous security considerations
   relevant to DKIM.
   Of particular interest here is Section 8.2, which
   discusses concerns regarding signatures that still verify in the
   presence of added message content.

6.2.  Added Content May Not Be Safe

   When the use profile described in Section 5 is applied, it is
   important to note that the added content was not signed by the Author
   domain, but only by the domain of the intermediary. Operators might
   grant preferential handling based on valid DKIM signatures from
   favored domains; however, the presence of such a signature does not
   mean that appended content is necessarily safe.

7.  IANA Considerations

7.1.  DKIM-Signature Canonicalization Body Registry

   IANA is requested to add the following entry to the DKIM-Signature
   Canonicalization Body Registry:

   Type: list
   Reference: [this document]
   Status: active

7.2.  DKIM-Signature Tag Specifications Registry

   IANA is requested to add the following entry to the DKIM-Signature
   Tag Specifications Registry:

   Type: lh
   Reference: [this document]
   Status: active

8.  References

8.1.  Normative References

   [RFC2045]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
              Extensions (MIME) Part One: Format of Internet Message
              Bodies", RFC 2045, November 1996.

   [RFC6376]  Crocker, D., Hansen, T., and M. Kucherawy, "DomainKeys
              Identified Mail (DKIM) Signatures", STD 76, RFC 6376,
              September 2011.

8.2.  Informative References

   [RFC5598]  Crocker, D., "Internet Mail Architecture", RFC 5598,
              July 2009.

   [RFC6377]  Kucherawy, M., "DomainKeys Identified Mail (DKIM) and
              Mailing Lists", BCP 167, RFC 6377, September 2011.

Appendix A.
Example

   To illustrate the use of this addition to DKIM, consider a message
   whose header and content are as follows:

   From: sender@example.com
   To: recipient@example.net
   Date: Mon, 23 Mar 2015 11:21:33 -0700
   Subject: test message
   MIME-Version: 1.0
   Content-Type: multipart/mixed; boundary="foobar"

   --foobar
   Content-Type: text/plain

   Text part #1

   --foobar
   Content-Type: text/plain

   Text part #2

   --foobar--

                         Figure 1: Example Message

   The MIME structure in this message can be represented as a tree. A
   node with media type "multipart" has a set of one or more child
   nodes, each of which starts with the corresponding boundary. A node
   of any other type contains actual content, and has no descendants,
   but has siblings under the same parent node. Thus, as a tree, the
   example message might be represented as follows:

                 +-----------+
                 | multipart |
                 |   mixed   |--->//
                 +-----------+
                       |
                       |
                       V
                 +-----------+    +-----------+
                 |   text    |    |   text    |
                 |   plain   |--->|   plain   |--->//
                 +-----------+    +-----------+

                         Figure 2: MIME structure

   Continuing with this illustration, a Mediator receives the message,
   and adds its desired "footer" content by appending a third text/plain
   MIME part after the existing content.
   This results in the following
   MIME structure:

         +-----------+
         | multipart |
         |   mixed   |--->//
         +-----------+
               |
               |
               V
         +-----------+    +-----------+    +-----------+
         |   text    |    |   text    |    |   text    |
         |   plain   |--->|   plain   |--->|   plain   |--->//
         +-----------+    +-----------+    +-----------+

                    Figure 3: Augmented MIME structure

   Applying the signatures as described in Section 5 at both the Author
   and the Mediator, the final Verifier will see signatures that cover
   content as follows:

     +--------------------------------------------------------+
     |+--------------------------------+                      |
     || +-----------+                  |                      |
     || | multipart |                  |                      |
     || |   mixed   |---//             |                      |
     || +-----------+                  |                      |
     ||       |                        |                      |
     ||       |                        |                      |
     ||       V                        |                      |
     || +-----------+    +-----------+ |  +-----------+       |
     || |   text    |    |   text    | |  |   text    |       |
     || |   plain   |--->|   plain   |-|->|   plain   |--->// |
     || +-----------+    +-----------+ |  +-----------+       |
     |+--------------------------------+                      |
     |  A u t h o r   s i g n a t u r e                       |
     +--------------------------------------------------------+
                M e d i a t o r   s i g n a t u r e

                  Figure 4: Signature coverage of content

   With the additional information provided using this mechanism, it is
   now possible to verify both signatures, and also ascribe
   responsibility for different parts of the content to two different
   signature-generating entities.

Appendix B.  To-Do

   Explain how this works when the input message is not already a MIME
   message. Probably just canonicalize it as a multipart/mixed with a
   single text/plain in it.

   Handle more complex MIME structures from the author, such as
   something that's already multipart/mixed with some non-trivial
   structure to it.

Appendix C.  Acknowledgements

   The original idea was proposed by Ned Freed.

   The authors wish to acknowledge (names) for their comments during the
   development of this document.

Author's Address

   Murray S.
Kucherawy

   EMail: superuser@gmail.com