Network Working Group                                         Y. Sheffer
Internet-Draft                                               G. Keselman
Intended status: Standards Track                                  Intuit
Expires: July 19, 2021                                            Y. Nir
                                                       Dell Technologies
                                                        January 15, 2021


                      A Generic Ciphertext Format
                draft-sheffer-ietf-ciphertext-format-01

Abstract

   This document defines a set of structured headers for encrypted data.
   The main goal of this format is to enable detection of encrypted data
   in large data stores, and associating it back to the system where it
   was created and the key with which it was encrypted.  This allows
   organizations to extend the concept of data governance to encrypted
   data, and to manage such data even when encrypted by multiple
   different systems and cloud providers.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 19, 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction and Design Principles
     1.1.  Terminology
   2.  Motivation
     2.1.  Design Goals
     2.2.  Previous Work
   3.  The Ciphertext Format
     3.1.  Format Overview
       3.1.1.  Fixed Header
       3.1.2.  Variable Header
       3.1.3.  Deriving a Specific Key
     3.2.  Receiving Ciphertext
     3.3.  Fixed Header Rationale
   4.  Example
     4.1.  Fixed Header
     4.2.  Variable Header: CBOR Diagnostic Notation
     4.3.  Variable Header: Binary
     4.4.  Complete Header
     4.5.  CDDL
   5.  IANA Considerations
   6.  Security Considerations
     6.1.  Integrity Protection
   7.  References
     7.1.  Normative References
     7.2.  Informative References
     7.3.  URIs
   Appendix A.  Document History
     A.1.  draft-sheffer-ietf-ciphertext-format-01
     A.2.  draft-sheffer-ietf-ciphertext-format-00
   Authors' Addresses

1.  Introduction and Design Principles

   Organizations that manage sensitive data often employ
   application-level encryption to protect data at rest.
   When this solution is used, it is common that very large numbers of
   encrypted data items are stored, potentially for a long time.
   Security best practices, complicated organizational structures, as
   well as the existence of modern key management systems, lead to the
   proliferation of large numbers of encryption keys.  After a while it
   becomes difficult to identify the encryption key that was used for a
   particular piece of data, with the situation becoming even more
   complicated when multiple key management systems are used by the same
   organization.

   Application-level encryption can be deployed at different scales: in
   some cases a multi-megabyte file may be encrypted with a single key.
   In other cases, we may want to deploy encryption for specific
   database fields, which can easily manifest itself as millions of keys
   for a single database table.

   Tagging encrypted data with metadata supports a number of important
   use cases: it allows the organization to better catalog the data
   (a.k.a. "data governance"), to discover the owner of each piece of
   encrypted data, and to detect data encrypted with outdated keys.

1.1.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.  Motivation

   Our main goal in defining a common ciphertext format is to allow
   organizations to manage large scale data, encrypted at rest using
   multiple key management and encryption services.  Additional
   motivations for an enterprise to use a common format are:

   -  Cross-KMS-provider interoperability, to simplify automated
      management of data sourced from multiple origins.

   -  Avoiding vendor lock-in: proprietary data encryption formats mean
      that the data remains tied to a single vendor.

   -  Standardization around key management best practices.

2.1.  Design Goals

   Some of the goals behind this design include:

   -  The format should allow simple and efficient detection of
      encrypted data, in support of automated data governance and key
      lifecycle management.

   -  The format should be space-efficient, since it may be used for
      very large numbers of small encrypted items.  As a result,
      important information is associated with the (stored) key, rather
      than the ciphertext.

   -  Specifically, following security best practices, any given key
      material should be used with only a single cryptographic
      algorithm.  Therefore, the algorithm identifier should be stored
      with the key (or the key version), rather than with the
      ciphertext.

   -  The format defined here only covers the ciphertext header, and not
      the ciphertext itself (referred to as "body" in this document).
      The body is defined elsewhere, such as [NISTSP800-38D] for
      AES-GCM.

   -  The header is not encrypted.  Integrity protection is optional.
      See Section 6.1 for details.

   -  The format should support key versioning, i.e. automated, periodic
      rotation of keys.

   -  The format should support granular key management by allowing for
      key derivation and key wrapping.

   -  The format should allow for generic tools to perform partial
      attribution of ciphertext, i.e. to associate it with a specific
      key provider.  More specific, possibly provider-specific tools are
      required for full attribution.

2.2.  Previous Work

   A few notable formats are:

   -  The Amazon Web Services SDK message format, documented at [1].
      This format is specific to the AWS library, and aimed at users of
      the AWS Key Management System (KMS).

   -  The wire format [2] defined by Google's Tink library.

   -  The format defined by the KMIP 2.1 [3] specification, which is
      targeted at data transmittal, rather than storage.

3.  The Ciphertext Format

3.1.  Format Overview

   The ciphertext is prefixed by a header, which, in turn, consists of a
   short fixed header and a variable header.  The variable header is a
   CBOR [RFC8949] map.

   Following the header is the body of the ciphertext.  The format
   (including length) of the body is out of scope for this document.

3.1.1.  Fixed Header

   The fixed header consists of:

   -  A single constant octet 0x08 (see Section 3.3).

   -  A single octet denoting the format version.  The version is 0x01
      for the format defined in this document.

3.1.2.  Variable Header

   The variable header is a CBOR map consisting of elements from the
   following table.

   +----------------+-----+----------+---------------------+-----------+
   | Field Name     | Map | Value    | Meaning             | Mandatory |
   |                | Key | Type     |                     |           |
   +----------------+-----+----------+---------------------+-----------+
   | Key Provider   | 1   | Unsigned | The organization    | Y         |
   |                |     | integer  | responsible for the |           |
   |                |     |          | key management      |           |
   |                |     |          | system.             |           |
   |                |     |          |                     |           |
   | Key ID         | 2   | Byte     | An encryption key   | Y         |
   |                |     | string   | identifier, where   |           |
   |                |     |          | the key is stored   |           |
   |                |     |          | in a key management |           |
   |                |     |          | system. This must   |           |
   |                |     |          | denote a unique     |           |
   |                |     |          | key, even if the    |           |
   |                |     |          | Provider supports   |           |
   |                |     |          | multiple tenants.   |           |
   |                |     |          | Encoding of this    |           |
   |                |     |          | field is Provider-  |           |
   |                |     |          | specific. The field |           |
   |                |     |          | must appear once.   |           |
   |                |     |          |                     |           |
   | Key Version    | 3   | Unsigned | A version of a key, | N         |
   |                |     | integer  | where the key is    |           |
   |                |     |          | rotated on a        |           |
   |                |     |          | periodic basis.     |           |
   |                |     |          | Encoding of this    |           |
   |                |     |          | field is Provider-  |           |
   |                |     |          | specific. The field |           |
   |                |     |          | must appear at most |           |
   |                |     |          | once.               |           |
   |                |     |          |                     |           |
   | Auxiliary Data | 4   | Byte     | Additional data     | N         |
   |                |     | string   | required to derive  |           |
   |                |     |          | a specific key from |           |
   |                |     |          | the referenced key  |           |
   |                |     |          | (and key version,   |           |
   |                |     |          | if any), see also   |           |
   |                |     |          | Section 3.1.3. The  |           |
   |                |     |          | field must appear   |           |
   |                |     |          | at most once.       |           |
   |                |     |          |                     |           |
   | Nonce          | 5   | Byte     | A nonce or          | N         |
   |                |     | string   | initialization      |           |
   |                |     |          | vector (IV), if     |           |
   |                |     |          | required by the     |           |
   |                |     |          | cipher algorithm.   |           |
   |                |     |          | We note that an     |           |
   |                |     |          | implementation may  |           |
   |                |     |          | prefer to store the |           |
   |                |     |          | nonce and           |           |
   |                |     |          | authentication tag  |           |
   |                |     |          | in-line with the    |           |
   |                |     |          | ciphertext.         |           |
   |                |     |          |                     |           |
   | Authentication | 6   | Byte     | An authentication   | N         |
   | Tag            |     | string   | tag or integrity    |           |
   |                |     |          | check value (ICV),  |           |
   |                |     |          | if required by the  |           |
   |                |     |          | cipher algorithm.   |           |
   |                |     |          |                     |           |
   | Additional     | 7   | Byte     | Additional          | N         |
   | Authenticated  |     | string   | authenticated data  |           |
   | Data           |     |          | (AAD), which is     |           |
   |                |     |          | integrity-protected |           |
   |                |     |          | but not encrypted   |           |
   |                |     |          | by the cipher.      |           |
   +----------------+-----+----------+---------------------+-----------+

3.1.3.  Deriving a Specific Key

   The Auxiliary Data field is used to support derivation of a key,
   specific to the ciphertext being managed.  There are two common ways
   to obtain this specific key:

   -  Using a key derivation function: SK = KDF(key, aux-data)

   -  Decryption of a wrapped key: SK = Decrypt(key, aux-data)

   The exact algorithm is implementation dependent, and should be
   uniquely defined by the combination of Key Provider, Key ID and (if
   given) Key Version.

3.2.  Receiving Ciphertext

   Correct interpretation of the format may have security implications,
   making it important to define the exact semantics even when the
   entity that receives a ciphertext may not understand parts of the
   header.

   -  A recipient MUST reject a malformed header, e.g. if the total
      length is larger than the physical length allocated to it based on
      higher-level network protocols or storage formats.

   -  A recipient MUST reject a ciphertext if it does not recognize the
      format version.

   -  A recipient MUST reject a ciphertext if the variable header is not
      valid CBOR, as per [RFC8949] Sec. 5.3.1.  In particular, it MUST
      reject duplicate map keys.

   -  A recipient MUST accept a ciphertext even if it does not recognize
      some of the map keys.  It MUST ignore the unknown map keys and
      MUST interpret all known ones.  In other words, the only way to
      introduce new mandatory map keys is by incrementing the format
      version.

   -  If ciphertext integrity protection coverage includes the header, a
      recipient MUST reject the header as well as the ciphertext if the
      integrity protection fails to validate.

3.3.  Fixed Header Rationale

   We chose the initial byte 0x08, since strings are very unlikely to
   start with it, as we explain below.  Automated tools can detect
   encrypted data in structured contexts (e.g., a SQL database column)
   by sampling a number of data items and, if all of them start with
   this byte, concluding with high probability that they are encrypted.

   The byte 0x08 encodes the ASCII control character "backspace".  It
   has the same meaning in UTF-8, and the 08 block of UTF-16 characters
   is only populated by two very small languages and rarely-used
   extended Arabic characters [4].

4.  Example

4.1.  Fixed Header

   08 01

4.2.  Variable Header: CBOR Diagnostic Notation

   {1: 65535, 2: h'1122334455', 3: 6}

4.3.  Variable Header: Binary

   a3 01 19 ff ff 02 45 11 22 33 44 55 03 06

4.4.  Complete Header

   08 01 a3 01 19 ff ff 02 45 11 22 33 44 55 03 06

4.5.  CDDL

   The following non-normative snippet defines the format of the
   variable header using CDDL [RFC8610].

   var_header = {
     K_KEY_PROVIDER => uint,
     K_KEY_ID => bstr,
     ? K_KEY_VERSION => uint,
     ? K_AUX_DATA => bstr,
     ? K_NONCE => bstr,
     ? K_AUTH_TAG => bstr,
     ? K_AAD => bstr,
     * uint => any ; extensions
   }

   K_RESERVED = 0
   K_KEY_PROVIDER = 1
   K_KEY_ID = 2
   K_KEY_VERSION = 3
   K_AUX_DATA = 4
   K_NONCE = 5
   K_AUTH_TAG = 6
   K_AAD = 7
   ; extend here

5.  IANA Considerations

   TBD: establish a registry for Types, with 128-255 as private use.

   TBD: establish a registry of Key Providers.

6.  Security Considerations

6.1.  Integrity Protection

   The format defined here does not include integrity protection for the
   header, and neither does it mandate that the encrypted item's
   integrity protection should include the header.

   Data encrypted at rest is typically vulnerable to denial of service
   attacks, since (assuming the data is integrity protected) an attacker
   that can change the ciphertext can trivially cause it to fail
   validation.

   There are cases where it is convenient to manipulate the ciphertext
   header, even if the data itself remains encrypted and unmodified, for
   example when migrating between formats or when bulk-changing metadata
   associated with the ciphertext.  On the other hand, it is a best
   practice to protect cryptographic metadata against malicious
   modification.  We are currently not aware of a specific threat vector
   associated with malicious changes to the proposed format, at least
   assuming the use of AEAD ciphers.

7.  References

7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC8949]  Bormann, C. and P. Hoffman, "Concise Binary Object
              Representation (CBOR)", STD 94, RFC 8949,
              DOI 10.17487/RFC8949, December 2020.

7.2.  Informative References

   [NISTSP800-38D]
              Dworkin, M., "Recommendation for Block Cipher Modes of
              Operation: Galois/Counter Mode (GCM) and GMAC", National
              Institute of Standards and Technology report,
              DOI 10.6028/nist.sp.800-38d, 2007.

   [RFC8610]  Birkholz, H., Vigano, C., and C. Bormann, "Concise Data
              Definition Language (CDDL): A Notational Convention to
              Express Concise Binary Object Representation (CBOR) and
              JSON Data Structures", RFC 8610, DOI 10.17487/RFC8610,
              June 2019.

7.3.  URIs

   [1] https://docs.aws.amazon.com/encryption-sdk/latest/developer-guide/message-format.html

   [2] https://github.com/google/tink/blob/master/docs/WIRE-FORMAT.md

   [3] https://docs.oasis-open.org/kmip/kmip-profiles/v2.1/csprd01/kmip-profiles-v2.1-csprd01.html

   [4] https://en.wikipedia.org/wiki/Arabic_Extended-A

Appendix A.  Document History

A.1.  draft-sheffer-ietf-ciphertext-format-01

   -  SAAG feedback: the variable header is now CBOR.

   -  Binary example.

   -  Non-normative CDDL.

   -  Additional types for non-inline AEAD.

A.2.  draft-sheffer-ietf-ciphertext-format-00

   -  Initial version.

Authors' Addresses

   Yaron Sheffer
   Intuit

   EMail: yaronf.ietf@gmail.com


   Gleb Keselman
   Intuit

   EMail: gleb.keselman@gmail.com


   Yoav Nir
   Dell Technologies

   EMail: ynir.ietf@gmail.com
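
   The following non-normative Python sketch illustrates how a recipient
   might parse the header defined in Section 3.1 and apply some of the
   checks listed in Section 3.2, using the complete header of Section
   4.4 as input.  The names parse_header, MAGIC, VERSION and FIELD_NAMES
   are hypothetical and not part of this specification.  The sketch
   assumes a definite-length map whose keys are unsigned integers and
   whose values are unsigned integers or byte strings.  A real
   implementation would more likely rely on a complete CBOR library, so
   that unknown map keys carrying other value types can be skipped as
   Section 3.2 requires.

```python
MAGIC = 0x08      # constant first octet of the fixed header (Section 3.1.1)
VERSION = 0x01    # format version defined by this document

# Map keys from the table in Section 3.1.2; the Python names are illustrative.
FIELD_NAMES = {1: "key_provider", 2: "key_id", 3: "key_version",
               4: "aux_data", 5: "nonce", 6: "auth_tag", 7: "aad"}


def _read_head(buf, pos):
    """Decode one CBOR initial byte and its argument: (major, arg, new_pos)."""
    if pos >= len(buf):
        raise ValueError("truncated header")
    major, info = buf[pos] >> 5, buf[pos] & 0x1F
    pos += 1
    if info < 24:                     # argument stored in the initial byte
        return major, info, pos
    if info > 27:                     # indefinite lengths are not handled here
        raise ValueError("unsupported CBOR additional information")
    size = 1 << (info - 24)           # 1, 2, 4 or 8 argument bytes follow
    if pos + size > len(buf):
        raise ValueError("truncated header")
    return major, int.from_bytes(buf[pos:pos + size], "big"), pos + size


def _read_value(buf, pos):
    """Decode an unsigned integer (major type 0) or a byte string (type 2)."""
    major, arg, pos = _read_head(buf, pos)
    if major == 0:
        return arg, pos
    if major == 2:
        if pos + arg > len(buf):
            raise ValueError("truncated byte string")
        return bytes(buf[pos:pos + arg]), pos + arg
    raise ValueError("value type not handled by this sketch")


def parse_header(data):
    """Return (known_fields, body_offset) for data starting with a header."""
    if len(data) < 3 or data[0] != MAGIC:
        raise ValueError("not a ciphertext header")
    if data[1] != VERSION:
        raise ValueError("unrecognized format version")
    major, count, pos = _read_head(data, 2)
    if major != 5:
        raise ValueError("variable header is not a CBOR map")
    fields = {}
    for _ in range(count):
        key, pos = _read_value(data, pos)
        if not isinstance(key, int):
            raise ValueError("map key is not an unsigned integer")
        if key in fields:
            raise ValueError("duplicate map key")
        fields[key], pos = _read_value(data, pos)
    if 1 not in fields or 2 not in fields:
        raise ValueError("mandatory field missing")
    # Unknown map keys are ignored; known ones are returned by name.
    known = {FIELD_NAMES[k]: v for k, v in fields.items() if k in FIELD_NAMES}
    return known, pos


# Usage with the complete header from Section 4.4, followed by a dummy body.
example = bytes.fromhex("08 01 a3 01 19 ff ff 02 45 11 22 33 44 55 03 06")
fields, body_offset = parse_header(example + b"body")
print(fields["key_provider"], fields["key_id"].hex(), fields["key_version"])
print(body_offset)  # 16: the body starts right after the 16-octet header
```

   Returning the offset of the body keeps the sketch consistent with
   Section 3.1, which leaves the format and length of the body out of
   scope.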