Network Working Group                                           E. Omara
Internet-Draft                                                     Apple
Intended status: Informational                                 J. Uberti
Expires: 17 February 2022                                         Google
                                                           A. GOUAILLARD
                                                              S. Murillo
                                                          CoSMo Software
                                                          16 August 2021


                         Secure Frame (SFrame)
                         draft-omara-sframe-03

Abstract

   This document describes the Secure Frame (SFrame) end-to-end
   encryption and authentication mechanism for media frames in a
   multiparty conference call, in which central media servers (SFUs) can
   access the media metadata needed to make forwarding decisions without
   having access to the actual media. The proposed mechanism differs
   from other approaches through its use of media frames as the
   encryptable unit, instead of individual RTP packets, which makes it
   more bandwidth efficient and also allows use with non-RTP transports.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 17 February 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.
   Code Components extracted from this document must include
   Simplified BSD License text as described in Section 4.e of the Trust
   Legal Provisions and are provided without warranty as described in
   the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Goals
   4.  SFrame
     4.1.  SFrame Format
     4.2.  SFrame Header
     4.3.  Encryption Schema
       4.3.1.  Key Selection
       4.3.2.  Key Derivation
       4.3.3.  Encryption
       4.3.4.  Decryption
       4.3.5.  Duplicate Frames
     4.4.  Ciphersuites
       4.4.1.  AES-CM with SHA2
   5.  Key Management
     5.1.  Sender Keys
     5.2.  MLS
   6.  Media Considerations
     6.1.  SFU
       6.1.1.  LastN and RTP stream reuse
       6.1.2.  Simulcast
       6.1.3.  SVC
     6.2.  Video Key Frames
     6.3.  Partial Decoding
   7.  Overhead
     7.1.  Audio
     7.2.  Video
     7.3.  SFrame vs PERC-lite
       7.3.1.  Audio
       7.3.2.  Video
   8.  Security Considerations
     8.1.  No Per-Sender Authentication
     8.2.  Key Management
     8.3.  Authentication tag length
   9.  IANA Considerations
   10. Test Vectors
     10.1.  AES_CM_128_HMAC_SHA256_4
     10.2.  AES_CM_128_HMAC_SHA256_8
     10.3.  AES_GCM_128_SHA256
     10.4.  AES_GCM_256_SHA512
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Authors' Addresses

1.  Introduction

   Modern multi-party video call systems use Selective Forwarding Unit
   (SFU) servers to efficiently route RTP streams to call endpoints
   based on factors such as available bandwidth, desired video size,
   and codec support. In order for the SFU to work properly, though, it
   needs to be able to access RTP metadata and RTCP feedback messages,
   which is not possible if all RTP/RTCP traffic is end-to-end
   encrypted.

   As such, two layers of encryption and authentication are required:

   1.  Hop-by-hop (HBH) encryption of media, metadata, and feedback
       messages between the endpoints and the SFU

   2.  End-to-end (E2E) encryption of media between the endpoints

   While DTLS-SRTP can be used as an efficient HBH mechanism, it is
   inherently point-to-point and therefore not suitable for E2E
   protection in an SFU context. In addition, given the various
   scenarios in which video calling occurs, minimizing the bandwidth
   overhead of end-to-end encryption is also an important goal.

   This document proposes a new end-to-end encryption mechanism known as
   SFrame, specifically designed to work in group conference calls with
   SFUs.

   +-------------------------------+-------------------------------+^+
   |V=2|P|X|  CC   |M|     PT      |       sequence number         | |
   +-------------------------------+-------------------------------+ |
   |                           timestamp                           | |
   +---------------------------------------------------------------+ |
   |           synchronization source (SSRC) identifier            | |
   |=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=| |
   |            contributing source (CSRC) identifiers             | |
   |                              ....                             | |
   +---------------------------------------------------------------+ |
   |                  RTP extension(s) (OPTIONAL)                  | |
 +^---------------------+------------------------------------------+ |
 | |   payload header   |                                          | |
 | +--------------------+     payload ...                          | |
 | |                                                               | |
 +^+---------------------------------------------------------------+^+
 | :                      authentication tag                       : |
 | +---------------------------------------------------------------+ |
 |                                                                   |
 ++ Encrypted Portion                       Authenticated Portion +--+

                     Figure 1: SRTP packet format

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   SFU:  Selective Forwarding Unit (AKA RTP Switch)

   IV:  Initialization Vector

   MAC:  Message Authentication Code

   E2EE:  End to End Encryption

   HBH:  Hop By Hop

   KMS:  Key Management System

3.  Goals

   SFrame is designed to be a suitable E2EE protection scheme for
   conference call media in a broad range of scenarios, as outlined by
   the following goals:

   1.  Provide a secure E2EE mechanism for audio and video in
       conference calls that can be used with arbitrary SFU servers.

   2.  Decouple media encryption from key management to allow SFrame to
       be used with an arbitrary KMS.

   3.  Minimize packet expansion to allow successful conferencing in as
       many network conditions as possible.

   4.  Remain independent of the underlying transport, including
       non-RTP transports such as WebTransport.

   5.  When used with RTP and its associated error resilience
       mechanisms, i.e., RTX and FEC, require no special handling for
       RTX and FEC packets.

   6.  Minimize the changes needed in SFU servers.

   7.  Minimize the changes needed in endpoints.

   8.  Work with the most popular audio and video codecs used in
       conferencing scenarios.

4.  SFrame

   We propose a frame-level encryption mechanism that provides effective
   end-to-end encryption, is simple to implement, has no dependencies on
   RTP, and minimizes encryption bandwidth overhead. Because SFrame
   encrypts the full frame, rather than individual packets, bandwidth
   overhead is reduced by having a single IV and authentication tag for
   each media frame.

   Also, because media is encrypted prior to packetization, the
   encrypted frame is packetized using a generic RTP packetizer instead
   of codec-dependent packetization mechanisms. With this move to a
   generic packetizer, media metadata is moved from codec-specific
   mechanisms to a generic frame RTP header extension which, while
   visible to the SFU, is authenticated end-to-end.
   This extension includes metadata needed for SFU routing such as
   resolution, frame beginning and end markers, etc.

   The generic packetizer splits the E2E encrypted media frame into one
   or more RTP packets and adds the SFrame header to the beginning of
   the first packet and an auth tag to the end of the last packet.

      +-------------------------------------------------------+
      |                                                       |
      |  +----------+      +------------+      +-----------+  |
      |  |          |      |   SFrame   |      |Packetizer |  |       DTLS+SRTP
      |  | Encoder  +----->+    Enc     +----->+           +-------------------------+
 ,+.  |  |          |      |            |      |           |  |   +--+  +--+  +--+   |
 `|'  |  +----------+      +-----+------+      +-----------+  |   |  |  |  |  |  |   |
 /|\  |                          ^                            |   |  |  |  |  |  |   |
  +   |                          |                            |   |  |  |  |  |  |   |
 / \  |                          |                            |   +--+  +--+  +--+   |
Alice |                    +-----+------+                     |  Encrypted Packets   |
      |                    |Key Manager |                     |                      |
      |                    +------------+                     |                      |
      |                          ||                           |                      |
      |                          ||                           |                      |
      |                          ||                           |                      |
      +-------------------------------------------------------+                      |
                                 ||                                                  |
                                 ||                                                  v
                           +------------+                                      +-----+------+
   E2EE channel            | Messaging  |                                      |   Media    |
   via the                 |   Server   |                                      |   Server   |
   Messaging Server        |            |                                      |            |
                           +------------+                                      +-----+------+
                                 ||                                                  |
                                 ||                                                  |
      +-------------------------------------------------------+                      |
      |                          ||                           |                      |
      |                          ||                           |                      |
      |                          ||                           |                      |
      |                    +------------+                     |                      |
      |                    |Key Manager |                     |                      |
 ,+.  |                    +-----+------+                     |  Encrypted Packets   |
 `|'  |                          |                            |   +--+  +--+  +--+   |
 /|\  |                          |                            |   |  |  |  |  |  |   |
  +   |                          v                            |   |  |  |  |  |  |   |
 / \  |  +----------+      +-----+------+      +-----------+  |   |  |  |  |  |  |   |
 Bob  |  |          |      |   SFrame   |      |   De+     |  |   +--+  +--+  +--+   |
      |  | Decoder  +<-----+    Dec     +<-----+Packetizer +<------------------------+
      |  |          |      |            |      |           |  |       DTLS+SRTP
      |  +----------+      +------------+      +-----------+  |
      |                                                       |
      +-------------------------------------------------------+

   The E2EE keys used to encrypt the frame are exchanged out of band
   using a secure E2EE channel.

4.1.  SFrame Format

     +------------+------------------------------------------+^+
     |S|LEN|X|KID |              Frame Counter               | |
   +^+------------+------------------------------------------+ |
   | |                                                       | |
   | |                                                       | |
   | |                                                       | |
   | |                                                       | |
   | |                    Encrypted Frame                    | |
   | |                                                       | |
   | |                                                       | |
   | |                                                       | |
   | |                                                       | |
   +^+-------------------------------------------------------+^+
   | |                  Authentication Tag                   | |
   | +-------------------------------------------------------+ |
   |                                                           |
   |                                                           |
   +----+Encrypted Portion            Authenticated Portion+---+

4.2.  SFrame Header

   Since each endpoint can send multiple media layers, each frame will
   have a unique frame counter that will be used to derive the
   encryption IV. The frame counter must be unique and monotonically
   increasing to avoid IV reuse.

   Because each sender uses its own key for encryption, the SFrame
   header includes the key id so that the receiver can identify the key
   needed for decryption.

   Both the frame counter and the key id are encoded in a variable
   length format to decrease the overhead. The length is up to 8 bytes
   and is represented in 3 bits in the SFrame header: 000 represents a
   length of 1, 001 a length of 2, and so on up to 111 for a length of
   8. The first byte in the SFrame header is fixed and contains the
   header metadata with the following format:

    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |R|LEN  |X|  K  |
   +-+-+-+-+-+-+-+-+
   SFrame header metadata

   Reserved (R): 1 bit
      This field MUST be set to zero on sending, and MUST be ignored by
      receivers.

   Counter Length (LEN): 3 bits
      This field indicates the length of the CTR field in bytes (1-8).

   Extended Key Id Flag (X): 1 bit
      Indicates if the key field contains the key id or the key length.

   Key or Key Length (K): 3 bits
      This field contains the key id (KID) if the X flag is set to 0, or
      the key length (KLEN) if set to 1.
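
   As an illustration of the metadata byte just described, the
   following Python sketch packs the R, LEN, X, and K fields on the
   sending side. The function name pack_metadata is hypothetical, and
   the sketch assumes the usual IETF convention that bit 0 in the
   diagram is the most significant bit; it is not taken from a
   reference implementation.

```python
def pack_metadata(ctr_len: int, x: int, k: int) -> int:
    """Build the first SFrame header byte: R(1) | LEN(3) | X(1) | K(3)."""
    if not 1 <= ctr_len <= 8:
        raise ValueError("CTR length must be 1-8 bytes")
    if x not in (0, 1):
        raise ValueError("X is a single bit")
    if not 0 <= k <= 7:
        raise ValueError("K is a 3-bit field")
    # R (bit 0, assumed to be the most significant bit) is sent as zero.
    # LEN stores (length - 1): 000 means 1 byte, 001 means 2 bytes.
    return ((ctr_len - 1) << 4) | (x << 3) | k


# A 2-byte counter with short key id 5 gives the bit pattern 00010101.
print(f"{pack_metadata(ctr_len=2, x=0, k=5):08b}")
```

   The value passed as k is the raw 3-bit field, so a caller setting
   X to 1 would pass the encoded key length rather than a key id.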

   If the X flag is 0, then the KID is in the range 0-7 and the frame
   counter (CTR) is found in the next LEN bytes:

    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+---------------------------------+
   |R|LEN  |0| KID |    CTR... (length=LEN)          |
   +-+-+-+-+-+-+-+-+---------------------------------+

   Frame counter byte length (LEN): 3 bits
      The frame counter length in bytes (1-8).

   Key id (KID): 3 bits
      The key id (0-7).

   Frame counter (CTR): variable length
      Frame counter value, up to 8 bytes long.

   If the X flag is 1, then KLEN is the length of the key id (KID),
   which is found after the SFrame header metadata byte. After the key
   id (KID), the frame counter (CTR) is found in the next LEN bytes:

    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+---------------------------+---------------------------+
   |R|LEN  |1|KLEN |   KID... (length=KLEN)    |    CTR... (length=LEN)    |
   +-+-+-+-+-+-+-+-+---------------------------+---------------------------+

   Frame counter byte length (LEN): 3 bits
      The frame counter length in bytes (1-8).

   Key length (KLEN): 3 bits
      The key id length in bytes (1-8).

   Key id (KID): variable length
      The key id value, up to 8 bytes long.

   Frame counter (CTR): variable length
      Frame counter value, up to 8 bytes long.

4.3.  Encryption Schema

   SFrame encryption uses an AEAD encryption algorithm and hash function
   defined by the ciphersuite in use (see Section 4.4). We will refer
   to the following aspects of the AEAD algorithm below:

   *  "AEAD.Encrypt" and "AEAD.Decrypt" - The encryption and decryption
      functions for the AEAD. We follow the convention of RFC 5116
      [RFC5116] and consider the authentication tag part of the
      ciphertext produced by "AEAD.Encrypt" (as opposed to a separate
      field as in SRTP [RFC3711]).

   *  "AEAD.Nk" - The size of a key for the encryption algorithm, in
      bytes

   *  "AEAD.Nn" - The size of a nonce for the encryption algorithm, in
      bytes

4.3.1.  Key Selection

   Each SFrame encryption or decryption operation is premised on a
   single secret "base_key", which is labeled with an integer KID value
   signaled in the SFrame header.

   The sender and receivers need to agree on which key should be used
   for a given KID. The process for provisioning keys and their KID
   values is beyond the scope of this specification, but its security
   properties will bound the assurances that SFrame provides. For
   example, if SFrame is used to provide E2E security against
   intermediary media nodes, then SFrame keys MUST be negotiated in a
   way that does not make them accessible to these intermediaries.

   For each known KID value, the client stores the corresponding
   symmetric key "base_key". For keys that can be used for encryption,
   the client also stores the next counter value CTR to be used when
   encrypting (initially 0).

   When encrypting a frame, the application specifies which KID is to be
   used, and the counter is incremented after successful encryption.
   When decrypting, the "base_key" for decryption is selected from the
   available keys using the KID value in the SFrame Header.

   A given key MUST NOT be used for encryption by multiple senders.
   Such reuse would result in multiple encrypted frames being generated
   with the same (key, nonce) pair, which harms the protections provided
   by many AEAD algorithms. Implementations SHOULD mark each key as
   usable for encryption or decryption, never both.

   Note that the set of available keys might change over the lifetime of
   a real-time session. In such cases, the client will need to manage
   key usage to avoid media loss due to a key being used to encrypt
   before all receivers are able to use it to decrypt.
   For example, an application may make decryption-only keys available
   immediately, but delay the use of encryption-only keys until (a) all
   receivers have acknowledged receipt of the new key or (b) a timeout
   expires.

4.3.2.  Key Derivation

   SFrame encryption and decryption use a key and salt derived from the
   "base_key" associated with a KID. Given a "base_key" value, the key
   and salt are derived using HKDF [RFC5869] as follows:

     sframe_secret = HKDF-Extract(base_key, 'SFrame10')
     sframe_key = HKDF-Expand(sframe_secret, 'key', AEAD.Nk)
     sframe_salt = HKDF-Expand(sframe_secret, 'salt', AEAD.Nn)

   The hash function used for HKDF is determined by the ciphersuite in
   use.

4.3.3.  Encryption

   After encoding the frame and before packetizing it, the necessary
   media metadata will be moved out of the encoded frame buffer, to be
   used later in the RTP generic frame header extension. The encoded
   frame, the metadata buffer and the frame counter are passed to the
   SFrame encryptor.

   SFrame encryption uses the AEAD encryption algorithm for the
   ciphersuite in use. The key for the encryption is the "sframe_key"
   and the nonce is formed by XORing the "sframe_salt" with the current
   counter, encoded as a big-endian integer of length "AEAD.Nn".

   The encryptor forms an SFrame header using the S, CTR, and KID values
   provided. The encoded header is provided as AAD to the AEAD
   encryption operation, with any frame metadata appended.
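
   The nonce construction described above can be sketched in Python as
   follows. The helper names encode_big_endian and xor mirror those
   used in the draft's pseudocode, but their bodies here are
   assumptions, form_nonce is a hypothetical name, and the salt in the
   usage lines is a made-up value rather than a test vector.

```python
def encode_big_endian(value: int, length: int) -> bytes:
    """Encode a non-negative integer as exactly `length` big-endian bytes."""
    return value.to_bytes(length, "big")


def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def form_nonce(sframe_salt: bytes, ctr: int) -> bytes:
    """XOR the salt with the counter, padded to the salt length (AEAD.Nn)."""
    return xor(sframe_salt, encode_big_endian(ctr, len(sframe_salt)))


salt = bytes(range(12))            # illustrative 12-byte salt (Nn = 12)
print(form_nonce(salt, 0).hex())   # counter 0 leaves the salt unchanged
print(form_nonce(salt, 1).hex())   # only the last byte differs
```

   Because the salt is fixed for a given key, two distinct counter
   values always produce two distinct nonces, which is why the frame
   counter must never repeat under the same key.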

   def encrypt(S, CTR, KID, frame_metadata, frame):
     # Look up the key and salt derived from the base_key for this KID
     sframe_key, sframe_salt = key_store[KID]

     # Nonce = salt XOR big-endian counter
     frame_ctr = encode_big_endian(CTR, AEAD.Nn)
     frame_nonce = xor(sframe_salt, frame_ctr)

     # The header and the frame metadata are authenticated, not encrypted
     header = encode_sframe_header(S, CTR, KID)
     frame_aad = header + frame_metadata

     encrypted_frame = AEAD.Encrypt(sframe_key, frame_nonce, frame_aad, frame)
     return header + encrypted_frame

   The encrypted payload is then passed to a generic RTP packetizer to
   construct the RTP packets, which are then encrypted using SRTP keys
   for the HBH encryption to the media server.

      +----------------+  +---------------+
      | frame metadata |  |               |
      +-------+--------+  |               |
              |           |     frame     |
              |           |               |
              |           |               |
              |           +-------+-------+
              |                   |
   header ----+------------------>| AAD
   +-----+                        |
   |  S  |                        |
   +-----+                        |
   | KID +--+--> sframe_key ----->| Key
   |     |  |                     |
   |     |  +--> sframe_salt -+   |
   +-----+                    |   |
   | CTR +--------------------+-->| Nonce
   |     |                        |
   |     |                        |
   +-----+                        |
      |                     AEAD.Encrypt
      |                           |
      |                           V
      |                   +-------+-------+
      |                   |               |
      |                   |               |
      |                   |   encrypted   |
      |                   |     frame     |
      |                   |               |
      |                   |               |
      |                   +-------+-------+
      |                           |
      |                 generic RTP packetize
      |                           |
      |                           v
      V
   +---------------+      +---------------+     +---------------+
   | SFrame header |      |               |     |               |
   +---------------+      |               |     |               |
   |               |      |  payload 2/N  |     |  payload N/N  |
   |  payload 1/N  |      |               |     |               |
   |               |      |               |     |               |
   +---------------+      +---------------+     +---------------+

                      Figure 2: Encryption flow

4.3.4.  Decryption

   The receiving client buffers all packets that belong to the same
   frame using the frame beginning and ending marks in the generic RTP
   frame header extension, and once all packets are available, it passes
   the frame to SFrame for decryption. The KID field in the SFrame
   header is used to find the right key for the encrypted frame.
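
   Before it can select a key, the receiver has to recover the KID and
   CTR values from the variable-length header laid out in Section 4.2.
   The Python sketch below shows one way to do that. It assumes that
   KLEN uses the same length-minus-one coding as LEN and that KID and
   CTR are carried in big-endian byte order; neither point is spelled
   out for the header in this section, so both are labeled as
   assumptions. The four-element return value is a hypothetical
   convenience.

```python
def parse_header(sframe: bytes):
    """Split an SFrame payload into (kid, ctr, header, rest)."""
    if not sframe:
        raise ValueError("empty SFrame payload")
    meta = sframe[0]
    ctr_len = ((meta >> 4) & 0x07) + 1   # LEN: 000 means 1 byte
    x_flag = (meta >> 3) & 0x01
    k = meta & 0x07                      # the reserved top bit is ignored
    pos = 1
    if x_flag:
        kid_len = k + 1                  # assumption: KLEN coded like LEN
        if len(sframe) < pos + kid_len:
            raise ValueError("truncated SFrame header")
        # Assumption: multi-byte KID values are big-endian.
        kid = int.from_bytes(sframe[pos:pos + kid_len], "big")
        pos += kid_len
    else:
        kid = k                          # short form: KID is 0-7
    if len(sframe) < pos + ctr_len:
        raise ValueError("truncated SFrame header")
    # Assumption: the counter is big-endian, as it is in the nonce.
    ctr = int.from_bytes(sframe[pos:pos + ctr_len], "big")
    pos += ctr_len
    return kid, ctr, sframe[:pos], sframe[pos:]


# Short form: LEN=001 (2 bytes), X=0, KID=5, CTR=0x0102
print(parse_header(bytes([0x15, 0x01, 0x02]) + b"ciphertext")[:2])
```

   Returning the raw header bytes alongside the parsed values is
   convenient because the header is also needed, unmodified, as part
   of the AAD.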

   def decrypt(frame_metadata, sframe):
     header, encrypted_frame = split_header(sframe)
     S, CTR, KID = parse_header(header)

     # Select the key and salt provisioned for this KID
     sframe_key, sframe_salt = key_store[KID]

     # Rebuild the same nonce and AAD that the sender used
     frame_ctr = encode_big_endian(CTR, AEAD.Nn)
     frame_nonce = xor(sframe_salt, frame_ctr)
     frame_aad = header + frame_metadata

     return AEAD.Decrypt(sframe_key, frame_nonce, frame_aad, encrypted_frame)

   For frames that fail to decrypt because there is no key available
   for the KID in the SFrame header, the client MAY buffer the frame and
   retry decryption once a key with that KID is received.

4.3.5.  Duplicate Frames

   Unlike in messaging applications, receiving a duplicate frame in a
   video call does not necessarily mean that the client is under a
   replay attack. There are other possible causes; for example, the
   sender might resend frames to compensate for packet loss. SFrame
   decryptors use the highest received frame counter to protect against
   replay, accepting older frames only within a short interval to
   support out-of-order delivery.

4.4.  Ciphersuites

   Each SFrame session uses a single ciphersuite that specifies the
   following primitives:

   *  A hash function used for key derivation and hashing signature
      inputs

   *  An AEAD encryption algorithm [RFC5116] used for frame encryption,
      optionally with a truncated authentication tag

   *  [Optional] A signature algorithm

   This document defines the following ciphersuites:

   +========+==========================+====+====+====+===========+
   | Value  | Name                     | Nh | Nk | Nn | Reference |
   +========+==========================+====+====+====+===========+
   | 0x0001 | AES_CM_128_HMAC_SHA256_8 | 32 | 16 | 12 | RFC XXXX  |
   +--------+--------------------------+----+----+----+-----------+
   | 0x0002 | AES_CM_128_HMAC_SHA256_4 | 32 | 16 | 12 | RFC XXXX  |
   +--------+--------------------------+----+----+----+-----------+
   | 0x0003 | AES_GCM_128_SHA256       | 32 | 16 | 12 | RFC XXXX  |
   +--------+--------------------------+----+----+----+-----------+
   | 0x0004 | AES_GCM_256_SHA512       | 64 | 32 | 12 | RFC XXXX  |
   +--------+--------------------------+----+----+----+-----------+

                                Table 1

   In the "AES_CM" suites, the length of the authentication tag is
   indicated by the last value: "_8" indicates an eight-byte tag and
   "_4" indicates a four-byte tag.

   In a session that uses multiple media streams, different ciphersuites
   might be configured for different media streams. For example, in
   order to conserve bandwidth, a session might use a ciphersuite with
   80-bit tags for video frames and another ciphersuite with 32-bit tags
   for audio frames.

4.4.1.  AES-CM with SHA2

   In order to allow very short tag sizes, we define a synthetic AEAD
   function using the counter mode of AES (AES-CM) for encryption
   together with HMAC for authentication. We use an encrypt-then-MAC
   approach as in SRTP [RFC3711].

   Before encryption or decryption, encryption and authentication
   subkeys are derived from the single AEAD key using HKDF. The subkeys
   are derived as follows, where "Nk" represents the key size for the
   AES block cipher in use and "Nh" represents the output size of the
   hash function:

   def derive_subkeys(sframe_key):
     aead_secret = HKDF-Extract(sframe_key, 'SFrame10 AES CM AEAD')
     enc_key = HKDF-Expand(aead_secret, 'enc', Nk)
     auth_key = HKDF-Expand(aead_secret, 'auth', Nh)
     return enc_key, auth_key

   The AEAD encryption and decryption functions are then composed of
   individual calls to the CM encrypt function and HMAC. The resulting
   MAC value is truncated to a number of bytes "tag_len" fixed by the
   ciphersuite.

   def compute_tag(auth_key, nonce, aad, ct):
     # The two lengths are included so the AAD/ciphertext split is unambiguous
     aad_len = encode_big_endian(len(aad), 8)
     ct_len = encode_big_endian(len(ct), 8)
     auth_data = aad_len + ct_len + nonce + aad + ct
     tag = HMAC(auth_key, auth_data)
     return truncate(tag, tag_len)

   def AEAD.Encrypt(key, nonce, aad, pt):
     enc_key, auth_key = derive_subkeys(key)
     ct = AES-CM.Encrypt(enc_key, nonce, pt)
     tag = compute_tag(auth_key, nonce, aad, ct)
     return ct + tag

   def AEAD.Decrypt(key, nonce, aad, ct):
     inner_ct, tag = split_ct(ct, tag_len)

     # Verify the tag before decrypting (encrypt-then-MAC)
     enc_key, auth_key = derive_subkeys(key)
     candidate_tag = compute_tag(auth_key, nonce, aad, inner_ct)
     if not constant_time_equal(tag, candidate_tag):
       raise Exception("Authentication Failure")

     return AES-CM.Decrypt(enc_key, nonce, inner_ct)

5.  Key Management

   SFrame must be integrated with an E2E key management framework to
   exchange and rotate the keys used for SFrame encryption and/or
   signing.
   The key management framework provides the following functions:

   *  Provisioning KID/"base_key" mappings to participating clients

   *  (optional) Provisioning clients with a list of trusted signing
      keys

   *  Updating the above data as clients join or leave

   It is up to the application to define a rotation schedule for keys.
   For example, one application might have an ephemeral group for every
   call and rotate keys whenever endpoints join or leave the call,
   while another application could have a persistent group that can be
   used for multiple calls and simply derive ephemeral symmetric keys
   for a specific call.

5.1.  Sender Keys

   If the participants in a call have a pre-existing E2E-secure channel,
   they can use it to distribute SFrame keys. Each client participating
   in a call generates a fresh encryption key and optionally a signing
   key pair. The client then uses the E2E-secure channel to send their
   encryption key and signing public key to the other participants.

   In this scheme, it is assumed that receivers have a signal outside of
   SFrame indicating which client has sent a given frame, for example
   the RTP SSRC. SFrame KID values are then used to distinguish
   generations of the sender's key. At the beginning of a call, each
   sender encrypts with KID=0. Thereafter, the sender can ratchet their
   key forward for forward secrecy:

   sender_key[i+1] = HKDF-Expand(
                       HKDF-Extract(sender_key[i], 'SFrame10 ratchet'),
                       '', AEAD.Nk)

   The sender signals such an update by incrementing their KID value. A
   receiver who receives from a sender with a new KID computes the new
   key as above. The old key may be kept for some time to allow for
   out-of-order delivery, but should be deleted promptly.
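
   The ratchet and the receiver-side catch-up can be sketched in Python
   as follows. HKDF is implemented directly from RFC 5869 with SHA-256.
   The arguments to HKDF-Extract are passed in the order the draft
   writes them, which under the RFC 5869 signature
   HKDF-Extract(salt, IKM) makes the sender key the salt and the label
   the input keying material. Whether that is the intended mapping is
   an assumption that should be checked against the test vectors. The
   names ratchet and catch_up are hypothetical, and Nk = 16 is used
   only as an example.

```python
import hashlib
import hmac


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract from RFC 5869, instantiated with SHA-256."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand from RFC 5869, instantiated with SHA-256."""
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def ratchet(sender_key: bytes, nk: int = 16) -> bytes:
    """One ratchet step: sender_key[i] -> sender_key[i+1]."""
    # Assumption: argument order copied literally from the draft's notation.
    secret = hkdf_extract(sender_key, b"SFrame10 ratchet")
    return hkdf_expand(secret, b"", nk)


def catch_up(key: bytes, current_kid: int, new_kid: int, nk: int = 16) -> bytes:
    """Advance a receiver's copy of a sender key to a newer KID."""
    if new_kid < current_kid:
        raise ValueError("the ratchet only moves forward")
    for _ in range(new_kid - current_kid):
        key = ratchet(key, nk)
    return key
```

   A receiver holding the key for KID 3 that sees KID 5 from the same
   sender would call catch_up(key, 3, 5). A frame arriving with an
   older KID cannot be handled this way, which is why the old key has
   to be retained briefly if out-of-order delivery is to be tolerated.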

   If a new participant joins mid-call, they will need to receive from
   each sender (a) the current sender key for that sender, (b) the
   signing key for the sender, if used, and (c) the current KID value
   for the sender. Evicting a participant requires each sender to send
   a fresh sender key to all receivers.

5.2.  MLS

   The Messaging Layer Security (MLS) protocol provides group
   authenticated key exchange [I-D.ietf-mls-architecture]
   [I-D.ietf-mls-protocol]. In principle, it could be used to
   instantiate the sender key scheme above, but it can also be used more
   efficiently directly.

   MLS creates a linear sequence of keys, each of which is shared among
   the members of a group at a given point in time. When a member joins
   or leaves the group, a new key is produced that is known only to the
   augmented or reduced group. Each step in the lifetime of the group
   is known as an "epoch", and each member of the group is assigned an
   "index" that is constant for the time they are in the group.

   In SFrame, we derive per-sender "base_key" values from the group
   secret for an epoch, and use the KID field to signal the epoch and
   sender index. First, we use the MLS exporter to compute a shared
   SFrame secret for the epoch.

   sframe_epoch_secret = MLS-Exporter("SFrame 10 MLS", "", AEAD.Nk)

   sender_base_key[index] = HKDF-Expand(sframe_epoch_secret,
                              encode_big_endian(index, 4), AEAD.Nk)

   For compactness, we do not send the whole epoch number. Instead, we
   send only its low-order E bits. Note that E effectively defines a
   re-ordering window, since no more than 2^E epochs can be active at a
   given time. Receivers MUST be prepared for the epoch counter to roll
   over, removing an old epoch when a new epoch with the same E lower
   bits is introduced. (Sender indices cannot be similarly compressed.)
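
   As a sketch of this packing, the Python functions below place the
   sender index in the high-order bits of the KID and the low-order E
   bits of the epoch underneath it, then reverse the operation on
   receipt. The function names mls_kid and split_kid are hypothetical.
   The usage lines take E = 4, a value inferred here because it
   reproduces the KID values in the draft's epoch diagram; the draft
   does not state E explicitly.

```python
def mls_kid(sender_index: int, epoch: int, e_bits: int) -> int:
    """Pack a sender index and the low-order E bits of the epoch into a KID."""
    if sender_index < 0 or epoch < 0 or e_bits < 0:
        raise ValueError("arguments must be non-negative")
    return (sender_index << e_bits) + (epoch % (1 << e_bits))


def split_kid(kid: int, e_bits: int):
    """Recover (sender_index, truncated_epoch) from a KID."""
    return kid >> e_bits, kid & ((1 << e_bits) - 1)


print(hex(mls_kid(sender_index=33, epoch=17, e_bits=4)))  # 0x211
print(split_kid(0x14E, 4))                                # (20, 14)
```

   The receiver only learns the epoch modulo 2^E, so it has to map the
   truncated value back to one of the epochs it currently considers
   active. With E = 4, epochs 1 and 17 are indistinguishable on the
   wire, which is the roll-over case the text above requires receivers
   to handle.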

   KID = (sender_index << E) + (epoch % (1 << E))

   Once an SFrame stack has been provisioned with the
   "sframe_epoch_secret" for an epoch, it can compute the required KIDs
   and "sender_base_key" values on demand, as it needs to encrypt/
   decrypt for a given member.

           ...
            |
   Epoch 17 +--+-- index=33 -> KID = 0x211
            |  |
            |  +-- index=51 -> KID = 0x331
            |
            |
   Epoch 16 +--+-- index=2 --> KID = 0x20
            |
            |
   Epoch 15 +--+-- index=3 --> KID = 0x3f
            |  |
            |  +-- index=5 --> KID = 0x5f
            |
            |
   Epoch 14 +--+-- index=3 --> KID = 0x3e
            |  |
            |  +-- index=7 --> KID = 0x7e
            |  |
            |  +-- index=20 -> KID = 0x14e
            |
           ...

   MLS also provides an authenticated signing key pair for each
   participant. When SFrame uses signatures, these are the keys used to
   generate SFrame signatures.

6.  Media Considerations

6.1.  SFU

   Selective Forwarding Units (SFUs), as described in
   https://tools.ietf.org/html/rfc7667#section-3.7, receive the RTP
   streams from each participant and select which ones should be
   forwarded to each of the other participants. There are several
   approaches to this stream selection, but in general, the SFU needs
   to access metadata associated with each frame and modify the RTP
   information of the incoming packets when they are transmitted to
   the receiving participants.

   This section describes how these normal SFU modes of operation
   interact with the E2EE provided by SFrame.

6.1.1.  LastN and RTP stream reuse

   The SFU may choose to send only a certain number of streams based on
   the voice activity of the participants. To reduce the number of SDP
   offer/answer (O/A) exchanges required to establish a new RTP stream,
   the SFU may decide to reuse previously existing RTP sessions or even
   pre-allocate a predefined number of RTP streams and choose at each
   moment which participant's media will be sent through each of them.
This means that 758 in the same RTP stream (defined by either SSRC or MID) may carry 759 media from different streams of different participants. As different 760 keys are used by each participant for encoding their media, the 761 receiver will be able to verify which is the sender of the media 762 coming within the RTP stream at any given point if time, preventing 763 the SFU trying to impersonate any of the participants with another 764 participant's media. Note that in order to prevent impersonation by 765 a malicious participant (not the SFU) usage of the signature is 766 required. In case of video, the a new signature should be started 767 each time a key frame is sent to allow the receiver to identify the 768 source faster after a switch. 770 6.1.2. Simulcast 772 When using simulcast, the same input image will produce N different 773 encoded frames (one per simulcast layer) which would be processed 774 independently by the frame encryptor and assigned an unique counter 775 for each. 777 6.1.3. SVC 779 In both temporal and spatial scalability, the SFU may choose to drop 780 layers in order to match a certain bitrate or forward specific media 781 sizes or frames per second. In order to support it, the sender MUST 782 encode each spatial layer of a given picture in a different frame. 783 That is, an RTP frame may contain more than one SFrame encrypted 784 frame with an incrementing frame counter. 786 6.2. Video Key Frames 788 Forward and Post-Compromise Security requires that the e2ee keys are 789 updated anytime a participant joins/leave the call. 791 The key exchange happens async and on a different path than the SFU 792 signaling and media. So it may happen that when a new participant 793 joins the call and the SFU side requests a key frame, the sender 794 generates the e2ee encrypted frame with a key not known by the 795 receiver, so it will be discarded. 
When the sender updates his 796 sending key with the new key, it will send it in a non-key frame, so 797 the receiver will be able to decrypt it, but not decode it. 799 Receiver will re-request an key frame then, but due to sender and sfu 800 policies, that new key frame could take some time to be generated. 802 If the sender sends a key frame when the new e2ee key is in use, the 803 time required for the new participant to display the video is 804 minimized. 806 6.3. Partial Decoding 808 Some codes support partial decoding, where it can decrypt individual 809 packets without waiting for the full frame to arrive, with SFrame 810 this won't be possible because the decoder will not access the 811 packets until the entire frame is arrived and decrypted. 813 7. Overhead 815 The encryption overhead will vary between audio and video streams, 816 because in audio each packet is considered a separate frame, so it 817 will always have extra MAC and IV, however a video frame usually 818 consists of multiple RTP packets. The number of bytes overhead per 819 frame is calculated as the following 1 + FrameCounter length + 4 The 820 constant 1 is the SFrame header byte and 4 bytes for the HBH 821 authentication tag for both audio and video packets. 823 7.1. Audio 825 Using three different audio frame durations 20ms (50 packets/s) 40ms 826 (25 packets/s) 100ms (10 packets/s) Up to 3 bytes frame counter (3.8 827 days of data for 20ms frame duration) and 4 bytes fixed MAC length. 
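
   The per-frame overhead formula above can be evaluated directly for
   these audio settings.  The following Python sketch is illustrative
   only; the function and constant names are hypothetical, and the
   4-byte tag length is the assumption stated in this section rather
   than a property of every ciphersuite.

```python
HEADER_BYTES = 1   # the SFrame header byte
TAG_BYTES = 4      # authentication tag length assumed in this section


def overhead_bytes_per_frame(counter_len: int) -> int:
    """1 + FrameCounter length + 4, in bytes."""
    return HEADER_BYTES + counter_len + TAG_BYTES


def overhead_bps(counter_len: int, frames_per_second: float) -> float:
    """Overhead in bits per second for a given frame rate."""
    return overhead_bytes_per_frame(counter_len) * 8 * frames_per_second


def audio_overhead_table() -> dict:
    """Overhead for 20ms, 40ms and 100ms frames, keyed by counter length."""
    durations_ms = (20, 40, 100)
    return {
        counter_len: [int(overhead_bps(counter_len, 1000 / d))
                      for d in durations_ms]
        for counter_len in (1, 2, 3)
    }


def counter_lifetime_days(counter_len: int, frame_ms: float) -> float:
    """Time until a counter of counter_len bytes is exhausted."""
    frames = 256 ** counter_len
    return frames * frame_ms / 1000 / 86400


if __name__ == "__main__":
    for counter_len, row in audio_overhead_table().items():
        print(counter_len, row)
    print(round(counter_lifetime_days(3, 20), 2), "days")
```

   Since each audio packet is its own frame, the frame rate here is
   simply the packet rate (1000 divided by the frame duration in
   milliseconds).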

   +=============+===========+==========+==========+===========+
   | Counter len | Packets   | Overhead | Overhead | Overhead  |
   |             |           | bps@20ms | bps@40ms | bps@100ms |
   +=============+===========+==========+==========+===========+
   | 1           | 0 - 255   | 2400     | 1200     | 480       |
   +-------------+-----------+----------+----------+-----------+
   | 2           | 256 - 65K | 2800     | 1400     | 560       |
   +-------------+-----------+----------+----------+-----------+
   | 3           | 65K - 16M | 3200     | 1600     | 640       |
   +-------------+-----------+----------+----------+-----------+

                                  Table 2

7.2.  Video

   The per-stream overhead in bits per second is calculated for the
   following video encodings:

   *  30fps@1000Kbps (4 packets per frame)

   *  30fps@512Kbps (2 packets per frame)

   *  15fps@200Kbps (2 packets per frame)

   *  7.5fps@30Kbps (1 packet per frame)

   Overhead bps = (Counter length + 1 + 4) * 8 * fps

   +=============+===========+===========+===========+============+
   | Counter len | Frames    | Overhead  | Overhead  | Overhead   |
   |             |           | bps@30fps | bps@15fps | bps@7.5fps |
   +=============+===========+===========+===========+============+
   | 1           | 0 - 255   | 1440      | 1440      | 720        |
   +-------------+-----------+-----------+-----------+------------+
   | 2           | 256 - 65K | 1680      | 1680      | 840        |
   +-------------+-----------+-----------+-----------+------------+
   | 3           | 65K - 16M | 1920      | 1920      | 960        |
   +-------------+-----------+-----------+-----------+------------+
   | 4           | 16M - 4B  | 2160      | 2160      | 1080       |
   +-------------+-----------+-----------+-----------+------------+

                                  Table 3

7.3.  SFrame vs PERC-lite

   [RFC8723] has significant overhead compared to SFrame because its
   overhead is per packet rather than per frame, and because it adds
   the OHB (Original Header Block), which duplicates any RTP header or
   extension field modified by the SFU.  [I-D.murillo-perc-lite]
   (https://mailarchive.ietf.org/arch/msg/perc/SB0qMHWz6EsDtz3yIEX0HWp5IEY/)
   is slightly better because it no longer uses the OHB; however, it
   still does per-packet encryption using SRTP.  Below is the overhead
   of [I-D.murillo-perc-lite] as implemented by CoSMo Software, which
   uses an extra 11 bytes per packet to preserve the PT, SEQ_NUM,
   TIME_STAMP and SSRC fields, in addition to the extra MAC tag per
   packet.

   OverheadPerPacket = 11 + MAC length

   Overhead bps = PacketPerSecond * OverheadPerPacket * 8

   As with SFrame, we assume that the HBH authentication tag length is
   always 4 bytes for both audio and video, even though this is not the
   case in this [I-D.murillo-perc-lite] implementation.

7.3.1.  Audio

   +===================+===================+====================+
   | Overhead bps@20ms | Overhead bps@40ms | Overhead bps@100ms |
   +===================+===================+====================+
   | 6000              | 3000              | 1200               |
   +-------------------+-------------------+--------------------+

                                  Table 4

7.3.2.  Video

   +=======================+====================+=====================+
   | Overhead bps@30fps    | Overhead bps@15fps | Overhead bps@7.5fps |
   | (4 packets per frame) | (2 packets per     | (1 packet per       |
   |                       | frame)             | frame)              |
   +=======================+====================+=====================+
   | 14400                 | 7200               | 3600                |
   +-----------------------+--------------------+---------------------+

                                  Table 5

   For a conference with a single incoming audio stream (@ 50 pps) and 4
   incoming video streams (@200 Kbps), the savings in overhead is 34800
   - 9600 = ~25 Kbps, or ~3%.

8.  Security Considerations

8.1.  No Per-Sender Authentication

   SFrame does not provide per-sender authentication of media data.  Any
   sender in a session can send media that will be associated with any
   other sender.  This is because SFrame uses symmetric encryption to
   protect media data, so that any receiver also has the keys required
   to encrypt packets for the sender.

8.2.  Key Management

   The key exchange mechanism is out of scope of this document;
   however, every client MUST change its keys when a client joins or
   leaves the call, in order to provide "Forward Secrecy" and "Post
   Compromise Security".

8.3.  Authentication tag length

   The cipher suites defined in this draft use short authentication
   tags for encryption; however, SFrame can easily support other
   ciphers with full-length authentication tags if the short ones are
   proven insecure.

9.  IANA Considerations

   This document makes no requests of IANA.

10.  Test Vectors

   This section provides a set of test vectors that implementations can
   use to verify that they correctly implement SFrame encryption and
   decryption.  For each ciphersuite, we provide:

   *  [in] The "base_key" value (hex encoded)

   *  [out] The "secret", "key", and "salt" values derived from the
      "base_key" (hex encoded)

   *  A plaintext value that is encrypted in the following encryption
      cases

   *  A sequence of encryption cases, including:

      -  [in] The "KID" and "CTR" values to be included in the header

      -  [out] The resulting encoded header (hex encoded)

      -  [out] The nonce computed from the "salt" and "CTR" values

      -  The ciphertext resulting from encrypting the plaintext with
         these parameters (hex encoded)

   An implementation should reproduce the output values given the input
   values:

   *  An implementation should be able to encrypt with the input values
      and the plaintext to produce the ciphertext.

   *  An implementation must be able to decrypt with the input values
      and the ciphertext to generate the plaintext.
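
   Two of the [out] values, the nonce and the encoded header, can be
   reproduced without any cryptographic library.  The Python sketch
   below is a non-normative illustration with hypothetical function
   names.  The nonce rule (salt XOR big-endian CTR) and the header
   layout (counter length in bits 4-6, an extended-KID flag in bit 3,
   and either the KID itself or its byte length in bits 0-2) are
   assumptions inferred from the test vectors in this section, not
   taken from the header definition.

```python
def sframe_nonce(salt: bytes, ctr: int) -> bytes:
    """XOR the salt with CTR, big-endian and left-padded to len(salt)."""
    ctr_bytes = ctr.to_bytes(len(salt), "big")
    return bytes(s ^ c for s, c in zip(salt, ctr_bytes))


def byte_length(value: int) -> int:
    """Minimum number of bytes for value, with at least one byte."""
    return max(1, (value.bit_length() + 7) // 8)


def encode_header(kid: int, ctr: int) -> bytes:
    """Encode KID and CTR the way the vectors in this section do."""
    ctr_len = byte_length(ctr)
    if ctr_len > 7:
        raise ValueError("CTR does not fit in 7 bytes")
    ctr_bytes = ctr.to_bytes(ctr_len, "big")
    if kid < 8:
        # Short form: the KID is carried in the low three bits.
        return bytes([(ctr_len << 4) | kid]) + ctr_bytes
    kid_len = byte_length(kid)
    if kid_len > 7:
        raise ValueError("KID does not fit in 7 bytes")
    # Extended form: flag bit set, low bits carry the KID length.
    first = (ctr_len << 4) | 0x08 | kid_len
    return bytes([first]) + kid.to_bytes(kid_len, "big") + ctr_bytes


if __name__ == "__main__":
    salt = bytes.fromhex("42d662fbad5cd81eb3aad79a")
    print(encode_header(0x1ff, 0xaaaa).hex())
    print(sframe_nonce(salt, 0xaaaa).hex())
```

   Checking these two values first can help separate header or nonce
   encoding mistakes from errors in the key derivation or the AEAD
   itself.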

   Line breaks and whitespace within values are inserted to conform to
   the width requirements of the RFC format.  They should be removed
   before use.  These test vectors are also available in JSON format at
   [TestVectors].

10.1.  AES_CM_128_HMAC_SHA256_4

   CipherSuite:    0x01
   Base Key:       101112131415161718191a1b1c1d1e1f
   Key:            343d3290f5c0b936415bea9a43c6f5a2
   Salt:           42d662fbad5cd81eb3aad79a
   Plaintext:      46726f6d2068656176656e6c79206861
                   726d6f6e79202f2f205468697320756e
                   6976657273616c206672616d65206265
                   67616e

   KID:            0x7
   CTR:            0x0
   Header:         1700
   Nonce:          42d662fbad5cd81eb3aad79a
   Ciphertext:     170065c67c6fb784631a7db1b589ffb6
                   2d75b78e28b0899e632fbbee3b944747
                   a6382d75b6bd3788dc7b71b9295c7fb9
                   0b5098f7add14ef329

   KID:            0x7
   CTR:            0x1
   Header:         1701
   Nonce:          42d662fbad5cd81eb3aad79b
   Ciphertext:     1701ec742e98d667be810f153ff0d4da
                   d7969f69b310aa7c6b9cb911e83af09b
                   0f0a6d74772d8195c8c9dae3878fd1cb
                   10edb4176d12e2387a

   KID:            0x7
   CTR:            0x2
   Header:         1702
   Nonce:          42d662fbad5cd81eb3aad798
   Ciphertext:     1702ac9b495d37a1e48c712ade5cba72
                   df0bf90f24aa022a454cfb92d8b87cd5
                   4335fb6b9eeded6a5aa4e2643d7a0994
                   6646001d0a41b09557

   KID:            0xf
   CTR:            0xaa
   Header:         190faa
   Nonce:          42d662fbad5cd81eb3aad730
   Ciphertext:     190faaeaa5adc70cae0d6ebd36805fa8
                   7d2351dd02c55c751cd351a7fdb7f092
                   7b474eae3e800033e08100a440002da1
                   7579678b36dc275789d5

   KID:            0x1ff
   CTR:            0xaa
   Header:         1a01ffaa
   Nonce:          42d662fbad5cd81eb3aad730
   Ciphertext:     1a01ffaaeaa5adc70cae0d6ebd36805f
                   a87d2351dd02c55c751cd351a7fdb7f0
                   927b474eae3e800033e08100a440002d
                   a17579678b36dc9bbe558b

   KID:            0x1ff
   CTR:            0xaaaa
   Header:         2a01ffaaaa
   Nonce:          42d662fbad5cd81eb3aa7d30
   Ciphertext:     2a01ffaaaa170500225053f1a044e51c
                   4e91a6b783f69b1714fb31531d95d5b8
                   dd7926c2d43405b4f32b9b49dd6e0aa5
                   aba2427a94ff97f81dcd2826

   KID:            0xffffffffffffff
   CTR:            0xffffffffffffff
   Header:         7fffffffffffffffffffffffffffff
   Nonce:          42d662fbada327e14c552865
   Ciphertext:     7fffffffffffffffffffffffffffffdc
                   a3655d5117bc838d6f4382ca468a4f99
                   2ff77bfd1d2f4391be6b33e8fb638dc4
                   8aa82f57fd91430c714def0b2089c8bf
                   b2ac9da92415

10.2.  AES_CM_128_HMAC_SHA256_8

   CipherSuite:    0x02
   Base Key:       202122232425262728292a2b2c2d2e2f
   Key:            3fce747d505e46ec9b92d9f58ee7a5d4
   Salt:           77fbf5f1d82c73f6d2b353c9
   Plaintext:      46726f6d2068656176656e6c79206861
                   726d6f6e79202f2f205468697320756e
                   6976657273616c206672616d65206265
                   67616e

   KID:            0x7
   CTR:            0x0
   Header:         1700
   Nonce:          77fbf5f1d82c73f6d2b353c9
   Ciphertext:     1700647513fce71aab7fed1e904fd924
                   0343d77092c831f0d58fde0985a0f3e5
                   ba4020e87a7b9c870b5f8f7f628d2769
                   0cc1e571e4d391da5fbf428433

   KID:            0x7
   CTR:            0x1
   Header:         1701
   Nonce:          77fbf5f1d82c73f6d2b353c8
   Ciphertext:     17019e1bdf713b0d4c02f3dbf50a72ea
                   773286e7da38f3872cc734f3e1b1448a
                   ab5009b424e05495214f96d02e4e8f8d
                   a975cc808f40f67cafead7cffd

   KID:            0x7
   CTR:            0x2
   Header:         1702
   Nonce:          77fbf5f1d82c73f6d2b353cb
   Ciphertext:     170220ad36fd9191453ace2d36a175ad
                   8a69c1f16b8613d14b4f7ef30c68bc56
                   09e349df38155cc1544d7dbfa079e3fa
                   ae3c7883b448e75047caafe05b

   KID:            0xf
   CTR:            0xaa
   Header:         190faa
   Nonce:          77fbf5f1d82c73f6d2b35363
   Ciphertext:     190faadab9b284a4b9e3aea36b9cdcae
                   4a58e141d3f0f52f240ef80a93dbb8d8
                   09ede01b05b2cace18a22fb39c032724
                   481c5baa181d6b793458355b0f30

   KID:            0x1ff
   CTR:            0xaa
   Header:         1a01ffaa
   Nonce:          77fbf5f1d82c73f6d2b35363
   Ciphertext:     1a01ffaadab9b284a4b9e3aea36b9cdc
                   ae4a58e141d3f0f52f240ef80a93dbb8
                   d809ede01b05b2cace18a22fb39c0327
                   24481c5baa181dad5ad0f89a1cfb58

   KID:            0x1ff
   CTR:            0xaaaa
   Header:         2a01ffaaaa
   Nonce:          77fbf5f1d82c73f6d2b3f963
   Ciphertext:     2a01ffaaaae0f2384e4dc472cb92238b
                   5b722159205c4481665484de66985f15
                   5071655ca4e9d1c998781f8c7d439f8d
                   1eb6f6071cd80fd22f7e8846ba91036a

   KID:            0xffffffffffffff
   CTR:            0xffffffffffffff
   Header:         7fffffffffffffffffffffffffffff
   Nonce:          77fbf5f1d8d38c092d4cac36
   Ciphertext:     7fffffffffffffffffffffffffffff4b
                   8c7429d7ee83eec5e53808b80555b1f8
                   0b1df9d97877575fa1c7fa35b6119c68
                   ed6543020075959dcc4ca6900a7f9cf1
                   d936b640bba41ca62f6c

10.3.  AES_GCM_128_SHA256

   CipherSuite:    0x03
   Base Key:       303132333435363738393a3b3c3d3e3f
   Key:            2ea2e8163ff56c0613e6fa9f20a213da
   Salt:           a80478b3f6fba19983d540d5
   Plaintext:      46726f6d2068656176656e6c79206861
                   726d6f6e79202f2f205468697320756e
                   6976657273616c206672616d65206265
                   67616e

   KID:            0x7
   CTR:            0x0
   Header:         1700
   Nonce:          a80478b3f6fba19983d540d5
   Ciphertext:     17000e426255e47ed70dd7d15d69d759
                   bf459032ca15f5e8b2a91e7d348aa7c1
                   86d403f620801c495b1717a35097411a
                   a97cbb140671eb3b49ac3775926db74d
                   57b91e8e6c

   KID:            0x7
   CTR:            0x1
   Header:         1701
   Nonce:          a80478b3f6fba19983d540d4
   Ciphertext:     170103bbafa34ada8a6b9f2066bc34a1
                   959d87384c9f4b1ce34fed58e938bde1
                   43393910b1aeb55b48d91d5b0db3ea67
                   e3d0e02b843afd41630c940b1948e72d
                   d45396a43a

   KID:            0x7
   CTR:            0x2
   Header:         1702
   Nonce:          a80478b3f6fba19983d540d7
   Ciphertext:     170258d58adebd8bf6f3cc0c1fcacf34
                   ba4d7a763b2683fe302a57f1be7f2a27
                   4bf81b2236995fec1203cadb146cd402
                   e1c52d5e6a10989dfe0f4116da1ee4c2
                   fad0d21f8f

   KID:            0xf
   CTR:            0xaa
   Header:         190faa
   Nonce:          a80478b3f6fba19983d5407f
   Ciphertext:     190faad0b1743bf5248f90869c945636
                   6d55724d16bbe08060875815565e90b1
                   14f9ccbdba192422b33848a1ae1e3bd2
                   66a001b2f5bb727112772e0072ea8679
                   ca1850cf11d8

   KID:            0x1ff
   CTR:            0xaa
   Header:         1a01ffaa
   Nonce:          a80478b3f6fba19983d5407f
   Ciphertext:     1a01ffaad0b1743bf5248f90869c9456
                   366d55724d16bbe08060875815565e90
                   b114f9ccbdba192422b33848a1ae1e3b
                   d266a001b2f5bbc9c63bd3973c19bd57
                   127f565380ed4a

   KID:            0x1ff
   CTR:            0xaaaa
   Header:         2a01ffaaaa
   Nonce:          a80478b3f6fba19983d5ea7f
   Ciphertext:     2a01ffaaaa9de65e21e4f1ca2247b879
                   43c03c5cb7b182090e93d508dcfb76e0
                   8174c6397356e682d2eaddabc0b3c101
                   8d2c13c3570f61c1beaab805f27b565e
                   1329a823a7a649b6

   KID:            0xffffffffffffff
   CTR:            0xffffffffffffff
   Header:         7fffffffffffffffffffffffffffff
   Nonce:          a80478b3f6045e667c2abf2a
   Ciphertext:     7fffffffffffffffffffffffffffff09
                   981bdcdad80e380b6f74cf6afdbce946
                   839bedadd57578bfcd809dbcea535546
                   cc24660613d2761adea852155785011e
                   633534f4ecc3b8257c8d34321c27854a
                   1422

10.4.  AES_GCM_256_SHA512

   CipherSuite:    0x04
   Base Key:       404142434445464748494a4b4c4d4e4f
                   505152535455565758595a5b5c5d5e5f
   Key:            436774b0b5ae45633d96547f8f3cb06c
                   8e6628eff2e4255b5c4d77e721aa3355
   Salt:           31ed26f90a072e6aee646298
   Plaintext:      46726f6d2068656176656e6c79206861
                   726d6f6e79202f2f205468697320756e
                   6976657273616c206672616d65206265
                   67616e

   KID:            0x7
   CTR:            0x0
   Header:         1700
   Nonce:          31ed26f90a072e6aee646298
   Ciphertext:     1700f3e297c1e95207710bd31ccc4ba3
                   96fbef7b257440bde638ff0f3c891154
                   0136df61b26220249d6c432c245ae8d5
                   5ef45bfccf32530a15aeaaf313a03838
                   e51bd45652

   KID:            0x7
   CTR:            0x1
   Header:         1701
   Nonce:          31ed26f90a072e6aee646299
   Ciphertext:     170193268b0bf030071bff443bb6b447
                   1bdfb1cc81bc9625f4697b0336ff4665
                   d15f152f02169448d8a967fb06359a87
                   d2145398de0ce3fbe257b0992a3da153
                   7590459f3c

   KID:            0x7
   CTR:            0x2
   Header:         1702
   Nonce:          31ed26f90a072e6aee64629a
   Ciphertext:     1702649691ba27c4c01a41280fba4657
                   c03fa7fe21c8f5c862e9094227c3ca3e
                   c0d9468b1a2cb060ff0978f25a24e6b1
                   06f5a6e1053c1b8f5fce794d88a0e481
                   8c081e18ea

   KID:            0xf
   CTR:            0xaa
   Header:         190faa
   Nonce:          31ed26f90a072e6aee646232
   Ciphertext:     190faa2858c10b5ddd231c1f26819490
                   521678603a050448d563c503b1fd890d
                   02ead01d754f074ecb6f32da9b2f3859
                   f380b4f47d4edd1e15f42f9a2d7ecfac
                   99067e238321

   KID:            0x1ff
   CTR:            0xaa
   Header:         1a01ffaa
   Nonce:          31ed26f90a072e6aee646232
   Ciphertext:     1a01ffaa2858c10b5ddd231c1f268194
                   90521678603a050448d563c503b1fd89
                   0d02ead01d754f074ecb6f32da9b2f38
                   59f380b4f47d4e3bf7040eb10ec25b81
                   26b2ce7b1d9d31

   KID:            0x1ff
   CTR:            0xaaaa
   Header:         2a01ffaaaa
   Nonce:          31ed26f90a072e6aee64c832
   Ciphertext:     2a01ffaaaad9bc6a258a07d210a814d5
                   45eca70321c0e87498ada6e5c708b7ea
                   d162ffcf4fbaba1eb82650590a87122b
                   4d95fe36bd88b278812166d26e046ed0
                   a530b7ee232ee0f2

   KID:            0xffffffffffffff
   CTR:            0xffffffffffffff
   Header:         7fffffffffffffffffffffffffffff
   Nonce:          31ed26f90af8d195119b9d67
   Ciphertext:     7fffffffffffffffffffffffffffffaf
                   480d4779ce0c02b5137ee6a61e026c04
                   ac999cb0c97319feceeb258d58df23bc
                   e14979e5c67a431777b34498062e72f9
                   39ca42ec84ffbc7b50eff923f515a2df
                   760c

11.  References

11.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC5116]  McGrew, D., "An Interface and Algorithms for Authenticated
              Encryption", RFC 5116, DOI 10.17487/RFC5116, January 2008.

   [RFC5869]  Krawczyk, H. and P. Eronen, "HMAC-based Extract-and-Expand
              Key Derivation Function (HKDF)", RFC 5869,
              DOI 10.17487/RFC5869, May 2010.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

11.2.  Informative References

   [I-D.ietf-mls-architecture]
              Omara, E., Beurdouche, B., Rescorla, E., Inguva, S., Kwon,
              A., and A. Duric, "The Messaging Layer Security (MLS)
              Architecture", Work in Progress, Internet-Draft,
              draft-ietf-mls-architecture-06, 8 March 2021.

   [I-D.ietf-mls-protocol]
              Barnes, R., Beurdouche, B., Millican, J., Omara, E.,
              Cohn-Gordon, K., and R. Robert, "The Messaging Layer
              Security (MLS) Protocol", Work in Progress,
              Internet-Draft, draft-ietf-mls-protocol-11,
              22 December 2020.

   [I-D.murillo-perc-lite]
              Murillo, S. G. and A. Gouaillard, "End to End Media
              Encryption Procedures", Work in Progress, Internet-Draft,
              draft-murillo-perc-lite-01, 12 May 2020.

   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
              RFC 3711, DOI 10.17487/RFC3711, March 2004.

   [RFC8723]  Jennings, C., Jones, P., Barnes, R., and A.B. Roach,
              "Double Encryption Procedures for the Secure Real-Time
              Transport Protocol (SRTP)", RFC 8723,
              DOI 10.17487/RFC8723, April 2020.

   [TestVectors]
              "SFrame Test Vectors", 2021.

Authors' Addresses

   Emad Omara
   Apple

   Email: eomara@apple.com


   Justin Uberti
   Google

   Email: juberti@google.com


   Alexandre GOUAILLARD
   CoSMo Software

   Email: Alex.GOUAILLARD@cosmosoftware.io


   Sergio Garcia Murillo
   CoSMo Software

   Email: sergio.garcia.murillo@cosmosoftware.io