idnits 2.17.00 (12 Aug 2021) /tmp/idnits37124/draft-irtf-icnrg-flic-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There are 13 instances of too long lines in the document, the longest one being 29 characters in excess of 72. == There are 2 instances of lines with non-RFC2606-compliant FQDNs in the document. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (7 November 2021) is 188 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. Checking references for intended status: Experimental ---------------------------------------------------------------------------- == Missing Reference: 'SecurityCtx' is mentioned on line 508, but not defined == Missing Reference: 'AuthTag' is mentioned on line 508, but not defined == Missing Reference: 'NodeData' is mentioned on line 515, but not defined == Missing Reference: 'SubtreeSize' is mentioned on line 544, but not defined == Missing Reference: 'SubtreeDigest' is mentioned on line 544, but not defined == Missing Reference: 'Locators' is mentioned on line 516, but not defined == Missing Reference: 'ProtocolFlags' is mentioned on line 526, but not defined == Missing Reference: 'GroupData' is mentioned on line 534, but not defined == Missing Reference: 'NcId' is mentioned on line 544, but not defined == Missing Reference: 'LeafSize' is mentioned on line 544, but not defined == Missing Reference: 'LeafDigest' is mentioned on line 544, but not defined == Missing Reference: 'Name' is mentioned on line 864, but not defined == Missing Reference: 'ExpiryTime' is mentioned on line 864, but not defined Summary: 1 error (**), 0 flaws (~~), 15 warnings (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 ICNRG C. Tschudin 3 Internet-Draft University of Basel 4 Intended status: Experimental C.A. Wood 5 Expires: 11 May 2022 Cloudflare 6 M.E. Mosko 7 PARC, Inc. 8 D. Oran, Ed. 9 Network Systems Research & Design 10 7 November 2021 12 File-Like ICN Collections (FLIC) 13 draft-irtf-icnrg-flic-03 15 Abstract 17 This document describes a simple "index table" data structure and its 18 associated ICN data objects for organizing a set of primitive ICN 19 data objects into a large, File-Like ICN Collection (FLIC). At the 20 core of this collection is a _manifest_ which acts as the 21 collection's root node. The manifest contains an index table with 22 pointers, each pointer being a hash value pointing to either a final 23 data block or another index table node. 25 Status of This Memo 27 This Internet-Draft is submitted in full conformance with the 28 provisions of BCP 78 and BCP 79. 30 Internet-Drafts are working documents of the Internet Engineering 31 Task Force (IETF). Note that other groups may also distribute 32 working documents as Internet-Drafts. The list of current Internet- 33 Drafts is at https://datatracker.ietf.org/drafts/current/. 35 Internet-Drafts are draft documents valid for a maximum of six months 36 and may be updated, replaced, or obsoleted by other documents at any 37 time. It is inappropriate to use Internet-Drafts as reference 38 material or to cite them other than as "work in progress." 40 This Internet-Draft will expire on 11 May 2022. 42 Copyright Notice 44 Copyright (c) 2021 IETF Trust and the persons identified as the 45 document authors. All rights reserved. 47 This document is subject to BCP 78 and the IETF Trust's Legal 48 Provisions Relating to IETF Documents (https://trustee.ietf.org/ 49 license-info) in effect on the date of publication of this document. 50 Please review these documents carefully, as they describe your rights 51 and restrictions with respect to this document. 53 Table of Contents 55 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 56 1.1. FLIC as an ICN experimental tool . . . . . . . . . . . . 5 57 1.2. Requirements Language . . . . . . . . . . . . . . . . . . 5 58 2. Design Overview . . . . . . . . . . . . . . . . . . . . . . . 5 59 3. FLIC Structure . . . . . . . . . . . . . . . . . . . . . . . 6 60 3.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 6 61 3.2. Locators . . . . . . . . . . . . . . . . . . . . . . . . 8 62 3.3. Name Constructors . . . . . . . . . . . . . . . . . . . . 8 63 3.4. Manifest Metadata . . . . . . . . . . . . . . . . . . . . 10 64 3.5. Pointer Annotations . . . . . . . . . . . . . . . . . . . 10 65 3.6. Manifest Grammar (ABNF) . . . . . . . . . . . . . . . . . 11 66 3.7. Manifest Trees . . . . . . . . . . . . . . . . . . . . . 13 67 3.7.1. Traversal . . . . . . . . . . . . . . . . . . . . . . 13 68 3.8. Manifest Encryption Modes . . . . . . . . . . . . . . . . 14 69 3.8.1. AEAD Mode . . . . . . . . . . . . . . . . . . . . . . 15 70 3.8.2. RSA-OAEP Key Transport Mode . . . . . . . . . . . . . 17 71 3.9. Protocol Encodings . . . . . . . . . . . . . . . . . . . 19 72 3.9.1. CCNx Encoding . . . . . . . . . . . . . . . . . . . . 19 73 3.9.1.1. CCNx Hash Naming . . . . . . . . . . . . . . . . 19 74 3.9.1.2. CCNx Single Prefix . . . . . . . . . . . . . . . 20 75 3.9.1.3. CCNx Segmented Prefix . . . . . . . . . . . . . . 20 76 3.9.1.4. CCNx Hybrid Schema . . . . . . . . . . . . . . . 21 77 3.9.2. NDN Encoding . . . . . . . . . . . . . . . . . . . . 22 78 3.9.2.1. NDN Hash Naming . . . . . . . . . . . . . . . . . 22 79 3.9.2.2. NDN Single Prefix . . . . . . . . . . . . . . . . 22 80 3.9.2.3. NDN Segmented Prefix . . . . . . . . . . . . . . 23 81 3.9.2.4. NDN Hybrid Schema . . . . . . . . . . . . . . . . 24 82 3.10. Example Structures . . . . . . . . . . . . . . . . . . . 25 83 3.10.1. Leaf-only data . . . . . . . . . . . . . . . . . . . 25 84 3.10.2. Linear . . . . . . . . . . . . . . . . . . . . . . . 25 85 4. Experimenting with FLIC . . . . . . . . . . . . . . . . . . . 25 86 5. Usage Examples . . . . . . . . . . . . . . . . . . . . . . . 26 87 5.1. Locating FLIC leaf and manifest nodes . . . . . . . . . . 26 88 5.2. Seeking . . . . . . . . . . . . . . . . . . . . . . . . . 27 89 5.3. Block-level de-duplication . . . . . . . . . . . . . . . 28 90 5.4. Growing ICN collections . . . . . . . . . . . . . . . . . 28 91 5.5. Re-publishing a FLIC under a new name . . . . . . . . . . 29 92 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 30 93 6.1. FLIC Payload Type . . . . . . . . . . . . . . . . . . . . 30 94 6.2. FLIC Manifest Metadata and Annotation TLVs . . . . . . . 30 96 7. Security Considerations . . . . . . . . . . . . . . . . . . . 31 97 7.1. Integrity and Origin Authentication of FLIC Manifests . . 31 98 7.2. Confidentiality of Manifest Data . . . . . . . . . . . . 32 99 7.3. Privacy of names and linkability of access patterns . . . 33 100 8. References . . . . . . . . . . . . . . . . . . . . . . . . . 33 101 8.1. Normative References . . . . . . . . . . . . . . . . . . 33 102 8.2. Informative References . . . . . . . . . . . . . . . . . 34 103 Appendix A. Building Trees . . . . . . . . . . . . . . . . . . . 35 104 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 37 106 1. Introduction 108 ICN architectures such as Content-Centric Networking (CCNx)[RFC8569] 109 and Named Data Networking [NDN] are well suited for static content 110 distribution. Each piece of (possibly immutable) static content is 111 assigned a name by its producer. Consumers fetch this content using 112 said name. Optionally, consumers may specify the full name of 113 content, which includes its name and a unique (with overwhelming 114 probability) cryptographic digest of said content. 116 | Note: The reader is assumed to be familiar with general ICN 117 | concepts from CCNx or NDN. For general ICN terms, this 118 | document uses the terminology defined in [RFC7927]. Where more 119 | specificity is needed, we utilize CCNx [RFC8569] terminology 120 | where a Content Object is the data structure that holds 121 | application payload. Terms defined specifically for FLIC are 122 | enumerated below in Section 3.1. 124 To enable requests with full names, consumers need a priori knowledge 125 of content digests. Manifests, a form of catalog, are data 126 structures commonly employed to store and transport this information. 127 Typically, ICN manifests are signed content objects (data) which 128 carry a collection of hash digests. Therefore, as content objects, 129 manifests themselves may be fetched by full name. Thus, manifests 130 may contain either hash digests of, or pointers to, either other 131 manifests or content objects. A collection of manifests and content 132 objects represents a large piece of application data, e.g., one that 133 cannot otherwise fit in a single content object. 135 Structurally, this relationship between manifests and content objects 136 is reminiscent of the UNIX inode concept with index tables and memory 137 pointers. In this document, we specify a simple, yet extensible, 138 manifest data structure called FLIC - _File-Like ICN Collection_. 139 FLIC is suitable for ICN protocol suites such as CCNx and NDN. We 140 describe the FLIC design, grammar, and various use cases, e.g., 141 ordered fetch, seeking, de-duplication, extension, and variable-sized 142 encoding. We also include FLIC encoding examples for CCNx and NDN. 144 The purpose of a manifest is to concisely name, and hence point to, 145 the constiuent pieces of a larger object. A FLIC manifest does this 146 by using a _root_ manifest to name and cryptographically sign the 147 data structure and then use concise lists of hash-based names to 148 indicate the constituent pieces. This maintains strong security from 149 a single signature. A Manifest entry gives one enough information to 150 create an _Interest_ for that entry, so it must specify the name, the 151 hash digest, and if needed, the locators. 153 FLIC is a distributed data structure illustrated by the following 154 picture. 156 root manifest 157 .------------------------------------. 158 | optional name: | 159 | /icn/name/of/this/flic | 160 | | 161 | HashGroup (HG): | 162 | optional metadata: | 163 | overall digest, locator, etc. | .------. 164 | hash-valued data pointer -----------> | data | 165 | ... | `------' sub manifest 166 | hash-valued manifest pointer ------. .------------------. 167 | | `--> | -----> 168 | optional additional HashGroups .. | | -----> 169 | | `------------------' 170 | optional signature | 171 `------------------------------------' 173 Figure 1: A FLIC manifest and its directed acyclic graph 175 A key design decision is how one names the root manifest, the 176 application data, and subsidiary manifests. For this, FLIC uses the 177 concept of a Name Constructor. The root manifest (in fact, any FLIC 178 manifest) may include a Name Constructor that instructs a manifest 179 reader how to properly create Interests for the associated 180 application data and subsidiary manifests. The Name Constructors 181 allow interest construction using a well-known, application- 182 independent set of rules. Some name constructor forms are tailored 183 towards specific ICN protocols, such as CCNx or NDN; some are more 184 general and could work with many protocols. We describe the allowed 185 Name Constructor methods in Section 3.3. There are also particulars 186 of how to encode the name schema in a given ICN protocol, which we 187 describe in Section 3.9. 189 FLIC has encodings for CCNx (Section 3.9.1) as per RFC 8609 [RFC8609] 190 and for NDN (Section 3.9.2). 192 An example implementation in Python may be found at 193 [FLICImplementation]. 195 1.1. FLIC as an ICN experimental tool 197 FLIC enables experimentation with how to structure and retrieve large 198 data objects and collections in ICN. By having a common data 199 structure applications can rely on, with a common library of code 200 that can be used to create and parse manifest data structures, 201 applications using ICN protocols can both avoid unnecessary 202 reinvention and also have enhanced interoperability. Since the 203 design attempts to balance simplicity, universality, and 204 extensibility, there are a number of important experimental goals to 205 achieve that may wind up in conflict with one another. We provide a 206 partial list of these experimental issues in Section 4. It is also 207 important for users of FLIC to understand that some flexibility and 208 extensions might be removed if use cases do not materialize to 209 justify their inclusion in an eventual standard. 211 1.2. Requirements Language 213 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 214 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 215 document are to be interpreted as described in [RFC2119]. 217 2. Design Overview 219 The FLIC design adopts the proven UNIX inode concept of direct and 220 indirect pointers, but without the specific structural forms of 221 direct versus indirect. The pointers in FLIC use hash-based naming 222 of Content Objects analogous to the function block numbers play in 223 UNIX inodes. More generally, a FLIC structure is in most cases a 224 tree, although any acyclic di-graph is a legal form. 226 In FLIC terms, a direct pointer links to application-level data, 227 which is a Content Object with application data in the Payload. An 228 indirect pointer links to a Content Object with a FLIC Manifest in 229 the Payload. 231 Because FLIC uses hash-based pointers as names, FLIC graphs are 232 inherently acyclic. Both CCNx and NDN support hash-based naming, 233 though the details differ (see Section 3.9.1 and Section 3.9.2). 235 | Note: A substantial advantage of using hash-based naming is 236 | that it permits block-level de-duplication of application data 237 | because two blocks with the same payload will have the same 238 | hash name. 240 The FLIC structure that it is expected most applications would use 241 consists of a root manifest with a strong cryptographic signature and 242 then cryptographically strong (e.g. SHA256 [SHS]) hash names as 243 pointers to other manifests. The advantage of this structure is that 244 the single signature in the root manifest covers the entire data 245 structure no matter how many additional manifests are in the data 246 structure. Another advantage of this structure is it removes the 247 need to use chunk (CCNx) or segment (NDN) name components for the 248 subordinate manifests. 250 FLIC supports manifest encryption separate from application payload 251 encryption (See Section 3.8). It has a flexible encryption envelope 252 to support various encryption algorithms and key discovery 253 mechanisms. The byte layout allows for in-place encryption and 254 decryption. 256 A limitation of this approach is that one cannot construct a hash- 257 based name for a child until one knows the payload of that child. In 258 practical terms, this means that one must have the complete 259 application payload available at the time of manifest creation. 261 FLIC's design allows straightforward applications that just need to 262 traverse a linear set of related objects to do so simply, but FLIC 263 has two extensibility mechanisms that allow for more sophisticated 264 uses: manifest metadata, and pointer annotations. These are 265 described in Section 3.4 and Section 3.5 respectively. 267 FLIC goes to considerable lengths to allow creation and parsing by 268 application-independent library code. Therefore, any options used by 269 applications in the data structure or encryption capabilities MUST 270 NOT require applications to have application-specific Manifest 271 traversal algorithms. This ensures that such application agnostic 272 libraries can always successfully parse and traverse any FLIC 273 Manifest by ignoring the optional capabilities. 275 3. FLIC Structure 277 3.1. Terminology 279 Data Object: a CCNx nameless Content Object that usually only has 280 Payload. It might also have an ExpiryTime to limit the lifetime 281 of the data. 283 Direct Pointer: borrowed from inode terminology, it is a CCNx link 284 using a content object hash restriction and a locator name to 285 point to a Data Object. 287 Indirect Pointer: borrowed from inode terminology, it is a CCNx link 288 using a content object hash restriction and a locator name to 289 point to a manifest content object. 291 Manifest: a CCNx ContentObject with PayloadType 'Manifest' and a 292 Payload of the encoded manifest. A leaf manifest only has direct 293 pointers. An internal manifest has a mixture of direct and 294 indirect pointers. 296 Leaf Manifest: all pointers are direct pointers. 298 Internal Manifest: some or all pointers are indirect. The order and 299 number of each is up to the manifest builder. By convention, all 300 the direct manifests come first, then the indirect. 302 Manifest Waste: a metric used to measure the amount of waste in a 303 manifest tree. Waste is the number of unused pointers. For 304 example, a leaf manifest might be able to hold 40 direct pointers, 305 but only 30 of them are used, so the waste of this node is 10. 306 Manifest tree waste is the sum of waste over all manifests in a 307 tree. 309 Root Manifest: A signed, named, manifest that points to nameless 310 manifest nodes. This structure means that the internal tree 311 structure of internal and leaf manifests have no names and thus 312 may be located anywhere in a namespace, while the root manifest 313 has a name to fetch it by. 315 Top Manifest: One useful manifest structure is to use a Root 316 manifest that points to a single Internal manifest called the Top 317 Manifest. The Top manifest the begins the structure used to 318 organize manifests. It is also possible to elide the two and use 319 only a root manifest that also serves in the role of the top 320 manifest. 322 Name Constructor: The specification of how to construct an Interest 323 for a Manifest entry. 325 Locator: A routing hint in an Interest used by forwarding to get the 326 Interest to where it can be matched based on its Name Constructor- 327 derived name. 329 3.2. Locators 331 Locators are routing hints used by forwarders to get an Interest to a 332 node in the network that can resolve the Interest's name. In some 333 naming conventions, the name might only be a hash-based name so the 334 Locator is the only available routing information. Locators exist in 335 both CCNx and NDN, though the specific protocol mechanisms differ. A 336 FLIC manifest represents locators in the same way for both ICN 337 protocols, though they are encoded differently in the underlying 338 protocol. See Section 3.9 for encoding differences. 340 A manifest Node may define one or more Locator prefixes that can be 341 used in the construction of Interests from the pointers in the 342 manifest. The Locators are inherited when walking a manifest tree, 343 so they do not need to be defined everywhere. It is RECOMMENDED that 344 only the Root manifest contain Locators so that a single operation 345 can update the locators. One use case when storing application 346 payloads at different replicas is to replace the Root manifest with a 347 new one that contains locators for the current replicas. 349 3.3. Name Constructors 351 A Manifest may define zero or more name constructors in 352 NameConstructorDefinitions (NCD) located in the Manifest Node. An 353 NCD associates a Name Constructor Id (NCID) to a Name Constructor. 354 The NCID is used in other parts of the Manifest to refer to that 355 specific definition. 357 NCID 0 is the default name constructor. If it is not defined in an 358 NCD, it is assumed to be a HashNamingConstructor. A Manifest may re- 359 define the default as needed. 361 A Manifest MUST use locally unique NCIDs in the NCD. 363 NCDs and their associated NCIDs are inherited as one traverses a 364 manifest. That is, a manifest consumer must remember the NCDs as it 365 traverses manifests. If it encounters a HashGroup that uses an 366 unknown NCID, the RECOMMENDED action is to report a malformed 367 manifest to the user. 369 A Manifest may update an NCID. If a child manifest re-defines an 370 NCID, the manifest consumer MUST use the new definition from that 371 point forward under that Manifest branch. 373 It is RECOMMENDED that only the root or similar top-level manifest 374 define NCDs and they not be re-defined in subsequent manifests. 376 We expect that an application constructing a Manifest will take one 377 of three approaches to name constructors. The advantage of using, or 378 re-defining, the default name constructor is that any hash groups 379 that use it do not need to specify an NCID and thus might save some 380 space. 382 * A manifest might define (or use) a default name constructor and 383 mix subsequent Manifest and Data objects under that same 384 namespace. The manifest only needs to use one Hash Group and can 385 freely mix Manifest and Data pointers. 387 * A manifest might define (or use) a default name constructor for 388 subsequent Manifests and define a second NCD for the application 389 data. This places all subsequent manifests under the default 390 constructor and places all application data under the second NCD. 391 The Manifest must use at least two Hash Groups. 393 There are a few options on how to organize the Hash Groups: 395 (1) Manifest Hash Group followed by Data Hash group, 396 (2) Data Hash Group followed by Manifest Hash Group, 397 (3) Intermix multiple manifest and data hash groups for 398 interleaved reading, or 399 (4) use a data-on-leaf only approach: the interior manifests 400 would use the manifest hash group and the leaves would use 401 the data hash group. Other organizations are possible. 403 * Define multiple NCDs for subsequent manifests and data, or not use 404 the default NCD, or use some other organization. 406 In this specification, we define the following four types of Name 407 Constructors. Additional name constructor types may be specified in 408 a subsequent revision of the specification. Here, we informally 409 define the name constructors. Section 3.6 specifies the encoding of 410 each name constructor. 412 Type 0 (Interest-Derived Naming): Use whatever name was used in the 413 Interest to retrieve this Manifest, less a hash component, and 414 append the desired hash value. 416 Type 1 (Data-Derived Naming): Use the Manifest Name, less a hash 417 component, as the Interest name, and append the desired hash 418 value. 420 Type 2 (Prefix List): The NCD specifies a list of 1 or more name 421 prefixes. The consumer may use any (or all) of those prefixes 422 with the desired hash appended. 424 Type 3 (Segmented Naming): As in Type 2, but the consumer MUST 425 maintain a 0-based counter for each NCID associated with the in- 426 order index of each hash and use that counter as a Segment number 427 in the name. 429 3.4. Manifest Metadata 431 The FLIC Manifest may be extended by defining TLVs that apply to the 432 Manifest as a whole, or alternatively, individually to every data 433 object pointed to by the Manifest. This basic specification does not 434 specify any, but metadata TLVs may be defined through additional RFCs 435 or via Vendor TLVs. FLIC uses a Vendor TLV structure identical to 436 [RFC8609] for vendor-specific annotations that require no 437 standardization process. 439 For example, some applications may find it useful to allow 440 specialized consumers such as _repositories_ (for example 441 [repository]) or enhanced forwarder caches to pre-place, or 442 adaptively pre-fetch data in order to improve robustness and/or 443 retrieval latency. Metadata can supply hints to such entities about 444 what subset of the compound object to fetch and in what order. 446 | Note: FLICs ability to use separate namespaces for the Manifest 447 | and the underlying Data allows different encryption keys to be 448 | used, hence giving a network element like a cache or repository 449 | access to the Manifest data does not as a side effect reveal 450 | the contents of the application data itself. 452 3.5. Pointer Annotations 454 FLIC allows each manifest pointer to be annotated with extra data. 455 Annotations allow applications to exploit metadata about each Data 456 Object pointed to without having to first fetch the corresponding 457 Content Object. This specification defines one such annotation. The 458 _SizeAnnotation_ specifies the number of application layer octets 459 covered by the pointer. 461 An annotation may, for example, give hints about a desirable 462 traversal order for fetching the data, or an importance/precedence 463 indication to aid applications that do not require every content 464 object pointed to in the manifest to be fetched. This can be very 465 useful for real-time or streaming media applications that can perform 466 error concealment when rendering the media. 468 Additional annotations may be defined through additional RFCs or via 469 Vendor TLVs. FLIC uses a Vendor TLV structure identical to [RFC8609] 470 for vendor-specific annotations that require no standardization 471 process. 473 3.6. Manifest Grammar (ABNF) 475 The manifest grammar is mostly, but not entirely independent of the 476 ICN protocol used to encode and transport it. The TLV encoding 477 therefore follows the corresponding ICN protocol, so for CCNx FLIC 478 uses 2 octet length, 2 octet type and for NDN uses the 1/3/5 octet 479 types and lengths (see [NDNTLV] for details). There are also some 480 differences in how one structures and resolves links. [RFC8569] 481 defines HashValue and Link for CCNx encodings. The NDN 482 ImplicitSha256DigestComponent defines HashValue and NDN Delegation 483 (from Link Object) defines Link for NDN. Section 3.9 below specifies 484 these differences. 486 The basic structure of a FLIC manifest comprises a security context, 487 a node, and an authentication tag. The security context and 488 authentication tag are not needed if the node is unencrypted. A node 489 is made up of a set of metadata, the NodeData, that applies to the 490 entire node, and one or more HashGroups that contain pointers. 492 The NodeData element defines the namespaces used by the manifest. 493 There may be multiple namespaces, depending on how one names 494 subsequent manifests or data objects. Each HashGroup may reference a 495 single namespace to control how one forms Interests from the 496 HashGroup. If one is using separate namespaces for manifests and 497 application data, one needs at least two hash groups. For a manifest 498 structure of "MMMDDD," (where M means manifest (indirect pointer) and 499 D means data (direct pointer)) for example, one would have a first 500 HashGroup for the child manifests with its namespace and a second 501 HashGroup for the data pointers with the other namespace. If one 502 used a structure like "MMMDDDMMM," then one would need three hash 503 groups. 505 TYPE = 2OCTET / {1,3,5}OCTET ; As per CCNx or NDN TLV 506 LENGTH = 2OCTET / {1,3,5}OCTET ; As per CCNx or NDN TLV 508 Manifest = TYPE LENGTH [SecurityCtx] (EncryptedNode / Node) [AuthTag] 510 SecurityCtx = TYPE LENGTH AlgorithmCtx 511 AlgorithmCtx = AEADCtx / RsaKemCtx 512 AuthTag = TYPE LENGTH *OCTET ; e.g. AEAD authentication tag 513 EncryptedNode = TYPE LENGTH *OCTET ; Encrypted Node 515 Node = TYPE LENGTH [NodeData] 1*HashGroup 516 NodeData = TYPE LENGTH [SubtreeSize] [SubtreeDigest] [Locators] 0*Vendor 0*NcDef 517 SubtreeSize = TYPE LENGTH INTEGER 518 SubtreeDigest = TYPE LENGTH HashValue 520 NcDef = TYPE LENGTH NcId NcSchema 521 NcId = TYPE LENGTH INTEGER 522 NcSchema = InterestDerivedSchema / DataDerivedSchema / PrefixSchema / SegmentedSchema 523 InterestDerivedSchema = TYPE LENGTH [ProtocolFlags] 524 DataDerivedSchema = TYPE LENGTH [ProtocolFlags] 525 PrefixSchema = TYPE LENGTH Locators [ProtocolFlags] 526 SegmentedSchema = TYPE LENGTH Locators [ProtocolFlags] 528 Locators = TYPE LENGTH 1*Link 529 HashValue = TYPE LENGTH *OCTET ; As per ICN Protocol 530 Link = TYPE LENGTH *OCTET ; As per ICN protocol 532 ProtocolFlags = TYPE LENGTH *OCTET; ICN-specific flags, e.g. must be fresh 534 HashGroup = TYPE LENGTH [GroupData] (Ptrs / AnnotatedPtrs) 535 Ptrs = TYPE LENGTH *HashValue 536 AnnotatedPtrs = TYPE LENGTH *PointerBlock 537 PointerBlock = TYPE LENGTH *Annotation Ptr 538 Ptr = TYPE LENGTH HashValue 540 Annotation = SizeAnnotation / Vendor 541 SizeAnnotation = TYPE LENGTH Integer 542 Vendor = TYPE LENGTH PEN *OCTET 544 GroupData = TYPE LENGTH [NcId] [LeafSize] [LeafDigest] [SubtreeSize] [SubtreeDigest] 545 LeafSize = TYPE LENGTH INTEGER 546 LeafDigest = TYPE LENGTH HashValue 548 AEADCtx = TYPE LENGTH AEADData 549 AEADData = KeyNum AEADNonce Mode 550 KeyNum = TYPE LENGTH INTEGER 551 AEADNonce = TYPE LENGTH 1*OCTET 552 AEADMode = TYPE LENGTH (AEAD_AES_128_GCM / AEAD_AES_256_GCM / AEAD_AES_128_CCM / AEAD_AES_128_CCM) 554 RsaKemCtx = 2 LENGTH RsaKemData 555 RsaKemData = KeyId AEADNonce AEADMode WrappedKey LocatorPrefix 556 KeyId = TYPE LENGTH HashValue; ID of Key Encryption Key 557 WrappedKey = TYPE LENGTH 1*OCTET 558 LocatorPrefix = TYPE LENGTH Link 560 Figure 2: FLIC Grammar 562 SecurityCtx: information about how to decrypt an EncryptedNode. The 563 structure will depend on the specific encryption algorithm. 565 AlgorithmId: The ID of the encryption method (e.g. preshared key, a 566 broadcast encryption scheme, etc.) 568 AlgorithmData: The context for the encryption algorithm. 570 EncryptedNode: An opaque octet string with an optional 571 authentication tag (i.e. for AEAD authentication tag) 573 Node: A plain-text manifest node. The structure allows for in-place 574 encryption/decryption. 576 NodeData: the metadata about the Manifest node 578 SubtreeSize: The size of all application data at and below the Node 579 or Group 581 SubtreeDigest: The cryptographic digest of all application data at 582 and below the Node or Group 584 Locators: An array of routing hints to find the manifest components 586 HashGroup: A set of child pointers and associated metadata 588 Ptrs: A list of one or more Hash Values 590 GroupData: Metadata that applies to a HashGroup 592 LeafSize: Size of all application data immediately under the Group 593 (i.e. via direct pointers) 595 LeafDigest: Digest of all application data immediately under the 596 Group 598 Ptr: The ContentObjectHash of a child, which may be a data 599 ContentObject (i.e. with Payload) or another Manifest Node. 601 3.7. Manifest Trees 603 3.7.1. Traversal 605 FLIC manifests use a pre-order traversal. This means they are read 606 top to bottom, left to right. The algorithms in Figure 3 show the 607 pre-order forward traversal code and the reverse-order traversal 608 code, which we use below to construct such a tree. This document 609 does not mandate how to build trees. Appendix A provides a detailed 610 example of building inode-like trees. 612 If using Annotated Pointers, an annotation could influence the 613 traversal order. 615 preorder(node) 616 if (node = null) 617 return 618 visit(node) 619 for (i = 0, i < node.child.length, i++) 620 preorder(node.child[i]) 622 reverse_preorder(node) 623 if (node = null) 624 return 625 for (i = node.child.length - 1, i >= 0, i-- ) 626 reverse_preorder(node.child[i]) 627 visit(node) 629 Figure 3: Traversal Pseudocode 631 In terms of the FLIC grammar, one expands a node into its hash 632 groups, visiting each hash group in order. In each hash group, one 633 follows each pointer in order. Figure 4 shows how hash groups inside 634 a manifest expand like virtual children in the tree. The in-order 635 traversal is M0, HG1, M1, HG3, D0, D1, D2, HG2, D3, D4. 637 M0 ____ 638 | \ 639 HG1 HG2 640 | \ | \ 641 M1 D2 D3 D4 642 | 643 HG3 644 | \ 645 D0 D1 647 Figure 4: Node Expansion 649 Using the example manifest tree shown in Figure 6, the in-order 650 traversal would be: Root, M0, M1, D0, D1, D2, M2, D3, D4, D5, M3, D6, 651 D7, D8. 653 3.8. Manifest Encryption Modes 655 This document specifies two encryption modes. The first is a 656 preshared key mode, where the parties are assumed to have the 657 decryption keys already. It uses AES-GCM or AES-CCM. This is 658 useful, for example, when using a key agreement protocol such as 659 CCNxKE [I-D.wood-icnrg-ccnxkeyexchange]. The second is an RSA key 660 encapsulation mode (RsaKem [RFC5990]), which may be used for group 661 keying. 663 Additional modes may be defined in subsequent specifications. We 664 expect that an RSA KemDem mode and Elliptic Curve mode should be 665 specified. 667 All encryption modes use standard encryption algorithms and 668 specifications. Where appropriate, we adopt the TLS 1.2 standards 669 for how to use the encryption algorithms. This section specifies how 670 to encode algorithm parameters or ICN-specific data. 672 For group key based encryption, we use RsaKem. This specification 673 only details the pertinent aspects of the encryption. It describes 674 how a consumer locates the appropriate keys in the ICN namespace. It 675 does not specify aspects of a key manager which may or may not be 676 used as part of key distribution and management, nor does it specify 677 the protocol between a key manager and a publisher. In its simpliest 678 form, the publisher could be the key manager, in which case there is 679 no extra protocol needed between the publisher and key manager. 681 While the preshared key algorithm is limited in use, the AES 682 encryption mode described applies to the group key mechanisms too. 683 The group key mechanism facilitates the distribution of the shared 684 key without an on-line key agreement protocol like (the expired 685 draft) CCNxKE [I-D.wood-icnrg-ccnxkeyexchange]. 687 3.8.1. AEAD Mode 689 This mechanism uses AES-GEM [AESGCM] or AES-CCM [RFC3310] for 690 manifest encryption. A publisher creating a SecurityCtx SHOULD use 691 the mechanisms in [RFC6655] for AES-CCM Nonce generation and 692 [RFC5288] for AES-GCM Nonce generation. 694 As these references specify, it is essential that the publisher 695 creating a Manifest never use a Nonce more than once for the same 696 key. For keys exchanged via a session protocol, such as CCNx, the 697 publisher MUST use unique nonces on each Manifest for that session. 698 If the key is derived via a group key mechanism, the publisher MUST 699 ensure that the same Nonce is not used more than once for the same 700 Content Encryption Key. 702 The AEAD Mode uses [RFC5116] defined symbols AEAD_AES_128_CCM, 703 AEAD_AES_128_GCM, AEAD_AES_256_CCM and AEAH_AES_256_GCM to specify 704 the key length and algorithm. 706 The KeyNum identifies a key on the receiver. The key MUST be exactly 707 of the length specific by the Mode. Many receivers may have the same 708 key with the same KeyNum. 710 When a Consumer reads a manifest that specifies a KeyNum, the 711 consumer SHOULD verify that the Manifest's publisher is an expected 712 one for the KeyNum's usage. This trust mechanism employed to 713 ascertain whether the publisher is expected is beyond the scope of 714 this document, but we provide an outline of one such possible trust 715 mechanism. When a consumer learns a shared key and KeyNum, it 716 associates that KeyNum with the publisher ID used in a public key 717 signature. When the consumer receives a signed manifest (e.g. the 718 root manifest of a manifest tree), the consumer matches the KeyNum's 719 publisher with the Manifest's publisher. 721 Each encrypted manifest node has a full security context (KeyNum, 722 Nonce, Mode). The AEAD decryption is independent for each manifest 723 so Manifest objects can be fetched and decrypted in any order. This 724 design also ensures that if a manifest tree points to the same 725 subtree repeatedly, such as for deduplication, the decryptions are 726 all idempotent. 728 To encrypt a Manifest, the publisher: 730 1. Removes any SecurityCtx or AuthTag from the Manifest. 732 2. Creates a SecurityCtx and adds it to the Manifest. 734 3. Treats the Manifest TLV through the end of the Node TLV Length as 735 unencrypted authenticated Header. That includes anything from 736 the start of the Manifest up to but not including the start of 737 the Node's body. 739 4. Treats the body of the Node to the end of the Manifest as 740 encrypted data. 742 5. Appends the AEAD AuthTag to the end of the Manifest, increasing 743 the Manifest's length 745 6. Changes the TLV type of the Node to EncryptedNode. 747 To decrypt a Manifest, the consumer: 749 1. Verifies that the KeyNum exists and the publisher is trusted for 750 that KeyNum. 752 2. Saves the AuthTag and removes it from the Manifest, decreasing 753 the Manifest length. 755 3. Changes the EncryptedNode type to Node. 757 4. Treats everything from the Manifest TLV through the end of the 758 Node Length as unencrypted authenticated Header. That is, all 759 bytes from the start of the Manifest up to but not including the 760 start of the Node's body. 762 5. Treats the body of the Node to the end of the Manifest as 763 encrypted data. 765 6. Verifies and decrypts the data using the key and saved AuthTag. 767 7. If the decryption fails, the consumer SHOULD notify the user and 768 stop further processing of the manifest. 770 3.8.2. RSA-OAEP Key Transport Mode 772 The RSA-OAEP mode uses RSA-OAEP (see RFC8017 Sec 7.1 [RFC8017] and 773 [RSAKEM]) to encrypt a symmetric key that is used to encrypt the 774 Manifest. We call this RSA key the Key Encryption Key (KEK) and each 775 group member has this private key. A separate key distribuiton 776 system is responsible for distributing the KEK. For our purposes, it 777 is reasonable to assume that the KEK private key is available at a 778 Locator and that group members can decrypt this private key. 780 The symmetric key MUST be one that is compatible with the AEAD Mode, 781 i.e. a 128-bit or 256-bit random number. Further, the symmetric key 782 MUST fit in the OAEP envelope (which will be true for normal-sized 783 keys). 785 Any group key protocol and system needed are outside the scope of 786 this document. We assume there is a Key Manager (KM) and a Publisher 787 (P) and a set of group members. Through some means, the Publisher 788 therefore has at its disposal: 790 * A Content Encryption Key (CEK), i.e. the symmetric key. 792 * The RSA-OAEP wrapped CEK. 794 * The KeyId of the KEK used to wrap the CEK. 796 * The Locator of the KEK, which is shared under some group key 797 protocol. 799 This Manifest specification requires that if a group member fetches 800 the KEK key at Locator it can decrypt the WrappedKey and retrieve the 801 CEK. 803 In one example, a publisher could request a key for a group and the 804 Key Manager could securely communicate (CEK, Wapped_CEK, KeyId, 805 Locator) back to the publisher. The Key Manager is responsible for 806 publishing the Locator. In another example, the publisher could be a 807 group member and have a group private key in which case the publisher 808 can create their own key encryption key, publish it under the Locator 809 and proceed. The publisher generates CEK, Wrapped_CEK, KeyId, and a 810 Locator on its own. 812 To create the wrapped key using a Key Encryption Key: 814 1. Obtain the CEK in binary format (e.g. 32 bytes for 256 bits) 816 2. RSA encrypt the CEK using the KEK public key with OAEP padding, 817 following RFC8017 Sec 7.1 [RFC8017]. The encryption is not 818 signed because the root Manifest must have been signed by the 819 publisher already. 821 To decrypt the wrapped key using a Key Encryption Key: 823 1. RSA decrypt the WrappedKey using the KEK private key with OAEP 824 padding, following RFC8017 Sec 7.1 [RFC8017]. 826 2. Verify the unwrapped key is a valid length for the AEADMode. 828 To encrypt a Manifest, the publisher: 830 1. Acquires the set of (CEK, Wrapped_CEK, KeyId, Locator). 832 2. Creates a SecurityCtx and adds it to the Manifest. The 833 SecurityCtx includes an AEADNonce and AEADMode, as per AEAD mode. 835 3. Encrypts the Manifest as per AEAD Mode using the RSA-OAEP 836 SecurityCtx and CEK. 838 To decrypt a Manifest, the consumer: 840 1. Acquires the KEK from the Key Locator. If the consumer already 841 has a cached copy of the KeyId in memory, it may use that cached 842 key. 844 2. SHOULD verify that it trusts the Manifest publisher to use the 845 provided key Locator. 847 3. Decrypts the WrappedKey to get the CEK. If the consumer has 848 already decrypted the same exact WrappedKey TLV block, it may use 849 that cached CEK. 851 4. Using the CEK, AEADNonce, and AEADMode, decrypt the Manifest as 852 per AEAD Mode, ignoring the KeyNum steps. 854 3.9. Protocol Encodings 856 3.9.1. CCNx Encoding 858 In CCNx, application data content objects use a PayloadType of 859 T_PAYLOADTYPE_DATA. In order to clearly distinguish FLIC Manifests 860 from application data, a different payload type is required. 861 Therefore this specification defines a new payload type of 862 T_PAYLOADTYPE_FLIC. 864 ManifestContentObject = TYPE LENGTH [Name] [ExpiryTime] PayloadType Payload 865 Name = TYPE LENGTH *OCTET ; As per RFC8569 866 ExpiryTime = TYPE LENGTH *OCTET ; As per RFC8569 867 PayloadType = TYPE LENGTH T_PAYLOADTYPE_FLIC ; Value TBD 868 Payload : TYPE LENGTH *OCTET ; the serialized Manifest object 870 Figure 5: CCNx Embedding Grammar 872 3.9.1.1. CCNx Hash Naming 874 The Hash Naming namespace uses CCNx nameless content objects. 876 It proceeds as follows: 878 * The Root Manifest content object bound to a name assigned by the 879 publisher and signed by the publisher. It also may have a set of 880 Locators used to fetch the remainder of the manifest. The root 881 manifest has a single HashPointer that points to the Top Manifest. 882 It may also have cache control directives, such as ExpiryTime. 884 * The Root Manifest has an NsDef that specifies HashSchema. Its 885 GroupData uses that NsId. All internal and leaf manifests use the 886 same GroupData NsId. A Manifest Tree MAY omit the NsDef and NsId 887 elements and rely on the default being HashSchema. 889 * The Top Manifest is a nameless CCNx content object. It may have 890 cache control directies. 892 * Internal and Leaf manifests are nameless CCNx content objects, 893 possibly with cache control directives. 895 * The Data content objects are nameless CCNx content objects, 896 possibly with cache control directives. 898 * To form an Interest for a direct or indirect pointer, use a Name 899 from one of the Locators and put the pointer HashValue into the 900 ContentObjectHashRestriction. 902 3.9.1.2. CCNx Single Prefix 904 The Single Prefix schema uses the same name in all Content Objects 905 and distinguishes them via their ContentObjectHash. Note that in 906 CCNx, using a SinglePrefix name means that Locators are not used. 908 It proceeds as follows: 910 * The Root Manifest content object has a name used to fetch the 911 manifest. It is signed by the publisher. It has a set of 912 Locators used to fetch the remainder of the manifest. It has a 913 single HashPointer that points to the Top Manifest. It may also 914 have cache control directives, such as ExpiryTime. 916 * The Root Manifest has an NsDef that specifies SinglePrefix and the 917 SinglePrefixSchema element specifies the SinglePrefixName. 919 * The Top Manifest has the name SinglePrefixName. It may have cache 920 control directies. Its GroupData elements must have an NsId that 921 references the NsDef. 923 * An Internal or Leaf manifest has the name SinglePrefixName, 924 possibly with cache control directives. Its GroupData elements 925 must have an NsId that references the NsDef. 927 * The Data content objects have the name SinglePrefixName, possibly 928 with cache control directives. 930 * To form an Interest for a direct or indirect pointer, use 931 SinglePrefixName as the Name and put the pointer HashValue into 932 the ContentObjectHashRestriction. 934 3.9.1.3. CCNx Segmented Prefix 936 The Segmented Prefix schema uses a different name in all Content 937 Objects and distinguishes them via their ContentObjectHash. Note 938 that in CCNx, using a SegmentedPrefixSchema means that Locators are 939 not used. 941 | *Optional*: Use AnnotatedPointers to indicate the segment 942 | number of each hash pointer to avoid needing to infer the 943 | segment numbers. 945 It proceeds as follows: 947 * The Root Manifest content object has a name used to fetch the 948 manifest. It is signed by the publisher. It has a set of 949 Locators used to fetch the remainder of the manifest. It has a 950 single HashPointer that points to the Top Manifest. It may also 951 have cache control directives, such as ExpiryTime. 953 * The Root Manifest has an NsDef that specifies SegmentedPrefix and 954 the SegmentedPrefixSchema element specifies the 955 SegmentedPrefixName. 957 * The publisher tracks the chunk number of each content object 958 within the NsId. Objects are be numbered in their traversal 959 order. Within each manifest, the name can be constructed from the 960 SegmentedPrefixName plus a Chunk name component. 962 * The Top Manifest has the name SegmentedPrefixName plus chunk 963 number. It may have cache control directies. It's GroupData 964 elements must have an NsId that references the NsDef. 966 * An Internal or Leaf manifest has the name SegmentedPrefixName plus 967 chunk number, possibly with cache control directives. Its 968 GroupData elements must have an NsId that references the NsDef. 970 * The Data content objects have the name SegmentedPrefixName plus 971 chunk number, possibly with cache control directives. 973 * To form an Interest for a direct or indirect pointer, use 974 SegmentedPrefixName plus chunk number as the Name and put the 975 pointer HashValue into the ContentObjectHashRestriction. A 976 consumer must track the chunk number in traversal order for each 977 SegmentedPrefixSchema NsId. 979 3.9.1.4. CCNx Hybrid Schema 981 A manifest may use multiple schemas. For example, the application 982 payload in data content objects might use SegmentedPrefix while the 983 manifest content objects might use HashNaming. 985 The Root Manifest should specify an NsDef with a first NsId (say 1) 986 as the HashNaming schema and a second NsDef with a second NsId (say 987 2) as the SegmentedPrefix schema along with the SegmentedPrefixName. 989 Each manifest (Top, Internal, Leaf) uses two or more HashGroups, 990 where each HashGroup has only Direct (with the second NsId) or 991 Indirect (with the first NsId). The number of hash groups will 992 depend on how the publisher wishes to interleave direct and indirect 993 pointers. 995 Manifests and data objects derive their names according to the 996 application's naming schema. 998 3.9.2. NDN Encoding 1000 In NDN, all Manifest Data objects use a ContentType of FLIC (1024), 1001 while all application data content objects use a PayloadType of Blob. 1003 3.9.2.1. NDN Hash Naming 1005 In NDN Hash Naming, a Data Object has a 0-length name. This means 1006 that an Interest will only have an ImplicitDigest name component in 1007 it. This method relies on using NDN Forwarding Hints. 1009 It proceeds as follows: 1011 * The Root Manifest Data has a name used to fetch the manifest. It 1012 is signed by the publisher. It has a set of Locators used to 1013 fetch the remainder of the manifest. It has a single HashPointer 1014 that points to the Top Manifest. It may also have cache control 1015 directives. 1017 * The Root Manifest has an NsDef that specifies HashSchema. Its 1018 GroupData uses that NsId. All internal and leaf manifests use the 1019 same GroupData NsId. A Manifest Tree MAY omit the NsDef and NsId 1020 elements and rely on the default being HashSchema. 1022 * The Top Manifest has a 0-length Name. It may have cache control 1023 directies. 1025 * Internal and Leaf manifests has a 0-length Name, possibly with 1026 cache control directives. 1028 * The application Data use a 0-length name, possibly with cache 1029 control directives. 1031 * To form an Interest for a direct or indirect pointer, the name is 1032 only the Implicit Digest name component derived from a pointer's 1033 HashValue. The ForwardingHints come from the Locators. In NDN, 1034 one may use one or more locators within a single Interest. 1036 3.9.2.2. NDN Single Prefix 1038 In Single Prefix, the Data name is a common prefix used between all 1039 objects in that namespace, without a Segment or other counter. They 1040 are distinguished via the Implicit Digest name component. The FLIC 1041 Locators go in the ForwardingHints. 1043 It proceeds as follows: 1045 * The Root Manifest Data object has a name used to fetch the 1046 manifest. It is signed by the publisher. It has a set of 1047 Locators used to fetch the remainder of the manifest. It has a 1048 single HashPointer that points to the Top Manifest. It may also 1049 have cache control directives. 1051 * The Root Manifest has an NsDef that specifies SinglePrefix and the 1052 SinglePrefixSchema element specifies the SinglePrefixName. 1054 * The Top Manifest has the name SinglePrefixName. It may have cache 1055 control directies. Its GroupData elements must have an NsId that 1056 references the NsDef. 1058 * An Internal or Leaf manifest has the name SinglePrefixName, 1059 possibly with cache control directives. Its GroupData elements 1060 must have an NsId that references the NsDef. 1062 * The Data content objects have the name SinglePrefixName, possibly 1063 with cache control directives. 1065 * To form an Interest for a direct or indirect pointer, use 1066 SinglePrefixName as the Name and append the pointer's HashValue 1067 into an ImplicitDigest name component. Set the ForwardingHints 1068 from the FLIC locators. 1070 3.9.2.3. NDN Segmented Prefix 1072 In Segmented Prefix, the Data name is a common prefix plus a segment 1073 number, so each manifest or application data object has a unique full 1074 name before the implicit digest. This means the consumer must 1075 maintain a counter for each SegmentedPrefix namespace. 1077 | *Optional*: Use AnnotatedPointers to indicate the segment 1078 | number of each hash pointer to avoid needing to infer the 1079 | segment numbers. 1081 It proceeds as follows: 1083 * The Root Manifest Data object has a name used to fetch the 1084 manifest. It is signed by the publisher. It has a set of 1085 Locators used to fetch the remainder of the manifest. It has a 1086 single HashPointer that points to the Top Manifest. It may also 1087 have cache control directives. 1089 * The Root Manifest has an NsDef that specifies SegmentedPrefix and 1090 the SegmentedPrefixSchema element specifies the 1091 SegmentedPrefixName. 1093 * The publisher tracks the segment number of each Data object within 1094 a SegmentedPrefix NsId. Data is numbered in traversal order. 1095 Within each manifest, the name is constructed from the 1096 SegmentedPrefixName plus a Segment name component. 1098 * The Top Manifest has the name SegmentedPrefixName plus segment 1099 number. It may have cache control directies. Its GroupData 1100 elements must have an NsId that references the NsDef. 1102 * An Internal or Leaf manifest has the name SegmentedPrefixName plus 1103 segment number, possibly with cache control directives. Its 1104 GroupData elements must have an NsId that references the NsDef. 1106 * The Data content objects have the name SegmentedPrefixName plus 1107 chunk number, possibly with cache control directives. 1109 * To form an Interest for a direct or indirect pointer, use 1110 SegmentedPrefixName plus segment number as the Name and put the 1111 pointer HashValue into the ImplicitDigest name component. A 1112 consumer must track the segment number in traversal order for each 1113 SegmentedPrefixSchema NsId. 1115 3.9.2.4. NDN Hybrid Schema 1117 A manifest may use multiple schemas. For example, the application 1118 payload in data content objects might use SegmentedPrefix while the 1119 manifest content objects might use HashNaming. 1121 The Root Manifest should specify an NsDef with a first NsId (say 1) 1122 as the HashNaming schema and a second NsDef with a second NsId (say 1123 2) as the SegmentedPrefix schema along with the SegmentedPrefixName. 1125 Each manifest (Top, Internal, Leaf) uses two or more HashGroups, 1126 where eash HashGroup has only Direct (with the second NsId) or 1127 Indirect (with the first NsId). The number of hash groups will 1128 depend on how the publisher wishes to interleave direct and indirect 1129 pointers. 1131 Manifests and data objects derive their names according to the 1132 application's naming schema. 1134 3.10. Example Structures 1136 3.10.1. Leaf-only data 1138 Root 1139 | 1140 ______ M0 _____ 1141 / | \ 1142 M1 M2 M3 1143 / | \ / | \ / | \ 1144 D0 D1 D2 D3 D4 D5 D6 D7 D8 1146 Figure 6: Leaf-only manifest tree 1148 3.10.2. Linear 1150 Of special interest are "skewed trees" where a pointer to a manifest 1151 may only appear as last pointer of (sub-) manifests. Such a tree 1152 becomes a sequential list of manifests with a maximum of datapointers 1153 per manifest packet. Beside the tree shape we also show this data 1154 structure in form of packet content where D stands for a data pointer 1155 and M is the hash of a manifest packet. 1157 Root -> M0 ----> M1 ----> ... 1158 |->DDDD |->DDDD 1160 4. Experimenting with FLIC 1162 FLIC is expected to enable a number of salient experiments in the use 1163 of ICN protools by applications. These experiments will help not 1164 only to inform the desirable structure of ICN applications but 1165 reflect back to the features included in FLIC to evaluate their 1166 usefulness to those applications. While many interesting design 1167 aspects of FLIC remain to be discovered through experience, a number 1168 of important questions to be answered through experimentation 1169 include: 1171 * use for just files or other collections like directories 1173 * use for particular applications, like streaming media manifests 1175 * utility of pointer annotations to optimize retrieval 1177 * utility of the encryption options for use by repositories and 1178 forwarders 1180 * need for application metadata in manifests 1182 5. Usage Examples 1184 5.1. Locating FLIC leaf and manifest nodes 1186 The names of manifest and data objects are often missing or not 1187 unique, unless using specific naming conventions. In this example, 1188 we show how using manifest locators is used to generate Interests. 1189 Take for example the figure below where the root manifest is named by 1190 hash h0. It has nameless children with hashes with hashes h1 ... hN. 1192 Objects: 1193 manifest(name=/a/b/c, ptr=h1, ptr=hN) - has hash h0 1194 nameless(data1) - has hash h1 1195 ... 1196 nameless(dataN) - has hash hN 1198 Query for the manifest: 1199 interest(name=/a/b/c, implicitDigest=h0) 1201 Figure 7: Data Organization 1203 After obtaining the manifest, the client fetches the contents. In 1204 this first instance, the manifest does not provide any Locators data 1205 structure, so the client must continue using the name it used for the 1206 manifest. 1208 interest(name=/a/b/c, implicitDigest=h1) 1209 ... 1210 interest(name=/a/b/c, implicitDigest=hN) 1212 Figure 8: Data Interests 1214 Using the locator metadata entry, this behavior can be changed: 1216 Objects: 1217 manifest(name=/a/b/c, 1218 hashgroup(loc=/x/y/z, ptr=h1) 1219 hashgroup(ptr=h2) - has hash h0 1220 nameless(data1) - has hash h1 1221 nameless(data2) - has hash h2 1223 Queries: 1224 interest(name=/a/b/c, implicitDigest=h0) 1225 interest(name=/x/y/z, implicitDigest=h1) 1226 interest(name=/a/b/c, implicitDigest=h2) 1228 Figure 9: Using Locators 1230 5.2. Seeking 1232 Fast seeking (without having to sequentially fetch all content) works 1233 by skipping over entries for which we know their size. The following 1234 expression shows how to compute the byte offset of the data pointed 1235 at by pointer P_i, call it offset_i. In this formula, let P_i.size 1236 represent the Size value of the _i_-th pointer. 1238 offset_i = \sum_{k=1}^{i-1} > P_k.size. 1240 With this offset, seeking is done as follows: 1242 Input: seek_pos P, a FLIC manifest with a hash group having N entries 1243 Output: pointer index i and byte offset o, or out-of-range error 1244 Algorithm: 1245 offset = 0 1246 for i in 1..N do 1247 if (P > offset + P_i.size) 1248 return (i, P - offset) 1249 offset += P_i.size 1250 return out-of-range 1252 Figure 10: Seeking Algorithm 1254 Seeking in a BlockHashGroup is different since offsets can be quickly 1255 computed. This is because the size of each pointer P_i except the 1256 last is equal to the SizePerPtr value. For a BlockHashGroup with N 1257 pointers, OverallByteCount D, and SizePerPointer L, the size of P_N 1258 is equal to the following: 1260 D - ((N - 1) * L) 1262 In a BlockHashGroup with k pointers, the size of P_k is equal to: 1264 D - L * (k - 1) 1266 Using these, the seeking algorithm can be thus simplified to the 1267 following: 1269 Input: seek_pos P, a FLIC manifest with a hash group having 1270 OverallByteCount S and SizePerPointer L. 1271 Output: pointer index i and byte offset o, or out-of-range error 1272 Algo: 1273 if (P > S) 1274 return out-of-range 1275 i = floor(P / L) 1276 if (i > N) 1277 return out-of-range # bad FLIC encoding 1278 o = P mod L 1279 return (i, o) 1281 Figure 11: Seeking Algorithm 1283 | *Note*: In both cases, if the pointer at position i is a 1284 | manifest pointer, this algorithm has to be called once more, 1285 | seeking to seek_pos o inside that manifest. 1287 5.3. Block-level de-duplication 1289 Consider a huge file, e.g. an ISO image of a DVD or program in binary 1290 be patched. In this case, all existing encoded ICN chunks can remain 1291 in the repository while only the chunks for the patch itself is added 1292 to a new manifest data structure, as is shown in the diagram below. 1293 For example, the venti archival file system of Plan9 [venti] uses 1294 this technique. 1296 old_mfst - - > h1 --> oldData1 <-- h1 < - - new_mfst 1297 \ - > h2 --> oldData2 <-- h2 < - - / 1298 \ replace3 <-- h5 < - -/ 1299 \- > h3 --> oldData3 / 1300 \ > h4 --> oldData4 <-- h4 < - / 1302 Figure 12: De-duplication 1304 5.4. Growing ICN collections 1306 A log file, for example, grows over time. Instead of having to re- 1307 FLIC the grown file it suffices to construct a new manifest with a 1308 manifest pointer to the old root manifest plus the sequence of data 1309 hash pointers for the new data (or additional sub-manifests if 1310 necessary). 1312 | *Note* that this tree will not be skewed (anymore). 1314 old data < - - - mfst_old <-- h_old - - mfst_new 1315 / 1316 new data1 <-- h_1 - - - - - - - - -/ 1317 new data2 / 1318 ... / 1319 new dataN <-- h_N - - - - - - - -/ 1321 Figure 13: Growing A Collection 1323 5.5. Re-publishing a FLIC under a new name 1325 There are several use cases for republishing a collection under a new 1326 namespace, or having one collection exist under several namespaces: 1328 * It can happen that a publisher's namespace is part of a service 1329 provider's prefix. When switching provider, the publisher may 1330 want to republish the old data under a new name. 1332 * A publishes wishes to distribute its content to several 1333 repositories and would like a result to be delivered from the 1334 repository for consumers who have good connectivity to that 1335 repository. For example, the publisher /alpha wishes to place 1336 content at /beta and /gamma, but routing only to /alpha would not 1337 send a request to either /beta or /gamma. The operators of of 1338 /beta and /gamma could create a named and signed version of the 1339 root manifest with appropriate keys (or delegate that to /alpha) 1340 so the results are always delivered by the corresponding 1341 repository without having to change the bulk of the manifest tree. 1343 This can easily be achieved with a single nameless root manifest for 1344 the large FLIC plus arbitrarily many per-name manifests (which are 1345 signed by whomever wants to publish this data): 1347 data < - nameless_mfst() <-- h < - mfst(/com/example/east/the/flic) 1348 < - mfst(/com/example/west/old/the/flic) 1349 < - mfst(/internet/archive/flic234) 1351 Figure 14: Relocating A Collection 1353 | Note that the hash computation (of h) only requires reading the 1354 | nameless root manifest, not the entire FLIC. 1356 This example points out the problem of HashGroups having their own 1357 locator metadata elements: A retriever would be urged to follow these 1358 hints which are "hardcoded" deep inside the FLIC but might have 1359 become outdated. We therefore recommend to name FLIC manifests only 1360 at the highest level (where these names have no locator function). 1361 Child nodes in a FLIC manifest should not be named as these names 1362 serve no purpose except retrieving a sub-tree's manifest by name, if 1363 would be required. 1365 6. IANA Considerations 1367 IANA is requested to perform the actions in the following sub- 1368 sections. 1370 | IANA should also note that FLIC uses the definitions of 1371 | AEAD_AES_128_GCM, AEAD_AES_128_CCM, AEAD_AES_256_GCM, 1372 | AEAD_AES_256_CCM from [RFC5116]. 1374 6.1. FLIC Payload Type 1376 Register FLIC as a Payload Type in the _CCNx Payload Types_ Registry 1377 referring to the description in Section 3.9.1 as follows: 1379 +======+====================+================================+ 1380 | Type | Name | Reference | 1381 +======+====================+================================+ 1382 | TBA | T_PAYLOADTYPE_FLIC | Section 3.9.1 and | 1383 | | | Section 3.6.2.2.1 of [RFC8609] | 1384 +------+--------------------+--------------------------------+ 1386 Table 1: FLIC CCNx Payload Type 1388 6.2. FLIC Manifest Metadata and Annotation TLVs 1390 Create the following registry to be titled _FLIC Manifest Metadata 1391 and Annotation TLVs_ Manifest Metadata is described in Section 3.4; 1392 Pointer Annotations are described in Section 3.5. The registration 1393 procedure is *Specification Required*. The Type value is 2 octets. 1394 The range is 0x0000-0xFFFF. Allocate a value for the single 1395 _SizeAnnotation_ TLV. 1397 +======+===================+====================+ 1398 | Type | Name | Reference | 1399 +======+===================+====================+ 1400 | TBA | T_SIZE_ANNOTATION | Size (Section 3.5) | 1401 +------+-------------------+--------------------+ 1403 Table 2: FLIC Manifest Metadata and 1404 Annotation TLVs 1406 7. Security Considerations 1408 TODO Need a discussion on: 1410 * signing and hash chaining security. (*Note: Did I cover this 1411 adequately below?*) 1413 * republishing under a new namespace. (*Note: need help here - is 1414 this to reinforce that you can re-publish application data by 1415 creating a new root Manifest and signing that, requiring only one 1416 signature to change?*) 1418 * encryption mechanisms. (*Note: did I cover this adequately 1419 below?*) 1421 * encryption key distribution mechanisms.(*Note: not sure what needs 1422 to be said here*) 1424 * discussion of privacy, leaking of linkability information - *could 1425 really use some help here*. 1427 *Anything else?????*. 1429 7.1. Integrity and Origin Authentication of FLIC Manifests 1431 A FLIC Manifest is used to describe how to form Interests to access 1432 large CCNx or NDN application data. The Manifest is itself either an 1433 individual content object, or a tree of content objects linked 1434 together via the corresponding content hashes. The NDN and CCnx 1435 protocol architectures directly provide both individual object 1436 integrity (using cryptographically strong hashes) and data origin 1437 authentication (using signatures). The protocol specifications, 1438 [NDN] and CCNx [RFC8609] respectively, provide the protocol machinery 1439 and keying to support strong integrity and authentication. 1440 Therefore, FLIC utilizes the existing protocol specifications for 1441 these functions, rather than providing its own. There are a few 1442 subtle differences in the handling of signatures and keys in NDN and 1443 CCNx worth recapitulating here: 1445 * NDN in general adds a signature to every individual data packet 1446 rather than aggregating signatures via some object-level scheme. 1447 When employing FLIC Manifests to multi-packet NDN objects, it is 1448 expected that the individual packet signatures would be elided and 1449 the signture on the Manifest used instead. 1451 * In contrast, CCNx is biased to have primitive objects or pieces 1452 thereof be "nameless" in the sense they are identified only by 1453 their hashes rather than each having a name directly bound to the 1454 content through an individual signature. Therefore, CCNx depends 1455 heavily on FLIC (or an alternative method) to provide the name and 1456 the signed binding of the name to the content described in the 1457 Manifest 1459 A FLIC Manifest therefore gets integrity of its individual pieces 1460 through the existing secure hashing procedures of the underlying 1461 protocols. Origin authentication of the entire Manifest is achieved 1462 through hash chaining and applying a signature *only* to the root 1463 Manifest of a manifest tree. It is important to note that the Name 1464 of the Manifest, which is what the signature is bound to, need not 1465 bear any particular relationship to the names of the application 1466 objects pointed to in the Manifest via Name Constructors. This has a 1467 number of important benefits described in Section 3.3. 1469 7.2. Confidentiality of Manifest Data 1471 ICN protocol architectures like CCNx and NDN, while providing 1472 integrity and origin authentication as described above, leaves 1473 confidentiality issues entirely in the domain of the ICN application. 1474 Therefore, since FLIC is an application-level construct in both NDN 1475 and CCNx, it is incumbent on this specification for FLIC to provide 1476 the desired confidentiality properties using encryption. One could 1477 leave the specification of Manifest encryption entirely in the hands 1478 of the individual application utilizing FLIC, but this would be 1479 undesirable for a number of reasons: 1481 * The sensitivity of the information in a Manifest may be different 1482 from the sensitivity of the application data it describes. In 1483 some cases, it may not be necessary to encrypt manifests, or to 1484 encrypt them with a different keying scheme from that used for the 1485 application data 1487 * One of the major capabilities enabled by FLIC is to allow 1488 repositories or forwarding caches to operate on Manifests (see in 1489 particular Section 3.4). In order to allow such intermediaries to 1490 interpret manifests without revealing the underlying application 1491 data, separate encryption and keying is necessary 1493 * A strong design goal of FLIC is _universality_ such that it can be 1494 used transparently by many different ICN applications. This 1495 argues that FLIC should have a set of common encryption and keying 1496 capabilities that can be delegated to library code and not have to 1497 be re-worked by each individual application (see Section 2, 1498 Paragraph 9) 1500 Therefore, this specification directly specifies two encryption 1501 encapsulations and associated links to key management, as described 1502 in Section 3.8. As more experience is gained with various use cases, 1503 additional encryption capabilities may be needed and hence we expect 1504 the encryption aspects of this specification to evolve over time. 1506 7.3. Privacy of names and linkability of access patterns 1508 What to say here, if anything? 1510 8. References 1512 8.1. Normative References 1514 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1515 Requirement Levels", BCP 14, RFC 2119, 1516 DOI 10.17487/RFC2119, March 1997, 1517 . 1519 [RFC3310] Niemi, A., Arkko, J., and V. Torvinen, "Hypertext Transfer 1520 Protocol (HTTP) Digest Authentication Using Authentication 1521 and Key Agreement (AKA)", RFC 3310, DOI 10.17487/RFC3310, 1522 September 2002, . 1524 [RFC5116] McGrew, D., "An Interface and Algorithms for Authenticated 1525 Encryption", RFC 5116, DOI 10.17487/RFC5116, January 2008, 1526 . 1528 [RFC5288] Salowey, J., Choudhury, A., and D. McGrew, "AES Galois 1529 Counter Mode (GCM) Cipher Suites for TLS", RFC 5288, 1530 DOI 10.17487/RFC5288, August 2008, 1531 . 1533 [RFC5990] Randall, J., Kaliski, B., Brainard, J., and S. Turner, 1534 "Use of the RSA-KEM Key Transport Algorithm in the 1535 Cryptographic Message Syntax (CMS)", RFC 5990, 1536 DOI 10.17487/RFC5990, September 2010, 1537 . 1539 [RFC6655] McGrew, D. and D. Bailey, "AES-CCM Cipher Suites for 1540 Transport Layer Security (TLS)", RFC 6655, 1541 DOI 10.17487/RFC6655, July 2012, 1542 . 1544 [RFC8017] Moriarty, K., Ed., Kaliski, B., Jonsson, J., and A. Rusch, 1545 "PKCS #1: RSA Cryptography Specifications Version 2.2", 1546 RFC 8017, DOI 10.17487/RFC8017, November 2016, 1547 . 1549 [RFC8569] Mosko, M., Solis, I., and C. Wood, "Content-Centric 1550 Networking (CCNx) Semantics", RFC 8569, 1551 DOI 10.17487/RFC8569, July 2019, 1552 . 1554 [RFC8609] Mosko, M., Solis, I., and C. Wood, "Content-Centric 1555 Networking (CCNx) Messages in TLV Format", RFC 8609, 1556 DOI 10.17487/RFC8609, July 2019, 1557 . 1559 8.2. Informative References 1561 [AESGCM] Dworkin, M., "Recommendation for Block Cipher Modes of 1562 Operation: Galois/Counter Mode (GCM) and GMAC", National 1563 Institute of Standards and Technology SP 800-38D, 2007, 1564 . 1566 [FLICImplementation] 1567 Mosko, M., "FLIC Implementation in Python", various, 1568 . 1570 [I-D.wood-icnrg-ccnxkeyexchange] 1571 Mosko, M., Uzun, E., and C. Wood, "CCNx Key Exchange 1572 Protocol Version 1.0", Work in Progress, Internet-Draft, 1573 draft-wood-icnrg-ccnxkeyexchange-02, 6 July 2017, 1574 . 1577 [NDN] "Named Data Networking", various, 1578 . 1580 [NDNTLV] "NDN Packet Format Specification.", 2016, 1581 . 1583 [repository] 1584 "Repo Protocol Specification", Various, 1585 . 1588 [RFC7927] Kutscher, D., Ed., Eum, S., Pentikousis, K., Psaras, I., 1589 Corujo, D., Saucez, D., Schmidt, T., and M. Waehlisch, 1590 "Information-Centric Networking (ICN) Research 1591 Challenges", RFC 7927, DOI 10.17487/RFC7927, July 2016, 1592 . 1594 [RSAKEM] Barker, E., Chen, L., Roginsky, A., Vassilev, A., Davis, 1595 R., and S. Simon, "Recommendation for Pair-Wise Key- 1596 Establishment Using Integer Factorization Cryptography", 1597 National Institute of Standards and Technology SP 800-56B 1598 Rev. 2, 2019, . 1600 [SHS] Technology, N. I. O. S. A., "Secure Hash Standard, United 1601 States of American, National Institute of Science and 1602 Technology, Federal Information Processing Standard (FIPS) 1603 180-4", National Institute of Standards and Technology SP 1604 180-4, 2012, 1605 . 1608 [venti] "Venti: a new approach to archival storage", Bell Labs 1609 Document Archive /sts/doc, 2002, 1610 . 1612 Appendix A. Building Trees 1614 This appendix describes one method to build trees. It constructs a 1615 pre-order tree in a single pass of the application data, going from 1616 the tail to the beginning. This allows us to work up the right side 1617 of the tree in a single pass, then work down each left branch until 1618 we exhaust the data. Using the reverse-order traversal, we create 1619 the right-most-child manifest, then its parent, then the indirect 1620 pointers of that parent, then the parent's direct pointers,then the 1621 parent of the parent (repeating). This process uses recursion, as it 1622 is the clearest way to show the code. A more optimized approach 1623 could do it in a true single pass. 1625 Because we're building from the bottom up, we use the term 'level' to 1626 be the distance from the right-most child up. Level 0 is the bottom- 1627 most level of the tree, such as where node 7 is: 1629 1 1630 2 3 1631 4 5 6 7 1632 preorder: 1 2 4 5 3 6 7 1633 reverse: 7 6 3 5 4 2 1 1635 The Python-like pseudocode build_tree(data, n, k, m) algorithm 1636 creates a tree of n data objects. The data[] array is an array of 1637 Content Objects that hold application payload; the application data 1638 has already been packetized into n Content Object packets.An interior 1639 manifest node has k direct pointers and m indirect pointers. 1641 build_tree(data[0..n-1], n, k, m) 1642 # data is an array of Content Objects (Data in NDN) with application payload. 1643 # n is the number of data items 1644 # k is the number of direct pointers per internal node 1645 # m is the number of indirect pointers per internal node 1647 segment = namedtuple('Segment', 'head tail')(0, n) 1648 level = 0 1650 # This bootstraps the process by creating the right most child manifest 1651 # A leaf manifest has no indirect pointers, so k+m are direct pointers 1652 root = leaf_manifest(data, segment, k + m) 1654 # Keep building subtrees until we're out of direct pointers 1655 while not segment.empty(): 1656 level += 1 1657 root = bottom_up_preorder(data, segment, level, k, m, root) 1659 return root 1661 bottom_up_preorder(data, segment, level, k, m, right_most_child=None) 1662 manifest = None 1663 if level == 0: 1664 assert right_most_child is None 1665 # build a leaf manifest with only direct pointers 1666 manifest = leaf_manifest(data, segment, k + m) 1667 else: 1668 # If the number of remaining direct pointers will fit in a leaf node, make one of those. 1669 # Otherwise, we need to be an interior node 1670 if right_most_child is None and segment.length() <= k + m: 1671 manifest = leaf_manifest(data, segment, k+m) 1672 else: 1673 manifest = interior_manifest(data, segment, level, k, m, right_most_child) 1674 return manifest 1676 leaf_manifest(data, segment, count) 1677 # At most count items, but never go before the head 1678 start = max(segment.head(), segment.tail() - count) 1679 manifest = Manifest(data[start:segment.tail]) 1680 segment.tail -= segment.tail() - start 1681 return manifest 1683 interior_manifest(data, segment, level, k, m, right_most_child) 1684 children = [] 1685 if right_most_child is not None: 1686 children.append(right_most_child) 1688 interior_indirect(data, segment, level, k, m, children) 1689 interior_direct(data, segment, level, k, m, children) 1691 manifest = Manifest(children) 1692 return manifest, tail 1694 interior_indirect(data, segment, level, k, m, children) 1695 # Reserve space at the head of the segment for this node's direct pointers before 1696 # descending to children. We want the top of the tree packed. 1697 reserve_count = min(k, segment.tail - segment.head) 1698 segment.head += reserve_count 1700 while len(children) < m and not segment.head == segment.tail: 1701 child = bottom_up_preorder(data, segment, level - 1, k, m) 1702 # prepend 1703 children.insert(0, child) 1705 # Pull back our reservation and put those pointers in our direct children 1706 segment.head -= reserve_count 1708 interior_direct(data, segment, level, k, m, children) 1709 while len(children) < k+m and not segment.head == segment.tail: 1710 pointer = data[segment.tail() - 1] 1711 children.insert(0, pointer) 1712 segment.tail -= 1 1714 Authors' Addresses 1716 Christian Tschudin 1717 University of Basel 1719 Email: christian.tschudin@unibas.ch 1721 Christopher A. Wood 1722 Cloudflare 1724 Email: caw@heapingbits.net 1726 Marc Mosko 1727 PARC, Inc. 1729 Email: marc.mosko@parc.com 1731 David Oran (editor) 1732 Network Systems Research & Design 1734 Email: daveoran@orandom.net