ipv6man                                                       B. Dickson
Internet-Draft                                       Afilias Canada, Inc
Expires: April 6, 2008                                   October 4, 2007


    A New Method of Constructing Interface Identifiers for Expanded
                       Autoconfiguration in IPv6
                  draft-dickson-v6man-new-autoconf-00

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 6, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   This Internet Draft discusses a proposed extension to the set of
   Interface Identifier construction methods for 802 networks. The
   purpose is to allow autoconf RA announcements of prefix lengths
   other than 64 bits. It is intended to be fully backward compatible
   for /64 announcements. Instead of having one Interface Identifier
   construction method for all purposes, this adds an alternate method
   which is only used for autoconf, and only if the prefix length is not
   /64. No other IPv6 methods or protocols require modification.
   However, without modification, use of prefixes other than /64 likely
   won't support many IPv6 enhanced functions.

   The ultimate goal is to provide enough bits between the top-level
   allocation by a Regional Internet Registry (RIR) and the smallest
   autoconfiguration allocation to allow both external aggregation by
   ISPs into one prefix and internal hierarchical aggregation to
   support a variety of ISP topologies and practices. Current policies
   are driven from below by the current 64-bit Interface Identifier.
   Only by relaxing this to 48 bits for technologies such as 802
   (Ethernet) does the number of available bits reach a level deemed
   "sufficient".

Author's Note

   This Internet Draft is intended to result in this draft, or one or
   more related drafts, being placed on the Standards Track for 6man.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [5].

   Intended Status: Proposed Standard.

Table of Contents

   1.  Background
   2.  Description of the Problem: Scaling
     2.1.  Scaling problem examples
       2.1.1.  Allocation vs Aggregation
       2.1.2.  Allocation Techniques
       2.1.3.  Benefits of Aggregation
   3.  Motivation for Change
     3.1.  Current Allocation Techniques
   4.  Proposed Changes
     4.1.  Autoconf - Changing the Sense of Interface Identifier
       4.1.1.  Impact to Existing RFCs
   5.  Possible (and desired) Impact on Global Allocation Schemes
     5.1.  Reduction in size of smallest end-user allocation
     5.2.  Reduction in size of initial allocations to ISPs
     5.3.  Increase in available bits for subnetting
     5.4.  Increase in bits reserved for growth
     5.5.  Working Demonstration Code
       5.5.1.  Linux Patch example
   6.  Security Considerations
   7.  IANA Considerations
   8.  Acknowledgements
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Appendix A.  Appendix A: Allocation Technique Examples
   Appendix B.  Appendix B: Subnetting Choices by Length
   Author's Address
   Intellectual Property and Copyright Statements

1.  Background

   "The only problem is scale. If you can solve the scaling problem,
   nothing else matters." - Paraphrasing Mike O'Dell, founder of UUnet.

   IPv6 began as IPng, the "next generation internet protocol". It had
   many ambitious goals, and has certainly achieved some of them.

   However, along the way, some compromises were made whose impact on
   scaling up deployment, as an overlay on the current-generation
   Internet, has been overlooked. This document will attempt to
   illustrate where those scaling issues exist, how they impact
   operators, and why they are fundamental - they cannot be worked
   around, because you cannot work around scaling problems.

2.  Description of the Problem: Scaling

   The problem being addressed is the intersection between the
   theoretical design of IPv6 and the practical deployment issues on
   the hybrid IPv4+IPv6 Internet.

   IPv4 space is nearing exhaustion, and is anticipated to be fully
   allocated by some time in 2010. Network operators now anticipate
   adopting IPv6 widely, to address the need for additional address
   space. This in turn is driving demand for initial IPv6 allocations,
   and for deployment of IPv6 on infrastructure (especially servers).

   Given this timeframe and demand, any scaling problems need to be
   addressed in as short a timeframe as is feasible, so as to minimize
   the impact on the Internet, and to realize maximum benefit by having
   as small a deployed base as possible before introducing any changes.

   There are several places in the combined infrastructure where scaling
   issues are of practical concern. Management of address assignment
   is an obvious one, but of low order of importance from the
   perspective of network traffic.

   Routing table size is another, which is impacted by address
   aggregation and address allocation - two activities that go
   hand-in-hand.

   This is true both in the context of the global routing table and in
   the internal routing table requirements of individual ISPs.

   Routing convergence time, while less obvious, is another function
   which is scale-sensitive.

   Hardware forwarding tables (where they exist) are very
   scale-limited, expensive, and generally only upgradeable by upgrading
   the routers containing them. These grow approximately 1:1 with the
   routing table for any given router.

   Currently, the place with the largest potential impact on current
   and future scaling is the Default-Free Zone (DFZ), where routers are
   required to hold a complete routing table, since they do not have or
   use a "default route".

   While traditionally this may have been considered to be primarily the
   region collectively known as "the Tier 1 networks", it now includes
   any place where routers hold one or more copies of the full routing
   table. Typical use of a full routing table is where a network
   operator runs BGP and is multi-homed, and has, in addition to
   multiple upstream transit providers, a substantial number of routes
   heard from non-transit (peering) relationships with other network
   operators.

   This now comprises a substantial proportion of large network
   operators, certainly in the thousands.

   In the IPv4 DFZ, the number of prefixes is on the order of 230,000.
   Also noteworthy is the number of ASNs present in the DFZ, on the
   order of 30,000. In IPv4, the ratio of prefixes:ASNs is about 8:1.
   Much of this ratio is a consequence of allocation policies which were
   necessarily short-term.

   Furthermore, the hardware impact of each IPv6 prefix is frequently
   double that of a single IPv4 prefix.

   In order to minimize the impact of adding wide-scale IPv6 deployment
   on the DFZ, the objective needs to be to provide the means for
   maximum aggregatability of the IPv6 allocations over time. The
   specific objective is one IPv6 allocation per ASN (since that is the
   theoretical minimum).

   Note that aggregation itself cannot be controlled directly or
   mandated. However, the ability to aggregate is driven directly by
   the allocation schemes used, and the underlying drivers on
   consumption within that allocation. Internal allocation and
   aggregation, within an ISP, is one such driver.

   It should be observed that the rate of consumption is fundamentally
   driven by two metrics: the size of the block from which allocations
   are made, and the unit size of the smallest allocation.

   By showing that even with an optimal allocation scheme, the drivers
   for consumption are likely to result in either additional allocations
   per ASN or inability to aggregate allocations within an ASN, we
   identify the only place where this problem can be addressed - the
   size of the minimum allocation unit.

2.1.  Scaling problem examples

   The problem space is a classical triangle:

   Efficiency (the packing problem):  Efficiency is measured in terms of
      availability of unused space. Inefficient use is characterized by
      fragmentation of unused space.
      Optimal efficiency is achieved if no unused blocks can be merged,
      regardless of location in the binary tree.

   Expansion (the reservation problem):  Expansion is the reservation of
      unused space adjacent to used space. A block expands when it gets
      merged with unused space adjacent to it. Example: Used block
      FEC0::2:0/112 merges with unused block FEC0::3:0/112 to become
      used block FEC0::2:0/111.

   Existence (the renumbering problem):  As soon as space is allocated,
      the allocatee becomes a ticking time bomb. It must be presumed
      that their network is growing, and at some point will need more
      space. The recipient will not want to renumber an existing
      allocation in order to receive a new allocation.

   The more room is reserved for growth, the less is available for new
   allocations, and the lower the apparent global efficiency. This is a
   zero-sum game, in a finite space. However, the risk-reward, or
   rather cost-pain, equation pits the allocatee against the allocator:
   any improvement in efficiency which requires a recipient to renumber
   will face vociferous opposition.

2.1.1.  Allocation vs Aggregation

   Aggregation is the act of combining a number of smaller things into a
   bigger thing. In the context of prefixes in an internet, aggregation
   can only occur on bit boundaries, and only when the objects being
   combined are contiguous, with sufficiently similar properties (such
   as as-path).

   Most important is the contiguous property. Consequently, one way to
   view aggregation is as the reverse of de-aggregation.

   Unless the initial allocation and the reservation for further
   allocation are contiguous, no further aggregation is possible between
   the two.

   And in that sense, the first and subsequent allocations in fact look
   like a de-aggregation followed by aggregation.
   Internal use and assignment can similarly be viewed as
   de-aggregation, with the summarization happening at the border of the
   entity doing the aggregation (e.g. ASN border router).

2.1.2.  Allocation Techniques

   There are a number of general techniques for allocation of address
   space. Each has pros and cons, related to efficiency, expansion,
   and renumbering. Variants on each can achieve some compromise in the
   secondary areas, in addition to the primary benefit of the technique.

   Sequential Block:  This technique breaks the large block into smaller
      blocks, and assigns prefixes of a given size all out of one
      sub-block, in a sequential fashion. Variants make allocations
      paired with reservations adjacent in the same block, by
      effectively increasing the size of the allocation itself. While
      simple to implement, this technique is neither terribly efficient
      nor very flexible for growth.

   Bisection:  This technique initially reserves the whole space for the
      first recipient. Thereafter, each new recipient is assigned space
      by splitting, or bisecting, the space reserved for one recipient,
      reserving half for the original recipient and the other half for
      the new recipient. Growth occurs within a recipient's reserved
      space. This technique achieves expansion at the cost of
      efficiency. Under bisection, unused space is *maximally*
      fragmented. Variants may make allowances in the bisection
      algorithm based on the size of the initial allocation. Another
      problem with bisection is that it is non-deterministic, in that it
      is sensitive to the sequence in which requests are received -
      particularly when balancing new allocation requests against
      allocation increases due to growth.

   Best Fit:  This technique is guaranteed to be optimal. It uses the
      smallest unused block big enough to hold the requested allocation,
      for any new allocation, repeatedly bisecting the selected block
      (i.e.
      the smallest block big enough) until the exact fit for the new
      allocation is reached. If the smallest is the right size, no
      partitioning is necessary. This technique guarantees that no
      aggregation of unused space is possible after an allocation, if it
      wasn't before allocation. It starts with a completely aggregated
      empty block. Thus, it will always achieve optimal efficiency.

      Variants reserve extra space for each allocation, to permit
      expansion.

   For any method of reserving extra space for expansion, in an
   apples-to-apples comparison, "Best Fit" will have the best
   efficiency. We are concerned with efficiency at each level, since
   the more efficient allocation within one block is, the slower the
   expansion will be within the parent block.

   For the sake of argument, we will presume optimal allocation at the
   lowest level, when viewing the impact of growth on multi-level
   hierarchies of allocations.

2.1.3.  Benefits of Aggregation

   Aggregation is important in the DFZ, so as to avoid excessive (and
   expensive) growth which affects all large operators. However, there
   are benefits outside of the DFZ which can be achieved by
   aggregation.

   In fact, the scalability of internal networking resources depends on
   aggregation, and thus the ability to aggregate.

2.1.3.1.  Impact of Hierarchical Aggregation

   The following example is used to demonstrate the number of prefixes
   needed for reaching destinations in an Interior Gateway Protocol
   (IGP) area of a given size.

   The laws governing scalability are identified, and examples are used
   to illustrate how important scaling is to IGPs regardless of specific
   IGP technology.

   The presumption being made is that allocation is made according to
   the topology, and that aggregation is done to the maximum degree
   possible.
   The point is to demonstrate the benefit of maximum aggregation, which
   in turn establishes the need for allocation that supports
   HIERARCHICAL aggregation.

   We start with a reasonably big address block, which is allocated out
   to end networks - a /40, where the end assignments are /64. This
   gives us 24 bits of allocation ability, or 2^24 networks,
   approximately 16M. This is a flat network topology, where the
   allocations are direct and in no way aggregated. The IGP needs to
   carry all 16M prefixes - and likely will have a lot of difficulty
   coping with storing those, let alone achieving routing convergence.

   Now let us consider a two-level hierarchy. We'll use a couple of
   examples, and extrapolate the rules on prefix counts in a 2-level
   aggregation regime.

   If we put the delineation point at the /48 boundary, this gives a
   top-level IGP carrying 8 bits of aggregated prefixes, or 2^8 = 256
   prefixes. In addition, each aggregation point, for example an OSPF
   ABR, will need to hold all the delegated prefixes for the
   aggregation it is doing, meaning 2^16 prefixes, or about 65k
   prefixes. That is still a substantial number.

   If instead we put the delineation point between the two levels at
   the /52 boundary, this gives both the top level and the subordinate
   level a total of 2^12 of each kind of prefix. The routers in the
   IGP doing aggregation would have 2 * 2^12, or 2^13 prefixes, about
   8k. That is quite a bit better.

   Now, let us look at a three-level hierarchy. The top level would
   contain all the top level prefixes. The second level would need all
   the top level prefixes, plus the second level prefixes. The router
   bordering level 2 and level 3 would also need the third level
   prefixes.

   If each of the boundaries is an 8-bit boundary, the bottom tier
   routers in the IGP would need 2^8 + 2^8 + 2^8 prefixes, or 768
   prefixes.
   What an improvement hierarchical assignments make.

   The general rule on hierarchical maximum prefix count is the sum of
   all values 2^Bi, where Bi is the number of bits used for level "i" of
   the hierarchy.

   Clearly, the greatest benefit is achieved by putting an upper bound
   on max(Bi), i.e. the section of the hierarchy with the greatest
   number of bits.

   Conversely, the flexibility in creating a topology is limited by the
   number of PLACES an ISP can do aggregation. The smaller min(Bi) is,
   the LESS flexible and useful the hierarchical scheme becomes.

   Thirdly, the flexibility achieved by the NUMBER of levels in the
   hierarchy is the freedom for innovation among ISPs, in terms of the
   network designs and network topologies that are possible. Fixing, or
   putting an upper bound on, the number of levels of hierarchy is
   overly restrictive.

   Note well, the first two items, reducing max(Bi) and keeping min(Bi)
   from collapsing, combine to make the usefulness of a hierarchy
   dependent on the range of bit sizes per level of the hierarchy.
   Combined with the last item, this creates an argument for a large
   number of bits within which to build a hierarchy.

   (We explore examples in Appendix A.)

2.1.3.2.  Routing Table Size

   Hierarchical aggregation results in routers typically seeing only
   prefix information at their own level of the hierarchy, plus
   summarization prefixes for higher levels of the hierarchy. This
   drastically reduces the requirements for the size of routing tables
   within an organization.

2.1.3.3.  Routing Stability

   Aggregation also limits the impact of routing updates. In a
   hierarchy of aggregations of prefixes, aggregation typically
   suppresses reachability information about more-specific prefixes.
   This limits the scope of routing flaps, and improves network-wide
   routing stability. Routing flaps propagate only to the aggregator,
   and not higher in the hierarchy.
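
   The prefix-count rule from Section 2.1.3.1 can be checked
   numerically. The following Python sketch is illustrative only and is
   not part of the proposal; the function name max_prefix_count is an
   assumption introduced here. It sums 2^Bi over the levels of a
   hierarchy and reproduces the figures worked through in that section
   for a /40 block with /64 end assignments (24 bits in total).

```python
def max_prefix_count(bits_per_level):
    """Return the sum of 2**Bi over all hierarchy levels (hypothetical helper)."""
    return sum(2 ** bits for bits in bits_per_level)


# Flat topology: all 24 bits in a single level.
flat = max_prefix_count([24])

# Two levels split at /48 (8 + 16 bits) and at /52 (12 + 12 bits).
two_level_48 = max_prefix_count([8, 16])
two_level_52 = max_prefix_count([12, 12])

# Three levels of 8 bits each.
three_level = max_prefix_count([8, 8, 8])

for name, count in [("flat", flat), ("/48 split", two_level_48),
                    ("/52 split", two_level_52), ("three-level", three_level)]:
    print(f"{name}: {count} prefixes")
```

   The even 12 + 12 split gives a far smaller count than the 8 + 16
   split because the largest term, 2^max(Bi), dominates the sum.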

2.1.3.4.  Routing Convergence

   Thus, we can see that not only external aggregation at the top level,
   but also hierarchical aggregation within a block of addresses, has
   benefit and is likely to be done by any organization with sufficient
   resources allocated to it. Routing convergence scales on the order
   of N*log(N), where N is the number of prefixes. Reducing N by ORDERS
   of magnitude has a profound benefit for the speed of convergence,
   likewise by orders of magnitude.

   The fewer prefixes there are in a routing table, the faster routing
   can converge. This is especially true for SPF protocols, such as
   OSPF or IS-IS. As noted above, convergence time grows on the order
   of N*log(N) for N prefixes. The smallest number N is achieved with
   routing topologies that implement hierarchical aggregation which
   mirrors topology. (This is classic OSPF summarization methodology.)

3.  Motivation for Change

   The growth of the IPv4 space has come at considerable expense, and
   some of the lessons from the early stages of public IPv4 Internet
   growth, which have affected the later stages of the IPv4 Internet,
   do not appear to have been heeded.

   In particular, the IPv6 design was made prior to extensive experience
   by operators with the growing pains of IPv4. Where appropriate,
   consideration of the operator experience can lead to considerable
   improvement in the scalability and robustness of Internet
   architecture documents.

   Additionally, there are costs which are necessarily borne by the
   combined IPv4+IPv6 infrastructure, placing an unfortunate but
   unavoidable restriction on deployment of IPv6 - specifically, on
   allocation and aggregation of unicast IPv6 addresses.

   The ultimate objective is to support IPv6 allocations which have
   sufficient room for growth, so as to ensure that it is highly
   unlikely that subsequent allocations will be needed for ISPs, at any
   level, for a substantial length of time (5-10 years).

   However, the secondary objective is to make the allocation schemes
   available at the second level, the ISP, well suited to the internal
   needs of ISPs, so as to prevent harm to individual operators as well
   as to the global set of network operators.

   Only by achieving that objective will the goal of 1 prefix per ASN
   be possible. Without achieving this, the routing table in the DFZ
   may bloat sufficiently to cause significant harm to the operators who
   operate in the DFZ, ultimately harming the Internet. Doing so
   without causing internal difficulty for operators who need to
   aggregate internally is an essential part of this proposal.

3.1.  Current Allocation Techniques

   Currently at the top level, IANA has allocated /12 prefixes to the
   Regional Internet Registries (RIRs). These organizations allocate
   space to ISPs, as Provider Aggregatable (PA) blocks, and to direct
   recipients, as Provider Independent (PI) assignments. While there is
   some variation among the operators, all have the following in common:

   Autoconf Block Size:  As per [4], currently /64

   End User Assignment Size:  Typically /56

   Initial Assignment, No Paperwork:  Normally /48; anything shorter
      requires justification.

   Minimum Direct Assignment:  Also /48 - typically used for Internet
      Infrastructure use

   Initial ISP Allocation Size:  Typically /32, with many RIRs
      allocating much larger blocks depending on ISP requirements (e.g.
      /19, /20, /21).
      Note that these large initial allocations actually *support* the
      proposition that more bits are needed - otherwise, why would such
      large allocations be made?

   Reservation for Growth to the ISP:  Typically 4 bits

   Reservation by the ISP for customer growth:  Also typically 4 bits

   If the ISP is given a /32, and allocates /48's out of it, and for
   each /48 reserves the encompassing /44 in its entirety, this means
   that only 12 bits of range are available for both allocation and
   internal aggregation by the ISP. This is simply too few bits. A
   three-level hierarchy would support at most 4 bits per level (16
   prefixes). A two-level hierarchy would provide only either 32 blocks
   of 128 networks, or 16 blocks of 256 networks. Both of these are too
   small for even a small ISP's needs on a 10-year basis.

4.  Proposed Changes

   The proposed changes involve only those elements which affect:

   o  autoconf

   o  IPv6 over anything with hardware address length < 64 bits (e.g.
      Ethernet)

4.1.  Autoconf - Changing the Sense of Interface Identifier

   As the current RFCs describe it, an Interface Identifier (II) is an
   atomic element, the majority of uses of which presume 64-bit
   addressing. Some RFCs have been structured to accommodate possible
   II lengths other than 64 bits, but currently no RFCs define such an
   II instance.

   The legacy of the process by which IPv6 was developed appears to
   still bear the hallmarks of the one-size-only II. The fixed-size II
   is analogous to classful networking in IPv4, only instead of three
   classes (A, B, and C), there is just one class, the /64.

   The current instance of the core document for IPv6, the addressing
   architecture document, by contrast, describes IPv6 as a 128-bit
   addressing scheme with no internal structure. Essentially, it is
   CIDR-128.

   In order to take advantage of this, however, it is necessary to
   forgo the concept of universal II size, and instead permit the
   prefix length (of any IPv6 network) to determine the size of the II.
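
   As an illustration of the prefix length determining the II size, the
   following Python sketch derives the II size as 128 minus the prefix
   length and forms an address using the OR-based construction and the
   conditions that Section 4.1.1.1 gives. It is a sketch under stated
   assumptions, not the author's implementation: the function names are
   hypothetical, the prefix is taken from the 2001:db8::/32
   documentation range, and the "modification" of the hardware address
   is assumed to be inversion of the universal/local bit only, as in
   Modified EUI-64, with the group bit left unchanged.

```python
import ipaddress


def ii_size_bits(prefix_len: int) -> int:
    """II size implied by a prefix length (hypothetical helper)."""
    return 128 - prefix_len


def autoconf_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Sketch: OR an RA prefix with a zero-padded, modified hardware address."""
    net = ipaddress.IPv6Network(prefix)  # rejects prefixes with host bits set
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    hw_length = len(mac_bytes) * 8

    # The alternate method applies only when the prefix length is not 64.
    if net.prefixlen == 64:
        raise ValueError("/64 prefixes keep the existing 64-bit II method")
    # Conditions taken from Section 4.1.1.1.
    if hw_length + net.prefixlen > 128 or net.prefixlen <= 10:
        raise ValueError("prefix length not usable with this hardware address")

    # Assumption: invert the universal/local bit (0x02 of the first octet).
    modified = bytes([mac_bytes[0] ^ 0x02]) + mac_bytes[1:]
    # Left-padding with zeros places the hardware address in the low bits.
    host = int.from_bytes(modified, "big")
    return ipaddress.IPv6Address(int(net.network_address) | host)


addr = autoconf_address("2001:db8:0:1:2::/80", "00:11:22:33:44:55")
print(ii_size_bits(80), addr)
```

   With a /80 prefix and a 48-bit hardware address the II exactly fills
   the remaining 48 bits; shorter prefixes leave zero padding between
   the prefix and the hardware address.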

   By reversing the sense of association, from having IPv6 addresses
   associating "naturally" to a fixed-size Interface Identifier, to
   having IPv6 addresses associating to a prefix, the very concept of an
   Interface Identifier changes. Instead of a (fixed-size) constant
   value, it becomes a deterministically-constructed yet variably-sized
   entity, for uniquely numbering a given host interface on a given
   subnet.

   Within the context of autoconf, rather than the Interface Identifier
   being tied tightly to the Link-Local address, it becomes an object
   which is constructed only after receipt of an RA (with autoconf bit
   set).

4.1.1.  Impact to Existing RFCs

   The affected RFCs are those for autoconf and Ethernet, plus any other
   48-bit MAC layer-2 types (FDDI, etc.). The respective normative
   references need to be modified to accommodate this autoconf-specific
   behaviour, or may need to be updated to reflect better scoping models
   for aggregation (e.g. RFC 3531).

   RFC 4291 [3]  IPv6 Addressing Architecture - directly affected

   RFC 3587 [7]  IPv6 Global Unicast Address Format - directly affected

   RFC 3531 [6]  A flexible method [...] - examples and appendices may
      obsolete this RFC, as may some BCP RFCs and/or wiki pages

   RFC 2464 [1]  Transmission of IPv6 packets over Ethernet Networks -
      directly affected

   RFC 4862 [4]  IPv6 Stateless Address Autoconfiguration - this is the
      main RFC most directly affected

   RFC 4941 [2]  Privacy Extensions for [...] - presumes a 64-bit II,
      but can be easily modified by replacing "top 64" and "bottom 64"
      references with "top N" and "bottom 128-N", where N is the II
      size in bits.

4.1.1.1.  Autoconf

   The prescribed behaviour is: if a prefix whose prefix-length is not
   64 is received, and which satisfies other conditions (hw_length +
   prefix_length <= 128, and prefix_length > 10), a new Interface
   Identifier MAY/SHOULD/MUST (language to be decided by consensus
   and/or AD and/or IESG) be constructed by OR-ing the prefix with the
   modified hardware address (e.g. EUI-48/MAC-48 with left-padded
   zeros), the modification being to the U and G bits (identical to the
   Modified EUI-64 bit locations in the first byte).

   The previous behaviour for such a prefix would be that autoconf would
   just fail, silently.

   The new (optional) behaviour, if implemented, is that autoconf will
   now succeed, and be free from duplicates. However, DAD MUST still be
   done. Wide deployment of implementations of the new behaviour will
   make prefix lengths up to /80 usable, which is a necessary step (but
   not the only step) in solving the scaling problem identified.

   Clearly, a prefix other than /64 would not be very useful unless this
   is adopted, and widely. However, if it is, then longer prefixes
   (especially /80) would be generally considered usable.

5.  Possible (and desired) Impact on Global Allocation Schemes

   Without this change, the smallest prefix for which autoconf would
   work is /64.

   With this change, the smallest prefix for which autoconf would work
   is /80.

   DHCPv6 is unaffected by this change. DHCPv6 already supports prefix
   lengths other than /64. This change brings autoconfiguration in line
   with DHCPv6.

   None of the changes to allocation policies discussed below can be
   considered unless longer prefixes are usable with autoconfiguration,
   since autoconfiguration is widely deployed and used.

   This conclusion is based on a straw poll, where about 90% of
   respondents indicated that autoconf was used partly or entirely on
   prefixes they operated.
However, substantial portions of newly
610 deployed networks are likely to use DHCPv6 (e.g. cable company DOCSIS
611 access networks).

613 The implication is that an allocation needs to be at least as large
614 as the smallest block usable for autoconf, to be of use to 50-90% of
615 recipients of allocations.

617 Special note: DHCPv6 supports Prefix Delegation (PD). PD does not
618 presume /64 network size.

620 A delegated prefix can, however, be announced by routers in RAs,
621 which can then be used by autoconfiguration.

623 This means that without this proposed change, only a delegated /64 is
624 usable by autoconfiguration.

626 The proposed change would remove that limitation, and any prefix
627 supportable by the hardware address length would be usable via PD +
628 autoconfiguration.

630 5.1. Reduction in size of smallest end-user allocation

632 Under the modified scheme, the smallest anticipated allocation would
633 shrink from /48 or /56 to anywhere from /60 to /76, although the
634 most likely candidates are /60, /64, and /72.

636 /60, in addition to being the largest of these, would still support a
637 mix of /64 assignments as well as /80 assignments. For example, 2^3
638 (8) /64's and 2^17 (131k) /80's. The 17 bits available are plenty
639 for even end-customer internal aggregation/allocation needs, for all
640 but the biggest enterprises.

642 5.2. Reduction in size of initial allocations to ISPs

644 An initial allocation of /32, when assignments to customers are /48's
645 and 4 bits are reserved per customer for growth, gives 12 bits of
646 aggregation range.

648 However, if the customer assignments are /60, an initial allocation
649 of /40 would give 20 bits of aggregation range. This is substantial
650 both for aggregation and for assignment volume and lifetime.
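
The arithmetic behind the two cases in Section 5.2 can be sketched as follows. This is only an illustration: the function name aggregation_bits and its parameters are hypothetical, and the second case assumes no per-customer growth reservation, since the text does not state one for the /40 and /60 combination.

```c
/* Hypothetical helper, not part of any allocation tool.
 *   alloc_len   - prefix length of the ISP allocation (e.g. 32 or 40)
 *   assign_len  - prefix length of each customer assignment (e.g. 48 or 60)
 *   growth_bits - bits reserved per customer for growth (assumed 0 if unstated)
 * Returns the number of bits left for aggregation, or -1 if the
 * combination does not fit. */
int aggregation_bits(int alloc_len, int assign_len, int growth_bits)
{
    if (alloc_len < 0 || assign_len > 128 || growth_bits < 0)
        return -1;
    if (assign_len - growth_bits < alloc_len)
        return -1;
    return assign_len - growth_bits - alloc_len;
}
```

Under these assumptions, aggregation_bits(32, 48, 4) gives 12 and aggregation_bits(40, 60, 0) gives 20, matching the figures quoted above.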
652 Reserving the whole /32 for the recipient of the /40 is likely to
653 give a substantial time frame within which the customer can grow,
654 without needing to renumber or receive an allocation from a
655 non-contiguous netblock.

657 5.3. Increase in available bits for subnetting

659 By using /40's, with allocation units of /60, 20 bits are available
660 for subnetting. This is more important for the hierarchical
661 assignment and aggregation within an ISP than it is for purposes of
662 keeping allocation information organized, although the latter is a
663 beneficial side effect.

665 5.4. Increase in bits reserved for growth

667 Rather than a /32 with an extra 4 bits reserved, the new minimum
668 allocation to ISPs could be /40 with 8 bits for growth. That is a
669 factor of 16 more growth, or an additional time interval relative to
670 the initial quantity, presuming exponential growth.

672 5.5. Working Demonstration Code

674 Example code for current versions of Linux has been produced by the
675 author. The patch needed to support the new method is illustrated
676 below:

678 5.5.1.
Linux Patch example
679 *** /usr/src/linux/net/ipv6/addrconf.c 2007-06-22 13:08:58.000000000
680 --- addrconf.c 2007-10-04 21:47:44.000000000
681 ***************
682 *** 1690,1695 ****
683 --- 1690,1712 ----
684           }
685           goto ok;
686       }
687 +     else if (pinfo->prefix_len > 3 && (pinfo->prefix_len +
688 +              8*(dev->addr_len) <= 128)) {
689 +         /* II needs to fit in available space */
690 +         u8 my_ii[16] = { 0 }; /* zero-filled; only the tail is set */
691 +         int i;
692 +         /* prefix, plus zeros */
693 +         memcpy(&addr, &pinfo->prefix, 16);
694 +         /* use HW_ADDR from dev as II */
695 +         memcpy(my_ii+16-(dev->addr_len),dev->addr,
696 +                dev->addr_len);
697 +         /* invert the Universal/Local (U/L) bit */
698 +         my_ii[16-(dev->addr_len)] ^= 2;
699 +         /* guaranteed to be INSIDE the HW_ADDR portion,
700 +            and at proper location of EUI-64 */
701 +         for(i=0;i<16;i++){ addr.s6_addr[i] |= my_ii[i];}
702 +         goto ok;
703 +     }
704       if (net_ratelimit())
705           printk(KERN_DEBUG "IPv6 addrconf: prefix with wrong length %d\n",
706                  pinfo->prefix_len);

708 6. Security Considerations

710 This document raises no new security issues.

712 7. IANA Considerations

714 This document has no actions for IANA.

716 8. Acknowledgements

718 The author wishes to acknowledge the helpful guidance of the Working
719 Group chair of what is now 6man, previously ipv6wg, Brian Haberman,
720 and of the Internet Area Director, Jari Arkko.

722 The author also thanks the contributors on the ipv6 mailing list for
723 pushing him to detail and clarify his concerns, which has resulted in
724 a better Internet Draft. Specific contributors include Thomas
725 Narten, Scott Leibrand, Bob Hinden, Iljitsch van Beijnum, Fred Baker,
726 James Woodyatt, Mark Smith, Brian E. Carpenter, David Conrad, itojun,
727 Christian Huitema, Fred Templin, Michael Dillon, and Ignatios
728 Souvatzis.

730 9. References

732 9.1. Normative References

734 [1] Crawford, M., "Transmission of IPv6 Packets over Ethernet
735 Networks", RFC 2464, December 1998.

737 [2] Narten, T., Draves, R., and S.
Krishnan, "Privacy Extensions for
738 Stateless Address Autoconfiguration in IPv6", RFC 4941,
739 September 2007.

741 [3] Hinden, R. and S. Deering, "IP Version 6 Addressing
742 Architecture", RFC 4291, February 2006.

744 [4] Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless Address
745 Autoconfiguration", RFC 4862, September 2007.

747 9.2. Informative References

749 [5] Bradner, S., "Key words for use in RFCs to Indicate Requirement
750 Levels", BCP 14, RFC 2119, March 1997.

752 [6] Blanchet, M., "A Flexible Method for Managing the Assignment of
753 Bits of an IPv6 Address Block", RFC 3531, April 2003.

755 [7] Hinden, R., Deering, S., and E. Nordmark, "IPv6 Global Unicast
756 Address Format", RFC 3587, August 2003.

758 Appendix A. Allocation Technique Examples

760 In the following, we compare tools which do allocations according to
761 either the "best fit" or "bisection" method. We observe the results
762 of the two methods, and examine the details and implications for
763 global allocation policies.

765 In the following, the allocations are kept in a file, whose structure
766 is described in the comments block. Comments are preserved at the
767 top.

769 The transaction file is a list of address size requests, and the name
770 to associate to the request.

772 We illustrate several scenarios, using the same set of allocation
773 requests in different sequence. The resulting allocation files are
774 shown at intermediate steps, so the differences between methods and
775 the sensitivity to sequence of transactions are clearer.

777 The final allocation files show allocations, reservations for
778 growth, and empty space.

780 Each prefix/length range has the name assigned to the allocated
781 block, or the empty string indicating unallocated space.

783 (Bisection uses reserved space, and does not have "unallocated"
784 space, per se.)
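
As an illustration of the "best fit" method, the following is a minimal sketch of such an allocator over the small universe=/6 space used in this appendix. It is not the author's tool: the names pool, block, pool_init and best_fit_alloc are hypothetical, and the tie-breaking rule (lowest start address first) is an assumption. A request of length /n takes the smallest free block that can hold it, splits that block in halves, keeps the lower half, and returns each upper half to the free list.

```c
#include <stddef.h>

#define UNIVERSE_BITS 6    /* "universe=/6": 64 units in total */
#define MAX_BLOCKS 64      /* free blocks are disjoint, so at most 64 */

struct block { unsigned start; int len; };    /* start/len, e.g. 32/2 */

struct pool {
    struct block free_blk[MAX_BLOCKS];
    size_t nfree;
};

/* Start from scratch: the whole universe (0/0) is free. */
void pool_init(struct pool *p)
{
    p->free_blk[0].start = 0;
    p->free_blk[0].len = 0;
    p->nfree = 1;
}

/* Allocate a block of prefix length `len`.
 * Returns 0 and fills *out on success, -1 if nothing fits. */
int best_fit_alloc(struct pool *p, int len, struct block *out)
{
    size_t i, best = p->nfree;
    struct block b;

    if (len < 0 || len > UNIVERSE_BITS)
        return -1;

    /* Pick the smallest free block (longest prefix) that is still
     * large enough; ties go to the lowest start (an assumption). */
    for (i = 0; i < p->nfree; i++) {
        const struct block *f = &p->free_blk[i];
        if (f->len > len)
            continue;                       /* too small */
        if (best == p->nfree ||
            f->len > p->free_blk[best].len ||
            (f->len == p->free_blk[best].len &&
             f->start < p->free_blk[best].start))
            best = i;
    }
    if (best == p->nfree)
        return -1;

    /* Remove the chosen block from the free list. */
    b = p->free_blk[best];
    p->free_blk[best] = p->free_blk[--p->nfree];

    /* Split until the block has the requested length: keep the lower
     * half, put the upper half back on the free list. */
    while (b.len < len) {
        b.len++;
        p->free_blk[p->nfree].start =
            b.start + (1u << (UNIVERSE_BITS - b.len));
        p->free_blk[p->nfree].len = b.len;
        p->nfree++;
    }
    *out = b;
    return 0;
}
```

Fed the requests c1 to c10 in the order used in this appendix (/5, /6, /3, /6, /4, /3, then /2, /3, /3, /3), this sketch yields the same blocks as the "Best" results listed there, and leaves no free space afterwards.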
786 Input Files:

788 Empty allocation file (start from scratch):
789 # File for storing tree of allocations and free blocks
790 # default base is 10
791 # default arrangement is flat (vs dotted or colon separated hierarchy)
792 #
793 # format of each line is:
794 # network/[reservation-]length[,customer]
795 #
796 # if no [,customer] label exists, the block is available
797 # if [reservation-] is specified, the following are true:
798 # network/length is allocated to customer
799 # network/reservation is tentatively reserved for customer,
800 # but can be bisected
801 universe=/6
802 0/0

804 Transaction file containing sequential requests for new allocations:

806 # Set of requests (for batch processing of requests for allocations)
807 # name /size
808 c1 /5
809 c2 /6
810 c3 /3
811 c4 /6
812 c5 /4
813 c6 /3

815 Results for allocation strategy "Bisection":
816 universe=/6
817 0/3-5,c1
818 8/3-4,c5
819 16/3-3,c3
820 24/3-3,c6
821 32/2-6,c2
822 48/2-6,c4

824 Results for allocation strategy "Best":
825 universe=/6
826 0/5,c1
827 2/6,c2
828 3/6,c4
829 4/4,c5
830 8/3,c3
831 16/3,c6
832 24/3
833 32/1

835 Additional allocations:
836 c7 /2
837 c8 /3
838 c9 /3
839 c10 /3

841 Results for allocation strategy "Bisection":
842 Unable to allocate prefix size /2 for c7
843 Unable to allocate prefix size /3 for c10
844 universe=/6
845 0/4-5,c1
846 4/4-4,c11
847 8/4-4,c5
848 12/4-4,c12
849 16/3-3,c3
850 24/3-3,c6
851 32/3-6,c2
852 40/3-3,c8
853 48/3-6,c4
854 56/3-3,c9

856 Results for allocation strategy "Best":
857 universe=/6
858 0/5,c1
859 2/6,c2
860 3/6,c4
861 4/4,c5
862 8/3,c3
863 16/3,c6
864 24/3,c8
865 32/2,c7
866 48/3,c9
867 56/3,c10

869 We can see that the requests should have used up all of the available
870 space, exactly.

872 The strategy "Best" succeeded in using up all the space.

874 The strategy "Bisect" did leave some room for growth for some
875 allocations, but not for others.

877 "Bisect" ultimately fragmented the space too much for allocations
878 that would otherwise have been able to fit.

880 Most importantly, the "reserved" space resulting from the "bisection"
881 method is distributed in a manner that depends on the sequence of
882 requests. This reserves differing amounts of space in a haphazard
883 fashion, which while fair, in the sense of being the result of blind
884 luck, is still unbalanced.

885 However, it is clear that to ensure both optimized allocation
886 efficiency and total fairness in growth, allocations need to be
887 made using the "best" approach, with a fixed (constant) amount of
888 room for growth, measured in extra "bits" of prefix length.

890 Due to the particulars of ip6.arpa. reverse delegation, the preferred
891 choice should be on nibble (4-bit) boundaries, with one or two extra
892 nibbles reserved for growth.

894 It ultimately makes the most sense for growth reservations to be made
895 at each level of inter-organizational allocation (as opposed to
896 internal aggregation points).

898 RIR->LIR assignments should have growth space appended to assignment
899 lengths, and LIR->customer assignments should also have extra space
900 for growth.

902 Appendix B. Subnetting Choices by Length

904 This section enumerates examples of hierarchical subnetting, based
905 on:

907 Range of bits available  This is the number of bits of what has
908 traditionally been called the "subnet mask", between the "network
909 mask" and the "host portion". E.g., if X/16 is subnetted, and
910 the presumed host portion at the LAN level is /64, then the bit
911 range is 64-16 = 48 bits.

913 Bits per level (min)  We will make some distinctions on the
914 usefulness of different subnetting patterns. For example, nibble
915 boundaries are very convenient, while single-bit subnetting
916 schemes are not likely to be used.

918 Non-VLSM hierarchy  We presume that at each level of the hierarchy,
919 siblings will have the same subnet mask. E.g.
x::12:0/16
920 subnetted as /17's: 12:0/17 and 12:8000/17. If 12:0/17 is
921 subnetted into /19's, then so is 12:8000/17.

923 Number of Subnets Total  Including varying the number of subnetting
924 steps in the hierarchy, ranging from 0 to the most that will fit
925 based on bits-per-level.

927 For example, here is a trivial or nearly-trivial subnetting scheme:
928 Range = 4 bits, Bits-per-level = 2 bits. Subnet bit-length patterns:
929 4; 2/2. Total number of subnet patterns: 2.

931 Detailed layout of the two subnet mask patterns:

933 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
934 |      Network Allocation       |Subnet |       Host Part       |
935 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

937 Each value of Subnet indicates a different discrete network, and
938 aggregation is presumed only to take place at the level of the Network
939 Allocation.

941 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
942 |      Network Allocation       |X X|Y Y|       Host Part       |
943 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

945 Where XX is the top level subnet, and YY is the next level subnet,
946 each being 2 bits in size. Network aggregation is presumed to take
947 place at the XX level as well as at the Network Allocation.
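
The pattern totals in this appendix can be cross-checked with a short counting routine. The sketch below is an illustration only; the function name count_patterns and its parameters are hypothetical. It counts the ordered ways of splitting a range of bits into levels, where each level has at least min_bits bits and a size that is a multiple of step (use step = 1 for "N or more", step = 4 for nibble boundaries only).

```c
/* Count ordered splits of `range` bits into levels of at least
 * `min_bits` bits each, where every level size is a multiple of
 * `step`. Returns 0 for arguments outside the supported range. */
unsigned long long count_patterns(int range, int min_bits, int step)
{
    unsigned long long ways[65] = { 0 };
    int n, part;

    if (range < 1 || range > 64 || min_bits < 1 || step < 1)
        return 0;

    ways[0] = 1;    /* nothing left to split: one way (stop here) */
    for (n = 1; n <= range; n++)
        for (part = min_bits; part <= n; part++)
            if (part % step == 0)
                ways[n] += ways[n - part];   /* first level takes `part` bits */
    return ways[range];
}
```

For the example above, count_patterns(4, 2, 1) returns 2 (the patterns 4 and 2/2). With a range of 8 bits and at least 2 bits per level it returns 13, and with a range of 20 bits on nibble boundaries only, count_patterns(20, 4, 4) returns 16.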
949 Range: 8, Bits/level: 2 or more. Patterns: 8 2/6 3/5 4/4 5/3 6/2 2/2/4
950 2/3/3 2/4/2 3/2/3 3/3/2 4/2/2 2/2/2/2. Total: 13

952 Range: 12, Bits/level: 4 or more. Patterns: 12 4/8 5/7 6/6 7/5 8/4
953 4/4/4. Total: 7

955 Range: 12, Bits/level: multiples of 4. Patterns: 12 4/8 8/4 4/4/4. Total: 4

957 Range: 12, Bits/level: 3 or more. Total: 19

959 Range: 16, Bits/level: 3 or more. Total: 88

961 Range: 16, Bits/level: 4 or more. Total: 26

963 Range: 24, Bits/level: 4 or more. Total: 345

965 Range: 19, Bits/level: 3 or more. Total: 277

967 Range: 20, Bits/level: 4 or more. Total: 95

969 Range: 20, Bits/level: multiples of 4 (nibble boundaries only).
970 Total: 16

972 Author's Address

974 Brian Dickson
975 Afilias Canada, Inc
976 4141 Yonge St,
977 Suite 204
978 North York, ON M2P 2A8
979 Canada

981 Email: briand@ca.afilias.info
982 URI: www.afilias.info

984 Full Copyright Statement

986 Copyright (C) The IETF Trust (2007).

988 This document is subject to the rights, licenses and restrictions
989 contained in BCP 78, and except as set forth therein, the authors
990 retain all their rights.

992 This document and the information contained herein are provided on an
993 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
994 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
995 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
996 OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
997 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
998 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
1000 Intellectual Property

1002 The IETF takes no position regarding the validity or scope of any
1003 Intellectual Property Rights or other rights that might be claimed to
1004 pertain to the implementation or use of the technology described in
1005 this document or the extent to which any license under such rights
1006 might or might not be available; nor does it represent that it has
1007 made any independent effort to identify any such rights. Information
1008 on the procedures with respect to rights in RFC documents can be
1009 found in BCP 78 and BCP 79.

1011 Copies of IPR disclosures made to the IETF Secretariat and any
1012 assurances of licenses to be made available, or the result of an
1013 attempt made to obtain a general license or permission for the use of
1014 such proprietary rights by implementers or users of this
1015 specification can be obtained from the IETF on-line IPR repository at
1016 http://www.ietf.org/ipr.

1018 The IETF invites any interested party to bring to its attention any
1019 copyrights, patents or patent applications, or other proprietary
1020 rights that may cover technology that may be required to implement
1021 this standard. Please address the information to the IETF at
1022 ietf-ipr@ietf.org.

1024 Acknowledgment

1026 Funding for the RFC Editor function is provided by the IETF
1027 Administrative Support Activity (IASA).