Computing in Network Research Group                               P. Liu
Internet-Draft                                                    H. Yao
Intended status: Informational                                   L. Geng
Expires: May 5, 2021                                        China Mobile
                                                        November 1, 2020


              Differential Computing Resource Reservation
               draft-liu-coin-differential-reservation-01

Abstract

   Computing in the network may require embedded computing capability
   in network devices such as gateways and switches, and many
   distributed computing tasks may run in the network.  New
   applications such as AR/VR and motion control place higher demands
   on the network than before, and AI is also expected to be used in
   both applications and the network.  To satisfy these demands, both
   the bandwidth resources and the computing resources linked by the
   network need to be guaranteed.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 5, 2021.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your
   rights and restrictions with respect to this document.  Code
   Components extracted from this document must include Simplified BSD
   License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Overview
   2.  Serial Distributed Computing Model
   3.  Problems of Existing Protocol
   4.  Reference Method
     4.1.  Distributed Resource Reservation
     4.2.  Centralized Resource Reservation
       4.2.1.  PCEP
       4.2.2.  Netconf/Yang
   5.  Conclusion
   6.  Security Considerations
   7.  IANA Considerations
   8.  Normative References
   Authors' Addresses

1.  Overview

   From cloud computing to edge computing, computing power is becoming
   distributed and is moving closer to customers.  In a future
   integrated network and computing system, computing power will be
   distributed across all nodes as a ubiquitous, endogenous resource.
   A user's request can then be satisfied by calling the nearest node's
   resources and is no longer tied to a specific node.

   The basic topology abstraction of the traditional Internet
   architecture is the end-to-end model: the network is in the middle,
   computing is at the periphery, and hosts achieve logical full
   connectivity through the network.
   In the trend of network and computing convergence, computing
   resources may be embedded in the network.  From the perspective of
   completing users' computing tasks, embedded resources are no longer
   in a peer-to-peer relationship; their different distances and
   network conditions need to be considered.

   There are two approaches to this convergence.  One takes the
   perspective of the network: the network becomes aware of computing
   resources and uses that awareness for routing, scheduling, etc.  The
   other takes the perspective of the data center: the data center
   becomes aware of network status and applies the scheduling of
   microservices and similar architectures to a wide-area network.

   Some research on computing and network convergence has been carried
   out in standardization organizations, including many network
   architectures proposed by operators.  Whichever side performs the
   perception, the goal is to provide better services, so the network
   and computing will develop in a more refined direction.  Taking the
   perspective of a network that is aware of computing resources, this
   draft analyzes the problems of resource reservation under network
   and computing convergence and puts forward corresponding reference
   schemes.

   Traditional network resource reservation is uniform along an end-
   to-end path: the reserved bandwidth does not change from the client
   to the server.  Computing is different.  Distributed computing
   involves different computing power at different nodes, so different
   resources need to be reserved at different nodes.  For example, some
   AI algorithms iterate step by step across multiple nodes.  Each
   iteration affects the result of the next, and the computing
   resources required for each iteration are not the same.
   From the perspective of network standards, we hope to treat
   computing resources as a dimension for measuring network
   performance, in the same way as bandwidth, path, etc.  However,
   traditional resource reservation technologies have considered
   neither the reservation of computing resources nor a differentiated
   resource reservation model.

2.  Serial Distributed Computing Model

   In the computing-in-the-network model, computing resources may be
   distributed across multiple nodes.  A task may be divided into
   several parts to be executed by multiple nodes, in either a serial
   or a parallel distribution.  With parallel distribution, resources
   can be reserved separately for each part.  In the serial computing
   model, however, the calculation is sequential and the result of each
   calculation is needed by the next, which brings the following two
   characteristics:

      Different computing nodes on the same path need different
      reserved computing resources.

      The bandwidth resources to be reserved may differ after each
      calculation on the same path.

   A typical example is an artificial intelligence algorithm that
   involves a multi-layer convolution iterative process and can be
   completed by multiple computing devices in series.  As shown in the
   figure, 20%, 30% and 50% of the task is computed on network device
   1, network device 3 and the server respectively, and the calculation
   results of device 1 affect the subsequent calculations on device 3
   and the server.  Then:

      Network device 1, network device 3 and the server each need to
      reserve the corresponding computing resources.

      Because devices 1 and 3 perform calculations, the traffic changes
      after passing through them, so the bandwidth resources to be
      reserved differ.

   +------+                                               +--------+
   |Client|                                             ->| Server |
   +------+ \   +--------+   +--------+   +--------+   /  +--------+
             \->|network |   |network |   |network |->/     50% of
                |device 1|-->|device 2|-->|device 3|      computing
                +--------+   +--------+   +--------+        tasks
                  20% of                    30% of
                computing                 computing
                  tasks                     tasks

                   Serial distributed computing model

3.  Problems of Existing Protocol

   Existing resource reservation protocols work at different layers of
   the network, such as the Resource ReSerVation Protocol (RSVP) and
   the Path Computation Element Protocol (PCEP) [RFC5440].  RSVP is a
   traditional protocol that focuses only on how to initiate the
   reservation of resources, not on path establishment.  Later, RSVP-TE
   was developed for MPLS.  PCEP was designed to separate the path
   calculation function from the path establishment function of
   RSVP-TE, so that path calculation can be carried out before resource
   reservation.  Therefore, RSVP and PCEP can be used together or
   separately.

   However, these protocols have some problems when applied to
   computing tasks:

      First, they do not consider the computing attribute and cannot
      carry the value of a reserved computing resource.

      Second, the reserved bandwidth along the path is unchanged.

   Note that this document only analyzes resource reservation protocols
   in the network field.  Resource reservation in microservice
   architectures is not analyzed for the time being, because applying a
   microservice architecture in an operator network may raise problems
   of its own.

4.  Reference Method

   This section provides distributed and centralized reference schemes
   for resource reservation based on existing network protocols.  Note
   that for serial distributed computing, we assume that the
   application side can provide the following information:

      The number of steps involved in the calculation.

      The proportion of the computation required at each node.

      The bandwidth change after each step of the calculation.  If this
      cannot be provided, the same bandwidth resources are reserved by
      default.

4.1.  Distributed Resource Reservation

   Distributed resource reservation can be implemented by extending the
   RSVP or RSVP-TE protocol.  The server receives the client's service
   request, calculates the resource reservation strategy and returns
   it.  The process is as follows:

   1.  The client sends the service request, which carries the service
       requirements.  As the request travels along the path, the
       resource status of each node on the path is collected and added
       to the information carried by the request.

   2.  The server receives the client's service request, generates the
       resource reservation strategy for the target nodes on the path
       based on the service requirements and the resource status of
       each node, and returns the strategy to each target node along
       the path so that it can reserve the resources.

   The resource status includes at least the computing resource status,
   such as the category of chip, algorithm, etc.  It can also include
   the network resource status, such as bandwidth, delay, etc.

   The resource reservation strategy includes at least the computing
   resource reservation information of the target nodes, which is
   generated as follows:

   1.  Determine the serial distributed computing subtasks and the
       computing resources required by each subtask based on the
       service request.

   2.  Select the target node for each computing subtask based on the
       computing resource status of each node and the computing
       resources required by each subtask, and generate the computing
       resource reservation information that tells each target node
       which resources to reserve.
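
   The two strategy-generation steps above can be illustrated with a
   minimal sketch.  It is not part of any protocol definition: the
   names NodeStatus, Reservation and build_strategy, the abstract
   "compute units", the first-fit node selection and the per-subtask
   bandwidth scaling factors are all assumptions made for
   illustration.  If no bandwidth factors are supplied, the same
   bandwidth is reserved at every target node.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NodeStatus:
    name: str
    free_compute: float  # hypothetical abstract compute units reported by the node


@dataclass
class Reservation:
    node: str
    compute: float    # compute units to reserve at this node
    bandwidth: float  # bandwidth to reserve once this node's subtask has run (assumption)


def build_strategy(path: List[NodeStatus],
                   subtask_compute: List[float],
                   ingress_bandwidth: float,
                   bandwidth_factors: Optional[List[float]] = None) -> List[Reservation]:
    """Map serial subtasks, in order, onto nodes further along the path."""
    if bandwidth_factors is None:
        # Bandwidth change unknown: reserve the same bandwidth everywhere.
        bandwidth_factors = [1.0] * len(subtask_compute)
    strategy = []
    bandwidth = ingress_bandwidth
    idx = 0
    for need, factor in zip(subtask_compute, bandwidth_factors):
        # First fit: skip nodes that cannot host this subtask.
        while idx < len(path) and path[idx].free_compute < need:
            idx += 1
        if idx == len(path):
            raise ValueError("no remaining node on the path can host the subtask")
        bandwidth *= factor  # traffic volume after this subtask has run
        strategy.append(Reservation(path[idx].name, need, bandwidth))
        idx += 1  # serial model: the next subtask runs on a later node
    return strategy


# Example loosely following the figure in Section 2: device 2 is skipped
# because it reports too little free compute for the second subtask.
path = [NodeStatus("device1", 30), NodeStatus("device2", 10),
        NodeStatus("device3", 40), NodeStatus("server", 80)]
for reservation in build_strategy(path, [20, 30, 50], 100.0, [0.5, 0.5, 1.0]):
    print(reservation)
```

   First-fit selection is only one possible policy; a real
   implementation might also weigh delay, chip category or algorithm
   support when choosing target nodes.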

   Moreover, if the bandwidth change after each subtask can be
   calculated, the resource reservation strategy can also carry the
   bandwidth resource reservation information.

   This can be realized by defining a new RSVP or RSVP-TE object that
   reserves different resources at each target node.  The object can be
   customized and extended with variable length.  For example, a new
   object with class number 30 could carry the following message body:

      [L = 0, IPv4, 64, IP address1, bandwidth 1, computing resource 1]

      [L = 0, IPv4, 64, IP address2, bandwidth 2, computing resource 2]

      [L = 0, IPv4, 64, IP address3, bandwidth 3, computing resource 3]

      [L = 0, IPv4, 64, IP address4, bandwidth 4, computing resource 4]

      ......

   Note that the extended object can carry both the collected resource
   status of each node in the PATH message and the resource reservation
   strategy returned in the RESV message.

4.2.  Centralized Resource Reservation

   Centralized resource reservation can be realized by the network
   manager.  The manager receives the service request, calculates the
   network and computing resources needed, and initiates resource
   reservation configuration for the target nodes along the path.  The
   process is as follows:

      The client sends a service request to the network manager.

      The network manager selects a path according to the service
      request and gets the resource status of each node on the path.

      The network manager generates the resource reservation strategy
      based on the client's service request and the resource status of
      each node.

      The network manager sends the resource reservation strategy to
      the target nodes to reserve the resources.

   The resource status includes at least the computing resource status.
   The resource reservation strategy includes at least the computing
   resource reservation information of each target node.  Both are the
   same as in Section 4.1.
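
   The manager-side part of the centralized process can be sketched as
   follows, assuming a path has already been selected.  The function
   reserve_centrally and its two callbacks are hypothetical: get_status
   stands in for whatever mechanism the manager uses to read a node's
   free computing resource, and push_config stands in for the
   configuration channel to a target node.  Neither corresponds to a
   defined protocol operation.

```python
from typing import Callable, Dict, List


def reserve_centrally(subtask_compute: List[float],
                      path: List[str],
                      get_status: Callable[[str], float],
                      push_config: Callable[[str, float], None]) -> Dict[str, float]:
    """Run the manager-side steps for one service request on one selected path."""
    # Get the computing resource status of each node on the path.
    status = {node: get_status(node) for node in path}

    # Generate the strategy, assigning serial subtasks in path order.
    strategy: Dict[str, float] = {}
    remaining = iter(path)
    for need in subtask_compute:
        for node in remaining:
            if status[node] >= need:
                strategy[node] = need
                break
        else:
            # The selected path does not meet the reservation requirements.
            raise RuntimeError("selected path cannot satisfy the request")

    # Send the reservation to each target node.
    for node, compute in strategy.items():
        push_config(node, compute)
    return strategy
```

   Nothing is pushed to any node unless the whole path can satisfy the
   request, so a failed attempt leaves no partial reservation behind.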

   If at least one node on the selected path does not meet the resource
   reservation requirements, at least one node on the path must be
   re-selected and the resource status of the re-selected node
   obtained, until the path meets the requirements of the resource
   reservation strategy.

4.2.1.  PCEP

   By adding a computing power resource reservation field to the
   resource reservation object in the PCEP message, each computing
   power flow has a dynamic resource range based on the minimum
   reserved resource.

      +---------+---------+-----------+----------+--------+
      | Object  |  Label  | Reserved  |Interface |  In/   |
      |  Type   |   ID    | Bandwidth |IP Address|  Out   |
      +---------+---------+-----------+----------+--------+

                             PCEP extension

4.2.2.  Netconf/Yang

   The resource reservation configuration can also be sent to the
   target nodes using NETCONF with a defined YANG structure.  The
   reference YANG module is as follows.

   module: rs-computing-network
     +--rw rs-computing-network
        +--rw added-device[id]
        |  +--rw service id            string
        |  +--rw user id               string
        |  +--rw bandwidth             mbps
        |  +--rw computing resource    tbd
        +--rw deleted-device[id]

                              Yang Module

5.  Conclusion

   This draft proposes a method for the differential reservation of
   computing power and bandwidth resources based on network protocols.
   Because the traditional network does not include computing power,
   its reservation of network resources is the same along the whole
   path.  The proposed scheme can reserve computing power and network
   resources separately at each node for serial distributed computing
   services.  The draft also presents reference methods for realizing
   differentiated resource reservation.  There may be other, more
   appropriate methods to achieve computing and network resource
   reservation, which may require more analysis and discussion.

6.  Security Considerations

   TBD.

7.  IANA Considerations

   TBD.

8.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC5440]  Vasseur, JP., Ed. and JL. Le Roux, Ed., "Path Computation
              Element (PCE) Communication Protocol (PCEP)", RFC 5440,
              DOI 10.17487/RFC5440, March 2009.

Authors' Addresses

   Peng Liu
   China Mobile
   Beijing  100053
   China

   Email: liupengyjy@chinamobile.com


   Huijuan Yao
   China Mobile
   Beijing  100053
   China

   Email: yaohuijuan@chinamobile.com


   Liang Geng
   China Mobile
   Beijing  100053
   China

   Email: gengliang@chinamobile.com