ANIMA WG                                              CJ. Bernardos, Ed.
Internet-Draft                                                      UC3M
Intended status: Experimental                                  A. Mourad
Expires: April 23, 2022                                     InterDigital
                                                       P. Martinez-Julia
                                                                    NICT
                                                        October 20, 2021

                Autonomic setup of fog monitoring agents
                draft-bernardos-anima-fog-monitoring-05

Abstract

   The concept of fog computing has emerged, driven by the Internet of
   Things (IoT), from the need to handle the data generated by end-user
   devices.  The term fog refers to any networked computational
   resource in the continuum between things and cloud.  In fog
   computing, functions can be stitched together to compose a service
   function chain.
   These functions might be hosted on resources that are inherently
   heterogeneous, volatile and mobile.  This means that resources
   might appear and disappear, and the connectivity characteristics
   between these resources may also change dynamically.  This calls
   for new orchestration solutions able to cope with dynamic changes
   to the resources at runtime or ahead of time (in anticipation
   through prediction), as opposed to today's solutions, which are
   inherently reactive and static or semi-static.

   A fog monitoring solution can be used to help predict events, so
   that an action can be taken before an event actually takes place.
   This solution is composed of agents running on the fog nodes plus a
   controller hosted at another device (running in the infrastructure
   or in another fog node).  Since fog environments are inherently
   volatile and extremely dynamic, it is convenient to enable the use
   of autonomic technologies to autonomously set up the fog monitoring
   platform.  This document presents this use case and specifies how
   to use GRASP as needed in this scenario.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 23, 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Problem statement
     1.2.  Fog monitoring framework
     1.3.  Supporting simple and complex monitoring metrics
   2.  Terminology
   3.  Autonomic setup of fog monitoring framework
   4.  IANA Considerations
   5.  Security Considerations
   6.  Acknowledgments
   7.  Informative References
   Authors' Addresses

1.  Introduction

   The concept of fog computing has emerged, driven by the Internet of
   Things (IoT), from the need to handle the data generated by end-user
   devices.  The term fog refers to any networked computational
   resource in the continuum between things and cloud.  A fog node may
   therefore be an infrastructure network node such as an eNodeB or
   gNodeB, an edge server, a customer premises equipment (CPE), or even
   a user equipment (UE) terminal node such as a laptop, a smartphone,
   or a computing unit on board a vehicle, robot or drone.

   In fog computing, functions might be organized in service function
   chains (SFCs), hosted on resources that are inherently
   heterogeneous, volatile and mobile.  This means that resources
   might appear and disappear, and the connectivity characteristics
   between these resources may also change dynamically.  This calls
   for new orchestration solutions able to cope with dynamic changes
   to the resources at runtime or ahead of time (in anticipation
   through prediction), as opposed to today's solutions, which are
   inherently reactive and static or semi-static.

1.1.  Problem statement

   Figure 1 shows an example scenario of a (robot) network service.  A
   robot device has its (navigation) control application running in
   the fog, away from the robot, as a network service in the form of an
   SFC "F1-F2" (e.g., F1 might be in charge of identifying obstacles
   while F2 takes decisions on the robot navigation).  Initially,
   function F1 is assumed to be hosted at fog node A and F2 at fog
   node B.  At a given point in time, fog node A becomes unavailable
   (e.g., due to low battery or because fog node A moves out of the
   coverage of the robot).  The need to migrate function F1 to another
   node (e.g., fog node C in the figure) therefore has to be predicted,
   and the migration has to be completed before the fog/edge node
   becomes incapable or unavailable.  Such dynamic migration cannot be
   dealt with by today's orchestration solutions, which are rather
   reactive and static or semi-static (e.g., resources may fail, but
   this is treated as an exceptional, low-frequency event, and only
   scaling actions are supported to react to SLA-related events).

               --------------
               |    ====    |
              ------+F1+----------
             / |  | ==== |  |     \
            /  |  +------+  |      \
            |  | fog node C |       \
            |  --------------        \
            |                         \
            |       --------------  ---\----------
            |       |    ====    |  |   \====    |
            | -----------+F1+------------+F2|    |
            |/      |  | ==== |  |  |  | ==== |  |
            o       |  +------+  |  |  +------+  |
            |       | fog node A |  | fog node B |
    --------+-      --------------  --------------
    |        |
    --0----0--

                      Figure 1: Example scenario

   Existing frameworks rely on monitoring platforms that react to
   resource failure events and ensure that negotiated SLAs are met.
   However, these are not designed to predict events likely to happen
   in a volatile fog environment, such as resources moving away,
   resources becoming unavailable due to battery issues, or changes in
   resource availability caused by variations in the use of local
   resources on the nodes.  Besides, in this kind of volatile and
   extremely mobile environment it is not feasible to continuously
   monitor and report every possible variable or parameter from all
   the nodes hosting resources, as this would not scale, would consume
   many resources, and would generate extra overhead.

   In volatile and mobile environments, prediction (make-before-break)
   is needed, as pure reaction (break-before-make) is not enough.  This
   prediction is not generic; it depends on the nature of the network
   service/SFC: the functions of the SFC, the connectivity between
   them, the service-specific requirements, etc.  Monitoring has to be
   set up differently on the nodes, depending on the specifics of the
   network service.
   Besides, in order to act proactively and predict what might need to
   be done, monitoring in such volatile and mobile environments
   involves not only the nodes currently hosting the resources that
   run the network service/service function chain (i.e., hosting a
   function), but also other nodes that are potential candidates to
   join, either in addition to or in substitution for current nodes,
   for running the network service in accordance with the
   orchestration decisions.

   In the example of Figure 1, the fog node initially hosting function
   F1 (fog node A) might be running out of battery, and this should be
   detected before node A actually becomes unavailable, so that
   function F1 can be migrated in time to a different fog node C
   capable of meeting the requirements of F1 (compute, networking,
   location, expected availability, etc.).  To be able to predict the
   need for such a migration and to have already identified a target
   fog node to which the function can be moved, a monitoring solution
   is needed that instructs each node involved in the service (A and
   B), as well as the neighboring candidate node (C) that could host
   function F1, to monitor and report on metrics that are relevant for
   the specific network service "F1-F2" that is currently running.

1.2.  Fog monitoring framework

   Fog environments differ from data-center ones in three key aspects:
   heterogeneity, volatility and mobility.  The fog monitoring
   framework is used to predict events that trigger an orchestration
   event (e.g., migrating a function to a different resource).

   The monitoring framework we propose for fog environments is composed
   of two logical components:

   o  Fog agents running on each fog node.  An agent is responsible for
      sending the value of a variable or parameter to a fog monitoring
      controller and to other fog agents.
      What variable or parameter will be monitored and what data will
      be sent (including frequency) is configured per agent,
      considering the specifics of the network service or SFC.  A fog
      agent might also take some autonomous actions (such as
      requesting the migration of a function to a neighbor node) in
      certain situations where connectivity with the fog monitoring
      controller is temporarily unavailable.

   o  A fog monitoring controller (e.g., running at the edge or at a
      fog node).  This node obtains input from the orchestration logic
      (MANO stack) and autonomously decides what variables or
      parameters will be monitored, where the data will be collected,
      and how this will be done, based on the requirements provided by
      the orchestration logic managing the network services
      instantiated in the fog.  This configuration is specific to a
      network service, a function, or an SFC as a whole.

      *  It interacts with the orchestration logic to coordinate and
         trigger orchestration events, such as function migration,
         connectivity updates, etc.  In some deployments, this entity
         might be co-located with the orchestration logic (e.g., the
         NFVO).

      *  It interacts with the fog agents to instruct them on what
         variables and/or parameters need to be monitored.  It also
         interacts with them to get the resulting monitoring data.
         This interaction is not limited to fog agents at nodes
         currently involved in a given network service or SFC, but
         also includes other nodes that are suitable for hosting a
         function that needs to be migrated.  This makes it possible
         to provide the orchestration logic with candidate nodes
         proactively.

      *  It is capable of autonomously discovering and setting up fog
         agents.

1.3.  Supporting simple and complex monitoring metrics

   Fog monitoring nodes will be capable of providing raw monitoring
   data as well as processed data.  The former are obtained directly
   from the measured variables or parameters.
   The latter are obtained by applying some processing function to
   several monitoring data items.  The fog monitoring controllers will
   specify the function to be executed, which data will be collected
   and processed by the functions, and the additional parameters that
   control the processing and determine the particularities of the
   output of each function.

   The complexity of the functions that can be executed is arbitrary.
   They can be either pre-instructed in the fog agents or dynamically
   instructed by the requester (the fog monitoring controller) by
   providing the sequence in which to execute the functions and their
   input parameters.

   Complex monitoring metrics (the processed data) can also be used as
   part of the condition that determines the distributed and autonomic
   actions.  Thus, the logic that defines those actions is simplified,
   and the actuation components can concentrate on their task without
   requiring extra effort to process the raw monitoring data.

   Adding support for complex monitoring metrics enables the fog
   monitoring framework to avoid the transmission of unneeded data and
   thus optimize its overall operation.  For example, if the controller
   is interested in the average CPU load of a fog agent over the last
   5 minutes, it can just request it, providing the averaging period
   as an input parameter and specifying the source from which the CPU
   load variable is measured.

2.  Terminology

   The following terms are used in this document:

   fog:           Fog goes to the Extreme Edge, that is, the closest
                  possible to the user, including on the user device
                  itself.

   fog node:      Any device that is capable of participating in the
                  Fog.  A Fog node might be volatile, mobile and
                  constrained (in terms of computing resources).  Fog
                  nodes may be heterogeneous and may belong to
                  different owners.

   orchestrator:  In this document we use the terms orchestrator and
                  NFVO interchangeably.

3.  Autonomic setup of fog monitoring framework

   Fog nodes autonomously start fog agents at bootstrapping, and then
   start looking for other agents and for the fog monitoring
   controller.  This autonomic setup can be performed using GRASP.  The
   procedure is represented in Figure 2.  The different steps are
   described next:

     +--------+    +--------+    +--------+
     |  fog   |    |  fog   |    |  fog   |
     | node C |    | node A |    | node B |                     +------+
     |        |    |        |    |        |                     | fog  |
     | |    | |    | |    | |    | |    | |           +------+  | mon. |
     | +----+ |    | +----+ |    | +----+ |           | NFVO |  | ctrl |
     +--------+    +--------+    +--------+           +------+  +------+
                        |             |                   |           |
                  (fog nodes A & B bootstrap)             |           |
                        |             |                   |           |
                        |             |   periodic mcast advertisement|
                        |             |         (ID, fog_scope)       |
                        |             |  <----------------------------+
                        |    Mcast discovery (fog_node_ID, scope)     |
                        +-------------------------------------------->|
                        +------------>|                   |           |
                        |    Mcast discovery (fog_node_ID, scope)     |
                        |             +------------------------------>|
                        |<------------+                   |           |
                        |             |                   |           |
                        |    Unicast advertisement (ID, fog_scope)    |
                        |             |<------------------------------+
                        |<--------------------------------------------+
                        |             |                   |           |
                        |    Unicast registration (ID, fog_node_ID    |
                        |             |        fog_scope, capab.)     |
                        |             +------------------------------>|
                        +-------------------------------------------->|
                        |             |                   |           |
                 (fog nodes A & B registered)             |           |
                        |             |                   |           |
(fog node C bootstraps) |             |                   |           |
          |             |             |                   |           |
          |    Mcast discovery (fog_node_ID, scope)       |           |
          +---------------------------------------------------------->|
          +-------------------------->|                   |           |
          +------------>|    Unicast advertisement (ID, fog_scope)    |
          |<----------------------------------------------------------+
          |<--------------------------+                   |           |
          |<------------+    Unicast registration (ID, fog_node_ID    |
          |             |             |        fog_scope, capab.)     |
          +---------------------------------------------------------->|
(fog node C registered) |             |                   |           |
          |             |             |                   |           |

                Figure 2: Autonomic setup of fog agents

   o  The fog monitoring controller periodically sends multicast
      advertisement messages, which include its ID as well as the scope
      of the advertisement messages (i.e., the scope within which the
      messages have to be flooded).

      M_DISCOVERY messages are used, with new objectives and objective
      options.  GRASP specifies that "an objective option is used to
      identify objectives for the purposes of discovery, negotiation or
      synchronization".  New objective options are defined for the
      purpose of discovering potential fog agents with certain
      characteristics.  Non-limiting examples of these options are
      listed below (note that the names are just examples, and the ones
      actually used have to be registered with IANA):

      *  FOGNODERADIO: used to specify a given type of radio
         technology, e.g., WiFi (version), D2D, LTE, 5G, Bluetooth
         (version), etc.

      *  FOGNODECONNECTIVITY: used to specify a given type of
         connectivity, e.g., layer-2, IPv4, IPv6.

      *  FOGNODEVIRTUALIZATION: used to specify a given type of
         virtualization supported by the node where the agent runs.
         Examples are: hypervisor (type), container, micro-kernel,
         bare-metal, etc.

      *  FOGNODEDOMAIN: used to specify the domain/owner of the node.
         This is useful to support the simultaneous operation of
         multiple domains/operators on the same fog network.

      An example of a discovery message using GRASP would be the
      following (in this example, the fog monitoring controller is
      identified by its IPv6 address,
      2001:DB8:1111:2222:3333:4444:5555:6666):

      [M_DISCOVERY, 13948745, h'20010db8111122223333444455556666',
          ["FOGNODEDOMAIN", F_SYNCH_bits, 2, "operator1"]]

      GRASP is used to allow the fog agents and the controller to
      discover each other in an autonomic way.
      The extensions defined above, together with the use of properly
      scoped multicast addresses (as explained below), make it
      possible to precisely define which nodes participate in the
      monitoring and to gather their principal characteristics.

   o  When fog nodes bootstrap, such as nodes A and B in the figure,
      they start sending multicast discovery messages within a given
      scope, that is, the intended area that composes the fog.  The
      definition of the scope depends on the scenario; examples of
      possible scopes are:

      *  All resources of a given manufacturer.

      *  All resources of a given type.

      *  All resources of a given administrative domain.

      *  All resources of a given user.

      *  All resources within a topological network distance (e.g.,
         number of hops).

      *  All resources within a geographical location.

      *  Etc.

      Combinations of the previous scopes are also possible.

      The discovery messages are multicast within the scope, reaching
      all the nodes that compose the specified fog resources.  This can
      be done, for example, using well-defined IPv6 multicast addresses
      specified for each of the different scopes.  This signaling is
      based on GRASP.  Different IPv6 multicast addresses need to be
      defined to reach each different scope, using scopes equal to or
      larger than Admin-Local according to [RFC7346].

   o  In response to multicast fog discovery messages, the fog
      monitoring controller replies with unicast messages providing its
      information.

   o  Fog agents can then register with a controller.  The registration
      message is unicast and includes information on the capabilities
      of the fog node, such as:

      *  Type of node.

      *  Vendor.

      *  Energy source: battery-powered or not.

      *  Connectivity (number of network interfaces and information
         associated with them, such as radio technology type, layer-2
         and layer-3 addresses, etc.).

      *  Etc.
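
      As an illustration only, the registration information listed
      above could be modeled as follows.  This is a minimal sketch in
      Python; this document does not define a registration encoding,
      and all class, function and field names (Interface, Capabilities,
      build_registration) as well as the sample values are
      hypothetical.  The ID field is assumed here to carry the
      identifier of the controller, mirroring the fields shown in
      Figure 2.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Interface:
    """One network interface of a fog node (illustrative field names)."""
    radio: str        # radio technology type, e.g., "WiFi", "LTE"
    l2_address: str   # layer-2 address
    l3_address: str   # layer-3 address


@dataclass
class Capabilities:
    """Capabilities a fog agent reports at registration (hypothetical)."""
    node_type: str
    vendor: str
    battery_powered: bool
    interfaces: List[Interface] = field(default_factory=list)


def build_registration(ctrl_id: str, fog_node_id: str, fog_scope: str,
                       capab: Capabilities) -> dict:
    """Assemble the fields named in Figure 2 into a plain dictionary.

    Assumption: ctrl_id is the ID advertised by the controller.
    """
    return {
        "ID": ctrl_id,
        "fog_node_ID": fog_node_id,
        "fog_scope": fog_scope,
        "capab": asdict(capab),  # nested dataclasses become dicts
    }


# Example values are made up for illustration.
registration = build_registration(
    ctrl_id="2001:db8:1111:2222:3333:4444:5555:6666",
    fog_node_id="fog-node-A",
    fog_scope="operator1",
    capab=Capabilities(
        node_type="robot",
        vendor="example-vendor",
        battery_powered=True,
        interfaces=[Interface("WiFi", "02:00:00:00:00:01", "2001:db8::a")],
    ),
)
```

      Since GRASP messages are encoded in CBOR, a record like this
      would presumably be carried in CBOR as well; that encoding step
      is not shown here.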

      Note that registration with multiple fog monitoring controller
      instances is also possible if a fog node wants to belong to
      several fog domains at the same time (but note that how the
      orchestration of the same resource is done by multiple
      orchestrators is not covered by this document).  The defined
      mechanisms support this via the use of fog IDs and FOGNODEDOMAIN
      options.

   o  A fog node C bootstraps after nodes A and B are already
      registered.  Fog node C follows the same discovery process, but
      in addition to the regular advertisement and registration
      procedures described before, existing neighboring fog agents
      (such as A and B in this example) might also respond to discovery
      messages sent by bootstrapping nodes to provide the required
      information.  This makes the procedure faster, more efficient and
      more reliable.  In addition to helping the fog monitoring
      controller in the fog agent discovery process, fog agents
      themselves learn about the existence and associated capabilities
      of other fog agents.  This can be used to allow autonomous
      monitoring by the fog agents without the involvement of the
      central controller.

4.  IANA Considerations

   TBD.

5.  Security Considerations

   TBD.

6.  Acknowledgments

   The work in this draft will be further developed and explored under
   the framework of the H2020 5G-DIVE project (Grant 859881).

7.  Informative References

   [RFC7346]  Droms, R., "IPv6 Multicast Address Scopes", RFC 7346,
              DOI 10.17487/RFC7346, August 2014.

Authors' Addresses

   Carlos J. Bernardos (editor)
   Universidad Carlos III de Madrid
   Av. Universidad, 30
   Leganes, Madrid  28911
   Spain

   Phone: +34 91624 6236
   Email: cjbc@it.uc3m.es
   URI:   http://www.it.uc3m.es/cjbc/

   Alain Mourad
   InterDigital Europe

   Email: Alain.Mourad@InterDigital.com
   URI:   http://www.InterDigital.com/

   Pedro Martinez-Julia
   NICT
   4-2-1, Nukui-Kitamachi, Koganei
   Tokyo  184-8795
   Japan

   Phone: +81 42 327 7293
   Email: pedro@nict.go.jp