TBD                                                          E. Spinella
Internet-Draft                                                   Syndeno
Intended status: Informational                           27 January 2022
Expires: 31 July 2022

                      Event Streaming Open Network
             draft-spinella-event-streaming-open-network-02

Abstract

   This document describes the vision, architecture and network protocol
   for an Event Streaming Open Network over the Internet.

About This Document

   This note is to be removed before publishing as an RFC.

   The latest revision of this draft can be found at
   https://example.com/LATEST.  Status information for this document may
   be found at
   https://datatracker.ietf.org/doc/draft-spinella-event-streaming-open-network/.

   Source for this draft and an issue tracker can be found at
   https://github.com/syndeno/draft-spinella-event-streaming-open-network.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 31 July 2022.

Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors.  All rights reserved.
   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Revised BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
   2.  An Open Network for Event Streaming over the Internet . . . .   4
     2.1.  Free, Open & Neutral Networks (FONN)  . . . . . . . . . .   5
     2.2.  Non-discriminatory and open access  . . . . . . . . . . .   6
     2.3.  Open participation  . . . . . . . . . . . . . . . . . . .   6
     2.4.  Open Access Infrastructure Resources  . . . . . . . . . .   7
       2.4.1.  Open Access DNS Resource Example  . . . . . . . . . .   9
       2.4.2.  Flow: Event Streaming Internet Resource . . . . . . .   9
   3.  Necessities for an Event Streaming Open Network over the
       Internet  . . . . . . . . . . . . . . . . . . . . . . . . . .  11
     3.1.  Necessity 1: Event Streaming Internet Resource Public
           Registry  . . . . . . . . . . . . . . . . . . . . . . . .  11
     3.2.  Necessity 2: Establishment of a User Space for Events . .  12
     3.3.  Necessity 3: An Agnostic Subscription Protocol  . . . . .  13
     3.4.  Necessity 4: An Open Cross-sector Payload Format  . . . .  14
   4.  Event Streaming Open Network Architecture . . . . . . . . . .  15
     4.1.  Architecture overview . . . . . . . . . . . . . . . . . .  15
       4.1.1.  Flow Events Broker (FEB)  . . . . . . . . . . . . . .  18
       4.1.2.  Flow Name Service (FNS) . . . . . . . . . . . . . . .  18
       4.1.3.  Flow Namespace Accessing Agent (FNAA) . . . . . . . .  22
       4.1.4.  Flow Processor (FP) . . . . . . . . . . . . . . . . .  23
       4.1.5.
  Flow Namespace User Agent (FNUA)  . . . . . . . . . .  24
     4.2.  Communications Examples . . . . . . . . . . . . . . . . .  24
       4.2.1.  Unidirectional Subscription . . . . . . . . . . . . .  25
       4.2.2.  Bidirectional Subscription  . . . . . . . . . . . . .  25
   5.  Event Streaming Open Network Protocol . . . . . . . . . . . .  26
     5.1.  Protocol definition methodology . . . . . . . . . . . . .  26
     5.2.  Flow Namespace Accessing Protocol (FNAP)  . . . . . . . .  27
     5.3.  Implementation  . . . . . . . . . . . . . . . . . . . . .  28
       5.3.1.  Objectives  . . . . . . . . . . . . . . . . . . . . .  28
     5.4.  Existing components . . . . . . . . . . . . . . . . . . .  28
       5.4.1.  Flow Events Broker (FEB)  . . . . . . . . . . . . . .  29
       5.4.2.  Flow Name Service (FNS) . . . . . . . . . . . . . . .  29
       5.4.3.  Components to be developed  . . . . . . . . . . . . .  29
   6.  Proof of Concept  . . . . . . . . . . . . . . . . . . . . . .  32
     6.1.  Minimum functionalities . . . . . . . . . . . . . . . . .  32
     6.2.  FNAA - Server application . . . . . . . . . . . . . . . .  33
     6.3.  FNUA - Client application . . . . . . . . . . . . . . . .  34
     6.4.  Use cases . . . . . . . . . . . . . . . . . . . . . . . .  35
       6.4.1.  Use case 2: Creating a flow . . . . . . . . . . . . .  36
       6.4.2.  Use case 3: Describing a flow . . . . . . . . . . . .  37
       6.4.3.  Use case 4: Subscribing to a remote flow  . . . . . .  38
     6.5.  Results of the PoC  . . . . . . . . . . . . . . . . . . .  42
   7.  Summary & Conclusions . . . . . . . . . . . . . . . . . . . .  43
   8.  Security Considerations . . . . . . . . . . . . . . . . . . .  45
   9.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  45
   10. Normative References  . . . . . . . . . . . . . . . . . . . .  45
   Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . .  45
   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .  46

1.
  Introduction

   Society is rapidly digitalizing and automating the exchanges of value
   that constitute the economy.  Considerable time and energy is also
   spent ensuring that key transactions can be executed with reduced
   human involvement and with better, faster, and more accurate results.
   In this context, Event Streaming can play a key role in how the
   economic system evolves.

   However, most of the application layer integrations executed today
   across organizational boundaries are not in real time, and they
   currently require employing a variety of formats and protocols.  Some
   industries have adopted data formats for exchanging information
   between organizations, such as Electronic Data Interchange (EDI).
   However, those integrations are limited to specific use cases and
   represent a small fraction of all demanded organizational
   integrations.

   Thus, there is no common consensus on a mechanism for the exchange
   of events across organizations.  This results in a completely custom
   landscape for each real-time cross-organization integration, in
   which development teams must invest plenty of time into
   understanding and defining a common interface for event exchange.

   In this context, we can now introduce how this landscape could
   change with the introduction of an Event Streaming Open Network over
   the Internet.  When needing to connect real-time event flows across
   organizations, developers would have a common basis for finding,
   publishing, and subscribing to event streams.  Also, given a set of
   standard formats to encode and transmit events, developers could use
   the programming language of their choice.  Overall, this set of
   standards would drastically reduce the cost of real-time
   integration, which would also enable experimentation by users.
   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.  An Open Network for Event Streaming over the Internet

   In this section, we will discuss how Internet standards are
   developed and why this could be the case for an Event Streaming Open
   Network.

   An interesting example of this phenomenon is the case of ISDN
   (Integrated Services Digital Network), a set of communications
   standards for the transmission of voice, video, and data over the
   PSTN (Public Switched Telephone Network) developed by the ITU-T
   (Telecommunication Standardization Sector) in 1988.  ISDN aimed to
   use the existing public telephone network to transmit digital data
   at a time when Internet access was not as broadly available as it is
   today.  The main competitor of this standard was the incipient
   Internet itself, which could be used to transmit the same data.

   The Internet alternative needed a protocol to support the same
   services offered by ISDN, which was initially developed by the joint
   effort of the academic and private sectors.  Consequently, in 1992
   the Mbone (Multicast Backbone) was created.  This project was an
   experimental network backbone built over the Internet for carrying
   multicast IP traffic, which could be used for multimedia content.
   After some important milestones of this project, SIP (Session
   Initiation Protocol) was defined in 1996 and was later published as
   a standard protocol in the IETF's [RFC3261].  The reality today is
   that SIP has completely won the standards battle for multimedia
   transmission over the Internet, and ISDN usage has been in
   continuous decline.
   As for Event Streaming, we see a similar scenario today.  There are
   currently several open specifications and implementations for Event
   Streaming, like AMQP (Advanced Message Queuing Protocol), supported
   by RabbitMQ.  However, while AMQP can be used for several purposes,
   the Kafka Protocol specializes in Event Stream Processing, and its
   specialized features (e.g., ordering guarantees) make it more
   convenient than RabbitMQ for this purpose.

   In the case of an Event Streaming Open Network over the Internet, if
   we guide ourselves by the history of the most widely adopted
   protocols on the Internet, the governance should be similar to that
   of the WWW or email.  Both the WWW and email have open
   specifications as well as open-source implementations.  We can
   mention the Apache Web Server as an open-source implementation of
   the HTTP protocol, Postfix for SMTP, and BIND for DNS.
   Nevertheless, the governance of these protocols' specifications
   relies on the IETF.

   In order to define the characteristics of an Event Streaming Open
   Network, we will focus on the definition of shared and openly
   accessible infrastructure.  First, we will review the principles of
   Free, Open & Neutral Networks and why they should be followed by an
   Event Streaming Open Network.  Then, we will show how DNS complies
   with the criteria to be considered an infrastructure resource.
   Finally, we will demonstrate how this is also true for Event
   Streaming.

2.1.  Free, Open & Neutral Networks (FONN)

   The main principles of a Free, Open & Neutral Network are:

   *  It is open because it is universally open to the participation of
      everybody without any kind of exclusion or discrimination, and
      because how it works and what its components are is always
      documented, enabling everyone to improve it.

   *  It is free because everybody can use it for whatever purpose and
      enjoy it independently of their degree of network participation.
   *  It is neutral because the network is independent of the contents:
      it does not influence them and they can circulate freely, and
      users can access and produce contents independently of their
      financial capacity or social condition.  The new contents
      produced are oriented to stimulate new ones, to support the
      network administration itself, or simply to exercise the freedom
      of adding new contents, but not to replace or block other ones.

   *  It is also neutral with regard to technology: the network can be
      built with whatever technology is chosen by the participants,
      with the only limitations resulting from the technology itself.

2.2.  Non-discriminatory and open access

   Services such as DNS, the World Wide Web and email do not
   discriminate and are openly accessible.  Basically, people and
   organizations can access these networks as long as they can register
   an Internet domain and host the required server components.
   Nowadays, there are alternatives that avoid having to register a
   domain name to have a web page or an email address, such as cloud
   WordPress hosting or Gmail.  However, we will focus on the network
   participants that provide services to end users.

   In the case of Guifi.net, we can highlight how this principle has
   been adopted in the fact that everybody can take part in the project
   without discrimination.  Moreover, an emphasis is placed on easing
   the participation of disadvantaged collectives, with fewer resources
   or fewer opportunities to access information technologies,
   telecommunications, and the Internet.

   An Event Streaming Open Network should provide resources in a
   similar way to the most widely adopted Internet services.  Thus,
   individuals and organizations must be able to register Flow address
   spaces, for which the existing DNS infrastructure could be
   leveraged.
   Moreover, the specifications of the protocols that implement the
   Metadata and Payload formats must also be openly accessible.

2.3.  Open participation

   Internet services like DNS, the WWW and email provide individuals
   and organizations with different ways of participating.  First,
   anybody can obtain the protocols' specifications and build a custom
   implementation, which results in a new product compatible with the
   protocols.  Secondly, anybody can register a domain name and set up
   servers using compatible products.  Thirdly, anybody can join and
   participate in the IETF, the institution that governs the
   specifications for these protocols.

   As for Guifi.net, not only can anybody extend the network with new
   nodes, but they can also participate in existing network extension
   projects.  Also, participants can add services on top of the
   network, such as VoIP, FTP servers, broadcast radios, etc.

   Regarding active participation in an Event Streaming Open Network,
   we can highlight the possibility for individuals and organizations
   to expand the services provided by the open network.  This
   extensibility could be made possible by different uses of the event
   payloads and will vary significantly depending on the sector.  Since
   Flow is an infrastructure resource, as argued in Section 2.4.2,
   innovation would play its part and its results would be materialized
   in an expansion of services.

   We can conclude that the same kind of openness found in DNS, the WWW
   and email is necessary for an Event Streaming Open Network.  Anybody
   should be able to obtain the specifications to build an
   implementation of the service.  Also, since it should leverage the
   DNS infrastructure, anybody would be able to register Flow address
   spaces.  Lastly, the specification could be governed by an
   institution such as the IETF, due to the dependency of Flow on other
   Internet services governed by this institution.
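   As a non-normative sketch of how Flow address spaces could be
   layered over DNS, the following Python fragment parses a
   hypothetical "user@domain/flow" address.  Both the address syntax
   and the parse_flow_address helper are assumptions for illustration
   only; this document does not define a concrete Flow address format.

```python
# Hypothetical sketch: naming Flow address spaces on top of DNS.
# The "user@domain/flow" syntax below is an assumption made for
# illustration; no concrete Flow address format is specified here.

def parse_flow_address(address: str) -> dict:
    """Split a hypothetical Flow address into its DNS-backed parts."""
    user, _, rest = address.partition("@")
    domain, _, flow = rest.partition("/")
    if not (user and domain and flow):
        raise ValueError(f"not a well-formed flow address: {address!r}")
    # The DNS domain identifies the organization's namespace, exactly
    # as it does for email; the flow name is local to that namespace.
    return {"user": user, "domain": domain, "flow": flow}

addr = parse_flow_address("alice@example.com/temperature-readings")
```

   Under this sketch, registering a Flow address space would require
   nothing beyond what email already requires: control of a DNS domain.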
2.4.  Open Access Infrastructure Resources

   The literature about commons infrastructure (Frischmann, 2007)
   defines a set of criteria to evaluate whether a resource can be
   considered an infrastructure resource.  This analysis is relevant
   since it can provide arguments to prove the need for an
   infrastructure of commons for Event Streaming, which could then be
   materialized in an Open Network for Event Streaming.  The demand-
   side criteria for evaluating whether a given resource can be
   considered an infrastructure resource are:

   1.  The resource can be consumed nonrivalrously.

   2.  Social demand for the resource is driven primarily by downstream
       productive activity that requires the resource as an input.

   3.  The resource is used as an input into a wide range of goods and
       services, including private goods, public goods and/or non-
       market goods.

   First, a nonrival good describes the "shareable" nature of a given
   good.  Infrastructures are shareable in the sense that the resources
   can be accessed and used by multiple users at the same time.
   However, infrastructure resources vary in their capacity to
   accommodate multiple users, and this variance in capacity
   differentiates nonrival resources from partially rival resources.
   A nonrival resource has infinite capacity, while a partially rival
   resource has finite but renewable capacity.  As an example,
   broadcast television is a nonrival resource since additional users
   do not affect the capacity of the resource.  On the other hand,
   natural oil resources are completely rival since their availability
   is limited and they are not renewable.  In the middle, we have
   partially rival resources like a highway, which may be congested.
   This last characteristic is also true for the Internet since, up to
   a certain point, it supports additional users without degrading the
   service to existing users.
   Secondly, infrastructure resource consumption is primarily driven by
   downstream activities that require this resource as an input.  This
   means that the broad audience consumes infrastructure resources
   indirectly.  For instance, highway infrastructure is used to
   transport every kind of physical good that people and organizations
   purchase.  This facilitates the generation of positive externalities
   for society through the downstream production of public goods and
   non-market goods.  These positive externalities might be suppressed
   under a regime where resource availability is driven solely by
   individuals' willingness to pay.

   Regarding willingness to pay, it is relevant to analyze this factor
   more exhaustively.  Frischmann states that if infrastructure access
   is allocated based on individuals' willingness to pay, the potential
   positive externalities of that infrastructure might be stifled.
   Thus, infrastructure resources behave differently than end-user
   products: if the former are made available solely based on end-user
   demands and willingness to pay, needed infrastructure resources
   might never be made available.  As an example, if airports were
   built based on individuals' willingness to pay for them, they might
   never be built.  However, individuals are willing to pay for the
   airport's downstream activities, such as purchasing a flight or
   consuming air-transported goods.  Then, a whole set of positive
   externalities is generated by the existence of an airport in a city.

   In the third place, infrastructure resources are used as input for a
   wide range of outputs.  This criterion emphasizes both the variance
   of the downstream outputs and their nature.  Thus, infrastructure
   resources possess a high level of genericness, which enables
   productive activities that produce different goods with high
   variance.
   If we consider how an airport complies with this criterion, we can
   mention that airports not only serve individuals who need to travel
   by air but are also used to transport many kinds of physical goods.
   These goods then enable other activities throughout the downstream
   value chain.  Thus, the output variance of the activities that take
   airport infrastructure as input is significantly high.

2.4.1.  Open Access DNS Resource Example

   Now, we will provide as an example how DNS complies with these
   criteria and why it can be considered an infrastructure resource.

   1.  DNS infrastructure is a partially rival resource: individuals
       and organizations can register domains in the domain name
       addressing space, but not every actor can acquire the same
       domain name.  However, access to registering domain names is
       open and non-discriminatory.  Moreover, DNS is also prone to
       congestion, which emphasizes its partially rival nature.

   2.  DNS infrastructure demand is driven principally by downstream
       products and services.  An average Internet user does not pay
       directly for this infrastructure, but all the Internet services
       the user consumes pay for DNS infrastructure.  This is true for
       all Internet services due to the ubiquitous nature of DNS
       infrastructure.

   3.  All Internet services take DNS infrastructure as input and
       produce a broad variety of outputs, which then generate positive
       externalities for society as a whole by means of private goods,
       public goods and/or non-market goods.

   We can conclude that DNS complies with Frischmann's criteria for
   being considered an infrastructure resource.  The resource is
   represented both by the domain names that can be registered and by
   the querying capacity of DNS servers.

2.4.2.  Flow: Event Streaming Internet Resource

   In this section, we will describe an Event Streaming Internet
   Resource.
   For this, we will consider the previously described guidelines for
   FONN as well as the characteristics of DNS as a resource.  This
   Event Streaming Internet Resource shall be referred to as "Flow"
   from now on.

   To begin with, we need to define what elements could be considered
   infrastructure resources in an Event Streaming Open Network.  First,
   the resource must be capable of delivering streams of events to
   consumers.  Secondly, it must also permit producers to write events
   to the stream.  Thirdly, each stream must be identifiable (i.e.,
   have a URI) and able to be located (i.e., have a URL).  From now on,
   we will use "Flow" to refer to the infrastructure resource of an
   Event Streaming Open Network.

   The first Frischmann criterion requires the resource to be consumed
   nonrivalrously.  Complete nonrivalry cannot be achieved for any
   Internet service due to the possibility of congestion and the
   potential unavailability of different elements of the network.  The
   same would be true for a Flow resource.  Moreover, the public naming
   addressing space for Flows would be limited to the same extent as
   that of domain names.

   We will continue now with the third criterion.  To illustrate the
   potential of Flows being used as inputs for downstream activities,
   we will refer to Urquhart's vision for Event Streaming.  He lists
   two areas in which significant changes can happen:

   1.  The use of time-critical data for customer experience and
       efficiency.  This is driven by the fact that today's consumers
       increasingly expect great experiences, and organizations are
       almost always motivated to improve the efficiency of their
       operations.

   2.  The emergence of new businesses and business models.  Businesses
       and institutions will quickly discover use cases where data
       processed in a timely manner will change the economics of a
       process or transaction.
       They may even experiment with new processes, made possible by
       this timely data flow.  Thus, Flow resources will also enable
       innovation, and these innovations are responsible for generating
       positive externalities.

   We have thus demonstrated why Flow resources can be considered
   infrastructure resources using Frischmann's demand-side theory of
   infrastructure.  These resources can be managed in an open manner to
   maximize positive externalities, which basically means maintaining
   open access, not discriminating, and eliminating the need to obtain
   licenses to use the resources.  Consequently, managing
   infrastructure resources in this manner eliminates the need to rely
   on either market actors or governments.

   Lastly, the adoption of an Event Streaming Open Network implies
   taking Flow resources as inputs for productive activities.  These
   inputs would then be used downstream to generate private goods,
   public goods and/or non-market goods.  Additionally, we can assert
   that most consumers of Flow would not consume Flow resources
   directly: they would consume the outputs of downstream activities
   that use Flow as input.  Again, these consumers may not be willing
   to pay for Flow resources directly, which satisfies the second
   criterion.

   We can conclude this section by mentioning that an Event Streaming
   Open Network would enable an infrastructure resource called Flow.
   Access to this resource can be managed in an open manner:
   maintaining open access, not discriminating between users or
   different uses of the resource, and eliminating the need to obtain
   approval or a license to use the resource.

3.  Necessities for an Event Streaming Open Network over the Internet

   In this section, we will describe the main needs for the broad
   adoption of Event Streaming.
   The focus will be on detecting and describing the missing
   capabilities that could not only enable but also accelerate event
   data integration among different organizations.  The different
   necessities detailed in this section will serve as input for an
   architecture design.

3.1.  Necessity 1: Event Streaming Internet Resource Public Registry

   A public registry of an organization's available event streams does
   not exist.  We will argue in this section why this is the core
   component that an Event Streaming Open Network can provide.

   Nowadays, when an organization needs to publish an event stream or
   event flow, it usually follows some form of the following steps:

   1.  Develop and deploy a producer application that writes events to
       a queue.

   2.  Create all necessary networking permissions for external public
       access to the queue.

   3.  Inform the remote user of the access information (i.e.,
       hostname/IP, protocol, and port) together with the required
       client details and technology for accessing the stream (i.e.,
       Apache Kafka Protocol, RabbitMQ API, etc.).

   4.  Create credentials for consumer authentication and authorization
       access to the queue.

   5.  Develop and deploy a consumer application that reads the queue.

   Now, we can compare this process to a simple email interaction:

   1.  The sender opens a graphical Mail User Agent application and
       sends an email to an email address formatted as user@domain.

   2.  The message is sent to an SMTP server that routes it to the
       destination SMTP servers for the given domain.  Once received,
       the message is put into the user's mailbox.

   3.  When the recipient checks their mailbox via IMAP or POP3, the
       new email is transferred to the Mail User Agent.

   In these two scenarios, we can see that the information that needs
   to be exchanged offline by the actors is completely different in
   size and content.
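   To make the contrast above concrete, the following non-normative
   Python sketch compares the "contact card" each style of integration
   requires.  All field names and values are illustrative assumptions,
   not defined by this document:

```python
# Non-normative sketch: the out-of-band information each integration
# style requires before the first message or event can be delivered.
# Every field name below is an illustrative assumption.

# Email: one opaque string is enough to reach a mailbox; DNS and SMTP
# resolve everything else.
email_contact = "user@example.com"

# Event streaming today: every one of these details must be agreed
# offline between the organizations before consumption can start.
stream_contact = {
    "technology": "Apache Kafka",        # also fixes the client library
    "bootstrap_servers": [
        "tcp://kf1.cluster.example.com:9092",
        "tcp://kf2.cluster.example.com:9092",
    ],
    "topic": "orders",
    "auth": {"mechanism": "SASL/PLAIN", "user": "partner",
             "password": "..."},
}
```

   The size difference between the two "contact cards" is precisely the
   gap that a public registry of streams would close.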
   First, in the case of email, there is a shared naming space given by
   the Domain Name System (DNS).  The email address format has been
   standardized by the IETF in [RFC5321], Section 2.3.11.  Thus, there
   is a common naming space that is used for referencing mailboxes in
   the format user@domain, and the only offline detail the peers need
   to communicate is the recipient's email address.  There is no
   analogous standard nor an open alternative for Event Streaming.

   Therefore, in the case of Event Streaming, users need to perform
   plenty of offline communication to agree not only on the technology
   to use but also on the queue to use.  For instance, two
   organizations may currently be using Apache Kafka and need to share
   an event stream between themselves.  The organization hosting the
   source of the stream should provide the following details to the
   consumer organization:

   *  Bootstrap servers: list of Fully Qualified Domain Names used to
      initiate the connection to the Apache Kafka brokers.  Example:
      tcp://kf1.cluster.example.com:9092,
      tcp://kf2.cluster.example.com:9092,
      tcp://kf3.cluster.example.com:9092

   *  Topic or queue name: name of the topic resource in the Apache
      Kafka cluster.

   *  Authentication information: user and password, TLS certificate,
      etc.

   In case these organizations were not both using Apache Kafka, the
   use case could not be solved without incurring development work or
   complex configurations, as well as adopting proprietary components.

   We can conclude that an Event Streaming Open Network should provide
   a globally accessible URI for streams, in a similar fashion to
   email, to reduce developers' offline interactions.  This means being
   able to name event streams in a common naming space like DNS, as
   well as providing a mechanism for users to discover the location and
   connection requirements.

3.2.
  Necessity 2: Establishment of a User Space for Events

   Another obstacle to broad adoption is the nonexistence of a common
   and agreed user convention.  In the general literature, we cannot
   find references to the types of users that would consume events from
   or produce events to an event stream.

   In this sense, it is also appropriate to consider the email use
   case.  Basically, an email user only needs to know the email
   address, the password, and the URL of a webmail client or the
   details of the IMAP/POP3 server connection.  Once the user has this
   information, it is possible to access an email space or mailbox
   where the user can navigate the emails in it.  Also, IMAP provides
   the possibility for the user to create folders and optionally share
   them with other users.

   There is no service currently available for Event Streaming
   analogous to the email case.  This means that the user concept in
   Event Streaming is limited to authentication and authorization.
   Thus, the user does not have access to a "streambox", and as a
   result it is impossible for a person or an application to possess a
   home directory containing all the streams owned by the user.

   As a conclusion to this section, we can mention that it is necessary
   to embrace a user space resource for Event Streaming.  This resource
   should not only address the users' motivations and requirements but
   also reduce offline verbal communications and custom development
   dependencies.  In the next sections, we will refer to this component
   as the Event User Space Service.

3.3.  Necessity 3: An Agnostic Subscription Protocol

   A third need for wide adoption is an agnostic protocol to manage
   subscriptions to event streams.  For this need to be addressed, it
   would first be necessary to have an Event User Space Service.
Then, in case a user has created a stream and wants to enable public subscriptions by other users, there is no general protocol to inform other parties of this subscription intention or of its confirmation.

The result is the inability of users to seamlessly subscribe to an event stream.  They either must employ protocols like MQTT or, when employing other application protocols like Apache Kafka's, hardcode the subscription details in the different software implementations.  This means that there is no general subscription protocol for Event Streaming that is agnostic of the application protocol employed: the application protocol implements both the Metadata Payload Format and the Payload Format.

A good example to illustrate the difference between a control protocol that implements a Metadata Payload Format and a payload protocol that implements a Payload Format is how SIP (Session Initiation Protocol) works with RTP (Real-time Transport Protocol) to provide VoIP capabilities.  The former is a control protocol that initiates and maintains a session or call, while the latter is responsible for carrying the payloads, which in the case of VoIP would be encoded audio.

Consequently, a similar separation of protocols could mitigate this limitation for Event Streaming.  If one protocol can be used to establish and maintain the subscription relationships while a different protocol is used for the event payloads, all current application protocol implementations could be supported.

Additionally, with an Event Streaming Public Registry also in place, it would be possible to provide URIs for streams in a similar way as email works with the "mailto" URI.
For instance, in web pages one can find that email addresses are linked to mailto URIs which, when clicked, open the default email user application (e.g., Microsoft Outlook) to send an email to the referenced email address.

If a user has a user space or streambox, then a user application like an email client could provide access to it.  Then, if the user clicks on a link with a stream URI (e.g., "stream:myeventflow"), the streambox application would open and subscribe to the given stream.

Currently, the Metadata Payload Format as well as the Payload Format are both provided by the queue or log application protocol.  In the case of Apache Kafka, both formats are implemented within the Apache Kafka Protocol.  This introduces a barrier for interoperability among different technologies, meaning that flows of event data cannot be seamlessly connected without relying on custom development or proprietary software licensing.

We can conclude that there is an actual need for an open specification of an Event Subscription Service for event streams, which implements what Urquhart calls the Metadata Payload Format.  This specification could be materialized in a network protocol that introduces an abstraction over the event queue or log technologies implemented by different organizations.

3.4.  Necessity 4: An Open Cross-sector Payload Format

Currently, the different implementations of Event Streaming combine the Payload Format with the Metadata Format.  This means that the same protocol utilized for payload transport is used for subscription management.

When a producer intends to publish events to a queue or, using Apache Kafka terminology, when a producer intends to write records to a topic, it first needs to initiate a connection to at least one of the Apache Kafka Brokers.
In that initial exchange of TCP packets, the producer is authenticated, authorized, and provided with topic details.  This set of transactions would belong to a protocol that implements a Metadata Payload Format.  Afterwards, when the producer starts writing the events to the topic, it encapsulates the event payload in a Kafka Protocol message.  This latter behavior makes use of a Payload Format.  Thus, we can observe how both theoretical formats are coupled in a single protocol.  A similar coupling of the Metadata and Payload Formats in one single protocol also occurs in AMQP, MQTT and RabbitMQ.

As for the consumer, the behavior is the same, with the difference that the initial intention is to subscribe to a queue or, in Apache Kafka terminology, to consume records of a topic.  Then, a set of TCP packets encapsulating the Apache Kafka protocol authenticates, authorizes, and provides the consumer with topic details for consumption.  Afterwards, the consumer can start polling for new records in the different partitions of the topic.  It is worth mentioning that the consumer needs to implement more queue management logic than the producer, especially when multiple replicas of a consumer type are deployed.

If we focus on the Payload Format, there is the need for an implementation-agnostic payload format suitable for Event Streaming.  In this sense, the CloudEvents project of the CNCF proposes a specification and a set of libraries for this purpose.  The goal is to use the CloudEvents specification as a Payload Format regardless of the payload protocol being used.  For instance, we could transmit events in the CloudEvents format using the Kafka or AMQP protocol.

The general structure of the CloudEvents Payload Format includes a standardized methodology to include event data in an event message.
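To illustrate, the following Python sketch (standard library only; the "source" and "type" values are hypothetical, chosen for this example and not taken from any specification) wraps a temperature reading in a CloudEvents 1.0 JSON envelope, keeping the domain data inside the standard "data" attribute:

```python
import json
import uuid
from datetime import datetime, timezone

def make_temperature_event(celsius: float) -> str:
    """Wrap a temperature reading in a CloudEvents 1.0 JSON envelope."""
    event = {
        # Required CloudEvents context attributes
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": "/sensors/room1/thermometer",     # hypothetical source URI
        "type": "com.example.temperature.changed",  # hypothetical event type
        # Optional context attributes
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        # Domain payload goes in the standard "data" attribute
        "data": {"celsius": celsius},
    }
    return json.dumps(event)

print(make_temperature_event(21.5))
```

Because the envelope is plain JSON, the same serialized event could be carried as the value of a Kafka record or as an AMQP message body without changing its structure.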
For instance, instead of defining a customized JSON structure for sending the events of temperature changes measured by a device, a CloudEvent object could be used.  Temperature could be included as an attribute in the CloudEvent object.

We can then conclude that while there is no current protocol candidate that implements the Metadata Format, CloudEvents is a good candidate for the Payload Format needed in an Event Streaming Open Network.  In this way, the different CloudEvents libraries made available in several programming languages could be leveraged.

4.  Event Streaming Open Network Architecture

In this section, we will describe the overall architectural proposal for an Event Streaming Open Network.  This description will include the different actors in play, the software components required, as well as the network protocols that should be specified.

4.1.  Architecture overview

In Figure 1 we illustrate a high-level overview of an architecture proposal for the Open Network.

(Artwork only available as svg: No external link available, see draft-spinella-event-streaming-open-network-02.html for artwork.)

Figure 1: Figure 1

We can identify the different Network Participants (NPs) in Figure 1, represented by different colors.  The different NPs act as equals when consuming or producing events as part of the Flows they own.  All NPs implement the Event Streaming Open Network Protocol, which is described in the next chapter.

In the diagram, an initial flow starts on the orange NP, to which a user in the blue NP is subscribed.  After processing the events received in the first flow, the results are published to a new flow in the blue NP, to which the orange NP is subscribed as well.  Additionally, the green participant is subscribed to the same flow, enabling downstream activities across the rest of the network participants.
It is possible to observe how the high-level architecture allows sharing the streaming of events across different network participants and their users.  There is also the need for security, in order to allow or deny access to write to and read from flows.

Regarding security, the architecture considers the integration with an Identity & Access Management service, which could implement popular protocols such as OAuth, SAML or SASL.  However, the network should also enable anonymous access in the same way FTP does.  This means that a given NP could publicly publish a flow and allow any party to subscribe to it.

For example, the Network Time Protocol (NTP) is used nowadays to synchronize the date and time on servers.  There are many NTP servers available that allow anonymous access, meaning that the service is openly available.  The same must be considered for the Event Streaming Open Network.

Additionally, NPs must be able to expand their capacity to support any number of flows, as well as to extend the network with new services.  Not only must NPs be able to include any given set of data within events, but they must also be able to build applications and services on top of the network by employing the architecture primitives.

(Artwork only available as svg: No external link available, see draft-spinella-event-streaming-open-network-02.html for artwork.)

Figure 2: Figure 2

Now, we provide a brief description of all the components that appear in the diagram of Figure 2.  Further details of the components are provided in the next sections.

*  Flow Events Broker (FEB): a highly available and fault-tolerant service that provides queues to be consumed by network services, by users, and by their applications.  Examples of an Event Queue Broker are Apache Kafka, AWS SQS and Google Cloud Pub/Sub.
The payload format implemented by these tools is what in 3.1.4 we called the Event Streaming Payload Format.

*  Flow Name Service (FNS): a DNS-based registry that acts as an authoritative server for a set of domain names, which are used to represent flow addresses in a flow namespace.  These domains contain all the necessary information to resolve flow names into flow network locations.  This component corresponds to what in 3.1.1 we named the Event Streaming Registry.

*  Flow Namespace User Agent (FNUA): an application similar to Mail User Agents like Microsoft Outlook or Gmail.  This application provides access to flow namespaces to users of the network.  The definition of this component implies the specification of a dedicated protocol.  We will refer to this protocol as FNAP (Flow Namespace Accessing Protocol).

*  Flow Namespace Accessing Agent (FNAA): the server side of the Flow Namespace User Agent.  This component is the one that must provide convenient integration methods for GUIs.  This component corresponds to what in 3.1.2 we named the Event User Space Service.  It must implement the same protocol selected for the Flow Namespace User Agent: FNAP (Flow Namespace Accessing Protocol).

*  Flow Processor (FP): a flow processing instance used to set up subscriptions that connect local or remote flows on demand.  This component implements the processing part of what in 3.1.3 we called the Event Subscription Service.  It will be created and managed by an FNAA instance, and the communication is held through an Inter-process Communication (IPC) interface.  Also, this service must implement an Event Payload Format, for which we will mainly consider CNCF's CloudEvents and Protobuf.

*  Flow Namespace Accessing Protocol (FNAP): the protocol implemented in the Flow Namespace Accessing Agent as well as in the Flow Namespace User Agent.
The former acts both as a server and as a client, while the latter acts only as a client.  This protocol is described in the next chapter.

4.1.1.  Flow Events Broker (FEB)

The FEB implementation that we will mostly consider is Apache Kafka.  This open-source project is quickly becoming a commodity platform, and major cloud providers are building utilities for it.  However, as a design decision, it should be possible to use the same protocols to support other applications, such as RabbitMQ, Apache Pulsar or cloud-based options like AWS SQS or Azure Event Hubs.

Apache Kafka is the ecosystem leader in the Event Streaming space, mainly considering adoption.  There is a growing set of tools and vendors supporting its installation, operation, and consumption.  This fact makes Apache Kafka much more appealing to enterprise developers.  However, the broker should provide a common set of functionalities, which can be seen in the diagram of Figure 3.

(Artwork only available as svg: No external link available, see draft-spinella-event-streaming-open-network-02.html for artwork.)

Figure 3: Figure 3

The selection of the Events Broker will impact the implementation of the Flow Namespace Accessing Agent.  This component will be responsible for knowing how to set up and manage flows on top of different Events Brokers.

4.1.2.  Flow Name Service (FNS)

The FNS is a core component of the overall proposed architecture.  This component provides all the functionalities needed for obtaining flow connection details based on a Flow URI (Uniform Resource Identifier).  Thus, it is required to define a URI format for flow resources and to specify mechanisms for resource location resolution.

In this section, we will focus on describing both the URI for flows as well as the DNS mechanism for obtaining flow network location details.

4.1.2.1.
Leveraging DNS infrastructure

As mentioned previously, this component must maximize its leverage of the existing Internet DNS infrastructure.  The reason for this requirement is to avoid defining new protocols and services that would prevent broad adoption.  Currently, DNS is the de facto name resolution protocol for the Internet, and libraries for its usage exist in every major programming language.

Whereas DNS is mainly used to resolve FQDNs (Fully Qualified Domain Names) into IP addresses, there are many other functionalities provided by the global DNS infrastructure.  Conceptually, DNS is an open network forming a distributed database.  Individuals and organizations that want to participate in the network need to register a domain name and set up authoritative DNS servers for their domains.

It is not in the scope of this work to detail the different available usages of DNS functionalities, but we can mention that it provides special Resource Records (i.e., types of information for an FQDN) that are used by specific protocols.  For instance, the MX Resource Records are used by SMTP servers to exchange email messages.

For the Flow Open Network, it will be required to define a URI format for flows as well as the mechanism to resolve a URI into all the information needed to connect to a flow.  In the case of email, the URI is the email address, while the connection details are the SMTP server responsible for receiving emails for that account.  For instance, an email URI could be user@example.com, while its connection details could be smtp://mail.example.com.  The connection details are obtained by resolving the MX DNS Resource Records of example.com, which in this example point to mail.example.com.

4.1.2.2.  Flow URI

As we mentioned previously, the first needed element is a URI definition for flow resources.
The identification of these resources must capture the following details:

*  Domain: a registered domain under which flow resource references are created.  For example, airport.example.

*  Flow Namespace: a subdomain which is solely used by users to host flow names.  This subdomain must be delegated to the Flow Name Service component and should preferably not be used for any purpose other than flows.

*  Flow Name: a name for each flow that must be unique within its domain.  The combination of flow name and flow domain results in an FQDN.  For instance, we could have a flow named arrivals in the domain flow.airport.example.  Thus, the FQDN of the flow would be arrivals.flow.airport.example.  Also, the name can contain dots, so that the following FQDN could also be used: airline.arrivals.flow.airport.example.

Thus, the general syntax of a flow URI would be:

flow://flow_name.flow_namespace.domain

This URI has the advantage that it is similar to the "mailto" URI and could be implemented in HTML to refer to flow resources.  Some examples:

*  flow://entrances.building.example.com

*  flow://exits.building.example.com

*  flow://temperature.house.example.org

*  flow://pressure.room1.office.example.org

The flow URI must unequivocally identify a flow resource and provide, by means of DNS resolution mechanisms, all the information required to use the flow.  Among these parameters, at least the following should be resolvable:

*  Event Queue Broker protocol utilized by the flow.  For instance, if Apache Kafka is used, the protocol would be "kafka"; if RabbitMQ is used by the flow, "amqp".  It must also be indicated whether the protocol is protected by TLS.

*  Event Queue Broker FQDN or list of FQDNs that resolve to the IP address of one or a set of the Event Queue Brokers.  For instance, kafka-1.example.com, kafka-2.example.com.

*  Event Queue Broker port used by the Event Queue Brokers.
For instance, in the case of Kafka: 9092, 9093.

*  Event Queue Broker Transport Layer Security: TLS can be implemented, so it is necessary to know whether the connection uses TLS before establishing it.

*  Queue name hosted in the Event Queue Broker, which must be equal to the corresponding flow name.

The general syntax of the Flow URI would be as follows:

flow://flowName.flowCategory.myNameSpace.domain.tld

*  Flow Namespace FQDN: myNameSpace.domain.tld

*  Flow Name: flowName.flowCategory

*  Flow FQDN: flowName.flowCategory.myNameSpace.domain.tld

The following are examples of this URI syntax:

flow://notifications.calendar.people.syndeno.com

*  Flow Namespace FQDN: people.syndeno.com

*  Flow Name: notifications.calendar

*  Flow FQDN: notifications.calendar.people.syndeno.com

flow://created.invoice.finance.syndeno.com

*  Flow Namespace FQDN: finance.syndeno.com

*  Flow Name: created.invoice

*  Flow FQDN: created.invoice.finance.syndeno.com

4.1.2.3.  Flow name resolution

In Figure 4, we can see how a Flow FQDN can be resolved by means of the Flow Name Service.

(Artwork only available as svg: No external link available, see draft-spinella-event-streaming-open-network-02.html for artwork.)

Figure 4: Figure 4

In order to illustrate the Flow Name resolution procedure performed by the FNAA (Flow Namespace Accessing Agent), we can consider the following flow URI:

flow://notifications.calendar.people.syndeno.com

First, the FNAA will perform a query to the DNS resolvers.  These will perform a recursive DNS query to obtain the authoritative name servers for the Flow Namespace: people.syndeno.com.  Thus, the authoritative name servers for syndeno.com will reply with one or more NS Resource Records containing the FQDNs of the authoritative name servers of people.syndeno.com.
Secondly, once these name servers are obtained, the FNAA will perform a PTR query on the Flow FQDN, adding a service discovery prefix.  The response of the PTR query will return another FQDN compliant with SRV DNS Resource Records [RFC2782] and DNS Service Discovery [RFC6763].

In this case, the query for PTR records would be as follows:

~~~
;; QUESTION SECTION:
;notifications.calendar.people.syndeno.com. IN PTR
~~~

The response would be in the following form:

~~~
;; ANSWER SECTION:
notifications.calendar.people.syndeno.com. 21600 IN PTR _flow._tcp.notifications.calendar.people.syndeno.com.
~~~

Using the FQDN returned by this query, an additional query asking for SRV records is made:

~~~
;; QUESTION SECTION:
;_flow._tcp.notifications.calendar.people.syndeno.com. IN SRV

;; ANSWER SECTION:
_flow._tcp.notifications.calendar.people.syndeno.com. 875 IN SRV 30 30 65432 fnaa.syndeno.com.
_flow._tcp.notifications.calendar.people.syndeno.com. 875 IN TXT "tls"
_queue._flow._tcp.notifications.calendar.people.syndeno.com. 875 IN SRV 30 30 9092 kafka.syndeno.com.
_queue._flow._tcp.notifications.calendar.people.syndeno.com. 875 IN TXT "broker-type=kafka tls"
~~~

First, the response informs the network location of the FNAA server; in this case, a connection should be opened to TCP port 65432 of the IP address resulting from resolving fnaa.syndeno.com:

~~~
;; QUESTION SECTION:
;fnaa.syndeno.com. IN A

;; ANSWER SECTION:
fnaa.syndeno.com. 21600 IN A 192.0.2.200
~~~

Secondly, this response offers other relevant information, like the TCP port where the queue service is located (9092).  It also includes a TXT Resource Record that establishes the protocol of the Event Queue Broker, defined in the attribute "broker-type=kafka".
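The discovery steps above can be sketched in code.  The following Python fragment (illustrative only; the helper names are ours and not part of any specification) derives the service-discovery owner name for a Flow FQDN and parses a TXT payload such as "broker-type=kafka tls" into structured attributes:

```python
def flow_service_name(flow_fqdn: str) -> str:
    """Build the DNS-SD style owner name queried for SRV records
    by prefixing the Flow FQDN with the _flow._tcp labels."""
    return "_flow._tcp." + flow_fqdn.rstrip(".") + "."

def parse_txt_attributes(txt: str) -> dict:
    """Split a space-separated TXT payload into key=value pairs;
    bare tokens (e.g. "tls") become boolean flags."""
    attrs = {}
    for token in txt.split():
        if "=" in token:
            key, value = token.split("=", 1)
            attrs[key] = value
        else:
            attrs[token] = True
    return attrs

print(flow_service_name("notifications.calendar.people.syndeno.com"))
print(parse_txt_attributes("broker-type=kafka tls"))
```

An actual FNAA would feed the first helper's output into its SRV/TXT queries and use the second to decide which broker client and transport security to use.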
Now, using the returned FQDN for the queue, kafka.syndeno.com, the resolver can perform an additional query:

~~~
;; QUESTION SECTION:
;kafka.syndeno.com. IN A

;; ANSWER SECTION:
kafka.syndeno.com. 21600 IN A 192.0.2.218
~~~

4.1.3.  Flow Namespace Accessing Agent (FNAA)

The Flow Namespace Accessing Agent is the core component of a Network Participant.  This server application implements the Flow Namespace Accessing Protocol, which allows client connections.

In the diagram of Figure 5 we can see the different methods that the FNAA must support.

(Artwork only available as svg: No external link available, see draft-spinella-event-streaming-open-network-02.html for artwork.)

Figure 5: Figure 5

The clients connecting to an FNAA server can be remote FNAA servers as well as FNUAs.  The rationale is that users of an NP connect to the FNAA by means of an FNUA.  On the other hand, when a user triggers the creation of a new subscription, the FNAA of their NP must connect as a client to a remote FNAA server.

4.1.4.  Flow Processor (FP)

Whenever the creation of a new subscription is triggered and all remote flow connection details are obtained, the FNAA needs to set up a Processor for it.  The communication of the FNAA to and from the FP is by means of an IPC interface.  This means that there can be different implementations of Processors, one of which will be the Subscription Processor.

In the diagram of Figure 6, we can see the initial interface methods that should be implemented in a Flow Processor.

(Artwork only available as svg: No external link available, see draft-spinella-event-streaming-open-network-02.html for artwork.)

Figure 6: Figure 6

Depending on the use of the processor, different data structures should be added to the different methods.
In the case of a Subscription Processor, the minimum information will be the remote and local flow connection details.  Moreover, the interface should also include methods to update the Processor configuration and to destroy it once a subscription is revoked.  Finally, due to the nature of stream communication, there could also be methods available to pause and to resume a Processor.

There can be different types of Processors, which we can see in Figure 7.

(Artwork only available as svg: No external link available, see draft-spinella-event-streaming-open-network-02.html for artwork.)

Figure 7: Figure 7

In Figure 7, we can see that there are different types of Flow Processors:

*  Bridge Processor: consumes events from a Flow located in an Event Broker (e.g., Apache Kafka) and transcribes them to a single Flow (local or remote).

*  Collector Processor: consumes events from N Flows located in an Event Broker and transcribes the aggregate to a single Flow (local or remote).

*  Distributor Processor: consumes events from a single Flow and transcribes or broadcasts them to N Flows (local or remote).

*  Signal Processor: consumes events from N Flows and produces new events to N Flows (local or remote).

To implement the previously described Subscription Processor, we can utilize some form of the Bridge Processor.  Although we are initially considering the basic use case of subscription, it must be possible for the network to extend the supported processor types.  In any case, the different FNAA servers involved must be aware of the supported processor types, with the goal of informing users of the capabilities available in the FNAA server.  For instance, the fact that an FNAA supports the Bridge Processor should enable the subscription commands in that FNAA, allowing users to create subscriptions using the Bridge Processor.
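To make the interface concrete, the following Python sketch (illustrative; the class and method names are assumptions drawn from the description above, not a normative definition) models the Flow Processor interface and a minimal Bridge Processor:

```python
from abc import ABC, abstractmethod

class FlowProcessor(ABC):
    """Illustrative IPC-facing interface for a Flow Processor;
    method names are assumptions based on the description above."""

    @abstractmethod
    def create(self, source_flows: list, destination_flows: list) -> None:
        """Set up the processor with remote/local flow connection details."""

    @abstractmethod
    def update(self, config: dict) -> None:
        """Update the processor configuration."""

    @abstractmethod
    def pause(self) -> None:
        """Temporarily stop transcribing events."""

    @abstractmethod
    def resume(self) -> None:
        """Resume transcribing events."""

    @abstractmethod
    def destroy(self) -> None:
        """Tear the processor down once a subscription is revoked."""

class BridgeProcessor(FlowProcessor):
    """Bridge: consumes events from one source flow and transcribes
    them to a single destination flow (local or remote)."""

    def __init__(self):
        self.running = False
        self.source = None
        self.destination = None

    def create(self, source_flows, destination_flows):
        # A bridge connects exactly one source to one destination.
        assert len(source_flows) == 1 and len(destination_flows) == 1
        self.source = source_flows[0]
        self.destination = destination_flows[0]
        self.running = True

    def update(self, config):
        self.destination = config.get("destination", self.destination)

    def pause(self):
        self.running = False

    def resume(self):
        self.running = True

    def destroy(self):
        self.running = False
        self.source = self.destination = None
```

A Collector, Distributor or Signal Processor would implement the same interface but relax the one-source/one-destination constraint in create().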
In summary, the IPC interface should support all the possible processors that the network may need, although we are initially considering the subscription use case.

4.1.5.  Flow Namespace User Agent (FNUA)

The FNUA is an application analogous to email clients such as Microsoft Outlook or Gmail.  These applications implement different network protocols to access mailboxes, such as IMAP and/or POP3.  In the case of the FNUA, the protocol implemented is the FNAP (Flow Namespace Accessing Protocol).

The FNUA is an application that acts as a client of the FNAA server.  Only users that possess accounts in a Network Participant should be able to log in to the FNAA to manage Flow Namespaces.  The FNUA could be any kind of user application: a web application, desktop application, mobile application or even a CLI tool.

In the diagram of Figure 8 we can see the actions that the user can request from the FNUA.

(Artwork only available as svg: No external link available, see draft-spinella-event-streaming-open-network-02.html for artwork.)

Figure 8: Figure 8

The main goal of the FNUA is to provide the user with access to Flow Namespaces and the flows hosted in them.  A user may have many Flow Namespaces and many Flows in each of them.  By means of the FNUA, the user can manage the Flow Namespaces and the Flows in them.  Also, the FNUA will provide the capabilities required to subscribe to external Flows, whether local to the FNAA, local to the NP or remote (in a different NP's FNAA server).

4.2.  Communications Examples

In this section, two usage examples of Network Participant communications are provided.  We call the first one unidirectional, since one NP subscribes to a remote Flow of a different NP.  We call the second one bidirectional, since the two NPs have mutual subscriptions.

4.2.1.
Unidirectional Subscription

In the diagram of Figure 9, we can see an integration between two NPs.  In this case, there is a FlowA hosted in the Orange NP to which the FlowB in the Blue NP is subscribed.  Both FlowA and FlowB have a queue hosted in their Flow Events Broker, which could be an Apache Kafka instance, for example.  However, it must be possible to employ any Flow Events Broker of the NP's choice.

(Artwork only available as svg: No external link available, see draft-spinella-event-streaming-open-network-02.html for artwork.)

Figure 9: Figure 9

The steps followed to set up a subscription to a remote flow are:

1.  A user of the Blue NP creates a new subscription to the remote FlowA by means of the Flow Namespace User Agent (FNUA).

2.  The FNUA connects to the Flow Namespace Accessing Agent (FNAA) of the Blue NP to communicate the user request.

3.  The FNAA in the Blue NP discovers the remote FNAA to which it must connect to obtain the flow connection parameters.  First, it needs to authenticate and, if allowed, the connection parameters will be returned.

4.  Once the FNAA in the Blue NP has all the necessary information, it will set up a new Processor that connects the flow in the Orange NP to a flow in the Blue NP.

5.  Once the subscription is brought up, every time a Producer in the Orange NP writes an event to FlowA, the Flow Processor will receive it, since it is subscribed to it.  Then, the Flow Processor will write that event to FlowB in the Blue NP.

6.  From now on, every Consumer connected to FlowB will receive the events published on FlowA.

In case the user owning FlowA in the Orange NP wishes to revoke the access, it must be possible to do so by revoking security credentials against the Identity & Access Manager of the Orange NP.

4.2.2.
Bidirectional Subscription

In Figure 10 we can see an example of all the components needed to set up a flow integration between two different NPs.  In this case, there are two flows being connected:

*  FlowA of the Orange NP with FlowB of the Blue NP

*  FlowC of the Blue NP with FlowD of the Orange NP

(Artwork only available as svg: No external link available, see draft-spinella-event-streaming-open-network-02.html for artwork.)

Figure 10: Figure 10

Each Flow has its corresponding queue hosted in the NP's Flow Events Broker.  Also, there is one Flow Processor for each connection, meaning that these components are in charge of reading new events on the source flows and writing them to the destination flows as soon as they are received.

Also, we can see that, in order to connect FlowB to FlowA, a connection from the Blue NP's FNAA has been initiated against the Orange NP's FNAA.  This connection uses the FNAP to interchange the flow connection details.  Analogously, the FNAA connection to set up the integration of FlowC with FlowD has been initiated by the Orange NP's FNAA.

After the flow connection details are obtained, the different Flow Processors are set up to consume and produce events from and to the corresponding queue in the different NPs.

Once the two processors are initialized, all the events produced to FlowA in the Orange NP will be forwarded to FlowB in the Blue NP, and all the events produced to FlowC in the Blue NP will be forwarded to FlowD in the Orange NP.

5.  Event Streaming Open Network Protocol

The protocol to be used in an Event Streaming Open Network is a key component of the overall architecture and design.  This chapter is dedicated to thoroughly describing this protocol.

5.1.
Protocol definition methodology

It is now necessary to specify the protocol needed for the Flow Namespace Accessing Agent (FNAA), which we have named the Flow Namespace Accessing Protocol (FNAP).  In the diagram of Figure 11 we can see how an FNAA client connects to an FNAA server by means of the FNAP.

(Artwork only available as svg: No external link available, see draft-spinella-event-streaming-open-network-02.html for artwork.)

Figure 11: Figure 11

In order to define a finite-state machine for the protocol and the different stimuli that cause a change of state, the model presented by M. Wild (Wild, 2013) in her paper "Guided Merging of Sequence Diagrams" will be employed.  This model is beneficial since it provides an integrated method for both client and server, maintaining the relationship between the stimuli that trigger a change of state in each component.

(Artwork only available as svg: No external link available, see draft-spinella-event-streaming-open-network-02.html for artwork.)

Figure 12: Figure 12

In Figure 12 we have the method proposed by Wild applied to SMTP, in which boxes represent states and arrows represent transitions.  Each transition has a label composed of the originating stimulus that triggers the transition and a subsequent stimulus effect triggered by the transition itself.  For instance, when a client connects to an SMTP server, the client goes from the "idle" state to the "conPend" state.  The label of this transition includes "uCon" as the stimulus triggering the transition, which in turn triggers the effect "sCon".  Then, on the diagram for the server, we can see that "sCon" triggers the transition from the "waiting" state to the "accepting" state in the server.

This method will be used to define the states and transitions of the Flow Namespace Accessing Protocol for both client and server.

5.2.
Flow Namespace Accessing Protocol (FNAP) 1237 Using the model proposed by Wild described previously, we define the 1238 finite-state machine for the FNAA Server, which we can see in 1239 Figure 13.

1241 (Artwork only available as svg: No external link available, see 1242 draft-spinella-event-streaming-open-network-02.html for artwork.)

1244 Figure 13: Figure 13

1246 The model on the right side of Figure 13 shows that the FNAA server 1247 starts in a "waiting" state, which basically means that the server 1248 has successfully set up the networking requirements to accept client 1249 connections.  Then, when a client connects, a transition is made to 1250 the "accepting" state, in which the authentication procedure is performed 1251 internally.  If the authentication is successful, a transition is made 1252 to the "ready" state, meaning that the client can now execute commands on 1253 the FNAA server.

1255 For each command that the client executes, a transition is made to 1256 the "cmdRecvd" state.  Then, a response is returned to the client, 1257 transitioning back to the "ready" state.  When the client executes the 1258 "Quit" command, a transition is made to the "waiting" state and the 1259 server must free all networking resources used for the now closed 1260 connection.

1262 On the left side of Figure 13, we also have the client state machine 1263 with its corresponding transitions.  The client triggers a connection 1264 to the server and, once it is established, authentication is needed.  1265 Once the authentication is correctly done, the client can start 1266 issuing commands to the server.  For each command executed by the 1267 client, a transition is made to the "cmdPend" state, until a response is 1268 returned by the server.

1270 Eventually, a "Quit" command will be executed by the client and the 1271 connection will be closed.

1273 5.3.  Implementation

1275 In this section, we provide an approach for the overall 1276 implementation of the proposed Event Streaming Open Network.
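Before turning to concrete components, the FNAP server state machine described above can be captured as a small transition function.  The following Go sketch is illustrative only: the state and stimulus names follow Figure 13, while the Go identifiers themselves are our own, not part of the specification.

```go
package main

import "fmt"

// State and Stimulus model the FNAA server finite-state machine of
// Figure 13.  Names follow the figure; this is an illustrative sketch.
type State string
type Stimulus string

const (
	Waiting   State = "waiting"
	Accepting State = "accepting"
	Ready     State = "ready"
	CmdRecvd  State = "cmdRecvd"
)

const (
	ClientConnect Stimulus = "connect"
	AuthOK        Stimulus = "authOK"
	AuthFail      Stimulus = "authFail"
	Command       Stimulus = "command"
	Response      Stimulus = "response"
	Quit          Stimulus = "quit"
)

// Next returns the server state after a stimulus; a non-nil error
// flags a transition that the state machine does not allow.
func Next(s State, st Stimulus) (State, error) {
	switch {
	case s == Waiting && st == ClientConnect:
		return Accepting, nil
	case s == Accepting && st == AuthOK:
		return Ready, nil
	case s == Accepting && st == AuthFail:
		return Waiting, nil
	case s == Ready && st == Command:
		return CmdRecvd, nil
	case s == CmdRecvd && st == Response:
		return Ready, nil
	case s == Ready && st == Quit:
		return Waiting, nil
	}
	return s, fmt.Errorf("illegal transition from %s on %s", s, st)
}

func main() {
	// Walk one full session: connect, authenticate, one command, quit.
	s := Waiting
	for _, st := range []Stimulus{ClientConnect, AuthOK, Command, Response, Quit} {
		s, _ = Next(s, st)
		fmt.Println(s)
	}
}
```

Encoding the transitions as data (or as a function such as this) makes illegal protocol interleavings detectable at a single point in a server implementation.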
1277 Considering the components defined previously for the architecture, 1278 we will define which existing tools can be leveraged and those that 1279 require development.

1281 5.3.1.  Objectives

1283 The objective of this implementation is to provide specifications for 1284 an initial implementation of the overall architecture for the Event 1285 Streaming Open Network.  Wherever possible, existing tools 1286 should be leveraged.  For those components that need development, a 1287 thorough specification is to be provided.

1289 5.3.1.1.  Implementation overview

1291 In Figure 14, we have a diagram of the overall implementation 1292 proposal.  The components that have the Kubernetes Deployment icon 1293 are the ones to be managed by the FNAA server instance.  Then, we 1294 have a Kafka Cluster that provides a Topic instance for each flow.  1295 Finally, the DNS Infrastructure is leveraged.

1297 (Artwork only available as svg: No external link available, see 1298 draft-spinella-event-streaming-open-network-02.html for artwork.)

1300 Figure 14: Figure 14

1302 5.4.  Existing components

1304 In this section, we describe the existing software components that 1305 can be leveraged for implementation.

1307 5.4.1.  Flow Events Broker (FEB)

1309 Since there are currently many implementations of this component, it 1310 is necessary to develop the needed integrations between the other components 1311 of the architecture and the main market leaders.  Thus, we will 1312 consider the following Flow Events Brokers for the implementation: 1313 Apache Kafka, AWS SQS and Google Cloud Pub/Sub.

1315 In summary, this component does not need to be developed from 1316 scratch.  However, the FNAA will need to be able to communicate with 1317 the different Flow Events Brokers, meaning that it must implement 1318 their APIs as a client.  1320 5.4.2.
Flow Name Service (FN)

1322 This component can be completely implemented by leveraging the ISC 1323 BIND 9 software component, which is the de facto leader among DNS 1324 servers.  A given NP will need to deploy a BIND 9 Nameserver and 1325 enable both DNSSEC and DNS Dynamic Update.

1327 The impact of adopting BIND 9 for the implementation is that the 1328 FNAA component needs to be able to use a remote DNS Server to manage 1329 Flow URI registration and deregistration, and to execute recursive DNS 1330 resolution.

1332 5.4.3.  Components to be developed

1334 In this section, we describe a set of tools that require development.  1335 These components, especially the FNAA, are the core components of 1336 every Network Participant.  Moreover, these are the components that 1337 implement the network protocol FNAP.

1339 Since these are the core components of the network, they are the 1340 natural candidates for validation.  In the next chapter, we will show 1341 the feasibility of the core network components in the form of a Proof 1342 of Concept.

1344 5.4.3.1.  Flow Namespace Accessing Agent (FNAA)

1346 The Flow Namespace Accessing Agent is a server component that 1347 triggers the creation of child processes that implement the different 1348 Flow Processors.  This means that the instance running the FNAA will 1349 bring up new processes for each processor.  One way of implementing 1350 this functionality can be a parent process that creates new child 1351 processes for each processor.  However, this would imply the need 1352 to create and manage different threads in a single FNAA instance.

1354 The problem with the approach of a parent process and child processes 1355 for the FNAA lies at the infrastructure level.  The more Processors an 1356 FNAA needs to manage, the more compute resources the FNAA would need.
1357 In the current cloud infrastructure context, this is a problem because 1358 it means that additional compute resources must be assigned to the 1359 FNAA, depending on the quantity of Processors and the required 1360 resources for each of them.  In summary, this approach would be 1361 vertically scalable but not horizontally scalable.

1363 To avoid this scalability issue, the approach we propose is to 1364 implement a Cloud Native application.  By leveraging 1365 Kubernetes, it is possible to trigger the creation of Deployments, 1366 which are composed of Pods.  Each Pod can contain a given quantity of 1367 containers, which are processes running in a GNU/Linux Operating 1368 System.  In this way, we can dedicate a Pod to run the FNAA server 1369 and different Pods to run the Processors.  This approach provides 1370 convenient process isolation and enables both horizontal and vertical 1371 scalability.

1373 Moreover, the way in which the FNAA would bring up and manage 1374 Processor instances would be through an integration with the 1375 underlying Kubernetes instance, by means of the Kubernetes API.  The 1376 result is a Cloud Native application that leverages the power and 1377 flexibility of Kubernetes to manage the Processor instances.

1379 On the other hand, the programming language for the FNAA must also be 1380 defined.  For this, we consider that it must be possible to implement 1381 the FNAA and the Flow Processors in different programming languages.  1382 For the FNAA it is recommended to employ Golang, since the Kubernetes CLI 1383 tool is implemented in this language and there are several libraries 1384 available for integration.  As for the Flow Processors, it must be 1385 possible to use any programming language as long as the IPC interface 1386 is correctly implemented.

1388 Regarding the IPC interface for the communications between the FNAA 1389 and the Flow Processors, the recommendation is to employ gRPC 1390 together with Protobuf.
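As an illustration, a minimal Protobuf definition for this IPC interface could look as follows.  Every package, service and message name here is hypothetical; the draft does not yet define this interface.

```protobuf
syntax = "proto3";

package fnaa.processor.v1;

// Hypothetical control interface between the FNAA and a Flow Processor.
service FlowProcessor {
  // Start instructs the processor to begin copying events
  // from the source flow to the destination flow.
  rpc Start (StartRequest) returns (StartResponse);
  // Stop tears the processor down and releases its resources.
  rpc Stop (StopRequest) returns (StopResponse);
}

message StartRequest {
  string source_flow_uri = 1;      // e.g. "time.flows.unix.ar"
  string destination_flow_uri = 2; // e.g. "ksdj898.time.flows.unix.ar"
}

message StartResponse {
  bool started = 1;
}

message StopRequest {}

message StopResponse {
  bool stopped = 1;
}
```

From such a definition, gRPC stubs can be generated for Golang and for whichever language a given Flow Processor is written in, which is what makes the "any language, same IPC interface" requirement practical.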
The rationale for choosing this 1391 technology is the fact that gRPC enables binary communications, which 1392 are the desired type of communication for systems integration.  Then, 1393 both the FNAA and the Flow Processors must share this Protobuf 1394 interface definition and implement it accordingly through gRPC.

1396 Finally, the FNAA must implement the protocol we have named FNAP, 1397 which provides the main set of functionalities for the Event 1398 Streaming Open Network.  The implementation of FNAP must be stateful, 1399 in the sense that it is connection-based.  Additionally, the 1400 implementation must be text-based, with the goal that humans can 1401 interact with FNAA servers in the same way they can with 1402 SMTP servers.  The transport protocol must be TCP, with no special 1403 definition for a port number, since the port can be 1404 discovered by means of DNS SRV Resource Records.

1406 Regarding security for the FNAA servers, TLS must be supported.  This 1407 means that any client can start a TLS handshake with the FNAA servers 1408 before issuing any command.

1410 In conclusion, the implementation of the FNAA over Kubernetes 1411 provides the flexibility and set of capabilities required for 1412 this component.  It is recommended to implement the FNAA in Golang 1413 and to enable the implementation of Flow Processors in any programming 1414 language, as long as the Protobuf interface is correctly implemented.  1415 Finally, the FNAA must implement the protocol FNAP in a connection- 1416 based and text-based manner.

1418 5.4.3.2.  Flow Namespace User Agent (FNUA)

1420 The Flow Namespace User Agent (FNUA) can have different 1421 implementations as long as they comply with the protocol FNAP.

1423 We propose the initial availability of a CLI tool that acts as a Flow 1424 Namespace User Agent.  This CLI tool must provide a client 1425 implementation of all the functionalities available in the FNAA 1426 server.
Among the functionalities that must be implemented, we 1427 can mention:

   *  Discover the FNAA server for a given Flow URI.
   *  1428 Connect to the FNAA server to manage Flow Namespaces and Flows, as 1429 exemplified in Figure 8.

1431 Additionally, the FNUA should be able to discover the Authoritative 1432 FNAA server for a given Flow Namespace.  This discovery shall be 1433 performed by leveraging the DNS-SD specification.  Refer to Annex 1434 D to review the discovery process.

1436 Regarding the implementation of the CLI tool, it is recommended to 1437 employ Golang together with Cobra, a library specialized in creating 1438 CLI tools.  In Figure 15 we have a diagram that shows the different 1439 functionalities that the CLI tool should implement.

1441 (Artwork only available as svg: No external link available, see 1442 draft-spinella-event-streaming-open-network-02.html for artwork.)

1444 Figure 15: Figure 15

1446 6.  Proof of Concept

1448 In this section, we will focus on providing a minimum implementation 1449 of the main Event Streaming Open Network component: the Flow 1450 Namespace Accessing Agent.  This implementation should serve as a 1451 Proof of Concept of the overall Event Streaming Open Network 1452 proposal.

1454 As described in the previous section, the Flow Namespace Accessing 1455 Agent (FNAA) is the main and core required component for the Open 1456 Network.  All Network Participants must deploy an FNAA server 1457 instance in order to be part of the network.  The FNAA 1458 implements a server-like application for the Flow Namespace Accessing 1459 Protocol (FNAP).  Thus, the first objective of this Proof of Concept 1460 is to show an initial implementation of the FNAA server component.

1462 On the other hand, the FNAA is accessed by means of a Flow Namespace 1463 User Agent (FNUA).  This component acts as a client application that 1464 connects to an FNAA.
Also, this component can take different forms: 1465 it could be a web-based application, a desktop application or even a 1466 command line tool.  For the purposes of this Proof of Concept, we 1467 will implement a CLI tool that acts as a client application for the 1468 FNAA.  Thus, the second objective of this PoC is to provide an 1469 initial implementation of the FNUA client component.

1471 In the following sections, we will first describe the minimum 1472 functionalities considered for validating the overall proposal for 1473 the Event Streaming Open Network.  This minimum set of requirements 1474 for both the FNAA and the FNUA will compose the Proof of Concept.

1476 Afterwards, we will describe the technology chosen for the initial 1477 implementation of both the FNAA and the FNUA.  Then, a description of 1478 how these tools work in isolation will be provided.  Subsequently, we 1479 will review different use cases to prove how the network could be 1480 used by network participants and its users.

1482 Lastly, we will provide a conclusion for this Proof of Concept, where 1483 we discuss whether and how the minimum established requirements have 1484 been met.

1486 6.1.  Minimum functionalities

1488 Network Participants' system administrators must be able to run a 1489 Server Application that acts as an FNAA.

1491 Users using a Client Application acting as an FNUA must be able to:

   1492 1.  Access the flow account and operate its flows.
   2.  Create a new 1493 flow.
   3.  Describe an existing flow.
   4.  Subscribe to an external 1494 flow.

1496 6.2.  FNAA - Server application

1498 The FNAA server application must implement FNAP as described in 1499 Section 5.  Basically, the FNAA will open a TCP port on all the IP 1500 addresses of the host to listen for new FNUA client connections.

1502 The chosen language for the development of the FNAA is Golang.
The 1503 reason for choosing Golang is that Kubernetes is written in this 1504 language and there is a robust set of libraries available for 1505 integration.  Although there is no integration built with Kubernetes 1506 for this Proof of Concept, the usage of Golang will enable a seamless 1507 evolution of the FNAA application.  In future versions of the FNAA 1508 codebase, new functionalities leveraging Kubernetes will be easier to 1509 implement than if using a different programming language.

1511 When the FNAA server application is initialized, it provides debug 1512 log messages describing all client interactions.  In order to start 1513 the server application, a Network Participant system administrator 1514 can download the binary and execute it in a terminal:

1516 ignatius ~ 0$./fnaad
1517 server.go:146: Listen on [::]:61000
1518 server.go:148: Accept a connection request.

1520 Now that TCP port 61000 is open, we can test the behaviour by 1521 means of a raw TCP connection using the telnet command in a different terminal:

1522    ignatius ~ 1$telnet localhost 61000
       Trying 127.0.0.1...
       Connected to 1523 localhost.
       Escape character is '^]'.
       220 fnaa.unix.ar FNAA

We 1524 can now see that the server has provided the first message in the 1525 connection: a welcome message indicating its FQDN fnaa.unix.ar.

1527 On the other hand, the server application starts providing debug 1528 information for the new connection established:

1530 ignatius ~ 0$./fnaad
1531 server.go:146: Listen on [::]:61000
1532 server.go:148: Accept a connection request.
1533 server.go:154: Handle incoming messages.
1534 server.go:148: Accept a connection request.

1536 6.3.  FNUA - Client application

1538 In order to test the FNAA server application, a CLI-based FNUA 1539 application has been developed.  The chosen language for this CLI tool is also Golang.
The reason for choosing Golang for the FNUA is 1541 its functionality for building CLI tools, leveraging 1542 the Cobra library.  Thus, the FNUA for the PoC is an executable file 1543 that complies with the diagram in Figure 15.

1545 One of the requirements for the flow CLI tool is a configuration file 1546 that defines the different FNAA servers together with the credentials 1547 to use.  An example of this configuration file follows:

1549 ignatius ~/ 0$cat .flow.yml
1550 agents:
1551   -
1552     name: fnaa-unix
1553     fqdn: fnaa.unix.ar
1554     username: test
1555     password: test
1556     prefix: unix.ar-
1557   -
1558     name: fnaa-emiliano
1559     fqdn: fnaa.emiliano.ar
1560     username: test
1561     password: test
1562     prefix: emiliano.ar-
1564 namespaces:
1565   -
1566     name: flows.unix.ar
1567     agent: fnaa-unix
1568   -
1569     name: flows.emiliano.ar
1570     agent: fnaa-emiliano

1572 In this file, we can see that there are two FNAA instances described, 1573 with FQDNs fnaa.unix.ar and fnaa.emiliano.ar.  Then, there are two 1574 namespaces: one called flows.unix.ar hosted on fnaa-unix and a second 1575 namespace flows.emiliano.ar hosted on fnaa-emiliano.  This 1576 configuration enables the FNUA to interact with two different FNAAs, 1577 each of which is hosting different Flow Namespaces.

1579 Once the configuration file has been saved, the flow CLI tool can 1580 be used.  In the following sections, we will show how to use the 1581 minimum functionalities required for the Open Network using this CLI 1582 tool.

1584 6.4.  Use cases

1586 Use case 1: Authenticating a user

After the connection is 1587 established, the first command that the client must execute is the 1588 authentication command.  As previously defined in Chapter 5, every 1589 FNAA client must first authenticate in order to execute commands.  1590 Thus, the authentication challenge must be supported both by the FNAA 1591 as well as the FNUA.
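In the PoC this challenge is met with a simple mechanism, SASL PLAIN, described next.  As an illustrative sketch (the function name is ours, not part of the FNAP specification), the initial response a client must send can be computed as follows:

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// plainInitialResponse builds the Base64-encoded SASL PLAIN initial
// response.  The wire format is: [authzid] NUL authcid NUL passwd;
// the authorization identity is left empty here, so the raw message
// starts with a NUL byte.
func plainInitialResponse(username, password string) string {
	raw := "\x00" + username + "\x00" + password
	return base64.StdEncoding.EncodeToString([]byte(raw))
}

func main() {
	// For user "test" with password "test" this yields the string
	// used throughout the PoC examples.
	fmt.Println(plainInitialResponse("test", "test"))
	// Prints: AHRlc3QAdGVzdA==
}
```

This reproduces the same value as the `echo -en "\0test\0test" | base64` shell invocation shown in the next subsection.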
1593 It is worth mentioning that the chosen authentication mechanism for 1594 this PoC is SASL PLAIN.  This command can be further extended with 1595 other mechanisms in later versions.  However, this simple 1596 authentication mechanism is sufficient to demonstrate the 1597 authentication step in the FNAP.

1599 SASL PLAIN authentication implies sending the username and the 1600 password encoded in Base64.  For a user test with password test, the Base64 string 1601 is obtained as follows:

1603 ignatius ~ 0$echo -en "\0test\0test" | base64
1604 AHRlc3QAdGVzdA==

1606 Now, we can use this Base64 string to authenticate with the FNAA.  1607 First, we need to launch the FNAA server instance:

1609 ignatius~/ $./fnaad --config ./fnaad_flow.unix.ar.yaml
1610 main.go:41: Using config file: ./fnaad_flow.unix.ar.yaml
1611 main.go:57: Using config file: ./fnaad_flow.unix.ar.yaml
1612 server.go:103: Listen on [::]:61000
1613 server.go:105: Accept a connection request.

1615 Then, we can connect to the TCP port on which the FNAA is listening:

1617 ignatius ~ 1$telnet localhost 61000
1618 Trying 127.0.0.1...
1619 Connected to localhost.
1620 Escape character is '^]'.
1621 220 fnaa.unix.ar FNAA
1622 AUTHENTICATE PLAIN
1623 220 OK
1624 AHRlc3QAdGVzdA==
1625 220 Authenticated

1627 Once the client is authenticated, it can start executing FNAP 1628 commands to manage the Flow Namespace of the authenticated user.  For 1629 simplicity purposes, in this Proof of Concept we will be using a 1630 single user.

1632 In the case of the CLI tool, there is no need to perform an explicit 1633 authentication step, since every command the user executes will be 1634 preceded by an authentication against the server.

1636 6.4.1.  Use case 2: Creating a flow

1638 Once the authentication is successful, the client can create a 1639 new Flow.  The way to do this using the CLI tool would be:

1641 ignatius ~/ 0$./fnua create flow time.flow.unix.ar
1642 Resolving SRV for fnaa.unix.ar.
using server 172.17.0.2:53
1643 Executing query fnaa.unix.ar. IN 33 using server 172.17.0.2:53
1644 Executing successful: [fnaa.unix.ar. 604800 IN SRV 0 0 61000 fnaa.unix.ar.]
1645 Resolving A for fnaa.unix.ar. using server 172.17.0.2:53
1646 Executing query fnaa.unix.ar. IN 1 using server 172.17.0.2:53
1647 Executing successful: [fnaa.unix.ar. 604800 IN A 127.0.0.1]
1648 Resolved A to 127.0.0.1 for fnaa.unix.ar. using server 172.17.0.2:53
1649 C: Connecting to 127.0.0.1:61000
1650 C: Got a response: 220 fnaa.unix.ar FNAA
1651 C: Sending command AUTHENTICATE PLAIN
1652 C: Wrote (20 bytes written)
1653 C: Got a response: 220 OK
1654 C: Authentication string sent: AHRlc3QAdGVzdA==
1655 C: Wrote (18 bytes written)
1656 C: Got a response: 220 Authenticated
1657 C: Sending command CREATE FLOW time.flow.unix.ar
1658 C: Wrote (31 bytes written)
1659 C: Server sent OK for command CREATE FLOW time.flow.unix.ar
1660 C: Sending command QUIT
1661 C: Wrote (6 bytes written)

1663 The client has discovered the FNAA server for Flow Namespace 1664 flow.unix.ar by means of SRV DNS records.  Thus, it obtained the 1665 FQDN of the FNAA together with the TCP port where it is listening, in 1666 this case 61000.  Once the resolution process ends, the FNUA connects 1667 to the FNAA.  First, the FNUA authenticates with the FNAA and then it 1668 executes the create flow command.

1670 If we were to simulate the same behavior using a raw TCP connection, 1671 the following steps would be executed:

   ignatius ~ 1$telnet 1672 localhost 61000
   Trying 127.0.0.1...
   Connected to localhost.
   Escape 1673 character is '^]'.
   220 fnaa.unix.ar FNAA
   AUTHENTICATE PLAIN
   220 OK
   1674 AHRlc3QAdGVzdA==
   220 Authenticated
   CREATE FLOW time.flows.unix.ar
   220 1675 OK time.flows.unix.ar

1676 Now, the client has created a new flow called time.flows.unix.ar 1677 located in the flows.unix.ar namespace.
The FNAA, in the background, has 1678 created a Kafka Topic as well as the necessary DNS entries for name 1679 resolution.

1681 6.4.2.  Use case 3: Describing a flow

1683 Once a flow has been created, we can obtain information about it by 1684 executing the following command using the CLI tool:

1686 ignatius ~/ 1$./fnua describe flow time.flow.unix.ar
1687 Resolving SRV for fnaa.unix.ar. using server 172.17.0.2:53
1688 Executing query fnaa.unix.ar. IN 33 using server 172.17.0.2:53
1689 Executing successful: [fnaa.unix.ar. 604800 IN SRV 0 0 61000 fnaa.unix.ar.]
1690 Nameserver to be used: 172.17.0.2
1691 Resolving A for fnaa.unix.ar. using server 172.17.0.2:53
1692 Executing query fnaa.unix.ar. IN 1 using server 172.17.0.2:53
1693 Executing successful: [fnaa.unix.ar. 604800 IN A 127.0.0.1]
1694 Resolved A to 127.0.0.1 for fnaa.unix.ar. using server 172.17.0.2:53
1695 C: Connecting to 127.0.0.1:61000
1696 C: Got a response: 220 fnaa.unix.ar FNAA
1697 C: Sending command AUTHENTICATE PLAIN
1698 C: Wrote (20 bytes written)
1699 C: Got a response: 220 OK
1700 C: Authentication string sent: AHRlc3QAdGVzdA==
1701 C: Wrote (18 bytes written)
1702 C: Got a response: 220 Authenticated
1703 C: Sending command DESCRIBE FLOW time.flow.unix.ar
1704 C: Wrote (33 bytes written)
1705 C: Server sent OK for command DESCRIBE FLOW time.flow.unix.ar
1706 Flow time.flow.unix.ar description:
1707 flow=time.flow.unix.ar
1708 type=kafka
1709 topic=time.flow.unix.ar
1710 server=kf1.unix.ar:9092
1711 Flow time.flow.unix.ar described successfully
1712 Quitting
1713 C: Sending command QUIT
1714 C: Wrote (6 bytes written)

1716 In the output of the describe command we can see all the necessary 1717 information to connect to the Flow called time.flow.unix.ar: (i) the 1718 type of Event Broker is Kafka, (ii) the Kafka topic has the same name 1719 as the flow and (iii) the Kafka Bootstrap server with its port is 1720 provided.
If we were to obtain this information using a manual 1721 connection, the steps would be:

1723 ignatius ~ 1$telnet localhost 61000
1724 Trying 127.0.0.1...
1725 Connected to localhost.
1726 Escape character is '^]'.
1727 220 fnaa.unix.ar FNAA
1728 AUTHENTICATE PLAIN
1729 220 OK
1730 AHRlc3QAdGVzdA==
1731 220 Authenticated
1732 DESCRIBE FLOW time.flows.unix.ar
1733 220 DATA
1734 flow=time.flows.unix.ar
1735 type=kafka
1736 topic=time.flows.unix.ar
1737 server=kf1.unix.ar:9092
1738 220 OK

1740 Now, we can use this information to connect to the Kafka topic and 1741 start producing or consuming events.

1743 6.4.3.  Use case 4: Subscribing to a remote flow

1745 In this section, we will show how a subscription can be set up.  When 1746 a user commands the FNAA to create a new subscription to a remote 1747 Flow, the local FNAA server first needs to discover the remote FNAA 1748 server.  Once the server is discovered by means of DNS resolution, 1749 the local FNAA contacts the remote FNAA, authenticates the user and 1750 then executes a subscription command.

1752 Thus, the initial communication between the FNUA and the FNAA, in 1753 which the user indicates to subscribe to a remote flow, would be as 1754 follows:

   ignatius ~ 1$telnet localhost 61000
   Trying 127.0.0.1...
   1755 Connected to localhost.
   Escape character is '^]'.
   220 fnaa.unix.ar 1756 FNAA
   AUTHENTICATE PLAIN
   220 OK
   AHRlc3QAdGVzdA==
   220 Authenticated
   1757 SUBSCRIBE time.flows.unix.ar LOCAL emiliano.ar-time.flows.unix.ar
   220 1758 DATA ksdj898.time.flows.unix.ar
   220 OK

1760 Once the user is authenticated, a SUBSCRIBE command is executed.  1761 This command indicates first the remote flow to subscribe to.  Then, 1762 it also specifies with LOCAL the flow where the remote events will be 1763 written.  In this example, the remote flow to subscribe to is 1764 time.flows.unix.ar, and the local flow is emiliano.ar- 1765 time.flows.unix.ar.
Basically, a new flow has been created, 1766 emiliano.ar-time.flows.unix.ar, where all the events of flow 1767 time.flows.unix.ar will be written.

1769 The server answers back with a new Flow URI, in this case 1770 ksdj898.time.flows.unix.ar.  This Flow URI indicates a copy of the 1771 original flow time.flows.unix.ar created for this subscription.  1772 Thus, the remote FNAA has full control over this subscription, being 1773 able to revoke it by simply deleting this flow, or to apply Quality of 1774 Service rules.

1776 The remote FNAA has set up a Bridge Processor to copy messages 1777 in topic time.flows.unix.ar to the new topic 1778 ksdj898.time.flows.unix.ar.  An alternative to a Bridge 1779 Processor would be a Distributor Processor, which could be optimized 1780 for a Flow with high demand.  In that case, instead of creating one 1781 Bridge Processor per subscription, a single Distributor Processor could be 1782 used, in order to have a single consumer of the source flow and write 1783 the events to several subscription flows.

1785 The user could use the FNUA CLI tool to execute this command in the 1786 following manner:

1788 ignatius ~ 0$./fnua --config=./flow.yml subscribe time.flows.unix.ar --nameserver 172.17.0.2 -d --agent fnaa-emiliano
1789 Initializing initConfig
1790 Using config file: ./flow.yml
1791 Subscribe to flow
1792 Agent selected: fnaa-emiliano
1793 Resolving FNAA FQDN fnaa.emiliano.ar
1794 Starting FQDN resolution with 172.17.0.2
1795 Resolving SRV for fnaa.emiliano.ar. using server 172.17.0.2:53
1796 Executing query fnaa.emiliano.ar. IN 33 using server 172.17.0.2:53
1797 FNAA FQDN Resolved to fnaa.emiliano.ar. port 51000
1798 Resolving A for fnaa.emiliano.ar. using server 172.17.0.2:53
1799 Resolved A to 127.0.0.1 for fnaa.emiliano.ar.
using server 172.17.0.2:53
1800 C: Connecting to 127.0.0.1:51000
1801 C: Got a response: 220 fnaa.unix.ar FNAA
1802 Connected to FNAA
1803 Authenticating with PLAIN mechanism
1804 C: Sending command AUTHENTICATE PLAIN
1805 C: Wrote (20 bytes written)
1806 C: Got a response: 220 OK
1807 C: Authentication string sent: AHRlc3QAdGVzdA==
1808 C: Wrote (18 bytes written)
1809 C: Got a response: 220 Authenticated
1810 Authenticated
1811 Executing command SUBSCRIBE time.flows.unix.ar LOCAL emiliano.ar-time.flows.unix.ar
1812 C: Sending command SUBSCRIBE time.flows.unix.ar LOCAL emiliano.ar-time.flows.unix.ar
1813 C: Wrote (67 bytes written)
1814 C: Server sent OK for command SUBSCRIBE time.flows.unix.ar LOCAL emiliano.ar-time.flows.unix.ar
1815 Flow emiliano.ar-time.flows.unix.ar subscription created successfully
1816 Server responded: emiliano.ar-time.flows.unix.ar SUBSCRIBED TO ksdj898.time.flows.unix.ar
1817 Quitting
1818 C: Sending command QUIT
1819 C: Wrote (6 bytes written)
1820 Connection closed

1822 This interaction of the FNUA with the FNAA of the Flow Namespace 1823 emiliano.ar (fnaa-emiliano) has triggered an interaction with the FNAA 1824 of the unix.ar Flow Namespace (fnaa-unix).  This means that before fnaa- 1825 emiliano was able to respond to the FNUA, a new connection was opened 1826 to the remote FNAA and the SUBSCRIBE command was executed.

1828 The log of fnaa-emiliano when the SUBSCRIBE command was issued looks 1829 as follows:

1831 server.go:111: Handle incoming messages.
1832 server.go:105: Accept a connection request.
1833 server.go:253: User authenticated
1834 server.go:347: FULL COMMAND: SUBSCRIBE time.flows.unix.ar LOCAL emiliano.ar-time.flows.unix.ar
1835 server.go:401: Flow is REMOTE
1836 client.go:280: **#Resolving SRV for time.flows.unix.ar. using server 172.17.0.2:53
1837 server.go:417: FNAA FQDN Resolved to fnaa.unix.ar.
port 61000
1838 client.go:42: C: Connecting to 127.0.0.1:61000
1839 client.go:69: C: Got a response: 220 fnaa.unix.ar FNAA
1840 server.go:435: Connected to FNAA
1841 server.go:436: Authenticating with PLAIN mechanism
1842 client.go:126: C: Sending command AUTHENTICATE PLAIN
1843 client.go:133: C: Wrote (20 bytes written)
1844 client.go:144: C: Got a response: 220 OK
1845 client.go:154: C: Authentication string sent: AHRlc3QAdGVzdA==
1846 client.go:159: C: Wrote (18 bytes written)
1847 client.go:170: C: Got a response: 220 Authenticated
1848 server.go:444: Authenticated
1849 client.go:82: C: Sending command SUBSCRIBE time.flows.unix.ar
1850 client.go:88: C: Wrote (30 bytes written)
1851 client.go:112: C: Server sent OK for command SUBSCRIBE time.flows.unix.ar
1852 server.go:456: Flow time.flows.unix.ar subscribed successfully
1853 server.go:457: Server responded: ksdj898.time.flows.unix.ar
1854 server.go:459: Quitting

1856 We can see how fnaa-emiliano had to trigger a client subroutine to 1857 contact the remote fnaa-unix.  Once the server's FQDN, IP and port are 1858 discovered by means of DNS, a new connection is established and the 1859 SUBSCRIBE command is issued.  Here we can see the log of fnaa-unix:

1861 server.go:111: Handle incoming messages.
1862 server.go:105: Accept a connection request.
1863 server.go:253: User authenticated
1864 server.go:139: Received command: subscribe
1865 server.go:348: [SUBSCRIBE time.flows.unix.ar]
1866 server.go:367: Creating flow endpoint time.flows.unix.ar
1867 server.go:368: Creating new topic ksdj898.time.flows.unix.ar in Apache Kafka instance kafka_local
1868 server.go:369: Creating Flow Processor src=time.flows.unix.ar dst=ksdj898.time.flows.unix.ar
1869 server.go:370: Adding DNS Records for ksdj898.time.flows.unix.ar
1870 server.go:372: Flow enabled ksdj898.time.flows.unix.ar
1871 server.go:139: Received command: quit

1873 Thus, we were able to set up a new subscription in fnaa-emiliano that 1874 triggered a background interaction with fnaa-unix.

1876 6.5.  Results of the PoC

1878 We can confirm the feasibility of the overall Event Streaming Open 1879 Network architecture.  The test of the proposed protocol FNAP and its 1880 implementation, both in the FNAA and the FNUA (CLI application), shows 1881 that the architecture can be employed for the purpose of distributed 1882 subscription management among Network Participants.

1884 The minimum functionalities defined both for the Network Participants 1885 and the Users were met.  Network Participants can run this type of 1886 service by means of a server application, the FNAA server.  Also, the 1887 CLI tool proved to be a convenient low-level method to interact with an 1888 FNAA server.

1890 In further implementations, the server application should be 1891 optimized as well as secured, for instance with a TLS handshake.  1892 Also, the CLI tool could be complemented by a web-based application with 1893 a friendlier user interface.

1895 Nevertheless, the challenge for a stable implementation of both 1896 components is supporting different Event Brokers 1897 and their evolution.  Not only should Apache Kafka be supported, but 1898 also the main Public Cloud providers' event streaming solutions, such as AWS 1899 SQS or Google Cloud Pub/Sub.
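One way to contain this dependency is to hide each broker behind a narrow interface inside the FNAA.  The following Go sketch illustrates the idea; all identifiers are ours, and the Kafka backend is a stub (a real one would call the Kafka Admin API):

```go
package main

import "fmt"

// FlowEventsBroker abstracts the operations the FNAA needs from any
// Event Broker (Kafka, SQS, Pub/Sub, ...).  Interface and method
// names are illustrative, not part of the draft.
type FlowEventsBroker interface {
	CreateQueue(name string) error
	DeleteQueue(name string) error
	// Endpoint returns the connection details advertised by
	// DESCRIBE FLOW (e.g. "kf1.unix.ar:9092" for Kafka).
	Endpoint() string
}

// kafkaBroker is a stub backend that records topics in memory,
// shown only to demonstrate the shape of an implementation.
type kafkaBroker struct {
	bootstrap string
	topics    map[string]bool
}

func newKafkaBroker(bootstrap string) *kafkaBroker {
	return &kafkaBroker{bootstrap: bootstrap, topics: map[string]bool{}}
}

func (k *kafkaBroker) CreateQueue(name string) error {
	k.topics[name] = true // stub: would create a Kafka topic
	return nil
}

func (k *kafkaBroker) DeleteQueue(name string) error {
	delete(k.topics, name) // stub: would delete the Kafka topic
	return nil
}

func (k *kafkaBroker) Endpoint() string { return k.bootstrap }

func main() {
	var b FlowEventsBroker = newKafkaBroker("kf1.unix.ar:9092")
	_ = b.CreateQueue("time.flows.unix.ar")
	fmt.Println(b.Endpoint())
}
```

SQS and Pub/Sub backends would implement the same interface, so vendor API churn stays confined to one package per broker.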
Since the Event Brokers are continuously 1900 evolving, the implementation of the FNAA component should keep up 1901 both with the APIs and the new functionalities of these vendors.

1903 Regarding the protocol design, it would be necessary to enhance the 1904 serialization of the exchanged data.  In this sense, it could be 1905 convenient to define a packet header for the overall interaction 1906 of the FNAA both with remote FNAAs as well as with FNUAs.

1908 Regarding the subscription use case, it would be necessary to 1909 establish a convenient format for the server response.  Currently, 1910 the server is returning a key/value structure with the details of the 1911 Flow.  This structure may not be the most adequate, since it may 1912 differ depending on the Event Broker used.

1914 Also, the security aspect needs further analysis and design, since 1915 weaknesses here could lead to significant economic damage for organizations.  1916 Thus, it would be recommended to review the different security 1917 controls needed for this solution as part of an Information Security 1918 Management System.

1920 Finally, the implementation should leverage the Cloud Native 1921 functionalities provided by the Kubernetes API.  For example, the 1922 FNAA should trigger the deployment of Flow Processors on demand, in 1923 order to provide isolated computing resources for each subscription.  1924 Also, a Kubernetes custom resource could be developed so that the kubectl CLI 1925 tool can be used for management, instead of a custom CLI tool.

1927 7.  Summary & Conclusions

1929 In this chapter we provide a summary of everything that has been 1930 described in this document, as well as some conclusions about it.

1932 We have identified a use case for which there is currently no 1933 adequate solution provided by existing tools.  This use case is based 1934 on the cross-organization integration of real-time event streams.
   Nowadays, organizations intending to integrate these kinds of data
   streams struggle with offline communication to agree on a common
   interface for integration.  In this context, we proposed an Open
   Network for Event Streaming as a possible solution to this
   difficulty.

   For this Open Network, we derived the main requirements from the
   technical perspective.  While many existing components can be
   leveraged, some components require analysis, design, and
   implementation.  We then referred to the Commons Infrastructure
   literature to show how Event Streaming can be considered an
   Infrastructure Resource that enables downstream productive
   activities.  Finally, we established the main guidelines that an
   Open Network should follow, basing these definitions on Free, Open
   & Neutral Networks.

   Using the previous definitions, we designed an architecture for the
   Event Streaming Open Network, establishing the components that the
   different Network Participants should implement in order to
   participate in the network.  After providing a thorough description
   of all the components, we showed some use cases of integration
   among different Network Participants.

   Once the architecture was defined, we proposed an implementation
   approach that describes the existing components that can be
   leveraged as well as those that need to be developed from scratch.
   The outcome was that a server-side application, the FNAA, had to be
   developed.  This application implements the FNAP protocol and can
   be accessed by a client application, which we named FNUA.

   Finally, we proved the feasibility of the proposed architecture by
   implementing the minimum required functionalities in the form of a
   Proof of Concept.
   The results of this PoC were encouraging, since it was possible to
   implement the initial functionalities of the FNAA and FNUA
   components.

   In conclusion, there is great potential for an Open Network for
   Event Streaming among organizations.  In the same way that the
   email infrastructure acts as an open network for electronic
   communication among people, this kind of network would enable
   developers to integrate real-time event streams while minimizing
   the offline agreement of interfaces and technologies.

   However, there are several difficulties that require further work.
   First, a robust implementation of the main Event Streaming Open
   Network components must be provided, mainly the FNAA and the FNUA.
   In order to achieve an acceptable level of quality and stability, a
   community needs to be developed around the project.

   Secondly, we found that the proposed architecture is a convenient
   starting point.  However, it may require modifications based on
   lessons learned during implementation.  For example, while
   designing the architecture, we avoided the need for a database in
   the FNAA component by leveraging the DNS infrastructure.  While
   this can be sufficient for the minimum functionalities described,
   it will most probably be necessary for the FNAA to persist data in
   a database of its own.  In this sense, we believe that leveraging
   the Kubernetes resource model could be a convenient alternative.

   Thirdly, during the PoC execution, we identified some difficulties
   implementing the security functionalities of authentication and
   authorization.  Although we were able to implement an
   authentication mechanism, integration with well-established
   protocols (e.g., OAuth, GSSAPI) is needed.
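   As a first hardening step, the TLS handshake suggested in
   Section 6.5 could wrap the PoC's plaintext listener.  The sketch
   below is a minimal illustration, not the PoC's actual code: the
   function name, address, and fallback behaviour are our assumptions;
   only the standard library's crypto/tls API is used.

```go
package main

import (
	"crypto/tls"
	"log"
	"net"
)

// listenFNAA opens the (hypothetical) FNAA listening socket.  When a
// certificate and key are supplied, the plaintext TCP listener used
// in the PoC is replaced by a TLS one; otherwise it falls back to
// the insecure PoC behaviour.
func listenFNAA(addr, certFile, keyFile string) (net.Listener, error) {
	if certFile == "" || keyFile == "" {
		// PoC behaviour: unauthenticated plaintext TCP.
		return net.Listen("tcp", addr)
	}
	cert, err := tls.LoadX509KeyPair(certFile, keyFile)
	if err != nil {
		return nil, err
	}
	cfg := &tls.Config{
		Certificates: []tls.Certificate{cert},
		MinVersion:   tls.VersionTLS12,
	}
	return tls.Listen("tcp", addr, cfg)
}

func main() {
	// Port 0 lets the OS pick a free port for this demonstration.
	ln, err := listenFNAA("127.0.0.1:0", "", "")
	if err != nil {
		log.Fatal(err)
	}
	defer ln.Close()
	log.Printf("FNAA listening on %s", ln.Addr())
}
```

   Authentication against OAuth or GSSAPI would then run over this
   encrypted channel rather than being a bespoke mechanism.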
   Finally, there is also the need to leverage the Cloud Native
   architecture, essentially Kubernetes, to provide hyper-scalability
   and enable Network Participants to choose the underlying
   infrastructure agnostically.  The selection of Golang for the PoC
   implementation proved convenient, given the vast number of
   available libraries for the integration of third-party components
   and services.

   Notwithstanding the difficulties, we firmly believe that cross-
   organization real-time event integration can provide great benefits
   for society.  It would enhance the efficiency of business processes
   across organizations.  It would also provide broad visibility to
   final users, enabling experimentation and entrepreneurship.  New
   business models for existing productive activities could be
   developed, and innovation would be enabled, which in turn would
   constitute the positive externalities of the Event Streaming Open
   Network.

8.  Security Considerations

   TODO Security

9.  IANA Considerations

   This document has no IANA actions.

10.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC2782]  Gulbrandsen, A., Vixie, P., and L. Esibov, "A DNS RR for
              specifying the location of services (DNS SRV)", RFC 2782,
              DOI 10.17487/RFC2782, February 2000,
              <https://www.rfc-editor.org/info/rfc2782>.

   [RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
              A., Peterson, J., Sparks, R., Handley, M., and E.
              Schooler, "SIP: Session Initiation Protocol", RFC 3261,
              DOI 10.17487/RFC3261, June 2002,
              <https://www.rfc-editor.org/info/rfc3261>.

   [RFC5321]  Klensin, J., "Simple Mail Transfer Protocol", RFC 5321,
              DOI 10.17487/RFC5321, October 2008,
              <https://www.rfc-editor.org/info/rfc5321>.

   [RFC6763]  Cheshire, S. and M. Krochmal, "DNS-Based Service
              Discovery", RFC 6763, DOI 10.17487/RFC6763,
              February 2013,
              <https://www.rfc-editor.org/info/rfc6763>.
   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

Acknowledgments

   SPINELLA, E. (2022) [Online] "Event Streaming Open Network",
   Master's Thesis.
   <https://drive.google.com/file/d/1R9H-4knAztez_yUPlr7aZSkbUjs8jL3j>

   URQUHART, J. (2021) "Flow Architectures".

   FRISCHMANN, B. (2007) [Online] "Infrastructure Commons in Economic
   Perspective".
   <https://firstmonday.org/article/view/1901/1783>

   WIDL, M. (2013) "Guided Merging of Sequence Diagrams".

   NAVARRO, L. (2018) [Online] "Network Infrastructures: The commons
   model for local participation, governance and sustainability".
   <https://www.apc.org/en/pubs/network-infrastructures-commons-model-
   local-participation-governance-and-sustainability>

   BRINO, A. (2019) "Towards an Event Streaming Service for ATLAS data
   processing".

   GUTTRIDGE, Gartner (2021) "Modern Data Strategies for the Real-time
   Enterprise", Big Data Quarterly, 2021.

Author's Address

   Emiliano Spinella
   Syndeno

   Email: emiliano.spinella@syndeno.com