Internet Engineering Task Force                       Daniel O. Awduche
INTERNET-DRAFT                                           Movaz Networks
TE Working Group
Expiration Date: April 2002                                 Angela Chiu
                                                        Celion Networks

                                                          Anwar Elwalid
                                                    Lucent Technologies

                                                          Indra Widjaja
                                                    Lucent Technologies

                                                            XiPeng Xiao
                                                          Photuris Inc.


         Overview and Principles of Internet Traffic Engineering

                   draft-ietf-tewg-principles-01.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC 2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Abstract

   This memo describes the principles of Traffic Engineering (TE) in
   the Internet.  The document is intended to promote better
   understanding of the issues surrounding traffic engineering in IP
   networks, and to provide a common basis for the development of
   traffic engineering capabilities for the Internet.  The principles,
   architectures, and methodologies for performance evaluation and
   performance optimization of operational IP networks are discussed
   throughout this document.
   The optimization goals of traffic engineering are to enhance the
   performance of IP traffic while utilizing network resources
   economically and reliably.  The document includes a set of generic
   recommendations and options for Internet traffic engineering.  The
   document can serve as a guide to implementors of online and offline
   Internet traffic engineering mechanisms, tools, and support systems.
   The document can also help service providers devise traffic
   engineering solutions for their networks.

Table of Contents

   1.0 Introduction
       1.1 What is Internet Traffic Engineering?
       1.2 Scope
       1.3 Terminology
   2.0 Background
       2.1 Context of Internet Traffic Engineering
       2.2 Network Context
       2.3 Problem Context
           2.3.1 Congestion and its Ramifications
       2.4 Solution Context
           2.4.1 Combating the Congestion Problem
       2.5 Implementation and Operational Context
   3.0 Traffic Engineering Process Model
       3.1 Components of the Traffic Engineering Process Model
       3.2 Measurement
       3.3 Modeling, Analysis, and Simulation
       3.4 Optimization
   4.0 Historical Review and Recent Developments
       4.1 Traffic Engineering in Classical Telephone Networks
       4.2 Evolution of Traffic Engineering in the Internet
           4.2.1 Adaptive Routing in ARPANET
           4.2.2 Dynamic Routing in the Internet
           4.2.3 ToS Routing
           4.2.4 Equal Cost Multi-Path
           4.2.5 Nimrod
       4.3 Overlay Model
       4.4 Constraint-Based Routing
       4.5 Overview of Other IETF Projects Related to Traffic
           Engineering
           4.5.1 Integrated Services
           4.5.2 RSVP
           4.5.3 Differentiated Services
           4.5.4 MPLS
           4.5.5 IP Performance Metrics
           4.5.6 Flow Measurement
           4.5.7 Endpoint Congestion Management
       4.6 Overview of ITU Activities Related to Traffic Engineering
       4.7 Content Distribution
   5.0 Taxonomy of Traffic Engineering Systems
       5.1 Time-Dependent Versus State-Dependent
       5.2 Offline Versus Online
       5.3 Centralized Versus Distributed
       5.4 Local Versus Global
       5.5 Prescriptive Versus Descriptive
       5.6 Open-Loop Versus Closed-Loop
       5.7 Tactical Versus Strategic
   6.0 Recommendations for Internet Traffic Engineering
       6.1 Generic Non-functional Recommendations
       6.2 Routing Recommendations
       6.3 Traffic Mapping Recommendations
       6.4 Measurement Recommendations
       6.5 Network Survivability
           6.5.1 Survivability in MPLS Based Networks
           6.5.2 Protection Option
       6.6 Traffic Engineering in Diffserv Environments
       6.7 Network Controllability
   7.0 Inter-Domain Considerations
   8.0 Overview of Contemporary TE Practices in Operational IP
       Networks
   9.0 Conclusion
   10.0 Security Considerations
   11.0 Acknowledgments
   12.0 References
   13.0 Authors' Addresses

1.0 Introduction

   This memo describes the principles of Internet traffic engineering.
   The objective of the document is to articulate the general issues
   and principles for Internet traffic engineering and, where
   appropriate, to provide recommendations, guidelines, and options for
   the development of online and offline Internet traffic engineering
   capabilities and support systems.

   The document can aid service providers in devising and implementing
   traffic engineering solutions for their networks.  Networking
   hardware and software vendors will also find the document helpful in
   the development of mechanisms and support systems for the Internet
   environment that support the traffic engineering function.

   The document provides a terminology for describing and understanding
   common Internet traffic engineering concepts.  The document also
   provides a taxonomy of known traffic engineering styles.
   In this context, a traffic engineering style abstracts important
   aspects from a traffic engineering methodology.  Traffic engineering
   styles can be viewed in different ways depending upon the specific
   context in which they are used and the specific purpose which they
   serve.  The combination of styles and views results in a natural
   taxonomy of traffic engineering systems.

   Even though Internet traffic engineering is most effective when
   applied end-to-end, the initial focus of this document is intra-
   domain traffic engineering (that is, traffic engineering within a
   given autonomous system).  However, because a preponderance of
   Internet traffic tends to be inter-domain (originating in one
   autonomous system and terminating in another), this document also
   provides an overview of aspects pertaining to inter-domain traffic
   engineering.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119.

1.1. What is Internet Traffic Engineering?

   Internet traffic engineering is defined as that aspect of Internet
   network engineering dealing with the issues of performance
   evaluation and performance optimization of operational IP networks.
   Traffic engineering encompasses the application of technology and
   scientific principles to the measurement, characterization,
   modeling, and control of Internet traffic [RFC-2702, AWD2].

   Enhancing the performance of an operational network, at both the
   traffic and resource levels, is a major objective of Internet
   traffic engineering.  This is accomplished by addressing traffic
   oriented performance requirements while utilizing network resources
   economically and reliably.  Traffic oriented performance measures
   include delay, delay variation, packet loss, and throughput.
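   The four traffic oriented performance measures named above can be
   made concrete with a small sketch.  The record format, function
   name, and sample numbers below are invented for illustration; they
   are not drawn from any IETF measurement specification (the IPPM
   work referenced later in this document defines the standardized
   metrics).

```python
# Illustrative sketch (not from any IETF specification): deriving the
# four traffic oriented performance measures from per-packet records.
# A record is (send_time, recv_time, size_bytes), with recv_time set
# to None for a lost packet.  All names and numbers are invented.

def performance_measures(records, interval):
    delays = [r - s for (s, r, _) in records if r is not None]
    delivered_bytes = sum(n for (_, r, n) in records if r is not None)

    mean_delay = sum(delays) / len(delays)
    # Delay variation, taken here as the spread between the largest
    # and smallest observed one-way delays in the sample.
    delay_variation = max(delays) - min(delays)
    loss_ratio = 1 - len(delays) / len(records)
    throughput = 8 * delivered_bytes / interval  # bits per second

    return mean_delay, delay_variation, loss_ratio, throughput

# Three packets sent within a one-second interval; the third is lost.
measures = performance_measures(
    [(0.00, 0.05, 1500), (0.01, 0.07, 1500), (0.02, None, 1500)],
    interval=1.0)
```

   Under these invented numbers the sketch reports a mean delay of
   55 ms, 10 ms of delay variation, a one-in-three loss ratio, and
   24 kbit/s of throughput.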
   An important objective of Internet traffic engineering is to
   facilitate reliable network operations [RFC-2702].  Reliable network
   operations can be facilitated by providing mechanisms that enhance
   network integrity and by embracing policies emphasizing network
   survivability.  This results in a minimization of the vulnerability
   of the network to service outages arising from errors, faults, and
   failures occurring within the infrastructure.

   The Internet exists in order to transfer information from source
   nodes to destination nodes.  Accordingly, one of the most
   significant functions performed by the Internet is the routing of
   traffic from ingress nodes to egress nodes.  Therefore, one of the
   most distinctive functions performed by Internet traffic engineering
   is the control and optimization of the routing function, to steer
   traffic through the network in the most effective way.

   Ultimately, it is the performance of the network as seen by end
   users of network services that is truly paramount.  This crucial
   point should be considered throughout the development of traffic
   engineering mechanisms and policies.  The characteristics visible to
   end users are the emergent properties of the network, which are the
   characteristics of the network when viewed as a whole.  A central
   goal of the service provider, therefore, is to enhance the emergent
   properties of the network while taking economic considerations into
   account.

   The importance of the above observation regarding the emergent
   properties of networks is that special care must be taken when
   choosing network performance measures to optimize.  Optimizing the
   wrong measures may achieve certain local objectives, but may have
   disastrous consequences on the emergent properties of the network
   and thereby on the quality of service perceived by end-users of
   network services.
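   To make the risk concrete, consider a deliberately tiny sketch in
   which the topology, path costs, capacities, and demands are all
   invented: letting every demand independently optimize a local
   measure (path cost) overloads the shared cheap path, while
   optimizing an emergent, network-wide measure (the maximum link
   utilization) keeps every path within capacity.

```python
# Toy illustration of local versus emergent optimization.  Two
# candidate paths carry demands from an ingress to an egress; the
# costs, capacities, and demand sizes are invented for this sketch.

paths = {"short": {"cost": 1, "capacity": 10.0},
         "long":  {"cost": 2, "capacity": 10.0}}
demands = [4.0, 4.0, 4.0]

def max_utilization(loads):
    return max(loads[p] / paths[p]["capacity"] for p in paths)

# Local objective: each demand independently picks the cheapest path.
greedy = {"short": 0.0, "long": 0.0}
for d in demands:
    greedy["short"] += d          # "short" always wins on cost

# Emergent objective: place each demand on the currently least
# utilized path, approximating min-max link utilization.
balanced = {"short": 0.0, "long": 0.0}
for d in demands:
    p = min(paths, key=lambda q: balanced[q] / paths[q]["capacity"])
    balanced[p] += d
```

   In this toy case the locally optimal assignment drives the short
   path to 120% utilization, while the min-max style assignment caps
   the worst path at 80%.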
   A subtle but practical advantage of the systematic application of
   traffic engineering concepts to operational networks is that it
   helps to identify and structure goals and priorities in terms of
   enhancing the quality of service delivered to end-users of network
   services.  The application of traffic engineering concepts also aids
   in the measurement and analysis of the achievement of these goals.

   The optimization aspects of traffic engineering can be achieved
   through capacity management and traffic management.  As used in this
   document, capacity management includes capacity planning, routing
   control, and resource management.  Network resources of particular
   interest include link bandwidth, buffer space, and computational
   resources.  Likewise, as used in this document, traffic management
   includes (1) nodal traffic control functions such as traffic
   conditioning, queue management, and scheduling, and (2) other
   functions that regulate traffic flow through the network or that
   arbitrate access to network resources between different packets or
   between different traffic streams.

   The optimization objectives of Internet traffic engineering should
   be viewed as a continual and iterative process of network
   performance improvement and not simply as a one-time goal.  Traffic
   engineering also demands continual development of new technologies
   and new methodologies for network performance enhancement.

   The optimization objectives of Internet traffic engineering may
   change over time as new requirements are imposed, as new
   technologies emerge, or as new insights are brought to bear on the
   underlying problems.  Moreover, different networks may have
   different optimization objectives, depending upon their business
   models, capabilities, and operating constraints.  The optimization
   aspects of traffic engineering are ultimately concerned with network
   control, regardless of the specific optimization goals in any
   particular environment.

   Thus, the optimization aspects of traffic engineering can be viewed
   from a control perspective.  Control within the Internet traffic
   engineering arena can be proactive and/or reactive.  In the
   proactive case, the traffic engineering control system takes
   preventive action to obviate predicted unfavorable future network
   states.  It may also take perfective action to induce a more
   desirable state in the future.  In the reactive case, the control
   system responds correctively, and perhaps adaptively, to events that
   have already transpired in the network.

   The control dimension of Internet traffic engineering responds to
   network events at multiple levels of temporal resolution.  Certain
   aspects of capacity management, such as capacity planning, respond
   at very coarse temporal levels, ranging from days to possibly years.
   The introduction of automatically switched optical transport
   networks (e.g. based on the Multi-protocol Lambda Switching
   concepts) could significantly reduce the lifecycle for capacity
   planning by expediting the provisioning of optical bandwidth.
   Routing control functions operate at intermediate levels of temporal
   resolution, ranging from milliseconds to days.  Finally, packet
   level processing functions (e.g. rate shaping, queue management, and
   scheduling) operate at very fine levels of temporal resolution,
   ranging from picoseconds to milliseconds, while responding to the
   real-time statistical behavior of traffic.  The subsystems of
   Internet traffic engineering control include capacity augmentation,
   routing control, traffic control, and resource control (including
   control of service policies at network elements).
   When capacity is to be augmented for tactical purposes, it may be
   desirable to devise a deployment plan that expedites bandwidth
   provisioning while minimizing installation costs.

   Inputs into the traffic engineering control system include network
   state variables, policy variables, and decision variables.

   One major challenge of Internet traffic engineering is the
   realization of automated control capabilities that adapt quickly and
   cost effectively to significant changes in a network's state, while
   still maintaining stability.

   Another critical dimension of Internet traffic engineering is
   network performance evaluation, which is important for assessing the
   effectiveness of traffic engineering methods, and for monitoring and
   verifying compliance with network performance goals.  Results from
   performance evaluation can be used to identify existing problems,
   guide network re-optimization, and aid in the prediction of
   potential future problems.

   Performance evaluation can be achieved in many different ways.  The
   most notable techniques include analytical methods, simulation, and
   empirical methods based on measurements.  When analytical methods or
   simulation are used, network nodes and links can be modeled to
   capture relevant operational features such as topology, bandwidth,
   buffer space, and nodal service policies (link scheduling, packet
   prioritization, buffer management, etc.).  Analytical traffic models
   can be used to depict dynamic and behavioral traffic
   characteristics, such as burstiness, statistical distributions, and
   dependence.

   Performance evaluation can be quite complicated in practical network
   contexts.  A number of techniques can be used to simplify the
   analysis, such as abstraction, decomposition, and approximation.
   For example, simplifying concepts such as effective bandwidth and
   effective buffer [Elwalid] may be used to approximate nodal
   behaviors at the packet level and to simplify the analysis at the
   connection level.  Network analysis techniques using, for example,
   queuing models and approximation schemes based on asymptotic and
   decomposition techniques can render the analysis even more
   tractable.  In particular, an emerging set of concepts known as
   network calculus [CRUZ], based on deterministic bounds, may simplify
   network analysis relative to classical stochastic techniques.  When
   using analytical techniques, care should be taken to ensure that the
   models faithfully reflect the relevant operational characteristics
   of the modeled network entities.

   Simulation can be used to evaluate network performance or to verify
   and validate analytical approximations.  Simulation can, however, be
   computationally costly and may not always provide sufficient
   insight.  An appropriate approach to a given network performance
   evaluation problem may involve a hybrid combination of analytical
   techniques, simulation, and empirical methods.

   As a general rule, traffic engineering concepts and mechanisms must
   be sufficiently specific and well defined to address known
   requirements, but simultaneously flexible and extensible enough to
   accommodate unforeseen future demands.

1.2. Scope

   The scope of this document is intra-domain traffic engineering; that
   is, traffic engineering within a given autonomous system in the
   Internet.  The document will discuss concepts pertaining to intra-
   domain traffic control, including such issues as routing control,
   micro and macro resource allocation, and the control coordination
   problems that consequently arise.

   This document will describe and characterize techniques already in
   use or in advanced development for Internet traffic engineering.
   The way these techniques fit together will be discussed, and
   scenarios in which they are useful will be identified.

   Although the emphasis is on intra-domain traffic engineering,
   Section 7.0 provides an overview of the high level considerations
   pertaining to inter-domain traffic engineering.  Inter-domain
   traffic engineering is crucial to the performance enhancement of the
   global Internet infrastructure.

   Whenever possible, relevant requirements from existing IETF
   documents and other sources will be incorporated by reference.

1.3 Terminology

   This subsection provides terminology which is useful for Internet
   traffic engineering.  The definitions presented apply to this
   document.  These terms may have other meanings elsewhere.

   - Baseline analysis:
     A study conducted to serve as a baseline for comparison to the
     actual behavior of the network.

   - Busy hour:
     A one hour period within a specified interval of time (typically
     24 hours) in which the traffic load in a network or sub-network
     is greatest.

   - Bottleneck:
     A network element whose input traffic rate tends to be greater
     than its output rate.

   - Congestion:
     A state of a network resource in which the traffic incident on
     the resource exceeds its output capacity over an interval of
     time.

   - Congestion avoidance:
     An approach to congestion management that attempts to obviate the
     occurrence of congestion.

   - Congestion control:
     An approach to congestion management that attempts to remedy
     congestion problems that have already occurred.

   - Constraint-based routing:
     A class of routing protocols that take specified traffic
     attributes, network constraints, and policy constraints into
     account when making routing decisions.  Constraint-based routing
     is applicable to traffic aggregates as well as flows.  It is a
     generalization of QoS routing.
   - Demand side congestion management:
     A congestion management scheme that addresses congestion problems
     by regulating or conditioning offered load.

   - Effective bandwidth:
     The minimum amount of bandwidth that can be assigned to a flow or
     traffic aggregate in order to deliver 'acceptable service
     quality' to the flow or traffic aggregate.

   - Egress traffic:
     Traffic exiting a network or network element.

   - Hot-spot:
     A network element or subsystem which is in a state of congestion.

   - Ingress traffic:
     Traffic entering a network or network element.

   - Inter-domain traffic:
     Traffic that originates in one autonomous system and terminates
     in another.

   - Loss network:
     A network that does not provide adequate buffering for traffic,
     so that traffic entering a busy resource within the network will
     be dropped rather than queued.

   - Metric:
     A parameter defined in terms of standard units of measurement.

   - Measurement Methodology:
     A repeatable measurement technique used to derive one or more
     metrics of interest.

   - Network Survivability:
     The capability to provide a prescribed level of QoS for existing
     services after a given number of failures occur within the
     network.

   - Offline traffic engineering:
     A traffic engineering system that exists outside of the network.

   - Online traffic engineering:
     A traffic engineering system that exists within the network,
     typically implemented on or as adjuncts to operational network
     elements.

   - Performance measures:
     Metrics that provide quantitative or qualitative measures of the
     performance of systems or subsystems of interest.

   - Performance management:
     A systematic approach to improving effectiveness in the
     accomplishment of specific networking goals related to
     performance improvement.
460 - Performance Metric: 461 A performance parameter defined in terms of standard units of 462 measurement. 464 - Provisioning: 465 The process of assigning or configuring network resources to 466 meet certain requests. 468 - QoS routing: 470 Class of routing systems that selects paths to be used by a 471 flow based on the QoS requirements of the flow. 473 - Service Level Agreement: 474 A contract between a provider and a customer that guarantees 475 specific levels of performance and reliability at a certain 476 cost. 478 - Stability: 479 An operational state in which a network does not oscillate 480 in a disruptive manner from one mode to another mode. 482 - Supply side congestion management: 483 A congestion management scheme that provisions additional 484 network resources to address existing and/or anticipated 485 congestion problems. 487 - Transit traffic: 488 Traffic whose origin and destination are both outside of 489 the network under consideration. 491 - Traffic characteristic: 492 A description of the temporal behavior or a description of the 493 attributes of a given traffic flow or traffic aggregate. 495 - Traffic engineering system 496 A collection of objects, mechanisms, and protocols that are 497 used conjunctively to accomplish traffic engineering 498 objectives. 500 - Traffic flow: 501 A stream of packets between two end-points that can be 502 characterized in a certain way. A micro-flow has a more 503 specific definition: A micro-flow is a stream of packets 504 with the same source and destination addresses, source 505 and destination ports, and protocol ID. 507 - Traffic intensity: 508 A measure of traffic loading with respect to a resource 509 capacity over a specified period of time. In classical 510 telephony systems, traffic intensity is measured in units of 511 Erlang. 513 - Traffic matrix: 514 A representation of the traffic demand between a set of origin 515 and destination abstract nodes. 
  An abstract node can consist of one or more network elements.

- Traffic monitoring:
  The process of observing traffic characteristics at a given point
  in a network and collecting the traffic information for analysis
  and further action.

- Traffic trunk:
  An aggregation of traffic flows belonging to the same class which
  are forwarded through a common path. A traffic trunk may be
  characterized by an ingress and egress node, and a set of
  attributes which determine its behavioral characteristics and
  requirements from the network.

2.0 Background

The Internet has quickly evolved into a very critical communications
infrastructure, supporting significant economic, educational, and
social activities. Simultaneously, the delivery of Internet
communications services has become very competitive, and end-users
are demanding very high quality service from their service providers.
Consequently, performance optimization of large scale IP networks,
especially public Internet backbones, has become an important
problem. Network performance requirements are multi-dimensional,
complex, and sometimes contradictory, making the traffic engineering
problem very challenging.

The network must convey IP packets from ingress nodes to egress nodes
efficiently, expeditiously, and economically. Furthermore, in a
multiclass service environment (e.g., Diffserv capable networks), the
resource sharing parameters of the network must be appropriately
determined and configured according to prevailing policies and
service models to resolve resource contention issues arising from
mutual interference between packets traversing through the network.
Thus, consideration must be given to resolving competition for
network resources between traffic streams belonging to the same
service class (intra-class contention resolution) and traffic streams
belonging to different classes (inter-class contention resolution).

2.1 Context of Internet Traffic Engineering

The context of Internet traffic engineering pertains to the scenarios
in which traffic engineering is used. A traffic engineering
methodology establishes appropriate rules to resolve traffic
performance issues occurring in a specific context. The context of
Internet traffic engineering includes:

(1) A network context defining the universe of discourse, and in
    particular the situations in which the traffic engineering
    problems occur. The network context includes network structure,
    network policies, network characteristics, network constraints,
    network quality attributes, and network optimization criteria.

(2) A problem context defining the general and concrete issues that
    traffic engineering addresses. The problem context includes
    identification, abstraction of relevant features, representation,
    formulation, specification of the requirements on the solution
    space, and specification of the desirable features of acceptable
    solutions.

(3) A solution context suggesting how to address the issues
    identified by the problem context. The solution context includes
    analysis, evaluation of alternatives, prescription, and
    resolution.

(4) An implementation and operational context in which the solutions
    are methodologically instantiated. The implementation and
    operational context includes planning, organization, and
    execution.

The context of Internet traffic engineering and the different problem
scenarios are discussed in the following subsections.

2.2 Network Context

IP networks range in size from small clusters of routers situated
within a given location to thousands of interconnected routers,
switches, and other components distributed all over the world.

Conceptually, at the most basic level of abstraction, an IP network
can be represented as a distributed dynamical system consisting of:
(1) a set of interconnected resources which provide transport
services for IP traffic subject to certain constraints, (2) a demand
system representing the offered load to be transported through the
network, and (3) a response system consisting of network processes,
protocols, and related mechanisms which facilitate the movement of
traffic through the network (see also [AWD2]).

The network elements and resources may have specific characteristics
restricting the manner in which the demand is handled. Additionally,
network resources may be equipped with traffic control mechanisms
superintending the way in which the demand is serviced. Traffic
control mechanisms may, for example, be used to control various
packet processing activities within a given resource, arbitrate
contention for access to the resource by different packets, and
regulate traffic behavior through the resource. A configuration
management and provisioning system may allow the settings of the
traffic control mechanisms to be manipulated by external or internal
entities in order to exercise control over the way in which the
network elements respond to internal and external stimuli.

The details of how the network provides transport services for
packets are specified in the policies of the network administrators
and are installed through network configuration management and
policy-based provisioning systems.
Generally, the types of services provided by the network also depend
upon the technology and characteristics of the network elements and
protocols, the prevailing service and utility models, and the ability
of the network administrators to translate policies into network
configurations.

Contemporary Internet networks have three significant
characteristics: (1) they provide real-time services, (2) they have
become mission critical, and (3) their operating environments are
very dynamic. The dynamic characteristics of IP networks can be
attributed in part to fluctuations in demand, to the interaction
between various network protocols and processes, to the rapid
evolution of the infrastructure which demands the constant inclusion
of new technologies and new network elements, and to transient and
persistent impairments which occur within the system.

Packets contend for the use of network resources as they are conveyed
through the network. A network resource is considered to be congested
if the arrival rate of packets exceeds the output capacity of the
resource over an interval of time. Congestion may result in some of
the arriving packets being delayed or even dropped. Congestion
increases transit delays, delay variation, and packet loss, and
reduces the predictability of network services. Clearly, congestion
is a highly undesirable phenomenon.

Combating congestion at reasonable cost is a major objective of
Internet traffic engineering.

Efficient sharing of network resources by multiple traffic streams is
a basic economic premise for packet switched networks in general and
for the Internet in particular. A fundamental challenge in network
operation, especially in a large scale public IP network, is to
increase the efficiency of resource utilization while minimizing the
possibility of congestion.
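The definition of congestion above can be made concrete as a simple
comparison of offered load against output capacity over a measurement
interval. The sketch below is purely illustrative (the function and
parameter names are hypothetical, not part of any standard or
product):

```python
# Illustrative sketch: a resource is congested over an interval when
# the arrival rate of traffic exceeds its output capacity. In a real
# system these values would come from a measurement subsystem.

def is_congested(bits_arrived: float, capacity_bps: float,
                 interval_s: float) -> bool:
    """True if offered load exceeded output capacity over the interval."""
    arrival_rate_bps = bits_arrived / interval_s
    return arrival_rate_bps > capacity_bps

# Example: 8e9 bits arriving in 5 s at a 1 Gb/s link is a 1.6 Gb/s
# offered load, so the resource is congested over that interval.
print(is_congested(8e9, 1e9, 5.0))   # True
print(is_congested(4e9, 1e9, 5.0))   # False
```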

Increasingly, the Internet will have to function in the presence of
different classes of traffic with different service requirements. The
advent of Differentiated Services [RFC-2475] makes this requirement
particularly acute. Thus, packets may be grouped into behavior
aggregates such that each behavior aggregate may have a common set of
behavioral characteristics or a common set of delivery requirements.
In practice, the delivery requirements of a specific set of packets
may be specified explicitly or implicitly. Two of the most important
traffic delivery requirements are capacity constraints and QoS
constraints.

Capacity constraints can be expressed statistically as peak rates,
mean rates, and burst sizes, or as some deterministic notion of
effective bandwidth. QoS requirements can be expressed (1) in terms
of integrity constraints such as packet loss and (2) in terms of
temporal constraints such as timing restrictions for the delivery of
each packet (delay) and timing restrictions for the delivery of
consecutive packets belonging to the same traffic stream (delay
variation).

2.3 Problem Context

Fundamental problems exist in association with the operation of a
network described by the simple model of the previous subsection.
This subsection reviews the problem context in relation to the
traffic engineering function.

The identification, abstraction, representation, and measurement of
network features relevant to traffic engineering is a significant
issue.

One particularly important class of problems concerns how to
explicitly formulate the problems that traffic engineering attempts
to solve, how to identify the requirements on the solution space, how
to specify the desirable features of good solutions, how to actually
solve the problems, and how to measure and characterize the
effectiveness of the solutions.
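As one small illustration of explicit formulation, the capacity
constraints described in Section 2.2 (a mean rate together with a
burst size) are commonly formalized with a token bucket. The sketch
below is a minimal, hypothetical rendering of that formulation, not a
prescribed mechanism:

```python
# Minimal token-bucket sketch expressing a capacity constraint as a
# mean rate plus a burst size. All names and values are illustrative.

class TokenBucket:
    def __init__(self, rate_bps: float, burst_bits: float):
        self.rate = rate_bps      # long-term mean rate (tokens per second)
        self.burst = burst_bits   # bucket depth (maximum burst, in bits)
        self.tokens = burst_bits  # bucket starts full
        self.last = 0.0           # time of last update (seconds)

    def conforms(self, now: float, packet_bits: float) -> bool:
        """True if the packet conforms to the (rate, burst) constraint."""
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if packet_bits <= self.tokens:
            self.tokens -= packet_bits
            return True
        return False

# A 1 Mb/s mean rate with a 10 kb burst allowance:
tb = TokenBucket(rate_bps=1e6, burst_bits=1e4)
print(tb.conforms(0.0, 8000))   # True: within the initial burst
print(tb.conforms(0.0, 8000))   # False: burst exhausted at that instant
print(tb.conforms(1.0, 8000))   # True: tokens replenished after 1 second
```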

Another class of problems concerns how to measure and estimate
relevant network state parameters. Effective traffic engineering
relies on a good estimate of the offered traffic load as well as a
view of the underlying topology and associated resource constraints.
A network-wide view of the topology is also a must for offline
planning.

Still another class of problems concerns how to characterize the
state of the network and how to evaluate its performance under a
variety of scenarios. The performance evaluation problem is two-fold.
One aspect of this problem relates to the evaluation of the system
level performance of the network. The other aspect relates to the
evaluation of the resource level performance, which restricts
attention to the performance analysis of individual network
resources. In this memo, we refer to the system level characteristics
of the network as the "macro-states" and the resource level
characteristics as the "micro-states." The system level
characteristics are also known as the emergent properties of the
network, as noted earlier. Correspondingly, we shall refer to the
traffic engineering schemes dealing with network performance
optimization at the systems level as "macro-TE" and the schemes that
optimize at the individual resource level as "micro-TE." Under
certain circumstances, the system level performance can be derived
from the resource level performance using appropriate rules of
composition, depending upon the particular performance measures of
interest.

Another fundamental class of problems concerns how to effectively
optimize network performance. Performance optimization may entail
translating solutions to specific traffic engineering problems into
network configurations. Optimization may also entail some degree of
resource management control, routing control, and/or capacity
augmentation.

As noted previously, congestion is an undesirable phenomenon in
operational networks. Therefore, the next subsection addresses the
issue of congestion and its ramifications within the problem context
of Internet traffic engineering.

2.3.1 Congestion and its Ramifications

Congestion is one of the most significant problems in an operational
IP context. A network element is said to be congested if it
experiences sustained overload over an interval of time. Congestion
almost always results in degradation of service quality to end users.
Congestion control schemes can include demand side policies and
supply side policies. Demand side policies may restrict access to
congested resources and/or dynamically regulate the demand to
alleviate the overload situation. Supply side policies may expand or
augment network capacity to better accommodate offered traffic.
Supply side policies may also re-allocate network resources by
redistributing traffic over the infrastructure. Traffic
redistribution and resource re-allocation serve to increase the
'effective capacity' seen by the demand.

The emphasis of this memo is primarily on congestion management
schemes falling within the scope of the network, rather than on
congestion management systems dependent upon sensitivity and
adaptivity from end-systems. That is, the aspects that are considered
in this memo with respect to congestion management are those
solutions that can be provided by control entities operating on the
network and by the actions of network administrators and network
operations systems.

2.4 Solution Context

The solution context for Internet traffic engineering involves
analysis, evaluation of alternatives, and choice between alternative
courses of action.
Generally, the solution context is predicated on making reasonable
inferences about the current or future state of the network, and
subsequently making appropriate decisions that may involve a
preference between alternative sets of action. More specifically, the
solution context demands reasonable estimates of traffic workload,
characterization of network state, derivation of solutions to traffic
engineering problems which may be implicitly or explicitly
formulated, and possibly the instantiation of a set of control
actions. Control actions may involve the manipulation of parameters
associated with routing, control over tactical capacity acquisition,
and control over the traffic management functions.

The following list of instruments may be applicable to the solution
context of Internet traffic engineering:

(1) A set of policies, objectives, and requirements (which may be
    context dependent) for network performance evaluation and
    performance optimization.

(2) A collection of online and possibly offline tools and mechanisms
    for measurement, characterization, modeling, and control of
    Internet traffic, control over the placement and allocation of
    network resources, and control over the mapping or distribution
    of traffic onto the infrastructure.

(3) A set of constraints on the operating environment, the network
    protocols, and the traffic engineering system itself.

(4) A set of quantitative and qualitative techniques and
    methodologies for abstracting, formulating, and solving traffic
    engineering problems.

(5) A set of administrative control parameters which may be
    manipulated through a Configuration Management (CM) system. The
    CM system itself may include a configuration control subsystem, a
    configuration repository, a configuration accounting subsystem,
    and a configuration auditing subsystem.

(6) A set of guidelines for network performance evaluation,
    performance optimization, and performance improvement.

Derivation of traffic characteristics through measurement and/or
estimation is very useful within the realm of the solution space for
traffic engineering. Traffic estimates can be derived from customer
subscription information, traffic projections, traffic models, and
from actual empirical measurements. The empirical measurements may be
performed at the traffic aggregate level or at the flow level in
order to derive traffic statistics at various levels of detail.
Measurements at the flow level or on small traffic aggregates may be
performed at edge nodes, where traffic enters and leaves the network.
Measurements at large traffic aggregate levels may be performed
within the core of the network, where potentially numerous traffic
flows may be in transit concurrently.

To conduct performance studies and to support planning of existing
and future networks, a routing analysis may be performed to determine
the path(s) the routing protocols will choose for various traffic
demands, and to ascertain the utilization of network resources as
traffic is routed through the network. The routing analysis should
capture the selection of paths through the network, the assignment of
traffic across multiple feasible routes, and the multiplexing of IP
traffic over traffic trunks (if such constructs exist) and over the
underlying network infrastructure. A network topology model is a
necessity for routing analysis. A network topology model may be
extracted from network architecture documents, from network designs,
from information contained in router configuration files, from
routing databases, from routing tables, or from automated tools that
discover and depict network topology information.
Topology information may also be derived from servers that monitor
network state, and from servers that perform provisioning functions.

Routing in operational IP networks can be administratively controlled
at various levels of abstraction, including the manipulation of BGP
attributes and the manipulation of IGP metrics. For path oriented
technologies such as MPLS, routing can be further controlled by the
manipulation of relevant traffic engineering parameters, resource
parameters, and administrative policy constraints. Within the context
of MPLS, the path of an explicit label switched path (LSP) can be
computed and established in various ways, including: (1) manually,
(2) automatically online using constraint-based routing processes
implemented on label switching routers, and (3) automatically offline
using constraint-based routing entities implemented on external
traffic engineering support systems.

2.4.1 Combating the Congestion Problem

Minimizing congestion is a significant aspect of Internet traffic
engineering. This subsection gives an overview of the general
approaches that have been used or proposed to combat congestion
problems.

Congestion management policies can be categorized based upon the
following criteria (see, e.g., [YARE95] for a more detailed taxonomy
of congestion control schemes): (1) response time scale, which can be
characterized as long, medium, or short; (2) reactive versus
preventive, which relates to congestion control and congestion
avoidance; and (3) supply side versus demand side congestion
management schemes. These aspects are discussed in the following
paragraphs.

(1) Congestion Management based on Response Time Scales

- Long (weeks to months): Capacity planning works over a relatively
  long time scale to expand network capacity based on estimates or
  forecasts of future traffic demand and traffic distribution.
  Since router and link provisioning take time and are generally
  expensive, these upgrades are typically carried out on a
  weeks-to-months or even years time scale.

- Medium (minutes to days): Several control policies fall within the
  medium time scale category. Examples include: (1) adjusting IGP
  and/or BGP parameters to route traffic away from or towards certain
  segments of the network; (2) setting up and/or adjusting some
  explicitly routed label switched paths (ER-LSPs) in MPLS networks
  to route some traffic trunks away from possibly congested resources
  or towards possibly more favorable routes; and (3) re-configuring
  the logical topology of the network to make it correlate more
  closely with the spatial traffic distribution using, for example,
  some underlying path-oriented technology such as MPLS LSPs, ATM
  PVCs, or optical channel trails. Many of these adaptive medium time
  scale response schemes rely on a measurement system that monitors
  changes in traffic distribution, traffic shifts, and network
  resource utilization, and subsequently provides feedback to the
  online and/or offline traffic engineering mechanisms and tools,
  which employ this feedback information to trigger certain control
  actions to occur within the network. The traffic engineering
  mechanisms and tools can be implemented in a distributed or
  centralized fashion, and may have a hierarchical or a flat
  structure. The comparative merits of distributed and centralized
  control structures for networks are well known. A centralized
  scheme may have global visibility into the network state and may
  produce potentially more optimal solutions. However, centralized
  schemes are prone to single points of failure and may not scale as
  well as distributed schemes. Moreover, the information utilized by
  a centralized scheme may be stale and may not reflect the actual
  state of the network.
  It is not an objective of this memo to make a recommendation
  between distributed and centralized schemes. This is a choice that
  network administrators must make based on their specific needs.

- Short (picoseconds to minutes): This category includes packet level
  processing functions and events on the order of several round trip
  times. It includes router mechanisms such as passive and active
  buffer management. These mechanisms are used to control congestion
  and/or signal congestion to end systems so that they can adaptively
  regulate the rate at which traffic is injected into the network.
  One of the most popular active queue management schemes, especially
  for TCP traffic, is Random Early Detection (RED) [FLJA93], which
  supports congestion avoidance by controlling the average queue
  size. During congestion (but before the queue is filled), the RED
  scheme chooses arriving packets to "mark" according to a
  probabilistic algorithm which takes into account the average queue
  size. For a router that does not utilize explicit congestion
  notification (ECN) (see, e.g., [FLOY94]), the marked packets can
  simply be dropped to signal the inception of congestion to end
  systems. On the other hand, if the router supports ECN, then it can
  set the ECN field in the packet header. Several variations of RED
  have been proposed to support different drop precedence levels in
  multi-class environments [RFC-2597], e.g., RED with In and Out
  (RIO) and Weighted RED. There is general consensus that RED
  provides congestion avoidance performance which is not worse than
  traditional Tail-Drop (TD) queue management (drop arriving packets
  only when the queue is full). Importantly, however, RED reduces the
  possibility of global synchronization and improves fairness among
  different TCP sessions.
  However, RED by itself cannot prevent congestion and unfairness
  caused by sources unresponsive to RED, e.g., UDP traffic and some
  misbehaved greedy connections. Other schemes have been proposed to
  improve performance and fairness in the presence of unresponsive
  traffic. Some of these schemes were proposed as theoretical
  frameworks and are typically not available in existing commercial
  products. Two such schemes are Longest Queue Drop (LQD) and Dynamic
  Soft Partitioning with Random Drop (RND) [SLDC98].

(2) Congestion Management: Reactive versus Preventive Schemes

- Reactive: Reactive (recovery) congestion management policies react
  to existing congestion problems in order to relieve them. All the
  policies described in the long and medium time scales above can be
  categorized as reactive, especially if the policies are based on
  monitoring and identifying existing congestion problems, and on the
  initiation of relevant actions to ease the situation.

- Preventive: Preventive (predictive/avoidance) policies take
  proactive action to prevent congestion based on estimates and
  predictions of future potential congestion problems. Some of the
  policies described in the long and medium time scales fall into
  this category. They do not necessarily respond immediately to
  existing congestion problems. Instead, forecasts of traffic demand
  and workload distribution are considered, and action may be taken
  to prevent potential congestion problems in the future. The schemes
  described in the short time scale (e.g., RED and its variations,
  ECN, LQD, and RND) are also used for congestion avoidance, since
  dropping or marking packets before queues actually overflow would
  trigger corresponding TCP sources to slow down.

(3) Congestion Management: Supply Side versus Demand Side Schemes

- Supply side: Supply side congestion management policies increase
  the effective capacity available to traffic in order to control or
  obviate congestion. This can be accomplished by augmenting
  capacity. Another way to accomplish this is to minimize congestion
  by having a relatively balanced distribution of traffic over the
  network. For example, capacity planning should aim to provide a
  physical topology and associated link bandwidths that match
  estimated traffic workload and traffic distribution based on
  forecasting (subject to budgetary and other constraints). However,
  if the actual traffic distribution does not match the topology
  derived from capacity planning (due to forecasting errors or
  facility constraints, for example), then the traffic can be mapped
  onto the existing topology using routing control mechanisms, using
  path oriented technologies (e.g., MPLS LSPs and optical channel
  trails) to modify the logical topology, or by using some other load
  redistribution mechanisms.

- Demand side: Demand side congestion management policies control or
  regulate the offered traffic to alleviate congestion problems. For
  example, some of the short time scale mechanisms described earlier
  (such as RED and its variations, ECN, LQD, and RND), as well as
  policing and rate shaping mechanisms, attempt to regulate the
  offered load in various ways. Tariffs may also be applied as a
  demand side instrument. To date, however, tariffs have not been
  used as a means of demand side congestion management within the
  Internet.

In summary, a variety of mechanisms can be used to address congestion
problems in IP networks. These mechanisms may operate at multiple
time scales.
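The probabilistic marking behavior of RED described in the short time
scale discussion above can be sketched briefly. This is a simplified
rendering of the algorithm in [FLJA93]; the parameter values are
illustrative, and the count-based probability adjustment and
idle-period correction of the full algorithm are omitted:

```python
import random

# Simplified RED (Random Early Detection) marking sketch after
# [FLJA93]. Parameter values are illustrative only.

W_Q    = 0.002   # weight for the exponentially averaged queue size
MIN_TH = 5.0     # below this average queue size, never mark
MAX_TH = 15.0    # at or above this average queue size, always mark/drop
MAX_P  = 0.1     # marking probability as the average approaches MAX_TH

def update_average(avg: float, queue_len: int) -> float:
    """Exponentially weighted moving average of the queue size."""
    return (1.0 - W_Q) * avg + W_Q * queue_len

def mark_probability(avg: float) -> float:
    """Probability of marking (or dropping) an arriving packet."""
    if avg < MIN_TH:
        return 0.0
    if avg >= MAX_TH:
        return 1.0
    return MAX_P * (avg - MIN_TH) / (MAX_TH - MIN_TH)

def on_arrival(avg: float) -> bool:
    """Decide whether to mark an arriving packet, given the average."""
    return random.random() < mark_probability(avg)

print(mark_probability(3.0))    # 0.0  : below MIN_TH, never mark
print(mark_probability(10.0))   # 0.05 : halfway between the thresholds
print(mark_probability(20.0))   # 1.0  : beyond MAX_TH, always mark
```

On a router without ECN support the "mark" decision would translate to
a drop; with ECN it would set the ECN field in the packet header.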

2.5 Implementation and Operational Context

The operational context of Internet traffic engineering is
characterized by constant change, which occurs at multiple levels of
abstraction. The implementation context demands effective planning,
organization, and execution. The planning aspects may involve
determining prior sets of actions to achieve desired objectives.
Organizing involves arranging and assigning responsibility to the
various components of the traffic engineering system and coordinating
the activities to accomplish the desired TE objectives. Execution
involves measuring and applying corrective or perfective actions to
attain and maintain desired TE goals.

3.0 Traffic Engineering Process Model(s)

This section describes a generic process model that captures the high
level practical aspects of Internet traffic engineering in an
operational context. The process model is described as a sequence of
actions that a traffic engineer, or more generally a traffic
engineering system, must perform to optimize the performance of an
operational network (see also [RFC-2702, AWD2]). The process model
described here represents the broad activities common to most traffic
engineering methodologies, although the details regarding how traffic
engineering is executed may differ from network to network. This
process model may be enacted explicitly or implicitly, by an
automaton and/or by a human.

The traffic engineering process model is iterative [AWD2]. The four
phases of the process model described below are repeated continually.

The first phase of the TE process model is to define the relevant
control policies that govern the operation of the network. These
policies may depend upon many factors, including the prevailing
business model, the network cost structure, the operating
constraints, the utility model, and optimization criteria.

The second phase of the process model is a feedback mechanism
involving the acquisition of measurement data from the operational
network. If empirical data is not readily available from the network,
then synthetic workloads may be used instead, which reflect either
the prevailing or the expected workload of the network. Synthetic
workloads may be derived by estimation or extrapolation using prior
empirical data. Their derivation may also be obtained using
mathematical models of traffic characteristics or other means.

The third phase of the process model is to analyze the network state
and to characterize the traffic workload. Performance analysis may be
proactive and/or reactive. Proactive performance analysis identifies
potential problems that do not exist, but could manifest in the
future. Reactive performance analysis identifies existing problems,
determines their cause through diagnosis, and evaluates alternative
approaches to remedy the problem, if necessary. A number of
quantitative and qualitative techniques may be used in the analysis
process, including modeling based analysis and simulation. The
analysis phase of the process model may involve investigating the
concentration and distribution of traffic across the network or
relevant subsets of the network, identifying the characteristics of
the offered traffic workload, identifying existing or potential
bottlenecks, and identifying network pathologies such as ineffective
link placement, single points of failure, etc. Network pathologies
may result from many factors, including inferior network
architecture, inferior network design, and configuration problems. A
traffic matrix may be constructed as part of the analysis process.
Network analysis may also be descriptive or prescriptive.

The fourth phase of the TE process model is the performance
optimization of the network.
The performance optimization phase involves a decision process which
selects and implements a set of actions from a set of alternatives.
Optimization actions may include the use of appropriate techniques to
either control the offered traffic or to control the distribution of
traffic across the network. Optimization actions may also involve
adding additional links or increasing link capacity, deploying
additional hardware such as routers and switches, systematically
adjusting parameters associated with routing such as IGP metrics and
BGP attributes, and adjusting traffic management parameters. Network
performance optimization may also involve starting a network planning
process to improve the network architecture, network design, network
capacity, network technology, and the configuration of network
elements to accommodate current and future growth.

3.1 Components of the Traffic Engineering Process Model

The key components of the traffic engineering process model include a
measurement subsystem, a modeling and analysis subsystem, and an
optimization subsystem. The following subsections examine these
components as they apply to the traffic engineering process model.

3.2 Measurement

Measurement is crucial to the traffic engineering function. The
operational state of a network can be conclusively determined only
through measurement. Measurement is also critical to the optimization
function because it provides feedback data which is used by traffic
engineering control subsystems. This data is used to adaptively
optimize network performance in response to events and stimuli
originating within and outside the network. Measurement is also
needed to determine the quality of network services and to evaluate
the effectiveness of traffic engineering policies.
Experience suggests that measurement is most effective when acquired and applied systematically.

When developing a measurement system to support the traffic engineering function in IP networks, the following questions should be carefully considered: Why is measurement needed in this particular context? What parameters are to be measured? How should the measurement be accomplished? Where should the measurement be performed? When should the measurement be performed? How frequently should the monitored variables be measured? What level of measurement accuracy and reliability is desirable? What level of measurement accuracy and reliability is realistically attainable? To what extent can the measurement system permissibly interfere with the monitored network components and variables? What is the acceptable cost of measurement? The answers to these questions will determine the measurement tools and methodologies appropriate in any given traffic engineering context.

It should also be noted that there is a distinction between measurement and evaluation. Measurement provides raw data concerning state parameters and variables of monitored network elements. Evaluation utilizes the raw data to make inferences regarding the monitored system.

Measurement in support of the TE function can occur at different levels of abstraction. For example, measurement can be used to derive packet level characteristics, flow level characteristics, user or customer level characteristics, traffic aggregate characteristics, component level characteristics, and network wide characteristics.

3.3 Modeling, Analysis, and Simulation

Modeling and analysis are important aspects of Internet traffic engineering. Modeling involves constructing an abstract or physical representation which depicts relevant traffic characteristics and network attributes.
A network model is an abstract representation of the network which captures relevant network features, attributes, and characteristics, such as link and nodal attributes and constraints. A network model may facilitate analysis and/or simulation which can be used to predict network performance under various conditions as well as to guide network expansion plans.

In general, Internet traffic engineering models can be classified as either structural or behavioral. Structural models focus on the organization of the network and its components. Behavioral models focus on the dynamics of the network and the traffic workload. Modeling for Internet traffic engineering may also be formal or informal.

Accurate behavioral models for traffic sources are particularly useful for analysis. Development of behavioral traffic source models that are consistent with empirical data obtained from operational networks is a major research topic in Internet traffic engineering. These source models should also be tractable and amenable to analysis. Because source models for IP traffic remain a subject of ongoing research, they are outside the scope of this document; their importance, however, must be emphasized.

Network simulation tools are extremely useful for traffic engineering. Because of the complexity of realistic quantitative analysis of network behavior, certain aspects of network performance studies can only be conducted effectively using simulation. A good network simulator can be used to mimic and visualize network characteristics under various conditions in a safe and non-disruptive manner. For example, a network simulator may be used to depict congested resources and hot spots, and to provide hints regarding possible solutions to network performance problems.
A good simulator may also be used to validate the effectiveness of planned solutions to network issues without the need to tamper with the operational network, or to commence an expensive network upgrade which may not achieve the desired objectives. Furthermore, during the process of network planning, a network simulator may reveal pathologies such as single points of failure which may require additional redundancy, and potential bottlenecks and hot spots which may require additional capacity.

Routing simulators are especially useful in large networks. A routing simulator may identify planned links which may not actually be used to route traffic by the existing routing protocols. Simulators can also be used to conduct scenario based and perturbation based analysis, as well as sensitivity studies. Simulation results can be used to initiate appropriate actions in various ways. For example, an important application of network simulation tools is to investigate and identify how best to evolve and grow the network in order to accommodate projected future demands.

3.4 Optimization

Network performance optimization involves resolving network issues by transforming such issues into concepts that enable a solution, identification of a solution, and implementation of the solution. Network performance optimization can be corrective or perfective. In corrective optimization, the goal is to remedy a problem that has occurred or that is incipient. In perfective optimization, the goal is to improve network performance even when explicit problems do not exist and are not anticipated.

Network performance optimization is a continual process, as noted previously. Performance optimization iterations may consist of real-time optimization sub-processes and non-real-time network planning sub-processes.
The difference between real-time optimization and network planning lies primarily in the relative time-scales in which they operate and in the granularity of actions. One of the objectives of a real-time optimization sub-process is to control the mapping and distribution of traffic over the existing network infrastructure to avoid and/or relieve congestion, to assure satisfactory service delivery, and to optimize resource utilization. Real-time optimization is needed because random incidents such as fiber cuts or shifts in traffic demand will occur irrespective of how well a network is designed. These incidents can cause congestion and other problems to manifest in an operational network. Real-time optimization must solve such problems on small to medium time-scales ranging from micro-seconds to minutes or hours. Examples of real-time optimization include queue management, IGP/BGP metric tuning, and using technologies such as MPLS explicit LSPs to change the paths of some traffic trunks [XIAO].

One of the functions of the network planning sub-process is to initiate actions to systematically evolve the architecture, technology, topology, and capacity of a network. When a problem exists in the network, real-time optimization should provide an immediate remedy. Because a prompt response is necessary, the real-time solution may not be the best possible solution. Network planning may subsequently be needed to refine the solution and improve the situation. Network planning is also required to expand the network to support traffic growth and changes in traffic distribution over time. As previously noted, a change in the topology and/or capacity of the network may be the outcome of network planning.

Clearly, network planning and real-time performance optimization are mutually complementary activities.
A well-planned and designed network makes real-time optimization easier, while a systematic approach to real-time network performance optimization allows network planning to focus on long term issues rather than tactical considerations. Systematic real-time network performance optimization also provides valuable inputs and insights toward network planning.

Stability is an important consideration in real-time network performance optimization. This aspect will be repeatedly addressed throughout this memo.

4.0 Historical Review and Recent Developments

This section briefly reviews different traffic engineering approaches proposed and implemented in telecommunications and computer networks. The discussion is not intended to be comprehensive. It is primarily intended to illuminate pre-existing perspectives and prior art concerning traffic engineering in the Internet and in legacy telecommunications networks.

4.1 Traffic Engineering in Classical Telephone Networks

This subsection presents a brief overview of traffic engineering in telephone networks, which often relates to the way user traffic is steered from an originating node to the terminating node. A detailed description of the various routing strategies applied in telephone networks is included in the book by G. Ash [ASH2].

The early telephone network relied on static hierarchical routing, whereby routing patterns remained fixed independent of the state of the network or time of day. The hierarchy was intended to accommodate overflow traffic, improve network reliability via alternate routes, and prevent call looping by employing strict hierarchical rules.
The network was typically over-provisioned since a given fixed route had to be dimensioned so that it could carry user traffic during a busy hour of any busy day. Hierarchical routing in the telephony network was found to be too rigid upon the advent of digital switches and stored program control, which were able to manage more complicated traffic engineering rules.

Dynamic routing was introduced to alleviate the routing inflexibility in static hierarchical routing so that the network would operate more efficiently. This resulted in significant economic gains [HUSS87]. Dynamic routing typically reduces the overall loss probability by 10 to 20 percent (compared to static hierarchical routing). Dynamic routing can also improve network resilience by recalculating routes on a per-call basis and periodically updating routes.

There are three main types of dynamic routing in the telephone network: time-dependent routing, state-dependent routing (SDR), and event-dependent routing (EDR).

In time-dependent routing, regular variations in traffic loads (such as time of day or day of week) are exploited in pre-planned routing tables. In state-dependent routing, routing tables are updated online according to the current state of the network (e.g., traffic demand, utilization, etc.). In event-dependent routing, routing changes are incepted by events (such as call setups encountering congested or blocked links) whereupon new paths are searched out using learning models. EDR methods are real-time adaptive, but unlike SDR they do not require global state information. Examples of EDR schemes include dynamic alternate routing (DAR) from BT, state-and-time dependent routing (STR) from NTT, and success-to-the-top (STT) routing from AT&T.
Dynamic non-hierarchical routing (DNHR) is an example of dynamic routing that was introduced in the AT&T toll network in the 1980s to respond to time-dependent information such as regular load variations as a function of time. Time-dependent information in terms of load may be divided into three time scales: hourly, weekly, and yearly. Correspondingly, three algorithms are defined to pre-plan the routing tables. The network design algorithm operates over a year-long interval, while the demand servicing algorithm operates on a weekly basis to fine tune link sizes and routing tables to correct forecast errors on the yearly basis. At the smallest time scale, the routing algorithm is used to make limited adjustments based on daily traffic variations. Network design and demand servicing are computed using offline calculations. Typically, the calculations require extensive search on possible routes. On the other hand, routing may need online calculations to handle crankback. DNHR adopts a "two-link" approach whereby a path can consist of at most two links. The routing algorithm presents an ordered list of route choices between an originating switch and a terminating switch. If a call overflows, a via switch (a tandem exchange between the originating switch and the terminating switch) sends a crankback signal to the originating switch. This switch then selects the next route, and so on, until either the call is established or no alternative routes remain, in which case the call is blocked.

4.2 Evolution of Traffic Engineering in Packet Networks

This subsection reviews related prior work that was intended to improve the performance of data networks. Indeed, optimization of the performance of data networks started in the early days of the ARPANET.
Other early commercial networks such as SNA also recognized the importance of performance optimization and service differentiation.

In terms of traffic management, the Internet has been a best effort service environment until recently. In particular, very limited traffic management capabilities existed in IP networks to provide differentiated queue management and scheduling services to packets belonging to different classes.

In terms of routing control, the Internet has employed distributed protocols for intra-domain routing. These protocols are highly scalable and resilient. However, they are based on simple algorithms for path selection which have very limited functionality to allow flexible control of the path selection process.

In the following subsections, the evolution of practical traffic engineering mechanisms in IP networks and its predecessors is reviewed.

4.2.1 Adaptive Routing in the ARPANET

The early ARPANET recognized the importance of adaptive routing, where routing decisions were based on the current state of the network [MCQ80]. Early minimum delay routing approaches forwarded each packet to its destination along a path for which the total estimated transit time was the smallest. Each node maintained a table of network delays, representing the estimated delay that a packet would experience along a given path toward its destination. The minimum delay table was periodically transmitted by a node to its neighbors. The shortest path, in terms of hop count, was also propagated to give the connectivity information.

One drawback to this approach is that dynamic link metrics tend to create "traffic magnets" causing congestion to be shifted from one location of a network to another location, resulting in oscillation and network instability.
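The minimum-delay table update described above can be sketched in a few lines. This is a hypothetical illustration, not ARPANET code: function and variable names are invented, and the update shown is a simple distance-vector style minimization over the delay estimates advertised by neighbors.

```python
# Hypothetical sketch of an ARPANET-style minimum-delay table update:
# for each destination, keep the smallest estimated delay obtainable
# via any neighbor (delay to the neighbor plus the neighbor's own
# advertised estimate). Names are illustrative, not from [MCQ80].
def update_delay_table(link_delay, neighbor_tables):
    """link_delay: {neighbor: measured delay to that neighbor}
       neighbor_tables: {neighbor: {dest: neighbor's estimated delay}}"""
    table = {}
    for nbr, nbr_table in neighbor_tables.items():
        for dest, est in nbr_table.items():
            candidate = link_delay[nbr] + est
            if candidate < table.get(dest, float("inf")):
                table[dest] = candidate
    return table

# Node A hears from neighbors B and C about destination D:
# via B: 5 + 3 = 8; via C: 2 + 10 = 12; so A's estimate for D is 8.
table = update_delay_table(
    {"B": 5, "C": 2},
    {"B": {"D": 3}, "C": {"D": 10}},
)
assert table == {"D": 8}
```

Because the delay estimates themselves change as traffic moves onto the newly preferred path, this kind of update is exactly what can produce the "traffic magnet" oscillation noted above.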
4.2.2 Dynamic Routing in the Internet

The Internet evolved from the ARPANET and adopted dynamic routing algorithms with distributed control to determine the paths that packets should take en-route to their destinations. The routing algorithms are adaptations of shortest path algorithms where costs are based on link metrics. The link metric can be based on static or dynamic quantities. A link metric based on static quantities may be assigned administratively according to local criteria. A link metric based on dynamic quantities may be a function of a network congestion measure such as delay or packet loss.

It was apparent early that static link metric assignment was inadequate because it can easily lead to unfavorable scenarios in which some links become congested while others remain lightly loaded. One of the many reasons for the inadequacy of static link metrics is that link metric assignment was often done without considering the traffic matrix in the network. Also, the routing protocols did not take traffic attributes and capacity constraints into account when making routing decisions. This results in traffic concentration being localized in subsets of the network infrastructure and potentially causing congestion. Even if link metrics are assigned in accordance with the traffic matrix, unbalanced loads in the network can still occur due to a number of factors, including:

- Resources may not be deployed in the most optimal locations from a routing perspective.

- Forecasting errors in traffic volume and/or traffic distribution.

- Dynamics in the traffic matrix due to the temporal nature of traffic patterns, BGP policy changes from peers, etc.
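The concentration problem described above can be made concrete with a small sketch. This is an illustrative example only (the topology and metrics are invented): plain Dijkstra over static metrics selects the same lowest-cost path for every demand, never consulting link capacity or current load.

```python
# Illustrative only: shortest-path routing with static link metrics.
# Capacity and load are never consulted, so all traffic between a pair
# of nodes concentrates on the single lowest-metric path.
import heapq

def shortest_path(graph, src, dst):
    """Plain Dijkstra over static metrics (graph: node -> {nbr: metric})."""
    dist, prev = {src: 0}, {}
    heap = [(0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            break
        if d > dist.get(node, float("inf")):
            continue
        for nbr, metric in graph[node].items():
            nd = d + metric
            if nd < dist.get(nbr, float("inf")):
                dist[nbr], prev[nbr] = nd, node
                heapq.heappush(heap, (nd, nbr))
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return list(reversed(path))

# Two parallel routes A-B-D and A-C-D exist, but the static metrics
# favor A-B-D, so every A-to-D demand lands there even if the B links
# are already congested while the C links sit idle.
graph = {
    "A": {"B": 1, "C": 2},
    "B": {"A": 1, "D": 1},
    "C": {"A": 2, "D": 2},
    "D": {"B": 1, "C": 2},
}
assert shortest_path(graph, "A", "D") == ["A", "B", "D"]
```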
The inadequacy of the legacy Internet interior gateway routing system is one of the factors motivating the interest in path oriented technology with explicit routing and constraint-based routing capability such as MPLS.

4.2.3 ToS Routing

Type-of-Service (ToS) routing involves selecting different routes to the same destination depending upon the ToS field of an IP packet [RFC-1349]. The ToS classes may be classified as low delay and high throughput. Each link is associated with multiple link costs and each link cost is used to compute routes for a particular ToS. A separate shortest path tree is computed for each ToS. The shortest path algorithm must be run for each ToS, resulting in very expensive computation. Classical ToS-based routing is now outdated as the IP header field has been replaced by a Diffserv field. Effective traffic engineering is difficult to perform in classical ToS-based routing because each class still relies exclusively on shortest path routing, which results in localization of traffic concentration within the network.

4.2.4 Equal Cost Multi-Path

Equal Cost Multi-Path (ECMP) is another technique that attempts to address the deficiency in Shortest Path First (SPF) interior gateway routing systems [RFC-2178]. In the classical SPF algorithm, if two or more shortest paths exist to a given destination, the algorithm will choose one of them. The algorithm is modified slightly in ECMP so that if two or more equal cost shortest paths exist between two nodes, the traffic between the nodes is distributed among the multiple equal-cost paths. Traffic distribution across the equal-cost paths is usually performed in one of two ways: (1) packet-based in a round-robin fashion, or (2) flow-based using hashing on source and destination IP addresses and possibly other fields of the IP header.
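The flow-based approach (2) can be sketched as follows. This is a hedged illustration, not a vendor implementation: the hash construction and function names are invented, and real routers typically use cheaper hardware hashes.

```python
# Hypothetical sketch of flow-based ECMP next-hop selection: hash the
# (source, destination) address pair and use the result to pick one of
# the equal-cost next hops. The hash choice here is illustrative.
import hashlib

def select_next_hop(src_ip: str, dst_ip: str, next_hops: list) -> str:
    """Map a flow deterministically onto one of the equal-cost next hops."""
    key = f"{src_ip}:{dst_ip}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]

paths = ["hop-A", "hop-B", "hop-C"]
# All packets of one flow hash to the same next hop, so packet order
# within the flow is preserved; different flows may spread unevenly.
assert select_next_hop("10.0.0.1", "192.0.2.7", paths) == \
       select_next_hop("10.0.0.1", "192.0.2.7", paths)
```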
The first approach can easily cause out-of-order packets, while the second approach is dependent upon the number and distribution of flows. Flow-based load sharing may be unpredictable in an enterprise network where the number of flows is relatively small and less heterogeneous (for example, hashing may not be uniform), but it is generally effective in core public networks where the number of flows is large and heterogeneous.

In ECMP, link costs are static and bandwidth constraints are not considered, so ECMP attempts to distribute the traffic as equally as possible among the equal-cost paths independent of the congestion status of each path. As a result, given two equal-cost paths, it is possible that one of the paths will be more congested than the other. Another drawback of ECMP is that load sharing cannot be achieved on multiple paths which have non-identical costs.

4.2.5 Nimrod

Nimrod is a routing system developed to provide heterogeneous service specific routing in the Internet, while taking multiple constraints into account [RFC-1992]. Essentially, Nimrod is a link state routing protocol which supports path oriented packet forwarding. It uses the concept of maps to represent network connectivity and services at multiple levels of abstraction. Mechanisms are provided to allow restriction of the distribution of routing information.

Even though Nimrod did not enjoy deployment in the public Internet, a number of key concepts incorporated into the Nimrod architecture, such as explicit routing which allows selection of paths at originating nodes, are beginning to find applications in some recent constraint-based routing initiatives.
4.3 Overlay Model

In the overlay model, a virtual-circuit network, such as ATM, frame relay, or WDM, provides virtual-circuit connectivity between routers that are located at the edges of a virtual-circuit cloud. In this mode, two routers that are connected through a virtual circuit see a direct adjacency between themselves independent of the physical route taken by the virtual circuit through the ATM, frame relay, or WDM network. Thus, the overlay model essentially decouples the logical topology that routers see from the physical topology that the ATM, frame relay, or WDM network manages. The overlay model based on ATM or frame relay enables a network administrator or an automaton to employ traffic engineering concepts to perform path optimization by re-configuring or rearranging the virtual circuits so that a virtual circuit on a congested or sub-optimal physical link can be re-routed to a less congested or more optimal one. In the overlay model, traffic engineering is also employed to establish relationships between the traffic management parameters (e.g., PCR, SCR, and MBS for ATM) of the virtual-circuit technology and the actual traffic that traverses each circuit. These relationships can be established based upon known or projected traffic profiles, and some other factors.

The overlay model using IP over ATM requires the management of two separate networks with different technologies (IP and ATM), resulting in increased operational complexity and cost. In the fully-meshed overlay model, each router would peer with every other router in the network, so that the total number of adjacencies is a quadratic function of the number of routers. Some of the issues with the overlay model are discussed in [AWD2].
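The quadratic adjacency growth mentioned above is easy to quantify. A short illustrative calculation (the function name is invented):

```python
# In a fully-meshed overlay, every router pairs with every other
# router, giving n * (n - 1) / 2 adjacencies for n routers.
def full_mesh_adjacencies(n_routers: int) -> int:
    return n_routers * (n_routers - 1) // 2

# Growth is quadratic: a ten-fold increase in routers yields roughly
# a hundred-fold increase in adjacencies to configure and maintain.
assert full_mesh_adjacencies(10) == 45
assert full_mesh_adjacencies(100) == 4950
```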
4.4 Constraint-Based Routing

Constraint-based routing refers to a class of routing systems that compute routes through a network subject to the satisfaction of a set of constraints and requirements. In the most general setting, constraint-based routing may also seek to optimize overall network performance while minimizing costs.

The constraints and requirements may be imposed by the network itself or by administrative policies. Constraints may include bandwidth, hop count, delay, and policy instruments such as resource class attributes. Constraints may also include domain specific attributes of certain network technologies and contexts which impose restrictions on the solution space of the routing function. Path oriented technologies such as MPLS have made constraint-based routing feasible and attractive in public IP networks.

The concept of constraint-based routing within the context of MPLS traffic engineering requirements in IP networks was first defined in [RFC-2702].

Unlike QoS routing (for example, see [RFC-2386] and [MA]), which generally addresses the issue of routing individual traffic flows to satisfy prescribed flow based QoS requirements subject to network resource availability, constraint-based routing is applicable to traffic aggregates as well as flows and may be subject to a wide variety of constraints which may include policy restrictions.

4.5 Overview of Other IETF Projects Related to Traffic Engineering

This subsection reviews a number of IETF activities pertinent to Internet traffic engineering. These activities are primarily intended to evolve the IP architecture to support new service definitions which allow preferential or differentiated treatment to be accorded to certain types of traffic.
4.5.1 Integrated Services

The IETF Integrated Services working group developed the integrated services (Intserv) model. This model requires resources, such as bandwidth and buffers, to be reserved a priori for a given traffic flow to ensure that the quality of service requested by the traffic flow is satisfied. The integrated services model includes additional components beyond those used in the best-effort model, such as packet classifiers, packet schedulers, and admission control. A packet classifier is used to identify flows that are to receive a certain level of service. A packet scheduler handles the scheduling of service to different packet flows to ensure that QoS commitments are met. Admission control is used to determine whether a router has the necessary resources to accept a new flow.

Two services have been defined under the Integrated Services model: guaranteed service [RFC-2212] and controlled-load service [RFC-2211].

The guaranteed service can be used for applications requiring bounded packet delivery time. For this type of application, data that is delivered to the application after a pre-defined amount of time has elapsed is usually considered worthless. Therefore, guaranteed service was intended to provide a firm quantitative bound on the end-to-end packet delay for a flow. This is accomplished by controlling the queuing delay on network elements along the data flow path. The guaranteed service model does not, however, provide bounds on jitter (inter-arrival times between consecutive packets).

The controlled-load service can be used for adaptive applications that can tolerate some delay but are sensitive to traffic overload conditions. This type of application typically functions satisfactorily when the network is lightly loaded, but its performance degrades significantly when the network is heavily loaded.
Controlled-load service therefore has been designed to provide approximately the same service as best-effort service in a lightly loaded network, regardless of actual network conditions. Controlled-load service is described qualitatively in that no target values of delay or loss are specified.

The main issue with the Integrated Services model has been scalability [RFC-2998], especially in large public IP networks which may potentially have millions of active micro-flows in transit concurrently.

A notable feature of the Integrated Services model is that it requires explicit signaling of QoS requirements from end systems to routers [RFC-2753]. The Resource Reservation Protocol (RSVP) performs this signaling function and is a critical component of the Integrated Services model. The RSVP protocol is described next.

4.5.2 RSVP

RSVP is a soft state signaling protocol [RFC-2205]. It supports receiver initiated establishment of resource reservations for both multicast and unicast flows. RSVP was originally developed as a signaling protocol within the integrated services framework for applications to communicate QoS requirements to the network and for the network to reserve relevant resources to satisfy the QoS requirements [RFC-2205].

Under RSVP, the sender or source node sends a PATH message to the receiver with the same source and destination addresses as the traffic which the sender will generate. The PATH message contains: (1) a sender Tspec specifying the characteristics of the traffic, (2) a sender Template specifying the format of the traffic, and (3) an optional Adspec which is used to support the concept of "one pass with advertising" (OPWA) [RFC-2205]. Every intermediate router along the path forwards the PATH message to the next hop determined by the routing protocol.
Upon receiving a PATH message, the receiver responds with a RESV message which includes a flow descriptor used to request resource reservations. The RESV message travels to the sender or source node in the opposite direction along the path that the PATH message traversed. Every intermediate router along the path can reject or accept the reservation request of the RESV message. If the request is rejected, the rejecting router will send an error message to the receiver and the signaling process will terminate. If the request is accepted, link bandwidth and buffer space are allocated for the flow and the related flow state information is installed in the router.

One of the issues with the original RSVP specification was scalability. This is because reservations were required for micro-flows, so that the amount of state maintained by network elements tends to increase linearly with the number of micro-flows. These issues are described in [RFC-2961].

Recently, RSVP has been modified and extended in several ways to mitigate the scaling problems. As a result, it is becoming a versatile signaling protocol for the Internet. For example, RSVP has been extended to reserve resources for aggregation of flows, to set up MPLS explicit label switched paths, and to perform other signaling functions within the Internet. There are also a number of proposals to reduce the amount of refresh messages required to maintain established RSVP sessions [RFC-2961].

A number of IETF working groups have been engaged in activities related to the RSVP protocol. These include the original RSVP working group, the MPLS working group, the Resource Allocation Protocol working group, and the Policy Framework working group.
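The PATH/RESV exchange described above can be sketched as a small simulation. This is a hypothetical illustration of the hop-by-hop admission step only, not an RSVP implementation: router names, capacities, and the simple bandwidth-only admission test are all invented for the example.

```python
# Hypothetical sketch of RESV processing: the RESV message walks the
# reverse of the route the PATH message took, and each router either
# admits the reservation (installing flow state) or rejects it.
class Router:
    def __init__(self, name, free_bandwidth):
        self.name = name
        self.free_bandwidth = free_bandwidth
        self.reservations = {}   # per-flow soft state

    def admit(self, flow_id, bandwidth):
        """Toy admission control: accept only if bandwidth remains."""
        if bandwidth > self.free_bandwidth:
            return False
        self.free_bandwidth -= bandwidth
        self.reservations[flow_id] = bandwidth
        return True

def signal_reservation(path_routers, flow_id, bandwidth):
    """path_routers is ordered sender -> receiver; RESV goes in reverse."""
    for router in reversed(path_routers):
        if not router.admit(flow_id, bandwidth):
            # A rejecting router would send an error to the receiver;
            # this sketch omits tearing down partial reservations.
            return f"rejected at {router.name}"
    return "reserved"

route = [Router("R1", 100), Router("R2", 30), Router("R3", 100)]
assert signal_reservation(route, "flow-1", 20) == "reserved"
assert signal_reservation(route, "flow-2", 20) == "rejected at R2"
```

The per-flow `reservations` dictionary also illustrates the scalability concern noted above: state grows linearly with the number of micro-flows.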
4.5.3 Differentiated Services

The goal of the Differentiated Services (Diffserv) effort within the IETF is to devise scalable mechanisms for categorization of traffic into behavior aggregates, which ultimately allows each behavior aggregate to be treated differently, especially when there is a shortage of resources such as link bandwidth and buffer space [RFC-2475]. One of the primary motivations for the Diffserv effort was to devise alternative mechanisms for service differentiation in the Internet that mitigate the scalability issues encountered with the Intserv model.

The IETF Diffserv working group has defined a Differentiated Services field in the IP header (DS field). The DS field consists of six bits of the part of the IP header formerly known as the TOS octet. The DS field is used to indicate the forwarding treatment that a packet should receive at a node [RFC-2474]. The Diffserv working group has also standardized a number of Per-Hop Behavior (PHB) groups. Using the PHBs, several classes of services can be defined using different classification, policing, shaping, and scheduling rules.

For an end-user of network services to receive Differentiated Services from its Internet Service Provider (ISP), it may be necessary for the user to have a Service Level Agreement (SLA) with the ISP. An SLA may explicitly or implicitly specify a Traffic Conditioning Agreement (TCA) which defines classifier rules as well as metering, marking, discarding, and shaping rules.

Packets are classified, and possibly policed and shaped, at the ingress to a Diffserv network. When a packet traverses the boundary between different Diffserv domains, the DS field of the packet may be re-marked according to existing agreements between the domains.
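As a small illustration of the DS field layout described above (the function name is invented for the example): the six-bit codepoint occupies the most significant bits of the octet formerly known as the TOS byte, so extracting it is a shift and mask.

```python
# Illustrative only: the DS field is the six most significant bits of
# the former TOS octet [RFC-2474]; the remaining two bits are not part
# of the DS field.
def dscp_from_tos(tos_octet: int) -> int:
    """Extract the 6-bit DSCP codepoint from the (former) TOS octet."""
    return (tos_octet >> 2) & 0x3F

# The Expedited Forwarding PHB uses codepoint 46 (binary 101110);
# carried in the octet it appears as 10111000 = 0xB8.
assert dscp_from_tos(0xB8) == 46
assert dscp_from_tos(0x00) == 0
```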
Differentiated Services allows only a finite number of service classes to be indicated by the DS field. The main advantage of the Diffserv approach relative to the Intserv model is scalability. Resources are allocated on a per-class basis, and the amount of state information is proportional to the number of classes rather than to the number of application flows.

It should be clear from the preceding discussion that the Diffserv model essentially deals with traffic management issues on a per-hop basis. The Diffserv control model consists of a collection of micro-TE control mechanisms. Other traffic engineering capabilities, such as capacity management (including routing control), are also required in order to deliver acceptable service quality in Diffserv networks. The concept of Per-Domain Behaviors has been introduced to better capture the notion of differentiated services across a complete domain [RFC-3086].

4.5.4 MPLS

MPLS is an advanced forwarding scheme that also includes extensions to conventional IP control plane protocols. MPLS extends the Internet routing model and enhances packet forwarding and path control [RFC-3031].

At the ingress to an MPLS domain, label switching routers (LSRs) classify IP packets into forwarding equivalence classes (FECs) based on a variety of factors, including, e.g., a combination of the information carried in the IP headers of the packets and the local routing information maintained by the LSRs. An MPLS label is then prepended to each packet according to its forwarding equivalence class. In a non-ATM/FR environment, the label is 32 bits long and contains a 20-bit label field, a 3-bit experimental field (formerly known as the Class-of-Service or CoS field), a 1-bit label stack indicator, and an 8-bit TTL field.
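The 32-bit label layout just described (20-bit label, 3-bit experimental field, 1-bit bottom-of-stack indicator, 8-bit TTL, most significant field first) can be sketched with straightforward bit packing; the field values in the example are arbitrary illustrations.

```python
# Sketch of the 32-bit MPLS label stack entry layout described above,
# packed most-significant field first: label(20) | exp(3) | s(1) | ttl(8).

def pack_label(label, exp, s, ttl):
    assert 0 <= label < 2**20 and 0 <= exp < 8 and s in (0, 1) and 0 <= ttl < 256
    return (label << 12) | (exp << 9) | (s << 8) | ttl

def unpack_label(entry):
    return {
        "label": (entry >> 12) & 0xFFFFF,  # 20-bit label value
        "exp":   (entry >> 9) & 0x7,       # 3-bit experimental field
        "s":     (entry >> 8) & 0x1,       # bottom-of-stack indicator
        "ttl":   entry & 0xFF,             # time to live
    }

entry = pack_label(label=18, exp=5, s=1, ttl=64)
print(hex(entry))            # 0x12b40
print(unpack_label(entry))   # the original four field values
```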
In an ATM (FR) environment, the label consists of information encoded in the VCI/VPI (DLCI) field. An MPLS-capable router (an LSR) examines the label, and possibly the experimental field, and uses this information to make packet forwarding decisions.

An LSR makes forwarding decisions by using the label prepended to a packet as the index into a local next hop label forwarding entry (NHLFE). The packet is then processed as specified in the NHLFE. The incoming label may be replaced by an outgoing label, and the packet may be switched to the next LSR. This label-switching process is very similar to the label (VCI/VPI) swapping process in ATM networks. Before a packet leaves an MPLS domain, its MPLS label may be removed. A Label Switched Path (LSP) is the path between an ingress LSR and an egress LSR through which a labeled packet traverses. The path of an explicit LSP is defined at the originating (ingress) node of the LSP. MPLS can use a signaling protocol such as RSVP or LDP to set up LSPs.

MPLS is a very powerful technology for Internet traffic engineering because it supports explicit LSPs, which allow constraint-based routing to be implemented efficiently in IP networks [AWD2]. The requirements for traffic engineering over MPLS are described in [RFC-2702]. Extensions to RSVP to support instantiation of explicit LSPs are discussed in [AWD3]. Extensions to LDP, known as CR-LDP, to support explicit LSPs are presented in [JAM].

4.5.5 IP Performance Metrics

The IETF IP Performance Metrics (IPPM) working group has been developing a set of standard metrics that can be used to monitor the quality, performance, and reliability of Internet services.
These metrics can be applied by network operators, end-users, and independent testing groups to provide users and service providers with a common understanding of the performance and reliability of the Internet component 'clouds' they use/provide [RFC2330]. The criteria for performance metrics developed by the IPPM WG are described in [RFC2330]. Examples of performance metrics include one-way packet loss [RFC2680], one-way delay [RFC2679], and connectivity measures between two nodes [RFC2678]. Other metrics include second-order measures of packet loss and delay.

Some of the performance metrics specified by the IPPM WG are useful for specifying Service Level Agreements (SLAs). SLAs are sets of service level objectives negotiated between users and service providers, wherein each objective is a combination of one or more performance metrics, possibly subject to certain constraints.

4.5.6 Flow Measurement

The IETF Real Time Flow Measurement (RTFM) working group has produced an architecture document defining a method to specify traffic flows, as well as a number of components for flow measurement (meters, meter readers, managers) [RFC-2722]. A flow measurement system enables network traffic flows to be measured and analyzed at the flow level for a variety of purposes. As noted in RFC-2722, a flow measurement system can be very useful in the following contexts: (1) understanding the behavior of existing networks, (2) planning for network development and expansion, (3) quantification of network performance, (4) verifying the quality of network service, and (5) attribution of network usage to users.

A flow measurement system consists of meters, meter readers, and managers.
A meter observes packets passing through a measurement point, classifies them into certain groups, accumulates certain usage data (such as the number of packets and bytes for each group), and stores the usage data in a flow table. A group may represent a user application, a host, a network, a group of networks, etc. A meter reader gathers usage data from various meters so it can be made available for analysis. A manager is responsible for configuring and controlling meters and meter readers. The instructions received by a meter from a manager include flow specifications, meter control parameters, and sampling techniques. The instructions received by a meter reader from a manager include the address of the meter whose data is to be collected, the frequency of data collection, and the types of flows to be collected.

4.5.7 Endpoint Congestion Management

The Congestion Manager [RFC-3124] is intended to provide a set of congestion control mechanisms that transport protocols can use, and to unify congestion control across a subset of an endpoint's active unicast connections (called a congestion group). A congestion manager continuously monitors the state of the path for each congestion group under its control, and uses that information to instruct a scheduler on how to partition bandwidth among the connections of that congestion group.

4.6 Overview of ITU Activities Related to Traffic Engineering

This section provides an overview of prior work within the ITU-T pertaining to traffic engineering in traditional telecommunications networks.

ITU-T Recommendations E.600 [ITU-E600], E.701 [ITU-E701], and E.801 [ITU-E801] address traffic engineering issues in traditional telecommunications networks.
Recommendation E.600 provides a vocabulary for describing traffic engineering concepts, while E.701 defines reference connections, Grade of Service (GoS), and traffic parameters for ISDN. Recommendation E.701 uses the concept of a reference connection to identify representative cases of different types of connections without describing the specifics of their actual realizations by different physical means. As defined in Recommendation E.600, "a connection is an association of resources providing means for communication between two or more devices in, or attached to, a telecommunication network." Also, E.600 defines "a resource as any set of physically or conceptually identifiable entities within a telecommunication network, the use of which can be unambiguously determined" [ITU-E600]. There can be different types of connections, as the number and types of resources in a connection may vary.

Typically, different network segments are involved in the path of a connection. For example, a connection may be local, national, or international. The purpose of reference connections is to clarify and specify traffic performance issues at various interfaces between different network domains. Each domain may consist of one or more service provider networks.

Reference connections provide a basis for defining Grade of Service (GoS) parameters related to traffic engineering within the ITU-T framework. As defined in E.600, "GoS refers to a number of traffic engineering variables which are used to provide a measure of the adequacy of a group of resources under specified conditions." These GoS variables may be probability of loss, dial-tone delay, etc. They are essential for network internal design and operation, as well as for component performance specification.

GoS is different from Quality of Service (QoS) in the ITU framework.
QoS is the performance perceivable by a telecommunication service user and expresses the user's degree of satisfaction with the service. QoS parameters focus on performance aspects observable at service access points and network interfaces, rather than their causes within the network. GoS, on the other hand, is a set of network-oriented measures which characterize the adequacy of a group of resources under specified conditions. For a network to be effective in serving its users, the values of both GoS and QoS parameters must be related, with GoS parameters typically making a major contribution to the QoS.

Recommendation E.600 stipulates that a set of GoS parameters must be selected and defined on an end-to-end basis for each major service category provided by a network, to assist the network provider in improving the efficiency and effectiveness of the network. Based on a selected set of reference connections, suitable target values are assigned to the selected GoS parameters under normal and high load conditions. These end-to-end GoS target values are then apportioned to individual resource components of the reference connections for dimensioning purposes.

4.7 Content Distribution

The Internet is dominated by client-server interactions, especially Web traffic (in the future, more sophisticated media servers may become dominant). The location and performance of major information servers have a significant impact on the traffic patterns within the Internet, as well as on the perception of service quality by end users.

A number of dynamic load balancing techniques have been devised to improve the performance of replicated information servers.
These techniques can cause the spatial traffic characteristics of the Internet to become more dynamic, because information servers can be dynamically picked based upon the location of the clients, the location of the servers, the relative utilization of the servers, the relative performance of different networks, and the relative performance of different parts of a network. This process of assigning distributed servers to clients is called Traffic Directing. It functions at the application layer.

Traffic Directing schemes that allocate servers in multiple geographically dispersed locations to clients may require empirical network performance statistics to make more effective decisions. In the future, network measurement systems may need to provide this type of information. The exact parameters needed are not yet defined.

When congestion exists in the network, Traffic Directing and Traffic Engineering systems should act in a coordinated manner. This topic is for further study.

The issues related to the location and replication of information servers, particularly web servers, are important for Internet traffic engineering because these servers contribute a substantial proportion of Internet traffic.

5.0 Taxonomy of Traffic Engineering Systems

This section presents a short taxonomy of traffic engineering systems, constructed based on traffic engineering styles and views as listed below:

- Time-dependent vs State-dependent vs Event-dependent
- Offline vs Online
- Centralized vs Distributed
- Local vs Global Information
- Prescriptive vs Descriptive
- Open Loop vs Closed Loop
- Tactical vs Strategic

These classification systems are described in greater detail in the following subsections of this document.
5.1 Time-Dependent Versus State-Dependent Versus Event-Dependent

Traffic engineering methodologies can be classified as time-dependent, state-dependent, or event-dependent. All TE schemes are considered to be dynamic in this document. Static TE implies that no traffic engineering methodology or algorithm is being applied.

In time-dependent TE, historical information based on periodic variations in traffic (such as time of day) is used to pre-program routing plans and other TE control mechanisms. Additionally, customer subscription or traffic projection may be used. Pre-programmed routing plans typically change on a relatively long time scale (e.g., diurnal). Time-dependent algorithms do not attempt to adapt to random variations in traffic or changing network conditions. An example of a time-dependent algorithm is a global centralized optimizer whose inputs are a traffic matrix and multi-class QoS requirements, as described in [MR99].

State-dependent TE adapts the routing plans for packets based on the current state of the network. The current state of the network provides additional information on variations in actual traffic (i.e., perturbations from regular variations) that could not be predicted using historical information. Constraint-based routing is an example of state-dependent TE operating on a relatively long time scale. An example operating on a relatively short time scale is the load-balancing algorithm described in [MATE].

The state of the network can be based on parameters such as utilization, packet delay, and packet loss. These parameters can be obtained in several ways. For example, each router may flood these parameters, periodically or by means of some kind of trigger, to other routers.
Another approach is for a particular router performing adaptive TE to send probe packets along a path to gather the state of that path. Still another approach is for a management system to gather relevant information from network elements.

Expeditious and accurate gathering and distribution of state information is critical for adaptive TE due to the dynamic nature of network conditions. State-dependent algorithms may be applied to increase network efficiency and resilience. Time-dependent algorithms are more suitable for predictable traffic variations; state-dependent algorithms, on the other hand, are more suitable for adapting to the prevailing network state.

Event-dependent TE methods can also be used for TE path selection. Event-dependent TE methods are distinct from time-dependent and state-dependent TE methods in the manner in which paths are selected. These algorithms are adaptive and distributed in nature, and typically use learning models to find good paths for TE in a network. While state-dependent TE models typically use available-link-bandwidth (ALB) flooding for TE path selection, event-dependent TE methods do not require ALB flooding. Rather, event-dependent TE methods typically search out capacity by learning models, as in the success-to-the-top (STT) method. ALB flooding can be resource intensive, since it requires link bandwidth to carry LSAs and processor capacity to process LSAs, and the overhead can limit area/autonomous system (AS) size. Modeling results suggest that event-dependent TE methods could lead to a reduction in ALB flooding overhead without loss of network throughput performance [ASH3].

5.2 Offline Versus Online

Traffic engineering requires the computation of routing plans. The computation may be performed offline or online.
The computation can be done offline for scenarios where routing plans need not be executed in real-time. For example, routing plans computed from forecast information may be computed offline. Typically, offline computation is also used to perform extensive searches on multi-dimensional solution spaces.

Online computation is required when the routing plans must adapt to changing network conditions, as in state-dependent algorithms. Unlike offline computation (which can be computationally demanding), online computation is geared toward relatively simple and fast calculations to select routes, fine-tune the allocation of resources, and perform load balancing.

5.3 Centralized Versus Distributed

Under centralized control, a central authority determines routing plans, and perhaps other TE control parameters, on behalf of each router. The central authority periodically collects network-state information from all routers and returns the routing information to the routers. The routing update cycle is a critical parameter that directly impacts the performance of the network being controlled. Centralized control may need high processing power and high-bandwidth control channels.

Under distributed control, each router determines route selection autonomously, based on its own view of the state of the network. The network state information may be obtained by the router using a probing method, or distributed by other routers on a periodic basis using link state advertisements. Network state information may also be disseminated under exceptional conditions.

5.4 Local Versus Global

Traffic engineering algorithms may require local or global network-state information.

Local information pertains to the state of a portion of the domain. Examples include the bandwidth and packet loss rate of a particular path.
Local state information may be sufficient for certain instances of distributed control.

Global information pertains to the state of the entire domain undergoing traffic engineering. Examples include a global traffic matrix and loading information on each link throughout the domain of interest. Global state information is typically required with centralized control. Distributed TE systems may also need global information in some cases.

5.5 Prescriptive Versus Descriptive

TE systems may also be classified as prescriptive or descriptive.

Prescriptive traffic engineering evaluates alternatives and recommends a course of action. It can be further categorized as either corrective or perfective. Corrective TE prescribes a course of action to address an existing or predicted anomaly. Perfective TE prescribes a course of action to evolve and improve network performance even when no anomalies are evident.

Descriptive traffic engineering, on the other hand, characterizes the state of the network and assesses the impact of various policies without recommending any particular course of action.

5.6 Open-Loop Versus Closed-Loop

Open-loop traffic engineering control is control in which the control action does not use feedback information from the current network state. The control action may, however, use its own local information for accounting purposes.

Closed-loop traffic engineering control is control in which the control action utilizes feedback information from the network state. The feedback information may be in the form of historical information or current measurements.

5.7 Tactical Versus Strategic

Tactical traffic engineering aims to address specific performance problems (such as hot-spots) that occur in the network from a tactical perspective, without consideration of overall strategic imperatives.
Without proper planning and insights, tactical TE tends to be ad hoc in nature.

Strategic traffic engineering approaches the TE problem from a more organized and systematic perspective, taking into consideration the immediate and longer-term consequences of specific policies and actions.

6.0 Recommendations for Internet Traffic Engineering

This section describes high-level recommendations for traffic engineering in the Internet. These recommendations are presented in general terms.

The recommendations describe the capabilities needed to solve a traffic engineering problem or to achieve a traffic engineering objective. Broadly speaking, these recommendations can be categorized as either functional or non-functional.

Functional recommendations for Internet traffic engineering describe the functions that a traffic engineering system should perform. These functions are needed to realize traffic engineering objectives by addressing traffic engineering problems.

Non-functional recommendations for Internet traffic engineering relate to the quality attributes or state characteristics of a traffic engineering system. These recommendations may contain conflicting assertions and may sometimes be difficult to quantify precisely.

6.1 Generic Non-functional Recommendations

The generic non-functional recommendations for Internet traffic engineering include: usability, automation, scalability, stability, visibility, simplicity, efficiency, reliability, correctness, maintainability, extensibility, interoperability, and security. In a given context, some of these recommendations may be critical while others may be optional. Therefore, prioritization may be required during the development phase of a traffic engineering system (or components thereof) to tailor it to a specific operational context.
In the following paragraphs, some aspects of the non-functional recommendations for Internet traffic engineering are summarized.

Usability: Usability is a human factors aspect of traffic engineering systems. It refers to the ease with which a traffic engineering system can be deployed and operated. In general, it is desirable to have a TE system that can be readily deployed in an existing network, and that is easy to operate and maintain.

Automation: Whenever feasible, a traffic engineering system should automate as many traffic engineering functions as possible, to minimize the amount of human effort needed to control and analyze operational networks. Automation is particularly imperative in large-scale public networks because of the high cost of the human aspects of network operations and the high risk of network problems caused by human error. Automation may entail the incorporation of automatic feedback and intelligence into some components of the traffic engineering system.

Scalability: Contemporary public networks are growing very fast with respect to network size and traffic volume. Therefore, a TE system should be scalable enough to remain applicable as the network evolves. In particular, a TE system should remain functional as the network expands with regard to the number of routers and links, and with respect to the traffic volume. A TE system should have a scalable architecture, should not adversely impair other functions and processes in a network element, and should not consume too many network resources when collecting and distributing state information or when exerting control.

Stability: Stability is a very important consideration in traffic engineering systems that respond to changes in the state of the network.
State-dependent traffic engineering methodologies typically mandate a tradeoff between responsiveness and stability. It is strongly recommended that when tradeoffs between responsiveness and stability are warranted, the tradeoff should be made in favor of stability (especially in public IP backbone networks).

Flexibility: A TE system should be flexible enough to allow for changes in optimization policy. In particular, a TE system should provide sufficient configuration options so that a network administrator can tailor the TE system to a particular environment. It may also be desirable to have both online and offline TE subsystems which can be independently enabled and disabled. TE systems that are used in multi-class networks should also have options to support class-based performance evaluation and optimization.

Visibility: As part of the TE system, mechanisms should exist to collect statistics from the network and to analyze these statistics to determine how well the network is functioning. Derived statistics such as traffic matrices, link utilization, latency, packet loss, and other performance measures of interest, which are determined from network measurements, can be used as indicators of prevailing network conditions. Other examples of status information which should be observed include existing functional routing information (and, in the context of MPLS, existing LSP routes).

Simplicity: Generally, a TE system should be as simple as possible. More importantly, the TE system should be relatively easy to use (i.e., have clean, convenient, and intuitive user interfaces). Simplicity in the user interface does not necessarily imply that the TE system will use naive algorithms.
When complex algorithms and internal structures are used, such complexities should be hidden as much as possible from the network administrator by the user interface.

Interoperability: Whenever feasible, traffic engineering systems and their components should be developed with open, standards-based interfaces to allow interoperation with other systems and components.

Security: Security is a critical consideration in traffic engineering systems. Such systems typically exert control over certain functional aspects of the network to achieve the desired performance objectives. Therefore, adequate measures must be taken to safeguard the integrity of the traffic engineering system. Adequate measures must also be taken to protect the network from vulnerabilities that originate from security breaches and other impairments within the traffic engineering system.

The remainder of this section focuses on some of the high-level functional recommendations for traffic engineering.

6.2 Routing Recommendations

Routing control is a significant aspect of Internet traffic engineering. Routing impacts many of the key performance measures associated with networks, such as throughput, delay, and utilization. Generally, it is very difficult to provide good service quality in a wide area network without effective routing control. A desirable routing system is one that takes traffic characteristics and network constraints into account during route selection, while maintaining stability.

Traditional shortest path first (SPF) interior gateway protocols are based on shortest path algorithms and have limited control capabilities for traffic engineering [RFC-2702, AWD2]. These limitations include:

1. The well-known issue that pure SPF protocols do not take network constraints and traffic characteristics into account during route selection. For example, since IGPs always use the shortest paths (based on administratively assigned link metrics) to forward traffic, load sharing cannot be accomplished among paths of different costs. Using shortest paths to forward traffic conserves network resources, but may cause the following problems: 1) if traffic from a source to a destination exceeds the capacity of a link along the shortest path, the link (and hence the shortest path) becomes congested, while a longer path between these two nodes may be under-utilized; 2) the shortest paths from different sources can overlap at some links, and if the total traffic from the sources exceeds the capacity of any of these links, congestion will occur. Problems can also occur because traffic demand changes over time, but network topology and routing configuration cannot be changed as rapidly. This causes the network topology and routing configuration to become sub-optimal over time, which may result in persistent congestion problems.

2. The Equal-Cost Multi-Path (ECMP) capability of SPF IGPs supports sharing of traffic among equal-cost paths between two nodes. However, ECMP attempts to divide the traffic as equally as possible among the equal-cost shortest paths. Generally, ECMP does not support configurable load sharing ratios among equal-cost paths. The result is that one of the paths may carry significantly more traffic than the others because it may also carry traffic from other sources. This situation can result in congestion along the path that carries more traffic.

3. Modifying IGP metrics to control traffic routing tends to have network-wide effects.
Consequently, undesirable and unanticipated traffic shifts can be triggered as a result.

Because of these limitations, new capabilities are needed to enhance the routing function in IP networks. Some of these capabilities have been described elsewhere and are summarized below.

Constraint-based routing is desirable for evolving the routing architecture of IP networks, especially public IP backbones with complex topologies [RFC-2702]. Constraint-based routing computes routes that fulfill requirements subject to constraints. Constraints may include bandwidth, hop count, delay, and administrative policy instruments such as resource class attributes [RFC-2702, RFC-2386]. This makes it possible to select routes that satisfy a given set of requirements subject to network and administrative policy constraints. Routes computed through constraint-based routing are not necessarily the shortest paths. Constraint-based routing works best with path-oriented technologies that support explicit routing, such as MPLS.

Constraint-based routing can also be used as a way to redistribute traffic onto the infrastructure (even for best-effort traffic). For example, if the bandwidth requirements for path selection and the reservable bandwidth attributes of network links are appropriately defined and configured, then congestion problems caused by uneven traffic distribution may be avoided or reduced. In this way, the performance and efficiency of the network can be improved.

A number of enhancements are needed to conventional link state IGPs, such as OSPF and IS-IS, to allow them to distribute the additional state information required for constraint-based routing. Extensions to OSPF are described in [KATZ], and to IS-IS in [SMIT]. Essentially, these enhancements require the propagation of additional information in link state advertisements.
Specifically, in addition to normal link-state information, an enhanced IGP is required to propagate topology state information needed for constraint-based routing. The additional topology state information includes link attributes such as reservable bandwidth and the link resource class attribute (an administratively specified property of the link). The resource class attribute concept was defined in [RFC-2702]. The additional topology state information is carried in new TLVs and sub-TLVs in IS-IS, or in Opaque LSAs in OSPF [SMIT, KATZ].

An enhanced link-state IGP may flood information more frequently than a normal IGP. This is because, even without changes in topology, changes in reservable bandwidth or link affinity can trigger the enhanced IGP to initiate flooding. A tradeoff is typically required between the timeliness of the information flooded and the flooding frequency, to avoid excessive consumption of link bandwidth and computational resources, and, more importantly, to avoid instability.

In a TE system, it is also desirable for the routing subsystem to make the load-splitting ratio among multiple paths (with equal or different cost) configurable. This capability gives network administrators more flexibility in the control of traffic distribution across the network. It can be very useful for avoiding/relieving congestion in certain situations. Examples can be found in [XIAO].

The routing system should also have the capability to control the routes of subsets of traffic without affecting the routes of other traffic if sufficient resources exist for this purpose. This capability allows a more refined control over the distribution of traffic across the network.
For example, the ability to move traffic from a source to a destination away from its original path to another path (without affecting other traffic paths) allows traffic to be moved from resource-poor network segments to resource-rich segments. Path-oriented technologies such as MPLS inherently support this capability, as discussed in [AWD2].

Additionally, the routing subsystem should be able to select different paths for different classes of traffic (or for different traffic behavior aggregates) if the network supports multiple classes of service (different behavior aggregates).

6.3 Traffic Mapping Recommendations

Traffic mapping pertains to the assignment of traffic workload onto pre-established paths to meet certain requirements. Thus, while constraint-based routing deals with path selection, traffic mapping deals with the assignment of traffic to established paths, which may have been selected by constraint-based routing or by some other means. Traffic mapping can be performed by time-dependent or state-dependent mechanisms, as described in Section 5.1.

An important aspect of the traffic mapping function is the ability to establish multiple paths between an originating node and a destination node, and the capability to distribute the traffic between the two nodes across the paths according to some policies. A pre-condition for this scheme is the existence of flexible mechanisms to partition traffic and then assign the traffic partitions onto the parallel paths. This requirement was noted in [RFC-2702]. When traffic is assigned to multiple parallel paths, it is recommended that special care be taken to ensure proper ordering of packets belonging to the same application (or micro-flow) at the destination node of the parallel paths.
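One common way to partition traffic across parallel paths while preserving per-micro-flow packet ordering is to hash a flow identifier onto the paths. The sketch below is illustrative only: the path names, weights, and flow key format are hypothetical. Because the hash is deterministic, every packet of a given micro-flow maps to the same path, while different flows spread across the paths according to the configured ratio:

```python
import hashlib

def pick_path(flow_id, paths, weights):
    """Map a micro-flow onto one of several parallel paths, with a
    configurable (possibly unequal) load-splitting ratio."""
    h = int(hashlib.sha256(flow_id.encode()).hexdigest(), 16)
    point = h % sum(weights)            # deterministic per flow
    for path, weight in zip(paths, weights):
        if point < weight:
            return path
        point -= weight

paths, weights = ["LSP-1", "LSP-2"], [3, 1]     # 3:1 load split
flow = "10.0.0.1:5001->10.0.1.9:80/tcp"         # hypothetical flow key
# Same flow always maps to the same path, so packet order is preserved.
assert pick_path(flow, paths, weights) == pick_path(flow, paths, weights)
```

Hashing on the micro-flow identifier (rather than per packet) is precisely what avoids the reordering concern noted above, at the cost of coarser granularity in the achieved split.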
As a general rule, mechanisms that perform the traffic mapping functions should aim to map the traffic onto the network infrastructure so as to minimize congestion. If the total traffic load cannot be accommodated, or if the routing and mapping functions cannot react fast enough to changing traffic conditions, then a traffic mapping system may rely on short-time-scale congestion control mechanisms (such as queue management, scheduling, etc.) to mitigate congestion. Thus, mechanisms that perform the traffic mapping functions should complement existing congestion control mechanisms. In an operational network, it is generally desirable to map the traffic onto the infrastructure such that intra-class and inter-class resource contention are minimized.

When traffic mapping techniques that depend on dynamic state feedback (e.g., MATE and the like) are used, special care must be taken to guarantee network stability.

6.4 Measurement Recommendations

The importance of measurement in traffic engineering has been discussed throughout this document. Mechanisms should be provided to measure and collect statistics from the network to support the traffic engineering function. Additional capabilities may be needed to help in the analysis of the statistics. The actions of these mechanisms should not adversely affect the accuracy and integrity of the statistics collected. The mechanisms for statistical data acquisition should also be able to scale as the network evolves.

Traffic statistics may be classified according to long-term or short-term time scales. Long-term time scale traffic statistics are very useful for traffic engineering. Long-term time scale traffic statistics may capture or reflect periodicity in network workload (such as hourly, daily, and weekly variations in traffic profiles) as well as traffic trends.
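As an illustration of deriving secondary statistics from long-term measurements, the sketch below (all sample values are invented) reduces hourly link-load samples collected over a day to a busy-hour figure:

```python
from collections import defaultdict

# Hypothetical (hour-of-day, measured load in Mb/s) samples from a
# statistics collector; a real data set would span many days.
samples = [
    (9, 420.0), (9, 480.0), (14, 610.0), (14, 590.0), (21, 300.0),
]

totals = defaultdict(list)
for hour, load in samples:
    totals[hour].append(load)

# Average load per hour of day, then pick the busiest hour.
hourly_avg = {h: sum(v) / len(v) for h, v in totals.items()}
busy_hour = max(hourly_avg, key=hourly_avg.get)
print(busy_hour, hourly_avg[busy_hour])
```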
Aspects of the monitored traffic statistics may also depict class of service characteristics for a network supporting multiple classes of service. Analysis of the long-term traffic statistics may yield secondary statistics such as busy hour characteristics, traffic growth patterns, persistent congestion problems, hot-spots, and imbalances in link utilization caused by routing anomalies.

A mechanism for constructing traffic matrices for both long-term and short-term traffic statistics should be in place. In multi-service IP networks, the traffic matrices may be constructed for different service classes. Each element of a traffic matrix represents a statistic of traffic flow between a pair of abstract nodes. An abstract node may represent a router, a collection of routers, or a site in a VPN.

Measured traffic statistics should provide reasonable and reliable indicators of the current state of the network on the short-term scale. Some short-term traffic statistics may reflect link utilization and link congestion status. Examples of congestion indicators include excessive packet delay, packet loss, and high resource utilization. Examples of mechanisms for distributing this kind of information include SNMP, probing techniques, FTP, IGP link state advertisements, etc.

6.5 Network Survivability

Network survivability refers to the capability of a network to maintain service continuity in the presence of faults. This can be accomplished by promptly recovering from network impairments and maintaining the required QoS for existing services after recovery. Survivability has become an issue of great concern within the Internet community due to the increasing demands to carry mission critical traffic, real-time traffic, and other high priority traffic over the Internet.
Survivability can be addressed at the device level by developing network elements that are more reliable, and at the network level by incorporating redundancy into the architecture, design, and operation of networks. It is recommended that a philosophy of robustness and survivability be adopted in the architecture, design, and operation of traffic engineering systems that control IP networks (especially public IP networks). Because different contexts may demand different levels of survivability, the mechanisms developed to support network survivability should be flexible so that they can be tailored to different needs.

Failure protection and restoration capabilities have become available at multiple layers as network technologies have continued to improve. At the bottom of the layered stack, optical networks are now capable of providing dynamic ring and mesh restoration functionality at the wavelength level, as well as traditional protection functionality. At the SONET/SDH layer, survivability capability is provided with Automatic Protection Switching (APS) as well as self-healing ring and mesh architectures. Similar functionality is provided by layer 2 technologies such as ATM (generally with slower mean restoration times). Rerouting is traditionally used at the IP layer to restore service following link and node outages. Rerouting at the IP layer occurs after a period of routing convergence, which may require seconds to minutes to complete. Some new developments in the MPLS context make it possible to achieve recovery at the IP layer prior to convergence [SHAR].

To support advanced survivability requirements, path-oriented technologies such as MPLS can be used to enhance the survivability of IP networks in a potentially cost-effective manner.
The advantages of path-oriented technologies such as MPLS for IP restoration become even more evident when class-based protection and restoration capabilities are required.

Recently, a common suite of control plane protocols has been proposed for both MPLS and optical transport networks under the acronym Multi-protocol Lambda Switching [AWD1]. This new paradigm of Multi-protocol Lambda Switching will support even more sophisticated mesh restoration capabilities at the optical layer for the emerging IP over WDM network architectures.

Another important aspect regarding multi-layer survivability is that technologies at different layers provide protection and restoration capabilities at different temporal granularities (in terms of time scales) and at different bandwidth granularities (from packet level to wavelength level). Protection and restoration capabilities can also be sensitive to different service classes and different network utility models.

The impact of service outages varies significantly for different service classes depending upon the effective duration of the outage. The duration of an outage can vary from milliseconds (with minor service impact) to seconds (with possible call drops for IP telephony and session time-outs for connection-oriented transactions) to minutes and hours (with potentially considerable social and business impact).

Coordinating different protection and restoration capabilities across multiple layers in a cohesive manner, to ensure that network survivability is maintained at reasonable cost, is a challenging task. Protection and restoration coordination across layers may not always be feasible, because networks at different layers may belong to different administrative domains.
The following paragraphs present some of the general recommendations for protection and restoration coordination.

- Protection and restoration capabilities from different layers should be coordinated whenever feasible and appropriate to provide network survivability in a flexible and cost-effective manner. Minimization of function duplication across layers is one way to achieve the coordination. Escalation of alarms and other fault indicators from lower to higher layers may also be performed in a coordinated manner. A temporal order of restoration trigger timing at different layers is another way to coordinate multi-layer protection/restoration.

- Spare capacity at higher layers is often regarded as working traffic at lower layers. Placing protection/restoration functions in many layers may increase redundancy and robustness, but it should not result in significant and avoidable inefficiencies in network resource utilization.

- It is generally desirable to have protection and restoration schemes that are bandwidth efficient.

- Failure notification throughout the network should be timely and reliable.

- Alarms and other fault monitoring and reporting capabilities should be provided at appropriate layers.

6.5.1 Survivability in MPLS Based Networks

MPLS is an important emerging technology that enhances IP networks in terms of features, capabilities, and services. Because MPLS is path-oriented, it can potentially provide faster and more predictable protection and restoration capabilities than conventional hop-by-hop routed IP systems. This subsection describes some of the basic aspects and recommendations for MPLS networks regarding protection and restoration. See [SHAR] for a more comprehensive discussion on MPLS based recovery.
Protection types for MPLS networks can be categorized as link protection, node protection, path protection, and segment protection.

- Link Protection: The objective of link protection is to protect an LSP from a given link failure. Under link protection, the path of the protection or backup LSP (the secondary LSP) is disjoint from the path of the working or operational LSP at the particular link over which protection is required. When the protected link fails, traffic on the working LSP is switched over to the protection LSP at the head-end of the failed link. This is a local repair method which can be fast. It might be more appropriate in situations where some network elements along a given path are less reliable than others.

- Node Protection: The objective of LSP node protection is to protect an LSP from a given node failure. Under node protection, the path of the protection LSP is disjoint from the path of the working LSP at the particular node to be protected. The secondary path is also disjoint from the primary path at all links associated with the node to be protected. When the node fails, traffic on the working LSP is switched over to the protection LSP at the upstream LSR directly connected to the failed node.

- Path Protection: The goal of LSP path protection is to protect an LSP from failure at any point along its routed path. Under path protection, the path of the protection LSP is completely disjoint from the path of the working LSP. The advantage of path protection is that the backup LSP protects the working LSP from all possible link and node failures along the path, except for failures that might occur at the ingress and egress LSRs, or for correlated failures that might impact both working and backup paths simultaneously.
Additionally, since the path selection is end-to-end, path protection might be more efficient in terms of resource usage than link or node protection. However, path protection may be slower than link and node protection in general.

- Segment Protection: An MPLS domain may be partitioned into multiple protection domains, whereby a failure in a protection domain is rectified within that domain. In cases where an LSP traverses multiple protection domains, a protection mechanism within a domain only needs to protect the segment of the LSP that lies within the domain. Segment protection will generally be faster than path protection because recovery generally occurs closer to the fault.

6.5.2 Protection Options

Another issue to consider is the concept of protection options. The protection option uses the notation m:n, where m is the number of protection LSPs used to protect n working LSPs. Feasible protection options follow.

- 1:1: one working LSP is protected/restored by one protection LSP.

- 1:n: one protection LSP is used to protect/restore n working LSPs.

- n:1: one working LSP is protected/restored by n protection LSPs, possibly with a configurable load-splitting ratio. When more than one protection LSP is used, it may be desirable to share the traffic across the protection LSPs when the working LSP fails, in order to satisfy the bandwidth requirement of the traffic trunk associated with the working LSP. This may be especially useful when it is not feasible to find one path that can satisfy the bandwidth requirement of the primary LSP.

- 1+1: traffic is sent concurrently on both the working LSP and the protection LSP.
In this case, the egress LSR selects one of the two LSPs based on a local traffic integrity decision process, which compares the traffic received from both the working and the protection LSP and identifies discrepancies. It is unlikely that this option would be used extensively in IP networks due to its resource utilization inefficiency. However, if bandwidth becomes plentiful and cheap, then this option might become quite viable and attractive in IP networks.

6.6 Traffic Engineering in Diffserv Environments

This section provides an overview of the traffic engineering features and recommendations that are specifically pertinent to Differentiated Services (Diffserv) [RFC-2475] capable IP networks.

Increasing requirements to support multiple classes of traffic in the Internet, such as best effort and mission critical data, call for IP networks to differentiate traffic according to some criteria, and to accord preferential treatment to certain types of traffic. Large numbers of flows can be aggregated into a few behavior aggregates based on some criteria, such as common performance requirements in terms of packet loss ratio, delay, and jitter, or common fields within the IP packet headers.

As Diffserv evolves and becomes deployed in operational networks, traffic engineering will be critical to ensuring that SLAs defined within a given Diffserv service model are met. Classes of service (CoS) can be supported in a Diffserv environment by concatenating per-hop behaviors (PHBs) along the routing path, using service provisioning mechanisms, and by appropriately configuring edge functionality such as traffic classification, marking, policing, and shaping. A PHB is the forwarding behavior that a packet receives at a DS node (a Diffserv-compliant node).
This is accomplished by means of buffer management and packet scheduling mechanisms. In this context, packets belonging to a class are those that are members of a corresponding ordering aggregate.

Traffic engineering can be used as a complement to Diffserv mechanisms to improve utilization of network resources, but it is not a necessary element in general. When traffic engineering is used, it can be operated on an aggregated basis across all service classes [MPLS-DIFF] or on a per service class basis. The former is used to provide better distribution of the aggregate traffic load over the network resources. (See [MPLS-DIFF] for detailed mechanisms to support aggregate traffic engineering.) The latter case is discussed below, since it is specific to the Diffserv environment, with so-called Diffserv-aware traffic engineering [DIFF-TE].

For some Diffserv networks, it may be desirable to control the performance of some service classes by enforcing certain relationships between the traffic workload contributed by each service class and the amount of network resources allocated or provisioned for that service class. Such relationships between demand and resource allocation can be enforced using a combination of, for example: (1) traffic engineering mechanisms on a per service class basis that enforce the desired relationship between the amount of traffic contributed by a given service class and the resources allocated to that class, and (2) mechanisms that dynamically adjust the resources allocated to a given service class to relate to the amount of traffic contributed by that service class.

It may also be desirable to limit the performance impact of high priority traffic on relatively low priority traffic. This can be achieved by, for example, controlling the percentage of high priority traffic that is routed through a given link.
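The percentage-based control just mentioned can be sketched as a simple admission check (capacity, reservation, and threshold values below are hypothetical, chosen only for illustration):

```python
def admit_high_priority(link_capacity, hp_reserved, trunk_bw,
                        hp_share_limit=0.30):
    """True if reserving trunk_bw more high-priority bandwidth keeps
    the high-priority share of the link within the configured limit
    (30% here, purely illustrative)."""
    return (hp_reserved + trunk_bw) / link_capacity <= hp_share_limit

capacity = 10_000.0     # link capacity, Mb/s (hypothetical)
reserved = 2_500.0      # high-priority bandwidth already reserved

print(admit_high_priority(capacity, reserved, 400.0))   # within limit
print(admit_high_priority(capacity, reserved, 800.0))   # exceeds limit
```

Capping the high-priority share in this way bounds the bandwidth that lower priority classes can lose on the link, which is one way to limit the performance impact described above.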
Another way to accomplish this is to increase link capacities appropriately so that lower priority traffic can still enjoy adequate service quality. When the ratio of traffic workload contributed by different service classes varies significantly from router to router, it may not suffice to rely exclusively on conventional IGP routing protocols or on traffic engineering mechanisms that are insensitive to different service classes. Instead, it may be desirable to perform traffic engineering, especially routing control and mapping functions, on a per service class basis. One way to accomplish this in a domain that supports both MPLS and Diffserv is to define class-specific LSPs and to map traffic from each class onto one or more LSPs that correspond to that service class. An LSP corresponding to a given service class can then be routed and protected/restored in a class-dependent manner, according to specific policies.

Performing traffic engineering on a per class basis may require certain per-class parameters to be distributed. Note that it is common for some classes to share an aggregate constraint (e.g., a maximum bandwidth requirement) without enforcing the constraint on each individual class. These classes can then be grouped into a class-type, and per-class-type parameters can be distributed instead, to improve scalability. This also allows better bandwidth sharing between classes in the same class-type. A class-type is a set of classes that satisfy the following two conditions:

1) Classes in the same class-type have common aggregate requirements to satisfy required performance levels.

2) There is no requirement to be enforced at the level of an individual class in the class-type.
Note that it is nevertheless still possible to implement some priority policies for classes in the same class-type to permit preferential access to the class-type bandwidth through the use of preemption priorities.

An example of a class-type is a low-loss class-type that includes both AF1-based and AF2-based ordering aggregates. With such a class-type, one may implement a priority policy that assigns AF1-based traffic trunks a higher preemption priority than AF2-based ones, a lower priority, or the same priority.

See [DIFF-TE] for detailed requirements on Diffserv-aware traffic engineering.

6.7 Network Controllability

Off-line (and on-line) traffic engineering considerations would be of limited utility if the network could not be controlled effectively to implement the results of TE decisions and to achieve desired network performance objectives. Capacity augmentation is a coarse-grained solution to traffic engineering issues. It is simple and may be advantageous if bandwidth is abundant and cheap, or if the current or expected network workload demands it. However, bandwidth is not always abundant and cheap, and the workload may not always demand additional capacity. Adjustments of administrative weights and other parameters associated with routing protocols provide finer-grained control, but are difficult to use and imprecise because of the routing interactions that occur across the network. In certain network contexts, more flexible, finer-grained approaches which provide more precise control over the mapping of traffic to routes and over the selection and placement of routes may be appropriate and useful.

Control mechanisms can be manual (e.g., administrative configuration), partially-automated (e.g., scripts), or fully-automated (e.g., policy-based management systems).
Automated mechanisms are particularly required in large-scale networks. Multi-vendor interoperability can be facilitated by developing and deploying standardized management systems (e.g., standard MIBs) and policies (PIBs) to support the control functions required to address traffic engineering objectives such as load distribution and protection/restoration.

Network control functions should be secure, reliable, and stable, as these are often needed to operate correctly in times of network impairments (e.g., during network congestion or security attacks).

7.0 Inter-Domain Considerations

Inter-domain traffic engineering is concerned with performance optimization for traffic that originates in one administrative domain and terminates in a different one.

Traffic exchange between autonomous systems in the Internet occurs through exterior gateway protocols. Currently, BGP [BGP4] is the standard exterior gateway protocol for the Internet. BGP provides a number of attributes and capabilities (e.g., route filtering) that can be used for inter-domain traffic engineering. More specifically, BGP permits the control of routing information and traffic exchange between Autonomous Systems (AS's) in the Internet. BGP incorporates a sequential decision process which calculates the degree of preference for various routes to a given destination network. There are two fundamental aspects to inter-domain traffic engineering using BGP:

- Route Redistribution: controlling the import and export of routes between AS's, and controlling the redistribution of routes between BGP and other protocols within an AS.

- Best path selection: selecting the best path when there are multiple candidate paths to a given destination network.
Best path selection is performed by the BGP decision process based on a sequential procedure, taking a number of different considerations into account. Ultimately, best path selection under BGP boils down to selecting preferred exit points out of an AS towards specific destination networks. The BGP path selection process can be influenced by manipulating the attributes associated with the BGP decision process. These attributes include: NEXT-HOP, WEIGHT (Cisco proprietary, also implemented by some other vendors), LOCAL-PREFERENCE, AS-PATH, ROUTE-ORIGIN, MULTI-EXIT-DISCRIMINATOR (MED), IGP METRIC, etc.

Route-maps provide the flexibility to implement complex BGP policies based on pre-configured logical conditions. In particular, route-maps can be used to control import and export policies for incoming and outgoing routes, to control the redistribution of routes between BGP and other protocols, and to influence the selection of best paths by manipulating the attributes associated with the BGP decision process. Very complex logical expressions that implement various types of policies can be implemented using a combination of route-maps, BGP attributes, access-lists, and community attributes.

When looking at possible strategies for inter-domain TE with BGP, it must be noted that the outbound traffic exit point is controllable, whereas the interconnection point where inbound traffic is received from an EBGP peer typically is not, unless a special arrangement is made with the peer sending the traffic. Therefore, it is up to each individual network to implement sound TE strategies that deal with the efficient delivery of outbound traffic from one's customers to one's peering points.
The vast majority of TE policy is based upon a "closest exit" strategy, which relies on other networks to deliver traffic to its final destination in the most efficient manner possible. Most methods of manipulating the point at which inbound traffic enters a network from an EBGP peer (inconsistent route announcements between peering points, AS prepending, and sending MEDs) are either ineffective or not accepted in the peering community.

Inter-domain TE with BGP is generally effective, but it is usually applied in a trial-and-error fashion. A systematic approach for inter-domain traffic engineering is yet to be devised.

Inter-domain TE is inherently more difficult than intra-domain TE under the current Internet architecture. The reasons for this are both technical and administrative. Technically, while topology and link state information are helpful for mapping traffic more effectively, BGP does not propagate such information across domain boundaries for stability and scalability reasons. Administratively, there are differences in operating costs and network capacities between domains. Generally, what may be considered a good solution in one domain may not necessarily be a good solution in another domain. Moreover, it would generally be considered inadvisable for one domain to permit another domain to influence the routing and management of traffic in its network.

MPLS TE-tunnels (explicit LSPs) can potentially add a degree of flexibility in the selection of exit points for inter-domain routing. The concept of relative and absolute metrics can be applied to this purpose.
The idea is that if BGP attributes are defined such that the BGP decision process depends on IGP metrics to select exit points for inter-domain traffic, then some inter-domain traffic destined to a given peer network can be made to prefer a specific exit point by establishing a TE-tunnel from the router making the selection to the peering point, and assigning the TE-tunnel a metric which is smaller than the IGP cost to all other peering points. If a peer accepts and processes MEDs, then a similar MPLS TE-tunnel based scheme can be applied to cause certain entrance points to be preferred, by setting the MED to an IGP cost which has been modified by the tunnel metric.

Similar to intra-domain TE, inter-domain TE is best accomplished when a traffic matrix can be derived to depict the volume of traffic from one autonomous system to another.

Generally, redistribution of inter-domain traffic requires coordination between peering partners. An export policy in one domain that results in load redistribution across peering points with another domain can significantly affect the local traffic matrix inside the domain of the peering partner. This, in turn, will affect intra-domain TE due to changes in the spatial distribution of traffic. Therefore, it is mutually beneficial for peering partners to coordinate with each other before attempting any policy changes that may result in significant shifts in inter-domain traffic. In certain contexts, this coordination can be quite challenging due to technical and non-technical reasons.

It is a matter of speculation as to whether MPLS, or similar technologies, can be extended to allow selection of constrained paths across domain boundaries.
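The role of IGP metrics in BGP exit selection, as discussed in this section, can be sketched with a simplified decision process. The route data below are hypothetical, and the real BGP decision process involves more steps and attributes than the abbreviated tie-break order shown here (higher LOCAL-PREF wins, then shorter AS-PATH, then lower MED, then lower IGP metric to the exit point):

```python
def best_path(routes):
    """Simplified, BGP-like sequential preference: each tuple element
    is compared in order, so earlier attributes dominate later ones."""
    return min(routes, key=lambda r: (-r["local_pref"],
                                      len(r["as_path"]),
                                      r["med"],
                                      r["igp_metric"]))

routes = [
    {"exit": "peer-east", "local_pref": 100, "as_path": [65001, 65010],
     "med": 50, "igp_metric": 30},
    {"exit": "peer-west", "local_pref": 100, "as_path": [65002, 65010],
     "med": 50, "igp_metric": 10},   # wins on IGP metric ("closest exit")
]
print(best_path(routes)["exit"])
```

Lowering the IGP cost toward one exit, for instance by assigning a small metric to a TE-tunnel as described above, changes the final tie-break and hence the selected exit point, without touching the higher-precedence attributes.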
8.0 Overview of Contemporary TE Practices in Operational IP Networks

This section provides an overview of some contemporary traffic engineering practices in IP networks. The focus is primarily on the aspects that pertain to the control of the routing function in operational contexts. The intent here is to provide an overview of the commonly used practices; the discussion is not intended to be exhaustive.

Currently, service providers apply many of the traffic engineering mechanisms discussed in this document to optimize the performance of their IP networks. These techniques include capacity planning for long time scales, routing control using IGP metrics and MPLS for medium time scales, the overlay model (also for medium time scales), and traffic management mechanisms for short time scales.

When a service provider plans to build an IP network, or expand the capacity of an existing network, effective capacity planning should be an important component of the process. Such plans may take the following aspects into account: location of new nodes, if any; existing and predicted traffic patterns; costs; link capacity; topology; routing design; and survivability.

Performance optimization of operational networks is usually an ongoing process in which traffic statistics, performance parameters, and fault indicators are continually collected from the network. These empirical data are then analyzed and used to trigger various traffic engineering mechanisms. For example, IGP parameters, e.g., OSPF or IS-IS metrics, can be adjusted based on manual computations or on the output of traffic engineering support tools. Such tools may use as input the traffic matrix, the network topology, and the network performance objective(s).
Tools that perform what-if analysis can also be used to assist the TE process by allowing various scenarios to be reviewed before a new set of configurations is implemented in the operational network.

The overlay model (IP over ATM or IP over Frame Relay) is another approach which is commonly used in practice [AWD2]. The IP over ATM technique is no longer viewed favorably, however, due to recent advances in MPLS and router hardware technology.

Deployment of MPLS for traffic engineering applications has commenced in some service provider networks. One operational scenario is to deploy MPLS in conjunction with an IGP (IS-IS-TE or OSPF-TE) that supports the traffic engineering extensions, constraint-based routing for explicit route computations, and a signaling protocol (e.g., RSVP-TE or CR-LDP) for LSP instantiation.

In contemporary MPLS traffic engineering contexts, network administrators specify and configure link attributes and resource constraints, such as maximum reservable bandwidth and resource class attributes, for links (interfaces) within the MPLS domain. A link state protocol that supports TE extensions (IS-IS-TE or OSPF-TE) is used to propagate information about network topology and link attributes to all routers in the routing area. Network administrators also specify all the LSPs that are to originate from each router. For each LSP, the network administrator specifies the destination node and the attributes of the LSP which indicate the requirements to be satisfied during the path selection process. Each router then uses a local constraint-based routing process to compute explicit paths for all LSPs originating from it. Subsequently, a signaling protocol is used to instantiate the LSPs.
By assigning proper bandwidth values to links and LSPs, congestion caused by uneven traffic distribution can generally be avoided or mitigated.

The bandwidth attributes of LSPs used for traffic engineering can be updated periodically. The basic concept is that the bandwidth assigned to an LSP should relate in some manner to the bandwidth requirements of the traffic that actually flows through the LSP. The traffic attribute of an LSP can be modified to accommodate traffic growth and persistent traffic shifts. If network congestion occurs due to some unexpected event, existing LSPs can be rerouted to alleviate the situation, or the network administrator can configure new LSPs to divert some traffic to alternative paths. The reservable bandwidth of the congested links can also be reduced to force some LSPs to be rerouted to other paths.

In an MPLS domain, a traffic matrix can also be estimated by monitoring the traffic on LSPs. Such traffic statistics can be used for a variety of purposes, including network planning and network optimization. Current practice suggests that deploying an MPLS network consisting of hundreds of routers and thousands of LSPs is feasible. In summary, recent deployment experience suggests that the MPLS approach is very effective for traffic engineering in IP networks [XIAO].

As mentioned previously in Section 7.0, one usually has no direct control over the distribution of inbound traffic. Therefore, the main goal of contemporary inter-domain TE is to optimize the distribution of outbound traffic between multiple inter-domain links. When operating a global network, maintaining the ability to operate the network in a regional fashion where desired, while continuing to take advantage of the benefits of a global network, also becomes an important objective.
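The periodic LSP bandwidth update described earlier in this section can be sketched as a simple resizing rule: the reservation tracks measured demand with some headroom, and small fluctuations are ignored to avoid needless resignaling. The headroom multiplier, deadband threshold, and traffic samples below are hypothetical choices for illustration, not values from any deployed system.

```python
# Sketch: periodically resizing a TE-LSP's bandwidth reservation so that
# it relates to the traffic actually carried by the LSP. The headroom
# and deadband parameters are hypothetical.

def resize_lsp(reserved, measured_samples, headroom=1.5, deadband=0.1):
    """Return a new reservation for the LSP, or the old one if measured
    demand is within the deadband of the current reservation.

    reserved: current reserved bandwidth (e.g., in Mb/s).
    measured_samples: recent traffic measurements on the LSP.
    headroom: multiplier applied to peak demand to absorb growth.
    deadband: fractional change below which no resignaling is done.
    """
    demand = max(measured_samples) * headroom
    if abs(demand - reserved) / reserved < deadband:
        return reserved  # not worth resignaling the LSP
    return demand

# Persistent traffic growth triggers an upward resize...
assert resize_lsp(100.0, [95.0, 110.0, 105.0]) == 165.0

# ...while small fluctuations leave the reservation alone.
assert resize_lsp(100.0, [65.0, 64.0, 66.0]) == 100.0
```

A production system would also need to handle downward resizing policy and preemption priorities, which this sketch omits.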
Inter-domain TE with BGP usually begins with the placement of multiple peering interconnection points in locations that have high peer density, are in close proximity to originating/terminating traffic locations on one's own network, and are lowest in cost. There are generally several locations in each region of the world where the vast majority of major networks congregate and interconnect. Some location-decision problems that arise in association with inter-domain routing are discussed in [AWD5].

Once the locations of the interconnects are determined, and circuits are implemented, one decides how best to handle the routes heard from the peer, as well as how to propagate the peer's routes within one's own network. One way to engineer outbound traffic flows on a network with many EBGP peers is to create a hierarchy of peers. Generally, the Local Preferences of all peers are set to the same value so that the shortest AS paths will be chosen to forward traffic. Then, by overwriting the inbound MED metric (the Multi-Exit Discriminator, also referred to as the "BGP metric"; both terms are used interchangeably in this document) on routes received from different peers, the hierarchy can be formed. For example, all Local Preferences can be set to 200, preferred private peers can be assigned a BGP metric of 50, the rest of the private peers can be assigned a BGP metric of 100, and public peers can be assigned a BGP metric of 600. "Preferred" peers might be defined as those peers with whom the most available capacity exists, whose customer base is large in comparison to other peers, whose interconnection costs are the lowest, and with whom upgrading existing capacity is the easiest. In a network with low utilization at the edge, this works well.
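The peer hierarchy described above can be illustrated with a simplified model of the relevant steps of the BGP decision process: highest Local Preference wins, then shortest AS path, then lowest BGP metric (MED). Real BGP implementations evaluate several additional steps, and the peer names and attribute values below are hypothetical.

```python
# Sketch: the three BGP decision steps relevant to the peer hierarchy
# discussed above. Real implementations have more steps; names and
# values are hypothetical.

def best_route(routes):
    """Pick the best route from a list of (peer, local_pref,
    as_path_len, med) tuples: highest Local Preference, then shortest
    AS path, then lowest MED."""
    return min(routes, key=lambda r: (-r[1], r[2], r[3]))[0]

routes = [
    ("preferred-private-peer", 200, 3, 50),
    ("private-peer",           200, 3, 100),
    ("public-peer",            200, 3, 600),
]
# Equal Local Preference and AS-path length: the MED hierarchy decides.
assert best_route(routes) == "preferred-private-peer"

# A route with a longer AS path loses despite its lower MED...
routes.append(("distant-peer", 200, 4, 10))
assert best_route(routes) == "preferred-private-peer"

# ...but a higher Local Preference overrides everything else.
routes.append(("customer-route", 300, 6, 999))
assert best_route(routes) == "customer-route"
```

This ordering is why, as noted below, replacing inbound MEDs alone only moves the exit points of routes whose AS-path lengths are already equal.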
The same concept could be applied to a network with higher edge utilization by creating more levels of BGP metrics between peers, allowing for more granularity in selecting the exit points for traffic bound for a dual-homed customer on a peer's network.

When only the inbound MED metrics are replaced with BGP metrics, only the exit points of routes with equal AS-path lengths are changed. (The BGP decision process considers Local Preference first, then AS-path length, and then the BGP metric.) For example, assume a network has two possible egress points, peer A and peer B. Each peer has 40% of the Internet's routes exclusively on its network, while the remaining 20% of the Internet's routes are from customers who dual-home between A and B. Assume that both peers have a Local Preference of 200 and a BGP metric of 100. If the link to peer A is congested, increasing its BGP metric while leaving the Local Preference at 200 will ensure that the 20% of total routes belonging to dual-homed customers will prefer peer B as the exit point. This approach would be used in a situation where all exit points to a given peer were close to congestion levels, and traffic needed to be shifted away from that peer entirely.

When there are multiple exit points to a given peer, and only one of them is congested, it is not necessary to shift traffic away from the peer entirely, but only from the one congested circuit. This can be achieved by using passive IGP metrics, AS-path filtering, or prefix filtering.

Occasionally, more drastic changes are needed, for example, in dealing with a "problem peer" who is difficult to work with on upgrades or is charging high prices for connectivity to their network. In that case, the Local Preference to that peer can be reduced below the level of other peers.
This effectively reduces the amount of traffic sent to that peer to only originating traffic (assuming no transit providers are involved). This type of change can affect a large amount of traffic, and is only used after other methods have failed to provide the desired results.

Although it is not much of an issue in regional networks, the propagation of a peer's routes back through the network must be considered when a network is peering on a global scale. Sometimes, business considerations can influence the choice of BGP policies in a given context. For example, it may be imprudent, from a business perspective, to operate a global network and provide full access to the global customer base to a small network in a particular country. However, for the purpose of providing one's own customers with quality service in a particular region, good connectivity to that in-country network may still be necessary. This can be achieved by assigning a set of communities at the edge of the network which have a known behavior when routes tagged with those communities are propagated back through the core. Routes heard from local peers will be prevented from propagating back to the global network, whereas routes learned from larger peers may be allowed to propagate freely throughout the entire global network. By implementing a flexible community strategy, the benefits of using a single global AS Number (ASN) can be realized, while the benefits of operating regional networks can also be retained. An alternative is to use different ASNs in different regions, with the consequence that the AS-path length for routes announced by that service provider will increase.

9.0 Conclusion

This document described principles for traffic engineering in the Internet.
It presented an overview of some of the basic issues surrounding traffic engineering in IP networks. The context of TE was described, and a TE process model and a taxonomy of TE styles were presented. A brief historical review of pertinent developments related to traffic engineering was provided. A survey of contemporary TE techniques in operational networks was presented. Additionally, the document specified a set of generic requirements, recommendations, and options for Internet traffic engineering.

10.0 Security Considerations

This document does not introduce new security issues.

11.0 Acknowledgments

The authors would like to thank Jim Boyle for inputs on the recommendations section, Francois Le Faucheur for inputs on Diffserv aspects, Blaine Christian for inputs on measurement, Gerald Ash for inputs on routing in telephone networks and for text on event-dependent TE methods, Steven Wright for inputs on network controllability, and Jonathan Aufderheide for inputs on inter-domain TE with BGP. Special thanks to Randy Bush for proposing the TE taxonomy based on "tactical vs strategic" methods. The subsection describing an "Overview of ITU Activities Related to Traffic Engineering" was adapted from a contribution by Waisum Lai. Useful feedback and pointers to relevant materials were provided by J. Noel Chiappa. Additional comments were provided by Glenn Grotefeld during the working group last call process. Finally, the authors would like to thank Ed Kern, the TEWG co-chair, for his comments and support.

12.0 References

[ASH1] J. Ash, M. Girish, E. Gray, B. Jamoussi, and G. Wright, "Applicability Statement for CR-LDP," Work in Progress, July 2000.

[ASH2] J. Ash, Dynamic Routing in Telecommunications Networks, McGraw-Hill, 1998.

[ASH3] J. Ash, "TE & QoS Methods for IP-, ATM-, & TDM-Based Networks," Work in Progress, Mar. 2001.
[AWD1] D. Awduche and Y. Rekhter, "Multiprotocol Lambda Switching: Combining MPLS Traffic Engineering Control with Optical Crossconnects," IEEE Communications Magazine, March 2001.

[AWD2] D. Awduche, "MPLS and Traffic Engineering in IP Networks," IEEE Communications Magazine, Dec. 1999.

[AWD3] D. Awduche, L. Berger, D. Gan, T. Li, G. Swallow, and V. Srinivasan, "RSVP-TE: Extensions to RSVP for LSP Tunnels," Work in Progress, Feb. 2001.

[AWD4] D. Awduche, A. Hannan, and X. Xiao, "Applicability Statement for Extensions to RSVP for LSP-Tunnels," Work in Progress, Apr. 2000.

[AWD5] D. Awduche et al., "An Approach to Optimal Peering Between Autonomous Systems in the Internet," International Conference on Computer Communications and Networks (ICCCN'98), Oct. 1998.

[CRUZ] R. L. Cruz, "A Calculus for Network Delay, Part II: Network Analysis," IEEE Transactions on Information Theory, vol. 37, pp. 132-141, 1991.

[DIFF-TE] F. Le Faucheur et al., "Requirements for Support of Diff-Serv-aware MPLS Traffic Engineering," Work in Progress, May 2001.

[ELW95] A. Elwalid, D. Mitra, and R.H. Wentworth, "A New Approach for Allocating Buffers and Bandwidth to Heterogeneous, Regulated Traffic in an ATM Node," IEEE Journal on Selected Areas in Communications, 13:6, pp. 1115-1127, Aug. 1995.

[FGLR] A. Feldmann, A. Greenberg, C. Lund, N. Reingold, and J. Rexford, "NetScope: Traffic Engineering for IP Networks," IEEE Network Magazine, 2000.

[FLJA93] S. Floyd and V. Jacobson, "Random Early Detection Gateways for Congestion Avoidance," IEEE/ACM Transactions on Networking, Vol. 1, No. 4, pp. 397-413, Aug. 1993.

[FLOY94] S. Floyd, "TCP and Explicit Congestion Notification," ACM Computer Communication Review, V. 24, No. 5, pp. 10-23, Oct. 1994.

[HUSS87] B.R. Hurley, C.J.R. Seidl, and W.F.
Sewel, "A Survey of Dynamic Routing Methods for Circuit-Switched Traffic," IEEE Communications Magazine, Sep. 1987.

[ITU-E600] ITU-T Recommendation E.600, "Terms and Definitions of Traffic Engineering," Mar. 1993.

[ITU-E701] ITU-T Recommendation E.701, "Reference Connections for Traffic Engineering," Oct. 1993.

[ITU-E801] ITU-T Recommendation E.801, "Framework for Service Quality Agreement," Oct. 1996.

[JAM] B. Jamoussi, "Constraint-Based LSP Setup using LDP," Work in Progress, Feb. 2001.

[KATZ] D. Katz, D. Yeung, and K. Kompella, "Traffic Engineering Extensions to OSPF," Work in Progress, Feb. 2001.

[LNO96] T. Lakshman, A. Neidhardt, and T. Ott, "The Drop from Front Strategy in TCP over ATM and its Interworking with other Control Features," Proc. INFOCOM'96, pp. 1242-1250, 1996.

[MA] Q. Ma, "Quality of Service Routing in Integrated Services Networks," PhD Dissertation, CMU-CS-98-138, CMU, 1998.

[MATE] A. Elwalid, C. Jin, S. Low, and I. Widjaja, "MATE: MPLS Adaptive Traffic Engineering," Proc. INFOCOM'01, Apr. 2001.

[MCQ80] J.M. McQuillan, I. Richer, and E.C. Rosen, "The New Routing Algorithm for the ARPANET," IEEE Trans. on Communications, vol. 28, no. 5, pp. 711-719, May 1980.

[MPLS-DIFF] F. Le Faucheur et al., "MPLS Support of Differentiated Services," Work in Progress, Feb. 2001.

[MR99] D. Mitra and K.G. Ramakrishnan, "A Case Study of Multiservice, Multipriority Traffic Engineering Design for Data Networks," Proc. Globecom'99, Dec. 1999.

[RFC-1349] P. Almquist, "Type of Service in the Internet Protocol Suite," RFC 1349, Jul. 1992.

[RFC-1458] R. Braudes and S. Zabele, "Requirements for Multicast Protocols," RFC 1458, May 1993.

[RFC-1771] Y. Rekhter and T. Li, "A Border Gateway Protocol 4 (BGP-4)," RFC 1771, Mar. 1995.

[RFC-1812] F.
Baker (Editor), "Requirements for IP Version 4 Routers," RFC 1812, Jun. 1995.

[RFC-1992] I. Castineyra, N. Chiappa, and M. Steenstrup, "The Nimrod Routing Architecture," RFC 1992, Aug. 1996.

[RFC-1997] R. Chandra, P. Traina, and T. Li, "BGP Communities Attribute," RFC 1997, Aug. 1996.

[RFC-1998] E. Chen and T. Bates, "An Application of the BGP Community Attribute in Multi-home Routing," RFC 1998, Aug. 1996.

[RFC-2178] J. Moy, "OSPF Version 2," RFC 2178, July 1997.

[RFC-2205] R. Braden et al., "Resource ReSerVation Protocol (RSVP) -- Version 1 Functional Specification," RFC 2205, Sep. 1997.

[RFC-2211] J. Wroclawski, "Specification of the Controlled-Load Network Element Service," RFC 2211, Sep. 1997.

[RFC-2212] S. Shenker, C. Partridge, and R. Guerin, "Specification of Guaranteed Quality of Service," RFC 2212, Sep. 1997.

[RFC-2215] S. Shenker and J. Wroclawski, "General Characterization Parameters for Integrated Service Network Elements," RFC 2215, Sep. 1997.

[RFC-2216] S. Shenker and J. Wroclawski, "Network Element Service Specification Template," RFC 2216, Sep. 1997.

[RFC-2330] V. Paxson et al., "Framework for IP Performance Metrics," RFC 2330, May 1998.

[RFC-2386] E. Crawley, R. Nair, B. Rajagopalan, and H. Sandick, "A Framework for QoS-based Routing in the Internet," RFC 2386, Aug. 1998.

[RFC-2475] S. Blake et al., "An Architecture for Differentiated Services," RFC 2475, Dec. 1998.

[RFC-2597] J. Heinanen, F. Baker, W. Weiss, and J. Wroclawski, "Assured Forwarding PHB Group," RFC 2597, June 1999.

[RFC-2678] J. Mahdavi and V. Paxson, "IPPM Metrics for Measuring Connectivity," RFC 2678, Sep. 1999.

[RFC-2679] G. Almes, S. Kalidindi, and M. Zekauskas, "A One-way Delay Metric for IPPM," RFC 2679, Sep. 1999.

[RFC-2680] G. Almes, S. Kalidindi, and M.
Zekauskas, "A One-way Packet Loss Metric for IPPM," RFC 2680, Sep. 1999.

[RFC-2702] D. Awduche, J. Malcolm, J. Agogbua, M. O'Dell, and J. McManus, "Requirements for Traffic Engineering over MPLS," RFC 2702, Sep. 1999.

[RFC-2722] N. Brownlee, C. Mills, and G. Ruth, "Traffic Flow Measurement: Architecture," RFC 2722, Oct. 1999.

[RFC-2753] R. Yavatkar, D. Pendarakis, and R. Guerin, "A Framework for Policy-based Admission Control," RFC 2753, Jan. 2000.

[RFC-2961] L. Berger, D. Gan, G. Swallow, P. Pan, F. Tommasi, and S. Molendini, "RSVP Refresh Overhead Reduction Extensions," RFC 2961, Apr. 2000.

[RFC-2998] Y. Bernet et al., "A Framework for Integrated Services Operation over Diffserv Networks," RFC 2998, Nov. 2000.

[RFC-3031] E. Rosen, A. Viswanathan, and R. Callon, "Multiprotocol Label Switching Architecture," RFC 3031, Jan. 2001.

[RFC-3086] K. Nichols and B. Carpenter, "Definition of Differentiated Services Per Domain Behaviors and Rules for their Specification," RFC 3086, April 2001.

[RFC-3124] H. Balakrishnan and S. Seshan, "The Congestion Manager," RFC 3124, Jun. 2001.

[SHAR] V. Sharma et al., "Framework for MPLS Based Recovery," Work in Progress, Mar. 2001.

[SLDC98] B. Suter, T. Lakshman, D. Stiliadis, and A. Choudhury, "Design Considerations for Supporting TCP with Per-flow Queueing," Proc. INFOCOM'98, pp. 299-306, 1998.

[SMIT] H. Smit and T. Li, "IS-IS Extensions for Traffic Engineering," Work in Progress, Feb. 2001.

[XIAO] X. Xiao, A. Hannan, B. Bailey, and L. Ni, "Traffic Engineering with MPLS in the Internet," IEEE Network Magazine, Mar. 2000.

[YARE95] C. Yang and A. Reddy, "A Taxonomy for Congestion Control Algorithms in Packet Switching Networks," IEEE Network Magazine, pp. 34-45, 1995.

13.0 Authors' Addresses

Daniel O.
Awduche
Movaz Networks
7926 Jones Branch Drive, Suite 615
McLean, VA 22102
Phone: 703-847-7350
Email: awduche@movaz.com

Angela Chiu
Celion Networks
1 Shiela Dr., Suite 2
Tinton Falls, NJ 07724
Phone: 732-747-9987
Email: angela.chiu@celion.com

Anwar Elwalid
Lucent Technologies
Murray Hill, NJ 07974
Phone: 908-582-7589
Email: anwar@lucent.com

Indra Widjaja
Bell Labs, Lucent Technologies
600 Mountain Avenue
Murray Hill, NJ 07974
Phone: 908-582-0435
Email: iwidjaja@research.bell-labs.com

XiPeng Xiao
Photuris Inc.
2025 Stierlin Ct.
Mountain View, CA 94043
Phone: 650-919-3215
Email: xxiao@photuris.com