idnits 2.17.00 (12 Aug 2021) /tmp/idnits2913/draft-ott-mmusic-cap-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Looks like you're using RFC 2026 boilerplate. This must be updated to follow RFC 3978/3979, as updated by RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? ** The document is more than 15 pages and seems to lack a Table of Contents. == No 'Intended status' indicated for this document; assuming Proposed Standard == It seems as if not all pages are separated by form feeds - found 0 form feeds but 25 pages Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** There are 5 instances of too long lines in the document, the longest one being 4 characters in excess of 72. == There are 1 instance of lines with non-RFC6890-compliant IPv4 addresses in the document. If these are example addresses, they should be changed. == There are 1 instance of lines with multicast IPv4 addresses in the document. 
If these are generic example addresses, they should be changed to use the 233.252.0.x range defined in RFC 5771 Miscellaneous warnings: ---------------------------------------------------------------------------- == The document seems to lack the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. (The document does seem to have the reference to RFC 2119 which the ID-Checklist requires). -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (June 1999) is 8369 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: '10' is defined on line 859, but no explicit reference was found in the text ** Obsolete normative reference: RFC 2234 (ref. '2') (Obsoleted by RFC 4234) ** Downref: Normative reference to an Experimental RFC: RFC 2295 (ref. '3') -- Possible downref: Non-RFC (?) normative reference: ref. '4' -- Possible downref: Non-RFC (?) normative reference: ref. '5' == Outdated reference: draft-ietf-conneg-media-features has been published as RFC 2534 == Outdated reference: draft-ietf-conneg-feature-reg has been published as RFC 2506 == Outdated reference: draft-ietf-conneg-feature-syntax has been published as RFC 2533 -- Possible downref: Normative reference to a draft: ref. 
'9' == Outdated reference: A later version (-02) exists of draft-ietf-conneg-W3C-ccpp-01 -- Possible downref: Normative reference to a draft: ref. '10' ** Obsolete normative reference: RFC 2327 (ref. '11') (Obsoleted by RFC 4566) == Outdated reference: draft-ietf-mmusic-sap-v2 has been published as RFC 2974 ** Downref: Normative reference to an Experimental draft: draft-ietf-mmusic-sap-v2 (ref. '12') Summary: 11 errors (**), 0 flaws (~~), 11 warnings (==), 6 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 INTERNET-DRAFT J. Ott/D. Kutscher/C. Bormann 2 Expires: December 1999 Universitaet Bremen 3 June 1999 5 Capability description for group cooperation 6 draft-ott-mmusic-cap-00.txt 8 Status of this memo 10 This document is an Internet-Draft and is in full conformance with 11 all provisions of Section 10 of RFC2026. 13 Internet-Drafts are working documents of the Internet Engineering 14 Task Force (IETF), its areas, and its working groups. Note that 15 other groups may also distribute working documents as Internet- 16 Drafts. 18 Internet-Drafts are draft documents valid for a maximum of six months 19 and may be updated, replaced, or obsoleted by other documents at any 20 time. It is inappropriate to use Internet- Drafts as reference 21 material or to cite them other than as "work in progress." 23 The list of current Internet-Drafts can be accessed at 24 http://www.ietf.org/ietf/1id-abstracts.txt 26 The list of Internet-Draft Shadow Directories can be accessed at 27 http://www.ietf.org/shadow.html. 29 Abstract 31 This document presents a notation for describing potential and 32 specific configurations of end systems in multiparty collaboration 33 sessions. 
The objective is to define a configuration description 34 framework that can be used to define end system capabilities, to 35 calculate a set of appropriate common capabilities based on the 36 descriptions of all (end) systems and to express a selected media 37 description for use in session descriptions. One application for this 38 framework would be multiparty multimedia conferencing, an application 39 area where multiple tools have to be configured on conference startup 40 (and/or during the conference) concerning media encoding types and 41 other parameters. Other applications are IP Telephony and media 42 gateway control. 44 This document is intended for discussion in the Multiparty Multimedia 45 Session Control (MMUSIC) working group of the Internet Engineering 46 Task Force. Comments are solicited and should be addressed to the 47 working group's mailing list at confctrl@isi.edu and/or the authors. 49 1. Introduction 51 1.1. Background 53 1.1.1. Motivation 55 Multiparty multimedia conferencing is one application that requires 56 the dynamic interchange of end system capabilities and the 57 negotiation of a parameter set that is appropriate for all sending 58 and receiving end systems in a conference. Currently the parameter 59 negotiation is either done by out of band means or, for loosely 60 coupled conferences, parameters are simply fixed by the initiator of 61 a conference. In the latter scenario no negotiation is required 62 because only those participants with media tools that support the 63 predefined settings can join a media session and/or a conference. 65 This approach is applicable for conferences that are announced some 66 time ahead of the actual start date of the conference. Potential 67 participants can check the availability of media tools in advance and 68 tools like session directories can configure tools on startup. 
This 69 procedure, however, fails to work for conferences initiated 70 spontaneously, such as Internet phone calls or ad-hoc multiparty 71 conferences. Fixed settings for parameters like media types, their 72 encoding etc. can easily inhibit the initiation of conferences, for 73 example in situations where a caller insists on a fixed audio 74 encoding that is not available at the callee's end system.

76 To allow for spontaneous conferences, the process of defining a 77 conference's parameter set must therefore be performed either at 78 conference start (for closed conferences) or potentially even 79 repeatedly every time a new participant joins an active conference. 80 The latter approach may not be appropriate for every type of 81 conference: For conferences with TV-broadcast or lecture 82 characteristics (one main active source) it is usually not desirable to 83 re-negotiate parameters every time a new participant with an exotic 84 configuration joins because it may exclude the main source from media 85 sessions. But conferences with equal ``rights'' for participants that 86 are open for new participants do need dynamic capability negotiation, 87 for example a telephone call that is extended to a three-party 88 conference at some time during the session.

90 1.1.2. Current practices in the IETF community

92 Capability and session descriptions play different roles in 93 applications of IETF conferencing standards and are currently almost 94 always specified as SDP (Session Description Protocol) [11] session 95 descriptions. In session announcements with SAP (Session Announcement 96 Protocol) [12] they are used to define media encodings and parameters 97 for conferences and thus at least reflect the system capabilities of 98 the participants or the active source.

100 Within the context of SIP (Session Initiation Protocol) capability 101 descriptions can be expressed in different session description 102 languages, one of them SDP.
For example, in a SIP-INVITE message for 103 a unicast session, the session description enumerates the media types 104 and formats that the caller is willing to use and thus expresses the 105 capabilities of the caller's end system. The SDP content is, however, 106 not only used to express a caller's preferences but is also used to 107 configure communication channels in a somewhat crude way. For 108 example, if a callee does not want to send or receive data on an 109 offered stream, the callee has to set the port number of that stream to zero 110 in the media description that is sent as a reply to the caller. The 111 use of SDP as a capability description and negotiation mechanism has 112 led to a whole set of conventions and requirements that have to be 113 considered by implementations because SDP itself is not powerful 114 enough for this purpose. This is clearly not a defect of SDP, which 115 was never designed to be a complete capability description and 116 negotiation mechanism. SDP has been developed in the context of SAP 117 to describe simple static media sets.

119 The misuse of SDP reveals the lack of a powerful, yet simple way to 120 perform capability description and negotiation in a conference setup 121 or reconfiguration phase in the current IETF conferencing model.

123 1.2. Purpose

125 The configuration negotiation framework consists of three components:

127 o A language that allows expressing capability descriptions 128 (potential configurations) unambiguously;

130 o an algorithm that compares different capability descriptions and 131 produces an appropriate ``collapsed'' subset that can be used as 132 a common set of potential configurations; and

134 o a concrete capability name and value range specification for 135 specific applications.

137 This document specifies ways to express potential and concrete 138 configurations as well as rules to combine, constrain, and collapse 139 these configurations.
How a particular component's potential 140 configurations are obtained, what relationship exists to system 141 capabilities, and similar meta-discussions are beyond the scope of 142 this document.

144 It is also not the purpose of this document to specify a complete 145 framework including mandatory protocols for capability exchange. 146 Names and value ranges for different applications should be defined 147 in a follow-up document and registered with the IANA.

149 Besides modeling and rules, this document specifies a syntax for 150 expressing configurations and describes a basic and a concise 151 representation format as well as an XML-based notation. A number of 152 appendices provide mappings to other specification formats (in 153 particular SDP and H.245) as far as possible and also give an 154 overview of semantic definitions for configurations for audio codecs.

156 1.3. Relation to other Developments

158 A few other generic or application specific models have been 159 developed that deal with capability description and/or capability 160 negotiation.

162 RFC 2295 (Transparent Content Negotiation in HTTP) [3] proposes a 163 negotiation mechanism layered on top of HTTP that allows for 164 automatically selecting the ``best'' version of documents that are 165 accessible by a single URI. A server can describe the properties of 166 each variant of a document associated with ``quality degradation 167 factors''. The content negotiation process will either allow the 168 client to select the appropriate version according to a variant list 169 provided by the server or the server itself may choose a document 170 version relying on Accept-headers that are included in the client's 171 request.

173 The Resource Description Framework (RDF) [4] provides a specification 174 model for properties of Web resources and aims at automating the 175 processing of Web resources with respect to resource discovery, 176 cataloging, resource selection and other applications.
178 CC/PP [5] is an on-going development that is creating a framework for 179 describing user preferences and device capabilities that uses RDF to 180 express those descriptions. In the CC/PP model a user agent can 181 provide capability profiles that enable servers and proxies to 182 customize content accordingly.

184 The IETF Content Negotiation (conneg) working group is developing a 185 collection of media features for display, print and fax [6], a 186 registration procedure for feature tags (the names of capability 187 properties) [7] as well as description and negotiation models [8] [9] 188 for media features and capabilities. One of conneg's goals is to 189 develop a ``tag independent negotiation'' process that can work 190 without knowing the meaning of feature tags.

192 Whereas TCN, RDF and CC/PP focus on describing/negotiating 193 capabilities for client/server scenarios such as the WWW, where a 194 server provides content with certain properties and a client has 195 certain preferences/capabilities, the conneg approach is more 196 general. The conneg framework provides the abstraction of ``feature 197 sets'' that are media feature collections. Feature sets can either be 198 interpreted as a set of variants that a server can provide as data 199 formats or as a set of capabilities of a receiver. Content 200 negotiation in this model means finding a non-empty feature set 201 that is compatible with both the sender's and the receiver's original 202 feature set.

204 H.245, the multimedia control protocol employed across all newer 205 H.32x Recommendations for tightly-coupled multimedia conferencing 206 (in particular H.323), provides the concept of capability 207 specification and exchange between terminals and uses the same 208 description mechanisms to define particular instantiations of media 209 streams in a conference.
For capability description purposes, H.245 210 provides means to express all the capabilities supported by a system 211 (``AlternativeCapabilitySets'') as well as to describe permitted 212 combinations of these capability sets to be instantiated at the same 213 time. Capability exchange is defined on a peer-to-peer basis; common 214 (``collapsed'') capabilities are calculated by some central entity 215 that controls the mode of operation in a multipoint conference. This 216 calculation requires the central entity to understand (the semantics 217 of) the individual endpoints' capability descriptions.

219 T.124 specifies a framework for exchanging and collapsing 220 capabilities. This framework specifies a core set of rules (minimum, 221 maximum, logical AND) and capability types as well as a naming 222 scheme, but leaves definition of specific semantics to the 223 application protocols. This concept makes the framework extensible 224 and enables entities to calculate a common set of supported 225 capabilities without having to understand their semantics. Also, 226 T.124 distinguishes between capability descriptions and particular 227 instantiations for application sessions. In addition to these 228 collapsing capabilities, T.124 supports the notion of non-collapsing 229 capabilities to which the collapsing process is not applied.

231 Capability negotiation for groups of senders and receivers as 232 presented in this document can be viewed as a specialization of the 233 general conneg approach that focuses on simplicity for capability 234 descriptions. Some expressional power of the conneg framework is 235 abandoned in favor of simplicity.

237 1.4. Terminology for requirement specifications

239 In this document, the key words "MUST", "MUST NOT", "REQUIRED", 240 "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", 241 and "OPTIONAL" are to be interpreted as described in RFC 2119 [1] and 242 indicate requirement levels for compliant implementations.
244 2. Requirements and Concepts

246 2.1. System Model

248 Any (computer) system has a number of rather fixed hardware as well 249 as software resources. These resources ultimately define the 250 limitations on what can be captured, displayed, rendered, replayed, 251 etc. with this particular machine. We term features enabled and 252 restricted by these resources "system capabilities".

254 Example: System capabilities may include the limitation of the 255 screen resolution for true color by the graphics board; 256 available audio hardware or software may offer only certain 257 media encodings (e.g. G.711 and G.723.1 but not GSM); and CPU 258 processing power and quality of implementation may constrain the 259 possible video encoding algorithms.

261 In multiparty multimedia conferences, participants employ different 262 ``components'' in conducting the conference.

264 Example: In lecture multicast conferences one component might be 265 the voice transmission for the lecturer, another the 266 transmission of video pictures showing the lecturer and a 267 third the transmission of presentation material; these are 268 different components in a conference.

270 Depending on system capabilities, user preferences and other 271 technical and political constraints, different configurations can be 272 chosen to accomplish the ``deployment'' of these components.

274 Each component can be characterized at least by (a) its intended use 275 (i.e. the function it shall provide) and (b) one or more possible 276 ways to realize this function. Each way of realizing a particular 277 function is referred to as a "configuration".

279 Example: A conference component's intended use may be to make 280 transparencies of a presentation visible to the audience on the 281 Mbone. This can be achieved either by a video camera capturing 282 the image and transmitting a video stream via some video tool or 283 by loading a copy of the slides into a distributed electronic 284 whiteboard.
For each of these cases, additional parameters may 285 exist, leading to additional configurations (see below).

287 Two configurations are considered different regardless of whether they 288 employ entirely different mechanisms and protocols (as in the 289 previous example) or choose the same ones and differ only in a single 290 parameter.

292 Example: In case of video transmission, a JPEG-based still image 293 protocol may be used, H.261 encoded CIF images could be sent as 294 could H.261 encoded QCIF images. All three cases constitute 295 different configurations. Of course there are many more 296 detailed protocol parameters.

298 Each component's configurations are limited by the system 299 capabilities. In addition, the intended use of a component may 300 constrain the possible configurations further to a subset suitable 301 for the particular component's purpose.

303 Example: In a system for highly interactive audio communication 304 the component responsible for audio may decide not to use the 305 available G.723.1 audio codec to avoid the additional latency 306 but only use G.711. This would be reflected in this component 307 only showing configurations based upon G.711. Still, multiple 308 configurations are possible, e.g. depending on the use of A-law 309 or u-law, packetization and redundancy parameters, etc.

311 We distinguish two types of configurations:

313 o potential configurations

315 (a set of any number of configurations per component) indicating 316 a system's functional capabilities as constrained by the 317 intended use of the various components;

319 o actual configurations

321 (exactly one per instance of a component) reflecting the mode of 322 operation of this component's particular instantiation.

324 Example: The potential configuration of the aforementioned video 325 component may indicate support for JPEG, H.261/CIF, and 326 H.261/QCIF.
A particular instantiation for a video conference 327 may use the actual configuration of H.261/CIF for exchanging 328 video streams.

330 A configuration consists of any number of properties and is uniquely 331 identified by a tag. Potential configurations can be grouped into 332 alternatives each of which indicates a possible mode of operation of 333 a component.

335 In a conference, each involved peer contributes to the formation of a 336 component's configuration -- by specifying its own features and 337 limitations during the capability exchange process. Based upon all 338 systems' input, a set of common capabilities -- potential 339 configurations -- is calculated through the collapsing process.

341 The collapsing process may be influenced by additional constraints 342 that may be expressed on the possible combinations of alternatives -- 343 between multiple instances of the same component as well as across 344 (instances of) different components. Also, user preferences may be 345 taken into account -- during the collapsing process as well as when 346 deciding on which potential configuration is to be instantiated as 347 the actual configuration for a component.

349 2.2. Definition of terms

351 From the system model described above, the following core terms can 352 be extracted:

354 o conference component

356 An element of a multiparty multimedia conference that can appear 357 as a media stream and has a set of potential configurations.

359 o configuration

361 A set of named attributes, expressing constraints on a system's 362 capabilities.

364 o capability

366 Resources or system features that influence the selection of 367 useful configurations for components.

369 o alternative

371 When comparing different potential configurations, one potential 372 configuration is an alternative to other configurations.

374 o property

376 A property is a label-value pair.

378 The capability description language specified in this document is 379 called CAP.

381 2.3.
Description language

383 The objective of a capability description language is to allow the 384 definition of supported media types, encodings and features of an end 385 system. The language must be unambiguous, easily parsable and allow 386 for concise definitions to minimize the transport overhead for a 387 capability negotiation phase during a conference. It should also be 388 extensible and not fixed to certain features, because new encodings 389 must be supported without changes to the language definition.

391 To ensure unambiguity it is, however, necessary to have a common 392 understanding of the meaning of identifiers and values. E.g. if two 393 end systems used different names for the audio encoding ``GSM'' a 394 capability negotiation would not lead to the desired result. The 395 need for well-known identifiers and the need for extensibility 396 require separating the definition of identifiers and values from the 397 definition of the description language itself. Identifiers and values 398 should therefore be standardized and registered.

400 2.4. Collapsing Algorithm

402 The objective of the collapsing algorithm is to take capability 403 description sets from each end system in order to find a set of 404 media-types, encodings and features that are supported by all end 405 systems, or, if this is not possible, to find a subset that would 406 exclude as few systems as possible.

408 The procedure described above would be the default algorithm. In 409 certain scenarios where some end systems are privileged it must be 410 possible to ensure that the result of the collapsing process does not 411 exclude those privileged systems. It must therefore be possible to 412 parameterize the process with the policy to be applied.

414 3. Specification of the Description Language

416 Two semantically equivalent notations are introduced.
The first 417 notation is simple but leads to verbose capability descriptions and 418 the second notation is more complex but allows for concise 419 descriptions.[1] This specification also defines how to translate 420 descriptions using the concise notation to the other, simpler, 421 format.

423 Please note that all tags and values are just examples and not a 424 subject of this specification.

426 3.1. Basic Description Language

428 In the basic description language an end system's capability 429 description is a set of alternatives. An alternative is a set of 430 constraints for certain parameters. A constraint can be understood as 431 a restriction because it limits the capability alternative according 432 to the constraint's meaning.

434 A constraint consists of three components:

436 +---------+-------------------------------------+
437 |tag      | name of the constraint              |
438 |operator | defines the type of the constraint  |
439 |value    | a value for the constraint operator |
440 +---------+-------------------------------------+
441 Table 1: Components of a Constraint

443 A constraint that limits the capability of an end system to a maximum 444 transfer rate of 64 kbit/s (say in a description of audio receiver 445 capabilities) would be written as follows:

447     bps <= 64000;

449 with bps as the tag, <= as the operator and 64000 as the value of 450 this constraint (plus a semicolon as an end-of-statement symbol).

452 A complete alternative (a set of constraints) would be written as:

454 _________________________
455 [1] A third, XML-based notation is included in appendix A.

457     media = audio;
458     mode = receive | send;
459     channels = 1;
460     encoding = g711;
461     compression = mulaw;
462     sampling_rate = 8000 | 11025 | 16000;

464 This example exhibits another way of expressing constraints using the 465 = operator. The = operator can be used to define a set of supported 466 values in a single constraint.
The value of the = operator is 467 actually a list of names separated by ``|''. In the definition of the 468 media constraint it is shown how a single name is used as a value for 469 the = operator, which has the meaning that (for the respective 470 alternative) only the media-type audio is supported.

472 Another operator that is not shown in the example is the operator >= 473 that can be used to express minimum constraints. Table 2 provides an 474 overview of the operators:

476 +---+---------------------------+
477 |<= | maximum                   |
478 |>= | minimum                   |
479 |=  | selection of fixed values |
480 +---+---------------------------+
481 Table 2: Operators for Capability Constraints

483 The reason why the sampling_rate constraint is expressed with a = and 484 not with a <= operator is that defining the rate capability as a 485 maximum constraint with a value of 16000 would allow any value less 486 than 16000 as a valid parameter which would not match the application 487 specific semantics in this case.[2]

489 The example above contains one alternative of a capability 490 description. It could be used as a complete description expressing 491 that the end system does not support more than this specific 492 alternative. Most end systems, however, support more variants of audio 493 parameters, requiring the definition of more alternatives. E.g. 494 supporting ``GSM'' as a second encoding would lead to the following 495 capability description:

497     tag: audio/g711
498     media = audio;
499     mode = receive | send;
500     channels = 1;
501     encoding = g711;
502     compression = mulaw;
503     sampling_rate = 8000 | 11025 | 16000;

505 _________________________
506 [2] Most codecs do not support arbitrary sampling rates.
508     tag: audio/gsm
509     media = audio;
510     mode = receive | send;
511     channels = 1;
512     encoding = gsm;
513     compression = half | full | enhanced_full;

515 This description expresses that the end system supports one media 516 type ``audio'' and two audio encodings ``g711'' and ``gsm'', each 517 with certain other constraints. This way of defining capabilities is 518 very redundant as many constraints are the same for both 519 alternatives. It is important to know all the constraints of an 520 alternative for a later negotiation phase (see below) but for writing 521 and transferring capability descriptions another notation that 522 expresses common constraints and allows for more concise definition 523 is useful.

525 The = operator is actually already used to aggregate several 526 constraints into one: A hypothetical, even more primitive notation could 527 translate each alternative containing a = constraint into a set of 528 alternatives each containing an ``equality constraint'' for one value 529 of the = value list. E.g. for the GSM alternative there would be three 530 alternatives, one for each compression type (each variant again would 531 require an alternative for receive and one for send mode in this 532 example). This has not been done in this example in order to avoid 533 the obvious verbosity. Every alternative containing a = constraint 534 with n values can, however, be unrolled to n different alternatives if 535 this granularity is required.

537 Each alternative also contains a tag that allows it to be referenced 538 later in simultaneous capability specifications. Due to the 539 possibility to aggregate alternatives with = constraints, several 540 specific codec parameters for a media codec can be subsumed under one 541 common tag like in the example above. This allows common 542 cases to be handled efficiently where this is desired. Again, if more granularity 543 is needed for specific applications, = constraints can be unrolled.
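The unrolling of = constraints described above can be sketched in a few lines of Python. This is only an illustration and not part of the notation: the dictionary layout (a "tag" entry plus a "oneof" map from label to value list) and the function name unroll are assumptions made for this example, and minimum/maximum constraints are left out for brevity.

```python
from itertools import product


def unroll(alternative):
    """Expand every multi-valued '=' constraint into single-valued alternatives.

    `alternative` is a hypothetical in-memory form of one basic-notation
    alternative: {"tag": str, "oneof": {label: [value, ...]}}.
    """
    labels = list(alternative["oneof"])
    value_lists = [alternative["oneof"][label] for label in labels]
    unrolled = []
    # One result per combination of values, i.e. the cross product of all lists.
    for combo in product(*value_lists):
        unrolled.append({
            "tag": alternative["tag"],  # all variants keep the common tag
            "oneof": {label: [value] for label, value in zip(labels, combo)},
        })
    return unrolled


# The GSM alternative from the example above.
gsm = {
    "tag": "audio/gsm",
    "oneof": {
        "media": ["audio"],
        "mode": ["receive", "send"],
        "channels": ["1"],
        "encoding": ["gsm"],
        "compression": ["half", "full", "enhanced_full"],
    },
}

variants = unroll(gsm)
print(len(variants))  # 2 modes x 3 compression types = 6
```

All six variants keep the tag audio/gsm, which matches the remark that several codec parameters can be subsumed under one common tag; an application that needs finer granularity would have to assign distinct tags itself.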
545 The ABNF[2] specification for the basic description language is as 546 follows:

548 +-------------------------------------------------------------------+
549 |caps             = alternative *(CRLF CRLF alternative)            |
550 |alternative      = tag-definition CRLF *constraint                 |
551 |constraint       = *WSP (min-constraint / max-constraint /         |
552 |                   oneof-constraint) *WSP [CRLF *WSP]              |
553 |tag-definition   = *WSP "tag:" *WSP identifier *WSP ";"            |
554 |min-constraint   = label *WSP ">=" *WSP numval *WSP ";"            |
555 |max-constraint   = label *WSP "<=" *WSP numval *WSP ";"            |
556 |oneof-constraint = label *WSP "=" *WSP [oneof-list] *WSP           |
557 |                   ";"                                             |
558 |oneof-list       = val / (oneof-list *WSP "|" *WSP val)            |
559 |label            = identifier                                      |
560 |val              = identifier                                      |
561 |numval           = 1*DIGIT                                         |
562 |identifier       = ALPHA *(ALPHA / DIGIT)                          |
563 +-------------------------------------------------------------------+

565 Note that the specification does not currently provide 566 ``non-collapsing'' attributes, i.e. attributes that are not considered in 567 collapsing rules, except for tags. Another syntactic element for 568 those attributes will be added in the future.

570 3.2. Concise Description Language

572 3.2.1. Syntax

574 The goal of the concise description language is to express the same 575 capability description more concisely by grouping shared constraints 576 of alternatives. The concise language provides the same constraint 577 operators but introduces the concept of alternative groups. The 578 above, verbose example can be expressed like this:

580     media: audio {
581         mode = receive | send;
582         channels = 1;
583         encoding: g711 {
584             compression = mulaw;
585             sampling_rate = 8000 | 11025 | 16000;
586         } || encoding: gsm {
587             compression = half | full | enhanced_full;
588         };
589     };

591 An alternative group contains those constraints (and subgroups) that 592 are specific to an alternative and cannot be expressed in the common 593 part.
   A group is enclosed by curly brackets and follows a group-tag
   (like ``encoding: g711'' in the example). A group-tag is semantically
   a ``='' constraint (with one value) but is used in the concise
   notation to introduce a new subgroup of constraints.

   The example above contains three groups: the top-level group
   ``media: audio'' and two second-level groups, ``encoding: g711'' and
   ``encoding: gsm''. Groups on the same hierarchy level (siblings) are
   connected by ``||''. Groups can be nested to arbitrary levels, and
   there is no limit on the number of siblings in a hierarchy. The next
   example shows how the ``encoding: g711'' group can be split up into
   two subgroups:

     media: audio {
       mode = receive | send;
       channels = 1;
       encoding: g711 {
         compression: mulaw {
           sampling_rate = 8000 | 11025 | 16000;
         } || compression: alaw {
           sampling_rate = 8000 | 11025 | 32000;
         };
       } || encoding: gsm {
         compression = half | full | enhanced_full;
       };
     };

   Note that no explicit tags are allowed in the concise notation.
   Instead, group tags serve as implicit tag components that can be
   composed into unique tags for each expressed alternative. An
   alternative can be uniquely specified by joining the group tags of
   all enclosing groups. The example above would thus define three
   alternatives: audio/g711/mulaw, audio/g711/alaw and audio/gsm. Tag
   concatenation uses "/" (slash) as a delimiting character.

   The ABNF [2] specification for the concise description language is
   as follows (as an extension to the ABNF of the basic language, see
   section 3.1):

   +--------------------------------------------------------+
   | caps      = 1*(group LWSP *("||" LWSP group) *WSP ";") |
   | group     = group-tag *WSP "{" LWSP *constraint        |
   |             [caps] LWSP "}"                            |
   | group-tag = name ":" *WSP tag                          |
   +--------------------------------------------------------+

3.2.2.
Translation to Basic Notation

   Transforming a capability description from concise to basic notation
   MUST be done by applying the following algorithm, starting at the
   outermost hierarchy level and transforming subgroups recursively:

     transform group-tag to = constraint;
     push group-tag to tag stack;
     adopt all other constraints within the group;

     for each group in this level {
         add adopted constraints and transformed group-tag
         to every alternative obtained from transforming the
         subgroups recursively, resulting in a set of alternatives;
         if (is innermost group) {
             construct tag by concatenating all group-tags
             from tag stack and add it to alternative;
         }
         pop tag stack;
     }

   Two innermost subgroups at the same hierarchy level are thus
   converted to two alternatives. An example:

     A: B {
       C <= 1;
       D: E {
         F <= 2;
         G: H {
           I <= 3;
         } || J: K {
           L <= 4;
         };
       } || M: L {
         N <= 5;
       };
     };

   would be transformed into the following set of alternatives:

     tag: B/E/H
       A = B;
       C <= 1;
       D = E;
       F <= 2;
       G = H;
       I <= 3;

     tag: B/E/K
       A = B;
       C <= 1;
       D = E;
       F <= 2;
       J = K;
       L <= 4;

     tag: B/L
       A = B;
       C <= 1;
       M = L;
       N <= 5;

3.2.3. Translation from Basic to Concise Format

   The mapping from basic to concise representation is not unique by
   itself: in principle, constraints with common values can be factored
   out of any set of alternatives, and the results will differ
   depending on the constraints that are chosen for the outer groups.
   Nevertheless, it would be possible to define an algorithm that
   guarantees uniqueness, for example by defining certain tags as
   implicit outer-level tags (e.g. ``media'') and by demanding that
   those constraints with the largest number of equal values across
   alternatives appear in the outermost groups.
   Conflicts could be avoided by imposing a lexicographic ordering on
   the tags. Only ``='' constraints with one parameter can be chosen
   for group tags.

4. Specification of constraints for simultaneous capabilities

   For some applications it is not sufficient to be able to express the
   capability to support a list of media types and codec parameters.
   Instead, constraints on how many instances of codecs of different
   types can be active at a given time must also be specified as an
   input parameter for a negotiation/selection process.

   For example, a gateway may be able to handle either 5 GSM streams
   or, alternatively, 5 G.711 streams at the same time, but not both
   GSM and G.711 at the same time.

   The specification presented here enables the definition of such
   constraints through the tagging mechanism. Alternative capabilities
   can be referred to by their tags in rules expressing those
   simultaneous constraints. The specification of such a definition
   language is, however, not the subject of this draft and will have to
   be defined elsewhere.

5. Specification of the Collapsing Process

   The collapsing process generates a set of alternatives from the
   collapsing policy and the set of alternatives that are used as the
   input to this process.

5.1. Finding compatible alternatives

   The general collapsing process tries to find a set of alternatives
   that are supported by every end system. This must be accomplished by
   comparing each alternative of an end system's alternative set with
   each alternative of every other alternative set.
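
   The pairwise comparison described above, combined with the
   per-operator rules that this section gives for collapsing two
   alternatives, can be sketched in Python as follows. The in-memory
   layout (one dictionary per operator) and the names collapse and
   collapse_sets are assumptions made for this illustration; the draft
   does not prescribe a data model. Tags are omitted because they are
   ignored when collapsing.

```python
def collapse(a, b):
    """Collapse two alternatives; return None if they are incompatible.

    Each alternative is a dict with the keys "<=", ">=" and "=", each
    mapping constraint labels to a number ("<=", ">=") or to a list of
    values ("="). Only constraints present in both alternatives with
    the same label and operator are kept.
    """
    result = {"<=": {}, ">=": {}, "=": {}}
    for label in a["<="].keys() & b["<="].keys():
        # Two upper bounds collapse to the smaller one.
        result["<="][label] = min(a["<="][label], b["<="][label])
    for label in a[">="].keys() & b[">="].keys():
        # Two lower bounds collapse to the larger one.
        result[">="][label] = max(a[">="][label], b[">="][label])
    for label in a["="].keys() & b["="].keys():
        common = [v for v in a["="][label] if v in b["="][label]]
        if not common:
            return None  # empty value list: collapsing has failed
        result["="][label] = common
    return result


def collapse_sets(set_a, set_b):
    """Compare every alternative of set_a with every one of set_b."""
    collapsed = (collapse(a, b) for a in set_a for b in set_b)
    return [c for c in collapsed if c is not None]


g711_a = {"<=": {}, ">=": {}, "=": {
    "encoding": ["g711"], "compression": ["mulaw"],
    "sampling_rate": ["8000", "11025", "16000"]}}
gsm_a = {"<=": {}, ">=": {}, "=": {
    "encoding": ["gsm"], "compression": ["half", "full"]}}
g711_b = {"<=": {}, ">=": {}, "=": {
    "encoding": ["g711"], "compression": ["mulaw", "alaw"],
    "sampling_rate": ["8000", "32000"]}}

common = collapse_sets([g711_a, gsm_a], [g711_b])
print(common[0]["="]["sampling_rate"])  # ['8000']
```

   In this example the GSM alternative is dropped, because its
   ``encoding'' value list has no value in common with the G.711
   alternative of the second end system.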

   The process of collapsing two alternatives works as follows:

     find intersection of constraints of the two alternatives by
     keeping all constraints with the same names and operators;
     for all constraints in the intersection {
         find corresponding constraint (same name and operator) in
         second set;
         if (operator == "<=") {
             calculate minimum of both constraint values and add
             maximum constraint with that value to result set;
         }
         if (operator == ">=") {
             calculate maximum of both constraint values and add
             minimum constraint with that value to result set;
         }
         if (operator == "=") {
             build intersection of the value lists of both
             constraints;
             add = constraint with the values of the
             intersection to the result set;
         }
     }

   Tags are ignored in the collapsing process. If the result set
   contains a = constraint with an empty value list, the collapsing of
   these two alternatives has failed and the result set must be
   discarded.

5.2. Other policies

   Other collapsing policies will have to be defined.

6. Composed Configurations

   For certain configurations it is required to compose configurations
   by combining or referencing other configurations. Sample
   applications are redundant and FEC encodings. A full specification
   of how this can be accomplished will have to be defined. The general
   outline would be to use the structuring and referencing mechanisms
   (tagged alternatives) to express the required constraints for the
   respective encodings.

7. Security Considerations

   Security considerations will also have to be defined.

8. Authors' Addresses

   Joerg Ott
   Universitaet Bremen, TZI, MZH 5180
   Bibliothekstr. 1
   D-28359 Bremen
   Germany
   voice +49 421 201-7028
   fax +49 421 218-7000

   Dirk Kutscher
   Universitaet Bremen, TZI, MZH 5160
   Bibliothekstr.
   1
   D-28359 Bremen
   Germany
   voice +49 421 218-7595
   fax +49 421 218-7000

   Carsten Bormann
   Universitaet Bremen, TZI, MZH 5180
   Bibliothekstr. 1
   D-28359 Bremen
   Germany
   voice +49 421 218-7024
   fax +49 421 218-7000

9. References

   [1]  S. Bradner, ``Key words for use in RFCs to Indicate Requirement
        Levels'', RFC 2119, March 1997

   [2]  D. Crocker, P. Overell, ``Augmented BNF for Syntax
        Specifications: ABNF'', RFC 2234, November 1997

   [3]  K. Holtman, A. Mutz, ``Transparent Content Negotiation in
        HTTP'', RFC 2295, March 1998

   [4]  O. Lassila, R. Swick, ``Resource Description Framework (RDF)
        Model and Syntax Specification'', W3C Proposed Recommendation,
        January 1999, work in progress, http://www.w3.org/TR/1999

   [5]  F. Reynolds, J. Hjelm, S. Dawkins, S. Singhal, ``Composite
        Capability/Preference Profiles (CC/PP): A user side framework
        for content negotiation'', W3C Note 30, November 1998, work in
        progress, http://www.w3.org/TR/1998/NOTE-CPP-19981130

   [6]  L. Masinter, K. Holtman, A. Mutz, D. Wing, ``Media Features for
        Display, Print, and Fax'', Internet-Draft
        draft-ietf-conneg-media-features-05.txt, January 1998, work in
        progress

   [7]  K. Holtman, A. Mutz, T. Hardie, ``Media Feature Tag
        Registration Procedure'', Internet-Draft
        draft-ietf-conneg-feature-reg-03.txt, July 1998, work in
        progress

   [8]  G. Klyne, ``A syntax for describing media feature sets'',
        Internet-Draft draft-ietf-conneg-feature-syntax-04.txt,
        December 1998, work in progress

   [9]  G. Klyne, ``An algebra for describing media feature sets'',
        Internet-Draft draft-ietf-conneg-feature-algebra-03.txt, August
        1998, work in progress

   [10] G. Klyne, ``W3C Composite Capability/Preference Profiles'',
        Internet-Draft draft-ietf-conneg-W3C-ccpp-01.txt, December
        1998, work in progress

   [11] M.
        Handley, ``SDP: Session Description Protocol'', RFC 2327,
        April 1998

   [12] M. Handley, C. Perkins, E. Whelan, ``Session Announcement
        Protocol'', Internet-Draft draft-ietf-mmusic-sap-v2-01.txt,
        June 1999, work in progress

Appendix A: XML-DTD for the description language

   An XML DTD for XML documents representing concise CAP descriptions:

     [DTD markup not recoverable from the source text]

   The example explained above represented in XML:

     [XML markup not recoverable from the source text; the surviving
     element content, in document order, is:]

       receive, send
       1
       8000, 11025
       8000, 11025
       half, full, enhanced_full

   Note that = constraints with one alternative are represented as
   property elements for brevity, while = constraints with multiple
   alternatives are represented as one.of elements with a property
   element (without name attribute) for each value.

Appendix B: Mapping from/to SDP

   Note that this appendix is still preliminary, as it does not yet
   cover all the features provided by the capability description
   language presented in this document.

   SDP allows for describing all parameters required for establishing a
   conference. The media parameters that can be interpreted as a
   caller's capabilities are only a subset of the session description.
   Other information, like the origin (``o='' field) or communication
   parameters, is not related to a system's capability description
   (although it needs to be expressible in a session description
   language as well).
   An example SDP description:

     v=0
     o=mhandley 2890844526 2890842807 IN IP4 126.16.64.4
     s=SDP Seminar
     i=A Seminar on the session description protocol
     u=http://www.cs.ucl.ac.uk/staff/M.Handley/sdp.03.ps
     e=mjh@isi.edu (Mark Handley)
     c=IN IP4 224.2.17.12/127
     t=2873397496 2873404696
     a=recvonly
     m=audio 49170 RTP/AVP 0
     m=video 51372 RTP/AVP 31
     m=video 51374 RTP/AVP 98
     a=rtpmap:98 X-H.263+
     m=application 32416 udp wb
     a=orient:portrait

   Only the ``m='' and the respective ``a='' fields contain relevant
   information for a mapping to our capability description language.
   The first element of an ``m='' field is the media type, which can be
   mapped to the tag of a top-level ``group-tag'' in the concise
   description language. The second element of an ``m='' field, the
   transport port, is a communication parameter and can therefore be
   neglected for now. The third and fourth (and subsequent) elements
   define a transport protocol (which can be regarded as some kind of
   capability) and media formats (encodings). The ``m='' field may be
   followed by an ``a='' field that can contain arbitrary constraints
   on the media description, notably the rtpmap attribute, which maps a
   dynamic RTP payload type number to a media format (and additional
   encoding parameters, depending on the concrete encoding). Further
   encoding-specific parameters are specified using an ``a=fmtp''
   attribute. All parameters of an ``a=fmtp'' attribute will be mapped
   to respective constraints in our description language. The concrete
   mapping is yet to be defined for some common uses of ``a=fmtp''.

   For the sake of generality we must translate the implicit encoding
   parameters expressed in static RTP payload numbers to explicit
   descriptions and extract the relevant information from the ``a=''
   fields for dynamic payload types.
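
   Making the implicit parameters of a static payload type explicit
   could look like the following Python sketch. The static payload type
   entries are taken from RFC 1890 (only a few are listed). The name
   map from registered payload type names to CAP encoding names is
   hypothetical, since this document leaves that map to be defined, and
   the function name explicit_constraints is likewise an assumption.

```python
# A few static RTP/AVP payload types from RFC 1890:
# number -> (media type, registered name, clock rate, channels).
STATIC_PAYLOAD_TYPES = {
    0: ("audio", "PCMU", 8000, 1),
    3: ("audio", "GSM", 8000, 1),
    8: ("audio", "PCMA", 8000, 1),
    31: ("video", "H261", 90000, None),
}

# Hypothetical name map (not defined by this document): registered
# payload type name -> (CAP encoding name, extra "=" constraints).
NAME_MAP = {
    "PCMU": ("g711", {"compression": ["mulaw"]}),
    "PCMA": ("g711", {"compression": ["alaw"]}),
    "GSM": ("gsm", {}),
    "H261": ("h261", {}),
}


def explicit_constraints(payload_type, transport="RTP"):
    """Return the "=" constraints implied by a static payload type."""
    media, rtp_name, clock_rate, channels = STATIC_PAYLOAD_TYPES[payload_type]
    encoding, extra = NAME_MAP[rtp_name]
    constraints = {
        "media": [media],
        "encoding": [encoding],
        "transport": [transport],
    }
    constraints.update(extra)
    if media == "audio":
        # Sampling rate and channel count are implicit in the number.
        constraints["sampling_rate"] = [str(clock_rate)]
        constraints["channels"] = [str(channels)]
    return constraints


pcmu = explicit_constraints(0)
print(pcmu["encoding"], pcmu["compression"], pcmu["sampling_rate"])
```

   For payload type 0 this yields the g711, mulaw, 8000 Hz, one-channel
   constraints that a static payload number otherwise leaves implicit.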

   The SDP example above could therefore be translated as:

     media: audio {
       mode = receive | send;
       encoding: g711 {
         transport = RTP;
         compression = mulaw;
         sampling_rate = 8000;
         channels = 1;
       };
     };
     media: video {
       mode = receive | send;
       encoding: h261 {
         transport = RTP;
       } || encoding: h263+ {
         transport = RTP;
       };
     };
     media: application {
       type: wb {
         transport = UDP;
         orientation = portrait;
       };
     };

   The constraints inside the ``g711'' group have to be adopted from
   the payload type definitions in RFC 1890. The ``transport''
   constraint could also be factored out to the outer groups ``audio''
   and ``video''; this is not relevant to the semantics of the
   description. Note that the groups for ``h261'' and ``h263+'' can
   also be abbreviated as a ``= constraint'' if no specific constraints
   exist for those encodings.

   The mapping process can thus be defined as follows:

   1) Each ``m='' format specification is mapped to a ``group''
      nested in a ``group'' for the respective media. The tag for that
      group is inferred either from the static payload type or, in the
      case of dynamic payload types, looked up from a corresponding
      ``a=rtpmap'' field. The corresponding registered payload type
      name leads to an encoding name (by a yet to be defined name
      map). A mapping for unregistered payload type names has to be
      defined as well.

   2) The transport of an ``m='' field becomes a ``='' constraint in
      the ``group'' for the encoding.

   3) For registered payload type names, the additional parameters
      defined in RFC 1890, such as sampling rate and number of
      channels, are each translated into corresponding ``=''
      constraints of the encoding group.

   4) Translation of ``a=fmtp'' has to be defined...

   5) All other ``a='' fields relating to an ``m='' field and
      representing a single attribute-value mapping (like
      orient:portrait) are translated into single ``='' constraints
      with one value.

   Future versions of this specification will also define how to
   integrate other SDP configuration parameters into CAP using
   non-collapsing parameters (see section xx) that are yet to be
   defined.

   Translating a description written in the concise description
   language (back) to SDP would again rely on a well-defined mapping of
   encoding names:

   1) CAP names for video or audio that cannot be translated into
      registered payload type names will be translated as dynamic
      payload types with a corresponding ``a=rtpmap'' field.

   2) CAP groups with encoding names that can be mapped are translated
      into ``m='' fields with static payload types if the encoding
      parameters (sampling rate and number of channels) conform to the
      specification of a static payload type. If one of these
      parameters differs, they are translated into ``m='' fields with
      a dynamic payload type that will be defined in a subsequent
      ``a=rtpmap'' field.

   3) For other media types the encoding groups will be translated to
      ``m=application'' fields with the encoding name as the fourth
      element.

   4) The transport constraint of the CAP description will be
      reflected in the ``m='' field as well.

   5) Other = constraints with one value will be translated to ``a=''
      fields.

Appendix C: Integration into SDP

   Instead of translating a CAP specification into SDP media
   descriptions, it can be more efficient to add it directly to an SDP
   description and thus retain the original specification.
   This can be done by using dynamic payload types:

     m=audio 98

     a=rtpmap:98 X-CAP

     a={ channels = 1; encoding: g711 { compression = mulaw;
     sampling_rate = 8000 | 11025 | 16000; } || encoding: gsm {
     compression = half | full | enhanced_full; }; }

Appendix D: Mapping to H.245

   TBD.