Network Working Group                                     S. Proust, Ed.
Internet-Draft                                                    Orange
Intended status: Informational                            April 22, 2016
Expires: October 24, 2016


          Additional WebRTC audio codecs for interoperability.
             draft-ietf-rtcweb-audio-codecs-for-interop-06

Abstract

   To ensure a baseline level of interoperability between WebRTC
   endpoints, a minimum set of required codecs is specified.
   However, to maximize the possibility to establish the session
   without the need for audio transcoding, it is also recommended to
   include in the offer other suitable audio codecs that are available
   to the browser.

   This document provides some guidelines on the suitable codecs to be
   considered for WebRTC endpoints to address the most relevant
   interoperability use cases.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 24, 2016.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Definition and abbreviations
   3.
       Rationale for additional WebRTC codecs
   4.  Additional suitable codecs for WebRTC
     4.1.  AMR-WB
       4.1.1.  AMR-WB General description
       4.1.2.  WebRTC relevant use case for AMR-WB
       4.1.3.  Guidelines for AMR-WB usage and implementation with
               WebRTC
     4.2.  AMR
       4.2.1.  AMR General description
       4.2.2.  WebRTC relevant use case for AMR
       4.2.3.  Guidelines for AMR usage and implementation with
               WebRTC
     4.3.  G.722
       4.3.1.  G.722 General description
       4.3.2.  WebRTC relevant use case for G.722
       4.3.3.  Guidelines for G.722 usage and implementation
   5.  Security Considerations
   6.  IANA Considerations
   7.  Acknowledgements
   8.  References
     8.1.  Normative references
     8.2.  Informative references
   Author's Address

1.  Introduction

   As indicated in [I-D.ietf-rtcweb-overview], it has been anticipated
   that WebRTC will not remain an isolated island and that some WebRTC
   endpoints will need to communicate with devices used in other
   existing networks with the help of a gateway.
   Therefore, in order to maximize the possibility to establish the
   session without the need for audio transcoding, it is recommended in
   [I-D.ietf-rtcweb-audio] to include in the offer other suitable audio
   codecs beyond those that are mandatory to implement.  This document
   provides some guidelines on the suitable codecs to be considered for
   WebRTC endpoints to address the most relevant interoperability use
   cases.

   The codecs considered in this document are recommended to be
   supported and included in the Offer only for WebRTC endpoints for
   which interoperability with other non-WebRTC endpoints and non-WebRTC
   based services is relevant as described in Sections 4.1.2, 4.2.2,
   and 4.3.2.  Other use cases may justify offering other additional
   codecs to avoid transcoding.

2.  Definition and abbreviations

   o  Legacy networks: In this document, legacy networks encompass the
      conversational networks that are already deployed like the PSTN,
      the PLMN, the IP/IMS networks offering VoIP services, including
      3GPP "4G" Evolved Packet System [TS23.002] supporting voice over
      LTE radio access (VoLTE) [IR.92].

   o  WebRTC endpoint: a WebRTC endpoint can be a WebRTC browser or a
      WebRTC non-browser (also called "WebRTC device" or "WebRTC native
      application") as defined in [I-D.ietf-rtcweb-overview].

   o  AMR: Adaptive Multi-Rate.

   o  AMR-WB: Adaptive Multi-Rate WideBand.

   o  CAT-iq: Cordless Advanced Technology-internet and quality.

   o  DECT: Digital Enhanced Cordless Telecommunications

   o  IMS: IP Multimedia Subsystem

   o  LTE: Long Term Evolution (3GPP "4G" wireless data transmission
      standard)

   o  MOS: Mean Opinion Score, defined in ITU-T P.800 specification
      [P.800]

   o  PSTN: Public Switched Telephone Network

   o  PLMN: Public Land Mobile Network

   o  VoLTE: Voice Over LTE

3.
Rationale for additional WebRTC codecs

   The mandatory implementation of OPUS [RFC6716] in WebRTC endpoints
   can guarantee codec interoperability (without transcoding) at state
   of the art voice quality (better than narrow band "PSTN" quality)
   between WebRTC endpoints.  The WebRTC technology is also expected to
   be used to communicate with other types of endpoints using other
   technologies.  It can be used for instance as an access technology to
   VoLTE services (Voice over LTE as specified in [IR.92]) or to
   interoperate with fixed or mobile Circuit Switched or VoIP services
   like mobile Circuit Switched voice over 3GPP 2G/3G mobile networks
   [TS23.002] or DECT based VoIP telephony [EN300175-1].  Consequently,
   a significant number of calls are likely to occur between terminals
   supporting WebRTC endpoints and other terminals like mobile handsets,
   fixed VoIP terminals and DECT terminals that neither support WebRTC
   endpoints nor implement OPUS.  As a consequence, these calls are
   likely to be either of low narrow band PSTN quality using G.711
   [G.711] at both ends or affected by transcoding operations.  The
   drawbacks of such transcoding operations are listed below:

   o  Degraded user experience with respect to voice quality: voice
      quality is significantly degraded by transcoding.  For instance,
      the degradation is around 0.2 to 0.3 MOS for most transcoding use
      cases with AMR-WB codec (Section 4.1) at 12.65 kbit/s and in the
      same range for other wideband transcoding cases.  It should be
      stressed that if G.711 is used as a fallback codec for
      interoperation, wideband voice quality will be lost.  Such
      bandwidth reduction effect down to narrow band clearly degrades
      the user perceived quality of service leading to shorter and less
      frequent calls.  Such a switch to G.711 is a less than desirable
      or acceptable choice for customers.
      If transcoding is performed between OPUS and any other wideband
      codec, wideband communication could be maintained but with
      degraded quality (MOS scores of transcoding between AMR-WB 12.65
      kbit/s and OPUS at 16 kbit/s in both directions are significantly
      lower than those of AMR-WB at 12.65 kbit/s or OPUS at 16 kbit/s).
      Furthermore, in degraded conditions, the addition of defects,
      like audio artifacts due to packet losses, and the audio effects
      resulting from the cascading of different packet loss recovery
      algorithms may result in a quality below the acceptable limit for
      the customers.

   o  Degraded user experience with respect to conversational
      interactivity: the degradation of conversational interactivity is
      due to the increase of end to end latency for both directions that
      is introduced by the transcoding operations.  Transcoding requires
      full de-packetization for decoding of the media stream (including
      mechanisms of de-jitter buffering and packet loss recovery) then
      re-encoding, re-packetization and re-sending.  The delays produced
      by all these operations are additive and may increase the end to
      end delay up to 1 second, far beyond the acceptable limit.

   o  Additional cost in networks: transcoding places significant
      additional cost on network gateways mainly related to codec
      implementation, codec licenses, deployment, testing and
      validation cost.  It must be noted that transcoding of wideband to
      wideband would require more CPU processing and be more costly than
      transcoding between narrowband codecs.

4.  Additional suitable codecs for WebRTC

   The following codecs are considered as relevant codecs with respect
   to the general purpose described in Section 3.  This list reflects
   the current status of WebRTC foreseen use cases.  It is not
   exhaustive and is open to further inclusion of other codecs for which
   relevant use cases can be identified.
   These additional codecs are recommended to be included in the offer
   in addition to OPUS and G.711 according to the foreseen
   interoperability cases to be addressed.

4.1.  AMR-WB

4.1.1.  AMR-WB General description

   The Adaptive Multi-Rate WideBand (AMR-WB) is a 3GPP defined speech
   codec that is mandatory to implement in any 3GPP terminal that
   supports wideband speech communication.  It is being used in circuit
   switched mobile telephony services and new multimedia telephony
   services over IP/IMS.  It is used in particular for voice over LTE
   as specified by GSMA in [IR.92].  More detailed information on AMR-WB
   can be found in [IR.36].  References for AMR-WB related
   specifications including detailed codec description and source code
   are in [TS26.171], [TS26.173], [TS26.190], [TS26.204].

4.1.2.  WebRTC relevant use case for AMR-WB

   The market of personal voice communication is driven by mobile
   terminals.  AMR-WB is now very widely implemented in devices and
   networks offering "HD Voice".  A high number of calls are
   consequently likely to occur between WebRTC endpoints and mobile 3GPP
   terminals offering AMR-WB.  The use of AMR-WB by WebRTC endpoints
   would consequently allow transcoding-free interoperation with all
   mobile 3GPP wideband terminals.  Besides, WebRTC endpoints running on
   mobile terminals (smartphones) may reuse the AMR-WB codec already
   implemented on these devices.

4.1.3.  Guidelines for AMR-WB usage and implementation with WebRTC

   The payload format to be used for AMR-WB is described in [RFC4867]
   with bandwidth efficient format and one speech frame encapsulated in
   each RTP packet.  Further guidelines for implementing and using
   AMR-WB and ensuring interoperability with 3GPP mobile services can be
   found in [TS26.114].
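
   A minimal sketch of SDP attribute lines reflecting the payload format
   choices above is given below in Python.  The helper name and the
   dynamic payload type 96 are hypothetical.  The octet-align=0
   parameter is written out explicitly, although [RFC4867] treats an
   absent octet-align parameter as bandwidth efficient operation as
   well, and a=ptime:20 expresses one 20 ms speech frame per packet.
   The parameter settings required by [TS26.114] are not reproduced
   here.

```python
AMR_WB_RTP_CLOCK_RATE_HZ = 16000  # RTP clock rate for AMR-WB (RFC 4867)
AMR_WB_FRAME_DURATION_MS = 20     # one AMR-WB speech frame covers 20 ms


def amr_wb_sdp_lines(payload_type=96):
    """Build illustrative SDP attribute lines for AMR-WB.

    payload_type is a hypothetical dynamic payload type chosen for
    this example; any value from the dynamic range could be used.
    """
    return [
        # Encoding name, clock rate and a single audio channel.
        f"a=rtpmap:{payload_type} AMR-WB/{AMR_WB_RTP_CLOCK_RATE_HZ}/1",
        # octet-align=0 selects the bandwidth efficient payload format.
        f"a=fmtp:{payload_type} octet-align=0",
        # One speech frame per RTP packet.
        f"a=ptime:{AMR_WB_FRAME_DURATION_MS}",
    ]


if __name__ == "__main__":
    for line in amr_wb_sdp_lines():
        print(line)
```

   Listing the mode-set parameter is deliberately avoided in this
   sketch, so that no restriction on codec modes is implied.
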
   In order to ensure interoperability with 4G/VoLTE as specified by
   GSMA, the more specific IMS profile for voice in [IR.92], which is
   derived from [TS26.114], should be considered.  In order to maximize
   the possibility of successful call establishment for WebRTC endpoints
   offering AMR-WB, it is important that the WebRTC endpoints:

   o  Offer AMR in addition to AMR-WB with AMR-WB listed first (AMR-WB
      being a wideband codec) as preferred payload type with respect to
      other narrow band codecs (AMR, G.711...) and with Bandwidth
      Efficient payload format preferred.

   o  Be capable of operating AMR-WB with any subset of the nine codec
      modes and source controlled rate operation.  Offer at least one
      AMR-WB configuration with parameter settings as defined in
      Table 6.1 of [TS26.114].  In order to maximize the
      interoperability and quality this offer does not restrict the
      codec modes offered.  Restrictions in the use of codec modes may
      be included in the answer.

4.2.  AMR

4.2.1.  AMR General description

   Adaptive Multi-Rate (AMR) is a 3GPP defined speech codec that is
   mandatory to implement in any 3GPP terminal that supports voice
   communication.  This includes both mobile phone calls using GSM and
   3G cellular systems as well as multimedia telephony services over
   IP/IMS and 4G/VoLTE, such as GSMA voice IMS profile for VoLTE in
   [IR.92].  In addition to impacts listed above, support of AMR can
   avoid degrading the high efficiency over mobile radio access.
   References for AMR related specifications including detailed codec
   description and source code are in [TS26.071], [TS26.073],
   [TS26.090], [TS26.104].

4.2.2.  WebRTC relevant use case for AMR

   A user of a WebRTC endpoint on a device integrating an AMR module
   wants to communicate with another user that can only be reached on a
   mobile device that only supports AMR.
   Although more and more terminal devices are now "HD voice" and
   support AMR-WB, a high number of legacy terminals supporting only AMR
   (terminals with no wideband / HD Voice capabilities) are still in
   use.  The use of AMR by WebRTC endpoints would consequently allow
   transcoding-free interoperation with all mobile 3GPP terminals.
   Besides, WebRTC endpoints running on mobile terminals (smartphones)
   may reuse the AMR codec already implemented on these devices.

4.2.3.  Guidelines for AMR usage and implementation with WebRTC

   The payload format to be used for AMR is described in [RFC4867] with
   bandwidth efficient format and one speech frame encapsulated in each
   RTP packet.  Further guidelines for implementing and using AMR in
   order to ensure interoperability with 3GPP mobile services can be
   found in [TS26.114].  In order to ensure interoperability with
   4G/VoLTE as specified by GSMA, the more specific IMS profile for
   voice in [IR.92], which is derived from [TS26.114], should be
   considered.  In order to maximize the possibility of successful call
   establishment for WebRTC endpoints offering AMR, it is important that
   the WebRTC endpoints:

   o  Be capable of operating AMR with any subset of the eight codec
      modes and source controlled rate operation.

   o  Offer at least one configuration with parameter settings as
      defined in Table 6.1 and Table 6.2 of [TS26.114].  In order to
      maximize the interoperability and quality this offer shall not
      restrict AMR codec modes offered.  Restrictions in the use of
      codec modes may be included in the answer.

4.3.  G.722

4.3.1.  G.722 General description

   G.722 [G.722] is an ITU-T defined wideband speech codec.  G.722 was
   approved by ITU-T in 1988.  It is a royalty free codec that is common
   in a wide range of terminals and endpoints supporting wideband speech
   and requiring low complexity.
   The complexity of G.722 is estimated at 10 MIPS [EN300175-8], which
   is 2.5 to 3 times lower than AMR-WB.  In particular, G.722 has been
   chosen by ETSI DECT as the mandatory wideband codec for New
   Generation DECT in order to greatly increase the voice quality by
   extending the bandwidth from narrow band to wideband.  G.722 is the
   wideband codec required for CAT-iq DECT certified terminals and V2.0
   of the CAT-iq specifications has been approved by GSMA as minimum
   requirements for HD voice logo usage on "fixed" devices; i.e.,
   broadband connections using the G.722 codec.

4.3.2.  WebRTC relevant use case for G.722

   G.722 is the wideband codec required for DECT CAT-iq terminals.  DECT
   cordless phones are still widely used to offer short range wireless
   connection to PSTN or VoIP services.  G.722 has also been specified
   by ETSI in [TS181005] as mandatory wideband codec for IMS multimedia
   telephony communication service and supplementary services using
   fixed broadband access.  The support of G.722 would consequently
   allow transcoding-free IP interoperation between WebRTC endpoints and
   fixed VoIP terminals including DECT / CAT-iq terminals supporting
   G.722.  Besides, WebRTC endpoints running on fixed terminals
   implementing G.722 may reuse the G.722 codec already implemented on
   these devices.

4.3.3.  Guidelines for G.722 usage and implementation

   The payload format to be used for G.722 is defined in [RFC3551] with
   each octet of the stream of octets produced by the codec to be
   octet-aligned in an RTP packet.  The sampling frequency for G.722 is
   16 kHz but the RTP clock rate is set to 8000 Hz in SDP to stay
   backward compatible with an erroneous definition in the original
   version of the RTP A/V profile.
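
   As a rough illustration of the clock rate convention described
   above, the following Python sketch computes, for a given
   packetization time, the number of audio samples carried, the RTP
   timestamp increment, and the payload size.  The function names are
   choices made for this example only, a 64 kbit/s G.722 stream is
   assumed, and static payload type 9 is the one assigned to G722 in
   [RFC3551].

```python
G722_SAMPLING_RATE_HZ = 16000   # actual audio sampling rate of G.722
G722_RTP_CLOCK_RATE_HZ = 8000   # clock rate signalled in SDP (RFC 3551)
G722_BITRATE_BPS = 64000        # assumption: G.722 operated at 64 kbit/s


def g722_packet_sizes(ptime_ms):
    """Return (audio_samples, rtp_timestamp_increment, payload_octets)
    for one RTP packet carrying ptime_ms milliseconds of G.722 audio."""
    audio_samples = G722_SAMPLING_RATE_HZ * ptime_ms // 1000
    # The timestamp advances at the signalled 8000 Hz clock rate,
    # i.e. half as fast as the number of audio samples.
    timestamp_increment = G722_RTP_CLOCK_RATE_HZ * ptime_ms // 1000
    payload_octets = G722_BITRATE_BPS * ptime_ms // (8 * 1000)
    return audio_samples, timestamp_increment, payload_octets


def g722_rtpmap(payload_type=9):
    """Build the rtpmap attribute line with the 8000 Hz clock rate."""
    return f"a=rtpmap:{payload_type} G722/{G722_RTP_CLOCK_RATE_HZ}"


if __name__ == "__main__":
    print(g722_rtpmap())
    print(g722_packet_sizes(20))
```

   For a 20 ms packet, the sketch yields 320 audio samples and 160
   payload octets, but a timestamp increment of only 160.
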
   Further guidelines for implementing and using G.722 in order to
   ensure interoperability with multimedia telephony services over IMS
   can be found in Section 7 of [TS26.114].  Additional information on
   G.722 implementation in DECT can be found in [EN300175-8] and full
   codec description and C source code in [G.722].

5.  Security Considerations

   Security considerations for WebRTC Audio Codec and Processing
   Requirements can be found in [I-D.ietf-rtcweb-audio].  Implementors
   making use of the additional codecs considered in this document are
   advised to also refer more specifically to the "Security
   Considerations" sections of [RFC4867] (for AMR and AMR-WB) and
   [RFC3551].

6.  IANA Considerations

   None.

7.  Acknowledgements

   The authors of this document are

   o  Stephane Proust, Orange, stephane.proust@orange.com,

   o  Espen Berger, Cisco, espeberg@cisco.com,

   o  Bernhard Feiten, Deutsche Telekom, Bernhard.Feiten@telekom.de,

   o  Bo Burman, Ericsson, bo.burman@ericsson.com,

   o  Kalyani Bogineni, Verizon Wireless,
      Kalyani.Bogineni@VerizonWireless.com,

   o  Mia Lei, Huawei, lei.miao@huawei.com,

   o  Enrico Marocco, Telecom Italia, enrico.marocco@telecomitalia.it,

   though only the editor is listed on the front page.

   The authors would like to thank Magnus Westerlund, Barry Dingle and
   Sanjay Mishra who carefully reviewed the document and helped to
   improve it.

8.  References

8.1.  Normative references

   [G.722]    ITU, "Recommendation ITU-T G.722 (2012): 7 kHz
              audio-coding within 64 kbit/s", 2012-09.

   [I-D.ietf-rtcweb-audio]
              Valin, J. and C. Bran, "WebRTC Audio Codec and Processing
              Requirements", draft-ietf-rtcweb-audio-10 (work in
              progress), February 2016.

   [IR.92]    GSMA, "IMS Profile for Voice and SMS V9.0", April 2015.

   [RFC3551]  Schulzrinne, H. and S.
              Casner, "RTP Profile for Audio and Video Conferences with
              Minimal Control", STD 65, RFC 3551,
              DOI 10.17487/RFC3551, July 2003.

   [RFC4867]  Sjoberg, J., Westerlund, M., Lakaniemi, A., and Q. Xie,
              "RTP Payload Format and File Storage Format for the
              Adaptive Multi-Rate (AMR) and Adaptive Multi-Rate Wideband
              (AMR-WB) Audio Codecs", RFC 4867, DOI 10.17487/RFC4867,
              April 2007.

   [TS26.071]
              3GPP, "3GPP TS 26.071 v13.0.0: Mandatory Speech Codec
              speech processing functions; AMR Speech CODEC; General
              description", 2015-12.

   [TS26.073]
              3GPP, "3GPP TS 26.073 v13.0.0: ANSI C code for the
              Adaptive Multi Rate (AMR) speech codec", 2015-12.

   [TS26.090]
              3GPP, "3GPP TS 26.090 v13.0.0: Mandatory Speech Codec
              speech processing functions; Adaptive Multi-Rate (AMR)
              speech codec; Transcoding functions", 2015-12.

   [TS26.104]
              3GPP, "3GPP TS 26.104 v13.0.0: ANSI C code for the
              floating-point Adaptive Multi Rate (AMR) speech codec",
              2015-12.

   [TS26.114]
              3GPP, "3GPP TS 26.114 v13.3.0: IP Multimedia Subsystem
              (IMS); Multimedia telephony; Media handling and
              interaction", March 2016.

   [TS26.171]
              3GPP, "3GPP TS 26.171 v13.0.0: Speech codec speech
              processing functions; Adaptive Multi-Rate - Wideband
              (AMR-WB) speech codec; General description", 2015-12.

   [TS26.173]
              3GPP, "3GPP TS 26.173 v13.1.0: ANSI-C code for the
              Adaptive Multi-Rate - Wideband (AMR-WB) speech codec",
              2016-03.

   [TS26.190]
              3GPP, "3GPP TS 26.190 v13.0.0: Speech codec speech
              processing functions; Adaptive Multi-Rate - Wideband
              (AMR-WB) speech codec; Transcoding functions", 2015-12.

   [TS26.204]
              3GPP, "3GPP TS 26.204 v13.1.0: Speech codec speech
              processing functions; Adaptive Multi-Rate - Wideband
              (AMR-WB) speech codec; ANSI-C code", 2016-03.

8.2.
Informative references

   [EN300175-1]
              ETSI, "ETSI EN 300 175-1, v2.6.1: Digital Enhanced
              Cordless Telecommunications (DECT); Common Interface (CI);
              Part 1: Overview", 2015.

   [EN300175-8]
              ETSI, "ETSI EN 300 175-8, v2.6.1: Digital Enhanced
              Cordless Telecommunications (DECT); Common Interface (CI);
              Part 8: Speech and audio coding and transmission", 2015.

   [G.711]    ITU, "Recommendation ITU-T G.711 (1988): Pulse code
              modulation (PCM) of voice frequencies", 1988-11.

   [I-D.ietf-rtcweb-overview]
              Alvestrand, H., "Overview: Real Time Protocols for
              Browser-based Applications", draft-ietf-rtcweb-overview-15
              (work in progress), January 2016.

   [IR.36]    GSMA, "Adaptive Multirate Wide Band V3.0", September 2014.

   [P.800]    ITU, "ITU-T P.800: Methods for objective and subjective
              assessment of quality", 1996-08.

   [RFC6716]  Valin, JM., Vos, K., and T. Terriberry, "Definition of the
              Opus Audio Codec", RFC 6716, DOI 10.17487/RFC6716,
              September 2012.

   [TS181005]
              ETSI, "Telecommunications and Internet converged Services
              and Protocols for Advanced Networking (TISPAN); Service
              and Capability Requirements V3.3.1 (2009-12)", 2009.

   [TS23.002]
              3GPP, "3GPP TS 23.002 v13.5.0: Network architecture",
              2016-03.

Author's Address

   Stephane Proust (editor)
   Orange
   2, avenue Pierre Marzin
   Lannion  22307
   France

   Email: stephane.proust@orange.com