INTERNET-DRAFT                                        Katsushi Kobayashi
draft-kobayashi-dv-video-00.txt        Communication Research Laboratory
                                                          Akimichi Ogawa
                                                         Keio University
                                                          Stephen Casner
                                                           Cisco Systems
                                                         Carsten Bormann
                                                 Universitaet Bremen TZI
                                                       February 25, 1999
                                                     Expires August 1999

                 RTP Payload Format for DV Format Video

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

1. Abstract

   This document specifies the packetization scheme for encapsulating
   the digital video data streams defined by the HD Digital VCR
   Conference, commonly known as "DV", into a payload format for the
   Real-Time Transport Protocol (RTP). The RTP payload format specified
   in this document supports three quality levels of digital video
   identified as SD-VCR, HD-VCR and SDL-VCR.

2. Introduction

   The HD Digital VCR Conference has published a digital video
   specification set entitled "Specification of Consumer-Use Digital
   VCRs using 6.3mm magnetic tape" [1,2]. The specification set
   consists of two subset specifications, the first of which is
   "Specification of Consumer-Use Digital VCRs".
   That subset comprises the whole specification for consumer-use
   digital video including mechanical specifications of a cassette,
   helical magnetic recording format, error correction in the magnetic
   tape, DCT video encoding format, and audio encoding format. The
   digital video format defined by that specification is commonly known
   as "DV" format.

   The second subset is "Specification of Digital Interface for Consumer
   Electronic Audio/Video Equipment" (abbreviated hereafter as the
   Digital Interface). That subset defines the communication protocol
   for carrying DV video and audio over the IEEE 1394 high performance
   serial bus [3]. The IEEE 1394 bus may be used to interconnect
   digital video cameras, digital VCRs, computers and other devices.

   This document specifies the RTP payload format for encapsulating the
   DV format data streams obtained via the Digital Interface into the
   Real-Time Transport Protocol (RTP), version 2 [4].

   The HD Digital VCR Conference specification set supports several
   video formats: SD-VCR (including 525/60, 625/50), HD-VCR (1125/60,
   1250/50), SDL-VCR (525/60, 625/50), PALplus, DVB (Digital Video
   Broadcast) and ATV (Advanced Television). However, the Digital
   Interface specifies the IEEE 1394 communication protocol for only a
   subset of these video formats. The RTP payload format defined here
   covers only those video formats that are included in the Digital
   Interface.

   Furthermore, some formats defined by the HD Digital VCR Conference,
   e.g. DVB and ATV, are based on MPEG2. The payload format for
   encapsulating MPEG2 into RTP has already been defined in RFC 2250.
   That payload format is more suitable for transmission of MPEG2 over
   the Internet than would be a packetization of MPEG2 first into the
   IEEE 1394 protocol and then into RTP. Therefore, packetization of DV
   formats based on MPEG2 is outside the scope of this document.

   Consequently, the payload format specified in this document will
   support the original six video formats of the HD Digital VCR
   Conference: SD-VCR (525/60, 625/50), HD-VCR (1125/60, 1250/50) and
   SDL-VCR (525/60, 625/50).

   The HD Digital VCR Conference is also standardizing an audio and
   video device control protocol, that is, a command set for video
   equipment operation and status queries to video devices. This
   document does not address these control functions.

   Throughout this specification, we make extensive use of the VCR
   Conference terminology. The reader should consult the Digital
   Interface references for definitions of these terms.

3. DV format encoding

   The DV format is designed for magnetic tape applications and is
   optimized for helical magnetic recording on tape media. All video
   data including audio and other system data are managed within the
   picture frame unit of video.

   The video encoding consists of a three-level hierarchical structure.
   A picture frame is divided into rectangle- or
   clipped-rectangle-shaped DCT super blocks. DCT super blocks are
   divided into 27 rectangle- or square-shaped DCT macro blocks. The
   DCT macro block consists of 6 square 8x8 DCT blocks, four of which
   represent the Y picture component and the remaining two represent
   Cr and Cb.

   Audio is encoded with sampled data. Its frequency is 32 kHz, 44.1
   kHz or 48 kHz, its quantization is 16-bit linear or 12-bit
   non-linear, and the number of channels may range from 2 to 8. Only
   certain combinations of these parameters are allowed depending upon
   the video format, as specified in [1].

   A frame of data in the DV format stream is divided into several "DIF
   sequences". A DIF sequence is composed of an integral number of
   fixed length (80-byte) DIF blocks. Each DIF block contains a 3-byte
   ID header that specifies the type of the DIF block and its position
   in the DIF sequence.
   Five types of DIF blocks are defined: DIF sequence header, Subcode,
   Video Auxiliary information (VAUX), Audio data and Video data.

3.1 Transmission of DV format over IEEE 1394

   The specification of the Digital Interface defines a transport
   protocol for transmission of video stream data in the isochronous
   stream mode of IEEE 1394 called "real time data transmission
   protocol". The protocol defines the general Common Isochronous
   Packet (CIP) header that does not depend on the encoding format of
   the payload. Several real time transmission encodings have been
   defined on CIP, including MPEG2 and MIDI in addition to DV format
   [1,2].

   A DIF block is the basic unit for all transmission over IEEE 1394.
   Each IEEE 1394 isochronous stream packet is composed of an integral
   number of DIF blocks, assembled without regard to DIF sequence
   boundaries, up to the limit of the MTU for IEEE 1394.

4. Usage of RTP

   Each RTP packet starts with the RTP header as defined in RFC 1889
   [4]. No additional payload-format-specific header is required for
   this payload format.

4.1 RTP header usage

   The meaning of RTP header fields that are specific to the DV format
   is described in the following:

   Payload type (PT): The payload type is dynamically assigned by means
   outside the scope of this document. Details of the encoding format,
   such as audio sampling rate and video scan rate, are given in the
   AAUX and VAUX data embedded in the data stream. However, the same
   information SHOULD be provided as part of the dynamic payload type
   assignment. If multiple encoding formats are to be used within one
   RTP session, then multiple dynamic payload types MUST be assigned,
   one for each encoding format. The sender MUST change to the
   corresponding payload type whenever the encoding format is changed.
   The sender MUST NOT expect to notify the receiver of an encoding
   format change with the information included in AAUX or VAUX because
   the packet carrying this information might be dropped and would not
   be available to the receiver until the next AAUX or VAUX packet is
   received.

   Timestamp: 32-bit 90 kHz timestamp representing the time at which the
   first data in the frame was sampled. All RTP packets within the same
   video frame MUST have the same timestamp. The timestamp SHOULD
   increment by a multiple of the nominal interval for one frame time,
   as given in the following table:

         Mode         Frame rate (Hz)    Increase of one frame
                                         in 90 kHz timestamp

         525-60       29.97              3003
         625-50       25                 3600
         1125-60      30                 3000
         1250-50      25                 3600

   The progress of video frame times MAY be monitored using the SYT
   timestamp carried in the CIP header, as described in Appendix A.

   Marker bit (M): The marker bit of the RTP fixed header is set to one
   on the last packet of a video frame, and must be zero otherwise.
   The M bit allows the receiver to know that it has received the last
   packet of a frame so it can display the image without waiting for the
   first packet of the next frame to arrive to detect the frame change.
   However, detection of a frame change MUST NOT rely on the marker bit
   since the last packet of the frame might be lost. Detection of a
   frame change MUST be done by differences in RTP timestamp.

4.2 DV data encapsulation into RTP payload

   All of the information in the IEEE 1394 CIP header is either implicit
   in the RTP payload format or supplanted by information in the RTP
   header, so the CIP header is not required. For this payload format,
   the CIP header MUST be removed from the IEEE 1394 packet, leaving
   just a sequence of DIF blocks. Integral DIF blocks are placed into
   the RTP payload beginning immediately after the RTP header.
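
   The CIP removal step can be illustrated with a minimal sketch. It
   assumes the input is the data field of one IEEE 1394 isochronous
   packet beginning with the 8-byte CIP header described in Appendix A;
   the function name and the error handling are hypothetical and are not
   part of this specification.

```python
from typing import List

CIP_HEADER_LEN = 8   # CIP header size as described in Appendix A
DIF_BLOCK_LEN = 80   # fixed DIF block length from Section 3


def strip_cip_header(iso_data: bytes) -> List[bytes]:
    """Drop the CIP header and return the DIF blocks that follow it.

    Assumption: iso_data is the data field of a single isochronous
    packet, starting with the CIP header (no IEEE 1394 packet header
    or CRC included).
    """
    if len(iso_data) < CIP_HEADER_LEN:
        raise ValueError("packet is shorter than a CIP header")
    body = iso_data[CIP_HEADER_LEN:]
    if len(body) % DIF_BLOCK_LEN != 0:
        raise ValueError("data after the CIP header is not a whole "
                         "number of DIF blocks")
    # Slice the remaining bytes into 80-byte DIF blocks.
    return [body[i:i + DIF_BLOCK_LEN]
            for i in range(0, len(body), DIF_BLOCK_LEN)]
```

   A packet that carries only a CIP header yields an empty list, so the
   caller can simply skip it.
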
   DIF blocks carried by different IEEE 1394 packets may be packed into
   one RTP packet, except that all DIF blocks in one RTP packet must be
   from the same video frame. DIF blocks from the next video frame MUST
   NOT be packed into the same RTP packet even if there is more payload
   space remaining. This requirement stems from the fact that the
   transition from one video frame to the next is indicated by a change
   in the RTP timestamp. It also reduces the processing complexity at
   the receiver.

   Since the RTP payload contains an integral number of DIF blocks, the
   length of the RTP payload will be a multiple of 80 bytes.

   Audio and video data may be transmitted as one bundled RTP stream or
   in separate RTP streams. The choice MUST be indicated as part of the
   assignment of the dynamic payload type and MUST remain unchanged for
   the duration of the RTP session to avoid complicated procedures of
   sequence number synchronization.

   In the case of one bundled stream, DIF blocks for both audio and
   video are packed into RTP packets in the same order as they were
   generated.

   When audio and video are sent in separate RTP streams, or when only
   one medium is sent, then only the DIF blocks corresponding to the
   selected medium are included. If VAUX DIF blocks are included, they
   MUST only be sent in the video stream.

   When sending a separate audio stream in the 16-bit encoding, it is
   RECOMMENDED that the audio stream data be extracted from the DIF
   blocks and repackaged in the L16 payload format defined in RFC 1890
   [5] in order to maximize interoperability with non-DV-capable
   receivers.

   When sending separate video and audio streams with both in DV format,
   the same timestamp SHOULD be used for both audio and video data
   within the same frame in order to simplify lip synchronization at the
   receiver.
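
   The packing rules of this section, together with the timestamp and
   marker bit rules of Section 4.1, can be sketched as follows. This is
   an illustration only: the RtpPacketPlan type, the function names and
   the 1200-byte payload limit are assumptions made for the example, and
   no RTP header is actually built.

```python
from dataclasses import dataclass
from typing import List

DIF_BLOCK_LEN = 80

# Per-frame increase of the 90 kHz timestamp, from the table in 4.1.
TS_INCREMENT = {"525-60": 3003, "625-50": 3600,
                "1125-60": 3000, "1250-50": 3600}


@dataclass
class RtpPacketPlan:
    timestamp: int   # same value for every packet of one frame
    marker: bool     # True only on the last packet of the frame
    payload: bytes   # whole DIF blocks, so a multiple of 80 bytes


def packetize_frame(dif_blocks: List[bytes], timestamp: int,
                    max_payload: int = 1200) -> List[RtpPacketPlan]:
    """Split the DIF blocks of ONE video frame into RTP payloads."""
    per_packet = max_payload // DIF_BLOCK_LEN
    if per_packet < 1:
        raise ValueError("max_payload cannot hold one DIF block")
    if any(len(block) != DIF_BLOCK_LEN for block in dif_blocks):
        raise ValueError("every DIF block must be 80 bytes")
    packets = []
    for start in range(0, len(dif_blocks), per_packet):
        chunk = dif_blocks[start:start + per_packet]
        is_last = start + per_packet >= len(dif_blocks)
        packets.append(RtpPacketPlan(timestamp & 0xFFFFFFFF, is_last,
                                     b"".join(chunk)))
    return packets


def next_timestamp(timestamp: int, mode: str, frames: int = 1) -> int:
    """Advance the 32-bit timestamp by a whole number of frame times."""
    return (timestamp + frames * TS_INCREMENT[mode]) & 0xFFFFFFFF
```

   Because packetize_frame is called once per frame, blocks from two
   frames never share a packet. A sender that discards a frame would
   call next_timestamp with frames=2 so that the timestamp still
   accounts for the skipped frame.
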
   Lip synchronization may also be achieved using reference timestamps
   passed in RTCP as described in [4].

   The sender MAY send null AAUX information and omit VAUX DIF blocks if
   the VAUX/AAUX information remains constant during the session.
   However, the VAUX/AAUX information in the DV stream includes source
   encoding parameters, such as video display aspect ratio, audio
   quantization and number of audio channels, which are required to
   decode the stream. Therefore, if VAUX/AAUX information is not
   transmitted in the stream, the equivalent parameters essential to
   playout MUST be provided by some out of band means beyond the scope
   of this document.

   The receiver MUST be able to process a data stream with null AAUX
   information and null or omitted VAUX DIF blocks if the equivalent
   parameters are provided out of band. Therefore, if the RTP receiver
   is feeding the DV stream to a device that requires AAUX information
   and VAUX DIF blocks, the receiver MUST be able to generate AAUX
   within audio DIF blocks and VAUX DIF blocks for the device using the
   parameters provided out of band.

   The sender MAY reduce the video frame rate by discarding the video
   data and VAUX DIF blocks for some of the video frames. The RTP
   timestamp must still be incremented to account for the discarded
   frames. The sender MAY alternatively reduce bandwidth by discarding
   video data DIF blocks for portions of the image which are unchanged
   from the previous image. To enable this bandwidth reduction,
   receivers SHOULD implement an error concealment strategy to
   accommodate lost or missing DIF blocks by repeating the corresponding
   DIF block from the previous image.

5. Security Considerations

   RTP packets using the payload format defined in this specification
   are subject to the security considerations discussed in the RTP
   specification [4], and any appropriate RTP profile.
   This implies that confidentiality of the media streams is achieved
   by encryption. Because the data compression used with this payload
   format is applied end-to-end, encryption may be performed after
   compression so there is no conflict between the two operations.

   A potential denial-of-service threat exists for data encodings using
   compression techniques that have non-uniform receiver-end
   computational load. The attacker can inject pathological datagrams
   into the stream which are complex to decode and cause the receiver to
   be overloaded. However, this encoding does not exhibit any
   significant non-uniformity.

   As with any IP-based protocol, in some circumstances a receiver may
   be overloaded simply by the receipt of too many packets, either
   desired or undesired. Network-layer authentication may be used to
   discard packets from undesired sources, but the processing cost of
   the authentication itself may be too high. In a multicast
   environment, pruning of specific sources may be implemented in future
   versions of IGMP [6] and in multicast routing protocols to allow a
   receiver to select which sources are allowed to reach it.

6. Full Copyright Statement

   Copyright (C) The Internet Society (1999). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works.

   However, this document itself may not be modified in any way, such as
   by removing the copyright notice or references to the Internet
   Society or other Internet organizations, except as needed for the
   purpose of developing Internet standards in which case the procedures
   for copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

7. Authors' Addresses

   Katsushi Kobayashi
   Communication Research Laboratory
   4-2-1 Nukii-kita machi, Koganei
   Tokyo 184-8795
   JAPAN
   EMail: ikob@koganei.wide.ad.jp

   Akimichi Ogawa
   Keio University
   5322 Endo, Fujisawa
   Kanagawa 252
   JAPAN
   EMail: akimichi@sfc.wide.ad.jp

   Stephen L. Casner
   Cisco Systems, Inc.
   170 West Tasman Drive
   San Jose, CA 95134-1706
   United States
   EMail: casner@cisco.com

   Carsten Bormann
   Universitaet Bremen FB3 TZI
   Postfach 330440
   D-28334 Bremen, GERMANY
   Phone: +49.421.218-7024
   Fax: +49.421.218-7000
   EMail: cabo@tzi.org

8. Bibliography

   [1] IEC 61834, Helical-scan digital video cassette recording system
       using 6,35 mm magnetic tape for consumer use (525-60, 625-50,
       1125-60 and 1250-50 systems)

   [2] IEC 61883, Consumer audio/video equipment - Digital interface

   [3] IEEE Std 1394-1995, Standard for a High Performance Serial Bus

   [4] H. Schulzrinne, S. Casner, R.
       Frederick, and V. Jacobson. RTP: A transport protocol for
       real-time applications. IETF Audio/Video Transport Working
       Group, January 1996. RFC 1889.

   [5] Schulzrinne, H., "RTP Profile for Audio and Video Conferences
       with Minimal Control", RFC 1890, January 1996.

   [6] Deering, S., "Host Extensions for IP Multicasting", STD 5,
       RFC 1112, August 1989.

Appendix A.

   In the Digital Interface specification, two types of 8-byte CIP
   headers are defined, one type including the SYT field, and the other
   without the SYT field. The SYT field is a 16-bit timestamp copied
   from the lower 16 bits of the CYCLE_TIME register defined in IEEE
   1394. The CYCLE_TIME register is incremented by a 24.576 MHz clock,
   but the lower 12 bits count to a maximum of 3071 before wrapping
   around to zero and adding a carry to the high 4 bits. Therefore, the
   SYT timestamp is not linear.

   If the encoding format requires synchronization between devices, it
   should adopt the CIP header with SYT. The DV format selects the CIP
   header type including the SYT field, but only requires that the SYT
   field contain a valid timestamp for one CIP header in every video
   frame period. In the remaining CIP headers, the SYT field may contain
   the special "no information" value (all ones).
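
   Because the low 12 bits of the SYT field wrap at 3072 rather than at
   4096, a receiver that wants to compare two SYT values first has to
   convert them to a linear tick count. The following is a minimal
   sketch of that conversion under the field layout described above;
   the function and constant names are hypothetical.

```python
from typing import Optional

CYCLE_OFFSET_MODULUS = 3072                  # low 12 bits count 0..3071
SYT_NO_INFORMATION = 0xFFFF                  # all ones: no timestamp
SYT_SPAN_TICKS = 16 * CYCLE_OFFSET_MODULUS   # range of the 16-bit field


def syt_to_ticks(syt: int) -> Optional[int]:
    """Convert a SYT value to linear 24.576 MHz ticks, or None."""
    if not 0 <= syt <= 0xFFFF:
        raise ValueError("SYT is a 16-bit field")
    if syt == SYT_NO_INFORMATION:
        return None
    cycles = syt >> 12        # high 4 bits, incremented by the carry
    offset = syt & 0x0FFF     # low 12 bits, valid range 0..3071
    if offset >= CYCLE_OFFSET_MODULUS:
        raise ValueError("offset field exceeds 3071")
    return cycles * CYCLE_OFFSET_MODULUS + offset


def syt_delta_ticks(earlier: int, later: int) -> int:
    """Ticks elapsed between two valid SYT values, modulo the span."""
    a, b = syt_to_ticks(earlier), syt_to_ticks(later)
    if a is None or b is None:
        raise ValueError("both SYT values must carry a timestamp")
    return (b - a) % SYT_SPAN_TICKS
```

   The difference is only meaningful modulo the 49152-tick span of the
   field, which is far shorter than one video frame period, so a
   receiver that tracks frame times this way would also need some other
   means of counting wraps.
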