idnits 2.17.00 (12 Aug 2021)

/tmp/idnits54586/draft-ietf-json-rfc4627bis-05.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The draft header indicates that this document obsoletes RFC4627, but the
     abstract doesn't seem to mention this, which it should.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (October 08, 2013) is 3146 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: '116' on line 525

  -- Looks like a reference, but probably isn't: '943' on line 525

  -- Looks like a reference, but probably isn't: '234' on line 525

  -- Looks like a reference, but probably isn't: '38793' on line 525

  -- Possible downref: Non-RFC (?) normative reference: ref. 'IEEE754'

  ** Obsolete normative reference: RFC 4234 (Obsoleted by RFC 5234)

  -- Possible downref: Non-RFC (?) normative reference: ref. 'UNICODE'

     Summary: 1 error (**), 0 flaws (~~), 1 warning (==), 8 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------

2  JSON Working Group                                          T. Bray, Ed.
3  Internet-Draft                                              Google, Inc.
4  Obsoletes: 4627 (if approved)                           October 08, 2013
5  Intended status: Standards Track
6  Expires: April 11, 2014

8                     The JSON Data Interchange Format
9                      draft-ietf-json-rfc4627bis-05

11  Abstract

13     JavaScript Object Notation (JSON) is a lightweight, text-based,
14     language-independent data interchange format. It was derived from
15     the ECMAScript Programming Language Standard. JSON defines a small
16     set of formatting rules for the portable representation of structured
17     data.

19  Status of This Memo

21     This Internet-Draft is submitted in full conformance with the
22     provisions of BCP 78 and BCP 79.

24     Internet-Drafts are working documents of the Internet Engineering
25     Task Force (IETF). Note that other groups may also distribute
26     working documents as Internet-Drafts. The list of current Internet-
27     Drafts is at http://datatracker.ietf.org/drafts/current/.

29     Internet-Drafts are draft documents valid for a maximum of six months
30     and may be updated, replaced, or obsoleted by other documents at any
31     time. It is inappropriate to use Internet-Drafts as reference
32     material or to cite them other than as "work in progress."

34     This Internet-Draft will expire on April 11, 2014.

36  Copyright Notice

38     Copyright (c) 2013 IETF Trust and the persons identified as the
39     document authors. All rights reserved.

41     This document is subject to BCP 78 and the IETF Trust's Legal
42     Provisions Relating to IETF Documents
43     (http://trustee.ietf.org/license-info) in effect on the date of
44     publication of this document. Please review these documents
45     carefully, as they describe your rights and restrictions with respect
46     to this document. Code Components extracted from this document must
47     include Simplified BSD License text as described in Section 4.e of
48     the Trust Legal Provisions and are provided without warranty as
49     described in the Simplified BSD License.

51  Table of Contents

53     1. Introduction
54       1.1. Conventions Used in This Document
55       1.2. Specifications of JSON
56       1.3. Introduction to This Revision
57       1.4. Changes from RFC 4627
58     2. JSON Grammar
59     3. Values
60     4. Objects
61     5. Arrays
62     6. Numbers
63     7. Strings
64     8. String and Character Issues
65       8.1. Encoding and Detection
66       8.2. Unicode Characters
67       8.3. String Comparison
68     9. Parsers
69     10. Generators
70     11. IANA Considerations
71     12. Security Considerations
72     13. Examples
73     14. Contributors
74     15. References
75       15.1. Normative References
76       15.2. Informative References
77     Appendix A. Changes in -04
78     Appendix B. Changes in -05
79     Author's Address

81  1. Introduction

83     JavaScript Object Notation (JSON) is a text format for the
84     serialization of structured data. It is derived from the object
85     literals of JavaScript, as defined in the ECMAScript Programming
86     Language Standard, Third Edition [ECMA-262].

88     JSON can represent four primitive types (strings, numbers, booleans,
89     and null) and two structured types (objects and arrays).

91     A string is a sequence of zero or more Unicode characters [UNICODE].

93     An object is an unordered collection of zero or more name/value
94     pairs, where a name is a string and a value is a string, number,
95     boolean, null, object, or array.

97     An array is an ordered sequence of zero or more values.

99     The terms "object" and "array" come from the conventions of
100     JavaScript.

102     JSON's design goals were for it to be minimal, portable, textual, and
103     a subset of JavaScript.

105  1.1. Conventions Used in This Document

107     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
108     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
109     document are to be interpreted as described in [RFC2119].

111     The grammatical rules in this document are to be interpreted as
112     described in [RFC4234].

114  1.2. Specifications of JSON

116     A description of JSON in ECMAScript terms first appeared in version
117     5.1 of the ECMAScript specification [ECMA-262], section 15.12. It
118     includes a description of the differences between JSON as described
119     in that specification and in RFC4627. The most significant is that
120     ECMAScript 5.1 does not require a JSON Text to be an Array or an
121     Object; thus, for example, "Hello world!", "42", and "true" would all
122     be valid JSON texts in the ECMAScript 5.1 context.

124     JSON is also described in [ECMA-404].

126     The specifications of JSON syntax all agree on the syntax of the
127     language.

129  1.3. Introduction to This Revision

131     In the years since the publication of RFC 4627, JSON has found very
132     wide use. This experience has revealed certain patterns which, while
133     allowed by its specifications, have caused interoperability problems.

135     Also, a small number of errata have been reported.

137     This revision does not change any of the rules of the specification;
138     all texts which were legal JSON remain so, and none which were not
139     JSON become JSON. The revision's goal is to fix the errata and
140     highlight practices which can lead to interoperability problems.

142  1.4. Changes from RFC 4627

144     This section lists all changes between this document and the text in
145     RFC 4627.

147     o Changed Working Group attribution to JSON Working Group.

149     o Changed title of document.

151     o Changed the reference to [UNICODE] to be non-version-specific.

153     o Added a "Specifications of JSON" section.

155     o Added an "Introduction to this Revision" section.

157     o Added language about duplicate object member names and
158       interoperability.

160     o Applied erratum #607 from RFC 4627 to correctly align the artwork
161       for the definition of "object".

163     o Changed "as sequences of digits" to "in the grammar below" in
164       "Numbers" section.

166     o Added language about number interoperability as a function of
167       IEEE754, and an IEEE754 reference.

169     o Added language about interoperability and Unicode characters, and
170       about string comparisons. To do this, turned the old "Encoding"
171       section into a "String and Character Issues" section, with three
172       subsections: The old "Encoding" material, and two new sections for
173       "Unicode Characters" and "String Comparison".

175     o Changed guidance in "Parsers" section to point out that
176       implementations may set limits on the range "and precision" of
177       numbers.

179     o Removed the language "Interoperability considerations: n/a" from
180       the "IANA Considerations" section.

182     o Made a real "Security Considerations" section, and lifted the text
183       out of the existing "IANA Considerations" section.

185     o Applied erratum #3607 from RFC 4627 by removing the security
186       consideration that begins "A JSON text can be safely passed" and
187       the JavaScript code that went with that consideration.

189     o Added a note to the "Security Considerations" section pointing out
190       the risks of using the "eval()" function in JavaScript or any
191       other language in which JSON texts conform to that language's
192       syntax.

194     o Added "Contributors" section crediting Douglas Crockford.

196     o Moved the ECMAScript reference from Normative to Informative,
197       updated it to reference ECMAScript 5.1, and added reference to
198       ECMA 404.

200  2. JSON Grammar

202     A JSON text is a sequence of tokens. The set of tokens includes six
203     structural characters, strings, numbers, and three literal names.

205     A JSON text is a serialized object or array.

207        JSON-text = object / array

209     These are the six structural characters:

211        begin-array     = ws %x5B ws  ; [ left square bracket

213        begin-object    = ws %x7B ws  ; { left curly bracket

215        end-array       = ws %x5D ws  ; ] right square bracket

217        end-object      = ws %x7D ws  ; } right curly bracket

219        name-separator  = ws %x3A ws  ; : colon

221        value-separator = ws %x2C ws  ; , comma

223     Insignificant whitespace is allowed before or after any of the six
224     structural characters.

226        ws = *(
227                %x20 /              ; Space
228                %x09 /              ; Horizontal tab
229                %x0A /              ; Line feed or New line
230                %x0D                ; Carriage return
231            )

233  3. Values

235     A JSON value MUST be an object, array, number, or string, or one of
236     the following three literal names:

238        false null true

240     The literal names MUST be lowercase. No other literal names are
241     allowed.

243        value = false / null / true / object / array / number / string

245        false = %x66.61.6c.73.65   ; false

247        null  = %x6e.75.6c.6c      ; null

249        true  = %x74.72.75.65      ; true

251  4. Objects

253     An object structure is represented as a pair of curly brackets
254     surrounding zero or more name/value pairs (or members). A name is a
255     string. A single colon comes after each name, separating the name
256     from the value. A single comma separates a value from a following
257     name. The names within an object SHOULD be unique.

259        object = begin-object [ member *( value-separator member ) ]
260                 end-object

262        member = string name-separator value

264     An object whose names are all unique is interoperable in the sense
265     that all software implementations which receive that object will
266     agree on the name-value mappings. When the names within an object
267     are not unique, the behavior of software that receives such an object
268     is unpredictable. Many implementations report the last name/value
269     pair only; other implementations report an error or fail to parse the
270     object; other implementations report all of the name/value pairs,
271     including duplicates.

273  5. Arrays

275     An array structure is represented as square brackets surrounding zero
276     or more values (or elements). Elements are separated by commas.

278        array = begin-array [ value *( value-separator value ) ] end-array

280  6. Numbers

282     The representation of numbers is similar to that used in most
283     programming languages. A number contains an integer component that
284     may be prefixed with an optional minus sign, which may be followed by
285     a fraction part and/or an exponent part.

287     Octal and hex forms are not allowed. Leading zeros are not allowed.

289     A fraction part is a decimal point followed by one or more digits.

291     An exponent part begins with the letter E in upper or lowercase,
292     which may be followed by a plus or minus sign. The E and optional
293     sign are followed by one or more digits.
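
Illustrative note (not part of the draft): the number syntax described in
the prose of Section 6 can be sketched as a single regular expression. The
names JSON_NUMBER and is_json_number are hypothetical, chosen only for this
sketch, and the pattern is one reading of the prose under the assumption
that the whole input string must be a number with no surrounding whitespace.

```python
import re

# Hypothetical pattern mirroring the prose of Section 6:
#   optional minus sign, integer part with no leading zeros,
#   optional fraction part, optional exponent part.
# [0-9] is used instead of \d so that only ASCII digits match.
JSON_NUMBER = re.compile(
    r"-?"                  # optional minus sign
    r"(?:0|[1-9][0-9]*)"   # integer part: zero, or a non-zero digit then digits
    r"(?:\.[0-9]+)?"       # fraction part: decimal point, one or more digits
    r"(?:[eE][+-]?[0-9]+)?"  # exponent part: e or E, optional sign, digits
)


def is_json_number(text: str) -> bool:
    """Return True if the whole string matches the number syntax above."""
    return JSON_NUMBER.fullmatch(text) is not None
```

Under this sketch, "0", "-12.5e+3", and "1E400" are accepted, while "012",
"0x1F", ".5", "1.", "+1", and "NaN" are rejected. Acceptance is purely
syntactic; whether a receiver can represent a value such as 1E400 is a
separate interoperability question.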

295     Numeric values that cannot be represented in the grammar below (such
296     as Infinity and NaN) are not permitted.

298        number = [ minus ] int [ frac ] [ exp ]

300        decimal-point = %x2E       ; .

302        digit1-9 = %x31-39         ; 1-9

304        e = %x65 / %x45            ; e E

306        exp = e [ minus / plus ] 1*DIGIT

308        frac = decimal-point 1*DIGIT

310        int = zero / ( digit1-9 *DIGIT )

312        minus = %x2D               ; -

314        plus = %x2B                ; +

316        zero = %x30                ; 0

318     This specification allows implementations to set limits on the range
319     and precision of numbers accepted. Since software which implements
320     IEEE 754-2008 [IEEE754] is generally available and widely used, good
321     interoperability can be achieved by implementations which expect no
322     more precision or range than provided by an IEEE 754 binary64 (double
323     precision) number, in the sense that implementations will approximate
324     JSON numbers within the expected precision. A JSON number which is
325     outside those bounds, such as 1E400 or
326     3.141592653589793238462643383279, may indicate potential
327     interoperability problems since it suggests that the software which
328     created it expected greater magnitude or precision than is widely
329     available.

331     Note that when such software is used, numbers which are integers and
332     are in the range [-(2**53)+1, (2**53)-1] are interoperable in the
333     sense that implementations will agree exactly on their numeric
334     values.

336  7. Strings

338     The representation of strings is similar to conventions used in the C
339     family of programming languages. A string begins and ends with
340     quotation marks. All Unicode characters may be placed within the
341     quotation marks except for the characters that must be escaped:
342     quotation mark, reverse solidus, and the control characters (U+0000
343     through U+001F).

345     Any character may be escaped. If the character is in the Basic
346     Multilingual Plane (U+0000 through U+FFFF), then it may be
347     represented as a six-character sequence: a reverse solidus, followed
348     by the lowercase letter u, followed by four hexadecimal digits that
349     encode the character's code point. The hexadecimal letters A through
350     F can be upper or lowercase. So, for example, a string containing
351     only a single reverse solidus character may be represented as
352     "\u005C".

354     Alternatively, there are two-character sequence escape
355     representations of some popular characters. So, for example, a
356     string containing only a single reverse solidus character may be
357     represented more compactly as "\\".

359     To escape an extended character that is not in the Basic Multilingual
360     Plane, the character is represented as a twelve-character sequence,
361     encoding the UTF-16 surrogate pair. So, for example, a string
362     containing only the G clef character (U+1D11E) may be represented as
363     "\uD834\uDD1E".

365        string = quotation-mark *char quotation-mark

367        char = unescaped /
368            escape (
369                %x22 /          ; "    quotation mark  U+0022
370                %x5C /          ; \    reverse solidus U+005C
371                %x2F /          ; /    solidus         U+002F
372                %x62 /          ; b    backspace       U+0008
373                %x66 /          ; f    form feed       U+000C
374                %x6E /          ; n    line feed       U+000A
375                %x72 /          ; r    carriage return U+000D
376                %x74 /          ; t    tab             U+0009
377                %x75 4HEXDIG )  ; uXXXX                U+XXXX

379        escape = %x5C              ; \

381        quotation-mark = %x22      ; "

383        unescaped = %x20-21 / %x23-5B / %x5D-10FFFF

385  8. String and Character Issues

387  8.1. Encoding and Detection

389     JSON text SHALL be encoded in Unicode. The default encoding is
390     UTF-8.

392     Since the first two characters of a JSON text will always be ASCII
393     characters [RFC0020], it is possible to determine whether an octet
394     stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
395     at the pattern of nulls in the first four octets.

397        00 00 00 xx  UTF-32BE
398        00 xx 00 xx  UTF-16BE
399        xx 00 00 00  UTF-32LE
400        xx 00 xx 00  UTF-16LE
401        xx xx xx xx  UTF-8

403  8.2. Unicode Characters

405     When all the strings represented in a JSON text are composed entirely
406     of Unicode characters [UNICODE] (however escaped), then that JSON
407     text is interoperable in the sense that all software implementations
408     which parse it will agree on the contents of names and of string
409     values in objects and arrays.

411     However, the ABNF in this specification allows member names and
412     string values to contain bit sequences which cannot encode Unicode
413     characters, for example "\uDEAD" (a single unpaired UTF-16
414     surrogate). Instances of this have been observed, for example when a
415     library truncates a UTF-16 string without checking whether the
416     truncation split a surrogate pair. The behavior of software which
417     receives JSON texts containing such values is unpredictable; for
418     example, implementations might return different values for the length
419     of a string value, or even suffer fatal runtime exceptions.

421  8.3. String Comparison

423     Software implementations are typically required to test names of
424     object members for equality. Implementations which transform the
425     textual representation into sequences of Unicode code units, and then
426     perform the comparison numerically, code unit by code unit, are
427     interoperable in the sense that implementations will agree in all
428     cases on equality or inequality of two strings. For example,
429     implementations which compare strings with escaped characters
430     unconverted may incorrectly find that "a\\b" and "a\u005Cb" are not
431     equal.

433  9. Parsers

435     A JSON parser transforms a JSON text into another representation. A
436     JSON parser MUST accept all texts that conform to the JSON grammar.
437     A JSON parser MAY accept non-JSON forms or extensions.

439     An implementation may set limits on the size of texts that it
440     accepts. An implementation may set limits on the maximum depth of
441     nesting. An implementation may set limits on the range and precision
442     of numbers. An implementation may set limits on the length and
443     character contents of strings.

445  10. Generators

447     A JSON generator produces JSON text. The resulting text MUST
448     strictly conform to the JSON grammar.

450  11. IANA Considerations
451     The MIME media type for JSON text is application/json.

453     Type name: application

455     Subtype name: json

457     Required parameters: n/a

459     Optional parameters: n/a

461     Encoding considerations: 8bit if UTF-8; binary if UTF-16 or UTF-32

463        JSON may be represented using UTF-8, UTF-16, or UTF-32. When JSON
464        is written in UTF-8, JSON is 8bit compatible. When JSON is
465        written in UTF-16 or UTF-32, the binary content-transfer-encoding
466        must be used.

468     Published specification: RFC 4627

470     Applications that use this media type:

472        JSON has been used to exchange data between applications written
473        in all of these programming languages: ActionScript, C, C#,
474        ColdFusion, Common Lisp, E, Erlang, Java, JavaScript, Lua,
475        Objective CAML, Perl, PHP, Python, Rebol, Ruby, and Scheme.

477     Additional information:

479        Magic number(s): n/a
480        File extension(s): .json
481        Macintosh file type code(s): TEXT

483     Person & email address to contact for further information:
484        Douglas Crockford
485        douglas@crockford.com

487     Intended usage: COMMON

489     Restrictions on usage: none

491     Author:
492        Douglas Crockford
493        douglas@crockford.com

495     Change controller:
496        Douglas Crockford
497        douglas@crockford.com

499  12. Security Considerations

501     Generally there are security issues with scripting languages. JSON
502     is a subset of JavaScript, but excludes assignment and invocation.

504     Since JSON's syntax is borrowed from JavaScript, it is possible to
505     use that language's "eval()" function to parse JSON texts. This
506     generally constitutes an unacceptable security risk, since the text
507     could contain executable code along with data declarations. The same
508     consideration applies in any other programming language in which
509     JSON texts conform to that language's syntax.

511  13. Examples

513     This is a JSON object:

515        {
516            "Image": {
517                "Width": 800,
518                "Height": 600,
519                "Title": "View from 15th Floor",
520                "Thumbnail": {
521                    "Url": "http://www.example.com/image/481989943",
522                    "Height": 125,
523                    "Width": "100"
524                },
525                "IDs": [116, 943, 234, 38793]
526            }
527        }

529     Its Image member is an object whose Thumbnail member is an object and
530     whose IDs member is an array of numbers.

532     This is a JSON array containing two objects:

534        [
535            {
536                "precision": "zip",
537                "Latitude": 37.7668,
538                "Longitude": -122.3959,
539                "Address": "",
540                "City": "SAN FRANCISCO",
541                "State": "CA",
542                "Zip": "94107",
543                "Country": "US"
544            },
545            {
546                "precision": "zip",
547                "Latitude": 37.371991,
548                "Longitude": -122.026020,
549                "Address": "",
550                "City": "SUNNYVALE",
551                "State": "CA",
552                "Zip": "94085",
553                "Country": "US"
554            }
555        ]

557  14. Contributors

559     RFC 4627 was written by Douglas Crockford. This document was
560     constructed by making a relatively small number of changes to that
561     document; thus the vast majority of the text here is his.

563  15. References

565  15.1. Normative References

567     [IEEE754] IEEE, "IEEE Standard for Floating-Point Arithmetic", 2008.

570     [RFC0020] Cerf, V., "ASCII format for network interchange", RFC 20,
571               October 1969.

573     [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
574               Requirement Levels", BCP 14, RFC 2119, March 1997.

576     [RFC4234] Crocker, D., Ed. and P. Overell, "Augmented BNF for
577               Syntax Specifications: ABNF", RFC 4234, October 2005.

579     [UNICODE] The Unicode Consortium, "The Unicode Standard, Version
580               4.0", 2003.

582  15.2. Informative References

584     [ECMA-262]
585               European Computer Manufacturers Association, "ECMAScript
586               Language Specification 5.1 Edition", June 2011.

589     [ECMA-404]
590               Ecma International, "The JSON Data Interchange Format",
591               October 2013.

594  Appendix A. Changes in -04

596     o Reworded Section 8.2 to talk about strings that are represented in
597       the JSON text, rather than the actual text itself. Also fine-
598       tuned the "will agree on" clause in the interoperability
599       description.

601     o Changed "20008" to "2008".

603     o Reworded numeric-interoperability language following on WG
604       discussion, notably referring to availability of software that
605       does IEEE754 and "approximate JSON numbers within the expected
606       precision". Also took out duplicate language about NaN and Inf.

608     o Changed "as sequences of digits" to "in the grammar below" in
609       "Numbers" section.

611  Appendix B. Changes in -05

613     o Removed the numbers-interop text about "frac" and "exp" parts.

615     o Added the obsoletes 4627 attribute.

617     o Moved the ECMAScript ref from normative to informative, and
618       redirected to point at 5.1.

620     o Changed numbers language to say that implementations can impose
621       limits on range *and precision*.

623     o Changed section title from "Character Model" to "String and
624       Character Issues".

626     o Added "Specifications of JSON" section, and included a reference
627       to ECMA-404.

629     o Removed the consensus-call link from the list of changes.

631     o Added a paragraph about not using eval() in JavaScript or other
632       languages where JSON syntax matches that language's syntax.

634     o Reorganized the list of changes so they're ordered like the spec,
635       and cleaned up language a bit.

637  Author's Address

639     Tim Bray (editor)
640     Google, Inc.

642     Email: tbray@textuality.com