draft-ietf-cbor-7049bis-03.txt | draft-ietf-cbor-7049bis-04.txt | |||
---|---|---|---|---|
Network Working Group C. Bormann | Network Working Group C. Bormann | |||
Internet-Draft Universitaet Bremen TZI | Internet-Draft Universitaet Bremen TZI | |||
Intended status: Standards Track P. Hoffman | Intended status: Standards Track P. Hoffman | |||
Expires: March 24, 2019 ICANN | Expires: April 26, 2019 ICANN | |||
September 20, 2018 | October 23, 2018 | |||
Concise Binary Object Representation (CBOR) | Concise Binary Object Representation (CBOR) | |||
draft-ietf-cbor-7049bis-03 | draft-ietf-cbor-7049bis-04 | |||
Abstract | Abstract | |||
The Concise Binary Object Representation (CBOR) is a data format | The Concise Binary Object Representation (CBOR) is a data format | |||
whose design goals include the possibility of extremely small code | whose design goals include the possibility of extremely small code | |||
size, fairly small message size, and extensibility without the need | size, fairly small message size, and extensibility without the need | |||
for version negotiation. These design goals make it different from | for version negotiation. These design goals make it different from | |||
earlier binary serializations such as ASN.1 and MessagePack. | earlier binary serializations such as ASN.1 and MessagePack. | |||
Contributing | Contributing | |||
skipping to change at page 1, line 47 | skipping to change at page 1, line 47 | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on March 24, 2019. | This Internet-Draft will expire on April 26, 2019. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2018 IETF Trust and the persons identified as the | Copyright (c) 2018 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
skipping to change at page 2, line 25 | skipping to change at page 2, line 25 | |||
to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
described in the Simplified BSD License. | described in the Simplified BSD License. | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
1.1. Objectives . . . . . . . . . . . . . . . . . . . . . . . 4 | 1.1. Objectives . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
1.2. Terminology . . . . . . . . . . . . . . . . . . . . . . . 5 | 1.2. Terminology . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
2. CBOR Data Models . . . . . . . . . . . . . . . . . . . . . . 6 | 2. CBOR Data Models . . . . . . . . . . . . . . . . . . . . . . 7 | |||
2.1. Extended Generic Data Models . . . . . . . . . . . . . . 7 | 2.1. Extended Generic Data Models . . . . . . . . . . . . . . 8 | |||
2.2. Specific Data Models . . . . . . . . . . . . . . . . . . 8 | 2.2. Specific Data Models . . . . . . . . . . . . . . . . . . 8 | |||
3. Specification of the CBOR Encoding . . . . . . . . . . . . . 8 | 3. Specification of the CBOR Encoding . . . . . . . . . . . . . 9 | |||
3.1. Major Types . . . . . . . . . . . . . . . . . . . . . . . 9 | 3.1. Major Types . . . . . . . . . . . . . . . . . . . . . . . 10 | |||
3.2. Indefinite Lengths for Some Major Types . . . . . . . . . 11 | 3.2. Indefinite Lengths for Some Major Types . . . . . . . . . 11 | |||
3.2.1. Indefinite-Length Arrays and Maps . . . . . . . . . . 11 | 3.2.1. Indefinite-Length Arrays and Maps . . . . . . . . . . 12 | |||
3.2.2. Indefinite-Length Byte Strings and Text Strings . . . 13 | 3.2.2. Indefinite-Length Byte Strings and Text Strings . . . 14 | |||
3.3. Floating-Point Numbers and Values with No Content . . . . 14 | 3.3. Floating-Point Numbers and Values with No Content . . . . 15 | |||
3.4. Optional Tagging of Items . . . . . . . . . . . . . . . . 16 | 3.4. Optional Tagging of Items . . . . . . . . . . . . . . . . 16 | |||
3.4.1. Date and Time . . . . . . . . . . . . . . . . . . . . 18 | 3.4.1. Date and Time . . . . . . . . . . . . . . . . . . . . 18 | |||
3.4.2. Bignums . . . . . . . . . . . . . . . . . . . . . . . 18 | 3.4.2. Standard Date/Time String . . . . . . . . . . . . . . 18 | |||
3.4.3. Decimal Fractions and Bigfloats . . . . . . . . . . . 19 | 3.4.3. Epoch-based Date/Time . . . . . . . . . . . . . . . . 18 | |||
3.4.4. Content Hints . . . . . . . . . . . . . . . . . . . . 20 | 3.4.4. Bignums . . . . . . . . . . . . . . . . . . . . . . . 19 | |||
3.4.4.1. Encoded CBOR Data Item . . . . . . . . . . . . . 20 | 3.4.5. Decimal Fractions and Bigfloats . . . . . . . . . . . 20 | |||
3.4.4.2. Expected Later Encoding for CBOR-to-JSON | 3.4.6. Content Hints . . . . . . . . . . . . . . . . . . . . 21 | |||
Converters . . . . . . . . . . . . . . . . . . . 20 | 3.4.6.1. Encoded CBOR Data Item . . . . . . . . . . . . . 21 | |||
3.4.4.3. Encoded Text . . . . . . . . . . . . . . . . . . 21 | 3.4.6.2. Expected Later Encoding for CBOR-to-JSON | |||
3.4.5. Self-Describe CBOR . . . . . . . . . . . . . . . . . 21 | Converters . . . . . . . . . . . . . . . . . . . 21 | |||
4. Creating CBOR-Based Protocols . . . . . . . . . . . . . . . . 22 | 3.4.6.3. Encoded Text . . . . . . . . . . . . . . . . . . 22 | |||
4.1. CBOR in Streaming Applications . . . . . . . . . . . . . 23 | 3.4.7. Self-Describe CBOR . . . . . . . . . . . . . . . . . 22 | |||
4.2. Generic Encoders and Decoders . . . . . . . . . . . . . . 23 | 4. Creating CBOR-Based Protocols . . . . . . . . . . . . . . . . 23 | |||
4.3. Syntax Errors . . . . . . . . . . . . . . . . . . . . . . 24 | 4.1. CBOR in Streaming Applications . . . . . . . . . . . . . 24 | |||
4.3.1. Incomplete CBOR Data Items . . . . . . . . . . . . . 24 | 4.2. Generic Encoders and Decoders . . . . . . . . . . . . . . 24 | |||
4.3.2. Malformed Indefinite-Length Items . . . . . . . . . . 24 | 4.3. Syntax Errors . . . . . . . . . . . . . . . . . . . . . . 25 | |||
4.3.3. Unknown Additional Information Values . . . . . . . . 25 | 4.3.1. Incomplete CBOR Data Items . . . . . . . . . . . . . 25 | |||
4.4. Other Decoding Errors . . . . . . . . . . . . . . . . . . 25 | 4.3.2. Malformed Indefinite-Length Items . . . . . . . . . . 25 | |||
4.5. Handling Unknown Simple Values and Tags . . . . . . . . . 26 | 4.3.3. Unknown Additional Information Values . . . . . . . . 26 | |||
4.6. Numbers . . . . . . . . . . . . . . . . . . . . . . . . . 26 | ||||
4.7. Specifying Keys for Maps . . . . . . . . . . . . . . . . 27 | 4.4. Other Decoding Errors . . . . . . . . . . . . . . . . . . 26 | |||
4.7.1. Equivalence of Keys . . . . . . . . . . . . . . . . . 28 | 4.5. Handling Unknown Simple Values and Tags . . . . . . . . . 27 | |||
4.8. Undefined Values . . . . . . . . . . . . . . . . . . . . 29 | 4.6. Numbers . . . . . . . . . . . . . . . . . . . . . . . . . 27 | |||
4.9. Canonical CBOR . . . . . . . . . . . . . . . . . . . . . 29 | 4.7. Specifying Keys for Maps . . . . . . . . . . . . . . . . 28 | |||
4.9.1. Length-first map key ordering . . . . . . . . . . . . 31 | 4.7.1. Equivalence of Keys . . . . . . . . . . . . . . . . . 29 | |||
4.10. Strict Mode . . . . . . . . . . . . . . . . . . . . . . . 32 | 4.8. Undefined Values . . . . . . . . . . . . . . . . . . . . 30 | |||
5. Converting Data between CBOR and JSON . . . . . . . . . . . . 33 | 4.9. Preferred Serialization . . . . . . . . . . . . . . . . . 30 | |||
5.1. Converting from CBOR to JSON . . . . . . . . . . . . . . 33 | 4.10. Canonical CBOR . . . . . . . . . . . . . . . . . . . . . 31 | |||
5.2. Converting from JSON to CBOR . . . . . . . . . . . . . . 35 | 4.10.1. Length-first map key ordering . . . . . . . . . . . 33 | |||
6. Future Evolution of CBOR . . . . . . . . . . . . . . . . . . 36 | 4.11. Strict Mode . . . . . . . . . . . . . . . . . . . . . . . 34 | |||
6.1. Extension Points . . . . . . . . . . . . . . . . . . . . 36 | 5. Converting Data between CBOR and JSON . . . . . . . . . . . . 35 | |||
6.2. Curating the Additional Information Space . . . . . . . . 37 | 5.1. Converting from CBOR to JSON . . . . . . . . . . . . . . 35 | |||
7. Diagnostic Notation . . . . . . . . . . . . . . . . . . . . . 37 | 5.2. Converting from JSON to CBOR . . . . . . . . . . . . . . 37 | |||
7.1. Encoding Indicators . . . . . . . . . . . . . . . . . . . 38 | 6. Future Evolution of CBOR . . . . . . . . . . . . . . . . . . 37 | |||
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 39 | 6.1. Extension Points . . . . . . . . . . . . . . . . . . . . 38 | |||
8.1. Simple Values Registry . . . . . . . . . . . . . . . . . 39 | 6.2. Curating the Additional Information Space . . . . . . . . 39 | |||
8.2. Tags Registry . . . . . . . . . . . . . . . . . . . . . . 39 | 7. Diagnostic Notation . . . . . . . . . . . . . . . . . . . . . 39 | |||
8.3. Media Type ("MIME Type") . . . . . . . . . . . . . . . . 40 | 7.1. Encoding Indicators . . . . . . . . . . . . . . . . . . . 40 | |||
8.4. CoAP Content-Format . . . . . . . . . . . . . . . . . . . 41 | 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 41 | |||
8.5. The +cbor Structured Syntax Suffix Registration . . . . . 41 | 8.1. Simple Values Registry . . . . . . . . . . . . . . . . . 41 | |||
9. Security Considerations . . . . . . . . . . . . . . . . . . . 42 | 8.2. Tags Registry . . . . . . . . . . . . . . . . . . . . . . 41 | |||
10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 42 | 8.3. Media Type ("MIME Type") . . . . . . . . . . . . . . . . 42 | |||
11. References . . . . . . . . . . . . . . . . . . . . . . . . . 43 | 8.4. CoAP Content-Format . . . . . . . . . . . . . . . . . . . 42 | |||
11.1. Normative References . . . . . . . . . . . . . . . . . . 43 | 8.5. The +cbor Structured Syntax Suffix Registration . . . . . 43 | |||
11.2. Informative References . . . . . . . . . . . . . . . . . 44 | 9. Security Considerations . . . . . . . . . . . . . . . . . . . 44 | |||
Appendix A. Examples . . . . . . . . . . . . . . . . . . . . . . 46 | 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 44 | |||
Appendix B. Jump Table . . . . . . . . . . . . . . . . . . . . . 50 | 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 45 | |||
Appendix C. Pseudocode . . . . . . . . . . . . . . . . . . . . . 53 | 11.1. Normative References . . . . . . . . . . . . . . . . . . 45 | |||
Appendix D. Half-Precision . . . . . . . . . . . . . . . . . . . 55 | 11.2. Informative References . . . . . . . . . . . . . . . . . 46 | |||
| | Appendix A. Examples . . . . . . . . . . . . . . . . . . . . . . 48 | |||
| | Appendix B. Jump Table . . . . . . . . . . . . . . . . . . . . . 52 | |||
| | Appendix C. Pseudocode . . . . . . . . . . . . . . . . . . . . . 55 | |||
| | Appendix D. Half-Precision . . . . . . . . . . . . . . . . . . . 57 | |||
Appendix E. Comparison of Other Binary Formats to CBOR's Design | Appendix E. Comparison of Other Binary Formats to CBOR's Design | |||
Objectives . . . . . . . . . . . . . . . . . . . . . 56 | Objectives . . . . . . . . . . . . . . . . . . . . . 58 | |||
E.1. ASN.1 DER, BER, and PER . . . . . . . . . . . . . . . . . 57 | E.1. ASN.1 DER, BER, and PER . . . . . . . . . . . . . . . . . 59 | |||
E.2. MessagePack . . . . . . . . . . . . . . . . . . . . . . . 57 | E.2. MessagePack . . . . . . . . . . . . . . . . . . . . . . . 59 | |||
E.3. BSON . . . . . . . . . . . . . . . . . . . . . . . . . . 58 | E.3. BSON . . . . . . . . . . . . . . . . . . . . . . . . . . 60 | |||
E.4. UBJSON . . . . . . . . . . . . . . . . . . . . . . . . . 58 | E.4. MSDTP: RFC 713 . . . . . . . . . . . . . . . . . . . . . 60 | |||
E.5. MSDTP: RFC 713 . . . . . . . . . . . . . . . . . . . . . 58 | E.5. Conciseness on the Wire . . . . . . . . . . . . . . . . . 60 | |||
E.6. Conciseness on the Wire . . . . . . . . . . . . . . . . . 58 | Appendix F. Changes from RFC 7049 . . . . . . . . . . . . . . . 61 | |||
Appendix F. Changes from RFC 7049 . . . . . . . . . . . . . . . 59 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 61 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 59 | ||||
1. Introduction | 1. Introduction | |||
There are hundreds of standardized formats for binary representation | There are hundreds of standardized formats for binary representation | |||
of structured data (also known as binary serialization formats). Of | of structured data (also known as binary serialization formats). Of | |||
those, some are for specific domains of information, while others are | those, some are for specific domains of information, while others are | |||
generalized for arbitrary data. In the IETF, probably the best-known | generalized for arbitrary data. In the IETF, probably the best-known | |||
formats in the latter category are ASN.1's BER and DER [ASN.1]. | formats in the latter category are ASN.1's BER and DER [ASN.1]. | |||
The format defined here follows some specific design goals that are | The format defined here follows some specific design goals that are | |||
skipping to change at page 6, line 45 | skipping to change at page 6, line 49 | |||
of the data items in the sequence available to an application as | of the data items in the sequence available to an application as | |||
they are received. | they are received. | |||
Where bit arithmetic or data types are explained, this document uses | Where bit arithmetic or data types are explained, this document uses | |||
the notation familiar from the programming language C, except that | the notation familiar from the programming language C, except that | |||
"**" denotes exponentiation. Similar to the "0x" notation for | "**" denotes exponentiation. Similar to the "0x" notation for | |||
hexadecimal numbers, numbers in binary notation are prefixed with | hexadecimal numbers, numbers in binary notation are prefixed with | |||
"0b". Underscores can be added to such a number solely for | "0b". Underscores can be added to such a number solely for | |||
readability, so 0b00100001 (0x21) might be written 0b001_00001 to | readability, so 0b00100001 (0x21) might be written 0b001_00001 to | |||
emphasize the desired interpretation of the bits in the byte; in this | emphasize the desired interpretation of the bits in the byte; in this | |||
case, it is split into three bits and five bits. | case, it is split into three bits and five bits. Encoded CBOR data | |||
| | items are sometimes given in the "0x" or "0b" notation; these values | |||
| | are first interpreted as numbers as in C and are then interpreted as | |||
| | byte strings in network byte order, including any leading zero bytes | |||
| | expressed in the notation. | |||
2. CBOR Data Models | 2. CBOR Data Models | |||
CBOR is explicit about its generic data model, which defines the set | CBOR is explicit about its generic data model, which defines the set | |||
of all data items that can be represented in CBOR. Its basic generic | of all data items that can be represented in CBOR. Its basic generic | |||
data model is extensible by the registration of simple type values | data model is extensible by the registration of simple type values | |||
and tags. Applications can then subset the resulting extended | and tags. Applications can then subset the resulting extended | |||
generic data model to build their specific data models. | generic data model to build their specific data models. | |||
Within environments that can represent the data items in the generic | Within environments that can represent the data items in the generic | |||
skipping to change at page 7, line 25 | skipping to change at page 7, line 33 | |||
In the basic (un-extended) generic data model, a data item is one of: | In the basic (un-extended) generic data model, a data item is one of: | |||
o an integer in the range -2**64..2**64-1 inclusive | o an integer in the range -2**64..2**64-1 inclusive | |||
o a simple value, identified by a number between 0 and 255, but | o a simple value, identified by a number between 0 and 255, but | |||
distinct from that number | distinct from that number | |||
o a floating point value, distinct from an integer, out of the set | o a floating point value, distinct from an integer, out of the set | |||
representable by IEEE 754 binary64 (including non-finites) | representable by IEEE 754 binary64 (including non-finites) | |||
| | [IEEE.754.2008] | |||
o a sequence of zero or more bytes ("byte string") | o a sequence of zero or more bytes ("byte string") | |||
o a sequence of zero or more Unicode code points ("text string") | o a sequence of zero or more Unicode code points ("text string") | |||
o a sequence of zero or more data items ("array") | o a sequence of zero or more data items ("array") | |||
o a mapping (mathematical function) from zero or more data items | o a mapping (mathematical function) from zero or more data items | |||
("keys") each to a data item ("values"), ("map") | ("keys") each to a data item ("values"), ("map") | |||
o a tagged data item, comprising a tag (an integer in the range | o a tagged data item, comprising a tag (an integer in the range | |||
0..2**64-1) and a value (a data item) | 0..2**64-1) and a value (a data item) | |||
Note that integer and floating-point values are distinct in this | Note that integer and floating-point values are distinct in this | |||
model, even if they have the same numeric value. | model, even if they have the same numeric value. | |||
| | Also note that serialization variants, such as number of bytes of the | |||
| | encoded floating value, or the choice of one of the ways in which an | |||
| | integer, the length of a text or byte string, the number of elements | |||
| | in an array or pairs in a map, or a tag value, (collectively "the | |||
| | argument", see Section 3) can be encoded, are not visible at the | |||
| | generic data model level. | |||
2.1. Extended Generic Data Models | 2.1. Extended Generic Data Models | |||
This basic generic data model comes pre-extended by the registration | This basic generic data model comes pre-extended by the registration | |||
of a number of simple values and tags right in this document, such | of a number of simple values and tags right in this document, such | |||
as: | as: | |||
o "false", "true", "null", and "undefined" (simple values identified | o "false", "true", "null", and "undefined" (simple values identified | |||
by 20..23) | by 20..23) | |||
o integer and floating point values with a larger range and | o integer and floating point values with a larger range and | |||
skipping to change at page 8, line 32 | skipping to change at page 8, line 48 | |||
intentionally omitted) in the form appropriate for their programming | intentionally omitted) in the form appropriate for their programming | |||
environment, implementation of the data model extensions created by | environment, implementation of the data model extensions created by | |||
tags is truly optional and a matter of implementation quality. | tags is truly optional and a matter of implementation quality. | |||
2.2. Specific Data Models | 2.2. Specific Data Models | |||
The specific data model for a CBOR-based protocol usually subsets the | The specific data model for a CBOR-based protocol usually subsets the | |||
extended generic data model and assigns application semantics to the | extended generic data model and assigns application semantics to the | |||
data items within this subset and its components. When documenting | data items within this subset and its components. When documenting | |||
such specific data models, where it is desired to specify the types | such specific data models, where it is desired to specify the types | |||
of data items, it is preferred to identify the types by their names | of data items, it is preferred to identify the types by the names | |||
in the generic data model ("negative integer", "array") instead of by | they have in the generic data model ("negative integer", "array") | |||
referring to aspects of their CBOR representation ("major type 1", | instead of by referring to aspects of their CBOR representation | |||
"major type 4"). | ("major type 1", "major type 4"). | |||
Specific data models can also specify that values of different types | Specific data models can also specify what values (including values | |||
are equivalent for the purposes of map keys and encoder freedom. For | of different types) are equivalent for the purposes of map keys and | |||
example, in the generic data model, a valid map MAY have both "0" and | encoder freedom. For example, in the generic data model, a valid map | |||
"0.0" as keys, and an encoder MUST NOT encode "0.0" as an integer | MAY have both "0" and "0.0" as keys, and an encoder MUST NOT encode | |||
(major type 0, Section 3.1). However, if a specific data model | "0.0" as an integer (major type 0, Section 3.1). However, if a | |||
declares that floating point and integer representations of integral | specific data model declares that floating point and integer | |||
values are equivalent, map keys "0" and "0.0" would be considered | representations of integral values are equivalent, using both map | |||
duplicates and so invalid, and an encoder could encode integral- | keys "0" and "0.0" in a single map would be considered duplicates and | |||
valued floats as integers or vice versa, perhaps to save encoded | so invalid, and an encoder could encode integral-valued floats as | |||
bytes. | integers or vice versa, perhaps to save encoded bytes. | |||
3. Specification of the CBOR Encoding | 3. Specification of the CBOR Encoding | |||
A CBOR data item (Section 2) is encoded to or decoded from a byte | A CBOR data item (Section 2) is encoded to or decoded from a byte | |||
string as described in this section. The encoding is summarized in | string as described in this section. The encoding is summarized in | |||
Table 5. | Table 5. | |||
The initial byte of each encoded data item contains both information | The initial byte of each encoded data item contains both information | |||
about the major type (the high-order 3 bits, described in | about the major type (the high-order 3 bits, described in | |||
Section 3.1) and additional information (the low-order 5 bits). | Section 3.1) and additional information (the low-order 5 bits). | |||
skipping to change at page 16, line 6 | skipping to change at page 16, line 26 | |||
| 23 | Undefined value | | | 23 | Undefined value | | |||
| | | | | | | | |||
| 24..31 | (Reserved) | | | 24..31 | (Reserved) | | |||
| | | | | | | | |||
| 32..255 | (Unassigned) | | | 32..255 | (Unassigned) | | |||
+---------+-----------------+ | +---------+-----------------+ | |||
Table 2: Simple Values | Table 2: Simple Values | |||
The 5-bit values of 25, 26, and 27 are for 16-bit, 32-bit, and 64-bit | The 5-bit values of 25, 26, and 27 are for 16-bit, 32-bit, and 64-bit | |||
IEEE 754 binary floating-point values. These floating-point values | IEEE 754 binary floating-point values [IEEE.754.2008]. These | |||
are encoded in the additional bytes of the appropriate size. (See | floating-point values are encoded in the additional bytes of the | |||
Appendix D for some information about 16-bit floating point.) | appropriate size. (See Appendix D for some information about 16-bit | |||
| | floating point.) | |||
An encoder MUST NOT encode False as the two-byte sequence of 0xf814, | An encoder MUST NOT encode False as the two-byte sequence of 0xf814, | |||
MUST NOT encode True as the two-byte sequence of 0xf815, MUST NOT | MUST NOT encode True as the two-byte sequence of 0xf815, MUST NOT | |||
encode Null as the two-byte sequence of 0xf816, and MUST NOT encode | encode Null as the two-byte sequence of 0xf816, and MUST NOT encode | |||
Undefined value as the two-byte sequence of 0xf817. A decoder MUST | Undefined value as the two-byte sequence of 0xf817. A decoder MUST | |||
treat these two-byte sequences as an error. Similar prohibitions | treat these two-byte sequences as an error. Similar prohibitions | |||
apply to the unassigned simple values as well. | apply to the unassigned simple values as well. | |||
3.4. Optional Tagging of Items | 3.4. Optional Tagging of Items | |||
skipping to change at page 16, line 30 | skipping to change at page 16, line 51 | |||
additional semantics while retaining its structure. The tag is major | additional semantics while retaining its structure. The tag is major | |||
type 6, and represents an integer number as indicated by the tag's | type 6, and represents an integer number as indicated by the tag's | |||
argument (Section 3); the (sole) data item is carried as content | argument (Section 3); the (sole) data item is carried as content | |||
data. If a tag requires structured data, this structure is encoded | data. If a tag requires structured data, this structure is encoded | |||
into the nested data item. The definition of a tag usually restricts | into the nested data item. The definition of a tag usually restricts | |||
what kinds of nested data item or items are valid. | what kinds of nested data item or items are valid. | |||
The initial bytes of the tag follow the rules for positive integers | The initial bytes of the tag follow the rules for positive integers | |||
(major type 0). The tag is followed by a single data item of any | (major type 0). The tag is followed by a single data item of any | |||
type. For example, assume that a byte string of length 12 is marked | type. For example, assume that a byte string of length 12 is marked | |||
with a tag to indicate it is a positive bignum (Section 3.4.2). This | with a tag to indicate it is a positive bignum (Section 3.4.4). This | |||
would be marked as 0b110_00010 (major type 6, additional information | would be marked as 0b110_00010 (major type 6, additional information | |||
2 for the tag) followed by 0b010_01100 (major type 2, additional | 2 for the tag) followed by 0b010_01100 (major type 2, additional | |||
information of 12 for the length) followed by the 12 bytes of the | information of 12 for the length) followed by the 12 bytes of the | |||
bignum. | bignum. | |||
Decoders do not need to understand tags, and thus tags may be of | Decoders do not need to understand tags, and thus tags may be of | |||
little value in applications where the implementation creating a | little value in applications where the implementation creating a | |||
particular CBOR data item and the implementation decoding that stream | particular CBOR data item and the implementation decoding that stream | |||
know the semantic meaning of each item in the data flow. Their | know the semantic meaning of each item in the data flow. Their | |||
primary purpose in this specification is to define common data types | primary purpose in this specification is to define common data types | |||
skipping to change at page 17, line 12 | skipping to change at page 17, line 33 | |||
value. The content of the tagged item is the data item (the value) | value. The content of the tagged item is the data item (the value) | |||
that is being tagged. | that is being tagged. | |||
IANA maintains a registry of tag values as described in Section 8.2. | IANA maintains a registry of tag values as described in Section 8.2. | |||
Table 3 provides a list of initial values, with definitions in the | Table 3 provides a list of initial values, with definitions in the | |||
rest of this section. | rest of this section. | |||
+-----------+--------------+----------------------------------------+ | +-----------+--------------+----------------------------------------+ | |||
| Tag | Data Item | Semantics | | | Tag | Data Item | Semantics | | |||
+-----------+--------------+----------------------------------------+ | +-----------+--------------+----------------------------------------+ | |||
| 0 | UTF-8 string | Standard date/time string; see | | | 0 | UTF-8 string | Standard date/time string; see Section | | |||
| | | Section 3.4.1 | | | | | 3.4.2 | | |||
| | | | | | | | | | |||
| 1 | multiple | Epoch-based date/time; see | | | 1 | multiple | Epoch-based date/time; see Section | | |||
| | | Section 3.4.1 | | | | | 3.4.3 | | |||
| | | | | | | | | | |||
| 2 | byte string | Positive bignum; see Section 3.4.2 | | | 2 | byte string | Positive bignum; see Section 3.4.4 | | |||
| | | | | | | | | | |||
| 3 | byte string | Negative bignum; see Section 3.4.2 | | | 3 | byte string | Negative bignum; see Section 3.4.4 | | |||
| | | | | | | | | | |||
| 4 | array | Decimal fraction; see Section 3.4.3 | | | 4 | array | Decimal fraction; see Section 3.4.5 | | |||
| | | | | | | | | | |||
| 5 | array | Bigfloat; see Section 3.4.3 | | | 5 | array | Bigfloat; see Section 3.4.5 | | |||
| | | | | | | | | | |||
| 6..20 | (Unassigned) | (Unassigned) | | | 6..20 | (Unassigned) | (Unassigned) | | |||
| | | | | | | | | | |||
| 21 | multiple | Expected conversion to base64url | | | 21 | multiple | Expected conversion to base64url | | |||
| | | encoding; see Section 3.4.4.2 | | | | | encoding; see Section 3.4.6.2 | | |||
| | | | | | | | | | |||
| 22 | multiple | Expected conversion to base64 | | | 22 | multiple | Expected conversion to base64 | | |||
| | | encoding; see Section 3.4.4.2 | | | | | encoding; see Section 3.4.6.2 | | |||
| | | | | | | | | | |||
| 23 | multiple | Expected conversion to base16 | | | 23 | multiple | Expected conversion to base16 | | |||
| | | encoding; see Section 3.4.4.2 | | | | | encoding; see Section 3.4.6.2 | | |||
| | | | | | | | | | |||
| 24 | byte string | Encoded CBOR data item; see | | | 24 | byte string | Encoded CBOR data item; see Section | | |||
| | | Section 3.4.4.1 | | | | | 3.4.6.1 | | |||
| | | | | | | | | | |||
| 25..31 | (Unassigned) | (Unassigned) | | | 25..31 | (Unassigned) | (Unassigned) | | |||
| | | | | | | | | | |||
| 32 | UTF-8 string | URI; see Section 3.4.4.3 | | | 32 | UTF-8 string | URI; see Section 3.4.6.3 | | |||
| | | | | | | | | | |||
| 33 | UTF-8 string | base64url; see Section 3.4.4.3 | | | 33 | UTF-8 string | base64url; see Section 3.4.6.3 | | |||
| | | | | | | | | | |||
| 34 | UTF-8 string | base64; see Section 3.4.4.3 | | | 34 | UTF-8 string | base64; see Section 3.4.6.3 | | |||
| | | | | | | | | | |||
| 35 | UTF-8 string | Regular expression; see | | | 35 | UTF-8 string | Regular expression; see Section | | |||
| | | Section 3.4.4.3 | | | | | 3.4.6.3 | | |||
| | | | | | | | | | |||
| 36 | UTF-8 string | MIME message; see Section 3.4.4.3 | | | 36 | UTF-8 string | MIME message; see Section 3.4.6.3 | | |||
| | | | | | | | | | |||
| 37..55798 | (Unassigned) | (Unassigned) | | | 37..55798 | (Unassigned) | (Unassigned) | | |||
| | | | | | | | | | |||
| 55799 | multiple | Self-describe CBOR; see Section 3.4.5 | | | 55799 | multiple | Self-describe CBOR; see Section 3.4.7 | | |||
| | | | | | | | | | |||
| 55800+ | (Unassigned) | (Unassigned) | | | 55800+ | (Unassigned) | (Unassigned) | | |||
+-----------+--------------+----------------------------------------+ | +-----------+--------------+----------------------------------------+ | |||
Table 3: Values for Tags | Table 3: Values for Tags | |||
3.4.1. Date and Time | 3.4.1. Date and Time | |||
Protocols using tag values 0 and 1 extend the generic data model | Protocols using tag values 0 and 1 extend the generic data model | |||
(Section 2) with data items representing points in time. | (Section 2) with data items representing points in time. | |||
3.4.2. Standard Date/Time String | ||||
Tag value 0 is for date/time strings that follow the standard format | Tag value 0 is for date/time strings that follow the standard format | |||
described in [RFC3339], as refined by Section 3.3 of [RFC4287]. | described in [RFC3339], as refined by Section 3.3 of [RFC4287]. | |||
Tag value 1 is for numerical representation of seconds relative to | 3.4.3. Epoch-based Date/Time | |||
1970-01-01T00:00Z in UTC time. (For the non-negative values that the | ||||
Portable Operating System Interface (POSIX) defines, the number of | ||||
seconds is counted in the same way as for POSIX "seconds since the | ||||
epoch" [TIME_T].) The tagged item can be a positive or negative | ||||
integer (major types 0 and 1), or a floating-point number (major type | ||||
7 with additional information 25, 26, or 27). Note that the number | ||||
can be negative (time before 1970-01-01T00:00Z) and, if a floating- | ||||
point number, indicate fractional seconds. | ||||
3.4.2. Bignums | Tag value 1 is for numerical representation of civil time expressed | |||
in seconds relative to 1970-01-01T00:00Z (in UTC time). | ||||
The tagged item MUST be an unsigned or negative integer (major types | ||||
0 and 1), or a floating-point number (major type 7 with additional | ||||
information 25, 26, or 27). | ||||
Non-negative values (major type 0 and non-negative floating-point | ||||
numbers) stand for time values on or after 1970-01-01T00:00Z UTC and | ||||
are interpreted according to POSIX [TIME_T]. (POSIX time is also | ||||
known as UNIX Epoch time. Note that leap seconds are handled | ||||
specially by POSIX time and this results in a 1 second discontinuity | ||||
several times per decade.) Note that applications that require the | ||||
expression of times beyond early 2106 cannot leave out support of | ||||
64-bit integers for the tagged value. | ||||
Negative values (major type 1 and negative floating-point numbers) | ||||
are interpreted as determined by the application requirements as | ||||
there is no universal standard for UTC count-of-seconds time before | ||||
1970-01-01T00:00Z (this is particularly true for points in time that | ||||
precede discontinuities in national calendars). | ||||
To indicate fractional seconds, floating point values can be used | ||||
within Tag 1 instead of integer values. Note that this generally | ||||
requires binary64 support, as binary16 and binary32 provide non-zero | ||||
fractions of seconds only for a short period of time around early | ||||
1970. An application that requires Tag 1 support may restrict the | ||||
tagged value to be an integer (or a floating-point value) only. | ||||
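The Tag 1 rules above can be illustrated with a small sketch. The function names `encode_head` and `encode_epoch_time` are hypothetical and not part of the specification. For simplicity, the sketch always uses binary64 for floating-point input, even where a shorter float would preserve the value.

```python
import struct


def encode_head(major_type, argument):
    """Encode an initial byte plus the shortest argument (hypothetical helper)."""
    if argument < 24:
        return bytes([(major_type << 5) | argument])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < 1 << (8 * size):
            return bytes([(major_type << 5) | info]) + argument.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def encode_epoch_time(seconds):
    """Encode seconds relative to 1970-01-01T00:00Z as a Tag 1 data item."""
    tag = bytes([0xC1])  # major type 6, tag value 1
    if isinstance(seconds, float):
        # Assumption: always emit binary64 (additional information 27).
        return tag + b"\xfb" + struct.pack(">d", seconds)
    if seconds >= 0:
        return tag + encode_head(0, seconds)       # unsigned integer
    return tag + encode_head(1, -1 - seconds)      # negative integer


print(encode_epoch_time(1363896240).hex())
print(encode_epoch_time(1363896240.5).hex())
```

How a negative value is interpreted is left to the application, as the text notes; the sketch only shows how it is encoded.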
3.4.4. Bignums | ||||
Protocols using tag values 2 and 3 extend the generic data model | Protocols using tag values 2 and 3 extend the generic data model | |||
(Section 2) with "bignums" representing arbitrary integers. In the | (Section 2) with "bignums" representing arbitrary integers. In the | |||
generic data model, bignum values are not equal to integers from the | generic data model, bignum values are not equal to integers from the | |||
basic data model, but specific data models can define that | basic data model, but specific data models can define that | |||
equivalence. | equivalence. | |||
Bignums are encoded as a byte string data item, which is interpreted | Bignums are encoded as a byte string data item, which is interpreted | |||
as an unsigned integer n in network byte order. For tag value 2, the | as an unsigned integer n in network byte order. For tag value 2, the | |||
value of the bignum is n. For tag value 3, the value of the bignum | value of the bignum is n. For tag value 3, the value of the bignum | |||
skipping to change at page 19, line 9 ¶ | skipping to change at page 20, line 5 ¶ | |||
For example, the number 18446744073709551616 (2**64) is represented | For example, the number 18446744073709551616 (2**64) is represented | |||
as 0b110_00010 (major type 6, tag 2), followed by 0b010_01001 (major | as 0b110_00010 (major type 6, tag 2), followed by 0b010_01001 (major | |||
type 2, length 9), followed by 0x010000000000000000 (one byte 0x01 | type 2, length 9), followed by 0x010000000000000000 (one byte 0x01 | |||
and eight bytes 0x00). In hexadecimal: | and eight bytes 0x00). In hexadecimal: | |||
C2 -- Tag 2 | C2 -- Tag 2 | |||
49 -- Byte string of length 9 | 49 -- Byte string of length 9 | |||
010000000000000000 -- Bytes content | 010000000000000000 -- Bytes content | |||
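The example above can be reproduced with a short sketch. The helper names are hypothetical. For negative values the sketch assumes the Tag 3 rule from RFC 7049, where the bignum value is -1 - n. Whether leading zero bytes should be stripped is not stated in the text shown here; the sketch emits none.

```python
def encode_head(major_type, argument):
    """Encode an initial byte plus the shortest argument (hypothetical helper)."""
    if argument < 24:
        return bytes([(major_type << 5) | argument])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < 1 << (8 * size):
            return bytes([(major_type << 5) | info]) + argument.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def encode_bignum(value):
    """Encode an arbitrary integer as a Tag 2 or Tag 3 bignum."""
    if value >= 0:
        tag, n = 0xC2, value          # Tag 2: value is n
    else:
        tag, n = 0xC3, -1 - value     # Tag 3: value is -1 - n (per RFC 7049)
    # n as an unsigned integer in network byte order, no leading zero bytes.
    content = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([tag]) + encode_head(2, len(content)) + content


print(encode_bignum(2**64).hex())  # c249010000000000000000
```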
3.4.3. Decimal Fractions and Bigfloats | 3.4.5. Decimal Fractions and Bigfloats | |||
Protocols using tag value 4 extend the generic data model with data | Protocols using tag value 4 extend the generic data model with data | |||
items representing arbitrary-length decimal fractions m*(10**e). | items representing arbitrary-length decimal fractions m*(10**e). | |||
Protocols using tag value 5 extend the generic data model with data | Protocols using tag value 5 extend the generic data model with data | |||
items representing arbitrary-length binary fractions m*(2**e). As | items representing arbitrary-length binary fractions m*(2**e). As | |||
with bignums, values of different types are not equal in the generic | with bignums, values of different types are not equal in the generic | |||
data model. | data model. | |||
Decimal fractions combine an integer mantissa with a base-10 scaling | Decimal fractions combine an integer mantissa with a base-10 scaling | |||
factor. They are most useful if an application needs the exact | factor. They are most useful if an application needs the exact | |||
skipping to change at page 19, line 37 ¶ | skipping to change at page 20, line 33 ¶ | |||
(Section 3.3). Bigfloats may also be used by constrained | (Section 3.3). Bigfloats may also be used by constrained | |||
applications that need some basic binary floating-point capability | applications that need some basic binary floating-point capability | |||
without the need for supporting IEEE 754. | without the need for supporting IEEE 754. | |||
A decimal fraction or a bigfloat is represented as a tagged array | A decimal fraction or a bigfloat is represented as a tagged array | |||
that contains exactly two integer numbers: an exponent e and a | that contains exactly two integer numbers: an exponent e and a | |||
mantissa m. Decimal fractions (tag 4) use base-10 exponents; the | mantissa m. Decimal fractions (tag 4) use base-10 exponents; the | |||
value of a decimal fraction data item is m*(10**e). Bigfloats (tag | value of a decimal fraction data item is m*(10**e). Bigfloats (tag | |||
5) use base-2 exponents; the value of a bigfloat data item is | 5) use base-2 exponents; the value of a bigfloat data item is | |||
m*(2**e). The exponent e MUST be represented in an integer of major | m*(2**e). The exponent e MUST be represented in an integer of major | |||
type 0 or 1, while the mantissa also can be a bignum (Section 3.4.2). | type 0 or 1, while the mantissa also can be a bignum (Section 3.4.4). | |||
An example of a decimal fraction is that the number 273.15 could be | An example of a decimal fraction is that the number 273.15 could be | |||
represented as 0b110_00100 (major type of 6 for the tag, additional | represented as 0b110_00100 (major type of 6 for the tag, additional | |||
information of 4 for the type of tag), followed by 0b100_00010 (major | information of 4 for the type of tag), followed by 0b100_00010 (major | |||
type of 4 for the array, additional information of 2 for the length | type of 4 for the array, additional information of 2 for the length | |||
of the array), followed by 0b001_00001 (major type of 1 for the first | of the array), followed by 0b001_00001 (major type of 1 for the first | |||
integer, additional information of 1 for the value of -2), followed | integer, additional information of 1 for the value of -2), followed | |||
by 0b000_11001 (major type of 0 for the second integer, additional | by 0b000_11001 (major type of 0 for the second integer, additional | |||
information of 25 for a two-byte value), followed by | information of 25 for a two-byte value), followed by | |||
0b0110101010110011 (27315 in two bytes). In hexadecimal: | 0b0110101010110011 (27315 in two bytes). In hexadecimal: | |||
skipping to change at page 20, line 29 ¶ | skipping to change at page 21, line 25 ¶ | |||
Decimal fractions and bigfloats provide no representation of | Decimal fractions and bigfloats provide no representation of | |||
Infinity, -Infinity, or NaN; if these are needed in place of a | Infinity, -Infinity, or NaN; if these are needed in place of a | |||
decimal fraction or bigfloat, the IEEE 754 half-precision | decimal fraction or bigfloat, the IEEE 754 half-precision | |||
representations from Section 3.3 can be used. For constrained | representations from Section 3.3 can be used. For constrained | |||
applications, where there is a choice between representing a specific | applications, where there is a choice between representing a specific | |||
number as an integer and as a decimal fraction or bigfloat (such as | number as an integer and as a decimal fraction or bigfloat (such as | |||
when the exponent is small and non-negative), there is a quality-of- | when the exponent is small and non-negative), there is a quality-of- | |||
implementation expectation that the integer representation is used | implementation expectation that the integer representation is used | |||
directly. | directly. | |||
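As an illustration of the decimal fraction layout described in this section (Tag 4 wrapping a two-element array of exponent e and mantissa m), the following sketch encodes a decimal string. The function names are hypothetical. The sketch supports only mantissas that fit in major type 0 or 1 and raises an error where a bignum mantissa would be needed.

```python
from decimal import Decimal


def encode_int(value):
    """Encode an integer in major type 0 or 1 with the shortest argument."""
    major, arg = (0, value) if value >= 0 else (1, -1 - value)
    if arg < 24:
        return bytes([(major << 5) | arg])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < 1 << (8 * size):
            return bytes([(major << 5) | info]) + arg.to_bytes(size, "big")
    raise ValueError("value needs a bignum, which this sketch omits")


def encode_decimal_fraction(text):
    """Encode a decimal string such as "273.15" as Tag 4 [e, m]."""
    sign, digits, exponent = Decimal(text).as_tuple()
    if not isinstance(exponent, int):
        # Infinity and NaN have no decimal fraction representation.
        raise ValueError("not a finite decimal")
    mantissa = int("".join(str(d) for d in digits))
    if sign:
        mantissa = -mantissa
    # 0xC4 = Tag 4, 0x82 = array of length 2, then e, then m.
    return b"\xc4\x82" + encode_int(exponent) + encode_int(mantissa)


print(encode_decimal_fraction("273.15").hex())  # c48221196ab3
```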
3.4.4. Content Hints | 3.4.6. Content Hints | |||
The tags in this section are for content hints that might be used by | The tags in this section are for content hints that might be used by | |||
generic CBOR processors. These content hints do not extend the | generic CBOR processors. These content hints do not extend the | |||
generic data model. | generic data model. | |||
3.4.4.1. Encoded CBOR Data Item | 3.4.6.1. Encoded CBOR Data Item | |||
Sometimes it is beneficial to carry an embedded CBOR data item that | Sometimes it is beneficial to carry an embedded CBOR data item that | |||
is not meant to be decoded immediately at the time the enclosing data | is not meant to be decoded immediately at the time the enclosing data | |||
item is being parsed. Tag 24 (CBOR data item) can be used to tag the | item is being parsed. Tag 24 (CBOR data item) can be used to tag the | |||
embedded byte string as a data item encoded in CBOR format. | embedded byte string as a data item encoded in CBOR format. | |||
3.4.4.2. Expected Later Encoding for CBOR-to-JSON Converters | 3.4.6.2. Expected Later Encoding for CBOR-to-JSON Converters | |||
Tags 21 to 23 indicate that a byte string might require a specific | Tags 21 to 23 indicate that a byte string might require a specific | |||
encoding when interoperating with a text-based representation. These | encoding when interoperating with a text-based representation. These | |||
tags are useful when an encoder knows that the byte string data it is | tags are useful when an encoder knows that the byte string data it is | |||
writing is likely to be later converted to a particular JSON-based | writing is likely to be later converted to a particular JSON-based | |||
usage. That usage specifies that some strings are encoded as base64, | usage. That usage specifies that some strings are encoded as base64, | |||
base64url, and so on. The encoder uses byte strings instead of doing | base64url, and so on. The encoder uses byte strings instead of doing | |||
the encoding itself to reduce the message size, to reduce the code | the encoding itself to reduce the message size, to reduce the code | |||
size of the encoder, or both. The encoder does not know whether or | size of the encoder, or both. The encoder does not know whether or | |||
not the converter will be generic, and therefore wants to say what it | not the converter will be generic, and therefore wants to say what it | |||
skipping to change at page 21, line 19 ¶ | skipping to change at page 22, line 17 ¶ | |||
contained in the data item, except for those contained in a nested | contained in the data item, except for those contained in a nested | |||
data item tagged with an expected conversion. | data item tagged with an expected conversion. | |||
These three tag types suggest conversions to three of the base data | These three tag types suggest conversions to three of the base data | |||
encodings defined in [RFC4648]. For base64url encoding, padding is | encodings defined in [RFC4648]. For base64url encoding, padding is | |||
not used (see Section 3.2 of RFC 4648); that is, all trailing equals | not used (see Section 3.2 of RFC 4648); that is, all trailing equals | |||
signs ("=") are removed from the base64url-encoded string. Later | signs ("=") are removed from the base64url-encoded string. Later | |||
tags might be defined for other data encodings of RFC 4648 or for | tags might be defined for other data encodings of RFC 4648 or for | |||
other ways to encode binary data in strings. | other ways to encode binary data in strings. | |||
3.4.4.3. Encoded Text | 3.4.6.3. Encoded Text | |||
Some text strings hold data that have formats widely used on the | Some text strings hold data that have formats widely used on the | |||
Internet, and sometimes those formats can be validated and presented | Internet, and sometimes those formats can be validated and presented | |||
to the application in appropriate form by the decoder. There are | to the application in appropriate form by the decoder. There are | |||
tags for some of these formats. | tags for some of these formats. | |||
o Tag 32 is for URIs, as defined in [RFC3986]; | o Tag 32 is for URIs, as defined in [RFC3986]; | |||
o Tags 33 and 34 are for base64url- and base64-encoded text strings, | o Tags 33 and 34 are for base64url- and base64-encoded text strings, | |||
as defined in [RFC4648]; | as defined in [RFC4648]; | |||
skipping to change at page 21, line 46 ¶ | skipping to change at page 22, line 44 ¶ | |||
expression, or more than just the text of the regular expression | expression, or more than just the text of the regular expression | |||
itself, need to be conveyed.) | itself, need to be conveyed.) | |||
o Tag 36 is for MIME messages (including all headers), as defined in | o Tag 36 is for MIME messages (including all headers), as defined in | |||
[RFC2045]; | [RFC2045]; | |||
Note that tags 33 and 34 differ from 21 and 22 in that the data is | Note that tags 33 and 34 differ from 21 and 22 in that the data is | |||
transported in base-encoded form for the former and in raw byte | transported in base-encoded form for the former and in raw byte | |||
string form for the latter. | string form for the latter. | |||
3.4.5. Self-Describe CBOR | 3.4.7. Self-Describe CBOR | |||
In many applications, it will be clear from the context that CBOR is | In many applications, it will be clear from the context that CBOR is | |||
being employed for encoding a data item. For instance, a specific | being employed for encoding a data item. For instance, a specific | |||
protocol might specify the use of CBOR, or a media type is indicated | protocol might specify the use of CBOR, or a media type is indicated | |||
that specifies its use. However, there may be applications where | that specifies its use. However, there may be applications where | |||
such context information is not available, such as when CBOR data is | such context information is not available, such as when CBOR data is | |||
stored in a file and disambiguating metadata is not in use. Here, it | stored in a file and disambiguating metadata is not in use. Here, it | |||
may help to have some distinguishing characteristics for the data | may help to have some distinguishing characteristics for the data | |||
itself. | itself. | |||
skipping to change at page 27, line 12 ¶ | skipping to change at page 28, line 12 ¶ | |||
specific integer encodings that are longer than necessary for the | specific integer encodings that are longer than necessary for the | |||
application, such as to save the need to implement 64-bit integers. | application, such as to save the need to implement 64-bit integers. | |||
There is an expectation that encoders will use the most compact | There is an expectation that encoders will use the most compact | |||
integer representation that can represent a given value. However, a | integer representation that can represent a given value. However, a | |||
compact application should accept values that use a longer-than- | compact application should accept values that use a longer-than- | |||
needed encoding (such as encoding "0" as 0b000_11001 followed by two | needed encoding (such as encoding "0" as 0b000_11001 followed by two | |||
bytes of 0x00) as long as the application can decode an integer of | bytes of 0x00) as long as the application can decode an integer of | |||
the given size. | the given size. | |||
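A decoder that tolerates longer-than-needed integer encodings might read the initial byte and argument roughly as sketched below. The name `decode_head` and its return shape are assumptions made for illustration.

```python
def decode_head(data):
    """Return (major_type, argument, bytes_consumed) for one initial byte.

    Accepts any argument width, so 0x19 0x00 0x00 decodes to the value 0
    even though the single byte 0x00 would be the shortest form.
    """
    initial = data[0]
    major_type, info = initial >> 5, initial & 0x1F
    if info < 24:
        return major_type, info, 1
    if info <= 27:
        size = 1 << (info - 24)  # 1, 2, 4, or 8 following bytes
        if len(data) < 1 + size:
            raise ValueError("truncated argument")
        return major_type, int.from_bytes(data[1:1 + size], "big"), 1 + size
    raise ValueError("additional information 28..31 is not handled in this sketch")


print(decode_head(bytes.fromhex("190000")))  # (0, 0, 3)
print(decode_head(bytes.fromhex("00")))      # (0, 0, 1)
```

A constrained decoder could reject `info == 27` at this point if the application never needs 64-bit integers.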
The preferred encoding for a floating point value is the shortest | ||||
floating point encoding that preserves its value, e.g., 0xf94580 for | ||||
the number 5.5, and 0xfa45ad9c00 for the number 5555.5, unless the | ||||
CBOR-based protocol specifically excludes the use of the shorter | ||||
floating point encodings. For NaN values, a shorter encoding is | ||||
preferred if zero-padding the shorter significand towards the right | ||||
reconstitutes the original NaN value (for many applications, the | ||||
single NaN encoding 0xf97e00 will suffice). | ||||
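The shortest-float rule can be sketched with Python's `struct` module by trying binary16, then binary32, and falling back to binary64. The function name is hypothetical. As a simplifying assumption, every NaN is emitted as the single encoding 0xf97e00, so NaN payloads are not preserved.

```python
import math
import struct


def encode_float_preferred(value):
    """Encode a float in the shortest format that preserves its value."""
    if math.isnan(value):
        # Assumption: one NaN encoding is enough for the application.
        return bytes.fromhex("f97e00")
    for initial_byte, fmt in ((0xF9, ">e"), (0xFA, ">f")):
        try:
            packed = struct.pack(fmt, value)
        except OverflowError:
            continue  # magnitude too large for this width
        if struct.unpack(fmt, packed)[0] == value:
            return bytes([initial_byte]) + packed
    return b"\xfb" + struct.pack(">d", value)


print(encode_float_preferred(5.5).hex())     # f94580
print(encode_float_preferred(5555.5).hex())  # fa45ad9c00
```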
4.7. Specifying Keys for Maps | 4.7. Specifying Keys for Maps | |||
The encoding and decoding applications need to agree on what types of | The encoding and decoding applications need to agree on what types of | |||
keys are going to be used in maps. In applications that need to | keys are going to be used in maps. In applications that need to | |||
interwork with JSON-based applications, keys probably should be | interwork with JSON-based applications, keys probably should be | |||
limited to UTF-8 strings only; otherwise, there has to be a specified | limited to UTF-8 strings only; otherwise, there has to be a specified | |||
mapping from the other CBOR types to Unicode characters, and this | mapping from the other CBOR types to Unicode characters, and this | |||
often leads to implementation errors. In applications where keys are | often leads to implementation errors. In applications where keys are | |||
numeric in nature and numeric ordering of keys is important to the | numeric in nature and numeric ordering of keys is important to the | |||
application, directly using the numbers for the keys is useful. | application, directly using the numbers for the keys is useful. | |||
skipping to change at page 27, line 47 ¶ | skipping to change at page 29, line 7 ¶ | |||
source to maintain uniqueness. | source to maintain uniqueness. | |||
A CBOR-based protocol should make an intentional decision about what | A CBOR-based protocol should make an intentional decision about what | |||
to do when a receiving application does see multiple identical keys | to do when a receiving application does see multiple identical keys | |||
in a map. The resulting rule in the protocol should respect the CBOR | in a map. The resulting rule in the protocol should respect the CBOR | |||
data model: it cannot prescribe a specific handling of the entries | data model: it cannot prescribe a specific handling of the entries | |||
with the identical keys, except that it might have a rule that having | with the identical keys, except that it might have a rule that having | |||
identical keys in a map indicates a malformed map and that the | identical keys in a map indicates a malformed map and that the | |||
decoder has to stop with an error. Duplicate keys are also | decoder has to stop with an error. Duplicate keys are also | |||
prohibited by CBOR decoders that are using strict mode | prohibited by CBOR decoders that are using strict mode | |||
(Section 4.10). | (Section 4.11). | |||
The CBOR data model for maps does not allow ascribing semantics to | The CBOR data model for maps does not allow ascribing semantics to | |||
the order of the key/value pairs in the map representation. Thus, a | the order of the key/value pairs in the map representation. Thus, a | |||
CBOR-based protocol MUST NOT specify that changing the key/value pair | CBOR-based protocol MUST NOT specify that changing the key/value pair | |||
order in a map would change the semantics, except to specify that | order in a map would change the semantics, except to specify that | |||
some, e.g., non-canonical, orders are disallowed. Timing, cache | some, e.g., non-canonical, orders are disallowed. Timing, cache | |||
usage, and other side channels are not considered part of the | usage, and other side channels are not considered part of the | |||
semantics. | semantics. | |||
Applications for constrained devices that have maps with 24 or fewer | Applications for constrained devices that have maps with 24 or fewer | |||
frequently used keys should consider using small integers (and those | frequently used keys should consider using small integers (and those | |||
with up to 48 frequently used keys should consider also using small | with up to 48 frequently used keys should consider also using small | |||
negative integers) because the keys can then be encoded in a single | negative integers) because the keys can then be encoded in a single | |||
byte. | byte. | |||
4.7.1. Equivalence of Keys | 4.7.1. Equivalence of Keys | |||
This notion of equivalence must be used to determine whether keys in | The specific data model applying to a CBOR data item is used to | |||
maps are duplicates or distinct. | determine whether keys occurring in maps are duplicates or distinct. | |||
o All numbers are compared by their numeric value. | ||||
* Integer data items with the same value are equal regardless of | ||||
how many bytes are used to encode them. | ||||
* Floating point data items with the same value are equal | ||||
regardless of how many bytes are used to encode them. | ||||
* An integer value encoded as a floating point data item is | ||||
equivalent to the same value encoded as an integer | ||||
o Byte strings and text strings are compared by their binary | ||||
content. | ||||
* A different length encoding has no effect on equivalence. | ||||
* A byte string is equal to a text string if they have the same | ||||
binary content. | ||||
o Two arrays are equal if all their items are in the same order and | At the generic data model level, numerically equivalent integer and | |||
equal. | floating point values are distinct from each other, as they are from | |||
the various big numbers (Tags 2 to 5). Similarly, text strings are | ||||
distinct from byte strings, even if composed of the same bytes. A | ||||
tagged value is distinct from an untagged value or from a value | ||||
tagged with a different tag. | ||||
o Two maps are equal if they have the same set of pairs regardless | Within each of these groups, numeric values are distinct unless they | |||
of their order; pairs are equal if both the key and value are | are numerically equal (specifically, -0.0 is equal to 0.0); for the | |||
equal. | purpose of map key equivalence, NaN (not a number) values are | |||
equivalent if they have the same significand after zero-extending | ||||
both significands at the right to 64 bits. | ||||
o Tags have no effect in determining equality of a data item, if two | (Byte and text) strings are compared byte by byte, arrays element by | |||
items are equal then they are equal irrespective of any tags that | element, and are equal if they have the same number of bytes/elements | |||
either or both may have. | and the same values at the same positions. Two maps are equal if | |||
they have the same set of pairs regardless of their order; pairs are | ||||
equal if both the key and value are equal. | ||||
o Simple values are equal if they simply have the same value. | Tagged values are equal if both the tag and the value are equal. | |||
Simple values are equal if they simply have the same value. Nothing | ||||
else is equal in the generic data model; a simple value 2 is not | ||||
equivalent to an integer 2 and an array is never equivalent to a map. | ||||
Nothing else is equal, a simple value 2 is not equivalent to an | As discussed in Section 2.2, specific data models can make values | |||
integer 2 and an array cannot be equivalent to a map with the same | equivalent for the purpose of comparing map keys that are distinct in | |||
values and sequential integer keys. | the generic data model. Note that this implies that a generic | |||
decoder may deliver a decoded map to an application that needs to be | ||||
checked for duplicate map keys by that application (alternatively, | ||||
the decoder may provide a programming interface to perform this | ||||
service for the application). Specific data models cannot | ||||
distinguish values for map keys that are equal for this purpose at | ||||
the generic data model level. | ||||
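One way an application or decoder interface might check for duplicate keys at the generic data model level is sketched below. The `Tagged` class and the functions `key_identity` and `has_duplicate_keys` are hypothetical. The sketch assumes decoded items are mapped to Python `int`, `float`, `bytes`, `str`, `list`, and `dict`, and that NaN values arrive already widened to binary64. Simple values other than false, true, and null are not modeled.

```python
import math
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class Tagged:
    """Hypothetical container for a tag number and its tagged value."""
    tag: int
    value: object


def key_identity(item):
    """Map a decoded item to a hashable value; equal results mean equal keys."""
    if item is None or isinstance(item, bool):
        return ("simple", item)
    if isinstance(item, int):
        return ("int", item)               # distinct from ("float", ...)
    if isinstance(item, float):
        if math.isnan(item):
            bits = struct.unpack(">Q", struct.pack(">d", item))[0]
            return ("nan", bits & ((1 << 52) - 1))  # compare significands
        return ("float", item)             # -0.0 == 0.0 in Python
    if isinstance(item, bytes):
        return ("bytes", item)
    if isinstance(item, str):
        return ("text", item)
    if isinstance(item, Tagged):
        return ("tag", item.tag, key_identity(item.value))
    if isinstance(item, (list, tuple)):
        return ("array", tuple(key_identity(x) for x in item))
    if isinstance(item, dict):
        return ("map", frozenset((key_identity(k), key_identity(v))
                                 for k, v in item.items()))
    raise TypeError("unsupported item type")


def has_duplicate_keys(pairs):
    """Check a decoded map, given as (key, value) pairs, for duplicate keys."""
    seen = set()
    for key, _ in pairs:
        identity = key_identity(key)
        if identity in seen:
            return True
        seen.add(identity)
    return False


print(has_duplicate_keys([(1, "a"), (1.0, "b")]))     # False
print(has_duplicate_keys([(0.0, "a"), (-0.0, "b")]))  # True
```

A specific data model that treats 1 and 1.0 as the same key would need an additional normalization step before this check.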
4.8. Undefined Values | 4.8. Undefined Values | |||
In some CBOR-based protocols, the simple value (Section 3.3) of | In some CBOR-based protocols, the simple value (Section 3.3) of | |||
Undefined might be used by an encoder as a substitute for a data item | Undefined might be used by an encoder as a substitute for a data item | |||
with an encoding problem, in order to allow the rest of the enclosing | with an encoding problem, in order to allow the rest of the enclosing | |||
data items to be encoded without harm. | data items to be encoded without harm. | |||
4.9. Canonical CBOR | 4.9. Preferred Serialization | |||
For some values at the data model level, CBOR provides multiple | ||||
serializations. For many applications, it is desirable that an | ||||
encoder always chooses a preferred serialization; however, the | ||||
present specification does not put the burden of enforcing this | ||||
preference on either encoder or decoder. | ||||
Some constrained decoders may be limited in their ability to decode | ||||
non-preferred serializations: For example, if only integers below | ||||
1_000_000_000 are expected in an application, the decoder may leave | ||||
out the code that would be needed to decode 64-bit arguments in | ||||
integers. An encoder that always uses preferred serialization | ||||
("preferred encoder") interoperates with this decoder for the numbers | ||||
that can occur in this application. More generally speaking, it | ||||
therefore can be said that a preferred encoder is more universally | ||||
interoperable (and also less wasteful) than one that, say, always | ||||
uses 64-bit integers. | ||||
Similarly, a constrained encoder may be limited in the variety of | ||||
representation variants it supports in such a way that it does not | ||||
emit preferred serializations ("variant encoder"): Say, it could be | ||||
designed to always use the 32-bit variant for an integer that it | ||||
encodes even if a short representation is available (again, assuming | ||||
that there is no application need for integers that can only be | ||||
represented with the 64-bit variant). A decoder that does not rely | ||||
on only ever receiving preferred serializations ("variation-tolerant | ||||
decoder") can therefore be said to be more universally interoperable (it | ||||
might very well optimize for the case of receiving preferred | ||||
serializations, though). Full implementations of CBOR decoders are | ||||
by definition variation-tolerant; the distinction is only relevant if | ||||
a constrained implementation of a CBOR decoder meets a variant | ||||
encoder. | ||||
The preferred serialization always uses the shortest form of | ||||
representing the argument (Section 3); it also uses the shortest | ||||
floating point encoding that preserves the value being encoded (see | ||||
Section 4.6). Definite length encoding is preferred whenever the | ||||
length is known at the time the serialization of the item starts. | ||||
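A "preferred encoder" for integers and definite-length byte strings could look roughly like the sketch below, which always picks the shortest argument form. The function names are hypothetical and chosen for illustration only.

```python
def encode_head(major_type, argument):
    """Encode the initial byte and the shortest form of the argument."""
    if argument < 24:
        return bytes([(major_type << 5) | argument])  # argument in initial byte
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < 1 << (8 * size):
            return bytes([(major_type << 5) | info]) + argument.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def encode_int(value):
    """Encode an integer using major type 0 (unsigned) or 1 (negative)."""
    if value >= 0:
        return encode_head(0, value)
    return encode_head(1, -1 - value)


def encode_bytes(content):
    """Encode a byte string with definite length, since the length is known."""
    return encode_head(2, len(content)) + content


for n in (10, 100, -1, 1_000_000_000):
    print(n, encode_int(n).hex())
```

A "variant encoder" would instead fix `info` at one width, for example always 26, which a variation-tolerant decoder still accepts.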
4.10. Canonical CBOR | ||||
Some protocols may want encoders to only emit CBOR in a particular | Some protocols may want encoders to only emit CBOR in a particular | |||
canonical format; those protocols might also have the decoders check | canonical format; those protocols might also have the decoders check | |||
that their input is canonical. Those protocols are free to define | that their input is canonical. Those protocols are free to define | |||
what they mean by a canonical format and what encoders and decoders | what they mean by a canonical format and what encoders and decoders | |||
are expected to do. This section defines a set of restrictions that | are expected to do. This section defines a set of restrictions that | |||
can serve as the base of such a canonical format. | can serve as the base of such a canonical format. | |||
A CBOR encoding satisfies the "core canonicalization requirements" if | A CBOR encoding satisfies the "core canonicalization requirements" if | |||
it satisfies the following restrictions: | it satisfies the following restrictions: | |||
o Integers MUST be as short as possible. In particular: | o Arguments (see Section 3) for integers, lengths in major types 2 | |||
through 5, and tags MUST be as short as possible. In particular: | ||||
* 0 to 23 and -1 to -24 MUST be expressed in the same byte as the | * 0 to 23 and -1 to -24 MUST be expressed in the same byte as the | |||
major type; | major type; | |||
* 24 to 255 and -25 to -256 MUST be expressed only with an | * 24 to 255 and -25 to -256 MUST be expressed only with an | |||
additional uint8_t; | additional uint8_t; | |||
* 256 to 65535 and -257 to -65536 MUST be expressed only with an | * 256 to 65535 and -257 to -65536 MUST be expressed only with an | |||
additional uint16_t; | additional uint16_t; | |||
* 65536 to 4294967295 and -65537 to -4294967296 MUST be expressed | * 65536 to 4294967295 and -65537 to -4294967296 MUST be expressed | |||
only with an additional uint32_t. | only with an additional uint32_t. | |||
o The expression of lengths in major types 2 through 5 MUST be as | ||||
short as possible. The rules for these lengths follow the above | ||||
rule for integers. | ||||
o The keys in every map MUST be sorted in the bytewise lexicographic | o The keys in every map MUST be sorted in the bytewise lexicographic | |||
order of their canonical encodings. For example, the following | order of their canonical encodings. For example, the following | |||
keys are sorted correctly: | keys are sorted correctly: | |||
1. 10, encoded as 0x0a. | 1. 10, encoded as 0x0a. | |||
2. 100, encoded as 0x1864. | 2. 100, encoded as 0x1864. | |||
3. -1, encoded as 0x20. | 3. -1, encoded as 0x20. | |||
skipping to change at page 31, line 17 ¶ | skipping to change at page 33, line 15 ¶ | |||
2. Encode all values as the smallest of 16-, 32-, or 64-bit | 2. Encode all values as the smallest of 16-, 32-, or 64-bit | |||
floating point that accurately represents the value, even for | floating point that accurately represents the value, even for | |||
integral values, or | integral values, or | |||
3. Encode all values as 64-bit floating point. | 3. Encode all values as 64-bit floating point. | |||
If NaN is an allowed value, the protocol needs to pick a single | If NaN is an allowed value, the protocol needs to pick a single | |||
representation, for example 0xf97e00. | representation, for example 0xf97e00. | |||
o If a protocol includes a field that can express integers larger | o If a protocol includes a field that can express integers larger | |||
than 2^64 using tag 2 (Section 3.4.2), the protocol's | than 2^64 using tag 2 (Section 3.4.4), the protocol's | |||
canonicalization needs to specify whether small integers are | canonicalization needs to specify whether small integers are | |||
expressed using the tag or major types 0 and 1. | expressed using the tag or major types 0 and 1. | |||
o A protocol might give encoders the choice of representing a URL as | o A protocol might give encoders the choice of representing a URL as | |||
either a text string or, using Section 3.4.4.3, tag 32 containing | either a text string or, using Section 3.4.6.3, tag 32 containing | |||
a text string. This protocol's canonicalization needs to either | a text string. This protocol's canonicalization needs to either | |||
require that the tag is present or require that it's absent, not | require that the tag is present or require that it's absent, not | |||
allow either one. | allow either one. | |||
4.9.1. Length-first map key ordering | 4.10.1. Length-first map key ordering | |||
The core canonicalization requirements sort map keys in a different | The core canonicalization requirements sort map keys in a different | |||
order from the one suggested by [RFC7049]. Protocols that need to be | order from the one suggested by [RFC7049]. Protocols that need to be | |||
compatible with [RFC7049]'s order can instead be specified in terms | compatible with [RFC7049]'s order can instead be specified in terms | |||
of this specification's "length-first core canonicalization | of this specification's "length-first core canonicalization | |||
requirements": | requirements": | |||
A CBOR encoding satisfies the "length-first core canonicalization | A CBOR encoding satisfies the "length-first core canonicalization | |||
requirements" if it satisfies the core canonicalization requirements | requirements" if it satisfies the core canonicalization requirements | |||
except that the keys in every map MUST be sorted such that: | except that the keys in every map MUST be sorted such that: | |||
skipping to change at page 32, line 17 ¶ | skipping to change at page 34, line 13 ¶ | |||
4. 100, encoded as 0x1864. | 4. 100, encoded as 0x1864. | |||
5. "z", encoded as 0x617a. | 5. "z", encoded as 0x617a. | |||
6. [-1], encoded as 0x8120. | 6. [-1], encoded as 0x8120. | |||
7. "aa", encoded as 0x626161. | 7. "aa", encoded as 0x626161. | |||
8. [100], encoded as 0x811864. | 8. [100], encoded as 0x811864. | |||
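The two key orderings can be contrasted with a pair of comparison functions. This is a minimal sketch, assuming each key is already available as its canonical encoding; the struct enc_key type and the function names are hypothetical and chosen for illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical holder for one canonically encoded map key. */
struct enc_key {
    const unsigned char *bytes;
    size_t len;
};

/* Core requirements: bytewise lexicographic order of the encodings. */
int cmp_bytewise(const void *a, const void *b) {
    const struct enc_key *x = a, *y = b;
    size_t n = x->len < y->len ? x->len : y->len;
    int c = memcmp(x->bytes, y->bytes, n);
    if (c != 0) return c;
    /* Equal up to the shorter length: the shorter encoding sorts first. */
    return (x->len > y->len) - (x->len < y->len);
}

/* Length-first variant: shorter encodings sort first; encodings of
 * equal length are compared bytewise. */
int cmp_length_first(const void *a, const void *b) {
    const struct enc_key *x = a, *y = b;
    if (x->len != y->len) return x->len < y->len ? -1 : 1;
    return memcmp(x->bytes, y->bytes, x->len);
}
```

Both functions have the signature expected by qsort(). With the keys 100 (0x1864) and -1 (0x20), cmp_bytewise places 100 first because 0x18 is less than 0x20, whereas cmp_length_first places -1 first because its encoding is shorter.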
4.10. Strict Mode | 4.11. Strict Mode | |||
Some areas of application of CBOR do not require canonicalization | Some areas of application of CBOR do not require canonicalization | |||
(Section 4.9) but may require that different decoders reach the same | (Section 4.10) but may require that different decoders reach the same | |||
(semantically equivalent) results, even in the presence of | (semantically equivalent) results, even in the presence of | |||
potentially malicious data. This can be required if one application | potentially malicious data. This can be required if one application | |||
(such as a firewall or other protecting entity) makes a decision | (such as a firewall or other protecting entity) makes a decision | |||
based on the data that another application, which independently | based on the data that another application, which independently | |||
decodes the data, relies on. | decodes the data, relies on. | |||
Normally, it is the responsibility of the sender to avoid ambiguously | Normally, it is the responsibility of the sender to avoid ambiguously | |||
decodable data. However, the sender might be an attacker specially | decodable data. However, the sender might be an attacker specially | |||
making up CBOR data such that it will be interpreted differently by | making up CBOR data such that it will be interpreted differently by | |||
different decoders in an attempt to exploit that as a vulnerability. | different decoders in an attempt to exploit that as a vulnerability. | |||
skipping to change at page 42, line 38 ¶ | skipping to change at page 44, line 32 ¶ | |||
Applications where a CBOR data item is examined by a gatekeeper | Applications where a CBOR data item is examined by a gatekeeper | |||
function and later used by a different application may exhibit | function and later used by a different application may exhibit | |||
vulnerabilities when multiple interpretations of the data item are | vulnerabilities when multiple interpretations of the data item are | |||
possible. For example, an attacker could make use of duplicate keys | possible. For example, an attacker could make use of duplicate keys | |||
in maps and precision issues in numbers to make the gatekeeper base | in maps and precision issues in numbers to make the gatekeeper base | |||
its decisions on a different interpretation than the one that will be | its decisions on a different interpretation than the one that will be | |||
used by the second application. Protocols that are used in a | used by the second application. Protocols that are used in a | |||
security context should be defined in such a way that these multiple | security context should be defined in such a way that these multiple | |||
interpretations are reliably reduced to a single one. To facilitate | interpretations are reliably reduced to a single one. To facilitate | |||
this, encoder and decoder implementations used in such contexts | this, encoder and decoder implementations used in such contexts | |||
should provide at least one strict mode of operation (Section 4.10). | should provide at least one strict mode of operation (Section 4.11). | |||
10. Acknowledgements | 10. Acknowledgements | |||
CBOR was inspired by MessagePack. MessagePack was developed and | CBOR was inspired by MessagePack. MessagePack was developed and | |||
promoted by Sadayuki Furuhashi ("frsyuki"). This reference to | promoted by Sadayuki Furuhashi ("frsyuki"). This reference to | |||
MessagePack is solely for attribution; CBOR is not intended as a | MessagePack is solely for attribution; CBOR is not intended as a | |||
version of or replacement for MessagePack, as it has different design | version of or replacement for MessagePack, as it has different design | |||
goals and requirements. | goals and requirements. | |||
The need for functionality beyond the original MessagePack | The need for functionality beyond the original MessagePack | |||
skipping to change at page 43, line 13 ¶ | skipping to change at page 45, line 6 ¶ | |||
MessagePack that was developed by Eric Zhang for the binaryjs | MessagePack that was developed by Eric Zhang for the binaryjs | |||
project. A similar, but different, extension was made by Tim Caswell | project. A similar, but different, extension was made by Tim Caswell | |||
for his msgpack-js and msgpack-js-browser projects. Many people have | for his msgpack-js and msgpack-js-browser projects. Many people have | |||
contributed to the recent discussion about extending MessagePack to | contributed to the recent discussion about extending MessagePack to | |||
separate text string representation from byte string representation. | separate text string representation from byte string representation. | |||
The encoding of the additional information in CBOR was inspired by | The encoding of the additional information in CBOR was inspired by | |||
the encoding of length information designed by Klaus Hartke for CoAP. | the encoding of length information designed by Klaus Hartke for CoAP. | |||
This document also incorporates suggestions made by many people, | This document also incorporates suggestions made by many people, | |||
notably Dan Frost, James Manger, Joe Hildebrand, Keith Moore, Matthew | notably Dan Frost, James Manger, Joe Hildebrand, Keith Moore, | |||
Lepinski, Nico Williams, Phillip Hallam-Baker, Ray Polk, Tim Bray, | Laurence Lundblade, Matthew Lepinski, Michael Richardson, Nico | |||
Tony Finch, Tony Hansen, and Yaron Sheffer. | Williams, Phillip Hallam-Baker, Ray Polk, Tim Bray, Tony Finch, Tony | |||
Hansen, and Yaron Sheffer. | ||||
11. References | 11. References | |||
11.1. Normative References | 11.1. Normative References | |||
[ECMA262] Ecma International, "ECMAScript 2018 Language | [ECMA262] Ecma International, "ECMAScript 2018 Language | |||
Specification", ECMA Standard ECMA-262, 9th Edition, June | Specification", ECMA Standard ECMA-262, 9th Edition, June | |||
2018, <https://www.ecma- | 2018, <https://www.ecma- | |||
international.org/publications/files/ECMA-ST/ | international.org/publications/files/ECMA-ST/ | |||
Ecma-262.pdf>. | Ecma-262.pdf>. | |||
[IEEE.754.2008] | ||||
Institute of Electrical and Electronics Engineers, "IEEE | ||||
Standard for Floating-Point Arithmetic", IEEE | ||||
Standard 754-2008, August 2008. | ||||
[RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail | [RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail | |||
Extensions (MIME) Part One: Format of Internet Message | Extensions (MIME) Part One: Format of Internet Message | |||
Bodies", RFC 2045, DOI 10.17487/RFC2045, November 1996, | Bodies", RFC 2045, DOI 10.17487/RFC2045, November 1996, | |||
<https://www.rfc-editor.org/info/rfc2045>. | <https://www.rfc-editor.org/info/rfc2045>. | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
<https://www.rfc-editor.org/info/rfc2119>. | <https://www.rfc-editor.org/info/rfc2119>. | |||
skipping to change at page 44, line 38 ¶ | skipping to change at page 46, line 34 ¶ | |||
Encoding Rules (BER), Canonical Encoding Rules (CER) and | Encoding Rules (BER), Canonical Encoding Rules (CER) and | |||
Distinguished Encoding Rules (DER)", ITU-T Recommendation | Distinguished Encoding Rules (DER)", ITU-T Recommendation | |||
X.690, 1994. | X.690, 1994. | |||
[BSON] Various, "BSON - Binary JSON", 2013, | [BSON] Various, "BSON - Binary JSON", 2013, | |||
<http://bsonspec.org/>. | <http://bsonspec.org/>. | |||
[MessagePack] | [MessagePack] | |||
Furuhashi, S., "MessagePack", 2013, <http://msgpack.org/>. | Furuhashi, S., "MessagePack", 2013, <http://msgpack.org/>. | |||
[PCRE] Hazel, P., "PCRE - Perl Compatible Regular Expressions", | [PCRE] Ho, A., "PCRE - Perl Compatible Regular Expressions", | |||
2018, <http://www.pcre.org/>. | 2018, <http://www.pcre.org/>. | |||
[RFC0713] Haverty, J., "MSDTP-Message Services Data Transmission | [RFC0713] Haverty, J., "MSDTP-Message Services Data Transmission | |||
Protocol", RFC 713, DOI 10.17487/RFC0713, April 1976, | Protocol", RFC 713, DOI 10.17487/RFC0713, April 1976, | |||
<https://www.rfc-editor.org/info/rfc713>. | <https://www.rfc-editor.org/info/rfc713>. | |||
[RFC6838] Freed, N., Klensin, J., and T. Hansen, "Media Type | [RFC6838] Freed, N., Klensin, J., and T. Hansen, "Media Type | |||
Specifications and Registration Procedures", BCP 13, | Specifications and Registration Procedures", BCP 13, | |||
RFC 6838, DOI 10.17487/RFC6838, January 2013, | RFC 6838, DOI 10.17487/RFC6838, January 2013, | |||
<https://www.rfc-editor.org/info/rfc6838>. | <https://www.rfc-editor.org/info/rfc6838>. | |||
skipping to change at page 45, line 15 ¶ | skipping to change at page 47, line 15 ¶ | |||
[RFC7228] Bormann, C., Ersue, M., and A. Keranen, "Terminology for | [RFC7228] Bormann, C., Ersue, M., and A. Keranen, "Terminology for | |||
Constrained-Node Networks", RFC 7228, | Constrained-Node Networks", RFC 7228, | |||
DOI 10.17487/RFC7228, May 2014, | DOI 10.17487/RFC7228, May 2014, | |||
<https://www.rfc-editor.org/info/rfc7228>. | <https://www.rfc-editor.org/info/rfc7228>. | |||
[RFC8259] Bray, T., Ed., "The JavaScript Object Notation (JSON) Data | [RFC8259] Bray, T., Ed., "The JavaScript Object Notation (JSON) Data | |||
Interchange Format", STD 90, RFC 8259, | Interchange Format", STD 90, RFC 8259, | |||
DOI 10.17487/RFC8259, December 2017, | DOI 10.17487/RFC8259, December 2017, | |||
<https://www.rfc-editor.org/info/rfc8259>. | <https://www.rfc-editor.org/info/rfc8259>. | |||
[UBJSON] The Buzz Media, "Universal Binary JSON Specification", | ||||
2013, <http://ubjson.org/>. | ||||
[YAML] Ben-Kiki, O., Evans, C., and I. Net, "YAML Ain't Markup | [YAML] Ben-Kiki, O., Evans, C., and I. Net, "YAML Ain't Markup | |||
Language (YAML[TM]) Version 1.2", 3rd Edition, October | Language (YAML[TM]) Version 1.2", 3rd Edition, October | |||
2009, <http://www.yaml.org/spec/1.2/spec.html>. | 2009, <http://www.yaml.org/spec/1.2/spec.html>. | |||
Appendix A. Examples | Appendix A. Examples | |||
The following table provides some CBOR-encoded values in hexadecimal | The following table provides some CBOR-encoded values in hexadecimal | |||
(right column), together with diagnostic notation for these values | (right column), together with diagnostic notation for these values | |||
(left column). Note that the string "\u00fc" is one form of | (left column). Note that the string "\u00fc" is one form of | |||
diagnostic notation for a UTF-8 string containing the single Unicode | diagnostic notation for a UTF-8 string containing the single Unicode | |||
skipping to change at page 52, line 15 ¶ | skipping to change at page 54, line 15 ¶ | |||
| | | | | | | | |||
| 0xba | map (four-byte uint32_t for n, and then n pairs of | | | 0xba | map (four-byte uint32_t for n, and then n pairs of | | |||
| | data items follow) | | | | data items follow) | | |||
| | | | | | | | |||
| 0xbb | map (eight-byte uint64_t for n, and then n pairs of | | | 0xbb | map (eight-byte uint64_t for n, and then n pairs of | | |||
| | data items follow) | | | | data items follow) | | |||
| | | | | | | | |||
| 0xbf | map, pairs of data items follow, terminated by | | | 0xbf | map, pairs of data items follow, terminated by | | |||
| | "break" | | | | "break" | | |||
| | | | | | | | |||
| 0xc0 | Text-based date/time (data item follows; see | | | 0xc0 | Text-based date/time (data item follows; see Section | | |||
| | Section 3.4.1) | | | | 3.4.2) | | |||
| | | | | | | | |||
| 0xc1 | Epoch-based date/time (data item follows; see | | | 0xc1 | Epoch-based date/time (data item follows; see | | |||
| | Section 3.4.1) | | | | Section 3.4.3) | | |||
| | | | | | | | |||
| 0xc2 | Positive bignum (data item "byte string" follows) | | | 0xc2 | Positive bignum (data item "byte string" follows) | | |||
| | | | | | | | |||
| 0xc3 | Negative bignum (data item "byte string" follows) | | | 0xc3 | Negative bignum (data item "byte string" follows) | | |||
| | | | | | | | |||
| 0xc4 | Decimal Fraction (data item "array" follows; see | | | 0xc4 | Decimal Fraction (data item "array" follows; see | | |||
| | Section 3.4.3) | | | | Section 3.4.5) | | |||
| | | | | | | | |||
| 0xc5 | Bigfloat (data item "array" follows; see | | | 0xc5 | Bigfloat (data item "array" follows; see Section | | |||
| | Section 3.4.3) | | | | 3.4.5) | | |||
| | | | | | | | |||
| 0xc6..0xd4 | (tagged item) | | | 0xc6..0xd4 | (tagged item) | | |||
| | | | | | | | |||
| 0xd5..0xd7 | Expected Conversion (data item follows; see | | | 0xd5..0xd7 | Expected Conversion (data item follows; see Section | | |||
| | Section 3.4.4.2) | | | | 3.4.6.2) | | |||
| | | | | | | | |||
| 0xd8..0xdb | (more tagged items, 1/2/4/8 bytes and then a data | | | 0xd8..0xdb | (more tagged items, 1/2/4/8 bytes and then a data | | |||
| | item follow) | | | | item follow) | | |||
| | | | | | | | |||
| 0xe0..0xf3 | (simple value) | | | 0xe0..0xf3 | (simple value) | | |||
| | | | | | | | |||
| 0xf4 | False | | | 0xf4 | False | | |||
| | | | | | | | |||
| 0xf5 | True | | | 0xf5 | True | | |||
| | | | | | | | |||
skipping to change at page 55, line 33 ¶ | skipping to change at page 57, line 33 ¶ | |||
*p++ = mt + 24; | *p++ = mt + 24; | |||
*p++ = ui; | *p++ = ui; | |||
} else | } else | |||
... | ... | |||
Figure 2: Pseudocode for Encoding a Signed Integer | Figure 2: Pseudocode for Encoding a Signed Integer | |||
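The shortest-form rule of the core canonicalization requirements can be sketched in C along the lines of the pseudocode above. This is an illustrative sketch rather than normative text: the names encode_head and encode_sint are hypothetical, and unlike the pseudocode, mt here is the unshifted major type (0 to 7).

```c
#include <stddef.h>
#include <stdint.h>

/* Writes the initial byte and the shortest possible argument for major
 * type mt (0..7) and argument ui into out, which must have room for
 * 9 bytes. Returns the number of bytes written. */
size_t encode_head(unsigned mt, uint64_t ui, unsigned char *out) {
    unsigned char ib = (unsigned char)(mt << 5);
    size_t nbytes;
    if (ui < 24) {                       /* argument fits in initial byte */
        out[0] = ib | (unsigned char)ui;
        return 1;
    }
    if (ui <= 0xff)            { ib |= 24; nbytes = 1; }  /* uint8_t  */
    else if (ui <= 0xffff)     { ib |= 25; nbytes = 2; }  /* uint16_t */
    else if (ui <= 0xffffffff) { ib |= 26; nbytes = 4; }  /* uint32_t */
    else                       { ib |= 27; nbytes = 8; }  /* uint64_t */
    out[0] = ib;
    for (size_t i = 0; i < nbytes; i++)  /* network byte order */
        out[1 + i] = (unsigned char)(ui >> (8 * (nbytes - 1 - i)));
    return 1 + nbytes;
}

/* Signed integer: major type 0 for n >= 0, otherwise major type 1
 * with argument -1 - n. */
size_t encode_sint(int64_t n, unsigned char *out) {
    if (n >= 0) return encode_head(0, (uint64_t)n, out);
    return encode_head(1, (uint64_t)(-1 - n), out);
}
```

The same encode_head function would serve for lengths in major types 2 through 5 and for tag numbers in major type 6, since these all use the same argument encoding.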
Appendix D. Half-Precision | Appendix D. Half-Precision | |||
As half-precision floating-point numbers were only added to IEEE 754 | As half-precision floating-point numbers were only added to IEEE 754 | |||
in 2008, today's programming platforms often still only have limited | in 2008 [IEEE.754.2008], today's programming platforms often still | |||
support for them. It is very easy to include at least decoding | only have limited support for them. It is very easy to include at | |||
support for them even without such support. An example of a small | least decoding support for them even without such support. An | |||
decoder for half-precision floating-point numbers in the C language | example of a small decoder for half-precision floating-point numbers | |||
is shown in Figure 3. A similar program for Python is in Figure 4; | in the C language is shown in Figure 3. A similar program for Python | |||
this code assumes that the 2-byte value has already been decoded as | is in Figure 4; this code assumes that the 2-byte value has already | |||
an (unsigned short) integer in network byte order (as would be done | been decoded as an (unsigned short) integer in network byte order (as | |||
by the pseudocode in Appendix C). | would be done by the pseudocode in Appendix C). | |||
#include <math.h> | #include <math.h> | |||
double decode_half(unsigned char *halfp) { | double decode_half(unsigned char *halfp) { | |||
int half = (halfp[0] << 8) + halfp[1]; | int half = (halfp[0] << 8) + halfp[1]; | |||
int exp = (half >> 10) & 0x1f; | int exp = (half >> 10) & 0x1f; | |||
int mant = half & 0x3ff; | int mant = half & 0x3ff; | |||
double val; | double val; | |||
if (exp == 0) val = ldexp(mant, -24); | if (exp == 0) val = ldexp(mant, -24); | |||
else if (exp != 31) val = ldexp(mant + 1024, exp - 25); | else if (exp != 31) val = ldexp(mant + 1024, exp - 25); | |||
skipping to change at page 58, line 19 ¶ | skipping to change at page 60, line 19 ¶ | |||
[BSON] is a data format that was developed for the storage of JSON- | [BSON] is a data format that was developed for the storage of JSON- | |||
like maps (JSON objects) in the MongoDB database. Its major | like maps (JSON objects) in the MongoDB database. Its major | |||
distinguishing feature is the capability for in-place update, | distinguishing feature is the capability for in-place update, | |||
foregoing a compact representation. BSON uses a counted | foregoing a compact representation. BSON uses a counted | |||
representation except for map keys, which are null-byte terminated. | representation except for map keys, which are null-byte terminated. | |||
While BSON can be used for the representation of JSON-like objects on | While BSON can be used for the representation of JSON-like objects on | |||
the wire, its specification is dominated by the requirements of the | the wire, its specification is dominated by the requirements of the | |||
database application and has become somewhat baroque. The status of | database application and has become somewhat baroque. The status of | |||
how BSON extensions will be implemented remains unclear. | how BSON extensions will be implemented remains unclear. | |||
E.4. UBJSON | E.4. MSDTP: RFC 713 | |||
[UBJSON] has a design goal to make JSON faster and somewhat smaller, | ||||
using a binary format that is limited to exactly the data model JSON | ||||
uses. Thus, there is expressly no intention to support, for example, | ||||
binary data; however, there is a "high-precision number", expressed | ||||
as a character string in JSON syntax. UBJSON is not optimized for | ||||
code compactness, and its type byte coding is optimized for human | ||||
recognition and not for compact representation of native types such | ||||
as small integers. Although UBJSON is mostly counted, it provides a | ||||
reserved "unknown-length" value to support streaming of arrays and | ||||
maps (JSON objects). Within these containers, UBJSON also has a | ||||
"Noop" type for padding. | ||||
E.5. MSDTP: RFC 713 | ||||
Message Services Data Transmission (MSDTP) is a very early example of | Message Services Data Transmission (MSDTP) is a very early example of | |||
a compact message format; it is described in [RFC0713], written in | a compact message format; it is described in [RFC0713], written in | |||
1976. It is included here for its historical value, not because it | 1976. It is included here for its historical value, not because it | |||
was ever widely used. | was ever widely used. | |||
E.6. Conciseness on the Wire | E.5. Conciseness on the Wire | |||
While CBOR's design objective of code compactness for encoders and | While CBOR's design objective of code compactness for encoders and | |||
decoders is a higher priority than its objective of conciseness on | decoders is a higher priority than its objective of conciseness on | |||
the wire, many people focus on the wire size. Table 6 shows some | the wire, many people focus on the wire size. Table 6 shows some | |||
encoding examples for the simple nested array [1, [2, 3]]; where some | encoding examples for the simple nested array [1, [2, 3]]; where some | |||
form of indefinite-length encoding is supported by the encoding, | form of indefinite-length encoding is supported by the encoding, | |||
[_ 1, [2, 3]] (indefinite length on the outer array) is also shown. | [_ 1, [2, 3]] (indefinite length on the outer array) is also shown. | |||
+-------------+--------------------------+--------------------------+ | +-------------+--------------------------+--------------------------+ | |||
| Format | [1, [2, 3]] | [_ 1, [2, 3]] | | | Format | [1, [2, 3]] | [_ 1, [2, 3]] | | |||
skipping to change at page 59, line 21 ¶ | skipping to change at page 61, line 21 ¶ | |||
| | 01 02 02 01 03 | 01 02 02 01 03 00 00 | | | | 01 02 02 01 03 | 01 02 02 01 03 00 00 | | |||
| | | | | | | | | | |||
| MessagePack | 92 01 92 02 03 | | | | MessagePack | 92 01 92 02 03 | | | |||
| | | | | | | | | | |||
| BSON | 22 00 00 00 10 30 00 01 | | | | BSON | 22 00 00 00 10 30 00 01 | | | |||
| | 00 00 00 04 31 00 13 00 | | | | | 00 00 00 04 31 00 13 00 | | | |||
| | 00 00 10 30 00 02 00 00 | | | | | 00 00 10 30 00 02 00 00 | | | |||
| | 00 10 31 00 03 00 00 00 | | | | | 00 10 31 00 03 00 00 00 | | | |||
| | 00 00 | | | | | 00 00 | | | |||
| | | | | | | | | | |||
| UBJSON | 61 02 42 01 61 02 42 02 | 61 ff 42 01 61 02 42 02 | | ||||
| | 42 03 | 42 03 45 | | ||||
| | | | | ||||
| CBOR | 82 01 82 02 03 | 9f 01 82 02 03 ff | | | CBOR | 82 01 82 02 03 | 9f 01 82 02 03 ff | | |||
+-------------+--------------------------+--------------------------+ | +-------------+--------------------------+--------------------------+ | |||
Table 6: Examples for Different Levels of Conciseness | Table 6: Examples for Different Levels of Conciseness | |||
Appendix F. Changes from RFC 7049 | Appendix F. Changes from RFC 7049 | |||
The following is a list of known changes from RFC 7049. This list is | The following is a list of known changes from RFC 7049. This list is | |||
non-authoritative. It is meant to help reviewers see the significant | non-authoritative. It is meant to help reviewers see the significant | |||
differences. | differences. | |||
End of changes. 71 change blocks. | ||||
211 lines changed or deleted | 280 lines changed or added | |||