draft-ietf-cbor-7049bis-14.txt | draft-ietf-cbor-7049bis-15.txt | |||
---|---|---|---|---|
Network Working Group C. Bormann | Network Working Group C. Bormann | |||
Internet-Draft Universitaet Bremen TZI | Internet-Draft Universitaet Bremen TZI | |||
Obsoletes: 7049 (if approved) P. Hoffman | Obsoletes: 7049 (if approved) P. Hoffman | |||
Intended status: Standards Track ICANN | Intended status: Standards Track ICANN | |||
Expires: 19 December 2020 17 June 2020 | Expires: 28 March 2021 24 September 2020 | |||
Concise Binary Object Representation (CBOR) | Concise Binary Object Representation (CBOR) | |||
draft-ietf-cbor-7049bis-14 | draft-ietf-cbor-7049bis-15 | |||
Abstract | Abstract | |||
The Concise Binary Object Representation (CBOR) is a data format | The Concise Binary Object Representation (CBOR) is a data format | |||
whose design goals include the possibility of extremely small code | whose design goals include the possibility of extremely small code | |||
size, fairly small message size, and extensibility without the need | size, fairly small message size, and extensibility without the need | |||
for version negotiation. These design goals make it different from | for version negotiation. These design goals make it different from | |||
earlier binary serializations such as ASN.1 and MessagePack. | earlier binary serializations such as ASN.1 and MessagePack. | |||
This document is a revised edition of RFC 7049, with editorial | This document is a revised edition of RFC 7049, with editorial | |||
skipping to change at page 2, line 10 ¶ | skipping to change at page 2, line 10 ¶ | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on 19 December 2020. | This Internet-Draft will expire on 28 March 2021. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2020 IETF Trust and the persons identified as the | Copyright (c) 2020 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents (https://trustee.ietf.org/ | Provisions Relating to IETF Documents (https://trustee.ietf.org/ | |||
license-info) in effect on the date of publication of this document. | license-info) in effect on the date of publication of this document. | |||
Please review these documents carefully, as they describe your rights | Please review these documents carefully, as they describe your rights | |||
skipping to change at page 2, line 33 ¶ | skipping to change at page 2, line 33 ¶ | |||
as described in Section 4.e of the Trust Legal Provisions and are | as described in Section 4.e of the Trust Legal Provisions and are | |||
provided without warranty as described in the Simplified BSD License. | provided without warranty as described in the Simplified BSD License. | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
1.1. Objectives . . . . . . . . . . . . . . . . . . . . . . . 4 | 1.1. Objectives . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
1.2. Terminology . . . . . . . . . . . . . . . . . . . . . . . 6 | 1.2. Terminology . . . . . . . . . . . . . . . . . . . . . . . 6 | |||
2. CBOR Data Models . . . . . . . . . . . . . . . . . . . . . . 8 | 2. CBOR Data Models . . . . . . . . . . . . . . . . . . . . . . 8 | |||
2.1. Extended Generic Data Models . . . . . . . . . . . . . . 9 | 2.1. Extended Generic Data Models . . . . . . . . . . . . . . 9 | |||
2.2. Specific Data Models . . . . . . . . . . . . . . . . . . 9 | 2.2. Specific Data Models . . . . . . . . . . . . . . . . . . 10 | |||
3. Specification of the CBOR Encoding . . . . . . . . . . . . . 10 | 3. Specification of the CBOR Encoding . . . . . . . . . . . . . 10 | |||
3.1. Major Types . . . . . . . . . . . . . . . . . . . . . . . 11 | 3.1. Major Types . . . . . . . . . . . . . . . . . . . . . . . 11 | |||
3.2. Indefinite Lengths for Some Major Types . . . . . . . . . 14 | 3.2. Indefinite Lengths for Some Major Types . . . . . . . . . 14 | |||
3.2.1. The "break" Stop Code . . . . . . . . . . . . . . . . 14 | 3.2.1. The "break" Stop Code . . . . . . . . . . . . . . . . 14 | |||
3.2.2. Indefinite-Length Arrays and Maps . . . . . . . . . . 14 | 3.2.2. Indefinite-Length Arrays and Maps . . . . . . . . . . 15 | |||
3.2.3. Indefinite-Length Byte Strings and Text Strings . . . 16 | 3.2.3. Indefinite-Length Byte Strings and Text Strings . . . 17 | |||
3.2.4. Summary of indefinite-length use of major types . . . 17 | 3.2.4. Summary of indefinite-length use of major types . . . 18 | |||
3.3. Floating-Point Numbers and Values with No Content . . . . 18 | 3.3. Floating-Point Numbers and Values with No Content . . . . 18 | |||
3.4. Tagging of Items . . . . . . . . . . . . . . . . . . . . 19 | 3.4. Tagging of Items . . . . . . . . . . . . . . . . . . . . 20 | |||
3.4.1. Standard Date/Time String . . . . . . . . . . . . . . 22 | 3.4.1. Standard Date/Time String . . . . . . . . . . . . . . 23 | |||
3.4.2. Epoch-based Date/Time . . . . . . . . . . . . . . . . 23 | 3.4.2. Epoch-based Date/Time . . . . . . . . . . . . . . . . 23 | |||
3.4.3. Bignums . . . . . . . . . . . . . . . . . . . . . . . 24 | 3.4.3. Bignums . . . . . . . . . . . . . . . . . . . . . . . 24 | |||
3.4.4. Decimal Fractions and Bigfloats . . . . . . . . . . . 24 | 3.4.4. Decimal Fractions and Bigfloats . . . . . . . . . . . 25 | |||
3.4.5. Content Hints . . . . . . . . . . . . . . . . . . . . 26 | 3.4.5. Content Hints . . . . . . . . . . . . . . . . . . . . 26 | |||
3.4.5.1. Encoded CBOR Data Item . . . . . . . . . . . . . 26 | 3.4.5.1. Encoded CBOR Data Item . . . . . . . . . . . . . 27 | |||
3.4.5.2. Expected Later Encoding for CBOR-to-JSON | 3.4.5.2. Expected Later Encoding for CBOR-to-JSON | |||
Converters . . . . . . . . . . . . . . . . . . . . 26 | Converters . . . . . . . . . . . . . . . . . . . . 27 | |||
3.4.5.3. Encoded Text . . . . . . . . . . . . . . . . . . 27 | 3.4.5.3. Encoded Text . . . . . . . . . . . . . . . . . . 28 | |||
3.4.6. Self-Described CBOR . . . . . . . . . . . . . . . . . 28 | 3.4.6. Self-Described CBOR . . . . . . . . . . . . . . . . . 29 | |||
4. Serialization Considerations . . . . . . . . . . . . . . . . 29 | 4. Serialization Considerations . . . . . . . . . . . . . . . . 29 | |||
4.1. Preferred Serialization . . . . . . . . . . . . . . . . . 29 | 4.1. Preferred Serialization . . . . . . . . . . . . . . . . . 29 | |||
4.2. Deterministically Encoded CBOR . . . . . . . . . . . . . 30 | 4.2. Deterministically Encoded CBOR . . . . . . . . . . . . . 31 | |||
4.2.1. Core Deterministic Encoding Requirements . . . . . . 30 | 4.2.1. Core Deterministic Encoding Requirements . . . . . . 31 | |||
4.2.2. Additional Deterministic Encoding Considerations . . 31 | 4.2.2. Additional Deterministic Encoding Considerations . . 32 | |||
4.2.3. Length-first Map Key Ordering . . . . . . . . . . . . 33 | 4.2.3. Length-first Map Key Ordering . . . . . . . . . . . . 34 | |||
5. Creating CBOR-Based Protocols . . . . . . . . . . . . . . . . 34 | 5. Creating CBOR-Based Protocols . . . . . . . . . . . . . . . . 35 | |||
5.1. CBOR in Streaming Applications . . . . . . . . . . . . . 35 | 5.1. CBOR in Streaming Applications . . . . . . . . . . . . . 35 | |||
5.2. Generic Encoders and Decoders . . . . . . . . . . . . . . 35 | 5.2. Generic Encoders and Decoders . . . . . . . . . . . . . . 36 | |||
5.3. Validity of Items . . . . . . . . . . . . . . . . . . . . 36 | 5.3. Validity of Items . . . . . . . . . . . . . . . . . . . . 37 | |||
5.3.1. Basic validity . . . . . . . . . . . . . . . . . . . 36 | 5.3.1. Basic validity . . . . . . . . . . . . . . . . . . . 37 | |||
5.3.2. Tag validity . . . . . . . . . . . . . . . . . . . . 37 | 5.3.2. Tag validity . . . . . . . . . . . . . . . . . . . . 37 | |||
5.4. Validity and Evolution . . . . . . . . . . . . . . . . . 37 | 5.4. Validity and Evolution . . . . . . . . . . . . . . . . . 38 | |||
5.5. Numbers . . . . . . . . . . . . . . . . . . . . . . . . . 38 | 5.5. Numbers . . . . . . . . . . . . . . . . . . . . . . . . . 39 | |||
5.6. Specifying Keys for Maps . . . . . . . . . . . . . . . . 39 | 5.6. Specifying Keys for Maps . . . . . . . . . . . . . . . . 40 | |||
5.6.1. Equivalence of Keys . . . . . . . . . . . . . . . . . 41 | 5.6.1. Equivalence of Keys . . . . . . . . . . . . . . . . . 42 | |||
5.7. Undefined Values . . . . . . . . . . . . . . . . . . . . 42 | 5.7. Undefined Values . . . . . . . . . . . . . . . . . . . . 43 | |||
6. Converting Data between CBOR and JSON . . . . . . . . . . . . 42 | 6. Converting Data between CBOR and JSON . . . . . . . . . . . . 43 | |||
6.1. Converting from CBOR to JSON . . . . . . . . . . . . . . 42 | 6.1. Converting from CBOR to JSON . . . . . . . . . . . . . . 43 | |||
6.2. Converting from JSON to CBOR . . . . . . . . . . . . . . 43 | 6.2. Converting from JSON to CBOR . . . . . . . . . . . . . . 44 | |||
7. Future Evolution of CBOR . . . . . . . . . . . . . . . . . . 44 | 7. Future Evolution of CBOR . . . . . . . . . . . . . . . . . . 46 | |||
7.1. Extension Points . . . . . . . . . . . . . . . . . . . . 45 | 7.1. Extension Points . . . . . . . . . . . . . . . . . . . . 46 | |||
7.2. Curating the Additional Information Space . . . . . . . . 46 | 7.2. Curating the Additional Information Space . . . . . . . . 47 | |||
8. Diagnostic Notation . . . . . . . . . . . . . . . . . . . . . 46 | 8. Diagnostic Notation . . . . . . . . . . . . . . . . . . . . . 47 | |||
8.1. Encoding Indicators . . . . . . . . . . . . . . . . . . . 47 | 8.1. Encoding Indicators . . . . . . . . . . . . . . . . . . . 49 | |||
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 48 | 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 49 | |||
9.1. Simple Values Registry . . . . . . . . . . . . . . . . . 48 | 9.1. Simple Values Registry . . . . . . . . . . . . . . . . . 50 | |||
9.2. Tags Registry . . . . . . . . . . . . . . . . . . . . . . 48 | 9.2. Tags Registry . . . . . . . . . . . . . . . . . . . . . . 50 | |||
9.3. Media Type ("MIME Type") . . . . . . . . . . . . . . . . 49 | 9.3. Media Type ("MIME Type") . . . . . . . . . . . . . . . . 51 | |||
9.4. CoAP Content-Format . . . . . . . . . . . . . . . . . . . 50 | 9.4. CoAP Content-Format . . . . . . . . . . . . . . . . . . . 51 | |||
9.5. The +cbor Structured Syntax Suffix Registration . . . . . 50 | 9.5. The +cbor Structured Syntax Suffix Registration . . . . . 52 | |||
10. Security Considerations . . . . . . . . . . . . . . . . . . . 51 | 10. Security Considerations . . . . . . . . . . . . . . . . . . . 53 | |||
11. References . . . . . . . . . . . . . . . . . . . . . . . . . 53 | 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 56 | |||
11.1. Normative References . . . . . . . . . . . . . . . . . . 53 | 11.1. Normative References . . . . . . . . . . . . . . . . . . 56 | |||
11.2. Informative References . . . . . . . . . . . . . . . . . 54 | 11.2. Informative References . . . . . . . . . . . . . . . . . 57 | |||
Appendix A. Examples . . . . . . . . . . . . . . . . . . . . . . 57 | Appendix A. Examples of Encoded CBOR Data Items . . . . . . . . 60 | |||
Appendix B. Jump Table . . . . . . . . . . . . . . . . . . . . . 61 | Appendix B. Jump Table for Initial Byte . . . . . . . . . . . . 64 | |||
Appendix C. Pseudocode . . . . . . . . . . . . . . . . . . . . . 64 | Appendix C. Pseudocode . . . . . . . . . . . . . . . . . . . . . 67 | |||
Appendix D. Half-Precision . . . . . . . . . . . . . . . . . . . 66 | Appendix D. Half-Precision . . . . . . . . . . . . . . . . . . . 69 | |||
Appendix E. Comparison of Other Binary Formats to CBOR's Design | Appendix E. Comparison of Other Binary Formats to CBOR's Design | |||
Objectives . . . . . . . . . . . . . . . . . . . . . . . 67 | Objectives . . . . . . . . . . . . . . . . . . . . . . . 70 | |||
E.1. ASN.1 DER, BER, and PER . . . . . . . . . . . . . . . . . 68 | E.1. ASN.1 DER, BER, and PER . . . . . . . . . . . . . . . . . 71 | |||
E.2. MessagePack . . . . . . . . . . . . . . . . . . . . . . . 68 | E.2. MessagePack . . . . . . . . . . . . . . . . . . . . . . . 71 | |||
E.3. BSON . . . . . . . . . . . . . . . . . . . . . . . . . . 69 | E.3. BSON . . . . . . . . . . . . . . . . . . . . . . . . . . 72 | |||
E.4. MSDTP: RFC 713 . . . . . . . . . . . . . . . . . . . . . 69 | E.4. MSDTP: RFC 713 . . . . . . . . . . . . . . . . . . . . . 72 | |||
E.5. Conciseness on the Wire . . . . . . . . . . . . . . . . . 69 | E.5. Conciseness on the Wire . . . . . . . . . . . . . . . . . 72 | |||
Appendix F. Well-formedness errors and examples . . . . . . . . 70 | Appendix F. Well-formedness errors and examples . . . . . . . . 73 | |||
F.1. Examples for CBOR data items that are not well-formed . . 71 | F.1. Examples for CBOR data items that are not well-formed . . 74 | |||
Appendix G. Changes from RFC 7049 . . . . . . . . . . . . . . . 73 | Appendix G. Changes from RFC 7049 . . . . . . . . . . . . . . . 76 | |||
G.1. Errata processing, clerical changes . . . . . . . . . . . 73 | G.1. Errata processing, clerical changes . . . . . . . . . . . 76 | |||
G.2. Changes in IANA considerations . . . . . . . . . . . . . 74 | G.2. Changes in IANA considerations . . . . . . . . . . . . . 77 | |||
G.3. Changes in suggestions and other informational | G.3. Changes in suggestions and other informational | |||
components . . . . . . . . . . . . . . . . . . . . . . . 74 | components . . . . . . . . . . . . . . . . . . . . . . . 77 | |||
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . 76 | Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . 79 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 76 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 79 | |||
1. Introduction | 1. Introduction | |||
There are hundreds of standardized formats for binary representation | There are hundreds of standardized formats for binary representation | |||
of structured data (also known as binary serialization formats). Of | of structured data (also known as binary serialization formats). Of | |||
those, some are for specific domains of information, while others are | those, some are for specific domains of information, while others are | |||
generalized for arbitrary data. In the IETF, probably the best-known | generalized for arbitrary data. In the IETF, probably the best-known | |||
formats in the latter category are ASN.1's BER and DER [ASN.1]. | formats in the latter category are ASN.1's BER and DER [ASN.1]. | |||
The format defined here follows some specific design goals that are | The format defined here follows some specific design goals that are | |||
skipping to change at page 7, line 34 ¶ | skipping to change at page 7, line 34 ¶ | |||
Stream decoder: A process that decodes a data stream and makes each | Stream decoder: A process that decodes a data stream and makes each | |||
of the data items in the sequence available to an application as | of the data items in the sequence available to an application as | |||
they are received. | they are received. | |||
Terms and concepts for floating-point values such as Infinity, NaN | Terms and concepts for floating-point values such as Infinity, NaN | |||
(not a number), negative zero, and subnormal are defined in | (not a number), negative zero, and subnormal are defined in | |||
[IEEE754]. | [IEEE754]. | |||
Where bit arithmetic or data types are explained, this document uses | Where bit arithmetic or data types are explained, this document uses | |||
the notation familiar from the programming language C, except that | the notation familiar from the programming language C [C], except | |||
"**" denotes exponentiation. Similar to the "0x" notation for | that "**" denotes exponentiation and ".." denotes a range that | |||
hexadecimal numbers, numbers in binary notation are prefixed with | includes both ends given. Examples and pseudocode assume that signed | |||
"0b". Underscores can be added to a number solely for readability, | integers use two's complement representation and that right shifts of | |||
so 0b00100001 (0x21) might be written 0b001_00001 to emphasize the | signed integers perform sign extension; these assumptions are also | |||
desired interpretation of the bits in the byte; in this case, it is | specified in Sections 6.8.2 and 7.6.7 of the 2020 version of C++, | |||
split into three bits and five bits. Encoded CBOR data items are | successor of [Cplusplus17]. | |||
sometimes given in the "0x" or "0b" notation; these values are first | ||||
interpreted as numbers as in C and are then interpreted as byte | Similar to the "0x" notation for hexadecimal numbers, numbers in | |||
strings in network byte order, including any leading zero bytes | binary notation are prefixed with "0b". Underscores can be added to | |||
expressed in the notation. | a number solely for readability, so 0b00100001 (0x21) might be | |||
written 0b001_00001 to emphasize the desired interpretation of the | ||||
bits in the byte; in this case, it is split into three bits and five | ||||
bits. Encoded CBOR data items are sometimes given in the "0x" or | ||||
"0b" notation; these values are first interpreted as numbers as in C | ||||
and are then interpreted as byte strings in network byte order, | ||||
including any leading zero bytes expressed in the notation. | ||||
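As an illustration only (it is not part of either draft), the notation described above can be checked in Python, which accepts the same "0b" prefix and underscore grouping. The variable names are assumptions chosen for this sketch.

```python
# The underscore is purely visual: both literals denote the same number.
initial_byte = 0b001_00001
assert initial_byte == 0b00100001 == 0x21

# Split the byte into its three high-order bits and five low-order bits.
high3 = initial_byte >> 5       # 0b001
low5 = initial_byte & 0b11111   # 0b00001

# A value given in "0x" notation, read as a byte string in network byte
# order, keeps its leading zero bytes: 0x01f4 is the two bytes 01 f4.
two_bytes = bytes.fromhex("01f4")
assert int.from_bytes(two_bytes, "big") == 500
```

The shift and mask mirror the split into three bits and five bits that the text uses for the initial byte of an encoded item.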
Words may be _italicized_ for emphasis; in the plain text form of | Words may be _italicized_ for emphasis; in the plain text form of | |||
this specification this is indicated by surrounding words with | this specification this is indicated by surrounding words with | |||
underscore characters. Verbatim text (e.g., names from a programming | underscore characters. Verbatim text (e.g., names from a programming | |||
language) may be set in "monospace" type; in plain text this is | language) may be set in "monospace" type; in plain text this is | |||
approximated somewhat ambiguously by surrounding the text in double | approximated somewhat ambiguously by surrounding the text in double | |||
quotes (which also retain their usual meaning). | quotes (which also retain their usual meaning). | |||
2. CBOR Data Models | 2. CBOR Data Models | |||
CBOR is explicit about its generic data model, which defines the set | CBOR is explicit about its generic data model, which defines the set | |||
of all data items that can be represented in CBOR. Its basic generic | of all data items that can be represented in CBOR. Its basic generic | |||
data model is extensible by the registration of simple type values | data model is extensible by the registration of "simple values" and | |||
and tags. Applications can then subset the resulting extended | tags. Applications can then subset the resulting extended generic | |||
generic data model to build their specific data models. | data model to build their specific data models. | |||
Within environments that can represent the data items in the generic | Within environments that can represent the data items in the generic | |||
data model, generic CBOR encoders and decoders can be implemented | data model, generic CBOR encoders and decoders can be implemented | |||
(which usually involves defining additional implementation data types | (which usually involves defining additional implementation data types | |||
for those data items that do not already have a natural | for those data items that do not already have a natural | |||
representation in the environment). The ability to provide generic | representation in the environment). The ability to provide generic | |||
encoders and decoders is an explicit design goal of CBOR; however | encoders and decoders is an explicit design goal of CBOR; however | |||
many applications will provide their own application-specific | many applications will provide their own application-specific | |||
encoders and/or decoders. | encoders and/or decoders. | |||
In the basic (un-extended) generic data model, a data item is one of: | In the basic (un-extended) generic data model defined in Section 3, a | |||
data item is one of: | ||||
* an integer in the range -2**64..2**64-1 inclusive | * an integer in the range -2**64..2**64-1 inclusive | |||
* a simple value, identified by a number between 0 and 255, but | * a simple value, identified by a number between 0 and 255, but | |||
distinct from that number itself | distinct from that number itself | |||
* a floating-point value, distinct from an integer, out of the set | * a floating-point value, distinct from an integer, out of the set | |||
representable by IEEE 754 binary64 (including non-finites) | representable by IEEE 754 binary64 (including non-finites) | |||
[IEEE754] | [IEEE754] | |||
skipping to change at page 9, line 32 ¶ | skipping to change at page 9, line 35 ¶ | |||
precision than the above (tag numbers 2 to 5) | precision than the above (tag numbers 2 to 5) | |||
* application data types such as a point in time or an RFC 3339 | * application data types such as a point in time or an RFC 3339 | |||
date/time string (tag numbers 1, 0) | date/time string (tag numbers 1, 0) | |||
Further elements of the extended generic data model can be (and have | Further elements of the extended generic data model can be (and have | |||
been) defined via the IANA registries created for CBOR. Even if such | been) defined via the IANA registries created for CBOR. Even if such | |||
an extension is unknown to a generic encoder or decoder, data items | an extension is unknown to a generic encoder or decoder, data items | |||
using that extension can be passed to or from the application by | using that extension can be passed to or from the application by | |||
representing them at the interface to the application within the | representing them at the interface to the application within the | |||
basic generic data model, i.e., as generic values of a simple type or | basic generic data model, i.e., as generic simple values or generic | |||
generic tags. | tags. | |||
In other words, the basic generic data model is stable as defined in | In other words, the basic generic data model is stable as defined in | |||
this document, while the extended generic data model expands by the | this document, while the extended generic data model expands by the | |||
registration of new simple values or tag numbers, but never shrinks. | registration of new simple values or tag numbers, but never shrinks. | |||
While there is a strong expectation that generic encoders and | While there is a strong expectation that generic encoders and | |||
decoders can represent "false", "true", and "null" ("undefined" is | decoders can represent "false", "true", and "null" ("undefined" is | |||
intentionally omitted) in the form appropriate for their programming | intentionally omitted) in the form appropriate for their programming | |||
environment, implementation of the data model extensions created by | environment, implementation of the data model extensions created by | |||
tags is truly optional and a matter of implementation quality. | tags is truly optional and a matter of implementation quality. | |||
skipping to change at page 10, line 23 ¶ | skipping to change at page 10, line 32 ¶ | |||
representations of integral values are equivalent, using both map | representations of integral values are equivalent, using both map | |||
keys "0" and "0.0" in a single map would be considered duplicates, | keys "0" and "0.0" in a single map would be considered duplicates, | |||
even while encoded as different major types, and so invalid; and an | even while encoded as different major types, and so invalid; and an | |||
encoder could encode integral-valued floats as integers or vice | encoder could encode integral-valued floats as integers or vice | |||
versa, perhaps to save encoded bytes. | versa, perhaps to save encoded bytes. | |||
3. Specification of the CBOR Encoding | 3. Specification of the CBOR Encoding | |||
A CBOR data item (Section 2) is encoded to or decoded from a byte | A CBOR data item (Section 2) is encoded to or decoded from a byte | |||
string carrying a well-formed encoded data item as described in this | string carrying a well-formed encoded data item as described in this | |||
section. The encoding is summarized in Table 7, indexed by the | section. The encoding is summarized in Table 7 in Appendix B, | |||
initial byte. An encoder MUST produce only well-formed encoded data | indexed by the initial byte. An encoder MUST produce only well- | |||
items. A decoder MUST NOT return a decoded data item when it | formed encoded data items. A decoder MUST NOT return a decoded data | |||
encounters input that is not a well-formed encoded CBOR data item | item when it encounters input that is not a well-formed encoded CBOR | |||
(this does not detract from the usefulness of diagnostic and recovery | data item (this does not detract from the usefulness of diagnostic | |||
tools that might make available some information from a damaged | and recovery tools that might make available some information from a | |||
encoded CBOR data item). | damaged encoded CBOR data item). | |||
The initial byte of each encoded data item contains both information | The initial byte of each encoded data item contains both information | |||
about the major type (the high-order 3 bits, described in | about the major type (the high-order 3 bits, described in | |||
Section 3.1) and additional information (the low-order 5 bits). With | Section 3.1) and additional information (the low-order 5 bits). With | |||
a few exceptions, the additional information's value describes how to | a few exceptions, the additional information's value describes how to | |||
load an unsigned integer "argument": | load an unsigned integer "argument": | |||
Less than 24: The argument's value is the value of the additional | Less than 24: The argument's value is the value of the additional | |||
information. | information. | |||
skipping to change at page 11, line 6 ¶ | skipping to change at page 11, line 16 ¶ | |||
are not used as an integer argument, but as a floating-point value | are not used as an integer argument, but as a floating-point value | |||
(see Section 3.3). | (see Section 3.3). | |||
28, 29, 30: These values are reserved for future additions to the | 28, 29, 30: These values are reserved for future additions to the | |||
CBOR format. In the present version of CBOR, the encoded item is | CBOR format. In the present version of CBOR, the encoded item is | |||
not well-formed. | not well-formed. | |||
31: No argument value is derived. If the major type is 0, 1, or 6, | 31: No argument value is derived. If the major type is 0, 1, or 6, | |||
the encoded item is not well-formed. For major types 2 to 5, the | the encoded item is not well-formed. For major types 2 to 5, the | |||
item's length is indefinite, and for major type 7, the byte does | item's length is indefinite, and for major type 7, the byte does | |||
not consitute a data item at all but terminates an indefinite | not constitute a data item at all but terminates an indefinite | |||
length item; both are described in Section 3.2. | length item; all are described in Section 3.2. | |||
The initial byte and any additional bytes consumed to construct the | The initial byte and any additional bytes consumed to construct the | |||
argument are collectively referred to as the "head" of the data item. | argument are collectively referred to as the "head" of the data item. | |||
The meaning of this argument depends on the major type. For example, | The meaning of this argument depends on the major type. For example, | |||
in major type 0, the argument is the value of the data item itself | in major type 0, the argument is the value of the data item itself | |||
(and in major type 1 the value of the data item is computed from the | (and in major type 1 the value of the data item is computed from the | |||
argument); in major type 2 and 3 it gives the length of the string | argument); in major type 2 and 3 it gives the length of the string | |||
data in bytes that follows; and in major types 4 and 5 it is used to | data in bytes that follows; and in major types 4 and 5 it is used to | |||
determine the number of data items enclosed. | determine the number of data items enclosed. | |||
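The rules for the head can be sketched as follows. This is a minimal illustration, not the pseudocode of Appendix C: the function names "decode_head" and "decode_integer" are hypothetical, and the argument sizes for additional information 24 to 27 (1, 2, 4, and 8 following bytes) are taken from the full specification, since that part of the list is elided in this comparison. For major type 7 the sketch returns the raw following bytes as a number and does not interpret them as a simple value or floating-point value.

```python
def decode_head(data: bytes):
    """Return (major_type, additional_info, argument, head_length).

    argument is None when additional information is 31, where no
    argument value is derived. Raises ValueError when the head is
    not well-formed.
    """
    if not data:
        raise ValueError("no initial byte")
    initial = data[0]
    major_type = initial >> 5     # high-order 3 bits
    info = initial & 0b11111      # low-order 5 bits

    if info < 24:
        # The argument is the additional information itself.
        return major_type, info, info, 1
    if info <= 27:
        size = 1 << (info - 24)   # 24, 25, 26, 27 -> 1, 2, 4, 8 bytes
        if len(data) < 1 + size:
            raise ValueError("truncated head")
        argument = int.from_bytes(data[1:1 + size], "big")
        return major_type, info, argument, 1 + size
    if info < 31:
        raise ValueError("additional information 28..30 is reserved")
    # info == 31: indefinite length, or "break" for major type 7
    if major_type in (0, 1, 6):
        raise ValueError("additional information 31 is not well-formed here")
    return major_type, info, None, 1


def decode_integer(data: bytes) -> int:
    """Interpret the argument for major types 0 and 1 (Section 3.1)."""
    major_type, _, argument, _ = decode_head(data)
    if major_type == 0:
        return argument           # the argument is the value itself
    if major_type == 1:
        return -1 - argument      # -1 minus the argument
    raise ValueError("not an integer")
```

With these definitions, the bytes 0x1901f4 decode to 500 and 0x3901f3 to -500, matching the examples given for major types 0 and 1 in Section 3.1.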
skipping to change at page 11, line 38 ¶ | skipping to change at page 11, line 48 ¶ | |||
256 defined values for the initial byte (Table 7). A decoder in a | 256 defined values for the initial byte (Table 7). A decoder in a | |||
constrained implementation can instead use the structure of the | constrained implementation can instead use the structure of the | |||
initial byte and following bytes for more compact code (see | initial byte and following bytes for more compact code (see | |||
Appendix C for a rough impression of how this could look). | Appendix C for a rough impression of how this could look). | |||
3.1. Major Types | 3.1. Major Types | |||
The following lists the major types and the additional information | The following lists the major types and the additional information | |||
and other bytes associated with the type. | and other bytes associated with the type. | |||
Major type 0: an integer in the range 0..2**64-1 inclusive. The | Major type 0: an unsigned integer in the range 0..2**64-1 inclusive. | |||
value of the encoded item is the argument itself. For example, | ||||
the integer 10 is denoted as the one byte 0b000_01010 (major type | The value of the encoded item is the argument itself. For | |||
0, additional information 10). The integer 500 would be | example, the integer 10 is denoted as the one byte 0b000_01010 | |||
0b000_11001 (major type 0, additional information 25) followed by | (major type 0, additional information 10). The integer 500 would | |||
the two bytes 0x01f4, which is 500 in decimal. | be 0b000_11001 (major type 0, additional information 25) followed | |||
by the two bytes 0x01f4, which is 500 in decimal. | ||||
Major type 1: a negative integer in the range -2**64..-1 inclusive. | Major type 1: a negative integer in the range -2**64..-1 inclusive. | |||
The value of the item is -1 minus the argument. For example, the | The value of the item is -1 minus the argument. For example, the | |||
integer -500 would be 0b001_11001 (major type 1, additional | integer -500 would be 0b001_11001 (major type 1, additional | |||
information 25) followed by the two bytes 0x01f3, which is 499 in | information 25) followed by the two bytes 0x01f3, which is 499 in | |||
decimal. | decimal. | |||
Major type 2: a byte string. The number of bytes in the string is | Major type 2: a byte string. The number of bytes in the string is | |||
equal to the argument. For example, a byte string whose length is | equal to the argument. For example, a byte string whose length is | |||
5 would have an initial byte of 0b010_00101 (major type 2, | 5 would have an initial byte of 0b010_00101 (major type 2, | |||
skipping to change at page 12, line 18 ¶ | skipping to change at page 12, line 32 ¶ | |||
initial bytes of 0b010_11001 (major type 2, additional information | initial bytes of 0b010_11001 (major type 2, additional information | |||
25 to indicate a two-byte length) followed by the two bytes 0x01f4 | 25 to indicate a two-byte length) followed by the two bytes 0x01f4 | |||
for a length of 500, followed by 500 bytes of binary content. | for a length of 500, followed by 500 bytes of binary content. | |||
Major type 3: a text string (Section 2), encoded as UTF-8 | Major type 3: a text string (Section 2), encoded as UTF-8 | |||
([RFC3629]). The number of bytes in the string is equal to the | ([RFC3629]). The number of bytes in the string is equal to the | |||
argument. A string containing an invalid UTF-8 sequence is well- | argument. A string containing an invalid UTF-8 sequence is well- | |||
formed but invalid (Section 1.2). This type is provided for | formed but invalid (Section 1.2). This type is provided for | |||
systems that need to interpret or display human-readable text, and | systems that need to interpret or display human-readable text, and | |||
allows the differentiation between unstructured bytes and text | allows the differentiation between unstructured bytes and text | |||
that has a specified repertoire and encoding. In contrast to | that has a specified repertoire (that of Unicode) and encoding | |||
formats such as JSON, the Unicode characters in this type are | (UTF-8). In contrast to formats such as JSON, the Unicode | |||
never escaped. Thus, a newline character (U+000A) is always | characters in this type are never escaped. Thus, a newline | |||
represented in a string as the byte 0x0a, and never as the bytes | character (U+000A) is always represented in a string as the byte | |||
0x5c6e (the characters "\" and "n") or as 0x5c7530303061 (the | 0x0a, and never as the bytes 0x5c6e (the characters "\" and "n") | |||
characters "\", "u", "0", "0", "0", and "a"). | nor as 0x5c7530303061 (the characters "\", "u", "0", "0", "0", and | |||
"a"). | ||||
Major type 4: an array of data items. In other formats, arrays are | Major type 4: an array of data items. In other formats, arrays are | |||
also called lists, sequences, or tuples (a "CBOR sequence" is | also called lists, sequences, or tuples (a "CBOR sequence" is | |||
something slightly different, though [RFC8742]). The argument is | something slightly different, though [RFC8742]). The argument is | |||
the number of data items in the array. Items in an array do not | the number of data items in the array. Items in an array do not | |||
need to all be of the same type. For example, an array that | need to all be of the same type. For example, an array that | |||
contains 10 items of any type would have an initial byte of | contains 10 items of any type would have an initial byte of | |||
0b100_01010 (major type of 4, additional information of 10 for the | 0b100_01010 (major type 4, additional information 10 for the | |||
length) followed by the 10 remaining items. | length) followed by the 10 remaining items. | |||
Major type 5: a map of pairs of data items. Maps are also called | Major type 5: a map of pairs of data items. Maps are also called | |||
tables, dictionaries, hashes, or objects (in JSON). A map is | tables, dictionaries, hashes, or objects (in JSON). A map is | |||
comprised of pairs of data items, each pair consisting of a key | comprised of pairs of data items, each pair consisting of a key | |||
that is immediately followed by a value. The argument is the | that is immediately followed by a value. The argument is the | |||
number of _pairs_ of data items in the map. For example, a map | number of _pairs_ of data items in the map. For example, a map | |||
that contains 9 pairs would have an initial byte of 0b101_01001 | that contains 9 pairs would have an initial byte of 0b101_01001 | |||
(major type of 5, additional information of 9 for the number of | (major type 5, additional information 9 for the number of pairs) | |||
pairs) followed by the 18 remaining items. The first item is the | followed by the 18 remaining items. The first item is the first | |||
first key, the second item is the first value, the third item is | key, the second item is the first value, the third item is the | |||
the second key, and so on. Because items in a map come in pairs, | second key, and so on. Because items in a map come in pairs, | |||
their total number is always even: A map that contains an odd | their total number is always even: A map that contains an odd | |||
number of items (no value data present after the last key data | number of items (no value data present after the last key data | |||
item) is not well-formed. A map that has duplicate keys may be | item) is not well-formed. A map that has duplicate keys may be | |||
well-formed, but it is not valid, and thus it causes indeterminate | well-formed, but it is not valid, and thus it causes indeterminate | |||
decoding; see also Section 5.6. | decoding; see also Section 5.6. | |||
Major type 6: a tagged data item ("tag") whose tag number, an | Major type 6: a tagged data item ("tag") whose tag number, an | |||
integer in the range 0..2**64-1 inclusive, is the argument and | integer in the range 0..2**64-1 inclusive, is the argument and | |||
whose enclosed data item ("tag content") is the single encoded | whose enclosed data item ("tag content") is the single encoded | |||
data item that follows the head. See Section 3.4. | data item that follows the head. See Section 3.4. | |||
skipping to change at page 13, line 23 ¶ | skipping to change at page 14, line 5 ¶ | |||
(Table 7). | (Table 7). | |||
In major types 6 and 7, many of the possible values are reserved for | In major types 6 and 7, many of the possible values are reserved for | |||
future specification. See Section 9 for more information on these | future specification. See Section 9 for more information on these | |||
values. | values. | |||
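The head encodings quoted in the major type examples above can be reproduced with a short Python sketch. The function names encode_head and encode_int are hypothetical and not taken from the draft, and always choosing the shortest head is an assumption made for this illustration. Major type 7 is left out because its additional information is interpreted differently (Section 3.3).

```python
def encode_head(major_type: int, argument: int) -> bytes:
    """Encode the head (initial byte plus any following argument bytes).

    Hypothetical helper; always picks the shortest form (an assumption).
    """
    if not 0 <= major_type <= 6:
        raise ValueError("this sketch covers major types 0..6 only")
    if not 0 <= argument < 2**64:
        raise ValueError("argument must be in 0..2**64-1")
    mt = major_type << 5
    if argument < 24:
        # The argument fits directly in the 5-bit additional information.
        return bytes([mt | argument])
    # Additional information 24..27 announces 1, 2, 4, or 8 following bytes.
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < 2 ** (8 * size):
            return bytes([mt | info]) + argument.to_bytes(size, "big")
    raise AssertionError("unreachable")


def encode_int(value: int) -> bytes:
    """Encode an integer as major type 0 (unsigned) or 1 (negative)."""
    if value >= 0:
        return encode_head(0, value)
    # Major type 1 carries the argument N for the value -1 - N.
    return encode_head(1, -1 - value)
```

With these definitions, encode_int(500) yields 0x19 followed by 0x01f4, and encode_int(-500) yields 0x39 followed by 0x01f3, matching the examples given for major types 0 and 1.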
Table 1 summarizes the major types defined by CBOR, ignoring the next | Table 1 summarizes the major types defined by CBOR, ignoring the next | |||
section for now. The number N in this table stands for the argument, | section for now. The number N in this table stands for the argument, | |||
mt for the major type. | mt for the major type. | |||
+----+-----------------------+---------------------------------+ | +====+=======================+=================================+ | |||
| mt | Meaning | Content | | | mt | Meaning | Content | | |||
+====+=======================+=================================+ | +====+=======================+=================================+ | |||
| 0 | unsigned integer N | - | | | 0 | unsigned integer N | - | | |||
+----+-----------------------+---------------------------------+ | +----+-----------------------+---------------------------------+ | |||
| 1 | negative integer -1-N | - | | | 1 | negative integer -1-N | - | | |||
+----+-----------------------+---------------------------------+ | +----+-----------------------+---------------------------------+ | |||
| 2 | byte string | N bytes | | | 2 | byte string | N bytes | | |||
+----+-----------------------+---------------------------------+ | +----+-----------------------+---------------------------------+ | |||
| 3 | text string | N bytes (UTF-8 text) | | | 3 | text string | N bytes (UTF-8 text) | | |||
+----+-----------------------+---------------------------------+ | +----+-----------------------+---------------------------------+ | |||
skipping to change at page 14, line 16 ¶ | skipping to change at page 14, line 39 ¶ | |||
Four CBOR items (arrays, maps, byte strings, and text strings) can be | Four CBOR items (arrays, maps, byte strings, and text strings) can be | |||
encoded with an indefinite length using additional information value | encoded with an indefinite length using additional information value | |||
31. This is useful if the encoding of the item needs to begin before | 31. This is useful if the encoding of the item needs to begin before | |||
the number of items inside the array or map, or the total length of | the number of items inside the array or map, or the total length of | |||
the string, is known. (The ability to start sending a data item | the string, is known. (The ability to start sending a data item | |||
before all of it is known is often referred to as "streaming" within | before all of it is known is often referred to as "streaming" within | |||
that data item.) | that data item.) | |||
Indefinite-length arrays and maps are dealt with differently than | Indefinite-length arrays and maps are dealt with differently than | |||
indefinite-length byte strings and text strings. | indefinite-length strings (byte strings and text strings). | |||
3.2.1. The "break" Stop Code | 3.2.1. The "break" Stop Code | |||
The "break" stop code is encoded with major type 7 and additional | The "break" stop code is encoded with major type 7 and additional | |||
information value 31 (0b111_11111). It is not itself a data item: it | information value 31 (0b111_11111). It is not itself a data item: it | |||
is just a syntactic feature to close an indefinite-length item. | is just a syntactic feature to close an indefinite-length item. | |||
If the "break" stop code appears anywhere where a data item is | If the "break" stop code appears anywhere where a data item is | |||
expected, other than directly inside an indefinite-length string, | expected, other than directly inside an indefinite-length string, | |||
array, or map -- for example directly inside a definite-length array | array, or map -- for example directly inside a definite-length array | |||
skipping to change at page 16, line 45 ¶ | skipping to change at page 17, line 21 ¶ | |||
The data item represented by the indefinite-length string is the | The data item represented by the indefinite-length string is the | |||
concatenation of the chunks (i.e., the empty byte or text string, | concatenation of the chunks (i.e., the empty byte or text string, | |||
respectively, if no chunk is present). (Note that zero-length | respectively, if no chunk is present). (Note that zero-length | |||
chunks, while not particularly useful, are permitted.) | chunks, while not particularly useful, are permitted.) | |||
If any item between the indefinite-length string indicator | If any item between the indefinite-length string indicator | |||
(0b010_11111 or 0b011_11111) and the "break" stop code is not a | (0b010_11111 or 0b011_11111) and the "break" stop code is not a | |||
definite-length string item of the same major type, the string is not | definite-length string item of the same major type, the string is not | |||
well-formed. | well-formed. | |||
The design does not allow nesting indefinite-length strings as chunks | ||||
into indefinite-length strings. If it were allowed, it would require | ||||
decoder implementations to keep a stack, or at least a count, of | ||||
nesting levels. It is unnecessary on the encoder side because the | ||||
inner indefinite-length string would consist of chunks, and these | ||||
could instead be put directly into the outer indefinite-length | ||||
string. | ||||
If any definite-length text string inside an indefinite-length text | If any definite-length text string inside an indefinite-length text | |||
string is invalid, the indefinite-length text string is invalid. | string is invalid, the indefinite-length text string is invalid. | |||
Note that this implies that the UTF-8 bytes of a single Unicode code | Note that this implies that the UTF-8 bytes of a single Unicode code | |||
point (scalar value) cannot be spread between chunks: a new chunk of | point (scalar value) cannot be spread between chunks: a new chunk of | |||
a text string can only be started at a code point boundary. | a text string can only be started at a code point boundary. | |||
For example, assume an encoded data item consisting of the bytes: | For example, assume an encoded data item consisting of the bytes: | |||
0b010_11111 0b010_00100 0xaabbccdd 0b010_00011 0xeeff99 0b111_11111 | 0b010_11111 0b010_00100 0xaabbccdd 0b010_00011 0xeeff99 0b111_11111 | |||
skipping to change at page 17, line 23 ¶ | skipping to change at page 18, line 11 ¶ | |||
After decoding, this results in a single byte string with seven | After decoding, this results in a single byte string with seven | |||
bytes: 0xaabbccddeeff99. | bytes: 0xaabbccddeeff99. | |||
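The chunk concatenation in this example can be sketched in Python as follows. The function name decode_indefinite_string is hypothetical; the sketch returns the raw concatenated bytes for both byte and text strings, does not validate UTF-8, and ignores any data after the "break" stop code.

```python
def decode_indefinite_string(data: bytes) -> bytes:
    """Concatenate the chunks of an indefinite-length byte or text string.

    Hypothetical helper; raises ValueError for input that is not well-formed.
    """
    if not data or data[0] not in (0b010_11111, 0b011_11111):
        raise ValueError("not an indefinite-length string")
    major_type = data[0] >> 5
    pos = 1
    chunks = []
    while True:
        if pos >= len(data):
            raise ValueError('missing "break" stop code')
        initial = data[pos]
        pos += 1
        if initial == 0b111_11111:
            break  # the "break" stop code closes the string
        if initial >> 5 != major_type:
            raise ValueError("chunk has a different major type")
        info = initial & 0x1F
        if info < 24:
            length = info
        elif info <= 27:
            size = 1 << (info - 24)  # 1, 2, 4, or 8 length bytes
            if pos + size > len(data):
                raise ValueError("truncated chunk length")
            length = int.from_bytes(data[pos:pos + size], "big")
            pos += size
        else:
            # 28..30 are not well-formed here; 31 would be a nested
            # indefinite-length string, which is not a definite-length chunk.
            raise ValueError("chunk is not a definite-length string")
        if pos + length > len(data):
            raise ValueError("truncated chunk content")
        chunks.append(data[pos:pos + length])
        pos += length
    return b"".join(chunks)
```

Called on the eleven bytes of the example (0x5f, 0x44, 0xaabbccdd, 0x43, 0xeeff99, 0xff), the sketch returns the seven bytes 0xaabbccddeeff99.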
3.2.4. Summary of indefinite-length use of major types | 3.2.4. Summary of indefinite-length use of major types | |||
Table 2 summarizes the major types defined by CBOR as used for | Table 2 summarizes the major types defined by CBOR as used for | |||
indefinite length encoding (with additional information set to 31). | indefinite length encoding (with additional information set to 31). | |||
mt stands for the major type. | mt stands for the major type. | |||
+----+-------------------+----------------------------------+ | +====+===================+==================================+ | |||
| mt | Meaning | enclosed up to "break" stop code | | | mt | Meaning | enclosed up to "break" stop code | | |||
+====+===================+==================================+ | +====+===================+==================================+ | |||
| 0 | (not well-formed) | - | | | 0 | (not well-formed) | - | | |||
+----+-------------------+----------------------------------+ | +----+-------------------+----------------------------------+ | |||
| 1 | (not well-formed) | - | | | 1 | (not well-formed) | - | | |||
+----+-------------------+----------------------------------+ | +----+-------------------+----------------------------------+ | |||
| 2 | byte string | definite-length byte strings | | | 2 | byte string | definite-length byte strings | | |||
+----+-------------------+----------------------------------+ | +----+-------------------+----------------------------------+ | |||
| 3 | text string | definite-length text strings | | | 3 | text string | definite-length text strings | | |||
+----+-------------------+----------------------------------+ | +----+-------------------+----------------------------------+ | |||
skipping to change at page 18, line 12 ¶ | skipping to change at page 18, line 42 ¶ | |||
major types (mt = major type, additional information = | major types (mt = major type, additional information = | |||
31) | 31) | |||
3.3. Floating-Point Numbers and Values with No Content | 3.3. Floating-Point Numbers and Values with No Content | |||
Major type 7 is for two types of data: floating-point numbers and | Major type 7 is for two types of data: floating-point numbers and | |||
"simple values" that do not need any content. Each value of the | "simple values" that do not need any content. Each value of the | |||
5-bit additional information in the initial byte has its own separate | 5-bit additional information in the initial byte has its own separate | |||
meaning, as defined in Table 3. Like the major types for integers, | meaning, as defined in Table 3. Like the major types for integers, | |||
items of this major type do not carry content data; all the | items of this major type do not carry content data; all the | |||
information is in the initial bytes. | information is in the initial bytes (the head). | |||
+-------------+---------------------------------------------------+ | +=============+===================================================+ | |||
| 5-Bit Value | Semantics | | | 5-Bit Value | Semantics | | |||
+=============+===================================================+ | +=============+===================================================+ | |||
| 0..23 | Simple value (value 0..23) | | | 0..23 | Simple value (value 0..23) | | |||
+-------------+---------------------------------------------------+ | +-------------+---------------------------------------------------+ | |||
| 24 | Simple value (value 32..255 in following byte) | | | 24 | Simple value (value 32..255 in following byte) | | |||
+-------------+---------------------------------------------------+ | +-------------+---------------------------------------------------+ | |||
| 25 | IEEE 754 Half-Precision Float (16 bits follow) | | | 25 | IEEE 754 Half-Precision Float (16 bits follow) | | |||
+-------------+---------------------------------------------------+ | +-------------+---------------------------------------------------+ | |||
| 26 | IEEE 754 Single-Precision Float (32 bits follow) | | | 26 | IEEE 754 Single-Precision Float (32 bits follow) | | |||
+-------------+---------------------------------------------------+ | +-------------+---------------------------------------------------+ | |||
skipping to change at page 18, line 40 ¶ | skipping to change at page 19, line 31 ¶ | |||
| | (Section 3.2.1) | | | | (Section 3.2.1) | | |||
+-------------+---------------------------------------------------+ | +-------------+---------------------------------------------------+ | |||
Table 3: Values for Additional Information in Major Type 7 | Table 3: Values for Additional Information in Major Type 7 | |||
As with all other major types, the 5-bit value 24 signifies a single- | As with all other major types, the 5-bit value 24 signifies a single- | |||
byte extension: it is followed by an additional byte to represent the | byte extension: it is followed by an additional byte to represent the | |||
simple value. (To minimize confusion, only the values 32 to 255 are | simple value. (To minimize confusion, only the values 32 to 255 are | |||
used.) This maintains the structure of the initial bytes: as for the | used.) This maintains the structure of the initial bytes: as for the | |||
other major types, the length of these always depends on the | other major types, the length of these always depends on the | |||
additional information in the first byte. Table 4 lists the values | additional information in the first byte. Table 4 lists the numeric | |||
assigned and available for simple types. | values assigned and available for simple values. | |||
+---------+-----------------+ | +=========+==============+ | |||
| Value | Semantics | | | Value | Semantics | | |||
+=========+=================+ | +=========+==============+ | |||
| 0..19 | (Unassigned) | | | 0..19 | (Unassigned) | | |||
+---------+-----------------+ | +---------+--------------+ | |||
| 20 | False | | | 20 | False | | |||
+---------+-----------------+ | +---------+--------------+ | |||
| 21 | True | | | 21 | True | | |||
+---------+-----------------+ | +---------+--------------+ | |||
| 22 | Null | | | 22 | Null | | |||
+---------+-----------------+ | +---------+--------------+ | |||
| 23 | Undefined value | | | 23 | Undefined | | |||
+---------+-----------------+ | +---------+--------------+ | |||
| 24..31 | (Reserved) | | | 24..31 | (Reserved) | | |||
+---------+-----------------+ | +---------+--------------+ | |||
| 32..255 | (Unassigned) | | | 32..255 | (Unassigned) | | |||
+---------+-----------------+ | +---------+--------------+ | |||
Table 4: Simple Values | Table 4: Simple Values | |||
An encoder MUST NOT issue two-byte sequences that start with 0xf8 | An encoder MUST NOT issue two-byte sequences that start with 0xf8 | |||
(major type = 7, additional information = 24) and continue with a | (major type 7, additional information 24) and continue with a byte | |||
byte less than 0x20 (32 decimal). Such sequences are not well- | less than 0x20 (32 decimal). Such sequences are not well-formed. | |||
formed. (This implies that an encoder cannot encode false, true, | (This implies that an encoder cannot encode false, true, null, or | |||
null, or undefined in two-byte sequences, only the one-byte variants | undefined in two-byte sequences, and that only the one-byte variants | |||
of these are well-formed; more generally speaking, each simple value | of these are well-formed; more generally speaking, each simple value | |||
only has a single representation variant). | only has a single representation variant). | |||
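The single-representation rule for simple values can be illustrated with a minimal Python sketch. The function name encode_simple is hypothetical, and rejecting the values 24..31 is one possible reading of "(Reserved)" in Table 4.

```python
def encode_simple(value: int) -> bytes:
    """Encode a simple value in its only well-formed representation.

    Hypothetical helper, not part of the draft.
    """
    if 0 <= value <= 23:
        # One-byte form: major type 7, value in the additional information.
        return bytes([0b111_00000 | value])
    if 32 <= value <= 255:
        # Two-byte form: 0xf8 followed by the value.
        return bytes([0xF8, value])
    # 24..31 are reserved, and 0xf8 followed by a byte below 0x20
    # is not well-formed.
    raise ValueError("simple value must be in 0..23 or 32..255")
```

Under this sketch, false, true, null, and undefined (values 20 to 23) come out as the single bytes 0xf4, 0xf5, 0xf6, and 0xf7.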
The 5-bit values of 25, 26, and 27 are for 16-bit, 32-bit, and 64-bit | The 5-bit values of 25, 26, and 27 are for 16-bit, 32-bit, and 64-bit | |||
IEEE 754 binary floating-point values [IEEE754]. These floating- | IEEE 754 binary floating-point values [IEEE754]. These floating- | |||
point values are encoded in the additional bytes of the appropriate | point values are encoded in the additional bytes of the appropriate | |||
size. (See Appendix D for some information about 16-bit floating- | size. (See Appendix D for some information about 16-bit floating- | |||
point numbers.) | point numbers.) | |||
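As a rough illustration of the three floating-point widths, the following Python sketch relies on the standard struct module, whose "e", "f", and "d" format characters correspond to IEEE 754 binary16, binary32, and binary64. The names encode_float and decode_float are hypothetical, and the caller picks the width explicitly; selecting the shortest width that preserves the value is outside the scope of this sketch.

```python
import struct

# Additional information value -> big-endian struct format
_FLOAT_FORMATS = {25: ">e", 26: ">f", 27: ">d"}


def encode_float(value: float, additional_info: int = 27) -> bytes:
    """Encode a float with the width selected by additional_info (25..27)."""
    fmt = _FLOAT_FORMATS[additional_info]
    return bytes([0b111_00000 | additional_info]) + struct.pack(fmt, value)


def decode_float(data: bytes) -> float:
    """Decode a major type 7 floating-point item from the start of data."""
    if not data or data[0] >> 5 != 7 or (data[0] & 0x1F) not in _FLOAT_FORMATS:
        raise ValueError("not a floating-point item")
    fmt = _FLOAT_FORMATS[data[0] & 0x1F]
    size = struct.calcsize(fmt)
    if len(data) < 1 + size:
        raise ValueError("truncated floating-point item")
    return struct.unpack(fmt, data[1:1 + size])[0]
```

Note that struct.pack raises an exception when a finite value does not fit the half-precision range, so an encoder built on this sketch would need to fall back to a wider format in that case.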
3.4. Tagging of Items | 3.4. Tagging of Items | |||
skipping to change at page 21, line 11 ¶ | skipping to change at page 21, line 39 ¶ | |||
decoder; it can simply present both the tag number and the tag | decoder; it can simply present both the tag number and the tag | |||
content to the application, without interpreting the additional | content to the application, without interpreting the additional | |||
semantics of the tag. | semantics of the tag. | |||
A tag applies semantics to the data item it encloses. Tags can nest: | A tag applies semantics to the data item it encloses. Tags can nest: | |||
If tag A encloses tag B, which encloses data item C, tag A applies to | If tag A encloses tag B, which encloses data item C, tag A applies to | |||
the result of applying tag B on data item C. | the result of applying tag B on data item C. | |||
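To make the nesting order concrete, the following Python sketch strips the tag heads from the front of an encoded item and reports the tag numbers from outermost to innermost. The name peel_tags is hypothetical; the sketch only parses tag heads and leaves the enclosed data item undecoded.

```python
def peel_tags(data: bytes):
    """Return (tag numbers, outermost first; remaining encoded tag content).

    Hypothetical helper, not part of the draft.
    """
    tags = []
    pos = 0
    while pos < len(data) and data[pos] >> 5 == 6:
        info = data[pos] & 0x1F
        pos += 1
        if info < 24:
            number = info
        elif info <= 27:
            size = 1 << (info - 24)  # 1, 2, 4, or 8 argument bytes
            if pos + size > len(data):
                raise ValueError("truncated tag number")
            number = int.from_bytes(data[pos:pos + size], "big")
            pos += size
        else:
            raise ValueError("not a well-formed tag head")
        tags.append(number)
    return tags, data[pos:]
```

For input in which tag A encloses tag B, the returned list is [A, B]; an application would apply the semantics of B to the content first and those of A to that result.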
IANA maintains a registry of tag numbers as described in Section 9.2. | IANA maintains a registry of tag numbers as described in Section 9.2. | |||
Table 5 provides a list of tag numbers that were defined in | Table 5 provides a list of tag numbers that were defined in | |||
[RFC7049], with definitions in the rest of this section. Note that | [RFC7049], with definitions in the rest of this section. (Tag number | |||
many other tag numbers have been defined since the publication of | 35 was also defined in [RFC7049]; a discussion of this tag number | |||
[RFC7049]; see the registry described at Section 9.2 for the complete | follows in Section 3.4.5.3.) Note that many other tag numbers have | |||
list. | been defined since the publication of [RFC7049]; see the registry | |||
described at Section 9.2 for the complete list. | ||||
+------------+-------------+----------------------------------+ | +============+=============+==================================+ | |||
| Tag Number | Data Item | Semantics | | | Tag Number | Data Item | Tag Content Semantics | | |||
+============+=============+==================================+ | +============+=============+==================================+ | |||
| 0 | text string | Standard date/time string; see | | | 0 | text string | Standard date/time string; see | | |||
| | | Section 3.4.1 | | | | | Section 3.4.1 | | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
| 1 | integer or | Epoch-based date/time; see | | | 1 | integer or | Epoch-based date/time; see | | |||
| | float | Section 3.4.2 | | | | float | Section 3.4.2 | | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
| 2 | byte string | Positive bignum; see | | | 2 | byte string | Positive bignum; see | | |||
| | | Section 3.4.3 | | | | | Section 3.4.3 | | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
skipping to change at page 22, line 5 ¶ | skipping to change at page 22, line 43 ¶ | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
| 24 | byte string | Encoded CBOR data item; see | | | 24 | byte string | Encoded CBOR data item; see | | |||
| | | Section 3.4.5.1 | | | | | Section 3.4.5.1 | | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
| 32 | text string | URI; see Section 3.4.5.3 | | | 32 | text string | URI; see Section 3.4.5.3 | | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
| 33 | text string | base64url; see Section 3.4.5.3 | | | 33 | text string | base64url; see Section 3.4.5.3 | | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
| 34 | text string | base64; see Section 3.4.5.3 | | | 34 | text string | base64; see Section 3.4.5.3 | | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
| 35 | text string | Regular expression; see | | ||||
| | | Section 3.4.5.3 | | ||||
+------------+-------------+----------------------------------+ | ||||
| 36 | text string | MIME message; see | | | 36 | text string | MIME message; see | | |||
| | | Section 3.4.5.3 | | | | | Section 3.4.5.3 | | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
| 55799 | (any) | Self-described CBOR; see | | | 55799 | (any) | Self-described CBOR; see | | |||
| | | Section 3.4.6 | | | | | Section 3.4.6 | | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
Table 5: Tag numbers defined in RFC 7049 | Table 5: Tag numbers defined in RFC 7049 | |||
Conceptually, tags are interpreted in the generic data model, not at | Conceptually, tags are interpreted in the generic data model, not at | |||
(de-)serialization time. A small number of tags (specifically, tag | (de-)serialization time. A small number of tags (at this time, tag | |||
number 25 and tag number 29) have been registered with semantics that | number 25 and tag number 29 [IANA.cbor-tags]) have been registered | |||
may require processing at (de-)serialization time: The decoder needs | with semantics that may require processing at (de-)serialization | |||
to be aware and the encoder needs to be in control of the exact | time: The decoder needs to be aware and the encoder needs to be in | |||
sequence in which data items are encoded into the CBOR data item. | control of the exact sequence in which data items are encoded into | |||
This means these tags cannot be implemented on top of every generic | the CBOR data item. This means these tags cannot be implemented on | |||
CBOR encoder/decoder (which might not reflect the serialization order | top of an arbitrary generic CBOR encoder/decoder (which might not | |||
for entries in a map at the data model level and vice versa); their | reflect the serialization order for entries in a map at the data | |||
implementation therefore typically needs to be integrated into the | model level and vice versa); their implementation therefore typically | |||
generic encoder/decoder. The definition of new tags with this | needs to be integrated into the generic encoder/decoder. The | |||
property is NOT RECOMMENDED. | definition of new tags with this property is NOT RECOMMENDED. | |||
IANA allocated tag numbers 65535, 4294967295, and | IANA allocated tag numbers 65535, 4294967295, and | |||
18446744073709551615 (binary all-ones in 16-bit, 32-bit, and 64-bit). | 18446744073709551615 (binary all-ones in 16-bit, 32-bit, and 64-bit). | |||
These can be used as a convenience for implementers that want a | These can be used as a convenience for implementers that want a | |||
single integer to indicate either that a specific tag is present, or | single integer data structure to indicate either that a specific tag | |||
the absence of a tag. That allocation is described in Section 10 of | is present, or the absence of a tag. That allocation is described in | |||
[I-D.bormann-cbor-notable-tags]. These tags are not intended to | Section 10 of [I-D.bormann-cbor-notable-tags]. These tags are not | |||
occur in actual CBOR data items; implementations may flag such an | intended to occur in actual CBOR data items; implementations MAY flag | |||
occurrence as an error. | such an occurrence as an error. | |||
Protocols using tag numbers 0 and 1 extend the generic data model | Protocols using tag numbers 0 and 1 extend the generic data model | |||
(Section 2) with data items representing points in time; tag numbers | (Section 2) with data items representing points in time; tag numbers | |||
2 and 3, with arbitrarily sized integers; and tag numbers 4 and 5, | 2 and 3, with arbitrarily sized integers; and tag numbers 4 and 5, | |||
with floating-point values of arbitrary size and precision. | with floating-point values of arbitrary size and precision. | |||
3.4.1. Standard Date/Time String | 3.4.1. Standard Date/Time String | |||
Tag number 0 contains a text string in the standard format described | Tag number 0 contains a text string in the standard format described | |||
by the "date-time" production in [RFC3339], as refined by Section 3.3 | by the "date-time" production in [RFC3339], as refined by Section 3.3 | |||
of [RFC4287], representing the point in time described there. A | of [RFC4287], representing the point in time described there. A | |||
nested item of another type or that doesn't match the [RFC4287] | nested item of another type or a text string that doesn't match the | |||
format is invalid. | [RFC4287] format is invalid. | |||
3.4.2. Epoch-based Date/Time | 3.4.2. Epoch-based Date/Time | |||
Tag number 1 contains a numerical value counting the number of | Tag number 1 contains a numerical value counting the number of | |||
seconds from 1970-01-01T00:00Z in UTC time to the represented point | seconds from 1970-01-01T00:00Z in UTC time to the represented point | |||
in civil time. | in civil time. | |||
The tag content MUST be an unsigned or negative integer (major types | The tag content MUST be an unsigned or negative integer (major types | |||
0 and 1), or a floating-point number (major type 7 with additional | 0 and 1), or a floating-point number (major type 7 with additional | |||
information 25, 26, or 27). Other contained types are invalid. | information 25, 26, or 27). Other contained types are invalid. | |||
Non-negative values (major type 0 and non-negative floating-point | Non-negative values (major type 0 and non-negative floating-point | |||
numbers) stand for time values on or after 1970-01-01T00:00Z UTC and | numbers) stand for time values on or after 1970-01-01T00:00Z UTC and | |||
are interpreted according to POSIX [TIME_T]. (POSIX time is also | are interpreted according to POSIX [TIME_T]. (POSIX time is also | |||
known as UNIX Epoch time. Note that leap seconds are handled | known as "UNIX Epoch time".) Leap seconds are handled specially by | |||
specially by POSIX time and this results in a 1 second discontinuity | POSIX time and this results in a 1 second discontinuity several times | |||
several times per decade.) Note that applications that require the | per decade. Note that applications that require the expression of | |||
expression of times beyond early 2106 cannot leave out support of | times beyond early 2106 cannot leave out support of 64-bit integers | |||
64-bit integers for the tag content. | for the tag content. | |||
Negative values (major type 1 and negative floating-point numbers) | Negative values (major type 1 and negative floating-point numbers) | |||
are interpreted as determined by the application requirements as | are interpreted as determined by the application requirements as | |||
there is no universal standard for UTC count-of-seconds time before | there is no universal standard for UTC count-of-seconds time before | |||
1970-01-01T00:00Z (this is particularly true for points in time that | 1970-01-01T00:00Z (this is particularly true for points in time that | |||
precede discontinuities in national calendars). The same applies to | precede discontinuities in national calendars). The same applies to | |||
non-finite values. | non-finite values. | |||
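As a rough illustration of the integer case, the following Python sketch encodes a count of seconds as tag number 1 content, using the shortest integer form for the argument. The helper names `encode_head` and `encode_epoch_time` are hypothetical and not part of this specification, and floating-point tag content is deliberately left out.

```python
import struct


def encode_head(major: int, arg: int) -> bytes:
    """Encode a CBOR head (major type plus argument) in the shortest form."""
    if not 0 <= arg < 1 << 64:
        raise ValueError("argument does not fit in 64 bits")
    if arg < 24:
        return bytes([(major << 5) | arg])
    if arg < 1 << 8:
        return bytes([(major << 5) | 24, arg])
    if arg < 1 << 16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", arg)
    if arg < 1 << 32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", arg)
    return bytes([(major << 5) | 27]) + struct.pack(">Q", arg)


def encode_epoch_time(seconds: int) -> bytes:
    """Encode an integer count of seconds since 1970-01-01T00:00Z as tag 1."""
    tag = encode_head(6, 1)  # major type 6, tag number 1
    if seconds >= 0:
        return tag + encode_head(0, seconds)   # major type 0
    return tag + encode_head(1, -1 - seconds)  # major type 1


# The largest value that fits a 32-bit argument falls in early 2106;
# one second later the 64-bit form is required.
last_32_bit = encode_epoch_time(2**32 - 1)
first_64_bit = encode_epoch_time(2**32)
```

In this sketch, `last_32_bit` carries a 4-byte argument while `first_64_bit` carries an 8-byte one, which is why a decoder limited to 32-bit integers cannot express times beyond early 2106.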
To indicate fractional seconds, floating-point values can be used | To indicate fractional seconds, floating-point values can be used | |||
within tag number 1 instead of integer values. Note that this | within tag number 1 instead of integer values. Note that this | |||
skipping to change at page 23, line 44 ¶ | skipping to change at page 24, line 30 ¶ | |||
non-zero fractions of seconds only for a short period of time around | non-zero fractions of seconds only for a short period of time around | |||
early 1970. An application that requires tag number 1 support may | early 1970. An application that requires tag number 1 support may | |||
restrict the tag content to be an integer (or a floating-point value) | restrict the tag content to be an integer (or a floating-point value) | |||
only. | only. | |||
Note that platform types for date/time may include null or undefined | Note that platform types for date/time may include null or undefined | |||
values, which may also be desirable at an application protocol level. | values, which may also be desirable at an application protocol level. | |||
While emitting tag number 1 values with non-finite tag content values | While emitting tag number 1 values with non-finite tag content values | |||
(e.g., with NaN for undefined date/time values or with Infinity for | (e.g., with NaN for undefined date/time values or with Infinity for | |||
an expiry date that is not set) may seem an obvious way to handle | an expiry date that is not set) may seem an obvious way to handle | |||
this, using untagged null or undefined is often a better solution. | this, using untagged null or undefined avoids the use of non-finites | |||
Application protocol designers are encouraged to consider these cases | and results in a shorter encoding. Application protocol designers | |||
and include clear guidelines for handling them. | are encouraged to consider these cases and include clear guidelines | |||
for handling them. | ||||
3.4.3. Bignums | 3.4.3. Bignums | |||
Protocols using tag numbers 2 and 3 extend the generic data model | Protocols using tag numbers 2 and 3 extend the generic data model | |||
(Section 2) with "bignums" representing arbitrarily sized integers. | (Section 2) with "bignums" representing arbitrarily sized integers. | |||
In the basic generic data model, bignum values are not equal to | In the basic generic data model, bignum values are not equal to | |||
integers from the same model, but the extended generic data model | integers from the same model, but the extended generic data model | |||
created by this tag definition defines equivalence based on numeric | created by this tag definition defines equivalence based on numeric | |||
value, and preferred serialization (Section 4.1) never makes use of | value, and preferred serialization (Section 4.1) never makes use of | |||
bignums that also can be expressed as basic integers (see below). | bignums that also can be expressed as basic integers (see below). | |||
skipping to change at page 25, line 28 ¶ | skipping to change at page 26, line 15 ¶ | |||
A decimal fraction or a bigfloat is represented as a tagged array | A decimal fraction or a bigfloat is represented as a tagged array | |||
that contains exactly two integer numbers: an exponent e and a | that contains exactly two integer numbers: an exponent e and a | |||
mantissa m. Decimal fractions (tag number 4) use base-10 exponents; | mantissa m. Decimal fractions (tag number 4) use base-10 exponents; | |||
the value of a decimal fraction data item is m*(10**e). Bigfloats | the value of a decimal fraction data item is m*(10**e). Bigfloats | |||
(tag number 5) use base-2 exponents; the value of a bigfloat data | (tag number 5) use base-2 exponents; the value of a bigfloat data | |||
item is m*(2**e). The exponent e MUST be represented in an integer | item is m*(2**e). The exponent e MUST be represented in an integer | |||
of major type 0 or 1, while the mantissa can also be a bignum | of major type 0 or 1, while the mantissa can also be a bignum | |||
(Section 3.4.3). Contained items with other structures are invalid. | (Section 3.4.3). Contained items with other structures are invalid. | |||
An example of a decimal fraction is that the number 273.15 could be | An example of a decimal fraction is that the number 273.15 could be | |||
represented as 0b110_00100 (major type of 6 for the tag, additional | represented as 0b110_00100 (major type 6 for tag, additional | |||
information of 4 for the number of tag), followed by 0b100_00010 | information 4 for the tag number), followed by 0b100_00010 (major | |||
(major type of 4 for the array, additional information of 2 for the | type 4 for the array, additional information 2 for the length of the | |||
length of the array), followed by 0b001_00001 (major type of 1 for | array), followed by 0b001_00001 (major type 1 for the first integer, | |||
the first integer, additional information of 1 for the value of -2), | additional information 1 for the value of -2), followed by | |||
followed by 0b000_11001 (major type of 0 for the second integer, | 0b000_11001 (major type 0 for the second integer, additional | |||
additional information of 25 for a two-byte value), followed by | information 25 for a two-byte value), followed by 0b0110101010110011 | |||
0b0110101010110011 (27315 in two bytes). In hexadecimal: | (27315 in two bytes). In hexadecimal: | |||
C4 -- Tag 4 | C4 -- Tag 4 | |||
82 -- Array of length 2 | 82 -- Array of length 2 | |||
21 -- -2 | 21 -- -2 | |||
19 6ab3 -- 27315 | 19 6ab3 -- 27315 | |||
An example of a bigfloat is that the number 1.5 could be represented | An example of a bigfloat is that the number 1.5 could be represented | |||
as 0b110_00101 (major type of 6 for the tag, additional information | as 0b110_00101 (major type 6 for tag, additional information 5 for | |||
of 5 for the number of tag), followed by 0b100_00010 (major type of 4 | the tag number), followed by 0b100_00010 (major type 4 for the array, | |||
for the array, additional information of 2 for the length of the | additional information 2 for the length of the array), followed by | |||
array), followed by 0b001_00000 (major type of 1 for the first | 0b001_00000 (major type 1 for the first integer, additional | |||
integer, additional information of 0 for the value of -1), followed | information 0 for the value of -1), followed by 0b000_00011 (major | |||
by 0b000_00011 (major type of 0 for the second integer, additional | type 0 for the second integer, additional information 3 for the value | |||
information of 3 for the value of 3). In hexadecimal: | of 3). In hexadecimal: | |||
C5 -- Tag 5 | C5 -- Tag 5 | |||
82 -- Array of length 2 | 82 -- Array of length 2 | |||
20 -- -1 | 20 -- -1 | |||
03 -- 3 | 03 -- 3 | |||
Decimal fractions and bigfloats provide no representation of | Decimal fractions and bigfloats provide no representation of | |||
Infinity, -Infinity, or NaN; if these are needed in place of a | Infinity, -Infinity, or NaN; if these are needed in place of a | |||
decimal fraction or bigfloat, the IEEE 754 half-precision | decimal fraction or bigfloat, the IEEE 754 half-precision | |||
representations from Section 3.3 can be used. | representations from Section 3.3 can be used. | |||
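The two worked examples above can be checked with a small Python sketch that reads the tagged two-element array and computes m*(10**e) or m*(2**e) exactly. The function names `read_int` and `decode_fraction` are hypothetical, and the sketch assumes both array elements are integers of major type 0 or 1 (a bignum mantissa is not handled).

```python
from fractions import Fraction


def read_int(data: bytes, pos: int):
    """Read a major type 0 or 1 integer at pos; return (value, next_pos)."""
    initial = data[pos]
    major, info = initial >> 5, initial & 0x1F
    if major not in (0, 1):
        raise ValueError("expected an integer of major type 0 or 1")
    pos += 1
    if info < 24:
        arg = info
    elif info <= 27:
        size = 1 << (info - 24)  # 1, 2, 4, or 8 following bytes
        if pos + size > len(data):
            raise ValueError("truncated integer")
        arg = int.from_bytes(data[pos:pos + size], "big")
        pos += size
    else:
        raise ValueError("not a well-formed integer head")
    return (arg if major == 0 else -1 - arg), pos


def decode_fraction(data: bytes) -> Fraction:
    """Decode a tag 4 (decimal fraction) or tag 5 (bigfloat) data item."""
    if len(data) < 4 or data[0] not in (0xC4, 0xC5) or data[1] != 0x82:
        raise ValueError("expected tag 4 or 5 around an array of length 2")
    base = 10 if data[0] == 0xC4 else 2
    exponent, pos = read_int(data, 2)
    mantissa, pos = read_int(data, pos)
    return mantissa * Fraction(base) ** exponent


decimal_example = decode_fraction(bytes.fromhex("c48221196ab3"))
bigfloat_example = decode_fraction(bytes.fromhex("c5822003"))
```

Using `Fraction` keeps the result exact, so the decimal example compares equal to 273.15 as a rational number rather than to its nearest binary floating-point approximation.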
skipping to change at page 26, line 35 ¶ | skipping to change at page 27, line 19 ¶ | |||
item is being decoded. Tag number 24 (CBOR data item) can be used to | item is being decoded. Tag number 24 (CBOR data item) can be used to | |||
tag the embedded byte string as a single data item encoded in CBOR | tag the embedded byte string as a single data item encoded in CBOR | |||
format. Contained items that aren't byte strings are invalid. A | format. Contained items that aren't byte strings are invalid. A | |||
contained byte string is valid if it encodes a well-formed CBOR data | contained byte string is valid if it encodes a well-formed CBOR data | |||
item; validity checking of the decoded CBOR item is not required for | item; validity checking of the decoded CBOR item is not required for | |||
tag validity (but could be offered by a generic decoder as a special | tag validity (but could be offered by a generic decoder as a special | |||
option). | option). | |||
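A minimal Python sketch of the tag number 24 wrapping follows. The function name `wrap_embedded_cbor` is hypothetical, the input is assumed to already be a well-formed encoded CBOR data item, and only byte strings shorter than 256 bytes are handled to keep the example short.

```python
def wrap_embedded_cbor(encoded_item: bytes) -> bytes:
    """Wrap an already-encoded CBOR data item in a tag 24 byte string."""
    length = len(encoded_item)
    if length < 24:
        head = bytes([0x40 | length])  # major type 2, length in initial byte
    elif length < 256:
        head = bytes([0x58, length])   # major type 2, one-byte length
    else:
        raise ValueError("this sketch only handles short byte strings")
    # 0xd8 0x18 is major type 6 with a one-byte argument of 24.
    return b"\xd8\x18" + head + encoded_item


# The text string "IETF" (0x6449455446) carried as an embedded data item.
wrapped = wrap_embedded_cbor(bytes.fromhex("6449455446"))
```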
3.4.5.2. Expected Later Encoding for CBOR-to-JSON Converters | 3.4.5.2. Expected Later Encoding for CBOR-to-JSON Converters | |||
Tags number 21 to 23 indicate that a byte string might require a | Tag numbers 21 to 23 indicate that a byte string might require a | |||
specific encoding when interoperating with a text-based | specific encoding when interoperating with a text-based | |||
representation. These tags are useful when an encoder knows that the | representation. These tags are useful when an encoder knows that the | |||
byte string data it is writing is likely to be later converted to a | byte string data it is writing is likely to be later converted to a | |||
particular JSON-based usage. That usage specifies that some strings | particular JSON-based usage. That usage specifies that some strings | |||
are encoded as base64, base64url, and so on. The encoder uses byte | are encoded as base64, base64url, and so on. The encoder uses byte | |||
strings instead of doing the encoding itself to reduce the message | strings instead of doing the encoding itself to reduce the message | |||
size, to reduce the code size of the encoder, or both. The encoder | size, to reduce the code size of the encoder, or both. The encoder | |||
does not know whether or not the converter will be generic, and | does not know whether or not the converter will be generic, and | |||
therefore wants to say what it believes is the proper way to convert | therefore wants to say what it believes is the proper way to convert | |||
binary strings to JSON. | binary strings to JSON. | |||
skipping to change at page 27, line 12 ¶ | skipping to change at page 27, line 43 ¶ | |||
contained in the data item, except for those contained in a nested | contained in the data item, except for those contained in a nested | |||
data item tagged with an expected conversion. | data item tagged with an expected conversion. | |||
These three tag numbers suggest conversions to three of the base data | These three tag numbers suggest conversions to three of the base data | |||
encodings defined in [RFC4648]. Tag number 21 suggests conversion to | encodings defined in [RFC4648]. Tag number 21 suggests conversion to | |||
base64url encoding (Section 5 of RFC 4648), where padding is not used | base64url encoding (Section 5 of RFC 4648), where padding is not used | |||
(see Section 3.2 of RFC 4648); that is, all trailing equals signs | (see Section 3.2 of RFC 4648); that is, all trailing equals signs | |||
("=") are removed from the encoded string. Tag number 22 suggests | ("=") are removed from the encoded string. Tag number 22 suggests | |||
conversion to classical base64 encoding (Section 4 of RFC 4648), with | conversion to classical base64 encoding (Section 4 of RFC 4648), with | |||
padding as defined in RFC 4648. For both base64url and base64, | padding as defined in RFC 4648. For both base64url and base64, | |||
padding bits are set to zero (see Section 3.5 of RFC 4648), and | padding bits are set to zero (see Section 3.5 of RFC 4648), and the | |||
encoding is performed without the inclusion of any line breaks, | conversion to alternate encoding is performed on the contents of the | |||
whitespace, or other additional characters. Tag number 23 suggests | byte string (that is, without adding any line breaks, whitespace, or | |||
conversion to base16 (hex) encoding, with uppercase alphabetics (see | other additional characters). Tag number 23 suggests conversion to | |||
Section 8 of RFC 4648). Note that, for all three tag numbers, the | base16 (hex) encoding, with uppercase alphabetics (see Section 8 of | |||
encoding of the empty byte string is the empty text string. | RFC 4648). Note that, for all three tag numbers, the encoding of the | |||
empty byte string is the empty text string. | ||||
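The three suggested conversions can be sketched with Python's standard `base64` module. The function names below are hypothetical; a converter would choose among them according to the tag number (21, 22, or 23) found around the byte string.

```python
import base64


def to_base64url(data: bytes) -> str:
    """Tag 21: base64url with all trailing '=' padding removed."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def to_base64(data: bytes) -> str:
    """Tag 22: classical base64 with padding, no line breaks."""
    return base64.b64encode(data).decode("ascii")


def to_base16(data: bytes) -> str:
    """Tag 23: base16 (hex) with uppercase alphabetics."""
    return base64.b16encode(data).decode("ascii")


sample = bytes.fromhex("01020304")
converted = (to_base64url(sample), to_base64(sample), to_base16(sample))
```

For the sample bytes above, the only difference between the first two results is the padding; inputs that produce the characters "+" or "/" in base64 yield "-" and "_" in base64url instead.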
3.4.5.3. Encoded Text | 3.4.5.3. Encoded Text | |||
Some text strings hold data that have formats widely used on the | Some text strings hold data that have formats widely used on the | |||
Internet, and sometimes those formats can be validated and presented | Internet, and sometimes those formats can be validated and presented | |||
to the application in appropriate form by the decoder. There are | to the application in appropriate form by the decoder. There are | |||
tags for some of these formats. | tags for some of these formats. | |||
* Tag number 32 is for URIs, as defined in [RFC3986]. If the text | * Tag number 32 is for URIs, as defined in [RFC3986]. If the text | |||
string doesn't match the "URI-reference" production, the string is | string doesn't match the "URI-reference" production, the string is | |||
skipping to change at page 28, line 5 ¶ | skipping to change at page 28, line 33 ¶ | |||
- the padding bits in a 2- or 3-character block are not 0, or | - the padding bits in a 2- or 3-character block are not 0, or | |||
- the base64 encoding has the wrong number of padding characters, | - the base64 encoding has the wrong number of padding characters, | |||
or | or | |||
- the base64url encoding has padding characters, | - the base64url encoding has padding characters, | |||
the string is invalid. | the string is invalid. | |||
* Tag number 35 is for regular expressions that are roughly in Perl | ||||
Compatible Regular Expressions (PCRE/PCRE2) form [PCRE] or a | ||||
version of the JavaScript regular expression syntax [ECMA262]. | ||||
(Note that more specific identification may be necessary if the | ||||
actual version of the specification underlying the regular | ||||
expression, or more than just the text of the regular expression | ||||
itself, need to be conveyed.) Any contained string value is | ||||
valid. | ||||
* Tag number 36 is for MIME messages (including all headers), as | * Tag number 36 is for MIME messages (including all headers), as | |||
defined in [RFC2045]. A text string that isn't a valid MIME | defined in [RFC2045]. A text string that isn't a valid MIME | |||
message is invalid. (For this tag, validity checking may be | message is invalid. (For this tag, validity checking may be | |||
particularly onerous for a generic decoder and might therefore not | particularly onerous for a generic decoder and might therefore not | |||
be offered. Note that many MIME messages are general binary data | be offered. Note that many MIME messages are general binary data | |||
and can therefore not be represented in a text string; | and can therefore not be represented in a text string; | |||
[IANA.cbor-tags] lists a registration for tag number 257 that is | [IANA.cbor-tags] lists a registration for tag number 257 that is | |||
similar to tag number 36 but uses a byte string as its tag | similar to tag number 36 but uses a byte string as its tag | |||
content.) | content.) | |||
Note that tag numbers 33 and 34 differ from 21 and 22 in that the | Note that tag numbers 33 and 34 differ from 21 and 22 in that the | |||
data is transported in base-encoded form for the former and in raw | data is transported in base-encoded form for the former and in raw | |||
byte string form for the latter. | byte string form for the latter. | |||
[RFC7049] also defined a tag number 35, for regular expressions that | ||||
are in Perl Compatible Regular Expressions (PCRE/PCRE2) form [PCRE] | ||||
or in JavaScript regular expression syntax [ECMA262]. The state of | ||||
the art in these regular expression specifications has since advanced | ||||
and is continually advancing, so the present specification does not | ||||
attempt to update the references to a snapshot that is current at the | ||||
time of writing. Instead, this tag remains available (as registered | ||||
in [RFC7049]) for applications that specify the particular regular | ||||
expression variant they use out-of-band (possibly by limiting the | ||||
usage to a defined common subset of both PCRE and ECMA262). As the | ||||
present specification clarifies tag validity beyond [RFC7049], we | ||||
note that due to the open way the tag was defined in [RFC7049], any | ||||
contained string value needs to be valid at the CBOR tag level (but | ||||
may then not be "expected" at the application level). | ||||
3.4.6. Self-Described CBOR | 3.4.6. Self-Described CBOR | |||
In many applications, it will be clear from the context that CBOR is | In many applications, it will be clear from the context that CBOR is | |||
being employed for encoding a data item. For instance, a specific | being employed for encoding a data item. For instance, a specific | |||
protocol might specify the use of CBOR, or a media type is indicated | protocol might specify the use of CBOR, or a media type is indicated | |||
that specifies its use. However, there may be applications where | that specifies its use. However, there may be applications where | |||
such context information is not available, such as when CBOR data is | such context information is not available, such as when CBOR data is | |||
stored in a file that does not have disambiguating metadata. Here, | stored in a file that does not have disambiguating metadata. Here, | |||
it may help to have some distinguishing characteristics for the data | it may help to have some distinguishing characteristics for the data | |||
itself. | itself. | |||
skipping to change at page 29, line 40 ¶ | skipping to change at page 30, line 24 ¶ | |||
say, always uses 64-bit integers. | say, always uses 64-bit integers. | |||
Similarly, a constrained encoder may be limited in the variety of | Similarly, a constrained encoder may be limited in the variety of | |||
representation variants it supports in such a way that it does not | representation variants it supports in such a way that it does not | |||
emit preferred serializations ("variant encoder"): Say, it could be | emit preferred serializations ("variant encoder"): Say, it could be | |||
designed to always use the 32-bit variant for an integer that it | designed to always use the 32-bit variant for an integer that it | |||
encodes even if a short representation is available (again, assuming | encodes even if a short representation is available (again, assuming | |||
that there is no application need for integers that can only be | that there is no application need for integers that can only be | |||
represented with the 64-bit variant). A decoder that does not rely | represented with the 64-bit variant). A decoder that does not rely | |||
on only ever receiving preferred serializations ("variation-tolerant | on only ever receiving preferred serializations ("variation-tolerant | |||
decoder") can there be said to be more universally interoperable (it | decoder") can therefore be said to be more universally interoperable | |||
might very well optimize for the case of receiving preferred | (it might very well optimize for the case of receiving preferred | |||
serializations, though). Full implementations of CBOR decoders are | serializations, though). Full implementations of CBOR decoders are | |||
by definition variation-tolerant; the distinction is only relevant if | by definition variation-tolerant; the distinction is only relevant if | |||
a constrained implementation of a CBOR decoder meets a variant | a constrained implementation of a CBOR decoder meets a variant | |||
encoder. | encoder. | |||
The preferred serialization always uses the shortest form of | The preferred serialization always uses the shortest form of | |||
representing the argument (Section 3); it also uses the shortest | representing the argument (Section 3); it also uses the shortest | |||
floating-point encoding that preserves the value being encoded. | floating-point encoding that preserves the value being encoded. | |||
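One way to pick the shortest floating-point encoding that preserves a value is to try the narrower formats first and keep the first one that round-trips. The Python sketch below does this with the standard `struct` module; the function name `encode_float_preferred` is hypothetical, and mapping every NaN to the half-precision quiet NaN (ignoring payloads) is an assumption of this sketch.

```python
import math
import struct


def encode_float_preferred(value: float) -> bytes:
    """Encode a float in the shortest of binary16/32/64 that preserves it."""
    if math.isnan(value):
        # NaN never compares equal to itself, so handle it explicitly.
        return bytes.fromhex("f97e00")
    for fmt, initial_byte in ((">e", 0xF9), (">f", 0xFA)):
        try:
            packed = struct.pack(fmt, value)
        except OverflowError:
            continue  # finite value too large for this width
        if struct.unpack(fmt, packed)[0] == value:
            return bytes([initial_byte]) + packed
    return b"\xfb" + struct.pack(">d", value)


half = encode_float_preferred(1.5)
single = encode_float_preferred(1000000.5)
double = encode_float_preferred(1.1)
```

Positive and negative Infinity pack without error in every width, so they are selected as 16-bit floats by the same round-trip test.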
The preferred serialization for a floating-point value is the | The preferred serialization for a floating-point value is the | |||
skipping to change at page 30, line 49 ¶ | skipping to change at page 31, line 38 ¶ | |||
- 24 to 255 and -25 to -256 MUST be expressed only with an | - 24 to 255 and -25 to -256 MUST be expressed only with an | |||
additional uint8_t; | additional uint8_t; | |||
- 256 to 65535 and -257 to -65536 MUST be expressed only with an | - 256 to 65535 and -257 to -65536 MUST be expressed only with an | |||
additional uint16_t; | additional uint16_t; | |||
- 65536 to 4294967295 and -65537 to -4294967296 MUST be expressed | - 65536 to 4294967295 and -65537 to -4294967296 MUST be expressed | |||
only with an additional uint32_t. | only with an additional uint32_t. | |||
Floating-point values also MUST use the shortest form that | Floating-point values also MUST use the shortest form that | |||
preserves the value, e.g., 1.5 is encoded as 0xf93e00 and 1000000.5 | preserves the value, e.g., 1.5 is encoded as 0xf93e00 (binary16) | |||
as 0xfa49742408. (One implementation of this is to have all | and 1000000.5 as 0xfa49742408 (binary32). (One implementation of | |||
floats start as a 64-bit float, then do a test conversion to a | this is to have all floats start as a 64-bit float, then do a test | |||
32-bit float; if the result is the same numeric value, use the | conversion to a 32-bit float; if the result is the same numeric | |||
shorter form and repeat the process with a test conversion to a | value, use the shorter form and repeat the process with a test | |||
16-bit float. This also works to select 16-bit float for positive | conversion to a 16-bit float. This also works to select 16-bit | |||
and negative Infinity.) | float for positive and negative Infinity.) | |||
* Indefinite-length items MUST NOT appear. They can be encoded as | * Indefinite-length items MUST NOT appear. They can be encoded as | |||
definite-length items instead. | definite-length items instead. | |||
* The keys in every map MUST be sorted in the bytewise lexicographic | * The keys in every map MUST be sorted in the bytewise lexicographic | |||
order of their deterministic encodings. For example, the | order of their deterministic encodings. For example, the | |||
following keys are sorted correctly: | following keys are sorted correctly: | |||
1. 10, encoded as 0x0a. | 1. 10, encoded as 0x0a. | |||
skipping to change at page 31, line 31 ¶ | skipping to change at page 32, line 21 ¶ | |||
4. "z", encoded as 0x617a. | 4. "z", encoded as 0x617a. | |||
5. "aa", encoded as 0x626161. | 5. "aa", encoded as 0x626161. | |||
6. [100], encoded as 0x811864. | 6. [100], encoded as 0x811864. | |||
7. [-1], encoded as 0x8120. | 7. [-1], encoded as 0x8120. | |||
8. false, encoded as 0xf4. | 8. false, encoded as 0xf4. | |||
(Implementation note: the self-delimiting nature of the CBOR | ||||
encoding means that there are no two well-formed CBOR encoded data | ||||
items where one is a prefix of the other. The bytewise | ||||
lexicographic comparison of deterministic encodings of different | ||||
map keys therefore always ends in a position where the byte | ||||
differs between the keys, before the end of a key is reached.) | ||||
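The map key ordering rule can be sketched in Python, where comparing `bytes` objects is already a bytewise lexicographic comparison. The function names are hypothetical, and the keys are assumed to be supplied as their deterministic encodings; the sample uses encodings taken from the example list above.

```python
def sort_encoded_keys(encoded_keys):
    """Sort deterministic key encodings in bytewise lexicographic order."""
    return sorted(encoded_keys)


def has_prefix_pair(encoded_keys):
    """Return True if one distinct encoding is a prefix of another."""
    return any(
        a != b and b.startswith(a)
        for a in encoded_keys
        for b in encoded_keys
    )


# Encodings of false, [-1], "aa", 10, [100], and "z", deliberately shuffled.
shuffled = [bytes.fromhex(h) for h in
            ("f4", "8120", "626161", "0a", "811864", "617a")]
ordered = [key.hex() for key in sort_encoded_keys(shuffled)]
prefix_found = has_prefix_pair(shuffled)
```

Because no well-formed encoding is a prefix of another, the comparison between two different keys is always decided by a differing byte and never by length alone.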
4.2.2. Additional Deterministic Encoding Considerations | 4.2.2. Additional Deterministic Encoding Considerations | |||
CBOR tags present additional considerations for deterministic | CBOR tags present additional considerations for deterministic | |||
encoding. If a CBOR-based protocol were to provide the same | encoding. If a CBOR-based protocol were to provide the same | |||
semantics for the presence and absence of a specific tag (e.g., by | semantics for the presence and absence of a specific tag (e.g., by | |||
allowing both tag 1 data items and raw numbers in a date/time | allowing both tag 1 data items and raw numbers in a date/time | |||
position, treating the latter as if they were tagged), the | position, treating the latter as if they were tagged), the | |||
deterministic format would not allow the presence of the tag, based | deterministic format would not allow the presence of the tag, based | |||
on the "shortest form" principle. For example, a protocol might give | on the "shortest form" principle. For example, a protocol might give | |||
encoders the choice of representing a URL as either a text string or, | encoders the choice of representing a URL as either a text string or, | |||
skipping to change at page 32, line 21 ¶ | skipping to change at page 33, line 14 ¶ | |||
Protocols that include floating-point values, whether represented | Protocols that include floating-point values, whether represented | |||
using basic floating-point values (Section 3.3) or using tags (or | using basic floating-point values (Section 3.3) or using tags (or | |||
both), may need to define extra requirements on their deterministic | both), may need to define extra requirements on their deterministic | |||
encodings, such as: | encodings, such as: | |||
* Although IEEE floating-point values can represent both positive | * Although IEEE floating-point values can represent both positive | |||
and negative zero as distinct values, the application might not | and negative zero as distinct values, the application might not | |||
distinguish these and might decide to represent all zero values | distinguish these and might decide to represent all zero values | |||
with a positive sign, disallowing negative zero. (The application | with a positive sign, disallowing negative zero. (The application | |||
may also want to restrict the precision of floating point values | may also want to restrict the precision of floating-point values | |||
in such a way that there is never a need to represent 64-bit -- or | in such a way that there is never a need to represent 64-bit -- or | |||
even 32-bit -- floating-point values.) | even 32-bit -- floating-point values.) | |||
* If a protocol includes a field that can express floating-point | * If a protocol includes a field that can express floating-point | |||
values, with a specific data model that declares integer and | values, with a specific data model that declares integer and | |||
floating-point values to be interchangeable, the protocol's | floating-point values to be interchangeable, the protocol's | |||
deterministic encoding needs to specify whether the integer 1.0 is | deterministic encoding needs to specify whether (for example) the | |||
encoded as 0x01, 0xf93c00, 0xfa3f800000, or 0xfb3ff0000000000000. | integer 1.0 is encoded as 0x01 (unsigned integer), 0xf93c00 | |||
Example rules for this are: | (binary16), 0xfa3f800000 (binary32), or 0xfb3ff0000000000000 | |||
(binary64). Example rules for this are: | ||||
1. Encode integral values that fit in 64 bits as values from | 1. Encode integral values that fit in 64 bits as values from | |||
major types 0 and 1, and other values as the preferred | major types 0 and 1, and other values as the preferred | |||
(smallest of 16-, 32-, or 64-bit) floating-point | (smallest of 16-, 32-, or 64-bit) floating-point | |||
representation that accurately represents the value, | representation that accurately represents the value, | |||
2. Encode all values as the preferred floating-point | 2. Encode all values as the preferred floating-point | |||
representation that accurately represents the value, even for | representation that accurately represents the value, even for | |||
integral values, or | integral values, or | |||
skipping to change at page 34, line 33 ¶ | skipping to change at page 35, line 25 ¶ | |||
Data formats such as CBOR are often used in environments where there | Data formats such as CBOR are often used in environments where there | |||
is no format negotiation. A specific design goal of CBOR is to not | is no format negotiation. A specific design goal of CBOR is to not | |||
need any included or assumed schema: a decoder can take a CBOR item | need any included or assumed schema: a decoder can take a CBOR item | |||
and decode it with no other knowledge. | and decode it with no other knowledge. | |||
Of course, in real-world implementations, the encoder and the decoder | Of course, in real-world implementations, the encoder and the decoder | |||
will have a shared view of what should be in a CBOR data item. For | will have a shared view of what should be in a CBOR data item. For | |||
example, an agreed-to format might be "the item is an array whose | example, an agreed-to format might be "the item is an array whose | |||
first value is a UTF-8 string, second value is an integer, and | first value is a UTF-8 string, second value is an integer, and | |||
subsequent values are zero or more floating-point numbers" or "the | subsequent values are zero or more floating-point numbers" or "the | |||
item is a map that has byte strings for keys and contains at least | item is a map that has byte strings for keys and contains a pair | |||
one pair whose key is 0xab01". | whose key is 0xab01". | |||
CBOR-based protocols MUST specify how their decoders handle invalid | CBOR-based protocols MUST specify how their decoders handle invalid | |||
and other unexpected data. CBOR-based protocols MAY specify that | and other unexpected data. CBOR-based protocols MAY specify that | |||
they treat arbitrary valid data as unexpected. Encoders for CBOR- | they treat arbitrary valid data as unexpected. Encoders for CBOR- | |||
based protocols MUST produce only valid items, that is, the protocol | based protocols MUST produce only valid items, that is, the protocol | |||
cannot be designed to make use of invalid items. An encoder can be | cannot be designed to make use of invalid items. An encoder can be | |||
capable of encoding as many or as few types of values as is required | capable of encoding as many or as few types of values as is required | |||
by the protocol in which it is used; a decoder can be capable of | by the protocol in which it is used; a decoder can be capable of | |||
understanding as many or as few types of values as is required by the | understanding as many or as few types of values as is required by the | |||
protocols in which it is used. This lack of restrictions allows CBOR | protocols in which it is used. This lack of restrictions allows CBOR | |||
skipping to change at page 35, line 26 ¶ | skipping to change at page 36, line 11 ¶ | |||
sequence of CBOR data items concatenated back-to-back. In such an | sequence of CBOR data items concatenated back-to-back. In such an | |||
environment, the decoder immediately begins decoding a new data item | environment, the decoder immediately begins decoding a new data item | |||
if data is found after the end of a previous data item. | if data is found after the end of a previous data item. | |||
Not all of the bytes making up a data item may be immediately | Not all of the bytes making up a data item may be immediately | |||
available to the decoder; some decoders will buffer additional data | available to the decoder; some decoders will buffer additional data | |||
until a complete data item can be presented to the application. | until a complete data item can be presented to the application. | |||
Other decoders can present partial information about a top-level data | Other decoders can present partial information about a top-level data | |||
item to an application, such as the nested data items that could | item to an application, such as the nested data items that could | |||
already be decoded, or even parts of a byte string that hasn't | already be decoded, or even parts of a byte string that hasn't | |||
completely arrived yet. | completely arrived yet. Such an application also MUST have a matching | |||
streaming security mechanism, where the desired protection is | ||||
available for incremental data presented to the application. | ||||
Note that some applications and protocols will not want to use | Note that some applications and protocols will not want to use | |||
indefinite-length encoding. Using indefinite-length encoding allows | indefinite-length encoding. Using indefinite-length encoding allows | |||
an encoder to not need to marshal all the data for counting, but it | an encoder to not need to marshal all the data for counting, but it | |||
requires a decoder to allocate increasing amounts of memory while | requires a decoder to allocate increasing amounts of memory while | |||
waiting for the end of the item. This might be fine for some | waiting for the end of the item. This might be fine for some | |||
applications but not others. | applications but not others. | |||
5.2. Generic Encoders and Decoders | 5.2. Generic Encoders and Decoders | |||
skipping to change at page 37, line 44 ¶ | skipping to change at page 38, line 35 ¶ | |||
needs to have an API that reports an error (and does not return data) | needs to have an API that reports an error (and does not return data) | |||
for a CBOR data item that contains any of the validity errors listed | for a CBOR data item that contains any of the validity errors listed | |||
in the previous subsection. | in the previous subsection. | |||
The set of tags defined in the tag registry (Section 9.2), as well as | The set of tags defined in the tag registry (Section 9.2), as well as | |||
the set of simple values defined in the simple values registry | the set of simple values defined in the simple values registry | |||
(Section 9.1), can grow at any time beyond the set understood by a | (Section 9.1), can grow at any time beyond the set understood by a | |||
generic decoder. A validity-checking decoder can do one of two | generic decoder. A validity-checking decoder can do one of two | |||
things when it encounters such a case that it does not recognize: | things when it encounters such a case that it does not recognize: | |||
* It can report an error (and not return data). Note that this | * It can report an error (and not return data). Note that treating | |||
error is not a validity error per se. This kind of error is more | this case as an error can cause ossification, and is thus not | |||
likely to be raised by a decoder that would be performing validity | encouraged. This error is not a validity error per se. This kind | |||
checking if this were a known case. | of error is more likely to be raised by a decoder that would be | |||
performing validity checking if this were a known case. | ||||
* It can emit the unknown item (type, value, and, for tags, the | * It can emit the unknown item (type, value, and, for tags, the | |||
decoded tagged data item) to the application calling the decoder, | decoded tagged data item) to the application calling the decoder, | |||
with an indication that the decoder did not recognize that tag | with an indication that the decoder did not recognize that tag | |||
number or simple value. | number or simple value. | |||
The latter approach, which is also appropriate for decoders that do | The latter approach, which is also appropriate for decoders that do | |||
not support validity checking, provides forward compatibility with | not support validity checking, provides forward compatibility with | |||
newly registered tags and simple values without the requirement to | newly registered tags and simple values without the requirement to | |||
update the encoder at the same time as the calling application. (For | update the encoder at the same time as the calling application. (For | |||
skipping to change at page 38, line 36 ¶ | skipping to change at page 39, line 31 ¶ | |||
reliably limits its output to valid CBOR, independent of whether or | reliably limits its output to valid CBOR, independent of whether or | |||
not its application is indeed providing API-conformant data. | not its application is indeed providing API-conformant data. | |||
5.5. Numbers | 5.5. Numbers | |||
CBOR-based protocols should take into account that different language | CBOR-based protocols should take into account that different language | |||
environments pose different restrictions on the range and precision | environments pose different restrictions on the range and precision | |||
of numbers that are representable. For example, the basic JavaScript | of numbers that are representable. For example, the basic JavaScript | |||
number system treats all numbers as floating-point values, which may | number system treats all numbers as floating-point values, which may | |||
result in silent loss of precision in decoding integers with more | result in silent loss of precision in decoding integers with more | |||
than 53 significant bits. A protocol that uses numbers should define | than 53 significant bits. Another example is that, since CBOR keeps | |||
its expectations on the handling of non-trivial numbers in decoders | the sign bit for its integer representation in the major type, it has | |||
and receiving applications. | one bit more for signed numbers of a certain length (e.g., | |||
-2**64..2**64-1 for 1+8-byte integers) than the typical platform | ||||
signed integer representation of the same length (-2**63..2**63-1 for | ||||
8-byte int64_t). A protocol that uses numbers should define its | ||||
expectations on the handling of non-trivial numbers in decoders and | ||||
receiving applications. | ||||
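As a non-normative illustration of the range mismatch described above, the following Python sketch classifies a decoded CBOR integer by the narrowest common platform type that can hold it. The function name classify_cbor_int and its return labels are assumptions made for this example; they are not defined by the draft.

```python
# Hypothetical helper: names and labels are illustrative only.
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1
UINT64_MAX = 2**64 - 1
CBOR_INT_MIN = -2**64  # most negative major type 1 value with an 8-byte argument


def classify_cbor_int(value: int) -> str:
    """Return a label for the narrowest representation that holds value."""
    if INT64_MIN <= value <= INT64_MAX:
        return "int64"
    if 0 <= value <= UINT64_MAX:
        return "uint64"
    if CBOR_INT_MIN <= value < INT64_MIN:
        # Encodable as a 1+8-byte CBOR integer, but too small for int64_t.
        return "negative-beyond-int64"
    # Outside major types 0 and 1; would need a bignum (tag 2 or 3).
    return "bignum"
```

A protocol that limits itself to the "int64" class avoids the extra bit entirely; one that accepts the full CBOR range needs a receiving application prepared for the other classes.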
A CBOR-based protocol that includes floating-point numbers can | A CBOR-based protocol that includes floating-point numbers can | |||
restrict which of the three formats (half-precision, single- | restrict which of the three formats (half-precision, single- | |||
precision, and double-precision) are to be supported. For an | precision, and double-precision) are to be supported. For an | |||
integer-only application, a protocol may want to completely exclude | integer-only application, a protocol may want to completely exclude | |||
the use of floating-point values. | the use of floating-point values. | |||
A CBOR-based protocol designed for compactness may want to exclude | A CBOR-based protocol designed for compactness may want to exclude | |||
specific integer encodings that are longer than necessary for the | specific integer encodings that are longer than necessary for the | |||
application, such as to save the need to implement 64-bit integers. | application, such as to save the need to implement 64-bit integers. | |||
skipping to change at page 40, line 5 ¶ | skipping to change at page 41, line 5 ¶ | |||
A CBOR-based protocol MUST define what to do when a receiving | A CBOR-based protocol MUST define what to do when a receiving | |||
application does see multiple identical keys in a map. The resulting | application does see multiple identical keys in a map. The resulting | |||
rule in the protocol MUST respect the CBOR data model: it cannot | rule in the protocol MUST respect the CBOR data model: it cannot | |||
prescribe a specific handling of the entries with the identical keys, | prescribe a specific handling of the entries with the identical keys, | |||
except that it might have a rule that having identical keys in a map | except that it might have a rule that having identical keys in a map | |||
indicates a malformed map and that the decoder has to stop with an | indicates a malformed map and that the decoder has to stop with an | |||
error. When processing maps that exhibit entries with duplicate | error. When processing maps that exhibit entries with duplicate | |||
keys, a generic decoder might do one of the following: | keys, a generic decoder might do one of the following: | |||
* Not accept maps duplicate keys (that is, enforce validity for | * Not accept maps with duplicate keys (that is, enforce validity for | |||
maps, see also Section 5.4). These generic decoders are | maps, see also Section 5.4). These generic decoders are | |||
universally useful. An application may still need to perform | universally useful. An application may still need to perform | |||
its own duplicate checking based on application rules (for | its own duplicate checking based on application rules (for | |||
instance if the application equates integers and floating point | instance if the application equates integers and floating-point | |||
values in map key positions for specific maps). | values in map key positions for specific maps). | |||
* Pass all map entries to the application, including ones with | * Pass all map entries to the application, including ones with | |||
duplicate keys. This requires the application to handle (check | duplicate keys. This requires the application to handle (check | |||
against) duplicate keys, even if the application rules are | against) duplicate keys, even if the application rules are | |||
identical to the generic data model rules. | identical to the generic data model rules. | |||
* Lose some entries with duplicate keys, e.g. by only delivering the | * Lose some entries with duplicate keys, e.g. by only delivering the | |||
final (or first) entry out of the entries with the same key. With | final (or first) entry out of the entries with the same key. With | |||
such a generic decoder, applications may get different results for | such a generic decoder, applications may get different results for | |||
skipping to change at page 41, line 34 ¶ | skipping to change at page 42, line 34 ¶ | |||
element, and are equal if they have the same number of bytes/elements | element, and are equal if they have the same number of bytes/elements | |||
and the same values at the same positions. Two maps are equal if | and the same values at the same positions. Two maps are equal if | |||
they have the same set of pairs regardless of their order; pairs are | they have the same set of pairs regardless of their order; pairs are | |||
equal if both the key and value are equal. | equal if both the key and value are equal. | |||
Tagged values are equal if both the tag number and the tag content | Tagged values are equal if both the tag number and the tag content | |||
are equal. (Note that a generic decoder that provides processing for | are equal. (Note that a generic decoder that provides processing for | |||
a specific tag may not be able to distinguish some semantically | a specific tag may not be able to distinguish some semantically | |||
equivalent values, e.g. if leading zeroes occur in the content of tag | equivalent values, e.g. if leading zeroes occur in the content of tag | |||
2/3 (Section 3.4.3).) Simple values are equal if they simply have | 2/3 (Section 3.4.3).) Simple values are equal if they simply have | |||
the same value. Nothing else is equal in the generic data model, a | the same value. Nothing else is equal in the generic data model; a | |||
simple value 2 is not equivalent to an integer 2 and an array is | simple value 2 is not equivalent to an integer 2 and an array is | |||
never equivalent to a map. | never equivalent to a map. | |||
As discussed in Section 2.2, specific data models can make values | As discussed in Section 2.2, specific data models can make values | |||
equivalent for the purpose of comparing map keys that are distinct in | equivalent for the purpose of comparing map keys that are distinct in | |||
the generic data model. Note that this implies that a generic | the generic data model. Note that this implies that a generic | |||
decoder may deliver a decoded map to an application that needs to be | decoder may deliver a decoded map to an application that needs to be | |||
checked for duplicate map keys by that application (alternatively, | checked for duplicate map keys by that application (alternatively, | |||
the decoder may provide a programming interface to perform this | the decoder may provide a programming interface to perform this | |||
service for the application). Specific data models cannot | service for the application). Specific data models are not able to | |||
distinguish values for map keys that are equal for this purpose at | distinguish values for map keys that are equal for this purpose at | |||
the generic data model level. | the generic data model level. | |||
5.7. Undefined Values | 5.7. Undefined Values | |||
In some CBOR-based protocols, the simple value (Section 3.3) of | In some CBOR-based protocols, the simple value (Section 3.3) of | |||
Undefined might be used by an encoder as a substitute for a data item | Undefined might be used by an encoder as a substitute for a data item | |||
with an encoding problem, in order to allow the rest of the enclosing | with an encoding problem, in order to allow the rest of the enclosing | |||
data items to be encoded without harm. | data items to be encoded without harm. | |||
6. Converting Data between CBOR and JSON | 6. Converting Data between CBOR and JSON | |||
This section gives non-normative advice about converting between CBOR | This section gives non-normative advice about converting between CBOR | |||
and JSON. Implementations of converters are free to use whichever | and JSON. Implementations of converters MAY use whichever advice | |||
advice here they want. | here they want. | |||
It is worth noting that a JSON text is a sequence of characters, not | It is worth noting that a JSON text is a sequence of characters, not | |||
an encoded sequence of bytes, while a CBOR data item consists of | an encoded sequence of bytes, while a CBOR data item consists of | |||
bytes, not characters. | bytes, not characters. | |||
6.1. Converting from CBOR to JSON | 6.1. Converting from CBOR to JSON | |||
Most of the types in CBOR have direct analogs in JSON. However, some | Most of the types in CBOR have direct analogs in JSON. However, some | |||
do not, and someone implementing a CBOR-to-JSON converter has to | do not, and someone implementing a CBOR-to-JSON converter has to | |||
consider what to do in those cases. The following non-normative | consider what to do in those cases. The following non-normative | |||
skipping to change at page 43, line 31 ¶ | skipping to change at page 44, line 31 ¶ | |||
value not yet discussed) is represented by the substitute value. | value not yet discussed) is represented by the substitute value. | |||
* A bignum (major type 6, tag number 2 or 3) is represented by | * A bignum (major type 6, tag number 2 or 3) is represented by | |||
encoding its byte string in base64url without padding and becomes | encoding its byte string in base64url without padding and becomes | |||
a JSON string. For tag number 3 (negative bignum), a "~" (ASCII | a JSON string. For tag number 3 (negative bignum), a "~" (ASCII | |||
tilde) is inserted before the base-encoded value. (The conversion | tilde) is inserted before the base-encoded value. (The conversion | |||
to a binary blob instead of a number is to prevent a likely | to a binary blob instead of a number is to prevent a likely | |||
numeric overflow for the JSON decoder.) | numeric overflow for the JSON decoder.) | |||
* A byte string with an encoding hint (major type 6, tag number 21 | * A byte string with an encoding hint (major type 6, tag number 21 | |||
through 23) is encoded as described and becomes a JSON string. | through 23) is encoded as described by the hint and becomes a JSON | |||
string. | ||||
* For all other tags (major type 6, any other tag number), the tag | * For all other tags (major type 6, any other tag number), the tag | |||
content is represented as a JSON value; the tag number is ignored. | content is represented as a JSON value; the tag number is ignored. | |||
* Indefinite-length items are made definite before conversion. | * Indefinite-length items are made definite before conversion. | |||
A CBOR-to-JSON converter may want to keep to the JSON profile I-JSON | ||||
[RFC7493], to maximize interoperability and increase confidence that | ||||
the JSON output can be processed with predictable results. For | ||||
example, this has implications on the range of integers that can be | ||||
represented reliably, as well as on the top-level items that may be | ||||
supported by older JSON implementations. | ||||
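The bignum rule in the list above can be sketched in Python as follows. This is a minimal, non-normative illustration; the function name bignum_to_json_string is an assumption for this example, and the input is taken to be the tag number together with the already decoded byte string content.

```python
import base64


def bignum_to_json_string(tag_number: int, content: bytes) -> str:
    """Render the content of tag 2 or 3 as the suggested JSON string."""
    if tag_number not in (2, 3):
        raise ValueError("not a bignum tag")
    # base64url without padding, as suggested for bignums.
    text = base64.urlsafe_b64encode(content).rstrip(b"=").decode("ascii")
    # Tag 3 (negative bignum) gets a leading ASCII tilde.
    return "~" + text if tag_number == 3 else text
```

For instance, the nine-byte content 0x010000000000000000 (the value 2**64 under tag 2) becomes the JSON string "AQAAAAAAAAAA", and the same content under tag 3 becomes "~AQAAAAAAAAAA".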
6.2. Converting from JSON to CBOR | 6.2. Converting from JSON to CBOR | |||
All JSON values, once decoded, directly map into one or more CBOR | All JSON values, once decoded, directly map into one or more CBOR | |||
values. As with any kind of CBOR generation, decisions have to be | values. As with any kind of CBOR generation, decisions have to be | |||
made with respect to number representation. In a suggested | made with respect to number representation. In a suggested | |||
conversion: | conversion: | |||
* JSON numbers without fractional parts (integer numbers) are | * JSON numbers without fractional parts (integer numbers) are | |||
represented as integers (major types 0 and 1, possibly major type | represented as integers (major types 0 and 1, possibly major type | |||
6 tag number 2 and 3), choosing the shortest form; integers longer | 6 tag number 2 and 3), choosing the shortest form; integers longer | |||
skipping to change at page 44, line 14 ¶ | skipping to change at page 45, line 23 ¶ | |||
converter implementation, may choose -2**32..2**32-1 or | converter implementation, may choose -2**32..2**32-1 or | |||
-2**64..2**64-1 (fully using the integer ranges available in CBOR | -2**64..2**64-1 (fully using the integer ranges available in CBOR | |||
with uint32_t or uint64_t, respectively) or even -2**31..2**31-1 | with uint32_t or uint64_t, respectively) or even -2**31..2**31-1 | |||
or -2**63..2**63-1 (using popular ranges for two's complement | or -2**63..2**63-1 (using popular ranges for two's complement | |||
signed integers). (If the JSON was generated from a JavaScript | signed integers). (If the JSON was generated from a JavaScript | |||
implementation, its precision is already limited to 53 bits | implementation, its precision is already limited to 53 bits | |||
maximum.) | maximum.) | |||
* Numbers with fractional parts are represented as floating-point | * Numbers with fractional parts are represented as floating-point | |||
values, performing the decimal-to-binary conversion based on the | values, performing the decimal-to-binary conversion based on the | |||
precision provided by IEEE 754 binary64. Then, when encoding in | precision provided by IEEE 754 binary64. The mathematical value | |||
CBOR, the preferred serialization uses the shortest floating-point | of the JSON number is converted to binary64 using the | |||
representation exactly representing this conversion result; for | roundTiesToEven procedure in Section 4.3.1 of [IEEE754]. Then, | |||
instance, 1.5 is represented in a 16-bit floating-point value (not | when encoding in CBOR, the preferred serialization uses the | |||
all implementations will be capable of efficiently finding the | shortest floating-point representation exactly representing this | |||
minimum form, though). Instead of using the default binary64 | conversion result; for instance, 1.5 is represented in a 16-bit | |||
precision, there may be an implementation-defined limit to the | floating-point value (not all implementations will be capable of | |||
precision of the conversion that will affect the precision of the | efficiently finding the minimum form, though). Instead of using | |||
represented values. Decimal representation should only be used on | the default binary64 precision, there may be an implementation- | |||
the CBOR side if that is specified in a protocol. | defined limit to the precision of the conversion that will affect | |||
the precision of the represented values. Decimal representation | ||||
should only be used on the CBOR side if that is specified in a | ||||
protocol. | ||||
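The search for the shortest floating-point representation mentioned above can be sketched in Python as follows. This is one possible approach, not the method prescribed by the draft: it tries half and single precision in turn and keeps the first width that round-trips to the same binary64 value. The function name shortest_float_encoding is an assumption for this example, and NaN values are not treated specially (they fall through to the 64-bit form).

```python
import struct


def shortest_float_encoding(value: float) -> bytes:
    """Encode a binary64 value as the shortest CBOR float holding it exactly."""
    # 0xF9 = half precision, 0xFA = single precision (major type 7).
    for initial_byte, fmt in ((0xF9, ">e"), (0xFA, ">f")):
        try:
            packed = struct.pack(fmt, value)
        except OverflowError:
            continue  # magnitude too large for this width
        if struct.unpack(fmt, packed)[0] == value:
            return bytes([initial_byte]) + packed
    # 0xFB = double precision; always exact for a binary64 input.
    return b"\xfb" + struct.pack(">d", value)
```

With this sketch, 1.5 is emitted as the three bytes 0xf93e00, 100000.0 needs single precision, and 1.1 needs the full double-precision form.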
CBOR has been designed to generally provide a more compact encoding | CBOR has been designed to generally provide a more compact encoding | |||
than JSON. One implementation strategy that might come to mind is to | than JSON. One implementation strategy that might come to mind is to | |||
perform a JSON-to-CBOR encoding in place in a single buffer. This | perform a JSON-to-CBOR encoding in place in a single buffer. This | |||
strategy would need to carefully consider a number of pathological | strategy would need to carefully consider a number of pathological | |||
cases, such as that some strings represented with no or very few | cases, such as that some strings represented with no or very few | |||
escapes and longer (or much longer) than 255 bytes may expand when | escapes and longer (or much longer) than 255 bytes may expand when | |||
encoded as UTF-8 strings in CBOR. Similarly, a few of the binary | encoded as UTF-8 strings in CBOR. Similarly, a few of the binary | |||
floating-point representations might cause expansion from some short | floating-point representations might cause expansion from some short | |||
decimal representations (1.1, 1e9) in JSON. This may be hard to get | decimal representations (1.1, 1e9) in JSON. This may be hard to get | |||
skipping to change at page 47, line 30 ¶ | skipping to change at page 48, line 45 ¶ | |||
actual encodings do not overlap, so the string remains unambiguous). | actual encodings do not overlap, so the string remains unambiguous). | |||
For example, the byte string 0x12345678 could be written h'12345678', | For example, the byte string 0x12345678 could be written h'12345678', | |||
b32'CI2FM6A', or b64'EjRWeA'. | b32'CI2FM6A', or b64'EjRWeA'. | |||
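The three spellings in this example can be reproduced with a short Python sketch. The function name diagnostic_forms is an assumption for this illustration, and the base64url alphabet is chosen arbitrarily here (for this particular input the classic and URL-safe alphabets give the same result). Padding is stripped to match the forms shown.

```python
import base64


def diagnostic_forms(data: bytes) -> dict:
    """Return hex, base32, and base64 diagnostic spellings of a byte string."""
    b32 = base64.b32encode(data).decode("ascii").rstrip("=")
    b64 = base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")
    return {
        "h": "h'" + data.hex() + "'",
        "b32": "b32'" + b32 + "'",
        "b64": "b64'" + b64 + "'",
    }
```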
Unassigned simple values are given as "simple()" with the appropriate | Unassigned simple values are given as "simple()" with the appropriate | |||
integer in the parentheses. For example, "simple(42)" indicates | integer in the parentheses. For example, "simple(42)" indicates | |||
major type 7, value 42. | major type 7, value 42. | |||
A number of useful extensions to the diagnostic notation defined here | A number of useful extensions to the diagnostic notation defined here | |||
are provided in Appendix G of [RFC8610], "Extended Diagnostic | are provided in Appendix G of [RFC8610], "Extended Diagnostic | |||
Notation" (EDN). | Notation" (EDN). Similarly, an extension of this notation could be | |||
provided in a separate document to provide for the documentation of | ||||
NaN payloads, which are not covered in the present document. | ||||
8.1. Encoding Indicators | 8.1. Encoding Indicators | |||
Sometimes it is useful to indicate in the diagnostic notation which | Sometimes it is useful to indicate in the diagnostic notation which | |||
of several alternative representations were actually used; for | of several alternative representations were actually used; for | |||
example, a data item written >1.5< by a diagnostic decoder might have | example, a data item written >1.5< by a diagnostic decoder might have | |||
been encoded as a half-, single-, or double-precision float. | been encoded as a half-, single-, or double-precision float. | |||
The convention for encoding indicators is that anything starting with | The convention for encoding indicators is that anything starting with | |||
an underscore and all following characters that are alphanumeric or | an underscore and all following characters that are alphanumeric or | |||
skipping to change at page 48, line 14 ¶ | skipping to change at page 49, line 33 ¶ | |||
An underscore followed by a decimal digit n indicates that the | An underscore followed by a decimal digit n indicates that the | |||
preceding item (or, for arrays and maps, the item starting with the | preceding item (or, for arrays and maps, the item starting with the | |||
preceding bracket or brace) was encoded with an additional | preceding bracket or brace) was encoded with an additional | |||
information value of 24+n. For example, 1.5_1 is a half-precision | information value of 24+n. For example, 1.5_1 is a half-precision | |||
floating-point number, while 1.5_3 is encoded as double precision. | floating-point number, while 1.5_3 is encoded as double precision. | |||
This encoding indicator is not shown in Appendix A. (Note that the | This encoding indicator is not shown in Appendix A. (Note that the | |||
encoding indicator "_" is thus an abbreviation of the full form "_7", | encoding indicator "_" is thus an abbreviation of the full form "_7", | |||
which is not used.) | which is not used.) | |||
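To make the 24+n rule concrete for floating-point items, the following Python sketch encodes a value at the width selected by an encoding indicator digit. It is a non-normative illustration; the function name encode_float_with_indicator is an assumption for this example and covers only n = 1, 2, 3 in major type 7.

```python
import struct

# Indicator digit n -> struct format for the float width with
# additional information 24+n (25 = half, 26 = single, 27 = double).
_FLOAT_FORMATS = {1: ">e", 2: ">f", 3: ">d"}


def encode_float_with_indicator(value: float, n: int) -> bytes:
    """Encode value as a major type 7 float with additional information 24+n."""
    if n not in _FLOAT_FORMATS:
        raise ValueError("indicator must be 1, 2, or 3 for floats")
    additional_information = 24 + n
    initial_byte = (7 << 5) | additional_information
    return bytes([initial_byte]) + struct.pack(_FLOAT_FORMATS[n], value)
```

Under these assumptions, 1.5_1 corresponds to the bytes 0xf93e00 and 1.5_3 to 0xfb3ff8000000000000.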
Byte and text strings of indefinite length can be notated in the form | The detailed chunk structure of byte and text strings of indefinite | |||
(_ h'0123', h'4567') and (_ "foo", "bar"). | length can be notated in the form (_ h'0123', h'4567') and (_ "foo", | |||
"bar"). However, for an indefinite length string with no chunks | ||||
inside, (_ ) would be ambiguous as to whether a byte string (0x5fff) or a | ||||
text string (0x7fff) is meant and is therefore not used. The basic | ||||
forms ''_ and ""_ can be used instead and are reserved for the case | ||||
with no chunks only -- not as short forms for the (permitted, but not | ||||
really useful) encodings with only empty chunks, which to preserve | ||||
the chunk structure need to be notated as (_ ''), (_ ""), etc. | ||||
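The byte-level difference between these notations can be seen in a small Python sketch that serializes a list of chunks as an indefinite-length string. This is a non-normative illustration; the function names are assumptions for this example, text chunks are passed in already encoded as UTF-8, and chunk lengths are limited to 255 bytes to keep the sketch short.

```python
def encode_head(major_type: int, length: int) -> bytes:
    """Encode the head of a definite-length chunk (lengths up to 255 only)."""
    if length < 24:
        return bytes([(major_type << 5) | length])
    if length < 256:
        return bytes([(major_type << 5) | 24, length])
    raise ValueError("chunk too long for this sketch")


def encode_indefinite_string(major_type: int, chunks: list) -> bytes:
    """Encode chunks as an indefinite-length byte (2) or text (3) string."""
    if major_type not in (2, 3):
        raise ValueError("major type must be 2 or 3")
    out = bytearray([(major_type << 5) | 31])  # 0x5f or 0x7f
    for chunk in chunks:
        out += encode_head(major_type, len(chunk)) + chunk
    out.append(0xFF)  # "break" stop code
    return bytes(out)
```

With no chunks the result is 0x5fff or 0x7fff, which is why the major type has to be visible in the notation; with one empty chunk, as in (_ ''), the result is 0x5f40ff.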
9. IANA Considerations | 9. IANA Considerations | |||
IANA has created two registries for new CBOR values. The registries | IANA has created two registries for new CBOR values. The registries | |||
are separate, that is, not under an umbrella registry, and follow the | are separate, that is, not under an umbrella registry, and follow the | |||
rules in [RFC8126]. IANA has also assigned a new MIME media type and | rules in [RFC8126]. IANA has also assigned a new MIME media type and | |||
an associated Constrained Application Protocol (CoAP) Content-Format | an associated Constrained Application Protocol (CoAP) Content-Format | |||
entry. | entry. | |||
[To be removed by RFC editor:] IANA is requested to update these | [To be removed by RFC editor:] IANA is requested to update these | |||
skipping to change at page 48, line 47 ¶ | skipping to change at page 50, line 24 ¶ | |||
contiguous blocks (if any). | contiguous blocks (if any). | |||
New entries in the range 32 to 255 are assigned by Specification | New entries in the range 32 to 255 are assigned by Specification | |||
Required. | Required. | |||
9.2. Tags Registry | 9.2. Tags Registry | |||
IANA has created the "Concise Binary Object Representation (CBOR) | IANA has created the "Concise Binary Object Representation (CBOR) | |||
Tags" registry at [IANA.cbor-tags]. The tags that were defined in | Tags" registry at [IANA.cbor-tags]. The tags that were defined in | |||
[RFC7049] are described in detail in Section 3.4, and other tags have | [RFC7049] are described in detail in Section 3.4, and other tags have | |||
already been defined. | already been defined since then. | |||
New entries in the range 0 to 23 ("1+0") are assigned by Standards | New entries in the range 0 to 23 ("1+0") are assigned by Standards | |||
Action. New entries in the ranges 24 to 255 ("1+1") and 256 to 32767 | Action. New entries in the ranges 24 to 255 ("1+1") and 256 to 32767 | |||
(lower half of "1+2") are assigned by Specification Required. New | (lower half of "1+2") are assigned by Specification Required. New | |||
entries in the range 32768 to 18446744073709551615 (upper half of | entries in the range 32768 to 18446744073709551615 (upper half of | |||
"1+2", "1+4", and "1+8") are assigned by First Come First Served. | "1+2", "1+4", and "1+8") are assigned by First Come First Served. | |||
The template for registration requests is: | The template for registration requests is: | |||
* Data item | * Data item | |||
* Semantics (short form) | * Semantics (short form) | |||
In addition, First Come First Served requests should include: | In addition, First Come First Served requests should include: | |||
* Point of contact | * Point of contact | |||
* Description of semantics (URL) - This description is optional; the | * Description of semantics (URL) -- This description is optional; | |||
URL can point to something like an Internet-Draft or a web page. | the URL can point to something like an Internet-Draft or a web | |||
page. | ||||
Applicants exercising the First Come First Served range and making a | Applicants exercising the First Come First Served range and making a | |||
suggestion for a tag number that is not representable in 32 bits | suggestion for a tag number that is not representable in 32 bits | |||
(i.e., larger than 4294967295) should be aware that this could reduce | (i.e., larger than 4294967295) should be aware that this could reduce | |||
interoperability with implementations that do not support 64-bit | interoperability with implementations that do not support 64-bit | |||
numbers. | numbers. | |||
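The registration ranges given earlier in this section can be summarized in a short Python sketch. It is a non-normative illustration; the function name tag_registration_policy is an assumption for this example, and the returned strings are simply the policy names used in the text.

```python
MAX_TAG_NUMBER = 2**64 - 1  # 18446744073709551615


def tag_registration_policy(tag_number: int) -> str:
    """Return the registration policy that applies to a CBOR tag number."""
    if not 0 <= tag_number <= MAX_TAG_NUMBER:
        raise ValueError("tag number out of range")
    if tag_number <= 23:
        return "Standards Action"  # "1+0"
    if tag_number <= 32767:
        return "Specification Required"  # "1+1" and lower half of "1+2"
    return "First Come First Served"  # upper half of "1+2", "1+4", "1+8"
```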
9.3. Media Type ("MIME Type") | 9.3. Media Type ("MIME Type") | |||
The Internet media type [RFC6838] for a single encoded CBOR data item | The Internet media type [RFC6838] for a single encoded CBOR data item | |||
is application/cbor, as defined in [IANA.media-types]: | is application/cbor, as defined in [IANA.media-types]: | |||
Type name: application | Type name: application | |||
Subtype name: cbor | Subtype name: cbor | |||
Required parameters: n/a | Required parameters: n/a | |||
Optional parameters: n/a | Optional parameters: n/a | |||
Encoding considerations: binary | Encoding considerations: Binary | |||
Security considerations: See Section 10 of this document | Security considerations: See Section 10 of this document | |||
Interoperability considerations: n/a | Interoperability considerations: n/a | |||
Published specification: This document | Published specification: This document | |||
Applications that use this media type: None yet, but it is expected | Applications that use this media type: Many | |||
that this format will be deployed in protocols and applications. | ||||
Additional information: * Magic number(s): n/a | Additional information: | |||
* Magic number(s): n/a | ||||
* File extension(s): .cbor | * File extension(s): .cbor | |||
* Macintosh file type code(s): n/a | * Macintosh file type code(s): n/a | |||
Person & email address to contact for further information: IETF CBOR | Person & email address to contact for further information: IETF CBOR | |||
Working Group cbor@ietf.org (mailto:cbor@ietf.org) or IETF | Working Group cbor@ietf.org (mailto:cbor@ietf.org) or IETF | |||
Applications and Real-Time Area art@ietf.org (mailto:art@ietf.org) | Applications and Real-Time Area art@ietf.org (mailto:art@ietf.org) | |||
Intended usage: COMMON | Intended usage: COMMON | |||
Restrictions on usage: none | Restrictions on usage: none | |||
Author: IETF CBOR Working Group cbor@ietf.org (mailto:cbor@ietf.org) | Author: IETF CBOR Working Group cbor@ietf.org (mailto:cbor@ietf.org) | |||
Change controller: The IESG iesg@ietf.org (mailto:iesg@ietf.org) | Change controller: The IESG iesg@ietf.org (mailto:iesg@ietf.org) | |||
9.4. CoAP Content-Format | 9.4. CoAP Content-Format | |||
The CoAP Content-Format for CBOR is defined in | The CoAP Content-Format for CBOR is registered in | |||
[IANA.core-parameters]: | [IANA.core-parameters]: | |||
Media Type: application/cbor | Media Type: application/cbor | |||
Encoding: - | Encoding: - | |||
Id: 60 | Id: 60 | |||
Reference: [RFCthis] | Reference: [RFCthis] | |||
9.5. The +cbor Structured Syntax Suffix Registration | 9.5. The +cbor Structured Syntax Suffix Registration | |||
The Structured Syntax Suffix [RFC6838] for media types based on a | The Structured Syntax Suffix [RFC6838] for media types based on a | |||
single encoded CBOR data item is +cbor, as defined in | single encoded CBOR data item is +cbor, as defined in | |||
skipping to change at page 52, line 9 ¶ | skipping to change at page 53, line 33 ¶ | |||
Because CBOR decoders are often used as a first step in processing | Because CBOR decoders are often used as a first step in processing | |||
unvalidated input, they need to be fully prepared for all types of | unvalidated input, they need to be fully prepared for all types of | |||
hostile input that may be designed to corrupt, overrun, or achieve | hostile input that may be designed to corrupt, overrun, or achieve | |||
control of the system decoding the CBOR data item. A CBOR decoder | control of the system decoding the CBOR data item. A CBOR decoder | |||
needs to assume that all input may be hostile even if it has been | needs to assume that all input may be hostile even if it has been | |||
checked by a firewall, has come over a secure channel such as TLS, is | checked by a firewall, has come over a secure channel such as TLS, is | |||
encrypted or signed, or has come from some other source that is | encrypted or signed, or has come from some other source that is | |||
presumed trusted. | presumed trusted. | |||
Section 4.1 gives examples of limitations in interoperability when | ||||
using a constrained CBOR decoder with input from a CBOR encoder that | ||||
uses a non-preferred serialization. When a single data item is | ||||
consumed both by such a constrained decoder and a full decoder, it | ||||
can lead to security issues that can be exploited by an attacker who | ||||
can inject or manipulate content. | ||||
As discussed throughout this document, there are many values that can | ||||
be considered "equivalent" in some circumstances and "not equivalent" | ||||
in others. As just one example, the numeric value for the number | ||||
"one" might be expressed as an integer or a bignum. A system | ||||
interpreting CBOR input might accept either form for the number | ||||
"one", or might reject one (or both) forms. Such acceptance or | ||||
rejection can have security implications in the program that is using | ||||
the interpreted input. | ||||
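
As a non-normative illustration of the "one" example above, the following minimal sketch decodes the same numeric value from two different encodings. The function name decode_number and the returned labels are assumptions made for this illustration only; the sketch handles just small unsigned integers and tag 2 bignums with short byte strings.

```python
def decode_number(data: bytes):
    """Return (value, form) for a small unsigned integer or a tag 2 bignum.

    Hypothetical helper for illustration; not a complete CBOR decoder.
    """
    initial = data[0]
    major, info = initial >> 5, initial & 0x1F
    if major == 0 and info < 24:
        # Major type 0 with the value carried in the initial byte
        return info, "integer"
    if major == 6 and info == 2:
        # Tag 2 (unsigned bignum) must enclose a byte string (major type 2)
        head = data[1]
        if head >> 5 != 2 or (head & 0x1F) >= 24:
            raise ValueError("unsupported bignum content in this sketch")
        length = head & 0x1F
        payload = data[2:2 + length]
        if len(payload) != length:
            raise ValueError("truncated byte string")
        return int.from_bytes(payload, "big"), "bignum"
    raise ValueError("unsupported item in this sketch")


as_integer = decode_number(bytes.fromhex("01"))      # basic integer 1
as_bignum = decode_number(bytes.fromhex("c24101"))   # 2(h'01'), also 1
```

Both calls yield the value 1, but an application that compares the encoded bytes, or that accepts only one of the two forms, would treat the inputs differently.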
Hostile input may be constructed to overrun buffers, overflow or | Hostile input may be constructed to overrun buffers, overflow or | |||
underflow integer arithmetic, or cause other decoding disruption. | underflow integer arithmetic, or cause other decoding disruption. | |||
CBOR data items might have lengths or sizes that are intentionally | CBOR data items might have lengths or sizes that are intentionally | |||
extremely large or too short. Resource exhaustion attacks might | extremely large or too short. Resource exhaustion attacks might | |||
attempt to lure a decoder into allocating very big data items | attempt to lure a decoder into allocating very big data items | |||
(strings, arrays, maps, or even arbitrary precision numbers) or | (strings, arrays, maps, or even arbitrary precision numbers) or | |||
exhaust the stack depth by setting up deeply nested items. Decoders | exhaust the stack depth by setting up deeply nested items. Decoders | |||
need to have appropriate resource management to mitigate these | need to have appropriate resource management to mitigate these | |||
attacks. (Items for which very large sizes are given can also | attacks. (Items for which very large sizes are given can also | |||
attempt to exploit integer overflow vulnerabilities.) | attempt to exploit integer overflow vulnerabilities.) | |||
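
The kind of resource management described above might be sketched as follows. This is a minimal, non-normative illustration that walks one definite-length data item without allocating anything for it. It rejects declared lengths that exceed the remaining input and nesting beyond a fixed depth. The names MAX_DEPTH, read_head, skip_item, and is_acceptable, and the limit of 32, are assumptions chosen for this example. Indefinite-length items are simply rejected to keep the sketch short.

```python
MAX_DEPTH = 32  # hypothetical nesting limit chosen for this sketch


def read_head(data: bytes, pos: int):
    """Return (major type, argument, position after the head)."""
    if pos >= len(data):
        raise ValueError("truncated input")
    initial = data[pos]
    major, info = initial >> 5, initial & 0x1F
    pos += 1
    if info < 24:
        return major, info, pos
    if info <= 27:
        size = 1 << (info - 24)  # 1, 2, 4, or 8 argument bytes
        if pos + size > len(data):
            raise ValueError("truncated argument")
        return major, int.from_bytes(data[pos:pos + size], "big"), pos + size
    raise ValueError("reserved or indefinite-length head not handled here")


def skip_item(data: bytes, pos: int = 0, depth: int = 0) -> int:
    """Check one data item and return the position just after it."""
    if depth > MAX_DEPTH:
        raise ValueError("nesting too deep")
    major, arg, pos = read_head(data, pos)
    if major in (0, 1, 7):
        return pos  # integers, simple values, floats: head only
    if major in (2, 3):
        # Compare against what is actually left before trusting the length
        if arg > len(data) - pos:
            raise ValueError("declared length exceeds remaining input")
        return pos + arg
    if major == 6:
        return skip_item(data, pos, depth + 1)  # tag content
    # Arrays hold arg items; maps hold arg key/value pairs
    count = arg if major == 4 else 2 * arg
    if count > len(data) - pos:  # every item needs at least one byte
        raise ValueError("declared count exceeds remaining input")
    for _ in range(count):
        pos = skip_item(data, pos, depth + 1)
    return pos


def is_acceptable(data: bytes) -> bool:
    """True if data is exactly one item that passes the checks above."""
    try:
        return skip_item(data) == len(data)
    except ValueError:
        return False
```

Python integers do not overflow, so the computation of 2 * arg is safe here. An implementation in a language with fixed-width integers would need to perform the comparison in a way that cannot wrap around.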
skipping to change at page 52, line 38 ¶ | skipping to change at page 54, line 29 ¶ | |||
also perform validity checks on the CBOR data. Alternatively, it can | also perform validity checks on the CBOR data. Alternatively, it can | |||
leave those checks to the application using the decoder. This choice | leave those checks to the application using the decoder. This choice | |||
needs to be clearly documented in the decoder. Beyond the validity | needs to be clearly documented in the decoder. Beyond the validity | |||
at the CBOR level, an application also needs to ascertain that the | at the CBOR level, an application also needs to ascertain that the | |||
input is in alignment with the application protocol that is | input is in alignment with the application protocol that is | |||
serialized in CBOR. | serialized in CBOR. | |||
The input check itself may consume resources. This is usually linear | The input check itself may consume resources. This is usually linear | |||
in the size of the input, which means that an attacker has to spend | in the size of the input, which means that an attacker has to spend | |||
resources that are commensurate to the resources spent by the | resources that are commensurate to the resources spent by the | |||
defender on input validation. Processing for arbitrary-precision | defender on input validation. However, an attacker might be able to | |||
craft inputs that will take longer for a target decoder to process | ||||
than for the attacker to produce. Processing for arbitrary-precision | ||||
numbers may exceed linear effort. Also, some hash-table | numbers may exceed linear effort. Also, some hash-table | |||
implementations that are used by decoders to build in-memory | implementations that are used by decoders to build in-memory | |||
representations of maps can be attacked to spend quadratic effort, | representations of maps can be attacked to spend quadratic effort, | |||
unless a secret key (see Section 7 of [SIPHASH]) or some other | unless a secret key (see Section 7 of [SIPHASH_LNCS], also | |||
mitigation is employed. Such superlinear efforts can be exploited by | [SIPHASH_OPEN]) or some other mitigation is employed. Such | |||
an attacker to exhaust resources at or before the input validator; | superlinear efforts can be exploited by an attacker to exhaust | |||
they therefore need to be avoided in a CBOR decoder implementation. | resources at or before the input validator; they therefore need to be | |||
Note that tag number definitions and their implementations can add | avoided in a CBOR decoder implementation. Note that tag number | |||
security considerations of this kind; this should then be discussed | definitions and their implementations can add security considerations | |||
in the security considerations of the tag number definition. | of this kind; this should then be discussed in the security | |||
considerations of the tag number definition. | ||||
CBOR encoders do not receive input directly from the network and are | CBOR encoders do not receive input directly from the network and are | |||
thus not directly attackable in the same way as CBOR decoders. | thus not directly attackable in the same way as CBOR decoders. | |||
However, CBOR encoders often have an API that takes input from | However, CBOR encoders often have an API that takes input from | |||
another level in the implementation and can be attacked through that | another level in the implementation and can be attacked through that | |||
API. The design and implementation of that API should assume the | API. The design and implementation of that API should assume the | |||
behavior of its caller may be based on hostile input or on coding | behavior of its caller may be based on hostile input or on coding | |||
mistakes. It should check inputs for buffer overruns, overflow and | mistakes. It should check inputs for buffer overruns, overflow and | |||
underflow of integer arithmetic, and other such errors that are aimed | underflow of integer arithmetic, and other such errors that are aimed | |||
to disrupt the encoder. | to disrupt the encoder. | |||
skipping to change at page 53, line 34 ¶ | skipping to change at page 55, line 34 ¶ | |||
cannot know about all requirements that an application poses on its | cannot know about all requirements that an application poses on its | |||
input data; it is therefore not relieving the application from | input data; it is therefore not relieving the application from | |||
performing its own input checking. Also, since the set of defined | performing its own input checking. Also, since the set of defined | |||
tag numbers evolves, the application may employ a tag number that is | tag numbers evolves, the application may employ a tag number that is | |||
not yet supported for validity checking by the generic decoder it | not yet supported for validity checking by the generic decoder it | |||
uses. Generic decoders therefore need to provide documentation which | uses. Generic decoders therefore need to provide documentation which | |||
tag numbers they support and what validity checking they can provide | tag numbers they support and what validity checking they can provide | |||
for each of them as well as for basic CBOR validity (UTF-8 checking, | for each of them as well as for basic CBOR validity (UTF-8 checking, | |||
duplicate map key checking). | duplicate map key checking). | |||
Section 3.4.3 notes that using the non-preferred choice of a bignum | ||||
representation instead of a basic integer for encoding a number is | ||||
not intended to have application semantics, but it can have such | ||||
semantics if an application receiving CBOR data is using a decoder in | ||||
the basic generic data model. This disparity causes a security issue | ||||
if the two sets of semantics differ. Thus, applications using CBOR | ||||
need to specify the data model that they are using for each use of | ||||
CBOR data. | ||||
It is common to convert CBOR data to other formats. In many cases, | ||||
CBOR has more expressive types than other formats; this is | ||||
particularly true for the common conversion to JSON. The loss of | ||||
type information can cause security issues for the systems that are | ||||
processing the less-expressive data. | ||||
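
A small, non-normative sketch of such a loss of type information follows. It assumes a converter that renders CBOR byte strings as base64url text without padding, which is one common convention for CBOR-to-JSON conversion. The function name to_json_value is hypothetical, and Python bytes and str objects stand in for decoded CBOR byte strings and text strings.

```python
import base64
import json


def to_json_value(item):
    """Map a decoded CBOR byte string or text string to a JSON-ready value.

    Hypothetical converter: byte strings become base64url without padding.
    """
    if isinstance(item, bytes):
        return base64.urlsafe_b64encode(item).rstrip(b"=").decode("ascii")
    return item


from_bytes = json.dumps(to_json_value(bytes.fromhex("01020304")))
from_text = json.dumps(to_json_value("AQIDBA"))
```

Both results are the same JSON text, so a consumer of the JSON form can no longer tell whether the sender supplied a byte string or a text string.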
Section 6.2 describes a possibly-common usage scenario of converting | ||||
between CBOR and JSON that could allow an attack if the attacker knows | ||||
that the application is performing the conversion. | ||||
Security considerations for the use of base16 and base64 from | ||||
[RFC4648], and the use of UTF-8 from [RFC3629], are relevant to CBOR | ||||
as well. | ||||
11. References | 11. References | |||
11.1. Normative References | 11.1. Normative References | |||
[ECMA262] Ecma International, "ECMAScript 2018 Language | [C] International Organization for Standardization, | |||
Specification", ECMA Standard ECMA-262, 9th Edition, June | "Information technology — Programming languages — C", ISO/ | |||
2018, <https://www.ecma- | IEC 9899:2018, Fourth Edition, June 2018. | |||
international.org/publications/files/ECMA-ST/Ecma- | ||||
262.pdf>. | [Cplusplus17] | |||
International Organization for Standardization, | ||||
"Programming languages — C++", ISO/IEC 14882:2017, Fifth | ||||
Edition, December 2017. | ||||
[IEEE754] IEEE, "IEEE Standard for Floating-Point Arithmetic", IEEE | [IEEE754] IEEE, "IEEE Standard for Floating-Point Arithmetic", IEEE | |||
Std 754-2008. | Std 754-2019, DOI 10.1109/IEEESTD.2019.8766229, | |||
<https://ieeexplore.ieee.org/document/8766229>. | ||||
[RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail | [RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail | |||
Extensions (MIME) Part One: Format of Internet Message | Extensions (MIME) Part One: Format of Internet Message | |||
Bodies", RFC 2045, DOI 10.17487/RFC2045, November 1996, | Bodies", RFC 2045, DOI 10.17487/RFC2045, November 1996, | |||
<https://www.rfc-editor.org/info/rfc2045>. | <https://www.rfc-editor.org/info/rfc2045>. | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
<https://www.rfc-editor.org/info/rfc2119>. | <https://www.rfc-editor.org/info/rfc2119>. | |||
skipping to change at page 54, line 40 ¶ | skipping to change at page 57, line 18 ¶ | |||
[RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for | [RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for | |||
Writing an IANA Considerations Section in RFCs", BCP 26, | Writing an IANA Considerations Section in RFCs", BCP 26, | |||
RFC 8126, DOI 10.17487/RFC8126, June 2017, | RFC 8126, DOI 10.17487/RFC8126, June 2017, | |||
<https://www.rfc-editor.org/info/rfc8126>. | <https://www.rfc-editor.org/info/rfc8126>. | |||
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
[TIME_T] The Open Group Base Specifications, "Vol. 1: Base | [TIME_T] The Open Group Base Specifications, "Open Group Standard: | |||
Definitions, Issue 7", 2013 Edition, IEEE Std 1003.1, | Vol. 1: Base Definitions, Issue 7", Section 4.16 'Seconds | |||
Section 4.15 'Seconds Since the Epoch', 2013, | Since the Epoch', IEEE Std 1003.1, 2018 Edition, 2018, | |||
<http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/ | <http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/ | |||
V1_chap04.html#tag_04_15>. | V1_chap04.html#tag_04_16>. | |||
11.2. Informative References | 11.2. Informative References | |||
[ASN.1] International Telecommunication Union, "Information | [ASN.1] International Telecommunication Union, "Information | |||
Technology -- ASN.1 encoding rules: Specification of Basic | Technology — ASN.1 encoding rules: Specification of Basic | |||
Encoding Rules (BER), Canonical Encoding Rules (CER) and | Encoding Rules (BER), Canonical Encoding Rules (CER) and | |||
Distinguished Encoding Rules (DER)", ITU-T Recommendation | Distinguished Encoding Rules (DER)", ITU-T Recommendation | |||
X.690, 1994. | X.690, 1994. | |||
[BSON] Various, "BSON - Binary JSON", 2013, | [BSON] Various, "BSON - Binary JSON", 2013, | |||
<http://bsonspec.org/>. | <http://bsonspec.org/>. | |||
[ECMA262] Ecma International, "ECMAScript 2018 Language | ||||
Specification", ECMA Standard ECMA-262, 9th Edition, June | ||||
2018, <https://www.ecma- | ||||
international.org/publications/files/ECMA-ST/Ecma- | ||||
262.pdf>. | ||||
[I-D.bormann-cbor-notable-tags] | [I-D.bormann-cbor-notable-tags] | |||
Bormann, C., "Notable CBOR Tags", Work in Progress, | Bormann, C., "Notable CBOR Tags", Work in Progress, | |||
Internet-Draft, draft-bormann-cbor-notable-tags-01, 15 May | Internet-Draft, draft-bormann-cbor-notable-tags-02, 25 | |||
2020, <http://www.ietf.org/internet-drafts/draft-bormann- | June 2020, <http://www.ietf.org/internet-drafts/draft- | |||
cbor-notable-tags-01.txt>. | bormann-cbor-notable-tags-02.txt>. | |||
[IANA.cbor-simple-values] | [IANA.cbor-simple-values] | |||
IANA, "Concise Binary Object Representation (CBOR) Simple | IANA, "Concise Binary Object Representation (CBOR) Simple | |||
Values", | Values", | |||
<http://www.iana.org/assignments/cbor-simple-values>. | <http://www.iana.org/assignments/cbor-simple-values>. | |||
[IANA.cbor-tags] | [IANA.cbor-tags] | |||
IANA, "Concise Binary Object Representation (CBOR) Tags", | IANA, "Concise Binary Object Representation (CBOR) Tags", | |||
<http://www.iana.org/assignments/cbor-tags>. | <http://www.iana.org/assignments/cbor-tags>. | |||
skipping to change at page 56, line 47 ¶ | skipping to change at page 59, line 34 ¶ | |||
[RFC8742] Bormann, C., "Concise Binary Object Representation (CBOR) | [RFC8742] Bormann, C., "Concise Binary Object Representation (CBOR) | |||
Sequences", RFC 8742, DOI 10.17487/RFC8742, February 2020, | Sequences", RFC 8742, DOI 10.17487/RFC8742, February 2020, | |||
<https://www.rfc-editor.org/info/rfc8742>. | <https://www.rfc-editor.org/info/rfc8742>. | |||
[RFC8746] Bormann, C., Ed., "Concise Binary Object Representation | [RFC8746] Bormann, C., Ed., "Concise Binary Object Representation | |||
(CBOR) Tags for Typed Arrays", RFC 8746, | (CBOR) Tags for Typed Arrays", RFC 8746, | |||
DOI 10.17487/RFC8746, February 2020, | DOI 10.17487/RFC8746, February 2020, | |||
<https://www.rfc-editor.org/info/rfc8746>. | <https://www.rfc-editor.org/info/rfc8746>. | |||
[SIPHASH] Aumasson, J. and D. Bernstein, "SipHash: A Fast Short- | [SIPHASH_LNCS] | |||
Input PRF", DOI 10.1007/978-3-642-34931-7_28, Lecture | Aumasson, J. and D. Bernstein, "SipHash: A Fast Short- | |||
Notes in Computer Science pp. 489-508, 2012, | Input PRF", Lecture Notes in Computer Science pp. 489-508, | |||
DOI 10.1007/978-3-642-34931-7_28, 2012, | ||||
<https://doi.org/10.1007/978-3-642-34931-7_28>. | <https://doi.org/10.1007/978-3-642-34931-7_28>. | |||
[SIPHASH_OPEN] | ||||
Aumasson, J. and D.J. Bernstein, "SipHash: a fast short- | ||||
input PRF", <https://131002.net/siphash/siphash.pdf>. | ||||
[YAML] Ben-Kiki, O., Evans, C., and I.d. Net, "YAML Ain't Markup | [YAML] Ben-Kiki, O., Evans, C., and I.d. Net, "YAML Ain't Markup | |||
Language (YAML[TM]) Version 1.2", 3rd Edition, October | Language (YAML[TM]) Version 1.2", 3rd Edition, October | |||
2009, <http://www.yaml.org/spec/1.2/spec.html>. | 2009, <http://www.yaml.org/spec/1.2/spec.html>. | |||
Appendix A. Examples | Appendix A. Examples of Encoded CBOR Data Items | |||
The following table provides some CBOR-encoded values in hexadecimal | The following table provides some CBOR-encoded values in hexadecimal | |||
(right column), together with diagnostic notation for these values | (right column), together with diagnostic notation for these values | |||
(left column). Note that the string "\u00fc" is one form of | (left column). Note that the string "\u00fc" is one form of | |||
diagnostic notation for a UTF-8 string containing the single Unicode | diagnostic notation for a UTF-8 string containing the single Unicode | |||
character U+00FC, LATIN SMALL LETTER U WITH DIAERESIS (u umlaut). | character U+00FC, LATIN SMALL LETTER U WITH DIAERESIS (u umlaut). | |||
Similarly, "\u6c34" is a UTF-8 string in diagnostic notation with a | Similarly, "\u6c34" is a UTF-8 string in diagnostic notation with a | |||
single character U+6C34 (CJK UNIFIED IDEOGRAPH-6C34, often | single character U+6C34 (CJK UNIFIED IDEOGRAPH-6C34, often | |||
representing "water"), and "\ud800\udd51" is a UTF-8 string in | representing "water"), and "\ud800\udd51" is a UTF-8 string in | |||
diagnostic notation with a single character U+10151 (GREEK ACROPHONIC | diagnostic notation with a single character U+10151 (GREEK ACROPHONIC | |||
ATTIC FIFTY STATERS). (Note that all these single-character strings | ATTIC FIFTY STATERS). (Note that all these single-character strings | |||
could also be represented in native UTF-8 in diagnostic notation, | could also be represented in native UTF-8 in diagnostic notation, | |||
just not in an ASCII-only specification like the present one.) In | just not in an ASCII-only specification.) In the diagnostic notation | |||
the diagnostic notation provided for bignums, their intended numeric | provided for bignums, their intended numeric value is shown as a | |||
value is shown as a decimal number (such as 18446744073709551616) | decimal number (such as 18446744073709551616) instead of showing a | |||
instead of showing a tagged byte string (such as | tagged byte string (such as 2(h'010000000000000000')). | |||
2(h'010000000000000000')). | ||||
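
The integer and bignum rows of the table can be reproduced with a short encoder. The following is a minimal, non-normative sketch that uses the shortest argument encoding and falls back to tags 2 and 3 for values outside the 64-bit range. The function names encode_head and encode_int are assumptions made for this illustration.

```python
def encode_head(major: int, argument: int) -> bytes:
    """Encode a major type and argument in the shortest available form."""
    if argument < 24:
        return bytes([(major << 5) | argument])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < (1 << (8 * size)):
            return bytes([(major << 5) | info]) + argument.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def encode_int(value: int) -> bytes:
    """Encode an integer as major type 0/1, or as a tag 2/3 bignum."""
    # Negative integers are carried as the argument -1 - value
    major, argument = (0, value) if value >= 0 else (1, -1 - value)
    if argument < (1 << 64):
        return encode_head(major, argument)
    # Bignum: tag 2 (non-negative) or tag 3 (negative) around a byte string
    payload = argument.to_bytes((argument.bit_length() + 7) // 8, "big")
    return encode_head(6, 2 + major) + encode_head(2, len(payload)) + payload


for number in (24, 1000, -1000, 18446744073709551616):
    print(number, "0x" + encode_int(number).hex())
```

For example, encode_int(-1000) produces the three bytes 0x3903e7, because the argument carried for a negative integer n is -1 - n, here 999.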
+------------------------------+------------------------------------+ | ||||
| Diagnostic | Encoded | | ||||
+==============================+====================================+ | +==============================+====================================+ | |||
| 0 | 0x00 | | |Diagnostic | Encoded | | |||
+==============================+====================================+ | ||||
|0 | 0x00 | | ||||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 1 | 0x01 | | |1 | 0x01 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 10 | 0x0a | | |10 | 0x0a | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 23 | 0x17 | | |23 | 0x17 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 24 | 0x1818 | | |24 | 0x1818 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 25 | 0x1819 | | |25 | 0x1819 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 100 | 0x1864 | | |100 | 0x1864 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 1000 | 0x1903e8 | | |1000 | 0x1903e8 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 1000000 | 0x1a000f4240 | | |1000000 | 0x1a000f4240 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 1000000000000 | 0x1b000000e8d4a51000 | | |1000000000000 | 0x1b000000e8d4a51000 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 18446744073709551615 | 0x1bffffffffffffffff | | |18446744073709551615 | 0x1bffffffffffffffff | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 18446744073709551616 | 0xc249010000000000000000 | | |18446744073709551616 | 0xc249010000000000000000 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| -18446744073709551616 | 0x3bffffffffffffffff | | |-18446744073709551616 | 0x3bffffffffffffffff | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| -18446744073709551617 | 0xc349010000000000000000 | | |-18446744073709551617 | 0xc349010000000000000000 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| -1 | 0x20 | | |-1 | 0x20 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| -10 | 0x29 | | |-10 | 0x29 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| -100 | 0x3863 | | |-100 | 0x3863 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| -1000 | 0x3903e7 | | |-1000 | 0x3903e7 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 0.0 | 0xf90000 | | |0.0 | 0xf90000 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| -0.0 | 0xf98000 | | |-0.0 | 0xf98000 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 1.0 | 0xf93c00 | | |1.0 | 0xf93c00 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 1.1 | 0xfb3ff199999999999a | | |1.1 | 0xfb3ff199999999999a | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 1.5 | 0xf93e00 | | |1.5 | 0xf93e00 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 65504.0 | 0xf97bff | | |65504.0 | 0xf97bff | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 100000.0 | 0xfa47c35000 | | |100000.0 | 0xfa47c35000 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 3.4028234663852886e+38 | 0xfa7f7fffff | | |3.4028234663852886e+38 | 0xfa7f7fffff | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 1.0e+300 | 0xfb7e37e43c8800759c | | |1.0e+300 | 0xfb7e37e43c8800759c | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 5.960464477539063e-8 | 0xf90001 | | |5.960464477539063e-8 | 0xf90001 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 0.00006103515625 | 0xf90400 | | |0.00006103515625 | 0xf90400 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| -4.0 | 0xf9c400 | | |-4.0 | 0xf9c400 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| -4.1 | 0xfbc010666666666666 | | |-4.1 | 0xfbc010666666666666 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| Infinity | 0xf97c00 | | |Infinity | 0xf97c00 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| NaN | 0xf97e00 | | |NaN | 0xf97e00 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| -Infinity | 0xf9fc00 | | |-Infinity | 0xf9fc00 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| Infinity | 0xfa7f800000 | | |Infinity | 0xfa7f800000 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| NaN | 0xfa7fc00000 | | |NaN | 0xfa7fc00000 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| -Infinity | 0xfaff800000 | | |-Infinity | 0xfaff800000 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| Infinity | 0xfb7ff0000000000000 | | |Infinity | 0xfb7ff0000000000000 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| NaN | 0xfb7ff8000000000000 | | |NaN | 0xfb7ff8000000000000 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| -Infinity | 0xfbfff0000000000000 | | |-Infinity | 0xfbfff0000000000000 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| false | 0xf4 | | |false | 0xf4 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| true | 0xf5 | | |true | 0xf5 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| null | 0xf6 | | |null | 0xf6 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| undefined | 0xf7 | | |undefined | 0xf7 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| simple(16) | 0xf0 | | |simple(16) | 0xf0 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| simple(255) | 0xf8ff | | |simple(255) | 0xf8ff | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 0("2013-03-21T20:04:00Z") | 0xc074323031332d30332d32315432303a | | |0("2013-03-21T20:04:00Z") | 0xc074323031332d30332d32315432303a | | |||
| | 30343a30305a | | | | 30343a30305a | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 1(1363896240) | 0xc11a514b67b0 | | |1(1363896240) | 0xc11a514b67b0 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 1(1363896240.5) | 0xc1fb41d452d9ec200000 | | |1(1363896240.5) | 0xc1fb41d452d9ec200000 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 23(h'01020304') | 0xd74401020304 | | |23(h'01020304') | 0xd74401020304 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 24(h'6449455446') | 0xd818456449455446 | | |24(h'6449455446') | 0xd818456449455446 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| 32("http://www.example.com") | 0xd82076687474703a2f2f7777772e6578 | | |32("http://www.example.com") | 0xd82076687474703a2f2f7777772e6578 | | |||
| | 616d706c652e636f6d | | | | 616d706c652e636f6d | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| h'' | 0x40 | | |h'' | 0x40 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| h'01020304' | 0x4401020304 | | |h'01020304' | 0x4401020304 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| "" | 0x60 | | |"" | 0x60 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| "a" | 0x6161 | | |"a" | 0x6161 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| "IETF" | 0x6449455446 | | |"IETF" | 0x6449455446 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| "\"\\" | 0x62225c | | |"\"\\" | 0x62225c | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| "\u00fc" | 0x62c3bc | | |"\u00fc" | 0x62c3bc | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| "\u6c34" | 0x63e6b0b4 | | |"\u6c34" | 0x63e6b0b4 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| "\ud800\udd51" | 0x64f0908591 | | |"\ud800\udd51" | 0x64f0908591 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| [] | 0x80 | | |[] | 0x80 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| [1, 2, 3] | 0x83010203 | | |[1, 2, 3] | 0x83010203 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| [1, [2, 3], [4, 5]] | 0x8301820203820405 | | |[1, [2, 3], [4, 5]] | 0x8301820203820405 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| [1, 2, 3, 4, 5, 6, 7, 8, 9, | 0x98190102030405060708090a0b0c0d0e | | |[1, 2, 3, 4, 5, 6, 7, 8, 9, | 0x98190102030405060708090a0b0c0d0e | | |||
| 10, 11, 12, 13, 14, 15, 16, | 0f101112131415161718181819 | | |10, 11, 12, 13, 14, 15, 16, | 0f101112131415161718181819 | | |||
| 17, 18, 19, 20, 21, 22, 23, | | | |17, 18, 19, 20, 21, 22, 23, | | | |||
| 24, 25] | | | |24, 25] | | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| {} | 0xa0 | | |{} | 0xa0 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| {1: 2, 3: 4} | 0xa201020304 | | |{1: 2, 3: 4} | 0xa201020304 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| {"a": 1, "b": [2, 3]} | 0xa26161016162820203 | | |{"a": 1, "b": [2, 3]} | 0xa26161016162820203 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| ["a", {"b": "c"}] | 0x826161a161626163 | | |["a", {"b": "c"}] | 0x826161a161626163 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
|{"a": "A", "b": "B", "c": "C",| 0xa5616161416162614261636143616461 | | |{"a": "A", "b": "B", "c": "C",| 0xa5616161416162614261636143616461 | | |||
| "d": "D", "e": "E"} | 4461656145 | | |"d": "D", "e": "E"} | 4461656145 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| (_ h'0102', h'030405') | 0x5f42010243030405ff | | |(_ h'0102', h'030405') | 0x5f42010243030405ff | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| (_ "strea", "ming") | 0x7f657374726561646d696e67ff | | |(_ "strea", "ming") | 0x7f657374726561646d696e67ff | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| [_ ] | 0x9fff | | |[_ ] | 0x9fff | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| [_ 1, [2, 3], [_ 4, 5]] | 0x9f018202039f0405ffff | | |[_ 1, [2, 3], [_ 4, 5]] | 0x9f018202039f0405ffff | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| [_ 1, [2, 3], [4, 5]] | 0x9f01820203820405ff | | |[_ 1, [2, 3], [4, 5]] | 0x9f01820203820405ff | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| [1, [2, 3], [_ 4, 5]] | 0x83018202039f0405ff | | |[1, [2, 3], [_ 4, 5]] | 0x83018202039f0405ff | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| [1, [_ 2, 3], [4, 5]] | 0x83019f0203ff820405 | | |[1, [_ 2, 3], [4, 5]] | 0x83019f0203ff820405 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
|[_ 1, 2, 3, 4, 5, 6, 7, 8, 9, | 0x9f0102030405060708090a0b0c0d0e0f | | |[_ 1, 2, 3, 4, 5, 6, 7, 8, 9, | 0x9f0102030405060708090a0b0c0d0e0f | | |||
| 10, 11, 12, 13, 14, 15, 16, | 101112131415161718181819ff | | |10, 11, 12, 13, 14, 15, 16, | 101112131415161718181819ff | | |||
| 17, 18, 19, 20, 21, 22, 23, | | | |17, 18, 19, 20, 21, 22, 23, | | | |||
| 24, 25] | | | |24, 25] | | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| {_ "a": 1, "b": [_ 2, 3]} | 0xbf61610161629f0203ffff | | |{_ "a": 1, "b": [_ 2, 3]} | 0xbf61610161629f0203ffff | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| ["a", {_ "b": "c"}] | 0x826161bf61626163ff | | |["a", {_ "b": "c"}] | 0x826161bf61626163ff | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| {_ "Fun": true, "Amt": -2} | 0xbf6346756ef563416d7421ff | | |{_ "Fun": true, "Amt": -2} | 0xbf6346756ef563416d7421ff | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
Table 6: Examples of Encoded CBOR Data Items | Table 6: Examples of Encoded CBOR Data Items | |||
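The pattern behind many rows of Table 6 is that every data item starts with a head: an initial byte carrying the major type and additional information, optionally followed by a 1-, 2-, 4-, or 8-byte argument. The following is a minimal sketch of a shortest-form head encoder in C. The names put_head and put_text are hypothetical helpers chosen for this illustration, not part of the specification, and the caller is assumed to supply a sufficiently large buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: writes the head for major type mt (0..7) with
 * argument val, using the shortest form.  out needs room for 9 bytes.
 * Returns the number of bytes written. */
size_t put_head(uint8_t *out, unsigned mt, uint64_t val) {
    uint8_t ib = (uint8_t)(mt << 5);      /* major type in the top 3 bits */
    size_t nbytes, i;

    if (val < 24) {                       /* argument fits in the initial byte */
        out[0] = (uint8_t)(ib | val);
        return 1;
    }
    if (val <= 0xff)              { out[0] = ib | 24; nbytes = 1; }
    else if (val <= 0xffff)       { out[0] = ib | 25; nbytes = 2; }
    else if (val <= 0xffffffffu)  { out[0] = ib | 26; nbytes = 4; }
    else                          { out[0] = ib | 27; nbytes = 8; }

    for (i = 0; i < nbytes; i++)          /* argument in network byte order */
        out[1 + i] = (uint8_t)(val >> (8 * (nbytes - 1 - i)));
    return 1 + nbytes;
}

/* Hypothetical helper: definite-length text string (major type 3).
 * s is assumed to be valid UTF-8; out needs room for 9 + strlen(s) bytes. */
size_t put_text(uint8_t *out, const char *s) {
    size_t len = strlen(s);
    size_t n = put_head(out, 3, len);
    memcpy(out + n, s, len);
    return n + len;
}
```

With these helpers, put_text(buf, "IETF") yields the five bytes 64 49 45 54 46, and put_head(buf, 4, 25) yields 98 19, the head of the 25-element array in the table. The indefinite-length row (_ "strea", "ming") can be built by writing the byte 0x7f, then the two chunks with put_text, then the "break" byte 0xff.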
Appendix B. Jump Table | Appendix B. Jump Table for Initial Byte | |||
For brevity, this jump table does not show initial bytes that are | For brevity, this jump table does not show initial bytes that are | |||
reserved for future extension. It also only shows a selection of the | reserved for future extension. It also only shows a selection of the | |||
initial bytes that can be used for optional features. (All unsigned | initial bytes that can be used for optional features. (All unsigned | |||
integers are in network byte order.) | integers are in network byte order.) | |||
+------------+------------------------------------------------+ | +============+================================================+ | |||
| Byte | Structure/Semantics | | | Byte | Structure/Semantics | | |||
+============+================================================+ | +============+================================================+ | |||
| 0x00..0x17 | Unsigned integer 0x00..0x17 (0..23) | | | 0x00..0x17 | Unsigned integer 0x00..0x17 (0..23) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x18 | Unsigned integer (one-byte uint8_t follows) | | | 0x18 | Unsigned integer (one-byte uint8_t follows) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x19 | Unsigned integer (two-byte uint16_t follows) | | | 0x19 | Unsigned integer (two-byte uint16_t follows) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x1a | Unsigned integer (four-byte uint32_t follows) | | | 0x1a | Unsigned integer (four-byte uint32_t follows) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
skipping to change at page 63, line 41 ¶ | skipping to change at page 66, line 35 ¶ | |||
| | see Section 3.4.4) | | | | see Section 3.4.4) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xc5 | Bigfloat (data item "array" follows; see | | | 0xc5 | Bigfloat (data item "array" follows; see | | |||
| | Section 3.4.4) | | | | Section 3.4.4) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xc6..0xd4 | (tag) | | | 0xc6..0xd4 | (tag) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xd5..0xd7 | Expected Conversion (data item follows; see | | | 0xd5..0xd7 | Expected Conversion (data item follows; see | | |||
| | Section 3.4.5.2) | | | | Section 3.4.5.2) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xd8..0xdb | (more tags, 1/2/4/8 bytes and then a data item | | | 0xd8..0xdb | (more tags; 1/2/4/8 bytes of tag number and | | |||
| | follow) | | | | then a data item follow) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xe0..0xf3 | (simple value) | | | 0xe0..0xf3 | (simple value) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xf4 | False | | | 0xf4 | False | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xf5 | True | | | 0xf5 | True | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xf6 | Null | | | 0xf6 | Null | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xf7 | Undefined | | | 0xf7 | Undefined | | |||
skipping to change at page 64, line 42 ¶ | skipping to change at page 67, line 36 ¶ | |||
byte string. If n bytes are no longer available, take(n) fails. | byte string. If n bytes are no longer available, take(n) fails. | |||
* uint() converts a byte string into an unsigned integer by | * uint() converts a byte string into an unsigned integer by | |||
interpreting the byte string in network byte order. | interpreting the byte string in network byte order. | |||
* Arithmetic works as in C. | * Arithmetic works as in C. | |||
* All variables are unsigned integers of sufficient range. | * All variables are unsigned integers of sufficient range. | |||
Note that "well_formed" returns the major type for well-formed | Note that "well_formed" returns the major type for well-formed | |||
definite length items, but 0 for an indefinite length item (or -1 for | definite length items, but 99 for an indefinite length item (or -1 | |||
a "break" stop code, only if "breakable" is set). This is used in | for a "break" stop code, only if "breakable" is set). This is used | |||
"well_formed_indefinite" to ascertain that indefinite length strings | in "well_formed_indefinite" to ascertain that indefinite length | |||
only contain definite length strings as chunks. | strings only contain definite length strings as chunks. | |||
well_formed (breakable = false) { | well_formed(breakable = false) { | |||
// process initial bytes | // process initial bytes | |||
ib = uint(take(1)); | ib = uint(take(1)); | |||
mt = ib >> 5; | mt = ib >> 5; | |||
val = ai = ib & 0x1f; | val = ai = ib & 0x1f; | |||
switch (ai) { | switch (ai) { | |||
case 24: val = uint(take(1)); break; | case 24: val = uint(take(1)); break; | |||
case 25: val = uint(take(2)); break; | case 25: val = uint(take(2)); break; | |||
case 26: val = uint(take(4)); break; | case 26: val = uint(take(4)); break; | |||
case 27: val = uint(take(8)); break; | case 27: val = uint(take(8)); break; | |||
case 28: case 29: case 30: fail(); | case 28: case 29: case 30: fail(); | |||
skipping to change at page 65, line 28 ¶ | skipping to change at page 68, line 28 ¶ | |||
} | } | |||
// process content | // process content | |||
switch (mt) { | switch (mt) { | |||
// case 0, 1, 7 do not have content; just use val | // case 0, 1, 7 do not have content; just use val | |||
case 2: case 3: take(val); break; // bytes/UTF-8 | case 2: case 3: take(val); break; // bytes/UTF-8 | |||
case 4: for (i = 0; i < val; i++) well_formed(); break; | case 4: for (i = 0; i < val; i++) well_formed(); break; | |||
case 5: for (i = 0; i < val*2; i++) well_formed(); break; | case 5: for (i = 0; i < val*2; i++) well_formed(); break; | |||
case 6: well_formed(); break; // 1 embedded data item | case 6: well_formed(); break; // 1 embedded data item | |||
case 7: if (ai == 24 && val < 32) fail(); // bad simple | case 7: if (ai == 24 && val < 32) fail(); // bad simple | |||
} | } | |||
return mt; // finite data item | return mt; // definite-length data item | |||
} | } | |||
well_formed_indefinite(mt, breakable) { | well_formed_indefinite(mt, breakable) { | |||
switch (mt) { | switch (mt) { | |||
case 2: case 3: | case 2: case 3: | |||
while ((it = well_formed(true)) != -1) | while ((it = well_formed(true)) != -1) | |||
if (it != mt) // need finite-length chunk | if (it != mt) // need definite-length chunk | |||
fail(); // of same type | fail(); // of same type | |||
break; | break; | |||
case 4: while (well_formed(true) != -1); break; | case 4: while (well_formed(true) != -1); break; | |||
case 5: while (well_formed(true) != -1) well_formed(); break; | case 5: while (well_formed(true) != -1) well_formed(); break; | |||
case 7: | case 7: | |||
if (breakable) | if (breakable) | |||
return -1; // signal break out | return -1; // signal break out | |||
else fail(); // no enclosing indefinite | else fail(); // no enclosing indefinite | |||
default: fail(); // wrong mt | default: fail(); // wrong mt | |||
} | } | |||
return 0; // no break out | return 99; // indefinite-length data item | |||
} | } | |||
Figure 1: Pseudocode for Well-Formedness Check | Figure 1: Pseudocode for Well-Formedness Check | |||
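For readers who want to experiment with the check in Figure 1, the following is one possible transcription into compilable C. It is a sketch under stated assumptions, not a reference implementation: the cursor structure, the function names, and the use of an "ok" flag in place of fail() are choices made for this illustration. Map entries are checked with two calls per entry rather than by computing val*2, which avoids overflow for very large counts. Recursion depth is not bounded here; a production decoder would likely want a nesting limit.

```c
#include <stddef.h>
#include <stdint.h>

/* Input cursor; ok is cleared on the first failure (stands in for fail()). */
typedef struct { const uint8_t *p; size_t left; int ok; } cursor;

enum { BREAK_CODE = -1, INDEFINITE = 99 };

static int item(cursor *c, int breakable);

/* uint(take(n)): read n bytes in network byte order, or fail. */
static uint64_t take_uint(cursor *c, size_t n) {
    uint64_t v = 0;
    if (!c->ok || c->left < n) { c->ok = 0; return 0; }
    c->left -= n;
    while (n--) v = (v << 8) | *c->p++;
    return v;
}

/* take(n) for string content: skip n bytes, or fail. */
static void skip(cursor *c, uint64_t n) {
    if (!c->ok || c->left < n) { c->ok = 0; return; }
    c->p += n;
    c->left -= (size_t)n;
}

static int indefinite(cursor *c, int mt, int breakable) {
    int it;
    switch (mt) {
    case 2: case 3:
        while ((it = item(c, 1)) != BREAK_CODE && c->ok)
            if (it != mt) c->ok = 0;   /* need definite-length chunk, same type */
        break;
    case 4:
        while (item(c, 1) != BREAK_CODE && c->ok) {}
        break;
    case 5:
        while (item(c, 1) != BREAK_CODE && c->ok) item(c, 0);
        break;
    case 7:
        if (breakable) return BREAK_CODE;
        c->ok = 0;                     /* no enclosing indefinite-length item */
        break;
    default:
        c->ok = 0;                     /* major types 0, 1, 6: no indefinite form */
    }
    return INDEFINITE;
}

static int item(cursor *c, int breakable) {
    uint64_t ib = take_uint(c, 1), val, i;
    int mt = (int)(ib >> 5), ai = (int)(ib & 0x1f);
    if (!c->ok) return 0;              /* callers consult c->ok, not this value */
    val = (uint64_t)ai;
    switch (ai) {
    case 24: val = take_uint(c, 1); break;
    case 25: val = take_uint(c, 2); break;
    case 26: val = take_uint(c, 4); break;
    case 27: val = take_uint(c, 8); break;
    case 28: case 29: case 30: c->ok = 0; return 0;
    case 31: return indefinite(c, mt, breakable);
    }
    switch (mt) {
    case 2: case 3: skip(c, val); break;                    /* bytes/UTF-8 */
    case 4: for (i = 0; i < val && c->ok; i++) item(c, 0); break;
    case 5: for (i = 0; i < val && c->ok; i++) { item(c, 0); item(c, 0); } break;
    case 6: item(c, 0); break;                              /* 1 embedded item */
    case 7: if (ai == 24 && val < 32) c->ok = 0; break;     /* bad simple */
    }
    return mt;
}

/* Returns 1 if data holds exactly one well-formed item, else 0. */
int cbor_well_formed(const uint8_t *data, size_t len) {
    cursor c = { data, len, 1 };
    item(&c, 0);
    return c.ok && c.left == 0;
}
```

In this sketch, trailing bytes after the first data item are treated as an error, matching the "no bytes are left" requirement; an application that processes CBOR sequences would instead report how much input was consumed.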
Note that the remaining complexity of a complete CBOR decoder is | Note that the remaining complexity of a complete CBOR decoder is | |||
about presenting data that has been decoded to the application in an | about presenting data that has been decoded to the application in an | |||
appropriate form. | appropriate form. | |||
Major types 0 and 1 are designed in such a way that they can be | Major types 0 and 1 are designed in such a way that they can be | |||
encoded in C from a signed integer without actually doing an if-then- | encoded in C from a signed integer without actually doing an if-then- | |||
skipping to change at page 66, line 22 ¶ | skipping to change at page 69, line 22 ¶ | |||
(-1-n), the transformation for major type 1, is the same as ~n | (-1-n), the transformation for major type 1, is the same as ~n | |||
(bitwise complement) in C unsigned arithmetic; ~n can then be | (bitwise complement) in C unsigned arithmetic; ~n can then be | |||
expressed as (-1)^n for the negative case, while 0^n leaves n | expressed as (-1)^n for the negative case, while 0^n leaves n | |||
unchanged for non-negative. The sign of a number can be converted to | unchanged for non-negative. The sign of a number can be converted to | |||
-1 for negative and 0 for non-negative (0 or positive) by arithmetic- | -1 for negative and 0 for non-negative (0 or positive) by arithmetic- | |||
shifting the number by one bit less than the bit length of the number | shifting the number by one bit less than the bit length of the number | |||
(for example, by 63 for 64-bit numbers). | (for example, by 63 for 64-bit numbers). | |||
void encode_sint(int64_t n) { | void encode_sint(int64_t n) { | |||
uint64_t ui = n >> 63; // extend sign to whole length | uint64_t ui = n >> 63; // extend sign to whole length | |||
mt = ui & 0x20; // extract major type | unsigned mt = ui & 0x20; // extract (shifted) major type | |||
ui ^= n; // complement negatives | ui ^= n; // complement negatives | |||
if (ui < 24) | if (ui < 24) | |||
*p++ = mt + ui; | *p++ = mt + ui; | |||
else if (ui < 256) { | else if (ui < 256) { | |||
*p++ = mt + 24; | *p++ = mt + 24; | |||
*p++ = ui; | *p++ = ui; | |||
} else | } else | |||
... | ... | |||
Figure 2: Pseudocode for Encoding a Signed Integer | Figure 2: Pseudocode for Encoding a Signed Integer | |||
See Section 1.2 for some specific assumptions about the profile of | ||||
the C language used in these pieces of code. | ||||
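Figure 2 elides the longer encodings and writes through an implicit output pointer. A completed variant might look like the sketch below. The signature (an explicit output buffer and a returned length) is an assumption made for this illustration. As in Figure 2, the code relies on right-shifting a negative int64_t being an arithmetic shift, which ISO C leaves implementation-defined.

```c
#include <stddef.h>
#include <stdint.h>

/* Encodes n as major type 0 (n >= 0) or major type 1 (n < 0) in the
 * shortest form.  buf needs room for 9 bytes.  Returns bytes written. */
size_t encode_sint(uint8_t *buf, int64_t n) {
    /* Assumption: >> on a negative value is an arithmetic shift. */
    uint64_t ui = (uint64_t)(n >> 63);    /* all ones if negative, else 0 */
    uint8_t mt = (uint8_t)(ui & 0x20);    /* (shifted) major type: 0x00 or 0x20 */
    size_t nbytes, i;

    ui ^= (uint64_t)n;                    /* complement negatives: ~n == -1-n */

    if (ui < 24) {
        buf[0] = (uint8_t)(mt + ui);
        return 1;
    }
    if (ui < UINT64_C(0x100))            { buf[0] = (uint8_t)(mt + 24); nbytes = 1; }
    else if (ui < UINT64_C(0x10000))     { buf[0] = (uint8_t)(mt + 25); nbytes = 2; }
    else if (ui < UINT64_C(0x100000000)) { buf[0] = (uint8_t)(mt + 26); nbytes = 4; }
    else                                 { buf[0] = (uint8_t)(mt + 27); nbytes = 8; }

    for (i = 0; i < nbytes; i++)          /* argument in network byte order */
        buf[1 + i] = (uint8_t)(ui >> (8 * (nbytes - 1 - i)));
    return 1 + nbytes;
}
```

For example, encode_sint(buf, -2) writes the single byte 0x21, and encode_sint(buf, 25) writes 0x18 0x19, matching the encodings of -2 and 25 that appear inside the examples of Table 6.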
Appendix D. Half-Precision | Appendix D. Half-Precision | |||
As half-precision floating-point numbers were only added to IEEE 754 | As half-precision floating-point numbers were only added to IEEE 754 | |||
in 2008 [IEEE754], today's programming platforms often still only | in 2008 [IEEE754], today's programming platforms often still only | |||
have limited support for them. It is very easy to include at least | have limited support for them. It is very easy to include at least | |||
decoding support for them even without such support. An example of a | decoding support for them even without such support. An example of a | |||
small decoder for half-precision floating-point numbers in the C | small decoder for half-precision floating-point numbers in the C | |||
language is shown in Figure 3. A similar program for Python is in | language is shown in Figure 3. A similar program for Python is in | |||
Figure 4; this code assumes that the 2-byte value has already been | Figure 4; this code assumes that the 2-byte value has already been | |||
decoded as an (unsigned short) integer in network byte order (as | decoded as an (unsigned short) integer in network byte order (as | |||
would be done by the pseudocode in Appendix C). | would be done by the pseudocode in Appendix C). | |||
#include <math.h> | #include <math.h> | |||
double decode_half(unsigned char *halfp) { | double decode_half(unsigned char *halfp) { | |||
int half = (halfp[0] << 8) + halfp[1]; | unsigned half = (halfp[0] << 8) + halfp[1]; | |||
int exp = (half >> 10) & 0x1f; | unsigned exp = (half >> 10) & 0x1f; | |||
int mant = half & 0x3ff; | unsigned mant = half & 0x3ff; | |||
double val; | double val; | |||
if (exp == 0) val = ldexp(mant, -24); | if (exp == 0) val = ldexp(mant, -24); | |||
else if (exp != 31) val = ldexp(mant + 1024, exp - 25); | else if (exp != 31) val = ldexp(mant + 1024, exp - 25); | |||
else val = mant == 0 ? INFINITY : NAN; | else val = mant == 0 ? INFINITY : NAN; | |||
return half & 0x8000 ? -val : val; | return half & 0x8000 ? -val : val; | |||
} | } | |||
Figure 3: C Code for a Half-Precision Decoder | Figure 3: C Code for a Half-Precision Decoder | |||
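On constrained platforms where linking the math library for ldexp() is undesirable, the same decoding can be done with plain arithmetic. The following variant is a sketch, not part of the specification; the name decode_half_nolibm is hypothetical. It relies on the fact that every scaling factor involved is a power of two, so the multiplications and divisions are exact in a binary double. INFINITY and NAN are taken from <math.h> as constant macros only.

```c
#include <math.h>   /* INFINITY and NAN macros only; no library calls */

/* Hypothetical ldexp-free variant of decode_half from Figure 3. */
double decode_half_nolibm(const unsigned char *halfp) {
    unsigned half = ((unsigned)halfp[0] << 8) + halfp[1];
    unsigned exp = (half >> 10) & 0x1f;
    unsigned mant = half & 0x3ff;
    double val;

    if (exp == 0)                       /* zero or subnormal: mant * 2^-24 */
        val = (double)mant / 16777216.0;
    else if (exp != 31)                 /* normal: (mant + 1024) * 2^(exp - 25) */
        val = (double)(mant + 1024) * (double)(1UL << exp) / 33554432.0;
    else                                /* exp == 31: infinity or NaN */
        val = mant == 0 ? INFINITY : NAN;

    return (half & 0x8000) ? -val : val;
}
```

With this variant, the two bytes 3c 00 decode to 1.0, c4 00 to -4.0, and 7b ff to 65504.0, the largest finite half-precision value.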
import struct | import struct | |||
skipping to change at page 70, line 5 ¶ | skipping to change at page 73, line 5 ¶ | |||
E.5. Conciseness on the Wire | E.5. Conciseness on the Wire | |||
While CBOR's design objective of code compactness for encoders and | While CBOR's design objective of code compactness for encoders and | |||
decoders is a higher priority than its objective of conciseness on | decoders is a higher priority than its objective of conciseness on | |||
the wire, many people focus on the wire size. Table 8 shows some | the wire, many people focus on the wire size. Table 8 shows some | |||
encoding examples for the simple nested array [1, [2, 3]]; where some | encoding examples for the simple nested array [1, [2, 3]]; where some | |||
form of indefinite-length encoding is supported by the encoding, | form of indefinite-length encoding is supported by the encoding, | |||
[_ 1, [2, 3]] (indefinite length on the outer array) is also shown. | [_ 1, [2, 3]] (indefinite length on the outer array) is also shown. | |||
+-------------+----------------------------+----------------+ | +=============+============================+================+ | |||
| Format | [1, [2, 3]] | [_ 1, [2, 3]] | | | Format | [1, [2, 3]] | [_ 1, [2, 3]] | | |||
+=============+============================+================+ | +=============+============================+================+ | |||
| RFC 713 | c2 05 81 c2 02 82 83 | | | | RFC 713 | c2 05 81 c2 02 82 83 | | | |||
+-------------+----------------------------+----------------+ | +-------------+----------------------------+----------------+ | |||
| ASN.1 BER | 30 0b 02 01 01 30 06 02 01 | 30 80 02 01 01 | | | ASN.1 BER | 30 0b 02 01 01 30 06 02 01 | 30 80 02 01 01 | | |||
| | 02 02 01 03 | 30 06 02 01 02 | | | | 02 02 01 03 | 30 06 02 01 02 | | |||
| | | 02 01 03 00 00 | | | | | 02 01 03 00 00 | | |||
+-------------+----------------------------+----------------+ | +-------------+----------------------------+----------------+ | |||
| MessagePack | 92 01 92 02 03 | | | | MessagePack | 92 01 92 02 03 | | | |||
+-------------+----------------------------+----------------+ | +-------------+----------------------------+----------------+ | |||
skipping to change at page 70, line 43 ¶ | skipping to change at page 73, line 43 ¶ | |||
This is only an error if the application assumed that the input | This is only an error if the application assumed that the input | |||
bytes would span exactly one data item. Where the application | bytes would span exactly one data item. Where the application | |||
uses the self-delimiting nature of CBOR encoding to permit | uses the self-delimiting nature of CBOR encoding to permit | |||
additional data after the data item, as is for example done in | additional data after the data item, as is for example done in | |||
CBOR sequences [RFC8742], the CBOR decoder can simply indicate | CBOR sequences [RFC8742], the CBOR decoder can simply indicate | |||
what part of the input has not been consumed. | what part of the input has not been consumed. | |||
* Too little data: The input data available would need additional | * Too little data: The input data available would need additional | |||
bytes added at their end for a complete CBOR data item. This may | bytes added at their end for a complete CBOR data item. This may | |||
indicate the input is truncated; it is also a common error when | indicate the input is truncated; it is also a common error when | |||
trying to decode random data as CBOR. For some applications | trying to decode random data as CBOR. For some applications, | |||
however, this may not actually be an error, as the application may | however, this may not actually be an error, as the application may | |||
not be certain it has all the data yet and can obtain or wait for | not be certain it has all the data yet and can obtain or wait for | |||
additional input bytes. Some of these applications may have an | additional input bytes. Some of these applications may have an | |||
upper limit for how much additional data can show up; here the | upper limit for how much additional data can show up; here the | |||
decoder may be able to indicate that the encoded CBOR data item | decoder may be able to indicate that the encoded CBOR data item | |||
cannot be completed within this limit. | cannot be completed within this limit. | |||
* Syntax error: The input data are not consistent with the | * Syntax error: The input data are not consistent with the | |||
requirements of the CBOR encoding, and this cannot be remedied by | requirements of the CBOR encoding, and this cannot be remedied by | |||
adding (or removing) data at the end. | adding (or removing) data at the end. | |||
In Appendix C, errors of the first kind are addressed in the first | In Appendix C, errors of the first kind are addressed in the first | |||
paragraph/bullet list (requiring "no bytes are left"), and errors of | paragraph/bullet list (requiring "no bytes are left"), and errors of | |||
the second kind are addressed in the second paragraph/bullet list | the second kind are addressed in the second paragraph/bullet list | |||
(failing "if n bytes are no longer available"). Errors of the third | (failing "if n bytes are no longer available"). Errors of the third | |||
kind are identified in the pseudocode by specific instances of | kind are identified in the pseudocode by specific instances of | |||
calling fail(), in order: | calling fail(), in order: | |||
* a reserved value is used for additional information (28, 29, 30) | * a reserved value is used for additional information (28, 29, 30) | |||
* major type 7, additional information 24, value < 32 (incorrect or | * major type 7, additional information 24, value < 32 (incorrect) | |||
incorrectly encoded simple type) | ||||
* incorrect substructure of indefinite length byte/text string (may | * incorrect substructure of indefinite length byte/text string (may | |||
only contain definite length strings of the same major type) | only contain definite length strings of the same major type) | |||
* "break" stop code (mt=7, ai=31) occurs in a value position of a | * "break" stop code (mt=7, ai=31) occurs in a value position of a | |||
map, or anywhere other than a position directly in an indefinite | map, or anywhere other than a position directly in an indefinite | |||
length item where another enclosed data item could also occur | length item where another enclosed data item could also occur | |||
* additional information 31 used with major type 0, 1, or 6 | * additional information 31 used with major type 0, 1, or 6 | |||
skipping to change at page 72, line 33 ¶ | skipping to change at page 75, line 33 ¶ | |||
(syntax error) are shown below. | (syntax error) are shown below. | |||
Subkind 1: | Subkind 1: | |||
* Reserved additional information values: 1c, 1d, 1e, 3c, 3d, 3e, | * Reserved additional information values: 1c, 1d, 1e, 3c, 3d, 3e, | |||
5c, 5d, 5e, 7c, 7d, 7e, 9c, 9d, 9e, bc, bd, be, dc, dd, de, fc, | 5c, 5d, 5e, 7c, 7d, 7e, 9c, 9d, 9e, bc, bd, be, dc, dd, de, fc, | |||
fd, fe | fd, fe | |||
Subkind 2: | Subkind 2: | |||
* Reserved two-byte encodings of simple types: f8 00, f8 01, f8 18, | * Reserved two-byte encodings of simple values: f8 00, f8 01, f8 18, | |||
f8 1f | f8 1f | |||
Subkind 3: | Subkind 3: | |||
* Indefinite length string chunks not of the correct type: 5f 00 ff, | * Indefinite length string chunks not of the correct type: 5f 00 ff, | |||
5f 21 ff, 5f 61 00 ff, 5f 80 ff, 5f a0 ff, 5f c0 00 ff, 5f e0 ff, | 5f 21 ff, 5f 61 00 ff, 5f 80 ff, 5f a0 ff, 5f c0 00 ff, 5f e0 ff, | |||
7f 41 00 ff | 7f 41 00 ff | |||
* Indefinite length string chunks not definite length: 5f 5f 41 00 | * Indefinite length string chunks not definite length: 5f 5f 41 00 | |||
ff ff, 7f 7f 61 00 ff ff | ff ff, 7f 7f 61 00 ff ff | |||
skipping to change at page 73, line 25 ¶ | skipping to change at page 76, line 25 ¶ | |||
of RFC 7049, with editorial improvements, added detail, and fixed | of RFC 7049, with editorial improvements, added detail, and fixed | |||
errata. This document formally obsoletes RFC 7049, while keeping | errata. This document formally obsoletes RFC 7049, while keeping | |||
full compatibility of the interchange format from RFC 7049. This | full compatibility of the interchange format from RFC 7049. This | |||
document does not create a new version of the format. | document does not create a new version of the format. | |||
G.1. Errata processing, clerical changes | G.1. Errata processing, clerical changes | |||
The two verified errata on RFC 7049, EID 3764 and EID 3770, concerned | The two verified errata on RFC 7049, EID 3764 and EID 3770, concerned | |||
two encoding examples in the text that have been corrected | two encoding examples in the text that have been corrected | |||
(Section 3.4.3: "29" -> "49", Section 5.5: "0b000_11101" -> | (Section 3.4.3: "29" -> "49", Section 5.5: "0b000_11101" -> | |||
"0b000_11001"). Also, RFC 7049 contained an example using the simple | "0b000_11001"). Also, RFC 7049 contained an example using the | |||
type value 24 (EID 5917), which is not well-formed; this example has | numeric value 24 for a simple value (EID 5917), which is not well- | |||
been removed. Errata report 5763 pointed to an accident in the | formed; this example has been removed. Errata report 5763 pointed to | |||
wording of the definition of tags; this was resolved during a re- | an accident in the wording of the definition of tags; this was | |||
write of Section 3.4. Errata report 5434 pointed out that the UBJSON | resolved during a re-write of Section 3.4. Errata report 5434 | |||
example in Appendix E no longer complied with the version of UBJSON | pointed out that the UBJSON example in Appendix E no longer complied | |||
current at the time of submitting the report. It turned out that the | with the version of UBJSON current at the time of submitting the | |||
UBJSON specification had completely changed since 2013; this example | report. It turned out that the UBJSON specification had completely | |||
therefore also was removed. Further errata reports (4409, 4963, | changed since 2013; this example therefore also was removed. Further | |||
4964) complained that the map key sorting rules for canonical | errata reports (4409, 4963, 4964) complained that the map key sorting | |||
encoding were onerous; these led to a reconsideration of the | rules for canonical encoding were onerous; these led to a | |||
canonical encoding suggestions and replacement by the deterministic | reconsideration of the canonical encoding suggestions and replacement | |||
encoding suggestions (described below). An editorial suggestion in | by the deterministic encoding suggestions (described below). An | |||
errata report 4294 was also implemented (improved symmetry by adding | editorial suggestion in errata report 4294 was also implemented | |||
"Second value" to a comment to the last example in Section 3.2.2). | (improved symmetry by adding "Second value" to a comment to the last | |||
example in Section 3.2.2). | ||||
Other, more clerical changes include: | Other, more clerical changes include: | |||
* use of new RFCXML functionality [RFC7991]; | * use of new RFCXML functionality [RFC7991]; | |||
* explain some more of the notation used; | * explain some more of the notation used; | |||
* updated references, e.g. for RFC4627 to [RFC8259] in many places, | * updated references, e.g. for RFC4627 to [RFC8259] in many places, | |||
for CNN-TERMS to [RFC7228]; added missing reference to [IEEE754] | for CNN-TERMS to [RFC7228]; added missing reference to [IEEE754] | |||
(importing required definitions) and updated to [ECMA262]; added a | (importing required definitions) and updated to [ECMA262]; added a | |||
reference to [RFC8618] that further illustrates the discussion in | reference to [RFC8618] that further illustrates the discussion in | |||
Appendix E; | Appendix E; | |||
* the discussion of diagnostic notation mentions the "Extended | * the discussion of diagnostic notation mentions the "Extended | |||
Diagnostic Notation" (EDN) defined in [RFC8610]; | Diagnostic Notation" (EDN) defined in [RFC8610] as well as the gap | |||
diagnostic notation has in representing NaN payloads; an | ||||
explanation was added on how to represent indefinite length | ||||
strings with no chunks; | ||||
* the addition of this appendix. | * the addition of this appendix. | |||
G.2. Changes in IANA considerations | G.2. Changes in IANA considerations | |||
The IANA considerations were generally updated (clerical changes, | The IANA considerations were generally updated (clerical changes, | |||
e.g., now pointing to the CBOR working group as the author of the | e.g., now pointing to the CBOR working group as the author of the | |||
specification). References to the respective IANA registries have | specification). References to the respective IANA registries have | |||
been added to the informative references. | been added to the informative references. | |||
skipping to change at page 74, line 38 ¶ | skipping to change at page 77, line 41 ¶ | |||
A significant addition in this revision is Section 2, which discusses | A significant addition in this revision is Section 2, which discusses | |||
the CBOR data model and its small variations involved in the | the CBOR data model and its small variations involved in the | |||
processing of CBOR. Introducing terms for those (basic generic, | processing of CBOR. Introducing terms for those (basic generic, | |||
extended generic, specific) enables more concise language in other | extended generic, specific) enables more concise language in other | |||
places of the document, but also helps in clarifying expectations on | places of the document, but also helps in clarifying expectations on | |||
implementations and on the extensibility features of the format. | implementations and on the extensibility features of the format. | |||
RFC 7049, as a format derived from the JSON ecosystem, was influenced | RFC 7049, as a format derived from the JSON ecosystem, was influenced | |||
by the JSON number system that was in turn inherited from JavaScript | by the JSON number system that was in turn inherited from JavaScript | |||
at the time. JSON does not provide distinct integers and floating | at the time. JSON does not provide distinct integers and floating- | |||
point values (and the latter are decimal in the format). CBOR | point values (and the latter are decimal in the format). CBOR | |||
provides binary representations of numbers, which do differ between | provides binary representations of numbers, which do differ between | |||
integers and floating point values. Experience from implementation | integers and floating-point values. Experience from implementation | |||
and use now suggested that the separation between these two number | and use now suggested that the separation between these two number | |||
domains should be more clearly drawn in the document; language that | domains should be more clearly drawn in the document; language that | |||
suggested an integer could seamlessly stand in for a floating point | suggested an integer could seamlessly stand in for a floating-point | |||
value was removed. Also, a suggestion (based on I-JSON [RFC7493]) | value was removed. Also, a suggestion (based on I-JSON [RFC7493]) | |||
was added for handling these types when converting JSON to CBOR. | was added for handling these types when converting JSON to CBOR, and | |||
the use of a specific rounding mechanism has been recommended. | ||||
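To illustrate the separation between the two number domains described above, the following is a minimal sketch (not taken from either draft) of how the integer 1 and the floating-point value 1.0 get different encodings in CBOR. The function names `encode_uint` and `encode_float64` are hypothetical. For simplicity, the float encoder always emits the 64-bit form, whereas preferred serialization would pick the shortest form that preserves the value.

```python
import struct


def encode_uint(n: int) -> bytes:
    """Encode an unsigned integer (major type 0) using the shortest head.

    Hypothetical helper for illustration; major type 0 means the
    initial byte equals the additional information value.
    """
    if n < 0:
        raise ValueError("only unsigned integers are handled in this sketch")
    if n < 24:
        # The argument fits directly into the additional information.
        return bytes([n])
    # Additional information 24..27 selects a 1-, 2-, 4- or 8-byte argument.
    for additional_info, fmt in ((24, ">B"), (25, ">H"), (26, ">I"), (27, ">Q")):
        size = struct.calcsize(fmt)
        if n < 1 << (8 * size):
            return bytes([additional_info]) + struct.pack(fmt, n)
    raise ValueError("value does not fit in 64 bits")


def encode_float64(x: float) -> bytes:
    """Encode a float as major type 7, additional information 27 (0xfb).

    Assumption: always uses the 64-bit form, not the shortest one.
    """
    return b"\xfb" + struct.pack(">d", x)


if __name__ == "__main__":
    print(encode_uint(1).hex())       # integer 1
    print(encode_float64(1.0).hex())  # floating-point 1.0
```

The two byte strings differ in both major type and length, so a decoder that preserves this distinction never has to guess whether a sender meant an integer or a floating-point value.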
For a single value in the data model, CBOR often provides multiple | For a single value in the data model, CBOR often provides multiple | |||
encoding options. The revision adds a new section Section 4, which | encoding options. The revision adds a new section Section 4, which | |||
first introduces the term "preferred serialization" (Section 4.1) and | first introduces the term "preferred serialization" (Section 4.1) and | |||
defines it for various kinds of data items. On the basis of this | defines it for various kinds of data items. On the basis of this | |||
terminology, the section goes on to discuss how a CBOR-based protocol | terminology, the section goes on to discuss how a CBOR-based protocol | |||
can define "deterministic encoding" (Section 4.2), which now avoids | can define "deterministic encoding" (Section 4.2), which now avoids | |||
the RFC 7049 terms "canonical" and "canonicalization". The | the RFC 7049 terms "canonical" and "canonicalization". The | |||
suggestion of "Core Deterministic Encoding Requirements" | suggestion of "Core Deterministic Encoding Requirements" | |||
Section 4.2.1 enables generic support for such protocol-defined | Section 4.2.1 enables generic support for such protocol-defined | |||
skipping to change at page 75, line 27 ¶ | skipping to change at page 78, line 33 ¶ | |||
as "syntax error", "decoding error" and "strict mode" outside | as "syntax error", "decoding error" and "strict mode" outside | |||
examples. Also, a third level of requirements beyond CBOR-level | examples. Also, a third level of requirements beyond CBOR-level | |||
validity that an application has on its input data is now explicitly | validity that an application has on its input data is now explicitly | |||
called out. Well-formed (processable at all), valid (checked by a | called out. Well-formed (processable at all), valid (checked by a | |||
validity-checking generic decoder), and expected input (as checked by | validity-checking generic decoder), and expected input (as checked by | |||
the application) are treated as a hierarchy of layers of | the application) are treated as a hierarchy of layers of | |||
acceptability. | acceptability. | |||
The handling of non-well-formed simple values was clarified in text | The handling of non-well-formed simple values was clarified in text | |||
and pseudocode. Appendix F was added to discuss well-formedness | and pseudocode. Appendix F was added to discuss well-formedness | |||
errors and provide examples for them. | errors and provide examples for them. The pseudocode was updated to | |||
be more portable and some portability considerations were added. | ||||
The discussion of validity has been sharpened in two areas. Map | The discussion of validity has been sharpened in two areas. Map | |||
validity (handling of duplicate keys) was clarified and the domain of | validity (handling of duplicate keys) was clarified and the domain of | |||
applicability of certain implementation choices explained. Also, | applicability of certain implementation choices explained. Also, | |||
while streamlining the terminology for tags, tag numbers, and tag | while streamlining the terminology for tags, tag numbers, and tag | |||
content, discussion was added on tag validity, and the restrictions | content, discussion was added on tag validity, and the restrictions | |||
pwere clarified on tag content, in general and specifically for tag | were clarified on tag content, in general and specifically for tag 1. | |||
1. | ||||
An implementation note (and note for future tag definitions) was | An implementation note (and note for future tag definitions) was | |||
added to Section 3.4 about defining tags with semantics that depend | added to Section 3.4 about defining tags with semantics that depend | |||
on serialization order. | on serialization order. | |||
Tag 35 is no longer defined in this updated document; the | ||||
registration based on the definition in RFC 7049 remains in place. | ||||
Terminology was introduced in Section 3 for "argument" and "head", | Terminology was introduced in Section 3 for "argument" and "head", | |||
simplifying further discussion. | simplifying further discussion. | |||
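As a rough sketch of what the terms "head" and "argument" refer to (this code is illustrative and appears in neither draft), the function below splits the initial byte of a data item into major type and additional information, then reads the argument from the following bytes where needed. The name `decode_head` and the shape of the returned tuple are assumptions made for this example.

```python
def decode_head(data: bytes):
    """Return (major_type, additional_info, argument, head_length).

    Hypothetical helper: argument is None for additional information 31
    (indefinite length, or the "break" stop code in major type 7).
    """
    if not data:
        raise ValueError("empty input")
    initial_byte = data[0]
    major_type = initial_byte >> 5          # high-order 3 bits
    additional_info = initial_byte & 0x1F   # low-order 5 bits

    if additional_info < 24:
        # The argument is the additional information itself.
        return major_type, additional_info, additional_info, 1
    if additional_info <= 27:
        # 24, 25, 26, 27 select a 1-, 2-, 4- or 8-byte argument.
        length = 1 << (additional_info - 24)
        if len(data) < 1 + length:
            raise ValueError("truncated head")
        argument = int.from_bytes(data[1:1 + length], "big")
        return major_type, additional_info, argument, 1 + length
    if additional_info == 31:
        if major_type in (0, 1, 6):
            raise ValueError("additional information 31 is not well-formed here")
        return major_type, additional_info, None, 1
    # Values 28..30 are reserved and therefore not well-formed.
    raise ValueError("reserved additional information value")


if __name__ == "__main__":
    print(decode_head(bytes.fromhex("1903e8")))
```

Everything after the head (string bytes, array elements, tag content) is interpreted according to the major type, which is why the head can be handled by one routine shared across all major types.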
The security considerations were mostly rewritten and significantly | The security considerations were mostly rewritten and significantly | |||
expanded; in multiple other places, the document is now more explicit | expanded; in multiple other places, the document is now more explicit | |||
that a decoder cannot simply condone well-formedness errors. | that a decoder cannot simply condone well-formedness errors. | |||
Acknowledgements | Acknowledgements | |||
CBOR was inspired by MessagePack. MessagePack was developed and | CBOR was inspired by MessagePack. MessagePack was developed and | |||
skipping to change at page 76, line 29 ¶ | skipping to change at page 79, line 33 ¶ | |||
contributed to the discussion about extending MessagePack to separate | contributed to the discussion about extending MessagePack to separate | |||
text string representation from byte string representation. | text string representation from byte string representation. | |||
The encoding of the additional information in CBOR was inspired by | The encoding of the additional information in CBOR was inspired by | |||
the encoding of length information designed by Klaus Hartke for CoAP. | the encoding of length information designed by Klaus Hartke for CoAP. | |||
This document also incorporates suggestions made by many people, | This document also incorporates suggestions made by many people, | |||
notably Dan Frost, James Manger, Jeffrey Yasskin, Joe Hildebrand, | notably Dan Frost, James Manger, Jeffrey Yasskin, Joe Hildebrand, | |||
Keith Moore, Laurence Lundblade, Matthew Lepinski, Michael | Keith Moore, Laurence Lundblade, Matthew Lepinski, Michael | |||
Richardson, Nico Williams, Peter Occil, Phillip Hallam-Baker, Ray | Richardson, Nico Williams, Peter Occil, Phillip Hallam-Baker, Ray | |||
Polk, Tim Bray, Tony Finch, Tony Hansen, and Yaron Sheffer. | Polk, Stuart Cheshire, Tim Bray, Tony Finch, Tony Hansen, and Yaron | |||
Sheffer. Benjamin Kaduk provided an extensive review during IESG | ||||
processing. Éric Vyncke, Erik Kline, Robert Wilton, and Roman Danyliw | ||||
provided further IESG comments, which included an IoT directorate | ||||
review by Eve Schooler. | ||||
Authors' Addresses | Authors' Addresses | |||
Carsten Bormann | Carsten Bormann | |||
Universitaet Bremen TZI | Universitaet Bremen TZI | |||
Postfach 330440 | Postfach 330440 | |||
D-28359 Bremen | D-28359 Bremen | |||
Germany | Germany | |||
Phone: +49-421-218-63921 | Phone: +49-421-218-63921 | |||
End of changes. 198 change blocks. | ||||
418 lines changed or deleted | 546 lines changed or added | |||