draft-ietf-cbor-7049bis-12.txt | draft-ietf-cbor-7049bis-13.txt | |||
---|---|---|---|---|
Network Working Group C. Bormann | Network Working Group C. Bormann | |||
Internet-Draft Universitaet Bremen TZI | Internet-Draft Universitaet Bremen TZI | |||
Obsoletes: 7049 (if approved) P. Hoffman | Obsoletes: 7049 (if approved) P. Hoffman | |||
Intended status: Standards Track ICANN | Intended status: Standards Track ICANN | |||
Expires: 20 June 2020 18 December 2019 | Expires: 9 September 2020 8 March 2020 | |||
Concise Binary Object Representation (CBOR) | Concise Binary Object Representation (CBOR) | |||
draft-ietf-cbor-7049bis-12 | draft-ietf-cbor-7049bis-13 | |||
Abstract | Abstract | |||
The Concise Binary Object Representation (CBOR) is a data format | The Concise Binary Object Representation (CBOR) is a data format | |||
whose design goals include the possibility of extremely small code | whose design goals include the possibility of extremely small code | |||
size, fairly small message size, and extensibility without the need | size, fairly small message size, and extensibility without the need | |||
for version negotiation. These design goals make it different from | for version negotiation. These design goals make it different from | |||
earlier binary serializations such as ASN.1 and MessagePack. | earlier binary serializations such as ASN.1 and MessagePack. | |||
This document is a revised edition of RFC 7049, with editorial | This document is a revised edition of RFC 7049, with editorial | |||
skipping to change at page 1, line 38 ¶ | skipping to change at page 1, line 38 ¶ | |||
This document is being worked on in the CBOR Working Group. Please | This document is being worked on in the CBOR Working Group. Please | |||
contribute on the mailing list there, or in the GitHub repository for | contribute on the mailing list there, or in the GitHub repository for | |||
this draft: https://github.com/cbor-wg/CBORbis | this draft: https://github.com/cbor-wg/CBORbis | |||
The charter for the CBOR Working Group says that the WG will update | The charter for the CBOR Working Group says that the WG will update | |||
RFC 7049 to fix verified errata. Security issues and clarifications | RFC 7049 to fix verified errata. Security issues and clarifications | |||
may be addressed, but changes to this document will ensure backward | may be addressed, but changes to this document will ensure backward | |||
compatibility for popular deployed codebases. This document will be | compatibility for popular deployed codebases. This document will be | |||
targeted at becoming an Internet Standard. | targeted at becoming an Internet Standard. | |||
[RFC editor: please remove this note.] | ||||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on 20 June 2020. | This Internet-Draft will expire on 9 September 2020. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2019 IETF Trust and the persons identified as the | Copyright (c) 2020 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents (https://trustee.ietf.org/ | Provisions Relating to IETF Documents (https://trustee.ietf.org/ | |||
license-info) in effect on the date of publication of this document. | license-info) in effect on the date of publication of this document. | |||
Please review these documents carefully, as they describe your rights | Please review these documents carefully, as they describe your rights | |||
and restrictions with respect to this document. Code Components | and restrictions with respect to this document. Code Components | |||
extracted from this document must include Simplified BSD License text | extracted from this document must include Simplified BSD License text | |||
as described in Section 4.e of the Trust Legal Provisions and are | as described in Section 4.e of the Trust Legal Provisions and are | |||
provided without warranty as described in the Simplified BSD License. | provided without warranty as described in the Simplified BSD License. | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
1.1. Objectives . . . . . . . . . . . . . . . . . . . . . . . 4 | 1.1. Objectives . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
1.2. Terminology . . . . . . . . . . . . . . . . . . . . . . . 6 | 1.2. Terminology . . . . . . . . . . . . . . . . . . . . . . . 6 | |||
2. CBOR Data Models . . . . . . . . . . . . . . . . . . . . . . 7 | 2. CBOR Data Models . . . . . . . . . . . . . . . . . . . . . . 7 | |||
2.1. Extended Generic Data Models . . . . . . . . . . . . . . 8 | 2.1. Extended Generic Data Models . . . . . . . . . . . . . . 8 | |||
2.2. Specific Data Models . . . . . . . . . . . . . . . . . . 9 | 2.2. Specific Data Models . . . . . . . . . . . . . . . . . . 9 | |||
3. Specification of the CBOR Encoding . . . . . . . . . . . . . 9 | 3. Specification of the CBOR Encoding . . . . . . . . . . . . . 10 | |||
3.1. Major Types . . . . . . . . . . . . . . . . . . . . . . . 11 | 3.1. Major Types . . . . . . . . . . . . . . . . . . . . . . . 11 | |||
3.2. Indefinite Lengths for Some Major Types . . . . . . . . . 13 | 3.2. Indefinite Lengths for Some Major Types . . . . . . . . . 13 | |||
3.2.1. The "break" Stop Code . . . . . . . . . . . . . . . . 13 | 3.2.1. The "break" Stop Code . . . . . . . . . . . . . . . . 13 | |||
3.2.2. Indefinite-Length Arrays and Maps . . . . . . . . . . 14 | 3.2.2. Indefinite-Length Arrays and Maps . . . . . . . . . . 14 | |||
3.2.3. Indefinite-Length Byte Strings and Text Strings . . . 16 | 3.2.3. Indefinite-Length Byte Strings and Text Strings . . . 16 | |||
3.3. Floating-Point Numbers and Values with No Content . . . . 16 | 3.2.4. Summary of indefinite-length use of major types . . . 17 | |||
3.4. Tagging of Items . . . . . . . . . . . . . . . . . . . . 18 | 3.3. Floating-Point Numbers and Values with No Content . . . . 17 | |||
3.4.1. Standard Date/Time String . . . . . . . . . . . . . . 20 | 3.4. Tagging of Items . . . . . . . . . . . . . . . . . . . . 19 | |||
3.4.2. Epoch-based Date/Time . . . . . . . . . . . . . . . . 20 | 3.4.1. Standard Date/Time String . . . . . . . . . . . . . . 22 | |||
3.4.3. Bignums . . . . . . . . . . . . . . . . . . . . . . . 21 | 3.4.2. Epoch-based Date/Time . . . . . . . . . . . . . . . . 22 | |||
3.4.4. Decimal Fractions and Bigfloats . . . . . . . . . . . 22 | 3.4.3. Bignums . . . . . . . . . . . . . . . . . . . . . . . 23 | |||
3.4.5. Content Hints . . . . . . . . . . . . . . . . . . . . 23 | 3.4.4. Decimal Fractions and Bigfloats . . . . . . . . . . . 24 | |||
3.4.5.1. Encoded CBOR Data Item . . . . . . . . . . . . . 23 | 3.4.5. Content Hints . . . . . . . . . . . . . . . . . . . . 25 | |||
| 3.4.5.1. Encoded CBOR Data Item . . . . . . . . . . . . . 25 | |||
3.4.5.2. Expected Later Encoding for CBOR-to-JSON | 3.4.5.2. Expected Later Encoding for CBOR-to-JSON | |||
Converters . . . . . . . . . . . . . . . . . . . . 24 | Converters . . . . . . . . . . . . . . . . . . . . 25 | |||
3.4.5.3. Encoded Text . . . . . . . . . . . . . . . . . . 24 | 3.4.5.3. Encoded Text . . . . . . . . . . . . . . . . . . 26 | |||
3.4.6. Self-Described CBOR . . . . . . . . . . . . . . . . . 25 | 3.4.6. Self-Described CBOR . . . . . . . . . . . . . . . . . 27 | |||
4. Serialization Considerations . . . . . . . . . . . . . . . . 26 | ||||
4.1. Preferred Serialization . . . . . . . . . . . . . . . . . 26 | 4. Serialization Considerations . . . . . . . . . . . . . . . . 28 | |||
4.2. Deterministically Encoded CBOR . . . . . . . . . . . . . 27 | 4.1. Preferred Serialization . . . . . . . . . . . . . . . . . 28 | |||
4.2.1. Core Deterministic Encoding Requirements . . . . . . 27 | 4.2. Deterministically Encoded CBOR . . . . . . . . . . . . . 29 | |||
4.2.2. Additional Deterministic Encoding Considerations . . 28 | 4.2.1. Core Deterministic Encoding Requirements . . . . . . 29 | |||
4.2.3. Length-first map key ordering . . . . . . . . . . . . 30 | 4.2.2. Additional Deterministic Encoding Considerations . . 30 | |||
5. Creating CBOR-Based Protocols . . . . . . . . . . . . . . . . 31 | 4.2.3. Length-first Map Key Ordering . . . . . . . . . . . . 32 | |||
5.1. CBOR in Streaming Applications . . . . . . . . . . . . . 31 | 5. Creating CBOR-Based Protocols . . . . . . . . . . . . . . . . 33 | |||
5.2. Generic Encoders and Decoders . . . . . . . . . . . . . . 32 | 5.1. CBOR in Streaming Applications . . . . . . . . . . . . . 33 | |||
5.3. Validity of Items . . . . . . . . . . . . . . . . . . . . 32 | 5.2. Generic Encoders and Decoders . . . . . . . . . . . . . . 34 | |||
5.3.1. Basic validity . . . . . . . . . . . . . . . . . . . 33 | 5.3. Validity of Items . . . . . . . . . . . . . . . . . . . . 35 | |||
5.3.2. Tag validity . . . . . . . . . . . . . . . . . . . . 33 | 5.3.1. Basic validity . . . . . . . . . . . . . . . . . . . 35 | |||
5.4. Validity and Evolution . . . . . . . . . . . . . . . . . 34 | 5.3.2. Tag validity . . . . . . . . . . . . . . . . . . . . 35 | |||
5.5. Numbers . . . . . . . . . . . . . . . . . . . . . . . . . 35 | 5.4. Validity and Evolution . . . . . . . . . . . . . . . . . 36 | |||
5.6. Specifying Keys for Maps . . . . . . . . . . . . . . . . 35 | 5.5. Numbers . . . . . . . . . . . . . . . . . . . . . . . . . 37 | |||
5.6.1. Equivalence of Keys . . . . . . . . . . . . . . . . . 36 | 5.6. Specifying Keys for Maps . . . . . . . . . . . . . . . . 38 | |||
5.7. Undefined Values . . . . . . . . . . . . . . . . . . . . 37 | 5.6.1. Equivalence of Keys . . . . . . . . . . . . . . . . . 39 | |||
6. Converting Data between CBOR and JSON . . . . . . . . . . . . 38 | 5.7. Undefined Values . . . . . . . . . . . . . . . . . . . . 40 | |||
6.1. Converting from CBOR to JSON . . . . . . . . . . . . . . 38 | 6. Converting Data between CBOR and JSON . . . . . . . . . . . . 40 | |||
6.2. Converting from JSON to CBOR . . . . . . . . . . . . . . 39 | 6.1. Converting from CBOR to JSON . . . . . . . . . . . . . . 41 | |||
7. Future Evolution of CBOR . . . . . . . . . . . . . . . . . . 40 | 6.2. Converting from JSON to CBOR . . . . . . . . . . . . . . 42 | |||
7.1. Extension Points . . . . . . . . . . . . . . . . . . . . 41 | 7. Future Evolution of CBOR . . . . . . . . . . . . . . . . . . 43 | |||
7.2. Curating the Additional Information Space . . . . . . . . 41 | 7.1. Extension Points . . . . . . . . . . . . . . . . . . . . 43 | |||
8. Diagnostic Notation . . . . . . . . . . . . . . . . . . . . . 42 | 7.2. Curating the Additional Information Space . . . . . . . . 44 | |||
8.1. Encoding Indicators . . . . . . . . . . . . . . . . . . . 43 | 8. Diagnostic Notation . . . . . . . . . . . . . . . . . . . . . 45 | |||
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 44 | 8.1. Encoding Indicators . . . . . . . . . . . . . . . . . . . 46 | |||
9.1. Simple Values Registry . . . . . . . . . . . . . . . . . 44 | 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 46 | |||
9.2. Tags Registry . . . . . . . . . . . . . . . . . . . . . . 44 | 9.1. Simple Values Registry . . . . . . . . . . . . . . . . . 47 | |||
9.3. Media Type ("MIME Type") . . . . . . . . . . . . . . . . 45 | 9.2. Tags Registry . . . . . . . . . . . . . . . . . . . . . . 47 | |||
9.4. CoAP Content-Format . . . . . . . . . . . . . . . . . . . 45 | 9.3. Media Type ("MIME Type") . . . . . . . . . . . . . . . . 47 | |||
9.5. The +cbor Structured Syntax Suffix Registration . . . . . 46 | 9.4. CoAP Content-Format . . . . . . . . . . . . . . . . . . . 48 | |||
10. Security Considerations . . . . . . . . . . . . . . . . . . . 47 | 9.5. The +cbor Structured Syntax Suffix Registration . . . . . 49 | |||
11. References . . . . . . . . . . . . . . . . . . . . . . . . . 48 | 10. Security Considerations . . . . . . . . . . . . . . . . . . . 50 | |||
11.1. Normative References . . . . . . . . . . . . . . . . . . 48 | 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 52 | |||
11.2. Informative References . . . . . . . . . . . . . . . . . 50 | 11.1. Normative References . . . . . . . . . . . . . . . . . . 52 | |||
Appendix A. Examples . . . . . . . . . . . . . . . . . . . . . . 51 | 11.2. Informative References . . . . . . . . . . . . . . . . . 53 | |||
Appendix B. Jump Table . . . . . . . . . . . . . . . . . . . . . 55 | Appendix A. Examples . . . . . . . . . . . . . . . . . . . . . . 55 | |||
Appendix C. Pseudocode . . . . . . . . . . . . . . . . . . . . . 58 | Appendix B. Jump Table . . . . . . . . . . . . . . . . . . . . . 59 | |||
Appendix D. Half-Precision . . . . . . . . . . . . . . . . . . . 61 | Appendix C. Pseudocode . . . . . . . . . . . . . . . . . . . . . 62 | |||
| Appendix D. Half-Precision . . . . . . . . . . . . . . . . . . . 65 | |||
Appendix E. Comparison of Other Binary Formats to CBOR's Design | Appendix E. Comparison of Other Binary Formats to CBOR's Design | |||
Objectives . . . . . . . . . . . . . . . . . . . . . . . 62 | Objectives . . . . . . . . . . . . . . . . . . . . . . . 66 | |||
E.1. ASN.1 DER, BER, and PER . . . . . . . . . . . . . . . . . 63 | E.1. ASN.1 DER, BER, and PER . . . . . . . . . . . . . . . . . 67 | |||
E.2. MessagePack . . . . . . . . . . . . . . . . . . . . . . . 63 | E.2. MessagePack . . . . . . . . . . . . . . . . . . . . . . . 67 | |||
E.3. BSON . . . . . . . . . . . . . . . . . . . . . . . . . . 64 | E.3. BSON . . . . . . . . . . . . . . . . . . . . . . . . . . 68 | |||
E.4. MSDTP: RFC 713 . . . . . . . . . . . . . . . . . . . . . 64 | E.4. MSDTP: RFC 713 . . . . . . . . . . . . . . . . . . . . . 68 | |||
E.5. Conciseness on the Wire . . . . . . . . . . . . . . . . . 64 | E.5. Conciseness on the Wire . . . . . . . . . . . . . . . . . 68 | |||
Appendix F. Changes from RFC 7049 . . . . . . . . . . . . . . . 65 | Appendix F. Changes from RFC 7049 . . . . . . . . . . . . . . . 69 | |||
Appendix G. Well-formedness errors and examples . . . . . . . . 65 | Appendix G. Well-formedness errors and examples . . . . . . . . 70 | |||
G.1. Examples for CBOR data items that are not well-formed . . 66 | G.1. Examples for CBOR data items that are not well-formed . . 71 | |||
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . 68 | Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . 73 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 69 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 74 | |||
1. Introduction | 1. Introduction | |||
There are hundreds of standardized formats for binary representation | There are hundreds of standardized formats for binary representation | |||
of structured data (also known as binary serialization formats). Of | of structured data (also known as binary serialization formats). Of | |||
those, some are for specific domains of information, while others are | those, some are for specific domains of information, while others are | |||
generalized for arbitrary data. In the IETF, probably the best-known | generalized for arbitrary data. In the IETF, probably the best-known | |||
formats in the latter category are ASN.1's BER and DER [ASN.1]. | formats in the latter category are ASN.1's BER and DER [ASN.1]. | |||
The format defined here follows some specific design goals that are | The format defined here follows some specific design goals that are | |||
skipping to change at page 5, line 25 ¶ | skipping to change at page 5, line 29 ¶ | |||
3. Data must be able to be decoded without a schema description. | 3. Data must be able to be decoded without a schema description. | |||
* Similar to JSON, encoded data should be self-describing so | * Similar to JSON, encoded data should be self-describing so | |||
that a generic decoder can be written. | that a generic decoder can be written. | |||
4. The serialization must be reasonably compact, but data | 4. The serialization must be reasonably compact, but data | |||
compactness is secondary to code compactness for the encoder and | compactness is secondary to code compactness for the encoder and | |||
decoder. | decoder. | |||
* "Reasonable" here is bounded by JSON as an upper bound in | * "Reasonable" here is bounded by JSON as an upper bound in | |||
size, and by implementation complexity maintaining a lower | size, and by the implementation complexity limiting how much | |||
bound. Using either general compression schemes or extensive | effort can go into achieving that compactness. Using either | |||
bit-fiddling violates the complexity goals. | general compression schemes or extensive bit-fiddling violates | |||
| the complexity goals. | |||
5. The format must be applicable to both constrained nodes and high- | 5. The format must be applicable to both constrained nodes and high- | |||
volume applications. | volume applications. | |||
* This means it must be reasonably frugal in CPU usage for both | * This means it must be reasonably frugal in CPU usage for both | |||
encoding and decoding. This is relevant both for constrained | encoding and decoding. This is relevant both for constrained | |||
nodes and for potential usage in applications with a very high | nodes and for potential usage in applications with a very high | |||
volume of data. | volume of data. | |||
6. The format must support all JSON data types for conversion to and | 6. The format must support all JSON data types for conversion to and | |||
skipping to change at page 6, line 48 ¶ | skipping to change at page 7, line 4 ¶ | |||
Data Stream: A sequence of zero or more data items, not further | Data Stream: A sequence of zero or more data items, not further | |||
assembled into a larger containing data item. The independent | assembled into a larger containing data item. The independent | |||
data items that make up a data stream are sometimes also referred | data items that make up a data stream are sometimes also referred | |||
to as "top-level data items". | to as "top-level data items". | |||
Well-formed: A data item that follows the syntactic structure of | Well-formed: A data item that follows the syntactic structure of | |||
CBOR. A well-formed data item uses the initial bytes and the byte | CBOR. A well-formed data item uses the initial bytes and the byte | |||
strings and/or data items that are implied by their values as | strings and/or data items that are implied by their values as | |||
defined in CBOR and does not include following extraneous data. | defined in CBOR and does not include following extraneous data. | |||
CBOR decoders by definition only return contents from well-formed | CBOR decoders by definition only return contents from well-formed | |||
data items. | data items. | |||
Valid: A data item that is well-formed and also follows the semantic | Valid: A data item that is well-formed and also follows the semantic | |||
restrictions that apply to CBOR data items. | restrictions that apply to CBOR data items (Section 5.3). | |||
Expected: Besides its normal English meaning, the term "expected" is | Expected: Besides its normal English meaning, the term "expected" is | |||
used to describe requirements beyond CBOR validity that an | used to describe requirements beyond CBOR validity that an | |||
application has on its input data. Well-formed (processable at | application has on its input data. Well-formed (processable at | |||
all), valid (checked by a validity-checking generic decoder), and | all), valid (checked by a validity-checking generic decoder), and | |||
expected (checked by the application) form a hierarchy of layers | expected (checked by the application) form a hierarchy of layers | |||
of acceptability. | of acceptability. | |||
Stream decoder: A process that decodes a data stream and makes each | Stream decoder: A process that decodes a data stream and makes each | |||
of the data items in the sequence available to an application as | of the data items in the sequence available to an application as | |||
they are received. | they are received. | |||
| Terms and concepts for floating-point values such as Infinity, NaN | |||
| (not a number), negative zero, and subnormal are defined in | |||
| [IEEE754]. | |||
Where bit arithmetic or data types are explained, this document uses | Where bit arithmetic or data types are explained, this document uses | |||
the notation familiar from the programming language C, except that | the notation familiar from the programming language C, except that | |||
"**" denotes exponentiation. Similar to the "0x" notation for | "**" denotes exponentiation. Similar to the "0x" notation for | |||
hexadecimal numbers, numbers in binary notation are prefixed with | hexadecimal numbers, numbers in binary notation are prefixed with | |||
"0b". Underscores can be added to a number solely for readability, | "0b". Underscores can be added to a number solely for readability, | |||
so 0b00100001 (0x21) might be written 0b001_00001 to emphasize the | so 0b00100001 (0x21) might be written 0b001_00001 to emphasize the | |||
desired interpretation of the bits in the byte; in this case, it is | desired interpretation of the bits in the byte; in this case, it is | |||
split into three bits and five bits. Encoded CBOR data items are | split into three bits and five bits. Encoded CBOR data items are | |||
sometimes given in the "0x" or "0b" notation; these values are first | sometimes given in the "0x" or "0b" notation; these values are first | |||
interpreted as numbers as in C and are then interpreted as byte | interpreted as numbers as in C and are then interpreted as byte | |||
strings in network byte order, including any leading zero bytes | strings in network byte order, including any leading zero bytes | |||
expressed in the notation. | expressed in the notation. | |||
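The notation rules above can be illustrated with a minimal Python sketch. The helper names `notation_to_bytes` and `split_initial_byte` are hypothetical and are not part of either draft; the sketch assumes the literal describes a whole number of bytes.

```python
from typing import Tuple


def notation_to_bytes(literal: str) -> bytes:
    """Turn a "0x..." or "0b..." literal into the byte string it denotes.

    Hypothetical helper: assumes the digits describe whole bytes.
    """
    digits = literal.replace("_", "")  # underscores are for readability only
    prefix, body = digits[:2].lower(), digits[2:]
    bits_per_digit = {"0x": 4, "0b": 1}[prefix]
    total_bits = len(body) * bits_per_digit
    if total_bits % 8 != 0:
        raise ValueError("literal does not describe a whole number of bytes")
    # Interpret as a number first, then emit it in network (big-endian)
    # byte order, keeping any leading zero bytes that were written out.
    base = 16 if prefix == "0x" else 2
    return int(body, base).to_bytes(total_bits // 8, "big")


def split_initial_byte(byte: int) -> Tuple[int, int]:
    """Split one byte into its high-order 3 bits and low-order 5 bits."""
    return byte >> 5, byte & 0b000_11111
```

Under these assumptions, `notation_to_bytes("0x0021")` keeps the leading zero byte and produces two bytes, while `split_initial_byte(0b001_00001)` separates the three-bit and five-bit fields that the underscore emphasizes.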
| Words may be _italicized_ for emphasis; in the plain text form of | |||
| this specification this is indicated by surrounding words with | |||
| underscore characters. Verbatim text (e.g., names from a programming | |||
| language) may be set in "monospace" type; in plain text this is | |||
| approximated somewhat ambiguously by surrounding the text in double | |||
| quotes (which also retain their usual meaning). | |||
2. CBOR Data Models | 2. CBOR Data Models | |||
CBOR is explicit about its generic data model, which defines the set | CBOR is explicit about its generic data model, which defines the set | |||
of all data items that can be represented in CBOR. Its basic generic | of all data items that can be represented in CBOR. Its basic generic | |||
data model is extensible by the registration of simple type values | data model is extensible by the registration of simple type values | |||
and tags. Applications can then subset the resulting extended | and tags. Applications can then subset the resulting extended | |||
generic data model to build their specific data models. | generic data model to build their specific data models. | |||
Within environments that can represent the data items in the generic | Within environments that can represent the data items in the generic | |||
data model, generic CBOR encoders and decoders can be implemented | data model, generic CBOR encoders and decoders can be implemented | |||
skipping to change at page 8, line 4 ¶ | skipping to change at page 8, line 17 ¶ | |||
(which usually involves defining additional implementation data types | (which usually involves defining additional implementation data types | |||
for those data items that do not already have a natural | for those data items that do not already have a natural | |||
representation in the environment). The ability to provide generic | representation in the environment). The ability to provide generic | |||
encoders and decoders is an explicit design goal of CBOR; however, | encoders and decoders is an explicit design goal of CBOR; however, | |||
many applications will provide their own application-specific | many applications will provide their own application-specific | |||
encoders and/or decoders. | encoders and/or decoders. | |||
In the basic (un-extended) generic data model, a data item is one of: | In the basic (un-extended) generic data model, a data item is one of: | |||
* an integer in the range -2**64..2**64-1 inclusive | * an integer in the range -2**64..2**64-1 inclusive | |||
* a simple value, identified by a number between 0 and 255, but | * a simple value, identified by a number between 0 and 255, but | |||
distinct from that number | distinct from that number itself | |||
* a floating-point value, distinct from an integer, out of the set | * a floating-point value, distinct from an integer, out of the set | |||
representable by IEEE 754 binary64 (including non-finites) | representable by IEEE 754 binary64 (including non-finites) | |||
[IEEE754] | [IEEE754] | |||
* a sequence of zero or more bytes ("byte string") | * a sequence of zero or more bytes ("byte string") | |||
* a sequence of zero or more Unicode code points ("text string") | * a sequence of zero or more Unicode code points ("text string") | |||
* a sequence of zero or more data items ("array") | * a sequence of zero or more data items ("array") | |||
* a mapping (mathematical function) from zero or more data items | * a mapping (mathematical function) from zero or more data items | |||
("keys") each to a data item ("values"), ("map") | ("keys") each to a data item ("values"), ("map") | |||
* a tagged data item ("tag"), comprising a tag number (an integer in | * a tagged data item ("tag"), comprising a tag number (an integer in | |||
the range 0..2**64-1) and a tagged value (a data item) | the range 0..2**64-1) and the tag content (a data item) | |||
Note that integer and floating-point values are distinct in this | Note that integer and floating-point values are distinct in this | |||
model, even if they have the same numeric value. | model, even if they have the same numeric value. | |||
Also note that serialization variants, such as the number of bytes of | Also note that serialization variants, such as the number of bytes of | |||
the encoded floating value, or the choice of one of the ways in which | the encoded floating-point value, or the choice of one of the ways in | |||
an integer, the length of a text or byte string, the number of | which an integer, the length of a text or byte string, the number of | |||
elements in an array or pairs in a map, or a tag number, | elements in an array or pairs in a map, or a tag number, | |||
(collectively "the argument", see Section 3) can be encoded, are not | (collectively "the argument", see Section 3) can be encoded, are not | |||
visible at the generic data model level. | visible at the generic data model level. | |||
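One way to picture the basic generic data model listed above is the following Python sketch. It is only an illustration under stated assumptions: the class names `SimpleValue` and `Tag` and the functions `is_model_integer` and `same_data_item` are hypothetical, and Python's built-in `int`, `float`, `bytes`, `str`, `list`, and `dict` are assumed to stand in for the remaining kinds of data item.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class SimpleValue:
    """A simple value: identified by a number 0..255, but not that number."""
    number: int

    def __post_init__(self) -> None:
        if not 0 <= self.number <= 255:
            raise ValueError("simple value number must be in 0..255")


@dataclass(frozen=True)
class Tag:
    """A tagged data item: a tag number plus the tag content."""
    number: int
    content: Any

    def __post_init__(self) -> None:
        if not 0 <= self.number <= 2**64 - 1:
            raise ValueError("tag number must be in 0..2**64-1")


def is_model_integer(value: Any) -> bool:
    """True if value is an integer in the range -2**64..2**64-1 inclusive."""
    return (isinstance(value, int) and not isinstance(value, bool)
            and -2**64 <= value <= 2**64 - 1)


def same_data_item(a: Any, b: Any) -> bool:
    """Compare two items so that integer 1 and floating-point 1.0 differ."""
    return type(a) is type(b) and a == b
```

In this sketch `same_data_item(1, 1.0)` is false even though Python's own `==` treats the two as equal, which mirrors the statement that integer and floating-point values are distinct in the model. Nothing in these types records how many bytes were used on the wire, in line with the note that serialization variants are not visible at this level.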
2.1. Extended Generic Data Models | 2.1. Extended Generic Data Models | |||
This basic generic data model comes pre-extended by the registration | This basic generic data model comes pre-extended by the registration | |||
of a number of simple values and tag numbers right in this document, | of a number of simple values and tag numbers right in this document, | |||
such as: | such as: | |||
skipping to change at page 9, line 45 ¶ | skipping to change at page 10, line 11 ¶ | |||
representations of integral values are equivalent, using both map | representations of integral values are equivalent, using both map | |||
keys "0" and "0.0" in a single map would be considered duplicates, | keys "0" and "0.0" in a single map would be considered duplicates, | |||
even while encoded as different major types, and so invalid; and an | even while encoded as different major types, and so invalid; and an | |||
encoder could encode integral-valued floats as integers or vice | encoder could encode integral-valued floats as integers or vice | |||
versa, perhaps to save encoded bytes. | versa, perhaps to save encoded bytes. | |||
3. Specification of the CBOR Encoding | 3. Specification of the CBOR Encoding | |||
A CBOR data item (Section 2) is encoded to or decoded from a byte | A CBOR data item (Section 2) is encoded to or decoded from a byte | |||
string carrying a well-formed encoded data item as described in this | string carrying a well-formed encoded data item as described in this | |||
section. The encoding is summarized in Table 6, indexed by the | section. The encoding is summarized in Table 7, indexed by the | |||
initial byte. An encoder MUST produce only well-formed encoded data | initial byte. An encoder MUST produce only well-formed encoded data | |||
items. A decoder MUST NOT return a decoded data item when it | items. A decoder MUST NOT return a decoded data item when it | |||
encounters input that is not a well-formed encoded CBOR data item | encounters input that is not a well-formed encoded CBOR data item | |||
(this does not detract from the usefulness of diagnostic and recovery | (this does not detract from the usefulness of diagnostic and recovery | |||
tools that might make available some information from a damaged | tools that might make available some information from a damaged | |||
encoded CBOR data item). | encoded CBOR data item). | |||
The initial byte of each encoded data item contains both information | The initial byte of each encoded data item contains both information | |||
about the major type (the high-order 3 bits, described in | about the major type (the high-order 3 bits, described in | |||
Section 3.1) and additional information (the low-order 5 bits). With | Section 3.1) and additional information (the low-order 5 bits). With | |||
skipping to change at page 10, line 49 ¶ | skipping to change at page 11, line 16 ¶ | |||
If the encoded sequence of bytes ends before the end of a data item, | If the encoded sequence of bytes ends before the end of a data item, | |||
that item is not well-formed. If the encoded sequence of bytes still | that item is not well-formed. If the encoded sequence of bytes still | |||
has bytes remaining after the outermost encoded item is decoded, that | has bytes remaining after the outermost encoded item is decoded, that | |||
encoding is not a single well-formed CBOR item; depending on the | encoding is not a single well-formed CBOR item; depending on the | |||
application, the decoder may either treat the encoding as not well- | application, the decoder may either treat the encoding as not well- | |||
formed or just identify the start of the remaining bytes to the | formed or just identify the start of the remaining bytes to the | |||
application. | application. | |||
A CBOR decoder implementation can be based on a jump table with all | A CBOR decoder implementation can be based on a jump table with all | |||
256 defined values for the initial byte (Table 6). A decoder in a | 256 defined values for the initial byte (Table 7). A decoder in a | |||
constrained implementation can instead use the structure of the | constrained implementation can instead use the structure of the | |||
initial byte and following bytes for more compact code (see | initial byte and following bytes for more compact code (see | |||
Appendix C for a rough impression of how this could look). | Appendix C for a rough impression of how this could look). | |||
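As a rough illustration of the "structure of the initial byte" approach mentioned above, the following Python sketch splits an initial byte into its major type (high-order 3 bits) and additional information (low-order 5 bits). The function name split_initial_byte is a hypothetical helper chosen for this example; it is not taken from the draft or from Appendix C.

```python
from typing import Tuple


def split_initial_byte(initial_byte: int) -> Tuple[int, int]:
    """Return (major_type, additional_information) for one initial byte.

    Hypothetical helper for illustration only.
    """
    if not 0 <= initial_byte <= 0xFF:
        raise ValueError("initial byte must be in the range 0..255")
    major_type = initial_byte >> 5               # high-order 3 bits
    additional_information = initial_byte & 0x1F  # low-order 5 bits
    return major_type, additional_information


if __name__ == "__main__":
    # 0b100_01010 is the array example used in Section 3.1.
    print(split_initial_byte(0b100_01010))
```

A table-driven decoder would instead index a 256-entry table with the initial byte; the bit-level split trades that table for two cheap operations, which is the kind of compactness a constrained implementation may prefer.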
3.1. Major Types | 3.1. Major Types | |||
The following lists the major types and the additional information | The following lists the major types and the additional information | |||
and other bytes associated with the type. | and other bytes associated with the type. | |||
Major type 0: an integer in the range 0..2**64-1 inclusive. The | Major type 0: an integer in the range 0..2**64-1 inclusive. The | |||
skipping to change at page 11, line 45 ¶ | skipping to change at page 12, line 13 ¶ | |||
formed but invalid. This type is provided for systems that need | formed but invalid. This type is provided for systems that need | |||
to interpret or display human-readable text, and allows the | to interpret or display human-readable text, and allows the | |||
differentiation between unstructured bytes and text that has a | differentiation between unstructured bytes and text that has a | |||
specified repertoire and encoding. In contrast to formats such as | specified repertoire and encoding. In contrast to formats such as | |||
JSON, the Unicode characters in this type are never escaped. | JSON, the Unicode characters in this type are never escaped. | |||
Thus, a newline character (U+000A) is always represented in a | Thus, a newline character (U+000A) is always represented in a | |||
string as the byte 0x0a, and never as the bytes 0x5c6e (the | string as the byte 0x0a, and never as the bytes 0x5c6e (the | |||
characters "\" and "n") or as 0x5c7530303061 (the characters "\", | characters "\" and "n") or as 0x5c7530303061 (the characters "\", | |||
"u", "0", "0", "0", and "a"). | "u", "0", "0", "0", and "a"). | |||
Major type 4: an array of data items. Arrays are also called lists, | Major type 4: an array of data items. In other formats, arrays are | |||
sequences, or tuples. The argument is the number of data items in | also called lists, sequences, or tuples (a "CBOR sequence" is | |||
the array. Items in an array do not need to all be of the same | something slightly different, though [RFC8742]). The argument is | |||
type. For example, an array that contains 10 items of any type | the number of data items in the array. Items in an array do not | |||
would have an initial byte of 0b100_01010 (major type of 4, | need to all be of the same type. For example, an array that | |||
additional information of 10 for the length) followed by the 10 | contains 10 items of any type would have an initial byte of | |||
remaining items. | 0b100_01010 (major type of 4, additional information of 10 for the | |||
length) followed by the 10 remaining items. | ||||
Major type 5: a map of pairs of data items. Maps are also called | Major type 5: a map of pairs of data items. Maps are also called | |||
tables, dictionaries, hashes, or objects (in JSON). A map is | tables, dictionaries, hashes, or objects (in JSON). A map is | |||
comprised of pairs of data items, each pair consisting of a key | comprised of pairs of data items, each pair consisting of a key | |||
that is immediately followed by a value. The argument is the | that is immediately followed by a value. The argument is the | |||
number of _pairs_ of data items in the map. For example, a map | number of _pairs_ of data items in the map. For example, a map | |||
that contains 9 pairs would have an initial byte of 0b101_01001 | that contains 9 pairs would have an initial byte of 0b101_01001 | |||
(major type of 5, additional information of 9 for the number of | (major type of 5, additional information of 9 for the number of | |||
pairs) followed by the 18 remaining items. The first item is the | pairs) followed by the 18 remaining items. The first item is the | |||
first key, the second item is the first value, the third item is | first key, the second item is the first value, the third item is | |||
the second key, and so on. Because items in a map come in pairs, | the second key, and so on. Because items in a map come in pairs, | |||
their total number is always even: A map that contains an odd | their total number is always even: A map that contains an odd | |||
number of items (no value data present after the last key data | number of items (no value data present after the last key data | |||
item) is not well-formed. A map that has duplicate keys may be | item) is not well-formed. A map that has duplicate keys may be | |||
well-formed, but it is not valid, and thus it causes indeterminate | well-formed, but it is not valid, and thus it causes indeterminate | |||
decoding; see also Section 5.6. | decoding; see also Section 5.6. | |||
Major type 6: a tagged data item ("tag") whose tag number is the | Major type 6: a tagged data item ("tag") whose tag number, an | |||
argument and whose enclosed data item ("tag content") is the | integer in the range 0..2**64-1 inclusive, is the argument and | |||
single encoded data item that follows the head. See Section 3.4. | whose enclosed data item ("tag content") is the single encoded | |||
data item that follows the head. See Section 3.4. | ||||
Major type 7: floating-point numbers and simple values, as well as | Major type 7: floating-point numbers and simple values, as well as | |||
the "break" stop code. See Section 3.3. | the "break" stop code. See Section 3.3. | |||
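The initial bytes quoted in the array and map examples above can be reproduced with a small head encoder. The sketch below is a minimal illustration, not a definitive implementation: the name encode_head is hypothetical, and the rule for arguments of 24 and above (additional information 24 to 27 followed by a 1-, 2-, 4-, or 8-byte big-endian argument) comes from the general head description in Section 3, which is not shown in this excerpt.

```python
import struct


def encode_head(major_type: int, argument: int) -> bytes:
    """Encode the head of a definite-length item of major type 0..6.

    Hypothetical helper for illustration only. Major type 7 is left out
    because its additional information is interpreted differently.
    """
    if not 0 <= major_type <= 6:
        raise ValueError("this sketch handles major types 0..6 only")
    if not 0 <= argument < 2**64:
        raise ValueError("argument must be in the range 0..2**64-1")
    if argument < 24:
        # The argument fits directly into the additional information.
        return bytes([(major_type << 5) | argument])
    # Otherwise pick the shortest following-bytes form that fits.
    for additional_information, fmt, limit in (
        (24, ">B", 2**8),
        (25, ">H", 2**16),
        (26, ">I", 2**32),
        (27, ">Q", 2**64),
    ):
        if argument < limit:
            initial_byte = (major_type << 5) | additional_information
            return bytes([initial_byte]) + struct.pack(fmt, argument)
    raise AssertionError("unreachable: argument was range-checked above")


if __name__ == "__main__":
    print(encode_head(4, 10).hex())  # head of an array with 10 items
    print(encode_head(5, 9).hex())   # head of a map with 9 pairs
```

Note that for a map the argument counts pairs, so the head for 9 pairs is followed by 18 encoded data items.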
These eight major types lead to a simple table showing which of the | These eight major types lead to a simple table showing which of the | |||
256 possible values for the initial byte of a data item are used | 256 possible values for the initial byte of a data item are used | |||
(Table 6). | (Table 7). | |||
In major types 6 and 7, many of the possible values are reserved for | In major types 6 and 7, many of the possible values are reserved for | |||
future specification. See Section 9 for more information on these | future specification. See Section 9 for more information on these | |||
values. | values. | |||
Table 1 summarizes the major types defined by CBOR, ignoring the next | Table 1 summarizes the major types defined by CBOR, ignoring the next | |||
section for now. The number N in this table stands for the argument, | section for now. The number N in this table stands for the argument, | |||
mt for the major type. | mt for the major type. | |||
+----+-----------------------+---------------------------------+ | +----+-----------------------+---------------------------------+ | |||
skipping to change at page 13, line 25 ¶ | skipping to change at page 13, line 31 ¶ | |||
+----+-----------------------+---------------------------------+ | +----+-----------------------+---------------------------------+ | |||
| 4 | array | N data items (elements) | | | 4 | array | N data items (elements) | | |||
+----+-----------------------+---------------------------------+ | +----+-----------------------+---------------------------------+ | |||
| 5 | map | 2N data items (key/value pairs) | | | 5 | map | 2N data items (key/value pairs) | | |||
+----+-----------------------+---------------------------------+ | +----+-----------------------+---------------------------------+ | |||
| 6 | tag of number N | 1 data item | | | 6 | tag of number N | 1 data item | | |||
+----+-----------------------+---------------------------------+ | +----+-----------------------+---------------------------------+ | |||
| 7 | simple/float | - | | | 7 | simple/float | - | | |||
+----+-----------------------+---------------------------------+ | +----+-----------------------+---------------------------------+ | |||
Table 1: Overview over CBOR major types (definite length | Table 1: Overview over the definite-length use of CBOR major | |||
encoded) | types (mt = major type, N = argument) | |||
3.2. Indefinite Lengths for Some Major Types | 3.2. Indefinite Lengths for Some Major Types | |||
Four CBOR items (arrays, maps, byte strings, and text strings) can be | Four CBOR items (arrays, maps, byte strings, and text strings) can be | |||
encoded with an indefinite length using additional information value | encoded with an indefinite length using additional information value | |||
31. This is useful if the encoding of the item needs to begin before | 31. This is useful if the encoding of the item needs to begin before | |||
the number of items inside the array or map, or the total length of | the number of items inside the array or map, or the total length of | |||
the string, is known. (The application of this is often referred to | the string, is known. (The ability to start sending a data item | |||
as "streaming" within a data item.) | before all of it is known is often referred to as "streaming" within | |||
that data item.) | ||||
Indefinite-length arrays and maps are dealt with differently than | Indefinite-length arrays and maps are dealt with differently than | |||
indefinite-length byte strings and text strings. | indefinite-length byte strings and text strings. | |||
3.2.1. The "break" Stop Code | 3.2.1. The "break" Stop Code | |||
The "break" stop code is encoded with major type 7 and additional | The "break" stop code is encoded with major type 7 and additional | |||
information value 31 (0b111_11111). It is not itself a data item: it | information value 31 (0b111_11111). It is not itself a data item: it | |||
is just a syntactic feature to close an indefinite-length item. | is just a syntactic feature to close an indefinite-length item. | |||
skipping to change at page 16, line 24 ¶ | skipping to change at page 16, line 34 ¶ | |||
chunks, while not particularly useful, are permitted.) | chunks, while not particularly useful, are permitted.) | |||
If any item between the indefinite-length string indicator | If any item between the indefinite-length string indicator | |||
(0b010_11111 or 0b011_11111) and the "break" stop code is not a | (0b010_11111 or 0b011_11111) and the "break" stop code is not a | |||
definite-length string item of the same major type, the string is not | definite-length string item of the same major type, the string is not | |||
well-formed. | well-formed. | |||
If any definite-length text string inside an indefinite-length text | If any definite-length text string inside an indefinite-length text | |||
string is invalid, the indefinite-length text string is invalid. | string is invalid, the indefinite-length text string is invalid. | |||
Note that this implies that the bytes of a single UTF-8 character | Note that this implies that the bytes of a single UTF-8 character | |||
cannot be spread between chunks: a new chunk can only be started at a | cannot be split up between chunks: a new chunk of a text string can | |||
character boundary. | only be started at a character boundary. | |||
For example, assume the sequence: | For example, assume an encoded data item consisting of the bytes: | |||
0b010_11111 0b010_00100 0xaabbccdd 0b010_00011 0xeeff99 0b111_11111 | 0b010_11111 0b010_00100 0xaabbccdd 0b010_00011 0xeeff99 0b111_11111 | |||
5F -- Start indefinite-length byte string | 5F -- Start indefinite-length byte string | |||
44 -- Byte string of length 4 | 44 -- Byte string of length 4 | |||
aabbccdd -- Bytes content | aabbccdd -- Bytes content | |||
43 -- Byte string of length 3 | 43 -- Byte string of length 3 | |||
eeff99 -- Bytes content | eeff99 -- Bytes content | |||
FF -- "break" | FF -- "break" | |||
After decoding, this results in a single byte string with seven | After decoding, this results in a single byte string with seven | |||
bytes: 0xaabbccddeeff99. | bytes: 0xaabbccddeeff99. | |||
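The example above can be checked with a short decoder sketch. This is a minimal illustration under stated assumptions: the function name decode_indefinite_byte_string is hypothetical, the sketch handles only an indefinite-length byte string at the start of the input, and it ignores any bytes after the "break" stop code.

```python
def decode_indefinite_byte_string(data: bytes) -> bytes:
    """Concatenate the chunks of an indefinite-length byte string.

    Hypothetical helper for illustration only. Raises ValueError for
    input that is not well-formed according to the rules above.
    """
    if not data or data[0] != 0b010_11111:
        raise ValueError("not an indefinite-length byte string")
    pos = 1
    chunks = []
    while True:
        if pos >= len(data):
            raise ValueError('input ends before the "break" stop code')
        initial_byte = data[pos]
        pos += 1
        if initial_byte == 0b111_11111:  # "break" stop code
            break
        major_type = initial_byte >> 5
        additional_information = initial_byte & 0x1F
        # Every chunk must be a definite-length string of the same major type.
        if major_type != 2 or additional_information > 27:
            raise ValueError("chunk is not a definite-length byte string")
        if additional_information < 24:
            length = additional_information
        else:
            size = 1 << (additional_information - 24)  # 1, 2, 4, or 8 bytes
            if pos + size > len(data):
                raise ValueError("input ends inside a chunk head")
            length = int.from_bytes(data[pos:pos + size], "big")
            pos += size
        if pos + length > len(data):
            raise ValueError("input ends inside a chunk")
        chunks.append(data[pos:pos + length])
        pos += length
    return b"".join(chunks)


if __name__ == "__main__":
    encoded = bytes.fromhex("5f44aabbccdd43eeff99ff")
    print(decode_indefinite_byte_string(encoded).hex())
```

A text-string variant would additionally have to validate each chunk as UTF-8 on its own, which is why a chunk can only start at a character boundary.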
3.2.4. Summary of indefinite-length use of major types | ||||
Table 2 summarizes the major types defined by CBOR as used for | ||||
indefinite length encoding (with additional information set to 31). | ||||
mt stands for the major type. | ||||
+----+-------------------+----------------------------------+ | ||||
| mt | Meaning | enclosed up to "break" stop code | | ||||
+====+===================+==================================+ | ||||
| 0 | (not well-formed) | - | | ||||
+----+-------------------+----------------------------------+ | ||||
| 1 | (not well-formed) | - | | ||||
+----+-------------------+----------------------------------+ | ||||
| 2 | byte string | definite-length byte strings | | ||||
+----+-------------------+----------------------------------+ | ||||
| 3 | text string | definite-length text strings | | ||||
+----+-------------------+----------------------------------+ | ||||
| 4 | array | data items (elements) | | ||||
+----+-------------------+----------------------------------+ | ||||
| 5 | map | data items (key/value pairs) | | ||||
+----+-------------------+----------------------------------+ | ||||
| 6 | (not well-formed) | - | | ||||
+----+-------------------+----------------------------------+ | ||||
| 7 | "break" stop code | - | | ||||
+----+-------------------+----------------------------------+ | ||||
Table 2: Overview over the indefinite-length use of CBOR | ||||
major types (mt = major type, additional information = | ||||
31) | ||||
3.3. Floating-Point Numbers and Values with No Content | 3.3. Floating-Point Numbers and Values with No Content | |||
Major type 7 is for two types of data: floating-point numbers and | Major type 7 is for two types of data: floating-point numbers and | |||
"simple values" that do not need any content. Each value of the | "simple values" that do not need any content. Each value of the | |||
5-bit additional information in the initial byte has its own separate | 5-bit additional information in the initial byte has its own separate | |||
meaning, as defined in Table 2. Like the major types for integers, | meaning, as defined in Table 3. Like the major types for integers, | |||
items of this major type do not carry content data; all the | items of this major type do not carry content data; all the | |||
information is in the initial bytes. | information is in the initial bytes. | |||
+-------------+---------------------------------------------------+ | +-------------+---------------------------------------------------+ | |||
| 5-Bit Value | Semantics | | | 5-Bit Value | Semantics | | |||
+=============+===================================================+ | +=============+===================================================+ | |||
| 0..23 | Simple value (value 0..23) | | | 0..23 | Simple value (value 0..23) | | |||
+-------------+---------------------------------------------------+ | +-------------+---------------------------------------------------+ | |||
| 24 | Simple value (value 32..255 in following byte) | | | 24 | Simple value (value 32..255 in following byte) | | |||
+-------------+---------------------------------------------------+ | +-------------+---------------------------------------------------+ | |||
skipping to change at page 17, line 24 ¶ | skipping to change at page 18, line 24 ¶ | |||
| 26 | IEEE 754 Single-Precision Float (32 bits follow) | | | 26 | IEEE 754 Single-Precision Float (32 bits follow) | | |||
+-------------+---------------------------------------------------+ | +-------------+---------------------------------------------------+ | |||
| 27 | IEEE 754 Double-Precision Float (64 bits follow) | | | 27 | IEEE 754 Double-Precision Float (64 bits follow) | | |||
+-------------+---------------------------------------------------+ | +-------------+---------------------------------------------------+ | |||
| 28-30 | Reserved, not well-formed in the present document | | | 28-30 | Reserved, not well-formed in the present document | | |||
+-------------+---------------------------------------------------+ | +-------------+---------------------------------------------------+ | |||
| 31 | "break" stop code for indefinite-length items | | | 31 | "break" stop code for indefinite-length items | | |||
| | (Section 3.2.1) | | | | (Section 3.2.1) | | |||
+-------------+---------------------------------------------------+ | +-------------+---------------------------------------------------+ | |||
Table 2: Values for Additional Information in Major Type 7 | Table 3: Values for Additional Information in Major Type 7 | |||
As with all other major types, the 5-bit value 24 signifies a single- | As with all other major types, the 5-bit value 24 signifies a single- | |||
byte extension: it is followed by an additional byte to represent the | byte extension: it is followed by an additional byte to represent the | |||
simple value. (To minimize confusion, only the values 32 to 255 are | simple value. (To minimize confusion, only the values 32 to 255 are | |||
used.) This maintains the structure of the initial bytes: as for the | used.) This maintains the structure of the initial bytes: as for the | |||
other major types, the length of these always depends on the | other major types, the length of these always depends on the | |||
additional information in the first byte. Table 3 lists the values | additional information in the first byte. Table 4 lists the values | |||
assigned and available for simple types. | assigned and available for simple types. | |||
+---------+-----------------+ | +---------+-----------------+ | |||
| Value | Semantics | | | Value | Semantics | | |||
+=========+=================+ | +=========+=================+ | |||
| 0..19 | (Unassigned) | | | 0..19 | (Unassigned) | | |||
+---------+-----------------+ | +---------+-----------------+ | |||
| 20 | False | | | 20 | False | | |||
+---------+-----------------+ | +---------+-----------------+ | |||
| 21 | True | | | 21 | True | | |||
+---------+-----------------+ | +---------+-----------------+ | |||
| 22 | Null | | | 22 | Null | | |||
+---------+-----------------+ | +---------+-----------------+ | |||
| 23 | Undefined value | | | 23 | Undefined value | | |||
+---------+-----------------+ | +---------+-----------------+ | |||
| 24..31 | (Reserved) | | | 24..31 | (Reserved) | | |||
+---------+-----------------+ | +---------+-----------------+ | |||
| 32..255 | (Unassigned) | | | 32..255 | (Unassigned) | | |||
+---------+-----------------+ | +---------+-----------------+ | |||
Table 3: Simple Values | Table 4: Simple Values | |||
An encoder MUST NOT issue two-byte sequences that start with 0xf8 | An encoder MUST NOT issue two-byte sequences that start with 0xf8 | |||
(major type = 7, additional information = 24) and continue with a | (major type = 7, additional information = 24) and continue with a | |||
byte less than 0x20 (32 decimal). Such sequences are not well- | byte less than 0x20 (32 decimal). Such sequences are not well- | |||
formed. (This implies that an encoder cannot encode false, true, | formed. (This implies that an encoder cannot encode false, true, | |||
null, or undefined in two-byte sequences, only the one-byte variants | null, or undefined in two-byte sequences, only the one-byte variants | |||
of these are well-formed.) | of these are well-formed.) | |||
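The rule above can be sketched as an encoder that refuses the forbidden two-byte forms. The name encode_simple_value is a hypothetical helper for this example; the sketch assumes the caller passes the simple value number from the table above.

```python
def encode_simple_value(value: int) -> bytes:
    """Encode a simple value in major type 7.

    Hypothetical helper for illustration only. Values 0..23 use the
    one-byte form; values 32..255 use 0xf8 plus one byte. Values 24..31
    have no well-formed encoding and are rejected.
    """
    if 0 <= value <= 23:
        return bytes([0b111_00000 | value])
    if 32 <= value <= 255:
        return bytes([0xF8, value])
    raise ValueError("simple value %r cannot be encoded" % (value,))


if __name__ == "__main__":
    for name, number in (("false", 20), ("true", 21),
                         ("null", 22), ("undefined", 23)):
        print(name, encode_simple_value(number).hex())
```

Because the function never emits 0xf8 followed by a byte below 0x20, false, true, null, and undefined are only ever produced in their one-byte forms.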
The 5-bit values of 25, 26, and 27 are for 16-bit, 32-bit, and 64-bit | The 5-bit values of 25, 26, and 27 are for 16-bit, 32-bit, and 64-bit | |||
IEEE 754 binary floating-point values [IEEE754]. These floating- | IEEE 754 binary floating-point values [IEEE754]. These floating- | |||
point values are encoded in the additional bytes of the appropriate | point values are encoded in the additional bytes of the appropriate | |||
size. (See Appendix D for some information about 16-bit floating | size. (See Appendix D for some information about 16-bit floating- | |||
point.) | point numbers.) | |||
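As an illustration of the three floating-point sizes, the following sketch decodes a major type 7 item whose additional information is 25, 26, or 27. The name decode_float is hypothetical; the sketch relies on the standard Python struct module, whose "e", "f", and "d" format characters correspond to IEEE 754 binary16, binary32, and binary64, and it assumes the big-endian (network) byte order that CBOR uses for multi-byte values.

```python
import struct

# Additional information value -> struct format for the following bytes.
_FLOAT_FORMATS = {25: ">e", 26: ">f", 27: ">d"}


def decode_float(data: bytes) -> float:
    """Decode one floating-point item from the start of data.

    Hypothetical helper for illustration only.
    """
    if not data:
        raise ValueError("empty input")
    major_type = data[0] >> 5
    additional_information = data[0] & 0x1F
    if major_type != 7 or additional_information not in _FLOAT_FORMATS:
        raise ValueError("not a floating-point item")
    fmt = _FLOAT_FORMATS[additional_information]
    size = struct.calcsize(fmt)  # 2, 4, or 8 bytes
    if len(data) < 1 + size:
        raise ValueError("input ends inside the floating-point value")
    return struct.unpack(fmt, data[1:1 + size])[0]


if __name__ == "__main__":
    print(decode_float(bytes.fromhex("f93c00")))              # half precision
    print(decode_float(bytes.fromhex("fa47c35000")))          # single precision
    print(decode_float(bytes.fromhex("fb3ff199999999999a")))  # double precision
```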
3.4. Tagging of Items | 3.4. Tagging of Items | |||
In CBOR, a data item can be enclosed by a tag to give it additional | In CBOR, a data item can be enclosed by a tag to give it some | |||
semantics while retaining its structure. The tag is major type 6, | additional semantics, as uniquely identified by a "tag number". The | |||
and represents an unsigned integer as indicated by the tag's argument | tag is major type 6, its argument (Section 3) indicates the tag | |||
(Section 3); the (sole) enclosed data item is carried as content | number, and it contains a single enclosed data item, the "tag | |||
data. If a tag requires structured data, this structure is encoded | content". (If a tag requires further structure to its content, this | |||
into the nested data item. The definition of a tag number usually | structure is provided by the enclosed data item.) We use the term | |||
restricts what kinds of nested data item or items are valid for tags | "tag" for the entire data item consisting of both a tag number and | |||
using this tag number. | the tag content: the tag content is the data item that is being | |||
tagged. | ||||
For example, assume that a byte string of length 12 is marked with a | For example, assume that a byte string of length 12 is marked with a | |||
tag of number 2 to indicate it is a positive bignum (Section 3.4.3). | tag of number 2 to indicate it is a positive "bignum" | |||
This would be marked as 0b110_00010 (major type 6, additional | (Section 3.4.3). The encoded data item would start with a byte | |||
information 2 for the tag number) followed by 0b010_01100 (major type | 0b110_00010 (major type 6, additional information 2 for the tag | |||
number) followed by the encoded tag content: 0b010_01100 (major type | ||||
2, additional information of 12 for the length) followed by the 12 | 2, additional information of 12 for the length) followed by the 12 | |||
bytes of the bignum. | bytes of the bignum. | |||
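The byte layout in this example can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions: the name tag_short_byte_string is hypothetical, the helper only covers tag numbers and string lengths below 24 (so that each fits into the additional information), and the sample value 2**95 + 1 is chosen only because it needs exactly 12 bytes.

```python
def tag_short_byte_string(tag_number: int, content: bytes) -> bytes:
    """Enclose a short byte string in a tag with a small tag number.

    Hypothetical helper for illustration only.
    """
    if not 0 <= tag_number < 24:
        raise ValueError("this sketch handles tag numbers 0..23 only")
    if len(content) >= 24:
        raise ValueError("this sketch handles contents shorter than 24 bytes")
    tag_head = (6 << 5) | tag_number       # major type 6
    string_head = (2 << 5) | len(content)  # major type 2
    return bytes([tag_head, string_head]) + content


if __name__ == "__main__":
    value = 2**95 + 1                   # illustrative value, needs 12 bytes
    content = value.to_bytes(12, "big")  # bignum content is big-endian
    encoded = tag_short_byte_string(2, content)
    print(format(encoded[0], "08b"), format(encoded[1], "08b"), len(encoded))
```

The first two bytes printed correspond to 0b110_00010 and 0b010_01100 from the text, followed by the 12 content bytes.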
The definition of a tag number describes the additional semantics | ||||
conveyed for tags with this tag number in the extended generic data | ||||
model. These semantics may include equivalence of some tagged data | ||||
items with other data items, including some that can already be | ||||
represented in the basic generic data model. For instance, 0xc24101, | ||||
a bignum the tag content of which is the byte string with the single | ||||
byte 0x01, is equivalent to an integer 1, which could also be encoded | ||||
for instance as 0x01, 0x1801, or 0x190001. The tag definition may | ||||
include the definition of a preferred serialization (Section 4.1) | ||||
that is recommended for generic encoders; this may prefer basic | ||||
generic data model representations over ones that employ a tag. | ||||
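The equivalence described above can be demonstrated with a small decoder that maps all four encodings to the same integer. The names decode_head and decode_integer are hypothetical, and the sketch deliberately supports only unsigned integers (major type 0) and tag number 2 enclosing a definite-length byte string; it is not a general CBOR decoder.

```python
from typing import Tuple


def decode_head(data: bytes, pos: int = 0) -> Tuple[int, int, int]:
    """Return (major_type, argument, next_position) for a definite head.

    Hypothetical helper for illustration only.
    """
    if pos >= len(data):
        raise ValueError("input ends before the head")
    major_type = data[pos] >> 5
    additional_information = data[pos] & 0x1F
    pos += 1
    if additional_information < 24:
        return major_type, additional_information, pos
    if additional_information > 27:
        raise ValueError("reserved or indefinite-length head")
    size = 1 << (additional_information - 24)  # 1, 2, 4, or 8 bytes
    if pos + size > len(data):
        raise ValueError("input ends inside the head")
    return major_type, int.from_bytes(data[pos:pos + size], "big"), pos + size


def decode_integer(data: bytes) -> int:
    """Decode an unsigned integer or a positive bignum (tag number 2)."""
    major_type, argument, pos = decode_head(data)
    if major_type == 0:
        if pos != len(data):
            raise ValueError("unexpected trailing bytes")
        return argument
    if major_type == 6 and argument == 2:
        content_type, length, pos = decode_head(data, pos)
        if content_type != 2:
            raise ValueError("tag 2 content must be a byte string")
        if pos + length != len(data):
            raise ValueError("byte string length does not match input")
        return int.from_bytes(data[pos:pos + length], "big")
    raise ValueError("unsupported data item in this sketch")


if __name__ == "__main__":
    for encoding in ("c24101", "01", "1801", "190001"):
        print(encoding, decode_integer(bytes.fromhex(encoding)))
```

An encoder following a preferred serialization would pick just one of these forms; the decoder side still has to treat them as the same value in the generic data model.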
The tag definition usually restricts what kinds of nested data item | ||||
or items are valid for such tags. Tag definitions may restrict their | ||||
content to a very specific syntactic structure, as the tags defined | ||||
in this document do, or they may aim at a more semantically defined | ||||
definition of their content, as for instance tags 40 and 1040 do | ||||
[RFC8746]: These accept a number of different ways of representing | |||
arrays. | ||||
As a matter of convention, many tags do not accept null or undefined | ||||
values as tag content; instead, the expectation is that a null or | ||||
undefined value can be used in place of the entire tag; Section 3.4.2 | ||||
provides some further considerations for one specific tag about the | ||||
handling of this convention in application protocols and in mapping | ||||
to platform types. | ||||
Decoders do not need to understand tags of every tag number, and tags | Decoders do not need to understand tags of every tag number, and tags | |||
may be of little value in applications where the implementation | may be of little value in applications where the implementation | |||
creating a particular CBOR data item and the implementation decoding | creating a particular CBOR data item and the implementation decoding | |||
that stream know the semantic meaning of each item in the data flow. | that stream know the semantic meaning of each item in the data flow. | |||
Their primary purpose in this specification is to define common data | Their primary purpose in this specification is to define common data | |||
types such as dates. A secondary purpose is to provide conversion | types such as dates. A secondary purpose is to provide conversion | |||
hints when it is foreseen that the CBOR data item needs to be | hints when it is foreseen that the CBOR data item needs to be | |||
translated into a different format, requiring hints about the content | translated into a different format, requiring hints about the content | |||
of items. Understanding the semantics of tags is optional for a | of items. Understanding the semantics of tags is optional for a | |||
decoder; it can just jump over the initial bytes of the tag (that | decoder; it can simply present both the tag number and the tag | |||
encode the tag number) and interpret the tag content itself, | content to the application, without interpreting the additional | |||
presenting both tag number and tag content to the application. | semantics of the tag. | |||
A tag applies semantics to the data item it encloses. Thus, if tag A | A tag applies semantics to the data item it encloses. Tags can nest: | |||
encloses tag B, which encloses data item C, tag A applies to the | If tag A encloses tag B, which encloses data item C, tag A applies to | |||
result of applying tag B on data item C. That is, a tag is a data | the result of applying tag B on data item C. | |||
item consisting of a tag number and an enclosed value. The content | ||||
of the tag (the enclosed data item) is the data item (the value) that | ||||
is being tagged. | ||||
IANA maintains a registry of tag numbers as described in Section 9.2. | IANA maintains a registry of tag numbers as described in Section 9.2. | |||
Table 4 provides a list of tag numbers that were defined in | Table 5 provides a list of tag numbers that were defined in | |||
[RFC7049], with definitions in the rest of this section. Note that | [RFC7049], with definitions in the rest of this section. Note that | |||
many other tag numbers have been defined since the publication of | many other tag numbers have been defined since the publication of | |||
[RFC7049]; see the registry described at Section 9.2 for the complete | [RFC7049]; see the registry described at Section 9.2 for the complete | |||
list. | list. | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
| Tag Number | Data Item | Semantics | | | Tag Number | Data Item | Semantics | | |||
+============+=============+==================================+ | +============+=============+==================================+ | |||
| 0 | text string | Standard date/time string; see | | | 0 | text string | Standard date/time string; see | | |||
| | | Section 3.4.1 | | | | | Section 3.4.1 | | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
| 1 | multiple | Epoch-based date/time; see | | | 1 | integer or | Epoch-based date/time; see | | |||
| | | Section 3.4.2 | | | | float | Section 3.4.2 | | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
| 2 | byte string | Positive bignum; see | | | 2 | byte string | Positive bignum; see | | |||
| | | Section 3.4.3 | | | | | Section 3.4.3 | | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
| 3 | byte string | Negative bignum; see | | | 3 | byte string | Negative bignum; see | | |||
| | | Section 3.4.3 | | | | | Section 3.4.3 | | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
| 4 | array | Decimal fraction; see | | | 4 | array | Decimal fraction; see | | |||
| | | Section 3.4.4 | | | | | Section 3.4.4 | | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
| 5 | array | Bigfloat; see Section 3.4.4 | | | 5 | array | Bigfloat; see Section 3.4.4 | | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
| 21 | multiple | Expected conversion to base64url | | | 21 | (any) | Expected conversion to base64url | | |||
| | | encoding; see Section 3.4.5.2 | | | | | encoding; see Section 3.4.5.2 | | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
| 22 | multiple | Expected conversion to base64 | | | 22 | (any) | Expected conversion to base64 | | |||
| | | encoding; see Section 3.4.5.2 | | | | | encoding; see Section 3.4.5.2 | | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
| 23 | multiple | Expected conversion to base16 | | | 23 | (any) | Expected conversion to base16 | | |||
| | | encoding; see Section 3.4.5.2 | | | | | encoding; see Section 3.4.5.2 | | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
| 24 | byte string | Encoded CBOR data item; see | | | 24 | byte string | Encoded CBOR data item; see | | |||
| | | Section 3.4.5.1 | | | | | Section 3.4.5.1 | | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
| 32 | text string | URI; see Section 3.4.5.3 | | | 32 | text string | URI; see Section 3.4.5.3 | | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
| 33 | text string | base64url; see Section 3.4.5.3 | | | 33 | text string | base64url; see Section 3.4.5.3 | | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
| 34 | text string | base64; see Section 3.4.5.3 | | | 34 | text string | base64; see Section 3.4.5.3 | | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
| 35 | text string | Regular expression; see | | | 35 | text string | Regular expression; see | | |||
| | | Section 3.4.5.3 | | | | | Section 3.4.5.3 | | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
| 36 | text string | MIME message; see | | | 36 | text string | MIME message; see | | |||
| | | Section 3.4.5.3 | | | | | Section 3.4.5.3 | | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
| 55799 | multiple | Self-described CBOR; see | | | 55799 | (any) | Self-described CBOR; see | | |||
| | | Section 3.4.6 | | | | | Section 3.4.6 | | |||
+------------+-------------+----------------------------------+ | +------------+-------------+----------------------------------+ | |||
Table 4: Tag numbers defined in RFC 7049 | Table 5: Tag numbers defined in RFC 7049 | |||
Conceptually, tags are interpreted in the generic data model, not at | Conceptually, tags are interpreted in the generic data model, not at | |||
(de-)serialization time. A small number of tags (specifically, tag | (de-)serialization time. A small number of tags (specifically, tag | |||
number 25 and tag number 29) have been registered with semantics that | number 25 and tag number 29) have been registered with semantics that | |||
may require processing at (de-)serialization time: The decoder needs | may require processing at (de-)serialization time: The decoder needs | |||
to be aware and the encoder needs to be in control of the exact | to be aware and the encoder needs to be in control of the exact | |||
sequence in which data items are encoded into the CBOR data stream. | sequence in which data items are encoded into the CBOR data stream. | |||
This means these tags cannot be implemented on top of every generic | This means these tags cannot be implemented on top of every generic | |||
CBOR encoder/decoder (which might not reflect the serialization order | CBOR encoder/decoder (which might not reflect the serialization order | |||
for entries in a map at the data model level and vice versa); their | for entries in a map at the data model level and vice versa); their | |||
implementation therefore typically needs to be integrated into the | implementation therefore typically needs to be integrated into the | |||
generic encoder/decoder. The definition of new tags with this | generic encoder/decoder. The definition of new tags with this | |||
property is NOT RECOMMENDED. | property is NOT RECOMMENDED. | |||
Protocols using tag numbers 0 and 1 extend the generic data model | Protocols using tag numbers 0 and 1 extend the generic data model | |||
(Section 2) with data items representing points in time; tag numbers | (Section 2) with data items representing points in time; tag numbers | |||
2 and 3, with arbitrarily sized integers; and tag numbers 4 and 5, | 2 and 3, with arbitrarily sized integers; and tag numbers 4 and 5, | |||
with floating point values of arbitrary size and precision. | with floating-point values of arbitrary size and precision. | |||
3.4.1. Standard Date/Time String | 3.4.1. Standard Date/Time String | |||
Tag number 0 contains a text string in the standard format described | Tag number 0 contains a text string in the standard format described | |||
by the "date-time" production in [RFC3339], as refined by Section 3.3 | by the "date-time" production in [RFC3339], as refined by Section 3.3 | |||
of [RFC4287], representing the point in time described there. A | of [RFC4287], representing the point in time described there. A | |||
nested item of another type or that doesn't match the [RFC4287] | nested item of another type or that doesn't match the [RFC4287] | |||
format is invalid. | format is invalid. | |||
3.4.2. Epoch-based Date/Time | 3.4.2. Epoch-based Date/Time | |||
Tag number 1 contains a numerical value counting the number of | Tag number 1 contains a numerical value counting the number of | |||
seconds from 1970-01-01T00:00Z in UTC time to the represented point | seconds from 1970-01-01T00:00Z in UTC time to the represented point | |||
in civil time. | in civil time. | |||
The enclosed item MUST be an unsigned or negative integer (major | The tag content MUST be an unsigned or negative integer (major types | |||
types 0 and 1), or a floating-point number (major type 7 with | 0 and 1), or a floating-point number (major type 7 with additional | |||
additional information 25, 26, or 27). Other contained types are | information 25, 26, or 27). Other contained types are invalid. | |||
invalid. | ||||
Non-negative values (major type 0 and non-negative floating-point | Non-negative values (major type 0 and non-negative floating-point | |||
numbers) stand for time values on or after 1970-01-01T00:00Z UTC and | numbers) stand for time values on or after 1970-01-01T00:00Z UTC and | |||
are interpreted according to POSIX [TIME_T]. (POSIX time is also | are interpreted according to POSIX [TIME_T]. (POSIX time is also | |||
known as UNIX Epoch time. Note that leap seconds are handled | known as UNIX Epoch time. Note that leap seconds are handled | |||
specially by POSIX time and this results in a 1 second discontinuity | specially by POSIX time and this results in a 1 second discontinuity | |||
several times per decade.) Note that applications that require the | several times per decade.) Note that applications that require the | |||
expression of times beyond early 2106 cannot leave out support of | expression of times beyond early 2106 cannot leave out support of | |||
64-bit integers for the enclosed value. | 64-bit integers for the tag content. | |||
Negative values (major type 1 and negative floating-point numbers) | Negative values (major type 1 and negative floating-point numbers) | |||
are interpreted as determined by the application requirements as | are interpreted as determined by the application requirements as | |||
there is no universal standard for UTC count-of-seconds time before | there is no universal standard for UTC count-of-seconds time before | |||
1970-01-01T00:00Z (this is particularly true for points in time that | 1970-01-01T00:00Z (this is particularly true for points in time that | |||
precede discontinuities in national calendars). The same applies to | precede discontinuities in national calendars). The same applies to | |||
non-finite values. | non-finite values. | |||
To indicate fractional seconds, floating-point values can be used | To indicate fractional seconds, floating-point values can be used | |||
within tag number 1 instead of integer values. Note that this | within tag number 1 instead of integer values. Note that this | |||
generally requires binary64 support, as binary16 and binary32 provide | generally requires binary64 support, as binary16 and binary32 provide | |||
non-zero fractions of seconds only for a short period of time around | non-zero fractions of seconds only for a short period of time around | |||
early 1970. An application that requires tag number 1 support may | early 1970. An application that requires tag number 1 support may | |||
restrict the enclosed value to be an integer (or a floating-point | restrict the tag content to be an integer (or a floating-point value) | |||
value) only. | only. | |||
Note that platform types for date/time may include null or undefined | ||||
values, which may also be desirable at an application protocol level. | ||||
While emitting tag number 1 values with non-finite tag content values | ||||
(e.g., with NaN for undefined date/time values or with Infinity for | ||||
an expiry date that is not set) may seem an obvious way to handle | ||||
this, using untagged null or undefined is often a better solution. | ||||
Application protocol designers are encouraged to consider these cases | ||||
and include clear guidelines for handling them. | ||||
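The rules for tag number 1 can be sketched in Python as follows. This is only an illustration under stated assumptions, not a reference implementation: the helper names encode_head and encode_epoch_time are invented for this example, the input is assumed to be a timezone-aware datetime, and fractional seconds are always emitted as binary64 rather than in the shortest float form.

```python
import struct
from datetime import datetime, timezone


def encode_head(major: int, arg: int) -> bytes:
    """Encode a CBOR head (hypothetical helper), shortest argument form."""
    if arg < 24:
        return bytes([(major << 5) | arg])
    for info, fmt in ((24, ">B"), (25, ">H"), (26, ">I"), (27, ">Q")):
        if arg < 1 << (8 * struct.calcsize(fmt)):
            return bytes([(major << 5) | info]) + struct.pack(fmt, arg)
    raise ValueError("argument does not fit in 64 bits")


def encode_epoch_time(dt: datetime) -> bytes:
    """Encode an aware datetime as tag number 1 (hypothetical helper)."""
    if dt.tzinfo is None:
        raise ValueError("naive datetime: UTC offset unknown")
    seconds = dt.timestamp()  # seconds since 1970-01-01T00:00Z
    tag = b"\xc1"  # major type 6, tag number 1
    if seconds.is_integer():
        n = int(seconds)
        # Major type 0 for non-negative values, major type 1 for negative.
        content = encode_head(0, n) if n >= 0 else encode_head(1, -1 - n)
    else:
        # Fractional seconds: binary64 (assumption: no shorter float tried).
        content = b"\xfb" + struct.pack(">d", seconds)
    return tag + content
```

A protocol that needs "no date" would, following the note above, more likely send an untagged null than call such a function with a non-finite value.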
3.4.3. Bignums | 3.4.3. Bignums | |||
Protocols using tag numbers 2 and 3 extend the generic data model | Protocols using tag numbers 2 and 3 extend the generic data model | |||
(Section 2) with "bignums" representing arbitrarily sized integers. | (Section 2) with "bignums" representing arbitrarily sized integers. | |||
In the generic data model, bignum values are not equal to integers | In the basic generic data model, bignum values are not equal to | |||
from the basic data model, but specific data models can define that | integers from the same model, but the extended generic data model | |||
equivalence, and preferred encoding never makes use of bignums that | created by this tag definition defines equivalence based on numeric | |||
also can be expressed as basic integers (see below). | value, and preferred serialization (Section 4.1) never makes use of | |||
bignums that also can be expressed as basic integers (see below). | ||||
Bignums are encoded as a byte string data item, which is interpreted | Bignums are encoded as a byte string data item, which is interpreted | |||
as an unsigned integer n in network byte order. Contained items of | as an unsigned integer n in network byte order. Contained items of | |||
other types are invalid. For tag number 2, the value of the bignum | other types are invalid. For tag number 2, the value of the bignum | |||
is n. For tag number 3, the value of the bignum is -1 - n. The | is n. For tag number 3, the value of the bignum is -1 - n. The | |||
preferred encoding of the byte string is to leave out any leading | preferred serialization of the byte string is to leave out any | |||
zeroes (note that this means the preferred encoding for n = 0 is the | leading zeroes (note that this means the preferred serialization for | |||
empty byte string, but see below). Decoders that understand these | n = 0 is the empty byte string, but see below). Decoders that | |||
tags MUST be able to decode bignums that do have leading zeroes. The | understand these tags MUST be able to decode bignums that do have | |||
preferred encoding of an integer that can be represented using major | leading zeroes. The preferred serialization of an integer that can | |||
type 0 or 1 is to encode it this way instead of as a bignum (which | be represented using major type 0 or 1 is to encode it this way | |||
means that the empty string never occurs in a bignum when using | instead of as a bignum (which means that the empty string never | |||
preferred encoding). Note that this means the non-preferred choice | occurs in a bignum when using preferred serialization). Note that | |||
of a bignum representation instead of a basic integer for encoding a | this means the non-preferred choice of a bignum representation | |||
number is not intended to have application semantics (just as the | instead of a basic integer for encoding a number is not intended to | |||
choice of a longer basic integer representation than needed, such as | have application semantics (just as the choice of a longer basic | |||
0x1800 for 0x00 does not). | integer representation than needed, such as 0x1800 for 0x00 does | |||
not). | ||||
For example, the number 18446744073709551616 (2**64) is represented | For example, the number 18446744073709551616 (2**64) is represented | |||
as 0b110_00010 (major type 6, tag number 2), followed by 0b010_01001 | as 0b110_00010 (major type 6, tag number 2), followed by 0b010_01001 | |||
(major type 2, length 9), followed by 0x010000000000000000 (one byte | (major type 2, length 9), followed by 0x010000000000000000 (one byte | |||
0x01 and eight bytes 0x00). In hexadecimal: | 0x01 and eight bytes 0x00). In hexadecimal: | |||
C2 -- Tag 2 | C2 -- Tag 2 | |||
49 -- Byte string of length 9 | 49 -- Byte string of length 9 | |||
010000000000000000 -- Bytes content | 010000000000000000 -- Bytes content | |||
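The bignum rules above can be illustrated with a short Python sketch. The function names encode_head, encode_integer and decode_bignum are assumptions made for this example. The encoder uses major type 0 or 1 whenever the value fits and only falls back to tag 2 or 3 otherwise; the decoder accepts leading zeroes and the empty byte string.

```python
import struct


def encode_head(major: int, arg: int) -> bytes:
    """Encode a CBOR head (hypothetical helper), shortest argument form."""
    if arg < 24:
        return bytes([(major << 5) | arg])
    for info, fmt in ((24, ">B"), (25, ">H"), (26, ">I"), (27, ">Q")):
        if arg < 1 << (8 * struct.calcsize(fmt)):
            return bytes([(major << 5) | info]) + struct.pack(fmt, arg)
    raise ValueError("argument does not fit in 64 bits")


def encode_integer(value: int) -> bytes:
    """Encode an integer, using a bignum only when it is unavoidable."""
    if value >= 0:
        major, tag, n = 0, 0xC2, value       # tag 2: value is n
    else:
        major, tag, n = 1, 0xC3, -1 - value  # tag 3: value is -1 - n
    if n < 1 << 64:
        return encode_head(major, n)         # basic integer is preferred
    # Network byte order, no leading zero bytes.
    payload = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([tag]) + encode_head(2, len(payload)) + payload


def decode_bignum(tag_number: int, payload: bytes) -> int:
    """Decode bignum content; leading zeroes and b"" are tolerated."""
    n = int.from_bytes(payload, "big")
    if tag_number == 2:
        return n
    if tag_number == 3:
        return -1 - n
    raise ValueError("not a bignum tag")
```

With these definitions, encode_integer(2**64) yields the bytes c2 49 01 00 00 00 00 00 00 00 00 shown in the example above.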
skipping to change at page 22, line 28 ¶ | skipping to change at page 24, line 17 ¶ | |||
Protocols using tag number 4 extend the generic data model with data | Protocols using tag number 4 extend the generic data model with data | |||
items representing arbitrary-length decimal fractions of the form | items representing arbitrary-length decimal fractions of the form | |||
m*(10**e). Protocols using tag number 5 extend the generic data | m*(10**e). Protocols using tag number 5 extend the generic data | |||
model with data items representing arbitrary-length binary fractions | model with data items representing arbitrary-length binary fractions | |||
of the form m*(2**e). As with bignums, values of different types are | of the form m*(2**e). As with bignums, values of different types are | |||
not equal in the generic data model. | not equal in the generic data model. | |||
Decimal fractions combine an integer mantissa with a base-10 scaling | Decimal fractions combine an integer mantissa with a base-10 scaling | |||
factor. They are most useful if an application needs the exact | factor. They are most useful if an application needs the exact | |||
representation of a decimal fraction such as 1.1 because there is no | representation of a decimal fraction such as 1.1 because there is no | |||
exact representation for many decimal fractions in binary floating | exact representation for many decimal fractions in binary floating- | |||
point. | point representations. | |||
Bigfloats combine an integer mantissa with a base-2 scaling factor. | "Bigfloats" combine an integer mantissa with a base-2 scaling factor. | |||
They are binary floating-point values that can exceed the range or | They are binary floating-point values that can exceed the range or | |||
the precision of the three IEEE 754 formats supported by CBOR | the precision of the three IEEE 754 formats supported by CBOR | |||
(Section 3.3). Bigfloats may also be used by constrained | (Section 3.3). Bigfloats may also be used by constrained | |||
applications that need some basic binary floating-point capability | applications that need some basic binary floating-point capability | |||
without the need for supporting IEEE 754. | without the need for supporting IEEE 754. | |||
A decimal fraction or a bigfloat is represented as a tagged array | A decimal fraction or a bigfloat is represented as a tagged array | |||
that contains exactly two integer numbers: an exponent e and a | that contains exactly two integer numbers: an exponent e and a | |||
mantissa m. Decimal fractions (tag number 4) use base-10 exponents; | mantissa m. Decimal fractions (tag number 4) use base-10 exponents; | |||
the value of a decimal fraction data item is m*(10**e). Bigfloats | the value of a decimal fraction data item is m*(10**e). Bigfloats | |||
(tag number 5) use base-2 exponents; the value of a bigfloat data | (tag number 5) use base-2 exponents; the value of a bigfloat data | |||
item is m*(2**e). The exponent e MUST be represented in an integer | item is m*(2**e). The exponent e MUST be represented in an integer | |||
of major type 0 or 1, while the mantissa also can be a bignum | of major type 0 or 1, while the mantissa can also be a bignum | |||
(Section 3.4.3). Contained items with other structures are invalid. | (Section 3.4.3). Contained items with other structures are invalid. | |||
An example of a decimal fraction is that the number 273.15 could be | An example of a decimal fraction is that the number 273.15 could be | |||
represented as 0b110_00100 (major type of 6 for the tag, additional | represented as 0b110_00100 (major type of 6 for the tag, additional | |||
information of 4 for the tag number), followed by 0b100_00010 | information of 4 for the tag number), followed by 0b100_00010 | |||
(major type of 4 for the array, additional information of 2 for the | (major type of 4 for the array, additional information of 2 for the | |||
length of the array), followed by 0b001_00001 (major type of 1 for | length of the array), followed by 0b001_00001 (major type of 1 for | |||
the first integer, additional information of 1 for the value of -2), | the first integer, additional information of 1 for the value of -2), | |||
followed by 0b000_11001 (major type of 0 for the second integer, | followed by 0b000_11001 (major type of 0 for the second integer, | |||
additional information of 25 for a two-byte value), followed by | additional information of 25 for a two-byte value), followed by | |||
skipping to change at page 23, line 31 ¶ | skipping to change at page 25, line 19 ¶ | |||
information of 3 for the value of 3). In hexadecimal: | information of 3 for the value of 3). In hexadecimal: | |||
C5 -- Tag 5 | C5 -- Tag 5 | |||
82 -- Array of length 2 | 82 -- Array of length 2 | |||
20 -- -1 | 20 -- -1 | |||
03 -- 3 | 03 -- 3 | |||
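The tagged-array layout for tag numbers 4 and 5 can be sketched in Python as below. The names encode_int, encode_decimal_fraction and encode_bigfloat are hypothetical. As a simplifying assumption, mantissas that would need a bignum are rejected instead of being encoded with tag 2 or 3.

```python
import struct
from decimal import Decimal


def encode_int(value: int) -> bytes:
    """Encode an integer as major type 0 or 1 (hypothetical helper)."""
    major, arg = (0, value) if value >= 0 else (1, -1 - value)
    if arg < 24:
        return bytes([(major << 5) | arg])
    for info, fmt in ((24, ">B"), (25, ">H"), (26, ">I"), (27, ">Q")):
        if arg < 1 << (8 * struct.calcsize(fmt)):
            return bytes([(major << 5) | info]) + struct.pack(fmt, arg)
    raise ValueError("needs a bignum, which this sketch does not cover")


def encode_decimal_fraction(value: Decimal) -> bytes:
    """Encode m*(10**e) as tag 4 around the array [e, m]."""
    sign, digits, exponent = value.as_tuple()
    if not isinstance(exponent, int):
        raise ValueError("NaN and Infinity have no decimal fraction form")
    mantissa = int("".join(str(d) for d in digits))
    if sign:
        mantissa = -mantissa
    # 0xc4 = tag number 4, 0x82 = array of length 2; exponent comes first.
    return b"\xc4\x82" + encode_int(exponent) + encode_int(mantissa)


def encode_bigfloat(mantissa: int, exponent: int) -> bytes:
    """Encode m*(2**e) as tag 5 around the array [e, m]."""
    return b"\xc5\x82" + encode_int(exponent) + encode_int(mantissa)
```

For instance, encode_decimal_fraction(Decimal("273.15")) produces c4 82 21 19 6a b3, and encode_bigfloat(3, -1) produces c5 82 20 03 for the value 1.5.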
Decimal fractions and bigfloats provide no representation of | Decimal fractions and bigfloats provide no representation of | |||
Infinity, -Infinity, or NaN; if these are needed in place of a | Infinity, -Infinity, or NaN; if these are needed in place of a | |||
decimal fraction or bigfloat, the IEEE 754 half-precision | decimal fraction or bigfloat, the IEEE 754 half-precision | |||
representations from Section 3.3 can be used. For constrained | representations from Section 3.3 can be used. | |||
applications, where there is a choice between representing a specific | ||||
number as an integer and as a decimal fraction or bigfloat (such as | ||||
when the exponent is small and non-negative), there is a quality-of- | ||||
implementation expectation that the integer representation is used | ||||
directly. | ||||
3.4.5. Content Hints | 3.4.5. Content Hints | |||
The tags in this section are for content hints that might be used by | The tags in this section are for content hints that might be used by | |||
generic CBOR processors. These content hints do not extend the | generic CBOR processors. These content hints do not extend the | |||
generic data model. | generic data model. | |||
3.4.5.1. Encoded CBOR Data Item | 3.4.5.1. Encoded CBOR Data Item | |||
Sometimes it is beneficial to carry an embedded CBOR data item that | Sometimes it is beneficial to carry an embedded CBOR data item that | |||
skipping to change at page 24, line 27 ¶ | skipping to change at page 26, line 11 ¶ | |||
does not know whether or not the converter will be generic, and | does not know whether or not the converter will be generic, and | |||
therefore wants to say what it believes is the proper way to convert | therefore wants to say what it believes is the proper way to convert | |||
binary strings to JSON. | binary strings to JSON. | |||
The data item tagged can be a byte string or any other data item. In | The data item tagged can be a byte string or any other data item. In | |||
the latter case, the tag applies to all of the byte string data items | the latter case, the tag applies to all of the byte string data items | |||
contained in the data item, except for those contained in a nested | contained in the data item, except for those contained in a nested | |||
data item tagged with an expected conversion. | data item tagged with an expected conversion. | |||
These three tag numbers suggest conversions to three of the base data | These three tag numbers suggest conversions to three of the base data | |||
encodings defined in [RFC4648]. For base64url encoding (tag number | encodings defined in [RFC4648]. Tag number 21 suggests conversion to | |||
21), padding is not used (see Section 3.2 of RFC 4648); that is, all | base64url encoding (Section 5 of RFC 4648), where padding is not used | |||
trailing equals signs ("=") are removed from the encoded string. For | (see Section 3.2 of RFC 4648); that is, all trailing equals signs | |||
base64 encoding (tag number 22), padding is used as defined in RFC | ("=") are removed from the encoded string. Tag number 22 suggests | |||
4648. For both base64url and base64, padding bits are set to zero | conversion to classical base64 encoding (Section 4 of RFC 4648), with | |||
(see Section 3.5 of RFC 4648), and encoding is performed without the | padding as defined in RFC 4648. For both base64url and base64, | |||
inclusion of any line breaks, whitespace, or other additional | padding bits are set to zero (see Section 3.5 of RFC 4648), and | |||
characters. Note that, for all three tag numbers, the encoding of | encoding is performed without the inclusion of any line breaks, | |||
the empty byte string is the empty text string. | whitespace, or other additional characters. Tag number 23 suggests | |||
conversion to base16 (hex) encoding, with uppercase alphabetics (see | ||||
Section 8 of RFC 4648). Note that, for all three tag numbers, the | ||||
encoding of the empty byte string is the empty text string. | ||||
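A generic converter that honors these hints might look like the following Python sketch. The function name convert_bytes is an assumption for this example; the calls into the standard base64 module are used as documented there.

```python
import base64


def convert_bytes(tag_number: int, data: bytes) -> str:
    """Convert a byte string to text as suggested by tag 21, 22 or 23."""
    if tag_number == 21:
        # base64url without padding: strip trailing "=" characters.
        return base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")
    if tag_number == 22:
        # Classical base64 with padding, no line breaks.
        return base64.b64encode(data).decode("ascii")
    if tag_number == 23:
        # base16 (hex) with uppercase alphabetics.
        return base64.b16encode(data).decode("ascii")
    raise ValueError("not an expected-conversion tag number")
```

For the two bytes 0xfb 0xff, the three tag numbers give "-_8", "+/8=" and "FBFF", respectively; the empty byte string gives the empty text string in all three cases.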
3.4.5.3. Encoded Text | 3.4.5.3. Encoded Text | |||
Some text strings hold data that have formats widely used on the | Some text strings hold data that have formats widely used on the | |||
Internet, and sometimes those formats can be validated and presented | Internet, and sometimes those formats can be validated and presented | |||
to the application in appropriate form by the decoder. There are | to the application in appropriate form by the decoder. There are | |||
tags for some of these formats. As with tag numbers 21 to 23, if | tags for some of these formats. As with tag numbers 21 to 23, if | |||
these tags are applied to an item other than a text string, they | these tags are applied to an item other than a text string, they | |||
apply to all text string data items it contains. | apply to all text string data items it contains. | |||
* Tag number 32 is for URIs, as defined in [RFC3986]. If the text | * Tag number 32 is for URIs, as defined in [RFC3986]. If the text | |||
string doesn't match the "URI-reference" production, the string is | string doesn't match the "URI-reference" production, the string is | |||
invalid. | invalid. | |||
* Tag numbers 33 and 34 are for base64url- and base64-encoded text | * Tag numbers 33 and 34 are for base64url- and base64-encoded text | |||
strings, as defined in [RFC4648]. If any of: | strings, respectively, as defined in [RFC4648]. If any of: | |||
- the encoded text string contains non-alphabet characters or | - the encoded text string contains non-alphabet characters or | |||
only 1 character in the last block of 4, or | only 1 character in the last block of 4, or | |||
- the padding bits in a 2- or 3-character block are not 0, or | - the padding bits in a 2- or 3-character block are not 0, or | |||
- the base64 encoding has the wrong number of padding characters, | - the base64 encoding has the wrong number of padding characters, | |||
or | or | |||
- the base64url encoding has padding characters, | - the base64url encoding has padding characters, | |||
skipping to change at page 25, line 33 ¶ | skipping to change at page 27, line 21 ¶ | |||
itself, need to be conveyed.) Any contained string value is | itself, need to be conveyed.) Any contained string value is | |||
valid. | valid. | |||
* Tag number 36 is for MIME messages (including all headers), as | * Tag number 36 is for MIME messages (including all headers), as | |||
defined in [RFC2045]. A text string that isn't a valid MIME | defined in [RFC2045]. A text string that isn't a valid MIME | |||
message is invalid. (For this tag, validity checking may be | message is invalid. (For this tag, validity checking may be | |||
particularly onerous for a generic decoder and might therefore not | particularly onerous for a generic decoder and might therefore not | |||
be offered. Note that many MIME messages are general binary data | be offered. Note that many MIME messages are general binary data | |||
and can therefore not be represented in a text string; | and can therefore not be represented in a text string; | |||
[IANA.cbor-tags] lists a registration for tag number 257 that is | [IANA.cbor-tags] lists a registration for tag number 257 that is | |||
similar to tag number 36 but is used with an enclosed byte | similar to tag number 36 but uses a byte string as its tag | |||
string.) | content.) | |||
Note that tag numbers 33 and 34 differ from 21 and 22 in that the | Note that tag numbers 33 and 34 differ from 21 and 22 in that the | |||
data is transported in base-encoded form for the former and in raw | data is transported in base-encoded form for the former and in raw | |||
byte string form for the latter. | byte string form for the latter. | |||
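The four conditions listed for tag numbers 33 and 34 can be checked mechanically. The Python sketch below is one possible reading of that list and nothing more: the function name violates_listed_conditions is invented here, and it only reports whether any of the listed conditions holds, leaving the consequence to the caller.

```python
import string

_COMMON = string.ascii_uppercase + string.ascii_lowercase + string.digits
BASE64_ALPHABET = _COMMON + "+/"
BASE64URL_ALPHABET = _COMMON + "-_"


def violates_listed_conditions(tag_number: int, text: str) -> bool:
    """Return True if any of the listed conditions applies to text."""
    if tag_number == 33:
        # base64url: any "=" fails the alphabet check below.
        alphabet, body = BASE64URL_ALPHABET, text
    elif tag_number == 34:
        alphabet = BASE64_ALPHABET
        body = text.rstrip("=")
        padding = len(text) - len(body)
        if padding != (-len(body)) % 4:
            return True  # wrong number of padding characters
    else:
        raise ValueError("expected tag number 33 or 34")
    if any(ch not in alphabet for ch in body):
        return True      # non-alphabet character
    tail = len(body) % 4
    if tail == 1:
        return True      # only 1 character in the last block of 4
    if tail == 2:
        return alphabet.index(body[-1]) & 0x0F != 0  # 4 padding bits
    if tail == 3:
        return alphabet.index(body[-1]) & 0x03 != 0  # 2 padding bits
    return False
```

Under this reading, "-_8" passes for tag number 33 and "+/8=" passes for tag number 34, while "+/9=" is flagged because its padding bits are not 0.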
3.4.6. Self-Described CBOR | 3.4.6. Self-Described CBOR | |||
In many applications, it will be clear from the context that CBOR is | In many applications, it will be clear from the context that CBOR is | |||
being employed for encoding a data item. For instance, a specific | being employed for encoding a data item. For instance, a specific | |||
protocol might specify the use of CBOR, or a media type is indicated | protocol might specify the use of CBOR, or a media type is indicated | |||
that specifies its use. However, there may be applications where | that specifies its use. However, there may be applications where | |||
such context information is not available, such as when CBOR data is | such context information is not available, such as when CBOR data is | |||
stored in a file that does not have disambiguating metadata. Here, | stored in a file that does not have disambiguating metadata. Here, | |||
it may help to have some distinguishing characteristics for the data | it may help to have some distinguishing characteristics for the data | |||
itself. | itself. | |||
Tag number 55799 is defined for this purpose. It does not impart any | Tag number 55799 is defined for this purpose. It does not impart any | |||
special semantics on the data item that it encloses; that is, the | special semantics on the data item that it encloses; that is, the | |||
semantics of a data item enclosed in tag number 55799 is exactly | semantics of the tag content enclosed in tag number 55799 is exactly | |||
identical to the semantics of the data item itself. | identical to the semantics of the tag content itself. | |||
The serialization of this tag's head is 0xd9d9f7, which does not | The serialization of this tag's head is 0xd9d9f7, which does not | |||
appear to be in use as a distinguishing mark for any frequently used | appear to be in use as a distinguishing mark for any frequently used | |||
file types. In particular, 0xd9d9f7 is not a valid start of a | file types. In particular, 0xd9d9f7 is not a valid start of a | |||
Unicode text in any Unicode encoding if it is followed by a valid | Unicode text in any Unicode encoding if it is followed by a valid | |||
CBOR data item. | CBOR data item. | |||
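As a small illustration of how the three-byte head can serve as a distinguishing mark, consider this Python sketch. The function names are hypothetical, and the check is only a heuristic on the first bytes; it does not verify that a valid CBOR data item follows.

```python
# Head of tag number 55799: 0xd9 (major type 6, 16-bit argument) + 0xd9f7.
SELF_DESCRIBED_HEAD = bytes.fromhex("d9d9f7")


def add_self_description(encoded_item: bytes) -> bytes:
    """Prefix an already encoded data item with tag number 55799."""
    return SELF_DESCRIBED_HEAD + encoded_item


def looks_like_self_described_cbor(data: bytes) -> bool:
    """Heuristic: does the input start with the tag 55799 head?"""
    return data.startswith(SELF_DESCRIBED_HEAD)


def strip_self_description(data: bytes) -> bytes:
    """Drop the tag head if present; the content's semantics are unchanged."""
    if looks_like_self_described_cbor(data):
        return data[len(SELF_DESCRIBED_HEAD):]
    return data
```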
For instance, a decoder might be able to decode both CBOR and JSON. | For instance, a decoder might be able to decode both CBOR and JSON. | |||
Such a decoder would need to mechanically distinguish the two | Such a decoder would need to mechanically distinguish the two | |||
formats. An easy way for an encoder to help the decoder would be to | formats. An easy way for an encoder to help the decoder would be to | |||
skipping to change at page 26, line 31 ¶ | skipping to change at page 28, line 19 ¶ | |||
4.1. Preferred Serialization | 4.1. Preferred Serialization | |||
For some values at the data model level, CBOR provides multiple | For some values at the data model level, CBOR provides multiple | |||
serializations. For many applications, it is desirable that an | serializations. For many applications, it is desirable that an | |||
encoder always chooses a preferred serialization (preferred | encoder always chooses a preferred serialization (preferred | |||
encoding); however, the present specification does not put the burden | encoding); however, the present specification does not put the burden | |||
of enforcing this preference on either encoder or decoder. | of enforcing this preference on either encoder or decoder. | |||
Some constrained decoders may be limited in their ability to decode | Some constrained decoders may be limited in their ability to decode | |||
non-preferred serializations: For example, if only integers below | non-preferred serializations: For example, if only integers below | |||
1_000_000_000 are expected in an application, the decoder may leave | 1_000_000_000 (one billion) are expected in an application, the | |||
out the code that would be needed to decode 64-bit arguments in | decoder may leave out the code that would be needed to decode 64-bit | |||
integers. An encoder that always uses preferred serialization | arguments in integers. An encoder that always uses preferred | |||
("preferred encoder") interoperates with this decoder for the numbers | serialization ("preferred encoder") interoperates with this decoder | |||
that can occur in this application. More generally speaking, it | for the numbers that can occur in this application. More generally | |||
therefore can be said that a preferred encoder is more universally | speaking, it therefore can be said that a preferred encoder is more | |||
interoperable (and also less wasteful) than one that, say, always | universally interoperable (and also less wasteful) than one that, | |||
uses 64-bit integers. | say, always uses 64-bit integers. | |||
Similarly, a constrained encoder may be limited in the variety of | Similarly, a constrained encoder may be limited in the variety of | |||
representation variants it supports in such a way that it does not | representation variants it supports in such a way that it does not | |||
emit preferred serializations ("variant encoder"): Say, it could be | emit preferred serializations ("variant encoder"): Say, it could be | |||
designed to always use the 32-bit variant for an integer that it | designed to always use the 32-bit variant for an integer that it | |||
encodes even if a short representation is available (again, assuming | encodes even if a short representation is available (again, assuming | |||
that there is no application need for integers that can only be | that there is no application need for integers that can only be | |||
represented with the 64-bit variant). A decoder that does not rely | represented with the 64-bit variant). A decoder that does not rely | |||
on only ever receiving preferred serializations ("variation-tolerant | on only ever receiving preferred serializations ("variation-tolerant | |||
decoder") can therefore be said to be more universally interoperable (it | decoder") can therefore be said to be more universally interoperable (it | |||
might very well optimize for the case of receiving preferred | might very well optimize for the case of receiving preferred | |||
serializations, though). Full implementations of CBOR decoders are | serializations, though). Full implementations of CBOR decoders are | |||
by definition variation-tolerant; the distinction is only relevant if | by definition variation-tolerant; the distinction is only relevant if | |||
a constrained implementation of a CBOR decoder meets a variant | a constrained implementation of a CBOR decoder meets a variant | |||
encoder. | encoder. | |||
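The difference between a preferred encoder, a variant encoder and a variation-tolerant decoder can be made concrete for unsigned integers. The following Python sketch uses invented function names and handles major type 0 only.

```python
import struct


def encode_uint_preferred(n: int) -> bytes:
    """Preferred encoder: shortest form of the argument."""
    if n < 24:
        return bytes([n])
    for info, fmt in ((24, ">B"), (25, ">H"), (26, ">I"), (27, ">Q")):
        if n < 1 << (8 * struct.calcsize(fmt)):
            return bytes([info]) + struct.pack(fmt, n)
    raise ValueError("does not fit in 64 bits")


def encode_uint_32bit_variant(n: int) -> bytes:
    """Variant encoder: always uses the 32-bit argument form."""
    return b"\x1a" + struct.pack(">I", n)


def decode_uint(data: bytes) -> int:
    """Variation-tolerant decoder: accepts every argument length."""
    major, info = data[0] >> 5, data[0] & 0x1F
    if major != 0:
        raise ValueError("not an unsigned integer")
    if info < 24:
        return info
    size = {24: 1, 25: 2, 26: 4, 27: 8}[info]
    return int.from_bytes(data[1:1 + size], "big")
```

Here the value 10 is encoded as 0a by the preferred encoder and as 1a 00 00 00 0a by the variant encoder; decode_uint returns 10 for both.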
The preferred serialization always uses the shortest form of | The preferred serialization always uses the shortest form of | |||
representing the argument (Section 3)); it also uses the shortest | representing the argument (Section 3); it also uses the shortest | |||
floating-point encoding that preserves the value being encoded (see | floating-point encoding that preserves the value being encoded. | |||
Section 5.5). Definite length encoding is preferred whenever the | ||||
length is known at the time the serialization of the item starts. | The preferred serialization for a floating-point value is the | |||
shortest floating-point encoding that preserves its value, e.g., | ||||
0xf94580 for the number 5.5, and 0xfa45ad9c00 for the number 5555.5. | ||||
For NaN values, a shorter encoding is preferred if zero-padding the | ||||
shorter significand towards the right reconstitutes the original NaN | ||||
value (for many applications, the single NaN encoding 0xf97e00 will | ||||
suffice). | ||||
Definite length encoding is preferred whenever the length is known at | ||||
the time the serialization of the item starts. | ||||
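Selecting the shortest floating-point encoding that preserves a value can be sketched in Python with the standard struct module, whose "e", "f" and "d" formats correspond to binary16, binary32 and binary64. The function name encode_float_preferred is an assumption, and every NaN is collapsed to the single encoding 0xf97e00, so NaN payloads are not preserved in this sketch.

```python
import math
import struct


def encode_float_preferred(value: float) -> bytes:
    """Return the shortest float encoding that round-trips the value."""
    if math.isnan(value):
        # Assumption: a single NaN encoding suffices for the application.
        return bytes.fromhex("f97e00")
    for initial_byte, fmt in ((0xF9, ">e"), (0xFA, ">f")):
        try:
            packed = struct.pack(fmt, value)
        except OverflowError:
            continue  # finite value too large for this format
        if struct.unpack(fmt, packed)[0] == value:
            return bytes([initial_byte]) + packed
    return b"\xfb" + struct.pack(">d", value)
```

This reproduces 0xf94580 for 5.5 and 0xfa45ad9c00 for 5555.5, and it selects the 16-bit form for positive and negative Infinity.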
4.2. Deterministically Encoded CBOR | 4.2. Deterministically Encoded CBOR | |||
Some protocols may want encoders to only emit CBOR in a particular | Some protocols may want encoders to only emit CBOR in a particular | |||
deterministic format; those protocols might also have the decoders | deterministic format; those protocols might also have the decoders | |||
check that their input is in that deterministic format. Those | check that their input is in that deterministic format. Those | |||
protocols are free to define what they mean by a "deterministic | protocols are free to define what they mean by a "deterministic | |||
format" and what encoders and decoders are expected to do. This | format" and what encoders and decoders are expected to do. This | |||
section defines a set of restrictions that can serve as the base of | section defines a set of restrictions that can serve as the base of | |||
such a deterministic format. | such a deterministic format. | |||
skipping to change at page 27, line 45 ¶ | skipping to change at page 29, line 42 ¶ | |||
- 24 to 255 and -25 to -256 MUST be expressed only with an | - 24 to 255 and -25 to -256 MUST be expressed only with an | |||
additional uint8_t; | additional uint8_t; | |||
- 256 to 65535 and -257 to -65536 MUST be expressed only with an | - 256 to 65535 and -257 to -65536 MUST be expressed only with an | |||
additional uint16_t; | additional uint16_t; | |||
- 65536 to 4294967295 and -65537 to -4294967296 MUST be expressed | - 65536 to 4294967295 and -65537 to -4294967296 MUST be expressed | |||
only with an additional uint32_t. | only with an additional uint32_t. | |||
Floating point values also MUST use the shortest form that | Floating-point values also MUST use the shortest form that | |||
preserves the value, e.g. 1.5 is encoded as 0xf93e00 and 1000000.5 | preserves the value, e.g. 1.5 is encoded as 0xf93e00 and 1000000.5 | |||
as 0xfa49742408. | as 0xfa49742408. (One implementation of this is to have all | |||
floats start as a 64-bit float, then do a test conversion to a | ||||
32-bit float; if the result is the same numeric value, use the | ||||
shorter form and repeat the process with a test conversion to a | ||||
16-bit float. This also selects the 16-bit float for positive | ||||
and negative Infinity.) | ||||
* Indefinite-length items MUST NOT appear. They can be encoded as | * Indefinite-length items MUST NOT appear. They can be encoded as | |||
definite-length items instead. | definite-length items instead. | |||
* The keys in every map MUST be sorted in the bytewise lexicographic | * The keys in every map MUST be sorted in the bytewise lexicographic | |||
order of their deterministic encodings. For example, the | order of their deterministic encodings. For example, the | |||
following keys are sorted correctly: | following keys are sorted correctly: | |||
1. 10, encoded as 0x0a. | 1. 10, encoded as 0x0a. | |||
skipping to change at page 28, line 27 ¶ | skipping to change at page 30, line 30 ¶ | |||
5. "aa", encoded as 0x626161. | 5. "aa", encoded as 0x626161. | |||
6. [100], encoded as 0x811864. | 6. [100], encoded as 0x811864. | |||
7. [-1], encoded as 0x8120. | 7. [-1], encoded as 0x8120. | |||
8. false, encoded as 0xf4. | 8. false, encoded as 0xf4. | |||
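
The bytewise lexicographic ordering of map keys can be sketched in Python as follows. This is only an illustration: it uses the encodings that are visible in the example list above (the entries hidden by the diff's skip are left out), and the helper name `sort_encoded_keys` is hypothetical. It relies on the fact that Python compares `bytes` objects byte by byte, with a shorter string sorting first when it is a prefix of a longer one.

```python
def sort_encoded_keys(encoded_keys):
    """Sort deterministic key encodings in bytewise lexicographic order."""
    # Python's default ordering for bytes objects is bytewise lexicographic.
    return sorted(encoded_keys)


# Encodings taken from the example: false, [-1], 10, [100], "aa" (shuffled).
shuffled = [bytes.fromhex(h) for h in ("f4", "8120", "0a", "811864", "626161")]

for key in sort_encoded_keys(shuffled):
    print(key.hex())
```

Note that [100] (0x811864) sorts before [-1] (0x8120) because the second byte 0x18 is lower than 0x20, even though the first encoding is longer.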
4.2.2. Additional Deterministic Encoding Considerations | 4.2.2. Additional Deterministic Encoding Considerations | |||
If a protocol allows for IEEE floats, then additional deterministic | ||||
encoding rules might need to be added. One example rule might be to | ||||
have all floats start as a 64-bit float, then do a test conversion to | ||||
a 32-bit float; if the result is the same numeric value, use the | ||||
shorter value and repeat the process with a test conversion to a | ||||
16-bit float. (This rule selects 16-bit float for positive and | ||||
negative Infinity as well.) Although IEEE floats can represent both | ||||
positive and negative zero as distinct values, the application might | ||||
not distinguish these and might decide to represent all zero values | ||||
with a positive sign, disallowing negative zero. | ||||
CBOR tags present additional considerations for deterministic | CBOR tags present additional considerations for deterministic | |||
encoding. If a CBOR-based protocol were to provide the same | encoding. If a CBOR-based protocol were to provide the same | |||
semantics for the presence and absence of a specific tag (e.g., by | semantics for the presence and absence of a specific tag (e.g., by | |||
allowing both tag 1 data items and raw numbers in a date/time | allowing both tag 1 data items and raw numbers in a date/time | |||
position, treating the latter as if they were tagged), the | position, treating the latter as if they were tagged), the | |||
deterministic format would not allow them. In a protocol that | deterministic format would not allow the presence of the tag, based | |||
requires tags in certain places to obtain specific semantics, the tag | on the "shortest form" principle. For example, a protocol might give | |||
needs to appear in the deterministic format as well. Deterministic | encoders the choice of representing a URL as either a text string or, | |||
encoding considerations also apply to the content of tags. | using Section 3.4.5.3, tag number 32 containing a text string. This | |||
protocol's deterministic encoding needs to either require that the | ||||
tag is present or require that it is absent, not allow either one. | ||||
Protocols that include floating, big integer, or other complex values | In a protocol that does require tags in certain places to obtain | |||
need to define extra requirements on their deterministic encodings. | specific semantics, the tag needs to appear in the deterministic | |||
For example: | format as well. Deterministic encoding considerations also apply to | |||
the content of tags. | ||||
If a protocol includes a field that can express integers with an | ||||
absolute value of 2^64 or larger using tag numbers 2 or 3 | ||||
(Section 3.4.3), the protocol's deterministic encoding needs to | ||||
specify whether smaller integers are also expressed using these tags | ||||
or using major types 0 and 1. Preferred serialization uses the | ||||
latter choice, which is therefore recommended. | ||||
Protocols that include floating-point values, whether represented | ||||
using basic floating-point values (Section 3.3) or using tags (or | ||||
both), may need to define extra requirements on their deterministic | ||||
encodings, such as: | ||||
* Although IEEE floating-point values can represent both positive | ||||
and negative zero as distinct values, the application might not | ||||
distinguish these and might decide to represent all zero values | ||||
with a positive sign, disallowing negative zero. (The application | ||||
may also want to restrict the precision of floating point values | ||||
in such a way that there is never a need to represent 64-bit -- or | ||||
even 32-bit -- floating-point values.) | ||||
* If a protocol includes a field that can express floating-point | * If a protocol includes a field that can express floating-point | |||
values (Section 3.3), the protocol's deterministic encoding needs | values, with a specific data model that declares integer and | |||
to specify whether the integer 1.0 is encoded as 0x01, 0xf93c00, | floating-point values to be interchangeable, the protocol's | |||
0xfa3f800000, or 0xfb3ff0000000000000. Three sensible rules for | deterministic encoding needs to specify whether the integer 1.0 is | |||
this are: | encoded as 0x01, 0xf93c00, 0xfa3f800000, or 0xfb3ff0000000000000. | |||
Example rules for this are: | ||||
1. Encode integral values that fit in 64 bits as values from | 1. Encode integral values that fit in 64 bits as values from | |||
major types 0 and 1, and other values as the smallest of 16-, | major types 0 and 1, and other values as the preferred | |||
32-, or 64-bit floating point that accurately represents the | (smallest of 16-, 32-, or 64-bit) floating-point | |||
value, | representation that accurately represents the value, | |||
2. Encode all values as the smallest of 16-, 32-, or 64-bit | 2. Encode all values as the preferred floating-point | |||
floating point that accurately represents the value, even for | representation that accurately represents the value, even for | |||
integral values, or | integral values, or | |||
3. Encode all values as 64-bit floating point. | 3. Encode all values as 64-bit floating-point representations. | |||
Rule 1 straddles the boundaries between integers and floating | Rule 1 straddles the boundaries between integers and floating- | |||
point values, and Rule 3 does not use preferred encoding, so Rule | point values, and Rule 3 does not use preferred serialization, so | |||
2 may be a good choice in many cases. | Rule 2 may be a good choice in many cases. | |||
If NaN is an allowed value and there is no intent to support NaN | * If NaN is an allowed value and there is no intent to support NaN | |||
payloads or signaling NaNs, the protocol needs to pick a single | payloads or signaling NaNs, the protocol needs to pick a single | |||
representation, for example 0xf97e00. If that simple choice is | representation, typically 0xf97e00. If that simple choice is not | |||
not possible, specific attention will be needed for NaN handling. | possible, specific attention will be needed for NaN handling. | |||
Subnormal numbers (nonzero numbers with the lowest possible | * Subnormal numbers (nonzero numbers with the lowest possible | |||
exponent of a given IEEE 754 number format) may be flushed to zero | exponent of a given IEEE 754 number format) may be flushed to zero | |||
outputs or be treated as zero inputs in some floating point | outputs or be treated as zero inputs in some floating-point | |||
implementations. A protocol's deterministic encoding may want to | implementations. A protocol's deterministic encoding may want to | |||
exclude them from interchange, interchanging zero instead. | specifically accommodate such implementations while creating an | |||
onus on other implementations, by excluding subnormal numbers from | ||||
* If a protocol includes a field that can express integers with an | interchange, interchanging zero instead. | |||
absolute value of 2^64 or larger using tag numbers 2 or 3 | ||||
(Section 3.4.3), the protocol's deterministic encoding needs to | ||||
specify whether small integers are expressed using the tag or | ||||
major types 0 and 1. | ||||
* A protocol might give encoders the choice of representing a URL as | * The same number can be represented by different decimal fractions, | |||
either a text string or, using Section 3.4.5.3, tag number 32 | by different bigfloats, and by different forms under other tags | |||
containing a text string. This protocol's deterministic encoding | that may be defined to express numeric values. Depending on the | |||
needs to either require that the tag is present or require that | implementation, it may not always be practical to determine | |||
it's absent, not allow either one. | whether any of these forms (or forms in the basic generic data | |||
model) are equivalent. An application protocol that presents | ||||
choices of this kind for the representation format of numbers | ||||
needs to be explicit in how the formats are to be chosen for | ||||
deterministic encoding. | ||||
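
To make the three example rules for a number such as 1.0 concrete, here is a hedged Python sketch. All names (`encode_number`, `_integer`, `_preferred_float`) are hypothetical, the single NaN encoding 0xf97e00 is an assumption, and a real protocol would still have to decide questions the sketch glosses over, such as negative zero under Rule 1.

```python
import math
import struct


def _preferred_float(value: float) -> bytes:
    """Shortest of 16-, 32-, or 64-bit float that preserves the value."""
    if math.isnan(value):
        return bytes.fromhex("f97e00")  # assumption: a single NaN encoding
    for initial, fmt in ((0xF9, ">e"), (0xFA, ">f")):
        try:
            packed = struct.pack(fmt, value)
        except (OverflowError, struct.error):
            continue  # too large for this format, try the next wider one
        if struct.unpack(fmt, packed)[0] == value:
            return bytes([initial]) + packed
    return b"\xfb" + struct.pack(">d", value)


def _integer(n: int) -> bytes:
    """Shortest major type 0 or 1 encoding of an integer."""
    major, arg = (0, n) if n >= 0 else (1, -1 - n)
    if arg < 24:
        return bytes([major << 5 | arg])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < 1 << (8 * size):
            return bytes([major << 5 | info]) + arg.to_bytes(size, "big")
    raise ValueError("integer does not fit major types 0 and 1")


def encode_number(value: float, rule: int) -> bytes:
    """Encode a number under example Rule 1, 2, or 3."""
    if rule == 1:
        fits = (math.isfinite(value) and value == int(value)
                and -(1 << 64) <= value < (1 << 64))
        # Note: -0.0 is treated as the integer 0 here, dropping its sign.
        return _integer(int(value)) if fits else _preferred_float(value)
    if rule == 2:
        return _preferred_float(value)
    if rule == 3:
        return b"\xfb" + struct.pack(">d", value)
    raise ValueError("rule must be 1, 2, or 3")


for rule in (1, 2, 3):
    print(rule, encode_number(1.0, rule).hex())
```

For the value 1.0, the three rules yield 0x01, 0xf93c00, and 0xfb3ff0000000000000 respectively; the 32-bit form 0xfa3f800000 is never chosen by any of them.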
4.2.3. Length-first map key ordering | 4.2.3. Length-first Map Key Ordering | |||
The core deterministic encoding requirements sort map keys in a | The core deterministic encoding requirements (Section 4.2.1) sort map | |||
different order from the one suggested by Section 3.9 of [RFC7049] | keys in a different order from the one suggested by Section 3.9 of | |||
(called "Canonical CBOR" there). Protocols that need to be | [RFC7049] (called "Canonical CBOR" there). Protocols that need to be | |||
compatible with [RFC7049]'s order can instead be specified in terms | compatible with [RFC7049]'s order can instead be specified in terms | |||
of this specification's "length-first core deterministic encoding | of this specification's "length-first core deterministic encoding | |||
requirements": | requirements": | |||
A CBOR encoding satisfies the "length-first core deterministic | A CBOR encoding satisfies the "length-first core deterministic | |||
encoding requirements" if it satisfies the core deterministic | encoding requirements" if it satisfies the core deterministic | |||
encoding requirements except that the keys in every map MUST be | encoding requirements except that the keys in every map MUST be | |||
sorted such that: | sorted such that: | |||
1. If two keys have different lengths, the shorter one sorts | 1. If two keys have different lengths, the shorter one sorts | |||
skipping to change at page 31, line 31 ¶ | skipping to change at page 33, line 39 ¶ | |||
and other unexpected data. CBOR-based protocols MAY specify that | and other unexpected data. CBOR-based protocols MAY specify that | |||
they treat arbitrary valid data as unexpected. Encoders for CBOR- | they treat arbitrary valid data as unexpected. Encoders for CBOR- | |||
based protocols MUST produce only valid items, that is, the protocol | based protocols MUST produce only valid items, that is, the protocol | |||
cannot be designed to make use of invalid items. An encoder can be | cannot be designed to make use of invalid items. An encoder can be | |||
capable of encoding as many or as few types of values as is required | capable of encoding as many or as few types of values as is required | |||
by the protocol in which it is used; a decoder can be capable of | by the protocol in which it is used; a decoder can be capable of | |||
understanding as many or as few types of values as is required by the | understanding as many or as few types of values as is required by the | |||
protocols in which it is used. This lack of restrictions allows CBOR | protocols in which it is used. This lack of restrictions allows CBOR | |||
to be used in extremely constrained environments. | to be used in extremely constrained environments. | |||
This section discusses some considerations in creating CBOR-based | The rest of this section discusses some considerations in creating | |||
protocols. With few exceptions, it is advisory only and explicitly | CBOR-based protocols. With few exceptions, it is advisory only and | |||
excludes any language from BCP 14 other than words that could be | explicitly excludes any language from BCP 14 other than words that | |||
interpreted as "MAY" in the sense of BCP 14. The exceptions aim at | could be interpreted as "MAY" in the sense of BCP 14. The exceptions | |||
facilitating interoperability of CBOR-based protocols while making | aim at facilitating interoperability of CBOR-based protocols while | |||
use of a wide variety of both generic and application-specific | making use of a wide variety of both generic and application-specific | |||
encoders and decoders. | encoders and decoders. | |||
5.1. CBOR in Streaming Applications | 5.1. CBOR in Streaming Applications | |||
In a streaming application, a data stream may be composed of a | In a streaming application, a data stream may be composed of a | |||
sequence of CBOR data items concatenated back-to-back. In such an | sequence of CBOR data items concatenated back-to-back. In such an | |||
environment, the decoder immediately begins decoding a new data item | environment, the decoder immediately begins decoding a new data item | |||
if data is found after the end of a previous data item. | if data is found after the end of a previous data item. | |||
Not all of the bytes making up a data item may be immediately | Not all of the bytes making up a data item may be immediately | |||
skipping to change at page 33, line 37 ¶ | skipping to change at page 35, line 51 ¶ | |||
Invalid UTF-8 string: A decoder might or might not want to verify | Invalid UTF-8 string: A decoder might or might not want to verify | |||
that the sequence of bytes in a UTF-8 string (major type 3) is | that the sequence of bytes in a UTF-8 string (major type 3) is | |||
actually valid UTF-8 and react appropriately. | actually valid UTF-8 and react appropriately. | |||
5.3.2. Tag validity | 5.3.2. Tag validity | |||
Two additional kinds of validity errors are introduced by adding tags | Two additional kinds of validity errors are introduced by adding tags | |||
to the basic generic data model: | to the basic generic data model: | |||
Inadmissible type for tag content: Tags (Section 3.4) specify what | Inadmissible type for tag content: Tag numbers (Section 3.4) specify | |||
type of data item is supposed to be enclosed by the tag; for | what type of data item is supposed to be used as their tag | |||
example, the tags for positive or negative bignums are supposed to | content; for example, the tag numbers for positive or negative | |||
be put on byte strings. A decoder that decodes the tagged data | bignums are supposed to be put on byte strings. A decoder that | |||
item into a native representation (a native big integer in this | decodes the tagged data item into a native representation (a | |||
example) is expected to check the type of the data item being | native big integer in this example) is expected to check the type | |||
tagged. Even decoders that don't have such native representations | of the data item being tagged. Even decoders that don't have such | |||
available in their environment may perform the check on those tags | native representations available in their environment may perform | |||
known to them and react appropriately. | the check on those tags known to them and react appropriately. | |||
Inadmissible value for tag content: The type of data item may be | Inadmissible value for tag content: The type of data item may be | |||
admissible for a tag's content, but the specific value may not be; | admissible for a tag's content, but the specific value may not be; | |||
e.g., a value of "yesterday" is not acceptable for the content of | e.g., a value of "yesterday" is not acceptable for the content of | |||
tag 0, even though it properly is a text string. A decoder that | tag 0, even though it properly is a text string. A decoder that | |||
normally ingests such tags into equivalent platform types might | normally ingests such tags into equivalent platform types might | |||
present this tag to the application in a similar way to how it | present this tag to the application in a similar way to how it | |||
would present a tag with an unknown tag number (Section 5.4). | would present a tag with an unknown tag number (Section 5.4). | |||
5.4. Validity and Evolution | 5.4. Validity and Evolution | |||
skipping to change at page 34, line 38 ¶ | skipping to change at page 37, line 4 ¶ | |||
with an indication that the decoder did not recognize that tag | with an indication that the decoder did not recognize that tag | |||
number or simple value. | number or simple value. | |||
The latter approach, which is also appropriate for decoders that do | The latter approach, which is also appropriate for decoders that do | |||
not support validity checking, provides forward compatibility with | not support validity checking, provides forward compatibility with | |||
newly registered tags and simple values without the requirement to | newly registered tags and simple values without the requirement to | |||
update the encoder at the same time as the calling application. (For | update the encoder at the same time as the calling application. (For | |||
this, the API for the decoder needs to have a way to mark unknown | this, the API for the decoder needs to have a way to mark unknown | |||
items so that the calling application can handle them in a manner | items so that the calling application can handle them in a manner | |||
appropriate for the program.) | appropriate for the program.) | |||
Since some of the processing needed for validity checking may have an | Since some of the processing needed for validity checking may have an | |||
appreciable cost (in particular with duplicate detection for maps), | appreciable cost (in particular with duplicate detection for maps), | |||
support of validity checking is not a requirement placed on all CBOR | support of validity checking is not a requirement placed on all CBOR | |||
decoders. | decoders. | |||
Some encoders will rely on their applications to provide input data | Some encoders will rely on their applications to provide input data | |||
in such a way that valid CBOR results from the encoder. A generic | in such a way that valid CBOR results from the encoder. A generic | |||
encoder also may want to provide a validity-checking mode where it | encoder may also want to provide a validity-checking mode where it | |||
reliably limits its output to valid CBOR, independent of whether or | reliably limits its output to valid CBOR, independent of whether or | |||
not its application is indeed providing API-conformant data. | not its application is indeed providing API-conformant data. | |||
5.5. Numbers | 5.5. Numbers | |||
CBOR-based protocols should take into account that different language | CBOR-based protocols should take into account that different language | |||
environments pose different restrictions on the range and precision | environments pose different restrictions on the range and precision | |||
of numbers that are representable. For example, the JavaScript | of numbers that are representable. For example, the basic JavaScript | |||
number system treats all numbers as floating point, which may result | number system treats all numbers as floating-point values, which may | |||
in silent loss of precision in decoding integers with more than 53 | result in silent loss of precision in decoding integers with more | |||
significant bits. A protocol that uses numbers should define its | than 53 significant bits. A protocol that uses numbers should define | |||
expectations on the handling of non-trivial numbers in decoders and | its expectations on the handling of non-trivial numbers in decoders | |||
receiving applications. | and receiving applications. | |||
A CBOR-based protocol that includes floating-point numbers can | A CBOR-based protocol that includes floating-point numbers can | |||
restrict which of the three formats (half-precision, single- | restrict which of the three formats (half-precision, single- | |||
precision, and double-precision) are to be supported. For an | precision, and double-precision) are to be supported. For an | |||
integer-only application, a protocol may want to completely exclude | integer-only application, a protocol may want to completely exclude | |||
the use of floating-point values. | the use of floating-point values. | |||
A CBOR-based protocol designed for compactness may want to exclude | A CBOR-based protocol designed for compactness may want to exclude | |||
specific integer encodings that are longer than necessary for the | specific integer encodings that are longer than necessary for the | |||
application, such as to save the need to implement 64-bit integers. | application, such as to save the need to implement 64-bit integers. | |||
There is an expectation that encoders will use the most compact | There is an expectation that encoders will use the most compact | |||
integer representation that can represent a given value. However, a | integer representation that can represent a given value. However, a | |||
compact application should accept values that use a longer-than- | compact application that does not require deterministic encoding | |||
needed encoding (such as encoding "0" as 0b000_11001 followed by two | should accept values that use a longer-than-needed encoding (such as | |||
bytes of 0x00) as long as the application can decode an integer of | encoding "0" as 0b000_11001 followed by two bytes of 0x00) as long as | |||
the given size. | the application can decode an integer of the given size. Similar | |||
considerations apply to floating-point values; decoding both | ||||
preferred serializations and longer-than-needed ones is recommended. | ||||
The preferred encoding for a floating-point value is the shortest | CBOR-based protocols for constrained applications that provide a | |||
floating-point encoding that preserves its value, e.g., 0xf94580 for | choice between representing a specific number as an integer and as a | |||
the number 5.5, and 0xfa45ad9c00 for the number 5555.5, unless the | decimal fraction or bigfloat (such as when the exponent is small and | |||
CBOR-based protocol specifically excludes the use of the shorter | non-negative), might express a quality-of-implementation expectation | |||
floating-point encodings. For NaN values, a shorter encoding is | that the integer representation is used directly. | |||
preferred if zero-padding the shorter significand towards the right | ||||
reconstitutes the original NaN value (for many applications, the | ||||
single NaN encoding 0xf97e00 will suffice). | ||||
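
The advice to accept longer-than-needed integer encodings can be illustrated with a small decoder for unsigned integers (major type 0). This is a sketch under stated assumptions, not a complete decoder: the name `decode_uint` and the shape of its return value are hypothetical, and only major type 0 is handled.

```python
def decode_uint(data: bytes):
    """Decode one unsigned integer; return (value, bytes_used, is_shortest)."""
    initial = data[0]
    if initial >> 5 != 0:
        raise ValueError("not an unsigned integer (major type 0)")
    info = initial & 0x1F
    if info < 24:
        # The value is carried directly in the initial byte.
        return info, 1, True
    if info > 27:
        raise ValueError("additional information 28 to 31 is not an integer")
    size = 1 << (info - 24)  # 1, 2, 4, or 8 following bytes
    if len(data) < 1 + size:
        raise ValueError("truncated input")
    value = int.from_bytes(data[1:1 + size], "big")
    # Smallest value that actually needs this many following bytes.
    needed_from = {1: 24, 2: 1 << 8, 4: 1 << 16, 8: 1 << 32}[size]
    return value, 1 + size, value >= needed_from


# "0" encoded as 0b000_11001 followed by two bytes of 0x00.
print(decode_uint(bytes([0b000_11001, 0x00, 0x00])))  # (0, 3, False)
```

A decoder for a protocol that does require deterministic encoding could use the third element of the result to reject such input instead of accepting it.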
5.6. Specifying Keys for Maps | 5.6. Specifying Keys for Maps | |||
The encoding and decoding applications need to agree on what types of | The encoding and decoding applications need to agree on what types of | |||
keys are going to be used in maps. In applications that need to | keys are going to be used in maps. In applications that need to | |||
interwork with JSON-based applications, keys probably should be | interwork with JSON-based applications, conversion is simplified by | |||
limited to UTF-8 strings only; otherwise, there has to be a specified | limiting keys to text strings only; otherwise, there has to be a | |||
mapping from the other CBOR types to Unicode characters, and this | specified mapping from the other CBOR types to text strings, and this | |||
often leads to implementation errors. In applications where keys are | often leads to implementation errors. In applications where keys are | |||
numeric in nature and numeric ordering of keys is important to the | numeric in nature and numeric ordering of keys is important to the | |||
application, directly using the numbers for the keys is useful. | application, directly using the numbers for the keys is useful. | |||
If multiple types of keys are to be used, consideration should be | If multiple types of keys are to be used, consideration should be | |||
given to how these types would be represented in the specific | given to how these types would be represented in the specific | |||
programming environments that are to be used. For example, in | programming environments that are to be used. For example, in | |||
JavaScript Maps [ECMA262], a key of integer 1 cannot be distinguished | JavaScript Maps [ECMA262], a key of integer 1 cannot be distinguished | |||
from a key of floating-point 1.0. This means that, if integer keys | from a key of floating-point 1.0. This means that, if integer keys | |||
are used, the protocol needs to avoid use of floating-point keys the | are used, the protocol needs to avoid use of floating-point keys the | |||
skipping to change at page 36, line 27 ¶ | skipping to change at page 38, line 38 ¶ | |||
the enclosing data item is completely available ("streaming encoder") | the enclosing data item is completely available ("streaming encoder") | |||
may want to reduce its overhead significantly by relying on its data | may want to reduce its overhead significantly by relying on its data | |||
source to maintain uniqueness. | source to maintain uniqueness. | |||
A CBOR-based protocol MUST define what to do when a receiving | A CBOR-based protocol MUST define what to do when a receiving | |||
application does see multiple identical keys in a map. The resulting | application does see multiple identical keys in a map. The resulting | |||
rule in the protocol MUST respect the CBOR data model: it cannot | rule in the protocol MUST respect the CBOR data model: it cannot | |||
prescribe a specific handling of the entries with the identical keys, | prescribe a specific handling of the entries with the identical keys, | |||
except that it might have a rule that having identical keys in a map | except that it might have a rule that having identical keys in a map | |||
indicates a malformed map and that the decoder has to stop with an | indicates a malformed map and that the decoder has to stop with an | |||
error. Duplicate keys are also prohibited by CBOR decoders that | error. When processing maps that exhibit entries with duplicate | |||
enforce validity (Section 5.4). | keys, a generic decoder might do one of the following: | |||
* Not accept maps with duplicate keys (that is, enforce validity for | ||||
maps, see also Section 5.4). These generic decoders are | ||||
universally useful. An application may still need to perform | ||||
its own duplicate checking based on application rules (for | ||||
instance if the application equates integers and floating-point | ||||
values in map key positions for specific maps). | ||||
* Pass all map entries to the application, including ones with | ||||
duplicate keys. This requires the application to handle (check | ||||
against) duplicate keys, even if the application rules are | ||||
identical to the generic data model rules. | ||||
* Lose some entries with duplicate keys, e.g. by only delivering the | ||||
final (or first) entry out of the entries with the same key. With | ||||
such a generic decoder, applications may get different results for | ||||
a specific key on different runs and with different generic | ||||
decoders, because which value is returned depends on the generic | ||||
decoder implementation and the actual order of keys in the map. In | ||||
particular, applications cannot validate key uniqueness on their | ||||
own as they do not necessarily see all entries; they may not be | ||||
able to use such a generic decoder if they do need to validate key | ||||
uniqueness. These generic decoders can only be used in situations | ||||
where the data source and transfer can be relied upon to always | ||||
provide valid maps; this is not possible if the data source and | ||||
transfer can be attacked. | ||||
Generic decoders need to document which of these three approaches | ||||
they implement. | ||||
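
The three approaches a generic decoder might take for duplicate keys can be sketched as a policy switch. The function `build_map` and the policy names are hypothetical, and the sketch assumes the key/value pairs have already been decoded into hashable Python values. Python itself treats the keys 1 and 1.0 as equal, which is stricter than the generic data model, so a real decoder would need its own key comparison.

```python
def build_map(pairs, policy="reject"):
    """Assemble decoded (key, value) pairs under one duplicate-key policy."""
    if policy == "pass_all":
        # Hand every entry to the application, duplicates included.
        return list(pairs)
    if policy not in ("reject", "last_wins"):
        raise ValueError("unknown policy")
    result = {}
    for key, value in pairs:
        if key in result and policy == "reject":
            # Enforce validity: a map with duplicate keys is not accepted.
            raise ValueError(f"duplicate map key: {key!r}")
        result[key] = value  # under "last_wins" earlier entries are lost
    return result


entries = [("a", 1), ("b", 2), ("a", 3)]
print(build_map(entries, "pass_all"))   # [('a', 1), ('b', 2), ('a', 3)]
print(build_map(entries, "last_wins"))  # {'a': 3, 'b': 2}
```

With the "last_wins" policy the application never sees the entry ("a", 1), which is why it cannot validate key uniqueness on its own.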
The CBOR data model for maps does not allow ascribing semantics to | The CBOR data model for maps does not allow ascribing semantics to | |||
the order of the key/value pairs in the map representation. Thus, a | the order of the key/value pairs in the map representation. Thus, a | |||
CBOR-based protocol MUST NOT specify that changing the key/value pair | CBOR-based protocol MUST NOT specify that changing the key/value pair | |||
order in a map would change the semantics, except to specify that | order in a map would change the semantics, except to specify that | |||
some, orders are disallowed, for example where they would not meet | some orders are disallowed, for example where they would not meet the | |||
the requirements of a deterministic encoding (Section 4.2). (Any | requirements of a deterministic encoding (Section 4.2). (Any | |||
secondary effects of map ordering such as on timing, cache usage, and | secondary effects of map ordering such as on timing, cache usage, and | |||
other potential side channels are not considered part of the | other potential side channels are not considered part of the | |||
semantics but may be enough reason on its own for a protocol to | semantics but may be enough reason on their own for a protocol to | |||
require a deterministic encoding format.) | require a deterministic encoding format.) | |||
Applications for constrained devices that have maps where a small | Applications for constrained devices that have maps where a small | |||
number of frequently used keys can be identified should consider | number of frequently used keys can be identified should consider | |||
using small integers as keys; for instance, a set of 24 or fewer | using small integers as keys; for instance, a set of 24 or fewer | |||
frequent keys can be encoded in a single byte as unsigned integers, | frequent keys can be encoded in a single byte as unsigned integers, | |||
up to 48 if negative integers are also used. Less frequently | up to 48 if negative integers are also used. Less frequently | |||
occurring keys can then use integers with longer encodings. | occurring keys can then use integers with longer encodings. | |||
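
The figures of 24 and 48 single-byte keys follow from how integers are encoded, as the following sketch shows. The helper name `encode_int` is hypothetical; the encoding itself is the shortest-form integer encoding of major types 0 and 1.

```python
def encode_int(n: int) -> bytes:
    """Shortest encoding of an integer using major type 0 or 1."""
    major, arg = (0, n) if n >= 0 else (1, -1 - n)
    if arg < 24:
        # Arguments 0 to 23 fit into the initial byte itself.
        return bytes([major << 5 | arg])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < 1 << (8 * size):
            return bytes([major << 5 | info]) + arg.to_bytes(size, "big")
    raise ValueError("integer does not fit major types 0 and 1")


# Count the integer keys that occupy exactly one byte.
one_byte_keys = [n for n in range(-100, 100) if len(encode_int(n)) == 1]
print(len(one_byte_keys), min(one_byte_keys), max(one_byte_keys))  # 48 -24 23
```

Keys 0 to 23 and -1 to -24 take a single byte each; the next keys in either direction (24 and -25) already need two bytes.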
5.6.1. Equivalence of Keys | 5.6.1. Equivalence of Keys | |||
skipping to change at page 37, line 24 ¶ | skipping to change at page 40, line 17 ¶ | |||
purpose of map key equivalence, NaN (not a number) values are | purpose of map key equivalence, NaN (not a number) values are | |||
equivalent if they have the same significand after zero-extending | equivalent if they have the same significand after zero-extending | |||
both significands at the right to 64 bits. | both significands at the right to 64 bits. | |||
(Byte and text) strings are compared byte by byte, arrays element by | (Byte and text) strings are compared byte by byte, arrays element by | |||
element, and are equal if they have the same number of bytes/elements | element, and are equal if they have the same number of bytes/elements | |||
and the same values at the same positions. Two maps are equal if | and the same values at the same positions. Two maps are equal if | |||
they have the same set of pairs regardless of their order; pairs are | they have the same set of pairs regardless of their order; pairs are | |||
equal if both the key and value are equal. | equal if both the key and value are equal. | |||
Tagged values are equal if both the tag number and the enclosed item | Tagged values are equal if both the tag number and the tag content | |||
are equal. (Note that a generic decoder that provides processing for | are equal. (Note that a generic decoder that provides processing for | |||
a specific tag may not be able to distinguish some semantically | a specific tag may not be able to distinguish some semantically | |||
equivalent values, e.g. if leading zeroes occur in the content of tag | equivalent values, e.g. if leading zeroes occur in the content of tag | |||
2/3 (Section 3.4.3).) Simple values are equal if they simply have | 2/3 (Section 3.4.3).) Simple values are equal if they simply have | |||
the same value. Nothing else is equal in the generic data model; a | the same value. Nothing else is equal in the generic data model; a | |||
simple value 2 is not equivalent to an integer 2, and an array is | simple value 2 is not equivalent to an integer 2, and an array is | |||
never equivalent to a map. | never equivalent to a map. | |||
As discussed in Section 2.2, specific data models can make values | As discussed in Section 2.2, specific data models can make values | |||
equivalent for the purpose of comparing map keys that are distinct in | equivalent for the purpose of comparing map keys that are distinct in | |||
skipping to change at page 39, line 27 ¶ | skipping to change at page 42, line 15 ¶ | |||
* A bignum (major type 6, tag number 2 or 3) is represented by | * A bignum (major type 6, tag number 2 or 3) is represented by | |||
encoding its byte string in base64url without padding and becomes | encoding its byte string in base64url without padding and becomes | |||
a JSON string. For tag number 3 (negative bignum), a "~" (ASCII | a JSON string. For tag number 3 (negative bignum), a "~" (ASCII | |||
tilde) is inserted before the base-encoded value. (The conversion | tilde) is inserted before the base-encoded value. (The conversion | |||
to a binary blob instead of a number is to prevent a likely | to a binary blob instead of a number is to prevent a likely | |||
numeric overflow for the JSON decoder.) | numeric overflow for the JSON decoder.) | |||
* A byte string with an encoding hint (major type 6, tag number 21 | * A byte string with an encoding hint (major type 6, tag number 21 | |||
through 23) is encoded as described and becomes a JSON string. | through 23) is encoded as described and becomes a JSON string. | |||
* For all other tags (major type 6, any other tag number), the | * For all other tags (major type 6, any other tag number), the tag | |||
enclosed CBOR item is represented as a JSON value; the tag number | content is represented as a JSON value; the tag number is ignored. | |||
is ignored. | ||||
* Indefinite-length items are made definite before conversion. | * Indefinite-length items are made definite before conversion. | |||
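The bignum rule in the list above could be sketched as follows. This is a minimal illustration and not a complete converter; the function name bignum_to_json_string is hypothetical, and the input is assumed to be the already-decoded tag number together with its byte string content.

```python
import base64

def bignum_to_json_string(tag_number: int, content: bytes) -> str:
    """Render the content of a tag 2/3 bignum as the text of a JSON string.

    Hypothetical helper: base64url without padding, with a "~" prefix
    for tag number 3 (negative bignum).
    """
    if tag_number not in (2, 3):
        raise ValueError("only tag numbers 2 and 3 are bignums")
    text = base64.urlsafe_b64encode(content).rstrip(b"=").decode("ascii")
    return "~" + text if tag_number == 3 else text

# Byte string content 0x010000000000000000 (nine bytes).
content = bytes.fromhex("010000000000000000")
print(bignum_to_json_string(2, content))
print(bignum_to_json_string(3, content))
```

The result is the string value only; a real converter would still have to emit it as a quoted JSON string.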
6.2. Converting from JSON to CBOR | 6.2. Converting from JSON to CBOR | |||
All JSON values, once decoded, directly map into one or more CBOR | All JSON values, once decoded, directly map into one or more CBOR | |||
values. As with any kind of CBOR generation, decisions have to be | values. As with any kind of CBOR generation, decisions have to be | |||
made with respect to number representation. In a suggested | made with respect to number representation. In a suggested | |||
conversion: | conversion: | |||
skipping to change at page 41, line 34 ¶ | skipping to change at page 44, line 21 ¶ | |||
been allocated. Implementations receiving an unknown simple data | been allocated. Implementations receiving an unknown simple data | |||
item may be able to process it as such, given that the structure | item may be able to process it as such, given that the structure | |||
of the value is indeed simple. The IANA registry in Section 9.1 | of the value is indeed simple. The IANA registry in Section 9.1 | |||
is the appropriate way to address the extensibility of this | is the appropriate way to address the extensibility of this | |||
codepoint space. | codepoint space. | |||
* the "tag" space (values in major type 6). Again, only a small | * the "tag" space (values in major type 6). Again, only a small | |||
part of the codepoint space has been allocated, and the space is | part of the codepoint space has been allocated, and the space is | |||
abundant (although the early numbers are more efficient than the | abundant (although the early numbers are more efficient than the | |||
later ones). Implementations receiving an unknown tag number can | later ones). Implementations receiving an unknown tag number can | |||
choose to simply ignore it or to process it as an unknown tag | choose to simply ignore it (process just the enclosed tag content) | |||
number wrapping the enclosed data item. The IANA registry in | or to process it as an unknown tag number wrapping the tag | |||
Section 9.2 is the appropriate way to address the extensibility of | content. The IANA registry in Section 9.2 is the appropriate way | |||
this codepoint space. | to address the extensibility of this codepoint space. | |||
* the "additional information" space. An implementation receiving | * the "additional information" space. An implementation receiving | |||
an unknown additional information value has no way to continue | an unknown additional information value has no way to continue | |||
decoding, so allocating codepoints to this space is a major step. | decoding, so allocating codepoints to this space is a major step. | |||
There are also very few codepoints left. | There are also very few codepoints left. See also Section 7.2. | |||
7.2. Curating the Additional Information Space | 7.2. Curating the Additional Information Space | |||
The human mind is sometimes drawn to filling in little perceived gaps | The human mind is sometimes drawn to filling in little perceived gaps | |||
to make something neat. We expect the remaining gaps in the | to make something neat. We expect the remaining gaps in the | |||
codepoint space for the additional information values to be an | codepoint space for the additional information values to be an | |||
attractor for new ideas, just because they are there. | attractor for new ideas, just because they are there. | |||
The present specification does not manage the additional information | The present specification does not manage the additional information | |||
codepoint space by an IANA registry. Instead, allocations out of | codepoint space by an IANA registry. Instead, allocations out of | |||
this space can only be done by updating this specification. | this space can only be done by updating this specification. | |||
For an additional information value of n >= 24, the size of the | For an additional information value of n >= 24, the size of the | |||
additional data typically is 2**(n-24) bytes. Therefore, additional | additional data typically is 2**(n-24) bytes. Therefore, additional | |||
information values 28 and 29 should be viewed as candidates for | information values 28 and 29 should be viewed as candidates for | |||
128-bit and 256-bit quantities, in case a need arises to add them to | 128-bit and 256-bit quantities, in case a need arises to add them to | |||
the protocol. Additional information value 30 is then the only | the protocol. Additional information value 30 is then the only | |||
additional information value available for general allocation, and | additional information value available for general allocation, and | |||
there should be a very good reason for allocating it before assigning | there should be a very good reason for allocating it before assigning | |||
it through an update of this protocol. | it through an update of the present specification. | |||
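The arithmetic behind the 2**(n-24) rule can be written out as a small sketch. The function name argument_length is hypothetical; the entries for 28 and 29 are only the extrapolation discussed in the text, not an allocation.

```python
def argument_length(additional_info: int) -> int:
    """Bytes that follow the initial byte for additional information 24..27."""
    if not 24 <= additional_info <= 27:
        raise ValueError("only 24..27 carry a fixed-size argument today")
    return 2 ** (additional_info - 24)

# Sizes currently in use: 1, 2, 4, and 8 bytes.
in_use = {n: argument_length(n) for n in range(24, 28)}

# Hypothetical extrapolation of the same formula to the candidate values,
# expressed in bits: this is why 28 and 29 suggest 128 and 256 bits.
candidate_bits = {n: 8 * 2 ** (n - 24) for n in (28, 29)}
print(in_use, candidate_bits)
```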
8. Diagnostic Notation | 8. Diagnostic Notation | |||
CBOR is a binary interchange format. To facilitate documentation and | CBOR is a binary interchange format. To facilitate documentation and | |||
debugging, and in particular to facilitate communication between | debugging, and in particular to facilitate communication between | |||
entities cooperating in debugging, this section defines a simple | entities cooperating in debugging, this section defines a simple | |||
human-readable diagnostic notation. All actual interchange always | human-readable diagnostic notation. All actual interchange always | |||
happens in the binary format. | happens in the binary format. | |||
Note that this truly is a diagnostic format; it is not meant to be | Note that this truly is a diagnostic format; it is not meant to be | |||
parsed. Therefore, no formal definition (as in ABNF) is given in | parsed. Therefore, no formal definition (as in ABNF) is given in | |||
this document. (Implementers looking for a text-based format for | this document. (Implementers looking for a text-based format for | |||
representing CBOR data items in configuration files may also want to | representing CBOR data items in configuration files may also want to | |||
consider YAML [YAML].) | consider YAML [YAML].) | |||
The diagnostic notation is loosely based on JSON as it is defined in | The diagnostic notation is loosely based on JSON as it is defined in | |||
RFC 8259, extending it where needed. | RFC 8259, extending it where needed. | |||
The notation borrows the JSON syntax for numbers (integer and | The notation borrows the JSON syntax for numbers (integer and | |||
floating point), True (>true<), False (>false<), Null (>null<), UTF-8 | floating-point), True (>true<), False (>false<), Null (>null<), UTF-8 | |||
strings, arrays, and maps (maps are called objects in JSON; the | strings, arrays, and maps (maps are called objects in JSON; the | |||
diagnostic notation extends JSON here by allowing any data item in | diagnostic notation extends JSON here by allowing any data item in | |||
the key position). Undefined is written >undefined< as in | the key position). Undefined is written >undefined< as in | |||
JavaScript. The non-finite floating-point numbers Infinity, | JavaScript. The non-finite floating-point numbers Infinity, | |||
-Infinity, and NaN are written exactly as in this sentence (this is | -Infinity, and NaN are written exactly as in this sentence (this is | |||
also a way they can be written in JavaScript, although JSON does not | also a way they can be written in JavaScript, although JSON does not | |||
allow them). A tag is written as an integer number for the tag | allow them). A tag is written as an integer number for the tag | |||
number, followed by the tag content in parentheses; for instance, an | number, followed by the tag content in parentheses; for instance, an | |||
RFC 3339 (ISO 8601) date could be notated as: | RFC 3339 (ISO 8601) date could be notated as: | |||
skipping to change at page 43, line 16 ¶ | skipping to change at page 45, line 51 ¶ | |||
padding, enclosed in single quotes, prefixed by >h< for base16, >b32< | padding, enclosed in single quotes, prefixed by >h< for base16, >b32< | |||
for base32, >h32< for base32hex, >b64< for base64 or base64url (the | for base32, >h32< for base32hex, >b64< for base64 or base64url (the | |||
actual encodings do not overlap, so the string remains unambiguous). | actual encodings do not overlap, so the string remains unambiguous). | |||
For example, the byte string 0x12345678 could be written h'12345678', | For example, the byte string 0x12345678 could be written h'12345678', | |||
b32'CI2FM6A', or b64'EjRWeA'. | b32'CI2FM6A', or b64'EjRWeA'. | |||
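The three spellings in the example can be reproduced with the Python standard library. This is a sketch for illustration: the function name diag_bytes is hypothetical, padding is stripped to match the example, and base64url is chosen arbitrarily for the b64 prefix (base32hex is left out for brevity).

```python
import base64

def diag_bytes(data: bytes, prefix: str = "h") -> str:
    """Write a byte string in diagnostic notation with the given prefix."""
    if prefix == "h":
        body = data.hex()
    elif prefix == "b32":
        body = base64.b32encode(data).decode("ascii").rstrip("=")
    elif prefix == "b64":
        body = base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")
    else:
        raise ValueError("unsupported prefix in this sketch: " + prefix)
    return prefix + "'" + body + "'"

data = bytes.fromhex("12345678")
for prefix in ("h", "b32", "b64"):
    print(diag_bytes(data, prefix))
```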
Unassigned simple values are given as "simple()" with the appropriate | Unassigned simple values are given as "simple()" with the appropriate | |||
integer in the parentheses. For example, "simple(42)" indicates | integer in the parentheses. For example, "simple(42)" indicates | |||
major type 7, value 42. | major type 7, value 42. | |||
A number of useful extensions to the diagnostic notation defined here | ||||
are provided in Appendix G of [RFC8610], "Extended Diagnostic | ||||
Notation" (EDN). | ||||
8.1. Encoding Indicators | 8.1. Encoding Indicators | |||
Sometimes it is useful to indicate in the diagnostic notation which | Sometimes it is useful to indicate in the diagnostic notation which | |||
of several alternative representations were actually used; for | of several alternative representations were actually used; for | |||
example, a data item written >1.5< by a diagnostic decoder might have | example, a data item written >1.5< by a diagnostic decoder might have | |||
been encoded as a half-, single-, or double-precision float. | been encoded as a half-, single-, or double-precision float. | |||
The convention for encoding indicators is that anything starting with | The convention for encoding indicators is that anything starting with | |||
an underscore and all following characters that are alphanumeric or | an underscore and all following characters that are alphanumeric or | |||
underscore, is an encoding indicator, and can be ignored by anyone | underscore, is an encoding indicator, and can be ignored by anyone | |||
not interested in this information. Encoding indicators are always | not interested in this information. For example, "_" or "_3". | |||
optional. | Encoding indicators are always optional. | |||
A single underscore can be written after the opening brace of a map | A single underscore can be written after the opening brace of a map | |||
or the opening bracket of an array to indicate that the data item was | or the opening bracket of an array to indicate that the data item was | |||
represented in indefinite-length format. For example, [_ 1, 2] | represented in indefinite-length format. For example, [_ 1, 2] | |||
contains an indicator that an indefinite-length representation was | contains an indicator that an indefinite-length representation was | |||
used to represent the data item [1, 2]. | used to represent the data item [1, 2]. | |||
An underscore followed by a decimal digit n indicates that the | An underscore followed by a decimal digit n indicates that the | |||
preceding item (or, for arrays and maps, the item starting with the | preceding item (or, for arrays and maps, the item starting with the | |||
preceding bracket or brace) was encoded with an additional | preceding bracket or brace) was encoded with an additional | |||
information value of 24+n. For example, 1.5_1 is a half-precision | information value of 24+n. For example, 1.5_1 is a half-precision | |||
floating-point number, while 1.5_3 is encoded as double precision. | floating-point number, while 1.5_3 is encoded as double precision. | |||
This encoding indicator is not shown in Appendix A. (Note that the | This encoding indicator is not shown in Appendix A. (Note that the | |||
encoding indicator "_" is thus an abbreviation of the full form "_7", | encoding indicator "_" is thus an abbreviation of the full form "_7", | |||
which is not used.) | which is not used.) | |||
As a special case, byte and text strings of indefinite length can be | Byte and text strings of indefinite length can be notated in the form | |||
notated in the form (_ h'0123', h'4567') and (_ "foo", "bar"). | (_ h'0123', h'4567') and (_ "foo", "bar"). | |||
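To make the relationship between the indicator digit and the encoding concrete, the following sketch encodes a floating-point value with additional information 24+n. The helper name encode_float is hypothetical; the sketch assumes the major type 7 float encodings (25 half, 26 single, 27 double) and does not check whether the value survives the narrower format.

```python
import struct

# Indicator digit n -> struct format for additional information 24+n.
FLOAT_FORMATS = {1: ">e", 2: ">f", 3: ">d"}

def encode_float(value: float, indicator: int) -> bytes:
    """Encode value as a major type 7 float selected by the digit after "_"."""
    if indicator not in FLOAT_FORMATS:
        raise ValueError("floats use the indicators _1, _2, or _3")
    initial_byte = (7 << 5) | (24 + indicator)
    return bytes([initial_byte]) + struct.pack(FLOAT_FORMATS[indicator], value)

for n in (1, 2, 3):
    print("1.5_%d" % n, encode_float(1.5, n).hex())
```

Here 1.5_1 yields a three-byte item and 1.5_3 a nine-byte item, although both denote the same value.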
9. IANA Considerations | 9. IANA Considerations | |||
IANA has created two registries for new CBOR values. The registries | IANA has created two registries for new CBOR values. The registries | |||
are separate, that is, not under an umbrella registry, and follow the | are separate, that is, not under an umbrella registry, and follow the | |||
rules in [RFC8126]. IANA has also assigned a new MIME media type and | rules in [RFC8126]. IANA has also assigned a new MIME media type and | |||
an associated Constrained Application Protocol (CoAP) Content-Format | an associated Constrained Application Protocol (CoAP) Content-Format | |||
entry. | entry. | |||
[To be removed by RFC editor:] IANA is requested to update these | [To be removed by RFC editor:] IANA is requested to update these | |||
registries to point to the present document instead of RFC 7049. | registries to point to the present document instead of RFC 7049. | |||
9.1. Simple Values Registry | 9.1. Simple Values Registry | |||
IANA has created the "Concise Binary Object Representation (CBOR) | IANA has created the "Concise Binary Object Representation (CBOR) | |||
Simple Values" registry at [IANA.cbor-simple-values]. The initial | Simple Values" registry at [IANA.cbor-simple-values]. The initial | |||
values are shown in Table 3. | values are shown in Table 4. | |||
New entries in the range 0 to 19 are assigned by Standards Action. | New entries in the range 0 to 19 are assigned by Standards Action. | |||
It is suggested that these Standards Actions allocate values starting | It is suggested that these Standards Actions allocate values starting | |||
with the number 16 in order to reserve the lower numbers for | with the number 16 in order to reserve the lower numbers for | |||
contiguous blocks (if any). | contiguous blocks (if any). | |||
New entries in the range 32 to 255 are assigned by Specification | New entries in the range 32 to 255 are assigned by Specification | |||
Required. | Required. | |||
9.2. Tags Registry | 9.2. Tags Registry | |||
IANA has created the "Concise Binary Object Representation (CBOR) | IANA has created the "Concise Binary Object Representation (CBOR) | |||
Tags" registry at [IANA.cbor-tags]. The tags that were defined in | Tags" registry at [IANA.cbor-tags]. The tags that were defined in | |||
[RFC7049] are described in detail in Section 3.4, but other tags have | [RFC7049] are described in detail in Section 3.4, and other tags have | |||
already been defined. | already been defined. | |||
New entries in the range 0 to 23 are assigned by Standards Action. | New entries in the range 0 to 23 are assigned by Standards Action. | |||
New entries in the range 24 to 255 are assigned by Specification | New entries in the range 24 to 255 are assigned by Specification | |||
Required. New entries in the range 256 to 18446744073709551615 are | Required. New entries in the range 256 to 18446744073709551615 are | |||
assigned by First Come First Served. The template for registration | assigned by First Come First Served. The template for registration | |||
requests is: | requests is: | |||
* Data item | * Data item | |||
skipping to change at page 45, line 8 ¶ | skipping to change at page 47, line 46 ¶ | |||
In addition, First Come First Served requests should include: | In addition, First Come First Served requests should include: | |||
* Point of contact | * Point of contact | |||
* Description of semantics (URL) - This description is optional; the | * Description of semantics (URL) - This description is optional; the | |||
URL can point to something like an Internet-Draft or a web page. | URL can point to something like an Internet-Draft or a web page. | |||
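The assignment ranges stated in this section can be summarized in a small lookup. This is only a restatement of the ranges given above as code; the function name tag_registration_policy is hypothetical.

```python
MAX_TAG_NUMBER = 18446744073709551615  # 2**64 - 1, the upper bound given above

def tag_registration_policy(tag_number: int) -> str:
    """Return the registration policy that applies to a tag number."""
    if not 0 <= tag_number <= MAX_TAG_NUMBER:
        raise ValueError("tag number out of range")
    if tag_number <= 23:
        return "Standards Action"
    if tag_number <= 255:
        return "Specification Required"
    return "First Come First Served"

for number in (0, 23, 24, 255, 256, MAX_TAG_NUMBER):
    print(number, tag_registration_policy(number))
```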
9.3. Media Type ("MIME Type") | 9.3. Media Type ("MIME Type") | |||
The Internet media type [RFC6838] for a single encoded CBOR data item | The Internet media type [RFC6838] for a single encoded CBOR data item | |||
is application/cbor. | is application/cbor, as defined in [IANA.media-types]: | |||
Type name: application | Type name: application | |||
Subtype name: cbor | Subtype name: cbor | |||
Required parameters: n/a | Required parameters: n/a | |||
Optional parameters: n/a | Optional parameters: n/a | |||
Encoding considerations: binary | Encoding considerations: binary | |||
Security considerations: See Section 10 of this document | Security considerations: See Section 10 of this document | |||
Interoperability considerations: n/a | Interoperability considerations: n/a | |||
Published specification: This document | Published specification: This document | |||
Applications that use this media type: None yet, but it is expected | Applications that use this media type: None yet, but it is expected | |||
that this format will be deployed in protocols and applications. | that this format will be deployed in protocols and applications. | |||
Additional information: | Additional information: * Magic number(s): n/a | |||
Magic number(s): n/a | ||||
File extension(s): .cbor | ||||
Macintosh file type code(s): n/a | ||||
Person & email address to contact for further information: | * File extension(s): .cbor | |||
Carsten Bormann | ||||
cabo@tzi.org | * Macintosh file type code(s): n/a | |||
Person & email address to contact for further information: IETF CBOR | ||||
Working Group cbor@ietf.org (mailto:cbor@ietf.org) or IETF | ||||
Applications and Real-Time Area art@ietf.org (mailto:art@ietf.org) | ||||
Intended usage: COMMON | Intended usage: COMMON | |||
Restrictions on usage: none | Restrictions on usage: none | |||
Author: | Author: IETF CBOR Working Group cbor@ietf.org (mailto:cbor@ietf.org) | |||
Carsten Bormann <cabo@tzi.org> | ||||
Change controller: | Change controller: The IESG iesg@ietf.org (mailto:iesg@ietf.org) | |||
The IESG <iesg@ietf.org> | ||||
9.4. CoAP Content-Format | 9.4. CoAP Content-Format | |||
The CoAP Content-Format for CBOR is defined in | ||||
[IANA.core-parameters]: | ||||
Media Type: application/cbor | Media Type: application/cbor | |||
Encoding: - | Encoding: - | |||
Id: 60 | Id: 60 | |||
Reference: [RFCthis] | Reference: [RFCthis] | |||
9.5. The +cbor Structured Syntax Suffix Registration | 9.5. The +cbor Structured Syntax Suffix Registration | |||
The Structured Syntax Suffix [RFC6838] for media types based on a | ||||
single encoded CBOR data item is +cbor, as defined in | ||||
[IANA.media-type-structured-suffix]: | ||||
Name: Concise Binary Object Representation (CBOR) | Name: Concise Binary Object Representation (CBOR) | |||
+suffix: +cbor | +suffix: +cbor | |||
References: [RFCthis] | References: [RFCthis] | |||
Encoding Considerations: CBOR is a binary format. | Encoding Considerations: CBOR is a binary format. | |||
Interoperability Considerations: n/a | Interoperability Considerations: n/a | |||
Fragment Identifier Considerations: | Fragment Identifier Considerations: The syntax and semantics of | |||
The syntax and semantics of fragment identifiers specified for | fragment identifiers specified for +cbor SHOULD be as specified | |||
+cbor SHOULD be as specified for "application/cbor". (At | for "application/cbor". (At publication of this document, there | |||
publication of this document, there is no fragment identification | is no fragment identification syntax defined for "application/ | |||
syntax defined for "application/cbor".) | cbor".) | |||
The syntax and semantics for fragment identifiers for a specific | The syntax and semantics for fragment identifiers for a specific | |||
"xxx/yyy+cbor" SHOULD be processed as follows: | "xxx/yyy+cbor" SHOULD be processed as follows: | |||
For cases defined in +cbor, where the fragment identifier resolves | * For cases defined in +cbor, where the fragment identifier | |||
per the +cbor rules, then process as specified in +cbor. | resolves per the +cbor rules, then process as specified in | |||
+cbor. | ||||
For cases defined in +cbor, where the fragment identifier does | * For cases defined in +cbor, where the fragment identifier does | |||
not resolve per the +cbor rules, then process as specified in | not resolve per the +cbor rules, then process as specified in | |||
"xxx/yyy+cbor". | "xxx/yyy+cbor". | |||
For cases not defined in +cbor, then process as specified in | * For cases not defined in +cbor, then process as specified in | |||
"xxx/yyy+cbor". | "xxx/yyy+cbor". | |||
Security Considerations: See Section 10 of this document | Security Considerations: See Section 10 of this document | |||
Contact: | Contact: IETF CBOR Working Group cbor@ietf.org | |||
Apps Area Working Group (apps-discuss@ietf.org) | (mailto:cbor@ietf.org) or IETF Applications and Real-Time Area | |||
art@ietf.org (mailto:art@ietf.org) | ||||
Author/Change Controller: | Author/Change Controller: The IESG iesg@ietf.org | |||
The Apps Area Working Group. | (mailto:iesg@ietf.org) | |||
The IESG has change control over this registration. | // Editors' note: RFC 6838 has a template field Author/Change | |||
// controller, the descriptive text of which makes clear that this is | ||||
// the change controller, not the author. Go figure. There is no | ||||
// separate author entry as in the media types registry. (RFC | ||||
// editor: Please remove this note before publication.) | ||||
10. Security Considerations | 10. Security Considerations | |||
A network-facing application can exhibit vulnerabilities in its | A network-facing application can exhibit vulnerabilities in its | |||
processing logic for incoming data. Complex parsers are well known | processing logic for incoming data. Complex parsers are well known | |||
as a likely source of such vulnerabilities, such as the ability to | as a likely source of such vulnerabilities, such as the ability to | |||
remotely crash a node, or even remotely execute arbitrary code on it. | remotely crash a node, or even remotely execute arbitrary code on it. | |||
CBOR attempts to narrow the opportunities for introducing such | CBOR attempts to narrow the opportunities for introducing such | |||
vulnerabilities by reducing parser complexity, by giving the entire | vulnerabilities by reducing parser complexity, by giving the entire | |||
range of encodable values a meaning where possible. | range of encodable values a meaning where possible. | |||
skipping to change at page 50, line 19 ¶ | skipping to change at page 53, line 26 ¶ | |||
[ASN.1] International Telecommunication Union, "Information | [ASN.1] International Telecommunication Union, "Information | |||
Technology -- ASN.1 encoding rules: Specification of Basic | Technology -- ASN.1 encoding rules: Specification of Basic | |||
Encoding Rules (BER), Canonical Encoding Rules (CER) and | Encoding Rules (BER), Canonical Encoding Rules (CER) and | |||
Distinguished Encoding Rules (DER)", ITU-T Recommendation | Distinguished Encoding Rules (DER)", ITU-T Recommendation | |||
X.690, 1994. | X.690, 1994. | |||
[BSON] Various, "BSON - Binary JSON", 2013, | [BSON] Various, "BSON - Binary JSON", 2013, | |||
<http://bsonspec.org/>. | <http://bsonspec.org/>. | |||
[I-D.ietf-cbor-sequence] | ||||
Bormann, C., "Concise Binary Object Representation (CBOR) | ||||
Sequences", Work in Progress, Internet-Draft, draft-ietf- | ||||
cbor-sequence-02, 25 September 2019, <http://www.ietf.org/ | ||||
internet-drafts/draft-ietf-cbor-sequence-02.txt>. | ||||
[IANA.cbor-simple-values] | [IANA.cbor-simple-values] | |||
IANA, "Concise Binary Object Representation (CBOR) Simple | IANA, "Concise Binary Object Representation (CBOR) Simple | |||
Values", | Values", | |||
<http://www.iana.org/assignments/cbor-simple-values>. | <http://www.iana.org/assignments/cbor-simple-values>. | |||
[IANA.cbor-tags] | [IANA.cbor-tags] | |||
IANA, "Concise Binary Object Representation (CBOR) Tags", | IANA, "Concise Binary Object Representation (CBOR) Tags", | |||
<http://www.iana.org/assignments/cbor-tags>. | <http://www.iana.org/assignments/cbor-tags>. | |||
[IANA.core-parameters] | ||||
IANA, "Constrained RESTful Environments (CoRE) | ||||
Parameters", | ||||
<http://www.iana.org/assignments/core-parameters>. | ||||
[IANA.media-type-structured-suffix] | ||||
IANA, "Structured Syntax Suffix Registry", | ||||
<http://www.iana.org/assignments/media-type-structured- | ||||
suffix>. | ||||
[IANA.media-types] | ||||
IANA, "Media Types", | ||||
<http://www.iana.org/assignments/media-types>. | ||||
[MessagePack] | [MessagePack] | |||
Furuhashi, S., "MessagePack", 2013, <http://msgpack.org/>. | Furuhashi, S., "MessagePack", 2013, <http://msgpack.org/>. | |||
[PCRE] Ho, A., "PCRE - Perl Compatible Regular Expressions", | [PCRE] Ho, A., "PCRE - Perl Compatible Regular Expressions", | |||
2018, <http://www.pcre.org/>. | 2018, <http://www.pcre.org/>. | |||
[RFC0713] Haverty, J., "MSDTP-Message Services Data Transmission | [RFC0713] Haverty, J., "MSDTP-Message Services Data Transmission | |||
Protocol", RFC 713, DOI 10.17487/RFC0713, April 1976, | Protocol", RFC 713, DOI 10.17487/RFC0713, April 1976, | |||
<https://www.rfc-editor.org/info/rfc713>. | <https://www.rfc-editor.org/info/rfc713>. | |||
skipping to change at page 51, line 14 ¶ | skipping to change at page 54, line 30 ¶ | |||
[RFC7228] Bormann, C., Ersue, M., and A. Keranen, "Terminology for | [RFC7228] Bormann, C., Ersue, M., and A. Keranen, "Terminology for | |||
Constrained-Node Networks", RFC 7228, | Constrained-Node Networks", RFC 7228, | |||
DOI 10.17487/RFC7228, May 2014, | DOI 10.17487/RFC7228, May 2014, | |||
<https://www.rfc-editor.org/info/rfc7228>. | <https://www.rfc-editor.org/info/rfc7228>. | |||
[RFC7493] Bray, T., Ed., "The I-JSON Message Format", RFC 7493, | [RFC7493] Bray, T., Ed., "The I-JSON Message Format", RFC 7493, | |||
DOI 10.17487/RFC7493, March 2015, | DOI 10.17487/RFC7493, March 2015, | |||
<https://www.rfc-editor.org/info/rfc7493>. | <https://www.rfc-editor.org/info/rfc7493>. | |||
[RFC7991] Hoffman, P., "The "xml2rfc" Version 3 Vocabulary", | ||||
RFC 7991, DOI 10.17487/RFC7991, December 2016, | ||||
<https://www.rfc-editor.org/info/rfc7991>. | ||||
[RFC8259] Bray, T., Ed., "The JavaScript Object Notation (JSON) Data | [RFC8259] Bray, T., Ed., "The JavaScript Object Notation (JSON) Data | |||
Interchange Format", STD 90, RFC 8259, | Interchange Format", STD 90, RFC 8259, | |||
DOI 10.17487/RFC8259, December 2017, | DOI 10.17487/RFC8259, December 2017, | |||
<https://www.rfc-editor.org/info/rfc8259>. | <https://www.rfc-editor.org/info/rfc8259>. | |||
[RFC8610] Birkholz, H., Vigano, C., and C. Bormann, "Concise Data | ||||
Definition Language (CDDL): A Notational Convention to | ||||
Express Concise Binary Object Representation (CBOR) and | ||||
JSON Data Structures", RFC 8610, DOI 10.17487/RFC8610, | ||||
June 2019, <https://www.rfc-editor.org/info/rfc8610>. | ||||
[RFC8618] Dickinson, J., Hague, J., Dickinson, S., Manderson, T., | [RFC8618] Dickinson, J., Hague, J., Dickinson, S., Manderson, T., | |||
and J. Bond, "Compacted-DNS (C-DNS): A Format for DNS | and J. Bond, "Compacted-DNS (C-DNS): A Format for DNS | |||
Packet Capture", RFC 8618, DOI 10.17487/RFC8618, September | Packet Capture", RFC 8618, DOI 10.17487/RFC8618, September | |||
2019, <https://www.rfc-editor.org/info/rfc8618>. | 2019, <https://www.rfc-editor.org/info/rfc8618>. | |||
[RFC8742] Bormann, C., "Concise Binary Object Representation (CBOR) | ||||
Sequences", RFC 8742, DOI 10.17487/RFC8742, February 2020, | ||||
<https://www.rfc-editor.org/info/rfc8742>. | ||||
[RFC8746] Bormann, C., Ed., "Concise Binary Object Representation | ||||
(CBOR) Tags for Typed Arrays", RFC 8746, | ||||
DOI 10.17487/RFC8746, February 2020, | ||||
<https://www.rfc-editor.org/info/rfc8746>. | ||||
[SIPHASH] Aumasson, J. and D. Bernstein, "SipHash: A Fast Short- | [SIPHASH] Aumasson, J. and D. Bernstein, "SipHash: A Fast Short- | |||
Input PRF", DOI 10.1007/978-3-642-34931-7_28, Lecture | Input PRF", DOI 10.1007/978-3-642-34931-7_28, Lecture | |||
Notes in Computer Science pp. 489-508, 2012, | Notes in Computer Science pp. 489-508, 2012, | |||
<https://doi.org/10.1007/978-3-642-34931-7_28>. | <https://doi.org/10.1007/978-3-642-34931-7_28>. | |||
[YAML] Ben-Kiki, O., Evans, C., and I.d. Net, "YAML Ain't Markup | [YAML] Ben-Kiki, O., Evans, C., and I.d. Net, "YAML Ain't Markup | |||
Language (YAML[TM]) Version 1.2", 3rd Edition, October | Language (YAML[TM]) Version 1.2", 3rd Edition, October | |||
2009, <http://www.yaml.org/spec/1.2/spec.html>. | 2009, <http://www.yaml.org/spec/1.2/spec.html>. | |||
Appendix A. Examples | Appendix A. Examples | |||
skipping to change at page 55, line 35 ¶ | skipping to change at page 59, line 25 ¶ | |||
| 17, 18, 19, 20, 21, 22, 23, | | | | 17, 18, 19, 20, 21, 22, 23, | | | |||
| 24, 25] | | | | 24, 25] | | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| {_ "a": 1, "b": [_ 2, 3]} | 0xbf61610161629f0203ffff | | | {_ "a": 1, "b": [_ 2, 3]} | 0xbf61610161629f0203ffff | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| ["a", {_ "b": "c"}] | 0x826161bf61626163ff | | | ["a", {_ "b": "c"}] | 0x826161bf61626163ff | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
| {_ "Fun": true, "Amt": -2} | 0xbf6346756ef563416d7421ff | | | {_ "Fun": true, "Amt": -2} | 0xbf6346756ef563416d7421ff | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
Table 5: Examples of Encoded CBOR Data Items | Table 6: Examples of Encoded CBOR Data Items | |||
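The indefinite-length rows at the end of the table above can be checked with a small decoder. The following Python sketch is illustrative only and is not part of the specification: the names decode and _decode_item are hypothetical, and only the subset of CBOR needed for these rows is handled (integers, definite-length text strings, arrays and maps of either length form, and true/false). Truncated input is not handled gracefully.

```python
_BREAK = object()  # sentinel returned for the 0xff "break" stop code


def _decode_item(data, pos):
    """Decode one data item at data[pos]; return (value, next_pos)."""
    ib = data[pos]
    pos += 1
    if ib == 0xFF:
        return _BREAK, pos
    mt, ai = ib >> 5, ib & 0x1F          # major type, additional information
    if ai < 24:
        val = ai
    elif ai <= 27:
        n = 1 << (ai - 24)               # 1, 2, 4 or 8 argument bytes
        val = int.from_bytes(data[pos:pos + n], "big")
        pos += n
    elif ai == 31 and mt in (4, 5):
        val = None                       # indefinite-length array or map
    else:
        raise ValueError(f"unsupported initial byte 0x{ib:02x}")

    if mt == 0:                          # unsigned integer
        return val, pos
    if mt == 1:                          # negative integer
        return -1 - val, pos
    if mt == 3:                          # definite-length text string
        return data[pos:pos + val].decode("utf-8"), pos + val
    if mt in (4, 5):                     # array or map
        want = None if val is None else val * (2 if mt == 5 else 1)
        items = []
        while want is None or len(items) < want:
            item, pos = _decode_item(data, pos)
            if item is _BREAK:
                if want is not None:
                    raise ValueError("break inside definite-length item")
                break
            items.append(item)
        if mt == 4:
            return items, pos
        if len(items) % 2:
            raise ValueError("map with a key but no value")
        return dict(zip(items[0::2], items[1::2])), pos
    if mt == 7 and ai in (20, 21):       # false, true
        return ai == 21, pos
    raise ValueError(f"unsupported initial byte 0x{ib:02x}")


def decode(hex_string):
    """Decode exactly one data item given as a hex string."""
    data = bytes.fromhex(hex_string)
    item, pos = _decode_item(data, 0)
    if item is _BREAK or pos != len(data):
        raise ValueError("not exactly one data item")
    return item
```

For example, decode("bf61610161629f0203ffff") yields the Python dictionary {"a": 1, "b": [2, 3]}; the information that the map and the inner array were encoded with indefinite length is not preserved in the decoded value.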
Appendix B. Jump Table | Appendix B. Jump Table | |||
For brevity, this jump table does not show initial bytes that are | For brevity, this jump table does not show initial bytes that are | |||
reserved for future extension. It also only shows a selection of the | reserved for future extension. It also only shows a selection of the | |||
initial bytes that can be used for optional features. (All unsigned | initial bytes that can be used for optional features. (All unsigned | |||
integers are in network byte order.) | integers are in network byte order.) | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| Byte | Structure/Semantics | | | Byte | Structure/Semantics | | |||
skipping to change at page 58, line 42 ¶ | skipping to change at page 62, line 32 ¶ | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xf9 | Half-Precision Float (two-byte IEEE 754) | | | 0xf9 | Half-Precision Float (two-byte IEEE 754) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xfa | Single-Precision Float (four-byte IEEE 754) | | | 0xfa | Single-Precision Float (four-byte IEEE 754) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xfb | Double-Precision Float (eight-byte IEEE 754) | | | 0xfb | Double-Precision Float (eight-byte IEEE 754) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xff | "break" stop code | | | 0xff | "break" stop code | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
Table 6: Jump Table for Initial Byte | Table 7: Jump Table for Initial Byte | |||
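As an illustration of the three floating-point rows of the jump table, the following Python sketch maps the initial bytes 0xf9, 0xfa and 0xfb to the corresponding IEEE 754 widths. It is a sketch, not a reference implementation: the function name decode_float is hypothetical, and it relies on the standard struct module, whose "e", "f" and "d" format codes denote half, single and double precision.

```python
import struct

# Initial byte -> struct format; ">" selects network byte order.
_FLOAT_FORMATS = {0xF9: ">e", 0xFA: ">f", 0xFB: ">d"}


def decode_float(data):
    """Decode a single CBOR floating-point data item."""
    fmt = _FLOAT_FORMATS.get(data[0])
    if fmt is None:
        raise ValueError("initial byte is not 0xf9, 0xfa or 0xfb")
    if len(data) != 1 + struct.calcsize(fmt):
        raise ValueError("wrong number of bytes after the initial byte")
    return struct.unpack(fmt, data[1:])[0]
```

With this sketch, decode_float(bytes.fromhex("f93c00")) returns 1.0 and decode_float(bytes.fromhex("fb3ff199999999999a")) returns 1.1.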
Appendix C. Pseudocode | Appendix C. Pseudocode | |||
The well-formedness of a CBOR item can be checked by the pseudocode | The well-formedness of a CBOR item can be checked by the pseudocode | |||
in Figure 1. The data is well-formed if and only if: | in Figure 1. The data is well-formed if and only if: | |||
* the pseudocode does not "fail"; | * the pseudocode does not "fail"; | |||
* after execution of the pseudocode, no bytes are left in the input | * after execution of the pseudocode, no bytes are left in the input | |||
(except in streaming applications) | (except in streaming applications) | |||
The pseudocode has the following prerequisites: | The pseudocode has the following prerequisites: | |||
* take(n) reads n bytes from the input data and returns them as a | * take(n) reads n bytes from the input data and returns them as a | |||
byte string. If n bytes are no longer available, take(n) fails. | byte string. If n bytes are no longer available, take(n) fails. | |||
* uint() converts a byte string into an unsigned integer by | * uint() converts a byte string into an unsigned integer by | |||
interpreting the byte string in network byte order. | interpreting the byte string in network byte order. | |||
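The two prerequisites described so far could be realized as follows in Python. This is a sketch under assumptions: the class name Input is hypothetical, and signalling "fails" by raising EOFError is one possible choice among several.

```python
class Input:
    """Holds the input data and a read position (hypothetical helper)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def take(self, n):
        """Read n bytes and return them; fail if fewer than n remain."""
        if n > len(self.data) - self.pos:
            raise EOFError("fewer than n bytes available")
        chunk = self.data[self.pos:self.pos + n]
        self.pos += n
        return chunk


def uint(byte_string):
    """Interpret a byte string as an unsigned integer, network byte order."""
    return int.from_bytes(byte_string, "big")
```

For the input 19 03 e8, uint(take(1)) yields 0x19 (major type 0, additional information 25) and the following uint(take(2)) yields 1000.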
skipping to change at page 64, line 34 ¶ | skipping to change at page 68, line 34 ¶ | |||
Message Services Data Transmission (MSDTP) is a very early example of | Message Services Data Transmission (MSDTP) is a very early example of | |||
a compact message format; it is described in [RFC0713], written in | a compact message format; it is described in [RFC0713], written in | |||
1976. It is included here for its historical value, not because it | 1976. It is included here for its historical value, not because it | |||
was ever widely used. | was ever widely used. | |||
E.5. Conciseness on the Wire | E.5. Conciseness on the Wire | |||
While CBOR's design objective of code compactness for encoders and | While CBOR's design objective of code compactness for encoders and | |||
decoders is a higher priority than its objective of conciseness on | decoders is a higher priority than its objective of conciseness on | |||
the wire, many people focus on the wire size. Table 7 shows some | the wire, many people focus on the wire size. Table 8 shows some | |||
encoding examples for the simple nested array [1, [2, 3]]; where some | encoding examples for the simple nested array [1, [2, 3]]; where some | |||
form of indefinite-length encoding is supported by the encoding, | form of indefinite-length encoding is supported by the encoding, | |||
[_ 1, [2, 3]] (indefinite length on the outer array) is also shown. | [_ 1, [2, 3]] (indefinite length on the outer array) is also shown. | |||
+-------------+----------------------------+----------------+ | +-------------+----------------------------+----------------+ | |||
| Format | [1, [2, 3]] | [_ 1, [2, 3]] | | | Format | [1, [2, 3]] | [_ 1, [2, 3]] | | |||
+=============+============================+================+ | +=============+============================+================+ | |||
| RFC 713 | c2 05 81 c2 02 82 83 | | | | RFC 713 | c2 05 81 c2 02 82 83 | | | |||
+-------------+----------------------------+----------------+ | +-------------+----------------------------+----------------+ | |||
| ASN.1 BER | 30 0b 02 01 01 30 06 02 01 | 30 80 02 01 01 | | | ASN.1 BER | 30 0b 02 01 01 30 06 02 01 | 30 80 02 01 01 | | |||
skipping to change at page 65, line 25 ¶ | skipping to change at page 69, line 25 ¶ | |||
+-------------+----------------------------+----------------+ | +-------------+----------------------------+----------------+ | |||
| BSON | 22 00 00 00 10 30 00 01 00 | | | | BSON | 22 00 00 00 10 30 00 01 00 | | | |||
| | 00 00 04 31 00 13 00 00 00 | | | | | 00 00 04 31 00 13 00 00 00 | | | |||
| | 10 30 00 02 00 00 00 10 31 | | | | | 10 30 00 02 00 00 00 10 31 | | | |||
| | 00 03 00 00 00 00 00 | | | | | 00 03 00 00 00 00 00 | | | |||
+-------------+----------------------------+----------------+ | +-------------+----------------------------+----------------+ | |||
| CBOR | 82 01 82 02 03 | 9f 01 82 02 03 | | | CBOR | 82 01 82 02 03 | 9f 01 82 02 03 | | |||
| | | ff | | | | | ff | | |||
+-------------+----------------------------+----------------+ | +-------------+----------------------------+----------------+ | |||
Table 7: Examples for Different Levels of Conciseness | Table 8: Examples for Different Levels of Conciseness | |||
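The CBOR row of the table above can be reproduced with a few lines of Python. The sketch below is illustrative only: the names encode_head and encode are hypothetical, only unsigned integers and arrays are supported, and the indefinite flag applies to the outermost array only, matching the second column of the table.

```python
def encode_head(major_type, argument):
    """Encode the initial byte plus the shortest form of the argument."""
    if argument < 24:
        return bytes([major_type << 5 | argument])
    for ai, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < 1 << (8 * size):
            return bytes([major_type << 5 | ai]) + argument.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def encode(item, indefinite=False):
    """Encode an unsigned integer or a (possibly nested) list."""
    if isinstance(item, bool):
        raise TypeError("booleans are not handled in this sketch")
    if isinstance(item, int) and item >= 0:
        return encode_head(0, item)
    if isinstance(item, list):
        body = b"".join(encode(element) for element in item)
        if indefinite:
            return b"\x9f" + body + b"\xff"   # 0x9f ... "break"
        return encode_head(4, len(item)) + body
    raise TypeError("only unsigned integers and lists are handled")
```

Here encode([1, [2, 3]]) produces the five bytes 82 01 82 02 03, and encode([1, [2, 3]], indefinite=True) produces the six bytes 9f 01 82 02 03 ff.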
Appendix F. Changes from RFC 7049 | Appendix F. Changes from RFC 7049 | |||
The following is a list of known changes from RFC 7049. This list is | The following is a list of known changes from RFC 7049. This list is | |||
non-authoritative. It is meant to help reviewers see the significant | non-authoritative. It is meant to help reviewers see the significant | |||
differences. | differences. | |||
* Updated reference for [RFC4627] to [RFC8259] in many places | * Made some use of new RFCXML functionality [RFC7991] | |||
* Updated reference for [CNN-TERMS] to [RFC7228] | * Updated references, e.g. for [RFC4627] to [RFC8259] in many | |||
places, for [CNN-TERMS] to [RFC7228]; added missing reference to | ||||
[IEEE754] and updated to [ECMA262] | ||||
* Added a comment to the last example in Section 2.2.1 (added | * Fixed errata: in the example in Section 2.4.2 ("29" -> "49"), and | |||
in the last paragraph of Section 3.6 ("0b000_11101" -> | ||||
"0b000_11001") | ||||
* Added a comment to the last example in Section 3.2.2 (added | ||||
"Second value") | "Second value") | |||
* Fixed a bug in the example in Section 2.4.2 ("29" -> "49") | * Applied numerous small editorial changes | |||
* Fixed a bug in the last paragraph of Section 3.6 ("0b000_11101" -> | * Added a few tables for illustration | |||
"0b000_11001") | ||||
* More stringently used terminology for well-formed and valid data, | ||||
avoiding less well-defined alternative terms such as "syntax | ||||
error", "decoding error" and "strict mode" outside examples | ||||
* Streamlined terminology to talk about tags, tag numbers, and tag | ||||
content | ||||
* Clarified the restrictions on tag content, in general and | ||||
specifically for tag 1 | ||||
* Added text about the CBOR data model and its small variations | ||||
(basic generic, extended generic, specific) | ||||
* More clearly separated integers from floating-point values; | ||||
provided a suggestion (based on I-JSON [RFC7493]) for handling | ||||
these types when converting JSON to CBOR | ||||
* Added term "preferred serialization" and defined it for various | ||||
kinds of data items | ||||
* Added comment about tags with semantics that depend on | ||||
serialization order | ||||
* Defined "deterministic encoding", making use of "preferred | ||||
serialization", and simplified the suggested map ordering for the | ||||
"Core Deterministic Encoding Requirements", easing implementation, | ||||
while keeping RFC 7049 map ordering as an alternative "length- | ||||
first map key ordering"; now avoiding the terms "canonical" and | ||||
"canonicalization" | ||||
* Clarified map validity (handling of duplicate keys) and explained | ||||
the domain of applicability of certain implementation choices | ||||
* Updated IANA considerations | ||||
* Added security considerations | ||||
* Clarified handling of non-well-formed simple values in text and | ||||
pseudocode | ||||
* Added Appendix G, well-formedness errors and examples | ||||
* Removed UBJSON from Appendix E, as that format has completely | ||||
changed since RFC 7049; added reference to [RFC8618] | ||||
Appendix G. Well-formedness errors and examples | Appendix G. Well-formedness errors and examples | |||
There are three basic kinds of well-formedness errors that can occur | There are three basic kinds of well-formedness errors that can occur | |||
in decoding a CBOR data item: | in decoding a CBOR data item: | |||
* Too much data: There are input bytes left that were not consumed. | * Too much data: There are input bytes left that were not consumed. | |||
This is only an error if the application assumed that the input | This is only an error if the application assumed that the input | |||
bytes would span exactly one data item. Where the application | bytes would span exactly one data item. Where the application | |||
uses the self-delimiting nature of CBOR encoding to permit | uses the self-delimiting nature of CBOR encoding to permit | |||
additional data after the data item, as is for example done in | additional data after the data item, as is for example done in | |||
CBOR sequences [I-D.ietf-cbor-sequence], the CBOR decoder can | CBOR sequences [RFC8742], the CBOR decoder can simply indicate | |||
simply indicate what part of the input has not been consumed. | what part of the input has not been consumed. | |||
* Too little data: The input data available would need additional | * Too little data: The input data available would need additional | |||
bytes added at their end for a complete CBOR data item. This may | bytes added at their end for a complete CBOR data item. This may | |||
indicate the input is truncated; it is also a common error when | indicate the input is truncated; it is also a common error when | |||
trying to decode random data as CBOR. For some applications | trying to decode random data as CBOR. For some applications | |||
however, this may not be actually be an error, as the application | however, this may not actually be an error, as the application may | |||
may not be certain it has all the data yet and can obtain or wait | not be certain it has all the data yet and can obtain or wait for | |||
for additional input bytes. Some of these applications may have | additional input bytes. Some of these applications may have an | |||
an upper limit for how much additional data can show up; here the | upper limit for how much additional data can show up; here the | |||
decoder may be able to indicate that the encoded CBOR data item | decoder may be able to indicate that the encoded CBOR data item | |||
cannot be completed within this limit. | cannot be completed within this limit. | |||
* Syntax error: The input data are not consistent with the | * Syntax error: The input data are not consistent with the | |||
requirements of the CBOR encoding, and this cannot be remedied by | requirements of the CBOR encoding, and this cannot be remedied by | |||
adding (or removing) data at the end. | adding (or removing) data at the end. | |||
In Appendix C, errors of the first kind are addressed in the first | In Appendix C, errors of the first kind are addressed in the first | |||
paragraph/bullet list (requiring "no bytes are left"), and errors of | paragraph/bullet list (requiring "no bytes are left"), and errors of | |||
the second kind are addressed in the second paragraph/bullet list | the second kind are addressed in the second paragraph/bullet list | |||
skipping to change at page 67, line 31 ¶ | skipping to change at page 72, line 31 ¶ | |||
00 00, fb 00 00 00 | 00 00, fb 00 00 00 | |||
* Definite length strings with short data: 41, 61, 5a ff ff ff ff | * Definite length strings with short data: 41, 61, 5a ff ff ff ff | |||
00, 5b ff ff ff ff ff ff ff ff 01 02 03, 7a ff ff ff ff 00, 7b 7f | 00, 5b ff ff ff ff ff ff ff ff 01 02 03, 7a ff ff ff ff 00, 7b 7f | |||
ff ff ff ff ff ff ff 01 02 03 | ff ff ff ff ff ff ff 01 02 03 | |||
* Definite length maps and arrays not closed with enough items: 81, | * Definite length maps and arrays not closed with enough items: 81, | |||
81 81 81 81 81 81 81 81 81, 82 00, a1, a2 01 02, a1 00, a2 00 00 | 81 81 81 81 81 81 81 81 81, 82 00, a1, a2 01 02, a1 00, a2 00 00 | |||
00 | 00 | |||
* Tag number not followed by tag content: c0 | ||||
* Indefinite length strings not closed by a break stop code: 5f 41 | * Indefinite length strings not closed by a break stop code: 5f 41 | |||
00, 7f 61 00 | 00, 7f 61 00 | |||
* Indefinite length maps and arrays not closed by a break stop code: | * Indefinite length maps and arrays not closed by a break stop code: | |||
9f, 9f 01 02, bf, bf 01 02 01 02, 81 9f, 9f 80 00, 9f 9f 9f 9f 9f | 9f, 9f 01 02, bf, bf 01 02 01 02, 81 9f, 9f 80 00, 9f 9f 9f 9f 9f | |||
ff ff ff ff, 9f 81 9f 81 9f 9f ff ff ff | ff ff ff ff, 9f 81 9f 81 9f 9f ff ff ff | |||
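The examples in the list above can be checked mechanically. The following Python sketch is loosely modelled on the take()-based approach of Appendix C, but it is not the pseudocode of Figure 1 and is not normative. The function name classify and the returned strings are hypothetical, and the mapping of a failing take() to EOFError and of all other failures to ValueError is an assumption made here to separate the three kinds of error.

```python
def classify(hex_string):
    """Return 'well-formed', 'too much data', 'too little data'
    or 'syntax error' for input given as a hex string."""
    data = bytes.fromhex(hex_string)
    pos = 0

    def take(n):
        nonlocal pos
        if n > len(data) - pos:          # checked without allocating n bytes
            raise EOFError
        chunk = data[pos:pos + n]
        pos += n
        return chunk

    def well_formed(breakable=False):
        ib = take(1)[0]
        mt, ai = ib >> 5, ib & 0x1F
        val = ai
        if 24 <= ai <= 27:
            val = int.from_bytes(take(1 << (ai - 24)), "big")
        elif ai in (28, 29, 30):
            raise ValueError("reserved additional information")
        elif ai == 31:
            return indefinite(mt, breakable)
        if mt in (2, 3):
            take(val)                    # string content
        elif mt == 4:
            for _ in range(val):
                well_formed()
        elif mt == 5:
            for _ in range(2 * val):
                well_formed()
        elif mt == 6:
            well_formed()                # tag content
        elif mt == 7 and ai == 24 and val < 32:
            raise ValueError("simple value below 32 in two-byte form")
        return mt

    def indefinite(mt, breakable):
        if mt in (2, 3):
            while True:
                chunk_type = well_formed(True)
                if chunk_type == -1:
                    break
                if chunk_type != mt:     # chunks must be definite, same type
                    raise ValueError("bad chunk")
        elif mt == 4:
            while well_formed(True) != -1:
                pass
        elif mt == 5:
            while well_formed(True) != -1:
                well_formed()            # value for the key just read
        elif mt == 7:
            if breakable:
                return -1                # "break" stop code
            raise ValueError("break outside indefinite-length item")
        else:
            raise ValueError("indefinite length not allowed here")
        return 99                        # indefinite-length item completed

    try:
        well_formed()
    except EOFError:
        return "too little data"
    except ValueError:
        return "syntax error"
    return "well-formed" if pos == len(data) else "too much data"
```

With this sketch, classify("5b ff ff ff ff ff ff ff ff 01 02 03") and classify("9f 81 9f 81 9f 9f ff ff ff") both return "too little data", while classify("00 00") returns "too much data".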
A few examples for the five subkinds of well-formedness error kind 3 | A few examples for the five subkinds of well-formedness error kind 3 | |||
(syntax error) are shown below. | (syntax error) are shown below. | |||
End of changes. 135 change blocks. | ||||
356 lines changed or deleted | 600 lines changed or added | |||