draft-ietf-cbor-7049bis-08.txt   draft-ietf-cbor-7049bis-09.txt 
Network Working Group C. Bormann Network Working Group C. Bormann
Internet-Draft Universitaet Bremen TZI Internet-Draft Universitaet Bremen TZI
Obsoletes: 7049 (if approved) P. Hoffman Obsoletes: 7049 (if approved) P. Hoffman
Intended status: Standards Track ICANN Intended status: Standards Track ICANN
Expires: May 8, 2020 November 05, 2019 Expires: May 8, 2020 November 05, 2019
Concise Binary Object Representation (CBOR) Concise Binary Object Representation (CBOR)
draft-ietf-cbor-7049bis-08 draft-ietf-cbor-7049bis-09
Abstract Abstract
The Concise Binary Object Representation (CBOR) is a data format The Concise Binary Object Representation (CBOR) is a data format
whose design goals include the possibility of extremely small code whose design goals include the possibility of extremely small code
size, fairly small message size, and extensibility without the need size, fairly small message size, and extensibility without the need
for version negotiation. These design goals make it different from for version negotiation. These design goals make it different from
earlier binary serializations such as ASN.1 and MessagePack. earlier binary serializations such as ASN.1 and MessagePack.
This document is a revised edition of RFC 7049, with editorial This document is a revised edition of RFC 7049, with editorial
skipping to change at page 2, line 40 skipping to change at page 2, line 40
2.1. Extended Generic Data Models . . . . . . . . . . . . . . 8 2.1. Extended Generic Data Models . . . . . . . . . . . . . . 8
2.2. Specific Data Models . . . . . . . . . . . . . . . . . . 9 2.2. Specific Data Models . . . . . . . . . . . . . . . . . . 9
3. Specification of the CBOR Encoding . . . . . . . . . . . . . 9 3. Specification of the CBOR Encoding . . . . . . . . . . . . . 9
3.1. Major Types . . . . . . . . . . . . . . . . . . . . . . . 11 3.1. Major Types . . . . . . . . . . . . . . . . . . . . . . . 11
3.2. Indefinite Lengths for Some Major Types . . . . . . . . . 13 3.2. Indefinite Lengths for Some Major Types . . . . . . . . . 13
3.2.1. The "break" Stop Code . . . . . . . . . . . . . . . . 13 3.2.1. The "break" Stop Code . . . . . . . . . . . . . . . . 13
3.2.2. Indefinite-Length Arrays and Maps . . . . . . . . . . 14 3.2.2. Indefinite-Length Arrays and Maps . . . . . . . . . . 14
3.2.3. Indefinite-Length Byte Strings and Text Strings . . . 16 3.2.3. Indefinite-Length Byte Strings and Text Strings . . . 16
3.3. Floating-Point Numbers and Values with No Content . . . . 16 3.3. Floating-Point Numbers and Values with No Content . . . . 16
3.4. Tagging of Items . . . . . . . . . . . . . . . . . . . . 18 3.4. Tagging of Items . . . . . . . . . . . . . . . . . . . . 18
3.4.1. Date and Time . . . . . . . . . . . . . . . . . . . . 20 3.4.1. Date and Time . . . . . . . . . . . . . . . . . . . . 21
3.4.2. Standard Date/Time String . . . . . . . . . . . . . . 20 3.4.2. Standard Date/Time String . . . . . . . . . . . . . . 21
3.4.3. Epoch-based Date/Time . . . . . . . . . . . . . . . . 21 3.4.3. Epoch-based Date/Time . . . . . . . . . . . . . . . . 21
3.4.4. Bignums . . . . . . . . . . . . . . . . . . . . . . . 21 3.4.4. Bignums . . . . . . . . . . . . . . . . . . . . . . . 22
3.4.5. Decimal Fractions and Bigfloats . . . . . . . . . . . 22 3.4.5. Decimal Fractions and Bigfloats . . . . . . . . . . . 22
3.4.6. Content Hints . . . . . . . . . . . . . . . . . . . . 24 3.4.6. Content Hints . . . . . . . . . . . . . . . . . . . . 24
3.4.6.1. Encoded CBOR Data Item . . . . . . . . . . . . . 24 3.4.6.1. Encoded CBOR Data Item . . . . . . . . . . . . . 24
3.4.6.2. Expected Later Encoding for CBOR-to-JSON 3.4.6.2. Expected Later Encoding for CBOR-to-JSON
Converters . . . . . . . . . . . . . . . . . . . 24 Converters . . . . . . . . . . . . . . . . . . . 24
3.4.6.3. Encoded Text . . . . . . . . . . . . . . . . . . 25 3.4.6.3. Encoded Text . . . . . . . . . . . . . . . . . . 25
3.4.7. Self-Described CBOR . . . . . . . . . . . . . . . . . 26 3.4.7. Self-Described CBOR . . . . . . . . . . . . . . . . . 26
4. Serialization Considerations . . . . . . . . . . . . . . . . 26 4. Serialization Considerations . . . . . . . . . . . . . . . . 26
4.1. Preferred Serialization . . . . . . . . . . . . . . . . . 26 4.1. Preferred Serialization . . . . . . . . . . . . . . . . . 26
4.2. Deterministically Encoded CBOR . . . . . . . . . . . . . 27 4.2. Deterministically Encoded CBOR . . . . . . . . . . . . . 27
4.2.1. Core Deterministic Encoding Requirements . . . . . . 27 4.2.1. Core Deterministic Encoding Requirements . . . . . . 28
4.2.2. Additional Deterministic Encoding Considerations . . 28 4.2.2. Additional Deterministic Encoding Considerations . . 29
4.2.3. Length-first map key ordering . . . . . . . . . . . . 30 4.2.3. Length-first map key ordering . . . . . . . . . . . . 30
5. Creating CBOR-Based Protocols . . . . . . . . . . . . . . . . 31 5. Creating CBOR-Based Protocols . . . . . . . . . . . . . . . . 31
5.1. CBOR in Streaming Applications . . . . . . . . . . . . . 31 5.1. CBOR in Streaming Applications . . . . . . . . . . . . . 32
5.2. Generic Encoders and Decoders . . . . . . . . . . . . . . 32 5.2. Generic Encoders and Decoders . . . . . . . . . . . . . . 32
5.3. Validity of Items . . . . . . . . . . . . . . . . . . . . 32 5.3. Validity of Items . . . . . . . . . . . . . . . . . . . . 33
5.3.1. Basic validity . . . . . . . . . . . . . . . . . . . 33 5.3.1. Basic validity . . . . . . . . . . . . . . . . . . . 33
5.3.2. Tag validity . . . . . . . . . . . . . . . . . . . . 33 5.3.2. Tag validity . . . . . . . . . . . . . . . . . . . . 34
5.4. Handling Unknown Simple Values and Tag numbers . . . . . 33 5.4. Validity and Evolution . . . . . . . . . . . . . . . . . 34
5.5. Numbers . . . . . . . . . . . . . . . . . . . . . . . . . 34 5.5. Numbers . . . . . . . . . . . . . . . . . . . . . . . . . 35
5.6. Specifying Keys for Maps . . . . . . . . . . . . . . . . 35 5.6. Specifying Keys for Maps . . . . . . . . . . . . . . . . 36
5.6.1. Equivalence of Keys . . . . . . . . . . . . . . . . . 36 5.6.1. Equivalence of Keys . . . . . . . . . . . . . . . . . 37
5.7. Undefined Values . . . . . . . . . . . . . . . . . . . . 37 5.7. Undefined Values . . . . . . . . . . . . . . . . . . . . 38
5.8. Validity Checking and Robustness . . . . . . . . . . . . 37
6. Converting Data between CBOR and JSON . . . . . . . . . . . . 38 6. Converting Data between CBOR and JSON . . . . . . . . . . . . 38
6.1. Converting from CBOR to JSON . . . . . . . . . . . . . . 38 6.1. Converting from CBOR to JSON . . . . . . . . . . . . . . 38
6.2. Converting from JSON to CBOR . . . . . . . . . . . . . . 40 6.2. Converting from JSON to CBOR . . . . . . . . . . . . . . 39
7. Future Evolution of CBOR . . . . . . . . . . . . . . . . . . 41 7. Future Evolution of CBOR . . . . . . . . . . . . . . . . . . 40
7.1. Extension Points . . . . . . . . . . . . . . . . . . . . 41 7.1. Extension Points . . . . . . . . . . . . . . . . . . . . 41
7.2. Curating the Additional Information Space . . . . . . . . 42 7.2. Curating the Additional Information Space . . . . . . . . 42
8. Diagnostic Notation . . . . . . . . . . . . . . . . . . . . . 42 8. Diagnostic Notation . . . . . . . . . . . . . . . . . . . . . 42
8.1. Encoding Indicators . . . . . . . . . . . . . . . . . . . 43 8.1. Encoding Indicators . . . . . . . . . . . . . . . . . . . 43
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 44 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 44
9.1. Simple Values Registry . . . . . . . . . . . . . . . . . 44 9.1. Simple Values Registry . . . . . . . . . . . . . . . . . 44
9.2. Tags Registry . . . . . . . . . . . . . . . . . . . . . . 44 9.2. Tags Registry . . . . . . . . . . . . . . . . . . . . . . 44
9.3. Media Type ("MIME Type") . . . . . . . . . . . . . . . . 45 9.3. Media Type ("MIME Type") . . . . . . . . . . . . . . . . 45
9.4. CoAP Content-Format . . . . . . . . . . . . . . . . . . . 46 9.4. CoAP Content-Format . . . . . . . . . . . . . . . . . . . 46
9.5. The +cbor Structured Syntax Suffix Registration . . . . . 46 9.5. The +cbor Structured Syntax Suffix Registration . . . . . 46
skipping to change at page 7, line 4 skipping to change at page 6, line 50
Data Stream: A sequence of zero or more data items, not further Data Stream: A sequence of zero or more data items, not further
assembled into a larger containing data item. The independent assembled into a larger containing data item. The independent
data items that make up a data stream are sometimes also referred data items that make up a data stream are sometimes also referred
to as "top-level data items". to as "top-level data items".
Well-formed: A data item that follows the syntactic structure of Well-formed: A data item that follows the syntactic structure of
CBOR. A well-formed data item uses the initial bytes and the byte CBOR. A well-formed data item uses the initial bytes and the byte
strings and/or data items that are implied by their values as strings and/or data items that are implied by their values as
defined in CBOR and does not include following extraneous data. defined in CBOR and does not include following extraneous data.
CBOR decoders by definition only return contents from well-formed CBOR decoders by definition only return contents from well-formed
data items. data items.
Valid: A data item that is well-formed and also follows the semantic Valid: A data item that is well-formed and also follows the semantic
restrictions that apply to CBOR data items. restrictions that apply to CBOR data items.
Expected: Besides its normal English meaning, the term "expected" is Expected: Besides its normal English meaning, the term "expected" is
used to describe requirements beyond CBOR validity that an used to describe requirements beyond CBOR validity that an
application has on its input data. Well-formed (processable at application has on its input data. Well-formed (processable at
all), valid (checked by a valdity-checking generic decoder), and all), valid (checked by a validity-checking generic decoder), and
expected (checked by the application) form a hierarchy of layers expected (checked by the application) form a hierarchy of layers
of acceptability. of acceptability.
Stream decoder: A process that decodes a data stream and makes each Stream decoder: A process that decodes a data stream and makes each
of the data items in the sequence available to an application as of the data items in the sequence available to an application as
they are received. they are received.
Where bit arithmetic or data types are explained, this document uses Where bit arithmetic or data types are explained, this document uses
the notation familiar from the programming language C, except that the notation familiar from the programming language C, except that
"**" denotes exponentiation. Similar to the "0x" notation for "**" denotes exponentiation. Similar to the "0x" notation for
skipping to change at page 20, line 37 skipping to change at page 20, line 37
| | string | | | | string | |
| | | | | | | |
| 36 | text | MIME message; see Section 3.4.6.3 | | 36 | text | MIME message; see Section 3.4.6.3 |
| | string | | | | string | |
| | | | | | | |
| 55799 | multiple | Self-described CBOR; see Section 3.4.7 | | 55799 | multiple | Self-described CBOR; see Section 3.4.7 |
+----------+----------+---------------------------------------------+ +----------+----------+---------------------------------------------+
Table 4: Tag numbers defined in RFC 7049 Table 4: Tag numbers defined in RFC 7049
Conceptually, tags are interpreted in the generic data model, not at
(de-)serialization time. A small number of tags (specifically, tag
number 25 and tag number 29) have been registered with semantics that
do require processing at (de-)serialization time: The decoder needs
to be aware and the encoder needs to be under control of the exact
sequence in which data items are encoded into the CBOR data stream.
This means these tags cannot be implemented on top of every generic
CBOR encoder/decoder (which might not reflect the serialization order
for entries in a map at the data model level and vice versa); their
implementation therefore typically needs to be integrated into the
generic encoder/decoder. The definition of new tags with this
property is NOT RECOMMENDED.
3.4.1. Date and Time 3.4.1. Date and Time
Protocols using tag numbers 0 and 1 extend the generic data model Protocols using tag numbers 0 and 1 extend the generic data model
(Section 2) with data items representing points in time. (Section 2) with data items representing points in time.
3.4.2. Standard Date/Time String 3.4.2. Standard Date/Time String
Tag number 0 contains a text string in the standard format described Tag number 0 contains a text string in the standard format described
by the "date-time" production in [RFC3339], as refined by Section 3.3 by the "date-time" production in [RFC3339], as refined by Section 3.3
of [RFC4287], representing the point in time described there. A of [RFC4287], representing the point in time described there. A
skipping to change at page 25, line 46 skipping to change at page 26, line 6
(Note that more specific identification may be necessary if the (Note that more specific identification may be necessary if the
actual version of the specification underlying the regular actual version of the specification underlying the regular
expression, or more than just the text of the regular expression expression, or more than just the text of the regular expression
itself, need to be conveyed.) Any contained string value is itself, need to be conveyed.) Any contained string value is
valid. valid.
o Tag number 36 is for MIME messages (including all headers), as o Tag number 36 is for MIME messages (including all headers), as
defined in [RFC2045]. A text string that isn't a valid MIME defined in [RFC2045]. A text string that isn't a valid MIME
message is invalid. (For this tag, validity checking may be message is invalid. (For this tag, validity checking may be
particularly onerous for a generic decoder and might therefore not particularly onerous for a generic decoder and might therefore not
be offered.) be offered. Note that many MIME messages are general binary data
and can therefore not be represented in a text string;
[IANA.cbor-tags] lists a registration for tag number 257 that is
similar to tag number 36 but is used with an enclosed byte
string.)
Note that tag numbers 33 and 34 differ from 21 and 22 in that the Note that tag numbers 33 and 34 differ from 21 and 22 in that the
data is transported in base-encoded form for the former and in raw data is transported in base-encoded form for the former and in raw
byte string form for the latter. byte string form for the latter.
3.4.7. Self-Described CBOR 3.4.7. Self-Described CBOR
In many applications, it will be clear from the context that CBOR is In many applications, it will be clear from the context that CBOR is
being employed for encoding a data item. For instance, a specific being employed for encoding a data item. For instance, a specific
protocol might specify the use of CBOR, or a media type is indicated protocol might specify the use of CBOR, or a media type is indicated
skipping to change at page 28, line 49 skipping to change at page 29, line 16
If a protocol allows for IEEE floats, then additional deterministic If a protocol allows for IEEE floats, then additional deterministic
encoding rules might need to be added. One example rule might be to encoding rules might need to be added. One example rule might be to
have all floats start as a 64-bit float, then do a test conversion to have all floats start as a 64-bit float, then do a test conversion to
a 32-bit float; if the result is the same numeric value, use the a 32-bit float; if the result is the same numeric value, use the
shorter value and repeat the process with a test conversion to a shorter value and repeat the process with a test conversion to a
16-bit float. (This rule selects 16-bit float for positive and 16-bit float. (This rule selects 16-bit float for positive and
negative Infinity as well.) Although IEEE floats can represent both negative Infinity as well.) Although IEEE floats can represent both
positive and negative zero as distinct values, the application might positive and negative zero as distinct values, the application might
not distinguish these and might decide to represent all zero values not distinguish these and might decide to represent all zero values
with a positive sign, disallowing negative zero. Also, there are with a positive sign, disallowing negative zero.
many representations for NaN. If NaN is an allowed value, it must
always be represented as 0xf97e00.
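The example rule above (start from a 64-bit float, test-convert to 32 bits, then to 16 bits) can be sketched in Python roughly as follows. This is a minimal illustration, not text from either draft: the function names are hypothetical, and mapping every NaN to 0xf97e00 assumes a protocol that has picked that single representation and does not carry NaN payloads.

```python
import math
import struct


def _round_trips(fmt: str, value: float) -> bool:
    """True if value survives packing into the given IEEE 754 width unchanged."""
    try:
        return struct.unpack(fmt, struct.pack(fmt, value))[0] == value
    except OverflowError:
        # Finite value too large for the narrower format.
        return False


def encode_float_shortest(value: float) -> bytes:
    """Encode a float as the shortest CBOR float that keeps its numeric value."""
    if math.isnan(value):
        # Assumption: the protocol chose 0xf97e00 as its only NaN encoding.
        return bytes.fromhex("f97e00")
    if not _round_trips(">f", value):
        return b"\xfb" + struct.pack(">d", value)  # 64-bit float
    if not _round_trips(">e", value):
        return b"\xfa" + struct.pack(">f", value)  # 32-bit float
    return b"\xf9" + struct.pack(">e", value)      # 16-bit float
```

Note that the comparison is numeric, so negative zero is kept as a 16-bit negative zero; a protocol that disallows negative zero would need an extra step before encoding.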
CBOR tags present additional considerations for deterministic CBOR tags present additional considerations for deterministic
encoding. If a CBOR-based protocol were to provide the same encoding. If a CBOR-based protocol were to provide the same
semantics for the presence and absence of a specific tag (e.g., by semantics for the presence and absence of a specific tag (e.g., by
allowing both tag 1 data items and raw numbers in a date/time allowing both tag 1 data items and raw numbers in a date/time
position, treating the latter as if they were tagged), the position, treating the latter as if they were tagged), the
deterministic format would not allow them. In a protocol that deterministic format would not allow them. In a protocol that
requires tags in certain places to obtain specific semantics, the tag requires tags in certain places to obtain specific semantics, the tag
needs to appear in the deterministic format as well. needs to appear in the deterministic format as well.
skipping to change at page 29, line 35 skipping to change at page 29, line 48
major types 0 and 1, and other values as the smallest of 16-, major types 0 and 1, and other values as the smallest of 16-,
32-, or 64-bit floating point that accurately represents the 32-, or 64-bit floating point that accurately represents the
value, value,
2. Encode all values as the smallest of 16-, 32-, or 64-bit 2. Encode all values as the smallest of 16-, 32-, or 64-bit
floating point that accurately represents the value, even for floating point that accurately represents the value, even for
integral values, or integral values, or
3. Encode all values as 64-bit floating point. 3. Encode all values as 64-bit floating point.
If NaN is an allowed value, the protocol needs to pick a single Rule 1 straddles the boundaries between integers and floating
representation, for example 0xf97e00. point values, and Rule 3 does not use preferred encoding, so Rule
2 may be a good choice in many cases.
If NaN is an allowed value and there is no intent to support NaN
payloads or signaling NaNs, the protocol needs to pick a single
representation, for example 0xf97e00. If that simple choice is
not possible, specific attention will be needed for NaN handling.
Subnormal numbers (nonzero numbers with the lowest possible
exponent of a given IEEE 754 number format) may be flushed to zero
outputs or be treated as zero inputs in some floating point
implementations. A protocol's deterministic encoding may want to
exclude them from interchange, interchanging zero instead.
o If a protocol includes a field that can express integers with an o If a protocol includes a field that can express integers with an
absolute value of 2^64 or larger using tag numbers 2 or 3 absolute value of 2^64 or larger using tag numbers 2 or 3
(Section 3.4.4), the protocol's deterministic encoding needs to (Section 3.4.4), the protocol's deterministic encoding needs to
specify whether small integers are expressed using the tag or specify whether small integers are expressed using the tag or
major types 0 and 1. major types 0 and 1.
o A protocol might give encoders the choice of representing a URL as o A protocol might give encoders the choice of representing a URL as
either a text string or, using Section 3.4.6.3, tag number 32 either a text string or, using Section 3.4.6.3, tag number 32
containing a text string. This protocol's deterministic encoding containing a text string. This protocol's deterministic encoding
skipping to change at page 33, line 13 skipping to change at page 33, line 37
2. Issue an error and stop processing altogether. 2. Issue an error and stop processing altogether.
A CBOR-based protocol MUST specify which of these options its A CBOR-based protocol MUST specify which of these options its
decoders take, for each kind of invalid item they might encounter. decoders take, for each kind of invalid item they might encounter.
Such problems might occur at the basic validity level of CBOR or in Such problems might occur at the basic validity level of CBOR or in
the context of tags (tag validity). the context of tags (tag validity).
5.3.1. Basic validity 5.3.1. Basic validity
Two kinds of validity errors can occur in the basic generic data
model:
Duplicate keys in a map: Generic decoders (Section 5.2) make data Duplicate keys in a map: Generic decoders (Section 5.2) make data
available to applications using the native CBOR data model. That available to applications using the native CBOR data model. That
data model includes maps (key-value mappings with unique keys), data model includes maps (key-value mappings with unique keys),
not multimaps (key-value mappings where multiple entries can have not multimaps (key-value mappings where multiple entries can have
the same key). Thus, a generic decoder that gets a CBOR map item the same key). Thus, a generic decoder that gets a CBOR map item
that has duplicate keys will decode to a map with only one that has duplicate keys will decode to a map with only one
instance of that key, or it might stop processing altogether. On instance of that key, or it might stop processing altogether. On
the other hand, a "streaming decoder" may not even be able to the other hand, a "streaming decoder" may not even be able to
notice (Section 5.6). notice (Section 5.6).
Invalid UTF-8 string: A decoder might or might not want to verify Invalid UTF-8 string: A decoder might or might not want to verify
that the sequence of bytes in a UTF-8 string (major type 3) is that the sequence of bytes in a UTF-8 string (major type 3) is
actually valid UTF-8 and react appropriately. actually valid UTF-8 and react appropriately.
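The two basic validity checks described above might look like this in a decoder that chooses to perform them. This is a sketch under stated assumptions: the helper names are hypothetical, map entries are assumed to arrive as a sequence of already decoded (key, value) pairs, and keys are assumed to be hashable Python values. Python's own equality is used for the duplicate test, which treats 1 and 1.0 as the same key; a real decoder would need a comparison that follows the CBOR data model instead.

```python
from typing import Any, Dict, Iterable, Tuple


def check_text_string(raw: bytes) -> str:
    """Return the text for a major type 3 payload, or raise on invalid UTF-8."""
    try:
        return raw.decode("utf-8")  # strict decoding: rejects malformed input
    except UnicodeDecodeError as exc:
        raise ValueError("invalid UTF-8 in text string") from exc


def build_map(pairs: Iterable[Tuple[Any, Any]]) -> Dict[Any, Any]:
    """Build a map from decoded entries, stopping on the first duplicate key."""
    result: Dict[Any, Any] = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate map key: {key!r}")
        result[key] = value
    return result
```

Here the decoder stops with an error in both cases; which reaction is appropriate remains a choice for the protocol.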
5.3.2. Tag validity 5.3.2. Tag validity
Two additional kinds of validity errors are introduced by adding tags
to the basic generic data model:
Inadmissible type for tag content: Tags (Section 3.4) specify what Inadmissible type for tag content: Tags (Section 3.4) specify what
type of data item is supposed to be enclosed by the tag; for type of data item is supposed to be enclosed by the tag; for
example, the tags for positive or negative bignums are supposed to example, the tags for positive or negative bignums are supposed to
be put on byte strings. A decoder that decodes the tagged data be put on byte strings. A decoder that decodes the tagged data
item into a native representation (a native big integer in this item into a native representation (a native big integer in this
example) is expected to check the type of the data item being example) is expected to check the type of the data item being
tagged. Even decoders that don't have such native representations tagged. Even decoders that don't have such native representations
available in their environment may perform the check on those tags available in their environment may perform the check on those tags
known to them and react appropriately. known to them and react appropriately.
Inadmissible value for tag content: The type of data item may be Inadmissible value for tag content: The type of data item may be
admissible for a tag's content, but the specific value may not be; admissible for a tag's content, but the specific value may not be;
e.g., a value of "yesterday" is not acceptable for the content of e.g., a value of "yesterday" is not acceptable for the content of
tag 0, even though it properly is a text string. A decoder that tag 0, even though it properly is a text string. A decoder that
normally ingests such tags into equivalent platform types might normally ingests such tags into equivalent platform types might
present this tag to the application in a similar way to how it present this tag to the application in a similar way to how it
would present a tag with an unknown tag number (Section 5.4). would present a tag with an unknown tag number (Section 5.4).
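As an illustration of the two tag validity checks, the sketch below handles bignums (tag numbers 2 and 3) and the date/time string of tag number 0. The function name and the choice of exceptions are hypothetical. The regular expression only approximates the shape of the "date-time" production with the uppercase "T" and "Z" that [RFC4287] asks for; it does not check field ranges, so it is weaker than a full check.

```python
import re
from typing import Any

# Shape-only approximation of an RFC 3339 date-time (assumption: no range checks).
_DATE_TIME = re.compile(
    r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})"
)


def check_tag_content(tag_number: int, content: Any) -> Any:
    """Check the content of a few known tags; return a native value."""
    if tag_number in (2, 3):
        if not isinstance(content, bytes):
            # Inadmissible type: bignum tags must enclose a byte string.
            raise TypeError("bignum tag content must be a byte string")
        n = int.from_bytes(content, "big")
        return n if tag_number == 2 else -1 - n
    if tag_number == 0:
        if not isinstance(content, str):
            raise TypeError("tag 0 content must be a text string")
        if not _DATE_TIME.fullmatch(content):
            # Inadmissible value: right type, wrong content (e.g. "yesterday").
            raise ValueError("tag 0 content is not a date-time string")
        return content
    return content  # other tag numbers are not checked by this sketch
```

A decoder could also catch the ValueError and hand the tag to the application unconverted, in the way it would present an unknown tag number.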
5.4. Handling Unknown Simple Values and Tag numbers 5.4. Validity and Evolution
A decoder that comes across a simple value (Section 3.3) that it does A decoder with validity checking will expend the effort to reliably
not recognize, such as a value that was added to the IANA registry detect data items with validity errors. For example, such a decoder
after the decoder was deployed or a value that the decoder chose not needs to have an API that reports an error (and does not return data)
to implement, might issue a warning, might stop processing for a CBOR data item that contains any of the validity errors listed
altogether, might handle the error by making the unknown value in the previous subsection.
available to the application as such (as is expected of generic
decoders), or take some other type of action.
A decoder that comes across a tag number (Section 3.4) that it does The set of tags defined in the tag registry (Section 9.2), as well as
not recognize, such as a tag number that was added to the IANA the set of simple values defined in the simple values registry
registry after the decoder was deployed or a tag number that the (Section 9.1), can grow at any time beyond the set understood by a
decoder chose not to implement, might issue a warning, might stop generic decoder. A validity-checking decoder can do one of two
processing altogether, might handle the error and present the unknown things when it encounters such a case that it does not recognize:
tag number together with the enclosed data item to the application
(as is expected of generic decoders), or take some other type of o It can report an error (and not return data). Note that this
action. error is not a validity error per se. This kind of error is more
likely to be raised by a decoder that would be performing validity
checking if this were a known case.
o It can emit the unknown item (type, value, and, for tags, the
decoded tagged data item) to the application calling the decoder,
with an indication that the decoder did not recognize that tag
number or simple value.
The latter approach, which is also appropriate for decoders that do
not support validity checking, provides forward compatibility with
newly registered tags and simple values without the requirement to
update the encoder at the same time as the calling application. (For
this, the API for the decoder needs to have a way to mark unknown
items so that the calling application can handle them in a manner
appropriate for the program.)
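One way a decoder API could mark unknown items is to wrap them in a dedicated type, as in the following sketch. All names here (KNOWN_TAGS, UnknownTag, UnknownTagError, present_tag, the strict flag) are hypothetical and chosen for illustration only; the set of known tag numbers is an arbitrary example, and the conversion of known tags is left as a placeholder.

```python
from dataclasses import dataclass
from typing import Any

# Assumption: the tag numbers this example decoder happens to implement.
KNOWN_TAGS = frozenset({0, 1, 2, 3})


@dataclass(frozen=True)
class UnknownTag:
    """Marker handed to the application for a tag the decoder did not recognize."""
    number: int
    content: Any


class UnknownTagError(Exception):
    """Raised instead when the decoder is configured to reject unknown tags."""


def present_tag(number: int, content: Any, strict: bool = False) -> Any:
    """Decide how a decoded tag is presented to the calling application."""
    if number in KNOWN_TAGS:
        return content  # placeholder: a real decoder would convert known tags
    if strict:
        # First option: report an error and return no data.
        raise UnknownTagError(f"unrecognized tag number {number}")
    # Second option: emit the item, marked as unknown.
    return UnknownTag(number, content)
```

With strict left at False, an application built against a newer registry can still inspect UnknownTag.number and handle the content itself.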
Since some of the processing needed for validity checking may have an
appreciable cost (in particular with duplicate detection for maps),
support of validity checking is not a requirement placed on all CBOR
decoders.
Some encoders will rely on their applications to provide input data
in such a way that valid CBOR results from the encoder. A generic
encoder also may want to provide a validity-checking mode where it
reliably limits its output to valid CBOR, independent of whether or
not its application is indeed providing API-conformant data.
5.5. Numbers 5.5. Numbers
CBOR-based protocols should take into account that different language CBOR-based protocols should take into account that different language
environments pose different restrictions on the range and precision environments pose different restrictions on the range and precision
of numbers that are representable. For example, the JavaScript of numbers that are representable. For example, the JavaScript
number system treats all numbers as floating point, which may result number system treats all numbers as floating point, which may result
in silent loss of precision in decoding integers with more than 53 in silent loss of precision in decoding integers with more than 53
significant bits. A protocol that uses numbers should define its significant bits. A protocol that uses numbers should define its
expectations on the handling of non-trivial numbers in decoders and expectations on the handling of non-trivial numbers in decoders and
skipping to change at page 35, line 39 skipping to change at page 36, line 45
may want to reduce its overhead significantly by relying on its data may want to reduce its overhead significantly by relying on its data
source to maintain uniqueness. source to maintain uniqueness.
A CBOR-based protocol MUST define what to do when a receiving A CBOR-based protocol MUST define what to do when a receiving
application does see multiple identical keys in a map. The resulting application does see multiple identical keys in a map. The resulting
rule in the protocol MUST respect the CBOR data model: it cannot rule in the protocol MUST respect the CBOR data model: it cannot
prescribe a specific handling of the entries with the identical keys, prescribe a specific handling of the entries with the identical keys,
except that it might have a rule that having identical keys in a map except that it might have a rule that having identical keys in a map
indicates a malformed map and that the decoder has to stop with an indicates a malformed map and that the decoder has to stop with an
error. Duplicate keys are also prohibited by CBOR decoders that error. Duplicate keys are also prohibited by CBOR decoders that
enforce validity (Section 5.8). enforce validity (Section 5.4).
The CBOR data model for maps does not allow ascribing semantics to The CBOR data model for maps does not allow ascribing semantics to
the order of the key/value pairs in the map representation. Thus, a the order of the key/value pairs in the map representation. Thus, a
CBOR-based protocol MUST NOT specify that changing the key/value pair CBOR-based protocol MUST NOT specify that changing the key/value pair
order in a map would change the semantics, except to specify that order in a map would change the semantics, except to specify that
some orders are disallowed, for example where they would not meet some orders are disallowed, for example where they would not meet
the requirements of a deterministic encoding (Section 4.2). (Any the requirements of a deterministic encoding (Section 4.2). (Any
secondary effects of map ordering such as on timing, cache usage, and secondary effects of map ordering such as on timing, cache usage, and
other potential side channels are not considered part of the other potential side channels are not considered part of the
semantics but may be enough reason on its own for a protocol to semantics but may be enough reason on its own for a protocol to
skipping to change at page 37, line 14 skipping to change at page 38, line 19
distinguish values for map keys that are equal for this purpose at distinguish values for map keys that are equal for this purpose at
the generic data model level. the generic data model level.
5.7. Undefined Values 5.7. Undefined Values
In some CBOR-based protocols, the simple value (Section 3.3) of In some CBOR-based protocols, the simple value (Section 3.3) of
Undefined might be used by an encoder as a substitute for a data item Undefined might be used by an encoder as a substitute for a data item
with an encoding problem, in order to allow the rest of the enclosing with an encoding problem, in order to allow the rest of the enclosing
data items to be encoded without harm. data items to be encoded without harm.
5.8. Validity Checking and Robustness
Some areas of application of CBOR do not require deterministic
encoding (Section 4.2) but may require that different decoders reach
the same (semantically equivalent) results, even in the presence of
potentially malicious data. This can be required if one application
(such as a firewall or other protecting entity) makes a decision
based on the data that another application, which independently
decodes the data, relies on.
Normally, it is the responsibility of the sender to avoid ambiguously
decodable data. However, the sender might be an attacker specially
making up CBOR data such that it will be interpreted differently by
different decoders in an attempt to exploit that as a vulnerability.
Generic decoders used in applications where this might be a problem
can help by providing a validity-checking mode in which it is also
the responsibility of the generic decoder to reject invalid data. It
is expected that firewalls and other security systems that decode
CBOR will employ their decoders with validity checking applied.
A decoder with validity checking will expend the effort to reliably
detect invalid data items (Section 5.3). For example, such a decoder
needs to have an API that reports an error (and does not return data)
for a CBOR data item that contains any of the following:
o a map (major type 5) that has more than one entry with the same
key
o a tag that is used on a data item of the incorrect type
o a data item that is incorrectly formatted for the type given to
it, such as invalid UTF-8 in a text string or data that (even if
of the correct type) cannot be interpreted with the specific tag
number that it has been tagged with
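The first and third checks in the list above can be sketched as follows.
This is a minimal illustration, not part of the draft: it assumes a
hypothetical decoder that hands over each map as a list of (key, value)
pairs and each text string as its raw payload bytes. The names
InvalidCBORError, reject_duplicate_keys, and check_text_string are made up
for this sketch. Keys are assumed to be simple hashable values (integers,
floats, strings, byte strings); NaN keys and container keys are not handled.

```python
class InvalidCBORError(ValueError):
    """Signals invalid data; the caller gets an error, not decoded data."""


def reject_duplicate_keys(pairs):
    """Return the (key, value) pairs unchanged, or raise on a repeated key.

    Keys are compared together with their Python type so that the integer 1
    and the float 1.0 are not treated as the same key by this sketch.
    """
    seen = set()
    checked = []
    for key, value in pairs:
        identity = (type(key), key)
        if identity in seen:
            raise InvalidCBORError(f"duplicate map key: {key!r}")
        seen.add(identity)
        checked.append((key, value))
    return checked


def check_text_string(raw):
    """Decode a text string payload, raising if it is not valid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise InvalidCBORError("invalid UTF-8 in text string") from exc
```

Raising an exception rather than returning a partial result mirrors the
requirement that the API report an error and not return data.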
A validity-checking decoder can do one of two things when it
encounters a tag number or simple value that it does not recognize:
o It can report an error (and not return data).
o It can emit the unknown item (type, value, and, for tags, the
decoded tagged data item) to the application calling the decoder,
with an indication that the decoder did not recognize that tag
number or simple value.
The latter approach, which is also appropriate for decoders that do
not support validity checking, provides forward compatibility with
newly registered tags and simple values without the requirement to
update the encoder at the same time as the calling application. (For
this, the API for the decoder needs to have a way to mark unknown
items so that the calling application can handle them in a manner
appropriate for the program.)
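The two options for unrecognized tag numbers might look like this in a
decoder API. This is a sketch under stated assumptions: KNOWN_TAGS,
TaggedItem, UnknownTagError, handle_tag, and the strict flag are
hypothetical names chosen for illustration, and the set of recognized tag
numbers is an arbitrary small subset.

```python
from dataclasses import dataclass
from typing import Any

# Illustrative subset of tag numbers this hypothetical decoder recognizes.
KNOWN_TAGS = {0, 1, 2, 3}


class UnknownTagError(ValueError):
    """Raised in strict mode when a tag number is not recognized."""


@dataclass(frozen=True)
class TaggedItem:
    number: int        # the tag number as found in the encoded data
    content: Any       # the decoded tagged data item
    recognized: bool   # False marks an item the decoder did not recognize


def handle_tag(number, content, strict=False):
    """Apply one of the two policies for an unrecognized tag number."""
    if number in KNOWN_TAGS:
        return TaggedItem(number, content, recognized=True)
    if strict:
        # First option: report an error and return no data.
        raise UnknownTagError(f"unrecognized tag number {number}")
    # Second option: pass the item through, marked as unrecognized.
    return TaggedItem(number, content, recognized=False)
```

With the second option, the calling application can inspect the
recognized field and decide for itself how to treat the item.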
Since some of the processing needed for validity checking may have an
appreciable cost (in particular with duplicate detection for maps),
support of validity checking is not a requirement placed on all CBOR
decoders.
Some encoders will rely on their applications to provide input data
in such a way that valid CBOR results. A generic encoder also may
want to provide a validity-checking mode where it reliably limits its
output to valid CBOR, independent of whether or not its application
is providing API-conformant data.
6. Converting Data between CBOR and JSON 6. Converting Data between CBOR and JSON
This section gives non-normative advice about converting between CBOR This section gives non-normative advice about converting between CBOR
and JSON. Implementations of converters are free to use whichever and JSON. Implementations of converters are free to use whichever
advice here they want. advice here they want.
It is worth noting that a JSON text is a sequence of characters, not It is worth noting that a JSON text is a sequence of characters, not
an encoded sequence of bytes, while a CBOR data item consists of an encoded sequence of bytes, while a CBOR data item consists of
bytes, not characters. bytes, not characters.
skipping to change at page 49, line 15 skipping to change at page 49, line 15
to disrupt the encoder. to disrupt the encoder.
Protocols should be defined in such a way that potential multiple Protocols should be defined in such a way that potential multiple
interpretations are reliably reduced to a single interpretation. For interpretations are reliably reduced to a single interpretation. For
example, an attacker could make use of invalid input such as example, an attacker could make use of invalid input such as
duplicate keys in maps, or exploit different precision in processing duplicate keys in maps, or exploit different precision in processing
numbers to make one application base its decisions on a different numbers to make one application base its decisions on a different
interpretation than the one that will be used by a second interpretation than the one that will be used by a second
application. To facilitate consistent interpretation, encoder and application. To facilitate consistent interpretation, encoder and
decoder implementations should provide a validity checking mode of decoder implementations should provide a validity checking mode of
operation (Section 5.8). Note, however, that a generic decoder operation (Section 5.4). Note, however, that a generic decoder
cannot know about all requirements that an application poses on its cannot know about all requirements that an application poses on its
input data; it is therefore not relieving the application from input data; it is therefore not relieving the application from
performing its own input checking. Also, since the set of defined performing its own input checking. Also, since the set of defined
tag numbers evolves, the application may employ a tag number that is tag numbers evolves, the application may employ a tag number that is
not yet supported for validity checking by the generic decoder it not yet supported for validity checking by the generic decoder it
uses. Generic decoders therefore need to document which uses. Generic decoders therefore need to document which
tag numbers they support and what validity checking they can provide tag numbers they support and what validity checking they can provide
for each of them as well as for basic CBOR validity (UTF-8 checking, for each of them as well as for basic CBOR validity (UTF-8 checking,
duplicate map key checking). duplicate map key checking).
 End of changes. 22 change blocks. 
107 lines changed or deleted 98 lines changed or added