From 671e56e06e31c21317500fff78c80d02ed8ee43f Mon Sep 17 00:00:00 2001 From: Geoffrey Picron Date: Fri, 18 Aug 2023 11:57:35 +0200 Subject: [PATCH] BOOLNULL to Atom type & INT 64 bits --- 011.md | 40 +++++++++++++++++++++++++--------------- 1 file changed, 25 insertions(+), 15 deletions(-) diff --git a/011.md b/011.md index fcd82a8..2999e84 100644 --- a/011.md +++ b/011.md @@ -37,22 +37,23 @@ shows a simplified grammar for JSON (that ommits low-level rules e.g. for integers or floating point numbers and strings): ``` -JSON_VAL ::= JSON_ATOM / JSON_LIST / JSON_DICT -JSON_ATOM ::= 'true' / 'false' / 'null' / INT_VAL / DOUBLE_VAL / STR_VAL -JSON_LIST ::= '[' ']' / '[' JSON_VAL *(',' JSON_VAL) ']' -JSON_DICT ::= '{' '}' / '{' STR_VAL ':' JSON_VAL *(',' STR_VAL ':' JSON_VAL) '}' +JSON_VAL ::= JSON_PRIMITIVE / JSON_LIST / JSON_DICT +JSON_PRIMITIVE ::= 'true' / 'false' / 'null' / INT_VAL / DOUBLE_VAL / STR_VAL +JSON_LIST ::= '[' ']' / '[' JSON_VAL *(',' JSON_VAL) ']' +JSON_DICT ::= '{' '}' / '{' STR_VAL ':' JSON_VAL *(',' STR_VAL ':' JSON_VAL) '}' ``` BIPF is more expressive in that it also supports byte arrays. -Moreover, any atom can be used as a key for a dictionary, not only +Moreover, any primitive can be used as a key for a dictionary, not only strings. The following grammer for BIPF permits a direct comparison with the previous grammer for JSON: ``` -BIPF_VAL ::= BIPF_ATOM / BIPF_LIST / BIPF_DICT -BIPF_ATOM ::= 'true' / 'false' / 'null' / INT_VAL / DOUBLE_VAL / STR_VAL / BYTES_VAL -BIPF_LIST ::= '[' ']' / '[' BIPF_VAL *(',' BIPF_VAL) ']' -BIPF_DICT ::= '{' '}' / '{' BIPF_ATOM ':' BIPF_VAL *(',' BIPF_ATOM ':' BIPF_VAL) '}' +BIPF_VAL ::= BIPF_PRIMITIVE / BIPF_LIST / BIPF_DICT +BIPF_PRIMITIVE ::= ATOM_VAL / INT_VAL / DOUBLE_VAL / STR_VAL / BYTES_VAL +ATOM_VAL ::= 'true' (1) / 'false' (0) / 'null' (no value) / application specific (2..2^64-1) +BIPF_LIST ::= '[' ']' / '[' BIPF_VAL *(',' BIPF_VAL) ']' +BIPF_DICT ::= '{' '}' / '{' BIPF_ATOM ':' BIPF_VAL *(',' BIPF_ATOM ':' BIPF_VAL) '}' ``` This grammer can be used as a human-readable format of BIPF data items @@ -105,16 +106,18 @@ encoding of a corresponding value: ``` STRING : 0 (000) // utf8 encoded string BYTES : 1 (001) // raw byte sequence -INT : 2 (010) // little endian, two's complement, minimal number of bytes +INT : 2 (010) // 64 bits signed integer, little endian, two's complement, minimal number of bytes DOUBLE : 3 (011) // IEEE 754-encoded double precision floating point LIST : 4 (100) // sequence of bipf-encoded values DICT : 5 (101) // sequence of alternating bipf-encoded key and value -BOOLNULL: 6 (110) // 1 = true, 0 = false, no value means null +ATOM : 6 (110) // 64 bits unsigned integer, little endian, two's complement, minimal number of bytes. + // 1 = true, 0 = false, no value means null. Other values are for application purposes. EXTENDED: 7 (111) // custom type. Specific type should be indicated by varint at start of buffer ``` -Note that the ```BOOLNULL``` bit pattern is used for three different -atomic types. +Note that the ```ATOM``` bit pattern is used for three predefined different +atomic types (true, false, null) and may be used by applications to encode other +application-specific constants with a minimum of bytes. BIPF values are serialized with a TYPE-LENGTH-VALUE (TLV) encoding. To this end, T and L are combined into a single integer value called @@ -135,7 +138,11 @@ significant bit set, except the last byte. Zero is encoded as byte 0x00. Integer values are encoded with the minimally required number of bytes in -little-endian order using two's complement representation. +signed 64 bits little-endian order using two's complement representation. + +Atoms values are encoded with the minimally required number of bytes in +unsigned 64 bits little-endian order using two's complement representation. +The absence of value is encoded as a zero-length atom and correspond to NULL. Lists are encoded by prepending to the concatenation of BIPF-encoded elements a tag with ```typ=4``` and a ```len``` value which is the sum @@ -170,7 +177,7 @@ document in the way integer values are encoded. In the authors use a fixed-length encoding for integer values (4 bytes, little endian, two's complement). With space concerns in mind, tinySSB formats integers as little endian, two's complement, and retaining only -the minimum number of bytes needed. +the minimum number of bytes needed up to 8 bytes (64 bit integers). (b) As pointed out in the Motivation section, a straight-forward human-readable representation of BIPF exists that is very close to @@ -192,6 +199,8 @@ false 0e00 true 0e01 +@2@ 0e02 + 123 0a7b -123 0a85 @@ -243,6 +252,7 @@ varint in ProtoBuf: [```https://protobuf.dev/programming-guides/encoding/#varint BIPF.tinySSB: - Python: https://pypi.org/project/bipf/ +- Nim: https://github.com/BundleFeed/nim_bipf BIPF.classic: - Go: https://git.sr.ht/~cryptix/go-exp/tree/bipf/item/bipf