40 changes: 25 additions & 15 deletions 011.md
@@ -37,22 +37,23 @@ shows a simplified grammar for JSON (that omits low-level rules
e.g. for integers or floating point numbers and strings):

```
JSON_VAL ::= JSON_ATOM / JSON_LIST / JSON_DICT
JSON_ATOM ::= 'true' / 'false' / 'null' / INT_VAL / DOUBLE_VAL / STR_VAL
JSON_LIST ::= '[' ']' / '[' JSON_VAL *(',' JSON_VAL) ']'
JSON_DICT ::= '{' '}' / '{' STR_VAL ':' JSON_VAL *(',' STR_VAL ':' JSON_VAL) '}'
JSON_VAL ::= JSON_PRIMITIVE / JSON_LIST / JSON_DICT
JSON_PRIMITIVE ::= 'true' / 'false' / 'null' / INT_VAL / DOUBLE_VAL / STR_VAL
JSON_LIST ::= '[' ']' / '[' JSON_VAL *(',' JSON_VAL) ']'
JSON_DICT ::= '{' '}' / '{' STR_VAL ':' JSON_VAL *(',' STR_VAL ':' JSON_VAL) '}'
```

BIPF is more expressive in that it also supports byte arrays.
Moreover, any atom can be used as a key for a dictionary, not only
Moreover, any primitive can be used as a key for a dictionary, not only
strings. The following grammar for BIPF permits a direct comparison
with the previous grammar for JSON:

```
BIPF_VAL ::= BIPF_ATOM / BIPF_LIST / BIPF_DICT
BIPF_ATOM ::= 'true' / 'false' / 'null' / INT_VAL / DOUBLE_VAL / STR_VAL / BYTES_VAL
BIPF_LIST ::= '[' ']' / '[' BIPF_VAL *(',' BIPF_VAL) ']'
BIPF_DICT ::= '{' '}' / '{' BIPF_ATOM ':' BIPF_VAL *(',' BIPF_ATOM ':' BIPF_VAL) '}'
BIPF_VAL ::= BIPF_PRIMITIVE / BIPF_LIST / BIPF_DICT
BIPF_PRIMITIVE ::= ATOM_VAL / INT_VAL / DOUBLE_VAL / STR_VAL / BYTES_VAL
ATOM_VAL ::= 'true' (1) / 'false' (0) / 'null' (no value) / application specific (2..2^64-1)
BIPF_LIST ::= '[' ']' / '[' BIPF_VAL *(',' BIPF_VAL) ']'
BIPF_DICT ::= '{' '}' / '{' BIPF_PRIMITIVE ':' BIPF_VAL *(',' BIPF_PRIMITIVE ':' BIPF_VAL) '}'
```

This grammar can be used as a human-readable format of BIPF data items
@@ -105,16 +106,18 @@ encoding of a corresponding value:
```
STRING : 0 (000) // utf8 encoded string
BYTES : 1 (001) // raw byte sequence
INT : 2 (010) // little endian, two's complement, minimal number of bytes
INT : 2 (010) // 64-bit signed integer, little endian, two's complement, minimal number of bytes
DOUBLE : 3 (011) // IEEE 754-encoded double precision floating point
LIST : 4 (100) // sequence of bipf-encoded values
DICT : 5 (101) // sequence of alternating bipf-encoded key and value
BOOLNULL: 6 (110) // 1 = true, 0 = false, no value means null
ATOM : 6 (110) // 64-bit unsigned integer, little endian, minimal number of bytes.
// 1 = true, 0 = false, no value means null. Other values are for application purposes.
EXTENDED: 7 (111) // custom type. Specific type should be indicated by varint at start of buffer
```

Note that the ```BOOLNULL``` bit pattern is used for three different
atomic types.
Note that the ```ATOM``` bit pattern is used for three predefined
atomic values (true, false, null) and may also be used by applications to encode
application-specific constants with a minimum of bytes.

BIPF values are serialized with a TYPE-LENGTH-VALUE (TLV) encoding.
To this end, T and L are combined into a single integer value called
@@ -135,7 +138,11 @@ significant bit set, except the last byte. Zero is encoded as byte
0x00.
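
The varint rule above can be sketched in Python as follows. This is a minimal illustration, not code from any BIPF library: it assumes the ProtoBuf-style layout (7 payload bits per byte, least significant group first, most significant bit set on every byte except the last). It also assumes that the tag is computed as `(len << 3) | typ`, which matches the 3-bit type codes and examples such as `0e01` for `true`. The names `encode_varint`, `decode_varint` and `make_tag` are hypothetical.

```python
from typing import Tuple


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a varint (assumed ProtoBuf-style layout)."""
    if n < 0:
        raise ValueError("varint encodes non-negative integers only")
    out = bytearray()
    while True:
        low7 = n & 0x7F          # next 7 payload bits, least significant first
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # more bytes follow: set the top bit
        else:
            out.append(low7)         # last byte: top bit clear
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode a varint starting at buf[pos]; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if not (b & 0x80):           # top bit clear marks the last byte
            return result, pos
        shift += 7


def make_tag(typ: int, length: int) -> bytes:
    """Combine type and length into one varint (assumed layout: len << 3 | typ)."""
    return encode_varint((length << 3) | typ)
```

Under these assumptions, `make_tag(6, 1)` yields the single byte `0x0e` that starts the `true` and `false` examples, and `encode_varint(0)` yields `0x00`.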

Integer values are encoded with the minimally required number of bytes in
little-endian order using two's complement representation.
little-endian order using two's complement representation (signed, at most 64 bits).

Atom values are encoded with the minimally required number of bytes in
little-endian order as unsigned integers of at most 64 bits.
The absence of a value is encoded as a zero-length atom and corresponds to null.
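
The integer and atom rules can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the tag is taken to be `(len << 3) | typ`, which fits in one varint byte because the payload is at most 8 bytes long, and the integer 0 is assumed to occupy one byte. The function names `encode_int` and `encode_atom` are hypothetical and do not come from an existing BIPF implementation.

```python
INT_TYPE = 2    # type code for INT
ATOM_TYPE = 6   # type code for ATOM


def _tag(typ: int, length: int) -> bytes:
    # Assumed tag layout: (len << 3) | typ. With len <= 8 the value stays
    # below 128, so the varint is a single byte.
    return bytes([(length << 3) | typ])


def encode_int(value: int) -> bytes:
    """Signed 64-bit integer: minimal little-endian two's complement bytes."""
    if not -(1 << 63) <= value < (1 << 63):
        raise ValueError("integer does not fit in 64 bits")
    length = 1  # assumption: 0 is encoded with one byte
    while not -(1 << (8 * length - 1)) <= value < (1 << (8 * length - 1)):
        length += 1
    return _tag(INT_TYPE, length) + value.to_bytes(length, "little", signed=True)


def encode_atom(value) -> bytes:
    """Atom: None is zero-length; otherwise a minimal unsigned little-endian integer."""
    if value is None:
        return _tag(ATOM_TYPE, 0)           # null: tag only, no payload
    n = int(value)                           # False -> 0, True -> 1
    if not 0 <= n < (1 << 64):
        raise ValueError("atom does not fit in 64 unsigned bits")
    length = max(1, (n.bit_length() + 7) // 8)  # 0 (false) still needs one byte
    return _tag(ATOM_TYPE, length) + n.to_bytes(length, "little", signed=False)
```

With these assumptions, `encode_int(123).hex()` gives `0a7b`, `encode_int(-123).hex()` gives `0a85`, and `encode_atom(True).hex()` gives `0e01`, in line with the example encodings listed in this document.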

Lists are encoded by prepending to the concatenation of BIPF-encoded
elements a tag with ```typ=4``` and a ```len``` value which is the sum
@@ -170,7 +177,7 @@ document in the way integer values are encoded. In
the authors use a fixed-length encoding for integer values (4 bytes,
little endian, two's complement). With space concerns in mind, tinySSB
formats integers as little endian, two's complement, retaining only
the minimum number of bytes needed.
the minimum number of bytes needed, up to 8 bytes (64-bit integers).

(b) As pointed out in the Motivation section, a straight-forward
human-readable representation of BIPF exists that is very close to
@@ -192,6 +199,8 @@ false 0e00

true 0e01

@2@ 0e02

123 0a7b

-123 0a85
@@ -243,6 +252,7 @@ varint in ProtoBuf: [```https://protobuf.dev/programming-guides/encoding/#varint

BIPF.tinySSB:
- Python: https://pypi.org/project/bipf/
- Nim: https://github.com/BundleFeed/nim_bipf

BIPF.classic:
- Go: https://git.sr.ht/~cryptix/go-exp/tree/bipf/item/bipf