Pascal S. de Kloe edited this page Aug 18, 2017 · 3 revisions

Format Specification

A data structure consists of zero or more field value definitions followed by a termination byte 0x7f. Only those fields with a value other than the zero value may be serialized. Fields appear in order as stated by the schema.

struct :≡ def* 127

The zero value is false for booleans, 0 for integers, 0.0 for floating points, 1970-01-01T00:00:00.000000000Z for timestamps, the empty string for text and binary, null for nested data structures and the empty list for data structure lists.
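As an illustration of the struct layout and the omission of zero values, here is a minimal sketch. It assumes a hypothetical helper `encode_struct` that only handles boolean and unsigned 8-bit fields, using the encodings for those two types defined later in this specification. The function name and the list-based "schema" are assumptions made for the example, not part of the format.

```python
TERMINATOR = 0x7F  # termination byte that closes every data structure


def encode_struct(values):
    """Serialize fields given in schema order (hypothetical helper).

    Supports only bool and unsigned 8-bit integer fields.
    """
    if len(values) > 127:
        raise ValueError("field index must fit in 0..126")
    out = bytearray()
    for index, value in enumerate(values):
        if isinstance(value, bool):
            if value:                  # false is the zero value: omitted
                out.append(index)      # the occurrence alone means true
        else:
            if not 0 <= value <= 0xFF:
                raise ValueError("not an unsigned 8-bit value")
            if value != 0:             # 0 is the zero value: omitted
                out += bytes([index, value])
    out.append(TERMINATOR)
    return bytes(out)
```

A struct in which every field holds its zero value serializes to the single termination byte, and fields that are present keep their schema order.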

Data is represented in a big-endian manner. The format relies on varints, also known as variable-length quantities or unsigned big-endian base 128 (UBEB128). Bits reserved for future use (RFU) must be set to 0.

Value Definition

Each definition starts with an 8-bit header. The 7 least significant bits identify the field by its (0-based position) index in the schema. The most significant bit is used as a flag.

def :≡ header data
header :≡ key o̅r̅ flagged-key
key :≡ 0 – 126
flagged-key :≡ 128 – 254
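The header grammar can be sketched as a pair of helpers that pack and unpack the field index and the flag. The names `make_header` and `split_header` are hypothetical. Rejecting 127 and 255 follows from the ranges above: 127 is the termination byte, and 255 falls outside both `key` and `flagged-key`.

```python
from typing import Tuple

FLAG = 0x80        # most significant bit of the header
INDEX_MASK = 0x7F  # 7 least significant bits: field index in the schema
TERMINATOR = 0x7F  # index value reserved for struct termination


def make_header(index: int, flag: bool = False) -> int:
    """Build a header byte from a 0-based field index and the flag."""
    if not 0 <= index <= 126:
        raise ValueError("field index must be in 0..126")
    return index | (FLAG if flag else 0)


def split_header(header: int) -> Tuple[int, bool]:
    """Return (field index, flag) for a header byte."""
    if not 0 <= header <= 0xFF:
        raise ValueError("header must be a single byte")
    index = header & INDEX_MASK
    if index == TERMINATOR:
        raise ValueError("127 terminates the struct; 255 is not a header")
    return index, bool(header & FLAG)
```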

Boolean occurrences set the value to true. The flag is RFU.

Unsigned 8-bit integer values directly follow the header byte, and the flag is RFU. Unsigned 16-bit integer values greater than 255 also follow the header byte, in two bytes. Smaller values are encoded in one byte with the header flag set. Unsigned 32-bit integer values less than 1<<21 are encoded as varints; values of 1<<21 or more set the header flag and use fixed-length encoding. Unsigned 64-bit integer values less than 1<<49 are encoded as varints; values of 1<<49 or more set the header flag and use fixed-length encoding.
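The 16-bit rule and the varint thresholds can be illustrated with a short sketch. The names `encode_uint16` and `varint_size` are hypothetical. `varint_size` only counts 7-bit groups, so it makes no assumption about the order in which the groups are written.

```python
import struct

FLAG = 0x80  # header flag bit


def encode_uint16(index: int, value: int) -> bytes:
    """Encode one unsigned 16-bit field, or nothing for the zero value."""
    if not 0 <= index <= 126:
        raise ValueError("field index must be in 0..126")
    if not 0 <= value <= 0xFFFF:
        raise ValueError("not an unsigned 16-bit value")
    if value == 0:
        return b""                                # zero value is omitted
    if value > 255:
        return struct.pack(">BH", index, value)   # header + 2 bytes, big-endian
    return bytes([index | FLAG, value])           # flagged header + 1 byte


def varint_size(value: int) -> int:
    """Number of bytes needed when each byte carries 7 payload bits."""
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size
```

Counting groups suggests a rationale for the thresholds: values below 1<<21 fit in at most 3 varint bytes against 4 fixed bytes, and values below 1<<49 fit in at most 7 varint bytes against 8 fixed bytes. At the threshold the varint would no longer be shorter than the fixed-length form.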

Signed 32- and 64-bit integers are encoded as varints. The flag indicates a negative value. The tenth byte for 64-bit integers is omitted from the encoding since its value is fixed to 0x01.

Floating points are encoded according to IEEE 754. The flag is RFU.

Timestamps are encoded as a 32-bit unsigned integer for the number of seconds that have elapsed since 00:00:00 UTC, Thursday, 1 January 1970, not counting leap seconds. When the header flag is set, the number of seconds is instead encoded as a 64-bit two's complement integer. In both cases the value is followed by 32 bits for the nanosecond fraction. Note that the first two bits of the nanosecond fraction are RFU.
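A minimal sketch of the two timestamp layouts follows. The function name `encode_timestamp` is hypothetical. The rule for picking a form is an assumption made here: the 32-bit form is used whenever the seconds fit in an unsigned 32-bit integer, and the flagged 64-bit form otherwise. The text only defines what each form looks like.

```python
import struct

FLAG = 0x80  # header flag bit: seconds are a 64-bit two's complement integer


def encode_timestamp(index: int, seconds: int, nanos: int) -> bytes:
    """Encode one timestamp field, or nothing for the zero value."""
    if not 0 <= index <= 126:
        raise ValueError("field index must be in 0..126")
    if not 0 <= nanos < 1_000_000_000:
        raise ValueError("nanosecond fraction out of range")
    if seconds == 0 and nanos == 0:
        return b""  # 1970-01-01T00:00:00.000000000Z is the zero value
    if 0 <= seconds < 1 << 32:
        # Assumed choice: header, unsigned 32-bit seconds, 32-bit nanoseconds.
        return struct.pack(">BII", index, seconds, nanos)
    # Flagged header, signed 64-bit seconds, 32-bit nanoseconds.
    return struct.pack(">BqI", index | FLAG, seconds, nanos)
```

Because the nanosecond fraction is always below 10^9, which is less than 2^30, the two most significant bits of the 32-bit field stay 0 in this sketch.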

The data for text and binaries is prefixed with a varint byte size declaration. Text is encoded as UTF-8. The flag is RFU.

Lists of floating points, text, binaries and data structures are prefixed with a varint element size declaration. The flag is RFU.
