YEDB specifications

Version: 1

1. General

1.1 Implementations

1.2 References

The following third-party technologies are mentioned in this document:

2. Database format

2.1 Common

2.2 Meta information file

The file MUST be called ".yedb" and stored in the database root directory. The file MUST be serialized as JSON and contain the following fields:

| Name      | Type   | Description                      |
|-----------|--------|----------------------------------|
| fmt       | String | Data serialization format code   |
| created   | u64    | Database creation timestamp      |
| version   | u16    | Engine version                   |
| checksums | bool   | Data storage checksums enabled   |
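
As a rough illustration, the meta file could be written and validated along the following lines. This is a minimal sketch, not a reference implementation: the helper names `write_meta` and `read_meta` are hypothetical, and the sample values used with them (the string "json" for `fmt`, the unit of the `created` timestamp) are assumptions, since this section does not fix them.

```python
import json
import os

META_FILE = ".yedb"  # file name required by section 2.2
# Required fields and the Python types assumed to represent them
REQUIRED_FIELDS = {"fmt": str, "created": int, "version": int, "checksums": bool}


def write_meta(db_root, fmt, created, version, checksums):
    """Serialize the four required fields as JSON into <db_root>/.yedb."""
    meta = {"fmt": fmt, "created": created, "version": version, "checksums": checksums}
    with open(os.path.join(db_root, META_FILE), "w", encoding="utf-8") as fh:
        json.dump(meta, fh)
    return meta


def read_meta(db_root):
    """Load the meta file and check that every required field has the expected type."""
    with open(os.path.join(db_root, META_FILE), encoding="utf-8") as fh:
        meta = json.load(fh)
    for name, expected in REQUIRED_FIELDS.items():
        value = meta.get(name)
        # bool is a subclass of int in Python, so reject it for integer fields
        if not isinstance(value, expected) or (expected is int and isinstance(value, bool)):
            raise ValueError(f"meta field {name!r} is missing or has the wrong type")
    if not 0 <= meta["version"] <= 0xFFFF:  # u16 range
        raise ValueError("version does not fit into u16")
    return meta
```

Because the file is plain JSON, an administrator can inspect or fix it with any text editor.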

2.3 Key files

2.3.1 General

Key values MUST be stored in regular files in the "keys" subdirectory. The format MUST be kept simple enough to allow system administrators to repair a database without any external tools. The full path tree where a key file is stored MUST represent the full name of the key.

Example: a key, named "my/cool/key" should be:

When a key is deleted, the engine SHOULD automatically remove directories in the path tree that are no longer needed.

The root key SHOULD NOT have any value. Other "parent" keys MAY have values set.

2.3.2 File format

A database can contain key files of one format only. Checksums are enabled or disabled globally, for all keys in the database.

2.3.2.1 Checksums disabled

If checksums are disabled, key files contain the serialized data as-is. This makes manual repair of the database easier, but gives weaker guarantees of data integrity.

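2.3.2.2 Checksums enabled

2.3.2.2.1 Binary serialization formats

| Byte range | Size | Value                              |
|------------|------|------------------------------------|
| 0-31       | 32   | SHA256 checksum of serialized data |
| 32-39      | 8    | Set timestamp                      |
| 40-        |      | Serialized data                    |

2.3.2.2.2 Text serialization formats

| Line | Value                                    |
|------|------------------------------------------|
| 1    | SHA256 checksum of serialized data (hex) |
| 2    | Set timestamp (hex)                      |
| 3-N  | Serialized data                          |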
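
The two layouts above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the little-endian byte order of the 8-byte timestamp, the lowercase hex output, the "\n" line separator and the UTF-8 encoding of text data are not stated in this section.

```python
import hashlib


def pack_binary(data: bytes, set_time: int) -> bytes:
    """Binary layout: 32-byte SHA-256, 8-byte timestamp, then the serialized data."""
    # Assumption: the timestamp is stored little-endian
    return hashlib.sha256(data).digest() + set_time.to_bytes(8, "little") + data


def unpack_binary(blob: bytes):
    """Return (set_time, data); raise ValueError if the checksum does not match."""
    if len(blob) < 40:
        raise ValueError("key file is shorter than the 40-byte header")
    checksum, data = blob[:32], blob[40:]
    if hashlib.sha256(data).digest() != checksum:
        raise ValueError("checksum mismatch")
    return int.from_bytes(blob[32:40], "little"), data


def pack_text(data: str, set_time: int) -> str:
    """Text layout: line 1 is the hex checksum, line 2 the hex timestamp, then the data."""
    digest = hashlib.sha256(data.encode("utf-8")).hexdigest()
    return f"{digest}\n{set_time:x}\n{data}"


def unpack_text(text: str):
    """Return (set_time, data); raise ValueError if the checksum does not match."""
    parts = text.split("\n", 2)
    if len(parts) != 3:
        raise ValueError("key file has fewer than three lines")
    digest, stamp, data = parts
    if hashlib.sha256(data.encode("utf-8")).hexdigest() != digest.lower():
        raise ValueError("checksum mismatch")
    return int(stamp, 16), data
```

In both variants the checksum covers the serialized data only, so changing the timestamp alone does not invalidate it.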

2.4 Data serialization

2.4.1 Data formats

The current YEDB specifications document defines the following serialization formats:

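| Name    | Code | Mandatory   | File suffix | File suffix with checksums | Type   |
|---------|------|-------------|-------------|----------------------------|--------|
| json    | 1    | Y (default) | .json       | .jsonc                     | text   |
| msgpack | 2    | Y           | .mp         | .mpc                       | binary |
| cbor    | 3    | N           | .cb         | .cbc                       | binary |
| yaml    | 4    | N           | .yml        | .ymlc                      | text   |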
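
For illustration, the table above can be expressed as a small lookup structure. This is a sketch only: the names `FORMATS`, `key_file_suffix` and `is_binary` are hypothetical and not part of the specification.

```python
# name -> (code, mandatory, suffix, suffix with checksums, type), copied from the table above
FORMATS = {
    "json": (1, True, ".json", ".jsonc", "text"),
    "msgpack": (2, True, ".mp", ".mpc", "binary"),
    "cbor": (3, False, ".cb", ".cbc", "binary"),
    "yaml": (4, False, ".yml", ".ymlc", "text"),
}


def key_file_suffix(fmt: str, checksums: bool) -> str:
    """Return the key file suffix for a format, depending on the checksum setting."""
    _code, _mandatory, plain, with_checksums, _kind = FORMATS[fmt]
    return with_checksums if checksums else plain


def is_binary(fmt: str) -> bool:
    """Tell whether the binary or the text key file layout applies to a format."""
    return FORMATS[fmt][4] == "binary"
```

An unknown format name raises `KeyError`, which an engine could use to reject unsupported optional formats.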

2.4.2 Data types

| Name    | Mandatory |
|---------|-----------|
| null    | Y         |
| boolean | Y         |
| number  | Y         |
| string  | Y         |
| array   | Y         |
| object  | Y         |
| bytes   | N         |

The database MAY implement additional data types.

2.4.3 Data type schemas

The database MAY implement strict type / structure checking for keys.

If implemented, the implementation MUST satisfy the following requirements:

3. Engine

3.1 Basics

3.1.1 Writing and flushing data

If data flushing is enabled, the key and database data MUST be written into temporary files. Afterwards, these files MUST be flushed with the corresponding system call (e.g. fd.flush(); libc::fsync(fd)).
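
The write-and-flush step could look like the sketch below. It is an illustration only: the function name `write_flushed` and the ".tmp" suffix are assumptions, and the sketch covers only writing and flushing the temporary file, not any later step.

```python
import os
import tempfile


def write_flushed(directory: str, data: bytes) -> str:
    """Write data to a temporary file in `directory`, flush it and return its path."""
    # Hypothetical naming: a random name with a ".tmp" suffix next to the target
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "wb") as fh:
        fh.write(data)
        fh.flush()             # push Python's userspace buffer to the OS
        os.fsync(fh.fileno())  # ask the OS to commit the file to stable storage
    return tmp_path
```

Creating the temporary file in the same directory as the target keeps both on the same file system, which is a design choice of this sketch rather than a requirement stated here.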

After the successful flushing:

3.2 Public database object variables

The database object MUST expose the following variables, either as public fields or through setters:

| Name                | Type | Default | Description                            |
|---------------------|------|---------|----------------------------------------|
| auto_repair         | bool | true    | Auto-repair the database when opened   |
| auto_flush          | bool | true    | Flush key data to disk immediately     |
| lock_ex             | bool | true    | Lock the database exclusively on open  |
| write_modified_only | bool | true    | Write to disk modified key values only |

3.3 Mandatory methods

| Name                 | Args                         | Brief description                                                         |
|----------------------|------------------------------|---------------------------------------------------------------------------|
| purge                |                              | Remove everything except key files and the meta file, delete broken keys  |
| safe_purge           |                              | The same as purge, but do not delete broken keys                          |
| repair               |                              | Repair broken keys, delete unrepairable ones                              |
| check                |                              | Check keys                                                                |
| info                 |                              | Get database info                                                         |
| server_set           | name: String, value: Value   | Modify server options                                                     |
| key_exists           | key: String                  | Return true if the key exists, false otherwise                            |
| key_get              | key: String                  | Get key value                                                             |
| key_explain          | key: String                  | Get key value and extended info                                           |
| key_set              | key: String, value           | Set key value                                                             |
| key_decrement        | key: String                  | Decrement the value of a numeric (integer) key                            |
| key_delete           | key: String                  | Delete the key                                                            |
| key_delete_recursive | key: String                  | Delete the key and all its subkeys                                        |
| key_increment        | key: String                  | Increment the value of a numeric (integer) key                            |
| key_list             | key: String                  | List the key and all its subkeys (Vec<String>)                            |
| key_list_all         | key: String                  | List the key and all its subkeys, including hidden ones                   |
| key_get_recursive    | key: String                  | Get the key value and all subkeys (Vec<String, Value>)                    |
| key_copy             | key: String, dst_key: String | Copy the key value                                                        |
| key_rename           | key: String, dst_key: String | Rename the key / key tree                                                 |
| key_dump             | key: String                  | Get the value of the key and all subkeys, ignoring broken keys            |
| key_load             | data: Vec<String, Value>     | Load dumped keys back                                                     |

3.3.1 Purge

The method MUST return a Generator<String> or an array/list of deleted broken keys.

3.3.2 Repair

The method MUST return a Generator<(String, bool)> or an array/list of (key, bool) pairs for the broken keys, where the bool value is true if the key has been repaired and false if the key has been deleted.

3.3.3 Check

The method MUST return a Generator<String> or an array/list of broken keys.

3.3.4 Info

The method MUST return the following data object:

| Name               | Type             | Description                                            |
|--------------------|------------------|--------------------------------------------------------|
| auto_flush         | bool             | Flush key data to disk immediately                     |
| checksums          | bool             | Checksums enabled                                      |
| created            | u64              | Database creation timestamp                            |
| fmt                | String           | Current data serialization format                      |
| path               | String           | Database path (server local)                           |
| repair_recommended | bool             | Database "repair recommended" flag (not auto-repaired) |
| server             | (String, String) | Server engine ID / version (custom values)             |
| version            | u16              | Engine version                                         |

The object MAY contain additional fields.

3.3.5 Server set

The following server parameters are allowed to be modified on the fly:

Name
auto_flush
repair_recommended

3.3.6 Key Explain

The method MUST return the following data object:

| Name   | Type   | Description                                                 |
|--------|--------|-------------------------------------------------------------|
| file   | String | Key file                                                    |
| schema | String | JSON schema key if schema is defined                        |
| len    | u64    | Length for strings, objects and arrays, null for others     |
| mtime  | u64    | Key file modification timestamp                             |
| stime  | u64    | Key modification timestamp; null if checksums are disabled  |
| sha256 |        | SHA256 checksum, MUST be serialized to String               |
| type   |        | Value type (see 2.4.2), MUST be serialized to String        |
| value  |        | Key value                                                   |

The object MAY contain additional fields.

If key explain is requested for a schema key, its "schema" field MUST start with the "!" symbol to inform clients that the key does not physically exist in the database.

If the database engine has data type schemas (see 2.4.3) implemented, the schema field for .schema keys MUST contain the value "!JSON Schema VERSION", e.g. "!JSON Schema Draft-7".

If the schema implements custom data types, this MUST be clearly and properly explained. E.g. if a key schema defines that keys must contain valid Python code, the value MUST contain either "Python" or the link to python.org.

3.3.7 Key increment and decrement methods

4. Engine API

4.1 Basics

The engine MUST implement the JSON RPC 2.0 API under the following conditions:

The engine MAY implement other APIs.

4.2 Test

The engine API MUST implement a "test" method, which MUST return the following structure:

| Name    | Type   | Description                  |
|---------|--------|------------------------------|
| name    | String | MUST have the value "yedb"   |
| version | u16    | Engine version               |
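
A minimal sketch of a handler answering the "test" method is shown below. It is illustrative only: the function name `handle_request` is hypothetical, the engine version value 1 is an assumption, and every other method is simply answered with the "Method not found" code.

```python
import json

ENGINE_VERSION = 1  # assumed value for this example


def handle_request(raw: str) -> str:
    """Answer a JSON RPC 2.0 request; only the "test" method is implemented here."""
    request = json.loads(raw)
    response = {"jsonrpc": "2.0", "id": request.get("id")}
    if request.get("method") == "test":
        response["result"] = {"name": "yedb", "version": ENGINE_VERSION}
    else:
        response["error"] = {"code": -32601, "message": "Method not found"}
    return json.dumps(response)
```

A client can call "test" first to verify that it is talking to a YEDB engine and to learn the engine version before issuing other requests.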

4.3 Server types

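| Type        | Serialization formats | Notes                                                 |
|-------------|-----------------------|-------------------------------------------------------|
| UNIX socket | msgpack               | The name SHOULD have the suffix ".sock" or ".socket"  |
| TCP socket  | msgpack               | The default port SHOULD be 8870                       |
| HTTP        | msgpack, json         | The default port SHOULD be 8878                       |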

4.4 Binary packets format

For binary data exchange (UNIX/TCP sockets), the following format MUST be kept for both client and server:

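| Byte range | Size | Value                                  |
|------------|------|----------------------------------------|
| 0          | 1    | Engine version                         |
| 1          | 1    | Data format code (2 for msgpack)       |
| 2-5        | 4    | JSON RPC frame length (little-endian)  |
| 6-         |      | JSON RPC request / response frame      |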
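
The packet layout above can be sketched with Python's `struct` module. The function names are hypothetical, and the payload is treated as opaque bytes that are assumed to be an already msgpack-encoded JSON RPC frame; the default engine version 1 is also an assumption.

```python
import struct

# version (1 byte), data format code (1 byte), frame length (4 bytes, little-endian)
HEADER = struct.Struct("<BBI")
MSGPACK_CODE = 2


def pack_frame(payload: bytes, version: int = 1) -> bytes:
    """Prefix an encoded JSON RPC frame with the 6-byte packet header."""
    return HEADER.pack(version, MSGPACK_CODE, len(payload)) + payload


def unpack_frame(packet: bytes):
    """Return (version, format_code, payload); raise ValueError on short input."""
    if len(packet) < HEADER.size:
        raise ValueError("packet is shorter than the 6-byte header")
    version, fmt_code, length = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated frame")
    return version, fmt_code, payload
```

When reading from a stream socket, a receiver would typically read the 6 header bytes first and then exactly as many bytes as the length field announces.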

4.5 HTTP

The HTTP transport MUST meet the following conditions:

4.6 JSON RPC Error codes

Error codes returned by the server MUST match the following:

4.6.1 Protocol errors

| Code   | Description               |
|--------|---------------------------|
| -32600 | Invalid request           |
| -32601 | Method not found          |
| -32602 | Invalid method parameters |

4.6.2 Engine errors

| Code   | Description                                      |
|--------|--------------------------------------------------|
| -32001 | Key not found                                    |
| -32002 | Data error, checksum error                       |
| -32003 | Schema validation error                          |
| -32004 | OS I/O errors: device errors, permissions etc.   |
| -32000 | All other server errors                          |
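
For illustration, the codes from both tables can be collected in an enumeration. The mapping from Python exception classes to engine error codes shown here is an assumption made for the example, as are the names `ErrorCode`, `code_for_exception` and `error_response`.

```python
from enum import IntEnum


class ErrorCode(IntEnum):
    # Protocol errors (4.6.1)
    INVALID_REQUEST = -32600
    METHOD_NOT_FOUND = -32601
    INVALID_PARAMS = -32602
    # Engine errors (4.6.2)
    KEY_NOT_FOUND = -32001
    DATA_ERROR = -32002
    SCHEMA_VALIDATION = -32003
    IO_ERROR = -32004
    OTHER = -32000


def code_for_exception(exc: Exception) -> ErrorCode:
    """Hypothetical mapping of Python exceptions to engine error codes."""
    if isinstance(exc, KeyError):
        return ErrorCode.KEY_NOT_FOUND
    if isinstance(exc, OSError):  # includes PermissionError and similar
        return ErrorCode.IO_ERROR
    return ErrorCode.OTHER


def error_response(request_id, exc: Exception) -> dict:
    """Build a JSON RPC 2.0 error response object for an exception."""
    code = code_for_exception(exc)
    return {"jsonrpc": "2.0", "id": request_id,
            "error": {"code": int(code), "message": str(exc)}}
```

Falling back to -32000 for anything unrecognized matches the "All other server errors" row.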

5. Dump files

If the engine or a client creates / loads dump files, these files MUST have data serialized with MessagePack and have the following format:

5.1 File header

| Byte range | Size | Value                            |
|------------|------|----------------------------------|
| 0          | 1    | Engine version                   |
| 1          | 1    | Data format code (2 for msgpack) |

5.2 Key data

Stored, starting from byte 2, for each key:

| Byte range | Size | Value                |
|------------|------|----------------------|
| 0-3        | 4    | Data length          |
| 4-         |      | Data value (msgpack) |
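
The dump layout can be sketched as a simple length-prefixed record stream. This is an illustration under assumptions: the function names are hypothetical, the little-endian byte order of the 4-byte data length is not stated in this section, and each record is treated as opaque bytes that are assumed to be already msgpack-encoded.

```python
import io
import struct

MSGPACK_CODE = 2


def write_dump(stream, records, version=1):
    """Write the 2-byte file header followed by length-prefixed records."""
    stream.write(bytes([version, MSGPACK_CODE]))
    for record in records:
        # Assumption: the data length is stored little-endian
        stream.write(struct.pack("<I", len(record)))
        stream.write(record)


def read_dump(stream):
    """Return (version, format_code, records); raise ValueError on truncation."""
    header = stream.read(2)
    if len(header) != 2:
        raise ValueError("dump file is shorter than the 2-byte header")
    version, fmt_code = header
    records = []
    while True:
        prefix = stream.read(4)
        if not prefix:  # clean end of file
            break
        if len(prefix) != 4:
            raise ValueError("truncated length prefix")
        (length,) = struct.unpack("<I", prefix)
        data = stream.read(length)
        if len(data) != length:
            raise ValueError("truncated record")
        records.append(data)
    return version, fmt_code, records
```

The same functions work with a file opened in binary mode or with an in-memory `io.BytesIO` buffer.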