remote HID protocol, low level

In this document bold words are tokens and italic words are grammatical constructs.

The protocol is designed with the following goals in mind:

Tokens

token          description
(              start a generic list
)              end a generic list
{              start a binary list
}              end a binary list
space          element separator: space (ASCII dec 32)
\n             message separator: newline (ASCII dec 10)
text-string    short, alphanumeric string
binary-string  long and/or non-alnum string

Grammar

The communication is a pair of asynchronous message streams. The message format is:
text-string generic-list \n
or
text-string binary-list \n
or
# comment \n

where text-string is the command name and the generic-list or binary-list holds the arguments of the command. The command name cannot start with a hash mark ('#') and cannot be empty. The list is the argument tree of the command.

As a special exception, a line may start with a hash mark ('#') to indicate a comment. Characters up to the first newline are ignored.

Optional: a tolerant parser also accepts empty lines and whitespace before a message.
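
The tolerant behaviour described above can be sketched as a small helper that skips everything a parser may ignore before a message. This is only an illustration: the function name skip_ignorable is hypothetical, and treating tab and carriage return as whitespace, as well as accepting a comment after leading whitespace, are assumptions not stated by the protocol.

```python
# Bytes treated as ignorable whitespace before a message.
# Assumption: the text only says "whitespace"; tab and CR are included here.
WHITESPACE = (b" ", b"\t", b"\r", b"\n")


def skip_ignorable(buf: bytes, pos: int = 0) -> int:
    """Return the offset of the next message start at or after pos.

    Skips empty lines, leading whitespace and '#' comment lines.
    Returns len(buf) when only ignorable input (or an unfinished
    comment) remains, so a streaming caller knows to wait for more data.
    """
    while pos < len(buf):
        c = buf[pos:pos + 1]
        if c in WHITESPACE:
            pos += 1
        elif c == b"#":
            newline = buf.find(b"\n", pos)
            if newline < 0:
                return len(buf)  # comment not terminated yet
            pos = newline + 1  # characters up to the newline are ignored
        else:
            break
    return pos
```

A strict parser would skip only comment lines and reject the rest; the helper is meant to run before the command name is read.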

The language has two types of lists: generic and binary. A generic-list is wrapped in parentheses () and its children are text-strings, generic-lists and binary-lists.

A binary-list is wrapped in braces {} and its children are binary-strings and binary-lists.

Any list can be empty. There's no whitespace after the opening token and before the closing token, so the empty generic-list and the empty binary-list are written as, respectively:

()
{}
Subsequent fields of a list have a single space in between, for separation (this is an intentional redundancy for a binary-list).

Note: a generic-list can host text and binary children, but a binary-list can host only binary children. This means if a node of the parse tree is a binary-list, the subtree will contain only binary nodes.

A text-string contains only English alphanumeric characters (A..Z, a..z, 0..9), underscores (_), plus and minus signs (+, -), periods (.) and the hash mark ('#'), and is at most 16 characters long.
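
The text-string rule maps directly onto a regular expression. The following is a minimal sketch; the names is_text_string and is_command_name are hypothetical, and requiring at least one character in a text-string is an assumption (the text only states that a command name cannot be empty).

```python
import re

# Allowed characters and the 16-character limit come from the rule above.
# Assumption: a text-string has at least one character.
TEXT_STRING_RE = re.compile(r"[A-Za-z0-9_+\-.#]{1,16}")


def is_text_string(s: str) -> bool:
    """Return True if s can be sent as a text-string token."""
    return TEXT_STRING_RE.fullmatch(s) is not None


def is_command_name(s: str) -> bool:
    """A command name is a text-string that does not start with '#'."""
    return is_text_string(s) and not s.startswith("#")
```

Anything that fails is_text_string, for example a value containing a space or longer than 16 characters, has to be sent as a binary-string instead.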

A binary-string encodes the length of the payload in base64 (A..Z, a..z, 0..9, +, /), has a '=' for separator, then the payload in binary format. For example:

F=hello
means the 5-character string "hello". The base64-encoded length field is at most 5 characters long, so it can take 64^5 = 2^30 distinct values, and the longest binary data that can be packed in a single field is about 1 gigabyte.
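
A minimal sketch of packing and unpacking a binary-string follows. The function names are hypothetical. It assumes the standard base64 digit order (A = 0, which matches F=hello for a length of 5), most significant digit first for multi-digit lengths, and "A=" for an empty payload; only the single-digit case is shown in this document.

```python
# Standard base64 digit order: A=0 ... Z=25, a=26 ... z=51, 0=52 ... 9=61, +=62, /=63.
B64_DIGITS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
MAX_LEN_DIGITS = 5  # from the text: the length field is at most 5 characters


def encode_length(n: int) -> str:
    """Encode a payload length as base64 digits (assumed most significant first)."""
    if not 0 <= n < 64 ** MAX_LEN_DIGITS:
        raise ValueError("length does not fit in the length field")
    digits = B64_DIGITS[n % 64]
    n //= 64
    while n:
        digits = B64_DIGITS[n % 64] + digits
        n //= 64
    return digits


def encode_binary_string(payload: bytes) -> bytes:
    """Build length, '=' separator, then the raw payload."""
    return encode_length(len(payload)).encode("ascii") + b"=" + payload


def decode_binary_string(buf: bytes, pos: int = 0) -> tuple[bytes, int]:
    """Read one binary-string at buf[pos]; return (payload, next offset)."""
    eq = buf.index(b"=", pos)  # raises ValueError if there is no separator
    digits = buf[pos:eq]
    if not 1 <= len(digits) <= MAX_LEN_DIGITS:
        raise ValueError("bad length field")
    n = 0
    for c in digits:
        n = n * 64 + B64_DIGITS.index(chr(c))  # ValueError on a non-base64 byte
    end = eq + 1 + n
    if end > len(buf):
        raise ValueError("truncated payload")
    return buf[eq + 1:end], end
```

Because the payload is read by length rather than by scanning for a delimiter, it may contain spaces, braces or newlines without any escaping.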

Examples

empty-argument messages

hello()\n
foo{}\n

single-argument messages

Text and binary alternatives for the same content:
hello(world)\n
hello{F=world}\n

multi-argument messages

Text and binary alternatives for the same content:
print(hello world !)\n
print{F=hello F=world B=!}\n
Note: use a space between list items, but no space before the first or after the last argument. Always emit exactly one space between any two list items.

lists in list

Text and binary alternatives for the same content:
line((14.55 3.1) (44.2 0) 5)\n
line({F=14.55 D=3.1} (44.2 0) 5)\n
line((14.55 3.1) {E=44.2 B=0} 5)\n
line({F=14.55 D=3.1} {E=44.2 B=0} 5)\n
line{{F=14.55 D=3.1} {E=44.2 B=0} B=5}\n
The subtree assumed in this fictional message is two x;y coordinate pairs and a line width. In other words, the arguments of line are a list (start point), another list (end point) and a scalar (width).

Since all data complies with the text-string token format, the first, simplest format is recommended. The other 4 lines demonstrate all other valid variants.
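
To illustrate that the variants carry the same content, here is a sketch of a strict recursive parser for a single message. All function names are hypothetical. It assumes that a binary-string may appear only inside a binary-list, that the length field uses the standard base64 digit order with the most significant digit first, and it returns every leaf as bytes so that text and binary encodings compare equal.

```python
B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
TEXT_CHARS = set(b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_+-.#")


def parse_text_string(buf, pos):
    """Read 1..16 text-string characters starting at pos."""
    end = pos
    while end < len(buf) and buf[end] in TEXT_CHARS:
        end += 1
    if not 1 <= end - pos <= 16:
        raise ValueError("bad text-string at offset %d" % pos)
    return buf[pos:end], end


def parse_binary_string(buf, pos):
    """Read a length, '=' separator, payload field starting at pos."""
    eq = buf.index(b"=", pos)
    digits = buf[pos:eq]
    if not 1 <= len(digits) <= 5:
        raise ValueError("bad length field at offset %d" % pos)
    n = 0
    for c in digits:
        n = n * 64 + B64.index(chr(c))
    end = eq + 1 + n
    if end > len(buf):
        raise ValueError("truncated payload")
    return buf[eq + 1:end], end


def parse_list(buf, pos, in_binary):
    """Parse the list opening at buf[pos]; return (children, next offset)."""
    opener = buf[pos:pos + 1]
    if opener == b"{":
        binary, closer = True, b"}"
    elif opener == b"(" and not in_binary:
        binary, closer = False, b")"
    else:
        raise ValueError("unexpected list start at offset %d" % pos)
    pos += 1
    children = []
    while buf[pos:pos + 1] != closer:
        if children:  # exactly one space between two items
            if buf[pos:pos + 1] != b" ":
                raise ValueError("expected a space at offset %d" % pos)
            pos += 1
        if buf[pos:pos + 1] in (b"(", b"{"):
            child, pos = parse_list(buf, pos, binary)
        elif binary:
            child, pos = parse_binary_string(buf, pos)
        else:
            child, pos = parse_text_string(buf, pos)
        children.append(child)
    return children, pos + 1


def parse_message(line):
    """Parse one newline-terminated message into (command name, argument tree)."""
    name, pos = parse_text_string(line, 0)
    if name.startswith(b"#"):
        raise ValueError("command name cannot start with '#'")
    args, pos = parse_list(line, pos, False)
    if line[pos:pos + 1] != b"\n":
        raise ValueError("missing message separator")
    return name.decode("ascii"), args
```

With this sketch, all five line messages above parse to the same tree: two two-element lists followed by the scalar 5.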

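It is important to note that there are constraints (derived from the grammar) on choosing which list items can be encoded in binary: a binary-string can appear only as a child of a binary-list, and a binary-list can hold only binary-strings and binary-lists.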

Thus if the 3rd argument (width in this example) must be encoded as a binary-string, it will turn its parent, line's argument list, binary too, which in turn forces all arguments to be binary.
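
The propagation rule can be sketched as a serializer that prefers the text form and switches a list to binary only when one of its direct string children does not fit the text-string format. Once a list is binary, everything below it is emitted in binary too. The names serialize and message are hypothetical, leaves are passed as bytes, and the length field uses the assumed standard base64 digit order with the most significant digit first.

```python
import re

TEXT_RE = re.compile(rb"[A-Za-z0-9_+\-.#]{1,16}")
B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def binary_string(payload: bytes) -> bytes:
    """Encode payload as length, '=' separator, raw bytes."""
    n = len(payload)
    if n >= 64 ** 5:
        raise ValueError("payload too long for a single field")
    digits = B64[n % 64]
    n //= 64
    while n:
        digits = B64[n % 64] + digits
        n //= 64
    return digits.encode("ascii") + b"=" + payload


def needs_binary(children: list) -> bool:
    """True if any direct string child is not a valid text-string."""
    return any(isinstance(c, bytes) and TEXT_RE.fullmatch(c) is None
               for c in children)


def serialize(node, force_binary: bool = False) -> bytes:
    """Serialize a leaf (bytes) or a list of nodes."""
    if isinstance(node, bytes):
        # A leaf reached with force_binary=False was already checked
        # by its parent list, so it is a valid text-string.
        return binary_string(node) if force_binary else node
    binary = force_binary or needs_binary(node)
    body = b" ".join(serialize(child, binary) for child in node)
    return b"{" + body + b"}" if binary else b"(" + body + b")"


def message(name: str, args: list) -> bytes:
    """Build one complete message, including the newline separator."""
    return name.encode("ascii") + serialize(args) + b"\n"
```

For example, message("line", [[b"14.55", b"3.1"], [b"44.2", b"0"], b"5 mm"]) produces an all-binary message because the width contains a space, while a non-text value inside the first coordinate pair turns only that pair into a binary-list.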