Lihata C API

1. Introduction

The Lihata C API is a set of libraries layered on top of each other, so that the user can choose how many of the layers the application code uses. The layers are:

core constants and types and associated utility functions (lihata.h)
event parser (parser.h) - depends on core
DOM parser that builds a tree (dom.h) - depends on the event parser
tree utils (tree.h) - depends on the DOM parser

Size, code complexity and API complexity increase from the core toward the higher layers.

Each level of the stack is designed to be reasonably reentrant:

the application may maintain multiple lihata documents
threaded apps: concurrent read/write operations on different documents will work
threaded apps: concurrent read operations are guaranteed to work on a single document
threaded apps: concurrent read while write will not work on a single document: the application should not attempt to change a document while other threads are reading it
threaded apps: concurrent write operations will not work on a single document; the application is responsible for implementing a locking mechanism

This is achieved by using document handles on all levels. These handles store all internal data and parser states of the document and the library does not depend on global variables. For many read operations (most notably tree print functions) any call-local storage is allocated on the stack so multiple concurrent calls on the same non-changing tree shall work.

No layer of the library depends on threading or thread libraries. For non-threaded applications the above concurrency rules restrict the operations that can be executed from callback functions. In threaded applications a write operation shall take an exclusive lock on the lihata document to make sure no read session is confused by the write. For example, a recursive tree print of a lihata document in one thread may produce undefined behavior if the tree is changed during the descent.

2. The core

The core library is not useful on its own, but it is required by all the layers above it. It deals with the basic types and constants derived from the specification, and with the possible parse errors.

3. The event parser

The event parser consumes the document character by character in a non-blocking manner. The caller is responsible for feeding the parser until it returns an error or PE_STOP. Normally the caller shall pass on the document without any filtering (even passing the EOF). If the parser detects an error or a valid EOF (root node closed), it stops parsing and returns PE_STOP upon subsequent calls. This means the event parser ignores anything beyond the root node.
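
The feeding loop described above can be sketched as follows. To keep the example compilable without the library, a toy stand-in parser is used that only tracks brace nesting; toy_parser_t, toy_parser_char(), feed_string() and PE_ERROR are hypothetical names, and only PE_SUCCESS and PE_STOP are taken from the text. The real event parser's types and calls may well look different.

```c
#include <stdio.h>

/* Stand-in result codes; PE_ERROR and the numeric values are assumptions. */
typedef enum { PE_SUCCESS, PE_STOP, PE_ERROR } pe_result_t;

/* Toy parser state: only tracks brace nesting. It stands in for the
   real event parser so that the loop below can be compiled and run. */
typedef struct {
	int depth;   /* current nesting depth */
	int stopped; /* set once the root is closed or an error occurred */
} toy_parser_t;

/* Consume one character (or EOF); never blocks. */
pe_result_t toy_parser_char(toy_parser_t *p, int c)
{
	if (p->stopped)
		return PE_STOP; /* anything beyond the root node is ignored */
	if (c == EOF) { /* EOF arrived before the root node was closed */
		p->stopped = 1;
		return PE_ERROR;
	}
	if (c == '{')
		p->depth++;
	else if (c == '}') {
		if (p->depth == 0) { /* close without a matching open */
			p->stopped = 1;
			return PE_ERROR;
		}
		if (--p->depth == 0)
			p->stopped = 1; /* root closed: later calls return PE_STOP */
	}
	return PE_SUCCESS;
}

/* Feed a string one character at a time, followed by EOF, until the
   parser returns an error or PE_STOP. Returns that last result and
   stores the number of characters accepted with PE_SUCCESS in *used. */
pe_result_t feed_string(toy_parser_t *p, const char *s, size_t *used)
{
	*used = 0;
	for (;;) {
		int c = (*s != '\0') ? (unsigned char)*s++ : EOF;
		pe_result_t r = toy_parser_char(p, c);
		if (r != PE_SUCCESS)
			return r;
		(*used)++;
	}
}
```

The caller does no filtering of its own: every character, including the final EOF, is passed on, and the loop ends only when the parser says so.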

Parsing a stream with a single root node behaves exactly like parsing a file document. When parsing a stream that does not have a single root node (for example several documents sent one after another), the caller should reinitialize the event parser each time a root node is closed.

Detecting when the root node is closed can be implemented in different ways:

check whether two consecutive calls return PE_SUCCESS and then PE_STOP
trace the depth of nesting in the event handler; when the outermost node is closed, the root is closed
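
The second method, tracing the nesting depth, might look like the sketch below. The event kinds EV_OPEN, EV_CLOSE and EV_TEXTDATA and the depth_trace_t type are hypothetical names chosen for illustration; the event parser's own event type and callback signature are not shown here.

```c
/* Hypothetical event kinds; the real event parser's names may differ. */
typedef enum { EV_OPEN, EV_CLOSE, EV_TEXTDATA } ev_kind_t;

/* Caller-owned state carried between event callbacks. */
typedef struct {
	int depth;       /* number of currently open nodes */
	int root_closed; /* becomes 1 when the outermost node is closed */
} depth_trace_t;

/* Call from the event handler: update the depth and flag root closure. */
void depth_trace_event(depth_trace_t *t, ev_kind_t ev)
{
	switch (ev) {
	case EV_OPEN:
		t->depth++;
		break;
	case EV_CLOSE:
		if (t->depth > 0 && --t->depth == 0)
			t->root_closed = 1;
		break;
	case EV_TEXTDATA:
		/* childless node: a single event, the depth does not change */
		break;
	}
}
```

Once root_closed is set, a caller that reads several documents from one stream can reset the state and reinitialize the event parser before feeding the next character.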

The event parser generates open/close events in pairs for nodes that may have children. The close event is anonymous: if the event callback needs to pair up close events with open events, or to remember the node name or type at close, it should maintain its own internal database of open nodes. Node types that cannot have children trigger only a single event that holds all properties of the node (e.g. text or symlink nodes do not have open/close events but a single textdata event).
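
One simple form of such a database of open nodes is a stack of names: push on open, pop on close. The sketch below assumes a fixed maximum nesting depth and name length; open_stack_t, OPEN_MAX and OPEN_NAME_LEN are hypothetical names and limits picked for the example, not values taken from the library.

```c
#include <stdio.h>

#define OPEN_MAX      32 /* assumed maximum nesting depth */
#define OPEN_NAME_LEN 64 /* assumed maximum name length, including '\0' */

/* Names of the nodes that are currently open, outermost first. */
typedef struct {
	char name[OPEN_MAX][OPEN_NAME_LEN];
	int used;
} open_stack_t;

/* Call from the open event: remember the node name. Returns 0 on
   success, -1 if the stack is full. Overlong names are truncated. */
int open_stack_push(open_stack_t *st, const char *name)
{
	if (st->used >= OPEN_MAX)
		return -1;
	snprintf(st->name[st->used], OPEN_NAME_LEN, "%s", name);
	st->used++;
	return 0;
}

/* Call from the anonymous close event: returns the name of the node
   being closed, or NULL if no node is open. The returned pointer is
   valid until the next push. */
const char *open_stack_pop(open_stack_t *st)
{
	if (st->used <= 0)
		return NULL;
	st->used--;
	return st->name[st->used];
}
```

The same stack could also carry the node type if the callback needs it at close time. Note that the stack size bounds the nesting depth the application accepts.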

The event parser is not required to remember more than two textual tokens and the type of the current node. This allows the caller to set an upper limit on the memory usage of the event parser. However, it also means the event parser cannot perform all the checks and validation that the lihata specification requires of a document; implementing these is the responsibility of the caller:

keys in a hash must be unique
children of a table must be lists
rows of a table should have the same length

Note: the DOM parser does implement all these checks.
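
For a caller that uses the event parser directly, two of the checks listed above could be sketched like this. All names (hash_keys_t, table_rows_t and their functions) and the KEYS_MAX limit are hypothetical; the sketch only shows the bookkeeping, not how the names and row lengths are obtained from the events.

```c
#include <string.h>

#define KEYS_MAX 64 /* assumed upper limit of children per hash */

/* Keys seen so far in the hash that is currently open. Only pointers
   are stored: the caller keeps the strings alive while the hash is open. */
typedef struct {
	const char *key[KEYS_MAX];
	int used;
} hash_keys_t;

/* Returns 0 if the key is new, 1 if it is a duplicate, -1 if full. */
int hash_keys_add(hash_keys_t *h, const char *key)
{
	int n;
	for (n = 0; n < h->used; n++)
		if (strcmp(h->key[n], key) == 0)
			return 1;
	if (h->used >= KEYS_MAX)
		return -1;
	h->key[h->used++] = key;
	return 0;
}

/* Row length bookkeeping for the table that is currently open. */
typedef struct {
	int first_len; /* length of the first row; -1 until it is known */
} table_rows_t;

/* Call when a row of the table is closed, with its number of cells.
   Returns 0 if the length matches the first row, 1 otherwise. */
int table_rows_check(table_rows_t *t, int row_len)
{
	if (t->first_len < 0) {
		t->first_len = row_len;
		return 0;
	}
	return (row_len == t->first_len) ? 0 : 1;
}
```

Remembering every key of a hash gives up part of the bounded memory usage that the event parser offers, which is a trade-off the caller has to decide on.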

The event parser is recommended when the application does not need random access to the data but can process it sequentially. This often means the document consists of lists/tables and the application expects a specific order of data; or the document is built of hashes and the application sets the value of variables by name. The simplest example expects a single list or hash node as the root, with setting nodes of known order and/or name.

NOTE: this does not mean that the order of nodes matters under a hash.
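
Setting variables by name from a root hash could look like the sketch below. The settings_t structure, its fields and settings_set() are hypothetical; the function is meant to be called from the event handler for each text node under the root hash, with the node's name and value as they arrive.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical application settings filled in from a root hash. */
typedef struct {
	int width;
	int height;
	char title[32];
} settings_t;

/* Set one variable by name. Returns 0 if the name was recognized,
   -1 otherwise. Nodes may arrive in any order. */
int settings_set(settings_t *s, const char *name, const char *value)
{
	if (strcmp(name, "width") == 0)
		s->width = atoi(value);
	else if (strcmp(name, "height") == 0)
		s->height = atoi(value);
	else if (strcmp(name, "title") == 0)
		snprintf(s->title, sizeof(s->title), "%s", value);
	else
		return -1;
	return 0;
}
```

Because each value is looked up by name, the result is the same regardless of the order in which the nodes appear under the hash.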
