plumb internals

VP programming

- don't write to a blocked source
- don't write from ->start(), because the sink VP/RP may not be initialized yet
- strclone() any arguments that need to be used later, because args will be free'd after ->init()
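The last rule can be illustrated with a minimal sketch. The VP struct, the init signature and sketch_strclone() are hypothetical stand-ins (the real ->init() prototype and strclone() are not shown in this document); the point is only that a private copy is taken before the caller frees its args.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-VP state; not the real plumb structure. */
typedef struct {
	char *name; /* private copy, owned by the VP */
} sketch_vp_t;

/* Stand-in for strclone(): returns a heap copy of s, or NULL on failure. */
static char *sketch_strclone(const char *s)
{
	size_t n;
	char *c;
	assert(s != NULL);
	n = strlen(s) + 1;
	c = malloc(n);
	if (c != NULL)
		memcpy(c, s, n);
	return c;
}

/* Hypothetical init callback: args are only valid until this returns,
   so anything needed later is cloned here. */
static int sketch_vp_init(sketch_vp_t *vp, char **args)
{
	vp->name = sketch_strclone(args[0]);
	return vp->name != NULL ? 0 : -1;
}
```

Storing args[0] directly instead of the clone would leave vp->name dangling once the caller frees the argument vector.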

fd handling

eof handling

On fd_close(), the fd is first marked as being in the closing state. Further calls of fd_close() on a closing fd return with no effect. fd_write() on a closing fd fails (returns -1). fd_close() then closes the other side of the pipe (if the fd was bound) and both fds are marked as eof'd. fd_write() fails on eof'd fds.
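The closing/eof rules above can be modelled in a few lines. This is a sketch only: the struct, the sketch_ function names and the return values of the close function are assumptions, and closing the peer through a recursive call is one possible way to implement "closes the other side".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the close/eof flags; not the real plumb fd struct. */
typedef struct close_fd_s close_fd_t;
struct close_fd_s {
	int closing;      /* set first by close; later closes are no-ops */
	int eof;          /* set on both ends once the pipe is torn down */
	close_fd_t *peer; /* other side of the pipe, NULL if unbound */
};

/* Documented rule: writes fail with -1 on closing or eof'd fds. */
static int sketch_fd_write(const close_fd_t *fd)
{
	assert(fd != NULL);
	if (fd->closing || fd->eof)
		return -1;
	return 0; /* a real implementation would write data here */
}

/* Returns 1 if this call performed the close, 0 if the fd was already
   closing (these return values are an assumption of the sketch). */
static int sketch_fd_close(close_fd_t *fd)
{
	assert(fd != NULL);
	if (fd->closing)
		return 0;
	fd->closing = 1;
	if (fd->peer != NULL) {
		/* the peer's own peer is fd, which is already closing,
		   so the recursion stops after one level */
		sketch_fd_close(fd->peer);
		fd->peer->eof = 1;
	}
	fd->eof = 1;
	return 1;
}
```

Setting the closing flag before touching the peer is what makes the mutual close terminate.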

When fd_close() is closing the socket, it will also:

blocking, unblocking

A block backpromotion from an RP originates either from a short write in fd_write(), which triggers a back_block() call, or from an fd_pause() call (typically made when a process is paused).

After a short write, the portion that was not written is saved in the output buffer of the fd and the fd is watched for P_POLLOUT. When it becomes writable again, the buffer is written out first. If the buffer is empty, and the fd is not paused, unblocking is backpromoted.
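A minimal model of the short-write path described above, assuming a hypothetical output-buffer struct and a stand-in for write(2) that accepts a limited number of bytes per call. None of these names come from the plumb sources; the flags only mirror the described behaviour (save the unwritten tail, watch for writability, unblock when drained and not paused).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
	char  *obuf;         /* unwritten tail saved after a short write */
	size_t obuf_len;
	int    want_pollout; /* stands in for watching the fd for P_POLLOUT */
	int    blocked;      /* a block has been backpromoted */
	int    paused;
	size_t capacity;     /* test hook: bytes accepted per raw write */
} out_fd_t;

/* Stand-in for write(2): accepts at most o->capacity bytes. */
static size_t sketch_raw_write(const out_fd_t *o, const char *data, size_t len)
{
	(void)data;
	return len < o->capacity ? len : o->capacity;
}

/* Returns 0 on success, -1 on allocation failure. */
static int sketch_write(out_fd_t *o, const char *data, size_t len)
{
	size_t done = 0;
	assert(o != NULL && data != NULL);
	if (o->obuf_len == 0) /* write directly only if nothing is queued */
		done = sketch_raw_write(o, data, len);
	if (done < len) {
		size_t rest = len - done;
		char *nb = realloc(o->obuf, o->obuf_len + rest);
		if (nb == NULL)
			return -1;
		memcpy(nb + o->obuf_len, data + done, rest);
		o->obuf = nb;
		o->obuf_len += rest;
		o->want_pollout = 1;
		o->blocked = 1; /* short write: block is backpromoted */
	}
	return 0;
}

/* Called when the fd becomes writable again: drain the buffer first. */
static void sketch_on_writable(out_fd_t *o)
{
	if (o->obuf_len > 0) {
		size_t done = sketch_raw_write(o, o->obuf, o->obuf_len);
		memmove(o->obuf, o->obuf + done, o->obuf_len - done);
		o->obuf_len -= done;
	}
	if (o->obuf_len == 0) {
		o->want_pollout = 0;
		if (!o->paused)
			o->blocked = 0; /* unblocking is backpromoted */
	}
}
```

With a capacity of 4, a 10-byte write leaves 6 bytes buffered and takes two writable events to drain before the fd unblocks.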

pausing, unpausing

All sinks of a paused process should act as if they were blocking. The ->paused field of an fd remembers whether the fd is paused. When fd_pause() is called to pause or unpause an fd, it acts only if the call would change the pause status. When pausing/unpausing, block backpromotion happens only if the final blocking state changes (both pause and normal blocking can change that state).
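The "only on a change of the final blocking state" rule can be sketched as follows. Only the paused field is taken from the text; the other field, the counter and the function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an fd's flow-control state. */
typedef struct {
	bool paused;         /* mirrors the ->paused field */
	bool write_blocked;  /* normal blocking, e.g. after a short write */
	int  backpromotions; /* how many block/unblock events were sent upstream */
} flow_fd_t;

/* The final blocking state is the OR of pause and normal blocking. */
static bool sketch_is_blocking(const flow_fd_t *fd)
{
	return fd->paused || fd->write_blocked;
}

/* Returns true if a block or unblock was backpromoted. */
static bool sketch_fd_pause(flow_fd_t *fd, bool pause)
{
	bool before;
	assert(fd != NULL);
	if (fd->paused == pause)
		return false; /* pause status unchanged: no action */
	before = sketch_is_blocking(fd);
	fd->paused = pause;
	if (sketch_is_blocking(fd) == before)
		return false; /* final blocking state unchanged */
	fd->backpromotions++;
	return true;
}
```

For example, unpausing an fd that is still blocked by a pending short write changes ->paused but sends nothing upstream.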

transactions

stages of a transaction

A transaction is processed in multiple stages, in the following order, jumping to the next stage only if the previous is finished:
  1. parsing (flex+bison): emits a binary version of the script (exe_t); variable assignments are done in the parser
  2. create: create all objects (processes, pipes); create and run RPs, create and ->init VPs
  3. start: call ->start of all VPs

When a flush is executed, parsing stops and the remaining stages are run on the script seen so far; then the binary script buffer is deleted and parsing continues after the flush exactly as at the beginning of a new file.

Data flow may start (P_poll called) only after the start stage has finished. This allows ->start of VPs to change piping inside the transaction without having to worry about buffers and partial lines: no data has been read from any of the RPs yet. Of course this is not true for processes created by earlier transactions.

It is also guaranteed that by the time ->start is called, all objects of the transaction are already created.
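The ordering guarantees of the create and start stages can be sketched as below. The parsing stage is left out, and all types and function names are hypothetical; the sketch only shows that every VP is initialized before any ->start runs and that polling is enabled last.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical VP bookkeeping for one transaction. */
typedef struct {
	int inited;          /* set in the create stage (->init) */
	int started;         /* set in the start stage (->start) */
	int saw_all_created; /* recorded by start: were all VPs inited? */
} stage_vp_t;

typedef struct {
	stage_vp_t *vps;
	size_t      n;
	int         polling; /* data flow enabled (P_poll would be called) */
} stage_trans_t;

/* Stage 2: create every object of the transaction. */
static void sketch_stage_create(stage_trans_t *t)
{
	size_t i;
	for (i = 0; i < t->n; i++)
		t->vps[i].inited = 1;
}

/* Stage 3: start every VP; no data flows yet. */
static void sketch_stage_start(stage_trans_t *t)
{
	size_t i, j;
	for (i = 0; i < t->n; i++) {
		int all = 1;
		for (j = 0; j < t->n; j++)
			if (!t->vps[j].inited)
				all = 0;
		assert(!t->polling); /* start always runs before polling */
		t->vps[i].saw_all_created = all;
		t->vps[i].started = 1;
	}
}

/* Run the stages strictly in order, then allow data flow. */
static void sketch_trans_run(stage_trans_t *t)
{
	assert(t != NULL);
	sketch_stage_create(t);
	sketch_stage_start(t);
	t->polling = 1;
}
```

Because the stages are separate loops rather than one interleaved loop, a VP's start can safely look at or rebind any other object of the same transaction.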

includes, subtransactions, new transactions

A normal include does not create a new transaction; the content of the included file is parsed in-place, just like the content of variables. A lib-include, on the other hand, creates a new subtransaction, which is run from the parent transaction. Since a subtransaction is a plain transaction command, the order of commands is kept and the subtransaction is executed exactly between the two statements that surround it in the script. A subtransaction does not share the transaction-local variables with its parent.

Virtual process [new] creates a new, independent transaction, which is not a subtransaction of any other transaction (has no parent). By the time ->init of [new] returns, the new transaction is already executed.

Each instance of [cmd] creates its own transaction with properties similar to those of the transactions of [new], except that there is no implicit flush: the parsed but pending statements of the transaction are executed only when a flush command is parsed.

variables in transactions

Since there are two scopes (transaction-local and global), var_*() functions need to act on both, local first. After the transaction finishes, variables are copied to the global scope. Subtransactions will push variables directly to the global scope instead of into the parent transaction. TODO: is this a good idea?
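A minimal sketch of the two-scope lookup and the copy-on-finish step described above. The fixed-size table and every function name are assumptions made for illustration; the real var_*() functions and their storage are not shown in this document.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SCOPE_MAX 16

/* Hypothetical variable table; strings are borrowed, not copied. */
typedef struct {
	const char *name[SCOPE_MAX];
	const char *val[SCOPE_MAX];
	size_t n;
} scope_t;

static const char *sketch_scope_get(const scope_t *s, const char *name)
{
	size_t i;
	for (i = 0; i < s->n; i++)
		if (strcmp(s->name[i], name) == 0)
			return s->val[i];
	return NULL;
}

/* Overwrites an existing entry or appends; returns -1 if the table is full. */
static int sketch_scope_set(scope_t *s, const char *name, const char *val)
{
	size_t i;
	for (i = 0; i < s->n; i++)
		if (strcmp(s->name[i], name) == 0) {
			s->val[i] = val;
			return 0;
		}
	if (s->n >= SCOPE_MAX)
		return -1;
	s->name[s->n] = name;
	s->val[s->n] = val;
	s->n++;
	return 0;
}

/* Lookup acts on both scopes, local first. */
static const char *sketch_var_get(const scope_t *local, const scope_t *global,
                                  const char *name)
{
	const char *v = sketch_scope_get(local, name);
	return v != NULL ? v : sketch_scope_get(global, name);
}

/* When the transaction finishes, locals are copied to the global scope. */
static int sketch_trans_finish(scope_t *local, scope_t *global)
{
	size_t i;
	for (i = 0; i < local->n; i++)
		if (sketch_scope_set(global, local->name[i], local->val[i]) != 0)
			return -1;
	local->n = 0;
	return 0;
}
```

In this model a subtransaction would call sketch_scope_set() on the global table directly instead of on its parent's local table.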

rebinding of fds from VPs

- rebinding for [open] and [env]: ->start does the rebinding; this may happen after a source/sink RP is already running, but it certainly happens before we poll(), so within a transaction it is not possible to write data to the wrong pipe (the one bound before the rebind)

shutdown procedure

1. make sure all RPs are killed (4 stages: shut down standard script, close stdio, term, kill); happens for error/abort and clean exit
2. clean exit also cleans up memory (for valgrind)

parser

string allocation policy

- string allocation is done in lex; the grammar (y) doesn't free(); gram_execute() is careful about freeing because some of the lex-allocated names will end up in hash keys