plumb internals
VP programming
VP programming:
- don't write to a blocked source
- don't write from start() because sink VP/RP may not be initialized yet
- strclone() any arguments that need to be used later, because args will be free'd after ->init()
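A minimal sketch of the strclone() rule above. Everything in it is made up for illustration: example_vp_t, the ->init signature and clone_str(), which only stands in for the project's strclone() (its real signature is not shown in these notes).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the project's strclone(); the real signature is assumed. */
char *clone_str(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);
	if (copy != NULL)
		memcpy(copy, s, len);
	return copy;
}

/* Hypothetical per-instance state of a VP. */
typedef struct {
	char *path; /* private copy of argv[0], owned by the VP */
} example_vp_t;

/* Hypothetical ->init: returns 0 on success, -1 on error. */
int example_vp_init(example_vp_t *vp, int argc, char **argv)
{
	assert(vp != NULL);
	if (argc < 1)
		return -1;
	/* argv[] is free'd by the caller after ->init, so keep a copy */
	vp->path = clone_str(argv[0]);
	return (vp->path != NULL) ? 0 : -1;
}

/* Hypothetical cleanup: the VP frees what it cloned. */
void example_vp_uninit(example_vp_t *vp)
{
	free(vp->path);
	vp->path = NULL;
}
```

Storing argv[0] directly instead of the copy would leave vp->path dangling as soon as the caller frees the args.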
fd handling
eof handling
On fd_close(), the fd is first marked as closing. Further calls of
fd_close() on a closing fd return with no effect. fd_write() on
a closing fd fails (returns -1). fd_close() then closes the other
side of the pipe (if the fd was bound) and both fds are marked as eof'd.
fd_write() also fails on eof'd fds.
When fd_close() is closing the socket, it also:
- generates a close event
- if the other side of the pipe is a VP, notifies it
- if the other side of the pipe is an RP, makes sure data is not lost: if it is a blocking sink, sets close-after-write; else closes the socket for reading or writing (shutdown)
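The closing/eof rules can be modelled with a toy state machine. This is only a sketch under assumptions: toy_fd_t and the toy_* functions are hypothetical, events and VP/RP notification are left out, and closing the peer by a recursive call is a guess at how the closing flag is used, not a description of the real fd_close().

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_fd_s toy_fd_t;
struct toy_fd_s {
	int closing;    /* set first, makes repeated closes harmless */
	int eof;        /* set on both sides once the pipe is closed */
	toy_fd_t *peer; /* other side of the pipe, NULL if unbound */
};

/* Returns the number of bytes "written", or -1 on a closing/eof'd fd. */
int toy_fd_write(toy_fd_t *fd, const char *data, size_t len)
{
	(void)data; /* the toy model does not store any data */
	if (fd->closing || fd->eof)
		return -1;
	return (int)len;
}

/* Returns 1 if this call performed the close, 0 if it had no effect. */
int toy_fd_close(toy_fd_t *fd)
{
	assert(fd != NULL);
	if (fd->closing)
		return 0;
	fd->closing = 1;
	if (fd->peer != NULL) {
		/* the peer's close calls back into fd, which is a no-op by now */
		toy_fd_close(fd->peer);
		fd->peer->eof = 1;
	}
	fd->eof = 1;
	return 1;
}
```

In this model the closing flag is what stops the two sides of a bound pipe from closing each other forever.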
blocking, unblocking
A block backpromotion from an RP originates either from a short write in
fd_write(), which triggers a back_block() call, or from an fd_pause() call
(typically made when a process is paused).
After a short write, the portion that was not written is saved in the
output buffer of the fd and the fd is watched for P_POLLOUT. When it
becomes writable again, the buffer is written out first. Once the buffer
is empty and the fd is not paused, unblocking is backpromoted.
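The short-write handling can be sketched with a toy output buffer. All names are hypothetical; the accept argument stands in for however many bytes the OS would take right now, and the blocked flag stands in for the back_block() call and the later unblock backpromotion. Queueing new data behind a non-empty buffer is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BUF_MAX 256

typedef struct {
	char outbuf[TOY_BUF_MAX]; /* unwritten remainder of short writes */
	size_t outlen;
	int watch_pollout;        /* 1 while waiting for P_POLLOUT */
	int paused;
	int blocked;              /* 1 while a block is backpromoted */
} toy_out_t;

/* Returns how many bytes went out immediately; the rest is buffered. */
size_t toy_write(toy_out_t *o, const char *data, size_t len, size_t accept)
{
	size_t written = 0, rest;

	if (o->outlen == 0) /* nothing queued: write directly */
		written = (len < accept) ? len : accept;
	rest = len - written;
	if (rest > 0) { /* short write: save the tail, block the source */
		assert(o->outlen + rest <= TOY_BUF_MAX);
		memcpy(o->outbuf + o->outlen, data + written, rest);
		o->outlen += rest;
		o->watch_pollout = 1;
		o->blocked = 1;
	}
	return written;
}

/* Called when the fd becomes writable again: drain the buffer first. */
void toy_on_writable(toy_out_t *o, size_t accept)
{
	size_t n = (o->outlen < accept) ? o->outlen : accept;

	memmove(o->outbuf, o->outbuf + n, o->outlen - n);
	o->outlen -= n;
	if (o->outlen == 0) {
		o->watch_pollout = 0;
		if (!o->paused)
			o->blocked = 0; /* unblock only if not paused */
	}
}
```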
pausing, unpausing
All sinks of a paused process should act as if they were blocking. The
->paused field of the fd remembers whether the fd is paused. When
fd_pause() is called to pause or unpause an fd, it acts only if the
call would change the pause status. When pausing/unpausing, block
backpromotion happens only if the final blocking state changes (both pause
and normal blocking can change that state).
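One way to read the rule above: the final blocking state is the OR of "paused" and "blocked by a short write", and a backpromotion is due only when that OR changes. A sketch under that assumption, with hypothetical names (toy_sink_t, toy_pause()); the promotions counter exists only to make the effect visible.

```c
/* Hypothetical model of a sink fd. */
typedef struct {
	int paused;     /* mirrors the ->paused field */
	int wblocked;   /* blocked because of a pending short write */
	int promotions; /* number of backpromoted state changes */
} toy_sink_t;

/* Final blocking state: either reason alone is enough to block. */
int toy_final_blocked(const toy_sink_t *s)
{
	return s->paused || s->wblocked;
}

/* Returns 1 if a block/unblock was backpromoted, 0 otherwise. */
int toy_pause(toy_sink_t *s, int pause)
{
	int before;

	pause = !!pause;
	if (s->paused == pause)
		return 0; /* pause status would not change: do nothing */
	before = toy_final_blocked(s);
	s->paused = pause;
	if (toy_final_blocked(s) == before)
		return 0; /* still (un)blocked for the other reason */
	s->promotions++;
	return 1;
}
```

For example, unpausing an fd that still has unwritten data in its output buffer changes ->paused but promotes nothing, because the fd stays blocked.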
transactions
stages of a transaction
A transaction is processed in multiple stages, in the following order; each
stage starts only after the previous one has finished:
- parsing (flex+bison): emits a binary version of the script (exe_t); variable assignments are done in the parser
- create: create all objects (processes, pipes); create and run RPs, create and ->init VPs
- start: call ->start of all VPs
When a flush is executed, parsing stops and the remaining stages are run
on the script seen so far. Then the binary script buffer is deleted and
parsing continues after the flush, the same way as at the beginning of
a new file.
Data flow may start (P_poll called) only after the start stage has finished.
This allows ->start of VPs to change piping inside the transaction without
having to worry about buffers and partial lines: no data has been read from
any of the RPs yet. Of course this is not true for processes created by
earlier transactions.
It is also guaranteed that by the time ->start is called, all objects of
the transaction are already created.
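The ordering guarantees can be illustrated with a toy model of the create and start stages. Parsing is left out, and toy_trans_t and the toy_* functions are hypothetical. The start callback itself checks the two guarantees: every object already exists, and polling has not begun.

```c
#include <assert.h>

#define TOY_MAX_VPS 8

typedef struct {
	int created[TOY_MAX_VPS]; /* set in the create stage */
	int started[TOY_MAX_VPS]; /* set in the start stage */
	int nvps;
	int polling;              /* 1 once data flow may start */
} toy_trans_t;

/* Hypothetical ->start of VP idx; fails if a guarantee is broken. */
int toy_vp_start(toy_trans_t *t, int idx)
{
	int i;

	for (i = 0; i < t->nvps; i++)
		if (!t->created[i])
			return -1; /* some object is not created yet */
	if (t->polling)
		return -1;     /* data may already be flowing */
	t->started[idx] = 1;
	return 0;
}

/* Runs the stages in order; returns 0 on success, -1 on error. */
int toy_trans_run(toy_trans_t *t)
{
	int i;

	assert(t->nvps <= TOY_MAX_VPS);
	for (i = 0; i < t->nvps; i++) /* create: objects and ->init */
		t->created[i] = 1;
	for (i = 0; i < t->nvps; i++) /* start: ->start of all VPs */
		if (toy_vp_start(t, i) != 0)
			return -1;
	t->polling = 1;               /* only now enter the poll loop */
	return 0;
}
```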
includes, subtransactions, new transactions
A normal include does not create a new transaction; the content of
the included file is parsed in-place, just like the content of variables.
On the other hand, lib-include creates a new subtransaction, which is
run from the parent transaction. Since a subtransaction is a plain transaction
command, the order of commands is kept and the subtransaction is executed
exactly between the two statements that surround it in the script. A
subtransaction does not share the transaction-local variables with its parent.
Virtual process [new] creates a new, independent transaction, which is not a
subtransaction of any other transaction (has no parent). By the time ->init
of [new] returns, the new transaction is already executed.
Each instance of [cmd] creates its own transaction, with properties similar
to those of the transactions of [new], except that there is no implicit
flush: the parsed but pending statements of the transaction are executed
only when a flush command is parsed.
variables in transactions
Since there are two scopes (transaction-local and global), var_*() functions
need to act on both, local first. After the transaction finishes, variables
are copied to the global scope. Subtransactions will push variables
directly to the global scope instead of into the parent transaction. TODO: is this a good idea?
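A minimal sketch of the two-scope lookup, assuming a simple array-backed scope. The toy_* names are hypothetical and are not the real var_*() API. Lookup tries the local scope first, and finishing the transaction copies the local variables to the global scope.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_VARS 16

typedef struct { const char *name; const char *value; } toy_var_t;
typedef struct { toy_var_t vars[TOY_MAX_VARS]; int used; } toy_scope_t;

/* Returns the value of name in one scope, or NULL if it is not set. */
const char *toy_scope_get(const toy_scope_t *s, const char *name)
{
	int i;

	for (i = 0; i < s->used; i++)
		if (strcmp(s->vars[i].name, name) == 0)
			return s->vars[i].value;
	return NULL;
}

/* Sets name in one scope; stores the pointers, does not copy strings. */
void toy_scope_set(toy_scope_t *s, const char *name, const char *value)
{
	int i;

	for (i = 0; i < s->used; i++) {
		if (strcmp(s->vars[i].name, name) == 0) {
			s->vars[i].value = value;
			return;
		}
	}
	assert(s->used < TOY_MAX_VARS);
	s->vars[s->used].name = name;
	s->vars[s->used].value = value;
	s->used++;
}

/* Lookup across both scopes: local first, then global. */
const char *toy_var_get(const toy_scope_t *local, const toy_scope_t *global,
	const char *name)
{
	const char *v = toy_scope_get(local, name);
	return (v != NULL) ? v : toy_scope_get(global, name);
}

/* End of transaction: copy the local variables to the global scope. */
void toy_trans_finish(toy_scope_t *local, toy_scope_t *global)
{
	int i;

	for (i = 0; i < local->used; i++)
		toy_scope_set(global, local->vars[i].name, local->vars[i].value);
	local->used = 0;
}
```

A subtransaction as described above would, in this model, call toy_scope_set() on the global scope directly rather than on its parent's local scope.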
rebinding of fds from VPs
- rebinding for [open] and [env]: ->start does the rebinding, which may happen after a source/sink RP is already running, but certainly happens before we poll(), so it is not possible to write data to the wrong pipe (the one before the rebind) within a transaction
shutdown procedure
1. make sure all RPs are killed (4 stages: shut down standard script, close stdio, term, kill); happens for error/abort and clean exit
2. clean exit also cleans up memory (for valgrind)
parser
string allocation policy
- string allocation is done in lex; y (the grammar) doesn't free(); gram_execute() is careful about freeing because some of the lex-allocated names end up as hash keys