The two main features in plumb that help solve the problem are:
The example in this document will demonstrate how to build a script that implements a simple control loop in a simulated environment. Both the controller and the simulator implement a simple, blocking main loop.
The simulator (fan.c, below) simulates a thermal system with a heat producer, a heat sink, a controllable fan and a sensor that measures the temperature of the sink. The purpose of the control loop is to keep the temperature around a predefined target value. The simulation will be the same for all examples.
Example: fan.c
    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>
    #include <time.h>

    /* target: 60 C
       load   fan
       0.651  1
    */

    double air = 25.0;        /* [C] air temp */
    double temp = 25.0;       /* [C] starting temperature is room temperature */
    double load = 0.2;        /* load percent generating the heat */
    double fan_rpm = 0.0;     /* current rpm of the fan in percent */
    double fan_rpm_min = 0.2; /* minimum fan rpm, if non-zero; anything between zero and this value will become this value */
    double cm_sink = 40.0;
    double cm_air = 5.0;

    void sim(void)
    {
        double rpm, Q_out, Q_in;

        /* the device is heating heat sink with Q_in energy */
        Q_in = load * 1000.0;
        temp += Q_in / (cm_sink);

        /* calculate Q_out depending on fan rpm using a nonlinear fan function */
        rpm = fan_rpm;
        if ((rpm < fan_rpm_min) && (rpm != 0))
            rpm = fan_rpm_min;
        else if (rpm > 1.0)
            rpm = 1.0;
        Q_out = (log(rpm*3.0+1.0)+0.2);
        //printf("rpm=%f Qin=%f Qout=%f\n", rpm, Q_in, Q_out);
        temp -= Q_out / (cm_air) * (temp - air);
        if (temp < air)
            temp = air;
    }

    int main(int argc, char *argv[])
    {
        float f;
        unsigned long int last_t, t;
        char line[1024];

        last_t = time(NULL) - 2; /* make sure we have some iterations to run after the first control arrives */
        while(!(feof(stdin))) {
            /* read line by line */
            *line = 0;
            fgets(line, sizeof(line), stdin);

            /* float means control, anything else is just tick for the timer */
            if (sscanf(line, "%f", &f) == 1)
                fan_rpm = f;

            /* run as many iterations as many seconds passed after the last invocation */
            for(t = time(NULL); last_t < t; last_t++) {
                load += (double)(rand() % 2000 - 1000) / 10000.0;
                if (load < 0.0)
                    load = 0.0;
                if (load > 1.0)
                    load = 1.0;
                sim();
                printf("%f %f %f\n", temp, load, fan_rpm);
                fflush(stdout);
            }
        }
        return 0;
    }
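The header comment of fan.c appears to record an operating point: a target of 60 C at load 0.651 with the fan at 1. Assuming that reading, and assuming the load is held constant (the real simulator lets it drift randomly), the claim can be checked against the fixed point of one sim() step. The symbols a, k and T* are introduced here for illustration only and do not appear in the source.

```latex
% One sim() step: heat the sink first, then cool it toward the air temperature
T_{n+1} = (T_n + a)(1 - k) + k\,T_{air},
\qquad a = \frac{1000 \cdot load}{cm\_sink},
\qquad k = \frac{\ln(3\,rpm + 1) + 0.2}{cm\_air}

% Fixed point: T_{n+1} = T_n = T^*
T^* = T_{air} + a\,\frac{1 - k}{k}

% load = 0.651, rpm = 1, T_air = 25, cm_sink = 40, cm_air = 5
a = 16.275, \qquad k = \frac{\ln 4 + 0.2}{5} \approx 0.3173, \qquad
T^* \approx 25 + 16.275 \cdot 2.152 \approx 60.0
```

Under these assumptions the sink settles at about 60 C, which matches the target used by the control scripts.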
The controller is a script written in awk. It will evolve with the plumb script throughout the tutorial.
When running the tests, output will have 3 columns of numbers. First column is the current temperature, second is the incoming heat (0..1), third is the fan control (0..1).
Example: example1.awk
    # feed in something to get the loop started - turn off fan and see what happens
    BEGIN {
        print "0.0";
        fflush();
    }

    # got results, do the control
    {
        print $0 > "/dev/stderr"
        temp=$1

        # calculate control output, between 0 and 1
        ctrl = (temp - target) / 12.0
        if (ctrl < 0)
            ctrl = 0
        if (ctrl > 1)
            ctrl = 1

        # print control output
        print ctrl
        fflush()

        # also feed the simulator with ticks
        system("sleep 1")
        print "tick"
        fflush()
    }
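The control rule in the awk script is a clamped proportional law. As a minimal sketch, the same arithmetic can be written in C, the language of the simulator; the function name control_output is hypothetical, and the divisor 12.0 is taken from the awk script.

```c
/* Proportional control law mirroring example1.awk: the fan command grows
 * linearly with the temperature excess over the target and is clamped
 * to the range [0, 1]. */
double control_output(double temp, double target)
{
    double ctrl = (temp - target) / 12.0;

    if (ctrl < 0.0)
        ctrl = 0.0;
    if (ctrl > 1.0)
        ctrl = 1.0;
    return ctrl;
}
```

With a target of 60, a reading of 66 C gives 0.5, anything at or below 60 C gives 0, and anything at or above 72 C saturates at 1.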
The plumb script that runs the two processes together:
Example: example1.pb
    # We have a hardware simulator process:
    sim={"./fan"}

    # The following awk script is a very simple controller
    control={gawk -v "target=60" -f example1.awk }

    # two pipes, for the traditional '69' setup:
    sim:1 | control:0
    control:1 | sim:0

    # to see that it's working, redirect control's stderr to plumb's output:
    control:2 | env:1
The full syntax can be found in the manual plumb(5). A brief extract for understanding the script: a real process is created with name={cmd args}; pipelines work like in the shell; file descriptors of already created processes are referenced as name:num; the default file descriptor assignment is the same as in the shell (0=stdin, 1=stdout, 2=stderr). However, there is no default stdio binding: none of the fds of a process are redirected implicitly, not even stderr, so the plumb script must always be explicit about all fds. A virtual process named "env" is created automatically on startup; binding any fd of env binds the given fd of plumb (env:0 is the stdin of plumb).
On the plumbviz drawing of this script, all grey boxes are implicit objects created by plumb's stdscript on startup. The example script is in the box called "-f example1.pb". (Figure: legend of the drawing.)
The script names its processes (sim and control). Besides being required for fd references (which are essential for piping), names improve the readability of the script and the drawing, so it is good practice to name most processes. The drawing is generated using plumbviz(1).
There is a trick built into fan.c for avoiding deadlocks. The loop depends on data circulating. If both the control script and the simulator sit waiting for their first input before calculating any output, the loop will not start. As a quick fix, fan.c backdates its clock by two seconds, so it simulates two iterations as soon as the first control line (printed by the BEGIN block of the awk script) arrives, and emits two lines of output that start the loop. This introduces a fixed delay of 2 iterations in the control loop, because by the time the control script reads a sensor value, the simulation is already 2 iterations in the future (2 iterations of sensor data are waiting in a buffer).
The simulation runs as fast as input flows, and the control loop responds immediately to any sensor output. Without artificial delays, the whole simulation would run at maximum speed, taking up a lot of the host system's resources. To make the simulation "real-time", there is an artificial delay built in: the awk script waits a second before generating the next output.
In example 3 a more elegant timing is demonstrated.
Finally, the standard error output of the control process is redirected to the standard output of plumb. This will make the whole process visible to the user.
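An alternative solution is to let plumb do the timing by feeding the ticks of a [timer] virtual process into the simulator through a [hub] that also carries the controller output: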
Example: example3.pb
    # We have a hardware simulator process and a controller:
    sim=[hub] | {"./fan"} | control={gawk -v "target=60" -f example3.awk } | sim:*

    # timer ticks for running the sim
    time_base=[timer period=0.5 repeat=0] | sim:*

    # to see that it's working, redirect control's stderr to plumb's output:
    control:2 | env:1
The awk control script got shorter: timing and loop workaround are thrown out:
Example: example3.awk
    # whenever we got results, do the control
    {
        print $0 > "/dev/stderr"
        temp=$1

        # calculate control output, between 0 and 1
        ctrl = (temp - target) / 12.0
        if (ctrl < 0)
            ctrl = 0
        if (ctrl > 1)
            ctrl = 1

        # print control output
        print ctrl
        fflush()
    }
For referencing file descriptors of the [hub], sim:* is used. This syntax is common with [hub]s, "*" means "take the first free file descriptor". This feature is useful with any virtual process that does not use dedicated file descriptors, such as a [hub]. Not having dedicated file descriptor assignment means there's no stdio and no convention for fd numbers - any file descriptor can be either input or output (and the direction is obvious from the script). NOTE: for plumb, stdio does not exist, it's only a convention real processes and most virtual processes and the user follow.
The rule for merging the streams at the [hub] is simple: whenever a record is read on a sink fd (input), it is copied to all source fds (output). As long as the control script writes a full line at once and it is flushed, and the timer writes a full line at once, the [hub] will get records which are each a text line. It is guaranteed that hub will not mix or merge content of different records, so on the output there will be intact text lines.
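The merging rule can be illustrated with a toy in-memory model. This is only a sketch of the rule as described in the text, not plumb's implementation: struct hub, hub_record and the buffer sizes are hypothetical names chosen for the example, and real hubs work on file descriptors rather than memory buffers.

```c
#include <stddef.h>
#include <string.h>

#define HUB_MAX_OUT 4
#define HUB_BUF_SIZE 256

/* Toy model of a hub: each source (output) end is an in-memory buffer. */
struct hub {
    size_t n_out;                        /* source ends in use, at most HUB_MAX_OUT */
    size_t len[HUB_MAX_OUT];             /* bytes stored per source end */
    char out[HUB_MAX_OUT][HUB_BUF_SIZE]; /* data seen by each source end */
};

/* Deliver one record, read on any sink end, to every source end.
 * The record is copied whole or not at all, so records coming from
 * different writers are never mixed or merged.
 * Returns 0 on success, -1 if any buffer lacks room. */
int hub_record(struct hub *h, const char *rec, size_t rec_len)
{
    size_t i;

    for (i = 0; i < h->n_out; i++)
        if (h->len[i] + rec_len > HUB_BUF_SIZE)
            return -1;

    for (i = 0; i < h->n_out; i++) {
        memcpy(h->out[i] + h->len[i], rec, rec_len);
        h->len[i] += rec_len;
    }
    return 0;
}
```

If the controller delivers the record "0.5\n" and a timer then delivers a tick record, every source end sees the two lines intact and in arrival order. The guarantee only covers whole records, which is why each writer has to emit a full line per write.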
NOTE: this implementation is not yet line-safe; the next step will fix that.
Plumb does not do implicit line splitting/buffering, but it is easy to do explicit line splitting. There are two virtual processes that should be used in series: the first breaks the stream into one line per record (removing the separators), the second appends a single newline at the end of each record:
    [split "\n\r"] | [affix suffix="\n"]

Since this is a feature used very often, there is a shorthand for this pipeline, called $LSP in library stdio. The first line of the new script includes the stdio library.
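The effect of the split-then-affix pipeline on a block of text can be sketched in C. This is an illustration under stated assumptions, not plumb's code: the function name normalize_lines is hypothetical, runs of separator characters are assumed to count as a single separator (so no empty records are produced), and the sketch works on one complete buffer, whereas a streaming splitter would also have to hold back a partial line until the rest arrives.

```c
#include <stddef.h>
#include <string.h>

/* Rewrite `in` so that every non-empty run of characters between
 * separators ('\n' or '\r') becomes one record ending in a single '\n'.
 * Returns the number of bytes written, not counting the terminating
 * NUL, or (size_t)-1 if `out` is too small.
 * ASSUMPTION: consecutive separators are treated as one. */
size_t normalize_lines(const char *in, char *out, size_t out_size)
{
    const char *seps = "\n\r";
    size_t used = 0;

    if (out_size == 0)
        return (size_t)-1;

    while (*in != '\0') {
        size_t rec_len;

        in += strspn(in, seps);       /* skip separators */
        rec_len = strcspn(in, seps);  /* length of the next record */
        if (rec_len == 0)
            break;                    /* only separators were left */
        if (used + rec_len + 2 > out_size)
            return (size_t)-1;        /* record + '\n' + NUL must fit */
        memcpy(out + used, in, rec_len);
        used += rec_len;
        out[used++] = '\n';
        in += rec_len;
    }
    out[used] = '\0';
    return used;
}
```

For the input "1.5\r\n0.2\n" the output is "1.5\n0.2\n"; an unterminated last line such as "a\nb" gains its missing newline.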
Example: example4.pb
    include stdio

    # We have a hardware simulator process and a controller:
    sim=[hub] | {"./fan"} | control={gawk -v "target=60" -f example3.awk } | $LSP | sim:*

    # timer ticks for running the sim
    time_base=[timer period=0.5 repeat=0] | sim:*

    # to see that it's working, redirect control's stderr to plumb's output:
    control:2 | stdout:*
Besides providing $LSP, stdio also hooks env:0, env:1 and env:2, connecting [hub]s to them, allowing multiple processes to read or write the stdio of plumb.
The only other change to the plumb script is the $LSP inserted at the output of the control script. Note: because of the lack of implicit line splitting, the traffic between fan and control is unfiltered. Control is an awk script and awk has its own input line splitting. Not having a line split between the two processes saves some CPU and makes the data flow smoother.
Example: example5.pb
    include stdio

    # We have a hardware simulator process and a controller:
    sim=[hub] | {"./fan"} | {gawk -v "target=60" -f example5.awk } | $LSP | control=[hub]

    # timer ticks for running the sim
    time_base=[timer period=0.5 repeat=0] | sim:*

    # loopback pipe
    control:* | cm=[regex pattern="^ctrl " subst=""] | sim:*

    # to see that it's working, redirect control's "log" to plumb's output:
    control:* | lm=[regex pattern="^log "] | stdout:*
In this example the previous script is modified to read prefixed lines from the control script. When the script wants to send a line to the simulation, it prefixes the line with "ctrl "; when it wants to put a line in the log (on the stdout of plumb), the prefix is "log ". Since line splitting happens at its output, the new [hub] (called control) outputs text lines. This stream is fed into two other pipelines, each using a regex to filter for a specific prefix. The first also removes the prefix (substituting it with an empty string), because control commands looped back to the fan sim should contain nothing but numbers. The second pipeline demonstrates how to keep the prefix: it matches without substituting and dumps the stream on plumb's stdout as-is.
These two pipelines run in parallel; the one that alters the stream does so on a copy, so they don't interfere. The one that only matches and does not alter the stream does not copy or buffer any data.
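The prefix-based routing can be sketched in C as a plain string operation. The names after_prefix, route_line and enum route are hypothetical, and the sketch assumes that a line matching neither prefix is dropped by both filters; it models the routing decision only, not plumb's [regex] process.

```c
#include <stddef.h>
#include <string.h>

/* If `line` starts with `prefix`, return a pointer to the text after
 * the prefix; otherwise return NULL. */
const char *after_prefix(const char *line, const char *prefix)
{
    size_t n = strlen(prefix);

    return strncmp(line, prefix, n) == 0 ? line + n : NULL;
}

enum route { TO_SIM, TO_LOG, DROPPED };

/* Decide where one controller output line goes and what is delivered.
 * ASSUMPTION: lines matching neither prefix are discarded. */
enum route route_line(const char *line, const char **payload)
{
    const char *rest = after_prefix(line, "ctrl ");

    if (rest != NULL) {
        *payload = rest;   /* prefix removed: the sim gets the bare number */
        return TO_SIM;
    }
    if (after_prefix(line, "log ") != NULL) {
        *payload = line;   /* prefix kept: the line is logged as-is */
        return TO_LOG;
    }
    *payload = NULL;
    return DROPPED;
}
```

A line "ctrl 0.5" reaches the simulator as "0.5", while "log 61.2 0.3 0.1" appears on the log unchanged, prefix included.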
The only modification to the awk script since example 3 is that the output lines are prefixed:
Example: example5.awk
    # whenever we got results, do the control
    {
        print "log " $0
        temp=$1

        # calculate control output, between 0 and 1
        ctrl = (temp - target) / 12.0
        if (ctrl < 0)
            ctrl = 0
        if (ctrl > 1)
            ctrl = 1

        # print control output
        print "ctrl " ctrl
        fflush()
    }
Example: example6.pb
    include stdio

    # We have a hardware simulator process and a controller:
    sim_in=[hub] | {"./fan"} | sim_out=[hub] | $LSP | [affix prefix="sim "] \
      | control_in=[hub] | {gawk -v "target=60" -f example6.awk } \
      | $LSP | control_out=[hub]

    # timer ticks for running the sim
    time_base=[timer period=0.5 repeat=0] | sim_in:*

    # loopback pipe
    control_out:* | cm=[regex pattern="^ctrl " subst="" pass=all] | sim_in:*

    # to see that it's working, redirect control's "log" to plumb's output:
    control_out:* | lm=[regex pattern="^log "] | stdout:*

    # minimalistic cli: prefix and redirect plumb's stdin to the controller
    stdin:* | $LSP | Cm=[affix prefix="CLI "] | control_in:*
The awk script is modified to process both input types:
Example: example6.awk
    # whenever we got results, do the control
    /^sim / {
        print "log " $0
        temp=$2

        # calculate control output, between 0 and 1
        ctrl = (temp - target) / 12.0
        if (ctrl < 0)
            ctrl = 0
        if (ctrl > 1)
            ctrl = 1

        # print control output
        print "ctrl " ctrl
        fflush()
        next
    }

    # change target from CLI
    /^CLI target / {
        target = $3
        print "log new target temp: " target
        next
    }

    # catch-all rules
    /^CLI / {
        print "log ERROR: unknown command on CLI: " $0
        fflush()
        next
    }

    {
        print "log ERROR: unknown input prefix: " $0
        fflush()
    }
The only command the script takes is the target num command, where num is a floating point number, the new target temperature.
Note: stdin:* needs line splitting because plumb doesn't handle input in any special way; it is just a stream of records of whatever size read(2) happens to return.
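For comparison, the recognition of the prefixed "CLI target num" line can be sketched in C. The function name parse_cli_target is hypothetical, and the sketch is deliberately stricter than the awk rule: it rejects a line whose argument is not a number, whereas the awk script assigns whatever the third field contains.

```c
#include <stdlib.h>
#include <string.h>

/* Parse a prefixed CLI line of the form "CLI target <num>".
 * Returns 1 and stores the new target on success; returns 0 and leaves
 * *target untouched for any other line. */
int parse_cli_target(const char *line, double *target)
{
    static const char prefix[] = "CLI target ";
    const char *arg;
    char *end;
    double value;

    if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
        return 0;
    arg = line + sizeof(prefix) - 1;
    value = strtod(arg, &end);
    if (end == arg)
        return 0;   /* no number after the command */
    *target = value;
    return 1;
}
```

Typing "target 55.5" on plumb's stdin arrives at the controller as "CLI target 55.5", which this function accepts; "CLI fan on" would fall through to the unknown-command branch.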
Example: example7.pb
    include stdio

    #configuration:
    logfile="example7.log"

    # We have a hardware simulator process and a controller:
    sim_in=[hub] | {"./fan"} | sim_out=[hub] | $LSP | [affix prefix="sim "] \
      | control_in=[hub] | {gawk -v "target=60" -f example6.awk } \
      | $LSP | control_out=[hub]

    # timer ticks for running the sim
    time_base=[timer period=0.5 repeat=0] | sim_in:*

    # loopback pipe
    control_out:* | cm=[regex pattern="^ctrl " subst="" pass=all] | sim_in:*

    # to see that it's working, redirect control's "log" to plumb's output and to
    # a log file:
    control_out:* | lm=[regex pattern="^log "] | loglines=[hub] | stdout:*
    loglines:* | [open ${logfile}]

    # minimalistic cli: prefix and redirect plumb's stdin to the controller
    stdin:* | $LSP | Cm=[affix prefix="CLI "] | control_in:*
The file name is not hardwired at [open] but is stored in a variable. The ${} variable substitution is chosen to ensure the file name is not interpreted - it may contain whitespace or even a full plumb script.
It is also possible, and good practice for larger scripts, to keep such settings in a separate file, which the script then pulls in using the include statement.
The script could be much more readable and somewhat shorter (if tee is used often) by having a generic implementation that can be inserted in any pipeline with a file name argument.
The following example demonstrates how to implement a subscript that can be easily reused with virtual process [new].
Example: example8.pb
    include stdio

    #configuration:
    logfile="example8.log"

    #implement a generic tee subscript
    tee='
        env:0 | h=[hub] | env:1
        h:* | [open ${name}]
    '

    # We have a hardware simulator process and a controller:
    sim_in=[hub] | {"./fan"} | sim_out=[hub] | $LSP | [affix prefix="sim "] \
      | control_in=[hub] | {gawk -v "target=60" -f example6.awk } \
      | $LSP | control_out=[hub]

    # timer ticks for running the sim
    time_base=[timer period=0.5 repeat=0] | sim_in:*

    # loopback pipe
    control_out:* | cm=[regex pattern="^ctrl " subst="" pass=all] | sim_in:*

    # to see that it's working, redirect control's "log" to plumb's output and to
    # a log file:
    control_out:* | lm=[regex pattern="^log "] | loglines=[new tee name=${logfile}] | stdout:*

    # minimalistic cli: prefix and redirect plumb's stdin to the controller
    stdin:* | $LSP | Cm=[affix prefix="CLI "] | control_in:*
The subscript is specified as a variable, on multiple lines (for readability). When [new] is inserted in the logger pipeline, it creates an "instance" of the variable named in its first argument ("tee" in our case) by parsing it and creating each object ([hub] and [open]) prefixed with its own name (loglines) and a period. In the above example this means loglines.h for the [hub], and a randomly generated name prefixed with "loglines." for the [open]. The original fd bindings of [new] are rebound to the newly created processes, in place of the env: references (which stand for the environment of the subscript instance). The parent script normally doesn't want to access internal processes created by [new], so naming them is not necessary except for internal references. However, once [new] has finished, the objects it created are not special: they live in the same global namespace as the parent script, and it is possible to bind directly to loglines.h. Still, for the readability of the script, it is encouraged to use env: and multiple file descriptors as an API instead of direct references to prefixed objects.
The only mandatory argument for [new] is the first one, which is the name of the variable that holds the subscript to instantiate. The rest of the arguments are optional key=value pairs. When present, they act like variable assignments visible only to the subscript, and are used for passing arguments to the subscript.
Note 1: library stdio also implements variable tee, the same way as the example script did, so it is not necessary to copy this specific subscript in user scripts.
Note 2: for subscripts taking no argument, there's an even shorter way of reference: a variable that has the [new] command. This is how $LSP works: it contains "[new $LSP_]", which is interpreted by the parser if the $LSP or $(LSP) syntax is used. In turn $LSP_ contains the subscript that splits the input into single line records and ensures each record has a single newline at the end.
Example: example9.pb
    include stdio

    #configuration:
    logfile="example9.log"

    #implement a generic tee subscript
    tee='
        env:0 | h=[hub] | env:1
        h:* | [open ${name}]
    '

    # We have a hardware simulator process and a controller:
    sim_in=[hub] | {"./fan"} | sim_out=[hub] | $LSP | [affix prefix="sim "] \
      | control_in=[hub] | {gawk -v "target=60" -f example6.awk } \
      | $LSP | control_out=[hub]

    # timer ticks for running the sim
    time_base=[timer period=0.5 repeat=0] | sim_in:*

    # loopback pipe
    control_out:* | cm=[regex pattern="^ctrl " subst="" pass=all] | sim_in:*

    # to see that it's working, redirect control's "log" to plumb's output and to
    # a log file:
    control_out:* | lm=[regex pattern="^log "] | loglines=[new tee name=${logfile}] | stdout:*

    # minimalistic cli: prefix and redirect plumb's stdin to the controller
    stdin:* | $LSP | Cm=[affix prefix="CLI "] | control_in:*

    # echo a banner in front of the stream
    [echo "Fan control - plumb tutorial"] | [hub eof=ignore-on-sink] | loglines.h:*
There is an extra [hub] built in right after the echo. This is required because echo sends out an EOF after printing the message. By default, when this EOF reaches a [hub] (loglines.h), it shuts the hub down. In this script, shutting down the hub of tee would back-promote the EOF through the regex match and the control hub to the awk script, which would quit. In turn, the EOF on the stdin of the quitting awk would be back-promoted to the sim process, which would quit too, and since both sticky processes would have ended, plumb would exit as well. To avoid this chain of unwanted EOFs rooted in [echo], the above mentioned hub is introduced with the eof=ignore-on-sink argument, which makes it ignore the EOF read on its input so that it does not appear on its output.
Note 1: a similar thing happens if multiple processes of a large plumb script write to stdout or stderr: the first process closing its output would shut down the whole script. Having a separate ignore-on-sink hub on each process would not be practical. Simply ignoring all EOFs on the stdout and stderr [hub]s would be bad too, as in some situations EOF back-promotion through them is the mechanism by which the script properly shuts down and quits. To overcome this problem, stdio provides two sets of [hub]s, one with lowercase names, the other with uppercase names. The lowercase variants (stdout, stderr) work without side effects; the uppercase variants (STDOUT, STDERR) ignore EOF. They can be used in parallel.
Note 2: The only reason the banner ends up as the first line of the log stream is that the control script writes its first line to the same pipe only after the first timer tick has gone through the loop. This would be a race condition if a real process were used in place of [echo], because it would depend on that process emitting its output before the first timer message reaches the awk script. [echo], however, is a virtual process that emits the string almost immediately after it has been started, which in the worst case is the same iteration in which [timer] runs. The message sent out by [timer], on the other hand, has to go through real processes, which requires several more plumb iterations (at least reading from fan, writing to awk, then reading from awk). Thus the order is guaranteed.
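The difference between the default behaviour and eof=ignore-on-sink can be reduced to a two-state toy model. This is a sketch of the policy described in the text, not plumb's implementation; struct hub_eof and hub_sink_eof are hypothetical names.

```c
#include <stdbool.h>

/* Toy model of a hub's reaction to EOF on a sink (input) end. */
struct hub_eof {
    bool ignore_on_sink;  /* models the eof=ignore-on-sink argument */
    bool open;            /* false once the hub has shut down */
};

/* Called when EOF is read on one of the hub's sink ends.
 * Returns true if the EOF is passed on: the hub shuts down and its
 * neighbours see the EOF too. Returns false if the EOF is swallowed
 * and the hub stays open. */
bool hub_sink_eof(struct hub_eof *h)
{
    if (h->ignore_on_sink)
        return false;
    h->open = false;
    return true;
}
```

In the banner pipeline the extra hub plays the role of a hub_eof with ignore_on_sink set: the EOF from [echo] stops there, so the hub inside tee never sees it and the control loop keeps running.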