A global variable is created the first time it is referenced. Scalar variables are created with the value (and type) NIL.
Variables are normally referenced by their name. A variable name must start with an underscore or a letter and can consist of underscores, letters and digits. Variable names are case sensitive.
A variable is a piece of data memory, named by the variable name.
A global variable is a variable that is accessible from every function within a script and will always access the same memory region with the same name. A global variable exists from the time the script is loaded until it is unloaded.
A local variable is function-local: it exists only for a specific call of a function; different functions or different calls of the same function will see a different memory location using the same local variable name. More on local variables in chapter 4. This chapter will deal with global variables only.
Each variable is either a scalar or an array. A scalar variable holds a single value, an array holds multiple values. This chapter describes scalars only - more on arrays in chapter 5.
Each variable stores a value and a data type. The data type is one of:
type name | description | example |
---|---|---|
NUM | rational number | 4, -6, 3.14, 0, 0x2a |
STR | text string | "hello, world!" |
FUNC | function ref | main |
NIL | special empty value | "" |
ARRAY | arrays | see chapter 5 |
The data type is determined automatically at runtime: when the value of the variable is changed, the data type is adjusted as required.
The NIL type is a special type used for indicating "no value". It is written as "" in fawk source code: the empty string is the same as NIL. Two NIL values compare equal, and a NIL value is not-equal to any non-NIL value. Any other comparison that involves a NIL value yields false (e.g. both 1<NIL and 1>NIL are false).
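The NIL comparison rules above can be illustrated with a short sketch. It assumes only what this chapter states: NIL is written as "", and the relational operators behave as described. The variable name and the printed messages are made up for illustration.

```fawk
function main(ARGV)
{
    a = "";    # "" is NIL, so a holds "no value"

    # NIL compared to NIL with == : equal
    if (a == "") { fawk_print("a is NIL"); }

    # NIL compared to a non-NIL value with != : not-equal
    if (a != 5) { fawk_print("a differs from 5"); }

    # any other comparison involving NIL is false, so neither line is printed
    if (1 < a) { fawk_print("1 < NIL"); }
    if (1 > a) { fawk_print("1 > NIL"); }
}
```

Only the first two messages should appear, because the ordering comparisons against NIL are both false.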
Variables are changed using operators. The simplest operator is the assignment operator (written as '=').
example program ch2_ex1 |
---|
function main(ARGV) { foo = 5; } |
example program ch2_ex2 |
---|
function main(ARGV) { foo = 5; foo = foo + 3; } |
example program ch2_ex3 |
---|
function main(ARGV) { foo = 5; foo = foo + 3; fawk_print_cell(foo); fawk_print(foo); } |
example program ch2_ex4 |
---|
function main(ARGV) { foo = "5"; bar = 3; fawk_print(foo+bar); } |
Arithmetic operators | ||
---|---|---|
syntax | description | example |
a + b | sum of a and b | foo+3 |
a - b | subtract b from a | foo-3 |
a * b | multiply a by b | 10*foo |
a / b | divide a by b | foo/3 |
a % b | integer remainder of a/b | foo%3 |
-a | negate a | -5, -foo |
(a) | change precedence | (4+2)*8 |
v++ | shorthand: v = v+1 | |
v-- | shorthand: v = v-1 | |
++v | shorthand: v = v+1 | |
--v | shorthand: v = v-1 | |
v += a | shorthand: v = v+a | |
v -= a | shorthand: v = v-a | |
v *= a | shorthand: v = v*a | |
v %= a | shorthand: v = v%a | |
The difference between v++ and ++v is the same as in C: whether the value of the expression is decided before or after the increment. Take the following code: v=5;fawk_print(v++); this will print 5, but the value of v is 6 after v++. Code v=5;fawk_print(++v); will print 6 and the value of v is 6 at the end. The difference is "first take the value, then increment" or "first increment then take the value".
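A minimal sketch of the two cases side by side, storing the expression result in a separate variable so that both the result and the final value of v can be inspected. The variable names a and b are chosen for illustration only.

```fawk
function main(ARGV)
{
    v = 5;
    a = v++;          # first take the value (5), then increment v to 6
    fawk_print(a);    # prints 5
    fawk_print(v);    # prints 6

    v = 5;
    b = ++v;          # first increment v to 6, then take the value
    fawk_print(b);    # prints 6
    fawk_print(v);    # prints 6
}
```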
Boolean operators | ||
---|---|---|
syntax | description | example |
!a | logical negate a | v = !a |
a && b | true if a and b are true | |
a || b | true if a or b is true | |
Note: boolean operators will look at the numeric value of the operand. Value zero is interpreted as false, any other value is interpreted as true.
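As a small sketch of this rule, the program below uses a zero and a non-zero number as boolean operands. The variable names and messages are illustrative; the if statement is used in the same form as in the other examples of this chapter.

```fawk
function main(ARGV)
{
    a = 0;     # zero: interpreted as false
    b = 42;    # non-zero: interpreted as true

    if (!a) { fawk_print("!a is true"); }
    if (a || b) { fawk_print("a || b is true"); }

    # a is false, so the AND is false and nothing is printed here
    if (a && b) { fawk_print("a && b is true"); }
}
```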
Relational operators | ||
---|---|---|
syntax | description | example |
a == b | true if a is equal to b | if (a == 5) ... |
a != b | true if a is not equal to b | if (a != 5) ... |
a > b | true if a is greater than b | if (a > 5) ... |
a < b | true if a is less than b | if (a < 5) ... |
a >= b | a is greater than or equal to b | if (a >= 5) ... |
a <= b | a is less than or equal to b | if (a <= 5) ... |
Misc operators | ||
---|---|---|
syntax | description | example |
v = a | copy the value of a to v | foo=5 foo=bar |
a @ b | concatenate strings a and b | "foo" @ "bar" is "foobar" |
a ? b : c | ternary | a > 10 ? b-- : b++ |
v[] | explicit array (chapter 5) | my_func(FOO[]) |
v[b] | array indexing (chapter 5) | FOO[3] |
v.b | array indexing (chapter 5) | FOO.bar |
&v | used in C function call: pass by reference (chapter 5) | delete(&FOO) |
The operators listed above are used to build expressions. An expression is evaluated in place and yields a result, which is a type and a value. The result can be used as an operand in another expression, as a function call parameter, as a function return value, or it can be discarded:
example program ch2_ex5

    function main(ARGV)
    {
        # the result of 2 * 3 is used as an operand for the +; result is 7:
        a = 1 + 2 * 3;

        # () changes precedence so (1 + 2) is calculated first; result is 9:
        b = (1 + 2) * 3;

        # as a function call parameter: expression is evaluated first, the result
        # (which is 7) is passed to the function:
        fawk_print(1 + 2 * 3);

        # "void expression": an expression that is evaluated but the result
        # is discarded; the result is 3 and is not used for anything
        1+2;

        # "void expressions" are used for their side effects. A typical example
        # is the increment operator. In the example below the result of ++c is 11,
        # but that result is discarded; the side effect of the ++ operator
        # (the value of c is increased) remains, so the script will print 11
        c = 10;
        ++c;
        fawk_print(c);
    }
In fact fawk_print() is a function that we call. A function always has a return value, so a function call is an expression too. So far all our calls to fawk_print() were "void expressions" because we discarded the return value of fawk_print().
Note: a "void expression" is the way an expression is turned into a statement, which is required because a function body is a sequence of statements.
When there is a logical AND or OR in an expression, the left side is evaluated first. If the left side determines the result already, the right side is not evaluated at all. For example in if (foo() || bar()) { ... } if foo() is true, the result will always be true so bar() is not evaluated. Since in this case the right side is a function call, this means bar() is not called.
Similarly, if the left side of an AND operator is false, the result is false and the right side is not evaluated.
This obviously doesn't make a difference if the right side is something like (a > 4), but it does make a difference if the right side has a side effect (any effect that changes global state, e.g. global variables, or calls to C functions, even indirectly).
In a ternary operator only two expressions are evaluated: the condition and either the true or the false side. This is not merely a CPU optimization; it affects side effects as well:
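The effect can be observed without writing extra functions by putting an increment on the right side. This is a sketch that uses only the operators from this chapter; the counter n and the messages are made up for illustration.

```fawk
function main(ARGV)
{
    a = 5;
    n = 0;

    # left side of || is true: ++n is not evaluated, n stays 0
    if ((a > 4) || (++n)) { fawk_print("OR is true"); }
    fawk_print(n);    # prints 0

    # left side of && is false: ++n is not evaluated, n stays 0
    if ((a < 4) && (++n)) { fawk_print("not printed"); }
    fawk_print(n);    # prints 0

    # left side of && is true: ++n has to be evaluated, n becomes 1
    if ((a > 4) && (++n)) { fawk_print("AND is true"); }
    fawk_print(n);    # prints 1
}
```

The value of n changes only in the last case, where the left side alone does not determine the result.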
function main(ARGV) { a = 10; b = 20; a > 5 ? b++ : b--; fawk_print(b); }
This example prints 21, because a is greater than 5 and b++ is evaluated, while b-- is not. This is called short circuit evaluation: what is not needed is not evaluated, so its side effects are not applied either.
All three operands are expressions and the syntax is:
condition ? when_true : when_false
First condition is evaluated. If it is non-zero, then when_true is evaluated and the result of the expression is the result of when_true. If condition is zero then when_false is evaluated and the result of the expression is the result of when_false.
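Because the whole construct is an expression, its result can be assigned like any other value. The sketch below picks the larger of two numbers; the variable names are illustrative, and the extra parentheses are there only so that the example does not rely on operator precedence.

```fawk
function main(ARGV)
{
    a = 3;
    b = 8;

    # the condition (a > b) is zero, so only when_false (b) is evaluated
    max = ((a > b) ? a : b);
    fawk_print(max);    # prints 8
}
```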