There is a query state the caller needs to initialize first. The query state includes:
expr: [ tag_expr ] [] /* always matches the current tag */ node_ref ! expr /* invert the logic value of an expr; if it was NULL, the new value is TRUE */ expr : expr /* apply the right expr on the result of the left expr; e.g. "r : [name==A]" means the right tag match is applied on backreferenced node "r" */ expr / expr /* right expr matched from any first-level children of the left node; evaluates to the right expr */ expr / loop expr /* same as above, but the / operator looped on multiple levels of the tree, considering all children on all levels */ expr . expr /* right expr matched from the first child node of the node named by left node; evaluates to the right expr */ expr . loop expr /* same as above, but the . operator looped on multiple levels of the tree, always considering the first child only */ expr ^ expr /* right expr matched against the direct parent node of the node selected by the left expr; evaluates to the right expr */ expr ^ loop expr /* same as above, but the ^ operator looped: any of the ascendents to the depth loop specifies */ expr > expr /* right expr matched against the next (sibling) node of the node named by the left expr; evaluates to the right expr */ expr > loop expr /* same as above, but the > operator looped: any of the right siblings, spanning up to what loop specifies */ expr && expr /* evaluate left expr, if it matches evaluate the right expr; true if both matches (evaluates to left) */ expr || expr /* evaluate left expr, if it doesn't match evaluate the right expr; true if either matches (evaluates to left or if left is FALSE, to the right) */ name = expr /* overwrite the value of name in the query state map */ (expr) loop: * /* previous operator repeated zero or more times */ + /* previous operator repeated one or more times */ { int } /* previous operator repeated exactly int times */ { int1 - int2 } /* previous operator repeated any times between int1 and int2, inclusive */ node_ref: name /* matches a node only if it is the node named; evaluates to the node if matches (else to NULL); name is alnum */ tag_expr: ! tag_expr /* invert the logic value of a tag_expr */ tag == value /* tag is the node's tag; case sensitive string match */ tag ~ sep r sep /* matches a regex specified between two separator characters */ tag c_op value /* compare tag to value using a custom operator */ tag_expr || tag_expr tag_expr && tag_expr (tag_expr) value: string /* string consists of alnum */ " string " /* quoted string consists of any char (standard \ quoting) */
The value of an expression is a node or FALSE. FALSE can be interpreted as a node reference to a non-existing node (sort of a NULL). Anything non-FALSE is true and is a reference to any existing node. If a true value is negated, its value becomes FALSE. If a true value is negated twice, it retrains its original value.
The value of a tag_expr is false or the node it was matched against. A tag_expr is used to decide whether a given node matches specific criterias evaluating the tags of the node.
When evaluating an expression, there is a Node-Under-Test, or NUT for short; tag_expr is always matched against NUT.The NUT is the root node of the tree or sub-tree when the evaluation of an expression starts. Tree operators change the NUT.
The simples tag expression is the empty expression:
[]This will always match the current NUT. Tags of a tree can be matched using two builtin syntaxes and a custom one:
Custom operators are strings consists of the following characters:
'<', '>', '=', '!', '*', '/', '^', '.'Custom operators can not be longer than 20 characters.
Furthermore the following operators are accepted within a tag_expr:
Examples:
[color==green] | matches if the "color" tag of the NUT is "green" |
[color==green || color=red] | matches if the "color" tag of the NUT is "green" or "red" |
[ !(color==green || color=red) && (type ~ foo) ] | matches if the "color" tag of the NUT is not "green" or "red" and type matches foo. |
For example the parent operator (^) takes the parent of the NUT. The expression "[] ^ [color==red]" is executed as:
NOTE: sides of a tree operator are expressions, they don't have to be tag_expr but can be more complex expressions.
Tree operators:
[] . [] . [] | 1/1 granchild: take a first child of NUT, then take the first child of the child; evaluate a reference to the rightmost node. |
[] / [] . [color == red] | take a child of NUT whose first child has tag color==red; evaluate a reference to the rightmost node. |
[] . [] ^ [] | Evaluates to the NUT itself, if it had a child. If the NUT doesn't have a child, the second [] will evaluate to false as it will not match non-existing nodes. |
* | previous operator repeated zero or more times |
+ | previous operator repeated one or more times |
{ int } | previous operator repeated exactly int times |
{ int1 - int2 } | previous operator repeated any times between int1 and int2, inclusive |
[] >+ [] | iterate over all siblings of the NUT |
[] >* [] | iterate over all siblings of the NUT, including the NUT itself (the important case of "zero times") |
[] ^{2-3} [color=red] | check the grand-parent and the parent of the grand-parent of NUT; return such an ancestor if it is red |
[] . [] >* [] | take all the children of NUT (NOTE: the second and third [] will be the same for the first child, * allows that; same as "[] / []") |
[] >+ [] . [color=green] >* [] | Iterate over all siblings of the NUT, taking the first child of the given sibling. If that child is green, return it and all its siblings Two loops nested: O(n^2) |
Examples:
([] > [color==blue]) && ([] . [color==red]) | matches if the right sibling of NUT is blue and the first child of NUT is red; the value of the expression is the right sibling if both criterias are met or else FALSE. |
([] > [color==blue]) || ([] . [color==red]) | matches if either the right sibling of NUT is blue or the first child of NUT is red; the value of the expression is the right sibling if it is blue, else the first child of NUT if it is red, else FALSE. |
!([] . [color==red]) | matches if the first child of NUT is not red. Evaluates to the child or FALSE. Same as [] . [!color==red] |
!!([] . [color==red]) | matches if the first child of NUT is red. Evaluates to the child or FALSE. Same as !([] . [!color==red]) |
[] /+ m=[] /+ t=[]It will effectively find all combinations so that:
This feature allows the caller to extract detailed search results beyond the one node that is returned by the evaluation as result.
([] /+ m=[] /+ t=[]) && (m ^ [color==red])The left side of the expression is the same as in the previous example; the right side filters the result and accepts only the those matches for which "m's parent is red" is true.
The caller may set map entries before evaluating the expression, and the evaluation is able to access those map values (can read or overwrite them). This feature allows the caller to "pass on parameters" to the evaluation through map variables. For example
[] /+ []is a BFS from the root of the tree, while
e /+ []is a BFS from whichever node map variable 'e' points to.
Note: it is the responsibility of the caller to reset the map between iterations (repeated evaluation of the same expression on the same tree). When some variables of the map are not reset, they can be used to communicate nodes between iterations.
[text=="foo\"bar"]matches if NUT's text is foo"bar.
Beside string equality check, there is a so called "string matching" operation. What "matching" exactly means is up to the caller (most often it is a variant of regex). The syntax for string matching is slightly different:
tag_name ~SEP pattern SEPwhere ~ is the tilde character (string match operator) and SEP is an arbitrary caharacter that does not appear in the pattern. The spaces between the SEP characters and pattern are not part of the syntax, if they are present, they are already part of the pattern. There must be no extra space between the ~ operator and SEP, but SEP might be whitespace. For example the following six tag_exprs all perform the same string pattern matching to "pattern":
[tag_name ~/pattern/] [tag_name ~!pattern!] [tag_name ~"pattern"] [tag_name ~ypatterny] [tag_name ~ pattern ] [tag_name ~ pattern ]SEP may be any character except \0.