Expression engine for conditions
From LimeSurvey Manual
Introduction
A condition is simply a boolean expression, which may itself consist of several expressions that are not necessarily boolean. When implementing a Conditions feature in LS2, it makes sense to start with the most basic building block of such a feature, expressions, and develop a powerful engine that is able to construct, evaluate, parse and store expressions, yet is isolated from its target application.
Constructing expressions
Expressions are built from strings representing the expression in postfix notation (also known as Reverse Polish Notation or RPN). Postfix notation is extremely simple to parse, compact, parenthesis-free and unambiguous about precedence. An expression, no matter how complicated or nested it is or how many variables it contains, can be compactly stored as a single string.
Let's create a simple expression object to evaluate (6 + 7) * 11 < 12*12 (this is true, by the way: 143 < 144).
$exp = ExpressionEngine::create("6 7 + 11 * 12 sqr <");
Individual tokens are separated by spaces. You can use as many spaces as you want, but always at least one. $exp now stores the expression, and we can evaluate it by calling evaluate() on it:
$exp->evaluate(); // returns TRUE
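To see why postfix strings are so easy to evaluate, here is a minimal sketch of a stack-based evaluator. It is not the engine's actual code: the function name evalPostfixSketch is hypothetical, and it only handles numbers, the four arithmetic operators, sqr (square) and <.

```php
<?php
// Hypothetical helper, NOT part of ExpressionEngine: evaluates a postfix
// string with a stack. Supports numbers, + - * /, sqr (square) and <.
function evalPostfixSketch(string $postfix)
{
    $stack = [];
    // Split on runs of whitespace, so extra spaces between tokens are fine.
    $tokens = preg_split('/\s+/', trim($postfix), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($tokens as $token) {
        if (is_numeric($token)) {
            $stack[] = $token + 0; // operand: push its numeric value
            continue;
        }
        if ($token === 'sqr') {    // unary operator: replaces the top value
            $value = array_pop($stack);
            $stack[] = $value * $value;
            continue;
        }
        // Binary operators: the right operand is on top of the stack.
        $right = array_pop($stack);
        $left = array_pop($stack);
        switch ($token) {
            case '+': $stack[] = $left + $right; break;
            case '-': $stack[] = $left - $right; break;
            case '*': $stack[] = $left * $right; break;
            case '/': $stack[] = $left / $right; break;
            case '<': $stack[] = $left < $right; break;
            default:
                throw new InvalidArgumentException("Unknown token: $token");
        }
    }
    if (count($stack) !== 1) {
        throw new InvalidArgumentException('Malformed postfix expression');
    }
    return $stack[0];
}

var_dump(evalPostfixSketch("6 7 + 11 * 12 sqr <")); // bool(true): 143 < 144
```

Each token is handled exactly once and no precedence table or parenthesis matching is needed, which is what makes the notation attractive for storage.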
That's it. You can also construct an expression using classes provided by the engine. For example:
$exp = new LeqExpression(new ConstantExpression(12), new AddExpression(new ConstantExpression(10), new ConstantExpression(3)));
$exp->evaluate(); // returns TRUE
Why would you want to construct an expression this way? Well, imagine you are writing a UI for the Conditions feature. You need to represent the expression currently being built in an object-oriented way, so you would use the expression classes provided for you. Then you'll need to convert it to postfix notation for storage. How would you do that? Easy:
$exp->toPostfix(); // returns "12 10 3 + <"
Expressions in postfix notation are generally hard for humans to comprehend. Can we convert to normal, infix notation, so that the user can read them? Sure:
$exp->toInfix(); // returns "(12<(10+3))"
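Both conversions are simple tree traversals. The sketch below illustrates the idea with hypothetical stand-in classes (NodeSketch, ConstSketch, BinarySketch); the engine's own classes may well be structured differently.

```php
<?php
// Hypothetical stand-ins for the engine's classes, for illustration only.
interface NodeSketch
{
    public function toPostfix(): string;
    public function toInfix(): string;
}

class ConstSketch implements NodeSketch
{
    private $value;

    public function __construct($value)
    {
        $this->value = $value;
    }

    public function toPostfix(): string
    {
        return (string) $this->value;
    }

    public function toInfix(): string
    {
        return (string) $this->value;
    }
}

class BinarySketch implements NodeSketch
{
    private $literal;
    private $left;
    private $right;

    public function __construct(string $literal, NodeSketch $left, NodeSketch $right)
    {
        $this->literal = $literal;
        $this->left = $left;
        $this->right = $right;
    }

    public function toPostfix(): string
    {
        // Post-order traversal: left operand, right operand, then operator.
        return $this->left->toPostfix() . ' ' . $this->right->toPostfix() . ' ' . $this->literal;
    }

    public function toInfix(): string
    {
        // In-order traversal, fully parenthesised so precedence stays explicit.
        return '(' . $this->left->toInfix() . $this->literal . $this->right->toInfix() . ')';
    }
}

$tree = new BinarySketch(
    '<',
    new ConstSketch(12),
    new BinarySketch('+', new ConstSketch(10), new ConstSketch(3))
);
echo $tree->toPostfix(), "\n"; // 12 10 3 + <
echo $tree->toInfix(), "\n";   // (12<(10+3))
```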
Variables
Expressions so far were constant and always returned the same value. Let's add variables. Variable names are prefixed by @ in the postfix notation, and enclosed in { } in infix notation. Variable names should not contain a space. Although the ExpressionEngine imposes no further restrictions, it's best to use sensible ones. @000 is certainly a bad idea.
Variables are late-bound, which means they are not resolved (their value is not evaluated) until the whole expression is evaluated. To attach values to the variable names, use ExpressionEngine::bind(array("variable_name" => "value" ...) ). For example:
$exp = ExpressionEngine::create("6 7 + 11 * @a sqr <");
ExpressionEngine::bind(array("a" => 12));
$exp->evaluate(); // Returns TRUE
ExpressionEngine::bind(array("a" => 11));
$exp->evaluate(); // Returns FALSE
Variables offer a simple but powerful gateway for passing data to expressions.
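Late binding can be pictured as a lookup that happens inside evaluate() rather than at construction time. The following is a minimal sketch under that assumption; BindingsSketch and VariableSketch are hypothetical names, not the engine's classes.

```php
<?php
// Hypothetical sketch: a shared binding table consulted at evaluation time.
class BindingsSketch
{
    private static $values = [];

    public static function bind(array $values): void
    {
        // Later bindings overwrite earlier ones with the same name.
        self::$values = array_merge(self::$values, $values);
    }

    public static function lookup(string $name)
    {
        if (!array_key_exists($name, self::$values)) {
            throw new RuntimeException("Unbound variable: $name");
        }
        return self::$values[$name];
    }
}

class VariableSketch
{
    private $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    public function evaluate()
    {
        // Resolved on every call, so a later bind() changes the result.
        return BindingsSketch::lookup($this->name);
    }
}

$a = new VariableSketch('a');         // created before any value is bound
BindingsSketch::bind(['a' => 12]);
var_dump(143 < $a->evaluate() ** 2);  // bool(true): 143 < 144
BindingsSketch::bind(['a' => 11]);
var_dump(143 < $a->evaluate() ** 2);  // bool(false): 143 < 121
```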
Strings
Strings are delimited using double quotes. To include double quotes inside the string, type them twice in a row.
echo ExpressionEngine::create('"She said: ""no"""')->evaluate();
echo ExpressionEngine::createFromInfix('"She said: ""no"""')->evaluate();
// prints twice: She said: "no"
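Because strings may contain spaces, a tokenizer cannot simply split on whitespace once strings are allowed. One possible approach, sketched here with hypothetical helper names (tokenizeSketch, unquoteSketch) and not taken from the engine's source, is a regular expression that treats a quoted string as a single token and then collapses the doubled quotes.

```php
<?php
// Hypothetical helpers, for illustration only.

// Split a postfix string into tokens, keeping quoted strings in one piece.
function tokenizeSketch(string $postfix): array
{
    // Either a quoted string (where "" stands for one quote) or a run of non-spaces.
    preg_match_all('/"(?:[^"]|"")*"|\S+/', $postfix, $matches);
    return $matches[0];
}

// Strip the delimiters of a string token and collapse doubled quotes.
function unquoteSketch(string $token): string
{
    if (!preg_match('/^"((?:[^"]|"")*)"$/', $token, $m)) {
        throw new InvalidArgumentException("Not a string literal: $token");
    }
    return str_replace('""', '"', $m[1]);
}

print_r(tokenizeSketch('"a b" "c""d" =='));      // three tokens: "a b", "c""d", ==
echo unquoteSketch('"She said: ""no"""'), "\n";  // She said: "no"
```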
Comparison operators work on strings as well. If the expression on either side of a comparison operator is a string, the operator automatically performs a string-based comparison.
ExpressionEngine::createFromInfix('10<5')->evaluate(); // false
ExpressionEngine::createFromInfix('"10"<"5"')->evaluate(); // true
ExpressionEngine::createFromInfix('"10"<5')->evaluate(); // true
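The rule has to be implemented explicitly, because PHP's own < operator compares two numeric strings as numbers. A minimal sketch of the rule follows, assuming the string comparison is a plain byte-wise one (strcmp); the function name lessThanSketch is hypothetical.

```php
<?php
// Hypothetical sketch of the comparison rule; not the engine's own code.
function lessThanSketch($left, $right): bool
{
    if (is_string($left) || is_string($right)) {
        // Assumption: byte-wise comparison, so "10" sorts before "5".
        return strcmp((string) $left, (string) $right) < 0;
    }
    return $left < $right; // both sides numeric: ordinary comparison
}

var_dump(lessThanSketch(10, 5));     // bool(false)
var_dump(lessThanSketch("10", "5")); // bool(true)
var_dump(lessThanSketch("10", 5));   // bool(true)
```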
Custom operators and Expressions
The ExpressionEngine comes with only basic operators, such as those for arithmetic, comparison and logic. However, it allows you to define your own expressions and 'hooks' (called string literals) that the parser will recognize and turn into expression objects. There are also several base classes that provide the boilerplate code for you (see the code for details). There are no restrictions on expression names, but again, if you bind 0 as your literal, you'll obviously run into problems.
IExpression
Every expression must implement the IExpression interface, which declares these functions:
- evaluate() - return expression value
- toInfix() - return infix notation
- toPrefix() - return postfix notation
Example custom expression
Let's create a sample custom expression. This one will be called Sparta and will always return 300. The code is:
/**
 * Defining some custom expression. This one always returns 300, because
 * THIS IS SPARTA!!!!!!!!
 */
class SpartaExpression implements IExpression {
    public function evaluate() {
        return 300;
    }
    public function toPrefix() {
        return "sparta";
    }
    public function toInfix() {
        return "sparta";
    }
}
Before we can use it, we need to register it with the ExpressionEngine.
ExpressionEngine::register('sparta');
The Engine will automatically look for SpartaExpression and add it to the list of recognized expression literals. We can now use our new expression:
$exp = ExpressionEngine::create("30 10 * sparta ==");
$exp->evaluate(); // returns TRUE
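The lookup described above can be pictured as a naming convention plus a table of known literals. The sketch below assumes the class name is derived by capitalising the literal and appending a suffix; RegistrySketch and SpartaSketchExpression are hypothetical names, and the engine's real mechanism may differ.

```php
<?php
// Hypothetical sketch of literal registration; not the engine's own code.
class SpartaSketchExpression
{
    public function evaluate()
    {
        return 300;
    }
}

class RegistrySketch
{
    private static $classes = [];

    public static function register(string $literal): void
    {
        // Assumed convention: 'sparta' maps to SpartaSketchExpression.
        $class = ucfirst($literal) . 'SketchExpression';
        if (!class_exists($class)) {
            throw new InvalidArgumentException("No class $class for literal $literal");
        }
        self::$classes[$literal] = $class;
    }

    public static function create(string $literal)
    {
        // Only registered literals are ever instantiated.
        if (!isset(self::$classes[$literal])) {
            throw new InvalidArgumentException("Unknown literal: $literal");
        }
        $class = self::$classes[$literal];
        return new $class();
    }
}

RegistrySketch::register('sparta');
var_dump(RegistrySketch::create('sparta')->evaluate() === 30 * 10); // bool(true)
```

Because create() only instantiates classes that were registered beforehand, an unrecognised token in an expression string leads to an exception instead of arbitrary code being run.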
Example custom operator
Let's create an operator that accepts two expressions and returns the larger of their values. We will make use of the abstract base class ABinaryExpression, because it implements the toInfix and toPrefix functions for us and also defines a constructor that stores the first expression in the $left variable and the second one in $right. Besides evaluate(), all we need to provide is a getLiteral() function, which should return the literal we want the operator to be recognized by.
class MaxExpression extends ABinaryExpression {
    public function evaluate() {
        $left_value = $this->left->evaluate();
        $right_value = $this->right->evaluate();
        if ($left_value > $right_value) {
            return $left_value;
        } else {
            return $right_value;
        }
    }
    public function getLiteral() {
        return "max";
    }
}
Again, register the 'max' literal and the custom operator is good to go. Look at ABinaryExpression for more details.
Security and Safety
ExpressionEngine is safe by design: during parsing, every literal is checked against the registered literals, so a string that contains anything else is rejected rather than executed, and no harmful code can be passed in. When implementing custom operators and expressions, make sure they are checked for safety and don't perform any potentially risky operations.
Concluding notes
Starting with an Expression Engine, there's still a long way to go in implementing a Conditions feature in LS2. Hopefully, though, the engine will provide a solid base on which custom expressions can be easily developed and maintained, and which lifts the burden of writing boilerplate code from the developers.
Advanced users who don't want to rely on a GUI to build a Condition would probably prefer to write expressions manually in infix notation, so an infix-to-postfix parser comes in handy. An infix notation parser has been implemented following this algorithm: http://en.wikipedia.org/wiki/Operator-precedence_parser
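For readers who want a feel for the approach, here is a minimal sketch of precedence climbing, one form of operator-precedence parsing, that emits postfix directly. It is not the engine's parser: the function names and the precedence table are assumptions, and it only understands numbers, parentheses and the operators + - * / <.

```php
<?php
// Hypothetical sketch of precedence climbing; not the engine's actual parser.
function infixToPostfixSketch(string $infix): string
{
    // Numbers, operators and parentheses; anything else is silently skipped.
    preg_match_all('/\d+(?:\.\d+)?|[-+*\/<()]/', $infix, $m);
    $tokens = $m[0];
    $pos = 0;
    $out = parseExprSketch($tokens, $pos, 0);
    if ($pos !== count($tokens)) {
        throw new InvalidArgumentException('Unexpected token at position ' . $pos);
    }
    return implode(' ', $out);
}

// A primary is a number or a parenthesised sub-expression.
function parsePrimarySketch(array $tokens, int &$pos): array
{
    $token = $tokens[$pos] ?? null;
    $pos++;
    if ($token === '(') {
        $out = parseExprSketch($tokens, $pos, 0);
        if (($tokens[$pos] ?? null) !== ')') {
            throw new InvalidArgumentException('Missing closing parenthesis');
        }
        $pos++;
        return $out;
    }
    if ($token === null || !is_numeric($token)) {
        throw new InvalidArgumentException('Expected a number or (');
    }
    return [$token];
}

function parseExprSketch(array $tokens, int &$pos, int $minPrec): array
{
    // Assumed precedence levels: higher binds tighter.
    $prec = ['<' => 1, '+' => 2, '-' => 2, '*' => 3, '/' => 3];
    $out = parsePrimarySketch($tokens, $pos);
    while ($pos < count($tokens)
        && isset($prec[$tokens[$pos]])
        && $prec[$tokens[$pos]] >= $minPrec
    ) {
        $op = $tokens[$pos++];
        // Left-associative: the right operand must bind tighter than $op.
        $right = parseExprSketch($tokens, $pos, $prec[$op] + 1);
        // Postfix order: left operand, right operand, operator.
        $out = array_merge($out, $right, [$op]);
    }
    return $out;
}

echo infixToPostfixSketch('(6 + 7) * 11 < 12 * 12'), "\n"; // 6 7 + 11 * 12 12 * <
```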