I have an expression/formula like this
std::string expr="((A>0) && (B>5 || C > 10))";
I have done some research and it seems that if A,B,C values are known, by embedding Lua or Python in a C++ program, there are eval
functions that can substitute A,B and C and return true
or false
.
But what happens when I don't know all the values? Let's say that A is known and it is -1. If A is -1 then the formula will evaluate to "false" irrespective of the values of B or C.
Can I evaluate a formula without knowing all the variables in advance? For example if A is 10, it makes sense to lookup for the value of B and re-evaluate again. How can we solve these problems? Ideas?
I don't know of any existing available library for handling this.
The usual approach would be to build an expression-tree and evaluate what is possible - similarly to constant folding in compilers: https://en.wikipedia.org/wiki/Constant_folding
One of the important aspects for that is to know the allowed values for variables, and thus the allowed partial evaluations, e.g. x*0
(and 0*x
) is 0
if x
is an integer, or finite floating point number, but cannot be evaluated if x
is an IEEE floating number (since it could be Nan or infinity), or if x
could be matrix since [1,1]*0
is [0,0]
not the scalar 0
.
One way is to parse the expression into a tree and evaluate the tree. Subexpressions for which all variables are known will be fully evaluated. The effect will be to simplify the tree.
In your example, the tree has &&
at the top with two subtrees, the left one being the tree for A>0
. To evaluate the tree, we evaluate the left subtree, which returns -1, and so we don't need to evaluate the right subtree, because the operator is &&
. The whole tree evaluates to false.
I don't understand exactly what you want to do or understand but I am OK with ivan_pozdeev about short-circuit evaluation and lazy evaluation.
A boolean expression is evaluate from the left to the right and when the result is known the evaluation stop and ignore what is on the right.
With Python:
E = "(A > 0) and (B > 5 or C > 10)"
A = -1
print(eval(E))
gives
False
But
E = "(A > 0) and (B > 5 or C > 10)"
A = 1
print(eval(E))
gives the error "name 'B' is not defined".
So from what I understand of your question, you want something like
if (A>0) {
B = getB();
C = getC();
if (B>23 || C==11)
explode();
}
ie your expression must be split up so that you are only ever working with known values.
You can do it like this:
class LazyValues():
def __init__(self):
self._known_values = {}
def __getitem__(self, var):
try:
return self._known_values[var]
except KeyError:
print("Evaluating %s..." % var)
return self._known_values.setdefault(var, eval(var))
def lazy_eval(expr, lazy_vars):
for var in lazy_vars:
expr = expr.replace(var, "lazy_values['%s']" % var)
# will look like ((lazy_value['A']>0) && (lazy_value['B']>5 || lazy_value['C'] > 10))
lazy_values = LazyValues()
return eval(expr)
lazy_eval("((A>0) and (B>5 or C > 10))", lazy_vars=['A', 'B', 'C'])
# Evaluating A...
# ....
# NameError: name 'A' is not defined
A = -1
lazy_eval("((A>0) and (B>5 or C > 10))", lazy_vars=['A', 'B', 'C'])
#Evaluating A...
#False
A = 5
B = 6
lazy_eval("((A>0) and (B>5 or C > 10))", lazy_vars=['A', 'B', 'C'])
# Evaluating A...
# Evaluating B...
# True
More details later...
It seems to me that the answer is yes, yes you can try to evaluate the expression with missing information. You will need to define what happens though when a symbol lookup fails.
In you case, you will need a boolean expression evaluator and a symbol table so the evaluator can lookup the symbols to execute the expression.
If you succeed in looking up all the symbols, the result would be true or false. If you fail to look up a symbol, then handle that case, perhaps returning None, nullptr, or raising / throwing an exception.
I believe you can embed the python interpreter in your c++ program and call a function to evaluate the expression, more importantly, you can give it the dict to use as the symbol table. If the call returns a result, it was able to find enough symbols or shortcut to a result, otherwise it will raise an exception which your c++ code can detect.
You can prototype the function in python the evaluate if the approach works they way you want, then embed.
Or you can do it all in c++, with a grammar, lexer, parser, and elevator.
While this is a very crude implementation of your solution, but it suits your situation perfectly although using a lot of if else
and Exception Handling.
def main_func():
def checker(a, b=None, c=None):
if a is None:
del a
if b is None:
del b
if c is None:
del c
d = eval('a>0 and (b>5 or c>10)')
return d
return checker
def doer(a=None, b=None, c=None):
try:
return main_func()(a,b,c)
except NameError as e:
if "'a' is not" in str(e):
return 'a'
elif "'b' is not" in str(e):
return 'b'
elif "'c' is not" in str(e):
return 'c'
def check_ret(ret):
return type(ret) == bool
def actual_evaluator():
getter = {
"a": get_a,
"b": get_b,
"c": get_c
}
args = []
while True:
ret = doer(*tuple(args))
if not check_ret(ret):
val = getter[ret]()
args.append(val)
else:
return ret
if __name__ == '__main__':
print actual_evaluator()
Now explaining my code, main_func
returns another function which is used to evaluate the given expression in a string. While here the string has been hardcoded, you can always pass it as a parameter to the function and replace the string inside eval
with the parameter.
In doer
, the function returned by main_func
is invoked and if a NameError
is thrown, which happens in case of the previous conditions being false and new values to be calculated then it returns the specific variable which needs to be calculated. All of that is checked in the actual_evaluator
where the values of the variables are fetched via some function get_variable_name
which you can define in your getter
dict. In my code, I'd used random numbers to check the validity, but like you said you'd have to evaluate the various variables by other means so you can call the respective functions.
Due to the short-circuit behaviour, Python can evaluate an expression even without all the contained values defined if it's possible. If not, it raises an exception:
In [1]: a= False
In [3]: a and b
Out[3]: False
In [4]: a or b
NameError: name 'b' is not defined
But the expression is evaluated left to right:
In [5]: b and a
NameError: name 'b' is not defined
I have done a "roll-my-own" approach for this in the past. It's not that difficult for simple things; you simply create your own objects that implement magic math methods and that keep track of other objects.
If you need something more full-featured, the sympy project is designed to do symbolic math...
I'd take a look at sympy or other computer algebra systems. I believe that algebraic simplification of the pxeression plus short circuit evaluation will allow you to evaluate all the cases where it is possible to get a result. There are some cases where you need to know the value of some variable. For example if you have a simple expression like a == b, you are going to make no progress without knowing the value of a and b. However something like (a >= 0) ||(a <= 0), algebraic simplification will result in true assuming that a is not NAN or some other value that is not equal to itself.
It sounds like you have two challenges:
This means you need some way to evaluate an expression at runtime, and you would like to take advantage of short-circuit logic if possible. Python could be a good choice for this, as shown in the example below.
There is a short Python script (evaluate.py
), which defines an evaluate()
function which can be called from your C or C++ program. The evaluate()
function will attempt to evaluate the expression you give it (translating "&&" and "||" to "and" and "or" if needed). If it requires a variable which hasn't been defined yet, it will retrieve a value for that variable by calling a get_var_value()
function defined in the C/C++ program (and then cache the value for later use).
This approach will use normal short-circuit behavior, so it will only request the variable values needed to finish evaluating the expression. Note that this won't rearrange the expression to choose the minimal set of variables needed to evaluate it; it just uses the standard short-circuiting behavior.
UPDATE: I've added an example at the end that defines the Python script using a multiline string literal in the .cpp file. This could be useful if you don't want to install a separate evaluate.py file along with your executable. It also simplifies the Python initialization a little bit.
The C/Python interaction in the scripts below is based on code in https://docs.python.org/2/extending/embedding.html and https://docs.python.org/2/c-api/arg.html.
Here are the files:
evaluate.py (Python script)
# load embedded_methods module defined by the parent C program
from embedded_methods import get_var_value
# define a custom dictionary class that calls get_var_value(key) for any missing keys.
class var_dict(dict):
def __missing__(self, var):
self[var] = val = get_var_value(var)
return val
# define a function which can be called by the parent C program
def evaluate(expr):
# Create a dictionary to use as a namespace for the evaluation (this version
# will automatically request missing variables).
# Move this line up to the module level to retain values between calls.
namespace = var_dict()
# convert C-style Boolean operators to Python-style
py_expr = expr.replace("||", " or ").replace("&&", " and ").replace(" ", " ")
print('evaluating expression "{}" as "{}"'.format(expr, py_expr))
# evaluate the expression, retrieving variable values as needed
return eval(py_expr, namespace)
evaluate.c (your main program; could also be evaluate.cpp, compiled with g++)
// on Mac, compile with gcc -o evaluate evaluate.c -framework Python
#include <Python/Python.h> // Mac
// #include <Python.h> // non-Mac?
// retain values of argc and argv for equation evaluation
int argc;
char **argv;
/*
Calculate the value of a named variable; this is called from the Python
script to obtain any values needed to evaluate the expression.
*/
static PyObject* c_get_var_value(PyObject *self, PyObject *args)
{
int var_num;
char *var_name;
char err_string[100];
long var_value;
if(!PyArg_ParseTuple(args, "s:get_var_value", &var_name)) {
PyErr_SetString(PyExc_ValueError, "Invalid arguments passed to get_var_value()");
return NULL;
}
// change the code below to define your variable values
// This version just assumes A, B, C are given by argv[2], argv[3], argv[4], etc.
printf("looking up value of %s: ", var_name);
var_num = var_name[0]-'A';
if (strlen(var_name) != 1 || var_num < 0 || var_num >= argc-2) {
printf("%s\n", "unknown");
snprintf(
err_string, sizeof(err_string),
"Value requested for unknown variable \"%s\"", var_name
);
PyErr_SetString(PyExc_ValueError, err_string);
return NULL; // will raise exception in Python
} else {
var_value = atoi(argv[2+var_num]);
printf("%ld\n", var_value);
return Py_BuildValue("l", var_value);
}
}
// list of methods to be added to the "embedded_methods" module
static PyMethodDef c_methods[] = {
{"get_var_value", c_get_var_value, METH_VARARGS, // could use METH_O
"Retrieve the value for the specified variable."},
{NULL, NULL, 0, NULL} // sentinel for end of list
};
int main(int ac, char *av[])
{
PyObject *p_module, *p_evaluate, *p_args, *p_result;
long result;
const char* expr;
// cache and evaluate arguments
argc = ac;
argv = av;
if (argc < 2) {
fprintf(
stderr,
"Usage: %s \"expr\" A B C ...\n"
"e.g., %s \"((A>0) && (B>5 || C > 10))\" 10 9 -1\n",
argv[0], argv[0]
);
return 1;
}
expr = argv[1];
// initialize Python
Py_SetProgramName(argv[0]);
Py_Initialize();
// Set system path to include the directory where this executable is stored
// (to find evaluate.py later)
PySys_SetArgv(argc, argv);
// attach custom module with get_var_value() function
Py_InitModule("embedded_methods", c_methods);
// Load evaluate.py
p_module = PyImport_ImportModule("evaluate");
if (PyErr_Occurred()) { PyErr_Print(); }
if (p_module == NULL) {
fprintf(stderr, "unable to load evaluate.py\n");
return 1;
}
// get a reference to the evaluate() function
p_evaluate = PyObject_GetAttrString(p_module, "evaluate");
if (!(p_evaluate && PyCallable_Check(p_evaluate))) {
fprintf(stderr, "Cannot retrieve evaluate() function from evaluate.py module\n");
return 1;
}
/*
Call the Python evaluate() function with the expression to be evaluated.
The evaluate() function will call c_get_var_value() to obtain any
variable values needed to evaluate the expression. It will use
caching and normal logical short-circuiting to reduce the number
of requests.
*/
p_args = Py_BuildValue("(s)", expr);
p_result = PyObject_CallObject(p_evaluate, p_args);
Py_DECREF(p_args);
if (PyErr_Occurred()) {
PyErr_Print();
return 1;
}
result = PyInt_AsLong(p_result);
Py_DECREF(p_result);
printf("result was %ld\n", result);
Py_DECREF(p_evaluate);
Py_DECREF(p_module);
return 0;
}
Results:
$ evaluate "((A>0) && (B>5 || C > 10))" -1 9 -1
evaluating expression "((A>0) && (B>5 || C > 10))" as "((A>0) and (B>5 or C > 10))"
looking up value of A: -1
result was 0
$ evaluate "((A>0) && (B>5 || C > 10))" 10 9 -1
evaluating expression "((A>0) && (B>5 || C > 10))" as "((A>0) and (B>5 or C > 10))"
looking up value of A: 10
looking up value of B: 9
result was 1
$ evaluate "((A>0) && (B>5 || C > 10))" 10 3 -1
evaluating expression "((A>0) && (B>5 || C > 10))" as "((A>0) and (B>5 or C > 10))"
looking up value of A: 10
looking up value of B: 3
looking up value of C: -1
result was 0
As an alternative, you can combine all this code into a single .cpp file, as shown below. This uses the multi-line string literal capability in C++11.
Self-contained evaluate.cpp
// on Mac, compile with g++ evaluate.cpp -o evaluate -std=c++11 -framework Python
#include <Python/Python.h> // Mac
//#include <Python.h> // non-Mac?
/*
Python script to be run in embedded interpreter.
This defines an evaluate(expr) function which will interpret an expression
and return the result. If any variable values are needed, it will call the
get_var_values(var) function defined in the parent C++ program
*/
const char* py_script = R"(
# load embedded_methods module defined by the parent C program
from embedded_methods import get_var_value
# define a custom dictionary class that calls get_var_value(key) for any missing keys.
class var_dict(dict):
def __missing__(self, var):
self[var] = val = get_var_value(var)
return val
# define a function which can be called by the parent C program
def evaluate(expr):
# Create a dictionary to use as a namespace for the evaluation (this version
# will automatically request missing variables).
# Move this line up to the module level to retain values between calls.
namespace = var_dict()
# convert C-style Boolean operators to Python-style
py_expr = expr.replace("||", " or ").replace("&&", " and ").replace(" ", " ")
print('evaluating expression "{}" as "{}"'.format(expr, py_expr))
# evaluate the expression, retrieving variable values as needed
return eval(py_expr, namespace)
)";
// retain values of argc and argv for equation evaluation
int argc;
char **argv;
/*
Calculate the value of a named variable; this is called from the Python
script to obtain any values needed to evaluate the expression.
*/
static PyObject* c_get_var_value(PyObject *self, PyObject *args)
{
int var_num;
char *var_name;
char err_string[100];
long var_value;
if(!PyArg_ParseTuple(args, "s:get_var_value", &var_name)) {
PyErr_SetString(PyExc_ValueError, "Invalid arguments passed to get_var_value()");
return NULL;
}
// change the code below to define your variable values
// This version just assumes A, B, C are given by argv[2], argv[3], argv[4], etc.
printf("looking up value of %s: ", var_name);
var_num = var_name[0]-'A';
if (strlen(var_name) != 1 || var_num < 0 || var_num >= argc-2) {
printf("%s\n", "unknown");
snprintf(
err_string, sizeof(err_string),
"Value requested for unknown variable \"%s\"", var_name
);
PyErr_SetString(PyExc_ValueError, err_string);
return NULL; // will raise exception in Python
} else {
var_value = atoi(argv[2+var_num]);
printf("%ld\n", var_value);
return Py_BuildValue("l", var_value);
}
}
// list of methods to be added to the "embedded_methods" module
static PyMethodDef c_methods[] = {
{"get_var_value", c_get_var_value, METH_VARARGS, // could use METH_O
"Retrieve the value for the specified variable."},
{NULL, NULL, 0, NULL} // sentinel for end of list
};
int main(int ac, char *av[])
{
PyObject *p_module, *p_evaluate, *p_args, *p_result;
long result;
const char* expr;
// cache and evaluate arguments
argc = ac;
argv = av;
if (argc < 2) {
fprintf(
stderr,
"Usage: %s \"expr\" A B C ...\n"
"e.g., %s \"((A>0) && (B>5 || C > 10))\" 10 9 -1\n",
argv[0], argv[0]
);
return 1;
}
expr = argv[1];
// initialize Python
Py_SetProgramName(argv[0]);
Py_Initialize();
// attach custom module with get_var_value() function
Py_InitModule("embedded_methods", c_methods);
// run script to define evalute() function
PyRun_SimpleString(py_script);
if (PyErr_Occurred()) {
PyErr_Print();
fprintf(stderr, "%s\n", "unable to run Python script");
return 1;
}
// get a reference to the Python evaluate() function (can be reused later)
// (note: PyRun_SimpleString creates objects in the __main__ module)
p_module = PyImport_AddModule("__main__");
p_evaluate = PyObject_GetAttrString(p_module, "evaluate");
if (!(p_evaluate && PyCallable_Check(p_evaluate))) {
fprintf(stderr, "%s\n", "Cannot retrieve evaluate() function from __main__ module");
return 1;
}
/*
Call the Python evaluate() function with the expression to be evaluated.
The evaluate() function will call c_get_var_value() to obtain any
variable values needed to evaluate the expression. It will use
caching and normal logical short-circuiting to reduce the number
of requests.
*/
p_args = Py_BuildValue("(s)", expr);
p_result = PyObject_CallObject(p_evaluate, p_args);
Py_DECREF(p_args);
if (PyErr_Occurred()) {
PyErr_Print();
return 1;
}
result = PyInt_AsLong(p_result);
Py_DECREF(p_result);
printf("result was %ld\n", result);
Py_DECREF(p_module);
Py_DECREF(p_evaluate);
return 0;
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With