Consider this example
(Note that this is just something I made up to illustrate the problem. I am well aware there are more efficient ways to parse an arithmetic expression and though the subject is fascinating, this has nothing to do with my actual question. It's just a semi-realistic example, if I might say so.
I agree the parser thing might make the question seem more complicated, but I could not think of a more abstract example).
Assume you want to do a simple expression parser. You will get bits of strings from a tokenizer, some of them being possibly ambiguous.
For instance, the string "-" could represent an unary minus or a binary minus.
Assume you want to get all possible meanings for string "-".
You could do as follows:
1) define a sorted array describing all possible operators
// types of operators
enum class opType: char { unary, lasso, rasso, none };
// operator descriptors
struct opDesc {
string symbol;
opType type;
char priority;
// partial order comparison
bool operator< (const opDesc& a) const
{
// unary operators first
if (symbol == a.symbol) return type < a.type;
return symbol < a.symbol;
}
// comparison with strings
static bool comp_desc_str (const opDesc& a, const string& s)
{
return a.symbol < s;
}
static bool comp_str_desc (const string& s, const opDesc& a)
{
return s < a.symbol;
}
};
static opDesc op_descriptors[] = {
{ "+" , opType::unary, 8 }, // unary +
{ "-" , opType::unary, 8 }, // unary -
{ "*" , opType::lasso, 6 }, // multiplication
{ "/" , opType::lasso, 6 }, // division
{ "+" , opType::lasso, 5 }, // addition
{ "-" , opType::lasso, 5 }, // substraction
};
2) use std::equal_range
to get all possible matches for a given string
// sort descriptors by value and type
sort(begin(op_descriptors), end(op_descriptors));
// do some searches
string patterns[] = { "+", "-", ">>", "**" };
for (string s : patterns)
{
pair<opDesc*, opDesc*> ops;
ops = equal_range(
std::begin(op_descriptors),
std::end (op_descriptors),
s,
opDesc::comp_desc_str);
cout << s <<": "<< ops.first[0] << ops.second[-1] << endl;
}
This code won't compile, complaining about opDesc::comp_desc_str
(it expects parameters the other way around, i.e. string
first, opDesc
next).
If I try to replace the function with a version that takes its arguments in reverse order:
ops = equal_range(
std::begin(op_descriptors),
std::end (op_descriptors),
s,
opDesc::comp_str_desc);
it won't compile either, complaining about parameters being yet again in the wrong order (at some other point of the algorithm).
This code, however, will work (see a live version here)
#include <regex>
#include <iostream>
using namespace std;
// types of operators
enum class opType: char { unary, lasso, rasso, none };
// operator descriptors
struct opDesc {
string symbol;
opType type;
char priority;
// partial order comparison
bool operator< (const opDesc& a) const
{
// unary operators first
if (symbol == a.symbol) return type < a.type;
return symbol < a.symbol;
}
// comparison with strings
static bool comp_desc_str (const opDesc& a, const string& s)
{
return a.symbol < s;
}
static bool comp_str_desc (const string& s, const opDesc& a)
{
return s < a.symbol;
}
// display
friend ostream& operator<<(ostream& os, const opDesc& op);
};
ostream& operator<<(ostream& os, const opDesc& op)
{
os << op.symbol << "[" << (int)op.type << ":" << (int)op.priority << "]";
return os;
}
static opDesc op_descriptors[] = {
{ "+" , opType::unary, 8 }, // unary +
{ "-" , opType::unary, 8 }, // unary -
{ "~" , opType::unary, 8 }, // bitwise not
{ "**", opType::rasso, 7 }, // power
{ "*" , opType::lasso, 6 }, // multiplication
{ "/" , opType::lasso, 6 }, // division
{ "%" , opType::lasso, 6 }, // remainder
{ "+" , opType::lasso, 5 }, // addition
{ "-" , opType::lasso, 5 }, // substraction
{ "<<", opType::lasso, 4 }, // left shift
{ ">>", opType::lasso, 4 }, // right shift
{ "&" , opType::lasso, 3 }, // bitwise and
{ "^" , opType::lasso, 2 }, // bitwise xor
{ "|" , opType::lasso, 1 }, // bitwise or
{ "(" , opType::none , 0 }, // braces
{ ")" , opType::none , 0 }
};
int main(void)
{
// sort descriptors by value and type
sort(begin(op_descriptors), end(op_descriptors));
// do some searches
string patterns[] = { "+", "-", ">>", "**" };
for (string s : patterns)
{
pair<opDesc*, opDesc*> ops;
// this won't work
/*
ops = equal_range(
std::begin(op_descriptors),
std::end (op_descriptors),
s,
opDesc::comp_desc_str or opDesc::comp_str_desc);
*/
// this works
ops.first = lower_bound(
std::begin(op_descriptors),
std::end (op_descriptors),
s, opDesc::comp_desc_str);
ops.second = upper_bound(
std::begin(op_descriptors),
std::end (op_descriptors),
s, opDesc::comp_str_desc);
cout << s <<": "<< ops.first[0] << ops.second[-1] << endl;
}
}
output:
+: +[0:8]+[1:5] // unary and binary "-" operators found
-: -[0:8]-[1:5] // same thing for "+"
>>: >>[1:4]>>[1:4] // first == second when there is only
**: **[2:7]**[2:7] // one version of the operator
I tried this code on VisualC++ 2013 and g++ with the same results
(only the obfuscation of the template error messages vary).
lower_bound
and upper_bound
should require two different custom comparison functions?equal_range
as intended instead of doing the job twice and make it compile in debug mode on Visual C++ 2013)?std::lower_bound
requires comp(*it, val)
whereas std::upper_bound
requires comp(val, *it)
.
So your comp
functor have to provide both bool operator () (const opDesc& a, const string& s) const
and bool operator ()(const string& s, const opDesc& a) const
.
So, you may use following comp functor:
struct lessOpDescWithString
{
bool operator () (const opDesc& lhs, const std::string& rhs) const {
return opDesc::comp_desc_str(lhs, rhs);
}
bool operator () (const std::string& lhs, const opDesc& rhs) const {
return opDesc::comp_str_desc(lhs, rhs);
}
};
I suppose that std::lower_bound
and std::upper_bound
will use a binary search on your sorted array and must be able to compare obDesc
to std::string
and vice versa. I would suggest making a comparator like
struct obDescStrCmp {
bool operator()(const opDesc& lhs, const opDesc& rhs) const {
// code to compare obDesc to opDesc
}
bool operator()(const opDesc& lhs, const std::string& rhs) const {
// code to compare obDesc to std::string
}
bool operator()(const std::string& lhs, const opDesc& rhs) const {
// code to compare std::string to opDesc
}
bool operator()(const std::string& lhs, const std::string& rhs) const {
// code to compare std::string to std::string
// I'm not sure if this is really necessary.
}
};
and pass it to your std algorithm of choice instead of relying on the operators defined in your opDesc
struct. The compiler should select the correct overload based on the actual order of parameters in the implementations of the std algorithms.
Edit: Replace operator<
with operator()
to make the struct callable.
My own answer, only summarizing other contributions, notably Jarod42's:
The algorithm of equal_range
requires both >
and <
comparisons between internal (opDesc
in this example) and foreign (std::string
) types.
You can't deduce a<b
from !(b<a)
, because of the ==
case, so you have to provide two different comparators.
Functionnally, you could pick any working combination of comparison operations, for instance <
and >
or <
and <=
, but the std:: guys settled for a fixed <
comparison with arguments swapped around, which is a choice dictated by the signature of the function: it only has to define a (type, foreign type)
and a (foreign type, type)
variant.
lower_bound
only requires <
(expressed as type < foreigh type
) while upper_bound
only requires >
(expressed as foreign type < type
), so both can work with a single function, but equal_range
must have access to both prototypes.
The practical solution is to define a function object aka functor to do the job:
// operator descriptors
struct opDesc {
string symbol;
opType type;
char priority;
// partial order comparison
bool operator< (const opDesc& a) const
{
// unary operators first
if (symbol == a.symbol) return type < a.type;
return symbol < a.symbol;
}
// functor to compare with strings
struct comp
{
bool operator() (const opDesc& a, const std::string& b) const
{
return a.symbol < b;
}
bool operator() (const std::string& a, const opDesc& b) const
{
return a < b.symbol;
}
};
and use it like so:
pair<opDesc*, opDesc*> ops;
ops = equal_range(
std::begin(op_descriptors),
std::end (op_descriptors),
s,
opDesc::comp()); // <- functor to provide two different comparison functions
Besides, this won't compile on MSVC++ 2013 due to an obscure paranoid check enabled only in debug mode. The release version will compile fine, as will the code in g++ regardless of debug level.
Judging from the cryptic names used, it seems the template checks whether the comparison defines a total order (which it shouldn't since the whole point of this API is to work on partially ordered structures).
My current (ugly) workaround is to disable some internal debug flag:
#if (defined _MSC_VER && defined _DEBUG)
#define _ITERATOR_DEBUG_LEVEL 1
#endif // _MSC_VER && _DEBUG
prior to including std:: headers
Another possible workaround suggested by Jarod42 is to define the missing comparison function.
// functor to compare with strings
struct comp
{
bool operator() (const opDesc& a, const std::string& b)
{ return a.symbol < b; }
bool operator() (const std::string& a, const opDesc& b)
{ return a < b.symbol; }
// just to make Microsoft Visual C++ happy when compiling in debug mode
bool operator() (const opDesc& a, const opDesc& b)
{ assert(false); return false; }
};
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With