I'm trying to find which category C a double x belongs to. My categories are defined as strings names and doubles values in a file like this
A 1.0
B 2.5
C 7.0
which should be interpreted like this
"A": 0 < x <= 1.0
"B": a < x <= 2.5
"C": b < x <= 7.0
(the input can have arbitrary length and may have to be sorted by their values). I simply need a function like this
std::string findCategory(categories_t categories, double x) {
...insert magic here
}
so for this example I'd expect
findCategory(categories, 0.5) == "A"
findCategory(categories, 1.9) == "B"
findCategory(categories, 6.0) == "C"
So my question is a) how to write the function and b) what the best choice of category_t may be (using stl in pre 11 C++). I made several attempts, all of which were... less than successful.
One option would be to use the std::map
container with doubles as keys and values corresponding to what value is assigned to the range whose upper endpoint is the given value. For example, given your file, you would have a map like this:
std::map<double, std::string> lookup;
lookup[1.0] = "A";
lookup[2.5] = "B";
lookup[7.0] = "C";
Then, you could use the std::map::lower_bound
function, given some point, to get back the key/value pair whose key (upper endpoint) is the first key in the map that is at least as large as the point in question. For example, with the above map, lookup.lower_bound(1.37)
would return an iterator whose value is "B." lookup.lower_bound(2.56)
would return an iterator whose value is "C." These lookups are fast; they take O(log n) time for a map with n elements.
In the above, I'm assuming that the values you're looking up are all nonnegative. If negative values are allowed, you can add a quick test in to check whether the value is negative before you do any lookups. That way, you can eliminate spurious results.
For what it's worth, if you happen to know something about the distribution of your lookups (say, they're uniformly distributed) it's possible to build a special data structure called an optimal binary search tree that will give better access times than the std::map
. Also, depending on your application, there may be even faster options available. For example, if you're doing this because you want to randomly choose one of the outcomes with differing probabilities, then I would suggest looking into this article on the alias method, which lets you generate random values in O(1) time.
Hope this helps!
You can use the pair type and the 'lower_bound' from < algorithm > http://www.cplusplus.com/reference/algorithm/lower_bound/.
Let's define your categories in terms of the upper edge: typedef pair categories_t;
Then just make a vector of those edges and search it using binary search. See the full example below.
#include <string>
#include <vector>
#include <algorithm>
#include <iostream>
using namespace std;
typedef pair<double,string> category_t;
std::string findCategory(const vector<category_t> &categories, double x) {
vector<category_t>::const_iterator it=std::lower_bound(categories.begin(), categories.end(),category_t(x,""));
if(it==categories.end()){
return "";
}
return it->second;
}
int main (){
vector< category_t > edges;
edges.push_back(category_t(0,"bin n with upper edge at 0 (underflow)"));
edges.push_back(category_t(1,"bin A with upper edge at 1"));
edges.push_back(category_t(2.5,"bin B with upper edge at 2.5"));
edges.push_back(category_t(7,"bin C with upper edge at 7"));
edges.push_back(category_t(8,"bin D with upper edge at 8"));
edges.push_back(category_t(9,"bin E with upper edge at 9"));
edges.push_back(category_t(10,"bin F with upper edge at 10"));
vector< double > examples ;
examples.push_back(1);
examples.push_back(3.3);
examples.push_back(7.4);
examples.push_back(-5);
examples.push_back(15);
for( vector< double >::const_iterator eit =examples.begin();eit!=examples.end();++eit)
cout << "value "<< *eit << " : " << findCategory(edges,*eit) << endl;
}
The comparisons works the way we want it to since the double is the first in the pair, and pairs are compared first by comparing the first and then the second constituent. Else we would define a compare predicate as described at the page I linked above.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With