I have some data structures:
all_unordered_m is a big vector containing all the strings I need (all different).
ordered_m is a small vector containing the indexes of a subset of the strings (all different) in the former vector.
position_m maps the indexes of objects from the first vector to their position in the second one.
The string_after(index, reverse) method returns the string referenced by ordered_m after all_unordered_m[index]. ordered_m is considered circular, and is explored in natural or reverse order depending on the second parameter.
The code is something like the following:
struct ordered_subset {
    // [...]
    std::vector<std::string>& all_unordered_m; // size = n >> 1
    std::vector<size_t> ordered_m; // size << n
    std::tr1::unordered_map<size_t, size_t> position_m;

    const std::string&
    string_after(size_t index, bool reverse) const
    {
        // Assumes index is always present in position_m.
        size_t pos = position_m.find(index)->second;
        if(reverse)
            pos = (pos == 0 ? ordered_m.size() - 1 : pos - 1);
        else
            pos = (pos == ordered_m.size() - 1 ? 0 : pos + 1);
        return all_unordered_m[ordered_m[pos]];
    }
};
Given that:
How can I speed up the string_after method, which is called billions of times and is eating up about 10% of the execution time?
EDIT:
I've tried making position_m a vector instead of an unordered_map and using the following method to avoid jumps:
const std::string&
string_after(size_t index, int direction) const
{
    // direction is +1 (natural order) or -1 (reverse order)
    return all_unordered_m[ordered_m[
        (ordered_m.size() + position_m[index] + direction) % ordered_m.size()]];
}
The change in position_m seems to be the most effective. I'm not sure that eliminating the branches made any difference; I'm tempted to say that the code is more compact but equally efficient in that regard.
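For readers who want to try the vector-based variant described in the edit, here is a minimal self-contained sketch. The struct name ordered_subset_sketch, the kNotInSubset sentinel, and the constructor are assumptions made for illustration; the original post does not show how position_m is filled. The sketch also assumes that direction is always +1 or -1 and that index is in the subset.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sentinel marking indexes that are not part of the subset.
const std::size_t kNotInSubset = static_cast<std::size_t>(-1);

// Hypothetical stand-in for the question's ordered_subset, with position_m
// stored as a dense vector (one slot per string) instead of a hash map.
struct ordered_subset_sketch {
    const std::vector<std::string>& all_unordered_m;
    std::vector<std::size_t> ordered_m;
    std::vector<std::size_t> position_m; // kNotInSubset where index is absent

    ordered_subset_sketch(const std::vector<std::string>& all,
                          const std::vector<std::size_t>& ordered)
        : all_unordered_m(all),
          ordered_m(ordered),
          position_m(all.size(), kNotInSubset)
    {
        // Invert ordered_m: position_m[ordered_m[p]] == p.
        for (std::size_t p = 0; p < ordered_m.size(); ++p)
            position_m[ordered_m[p]] = p;
    }

    // direction must be +1 (natural order) or -1 (reverse order).
    const std::string& string_after(std::size_t index, int direction) const
    {
        assert(position_m[index] != kNotInSubset);
        const std::size_t n = ordered_m.size();
        // Adding n first keeps the result correct for direction == -1:
        // the int is converted to an unsigned value and the sum wraps
        // around to n + pos - 1 before the modulo is taken.
        const std::size_t pos = (n + position_m[index] + direction) % n;
        return all_unordered_m[ordered_m[pos]];
    }
};
```

The trade-off is memory: position_m needs one entry per string in all_unordered_m, even though only a small subset is used. Whether that is acceptable depends on n, so it is worth measuring on the real data.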
vector lookups are blazing fast. size() calls and simple arithmetic are blazing fast. map lookups, in comparison, are as slow as a dead turtle with a block of concrete on his back. I have often seen those become a bottleneck in otherwise simple code like this.
You could try unordered_map from TR1 or C++0x (a drop-in hashtable replacement for map) instead and see if that makes a difference.
Well, in such cases (a small function that is called often) every branch can be very expensive. There are two things that come to mind.
1. Could you drop the reverse parameter and make it two separate methods? This only makes sense if that doesn't simply push the if-statement to the calling code.
2. Try computing pos with modular arithmetic: pos = (pos + 1) % ordered_m.size() (this is for the forward case). This only works if you are sure that pos never overflows when incrementing it.
In general, try to replace branches with arithmetic operations in such cases; this can give you a substantial speedup.
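As a rough sketch of both suggestions combined, the two helpers below step forward and backward on a circular sequence without a conditional. The names next_pos and prev_pos are hypothetical and not part of the original code. The backward case adds n - 1 rather than subtracting 1, which is an assumption about how the reverse direction would be written so that the unsigned value never goes below zero.

```cpp
#include <cassert>
#include <cstddef>

// Forward step on a circular sequence of length n.
// Requires n > 0 and pos < n, so pos + 1 cannot overflow in practice.
inline std::size_t next_pos(std::size_t pos, std::size_t n)
{
    assert(n > 0 && pos < n);
    return (pos + 1) % n;
}

// Backward step: adding n - 1 is equivalent to subtracting 1 modulo n,
// and avoids unsigned underflow when pos == 0.
inline std::size_t prev_pos(std::size_t pos, std::size_t n)
{
    assert(n > 0 && pos < n);
    return (pos + n - 1) % n;
}
```

Note that an integer modulo is not free either, and compilers often turn the original ternary into a conditional move rather than a real branch, so this is worth benchmarking rather than assuming a win.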