I've been trying to decrease every single char
value in a std::stringstream
by 100
:
std::string str = stream.str();
auto decrement = [](char c) { return c - 100; };
std::string out;
out.reserve(str.size());
std::transform(str.begin(), str.end(), std::back_inserter(out), decrement);
stream = std::stringstream(out);
But it took 7 minutes stuck on the std::transform
instruction. That for a 150mb text file.
I'm not using an optimized build. This is the debug build. The goal is to be able to have the code running faster for debugging purposes. Release results are secondary for this question.
Any suggestions on how to improve efficiency?
One thing I'd consider would be transforming it in place if you don't use your str
for anything else. That way you write back to the same location you read from and might get better caching behaviour. Simply change
std::transform(str.begin(), str.end(), std::back_inserter(out), decrement);
to
std::transform(str.begin(), str.end(), str.begin(), decrement);
and you can get rid of your out
string completely. The 3rd (destination) parameter is allowed to be the same as the first parameter.
Not only does that get rid completely of the extra 150MB string variable, you previously had to access two different locations in the memory that should be quite a bit apart. With reading from and writing back to the same place you make sure that there really is maximum use of the cache.
Of course this mutates str
so it's only really useful if you don't need the original str
variable for something else.
End result:
std::string str = stream.str();
auto decrement = [](char c) { return c - 100; };
std::transform(str.begin(), str.end(), str.begin(), decrement);
stream = std::stringstream(str);
There are two obvious speedups.
The first is to transform in-place.
std::string str = stream.str();
auto decrement = [=](char c) { return c -= 100; };
std::transform(str.begin(), str.end(), str.begin(), decrement);
stream = std::stringstream(str);
which was covered by Raphael.
The second, which is only because you want DEBUG optimized speed, is to bypass possibly debug iterator checks.
std::string str = stream.str();
auto decrement = [=](char c) { return c -= 100; };
std::transform(&str[0], (&str[0])+str.size(), (&str[0]), decrement);
stream = std::stringstream(str);
here we replace begin()
with &str[0]
, a raw pointer to the character buffer contents. If you are working with extremely strange basic_string
s, use std::addressof
instead of &
.
In a system with a iterators encumbered with debugging instrumentation, this could be much faster. In an optimized build, I would expect it to be the same speed.
Slightly less elegant but I think still acceptable (also depends on your target machine) would be use of sse intrinsics (SSE2) if you need extra speed (around 5x faster than solution presented by Raphael).
#include <emmintrin.h>
__m128i dec = _mm_set1_epi8(100);
size_t x = 0;
for (; x < str.size()-15; x+=16)
{
__m128i sse = _mm_loadu_si128((__m128i*)&str[x]);
_mm_storeu_si128((__m128i*)&str[x], _mm_sub_epi8(sse, dec));
}
for (; x < str.size(); ++x)
str[x] -= 100;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With