So I was playing around with some code and wanted to see which method of converting a std::string to upper case was most efficient. I figured that the two would be somewhat similar performance-wise, but I was terribly wrong. Now I'd like to find out why.
The first method of converting the string works as follows: for each character in the string (save the length, iterate from 0 to length), if it's between 'a' and 'z', then shift it so that it's between 'A' and 'Z' instead.
The second method works as follows: for each character in the string (start from 0, keep going until we hit a null terminator), apply the built-in toupper() function.
Here's the code:
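#include <cctype>   // std::toupper
#include <cstddef>  // std::size_t
#include <iostream>
#include <string>
// Method 1: shift ASCII 'a'..'z' into 'A'..'Z' by hand.
inline std::string ToUpper_Reg(std::string str)
{
    for (std::size_t pos = 0, sz = str.length(); pos < sz; ++pos)
    {
        if (str[pos] >= 'a' && str[pos] <= 'z') { str[pos] += ('A' - 'a'); }
    }
    return str;
}
// Method 2: call toupper() on every character until the null terminator.
inline std::string ToUpper_Alt(std::string str)
{
    // The cast to unsigned char avoids undefined behaviour for negative char values.
    for (std::size_t pos = 0; str[pos] != '\0'; ++pos) { str[pos] = static_cast<char>(std::toupper(static_cast<unsigned char>(str[pos]))); }
    return str;
}
int main()
{
    std::string test = " abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~!@#$%^&*()_+=-`'{}[]\\|\";:<>,./?";
    // Swap the commented-out call in to time the other method.
    for (std::size_t i = 0; i < 100000000; ++i) { ToUpper_Reg(test); /* ToUpper_Alt(test); */ }
    return 0;
}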
The first method, ToUpper_Reg, took about 169 seconds per 100 million iterations.
The second method, ToUpper_Alt, took about 379 seconds per 100 million iterations.
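For anyone who wants to reproduce numbers like these, here is one way the timing could be done with std::chrono. This is only a sketch, not the method used for the figures above: the helper name seconds_for and its parameters are made up for illustration, and summing the result sizes is just an attempt to keep the compiler from discarding the calls.

```cpp
#include <chrono>
#include <cstddef>
#include <string>

// Hypothetical helper: times `iterations` calls of `convert(input)` and
// returns the elapsed wall-clock time in seconds.
template <typename F>
double seconds_for(F convert, const std::string& input, std::size_t iterations)
{
    std::size_t sink = 0;  // accumulates result sizes so each result is used
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i)
    {
        sink += convert(input).size();
    }
    const auto stop = std::chrono::steady_clock::now();
    volatile std::size_t keep = sink;  // volatile store so `sink` is not dead
    (void)keep;
    return std::chrono::duration<double>(stop - start).count();
}
```

It would be called as seconds_for(ToUpper_Reg, test, 100000000) and again with ToUpper_Alt. Results will vary a lot with compiler and optimization settings.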
What gives?
Edit: I changed the second method so that it iterates the string how the first one does (set the length aside, loop while less than length) and it's a bit faster, but still about twice as slow.
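The variant described in this edit might look roughly like the following. It is a minimal sketch, not the exact code that was benchmarked: the name ToUpper_Alt2 is hypothetical, and the cast to unsigned char is an addition that avoids undefined behaviour for negative char values.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical name: the second method, but iterating by saved length.
inline std::string ToUpper_Alt2(std::string str)
{
    // Save the length once and loop up to it, as ToUpper_Reg does.
    for (std::size_t pos = 0, sz = str.length(); pos < sz; ++pos)
    {
        // Passing a negative char to toupper is undefined, so cast first.
        str[pos] = static_cast<char>(std::toupper(static_cast<unsigned char>(str[pos])));
    }
    return str;
}
```

Unlike the null-terminator loop, this version also converts characters that come after an embedded '\0' in the string.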
Edit 2: Thanks everybody for your submissions! The data I'll be using it on is guaranteed to be ASCII, so I think I'll be sticking with the first method for the time being. I'll keep in mind that toupper is locale specific for when/if I need it.
The cost of ToUpper depends on what your strings contain more of. Strings typically contain more lower-case characters, which makes ToLower the more efficient of the two.
C++ provides the toupper() function, which can be applied to each character of a std::string to convert the string to upper case. std::string itself has no upper-case member function.
The toupper() function in C++ converts a given character to uppercase. It is defined in the cctype header file.
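Since toupper() works on one character at a time, converting a whole std::string means applying it to every character. One common way to do that is std::transform, sketched below. The function name to_upper_copy is hypothetical.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: returns an upper-cased copy of `s`.
inline std::string to_upper_copy(std::string s)
{
    // The lambda takes unsigned char so toupper never sees a negative value.
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return s;
}
```

For example, to_upper_copy("Hello, World 42!") should give "HELLO, WORLD 42!" in the default "C" locale.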
toupper() function in C: The toupper() function converts a lowercase letter to uppercase, i.e. if the character passed is a lowercase letter, it returns the corresponding uppercase letter. It is defined in the ctype.h header file.
std::toupper uses the current locale to do case conversions, which involves a function call and other abstractions. So naturally, it will be slower. But it will also work on non-ASCII text.
toupper() does more than just shift characters in the range [a-z]. For one thing it's locale dependent and can handle more than just ASCII.
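To see where the two approaches do and don't differ, a small check along these lines can help. It is a sketch only: the names ascii_upper and agrees_on_ascii are made up for illustration, and the code assumes an ASCII execution character set.

```cpp
#include <cctype>

// ASCII-only conversion: the same range check and shift as ToUpper_Reg,
// applied to a single character.
inline char ascii_upper(char c)
{
    return (c >= 'a' && c <= 'z') ? static_cast<char>(c + ('A' - 'a')) : c;
}

// Returns true if std::toupper and ascii_upper give the same result for
// every 7-bit code (0..127) under the currently active C locale.
inline bool agrees_on_ascii()
{
    for (int c = 0; c < 128; ++c)
    {
        const int manual = static_cast<unsigned char>(ascii_upper(static_cast<char>(c)));
        if (std::toupper(c) != manual) { return false; }
    }
    return true;
}
```

In the default "C" locale the two should agree on all 7-bit ASCII values. Differences may show up under other locales or for values above 127, which is where the extra work in toupper() goes.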