I had a question about the use of std::search
vs string::find
for dealing with strings. I know it is often better to use a class specific member function algorithm over the standard library algorithm because it can optimize based on the class, but I was wondering whether it is reasonable to, say for the sake of consistency, use std::search
with the iterators rather than string::find
with indices.
Would it be a sin for me to do something like that or should I just stick to string::find? Are there any huge advantages of one over the other in terms of performance or style?
The strchr() function finds the first occurrence of a character in a string. The character c can be the null character (\0); the ending null character of string is included in the search. The strchr() function operates on null-ended strings.
Using find() function to check if string contains substring in C++. We can use string::find() that can return the first occurrence of the substring in the string. It returns the index from the starting position and the default value for this function is 0. It returns -1 if the substring is not present in the string.
The . find() method returns the index of the first occurrence of the specified string or character. If no result is found, string::npos is returned instead.
Right now (27th April 2017), at least GCCs libstdc++
(which is also used by clang by default), implements std::string::find
with a linear search and thus is much slower than using
std::string_view substr{"whatever"};
auto it = std::search(s.cbegin(), s.cend(),
std::boyer_moore_searcher(substr.begin(), substr.end()));
The problem is that the Boyer-Moore searcher allocates memory for internal data structures, and thus can fail with a std::bad_alloc
exception. However, std::string::find
is marked noexcept
, so using the already implemented Boyer-Moore searcher within std::string::find
isn't straight-forward.
string::find
uses linear search but it is several times faster than Boyer Moore for some cases (with the latest patch). I submitted a patch (first-element then memcomp) to both libstdc++ and libc++ which improved string::find
significantly. You can try the recent gcc (7.1) and you will get the improved performance. You can also measure the performance with the simple benchmarking suite I wrote:
https://github.com/hiraditya/std-benchmark
Especially for smaller strings, by the time Boyer Moore is busy constructing internal data structure, (sub) linear string::find will be done. Also for parsing HTML etc., where most of the searches are mismatches, string::find should be faster.
commit fc7ebc4b8d9ad7e2891b7f72152e8a2b7543cd65
Author: redi <redi@138bc75d-0d04-0410-961f-82ee72b054a4>
Date: Mon Jan 9 13:05:58 2017 +0000
PR66414 optimize std::string::find
2017-01-09 Jonathan Wakely <[email protected]>
Aditya Kumar <[email protected]>
PR libstdc++/66414
* include/bits/basic_string.tcc
(basic_string::find(const CharT*, size_type, size_type)): Optimize.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@244225 138bc75d-0d04-0410-961f-82ee72b054a4
PS: Using std::find
will always be slower than the current std::string::find
with the current implementation.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With