Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Does strstr()'s implementation in gcc and VS have linear complexity?

I know that there are fast string search algorithms, like Boyer–Moore and Knuth–Morris–Pratt, that have O(n+m) complexity, while the trivial solution would be O(n*m).

So does the implementation of strstr() for the most popular toolchains - gcc and Visual Studio - use these fast O(n) algorithms, or does it use the trivial solution?

like image 216
sashoalm Avatar asked Oct 01 '22 03:10

sashoalm


1 Answers

GCC's runtime library uses Two-Way Algorithm which performs 2n-m text character comparisons in the worst case. It is O(n) complexity in the search phase but it needs a extra preprocessing phase which is O(m) complexity. You could find details on http://www-igm.univ-mlv.fr/~lecroq/string/node26.html about the algorithm.

AFAIK MSVC runtime is doing strstr in the most naive way, in O(n*m) complexity. But brute force doesn't need extra memory space, so it never raise a bad alloc exception. KMP need O(m) extra space, and Two-Way need a constant extra space.

What GCC is doing sounds just like using FFT to calculate multiplys. Looks extremely fast in paper, but really slow in practice. MSVC will use SIMD instructions in strstr when they are availiable so it's even faster in most case. I will choose the brute force approach with SIMD if I'm going to write my own library.

like image 121
Aean Avatar answered Oct 18 '22 12:10

Aean