Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to efficiently implement regular expression like .*a.*b.*?

I want to matching of filenames like Colibri does. I tried to solve it by regular expressions.

Searching in Colibri works that You type characters which are in order inside of a filename and it finds all files with these characters in order in the filename. For example for "ab" it finds "cabal", "ab", and "achab".

Simple insertion of .* between letters works (so searched string "ab" becomes regular expression .*a.*b.*), but I want to make it on large amount of files.

So far I have O(N*???), where N is amount of filenames and ??? is at best linear complexity (I assume my language uses NFA). I don't care about space complexity that much. What data structures or algorithms should I choose to make it more efficient (in time complexity)?

like image 763
user712092 Avatar asked Feb 04 '26 06:02

user712092


1 Answers

If you just want to check whether the characters of a search string search are contained in another string str in the same order, you could use this simple algorithm:

pos := -1
for each character in search do
    pos := indexOf(str, character, pos+1)
    if pos is -1 then
        break
    endif
endfor
return pos

This algorithm returns the offset of the last character of search in str and -1 otherwise. Its runtime is in O(n) (you could replace indexOf by a simple while loop that compares the characters in str from pos to Length(str)-1 and returns either the offset or -1).

like image 76
Gumbo Avatar answered Feb 06 '26 19:02

Gumbo



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!