I want to matching of filenames like Colibri does. I tried to solve it by regular expressions.
Searching in Colibri works that You type characters which are in order inside of a filename and it finds all files with these characters in order in the filename. For example for "ab" it finds "cabal", "ab", and "achab".
Simple insertion of .* between letters works (so searched string "ab" becomes regular expression .*a.*b.*), but I want to make it on large amount of files.
So far I have O(N*???), where N is amount of filenames and ??? is at best linear complexity (I assume my language uses NFA). I don't care about space complexity that much. What data structures or algorithms should I choose to make it more efficient (in time complexity)?
If you just want to check whether the characters of a search string search are contained in another string str in the same order, you could use this simple algorithm:
pos := -1
for each character in search do
pos := indexOf(str, character, pos+1)
if pos is -1 then
break
endif
endfor
return pos
This algorithm returns the offset of the last character of search in str and -1 otherwise. Its runtime is in O(n) (you could replace indexOf by a simple while loop that compares the characters in str from pos to Length(str)-1 and returns either the offset or -1).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With