I need to implement something like my own file system. One operation would be the FindFirstFile. I need to check, if the caller passed something like ., sample*.cpp or so. My "file system" implementation provides the list of "files names" as a array of char*.
Is there any Windows function or any source code that implements this file name matching?
An asterisk is replaced by any number of characters in a filename. For example, ae* would match aegis, aerie, aeon, etc. if those files were in the same directory. You can use this to save typing for a single filename (for example, al* for alphabet.
Wildcards (also referred to as meta characters) are symbols or special characters that represent other characters. You can use them with any command such as ls command or rm command to list or remove files matching a given criteria, receptively.
The asterisk represents any number of unknown characters. Use it when searching for documents or files for which you have only partial names. For most web search engines, wildcards increase the number of your search results.
The wildcard '*' means it will match any number of characters or a set of characters. For example, S**n will match anything between S and n. The number of characters between them do not count. Example: Here, we can see in the result that files starting with 'A' and ending with 'f' are displayed.
For wildcard name matching using '*' and '?' try this (if you want to avoid boost, use std::tr1::regex):
#include <boost/regex.hpp>
#include <boost/algorithm/string/replace.hpp>
using std::string;
bool MatchTextWithWildcards(const string &text, string wildcardPattern, bool caseSensitive /*= true*/)
{
// Escape all regex special chars
EscapeRegex(wildcardPattern);
// Convert chars '*?' back to their regex equivalents
boost::replace_all(wildcardPattern, "\\?", ".");
boost::replace_all(wildcardPattern, "\\*", ".*");
boost::regex pattern(wildcardPattern, caseSensitive ? regex::normal : regex::icase);
return regex_match(text, pattern);
}
void EscapeRegex(string ®ex)
{
boost::replace_all(regex, "\\", "\\\\");
boost::replace_all(regex, "^", "\\^");
boost::replace_all(regex, ".", "\\.");
boost::replace_all(regex, "$", "\\$");
boost::replace_all(regex, "|", "\\|");
boost::replace_all(regex, "(", "\\(");
boost::replace_all(regex, ")", "\\)");
boost::replace_all(regex, "{", "\\{");
boost::replace_all(regex, "{", "\\}");
boost::replace_all(regex, "[", "\\[");
boost::replace_all(regex, "]", "\\]");
boost::replace_all(regex, "*", "\\*");
boost::replace_all(regex, "+", "\\+");
boost::replace_all(regex, "?", "\\?");
boost::replace_all(regex, "/", "\\/");
}
There are quite a few such functions around. Here's a directory of various implementations, sorted into recursive and non-recursive, etc.
In case you don't like the licensing there (or have trouble with the link, etc.) here's one possible implementation of a matching algorithm that at least closely approximates what Windows uses:
#include <string.h>
#include <iostream>
bool match(char const *needle, char const *haystack) {
for (; *needle != '\0'; ++needle) {
switch (*needle) {
case '?':
if (*haystack == '\0')
return false;
++haystack;
break;
case '*': {
if (needle[1] == '\0')
return true;
size_t max = strlen(haystack);
for (size_t i = 0; i < max; i++)
if (match(needle + 1, haystack + i))
return true;
return false;
}
default:
if (*haystack != *needle)
return false;
++haystack;
}
}
return *haystack == '\0';
}
#ifdef TEST
#define CATCH_CONFIG_MAIN
#include "catch.hpp"
TEST_CASE("Matching", "[match]") {
REQUIRE(match("a", "a") == true);
REQUIRE(match("a", "b") == false);
REQUIRE(match("a*", "a") == true);
REQUIRE(match("a?", "a") == false);
REQUIRE(match("a?", "ab") == true);
REQUIRE(match("a*b", "ab") == true);
REQUIRE(match("a*b", "acb") == true);
REQUIRE(match("a*b", "abc") == false);
REQUIRE(match("*a*??????a?????????a???????????????",
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa") == true);
}
#endif
Since there was a discussion of complexity of some of the other answers, I'll note that I believe this has O(NM) complexity and O(M) storage use (where N is the size of the target string, and M is the size of the pattern).
With @masterxilo's test pair:
"*a*??????*a*?????????a???????????????", "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
...this finds a match in approximately 3 microseconds on my machine. That is a lot slower than a typical pattern--most of my other tests run in about 300 nanoseconds or so on this particular machine.
At the same time, @masterxilo's code takes approximately 11 microseconds to run on the same machine, so this is still around 3 to 4 times faster (not to mention being somewhat smaller and simpler).
Have a look at the POSIX functions fnmatch
, glob
, and wordexp
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With