Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Simple wildcard search algorithm in C++

I have an assignment in which I have to create a search pattern including wildcard character '?'. We haven't covered anything further than loops and properties of string libraries yet, so my teacher doesn't want me to use arrays or anything we haven't covered.

My problem is to create an algorithm for the special character '?'. Do you have any idea how can I integrate it into my program without using more advanced tricks? Everything I tried is either completely wrong or has some mistakes in it.

Program should request an input from the user for the source string and then, ask for another input for search string which can include '?' in it. For example:

Source string: glorious Search string: ?r?o

The matched string was found at index: 2 The matched string is: orio

like image 559
hevele Avatar asked Apr 01 '26 16:04

hevele


1 Answers

I felt bad for only hinting on backtracking and recursion in a comment. Here's an explanation:

Strategy:

Focus on the tokens between wilcards (the wildcards are not what should be matched).

  • extract first token from pattern
  • exit with success for no (more) tokens
  • for each token match in input
    • match the remainder of the pattern against the remainder of the input
    • if no successful submatch, fail, otherwise done

There is recursion (the matching of the remainder class match(....) recursively).

There is backtracking (if the recursive match doesn't succeed, we try the next token submatch)

Sample (see https://ideone.com/yApYp)

Only using loops and std::string interface (well, and iostreams for displaying test output) :)

#include <iostream>
#include <string>

typedef std::string::const_iterator It;

/*
 * Extract sequences of non-wildcard characters from pattern range
 */
std::string extract_token(It &s, It e) // [s,e) is (sub)pattern
{
    It wcard;
    for (wcard=s; wcard!=e; ++wcard)
        if ('?' == *wcard) break;

    std::string token(s,wcard);

    for (s=wcard; s!=e; ++s)
        if ('?' != *s) break; // treat '??' as '?' in pattern

    return token;
}

/*
 * Match a (sub)pattern against a (sub)input
 *
 * (See "Strategy" above)
 */
bool match(It patb, It pate, const std::string& input)
{
    while (patb != pate)
    {
        // get next token from pattern, advancing patb
        std::string token = extract_token(patb, pate); // updates patb

        if (!token.empty()) // could happen if pattern begins/ends with redundant '?'
        {
            size_t submatch = input.find(token);  // first submatch please

            while (std::string::npos != submatch)  // while we have a submatch
            {
                if (match(patb, pate, input.substr(token.size())))
                    return true; // match completed successfully

                // look for later potential submatches (*backtrack*)
                submatch = input.find(token, submatch+1);
            }
            return false; // required token not found
        }
    }
    return true; // no (remaining) pattern, always match
}

bool match(const std::string& pattern, const std::string& input)
{
    // just relay to overload more suited for recursion
    return match(pattern.begin(), pattern.end(), input); 
}

//////////////////////
// TEST PROGRAM

void test(const std::string& pattern, const std::string& input)
{
    std::cout << std::boolalpha;
    std::cout << "match(\"" << pattern << "\", \"" << input << "\") => " 
              << match(pattern, input) << std::endl;
}

int main()
{
    // matches
    test("?????",               "");
    test("?????",               "?????");
    test("",                    "");
    test("",                    "glorious");
    test("?r?o",                "glorious");
    test("some?words?exist",    "some silly words should, most definitely, be existing");
    test("some??words?exist?",  "some silly words should, most definitely, be existing");

    // failing matches
    test("_",                   "");
    test("_",                   "glorious");
    test("_",                   "glorious");
    test("glorious",            "glo?ious");
    test("?some??words?exist?", "bogus");
}
like image 145
sehe Avatar answered Apr 04 '26 08:04

sehe