Find only first std::regex match efficiently

Question

I'm trying to find an efficient way to greedily find the first match for a std::regex without analyzing the whole input.

My specific problem is that I wrote a hand-written lexer and I'm trying to provide rules to parse common literal values (e.g. a numeric value).

So suppose a simple rule, say:

std::regex integralRegex = std::regex("([+-]?[1-9]*[0-9]+)");

Is there a way to find the longest match starting from the beginning of input without scanning all of it? It looks like std::regex_match tries to match the whole input while std::regex_search forcefully finds all matches.

Maybe I'm missing a trivial overload for my purpose but I can't find an efficient solution to the problem.

Just to clarify the question: I'm not interested in stopping after the first sub-match. For an input like "51+12*3" I'd like something that finds the first match, 51, and then stops, ignoring whatever comes after it.

Marek R · Accepted Answer

First of all, I think [+-]?[1-9]?[0-9]+ does the same thing but should be a bit faster. Or perhaps you intended something like [+-]?[1-9][0-9]*|0 (a zero without a sign, or a number that does not start with zero).

Secondly, C++ provides a regular expression iterator:
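To make the difference between these patterns concrete, here is a small sketch that checks whole strings against a pattern with std::regex_match. The helper name matches_whole is made up for this illustration and is not part of the standard library.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not a standard function): true if the whole of
// `text` matches `pattern`, using the default ECMAScript grammar.
bool matches_whole(const std::string& pattern, const std::string& text) {
    const std::regex re(pattern);
    return std::regex_match(text, re);
}
```

With this helper, the original pattern and [+-]?[1-9]?[0-9]+ both accept "007", while [+-]?[1-9][0-9]*|0 rejects "007" and "-0" but still accepts "0" and "+51".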

#include <iostream>
#include <iterator>
#include <regex>
#include <string>

int main() {
    const std::string s = "51+12*3";

    std::regex number_regex("[+-]?[1-9]?[0-9]+");
    auto words_begin =
        std::sregex_iterator(s.begin(), s.end(), number_regex);
    auto words_end = std::sregex_iterator();

    // Counting the matches walks through the whole input once.
    std::cout << "Found "
              << std::distance(words_begin, words_end)
              << " numbers:\n";

    // Each dereference yields the next match in the input.
    for (std::sregex_iterator i = words_begin; i != words_end; ++i) {
        std::smatch match = *i;
        std::string match_str = match.str();
        std::cout << match_str << '\n';
    }
}

And it looks like this is what you need.

https://wandbox.org/permlink/tkaAfIslkWeY2poo
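For the lexer case in the question, where only the token at the current position matters, one option is std::regex_search with the std::regex_constants::match_continuous flag, which only accepts a match that begins at the first iterator. A minimal sketch follows, assuming C++17 for std::optional. The helper name match_prefix is hypothetical.

```cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical helper: returns the text matched at the very start of
// [first, last), or std::nullopt if the pattern does not match there.
std::optional<std::string> match_prefix(std::string::const_iterator first,
                                        std::string::const_iterator last,
                                        const std::regex& re) {
    std::smatch m;
    // match_continuous restricts the search to matches starting at `first`,
    // so the call does not scan ahead looking for a later match.
    if (std::regex_search(first, last, m, re,
                          std::regex_constants::match_continuous)) {
        return m.str();
    }
    return std::nullopt;
}
```

Called on "51+12*3" with the pattern [+-]?[1-9]?[0-9]+, this should yield 51. A lexer could then advance its position by the length of the returned token and call the helper again from there.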

Tags: c++, regex, c++17, lexer

Jack
