
Differences in regex support between gcc 4.9.2 and gcc 5.3

Tags: c++, regex, gcc

Can anyone more familiar with gcc point out why the sample below fails to match on gcc 4.9.2 but succeeds on gcc 5.3? Is there anything I can do to alter the pattern so that it works on both? (It also seems to work fine on VS 2013.)

#include <cstring>   // strlen
#include <iostream>
#include <regex>

std::regex pattern("HTTP/(\\d\\.\\d)\\s(\\d{3})\\s(.*)\\r\\n(([!#\\$%&\\*\\+\\-\\./a-zA-Z\\^_`\\|-]+\\:[^\\r]+\\r\\n)*)\\r\\n");

const char* test = "HTTP/1.1 200 OK\r\nHost: 192.168.1.72:8080\r\nContent-Length: 86\r\n\r\n";

int main()
{
    std::cmatch results;
    bool matched = std::regex_search(test, test + strlen(test), results, pattern);
    std::cout << matched;
    return 0; 
}

I assume I am using something that is not supported in gcc 4.9.2 but was added or fixed later, but I have no idea where to look it up.

UPDATE

Thanks to all the help and suggestions, I tried to track down the issue instead of just switching to gcc 5. I get correct matches with this modification:

#include <cstring>   // strlen
#include <iostream>
#include <regex>

std::regex pattern("HTTP/(\\d\\.\\d)\\s(\\d{3})\\s(.*?)\\r\\n(?:([^:]+\\:[^\\r]+\\r\\n)*)\\r\\n");

const char* test = "HTTP/1.1 200 OK\r\nHost: 192.168.1.72:8080\r\nContent-Length: 86\r\n\r\n";

int main()
{
    std::cmatch results;
    bool matched = std::regex_search(test, test + strlen(test), results, pattern);
    std::cout << matched << std::endl;
    if (matched)
    {
        for (const auto& result : results)
        {
            std::cout << "matched: " << result.str() << std::endl;
        }
    }
    return 0;
}

So I guess the problem is with the group that matches the HTTP header name. Will check further.

UPDATE 2

std::regex pattern(R"(HTTP/(\d\.\d)\s(\d{3})\s(.*?)\r\n(?:([!#$&a-zA-Z^_`|-]+\:[^\r]+\r\n)*)\r\n)")

is the last thing that works. Adding any of the remaining characters that I had in my group - %*+-. (escaped or not escaped) - breaks it.

Rudolfs Bundulis asked Apr 19 '16 13:04



1 Answer

As far as I know, GCC did not officially support the C++11 regex library until GCC 4.9 (see "Is gcc 4.8 or earlier buggy about regular expressions?"). Since the implementation was so new in 4.9, it likely still had a few bugs to smooth out. Pinning down the exact cause would be difficult, but the problem is in the implementation and not in your regex.

Side note: I remember once spending 20 minutes trying to figure out what was wrong with my regex before I found the question mentioned above and realized that I was using gcc 4.8.*. Since the machine I had to run on wasn't mine, I ended up compiling on a different, similar platform with a later version of gcc and a few hacks, and the result then ran on the target platform.

HackerBoss answered Nov 08 '22 20:11
