
Regex to match only some filenames

Tags: c++, regex, c++11

Using std::regex and given a file path, I want to match only the filenames that end with .txt and that are not of the form _test.txt or .txtTEMP. Any other underscore is fine.

So, for example:

  • somepath/testFile.txt should match.
  • somepath/test_File.txt should match.
  • somepath/testFile_test.txt should not match.
  • somepath/testFile.txtTEMP should not match.

What is the correct regex for such a pattern?

What I have tried:

(.*?)(\.txt) ---> This matches any file path ending with .txt.

To exclude files that contain _test, I tried to use a negative lookahead:

(.*?)(?!_test)(\.txt)

But it didn't work.

I also tried negative lookbehind but MSVC14 (Visual Studio 2015) throws a std::regex_error exception when creating the regex, so I'm not sure if it's not supported or I'm using the wrong syntax.

Banex asked Jul 27 '15 15:07


3 Answers

Based on what you posted, use this pattern, which rejects any path containing an underscore:

^(?!.*_).*\.txt$



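Or, based on the OP's edit, use this pattern. Note that it relies on a negative lookbehind, which the default ECMAScript grammar of std::regex does not support: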

^(.*(?<!_test)\.txt$)


alpha bravo answered Oct 23 '22 15:10


^(?!.*?_test\.).*\.txt$

I do not have access to VS 2015 at the moment, but this uses only a lookahead, so it should work.
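
For illustration, here is a minimal sketch of how the lookahead-only pattern above might be used with std::regex. The helper name matchesTxtButNotTest is hypothetical and not part of the original answer; the sketch assumes the whole path should be tested with std::regex_match.

```cpp
#include <regex>
#include <string>

// Hypothetical helper; the pattern is the lookahead-only one from this answer.
bool matchesTxtButNotTest(const std::string& path) {
    // Raw string literal avoids doubling the backslashes.
    static const std::regex pattern(R"(^(?!.*?_test\.).*\.txt$)");
    // regex_match succeeds only if the entire path matches the pattern.
    return std::regex_match(path, pattern);
}
```

Calling matchesTxtButNotTest("somepath/test_File.txt") should return true, while "somepath/testFile_test.txt" and "somepath/testFile.txtTEMP" should be rejected. One caveat: the lookahead rejects "_test." anywhere in the path, so a directory name such as "my_test.d" would also cause a rejection.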

Alexander Balabin answered Oct 23 '22 15:10


Best bet? Don't use regexes, particularly in a simple string-search case like this one.

First, a couple of simple optimizations follow from the question's parameters:

  1. Since the input string's extension must be ".txt", we don't need to check whether the extension is ".txtTEMP".
  2. The only remaining don't-match condition, where the input string ends in "_test.txt", requires checking only that the stem ends in "_test", since the extension is already known to be ".txt".

Both of these checks are always at a fixed offset from the end of the input string. Since all the information for both checks is known in advance, it should be set up at compile time:

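// Required suffix and its length; sizeof counts the '\0', hence the - 1.
// (strlen is not constexpr in standard C++, so sizeof on an array is used.)
constexpr char doMatch[] = ".txt";
constexpr auto doMatchSize = sizeof(doMatch) - 1;
// Disqualifying stem suffix; the size spans "_test" plus ".txt".
constexpr char doNotMatch[] = "_test";
constexpr auto doNotMatchSize = sizeof(doNotMatch) - 1 + doMatchSize;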

Given a std::string named input, it could be tested for success as follows:

if(input.size() >= doMatchSize &&
   equal(input.end() - doMatchSize, input.end(), doMatch) &&
   (input.size() < doNotMatchSize ||
   !equal(input.end() - doNotMatchSize, input.end() - doMatchSize, doNotMatch)))

You can see a live example here: http://ideone.com/7BcyFi
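
As a rough sketch, the two suffix checks above can be wrapped in a single function. The name isWantedTxtFile is hypothetical, and this version assumes sizeof on character arrays rather than strlen, so that the lengths are compile-time constants on any conforming compiler.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Suffix literals come from the answer; sizeof - 1 drops the trailing '\0'.
constexpr char doMatch[] = ".txt";
constexpr std::size_t doMatchSize = sizeof(doMatch) - 1;
constexpr char doNotMatch[] = "_test";
constexpr std::size_t doNotMatchSize = sizeof(doNotMatch) - 1 + doMatchSize;

// Hypothetical helper: true if input ends in ".txt" but not in "_test.txt".
bool isWantedTxtFile(const std::string& input) {
    // The input must end in ".txt".
    if (input.size() < doMatchSize ||
        !std::equal(input.end() - doMatchSize, input.end(), doMatch)) {
        return false;
    }
    // Too short to end in "_test.txt", so it is accepted.
    if (input.size() < doNotMatchSize) {
        return true;
    }
    // Reject when the stem (the part before ".txt") ends in "_test".
    return !std::equal(input.end() - doNotMatchSize,
                       input.end() - doMatchSize, doNotMatch);
}
```

Because ".txtTEMP" does not end in ".txt", it is rejected by the first check without any special handling.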

Jonathan Mee answered Oct 23 '22 16:10