
powershell filter filenames with regex

Tags:

powershell

I am building a list of files that I'm putting into my $list variable.

Then I want to filter the list based on the $filter variable. The current solution works, but it doesn't work with a regex.

$filter = @("test.txt","Fake","AnotherFile\d{1..6}")

######### HTML TESTS #############
[string]$list = @"
FakeFile.txt
test120119.txt
AnotherFile120119.txt
LastFile.txt
"@

[array]$files = $list -split '\r?\n'
$files = $files | Where-Object {$_} | Where {$_ -notin $filter} # filter out empty items from the array...

$files

My idea is to put regex patterns in the $filter variable so I can catch filenames that have datestamps in them such as test120119.txt in the $list variable above.

How can I change my code to allow for regex? I tried some variations of Select-String without splitting my $list, but that was not fruitful. I also tried changing my -notin to -notmatch, but of course that doesn't work at all.

shadow2020 asked Jan 18 '26 08:01


1 Answer

If you want to use regex, I think it would be easier to just fully commit to regex with your $filter array.

$filter = "^test\d{0,6}\.txt","^Fake","^AnotherFile\d{0,6}\.txt" -join '|'

$list = @"
FakeFile.txt
test120119.txt
AnotherFile120119.txt
LastFile.txt
"@

$files = $list -split '\r?\n'
$files | Where {$_ -notmatch $filter}

Keep in mind that special regex characters must be escaped if you want them treated literally. The [regex]::Escape() method can do this for you, but do not apply it to strings in which you have deliberately included regex syntax, because it would escape that syntax as well.

Once you have your list of regex filters, you can join the items with the regex OR (alternation) character, |.
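As a rough sketch of the two points above, literal file names can be passed through [regex]::Escape() while hand-written patterns are left untouched, and everything is then joined with |. The variable names $literalNames, $patterns and $escaped, the sample file names, and the added $ anchors are assumptions made for illustration; they are not part of the original question.

```powershell
# Hypothetical literal names that should match exactly as written
$literalNames = 'test.txt', 'notes(old).txt'

# Hand-written regex patterns; these must NOT be escaped
$patterns = '^Fake', '^AnotherFile\d{0,6}\.txt$'

# Escape only the literals and anchor them; @() keeps the result an array
$escaped = @($literalNames | ForEach-Object { '^' + [regex]::Escape($_) + '$' })

# Join all items with the regex OR character
$filter = ($escaped + $patterns) -join '|'

$files = 'FakeFile.txt', 'test.txt', 'testXtxt', 'notes(old).txt',
         'AnotherFile120119.txt', 'LastFile.txt'

# Keep only the names that match none of the filters
$kept = $files | Where-Object { $_ -notmatch $filter }
$kept   # testXtxt and LastFile.txt
```

Without the escaping step, the unescaped . in test.txt would also match testXtxt, and the parentheses in notes(old).txt would be read as a capture group instead of literal characters.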

Not all operators understand regex. -match and -notmatch are among those that do (-replace and -split are others). -match and -notmatch are case-insensitive by default. If you need case-sensitive matching, use the -c variants of the operators, namely -cmatch and -cnotmatch.
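A small sketch of the difference, using made-up file names that are not from the question:

```powershell
$name = 'fakefile.txt'

$name -match  '^Fake'    # True:  -match ignores case
$name -cmatch '^Fake'    # False: -cmatch is case-sensitive

# The same distinction applies when filtering a list
$files = 'FakeFile.txt', 'fakefile.txt', 'LastFile.txt'

$insensitive = $files | Where-Object { $_ -notmatch  '^Fake' }   # LastFile.txt
$sensitive   = $files | Where-Object { $_ -cnotmatch '^Fake' }   # fakefile.txt, LastFile.txt
```

Which variant is appropriate depends on whether the file names being filtered are expected to have consistent casing.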

The regex items can be tweaked to your liking. More requirements would need to be given in order to come up with an exact solution. Here are some examples to consider:

  • \d{0,6} matches 0 to 6 consecutive digits. 122619 will match successfully, but so will 1226. If you want only exactly 0 or exactly 6 digits to match, you can use (\d{6})?.
  • ^ anchors a match to the beginning of the input string. For the regex OR to apply from the start of the string, either include ^ in each item or group the items after a single leading ^ with (). ^item1|^item2 will return the same capture group 0 match as ^(item1|item2).
  • \ escapes the . characters so that they match a literal dot.
  • Leaving out anchor characters like ^ and $ gives a lot of flexibility but can produce unwanted results. 'FakeFile' -match 'Fake' returns true, but so does 'MyFakeFile' -match 'Fake'. However, 'MyFakeFile' -match 'Fake$' returns false and 'MyFake' -match 'Fake$' returns true.
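
The quantifier and grouping points in the list above can be checked with a short sketch. The variable names and sample file names here are hypothetical and chosen only to show the difference:

```powershell
# \d{0,6} accepts any run of 0 to 6 digits; (\d{6})? accepts exactly 0 or exactly 6
$loose  = '^test\d{0,6}\.txt'
$strict = '^test(\d{6})?\.txt'

$names = 'test.txt', 'test1226.txt', 'test122619.txt'

$looseHits  = $names | Where-Object { $_ -match $loose }    # all three names
$strictHits = $names | Where-Object { $_ -match $strict }   # test.txt, test122619.txt

# ^item1|^item2 and ^(item1|item2) accept the same strings
$samples  = 'FakeFile.txt', 'MyFakeFile.txt', 'test1.txt'
$separate = $samples | Where-Object { $_ -match '^Fake|^test' }
$grouped  = $samples | Where-Object { $_ -match '^(Fake|test)' }
# Both contain FakeFile.txt and test1.txt; MyFakeFile.txt is rejected by the ^ anchor
```
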
AdminOfThings answered Jan 20 '26 13:01