I have a shell script that needs to check if a file name matches a certain regex, but it always shows "not match". Can anyone let me know what's wrong with my code?
fileNamePattern=abcd_????_def_*.txt
realFilePath=/data/file/abcd_12bd_def_ghijk.txt
if [[ $realFilePath =~ $fileNamePattern ]]
then
echo $realFilePath match $fileNamePattern
else
echo $realFilePath not match $fileNamePattern
fi
There is a confusion between regexes and the simpler "glob"/"wildcard"/"normal" patterns – whatever you want to call them. You're using the latter, but call it a regex.
If you want to use a pattern, you should
Quote it when assigning [1]:
fileNamePattern="abcd_????_def_*.txt"
You don't want anything to expand quite yet.
Make it match the complete path. This doesn't match:
$ mypath="/mydir/myfile1.txt"
$ mypattern="myfile?.txt"
$ [[ $mypath == $mypattern ]] && echo "Matches!" || echo "Doesn't match!"
Doesn't match!
But after extending the pattern to start with *:
$ mypattern="*myfile?.txt"
$ [[ $mypath == $mypattern ]] && echo "Matches!" || echo "Doesn't match!"
Matches!
The first attempt fails because the pattern describes only the filename, while == has to match the entire string, which here is the complete path. Alternatively, you could use the first pattern, but remove the rest of the path with parameter expansion:
$ mypattern="myfile?.txt"
$ mypath="/mydir/myfile1.txt"
$ echo "${mypath##*/}"
myfile1.txt
$ [[ ${mypath##*/} == $mypattern ]] && echo "Matches!" || echo "Doesn't match!"
Matches!
Use == and not =~, as shown in the above examples. You could also use the more portable = instead, but since we're already using the non-POSIX [[ ]] instead of [ ], we might as well use ==.
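Putting the three points above together, the script from the question could be sketched like this. The helper name name_matches_glob is made up for this example and is not part of the original script; the pattern and path are the ones from the question.

```shell
#!/usr/bin/env bash
# Glob pattern for the file name only, quoted at assignment.
fileNamePattern="abcd_????_def_*.txt"

# Hypothetical helper: succeeds if the basename of $1 matches the glob.
name_matches_glob() {
    local path=$1
    # ${path##*/} strips everything up to the last slash. The pattern stays
    # unquoted on the right-hand side so [[ ]] treats it as a glob.
    [[ ${path##*/} == $fileNamePattern ]]
}

realFilePath=/data/file/abcd_12bd_def_ghijk.txt
if name_matches_glob "$realFilePath"; then
    echo "$realFilePath matches $fileNamePattern"
else
    echo "$realFilePath does not match $fileNamePattern"
fi
```

The echo arguments are double-quoted so that the pattern is printed as written instead of being expanded against files in the current directory.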
If you want to use a regex, you should:
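Write your pattern as one: ? and * have a different meaning in regexes; they modify what they stand after, whereas in glob patterns, they can stand on their own (see the manual). A literal dot also has to be escaped, because an unescaped . matches any character, and a trailing $ anchors the match at the end of the path. The corresponding pattern would become:
fileNameRegex='abcd_.{4}_def_.*\.txt$'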
and could be used like this:
$ mypath="/data/file/abcd_12bd_def_ghijk.txt"
$ [[ $mypath =~ $fileNameRegex ]] && echo "Matches!" || echo "Doesn't match!"
Matches!
Keep your habit of writing the regex into a separate parameter and then use it unquoted in the conditional operator [[ ]], or escaping gets very messy. It's also more portable across Bash versions.
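As a rough sketch of the regex route, here is a small function that checks only the file name. The function name name_matches_regex is hypothetical, and the ^ and $ anchors plus the escaped dot are my additions to make the check strict; adjust them if you want a looser match.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the basename of $1 matches the regex.
name_matches_regex() {
    local name=${1##*/}
    # Regex kept in its own variable and single-quoted, so the backslash
    # reaches the regex engine untouched. Anchors are an addition here.
    local re='^abcd_.{4}_def_.*\.txt$'
    # The variable is used unquoted so it is interpreted as a regex.
    [[ $name =~ $re ]]
}

if name_matches_regex /data/file/abcd_12bd_def_ghijk.txt; then
    echo "Matches!"
else
    echo "Doesn't match!"
fi
```

With the escaped dot, a name such as abcd_12bd_def_ghijkXtxt is rejected, whereas an unescaped . would let it through.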
The BashGuide has a great article about the different types of patterns in Bash.
Notice that quoting your parameters is almost always a good habit. It's not required in conditional expressions in [[ ]], and actually suppresses interpretation of the right-hand side as a pattern or regex. If you were using [ ] (which doesn't support regexes and patterns anyway), quoting would be required to avoid unexpected side effects of special characters and empty strings.
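To see the effect of quoting the right-hand side inside [[ ]], here is a small demonstration. The variable names are only for illustration.

```shell
#!/usr/bin/env bash
name="myfile1.txt"
pattern="myfile?.txt"

# Unquoted right-hand side: ? acts as a single-character wildcard.
if [[ $name == $pattern ]]; then unquoted="match"; else unquoted="no match"; fi

# Quoted right-hand side: the pattern is compared as a literal string,
# so the ? would have to appear literally in $name.
if [[ $name == "$pattern" ]]; then quoted="match"; else quoted="no match"; fi

echo "unquoted: $unquoted"   # unquoted: match
echo "quoted: $quoted"       # quoted: no match
```

So inside [[ ]] the quotes are a way to ask for a literal comparison on purpose.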
[1] Not exactly true in this case, actually. When assigning to a variable, the manual says that the following happens:
[...] tilde expansion, parameter and variable expansion, command substitution, arithmetic expansion, and quote removal [...]
i.e., no pathname (glob) expansion. While in this very case using
fileNamePattern=abcd_????_def_*.txt
would work just as well as the quoted version, using quotes prevents surprises in many other cases and is required as soon as you have a blank in the pattern.
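If you want to convince yourself of this, the following sketch creates a scratch directory (via mktemp, my choice for the demo) containing a file the glob would match, and shows that the unquoted assignment still stores the pattern itself, while an unquoted use in an array assignment does expand it.

```shell
#!/usr/bin/env bash
# Scratch directory with one file that the glob would match.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch abcd_12bd_def_ghijk.txt

# Unquoted scalar assignment: no pathname expansion is performed.
fileNamePattern=abcd_????_def_*.txt
printf 'stored: %s\n' "$fileNamePattern"     # stored: abcd_????_def_*.txt

# Unquoted use in an array assignment: here the glob does expand.
expanded=( $fileNamePattern )
printf 'expanded: %s\n' "${expanded[0]}"     # expanded: abcd_12bd_def_ghijk.txt

# Clean up the scratch directory.
cd / && rm -rf "$tmpdir"
```

This assumes default shell options (no noglob or nullglob set).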