 

Why does bash "=~" operator ignore the last part of the pattern specified?

Tags: regex, bash

I am trying to compare a string in bash to a regex pattern and have found something odd. For starters, I am using GNU bash, version 5.0.17(1)-release (x86_64-pc-linux-gnu). This is within WSL.

For example, here is a sample program demonstrating the problem:

#!/bin/env bash

name="John"

if [[ "${name}" =~ "John"* ]]; then
    echo "found"
else
    echo "not found"
fi

exit

As expected, this will echo found, since the name "John" matches the regex pattern described. What I find odd is that if I drop the n in John, it still echoes found. Imo "Joh" does not match the pattern of "John"*.

If you drop the "hn" and just set $name to "Jo", then it echoes not found. It seems to only affect the last character in the regex pattern (aside from the wildcard).

I am converting an old csh script to bash and this behavior is not happening in csh. What is causing bash to do this?

Tyler asked Oct 16 '25 04:10



1 Answer

You're mixing up the syntax for shell patterns and regular expressions. Your regular expression, after stripping the quoting, is John*: Joh followed by any number of n, including zero. It matches Joh, John, Johnn, Johnnn, ...

It's not anchored, so it also matches any string containing one of the matches above.
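A minimal sketch of that behaviour, assuming a reasonably recent bash. The helper name check and the sample strings are made up for illustration; the regex is the asker's pattern with the quoting stripped:

```bash
#!/usr/bin/env bash

# Hypothetical helper: tests one string against the unanchored regex John*
# (Joh followed by zero or more n, anywhere in the string).
check() {
    if [[ $1 =~ John* ]]; then
        echo "found"
    else
        echo "not found"
    fi
}

# Sample inputs chosen for illustration only.
for name in "John" "Joh" "Johnnn" "Jo" "xx Joh yy"; do
    printf '%-10s %s\n' "$name" "$(check "$name")"
done
```

Only "Jo" should report not found, because the regex requires at least Joh somewhere in the string.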

Depending on what you want, you could use any of these:

  • Any string containing John should match:
    • Regex: [[ $name =~ John ]]
    • Shell pattern: [[ $name == *John* ]]
  • Any string that begins with John should match:
    • Regex: [[ $name =~ ^John ]]
    • Shell pattern: [[ $name == John* ]]

Notice that shell patterns, unlike regular expressions, must match the entire string.
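To compare the four variants side by side, here is a rough sketch. The report function, its labels, and the test strings are hypothetical and only serve the demonstration:

```bash
#!/usr/bin/env bash

# Hypothetical helper: lists which of the tests a given string passes.
report() {
    local s=$1 out=""
    [[ $s =~ John ]]   && out+=" regex-contains"   # unanchored regex
    [[ $s == *John* ]] && out+=" glob-contains"    # pattern with wildcards on both sides
    [[ $s =~ ^John ]]  && out+=" regex-prefix"     # regex anchored at the start
    [[ $s == John* ]]  && out+=" glob-prefix"      # pattern with trailing wildcard
    [[ $s == John ]]   && out+=" glob-exact"       # pattern must cover the whole string
    echo "${s}:${out:- none}"
}

report "John"
report "Johnny"
report "Big John"
report "Joh"
```

"Big John" should pass only the two "contains" tests, and only the exact string "John" passes the bare John pattern, since a shell pattern without wildcards has to match the whole string.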

A note on quoting: within [[ ... ]], the left-hand side doesn't have to be quoted; on the right-hand side, quoted parts are interpreted literally. For regular expressions, it's good practice to define the pattern in a separate variable:

re='^John'
if [[ $name =~ $re ]]; then
    echo "found"
fi

This avoids a few edge cases with special characters in the regex.
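One such edge case, sketched under the assumption of bash 3.2 or later: a regex containing a space and a bracket expression is awkward to write inline, but works cleanly from a variable as long as the variable is left unquoted. The sample name and regex below are invented for illustration:

```bash
#!/usr/bin/env bash

name="John Smith"
re='^John [A-Z]'   # illustrative regex: John, a space, then an uppercase letter

# Unquoted expansion: the variable's contents are treated as a regex.
if [[ $name =~ $re ]]; then
    echo "unquoted: regex match"
fi

# Quoted expansion: the contents are treated as a literal string,
# so ^ and [A-Z] lose their special meaning and nothing matches.
if [[ $name =~ "$re" ]]; then
    echo "quoted: literal match"
else
    echo "quoted: no match"
fi
```

Keeping the pattern in single quotes at assignment and unquoted at the point of use sidesteps having to escape spaces or other special characters inside [[ ... ]].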

Benjamin W. answered Oct 17 '25 16:10



