I know for a fact that bash supports extended globs with regular-expression-like operators such as @(foo|bar), *(foo) and ?(foo). This syntax is quite unique, i.e. different from that of EREs: extended globs use a prefix notation (where the operator appears before its operands), rather than postfix like EREs.
I'm wondering whether it also supports interval expressions of the form {n,m}: if there is one number in the braces, the preceding regexp is repeated n times; if there are two numbers separated by a comma, the preceding regexp is repeated n to m times. I couldn't find any documentation suggesting that this is supported in extended globs.
I came across a requirement in one of the questions today: remove only one pair of trailing zeroes from a string. I am trying to solve this with the extended glob support in bash.
Given some sample strings like
foobar0000
foobar00
foobar000
should produce
foobar00
foobar
foobar0
respectively. I tried using extended globs with parameter expansion on a variable such as
x='foobar000'
I tried the interval expression as below, although it seemed obvious to me that it wouldn't work
echo ${x%%+([0]{2})}
i.e. similar to what sed does in ERE as sed -E 's/[0]{2}$//'
or in BRE as sed 's/[0]\{2\}$//'
So my question is: is this possible using any of the extended glob operators? I'm looking for answers specific to the extended glob support in bash; I would take 'No' if it is not possible, too.
Somehow I managed to find a way to do this within the confines of bash.
Does bash support interval expressions in globs? No! In contrast to other shells such as ksh and zsh, bash does not implement interval expressions for globbing.
Can interval expressions be mimicked in bash? Yes! However, it is not really practical, and for larger counts it helps to build the pattern with printf. The idea is to build a glob expression that mimics the {n,m} interval using the ksh-globs @(pattern) and ?(pattern).
In the explanation below, we assume that the pattern is stored in the variable p.
Match n occurrences of the given pattern ({n}):
The idea is to repeat the pattern n times. For large n you can use printf:
$ var="foobar01010"
$ echo ${var%%@(0|1)@(0|1)}
foobar010
or
$ var="foobar01010"
$ p=$(printf "@(0|1)%.0s" {1..4})
$ echo ${var%%$p}
foobar0
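Brace expansion such as {1..4} only accepts literal numbers, because bash performs it before expanding variables, so {1..$n} is left unexpanded. Below is a minimal sketch of one way around this, assuming the external seq utility is available, applied to the trailing-zero samples from the question. The variable names n and pair are illustrative only.

```shell
#!/usr/bin/env bash
shopt -s extglob   # extended globs must be enabled for @(...) to be recognised

n=2                                       # number of repetitions, i.e. {2}
# printf reuses its format once per argument; %.0s prints nothing of the
# argument itself, so we get n copies of '@(0)'.
pair=$(printf '@(0)%.0s' $(seq 1 "$n"))   # -> @(0)@(0)

for x in foobar0000 foobar00 foobar000; do
  # %% removes the longest suffix matching the pattern, which here can only
  # ever be exactly two zeroes.
  printf '%s -> %s\n' "$x" "${x%%$pair}"
done
# Prints:
# foobar0000 -> foobar00
# foobar00 -> foobar
# foobar000 -> foobar0
```

Because the pattern matches exactly two characters, % and %% behave identically here.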
Match at least m occurrences of the given pattern ({m,}):
It is the same as before, but with an additional *(pattern):
$ var="foobar01010"
$ echo ${var%%@(0|1)@(0|1)*(0|1)}
foobar
or
$ var="foobar01010"
$ p="(0|1)"
$ q=$(printf "@$p%.0s" {1..4})
$ echo ${var%%$q*$p}
foobar
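One way to convince yourself that the glob really behaves like the interval {2,} is to compare it against the equivalent ERE using bash's [[ =~ ]] operator, which does understand interval expressions. This is only a sanity check on a handful of hand-picked strings, not a proof. The variable names glob and re are illustrative.

```shell
#!/usr/bin/env bash
shopt -s extglob

glob='@(0|1)@(0|1)*(0|1)'   # glob emulation of (0|1){2,}
re='^(0|1){2,}$'            # the same condition written as an anchored ERE

for s in '' 0 01 010 0120 01010; do
  # Unquoted right-hand sides are treated as a pattern (==) or a regex (=~).
  if [[ $s == $glob ]]; then g=yes; else g=no; fi
  if [[ $s =~ $re ]]; then r=yes; else r=no; fi
  printf '%-8s glob=%-3s regex=%s\n' "'$s'" "$g" "$r"
done
```

Both columns should agree on every line: only 01, 010 and 01010 match.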
Match from n to m occurrences of the given pattern ({n,m}):
The interval expression {n,m} implies that we have n certain appearances and m-n possible appearances. These can be constructed using the ksh-globs @(pat) n times and ?(pat) m-n times. For n=2 and m=3, this leads to:
$ var="foobar01010"
$ echo ${var%%@(0|1)@(0|1)?(0|1)}
foobar01
or
$ p="(0|1)"
$ q=$(printf "@$p%.0s" {1..2})$(printf "?$p%.0s" {3..3})   # {1..n} and {n+1..m} with n=2, m=3
$ echo ${var%%$q}
foobar01
$ var="foobar00200"
$ echo ${var%%$q}
foobar002
$ var="foobar00020"
$ echo ${var%%$q}
foobar00020
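The printf approach gets awkward when n and m are variables, and it breaks when m equals n: printf called with no arguments still prints its format once, which would add a stray ?(pat). Below is a minimal sketch of a loop-based builder that avoids both issues. The function name interval_glob is hypothetical, and it assumes 0 <= n <= m.

```shell
#!/usr/bin/env bash
shopt -s extglob

# interval_glob PATTERN N M   (hypothetical helper, assumes 0 <= N <= M)
# Prints N copies of @(PATTERN) followed by M-N copies of ?(PATTERN),
# which together emulate the interval expression {N,M}.
interval_glob() {
  local pat=$1 n=$2 m=$3 out='' i
  for ((i = 0; i < n; i++)); do out+="@($pat)"; done   # mandatory part
  for ((i = n; i < m; i++)); do out+="?($pat)"; done   # optional part
  printf '%s' "$out"
}

q=$(interval_glob '0|1' 2 3)   # -> @(0|1)@(0|1)?(0|1)

for var in foobar01010 foobar00200 foobar00020; do
  printf '%s -> %s\n' "$var" "${var%%$q}"
done
# Prints:
# foobar01010 -> foobar01
# foobar00200 -> foobar002
# foobar00020 -> foobar00020
```

With n equal to m the second loop simply does not run, so interval_glob 0 2 2 yields @(0)@(0).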
Another way to construct the interval expression {n,m} is to use the ksh-glob "anything but pattern", written as !(pat), which allows us to say: give me all, except...
man bash:
!(pattern-list): Matches anything except one of the given patterns
This way we can write (again for n=2 and m=3):
$ echo ${var%%!(!(*$p)|@$p@$p@$p+$p|?$p)}
or
$ p="(0|1)"
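$ pn=$(printf "?$p%.0s" {1..1})   # n-1 copies of ?(pat): at most n-1 occurrences
$ pm=$(printf "@$p%.0s" {1..3})   # m copies of @(pat); with +(pat) appended: at least m+1 occurrences
$ echo ${var%%!(!(*$p)|$pm+$p|$pn)}
note: you need a double exclusion here because of the or (|) in the pattern list.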
The interval expression {n,m} has been implemented in ksh93:
man ksh:
{n}(pattern-list)
Matches n occurrences of the given patterns.
{m,n}(pattern-list)
Matches from m to n occurrences of the given patterns. If m is omitted, 0 will be used. If n is omitted at least m occurrences will be matched.
$ echo ${var%%{2,3}(0|1)}
Also zsh has a form of interval expression. It is a globbing flag which is part of the EXTENDED_GLOB option:
man zshall:
(#cN,M)
The flag (#cN,M) can be used anywhere that the # or ## operators can be used except in the expressions (*/)# and (*/)## in filename generation, where / has special meaning; it cannot be combined with other globbing flags and a bad pattern error occurs if it is misplaced. It is equivalent to the form {N,M} in regular expressions. The previous character or group is required to match between N and M times, inclusive. The form (#cN) requires exactly N matches; (#c,M) is equivalent to specifying N as 0; (#cN,) specifies that there is no maximum limit on the number of matches.
$ echo ${var%%(0|1)(#c2,3)}