I want to split a string such as 'substring1 substring2 ONCE[0,10s] substring3'. With 'ONCE[0,10s]' as the delimiter, the expected result is:
substring1 substring2
substring3
The problem is that the values inside the delimiter vary: it could be 'ONCE[0,1s]', 'ONCE[0,3m]', 'ONCE[0,10d]' and so on.
How can I do this in a bash script?
Thank you
The example provided in the OP (as well as the two answers provided by @GlennJackman and @devnull) assumes that the actual question could have been:
In bash, how do I replace the match for a regular expression in a string with a newline.
That's not actually the same as "split a string using a regular expression", unless you add the constraint that the string does not contain any newline characters. And even then, it's not actually "splitting" the string; the presumption is that some other process will use a newline to split the result.
Once the question has been reformulated, the solution is not challenging. You could use any tool which supports regular expressions, such as sed:
sed 's/ *ONCE\[[^]]*] */\n/g' <<<"$variable"
(Remove the g flag if you only want to replace the first sequence; you may need to adjust the regular expression, since it wasn't quite clear what the desired constraints are.)
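If the goal is to get the pieces into a bash array rather than onto separate output lines, the sed output can be read back with mapfile. This is only a sketch: it assumes GNU sed (for \n in the replacement), bash 4 or later (for mapfile), and that the input contains no newlines of its own. The array name parts is made up for the example.

```shell
#!/usr/bin/env bash
# Assumptions: GNU sed, bash >= 4, no newlines in the input string.
variable='substring1 substring2 ONCE[0,10s] substring3'

# Replace each delimiter (plus surrounding spaces) with a newline, then
# read the resulting lines into the array "parts" (name chosen here).
mapfile -t parts < <(sed 's/ *ONCE\[[^]]*] */\n/g' <<<"$variable")

# Print each piece in angle brackets so stray whitespace would be visible.
printf '<%s>\n' "${parts[@]}"
```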
bash itself does not provide a replace-all primitive using regular expressions, although it does have "patterns" and, if the option extglob is set (which is the default on some distributions), the patterns are sufficiently powerful to express this delimiter, so you could use:
echo "${variable//*( )ONCE\[*([^]])]*( )/$'\n'}"
Again, you can make the substitution only happen once by changing // to / and you may need to change the pattern to meet your precise needs.
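Since extglob is not enabled everywhere, a script that relies on this substitution should probably turn the option on itself. A minimal sketch, where the variable name result is just for illustration:

```shell
#!/usr/bin/env bash
# Enable extended patterns such as *( ) before the line that uses them.
shopt -s extglob

variable='substring1 substring2 ONCE[0,10s] substring3'

# Replace every delimiter, together with any surrounding spaces, by a newline.
result=${variable//*( )ONCE\[*([^]])]*( )/$'\n'}

printf '%s\n' "$result"
```

The shopt line has to run before bash parses the line containing the pattern, so it is safest to keep it near the top of the script rather than inside the same function or compound command.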
That leaves open the question of how to actually split a bash variable using a delimiter specified by a regular expression, for some definition of "split". One possible definition is "call a function with the parts of the string as arguments"; that's the one which we use here:
# Usage:
#   call_with_split <pattern> <string> <cmd> <args>...
# Splits string according to regular expression pattern and then invokes
#   cmd args string-pieces
call_with_split () {
  if [[ $2 =~ ($1).* ]]; then
    call_with_split "$1" \
      "${2:$((${#2} - ${#BASH_REMATCH[0]} + ${#BASH_REMATCH[1]}))}" \
      "${@:3}" \
      "${2:0:$((${#2} - ${#BASH_REMATCH[0]}))}"
  else
    "${@:3}" "$2"
  fi
}
Example:
$ var="substring1 substring2 ONCE[0,10s] substring3"
$ call_with_split " ONCE\[[^]]*] " "$var" printf "%s\n"
substring1 substring2
substring3
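Because the pieces arrive as ordinary arguments, the command passed in can just as well be a shell function that stores them. The sketch below repeats call_with_split so that it runs on its own; collect and parts are hypothetical names introduced for this example, and the input deliberately contains two delimiters.

```shell
#!/usr/bin/env bash
# Same function as above, repeated so this snippet is self-contained.
call_with_split () {
  if [[ $2 =~ ($1).* ]]; then
    call_with_split "$1" \
      "${2:$((${#2} - ${#BASH_REMATCH[0]} + ${#BASH_REMATCH[1]}))}" \
      "${@:3}" \
      "${2:0:$((${#2} - ${#BASH_REMATCH[0]}))}"
  else
    "${@:3}" "$2"
  fi
}

# Hypothetical helper: store all of its arguments in the global array "parts".
collect () { parts=("$@"); }

var="a ONCE[0,1s] b ONCE[0,3m] c"
call_with_split " ONCE\[[^]]*] " "$var" collect

# Print each piece in angle brackets, in the order they appeared.
printf '<%s>\n' "${parts[@]}"
```

Note that a pattern which can match the empty string would make the recursion never terminate, so the delimiter expression should always consume at least one character.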
bash:
s='substring1 substring2 ONCE[0,10s] substring3'
if [[ $s =~ (.+)" ONCE["[0-9]+,[0-9]+[smhd]"] "(.+) ]]; then
  echo "${BASH_REMATCH[1]}"
  echo "${BASH_REMATCH[2]}"
else
  echo no match
fi
substring1 substring2
substring3
You could use awk. Specify the field separator as:
'ONCE[[]0,[^]]*[]] *'
For example, using your sample input:
$ awk -F 'ONCE[[]0,[^]]*[]] *' '{for(i=1;i<=NF;i++){print $i}}' <<< "substring1 substring2 ONCE[0,10s] substring3"
substring1 substring2
substring3