Say I have a Bash array (e.g. the array of all parameters) and want to delete all elements matching a certain pattern, or alternatively copy all remaining elements to a new array. Or, the other way round, keep only the elements matching a pattern.
An example for illustration:
x=(preffoo bar foo prefbaz baz prefbar)
and I want to delete everything starting with pref
in order to get
y=(bar foo baz)
(the order is not relevant)
What if I want the same thing for a list of words separated by whitespace?
x="preffoo bar foo prefbaz baz prefbar"
and again delete everything starting with pref
in order to get
y="bar foo baz"
To really remove an exact item, you need to walk through the array, compare the target to each element, and use unset to delete an exact match. Note that if you do this and one or more elements are removed, the indices will no longer be a continuous sequence of integers.
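A minimal sketch of that walk, assuming the sample array from the question and a hypothetical variable named target holding the exact value to remove:

```bash
# Sample data from the question; 'target' is an illustrative name.
x=(preffoo bar foo prefbaz baz prefbar)
target="foo"

for i in "${!x[@]}"; do
    # Quoting the right-hand side forces a literal comparison (no globbing).
    if [[ ${x[$i]} == "$target" ]]; then
        unset -v 'x[$i]'
    fi
done

declare -p x          # index 2 is now missing
x=("${x[@]}")         # optional: re-pack to get consecutive indexes again
```

The last line is only needed if later code relies on the indexes being 0, 1, 2, and so on.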
Filtering an array is tricky if you consider the possibility of elements containing spaces (not to mention even "weirder" characters). In particular, the answers given so far (referring to various forms of ${x[@]//pref*/}) will fail with such arrays.
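A short sketch of how that substitution goes wrong, assuming a hypothetical array x in which one element contains a space:

```bash
x=(preffoo bar "two words" prefbaz)

# Unquoted: the emptied elements vanish, but "two words" is split in two.
y=(${x[@]//pref*/})
echo "${#y[@]}"       # 3 elements: bar, two, words

# Quoted: nothing is split, but the matches stay behind as empty strings.
z=("${x[@]//pref*/}")
echo "${#z[@]}"       # 4 elements, two of them empty
```

Neither form yields the two elements one would want here (bar and "two words").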
I have investigated this issue somewhat and found a solution; however, it is not a nice one-liner. But at least it is a solution.
For the examples, let's assume ARR names the array we want to filter. We shall start with the core expression:
for index in "${!ARR[@]}" ; do [[ …condition… ]] && unset -v 'ARR[$index]' ; done
ARR=("${ARR[@]}")
A few points are worth mentioning:
- "${!ARR[@]}" evaluates to the indexes of the array (as opposed to its elements).
- The exact form "${!ARR[@]}" is a must. You must not skip the quotes or change @ to *, or else the expression will break on associative arrays whose keys contain spaces (for example).
- The part after do can be whatever you want. The idea is only that you must do unset as shown for the elements that you don't want to have in the array.
- Use -v and quotes with unset, or else bad things may happen (for example, an unquoted ARR[$index] is subject to pathname expansion).
- If the part after do is as suggested above, you can use either && or || to filter out the elements that either pass or fail the condition.
- The final reassignment of ARR is needed only with non-associative arrays and will break associative arrays. (I did not quickly come up with a generic expression that handles both, and I don't need one…) For ordinary arrays it is needed if you want consecutive indexes, because unset on an array element does not shift down the elements at higher indexes; it just leaves a hole in the indexes. If you only iterate over the array (or expand it as a whole) this is no problem, but in other cases you need to reassign the indexes. Note also that any hole that was already in the indexes will be removed as well, so if you need to preserve existing holes, more logic is needed besides the unset and the final reassignment.

Now for the condition. The [[ ]] expression is an easy way if you can use it. (See here.) In particular, it supports regular expression matching using Extended Regular Expressions. (See here.) Also be careful with grep or any other line-based tool if you expect that array elements can contain not only spaces but also newlines. (A very nasty file name could contain a newline character, I think…)
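To illustrate the warning about line-based tools, here is a small sketch assuming a hypothetical three-element array in which one element contains newlines:

```bash
ARR=(bar $'pref\nwith\nnew line' baz)

echo "${#ARR[@]}"                              # 3 elements
# Printing one element per line and filtering with grep works on lines,
# so the second element is seen as three separate records.
printf '%s\n' "${ARR[@]}" | grep -vc '^pref'   # counts 4 non-matching lines
```

The filter should leave two elements, but grep reports four surviving lines because it cannot tell where one element ends and the next begins.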
Referring to the question itself, the [[ ]] expression would have to be:
[[ ${ARR[$index]} =~ ^pref ]]
(with && unset as above)
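The question also asks for the opposite direction, keeping only the matching elements. A sketch of that variant, assuming the sample data from the question, simply swaps && for ||:

```bash
ARR=(preffoo bar foo prefbaz baz prefbar)

for index in "${!ARR[@]}"; do
    # '||' unsets the elements that FAIL the test, so only matches survive.
    [[ ${ARR[$index]} =~ ^pref ]] || unset -v 'ARR[$index]'
done
ARR=("${ARR[@]}")     # re-pack the indexes (indexed arrays only)

declare -p ARR        # preffoo, prefbaz and prefbar remain
```

For a simple prefix test the condition can also be written as a glob match, [[ ${ARR[$index]} == pref* ]], which avoids the regular expression entirely.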
Let's finally see how this works with those difficult cases. First we construct the array:
declare -a ARR='([0]="preffoo" [1]="bar" [2]="foo" [3]="prefbaz" [4]="baz" [5]="prefbar" [6]="pref with spaces")'
ARR+=($'pref\nwith\nnew line')
ARR+=($'\npref with new line before')
We can see that we have all the complex cases by running declare -p ARR and getting:
declare -a ARR='([0]="preffoo" [1]="bar" [2]="foo" [3]="prefbaz" [4]="baz" [5]="prefbar" [6]="pref with spaces" [7]="pref
with
new line" [8]="
pref with new line before")'
Now we run the filter expression:
for index in "${!ARR[@]}" ; do [[ ${ARR[$index]} =~ ^pref ]] && unset -v 'ARR[$index]' ; done
and another test (declare -p ARR) gives the expected result:
declare -a ARR='([1]="bar" [2]="foo" [4]="baz" [8]="
pref with new line before")'
Note how all elements starting with pref were removed but the indexes did not change. Note also that ${ARR[8]} is still there, since it starts with a newline rather than pref.
Now for the final reassignment:
ARR=("${ARR[@]}")
and check (declare -p ARR):
declare -a ARR='([0]="bar" [1]="foo" [2]="baz" [3]="
pref with new line before")'
which is exactly what was expected.
A few closing notes. It would be nice if this could be turned into a flexible one-liner, but I don't think there is a way to make it shorter and simpler than it is now without defining functions or the like.
As for a function, it would be nice to have one that accepts an array, returns an array, and has an easy-to-configure test to exclude or keep elements. But I'm not good enough with Bash to write it now.
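One possible shape for such a function is sketched below. It is only a sketch: filter_array and its keep/drop argument are hypothetical names, it assumes Bash 4.3 or later for namerefs (local -n), and it is meant for indexed arrays only, since the final re-pack would break associative ones.

```bash
# filter_array is a hypothetical helper, not part of Bash.
# Usage: filter_array NAME MODE REGEX   (MODE is "keep" or "drop")
# Assumes Bash 4.3+ (namerefs) and an indexed array.
filter_array() {
    local -n _fa_arr=$1
    local _fa_mode=$2 _fa_regex=$3 _fa_i
    for _fa_i in "${!_fa_arr[@]}"; do
        if [[ ${_fa_arr[$_fa_i]} =~ $_fa_regex ]]; then
            [[ $_fa_mode == drop ]] && unset -v '_fa_arr[$_fa_i]'
        else
            [[ $_fa_mode == keep ]] && unset -v '_fa_arr[$_fa_i]'
        fi
    done
    _fa_arr=("${_fa_arr[@]}")   # close the holes left by unset
    return 0
}

x=(preffoo bar foo prefbaz baz prefbar "pref with spaces")
filter_array x drop '^pref'
declare -p x          # bar, foo, baz

# The whitespace-separated string from the question can go through an array.
s="preffoo bar foo prefbaz baz prefbar"
read -r -a words <<< "$s"
filter_array words drop '^pref'
y="${words[*]}"
echo "$y"             # bar foo baz
```

The array is modified in place through the nameref rather than returned, because Bash functions cannot return arrays directly. The prefixed local names are there to reduce the chance of clashing with the caller's array name.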