
How to remove duplicate elements in an existing array in bash? [duplicate]

Tags: bash, unix

How do I create a new array, newARRAY, containing only the unique elements present in ARRAY?

Ex: ARRAY contains elements aa ab bb aa ab cc at ARRAY[0-5] respectively.

When I print newARRAY, I want only aa ab bb cc at newARRAY[0-3] respectively.

I've searched Stack Overflow for a while now and nothing has solved my problem. I tried newARRAY=$(ARRAY[@] | sort -u | uniq), but duplicate elements still exist.

John Dingus asked Feb 21 '19 00:02


2 Answers

Naive approach

To get the unique elements of arr, assuming that no element contains a newline:

$ printf "%s\n" "${arr[@]}" | sort -u
aa
ab
bb
cc
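
If the no-newline assumption holds, the output can be read straight into an array. This is a minimal sketch, assuming bash 4.0 or later for the mapfile builtin; the sample data is taken from the question, and the name newArr is only an illustration:

```bash
#!/usr/bin/env bash
# Sample data from the question.
arr=(aa ab bb aa ab cc)

# mapfile reads one array element per input line; -t strips the trailing newline.
# This is only safe when no element contains a newline.
mapfile -t newArr < <(printf "%s\n" "${arr[@]}" | sort -u)

# Show the resulting array.
declare -p newArr
```

Note that sort -u returns the elements in sorted order, not in their original order.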

Better approach

To get a NUL-separated list that works even if elements contain newlines (the -z option is supported by GNU sort but is not specified by POSIX):

$ printf "%s\0" "${arr[@]}" | sort -uz
aaabbbcc

(This, of course, looks ugly on a terminal because terminals don't display NULs.)
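
To inspect the NUL-separated output while experimenting, one option is to translate the NULs to newlines for display only. A small sketch, assuming a sort that supports -z (for example GNU sort); the variable name shown is hypothetical:

```bash
#!/usr/bin/env bash
# Sample data from the question.
arr=(aa ab bb aa ab cc)

# tr turns each NUL into a newline so the list is readable on a terminal.
# This is for viewing only: it reintroduces the newline ambiguity,
# so don't parse this form if elements may contain newlines.
shown=$(printf "%s\0" "${arr[@]}" | sort -uz | tr '\0' '\n')

printf '%s\n' "$shown"
```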

Putting it all together

To capture the result in newArr:

$ newArr=(); while IFS= read -r -d '' x; do newArr+=("$x"); done < <(printf "%s\0" "${arr[@]}" | sort -uz)

After running the above, we can use declare to verify that newArr is the array that we want:

$ declare -p newArr
declare -a newArr=([0]="aa" [1]="ab" [2]="bb" [3]="cc")

For those who prefer their code spread over multiple lines, the above can be rewritten as:

newArr=()
while IFS= read -r -d '' x
do
    newArr+=("$x")
done < <(printf "%s\0" "${arr[@]}" | sort -uz)
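
On newer shells the read loop can probably be shortened. This is a sketch rather than a drop-in replacement: it assumes bash 4.4 or later, where mapfile accepts -d to set the delimiter, and a sort that supports -z. The extra multi-line element is added here only to exercise the newline case:

```bash
#!/usr/bin/env bash
# Sample data from the question, plus a duplicated element containing a newline.
arr=(aa ab bb aa ab cc $'multi\nline' $'multi\nline')

# -d '' makes NUL the record delimiter; -t strips that delimiter from each element.
mapfile -d '' -t newArr < <(printf "%s\0" "${arr[@]}" | sort -uz)

# Show the resulting array; the multi-line element should appear once, intact.
declare -p newArr
```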

Additional comment

Don't use all caps for your variable names. The system and the shell use all-caps names (PATH and HOME, for example), and you don't want to accidentally overwrite one of them.

John1024 answered Oct 19 '22 05:10


You can use an associative array to keep track of elements you've seen:

#!/bin/bash

ARRAY=(aa ab bb aa ab cc)

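unset dupes # ensure it's empty
declare -A dupes # associative arrays require bash 4.0 or later
NEWARRAY=() # start from an empty result array

for i in "${ARRAY[@]}"; do
    # append only the first time an element is seen
    if [[ -z ${dupes[$i]} ]]; then
        NEWARRAY+=("$i")
    fi
    dupes["$i"]=1 # mark the element as seen
done
unset dupes # optional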

printf "[%s]" "${ARRAY[@]}"
echo
printf "[%s]" "${NEWARRAY[@]}"
echo
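
One practical difference between the two approaches is ordering: the associative-array loop keeps elements in first-occurrence order, while sort -u returns them sorted. The sketch below illustrates this with a made-up input; the names input, seen, ordered, and sorted are hypothetical, and bash 4.0 or later is assumed:

```bash
#!/usr/bin/env bash
# Made-up input whose first-occurrence order differs from sorted order.
input=(cc aa cc bb aa)

declare -A seen=()
ordered=()
for item in "${input[@]}"; do
    # ${seen[$item]+set} expands to "set" only if the key already exists.
    if [[ -z ${seen[$item]+set} ]]; then
        ordered+=("$item")
        seen[$item]=1
    fi
done

# Same input through sort -u for comparison (assumes no newlines in elements).
mapfile -t sorted < <(printf "%s\n" "${input[@]}" | sort -u)

printf "[%s]" "${ordered[@]}"; echo
printf "[%s]" "${sorted[@]}"; echo
```

Like the loop above, this does not handle empty-string elements, because bash rejects an empty associative-array subscript.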
jhnc answered Oct 19 '22 07:10