I have an array which contains strings. Several of those strings can be the same and that is fine. They can be in any order to start of, but most likely they are in alphabetical order. I have the following shuffle
function which will shuffle all the elements. However, I want to add a condition that no two of the same string can be adjacent in the array.
For example, this is fine: ook eek ook monkey ook
but this is not: ook ook eek ook monkey
as two ook
are adjacent. It is assumed that the input has been checked so that any duplicates are less than half the total number of elements so a set of non-adjacent solutions exists. For example, ook ook ook eek
would be rejected. The strings could contains spaces and UTF-8 characters but not new lines -- the strings are actually file name of images.
How can I modify the shuffle
function to achieve this goal?
Or is there a better way to do that?
shuffle() {
# This function shuffles the elements of an array in-place using the
# Knuth-Fisher-Yates shuffle algorithm.
local i tmp size max rand
# $RANDOM % (i+1) is biased because of the limited range of $RANDOM
# Compensate by using a range which is a multiple of the array size.
size=${#array[*]}
max=$(( 32768 / size * size ))
for ((i=size-1; i>0; i--)); do
while (( (rand=$RANDOM) >= max )); do :; done
rand=$(( rand % (i+1) ))
tmp=${array[i]} array[i]=${array[rand]} array[rand]=$tmp
done
}
Given the rejection pre-condition, it is possible to split the word list into several «equivalence classes» (ECs) — special word groups in each of which words are the same by whatever criterion. Rejection implies that there is no more than one EC that is partly in one half of the list and partly in the other.
We lay a part of this EC aside, so that (1) the remaining part is contained in no more than one of the remaining halves of the list, and (2) the halves are strictly equal in size. Then we shuffle the halves, each one separately. After that, we merge them, the first half occupying odd elements, while evens are for the second half. Then we randomly insert the remaining elements previously laid-aside. It is quite simple, since all of them belong to one EC and thus it is easy to mark places where they can be and where they cannot.
By construction, there can be no two adjacent elements belonging to one EC.
[EDIT:] Finally, the implementation of what`s above.
shuffle() {
local sort="$(sort <<<"$1" | sed "s/^/+/g")"
local size="$(grep -c ^ <<<"$sort")"
local take cntr head tail
if [ "$sort" == "${sort%%$'\n'*}" ]; then
# single line: nothing to shuffle
echo "${sort#+}"
return
fi
if [ $[size & 1] == 1 ]; then
# enforcing equal halves, beginning to extract the middle
take="$(head -$[size / 2 + 1] <<<"$sort" | tail -1)"
fi
cntr="$take"
size=$[size / 2]
head="$(head -$size <<<"$sort")"
tail="$(tail -$size <<<"$sort")"
while [ "$(head -1 <<<"$tail")" == "$(tail -1 <<<"$head")" ]; do
# continue extracting the middle, if still left
if [ -n "$cntr" -a "$cntr" != "$(tail -1 <<<"$head")" ]; then
break
else
cntr="$(tail -1 <<<"$head")"
fi
((--size))
head="$(head -$size <<<"$head")"
tail="$(tail -$size <<<"$tail")"
take="${take:+$take$'\n'}$cntr"$'\n'"$cntr"
done
sort=()
for cntr in $(seq $size); do
# transforming two line sets into a single interlaced array
sort[$[cntr * 4 - 3]]="$(head -$cntr <<<"$head" | tail -1)"
sort[$[cntr * 4 - 1]]="$(head -$cntr <<<"$tail" | tail -1)"
done
for cntr in $(seq $[size - 1]); do
# shuffling each of the interlaced halves by Fisher-Yates
head="${sort[$[cntr * 4 - 3]]}"
tail=$[RANDOM % (size - cntr + 1) + cntr]
sort[$[cntr * 4 - 3]]="${sort[$[tail * 4 - 3]]}"
sort[$[tail * 4 - 3]]="$head"
head="${sort[$[cntr * 4 - 1]]}"
tail=$[RANDOM % (size - cntr + 1) + cntr]
sort[$[cntr * 4 - 1]]="${sort[$[tail * 4 - 1]]}"
sort[$[tail * 4 - 1]]="$head"
done
if [ -n "$take" ]; then
# got a remainder; inserting
tail=($(seq 0 $[size * 2]))
for cntr in $(seq $[size * 2]); do
# discarding potential places with remainder in proximity
if [ "${sort[$[cntr * 2 - 1]]}" \
== "${take%%$'\n'*}" ]; then
tail[$[cntr - 1]]=""
tail[$[cntr]]=""
fi
done
tail=(${tail[@]})
for cntr in $(seq 0 $[${#tail[@]} - 2]); do
# shuffling the remaining places, also by Fisher-Yates
head="${tail[$cntr]}"
size=$[RANDOM % (${#tail[@]} - cntr) + cntr]
tail[$cntr]="${tail[$size]}"
tail[$size]="$head"
done
size="$(grep -c ^ <<<"$take")"
while ((size--)); do
# finally inserting remainders
sort[$[${tail[$size]} * 2]]="${take%%$'\n'*}"
done
fi
head=0
size="${#sort[@]}"
while ((size)); do
# printing the overall result
if [ -n "${sort[$head]}" ]; then
echo "${sort[$head]#+}"
((size--))
fi
((head++))
done
}
# a very simple test from the original post
shuffle \
"ook
ook
eek
ook
monkey"
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With