Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Shuffle an array so that no two of the same elements are adjacent

Tags:

bash

I have an array which contains strings. Several of those strings can be the same and that is fine. They can be in any order to start of, but most likely they are in alphabetical order. I have the following shuffle function which will shuffle all the elements. However, I want to add a condition that no two of the same string can be adjacent in the array.

For example, this is fine: ook eek ook monkey ook but this is not: ook ook eek ook monkey as two ook are adjacent. It is assumed that the input has been checked so that any duplicates are less than half the total number of elements so a set of non-adjacent solutions exists. For example, ook ook ook eek would be rejected. The strings could contains spaces and UTF-8 characters but not new lines -- the strings are actually file name of images.

How can I modify the shuffle function to achieve this goal?

Or is there a better way to do that?

shuffle() {
    # This function shuffles the elements of an array in-place using the
    # Knuth-Fisher-Yates shuffle algorithm. 
    local i tmp size max rand

    # $RANDOM % (i+1) is biased because of the limited range of $RANDOM
    # Compensate by using a range which is a multiple of the array size.
    size=${#array[*]}
    max=$(( 32768 / size * size ))
    for ((i=size-1; i>0; i--)); do
        while (( (rand=$RANDOM) >= max )); do :; done
            rand=$(( rand % (i+1) ))
            tmp=${array[i]} array[i]=${array[rand]} array[rand]=$tmp
    done
}
like image 395
Sardathrion - against SE abuse Avatar asked Oct 20 '16 10:10

Sardathrion - against SE abuse


1 Answers

Given the rejection pre-condition, it is possible to split the word list into several «equivalence classes» (ECs) — special word groups in each of which words are the same by whatever criterion. Rejection implies that there is no more than one EC that is partly in one half of the list and partly in the other.

We lay a part of this EC aside, so that (1) the remaining part is contained in no more than one of the remaining halves of the list, and (2) the halves are strictly equal in size. Then we shuffle the halves, each one separately. After that, we merge them, the first half occupying odd elements, while evens are for the second half. Then we randomly insert the remaining elements previously laid-aside. It is quite simple, since all of them belong to one EC and thus it is easy to mark places where they can be and where they cannot.

By construction, there can be no two adjacent elements belonging to one EC.

[EDIT:] Finally, the implementation of what`s above.

shuffle() {
    local sort="$(sort <<<"$1" | sed "s/^/+/g")"
    local size="$(grep -c ^ <<<"$sort")"
    local take cntr head tail

    if [ "$sort" == "${sort%%$'\n'*}" ]; then
        # single line: nothing to shuffle
        echo "${sort#+}"
        return
    fi
    if [ $[size & 1] == 1 ]; then
        # enforcing equal halves, beginning to extract the middle
        take="$(head -$[size / 2 + 1] <<<"$sort" | tail -1)"
    fi
    cntr="$take"
    size=$[size / 2]
    head="$(head -$size <<<"$sort")"
    tail="$(tail -$size <<<"$sort")"
    while [ "$(head -1 <<<"$tail")" == "$(tail -1 <<<"$head")" ]; do
        # continue extracting the middle, if still left
        if [ -n "$cntr" -a "$cntr" != "$(tail -1 <<<"$head")" ]; then
            break
        else
            cntr="$(tail -1 <<<"$head")"
        fi
        ((--size))
        head="$(head -$size <<<"$head")"
        tail="$(tail -$size <<<"$tail")"
        take="${take:+$take$'\n'}$cntr"$'\n'"$cntr"
    done
    sort=()
    for cntr in $(seq $size); do
        # transforming two line sets into a single interlaced array
        sort[$[cntr * 4 - 3]]="$(head -$cntr <<<"$head" | tail -1)"
        sort[$[cntr * 4 - 1]]="$(head -$cntr <<<"$tail" | tail -1)"
    done
    for cntr in $(seq $[size - 1]); do
        # shuffling each of the interlaced halves by Fisher-Yates
        head="${sort[$[cntr * 4 - 3]]}"
        tail=$[RANDOM % (size - cntr + 1) + cntr]
        sort[$[cntr * 4 - 3]]="${sort[$[tail * 4 - 3]]}"
        sort[$[tail * 4 - 3]]="$head"
        head="${sort[$[cntr * 4 - 1]]}"
        tail=$[RANDOM % (size - cntr + 1) + cntr]
        sort[$[cntr * 4 - 1]]="${sort[$[tail * 4 - 1]]}"
        sort[$[tail * 4 - 1]]="$head"
    done
    if [ -n "$take" ]; then
        # got a remainder; inserting
        tail=($(seq 0 $[size * 2]))
        for cntr in $(seq $[size * 2]); do
            # discarding potential places with remainder in proximity
            if [ "${sort[$[cntr * 2 - 1]]}" \
              == "${take%%$'\n'*}" ]; then
                tail[$[cntr - 1]]=""
                tail[$[cntr]]=""
            fi
        done
        tail=(${tail[@]})
        for cntr in $(seq 0 $[${#tail[@]} - 2]); do
            # shuffling the remaining places, also by Fisher-Yates
            head="${tail[$cntr]}"
            size=$[RANDOM % (${#tail[@]} - cntr) + cntr]
            tail[$cntr]="${tail[$size]}"
            tail[$size]="$head"
        done
        size="$(grep -c ^ <<<"$take")"
        while ((size--)); do
            # finally inserting remainders
            sort[$[${tail[$size]} * 2]]="${take%%$'\n'*}"
        done
    fi
    head=0
    size="${#sort[@]}"
    while ((size)); do
        # printing the overall result
        if [ -n "${sort[$head]}" ]; then
            echo "${sort[$head]#+}"
            ((size--))
        fi
        ((head++))
    done
}

# a very simple test from the original post
shuffle \
"ook
ook
eek
ook
monkey"
like image 58
hidefromkgb Avatar answered Nov 15 '22 06:11

hidefromkgb