Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficient way of generating latin squares (or randomly permute numbers in matrix uniquely on both axes - using NumPy)

For example, if there are 5 numbers 1, 2, 3, 4, 5

I want a random result like

[[ 2, 3, 1, 4, 5]
 [ 5, 1, 2, 3, 4]
 [ 3, 2, 4, 5, 1]
 [ 1, 4, 5, 2, 3]
 [ 4, 5, 3, 1, 2]]

Ensure every number is unique in its row and column.

Is there an efficient way to do this?

I've tried to use while loops to generate one row for each iteration, but it seems not so efficient.

import numpy as np

numbers = list(range(1,6))
result = np.zeros((5,5), dtype='int32')
row_index = 0
while row_index < 5:
    np.random.shuffle(numbers)
    for column_index, number in enumerate(numbers):
       if number in result[:, column_index]:
           break
       else:
           result[row_index, :] = numbers
           row_index += 1
like image 418
KHC Avatar asked Apr 02 '18 06:04

KHC


People also ask

What is the best way to find random Latin squares?

In the combinatorics community, the Jacobson and Matthews approach is widely considered the best approach to obtain random Latin squares (approximately) uniformly at random from the set of all Latin squares. A practical non-MCMC approach that samples uniformly would be extremely well-received (but seems beyond current techniques).

What is the memory overhead of a sparse matrix with permutation?

If you have a sparse matrix stored in COO format, the following might be helpful assuming that A contains the COO matrix, and perm is a numpy.array containing the permutation. This will only have m memory overhead, where m is the number of non-zero elements of the matrix.

Is there a way to perform permutation in NumPy?

According to the docs, there is no in-place permutation method in numpy, something like ndarray.sort. So your options are (assuming that M is a $Ntimes N$ matrix and p the permutation vector) implementing your own algorithm in C as an extension module (but in-place algorithms are hard, at least for me!)

How do you generate a uniform distribution of Latin squares?

Generating Latin squares row-by-row by appending random permutations and restarting whenever their is a clash gives the uniform distribution. [Or equivalently, uniformly sampling from the set of row-Latin squares, then restarting if there is a clash.]


1 Answers

Just for your information, what you are looking for is a way of generating latin squares. As for the solution, it depends on how much random "random" is for you.

I would devise at least four main techniques, two of which have been already proposed. Hence, I will briefly describe the other two:

  1. loop through all possible permutations of the items and accept the first that satisfy the unicity constraint along rows
  2. use only cyclic permutations to build subsequent rows: these are by construction satisfying the unicity constraint along rows (the cyclic transformation can be done forward or backward); for improved "randomness" the rows can be shuffled

Assuming we work with standard Python data types since I do not see a real merit in using NumPy (but results can be easily converted to np.ndarray if necessary), this would be in code (the first function is just to check that the solution is actually correct):

import random
import math
import itertools

# this only works for Iterable[Iterable]
def is_latin_rectangle(rows):
    valid = True
    for row in rows:
        if len(set(row)) < len(row):
            valid = False
    if valid and rows:
        for i, val in enumerate(rows[0]):
            col = [row[i] for row in rows]
            if len(set(col)) < len(col):
                valid = False
                break
    return valid

def is_latin_square(rows):
    return is_latin_rectangle(rows) and len(rows) == len(rows[0])

# : prepare the input
n = 9
items = list(range(1, n + 1))
# shuffle items
random.shuffle(items)
# number of permutations
print(math.factorial(n))


def latin_square1(items, shuffle=True):
    result = []
    for elems in itertools.permutations(items):
        valid = True
        for i, elem in enumerate(elems):
            orthogonals = [x[i] for x in result] + [elem]
            if len(set(orthogonals)) < len(orthogonals):
                valid = False
                break
        if valid:
            result.append(elems)
    if shuffle:
        random.shuffle(result)
    return result

rows1 = latin_square1(items)
for row in rows1:
    print(row)
print(is_latin_square(rows1))


def latin_square2(items, shuffle=True, forward=False):
    sign = -1 if forward else 1
    result = [items[sign * i:] + items[:sign * i] for i in range(len(items))]
    if shuffle:
        random.shuffle(result)
    return result

rows2 = latin_square2(items)
for row in rows2:
    print(row)
print(is_latin_square(rows2))

rows2b = latin_square2(items, False)
for row in rows2b:
    print(row)
print(is_latin_square(rows2b))

For comparison, an implementation by trying random permutations and accepting valid ones (fundamentally what @hpaulj proposed) is also presented.

def latin_square3(items):
    result = [list(items)]
    while len(result) < len(items):
        new_row = list(items)
        random.shuffle(new_row)
        result.append(new_row)
        if not is_latin_rectangle(result):
            result = result[:-1]
    return result

rows3 = latin_square3(items)
for row in rows3:
    print(row)
print(is_latin_square(rows3))

I did not have time (yet) to implement the other method (with backtrack Sudoku-like solutions from @ConfusedByCode).

With timings for n = 5:

%timeit latin_square1(items)
321 µs ± 24.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit latin_square2(items)
7.5 µs ± 222 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit latin_square2(items, False)
2.21 µs ± 69.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit latin_square3(items)
2.15 ms ± 102 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

... and for n = 9:

%timeit latin_square1(items)
895 ms ± 18.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit latin_square2(items)
12.5 µs ± 200 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit latin_square2(items, False)
3.55 µs ± 55.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit latin_square3(items)
The slowest run took 36.54 times longer than the fastest. This could mean that an intermediate result is being cached.
9.76 s ± 9.23 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

So, solution 1 is giving a fair deal of randomness but it is not terribly fast (and scale with O(n!)), solution 2 (and 2b) are much faster (scaling with O(n)) but not as random as solution 1. Solution 3 is very slow and the performance can vary significantly (can probably be sped up by letting the last iteration be computed instead of guessed).

Getting more technical, other efficient algorithms are discussed in:

  • Jacobson, M. T. and Matthews, P. (1996), Generating uniformly distributed random latin squares. J. Combin. Designs, 4: 405-437. doi:10.1002/(SICI)1520-6610(1996)4:6<405::AID-JCD3>3.0.CO;2-J
like image 166
norok2 Avatar answered Oct 22 '22 18:10

norok2