For example, if there are 5 numbers 1, 2, 3, 4, 5
I want a random result like
[[ 2, 3, 1, 4, 5]
[ 5, 1, 2, 3, 4]
[ 3, 2, 4, 5, 1]
[ 1, 4, 5, 2, 3]
[ 4, 5, 3, 1, 2]]
Ensure every number is unique in its row and column.
Is there an efficient way to do this?
I've tried to use while loops to generate one row for each iteration, but it seems not so efficient.
import numpy as np
numbers = list(range(1,6))
result = np.zeros((5,5), dtype='int32')
row_index = 0
while row_index < 5:
np.random.shuffle(numbers)
for column_index, number in enumerate(numbers):
if number in result[:, column_index]:
break
else:
result[row_index, :] = numbers
row_index += 1
In the combinatorics community, the Jacobson and Matthews approach is widely considered the best approach to obtain random Latin squares (approximately) uniformly at random from the set of all Latin squares. A practical non-MCMC approach that samples uniformly would be extremely well-received (but seems beyond current techniques).
If you have a sparse matrix stored in COO format, the following might be helpful assuming that A contains the COO matrix, and perm is a numpy.array containing the permutation. This will only have m memory overhead, where m is the number of non-zero elements of the matrix.
According to the docs, there is no in-place permutation method in numpy, something like ndarray.sort. So your options are (assuming that M is a $Ntimes N$ matrix and p the permutation vector) implementing your own algorithm in C as an extension module (but in-place algorithms are hard, at least for me!)
Generating Latin squares row-by-row by appending random permutations and restarting whenever their is a clash gives the uniform distribution. [Or equivalently, uniformly sampling from the set of row-Latin squares, then restarting if there is a clash.]
Just for your information, what you are looking for is a way of generating latin squares. As for the solution, it depends on how much random "random" is for you.
I would devise at least four main techniques, two of which have been already proposed. Hence, I will briefly describe the other two:
Assuming we work with standard Python data types since I do not see a real merit in using NumPy (but results can be easily converted to np.ndarray
if necessary), this would be in code (the first function is just to check that the solution is actually correct):
import random
import math
import itertools
# this only works for Iterable[Iterable]
def is_latin_rectangle(rows):
valid = True
for row in rows:
if len(set(row)) < len(row):
valid = False
if valid and rows:
for i, val in enumerate(rows[0]):
col = [row[i] for row in rows]
if len(set(col)) < len(col):
valid = False
break
return valid
def is_latin_square(rows):
return is_latin_rectangle(rows) and len(rows) == len(rows[0])
# : prepare the input
n = 9
items = list(range(1, n + 1))
# shuffle items
random.shuffle(items)
# number of permutations
print(math.factorial(n))
def latin_square1(items, shuffle=True):
result = []
for elems in itertools.permutations(items):
valid = True
for i, elem in enumerate(elems):
orthogonals = [x[i] for x in result] + [elem]
if len(set(orthogonals)) < len(orthogonals):
valid = False
break
if valid:
result.append(elems)
if shuffle:
random.shuffle(result)
return result
rows1 = latin_square1(items)
for row in rows1:
print(row)
print(is_latin_square(rows1))
def latin_square2(items, shuffle=True, forward=False):
sign = -1 if forward else 1
result = [items[sign * i:] + items[:sign * i] for i in range(len(items))]
if shuffle:
random.shuffle(result)
return result
rows2 = latin_square2(items)
for row in rows2:
print(row)
print(is_latin_square(rows2))
rows2b = latin_square2(items, False)
for row in rows2b:
print(row)
print(is_latin_square(rows2b))
For comparison, an implementation by trying random permutations and accepting valid ones (fundamentally what @hpaulj proposed) is also presented.
def latin_square3(items):
result = [list(items)]
while len(result) < len(items):
new_row = list(items)
random.shuffle(new_row)
result.append(new_row)
if not is_latin_rectangle(result):
result = result[:-1]
return result
rows3 = latin_square3(items)
for row in rows3:
print(row)
print(is_latin_square(rows3))
I did not have time (yet) to implement the other method (with backtrack Sudoku-like solutions from @ConfusedByCode).
With timings for n = 5
:
%timeit latin_square1(items)
321 µs ± 24.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit latin_square2(items)
7.5 µs ± 222 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit latin_square2(items, False)
2.21 µs ± 69.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit latin_square3(items)
2.15 ms ± 102 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
... and for n = 9
:
%timeit latin_square1(items)
895 ms ± 18.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit latin_square2(items)
12.5 µs ± 200 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit latin_square2(items, False)
3.55 µs ± 55.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit latin_square3(items)
The slowest run took 36.54 times longer than the fastest. This could mean that an intermediate result is being cached.
9.76 s ± 9.23 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
So, solution 1 is giving a fair deal of randomness but it is not terribly fast (and scale with O(n!)
), solution 2 (and 2b) are much faster (scaling with O(n)
) but not as random as solution 1. Solution 3 is very slow and the performance can vary significantly (can probably be sped up by letting the last iteration be computed instead of guessed).
Getting more technical, other efficient algorithms are discussed in:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With