I need an efficient implementation of the cartesian product for a variable number of arrays.
I have tried the product function from Iterators.jl, but the performance was lacking.
I am a python hacker and have used this function from sklearn and have had good performance results with it.
I have tried to write a Julia version of this function, but am not able to produce the same results as the python function.
My code is:
function my_repeat(a, n)
# mimics numpy.repeat
m = size(a, 1)
out = Array(eltype(a), n * m)
out[1:n] = a[1]
for i=2:m
out[(i-1)*n+1:i*n] = a[i]
end
return out
end
function cartesian(arrs; out=None)
dtype = eltype(arrs[1])
n = prod([size(i, 1) for i in arrs])
if is(out, None)
out = Array(dtype, n, length(arrs))
end
m = int(n / size(arrs[1], 1))
out[:, 1] = my_repeat(arrs[1], m)
if length(arrs[2:]) > 0
cartesian(arrs[2:], out=out[1:m, 2:])
for j = 1:size(arrs[1], 1)-1
out[(j*m + 1):(j+1)*m, 2:] = out[1:m, 2:]
end
end
return out
end
I test it with the following:
aa = ([1, 2, 3], [4, 5], [6, 7])
cartesian(aa)
The return value is:
12x3 Array{Float64,2}:
1.0 9.88131e-324 2.13149e-314
1.0 2.76235e-318 2.13149e-314
1.0 9.88131e-324 2.13676e-314
1.0 9.88131e-324 2.13676e-314
2.0 9.88131e-324 2.13149e-314
2.0 2.76235e-318 2.13149e-314
2.0 9.88131e-324 2.13676e-314
2.0 9.88131e-324 2.13676e-314
3.0 9.88131e-324 2.13149e-314
3.0 2.76235e-318 2.13149e-314
3.0 9.88131e-324 2.13676e-314
3.0 9.88131e-324 2.13676e-314
I believe that the problem here is that when I use this line: cartesian(arrs[2:], out=out[1:m, 2:]), the keyword argument out is not updated inplace in the recursive calls.
As can be seen, I have done a very naive translation of the Python version of this function (see link from above). It might well be possible that there are internal language differences that make a naive translation impossible. I don't think this is true because of this quote from the functions section of the julia documentation:
Julia function arguments follow a convention sometimes called “pass-by-sharing”, which means that values are not copied when they are passed to functions. Function arguments themselves act as new variable bindings (new locations that can refer to values), but the values they refer to are identical to the passed values. Modifications to mutable values (such as Arrays) made within a function will be visible to the caller. This is the same behavior found in Scheme, most Lisps, Python, Ruby and Perl, among other dynamic languages.
How can I make this (or an equivalent) function work in Julia?
There's a repeat function in Base.
A shorter and faster variant might use the @forcartesian macro in the Cartesian package:
using Cartesian
function cartprod(arrs, out=Array(eltype(arrs[1]), prod([length(a) for a in arrs]), length(arrs)))
sz = Int[length(a) for a in arrs]
narrs = length(arrs)
@forcartesian I sz begin
k = sub2ind(sz, I)
for i = 1:narrs
out[k,i] = arrs[i][I[i]]
end
end
out
end
The order of rows is different than your solution, but perhaps that doesn't matter?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With