Is there a Python function similar to the expand.grid() function in R? Thanks in advance.
(EDIT) Below are the description of this R function and an example.
Create a Data Frame from All Combinations of Factors

Description: Create a data frame from all combinations of the supplied vectors or factors.

> x <- 1:3
> y <- 1:3
> expand.grid(x,y)
  Var1 Var2
1    1    1
2    2    1
3    3    1
4    1    2
5    2    2
6    3    2
7    1    3
8    2    3
9    3    3
(EDIT2) Below is an example with the rpy package. I would like to get the same output object, but without using R:
>>> from rpy import *
>>> a = [1,2,3]
>>> b = [5,7,9]
>>> r.assign("a",a)
[1, 2, 3]
>>> r.assign("b",b)
[5, 7, 9]
>>> r("expand.grid(a,b)")
{'Var1': [1, 2, 3, 1, 2, 3, 1, 2, 3], 'Var2': [5, 5, 5, 7, 7, 7, 9, 9, 9]}
EDIT 02/09/2012: I'm really lost with Python. Lev Levitsky's code given in his answer does not work for me:
>>> a = [1,2,3]
>>> b = [5,7,9]
>>> expandgrid(a, b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in expandgrid
NameError: global name 'itertools' is not defined
However, the itertools module seems to be installed (typing from itertools import * does not return any error message).
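The NameError above is consistent with how star imports work: `from itertools import *` binds the module's functions (such as `product`) but not the name `itertools` itself, so a function body that calls `itertools.product` still needs a plain `import itertools`. Below is a minimal sketch, assuming `expandgrid` is built on `itertools.product`. It is an illustrative reconstruction, not necessarily the code from the answer mentioned above.

```python
import itertools


def expandgrid(*iterables):
    """Return a dict of columns Var1, Var2, ... covering every combination.

    The first argument varies fastest, as in R's expand.grid().
    """
    # product() varies its last argument fastest, so feed the inputs in
    # reverse order and flip each resulting tuple back.
    rows = [combo[::-1] for combo in itertools.product(*reversed(iterables))]
    return {
        "Var{}".format(i + 1): [row[i] for row in rows]
        for i in range(len(iterables))
    }


a = [1, 2, 3]
b = [5, 7, 9]
grid = expandgrid(a, b)
print(grid)
```

With these inputs the columns come out in the same order as the rpy result shown earlier: Var1 cycles through 1, 2, 3 while Var2 steps through 5, 7, 9.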
In R, the expand.grid() function creates a data frame containing every combination of the vectors or factors passed to it as arguments.
Just use list comprehensions:
>>> [(x, y) for x in range(5) for y in range(5)]
[(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (1, 0), (1, 1), (1, 2), (1, 3), (1, 4), (2, 0), (2, 1), (2, 2), (2, 3), (2, 4), (3, 0), (3, 1), (3, 2), (3, 3), (3, 4), (4, 0), (4, 1), (4, 2), (4, 3), (4, 4)]
Convert to a NumPy array if desired:
>>> import numpy as np
>>> x = np.array([(x, y) for x in range(5) for y in range(5)])
>>> x.shape
(25, 2)
I have tested this for up to 10000 x 10000, and the performance of Python is comparable to that of expand.grid in R. Using a tuple (x, y) is about 40% faster than using a list [x, y] in the comprehension.
Alternatively, np.meshgrid is around 3x faster and much less memory intensive:
%timeit np.array(np.meshgrid(range(10000), range(10000))).reshape(2, 100000000).T
1 loops, best of 3: 736 ms per loop
In R:
> system.time(expand.grid(1:10000, 1:10000))
   user  system elapsed
  1.991   0.416   2.424
Keep in mind that R has 1-based arrays whereas Python is 0-based.
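To see what the meshgrid approach produces, here is a small-scale sketch using the same example vectors as the question (the variable names `a`, `b` and `grid` are only illustrative). With the default `indexing='xy'`, the first input varies fastest, which happens to match the row order of expand.grid.

```python
import numpy as np

a = [1, 2, 3]
b = [5, 7, 9]

# meshgrid returns one 2-D array per input; stacking them and reshaping to
# (2, -1) puts each input's values on one row, and .T turns rows into columns.
grid = np.array(np.meshgrid(a, b)).reshape(2, -1).T

print(grid.shape)          # (9, 2)
print(grid[:4].tolist())   # [[1, 5], [2, 5], [3, 5], [1, 7]]
```

Using `-1` in the reshape lets NumPy infer the number of rows, so the total count does not have to be typed out by hand as in the timing example.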