
R expand.grid() function in Python

Tags: python, r

Is there a Python function similar to the expand.grid() function in R? Thanks in advance.

(EDIT) Below are the description of this R function and an example.

    Create a Data Frame from All Combinations of Factors

    Description:

         Create a data frame from all combinations of the supplied vectors
         or factors.

    > x <- 1:3
    > y <- 1:3
    > expand.grid(x,y)
      Var1 Var2
    1    1    1
    2    2    1
    3    3    1
    4    1    2
    5    2    2
    6    3    2
    7    1    3
    8    2    3
    9    3    3

(EDIT2) Below is an example with the rpy package. I would like to get the same output object, but without using R:

    >>> from rpy import *
    >>> a = [1,2,3]
    >>> b = [5,7,9]
    >>> r.assign("a",a)
    [1, 2, 3]
    >>> r.assign("b",b)
    [5, 7, 9]
    >>> r("expand.grid(a,b)")
    {'Var1': [1, 2, 3, 1, 2, 3, 1, 2, 3], 'Var2': [5, 5, 5, 7, 7, 7, 9, 9, 9]}

EDIT 02/09/2012: I'm really lost with Python. Lev Levitsky's code given in his answer does not work for me:

    >>> a = [1,2,3]
    >>> b = [5,7,9]
    >>> expandgrid(a, b)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 2, in expandgrid
    NameError: global name 'itertools' is not defined

However, the itertools module seems to be installed (typing from itertools import * does not return any error message).

Stéphane Laurent asked Aug 26 '12 14:08





1 Answer

Just use list comprehensions:

    >>> [(x, y) for x in range(5) for y in range(5)]
    [(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (1, 0), (1, 1), (1, 2), (1, 3), (1, 4), (2, 0), (2, 1), (2, 2), (2, 3), (2, 4), (3, 0), (3, 1), (3, 2), (3, 3), (3, 4), (4, 0), (4, 1), (4, 2), (4, 3), (4, 4)]

Convert to a numpy array if desired:

    >>> import numpy as np
    >>> x = np.array([(x, y) for x in range(5) for y in range(5)])
    >>> x.shape
    (25, 2)

I have tested this for inputs up to 10000 x 10000, and the performance of Python is comparable to that of expand.grid in R. Using a tuple (x, y) in the comprehension is about 40% faster than using a list [x, y].
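To get the dict-of-columns shape shown in the question's rpy output, something like the sketch below might work. The helper name expand_grid and the Var1, Var2, ... keys are my own choices for illustration, not an existing library function and not the code from the other answer. It uses itertools.product and flips the argument order so that the first input varies fastest, as in R. Note that it uses import itertools: from itertools import * imports the functions but does not define the name itertools, which is what produces a NameError like the one in the question when code calls itertools.product.

```python
import itertools  # binds the name `itertools`; `from itertools import *` does not


def expand_grid(*columns):
    """Hypothetical helper: all combinations of the inputs, as a dict of columns.

    The first input varies fastest, mirroring the row order of R's expand.grid().
    """
    # product() varies its LAST argument fastest, so pass the inputs reversed
    # and flip each combination back into the original order.
    rows = [combo[::-1] for combo in itertools.product(*reversed(columns))]
    names = ["Var%d" % (i + 1) for i in range(len(columns))]
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


grid = expand_grid([1, 2, 3], [5, 7, 9])
print(grid["Var1"])  # [1, 2, 3, 1, 2, 3, 1, 2, 3]
print(grid["Var2"])  # [5, 5, 5, 7, 7, 7, 9, 9, 9]
```

The plain list comprehension above orders its rows the other way round (the second variable varies fastest), which only matters if you need the rows in the same order as R.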

OR...

Using np.meshgrid is around 3x faster and much less memory-intensive:

    %timeit np.array(np.meshgrid(range(10000), range(10000))).reshape(2, 100000000).T
    1 loops, best of 3: 736 ms per loop
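To see what the meshgrid one-liner does, here is the same pattern on the small inputs from the question (a and b are just those example lists). This is only a sketch of the reshape trick for two inputs; reshape(2, -1) lets numpy work out the second dimension instead of hard-coding 100000000.

```python
import numpy as np

a = [1, 2, 3]
b = [5, 7, 9]

# With the default indexing="xy", meshgrid returns one 2-D array per input,
# each of shape (len(b), len(a)).
grids = np.meshgrid(a, b)

# Stack into shape (2, 3, 3), flatten each grid into one row, then transpose
# so that every row of the result is one (a, b) combination.
pairs = np.array(grids).reshape(2, -1).T

print(pairs.shape)  # (9, 2)
print(pairs[:4])    # rows (1, 5), (2, 5), (3, 5), (1, 7)
```

With this layout the first column varies fastest, so the rows come out in the same order as expand.grid(a, b) in R.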

In R:

    > system.time(expand.grid(1:10000, 1:10000))
       user  system elapsed
      1.991   0.416   2.424

Keep in mind that R has 1-based arrays whereas Python is 0-based.
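In practice this means range(n) starts at 0 while R's 1:n starts at 1, so if you want the same values as the R call you would shift the range, for example (n is just an illustrative variable):

```python
n = 3

py_default = list(range(n))       # 0-based: what range(n) gives you
r_style = list(range(1, n + 1))   # same values as R's 1:n

print(py_default)  # [0, 1, 2]
print(r_style)     # [1, 2, 3]
```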

Thomas Browne answered Sep 21 '22 15:09
