Fastest way to create pandas DataFrame rows for combinations of values from lists

Question

Let's say I have three lists:

listA = ['a','b','c', 'd']
listP = ['p', 'q', 'r']
listX = ['x', 'z']

I want one row per combination of values, so the DataFrame will have 4*3*2 = 24 rows. The simplest way to solve this problem is:

import pandas as pd

df = pd.DataFrame(columns=['A', 'P', 'X'])

# One .loc assignment per combination; len(df) is the next integer index
for val1 in listA:
    for val2 in listP:
        for val3 in listX:
            df.loc[len(df)] = [val1, val2, val3]

In the real scenario I will have about 800k rows and 12 columns (so 12 nested loops). Is there any way I can create this DataFrame much faster?

Tarifazo · Accepted Answer

A similar question has been discussed before. Apparently np.meshgrid is more efficient for large data (as an alternative to itertools.product).
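For comparison, here is a minimal sketch of the itertools.product baseline mentioned above. It assumes the three lists from the question; the name df_product is only illustrative. The point is that all rows are generated first and the DataFrame is constructed once, rather than being grown row by row.

```python
import itertools
import pandas as pd

listA = ['a', 'b', 'c', 'd']
listP = ['p', 'q', 'r']
listX = ['x', 'z']

# itertools.product yields tuples in the same order as the nested loops
# (the last list varies fastest).
rows = list(itertools.product(listA, listP, listX))

# Build the DataFrame in a single call from the list of tuples.
df_product = pd.DataFrame(rows, columns=['A', 'P', 'X'])
```

For 12 columns the same call works with 12 lists, e.g. itertools.product(*lists), so no nesting is needed.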

Application with np.meshgrid:

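import numpy as np
import pandas as pd

# np.stack expects a sequence of arrays, so pass a list rather than a generator
v = np.stack([i.ravel() for i in np.meshgrid(listA, listP, listX)]).T
df = pd.DataFrame(v, columns=['A', 'P', 'X'])
print(df.head())
   A  P  X
0  a  p  x
1  a  p  z
2  b  p  x
3  b  p  z
4  c  p  x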
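In the output above, P is the slowest-varying column, which differs from the row order of the nested loops in the question. The following is a rough sketch of how the meshgrid approach might be generalized to any number of columns while keeping the nested-loop order, by passing indexing='ij'. The helper name cartesian_frame and its dict argument are hypothetical, not part of pandas or NumPy.

```python
import numpy as np
import pandas as pd

def cartesian_frame(columns):
    """Hypothetical helper: columns maps column name -> list of values."""
    # indexing='ij' makes the first list vary slowest and the last fastest,
    # matching the order produced by nested for loops.
    grids = np.meshgrid(*columns.values(), indexing='ij')
    # Flatten each grid separately so every column keeps its own dtype
    # instead of being coerced into one common 2-D array.
    data = {name: grid.ravel() for name, grid in zip(columns, grids)}
    return pd.DataFrame(data)

df_grid = cartesian_frame({
    'A': ['a', 'b', 'c', 'd'],
    'P': ['p', 'q', 'r'],
    'X': ['x', 'z'],
})
```

With 12 columns, the dict would simply have 12 entries. Whether this is actually faster than itertools.product for a given dataset is worth timing rather than assuming.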
