Matrix of all possible multiplication outcomes from two lists as a dataframe

I have two lists and am trying to create a matrix of all possible multiplication outcomes in a dataframe using pandas.

Lists:

>>> L1
[8, 1, 4, 2, 7, 5]

>>> L2
[5, 3, 9, 1, 2, 6]

I have multiplied every item from L1 with every item from L2 as follows to produce all possible outcomes:

>>> [[a*b] for a in L1 for b in L2]
[[40], [24], [72], [8], [16], [48], [5], [3], [9], [1], [2], [6], [20], [12], [36], [4], [8], [24], [10], [6], [18], [2], [4], [12], [35], [21], [63], [7], [14], [42], [25], [15], [45], [5], [10], [30]]

Expected Output/Goal:

My goal is now to represent these values as a matrix using pandas, but I am not sure where to start. The rows and columns should be labeled 0-5 by position: each column corresponds to an item of L1 and each row to an item of L2.

The df matrix should look something like:

    0  1   2   3   4   5
0  40  5  20  10  35  25
1  24  3  12   6  21  15
2  72  9  36  18  63  45
3   8  1   4   2   7   5
4  16  2   8   4  14  10
5  48  6  24  12  42  30
ThatOneNoob asked Jul 20 '17 17:07


2 Answers

You could consider a solution with numpy.outer:

In [879]: pd.DataFrame(np.outer(L2, L1))
Out[879]: 
    0  1   2   3   4   5
0  40  5  20  10  35  25
1  24  3  12   6  21  15
2  72  9  36  18  63  45
3   8  1   4   2   7   5
4  16  2   8   4  14  10
5  48  6  24  12  42  30
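
As a self-contained sketch of the same idea (the imports and the two print calls are additions for illustration, not part of the original session): np.outer(a, b) places a[i] * b[j] at row i, column j, so passing L2 first gives rows that follow L2 and columns that follow L1, which is the layout the question asks for.

```python
import numpy as np
import pandas as pd

L1 = [8, 1, 4, 2, 7, 5]
L2 = [5, 3, 9, 1, 2, 6]

# np.outer(a, b)[i, j] == a[i] * b[j], so rows follow L2 and columns follow L1.
df = pd.DataFrame(np.outer(L2, L1))

# The first row is L2[0] times every item of L1;
# the first column is every item of L2 times L1[0].
print(df.iloc[0].tolist())  # [40, 5, 20, 10, 35, 25]
print(df[0].tolist())       # [40, 24, 72, 8, 16, 48]
```

Swapping the arguments, np.outer(L1, L2), would give the transpose of this table.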

Performance

Setup:

import numpy as np
import pandas as pd

x = np.random.choice(26, 1000)
y = np.random.choice(26, 1000)

%timeit pd.DataFrame([[a*b for b in x] for a in y])  # Willem Van Onsem's solution
%timeit pd.DataFrame(np.outer(y, x))  # Proposed in this post

Results:

1 loop, best of 3: 566 ms per loop
100 loops, best of 3: 3.75 ms per loop

On this setup (two arrays of 1000 elements each), the numpy solution is about 150 times faster than the nested list comprehension (3.75 ms versus 566 ms).
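
The %timeit lines only work inside IPython. A rough sketch of the same comparison for a plain Python script, using the standard timeit module, might look like the following. The function names, the repeat counts, and the equality check are assumptions made for illustration, and the timings will differ from machine to machine.

```python
import timeit

import numpy as np
import pandas as pd

x = np.random.choice(26, 1000)
y = np.random.choice(26, 1000)

def with_list_comprehension():
    # Nested list comprehension: one Python-level multiplication per cell.
    return pd.DataFrame([[a * b for b in x] for a in y])

def with_outer():
    # Vectorized outer product computed inside numpy.
    return pd.DataFrame(np.outer(y, x))

# Check that both approaches build the same table before comparing timings.
same = np.array_equal(with_list_comprehension().to_numpy(), with_outer().to_numpy())

# Best of 3 runs for each approach; the faster one is averaged over 10 calls.
t_list = min(timeit.repeat(with_list_comprehension, number=1, repeat=3))
t_outer = min(timeit.repeat(with_outer, number=10, repeat=3)) / 10

print(f"same result: {same}")
print(f"list comprehension: {t_list * 1e3:.1f} ms, np.outer: {t_outer * 1e3:.2f} ms")
```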

cs95 answered Nov 09 '22 23:11


You can use nested list comprehension:

pd.DataFrame([[a*b for b in L1] for a in L2])

This generates:

>>> pd.DataFrame([[a*b for b in L1] for a in L2])
    0  1   2   3   4   5
0  40  5  20  10  35  25
1  24  3  12   6  21  15
2  72  9  36  18  63  45
3   8  1   4   2   7   5
4  16  2   8   4  14  10
5  48  6  24  12  42  30

The outer list comprehension [... for a in L2] iterates over L2 and assigns each value to the variable a. For each a, the inner list comprehension [a*b for b in L1] iterates over the values in L1 and builds one row by multiplying each value b by a. Each item of L2 therefore produces one row, and each item of L1 one column.
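
To make that concrete, here is a sketch of the same computation written as explicit loops. The names rows and row are illustrative and do not appear in the original one-liner.

```python
import pandas as pd

L1 = [8, 1, 4, 2, 7, 5]
L2 = [5, 3, 9, 1, 2, 6]

rows = []
for a in L2:          # outer comprehension: one row per item of L2
    row = []
    for b in L1:      # inner comprehension: one cell per item of L1
        row.append(a * b)
    rows.append(row)

df = pd.DataFrame(rows)
print(rows[0])  # [40, 5, 20, 10, 35, 25]
```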

Willem Van Onsem answered Nov 09 '22 23:11