create matrix structure using pandas

Tags: python, pandas, dataframe, numpy

I have loaded the following CSV file, which contains code and coefficient data, into the dataframe df:

CODE|COEFFICIENT  
A|0.5  
B|0.4  
C|0.3

import pandas as pd
import numpy as np
df = pd.read_csv('cod_coeff.csv', delimiter='|', encoding='utf-8-sig')

which gives

  CODE  COEFFICIENT  
0    A          0.5  
1    B          0.4  
2    C          0.3

From this dataframe, I need to build a final dataframe with a matrix structure, where each cell is the product of the row's coefficient and the column's coefficient:

      A     B     C  
A  0.25  0.20  0.15  
B  0.20  0.16  0.12  
C  0.15  0.12  0.09

I tried np.multiply, but it does not produce this result.

asked Aug 31 '16 03:08

dataviz

2 Answers

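NumPy's np.outer is a faster alternative. This assumes CODE has been set as the index (df = df.set_index('CODE')), so that df holds only the numeric COEFFICIENT column:

pd.DataFrame(np.outer(df, df), df.index, df.index)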
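
For anyone who wants to try this without the CSV file on disk, here is a minimal self-contained sketch. The inline `csv_text` string and the names `coeff` and `outer` are only illustrative, and the single numeric column is passed explicitly rather than the whole dataframe:

```python
import io

import numpy as np
import pandas as pd

# Inline copy of the question's CSV so the example runs without cod_coeff.csv.
csv_text = "CODE|COEFFICIENT\nA|0.5\nB|0.4\nC|0.3\n"
df = pd.read_csv(io.StringIO(csv_text), delimiter="|").set_index("CODE")

# np.outer flattens its inputs, so pass the one numeric column explicitly.
coeff = df["COEFFICIENT"].to_numpy()

# Label both axes with the codes to get the A/B/C matrix layout.
outer = pd.DataFrame(np.outer(coeff, coeff), index=df.index, columns=df.index)

print(outer.round(2))
```

Passing `df` itself to `np.outer` only works when every column is numeric, which is why the index has to be set first.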

Timing

Timings were taken on the 3-row sample and on a 30,000-row frame built as follows (the screenshots with the results are not available):

df = pd.concat([df for _ in range(10000)], ignore_index=True)
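
Since the timing screenshots are missing, here is a rough sketch of how the comparison could be rerun. It assumes the comparison was between np.outer and df.dot(df.T). The helper names `with_outer` and `with_dot` and the reduced size are my own choices, not from the answer: a 30,000 x 30,000 float64 result needs about 7.2 GB, so the sketch uses 1,500 rows instead.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical, smaller size: 500 repeats of the sample gives 1,500 rows.
# The answer used 10,000 repeats (30,000 rows), which needs far more memory.
n_repeats = 500
sample = pd.DataFrame({"COEFFICIENT": [0.5, 0.4, 0.3]})
big = pd.concat([sample for _ in range(n_repeats)], ignore_index=True)


def with_outer(frame):
    """Outer product computed by NumPy, then wrapped in a DataFrame."""
    values = frame["COEFFICIENT"].to_numpy()
    return pd.DataFrame(np.outer(values, values), frame.index, frame.index)


def with_dot(frame):
    """Outer product computed with pandas' own matrix product."""
    return frame.dot(frame.T)


# Both approaches should agree before their speed is compared.
assert np.allclose(with_outer(big).to_numpy(), with_dot(big).to_numpy())

for name, func in [("np.outer", with_outer), ("df.dot(df.T)", with_dot)]:
    # Best of three single runs, in seconds.
    seconds = min(timeit.repeat(lambda: func(big), number=1, repeat=3))
    print(f"{name}: {seconds:.4f} s")
```

The numbers depend on the machine and on the pandas and NumPy versions, so treat any single run as indicative only.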

answered Oct 09 '22 11:10

piRSquared

You want the product of a vector and its transpose. Set CODE as the index, transpose with .T, and apply the matrix dot function between the two dataframes.

df = df.set_index('CODE')

df.T
Out[10]: 
CODE             A    B    C
COEFFICIENT    0.5  0.4  0.3

df.dot(df.T)
Out[11]: 
CODE     A     B     C
CODE                  
A     0.25  0.20  0.15
B     0.20  0.16  0.12
C     0.15  0.12  0.09
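
The same steps as a self-contained script, in case it helps. The inline `csv_text` and the variable names are only illustrative. Since the question mentions np.multiply, the last part sketches why it falls short on two flat vectors and how broadcasting a column against a row gives the same matrix:

```python
import io

import numpy as np
import pandas as pd

# Inline copy of the question's CSV so the example runs without cod_coeff.csv.
csv_text = "CODE|COEFFICIENT\nA|0.5\nB|0.4\nC|0.3\n"
df = pd.read_csv(io.StringIO(csv_text), delimiter="|").set_index("CODE")

# (3 x 1) dot (1 x 3) gives the 3 x 3 matrix of pairwise products.
product = df.dot(df.T)

# np.multiply is element-wise: two length-3 vectors give only 3 products.
values = df["COEFFICIENT"].to_numpy()
elementwise = np.multiply(values, values)

# Broadcasting a column (3 x 1) against a row (1 x 3) gives the full matrix.
broadcast = np.multiply(values[:, None], values[None, :])

print(product.round(2))
print(elementwise.shape, broadcast.shape)
```

The dot version keeps the A/B/C labels on both axes for free, while the broadcast version returns a plain array that would still need wrapping in a DataFrame.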

answered Oct 09 '22 10:10

Zeugma
