Compute correlation between features and target variable

What is the best way to compute the correlation between my features and the target variable? My dataframe has 1,000 rows and 40,000 columns.

Example:

import pandas as pd

df = pd.DataFrame([[1, 2, 4, 6], [1, 3, 4, 7], [4, 6, 8, 12], [5, 3, 2, 10]], columns=['Feature1', 'Feature2', 'Feature3', 'Target'])

The code below works fine, but it takes too long on my dataframe. I only need the last column of the correlation matrix: the correlation of each feature with the target, not the pairwise feature correlations.

corr_matrix=df.corr()
corr_matrix["Target"].sort_values(ascending=False)

The np.corrcoef() function works with arrays, but can it skip the pairwise feature correlations?

asked Sep 25 '18 11:09

Cox Tox

1 Answer

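You could use the pandas Series.corr method on each feature column, which computes only the feature-to-target correlations and never builds the full correlation matrix: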

df.drop("Target", axis=1).apply(lambda x: x.corr(df.Target))
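With 40,000 columns, calling a Python lambda once per column may still be slow. As a rough sketch (not part of the original answer), the same Pearson correlations can be computed in one vectorized NumPy pass, or with pandas' built-in DataFrame.corrwith. The helper name target_correlations is hypothetical, and the NumPy version assumes all columns are numeric and contain no NaN values.

```python
import numpy as np
import pandas as pd


def target_correlations(df: pd.DataFrame, target: str = "Target") -> pd.Series:
    """Pearson correlation of every feature column with the target column.

    Hypothetical helper; assumes numeric data with no missing values.
    """
    features = df.drop(columns=target)
    X = features.to_numpy(dtype=float)
    y = df[target].to_numpy(dtype=float)

    # Center the features and the target.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()

    # Numerator: sum of products of deviations, one value per feature.
    num = Xc.T @ yc
    # Denominator: product of the two root sums of squares.
    den = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())

    # Constant columns produce 0/0; let them come out as NaN quietly.
    with np.errstate(divide="ignore", invalid="ignore"):
        r = num / den
    return pd.Series(r, index=features.columns)


df = pd.DataFrame(
    [[1, 2, 4, 6], [1, 3, 4, 7], [4, 6, 8, 12], [5, 3, 2, 10]],
    columns=["Feature1", "Feature2", "Feature3", "Target"],
)

# Vectorized NumPy version, sorted like the question's example.
result = target_correlations(df).sort_values(ascending=False)

# Built-in pandas alternative: correlate each feature column with the target.
via_corrwith = df.drop(columns="Target").corrwith(df["Target"])
```

Both results should match the "Target" column of df.corr() with the target's own entry dropped. Which variant is fastest on a 1,000 x 40,000 frame depends on the data and the pandas version, so it is worth timing them on a sample first.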

answered Jan 11 '23 22:01

w-m

Tags: python, dataframe, numpy, correlation
