Correlation between two non-numeric columns in a Pandas DataFrame

I load my data from a SQL query into a pandas DataFrame. The data looks like:

     group phone_brand
0      M32-38          小米
1      M32-38          小米
2      M32-38          小米
3      M29-31          小米
4      M29-31          小米
5      F24-26        OPPO
6      M32-38          酷派
7      M32-38          小米
8      M32-38        vivo
9      F33-42          三星
10     M29-31          华为
11     F33-42          华为
12     F27-28          三星
13     M32-38          华为
14       M39+         艾优尼
15     F27-28          华为
16     M32-38          小米
17     M32-38          小米
18       M39+          魅族
19     M32-38          小米
20     F33-42          三星
21     M23-26          小米
22     M23-26          华为
23     M27-28          三星
24     M29-31          小米
25     M32-38          三星
26     M32-38          三星
27     F33-42          三星
28     M32-38          三星
29     M32-38          三星
...       ...         ...
74809  M27-28          华为
74810  M29-31         TCL

Now I want to find the correlation between these two columns and the frequency of each combination, and show the result as a visualization with Matplotlib. I tried something like:

import matplotlib.pyplot as plt

df.plot(style='o')  # df is the DataFrame above; this fails because neither column is numeric
plt.show()

What is the simplest way to visualize this correlation?

asked Oct 29 '17 15:10

madik_atma

1 Answer

To quickly get a correlation:

df.apply(lambda x: x.factorize()[0]).corr()

                group  phone_brand
group        1.000000     0.427941
phone_brand  0.427941     1.000000
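
Note that `factorize` assigns integer codes in order of first appearance, so the Pearson coefficient above depends on that arbitrary ordering. As a possible complement (not part of the original answer), here is a minimal sketch of Cramér's V, a chi-square based measure of association between two categorical columns. The helper name `cramers_v` and the small sample DataFrame are made up for illustration; only the column names come from the question.

```python
import numpy as np
import pandas as pd


def cramers_v(x: pd.Series, y: pd.Series) -> float:
    """Cramér's V for two categorical series (0 = no association, 1 = perfect)."""
    observed = pd.crosstab(x, y).to_numpy(dtype=float)
    n = observed.sum()
    # Expected counts under independence: outer product of the marginals, divided by n.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / n
    chi2 = ((observed - expected) ** 2 / expected).sum()
    min_dim = min(observed.shape) - 1
    # With a single category in either column the measure is undefined.
    return float(np.sqrt(chi2 / (n * min_dim))) if min_dim > 0 else float("nan")


# Hypothetical sample shaped like the question's data (not the real 74,811 rows).
df = pd.DataFrame({
    "group": ["M32-38", "M32-38", "F24-26", "F24-26", "M29-31", "M29-31"],
    "phone_brand": ["小米", "小米", "OPPO", "OPPO", "华为", "华为"],
})

v = cramers_v(df["group"], df["phone_brand"])
print(v)
```

Unlike the factorized Pearson value, this number does not change when the category labels are reordered. On small samples Cramér's V tends to be biased upward, so treat it as a rough indicator.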

Heat map

import pandas as pd
import seaborn as sns

# Cross-tabulate the two columns into a frequency table and plot the counts
sns.heatmap(pd.crosstab(df.group, df.phone_brand))
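
Since the question asks for Matplotlib specifically, the same frequency table can also be drawn without seaborn. The following is a sketch under assumptions: the sample DataFrame is invented for illustration, and the counts are row-normalized so that large groups do not dominate the colors, which is a design choice rather than something the original answer does.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample shaped like the question's data.
df = pd.DataFrame({
    "group": ["M32-38", "M32-38", "M32-38", "F24-26", "F24-26", "M29-31"],
    "phone_brand": ["OPPO", "vivo", "vivo", "OPPO", "OPPO", "TCL"],
})

# Frequency of each (group, phone_brand) combination.
counts = pd.crosstab(df["group"], df["phone_brand"])
# Share of each brand within a group; every row sums to 1.
shares = counts.div(counts.sum(axis=1), axis=0)

fig, ax = plt.subplots()
image = ax.imshow(shares.to_numpy(), cmap="viridis", aspect="auto")
ax.set_xticks(range(len(shares.columns)))
ax.set_xticklabels(shares.columns, rotation=90)
ax.set_yticks(range(len(shares.index)))
ax.set_yticklabels(shares.index)
ax.set_xlabel("phone_brand")
ax.set_ylabel("group")
fig.colorbar(image, ax=ax, label="share of group")
# Call plt.show() in an interactive session to display the figure.
```

With the real data, brand names such as 小米 or 华为 may render as empty boxes if Matplotlib's default font has no CJK glyphs; configuring a font that includes them should fix the labels.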

answered Oct 16 '22 23:10

piRSquared

Tags:

python

pandas

matplotlib

correlation
