Displaying pair plot in Pandas data frame

I am trying to display a pair plot created with scatter_matrix from a pandas DataFrame. This is how the pair plot is created:

# Create dataframe from data in X_train
# Label the columns using the strings in iris_dataset.feature_names
iris_dataframe = pd.DataFrame(X_train, columns=iris_dataset.feature_names)
# Create a scatter matrix from the dataframe, color by y_train
grr = pd.scatter_matrix(iris_dataframe, c=y_train, figsize=(15, 15), marker='o',
                        hist_kwds={'bins': 20}, s=60, alpha=.8, cmap=mglearn.cm3)

I want the pair plot to be displayed and to look something like this:

[Image: example pair plot]

I am using Python v3.6 and PyCharm and am not using Jupyter Notebook.

asked Mar 04 '17 05:03

user3848207

2 Answers

This code worked for me using Python 3.5.2:

import pandas as pd
import matplotlib.pyplot as plt
# IPython/Jupyter magic; delete this line when running as a plain script
%matplotlib inline

from sklearn import datasets

iris_dataset = datasets.load_iris()
X = iris_dataset.data
Y = iris_dataset.target

iris_dataframe = pd.DataFrame(X, columns=iris_dataset.feature_names)

# Create a scatter matrix from the dataframe, color by Y
grr = pd.plotting.scatter_matrix(iris_dataframe, c=Y, figsize=(15, 15), marker='o',
                                 hist_kwds={'bins': 20}, s=60, alpha=.8)

For pandas versions earlier than v0.20.0, call the function from the top-level namespace instead (thanks to michael-szczepaniak for pointing out that this older API has since been deprecated):

grr = pd.scatter_matrix(iris_dataframe, c=Y, figsize=(15, 15), marker='o',
                        hist_kwds={'bins': 20}, s=60, alpha=.8)
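If the same script has to run on both older and newer pandas releases, one option is to try the new import location first and fall back to the old one. This is only a sketch: the random data is a stand-in for the iris features, the names frame, labels and axes are made up for illustration, and the Agg backend is selected solely so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop it to get a window
import numpy as np
import pandas as pd

try:
    # pandas >= 0.20.0: the function lives in pandas.plotting
    from pandas.plotting import scatter_matrix
except ImportError:
    # older pandas releases kept it in pandas.tools.plotting
    from pandas.tools.plotting import scatter_matrix

# Synthetic stand-in for the iris data (hypothetical values, three classes)
rng = np.random.RandomState(0)
frame = pd.DataFrame(rng.rand(30, 3), columns=["a", "b", "c"])
labels = np.repeat([0, 1, 2], 10)

# Same keyword arguments as in the answer, applied to the stand-in data
axes = scatter_matrix(frame, c=labels, figsize=(6, 6), marker='o',
                      hist_kwds={'bins': 20}, s=60, alpha=.8)
print(axes.shape)  # one Axes per pair of columns
```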

I had to remove the cmap=mglearn.cm3 argument because I could not get mglearn to work; there is a version mismatch with sklearn.
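If discrete class colours are still wanted without mglearn, a colormap can be built directly with matplotlib and passed as cmap. The sketch below assumes three classes and picks three arbitrary colours; it does not try to reproduce mglearn's palette, and three_class_cmap, frame and labels are illustrative names with random stand-in data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
from matplotlib.colors import ListedColormap

# Three arbitrary colours, one per class (an assumption, not mglearn's own palette)
three_class_cmap = ListedColormap(["#1f77b4", "#d62728", "#2ca02c"])

# Synthetic stand-in for the iris features and targets
rng = np.random.RandomState(0)
frame = pd.DataFrame(rng.rand(30, 3), columns=["a", "b", "c"])
labels = np.repeat([0, 1, 2], 10)

# cmap is forwarded to the underlying scatter calls together with c
axes = pd.plotting.scatter_matrix(frame, c=labels, cmap=three_class_cmap,
                                  figsize=(6, 6), marker='o',
                                  hist_kwds={'bins': 20}, s=60, alpha=.8)
```

Any built-in colormap name, such as cmap='viridis', can be passed the same way.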

To save the image directly to a file instead of displaying it, you can use:

plt.savefig('foo.png')

In that case, also remove this line:

%matplotlib inline
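For a plain script run from PyCharm or a terminal rather than a notebook, the figure is normally only displayed once plt.show() is called. The sketch below wraps both choices, showing or saving, in one helper. draw_pair_plot, the temporary file path and the random stand-in data are hypothetical names introduced for illustration, not part of the answer above.

```python
import os
import tempfile

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def draw_pair_plot(frame, labels, out_path=None):
    """Draw a scatter matrix; save it to out_path, or show it if out_path is None."""
    axes = pd.plotting.scatter_matrix(frame, c=labels, figsize=(8, 8), marker='o',
                                      hist_kwds={'bins': 20}, s=60, alpha=.8)
    if out_path is None:
        plt.show()              # opens the figure window in a plain script
    else:
        plt.savefig(out_path)   # writes the current figure to disk
        plt.close('all')        # release the figure once it has been saved
    return axes


# Synthetic stand-in for the iris features and targets
rng = np.random.RandomState(0)
frame = pd.DataFrame(rng.rand(30, 3), columns=["a", "b", "c"])
labels = np.repeat([0, 1, 2], 10)

# Save to a temporary file here; call draw_pair_plot(frame, labels) to display instead
out_path = os.path.join(tempfile.gettempdir(), "pair_plot.png")
draw_pair_plot(frame, labels, out_path=out_path)
```

Calling draw_pair_plot(frame, labels) with no out_path takes the plt.show() branch, which is the variant that matches the original question.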

answered Sep 25 '22 08:09

Vikash Singh

Just an update to Vikash's excellent answer. The last two lines should now be:

grr = pd.plotting.scatter_matrix(iris_dataframe, c=Y, figsize=(15, 15), marker='o',
                                 hist_kwds={'bins': 20}, s=60, alpha=.8)

The scatter_matrix function has been moved to the pandas.plotting module, so the call in the original answer, while it worked at the time, is now deprecated.

So the complete code would now be:

import pandas as pd
import matplotlib.pyplot as plt
# IPython/Jupyter magic; delete this line when running as a plain script
%matplotlib inline

from sklearn import datasets

iris_dataset = datasets.load_iris()
X = iris_dataset.data
Y = iris_dataset.target

iris_dataframe = pd.DataFrame(X, columns=iris_dataset.feature_names)
# Create a scatter matrix from the dataframe, color by Y
grr = pd.plotting.scatter_matrix(iris_dataframe, c=Y, figsize=(15, 15), marker='o',
                                 hist_kwds={'bins': 20}, s=60, alpha=.8)
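The value assigned to grr is a NumPy array of matplotlib Axes, one per pair of columns, so it can be used to reach the figure afterwards. A small sketch under assumptions: four random columns stand in for the four iris features, the column names and the title text are invented, and the Agg backend is used only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd

# Four synthetic columns standing in for the four iris features
rng = np.random.RandomState(0)
frame = pd.DataFrame(rng.rand(40, 4), columns=["f0", "f1", "f2", "f3"])
labels = np.repeat([0, 1, 2, 3], 10)

grr = pd.plotting.scatter_matrix(frame, c=labels, figsize=(8, 8), marker='o',
                                 hist_kwds={'bins': 20}, s=60, alpha=.8)

print(type(grr).__name__, grr.shape)  # array of Axes, one row and column per feature

# Every Axes in the grid belongs to the same figure
fig = grr[0, 0].figure
fig.suptitle("Pair plot")  # hypothetical title text
```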

answered Sep 22 '22 08:09

Michael Szczepaniak

Tags: python, python-3.x, pandas, plot