I create a pandas scatter matrix using the following code:
import numpy as np
import pandas as pd
a = np.random.normal(1, 3, 100)
b = np.random.normal(3, 1, 100)
c = np.random.normal(2, 2, 100)
df = pd.DataFrame({'A':a,'B':b,'C':c})
pd.scatter_matrix(df, diagonal='kde')
This results in the following scatter matrix:
The first row has no ytick labels, the third column has no xtick labels, and the third item, 'C', is not labeled.
Any idea how to complete this plot with the missing labels?
Pandas provides the scatter_matrix() function for this purpose. It generates a grid of scatter plots between all pairs of numerical features.
pandas.plotting.scatter_matrix draws a matrix of scatter plots.
Pandas uses matplotlib to display scatter matrices.
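Access the subplot in question and change its settings like so:
import matplotlib.pyplot as plt
axes = pd.scatter_matrix(df, diagonal='kde')
ax = axes[2, 2] # your bottom-right subplot
ax.xaxis.set_visible(True) # show the x ticks and label on that subplot
plt.draw() # redraw the figure so the change appears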
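For readers on a recent pandas release, here is a minimal sketch of the same idea. It assumes the function is imported from pandas.plotting, since the top-level pd.scatter_matrix is no longer available in newer versions. The fixed seed, the "Agg" backend, and diagonal="hist" (used to avoid the SciPy dependency that 'kde' needs) are my own choices for illustration and are not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

rng = np.random.default_rng(0)  # fixed seed, only for reproducibility
df = pd.DataFrame({
    "A": rng.normal(1, 3, 100),
    "B": rng.normal(3, 1, 100),
    "C": rng.normal(2, 2, 100),
})

# scatter_matrix returns a 2-D NumPy array of matplotlib Axes objects
axes = scatter_matrix(df, diagonal="hist")

ax = axes[-1, -1]            # bottom-right subplot
ax.xaxis.set_visible(True)   # show its x ticks and label
ax.figure.canvas.draw()      # re-render the figure with the change
```

Which subplots already show ticks and labels may depend on the pandas version, so it is worth checking the returned axes before changing anything.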
You can inspect how the scatter_matrix function goes about labeling at the link below. If you find yourself doing this over and over, consider copying the code into a file and creating your own custom scatter_matrix function.
https://github.com/pydata/pandas/blob/master/pandas/tools/plotting.py#L160
Edit, in response to a rejected comment:
The obvious extensions of this, such as axes[0, 0].xaxis.set_visible(True), do not work. For some reason, scatter_matrix seems to set up ticks and labels for axes[2, 2] without making them visible, but it does not set up ticks and labels for the rest. If you decide that it is necessary to display ticks and labels on other subplots, you'll have to dig deeper into the code linked above.
Specifically, change the conditions on the if statements to:
if i == 0
if i == n-1
if j == 0
if j == n-1
respectively. I haven't tested that, but I think it will do the trick.
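As a rough illustration of what those four conditions are aiming for, here is a simplified stand-in written directly against matplotlib. It is not the pandas source: simple_scatter_matrix is a hypothetical helper, and it draws histograms rather than KDE curves on the diagonal. It only shows how checks on i and j can put ticks and labels on all four outer edges of the grid.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def simple_scatter_matrix(frame):
    """Hypothetical, simplified stand-in for pandas' scatter_matrix."""
    cols = list(frame.columns)
    n = len(cols)
    fig, axes = plt.subplots(n, n, squeeze=False, figsize=(8, 8))
    for i, row_name in enumerate(cols):
        for j, col_name in enumerate(cols):
            ax = axes[i, j]
            if i == j:
                # Diagonal: distribution of one column (y ticks here are counts)
                ax.hist(frame[col_name].to_numpy(), bins=20)
            else:
                ax.scatter(frame[col_name].to_numpy(),
                           frame[row_name].to_numpy(), s=4)
            # Hide both axes, then re-enable them only on the outer edges
            ax.xaxis.set_visible(False)
            ax.yaxis.set_visible(False)
            if i == 0:  # top row: x ticks and label on top
                ax.xaxis.set_visible(True)
                ax.xaxis.set_ticks_position("top")
                ax.xaxis.set_label_position("top")
                ax.set_xlabel(col_name)
            if i == n - 1:  # bottom row: x ticks and label at the bottom
                ax.xaxis.set_visible(True)
                ax.xaxis.set_ticks_position("bottom")
                ax.xaxis.set_label_position("bottom")
                ax.set_xlabel(col_name)
            if j == 0:  # left column: y ticks and label on the left
                ax.yaxis.set_visible(True)
                ax.yaxis.set_ticks_position("left")
                ax.yaxis.set_label_position("left")
                ax.set_ylabel(row_name)
            if j == n - 1:  # right column: y ticks and label on the right
                ax.yaxis.set_visible(True)
                ax.yaxis.set_ticks_position("right")
                ax.yaxis.set_label_position("right")
                ax.set_ylabel(row_name)
    return axes


rng = np.random.default_rng(0)  # fixed seed, only for reproducibility
df = pd.DataFrame({
    "A": rng.normal(1, 3, 100),
    "B": rng.normal(3, 1, 100),
    "C": rng.normal(2, 2, 100),
})
axes = simple_scatter_matrix(df)
```

With three columns, only the centre subplot keeps both axes hidden. Every other subplot shows ticks on whichever outer edges it touches.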