I want to plot a bunch of variables against the same target variable. So a kind of scatter matrix, but with list_of_df_columns-vs-one_df_column rather than all-vs-all.
I've looked at adding subplots one by one in a loop, but it seems like there must be a better way. Is there some way to use the scatter_matrix function to do this?
There are dozens of variables I want to plot against a single outcome, and I really want the results to be nice and compact so they can be presented as a single figure.
If you have a grouping variable, you can create a scatter plot by group by passing the variable (as a factor) to the col argument of the plot function, so that each group is displayed in a different color.
You can plot data directly from your DataFrame using the plot() method. To plot multiple columns in a single frame, pass the list of columns to the y argument of plot().
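As a rough sketch of that idea (the small DataFrame, the column names A to D, and the choice of A as the shared column are all assumptions made up for illustration), passing x plus a list for y with subplots=True gives one panel per column:

```python
import pandas as pd

# Small illustrative frame; the values and column names are assumptions
df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5, 6],
    "B": [2, 0, 3, 6, 1, 3],
    "C": [7, 3, 2, 1, 5, 0],
    "D": [1, 3, 0, 2, 2, 6],
})

target = "A"                                      # the shared column
others = [c for c in df.columns if c != target]   # everything else

# x is the shared column, y is the list of columns to draw.
# subplots=True gives each y column its own panel; layout puts them in one row.
axes = df.plot(
    x=target,
    y=others,
    subplots=True,
    layout=(1, len(others)),
    style=".",            # points instead of connected lines
    figsize=(9, 3),
)
```

Call matplotlib.pyplot.show() afterwards to display the figure. Note that this puts the shared column on the x-axis of every panel.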
You could try using seaborn pairplot, and passing specific x and y variables.
import seaborn as sns
# Plot every other column against the target column "A" (leave "A" out of x_vars)
sns.pairplot(df, y_vars=["A"], x_vars=df.columns.drop("A"))
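With dozens of variables, a single row of panels gets very wide. One possible workaround, sketched here under assumptions (the example data, the column name A as the outcome, and col_wrap=2 are all illustrative), is to reshape the frame to long format and let seaborn's relplot wrap the panels onto several rows:

```python
import pandas as pd
import seaborn as sns

# Small illustrative frame; the values and column names are assumptions
df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5, 6],
    "B": [2, 0, 3, 6, 1, 3],
    "C": [7, 3, 2, 1, 5, 0],
    "D": [1, 3, 0, 2, 2, 6],
})
target = "A"

# Long format: one row per (target value, variable name, variable value)
long_df = df.melt(id_vars=target, var_name="variable", value_name="value")

# One small scatter panel per variable, starting a new row every 2 panels.
# sharex=False lets each variable keep its own x-axis range.
grid = sns.relplot(
    data=long_df,
    x="value",
    y=target,
    col="variable",
    col_wrap=2,
    height=2,
    facet_kws={"sharex": False},
)
```

Increasing col_wrap (for example to 6 or 8) keeps a figure with many variables reasonably compact.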
Maybe the bare plot method could help if you set the index to the fixed column:
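import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'A':[1,2,3,4,5,6],'B':[2,0,3,6,1,3],'C':[7,3,2,1,5,0],'D':[1,3,0,2,2,6]})
col = 'A'  # the fixed column that everything else is plotted against
# Move the fixed column into the index so it becomes the shared x-axis
df2 = df.drop(col,axis=1)
df2.index = df[col]
# One subplot per remaining column, drawn as points rather than lines
df2.plot(subplots=True, style='.')
plt.legend(loc='best')
plt.show()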
Hope it helps.
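If you do end up looping over subplots, a compact grid keeps it to a single figure. This is only a minimal sketch: the example data, the outcome column A, and the grid width ncols=2 are assumptions chosen for illustration.

```python
import math

import matplotlib.pyplot as plt
import pandas as pd

# Small illustrative frame; the values and column names are assumptions
df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5, 6],
    "B": [2, 0, 3, 6, 1, 3],
    "C": [7, 3, 2, 1, 5, 0],
    "D": [1, 3, 0, 2, 2, 6],
})
target = "A"
features = [c for c in df.columns if c != target]

ncols = 2                                   # illustrative grid width
nrows = math.ceil(len(features) / ncols)    # enough rows to fit every feature

# squeeze=False always returns a 2-D array of axes, even for a single row
fig, axes = plt.subplots(
    nrows, ncols, figsize=(3 * ncols, 2.5 * nrows), sharey=True, squeeze=False
)

# One scatter panel per feature, with the target on the shared y-axis
for ax, name in zip(axes.ravel(), features):
    ax.scatter(df[name], df[target], s=10)
    ax.set_xlabel(name)

# Hide any leftover panels in the last row
for ax in axes.ravel()[len(features):]:
    ax.set_visible(False)

# Label the y-axis only once per row
for ax in axes[:, 0]:
    ax.set_ylabel(target)

fig.tight_layout()
```

Call plt.show() or fig.savefig(...) afterwards to view or export the figure.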