I have some nice data in a pandas dataframe. I'd like to run simple linear regression on it:
Using statsmodels, I perform my regression. Now, how do I get my plot? I've tried statsmodels' plot_fit method, but the plot is a little funky:
I was hoping to get a horizontal line which represents the actual result of the regression.
Statsmodels has a variety of methods for plotting regression (a few more details about them here), but none of them seems to be the super simple "just plot the regression line on top of your data"; plot_fit seems to be the closest thing.
matplotlib.axes._subplots.AxesSubplot. Can I overlay a regression line easily onto that plot? Two related questions:
Neither seems to have a good answer.
As requested by @IgorRaush
       motifScore  expression
6870     1.401123        0.55
10456    1.188554       -1.58
12455    1.476361       -1.75
18052    1.805736        0.13
19725    1.110953        2.30
30401    1.744645       -0.49
30716    1.098253       -1.59
30771    1.098253       -2.04
I had tried this, but it doesn't seem to work... not sure why:
The OLS() function of the statsmodels.api module is used to perform ordinary least squares regression. It returns an OLS model object; calling the fit() method on that object fits the regression line to the data and returns the results.
In applied statistics, a partial regression plot attempts to show the effect of adding another variable to a model that already has one or more independent variables. Partial regression plots are also referred to as added variable plots, adjusted variable plots, and individual coefficient plots.
As I mentioned in the comments, seaborn is a great choice for statistical data visualization.

    import seaborn as sns

    sns.regplot(x='motifScore', y='expression', data=motif)
Alternatively, you can use statsmodels.regression.linear_model.OLS and manually plot a regression line.

    import numpy as np
    import statsmodels.api as sm

    # regress "expression" onto "motifScore" (plus an intercept)
    model = sm.OLS(motif.expression, sm.add_constant(motif.motifScore))
    p = model.fit().params

    # generate x-values for your regression line (two is sufficient)
    x = np.arange(1, 3)

    # scatter-plot data
    ax = motif.plot(x='motifScore', y='expression', kind='scatter')

    # plot regression line on the same axes, set x-axis limits
    ax.plot(x, p.const + p.motifScore * x)
    ax.set_xlim([1, 2])
Yet another solution is statsmodels.graphics.regressionplots.abline_plot, which takes away some of the boilerplate from the above approach.

    import statsmodels.api as sm
    from statsmodels.graphics.regressionplots import abline_plot

    # regress "expression" onto "motifScore" (plus an intercept)
    model = sm.OLS(motif.expression, sm.add_constant(motif.motifScore))

    # scatter-plot data
    ax = motif.plot(x='motifScore', y='expression', kind='scatter')

    # plot regression line
    abline_plot(model_results=model.fit(), ax=ax)