I tried fitting an OLS for Boston data set. My graph looks like below.
How to annotate the linear regression equation just above the line or somewhere in the graph? How do I print the equation in Python?
I am fairly new to this area. Exploring python as of now. If somebody can help me, it would speed up my learning curve.
Many thanks!
I tried this as well.
My problem is - how to annotate the above in the graph in equation format?
Regression plots in seaborn can be easily implemented with the help of the lmplot() function. lmplot() can be understood as a function that basically creates a linear model plot. lmplot() makes a very simple linear regression plot.It creates a scatter plot with a linear fit on top of it.
regplot() : This method is used to plot data and a linear regression model fit. There are a number of mutually exclusive options for estimating the regression model.
lmplot() method is used to draw a scatter plot onto a FacetGrid.
You can use coefficients of linear fit to make a legend like in this example:
import seaborn as sns
import matplotlib.pyplot as plt
from scipy import stats
tips = sns.load_dataset("tips")
# get coeffs of linear fit
slope, intercept, r_value, p_value, std_err = stats.linregress(tips['total_bill'],tips['tip'])
# use line_kws to set line label for legend
ax = sns.regplot(x="total_bill", y="tip", data=tips, color='b',
line_kws={'label':"y={0:.1f}x+{1:.1f}".format(slope,intercept)})
# plot legend
ax.legend()
plt.show()
If you use more complex fitting function you can use latex notification: https://matplotlib.org/users/usetex.html
To annotate multiple linear regression lines in the case of using seaborn
lmplot
you can do the following.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.read_excel('data.xlsx')
# assume some random columns called EAV and PAV in your DataFrame
# assume a third variable used for grouping called "Mammal" which will be used for color coding
p = sns.lmplot(x=EAV, y=PAV,
data=df, hue='Mammal',
line_kws={'label':"Linear Reg"}, legend=True)
ax = p.axes[0, 0]
ax.legend()
leg = ax.get_legend()
L_labels = leg.get_texts()
# assuming you computed r_squared which is the coefficient of determination somewhere else
slope, intercept, r_value, p_value, std_err = stats.linregress(df['EAV'],df['PAV'])
label_line_1 = r'$y={0:.1f}x+{1:.1f}'.format(slope,intercept)
label_line_2 = r'$R^2:{0:.2f}$'.format(0.21) # as an exampple or whatever you want[!
L_labels[0].set_text(label_line_1)
L_labels[1].set_text(label_line_2)
Result:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With