I'm new to the regression game and hope to plot a functionally arbitrary, nonlinear regression line (plus confidence and prediction intervals) for a subset of data that satisfies a certain condition (i.e. with mean replicate value exceeding a threshold; see below).
The data
is generated for independent variable x
across 20 different values: x=(20-np.arange(20))**2
, with rep_num=10
replicates for each condition. The data shows strong nonlinearity across x
and looks like the following:
import numpy as np
mu = [.40, .38, .39, .35, .37, .33, .34, .28, .11, .24,
.03, .07, .01, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
data = np.zeros((20, rep_num))
for i in range(13):
data[i] = np.clip(np.random.normal(loc=mu[i], scale=0.1, size=rep_num), 0., 1.)
I can make a scatter plot of the data; the replicate means are shown by the red dots:
import matplotlib.pyplot as plt
plt.scatter(np.log10(np.tile(x[:,None], rep_num)), data,
facecolors='none', edgecolors='k', alpha=0.25)
plt.plot(np.log10(x), data.mean(1), 'ro', alpha=0.8)
plt.plot(np.log10(x), np.repeat(0., 20), 'k--')
plt.xlim(-0.02, np.max(np.log10(x)) + 0.02)
plt.ylim(-0.01, 0.7)
My goal is to plot a regression line for only those data that have replicate mean > 0.02. In addition, I would like to add a 95% confidence interval (black dashed lines) around the regression, as well as a 95% prediction interval (blue dashed lines) -- ideally, the prediction interval can also be colored in with transparent blue background.
The final plot (without the blue background inside the prediction interval) would look something like this:
How would I make this? My online search yielded very different partial approaches using seaborn, scipy, and statsmodels. The applications of some of those template functions did not appear to work alongside the existing matplotlib scatter plot.
The 95% confidence interval is commonly interpreted as there is a 95% probability that the true linear regression line of the population will lie within the confidence interval of the regression line calculated from the sample data.
The prediction interval predicts in what range a future individual observation will fall, while a confidence interval shows the likely range of values associated with some statistical parameter of the data, such as the population mean.
OK, here's a shot at this (withouth prediction band, though). First of all you want to select the applicable data:
threshold = 0.02
reg_x = np.log10(x)[data.mean(1)>threshold]
reg_y = data.mean(1)[data.mean(1)>threshold]
Then you choose a model and perform a fit. Note, here I chose a second order polynomial but in principle you could do anything. For the fits I use kapteyn
, this has a built-in confidence bans method, although it would be straightforward to implement (see e.g. Delta method)
from kapteyn import kmpfit
# Set model to fit.
def model(p, x):
a, b, c = p
return a + b*x + c*x**2
# Perform fit.
f = kmpfit.simplefit(model, [.1, .1, .1], reg_x, reg_y)
f
contains all the estimated parameters and such, you can use that for plotting etc.
x = np.linspace(0, 3, 100)
plt.plot(x, model(f.params, x), linestyle='-', color='black', marker='')
For the confidence bands, we need the partial derivatives of the model with respect to the parameters (yes, some math). Again, this is easy for a polynomial model, shouldn't be a problem for any other model either.
# Partial derivatives:
dfdp = [1., reg_x, reg_x**2]
_, ci_upper, ci_lower = f.confidence_band(reg_x, dfdp, 0.95, model)
# Plot.
plt.plot(reg_x, ci_upper, linestyle='--', color='black', marker='')
plt.plot(reg_x, ci_lower, linestyle='--', color='black', marker='')
Unfortunately there is not prediction_bands()
routine in the package, at least not that I know of. Assume you found some method for the prediction band, the plotting and preparation would look the same though..
p_upper, p_lower = prediction_band(*args, **kwargs)
plt.fill_between(reg_x, p_upper, p_lower, facecolor='blue', alpha=0.2, linestyle='')
Hope this helps, L.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With