One example for curve is shown as below. The elbow point might be x=3 or 4. How to compute the elbow for a curve automatically and mathematically?
(1) Find the tangent line to the curve that is parallel to the line segment A. Define the elbow point as the point where the tangent line intersects the curve. (2) Find the point on the curve that has the greatest vertical distance to the line segment A.
The elbow plot is helpful when determining how many PCs we need to capture the majority of the variation in the data. The elbow plot visualizes the standard deviation of each PC. Where the elbow appears is usually the threshold for identifying the majority of the variation.
The elbow method is used to determine the optimal number of clusters in k-means clustering. The elbow method plots the value of the cost function produced by different values of k.
I created a Python package that attempts to implement the Kneedle algorithm.
To recreate the function above and detect the point of maximum curvature:
x = range(1,21) y = [0.065, 0.039, 0.030, 0.024, 0.023, 0.022, 0.019, 0.0185, 0.0187, 0.016, 0.015, 0.016, 0.0135, 0.0130, 0.0125, 0.0120, 0.0117, 0.0115, 0.0112, 0.013] kn = KneeLocator( x, y, curve='convex', direction='decreasing', interp_method='interp1d', ) print(kn.knee) 7
import matplotlib.pyplot as plt plt.xlabel('x') plt.ylabel('f(x)') plt.xticks(range(1,21)) plt.plot(x, y, 'bx-') plt.vlines(kn.knee, plt.ylim()[0], plt.ylim()[1], linestyles='dashed')
update
Kneed has an improved spline fitting method for handling local minima, use interp_method='polynomial'
.
kn = KneeLocator( x, y, curve='convex', direction='decreasing', interp_method='polynomial', ) print(kn.knee) 4
And the new plot:
plt.xlabel('x') plt.ylabel('f(x)') plt.xticks(range(1,21)) plt.plot(x, y, 'bx-') plt.vlines(kn.knee, plt.ylim()[0], plt.ylim()[1], linestyles='dashed')
You might want to look for the point with the maximum absolute second derivative which, for a set of discrete points x[i]
as you have there, can be approximated with a central difference:
secondDerivative[i] = x[i+1] + x[i-1] - 2 * x[i]
As noted above, what you really want is the point with maximum curvature, but the second derivative will do, and this central difference is a good proxy for the second derivative.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With