Find locations on a curve where the slope changes

Question

I have data points of time and voltage that create the curve shown below.

The time data is

array([  0.10810811,   0.75675676,   1.62162162,   2.59459459,
         3.56756757,   4.21621622,   4.97297297,   4.97297297,
         4.97297297,   4.97297297,   4.97297297,   4.97297297,
         4.97297297,   4.97297297,   5.08108108,   5.18918919,
         5.2972973 ,   5.51351351,   5.72972973,   5.94594595,
         6.27027027,   6.59459459,   7.13513514,   7.67567568,
         8.32432432,   9.18918919,  10.05405405,  10.91891892,
        11.78378378,  12.64864865,  13.51351351,  14.37837838,
        15.35135135,  16.32432432,  17.08108108,  18.16216216,
        19.02702703,  20.        ,  20.        ,  20.        ,
        20.        ,  20.        ,  20.        ,  20.        ,
        20.10810811,  20.21621622,  20.43243243,  20.64864865,
        20.97297297,  21.40540541,  22.05405405,  22.91891892,
        23.78378378,  24.86486486,  25.83783784,  26.7027027 ,
        27.56756757,  28.54054054,  29.51351351,  30.48648649,
        31.56756757,  32.64864865,  33.62162162,  34.59459459,
        35.67567568,  36.64864865,  37.62162162,  38.59459459,
        39.67567568,  40.75675676,  41.83783784,  42.81081081,
        43.89189189,  44.97297297,  46.05405405,  47.02702703,
        48.10810811,  49.18918919,  50.27027027,  51.35135135,
        52.43243243,  53.51351351,  54.48648649,  55.56756757,
        56.75675676,  57.72972973,  58.81081081,  59.89189189])

and the volts data is

array([ 4.11041056,  4.11041056,  4.11041056,  4.11041056,  4.11041056,
        4.11041056,  4.11041056,  4.10454545,  4.09794721,  4.09208211,
        4.08621701,  4.07961877,  4.07228739,  4.06568915,  4.05909091,
        4.05175953,  4.04516129,  4.03782991,  4.03123167,  4.02463343,
        4.01803519,  4.01217009,  4.00557185,  3.99970674,  3.99384164,
        3.98797654,  3.98284457,  3.97771261,  3.97331378,  3.96891496,
        3.96451613,  3.96085044,  3.95645161,  3.95205279,  3.9483871 ,
        3.94398827,  3.94032258,  3.93665689,  3.94325513,  3.94985337,
        3.95645161,  3.96378299,  3.97038123,  3.97624633,  3.98284457,
        3.98944282,  3.99604106,  4.0026393 ,  4.00923754,  4.01510264,
        4.02096774,  4.02609971,  4.02903226,  4.03196481,  4.03416422,
        4.0356305 ,  4.03709677,  4.03856305,  4.03929619,  4.04002933,
        4.04076246,  4.04222874,  4.04296188,  4.04296188,  4.04369501,
        4.04442815,  4.04516129,  4.04516129,  4.04589443,  4.04589443,
        4.04662757,  4.04662757,  4.0473607 ,  4.0473607 ,  4.04809384,
        4.04809384,  4.04809384,  4.04882698,  4.04882698,  4.04882698,
        4.04956012,  4.04956012,  4.04956012,  4.04956012,  4.05029326,
        4.05029326,  4.05029326,  4.05029326])

plot of the curve

I would like to determine the location of the points labeled A, B, C, D, and E. Point A is the first location where the slope goes from zero to undefined. Point B is the location where the line is no longer vertical. Point C is the minimum of the curve. Point D is where the curve is no longer vertical. Point E is where the slope is close to zero again. The Python code below determines the locations for points A and C.

tdiff = np.diff(time)
vdiff = np.diff(volts)

# point A
idxA = np.where(vdiff < 0)[0][0]
timeA = time[idxA]
voltA = volts[idxA]

# point C
idxC = volts.idxmin()
timeC = time[idxC]
voltC = volts[idxC]

plot with points A and C

How can I determine the other locations on the curve represented by points B, D, and E?

Mad Physicist · Accepted Answer

You are looking for the points that mark any location where the slope changes to or from zero or infinity. We do not not actually need to compute slopes anywhere: either y_n - y_n-1 == 0 and y_n+1 - y_n != 0, or vice versa, or the same for x.

We can take the diff of x. If one of two successive elements is zero, then the diff of the diff will be the diff or the negative diff at that point. So we just want to find and label all points where diff(x) == diff(diff(x)) and diff(x) != 0, properly adjusted for differences in size between the arrays of course. We also want all the points where the same is true for y.

In numpy terms, this is can be written as follows

def masks(vec):
    d = np.diff(vec)
    dd = np.diff(d)

    # Mask of locations where graph goes to vertical or horizontal, depending on vec
    to_mask = ((d[:-1] != 0) & (d[:-1] == -dd))
    # Mask of locations where graph comes from vertical or horizontal, depending on vec
    from_mask = ((d[1:] != 0) & (d[1:] == dd))
    return to_mask, from_mask

to_vert_mask, from_vert_mask = masks(time)
to_horiz_mask, from_horiz_mask = masks(volts)

Keep in mind that the masks are computed on second order differences, so they are two elements shorter than the inputs. Elements in the masks correspond to elements in the input arrays with a one-element border on the leading and trailing edge (hence the index [1:-1] below). You can convert the mask to indices using np.nonzero or you can get the x- and y-values directly using the masks as indices:

def apply_mask(mask, x, y):
    return x[1:-1][mask], y[1:-1][mask]

to_vert_t, to_vert_v = apply_mask(to_vert_mask, time, volts)
from_vert_t, from_vert_v = apply_mask(from_vert_mask, time, volts)
to_horiz_t, to_horiz_v = apply_mask(to_horiz_mask, time, volts)
from_horiz_t, from_horiz_v = apply_mask(from_horiz_mask, time, volts)

plt.plot(time, volts, 'b-')
plt.plot(to_vert_t, to_vert_v, 'r^', label='Plot goes vertical')
plt.plot(from_vert_t, from_vert_v, 'kv', label='Plot stops being vertical')
plt.plot(to_horiz_t, to_horiz_v, 'r>', label='Plot goes horizontal')
plt.plot(from_horiz_t, from_horiz_v, 'k<', label='Plot stops being horizontal')
plt.legend()
plt.show()

Here is the resulting plot:

enter image description here

Notice that because the classification is done separately, "Point A" is correctly identified as being both a spot where verticalness starts and horizontalness ends. The problem is that "Point E" does not appear to be resolvable as such according to these criteria. Zooming in shows that all of the proliferated points correctly identify horizontal line segments:

enter image description here

You could choose a "correct" version of "Point E" by discarding from_horiz completely, and only the last value from to_horiz:

to_horiz_t, to_horiz_v = apply_mask(to_horiz_mask, time, volts)
to_horiz_t, to_horiz_v = to_horiz_t[-1], to_horiz_v[-1]

plt.plot(time, volts, 'b-')
plt.plot(*apply_mask(to_vert_mask, time, volts), 'r^', label='Plot goes vertical')
plt.plot(*apply_mask(from_vert_mask, time, volts), 'kv', label='Plot stops being vertical')
plt.plot(to_horiz_t, to_horiz_v, 'r>', label='Plot goes horizontal')
plt.legend()
plt.show()

I am using this as a showcase for the star expansion of the results of apply_mask. The resulting plot is:

enter image description here

This is pretty much exactly the plot you were looking for. Discarding from_horiz also makes "Point A" be identified only as a drop to vertical, which is nice.

As multiple values in to_horiz show, this method is very sensitive to noise within the data. Your data is quite smooth, but this approach is unlikely to ever work with raw unfiltered measurements.

Find locations on a curve where the slope changes

Tags:

python

pandas

numpy

scipy

wigging

1 Answers

Mad Physicist

Recent Activity

Donate For Us

Find locations on a curve where the slope changes

Tags:

python

pandas

numpy

scipy

wigging

1 Answers

Mad Physicist

Related questions

Recent Activity

Donate For Us