Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fit a monotonic curve (preferably in python)

I'm trying to fit a monotonic curve to some nearly-monotonic data. (The X values are monotonic, and the Y values should be monotonic but the noise is often bigger than the change in the underlying value from point to point.) Here's a summary of what I'm doing at the moment:

def goodness_of_fit(Xfit):
    assert(is_sorted(Xfit))
    # ( Calculate the area between the fit line and the join-the-dots line from the data )

scipy.optimize.minimize(goodness_of_fit, x0=numpy.linspace(xmin, xmax))

I can't find a way to get the optimisation algorithm to keep the Xfit array sorted - does anyone have any suggestions? (The size of the array is too large for it to be feasible to create N-1 individual ordering constraints and use the constrained optimisation functions.) I'm willing to use a different language other than Python if the best solution is only available in that language.

(N.B. I am indeed fitting the X values, not the Y values - this is because I eventually want to plot the dX/dY curve and have it not blow up to ridiculous values as it does if I plot it from the raw data. However, if it's much easier to fit the Y values on fixed X values I could do that instead.)

like image 425
user1915639 Avatar asked Oct 21 '22 21:10

user1915639


2 Answers

How about creating a new array from Xfit that is a strictly-monotonic subset and then fitting the curve to that? Something like:

Xfit      = np.hstack(((-1,-1,0,1,1),np.arange(1,10),(-7,9,10,10)))
len_mono  = 0
Xfit_mono = zeros(Xfit.size)
Xfit_mono_ind = zeros(Xfit.size)
Xfit_mono[len_mono] = Xfit[0]

for(iX, x) in enumerate(Xfit):
    if iX > 0:
        if x > Xfit_mono[len_mono]:
            len_mono = len_mono + 1
            Xfit_mono[len_mono] = x
            Xfit_mono_ind[len_mono] = iX 
Xfit_mono_ind = Xfit_mono_ind[:len_mono+1]
Xfit_mono = Xfit_mono[:len_mono+1]
print(Xfit_mono_ind)
print(Xfit_mono)

Then you can use Xfit_mono in your curve fitting instead, and when you want to select the associated y values, you can use the values of Xfit_mono_ind as the indices of y.

Update replaces:

if x > Xfit[iX-1]:

with:

if x > Xfit_mono[len_mono]:

For weakly-increasing the old solution works. However, the old result gave the wrong answer if it is ever decreasing. This update gives the desired result in both cases .

like image 77
BKay Avatar answered Oct 31 '22 12:10

BKay


In the end I did it by fitting the differences between successive X values rather than the values themselves - I can then use a simple lower bound of zero.

def fg(X):
    # return the objective function and its Jacobian

def adjusted_fg(X_diff):
    X = cumsum(X_diff)
    score, jac = fg(X)
    jac[1:] = np.diff(jac)
    return score, jac

X0_diff[1:] = [X0[0]] + np.diff(X0)
bounds = [(None, None)] + [(0, None) for i in range(len(X0)-1)]

scipy.optimize.minimize(adjusted_fg, X0_diff, method='L-BFGS-B', bounds=bounds)

For large N, this tends to be unstable and/or get stuck in a local minimum if not started close to the solution, so I'm going to try fitting for a smaller N (e.g. N/10) first, then interpolating to get a better X0 for the large N fit.

(N.B. For my problem I actually want strict ordering, so I used a positive lower bound rather than 0)

like image 26
user1915639 Avatar answered Oct 31 '22 11:10

user1915639