I know scipy curve_fit can do better

Tags:

I'm using python/numpy/scipy to implement this algorithm for aligning two digital elevation models (DEMs) based on terrain aspect and slope:

"Co-registration and bias corrections of satellite elevation data sets for quantifying glacier thickness change", C. Nuth and A. Kääb, doi:10.5194/tc-5-271-2011

I have things a framework set up, but the quality of the fit provided by scipy.optimize.curve_fit is poor.

def f(x, a, b, c):
    y = a * numpy.cos(numpy.deg2rad(b-x)) + c
    return y

def compute_offset(dh, slope, aspect):
    import scipy.optimize as optimization

    idx = random.sample(range(dh.compressed().size), 10000)
    xdata = numpy.array(aspect.compressed()[idx], float)
    ydata = numpy.array((dh/numpy.tan(numpy.deg2rad(slope))).compressed()[idx], float)

    #Generate synthetic data to test curve_fit
    #xdata = numpy.arange(0,360,0.01)
    #ydata = f(xdata, 20.0, 130.0, -3.0) + 20*numpy.random.normal(size=len(xdata))

    print xdata
    print ydata

    x0 = numpy.array([0.0, 0.0, 0.0])

    fit = optimization.curve_fit(f, xdata, ydata, x0)[0]
    #optimization.leastsq(f, x0[:], args=(xdata, ydata))
    genplot(xdata, ydata, fit)
    return fit

def genplot(x, y, fit):
    a = (numpy.arange(0,360))
    f_a = f(a, fit[0], fit[1], fit[2])
    idx = random.sample(range(x.size), 10000)
    plt.figure()
    plt.xlabel('Aspect (deg)')
    plt.ylabel('dh/tan(slope) (m)')
    plt.plot(x[idx], y[idx], 'r.')
    plt.axhline(color='k')
    plt.plot(a, f_a, 'b')
    plt.ylim(-80,80)
    plt.show()

#Input DEMs
dem1_fn = sys.argv[1]
dem2_fn = sys.argv[2]

dem1_ds = gdal.Open(dem1_fn, gdal.GA_ReadOnly)
dem2_ds = gdal.Open(dem2_fn, gdal.GA_ReadOnly)

#Extract band 1 from each dataset as masked array using internal nodata value
dem1 = getperc_new.gdal_getma(dem1_ds, 1)
dem2 = getperc_new.gdal_getma(dem2_ds, 1)

#Produce slope and aspect maps using gdaldem and load into masked arrays
dem1_slope = gdaldem_slope(dem1_fn)
dem1_aspect = gdaldem_aspect(dem1_fn)

#Compute common mask and apply to all products
common_mask = dem1.mask + dem2.mask + dem1_slope.mask + dem1_aspect.mask

diff_euler = numpy.ma.array(dem2-dem1, mask=common_mask)
dem1_slope.__setmask__(common_mask)
dem1_aspect.__setmask__(common_mask)

#Compute relationship between elevation difference, slope and aspect
fit = compute_offset(diff_euler, dem1_slope, dem1_aspect)
print fit

Here is the fit for my data, which initially consists of ~2 million points, but I've randomly sampled for testing/plotting purposes:

[ -14.9639559 216.01093596 -41.96806735]

plot_one

There is plenty of data there for a good fit, but the result from curve_fit is poor. When I run with synthetic data, I get a nice fit:

original input parameters [20.0, 130.0, -3.0]

result from curve_fit [-19.66719631 -49.6673076 -3.12198723]

plot_two

Not sure if this has something to do with using masked arrays, a limitation of curve_fit, or if I'm just overlooking something simple. Thanks for any suggestions.

==========================

Edit 9/4/13 16:30 PDT

As suggested by @Evert and others, the problem was definitely related to outliers. I was able to obtain a much better fit after removing outliers. Looking at my old code, it seems I computed the median absolute deviation for each aspect, then removed anything outside of 2*mad before fitting.

I generated a few additional plots back in November 2012:

histogram

enter image description here

But looking at these again, I'm almost positive they were generated for different input data. It's all that I can find right now, so I'm including them here as an example of a case with biased sampling. This method for DEM alignment is bound to fail for cases like these - and it has nothing to do with scipy's curve fitting abilities.

I ended up developing a different approach for alignment involving normalized cross-correlation, sub-pixel refinement, and vertical offset removal for two masked 2D numpy arrays. It is faster and consistently provides better results. Although even that approach has been superseded by an Iterative Closest Point (ICP) tool (pc_align) developed by Oleg Alexandrov as part of the NASA Ames Stereo Pipeline.

Thanks for all of your responses and I apologize for abandoning this question.

489

asked Sep 12 '12 23:09

David Shean

1 Answers

If you're just trying to get a sine wave with phase offset, you don't need a non-linear fit.

You can replace that a * sin(x - b) + c by a * sin(x) + b * cos(x) + c, because any sine with an offset can be written as an appropriate combination of a sine and a cosine("Phasor addition", like in a fourier transform).

If that gives the same result then it is not the "non-linear" fit that is the problem.

183

answered Sep 25 '22 01:09

mdaoust

Related questions
                            
                                how to pass a value to c++ from python and back?
                            
                                Python - Validation with multiple schemas using lxml
                            
                                Create and parse multipart HTTP requests in Python
                            
                                How to decode JSON to str and not unicode in Python (2.6)? [duplicate]
                            
                                Python API design pattern question
                            
                                Distributing python application
                            
                                What's the normal structure of a Python open source project and what's the preferred way of running the tests? [closed]
                            
                                How to tell Django not to buffer the HTTP POST data?
                            
                                Python, calling mmap with too-large address causes overflow exception
                            
                                Trying to record from microphone and playback in real time
                            
                                Thread Safety with Template Tags
                            
                                Still getting an error "ImportError: No module named win32com.client" after installing pywin32 lib
                            
                                Installing a python/sphinx environment on windows?
                            
                                Making nested calls to "execute" with Python's Fabric library
                            
                                Partitions of values in a fibonacci call graph (call graph is a binary tree)
                            
                                Python and mechanize login script
                            
                                Tell if PC comes out of standby using Qt
                            
                                Beginning physics simulation
                            
                                Place to initialize DB in Flask
                            
                                import cv2 works but import cv2.cv as cv not working

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

I know scipy curve_fit can do better

Tags:

python

scipy

curve-fitting

David Shean

People also ask

1 Answers

mdaoust

Recent Activity

Donate For Us