Algorithm to smooth a curve while keeping the area under it constant

Tags:

Consider a discrete curve defined by the points (x1,y1), (x2,y2), (x3,y3), ... ,(xn,yn)

Define a constant SUM = y1+y2+y3+...+yn. Say we change the value of some k number of y points (increase or decrease) such that the total sum of these changed points is less than or equal to the constant SUM.

What would be the best possible manner to adjust the other y points given the following two conditions:

The total sum of the y points (y1'+y2'+...+yn') should remain constant ie, SUM.
The curve should retain as much of its original shape as possible.

A simple solution would be to define some delta as follows:

delta = (ym1' + ym2' + ym3' + ... + ymk') - (ym1 + ym2 + ym3 + ... + ymk')

and to distribute this delta over the rest of the points equally. Here ym1' is the value of the modified point after modification and ym1 is the value of the modified point before modification to give delta as the total difference in modification.

However this would not ensure a totally smoothed curve as area near changed points would appear ragged. Does a better solution/algorithm exist for the this problem?

532

asked Dec 04 '14 04:12

Sohaib

2 Answers

I've used the following approach, though it is a bit OTT.

Consider adding d[i] to y[i], to get s[i], the smoothed value. We seek to minimise

S = Sum{ 1<=i<N-1 | sqr( s[i+1]-2*s[i]+s[i-1] } + f*Sum{ 0<=i<N | sqr( d[i])}

The first term is a sum of the squares of (an approximate) second derivative of the curve, and the second term penalises moving away from the original. f is a (positive) constant. A little algebra recasts this as

S = sqr( ||A*d - b||)

where the matrix A has a nice structure, and indeed A'*A is penta-diagonal, which means that the normal equations (ie d = Inv(A'*A)*A'*b) can be solved efficiently. Note that d is computed directly, there is no need to initialise it.

Given the solution d to this problem we can compute the solution d^ to the same problem but with the constraint One'*d = 0 (where One is the vector of all ones) like this

d^ = d - (One'*d/Q) * e
e = Inv(A'*A)*One
Q = One'*e

What value to use for f? Well a simple approach is to try out this procedure on sample curves for various fs and pick a value that looks good. Another approach is to pick a estimate of smoothness, for example the rms of the second derivative, and then a value that should attain, and then search for an f that gives that value. As a general rule, the bigger f is the less smooth the smoothed curve will be.

Some motivation for all this. The aim is to find a 'smooth' curve 'close' to a given one. For this we need a measure of smoothness (the first term in S) and a measure of closeness (the second term. Why these measures? Well, each are easy to compute, and each are quadratic in the variables (the d[]); this will mean that the problem becomes an instance of linear least squares for which there are efficient algorithms available. Moreover each term in each sum depends on nearby values of the variables, which will in turn mean that the 'inverse covariance' (A'*A) will have a banded structure and so the least squares problem can be solved efficiently. Why introduce f? Well, if we didn't have f (or set it to 0) we could minimise S by setting d[i] = -y[i], getting a perfectly smooth curve s[] = 0, which has nothing to do with the y curve. On the other hand if f is gigantic, then to minimise s we should concentrate on the second term, and set d[i] = 0, and our 'smoothed' curve is just the original. So it's reasonable to suppose that as we vary f, the corresponding solutions will vary between being very smooth but far from y (small f) and being close to y but a bit rough (large f).

It's often said that the normal equations, whose use I advocate here, are a bad way to solve least squares problems, and this is generally true. However with 'nice' banded systems -- like the one here -- the loss of stability through using the normal equations is not so great, while the gain in speed is so great. I've used this approach to smooth curves with many thousands of points in a reasonable time.

To see what A is, consider the case where we had 4 points. Then our expression for S comes down to:

sqr( s[2] - 2*s[1] + s[0]) + sqr( s[3] - 2*s[2] + s[1]) + f*(d[0]*d[0] + .. + d[3]*d[3]).

If we substitute s[i] = y[i] + d[i] in this we get, for example,

s[2] - 2*s[1] + s[0] = d[2]-2*d[1]+d[0] + y[2]-2*y[1]+y[0]

and so we see that for this to be sqr( ||A*d-b||) we should take

A = ( 1 -2  1  0)
    ( 0  1 -2  1)
    ( f  0  0  0)
    ( 0  f  0  0)
    ( 0  0  f  0)
    ( 0  0  0  f)
and
b = ( -(y[2]-2*y[1]+y[0]))
    ( -(y[3]-2*y[2]+y[1]))
    ( 0 )
    ( 0 )
    ( 0 )
    ( 0 )

In an implementation, though, you probably wouldn't want to form A and b, as they are only going to be used to form the normal equation terms, A'*A and A'*b. It would be simpler to accumulate these directly.

182

answered Nov 12 '22 00:11

dmuir

This is a constrained optimization problem. The functional to be minimized is the integrated difference of the original curve and the modified curve. The constraints are the area under the curve and the new locations of the modified points. It is not easy to write such codes on your own. It is better to use some open source optimization codes, like this one: ool.

answered Nov 11 '22 23:11

fang

Related questions
                            
                                Finding edge connectivity of a network by using Maximum Flow algorithm
                            
                                Algorithm: Distance transform - any faster algorithm?
                            
                                Is java.secure.random a sufficient choice for gambling industry?
                            
                                calculate a row of numbers(see context for details)
                            
                                Johnson's Algorithm graph explanation
                            
                                Find pairwise overlaps of intervals (segments)
                            
                                Algorithm for plotting a group of lines efficiently
                            
                                Bounded knapsack special case - small individual item weight is small compared to the number of items
                            
                                How to solve systems of XOR equations?
                            
                                Lowest value in range
                            
                                alternative to recursion based merge sort logic
                            
                                Why Kruskal clustering generates suboptimal classes?
                            
                                Given number N eliminate K digits to get maximum possible number
                            
                                Detect uniform images that (most probably) are not photographs
                            
                                Find a monotonic shortest path in a graph in O(E logV)
                            
                                Is Morton code the most efficient for higher dimensions?
                            
                                Maze (recursive division) algorithm design
                            
                                Sort array in the minimum number of moves
                            
                                Fast SSE threshold algorithm
                            
                                Knuth the art of computer programming ex 1.1.8

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Algorithm to smooth a curve while keeping the area under it constant

Tags:

algorithm

math

curve-fitting

Sohaib

People also ask

2 Answers

dmuir

fang

Recent Activity

Donate For Us