I have several ordered lists of X/Y pairs and I want to calculate an ordered list of X/Y pairs representing the average of these lists. All these lists (including the "average list") will then be drawn onto a chart (see example picture below). I have several problems:

<ol> <li>The different lists don't have the same number of values</li> <li>The X and Y values can increase, then decrease, then increase again (and so on) (see example picture below)</li> </ol>

I need to implement this in C#, although I guess that's not really important for the algorithm itself.

<img src="https://i.stack.imgur.com/hnT2Z.png" alt="Example of Lines">

Sorry that I can't explain my problem in a more formal or mathematical way.

EDIT: I replaced the term "function" with "List of X/Y Pairs", which is less confusing.


calculate average function of several functions

3 Answers

I would use the method Justin proposes, with one adjustment. He suggests using a mapping table with fractional indices; I would suggest integer indices instead. This might sound a little mathematical, and there's no shame in having to read the following twice (I'd have to as well). Suppose that for the point at index i in a list of pairs A, the closest point in another list B has been found at index j. To find the closest point in B to A[i+1], you should only consider points in B with an index equal to or larger than j. It will probably be j + 1, but it could be j, j + 2, j + 3 and so on, and never below j. Even if the point closest to A[i+1] has an index smaller than j, you still shouldn't interpolate with that point, since that would make the average jump backwards along the curve and result in an unexpected graph. I'll take a moment now to create some sample code for you. I hope you see that this optimization makes sense.

EDIT: While implementing this, I realised that j is not only bounded from below (by the method described above), but also bounded from above. When you try the distance from A[i+1] to B[j], B[j+1], B[j+2] etc., you can stop comparing as soon as the distance from A[i+1] to B[j+...] stops decreasing. There's no point in searching further in B. The same reasoning applies as when j was bounded from below: even if some point elsewhere in B were closer, that's probably not the point you want to interpolate with. Doing so would result in an unexpected graph, probably less smooth than you'd expect. An extra bonus of this second bound is improved performance. I've created the following code:

IEnumerable<Tuple<double, double>> Average(List<Tuple<double, double>> A, List<Tuple<double, double>> B)
{
    if (A == null || B == null || A.Any(p => p == null) || B.Any(p => p == null)) throw new ArgumentException();
    Func<double, double> square = d => d * d;//squares its argument
    Func<int, int, double> euclidianDistance = (a, b) => Math.Sqrt(square(A[a].Item1 - B[b].Item1) + square(A[a].Item2 - B[b].Item2));//computes the distance from A[first argument] to B[second argument]

    int previousIndexInB = 0;
    for (int i = 0; i < A.Count; i++)
    {
        double distance = euclidianDistance(i, previousIndexInB);//distance between A[i] and B[j - 1], initially 
        for (int j = previousIndexInB + 1; j < B.Count; j++)
        {
            var distance2 = euclidianDistance(i, j);//distance between A[i] and B[j]
            if (distance2 < distance)//if it's closer than the previously checked point, keep searching. Otherwise stop the search and return an interpolated point.
            {
                distance = distance2;
                previousIndexInB = j;
            }
            else
            {
                break;//don't place the yield return statement here, because that could go wrong at the end of B.
            }
        }
        yield return LinearInterpolation(A[i], B[previousIndexInB]);
    }
}
Tuple<double, double> LinearInterpolation(Tuple<double, double> a, Tuple<double, double> b)
{
    return new Tuple<double, double>((a.Item1 + b.Item1) / 2, (a.Item2 + b.Item2) / 2);
}

For your information, the function Average returns the same number of interpolated points as the list A contains, which is probably fine, but you should think about this for your specific application. I've added some comments to clarify some details, and I've described all aspects of this code in the text above. I hope it's clear; otherwise, feel free to ask questions.

SECOND EDIT: I misread and thought you had only two lists of points. I have created a generalised version of the function above that accepts multiple lists. It still uses only the principles explained above.

IEnumerable<Tuple<double, double>> Average(List<List<Tuple<double, double>>> data)
{
    if (data == null || data.Count < 2 || data.Any(list => list == null || list.Any(p => p == null))) throw new ArgumentException();
    Func<double, double> square = d => d * d;
    Func<Tuple<double, double>, Tuple<double, double>, double> euclidianDistance = (a, b) => Math.Sqrt(square(a.Item1 - b.Item1) + square(a.Item2 - b.Item2));

    var firstList = data[0];
    //the indices of the points which were closest to the previous point firstList[i - 1] (or zero if i == 0).
    //This is kept track of per list, except the first list. It is declared outside the loop so that the lower bound carries over from one i to the next.
    int[] previousIndices = new int[data.Count];
    for (int i = 0; i < firstList.Count; i++)
    {
        var closests = new Tuple<double, double>[data.Count];//an array of points used for caching, of which the average will be yielded.
        closests[0] = firstList[i];
        for (int listIndex = 1; listIndex < data.Count; listIndex++)
        {
            var list = data[listIndex];
            double distance = euclidianDistance(firstList[i], list[previousIndices[listIndex]]);
            for (int j = previousIndices[listIndex] + 1; j < list.Count; j++)
            {
                var distance2 = euclidianDistance(firstList[i], list[j]);
                if (distance2 < distance)//if it's closer than the previously checked point, keep searching. Otherwise stop the search and return an interpolated point.
                {
                    distance = distance2;
                    previousIndices[listIndex] = j;
                }
                else
                {
                    break;
                }
            }
            closests[listIndex] = list[previousIndices[listIndex]];
        }
        yield return new Tuple<double, double>(closests.Select(p => p.Item1).Average(), closests.Select(p => p.Item2).Average());
    }
}

Actually, doing the specific case for 2 lists separately might have been a good thing: it is easily explained and offers a step towards understanding the generalised version. Furthermore, the square root could be taken out, since it doesn't change the order of the distances when they are compared, just their lengths.
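The remark about dropping the square root can be illustrated with a small sketch. The names `NearestSearchSketch`, `SquaredDistance` and `NearestIndexFrom` are hypothetical and not part of the code above; the sketch assumes a non-empty list and uses the same forward-only search that stops once the distance no longer decreases.

```csharp
using System;
using System.Collections.Generic;

static class NearestSearchSketch
{
    // Squared Euclidean distance. It grows and shrinks together with the true
    // distance, so comparing squared values picks the same nearest point.
    public static double SquaredDistance(Tuple<double, double> a, Tuple<double, double> b)
    {
        double dx = a.Item1 - b.Item1;
        double dy = a.Item2 - b.Item2;
        return dx * dx + dy * dy;
    }

    // Searches forward from startIndex (the lower bound) and stops as soon as
    // the distance stops decreasing (the upper bound). Assumes list is non-empty.
    public static int NearestIndexFrom(
        Tuple<double, double> p, IReadOnlyList<Tuple<double, double>> list, int startIndex)
    {
        int best = startIndex;
        double bestDistance = SquaredDistance(p, list[startIndex]);
        for (int j = startIndex + 1; j < list.Count; j++)
        {
            double d = SquaredDistance(p, list[j]);
            if (d >= bestDistance) break; // no longer getting closer
            bestDistance = d;
            best = j;
        }
        return best;
    }
}
```

The returned index would be passed back in as `startIndex` for the next point, which is what keeps the search from backtracking.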

THIRD EDIT: In the comments it became clear there might be a bug. I think there are none, aside from the mentioned small bug, which shouldn't make any difference except at the end of the graphs. As proof that it actually works, this is the result (the dotted line is the average): <img src="https://i.stack.imgur.com/0WaLq.png" alt="enter image description here">


answered Sep 30 '22 19:09

JBSnorro

I'll use a metaphor of your functions being cars racing down a curvy racetrack, where you want to extract the center-line of the track given the cars' positions. Each car's position can be described as a function of time:

p1(t) = (x1(t), y1(t))
p2(t) = (x2(t), y2(t))
p3(t) = (x3(t), y3(t))

The crucial problem is that the cars are racing at different speeds, which means that p1(10) could be twice as far down the race track as p2(10). If you took a naive average of these two points, and there was a sharp curve in the track between the cars, the average may be far from the track.

If you could just transform your functions to no longer be a function of time, but a function of the distance along the track, then you would be able to do what you want.
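One possible way to make each path a function of distance along the track is to parameterise it by cumulative chord length and sample every path at the same fractions of its total length. This is a sketch of that resampling idea, not the method described in the rest of this answer. It assumes every path is non-empty and covers roughly the same stretch of track; `ArcLengthSketch`, `PointAtFraction` and `AverageByArcLength` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ArcLengthSketch
{
    // Returns the point at fraction s (0..1) of the total polyline length of path.
    // The cumulative lengths are recomputed on every call to keep the sketch short.
    public static Tuple<double, double> PointAtFraction(IReadOnlyList<Tuple<double, double>> path, double s)
    {
        // Cumulative chord length up to each vertex.
        var cumulative = new double[path.Count];
        for (int i = 1; i < path.Count; i++)
        {
            double dx = path[i].Item1 - path[i - 1].Item1;
            double dy = path[i].Item2 - path[i - 1].Item2;
            cumulative[i] = cumulative[i - 1] + Math.Sqrt(dx * dx + dy * dy);
        }
        double target = s * cumulative[path.Count - 1];

        // Find the segment containing the target length and interpolate within it.
        for (int i = 1; i < path.Count; i++)
        {
            if (target <= cumulative[i])
            {
                double segment = cumulative[i] - cumulative[i - 1];
                double t = segment == 0 ? 0 : (target - cumulative[i - 1]) / segment;
                return Tuple.Create(
                    path[i - 1].Item1 + t * (path[i].Item1 - path[i - 1].Item1),
                    path[i - 1].Item2 + t * (path[i].Item2 - path[i - 1].Item2));
            }
        }
        return path[path.Count - 1];
    }

    // Averages all paths at sampleCount evenly spaced fractions of their own length.
    public static List<Tuple<double, double>> AverageByArcLength(
        IEnumerable<IReadOnlyList<Tuple<double, double>>> paths, int sampleCount)
    {
        var result = new List<Tuple<double, double>>();
        for (int k = 0; k < sampleCount; k++)
        {
            double s = sampleCount == 1 ? 0 : (double)k / (sampleCount - 1);
            var points = paths.Select(p => PointAtFraction(p, s)).ToList();
            result.Add(Tuple.Create(points.Average(p => p.Item1), points.Average(p => p.Item2)));
        }
        return result;
    }
}
```

Because each path is sampled by fraction of its own length, the number of samples per path no longer matters. The assumption that all paths start and end in roughly the same places matters a great deal, though.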

One way you could do this would be to choose the slowest car (i.e., the one with the greatest number of samples). Then, for each sample of the slowest car's position, look at all of the other cars' paths, find the two closest points, and choose the point on the interpolated line which is closest to the slowest car's position. Then average these points together. Once you do this for all of the slow car's samples, you have an average path.
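The step "choose the point on the interpolated line which is closest" can be sketched as a projection onto the segment between the two neighbouring samples. This is a minimal illustration, not a full implementation; `SegmentProjectionSketch` and `ClosestPointOnSegment` are hypothetical names.

```csharp
using System;

static class SegmentProjectionSketch
{
    // Returns the point on the segment a-b that is closest to p.
    public static Tuple<double, double> ClosestPointOnSegment(
        Tuple<double, double> p, Tuple<double, double> a, Tuple<double, double> b)
    {
        double abx = b.Item1 - a.Item1;
        double aby = b.Item2 - a.Item2;
        double lengthSquared = abx * abx + aby * aby;
        if (lengthSquared == 0) return a; // a and b coincide

        // Projection parameter of p onto the line through a and b,
        // clamped to [0, 1] so the result stays on the segment.
        double t = ((p.Item1 - a.Item1) * abx + (p.Item2 - a.Item2) * aby) / lengthSquared;
        t = Math.Max(0, Math.Min(1, t));
        return Tuple.Create(a.Item1 + t * abx, a.Item2 + t * aby);
    }
}
```

For example, projecting (1, 1) onto the segment from (0, 0) to (2, 0) gives (1, 0), while (5, 1) is clamped to the end point (2, 0).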

I'm assuming that all of the cars start and end in roughly the same places; if any of the cars just race a small portion of the track, you will need to add some more logic to detect that.

A possible improvement (for both performance and accuracy) is to keep track of the most recent sample you are using for each car and the speed of each car (the relative sampling rate). For your slowest car, it would be a simple map: 1 => 1, 2 => 2, 3 => 3, ... For the other cars, though, it could be more like: 1 => 0.3, 2 => 0.7, 3 => 1.6 (fractional values are due to interpolation). The speed would be the inverse of the change in sample number (e.g., the slow car would have speed 1, and the other car would have speed 1/(1.6-0.7)=1.11). You could then ensure that you don't accidentally backtrack on any of the cars. You could also improve the calculation speed because you don't have to search through the whole set of all points on each path; instead, you can assume that the next sample will be somewhere close to the current sample plus 1/speed.

answered Sep 30 '22 17:09

Justin

As these are not y=f(x) functions, are they perhaps something like (x,y)=f(t)?

If so, you could interpolate along t, and calculate avg(x) and avg(y) for each t.

EDIT This of course assumes that t can be made available to your code - so that you have an ordered list of T/X/Y triples.
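Assuming such T/X/Y triples are available, the idea could be sketched as follows. Each path is assumed to be non-empty and sorted by ascending T, and times outside a path's range are clamped to its first or last point; `TimeInterpolationSketch`, `SampleAt` and `AverageAt` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class TimeInterpolationSketch
{
    // Linearly interpolates (x, y) at time t. Each triple is (T, X, Y).
    public static Tuple<double, double> SampleAt(
        IReadOnlyList<Tuple<double, double, double>> path, double t)
    {
        // Before the first sample: clamp to the first point.
        if (t <= path[0].Item1) return Tuple.Create(path[0].Item2, path[0].Item3);

        for (int i = 1; i < path.Count; i++)
        {
            if (t <= path[i].Item1)
            {
                var a = path[i - 1];
                var b = path[i];
                // Here a.T < t <= b.T, so the denominator is never zero.
                double f = (t - a.Item1) / (b.Item1 - a.Item1);
                return Tuple.Create(
                    a.Item2 + f * (b.Item2 - a.Item2),
                    a.Item3 + f * (b.Item3 - a.Item3));
            }
        }

        // After the last sample: clamp to the last point.
        var last = path[path.Count - 1];
        return Tuple.Create(last.Item2, last.Item3);
    }

    // Averages x and y over all paths at the same time t.
    public static Tuple<double, double> AverageAt(
        IEnumerable<IReadOnlyList<Tuple<double, double, double>>> paths, double t)
    {
        var samples = paths.Select(p => SampleAt(p, t)).ToList();
        return Tuple.Create(samples.Average(s => s.Item1), samples.Average(s => s.Item2));
    }
}
```

Calling `AverageAt` for a series of t values then yields the average list, independent of how many samples each input list has.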

answered Sep 30 '22 19:09

Gavi Lock

Tags:

c#

algorithm

Preli
