Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Linq on 2 arrays in 1 loop?

Tags:

c#

linq

Is it possible to improve the efficiency of those linq requests? I use two different loops... Can you help me to optimize this code?

        double[] x = { 2, 3, 1, 5, 7, 2, 3 };
        double[] y = { 1, 2, 3, 4, 5, 6, 7 };            
        IEnumerable<int> range = Enumerable.Range(0, x.Length);
        double[] y_sorted = (from n in range orderby x[n] select y[n]).ToArray();
        double[] x_sorted = (from n in range orderby x[n] select x[n]).ToArray();

This code in python is like that if you prefer:

        x_index = argsort(x)
        x_sorted = [x[i] for i in x_index]
        y_sorted = [y[i] for i in x_index]

you will notice that, in this python code, i use only one sort. that's not the case of this c# code.

we should get at the end:

        x_sorted = { 1, 2, 2, 3, 3, 5, 7 }
        y_sorted = { 3, 1, 6, 2, 7, 4, 5 }

Fred

Edit: I use the program of Diadistis (after a small correction)

So here we go: Array.Sort(x, y) (0.05) is the fastest way following (0.18) by

        int[] x_index = Enumerable.Range(0, x.Length).OrderBy(i => x[i]).ToArray();
        double[] x_sorted = x_index.Select(i => x[i]).ToArray();
        double[] y_sorted = x_index.Select(i => y[i]).ToArray();

The other solutions are quite equivalent (~0.35) in time consumption on my pc.

If someone have an interesting idea, I will profile it and update this post.

like image 932
Frederic Avatar asked Jan 19 '10 12:01

Frederic


2 Answers

It's ok but I'd prefer a simpler syntax :

double[] x = { 2, 3, 1, 5, 7, 2, 3 };
double[] y = { 2, 3, 1, 5, 7, 2, 3 };

double[] x_sorted = x.OrderBy(d => d).ToArray();
double[] y_sorted = y.OrderBy(d => d).ToArray();

Edit:

Argh... I failed to spot that this was an associative array sort.

double[] x = { 2, 3, 1, 5, 7, 2, 3 };
double[] y = { 1, 2, 3, 4, 5, 6, 7 };  

double[] y_sorted = y.Clone() as double[];
double[] x_sorted = x.Clone() as double[];

Array.Sort(x_sorted, y_sorted);

Edit 2 1/2

And some performance tests :

public class Program
{
    delegate void SortMethod(double[] x, double[] y);

    private const int ARRAY_SIZE = 3000000;

    private static Random RandomNumberGenerator = new Random();

    private static double[] x = GenerateTestData(ARRAY_SIZE);
    private static double[] y = GenerateTestData(ARRAY_SIZE);

    private static double[] GenerateTestData(int count)
    {
        var data = new double[count];

        for (var i = 0; i < count; i++)
        {
            data[i] = RandomNumberGenerator.NextDouble();
        }

        return data;
    }

    private static void SortMethod1(double[] x, double[] y)
    {
        Array.Sort(x, y);
    }

    private static void SortMethod2(double[] x, double[] y)
    {
        IEnumerable<int> range = Enumerable.Range(0, x.Length);
        x = (from n in range orderby x[n] select y[n]).ToArray();
        y = (from n in range orderby x[n] select x[n]).ToArray();
    }

    private static void SortMethod3(double[] x, double[] y)
    {
        int[] x_index = 
            Enumerable.Range(0, x.Length).OrderBy(i => x[i]).ToArray();

        x = x_index.Select(i => x[i]).ToArray();
        y = x_index.Select(i => y[i]).ToArray();
    }

    private static void SortMethod4(double[] x, double[] y)
    {
        int[] range =
            Enumerable.Range(0, x.Length).OrderBy(i => x[i]).ToArray();

        var q = (
            from n in range
            orderby x[n]
            select new { First = x[n], Second = y[n] }).ToArray();

        x = q.Select(t => t.First).ToArray();
        y = q.Select(t => t.Second).ToArray();
    }

    private static void SortMethodPerformanceTest(SortMethod sortMethod)
    {
        double[] y_sorted = y.Clone() as double[];
        double[] x_sorted = x.Clone() as double[];

        var sw = new Stopwatch();

        sw.Start();
        sortMethod.Invoke(x_sorted, y_sorted);
        sw.Stop();

        Console.WriteLine(
            string.Format(
                "{0} : {1}",
                sortMethod.Method.Name,
                sw.Elapsed));
    }

    static void Main(string[] args)
    {
        Console.WriteLine("For array length : " + ARRAY_SIZE);
        Console.WriteLine("------------------------------");

        SortMethodPerformanceTest(SortMethod1);
        SortMethodPerformanceTest(SortMethod2);
        SortMethodPerformanceTest(SortMethod3);
        SortMethodPerformanceTest(SortMethod4);

        Console.WriteLine("Press any key to continue...");
        Console.ReadKey();
    }
}

And the results :

For array length : 3000000
------------------------------
SortMethod1 : 00:00:00.6088503 // Array.Sort(Array, Array)
SortMethod2 : 00:00:07.9583779 // Original
SortMethod3 : 00:00:04.5023336 // dtb's Linq Alternative
SortMethod4 : 00:00:06.6115911 // Christian's Linq Alternative
like image 91
8 revs Avatar answered Nov 13 '22 18:11

8 revs


If I read your code correctly, you're trying to sort two arrays by the elements of the first array.

The literal translation of your Python code to C# would be something like this:

int[] x_index = Enumerable.Range(0, x.Length).OrderBy(i => x[i]).ToArray();
double[] x_sorted = x_index.Select(i => x[i]).ToArray();
double[] y_sorted = x_index.Select(i => y[i]).ToArray();

Alternatively, you could zip the two arrays to an enumerable of tuples and then sort this by the first item:

var sorted = Enumerable.Zip(x, y, Tuple.Create<double, double>)
                       .OrderBy(t => t.Item1)
                       .ToArray();
like image 4
dtb Avatar answered Nov 13 '22 18:11

dtb