Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is this List<>.IndexOf code so much faster than the List[i] and manual compare?

I'm running AQTime on this piece of code, I found that .IndexOf takes 16% of the time vs close to 80% for the other piece... They appear to use the same IsEqual and other routines. Called 116,000 times inserting 30,000 items. None of the List<> objects gets over 200 elements. (I may be using AQTime incorrectly, I'm looking into this)

class PointD : IEquatable<PointD>
{
    public double X, Y, Z;

    bool IEquatable<PointD>.Equals(PointD other)
    {
        return ((X == other.X) && (Y == other.Y) && (Z == other.Z));
    }
}

class PerfTest
{
    readonly List<PointD> _pCoord3Points = new List<PointD>();
    public int NewPoints;
    public int TotalPoints;

    public PerfTest()
    {
        NewPoints = 0;
        TotalPoints = 0;
    }
    public int CheckPointIndexOf(PointD pt)
    {
        int retIndex = _pCoord3Points.IndexOf(pt);
        if (retIndex < 0)
        {
            _pCoord3Points.Add(pt);
            NewPoints++;
        }
        TotalPoints++;
        return retIndex;    
    }

    public int CheckPointForBreak(PointD pt)
    {
        int retIndex = -1;
        for (int i = 0; i < _pCoord3Points.Count; i++)
        {
            PointD otherPt = _pCoord3Points[i];
            if ((pt.X == otherPt.X) &&
                (pt.Y == otherPt.Y) &&
                (pt.Z == otherPt.Z))
            {
                retIndex = i;
                break;
            }
        }
        if (retIndex == -1)
        {
            NewPoints++;
            _pCoord3Points.Add(pt);
        }
        TotalPoints++;
        return retIndex;
    }

    static void Main()
    {
        const int xmax = 300;
        const int ymax = 10;
        const int zmax = 10;
        const int imax = 4;

        var test = new PerfTest();
        //test.Init();
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < imax; i++)
        {
            for (int x = 0; x < xmax; x++)
            {
                for (int y = 0; y < ymax; y++)
                {
                    for (int z = 0; z < zmax; z++)
                    {
                        var pt = new PointD { X = x, Y = y, Z = z };
                        test.CheckPointIndexOf(pt);
                    }
                }
            }

        } 
        sw.Stop();
        string output = string.Format("Total: {0:0} New: {1:0} IndexOf: ", test.TotalPoints, test.NewPoints);
        Console.Write(output);
        Console.WriteLine(sw.Elapsed);

        test = new PerfTest();
        sw = Stopwatch.StartNew();
        for (int i = 0; i < imax; i++)
        {
            for (int x = 0; x < xmax; x++)
            {
                for (int y = 0; y < ymax; y++)
                {
                    for (int z = 0; z < zmax; z++)
                    {
                        var pt = new PointD { X = x, Y = y, Z = z };
                        test.CheckPointForBreak(pt);
                    }
                }
            }

        } 
        sw.Stop();
        output = string.Format("Total: {0:0} New: {1:0} PointD[] ", test.TotalPoints, test.NewPoints);
        Console.Write(output);
        Console.WriteLine(sw.Elapsed);
        Console.ReadLine();
    }
}
like image 784
Roger Avatar asked Sep 07 '10 21:09

Roger


2 Answers

I made the following assumptions:

  • PointD is a struct
  • IndexOf is indeed slower than manually iterating the list

You can speed up IndexOf by implementing the IEquatable<T> interface:

struct PointD : IEquatable<PointD>
{
    public int X;
    public int Y;
    public int Z;

    public bool Equals(PointD other)
    {
        return (this.X == other.X) &&
                (this.Y == other.Y) &&
                (this.Z == other.Z);
    }
}

Without implementing the IEquatable<T> interface, IndexOf will compare the two PointD elements using ValueType.Equals(object other) which involves expensive boxing operations.

The documentation of the IEquatable<T> interface states:

The IEquatable<T> interface is used by generic collection objects such as Dictionary<TKey, TValue>, List<T>, and LinkedList<T> when testing for equality in such methods as Contains, IndexOf, LastIndexOf, and Remove. It should be implemented for any object that might be stored in a generic collection.

Here is my complete benchmark code:

using System;
using System.Collections.Generic;
using System.Diagnostics;

struct PointD 
{
    public int X;
    public int Y;
    public int Z;
}

class PerfTest
{
    List<PointD> _pCoord3Points = new List<PointD>();

    int checkPointIndexOf(PointD pt)
    {
        return _pCoord3Points.IndexOf(pt);  
    }

    int checkPointForBreak(PointD pt)
    {
        int retIndex = -1;
        for (int i = 0; i < _pCoord3Points.Count; i++)
        {
            PointD otherPt = _pCoord3Points[i];
            if ((pt.X == otherPt.X) &&
                (pt.Y == otherPt.Y) &&
                (pt.Z == otherPt.Z))
                retIndex = i;
            break;
        }
        return retIndex;
    }

    void init()
    {
        for (int x = 0; x < 100; x++)
        {
            for (int y = 0; y < 10; y++)
            {
                for (int z = 0; z < 10; z++)
                {
                    PointD pt = new PointD() { X = x, Y = y, Z = z };
                    _pCoord3Points.Add(pt);
                }
            }
        }
    }

    static void Main(string[] args)
    {
        PerfTest test = new PerfTest();
        test.init();
        Stopwatch sw = Stopwatch.StartNew();
        for (int x = 0; x < 100; x++)
        {
            for (int y = 0; y < 10; y++)
            {
                for (int z = 0; z < 10; z++)
                {
                    PointD pt = new PointD() { X = x, Y = y, Z = z };
                    test.checkPointIndexOf(pt);
                }
            }
        }
        sw.Stop();
        Console.WriteLine(sw.Elapsed);
        sw = Stopwatch.StartNew();
        for (int x = 0; x < 100; x++)
        {
            for (int y = 0; y < 10; y++)
            {
                for (int z = 0; z < 10; z++)
                {
                    PointD pt = new PointD() { X = x, Y = y, Z = z };
                    test.checkPointForBreak(pt);
                }
            }
        }
        sw.Stop();
        Console.WriteLine(sw.Elapsed);
    }
}

On Windows 7, x64, .NET 4.0 x64 build I get the following timings (approx):

IndexOf:  00:00:04.8096623
For Loop: 00:00:00.0014203

When turning PointD into a class the timings change to

IndexOf:  00:00:01.0703627
For Loop: 00:00:00.0014190

When using a struct PointD implementing IEquatable<PointD> I get more "similar" results, but IndexOf is still slower (there is no significant difference when using a class now):

IndexOf:  00:00:00.3904615
For Loop: 00:00:00.0015218
like image 178
Dirk Vollmar Avatar answered Oct 10 '22 21:10

Dirk Vollmar


Normally, before you access an array element, it checks to make sure that the index is >= 0 and < length -- so that you don't read or overwrite memory that doesn't belong to you. Among other things, it eliminates a slew of severe security flaws called buffer overflows.

Needless to say, that impedes performance if you have very little code within your loop. To lessen this problem, the JIT compiler optimizes for-loops of the form for (i = 0; i < array.Length; i++) { array[i]; } -- that is, any loop that accesses all the indices of an array from 0 to length - 1. It omits the bounds checking for this case. (The optimization fails if you access things like array[i + 1], for the reason that you might step over the bounds.)

Unfortunately this only works with arrays, not with List<>s. So your latter code will not be optimized.

But since a List<> internally contains an array, and List.IndexOf() uses a loop to access each value in the array directly, it can be optimized.

By the way, it's better to say for (int i = 0; i < array.Length; i++) { } than it is to say int length = array.Length; for(int i = 0; i < length; i++) { } -- because it can't be sure that length really is the length of the array.

Edit: looking at the IndexOf code using Reflector, the loop will indeed optimize, but as the other people here have mentioned, it calls Equals() -- which requires a vtable lookup and various sanity checks. So in this case, IndexOf might in fact be slower when you're not running it with a profiler.

The disassembled code:

internal virtual int IndexOf(T[] array, T value, int startIndex, int count)
{
    int num = startIndex + count;
    for (int i = startIndex; i < num; i++)
    {
        if (this.Equals(array[i], value))
        {
            return i;
        }
    }
    return -1;
}
like image 40
Rei Miyasaka Avatar answered Oct 10 '22 22:10

Rei Miyasaka