Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

If Statement True Block Executed When Condition is False

I optimized an extension method to compare two streams for equality (byte-for-byte) - knowing that this is a hot method I tried to optimize it as far as possible (the streams can reach into multi-megabyte lengths). I essentially came up with the following approach:

[StructLayout(LayoutKind.Explicit)]
struct Converter
{
    [FieldOffset(0)]
    public Byte[] Byte;

    [FieldOffset(0)]
    public UInt64[] UInt64;
}

/// <summary>
/// Compares two streams for byte-by-byte equality.
/// </summary>
/// <param name="target">The target stream.</param>
/// <param name="compareTo">The stream to compare the target to.</param>
/// <returns>A value indicating whether the two streams are identical.</returns>
public static bool CompareBytes(this Stream target, Stream compareTo)
{
    if (target == null && compareTo == null)
        return true;
    if (target == null || compareTo == null)
        return false;
    if (target.Length != compareTo.Length)
        return false;
    if (object.ReferenceEquals(target, compareTo))
        return true;
    if (!target.CanRead || !target.CanSeek)
        throw new ArgumentOutOfRangeException("target");
    if (!compareTo.CanRead || !compareTo.CanSeek)
        throw new ArgumentOutOfRangeException("target");
    lock (target)
    {
        lock (compareTo)
        {
            var origa = target.Position;
            var origb = compareTo.Position;
            try
            {
                target.Position = compareTo.Position = 0;

                // Shrink the number of comparisons.
                var arr1 = new byte[4096];
                var convert1 = new Converter() { Byte = arr1 };
                var arr2 = new byte[4096];
                var convert2 = new Converter() { Byte = arr2 };

                int len;
                while ((len = target.Read(arr1, 0, 4096)) != 0)
                {
                    if (compareTo.Read(arr2, 0, 4096) != len)
                        return false;
                    for (var i = 0; i < (len / 8) + 1; i++)
                        if (convert1.UInt64[i] != convert2.UInt64[i])
                            return false;
                }

                return true;
            }
            finally
            {
                target.Position = origa;
                compareTo.Position = origb;
            }
        }
    }
}

The problem is that the convert1.UInt64[i] != convert2.UInt64[i] if block (returning false) is being evaluated even when the values are the equal. I checked each on individually, then checked the outcome of the 'not equals'. I am in pure disbelief:

Values are not equal

I have not messed with the instruction pointer - this is how the code executed and the watch pin is live.

Any ideas how this could happen?

like image 970
Jonathan Dickinson Avatar asked Jan 20 '12 12:01

Jonathan Dickinson


2 Answers

  for (var i = 0; i < (len / 8) + 1; i++)

The debugger in general has a hard time with this union, it can't display the array content when I try it. But the core problem is no doubt the +1 in the for() end expression. That indexes the array beyond its last element when len is divisible by 8. The runtime cannot catch this mistake, overlapping the arrays causes the Length property to have a bogus value. What happens next is undefined behavior, you are reading bytes that are not part of the array. A workaround is to make the array 7 bytes longer.

This kind of code is not exactly an optimization, reading and comparing uint64 on a 32-bit machine is expensive, especially when the array isn't aligned correctly.. About 50% odds for that. A better mousetrap is to use the C runtime memcmp() function, available on any Windows machine:

    [DllImport("msvcrt.dll")]
    private static extern int memcmp(byte[] arr1, byte[] arr2, int cnt);

And use it like this:

    int len;
    while ((len = target.Read(arr1, 0, 4096)) != 0) {
        if (compareTo.Read(arr2, 0, 4096) != len) return false;
        if (memcmp(arr1, arr2, len) != 0) return false;
    }
    return true;

Do compare the perf of this with a plain for() loop that compares bytes. The ultimate throttle here is the memory bus bandwidth.

like image 53
Hans Passant Avatar answered Sep 27 '22 18:09

Hans Passant


Problems like this are commonly issues with understanding of how optimizations work. This line of code could very well be being executed because both return false clauses are combined into one set of instructions at the lower level. Other causes for issues like this is if the architecture you are on allows for conditional execution in which certain instructions are hit in the debugger but the results are never committed to registers at the architecture level.

Verify that the code works in debug mode first. Then when you are convinced the outcome is the same as the release mode, look at the underlying instructions to figure out the compiler optimization at hand.

like image 38
Mark Smith Avatar answered Sep 27 '22 19:09

Mark Smith