I optimized an extension method to compare two streams for equality (byte-for-byte) - knowing that this is a hot method I tried to optimize it as far as possible (the streams can reach into multi-megabyte lengths). I essentially came up with the following approach:
[StructLayout(LayoutKind.Explicit)]
struct Converter
{
[FieldOffset(0)]
public Byte[] Byte;
[FieldOffset(0)]
public UInt64[] UInt64;
}
/// <summary>
/// Compares two streams for byte-by-byte equality.
/// </summary>
/// <param name="target">The target stream.</param>
/// <param name="compareTo">The stream to compare the target to.</param>
/// <returns>A value indicating whether the two streams are identical.</returns>
public static bool CompareBytes(this Stream target, Stream compareTo)
{
if (target == null && compareTo == null)
return true;
if (target == null || compareTo == null)
return false;
if (target.Length != compareTo.Length)
return false;
if (object.ReferenceEquals(target, compareTo))
return true;
if (!target.CanRead || !target.CanSeek)
throw new ArgumentOutOfRangeException("target");
if (!compareTo.CanRead || !compareTo.CanSeek)
throw new ArgumentOutOfRangeException("target");
lock (target)
{
lock (compareTo)
{
var origa = target.Position;
var origb = compareTo.Position;
try
{
target.Position = compareTo.Position = 0;
// Shrink the number of comparisons.
var arr1 = new byte[4096];
var convert1 = new Converter() { Byte = arr1 };
var arr2 = new byte[4096];
var convert2 = new Converter() { Byte = arr2 };
int len;
while ((len = target.Read(arr1, 0, 4096)) != 0)
{
if (compareTo.Read(arr2, 0, 4096) != len)
return false;
for (var i = 0; i < (len / 8) + 1; i++)
if (convert1.UInt64[i] != convert2.UInt64[i])
return false;
}
return true;
}
finally
{
target.Position = origa;
compareTo.Position = origb;
}
}
}
}
The problem is that the convert1.UInt64[i] != convert2.UInt64[i]
if
block (returning false
) is being evaluated even when the values are the equal. I checked each on individually, then checked the outcome of the 'not equals'. I am in pure disbelief:
I have not messed with the instruction pointer - this is how the code executed and the watch pin is live.
Any ideas how this could happen?
for (var i = 0; i < (len / 8) + 1; i++)
The debugger in general has a hard time with this union, it can't display the array content when I try it. But the core problem is no doubt the +1 in the for() end expression. That indexes the array beyond its last element when len is divisible by 8. The runtime cannot catch this mistake, overlapping the arrays causes the Length property to have a bogus value. What happens next is undefined behavior, you are reading bytes that are not part of the array. A workaround is to make the array 7 bytes longer.
This kind of code is not exactly an optimization, reading and comparing uint64 on a 32-bit machine is expensive, especially when the array isn't aligned correctly.. About 50% odds for that. A better mousetrap is to use the C runtime memcmp() function, available on any Windows machine:
[DllImport("msvcrt.dll")]
private static extern int memcmp(byte[] arr1, byte[] arr2, int cnt);
And use it like this:
int len;
while ((len = target.Read(arr1, 0, 4096)) != 0) {
if (compareTo.Read(arr2, 0, 4096) != len) return false;
if (memcmp(arr1, arr2, len) != 0) return false;
}
return true;
Do compare the perf of this with a plain for() loop that compares bytes. The ultimate throttle here is the memory bus bandwidth.
Problems like this are commonly issues with understanding of how optimizations work. This line of code could very well be being executed because both return false clauses are combined into one set of instructions at the lower level. Other causes for issues like this is if the architecture you are on allows for conditional execution in which certain instructions are hit in the debugger but the results are never committed to registers at the architecture level.
Verify that the code works in debug mode first. Then when you are convinced the outcome is the same as the release mode, look at the underlying instructions to figure out the compiler optimization at hand.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With