I have two arrays and I want to know if they contain the same items. Equals(object obj)
doesn't work because an array is a reference type. I have posted my attempt below, but since I'm sure this is a common task I'd like to know if there is a better test.
public bool ContainsEquivalentSequence<T>(T[] array1, T[] array2)
{
bool a1IsNullOrEmpty = ReferenceEquals(array1, null) || array1.Length == 0;
bool a2IsNullOrEmpty = ReferenceEquals(array2, null) || array2.Length == 0;
if (a1IsNullOrEmpty) return a2IsNullOrEmpty;
if (a2IsNullOrEmpty || array1.Length != array2.Length) return false;
for (int i = 0; i < array1.Length; i++)
if (!Equals(array1[i], array2[i]))
return false;
return true;
}
I reflected the source and it does not compare the length prior to executing the loop. This makes sense since the method is designed generally for an IEnumerable<T>
, not for a T[]
.
public static bool SequenceEqual<TSource>(this IEnumerable<TSource> first, IEnumerable<TSource> second, IEqualityComparer<TSource> comparer)
{
if (comparer == null)
{
comparer = EqualityComparer<TSource>.Default;
}
if (first == null)
{
throw Error.ArgumentNull("first");
}
if (second == null)
{
throw Error.ArgumentNull("second");
}
using (IEnumerator<TSource> enumerator = first.GetEnumerator())
{
using (IEnumerator<TSource> enumerator2 = second.GetEnumerator())
{
while (enumerator.MoveNext())
{
if (!enumerator2.MoveNext() || !comparer.Equals(enumerator.Current, enumerator2.Current))
{
return false;
}
}
if (enumerator2.MoveNext())
{
return false;
}
}
}
return true;
}
I've done some tests using Any
, Contains
, All
and SequenceEqual
then I picked the best 3.
There are different results for different inputs...
Two identical arrays of size 100: SequenceEqual
was faster
[ SequenceEqual: 00:00:00.027 ]*
[ ContainsEqSeq: 00:00:00.046 ]
[ Parallel: 00:00:00.281 ]
Two identical arrays of size 1000: SequenceEqual
was faster
[ SequenceEqual: 00:00:00.240 ]*
[ ContainsEqSeq: 00:00:00.361 ]
[ Parallel: 00:00:00.491 ]
Two identical arrays of size 10000: Parallel
was faster
[ SequenceEqual: 00:00:02.357 ]
[ ContainsEqSeq: 00:00:03.341 ]
[ Parallel: 00:00:01.688 ]*
Two identical arrays of size 50000: Parallel
kick ass
[ SequenceEqual: 00:00:11.824 ]
[ ContainsEqSeq: 00:00:17.206 ]
[ Parallel: 00:00:06.811 ]*
Two arrays with one difference at position 200: SequenceEqual
was faster
[ SequenceEqual: 00:00:00.050 ]*
[ ContainsEqSeq: 00:00:00.075 ]
[ Parallel: 00:00:00.332 ]
Two arrays with one difference at position 0: ContainsEqSeq
and SequenceEqual
were faster
[ SequenceEqual: 00:00:00.002 ]*
[ ContainsEqSeq: 00:00:00.001 ]*
[ Parallel: 00:00:00.211 ]
Two arrays with one difference at position 999: SequenceEqual
was faster
[ SequenceEqual: 00:00:00.237 ]*
[ ContainsEqSeq: 00:00:00.330 ]
[ Parallel: 00:00:00.691 ]
Two arrays with one difference at position 9999: Parallel
kick ass
[ SequenceEqual: 00:00:02.386 ]
[ ContainsEqSeq: 00:00:03.417 ]
[ Parallel: 00:00:01.614 ]*
The code for SequenceEqual
is
a1.SequenceEqual(a2)
The code for ContainsEqSeq
is your method.
The code for Parallel
is
bool a1IsNullOrEmpty = ReferenceEquals(a1, null) || a1.Length == 0;
bool a2IsNullOrEmpty = ReferenceEquals(a2, null) || a2.Length == 0;
if (a1IsNullOrEmpty) return a2IsNullOrEmpty;
if (a2IsNullOrEmpty || a1.Length != a2.Length) return false;
var areEqual = true;
Parallel.ForEach(a1,
(i, s, x) =>
{
if (a1[x] != a2[x])
{
areEqual = false;
s.Stop();
}
});
return areEqual;
I would say that the best one depends on what your input will be.
If you will work with huge arrays (like 10000+) I would say Parallel
is the best choice, it only loses when there is a difference on the beginning.
For other cases SequenceEqual
might be the best one, I only tested with int[]
, but I believe it can be fast with complex types as well.
But remember, results will vary accordingly to the input.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With