Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Unable to sort because the IComparer.Compare() method returns inconsistent results. Again

Tags:

c#

sorting

Minimal reproductible example

Before move people vote to close this question, would you please, kindly check the minimal reproducible example?

This question has been asked a thousand times, but this time it really doesn't make any sense.

I'm getting the following exception message:

System.ArgumentException
  HResult=0x80070057
  Message=Unable to sort because the IComparer.Compare() method returns inconsistent results. Either a value does not compare equal to itself, or one value repeatedly compared to another value yields different results. IComparer: 'Minotaur.GeneticAlgorithms.LexicographicalFitnessComparer'.
  Source=System.Private.CoreLib
  StackTrace:
   at System.Collections.Generic.IntrospectiveSortUtilities.ThrowOrIgnoreBadComparer(Object comparer)
   at System.Collections.Generic.ArraySortHelper`2.Sort(TKey[] keys, TValue[] values, Int32 index, Int32 length, IComparer`1 comparer)
   at System.Array.Sort[TKey,TValue](TKey[] keys, TValue[] items, IComparer`1 comparer)
   at Minotaur.FitnessReportMaker.MakeReport(Array`1 fitnesses) in C:\Source\minotaur\Minotaur\Minotaur\FitnessReportMaker.cs:line 18
   at Minotaur.Theseus.EvolutionEngine.Run(IEnumerable`1 initialPopulation) in C:\Source\minotaur\Minotaur\Minotaur\Theseus\EvolutionEngine.cs:line 63
   at Minotaur.Program.Run(ProgramSettings settings) in C:\Source\minotaur\Minotaur\Minotaur\Program.cs:line 148
   at Minotaur.ProgramSettings.OnExecute() in C:\Source\minotaur\Minotaur\Minotaur\ProgramSettings.cs:line 14

When I call this method:

Array.Sort(
    keys: sortedFitnesses,
    items: sortedFitnesses,
    comparer: new LexicographicalFitnessComparer());

And this is my IComparer<Fitness> implementation:

namespace Minotaur.GeneticAlgorithms {
    using System;
    using System.Collections.Generic;

    public sealed class LexicographicalFitnessComparer: IComparer<Fitness> {

        public int Compare(Fitness lhs, Fitness rhs) {
            var a = CompareImpl(lhs, lhs);
            if (a != 0)
                throw new InvalidOperationException();

            var b = CompareImpl(rhs, rhs);
            if (b != 0)
                throw new InvalidOperationException();

            var c = CompareImpl(lhs, rhs);
            var d = CompareImpl(rhs, lhs);
            if (c != -d)
                throw new InvalidOperationException();

            return c;
        }

        public int CompareImpl(Fitness lhs, Fitness rhs) {
            if (lhs is null)
                throw new ArgumentNullException(nameof(lhs));
            if (rhs is null)
                throw new ArgumentNullException(nameof(rhs));
            if (lhs.Count != rhs.Count)
                throw new ArgumentException(nameof(lhs) + " and " + nameof(rhs) + " must have the same length.");

            for (int i = 0; i < lhs.Count; i++) {
                if (lhs[i] < rhs[i])
                    return -1;
                else if (lhs[i] > rhs[i])
                    return 1;
            }

            return 0;
        }
    }
}

As you can see, the Compare method actually calls CompareImpl and performs many tests with the results. No InvalidOperationException is ever thrown. So I have no idea what is going on...

And for completeness sake, here is my Fitness implementation:

namespace Minotaur.GeneticAlgorithms {
    using System;
    using System.Collections;
    using System.Collections.Generic;
    using System.Linq;
    using Minotaur.ExtensionMethods.SystemArray;
    using Newtonsoft.Json;

    [JsonObject(MemberSerialization.OptIn)]
    public sealed class Fitness: IEquatable<Fitness>, IReadOnlyList<float> {

        [JsonProperty] private readonly float[] _objectives;

        private readonly int _precomputedHashCode;

        [JsonConstructor]
        private Fitness(float[] objectives) {
            for (int i = 0; i < objectives.Length; i++) {
                if (float.IsNaN(objectives[i]))
                    throw new ArgumentException(nameof(objectives) + " can't contain NaNs.");
            }

            _objectives = objectives;
            Count = objectives.Length;

            var hash = new HashCode();
            for (int i = 0; i < _objectives.Length; i++)
                hash.Add(_objectives[i]);

            _precomputedHashCode = hash.ToHashCode();
        }

        public static Fitness WrapAndCopy(float[] objectives) {
            if (objectives == null)
                throw new ArgumentNullException(nameof(objectives));
            if (objectives.Length == 0)
                throw new ArgumentException(nameof(objectives) + " can't be empty");

            return new Fitness(objectives.ToArray());
        }

        public float this[int index] => _objectives[index];

        public int Count { get; }

        public override string ToString() => "[" + string.Join(", ", _objectives) + "]";

        public override int GetHashCode() => _precomputedHashCode;

        public override bool Equals(object obj) => Equals(obj as Fitness);

        public bool Equals(Fitness other) {
            if (other == null)
                return false;

            if (object.ReferenceEquals(this, other))
                return true;

            // Again, fitnesses should all have the same length
            // finding one with different length indicates a critical error
            if (Count != other.Count)
                throw new InvalidOperationException($"Fitness should ALWAYS have the same {nameof(Count)}");

            var lhs = _objectives;
            var rhs = other._objectives;

            for (int i = 0; i < lhs.Length; i++) {
                if (lhs[i] != rhs[i])
                    return false;
            }

            return true;
        }

        public IEnumerator<float> GetEnumerator() => _objectives.GetGenericEnumerator();

        IEnumerator IEnumerable.GetEnumerator() => _objectives.GetEnumerator();
    }
}

As you can see, Fitness is immutable and doesn't allow NaNs. The array being sorted is a local variable (hence is not being modified by other threadds) and contains no nulls.

Is this a framework bug?

Edit: the target framework is .NET Core 2.2. The project is being built for x64 on Windows.

Sample input:

{Minotaur.GeneticAlgorithms.Fitness[50]}
{[0.4032445, 144]}
{[0.3778533, 144]}
{[0.1739699, 144]}
{[0.3778533, 144]}
{[0.4032445, 144]}
{[0.1962067, 144]}
{[0.2236756, 144]}
{[0.376882, 144]}
{[0.275862, 144]}
{[0.3972802, 144]}
{[0.2236756, 144]}
{[0.1962067, 144]}
{[0.376882, 144]}
{[0.2236756, 144]}
{[0.4032445, 144]}
{[0.3684821, 144]}
{[0.3603691, 144]}
{[0.4032445, 144]}
{[0.3113146, 144]}
{[0.3176299, 144]}
{[0.2236756, 144]}
{[0.3972802, 144]}
{[0.4325995, 144]}
{[0.275862, 144]}
{[0.2972316, 144]}
{[0.376882, 144]}
{[0.3603691, 144]}
{[0.275862, 144]}
{[0.2236756, 144]}
{[0.2040549, 144]}
{[0.4032445, 144]}
{[0.3113146, 144]}
{[0.2040549, 144]}
{[0.2236756, 144]}
{[0.275862, 144]}
{[0.4032445, 144]}
{[0.3113146, 144]}
{[0.3113146, 144]}
{[0.3176299, 144]}
{[0.3118019, 144]}
{[0.3778533, 144]}
{[0.4032445, 144]}
{[0.3127732, 144]}
{[0.3176299, 144]}
{[0.3603691, 144]}
{[0.275862, 144]}
{[0.2236756, 144]}
{[0.376882, 144]}
{[0.3176299, 144]}
{[0.3603691, 144]}
like image 426
Trauer Avatar asked Sep 10 '19 16:09

Trauer


2 Answers

The problem goes away when you do not pass the same array as keys and items to Sort.

Array.Sort(sortedFitnesses, new LexicographicalFitnessComparer());

If you pass the same array as both keys and items, because the sort algorithm will try to rearrange both arrays at the same time, it will get confused.

For example: If the alogorithm decides, it has to exchange elements at positions 3 and 5, it will first exchange elements at positions 3 and 5 in the keys array and then exchange elements at positions 3 and 5 in the items array. If both are the same array reference, every exchange done by the algorithm is immediately undone again.

Since you only have one array, you do not need to specify it as both keys and items.

The sort would also work when making clones of the arrays first.

The documentation explains:

Each key in the keys Array has a corresponding item in the items Array. When a key is repositioned during the sorting, the corresponding item in the items Array is similarly repositioned. Therefore, the items Array is sorted according to the arrangement of the corresponding keys in the keys Array.

I think that Microsoft should enhance the documentation and specifically mention that you cannot use the same array for keys and items. They could also easily check this in the implementation.

like image 141
NineBerry Avatar answered Nov 18 '22 04:11

NineBerry


UPDATE: Upon receiving a minimal reproducible example and running it, it is obvious what is wrong. The comparison is fine, and I've deleted my answer explaining how to find problems with the comparison. (Though to be fair, the bad call site was in the original post; I did not read it carefully enough!)

The real problem is that the original poster is using the version of sort which takes an array of keys to be sorted and a second array to be sorted using the same swaps as the key array.

So what happens? Let's say two keys are swapped because they are out of order. Then the corresponding array elements in the other array are swapped. But it is the same array, so we just swapped them right back, and now the keys are in the wrong order.

Why does this then exhibit as defect in the comparison? Because a reasonable sanity check is:

  • if the elements are out of order in the key array
  • swap the elements in the key array
  • swap the corresponding elements in the other array
  • check that the key elements are now in order

but of course they are not! They are back in their incorrect order.

The vast majority of time this exception happens it is because the comparison algorithm itself is bad.

Normally the key array and the other array are not even the same type. It would never have occurred to me that the two arrays might be the same array, and it evidently did not occur to the framework authors either, as there is no check for it. There certainly could be.

Likely the reason they never thought of it is because this is a very strange thing to do. If you want to sort an array where the element being sorted is also the sort key, you just sort the array. Just call Sort(items, comparison). Only use the version that takes a key array if the keys are sorted differently than the elements.

like image 21
Eric Lippert Avatar answered Nov 18 '22 03:11

Eric Lippert