Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

C# performant alternatives to HashSet and Dictionary that do not use GetHashCode

I'm looking for built-in alternatives of HashSet and Dictionary objects that have better performance than lists but do not use the internal GetHashCode method. I need this because for the class I have written, there is no way of writing a GetHashCode method that fulfills the usual contract with Equals other than

public override int GetHashCode() { return 0; } // or return any other constant value

which would turn HashSet and Dictionary into ordinary lists (performance-wise).

So what I need is a set implementation and a mapping implementation. Any suggestions?

EDIT:

My class is a tolerance-based 3-dimensional vector class:

public class Vector
{
    private static const double TOL = 1E-10;
    private double x, y, z;

    public Vector(double x, double y, double z)
    {
        this.x = x; this.y = y; this.z = z;
    }

    public override bool Equals(object o)
    {
        Vector other = o as Vector;

        if (other == null)
            return false;

        return ((Math.Abs(x - other.x) <= TOL) &&
                (Math.Abs(y - other.y) <= TOL) &&
                (Math.Abs(z - other.z) <= TOL));
    }
}

Note that my Equals method is not transitive. However, in my use case I can make it "locally" transitive because at some point, I will know all vectors that I need to put into my set / mapping key set, and I also know that they will come in clusters. So when I have collected all vectors, I will choose one representative per cluster and replace all original vectors by the representative. Then Equals will be transitive among the elements of my set / mapping key set.

When I have my set or mapping, I will collect vectors from another source (for the sake of this question let's assume I'll ask a user to type in a vector). These can be any possible vector. Those will never be added to the set/mapping, but I will need to know if they are contained in the set / key set of the mapping (regarding tolerance), and I will need to know their value from the mapping.

like image 229
Kjara Avatar asked Nov 09 '22 11:11

Kjara


1 Answers

You need a data structure that supports sorting, binary search and fast insertion. Unfortunately there is no such collection in the .NET Framework. The SortedDictionary doesn't supports binary search, while the SortedList suffers from O(n) insertion for unsorted data. So you must search for a third party tool. A good candidate seems to be the TreeDictionary of C5 library. It is a red-black tree implementation that offers the important method RangeFromTo. Here is an incomplete implementation of a Dictionary that has Vectors as keys, backed internally by a C5.TreeDictionary:

public class VectorDictionary<TValue>
{
    private readonly C5.TreeDictionary<double, (Vector, TValue)> _tree =
        new C5.TreeDictionary<double, (Vector, TValue)>();

    public bool TryGetKeyValue(Vector key, out (Vector, TValue) pair)
    {
        double xyz = key.X + key.Y + key.Z;
        // Hoping that not all vectors are crowded in the same diagonal line
        var range = _tree.RangeFromTo(xyz - Vector.TOL * 3, xyz + Vector.TOL * 3);
        var equalPairs = range.Where(e => e.Value.Item1.Equals(key));
        // Selecting a vector from many "equal" vectors is tricky.
        // Some may be more equal than others. :-) Lets return the first for now.
        var selectedPair = equalPairs.FirstOrDefault().Value;
        pair = selectedPair;
        return selectedPair.Item1 != null;
    }

    public Vector GetExisting(Vector key)
    {
        return TryGetKeyValue(key, out var pair) ? pair.Item1 : default;
    }

    public bool Contains(Vector key) => TryGetKeyValue(key, out var _);

    public bool Add(Vector key, TValue value)
    {
        if (Contains(key)) return false;
        _tree.Add(key.X + key.Y + key.Z, (key, value));
        return true;
    }

    public TValue this[Vector key]
    {
        get => TryGetKeyValue(key, out var pair) ? pair.Item2 : default;
        set => _tree.Add(key.X + key.Y + key.Z, (key, value));
    }

    public int Count => _tree.Count;
}

Usage example:

var dictionary = new VectorDictionary<int>();
Console.WriteLine($"Added: {dictionary.Add(new Vector(0.5 * 1E-10, 0, 0), 1)}");
Console.WriteLine($"Added: {dictionary.Add(new Vector(0.6 * 1E-10, 0, 0), 2)}");
Console.WriteLine($"Added: {dictionary.Add(new Vector(1.6 * 1E-10, 0, 0), 3)}");
Console.WriteLine($"dictionary.Count: {dictionary.Count}");
Console.WriteLine($"dictionary.Contains: {dictionary.Contains(new Vector(2.5 * 1E-10, 0, 0))}");
Console.WriteLine($"dictionary.GetValue: {dictionary[new Vector(2.5 * 1E-10, 0, 0)]}");

Output:

Added: True
Added: False
Added: True
dictionary.Count: 2
dictionary.Contains: True
dictionary.GetValue: 3
like image 188
Theodor Zoulias Avatar answered Nov 14 '22 21:11

Theodor Zoulias