
'4' and '４' clash in primary key but not in filesystem

There is a DataTable with a primary key that stores information about files. Two of the files have names that differ only in the characters '4' and '４' (U+FF14, "Fullwidth Digit Four"). The DataTable refuses to hold both because the uniqueness check fails. In the Windows filesystem, however, they coexist without any issues.

The behavior does not seem to depend on locale settings. I changed "Region & Language -> Formats -> Format" from English to Japanese, and also changed "Language for non-Unicode programs". The locale was printed as "jp-JP" and "en-GB"; the result was always the same.

Questions:

  1. What would be the least intrusive way to fix it? I could switch to using containers instead of System.Data.*, but I'd like to avoid that. Is it possible to define a custom comparer for the column, or otherwise check uniqueness more precisely? Enabling case sensitivity (which would fix this case) would cause other issues.
  2. Is there any chance that some global setting would fix it without rebuilding the software?

A demo program that reproduces the failure:

using System;
using System.Data;

namespace DataTableUniqueness
{
    class Program
    {
        static void Main(string[] args)
        {
            var changes = new DataTable("Rows");

            var column = new DataColumn { DataType = typeof(string), ColumnName = "File" };
            changes.Columns.Add(column);
            var primKey = new DataColumn[1];
            primKey[0] = column;
            changes.PrimaryKey = primKey;

            changes.Rows.Add("4.txt");
            try
            {
                changes.Rows.Add("４.txt"); // '４' is U+FF14 (fullwidth four); throws the exception
            }
            catch (Exception e)
            {
                Console.WriteLine("Exception: {0}", e);
            }
        }
    }
}

The exception:

Exception: System.Data.ConstraintException: Column 'File' is constrained to be unique.  Value '４.txt' is already present.
   at System.Data.UniqueConstraint.CheckConstraint(DataRow row, DataRowAction action)
   at System.Data.DataTable.RaiseRowChanging(DataRowChangeEventArgs args, DataRow eRow, DataRowAction eAction, Boolean fireEvent)
   at System.Data.DataTable.SetNewRecordWorker(DataRow row, Int32 proposedRecord, DataRowAction action, Boolean isInMerge, Boolean suppressEnsurePropertyChanged, Int32 position, Boolean fireEvent, Exception& deferredException)
   at System.Data.DataTable.InsertRow(DataRow row, Int64 proposedID, Int32 pos, Boolean fireEvent)
   at System.Data.DataRowCollection.Add(Object[] values)

PS: The locale is seen as: (screenshot not preserved)

asked May 16 '18 12:05 by max630


1 Answer

Using DataType = typeof(object) "disables" the string normalization that the DataTable applies to string columns. String equality is still used for comparison. I don't know whether there are other side effects.
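A minimal sketch of that first option, reusing the table layout from the question. The class name ObjectColumnDemo and the helper AcceptsBoth are made up for illustration. As far as I can tell, a string column is compared with width-insensitive options unless CaseSensitive is set, which would explain the clash; treat that as an assumption rather than documented behavior.

```csharp
using System;
using System.Data;

static class ObjectColumnDemo
{
    // Hypothetical helper: builds a one-column table keyed on "File" and
    // reports whether both file names are accepted as distinct keys.
    public static bool AcceptsBoth(Type columnType)
    {
        var table = new DataTable("Rows");
        var column = new DataColumn { DataType = columnType, ColumnName = "File" };
        table.Columns.Add(column);
        table.PrimaryKey = new[] { column };

        table.Rows.Add("4.txt");
        try
        {
            table.Rows.Add("\uFF14.txt"); // U+FF14, Fullwidth Digit Four
            return true;
        }
        catch (ConstraintException)
        {
            // The uniqueness check treated the two names as the same key.
            return false;
        }
    }

    static void Main()
    {
        // The string column reproduces the failure from the question;
        // the object column is the workaround suggested here.
        Console.WriteLine("string column accepts both: {0}", AcceptsBoth(typeof(string)));
        Console.WriteLine("object column accepts both: {0}", AcceptsBoth(typeof(object)));
    }
}
```

One caveat: on an object column the comparison presumably goes through string's own IComparable implementation, which is case-sensitive, so names that differ only in case would also become distinct keys. If that matters, the wrapper approach is the safer route.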

More complex solution: implement a "wrapper" for the string class:

public class MyString : IEquatable<MyString>, IComparable, IComparable<MyString>
{
    public static readonly StringComparer Comparer = StringComparer.InvariantCultureIgnoreCase;
    public readonly string Value;

    public MyString(string value)
    {
        Value = value;
    }

    public static implicit operator MyString(string value)
    {
        return new MyString(value);
    }

    public static implicit operator string(MyString value)
    {
        return value != null ? value.Value : null;
    }

    public override int GetHashCode()
    {
        // StringComparer.GetHashCode throws on null, so map a null Value to 0.
        return Value != null ? Comparer.GetHashCode(Value) : 0;
    }

    public override bool Equals(object obj)
    {
        if (obj == null || !(obj is MyString))
        {
            return false;
        }

        return Comparer.Equals(Value, ((MyString)obj).Value);
    }

    public override string ToString()
    {
        return Value;
    }

    public bool Equals(MyString other)
    {
        if (other == null)
        {
            return false;
        }

        return Comparer.Equals(Value, other.Value);
    }

    public int CompareTo(object obj)
    {
        if (obj == null)
        {
            return 1;
        }

        return CompareTo((MyString)obj);
    }

    public int CompareTo(MyString other)
    {
        if (other == null)
        {
            return 1;
        }

        return Comparer.Compare(Value, other.Value);
    }
}

And then:

var changes = new DataTable("Rows");

var column = new DataColumn { DataType = typeof(MyString), ColumnName = "File" };
changes.Columns.Add(column);
var primKey = new DataColumn[1];
primKey[0] = column;
changes.PrimaryKey = primKey;

changes.Rows.Add((MyString)"a");
changes.Rows.Add((MyString)"4.txt");
try
{
    changes.Rows.Add((MyString)"４.txt"); // '４' is U+FF14; no exception expected with this comparer
}
catch (Exception e)
{
    Console.WriteLine("Exception: {0}", e);
}

var row = changes.Rows.Find((MyString)"A");
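To sanity-check the comparer the wrapper relies on, here is a small standalone sketch. The class ComparerCheck and the helper SameKey are hypothetical names, not part of the code above. The idea is that StringComparer.InvariantCultureIgnoreCase ignores case but should still treat character width as significant; I have not verified the second point on every runtime, so run it on yours.

```csharp
using System;

static class ComparerCheck
{
    // Hypothetical helper: true when the wrapper's comparer would treat
    // the two names as the same primary-key value.
    public static bool SameKey(string a, string b)
    {
        return StringComparer.InvariantCultureIgnoreCase.Equals(a, b);
    }

    static void Main()
    {
        // Case is ignored, which is why Find((MyString)"A") can match "a".
        Console.WriteLine("a vs A: {0}", SameKey("a", "A"));

        // Width should stay significant; "\uFF14" is Fullwidth Digit Four.
        Console.WriteLine("4.txt vs fullwidth: {0}", SameKey("4.txt", "\uFF14.txt"));
    }
}
```

If the second comparison reports the names as equal on your platform, swap the Comparer field in MyString for one that suits your needs, for example StringComparer.OrdinalIgnoreCase.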
answered Oct 21 '22 06:10 by xanatos