
DataTable Union

Tags: c#, linq, datatable

Could you please check what is wrong with the following code?

I need the union of these two tables, but it returns 6 records instead of 5 ("Amir" occurs in both tables, so it should appear only once).

DataTable dt1 = new DataTable();
dt1.Columns.Add(new DataColumn("Name"));
dt1.Rows.Add(dt1.NewRow()["Name"] = "Imran");
dt1.Rows.Add(dt1.NewRow()["Name"] = "Amir");
dt1.Rows.Add(dt1.NewRow()["Name"] = "Asif");

DataTable dt2 = new DataTable();
dt2.Columns.Add(new DataColumn("Name"));
dt2.Rows.Add(dt2.NewRow()["Name"] = "Tandulkar");
dt2.Rows.Add(dt2.NewRow()["Name"] = "Amir");
dt2.Rows.Add(dt2.NewRow()["Name"] = "Sheqwag");

DataTable dtUnion = dt1.AsEnumerable()
    .Union(dt2.AsEnumerable()).CopyToDataTable<DataRow>();
Ali asked Mar 09 '12 11:03



2 Answers

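The problem here is that LINQ does not know that you want to compare rows by Name. Union uses the default equality comparer, and because DataRow does not override Equals or GetHashCode, that means reference equality: two different row instances are never considered equal, even when every column holds the same value.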
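The effect can be reproduced in isolation. The following is a minimal sketch, not part of the original answer; the class and method names (ReferenceEqualityDemo, MakeTable, UnionCount, SameNameRowsAreEqual) are made up for illustration, and the data mirrors the tables from the question.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Illustrative helper class; the names here are hypothetical.
static class ReferenceEqualityDemo
{
    // Builds a one-column table holding the given names.
    static DataTable MakeTable(params string[] names)
    {
        var table = new DataTable();
        table.Columns.Add("Name", typeof(string));
        foreach (string name in names)
        {
            table.Rows.Add(name);
        }
        return table;
    }

    // Number of rows Union yields when no comparer is supplied.
    public static int UnionCount()
    {
        DataTable dt1 = MakeTable("Imran", "Amir", "Asif");
        DataTable dt2 = MakeTable("Tandulkar", "Amir", "Sheqwag");
        return dt1.AsEnumerable().Union(dt2.AsEnumerable()).Count();
    }

    // Whether the default comparer treats two rows with the same Name as equal.
    public static bool SameNameRowsAreEqual()
    {
        DataTable dt1 = MakeTable("Amir");
        DataTable dt2 = MakeTable("Amir");
        return EqualityComparer<DataRow>.Default.Equals(dt1.Rows[0], dt2.Rows[0]);
    }
}
```

UnionCount() returns 6 and SameNameRowsAreEqual() returns false, which matches the behaviour described in the question.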

What you need to do is tell the Union method how to compare two items. You can do so by creating a custom IEqualityComparer<DataRow> that compares two data rows the way you want.

Here is a sample implementation:

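using System.Collections.Generic;
using System.Data;

// Treats two rows as equal when their "Name" values match.
class CustomComparer : IEqualityComparer<DataRow>
{
    public bool Equals(DataRow x, DataRow y)
    {
        // Field<string> returns null for DBNull instead of throwing on the cast.
        return string.Equals(x.Field<string>("Name"), y.Field<string>("Name"));
    }

    public int GetHashCode(DataRow obj)
    {
        // Must agree with Equals: equal names have to produce equal hashes.
        string name = obj.Field<string>("Name");
        return name == null ? 0 : name.GetHashCode();
    }
}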

When calling Union you then need to pass in an instance of this comparer:

var comparer = new CustomComparer();
DataTable dtUnion = dt1.AsEnumerable()
      .Union(dt2.AsEnumerable(), comparer).CopyToDataTable<DataRow>();

See here for more info:
http://msdn.microsoft.com/en-us/library/bb358407.aspx

Word of advice:
LINQ works best with purpose-built data classes, which DataRow is not. With an actual Name property on a class, LINQ can really shine.
If you don't need the flexibility of a dynamic schema, consider staying away from DataTable and implementing custom classes that model exactly what you need, since DataTable is heavyweight and comparatively slow.
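As a rough illustration of that advice, here is a sketch that assumes C# 9 or later (for records). The Person record and the TypedUnionDemo class are hypothetical names, not something from the question.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical data class; a record gets value-based Equals/GetHashCode generated for it.
public sealed record Person(string Name);

public static class TypedUnionDemo
{
    public static List<Person> UnionPeople()
    {
        var first = new List<Person>
        {
            new Person("Imran"), new Person("Amir"), new Person("Asif")
        };
        var second = new List<Person>
        {
            new Person("Tandulkar"), new Person("Amir"), new Person("Sheqwag")
        };

        // No custom comparer needed: Union relies on the record's value equality.
        return first.Union(second).ToList();
    }
}
```

UnionPeople() yields five people, with "Amir" appearing once. On older compilers, a plain class that overrides Equals and GetHashCode would play the same role.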

ntziolis answered Oct 12 '22 20:10



If your DataTables' schemas are the same, you could just use the existing DataRowComparer.Default, like so:

DataTable dtUnion = dt1.AsEnumerable()
    .Union(dt2.AsEnumerable(), DataRowComparer.Default)
    .CopyToDataTable<DataRow>();
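Note that DataRowComparer.Default compares the values of every column, not just Name. The sketch below uses a hypothetical two-column schema (Name, Age) and made-up values to show the consequence; the class and method names are illustrative only.

```csharp
using System.Data;
using System.Linq;

// Illustrative helper class; schema and values are assumptions for the demo.
static class DataRowComparerDemo
{
    static DataTable NewTable()
    {
        var table = new DataTable();
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("Age", typeof(int));
        return table;
    }

    public static int UnionCount()
    {
        DataTable dt1 = NewTable();
        dt1.Rows.Add("Amir", 30);
        dt1.Rows.Add("Asif", 25);

        DataTable dt2 = NewTable();
        dt2.Rows.Add("Amir", 30); // identical in every column: treated as a duplicate
        dt2.Rows.Add("Asif", 41); // same Name, different Age: kept as a separate row

        return dt1.AsEnumerable()
                  .Union(dt2.AsEnumerable(), DataRowComparer.Default)
                  .Count();
    }
}
```

UnionCount() returns 3 here. If only some columns should decide equality, a custom IEqualityComparer<DataRow> is still the way to go.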

The Aggregate function is also very handy when you need to union more than 2 tables, e.g.:

// Create a table "template"
DataTable dt = new DataTable();
dt.Columns.Add(new DataColumn("Name"));

// Create a List of DataTables and add 3 identical tables
List<DataTable> dtList = new List<DataTable>();
dtList.AddRange(new List<DataTable>() { dt.Clone(), dt.Clone(), dt.Clone()});

// Populate the 3 clones with some data
dtList[0].Rows.Add("Imran");
dtList[0].Rows.Add("Amir"); 
dtList[0].Rows.Add("Asif");

dtList[1].Rows.Add("Tandulkar");
dtList[1].Rows.Add("Amir");  
dtList[1].Rows.Add("Sheqwag");

dtList[2].Rows.Add("John");
dtList[2].Rows.Add("Sheqwag");
dtList[2].Rows.Add("Mike");

// Union the 3 clones into a single DataTable containing only distinct rows
DataTable dtUnion = dtList
                    .Select(d => d.Select().AsEnumerable())
                    .Aggregate((current, next) => current.Union(next))
                    .Distinct(DataRowComparer.Default)
                    .CopyToDataTable<DataRow>();
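A possible alternative to Aggregate, which is not what the answer above uses, is to flatten all rows with SelectMany and remove duplicates once with Distinct. This is only a sketch under the same assumption that all tables share one schema; MultiUnionDemo and UnionAll are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Illustrative helper class; not part of the original answer.
static class MultiUnionDemo
{
    // Combines the distinct rows of all tables into a new DataTable.
    // CopyToDataTable throws InvalidOperationException if there are no rows at all.
    public static DataTable UnionAll(IEnumerable<DataTable> tables)
    {
        return tables
            .SelectMany(t => t.AsEnumerable())   // flatten into one row sequence
            .Distinct(DataRowComparer.Default)   // value-based de-duplication
            .CopyToDataTable();
    }

    // Builds the three sample tables from the answer and returns the merged row count.
    public static int SampleCount()
    {
        var template = new DataTable();
        template.Columns.Add(new DataColumn("Name"));

        var tables = new List<DataTable> { template.Clone(), template.Clone(), template.Clone() };
        foreach (string n in new[] { "Imran", "Amir", "Asif" }) tables[0].Rows.Add(n);
        foreach (string n in new[] { "Tandulkar", "Amir", "Sheqwag" }) tables[1].Rows.Add(n);
        foreach (string n in new[] { "John", "Sheqwag", "Mike" }) tables[2].Rows.Add(n);

        return UnionAll(tables).Rows.Count;
    }
}
```

With the sample data, SampleCount() returns 7, since "Amir" and "Sheqwag" each occur in two of the tables.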
wnutt answered Oct 12 '22 20:10
