Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find intersecting DataRows in a List of DataTables

Tags:

c#

.net

linq

I have a List. I would like to filter through all the rows in the list of tables to find all the rows that are in every datatable in the list.

If possible, the compare needs to be on the "ID" column that is on every row.

I have tried to solve this with Linq but got stuck. This is what I have so far:

List<DataTable> dataTables = new List<DataTable>();

// fill up the list
List<DataRow> dataRows = 
    dataTables.SelectMany(dt => dt.Rows.Cast<DataRow>().AsEnumerable()).
    Aggregate((r1, r2) => r1.Intersect(r2));

Any suggestions?

like image 711
Maarten Avatar asked May 24 '12 12:05

Maarten


1 Answers

Not a simple question. Here's a solution (which seems too complicated to me, but it works).

  • Obtain the Id value from each row using Linq to DataSets
  • Intersect the multiple lists to find all the common values
  • Find a single occurence of a row in all of the rows that have one of the matching ids

To use Linq on DataTable, see this article for a start.

You could get the ids from one table like this

var ids = dt.AsEnumerable().Select (d => d.Field<int>("ID")).OfType<int>();

and from multiple tables

var setsOfIds = dataTables.Select (
    t => t.AsEnumerable().Select (x => x.Field<int>("ID")).OfType<int>());

To intersect multiple lists, try this article. Using one of the methods there you could obtain the intersection of all of the ids.

Using Jon Skeet's helper method

public static class MyExtensions
{
    public static List<T> IntersectAll<T>(this IEnumerable<IEnumerable<T>> lists)
    {
        HashSet<T> hashSet = new HashSet<T>(lists.First());
        foreach (var list in lists.Skip(1))
        {
            hashSet.IntersectWith(list);
        }
        return hashSet.ToList();
    }
}

we can write

var commonIds = setsOfIds.InsersectAll();

Now flatten all the rows from the DataTables and filter by the common ids:

var rows = dataTables.SelectMany (t => t.AsEnumerable()).Where(
    r => commonIds.Contains(r.Field<int>("ID")));

Now group by id and take the first instance of each row:

var result = rows.GroupBy (r => r.Field<int>("ID")).Select (r => r.First ());
like image 99
Phil Avatar answered Sep 28 '22 10:09

Phil