
Mark non-unique rows in a DataTable

I have a DataTable and want to check whether the combination of values in three of its columns is unique across all rows. If it is not, the last column should be filled with the line number of the first appearance of that value combination.

For example, this table:

ID    Name    LastName    Age    Flag
-------------------------------------
1     Bart    Simpson     10      -
2     Lisa    Simpson      8      -
3     Bart    Simpson     10      -
4     Ned     Flanders    40      -
5     Bart    Simpson     10      -

Should lead to this result:

ID    Name    LastName    Age    Flag
-------------------------------------
1     Bart    Simpson     10      -
2     Lisa    Simpson      8      -
3     Bart    Simpson     10      1
4     Ned     Flanders    40      -
5     Bart    Simpson     10      1

I solved this by iterating over the DataTable with two nested for loops and comparing the values. This works fine for a small amount of data, but it gets pretty slow when the DataTable contains a lot of rows.

My question is: what is the best/fastest solution to this problem, given that the number of rows can vary between, say, 100 and 20,000?
Is there a way to do this using LINQ? (I'm not too familiar with it, but I want to learn!)

Philipp Grathwohl asked Nov 06 '22 05:11



1 Answer

I can't comment on how you might do this in C#/VB with a DataTable, but if you could move it all to SQL, your query would look like this:

declare @t table (ID int, Name varchar(10), LastName varchar(10), Age int)
insert into @t values (1, 'Bart', 'Simpson', 10)
insert into @t values (2, 'Lisa', 'Simpson', 8)
insert into @t values (3, 'Bart', 'Simpson', 10)
insert into @t values (4, 'Ned', 'Flanders', 40)
insert into @t values (5, 'Bart', 'Simpson', 10)

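-- Flag is the lowest ID of an earlier row with the same Name, LastName and Age,
-- or NULL when the row is the first appearance of that combination.
select t.*,
(select min(t2.ID)
    from @t t2
    where t2.Name = t.Name
    and t2.LastName = t.LastName
    and t2.Age = t.Age
    and t2.ID < t.ID) as Flag
from @t t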

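Here I've defined a table variable for demo purposes. The subquery returns NULL for the first appearance of a combination and the ID of the earliest matching row otherwise. I suppose you might be able to translate this into LINQ.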
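
For what it's worth, a LINQ version might look roughly like the sketch below. It is only an illustration under some assumptions: the column names (ID, Name, LastName, Age, Flag) are taken from the example table in the question, Flag is assumed to be a string column, and MarkDuplicates is a made-up helper name. It also assumes C# 7 tuples and that AsEnumerable() and Field<T>() are available (System.Data.DataSetExtensions on .NET Framework). Grouping visits each row once, so it should avoid the quadratic cost of two nested loops.

```csharp
using System;
using System.Data;
using System.Linq;

static class Program
{
    // Hypothetical helper: sets "Flag" on every repeated row to the ID of the
    // first row that has the same (Name, LastName, Age) combination.
    static void MarkDuplicates(DataTable table)
    {
        // Group rows by the three columns; tuples compare by value.
        var groups = table.AsEnumerable()
            .GroupBy(r => (r.Field<string>("Name"),
                           r.Field<string>("LastName"),
                           r.Field<int>("Age")));

        foreach (var group in groups)
        {
            // GroupBy keeps rows in source order, so First() is the earliest row.
            string firstId = group.First().Field<int>("ID").ToString();

            // Every later row in the group is a duplicate of that first row.
            foreach (DataRow duplicate in group.Skip(1))
                duplicate["Flag"] = firstId;
        }
    }

    static void Main()
    {
        // Sample data copied from the question; Flag starts as "-".
        var table = new DataTable();
        table.Columns.Add("ID", typeof(int));
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("LastName", typeof(string));
        table.Columns.Add("Age", typeof(int));
        table.Columns.Add("Flag", typeof(string));

        table.Rows.Add(1, "Bart", "Simpson", 10, "-");
        table.Rows.Add(2, "Lisa", "Simpson", 8, "-");
        table.Rows.Add(3, "Bart", "Simpson", 10, "-");
        table.Rows.Add(4, "Ned", "Flanders", 40, "-");
        table.Rows.Add(5, "Bart", "Simpson", 10, "-");

        MarkDuplicates(table);

        foreach (DataRow row in table.Rows)
            Console.WriteLine(string.Join("  ", row.ItemArray));
    }
}
```

With the sample data, rows 3 and 5 should end up with Flag set to 1 while the other rows keep "-", which matches the expected result in the question. If the first appearance should be identified by row position rather than by the ID column, the same grouping would work with the row index used in place of ID.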

James Wiseman answered Nov 09 '22 14:11
