
Why doesn't the Union function in LINQ remove duplicate entries?

Tags:

vb.net

linq

union

I'm using VB.NET. I know that Union normally compares objects by reference, but in VB, Strings are generally treated as if they were primitive data types.

Consequently, here's the problem:

Sub Main()
    Dim firstFile, secondFile As String(), resultingFile As New StringBuilder

    firstFile = My.Computer.FileSystem.ReadAllText(My.Computer.FileSystem.SpecialDirectories.Desktop & "\1.txt").Split(vbNewLine)
    secondFile = My.Computer.FileSystem.ReadAllText(My.Computer.FileSystem.SpecialDirectories.Desktop & "\2.txt").Split(vbNewLine)

    For Each line As String In firstFile.Union(secondFile)
        resultingFile.AppendLine(line)
    Next

    My.Computer.FileSystem.WriteAllText(My.Computer.FileSystem.SpecialDirectories.Desktop & "\merged.txt", resultingFile.ToString, True)
End Sub

1.txt contains:
a
b
c
d
e

2.txt contains:
b
c
d
e
f
g
h
i
j

After running the code, I get:
a
b
c
d
e
b
f
g
h
i
j

Any suggestions for making the Union function act like its mathematical counterpart?

Zian Choy asked Aug 09 '09 04:08



2 Answers

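LINQ's Union does behave the way you want: it removes duplicates using the default equality comparer, which compares strings by value. Check that your input files contain what you expect (for example, one of the lines may contain a space before the newline), or Trim() the strings after splitting.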

var list1 = new[] { "a", "s", "d" };
var list2 = new[] { "d", "a", "f", "123" };
var union = list1.Union(list2);
union.Dump(); // this is a LinqPad method

In LINQPad, the result is { "a", "s", "d", "f", "123" }
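
For the VB.NET code in the question, the trimming suggestion might look like the sketch below. It uses hypothetical in-memory strings in place of 1.txt and 2.txt, and it assumes the files use Windows (CRLF) line endings. One possible cause of the stray "b", assuming Option Strict is Off on .NET Framework: Split(vbNewLine) may convert the two-character string to a single Char and split on the carriage return only, which leaves a line feed at the start of every line except the first. Splitting on both characters and trimming each line avoids that.

```vb
Imports System
Imports System.Linq
Imports Microsoft.VisualBasic

Module UnionDemo
    Sub Main()
        ' Hypothetical file contents with CRLF line endings (stand-ins for 1.txt and 2.txt).
        Dim firstText As String = "a" & vbCrLf & "b" & vbCrLf & "c"
        Dim secondText As String = "b" & vbCrLf & "c" & vbCrLf & "d"

        ' Split on CR and LF individually and drop the empty pieces between them.
        Dim separators As Char() = {ControlChars.Cr, ControlChars.Lf}
        Dim firstLines = firstText.Split(separators, StringSplitOptions.RemoveEmptyEntries).
            Select(Function(s) s.Trim())
        Dim secondLines = secondText.Split(separators, StringSplitOptions.RemoveEmptyEntries).
            Select(Function(s) s.Trim())

        ' Union compares the trimmed strings by value, so each line appears once.
        For Each line As String In firstLines.Union(secondLines)
            Console.WriteLine(line)
        Next
        ' Prints a, b, c, d on separate lines.
    End Sub
End Module
```

Trimming also removes leading and trailing spaces, so it is only appropriate when that whitespace is not significant in the files being merged.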

Robert Paulson answered Nov 15 '22 08:11



I think you want to use the Distinct function. At the end of your LINQ statement, add .Distinct();

var distinctList = yourCombinedList.Distinct();

Similar to a 'SELECT DISTINCT' in SQL :)
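
A rough VB.NET sketch of the same idea follows. The input arrays are hypothetical and imitate lines that still carry a stray line feed after splitting. Because Distinct, like Union, compares strings by value, it only removes such lines if they are normalized first, so the sketch trims before calling it.

```vb
Imports System
Imports System.Linq
Imports Microsoft.VisualBasic

Module DistinctDemo
    Sub Main()
        ' Hypothetical lines as they might look after a split that left line feeds behind.
        Dim firstLines As String() = {"a", vbLf & "b", vbLf & "c"}
        Dim secondLines As String() = {"b", vbLf & "c", vbLf & "d"}

        ' Concat keeps every element, Trim normalizes each line,
        ' and Distinct then drops the duplicates.
        Dim merged = firstLines.Concat(secondLines).
            Select(Function(s) s.Trim()).
            Distinct()

        Console.WriteLine(String.Join(",", merged.ToArray()))  ' prints a,b,c,d
    End Sub
End Module
```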

Kelsey answered Nov 15 '22 08:11
