
Find Duplicate entries C#

Tags:

c#

I am new to programming so this may seem somewhat straightforward, but I cannot seem to figure it out.

I am trying to find duplicate values in one column of a DataTable.

Here is what I tried:

DataRow[] dupresults = dt.Select("PROV_NEW");
TableIssues = string.Empty;
DataTable dtTemp = dt.DefaultView.ToTable(true, "NEW_PROV");

if (dupresults.Length == 0)
{
    return true;
}
else
{
    foreach (DataRow item in dupresults)
    {
        Console.WriteLine(item[1]);
        TableIssues += "Provider Code is not unique for " + item[1].ToString() + ". Revise non-unique codes.\r\n\n\n\n";
    }
    return false;
}

I also have the method check that there are no empty fields in PROV_NEW, so I am not sure where the duplicate check should go. I am very new to C#; I started last week and am doing side projects for my father's company.

    private bool ValidateTable(DataSets.Setup.SETUP_MWPROVDataTable dt, out string TableIssues)
    {
        try
        {
            //NewCode not used for other row
            DataRow[] result = dt.Select("PROV_NEW = ''");
            DataRow[] dupresults = dt.Select("PROV_NEW");
            TableIssues = string.Empty;
            DataTable dtTemp = dt.DefaultView.ToTable(true, "NEW_PROV");

            if (dupresults.Length == 0)
            {
                return true;
            }
            else
            {
                var duplicates = dt.AsEnumerable()
                                   .Select(dr => dr.Field<string>("PROV_NEW"))
                                   .GroupBy(x => x)
                                   .Where(g => g.Count() > 1)
                                   .Select(g => g.Key)
                                   .ToList();

                foreach (DataRow item in dupresults)
                {
                    Console.WriteLine(item[1]);
                    TableIssues += "Provider Code is not unique for " + item[1].ToString() + ". Revise non-unique codes.\r\n\n\n\n";
                }
                return false;
            }

            if (result.Length == 0)
            {
                //TODO: Add Next Step for validation
                return true;
            }
            else
            {
                foreach (DataRow item in result)
                {
                    Console.WriteLine(item[1]);
                    TableIssues += "Provider code " + item[1].ToString() + " is blank. Add new Provider code for " + item[1].ToString() + ".\r\n\n\n";
                }
                return false;
            }
        }
        catch (Exception)
        {
            throw;
        }
    }
}
Kobrien asked Nov 30 '22 14:11


1 Answer

LINQ can help you here:

var duplicates = dt.AsEnumerable()
                   .Select(dr => dr.Field<string>("PROV_NEW"))
                   .GroupBy(x => x)
                   .Where(g => g.Count() > 1)
                   .Select(g => g.Key)
                   .ToList();

// Now work with the set of duplicates
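
To make "work with the set of duplicates" concrete, here is a minimal sketch that runs the same query against a small sample table and builds one message per duplicated code. The `DuplicateDemo` class, the `FindDuplicates` helper, and the sample rows are hypothetical, and it assumes PROV_NEW is a string column on a plain DataTable rather than the typed table from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class DuplicateDemo
{
    // Hypothetical helper: returns each PROV_NEW value that appears on more than one row.
    public static List<string> FindDuplicates(DataTable dt)
    {
        return dt.AsEnumerable()
                 .Select(dr => dr.Field<string>("PROV_NEW"))
                 .GroupBy(x => x)
                 .Where(g => g.Count() > 1)
                 .Select(g => g.Key)
                 .ToList();
    }

    public static void Main()
    {
        // Sample data (assumed for illustration): "A1" appears twice.
        var dt = new DataTable();
        dt.Columns.Add("PROV_NEW", typeof(string));
        dt.Rows.Add("A1");
        dt.Rows.Add("B2");
        dt.Rows.Add("A1");

        // One message per duplicated code, not one per row.
        foreach (string code in FindDuplicates(dt))
        {
            Console.WriteLine("Provider Code is not unique for " + code + ". Revise non-unique codes.");
        }
    }
}
```

Note that the query groups blank values like any other value, so two empty cells would be reported as a duplicate of the empty string unless they are filtered out first.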

Alternatively:

HashSet<string> providers = new HashSet<string>();
foreach (var provider in dt.AsEnumerable()
                           .Select(dr => dr.Field<string>("PROV_NEW")))
{
    if (!providers.Add(provider))
    {
        // This provider is a duplicate
    }
}

(This works because HashSet<T>.Add returns false if the value already exists in the set.)
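
As for where the blank check fits: one option is to run both checks in a single pass and return only once at the end, so that neither check cuts the other off. The following is a rough sketch under assumptions, not the original method: `ProviderValidation` is a hypothetical class, it takes a plain DataTable instead of the typed SETUP_MWPROVDataTable, it assumes PROV_NEW is a string column, and the message wording is only loosely based on the question.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Text;

public static class ProviderValidation
{
    // Hypothetical variant of ValidateTable: collects every issue, then returns once.
    public static bool ValidateTable(DataTable dt, out string tableIssues)
    {
        var issues = new StringBuilder();
        var seen = new HashSet<string>();      // codes encountered so far
        var reported = new HashSet<string>();  // duplicates already reported

        foreach (DataRow row in dt.Rows)
        {
            // Field<string> returns null for a DBNull cell.
            string code = row.Field<string>("PROV_NEW");

            if (string.IsNullOrWhiteSpace(code))
            {
                issues.AppendLine("Provider code is blank. Add a new Provider code.");
            }
            else if (!seen.Add(code) && reported.Add(code))
            {
                // seen.Add returned false, so this code already occurred;
                // reported.Add returned true, so it has not been reported yet.
                issues.AppendLine("Provider Code is not unique for " + code + ". Revise non-unique codes.");
            }
        }

        tableIssues = issues.ToString();
        return issues.Length == 0;
    }
}
```

Because the method no longer returns inside the first check, a table with both blank and duplicated codes gets both kinds of message in the same call.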

Jon Skeet answered Dec 15 '22 23:12