
Find items with duplicate values over multiple properties

Tags: c#, linq

It's hard to explain my issue without giving a concrete example. There might be a similar question on here but I wasn't able to find it because I'm having trouble wording it in searchable terms.

Basically I need to find items in a list that have any duplicate values over multiple properties. In other words, any value in the original list has to be unique regardless of which property it is in.

Here is a simple example I could come up with that describes the problem really well:

There is a list of holiday dates with an extra property for an optional replacement date (e.g. for when the holiday falls on a weekend). Each date in this list has to be unique across both properties, so I'd like to find the items that contain duplicate dates.

PART1: return a list of duplicate dates
PART2: return a list of all the items with a duplicate date

I believe this is a good example because one of the properties is nullable, which makes the problem a little more difficult.

Model:

public class Holiday
{
    public Holiday(DateTime hDate, string descr, DateTime? rDate = null)
    {
        HolidayDate = hDate;
        Description = descr;
        ReplacementDate = rDate;
    }

    public DateTime HolidayDate { get; set; }
    public string Description { get; set; }
    public DateTime? ReplacementDate { get; set; }
}

Here is some example data to get you started (and hopefully clear up any confusion):

var list = new List<Holiday>()
{
    new Holiday(new DateTime(2016,1,1),"NEW YEAR 2016"),
    new Holiday(new DateTime(2016,3,27),"EASTER MONDAY 2016", new DateTime(2016,3,28)),
    new Holiday(new DateTime(2016,12,25),"CHRISTMAS DAY 2016", new DateTime(2016,12,26)),
    new Holiday(new DateTime(2017,1,1),"NEW YEAR 2017", new DateTime(2017,1,2)),
    new Holiday(new DateTime(2017,4,17),"EASTER MONDAY 2017"),
    new Holiday(new DateTime(2017,12,25),"CHRISTMAS DAY 2017"),
    new Holiday(new DateTime(2018,1,1),"NEW YEAR 2018"),
    new Holiday(new DateTime(2018,1,1),"DUPLICATE 1"), //example of a duplicate
    new Holiday(new DateTime(2018,1,2),"DUPLICATE 2", new DateTime(2016,1,1)), //example of a duplicate
    new Holiday(new DateTime(2018,1,3),"DUPLICATE 3", new DateTime(2017,1,2)), //example of a duplicate
    new Holiday(new DateTime(2018,1,4),"DUPLICATE 4", new DateTime(2018,1,4)), //example of a duplicate

};

var result = list; //add your code that finds the items with a duplicate value, preferably a linq query

//PART1: return a list of the actual duplicate values
//duplicateDates = "2016-01-01, 2017-01-02, 2018-01-01, 2018-01-04";

//PART2: return a list of the items that have a duplicate date
var resultString = string.Join(", ", result.Select(q => q.Description));
//resultString = "NEW YEAR 2016, DUPLICATE 2, NEW YEAR 2017, DUPLICATE 3, NEW YEAR 2018, DUPLICATE 1, DUPLICATE 4";

For those of you that are lazy and don't feel like working out this example, as long as you understood my issue and can help me by using your own similar example or point me in the right direction, I'd really appreciate it.

Any solution that doesn't involve writing a specific foreach or for-loop that checks each property individually for duplicates will be accepted as an answer.
I'd like to be able to apply the solution to similar situations without having to write an entire block of code that iterates through the possibilities.

This is why I'd like to know whether this is possible with LINQ queries. However, if you have a generic method or extension that solves problems like this, please share!

Oceans asked Mar 13 '26 18:03


2 Answers

You can flatten the collection so that each date (holiday and replacement) is represented by a separate item, then group by date, like this:

// flatten
var result = list.SelectMany(c => new[] {
    // always include HolidayDate, it's not nullable
    new {
        Date = c.HolidayDate,
        Instance = c
    },
    // include replacement date if not null
    c.ReplacementDate != null ? new {
        Date = c.ReplacementDate.Value,
        Instance = c
    }: null
})
// filter out null items (that were created for null replacement dates)
.Where(c => c != null)
.GroupBy(c => c.Date)
.Where(c => c.Count() > 1)
.ToArray();

// keys of groups are duplicate dates
var duplicateDates = string.Join(", ", result.Select(c => c.Key.ToString("yyyy-MM-dd")));

var resultString = string.Join(", ",
      // flatten groups extracting all items
      result.SelectMany(c => c)
      // filter out duplicates
     .Select(c => c.Instance).Distinct()
     .Select(c => c.Description));
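
Since the question also asks for something reusable, the same flatten-and-group idea can be wrapped in a generic extension method. This is only a sketch: `DuplicateExtensions` and `GroupByDuplicateValues` are names made up for illustration, not part of LINQ, and the caller is assumed to supply a selector that returns every non-null value of an item that must be unique.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateExtensions
{
    // Hypothetical helper (not a framework method): pairs each item with every
    // value returned by valuesSelector, groups the pairs by value, and keeps
    // only the values that occur more than once across the whole sequence.
    public static IEnumerable<IGrouping<TValue, TItem>> GroupByDuplicateValues<TItem, TValue>(
        this IEnumerable<TItem> source,
        Func<TItem, IEnumerable<TValue>> valuesSelector)
    {
        return source
            .SelectMany(item => valuesSelector(item)
                .Select(value => new { Value = value, Item = item }))
            .GroupBy(pair => pair.Value, pair => pair.Item)
            .Where(group => group.Count() > 1);
    }
}

// Usage with the question's Holiday list (assumed to be in scope as `list`):
//
// var groups = list.GroupByDuplicateValues(h =>
//         new DateTime?[] { h.HolidayDate, h.ReplacementDate }
//             .Where(d => d.HasValue)
//             .Select(d => d.Value))
//     .ToList();
//
// var duplicateDates = groups.Select(g => g.Key);
// var duplicateItems = groups.SelectMany(g => g).Distinct();
```

An item that carries the same value in two properties (like DUPLICATE 4) appears twice in its group, so `Distinct()` is still needed when collecting the items.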
Evk answered Mar 16 '26 07:03


I've got one, too. I'm not sure it's possible without collecting the data first.

//PART0: collecting data 
var holidayDateDates = list.Select(x => x.HolidayDate).ToList();
var replacementDates = list.Select(x => x.ReplacementDate).Where(y => y != null).ToList().ConvertAll<DateTime>(x => x.Value);
holidayDateDates.AddRange(replacementDates);
//PART1: return list of the actual duplicate values
var result = holidayDateDates.GroupBy(x => x)
    .Where(g => g.Count() > 1)
    .Select(y => y.Key)
    .ToList();
var duplicateDates = string.Join(", ", result.Select(c => c.ToString("yyyy-MM-dd")));

//PART2: return a list with the items that have a duplicate item
// new DateTime() is DateTime.MinValue; it is assumed never to appear in the list, so it stands in for null replacement dates below
var dummytime = new DateTime();
var duplicateHolidays = list.Where(x => result.Contains(x.HolidayDate) || result.Contains(x.ReplacementDate ?? dummytime));
var resultString = string.Join(", ", duplicateHolidays.Select(q => q.Description));
FrankM answered Mar 16 '26 07:03


