LINQ with subselect and groupby to get only the latest version of each item in a list

Tags: c#, lambda, linq

I'm a newbie when it comes to LINQ...

I have a generic IEnumerable list that contains answers in different versions (each with an FK to its question). From this list I need to get a dictionary containing only the latest version of each answer.

A very simplified class diagram:

Question
-ID
-question
- ...other properties

Answer
-ID
-Version
-QuestionID
-Value
- ...other properties

Currently I've got the following:

IEnumerable<Answer> answers = GetAnswers();

IDictionary<long, AnswerDTO> latestVersionAnswers = new Dictionary<long, AnswerDTO>();
if (answers != null)
{
    latestVersionAnswers = answers
        .OrderBy(e => e.ID)
        .GroupBy(e => e.Question.ID)
        .Select(g => new AnswerDTO
        {
            Version = g.Last().Version, // g.Select(e => e.Version).Max(),
            QuestionID = g.Key,
            ID = g.Last().ID,
            Value = g.Last().Value
        })
        .ToDictionary(c => c.QuestionID);
}

While this works for the most part, you can quickly see that it needs some serious optimization (and it is a little fragile, in that it depends on the order of the Answer rows instead of "Max" logic). What would be the best way to do this with LINQ, or is it best to just use multiple foreach loops?

  1. If I only needed the version (and not the ID, Value, etc.) I wouldn't need the OrderBy, as I could just use g.Select(e => e.Version).Max(). (I've also now seen the post "C# List<> GroupBy 2 Values", but that again would only return the key/s and one property: Version.)
  2. Ultimately, in this particular situation I would much prefer to just "filter" the original list and return the original answer items instead of involving the AnswerDTO.

Any pointers or help would be much appreciated!

Ted asked Jan 13 '09 00:01

2 Answers

How about something like this:

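    private void button1_Click(object sender, EventArgs e)
    {
      List<Answer> list = GetAnswers();

      // For each question, keep the answer whose Version equals the group's maximum.
      // 'let' computes that maximum once per group instead of once per element.
      var dict = (from a in list
                  group a by a.QuestionID into grp
                  let maxVersion = grp.Max(m => m.Version)
                  from g in grp
                  where g.Version == maxVersion
                  select new { id = g.QuestionID, q = g }).ToDictionary(o => o.id, o => o.q);

      // Show "QuestionID-Version" for every entry in the dictionary.
      StringBuilder sb = new StringBuilder();
      foreach (var elem in dict)
      {
        sb.AppendLine(elem.Key.ToString() + "-" + elem.Value.Version.ToString());
      }
      MessageBox.Show(sb.ToString());
    }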

    private List<Answer> GetAnswers()
    {
      List<Answer> result = new List<Answer>();
      result.Add(new Answer() { ID = 1, QuestionID = 1, Version = 1 });
      result.Add(new Answer() { ID = 2, QuestionID = 1, Version = 2 });
      result.Add(new Answer() { ID = 3, QuestionID = 1, Version = 3 });
      result.Add(new Answer() { ID = 4, QuestionID = 2, Version = 1 });
      result.Add(new Answer() { ID = 5, QuestionID = 2, Version = 2 });
      result.Add(new Answer() { ID = 6, QuestionID = 2, Version = 3 });
      result.Add(new Answer() { ID = 7, QuestionID = 3, Version = 1 });
      result.Add(new Answer() { ID = 8, QuestionID = 3, Version = 2 });
      result.Add(new Answer() { ID = 9, QuestionID = 3, Version = 3 });
      result.Add(new Answer() { ID = 10, QuestionID = 3, Version = 4 });
      return result;
    }
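
One thing to watch with the query above: if two answers for the same question ever share the highest Version, both pass the where clause and ToDictionary throws an ArgumentException for the duplicate key. A possible variation is sketched below. It is not part of the original answer, and it assumes a minimal Answer class based on the question's class diagram plus a hypothetical helper named LatestAnswers.ByQuestion. It picks exactly one answer per question in a single pass over each group, keeping the earlier element on a tie.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed minimal shape of Answer, taken from the class diagram in the question.
public class Answer
{
    public long ID { get; set; }
    public int Version { get; set; }
    public long QuestionID { get; set; }
    public string Value { get; set; }
}

public static class LatestAnswers
{
    // Hypothetical helper: one Answer per QuestionID, the one with the highest Version.
    public static Dictionary<long, Answer> ByQuestion(IEnumerable<Answer> answers)
    {
        return (answers ?? Enumerable.Empty<Answer>())
            .GroupBy(a => a.QuestionID)
            .ToDictionary(
                g => g.Key,
                // Walk the group once; replace the candidate only on a strictly
                // higher Version, so the earlier element wins a tie.
                g => g.Aggregate((best, next) => next.Version > best.Version ? next : best));
    }
}

public static class Program
{
    public static void Main()
    {
        var answers = new List<Answer>
        {
            new Answer { ID = 1, QuestionID = 1, Version = 1 },
            new Answer { ID = 2, QuestionID = 1, Version = 2 },
            new Answer { ID = 3, QuestionID = 2, Version = 1 },
            new Answer { ID = 4, QuestionID = 2, Version = 1 }, // tie on Version
        };

        foreach (var pair in LatestAnswers.ByQuestion(answers).OrderBy(p => p.Key))
        {
            Console.WriteLine(pair.Key + ": ID " + pair.Value.ID + ", Version " + pair.Value.Version);
        }
        // Prints:
        // 1: ID 2, Version 2
        // 2: ID 3, Version 1
    }
}
```

Whether the earlier or the later element should win a tie depends on the data; changing > to >= in the Aggregate lambda would keep the later one instead.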
Tim Jarvis answered Oct 16 '22 14:10

latestVersionAnswers = answers
  .GroupBy(e => e.Question.ID)
  .Select(g => g.OrderByDescending(e => e.Version).First())
  .ToDictionary(e => e.Question.ID);

Or, if you prefer the overload of ToDictionary that also takes an element selector:

latestVersionAnswers = answers
  .GroupBy(e => e.Question.ID)
  .ToDictionary(
    g => g.Key,
    g => g.OrderByDescending(e => e.Version).First()
  );
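
For point 2 of the question (just "filtering" the original list), the same grouping can return the original Answer objects as a list rather than a dictionary. The following is only a sketch under some assumptions: it targets .NET 6 or later, where Enumerable.MaxBy is available, it uses a minimal Answer class with the flat QuestionID property from the class diagram rather than e.Question.ID, and LatestOnly is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed minimal shape of Answer, using the flat QuestionID from the class diagram.
public class Answer
{
    public long ID { get; set; }
    public int Version { get; set; }
    public long QuestionID { get; set; }
}

public static class Program
{
    // Hypothetical helper: returns the original Answer objects, one per question.
    public static List<Answer> LatestOnly(IEnumerable<Answer> answers)
    {
        return answers
            .GroupBy(a => a.QuestionID)
            // MaxBy (.NET 6+) returns the element with the largest Version without
            // sorting the group. A group is never empty, so the result is non-null.
            .Select(g => g.MaxBy(a => a.Version)!)
            .ToList();
    }

    public static void Main()
    {
        var answers = new List<Answer>
        {
            new Answer { ID = 1, QuestionID = 1, Version = 1 },
            new Answer { ID = 2, QuestionID = 1, Version = 2 },
            new Answer { ID = 3, QuestionID = 2, Version = 1 },
        };

        foreach (var a in LatestOnly(answers))
        {
            Console.WriteLine($"Question {a.QuestionID}: answer {a.ID}, version {a.Version}");
        }
        // Prints:
        // Question 1: answer 2, version 2
        // Question 2: answer 3, version 1
    }
}
```

On older frameworks without MaxBy, the OrderByDescending(...).First() form shown above gives the same elements at the cost of sorting each group.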
Amy B answered Oct 16 '22 13:10