Nested LINQ Query Question

I ran into an issue today and I have been stumped for some time in trying to get the results I am searching for.

I currently have a class that resembles the following:

public class InstanceInformation
{
     public string PatientID {get; set;}
     public string StudyID {get; set;}
     public string SeriesID {get; set;}
     public string InstanceID {get; set;}
}

I have a List<InstanceInformation> and I am trying to use LINQ (or any other means) to generate paths (for a file directory) based on this list that resemble the following:

PatientID/StudyID/SeriesID/InstanceID

My issue is that the data arrives unstructured, as the flat List<InstanceInformation> mentioned above, and I need a way to group all of it with the following constraints:

  • Group InstanceIDs by SeriesID
  • Group SeriesIDs by StudyID
  • Group StudyIDs by PatientID

I currently have something that resembles this:

var groups = from instance in instances
             group instance by instance.PatientID into patientGroups
             from studyGroups in
                 (from instance in patientGroups
                   group instance by instance.StudyID)
                   from seriesGroup in
                       (from instance in studyGroups
                        group instance by instance.SeriesID)
                            from instanceGroup in
                                 (from instance in seriesGroup
                                  group instance by instance.InstanceID)
             group instanceGroup by patientGroups.Key;

which just groups all of my InstanceIDs by PatientID, and it is quite hard to comb through all of the data after this massive grouping to see whether the levels in between (StudyID/SeriesID) are being lost. Any other methods of solving this issue would be more than welcome.

This is primarily just for grouping the objects, as I would then need to iterate through them (using a foreach).

Rion Williams asked Mar 04 '11 21:03

1 Answer

I have no idea if the query you've come up with is the query you actually want or need, but assuming that it is, let's consider the question of whether there is a better way to write it.

The place you want to look is section 7.16.2.1 of the C# 4 specification, a portion of which I quote here for your convenience:


A query expression with a continuation

from ... into x ...

is translated into

from x in ( from ... ) ...
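
As a small illustration of that rule (a sketch on made-up data, not part of the spec text), the two methods below run the same grouping once with a continuation and once with the nested form. The names `ContinuationDemo`, `Words`, `WithContinuation` and `WithNestedQuery` are hypothetical and exist only for this example:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ContinuationDemo
{
    // Hypothetical sample data, used only for illustration.
    private static readonly string[] Words = { "a", "bb", "cc", "ddd" };

    // Form 1: a continuation. After "into g", only g is in scope.
    public static List<string> WithContinuation()
    {
        var query = from w in Words
                    group w by w.Length into g
                    where g.Count() > 1
                    select g.Key + ":" + string.Join(",", g);
        return query.ToList();
    }

    // Form 2: the same query with the continuation written out by hand
    // as a nested query, following the translation quoted above.
    public static List<string> WithNestedQuery()
    {
        var query = from g in (from w in Words
                               group w by w.Length)
                    where g.Count() > 1
                    select g.Key + ":" + string.Join(",", g);
        return query.ToList();
    }
}
```

For this sample data both methods return the single entry "2:bb,cc", since only the length-2 group has more than one word.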

Is that clear? Let's take a look at a fragment of your query that I've marked with stars:

var groups = from instance in instances
             group instance by instance.PatientID into patientGroups
             from studyGroups in
                 **** (from instance in patientGroups
                   group instance by instance.StudyID) ****
                   from seriesGroup in
                       (from instance in studyGroups
                        group instance by instance.SeriesID)
                            from instanceGroup in
                                 (from instance in seriesGroup
                                  group instance by instance.InstanceID)
             group instanceGroup by patientGroups.Key;

Here we have

from studyGroups in ( from ... ) ...

the spec says that this is equivalent to

from ... into studyGroups ...

so we can rewrite your query as

var groups = from instance in instances
             group instance by instance.PatientID into patientGroups
             from instance in patientGroups
             group instance by instance.StudyID into studyGroups
             from seriesGroup in
             **** (from instance in studyGroups
                  group instance by instance.SeriesID) ****
                      from instanceGroup in
                           (from instance in seriesGroup
                            group instance by instance.InstanceID)
             group instanceGroup by patientGroups.Key;

Do it again. Now we have

from seriesGroup in (from ... ) ...

and the spec says that this is the same as

from ... into seriesGroup ...

so rewrite it like that:

var groups = from instance in instances 
             group instance by instance.PatientID into patientGroups
             from instance in patientGroups 
             group instance by instance.StudyID into studyGroups
             from instance in studyGroups
             group instance by instance.SeriesID into seriesGroup
             from instanceGroup in
              ****     (from instance in seriesGroup
                   group instance by instance.InstanceID) ****
             group instanceGroup by patientGroups.Key;

And again!

var groups = from instance in instances 
             group instance by instance.PatientID into patientGroups
             from instance in patientGroups 
             group instance by instance.StudyID into studyGroups
             from instance in studyGroups
             group instance by instance.SeriesID into seriesGroup
             from instance in seriesGroup
             group instance by instance.InstanceID into instanceGroup
             group instanceGroup by patientGroups.Key;

Which I hope you agree is a whole lot easier to read. I would improve its readability more by changing the fact that "instance" is used half a dozen times to mean different things:

var groups = from instance in instances 
             group instance by instance.PatientID into patientGroups
             from patientGroup in patientGroups 
             group patientGroup by patientGroup.StudyID into studyGroups
             from studyGroup in studyGroups
             group studyGroup by studyGroup.SeriesID into seriesGroups
             from seriesGroup in seriesGroups
             group seriesGroup by seriesGroup.InstanceID into instanceGroup
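             // patientGroups is out of scope after the continuations above,
             // so the patient key is read from an element of the group instead
             group instanceGroup by instanceGroup.First().PatientID;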

Whether this is actually the query you need to solve your problem, I don't know, but at least this one you can reason about without turning yourself inside out trying to follow all the nesting.

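This technique is called "query continuation". Basically the idea is that the continuation introduces a new range variable over the query so far. The range variables declared before the "into" go out of scope at that point, so anything a later clause needs has to be reachable through the new range variable.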
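
If what you are ultimately after is the nested patient/study/series/instance hierarchy, one possible sketch is to do each inner grouping inside the projection of the outer one, so that every outer key stays attached to its inner groups. This is not necessarily what the queries above compute. The class `InstanceHierarchy`, the method `BuildPaths` and the range variable names are hypothetical, and the sketch assumes plain LINQ to Objects over an in-memory list:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class InstanceInformation
{
    public string PatientID { get; set; }
    public string StudyID { get; set; }
    public string SeriesID { get; set; }
    public string InstanceID { get; set; }
}

public static class InstanceHierarchy
{
    // Hypothetical helper: builds a patient -> study -> series -> instance
    // hierarchy and returns one "PatientID/StudyID/SeriesID/InstanceID" path
    // per instance.
    public static List<string> BuildPaths(IEnumerable<InstanceInformation> instances)
    {
        // Each level is grouped inside the select of the level above it,
        // so the nesting of the result mirrors the directory structure.
        var patients =
            from instance in instances
            group instance by instance.PatientID into patientGroup
            select new
            {
                PatientID = patientGroup.Key,
                Studies =
                    from patientInstance in patientGroup
                    group patientInstance by patientInstance.StudyID into studyGroup
                    select new
                    {
                        StudyID = studyGroup.Key,
                        Series =
                            from studyInstance in studyGroup
                            group studyInstance by studyInstance.SeriesID into seriesGroup
                            select new
                            {
                                SeriesID = seriesGroup.Key,
                                InstanceIDs = seriesGroup.Select(i => i.InstanceID)
                            }
                    }
            };

        // Walk the hierarchy with nested foreach loops, one loop per level.
        var paths = new List<string>();
        foreach (var patient in patients)
            foreach (var study in patient.Studies)
                foreach (var series in study.Series)
                    foreach (var instanceID in series.InstanceIDs)
                        paths.Add(string.Join("/", patient.PatientID, study.StudyID,
                                              series.SeriesID, instanceID));
        return paths;
    }
}
```

Calling `InstanceHierarchy.BuildPaths(instances)` on the original list then yields the paths grouped patient by patient, and the same nested foreach loops can be reused wherever the hierarchy itself is needed rather than the path strings.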

Eric Lippert answered Oct 01 '22 16:10
