Linq query to join against list in a struct

Tags:

c#

linq

linq-to-objects

I have a dictionary of structs, where one member of each struct is a list containing varying elements applicable to that dictionary item.

I would like to join these elements against each item, in order to filter them and/or group them by element.

In SQL I'm familiar with joining against tables/queries to obtain multiple rows as desired, but I'm new to C#/Linq. Since a "column" can be an object/list already associated with the proper dictionary items, I wonder how I can use them to perform a join?

Here's a sample of the structure:

name   elements
item1  list: elementA
item2  list: elementA, elementB

I would like a query that gives this output (count = 3)

name   elements
item1  elementA
item2  elementA
item2  elementB

Ultimately, I'd like to group them like this:

   element    count
   elementA   2
   elementB   1

Here's my starting code, which so far only counts the dictionary items.

    public struct MyStruct
    {
        public string name;
        public List<string> elements;
    }

    private void button1_Click(object sender, EventArgs e)
    {
        MyStruct myStruct = new MyStruct();
        Dictionary<String, MyStruct> dict = new Dictionary<string, MyStruct>();

        // Populate 2 items
        myStruct.name = "item1";
        myStruct.elements = new List<string>();
        myStruct.elements.Add("elementA");
        dict.Add(myStruct.name, myStruct);

        myStruct.name = "item2";
        myStruct.elements = new List<string>();
        myStruct.elements.Add("elementA");
        myStruct.elements.Add("elementB");
        dict.Add(myStruct.name, myStruct);


        var q = from t in dict
                select t;

        MessageBox.Show(q.Count().ToString()); // Returns 2
    }

Edit: I don't really need the output to be a dictionary. I used one to store my data because it works well and prevents duplicates (I have a unique item.name, which I store as the key). However, for the purpose of filtering/grouping, I guess it could be a list or array without issues. I can always call .ToDictionary with key = item.name afterwards.


asked Feb 29 '12 06:02

mtone

1 Answer

var q = from t in dict
    from v in t.Value.elements
    select new { name = t.Key, element = v };

The method here is Enumerable.SelectMany. Using extension method syntax:

var q = dict.SelectMany(t => t.Value.elements.Select(v => new { name = t.Key, element = v }));

EDIT

Note that you could also use t.Value.name above, instead of t.Key, since these values are equal.
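
To get from the flattened pairs to the per-element counts the question asks for, one option is to group by element and count each group. The following is a minimal sketch rather than the only way to do it; the `Demo` class and the `BuildSample` and `CountByElement` helpers are hypothetical names introduced for illustration, and the data mirrors the sample in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public struct MyStruct
{
    public string name;
    public List<string> elements;
}

public static class Demo
{
    // Hypothetical helper: rebuilds the two-item sample from the question.
    public static Dictionary<string, MyStruct> BuildSample()
    {
        return new Dictionary<string, MyStruct>
        {
            { "item1", new MyStruct { name = "item1", elements = new List<string> { "elementA" } } },
            { "item2", new MyStruct { name = "item2", elements = new List<string> { "elementA", "elementB" } } },
        };
    }

    // Flattens the dictionary to (name, element) pairs, then groups by element and counts each group.
    public static Dictionary<string, int> CountByElement(Dictionary<string, MyStruct> dict)
    {
        var pairs = from t in dict
                    from v in t.Value.elements
                    select new { name = t.Key, element = v };

        return pairs
            .GroupBy(p => p.element)
            .ToDictionary(g => g.Key, g => g.Count());
    }

    public static void Main()
    {
        foreach (KeyValuePair<string, int> kv in CountByElement(BuildSample()))
            Console.WriteLine(kv.Key + " " + kv.Value);
    }
}
```

A filter fits between the two steps, for example `pairs.Where(p => p.name != "item1")` before the `GroupBy`.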

So, what's going on here?

The query-comprehension syntax is probably easiest to understand; you can write an equivalent iterator block to see what's going on. We can't do that simply with an anonymous type, however, so we'll declare a type to return:

class NameElement
{
    public string name { get; set; }
    public string element { get; set; }
}
IEnumerable<NameElement> GetResults(Dictionary<string, MyStruct> dict)
{
    foreach (KeyValuePair<string, MyStruct> t in dict)
        foreach (string v in t.Value.elements)
            yield return new NameElement { name = t.Key, element = v };
}

How about the extension method syntax (or, what's really going on here)?

(This is inspired in part by Eric Lippert's post at https://stackoverflow.com/a/2704795/385844; I had a much more complicated explanation, then I read that, and came up with this:)

Let's say we want to avoid declaring the NameElement type. We could use an anonymous type by passing in a function. We'd change the call from this:

var q = GetResults(dict);

to this:

var q = GetResults(dict, (string1, string2) => new { name = string1, element = string2 });

The lambda expression (string1, string2) => new { name = string1, element = string2 } represents a function that takes 2 strings -- defined by the argument list (string1, string2) -- and returns an instance of the anonymous type initialized with those strings -- defined by the expression new { name = string1, element = string2 }.

The corresponding implementation is this:

IEnumerable<T> GetResults<T>(
    IEnumerable<KeyValuePair<string, MyStruct>> pairs,
    Func<string, string, T> resultSelector)
{
    foreach (KeyValuePair<string, MyStruct> pair in pairs)
        foreach (string e in pair.Value.elements)
            yield return resultSelector.Invoke(pair.Key, e);
}

Type inference allows us to call this function without specifying T by name. That's handy, because (as far as we are aware as C# programmers), the type we're using doesn't have a name: it's anonymous.
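
A small sketch of that inference at work, assuming the `GetResults<T>` signature above with the loop body written in terms of `pair` and `e`. The `Demo` class and the `Run` helper are hypothetical names used only for this illustration. Note that `var` is required at the call site, because the anonymous type cannot be written out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public struct MyStruct
{
    public string name;
    public List<string> elements;
}

public static class Demo
{
    public static IEnumerable<T> GetResults<T>(
        IEnumerable<KeyValuePair<string, MyStruct>> pairs,
        Func<string, string, T> resultSelector)
    {
        foreach (KeyValuePair<string, MyStruct> pair in pairs)
            foreach (string e in pair.Value.elements)
                yield return resultSelector.Invoke(pair.Key, e);
    }

    // Hypothetical helper: flattens the question's sample into "name:element" strings.
    public static List<string> Run()
    {
        var dict = new Dictionary<string, MyStruct>
        {
            { "item1", new MyStruct { name = "item1", elements = new List<string> { "elementA" } } },
            { "item2", new MyStruct { name = "item2", elements = new List<string> { "elementA", "elementB" } } },
        };

        // T is never named: the compiler infers it from the lambda's return expression.
        var q = GetResults(dict, (string1, string2) => new { name = string1, element = string2 });

        // The members of the inferred anonymous type are still statically known here.
        return q.Select(x => x.name + ":" + x.element).ToList();
    }

    public static void Main()
    {
        foreach (string s in Run())
            Console.WriteLine(s);
    }
}
```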

Note that the variable t is now pair, to avoid confusion with the type parameter T, and v is now e, for "element". We've also changed the type of the first parameter from the dictionary type to one of the interfaces it implements, IEnumerable<KeyValuePair<string, MyStruct>>. It's wordier, but it makes the method more useful, and it will be helpful in the end. As the type is no longer a dictionary type, we've also changed the name of the parameter from dict to pairs.

We could generalize this further. The second foreach has the effect of projecting a key-value pair to a sequence of T. That whole effect could be encapsulated in a single function; the delegate type would be Func<KeyValuePair<string, MyStruct>, IEnumerable<T>>. The first step is to refactor the method so we have a single expression that converts the element pair into a sequence, using the Select method to invoke the resultSelector delegate:

IEnumerable<T> GetResults<T>(
    IEnumerable<KeyValuePair<string, MyStruct>> pairs,
    Func<string, string, T> resultSelector)
{
    foreach (KeyValuePair<string, MyStruct> pair in pairs)
        foreach (T result in pair.Value.elements.Select(e => resultSelector.Invoke(pair.Key, e)))
            yield return result;
}

Now we can easily change the signature:

IEnumerable<T> GetResults<T>(
    IEnumerable<KeyValuePair<string, MyStruct>> pairs,
    Func<KeyValuePair<string, MyStruct>, IEnumerable<T>> resultSelector)
{
    foreach (KeyValuePair<string, MyStruct> pair in pairs)
        foreach (T result in resultSelector.Invoke(pair))
            yield return result;
}

The call site now looks like this; notice how the lambda expression now incorporates the logic that we removed from the method body when we changed its signature:

var q = GetResults(dict, pair => pair.Value.elements.Select(e => new { name = pair.Key, element = e }));

To make the method more useful (and its implementation less verbose), let's replace the type KeyValuePair<string, MyStruct> with a type parameter, TSource. We'll change some other names at the same time:

T     -> TResult
pairs -> sourceSequence
pair  -> sourceElement

And, just for kicks, we'll make it an extension method:

static IEnumerable<TResult> GetResults<TSource, TResult>(
    this IEnumerable<TSource> sourceSequence,
    Func<TSource, IEnumerable<TResult>> resultSelector)
{
    foreach (TSource sourceElement in sourceSequence)
        foreach (TResult result in resultSelector.Invoke(sourceElement))
            yield return result;
}

And there you have it: SelectMany! Well, the function still has the wrong name, and the actual implementation includes validation that the source sequence and the selector function are non-null, but that's the core logic.
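
A rough sketch of what that validation might look like, assuming the common pattern of a public method that checks its arguments eagerly and then delegates to a private iterator, so the exceptions are thrown at call time rather than on first enumeration. This is not the framework's source code; `MyEnumerable`, `MySelectMany`, and `MySelectManyIterator` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class MyEnumerable
{
    // Validates immediately; an iterator block on its own would defer these checks.
    public static IEnumerable<TResult> MySelectMany<TSource, TResult>(
        this IEnumerable<TSource> source,
        Func<TSource, IEnumerable<TResult>> selector)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (selector == null) throw new ArgumentNullException(nameof(selector));
        return MySelectManyIterator(source, selector);
    }

    // The core logic from above: project each element to a sequence, then flatten.
    private static IEnumerable<TResult> MySelectManyIterator<TSource, TResult>(
        IEnumerable<TSource> source,
        Func<TSource, IEnumerable<TResult>> selector)
    {
        foreach (TSource sourceElement in source)
            foreach (TResult result in selector.Invoke(sourceElement))
                yield return result;
    }
}

public static class Demo
{
    public static void Main()
    {
        var nested = new List<int[]> { new[] { 1, 2 }, new[] { 3 } };
        foreach (int n in nested.MySelectMany(a => a))
            Console.WriteLine(n);
    }
}
```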

From MSDN: SelectMany "projects each element of a sequence to an IEnumerable and flattens the resulting sequences into one sequence."
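
For completeness, `Enumerable.SelectMany` also has an overload that takes a collection selector plus a result selector, which lines up closely with the two `from` clauses of the query syntax. A minimal sketch using the question's sample data; the `Demo` class and the `Flatten` helper are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public struct MyStruct
{
    public string name;
    public List<string> elements;
}

public static class Demo
{
    // Hypothetical helper: flattens the dictionary into "name:element" strings.
    public static List<string> Flatten(Dictionary<string, MyStruct> dict)
    {
        var q = dict.SelectMany(
            t => t.Value.elements,                         // collection selector: one sequence per item
            (t, v) => new { name = t.Key, element = v });  // result selector: pairs item with element

        return q.Select(x => x.name + ":" + x.element).ToList();
    }

    public static void Main()
    {
        var dict = new Dictionary<string, MyStruct>
        {
            { "item1", new MyStruct { name = "item1", elements = new List<string> { "elementA" } } },
            { "item2", new MyStruct { name = "item2", elements = new List<string> { "elementA", "elementB" } } },
        };

        foreach (string s in Flatten(dict))
            Console.WriteLine(s);
    }
}
```

With this overload the nested `Select` call disappears, since the pairing of outer item and inner element is handled by the second lambda.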


answered Oct 11 '22 17:10

phoog
