What is the use of Enumerable.Zip extension method in Linq?

The Zip operator merges the corresponding elements of two sequences using a specified selector function.

var letters = new string[] { "A", "B", "C", "D", "E" };
var numbers = new int[] { 1, 2, 3 };
var q = letters.Zip(numbers, (l, n) => l + n.ToString());
foreach (var s in q)
    Console.WriteLine(s);

Output

A1
B2
C3

Zip is for combining two sequences into one. For example, if you have the sequences

1, 2, 3

and

10, 20, 30

and you want the sequence that is the result of multiplying elements in the same position in each sequence to obtain

10, 40, 90

you could say

var left = new[] { 1, 2, 3 };
var right = new[] { 10, 20, 30 };
var products = left.Zip(right, (m, n) => m * n);

It is called "zip" because you think of one sequence as the left-side of a zipper, and the other sequence as the right-side of the zipper, and the zip operator will pull the two sides together pairing off the teeth (the elements of the sequence) appropriately.


It iterates through two sequences and combines their elements, one by one, into a single new sequence. So you take an element of sequence A, transform it with the corresponding element from sequence B, and the result forms an element of sequence C.

One way to think about it is that it's similar to Select, except instead of transforming items from a single collection, it works on two collections at once.

From the MSDN article on the method:

int[] numbers = { 1, 2, 3, 4 };
string[] words = { "one", "two", "three" };

var numbersAndWords = numbers.Zip(words, (first, second) => first + " " + second);

foreach (var item in numbersAndWords)
    Console.WriteLine(item);

// This code produces the following output:

// 1 one
// 2 two
// 3 three

If you were to do this in imperative code, you'd probably do something like this:

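var numbersAndWords = new List<string>();
// Stop at the end of the shorter array, just like Zip does.
for (int i = 0; i < numbers.Length && i < words.Length; i++)
{
    numbersAndWords.Add(numbers[i] + " " + words[i]);
}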

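Or if LINQ didn't have Zip in it, you could do this (the Take keeps the index within the bounds of words, which is the shorter array here):

var numbersAndWords = numbers.Take(words.Length).Select(
                          (num, i) => num + " " + words[i]
                      );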

This is useful when you have data spread into simple, array-like lists, each with the same length and order, and each describing a different property of the same set of objects. Zip helps you knit those pieces of data together into a more coherent structure.

So if you have an array of state names and another array of their populations, you could collate them into a State class like so:

IEnumerable<State> GetListOfStates(string[] stateNames, int[] statePopulations)
{
    return stateNames.Zip(statePopulations, 
                          (name, population) => new State()
                          {
                              Name = name,
                              Population = population
                          });
}
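
For anyone who wants to run the snippet above, here is a minimal self-contained sketch. The State class is not shown in the original, so the shape below (two auto-properties) is an assumption, and the names and populations are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical State type; the original snippet does not show its definition.
public class State
{
    public string Name { get; set; }
    public int Population { get; set; }
}

public static class StateDemo
{
    public static IEnumerable<State> GetListOfStates(string[] stateNames, int[] statePopulations)
    {
        // Pair each name with the population at the same index.
        return stateNames.Zip(statePopulations,
                              (name, population) => new State { Name = name, Population = population });
    }

    public static void Main()
    {
        // Made-up values, for illustration only.
        var names = new[] { "Alpha", "Beta", "Gamma" };
        var populations = new[] { 100, 200, 300 };

        foreach (var state in GetListOfStates(names, populations))
            Console.WriteLine(state.Name + ": " + state.Population);
    }
}
```

This only works as intended if both arrays are in the same order; Zip pairs by position and knows nothing about which population belongs to which name.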

DO NOT let the name Zip throw you off. It has nothing to do with zipping as in zipping a file or a folder (compressing). It actually gets its name from how a zipper on clothes works: The zipper on clothes has 2 sides and each side has a bunch of teeth. When you go in one direction, the zipper enumerates (travels) both sides and closes the zipper by clenching the teeth. When you go in the other direction it opens the teeth. You either end with an open or closed zipper.

It is the same idea with the Zip method. Consider an example where we have two collections. One holds letters and the other holds the name of a food item which starts with that letter. For clarity purposes I am calling them leftSideOfZipper and rightSideOfZipper. Here is the code.

var leftSideOfZipper = new List<string> { "A", "B", "C", "D", "E" };
var rightSideOfZipper = new List<string> { "Apple", "Banana", "Coconut", "Donut" };

Our task is to produce one collection in which each item is the letter, followed by a : and then the name of the food item. Like this:

A : Apple
B : Banana
C : Coconut
D : Donut

Zip to the rescue. To keep up with our zipper terminology we will call this result closedZipper, the items of the left side we will call leftTooth, and the items of the right side we will call rightTooth, for obvious reasons:

var closedZipper = leftSideOfZipper
   .Zip(rightSideOfZipper, (leftTooth, rightTooth) => leftTooth + " : " + rightTooth).ToList();

In the above we are enumerating (travelling) the left side of the zipper and the right side of the zipper and performing an operation on each tooth. The operation we are performing is concatenating the left tooth (food letter) with a : and then the right tooth (food name). We do that using this code:

(leftTooth, rightTooth) => leftTooth + " : " + rightTooth

The end result is this:

A : Apple
B : Banana
C : Coconut
D : Donut

What happened to the last letter E?

If you are pulling a real clothes zipper and one side (left or right, it does not matter) has fewer teeth than the other, what will happen? Well, the zipper will stop there. The Zip method does exactly the same: it stops once it has reached the last item on either side. In our case the right side has fewer teeth (food names), so it stops at "Donut".


A lot of the answers here demonstrate Zip, but without really explaining a real life use-case that would motivate the use of Zip.

One particularly common pattern that Zip is fantastic for is iterating over successive pairs of things. This is done by zipping an enumerable x with itself, skipping 1 element: x.Zip(x.Skip(1), ...). Visual example:

 x | x.Skip(1) | x.Zip(x.Skip(1), ...)
---+-----------+----------------------
 1 |    2      | (1, 2)
 2 |    3      | (2, 3)
 3 |    4      | (3, 4)
 4 |    5      | (4, 5)
 5 |           |

These successive pairs are useful for finding the first differences between values. For example, successive pairs of IEnumerable<MouseXPosition> can be used to produce IEnumerable<MouseXDelta>. Similarly, sampled bool values of a button can be interpreted into events like NotPressed/Clicked/Held/Released. Those events can then drive calls to delegate methods. Here's an example:

using System;
using System.Collections.Generic;
using System.Linq;

enum MouseEvent { NotPressed, Clicked, Held, Released }

public class Program {
    public static void Main() {
        // Example: Sampling the boolean state of a mouse button
        List<bool> mouseStates = new List<bool> { false, false, false, false, true, true, true, false, true, false, false, true };

        mouseStates.Zip(mouseStates.Skip(1), (oldMouseState, newMouseState) => {
            if (oldMouseState) {
                if (newMouseState) return MouseEvent.Held;
                else return MouseEvent.Released;
            } else {
                if (newMouseState) return MouseEvent.Clicked;
                else return MouseEvent.NotPressed;
            }
        })
        .ToList()
        .ForEach(mouseEvent => Console.WriteLine(mouseEvent) );
    }
}

Prints:

NotPressed
NotPressed
NotPressed
Clicked
Held
Held
Released
Clicked
Released
NotPressed
Clicked
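
The first-differences idea mentioned earlier (positions in, deltas out) can be sketched the same way. This is only an illustration: plain int values stand in for the MouseXPosition and MouseXDelta types, and the Deltas helper and the sample values are made up here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeltaDemo
{
    // Hypothetical helper: the difference between each element and the one before it.
    public static IEnumerable<int> Deltas(IEnumerable<int> positions)
    {
        // Zip the sequence with itself shifted by one: (p0, p1), (p1, p2), ...
        return positions.Zip(positions.Skip(1), (previous, current) => current - previous);
    }

    public static void Main()
    {
        // Made-up sample of mouse X positions.
        var xPositions = new List<int> { 10, 12, 15, 15, 11 };

        // Prints: 2, 3, 0, -4
        Console.WriteLine(string.Join(", ", Deltas(xPositions)));
    }
}
```

Note that this enumerates the source twice (once directly, once through Skip), so it is best suited to in-memory collections rather than sequences that are expensive or impossible to replay.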

I don't have the rep points to post in the comments section, but to answer the related question :

What if I want zip to continue where one list run out of elements? In which case the shorter list element should take default value. Output in this case to be A1, B2, C3, D0, E0. – liang Nov 19 '15 at 3:29

What you would do is to use Array.Resize() to pad-out the shorter sequence with default values, and then Zip() them together.

Code example :

var letters = new string[] { "A", "B", "C", "D", "E" };
var numbers = new int[] { 1, 2, 3 };
if (numbers.Length < letters.Length)
    Array.Resize(ref numbers, letters.Length);
var q = letters.Zip(numbers, (l, n) => l + n.ToString());
foreach (var s in q)
    Console.WriteLine(s);

Output:

A1
B2
C3
D0
E0

Please note that using Array.Resize() has a caveat, discussed in the question "Redim Preserve in C#?": it creates a new array rather than growing the existing one.
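
A small sketch of what that caveat looks like in practice, as I understand it: Array.Resize() allocates a new array and copies the elements over, so any other variable still holding the old array does not see the change. The variable name alias is just for illustration.

```csharp
using System;

public static class ResizeDemo
{
    public static void Main()
    {
        var numbers = new int[] { 1, 2, 3 };
        var alias = numbers;              // second reference to the same array

        Array.Resize(ref numbers, 5);     // allocates a new array and copies 1, 2, 3 into it

        Console.WriteLine(numbers.Length);                          // 5
        Console.WriteLine(alias.Length);                            // 3
        Console.WriteLine(object.ReferenceEquals(numbers, alias));  // False
    }
}
```

So if other code keeps a reference to the original array, it will still see the old, shorter one after the resize.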

If it is unknown which sequence will be the shorter one, a function can be created that works it out:

static void Main(string[] args)
{
    var letters = new string[] { "A", "B", "C", "D", "E" };
    var numbers = new int[] { 1, 2, 3 };
    var q = letters.Zip(numbers, (l, n) => l + n.ToString()).ToArray();
    var qDef = ZipDefault(letters, numbers);
    Array.Resize(ref q, qDef.Count());
    // Note: using a second .Zip() to show the results side-by-side
    foreach (var s in q.Zip(qDef, (a, b) => string.Format("{0, 2} {1, 2}", a, b)))
        Console.WriteLine(s);
}

static IEnumerable<string> ZipDefault(string[] letters, int[] numbers)
{
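    // Pad whichever array is shorter with default values (null / 0).
    // CompareTo only guarantees the sign of its result, so compare the lengths directly.
    // Array.Resize replaces only these local references; the caller's arrays are untouched.
    if (letters.Length < numbers.Length)
        Array.Resize(ref letters, numbers.Length);
    else if (numbers.Length < letters.Length)
        Array.Resize(ref numbers, letters.Length);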
    return letters.Zip(numbers, (l, n) => l + n.ToString()); 
}

Output of plain .Zip() alongside ZipDefault() :

A1 A1
B2 B2
C3 C3
   D0
   E0
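
If the inputs are not arrays, the same padding behaviour can be had without Array.Resize() by walking both enumerators by hand. The sketch below is one possible way to do it; ZipOrDefault is a made-up name, not something that ships with LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZipExtensions
{
    // Hypothetical helper (not part of LINQ): like Zip, but keeps going until BOTH
    // sequences are exhausted, substituting default(T) for the one that ran out.
    public static IEnumerable<TResult> ZipOrDefault<TFirst, TSecond, TResult>(
        this IEnumerable<TFirst> first,
        IEnumerable<TSecond> second,
        Func<TFirst, TSecond, TResult> resultSelector)
    {
        using (var e1 = first.GetEnumerator())
        using (var e2 = second.GetEnumerator())
        {
            bool has1 = e1.MoveNext();
            bool has2 = e2.MoveNext();
            while (has1 || has2)
            {
                yield return resultSelector(
                    has1 ? e1.Current : default(TFirst),
                    has2 ? e2.Current : default(TSecond));

                // Only advance an enumerator that still has elements.
                has1 = has1 && e1.MoveNext();
                has2 = has2 && e2.MoveNext();
            }
        }
    }
}

public static class ZipOrDefaultDemo
{
    public static void Main()
    {
        var letters = new[] { "A", "B", "C", "D", "E" };
        var numbers = new[] { 1, 2, 3 };

        // Prints A1, B2, C3, D0, E0 on separate lines.
        foreach (var s in letters.ZipOrDefault(numbers, (l, n) => l + n.ToString()))
            Console.WriteLine(s);
    }
}
```

Unlike the Array.Resize() version, this works on any IEnumerable and does not copy either input, at the cost of a bit more code.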

Going back to the main answer of the original question, another interesting thing that one might wish to do (when the lengths of the sequences to be "zipped" are different) is to join them in such a way so that the end of the list matches instead of the top. This can be accomplished by "skipping" the appropriate number of items using .Skip().

foreach (var s in letters.Skip(letters.Length - numbers.Length).Zip(numbers, (l, n) => l + n.ToString()).ToArray())
    Console.WriteLine(s);

Output:

C1
D2
E3