Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pivot data using LINQ

People also ask

Is it possible to Pivot data using LINQ?

In Linq, you get the key and any elements as children of the key (hierarchical shape). To pivot, you must project the hierarchy back into a row/column form of your choosing.

How do I convert rows into columns using LINQ?

We need it to set column name of first column. data selector - this is a method which will be run on grouped data for each cell. I.e. in your case we just select keyValue property of first item in group. But it could be items count items => items.


I'm not saying it is a great way to pivot - but it is a pivot...

    // sample data
    var data = new[] {
        new { Foo = 1, Bar = "Don Smith"},
        new { Foo = 1, Bar = "Mike Jones"},
        new { Foo = 1, Bar = "James Ray"},
        new { Foo = 2, Bar = "Tom Rizzo"},
        new { Foo = 2, Bar = "Alex Homes"},
        new { Foo = 3, Bar = "Andy Bates"},
    };
    // group into columns, and select the rows per column
    var grps = from d in data
              group d by d.Foo
              into grp
              select new {
                  Foo = grp.Key,
                  Bars = grp.Select(d2 => d2.Bar).ToArray()
              };

    // find the total number of (data) rows
    int rows = grps.Max(grp => grp.Bars.Length);

    // output columns
    foreach (var grp in grps) {
        Console.Write(grp.Foo + "\t");
    }
    Console.WriteLine();
    // output data
    for (int i = 0; i < rows; i++) {
        foreach (var grp in grps) {
            Console.Write((i < grp.Bars.Length ? grp.Bars[i] : null) + "\t");
        }
        Console.WriteLine();
    }

Marc's answer gives sparse matrix that can't be pumped into Grid directly.
I tried to expand the code from the link provided by Vasu as below:

public static Dictionary<TKey1, Dictionary<TKey2, TValue>> Pivot3<TSource, TKey1, TKey2, TValue>(
    this IEnumerable<TSource> source
    , Func<TSource, TKey1> key1Selector
    , Func<TSource, TKey2> key2Selector
    , Func<IEnumerable<TSource>, TValue> aggregate)
{
    return source.GroupBy(key1Selector).Select(
        x => new
        {
            X = x.Key,
            Y = source.GroupBy(key2Selector).Select(
                z => new
                {
                    Z = z.Key,
                    V = aggregate(from item in source
                                  where key1Selector(item).Equals(x.Key)
                                  && key2Selector(item).Equals(z.Key)
                                  select item
                    )

                }
            ).ToDictionary(e => e.Z, o => o.V)
        }
    ).ToDictionary(e => e.X, o => o.Y);
} 
internal class Employee
{
    public string Name { get; set; }
    public string Department { get; set; }
    public string Function { get; set; }
    public decimal Salary { get; set; }
}
public void TestLinqExtenions()
{
    var l = new List<Employee>() {
    new Employee() { Name = "Fons", Department = "R&D", Function = "Trainer", Salary = 2000 },
    new Employee() { Name = "Jim", Department = "R&D", Function = "Trainer", Salary = 3000 },
    new Employee() { Name = "Ellen", Department = "Dev", Function = "Developer", Salary = 4000 },
    new Employee() { Name = "Mike", Department = "Dev", Function = "Consultant", Salary = 5000 },
    new Employee() { Name = "Jack", Department = "R&D", Function = "Developer", Salary = 6000 },
    new Employee() { Name = "Demy", Department = "Dev", Function = "Consultant", Salary = 2000 }};

    var result5 = l.Pivot3(emp => emp.Department, emp2 => emp2.Function, lst => lst.Sum(emp => emp.Salary));
    var result6 = l.Pivot3(emp => emp.Function, emp2 => emp2.Department, lst => lst.Count());
}

* can't say anything about the performance though.


You can use Linq's .ToLookup to group in the manner you are looking for.

var lookup = data.ToLookup(d => d.TypeCode, d => d.User);

Then it's a matter of putting it into a form that your consumer can make sense of. For instance:

//Warning: untested code
var enumerators = lookup.Select(g => g.GetEnumerator()).ToList();
int columns = enumerators.Count;
while(columns > 0)
{
  for(int i = 0; i < enumerators.Count; ++i)
  {
    var enumerator = enumerators[i];
    if(enumator == null) continue;
    if(!enumerator.MoveNext())
    { 
      --columns;
      enumerators[i] = null;
    }
  }
  yield return enumerators.Select(e => (e != null) ? e.Current : null);
}

Put that in an IEnumerable<> method and it will (probably) return a collection (rows) of collections (column) of User where a null is put in a column that has no data.


I guess this is similar to Marc's answer, but I'll post it since I spent some time working on it. The results are separated by " | " as in your example. It also uses the IGrouping<int, string> type returned from the LINQ query when using a group by instead of constructing a new anonymous type. This is tested, working code.

var Items = new[] {
    new { TypeCode = 1, UserName = "Don Smith"},
    new { TypeCode = 1, UserName = "Mike Jones"},
    new { TypeCode = 1, UserName = "James Ray"},
    new { TypeCode = 2, UserName = "Tom Rizzo"},
    new { TypeCode = 2, UserName = "Alex Homes"},
    new { TypeCode = 3, UserName = "Andy Bates"}
};
var Columns = from i in Items
              group i.UserName by i.TypeCode;
Dictionary<int, List<string>> Rows = new Dictionary<int, List<string>>();
int RowCount = Columns.Max(g => g.Count());
for (int i = 0; i <= RowCount; i++) // Row 0 is the header row.
{
    Rows.Add(i, new List<string>());
}
int RowIndex;
foreach (IGrouping<int, string> c in Columns)
{
    Rows[0].Add(c.Key.ToString());
    RowIndex = 1;
    foreach (string user in c)
    {
        Rows[RowIndex].Add(user);
        RowIndex++;
    }
    for (int r = RowIndex; r <= Columns.Count(); r++)
    {
        Rows[r].Add(string.Empty);
    }
}
foreach (List<string> row in Rows.Values)
{
    Console.WriteLine(row.Aggregate((current, next) => current + " | " + next));
}
Console.ReadLine();

I also tested it with this input:

var Items = new[] {
    new { TypeCode = 1, UserName = "Don Smith"},
    new { TypeCode = 3, UserName = "Mike Jones"},
    new { TypeCode = 3, UserName = "James Ray"},
    new { TypeCode = 2, UserName = "Tom Rizzo"},
    new { TypeCode = 2, UserName = "Alex Homes"},
    new { TypeCode = 3, UserName = "Andy Bates"}
};

Which produced the following results showing that the first column doesn't need to contain the longest list. You could use OrderBy to get the columns ordered by TypeCode if needed.

1         | 3          | 2
Don Smith | Mike Jones | Tom Rizzo
          | James Ray  | Alex Homes
          | Andy Bates | 

@Sanjaya.Tio I was intrigued by your answer and created this adaptation which minimizes keySelector execution. (untested)

public static Dictionary<TKey1, Dictionary<TKey2, TValue>> Pivot3<TSource, TKey1, TKey2, TValue>(
    this IEnumerable<TSource> source
    , Func<TSource, TKey1> key1Selector
    , Func<TSource, TKey2> key2Selector
    , Func<IEnumerable<TSource>, TValue> aggregate)
{
  var lookup = source.ToLookup(x => new {Key1 = key1Selector(x), Key2 = key2Selector(x)});

  List<TKey1> key1s = lookup.Select(g => g.Key.Key1).Distinct().ToList();
  List<TKey2> key2s = lookup.Select(g => g.Key.Key2).Distinct().ToList();

  var resultQuery =
    from key1 in key1s
    from key2 in key2s
    let lookupKey = new {Key1 = key1, Key2 = key2}
    let g = lookup[lookupKey]
    let resultValue = g.Any() ? aggregate(g) : default(TValue)
    select new {Key1 = key1, Key2 = key2, ResultValue = resultValue};

  Dictionary<TKey1, Dictionary<TKey2, TValue>> result = new Dictionary<TKey1, Dictionary<TKey2, TValue>>();
  foreach(var resultItem in resultQuery)
  {
    TKey1 key1 = resultItem.Key1;
    TKey2 key2 = resultItem.Key2;
    TValue resultValue = resultItem.ResultValue;

    if (!result.ContainsKey(key1))
    {
      result[key1] = new Dictionary<TKey2, TValue>();
    }
    var subDictionary = result[key1];
    subDictionary[key2] = resultValue; 
  }
  return result;
}