Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Select row with MAX value per category Power BI

How to select row with max value per category in M of Power BI. Suppose we have table:

+----------+-------+------------+
| Category | Value |    Date    |
+----------+-------+------------+
| apples   |     1 | 2018-07-01 |
| apples   |     2 | 2018-07-02 |
| apples   |     3 | 2018-07-03 |
| bananas  |     7 | 2018-07-04 |
| bananas  |     8 | 2018-07-05 |
| bananas  |     9 | 2018-07-06 |
+----------+-------+------------+

Desired results are:

+----------+-------+------------+
| Category | Value |    Date    |
+----------+-------+------------+
| apples   |     3 | 2018-07-03 |
| bananas  |     9 | 2018-07-06 |
+----------+-------+------------+

Here is a start table for PBI:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6VYHSQ5I2Q5I1Q5Y2Q5Y7BcUmIeEIIkzZElTdAkLZAlTdEkLZElzZRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Category", type text}, {"Value", Int64.Type}, {"Date", type date}})
in
    #"Changed Type"

I wonder if there is a way to come to desired results in subsequent steps within only one table, by adding some magic column IsMax:

+----------+-------+------------+-------+
| Category | Value |    Date    | IsMax |
+----------+-------+------------+-------+
| apples   |     1 | 2018-07-01 |     0 |
| apples   |     2 | 2018-07-02 |     0 |
| apples   |     3 | 2018-07-03 |     1 |
| bananas  |     7 | 2018-07-04 |     0 |
| bananas  |     8 | 2018-07-05 |     0 |
| bananas  |     9 | 2018-07-06 |     1 |
+----------+-------+------------+-------+
like image 753
Przemyslaw Remin Avatar asked Jul 20 '18 08:07

Przemyslaw Remin


People also ask

How do you filter max value in power query?

In this sample data set each time we need to extract the highest sales value (max value) from the data set and this value should be dynamic whenever the new highest value is added to the records. Load data into the Power Query Editor for Excel or Power BI. Filter the column value (Sales Amount) with any number.

Does Power BI have a limit on number of rows?

Row limit - When using DirectQuery, Power BI imposes a limit on the query results that are sent to your underlying data source. If the query sent to the data source returns more than one million rows, you see an error and the query fails. Your underlying data can still contain more than one million rows.


1 Answers

Doing a basic Group By in the Power Query Editor (group by Category and take the max over Value) gets you this table:

+----------+-------+
| Category | Value |
+----------+-------+
| apples   |     3 |
| bananas  |     9 |
+----------+-------+

Add a custom column IsMax that is simply the value 1 to this table and then merge (left outer join) it with your original table matching on both Category and Value. Finally, expand the IsMax column to get your desired table, except with null instead of 0. You can replace the null values if you choose.

Here's the M code for all those steps:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6VYHSQ5I2Q5I1Q5Y2Q5Y7BcUmIeEIIkzZElTdAkLZAlTdEkLZElzZRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Value", Int64.Type}, {"Date", type date}, {"Category", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Category"}, {{"Value", each List.Max([Value]), Int64.Type}}),
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "IsMax", each 1, Int64.Type),
    #"Merged Queries" = Table.NestedJoin(#"Changed Type",{"Category", "Value"},#"Added Custom",{"Category", "Value"},"Added Custom",JoinKind.LeftOuter),
    #"Expanded Added Custom" = Table.ExpandTableColumn(#"Merged Queries", "Added Custom", {"IsMax"}, {"IsMax"})
in
    #"Expanded Added Custom"

Edit: A slightly simplified version to reproduce the "desired result" rather than the IsMax version:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6VYHSQ5I2Q5I1Q5Y2Q5Y7BcUmIeEIIkzZElTdAkLZAlTdEkLZElzZRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Value", Int64.Type}, {"Date", type date}, {"Category", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Category"}, {{"Value", each List.Max([Value]), Int64.Type}}),
    #"Merged Queries" = Table.NestedJoin(#"Grouped Rows", {"Category", "Value"}, #"Changed Type", {"Category", "Value"}, "Grouped Rows", JoinKind.LeftOuter),
    #"Expanded Grouped Rows" = Table.ExpandTableColumn(#"Merged Queries", "Grouped Rows", {"Date"}, {"Date"})
in
    #"Expanded Grouped Rows"

Edit 2: @user11632362 pointed me at another solution that's even fewer steps.

Everything happens in the Group By step.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6VYHSQ5I2Q5I1Q5Y2Q5Y7BcUmIeEIIkzZElTdAkLZAlTdEkLZElzZRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Value", Int64.Type}, {"Date", type date}, {"Category", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Category"}, {{"Value", each List.Max([Value]), Int64.Type}, {"Date", each Table.Max(_, "Value")[Date], type date}})
in
    #"Grouped Rows"

The key here is each Table.Max(_, "Value")[Date]. This sorts the subtable by Value and returns the top row of the result as a record (and the [Date] suffix returns the value in the Date field of that record).

Note that this only pulls over a single column, Date. If you need to pull in a bunch of columns it might make more sense to return the full record and expand all the desired fields in another step rather than adding more columns to the Group By step.

For example,

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6hAolKsDpIaI2Q1RlCBJFQ1xshqjKECyWA1SYl5QAhSZI6syATIAeEUNEUWyIpMgRwQTkVTZImsyAzIAeE0pdhYAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t, Col1 = _t, Col2 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Category", type text}, {"Value", Int64.Type}, {"Date", type date}, {"Col1", Int64.Type}, {"Col2", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Category"}, {{"Value", each List.Max([Value]), type nullable date}, {"TopValueRow", each Table.Max(_, "Value"), type record}}),
    #"Expanded TopValueRow" = Table.ExpandRecordColumn(#"Grouped Rows", "TopValueRow", {"Date", "Col1", "Col2"}, {"Date", "Col1", "Col2"})
in
    #"Expanded TopValueRow"
like image 106
Alexis Olson Avatar answered Sep 23 '22 20:09

Alexis Olson