How to select row with max value per category in M of Power BI. Suppose we have table:
+----------+-------+------------+
| Category | Value | Date |
+----------+-------+------------+
| apples | 1 | 2018-07-01 |
| apples | 2 | 2018-07-02 |
| apples | 3 | 2018-07-03 |
| bananas | 7 | 2018-07-04 |
| bananas | 8 | 2018-07-05 |
| bananas | 9 | 2018-07-06 |
+----------+-------+------------+
Desired results are:
+----------+-------+------------+
| Category | Value | Date |
+----------+-------+------------+
| apples | 3 | 2018-07-03 |
| bananas | 9 | 2018-07-06 |
+----------+-------+------------+
Here is a start table for PBI:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6VYHSQ5I2Q5I1Q5Y2Q5Y7BcUmIeEIIkzZElTdAkLZAlTdEkLZElzZRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Category", type text}, {"Value", Int64.Type}, {"Date", type date}})
in
#"Changed Type"
I wonder if there is a way to come to desired results in subsequent steps within only one table, by adding some magic column IsMax:
+----------+-------+------------+-------+
| Category | Value | Date | IsMax |
+----------+-------+------------+-------+
| apples | 1 | 2018-07-01 | 0 |
| apples | 2 | 2018-07-02 | 0 |
| apples | 3 | 2018-07-03 | 1 |
| bananas | 7 | 2018-07-04 | 0 |
| bananas | 8 | 2018-07-05 | 0 |
| bananas | 9 | 2018-07-06 | 1 |
+----------+-------+------------+-------+
In this sample data set each time we need to extract the highest sales value (max value) from the data set and this value should be dynamic whenever the new highest value is added to the records. Load data into the Power Query Editor for Excel or Power BI. Filter the column value (Sales Amount) with any number.
Row limit - When using DirectQuery, Power BI imposes a limit on the query results that are sent to your underlying data source. If the query sent to the data source returns more than one million rows, you see an error and the query fails. Your underlying data can still contain more than one million rows.
Doing a basic Group By in the Power Query Editor (group by Category
and take the max over Value
) gets you this table:
+----------+-------+
| Category | Value |
+----------+-------+
| apples | 3 |
| bananas | 9 |
+----------+-------+
Add a custom column IsMax
that is simply the value 1
to this table and then merge (left outer join) it with your original table matching on both Category
and Value
. Finally, expand the IsMax
column to get your desired table, except with null
instead of 0
. You can replace the null
values if you choose.
Here's the M code for all those steps:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6VYHSQ5I2Q5I1Q5Y2Q5Y7BcUmIeEIIkzZElTdAkLZAlTdEkLZElzZRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Value", Int64.Type}, {"Date", type date}, {"Category", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Category"}, {{"Value", each List.Max([Value]), Int64.Type}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "IsMax", each 1, Int64.Type),
#"Merged Queries" = Table.NestedJoin(#"Changed Type",{"Category", "Value"},#"Added Custom",{"Category", "Value"},"Added Custom",JoinKind.LeftOuter),
#"Expanded Added Custom" = Table.ExpandTableColumn(#"Merged Queries", "Added Custom", {"IsMax"}, {"IsMax"})
in
#"Expanded Added Custom"
Edit: A slightly simplified version to reproduce the "desired result" rather than the IsMax
version:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6VYHSQ5I2Q5I1Q5Y2Q5Y7BcUmIeEIIkzZElTdAkLZAlTdEkLZElzZRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Value", Int64.Type}, {"Date", type date}, {"Category", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Category"}, {{"Value", each List.Max([Value]), Int64.Type}}),
#"Merged Queries" = Table.NestedJoin(#"Grouped Rows", {"Category", "Value"}, #"Changed Type", {"Category", "Value"}, "Grouped Rows", JoinKind.LeftOuter),
#"Expanded Grouped Rows" = Table.ExpandTableColumn(#"Merged Queries", "Grouped Rows", {"Date"}, {"Date"})
in
#"Expanded Grouped Rows"
Edit 2: @user11632362 pointed me at another solution that's even fewer steps.
Everything happens in the Group By step.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6VYHSQ5I2Q5I1Q5Y2Q5Y7BcUmIeEIIkzZElTdAkLZAlTdEkLZElzZRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Value", Int64.Type}, {"Date", type date}, {"Category", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Category"}, {{"Value", each List.Max([Value]), Int64.Type}, {"Date", each Table.Max(_, "Value")[Date], type date}})
in
#"Grouped Rows"
The key here is each Table.Max(_, "Value")[Date]
. This sorts the subtable by Value
and returns the top row of the result as a record (and the [Date]
suffix returns the value in the Date
field of that record).
Note that this only pulls over a single column, Date
. If you need to pull in a bunch of columns it might make more sense to return the full record and expand all the desired fields in another step rather than adding more columns to the Group By step.
For example,
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6hAolKsDpIaI2Q1RlCBJFQ1xshqjKECyWA1SYl5QAhSZI6syATIAeEUNEUWyIpMgRwQTkVTZImsyAzIAeE0pdhYAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t, Col1 = _t, Col2 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Category", type text}, {"Value", Int64.Type}, {"Date", type date}, {"Col1", Int64.Type}, {"Col2", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Category"}, {{"Value", each List.Max([Value]), type nullable date}, {"TopValueRow", each Table.Max(_, "Value"), type record}}),
#"Expanded TopValueRow" = Table.ExpandRecordColumn(#"Grouped Rows", "TopValueRow", {"Date", "Col1", "Col2"}, {"Date", "Col1", "Col2"})
in
#"Expanded TopValueRow"
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With