
Filter SUMMARIZECOLUMNS

Tags:

powerbi

dax

How to construct filter tables for SUMMARIZECOLUMNS function?

The SUMMARIZECOLUMNS has the following pattern:

SUMMARIZECOLUMNS (
    ColumnName1, ...,
    ColumnNameN,
    FilterTable1, ...,    -- my question concerns this line
    FilterTableN,
    "Name1", [measure1], ...,
    "NameN", [measureN]
)

I have checked that the following 3 patterns work. They return the same results, at least for the simple sample data I used.

SUMMARIZECOLUMNS (
    T[col],
    FILTER ( T, T[col] = "red" )
)

SUMMARIZECOLUMNS (
    T[col],
    CALCULATETABLE ( T, T[col] = "red" )
)

SUMMARIZECOLUMNS (
    T[col],
    CALCULATETABLE ( T, KEEPFILTERS ( T[col] = "red" ) )
)

Is any of these patterns superior to the others?

Reference: https://www.sqlbi.com/articles/introducing-summarizecolumns/

Update

I would be interested in an answer that contains a query plan analysis or a link to a credible source. I would also be grateful if you covered using the SUMMARIZECOLUMNS function when grouping by columns from multiple tables.

Przemyslaw Remin asked Jan 27 '20 12:01



1 Answer

You can also construct them the way PowerBI does, using VAR:

VAR  __MyFilterTable = FILTER( T, T[col] = "red" ) 

RETURN
SUMMARIZECOLUMNS (
    T[col],
    __MyFilterTable
)

Which is more efficient will depend on the complexity of your filtering, so there is not necessarily a "one size fits all" rule. For a simple table-level filter, FILTER alone will suffice. I caution you, though, that the first line of that code, where you filter the entire table T, is a bad idea. It's much more performant to filter only a single column. When you filter the entire table, DAX materializes the entire table in memory, while the following materializes only the distinct values of T[col]:

VAR  __MyFilterTable = FILTER( ALL(T[col]), T[col] = "red" ) // This is better.

RETURN
SUMMARIZECOLUMNS (
    T[col],
    __MyFilterTable
)

You can do even better than that, conceptually. You can basically tell DAX, "I know this is a value, so don't even look in the table for it. Just make me a table and treat it as though I filtered it." Like this:

VAR  __MyFilterTable = TREATAS ({"red"}, T[col] )

RETURN
SUMMARIZECOLUMNS (
    T[col],
    __MyFilterTable
)

Again, this is the pattern that PowerBI uses when performing its filters.
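
As a rough sketch, the same pattern can be run as a standalone query (for example in DAX Studio). This assumes the table T and column T[col] from the question; the second value "blue" and the "Row Count" column are made up for illustration:

EVALUATE
VAR __MyFilterTable =
    TREATAS ( { "red", "blue" }, T[col] )    -- "blue" is a hypothetical extra value
RETURN
    SUMMARIZECOLUMNS (
        T[col],
        __MyFilterTable,
        "Row Count", COUNTROWS ( T )         -- illustrative aggregation, not from the question
    )

TREATAS accepts a multi-row table constructor, so filtering on more values only means extending the list.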

BTW, creating the filter tables at the top vs. creating them inline within SUMMARIZECOLUMNS() won't make any difference to speed. In general, avoid using CALCULATETABLE() as you've done here.
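
For illustration, the inline form of the TREATAS filter would look something like this, assuming the same T[col] as above:

SUMMARIZECOLUMNS (
    T[col],
    TREATAS ( { "red" }, T[col] )    -- filter table written inline instead of in a VAR
)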

You can also do this, though you aren't likely to see a speed increase:

CALCULATETABLE(
    SUMMARIZECOLUMNS (
        T[col]
    ),
    KEEPFILTERS(T[col] = "red")
)
Dave Markle answered Sep 28 '22 05:09