
Power BI: Calculating STDEVX.P over 6-Month period

I am attempting to calculate the most recent 6-month STDEVX.P, not including the current month (so in May 2017, I'd like the STDEVX.P for periods Nov 2016 - Apr 2017), for sales by product, in order to further calculate variation in sales orders.

The Sales Data is made up of daily transactions so it contains transaction date: iContractsChargebacks[TransactionDate] and units sold: iContractsChargebacks[ChargebackUnits], but if there are no sales in a given period, then there will be no data for that month.

So, for example, on July 1st, sales for the past 6 months were the following:

Jan 100
Feb 125
Apr 140
May 125
Jun 130

March is missing because there were no sales. So, when I calculate STDEVX.P on the data set, it is calculating it over 5 periods, when in fact there were 6, just one happens to be zero.

At the end of the day, I need to calculate STDEVX.P for the current six-month period. If pulling the monthly sales numbers only returns 3 periods (months), then the other 3 periods need to be treated as zero.
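
To put numbers on the difference, here is the population standard deviation for the July 1st example above, worked both ways by hand (figures rounded to two decimals):

5 periods (March missing):
    mean = (100 + 125 + 140 + 125 + 130) / 5 = 124
    sum of squared deviations = 576 + 1 + 256 + 1 + 36 = 870
    standard deviation = SQRT(870 / 5) = SQRT(174) ≈ 13.19

6 periods (March counted as 0):
    mean = (100 + 125 + 0 + 140 + 125 + 130) / 6 ≈ 103.33
    sum of squared deviations ≈ 13683.33
    standard deviation = SQRT(13683.33 / 6) ≈ SQRT(2280.56) ≈ 47.76

So leaving out the empty month understates the variation considerably in this example.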

I thought about manually calculating standard deviation instead of using the DAX STDEVX.P formula and found these 2 links as a reference on how to do so, the first being closest to my need:

https://community.powerbi.com/t5/Desktop/Problem-with-STDEV/td-p/19731

Calculating the standard deviation from columns of values and frequencies in Power BI...

I attempted to make a go of it, but still am not getting the correct calculation. My code is:

STDEVX2 =
    var Averageprice=[6M Sales]
    var months=6
    return
    SQRT(
    DIVIDE(SUMX(
    FILTER(ALL(DimDate),
    DimDate[Month ID]<=(MAX(DimDate[Month ID])-1) &&
    DimDate[Month ID]>=(MAX(DimDate[Month ID])-6)
    ),
    (iContractsChargebacks[SumOfOrderQuantity]-Averageprice)^2),
        months
    )
)

*Note: instead of using date parameters in the code, I created a calculated column in the date table that gives each month a unique ID, which makes this easier for me.

asked Oct 18 '22 10:10 by Epicleo


1 Answer

Your question would be easier to answer with more detail about your model, e.g. how you defined [SumOfOrderQuantity] and [6M Sales], since a mistake there could affect the final result. It would also help to know the result you're seeing versus the result you expect (using sample data).

My guess, however, is that your DimDate table is a standard date table (with one row per date), but you want standard deviation by month.

The FILTER statement in your formula limits the date range to the prior 6 full months correctly, but it will still have one row per date. You can confirm this in Power BI by going into the Data View, selecting 'New Table' under Modeling on the ribbon, and putting your FILTER statement in:

Table = FILTER(ALL(DimDate),
DimDate[MonthID]<=(MAX(DimDate[MonthID])-1) &&
DimDate[MonthID]>=(MAX(DimDate[MonthID])-6))

Assuming you have more than one day of sales in a given month, this matters: SUMX iterates over every date row, so the squared deviations are computed and summed per day rather than per month, while the total is still divided by 6.

What I'd suggest trying:

Table = FILTER(SUMMARIZE(ALL(DimDate),DimDate[MonthID]),
DimDate[MonthID]<=(MAX(DimDate[MonthID])-1) &&
DimDate[MonthID]>=(MAX(DimDate[MonthID])-6))

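The additional SUMMARIZE statement means that you only get one row for each MonthID, rather than one row for each date. If your [6M Sales] is the monthly average across all 6 months, and [SumOfOrderQuantity] is the monthly sum for each month, then you should be set for the rest of the calculation: taking each month's deviation from the average, squaring it, summing the squares, dividing by 6, and taking the square root. Because the rows come from the date table rather than the sales table, a month with no sales should still get a row (assuming DimDate covers every month), and a blank monthly sum is treated as zero in the subtraction.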
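
As an alternative to the manual formula, the same idea can be sketched as a single measure built on STDEVX.P. This is an untested sketch under several assumptions: DimDate[MonthID] is a consecutive integer per month, DimDate is related to iContractsChargebacks[TransactionDate], the monthly figure is the sum of iContractsChargebacks[ChargebackUnits], and the measure name is made up for illustration. Adding 0 turns a blank monthly sum into a zero, so a month with no sales still contributes a value.

STDEVX.P 6M Sketch =
    // Hypothetical measure name; adjust column names to your model
    VAR CurrentMonthID = MAX ( DimDate[MonthID] )
    // One row per MonthID for the six full months before the current one
    VAR PriorSixMonths =
        FILTER (
            ALL ( DimDate[MonthID] ),
            DimDate[MonthID] <= CurrentMonthID - 1
                && DimDate[MonthID] >= CurrentMonthID - 6
        )
    RETURN
        STDEVX.P (
            PriorSixMonths,
            VAR ThisMonthID = DimDate[MonthID]
            RETURN
                // Clear any date filters, then keep only the iterated month;
                // + 0 converts a blank (no sales) into zero
                CALCULATE (
                    SUM ( iContractsChargebacks[ChargebackUnits] ),
                    ALL ( DimDate ),
                    DimDate[MonthID] = ThisMonthID
                ) + 0
        )

Only filters on DimDate are cleared here, so a product filter coming from the visual should still apply and give the figure per product.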

If you need to do further troubleshooting, remember you can put a table on your canvas with MonthID, SumOfOrderQuantity and [6M Sales] and compare the numbers you expect at each stage of the calculation with the numbers you're seeing.
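
For checking the monthly figures outside of a visual, a calculated table with one row per month is another option. A minimal sketch, assuming DimDate is related to iContractsChargebacks[TransactionDate] and that units are summed from iContractsChargebacks[ChargebackUnits] (the table name and the "Units" column name are made up for illustration):

Monthly Check =
    ADDCOLUMNS (
        // One row per distinct MonthID in the date table
        ALL ( DimDate[MonthID] ),
        // Context transition filters sales to the row's month; + 0 shows empty months as 0
        "Units", CALCULATE ( SUM ( iContractsChargebacks[ChargebackUnits] ) ) + 0
    )

Months with no sales should appear in this table with 0 units, which makes it easier to confirm that six values, not fewer, would feed the standard deviation.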

Hope this helps.

answered Oct 21 '22 04:10 by Leonard