Z-score calculation from mean and st dev

I would like to ask whether any popular package, such as numpy or scipy, has a built-in function to calculate the z-score when I already know the critical value, mean and standard deviation.

I usually do it like this:

def Zscore(xcritical, mean, stdev):
    return (xcritical - mean)/stdev

#example:
xcritical = 73.06
mean = 72
stdev = 0.5

zscore = Zscore(xcritical, mean, stdev)

Later I use scipy.stats.norm.cdf to calculate the probability of x being lower than xcritical:

import scipy.stats as st
print(st.norm.cdf(zscore))

I wonder if I can simplify this somehow. I know there is a scipy.stats.zscore function, but it takes a sample array rather than sample statistics.

Mateusz Konopelski asked Feb 09 '18 12:02


2 Answers

Starting with Python 3.9, the standard library provides a zscore method on the NormalDist class in the statistics module:

from statistics import NormalDist

NormalDist(mu=72, sigma=.5).zscore(73.06)
# 2.1200000000000045
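
The same object can also return the probability the question asks about, without computing the z-score first. A minimal sketch, assuming Python 3.9+ and the values from the question; the variable names are only illustrative:

```python
from statistics import NormalDist

dist = NormalDist(mu=72, sigma=0.5)

# Standardized value, (x - mu) / sigma
z = dist.zscore(73.06)

# P(X < 73.06) for X ~ N(72, 0.5**2), computed directly from mu and sigma
p = dist.cdf(73.06)

# Evaluating the standard normal CDF at z gives the same probability
p_from_z = NormalDist().cdf(z)

print(z, p, p_from_z)
```

Both routes should agree up to floating-point rounding, so the z-score step is only needed if you want the standardized value itself.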
Xavier Guihot answered Sep 20 '22 23:09


In your question, I am not sure what you mean by calculating the probability of 'x' being lower than 'xcritical', because you have not defined 'x'. In any case, here is how to calculate the z-score for an 'x' value.

Going by the scipy.stats.norm documentation, there doesn't seem to be a built-in method that calculates the z-score for a value ('xcritical' in your case) given the mean and standard deviation. However, you can calculate it using the built-in methods cdf and ppf. Consider the following snippet (the values are the same as in your post, where 'xcritical' is the value whose z-score you want):

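from scipy.stats import norm

xcritical = 73.06
mean = 72
stdev = 0.5

# cumulative probability up to xcritical under N(mean, stdev**2)
p = norm.cdf(x=xcritical, loc=mean, scale=stdev)
# inverse CDF of the standard normal turns p back into a z-score
z_score = norm.ppf(p)
print('The z-score for {} corresponding to {} mean and {} std deviation is: {:.3f}'.format(xcritical, mean, stdev, z_score))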

Here, we first use norm.cdf() to calculate the cumulative probability 'p' of observing a value at or below 'xcritical', given 'mean' and 'stdev'. norm.cdf() returns the fraction of the area under the normal distribution curve from negative infinity up to an 'x' value ('xcritical' in this case). Then we pass this probability to norm.ppf() to obtain the z-score corresponding to that 'x' value. norm.ppf() is the percent point function (the inverse of the CDF), which returns the z value corresponding to a given lower-tail probability under the standard normal curve. The output of this code is 2.120, the same value the function Zscore() returns.
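
To make the comparison concrete, here is a small sketch that puts the direct formula next to the cdf/ppf round trip. It assumes scipy is installed; the 20-standard-deviation value is a hypothetical input chosen only to show where the round trip stops working, because the cumulative probability rounds to exactly 1.0 in double precision:

```python
from scipy.stats import norm

xcritical, mean, stdev = 73.06, 72, 0.5

# Direct formula, as in the question's Zscore() function
z_direct = (xcritical - mean) / stdev

# Probability of a value below xcritical, with no z-score step needed
p = norm.cdf(xcritical, loc=mean, scale=stdev)

# Round trip through the CDF and its inverse
z_roundtrip = norm.ppf(p)

# Hypothetical value far in the upper tail (20 standard deviations above the mean)
far_x = mean + 20 * stdev
z_far = norm.ppf(norm.cdf(far_x, loc=mean, scale=stdev))

print(z_direct, z_roundtrip, p, z_far)
```

For moderate values the two z-scores agree to within rounding, but for the far-tail value the round trip returns infinity while the direct formula would still give 20, so the plain subtraction and division is the safer choice when only the z-score is needed.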

Hope that helps!

Harshit Lamba answered Sep 24 '22 23:09