I have a dataframe with a single column of IDs; all other columns hold numerical values for which I want to compute z-scores. Here's a subsection of it:
ID    Age  BMI   Risk Factor
PT 6  48   19.3  4
PT 8  43   20.9  NaN
PT 2  39   18.1  3
PT 9  41   19.5  NaN
Some of my columns contain NaN values, which I do not want to include in the z-score calculations, so I intend to use a solution offered to this question: how to zscore normalize pandas column with nans?
df['zscore'] = (df.a - df.a.mean())/df.a.std(ddof=0)
I'm interested in applying this solution to all of my columns except the ID column to produce a new dataframe which I can save as an Excel file using
df2.to_excel("Z-Scores.xlsx")
So basically, how can I compute z-scores for each column (ignoring NaN values) and push everything into a new dataframe?
SIDENOTE: there is a concept in pandas called "indexing" which intimidates me because I do not understand it well. If indexing is a crucial part of solving this problem, please dumb down your explanation of indexing.
For each value in an array, the z-score is calculated by dividing the difference between the value and the mean by the standard deviation of the distribution.
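As a small illustration of that definition, here is a sketch on a single pandas Series that contains NaN. The sample values are hypothetical (they mirror the "Factor" column shown in this thread), and it assumes you want the population standard deviation (ddof=0), as in the formula quoted in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values, mirroring the "Factor" column shown in this thread.
values = pd.Series([4.0, np.nan, 3.0, np.nan])

# Series.mean() and Series.std() skip NaN by default (skipna=True);
# ddof=0 selects the population standard deviation.
mean = values.mean()
std = values.std(ddof=0)

# NaN inputs stay NaN in the result; they do not affect the other z-scores.
zscores = (values - mean) / std
print(zscores)
```

The NaN entries are left out of the mean and standard deviation, so only the two real values contribute to the calculation.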
To get the column average (mean) from a pandas DataFrame, use either the mean() or the describe() method. DataFrame.mean() returns the mean of the values along the requested axis.
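A quick sketch of both approaches, using a hypothetical frame with an ID column and two numeric columns. Passing numeric_only=True is an assumption made here so that the text ID column is skipped.

```python
import pandas as pd

# Hypothetical frame: one text ID column plus two numeric columns.
df = pd.DataFrame({
    "ID": ["PT", "PT", "PT", "PT"],
    "Age": [6, 8, 2, 9],
    "BMI": [48, 43, 39, 41],
})

# Column-wise means (axis=0 is the default); numeric_only=True skips the ID column.
means = df.mean(numeric_only=True)

# describe() summarizes the numeric columns; its "mean" row holds the same values.
means_from_describe = df.describe().loc["mean"]

print(means)
print(means_from_describe)
```

Note that the "std" row of describe() uses the sample standard deviation (ddof=1), so it will not match std(ddof=0).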
Build a list from the columns and remove the column you don't want to calculate the Z score for:
In [66]:
cols = list(df.columns)
cols.remove('ID')
df[cols]

Out[66]:
   Age  BMI  Risk  Factor
0    6   48  19.3       4
1    8   43  20.9     NaN
2    2   39  18.1       3
3    9   41  19.5     NaN

In [68]:
# now iterate over the remaining columns and create a new zscore column
for col in cols:
    col_zscore = col + '_zscore'
    df[col_zscore] = (df[col] - df[col].mean())/df[col].std(ddof=0)
df

Out[68]:
   ID  Age  BMI  Risk  Factor  Age_zscore  BMI_zscore  Risk_zscore  \
0  PT    6   48  19.3       4   -0.093250    1.569614    -0.150946
1  PT    8   43  20.9     NaN    0.652753    0.074744     1.459148
2  PT    2   39  18.1       3   -1.585258   -1.121153    -1.358517
3  PT    9   41  19.5     NaN    1.025755   -0.523205     0.050315

   Factor_zscore
0              1
1            NaN
2             -1
3            NaN
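Since the question asks for a new dataframe rather than extra columns on the original, here is one possible variation, offered as a sketch rather than as part of the answer above. It rebuilds the same sample data (hypothetical, copied from the printed output), computes all z-scores in one vectorized step, and collects them in df2. The "_zscore" suffix and the index=False argument are choices made here, not requirements.

```python
import numpy as np
import pandas as pd

# Hypothetical data matching the frame printed in the answer above.
df = pd.DataFrame({
    "ID": ["PT", "PT", "PT", "PT"],
    "Age": [6, 8, 2, 9],
    "BMI": [48, 43, 39, 41],
    "Risk": [19.3, 20.9, 18.1, 19.5],
    "Factor": [4, np.nan, 3, np.nan],
})

# Every column except ID.
cols = df.columns.drop("ID")

# Vectorized z-scores: mean() and std() work column by column and skip NaN.
zscores = (df[cols] - df[cols].mean()) / df[cols].std(ddof=0)

# New frame: the ID column followed by the z-scored columns; df is left unchanged.
df2 = pd.concat([df[["ID"]], zscores.add_suffix("_zscore")], axis=1)
print(df2)

# Saving needs an Excel writer such as openpyxl to be installed:
# df2.to_excel("Z-Scores.xlsx", index=False)
```

No explicit indexing is needed beyond selecting columns by name: subtracting df[cols].mean() lines the column means up with the matching columns automatically.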