
"transpose" a Pandas Series

Tags: pandas

I have a DataFrame with an ID column and some feature columns. For each feature column, I'd like a summary of how many unique IDs there are per column value.

The following code works, but I wonder if there is a better way than the to_frame().unstack().unstack() line, which transposes the Series returned by .describe() into a DataFrame whose columns are the statistics (count, mean, std, min, percentiles, max).

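import pandas as pd

id_col = 'ID'  # name of the ID column (not defined in the original snippet; 'ID' is assumed)

def unique_ids(df):
    rows = []
    for col in sorted(c for c in df.columns if c != id_col):
        # Number of unique IDs per value of col, summarised by describe()
        v = df.groupby(col)[id_col].nunique().describe()
        v = v.to_frame().unstack().unstack()  # Transpose
        v.index = [col]
        rows.append(v)

    return pd.concat(rows)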
lazy1 asked Feb 04 '23 15:02


1 Answer

It seems you need to change:

v = v.to_frame().unstack().unstack()

to

v = v.to_frame().T
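As a quick check of this replacement, here is a minimal sketch on a made-up Series (the sample values and the variable names one_row and double_unstack are only for illustration). It builds the one-row DataFrame both ways and compares them after sorting the columns, in case the two approaches order them differently:

```python
import pandas as pd

# Hypothetical sample values, only for illustration.
s = pd.Series([1, 2, 2, 3], name="ID").describe()

one_row = s.to_frame().T                           # single row labelled "ID"
double_unstack = s.to_frame().unstack().unstack()  # the original approach

# Sort the columns before comparing so that column order does not matter.
a = one_row.sort_index(axis=1)
b = double_unstack.sort_index(axis=1)

print(one_row.shape)
print(list(a.columns) == list(b.columns), (a.values == b.values).all())
```

Either way the statistics end up as columns, so v.index = [col] can still be applied afterwards.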

Alternatively, transpose the final DataFrame once after concatenating. Here each describe() result is also renamed to its column name with rename(col):

df = pd.DataFrame({'ID':[1,1,3],
                   'E':[4,5,5],
                   'C':[7,8,9]})

print(df)
   C  E  ID
0  7  4   1
1  8  5   1
2  9  5   3

def unique_ids(df):
    rows = []
    id_col = 'ID'
    for col in sorted(c for c in df.columns if c != id_col):
        v = df.groupby(col)[id_col].nunique().describe().rename(col)
        rows.append(v)
    return pd.concat(rows, axis=1).T

print(unique_ids(df))
   count  mean       std  min   25%  50%   75%  max
C    3.0   1.0  0.000000  1.0  1.00  1.0  1.00  1.0
E    2.0   1.5  0.707107  1.0  1.25  1.5  1.75  2.0
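
The same table can probably also be built without the explicit loop. This is only a sketch of one possible variant, not part of the original answer: it collects the describe() results in a dict and lets the DataFrame constructor line them up. Passing id_col as a parameter with a default of "ID" is an assumption made here for convenience.

```python
import pandas as pd

def unique_ids(df, id_col="ID"):
    """Describe, per feature column, the number of unique IDs in each group."""
    stats = {
        col: df.groupby(col)[id_col].nunique().describe()
        for col in sorted(df.columns.drop(id_col))
    }
    # Each dict value becomes a column; transpose so that features are rows.
    return pd.DataFrame(stats).T

df = pd.DataFrame({"ID": [1, 1, 3],
                   "E": [4, 5, 5],
                   "C": [7, 8, 9]})
result = unique_ids(df)
print(result)
```

With the sample data above this should give one row per feature column (C and E) and one column per statistic, like the concat version.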
jezrael answered Jun 14 '23 00:06