Get mean and mode of dataframe depending on each column type

Tags: python, pandas

The dataframe spans dozens of rows and has a set of numeric columns and a set of string columns. I would like to condense it into one row, where each entry is the mean or the mode of its column: if the column is numeric, take the mean; otherwise, take the mode. In my actual use case, the numeric and object columns appear in no fixed order, so I hope to use a loop that checks each column and decides which action to take.

I tried the following, but it didn't work; it seems to be taking the entire Series as the mode.

for i in df1:
    if df1[i].dtype == 'float64':
        df1[i] = df1[i].mean()

Any help is appreciated, thank you!

asked Mar 23 '21 21:03

AxW

3 Answers

You can use `describe` with `include='all'`, which calculates statistics depending on the dtype: it reports `top` (the mode) for object columns and `mean` for numeric columns. Then combine the two rows.

s = df1.describe(include='all')
s = s.loc['top'].combine_first(s.loc['mean'])

#Group      Winner
#Study        Read
#Score    0.883333
#Name: top, dtype: object
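As a self-contained illustration of this approach, here is a minimal sketch on a small made-up frame. The column values are assumptions chosen for the example, not the asker's data, and the names `stats` and `summary` are only illustrative.

```python
import pandas as pd

# Hypothetical sample data: two string columns and one numeric column
df1 = pd.DataFrame({
    "Group": ["Winner", "Winner", "Loser"],
    "Study": ["Read", "Read", "Write"],
    "Score": [1.0, 2.0, 6.0],
})

# describe(include="all") has a "top" row (filled for object columns only)
# and a "mean" row (filled for numeric columns only)
stats = df1.describe(include="all")

# Take "top" where it exists and fill the gaps from "mean"
summary = stats.loc["top"].combine_first(stats.loc["mean"])
print(summary)
```

Because each row is NaN exactly where the other has a value, `combine_first` should yield one entry per column.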

answered Nov 14 '22 21:11

ALollz

`np.number` and `select_dtypes`

import numpy as np
import pandas as pd
s = df1.select_dtypes(np.number).mean()
pd.concat([df1.drop(s.index, axis=1).mode().iloc[0], s])  # Series.append was removed in pandas 2.0

Group      Winner
Study        Read
Score    0.883333
dtype: object

Variant

g = df1.dtypes.map(lambda x: np.issubdtype(x, np.number))
# Split the columns with a boolean mask; groupby(..., axis=1) is deprecated since pandas 2.1
d = {k: df1.loc[:, g == k] for k in (True, False)}
pd.concat([d[False].mode().iloc[0], d[True].mean()])

Group      Winner
Study        Read
Score    0.883333
dtype: object
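Both snippets above list the non-numeric columns first and the numeric ones last. If the original column order matters, one possible sketch is to reindex the combined result. The helper name `summarize_columns` and the sample data are hypothetical, and the sketch assumes the frame has at least one row and at least one non-numeric column.

```python
import pandas as pd

def summarize_columns(df: pd.DataFrame) -> pd.Series:
    """Mean of numeric columns, first mode of the others (hypothetical helper)."""
    means = df.select_dtypes(include="number").mean()
    # mode() returns a DataFrame; its first row holds the first mode of each column
    modes = df.select_dtypes(exclude="number").mode().iloc[0]
    # concat puts the numeric columns first; reindex restores the original order
    return pd.concat([means, modes]).reindex(df.columns)

# Made-up data with numeric and string columns interleaved
df1 = pd.DataFrame({
    "Score": [1.0, 2.0, 6.0],
    "Group": ["Winner", "Winner", "Loser"],
    "Hours": [2, 4, 6],
    "Study": ["Read", "Read", "Write"],
})
result = summarize_columns(df1)
print(result)
```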

answered Nov 14 '22 22:11

piRSquared

Here is a slight variation on your solution that gets the job done:

res = {}
for col_name, col_type in zip(df1.columns, df1.dtypes):
    if pd.api.types.is_numeric_dtype(col_type):
        res[col_name] = df1[col_name].mean()
    else:
        res[col_name] = df1[col_name].mode()[0]

pd.DataFrame(res, index=[0])

returns

    Group   Study   Score
0   Winner  Read    0.883333

There can be multiple modes in a Series; this solution picks the first one.
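To make the point about multiple modes concrete, here is a small sketch on made-up values. `Series.mode()` returns a Series rather than a scalar, so a single value has to be picked explicitly; this may be related to the behaviour described in the question, although that is an assumption.

```python
import pandas as pd

# Hypothetical column in which "Read" and "Write" are tied for most frequent
s = pd.Series(["Read", "Write", "Read", "Write", "Draw"])

modes = s.mode()            # a Series holding every tied mode, in sorted order
first_mode = modes.iloc[0]  # a single scalar value

print(list(modes))
print(first_mode)
```

Picking `.iloc[0]` is a convention, not the only choice; with ties it simply takes the first value in sorted order.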

answered Nov 14 '22 22:11

piterbarg
