
pandas: How to transform all numeric columns of a data frame into logarithms

Tags: python, pandas

In R I can apply a logarithmic (or square root, etc.) transformation to all numeric columns of a data frame, by using:

logdf <- log10(df)

Is there something equivalent in Python/Pandas? I see that there is a "transform" and an (R-like) "apply" function, but could not figure out how to use them in this case.

Thanks for any hints or suggestions.

maurobio asked Jan 27 '19 14:01


3 Answers

Suppose you have a DataFrame named df.

You can first make a list of the possible numeric types, then loop over the matching columns:

import numpy as np

numerics = ['int16', 'int32', 'int64', 'float16', 'float32', 'float64']
for c in [c for c in df.columns if df[c].dtype in numerics]:
    df[c] = np.log10(df[c])

Or a one-liner using a lambda and np.issubdtype:

numeric_df = df.apply(lambda x: np.log10(x) if np.issubdtype(x.dtype, np.number) else x)
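
As a quick check of the one-liner, here is a minimal sketch on a made-up frame. The column names and values are assumptions for illustration, not taken from the question. The text column should pass through unchanged while the numeric columns are transformed.

```python
import numpy as np
import pandas as pd

# Hypothetical example data: two numeric columns and one text column
# (the text column is stored with object dtype).
df = pd.DataFrame({
    "count": [1, 10, 100],
    "weight": [0.1, 1.0, 10.0],
    "species": pd.Series(["a", "b", "c"], dtype=object),
})

# Apply log10 column by column, leaving non-numeric columns as they are.
numeric_df = df.apply(
    lambda x: np.log10(x) if np.issubdtype(x.dtype, np.number) else x
)
```

Note that this returns a new DataFrame and leaves df itself untouched, unlike the loop version, which overwrites the columns in place.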

Rex Low answered Oct 19 '22 22:10


If most columns are numeric it might make sense to just try it and skip the column if it does not work:

import numpy as np

for column in df.columns:
    try:
        df[column] = np.log10(df[column])
    except (TypeError, ValueError, AttributeError):
        # Non-numeric column (e.g. strings): leave it unchanged.
        pass

If you want to, you could of course wrap it in a function.
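
One possible way to wrap the try/except loop in a function is sketched below. The function name log10_where_possible and the sample data are hypothetical. This version works on a copy, so the caller's frame is not modified, and it also catches TypeError, which as far as I know is what recent NumPy versions raise for string columns.

```python
import numpy as np
import pandas as pd


def log10_where_possible(df):
    """Return a copy of df with log10 applied to every column that accepts it."""
    out = df.copy()
    for column in out.columns:
        try:
            out[column] = np.log10(out[column])
        except (TypeError, ValueError, AttributeError):
            # Column could not be transformed (e.g. strings); leave it alone.
            pass
    return out


# Hypothetical example data: one numeric column and one text column.
df = pd.DataFrame({
    "x": [1.0, 100.0],
    "name": pd.Series(["a", "b"], dtype=object),
})
result = log10_where_possible(df)
```

Here result["x"] holds the transformed values, result["name"] is unchanged, and df keeps its original contents.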

If all columns are numeric, you can simply do:

df_log10 = np.log10(df)
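
For the all-numeric case, a small sketch with made-up data (the column and index labels are assumptions) shows that calling a NumPy function on the whole frame gives back a DataFrame with the same labels, much like log10(df) in R. The same pattern should work for the square root mentioned in the question, via np.sqrt.

```python
import numpy as np
import pandas as pd

# Hypothetical all-numeric frame.
df = pd.DataFrame(
    {"a": [1.0, 10.0], "b": [100.0, 1000.0]},
    index=["r1", "r2"],
)

df_log10 = np.log10(df)  # element-wise base-10 logarithm
df_sqrt = np.sqrt(df)    # element-wise square root
```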

Graipher answered Oct 20 '22 00:10


You can use select_dtypes and numpy.log10:

import numpy as np
for c in df.select_dtypes(include=[np.number]).columns:
    df[c] = np.log10(df[c])

select_dtypes selects the columns whose data types match those passed to its include parameter. np.number covers all numeric data types.

numpy.log10 returns the base-10 logarithm of the input, element-wise.
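
The same idea can also be written without an explicit loop, by transforming all selected columns in one call. This is only a sketch on made-up data (the names df, out and numeric_cols and the values are assumptions), and it writes into a copy rather than into df.

```python
import numpy as np
import pandas as pd

# Hypothetical example data: two numeric columns and one text column.
df = pd.DataFrame({
    "a": [1.0, 10.0, 100.0],
    "b": [1000, 100, 10],
    "label": pd.Series(["x", "y", "z"], dtype=object),
})

# Names of the numeric columns only.
numeric_cols = df.select_dtypes(include=[np.number]).columns

# Transform all numeric columns at once; other columns are copied as-is.
out = df.copy()
out[numeric_cols] = np.log10(df[numeric_cols])
```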

Mohit Motwani answered Oct 20 '22 00:10