I would like to easily apply multiple functions to a single column in a Julia dataframe. Here is a simple example from notebook 5 of the DataFrames.jl course on Julia Academy.
Bogumil shows us how to easily calculate the mean of the jumps column by doing the following:
combine(df, :jumps => mean)
| Row | jumps_mean |
|---|---|
| | Float64 |
| 1 | 2.7186 |
But what if I want to apply multiple functions to jumps to get multiple summary statistics? So far I can get the following to work:
combine(df, :jumps => (x -> [(mean(x), std(x), minimum(x), maximum(x))]) => [:mean, :std, :min, :max])
| Row | mean | std | min | max |
|---|---|---|---|---|
| | Float64 | Float64 | Int64 | Int64 |
| 1 | 2.7186 | 0.875671 | 2 | 11 |
Is there a cleaner syntax for doing this, without needing to wrap the function's return value in [ ] or to use an anonymous function?
For example, I would like to do:
combine(df, :jumps => (mean, std, minimum, maximum))
Broadcast the pair operator over a vector of functions with .=>:
combine(df, :jumps .=> [mean, std, minimum, maximum])
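To see why this works, it may help to look at what the broadcast produces before combine ever sees it. The sketch below uses only the Statistics standard library; the names fns, specs, by_hand and named, and the sample values in jumps, are made up for illustration and are not from the course notebook.

```julia
using Statistics  # standard library: provides mean and std

# Broadcasting => pairs the single column name with each function in turn.
fns = [mean, std, minimum, maximum]
specs = :jumps .=> fns

# The result is the same as writing the four pairs out by hand.
by_hand = [:jumps => mean, :jumps => std, :jumps => minimum, :jumps => maximum]
@show specs == by_hand

# Broadcasting once more attaches an explicit output name to each pair.
named = :jumps .=> fns .=> [:mean, :std, :min, :max]
@show first(named) == (:jumps => (mean => :mean))

# Hypothetical sample values, used only to show that the second element
# of each pair is still an ordinary callable function.
jumps = [2, 3, 11]
@show [f(jumps) for (_, f) in specs[3:4]]
```

Since combine accepts a vector of such pairs, combine(df, specs) should behave like the one-liner above, with result columns auto-named jumps_mean, jumps_std and so on. Passing named instead should give the shorter column names mean, std, min and max.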
See also "Multiple summary statistics on grouped column in Julia" for some more advanced examples.