I want to calculate the mean for several columns and thus create a new column for the mean using dplyr
and without melting + merging.
> head(growth2)
  CODE_COUNTRY CODE_PLOT IV12_ha_yr IV23_ha_yr IV34_ha_yr IV14_ha_yr IV24_ha_yr IV13_ha_yr
1            1         6       4.10       6.97         NA         NA         NA       4.58
2            1        17       9.88       8.75         NA         NA         NA       8.25
3            1        30         NA         NA         NA         NA         NA         NA
4            1        37      15.43      15.07      11.89      10.00      12.09      14.33
5            1        41      20.21      15.01      14.72      11.31      13.27      17.09
6            1        46      12.64      14.36      13.65       9.07      12.47      12.36
I need a new column within the dataset with the mean of all the IV columns. I tried this:
growth2 %>% group_by(CODE_COUNTRY, CODE_PLOT) %>% summarise(IVmean=mean(IV12_ha_yr:IV13_ha_yr, na.rm=TRUE))
It returned several errors, depending on the example used, such as:
Error in NA_real_:NA_real_ : NA/NaN argument
or
Error in if (trim > 0 && n) { : missing value where TRUE/FALSE needed
To find the mean of multiple columns based on multiple grouping columns in an R data frame, we can use the summarise_at() function together with mean().
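A minimal sketch of that grouped approach, assuming a small made-up data frame (the names toy, g1, g2, v1 and v2 are hypothetical, not from the question). Note that summarise_at() still works but is marked as superseded by across() in current dplyr releases.

```r
library(dplyr)

# Hypothetical toy data: two grouping columns and two value columns
toy <- data.frame(
  g1 = c("a", "a", "b", "b"),
  g2 = c("x", "x", "y", "y"),
  v1 = c(1, 3, 5, 7),
  v2 = c(2, 4, 6, 8)
)

# One mean per value column for each (g1, g2) combination
grouped_means <- toy %>%
  group_by(g1, g2) %>%
  summarise_at(vars(v1, v2), mean, na.rm = TRUE)

print(grouped_means)
```

This gives one row per group, which is a different result from the per-row mean the question asks for.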
Computing column means on data without missing values using the across() function in dplyr: when a data frame contains both numeric and character variables, select the numeric columns and apply across() to them to compute the mean of each.
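As a rough illustration, assuming dplyr 1.0.0 or later and a made-up data frame (toy2 and its columns are hypothetical), the numeric columns can be picked with where(is.numeric) inside across():

```r
library(dplyr)

# Hypothetical toy data with one character and two numeric columns
toy2 <- data.frame(
  name   = c("a", "b", "c"),
  height = c(1.5, 2.5, 3.5),
  weight = c(10, 20, 30),
  stringsAsFactors = FALSE
)

# across() applies mean() to every numeric column; "name" is skipped
col_means <- toy2 %>%
  summarise(across(where(is.numeric), mean))

print(col_means)
```

Like the grouped version, this collapses rows into column means rather than adding a per-row mean.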
The group_by() function groups the data in a data frame by the columns passed as its arguments.
You don't need to group; just select() the IV columns and then mutate().
library(dplyr)
mutate(df, IVMean = rowMeans(select(df, starts_with("IV")), na.rm = TRUE))
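Alternatively, pipe the data frame into mutate() and use . to refer to it in dplyr. The . placeholder is only defined when the data arrives through the %>% pipe.
library(dplyr)
df %>% mutate(IVMean = rowMeans(select(., starts_with("IV")), na.rm = TRUE))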
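To check the row-wise approach against the question's data, here is a sketch that rebuilds the six rows shown by head(growth2) and adds the mean column. The column name IVMean follows the answers above; the rest is copied from the question.

```r
library(dplyr)

# The six rows printed by head(growth2) in the question
growth2 <- data.frame(
  CODE_COUNTRY = rep(1, 6),
  CODE_PLOT    = c(6, 17, 30, 37, 41, 46),
  IV12_ha_yr   = c(4.10, 9.88, NA, 15.43, 20.21, 12.64),
  IV23_ha_yr   = c(6.97, 8.75, NA, 15.07, 15.01, 14.36),
  IV34_ha_yr   = c(NA, NA, NA, 11.89, 14.72, 13.65),
  IV14_ha_yr   = c(NA, NA, NA, 10.00, 11.31, 9.07),
  IV24_ha_yr   = c(NA, NA, NA, 12.09, 13.27, 12.47),
  IV13_ha_yr   = c(4.58, 8.25, NA, 14.33, 17.09, 12.36)
)

# Row-wise mean over every column whose name starts with "IV";
# na.rm = TRUE ignores the NA cells within each row
result <- growth2 %>%
  mutate(IVMean = rowMeans(select(., starts_with("IV")), na.rm = TRUE))

print(result)
```

Rows where every IV value is NA (plot 30 here) come out as NaN, because there is nothing left to average once the NAs are removed. The original attempt fails because, inside mean(), IV12_ha_yr:IV13_ha_yr is evaluated as the numeric sequence operator on the column values rather than as a range of columns.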