I have a data frame that has 5 variables and 800 rows:
head(df)
       V1 variable    value element OtolithNum
1 24.9835       V7 130230.0      Mg         25
2 24.9835       V8 145844.0      Mg         25
3 24.9835       V9 126126.0      Mg         25
4 24.9835      V10 103152.0      Mg         25
5 24.9835      V11 129571.9      Mg         25
6 24.9835      V12 114214.0      Mg         25
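I need to remove outliers (values more than 2 sd from the median) within each "element" group, and then calculate the mean of the remaining values for each group.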
I have been using the dplyr package, with the following code to group by the "element" variable and return the mean values:
df1 = df %>%
  group_by(element) %>%
  summarise_each(funs(mean), value)
Can you please help me modify or add to the code above so that outliers (defined above as more than 2 sd from the median) are removed within each "element" group before I extract the means?
I have tried the following code from another post (that's why the data names don't match my own data above), without luck:
#standardize each column (we use it in the outdet function)
scale(dat)
#create function that looks for values > +/- 2 sd from mean
outdet <- function(x) abs(scale(x)) >= 2
#index with the function to remove those values
dat[!apply(sapply(dat, outdet), 1, any), ]
Here's a method using base R:
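# simulate 10,000 values spread across five groups
element <- sample(letters[1:5], 1e4, replace=TRUE)
value <- rnorm(1e4)
df <- data.frame(element, value)
# per-group mean after dropping values more than 2 sd from the group median
means.without.ols <- tapply(value, element, function(x) {
  mean(x[!(abs(x - median(x)) > 2*sd(x))])
})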
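If you also need the rows that survive the filter, and not just the group means, one possible sketch is to compute each row's group median and sd with ave() and subset on that. The names grp_median, grp_sd, keep, df_clean and means_clean are made up for illustration, and the seed is only there to make the sketch reproducible:

```r
set.seed(42)  # assumption: seed added only for reproducibility
element <- sample(letters[1:5], 1e4, replace = TRUE)
value <- rnorm(1e4)
df <- data.frame(element, value)

# group median and sd, repeated for every row of the group
grp_median <- ave(df$value, df$element, FUN = median)
grp_sd <- ave(df$value, df$element, FUN = sd)

# keep rows within 2 sd of their own group's median
keep <- abs(df$value - grp_median) <= 2 * grp_sd
df_clean <- df[keep, ]

# group means of the remaining rows
means_clean <- tapply(df_clean$value, df_clean$element, mean)
```

This should give the same means as the tapply() call above, while leaving df_clean available for anything else you want to do with the filtered data.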
And using dplyr:
df1 = df %>%
  group_by(element) %>%
  filter(!(abs(value - median(value)) > 2*sd(value))) %>%
  summarise_each(funs(mean), value)
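Note that summarise_each() and funs() are marked as deprecated in more recent dplyr releases. A sketch of the same pipeline with plain summarise(), assuming a reasonably current dplyr, might look like this (the seed is my addition, so its numbers will not match the comparison output in this answer):

```r
library(dplyr)

set.seed(42)  # assumption: seed added only for reproducibility
df <- data.frame(
  element = sample(letters[1:5], 1e4, replace = TRUE),
  value = rnorm(1e4)
)

df1 <- df %>%
  group_by(element) %>%
  # keep rows within 2 sd of the group median
  filter(abs(value - median(value)) <= 2 * sd(value)) %>%
  # one mean per group, computed on the filtered rows
  summarise(value = mean(value))
```

Because filter() runs on the grouped data, median() and sd() are evaluated separately for each element.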
Comparison of results:
> means.without.ols
           a            b            c            d            e
-0.008059215 -0.035448381 -0.013836321 -0.013537466  0.021170663
> df1
Source: local data frame [5 x 2]
  element        value
1       a -0.008059215
2       b -0.035448381
3       c -0.013836321
4       d -0.013537466
5       e  0.021170663