Check if characters are all equal in a group using dplyr - R

Tags: dataframe, r, dplyr

In the following data frame, how can I group by the first two columns and check whether all the values in the fourth column (letters) are identical within each group? If they are identical, I would like to replace them with an empty string ('').

In this example, the group combinations 'Embryonated + Protein' and 'Hatching + Lipid' are the only two groups whose letters are not all 'a'.

df

         Stage variable Temperature letters       Mean
30 Embryonated Moisture          30       a  808.70882
31 Embryonated      NFE          20       a   53.28806
32 Embryonated      NFE          25       a   45.38572
33 Embryonated      NFE          30       a   84.56113
34 Embryonated  Protein          20      ab  118.53608
35 Embryonated  Protein          25       b  127.29849
36 Embryonated  Protein          30       a   84.55175
37    Hatching      Ash          20       a   16.95345
38    Hatching      Ash          25       a   14.54980
39    Hatching      Ash          30       a   13.38510
40    Hatching   Energy          20       a 4931.18857
41    Hatching   Energy          25       a 4187.27213
42    Hatching   Energy          30       a 4314.61171
43    Hatching    Lipid          20       b   26.44363
44    Hatching    Lipid          25       a   19.90928
45    Hatching    Lipid          30      ab   22.27561
46    Hatching Moisture          20       a  785.63062
47    Hatching Moisture          25       a  818.69860
48    Hatching Moisture          30       a  815.32070
49    Hatching      NFE          20       a   60.34359
50    Hatching      NFE          25       a   43.02979

I have tried using dplyr to no avail.

grp_cols <- names(df)[c(1, 2)]  # group by Stage and variable

# Convert character vector to list of symbols
dots <- lapply(grp_cols, as.symbol)

res <- df %>% group_by(.dots = dots) %>%
  do(k = all(letters == 'a'))  # returns FALSE for every group

Data:

dput(df)

structure(list(Stage = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("Developing", 
"Embryonated", "Hatching", "Laid"), class = "factor"), variable = structure(c(1L, 
5L, 5L, 5L, 2L, 2L, 2L, 4L, 4L, 4L, 6L, 6L, 6L, 3L, 3L, 3L, 1L, 
1L, 1L, 5L, 5L), .Label = c("Moisture", "Protein", "Lipid", "Ash", 
"NFE", "Energy"), class = "factor"), Temperature = c("30", "20", 
"25", "30", "20", "25", "30", "20", "25", "30", "20", "25", "30", 
"20", "25", "30", "20", "25", "30", "20", "25"), letters = c("a", 
"a", "a", "a", "ab", "b", "a", "a", "a", "a", "a", "a", "a", 
"b", "a", "ab", "a", "a", "a", "a", "a"), Mean = c(808.708818349727, 
53.2880626188374, 45.3857220182952, 84.5611267892406, 118.536080769588, 
127.298486932385, 84.5517498179938, 16.9534468121571, 14.5497954869813, 
13.3850951354759, 4931.18857123979, 4187.27213494545, 4314.61171127083, 
26.4436265667305, 19.9092762683653, 22.2756088142943, 785.630624024365, 
818.698598619779, 815.320702070777, 60.3435858953567, 43.0297881562102
)), .Names = c("Stage", "variable", "Temperature", "letters", 
"Mean"), row.names = 30:50, class = "data.frame")

asked May 04 '18 04:05

J.Con

1 Answer

Group the data by Stage and variable, count the distinct values of letters in each group with n_distinct(), and replace them with '' in every group that has only one distinct value:

df %>%
  group_by(Stage,variable) %>%
  mutate(letters = replace(letters, n_distinct(letters)==1, '') )
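
To see what the grouped test does before overwriting anything, here is a minimal sketch on a small made-up subset of the data (the `toy` data frame and the `all_same` column are names invented for illustration, not part of the question). It also shows why `replace()` works here: the condition is a single TRUE or FALSE per group, and as a logical index it is recycled across the whole group, so either every value is replaced or none is.

```r
library(dplyr)

# Hypothetical subset of the question's data, for illustration only
toy <- data.frame(
  Stage    = c("Embryonated", "Embryonated", "Embryonated",
               "Hatching", "Hatching", "Hatching"),
  variable = c("NFE", "NFE", "NFE", "Lipid", "Lipid", "Lipid"),
  letters  = c("a", "a", "a", "b", "a", "ab"),
  stringsAsFactors = FALSE
)

# One row per group: TRUE when every value of `letters` is the same
uniform <- toy %>%
  group_by(Stage, variable) %>%
  summarise(all_same = n_distinct(letters) == 1, .groups = "drop")

# Same test inside mutate(): the length-one logical index is recycled,
# so the whole group is blanked when it is TRUE and left alone when FALSE
res <- toy %>%
  group_by(Stage, variable) %>%
  mutate(letters = replace(letters, n_distinct(letters) == 1, "")) %>%
  ungroup()

print(uniform)
print(res)
```

The `.groups = "drop"` argument assumes dplyr 1.0.0 or later; on older versions it can be dropped.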

Similar logic works in data.table too:

library(data.table)
setDT(df)
df[, letters := if(uniqueN(letters)==1) '' else letters, by=.(Stage,variable)]
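
As for why the attempt in the question returned FALSE for every group: the likely cause is that `do()` does not look up bare column names, so `letters` there resolves to R's built-in `letters` vector (the 26 lowercase letters) rather than the column. The sketch below uses a made-up `toy` data frame to contrast that with `.$letters`, which refers to the column of the current group.

```r
library(dplyr)

# Hypothetical two-group example, for illustration only
toy <- data.frame(
  Stage   = c("A", "A", "B", "B"),
  letters = c("a", "a", "a", "b"),
  stringsAsFactors = FALSE
)

# Bare `letters` inside do() is base R's constant c("a", "b", ..., "z")
bad <- toy %>%
  group_by(Stage) %>%
  do(k = all(letters == "a"))

# `.$letters` is the `letters` column of the current group
good <- toy %>%
  group_by(Stage) %>%
  do(k = all(.$letters == "a"))

print(unlist(bad$k))
print(unlist(good$k))
```

Inside `mutate()` and `summarise()` the column takes precedence over the built-in, which is why the answer's code can use `letters` directly.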

answered Oct 22 '22 02:10

thelatemail
