I have a data frame with 10 rows and 3 columns:
    a   b c
1   1 201 1
2   2 202 1
3   3 203 1
4   4 204 1
5   5 205 4
6   6 206 5
7   7 207 4
8   8 208 4
9   9 209 8
10 10 210 5
I want to delete all rows whose value in column "c" occurs fewer than 3 times in that column. In this example I want to remove rows 6, 9 and 10. (My real data.frame has 5000 rows and 25 columns.) I tried to do it using the function rle, but I keep getting the wrong result. Any help? Thanks!
For example, we can use the subset() function to drop rows based on a condition. If we prefer to work with the tidyverse, we can use dplyr's filter() function to keep (or remove) rows based on the values in a column, much as subset() does.
By using bracket notation on an R data frame we can select rows by column value, by index, by name, by condition, etc. The base R function subset() gives the same results. Besides these, the dplyr package provides filter() to select rows from a data frame.
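A minimal sketch of the bracket and subset() forms, using the sample data from the question (the variable names by_condition, by_index, by_subset and without_5 are just illustrative). dplyr::filter() is left out here because it needs an extra package.

```r
# Sample data reconstructed from the question
Data <- data.frame(
  a = 1:10,
  b = 201:210,
  c = c(1, 1, 1, 1, 4, 5, 4, 4, 8, 5)
)

# Bracket notation: a logical condition before the comma selects rows
by_condition <- Data[Data$c == 4, ]   # rows 5, 7 and 8

# Bracket notation: row positions
by_index <- Data[c(1, 3), ]           # rows 1 and 3

# subset(): same condition, column names can be used directly
by_subset <- subset(Data, c == 4)     # rows 5, 7 and 8

# Dropping rows is the same thing with the condition negated
without_5 <- Data[Data$c != 5, ]      # removes rows 6 and 10
```

Note that the comma inside the brackets matters: Data[cond, ] filters rows, while Data[cond] would try to select columns.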
By using na.omit(), complete.cases(), rowSums(), or tidyr's drop_na(), you can remove rows that contain NA (missing values) from an R data frame.
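As a rough illustration of the base R variants, here is a small made-up data frame (df and its columns x and y are hypothetical, not from the question). drop_na() is not shown since it requires the tidyr package.

```r
# Hypothetical data with missing values in rows 2 and 3
df <- data.frame(
  x = c(1, NA, 3, 4),
  y = c("a", "b", NA, "d")
)

# na.omit() drops every row that contains at least one NA
clean_omit <- na.omit(df)

# complete.cases() returns TRUE for rows without any NA
clean_cases <- df[complete.cases(df), ]

# rowSums() over is.na() counts the NAs per row; keep rows with zero
clean_sums <- df[rowSums(is.na(df)) == 0, ]
```

All three keep rows 1 and 4 here. The na.omit() result additionally carries an "na.action" attribute recording which rows were dropped.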
Here is a solution using ave:
Data[ave(Data$c, Data$c, FUN = length) > 2, ]
or using ave with subset:
subset(Data, ave(c, c, FUN = length) > 2)
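To see why this works, it may help to look at what ave returns on the sample data from the question. This is only a sketch; group_size and kept are names I made up for the intermediate results.

```r
# Sample data reconstructed from the question
Data <- data.frame(
  a = 1:10,
  b = 201:210,
  c = c(1, 1, 1, 1, 4, 5, 4, 4, 8, 5)
)

# ave() applies FUN within each group of c and returns one value per row,
# so every row gets the number of rows that share its c value
group_size <- ave(Data$c, Data$c, FUN = length)
print(group_size)
# [1] 4 4 4 4 3 2 3 3 1 2

# Keep only rows whose c value occurs at least 3 times
kept <- Data[group_size > 2, ]
print(rownames(kept))
# [1] "1" "2" "3" "4" "5" "7" "8"
```

Rows 6, 9 and 10 are gone, which matches what the question asks for. Unlike rle, this does not depend on equal values being next to each other.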
Building on Joshua's answer:
Data[Data$c %in% names(which(table(Data$c) > 2)), ]
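Broken into steps on the sample data, the table approach looks roughly like this (counts, frequent and kept_tbl are illustrative names, not part of the original answer):

```r
# Sample data reconstructed from the question
Data <- data.frame(
  a = 1:10,
  b = 201:210,
  c = c(1, 1, 1, 1, 4, 5, 4, 4, 8, 5)
)

# Frequency of each distinct value in column c
counts <- table(Data$c)

# Names of the values that occur more than twice (character strings)
frequent <- names(which(counts > 2))
print(frequent)
# [1] "1" "4"

# Keep rows whose c value is one of the frequent ones
kept_tbl <- Data[Data$c %in% frequent, ]
print(rownames(kept_tbl))
# [1] "1" "2" "3" "4" "5" "7" "8"
```

One thing to be aware of: names() returns character strings, so %in% compares the numeric column to them after converting it to character. That is fine for whole numbers like these, but I would be more careful with non-integer values.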