Say we have a data frame df containing strings in several columns. We want to find the indices of all rows that contain a certain value, or better yet, one of several values. We do not know in advance which column holds the value.
What I do, at the moment, is:
apply(df, 2, function(x) which(x == "M017"))
where df =
1  04.10.2009 01:24:51 M017 <NA> <NA> NA
2  04.10.2009 01:24:53 M018 <NA> <NA> NA
3  04.10.2009 01:24:54 M051 <NA> <NA> NA
4  04.10.2009 01:25:06 <NA> M016 <NA> NA
5  04.10.2009 01:25:07 <NA> M015 <NA> NA
6  04.10.2009 01:26:07 <NA> M017 <NA> NA
7  04.10.2009 01:26:27 <NA> M017 <NA> NA
8  04.10.2009 01:27:23 <NA> M017 <NA> NA
9  04.10.2009 01:27:30 <NA> M017 <NA> NA
10 04.10.2009 01:27:32 M017 <NA> <NA> NA
11 04.10.2009 01:27:34 M051 <NA> <NA> NA
This also works if we try to find more than one value:
apply(df, 2, function(x) which(x %in% c("M017", "M018")))
The result being:
$`1`
integer(0)

$`2`
[1]  1  2 20

$`3`
[1] 16 17 18 19

$`4`
integer(0)

$`5`
integer(0)
However, processing the resulting list of per-column index vectors is rather tedious.
Is there a more efficient way to find rows that contain a value (or more) in ANY column?
You can use the following basic syntax to find the rows of a data frame in R in which a certain value appears in any of the columns:
library(dplyr)
df %>% filter_all(any_vars(. %in% c('value1', 'value2', ...)))
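For what it's worth, filter_all() and any_vars() are marked as superseded in recent dplyr releases in favour of if_any() (available from dplyr 1.0.4). Below is a minimal sketch of the same idea written that way. The toy data frame, its column names (time, a, b), and the helper column row are made up for illustration and are not from the question.

```r
library(dplyr)

# Hypothetical toy data; column names are assumptions for this sketch
df <- data.frame(
  time = c("01:24:51", "01:24:53", "01:24:54", "01:25:06", "01:26:07"),
  a    = c("M017", "M018", "M051", NA, NA),
  b    = c(NA, NA, NA, "M016", "M017"),
  stringsAsFactors = FALSE
)

targets <- c("M017", "M018")

# Record the original row index first, then keep every row in which
# at least one of the other columns holds one of the target values
matches <- df %>%
  mutate(row = row_number()) %>%
  filter(if_any(-row, ~ .x %in% targets))

print(matches)
```

The row column is only there so the original indices survive the filtering; drop the mutate() step if you just want the matching rows.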
How about
apply(df, 1, function(r) any(r %in% c("M017", "M018")))
The ith element will be TRUE if the ith row contains one of the values, and FALSE otherwise. Or, if you want just the row numbers, enclose the above statement in which(...).
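Here is a small self-contained sketch of the row-wise approach, using a made-up data frame (the column names time, a, b are assumptions, not the asker's real ones). It also shows a column-wise variant built on sapply() and rowSums(), which is one possible alternative if you would rather not have apply() convert the whole data frame to a character matrix.

```r
# Hypothetical toy data; column names are assumptions for this sketch
df <- data.frame(
  time = c("01:24:51", "01:24:53", "01:24:54", "01:25:06", "01:26:07"),
  a    = c("M017", "M018", "M051", NA, NA),
  b    = c(NA, NA, NA, "M016", "M017"),
  stringsAsFactors = FALSE
)

targets <- c("M017", "M018")

# Logical vector: TRUE where any cell of the row matches a target value
hit <- apply(df, 1, function(r) any(r %in% targets))

# Row indices of the matching rows
rows <- which(hit)

# Column-wise variant: sapply() builds a logical matrix (rows x columns),
# rowSums() counts the matches per row
rows_alt <- which(rowSums(sapply(df, function(col) col %in% targets)) > 0)

print(rows)
print(rows_alt)
```

With this toy data both variants pick out rows 1, 2 and 5. NA cells never match, because NA %in% targets is FALSE rather than NA.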