I have a dataframe like this:
set.seed(12)
df <- data.frame(
v1 = sample(LETTERS, 10),
v2 = sample(LETTERS, 10),
v3 = sample(LETTERS, 10),
v4 = c(sample(LETTERS, 8), sample(letters, 2)),
v5 = c(sample(letters, 1), sample(LETTERS, 7), sample(letters, 2))
)
df
v1 v2 v3 v4 v5
1 B K F G p
2 U U T W N
3 W J C V Y
4 G I Q S E
5 D F E N T
6 A X Z T C
7 V Y K X I
8 M Z D Q A
9 Y L H k d
10 R B L j t
I want to subset df to those rows that contain a lowercase value in any of df's columns. It can be done like this:
df1 <- df[grepl("[a-z]", df$v1) | grepl("[a-z]", df$v2) | grepl("[a-z]", df$v3) |
grepl("[a-z]", df$v4) | grepl("[a-z]", df$v5), ]
df1
v1 v2 v3 v4 v5
1 B K F G p
9 Y L H k d
10 R B L j t
But this is uneconomical and error-prone if you have many more columns. Is there a cleaner, simpler, and more economical way, preferably in base R?
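# Flag cells that are single lowercase letters, then keep rows with at least one such cell
df[rowSums(sapply(df, function(x) x %in% letters)) > 0,]
# OR: keep rows where any cell is equal to its own lowercase form
df[apply(df == sapply(df, tolower), 1, any),]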
# v1 v2 v3 v4 v5
#1 B L L M e
#9 R N D t t
#10 F X M h x
One option is to apply grepl on each column with lapply to create a list of logical vectors and Reduce it with |:
df[Reduce(`|`, lapply(df, grepl, pattern = "[a-z]")),]
# v1 v2 v3 v4 v5
#1 B L L M e
#9 R N D t t
#10 F X M h x
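To make the lapply/Reduce approach above easier to follow, here is a minimal sketch that spells out the intermediate steps. The data frame toy is a made-up example, not the data from the question. The pattern "[[:lower:]]" is swapped in for "[a-z]" as an assumption about what is wanted: R's ?regex help page notes that character ranges are locale-dependent, so the POSIX class may be the safer choice.

```r
# Small illustrative data frame (hypothetical, not the data from the question)
toy <- data.frame(
  v1 = c("A", "B", "C"),
  v2 = c("D", "e", "F"),
  v3 = c("G", "H", "i"),
  stringsAsFactors = FALSE
)

# Step 1: one logical vector per column, TRUE where the cell contains a lowercase letter
per_column <- lapply(toy, grepl, pattern = "[[:lower:]]")

# Step 2: combine the vectors element-wise with OR, giving one flag per row
keep <- Reduce(`|`, per_column)

# Step 3: use the row flags to subset the data frame
result <- toy[keep, ]
```

Here keep is FALSE for the first row and TRUE for the other two, so result holds rows 2 and 3 of toy.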
Or using filter_all
library(dplyr)
library(stringr)
df %>%
filter_all(any_vars(str_detect(., "[a-z]")))
# v1 v2 v3 v4 v5
#1 B L L M e
#2 R N D t t
#3 F X M h x
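As a side note, the scoped verbs such as filter_all are marked as superseded in current dplyr releases. A rough equivalent using if_any might look like the sketch below. It assumes dplyr 1.0.4 or later, uses base grepl in place of str_detect so that stringr is not needed, and runs on a made-up data frame toy rather than the data from the question.

```r
library(dplyr)

# Small illustrative data frame (hypothetical, not the data from the question)
toy <- data.frame(
  v1 = c("A", "B", "C"),
  v2 = c("D", "e", "F"),
  v3 = c("G", "H", "i"),
  stringsAsFactors = FALSE
)

# Keep rows where at least one column contains a lowercase letter
kept <- toy %>%
  filter(if_any(everything(), ~ grepl("[[:lower:]]", .x)))
```

With this toy data, kept contains the two rows whose v2 or v3 value is lowercase.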