
Select columns based on string match - dplyr::select

Tags: regex, r, dplyr

I have a data frame ("data") with lots and lots of columns. Some of the columns contain a certain string ("search_string").

How can I use dplyr::select() to give me a subset including only the columns that contain the string?

I tried:

# columns as boolean vector
select(data, grepl("search_string",colnames(data)))

# columns as vector of column names
select(data, colnames(data)[grepl("search_string",colnames(data))])

Neither of them works.

I know that select() accepts numeric vectors as a substitute for column names, e.g.:

select(data,5,7,9:20) 

But I don't know how to get a numeric vector of column indices from my grepl() expression.

Timm S. asked Sep 18 '14 22:09


1 Answer

Within the dplyr world, try:

select(iris,contains("Sepal")) 
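Applied to the question's setup, a minimal sketch might look like the following. The data frame and its column names are invented here as a stand-in for the asker's `data`; only `select()` and `contains()` come from dplyr.

```r
library(dplyr)

# Hypothetical stand-in for the asker's `data`; the column names are invented.
data <- data.frame(
  id = 1:3,
  search_string_a = c(0.1, 0.2, 0.3),
  other = c("x", "y", "z"),
  b_search_string = c(TRUE, FALSE, TRUE)
)

# contains() matches a literal substring of the column name (not a regex)
# and ignores case by default.
subset_cols <- select(data, contains("search_string"))

names(subset_cols)
```

The matching columns come back in the order they appear in the data frame, so no intermediate vector of names or positions is needed.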

See the Selection section in ?select for numerous other helpers like starts_with, ends_with, etc.
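As a rough illustration of those other helpers, here is a sketch on `iris`. The last two lines also show one way to get the numeric column positions the question asks about, using base R's `grep()`. Wrapping the positions in `all_of()` assumes a reasonably recent dplyr/tidyselect, since that helper did not exist when the question was asked.

```r
library(dplyr)

# Prefix and suffix matches on column names.
starts <- select(iris, starts_with("Petal"))  # Petal.Length, Petal.Width
ends   <- select(iris, ends_with("Width"))    # Sepal.Width, Petal.Width

# matches() takes a regular expression instead of a literal string.
rx <- select(iris, matches("^Sepal\\."))      # Sepal.Length, Sepal.Width

# grep() returns integer positions directly (grepl() returns TRUE/FALSE).
idx <- grep("Sepal", colnames(iris))
by_pos <- select(iris, all_of(idx))
```

For a plain substring, `contains()` is usually the shortest route; `matches()` is the one to reach for when the pattern is a real regex.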

joran answered Sep 21 '22 05:09