
grep() to search column names of a dataframe

Tags: r

Is there a clearer, simpler, more direct, shorter way to do this, where df1 is a data frame?

names(df1[grep("Yield",names(df1))])

I want to return every column name that contains the string "Yield".

Thanks,

variable asked Jul 03 '14 19:07


2 Answers

grep() has a value argument that should work for this: with value = TRUE it returns the matching elements themselves instead of their positions. Try:

grep("Yield", names(df1), value = TRUE)

A minimal reproducible example:

df1 <- data.frame(
  Yield_1995 = 1:5,
  Yield_1996 = 6:10,
  Something = letters[1:5]
)

## Your current approach
names(df1[grep("Yield",names(df1))])
# [1] "Yield_1995" "Yield_1996"

## My suggestion
grep("Yield", names(df1), value=TRUE)
# [1] "Yield_1995" "Yield_1996"

OK, so it doesn't win in terms of brevity, but it does in terms of clarity of intention :-)
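To make the difference concrete, here is a small sketch of what grep() hands back with and without value = TRUE, using the same df1 as above. The variable names (idx, hits, mask, lower, none) are only for illustration. Note that the match is case-sensitive unless you pass ignore.case = TRUE.

```r
df1 <- data.frame(
  Yield_1995 = 1:5,
  Yield_1996 = 6:10,
  Something = letters[1:5]
)

# Default: integer positions of the matching names
idx <- grep("Yield", names(df1))
print(idx)

# value = TRUE: the matching names themselves
hits <- grep("Yield", names(df1), value = TRUE)
print(hits)

# grepl() gives a logical mask, which can index names(df1) directly
mask <- grepl("Yield", names(df1))
print(names(df1)[mask])

# Matching is case-sensitive by default, so lower-case "yield" finds nothing
none <- grep("yield", names(df1), value = TRUE)
print(none)

# ignore.case = TRUE makes the same pattern match again
lower <- grep("yield", names(df1), value = TRUE, ignore.case = TRUE)
print(lower)
```

If you only ever need the names, the value = TRUE form saves the extra subsetting step.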


For the sake of variety, here is a dplyr approach:

library(dplyr)
names(df1 %>% select(contains("Yield")))
# [1] "Yield_1995" "Yield_1996"
names(select(df1, contains("Yield")))
# [1] "Yield_1995" "Yield_1996"
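One thing worth knowing if you go the dplyr route: contains() ignores case by default, unlike grep(). The sketch below assumes dplyr is installed and reuses the same df1. The variable names (loose, strict, by_regex) are just for illustration.

```r
library(dplyr)

df1 <- data.frame(
  Yield_1995 = 1:5,
  Yield_1996 = 6:10,
  Something = letters[1:5]
)

# contains() is case-insensitive by default, so "yield" still matches
loose <- names(select(df1, contains("yield")))
print(loose)

# Ask for a case-sensitive match and nothing is selected
strict <- names(select(df1, contains("yield", ignore.case = FALSE)))
print(strict)

# matches() takes a regular expression rather than a literal string
by_regex <- names(select(df1, matches("^Yield_[0-9]{4}$")))
print(by_regex)
```

So contains() is a literal, case-insensitive match, while matches() is the closer equivalent of grep() with a regular expression.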
A5C1D2H2I1M1N2O1R2T1 answered Oct 06 '22 09:10


You could easily define your own function to make it shorter. For instance,

 myfun <- function(x,y) names(y[grep(x, names(y))])

Then, whenever you need it, you use

 myfun("Yield", df1)

It hardly gets any shorter.
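As a possible variation, the wrapper could lean on value = TRUE and pass any extra arguments through to grep(). This is only a sketch: the name cols_matching is made up for illustration, and the `...` pass-through is my own assumption about what would be convenient.

```r
# Hypothetical helper: returns the column names of `data` that match `pattern`.
# Extra arguments (e.g. ignore.case, fixed) are forwarded to grep().
cols_matching <- function(pattern, data, ...) {
  grep(pattern, names(data), value = TRUE, ...)
}

df1 <- data.frame(
  Yield_1995 = 1:5,
  Yield_1996 = 6:10,
  Something = letters[1:5]
)

res_plain <- cols_matching("Yield", df1)
print(res_plain)

# Case-insensitive match via the forwarded argument
res_nocase <- cols_matching("yield", df1, ignore.case = TRUE)
print(res_nocase)

# The pattern is a regular expression, so anchors and classes work too
res_regex <- cols_matching("_199[0-9]$", df1)
print(res_regex)
```

Whether this is worth a function depends on how often you need it; for a one-off, the plain grep() call is probably enough.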

coffeinjunky answered Oct 06 '22 08:10