
R: Identifying a text string within a column of a data frame

Tags:

r

One column of my data frame contains words and phrases. I am trying to create a dummy variable that flags the entries in this column containing a specific string of text anywhere within them.

For example:

  • kite
  • cars
  • box kites
  • model cars
  • i like kites that fly
  • cars of the world

     myvector<-c("kite","cars","box kites","model cars","i like kites that fly",
     "cars of the world")
    

I want to identify all the fields that contain the string "kite".

I've tried a few things such as any(), which() and %in% but nothing has worked so far.

Any help is greatly appreciated.

Will Phillips asked Sep 13 '12 15:09


1 Answer

You didn't provide a reproducible example, but the answer is grepl.

grepl("kite", df$words)

It returns a logical vector with one element per row: TRUE where the row contains the pattern, FALSE otherwise.
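Applied to the vector from the question, a minimal sketch might look like this. The column name `has_kite` is an assumption made for illustration, and the data frame is built here only so the snippet runs on its own.

```r
# Sample data from the question, wrapped in a data frame.
# `has_kite` is an illustrative column name, not from the original post.
myvector <- c("kite", "cars", "box kites", "model cars",
              "i like kites that fly", "cars of the world")
df <- data.frame(words = myvector, stringsAsFactors = FALSE)

# grepl() gives TRUE/FALSE per element; as.integer() converts it to a 0/1 dummy.
df$has_kite <- as.integer(grepl("kite", df$words))

# Entries that contain "kite" anywhere in the text
kite_rows <- df$words[df$has_kite == 1]
print(kite_rows)
```

Because the pattern is matched anywhere in the string, "box kites" and "i like kites that fly" are flagged along with "kite".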

To match any of several words, separate them with the regex alternation operator | inside the pattern string:

grepl("kite|cars|box kites", df$words)
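If the search terms live in a vector, the alternation pattern can be assembled with paste(). The sketch below assumes two hypothetical terms, `kite` and `cars`, and also shows one way to get a separate dummy column per term; the names `patterns`, `any_match`, and `dummies` are illustrative only.

```r
myvector <- c("kite", "cars", "box kites", "model cars",
              "i like kites that fly", "cars of the world")

# Hypothetical search terms, chosen for illustration.
patterns <- c("kite", "cars")

# Join the terms into a single regex: "kite|cars"
regex <- paste(patterns, collapse = "|")

# TRUE where an element contains at least one of the terms
any_match <- grepl(regex, myvector)

# One 0/1 column per term; fixed = TRUE treats each term as a literal string.
dummies <- sapply(patterns, function(p) as.integer(grepl(p, myvector, fixed = TRUE)))
print(dummies)
```

Note that "box kites" in the pattern above is redundant: any string containing "box kites" already matches "kite".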
Luciano Selzer answered Oct 23 '22 09:10