
Find value in column return TRUE/FALSE

If this is my code:

df<-data.frame(speaker=c("nancyball","nancyball","wigglet","wigglet"),
               phrase=c("the cat is on the hat",
                        "the cat runs",
                        "the cat is under the bowl",
                        "a cat plays"))

prep.list<-c("on","under","in")

I want to add a new column (df$kind) to df whose value is T or F depending on whether a word from prep.list appears in df$phrase.

There must be an easy way to do this.

Even better, I'd like df$kind to return a few different things, like if I also had:

verb.list<-c("plays","sings","sits")

I would get:

"prep, F, prep, verb"

I tried where(), which won't coerce my column into a vector, and apply() with grep(), but that lost a dimension.

cowfish asked May 15 '26 01:05


1 Answer

You could try:

  df$kind <- grepl(paste(prep.list, collapse="|"), df$phrase)
  df
  #   speaker                    phrase  kind
  #1 nancyball     the cat is on the hat  TRUE
  #2 nancyball              the cat runs FALSE
  #3   wigglet the cat is under the bowl  TRUE
  #4   wigglet               a cat plays FALSE
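One caveat with the plain alternation: grepl() matches substrings, so "in" would also match inside a word such as "sings". A minimal sketch, assuming whole-word matches are what is wanted, wraps the alternation in `\\b` word boundaries. The name `prep.pattern` and the test phrase "the cat sings" are introduced here for illustration only.

```r
df <- data.frame(speaker = c("nancyball", "nancyball", "wigglet", "wigglet"),
                 phrase = c("the cat is on the hat",
                            "the cat runs",
                            "the cat is under the bowl",
                            "a cat plays"))
prep.list <- c("on", "under", "in")

# Hypothetical name: a pattern that only matches the listed words as whole words
prep.pattern <- paste0("\\b(", paste(prep.list, collapse = "|"), ")\\b")

df$kind <- grepl(prep.pattern, df$phrase, perl = TRUE)
df

# The plain alternation finds "in" inside "sings"; the bounded pattern does not
grepl(paste(prep.list, collapse = "|"), "the cat sings")
grepl(prep.pattern, "the cat sings", perl = TRUE)
```

For the four phrases in the question both patterns give the same result; the difference only shows up once a phrase contains one of the prepositions inside a longer word.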


  # for the second part, build one logical index per word list
  indx1 <- grepl(paste(prep.list, collapse="|"), df$phrase)
  indx2 <- grepl(paste(verb.list, collapse="|"), df$phrase)
 

After finding @Jaap's answer, I guess you wanted:

  # updated based on comments from @alexis_laz
  df$kind <- c("F", "prep", "verb")[as.numeric(factor(1*indx1+2*indx2))]
  df
  #   speaker                    phrase kind
  #1 nancyball     the cat is on the hat prep
  #2 nancyball              the cat runs    F
  #3   wigglet the cat is under the bowl prep
  #4   wigglet               a cat plays verb
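Note that the factor() step numbers its levels by the values that actually occur, so the labels can shift if, say, every row matches something, and a row matching both lists would index past the end of the label vector. A possible variant, offered as a sketch rather than as the answer's method, indexes the labels directly. The fourth label "prep_verb" and the extra fifth phrase are assumptions added here to show the both-match case.

```r
phrase <- c("the cat is on the hat",
            "the cat runs",
            "the cat is under the bowl",
            "a cat plays",
            "the cat sits on the mat")   # made-up row that matches both lists
prep.list <- c("on", "under", "in")
verb.list <- c("plays", "sings", "sits")

indx1 <- grepl(paste(prep.list, collapse = "|"), phrase)
indx2 <- grepl(paste(verb.list, collapse = "|"), phrase)

# indx1 + 2 * indx2 is 0 (neither), 1 (prep only), 2 (verb only) or 3 (both);
# adding 1 turns that code into a position in the label vector
labels <- c("F", "prep", "verb", "prep_verb")   # "prep_verb" is an assumed label
kind <- labels[1 + indx1 + 2 * indx2]
kind
```

Because the position is computed from the two indices alone, the result for a row does not depend on which combinations happen to appear in the other rows.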

Update

Suppose you have multiple lists and more than one of them can match a particular element of df$phrase. One way is:

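  new.list <- c("hat", "bowl", "howl")
  # collect every object whose name ends in ".list" (the dot is escaped so it is literal)
  nm1 <- ls(pattern="\\.list$")
  lst1 <- mget(nm1)
  # one column per list: the list's prefix ("new", "prep", "verb") where it matches, NA otherwise
  indx2 <- sapply(names(lst1), function(x) {
    x1 <- gsub("\\..*", "", x)
    indx <- grepl(paste(lst1[[x]], collapse="|"), df$phrase)
    c(NA, x1)[indx+1]
  })

  # "F" when no list matched, otherwise the matching prefixes joined by "_"
  df$kind <- ifelse(rowSums(is.na(indx2))==ncol(indx2), "F",
                    apply(indx2, 1, function(x) paste(x[!is.na(x)], collapse="_")))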

   df
   #   speaker                    phrase     kind
   #1 nancyball     the cat is on the hat new_prep
   #2 nancyball              the cat runs        F
   #3   wigglet the cat is under the bowl new_prep
   #4   wigglet               a cat plays     verb
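The ls()/mget() lookup depends on which objects happen to be in the workspace. As an alternative sketch, assuming you are free to keep the word lists in one named list (called `word.lists` here, a name not used in the answer), the same labels can be built without searching the environment:

```r
phrase <- c("the cat is on the hat",
            "the cat runs",
            "the cat is under the bowl",
            "a cat plays")

# Hypothetical container: the list names double as the labels
word.lists <- list(new  = c("hat", "bowl", "howl"),
                   prep = c("on", "under", "in"),
                   verb = c("plays", "sings", "sits"))

# Logical matrix, one column per list: does the phrase contain any of its words?
hits <- sapply(word.lists, function(words) {
  grepl(paste(words, collapse = "|"), phrase)
})

# For each row, join the names of the matching lists, or use "F" if none matched
kind <- apply(hits, 1, function(row) {
  if (any(row)) paste(names(word.lists)[row], collapse = "_") else "F"
})
kind
```

The labels come out in the order the lists are stored, so putting `new` first reproduces the "new_prep" ordering shown above.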
akrun answered May 16 '26 14:05


