Count the occurrences of words in a string row-wise based on existing words in other columns

Question

I have a data frame whose words column holds strings. For each row, I want to count how many times the word stored in each of the other columns occurs in that row's string. Can the code below be modified to achieve this, or can anyone suggest another approach that doesn't require loops? Thanks so much in advance!

df <- data.frame(
  words = c("I want want to compare each ",
            "column to the values in",
            "If any word from the list any",
            "replace the word in the respective the word want"),
  want= c("want", "want", "want", "want"),
  word= c("word", "word", "word", "word"),
  any= c("any", "any", "any", "any"))

# add 1 for match and 0 for no match
for (i in 2:ncol(df)) {
  for (j in 1:nrow(df)) {
    df[j, i] <- ifelse(grepl(df[j, i], df$words[j]) %in% "TRUE", 1, 0)
  }
  print(i)
}

'data.frame':  4 obs. of  4 variables:
 $ words: chr  "I want want to compare each " "column to the values in" "If any word from the list any" "replace the word in the respective the word want"
 $ want : chr  "want" "want" "want" "want"
 $ word : chr  "word" "word" "word" "word"
 $ any  : chr  "any" "any" "any" "any"

The output should look like this:

    words                                                 want word any
1   I want want to compare each                            2    0   0
2   column to the values in                                0    0   0
3   If any word from the list any                          0    1   2
4   replace the word in the respective the word want       1    2   0

Current output with existing code looks like this:

    words                                                 want word any
1   I want want to compare each                            1    0   0
2   column to the values in                                0    0   0
3   If any word from the list any                          0    1   1
4   replace the word in the respective the word want       1    1   0

NelsonGon · Accepted Answer

With tidyverse (a slight violation of tidyverse syntax, since it uses $):

library(tidyverse)

df %>%
  mutate_at(vars(-words), function(x) str_count(df$words, x))
                                             words want word any
1                     I want want to compare each     2    0   0
2                          column to the values in    0    0   0
3                    If any word from the list any    0    1   2
4 replace the word in the respective the word want    1    2   0
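
In current dplyr releases, mutate_at() is marked as superseded in favour of across(). The following is a minimal sketch of the same idea, assuming dplyr 1.0.0 or later and the df from the question; the name counts is only illustrative. Inside across(), the words column can be referenced by name, so df$ is not needed:

```r
library(dplyr)
library(stringr)

# Same data as in the question
df <- data.frame(
  words = c("I want want to compare each ",
            "column to the values in",
            "If any word from the list any",
            "replace the word in the respective the word want"),
  want = c("want", "want", "want", "want"),
  word = c("word", "word", "word", "word"),
  any  = c("any", "any", "any", "any"))

# For every column except words, count how often that column's value
# occurs in the words string of the same row
counts <- df %>%
  mutate(across(-words, ~ str_count(words, .x)))

print(counts)
```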

Or use modify_at; as suggested by @Sotos, we can use . to refer to the piped data frame and keep tidyverse syntax.

df %>%
  modify_at(2:ncol(.), function(x) str_count(.$words, x))
                                             words want word any
1                     I want want to compare each     2    0   0
2                          column to the values in    0    0   0
3                    If any word from the list any    0    1   2
4 replace the word in the respective the word want    1    2   0
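
For readers who would rather avoid extra packages, the same counts can be sketched in base R without an explicit for loop. This is an illustration rather than part of the original answer: count_fixed is a hypothetical helper name, and fixed = TRUE is an assumption that the column values should be treated as literal text rather than regular expressions.

```r
# Same data as in the question
df <- data.frame(
  words = c("I want want to compare each ",
            "column to the values in",
            "If any word from the list any",
            "replace the word in the respective the word want"),
  want = c("want", "want", "want", "want"),
  word = c("word", "word", "word", "word"),
  any  = c("any", "any", "any", "any"))

# Hypothetical helper: number of literal occurrences of pattern in text.
# gregexpr() finds every match, regmatches() extracts them (character(0)
# when there is no match), and lengths() counts them.
count_fixed <- function(pattern, text) {
  lengths(regmatches(text, gregexpr(pattern, text, fixed = TRUE)))
}

# Replace each column except the first with its row-wise counts;
# mapply() pairs the pattern in row i with the words string in row i
df[-1] <- lapply(df[-1], function(col) {
  mapply(count_fixed, col, df$words, USE.NAMES = FALSE)
})

print(df)
```

Like str_count(), this counts substring matches, so a pattern such as any would also be counted inside many. If only whole words should count, one option is to wrap the pattern in "\\b" word boundaries and drop fixed = TRUE.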

Tags: r, nlp, text-mining

Onyeka D'mello
