
Calculating number and frequency of list elements in R?

Tags: list, r, element

I've got a long list whose entries contain different numbers of elements, and an element can occur more than once within the same entry.
This is an example of the first five lines:

A <- list(c("JAMES","CHARLES","JAMES","RICHARD"),  
          c("JOHN","ROBERT","CHARLES"),  
          c("CHARLES","WILLIAM","CHARLES","MICHAEL","WILLIAM","DAVID","CHARLES","WILLIAM"),  
          c("CHARLES"),  
          c("CHARLES","CHARLES"))  

Now I'd like to calculate the number of elements for each line of the list.
My desired output would look similar to this:

[1] 4  
[2] 3  
[3] 8  
[4] 1  
[5] 2  

In addition to that, I'd like to know how often the term "CHARLES" occurs in each line.
Based on my example I'd like to get an output similar to this:

[1] 1  
[2] 1  
[3] 3  
[4] 1  
[5] 2  

I thought of this:

> table(A)  
Error in table(A) : all arguments must have the same length  
> sum(A)  
Error in sum(A) : invalid 'type' (list) of argument  

But I don't know what to do about these error messages and am not aware of alternatives, unfortunately.
I know that the number of lines of the list is:

> length(A)  
[1] 5  

But this doesn't answer my question, unfortunately. I couldn't find any other answers, either.
Therefore I'd like to ask you to please help me calculate these two measures!

Thank you very much in advance!

user0815 asked Dec 06 '22 14:12


1 Answer

You should get familiar with lapply and sapply to loop over lists:

sapply(A, length)
[1] 4 3 8 1 2
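
As a side note, here is a minimal sketch of two other ways to get the same counts, assuming R 3.2.0 or later (where base R provides lengths()). The list A is copied from the question; the names n_lengths and n_vapply are only illustrative.

```r
# Example list from the question
A <- list(c("JAMES", "CHARLES", "JAMES", "RICHARD"),
          c("JOHN", "ROBERT", "CHARLES"),
          c("CHARLES", "WILLIAM", "CHARLES", "MICHAEL", "WILLIAM", "DAVID", "CHARLES", "WILLIAM"),
          c("CHARLES"),
          c("CHARLES", "CHARLES"))

# lengths() returns the length of every list element as an integer vector
n_lengths <- lengths(A)

# vapply() works like sapply(), but stops with an error unless each
# result is exactly one integer, which makes the return type predictable
n_vapply <- vapply(A, length, integer(1))

print(n_lengths)
print(n_vapply)
```

Both variants should agree with the sapply() call above; vapply() is mainly worth the extra argument when the result feeds into other code.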

sapply(A, function(x) sum(grepl("CHARLES", x)))
[1] 1 1 3 1 2

grepl() matches a regular expression against each element of a character vector and returns TRUE or FALSE for each element, depending on whether it matches. Calling sum() on these logical values then counts the TRUE values, because TRUE is treated as 1 and FALSE as 0.
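
To make the intermediate step visible, here is a small sketch using the first entry of the list from the question. It also shows one caveat: grepl() matches substrings, so if your data could contain longer names that include "CHARLES", an exact comparison with == may be closer to what you want. The vector y and the entry "CHARLESTON" are hypothetical and not part of the question's data.

```r
# First entry of the list from the question
x <- c("JAMES", "CHARLES", "JAMES", "RICHARD")

# One logical value per element: does it contain "CHARLES"?
hits <- grepl("CHARLES", x)
print(hits)

# Summing logicals counts the TRUE values
n_hits <- sum(hits)
print(n_hits)

# Hypothetical data to show the difference between regex and exact matching
y <- c("CHARLES", "CHARLESTON")
n_regex <- sum(grepl("CHARLES", y))  # counts both, since "CHARLESTON" contains "CHARLES"
n_exact <- sum(y == "CHARLES")       # counts only the exact string "CHARLES"
```

For the data in the question both approaches give the same counts, so the choice only matters if partial matches can occur.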

Andrie answered Jan 27 '23 16:01