I've got a long list whose entries contain different numbers of elements, and an element can occur more than once within the same entry.
This is an example of the first five lines:
A <- list(c("JAMES","CHARLES","JAMES","RICHARD"),
c("JOHN","ROBERT","CHARLES"),
c("CHARLES","WILLIAM","CHARLES","MICHAEL","WILLIAM","DAVID","CHARLES","WILLIAM"),
c("CHARLES"),
c("CHARLES","CHARLES"))
Now I'd like to calculate the number of elements for each line of the list.
My desired output would look similar to this:
[1] 4
[2] 3
[3] 8
[4] 1
[5] 2
In addition to that I'd like to know how often the term "CHARLES" occurs in each line.
Based on my example I'd like to get an output similar to this:
[1] 1
[2] 1
[3] 3
[4] 1
[5] 2
I thought of this:
> table(A)
Error in table(A) : all arguments must have the same length
> sum(A)
Error in sum(A) : invalid 'type' (list) of argument
But I don't know what to do about these error messages and am not aware of alternatives, unfortunately.
I know that the number of lines of the list is:
> length(A)
[1] 5
But this doesn't answer my question, unfortunately. I couldn't find any other answers, either.
Therefore I'd like to ask you to please help me calculate these two measures!
Thank you very much in advance!
You should get familiar with lapply and sapply to loop over lists:
sapply(A, length)
[1] 4 3 8 1 2
sapply(A, function(x)sum(grepl("CHARLES", x)))
[1] 1 1 3 1 2
What grepl() does is match a regular expression against each element of x and return TRUE or FALSE for each one, depending on whether there is a match. I then do a sum() over these logical values, i.e. I count the TRUE values.
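Note that grepl("CHARLES", x) matches substrings, so a name such as "CHARLESTON" would be counted as well. If only exact matches should count, a comparison with == is one alternative. The following is a minimal sketch using the example data from the question; the variable names n_elements and n_charles are made up for illustration, and lengths() assumes R 3.2.0 or later:

```r
A <- list(c("JAMES","CHARLES","JAMES","RICHARD"),
          c("JOHN","ROBERT","CHARLES"),
          c("CHARLES","WILLIAM","CHARLES","MICHAEL","WILLIAM","DAVID","CHARLES","WILLIAM"),
          c("CHARLES"),
          c("CHARLES","CHARLES"))

# Number of elements per list entry; lengths() is a base R
# shortcut for sapply(A, length)
n_elements <- lengths(A)
n_elements
# [1] 4 3 8 1 2

# Count exact matches only: x == "CHARLES" gives a logical vector,
# and sum() counts its TRUE values. vapply() additionally checks
# that every result is a single integer.
n_charles <- vapply(A, function(x) sum(x == "CHARLES"), integer(1))
n_charles
# [1] 1 1 3 1 2
```

For this example both approaches give the same counts; they only differ when a list entry contains "CHARLES" as part of a longer string.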