I'm fairly new to R and trying to build a data frame that includes the frequency for each unique observation for each element of my nested list:
lst <- list(
  c("A", "A", "A"),
  c("A", "B"),
  c("A", "A", "B", "B", "B", "B"),
  c("A", "C", "C"),
  c("B", "A")
)
I could figure out how to do that disregarding the elements of the list or for just one of the elements:
prop <- prop.table(table(unlist(lapply(lst, unique))))
as.data.frame(prop)
# or
as.data.frame(prop.table(table(lst[[1]])))
But not how to effectively combine the two.
My desired output is along the lines of:
type    1    2    3    4    5
=============================
A       1   .5  .33  .33   .5
B       0   .5  .67    0   .5
C       0    0    0  .67    0
Additionally, I would like the output to show more digits than it does when just using prop.table(). Any advice is much appreciated.
We may do this in a single line. The trick is to create a two-column data.frame with stack, get the table, and use proportions with MARGIN as 2:
proportions(table(stack(setNames(lst, seq_along(lst)))), 2)
Output:
      ind
values         1         2         3         4         5
     A 1.0000000 0.5000000 0.3333333 0.3333333 0.5000000
     B 0.0000000 0.5000000 0.6666667 0.0000000 0.5000000
     C 0.0000000 0.0000000 0.0000000 0.6666667 0.0000000
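To see what each step of that one-liner contributes, here is a sketch that splits it into intermediate objects and then converts the result to a data frame, since the question asks for one. The names named, long, counts, props and df are made up for illustration and are not part of the original answer. Using as.data.frame.matrix() and print(digits = ) is just one way to keep the wide layout and show more digits.

```r
lst <- list(
  c("A", "A", "A"),
  c("A", "B"),
  c("A", "A", "B", "B", "B", "B"),
  c("A", "C", "C"),
  c("B", "A")
)

# Step 1: name the elements so stack() has group labels to work with
named <- setNames(lst, seq_along(lst))

# Step 2: long format, one row per observation (columns: values, ind)
long <- stack(named)

# Step 3: cross-tabulate values (rows) against list element (columns)
counts <- table(long)

# Step 4: divide each column by its column total (margin = 2)
props <- proportions(counts, margin = 2)

# as.data.frame() on a table returns long format, so use
# as.data.frame.matrix() to keep one column per list element
df <- as.data.frame.matrix(props)

# The values are stored at full precision; digits only affects printing
print(df, digits = 10)
```

If a fixed number of decimals is wanted in the stored values rather than only in the printout, round(df, 4) would do that instead.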
An approach using prop.table
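First get the unique entries, un_lst, then compare each observation in each list element against them. The row sums of those comparisons are the counts, which prop.table turns into proportions.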
un_lst <- unique(unlist(lst))
data.frame(
  sapply(lst, function(x)
    # count matches per unique value, then convert counts to proportions
    prop.table(setNames(rowSums(sapply(x, function(y)
      y == un_lst)), un_lst))))
Output:
  X1  X2        X3        X4  X5
A  1 0.5 0.3333333 0.3333333 0.5
B  0 0.5 0.6666667 0.0000000 0.5
C  0 0.0 0.0000000 0.6666667 0.0
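The nested sapply calls are easier to follow when run on a single list element. The sketch below does that for the third element; hits, counts, props_x and alt are illustrative names that do not appear in the answer. The last line shows a different technique, table() on a factor with fixed levels, which is not what the answer uses but should give the same proportions.

```r
un_lst <- c("A", "B", "C")            # unique(unlist(lst)) for the example data
x <- c("A", "A", "B", "B", "B", "B")  # lst[[3]]

# One column per observation in x, one row per unique value;
# TRUE where the observation equals that unique value
hits <- sapply(x, function(y) y == un_lst)

# Row sums count how often each unique value occurs in x
counts <- setNames(rowSums(hits), un_lst)

# Convert counts to proportions
props_x <- prop.table(counts)
props_x

# Alternative (not from the answer): a factor with fixed levels keeps
# zero counts for values that are absent from x
alt <- prop.table(table(factor(x, levels = un_lst)))
```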