
R: Count objects in column-list

Tags: list, dataframe, r

Let me define a data frame with one column, id, formed by a vector of integers

df <- data.frame(id = c(1,2,2,3,3))

and a column, objects, which is instead a list of character vectors. Let's create the column with the following function

randomObjects <- function(argument) {
  numberObjects <- sample(c(1,2,3,4), 1)
  vector <- character()
  for (i in 1:numberObjects) {
    vector <- c(vector, sample(c("apple","pear","banana"), 1))
  }
  return(vector)
} 

which is then called with lapply

set.seed(28100)
df$objects <- lapply(df$id, randomObjects)

The resulting data frame is

df
#   id                 objects
# 1  1            apple, apple
# 2  2     apple, banana, pear
# 3  2                  banana
# 4  3    banana, pear, banana
# 5  3 pear, pear, apple, pear
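Since R 3.6.0 the default algorithm behind sample() is different, so set.seed(28100) may no longer reproduce the objects printed above. A minimal sketch that rebuilds the same data frame literally, with the values copied from the printed output (the call to lengths() is only an added sanity check and assumes R 3.2.0 or later):

```r
# Rebuild the example data without relying on the random number generator
df <- data.frame(id = c(1, 2, 2, 3, 3))

# One character vector per row, copied from the printed data frame
df$objects <- list(
  c("apple", "apple"),
  c("apple", "banana", "pear"),
  "banana",
  c("banana", "pear", "banana"),
  c("pear", "pear", "apple", "pear")
)

# Sanity check: number of objects stored in each row
lengths(df$objects)
# [1] 2 3 1 3 4
```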

Now I want to count how many objects of each type correspond to each id, producing a data frame like this

summary <- data.frame(id = c(1, 2, 3),
                      apples = c(2, 1, 1), 
                      bananas = c(0, 2, 2),
                      pears = c(0, 1, 4))

summary
#   id apples bananas pears
# 1  1      2       0     0
# 2  2      1       2     1
# 3  3      1       2     4

How should I collapse the information of df into a more compact data frame such as summary without using a for loop?

asked Apr 17 '15 14:04 by CptNemo



1 Answer

Here is a "data.table" approach:

library(data.table)
dcast.data.table(as.data.table(df)[
  , unlist(objects), by = id][
    , .N, by = .(id, V1)], 
  id ~ V1, value.var = "N", fill = 0L)
#    id apple banana pear
# 1:  1     2      0    0
# 2:  2     1      2    1
# 3:  3     1      2    4

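Unlist the values by id, count each (id, value) pair using .N, and reshape wide with dcast.data.table (available as dcast in recent data.table releases), filling combinations that never occur with 0.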
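The same three steps (flatten by id, count, reshape wide) can also be sketched in base R. This is an illustration under the assumption that df holds the values printed in the question, not the data.table code above; the names long, counts and wide are made up for the example, and lengths() assumes R 3.2.0 or later.

```r
# Example data, rebuilt literally from the values printed in the question
df <- data.frame(id = c(1, 2, 2, 3, 3))
df$objects <- list(
  c("apple", "apple"),
  c("apple", "banana", "pear"),
  "banana",
  c("banana", "pear", "banana"),
  c("pear", "pear", "apple", "pear")
)

# Step 1: flatten to long form, repeating each id once per object in its row
long <- data.frame(
  id = rep(df$id, lengths(df$objects)),
  object = unlist(df$objects)
)

# Steps 2 and 3: cross-tabulating id against object counts the pairs
# and lays them out wide in one go
counts <- table(long$id, long$object)

# Turn the table into an ordinary data frame with an explicit id column
wide <- cbind(id = as.numeric(rownames(counts)), as.data.frame.matrix(counts))
rownames(wide) <- NULL

wide
#   id apple banana pear
# 1  1     2      0    0
# 2  2     1      2    1
# 3  3     1      2    4
```

table() already returns the counts in wide form with zeros for absent pairs, so no separate reshape call is needed.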


Initially, I had thought of mtabulate from "qdapTools", but that doesn't do the aggregation step. Still, you can try something like:

library(data.table)
library(qdapTools)
data.table(cbind(df[1], mtabulate(df$objects)))[, lapply(.SD, sum), by = id]
#    id apple banana pear
# 1:  1     2      0    0
# 2:  2     1      2    1
# 3:  3     1      2    4
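For readers without qdapTools, the two stages (tabulate each row, then sum within id) can be sketched in base R. This is an illustration of the idea rather than how mtabulate is implemented; lv, per_row and by_id are hypothetical names, and df is rebuilt from the values printed in the question.

```r
# Example data, rebuilt literally from the values printed in the question
df <- data.frame(id = c(1, 2, 2, 3, 3))
df$objects <- list(
  c("apple", "apple"),
  c("apple", "banana", "pear"),
  "banana",
  c("banana", "pear", "banana"),
  c("pear", "pear", "apple", "pear")
)

# Every object type seen anywhere, so each row is counted over the same columns
lv <- sort(unique(unlist(df$objects)))

# Stage 1: one row of counts per data-frame row (no aggregation yet)
per_row <- t(vapply(
  df$objects,
  function(x) tabulate(match(x, lv), nbins = length(lv)),
  integer(length(lv))
))
colnames(per_row) <- lv

# Stage 2: add up the per-row counts within each id
by_id <- rowsum(per_row, group = df$id)

by_id
#   apple banana pear
# 1     2      0    0
# 2     1      2    1
# 3     1      2    4
```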
answered Sep 22 '22 11:09 by A5C1D2H2I1M1N2O1R2T1