I have a data table that contains character observations:
library(data.table)
library(stringr)
DT = data.table(strings = c('AAABD', 'BBDA', 'AACBDAA', 'ABACD'))
I would like to create a variable that contains counts of 'A', 'AA', and 'AAA' in each observation as a list. To do this I have created a function foo:
foo <- function(str) {
  n  <- str_count(str, 'A')
  n2 <- str_count(str, 'AA')
  n3 <- str_count(str, 'AAA')
  df <- list('n' = n, 'n2' = n2, 'n3' = n3)
  return(df)
}
I apply this function to DT to create a new list column holding the counts for each observation:
DT[, count := foo(strings)]
When I do this, I receive this warning:
Warning message:
In `[.data.table`(DT, , `:=`(counts, foo(strings))) :
Supplied 3 items to be assigned to 4 items of column 'counts' (recycled leaving remainder of 1 items).
In the resulting data table, each entry of the count column has length 4 instead of 3, and it does not give the correct number of 'A', 'AA', and 'AAA' matches for the corresponding string in strings. How can I assign a list to each row of a data table?
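foo returns a list of three vectors (one per pattern), each of length 4, but a list column needs one element per row. You need to transpose the list so that it has four elements, each of length 3: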
foo <- function(str) {
  n  <- str_count(str, 'A')
  n2 <- str_count(str, 'AA')
  n3 <- str_count(str, 'AAA')
  df <- transpose(list('n' = n, 'n2' = n2, 'n3' = n3)) # <- add transpose
  return(df)
}
DT[, count := foo(strings)]
DT
#    strings count
# 1:   AAABD 3,1,1
# 2:    BBDA 1,0,0
# 3: AACBDAA 4,2,0
# 4:   ABACD 2,0,0
str(DT)
# Classes ‘data.table’ and 'data.frame': 4 obs. of 2 variables:
# $ strings: chr "AAABD" "BBDA" "AACBDAA" "ABACD"
# $ count :List of 4
# ..$ : int 3 1 1
# ..$ : int 1 0 0
# ..$ : int 4 2 0
# ..$ : int 2 0 0
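To see why the transpose matters, here is a minimal sketch of the list's shape before and after. The counts are hard-coded from the output above rather than recomputed, and the names by_pattern and by_row are only illustrative. The call is written as data.table::transpose on the assumption that another attached package (purrr, for example) might otherwise mask it.

```r
library(data.table)

# Counts copied from the output above: one vector per pattern
# ('A', 'AA', 'AAA'), each with one entry per row of DT.
by_pattern <- list(
  n  = c(3L, 1L, 4L, 2L),
  n2 = c(1L, 0L, 2L, 0L),
  n3 = c(1L, 0L, 0L, 0L)
)
length(by_pattern)  # 3 elements, which cannot line up with 4 rows

# Flip the list so that there is one vector per row instead.
by_row <- data.table::transpose(by_pattern)
length(by_row)      # 4 elements, one per row of DT
by_row[[3]]         # counts for 'AACBDAA'
```

Because by_row has exactly one element per row, assigning it with := fills the list column without any recycling.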