
Coercing a column of lists into a string in an R data frame

Tags:

r

Create sample data:

id <- c(12, 32, 42, 42, 52, 52, 67, 67)
relationship_id <- c(15,1,59,1,61,6,59,1)
sample.data <- data.frame(id,relationship_id)

For each id that appears more than once, concatenate the relationship_id:

combo <- aggregate(relationship_id ~ id, data = sample.data, paste, sep=",")
table(combo$relationship_id)
Error in table(combo$relationship_id) :
  all arguments must have the same length

I figured out what caused this error:

class(combo$relationship_id)
[1] "list"

But when I try to coerce the list column to a character vector:

combo["relationship_id"] <- lapply(combo["relationship_id"], as.character)
head(combo)
  id relationship_id
1 12              15
2 32               1
3 42    c("59", "1")
4 52    c("61", "6")
5 67    c("59", "1")

The result includes the c(...) concatenation syntax. I understand that I could parse the output to make it usable, but why is this happening? Is there an easier way to clean up the output?

Shayna asked Jan 06 '15 16:01



1 Answer

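You are trying to tackle the wrong problem. If you want to collapse each group's values into a single string, use collapse = "," instead of sep. With only one vector passed to paste, sep has nothing to separate, so paste returns one string per value; because the groups then have different lengths, aggregate stores them as a list column.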

combo <- aggregate(relationship_id ~ id, data = sample.data, 
                   paste, collapse=",")
table(combo$relationship_id)
# 
#    1   15 59,1 61,6 
#    1    1    2    1 
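
As for why the c("59", "1") text shows up: as far as I can tell, as.character() on a list deparses any element that holds more than one value, so the R syntax for the vector ends up in the string. Here is a minimal sketch of that, using the sample data from the question. The names by_sep, by_collapse and deparsed are just illustrative, and the vapply() repair at the end is one possible way to fix a list column you already have, not the only one.

```r
# Sample data from the question
sample.data <- data.frame(
  id = c(12, 32, 42, 42, 52, 52, 67, 67),
  relationship_id = c(15, 1, 59, 1, 61, 6, 59, 1)
)

# With a single vector argument, sep has nothing to separate,
# so paste() returns one string per element.
by_sep <- paste(c(59, 1), sep = ",")           # c("59", "1")

# collapse joins the elements into one string.
by_collapse <- paste(c(59, 1), collapse = ",") # "59,1"

# as.character() on a list deparses elements longer than one,
# which is where the c("59", "1") text comes from.
deparsed <- as.character(list("15", c("59", "1")))

# Repairing an existing list column: collapse each element to a
# single string. vapply() checks that every result is one string.
combo <- aggregate(relationship_id ~ id, data = sample.data,
                   paste, sep = ",")
combo$relationship_id <- vapply(combo$relationship_id, paste,
                                character(1), collapse = ",",
                                USE.NAMES = FALSE)
```

After the last step combo$relationship_id is a plain character vector, so table(combo$relationship_id) should work the same way as in the collapse version above.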
A5C1D2H2I1M1N2O1R2T1 answered Sep 18 '22 21:09
