
Unsplit a list of data frames after subsetting the data frames in the list

Tags: dataframe, r

Let's say I have a data frame "x"

> x
    x1         x2 x3
1  box 0.81432465  4
2  box 0.19628122  2
3  box 0.06619734  1
4  box 0.90403568  5
5  box 0.52693274  3
6  axe 0.28665840  2
7  axe 0.45193228  3
8  axe 0.48278466  4
9  axe 0.86553847  5
10 axe 0.13925190  1
11 cat 0.86340413  5
12 cat 0.28387540  2
13 cat 0.24297445  1
14 cat 0.36651366  3
15 cat 0.55038108  4

Then I perform the following operations on it:

> x.factor <- factor(x[,1]) ## convert column 1 to a factor
> x.split <- split(x, x.factor)
> unsplit(x.split, x.factor) ## get back original data frame

This works fine so far. But when I do the following, I get an error:

> x.split2 <- lapply(x.split, function(x) {head(x,1)})
> unsplit(x.split2, x.factor) ## trying to combine into a data frame

Error in `row.names<-.data.frame`(`*tmp*`, value = value) : 
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘1’, ‘11’, ‘6’

I don't get it because, if I print out x.split2, the row names are unique for each element in the list.

Why am I getting this error?

asked Apr 23 '14 09:04 by user3212376


1 Answer

Instead of unsplit, you can use the common do.call(rbind, ...) approach:

do.call(rbind, x.split2)
#      x1        x2 x3
# axe axe 0.2866584  2
# box box 0.8143246  4
# cat cat 0.8634041  5
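For anyone who wants to run this end to end, here is a minimal sketch. It assumes the data frame can be rebuilt from the values printed in the question (the original construction code was not posted), and the names "combined" and "combined.plain" are only illustrative.

```r
# Rebuild the example data from the values printed in the question.
x <- data.frame(
  x1 = rep(c("box", "axe", "cat"), each = 5),
  x2 = c(0.81432465, 0.19628122, 0.06619734, 0.90403568, 0.52693274,
         0.28665840, 0.45193228, 0.48278466, 0.86553847, 0.13925190,
         0.86340413, 0.28387540, 0.24297445, 0.36651366, 0.55038108),
  x3 = c(4, 2, 1, 5, 3, 2, 3, 4, 5, 1, 5, 2, 1, 3, 4),
  stringsAsFactors = FALSE
)

x.factor <- factor(x[, 1])
x.split  <- split(x, x.factor)
x.split2 <- lapply(x.split, function(d) head(d, 1))

# Stack the one-row data frames; the row names are taken from the list names.
combined <- do.call(rbind, x.split2)

# Optional: drop the group-name row names in favour of automatic 1, 2, 3.
combined.plain <- combined
rownames(combined.plain) <- NULL
```

Note that the original row names (6, 1, 11) are lost with rbind here, because each list element has a name and exactly one row.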

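Your present approach doesn't work because unsplit() expects its grouping argument to have one entry per row of the result. Your "x.factor" object still has 15 entries, one for each row of the original data.frame, but "x.split2" holds only one row per group. Each group's single row name is therefore recycled to fill all five of that group's slots, which produces the duplicated row names in the error. Since you're just taking one row per factor level, you can instead pass a grouping vector with one entry per level: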

unsplit(x.split2, levels(x.factor))
#     x1        x2 x3
# 6  axe 0.2866584  2
# 1  box 0.8143246  4
# 11 cat 0.8634041  5
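The mismatch can be checked directly. The sketch below uses a small stand-in for "x" (only the x1 and x3 columns, an assumption made to keep it short) and the illustrative names "rn", "failed" and "res"; it is meant as a demonstration of the recycling, not as a description of everything unsplit() does internally.

```r
# Small stand-in data with the same grouping as the question's "x".
x <- data.frame(x1 = rep(c("box", "axe", "cat"), each = 5),
                x3 = c(4, 2, 1, 5, 3, 2, 3, 4, 5, 1, 5, 2, 1, 3, 4),
                stringsAsFactors = FALSE)
x.factor <- factor(x$x1)
x.split2 <- lapply(split(x, x.factor), function(d) head(d, 1))

# The grouping vector still describes 15 rows, but only 3 rows remain.
length(x.factor)
sum(sapply(x.split2, nrow))

# Unsplitting just the row names shows each single name being recycled
# across its group's five positions.
rn <- unsplit(lapply(x.split2, rownames), x.factor)
anyDuplicated(rn) > 0

# The original call fails for that reason.
failed <- inherits(
  suppressWarnings(try(unsplit(x.split2, x.factor), silent = TRUE)),
  "try-error"
)

# A grouping vector with one entry per remaining row works.
res <- unsplit(x.split2, names(x.split2))
```

Using names(x.split2) here is equivalent to levels(x.factor) as long as no factor level is empty.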
answered Oct 19 '22 04:10 by A5C1D2H2I1M1N2O1R2T1