
Multiple list nesting with split(), R

Given a dataset with multiple unique elements in a column, I'd like to split it into one dataframe per unique element, but have each dataframe nested one level down. Essentially, I want to add an extra level to the output of split().

For instance (using the built-in iris table as an example):

iris
mylist <- split(iris, iris$Species)

This produces a list, mylist, that contains three data frames: setosa, versicolor, and virginica.

mylist[["setosa"]]

       Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1           5.1         3.5          1.4         0.2  setosa
2           4.9         3.0          1.4         0.2  setosa
3           4.7         3.2          1.3         0.2  setosa
4           4.6         3.1          1.5         0.2  setosa
5           5.0         3.6          1.4         0.2  setosa

But I would actually like to nest that data table in a sublist called results BUT keep the upper level list name as setosa. Such that:

mylist$setosa[["results"]]

  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1           5.1         3.5          1.4         0.2  setosa
2           4.9         3.0          1.4         0.2  setosa
3           4.7         3.2          1.3         0.2  setosa
4           4.6         3.1          1.5         0.2  setosa
5           5.0         3.6          1.4         0.2  setosa

I could do this with manual manipulation, but I'd like this to run automatically. I've tried mapply, without success:

mapply(function(names, df) 
   names <- split(df, df[["Species"]]), 
   unique(iris$Species), iris)

Any advice? Also happy to use a tidyr package if that makes things easier...

moxed asked Dec 09 '25 17:12


2 Answers

Consider by (an object-oriented wrapper for tapply), which is very similar to split but lets you run a function on each subset. Many useRs run split + lapply, unaware that both can be replaced with by:

mylist <- by(iris, iris$Species, function(sub) list(results=sub), simplify = FALSE)

head(mylist$setosa$results)
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1          5.1         3.5          1.4         0.2  setosa
# 2          4.9         3.0          1.4         0.2  setosa
# 3          4.7         3.2          1.3         0.2  setosa
# 4          4.6         3.1          1.5         0.2  setosa
# 5          5.0         3.6          1.4         0.2  setosa
# 6          5.4         3.9          1.7         0.4  setosa

head(mylist$versicolor$results)
#    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
# 51          7.0         3.2          4.7         1.4 versicolor
# 52          6.4         3.2          4.5         1.5 versicolor
# 53          6.9         3.1          4.9         1.5 versicolor
# 54          5.5         2.3          4.0         1.3 versicolor
# 55          6.5         2.8          4.6         1.5 versicolor
# 56          5.7         2.8          4.5         1.3 versicolor

head(mylist$virginica$results)
#     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
# 101          6.3         3.3          6.0         2.5 virginica
# 102          5.8         2.7          5.1         1.9 virginica
# 103          7.1         3.0          5.9         2.1 virginica
# 104          6.3         2.9          5.6         1.8 virginica
# 105          6.5         3.0          5.8         2.2 virginica
# 106          7.6         3.0          6.6         2.1 virginica
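
As a quick sanity check, the by result can be compared against the split + lapply combination it is meant to replace. This is only a sketch; the names by_list and sl_list are made up for the comparison and are not part of the answer above.

```r
# Nested list built with by(): one element per species,
# each holding its data frame under "results".
by_list <- by(iris, iris$Species, function(sub) list(results = sub))

# The same structure built with split() followed by lapply().
sl_list <- lapply(split(iris, iris$Species), function(sub) list(results = sub))

# Both carry the species names at the top level.
names(by_list)
names(sl_list)

# The nested data frames should match between the two approaches.
all.equal(by_list$setosa$results, sl_list$setosa$results)

# Note that by() returns an object of class "by" rather than a plain list.
class(by_list)
```

The class difference mostly affects printing; access with $ and [[ works the same way in both cases.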
Parfait answered Dec 12 '25 07:12


Wrapping the names in setNames before passing them to lapply keeps the names of the list you're iterating through:

iris
mylist <- split(iris, iris$Species)
mylist2 <- lapply(setNames(names(mylist), names(mylist)), function(x){
  list(results = mylist[[x]])
})
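
Since lapply already carries over the names of the list it is given, a shorter variant may be to iterate over mylist itself instead of over its names. A minimal sketch, where nested is just an illustrative name:

```r
# Split iris into one data frame per species.
mylist <- split(iris, iris$Species)

# lapply() keeps the names of its input list, so each species name
# stays at the top level while its data frame moves under "results".
nested <- lapply(mylist, function(df) list(results = df))

names(nested)
head(nested$setosa$results)
```

Iterating over the names, as above, is still handy when the function body needs the name itself, for example to label a plot or build a file name.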
Beemyfriend answered Dec 12 '25 07:12
