Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

separate list column into columns in R

Given a dataframe with 2 columns, id and value, I want to transform it into a dataframe with more columns containing the id and the quantiles from the column value: q0, q25, q50, q75, q100.

I do not know how to separate a column containing a list into more columns containing its values. Of course, all the lists have the same length.

Here is an example:

library(dplyr)
library(tidyr)

set.seed(0)
df <- data.frame(id = rep(c("Alice", "Bob"), each = 10),
                 value = round(rnorm(20) * 10))
> df
      id value
1  Alice    13
2  Alice    -3
3  Alice    13
4  Alice    13
5  Alice     4
6  Alice   -15
7  Alice    -9
8  Alice    -3
9  Alice     0
10 Alice    24
11   Bob     8
12   Bob    -8
13   Bob   -11
14   Bob    -3
15   Bob    -3
16   Bob    -4
17   Bob     3
18   Bob    -9
19   Bob     4
20   Bob   -12
df_quantiles <- df %>% 
  group_by(id) %>% 
  summarise( quantiles = list(quantile(value))) %>% 
  ungroup()
> df_quantiles
    # A tibble: 2 x 2
          id quantiles
          
    1  Alice 
    2    Bob 
> df_quantiles$quantiles
[[1]]
  0%  25%  50%  75% 100% 
 -15   -3    2   13   24 

[[2]]
    0%    25%    50%    75%   100% 
-12.00  -8.75  -3.50   1.50   8.00 

The next command doesn't do the work. Can you please help me with the good separate call? Is there any other method to get the result?

> df_quantiles %>%
+ separate(quantiles, paste0("q", seq(0,5)))
# A tibble: 2 x 7
      id    q0    q1    q2    q3    q4    q5
*       
1  Alice     c    15     3     2    13    24
2    Bob     c    12     8    75     3     5
Warning message:
Too many values at 2 locations: 1, 2 

What I expect is this dataframe:

    id      q0%    q25%    q50%    q75%   q100%
1  Alice    -15      -3       2      13      24
2    Bob -12.00   -8.75   -3.50    1.50    8.00
like image 223
Costin Avatar asked Dec 14 '22 03:12

Costin


2 Answers

How about

cbind.data.frame(id=unique(df$id), do.call(rbind, df_quantiles$quantiles))

with output

id  0%   25%  50%  75% 100%
1 Alice -15 -3.00  2.0 13.0   24
2   Bob -12 -8.75 -3.5  1.5    8
like image 141
Sandipan Dey Avatar answered Jan 07 '23 19:01

Sandipan Dey


A combination of list, as_tibble from tibble, as.list and unnest from tidyr does the job

library(tidyverse)
df_quantiles <- df %>% 
  group_by(id) %>% 
  summarise(quantiles = list(as_tibble(as.list(quantile(value))))) %>% unnest() %>%
  ungroup()
like image 25
Erwan LE PENNEC Avatar answered Jan 07 '23 18:01

Erwan LE PENNEC