
Summarise all values of a column into a vector

Tags: r, tibble

So here is something I'm still trying to get right.

Imagine a tibble like this one:

library(tidyverse)
t1 <- tibble(
  id       = c(1,1,1,1,2,2,2,2,2),
  id_sub   = c(1,1,2,2,1,2,2,2,2),
  position = c(1,2,1,2,1,1,2,3,4),
  head     = c(1,1,2,2,1,3,2,2,3)
  )

What I want to achieve is to create a fifth column, depend, that holds the values of head for each id_sub within each id. This means that each value of depend is a vector of length at least 1 (which shouldn't be a problem with a tibble, right?).

The result I'm looking for in this example would be a column holding the following vectors:

c(1,1),c(2,2),c(1),c(3,2,2,3)

Of course my data is a little bigger, and so far the only solution I was able to find was grouping the tibble and spreading position and head:

t1 %>% 
  group_by(id, id_sub) %>% 
  spread(position, head)

This of course creates multiple columns:

# A tibble: 4 x 6
# Groups:   id, id_sub [4]
     id id_sub   `1`   `2`   `3`   `4`
* <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl>
1     1      1     1     1    NA    NA
2     1      2     2     2    NA    NA
3     2      1     1    NA    NA    NA
4     2      2     3     2     2     3

For just one sample I could transform position x head into a matrix and turn it into a vector, ignoring NA. But this doesn't help me on a larger scale.

m <- t1 %>% 
  filter(id == 2 & id_sub == 2) %>% 
  select(-c(id,id_sub)) %>% 
  spread(position, head) %>% 
  as.matrix()
m <- as.vector(m)
m[!is.na(m)]

With the following result:

[1] 3 2 2 3

Happy to hear your thoughts and suggestions!

Paavo Pohndorff asked Jan 29 '23 00:01


1 Answer

Another possible solution:

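# rleid() (from the data.table package) numbers consecutive runs of id_sub,
# so each run of equal id_sub values becomes one group
t1 %>% 
  group_by(data.table::rleid(id_sub)) %>% 
  summarise(hd = list(head)) %>%   # collect the head values of each group into one list element
  pull(hd)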

which gives:

[[1]]
[1] 1 1

[[2]]
[1] 2 2

[[3]]
[1] 1

[[4]]
[1] 3 2 2 3
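
If the grouping keys should stay next to the vectors, the same idea can be sketched by grouping on id and id_sub directly instead of on run lengths. This is only a sketch under a few assumptions: it uses the example data from the question, names the list-column depend as the question does, and relies on the .groups argument of summarise(), which needs dplyr 1.0.0 or later. Grouping on both keys also avoids depending on row order, since rleid(id_sub) would merge two neighbouring groups from different ids that happen to share the same id_sub.

```r
library(dplyr)
library(tibble)

# Example data copied from the question
t1 <- tibble(
  id       = c(1,1,1,1,2,2,2,2,2),
  id_sub   = c(1,1,2,2,1,2,2,2,2),
  position = c(1,2,1,2,1,1,2,3,4),
  head     = c(1,1,2,2,1,3,2,2,3)
)

res <- t1 %>%
  arrange(id, id_sub, position) %>%   # keep the head values in position order
  group_by(id, id_sub) %>%
  # one row per (id, id_sub); depend is a list-column of numeric vectors
  summarise(depend = list(head), .groups = "drop")

res$depend
```

Here res keeps id and id_sub as ordinary columns, with one row per group and the corresponding head values stored as a vector in depend.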

Jaap answered Feb 01 '23 07:02