
Element of vector to different columns of data frame

I have a df:

   group number id
1      A   abcd  1
2      A   abcd  2
3      A   abcd  3
4      A   efgh  4
5      A   efgh  5
6      B   abcd  1
7      B   abcd  2
8      B   abcd  3
9      B   abcd  9
10     B   ijkl 10

I want to make it like this:

   group number  data1 data2 data3 data4           Length
1      A   abcd      1     2     3                      3
2      A   efgh      4     5                            2
3      B   abcd      1     2     3     9                4
4      B   ijkl      10                                 1

So far I have only managed to get as far as df2, which looks like this:

   group number     data               Length
1      A   abcd  c(1,2,3)                   3
2      A   efgh  c(4,5)                     2
3      B   abcd  c(1,2,3,9)                 4
4      B   ijkl  10                         1

My code is here:

library(tidyverse)

df <- data.frame (group = c(rep('A',5),rep("B",5)),
                  number = c(rep('abcd',3),rep('efgh',2),rep('abcd',4),rep('ijkl',1)),
                  id = c(1,2,3,4,5,1,2,3,9,10))

df2 <- df %>%
  group_by(group,number) %>%
  nest() %>%
  mutate(data=map(data,~unlist(.x, recursive = TRUE, use.names = FALSE)),
         Length= map(data, ~length(.x)))

Feel free to start from either df or df2. Solutions with or without extra packages are fine.

kkjoe asked Sep 14 '17 21:09


2 Answers

You can rename the count column to Length. I also prefer to fill the empty cells with NA; if you would rather have empty strings, use df2[is.na(df2)] <- ''.


Option 1

df <- data.frame (group = c(rep('A',5),rep("B",5)),
                  number = c(rep('abcd',3),rep('efgh',2),rep('abcd',4),rep('ijkl',1)),
                  id = c(1,2,3,4,5,1,2,3,9,10))

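library(dplyr)
library(splitstackshape)

# One row per group/number: ids collapsed into a comma-separated string, plus a count
df2 <- df %>%
    group_by(group, number) %>%
    summarise(data = toString(id), count = n()) %>%
    ungroup()

# Split the "data" column on commas into data_1, data_2, ...
cSplit(df2, "data", sep = ",", drop = TRUE)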


   group number count data_1 data_2 data_3 data_4
1:     A   abcd     3      1      2      3     NA
2:     A   efgh     2      4      5     NA     NA
3:     B   abcd     4      1      2      3      9
4:     B   ijkl     1     10     NA     NA     NA

Option 2

library(dplyr)
library(tidyr)

df2 <- df %>%
    group_by(group, number) %>%
    summarise(data = toString(id), count = n()) %>%
    separate_rows(data) %>%
    # regroup by both keys so the column index restarts for every group/number pair
    group_by(group, number) %>%
    mutate(Col = paste0("data", row_number())) %>%
    ungroup() %>%
    spread(Col, data)
df2
# A tibble: 4 x 7
   group number count data1 data2 data3 data4
* <fctr> <fctr> <int> <chr> <chr> <chr> <chr>
1      A   abcd     3     1     2     3  <NA>
2      A   efgh     2     4     5  <NA>  <NA>
3      B   abcd     4     1     2     3     9
4      B   ijkl     1    10  <NA>  <NA>  <NA>
BENY answered Sep 30 '22 14:09


I could not test this, but it should work, or at least be close:

library(tidyverse)
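df %>%
    group_by(group, number) %>%
    # key = target column name (data1, data2, ...), length = ids per group
    mutate(key = paste0("data", row_number()), length = n()) %>%
    ungroup() %>%
    # the third argument of spread() after the data is `fill`; named here for clarity
    spread(key, id, fill = "")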
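
For what it's worth, in current tidyr releases spread() is superseded by pivot_wider(). Here is a minimal sketch of the same idea, assuming tidyr 1.0 or later and the df from the question. The names wide and key are only illustrative. Unlike the snippet above, it leaves missing cells as NA rather than filling them with empty strings.

```r
library(dplyr)
library(tidyr)

# Same data as in the question
df <- data.frame(group  = c(rep("A", 5), rep("B", 5)),
                 number = c(rep("abcd", 3), rep("efgh", 2), rep("abcd", 4), "ijkl"),
                 id     = c(1, 2, 3, 4, 5, 1, 2, 3, 9, 10))

wide <- df %>%
  group_by(group, number) %>%
  mutate(key = paste0("data", row_number()),  # data1, data2, ... within each group
         Length = n()) %>%                    # number of ids in the group
  ungroup() %>%
  pivot_wider(names_from = key, values_from = id)  # cells with no id stay NA

wide
```

The id columns stay numeric this way, which is usually easier to work with than strings padded with ''.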

To make it work from your nested data, I think you would have to turn each of these vectors into a one-row data.frame with the same number of columns and the same column names, and then use unnest. That is much more complicated! :)
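
If you do want to start from the nested table, one possibly simpler route than building one-row data frames is to unnest back to long form and then widen again. This is a rough sketch, assuming dplyr 1.0+ and tidyr 1.0+. Note that df2 is rebuilt here with a plain list column via summarise() rather than nest(), which is an assumption about what the nested data looks like; result and key are illustrative names.

```r
library(dplyr)
library(tidyr)

df <- data.frame(group  = c(rep("A", 5), rep("B", 5)),
                 number = c(rep("abcd", 3), rep("efgh", 2), rep("abcd", 4), "ijkl"),
                 id     = c(1, 2, 3, 4, 5, 1, 2, 3, 9, 10))

# Assumed stand-in for the nested df2: one row per group/number,
# with the ids stored as a list column of numeric vectors
df2 <- df %>%
  group_by(group, number) %>%
  summarise(data = list(id), .groups = "drop")

result <- df2 %>%
  mutate(Length = lengths(data)) %>%           # vector length per row
  unnest(cols = data) %>%                      # back to one row per id
  group_by(group, number) %>%
  mutate(key = paste0("data", row_number())) %>%
  ungroup() %>%
  pivot_wider(names_from = key, values_from = data)

result
```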

Moody_Mudskipper answered Sep 30 '22 15:09