Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can you use pivot_wider to create multiple groups of alternating new columns?

Tags:

r

pivot

My data currently looks like this, with the column "Number_Code based on each different Side_Effect:

Session_ID   Side_Effect     Number_Code
 1            anxious           1
 1            dizzy             2
 1            relaxed           3
 3            dizzy             2
 7            nauseous          4
 7            anxious           1

I know I can do:

mutate(rn = str_c('side_effect_', row_number())) %>% 
 pivot_wider(names_from = rn, values_from = Side_Effect)

In order to create new column names and put each side effect into a new column like this:

 session    Number_Code   side_effect1   side effect_2      side_effect_3    
      1     1                 anxious         NA                 NA
      1     2                 NA              dizzy              NA
      1     3                 NA              NA                 relaxed
      3     2                 dizzy           NA                 NA
      7     4                 nauseous        NA                 NA
      7     1                 NA              anxious            NA

But I need to widen the data based on both "Side_Effect" and "Number_Code", and have them in alternating columns like this:

 session     side_effect1   number_code1   side effect_2   number_code2   side_effect_3    number_code3
        1       anxious         1              dizzy             2            relaxed          3
        3       dizzy           2               NA               NA           NA              NA
        7       nauseous        4              anxious           1            NA              NA

I saw another post where they widened the data based on two variables, but all of the columns for the second one were after all of the columns of the first one. Is there a way to get them to alternate like this? Thank you!!

like image 370
alex Avatar asked Feb 10 '20 22:02

alex


People also ask

What does Pivot_wider do in R?

pivot_wider() "widens" data, increasing the number of columns and decreasing the number of rows.

What package is Pivot_wider in?

The pivot_wider() function from the tidyr package in R can be used to pivot a data frame from a long format to a wide format.

Can We group columns by simply inserting a pivot table?

We cannot group columns by simply inserting a Pivot Table. For instance, in this method, I will use the PivotTable and PivotChart Wizard to create the Pivot table first and then group it into columns. Follow the below steps to create the expected Pivot Table.

How to specify both columns in pivot_wider ()?

We can specify both columns to pivot_wider () ’s values_from argument. pivot_wider () pivots variable, then creates columns by combining the value of measure with each column name specified in values_from.

Why is my pivot table twice the size of my pivot?

That’s because both the pivot tables are sharing the same pivot cache. To understand better, when Excel creates a pivot table, it makes a copy of the entire source data, and creates a temporary pivot cache in the memory. This duplicated cache is now stored with the Excel file, doubling its size.

What are the limitations of pivot table grouping?

Pivot Table grouping is quite flexible. It allows you to group several different types of Fields. You can create many groups and you can group previously existing groups (create groups of groups). Despite its flexibility, Pivot Table grouping has some restrictions. Note the following 2 limitations: You can't add Calculated Items to grouped Fields.


2 Answers

The pivot_wider can take multiple value_from columns, so after creating the sequence by group, use pivot_wider with values_from specifying the columns of interest

library(dplyr)
library(tidyr)
df1 %>% 
   group_by(Session_ID) %>%
   mutate(rn = row_number()) %>% 
   ungroup %>% 
   pivot_wider(names_from = rn, values_from = c(Side_Effect, Number_Code))
# A tibble: 3 x 7
#  Session_ID Side_Effect_1 Side_Effect_2 Side_Effect_3 Number_Code_1 Number_Code_2 Number_Code_3
#       <int> <chr>         <chr>         <chr>                 <int>         <int>         <int>
#1          1 anxious       dizzy         relaxed                   1             2             3
#2          3 dizzy         <NA>          <NA>                      2            NA            NA
#3          7 nauseous      anxious       <NA>                      4             1            NA

If we need to reorder the column order, then we can select based on the numeric part and order

df1 %>% 
    group_by(Session_ID) %>%
    mutate(rn = row_number()) %>% 
    ungroup %>% 
    pivot_wider(names_from = rn, values_from = c(Side_Effect, Number_Code)) %>%
    select(Session_ID, names(.)[-1][order(readr::parse_number(names(.)[-1]))] )
# A tibble: 3 x 7
#  Session_ID Side_Effect_1 Number_Code_1 Side_Effect_2 Number_Code_2 Side_Effect_3 Number_Code_3
#       <int> <chr>                 <int> <chr>                 <int> <chr>                 <int>
#1          1 anxious                   1 dizzy                     2 relaxed                   3
#2          3 dizzy                     2 <NA>                     NA <NA>                     NA
#3          7 nauseous                  4 anxious                   1 <NA>                     NA

data

df1 <- structure(list(Session_ID = c(1L, 1L, 1L, 3L, 7L, 7L), 
  Side_Effect = c("anxious", 
"dizzy", "relaxed", "dizzy", "nauseous", "anxious"), Number_Code = c(1L, 
2L, 3L, 2L, 4L, 1L)), class = "data.frame", row.names = c(NA, 
-6L))
like image 111
akrun Avatar answered Oct 23 '22 19:10

akrun


I think this is best achieved via the pivot_*_spec() interface which allows the building of a specification data frame. This data frame determines both the names and the variable order of the pivoted data.

library(tidyr)
library(dplyr)

df <- df %>%
  group_by(Session_ID) %>%
  mutate(row_id = factor(row_number(), labels = c("first", "next", "last")[1:max(row_number())])) %>%
  ungroup()

spec <- df %>%
  build_wider_spec(names_from = row_id, values_from = c(Side_Effect, Number_Code))

spec

# A tibble: 6 x 3
  .name             .value      row_id
  <chr>             <chr>       <fct> 
1 Side_Effect_first Side_Effect first 
2 Side_Effect_next  Side_Effect next  
3 Side_Effect_last  Side_Effect last  
4 Number_Code_first Number_Code first 
5 Number_Code_next  Number_Code next  
6 Number_Code_last  Number_Code last  

Because the column order of the pivot is determined by the specification data row order, arrange() can be used to flexibly control the final order of the pivot (where factors can be used, as in the data above, to fine tune the order of text variable names). Some examples:

# Alternating by row id  
spec %>%
  arrange(row_id) %>%
  pivot_wider_spec(df, .)

# A tibble: 3 x 7
  Session_ID Side_Effect_first Number_Code_first Side_Effect_next Number_Code_next Side_Effect_last Number_Code_last
       <int> <chr>                         <int> <chr>                       <int> <chr>                       <int>
1          1 anxious                           1 dizzy                           2 relaxed                         3
2          3 dizzy                             2 NA                             NA NA                             NA
3          7 nauseous                          4 anxious                         1 NA                             NA

# Alternate by row_id and .value in ascending order
spec %>%
  arrange(row_id, .value) %>%
  pivot_wider_spec(df, .)

# A tibble: 3 x 7
  Session_ID Number_Code_first Side_Effect_first Number_Code_next Side_Effect_next Number_Code_last Side_Effect_last
       <int>             <int> <chr>                        <int> <chr>                       <int> <chr>           
1          1                 1 anxious                          2 dizzy                           3 relaxed         
2          3                 2 dizzy                           NA NA                             NA NA              
3          7                 4 nauseous                         1 anxious                        NA NA            

# .value ascending row_id descending
spec %>%
  arrange(.value, desc(row_id)) %>%
  pivot_wider_spec(df, .)
    
# A tibble: 3 x 7
  Session_ID Number_Code_last Number_Code_next Number_Code_first Side_Effect_last Side_Effect_next Side_Effect_first
       <int>            <int>            <int>             <int> <chr>            <chr>            <chr>            
1          1                3                2                 1 relaxed          dizzy            anxious          
2          3               NA               NA                 2 NA               NA               dizzy            
3          7               NA                1                 4 NA               anxious          nauseous       

    
like image 45
Ritchie Sacramento Avatar answered Oct 23 '22 19:10

Ritchie Sacramento