
Reshaping data using tidyr

Tags: r, dplyr, tidyr

I am working with a data frame, dat, which is similar in structure to the one below.

  Gender         Age Number
1 Female 55-59 years      5
2 Female   65+ years     10
3   Male 25-29 years      4
4   Male 40-44 years      3
5   Male 50-54 years      1

I am attempting to reshape the data (unsuccessfully thus far) using tidyr so that each row is repeated as many times as the value in its Number column, with the Number column itself dropped. The output I am seeking should resemble the following:

   Gender         Age
1  Female 55-59 years
2  Female 55-59 years
3  Female 55-59 years
4  Female 55-59 years
5  Female 55-59 years
6  Female   65+ years
7  Female   65+ years
8  Female   65+ years
9  Female   65+ years
10 Female   65+ years
11 Female   65+ years
12 Female   65+ years
13 Female   65+ years
14 Female   65+ years
15 Female   65+ years
16   Male 25-29 years
17   Male 25-29 years
18   Male 25-29 years
19   Male 25-29 years
20   Male 40-44 years
21   Male 40-44 years
22   Male 40-44 years
23   Male 50-54 years

I have tried to use various combinations of the gather/spread functions without coming even remotely close to success. I'm fairly sure this is possible in tidyr!

I know there are a number of other packages/functions that I could use to achieve the same result, but I'm quite keen to get a tidyr solution so I can include it in a larger dplyr/tidyr pipe.

Any help or assistance would be very much appreciated.

# Example data as dput() output
dat <- structure(list(Gender = structure(c(1L, 1L, 2L, 2L, 2L), .Label = c("Female", 
    "Male"), class = "factor"), Age = structure(c(5L, 
    1L, 2L, 3L, 4L), .Label = c("65+ years", "25-29 years", "40-44 years", 
    "50-54 years", "55-59 years"), class = "factor"), Number = c(5L, 
    10L, 4L, 3L, 1L)), .Names = c("Gender", "Age", "Number"), class = "data.frame", row.names = c(NA, 
    -5L))
vengefulsealion asked Apr 07 '26 12:04


1 Answer

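This doesn't use tidyr, but I think it's natural in a dplyr pipe. rep(row_number(), Number) repeats each row index Number times, and slice() then picks the rows at those positions: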

library(dplyr)
dat %>% slice(rep(row_number(), Number)) %>% select(-Number)

    Gender         Age
1   Female 55-59 years
2   Female 55-59 years
3   Female 55-59 years
4   Female 55-59 years
5   Female 55-59 years
6   Female   65+ years
7   Female   65+ years
8   Female   65+ years
9   Female   65+ years
10  Female   65+ years
11  Female   65+ years
12  Female   65+ years
13  Female   65+ years
14  Female   65+ years
15  Female   65+ years
16    Male 25-29 years
17    Male 25-29 years
18    Male 25-29 years
19    Male 25-29 years
20    Male 40-44 years
21    Male 40-44 years
22    Male 40-44 years
23    Male 50-54 years
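To see what the slice() call is doing, here is a rough base R sketch of the same idea. The data frame is re-typed by hand from the table in the question (with plain character columns instead of factors, which is an assumption on my part), and idx and expanded are just illustrative names:

```r
# Example data re-created by hand; assumed to match the question's table.
dat <- data.frame(
  Gender = c("Female", "Female", "Male", "Male", "Male"),
  Age    = c("55-59 years", "65+ years", "25-29 years",
             "40-44 years", "50-54 years"),
  Number = c(5L, 10L, 4L, 3L, 1L)
)

# rep() with a vector of counts repeats each element that many times,
# so row 1 appears 5 times, row 2 appears 10 times, and so on.
idx <- rep(seq_len(nrow(dat)), dat$Number)

# Index the rows by the repeated positions and keep only Gender and Age.
expanded <- dat[idx, c("Gender", "Age")]
rownames(expanded) <- NULL  # reset row names to 1..n
```

The dplyr version does the same thing, with row_number() standing in for seq_len(nrow(dat)).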

As @bramtayl suggested, one can (arguably) improve readability with

dat %>% slice(row_number() %>% rep(Number)) %>% select(-Number)
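If a tidyr verb is what you are after, more recent tidyr releases include uncount(), which duplicates rows according to a weights column and drops that column by default. A minimal sketch, assuming a tidyr version that ships uncount() and using hand-typed example data (expanded is just an illustrative name):

```r
library(dplyr)
library(tidyr)

# Example data re-created by hand; assumed to match the question's table.
dat <- data.frame(
  Gender = c("Female", "Female", "Male", "Male", "Male"),
  Age    = c("55-59 years", "65+ years", "25-29 years",
             "40-44 years", "50-54 years"),
  Number = c(5L, 10L, 4L, 3L, 1L)
)

# uncount() repeats each row Number times; the Number column is
# removed afterwards because .remove defaults to TRUE.
expanded <- dat %>% uncount(Number)
```

This drops straight into a longer dplyr/tidyr pipe, which seems to be what the question is asking for.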
Frank answered Apr 09 '26 01:04


