
Error with tidyr::gather() when I have unique names

Tags:

r

tidyr

I have an issue with the gather() function from the tidyr package.

sample
# A tibble: 5 × 6
  market_share      Y2012      Y2013      Y2014      Y2015      Y2016
         <chr>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl>
1          KAB 0.23469425 0.23513725 0.23187590 0.22940831 0.22662625
2          BGD 0.21353096 0.21352769 0.20910574 0.20035900 0.19374223
3           NN 0.16891699 0.16204919 0.16272993 0.16388675 0.16154017
4           OG 0.07648682 0.07597078 0.07945966 0.07780233 0.08069057
5           Ha 0.05092648 0.05480555 0.06434457 0.07127716 0.08054208

If I try:

sample2 <- gather(sample, market_share, period, Y2012:Y2016)
Error: Each variable must have a unique name.
Problem variables: 'market_share'

However, each variable appears to have a unique name.

Ha  KAB  BGD  NN OG 
   1    1    1    1    1 

It appears to be a common issue people have with gather, but I don't get it.

Prometheus asked Apr 20 '17 15:04


2 Answers

The second and third arguments are the names of the key and value columns to be created in the output. In your call the key is named market_share, which already exists as a column. Having two columns with the same name is odd and doesn't work well with other tidyr or dplyr functions, so I suggest giving the new columns different names. You can try:

sample2 <- gather(sample, period, value, Y2012:Y2016)
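As a minimal sketch of the point above, assuming only the first two rows and two year columns of the question's data are rebuilt by hand (the names `shares` and `long` are illustrative, not from the question): the key and value arguments name the two new columns, and `market_share` stays in place as an identifier column.

```r
library(tidyr)

# Rebuild a small slice of the question's data (values copied from the post).
shares <- data.frame(
  market_share = c("KAB", "BGD"),
  Y2012 = c(0.23469425, 0.21353096),
  Y2013 = c(0.23513725, 0.21352769),
  stringsAsFactors = FALSE
)

# key = period receives the old column names (Y2012, Y2013);
# value = value receives the numbers; market_share is left untouched.
long <- gather(shares, period, value, Y2012:Y2013)

names(long)  # "market_share" "period" "value"
nrow(long)   # 4 (2 rows x 2 gathered columns)
```

Neither new name matches an existing column, so no name clash can arise.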
mt1022 answered Nov 02 '22 12:11


The error message tells you that you are trying to create a new column market_share, but it already exists. You need to put period in the second spot because that's the column you are trying to create.

df1 <- read.table(text = "market_share      Y2012      Y2013      Y2014      Y2015      Y2016
KAB 0.23469425 0.23513725 0.23187590 0.22940831 0.22662625
BGD 0.21353096 0.21352769 0.20910574 0.20035900 0.19374223
NN 0.16891699 0.16204919 0.16272993 0.16388675 0.16154017
OG 0.07648682 0.07597078 0.07945966 0.07780233 0.08069057
Ha 0.05092648 0.05480555 0.06434457 0.07127716 0.08054208", header = TRUE, stringsAsFactors = FALSE)

library(tidyr)
# period is the new key column. The value column reuses the name market_share,
# so the result below has two columns with that name; a distinct value name avoids this.
gather(df1, period, market_share)

   market_share period market_share
1           KAB  Y2012   0.23469425
2           BGD  Y2012   0.21353096
3            NN  Y2012   0.16891699
4            OG  Y2012   0.07648682
5            Ha  Y2012   0.05092648
6           KAB  Y2013   0.23513725
7           BGD  Y2013   0.21352769
8            NN  Y2013   0.16204919
9            OG  Y2013   0.07597078
10           Ha  Y2013   0.05480555
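The diagnosis above can also be checked without tidyr. This is a rough sketch, assuming a two-row, two-year slice of `df1` and a hypothetical helper `name_clashes()` that is not part of any package: a new column name clashes when it matches a column that is left in place rather than gathered.

```r
# Small slice of df1 (values copied from the post).
df1 <- data.frame(
  market_share = c("KAB", "BGD"),
  Y2012 = c(0.23469425, 0.21353096),
  Y2013 = c(0.23513725, 0.21352769),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE for each new column name that matches a column
# that is kept (i.e. not among the gathered columns).
name_clashes <- function(data, new_names, gathered) {
  kept <- setdiff(names(data), gathered)
  new_names %in% kept
}

gathered <- c("Y2012", "Y2013")

# Names used in the question: key = market_share, value = period
name_clashes(df1, c("market_share", "period"), gathered)  # TRUE FALSE

# Distinct names: key = period, value = value
name_clashes(df1, c("period", "value"), gathered)         # FALSE FALSE
```

The check in the question tabulates the values stored in market_share, whereas the error concerns column names, which is why it did not reveal the problem.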
Pierre Lapointe answered Nov 02 '22 14:11