dplyr mutate: solve unique names error

Tags: r, dplyr

I have a data frame with 10 columns. For this example, here is a dummy version:

df = tbl_df(replicate(10, sample(0:1, 1000, replace = TRUE)))

I want to do this in dplyr:

df %>% mutate(V2 = ifelse(is.na(V6), V2, paste(V2,V3,sep=" ")))

I obtain:

Error: Each variable must have a unique name.

But if I do:

df$V2 = ifelse(is.na(df$V6), df$V2, paste(df$V2,df$V3,sep=" "))

it works.

How can I do the last step with dplyr statements?

pachadotdev asked Jan 04 '23 01:01


1 Answer

As @Lamia said, the problem most likely lies with duplicate column names.
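Before changing anything, it may help to confirm that duplicate names really are the cause. A minimal sketch in base R, assuming a hypothetical data frame dup_df (not from the question) that has a repeated column name:

```r
# Hypothetical data frame with a repeated column name;
# check.names = FALSE stops data.frame() from repairing it automatically
dup_df <- data.frame(V1 = 1:3, V2 = 4:6, V1 = 7:9, check.names = FALSE)

# anyDuplicated() returns 0 when all names are unique,
# otherwise the index of the first duplicated name
has_dups <- anyDuplicated(names(dup_df)) > 0

# duplicated() flags every occurrence after the first,
# so this lists each offending name once
dup_names <- unique(names(dup_df)[duplicated(names(dup_df))])

print(has_dups)
print(dup_names)
```

If has_dups is FALSE for your own data, the error probably has a different cause and the fix below would not apply.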

Create a sample data frame with duplicate column names (something you should never do in practice):

wrong_df <- data.frame(
  V1 = 1:3,
  V2 = 1:3,
  V3 = 1:3,
  V6 = c(4, NA, 6),
  V1 = 7:9,
  check.names = FALSE
)
wrong_df
#   V1 V2 V3 V6 V1
# 1  1  1  1  4  7
# 2  2  2  2 NA  8
# 3  3  3  3  6  9

Reproduce the issue:

library(dplyr)
wrong_df %>% 
  mutate(V2 = ifelse(is.na(V6), V2, paste(V2, V3, sep = " ")))
# Error: Each variable must have a unique name.
# Problem variables: 'V1'

Solve it by deduplicating column names with make.names(). Note that the second V1 column has been renamed V1.1 (see help("make.names")):

wrong_df %>% 
  setNames(make.names(names(.), unique = TRUE)) %>% 
  mutate(V2 = ifelse(is.na(V6), V2, paste(V2, V3, sep = " ")))
#   V1  V2 V3 V6 V1.1
# 1  1 1 1  1  4    7
# 2  2   2  2 NA    8
# 3  3 3 3  3  6    9
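
To see what the renaming step does on its own, here is a small sketch that applies make.names() to the name vector of wrong_df, next to make.unique(), which only deduplicates. The data frame demo_df is a hypothetical stand-in used to show the same repair done in place with base R:

```r
# Column names copied from wrong_df in the example above
old_names <- c("V1", "V2", "V3", "V6", "V1")

# make.names() makes names syntactically valid and,
# with unique = TRUE, also deduplicates them
fixed_names <- make.names(old_names, unique = TRUE)

# make.unique() only deduplicates; it does not alter
# names that are otherwise syntactically invalid
unique_only <- make.unique(old_names)

print(fixed_names)   # "V1" "V2" "V3" "V6" "V1.1"
print(unique_only)   # "V1" "V2" "V3" "V6" "V1.1"

# Base R equivalent of the setNames() step, applied in place
# (demo_df is a hypothetical example, not from the question)
demo_df <- data.frame(V1 = 1:3, V1 = 7:9, check.names = FALSE)
names(demo_df) <- make.unique(names(demo_df))
print(names(demo_df))
```

Both functions give the same result here because the original names are already valid; they differ only when a name contains characters such as spaces, which make.names() replaces and make.unique() leaves alone.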

Aurèle answered Jan 07 '23 16:01