delete duplicated column dplyr

Question

This morning, while doing some analysis with a data frame, I got an error caused by duplicated column names. I tried to find a solution using only dplyr, but I could not find anything that works. Here is an example to illustrate the problem: a data frame with a duplicated column name.

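library(dplyr)

# Two rows, three columns; the first two columns share the name "a"
x <- data.frame(matrix(c(1, 2, 3,
                         2, 2, 1), nrow = 2, ncol = 3, byrow = TRUE))
colnames(x) <- c("a", "a", "b")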

When I try to drop the first column using the select command, I get an error:

x %>%
  select(-1) %>%
  filter(b > 1)

Error: found duplicated column name: a

I can get rid of the column easily using traditional indexing and then use dplyr to filter by value:

x <- x[, -1] %>% filter(b > 1)

This produces the desired output.

Any ideas on how to perform this using only dplyr grammar?

Chrisss · Accepted Answer

This could work, taking advantage of make.names behaviour. Don't know if I've cheated here, but it seems mostly to take advantage of dplyr functions.

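# make.names(unique = TRUE) renames the columns to "a", "a.1", "b";
# the regex then drops every column whose name ends in ".<digits>"
x %>%
    setNames(make.names(names(.), unique = TRUE)) %>%
    select(-matches("\\.[0-9]+$"))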
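
For reference, here is a minimal end-to-end sketch of this approach, assuming the example data from the question is laid out row-wise (that layout is an assumption) and that dplyr is installed. The names `deduped` and `dropped_first` are only illustrative. Note that this keeps the first `a` column and drops the later duplicate; the second pipeline shows one way to drop the first column instead, as the question originally intended.

```r
library(dplyr)

# Example data: the first two columns share the name "a"
# (row-wise layout is an assumption for illustration).
x <- data.frame(matrix(c(1, 2, 3,
                         2, 2, 1), nrow = 2, ncol = 3, byrow = TRUE))
colnames(x) <- c("a", "a", "b")

# make.names(unique = TRUE) turns the names into "a", "a.1", "b".
# Dropping every name that ends in ".<digits>" removes the later duplicates.
deduped <- x %>%
  setNames(make.names(names(.), unique = TRUE)) %>%
  select(-matches("\\.[0-9]+$")) %>%
  filter(b > 1)

# Variant: after renaming, the names are unique, so the first column
# can be dropped by name and the second one is kept as "a.1".
dropped_first <- x %>%
  setNames(make.names(names(.), unique = TRUE)) %>%
  select(-a) %>%
  filter(b > 1)
```

Two caveats: the pattern would also drop a genuine column whose original name already ends in a dot followed by digits (for example `v.1`), and `make.names` rewrites any syntactically invalid names as well (a space becomes a dot), so it is worth checking `names(x)` before relying on this.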

Tags:

r

dplyr

asado23