Combine two data frames and remove duplicate columns

Tags: r, duplicate-data

I want to cbind two data frames and remove duplicated columns. For example:

df1 <- data.frame(var1=c('a','b','c'), var2=c(1,2,3))
df2 <- data.frame(var1=c('a','b','c'), var3=c(2,4,6))

cbind(df1,df2) #this creates a data frame in which column var1 is duplicated

I want to create a data frame with columns var1, var2 and var3, in which column var1 is not repeated.

772

asked Sep 16 '11 06:09

danilinares

2 Answers

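merge() will do this. By default it joins the two data frames on the columns they have in common (here var1), so that column appears only once in the result.

Try: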

merge(df1, df2)
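
As a rough sketch using the question's data: merge() matches rows by the shared key rather than by position, and it sorts the result by that key, so it is not a drop-in replacement for cbind() when the rows are simply aligned. The second half below is an alternative (not part of the original answer) that keeps cbind() semantics and drops the columns of df2 whose names already occur in df1; it assumes the rows of both data frames are already in the same order. The names merged and combined are only illustrative.

```r
df1 <- data.frame(var1 = c('a', 'b', 'c'), var2 = c(1, 2, 3))
df2 <- data.frame(var1 = c('a', 'b', 'c'), var3 = c(2, 4, 6))

# merge() joins on the column names common to both data frames (here: var1)
merged <- merge(df1, df2)
names(merged)

# Alternative: bind by position, but skip columns of df2 whose names
# already exist in df1 (assumes the rows are already aligned)
keep <- !(names(df2) %in% names(df1))
combined <- cbind(df1, df2[keep])
names(combined)
```

If the shared columns hold different values in the two data frames, the cbind() variant silently keeps the df1 version, so it is worth checking that they agree first.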

164

answered Sep 29 '22 15:09

kohske

If you inherit someone else's dataset that already contains duplicated column names and you need to deal with them, this loop finds them (testframe is the data frame in question):

for (name in unique(names(testframe))) {
  if (length(which(names(testframe)==name)) > 1) {
    ## Deal with duplicates here. In this example
    ## just print name and column #s of duplicates:
    print(name)
    print(which(names(testframe)==name))
  }
}
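
If the goal is simply to drop the repeated columns rather than inspect them, one possible approach is duplicated() on the column names. This is a minimal sketch, not part of the original answer: testframe here is a hypothetical frame built from the question's data, and the comparison is by name only, so columns that share a name but hold different values would also be dropped.

```r
# Hypothetical example frame with a repeated column name (var1);
# cbind() on data frames keeps duplicated names as they are
testframe <- cbind(
  data.frame(var1 = c('a', 'b', 'c'), var2 = c(1, 2, 3)),
  data.frame(var1 = c('a', 'b', 'c'), var3 = c(2, 4, 6))
)

# duplicated() flags the second and later occurrences of each name
dup <- duplicated(names(testframe))

# Keep only the first occurrence of each column name
deduped <- testframe[, !dup, drop = FALSE]
names(deduped)
```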

answered Sep 29 '22 15:09

Sam
