How to remove duplicated column names in R?

Tags:

r

I have a very large matrix, and I know that some of its column names are duplicated. I want to find the duplicated column names and remove one of the columns from each set of duplicates. I tried duplicated(), but it removes the duplicate entries. Could someone help me implement this in R? The point is that columns with duplicate names might not have duplicate entries.

934

asked Jun 10 '14 13:06

user2806363

2 Answers

Let's say temp is your matrix

    temp <- matrix(seq_len(15), 5, 3)
    colnames(temp) <- c("A", "A", "B")

    ##      A  A  B
    ## [1,] 1  6 11
    ## [2,] 2  7 12
    ## [3,] 3  8 13
    ## [4,] 4  9 14
    ## [5,] 5 10 15

You could do

    temp <- temp[, !duplicated(colnames(temp))]

    ##      A  B
    ## [1,] 1 11
    ## [2,] 2 12
    ## [3,] 3 13
    ## [4,] 4 14
    ## [5,] 5 15

Or, if you want to keep the last duplicated column, you can do

    temp <- temp[, !duplicated(colnames(temp), fromLast = TRUE)]

    ##       A  B
    ## [1,]  6 11
    ## [2,]  7 12
    ## [3,]  8 13
    ## [4,]  9 14
    ## [5,] 10 15
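
The question also asks how to find the duplicated names, not only how to drop them. A minimal sketch of that step follows, rebuilding the same example matrix under the illustrative name m. The variable names dup_names, dup_positions and m_unique are assumptions made for this example. It also passes drop = FALSE, which keeps the result a matrix even when only one column survives.

```r
# Illustrative matrix with a repeated column name (same values as temp above)
m <- matrix(seq_len(15), nrow = 5, ncol = 3)
colnames(m) <- c("A", "A", "B")

# Names that occur more than once
dup_names <- unique(colnames(m)[duplicated(colnames(m))])

# Positions of every column carrying one of those names
dup_positions <- which(colnames(m) %in% dup_names)

# Keep the first column for each name; drop = FALSE prevents R from
# collapsing the result to a vector if a single column remains
m_unique <- m[, !duplicated(colnames(m)), drop = FALSE]
```

Inspecting dup_names and dup_positions before subsetting makes it easy to check which columns are about to be discarded.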

189

answered Sep 20 '22 21:09

David Arenburg

Or, assuming a data.frame, you could use subset:

    subset(iris, select = which(!duplicated(names(iris))))

Note that dplyr::select is not applicable here because it requires the column names of the input data to be unique already.
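
Since iris has no repeated column names, the effect is easier to see on a data frame that actually contains duplicates. The following is a minimal sketch on a made-up data frame. The name df and its values are illustrative assumptions, not part of the original answer. It uses check.names = FALSE so that data.frame() does not rename the repeated column.

```r
# Made-up data frame with a repeated column name; check.names = FALSE
# stops data.frame() from renaming the second "A" to "A.1"
df <- data.frame(A = 1:3, A = 4:6, B = 7:9, check.names = FALSE)

# Same idea as the subset() call: keep the first column for each name
dedup_subset <- subset(df, select = which(!duplicated(names(df))))

# Equivalent base R indexing by a logical vector over the columns
dedup_index <- df[!duplicated(names(df))]
```

Both calls should leave the first A column (values 1 to 3) and the B column.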

answered Sep 20 '22 21:09

Holger Brandl
