While reading a data set using fread, I've noticed that sometimes I'm getting duplicated column names, for example (fread doesn't have a check.names argument):
> data.table(x = 1, x = 2)
   x x
1: 1 2
The question is: is there any way to remove one of the two columns when they have the same name?
Use the unique() function to remove duplicates from the selected columns of the R data frame.
Remove duplicate rows in a data frame: the function distinct() [dplyr package] can be used to keep only unique/distinct rows from a data frame. If there are duplicate rows, only the first row is preserved. It's an efficient version of the R base function unique().
Duplicate column names are allowed, but you need to use check.names = FALSE for data.frame to generate such a data frame. However, not all operations on data frames will preserve duplicated column names: for example, matrix-like subsetting will force column names in the result to be unique.
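To make the check.names behaviour concrete, here is a minimal base R sketch. The variable names df_dup and df_fixed are only illustrative and are not part of the original question.

```r
# With check.names = FALSE, data.frame() keeps the duplicated name as given.
df_dup <- data.frame(x = 1, x = 2, check.names = FALSE)
names(df_dup)    # "x" "x"

# With the default check.names = TRUE, the names are made unique.
df_fixed <- data.frame(x = 1, x = 2)
names(df_fixed)  # "x" "x.1"
```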
.SDcols approaches would return a copy of the columns you're selecting. Instead, just remove the duplicated columns by reference using :=. Since duplicated() flags every occurrence after the first, the first column with each name is kept.
dt[, which(duplicated(names(dt))) := NULL]
# x
# 1: 1
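For a slightly fuller picture, the following is a sketch of the same idea on a table that also has a non-duplicated column. The extra column y, the variable dup_idx, and the length check are assumptions added for illustration; the check simply skips the assignment when there is nothing to remove.

```r
library(data.table)

# Example table with one duplicated name ("x") and one unique name ("y").
dt <- data.table(x = 1, x = 2, y = 3)

# Positions of every column whose name has already appeared further left.
dup_idx <- which(duplicated(names(dt)))

# Drop those columns by reference, only if there are any.
if (length(dup_idx) > 0L) {
  dt[, (dup_idx) := NULL]
}

names(dt)  # "x" "y"
```

Because := modifies dt in place, no copy of the remaining columns is made, which matters for large tables read with fread.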