Suppose I have data that looks like the example below.
Across the entire data set, I have 3 A's, 2 B's, 2 C's, and one each of D, E, and F.
data <- read.table(textConnection("
col1 col2
A B
A C
B A
C D
E F
"), header = TRUE)
What I want to do is keep the order and contents the same, but make every value unique across the whole data set. For example, the three A's become A.1, A.2, and A.3.
col1 col2
A.1 B.2
A.2 C.2
B.1 A.3
C.1 D
E F
Is there any smart way I can do this?
I know I can use make.unique or make.names, but it looks like they only work on one column, not on the entire dataset.
Using:
dat[] <- make.unique(as.character(unlist(dat)))
gives:
> dat
  col1 col2
1    A  B.1
2  A.1  C.1
3    B  A.2
4    C    D
5    E    F
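Note that make.unique leaves the first occurrence of each value unsuffixed, so the result above is not quite the numbering asked for in the question. A minimal base R sketch that numbers every occurrence of a repeated value and leaves singletons alone could look like the following. The helper names vals, idx and n are only illustrative, and stringsAsFactors = FALSE is assumed so that the columns are character vectors:

```r
# Rebuild the example data with character (not factor) columns
data <- read.table(text = "
col1 col2
A B
A C
B A
C D
E F
", header = TRUE, stringsAsFactors = FALSE)

# Flatten column-wise: all of col1 first, then all of col2
vals <- unlist(data, use.names = FALSE)

# Running counter within each distinct value (1, 2, 3, ...)
idx <- ave(seq_along(vals), vals, FUN = seq_along)

# Total number of occurrences of each value
n <- ave(seq_along(vals), vals, FUN = length)

# Suffix only the values that occur more than once
data[] <- ifelse(n > 1, paste(vals, idx, sep = "."), vals)
data
```

Because the values are flattened column by column, the counter runs down col1 first and then continues in col2, which is the numbering shown in the question.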
The OP requires that the values in the data.frame be made unique across all columns. This is a strong indicator that the data would be better reshaped from wide to long format, where all data manipulations can be performed on one column instead of many.
library(data.table)
DT <- data.table(data)
molten <- melt(DT, measure.vars = names(DT))[
, value := paste(value, rowid(value), sep = ".")]
molten
    variable value
 1:     col1   A.1
 2:     col1   A.2
 3:     col1   B.1
 4:     col1   C.1
 5:     col1   E.1
 6:     col2   B.2
 7:     col2   C.2
 8:     col2   A.3
 9:     col2   D.1
10:     col2   F.1
The rowid() function is a convenience function for generating a unique row id within each group.
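To illustrate what rowid() returns, here is a small sketch on a toy vector (x and ids are made-up names for this example only). Each element gets its running position among the elements that share its value:

```r
library(data.table)

# Toy input: "A" appears three times, "B" twice
x <- c("A", "A", "B", "A", "B")

# Running count of each value, in order of appearance
ids <- rowid(x)
ids
# [1] 1 2 1 3 2
```

Pasting these ids onto the values is what produces A.1, A.2, A.3 and so on in the melted table.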
Further processing can continue in the long format. Finally, the data may be reshaped to wide format again:
molten[, rn := rowid(variable)][, dcast(.SD, rn ~ variable)][, rn := NULL][]
   col1 col2
1:  A.1  B.2
2:  A.2  C.2
3:  B.1  A.3
4:  C.1  D.1
5:  E.1  F.1
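The result above also appends .1 to values that occur only once (D.1, E.1, F.1), whereas the question leaves D, E and F untouched. If that matters, one possible variation is to count the occurrences first and only rename the repeated values. This is a sketch, not part of the original answer: the helper column n and the names long and wide are assumptions, and the data is rebuilt with character columns so the block runs on its own:

```r
library(data.table)

# Same example data, built directly as character columns
DT <- data.table(col1 = c("A", "A", "B", "C", "E"),
                 col2 = c("B", "C", "A", "D", "F"))

# Wide to long: one row per cell, col1 rows first, then col2
long <- melt(DT, measure.vars = names(DT))

# Number of occurrences of each value across the whole table
long[, n := .N, by = value]

# Append a running id only where the value is repeated
long[n > 1, value := paste(value, rowid(value), sep = ".")]

# Back to wide format, using the row position within each column as key
long[, rn := rowid(variable)]
wide <- dcast(long, rn ~ variable, value.var = "value")[, rn := NULL]
wide
```

The subset n > 1 contains every occurrence of each repeated value, so rowid() evaluated on that subset still numbers them 1, 2, 3 in their original order.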
Jaap's make.unique() approach can be used as well:
melt(DT, measure.vars = names(DT))[, value := make.unique(value)][]
    variable value
 1:     col1     A
 2:     col1   A.1
 3:     col1     B
 4:     col1     C
 5:     col1     E
 6:     col2   B.1
 7:     col2   C.1
 8:     col2   A.2
 9:     col2     D
10:     col2     F