For each row of my data frame, I am trying to select every duplicated value equal to 4 (that is, every 4 after the first one in the row) and set it to NA.
My dataframe is like this:
dat <- read.table(text = "
1 1 1 2 2 4 4 4
1 2 1 1 4 4 4 4",
header=FALSE)
What I need to obtain is:
1 1 1 2 2 4 NA NA
1 2 1 1 4 NA NA NA
I have found information on how to eliminate duplicated rows or columns, but I really do not know how to proceed here. Many thanks for any help.
Sometimes you will want to avoid apply() because it coerces the data frame to a matrix, which destroys the multi-class feature of data frame objects (every column ends up with the same type). This is a by() approach:
> do.call(rbind, by(dat, rownames(dat),
function(line) {line[ duplicated(unlist(line)) & line==4 ] <- NA; line} ) )
V1 V2 V3 V4 V5 V6 V7 V8
1 1 1 1 2 2 4 NA NA
2 1 2 1 1 4 NA NA NA
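To see what is lost with apply(), here is a minimal sketch on a made-up data frame (the names df, id, x and y are only for illustration and are not from the question). apply() goes through as.matrix(), so as soon as one column is character, every value comes back as character:

```r
# Hypothetical data frame with mixed column types (not the asker's data)
df <- data.frame(id = c("a", "b"), x = c(4, 4), y = c(4, 1),
                 stringsAsFactors = FALSE)

# apply() converts df to a matrix first, so all values become character
m <- apply(df, 1, function(r) r)
is.character(m)        # TRUE: the numeric columns were coerced

# The columns of the data frame itself keep their own classes
sapply(df, class)
```

With an all-numeric data frame like the one in the question this coercion is harmless, which is why the apply() route is still reasonable there.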
which() and apply() are helpful here.
> dat <- t(apply(dat, 1, function(X) {X[which(X==4)][-1] <- NA ; X}))
> dat
[1,] 1 1 1 2 2 4 NA NA
[2,] 1 2 1 1 4 NA NA NA
But there's probably a way around having to use the transpose function t() here. Can anyone help me out?
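One possible way around both apply() and t(), offered as a sketch rather than a definitive answer: build a logical matrix marking the 4s, find the first 4 in each row with max.col(), and blank out the rest in a single assignment. The names is4, first4 and extra4 are mine, and the sketch assumes all columns are numeric with no NA values to begin with.

```r
dat <- read.table(text = "
1 1 1 2 2 4 4 4
1 2 1 1 4 4 4 4",
header = FALSE)

# Logical matrix: TRUE wherever a cell equals 4
is4 <- as.matrix(dat) == 4

# Column index of the first 4 in each row
# (rows without any 4 get index 1, but is4 is FALSE there, so nothing changes)
first4 <- max.col(is4, ties.method = "first")

# A cell is an "extra" 4 if it is a 4 and is not the first 4 of its row;
# first4 has one entry per row, so it recycles correctly down the columns
extra4 <- is4 & col(is4) != first4

# Assign NA through the logical matrix; dat stays a data frame
dat[extra4] <- NA
dat
```

Because the assignment goes through a logical matrix, no row-wise function is called and nothing needs transposing.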