I have a data frame in R and I wonder whether it is possible to retrieve, for each column, the values that are not present in any of the other columns.
My data frame looks like this:
sample_1 sample_2 sample_3
a a a
c e c
d f e
g m j
m n n
x u w
t z z
I would like to get the following result:
sample_1 sample_2 sample_3
d f j
g u w
x
t
Thank you in advance for your answers.
To extract the unique values across multiple columns of an R data frame, we first need to combine the column values into a single vector, which requires reading the columns in matrix form. After that we can simply use the unique() function for the extraction.
The unique() function in R removes duplicate values from a vector, or duplicate rows from a data frame or matrix. It is useful in EDA (Exploratory Data Analysis) because it directly identifies and eliminates the duplicate values in the data.
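As a small illustration of that behaviour, here is a minimal sketch of unique() on a vector and on a data frame. The objects x, d and d_unique are made-up examples, not data from the question.

```r
# Hypothetical example data (not taken from the question)
x <- c("a", "c", "a", "m", "c")
unique(x)               # keeps the first occurrence of each value: "a" "c" "m"

d <- data.frame(id = c(1, 1, 2), val = c("p", "p", "q"))
d_unique <- unique(d)   # drops the repeated row (1, "p")
nrow(d_unique)          # 2
```

Note that unique() only removes duplicates within a single object. It does not compare one column against the others, which is what the question asks for.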
The function distinct() [dplyr package] can be used to keep only unique/distinct rows from a data frame. If there are duplicate rows, only the first row is preserved. It is a more efficient version of the base R function unique() for data frames.
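You can try:
# For each column i, keep the values not found in any other column
lst <- lapply(seq_along(df1), function(i)
    df1[, i][!df1[, i] %in% unique(unlist(df1[-i]))])
# Pad the shorter vectors with '' and restore the original column names
library(stringi)
setNames(as.data.frame(stri_list2matrix(lst, fill = '')), names(df1))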
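If you would rather avoid the stringi dependency, a base R sketch along the same lines might look like the following. It rebuilds the question's data as df1, assuming the columns are character vectors. It uses setdiff(), a swapped-in alternative to the %in% filter that also drops values repeated within a column. The names lst, n_max and res are only illustrative.

```r
# Data from the question, assumed to be character columns
df1 <- data.frame(
  sample_1 = c("a", "c", "d", "g", "m", "x", "t"),
  sample_2 = c("a", "e", "f", "m", "n", "u", "z"),
  sample_3 = c("a", "c", "e", "j", "n", "w", "z"),
  stringsAsFactors = FALSE
)

# For each column, keep the values that appear in no other column
lst <- lapply(seq_along(df1), function(i) {
  setdiff(df1[[i]], unlist(df1[-i]))
})
names(lst) <- names(df1)

# Pad the shorter vectors with "" so they all fit in one data frame
n_max <- max(lengths(lst))
res <- as.data.frame(
  lapply(lst, function(v) c(v, rep("", n_max - length(v)))),
  stringsAsFactors = FALSE
)
res
```

With the sample data this should give d, g, x, t for sample_1, f, u for sample_2 and j, w for sample_3, with the shorter columns padded by empty strings.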