Delete Redundant columns in R [duplicate]

Tags: merge, r

I have something similar to this:

date        pgm      in.x     logs       out.y
20130514    na       12       j1         12
20131204    z2       03       j1         03
20130516    a01      04       j0         04
20130628    z1       05       j2         05

I noticed that the in.x and out.y values are always the same, so I want to delete the out.y column. I have other columns like this, so I want to be able to detect any .y columns that match .x columns and delete them after I do the merge.


asked Jun 01 '16 09:06

Chayma Atallah


1 Answer

If we assume that all redundant columns should be removed,

no_duplicate <- data_set[!duplicated(as.list(data_set))]

will do the trick.

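as.list converts the data.frame to a list of its columns, and duplicated returns a logical vector with one entry per column, which is TRUE for every column whose values all match those of a previously seen column. Negating that vector and using it as a column index keeps only the first copy of each column.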

This does not directly try to compare .x and .y columns, but has the effect of retaining one copy of each duplicated column, which I assume is the main goal. On the other hand, it will also remove any .x columns that are duplicates of another .x column.
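As a minimal sketch of the above, here is the one-liner applied to a small data frame modelled on the table in the question. The values are made up for illustration, and is_dup is just a helper name introduced here to make the intermediate result visible.

```r
# Hypothetical data modelled on the question's table (values are illustrative)
data_set <- data.frame(
  date  = c(20130514, 20131204, 20130516, 20130628),
  pgm   = c("na", "z2", "a01", "z1"),
  in.x  = c(12, 3, 4, 5),
  logs  = c("j1", "j1", "j0", "j2"),
  out.y = c(12, 3, 4, 5),
  stringsAsFactors = FALSE
)

# One logical flag per column: TRUE if the column repeats an earlier column
is_dup <- duplicated(as.list(data_set))

# Keep only the first copy of each column
no_duplicate <- data_set[!is_dup]

names(no_duplicate)
```

Here out.y is flagged because it has the same values as in.x, which appears earlier, so in.x is the copy that is kept.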

If we want to retain all .x columns, even those that are duplicates, a good solution might be to do filtering before the merge. Assuming you have data_x and data_y that will be merged by column "identifier":

data_y_nonredundant <- data_y[!(as.list(data_y) %in% as.list(data_x) & names(data_y)!="identifier")]
data <- merge(data_x, data_y_nonredundant, by=c("identifier"))
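A rough illustration of the pre-merge filtering, using two made-up data frames. The column names value_in and value_out, the helper vector redundant and the result name merged are assumptions for this sketch only. Note that the columns are compared position by position before the merge, so this assumes data_x and data_y have their rows in the same order.

```r
# Hypothetical inputs: both tables share the key column "identifier"
data_x <- data.frame(
  identifier = 1:4,
  value_in   = c(12, 3, 4, 5)
)
data_y <- data.frame(
  identifier = 1:4,
  value_out  = c(12, 3, 4, 5),        # same values as data_x$value_in
  logs       = c("j1", "j1", "j0", "j2"),
  stringsAsFactors = FALSE
)

# TRUE for columns of data_y that also occur in data_x, except the merge key
redundant <- as.list(data_y) %in% as.list(data_x) &
  names(data_y) != "identifier"

# Drop the redundant columns, then merge as usual
data_y_nonredundant <- data_y[!redundant]
merged <- merge(data_x, data_y_nonredundant, by = "identifier")

names(merged)
```

Because the key column is explicitly excluded from the filter, it survives in data_y_nonredundant and the merge can still join on it.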


answered Nov 01 '22 15:11

Alex A.
