Overlapping intervals in a dataframe in R

I am working with genomic data in R. I have seen several questions with good answers about finding overlapping intervals between two dataframes. My problem is different: I have a single dataframe containing overlapping intervals, which I would like to merge. For example:

chrom   start   stop
 5       100     105
 5       100     105
 5       200     300
 9       275     300
 9       280     301

I would like to end up with something like this:

chrom   start   stop
 5       100     105
 5       200     300
 9       275     301

I am also trying to become a better coder, so I would like to know the most elegant way to do this. I hope this does not duplicate an existing question.

asked Oct 28 '15 16:10

Max_IT

1 Answer

Using GenomicRanges::reduce:

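library(GenomicRanges)
# Example data from the question
df <- data.frame(chrom = c("5", "5", "5", "9", "9"),
                 start = c(100, 100, 200, 275, 280),
                 stop  = c(105, 105, 300, 300, 301))
# reduce() merges overlapping ranges within each chromosome
as.data.frame(reduce(GRanges(df$chrom, IRanges(df$start, df$stop))))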
#   seqnames start end width strand
# 1        5   100 105     6      *
# 2        5   200 300   101      *
# 3        9   275 301    27      *

The same result can be obtained with data.table::foverlaps or GenomicRanges::findOverlaps, but those approaches are less straightforward.
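If installing Bioconductor packages is not an option, the merge can also be sketched in base R. This is not part of the original answer: the helper name merge_intervals is hypothetical, and the sketch assumes a non-empty dataframe with the columns chrom, start and stop from the question. The idea is to sort by chromosome and start, then begin a new group whenever a start lies beyond every stop seen so far on that chromosome.

```r
# Hypothetical helper (an illustration, not from the original answer):
# merge overlapping intervals within each chromosome using base R only.
merge_intervals <- function(df) {
  # Sort by chromosome, then start, so overlapping rows become adjacent.
  df <- df[order(df$chrom, df$start, df$stop), ]
  pieces <- lapply(split(df, df$chrom), function(d) {
    # Largest stop among all earlier rows of this chromosome.
    prev_max <- c(-Inf, head(cummax(d$stop), -1))
    # A new group begins whenever a start lies beyond every earlier stop.
    grp <- cumsum(d$start > prev_max)
    data.frame(
      chrom = d$chrom[1],
      start = as.vector(tapply(d$start, grp, min)),
      stop  = as.vector(tapply(d$stop, grp, max))
    )
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

# Example data from the question.
df <- data.frame(
  chrom = c("5", "5", "5", "9", "9"),
  start = c(100, 100, 200, 275, 280),
  stop  = c(105, 105, 300, 300, 301)
)
merged <- merge_intervals(df)
print(merged)
```

For the example data this should give the three rows requested in the question. One difference to keep in mind: with its default settings, reduce also merges ranges that are directly adjacent (for example 100-105 and 106-110), whereas this sketch only merges intervals that share at least one position.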

answered Nov 03 '22 04:11

Arun

Tags: dataframe, r

Max_IT

People also ask

1 Answers

Arun

Recent Activity

Donate For Us