Subset data frame based on vector sequence of minimum 5 consecutive values

I have a vector that looks like this:

out1[1:200]
  [1] NA NA NA NA  0  1  2 NA NA NA  1 NA  0 NA  0  1 NA NA  0 NA  0  1  2  2  2 NA  0  1  2  3  4  4  5  6  7  8  9  9  9  9
 [41] 10 11 NA  0  0 NA  1 NA  0  1 NA  0 NA  0  1  2 NA  1 NA  0  0  0  1  2 NA NA NA  0  0 NA  0  0  0  1  2 NA  1  2 NA  0
 [81]  1  2  3  4  5  6  7  8 NA  0  1  2  3  4 NA  0  1  2  2  3  4  5 NA  0  1  2  3  3  4  5  5  6  7 NA  1  2 NA  1  2 NA
[121]  0  1  2 NA  1  2  3  3  3  3  4 NA  0  0  0  1  2  3  4  5 NA NA  0  1 NA NA NA  1  2  2  3 NA  1  2  2  2 NA NA  0  1
[161] NA  1 NA  1  2 NA  0  0 NA NA  0  1 NA NA NA NA  1  2  3 NA NA  1  2  3  4  5  6 NA  1  2  3  4  5  6  6  7  8 NA  0  1

I now want to subset a data frame (with the same number of rows) by this vector, keeping only those runs whose values span at least 5 consecutive numbers, e.g. 0:4 or 1:5 (and of course anything longer than this). NAs should therefore be FALSE as well.

E.g.

out1: NA NA 0 1 2 2 NA 0 0 1 2 3 3 4 NA

Then the result should be

out2: F F F F F F F T T T T T T T F

asked Apr 16 '15 08:04

Pat

1 Answer

The following gives the desired result:

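library(data.table) # v >= 1.9.5 (devel version - install from GitHub)
# x is the example vector defined under "data" below
data.table(x)[, id := rleid(!is.na(x))      # label consecutive runs of NA / non-NA values
  ][, aa := (.N >= 5), by = id              # helper flag: run holds at least 5 elements
  ][, aaa := 4 %in% cumsum(diff(unique(sort(x)))), by = .(id, aa)  # TRUE if the run contains min + 4
  ]$aaa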

## [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
## [15]  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
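
To see why the last step works, it may help to apply the same expression to single runs. The sketch below is only an illustration: `spans_five`, `long_run`, `short_run` and `na_run` are hypothetical names that do not appear in the answer. For a run whose distinct values rise in steps of 1, `cumsum(diff(unique(sort(run))))` gives the offsets from the run's minimum, so `4 %in% ...` is TRUE exactly when the run reaches its minimum plus 4. `sort()` drops NAs, so an all-NA run yields FALSE.

```r
# Hypothetical helper (not part of the answer): TRUE when a run contains min + 4.
spans_five <- function(run) 4 %in% cumsum(diff(unique(sort(run))))

long_run  <- c(0, 1, 2, 3, 4, 4, 5)   # distinct values 0..5
short_run <- c(1, 2, 3, 3, 3, 3, 4)   # distinct values 1..4
na_run    <- c(NA_real_, NA_real_)    # sort() removes the NAs

offsets <- cumsum(diff(unique(sort(long_run))))  # 1 2 3 4 5

spans_five(long_run)   # TRUE
spans_five(short_run)  # FALSE
spans_five(na_run)     # FALSE
```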

data

x <- c(NA, NA, NA, NA, NA, 0, 1, 2, NA, 0, 1, 2, 3, 4, 4, 5, NA, 1, 2, 3, 3, 3, 3, 4, NA)
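
For readers without data.table, roughly the same result can be sketched in base R with `rle()` and `ave()`, followed by the actual subsetting step the question asks about. This is not the answer's method but a swapped-in variant: it tests `max - min >= 4` per run, which agrees with the answer only under the assumption that values within a run rise in steps of 0 or 1, as they do in the question's data. The names `id`, `keep` and `df` are hypothetical.

```r
x <- c(NA, NA, NA, NA, NA, 0, 1, 2, NA, 0, 1, 2, 3, 4, 4, 5, NA,
       1, 2, 3, 3, 3, 3, 4, NA)

# Label each run of NA / non-NA values (a base R stand-in for rleid()).
r  <- rle(!is.na(x))
id <- rep(seq_along(r$lengths), r$lengths)

# Per run: TRUE when the values span at least 5 consecutive numbers.
# Assumes steps of 0 or 1 within a run; all-NA runs give FALSE.
keep <- as.logical(ave(x, id, FUN = function(v) {
  !anyNA(v) && diff(range(v)) >= 4
}))

# Hypothetical data frame with one row per element of x.
df <- data.frame(row = seq_along(x), x = x)
df_sub <- df[keep, ]
```

Here `keep` is TRUE only for positions 10 to 16, so `df_sub` holds those seven rows.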

answered Sep 22 '22 00:09

Khashaa

Tags: dataframe, r, sequence, subset
