I want to compute a 4-day rolling average over a large data set. The problem is that some individuals have fewer than 4 observations, so I get an error saying that k <= n is not TRUE.
Is there a way to remove any individual that does not have enough data from the data set?
Here is an example of how the data would look:
     Name  variable.1
1     Kim   64.703950
2     Kim  926.339849
3     Kim  128.662977
4     Kim  290.888594
5     Kim  869.418523
6     Bob  594.973849
7     Bob  408.159544
8     Bob  609.140928
9  Joseph  496.779712
10 Joseph  444.028668
11 Joseph -213.375635
12 Joseph  -76.728981
13 Joseph  265.642784
14   Hank  -91.646728
15   Hank  170.209746
16   Hank   97.889889
17   Hank   12.069074
18   Hank  402.361731
19   Earl  721.941796
20   Earl    4.823148
21   Earl  696.299627
For example, we can use the subset() function to drop rows based on a condition. If we prefer the tidyverse, dplyr's filter() function does the same job as subset(): it keeps (or, with a negated condition, removes) rows based on the values in a column.
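A minimal sketch of the subset() approach in base R. The small data frame and the "greater than zero" condition are made up for illustration; only the column names come from the question.

```r
# Small sample in the shape of the question's data (values shortened)
df <- data.frame(
  Name = c("Kim", "Kim", "Bob", "Joseph", "Joseph"),
  variable.1 = c(64.70, 926.34, 594.97, -213.38, -76.73)
)

# Keep only the rows where variable.1 is positive
positive <- subset(df, variable.1 > 0)

# With dplyr loaded, the equivalent call would be:
# filter(df, variable.1 > 0)
```

Note that this filters row by row. It does not look at how many rows each Name has, so on its own it does not solve the "fewer than 4 cases" problem.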
The na.omit() function in R removes incomplete cases (those containing NA) from a data frame, matrix, or vector. Its parameter is the object to clean: a data frame, matrix, or vector.
To delete a row from an R data frame when any value in that row is greater than n, subset with single square brackets and the negation operator (!).
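One way this could look, assuming an all-numeric data frame. The data frame, the threshold n = 4, and the helper name too_big are hypothetical.

```r
# Hypothetical all-numeric data frame and threshold
df <- data.frame(x = c(1, 5, 2), y = c(3, 1, 9))
n <- 4

# TRUE for every row in which at least one value exceeds n
too_big <- rowSums(df > n) > 0

# Negate the condition inside single square brackets to drop those rows
kept <- df[!too_big, ]
```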
To remove all rows containing NA, we can use the na.omit() function. For example, if a data frame called df contains some NA values, na.omit(df) removes every row that contains at least one NA.
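A short illustration of na.omit() on a made-up data frame with one missing value:

```r
# Hypothetical data frame with a missing value in the second row
df <- data.frame(
  Name = c("Kim", "Bob", "Earl"),
  variable.1 = c(64.7, NA, 721.9)
)

# Drop every row that contains at least one NA
clean <- na.omit(df)
```

This only helps when the missing data shows up as NA. In the question the short groups are complete, they just have too few rows, so na.omit() would leave them in place.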
If your data frame is df, you can remove all names that occur fewer than 4 times with dplyr:
library(dplyr)

df %>%
  group_by(Name) %>%   # one group per individual
  filter(n() >= 4)     # keep only groups with at least 4 rows
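If you would rather not depend on dplyr, the same idea can be sketched in base R with ave(), here on the data from the question. The rolling-mean step uses stats::filter() only as a stand-in for whatever rolling function you are actually calling, and the names kept and roll4 are made up for this example.

```r
# Data from the question
df <- data.frame(
  Name = rep(c("Kim", "Bob", "Joseph", "Hank", "Earl"), times = c(5, 3, 5, 5, 3)),
  variable.1 = c(64.703950, 926.339849, 128.662977, 290.888594, 869.418523,
                 594.973849, 408.159544, 609.140928,
                 496.779712, 444.028668, -213.375635, -76.728981, 265.642784,
                 -91.646728, 170.209746, 97.889889, 12.069074, 402.361731,
                 721.941796, 4.823148, 696.299627)
)

# Number of rows per individual, repeated on every row of that individual
counts <- ave(df$variable.1, df$Name, FUN = length)

# Keep only individuals with at least 4 rows (drops Bob and Earl)
kept <- df[counts >= 4, ]

# Trailing 4-row mean within each individual; the first 3 rows of each
# group are NA because a full window is not yet available
kept$roll4 <- ave(kept$variable.1, kept$Name, FUN = function(x) {
  as.numeric(stats::filter(x, rep(1 / 4, 4), sides = 1))
})
```

Because every remaining group has at least 4 rows, the window always fits inside the group.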