I have seen here and here how to return every nth row, but my problem is different. A separate column in the file specifies which nth row to return, and that value differs by group. Here is a sample of the dataset, where the Nth column gives the step to use. That is, every 3rd row for Id group a and every 4th row for Id group b. The data is quite sizable, with several Id groups.
Id TagNo Nth
a A-A-3 3
a A-A-1 3
a A-A-5 3
a A-A-2 3
a AX-45 3
a AX-33 3
b B-B-5 4
b B-B-4 4
b B-B-3 4
b BX-B2 4
Desired output:
Id TagNo Nth
a A-A-3 3
a A-A-2 3
b B-B-5 4
Thank you for your help.
Edit: Please note that I want to pick the first row of each group and then every nth row after it, that is, every 3rd for a and every 4th for b. For group a that is the 1st, 4th, 7th, ... rows; for group b it is the 1st, 5th, 9th, ... rows. The original desired output contained an error and has been corrected. My sincere apologies.
This awk should work:
awk '!a[$1]++{print; if(NR>1) n=NR+$3} NR==n{print; n=NR+$3}' file
Id TagNo Nth
a A-A-3 3
a A-A-2 3
b B-B-5 4
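The one-liner above tracks the next line number to print, so it relies on all rows of an Id being adjacent in the file. If that cannot be guaranteed, a per-group counter is one possible alternative. The following is only a sketch: it assumes whitespace-separated fields, a single header line, and a positive integer in the Nth column. The names sample, result and seen are made up for the illustration.

```shell
# Recreate the sample data from the question in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Id TagNo Nth
a A-A-3 3
a A-A-1 3
a A-A-5 3
a A-A-2 3
a AX-45 3
a AX-33 3
b B-B-5 4
b B-B-4 4
b B-B-3 4
b BX-B2 4
EOF

# NR == 1 keeps the header line (assumed to be the first line of the file).
# seen[$1]++ yields how many rows of this Id were read before the current one
# (0 for the first row), so a row is printed whenever that count is a
# multiple of the row's own Nth value: counts 0, 3, 6, ... for Nth = 3.
result=$(awk 'NR == 1 || (seen[$1]++ % $3) == 0' "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

On the sample data this prints the header followed by a A-A-3, a A-A-2 and b B-B-5, the same rows as the desired output. Because the counter is kept per Id, the rows of a group do not need to be contiguous.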
Base R solution:
do.call(rbind, lapply(split(df, df$Id), function(x) x[seq(from = 1, to = nrow(x), by = unique(x$Nth)), ]))
Id TagNo Nth
a.1 a A-A-3 3
a.4 a A-A-2 3
b b B-B-5 4