Take every nth row from a file with groups, where n is given in a column

I have seen here and here how to return every nth row, but my problem is different. A separate column in the file specifies which nth row to return, and that value differs by group. Here is a sample of the dataset, where the Nth column gives the interval: every 3rd row for Id group a and every 4th row for Id group b. The data is quite sizable, with several Id groups.

Id  TagNo   Nth
a   A-A-3   3
a   A-A-1   3
a   A-A-5   3
a   A-A-2   3
a   AX-45   3
a   AX-33   3
b   B-B-5   4
b   B-B-4   4
b   B-B-3   4
b   BX-B2   4 

Desired output:

Id  TagNo   Nth
a   A-A-3   3
a   A-A-2   3
b   B-B-5   4

Thank you for your help.

Edit: Please note that I want to pick the first row of each group and then every nth row after it, where n is 3 for a and 4 for b. For group a that is the 1st, 4th, 7th, ... rows; for group b it is the 1st, 5th, 9th, ... rows. The original desired output had an error and has been corrected. My sincere apologies.

deepseefan asked Oct 17 '17 07:10


2 Answers

This awk should work:

awk '!a[$1]++{print; if(NR>1) n=NR+$3} NR==n{print; n=NR+$3}' file

Id  TagNo   Nth
a   A-A-3   3
a   A-A-2   3
b   B-B-5   4
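
As a possible alternative (a sketch, not the method used in the one-liner above), a per-group counter avoids tracking absolute line numbers and does not require the rows of a group to be contiguous. It assumes whitespace-separated columns, a single header line, and a positive integer in the Nth column of every data row. The function name every_nth_per_group and the array name seen are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads the table on stdin and prints the header
# plus rows 1, 1+n, 1+2n, ... of each Id group, with n taken from column 3.
every_nth_per_group() {
    # NR == 1 keeps the header; the || short-circuits, so the header's
    # non-numeric "Nth" field is never used as a divisor.
    # seen[$1]++ yields 0, 1, 2, ... for successive rows of the same Id.
    awk 'NR == 1 || (seen[$1]++ % $3) == 0'
}

# Sample data from the question.
every_nth_per_group <<'EOF'
Id  TagNo   Nth
a   A-A-3   3
a   A-A-1   3
a   A-A-5   3
a   A-A-2   3
a   AX-45   3
a   AX-33   3
b   B-B-5   4
b   B-B-4   4
b   B-B-3   4
b   BX-B2   4
EOF
```

On the sample data this selects the header and the rows A-A-3, A-A-2 and B-B-5. A data row with 0 in the Nth column would trigger a division-by-zero error, so such input would need to be filtered out first.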

anubhava answered Oct 29 '22 08:10


Base R solution, with the data read into a data frame named df:

do.call(rbind, lapply(split(df, df$Id), function(x) x[seq(from = 1, to = nrow(x), by = unique(x$Nth)), ]))

    Id TagNo Nth
a.1  a A-A-3   3
a.4  a A-A-2   3
b    b B-B-5   4

LAP answered Oct 29 '22 08:10