
How to sort groups of lines together?

Tags: unix, sorting, awk

I have this file:

rs1    1    ADD     0.7     0.75     0.45
rs1    1    VAR     0.4     4.53     0.06
rs1    1    USER    NA      1.96     0.37
rs2    2    ADD     1.5     0.25     0.23
rs2    2    VAR     0.4     4.86     0.03
rs2    2    USER    NA      1.73     0.05
rs3    3    ADD     0.29    0.76     0.97
rs3    3    VAR     0.44    3.95     0.09
rs3    3    USER    0.96    5.41     0.01

For each value in $1, there are three lines with ADD, VAR, USER in $3. I want to sort (reverse sort) the file based on the $6 values of the lines with USER in $3. How can I do this while keeping the corresponding ADD and VAR lines next to each sorted USER line? The other two lines don't need to be sorted themselves; they just need to stay with their USER line.

Desired output:

 rs3    3    ADD     0.29    0.76    0.97
 rs3    3    VAR     0.44    3.95    0.09
 rs3    3    USER    0.96    5.41    0.01
 rs2    2    ADD     1.5     0.25    0.23
 rs2    2    VAR     0.4     4.86    0.03
 rs2    2    USER    NA      1.73    0.05
 rs1    1    ADD     0.7     0.75    0.45
 rs1    1    VAR     0.4     4.53    0.06
 rs1    1    USER    NA      1.96    0.37

I have tried this code, but it only sorts based on the $6 values in USER lines:

cat File | sort -k1 | uniq | sort -g -k6 > Output

Thank you

user2162153 asked Sep 10 '13 23:09



2 Answers

This is a bit messy but does what you want:

paste - - - < File | sort -k18,18g | xargs -n 6

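The problem with the input format is that sort has no way to operate on groups of input lines, so you need to turn each group into one line, sort it, then turn it back. Here paste - - - joins every three lines into a single 18-field line, sort orders those lines by field 18 (the $6 of the USER line), and xargs -n 6 splits them back into six fields per line, separated by single spaces. This only works if every group has exactly three lines and the "USER" line is always last in the group.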
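If the USER line is not guaranteed to be last, the same "turn it into something sortable, sort, turn it back" idea can be done per line instead of per group. The following is only a sketch under some assumptions: sort_groups is a hypothetical helper name, the fields contain no tabs that matter before $1, and sort supports -g (as in the question's own command). It reads the file twice: the first pass remembers each group's USER $6, the second pass prefixes every line with that value, and cut strips the prefix again after sorting.

```shell
#!/bin/sh
# sort_groups FILE (hypothetical helper): order whole groups by the $6 value
# of their USER line, wherever that line sits inside the group.
sort_groups() {
  # Pass 1 (NR==FNR): remember $6 of each group's USER line, keyed by $1.
  # Pass 2: prefix each line with that key, the group id and the line number.
  LC_ALL=C awk '
    NR == FNR { if ($3 == "USER") key[$1] = $6; next }
    { print key[$1] "\t" $1 "\t" FNR "\t" $0 }
  ' "$1" "$1" |
    # Sort by key (general numeric), then group id, then original line order.
    LC_ALL=C sort -k1,1g -k2,2 -k3,3n |
    # Drop the three helper columns, leaving the original line untouched.
    cut -f4-
}
```

Called as sort_groups File > Output, this should keep the original spacing of each line, since only the added prefix is removed. Like the pipeline above it sorts ascending; adding r to the first sort key (-k1,1gr) would reverse the order.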

PhilR answered Jan 18 '23 21:01


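Here's a Ruby one-liner :) It groups the lines by the second column and prints the groups in reverse order of that key, which gives the desired order for this sample: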

ruby -e 'File.open(ARGV.shift).readlines.entries.group_by{|e| e.split[1]}.sort.reverse.each{|e| puts e[1]}' file

Output:

rs3    3    ADD     0.29    0.76     0.97
rs3    3    VAR     0.44    3.95     0.09
rs3    3    USER    0.96    5.41     0.01
rs2    2    ADD     1.5     0.25     0.23
rs2    2    VAR     0.4     4.86     0.03
rs2    2    USER    NA      1.73     0.05
rs1    1    ADD     0.7     0.75     0.45
rs1    1    VAR     0.4     4.53     0.06
rs1    1    USER    NA      1.96     0.37

konsolebox answered Jan 18 '23 19:01