awk one-liner: select only rows based on the value of a column

Tags: linux, unix, awk

I'd like to read filein.txt (tab-delimited) and write a fileout.txt that contains only the rows matching a given value in one column, with that column removed. For example:

filein.txt
#name\thouse\taddress
roger\tvictorian\t223 dolan st.
maggie\tfrench\t12 alameda ave.
kingston\tvictorian\t224 house st.
robert\tamerican\t22 dolan st.

Let us say I'd like to select only the rows where the houses are of victorian style, then my fileout.txt should look like:

fileout.txt
#name\taddress
roger\t223 dolan st.
kingston\t224 house st.

Dnaiel asked Nov 13 '12 16:11


2 Answers

awk -F"\t" '$2 == "victorian" { print $1"\t"$3 }' filein.txt > fileout.txt
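
Note that the one-liner above drops the #name header row, because the header's second field is "house", not "victorian". The following is a minimal sketch that also keeps the header, assuming the header is always the first line of the file. The file names come from the question; the printf line only recreates the sample input so the example can be run on its own.

```shell
# Recreate the tab-delimited sample input from the question.
# printf reuses the format string for each group of three arguments.
printf '%s\t%s\t%s\n' \
    '#name' 'house' 'address' \
    'roger' 'victorian' '223 dolan st.' \
    'maggie' 'french' '12 alameda ave.' \
    'kingston' 'victorian' '224 house st.' \
    'robert' 'american' '22 dolan st.' > filein.txt

# NR == 1 keeps the header row (assumption: the header is line 1);
# the second condition keeps rows whose 2nd field is "victorian".
# OFS makes the comma in print emit a tab between $1 and $3.
awk -F'\t' -v OFS='\t' 'NR == 1 || $2 == "victorian" { print $1, $3 }' filein.txt > fileout.txt

cat fileout.txt
```

With the sample data, fileout.txt should then contain the #name/address header followed by the roger and kingston rows.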

Kevin answered Sep 29 '22 10:09


You can do it with the following awk script:

#!/bin/bash

style="victorian"
awk -v s_style="$style" 'BEGIN{FS=OFS="\t"}
    $2==s_style {$2=""; sub("\t\t","\t"); print}' filein.txt > fileout.txt

Explanation:

  • style="victorian": assign the house style you want to select outside the awk program, so it is easier to maintain.
  • awk: invoke awk.
  • -v s_style="$style": the -v option passes an external value into awk; you need one -v per variable. Here it assigns the shell variable $style to the awk variable s_style. The quotes keep the assignment intact if the value contains spaces.
  • BEGIN{FS=OFS="\t"}: sets both the input field separator (FS) and the output field separator (OFS) to a tab, instead of the defaults (runs of whitespace for input, a single space for output).
  • $2==s_style {$2=""; sub("\t\t","\t"); print}: if the 2nd field equals the house style in s_style (in this case, victorian), blank it out, collapse the two adjacent tabs this leaves behind into one, and print the line.
  • filein.txt > fileout.txt: read the input file and redirect the result to the output file.
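
To see why the sub() call is needed, here is a small sketch that runs both variants on a single row from the question. Tabs are translated to | with tr purely to make them visible; the variable names line, emptied and collapsed are only for illustration.

```shell
# One tab-delimited row from the question's sample data.
line="$(printf 'roger\tvictorian\t223 dolan st.')"

# Assigning to $2 makes awk rebuild $0 with OFS, so the now-empty
# field leaves two tabs in a row.
emptied="$(printf '%s\n' "$line" |
    awk 'BEGIN{FS=OFS="\t"} {$2=""; print}' | tr '\t' '|')"

# sub() replaces the first doubled tab with a single tab,
# which removes the empty field from the output.
collapsed="$(printf '%s\n' "$line" |
    awk 'BEGIN{FS=OFS="\t"} {$2=""; sub("\t\t","\t"); print}' | tr '\t' '|')"

echo "$emptied"     # roger||223 dolan st.
echo "$collapsed"   # roger|223 dolan st.
```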

Alternatively, you could do:

#!/bin/bash

style="victorian"
awk -v s_style="$style" 'BEGIN{FS=OFS="\t"}
    $2==s_style {print $1, $3}' filein.txt > fileout.txt

but this assumes that your input files have exactly three tab-separated fields and will not gain additional ones in the future.
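
If more columns might be added later, one option is to rebuild each line from every field except the one being tested. The sketch below is one way to do that, not the only one. The variable col (the index of the column to test and drop) and the awk variables out and sep are hypothetical names introduced here, and the header is assumed to be the first line. The printf line only recreates the question's sample input.

```shell
# Recreate the tab-delimited sample input from the question.
printf '%s\t%s\t%s\n' \
    '#name' 'house' 'address' \
    'roger' 'victorian' '223 dolan st.' \
    'maggie' 'french' '12 alameda ave.' \
    'kingston' 'victorian' '224 house st.' \
    'robert' 'american' '22 dolan st.' > filein.txt

style="victorian"
col=2   # hypothetical parameter: column to match on and then drop

awk -v s_style="$style" -v col="$col" 'BEGIN { FS = OFS = "\t" }
# Keep the header (assumed to be line 1) and every matching row.
NR == 1 || $col == s_style {
    out = ""; sep = ""
    # Join all fields except the dropped column, however many there are.
    for (i = 1; i <= NF; i++) {
        if (i == col) continue
        out = out sep $i
        sep = OFS
    }
    print out
}' filein.txt > fileout.txt

cat fileout.txt
```

Because the loop runs up to NF, the same program keeps working if the input later gains a fourth or fifth tab-separated column.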

sampson-chen answered Sep 29 '22 09:09