Awk match multiple strings and print both fields on the same line

Tags:

awk

I have an input file whose fields do not appear in a consistent order from line to line. I want to find two specific fields on each line and print them together on the same line.

EDIT: Here is an example of the input file:

abc=012 aaa=000 cba=210 bbb=111
aaa=555 abc=567 cba=765 bbb=666
aaa=444 abc=456 bbb=555 cba=654

These two commands almost work:

  awk '{for(i=1;i<=NF;i++){if ($i ~ /aaa/) {print $i}}}' file
  awk '{for(i=1;i<=NF;i++){if ($i ~ /bbb/) {print $i}}}' file

However, each match is printed on its own line, and all the aaa fields come out before all the bbb fields instead of being paired by input line:

aaa=000
aaa=555
aaa=444
bbb=111
bbb=666
bbb=555

What I need is for the field bbb to follow the field aaa on the same line, like this:

aaa=000 bbb=111
aaa=555 bbb=666
aaa=444 bbb=555

How can this be done?

Reubens4Dinner asked Dec 05 '25 00:12


1 Answer

Here is an awk solution using the match() and substr() functions. Set the search="..." variable to a comma-separated list of the field names you need; the fields are printed in the same order as you list them.

awk -v search="aaa,bbb" '
BEGIN{
    n=split(search, arr, /,/) 
}
{
    for(i=1; i in arr; i++)
          printf("%s%s", (match($0,"(^| )"arr[i]"=[^ ]*") ? substr($0,(RSTART>1?RSTART+1:RSTART),(RSTART>1?RLENGTH-1:RLENGTH)) : ""), i==n ? ORS : OFS)      
}' infile

Test Results:

akshay@db-3325:/tmp$ cat infile
abc=012 aaa=000 cba=210 bbb=111
aaa=555 abc=567 cba=765 bbb=666
aaa=444 abc=456 bbb=555 cba=654

akshay@db-3325:/tmp$ awk -v search="aaa,bbb" '
BEGIN{
    n=split(search, arr, /,/) 
}
{
    for(i=1; i in arr; i++)
          printf("%s%s", (match($0,"(^| )"arr[i]"=[^ ]*") ? substr($0,(RSTART>1?RSTART+1:RSTART),(RSTART>1?RLENGTH-1:RLENGTH)) : ""), i==n ? ORS : OFS)      
}' infile
aaa=000 bbb=111
aaa=555 bbb=666
aaa=444 bbb=555
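
To check that the output really follows the order given in search, here is a small sketch that wraps the same program in a shell function and reverses the list. The function name pick_fields is hypothetical (it is not part of the answer above); it takes the search list as its first argument and reads from standard input when no file is given.

```shell
# Hypothetical helper: $1 is the comma-separated search list,
# any remaining arguments are input files (stdin if none).
pick_fields() {
    search=$1; shift
    awk -v search="$search" '
    BEGIN{
        n=split(search, arr, /,/)
    }
    {
        for(i=1; i in arr; i++)
            printf("%s%s", (match($0,"(^| )"arr[i]"=[^ ]*") ? substr($0,(RSTART>1?RSTART+1:RSTART),(RSTART>1?RLENGTH-1:RLENGTH)) : ""), i==n ? ORS : OFS)
    }' "$@"
}

# Same sample data as above, but bbb is requested before aaa.
pick_fields "bbb,aaa" <<'EOF'
abc=012 aaa=000 cba=210 bbb=111
aaa=555 abc=567 cba=765 bbb=666
aaa=444 abc=456 bbb=555 cba=654
EOF
```

With the list reversed, each line should come out as bbb first and aaa second (bbb=111 aaa=000 for the first line). If a name is missing from a line, that position is printed as an empty string, so the separator still appears.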

Explanation

awk -v search="aaa,bbb" '             # call awk and set the variable search
BEGIN{

    # split string in variable search
    # into array, separated by comma
    # arr[1]  will have aaa
    # arr[2]  will have bbb
    # variable n will have 2, which is count of array

    n=split(search, arr, /,/) 
}
{
    # loop through array arr
    for(i=1; i in arr; i++)
    {
         found = 0                   # default state

         # try to match: beginning of line or a space, followed by your word,
         # then "=" and any run of non-space chars,
         # which creates a regexp like:
         #    /(^| )aaa=[^ ]*/
         #    /(^| )bbb=[^ ]*/
         # if it matches then

         if(match($0,"(^| )"arr[i]"=[^ ]*")){ 

             # if the match was not at the beginning, it starts with a space char,
             # so increment the starting position and decrement the length
             if(RSTART>1){
               RSTART++              # skip the space: start one char later
               RLENGTH--             # length is one char shorter
             }
             found = 1               # found flag
         }

         # ternary operator syntax: ( your_condition ) ? true_value : false_value
         # if found is true then use substr, else ""
         # if i equals n then print the output record separator (ORS),
         # else the output field separator (OFS)
         printf("%s%s", ( found ? substr($0,RSTART,RLENGTH) : ""), i==n ? ORS : OFS)
    }      
}' infile
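
For comparison, here is a sketch of a different technique: walking the fields instead of matching a regexp against the whole line. It is an alternative, not the program explained above. It assumes every field of interest has the form key=value, and it matches the key exactly, so aaa does not pick up a field such as xaaa=9. The function name pick_by_key and the array names want and val are made up for this example.

```shell
# Hypothetical helper: $1 is the comma-separated key list,
# any remaining arguments are input files (stdin if none).
pick_by_key() {
    keys=$1; shift
    awk -v search="$keys" '
    BEGIN { n = split(search, want, /,/) }
    {
        split("", val)                  # clear values left from the previous line
        for (i = 1; i <= NF; i++) {
            eq = index($i, "=")         # position of the first "=" in this field
            if (eq > 0)
                val[substr($i, 1, eq - 1)] = $i   # key -> whole "key=value" field
        }
        # print the requested keys in the order they were listed
        for (j = 1; j <= n; j++)
            printf "%s%s", ((want[j] in val) ? val[want[j]] : ""), (j == n ? ORS : OFS)
    }' "$@"
}

pick_by_key "aaa,bbb" <<'EOF'
abc=012 aaa=000 cba=210 bbb=111
aaa=555 abc=567 cba=765 bbb=666
aaa=444 abc=456 bbb=555 cba=654
EOF
```

On the sample input this should give the same three lines as the match() version. One difference to be aware of: this version relies on awk's default field splitting, so it also accepts tabs between fields, whereas the regexp above looks for a literal space.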
Akshay Hegde answered Dec 08 '25 10:12


