I have an input file that does not have a consistent structure for the fields. What I'm trying to do is find the correct two fields and print their content on the same line.
EDIT: Here is a potential example for the input file:
abc=012 aaa=000 cba=210 bbb=111
aaa=555 abc=567 cba=765 bbb=666
aaa=444 abc=456 bbb=555 cba=654
These two commands almost work:
awk '{for(i=1;i<=NF;i++){if ($i ~ /aaa/) {print $i}}}' file
awk '{for(i=1;i<=NF;i++){if ($i ~ /bbb/) {print $i}}}' file
However, each match is printed on its own line, so the aaa and bbb values that come from the same input line are not paired:
aaa=000
aaa=555
aaa=444
bbb=111
bbb=666
bbb=555
What I need is for the field bbb to follow the field aaa on the same line, like this:
aaa=000 bbb=111
aaa=555 bbb=666
aaa=444 bbb=555
How can this be done?
Here is an awk solution that uses the match() and substr() functions. Set the search="..." variable to the comma-separated keys you need; the fields are printed in the same order in which you list them.
awk -v search="aaa,bbb" '
BEGIN{
n=split(search, arr, /,/)
}
{
for(i=1; i in arr; i++)
printf("%s%s", (match($0,"(^| )"arr[i]"=[^ ]*") ? substr($0,(RSTART>1?RSTART+1:RSTART),(RSTART>1?RLENGTH-1:RLENGTH)) : ""), i==n ? ORS : OFS)
}' infile
Test Results:
akshay@db-3325:/tmp$ cat infile
abc=012 aaa=000 cba=210 bbb=111
aaa=555 abc=567 cba=765 bbb=666
aaa=444 abc=456 bbb=555 cba=654
akshay@db-3325:/tmp$ awk -v search="aaa,bbb" '
BEGIN{
n=split(search, arr, /,/)
}
{
for(i=1; i in arr; i++)
printf("%s%s", (match($0,"(^| )"arr[i]"=[^ ]*") ? substr($0,(RSTART>1?RSTART+1:RSTART),(RSTART>1?RLENGTH-1:RLENGTH)) : ""), i==n ? ORS : OFS)
}' infile
aaa=000 bbb=111
aaa=555 bbb=666
aaa=444 bbb=555
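To illustrate the point about ordering, here is a small sketch that wraps the same command in a shell function and asks for the keys in a different order. The function name pick_fields is only an assumption for this example and is not part of the answer above; the input file is recreated with a here-document so the snippet is self-contained.

```shell
# Recreate the sample input from the question (the file name is arbitrary).
cat > infile <<'EOF'
abc=012 aaa=000 cba=210 bbb=111
aaa=555 abc=567 cba=765 bbb=666
aaa=444 abc=456 bbb=555 cba=654
EOF

# pick_fields is a hypothetical wrapper: $1 = comma-separated keys, $2 = input file.
pick_fields() {
  awk -v search="$1" '
  BEGIN {
    n = split(search, arr, /,/)   # arr[1..n] holds the requested keys
  }
  {
    for (i = 1; i in arr; i++)
      printf("%s%s", (match($0, "(^| )" arr[i] "=[^ ]*") ? substr($0, (RSTART>1 ? RSTART+1 : RSTART), (RSTART>1 ? RLENGTH-1 : RLENGTH)) : ""), i==n ? ORS : OFS)
  }' "$2"
}

# Keys are printed in the order given here, not in the order found in the file.
pick_fields "bbb,cba" infile
```

If a requested key is missing from a line, the script prints an empty string in its place, so the separator is still emitted for that column.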
Explanation
awk -v search="aaa,bbb" ' # call awk set variable search
BEGIN{
# split string in variable search
# into array, separated by comma
# arr[1] will have aaa
# arr[2] will have bbb
# variable n will have 2, which is count of array
n=split(search, arr, /,/)
}
{
# loop through array arr
for(i=1; i in arr; i++)
{
found = 0 # default state
# try to match the current key:
# start of line or a space, followed by your word,
# then "=" and any run of non-space chars
# which creates a regexp like :
# /(^| )aaa=[^ ]*/
# /(^| )bbb=[^ ]*/
# if it matches then
if(match($0,"(^| )"arr[i]"=[^ ]*")){
# if the match is not at the start of the line, it begins with a space char
# so increment the starting position and decrement the length to skip it
if(RSTART>1){
RSTART++ # we got a space, so start one char later
RLENGTH-- # length is one char shorter
}
}
found =1 # found flag
}
# ternary operator syntax : ( your_condition ) ? true_action : false_action
# if found is true then use substr
# else ""
# if i equals n then print the output record separator (ORS), else the output field separator (OFS)
printf("%s%s", ( found ? substr($0,RSTART,RLENGTH) : ""), i==n ? ORS : OFS)
}
}' infile
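As an alternative that stays closer to the field loop in the question, the following sketch stores every key=value field of a line in an array keyed by the text before the "=" and then prints the requested keys in order. It is not the method from the answer above; the names by_key, want, and val are assumptions made for this illustration.

```shell
# Recreate the sample input from the question (the file name is arbitrary).
cat > infile <<'EOF'
abc=012 aaa=000 cba=210 bbb=111
aaa=555 abc=567 cba=765 bbb=666
aaa=444 abc=456 bbb=555 cba=654
EOF

# by_key is a hypothetical wrapper: $1 = comma-separated keys, $2 = input file.
by_key() {
  awk -v search="$1" '
  BEGIN {
    n = split(search, want, /,/)          # want[1..n] holds the requested keys
  }
  {
    split("", val)                        # clear the per-line lookup table
    for (i = 1; i <= NF; i++) {
      eq = index($i, "=")                 # position of the first "=" in the field
      if (eq > 0)
        val[substr($i, 1, eq - 1)] = $i   # key -> whole "key=value" field
    }
    for (j = 1; j <= n; j++)
      printf("%s%s", ((want[j] in val) ? val[want[j]] : ""), (j == n ? ORS : OFS))
  }' "$2"
}

by_key "aaa,bbb" infile
```

Because the keys are compared as exact strings rather than spliced into a regular expression, characters such as "." in a key name are treated literally. One behavioral difference to be aware of: if a key occurs twice on a line, this sketch keeps the last occurrence, while the match() version returns the first.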