How to get specific data from block of data based on condition

Tags: bash, sed, awk

I have a file like this:

[group]
enable = 0
name =  green
test = more

[group]
name  = blue
test = home

[group]
value = 48
name = orange
test = out

There may be one or more spaces or tabs between the label, the =, and the value.
The number of lines may vary from block to block.
I want the name of each block, except for blocks that contain enable = 0.

So output should be:

blue
orange

Here is what I have managed to create:

awk -v RS="group" '!/enable = 0/ {sub(/.*name[[:blank:]]+=[[:blank:]]+/,x);print $1}'
blue
orange

There are several faults with this:

  1. I am not able to set RS to [group]; both RS="[group]" and RS="\[group\]" fail. With RS="group", the command breaks if a name or another label contains group.
  2. I would prefer not to use a multi-character RS, since that is GNU awk only.

Does anyone have another suggestion? I would prefer sed or awk, without a long chain of commands.

Jotne asked Feb 23 '14 12:02


4 Answers

If you know that groups are always separated by empty lines, set RS to the empty string:

$ awk -v RS="" '!/enable = 0/ {sub(/.*name[[:blank:]]+=[[:blank:]]+/,x);print $1}'
blue
orange
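
One caveat with the commands above: the literal pattern enable = 0 only matches when exactly one space sits on each side of the =, while the question allows any number of spaces or tabs. A minimal sketch of a more tolerant variant, still in paragraph mode; the function name enabled_names and the sample data are made up for illustration:

```shell
# Hypothetical sample data; the first block uses tabs around "=".
sample=$(mktemp)
printf '[group]\nenable\t=\t0\nname =  green\ntest = more\n\n' > "$sample"
printf '[group]\nname  = blue\ntest = home\n\n' >> "$sample"
printf '[group]\nvalue = 48\nname = orange\ntest = out\n' >> "$sample"

# Print the name of every block that does not contain "enable = 0",
# allowing any run of spaces or tabs around the "=".
enabled_names() {
    awk -v RS="" '
    !/enable[ \t]*=[ \t]*0/ {
        sub(/.*name[ \t]*=[ \t]*/, "")   # drop everything up to the name value
        print $1
    }' "$1"
}

enabled_names "$sample"
```

With the sample above this should print blue and orange. Like the original, it is a plain substring match, so it would also skip a block containing something like enable = 05.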

As @devnull explains in his answer, GNU awk also accepts a regular expression as RS, so you can split only where [group] is on a line of its own:

gawk -v RS='(^|\n)[[]group]($|\n)' '!/enable = 0/ {sub(/.*name[[:blank:]]+=[[:blank:]]+/,x);print $1}'

This makes sure we're not splitting at evil names like

[group]
enable = 0
name =  [group]
name = evil
test = more

Jens Erat answered Oct 26 '22 19:10


Your problem seems to be:

I am not able to set RS to [group], both this fails RS="[group]" and RS="\[group\]".

Saying:

RS="[[]group[]]"

should yield the desired result.
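
The same bracket expressions also work in an ordinary line pattern, which avoids relying on a regular-expression RS altogether. A rough sketch of that idea, reading the file line by line; the function names and the sample data are invented for illustration:

```shell
# Hypothetical sample data, including a value that looks like a header.
sample=$(mktemp)
printf '[group]\nenable = 0\nname =  [group]\ntest = more\n\n' > "$sample"
printf '[group]\nname  = blue\ntest = home\n\n' >> "$sample"
printf '[group]\nvalue = 48\nname = orange\ntest = out\n' >> "$sample"

# Track one block at a time and print its name when the next header
# (or the end of the file) is reached, unless the block was disabled.
names_by_line() {
    awk '
    function emit() {
        if (name != "" && !disabled) print name
        name = ""; disabled = 0
    }
    # [[] matches a literal "[" and []] matches a literal "]".
    /^[[]group[]][ \t]*$/ { emit(); next }
    {
        eq = index($0, "=")
        if (eq == 0) next                   # blank or malformed line
        key = substr($0, 1, eq - 1)
        val = substr($0, eq + 1)
        gsub(/^[ \t]+|[ \t]+$/, "", key)    # trim spaces and tabs
        gsub(/^[ \t]+|[ \t]+$/, "", val)
        if (key == "enable" && val == "0") disabled = 1
        if (key == "name") name = val
    }
    END { emit() }' "$1"
}

names_by_line "$sample"
```

Because the header pattern is anchored to the whole line, a value such as name =  [group] is not mistaken for a block boundary. With the sample above this should print blue and orange.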

devnull answered Oct 26 '22 17:10


In situations like this, where a record clearly consists of name = value statements, I like to first populate an array with those mappings, e.g.:

map["<name>"] = <value>

and then just use the names to reference the values I want. In this case:

$ awk -v RS= -F'\n' '
{
    delete map
    for (i=1;i<=NF;i++) {
        split($i,tmp,/[[:blank:]]*=[[:blank:]]*/)
        map[tmp[1]] = tmp[2]
    }
}
map["enable"] !~ /^0$/ {
    print map["name"]
}
' file
blue
orange

If your version of awk doesn't support deleting a whole array then change delete map to split("",map).

Compared to using REs and/or sub()s, etc., this makes the solution much more robust and extensible in case you want to compare and/or print the values of other fields in the future.
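
As an illustration of that extensibility, here is a hedged sketch that prints both the name and the test value of every enabled block. The function name name_and_test and the sample data are assumptions made for the example, and the key/value split is done with index() and substr() so that tabs around the = are tolerated too:

```shell
# Hypothetical sample data matching the file in the question.
sample=$(mktemp)
printf '[group]\nenable = 0\nname =  green\ntest = more\n\n' > "$sample"
printf '[group]\nname  = blue\ntest = home\n\n' >> "$sample"
printf '[group]\nvalue = 48\nname = orange\ntest = out\n' >> "$sample"

name_and_test() {
    awk -v RS= -F'\n' '
    {
        split("", map)                        # portable way to clear the array
        for (i = 1; i <= NF; i++) {
            eq = index($i, "=")
            if (eq == 0) continue             # e.g. the [group] line
            key = substr($i, 1, eq - 1)
            val = substr($i, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", key)  # trim spaces and tabs
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            map[key] = val
        }
    }
    map["enable"] != "0" {
        print map["name"], map["test"]
    }' "$1"
}

name_and_test "$sample"
```

With the sample above this should print "blue home" and "orange out". Only the final print statement had to change to report an extra field.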

Ed Morton answered Oct 26 '22 17:10


Since your records are separated by blank lines, you should consider putting awk in paragraph mode (RS set to the empty string). If you must test for the [group] identifier, simply add code to handle that. Here's some example code that should fulfill your requirements. Run like:

awk -f script.awk file.txt

Contents of script.awk:

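BEGIN {
    # Paragraph mode: records are separated by blank lines.
    RS = ""
}

{
    # Field 1 is [group]. After that, each line contributes three
    # fields (label, "=", value), so labels sit at fields 2, 5, 8, ...
    for (i = 2; i <= NF; i += 3) {
        if ($i == "enable" && $(i+2) == 0) {
            f = 1          # this block is disabled
        }
        if ($i == "name") {
            r = $(i+2)     # remember the block's name
        }
    }
}

# Print the name only for blocks that are not disabled.
!(f) && r {
    print r
}

# Reset the state before the next record.
{
    f = 0
    r = ""
}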

Results:

blue
orange
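
One possible way to add the [group] test mentioned above is to skip every record whose first field is not [group]. This is only a sketch under the same assumption as the script (each line is label, =, value separated by whitespace); the function name group_names and the [other] block in the sample data are hypothetical:

```shell
# Hypothetical sample data with one block that is not a [group].
sample=$(mktemp)
printf '[group]\nenable = 0\nname = green\n\n' > "$sample"
printf '[other]\nname = red\n\n' >> "$sample"
printf '[group]\nname = blue\ntest = home\n\n' >> "$sample"
printf '[group]\nvalue = 48\nname = orange\n' >> "$sample"

group_names() {
    awk '
    BEGIN { RS = "" }                # paragraph mode
    $1 != "[group]" { next }         # ignore blocks with another header
    {
        disabled = 0; name = ""
        # Labels sit at fields 2, 5, 8, ... with the value two fields later.
        for (i = 2; i + 2 <= NF; i += 3) {
            if ($i == "enable" && $(i+2) == 0) disabled = 1
            if ($i == "name") name = $(i+2)
        }
        if (!disabled && name != "") print name
    }' "$1"
}

group_names "$sample"
```

With the sample above this should print blue and orange; the red entry is dropped because its block header is [other].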

Steve answered Oct 26 '22 18:10