Separating output records in AWK without a trailing separator

Tags: json, awk, gawk, nawk

I have the following records:

31 Stockholm
42 Talin
34 Helsinki
24 Moscow
15 Tokyo

I want to convert them to JSON with AWK. Using this code:

#!/usr/bin/awk
BEGIN {
    print "{";
    FS=" ";
    ORS=",\n";
    OFS=":";
};

{    
    if ( !a[city]++ && NR > 1 ) {
        key = $2;
        value = $1;
        print "\"" key "\"", value;
    }
};

END {
    ORS="\n";
    OFS=" ";
    print "\b\b}";
};

Gives me this:

{
"Stockholm":31,
"Talin":42,
"Helsinki":34,
"Moscow":24,
"Tokyo":15, <--- I don't want this comma
}

The problem is the trailing comma on the last data line, which makes the JSON invalid. How can I get this output instead?

{
"Stockholm":31,
"Talin":42,
"Helsinki":34,
"Moscow":24,
"Tokyo":15
}
AlexStack asked Mar 25 '13 19:03


3 Answers

Mind some feedback on your posted script?

#!/usr/bin/awk        # Needs "-f" (#!/usr/bin/awk -f) to run as an executable script. Also be aware that on Solaris this will be old, broken awk, which you must never use.
BEGIN {
    print "{";        # On this and every other line, the trailing semi-colon is a pointless null-statement, remove all of these.
    FS=" ";           # This is setting FS to the value it already has so remove it.
    ORS=",\n";
    OFS=":";
};

{
    if ( !a[city]++ && NR > 1 ) {      # awk consists of <condition>{<action>} segments, so move this condition out to the condition part.
                                       # Also, you never populate a variable named "city", so `!a[city]++` always tests a[""]: it is true only on the first record, which `NR > 1` then excludes.
        key = $2;
        value = $1;
        print "\"" key "\"", value;
    }
};

END {
    ORS="\n";                          # keep this: the print below still appends ORS, so without the reset "}" is followed by a comma.
    OFS=" ";                           # not needed: the print below has a single argument, so OFS is never used.
    print "\b\b}";                     # why would you want to print a backspace???
};

So your original script should have been written as:

#!/usr/bin/awk -f
BEGIN {
    print "{"
    ORS=",\n"
    OFS=":"
}

!a[city]++ && (NR > 1) {
    key = $2
    value = $1
    print "\"" key "\"", value
}

END {
    ORS="\n"
    print "}"
}

Here's how I'd really write a script to convert your posted input to your posted output though:

$ cat file
31 Stockholm
42 Talin
34 Helsinki
24 Moscow
15 Tokyo
$
$ awk 'BEGIN{print "{"} {printf "%s\"%s\":%s",sep,$2,$1; sep=",\n"} END{print "\n}"}' file
{
"Stockholm":31,
"Talin":42,
"Helsinki":34,
"Moscow":24,
"Tokyo":15
}
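
The separator idiom in that one-liner can be spelled out as a multi-line sketch. It is the same technique, wrapped in a shell function whose name, `records_to_json`, is made up for illustration. The `!seen[$2]++` filter is an assumption about what `!a[city]++` in the question was meant to do (skip repeated city names); drop it if duplicates cannot occur.

```shell
# records_to_json: hypothetical helper name, not part of the original answer.
records_to_json() {
    awk '
        BEGIN { print "{" }

        # Assumption: skip a record whose city ($2) was already seen.
        !seen[$2]++ {
            # sep is empty for the first printed record and ",\n" afterwards,
            # so the separator goes before each record except the first.
            printf "%s\"%s\":%s", sep, $2, $1
            sep = ",\n"
        }

        END { print "\n}" }
    ' "$@"
}

# Example run with one repeated city in the input.
printf '31 Stockholm\n42 Talin\n31 Stockholm\n15 Tokyo\n' | records_to_json
```

Because the separator is written before a record rather than after it, filtering records out never leaves a dangling comma, which is where an ORS-based approach runs into trouble.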
Ed Morton answered Oct 02 '22 01:10


You have a couple of choices. An easy one would be to add the comma of the previous line as you are about to write out a new line:

  • Set a variable first = 1 in your BEGIN.

  • When about to print a line, check first. If it is 1, then just set it to 0. If it is 0 print out a comma and a newline:

    if (first) { first = 0; } else { print ","; }
    

    The point of this is to avoid putting an extra comma at the start of the list.

  • Use printf("%s", ...) instead of print ... so that you can avoid the newline when printing a record.

  • Add an extra newline before the close brace, as in: print "\n}";
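
Put together, the steps above might look like the sketch below. The function name `records_to_json_flag` is made up for illustration, and the field order ($2 as the key, $1 as the value) is taken from the question.

```shell
# records_to_json_flag: hypothetical helper name used only for this sketch.
records_to_json_flag() {
    awk '
        BEGIN { first = 1; print "{" }
        {
            # Before every record except the first, finish the previous
            # line with a comma (print adds the newline).
            if (first) { first = 0 } else { print "," }
            # printf does not append a newline, so the line stays open.
            printf("%s", "\"" $2 "\":" $1)
        }
        END { print "\n}" }
    ' "$@"
}

# Example run on three of the records from the question.
printf '31 Stockholm\n42 Talin\n34 Helsinki\n' | records_to_json_flag
```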

Also, note that if you don't care about the aesthetics, JSON doesn't really require newlines between items, etc. You could just output one big line for the whole enchilada.
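
For the one-line variant, a sketch along the same lines (again with a made-up function name, `records_to_json_line`) only needs the newlines taken out of the separator and away from the braces:

```shell
# records_to_json_line: hypothetical helper name used only for this sketch.
records_to_json_line() {
    awk '
        BEGIN { printf "{" }
        # sep is empty before the first record and "," before the rest.
        { printf "%s\"%s\":%s", sep, $2, $1; sep = "," }
        END { print "}" }
    ' "$@"
}

# Example run on two of the records from the question.
printf '31 Stockholm\n42 Talin\n' | records_to_json_line
```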

danfuzz answered Oct 02 '22 01:10


You should really use a JSON parser, but here is how to do it with awk:

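BEGIN {
    print "{"
}
NR == 1 {
    s = "\"" $2 "\":" $1          # first record: no leading separator
    next
}
{
    s = s ",\n\"" $2 "\":" $1     # later records: prepend ",\n"
}
END {
    printf "%s\n}\n", s           # close the object and end with a newline
}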

Outputs:

{
"Stockholm":31,
"Talin":42,
"Helsinki":34,
"Moscow":24,
"Tokyo":15
}
Chris Seymour answered Oct 02 '22 01:10