Question

I have the following records:

31 Stockholm
42 Talin
34 Helsinki
24 Moscow
15 Tokyo

I want to convert them to JSON with AWK. This code:

#!/usr/bin/awk
BEGIN {
    print "{";
    FS=" ";
    ORS=",\n";
    OFS=":";
};

{    
    if ( !a[city]++ && NR > 1 ) {
        key = $2;
        value = $1;
        print "\"" key "\"", value;
    }
};

END {
    ORS="\n";
    OFS=" ";
    print "\b\b}";
};

Gives me this:

{
"Stockholm":31,
"Talin":42,
"Helsinki":34,
"Moscow":24,
"Tokyo":15, <--- I don't want this comma
}

The problem is the trailing comma on the last data line, which makes the JSON output invalid. How can I get this output instead:

{
"Stockholm":31,
"Talin":42,
"Helsinki":34,
"Moscow":24,
"Tokyo":15
}

Solution

Mind some feedback on your posted script?

#!/usr/bin/awk        # Without "-f" this shebang will not run the script, use "#!/usr/bin/awk -f". Also be aware that on Solaris this will be old, broken awk which you must never use
BEGIN {
    print "{";        # On this and every other line, the trailing semi-colon is a pointless null-statement, remove all of these.
    FS=" ";           # This is setting FS to the value it already has so remove it.
    ORS=",\n";
    OFS=":";
};

{
    if ( !a[city]++ && NR > 1 ) {      # awk consists of <condition>{<action>} segments so move this condition out to the condition part
                                       # also, you never populate a variable named "city" so `!a[city]++` won't behave sensibly.
        key = $2;
        value = $1;
        print "\"" key "\"", value;
    }
};

END {
    ORS="\n";                          # keep this one: the final print still appends ORS.
    OFS=" ";                           # no need to set OFS when the script will no longer use it.
    print "\b\b}";                     # why would you want to print a backspace???
};

so your original script should have been written as:

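#!/usr/bin/awk -f
BEGIN {
    print "{"
    ORS=",\n"
    OFS=":"
}

# Key the duplicate check on the city field; "city" was never assigned.
# Add "&& NR > 1" back only if the input starts with a header line.
!a[$2]++ {
    key = $2
    value = $1
    print "\"" key "\"", value
}

END {
    ORS="\n"    # print still appends ORS, so reset it before the closing brace
    print "}"
}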

Here's how I'd really write a script to convert your posted input to your posted output though:

$ cat file
31 Stockholm
42 Talin
34 Helsinki
24 Moscow
15 Tokyo
$
$ awk 'BEGIN{print "{"} {printf "%s\"%s\":%s",sep,$2,$1; sep=",\n"} END{print "\n}"}' file
{
"Stockholm":31,
"Talin":42,
"Helsinki":34,
"Moscow":24,
"Tokyo":15
}

OTHER TIPS

You have a couple of choices. An easy one would be to add the comma of the previous line as you are about to write out a new line:

  • Set a variable first = 1 in your BEGIN.

  • When about to print a line, check first. If it is 1, then just set it to 0. If it is 0 print out a comma and a newline:

    if (first) { first = 0; } else { print ","; }
    

    The point of this is to avoid putting an extra comma at the start of the list.

  • Use printf("%s", ...) instead of print ... so that you can avoid the newline when printing a record.

  • Add an extra newline before the close brace, as in: print "\n}";
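
The steps above can be sketched roughly as follows. This is only a minimal sketch, not the one true way to do it: the sample records from the question are piped in with printf so the snippet runs on its own, and the shell variable out is just an illustrative name. The awk variable first is the flag described in the list.

```shell
# Feed the sample records from the question to awk and capture the result.
out=$(printf '%s\n' '31 Stockholm' '42 Talin' '34 Helsinki' '24 Moscow' '15 Tokyo' |
awk '
BEGIN { print "{"; first = 1 }
{
    # Before every record except the first, finish the previous line with a comma.
    if (first) { first = 0 } else { print "," }
    # printf adds no newline, so the line stays open until we know what follows.
    printf("\"%s\":%s", $2, $1)
}
END { print "\n}" }
')

printf '%s\n' "$out"
```

Because the comma is written only when another record actually arrives, the last record is never followed by one.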

Also, note that if you don't care about the aesthetics, JSON doesn't really require newlines between items, etc. You could just output one big line for the whole enchilada.
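
For instance, a single-line variant might look like the sketch below. It assumes the same sample records, piped in with printf so the snippet is self-contained; the names sep and out are only illustrative.

```shell
# Same sample records as in the question, written out as one JSON line.
out=$(printf '%s\n' '31 Stockholm' '42 Talin' '34 Helsinki' '24 Moscow' '15 Tokyo' |
awk '
BEGIN { printf "{" }
{
    # sep is empty for the first record and a comma for every later one.
    printf "%s\"%s\":%s", sep, $2, $1
    sep = ","
}
END { print "}" }
')

printf '%s\n' "$out"
```

With the sample input this should print {"Stockholm":31,"Talin":42,"Helsinki":34,"Moscow":24,"Tokyo":15} on one line.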

You should really use a JSON library, but here is how to do it with awk:

BEGIN {
    print "{"    
}
NR==1{
    s= "\""$2"\":"$1
    next
}
{
    s=s",\n\""$2"\":"$1
}
END {
    printf "%s\n}\n", s
}

Outputs:

{
"Stockholm":31,
"Talin":42,
"Helsinki":34,
"Moscow":24,
"Tokyo":15
}

Why not use a JSON library? Don't force awk to do something it wasn't designed to do. Here is a solution using Python:

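import json

d = {}
with open("file") as f:
    for line in f:
        # Each record is "<number> <city>"; the city becomes the JSON key.
        val, key = line.split()
        d[key] = int(val)

# Python 3 syntax; sort_keys=True gives the alphabetical order shown below.
print(json.dumps(d, indent=0, sort_keys=True))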

This outputs:

{
"Helsinki": 34, 
"Moscow": 24, 
"Stockholm": 31, 
"Talin": 42, 
"Tokyo": 15
}
Licensed under: CC-BY-SA with attribution