Convert arbitrary output to json by column in the terminal?

I'd like to be able to pipe the output from any command line program to a command that converts it to json.

For example, this hypothetical program could accept target columns, a delimiter, and output field names:

# select columns 1 and 3 from the output and convert it to simple json
netstat -a | grep CLOSE_WAIT | convert_to_json 1,3 name,other

and would generate something like so:

  {"name": "tcp4", "other": "31"},
  {"name": "tcp4", "other": "0"} 

I'm looking for something that works for any program, not just netstat!

I'm open to installing any 3rd party tool/opensource project, and tend to run things on linux/osx - does not have to be a bash script solution, can be written in node, perl, python, etc.

EDIT: I'm of course willing to pass in any more info that'd be required to make it work, for example a delimiter or multiple delimiters - I'd just like to avoid explicit parsing in the command line, and have the tool do that.

Filtering STDIN to build json variable


Because the terminal is a very particular kind of interface, with monospaced fonts, and many tools are built to be read on it by a human, their output can be very difficult to parse.

netstat output is a good example:

Active UNIX domain sockets (servers and established)
Proto RefCnt Flags       Type       State         I-Node   Path
unix  2      [ ACC ]     STREAM     LISTENING     13947569 @/tmp/.X11-unix/X1
unix  2      [ ]         DGRAM                    8760     /run/systemd/notify
unix  2      [ ACC ]     SEQPACKET  LISTENING     8790     /run/udev/control

Some lines contain blank fields, so this output cannot simply be split on spaces.
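To make the problem concrete, here is a minimal Ruby sketch (the helper name naive_fields is only an illustration) that splits two rows of the sample above on whitespace. The row with an empty State column yields fewer tokens, so every later column shifts, and the Flags value is itself broken into several tokens:

```ruby
# Two rows from the netstat sample; the second has an empty State column.
FULL_ROW  = "unix  2      [ ACC ]     STREAM     LISTENING     13947569 @/tmp/.X11-unix/X1"
EMPTY_ROW = "unix  2      [ ]         DGRAM                    8760     /run/systemd/notify"

# Hypothetical helper: naive whitespace splitting, as awk or split(/\s+/) would do.
def naive_fields(line)
  line.strip.split(/\s+/)
end

[FULL_ROW, EMPTY_ROW].each do |row|
  fields = naive_fields(row)
  # The token count differs between rows, so column numbers are not stable.
  puts "#{fields.size} tokens: #{fields.inspect}"
end
```

For output like this, whitespace splitting is only reliable on the leading columns that are never empty.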

Because of this, the requested convert_to_json script is posted at the very bottom of this answer.

Simple space based splitting with awk

With awk, you get a fairly compact syntax:

netstat -an |
    awk '/CLOSE_WAIT/{
        printf "  { \42%s\42:\42%s\42,\42%s\42:\42%s\42},\n","name",$1,"other",$3
    }' |
    sed '1s/^/[\n/;$s/,$/\n]/'
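One caveat of building the JSON text by hand with printf: values are not escaped, so a field containing a double quote or a backslash produces invalid JSON. A minimal Ruby sketch of the difference, using a made-up field value and a hypothetical valid_json? helper:

```ruby
require 'json'

# Made-up field value containing characters that JSON must escape.
VALUE = '/tmp/my "odd" file'

# Hand-built, like the printf approach: the inner quotes end the string early.
HAND_BUILT = "{\"name\":\"#{VALUE}\"}"

# Built by a JSON library: quotes and backslashes are escaped.
LIB_BUILT = JSON.generate("name" => VALUE)

# Hypothetical helper: true when the text parses as JSON.
def valid_json?(text)
  JSON.parse(text)
  true
rescue JSON::ParserError
  false
end

puts "hand-built valid? #{valid_json?(HAND_BUILT)}"
puts "library-built valid? #{valid_json?(LIB_BUILT)}"
```

This is the main reason to prefer a JSON library as soon as the values are not fully under your control.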

Simple space based splitting with perl, but using json library

But this perl way is more flexible:

netstat -an | perl -MJSON::XS -ne 'push @out,{"name"=>$1,"other"=>$2} if /^(\S+)\s+\d+\s+(\d+)\s.*CLOSE_WAIT/;END{print encode_json(\@out)."\n";}'

or the same, split across several lines:

netstat -an |
    perl -MJSON::XS -ne '
        push @out,{"name"=>$1,"other"=>$2} if
            /^(\S+)\s+\d+\s+(\d+)\s.*CLOSE_WAIT/;
        END{print encode_json(\@out)."\n";}'

Or pretty-printed:

netstat -an | perl -MJSON::XS -ne '
    push @out,{"name"=>$1,"other"=>$2} if /^(\S+)\s+\d+\s+(\d+)\s.*CLOSE_WAIT/;
    END{$coder = JSON::XS->new->ascii->pretty->allow_nonref;
        print $coder->encode(\@out);}'

Finally, I like this version not based on regex:

netstat -an | perl -MJSON::XS -ne '
    do {
        my @line=split(/\s+/);
        push @out,{"name"=>$line[0],"other"=>$line[2]}
    } if /CLOSE_WAIT/;
    END{$coder = JSON::XS->new->ascii->pretty->allow_nonref;
        print $coder->encode(\@out);}'

You could also run the command from inside the perl script:

perl -MJSON::XS -e '
    open STDIN,"netstat -an|";
    my @out;
    while (<>){
        push @out,{"name"=>$1,"other"=>$2} if /^(\S+)\s+\d+\s+(\d+)\s.*CLOSE_WAIT/;
    };
    print encode_json \@out;'

This could become a basic prototype:

#!/usr/bin/perl -w

use strict;
use JSON::XS;
my $coder = JSON::XS->new->ascii->pretty->allow_nonref;

open STDIN,"netstat -naut|";
my @out;
my @fields;

my $searchre=":";
$searchre = shift @ARGV if @ARGV;

while (<>){
    # Header line: keep "Local Address" / "Foreign Address" as single names
    map { s/_/ /g;push @fields,$_; } split(/\s+/) if
        /^Proto.*State/ && s/\sAddr/_Addr/g;
    do {
        my @line=split(/\s+/);
        my %entry;
        for my $i (0..$#fields) {
            $entry{$fields[$i]}=$line[$i];
        };
        push @out,\%entry;
    } if /$searchre/;
}

print $coder->encode(\@out);

(Without an argument, this dumps the entire netstat -naut output; you can pass any search string as an argument, such as CLOSE or an IP address.)

Positional parameters, netstat2json.pl

This method could work with many tools other than netstat, with some adjustments:

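#!/usr/bin/perl -w
use strict;
use JSON::XS;
my $coder = JSON::XS->new->ascii->pretty->allow_nonref;
open STDIN,"netstat -nap|";
my ( $searchre ,@out,%fields)=( "[/:]" );
$searchre = shift @ARGV if @ARGV;
while (<>){
    next if /^Active\s.*\)$/;
    /^Proto.*State/ && do {
        # Header line: temporarily join two-word names ("Local Address",
        # "PID/Program name") so that they survive the split on spaces.
        s/\s(name|Addr)/_$1/g;
        my @head;
        map { s/_/ /g;push @head,$_; } split(/\s+/);
        s/_/ /g;
        %fields=();    # netstat prints one header per section
        for my $i (0..$#head) {
            my $crt=index($_,$head[$i]);
            my $next=-1;
            $next=index($_,$head[$i+1])-$crt-1 if $i < $#head;
            $fields{$head[$i]}=[$crt,$next];    # [start, length]
        }
        next;
    };
    do {
        my $line=$_;
        my %entry;
        for my $i (keys %fields) {
            my $crt=substr($line,$fields{$i}[0],$fields{$i}[1]);
            $crt=~s/^\s+|\s+$//g;    # trim surrounding spaces
            $entry{$i}=$crt;
        };
        push @out,\%entry;
    } if /$searchre/;
}
print $coder->encode(\@out);
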
  • find the header lines, Proto.*State (specific to netstat)
  • store the field names with their position and length
  • split each line by field position and length, then trim the spaces
  • dump the variable as a JSON string.
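
The steps above can also be sketched in Ruby. This is only an illustration under stated assumptions: the helper names column_layout and parse_fixed_width are made up, the column names are passed in explicitly rather than derived from the header, and the sample lines are rebuilt with the column widths of the netstat excerpt shown earlier.

```ruby
require 'json'

# Hypothetical helper: for each header name, record where the column starts
# and how wide it is (nil width means "to the end of the line").
def column_layout(header, names)
  starts = names.map { |name| header.index(name) }
  names.each_with_index.map do |name, i|
    width = i < names.size - 1 ? starts[i + 1] - starts[i] : nil
    [name, starts[i], width]
  end
end

# Hypothetical helper: cut one line at the recorded positions and trim spaces.
def parse_fixed_width(line, layout)
  layout.each_with_object({}) do |(name, start, width), row|
    field = width ? line[start, width] : line[start..-1]
    row[name] = field.to_s.strip
  end
end

# Sample data rebuilt with the column widths of the netstat excerpt.
ROW_FORMAT = "%-5s %-6s %-11s %-10s %-13s %-8s %s"
NAMES  = %w[Proto RefCnt Flags Type State I-Node Path]
HEADER = format(ROW_FORMAT, *NAMES)
LINES  = [
  format(ROW_FORMAT, "unix", "2", "[ ACC ]", "STREAM", "LISTENING", "13947569", "@/tmp/.X11-unix/X1"),
  format(ROW_FORMAT, "unix", "2", "[ ]", "DGRAM", "", "8760", "/run/systemd/notify")
]

LAYOUT = column_layout(HEADER, NAMES)
ROWS = LINES.map { |line| parse_fixed_width(line, LAYOUT) }
puts JSON.pretty_generate(ROWS)
```

Because each field is cut by position, the empty State column of the second row comes out as an empty string instead of shifting the following columns, and "[ ACC ]" stays in one piece.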

As before, this can be run with an argument:

./netstat2json.pl CLOS
[
   {
      "Local Address" : "",
      "State" : "CLOSE_WAIT",
      "Recv-Q" : "18",
      "Proto" : "tcp",
      "Send-Q" : "0",
      "Foreign Address" : "",
      "PID/Program name" : "-"
   },
   {
      "Recv-Q" : "1",
      "Local Address" : "::1:53816",
      "State" : "CLOSE_WAIT",
      "Send-Q" : "0",
      "PID/Program name" : "-",
      "Foreign Address" : "::1:631",
      "Proto" : "tcp6"
   }
]

And empty fields don't break the variable assignment:

./netstat2json.pl 1000.*systemd/notify
[
   {
      "Proto" : "unix",
      "I-Node" : "33378",
      "RefCnt" : "2",
      "Path" : "/run/user/1000/systemd/notify",
      "PID/Program name" : "-",
      "Type" : "DGRAM",
      "Flags" : "[ ]",
      "State" : ""
   }
]

Note: this modified version runs netstat with the -nap arguments to get the PID/Program name field.

If it is not run as root, you may get this output on STDERR:

(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)

You can avoid it

  • by running netstat2json.pl 2>/dev/null,
  • by running the script as root or with sudo, or
  • by editing the open STDIN line, changing "netstat -nap|" to "netstat -na|".
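
The first option works because the notice goes to STDERR while the data goes to STDOUT. A minimal Ruby sketch of capturing the two streams separately with Open3.capture3; the child process here is a stand-in written for the example (a tiny Ruby one-liner, not netstat itself):

```ruby
require 'open3'
require 'rbconfig'

# Stand-in child process: prints a notice on stderr and one data row on stdout.
CHILD = '$stderr.puts "(Not all processes could be identified)"; puts "tcp 1 0 CLOSE_WAIT"'

# capture3 returns the child's stdout, stderr and exit status separately.
OUT_TEXT, ERR_TEXT, STATUS = Open3.capture3(RbConfig.ruby, '-e', CHILD)

puts "data rows:  #{OUT_TEXT.lines.size}"
puts "diagnostic: #{ERR_TEXT.strip}"
```

Only OUT_TEXT would be handed to the JSON conversion; ERR_TEXT can be logged or dropped.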

convert_to_json.pl: a perl script to transform STDIN to JSON

Here is the convert_to_json.pl perl script, strictly as requested, to be run as: netstat -an | grep CLOSE | ./convert_to_json.pl 1,3 name,other

#!/usr/bin/perl -w

use strict;
use JSON::XS;
my $coder = JSON::XS->new->ascii->pretty->allow_nonref;

my (@fields,@pos,@out);

# Column numbers on the command line are 1-based; convert them to array indexes.
map {
    push @pos,1*$_-1
} split ",",shift @ARGV;

map {
    push @fields,$_
} split ",",shift @ARGV;

die "Number of fields doesn't match number of positions\n" if $#fields != $#pos;

while (<>) {
    my @line=split(/\s+/);
    my %entry;
    for my $i (0..$#fields) {
        $entry{$fields[$i]}=$line[$pos[$i]];
    };
    push @out,\%entry;
}
print $coder->encode(\@out);

Here's my Ruby version:

#! /usr/bin/env ruby
# Converts stdin columns to a JSON array of hashes
# Installation : Save as convert_to_json, make it executable and put it somewhere in PATH. Ruby must be installed
# Examples :
# netstat -a | grep CLOSE_WAIT | convert_to_json 1,3 name,other
# ls -l | convert_to_json
# ls -l | convert_to_json 6,7,8,9
# ls -l | convert_to_json 6,7,8,9 month,day,time,name
# convert_to_json 1,2 time,value ";" < some_file.csv
# http://stackoverflow.com/questions/40246134/convert-arbitrary-output-to-json-by-column-in-the-terminal

require 'json'

script_name = File.basename(__FILE__)
syntax = "Syntax : command_which_outputs_columns | #{script_name} column1_id,column2_id,...,columnN_id column1_name,column2_name,...,columnN_name delimiter"

if $stdin.tty? or $stdin.closed? then
  # Nothing is piped in: print the usage line
  $stderr.puts syntax
else
  if ARGV[2]
    delimiter = ARGV[2]
    $stderr.puts "#{script_name} : Using #{delimiter} as delimiter"
  else
    delimiter = /\s+/
  end

  # Column ids are 1-based on the command line
  column_ids = (ARGV[0] || "").split(',').map{|column_id| column_id.to_i-1}
  column_names = (ARGV[1] || "").split(',')

  results = []
  $stdin.each do |stdin_line|
    if column_ids.empty?
      values = stdin_line.strip.split(delimiter)
    else
      values = stdin_line.strip.split(delimiter).values_at(*column_ids)
    end
    line_hash = {}
    values.each_with_index do |value, i|
      # Fall back to column<N> when no name was given
      column_name = column_names[i] || "column#{(column_ids[i] || i)+1}"
      line_hash[column_name] = value
    end
    results << line_hash
  end
  puts JSON.pretty_generate(results)
end

It works as defined in your example:

netstat -a | grep CLOSE_WAIT | convert_to_json 1,3 name,other
[
  {
    "name": "tcp",
    "other": "0"
  },
  {
    "name": "tcp6",
    "other": "0"
  }
]

As a bonus, you can

  • omit all parameters: every column will be converted to JSON
  • omit the names: columns will be called column1, column2, ...
  • choose a missing column: its value will be null
  • define a delimiter as the third parameter (the default is whitespace)
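
These behaviours mostly fall out of Array#values_at, which returns nil for an index past the end of the array, plus a fallback name. A minimal sketch with a hypothetical pick_columns helper (not part of the script above) that takes 1-based column ids:

```ruby
require 'json'

# Hypothetical helper: select 1-based columns from one line of text.
# Missing columns become nil; unnamed columns are called column<N>.
def pick_columns(line, ids, names = [], delimiter = /\s+/)
  values = line.strip.split(delimiter).values_at(*ids.map { |id| id - 1 })
  row = {}
  values.each_with_index do |value, i|
    row[names[i] || "column#{ids[i]}"] = value
  end
  row
end

puts JSON.generate(pick_columns("a b c", [1, 12]))
# prints {"column1":"a","column12":null}

puts JSON.generate(pick_columns("1;3", [1, 2], %w[time value], ";"))
# prints {"time":"1","value":"3"}
```

The nil value is what JSON.generate turns into null for a column that does not exist in the input.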

Other examples:

netstat -a | grep CLOSE_WAIT | ./convert_to_json
# [
#   {
#     "column1": "tcp",
#     "column2": "1",
#     "column3": "0",
#     "column4": "",
#     "column5": "",
#     "column6": "CLOSE_WAIT"
#   },
#   {
#     "column1": "tcp6",
#     "column2": "1",
#     "column3": "0",
#     "column4": "ip6-localhost:50293",
#     "column5": "ip6-localhost:ipp",
#     "column6": "CLOSE_WAIT"
#   }
# ]

netstat -a | grep CLOSE_WAIT | ./convert_to_json 1,3
# [
#   {
#     "column1": "tcp",
#     "column3": "0"
#   },
#   {
#     "column1": "tcp6",
#     "column3": "0"
#   }
# ]

ls -l | tail -n3 | convert_to_json 6,7,8,9 month,day,time,name
# [
#   {
#     "month": "Oct",
#     "day": "27",
#     "time": "10:35",
#     "name": "test.dot"
#   },
#   {
#     "month": "Nov",
#     "day": "2",
#     "time": "14:27",
#     "name": "uniq.rb"
#   },
#   {
#     "month": "Nov",
#     "day": "2",
#     "time": "14:27",
#     "name": "utf8_nokogiri.rb"
#   }
# ]

# NOTE: ls -l uses the 8th column for year, not time, for older files :
ls --full-time -t /usr/share/doc | tail -n3 | ./convert_to_json 6,7,9 yyyymmdd,time,name
[
  {
    "yyyymmdd": "2013-10-21",
    "time": "15:15:20.000000000",
    "name": "libbz2-dev"
  },
  {
    "yyyymmdd": "2013-10-10",
    "time": "16:27:32.000000000",
    "name": "zsh"
  },
  {
    "yyyymmdd": "2013-10-03",
    "time": "18:52:45.000000000",
    "name": "manpages-dev"
  }
]

ls -l | tail -n3 | convert_to_json 9,12
# [
#   {
#     "column9": "test.dot",
#     "column12": null
#   },
#   {
#     "column9": "uniq.rb",
#     "column12": null
#   },
#   {
#     "column9": "utf8_nokogiri.rb",
#     "column12": null
#   }
# ]

convert_to_json 1,2 time,value ";" < some_file.csv
# convert_to_json : Using ; as delimiter
# [
#   {
#     "time": "1",
#     "value": "3"
#   },
#   {
#     "time": "2",
#     "value": "5"
#   }
# ]
