How do I sum all the numbers from the output of jq?

Tags:

jq

I have a command that outputs a series of numbers, and I would like to sum them.

The command looks like this:

$(hadoop fs -ls -R /reports/dt=2018-08-27 | grep _stats.json | awk '{print $NF}' | xargs hadoop fs -cat | jq '.duration')

It recursively lists everything under /reports/dt=2018-08-27, keeps only the _stats.json files, passes them to hadoop fs -cat, and uses jq to extract only .duration from the JSON. In the end I get a result like this:

1211789 1211789 373585 495379 1211789

But I would like the command to sum all those numbers, giving 4504331.

toy asked Aug 28 '18 20:08




5 Answers

The simplest solution is the add filter:

jq '[.duration] | add'

The [ brackets ] are needed around the value to sum because add sums the values of an array, not a stream. (For stream summation, you would need a more sophisticated solution, e.g. using reduce, as detailed in other answers.)


Depending on the exact format of the input, you may need some preprocessing to get this right.

E.g. for the sample input in Charles Duffy’s answer, either

  • use inputs (note that -n is needed; without it jq reads the first object itself and inputs yields only the rest):

    jq -n '[inputs.duration] | add' <<< "$sample_data"
    
  • or slurp (-s) and iterate (.[]) / map:

    jq -s '[.[].duration] | add' <<< "$sample_data"
    jq -s 'map(.duration) | add' <<< "$sample_data"
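
To make the array-versus-stream point concrete, here is a minimal sketch that reuses the sample data from Charles Duffy’s answer. It assumes jq is installed and that the input is one JSON object per line; the variable names per_object and total are only illustrative.

```shell
# Sample data borrowed from Charles Duffy's answer: one JSON object per line.
sample_data='{"duration": 1211789}
{"duration": 1211789}
{"duration": 373585}
{"duration": 495379}
{"duration": 1211789}'

# Without -s or inputs, the filter runs once per object, so each
# one-element array is "summed" on its own and nothing is combined.
per_object=$(printf '%s\n' "$sample_data" | jq '[.duration] | add')

# With -s, all objects are first collected into a single array,
# which add can then sum in one go.
total=$(printf '%s\n' "$sample_data" | jq -s 'map(.duration) | add')

echo "$per_object"   # five separate numbers, one per line
echo "$total"        # 4504331
```

So if a plain add prints the numbers back unchanged, the input is most likely a stream of objects rather than one array.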
    
törzsmókus answered Oct 19 '22 03:10


Another option (and one that works even if not all your durations are integers) is to make your jq code do the work:

sample_data='{"duration": 1211789}
{"duration": 1211789}
{"duration": 373585}
{"duration": 495379}
{"duration": 1211789}'

jq -n '[inputs | .duration] | reduce .[] as $num (0; .+$num)' <<<"$sample_data"

...properly emits as output:

4504331

Replace the <<<"$sample_data" with a pipeline on stdin as desired.
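
As a variation on the above, reduce can also consume the inputs stream directly, without first collecting the values into an array. The sketch below assumes jq is installed and uses made-up non-integer durations (hypothetical values, not from the question) to show that fractional numbers sum the same way; it feeds the data through a pipe rather than a here-string.

```shell
# Hypothetical durations, including non-integers, one JSON object per line.
sample_data='{"duration": 1.5}
{"duration": 2.25}
{"duration": 10}'

# -n stops jq from consuming the first object itself, so inputs sees all of them.
# reduce folds each .duration into a running total that starts at 0.
total=$(printf '%s\n' "$sample_data" |
  jq -n 'reduce (inputs | .duration) as $num (0; . + $num)')

echo "$total"   # 13.75
```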

Charles Duffy answered Oct 19 '22 03:10


awk to the rescue!

$ ... | awk '{sum+=$0} END{print sum}'

4504331
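
A self-contained version of the same idea, as a sketch: the five numbers from the question stand in for the jq output, and sum + 0 is a small assumption-free tweak so that empty input prints 0 instead of a blank line. The variable names are only illustrative.

```shell
# The five durations from the question, one per line, standing in for jq's output.
durations='1211789
1211789
373585
495379
1211789'

# awk adds every line ($0) to sum and prints the result after the last line.
# "sum + 0" forces a numeric result, so empty input yields 0 rather than "".
total=$(printf '%s\n' "$durations" | awk '{ sum += $0 } END { print sum + 0 }')
empty_total=$(printf '' | awk '{ sum += $0 } END { print sum + 0 }')

echo "$total"        # 4504331
echo "$empty_total"  # 0
```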
karakfa answered Oct 19 '22 02:10


You can just use add now.

jq '.duration | add'
Timmmm answered Oct 19 '22 04:10


For clarity and generality, it might be worthwhile defining sigma(s) to add a stream of numbers:

... | jq -n '
  def sigma(s): reduce s as $x(0;.+$x); 
  sigma(inputs | .duration)'
peak answered Oct 19 '22 03:10