I have a command, and I would like to sum all the numbers in its output.
The command looks like this:
$(hadoop fs -ls -R /reports/dt=2018-08-27 | grep _stats.json | awk '{print $NF}' | xargs hadoop fs -cat | jq '.duration')
So it's going to list all the folders in /reports/dt=2018-08-27, keep only the _stats.json files, pass the output of hadoop fs -cat through jq, and get only .duration from the JSON. In the end I get a result like this:
1211789 1211789 373585 495379 1211789
But I would like the command to sum all those numbers together, giving 4504331.
The simplest solution is the add filter:
jq '[.duration] | add'
The [ brackets ] are needed around the value to sum because add sums the values of an array, not a stream. (For stream summation, you would need a more sophisticated solution, e.g. using reduce, as detailed in other answers.)
Depending on the exact format of the input, you may need some preprocessing to get this right.
E.g. for the sample input in Charles Duffy's answer, either use inputs (note that -n is needed to avoid jq swallowing the first line of input):
jq -n '[inputs.duration] | add' <<< "$sample_data"
or slurp (-s) and iterate (.[]) / map:
jq -s '[.[].duration] | add' <<< "$sample_data"
jq -s 'map(.duration) | add' <<< "$sample_data"
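To make the array-versus-stream point concrete, here is a small sketch that runs the plain filter and the slurped filter on the same data. It assumes one JSON object per line, as in the sample input from the other answer; the variable names per_object and total are only for illustration.

```shell
# Sample input: one JSON object per line.
sample_data='{"duration": 1211789}
{"duration": 1211789}
{"duration": 373585}
{"duration": 495379}
{"duration": 1211789}'

# Without -n or -s the filter runs once per object, so each one-element
# array is "summed" on its own and nothing is added across objects.
per_object=$(printf '%s\n' "$sample_data" | jq '[.duration] | add')

# With -s all objects are first collected into a single array,
# so add sees every duration at once.
total=$(printf '%s\n' "$sample_data" | jq -s 'map(.duration) | add')

echo "$per_object"   # five lines, one per input object
echo "$total"        # 4504331
```

The data is fed through printf and a pipe rather than a here-string, so the same shape works when the input comes from another command.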
Another option (and one that works even if not all your durations are integers) is to make your jq code do the work:
sample_data='{"duration": 1211789}
{"duration": 1211789}
{"duration": 373585}
{"duration": 495379}
{"duration": 1211789}'
jq -n '[inputs | .duration] | reduce .[] as $num (0; .+$num)' <<<"$sample_data"
...properly emits as output:
4504331
Replace the <<<"$sample_data" with a pipeline on stdin as desired.
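As a rough sketch of the pipeline form, the snippet below uses local files and cat as a stand-in for hadoop fs -cat. The temporary directory, the file names, and the durations 10 and 32 are made up for the example; only the jq filter comes from the answer.

```shell
# Hypothetical local stand-in for the HDFS report files.
tmpdir=$(mktemp -d)
printf '{"duration": 10}\n' > "$tmpdir/a_stats.json"
printf '{"duration": 32}\n' > "$tmpdir/b_stats.json"

# cat plays the role of `hadoop fs -cat`; jq reads the concatenated
# objects from stdin and sums their durations.
total=$(cat "$tmpdir"/*_stats.json |
  jq -n '[inputs | .duration] | reduce .[] as $num (0; . + $num)')

echo "$total"   # 42
rm -rf "$tmpdir"
```

In the pipeline from the question, this filter would take the place of the final jq '.duration' stage.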
awk to the rescue!
$ ... | awk '{sum+=$0} END{print sum}'
4504331
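Here is a self-contained version of the same idea, assuming jq has already printed one number per line. The durations variable stands in for that output, and sum + 0 is a small addition so that empty input prints 0 instead of a blank line.

```shell
# Stand-in for the jq output: one number per line.
durations='1211789
1211789
373585
495379
1211789'

# Add up the first field of every line; sum + 0 forces a numeric result.
total=$(printf '%s\n' "$durations" | awk '{ sum += $1 } END { print sum + 0 }')

# With no input lines at all, the END block still runs and prints 0.
empty_total=$(printf '' | awk '{ sum += $1 } END { print sum + 0 }')

echo "$total"        # 4504331
echo "$empty_total"  # 0
```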
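You can just use add now. Note that add expects an array as its input, so this form only works when .duration is itself an array of numbers:
jq '.duration | add'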
For clarity and generality, it might be worthwhile defining sigma(s) to add a stream of numbers:
... | jq -n '
def sigma(s): reduce s as $x(0;.+$x);
sigma(inputs | .duration)'
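A quick way to try sigma without the Hadoop pipeline is to feed it a couple of objects by hand. The two non-integer durations below are invented for the example.

```shell
# Two made-up objects with non-integer durations, one per line.
sample_data='{"duration": 1.5}
{"duration": 2.25}'

# sigma folds any stream of numbers into a running total starting at 0.
total=$(printf '%s\n' "$sample_data" | jq -n '
  def sigma(s): reduce s as $x (0; . + $x);
  sigma(inputs | .duration)')

echo "$total"   # 3.75
```

Because sigma takes a stream rather than an array, no intermediate array has to be built before summing.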