
Sum durations in bash

Tags: bash, awk, perl

I am extracting the execution times of various processes from their respective log files into a single file. That file looks similar to the following (it may have hundreds of entries):

1:00:01.11
2:2.20
1.02

The first line is hours:minutes:seconds, the second line is minutes:seconds, and the third is just seconds.

I want to sum all entries to get a total execution time. How can I achieve this in bash? If not in bash, can you provide some examples in another scripting language for summing these durations?

zatka asked Feb 20 '17 20:02


2 Answers

To complement Matt Jacob's elegant perl solution with a (POSIX-compliant) awk solution:

awk -F: '{ n=0; for(i=NF; i>=1; --i) secs += $i * 60 ^ n++ } END { print secs }' file

With the sample input, this outputs (the sum of all time spans in seconds):

3724.33

See the section below for how to format this value as a time span, similar to the input (01:02:04.33).

Explanation:

  • -F: splits the input lines into fields by :, so that the resulting fields ($1, $2, ...) represent the hour, minute, and seconds components individually.

  • n=0; for(i=NF; i>=1; --i) secs += $i * 60 ^ n++ resets the exponent n for each line, then enumerates the fields in reverse order (first seconds, then minutes, then hours, if defined; NF is the number of fields) and multiplies each field by the appropriate power of 60 (60^0 = 1 for seconds, 60^1 = 60 for minutes, 60^2 = 3600 for hours) to yield a value in seconds, which is added to the variable secs cumulatively across lines.

  • END { print secs } is executed after all lines have been processed and simply prints the cumulative value in seconds.
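
To make the loop concrete, here is a small sketch that prints the value in seconds for each input line instead of only the grand total. The function name per_line_seconds is made up for illustration and is not part of the solution above; it reuses the same reverse-field loop, but resets secs on every line.

```shell
# Hypothetical helper (an illustration, not part of the original answer):
# print each input line followed by its own value in seconds.
per_line_seconds() {
  awk -F: '{
    secs = 0; n = 0                       # reset per line, unlike the summing version
    for (i = NF; i >= 1; --i)             # last field (seconds) first
      secs += $i * 60 ^ n++               # weights: 1, 60, 3600
    printf "%s = %.2f\n", $0, secs
  }' "$@"                                 # reads files if given, else stdin
}

printf '%s\n' 1:00:01.11 2:2.20 1.02 | per_line_seconds
# 1:00:01.11 = 3601.11
# 2:2.20 = 122.20
# 1.02 = 1.02
```

Adding up the three per-line values gives the 3724.33 shown above.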


Formatting the output as a time span:

Custom output formatting must be used:

awk -F: '
  { n=0; for(i=NF; i>=1; --i) secs += $i * 60 ^ n++ }
  END { 
    hours   = int(secs / 3600)
    minutes = int((secs - hours * 3600) / 60)
    secs    = secs % 60
    printf "%02d:%02d:%05.2f\n", hours, minutes, secs
  }
' file

The above yields (the equivalent of 3724.33 seconds):

01:02:04.33

The END { ... } block splits the total number of seconds accumulated in secs back into hours, minutes, and seconds, and outputs the result with appropriate formatting of the components using printf.

The reason that utilities such as date and GNU awk's (nonstandard) date-formatting functions cannot be used to format the output is twofold:

  • The standard time format specifier %H wraps around at 24 hours, so if the cumulative time span exceeds 24 hours, the output would be incorrect.

  • Fractional seconds would be lost (the granularity of Unix time stamps is whole seconds).
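
By contrast, the manual arithmetic from the END block above has neither limitation. As a rough sketch, assuming you wrap that arithmetic in a shell function (the name fmt_span is only illustrative), a total of more than 24 hours keeps counting hours and keeps its fractional part:

```shell
# Hypothetical helper: format a number of seconds as HH:MM:SS.ss without
# wrapping the hours at 24 (same arithmetic as the END block above).
fmt_span() {
  awk -v secs="$1" 'BEGIN {
    hours   = int(secs / 3600)
    minutes = int((secs - hours * 3600) / 60)
    secs    = secs % 60                   # remainder keeps the fraction
    printf "%02d:%02d:%05.2f\n", hours, minutes, secs
  }'
}

fmt_span 3724.33   # the sample total: 01:02:04.33
fmt_span 90000.5   # 25 hours plus half a second: 25:00:00.50
```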

mklement0 answered Nov 05 '22 21:11


The mostly-readable full perl script:

use strict;
use warnings;

my $seconds = 0;

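while (<DATA>) {
    # Split on ':' and reverse, so index 0 is seconds, 1 is minutes, 2 is hours.
    my @fields = reverse(split(/:/));

    for my $i (0 .. $#fields) {
        # Weight each field by 60 ** $i: 1 for seconds, 60 for minutes, 3600 for hours.
        $seconds += $fields[$i] * 60 ** $i;
    }
}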

print "$seconds\n";

__DATA__
1:00:01.11
2:2.20
1.02

Or, the barely-readable one-liner version:

$ perl -F: -wane '@F = reverse(@F); $seconds += $F[$_] * 60 ** $_ for 0 .. $#F; END { print "$seconds\n" }' times.log

Output:

3724.33

In both cases, we split each line on the H:M:S separator : and then reverse the array so that we can process the fields from right to left. To get the total time in seconds, we multiply each field by the power of 60 that matches its position: seconds by 60 ** 0 (1), minutes by 60 ** 1 (60), and hours by 60 ** 2 (3600).

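If you want the result in H:M:S format instead of raw seconds, strftime() from the POSIX core module makes it easy, as long as the total stays below 24 hours (%H wraps around at 24) and you don't need the fractional seconds (they are dropped):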

use POSIX qw(strftime);
print strftime('%H:%M:%S', gmtime($seconds)), "\n";

Output:

01:02:04
Matt Jacob answered Nov 05 '22 21:11