
Explain Perl code to display a number of bytes in KB, MB, GB etc

Tags:

perl

Given a number of bytes, this function formats it as "bytes", "KB", "MB", or "GB". What I don't understand is this portion:

$_->[1], $_->[0]

Isn't what's being passed to map just an array of hashes? If so, how can there be indexes 0 and 1?

sub fmt {
    my $bytes = shift;

    return (
        sort { length $a <=> length $b                         } 
        map  { sprintf '%.3g%s', $bytes/1024**$_->[1], $_->[0] } 
        [" bytes"=>0],[KB=>1],[MB=>2],[GB=>3]
    )[0];
}
user740521 asked May 09 '16 19:05


2 Answers

That is one awful piece of code. Someone's showing off.

The list passed to map is not an array of hashes but a list of anonymous arrays:

[ " bytes" => 0 ], [ KB => 1 ], [ MB => 2 ], [ GB => 3 ]

While the fat comma operator => is most often seen in hash literals, that's not all it's good for. It is identical to an ordinary comma , except that a bareword left-hand operand is implicitly quoted. Written with plain commas, the list would be

[ ' bytes', 0 ], [ 'KB', 1 ], [ 'MB', 2 ], [ 'GB', 3 ]
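
To see why the 0 and 1 indexes work, here is a minimal sketch (the names @pairs and $pair are mine, not from the question) showing that => inside square brackets builds an ordinary two-element array reference, whichever comma is used:

```perl
use strict;
use warnings;

# Each [...] creates an anonymous array. The => is just a comma that quotes
# a bareword on its left, so no hash is involved here.
my @pairs = ( [ KB => 1 ], [ 'MB', 2 ] );

for my $pair (@pairs) {
    # ref() reports ARRAY, and the elements are reached with ->[index]
    printf "%s: suffix=%s power=%d\n", ref($pair), $pair->[0], $pair->[1];
}
```

Inside the map block, $_ plays the role of $pair, so $_->[0] is the suffix and $_->[1] is the power of 1024.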

The list passed to map is a number of anonymous arrays, each one containing a suffix string and the power of 1024 that it corresponds to. map turns each pair into a formatted string, and the return statement simply picks the shortest of those strings.

Here's the same function with the result of the intermediate map statement expanded into a separate array @variations, which I dump using Data::Dump to show what it's doing:

use strict;
use warnings 'all';
use feature 'say';

use Data::Dump;

say fmt(987 * 1024**2);

sub fmt {
        my $bytes = shift;

        my @variations = map { sprintf '%.3g%s', $bytes/1024 ** $_->[1], $_->[0] }
            [ " bytes" => 0 ],
            [ KB => 1 ],
            [ MB => 2 ],
            [ GB => 3 ];

        dd \@variations;

        return ( sort { length $a <=> length $b } @variations ) [0];
}

Output:

["1.03e+009 bytes", "1.01e+006KB", "987MB", "0.964GB"]
987MB

I normally use something similar to this. The antics with sprintf are to make sure that fractions of a byte are never displayed: plain byte counts are formatted with %.0f, and everything else with %.3g.

sub fmt2 {
    my ($n) = @_;
    my @suffix = ( '', qw/ K M G T P E / );

    my $i = 0;
    until ( $n < 1024 or $i == $#suffix ) {
        $n /= 1024;
        ++$i;
    }

    sprintf $i ? '%.3g%sB' : '%.0f%sB', $n, $suffix[$i];
}
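
As a quick check of the boundary behaviour, the sketch below repeats fmt2 unchanged so that it runs on its own and calls it with a few values around the 1024 threshold. The sample inputs are my own choice, not from the answer.

```perl
use strict;
use warnings;

# Copy of fmt2 from above, repeated so this snippet is self-contained.
sub fmt2 {
    my ($n) = @_;
    my @suffix = ( '', qw/ K M G T P E / );

    my $i = 0;
    # Divide by 1024 until the value fits, or we run out of suffixes.
    until ( $n < 1024 or $i == $#suffix ) {
        $n /= 1024;
        ++$i;
    }

    # Index 0 means plain bytes: print them without a fractional part.
    sprintf $i ? '%.3g%sB' : '%.0f%sB', $n, $suffix[$i];
}

print fmt2($_), "\n" for 0, 1023, 1024, 1536, 987 * 1024**2;
# prints 0B, 1023B, 1KB, 1.5KB and 987MB, one per line
```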
Borodin answered Nov 27 '22 02:11


With a tiny bit of math, this can be done without any iteration or cleverly constructed arrays:

my @si_prefix = ('', qw( K M G T P E Z Y ));
sub fmt {
  my $bytes = shift or return '0B';
  my $pow = int log(abs $bytes)/log(1024);
  return sprintf('%3.3g%sB', $bytes / (1024 ** $pow), $si_prefix[$pow]);
}

We can easily determine the largest power of 1024 that does not exceed $bytes by using the logarithm base change rule, log1024($bytes) = log($bytes) / log(1024), and truncating the result with int.
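
The function above indexes @si_prefix directly with the computed power. As a hedged sketch of one way to guard the edges, the hypothetical variant below (fmt_clamped and @prefix are my names, not part of the answer) clamps the exponent to the valid index range. It also uses '%.3g' rather than '%3.3g', which is my own choice to avoid left-padding short numbers with spaces. It does not address possible floating-point rounding of the log ratio at exact powers of 1024.

```perl
use strict;
use warnings;
use List::Util qw(min max);

my @prefix = ( '', qw( K M G T P E Z Y ) );

# Hypothetical variant of fmt: same logarithm trick, but the exponent is
# clamped to 0..$#prefix so that a tiny fractional value (negative exponent)
# or a value beyond the last prefix cannot select a wrong or missing prefix.
sub fmt_clamped {
    my $bytes = shift or return '0B';

    # log base 1024 via the base change rule, truncated toward zero
    my $pow = int( log( abs $bytes ) / log(1024) );
    $pow = max( 0, min( $pow, $#prefix ) );

    return sprintf '%.3g%sB', $bytes / 1024**$pow, $prefix[$pow];
}

print fmt_clamped($_), "\n" for 0.5, 1, 1536, 5 * 1024**3, 1024**10;
```

With the clamp in place, a value such as 1024**10 is reported in the largest available unit instead of reading past the end of the prefix list.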

Just for fun, I ran Benchmark::cmpthese using the code from the question, @Borodin's fmt2, and my version:

Benchmarking 1B
                 Rate    fmt_orig fmt_borodin         fmt
fmt_orig     245700/s          --        -76%        -84%
fmt_borodin 1030928/s        320%          --        -34%
fmt         1562500/s        536%         52%          --

Benchmarking 7.45GB
                 Rate    fmt_orig fmt_borodin         fmt
fmt_orig     224215/s          --        -66%        -84%
fmt_borodin  653595/s        192%          --        -54%
fmt         1428571/s        537%        119%          --

Benchmarking 55.5EB
                 Rate    fmt_orig fmt_borodin         fmt
fmt_orig     207469/s          --        -57%        -83%
fmt_borodin  487805/s        135%          --        -60%
fmt         1219512/s        488%        150%          --
Ben Grimm answered Nov 27 '22 00:11