
File path into JSON data structure

Tags: json, perl

I'm doing a disk space report that uses File::Find to collect cumulative sizing in a directory tree.

What I get (easily) from File::Find is the directory name.

e.g.:

/path/to/user/username/subdir/anothersubdir/etc

I'm running File::Find to collect sizes beneath:

/path/to/user/username

And build a cumulative size report of the directory and each of the subdirectories.

What I've currently got is:

while ( $dir_tree ) {
   $results{$dir_tree} += $blocks * $block_size;
   my @path_arr = split ( "/", $dir_tree ); 
   pop ( @path_arr );
   $dir_tree = join ( "/", @path_arr ); 
}

(And yes, I know that's not very nice.)

The purpose of doing this is so that when I stat each file, I add its size to the current node and to each parent node in the tree.

This is sufficient to generate:

username,300M
username/documents,150M
username/documents/excel,50M
username/documents/word,40M
username/work,70M
username/fish,50M
username/some_other_stuff,30M

But I'd now like to turn that into JSON more like this:

{
    "name" : "username",
    "size" : "307200",
    "children" : [
        {
            "name" : "documents",
            "size" : "153750",
            "children" : [
                {
                    "name" : "excel",
                    "size" : "51200"
                },
                {
                    "name" : "word",
                    "size" : "81920"
                }
            ]
        }
    ]
}

That's because I'm intending to do a D3 visualisation of this structure, loosely based on the D3 Zoomable Circle Pack.

So my question is this: what is the neatest way to collate my data so that I have cumulative (and ideally non-cumulative) sizing information while populating a hash hierarchically?

I was thinking in terms of a 'cursor' approach (and using File::Spec this time):

use File::Spec;
my $data   = {};
my $cursor = $data;
foreach my $element ( File::Spec->splitdir($File::Find::dir) ) {
   # Step into the child node (creating it if needed), then add the size to it.
   $cursor = $cursor->{$element} //= {};
   $cursor->{size} += $blocks * $block_size;
}

Although... that's not quite creating the data structure I'm looking for, not least because we basically have to search by hash key to do the 'rolling up' part of the process.

Is there a better way of accomplishing this?

Edit - more complete example of what I have already:

#!/usr/bin/env perl

use strict;
use warnings;

use File::Find;
use Data::Dumper;

my $block_size = 1024;

sub collate_sizes {
    my ( $results_ref, $starting_path ) = @_;

    # Drop the last path component so that keys start at the user's directory name.
    $starting_path =~ s,/\w+$,/,;
    if ( -f $File::Find::name ) {
        print "$File::Find::name isafile\n";

        # Element 12 of the list returned by stat() is the allocated block count.
        my $blocks = ( stat($File::Find::name) )[12];

        my $dir_tree = $File::Find::dir;

        # \Q...\E stops characters in the path being treated as regex syntax.
        $dir_tree =~ s|^\Q$starting_path\E||;

        # Add this file's size to its directory and to every parent directory.
        while ($dir_tree) {
            print "Updating $dir_tree\n";
            $results_ref->{$dir_tree} += $blocks * $block_size;
            my @path_arr = split( "/", $dir_tree );
            pop(@path_arr);
            $dir_tree = join( "/", @path_arr );
        }
    }
}

my @users = qw ( user1 user2 );

foreach my $user (@users) {
    my $path = "/home/$user";
    print "$path\n";
    my %results;
    File::Find::find(
        {   wanted   => sub { collate_sizes( \%results, $path ) },
            no_chdir => 1
        },
        $path
    );
    print Dumper \%results;

    #would print this to a file in the homedir - to STDOUT for convenience
    foreach my $key ( sort { $results{$b} <=> $results{$a} } keys %results ) {
       print "$key => $results{$key}\n";
    }
}

And yes, I know this isn't portable, and that it does a few somewhat nasty things. Part of what I'm doing here is trying to improve on that. (But currently it's a Unix-based homedir structure, so that's fine.)

Sobrique asked Sep 03 '15 11:09


1 Answer

If you do your own dir scanning instead of using File::Find, you naturally get the right structure.

sub _scan {
   my ($qfn, $fn) = @_;
   my $node = { name => $fn };

   lstat($qfn)
      or die $!;

   my $size   = -s _;
   my $is_dir = -d _;

   if ($is_dir) {
      my @child_fns = do {
         opendir(my $dh, $qfn)
            or die $!;

         grep !/^\.\.?\z/, readdir($dh);
      };

      my @children;
      for my $child_fn (@child_fns) {
         my $child_node = _scan("$qfn/$child_fn", $child_fn);
         $size += $child_node->{size};
         push @children, $child_node;
      }

      $node->{children} = \@children;
   }

   $node->{size} = $size;
   return $node;
}

Rest of the code:

#!/usr/bin/perl

use strict;
use warnings;    
no warnings 'recursion';

use File::Basename qw( basename );
use JSON           qw( encode_json );

...    

sub scan { _scan($_[0], basename($_[0])) }

print(encode_json(scan($ARGV[0] // '.')));
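
The question also asked for non-cumulative sizes and for a way to populate the hash hierarchically from paths. The following is only a minimal sketch of one way to do that, not part of the code above: it assumes you have already collected a flat hash of bytes held directly in each directory (for example by adding each file's size to $File::Find::dir only). The names %own_size, build_tree, finalise and the own_size key are made up for illustration, and JSON::PP (in core since Perl 5.14) is used instead of JSON so that the example runs without extra modules.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical input: bytes held directly in each directory (non-cumulative).
my %own_size = (
    'username'                 => 100,
    'username/documents'       => 10,
    'username/documents/excel' => 50,
    'username/documents/word'  => 40,
    'username/work'            => 70,
);

# Fold flat "a/b/c" keys into nested nodes. While building, children are kept
# in a hash ("kids") so that each path component is a direct lookup.
sub build_tree {
    my ($sizes) = @_;
    my %root = ( kids => {} );    # synthetic root above all top-level names

    for my $path ( sort keys %$sizes ) {
        my $node = \%root;
        for my $name ( split m{/}, $path ) {
            $node = $node->{kids}{$name}
                //= { name => $name, own_size => 0, kids => {} };
        }
        $node->{own_size} += $sizes->{$path};
    }

    my $kids = $root{kids};
    return [ map { finalise( $kids->{$_} ) } sort keys %$kids ];
}

# Turn the lookup hash into a sorted "children" array and roll the sizes up,
# so "size" is cumulative while "own_size" stays non-cumulative.
sub finalise {
    my ($node)   = @_;
    my $kids     = delete $node->{kids};
    my @children = map { finalise( $kids->{$_} ) } sort keys %$kids;

    $node->{size} = $node->{own_size};
    $node->{size} += $_->{size} for @children;
    $node->{children} = \@children if @children;
    return $node;
}

my $trees = build_tree( \%own_size );
print JSON::PP->new->canonical->pretty->encode( $trees->[0] );
```

Keeping the children in a hash during construction avoids searching an array by name for every path component; the conversion to the array form that D3 expects happens once, at the end, in the same pass that sums the sizes.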
ikegami answered Oct 07 '22 13:10
