
Command to list all file types and their average size in a directory

Tags: bash, unix, macos

I am working on a specific project where I need to work out the make-up of a large extract of documents so that we have a baseline for performance testing.

Specifically, I need a command that can recursively go through a directory and, for each file type, inform me of the number of files of that type and their average size.

I've looked at solutions like "Unix find average file size", "How can I recursively print a list of files with filenames shorter than 25 characters using a one-liner?" and https://unix.stackexchange.com/questions/63370/compute-average-file-size, but none of them quite gets me what I'm after.

Mardoz asked Dec 26 '22 13:12


2 Answers

This du and awk combination should work for you:

du -a mydir/ | awk -F'[.[:space:]]' '/\.[a-zA-Z0-9]+$/ { a[$NF]+=$1; b[$NF]++ }
     END{for (i in a) print i, b[i], (a[i]/b[i])}' 
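
To see what the command prints, it can be fed a few hand-written lines in the format of du -a. This is only a sketch: the paths and sizes are made up, and avg_by_ext is a hypothetical wrapper name around the same awk program. Note that du reports disk usage in blocks rather than file sizes in bytes, so the averages come out in du's units.

```shell
#!/usr/bin/env bash
# avg_by_ext is a hypothetical wrapper name; the awk program is the one from the answer.
avg_by_ext() {
  awk -F'[.[:space:]]' '/\.[a-zA-Z0-9]+$/ { a[$NF]+=$1; b[$NF]++ }
     END{for (i in a) print i, b[i], (a[i]/b[i])}'
}

# Made-up lines in du -a format: size, a tab, then the path.
# The last line is a directory without an extension, so the pattern skips it.
printf '8\tmydir/a.txt\n16\tmydir/b.txt\n4\tmydir/sub/c.jpg\n28\tmydir/sub\n' |
  avg_by_ext | sort
# jpg 1 4
# txt 2 12
```

Each output line is the extension, the number of matching entries, and their average size. Because du -a also lists directories, a directory whose name ends in a dot followed by letters or digits (for example v1.2) would be counted as if it were a file type.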

anubhava answered May 15 '23 02:05


Here is something to get you started. The script below prints each file's size and path, one file per line.

#!/usr/bin/env bash

DIR=ABC
cd "$DIR" || exit 1   # Stop if the directory does not exist

# IFS= and -r keep leading spaces and backslashes in file names intact
find . -type f | while IFS= read -r line
do
  # size=$(stat --format="%s" "$line")    # For systems with GNU stat
  size=$(perl -e 'print -s $ARGV[0],"\n"' "$line")  # Command provided by @Mark Setchell; I have no OS X system to test it on.
  echo "$size $line"
done

Sample output:

123 ./a.txt
23 ./fds/afdsf.jpg

The rest is your homework: from the output above, it should be easy to derive each file type and its average size.
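
One possible way to finish that step, as a minimal sketch rather than part of the original answer: read the "size path" lines and group them by extension in awk. The function name summarize_by_ext and the "(none)" bucket for files without an extension are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarize "size path" lines by file extension.
summarize_by_ext() {
  awk '
    {
      size = $1
      # The path is everything after the first space, so spaces in names survive.
      path = substr($0, index($0, " ") + 1)
      # Keep only the file name (drop the directories).
      n = split(path, parts, "/")
      name = parts[n]
      # Extension = text after the last dot; dotfiles and names without a dot go to "(none)".
      ext = "(none)"
      if (match(name, /\.[^.]+$/) && RSTART > 1) ext = substr(name, RSTART + 1)
      total[ext] += size
      count[ext]++
    }
    END {
      # Print: extension, number of files, average size in bytes.
      for (ext in total) printf "%s %d %.1f\n", ext, count[ext], total[ext] / count[ext]
    }
  '
}

# The two sample lines from above plus one made-up line.
printf '%s\n' '123 ./a.txt' '23 ./fds/afdsf.jpg' '77 ./b.txt' | summarize_by_ext | sort
# jpg 1 23.0
# txt 2 100.0
```

In practice the output of the script above would be piped into summarize_by_ext instead of the printf line. File names containing newlines are not handled by this sketch.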

BMW answered May 15 '23 02:05