 

Fastest Way To Calculate Directory Sizes

What is the best and fastest way to calculate directory sizes? For example, we will have the following structure:

/users
      /a
      /b
      /c
      /...

We need the output to be per user directory:

a = 1224KB
b = 3533KB
c = 3324KB
...

We plan on having tens, maybe even hundreds of thousands of directories under /users. The following shell command works:

du -cms /users/a | grep total | awk '{print $1}'

But we would have to call it N times. The whole point is that the output, each user's directory size, will be stored in our database. Also, we would love to have it update as frequently as possible, but without tying up all the resources on the server. Is it even possible to have it recalculate the users' directory sizes every minute? How about every 5 minutes?

Now that I think about it some more, would it make sense to use Node.js? That way we could calculate the directory sizes and insert them into the database in one pass. We could do that in PHP or Python as well, but we are not sure it would be as fast.
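Whatever language ends up driving it, the shape of the job is simple enough to sketch in plain shell; a hypothetical version, where the table name user_sizes and the target database are assumptions, not anything the question specifies:

```shell
#!/bin/sh
# Hypothetical cron job: emit one INSERT statement per user directory.
# The table user_sizes(user, mb) and the mysql target are placeholders.
dump_sizes() {
    du -sm "$1"/* 2>/dev/null | while read -r mb path; do
        printf "INSERT INTO user_sizes (user, mb) VALUES ('%s', %s);\n" \
            "$(basename "$path")" "$mb"
    done
}

dump_sizes /users          # in cron: dump_sizes /users | mysql mydb
```

A crontab entry like */5 * * * * would then run it every 5 minutes; whether that is affordable depends almost entirely on how long du takes to traverse the tree.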

Thanks.

Justin asked Nov 29 '10




5 Answers

Why not just:

du -sm /users/*

(The slowest part is still likely to be du traversing the filesystem to calculate the sizes, though.)
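If you want the output in the question's "name = size" shape, du's output is easy to reshape; a small sketch (sizes in whole MB here, since -m rounds up):

```shell
# print "user = <size>MB" for every directory under the given root
report_sizes() {
    du -sm "$1"/* 2>/dev/null | while read -r mb path; do
        echo "$(basename "$path") = ${mb}MB"
    done
}

report_sizes /users
```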

caf answered Oct 01 '22


What do you need this information for? If it's only for reminding the users that their home directories are too big, you should add quota limits to the filesystem. You can set the quota to 1000 GB if you just want the numbers without really limiting disk usage.

The numbers are usually kept up to date whenever anything on the disk is accessed. The only downside is that they tell you how large the files owned by a particular user are, rather than how large the files below that user's home directory are. But maybe you can live with that.
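For reference, the quota route looks roughly like this. These are privileged admin commands that only work on a quota-enabled filesystem, and the user name and limits shown are placeholders:

```shell
# give a user an effectively unlimited quota so only accounting happens
# usage: setquota -u <user> <block-soft> <block-hard> <inode-soft> <inode-hard> <fs>
setquota -u alice 0 1048576000 0 0 /home

# report current per-user usage for the filesystem
repquota -u /home
```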

Roland Illig answered Oct 02 '22


I think what you are looking for is:

du -cm --max-depth=1 /users | awk '{user = substr($2,7,300);
                                    ans = user ": " $1;
                                    print ans}'

The magic number 7 strips away the substring /users/, and 300 is just an arbitrarily large length (awk is not one of my best languages =D, but I am guessing that part is not going to be written in awk anyway). It's faster since you don't grep for the total and the loop stays inside du. I bet it can be done faster, but this should be fast enough.
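The hard-coded offset can be avoided by letting awk split the path itself; a variant of the same idea (GNU du assumed, for --max-depth):

```shell
# same report, but take the last path component instead of a substr() offset
dir_report() {
    du -cm --max-depth=1 "$1" 2>/dev/null |
        awk '{n = split($2, parts, "/"); print parts[n] ": " $1}'
}

dir_report /users
```

The output still ends with a "total: N" line, because -c appends a grand total just as in the original command.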

HaskellElephant answered Oct 01 '22


If you have multiple cores, you can run the du commands in parallel.

For example (running from the folder you want to examine):

parallel du -sm ::: *

ls | xargs -P4 du -sm

[The number after the -P option sets how many du processes xargs runs at once.]

Yoni Gerufi answered Oct 02 '22


Not that slow, and it will show you the folder sizes: du -sh /* > total.size.files.txt
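If the goal is spotting the heaviest users rather than logging every one, sorting the same output numerically works well; a quick sketch:

```shell
# largest directories first: numeric reverse sort on du's size column, top 10
top_dirs() {
    du -sm "$1"/* 2>/dev/null | sort -rn | head -n 10
}

top_dirs /users
```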

morpheus answered Oct 02 '22