Count the number of folders in a directory and subdirectories

I've got a script that accurately tells me how many files are in a directory and its subdirectories. However, I'd also like to find out how many folders there are within the same directory and its subdirectories.

My current script:

import os, getpass
from os.path import join, getsize
user = 'Copy of ' + getpass.getuser()
path = "C://Documents and Settings//" + user + "./"
folder_counter = sum([len(folder) for r, d, folder in os.walk(path)])
file_counter = sum([len(files) for r, d, files in os.walk(path)])
print ' [*] ' + str(file_counter) + ' Files were found and ' + str(folder_counter) + ' folders'

This code prints: [*] 147 Files were found and 147 folders.

This means that folder_counter isn't counting the right elements. How can I correct this so that folder_counter is correct?

Luke Willmer asked Apr 21 '15 10:04


2 Answers

Python 2.7 solution

For a single directory, you can also do:

import os
print len(os.walk('dir_name').next()[1])

which does not walk the whole tree: os.walk is a generator, so only the first (top-level) result is produced. It returns the number of directories directly inside the 'dir_name' directory.

Python 3.x solution

Since many people just want a quick and easy solution, I have edited my answer to include the exact working code for Python 3.x.

In Python 3.x, the built-in next() function replaces the .next() method. Thus, the above snippet becomes:

import os
print(len(next(os.walk('dir_name'))[1]))

where dir_name is the directory whose subdirectories you want to count.
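
Two caveats with the snippets above: they count only the directories directly inside dir_name, not those nested deeper, and next() raises StopIteration if the path does not exist, because os.walk ignores listing errors by default and then yields nothing. A minimal Python 3 sketch that handles the second case; the function name count_immediate_subdirs is hypothetical, chosen only for illustration:

```python
import os


def count_immediate_subdirs(path):
    """Count directories directly inside `path` (not recursive)."""
    # os.walk is a generator; next() pulls only the first
    # (dirpath, dirnames, filenames) tuple, i.e. the top level.
    # The default of None covers a missing or unreadable path,
    # for which os.walk yields nothing at all.
    top = next(os.walk(path), None)
    return 0 if top is None else len(top[1])
```

Returning 0 for a missing path is a design choice made for this sketch; raising an error instead may suit your case better.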

Xxxo answered Oct 21 '22 13:10



I think you want something like:

import os

files = folders = 0

for _, dirnames, filenames in os.walk(path):
  # ^ this idiom means "we won't be using this value"
    files += len(filenames)
    folders += len(dirnames)

print "{:,} files, {:,} folders".format(files, folders)

Note that this iterates over os.walk only once, rather than twice as in your code, which will make it noticeably quicker on paths containing lots of files and directories. Running it on my Python directory gives me:

30,183 files, 2,074 folders

which exactly matches what the Windows folder properties view tells me.
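
For Python 3, the same single-pass idea could be wrapped in a function, roughly as sketched below. The name count_tree is hypothetical. Note that the root directory itself is not included in the folder count, and that os.walk does not follow symbolic links to directories by default.

```python
import os


def count_tree(root):
    """Return (files, folders) found anywhere below `root`."""
    files = folders = 0
    # A single walk: each step yields the path of one directory,
    # the subdirectory names in it, and the file names in it.
    for _, dirnames, filenames in os.walk(root):
        files += len(filenames)
        folders += len(dirnames)
    return files, folders
```

The result can then be printed in the same style as above, e.g. print("{:,} files, {:,} folders".format(*count_tree(path))).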


Note that your current code calculates the same number twice because the only change is renaming one of the returned values from the call to os.walk:

folder_counter = sum([len(folder) for r, d, folder in os.walk(path)])
                        # ^ here          # ^ and here
file_counter = sum([len(files) for r, d, files in os.walk(path)])
                      # ^ vs. here     # ^ and here

Despite that name change, you're counting the same value: in both cases you're using the third of the three returned values. Python functions do not know what names, if any, their return values will be assigned to (you could do print list(os.walk(path)), for example, with no names at all), and their behaviour certainly won't change because of it. Per the documentation, os.walk yields a three-tuple (dirpath, dirnames, filenames) for each directory it visits, and the names you use for it, e.g. whether:

for foo, bar, baz in os.walk(...):

or:

for all_three in os.walk(...):

won't change that.
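
To make this concrete, here is a small Python 3 sketch that sums the third element of each tuple in three ways: under two different names and by index. The helper name third_element_totals is made up for this illustration. All three sums come out the same, because only the position in the tuple matters.

```python
import os


def third_element_totals(path):
    """Sum the third item of each os.walk tuple in three ways."""
    by_name_folder = sum(len(folder) for r, d, folder in os.walk(path))
    by_name_files = sum(len(files) for r, d, files in os.walk(path))
    by_index = sum(len(all_three[2]) for all_three in os.walk(path))
    # All three count file names; dirnames is at position 1, not 2.
    return by_name_folder, by_name_files, by_index
```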

jonrsharpe answered Oct 21 '22 14:10
