While fixing a user's answer on AskUbuntu, I discovered a small issue. The code itself is straightforward: use os.walk to recursively sum the sizes of all files in a directory.
But it breaks on symlinks:
$ python test_code2.py $HOME
Traceback (most recent call last):
  File "test_code2.py", line 8, in <module>
    space += os.stat(os.path.join(subdir, f)).st_size
OSError: [Errno 2] No such file or directory: '/home/xieerqi/.kde/socket-eagle'
The question, then, is: how do I tell Python to ignore those files and avoid summing them?
Solution:
As suggested in the comments, I've added an os.path.isfile() check, and now it works and gives the correct size for my home directory:
$> cat test_code2.py
#! /usr/bin/python
import os
import sys
space = 0L  # L means "long" (Python 2 only; in Python 3 write plain 0)
for subdir, dirs, files in os.walk(sys.argv[1]):
    for f in files:
        file_path = os.path.join(subdir, f)
        # isfile() follows symlinks, so a broken link is skipped here
        if os.path.isfile(file_path):
            space += os.stat(file_path).st_size
sys.stdout.write("Total: {:d}\n".format(space))
$> python test_code2.py $HOME
Total: 76763501905
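To see why the os.path.isfile() check helps, here is a minimal Python 3 sketch. The helper names `total_size` and `make_demo_tree`, and the temporary directory layout, are assumptions made up for illustration; they are not part of the original script. os.path.isfile() follows symlinks and returns False when the target does not exist, so a dangling link is skipped before os.stat() is ever called on it.

```python
import os
import tempfile


def total_size(root):
    """Sum the sizes of all files under root, skipping paths isfile() rejects."""
    total = 0
    for subdir, dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(subdir, name)
            # isfile() follows symlinks: False for a link whose target is missing
            if os.path.isfile(path):
                total += os.stat(path).st_size
    return total


def make_demo_tree():
    """Create a temp dir holding one 5-byte file and one dangling symlink.

    The layout is hypothetical and exists only for this demonstration.
    """
    root = tempfile.mkdtemp()
    with open(os.path.join(root, "data.bin"), "wb") as fh:
        fh.write(b"12345")
    os.symlink(os.path.join(root, "missing-target"),
               os.path.join(root, "dangling"))
    return root


if __name__ == "__main__":
    demo = make_demo_tree()
    link = os.path.join(demo, "dangling")
    print("islink:", os.path.islink(link))   # the link itself exists
    print("isfile:", os.path.isfile(link))   # its target does not
    print("Total: {:d}".format(total_size(demo)))
```

Note that this check skips anything that is not a regular file after following links, which is a slightly broader filter than "broken symlinks only".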
As already mentioned by Antti Haapala in a comment, the script does not break on symlinks, but on broken symlinks. One way to avoid that, taking the existing script as a starting point, is to use try/except:
#! /usr/bin/python2
import os
import sys
space = 0L  # L means "long" (Python 2 only; in Python 3 write plain 0)
for root, dirs, files in os.walk(sys.argv[1]):
    for f in files:
        fpath = os.path.join(root, f)
        try:
            space += os.stat(fpath).st_size
        except OSError:
            # os.stat() follows symlinks and fails if the target is missing
            print("could not read "+fpath)
sys.stdout.write("Total: {:d}\n".format(space))
As a side effect, it gives you information on possible broken links.
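The same try/except idea can be sketched as a reusable Python 3 function that returns the unreadable paths instead of printing them. The name `total_size_reporting` and the demo file names are hypothetical, chosen only for this example.

```python
import os
import tempfile


def total_size_reporting(root):
    """Return (total_bytes, unreadable_paths) for all files under root."""
    total = 0
    unreadable = []
    for subdir, dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(subdir, name)
            try:
                total += os.stat(path).st_size
            except OSError:
                # Dangling symlinks raise OSError here, as do files that
                # were removed after os.walk() listed them.
                unreadable.append(path)
    return total, unreadable


if __name__ == "__main__":
    # Hypothetical demo tree: one 3-byte file plus one broken link.
    root = tempfile.mkdtemp()
    with open(os.path.join(root, "a.txt"), "wb") as fh:
        fh.write(b"abc")
    os.symlink(os.path.join(root, "nope"), os.path.join(root, "broken"))

    total, bad = total_size_reporting(root)
    print("Total: {:d}".format(total))
    for path in bad:
        print("could not read " + path)
```

Collecting the failures in a list keeps the summing logic separate from the reporting, so the caller can decide whether to print, log, or ignore them.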
Yes, os.path.isfile is the way to go. However, the following version may be more memory efficient.
space = 0
for subdir, dirs, files in os.walk(sys.argv[1]):
    paths = (os.path.join(subdir, f) for f in files)
    # += so each directory adds to the running total instead of overwriting it
    space += sum(os.stat(path).st_size for path in paths if os.path.isfile(path))