I would like to calculate the "size on disk" of a file in Python. Therefore I would like to determine the cluster size of the file system where the file is stored.
How do I determine the cluster size in Python? Alternatively, a built-in method that calculates the "size on disk" would also work.
I looked at os.path.getsize but it returns the file size in bytes, not taking the FS's block size into consideration.
I am hoping that this can be done in an OS independent way...
Use the os.path.getsize('file_path') function to check the file size. Pass the file name or file path to this function as an argument.
The only fixed relationship is that the cluster size is a multiple of the sector (block) size; beyond that, the two are unrelated. Cluster sizes depend more on the size of the volume and on what is optimal for a given file system structure. Typical cluster sizes are in the kilobyte range (4 KB is a common default, and 64 KB is not unusual on large volumes), not megabytes.
The shutil.disk_usage() function returns disk usage statistics for the given path. It returns a named tuple with the attributes total, used and free, each expressed in bytes.
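As a quick illustration of the shutil.disk_usage() call described above, here is a minimal sketch. The helper name usage_summary is made up for this example. Note that these figures describe the whole volume, so they do not by themselves give the cluster size or a single file's size on disk.

```python
import shutil


def usage_summary(path="."):
    """Return total, used and free space (in bytes) for the volume holding path.

    usage_summary is a hypothetical helper name used only for this example.
    """
    usage = shutil.disk_usage(path)  # named tuple: (total, used, free)
    return {"total": usage.total, "used": usage.used, "free": usage.free}


if __name__ == "__main__":
    print(usage_summary("."))
```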
The cluster size is the allocation unit that the filesystem uses, and is what causes fragmentation - I'm sure you know about that. On a moderately sized ext3 filesystem, this is usually 4096 bytes, but you can check that with dumpe2fs.
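Once the allocation unit is known, a rough "size on disk" is the file size rounded up to a whole number of clusters. The sketch below assumes a simple model in which every file occupies whole clusters; it ignores sparse files, compression and file systems that store very small files inside their metadata, so treat the result as an estimate. The function name and the 4096-byte default are assumptions for illustration.

```python
def size_on_disk(file_size, cluster_size=4096):
    """Estimate allocated bytes by rounding file_size up to whole clusters.

    Hypothetical helper: assumes every file occupies whole clusters.
    """
    if file_size < 0 or cluster_size <= 0:
        raise ValueError("file_size must be >= 0 and cluster_size must be > 0")
    clusters = -(-file_size // cluster_size)  # ceiling division on integers
    return clusters * cluster_size


if __name__ == "__main__":
    # A 4097-byte file needs two 4096-byte clusters.
    print(size_on_disk(4097))
```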
On UNIX/Linux platforms, use Python's built-in os.statvfs. On Windows, unless you can find a third-party library that does it, you'll need to use ctypes to call the Win32 function GetDiskFreeSpace, like this:
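import ctypes
# GetDiskFreeSpaceW writes 32-bit DWORD values; c_ulong is 32 bits wide on Windows.
sectorsPerCluster = ctypes.c_ulong(0)
bytesPerSector = ctypes.c_ulong(0)
rootPathName = ctypes.c_wchar_p(u"C:\\")
ctypes.windll.kernel32.GetDiskFreeSpaceW(
    rootPathName,
    ctypes.pointer(sectorsPerCluster),
    ctypes.pointer(bytesPerSector),
    None,  # number of free clusters (not needed here)
    None,  # total number of clusters (not needed here)
)
# Cluster size in bytes = sectors per cluster * bytes per sector.
print(sectorsPerCluster.value, bytesPerSector.value)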
Note that ctypes only became part of the Python stdlib in 2.5.
I put this sort of thing in a function that first checks whether the UNIX variant is present, and falls back to ctypes if (presumably because it's running on Windows) not. That way, if Python ever does implement statvfs on Windows, it will just use that.
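A function along those lines might look like the sketch below. It is not the exact function from this answer: the name get_cluster_size, the choice of f_frsize (with f_bsize as a fallback) and the way the drive root is derived from the path are all assumptions made for illustration.

```python
import ctypes
import os


def get_cluster_size(path):
    """Return the allocation unit size, in bytes, of the volume holding path.

    Hypothetical helper: tries os.statvfs first and falls back to Win32.
    """
    if hasattr(os, "statvfs"):
        st = os.statvfs(path)
        # f_frsize is the fundamental block size; fall back to f_bsize if it is 0.
        return st.f_frsize or st.f_bsize

    # Fallback, presumably Windows: query the root of the drive holding path.
    root = os.path.splitdrive(os.path.abspath(path))[0] + "\\"
    sectors_per_cluster = ctypes.c_ulong(0)
    bytes_per_sector = ctypes.c_ulong(0)
    ok = ctypes.windll.kernel32.GetDiskFreeSpaceW(
        ctypes.c_wchar_p(root),
        ctypes.byref(sectors_per_cluster),
        ctypes.byref(bytes_per_sector),
        None,
        None,
    )
    if not ok:
        raise ctypes.WinError()
    return sectors_per_cluster.value * bytes_per_sector.value


if __name__ == "__main__":
    print(get_cluster_size("."))
```

Checking for os.statvfs with hasattr, rather than testing the platform name, keeps the behaviour described above: if a Python build provides statvfs, that branch is used automatically.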