 

Determine cluster size of file system in Python

I would like to calculate the "size on disk" of a file in Python. Therefore I would like to determine the cluster size of the file system where the file is stored.

How do I determine the cluster size in Python? Another built-in method that calculates the "size on disk" would also work.

I looked at os.path.getsize but it returns the file size in bytes, not taking the FS's block size into consideration.

I am hoping that this can be done in an OS-independent way.

Philip Fourie asked Mar 22 '10 14:03





1 Answer

On UNIX/Linux platforms, use Python's built-in os.statvfs, which reports the file system's block size (the f_frsize and f_bsize fields of its result). On Windows, unless you can find a third-party library that does it, you'll need to use ctypes to call the Win32 function GetDiskFreeSpace, like this:

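import ctypes
from ctypes import wintypes

# GetDiskFreeSpaceW writes 32-bit DWORDs, so use DWORD rather than c_ulonglong.
sectorsPerCluster = wintypes.DWORD(0)
bytesPerSector = wintypes.DWORD(0)
rootPathName = ctypes.c_wchar_p("C:\\")

# The last two arguments (free and total cluster counts) are optional.
ok = ctypes.windll.kernel32.GetDiskFreeSpaceW(
    rootPathName,
    ctypes.byref(sectorsPerCluster),
    ctypes.byref(bytesPerSector),
    None,
    None,
)
if not ok:
    raise ctypes.WinError()

# Cluster size in bytes = sectors per cluster * bytes per sector.
clusterSize = sectorsPerCluster.value * bytesPerSector.value
print(sectorsPerCluster.value, bytesPerSector.value, clusterSize)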

Note that ctypes has been part of the Python standard library since version 2.5.

I put this sort of thing in a function that first checks whether os.statvfs is available and falls back to ctypes if it is not (presumably because the code is running on Windows). That way, if Python ever does implement statvfs on Windows, the function will just use that.
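
A minimal sketch of such a wrapper might look like the following. The function names cluster_size and size_on_disk are hypothetical, not from any library. The sketch assumes that f_frsize is the allocation unit on UNIX-like systems, and that a platform without os.statvfs is Windows.

```python
import os


def cluster_size(path):
    """Return the allocation unit size, in bytes, of the file system holding path."""
    if hasattr(os, "statvfs"):
        # UNIX-like systems: f_frsize is the fundamental block size;
        # fall back to f_bsize if a file system reports 0 for it.
        st = os.statvfs(path)
        return st.f_frsize or st.f_bsize

    # Assumed to be Windows from here on: ask Win32 via ctypes.
    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    # GetDiskFreeSpaceW wants a root directory such as "C:\\".
    drive = os.path.splitdrive(os.path.abspath(path))[0]
    root = drive + "\\"
    sectors_per_cluster = wintypes.DWORD(0)
    bytes_per_sector = wintypes.DWORD(0)
    ok = kernel32.GetDiskFreeSpaceW(
        ctypes.c_wchar_p(root),
        ctypes.byref(sectors_per_cluster),
        ctypes.byref(bytes_per_sector),
        None,
        None,
    )
    if not ok:
        raise ctypes.WinError(ctypes.get_last_error())
    return sectors_per_cluster.value * bytes_per_sector.value


def size_on_disk(file_size, cluster):
    """Round file_size up to a whole number of clusters."""
    return -(-file_size // cluster) * cluster
```

With these, size_on_disk(os.path.getsize(p), cluster_size(p)) gives an estimate of the "size on disk" for a path p. It is only an estimate: sparse or compressed files, and very small files that some file systems store inside their metadata, can occupy less space than the rounded-up figure suggests.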

DNS answered Oct 05 '22 20:10
