Is there a simple way to read all the allocated clusters of a given file with Python? The usual Python read() only lets me read up to the logical size of the file (which is reasonable, of course), but I want to read all of the file's clusters, including the slack space.
For example, I have a file called "test.bin" that is 1234 bytes in logical size, but because my file system uses clusters of 4096 bytes, the file has a physical size of 4096 bytes on disk. That is, there are 2862 bytes of file slack.
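(For reference, the two sizes can be compared from Python itself with os.stat; a minimal sketch, noting that st_blocks counts 512-byte units on Linux regardless of the filesystem's cluster size:)

import os

st = os.stat("test.bin")
logical = st.st_size            # 1234: where read() stops
physical = st.st_blocks * 512   # 4096: bytes actually allocated on disk
print(physical - logical)       # 2862 bytes of slack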
I'm not sure where to even start with this problem. I know I can read the raw disk from /dev/sda, but I don't know how to locate the clusters of interest. Of course, matching file names to sectors on disk is the whole point of having a filesystem, but I don't know enough about how Python interacts with the filesystem to figure this out... yet. Any help or pointers to references would be greatly appreciated.
Assuming an ext2/3/4 filesystem, as you guessed yourself, your best bet is probably to:
use a wrapper (like this one) around debugfs to get the list of blocks associated with a given file:
debugfs: blocks ./f.txt
2562
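If you'd rather not pull in a wrapper library, debugfs can also be driven directly through subprocess via its -R option. A minimal sketch (the helper name file_blocks is mine, and debugfs typically needs root to open a real device):

import subprocess

def file_blocks(device, path_in_fs):
    """Return the block numbers debugfs reports for a file.

    device: block device or image file, e.g. '/tmp/test.img'
    path_in_fs: path of the file relative to the filesystem root
    """
    result = subprocess.run(
        ["debugfs", "-R", "blocks " + path_in_fs, device],
        capture_output=True, text=True, check=True,
    )
    return [int(b) for b in result.stdout.split()]

# file_blocks('/tmp/test.img', '/f.txt') -> [2562]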
read back that block (or those blocks) from the block device / image file:
>>> f = open('/tmp/test.img','rb')
>>> f.seek(2562*4*1024)
10493952
>>> data = f.read(4*1024)
>>> data
b'Some data\n\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00...'
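Putting both steps together, extracting just the slack comes down to reading the file's last block and keeping everything past the logical end of file. A sketch building on the hypothetical file_blocks() above; the 4 KiB block size is an assumption, so verify it for your filesystem (e.g. with tune2fs -l or dumpe2fs):

BLOCK_SIZE = 4 * 1024  # assumed ext block size; confirm with tune2fs -l <device>

def read_slack(device, path_in_fs, logical_size):
    # Last block allocated to the file (file_blocks() defined above).
    last = file_blocks(device, path_in_fs)[-1]
    with open(device, "rb") as f:
        f.seek(last * BLOCK_SIZE)
        block = f.read(BLOCK_SIZE)
    # Offset of the logical end-of-file inside that block; a file that
    # exactly fills its last block has no slack.
    used = logical_size % BLOCK_SIZE or BLOCK_SIZE
    return block[used:]

# read_slack('/tmp/test.img', '/f.txt', 1234) -> 2862 bytes of slack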
Not very fancy, but it works. Note that you don't have to mount the filesystem for any of these steps. This is especially important for forensic applications where you cannot trust the content of the disk in any way and/or are not allowed per regulation to mount the disk image.