I want to recover deleted files from a disk in Java without using native libraries.
I'm doing this using Java 8.
As far as I know, deleted files remain on the disk until they are overwritten.
I have direct access to the disk on Linux and I can read raw data, but how can I parse deleted files on an ext4 or NTFS file system, for example?
Thanks.
Recovering deleted files requires knowledge of how the underlying file system is implemented, so you have a bit of reading to do before you can get anywhere.
In theory, YES, you can definitely do this in pure Java; you just need to find out how to read data from a raw disk, bypassing the file system. On a Unix system this is simple: open the device node as a file (you'll need root permissions) and just read. On Windows there is probably a similar process; at worst you'll have to create a helper library in C or C++ to read the data for you.
Once you get access to the raw data, look up how files are stored in your particular file system and start looking for similar patterns in the data that you read.
This is not something you can do in an afternoon though.
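As a first step in "looking for patterns", here is a minimal sketch that guesses the file system from the first bytes of a partition. It relies on two on-disk facts: an NTFS boot sector carries the OEM ID "NTFS    " at byte offset 3, and an ext2/3/4 superblock starts at byte offset 1024 with the little-endian magic number 0xEF53 at offset 56 inside it. The class name FsProbe and the method detect are made up for this illustration, and this only identifies the file system; it does not parse any file metadata.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: guesses the file system from the start of a partition.
class FsProbe {
    static final int EXT_SUPERBLOCK_OFFSET = 1024; // superblock starts 1024 bytes in
    static final int EXT_MAGIC_OFFSET = 56;        // s_magic field inside the superblock
    static final int EXT_MAGIC = 0xEF53;           // shared by ext2, ext3 and ext4

    /** head should hold at least the first 1082 bytes of the partition. */
    static String detect(byte[] head) {
        // NTFS: 8-byte OEM ID at offset 3 of the boot sector.
        if (head.length >= 11) {
            String oem = new String(head, 3, 8, StandardCharsets.US_ASCII);
            if (oem.equals("NTFS    ")) {
                return "NTFS";
            }
        }
        // ext2/3/4: 16-bit little-endian magic number in the superblock.
        int magicPos = EXT_SUPERBLOCK_OFFSET + EXT_MAGIC_OFFSET;
        if (head.length >= magicPos + 2) {
            int magic = ByteBuffer.wrap(head)
                    .order(ByteOrder.LITTLE_ENDIAN)
                    .getShort(magicPos) & 0xFFFF;
            if (magic == EXT_MAGIC) {
                return "ext2/3/4";
            }
        }
        return "unknown";
    }
}
```

Once you know which file system you are dealing with, the real work starts: walking the inode tables (ext4) or the MFT (NTFS) according to their on-disk layout documentation.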
Update: How to bypass the file system.
On a Unix system you can read from a partition or volume like this:
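// needs java.io.FileInputStream and java.io.InputStream; run as root
try (InputStream sda1 = new FileInputStream("/dev/sda1")) {
    int firstByte = sda1.read(); // first byte of the partition, or -1 at end of stream
}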
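Reading one byte at a time from the start gets tedious quickly, since file system structures live at specific offsets. A rough sketch of reading an arbitrary range with a FileChannel (available in Java 8) could look like this. RawReader and readAt are names invented for this example; the device path you pass in (for instance /dev/sda1) and the permission to open it are assumptions about your setup.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical helper: reads a byte range from a device node or image file.
class RawReader {
    /** Returns up to length bytes starting at offset; fewer if the end is reached. */
    static byte[] readAt(Path device, long offset, int length) throws IOException {
        try (FileChannel channel = FileChannel.open(device, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(length);
            long position = offset;
            // A single read may return fewer bytes than requested, so loop.
            while (buffer.hasRemaining()) {
                int n = channel.read(buffer, position);
                if (n < 0) {
                    break; // end of device or file
                }
                position += n;
            }
            return Arrays.copyOf(buffer.array(), buffer.position());
        }
    }
}
```

The same method works on a disk image file, which is a safer thing to experiment on than a live device: copy the partition to an image first and read from the copy.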
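On Windows you would read from \\.\PhysicalDrive0 (that is the device name the CreateFile documentation uses; the passage quoted below spells it PhysicalDisk). From Naming Files, Paths, and Namespaces: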
Another example of using the Win32 device namespace is using the CreateFile function with "\\.\PhysicalDiskX" (where X is a valid integer value) or "\\.\CdRomX". This allows you to access those devices directly, bypassing the file system. This works because these device names are created by the system as these devices are enumerated, and some drivers will also create other aliases in the system. For example, the device driver that implements the name "C:\" has its own namespace that also happens to be the file system.
APIs that go through the CreateFile function generally work with the "\\.\" prefix because CreateFile is the function used to open both files and devices, depending on the parameters you use.
If you're working with Windows API functions, you should use the "\\.\" prefix to access devices only and not files.
Most APIs won't support "\\.\"; only those that are designed to work with the device namespace will recognize it. Always check the reference topic for each API to be sure.
I don't know if the Java API is implemented using CreateFile or if it does some name mangling that means you can't access the device namespace. In the worst case you'll have to create a wrapper library that calls CreateFile and turns the HANDLE it returns into a file descriptor that can be used in Java; that's no work at all.
Files are, by definition, named sequences of bytes stored on a permanent storage device. Files are managed by an OS component called the file system. The file system works with the concept of a "file" and translates it into lower-level terms such as volume, sector, and block.
The mapping between a file name (and path) and the blocks on disk where the data is actually stored is called the file table, and it is managed by the file system.
When you delete a file, you ask the file system to remove the corresponding entry from the file table. This means that the file content is indeed not physically deleted from the disk and, if you are lucky, can probably be restored. Why probably? Because once the entry is removed from the table, the space occupied by the file can be reused, so other data may be stored there.
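To make the idea concrete, here is a toy model of it, not a real file system. The "disk" is a byte array and the "file table" is a map from name to offset and length; every name in it (ToyFileSystem, write, delete, rawRead) is invented for this illustration. Deleting only removes the table entry, so the bytes stay readable at the raw level.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Toy model only: real file systems are far more involved.
class ToyFileSystem {
    private final byte[] disk = new byte[64];
    // "File table": name -> {offset, length} of the data on the toy disk.
    private final Map<String, int[]> fileTable = new HashMap<>();
    private int nextFree = 0;

    /** Stores content at the next free offset; no bounds handling in this sketch. */
    void write(String name, String content) {
        byte[] data = content.getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(data, 0, disk, nextFree, data.length);
        fileTable.put(name, new int[] {nextFree, data.length});
        nextFree += data.length;
    }

    /** Deleting removes only the table entry; the bytes on disk are untouched. */
    void delete(String name) {
        fileTable.remove(name);
    }

    boolean exists(String name) {
        return fileTable.containsKey(name);
    }

    /** Reads the raw disk directly, ignoring the file table. */
    String rawRead(int offset, int length) {
        return new String(disk, offset, length, StandardCharsets.US_ASCII);
    }
}
```

After write("a.txt", "hello") followed by delete("a.txt"), exists("a.txt") is false but rawRead(0, 5) still returns the old content. A real file system would eventually hand that space to another file, which is where the "if you are lucky" part comes in.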
There are tools that try to restore this information. These tools work at a level below the file system, i.e. they use lower-level APIs. They are probably talking directly to the driver. Java does not provide an API for doing this.
Therefore you have the following solutions.