Extracting jpegs from a disk dump

I've got a 16GB memory card off someone that won't load properly (asks to be reformatted). I'm trying to get jpegs off it.

I've run dd to dump the contents to a file, which worked splendidly. The file won't mount, so the filesystem on it is corrupt in some way.

Opening the dump in a hex editor shows that there is data on there, and by looking for the markers for the start and end of a jpeg (FFD8 and FFD9), I've been able to manually extract the first 3 jpegs.

Before I go and write some code to stream the file, find the offsets and dump the files, is there any existing way to do this? I can't find anything with a simple Google search, but I don't want to solve a problem which must have been solved many times before.

Does anyone know of either some software or a decent library (Python would be nice as I'm familiar with the language, though anything would do) that will easily let me extract the jpegs, or am I better off just writing the code myself?

Rich Bradshaw asked Feb 25 '12 17:02


2 Answers

You want a computer forensics carving tool.

There are two obvious choices for this problem. The first is the open source photorec. The second is the commercial tool Adroit Photo Forensics. I've used both tools on many occasions. Adroit will recover files that are fragmented and does a better job eliminating false positives, but it is pricy. In all likelihood you'll be fine with photorec.

vy32 answered Oct 21 '22 04:10


Here is a program that I wrote in Python to do this. It reads a file that contains the image data and splits it into individual files.

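import hashlib

inputfile = 'data.txt'
# JPEG start-of-image (SOI) marker, as bytes
marker = b'\xff\xd8'

# Input data: the whole file is read into memory, so this suits small dumps
with open(inputfile, 'rb') as f:
    imagedump = f.read()

# Whatever precedes the first marker is not image data, so drop that chunk
chunks = imagedump.split(marker)[1:]

count = 0
for photo in chunks:
    # Name each file after the first 16 hex digits of its SHA-256 digest
    name = hashlib.sha256(photo).hexdigest()[0:16] + '.jpg'
    with open(name, 'wb') as out:
        out.write(marker + photo)
    count += 1
    print(count)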

The script names each image it finds after the first 16 hex digits of its SHA-256 digest and writes all of them to the current directory.

Here is a way to test that the script works correctly: type cd ~/images/, make a folder with mkdir test, then concatenate some JPEGs into a single file in that folder with cat *.jpg > ./test/data.txt. Then cd test, put the script into that directory and run it with python extract.py. The JPEGs will be dumped in the current folder.
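For a full 16GB dump, reading the whole file into memory is not practical. Below is a rough Python 3 sketch of the same marker-based idea that memory-maps the dump and cuts each image at the first end marker (FFD9) after its start marker. The function name carve_jpegs, the max_size cap and the output file naming are my own assumptions for illustration, not part of any existing tool. It searches for FF D8 FF (the start marker plus the first byte of the next marker) to cut down on false positives. It is still naive carving: a photo with an embedded thumbnail will be cut short at the thumbnail's own FFD9, and fragmented files are not reassembled, which is where a dedicated tool such as photorec is the safer choice.

```python
import mmap
from pathlib import Path

SOI = b"\xff\xd8\xff"  # start-of-image marker plus the first byte of the next marker
EOI = b"\xff\xd9"      # end-of-image marker


def carve_jpegs(dump_path, out_dir, max_size=32 * 1024 * 1024):
    """Write each SOI..EOI span found in dump_path to out_dir and return the count.

    max_size is an assumed upper bound on a single image, used to give up
    on start markers that have no end marker nearby.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    with open(dump_path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as data:
        pos = data.find(SOI)
        while pos != -1:
            # Look for the first end marker within max_size bytes of the start
            limit = min(len(data), pos + max_size)
            end = data.find(EOI, pos + len(SOI), limit)
            if end == -1:
                # No end marker in range: treat this start as a false positive
                pos = data.find(SOI, pos + 1)
                continue
            end += len(EOI)
            # Name the file after the byte offset where the image started
            (out / f"carved_{pos:012d}.jpg").write_bytes(data[pos:end])
            count += 1
            pos = data.find(SOI, end)
    return count
```

Calling carve_jpegs("card.img", "recovered") (both names are placeholders) would write files such as recovered/carved_000000001024.jpg, so each recovered file can be traced back to its offset in the dump.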

kyle k answered Oct 21 '22 03:10