I need to analyze thousands of JPEG files by retrieving their EXIF data. That is more than 50 GB of data. I cannot read the whole files because it would take too much time.
Is there any way in C# to read only the EXIF data from those files, without loading and decompressing the whole JPEG files?
EDIT: Why do I need a fast method?
I've tried the solution from this question: How to get the EXIF data from a file using C#
For 1000 images with a total size of ~1 GB, it took 3 minutes to analyze. At the same rate, a larger (50 GB) library of photos would take about 2.5 hours. When you need an answer almost immediately to a question like "What is the preferred zoom used by your customer?", that is too slow.
The exchangeable image file format (EXIF) is a standard for embedding technical metadata in image files that many camera manufacturers use and many image-processing programs support. EXIF metadata can be embedded in TIFF and JPEG images.
On a Windows PC, open File Explorer and right-click the file whose data you want to see. A menu pops up with various options. Click Properties and then the Details tab. This brings up the EXIF data for that photo.
A JPEG file also carries a lot of metadata containing auxiliary information about the image. On average, this kind of metadata occupies 16% of the size of the JPEG file.
Screenshots: real photographs can be fingerprinted and contain EXIF data. A screenshot provides only a timestamp, and even that can be edited, so taking a screenshot is one of the easiest ways to remove metadata from a photograph. Remember that metadata is data that can help identify even more data.
You'll find some code samples (and a full project too) in ExifLib - A Fast Exif Data Extractor for .NET 2.0+ that show how to read the minimum data necessary to get just the EXIF information out.
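To illustrate why reading only the EXIF data can be fast, here is a minimal sketch of the general idea. It is not ExifLib's code. A JPEG file is a sequence of marker segments, and EXIF normally sits in an APP1 segment (marker FF E1) near the start, so a reader can walk the segment headers and stop before the compressed image data. The class and method names (`ExifSegmentScanner`, `ReadExifPayload`) are made up for this example, and the sketch assumes a well-formed file with no padding bytes between segments.

```csharp
using System;
using System.IO;

public static class ExifSegmentScanner
{
    // Returns the raw TIFF-formatted EXIF payload (without the "Exif\0\0" prefix),
    // or null if no EXIF APP1 segment is found before the image data begins.
    public static byte[] ReadExifPayload(Stream stream)
    {
        var header = new byte[4];

        // Every JPEG starts with the SOI marker FF D8.
        if (!TryFill(stream, header, 2) || header[0] != 0xFF || header[1] != 0xD8)
            return null;

        // Each following segment is: FF, marker byte, 2-byte big-endian length.
        while (TryFill(stream, header, 4))
        {
            if (header[0] != 0xFF) return null;                 // not at a marker: give up
            byte marker = header[1];
            if (marker == 0xDA || marker == 0xD9) return null;  // SOS (pixel data) or EOI

            int length = (header[2] << 8) | header[3];          // includes the 2 length bytes
            if (length < 2) return null;
            int bodyLength = length - 2;

            if (marker == 0xE1 && bodyLength >= 6)
            {
                var body = new byte[bodyLength];
                if (!TryFill(stream, body, bodyLength)) return null;

                bool isExif = body[0] == (byte)'E' && body[1] == (byte)'x' &&
                              body[2] == (byte)'i' && body[3] == (byte)'f' &&
                              body[4] == 0 && body[5] == 0;
                if (isExif)
                {
                    var payload = new byte[bodyLength - 6];
                    Array.Copy(body, 6, payload, 0, payload.Length);
                    return payload;
                }
                continue; // an APP1 segment that is not EXIF (for example XMP)
            }

            // Any other segment: skip its body without keeping it.
            if (stream.CanSeek)
                stream.Seek(bodyLength, SeekOrigin.Current);
            else if (!TryFill(stream, new byte[bodyLength], bodyLength))
                return null;
        }
        return null;
    }

    // Convenience overload that opens the file and reads only as far as needed.
    public static byte[] ReadExifPayload(string path)
    {
        using (var file = File.OpenRead(path))
            return ReadExifPayload(file);
    }

    // Reads exactly 'count' bytes into the buffer; false if the stream ends first.
    private static bool TryFill(Stream stream, byte[] buffer, int count)
    {
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0) return false;
            offset += read;
        }
        return true;
    }
}
```

The returned bytes are a TIFF structure that still has to be parsed to get individual tags, which is the part a library such as ExifLib handles. The point of the sketch is only that at most a few segments near the start of each file are read, and the pixel data is never touched.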
I've recently ported my Java metadata-extractor library to .NET. It's been active since 2002 and has had heavy testing through widespread use. In my tests, it churns through 2 GB of images, extracting all metadata, in around 4 seconds on my machine. You could optimise further by telling it to read only specific types of metadata, such as Exif. It supports many image/video formats and many metadata types.
Available on GitHub and NuGet.
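As a rough sketch of the "only read Exif" optimisation, assuming the MetadataExtractor NuGet package is referenced: `JpegMetadataReader.ReadMetadata` accepts an optional collection of segment readers, so passing only an `ExifReader` skips the other metadata types. The wrapper class `ExifOnlyExample` and the method `ReadFocalLength` are names invented for this example, and the focal length tag is chosen only because it relates to the zoom question above.

```csharp
using System.Linq;
using MetadataExtractor.Formats.Exif;
using MetadataExtractor.Formats.Jpeg;

public static class ExifOnlyExample
{
    // Returns a human-readable focal length such as "35 mm",
    // or null if the file has no Exif SubIFD or no focal length tag.
    // ReadMetadata may throw for unreadable or malformed files.
    public static string ReadFocalLength(string jpegPath)
    {
        // Restrict processing to Exif segments only.
        var readers = new IJpegSegmentMetadataReader[] { new ExifReader() };
        var directories = JpegMetadataReader.ReadMetadata(jpegPath, readers);

        // Camera settings such as focal length live in the Exif SubIFD directory.
        var subIfd = directories.OfType<ExifSubIfdDirectory>().FirstOrDefault();
        return subIfd?.GetDescription(ExifDirectoryBase.TagFocalLength);
    }
}
```

To process a whole library, the method could be called for each path returned by `System.IO.Directory.EnumerateFiles(folder, "*.jpg")`, with the results grouped to see which focal lengths occur most often.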
Starting with version 10, the GdPicture.NET Imaging SDK provides a new image parsing mechanism that allows direct access to image metadata (EXIF, GPS, XMP, IPTC...) without decoding pixels. It supports more than 90 image formats, including JPEG, TIFF, RAW and WebP.
Here is a link to the GdPicture.NET knowledge base, which demonstrates how to extract metadata using C# and VB.NET (many other languages are also supported): tutorial
In case anybody needs further information I will be glad to assist.
Disclaimer: I am the product architect of GdPicture.NET.