 

Strategies for working with large amounts of image data

Technology stack: C# / .NET 4 / WinForms

Background:

The project on which I am working is a visualization application for a series of image stacks. Specifically, the image stacks are aligned to a grid, all show the same image index at any one time, and processing functions are applied to the images currently in view. Each image stack is 150-300 MB, each individual image is 512 KB-1 MB, and a typical data set consists of ~100 image stacks.

Question:

To try to work with this amount of data, I am using several techniques:

  • Memory-mapped files: Image stacks are loaded from disk at application launch.
  • Compilation under x64 with unsafe code allowed: Obviously I need a 64-bit address space for files of this size. I am moving the currently displayed image from the memory-mapped file to a method that generates a bitmap via Marshal.Copy with unsafe pointers.
  • System.Threading.Tasks: I'm using parallel loops for processing where possible.
  • System.Drawing.BufferedGraphicsContext: Each image stack has one active image, which is composited onto a BufferedGraphicsContext before being passed to a PictureBox for display to the user.
  • High-end system requirements: quad-core CPU or better, SSD, 12 GB of memory, etc.
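For reference, the memory-mapped access pattern described above might look roughly like this. This is only a minimal sketch: `FrameSize`, `ReadFrame`, and the fixed-size frame layout are my assumptions, not the questioner's actual code.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

class MmfFrameReader
{
    // Assumed fixed frame size (the question says images are 512 KB-1 MB).
    public const int FrameSize = 512 * 1024;

    // Read a single frame out of a memory-mapped stack file into a managed
    // buffer. Touching the view is what triggers the OS page-in seen in the
    // profiler when the page is not resident.
    public static byte[] ReadFrame(string stackPath, int frameIndex)
    {
        using (var mmf = MemoryMappedFile.CreateFromFile(
                   stackPath, FileMode.Open, null, 0,
                   MemoryMappedFileAccess.Read))
        using (var accessor = mmf.CreateViewAccessor(
                   (long)frameIndex * FrameSize, FrameSize,
                   MemoryMappedFileAccess.Read))
        {
            var buffer = new byte[FrameSize];
            accessor.ReadArray(0, buffer, 0, FrameSize);
            return buffer;
        }
    }
}
```

In practice the `MemoryMappedFile` would be created once at launch and only views would be created per frame; it is shown inline here to keep the sketch self-contained.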

Yet even using all of the above, the responsiveness leaves much to be desired. In SysInternals Process Explorer, CPU utilization is low (< 25%) while memory usage climbs to the limit before garbage collection occurs.

Profiling shows that most of the execution time is spent getting data out of the memory mapped files. I assume it's waiting as the OS pages the requested memory back into active memory?

What else could I do to improve performance?

Note:

  • Most, if not all, image stacks will be viewable at the same time so clipping to the current viewport may not yield much speed.
  • Resizing for display is an option, but the complete original data must still be available at all times for processing so it seems that would just be an extra step.

Update 1:

  • For memory, my development box only has 6 GB (and I'm attempting to load fewer files as a result), but the deployment system will have 24 GB.
  • I am looking into using SSE optimizations through the Intel Performance Primitives and GPU acceleration via CUDA.
  • The reason I am attempting to load all of the data into memory is because an important visualization step is cycling through the image stacks at 15-60 Hz and I was afraid of thrashing.
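One way to support the cycling described in the last bullet without holding everything in memory is to prefetch the next frame on a background task while the current frame is displayed. This is an illustrative sketch, not anything from the question; the `loadFrame` delegate stands in for whatever reader is actually used, and `Task.Factory.StartNew` is used because `Task.Run` is not available on .NET 4.

```csharp
using System;
using System.Threading.Tasks;

class FramePrefetcher
{
    readonly Func<int, byte[]> _loadFrame; // stand-in for the real frame reader
    Task<byte[]> _next;
    int _nextIndex = -1;

    public FramePrefetcher(Func<int, byte[]> loadFrame)
    {
        _loadFrame = loadFrame;
    }

    // Return the requested frame, preferring the prefetched one, then kick
    // off a background load of the frame that will be wanted next.
    public byte[] GetFrame(int index, int frameCount)
    {
        byte[] frame = (_next != null && _nextIndex == index)
            ? _next.Result          // already loaded (or nearly so)
            : _loadFrame(index);    // cache miss: load synchronously

        _nextIndex = (index + 1) % frameCount;
        int prefetchIndex = _nextIndex; // capture for the closure
        _next = Task.Factory.StartNew(() => _loadFrame(prefetchIndex));
        return frame;
    }
}
```

At 15-60 Hz this only hides latency if a single frame load fits in the frame budget; a deeper read-ahead queue would be the next step.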
Noren asked Feb 01 '12


2 Answers

First of all, I think using unsafe code and memory-mapped files is not very helpful here. You need to read around 20 GB of data from disk, and reading it from disk is going to take much longer than the one extra in-memory copy you'd incur by just using streams - you've optimized in the wrong place.

I think you should look at it from a different angle. You're showing stacks of images - 20 GB worth - on a display that can show less than 10 MB of data. You don't need to read 20 GB of data to show all the image stacks and to provide a responsive UI while processing these images. You just need to load the top image from each stack - that will be much, much faster.
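The suggestion above could be sketched as follows: read only the first frame of each stack file with a plain stream instead of mapping whole stacks. `FrameSize` and a fixed-size frame layout are assumptions for illustration, matching the sizes given in the question.

```csharp
using System.IO;

class TopImageLoader
{
    // Assumed fixed frame size; a real format would read this from a header.
    public const int FrameSize = 512 * 1024;

    // Read just the first frame of a stack file. ~100 stacks means ~50 MB
    // of I/O at startup instead of ~20 GB.
    public static byte[] ReadTopImage(string stackPath)
    {
        var buffer = new byte[FrameSize];
        using (var fs = new FileStream(stackPath, FileMode.Open,
                                       FileAccess.Read))
        {
            int read = 0;
            while (read < FrameSize)
            {
                int n = fs.Read(buffer, read, FrameSize - read);
                if (n == 0) break; // file shorter than one frame
                read += n;
            }
        }
        return buffer;
    }
}
```

The remaining frames can then be loaded on demand, or in the background once the UI is up.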

As for the actual processing, unless you can utilize the GPU somehow, I don't think you can make it faster than processing images in parallel. I guess it depends on the processing you actually do.

zmbq answered Oct 02 '22


You can still pre-generate a thumbnail for each image and load only the thumbnails into the grid until the full images are available. The moment the user applies an effect/transformation to an image, you load only that image. And even while loading that single image, you can divide it into clipping sectors and load them asynchronously. If you look at how Google Street View loads after a zoom, you will notice that the entire image (even though you requested it) is never loaded immediately; it is loaded sector by sector.
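A rough sketch of the thumbnail idea, assuming the frames are raw 8-bit grayscale (an assumption; the question does not state the pixel format): downsample once by an integer factor with nearest-neighbor sampling, cache the result, and show only thumbnails in the grid.

```csharp
class Thumbnailer
{
    // Nearest-neighbor downsample of a raw 8-bit grayscale frame by an
    // integer factor. A 1024x1024 frame at factor 8 shrinks 1 MB to 16 KB.
    public static byte[] Downsample(byte[] src, int width, int height,
                                    int factor)
    {
        int tw = width / factor, th = height / factor;
        var dst = new byte[tw * th];
        for (int y = 0; y < th; y++)
            for (int x = 0; x < tw; x++)
                dst[y * tw + x] = src[(y * factor) * width + (x * factor)];
        return dst;
    }
}
```

Averaging over each factor x factor block instead of sampling one pixel would give smoother thumbnails at a small extra cost.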

Another very interesting technology, I think, is Deep Zoom; if it is not an answer to your problems, it can at least provide a good hint.

There is another Deep Zoom example from Scott Hanselman.

Good luck.

Tigran answered Oct 02 '22