
How to avoid heap fragmentation?

I'm currently working on a project for medical image processing, that needs a huge amount of memory. Is there anything I can do to avoid heap fragmentation and to speed up access of image data that has already been loaded into memory?

The application has been written in C++ and runs on Windows XP.

EDIT: The application does some preprocessing of the image data, such as reformatting, calculating look-up tables, extracting sub-images of interest ... The application needs about 2 GB of RAM during processing, of which about 1.5 GB may be used for the image data.

asked Sep 29 '08 by Thomas Koschel


2 Answers

If you are doing medical image processing, it is likely that you are allocating big blocks at a time (e.g., 512x512 images at 2 bytes per pixel, about 512 KB each). Fragmentation will bite you if you allocate smaller objects between the allocations of image buffers.

Writing a custom allocator is not necessarily hard for this particular use case. You can use the standard C++ allocator for your Image object, but for the pixel buffer you can use custom allocation that is all managed within your Image object. Here's a quick and dirty outline:

  • Use a static array of structs, each struct has:
    • A solid chunk of memory that can hold N images -- the chunking will help control fragmentation -- try an initial N of 5 or so
    • A parallel array of bools indicating whether the corresponding image is in use
  • To allocate, search the array for an empty buffer and set its flag
    • If none found, append a new struct to the end of the array
  • To deallocate, find the corresponding buffer in the array(s) and clear the boolean flag

This is just one simple idea with lots of room for variation. The main trick is to avoid freeing and reallocating the image pixel buffers.
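The outline above can be sketched roughly as follows. This is a minimal illustration, not a production allocator; `ImageBufferPool` and its interface are hypothetical names, and error handling and thread safety are omitted:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical pool for image pixel buffers, following the outline above:
// each chunk is one solid allocation holding N image-sized slots, with a
// parallel array of "in use" flags. Buffers are recycled, never freed back
// to the heap, so the pool itself cannot fragment.
class ImageBufferPool {
public:
    explicit ImageBufferPool(std::size_t imageBytes, std::size_t slotsPerChunk = 5)
        : imageBytes_(imageBytes), slotsPerChunk_(slotsPerChunk) {}

    // Search for a free slot and set its flag; if every chunk is full,
    // append a new chunk to the end of the array.
    unsigned char* allocate() {
        for (Chunk& c : chunks_) {
            for (std::size_t i = 0; i < slotsPerChunk_; ++i) {
                if (!c.inUse[i]) {
                    c.inUse[i] = true;
                    return &c.memory[i * imageBytes_];
                }
            }
        }
        chunks_.emplace_back(imageBytes_ * slotsPerChunk_, slotsPerChunk_);
        chunks_.back().inUse[0] = true;
        return &chunks_.back().memory[0];
    }

    // Find the buffer's chunk and clear its flag; the memory stays reserved
    // for the next allocate() call.
    void deallocate(unsigned char* p) {
        for (Chunk& c : chunks_) {
            unsigned char* base = c.memory.data();
            if (p >= base && p < base + imageBytes_ * slotsPerChunk_) {
                c.inUse[static_cast<std::size_t>(p - base) / imageBytes_] = false;
                return;
            }
        }
    }

private:
    struct Chunk {
        std::vector<unsigned char> memory;  // solid block holding N images
        std::vector<bool> inUse;            // parallel "slot occupied" flags
        Chunk(std::size_t bytes, std::size_t slots)
            : memory(bytes), inUse(slots, false) {}
    };
    // Note: when chunks_ grows, the Chunk objects move, but each chunk's
    // heap buffer (memory.data()) stays put, so returned pointers remain valid.
    std::size_t imageBytes_;
    std::size_t slotsPerChunk_;
    std::vector<Chunk> chunks_;
};
```

Deallocating a buffer and then allocating again hands back the same slot, which is exactly the reuse behavior that keeps the heap stable.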

answered by Jeff Kotula


There are answers, but it's difficult to be general without knowing the details of the problem.

I'm assuming 32-bit Windows XP.

Try to avoid needing hundreds of MB of contiguous memory. If you are unlucky, a few random DLLs will load themselves at inconvenient points throughout your available address space, rapidly cutting down very large areas of contiguous memory. Depending on what APIs you need, this can be quite hard to prevent. It can be quite surprising how just allocating a couple of 400 MB blocks of memory in addition to some 'normal' memory usage can leave you with nowhere to allocate a final 'little' 40 MB block.

On the other hand, do preallocate reasonably sized chunks at a time. Blocks on the order of 10 MB or so are a good compromise. If you can manage to partition your data into chunks of this size, you'll be able to fill the address space reasonably efficiently.
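One way to apply this is to back a large dataset with a list of ~10 MB pieces instead of one contiguous allocation, indexing into the right piece on access. A minimal sketch (the `ChunkedBuffer` name and interface are illustrative, not an existing API):

```cpp
#include <cstddef>
#include <vector>

// A large buffer stored as ~10 MB pieces rather than one contiguous
// allocation, so it can be placed into a fragmented address space.
class ChunkedBuffer {
public:
    static const std::size_t kChunkBytes = 10 * 1024 * 1024;  // ~10 MB

    explicit ChunkedBuffer(std::size_t totalBytes) : size_(totalBytes) {
        // Round up so the last (possibly partial) chunk is covered.
        std::size_t nChunks = (totalBytes + kChunkBytes - 1) / kChunkBytes;
        for (std::size_t i = 0; i < nChunks; ++i)
            chunks_.push_back(std::vector<unsigned char>(kChunkBytes));
    }

    // Map a global byte offset to (chunk index, offset within chunk).
    unsigned char& at(std::size_t offset) {
        return chunks_[offset / kChunkBytes][offset % kChunkBytes];
    }

    std::size_t size() const { return size_; }

private:
    std::vector<std::vector<unsigned char>> chunks_;
    std::size_t size_;
};
```

The cost is one extra division per access (cheap, and avoidable if you iterate chunk by chunk); the benefit is that a 1.5 GB dataset no longer needs 1.5 GB of contiguous address space.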

If you're still going to run out of address space, you're going to need to be able to page blocks in and out based on some sort of caching algorithm. Choosing the right blocks to page out is going to depend very much on your processing algorithm and will need careful analysis.

Choosing where to page things out to is another decision. You might decide to just write them to temporary files. You could also investigate Microsoft's Address Windowing Extensions API. In either case you need to be careful in your application design to clean up any pointers that point into something about to be paged out, otherwise really bad things (TM) will happen.
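As a rough illustration of the temporary-file option, here is a hedged sketch of a block cache that keeps a fixed number of blocks resident and writes the least-recently-used block to a temp file when it needs room. The `BlockCache` name and policy are assumptions for the example, not the answer's prescribed design, and real code would need error checking on the file I/O:

```cpp
#include <cstdio>
#include <list>
#include <map>
#include <vector>

// Keeps at most `capacity` fixed-size blocks in memory; the least recently
// used block is written out to a temporary file to make room for new ones.
class BlockCache {
public:
    BlockCache(std::size_t blockBytes, std::size_t capacity)
        : blockBytes_(blockBytes), capacity_(capacity), file_(std::tmpfile()) {}
    ~BlockCache() { if (file_) std::fclose(file_); }

    // Return block `id`, reloading it from the temp file if it was evicted.
    // NOTE: the reference is only valid until the next get() call -- this is
    // exactly the "pointers into paged-out data" hazard mentioned above.
    std::vector<unsigned char>& get(std::size_t id) {
        std::map<std::size_t, std::vector<unsigned char> >::iterator it =
            resident_.find(id);
        if (it != resident_.end()) {
            lru_.remove(id);      // refresh: move to most-recently-used end
            lru_.push_back(id);
            return it->second;
        }
        if (resident_.size() >= capacity_) evictOldest();
        std::vector<unsigned char>& blk = resident_[id];
        blk.assign(blockBytes_, 0);   // default content: zeros
        std::fseek(file_, static_cast<long>(id * blockBytes_), SEEK_SET);
        std::fread(blk.data(), 1, blockBytes_, file_);  // no-op if never written
        lru_.push_back(id);
        return blk;
    }

private:
    void evictOldest() {
        std::size_t victim = lru_.front();  // front = least recently used
        lru_.pop_front();
        std::fseek(file_, static_cast<long>(victim * blockBytes_), SEEK_SET);
        std::fwrite(resident_[victim].data(), 1, blockBytes_, file_);
        resident_.erase(victim);
    }

    std::size_t blockBytes_, capacity_;
    std::FILE* file_;
    std::map<std::size_t, std::vector<unsigned char> > resident_;
    std::list<std::size_t> lru_;
};
```

The `std::list::remove` on every hit is O(n) in the number of resident blocks; with tens of blocks that's fine, but a real implementation would keep iterators into the list to make it O(1).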

Good Luck!

answered by CB Bailey