
How to avoid heap fragmentation?

I'm currently working on a project for medical image processing, that needs a huge amount of memory. Is there anything I can do to avoid heap fragmentation and to speed up access of image data that has already been loaded into memory?

The application has been written in C++ and runs on Windows XP.

EDIT: The application does some preprocessing of the image data, such as reformatting, calculating look-up tables, extracting sub-images of interest ... The application needs about 2 GB of RAM during processing, of which about 1.5 GB may be used for the image data.

asked Sep 29 '08 by Thomas Koschel


2 Answers

If you are doing medical image processing, it is likely that you are allocating big blocks at a time (e.g., 512x512 images at 2 bytes per pixel, about 512 KB each). Fragmentation will bite you if you allocate smaller objects between the allocations of image buffers.

Writing a custom allocator is not necessarily hard for this particular use case. You can use the standard C++ allocator for your Image object, but for the pixel buffer you can use custom allocation that is all managed within your Image object. Here's a quick and dirty outline:

  • Use a static array of structs, each struct has:
    • A solid chunk of memory that can hold N images -- the chunking will help control fragmentation -- try an initial N of 5 or so
    • A parallel array of bools indicating whether the corresponding image is in use
  • To allocate, search the array for an empty buffer and set its flag
    • If none found, append a new struct to the end of the array
  • To deallocate, find the corresponding buffer in the array(s) and clear the boolean flag

This is just one simple idea with lots of room for variation. The main trick is to avoid freeing and reallocating the image pixel buffers.
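The outline above can be sketched roughly as follows. This is a minimal illustration, not a production allocator; `ImageBufferPool` and its interface are hypothetical names, and error handling and thread safety are omitted:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical pool for image pixel buffers, following the outline above:
// each chunk is one solid allocation holding N image-sized slots, with a
// parallel array of "in use" flags. Buffers are recycled, never freed back
// to the heap, so the pool itself cannot fragment.
class ImageBufferPool {
public:
    explicit ImageBufferPool(std::size_t imageBytes, std::size_t slotsPerChunk = 5)
        : imageBytes_(imageBytes), slotsPerChunk_(slotsPerChunk) {}

    // Search for a free slot and set its flag; if every chunk is full,
    // append a new chunk to the end of the array.
    unsigned char* allocate() {
        for (Chunk& c : chunks_) {
            for (std::size_t i = 0; i < slotsPerChunk_; ++i) {
                if (!c.inUse[i]) {
                    c.inUse[i] = true;
                    return &c.memory[i * imageBytes_];
                }
            }
        }
        chunks_.emplace_back(imageBytes_ * slotsPerChunk_, slotsPerChunk_);
        chunks_.back().inUse[0] = true;
        return &chunks_.back().memory[0];
    }

    // Find the buffer's chunk and clear its flag; the memory stays reserved
    // for the next allocate() call.
    void deallocate(unsigned char* p) {
        for (Chunk& c : chunks_) {
            unsigned char* base = c.memory.data();
            if (p >= base && p < base + imageBytes_ * slotsPerChunk_) {
                c.inUse[static_cast<std::size_t>(p - base) / imageBytes_] = false;
                return;
            }
        }
    }

private:
    struct Chunk {
        std::vector<unsigned char> memory;  // solid block holding N images
        std::vector<bool> inUse;            // parallel "slot occupied" flags
        Chunk(std::size_t bytes, std::size_t slots)
            : memory(bytes), inUse(slots, false) {}
    };
    // Note: when chunks_ grows, the Chunk objects move, but each chunk's
    // heap buffer (memory.data()) stays put, so returned pointers remain valid.
    std::size_t imageBytes_;
    std::size_t slotsPerChunk_;
    std::vector<Chunk> chunks_;
};
```

Deallocating a buffer and then allocating again hands back the same slot, which is exactly the reuse behavior that keeps the heap stable.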

answered by Jeff Kotula


There are answers, but it's difficult to be general without knowing the details of the problem.

I'm assuming 32-bit Windows XP.

Try to avoid needing hundreds of MB of contiguous memory. If you are unlucky, a few random DLLs will load themselves at inconvenient points throughout your available address space, rapidly cutting down very large areas of contiguous memory. Depending on what APIs you need, this can be quite hard to prevent. It can be quite surprising how just allocating a couple of 400 MB blocks of memory in addition to some 'normal' memory usage can leave you with nowhere to allocate a final 'little' 40 MB block.

On the other hand, do preallocate reasonably sized chunks at a time. Blocks on the order of 10 MB or so are a good compromise. If you can manage to partition your data into chunks of this size, you'll be able to fill the address space reasonably efficiently.
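One way to apply this is to back a large dataset with a list of ~10 MB pieces instead of one contiguous allocation, indexing into the right piece on access. A minimal sketch (the `ChunkedBuffer` name and interface are illustrative, not an existing API):

```cpp
#include <cstddef>
#include <vector>

// A large buffer stored as ~10 MB pieces rather than one contiguous
// allocation, so it can be placed into a fragmented address space.
class ChunkedBuffer {
public:
    static const std::size_t kChunkBytes = 10 * 1024 * 1024;  // ~10 MB

    explicit ChunkedBuffer(std::size_t totalBytes) : size_(totalBytes) {
        // Round up so the last (possibly partial) chunk is covered.
        std::size_t nChunks = (totalBytes + kChunkBytes - 1) / kChunkBytes;
        for (std::size_t i = 0; i < nChunks; ++i)
            chunks_.push_back(std::vector<unsigned char>(kChunkBytes));
    }

    // Map a global byte offset to (chunk index, offset within chunk).
    unsigned char& at(std::size_t offset) {
        return chunks_[offset / kChunkBytes][offset % kChunkBytes];
    }

    std::size_t size() const { return size_; }

private:
    std::vector<std::vector<unsigned char>> chunks_;
    std::size_t size_;
};
```

The cost is one extra division per access (cheap, and avoidable if you iterate chunk by chunk); the benefit is that a 1.5 GB dataset no longer needs 1.5 GB of contiguous address space.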

If you're still going to run out of address space, you're going to need to be able to page blocks in and out based on some sort of caching algorithm. Choosing the right blocks to page out is going to depend very much on your processing algorithm and will need careful analysis.

Choosing where to page things out to is another decision. You might decide to just write them to temporary files. You could also investigate Microsoft's Address Windowing Extensions API. In either case you need to be careful in your application design to clean up any pointers that point into something about to be paged out, otherwise really bad things (TM) will happen.
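As a rough illustration of the temporary-file option, here is a hedged sketch of a block cache that keeps a fixed number of blocks resident and writes the least-recently-used block to a temp file when it needs room. The `BlockCache` name and policy are assumptions for the example, not the answer's prescribed design, and real code would need error checking on the file I/O:

```cpp
#include <cstdio>
#include <list>
#include <map>
#include <vector>

// Keeps at most `capacity` fixed-size blocks in memory; the least recently
// used block is written out to a temporary file to make room for new ones.
class BlockCache {
public:
    BlockCache(std::size_t blockBytes, std::size_t capacity)
        : blockBytes_(blockBytes), capacity_(capacity), file_(std::tmpfile()) {}
    ~BlockCache() { if (file_) std::fclose(file_); }

    // Return block `id`, reloading it from the temp file if it was evicted.
    // NOTE: the reference is only valid until the next get() call -- this is
    // exactly the "pointers into paged-out data" hazard mentioned above.
    std::vector<unsigned char>& get(std::size_t id) {
        std::map<std::size_t, std::vector<unsigned char> >::iterator it =
            resident_.find(id);
        if (it != resident_.end()) {
            lru_.remove(id);      // refresh: move to most-recently-used end
            lru_.push_back(id);
            return it->second;
        }
        if (resident_.size() >= capacity_) evictOldest();
        std::vector<unsigned char>& blk = resident_[id];
        blk.assign(blockBytes_, 0);   // default content: zeros
        std::fseek(file_, static_cast<long>(id * blockBytes_), SEEK_SET);
        std::fread(blk.data(), 1, blockBytes_, file_);  // no-op if never written
        lru_.push_back(id);
        return blk;
    }

private:
    void evictOldest() {
        std::size_t victim = lru_.front();  // front = least recently used
        lru_.pop_front();
        std::fseek(file_, static_cast<long>(victim * blockBytes_), SEEK_SET);
        std::fwrite(resident_[victim].data(), 1, blockBytes_, file_);
        resident_.erase(victim);
    }

    std::size_t blockBytes_, capacity_;
    std::FILE* file_;
    std::map<std::size_t, std::vector<unsigned char> > resident_;
    std::list<std::size_t> lru_;
};
```

The `std::list::remove` on every hit is O(n) in the number of resident blocks; with tens of blocks that's fine, but a real implementation would keep iterators into the list to make it O(1).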

Good Luck!

answered by CB Bailey