The task is to parse a binary file into memory. However, I don't know a priori the amount of memory needed to allocate.
Which method has shown to be preferable: doing multiple small mallocs as I progress in the parsing routine or, first traverse the file in order to decide the amount of memory needed, and then parse again?
Any hint is appreciated.
In almost all cases one large allocation is better than many small allocations: it reduces fragmentation, requires fewer system calls, and often yields better performance through improved locality of reference.
A common technique is to allocate a small buffer first and, whenever it fills up, reallocate one larger by a fixed growth factor (often 1.5). After all elements have been gathered, the buffer can be shrunk to the exact size if the over-allocation is considered too big.
In any case: implement the simplest approach first. If you run into performance problems, benchmark, then optimize. It may turn out that allocation isn't even your bottleneck.
EDIT: As R.. mentions, you may get a good estimate of how much to allocate by reasoning about an upper bound on memory as a function of the file length. Most well-designed binary formats also carry length and size information in a header segment. If you can compute the exact size your data structure requires with a little arithmetic and/or file seeking, you are on the winning side.