
Large file memory management

Tags: c, file, memory, mmap

I'm looking for help on how to handle access to a large file or block device (large meaning bigger than addressable memory) transparently and sanely within my library. Say we have a block device of 512GB on a 32-bit architecture. 512GB is far more than a 32-bit architecture can address, and managing portions of the device/file in memory using mmap() is something I'm trying to avoid.

What I'm trying to achieve is to fetch blocks that are addressed by 64-bit numbers/offsets and whose size is arbitrary but fixed per device (512 bytes, 4K, 8K, 64MB, etc.). The caller should just get a memory address and should not have to care about freeing memory or loading the actual content into memory.

I was thinking about a mechanism as follows:

  • something like a void* get_file_address(uint64_t blk_offset) call that takes an offset (the block number), checks whether this block is already mapped and, if not, reads it in and thereby maps it
  • some structure that keeps track of access counts to the blocks (updated on every get_file_address call)
  • a memory manager that can be invoked when memory gets low and that starts unloading seldom-used blocks using the aforementioned structure
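The mechanism above can be sketched roughly as follows. This is only a minimal illustration under several assumptions, not a finished memory manager: it uses POSIX pread() instead of mmap(), a fixed number of slots (the hypothetical CACHE_SLOTS), and a least-recently-used clock in place of access counts. The names blk_cache, slot, cache_init and cache_destroy are invented for this example, and get_file_address takes an extra cache argument that the signature above does not have.

```c
#define _XOPEN_SOURCE 700      /* expose pread() under strict -std= modes */
#define _FILE_OFFSET_BITS 64   /* 64-bit off_t even on 32-bit targets */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define CACHE_SLOTS 4          /* hypothetical; tiny so eviction is easy to see */

struct slot {
    uint64_t blk;       /* block number held in this slot */
    uint64_t last_use;  /* logical clock value at the last access */
    void *data;         /* block_size bytes, or NULL while the slot is empty */
};

struct blk_cache {
    int fd;
    size_t block_size;
    uint64_t clock;
    struct slot slots[CACHE_SLOTS];
};

void cache_init(struct blk_cache *c, int fd, size_t block_size)
{
    assert(c != NULL && block_size > 0);
    memset(c, 0, sizeof *c);
    c->fd = fd;
    c->block_size = block_size;
}

void cache_destroy(struct blk_cache *c)
{
    for (int i = 0; i < CACHE_SLOTS; i++)
        free(c->slots[i].data);
    memset(c->slots, 0, sizeof c->slots);
}

/* Returns a pointer to the cached block, or NULL on error. The pointer is
 * only valid until a later call evicts the slot (a limitation of this sketch). */
void *get_file_address(struct blk_cache *c, uint64_t blk)
{
    struct slot *victim = &c->slots[0];

    for (int i = 0; i < CACHE_SLOTS; i++) {
        struct slot *s = &c->slots[i];
        if (s->data && s->blk == blk) {          /* cache hit */
            s->last_use = ++c->clock;
            return s->data;
        }
        /* Prefer an empty slot; otherwise pick the least recently used one. */
        if (!s->data) {
            if (victim->data)
                victim = s;
        } else if (victim->data && s->last_use < victim->last_use) {
            victim = s;
        }
    }

    if (!victim->data && !(victim->data = malloc(c->block_size)))
        return NULL;

    off_t off = (off_t)(blk * (uint64_t)c->block_size);
    size_t got = 0;
    while (got < c->block_size) {
        ssize_t n = pread(c->fd, (char *)victim->data + got,
                          c->block_size - got, off + (off_t)got);
        if (n < 0) {                             /* read error: drop the slot */
            free(victim->data);
            victim->data = NULL;
            return NULL;
        }
        if (n == 0)                              /* end of file */
            break;
        got += (size_t)n;
    }
    memset((char *)victim->data + got, 0, c->block_size - got);

    victim->blk = blk;
    victim->last_use = ++c->clock;
    return victim->data;
}
```

A caller would run cache_init(&c, fd, 4096) once and then call get_file_address(&c, n) for each block it needs. Note that the sketch is read-only (no write-back of modified blocks) and that a returned pointer becomes stale once its slot is reused, so the "caller never has to care" goal would still need pinning or reference counting on top.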

The last point bothers me: writing a memory manager myself doesn't seem sane. Besides, I'm sure I'm not the first one with this problem.

So is there any solution, library, or code fragment out there that already helps to manage this or a similar case? I'm OK with solutions for Windows, Linux, *BSD or OS X.

asked Nov 11 '22 at 22:11 by grasbueschel


1 Answer

I would use "framed mmap" together with "large file support", which has long been part of Linux. Start with the Wikipedia article, then move on to the technical details on the SuSE web site.
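As a rough sketch of what "framed mmap" could look like, assuming it means mapping only a small window (frame) of the file at a time: the code below maps a page-aligned window around a 64-bit offset. The names frame, frame_map and frame_unmap are invented for this example. _FILE_OFFSET_BITS=64 is the glibc large-file switch that makes off_t 64 bits wide on 32-bit targets.

```c
#define _XOPEN_SOURCE 700      /* POSIX declarations under strict -std= modes */
#define _FILE_OFFSET_BITS 64   /* large file support: 64-bit off_t */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

struct frame {
    void *base;      /* start of the mapping (page-aligned) */
    size_t map_len;  /* length to hand back to munmap() */
    void *data;      /* address of the byte at the requested offset */
};

/* Maps len bytes starting at the given file offset, read-only.
 * Returns 0 on success, -1 on failure. */
int frame_map(struct frame *f, int fd, uint64_t offset, size_t len)
{
    assert(f != NULL);
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len == 0)
        return -1;

    /* mmap() offsets must be page-aligned, so round down and keep the slack. */
    uint64_t aligned = offset - (offset % (uint64_t)page);
    size_t slack = (size_t)(offset - aligned);

    void *base = mmap(NULL, len + slack, PROT_READ, MAP_SHARED,
                      fd, (off_t)aligned);
    if (base == MAP_FAILED)
        return -1;

    f->base = base;
    f->map_len = len + slack;
    f->data = (char *)base + slack;
    return 0;
}

void frame_unmap(struct frame *f)
{
    assert(f != NULL);
    if (f->base)
        munmap(f->base, f->map_len);
    memset(f, 0, sizeof *f);
}
```

The caller maps a frame, works on f.data, and unmaps it before moving to the next region, so only a bounded slice of the 32-bit address space is in use at any moment. Deciding which frames to keep mapped is still up to the application.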

There are also some examples online and a few answers here on Stack Overflow. I don't think you will easily find a ready-made library. As the links above suggest, the source code of software that handles large multimedia files could be helpful, and its "framed" nature could yield some useful snippets.

answered Nov 14 '22 at 22:11 by EnzoR