mmap vs O_DIRECT for random reads (what are the buffers involved?)

Tags: c, hashtable, file-io, buffer, mmap

I am implementing a disk-based hashtable supporting a large number of keys (26+ million). The value is deserialized. Reads are essentially random throughout the file, values are smaller than the page size, and I am optimising for SSDs. Safety/consistency are not such huge issues (performance matters).

My current solution involves using an mmap() file with MADV_RANDOM | MADV_DONTNEED set to disable prefetching by the kernel and only load data on demand.

I get the idea that the kernel reads from disk to memory buffer, and I deserialize from there.

What about O_DIRECT? If I call read(), I'm still copying into a buffer (which I deserialize from) so can I gain any advantage?

Where can I find more info on the buffers involved with a mmap() file and calling read() on a file opened with O_DIRECT?

I am not interested in read ahead or caching. It has nothing to offer for my use case.

566

asked Nov 11 '13 14:11

Amir Taaki

1 Answer

O_DIRECT is an option for read/write operations in which the data bypasses the system buffers and is copied directly between your buffer and the disk controller. To get the benefits of O_DIRECT, you need to meet some conditions: the buffer address must be aligned to the memory page size, and the buffer size must be aligned to the I/O block size.
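To make the alignment conditions concrete, here is a minimal sketch in C. The helper names `aligned_span` and `read_value_direct` are made up for illustration, and the alignment value passed in (512 or 4096 are typical) is an assumption that depends on your device and filesystem. On Linux the file offset generally has to be aligned as well, so the sketch rounds the requested range outward and copies the wanted bytes out of an aligned bounce buffer.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Round offset down and offset+len up to multiples of align.
 * align must be a power of two (hypothetical helper, not a system API). */
static void aligned_span(off_t offset, size_t len, size_t align,
                         off_t *start, size_t *span)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    off_t mask = (off_t)align - 1;
    off_t end = offset + (off_t)len;
    *start = offset & ~mask;
    *span = (size_t)(((end + mask) & ~mask) - *start);
}

/* Read len bytes at offset into out through an aligned bounce buffer.
 * fd is assumed to have been opened by the caller with O_RDONLY | O_DIRECT
 * (the function also works on an ordinary descriptor).
 * Returns 0 on success, -1 on error or if the range runs past end of file. */
static int read_value_direct(int fd, off_t offset, size_t len, size_t align,
                             void *out)
{
    off_t start;
    size_t span;
    aligned_span(offset, len, align, &start, &span);

    void *buf = NULL;
    if (posix_memalign(&buf, align, span) != 0)  /* aligned buffer address */
        return -1;

    /* Aligned offset, aligned size, aligned buffer. */
    ssize_t n = pread(fd, buf, span, start);
    size_t skip = (size_t)(offset - start);
    int ok = n >= 0 && (size_t)n >= skip + len;
    if (ok)
        memcpy(out, (char *)buf + skip, len);
    free(buf);
    return ok ? 0 : -1;
}
```

Note that the copy into `out` is still there: O_DIRECT removes the page cache from the path, not the buffer you deserialize from. In a real lookup loop you would probably allocate the bounce buffer once and reuse it.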

Anyway, if you use mmap, you do not use read/write. Moreover, after mmap you can close the file descriptor and the mapping will still work. So O_DIRECT is useless with the mmap option.
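The mmap side can be sketched like this. `map_table` and `copy_record` are hypothetical names, and the record layout is left to the caller. The descriptor is closed right after mmap(), and madvise() is called with a single advice value, since it takes one advice per call rather than a bitmask.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a whole file read-only and hint that access will be random.
 * Returns the mapping, or NULL on failure; the length goes to *len_out. */
static const unsigned char *map_table(const char *path, size_t *len_out)
{
    assert(path != NULL && len_out != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    size_t len = (size_t)st.st_size;
    void *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;

    /* Advisory only: asks the kernel not to read ahead around each fault. */
    (void)madvise(p, len, MADV_RANDOM);

    *len_out = len;
    return p;
}

/* Bounds-checked copy of n bytes at offset out of the mapping.
 * Returns 0 on success, -1 if the range falls outside the file. */
static int copy_record(const unsigned char *map, size_t len, size_t offset,
                       void *out, size_t n)
{
    if (offset > len || n > len - offset)
        return -1;
    memcpy(out, map + offset, n);
    return 0;
}
```

With this approach the first touch of a page triggers a page fault and the kernel fills a page-cache page that your process sees directly; you can deserialize straight from the mapping without the extra copy that `copy_record` makes.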

What I can recommend to increase performance:

1. If your subsystem gets many lookups for missing keys, you can keep a Bloom filter (http://en.wikipedia.org/wiki/Bloom_filter) in memory. Check each search key against the Bloom filter first and reject missing keys without any actual request to the disk.
2. To conserve memory, use a two-level scheme: keep the bucket heads in the mmap-ed memory, but read the buckets themselves from the file with pread().
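As an illustration of the first recommendation, here is a small Bloom filter sketch in C. It is not the code from my subsystem: the type and function names are made up, and the choice of seeded FNV-1a with double hashing is just one reasonable assumption. As a rule of thumb, about 10 bits per key with 7 hash functions gives a false-positive rate of roughly 1%, which is around 32 MB for 26 million keys.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    uint8_t *bits;     /* bit array, nbits bits long */
    size_t   nbits;
    unsigned nhashes;  /* number of probes per key */
} bloom_t;

/* 64-bit FNV-1a with a seed mixed into the initial state. */
static uint64_t fnv1a(const void *data, size_t len, uint64_t seed)
{
    const unsigned char *p = data;
    uint64_t h = 14695981039346656037ULL ^ seed;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

static int bloom_init(bloom_t *b, size_t nbits, unsigned nhashes)
{
    assert(nbits > 0 && nhashes > 0);
    b->bits = calloc((nbits + 7) / 8, 1);
    b->nbits = nbits;
    b->nhashes = nhashes;
    return b->bits ? 0 : -1;
}

/* Double hashing: probe i uses (h1 + i * h2) mod nbits. */
static size_t bloom_index(const bloom_t *b, uint64_t h1, uint64_t h2,
                          unsigned i)
{
    return (size_t)((h1 + (uint64_t)i * h2) % b->nbits);
}

static void bloom_add(bloom_t *b, const void *key, size_t len)
{
    uint64_t h1 = fnv1a(key, len, 0);
    uint64_t h2 = fnv1a(key, len, 0x9e3779b97f4a7c15ULL) | 1;
    for (unsigned i = 0; i < b->nhashes; i++) {
        size_t idx = bloom_index(b, h1, h2, i);
        b->bits[idx / 8] |= (uint8_t)(1u << (idx % 8));
    }
}

/* Returns 0 if the key is definitely absent, 1 if it may be present. */
static int bloom_maybe_contains(const bloom_t *b, const void *key, size_t len)
{
    uint64_t h1 = fnv1a(key, len, 0);
    uint64_t h2 = fnv1a(key, len, 0x9e3779b97f4a7c15ULL) | 1;
    for (unsigned i = 0; i < b->nhashes; i++) {
        size_t idx = bloom_index(b, h1, h2, i);
        if (!(b->bits[idx / 8] & (1u << (idx % 8))))
            return 0;
    }
    return 1;
}

static void bloom_free(bloom_t *b)
{
    free(b->bits);
    b->bits = NULL;
}
```

A lookup then calls `bloom_maybe_contains()` first and goes to the disk only when it returns 1. A false positive costs one wasted read; a key that was added is never rejected.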

I implemented both options in my autocomplete subsystem. You can see it online here: http://olegh.ftp.sh/autocomplete.html and judge the performance on a slow old computer, a Celeron-300.

178

answered Oct 05 '22 23:10

olegarch
