
Which is faster: reading from disk or allocating system memory?

My environment is 32-bit Windows XP. I find that when the allocated memory is close to the 2GB maximum, so that very little virtual address space is left, allocating new memory is very slow.

So suppose I have a page file that my app needs to analyze. I have two options. One is to read it all into system memory, then do the analysis. The other is to reserve a memory buffer up front as a cache, read part of the page file into that buffer, analyze it and discard it, then read the next part of the file, overwriting the cache, and do the analysis again.
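A minimal sketch of the second option, in Python for brevity. The function name `analyze_in_chunks`, the `analyze` callback, and the 64MB default buffer size are assumptions for illustration, not details from the original app.

```python
def analyze_in_chunks(path, analyze, buffer_size=64 * 1024 * 1024):
    """Read a file through one reusable buffer, calling analyze() on each part."""
    buf = bytearray(buffer_size)   # allocated once, reused for every chunk
    view = memoryview(buf)         # lets us pass slices without copying
    with open(path, "rb", buffering=0) as f:
        while True:
            n = f.readinto(buf)    # overwrites the previous chunk in place
            if not n:              # 0 bytes read means end of file
                break
            analyze(view[:n])      # only the first n bytes are valid
```

The callback must finish with each slice before returning, because the next read overwrites the same memory. This only fits an analysis that can work on one part of the file at a time.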

From profiling, it looks like the second one is faster, since it avoids the allocation cost.

What do you think? Thanks in advance.

Buzz asked Feb 03 '10 13:02


1 Answer

(1) I'm not sure the question matches the title. If you're allocating close to 2GB of RAM on 32-bit Windows, the system is probably paging a lot of memory to disk, and that's where I'd look first for the slowdown. When you're using a lot of memory, you should think of it as being stored on disk (in pagefile.sys) but cached in physical RAM. The second option might be faster not because of the cost of doing the allocation, but because of the cost of using a lot of memory at once. In effect, when you copy the file into one big allocation you're copying much of it disk->disk via RAM, and then when you run over it again to analyse it, you're loading the copy back into RAM. If your analysis is a single-pass algorithm, that's a lot of redundant work.

(2) What I think is: memory-map the file (mmap on POSIX, MapViewOfFile and friends on Windows).
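To illustrate the idea, here is a rough sketch using Python's `mmap` module rather than calling MapViewOfFile directly. The function name `count_byte_mapped` and the "count one byte value" analysis are made up for the example. It maps the whole file at once, so it assumes the file fits in the free address space.

```python
import mmap

def count_byte_mapped(path, byte_value):
    """Count occurrences of byte_value (e.g. b"\\n") by mapping the file read-only."""
    with open(path, "rb") as f:
        # Length 0 maps the entire file; this raises ValueError on an empty file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            count = 0
            pos = m.find(byte_value)
            while pos != -1:
                count += 1
                pos = m.find(byte_value, pos + 1)
            return count
```

There is no explicit read and no copy into a separate allocation: the OS pages the file's contents in as the scan touches them.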

Edit: (3) A caution. If the file is currently 1.8GB, there's a chance that next year it will be 4GB. If so, I'd plan now for it to have a size greater than 2^32 bytes on a 32-bit machine, which means either taking your second option, or else still using MapViewOfFile but mapping one sensibly sized chunk of the file at a time, rather than all of it at once. Otherwise you'll be revisiting this code the first time someone tries it on a big file and reports the bug.
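A sketch of the chunk-at-a-time mapping, again in Python's `mmap` module as a stand-in for MapViewOfFile. The name `iter_mapped_chunks` and the 64MB default window are assumptions for the example. The one real constraint is that each mapping offset must be a multiple of `mmap.ALLOCATIONGRANULARITY`, so the window size is rounded down to such a multiple.

```python
import mmap
import os

def iter_mapped_chunks(path, chunk_size=64 * 1024 * 1024):
    """Yield (offset, mapping) pairs, mapping one window of the file at a time."""
    gran = mmap.ALLOCATIONGRANULARITY
    # Round down to a multiple of the granularity (at least one unit),
    # so that every window starts at a valid mapping offset.
    chunk_size = max(gran, chunk_size // gran * gran)
    file_size = os.path.getsize(path)
    with open(path, "rb") as f:
        offset = 0
        while offset < file_size:
            length = min(chunk_size, file_size - offset)
            with mmap.mmap(f.fileno(), length,
                           access=mmap.ACCESS_READ, offset=offset) as m:
                yield offset, m    # the mapping is closed when the loop resumes
            offset += length
```

Only one window is mapped at any moment, so address-space use stays bounded however large the file gets. The caller must not keep a reference to a mapping after moving on to the next one, and an analysis that needs data spanning a window boundary has to handle that itself.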

Steve Jessop answered Oct 21 '22 05:10