Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Writing multiple files Vs. writing one big file [in a solid state drive]

(I was not able to find a clear answer to my question, maybe I used the wrong search term)

I want to record many images from a camera, with no compression or lossless compression, on a not so powerful device with one single solid drive. After investigating, I have decided that, if any, the compression will be simply png image by image (this is not part of the discussion). Given these constraints, I want to be able to record at maximum possible frequency from the camera. The bottleneck is the (only one) hard drive speed. I want to use the RAM for queuing, and the few available cores for compressing the images in parallel, so that there's less data to write.

Once the data is compressed, do I get any gain in writing speed if I stream all the bytes in one single file, or, considering that I am working with a solid drive, can I just write one file (let's say about 1 or 2 MB) per image still working at the maximum disk bandwidth? (or very close to it, like >90%)?

I don't know if it matters, this will be done using C++ and its libraries.

like image 613
Antonio Avatar asked Mar 22 '23 03:03

Antonio


1 Answers

My question is "simply" if by writing my output on a single file instead of in many 2MB files I can expect a significant benefit, when working with a solid state drive.

There's a benefit, not a significant one. A file system driver for a solid state drive already knows how to distribute the data of a file across many non-adjacent clusters so doing it yourself doesn't help. Necessary to fit a large file on a drive that already contains files. By breaking it up, you force extra writes to also add the directory entries for those segments.

The type of a solid state drive matters but this is in general already done by the driver to implement "wear-leveling". In other words, intentionally scatter the data across the drive. This avoids wearing out flash memory cells, they have a limited number of times you can write them before they physically wear out and fail. Traditionally only guaranteed at 10,000 writes, they've gotten better. You'll exercise this of course. Notable as well is that flash drives are fast to read but slow to write, that matters in your case.

There's one notable advantage to breaking up the image data into separate files: it is easier to recover from a drive error. Either from a disastrous failure or the drive just filling up to capacity without you stopping in time. You don't lose the entire shot. But inconvenient to whatever program reads the images off the drive, it has to glue them back together. Which is an important design goal as well, if you make it too impractical with a non-standard uncompressed file format or just too slow to transfer or just too inconvenient in general then it will just not get used very often.

like image 145
Hans Passant Avatar answered Apr 05 '23 23:04

Hans Passant