Can I write to different parts of the same file concurrently from multiple threads (on a typical PC)? I mean there's only one disk head, so the writes can be only performed in some order anyway i.e. not in parallel, right?
Edit:
I'm writing a program that sorts a large binary file but the majority of time is still spent on disk I/O, so I'm just wondering will I gain any extra speed by doing I/O in parallel.
There's nothing to stop you from having multiple threads writing to different parts of the same file.
I have a program that sorts a large binary file but the majority of time is still spent on disk I/O, so I'm just wondering will I gain any extra speed by doing I/O in parallel.
If the program is disk-bound, making it multithreaded (and still writing the same amount of data to the same disk) will not speed it up.
If we are talking about a traditional hard drive, sequential I/O is generally faster than I/O that involves moving the disk head back and forth. With this in mind, splitting the I/O across threads might even be counter-productive.
There are several avenues to explore as far as speeding things up:
This is possible on Unix-like operating systems at least, and presumably also on Windows, though file handling there is somewhat different and may need a specific file mode to allow this (edit: see bizzehdee's answer for details).
On a running operating system, a "file" is really a logical entity: some of its state is stored on disk at any given time, while some changes may still exist only in kernel buffers. So, in a way, writing to a file is no different from writing to a block of shared memory; only the API is different (and not even that if you use mmap).
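To illustrate the mmap remark, here is a minimal sketch in Python (chosen only for brevity) in which several threads fill disjoint slices of one memory-mapped file. The names `fill_slice` and `mmap_fill` and the chunk layout are made up for the example, and it assumes a non-zero file size.

```python
import mmap
import threading


def fill_slice(mapping, start, end, value):
    # Slice assignment on an mmap object writes into the mapped pages;
    # the replacement must have exactly the same length as the slice.
    mapping[start:end] = bytes([value]) * (end - start)


def mmap_fill(path, chunk_size, num_chunks):
    size = chunk_size * num_chunks
    # The file must already have its final size before it is mapped.
    with open(path, "wb") as f:
        f.truncate(size)
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), size) as mapping:
        threads = [
            threading.Thread(
                target=fill_slice,
                args=(mapping, i * chunk_size, (i + 1) * chunk_size, ord("a") + i),
            )
            for i in range(num_chunks)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        # Ask the OS to write the dirty pages back to the file.
        mapping.flush()
```

Calling `mmap_fill("out.bin", 4, 3)` should leave a 12-byte file with one distinct letter per 4-byte chunk. Each thread touches only its own slice, so no locking is needed here.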
But in short: just seek and write, and the old bytes in the file get overwritten. If two processes write overlapping bytes, I think the end result is undefined; in any case it is something that should never happen in a correctly functioning system, and any program doing this should have some mechanism to prevent overlapping writes.
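A rough sketch of the seek-and-write idea with threads, in Python. It uses `os.pwrite` (available on Unix-like systems only) instead of a separate seek followed by a write, because threads sharing one descriptor also share its file position. The helper names `write_region` and `parallel_fill` are hypothetical.

```python
import os
import threading


def write_region(fd, offset, data):
    # os.pwrite writes at an absolute offset and does not move the
    # descriptor's shared file position, so threads cannot race on it.
    view = memoryview(data)
    while view:
        written = os.pwrite(fd, view, offset)
        view = view[written:]
        offset += written


def parallel_fill(path, chunk_size, num_chunks):
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Size the file up front so every region already exists.
        os.ftruncate(fd, chunk_size * num_chunks)
        threads = [
            threading.Thread(
                target=write_region,
                # Region i covers bytes [i * chunk_size, (i + 1) * chunk_size).
                args=(fd, i * chunk_size, bytes([ord("A") + i]) * chunk_size),
            )
            for i in range(num_chunks)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        os.close(fd)
```

The regions are disjoint by construction, which is the "mechanism to prevent overlapping writes" in this toy case. On platforms without `os.pwrite`, each thread could instead open its own handle and seek on that.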
About the speed-up: it really depends on what you do. If you just perform raw writes, things will probably slow down on a traditional spinning hard disk, and the file may become fragmented more easily. On an SSD, there is probably no slow-down, but no speed-up either.
On the other hand, if your operation is CPU-bound, you have multiple cores, and doing things in parallel lets you get higher total CPU usage, then processing different parts of the same output file in parallel can speed things up, even a lot if there is a lot of processing compared to the bytes written to the file.
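The shape of that approach could look like the sketch below: the file is split into fixed-size chunks, and each worker reads its chunk, processes it (here simply sorting the bytes as a stand-in for real work), and writes the result back to the same region. This is only an illustration with made-up names (`sort_chunk_in_place`, `sort_chunks`); it assumes a Unix-like system for `os.pread`/`os.pwrite` and a regular file where short reads and writes are not expected. Note that in CPython, threads share the interpreter lock, so a genuinely CPU-bound job would need processes or a different language to use several cores.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def sort_chunk_in_place(fd, offset, length):
    # Read one region, process it, and write the result back to the
    # same region. Sorting the bytes stands in for the real CPU work.
    data = os.pread(fd, length, offset)
    result = bytes(sorted(data))
    os.pwrite(fd, result, offset)
    return len(result)


def sort_chunks(path, chunk_size, workers=4):
    fd = os.open(path, os.O_RDWR)
    try:
        size = os.fstat(fd).st_size
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = [
                # The last chunk may be shorter than chunk_size.
                pool.submit(sort_chunk_in_place, fd, off, min(chunk_size, size - off))
                for off in range(0, size, chunk_size)
            ]
            # Returns the total number of bytes processed.
            return sum(f.result() for f in futures)
    finally:
        os.close(fd)
```

After this pass each chunk is sorted on its own; a full sort of the file would still need a merge step over the chunks.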
You need to look at CreateFile (opened with the FILE_FLAG_OVERLAPPED flag) and WriteFileEx, and make use of lpOverlapped. This allows for asynchronous reading and/or writing from/to the same file at the same time in multiple threads.
http://msdn.microsoft.com/en-us/library/windows/desktop/aa365748(v=vs.85).aspx