Which operation is more time consuming - reading from a disk or writing to a disk for the same amount of data and the same memory location?
Short answer: it depends a lot.
From the application's point of view, writes generally appear faster, because you are only asking the OS to write the data; the OS may return to you quickly and write the data at its leisure. With a read, you have to wait for the OS to get back to you with the data you want.
The filesystem can dramatically affect the speed of reads and writes. Often there is more housekeeping to be done on a write, but if you are appending to a file that might go more quickly.
Many solid state disks are considerably slower at writing than they are at reading.
This is actually a pretty complicated question, and it requires an understanding of how your I/O system is set up. The simple example you're citing (reading/writing a fixed amount of data to a particular location on disk) isn't as realistic as you might think. Here's a short summary of things that can affect I/O performance.
Disk speed
Hard disk speed is usually expressed in terms of rotational speed (rpm, or revolutions per minute), which tells you how fast the platters are spinning inside the drive. Typical values range from 5400 to 10,000 rpm. Typical transfer rates are 1 to 1.6 Gbit/sec, and drives can sustain transfer rates of up to 125 MB/sec.
Keep in mind that there's a difference between latency and throughput. If you write very small pieces of data to different places on your drive, you're dependent on the drive's latency (seek time, rotational delay, and access time). But, if you stream a large amount of data at once, you are probably more dependent on the throughput. Your filesystem determines how files are laid out on disk, and it may try to optimize for things like this (see below).
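To make the latency/throughput distinction concrete, here is a rough back-of-the-envelope sketch in Python. The function names are made up for illustration, and the 7200 rpm spindle speed and 9 ms seek time are assumed figures, not measurements; the 125 MB/sec is the sustained rate quoted above. It ignores caching, queuing, and fragmentation entirely.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational delay: on average the platter must turn half a revolution."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0


def transfer_time_ms(num_bytes: int, throughput_mb_per_s: float) -> float:
    """Time to stream num_bytes at a sustained rate (1 MB = 1,000,000 bytes)."""
    return num_bytes / (throughput_mb_per_s * 1_000_000) * 1000.0


def estimated_io_time_ms(num_bytes: int, rpm: float, seek_ms: float,
                         throughput_mb_per_s: float) -> float:
    """Very rough model: one seek + average rotational delay + streaming time."""
    return (seek_ms
            + avg_rotational_latency_ms(rpm)
            + transfer_time_ms(num_bytes, throughput_mb_per_s))


if __name__ == "__main__":
    # Assumed drive: 7200 rpm, 9 ms seek, 125 MB/sec sustained.
    for size in (4096, 100_000_000):
        total = estimated_io_time_ms(size, rpm=7200, seek_ms=9.0,
                                     throughput_mb_per_s=125)
        print(f"{size} bytes: about {total:.1f} ms")
```

Under these assumptions a 4 KB request costs roughly 13 ms, almost all of it seek and rotational delay, while a 100 MB transfer costs roughly 813 ms, almost all of it streaming time. That is the sense in which small scattered I/O is latency-bound and large sequential I/O is throughput-bound.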
Another thing to consider is that you can (and most businesses do) get faster transfer rates using multiple drives in a RAID configuration. The throughput of RAID drives depends on what combination of striping, mirroring, and parity you've chosen. Check out the Wikipedia article for all the subtleties. There are too many parameters to explain in full here.
Caching
Modern OSes carefully schedule when they interact with the disk drive. Between your program and the physical disk there may be several layers of caches, so the performance you'll see as an application programmer may depend more on how your OS handles data than on the actual performance of your drive.
Most OSes today use a buffer cache so that data from disk can be kept in memory, and the OS can schedule when it talks to the disk. Writes by an application will seem fast, since they can go straight to memory and the OS can wait to flush the buffer until it has nothing else to do. In practice, OSes will try to flush writes in a fairly timely fashion, so that a power failure doesn't kill all of your data. So, while there is available buffer space, writes will seem fast. If you fill up the buffer cache, or if the OS has little free memory to work with, you may see I/O performance degrade because the OS has to flush buffers more frequently.
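If you want to see the effect of the buffer cache on writes yourself, a minimal sketch along these lines can help. It times the same write twice: once returning as soon as the OS has accepted the data, and once after asking the OS to push it to the device with os.fsync. The helper name time_write and the 8 MiB payload are arbitrary choices for illustration, and the numbers you get will depend heavily on your OS, filesystem, and drive (on a RAM-backed temp directory the two may be nearly identical).

```python
import os
import tempfile
import time


def time_write(data: bytes, sync: bool) -> float:
    """Write data to a fresh temp file and return the elapsed seconds.

    With sync=False we stop the clock once the OS has accepted the data,
    which may only mean it was copied into the buffer cache.
    With sync=True we also call os.fsync, asking the OS to flush to the device.
    """
    fd, path = tempfile.mkstemp()
    try:
        view = memoryview(data)
        start = time.perf_counter()
        while view:
            # os.write may accept fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        if sync:
            os.fsync(fd)
        return time.perf_counter() - start
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    payload = os.urandom(8 * 1024 * 1024)
    print(f"buffered write: {time_write(payload, sync=False):.4f} s")
    print(f"write + fsync:  {time_write(payload, sync=True):.4f} s")
```

Even after fsync returns, the drive's own hardware cache may still be holding the data, so treat the second figure as a lower bound on "really on the platter" rather than a guarantee.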
Read speed, like write speed, depends on how data is cached. Most hard drives today have hardware caches that can increase sustained transfer rate. Likewise, the OS uses the buffer cache to store files that you've accessed. Either of these can use some prefetching strategy to fetch data from the drive in advance if it seems like it may be needed. With caching, initial reads might be slow, but subsequent reads (especially reads of the same thing) will take less time if the data to be read is already in a cache somewhere.
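The same kind of experiment works for reads. The sketch below writes a temp file, asks the OS to drop its cached pages where that is possible, and then reads the file twice. The names read_all and time_two_reads are made up for this example. os.posix_fadvise with POSIX_FADV_DONTNEED is only available on some Unix systems and is only a hint, so the first read is not guaranteed to be a true cold read; take this as a way to poke at the behaviour, not a rigorous benchmark.

```python
import os
import tempfile
import time


def read_all(path: str) -> tuple[int, float]:
    """Read the whole file in 1 MiB chunks; return (bytes read, elapsed seconds)."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb", buffering=0) as f:
        while chunk := f.read(1 << 20):
            total += len(chunk)
    return total, time.perf_counter() - start


def time_two_reads(size: int = 8 * 1024 * 1024) -> tuple[float, float]:
    """Return (first read seconds, second read seconds) for a fresh temp file."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(os.urandom(size))
            f.flush()
            os.fsync(f.fileno())  # make pages clean so they can be dropped
            if hasattr(os, "posix_fadvise"):
                try:
                    # Hint only: ask the OS to evict this file's cached pages.
                    os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)
                except OSError:
                    pass  # not supported on this filesystem; continue anyway
        _, first = read_all(path)
        _, second = read_all(path)
        return first, second
    finally:
        os.remove(path)


if __name__ == "__main__":
    first, second = time_two_reads()
    print(f"first read:  {first:.4f} s")
    print(f"second read: {second:.4f} s")
```

If the hint is honoured and the file lives on a real disk, the second read will usually be noticeably faster because it is served from the buffer cache. If the two times are about the same, the first read was most likely served from a cache as well.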
Filesystem
Finally, there's your filesystem to consider. A large write may not all go to the same place, so you can't simply consider your transfer rate when estimating how long it's going to take. Files aren't always contiguous on disk, and your filesystem has to compute how they should be laid out, which can affect performance drastically depending on how much space is available and how fragmented your disk is.
Read/write performance will boil down to a combination of all the effects mentioned above plus characteristics of the workload you put on the drive (size of data, frequency of reads and writes, etc). As with most things, you'll need to experiment with your application, the OS you intend to run it on, and your particular disk configuration to get a realistic idea of how it performs.
Buffers will affect the time to read and write by a great amount. Buffers may be maintained by the operating system in RAM and many drives also contain internal buffers that are part of the disk controller.
Consider that the operating system might cache portions of a file in RAM such that reads from these portions can complete very quickly. In addition, the operating system might cache writes in RAM until there is a sufficient amount to write to disk. A call to a 'write' function might return after only copying the data to another area of memory.
In short, and to generalize: if you require the bits to actually be written to the disk (using a flush operation or something similar), then the write will take at least as long as an uncached read from the disk, and likely longer.