Question

In my understanding, even if I want to overwrite a single byte in the middle of a file, the OS and/or the disk will read a whole page's worth of content, modify that one byte, and then write the contents back.

What is the fundamental reason that disks are designed this way, with a read-modify-write cycle?

Are the hardware or the physical properties of the disk the reason?

Is optimizing for data locality for subsequent reads the reason?

This question is more about the granularity of each disk operation than about inserting into the middle of a file.
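Here is a minimal sketch of the scenario being asked about (the file name and offset are arbitrary assumptions). The single-byte pwrite succeeds as written, but underneath it the kernel still brings the containing page into memory if it is cold, patches the one byte, and later writes a whole page or sector back to the device:

```python
import os

path = "example.bin"            # hypothetical file, assumed to already exist
with open(path, "r+b", buffering=0) as f:
    # Overwrite the byte at offset 1000; the device-level I/O beneath this
    # call still happens at page/sector granularity, not byte granularity.
    os.pwrite(f.fileno(), b"\x42", 1000)
```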


Solution

That's due to the mechanics.

A disk is a surface that rotates around its axis at high speed (in reality, several surfaces stacked on one spindle). Each surface is divided into concentric tracks, and a motor moves the electromagnetic head to the right track. The head then simply waits until the information it wants to read or write passes beneath it. The problem, then, is to locate the bytes on the track so the drive knows when they will reach the head.

If there were byte addressing, the head could read/write a single byte, but because of the speed of rotation it would then have to wait a full rotation before executing the next read/write instruction. An alternative would be to queue the access instructions and optimize access to contiguous bytes so as to avoid waiting out unnecessary rotations, but that would have required more memory and more throughput in the disk controller, both of which were very limited in earlier times.

So instead of accessing single bytes, and considering that most accesses are for whole sequences of bytes, the tracks were divided into sectors of equal capacity. This is why you have this block logic.
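A quick back-of-the-envelope sketch of that rotation penalty (the spindle speed and sector size below are assumed, illustrative figures, not from the answer):

```python
RPM = 7200                     # assumed spindle speed
SECTOR_BYTES = 4096            # assumed sector (block) size

revolution_s = 60 / RPM                             # time for one full rotation
one_byte_per_rev = 1 / revolution_s                 # throughput if each byte costs a rotation
one_sector_per_rev = SECTOR_BYTES / revolution_s    # throughput if a whole sector costs a rotation

print(f"one revolution:          {revolution_s * 1e3:.2f} ms")
print(f"1 byte per revolution:   {one_byte_per_rev:,.0f} B/s")
print(f"1 sector per revolution: {one_sector_per_rev / 1e6:.2f} MB/s")
```

With these numbers, byte-at-a-time access tops out around 120 B/s, while sector-at-a-time access already reaches roughly half a megabyte per second per rotation of waiting.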

OTHER TIPS

Your mechanical hard drive reads and writes data at 100 MB per second, using a strong magnet. To change a single byte, you’d have to turn that magnet on for ten nanoseconds. In order to not destroy any bits of the previous bytes, you’d have to turn that super strong magnet on and off within half a nanosecond.

You’d also have to recognize the location where you write with sub-nanosecond precision before you arrive there. So you have zero processing time available. Just impossible.

And since a hard drive rotates 90 times a second, if you wrote bytes one after the other, with each write waiting out a full rotation before the next, you would have the enormous write rate of 90 bytes per second.
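The figures in this answer can be reproduced with a few lines of arithmetic (a sketch only, using the rates stated above):

```python
transfer_rate = 100e6          # bytes per second, from the answer (100 MB/s)
rotations_per_s = 90           # from the answer (~5400 RPM)

time_per_byte_ns = 1e9 / transfer_rate       # how long one byte spends under the head
byte_per_rev_rate = rotations_per_s * 1      # throughput at one byte per revolution

print(f"time one byte spends under the head: {time_per_byte_ns:.0f} ns")
print(f"write rate at one byte per revolution: {byte_per_rev_rate} B/s")
```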

Memory devices at every level of the memory hierarchy (L1 cache, main memory, disk, ...) offer sequential access as a faster mode than random access. Random access requires the constant transmission of addresses (meaning devices have to reconfigure their access as the address changes), whereas sequential access means bulk transfers: one address is amortized across the transfer of a whole block, while the device internally advances the addressing to the next item in parallel with accessing the prior one.
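A rough user-space microbenchmark of that sequential-versus-random gap might look like the sketch below (the file name and block size are assumptions; on a real spinning disk with a cold cache the gap is far larger than what the OS page cache will show here):

```python
import os
import random
import time

PATH = "bigfile.bin"           # hypothetical large test file
BLOCK = 4096
blocks = os.path.getsize(PATH) // BLOCK

def read_blocks(fd, order):
    # Read one block per request, in the given order.
    for i in order:
        os.pread(fd, BLOCK, i * BLOCK)

fd = os.open(PATH, os.O_RDONLY)
order = list(range(blocks))

t0 = time.perf_counter(); read_blocks(fd, order); t_seq = time.perf_counter() - t0
random.shuffle(order)
t0 = time.perf_counter(); read_blocks(fd, order); t_rand = time.perf_counter() - t0
os.close(fd)

print(f"sequential: {t_seq:.3f} s   random: {t_rand:.3f} s")
```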

Disks take that sequential-versus-random gap to the extreme, as @gnasher729 points out. Due to the physical media, byte addressing is not even practical. On disk, there is fixed-size overhead surrounding any amount of content you might store. During low-level formatting by the manufacturer, a choice is made about how much content makes up a block. That choice is a trade-off between the advantages of sequential bulk transfers when you want more data and the disadvantages when you don't, as well as amortizing the overhead over the content.
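The amortization part of that trade-off is easy to put in numbers (the per-sector overhead figure below is an assumed, illustrative value, not from the answer):

```python
# Each sector carries a fixed amount of non-data overhead on the platter
# (inter-sector gap, sync/address mark, ECC), so larger sectors spend a
# smaller fraction of the track on overhead.
OVERHEAD_PER_SECTOR = 65       # assumed bytes of gap + sync + ECC per sector

for sector_size in (512, 4096):
    efficiency = sector_size / (sector_size + OVERHEAD_PER_SECTOR)
    print(f"{sector_size:>5}-byte sectors: {efficiency:.1%} of the track is user data")
```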

In practice, if the operating system is working with a given file, it will only do a read-modify-write when the page is cold; thereafter, for a time, it caches the latest content of that page. If the page is found in the cache, the initial read can be forgone.
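A toy write-back page cache (a sketch only, nothing like a real kernel implementation) shows the idea: the read half of read-modify-write happens only when the page is cold, and later writes to the same page hit the cached copy until it is flushed.

```python
import os

PAGE = 4096

class TinyPageCache:
    """Minimal illustration of read-modify-write with a write-back cache."""

    def __init__(self, fd):
        self.fd = fd
        self.pages = {}                      # page number -> dirty page contents

    def write_byte(self, offset, value):
        pno = offset // PAGE
        if pno not in self.pages:            # cold page: read the whole page in
            self.pages[pno] = bytearray(os.pread(self.fd, PAGE, pno * PAGE))
        self.pages[pno][offset % PAGE] = value   # modify the byte in memory only

    def flush(self):
        for pno, data in self.pages.items():     # write whole pages back to disk
            os.pwrite(self.fd, bytes(data), pno * PAGE)
        self.pages.clear()
```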

I agree with the previous answers on many points, except that the HDD-specific explanation doesn't cover other device types; those also use bulk operations, as @Erik Eidt already said. I completely agree about the inherently sequential access of any memory device. But another aspect of the input/output process needs mentioning: on most architectures it is optimized to be executed asynchronously. This usually means that some dedicated hardware controls the copying of data between different addressed locations; on x86-64, for example, this is done by a DMA controller rather than the CPU. As such, the granularity of I/O operations depends on these devices and is usually block-sized or even page-sized. You can look at the scatter/gather principles of their operation.
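For a user-space glimpse of the scatter/gather idea (a sketch only, not the controller-level DMA mechanism; the file name and buffer sizes are assumptions), a single vectored read can fill several non-contiguous buffers in one request, much as a DMA engine can split one block transfer across separate memory regions:

```python
import os

fd = os.open("example.bin", os.O_RDONLY)      # hypothetical file
buf1 = bytearray(512)                          # two non-contiguous destination buffers
buf2 = bytearray(512)
n = os.readv(fd, [buf1, buf2])                 # one request scatters data into both
os.close(fd)
print(f"read {n} bytes into two separate buffers")
```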

Licensed under: CC-BY-SA with attribution