Does fopen update mode ("rb+") change the disk location of the file if the file size does not change?

StackOverflow https://stackoverflow.com/questions/22087473

  •  18-10-2022

Question

Say that you have a very large binary file. You make frequent updates to it but, when you do, you just overwrite a tiny part of it without changing the size of the file. To accomplish this, you open the file with fopen in "rb+" mode, seek to the location, and perform a write.

How do operating systems actually implement this? Since update mode allows you to write to the end of the file and therefore increase its size, I assume that the file will sometimes have to get recreated and moved to a different location. However, in the situation described above, recreating the file every time seems like it would be a performance issue and/or wreck the flash drive.

The man pages for fopen don't get into this at all.

Solution

How a file is laid out in storage is decided by the file system, not by the implementation of fopen. It would be strange to see a file system that has to relocate an entire file in order to extend it. Files are typically kept in "slow" storage, so file systems are designed, among other things, to make operations such as extending a file as efficient as possible under "slow storage" conditions.

A popular and very straightforward approach to storing files in a file system (think FAT) is to organize each file as a sequence of disk "blocks" of some pre-determined size. When you write new data to the end of the file, it goes into the last block until that block is full. After that, a new block is allocated somewhere in storage and appended to the end of the file, and so on. Obviously, this can easily result in multiple files having their blocks interleaved, i.e. the files are physically stored non-contiguously. This hurts input/output performance, but it is considered an acceptable price to pay for fairly good file extension performance.

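In other words, the file does not get recreated or relocated. It simply gets extended at the end, and an overwrite that does not change the file size only touches the blocks that hold the overwritten bytes.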
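One way to observe this from user code is to compare the file's size and inode number before and after a write through "rb+". The following is only a minimal sketch, assuming a POSIX system (for stat and st_ino); the helper names make_file, patch_file and patch_size_delta are made up for illustration. An unchanged inode number suggests the file was not recreated, although it says nothing about where the blocks physically live.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: create `path` holding `size` zero bytes. */
static int make_file(const char *path, long size)
{
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    for (long i = 0; i < size; i++) {
        if (fputc(0, f) == EOF) { fclose(f); return -1; }
    }
    return fclose(f) == 0 ? 0 : -1;
}

/* Hypothetical helper: write `len` bytes at `offset` without truncating. */
static int patch_file(const char *path, long offset, const void *data, size_t len)
{
    FILE *f = fopen(path, "rb+");   /* update mode: existing content is kept */
    if (!f) return -1;
    if (fseek(f, offset, SEEK_SET) != 0 || fwrite(data, 1, len, f) != len) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}

/* Creates a 4096-byte file, writes 4 bytes at `offset`, and returns how much
 * the file grew. Returns -1 on error or if the inode number changed. */
long patch_size_delta(const char *path, long offset)
{
    struct stat before, after;
    if (make_file(path, 4096) != 0 || stat(path, &before) != 0) return -1;
    if (patch_file(path, offset, "ABCD", 4) != 0 || stat(path, &after) != 0) return -1;
    if (before.st_ino != after.st_ino) return -1;   /* file was replaced */
    return (long)(after.st_size - before.st_size);
}
```

Calling patch_size_delta(path, 1000) overwrites bytes in the middle of the file and should report a growth of 0, while patch_size_delta(path, 4096) writes at the end and should report a growth of 4, in both cases with the same inode.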

OTHER TIPS

The fopen man pages don't mention this because the actual creation and storage of files is a filesystem-dependent implementation detail. Indeed, fopen may not be operating on a physical file at all (e.g. fopen("/dev/tty", "w")).
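To check whether a stream is backed by a regular file at all, a program can look at the file type reported for the stream's descriptor. This is a small sketch assuming a POSIX system (fileno, fstat and S_ISREG are POSIX, not standard C); the function name is_regular_file is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>

/* Returns 1 if `path` opens as a regular file, 0 if it opens as something
 * else (device, pipe, ...), and -1 if it cannot be opened or inspected. */
int is_regular_file(const char *path, const char *mode)
{
    FILE *f = fopen(path, mode);
    if (!f) return -1;

    struct stat st;
    int result = -1;
    if (fstat(fileno(f), &st) == 0) {
        result = S_ISREG(st.st_mode) ? 1 : 0;
    }
    fclose(f);
    return result;
}
```

For example, on a typical Unix-like system is_regular_file("/dev/null", "w") returns 0 because /dev/null is a character device, so questions about disk location do not apply to it.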

However, filesystems typically store a file as a collection of disk blocks. Each block is a contiguous piece (anywhere from 512 bytes to 32 KB and up), but the blocks of a file need not be physically contiguous (when they are not, the file is "fragmented"). So, if you write past the end of the file and the filesystem needs to add a new block to the file, it simply finds a free block on your disk and starts writing there.
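The block arithmetic can be sketched as follows. This is only an illustration, assuming a POSIX system for the stat fields; blocks_needed and report_allocation are hypothetical names, and the unit of st_blocks is commonly 512 bytes but is not guaranteed to be.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>

/* Number of fixed-size blocks needed to hold `size` bytes (rounds up). */
long blocks_needed(long size, long block_size)
{
    return (size + block_size - 1) / block_size;
}

/* Prints what the file system reports for `path`. Returns 0 on success. */
int report_allocation(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0) return -1;
    printf("size: %lld bytes\n", (long long)st.st_size);
    printf("preferred I/O block size: %lld bytes\n", (long long)st.st_blksize);
    /* st_blocks is usually counted in 512-byte units; treat that as an assumption. */
    printf("allocated blocks: %lld\n", (long long)st.st_blocks);
    return 0;
}
```

With 4096-byte blocks, a 4096-byte file fits in one block, and growing it by a single byte requires a second block: blocks_needed(4096, 4096) is 1 and blocks_needed(4097, 4096) is 2. The first 4096 bytes stay in the block they already occupy.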

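SSDs and other flash memory storage, however, require that blocks (often 128 KB or larger) be fully erased before they can be written again. So, an update to a disk block is typically written to a fresh, already-erased location, and the drive's controller remaps the block to it. (This happens on every write that ends up on disk, not just the ones that write past the end of the file.) This remapping takes place inside the drive; as far as the file system is concerned, the file stays where it was.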

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow