How to store the file names, start offset and length while avoiding the issue of self imposed limits (lookup table) or having to scan the entire file?

softwareengineering.stackexchange https://softwareengineering.stackexchange.com/questions/240752

  •  03-10-2020

Question

I am attempting to learn more about C and its descendants (mainly C++). I have decided that I would like to create a "file system" of sorts. Not a particularly advanced one, mind you, but something to play with. I have no intention of making it mountable, securable or even recoverable.

At the moment I am stuck in concept land with trying to decide how to implement the MFT/FAT.

At first I thought I would just use the first X bytes to store a lookup table, but then I realized that this would limit the number of files I could store. Next I thought about storing some type of metadata with each file, but then I would have to scan the entire filesystem to locate a file.

I have read through this and this although the z80 link seems like it is more up my alley.

From a high level I want to be able to issue a command like:

./myfs funnycat.jpg mystorage.mfs

This would essentially append binary data to the end of mystorage.mfs.

How can I store the information that would contain the file names, start offset and length while avoiding the issue of self imposed limits (lookup table length) or having to scan the entire file (metadata with binary data)?

Concise explanation: I am looking for a way to label binary data stored in a single contiguous file so that I can pull data from a given offset range or by string, for example:

./myfs mystorage.mfs funnycat.jpg

To accomplish this, I will likely add some logic to myfs that checks whether the first argument is a blob containing other files.


Solution

Below are some possible solutions to some of the problems you have mentioned.

How do I store a number of arbitrary size?

A simple approach is to use a variable-length quantity (VLQ). For each octet used to represent your number, 7 of its 8 bits carry part of the number, and the remaining bit is set to 1 if more octets follow. Alternatively, you can just pick some "unreachable" fixed size and write code to support only that size.
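As a rough illustration of the idea, here is a minimal sketch of a VLQ encoder and decoder. It assumes the common convention of emitting the most significant 7-bit group first and caps values at 64 bits; the names vlq_encode, vlq_decode and VLQ_MAX_BYTES are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VLQ_MAX_BYTES 10  /* ceil(64 / 7): enough for any uint64_t */

/* Encode value into out (room for VLQ_MAX_BYTES octets required).
 * Each octet carries 7 payload bits; the high bit is 1 when more
 * octets follow. Returns the number of octets written. */
size_t vlq_encode(uint64_t value, uint8_t *out)
{
    uint8_t groups[VLQ_MAX_BYTES];
    size_t n = 0;

    assert(out != NULL);

    /* Split the value into 7-bit groups, least significant first. */
    do {
        groups[n++] = (uint8_t)(value & 0x7F);
        value >>= 7;
    } while (value != 0);

    /* Emit the groups in reverse so the most significant comes first. */
    for (size_t i = 0; i < n; i++) {
        uint8_t octet = groups[n - 1 - i];
        if (i + 1 < n)
            octet |= 0x80;  /* continuation bit: more octets follow */
        out[i] = octet;
    }
    return n;
}

/* Decode a VLQ from in[0..len). Returns the number of octets consumed,
 * or 0 if the input ends before an octet with a clear high bit. */
size_t vlq_decode(const uint8_t *in, size_t len, uint64_t *value)
{
    uint64_t result = 0;

    assert(in != NULL && value != NULL);

    for (size_t i = 0; i < len && i < VLQ_MAX_BYTES; i++) {
        result = (result << 7) | (uint64_t)(in[i] & 0x7F);
        if ((in[i] & 0x80) == 0) {
            *value = result;
            return i + 1;
        }
    }
    return 0;
}
```

With this layout, 127 fits in the single octet 0x7F, while 300 becomes the two octets 0x82 0x2C. Small offsets and lengths stay small on disk, and nothing in the format itself limits how large a number can get; only the 64-bit cap chosen in this sketch does.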

How can I tell when I've reached the end of a file?

You can use some form of file allocation table (you can represent it as a linked list to allow it to grow to arbitrary size).
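One way this could look, as a sketch only: the table is split into small blocks, and each block stores the offset of the next one, so a new block can be appended to the end of the container file whenever the existing ones fill up. Lookups then walk the chain of table blocks instead of scanning all the data. Everything here (the mfs_ names, the 32-byte name limit, 4 entries per block, writing structs directly so the file is only readable on the same platform) is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MFS_NAME_LEN          32  /* hypothetical limit, includes the NUL */
#define MFS_ENTRIES_PER_BLOCK 4   /* small on purpose, to show chaining */

struct mfs_entry {
    char     name[MFS_NAME_LEN];  /* empty name marks a free slot */
    uint64_t offset;              /* where the file's data starts */
    uint64_t length;              /* size of the data in bytes */
};

struct mfs_table_block {
    uint64_t         next;        /* offset of next table block, 0 = none */
    struct mfs_entry entries[MFS_ENTRIES_PER_BLOCK];
};

static int read_block(FILE *f, uint64_t pos, struct mfs_table_block *b)
{
    return fseek(f, (long)pos, SEEK_SET) == 0 && fread(b, sizeof *b, 1, f) == 1;
}

static int write_block(FILE *f, uint64_t pos, const struct mfs_table_block *b)
{
    return fseek(f, (long)pos, SEEK_SET) == 0 && fwrite(b, sizeof *b, 1, f) == 1;
}

/* Write an empty first table block at offset 0. */
int mfs_format(FILE *f)
{
    struct mfs_table_block b;
    memset(&b, 0, sizeof b);
    return write_block(f, 0, &b);
}

/* Look a name up by walking only the chain of table blocks. */
int mfs_find(FILE *f, const char *name, struct mfs_entry *out)
{
    struct mfs_table_block b;
    uint64_t pos = 0;

    assert(name != NULL && name[0] != '\0');
    do {
        if (!read_block(f, pos, &b))
            return 0;
        for (int i = 0; i < MFS_ENTRIES_PER_BLOCK; i++) {
            if (strncmp(b.entries[i].name, name, MFS_NAME_LEN) == 0) {
                *out = b.entries[i];
                return 1;
            }
        }
        pos = b.next;
    } while (pos != 0);
    return 0;
}

/* Record a file; when every slot is taken, append a new table block. */
int mfs_add(FILE *f, const char *name, uint64_t offset, uint64_t length)
{
    struct mfs_table_block b, fresh;
    struct mfs_entry e;
    uint64_t pos = 0;
    long end;

    if (name[0] == '\0' || strlen(name) >= MFS_NAME_LEN)
        return 0;
    memset(&e, 0, sizeof e);
    strcpy(e.name, name);
    e.offset = offset;
    e.length = length;

    for (;;) {
        if (!read_block(f, pos, &b))
            return 0;
        for (int i = 0; i < MFS_ENTRIES_PER_BLOCK; i++) {
            if (b.entries[i].name[0] == '\0') {
                b.entries[i] = e;
                return write_block(f, pos, &b);
            }
        }
        if (b.next == 0)
            break;
        pos = b.next;
    }

    /* Chain is full: put a new block at the end of the file, then link it. */
    if (fseek(f, 0, SEEK_END) != 0 || (end = ftell(f)) < 0)
        return 0;
    memset(&fresh, 0, sizeof fresh);
    fresh.entries[0] = e;
    b.next = (uint64_t)end;
    return write_block(f, (uint64_t)end, &fresh) && write_block(f, pos, &b);
}
```

Because table blocks and file data can both be appended to the end of the container, the number of files is not fixed up front, and finding funnycat.jpg costs one read per table block rather than a pass over every stored byte.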

I want to support disk fragmentation. I.e., I should be able to grow a file, even if the file is about to overlap another file, without moving either file.

Have each file be made up of clusters of some fixed size. Each cluster will have a header with information like:

  • How much of the cluster is allocated? (Clusters are a fixed size, so a cluster may not be fully populated with data.)
  • Where is the next cluster? (A file may need more than one cluster. You could store this information in an allocation table instead, of course).
  • Is this cluster allocated? (If this information is in the cluster rather than in an allocation table, you'll be forced to "format" your entire "hard drive" using a "slow format." Using a file allocation table allows you to perform a "quick format." Mind you, you could cheat by leveraging your existing file system, i.e., you can assume every byte is initialized to 0.)
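The cluster header described in the list above could be sketched roughly as follows. This version keeps all three pieces of information in the cluster itself and works on an in-memory array standing in for the "disk"; the 64-byte cluster size and the names cluster_alloc, file_append and file_read are assumptions for the example, not part of any real file system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CLUSTER_SIZE 64u         /* tiny on purpose; real clusters are larger */
#define NO_CLUSTER   UINT32_MAX  /* marks the end of a file's chain */

struct cluster_header {
    uint32_t next;       /* index of the next cluster, or NO_CLUSTER */
    uint16_t used;       /* payload bytes actually holding data */
    uint8_t  allocated;  /* 0 = free, so a zero-filled disk is "formatted" */
};

#define PAYLOAD_SIZE (CLUSTER_SIZE - sizeof(struct cluster_header))

struct cluster {
    struct cluster_header hdr;
    uint8_t payload[PAYLOAD_SIZE];
};

/* Claim the first free cluster. Returns its index, or NO_CLUSTER if full. */
uint32_t cluster_alloc(struct cluster *disk, uint32_t count)
{
    for (uint32_t i = 0; i < count; i++) {
        if (!disk[i].hdr.allocated) {
            disk[i].hdr.allocated = 1;
            disk[i].hdr.used = 0;
            disk[i].hdr.next = NO_CLUSTER;
            return i;
        }
    }
    return NO_CLUSTER;
}

/* Append len bytes to the file starting at cluster `first`, chaining in
 * new clusters wherever they happen to be free. Returns 1 on success,
 * 0 if the disk runs out of clusters. */
int file_append(struct cluster *disk, uint32_t count, uint32_t first,
                const uint8_t *data, size_t len)
{
    uint32_t cur = first;

    assert(first < count && disk[first].hdr.allocated);
    while (disk[cur].hdr.next != NO_CLUSTER)  /* walk to the last cluster */
        cur = disk[cur].hdr.next;

    while (len > 0) {
        size_t room = PAYLOAD_SIZE - disk[cur].hdr.used;
        if (room == 0) {
            uint32_t fresh = cluster_alloc(disk, count);
            if (fresh == NO_CLUSTER)
                return 0;
            disk[cur].hdr.next = fresh;
            cur = fresh;
            room = PAYLOAD_SIZE;
        }
        size_t n = len < room ? len : room;
        memcpy(disk[cur].payload + disk[cur].hdr.used, data, n);
        disk[cur].hdr.used = (uint16_t)(disk[cur].hdr.used + n);
        data += n;
        len -= n;
    }
    return 1;
}

/* Copy a file's content into out (capacity cap). Returns bytes copied. */
size_t file_read(const struct cluster *disk, uint32_t first,
                 uint8_t *out, size_t cap)
{
    size_t total = 0;

    for (uint32_t cur = first; cur != NO_CLUSTER; cur = disk[cur].hdr.next) {
        size_t n = disk[cur].hdr.used;
        if (n > cap - total)
            n = cap - total;
        memcpy(out + total, disk[cur].payload, n);
        total += n;
    }
    return total;
}
```

If file A owns cluster 0 and file B owns cluster 1, growing A past one cluster simply links cluster 0 to the next free cluster (cluster 2 here), so neither file has to be relocated. The linear scan in cluster_alloc is the cost of keeping the "allocated" flag in each cluster instead of in a separate table.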
Licensed under: CC-BY-SA with attribution