Question

A database file system is a file system that is a database instead of a hierarchy. Not too complex an idea initially but I thought I'd ask if anyone has thought about how they might do something like this? What are the issues that a simple plan is likely to miss? My first guess at an implementation would be something like a filesystem to for a Linux platform (probably atop an existing file system) but I really don't know much about how that would be started. Its a passing thought that I doubt I'd ever follow through on but I'm hoping to at least satisfy my curiosity.

Was it helpful?

Solution

The easiest way would be to build it using fuse, with a database back-end.

A more difficult thing to do is to have it as a kernel module (VFS).

On Windows, you could use IFS.

OTHER TIPS

DBFS is a really nice PoC implementation for KDE. Instead of implementing it as a file system directly, it is based on indexing on a traditional file system, and building a new user interface to make the results accessible to users.

I'm not really sure what you mean with "A database file system is a file system that is a database instead of a hierarchy".

Probably, using "Filesystem in Userspace" (FUSE), as mentioned by Osama ALASSIRY, is a good idea. The FUSE wiki lists a lot of existing projects about databased-backed filesystems as well as filesystems in which you can search by SQL-like queries.

Maybe this is a good starting point for getting an idea how it could work.

It's a basic overview of the Firebird architecture.

Firebird is an opensource RDBMS, so you can have a real deep insight look, too, if you're interested.

Its been a while since you asked this. I'm surprised no one suggested the obvious. Look at mainframes and minis, especially iSeries-OS (now called IBM-i used to be called iOS or OS/400).

How to do an relational database as a mass data store is relatively easy. Oracle and MySQL both have these. The catch is it must be essentially ubiquitous for end user applications.

So the steps for an app conversion are:

1) Everything in a normal hierarchical filesystem

2) Data in BLOBs with light metadata in the database. File with some catalogue information.

3) Large data in BLOBs with extensive metadata and complex structures in the database. File with substantial metadata associated with it that can be essentially to understanding the structure.

4) Internal structures of the BLOB exposed in an object <--> Relational map with extensive meta-data. While there may be an exportable form, the application naturally works with the database, the notion of the file as the repository is lost.

If you use Python then have a look at SpiderOak's Transactional Storage System:

A python accessible filesystem API that supports most typical file operations (directories, folders, file creation, reading, writing, seeking, renaming, etc.) with transactional fault tolerance. You can open the filesystem, perform any number of modifications, and then commit or rollback all changes atomically. This lets us build SpiderOak using simple, traditional files objects as containers. Uses SQLite internally as a storage system, and thus keeps the entire filesystem within a single on-disk file. Is multiprocess and threadsafe.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top