Question

This article describes DBMSs as caching data at the application layer, which is why they often use direct I/O to bypass the OS file buffer cache.

My question is: what benefits are provided by the DBMS caching information at the application layer? How are these benefits affected when the DBMS is on a server that is remote to the application?

Solution

Databases typically implement a lot of OS-style functionality themselves. At a high level, this is done to improve performance and scalability.

To give you some examples:

Operating system file caches are generic, least-recently-used style caches. A database needs much finer control over what goes into the cache and what does not. For example, when a query builds hash tables, you want to give them preference over file data while the query runs (it is cheaper to bring in some new file data than to page out hash tables). A database also needs very careful control of the memory allocator to avoid fragmenting the heap. Most operating systems are simply not up to the task of handling the high-speed allocations that databases need, and databases typically implement their own memory manager, purpose-built for the database engine.
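
To make the memory-manager point concrete, here is a minimal C sketch (hypothetical, not taken from any particular engine) of the kind of fixed-size page pool a database uses instead of the general-purpose heap: one large aligned slab carved into pages and recycled through a freelist, so the hot allocation path is O(1) and never fragments.

```c
/* Minimal sketch of a database-style page allocator (hypothetical). */
#define _POSIX_C_SOURCE 200112L
#include <stdlib.h>

#define PAGE_SIZE  8192        /* a single fixed page size avoids fragmentation */
#define POOL_PAGES 1024

typedef struct page { struct page *next_free; } page_t;

typedef struct {
    void   *slab;              /* one big aligned allocation, carved into pages */
    page_t *free_list;         /* LIFO freelist of recycled pages */
} page_pool_t;

static int pool_init(page_pool_t *p) {
    if (posix_memalign(&p->slab, PAGE_SIZE, (size_t)POOL_PAGES * PAGE_SIZE))
        return -1;
    p->free_list = NULL;
    for (int i = 0; i < POOL_PAGES; i++) {      /* thread every page onto the freelist */
        page_t *pg = (page_t *)((char *)p->slab + (size_t)i * PAGE_SIZE);
        pg->next_free = p->free_list;
        p->free_list = pg;
    }
    return 0;
}

static void *pool_alloc(page_pool_t *p) {       /* O(1), no malloc on the hot path */
    page_t *pg = p->free_list;
    if (pg) p->free_list = pg->next_free;
    return pg;                                  /* NULL when the pool is exhausted */
}

static void pool_free(page_pool_t *p, void *pg) {
    ((page_t *)pg)->next_free = p->free_list;   /* recycle, never return to the heap */
    p->free_list = (page_t *)pg;
}
```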

Operating system caches also don't guarantee persistence unless you call fsync on Linux or request unbuffered (write-through) I/O on Windows. To guarantee consistency and the ACID properties, a database needs much finer control over when data is on disk and when it is cached. Because of this, there is very little benefit to using synchronous, buffered I/O.
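
As an illustration of that control, here is a Linux-flavoured sketch of a durable write path: open with O_DIRECT to bypass the file cache, then fdatasync before acknowledging the transaction. The function name and page size are invented for the example.

```c
/* Sketch of the durable write path a database uses instead of buffered I/O. */
#define _GNU_SOURCE            /* for O_DIRECT */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define PAGE_SIZE 8192

int write_page_durably(const char *path, const void *page, off_t offset) {
    int fd = open(path, O_WRONLY | O_CREAT | O_DIRECT, 0600);
    if (fd < 0) return -1;

    /* O_DIRECT requires the buffer, length, and offset to be block-aligned */
    void *buf;
    if (posix_memalign(&buf, PAGE_SIZE, PAGE_SIZE)) { close(fd); return -1; }
    memcpy(buf, page, PAGE_SIZE);

    int rc = -1;
    if (pwrite(fd, buf, PAGE_SIZE, offset) == PAGE_SIZE &&
        fdatasync(fd) == 0)    /* data is on stable storage after this returns */
        rc = 0;

    free(buf);
    close(fd);
    return rc;
}
```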

Furthermore, most OS caches are relatively poorly implemented at scale. The Windows file system cache only got really good NUMA-aware scaling around Windows Server 2008, Linux still has very poor file system performance at high core counts, and FreeBSD doesn't even know what NUMA is. Since databases need to run on very large servers (in some cases up to 32 CPU sockets), database vendors often implement their own caching data structures that are much more scalable than what the OS provides by default.
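
A common shape for such a structure is a partitioned (sharded) cache: each shard has its own lock, so lookups on different shards never contend across cores or sockets. A hypothetical sketch, not any vendor's actual layout:

```c
/* Sketch of a partitioned buffer cache for high core counts (hypothetical). */
#include <pthread.h>
#include <stdint.h>

#define N_PARTITIONS 64        /* power of two, sized to the core count */

typedef struct {
    pthread_mutex_t lock;      /* contention stays confined to one shard */
    /* ... this shard's hash table of cached pages would live here ... */
} cache_partition_t;

static cache_partition_t partitions[N_PARTITIONS];

static void cache_init(void) {
    for (int i = 0; i < N_PARTITIONS; i++)
        pthread_mutex_init(&partitions[i].lock, NULL);
}

static cache_partition_t *partition_for(uint64_t page_id) {
    uint64_t h = page_id * 0x9E3779B97F4A7C15ULL;   /* cheap multiplicative hash */
    return &partitions[h >> 58];                    /* top 6 bits -> 64 shards */
}

static void cache_touch(uint64_t page_id) {
    cache_partition_t *p = partition_for(page_id);
    pthread_mutex_lock(&p->lock);
    /* ... probe this shard's table, pin the page, bump its LRU position ... */
    pthread_mutex_unlock(&p->lock);
}
```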

And last but not least: databases are very hungry for I/O and can drive the file system much harder than a typical file server. Relying on traditional, synchronous I/O is simply too slow for most databases, so they instead use aggressive, core-affinitised, asynchronous I/O completion to gain scale. The I/O subsystem of a database tends to be much more advanced than what a standard OS provides.
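
For a feel of what completion-based asynchronous I/O looks like, here is a sketch using Linux io_uring via liburing (the helper name and queue depth are invented for the example; a real engine pins one ring per core and keeps many requests in flight). Build with -luring.

```c
/* Sketch of asynchronous, completion-based I/O with io_uring (illustrative). */
#include <liburing.h>

#define QUEUE_DEPTH 64

int read_page_async(int fd, void *buf, unsigned len, off_t offset) {
    struct io_uring ring;
    if (io_uring_queue_init(QUEUE_DEPTH, &ring, 0) < 0) return -1;

    struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);   /* queue the read */
    if (!sqe) { io_uring_queue_exit(&ring); return -1; }
    io_uring_prep_read(sqe, fd, buf, len, offset);
    io_uring_submit(&ring);            /* one syscall can submit many I/Os */

    /* the engine does other work here, then reaps completions in a batch */
    struct io_uring_cqe *cqe;
    io_uring_wait_cqe(&ring, &cqe);
    int res = cqe->res;                /* bytes read, or -errno on failure */
    io_uring_cqe_seen(&ring, cqe);

    io_uring_queue_exit(&ring);
    return res;
}
```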

In general, you will find much more advanced control of machine resources in database source code than in operating system kernels. In fact, most high-scale databases implement their own user-space "kernel" to avoid using the operating system's kernel primitives. If you see a database do a lot of kernel transitions, start looking around for a better product.
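
One concrete instance of avoiding kernel transitions is a user-space spinlock built on atomics: on the uncontended path it never enters the kernel, whereas a mutex may fall back to a futex syscall. A minimal C11 sketch (production engines add backoff and fairness):

```c
/* Test-and-test-and-set spinlock using C11 atomics; no syscalls on this path. */
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_bool locked; } spinlock_t;   /* init: { false } */

static void spin_lock(spinlock_t *l) {
    while (atomic_exchange_explicit(&l->locked, true, memory_order_acquire)) {
        /* spin on a plain load to avoid hammering the cache line */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed))
            ;
    }
}

static void spin_unlock(spinlock_t *l) {
    atomic_store_explicit(&l->locked, false, memory_order_release);
}
```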

Licensed under: CC-BY-SA with attribution