Question

What data structure is used in RDBMS for storing the actual data that we enter, firstname lastname and so on, I know that B and B+ trees are efficient for indexing and so on, but I haven't gotten a cpnvincing answer for this, please excuse my ignorance.

Was it helpful?

Solution

This is typically up to the storage engine, and raw data is going to be stored for different classifications of database data, of which there are at least three common divisions:

  • Indexes (keys to point to rows containing that value)
  • Row data (non-blob data, some info here)
  • Blob data (usually bulk data; think of this as extended data that you typically don't search on, but retrieve in bulk after looking up a row by other keys)

Most of the info I'll include here will be based on knowledge of MySQL.

Indexes

For instance, MySQL has several, including:

From what I can tell, InnoDB and MyISAM use B-tree indexes, while the memory storage engine allows you to specify b-tree or hash indexes.

The docs even contain a page comparing how it uses each.

Other RDBMS's are going to use their own specifications, but I would imagine B-tree is common.

Typically an index is going to be like a small table of its own; with the index value being a primary key, and the value being a list of primary keys for rows that contain that value.

Row data

This answer is again complicated and up to the storage engine. I would read into storage engines if you are interested in implementation details.

Rows are stored in a data format that can be looked up quickly by some primary key. The speed is aided by the fact that rows typically have a relative small (think 216 bytes) fixed limit, after which further data will be pushed to extended data pages.

The primary key is always indexed, and other values can be optionally indexed. If they are not, then the only way for the storage engine to find them will be a "table scan" -- literally, looping through all data comparing column values to the value you are looking for.

Blob data

Think of blob data like a big filesystem with no special indexing properties other than being able to be looked up by the primary key for their row. They also lose the benefits of fixed space allocated per row, which is a tradeoff for being able to store large, arbitrary amounts of data.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top