Question

According to Amazon, Redshift is based on PostgreSQL and is a column-oriented database management system. This looks to me like a self-contradictory statement: isn't Postgres a row-oriented database?


Solution

A Postgres server is a lot of things, not just row-oriented access methods:

[Image: diagram from the Postgres documentation]

The Postgres source code is available for anyone to use under a very permissive license. To implement a DBMS that is "based on PostgreSQL" you don't have to rewrite it from scratch. If, say, you want to introduce a different layout for storing your data, you can likely reuse, with little or no change, the main server process, the client APIs, the query parser and rewriter, and most of the utility and security functions. You would probably need to modify the plan generator and executor, add new access methods, and replace parts of the page storage manager. Given the resources at Amazon's disposal, this doesn't look like an impossible undertaking.

Amazon says essentially as much:

the specialized data storage schema and query execution engine that Amazon Redshift uses are completely different from the PostgreSQL implementation. [...] Amazon Redshift stores data in columns, using specialized data compression encodings for optimum memory usage and disk I/O. Some PostgreSQL features that are suited to smaller-scale OLTP processing, such as secondary indexes and efficient single-row data manipulation operations, have been omitted [...]
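
To make the separation between "query layer" and "storage layout" concrete, here is a toy sketch in Python. It is not Postgres or Redshift code; the class and function names (RowStore, ColumnStore, total) are made up for illustration. It only shows how the same caller-facing logic can sit on top of either a row-oriented or a column-oriented layout, as long as both expose the same scan interface.

```python
class RowStore:
    """Row-oriented layout: every record is kept together as one tuple."""

    def __init__(self, column_names, rows):
        self.column_names = list(column_names)
        self.rows = [tuple(r) for r in rows]

    def scan(self, columns):
        # Has to walk every full row, even if only one column is wanted.
        idx = [self.column_names.index(c) for c in columns]
        for row in self.rows:
            yield tuple(row[i] for i in idx)


class ColumnStore:
    """Column-oriented layout: every column is kept as its own list."""

    def __init__(self, column_names, rows):
        rows = [tuple(r) for r in rows]
        self.columns = {
            name: [r[i] for r in rows] for i, name in enumerate(column_names)
        }

    def scan(self, columns):
        # Reads only the requested columns and stitches them back into tuples.
        return zip(*(self.columns[c] for c in columns))


def total(storage, column):
    """Stand-in for the shared upper layers: identical for both layouts."""
    return sum(value for (value,) in storage.scan([column]))


names = ["id", "region", "amount"]
data = [(1, "eu", 10), (2, "us", 25), (3, "eu", 7)]

row_total = total(RowStore(names, data), "amount")
col_total = total(ColumnStore(names, data), "amount")
print(row_total, col_total)
```

Both calls return the same sum; only the code underneath scan differs. In a real system the planner and executor would also change to take advantage of the columnar layout, which is what the quoted passage describes.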

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange