Question

For the same data set, consisting mostly of text, how does the data size (table + index) of PostgreSQL compare to that of MySQL?

  • PostgreSQL uses MVCC, which would suggest its data size is bigger.

  • In this presentation, the largest blog site in Japan talked about their migration from PostgreSQL to MySQL. One of their reasons for moving away from PostgreSQL was that the data size in PostgreSQL was too large (p. 41): Migrating from PostgreSQL to MySQL at Cocolog, Japan's Largest Blog Community

  • PostgreSQL has data compression, which should make the data size smaller. But MySQL Plugin also has compression.

Does anyone have actual experience of how the data sizes of PostgreSQL and MySQL compare to each other?

Solution

  • MySQL uses MVCC as well; just look at InnoDB. But in PostgreSQL you can change the FILLFACTOR to leave room for future updates. With this, you can create a database that has space for the current data and also for some future updates and deletes. When autovacuum and HOT do their jobs properly, the size of your database can stay stable.
  • The blog is about old versions. A lot has changed, and PostgreSQL does a much better job of compression than it did in the old days.
  • Compression depends on the data type and the configuration, and it affects speed as well. You have to test to see how it works for your situation.
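
One way to run such a test is to load the same data into both systems and compare the reported sizes. The following is only a sketch: the table name "posts" and the schema name "blog" are hypothetical, the queries would be run through whatever database driver you use, and `pg_indexes_size` needs PostgreSQL 9.0 or later.

```python
# Sketch: SQL text for measuring table + index size on each system.
# "posts" and "blog" are hypothetical names, not taken from the question.

# PostgreSQL: size functions take the table name (cast to regclass).
# pg_total_relation_size includes the heap, TOAST data and all indexes.
POSTGRES_SIZE_SQL = """
SELECT pg_relation_size('posts')       AS heap_bytes,
       pg_indexes_size('posts')        AS index_bytes,
       pg_total_relation_size('posts') AS total_bytes
"""

# MySQL: information_schema reports data and index size per table.
# For InnoDB these figures are approximations.
MYSQL_SIZE_SQL = """
SELECT data_length                AS data_bytes,
       index_length               AS index_bytes,
       data_length + index_length AS total_bytes
FROM information_schema.tables
WHERE table_schema = 'blog' AND table_name = 'posts'
"""


def percent_smaller(pg_bytes, mysql_bytes):
    """How much smaller (in percent) the PostgreSQL copy is than the MySQL copy."""
    return 100.0 * (mysql_bytes - pg_bytes) / mysql_bytes
```

Feeding the two `total_bytes` values into `percent_smaller` gives a single number to compare across test loads; a negative result means the PostgreSQL copy came out larger.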

I did a couple of conversions from MySQL to PostgreSQL, and in all of these cases PostgreSQL was about 10% smaller (MySQL 5.0 => PostgreSQL 8.3 and 8.4). That 10% was used to lower the fillfactor on the most heavily updated tables, which were set to a fillfactor of 60 to 70. Speed was much better (no more problems with over 20 concurrent users) and data size was stable as well: no MVCC going out of control and no vacuum falling too far behind.
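
To get a feel for what a lower fillfactor costs in disk space, here is a rough back-of-the-envelope sketch. It assumes a simplified model in which the size scales as 100 / fillfactor and ignores page headers and row alignment; the helper names and the table name "posts" are hypothetical.

```python
def size_at_fillfactor(packed_bytes, fillfactor):
    """Approximate size when pages are filled only to `fillfactor` percent.

    Simplified model: ignores page headers and row alignment.
    """
    if not 10 <= fillfactor <= 100:
        raise ValueError("table fillfactor must be between 10 and 100")
    return packed_bytes * 100.0 / fillfactor


def fillfactor_sql(table, fillfactor):
    """Build the ALTER TABLE statement that sets a table's fillfactor."""
    return "ALTER TABLE {} SET (fillfactor = {});".format(table, int(fillfactor))


# A table that needs 700 MB fully packed needs about 1000 MB at fillfactor 70.
print(size_at_fillfactor(700, 70))
print(fillfactor_sql("posts", 70))
```

Note that a changed fillfactor only applies to pages written afterwards; existing pages keep their current packing until the table is rewritten.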

MySQL and PostgreSQL are two different beasts: PostgreSQL is all about reliability, whereas MySQL is popular.

Other tips

Both have their storage requirements in their respective documentation:

MySQL: http://dev.mysql.com/doc/refman/5.1/en/storage-requirements.html
Postgres: http://www.postgresql.org/docs/current/interactive/datatype.html

A quick comparison of the two doesn't show any flagrant "zomg Postgres requires 2 megabytes to store a bit field" type differences. I suppose Postgres could have higher metadata overhead than MySQL, or might have to extend its data files in larger chunks, but I can't find anything obvious where Postgres "wastes" space and migrating to MySQL is the cure.

I'd like to add that PostgreSQL also compresses large column values, using a "fairly simple and very fast member of the LZ family of compression techniques".

To read more about this, check out http://www.postgresql.org/docs/9.0/static/storage-toast.html

It's rather low-level and probably not necessary to know, but since you're using a blog, you may benefit from it.

About indexes,

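MySQL (InnoDB) stores the row data within the primary key index, and every secondary index entry also carries the primary key columns, which can make the indexes huge. Postgres doesn't: the table lives in a separate heap, and a b-tree index entry holds only the indexed column values plus a fixed-size pointer to the row. This means that the storage size of a b-tree index in Postgres depends on the columns it actually indexes, not on the width of the primary key or of the rest of the row.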

Postgres also supports partial indexes (e.g. WHERE status=0), a very powerful feature that avoids building indexes over millions of rows when only a few hundred are needed.
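
The effect of a partial index on size can be illustrated with a small sketch. As a stand-in it uses SQLite through Python's standard library, because SQLite accepts the same CREATE INDEX ... WHERE syntax and needs no server; the page counts are SQLite's and will not match Postgres. The table, the column names and the row counts are made up for the illustration.

```python
import sqlite3


def index_pages(create_index_sql, total_rows=20000, active_rows=200):
    """Return how many database pages the given index adds (SQLite stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE posts (id INTEGER PRIMARY KEY, status INTEGER, body TEXT)"
    )
    # Only the first `active_rows` rows get status 0; the rest get status 1.
    rows = ((i, 0 if i < active_rows else 1, "x" * 50) for i in range(total_rows))
    conn.executemany("INSERT INTO posts VALUES (?, ?, ?)", rows)
    conn.commit()
    before = conn.execute("PRAGMA page_count").fetchone()[0]
    conn.execute(create_index_sql)
    conn.commit()
    after = conn.execute("PRAGMA page_count").fetchone()[0]
    conn.close()
    return after - before


# Full index: one entry per row in the table.
full = index_pages("CREATE INDEX posts_status ON posts (status)")
# Partial index: entries only for the rows that match the predicate.
partial = index_pages(
    "CREATE INDEX posts_status_open ON posts (status) WHERE status = 0"
)
print("full index pages:", full, "partial index pages:", partial)
```

The partial index only contains entries for the rows with status 0, so it stays small no matter how large the rest of the table grows.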

Since you're going to put a lot of data in Postgres, you will probably find it practical to be able to create indexes without locking the table against writes (CREATE INDEX CONCURRENTLY).

Sent from my iPhone. Sorry for bad spelling and lack of references

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow