Question

This is my first post here and I've tried to do my homework.

I need a better way to develop Data Warehouse databases for Postgres than using Power*Architect. Here is the whole sad story:

I work in Data Warehouse development, mostly on free/open platforms (Linux+Postgres). None of the tables I create have relationships among them, because in DW environments FK enforcement adds little to no value. This is so because I have total control of what goes into the tables. Also, when doing maintenance (like updating or deleting wrong rows) those constraints would add extra work, so I keep them out.

To model those DW databases I must draw diagrams, and it is good practice to show which table relates to which through a foreign key/primary key. Seeing the relationships helps me communicate the model to the customer and also helps its development.

However, most database modelers, like Power*Architect, generate code to actually create the constraints. Other than modifying the Power*Architect code myself, there is no way to stop it from doing that.

And here is where the problem really hits me: when I start evolving the model, whenever I compare the current database (which I keep totally free of relationships) against my model (which is full of FK relationship lines) to find what to change (tables, columns, indexes), Power*Architect complains that the relationships are missing and "kindly" generates the code to create them. I have to read the diff with a lot of mental filtering (trying to read the code without taking in the ADD CONSTRAINT commands) and, when applying the diff'd SQL to my database, I have to remove those commands manually.
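The manual removal step described above could be automated with a small filter. The following is only a sketch under stated assumptions: the generated script terminates each statement with a semicolon, no semicolons appear inside string literals or comments, no comment precedes the `ALTER TABLE` keyword within a statement, and foreign keys are emitted in the form `ALTER TABLE ... ADD CONSTRAINT ... FOREIGN KEY ...`. The function name `strip_fk_statements` is made up for illustration, and the actual output of Power*Architect may differ.

```python
import re

# Matches a statement that adds a foreign key constraint, for example:
#   ALTER TABLE fact_sales ADD CONSTRAINT fk_date
#   FOREIGN KEY (date_id) REFERENCES dim_date (date_id)
FK_STATEMENT = re.compile(
    r"^\s*ALTER\s+TABLE\b.*?\bADD\s+CONSTRAINT\b.*?\bFOREIGN\s+KEY\b",
    re.IGNORECASE | re.DOTALL,
)


def strip_fk_statements(script: str) -> str:
    """Return the script without the statements that add foreign keys.

    Hypothetical helper: it splits naively on ';', so it is only safe
    for simple generated DDL (see the assumptions in the text above).
    """
    kept = []
    for statement in script.split(";"):
        if not statement.strip():
            continue  # skip empty fragments, e.g. after the last ';'
        if FK_STATEMENT.match(statement):
            continue  # drop the foreign key statement
        kept.append(statement.strip() + ";")
    return "\n".join(kept)
```

Running the diff script through such a filter before applying it would leave table, column and index changes intact while dropping only the foreign key additions.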

That would not be a problem if, like on Oracle, I could tell Postgres to turn off the relationship constraints. I could keep the neat diagrams in Power*Architect without worrying about the relationships in effect in the database. But I can't (and I am never going to move to Oracle). So my problem would be solved if I could leave those constraints on all the time, provided that does not significantly degrade queries against the database. I can accept some performance loss when writing, as long as the degradation stays around or below 10% (like an INSERT taking 1'06" instead of 1'00").

So, the question is:

"Is there any significant performance loss when doing a lot of writing, plus some small updates and deletes, on Postgres tables with FK relationship constraints in effect?"

As per these posts, Does Foreign Key improve query performance? and http://www.experts-exchange.com/Database/MS-SQL-Server/A_4293-Can-Foreign-key-improve-performance.html, MS SQL Server suffers performance degradation when writing and might see a small performance gain when reading. Is that a general rule? Does Postgres behave similarly? I have looked around and didn't find a direct answer. This answer https://stackoverflow.com/a/83527/3507015 comments on the NO ACTION flag. Would that do the job (I mean, keep the relationships without penalizing performance)? (Postgres has this option, but as per http://www.postgresql.org/docs/8.1/static/ddl-constraints.html it does not seem to mean "do nothing"; instead it does not allow the action to be taken.)

Of course, it would be best to have a database designer with the same features as Power*Architect (database reverse/forward engineering, database/model comparison, etc.) that allowed me to draw the lines without requiring them to be created as constraints, and without trying to create them every time I compare the model to the database.

Sorry for making this such a long post to ask a single question.


Solution

Yes, there is a performance hit.

If you insert or update on the referencing side of a relationship, PostgreSQL must do an index lookup on the referenced unique key to verify that the referenced row exists. It only has to do this when the actual key field changes, though.

If you update or delete from the referenced side, PostgreSQL must do a lookup on the referencing side of each relationship that points to the key, making sure no rows currently rely on the referenced row being deleted. This results in a horribly slow seqscan unless you have an index on the referencing side - in which case it's still an index lookup, and you have to pay the cost to maintain an index you might not otherwise need.
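To avoid the sequential scan mentioned above, each referencing column (or column set) needs its own index, since PostgreSQL does not create one automatically for the referencing side of a foreign key. As a rough sketch, the statement could be generated like this; the helper name `referencing_index_ddl`, the index naming scheme and the table names in the usage example are all assumptions for illustration, and identifiers are not quoted, so it only suits plain lowercase names.

```python
from typing import Sequence


def referencing_index_ddl(table: str, columns: Sequence[str]) -> str:
    """Build a CREATE INDEX statement covering the referencing columns.

    Hypothetical helper: the '<table>_<columns>_idx' naming scheme is
    just a convention chosen here, not something PostgreSQL requires.
    """
    suffix = "_".join(columns)
    column_list = ", ".join(columns)
    return f"CREATE INDEX {table}_{suffix}_idx ON {table} ({column_list});"


# Example with made-up names: fact_sales.date_id references dim_date.
print(referencing_index_ddl("fact_sales", ["date_id"]))
```

Whether the extra index is worth its maintenance cost depends on how often rows on the referenced side are updated or deleted; in a load-mostly warehouse that may be rare.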

You can disable FKs in PostgreSQL, though. They're triggers, and like any triggers they may be disabled. You cannot do so globally; you must do it for each table. It's quite likely, though, that your tool won't notice that they're disabled and so won't try to "fix" them.
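A minimal sketch of the per-table approach, generating `ALTER TABLE ... DISABLE TRIGGER ALL` statements for a list of tables. A few caveats apply: disabling the internally generated constraint triggers requires superuser privileges, `ALL` also disables any user-defined triggers on the table, and rows written while the triggers are off are not re-checked when they are enabled again. The helper name `trigger_toggle_ddl` and the table names are assumptions for illustration.

```python
from typing import Iterable, List


def trigger_toggle_ddl(tables: Iterable[str], enable: bool) -> List[str]:
    """Build one ALTER TABLE statement per table to toggle all triggers.

    Hypothetical helper: it only builds SQL text and does not connect
    to a database. Identifiers are not quoted.
    """
    action = "ENABLE" if enable else "DISABLE"
    return [f"ALTER TABLE {table} {action} TRIGGER ALL;" for table in tables]


# Example with made-up table names: switch FK enforcement off.
for statement in trigger_toggle_ddl(["fact_sales", "dim_date"], enable=False):
    print(statement)
```

Both the referencing and the referenced table carry constraint triggers for a foreign key, so the list would normally include the tables on both sides of each relationship.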

Do be aware that PostgreSQL's query planner is permitted to rely on the validity of foreign keys, and if the relationship isn't enforced this could result in queries producing incorrect results. In practice, I don't think the planner does anything much with FKs at this point (only check and unique constraints), so it shouldn't really matter.

Licensed under: CC-BY-SA with attribution