Question

I am trying to learn transactions better in PostgreSQL and came across this observation.

I am creating multiple tables from SQL statements across multiple sessions, using the CREATE TABLE table_name AS sql_stmt (CTAS) syntax. I consider this safe to run concurrently across my sessions because it does not update any existing data, so I want each session to hold the least restrictive lock possible at all times.

That said, each session still writes data (the new table), so I worry that statements referencing the same source table might take a lock that blocks the other sessions, even though I know the operations are safe. All my queries only read from existing tables and no existing table is updated; the one caveat is that each result is saved as a new table, so the workload is not 100% read-only and might take a more restrictive lock at some point. My question is therefore: how can I ensure the least restrictive locking across all concurrent sessions while I run my CTAS workload?

For an example, let us say I have two sessions. One is issuing the following command:

CREATE TABLE t1 AS SELECT * FROM facttable;

The other concurrently issues:

CREATE TABLE t2 AS SELECT * FROM facttable;

How can I ensure that such operations do not block each other, given that they refer to the same table?

Solution

There is no danger of the sessions blocking each other in your scenario.

The only lock that is taken on facttable is the ACCESS SHARE lock that prevents others from dropping or altering the table while it is being read. Since ACCESS SHARE doesn't conflict with itself, that is no problem (two people can read the same poster).
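To see this for yourself, a minimal sketch along the following lines may help. It assumes the facttable and t1 names from the question and two psql sessions connected to the same database; the open transaction is only there to keep the lock visible long enough to inspect it.

```sql
-- Session 1: keep the transaction open so the lock on facttable stays held.
BEGIN;
CREATE TABLE t1 AS SELECT * FROM facttable;

-- Session 2: list the relation-level locks currently held on facttable.
-- pg_locks is cluster-wide, so restrict it to the current database.
SELECT l.pid, l.mode, l.granted
FROM pg_locks AS l
WHERE l.locktype = 'relation'
  AND l.relation = 'facttable'::regclass
  AND l.database = (SELECT oid FROM pg_database
                    WHERE datname = current_database());

-- Session 1: release the locks again.
COMMIT;
```

The mode column should show AccessShareLock for session 1's pid. Running a second CTAS against facttable in another session while the first transaction is still open should add another granted AccessShareLock row rather than wait.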

The only consideration is the I/O load: if many processes each perform a sequential scan of the table, that load could be substantial. You can benefit from PostgreSQL's synchronized sequential scans by starting all processes at about the same time, so that their scans have a chance to synchronize and share the same reads.
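As a rough check of whether synchronized scans can apply at all, something like the following may be useful. It again assumes the facttable name from the question. The size comparison rests on my understanding that PostgreSQL only synchronizes scans of tables larger than about a quarter of shared_buffers, so treat that threshold as an assumption to verify for your version.

```sql
-- Synchronized sequential scans are controlled by this setting (on by default).
SHOW synchronize_seqscans;

-- Compare the size of the table's main data with shared_buffers:
-- small tables are scanned independently by each session.
SHOW shared_buffers;
SELECT pg_size_pretty(pg_relation_size('facttable')) AS facttable_size;
```

If the table is small relative to shared_buffers, it is likely to be served from cache after the first scan anyway, so the I/O concern mostly matters for large tables.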

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange