Temporary schema per connection?

https://dba.stackexchange.com/questions/76494

09-12-2020

Question

I am trying to migrate my unit tests from H2 to PostgreSQL.

Currently, H2 gives me an in-memory schema such that each connection maps to a unique schema, creates the tables, runs the test, and drops the schema. Schema creation and destruction are handled automatically by H2.

The unit tests run concurrently.

What is the best way to do this in PostgreSQL? Specifically:

1. How do I get a unique schema per connection?
   - Should the test framework generate unique names or is there a built-in mechanism for doing this?
2. How do I ensure that the schema is dropped when the connection is dropped?
   - I don't want to end up with dangling schemas when unit tests get killed.
3. What approach will yield the highest performance?
   - I need to create/drop tens of schemas per second.

UPDATE: I found a related answer here, but it fails to drop schemas if the process running the unit tests gets killed.

Solution

pg_temp is an alias for the current session's temporary schema.

If you run SET search_path TO pg_temp before running your tests, it should all just work (as long as nothing references a schema explicitly).

If you don't want to change your script at all, then set the search_path on the user that the tests log in as:

> ALTER ROLE testuser SET search_path = pg_temp;

Then everything that user creates will go into pg_temp unless a schema is specified explicitly.
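
One way to check that the role-level setting took effect is sketched below, assuming the testuser role from the command above. Per-role settings are applied when a session starts, so the second statement is meant to be run from a fresh connection opened as that role.

```sql
-- Per-role settings are stored in the rolconfig column of pg_roles.
SELECT rolname, rolconfig
FROM   pg_roles
WHERE  rolname = 'testuser';

-- In a new session opened as testuser, this should report pg_temp.
SHOW search_path;
```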

Here's an example from psql, showing the actual schema (for this connection) that the alias resolves to:

> SET search_path TO pg_temp;
SET
> create table test();
CREATE TABLE
> \dt test
          List of relations
  Schema   | Name | Type  |  Owner
-----------+------+-------+----------
 pg_temp_4 | test | table | postgres
(1 row)

And, as you'd expect, that schema is different for every concurrent connection, and everything in it is dropped when the connection is closed.

Note that this also works for functions, though you will have to explicitly reference the pg_temp schema when calling them.
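
A minimal sketch of the function case, assuming a made-up function name (add_one is purely illustrative). PostgreSQL does not search the temporary schema for functions, so the call has to be qualified even when search_path is set to pg_temp.

```sql
-- Create a function that lives in this session's temporary schema.
CREATE FUNCTION pg_temp.add_one(integer) RETURNS integer
    LANGUAGE sql
    AS 'SELECT $1 + 1';

-- The call must name pg_temp explicitly.
SELECT pg_temp.add_one(41);

-- An unqualified call such as "SELECT add_one(41);" is expected to fail,
-- because the temporary schema is never searched for function names.
```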

OTHER TIPS

You can get the name of the current temporary schema (after creating the first temp table) as laid out in the answer you linked:

SELECT nspname
FROM   pg_namespace
WHERE  oid = pg_my_temp_schema();

But your current plan still wouldn't make a lot of sense. To create tables in the current temporary schema, just create temporary tables. That's all. By default, the search_path is defined so that temporary tables are visible first. One never needs to schema-qualify temp tables. You shouldn't ever have to address the current temporary schema directly in any way - that's an implementation detail.
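
As a rough illustration of the "just create temporary tables" approach, the sketch below uses a hypothetical customer table. With the default search_path, the temporary schema is searched first even though it is not listed, so no schema qualification is needed.

```sql
-- The default is "$user", public; the temporary schema is searched first implicitly.
SHOW search_path;

-- Hypothetical test table; it is private to this session.
CREATE TEMPORARY TABLE customer (
    id   integer PRIMARY KEY,
    name text NOT NULL
);

INSERT INTO customer VALUES (1, 'alice');

-- The unqualified name resolves to the temporary table.
SELECT name FROM customer WHERE id = 1;
```

The table is dropped automatically when the session ends, so a killed test process should not leave it behind.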

Do your tests involve transactions? DDL is transactional in PostgreSQL, so if you create your schema and tables, then run your tests, all within a single transaction that is then rolled back, the schema is never actually committed and visible to other sessions.

You'd still need to use a probably-unique name for your schema (maybe include hostname and PID), as CREATE SCHEMA will fail immediately if an identically-named schema already exists, and will block if another session has created an identically-named schema in an uncommitted transaction.
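
The transactional approach might look like the sketch below. The schema name test_host1_12345 is a hypothetical example of a hostname-plus-PID name that the test framework would generate, and the account table is likewise only illustrative.

```sql
BEGIN;

-- Hypothetical name; the test framework would build it from hostname and PID.
CREATE SCHEMA test_host1_12345;

-- SET LOCAL lasts only until the end of this transaction.
SET LOCAL search_path TO test_host1_12345;

CREATE TABLE account (
    id      integer PRIMARY KEY,
    balance numeric NOT NULL
);

INSERT INTO account VALUES (1, 100);

SELECT balance FROM account WHERE id = 1;

-- Discards the rows, the table, and the schema together.
ROLLBACK;
```

Because the server rolls back an open transaction when its connection goes away, a test process that is killed mid-run should not leave the schema behind either.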

An alternative would possibly just be to use temporary tables, if you're able to modify your database creation scripts to do that.

I just got an idea.

PostgreSQL guarantees that one session can't see another session's temporary tables. I'm guessing this means that when you create a temporary table, it creates a temporary schema. So perhaps I could do the following:

1. Create a (dummy) temporary table and look up its schema.
2. Use this schema for the test (create the tables, run the test).
3. When the connection is closed, PostgreSQL will drop the schema.
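
The first step above could be sketched roughly as follows. The table name schema_probe is a made-up placeholder, and the lookup goes through the pg_class catalog as one possible way to find the schema.

```sql
-- Creating any temporary table makes the session's temporary schema available.
CREATE TEMPORARY TABLE schema_probe ();

-- Look up the schema that actually holds the dummy table.
SELECT n.nspname AS temp_schema
FROM   pg_class c
JOIN   pg_namespace n ON n.oid = c.relnamespace
WHERE  c.oid = 'pg_temp.schema_probe'::regclass;

-- The dummy table is no longer needed once the name is known.
DROP TABLE schema_probe;
```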

I don't like relying on implementation details, but in this case this seems pretty safe.

Licensed under: CC-BY-SA with attribution
