Question

I am running a PostgreSQL 8.4 database server and the psql terminal front-end on Ubuntu 10.04 Lucid Lynx and would like to span a single transaction over several sequential psql sessions.

When I connect to my database with psql, a new connection is established and a server backend process is created for it. When I disconnect, the connection is released and the backend process terminates. A (non-XA*) transaction is bound to the scope of a connection, so obviously there is no straightforward way to span a single transaction over several psql sessions.

What I would like to achieve is that the following sequence of commands can be run within a single transaction and therefore return the same transaction timestamp on each call of now():

tscho@test:~$ sudo -u postgres psql -p 5433 --no-align --tuples-only -c "select now()"
2012-02-17 21:25:07.027056+01
tscho@test:~$ sudo -u postgres psql -p 5433 --no-align --tuples-only -c "select now()"
2012-02-17 21:25:09.487601+01

database log:

2012-02-17 21:25:07 CET 0- LOG:  connection received: host=[local]
2012-02-17 21:25:07 CET 0- LOG:  connection authorized: user=postgres database=postgres
2012-02-17 21:25:07 CET 0-2/0 LOG:  duration: 0.366 ms  statement: select now()
2012-02-17 21:25:07 CET 0- LOG:  disconnection: session time: 0:00:00.002 user=postgres database=postgres host=[local]
2012-02-17 21:25:09 CET 0- LOG:  connection received: host=[local]
2012-02-17 21:25:09 CET 0- LOG:  connection authorized: user=postgres database=postgres
2012-02-17 21:25:09 CET 0-2/0 LOG:  duration: 0.347 ms  statement: select now()
2012-02-17 21:25:09 CET 0- LOG:  disconnection: session time: 0:00:00.002 user=postgres database=postgres host=[local]

Clearly this is not what I really want to do. I want to be able to execute several bash scripts that connect to the database and execute SQL statements and scripts with psql within a single transaction.

* AFAIK the XA protocol would allow BEGIN TRANSACTION and PREPARE TRANSACTION on different connections, but PostgreSQL does not support this.


My first shot at solving this problem was to set up the PgBouncer 1.5 connection pool and configure it as a simple proxy with exactly one connection to the target database (session pooling mode). My reasoning was that PgBouncer would establish this connection at start-up and that I could then connect/disconnect to/from the proxy with psql while the connection to the database stays open.

tscho@test:~$ sudo -u postgres psql pgproxy -U pgbouncer -p 6432 --no-align --tuples-only -c "select now()"
2012-02-17 21:25:23.517019+01
tscho@test:~$ sudo -u postgres psql pgproxy -U pgbouncer -p 6432 --no-align --tuples-only -c "select now()"
2012-02-17 21:25:26.943172+01

This actually works out quite well, as the database log shows (both statements run over the same server connection, with no disconnection in between):

2012-02-17 21:25:17 CET 0- LOG:  connection received: host=[local]
2012-02-17 21:25:17 CET 0- LOG:  connection authorized: user=postgres database=postgres
2012-02-17 21:25:23 CET 0-1/0 LOG:  duration: 0.110 ms  statement: select 1
2012-02-17 21:25:23 CET 0-1/0 LOG:  duration: 0.053 ms  statement: select now()
2012-02-17 21:25:23 CET 0-1/0 LOG:  duration: 0.046 ms  statement: DISCARD ALL
2012-02-17 21:25:26 CET 0-1/0 LOG:  duration: 0.126 ms  statement: select now()
2012-02-17 21:25:26 CET 0-1/0 LOG:  duration: 0.043 ms  statement: DISCARD ALL

But there is a little problem with this approach: as soon as I begin a transaction on the proxy connection and disconnect...

tscho@test:~$ sudo -u postgres psql pgproxy -U pgbouncer -p 6432 --no-align --tuples-only -c "start transaction"
START TRANSACTION

...the connection is released by PgBouncer.

2012-02-17 21:32:47 CET 0-1/2178 LOG:  duration: 0.025 ms  statement: start transaction
2012-02-17 21:32:47 CET 0- LOG:  disconnection: session time: 0:07:20.737 user=postgres database=postgres host=[local]

Of course this makes perfect sense for a connection pool. Its job is to provide (a) shared connection(s) for several clients but to isolate the transactions of these clients. But for my use case a shared transaction is exactly what I would need...

So my question is: is there a way to configure PgBouncer (or another connection pool) not to release the connection upon disconnection after BEGIN/START TRANSACTION, or is there another way to achieve what I would like to do?

Further questions about this post, comments and of course answers are appreciated!


Solution

Maybe it would be easiest to run a psql in the background, with it set to execute stdin, and connect its stdin to a named pipe. Then you can continually push data into that pipe, and finally push "end; \quit". Something like:

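#!/bin/sh

# Create a private named pipe and start psql in the background reading from it.
psql_pipe=/tmp/psql$$
mkfifo -m 600 "$psql_pipe"
psql < "$psql_pipe" &
psql_pid=$!

# Hold a write descriptor open so psql does not see EOF between commands.
exec 3>"$psql_pipe"

echo "> Started psql (pid=$psql_pid) reading from $psql_pipe"

# On exit, stop psql if it is still running and remove the pipe.
trap '
  kill $psql_pid 2>/dev/null
  rm -f "$psql_pipe"
' EXIT

echo "begin;" >&3

echo "select now();" >&3

sleep 2

echo "select now();" >&3

sleep 2

# Commit, tell psql to quit, then wait for it to finish.
echo "end; \quit" >&3

wait $psql_pid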

Note that you can't simply do echo "sql" >$psql_pipe, since the pipe would be closed after each write and the resulting EOF would reach psql, which would then exit early; the shell script has to keep its fd open.
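The EOF behaviour can be seen without a database. The following is a minimal sketch in which cat stands in for psql as the reader of the named pipe; the function names and the temporary directory layout are made up for this illustration. The first function opens the pipe for a single write, the second holds fd 3 open across two writes, and each prints how many lines the reader received before it saw EOF.

```shell
#!/bin/sh
# Stand-in demo: cat plays the role of psql, so no database is needed.
# Function names and temp paths are hypothetical, chosen for this sketch.

# Case 1: open the pipe anew for a single write, as `echo "sql" >$pipe` would.
# Closing that only writer delivers EOF, so the reader exits after one line.
lines_with_reopen() {
  dir=$(mktemp -d) || return 1
  mkfifo -m 600 "$dir/pipe"
  cat <"$dir/pipe" >"$dir/out" &
  reader=$!
  echo "first" >"$dir/pipe"
  wait "$reader"                 # the reader exits on its own after EOF
  wc -l <"$dir/out" | tr -d ' '
  rm -rf "$dir"
}

# Case 2: hold one write descriptor (fd 3) open across all writes.
# The reader sees EOF only when fd 3 is closed explicitly.
lines_with_held_fd() {
  dir=$(mktemp -d) || return 1
  mkfifo -m 600 "$dir/pipe"
  cat <"$dir/pipe" >"$dir/out" &
  reader=$!
  exec 3>"$dir/pipe"
  echo "first" >&3
  echo "second" >&3
  exec 3>&-                      # close the writer; now the reader gets EOF
  wait "$reader"
  wc -l <"$dir/out" | tr -d ' '
  rm -rf "$dir"
}

lines_with_reopen     # prints 1
lines_with_held_fd    # prints 2
```

In the first case a second write is not even attempted: with the reader gone, opening the pipe for writing again would block indefinitely.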

OTHER TIPS

I usually tackle this a bit differently: I collect the SQL files/snippets I want to run and then execute a single psql session that issues a BEGIN, uses \i to include each file in turn, and finally issues a COMMIT. E.g. (untested, but the general idea):

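psql -v ON_ERROR_STOP=1 <<__END__
BEGIN;
\i script1.sql
\i script2.sql
\i script3.sql
COMMIT;
__END__

# With ON_ERROR_STOP set, psql exits non-zero as soon as a statement fails.
errstatus=$?
if [ "$errstatus" -ne 0 ]; then
  echo "psql failed with $errstatus"
  exit 1
fi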
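If the list of files is only known at run time, the same input can be generated rather than written by hand. The sketch below uses a hypothetical helper, build_txn_script, that prints a BEGIN, one \i line per argument, and a COMMIT. It assumes file names without whitespace or quotes, and the file names shown are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print psql input that runs the given files
# inside one transaction. Assumes file names without spaces or quotes.
build_txn_script() {
  printf 'BEGIN;\n'
  for f in "$@"; do
    printf '\\i %s\n' "$f"       # emits a literal \i followed by the file name
  done
  printf 'COMMIT;\n'
}

# Show the generated input for three placeholder file names.
build_txn_script script1.sql script2.sql script3.sql

# Possible use (needs a reachable server):
#   build_txn_script script1.sql script2.sql | psql -v ON_ERROR_STOP=1
```

Piping the output into one psql process keeps everything on a single connection, which is what makes the single transaction possible.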

Edit: Alternatively, if your script and PostgreSQL session need two-way interaction, say if you're generating dynamic SQL, then you can either use a co-process or use a more sophisticated scripting language than sh that has its own built-in PostgreSQL interface. See this answer I wrote to a very similar question a while ago.

I use Python and psycopg for this sort of thing most of the time, but a coprocess can be useful if you're stuck with bash.

If you're on Windows and using cmd.exe then (a) I'm sorry and (b) you'll have to use a real scripting language or, if PowerShell supports anything like co-processes, maybe use PowerShell and psql.

Licensed under: CC-BY-SA with attribution