Question

I have a daemon script which runs forever in a while loop. I have a prepared statement, and this statement is executed on every iteration of the loop.

Example:

  use DBI;

  my $dbh;
  sub get_dbh {
      return DBI->connect(...);
  }

  $dbh = get_dbh();
  my $sth = $dbh->prepare("SELECT ....") or die $DBI::errstr;

  while (1) {
      # is the connection still there??
      #if (!$dbh->ping) {
      #    $dbh = get_dbh();
      #}

      $sth->execute('param1');

      # do some other things ... sleep between 0 and 3600
  }

The problem occurs (or might occur) when the prepared statement was prepared a few hours ago. The connection could have died in the meantime, and then my execute dies too. Checking $dbh->ping before every execute looks like overkill.

DBD::mysql supports mysql_auto_reconnect, which really works. DBD::Pg doesn't have anything like that. I read about Apache::DBI, but as far as I can see it depends on mod_perl etc.; it is obviously meant for web applications.

Is there a "best-practice" way to check the connection state and reconnect if needed?

I could prepare statement on every loop but that is not the solution but just a way around the problem.


Solution

Is there a "best-practice" way to check the connection state and reconnect if needed?

Yes, at least in my view, because there's only one approach that's free of race conditions, and that's to execute the query in a retry loop that handles errors if they arise.

Otherwise you still have:

  1. PREPARE
  2. SELECT 1; or whatever your test statement is
  3. Network drops out, backend crashes, admin restarts server, whatever
  4. EXECUTE
  5. splat.

Correct behaviour requires something like the following pseudocode:

succeeded = False
while not succeeded:
    try:
        execute_statement()
        succeeded = True
    except some_database_exception:
        if transaction_is_valid():
            # A `SELECT 1` or `select 1 from pg_prepared_statements where name = 'blah'`
            # succeeded inside transaction_is_valid(), so the issue was probably
            # transient. Retry, possibly with a retry counter that resets the
            # connection after a certain number of retries.
            # It can also be useful to examine the exception or error state to
            # see whether the error is recoverable, so you don't do things like
            # retrying repeatedly for a transaction that's in the error state.
        elif test_connection_usable_after_rollback():
            # The connection is OK but the transaction is invalid. You might
            # determine this from the exception state, or by seeing whether
            # sending a ROLLBACK succeeds. In this case you don't have to
            # re-prepare, just open a new transaction. This case isn't needed
            # if you're using autocommit.
        else:
            # A SELECT 1 and a ROLLBACK both failed, or the exception state
            # suggests the connection is dead. Re-establish it, re-prepare,
            # and restart the last transaction from the beginning.
            reset_connection_and_re_prepare()

Verbose and annoying? Yep, but usually easily wrapped in a helper or library. Everything else is still subject to races.
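
For example, in Perl/DBI a minimal sketch of such a helper for the single-statement case might look like the following. The DSN, credentials, retry limit of 3, and the connect_and_prepare / execute_with_retry names are illustrative assumptions, not something from the original post:

    use strict;
    use warnings;
    use DBI;

    # Illustrative connection details and statement; substitute your own.
    my ($dsn, $user, $pass) = ('dbi:Pg:dbname=mydb', 'me', 'secret');
    my $sql = 'SELECT ...';    # the statement you want to keep prepared

    # (Re)connect and (re)prepare in one place so a retry can rebuild both.
    sub connect_and_prepare {
        my $dbh = DBI->connect($dsn, $user, $pass,
                               { RaiseError => 1, AutoCommit => 1 });
        return ($dbh, $dbh->prepare($sql));
    }

    my ($dbh, $sth) = connect_and_prepare();

    # Execute inside a bounded retry loop; on failure, test the connection
    # and rebuild connection + statement before the next attempt.
    sub execute_with_retry {
        my (@params) = @_;
        for my $attempt (1 .. 3) {                 # illustrative retry limit
            my $ok = eval { $sth->execute(@params); 1 };
            return 1 if $ok;
            warn "execute failed (attempt $attempt): $@";
            unless (eval { $dbh->ping }) {         # connection is dead
                eval { ($dbh, $sth) = connect_and_prepare(); 1 }
                    or warn "reconnect failed: $@";
            }
        }
        die "giving up after 3 failed attempts";
    }

    execute_with_retry('param1');

The important property is that the reconnect and re-prepare happen inside the same loop that retries the execute, so there is no window between testing the connection and using it.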

Most importantly, if your application is issuing transactions where it does more than one thing, it needs to remember everything it did until the transaction commits, and be able to retry the whole transaction if there's an error. That, or tell the user "oops, I ate your data, please re-enter it and try again".
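
One way to structure that in Perl is to keep the whole transaction in a code reference and replay it on failure. This is only a sketch: run_transaction is a hypothetical helper (not part of DBI), and it assumes the handle returned by get_dbh() has RaiseError and AutoCommit enabled:

    # Run an entire transaction, retrying the whole thing from the start on
    # failure. $work is a code ref that performs every statement of the
    # transaction against the handle it is given, so a retry replays them all.
    sub run_transaction {
        my ($get_dbh, $work) = @_;
        for my $attempt (1 .. 3) {                 # illustrative retry limit
            my $dbh = $get_dbh->();
            my $ok = eval {
                $dbh->begin_work;
                $work->($dbh);                     # all statements of the txn
                $dbh->commit;
                1;
            };
            return 1 if $ok;
            warn "transaction failed (attempt $attempt): $@";
            eval { $dbh->rollback };               # best-effort clean-up
        }
        die "transaction failed after 3 attempts";
    }

    # Usage: the code ref is the "memory" of what the transaction does.
    # run_transaction(\&get_dbh, sub {
    #     my ($dbh) = @_;
    #     $dbh->do('UPDATE ...');
    #     $dbh->do('INSERT ...');
    # });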

If you don't mind the races and just want to handle any obviously dead connections with a periodic check, just store the time of the last query in a variable. When issuing queries, check whether that timestamp is more than a few minutes old; if it is, issue a SELECT 1; or a query against pg_prepared_statements to check for your prepared statement. You'll either need to be prepared to barf errors at the user, or to wrap the whole thing in proper error handling anyway ... in which case there's no point bothering with the time check and test at all.
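
If you do take that shortcut, the bookkeeping is only a few lines. A rough sketch, reusing the hypothetical connect_and_prepare helper, $dbh and $sth from the sketch above, with an arbitrary 300-second threshold:

    my $last_query_time = time();

    # Before an execute, test the connection only if it has been idle longer
    # than the (arbitrary) threshold, and rebuild it if the test fails.
    sub maybe_check_connection {
        if (time() - $last_query_time > 300) {
            my $alive = eval { $dbh->selectrow_array('SELECT 1') };
            ($dbh, $sth) = connect_and_prepare() unless $alive;
        }
        $last_query_time = time();
    }

    # In the loop:
    # maybe_check_connection();
    # $sth->execute('param1');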

OTHER TIPS

It is quite reasonable to ping/test the connection when you know you might let the connection idle for an hour.

Better, the DBI's connect_cached and prepare_cached make this relatively easy:

while (1) {
    my $dbh = DBI->connect_cached(..., { RaiseError => 1 });  # This will ping() for you
    my $sth = $dbh->prepare_cached('SELECT ...');

    $sth->execute('param1');

    # Do work, sleep up to 1 hour
}

In this way you'll re-use the same prepared statement over the life of the connection.

(For what it's worth, modern DBD::Pg pinging is implemented with an efficient, native PostgreSQL call.)

I don't understand why you say that a ping before every execute is overkill. The alternative is to explicitly handle the case of an execute failing because the database handle is invalid: reconnect, re-prepare the statement, and issue the execute a second time. That would be fractionally faster, but I see no reason to avoid the ping strategy.
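
In the question's own terms that alternative is roughly the following sketch, reusing get_dbh() and the statement from the question; the point is that the execute is retried exactly once after rebuilding the handle:

    my $ok = eval { $sth->execute('param1'); 1 };
    unless ($ok) {
        # The handle is presumably invalid: reconnect, re-prepare,
        # and issue the execute exactly once more.
        $dbh = get_dbh();
        $sth = $dbh->prepare('SELECT ....') or die $DBI::errstr;
        $sth->execute('param1');
    }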

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow