Question

I have set up streaming replication from a master DB to a slave DB. If the master is shut down, the slave will take over. The replication and failover work fine.

I have a web app using the master database for storing data.

Some details:

  • Both servers run CentOS 6.4 and PostgreSQL 9.2.
  • Streaming replication is set up from the master to the slave using PostgreSQL's built-in replication.
  • Failover is handled by the PostgreSQL JDBC driver (v9.2-1003) by specifying master/slave in the connection string (roughly as sketched below).
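
For reference, the connection string looks roughly like the sketch below (hostnames and database name are placeholders; the targetServerType parameter is the name used by newer pgjdbc releases, so check the exact syntax supported by v9.2-1003):

    jdbc:postgresql://master.example.com:5432,slave.example.com:5432/mydb?targetServerType=master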

I want to continue using this method of replication.

The questions:

  • The slave server is read-only. How can I make it a master (writable) after the failover, automatically?
  • What if the original master suddenly starts working again and we now have two masters? How can I shoot the original master in the head, automatically?

Solution

I suggest having a look at pgpool with the failover_command option. There you can hook in a small shell script that restarts the slave in read/write mode.
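
As a minimal sketch of those mechanics (the paths, the postgres user, and passwordless SSH access are assumptions, not details from the question): the script only has to create the trigger file that the standby's recovery.conf is watching, which on PostgreSQL 9.2 makes the standby leave recovery and become writable.

    #!/bin/sh
    # /etc/pgpool-II/failover.sh -- wired up in pgpool.conf as:
    #   failover_command = '/etc/pgpool-II/failover.sh %d %H'
    # %d = id of the failed node, %H = hostname of the new master
    # (both are expanded by pgpool when failover fires)

    FAILED_NODE_ID="$1"    # unused here, kept for clarity
    NEW_MASTER_HOST="$2"
    TRIGGER_FILE=/var/lib/pgsql/9.2/data/failover.trigger

    # recovery.conf on the standby must name the same path:
    #   trigger_file = '/var/lib/pgsql/9.2/data/failover.trigger'
    # Touching that file promotes the standby to a writable master.
    ssh postgres@"$NEW_MASTER_HOST" "touch $TRIGGER_FILE"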

In case you run into issues with pgpool, the process I followed to troubleshoot it (stracing pgpool) might help.

Other tips

Pgpool-II did the trick.

I installed Pgpool-II on a third server, a monitoring server also running CentOS. I configured a health check to run every 10 seconds. The failover_command was set to run a small shell script that generates a trigger file on the slave server if the master server fails. And it worked perfectly.
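
For reference, the health-check side lives in pgpool.conf; a minimal sketch with the 10-second interval described above (the config file path, the timeout, and the user name are assumed values):

    # Append the health-check settings to pgpool.conf
    cat >> /etc/pgpool-II/pgpool.conf <<'EOF'
    health_check_period  = 10        # probe the backends every 10 seconds
    health_check_timeout = 20        # assumed value
    health_check_user    = 'pgpool'  # assumed role name
    failover_command     = '/etc/pgpool-II/failover.sh %d %H'
    EOF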

To keep the app from writing to the original master if it suddenly comes back, I'll use two config files for the app server (one pointing at the master, one at the slave) and extend the shell script to restart the app server with the slave config, as sketched below.
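
A sketch of that extension (the config file names and the service name are assumptions about my setup):

    # Appended to failover.sh: switch the app server to the slave config
    # and restart it so new connections go to the promoted database.
    APP_CONF_DIR=/etc/myapp
    cp "$APP_CONF_DIR/db-slave.conf" "$APP_CONF_DIR/db.conf"
    service myapp restart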

Thanks for the tip!

Give PostDock a try if you are open to a Docker-based solution.

I have tried it in our project with docker-compose, with the topology shown below:

pgmaster (primary node1)  --|
|- pgslave1 (node2)       --|
|  |- pgslave2 (node3)    --|----pgpool (master_slave_mode stream)----client
|- pgslave3 (node4)       --|
   |- pgslave4 (node5)    --|
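
Bringing the cluster up is essentially a clone-and-compose exercise; a hedged sketch (verify the repository URL and compose file name against the current PostDock README):

    git clone https://github.com/paunin/PostDock.git
    cd PostDock
    docker-compose up -d    # starts the primary, the standbys, and pgpool
    docker-compose ps       # verify that all nodes came up healthy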

I have tested the following scenarios, and they all work very well (see the commands after this list):

  • Replication: changes made at the primary (i.e., master) node are replicated to all standby (i.e., slave) nodes.
  • Failover: stop the primary node, and a standby node (e.g., node4) automatically takes over the primary role.
  • Prevention of two primary nodes: resurrect the previous primary node (node1); node4 continues as the primary node, while node1 comes back in sync as a standby node.
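
A quick way to reproduce the failover and resurrection tests (the service names are assumptions based on the node labels in the diagram):

    docker-compose stop pgmaster     # simulate a primary failure; watch a standby take over
    docker-compose start pgmaster    # resurrect node1; it should rejoin as a standby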

As for the client application, these changes are all transparent. The client just points to the pgpool node, and keeps working fine in all the aforementioned scenarios.

Note: In case you have problems getting PostDock up and running, you could try my forked version of PostDock.

Pgpool-II with Watchdog

A problem with the aforementioned architecture is that pgpool is the single point of failure. So I have also tried enabling Watchdog for pgpool-II with a delegated virtual IP, so as to avoid the single point of failure; a configuration sketch follows the diagram below.

master (primary node1)  --\
|- slave1 (node2)       ---\     / pgpool1 (active)  \
|  |- slave2 (node3)    ----|---|                     |----client
|- slave3 (node4)       ---/     \ pgpool2 (standby) /
   |- slave4 (node5)    --/
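
For reference, a minimal watchdog sketch for pgpool1's pgpool.conf (mirror it on pgpool2 with the hostnames swapped; the IP address, network interface, and ports are placeholders):

    cat >> /etc/pgpool-II/pgpool.conf <<'EOF'
    use_watchdog = on
    delegate_IP  = '192.168.1.100'   # the virtual IP the client points at

    # Commands watchdog runs to attach/detach the virtual IP
    if_up_cmd   = 'ip addr add $_IP_$/24 dev eth0 label eth0:0'
    if_down_cmd = 'ip addr del $_IP_$/24 dev eth0'

    wd_hostname = 'pgpool1'
    wd_port     = 9000

    # The peer pgpool that takes over the virtual IP on failure
    other_pgpool_hostname0 = 'pgpool2'
    other_pgpool_port0     = 9999
    other_wd_port0         = 9000
    EOF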

I have tested the following scenarios, and they all work very well:

  • Normal scenario: both pgpools start up, and the virtual IP is automatically applied to one of them; in my case, pgpool1.
  • Failover: shut down pgpool1. The virtual IP is automatically moved to pgpool2, which hence becomes active.
  • Start the failed pgpool: start pgpool1 again. The virtual IP stays with pgpool2, and pgpool1 now works as standby.

As for the client application, these changes are all transparent. The client just points to the virtual IP, and keeps working fine in all the aforementioned scenarios.

You can find this project at my GitHub repository on the watchdog branch.

Licensed under: CC-BY-SA with attribution