Question

I have a golang program that does computation on data in multiple threads at once, all pulling their data from Postgres. The number of threads depends on a previous result. There can therefore be hundreds of threads trying to pull data from Postgres at the same time.

The golang sql library allows one to specify a connection limit, which prevents postgres from running out of shared memory or free connections.
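
For reference, these are the knobs I mean (a minimal sketch; the driver import and connection string are placeholders):

```go
package main

import (
	"database/sql"
	"log"
	"time"

	_ "github.com/lib/pq" // Postgres driver (placeholder; any database/sql driver works the same way)
)

func main() {
	// Placeholder connection string.
	db, err := sql.Open("postgres", "postgres://user:pass@localhost/mydb?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Pool limits: at most 20 open connections, up to 10 kept idle,
	// connections recycled after 30 minutes.
	db.SetMaxOpenConns(20)
	db.SetMaxIdleConns(10)
	db.SetConnMaxLifetime(30 * time.Minute)
}
```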

If I hardcode the maximum number of connections, I will run out of connections when something else is connected. On the other hand, if I hardcode too low a connection limit in the golang program, performance will be unnecessarily limited.

What would be the best way to allow the golang program to use as many connections as possible without running into the limits? I imagine this number varies depending on how many other services are connected to the database at the time.

I am thinking of running PgBouncer between the database and the golang program, in the hope that it will accept all the connections from the golang program, allow as many as possible through, and block the rest until connections become free. I am not sure whether PgBouncer actually works this way, but I will be testing it next.

Is there perhaps another method of having a connection pool that will block connections when no real connections are free? Blocking, not refusing, as refusing a connection would mean I have to add retry logic to my golang program.

Solution

The number of threads depends on a previous result.

Could you rewrite this to have fewer queries returning more rows each? If you launch one thread per row to be returned, you are probably introducing more overhead than you are removing.
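
For example, something along these lines pulls all the rows for a batch in a single query and fans them out to a bounded set of workers; the table, columns, workItem type and process function are hypothetical stand-ins for your real schema and computation:

```go
package batch

import (
	"database/sql"
	"runtime"
	"sync"
)

// workItem and process are hypothetical stand-ins for the real data and computation.
type workItem struct {
	ID      int64
	Payload []byte
}

func process(item workItem) { /* the CPU-bound computation */ }

// processBatch fetches every row for a batch in one query and fans the rows
// out to a bounded pool of workers, instead of one goroutine (and one
// connection) per row.
func processBatch(db *sql.DB, batchID int64) error {
	rows, err := db.Query(`SELECT id, payload FROM work_items WHERE batch_id = $1`, batchID)
	if err != nil {
		return err
	}
	defer rows.Close()

	jobs := make(chan workItem)
	var wg sync.WaitGroup
	for i := 0; i < runtime.NumCPU(); i++ { // bounded worker pool, not hundreds of goroutines
		wg.Add(1)
		go func() {
			defer wg.Done()
			for item := range jobs {
				process(item)
			}
		}()
	}

	var scanErr error
	for rows.Next() {
		var item workItem
		if err := rows.Scan(&item.ID, &item.Payload); err != nil {
			scanErr = err
			break
		}
		jobs <- item
	}
	close(jobs)
	wg.Wait()
	if scanErr != nil {
		return scanErr
	}
	return rows.Err()
}
```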

If I hardcode the maximum number of connections, I will run out of connections when something else is connected.

Unless that "something else" is another golang program doing much the same as the first one (or something equivalent), you can make max_connections only slightly larger than the max pool size in golang, and you shouldn't have a problem. It does require you to have a fairly good idea of what the other things are.
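
If you want the golang side to adapt on its own, one option is to ask the server for its limit at startup and size the pool somewhat below it. A rough sketch, where the package and function names are placeholders and the headroom value is your own estimate of what the other things need:

```go
package dbpool

import (
	"database/sql"
	"strconv"
)

// sizePool reads the server's max_connections and caps this program's pool
// below it, leaving headroom connections free for other clients. The headroom
// value is a guess you have to make based on what else connects to the database.
func sizePool(db *sql.DB, headroom int) error {
	var raw string
	if err := db.QueryRow(`SHOW max_connections`).Scan(&raw); err != nil {
		return err
	}
	maxConns, err := strconv.Atoi(raw)
	if err != nil {
		return err
	}
	limit := maxConns - headroom
	if limit < 1 {
		limit = 1
	}
	db.SetMaxOpenConns(limit)
	db.SetMaxIdleConns(limit)
	return nil
}
```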

On the other hand, if I hardcode too low a connection limit in the golang program, performance will be unnecessarily limited.

Have you verified this? Having 300 connections fighting over 16 CPUs and 4 hard drives is not going to enhance productivity. There is only so much hardware, and now you have to unproductively fight over spinlocks (for example) as well as do the real work.

I am thinking of running PgBouncer between the database and the golang program, in the hope that it will accept all the connections from the golang program, allow as many as possible through, and block the rest until connections become free.

It can do this, but why would it be better than having golang itself do it? It could make sense if you can push your "other connections" through PgBouncer but can't push them through golang; then you would have one pooler that sees everything. On the other hand, it introduces another layer of latency on every network round trip, plus another layer of complexity.
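
For what it's worth, database/sql already gives you the blocking behaviour you asked about: once the SetMaxOpenConns limit is reached, further queries wait for a free connection instead of being refused, so no retry logic is needed in the callers. A small demonstration (driver import and connection string are placeholders):

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
	"log"
	"sync"

	_ "github.com/lib/pq" // placeholder driver import
)

func main() {
	db, err := sql.Open("postgres", "postgres://user:pass@localhost/mydb?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// With the pool capped at 5, the 6th and later concurrent queries simply
	// wait inside database/sql until a connection is returned to the pool.
	db.SetMaxOpenConns(5)

	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			var one int
			// Each call blocks until one of the 5 connections is free.
			if err := db.QueryRowContext(context.Background(), "SELECT 1").Scan(&one); err != nil {
				log.Println(n, err)
				return
			}
			fmt.Println("goroutine", n, "got", one)
		}(i)
	}
	wg.Wait()
}
```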

Licensed under: CC-BY-SA with attribution