Question

The Couchbase website says that Couchbase can easily reach 100,000 requests per second. Since my application basically needs only a key/value store, I decided to give Couchbase a try and built a small cluster at my hosting provider. I am using the Python client and Couchbase Server 2.2.0 Community Edition.

With a single node in the "cluster", I get 16,000 requests per second: nice! But with 2 nodes in the cluster, I get only 100 requests per second for 'set(key,val)', and the same for 'get(key)' (I used the default bucket). This is with a very small data set: 10,000 keys, each value only 10 bytes long!

Looking at the stats, there seems to be no bottleneck (CPU, disk, or RAM).

My hardware:

Core i5 (3.4 GHz)
32 GB RAM
Disk: 120 GB SSD
Network: Gigabit, bandwidth limited to 200 Mbps

The only issue I can see is that I have 10 ms of latency between the 2 nodes:

  • What is a "good" latency between nodes?
  • What performance can I expect with a gigabit connection?
  • I used the default bucket. Should I use another one with specific parameters?

Solution

10 ms of latency is pretty high if you're running both your client and server in the same datacenter, so the first thing I would do is try to figure out why your network is giving you such high latencies.

As you mentioned, you are doing about 100 ops/sec, and this makes sense if your network latency is 10 ms: sending one request at a time with a 10 ms round trip gives at most 1 / 0.010 s = 100 ops/sec. This also means that you're likely doing synchronous IO over the network, that is, waiting for one request to make a round trip before sending the next. The Python client should have async APIs that allow you to send multiple requests without waiting for each response to come back first. This will vastly improve the number of ops/sec you can do.
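To illustrate the difference, here is a minimal sketch that simulates it without any real network or Couchbase calls. `fake_set` is a hypothetical stand-in for a key/value `set` that takes one 10 ms round trip (the latency from the question), and `asyncio` is used here only as a generic way to keep many requests in flight; it is not the Couchbase client's API.

```python
import asyncio
import time

SIMULATED_RTT = 0.010  # 10 ms round trip, as in the question (simulated only)


async def fake_set(store, key, value):
    """Hypothetical stand-in for a key/value set over the network."""
    await asyncio.sleep(SIMULATED_RTT)  # pretend to wait for the server's reply
    store[key] = value


async def run_sequential(store, n):
    # Wait for each response before sending the next request.
    for i in range(n):
        await fake_set(store, f"key{i}", i)


async def run_concurrent(store, n):
    # Issue all requests up front, then wait for all responses together.
    await asyncio.gather(*(fake_set(store, f"key{i}", i) for i in range(n)))


def timed(coro_fn, n):
    """Run coro_fn against a fresh store; return (elapsed seconds, keys stored)."""
    store = {}
    start = time.perf_counter()
    asyncio.run(coro_fn(store, n))
    return time.perf_counter() - start, len(store)


if __name__ == "__main__":
    n = 100
    seq_time, _ = timed(run_sequential, n)
    con_time, _ = timed(run_concurrent, n)
    print(f"sequential: {n / seq_time:.0f} ops/sec")
    print(f"concurrent: {n / con_time:.0f} ops/sec")
```

The sequential run cannot exceed roughly 100 ops/sec, because every request pays the full round trip on its own. The concurrent run overlaps those waits, so its throughput is no longer bounded by the latency of a single request.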

I know the website mentions that Couchbase can do 100k ops/sec on a single node, but I've gotten up to almost 250k ops/sec. The only things that will really slow you down are the network (which I maxed out in this case) and how many items are resident in memory when you request them, since having to go to disk will lower your performance, especially if you have only a few connections to the database.

Here are some answers to the questions you posted.

  1. Nodes should be in the same datacenter if they are part of the same cluster. (Use the cross datacenter replication feature if they are in different datacenters.)
  2. Expect to be able to max out the network connection; the server will not be the bottleneck when all of your data is resident in memory.
  3. There are no specific parameters that you need to tune in order to get good performance from Couchbase.
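As a rough back-of-the-envelope check on these numbers, the relation throughput ≈ requests in flight / round-trip time (Little's Law) estimates how many requests need to be outstanding at once to reach a target rate. The sketch below assumes latency stays constant as concurrency grows, which is a simplification; the function names are made up for illustration.

```python
import math


def max_sync_ops_per_sec(rtt_ms):
    """Upper bound on ops/sec when only one request is in flight at a time."""
    return 1000 / rtt_ms


def in_flight_needed(target_ops_per_sec, rtt_ms):
    """Approximate number of concurrent requests needed to hit the target rate."""
    return math.ceil(target_ops_per_sec * rtt_ms / 1000)


if __name__ == "__main__":
    rtt_ms = 10  # latency reported in the question
    print(max_sync_ops_per_sec(rtt_ms))      # one request at a time
    print(in_flight_needed(16_000, rtt_ms))  # to match the single-node figure
```

With a 10 ms round trip, one request at a time tops out at 100 ops/sec, and matching the 16,000 ops/sec seen on the single node would take on the order of 160 requests in flight.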

[EDIT] There is no reason why 1 node would perform better than 2 nodes. In fact, having more nodes should give you more throughput.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow