Question

I have a pretty straightforward setup of CouchDB on my Mint/Debian box. My Java webapp was suffering rather long delays when querying CouchDB, so I started looking for the cause.

EDIT: The query pattern is lots of small queries with small JSON objects (roughly 300 bytes up / 1 KB down).

Wireshark dumps look pretty reasonable, mostly showing a 3-5 ms request-response turnaround. JVM frame sampling showed that the socket code (client-side queries to the Couch) is somewhat busy, but nothing remarkable. Then I tried to profile the same thing with ApacheBench, and oops: keep-alive introduces a steady extra ~39 ms delay over non-persistent connections.

Does anyone know how to explain this? Maybe persistent connections increase the congestion window at the TCP layer and then idle out because of TCP_WAIT and the small request/response sizes, or something like that? Should that option (TCP_WAIT) ever be switched on for loopback TCP connections?
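A minimal JVM-side sketch along these lines (not the actual webapp code, just a probe against the same endpoint) reproduces the comparison: HttpURLConnection keeps connections alive by default, and a Connection: close header forces a fresh socket per request.

import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class CouchLatencyProbe {

    // Local CouchDB endpoint, as in the curl check below.
    private static final String COUCH_URL = "http://127.0.0.1:5984/";

    public static void main(String[] args) throws IOException {
        measure(true);   // persistent connections (keep-alive)
        measure(false);  // "Connection: close" on every request
    }

    private static void measure(boolean keepAlive) throws IOException {
        final int requests = 256;
        long start = System.nanoTime();
        for (int i = 0; i < requests; i++) {
            HttpURLConnection conn =
                    (HttpURLConnection) new URL(COUCH_URL).openConnection();
            if (!keepAlive) {
                // Forces a new TCP connection for each request.
                conn.setRequestProperty("Connection", "close");
            }
            try (InputStream in = conn.getInputStream()) {
                // Drain the small JSON body so the connection can be reused.
                while (in.read() != -1) { /* discard */ }
            }
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.printf("keepAlive=%b: %d requests, %.2f ms/request%n",
                keepAlive, requests, (double) elapsedMs / requests);
    }
}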

w@mint ~ $ uname -a
Linux mint 2.6.39-2-486 #1 Tue Jul 5 02:52:23 UTC 2011 i686 GNU/Linux
w@mint ~ $ curl http://127.0.0.1:5984/
{"couchdb":"Welcome","version":"1.1.1"}

Running with keep-alive: an average of 40 ms per request.

w@mint ~ $ ab -n 1024 -c 1 -k http://127.0.0.1:5984/
>>>snip
Server Software:        CouchDB/1.1.1
Server Hostname:        127.0.0.1
Server Port:            5984

Document Path:          /
Document Length:        40 bytes

Concurrency Level:      1
Time taken for tests:   41.001 seconds
Complete requests:      1024
Failed requests:        0
Write errors:           0
Keep-Alive requests:    1024
Total transferred:      261120 bytes
HTML transferred:       40960 bytes
Requests per second:    24.98 [#/sec] (mean)
Time per request:       40.040 [ms] (mean)
Time per request:       40.040 [ms] (mean, across all concurrent requests)
Transfer rate:          6.22 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     1   40   1.4     40      48
Waiting:        0    1   0.7      1       8
Total:          1   40   1.3     40      48

Percentage of the requests served within a certain time (ms)
  50%     40
>>>snip
  95%     40
  98%     41
  99%     44
 100%     48 (longest request)

No keep-alive, and voila: mostly 1 ms per request.

w@mint ~ $ ab -n 1024 -c 1 http://127.0.0.1:5984/
>>>snip
Time taken for tests:   1.080 seconds
Complete requests:      1024
Failed requests:        0
Write errors:           0
Total transferred:      236544 bytes
HTML transferred:       40960 bytes
Requests per second:    948.15 [#/sec] (mean)
Time per request:       1.055 [ms] (mean)
Time per request:       1.055 [ms] (mean, across all concurrent requests)
Transfer rate:          213.89 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     1    1   1.0      1      11
Waiting:        1    1   0.9      1      11
Total:          1    1   1.0      1      11

Percentage of the requests served within a certain time (ms)
  50%      1
>>>snip
  80%      1
  90%      2
  95%      3
  98%      5
  99%      6
 100%     11 (longest request)

Okay, now with keep-alive on but also asking to close the connection via the HTTP header. Again, about 1 ms per request.

w@mint ~ $ ab -n 1024 -c 1 -k -H 'Connection: close' http://127.0.0.1:5984/
>>>snip
Time taken for tests:   1.131 seconds
Complete requests:      1024
Failed requests:        0
Write errors:           0
Keep-Alive requests:    0
Total transferred:      236544 bytes
HTML transferred:       40960 bytes
Requests per second:    905.03 [#/sec] (mean)
Time per request:       1.105 [ms] (mean)
Time per request:       1.105 [ms] (mean, across all concurrent requests)
Transfer rate:          204.16 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     1    1   1.2      1      14
Waiting:        0    1   1.1      1      13
Total:          1    1   1.2      1      14

Percentage of the requests served within a certain time (ms)
  50%      1
>>>snip
  80%      1
  90%      2
  95%      3
  98%      6
  99%      7
 100%     14 (longest request)

Solution

Yes, this is related to TCP socket options. Without nodelay, Nagle's algorithm holds back the server's small writes until earlier data has been acknowledged, and with the peer delaying its ACKs each response stalls for roughly 40 ms. The configuration below leveled off all three cases at about 1 ms per request.

[httpd]
socket_options = [{nodelay, true}]

See this for details: http://wiki.apache.org/couchdb/Performance#Network
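For a Java client that manages its own sockets, the closest equivalent of the Erlang {nodelay, true} option is TCP_NODELAY; a minimal sketch (assuming CouchDB is still listening on 127.0.0.1:5984):

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class TcpNoDelayClient {
    public static void main(String[] args) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress("127.0.0.1", 5984), 1000);
            // Java counterpart of {nodelay, true}: disable Nagle's algorithm so
            // small writes go to the wire immediately instead of being held back
            // until earlier segments are acknowledged.
            socket.setTcpNoDelay(true);
            System.out.println("TCP_NODELAY enabled: " + socket.getTcpNoDelay());
        }
    }
}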

Licensed under: CC-BY-SA with attribution