Question

I am trying to scale a messaging app. I'm using Node.js with Socket.IO and RedisStore on the backend. The clients can be iPhone native browsers, Android browsers, etc.

I am using SSL for the Node connection and Nginx to load balance the socket connections. I am not clustering my Socket.IO app; instead I am load balancing over 10 Node servers (we have a massive number of users). Everything looks fine when the transport is WebSocket, but when it falls back to xhr-polling (in the case of old Android phones) I see a huge response time of up to 3000 rpm in New Relic, and I have to restart my Node servers every hour or so, otherwise the server crashes.
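For context, each Node process is wired up roughly like this (a simplified sketch assuming the Socket.IO 0.9-style API; ports and names are illustrative and TLS handling is omitted for brevity):

    // Simplified sketch: one of the 10 Node processes behind Nginx (Socket.IO 0.9-style API)
    var http = require('http');
    var io = require('socket.io');
    var redis = require('redis');
    var RedisStore = require('socket.io/lib/stores/redis');

    var server = http.createServer();
    var sio = io.listen(server);

    // RedisStore lets all 10 Node servers share pub/sub state for broadcasts
    sio.set('store', new RedisStore({
      redisPub: redis.createClient(),
      redisSub: redis.createClient(),
      redisClient: redis.createClient()
    }));

    server.listen(3000);   // illustrative port; Nginx proxies to it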

I was wondering if I am doing anything wrong, and whether there are any measures I can take to scale Socket.IO when using the xhr-polling transport, like increasing or decreasing the poll duration?


Solution

You are not doing anything wrong. xhr-polling is also known as long polling. The name comes from the fact that the connection is kept open for longer than usual, typically until some answer can be sent down the wire. After that connection closes, a new connection is opened to wait for the next piece of information.

You can read more on this here: http://en.wikipedia.org/wiki/Push_technology#Long_polling
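To make the pattern concrete, here is a minimal client-side sketch of long polling (plain XHR, not Socket.IO's actual internals; handleMessage and the /poll endpoint are placeholders):

    function poll() {
      var xhr = new XMLHttpRequest();
      xhr.open('GET', '/poll');            // the server holds this request open
      xhr.onload = function () {
        handleMessage(xhr.responseText);   // process whatever the server pushed down
        poll();                            // immediately open the next request
      };
      xhr.send();
    }
    poll();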

New Relic shows you the response time of the polling request. Socket.IO has a default "polling duration" of 20 seconds.

You will get a higher RPM for a smaller polling duration and a lower RPM for a higher polling duration. I would consider increasing the polling duration, or just keeping the 20-second default.
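If you do want to tune it, and assuming you are on the Socket.IO 0.9.x API (which RedisStore suggests), the setting looks roughly like this:

    io.configure(function () {
      io.set('transports', ['websocket', 'xhr-polling']);
      io.set('polling duration', 20);   // seconds a polling request is held open (default is 20)
    });

A longer duration means fewer requests per minute per client, at the cost of holding more concurrent open connections on each Node server.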

Also, to keep New Relic from displaying irrelevant data for the long polling, you can add ignore rules in the newrelic.js that you require in your app. This is detailed in the newrelic npm module documentation here: https://www.npmjs.org/package/newrelic#rules-for-naming-and-ignoring-requests
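As a rough example, assuming Socket.IO's default /socket.io/ mount path, the newrelic.js excerpt could look like this (app name and license key are placeholders):

    exports.config = {
      app_name: ['my-messaging-app'],              // placeholder
      license_key: 'YOUR_NEW_RELIC_LICENSE_KEY',   // placeholder
      rules: {
        // keep long-held polling requests out of the response-time charts
        ignore: ['^/socket\\.io/.*/xhr-polling/']
      }
    };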

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow