WebSockets vs Ajax call for scheduled event?

https://softwareengineering.stackexchange.com/questions/373294

06-02-2021

Question

Intro

I have been weighing the pros and cons of using WebSockets vs. an Ajax call for an event which will happen every x number of seconds (in this case 5). I'll start by explaining the scenario.

Business customer wants "real-time" updates to a webpage based on data living inside a database. This database resides within a walled environment (third party application), and the only way to access this data is via a provided REST API.

Option 1

My initial thought was that we could simply do an Ajax request every minute to update the webpage with the newest data obtained from the API, but the business customer wants this data as soon as it's updated and insists that a one-minute interval is too long. Ok, that's understandable.

My next thought was to just run the Ajax call every five seconds, but I quickly realized that this may have a serious performance overhead. If 1000 users have this webpage open all day long watching the data refresh, that would result in 1000*(60/5)*60*24 (number_of_users * (seconds_per_minute / refresh_rate) * minutes_per_hour * hours_per_day) or 17,280,000 additional requests per day, with each request requiring a call to the API to obtain the information.

That's a lot of requests, and an exceptional number of API calls resulting in database reads. To overcome the excessive API calls we could implement a caching solution which would read from the database at the refresh_rate (5 seconds) and save the results to something like Redis, which the Ajax requests could then read from. This would remove the number_of_users variable from the API side of the above equation, leaving (60/5)*60*24 or 17,280 API calls per day, while the number of HTTP requests stays the same.
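For reference, the arithmetic above can be checked with a few lines of Java. This is only a sketch of the estimate; the class and method names are made up for illustration, and the inputs (1000 users, 5-second refresh) are the ones from the scenario.

```java
// Back-of-the-envelope load estimate for the polling approach.
// Class and method names are illustrative, not part of any existing code base.
final class PollingLoad {
    static final long SECONDS_PER_DAY = 60L * 60L * 24L;

    // HTTP requests per day when every user polls once per refresh interval.
    static long httpRequestsPerDay(long users, long refreshSeconds) {
        return users * (SECONDS_PER_DAY / refreshSeconds);
    }

    // Upstream API calls per day when a single shared cache refreshes on the same interval.
    static long apiCallsPerDayWithCache(long refreshSeconds) {
        return SECONDS_PER_DAY / refreshSeconds;
    }

    public static void main(String[] args) {
        System.out.println(httpRequestsPerDay(1000, 5));  // prints 17280000
        System.out.println(apiCallsPerDayWithCache(5));   // prints 17280
    }
}
```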

Ajax call TLDR:

17,280,000 HTTP requests per day
17,280 API calls per day
quick and easy to implement
does not introduce a WebSocket server onto our current software stack

Option 2

Alright, so if it's decided that the Ajax solution is too much extra overhead... then WebSockets. Using WebSockets seems like a logical approach to resolve this dilemma.

Instead of having to make an http request every five seconds back to a webserver, clients could instantiate a WebSocket connection on page load and receive updates as fast as the server can push them.

The WebSocket server could poll the API every 5 seconds for changes (eliminating the need for a caching solution as described in option 1) and push these changes to all connected clients.
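A rough sketch of that poll-and-push loop in Java might look like the following. It is deliberately library-agnostic: a `Consumer<String>` stands in for a WebSocket session and a `Supplier<String>` stands in for the REST API call, both of which are assumptions for illustration rather than any real WebSocket API. It only pushes when the polled value actually changed.

```java
import java.util.Objects;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Polls an upstream source on a fixed interval and pushes changes to every subscriber.
// Consumer<String> is a hypothetical stand-in for a WebSocket session.
final class PollingBroadcaster {
    private final Supplier<String> upstream;  // e.g. a call to the third-party REST API
    private final Set<Consumer<String>> subscribers = ConcurrentHashMap.newKeySet();
    private String lastPayload;               // last value pushed to clients

    PollingBroadcaster(Supplier<String> upstream) {
        this.upstream = upstream;
    }

    void subscribe(Consumer<String> session)   { subscribers.add(session); }
    void unsubscribe(Consumer<String> session) { subscribers.remove(session); }

    // One poll cycle: fetch, compare with the previous value, push only on change.
    // Returns true when a push happened.
    synchronized boolean pollOnce() {
        String current = upstream.get();
        if (Objects.equals(current, lastPayload)) {
            return false;
        }
        lastPayload = current;
        subscribers.forEach(s -> s.accept(current));
        return true;
    }

    // Runs pollOnce() every intervalSeconds on a single background thread.
    ScheduledExecutorService start(long intervalSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                pollOnce();
            } catch (RuntimeException e) {
                // Swallow so one failed poll does not cancel the schedule; log in real code.
            }
        }, 0, intervalSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

With this shape there is one upstream API call per interval regardless of how many clients are connected, which is the same saving the cache gives in option 1.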

WebSocket TLDR:

single call to server to instantiate connection, server then pushes changes to all connected peers
no need for caching layer
introduces a WebSocket server onto our stack

Conclusion

It seems like Option 2 is the way to go, however as mentioned above, this will require an additional application to live on our stack which will need to be thought out, designed, and supported. So it would seem that my real question really comes down to:

Would you rather allow 17.3 million http calls to your infrastructure, or design a WebSocket server?

The WebSocket server would have to be implemented in either Java or C# as these are the allowed backend technologies for our organization (no Node.js).

Solution

You will not see a huge difference.

Well, the polling will consume a lot more bandwidth simply from request headers unless you use techniques like long-polling, where the server doesn't respond to requests immediately but waits until either new data is available, or some timeout expires (in which case the client starts the next request). This allows for instantaneous updates, just like Websockets.

The difference is that websockets can push multiple messages in both directions using the same connection, whereas with long-polling the server can only provide one response per connection. Connection here means HTTP-request, even though the underlying TCP and TLS connections might be reused. Long-polling is trivial to implement if your web server is based on an event loop, but is problematic in one-thread-per-request web servers.
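To make the long-polling idea concrete, here is a minimal Java sketch, assuming a version counter that increases with every update. The names (`LongPollHub`, `awaitNewerThan`) are invented for the example. Note that it blocks the calling thread while waiting, which is exactly the cost mentioned above for one-thread-per-request servers; an event-loop or async server would park the request instead.

```java
import java.util.concurrent.TimeUnit;

// Minimal long-polling hub: callers block until the version advances or a timeout expires.
// All names are illustrative; a request handler would call awaitNewerThan().
final class LongPollHub {
    // Immutable view of the data at a given version.
    static final class Snapshot {
        final long version;
        final String payload;
        Snapshot(long version, String payload) {
            this.version = version;
            this.payload = payload;
        }
    }

    private long version = 0;     // incremented on every publish
    private String payload = "";  // latest data

    // Called when new data arrives; wakes all waiting requests.
    synchronized void publish(String newPayload) {
        payload = newPayload;
        version++;
        notifyAll();
    }

    // Blocks until data newer than knownVersion exists or the timeout elapses.
    // On timeout (or interruption) the current snapshot is returned so the client can re-poll.
    synchronized Snapshot awaitNewerThan(long knownVersion, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            while (version <= knownVersion) {
                long remaining = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remaining <= 0) {
                    break;
                }
                wait(remaining);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return new Snapshot(version, payload);
    }
}
```

The client sends the last version it saw; an unchanged version in the response means the timeout expired and it should simply issue the next request.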

But aside from that, there really isn't a huge difference. In both cases, your server will have to handle a large number of concurrent open connections. In the case of polling without long-polling, you will have fewer concurrent connections but will spend more time responding to all the requests – which still won't be a lot if the events can be cached.

The cache can likely be a normal in-memory data structure that persists between requests. It doesn't have to be an external cache server.
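As a sketch of what such an in-process cache could look like in Java: one background task refreshes a shared value, and request handlers only ever read it. `RefreshingCache` and the `loader` supplier are hypothetical names; the loader stands in for the call to the third-party API.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// In-memory cache that is refreshed on a schedule and read by many request threads.
final class RefreshingCache<T> {
    private final Supplier<T> loader;  // assumed to wrap the upstream API call
    private final AtomicReference<T> latest = new AtomicReference<>();

    RefreshingCache(Supplier<T> loader) {
        this.loader = loader;
        refresh();  // populate once so readers never see null
    }

    // Reloads the value from upstream; this is the only place the loader is called.
    void refresh() {
        latest.set(loader.get());
    }

    // Cheap read used by every incoming request; never touches upstream.
    T get() {
        return latest.get();
    }

    // Refreshes every intervalSeconds on one background thread.
    ScheduledExecutorService startRefreshing(long intervalSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                refresh();
            } catch (RuntimeException e) {
                // Keep serving the previous value if a refresh fails; log in real code.
            }
        }, intervalSeconds, intervalSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

This only holds up if all requests hit the same process. With several web server instances, each would keep its own copy and make its own upstream calls.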

So simply build the simplest thing that works for you. Both Java and C# have a very mature library ecosystem so you are likely to have good implementations available. When there is a problem, consider refactoring to use a more elaborate, more performant solution. My guess is that long-polling will be fine, unless clients typically receive multiple messages per minute, or unless you have more connections than your main webserver can comfortably handle.

Other tips

It's been a while since I played around with websockets (server: Java), and getting them set up and working wasn't really that difficult. The challenges I had were more on the client side. I'll come back to that. You shouldn't need an additional server. You can add this to an existing webserver.

Websockets would seem to be a nice fit here, but it will likely introduce some new problems. One thing I don't get is that if you are using websockets, why poll on the server? It would make more sense for the API to issue events, which would then be translated to websocket calls.

I would avoid trying to send the data payload across the websocket. Instead issue a small message that tells the client there is something to get from the server. The reason I say this has to do with the challenges on the client. For one, you should not build this around any notion that the client will always be up and listening. If someone closes their browser or shuts their laptop, it won't be listening. The first thing the client needs to do is be able to get the current state of things and/or any events that were missed. Do this with normal ajax calls.
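One way to support that catch-up step on the server is to give every event an increasing id and let the client ask for everything after the last id it saw. The Java sketch below assumes such an id scheme; `EventLog`, `append`, and `since` are invented names, and the list grows without bound here, which a real version would need to trim.

```java
import java.util.ArrayList;
import java.util.List;

// Append-only event log that lets a client fetch whatever it missed.
// Illustrative only: no trimming, no persistence.
final class EventLog {
    private final List<String> events = new ArrayList<>();

    // Appends an event and returns its 1-based id. That id is all the small
    // websocket notification needs to carry ("something newer than N exists").
    synchronized long append(String event) {
        events.add(event);
        return events.size();
    }

    // Returns every event with an id greater than lastSeenId (0 means "from the start").
    // This is what the normal ajax catch-up call would serve.
    synchronized List<String> since(long lastSeenId) {
        int from = (int) Math.max(0, Math.min(lastSeenId, events.size()));
        return new ArrayList<>(events.subList(from, events.size()));
    }
}
```

A client that was offline for a while calls the catch-up endpoint with its last seen id on reconnect, so it does not matter how many notifications it missed.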

Now that you have a way for the client to get itself straightened out, the next challenge is keeping the client connected. What I saw when working with websockets was that they wouldn't stay open indefinitely. You need to have some sort of re-connection routine on the client which will again involve getting caught up.
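A common ingredient of such a re-connection routine is a capped exponential backoff, so that a thousand clients don't all hammer the server at once after an outage. The browser side would normally be JavaScript; the logic is shown in Java here only to stay in the backend language from the question, and the base and cap values are arbitrary assumptions.

```java
// Capped exponential backoff for reconnect attempts.
// Names and the example values (1 s base, 30 s cap) are illustrative assumptions.
final class ReconnectBackoff {
    // attempt 0 waits baseMillis, each further attempt doubles the wait, capped at maxMillis.
    // Assumes baseMillis > 0 and maxMillis small enough that doubling cannot overflow.
    static long delayMillis(int attempt, long baseMillis, long maxMillis) {
        long delay = baseMillis;
        for (int i = 0; i < attempt && delay < maxMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMillis);
    }

    public static void main(String[] args) {
        for (int attempt = 0; attempt <= 6; attempt++) {
            System.out.println(delayMillis(attempt, 1000, 30000));
        }
        // prints 1000, 2000, 4000, 8000, 16000, 30000, 30000 (one per line)
    }
}
```

After each successful reconnect the attempt counter goes back to zero and the client runs its catch-up call before trusting new notifications. Adding some random jitter to the delay is a frequent refinement.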

One alternative here would be to add a last modification time to the headers on the ajax response. Then you can use If-Modified-Since logic or make HEAD calls to check for updates based on ETag or Last-Modified without having to pull down the payload each time. There's still a lot of chattiness though.
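The server-side decision for the ETag variant is small. The sketch below assumes the client echoes the previous ETag back in an `If-None-Match` header (the standard counterpart to ETag) and only handles a single tag, not the comma-separated list or `*` forms. The CRC32-based tag is a stand-in chosen to keep the example short; the class and method names are made up.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Decides between "200 with body" and "304 Not Modified" for a conditional GET.
// Simplified: compares one ETag only; a CRC32 checksum is used as an illustrative tag.
final class ConditionalGet {
    // Derives a quoted ETag from the current payload.
    static String etagFor(String payload) {
        CRC32 crc = new CRC32();
        crc.update(payload.getBytes(StandardCharsets.UTF_8));
        return "\"" + Long.toHexString(crc.getValue()) + "\"";
    }

    // 304 when the client's If-None-Match value equals the current ETag, otherwise 200.
    // ifNoneMatch may be null when the client sent no such header.
    static int statusFor(String ifNoneMatch, String currentEtag) {
        return currentEtag.equals(ifNoneMatch) ? 304 : 200;
    }
}
```

The 304 response has no body, so an unchanged poll costs only headers. That cuts bandwidth but not the request count, which is the chattiness mentioned above.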

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange