Scaling of push events - Best topology?

https://stackoverflow.com/questions/12483609

02-07-2021

Question

I've built a TCP server which handles RPC (request/reply) type requests from clients, but it also allows services to push events down at ad-hoc times.

If I need to scale in the future, the RPC side is quite easy: as with web infrastructure, I'll just add more nodes and load-balance.

To scale the push messages, I will need all the servers to coordinate as the client(s) subscribed to the events could be on any server.

My options are:

1. broadcast the events to all the servers using UDP multicast/broadcast (e.g. emcaster)
2. fully interconnect the servers to each other using TCP
3. central server where all the events are sent, and all the worker servers connect to that one
4. [3], but with several layers to form a tree

My temptation is to go with [1] as it is simple and probably works well for up to 20-30 nodes. Is there a consensus on what the best strategies are for different ranges of N, where N is the number of nodes?

Solution

It's hard to advise which would be the best strategy without knowing more details. What might help is to list some things to consider for each option:

UDP Broadcast
- As you mention, this will be the easiest to implement.
- Why is the limit 20-30 nodes? Will that limit work with your requirements? If so, go with it.
- Could the UDP broadcast messages be affected by network elements such as firewalls?
Interconnected TCP Network
- This option could be a maintenance nightmare: every server needs a consistent, up-to-date list of the other servers' IP addresses.
- How will a particular server know which is the next server to send the message to? This logic could become complex.
Central Server
- Personally, I would consider this to be the second choice after [1].
- This central server may need some quite complex processing to know where to send the messages.
Central Server with a tree
- Configuration and maintenance nightmare.
- The complex logic mentioned for the central server in [3] will be even worse with this solution.
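One more thing to weigh for the UDP option: datagrams can be dropped or arrive out of order, so a receiver that cares about missed events usually needs a way to notice them. Below is a minimal sketch of one common approach, a per-sender sequence number. The header layout (`HEADER`), `encode_event` and `GapDetector` are hypothetical names made up for illustration, not part of any library mentioned here.

```python
import struct

# Hypothetical wire header: sender id (unsigned 32-bit) followed by a
# sequence number (unsigned 64-bit), both in network byte order.
HEADER = struct.Struct("!IQ")


def encode_event(sender_id: int, seq: int, payload: bytes) -> bytes:
    """Prefix the payload with the sender id and sequence number."""
    return HEADER.pack(sender_id, seq) + payload


class GapDetector:
    """Tracks the next expected sequence number for each sender."""

    def __init__(self):
        self._next_seq = {}  # sender id -> next expected sequence number

    def receive(self, datagram: bytes):
        """Return (payload, missed), where missed counts skipped events.

        Duplicates and late datagrams are reported as (None, 0).
        """
        sender_id, seq = HEADER.unpack_from(datagram)
        payload = datagram[HEADER.size:]
        # The first datagram seen from a sender sets the baseline.
        expected = self._next_seq.get(sender_id, seq)
        if seq < expected:
            return None, 0
        self._next_seq[sender_id] = seq + 1
        return payload, seq - expected
```

What to do about a detected gap (ignore it, ask the sender to resend, or resync the client) depends on how much a lost event matters for your application.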

Personally, I would look at the pros and cons of each and also consider how each solution addresses the requirements. Hopefully that exercise will make the decision easier.
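To put rough numbers on that comparison, here is a small sketch that counts server-to-server connections and sends per event for the first three options. It is a simplified model, assuming every server is interested in every event; the tree option is left out because its figures depend on the fan-out you choose. The function name `topology_costs` is just for illustration.

```python
def topology_costs(n: int) -> dict:
    """Rough costs for n worker servers (simplified, illustrative model)."""
    return {
        # UDP multicast/broadcast: no persistent server-to-server connections.
        # The sender emits one datagram, but every node receives every event.
        "udp_broadcast": {"connections": 0, "sends_per_event": 1},
        # Full TCP mesh: one connection per pair of servers; the originating
        # server sends a copy to each of the other n - 1 servers.
        "full_mesh": {"connections": n * (n - 1) // 2, "sends_per_event": n - 1},
        # Central server: one connection per worker; one send to the hub
        # plus up to n - 1 forwards from the hub in the worst case.
        "central": {"connections": n, "sends_per_event": n},
    }


for n in (5, 30, 200):
    costs = topology_costs(n)
    print(n, {name: c["connections"] for name, c in costs.items()})
```

At 30 nodes the full mesh needs 435 connections against 30 for a central server, which is one way to see why the mesh gets harder to maintain as N grows.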

OTHER TIPS

You should check out the ZeroMQ guide. If you need that UDP broadcast to compensate for lost packets, then ZeroMQ would be a good way to go. It is a lightweight message-passing interface built for efficiency. Here is the intro guide in C (the library's language) and Python:

C: http://zguide.zeromq.org/page:all
Python: http://zguide.zeromq.org/py:all

The examples have also been translated into wrappers for C++, C#, CL, Erlang, F#, Felix, Haskell, Java, Objective-C, Ruby, Ada, Basic, Clojure, Go, Haxe, Node.js, ooc, Perl, Scala, Lua and PHP.

----Update---

Sorry, it appears that the links do not change all the code examples from C to Python, but you can get alternate language translations...

Specifically for your push topology, they have a page on how to implement pub/sub in ZeroMQ: http://www.zeromq.org/whitepapers:0mq-3-0-pubsub

Try using some already-invented-wheel open source software in the middle. I can only think of one at the moment, but I am 900% sure that there are tens of copycats on the market.

Redis is a good example: it is scalable and fast, and it already has many toys, plugins and clients. With more or less 3 lines of code you can implement publisher/subscriber stuff.

Are your clients uniquely identifiable? If so, you can partition them across the various servers and build the logic for which server to connect to (UNIQUE_ID mod N?) into each client/server.
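A minimal sketch of the UNIQUE_ID mod N idea, assuming string client IDs and a fixed, ordered list of servers that every client and server agrees on. The function name `server_for` and the addresses are made up for the example. CRC32 is used instead of Python's built-in `hash()` because `hash()` on strings differs between processes.

```python
import zlib


def server_for(client_id: str, servers: list) -> str:
    """Map a client ID to one server using a stable hash modulo N."""
    # zlib.crc32 gives the same value in every process, so clients and
    # servers all compute the same mapping from the same inputs.
    index = zlib.crc32(client_id.encode("utf-8")) % len(servers)
    return servers[index]


# Hypothetical server list; the order must be identical everywhere.
servers = ["10.0.0.1:9000", "10.0.0.2:9000", "10.0.0.3:9000"]
print(server_for("client-42", servers))
```

With this in place, an event for a given client only has to be sent to the one server that owns it. The catch is that changing N remaps most clients; consistent hashing is a common refinement if servers are added or removed often.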

I would select #3 - Central Server. It would scale much better than the other options and could be designed to function like a routing table, so that traffic is only sent to a server when necessary. Additional server nodes could be added on-the-fly.
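As a rough illustration of the routing-table idea, the central server could keep a map from event topic to the set of worker servers that currently have a subscribed client, and forward each event only to those. This is a sketch under that assumption; `EventRouter` and the topic/server names are hypothetical, and the actual network sends are left out.

```python
from collections import defaultdict


class EventRouter:
    """Maps each topic to the worker servers with at least one subscriber."""

    def __init__(self):
        self._subscribers = defaultdict(set)  # topic -> set of server ids

    def subscribe(self, topic: str, server_id: str) -> None:
        """Record that a worker server has a client interested in topic."""
        self._subscribers[topic].add(server_id)

    def unsubscribe(self, topic: str, server_id: str) -> None:
        """Remove a worker server from a topic, dropping empty topics."""
        servers = self._subscribers.get(topic)
        if servers is None:
            return
        servers.discard(server_id)
        if not servers:
            del self._subscribers[topic]

    def route(self, topic: str) -> set:
        """Return the worker servers that should receive an event."""
        return set(self._subscribers.get(topic, ()))


router = EventRouter()
router.subscribe("prices", "worker-1")
router.subscribe("prices", "worker-2")
router.subscribe("news", "worker-2")
print(sorted(router.route("prices")))
```

A new worker node only has to connect and send its subscriptions, which fits the point about adding nodes on the fly.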

Out of curiosity, what language have you developed your server in?

Licensed under: CC-BY-SA with attribution
