ZeroRPC pub/sub aggregate results

https://stackoverflow.com/questions/19955976

30-07-2022

Question

I am designing a simple distributed database in Python and am considering ZeroRPC for the communication layer. Key lookup is implemented with the DHT protocol using the req/rep pattern. However, I would also like to be able to do distributed lookups by a key's value. For instance, if I make a request for a key with a particular value, I would like all servers to do the lookup in their local storage and then return the result to the requester. I am thinking of implementing this with pub/sub, something like this:

    # node.py
    import zerorpc

    class Node:
        def query(self, param):
            # lookup code goes here; an empty list stands in for the real result
            result = []
            return result  # could be None or [], etc.

    sub = zerorpc.Subscriber(Node())
    sub.connect('tcp://127.0.0.1:9999')
    sub.run()


    #requester.py
    import zerorpc

    pub = zerorpc.Publisher()
    pub.bind('tcp://127.0.0.1:9999')

    result = pub.query('foo_query')  # None
    print(result)  # None

The question is: can I get the result of calling pub.query(), and if so, can I aggregate that result from a bunch of subscriber nodes?

P.S. Maybe I am looking in the wrong direction and should use some other communication technique?

Solution

The Publisher->Subscriber pattern is a one-way communication pattern. It's a good way to implement unmanaged work item distribution, but you will need another channel of communication if you want two-way communication or more control over the distribution of work (load balancing, etc.).

There are two high-level solutions available to you, based on the information I have on what you're trying to do:

Blackbox the server nodes behind a single gateway

Request-Reply Broker Pattern

"Using a request-reply broker makes your client/server architectures easier to scale because clients don't see workers, and workers don't see clients. The only static node is the broker in the middle."


See more on this pattern, with code examples, in the ZMQ Guide.
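As a rough illustration of the broker idea, here is a minimal sketch using pyzmq directly rather than ZeroRPC. The function name `run_broker` and the two default ports are assumptions made for this example. Clients would connect REQ sockets to the frontend and workers would connect REP sockets to the backend.

```python
def run_broker(frontend_url="tcp://*:5559", backend_url="tcp://*:5560"):
    """Forward client requests to workers and route replies back."""
    import zmq  # pyzmq; imported here so the sketch loads without it installed

    context = zmq.Context.instance()
    frontend = context.socket(zmq.ROUTER)  # clients (REQ) connect here
    backend = context.socket(zmq.DEALER)   # workers (REP) connect here
    frontend.bind(frontend_url)
    backend.bind(backend_url)
    try:
        # Blocks, shuttling messages in both directions between the sockets.
        zmq.proxy(frontend, backend)
    finally:
        frontend.close()
        backend.close()
```

Note that a broker like this hands each request to a single worker, so on its own it balances load rather than collecting an answer from every node.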

Implement your own multicast with simple REQ<->REP

Use the typical client<->server model (REQ<->REP) for connectivity and implement multicasting the work out in your own code.
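A minimal sketch of that second approach, assuming each node exposes `query` through a `zerorpc.Server(Node())` bound to its own address instead of a Subscriber. The helper names `zerorpc_query` and `scatter_gather`, the timeout value, and the endpoint list are hypothetical. The requester asks every node in turn and merges the replies.

```python
def zerorpc_query(endpoint, param, timeout=5):
    """Ask one node for its local matches over a zerorpc request/reply client."""
    import zerorpc  # imported here so the aggregation logic loads without it

    client = zerorpc.Client(timeout=timeout)
    client.connect(endpoint)
    try:
        return client.query(param)
    finally:
        client.close()


def scatter_gather(endpoints, param, call=zerorpc_query):
    """Send the same query to every endpoint and merge the replies.

    Returns (results, errors): a flat list of all matches, plus a dict
    mapping each endpoint that failed to the exception it raised.
    """
    results, errors = [], {}
    for endpoint in endpoints:
        try:
            reply = call(endpoint, param)
        except Exception as exc:  # node down, timeout, remote error, ...
            errors[endpoint] = exc
        else:
            results.extend(reply or [])  # treat None as "no matches"
    return results, errors
```

Usage would look something like `results, errors = scatter_gather(['tcp://127.0.0.1:9001', 'tcp://127.0.0.1:9002'], 'foo_query')`. The loop is sequential to keep the sketch simple; since ZeroRPC is built on gevent, issuing the calls concurrently would probably be done with greenlets rather than threads.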

I couldn't say which solution is best, since you know your application's needs best, and these are just two common solutions. There are many ways to use ZMQ, and it can be implemented in almost any way you wish. What is often most important is designing a good pipeline at a high level, then coming back to ZMQ to do the hard work for you.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow