Question

I have a simple pub-sub setup on a mid-sized network, using ZMQ 2.1. Some subscribers use the C# bindings and others use the Python bindings; the issue I'm having is the same with both.

If I pull the network cable from a machine running a subscriber, I get an uncatchable error that immediately terminates that subscriber.

Here's a very simple example of a subscriber in Python (not actual production code, but enough to reproduce the problem):

import zmq

def main(server_address, port):
    context = zmq.Context()
    sub_socket = context.socket(zmq.SUB)
    sub_socket.connect("tcp://" + server_address + ":" + str(port))
    # Subscribe only to messages that start with this prefix.
    sub_socket.setsockopt(zmq.SUBSCRIBE, b"KITH1S2")

    while True:
        # Blocks until a matching message arrives.
        msg = sub_socket.recv()
        print(msg)

if __name__ == "__main__":
    main("company-intranet", 4000)

In C# the program simply terminates silently. In Python I at least get this:

Assertion failed: rc == 0 (....\src\zmq_connector.cpp:48)

This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.

I've tried non-blocking versions, and poller versions, but in either case this instant termination problem persists. Is there something obvious I should be doing but I'm not? (That is, obvious to someone else :) ).

EDIT:

Found the following: https://zeromq.jira.com/browse/LIBZMQ-207

Seems as though it is/was a known issue.

That issue in turn links to GitHub, where the changelog for 2.1.10 has this note:

  • Fixed issue 207, assertion failure in zmq_connecter.cpp:48, when an invalid zmq_connect() string was used, or the hostname could not be resolved. The zmq_connect() call now returns -1 in both those cases.

Although connect() does indeed throw an Invalid Argument exception in Python (but apparently not in C#?), recv() still fails: if the subscriber machine suddenly loses the network, that subscriber simply stops functioning.

So - I'm going to try using IP addresses instead of named addresses to see if this will bypass the issue. Not ideal, but better than insta-crash.


Solution

Original question: Is there something obvious I should be doing but I'm not?

No.

The workaround for now is to connect using an IP address instead of a hostname. With ZMQ 2.1.x, this does not cause program failure when the network is disconnected.
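If hard-coding an address is inconvenient, one possible variation is to resolve the hostname once in Python at startup and hand ZMQ the numeric address. The following is only a rough sketch of that idea and has not been verified against ZMQ 2.1.x. The helper name resolve_endpoint is made up for illustration, and the subscription prefix is the one from the question. Note that the address is resolved only once, so the subscriber will not follow a later DNS change.

```python
import socket


def resolve_endpoint(host, port):
    """Resolve host once and return a tcp:// endpoint that uses the numeric IP."""
    # Raises socket.gaierror if the name cannot be resolved, which is an
    # ordinary Python exception that the caller can catch.
    ip_address = socket.gethostbyname(host)
    return "tcp://%s:%d" % (ip_address, port)


def main(server_address, port):
    import zmq  # pyzmq; imported here so resolve_endpoint() is usable without it

    # Resolve while the network is still up, then connect by IP address only.
    endpoint = resolve_endpoint(server_address, port)

    context = zmq.Context()
    sub_socket = context.socket(zmq.SUB)
    sub_socket.connect(endpoint)
    sub_socket.setsockopt(zmq.SUBSCRIBE, b"KITH1S2")

    while True:
        print(sub_socket.recv())
```

Calling main("company-intranet", 4000) then runs the same receive loop as the subscriber in the question, except that ZMQ is only ever given a numeric address.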

Licensed under: CC-BY-SA with attribution