Why does net_kernel:monitor_nodes/2 not deliver nodeup/nodedown messages for sname nodes?

https://stackoverflow.com/questions/19940544

30-07-2022

Question

I start up a master node with a short name and get it running a process to monitor for node up and down messages.

> erl -sname master -cookie monster
Erlang R15B03 (erts-5.9.3) [source] [64-bit] [smp:4:4] [async-threads:0] [hipe] [kernel-poll:false] [dtrace]

Eshell V5.9.3  (abort with ^G)
(master@pencil)1> c("/tmp/monitor.erl").
{ok,monitor}
(master@pencil)2> Pid = monitor:start().
<0.44.0>
(master@pencil)3> Pid ! running.
RECV :: running
running
(master@pencil)4> net_adm:names().
{ok,[{"master",52564}]}

At this point only the master node is running. I start up a second node on the same machine:

> erl -sname client -cookie monster
Erlang R15B03 (erts-5.9.3) [source] [64-bit] [smp:4:4] [async-threads:0] [hipe] [kernel-poll:false] [dtrace]

Eshell V5.9.3  (abort with ^G)
(client@pencil)1>

and wait for a minute, just in case I'm reading the docs wrong and there's a complication with the net tick time. Nothing, so on master I force the connection:

(master@pencil)5> net_adm:names().
{ok,[{"master",52564},{"client",52579}]}
(master@pencil)6>

and nothing from my little monitor process. Now, if I do the same thing but use long names--that is -name--this works just fine. I'm surprised, though, as the net_kernel docs don't mention that. What's the deal?

Here's the monitor.erl referenced above:

-module(monitor).

-export([start/0]).

start() ->
    spawn_link(fun init_loop/0).

%%%===================================================================
%%% Internal Functions
%%%===================================================================

init_loop() ->
    net_kernel:monitor_nodes(true, []),
    loop().

loop() ->
    receive
        Msg -> io:format(user, "RECV :: ~p~n", [Msg])
    end,
    loop().

Solution

net_kernel:monitor_nodes/2 definitely does deliver nodeup/nodedown messages for nodes with either short or long names.

However, the nodeup message is only delivered when a connection to the node is actually established, as mentioned in the documentation. Why you got the nodeup message with -name is a mystery (and couldn't be reproduced here), as net_adm:names/0 does not connect nodes at all. It only contacts epmd to obtain the list of nodes registered on the local host. It will even list nodes started with a different cookie.
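To see the difference for yourself, here is a minimal sketch that prints what epmd has registered next to what the local node is actually connected to. The module name conn_check and the function report/0 are made up for illustration; only net_adm:names/0 and nodes/0 are real calls.

```erlang
-module(conn_check).
-export([report/0]).

%% Hypothetical helper: compare the names registered with the local epmd
%% against the nodes this node currently has a connection to.
report() ->
    Registered = case net_adm:names() of
                     {ok, Names} -> [Name || {Name, _Port} <- Names];
                     %% For example, epmd is not running on this host.
                     {error, _Reason} -> []
                 end,
    %% nodes/0 lists only visible nodes with an established connection.
    Connected = nodes(),
    io:format("registered with epmd: ~p~n", [Registered]),
    io:format("connected:            ~p~n", [Connected]),
    {Registered, Connected}.
```

Run on master right after starting client, I would expect "client" to show up in the first list while the second list stays empty, which is exactly the state in which no nodeup has been sent yet.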

If you connect the client to the master (or the other way around) with net_adm:ping/1 (or an rpc call), the monitoring process will receive the nodeup message.
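As a rough sketch of that sequence (not the only way to do it), the helper below subscribes to node status messages, pings the target, and waits for the nodeup. The module name nodeup_demo and its function names are hypothetical; it assumes the plain {nodeup, Node} message format you get from net_kernel:monitor_nodes/1 or from monitor_nodes/2 with an empty option list.

```erlang
-module(nodeup_demo).
-export([connect_and_wait/2]).

%% Hypothetical helper: subscribe to node status messages, connect to Node
%% with net_adm:ping/1, and wait up to Timeout milliseconds for nodeup.
connect_and_wait(Node, Timeout) ->
    case net_kernel:monitor_nodes(true) of
        ok ->
            Result = ping_and_wait(Node, Timeout),
            net_kernel:monitor_nodes(false),
            Result;
        Other ->
            %% Subscription failed, e.g. the local node is not distributed.
            {error, {monitor_nodes, Other}}
    end.

ping_and_wait(Node, Timeout) ->
    case lists:member(Node, nodes()) of
        true ->
            %% Already connected, so no new nodeup will be generated.
            already_connected;
        false ->
            case net_adm:ping(Node) of
                pong -> await_nodeup(Node, Timeout);
                pang -> {error, unreachable}
            end
    end.

%% Only matches the nodeup for Node; status messages about other nodes
%% are left in the caller's mailbox.
await_nodeup(Node, Timeout) ->
    receive
        {nodeup, Node} -> nodeup
    after Timeout ->
        timeout
    end.
```

On master you would call something like nodeup_demo:connect_and_wait(client@pencil, 5000). A pang result usually points at a cookie mismatch or a host name that cannot be resolved.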

Licensed under: CC-BY-SA with attribution
