Question

I have a group of erlang nodes that are replicating their data through Mnesia's "extra_db_nodes"... I need to upgrade hardware and software so I have to detach some nodes as I make my way from node to node.

How does one remove a node and still preserve the data that was inserted?

[update] removing nodes is as important as adding them. Over time as your cluster grows it must also contract. If not then Mnesia is going to be busy trying to send data to nonexistent nodes filling up queues and keeping the network busy.

[final update] after pouring through the erlang/mnesia source code I was able to determine that it is not possible to completely disassociate nodes. While del_table_copy removes the linkage between tables it is incomplete. I would close this question but none of the close descriptions are adequate.

Was it helpful?

Solution

I wish I had found this a long time ago: http://weblambdazero.blogspot.com/2008/08/erlang-tips-and-tricks-mnesia.html

basically, with a properly functioning cluster....

  • login to the cluster to be removed

  • stop mnesia

    mnesia:stop().
    
  • login to a different node on the cluster

  • delete the schema

    mnesia:del_table_copy(schema, node@host.domain).
    

OTHER TIPS

I'm extremely late to the party, but came across this info in the doc when looking for a solution to the same problem:

"The function call mnesia:del_table_copy(schema, mynode@host) deletes the node 'mynode@host' from the Mnesia system. The call fails if mnesia is running on 'mynode@host'. The other mnesia nodes will never try to connect to that node again. Note, if there is a disc resident schema on the node 'mynode@host', the entire mnesia directory should be deleted. This can be done with mnesia:delete_schema/1. If mnesia is started again on the the node 'mynode@host' and the directory has not been cleared, mnesia's behaviour is undefined." (http://www.erlang.org/doc/apps/mnesia/Mnesia_chap5.html#id74278)

I think the following might do what you desire:

AllTables = mnesia:system_info(tables),
DataTables = lists:filter(fun(Table) -> Table =/= schema end,
                          AllTables),

RemoveTableCopy = fun(Table,Node) ->
  Nodes = mnesia:table_info(Table,ram_copies) ++
          mnesia:table_info(Table,disc_copies) ++
          mnesia:table_info(Table,disc_only_copies),
  case lists:member(Node,Nodes) of
    true -> mnesia:del_table_copy(Table,Node);
    false -> ok
  end
end,

[RemoveTableCopy(Tbl,'gone@gone_host') || Tbl <- DataTables].

rpc:call('gone@gone_host',mnesia,stop,[]),
rpc:call('gone@gone_host',mnesia,delete_schema,[SchemaDir]),
RemoveTablecopy(schema,'gone@gone_host').

Though, I haven't tested it since my scenario is slightly different.

I've certainly used this method to perform this (supporting the mnesia:del_table_copy/2 use). See removeNode/1 below:

-module(tool_bootstrap).

-export([bootstrapNewNode/1, closedownNode/0,
     finalBootstrap/0, removeNode/1]).

-include_lib("records.hrl").

-include_lib("stdlib/include/qlc.hrl").

bootstrapNewNode(Node) ->
    %% Make the given node part of the family and start the cloud on it
    mnesia:change_config(extra_db_nodes, [Node]),
    %% Now make the other node set things up
    rpc:call(Node, tool_bootstrap, finalBootstrap, []).

removeNode(Node) ->
    rpc:call(Node, tool_bootstrap, closedownNode, []),
    mnesia:del_table_copy(schema, Node).

finalBootstrap() ->
    %% Code removed to actually copy over my tables etc...
    application:start(cloud).

closedownNode() ->
    application:stop(cloud), mnesia:stop().

If you have replicated the table (added table copies) on nodes other than the one you're removing, then you're already fine - just remove the node.

If you wanted to be slightly tidier you'd delete the table copies from the node you're about to remove first via mnesia:del_table_copy/2.

Generally, mnesia gracefully handles node loss and detects node rejoin (rebooted nodes obtain new table copies from nodes that kept running, nodes that didn't reboot are detected as a network partition event). Mnesia does not consume CPU or network traffic for nodes that have gone down. I think, though I haven't confirmed it in the source, mnesia won't reconnect to nodes that have gone down automatically - the node that goes down is expected to reboot (mnesia) and reconnect.

mnesia:add_table_copy/3, mnesia:move_table_copy/3 and mnesia:del_table_copy/2 are the functions you should look at for live schema management.

The extra_db_nodes parameter should only be used when initialising a new DB node - once a new node has a copy of the schema it doesn't need the extra_db_nodes parameter.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top