Why does Riak store all my documents on only one node? n_val is equal to 3

https://stackoverflow.com/questions/22364876

erlang
riak

13-06-2023

Question

I have built a 5-node cluster using Riak 2.0pre11 on EC2 servers. I installed Riak on one server, got it working, then repeated the same steps on 4 more servers using a bash script. At that point I ran riak-admin cluster join riak@node1.example.com on nodes 2 through 5 to form a cluster.

Using the Python Riak client, I wrote a script to send 10,000 documents to Riak. That works fine, and I wrote another script to retrieve a doc, which also worked fine. Other than specifying the use of protobufs, I haven't set any other options when storing keys. I stored all the docs via a connection to node1.

However, Riak seems to be storing all 3 replicas on the same node; in other words, the storage used on node1 is about 3x the size of the original HTML docs.

The script connected to node 1, and that is where all docs are stored. I changed the script to connect to node 2 and send 10,000 more, which also all ended up on node 1. I used the command du -h /data/riak/bitcask to verify the aggregate stored size of the objects. On nodes 2 thru 4 there are only a few KB, which is the overhead of an empty Bitcask datastore.

For each document I specified the key similar to this

http://www.example.com/blogstore/007529.html4787somehash4787947:2014-03-12T19:14:32.887951Z

The first part of every key is identical (testing); only the .html name and the ISO 8601 timestamp differ. Is it possible that I have somehow subverted the perfect hashing function?

Basically I used a default config. What could be wrong? Since Riak 2.0 uses a different config format, here is a fragment of the generated config for riak-core in the old format:

{riak_core,
 [{enable_consensus,false},
  {platform_log_dir,"/var/log/riak"},
  {platform_lib_dir,"/usr/lib/riak/lib"},
  {platform_etc_dir,"/etc/riak"},
  {platform_data_dir,"/var/lib/riak"},
  {platform_bin_dir,"/usr/sbin"},
  {dtrace_support,false},
  {handoff_port,8099},
  {ring_state_dir,"/datapool/riak/ring"},
  {handoff_concurrency,2},
  {ring_creation_size,64},
  {default_bucket_props,
      [{n_val,3},
       {last_write_wins,false},
       {allow_mult,true},
       {basic_quorum,false},
       {notfound_ok,true},
       {rw,quorum},
       {dw,quorum},
       {pw,0},
       {w,quorum},
       {r,quorum},
       {pr,0}]}]}

Solution

If the bitcask directory only grows on a single node, it sounds like the nodes might not be communicating. Please run riak-admin member-status to verify that all nodes in the cluster are active.

Once you have issued the riak-admin cluster join <node> commands on all the nodes joining the cluster, you will also need to run riak-admin cluster plan to verify that the plan is correct before committing it using riak-admin cluster commit. These commands are described in greater detail here.
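
As for the hashing question, a shared key prefix is very unlikely to be the cause, because Riak hashes the whole bucket/key pair with SHA-1 before choosing a partition. The sketch below is a simplified model, not Riak's actual code: it hashes the raw bucket and key bytes (Riak hashes an Erlang term encoding of the pair), assumes partitions are handed out round-robin (the real claim algorithm differs), and uses a hypothetical bucket name docs and node names node1 to node5. It uses ring_creation_size 64 and n_val 3 from the config in the question.

```python
import hashlib
from collections import Counter

RING_SIZE = 64        # ring_creation_size from the question's config
N_VAL = 3             # n_val from default_bucket_props
KEYSPACE = 2 ** 160   # size of the SHA-1 output space


def partition_index(bucket: bytes, key: bytes) -> int:
    """Map a bucket/key pair to one of RING_SIZE partitions (simplified)."""
    digest = hashlib.sha1(bucket + b"/" + key).digest()
    return int.from_bytes(digest, "big") * RING_SIZE // KEYSPACE


def replica_nodes(bucket: bytes, key: bytes, nodes: list) -> list:
    """Owners of the N_VAL consecutive partitions starting at the key's partition.

    Assumption: partition p is owned by nodes[p % len(nodes)].
    """
    first = partition_index(bucket, key)
    owners = []
    for i in range(N_VAL):
        partition = (first + i) % RING_SIZE
        owners.append(nodes[partition % len(nodes)])
    return owners


def replica_counts(nodes: list, n_keys: int = 10_000) -> Counter:
    """Count how many replicas each node would hold for n_keys similar keys."""
    counts = Counter()
    for i in range(n_keys):
        # Keys share a long common prefix, as in the question.
        key = f"http://www.example.com/blogstore/{i:06d}.html".encode()
        counts.update(replica_nodes(b"docs", key, nodes))  # "docs" is a made-up bucket
    return counts


if __name__ == "__main__":
    five = ["node1", "node2", "node3", "node4", "node5"]
    print("5 ring members:", dict(replica_counts(five)))
    print("1 ring member: ", dict(replica_counts(["node1"])))
```

In this model, five ring members each end up with a share of the replicas even though every key starts with the same prefix, while a ring with a single member puts all three replicas of every key on that one node. The second case matches the 3x storage seen on node1, which is why checking riak-admin member-status is the first thing to do.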

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow