Question

What happens if the disk on a node hosting one of the shards in SolrCloud is full? Will index requests for that node or shard be redirected to other shards that have space?


Solution

The short answer is no: Solr will not automatically redirect index requests to other shards because a specific shard is full. With the default router, the 32-bit hash range is split evenly between the shards, and Solr uses the MurmurHash algorithm on the document ID, which keeps the number of documents in each shard roughly balanced. As a result, most of your nodes will start hitting the same limits at almost the same time, so you need to monitor your indexes and plan for growth ahead of time.

You have two options in this context.

The first is custom hashing, which allows you to route documents to specific shards based on some common field value, such as a tenant ID. Another example would be routing documents based on category. The biggest concern when using custom hashing is that it may create unbalanced shards in your cluster.

The second option is shard splitting, which allows you to split an existing shard into two subshards. To do this, use the SPLITSHARD action of the Collections API. Issue a "hard" commit after the split process completes to make the new subshards active, then unload the original shard from the cluster.
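As a rough illustration of the SPLITSHARD call mentioned above, here is a minimal Python sketch that builds the Collections API request. The base URL, the collection name "mycollection" and the shard name "shard1" are assumptions for the example, not values from the question, and the second function only works against a reachable SolrCloud node.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def split_shard_url(base_url, collection, shard):
    """Build the Collections API URL that asks Solr to split one shard."""
    params = {"action": "SPLITSHARD", "collection": collection, "shard": shard}
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


def split_shard(base_url, collection, shard, timeout=600):
    """Send the split request and return the raw response body.

    Splitting can take a long time on a large shard, hence the generous
    (and arbitrary) default timeout.
    """
    url = split_shard_url(base_url, collection, shard)
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Hypothetical host, collection and shard names.
    print(split_shard_url("http://localhost:8983/solr", "mycollection", "shard1"))
```

Only the URL builder runs without a cluster; call split_shard once you have checked the collection and shard names against your own cluster state.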

If you still want to control which shard a document goes to because you know another shard is full, you can do it this way. Solr 4.5 added the ability to specify the router implementation with the router.name parameter. If you use the "compositeId" router, you can send documents with a prefix in the document ID, which is used to calculate the hash Solr uses to determine the shard a document is sent to for indexing. The prefix can be anything you like (it doesn't have to be the shard name, for example), but it must be consistent so that Solr behaves consistently. For example, if you wanted to co-locate documents for a customer, you could use the customer name or ID as the prefix. If your customer is "IBM" and the document ID is "12345", you would insert the prefix into the document ID field: "IBM!12345". The exclamation mark ('!') is critical here, as it separates the prefix used to determine the shard from the rest of the ID. Note that the prefix is hashed, so it keeps documents with the same prefix together rather than naming the target shard directly.
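To make the ID format above concrete, here is a small Python sketch that rewrites document IDs into the "prefix!id" form before they are sent for indexing. The field names "id" and "customer" are assumptions for the example; use whatever your schema defines.

```python
import json


def composite_id(prefix, doc_id):
    """Return an ID in the 'prefix!id' form used by the compositeId router."""
    prefix, doc_id = str(prefix), str(doc_id)
    # A '!' inside the prefix would be read as an extra routing level.
    if not prefix or "!" in prefix:
        raise ValueError("prefix must be non-empty and must not contain '!'")
    return f"{prefix}!{doc_id}"


def with_routed_ids(docs, prefix_field, id_field="id"):
    """Copy each document, replacing its ID with a prefixed one."""
    routed = []
    for doc in docs:
        copy = dict(doc)
        copy[id_field] = composite_id(doc[prefix_field], doc[id_field])
        routed.append(copy)
    return routed


if __name__ == "__main__":
    # Hypothetical documents: same customer, so same prefix and same shard.
    docs = [
        {"id": "12345", "customer": "IBM"},
        {"id": "67890", "customer": "IBM"},
    ]
    # The resulting JSON array is the kind of body you would POST to the
    # collection's update handler.
    print(json.dumps(with_routed_ids(docs, "customer"), indent=2))
```

The original list is left untouched, so the same documents can be re-sent with a different prefix if the routing scheme changes.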

You can read more about it here: https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud

Licensed under: CC-BY-SA with attribution