Question

I understand that VIntWritable can significantly reduce the size needed to store an integer, when compared to IntWritable.

My questions are: What is the cost of using VIntWritable instead of IntWritable? Is it (only) the time needed for compression? In other words, when should I use IntWritable, instead of VIntWritable?

Solution

How do you choose between a fixed-length and a variable-length encoding?

Fixed-length encodings are good when the distribution of values is fairly uniform across the whole value space, such as the output of a (well-designed) hash function. For values like that, a variable-length encoding saves nothing and can even cost an extra byte for large values. Most numeric variables tend to have nonuniform distributions, though, and on average the variable-length encoding will save space. Another advantage of variable-length encodings is that you can switch from VIntWritable to VLongWritable, because their encodings are actually the same. So by choosing a variable-length representation, you have room to grow without committing to an 8-byte long representation from the beginning.
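To make the trade-off concrete, here is a rough sketch of how many bytes a value would take in a zero-compressed variable-length encoding, compared with the 4 bytes a fixed-length int always takes. It does not use Hadoop itself: `VarLenSizeSketch` and `varLenSize` are hypothetical names, and the thresholds are modeled on my understanding of how Hadoop's `WritableUtils` sizes these values, so treat them as assumptions and check them against the Hadoop source before relying on them.

```java
public class VarLenSizeSketch {
    // A fixed-length int always takes 4 bytes, whatever the value.
    static final int FIXED_INT_BYTES = 4;

    // Hypothetical helper (not a Hadoop API): estimated serialized size in
    // bytes of a value under a zero-compressed variable-length encoding.
    // Assumption: values in [-112, 127] fit in one byte; anything else is
    // one length/sign byte followed by the significant data bytes.
    static int varLenSize(long value) {
        if (value >= -112 && value <= 127) {
            return 1;
        }
        if (value < 0) {
            value = ~value; // one's complement, so only the magnitude bits count
        }
        int dataBits = Long.SIZE - Long.numberOfLeadingZeros(value);
        return (dataBits + 7) / 8 + 1; // data bytes plus the length byte
    }

    public static void main(String[] args) {
        long[] samples = {0, 127, 128, 65_535, 1_000_000, Integer.MAX_VALUE};
        for (long v : samples) {
            System.out.println(v + ": " + varLenSize(v)
                    + " byte(s) variable-length vs " + FIXED_INT_BYTES + " fixed");
        }
    }
}
```

Under these assumptions small values take a single byte, while values near the top of the int range take 5 bytes, one more than the fixed-length form. That extra byte is the space cost when the values are spread uniformly. The same function covers the long range as well, which matches the point that the int and long variants share one encoding.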

I picked this up from Hadoop: The Definitive Guide, page 98.

Licensed under: CC-BY-SA with attribution