Question

I understand that VIntWritable can significantly reduce the size needed to store an integer, when compared to IntWritable.

My questions are: What is the cost of using VIntWritable instead of IntWritable? Is it (only) the time needed for compression? In other words, when should I use IntWritable, instead of VIntWritable?

Solution

How do you choose between a fixed-length and a variable-length encoding?

Fixed-length encodings are good when the values are distributed fairly uniformly across the whole value space, as with the output of a (well-designed) hash function. Most numeric variables have nonuniform distributions, so on average the variable-length encoding will save space. Another advantage of variable-length encodings is that you can switch from VIntWritable to VLongWritable, because their encodings are actually the same. So by choosing a variable-length representation, you have room to grow without committing to an 8-byte long representation from the beginning.
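To make the trade-off concrete, here is a minimal standalone sketch of how many bytes a value takes under the variable-length scheme. It assumes the same size rule that Hadoop's WritableUtils.getVIntSize applies, as far as I understand it. The class name VIntSizeSketch and the method encodedSize are made up for illustration and are not part of Hadoop.

```java
public class VIntSizeSketch {
    // IntWritable always writes exactly 4 bytes.
    static final int FIXED_INT_BYTES = 4;

    // Assumed size rule: values in [-112, 127] fit in a single byte;
    // anything else takes one length/sign byte plus the minimum number
    // of data bytes needed for the magnitude.
    static int encodedSize(long value) {
        if (value >= -112 && value <= 127) {
            return 1;
        }
        if (value < 0) {
            value = ~value; // one's complement, so negatives are sized like positives
        }
        int dataBits = Long.SIZE - Long.numberOfLeadingZeros(value);
        return (dataBits + 7) / 8 + 1; // data bytes + 1 prefix byte
    }

    public static void main(String[] args) {
        long[] samples = {0, 127, 128, 65_535, 65_536, 16_777_216, Integer.MAX_VALUE, -113};
        for (long v : samples) {
            System.out.printf("%d -> %d byte(s) variable-length, %d fixed%n",
                    v, encodedSize(v), FIXED_INT_BYTES);
        }
    }
}
```

Under this rule small values take 1 to 3 bytes, values from 65,536 up to 16,777,215 break even at 4 bytes, and anything from 16,777,216 upward takes 5 bytes, one more than the fixed 4. That is why a uniform spread over the full int range (hash values, for example) favors IntWritable. Because the size function works on a long, the same rule covers both VIntWritable and VLongWritable.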

I picked this up from Hadoop: The Definitive Guide, page 98.

License: CC-BY-SA with attribution
Not affiliated with StackOverflow