Problem

I'm compiling a lookup table that needs to have 133,784,560 entries, with values ranging from 0 to 7,462.

The maximum value of 7,462 can be contained within 13 bits. This gives us a lookup table of around 207 MB.

A 16-bit value increases the lookup table size by around 50 MB.
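For reference, the arithmetic behind those figures (taking 1 MB as 2^20 bytes):

    133,784,560 × 13 bits = 1,739,199,280 bits = 217,399,910 bytes ≈ 207 MB
    133,784,560 × 16 bits = 2,140,552,960 bits = 267,569,120 bytes ≈ 255 MB
    difference                                 =  50,169,210 bytes ≈  48 MB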

The extra size is not significant by today's standards, but it would be nice to keep the lookup table as thin as possible.

When the LUT is loaded into memory, how much overhead is there to evaluate the value of a range of 13 bits, compared to evaluating the 16 bits? I'm assuming there would be some intermediary bitwise operations to convert it into a format the computer can work with, or am I wrong?

Every clock cycle counts, as this will be involved in a brute force analysis program that will run billions of comparisons. Should I just stick with the slightly larger LUT?


Solution

My guess would be that this is a case of premature optimisation. Bit-fiddling is quite expensive, and will probably dwarf the extra memory-access cost, unless by sheer coincidence your cache performance hits an elbow somewhere between those two sizes.

Ultimately, there's no substitute for just trying it out.
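For instance, a minimal Stopwatch-based harness along these lines (the table contents, index stream and loop sizes are placeholders, not the asker's real data) would give a first-order answer:

    using System;
    using System.Diagnostics;

    class LutTimingSketch
    {
        const int Entries = 133_784_560;

        static void Main()
        {
            // Plain 16-bit layout: roughly 255 MB, left as dummy zeros here.
            var table16 = new ushort[Entries];

            // A fixed stream of pseudo-random indices so every run sees the same pattern.
            var rng = new Random(1);
            var indices = new int[50_000_000];
            for (int i = 0; i < indices.Length; i++)
                indices[i] = rng.Next(Entries);

            long sum = 0;
            var sw = Stopwatch.StartNew();
            foreach (int idx in indices)
                sum += table16[idx];             // the "just retrieve it" case
            sw.Stop();
            Console.WriteLine($"16-bit: {sw.ElapsedMilliseconds} ms (checksum {sum})");

            // Repeat the same loop against a packed 13-bit accessor
            // (one possible version is sketched under the last answer) and compare.
        }
    }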

Other tips

I would stick with 16-bit values rather than 13-bit. Since you're doing brute force analysis and billions of comparisons, the extra 50 MB seems a small price to pay. Also keep in mind that the code managing the 13-bit case will be significantly more complex, as you'll usually have to read across multiple 16-bit (or 32-bit, or whatever) values and shift and combine in order to get the actual value you need. In other words, extracting value #n is going to be much more complex than simply "retrieve it from the table".
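To make the contrast concrete, the 16-bit case really is just an array index (a hypothetical packed 13-bit accessor is sketched under the last answer):

    // 16-bit layout: one array slot per entry, so a lookup is a single index.
    ushort[] table = new ushort[133_784_560];   // about 255 MB, zeros until filled
    int n = 12_345_678;                         // arbitrary example index
    int value = table[n];                       // "retrieve it from the table"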

The only real way to know for sure, however, would be to try both and see... but unless you've got the time to implement the 13-bit value retrieval code that you might not end up using, I probably wouldn't bother.

I would say try it both ways and see which one is faster. Also, I think this is a good candidate to drop into C++. You can encapsulate this in a managed C++ project which you can reference directly from C#. This will allow you to do all the low level optimizations that you want while still being directly accessible to the rest of your app.

When the LUT is loaded into memory, how much overhead is there to evaluate the value of a range of 13 bits, compared to evaluating the 16 bits?

Assuming you mean storing data in an array like this:

    AAAAAAAA AAAAABBB BBBBBBBB BBCCCCCC
    CCCCCCCD DDDDDDDD DDDDEEEE EEEEEEEE
    EFFFFFFF FFFFFFGG GGGGGGGG GGGHHHHH
    HHHHHHHH ...
  • It's more complicated to calculate the memory address where your LUT entry is stored. You can't just (have the compiler) multiply by 2 like you can with short[].
  • You also have to deal with the fact that your 13-bit value might be split across 2 array elements (or 3 if the underlying array is a byte[]).
  • Plus the “intermediary bitwise operations”.
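As a rough sketch of what retrieval looks like for the layout pictured above (the class name, byte[] backing store and padding strategy are all assumptions, not the asker's actual code):

    static class PackedLut
    {
        public const int Entries = 133_784_560;
        const int BitsPerEntry = 13;

        // Backing store for the MSB-first packed layout shown above, plus 2 bytes
        // of padding so the 3-byte read below can never run off the end.
        public static readonly byte[] Data =
            new byte[((long)Entries * BitsPerEntry + 7) / 8 + 2];

        public static int Get(int n)
        {
            long bitOffset = (long)n * BitsPerEntry;   // no simple "multiply by 2" here
            int byteIndex  = (int)(bitOffset >> 3);    // byte the entry starts in
            int bitInByte  = (int)(bitOffset & 7);     // bit position within that byte

            // The 13 bits may straddle 2 or 3 bytes, so pull three consecutive
            // bytes into a single 24-bit window, most significant byte first...
            int window = (Data[byteIndex] << 16)
                       | (Data[byteIndex + 1] << 8)
                       |  Data[byteIndex + 2];

            // ...then shift the wanted bits down and mask off the low 13 bits.
            return (window >> (11 - bitInByte)) & 0x1FFF;
        }
    }

By comparison, the ushort[] accessor is just return table[n], which the JIT turns into more or less a single scaled load (plus a bounds check); the packed version pays for the extra shift/mask work and, occasionally, for an entry that straddles a cache-line boundary.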