Pregunta

Have Dictionary <Int64, byte> that gets used a lot. I mean in a loop that runs for days in a big data load. The Int64 comess from two Int32. The byte happens to be the distance (count) between those two Int32 from many very long lists.

What I need to do in this loop is

  • Generate the key
  • If key does not exists in the Dictionary then insert key and value
  • If key does exists and new value (byte) is less than the existing value then replace the existing value with the new value

Right now I am using straight math to generate the key and I know there is faster way but I cannot figure it out. I put shift as a tag as I think that is how to optimize it but I cannot figure it out.

Then when the loop is complete I need to extract the two Int32 from the Int64 to insert the data into a database.

Thanks

Per comment the math I use to combine two Int32 into one Int64

        Int64 BigInt;
        Debug.WriteLine(Int32.MaxValue);
        Int32 IntA = 0;
        Int32 IntB = 1;
        BigInt = ((Int64)IntA * Int32.MaxValue) + IntB;
        Debug.WriteLine(BigInt.ToString());
        IntA = 1;
        IntB = 0;
        BigInt = ((Int64)IntA * Int32.MaxValue) + IntB;
        Debug.WriteLine(BigInt.ToString());
        IntA = 1;
        IntB = 1;
        BigInt = ((Int64)IntA * Int32.MaxValue) + IntB;
        Debug.WriteLine(BigInt.ToString());

And the best key may not be an Int64. What I have is two Int32 that together form a key. And a value of byte. I need fast lookup on that composite key. Dictionary is fast but it does not support composite key so I create a single key that is actually a composite key. In SQL Int32A, Int32B form the PK.

The reason I don't use a composite key is I want the lookup speed of Dictionary and to my knowledge Dictionary does not support composite key. This is production code. In the SQL table there is actually a third key (Int32 sID, Int32 IntA, Int32 IntB). In this parser I am only dealing with one sID at a time (and sIDs are processed in order). I started with composite key lookup to SQL (billions in a run). When I pulled IntA, IntB out to Dictionary to process a single sID then load to SQL at the completion of each sID I got a 100:1 performance improvement. Part of the performance improvement is insert as when I insert from the Dictionary I can insert in PK order. The new IntA and IntB are not produced sorted by the parse so direct insert into SQL would severely fragment the index and I would need to rebuild the index at the end of a run.

¿Fue útil?

Solución

Sounds like you just want a shift. Personally I find it simpler to think about bitshifting when using unsigned types instead of signed ones:

// Note: if you're in a checked context by default, you'll want to make this
// explicitly unchecked
uint u1 = (uint) int1;
uint u2 = (uint) int2;

ulong unsignedKey = (((ulong) u1) << 32) | u2;
long key = (long) unsignedKey;

And to reverse:

ulong unsignedKey = (long) key;
uint lowBits = (uint) (unsignedKey & 0xffffffffUL);
uint highBits = (uint) (unsignedKey >> 32);
int i1 = (int) highBits;
int i2 = (int) lowBits;

It's entirely possible that you don't need all these conversions to unsigned types. It's more for my sanity than anything else :)

Note that you need to cast u1 to a ulong so that the shifting works in the right space - shifting a uint by 32 bits would do nothing.

Note that that's a way of combining two 32-integers to get a 64-bit integer. It's not the only way by any means.

(Side-note: Bas's solution works perfectly well - I'm just always somewhat uncomfortable with that sort of approach, for no specific reason.)

Otros consejos

If you want to convert back and forth from Int32's to Int64's you can use a struct with explicit layout:

//using System.Runtime.InteropServices;
[StructLayout(LayoutKind.Explicit)]
struct Int64ToInt32
{
    [FieldOffset(0)]
    public Int64 Int64Value;
    [FieldOffset(0)]
    public Int32 LeftInt32;
    [FieldOffset(4)]
    public Int32 RightInt32;
}

Just set/get values from the fields.

You can use bit shifting to store two 32bit values in one 64 bit variable.

I shall give a small example:

int a = 10;
int b = 5;
long c;

//To pack the two values in one variable
c = (long)a << 32;
c = c + (long)b;
//the 32 most significant bits now contain a, the 32 least significant bits contain b

//To retrieve the two values:
c >> 32 == a
c - ((c>>32)<<32) == b

Edit: I see I am a bit late to the party, just wanted to check in VS if I didn't make a mistake :)

Licenciado bajo: CC-BY-SA con atribución
No afiliado a StackOverflow
scroll top