Question

I'm running a project that requires

  • communication between different programming languages (mostly Java and C++),
  • serialization/deserialization to and from both a binary format and JSON,
  • an IDL to generate class code for the different languages.

Thrift matches these criteria perfectly, although we don't need its RPC functions. We will be sending/receiving the serialized Thrift data via MQ. Serializing an object is very straightforward. However, when it comes to deserializing, we cannot do something like this:

byte[] data = recv();
Object object = TDeserializer.deserialize(data);
if (object instanceof TypeA) {
    TypeA a = (TypeA) object;
} else if (object instanceof TypeB) {
    TypeB b = (TypeB) object;
}

It seems we have to tell Thrift exactly which struct it needs to deserialize into, like this:

byte[] data = recv();
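TypeA a = new TypeA();
// deserialize() is an instance method; the default TDeserializer uses TBinaryProtocol
new TDeserializer().deserialize(a, data);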

Is there a way to deserialize raw data into a Thrift object without knowing its exact type?

Thanks!!


Solution

A Thrift-serialized message does not itself contain type info, so the deserializer must know the message's data type. However, it's possible to wrap all the necessary data types in a union.

Thrift code:

union Message {
    1: TypeA a;
    2: TypeB b;
}

Deserialization code:

byte[] data = recv();
Message msg = new Message();
new TDeserializer().deserialize(msg, data);
// find out the message type with msg.getSetField()
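
Once the union is deserialized, dispatching on the type is a switch over the set field. The following is a minimal sketch, assuming the Java code generated for Message exposes getSetField(), a _Fields enum with constants A and B, and the getters getA()/getB(). The TypeA, TypeB, and Message classes below are hand-written stand-ins so the example compiles without libthrift; in a real project they come from the Thrift compiler and msg comes from TDeserializer. The fields name and count, and the MessageDispatcher/describe names, are hypothetical.

```java
// Hand-written stand-ins for Thrift-generated classes (assumption: the real
// generated union offers the same getSetField()/getA()/getB() shape).
final class TypeA {
    final String name; // hypothetical payload field
    TypeA(String name) { this.name = name; }
}

final class TypeB {
    final int count; // hypothetical payload field
    TypeB(int count) { this.count = count; }
}

final class Message {
    enum _Fields { A, B }

    private final _Fields setField;
    private final Object value;

    private Message(_Fields setField, Object value) {
        this.setField = setField;
        this.value = value;
    }

    // Factory methods named after the union fields.
    static Message a(TypeA a) { return new Message(_Fields.A, a); }
    static Message b(TypeB b) { return new Message(_Fields.B, b); }

    _Fields getSetField() { return setField; }
    TypeA getA() { return (TypeA) value; }
    TypeB getB() { return (TypeB) value; }
}

final class MessageDispatcher {
    // Hypothetical helper: routes a decoded union to type-specific handling.
    static String describe(Message msg) {
        Message._Fields field = msg.getSetField();
        if (field == null) {
            return "empty or unrecognized message";
        }
        switch (field) {
            case A:
                return "TypeA: " + msg.getA().name;
            case B:
                return "TypeB: " + msg.getB().count;
            default:
                return "unhandled field: " + field;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(Message.a(new TypeA("hello")))); // TypeA: hello
        System.out.println(describe(Message.b(new TypeB(3))));       // TypeB: 3
    }
}
```

Keeping a null check and a default branch in the dispatcher gives the consumer one place to decide what to do with messages it does not recognize.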

If you need to add new message types, just add another field to the union. As long as you don't touch the old field IDs, you retain backward compatibility:

union Message {
    1: TypeA a;
    2: TypeB b;
    3: TypeC c; // OK
}

You will be able to receive messages from old producers (they will never send TypeC messages) and send TypeA/TypeB messages to old consumers. If you send a TypeC message to a consumer that's not aware of field #3, it will get an exception.

The big advantage of this approach is that the type information is very compact. If you use TCompactProtocol, the field header that carries the type info takes only 1 extra byte in most cases (when the field IDs in Message are 15 or lower; larger IDs need a few more bytes).
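
To make the size claim concrete, here is a rough sketch of how a compact-protocol field header is laid out for the single field of a union. It follows my reading of the Thrift compact protocol specification (short form for field-ID deltas 1 to 15, long form with a zigzag varint otherwise) and is not the library's own code; CompactFieldHeader and encodeFirstField are hypothetical names.

```java
import java.io.ByteArrayOutputStream;

final class CompactFieldHeader {
    // Compact-protocol type code for a struct-valued field (unions are
    // written as structs on the wire).
    static final int TYPE_STRUCT = 12;

    // Encodes the header of the first field in a struct, so the
    // "previous field ID" is 0 and the delta equals the field ID.
    static byte[] encodeFirstField(int fieldId, int type) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (fieldId >= 1 && fieldId <= 15) {
            // Short form: delta in the high nibble, type in the low nibble.
            out.write((fieldId << 4) | type);
        } else {
            // Long form: type byte, then the ID as a zigzag varint.
            out.write(type);
            int zigzag = (fieldId << 1) ^ (fieldId >> 31);
            while ((zigzag & ~0x7F) != 0) {
                out.write((zigzag & 0x7F) | 0x80);
                zigzag >>>= 7;
            }
            out.write(zigzag);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(encodeFirstField(1, TYPE_STRUCT).length);  // 1
        System.out.println(encodeFirstField(15, TYPE_STRUCT).length); // 1
        System.out.println(encodeFirstField(16, TYPE_STRUCT).length); // 2
        System.out.println(encodeFirstField(64, TYPE_STRUCT).length); // 3
    }
}
```

The union is also terminated by a one-byte stop marker, so the total wrapping overhead for a small field ID is two bytes under these assumptions.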

Be careful: if you change field IDs, you will lose backward compatibility. For example:

union Message {
    1: TypeA a;
    2: TypeC c; // Wrong
    3: TypeB b; // Wrong
    4: TypeD d; // OK
}
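
A mistake like this can be caught before deployment by comparing the field-ID assignments of two versions of the union. The sketch below is a hypothetical check, not part of Thrift: UnionCompatCheck and reassignedIds are made-up names, and the maps of field ID to type name are assumed to be written by hand or extracted from the IDL by some other tool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class UnionCompatCheck {
    // Returns, in ascending order, the field IDs whose type differs between
    // two versions of a union. IDs present in only one version are ignored,
    // since adding a new ID is the compatible way to extend the union.
    static List<Integer> reassignedIds(Map<Integer, String> oldFields,
                                       Map<Integer, String> newFields) {
        List<Integer> conflicts = new ArrayList<>();
        for (Map.Entry<Integer, String> e : new TreeMap<>(oldFields).entrySet()) {
            String newType = newFields.get(e.getKey());
            if (newType != null && !newType.equals(e.getValue())) {
                conflicts.add(e.getKey());
            }
        }
        return conflicts;
    }

    public static void main(String[] args) {
        Map<Integer, String> v2 = Map.of(1, "TypeA", 2, "TypeB", 3, "TypeC");
        Map<Integer, String> bad = Map.of(1, "TypeA", 2, "TypeC", 3, "TypeB", 4, "TypeD");
        Map<Integer, String> good = Map.of(1, "TypeA", 2, "TypeB", 3, "TypeC", 4, "TypeD");
        System.out.println(reassignedIds(v2, bad));  // [2, 3]
        System.out.println(reassignedIds(v2, good)); // []
    }
}
```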
Licensed under: CC-BY-SA with attribution