Question

So I have a construct called packet

struct Packet {
    unsigned int packet_type;
    wchar_t packet_length[128];
    wchar_t file_name[256];
    wchar_t template_name[256];
    wchar_t file_name_list[1024];
    wchar_t file_data[1024];

    void serialize(char * dat) {
        memcpy(dat, this, sizeof(Packet));
    }

    void deserialize(const char * dat) {
        memcpy(this, dat, sizeof(Packet));
    }
};

I'm trying to desieralize from this data

{byte[2692]}
[0]    0       unsigned int packet_type; (4 bytes)
[1]    0
[2]    0
[3]    0
[4]    50 '2'  wchar_t packet_length[128]; (128 bytes)
[3]    0
[5]    54 '6'
[3]    0
[6]    57 '9'
[3]    0
[7]    50 '2'
[8]    0
[...]  0
[132]  112 'p'  wchar_t file_name[256]; (256 bytes)
[133]  0
[134]  104 'h'
[...]  0

But the memcpy in deserialze isn't giving me the file_name, but it does give me the packet_length. What's up with this? Thanks!

EDIT: So it's clear to me now that wchar_t is taking up more space than I once thought; however, I'm being told not to use memcpy at all?

I've written this deserialize method and it grabs the data correctly. Will this still cause a security leak?

void deserialize(const char * dat) {
        memcpy(&(packet_type), dat, 4);
        memcpy(&(packet_length[0]), dat + 4, 128);
        memcpy(&(file_name[0]), dat + 132, 256);
        memcpy(&(template_name[0]), dat + 388, 256);
        memcpy(&(file_name_list[0]), dat + 644, 1024);
        memcpy(&(file_data[0]), dat + 1668, 1024);
    }
Was it helpful?

Solution 2

The layout of your char array assumes that the size of wchar_t is two bytes; it is not - here is an example of a system where the size of wchar_t is 4, so the size of Packet is 10756, not 2692 bytes: (link to a demo).

That is why your memcpy trick from the edit presents a problem: it assumes that the layout of data in the char[] array matches the layout of wchar_t[] arrays, which it may or may not match. If you know that your data array has two-character elements stored in little endian format (LSB first), you can write your own function that converts the data from the source to the destination, and call it for portions of your serialized data, like this:

void bytes_to_wchar(wchar_t *dest, const unsigned char* src, size_t length) {
    for (size_t i = 0 ; i != lengt ; i++) {
        dest[i] = src[2*i] | (src[2*i+1] << 8);
    }
}

Now you can use this function to copy data into wchar_t arrays independently of the wchar_t size on the target system, or the endianness of the target system:

void deserialize(const char * dat) {
    bytes_to_wchar(packet_type,       dat + 0,      4);
    bytes_to_wchar(packet_length[0],  dat + 4,    128);
    bytes_to_wchar(file_name[0],      dat + 132,  256);
    bytes_to_wchar(template_name[0],  dat + 388,  256);
    bytes_to_wchar(file_name_list[0], dat + 644,  1024);
    bytes_to_wchar(file_data[0],      dat + 1668, 1024);
}

The shortcut of saving the data from memory and writing it back may work when you do it on the same hardware, using the same compiler. Even then it remains sensitive to small adjustments in the headers that you use and in the settings of the compiler.

If the character array that you need to copy into the struct has a fixed layout, you need to write a function to process that layout, converting two-byte groups into wchar_ts, four-byte groups into unsigned ints, and so on.

OTHER TIPS

Please do not use this method for serializing structures. It's utterly non-portable.

Moreover, the compiler might as well pad, align or reorder members depending on the target architecture, endianness, optimizations and a bunch of other things.

A much more elegant way would be to use boost::Serialization, which takes care of low-level details in a portable way.

If, on the other hand, you just want to inspect your structures, then a debugger comes handy...

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top