Question

I have a situation in which I'm performing binary serialization of a some items and I'm writing them to an opaque byte buffer:

int SerializeToBuffer(unsigned char* buffer)
{
    stringstream ss;
    vector<Serializable> items = GetSerializables();
    string serializedItem("");
    short len = 0;
    for(int i = 0; i < items.size(); ++i)
    {
        serializedItem = items[i].Serialize();
        len = serializedItem.length();

        // Write the bytes to the stream
        ss.write(*(char*)&(len), 2);
        ss.write(serializedItem.c_str(), len);

    }
    buffer = reinterpret_cast<unsigned char*>(
                const_cast<char*>(ss.str().c_str()));
    return items.size();
}

Is it safe to remove the const-ness from the ss.str().c_str() and then reinterpret_cast the result as unsigned char* then assign it to the buffer?

Note: the code is just to give you an idea of what I'm doing, it doesn't necessarily compile.

Was it helpful?

Solution

No removing const-ness of an inherently contant string will result in Undefined Behavior.

const char* c_str ( ) const;
Get C string equivalent

Generates a null-terminated sequence of characters (c-string) with the same content as the string object and returns it as a pointer to an array of characters.
A terminating null character is automatically appended.
The returned array points to an internal location with the required storage space for this sequence of characters plus its terminating null-character, but the values in this array should not be modified in the program and are only guaranteed to remain unchanged until the next call to a non-constant member function of the string object.

OTHER TIPS

Short answer: No

Long Answer: No. You really can't do that. The internal buffer of those object belong to the objects. Taking a reference to an internal structure is definitely a no-no and breaks encapsulation. Anyway those objects (with their internal buffer) will be destroyed at the end of the function and your buffer variable will point at uninitialized memory.

Use of const_cast<> is usually a sign that something in your design is wrong.
Use of reinterpret_cast<> usually means you are doing it wrong (or you are doing some very low level stuff).

You probably want to write something like this:

std::ostream& operator<<(std::ostream& stream, Data const& serializable)
{
    return stream << serializable.internalData;

    // Or if you want to write binary data to the file:

    stream.write(static_cast<char*>(&serializable.internalData), sizeof(serializable.internalData);
    return stream;

}

This is unsafe, partially because you're stripping off const, but more importantly because you're returning a pointer to an array that will be reclaimed when the function returns.

When you write

ss.str().c_str()

The return value of c_str() is only valid as long as the string object you invoked it on still exists. The signature of stringstream::str() is

string stringstream::str() const;

Which means that it returns a temporary string object. Consequently, as soon as the line

ss.str().c_str()

finishes executing, the temporary string object is reclaimed. This means that the outstanding pointer you received via c_str() is no longer valid, and any use of it leads to undefined behavior.

To fix this, if you really must return an unsigned char*, you'll need to manually copy the C-style string into its own buffer:

/* Get a copy of the string that won't be automatically destroyed at the end of a statement. */
string value = ss.str();

/* Extract the C-style string. */
const char* cStr = value.c_str();

/* Allocate a buffer and copy the contents of cStr into it. */
unsigned char* result = new unsigned char[value.length() + 1];
copy(cStr, cStr + value.length() + 1, result);

/* Hand back the result. */
return result;

Additionally, as @Als has pointed out, the stripping-off of const is a Bad Idea if you're planning on modifying the contents. If you aren't modifying the contents, it should be fine, but then you ought to be returning a const unsigned char* instead of an unsigned char*.

Hope this helps!

Since it appears that your primary consumer of this function is a C# application, making the signature more C#-friendly is a good start. Here's what I'd do if I were really crunched for time and didn't have time to do things "The Right Way" ;-]

using System::Runtime::InteropServices::OutAttribute;

void SerializeToBuffer([Out] array<unsigned char>^% buffer)
{
    using System::Runtime::InteropServices::Marshal;

    vector<Serializable> const& items = GetSerializables();
    // or, if Serializable::Serialize() is non-const (which it shouldn't be)
    //vector<Serializable> items = GetSerializables();

    ostringstream ss(ios_base::binary);
    for (size_t i = 0u; i != items.size(); ++i)
    {
        string const& serializedItem = items[i].Serialize();
        unsigned short const len =
            static_cast<unsigned short>(serializedItem.size());

        ss.write(reinterpret_cast<char const*>(&len), sizeof(unsigned short));
        ss.write(serializedItem.data(), len);
    }

    string const& s = ss.str();
    buffer = gcnew array<unsigned char>(static_cast<int>(s.size()));
    Marshal::Copy(
        IntPtr(const_cast<char*>(s.data())),
        buffer,
        0,
        buffer->Length
    );
}

To C# code, this will have the signature:

void SerializeToBuffer(out byte[] buffer);

Here is the underlying problem:

buffer = ... ;
return items.size();

In the second-to last line you're assigning a new value to the local variable that used (up until that point) to hold the pointer your function was given as an argument. Then, immediately after, you return from the function, forgetting everything about the variable you just assigned to. That does not make sense!

What you probably want to do is to copy data from the memory pointed to by ss_str().c_str() to the memory pointed to by the pointer stored in buffer. Something like

memcpy(buffer, ss_str().s_str(), <an appropriate length here>)
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top