There are a few potential problems with the approach posted:
- On some systems objects of a type bigger than `char` need to be aligned properly to be accessible. A typical requirement for `uint32_t` is that the object is aligned to an address divisible by four.
- If `length % sizeof(uint32_t) != 0`, i.e. the length is not a multiple of four, the loop may never terminate.
- Depending on the endianness of the system, `mask` needs to contain different values. If `mask` is produced by `*reinterpret_cast<uint32_t*>(char_mask)` of a suitable array, this shouldn't be an issue.
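As a rough illustration of how the three points above could be handled together, here is a minimal sketch. It is not the code from the question: it swaps the `reinterpret_cast` for `std::memcpy`, which has no alignment requirement, and it assumes a mask of exactly four bytes. The name `xor_mask_words` and its signature are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: XOR `length` bytes at `data` with a repeating
// four-byte mask. `mask` is assumed to point at exactly four bytes.
void xor_mask_words(char* data, std::size_t length, const char* mask) {
    assert(mask != nullptr);
    assert(data != nullptr || length == 0);

    // Build the 32-bit mask from the mask bytes themselves, so its value
    // automatically matches the byte order of the machine.
    std::uint32_t word_mask;
    std::memcpy(&word_mask, mask, sizeof word_mask);

    // Count whole words up front; the loop bound cannot be stepped over.
    const std::size_t words = length / sizeof(std::uint32_t);
    for (std::size_t i = 0; i != words; ++i) {
        // Copy in and out instead of dereferencing a cast pointer,
        // so `data` does not have to be aligned.
        std::uint32_t chunk;
        std::memcpy(&chunk, data + i * sizeof chunk, sizeof chunk);
        chunk ^= word_mask;
        std::memcpy(data + i * sizeof chunk, &chunk, sizeof chunk);
    }

    // The remaining length % 4 bytes are handled one at a time.
    for (std::size_t i = words * sizeof(std::uint32_t); i != length; ++i) {
        data[i] ^= mask[i % 4];
    }
}
```

Compilers commonly turn these fixed-size `memcpy` calls into plain loads and stores, but that is worth checking on the target platform rather than assuming.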
If these issues are taken care of, `reinterpret_cast<...>(...)` can be used in the situation you have. Reinterpreting the meaning of pointers is one of the reasons this operation is there, and sometimes it is needed. I would create a suitable test case to verify that it works properly, though, to avoid having to hunt down problems when porting the code to a different platform.
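Such a test case could look roughly like the sketch below: it compares a candidate routine against a plain byte-wise reference for a range of lengths and starting offsets. The `xor_fn` signature, the function names, and the chosen ranges (lengths 0 to 32, offsets 0 to 3) are assumptions for illustration, not something taken from the question.

```cpp
#include <cassert>   // for the usage shown at the bottom
#include <cstddef>
#include <cstring>

// Assumed signature of the routine under test; `mask` points at four bytes.
using xor_fn = void (*)(char* data, std::size_t length, const char* mask);

// Straightforward byte-wise reference implementation.
void xor_reference(char* data, std::size_t length, const char* mask) {
    for (std::size_t i = 0; i != length; ++i) {
        data[i] ^= mask[i % 4];
    }
}

// Returns true if `candidate` produces the same bytes as the reference
// for every length from 0 to 32 at every starting offset from 0 to 3.
bool matches_reference(xor_fn candidate) {
    const char mask[4] = {0x12, 0x34, 0x56, 0x78};
    for (std::size_t offset = 0; offset != 4; ++offset) {
        for (std::size_t length = 0; length <= 32; ++length) {
            char expected[40];
            char actual[40];
            for (std::size_t i = 0; i != sizeof expected; ++i) {
                expected[i] = actual[i] = static_cast<char>(i + 1);
            }
            xor_reference(expected + offset, length, mask);
            candidate(actual + offset, length, mask);
            // Comparing the whole buffer also catches writes outside the range.
            if (std::memcmp(expected, actual, sizeof expected) != 0) {
                return false;
            }
        }
    }
    return true;
}

// Usage (my_xor is a placeholder for the routine being ported):
//     assert(matches_reference(&my_xor));
```

The non-zero offsets deliberately exercise unaligned starting addresses. If the routine under test requires aligned input by contract, the offset loop would have to be restricted accordingly.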
Personally I would go with a different approach until profiling shows that it is too slow:

    char* it(data);
    if (4 < length) {
        // Process groups of four bytes, leaving one to four bytes for the tail.
        for (char* end(data + length - 4); it < end; it += 4) {
            it[0] ^= mask_[0];
            it[1] ^= mask_[1];
            it[2] ^= mask_[2];
            it[3] ^= mask_[3];
        }
    }
    // Handle the remaining zero to four bytes.
    if (it != data + length) *it++ ^= mask_[0];
    if (it != data + length) *it++ ^= mask_[1];
    if (it != data + length) *it++ ^= mask_[2];
    if (it != data + length) *it++ ^= mask_[3];

I'm definitely using a number of similar approaches in software which is meant to be really fast and haven't found them to be a notable performance problem.