Question

With regard to the issue of copy vs. memcpy vs. memmove (excellent info here, btw.), I have been reading up, and it seems to me that, contrary to what is colloquially said, for example at cppreference (note: memcpy has been changed to memmove there since I took this quote) --

Notes

In practice, implementations of std::copy avoid multiple assignments and use bulk copy functions such as std::memcpy if the value type is TriviallyCopyable

-- neither std::copy nor std::copy_backward can be implemented in terms of memcpy, because for std::copy only the beginning of the destination range must not fall into the source range, whereas for memcpy the two ranges must not overlap at all.
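To make the difference concrete, here is a minimal sketch of an overlapping copy that std::copy permits. The helper names shift_left_with_copy and shift_left_with_memmove are made up for illustration only. The destination starts before the source range, so the precondition of std::copy holds even though the ranges overlap; the same call expressed through memcpy would violate memcpy's no-overlap requirement, which is why the second helper uses memmove.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <vector>

// Shift the elements one slot to the left within the same buffer.
// The destination begins *before* the source range, so d_first is not in
// [first, last) and std::copy is well defined even though the ranges overlap.
std::vector<int> shift_left_with_copy(std::vector<int> v) {
    if (v.size() > 1)
        std::copy(v.begin() + 1, v.end(), v.begin());
    return v;
}

// The same operation through memmove, which tolerates any overlap.
// Calling memcpy here instead would be undefined behavior, because
// memcpy requires that the two ranges do not overlap at all.
std::vector<int> shift_left_with_memmove(std::vector<int> v) {
    if (v.size() > 1)
        std::memmove(v.data(), v.data() + 1, (v.size() - 1) * sizeof(int));
    return v;
}
```

Called on {1, 2, 3, 4, 5}, both helpers return {2, 3, 4, 5, 5}: every element moves one position down and the last slot keeps its old value.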

Looking at Visual C++'s implementation (see the xutility header), we can also observe that VC++ uses memmove, which in turn has more relaxed requirements than std::copy:

... The objects may overlap: copying takes place as if the characters were copied to a temporary character array and then the characters were copied from the array ...

So it would appear that implementing std::copy in terms of memcpy is not possible, while using memmove is actually a pessimization (a wee tiny bit of pessimization, possibly not measurable, but still).

To come back to the question(s): Is my summary correct? Is this a problem anywhere? Regardless of what's specified, is there even a possible practical implementation of memcpy that would not also fulfill the requirements of std::copy, i.e. are there memcpy implementations that break when the ranges partially overlap as allowed by std::copy?


Solution

If the question is whether it's possible to encounter an efficient memcpy implementation that relies on undefined behavior enough that it cannot be trusted with overlapping ranges, then the answer is yes. :-)

Consider one possible implementation of memcpy on the Power(PC) architecture: the lmw instruction loads multiple consecutive words from memory into consecutive registers (the register range can be specified as an argument), and stmw then saves the supplied register range back to memory. Thus, we are talking about roughly 100/200 bytes (32-bit/64-bit CPU) buffered by the CPU during a single memcpy iteration. That is plenty of data to spoil the target range if it overlaps with the source one, especially considering that the CPU makes no promises about the relative order of individual loads and stores.
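The effect of an unfavorable access order can be sketched in portable C++. The routine backward_memcpy below is purely hypothetical: it is not the PowerPC code described above, nor any real library's memcpy. It simply copies the highest addresses first, which is one of the orders memcpy is free to use because it may assume non-overlapping ranges. The helper names are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical memcpy-like routine (NOT any real library's implementation):
// it copies the highest addresses first. A real memcpy is allowed to pick
// this order, since it may assume the ranges do not overlap.
void backward_memcpy(char* dst, const char* src, std::size_t n) {
    while (n > 0) {
        --n;
        dst[n] = src[n];
    }
}

// Shift the text one character to the left using the routine above.
// Here dst < src and the ranges overlap: legal for std::copy, not for memcpy.
std::string shift_left_backward(std::string s) {
    if (s.size() > 1)
        backward_memcpy(&s[0], &s[1], s.size() - 1);
    return s;
}

// The same shift through std::copy, which guarantees the correct result
// for this overlap pattern.
std::string shift_left_std_copy(std::string s) {
    if (s.size() > 1)
        std::copy(s.begin() + 1, s.end(), s.begin());
    return s;
}
```

For the input "abcde", shift_left_std_copy yields "bcdee", while shift_left_backward yields "eeeee": each byte is overwritten before it is read, so the last character smears across the whole range.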

Licensed under: CC-BY-SA with attribution