How to correctly (yet efficiently) implement something like “vector::insert”? (Pointer aliasing)

StackOverflow https://stackoverflow.com/questions/12031781

  •  27-06-2021
  •  | 
  •  

Question

Consider this hypothetical implementation of vector:

template<class T>  // ignore the allocator
struct vector
{
    typedef T* iterator;
    typedef const T* const_iterator;

    template<class It>
    void insert(iterator where, It begin, It end)
    {
        ...
    }

    ...
}

Problem

There is a subtle problem we face here:
There is the possibility that begin and end refer to items in the same vector, after where.

For example, if the user says:

vector<int> items;
for (int i = 0; i < 1000; i++)
    items.push_back(i);
items.insert(items.begin(), items.end() - 2, items.end() - 1);

If It is not a pointer type, then we're fine.
But we don't know, so we must check that [begin, end) does not refer to a range already inside the vector.

But how do we do this? According to C++, if they don't refer to the same array, then pointer comparisons would be undefined!
So the compiler could falsely tell us that the items don't alias, when in fact they do, giving us unnecessary O(n) slowdown.

Potential solution & caveat

One solution is to copy the entire vector every time, to include the new items, and then throw away the old copy.

But that's very slow in scenarios such as in the example above, where we'd be copying 1000 items just to insert 1 item, even though we might clearly already have enough capacity.

Is there a generic way to (correctly) solve this problem efficiently, i.e. without suffering from O(n) slowdown in cases where nothing is aliasing?

Was it helpful?

Solution

You can use the predicates std::less etc, which are guaranteed to give a total order, even when the raw pointer comparisons do not.

From the standard [comparisons]/8:

For templates greater, less, greater_equal, and less_equal, the specializations for any pointer type yield a total order, even if the built-in operators <, >, <=, >= do not.

OTHER TIPS

But how do we do this? According to C++, if they don't refer to the same array, then pointer comparisons would be undefined!

Wrong. The pointer comparisons are unspecified, not undefined. From C++03 §5.9/2 [expr.rel]:

[...] Pointers to objects or functions of the same type (after pointer conversions) can be compared, with a result defined as follows:

[...]
-Other pointer comparisons are unspecified.

So it's safe to test if there is an overlap before doing the expensive-but-correct copy.

Interestingly, C99 differs from C++ in this, in that pointer comparisons between unrelated objects is undefined behavior. From C99 §6.5.8/5:

When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. [...] In all other cases, the behavior is undefined.

Actually, this would be true even if they were regular iterators. There's nothing stopping anyone doing

std::vector<int> v; 
// fill v 
v.insert(v.end() - 3, v.begin(), v.end());

Determining if they alias is a problem for any implementation of iterators.

However, the thing you're missing is that you're the implementation, you don't have to use portable code. As the implementation, you can do whatever you want. You could say "Well, in my implementation, I follow x86 and < and > are fine to use for any pointers.". And that would be fine.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top