Question

I'm trying to create a generic function that removes duplicates from an std::vector. Since I don't want to create a function for each vector type, I want to make this a template function that can accept vectors of any type. Here is what I have:

//foo.h

Class Foo {

template<typename T>
static void RemoveVectorDuplicates(std::vector<T>& vectorToUpdate);

};

//foo.cpp

template<typename T>
void Foo::RemoveVectorDuplicates(std::vector<T>& vectorToUpdate) {
for(typename T::iterator sourceIter = vectorToUpdate.begin(); (sourceIter != vectorToUpdate.end() - 1); sourceIter++) {
        for(typename T::iterator compareIter = (vectorToUpdate.begin() + 1); compareIter != vectorToUpdate.end(); compareIter++) {
            if(sourceIter == compareIter) {
                vectorToUpdate.erase(compareIter);
            }
        }
    }
}

//SomeOtherClass.cpp

#include "foo.h"

...

void SomeOtherClass::SomeFunction(void) {
    std::vector<int> myVector;

    //fill vector with values

    Foo::RemoveVectorDuplicates(myVector);
}

I keep getting a linker error, but it compiles fine. Any ideas as to what I'm doing wrong?

UPDATE: Based on the answer given by Iraimbilanja, I went and rewrote the code. However, just in case someone wanted working code to do the RemoveDuplicates function, here it is:

//foo.h

Class Foo {

    template<typename T>
    static void RemoveVectorDuplicates(T& vectorToUpdate){
        for(typename T::iterator sourceIter = vectorToUpdate.begin(); sourceIter != vectorToUpdate.end(); sourceIter++) {
            for(typename T::iterator compareIter = (sourceIter + 1); compareIter != vectorToUpdate.end(); compareIter++) {
            if(*sourceIter == *compareIter) {
                compareIter = vectorToUpdate.erase(compareIter);
            }
        }
    }
};

Turns out that if I specify std::vector in the signature, the iterators don't work correctly. So I had to go with a more generic approach. Also, when erasing compareIter, the next iteration of the loop produces a pointer exception. The post decrement of compareIter on an erase takes care of that problem. I also fixed the bugs in the iterator compare and in the initialization of compareIter in the 2nd loop.

UPDATE 2:

I saw that this question got another up vote, so figured I'd update it with a better algorithm that uses some C++14 goodness. My previous one only worked if the type stored in the vector implemented operator== and it required a bunch of copies and unnecessary comparisons. And, in hindsight, there is no need to make it a member of a class. This new algorithm allows for a custom compare predicate, shrinks the compare space as duplicates are found and makes a significantly smaller number of copies. The name has been changed to erase_duplicates to better conform to STL algorithm naming conventions.

template<typename T>
static void erase_duplicates(T& containerToUpdate) 
{
    erase_duplicates(containerToUpdate, nullptr);
}

template<typename T>
static void erase_duplicates(T& containerToUpdate, 
  std::function<bool (typename T::value_type const&, typename T::value_type const&)> pred) 
{
    auto lastNonDuplicateIter = begin(containerToUpdate);
    auto firstDuplicateIter = end(containerToUpdate);
    while (lastNonDuplicateIter != firstDuplicateIter) {
        firstDuplicateIter = std::remove_if(lastNonDuplicateIter + 1, firstDuplicateIter, 
            [&lastNonDuplicateIter, &pred](auto const& compareItem){
            if (pred != nullptr) {
                return pred(*lastNonDuplicateIter, compareItem);
            }
            else {
                return *lastNonDuplicateIter == compareItem;
            }
        });
        ++lastNonDuplicateIter;
    }
    containerToUpdate.erase(firstDuplicateIter, end(containerToUpdate));
}
Was it helpful?

Solution

Short Answer

Define the function in the header, preferably inside the class definition.

Long answer

Defining the template function inside the .cpp means it won't get #included into any translation units: it will only be available to the translation unit it's defined in.

Hence RemoveVectorDuplicates must be defined in the header, as this is the only way the compiler can text-substitute the template arguments, hence instantiating the template, producing an usable class.

There are two workarounds for this inconvenience

First, you can remove the #include "foo.h" from the .cpp and add another one, in the end of the header:

#include "foo.cpp"

This lets you organize your files consistently, but doesn't provide the usual advantages of separate compilation (smaller dependencies, faster and rarer compiles).

Second, you can just define the template function in the .cpp and explicitly instantiate it for all the types it'll be ever used with.

For example, this can go in the end of the .cpp to make the function usable with ints:

template void Foo::RemoveVectorDuplicates(std::vector<int>*);

However, this assumes you only use templates to save some typing, rather than to provide true genericity.

OTHER TIPS

One alternative you have is to first std::sort() the vector, and then use the pre-existing std::unique() function to remove duplicates. The sort takes O(nlog n) time, and removing duplicates after that takes just O(n) time as all duplicates appear in a single block. Your current "all-vs-all" comparison algorithm takes O(n^2) time.

You can't implement a template function in a .cpp file. The complete implementation has to be visible anywhere it's instantiated.

Just define the function inside the class definition in the header. That's the usual way to implement template functions.

I'll suggest to use a more "generic" approach, instead of passing a container just receive two iterators.

Something like It remove_duplicates(It first, It last), and will return an iterator, so you can call like remove: v.erase(remove_duplicates(v.begin(), v.end()), v.end()).

template <typename It>
It remove_duplicate(It first, It last)
{
  It current = first;
  while(current != last) {
    // Remove *current from [current+1,last)
    It next = current;
    ++next;
    last = std::remove(next, last, *current);
    current = next;
  }
  return last;
}

Unrelated to your problem (which has been explained already), why is this a static function rather than residing globally in a namespace? This would be somewhat C++-ier.

I don't think that code compiles....

vectorToUpdate.erase where std::vector* vectorToUpdate.... does anyone else notice there is a * where there should be a &? that code is definitely not being compiled. if you're going to use a pointer to vector you must use '->' instead of '.' I know this is actually a bit nit picky but it points out that the compiler doesn't even care about your code...

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top