Problem

In my custom physics engine, the biggest bottleneck is a method that gets all bodies from the spatial partitioning (a 2D grid) and returns a collection containing only unique Body pointers (no duplicates).

template<typename T, typename V> bool contains(const T& mContainer, const V& mValue)
{
    return std::find(std::begin(mContainer), 
                     std::end(mContainer), mValue) != std::end(mContainer);
}

const vector<Body*>& GridInfo::getBodiesToCheck()
{
    bodiesToCheck.clear();
    for(auto& query : queries)
        for(auto& body : *query)
            if(!contains(bodiesToCheck, body)) bodiesToCheck.push_back(body);
    return bodiesToCheck;
}

Using a profiler shows that the bottleneck is in the contains function.

Obviously, an std::unordered_set would be the "ideal" solution here. However, it is a lot slower than the current solution. I've also tried google::dense_hash_set, which is faster than std::unordered_set, but still slower than the current solution.

const unordered_set<Body*>& GridInfo::getBodiesToCheck()
{
    bodiesToCheck.clear();
    for(auto& query : queries)
        for(auto& body : *query)
            /*if(!contains(bodiesToCheck, body))*/ bodiesToCheck.insert(body);
    return bodiesToCheck;
}

Why are the "correct" containers slower than a std::vector?

Is there any way I can speed up this method further?


Solution

There are two possibilities I can think of:

  1. You have a small enough number of data elements that a linear search is faster than a hash-plus-compare lookup.
  2. You're using the same contains function to find an element in the unordered_set, instead of using the member function find (see the sketch after this list).
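
A minimal sketch of the second point, assuming bodiesToCheck is a std::unordered_set<Body*>: the member find() is an average O(1) bucket lookup, while the generic std::find above walks every element.

// Average O(1) membership test via the member function:
if(bodiesToCheck.find(body) == std::end(bodiesToCheck))
    bodiesToCheck.insert(body);
// Equivalently, insert() alone already rejects duplicates.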

Other tips

If the number of duplicate bodies isn't that high compared to the rest, one option might be to just push all your bodies into the vector and remove the duplicates afterwards. This requires a std::sort followed by an erase(std::unique(...), end()) pass, as sketched below.
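
A minimal sketch of that idea, reusing the getBodiesToCheck signature from the question:

#include <algorithm>

const vector<Body*>& GridInfo::getBodiesToCheck()
{
    bodiesToCheck.clear();
    for(auto& query : queries)
        for(auto& body : *query)
            bodiesToCheck.push_back(body);  // push everything, duplicates included

    // Sorting makes duplicates adjacent; std::unique then compacts them
    // and erase() trims the leftover tail.
    std::sort(std::begin(bodiesToCheck), std::end(bodiesToCheck));
    bodiesToCheck.erase(std::unique(std::begin(bodiesToCheck), std::end(bodiesToCheck)),
                        std::end(bodiesToCheck));
    return bodiesToCheck;
}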

It may be worth a try, though, considering that your vector seems to outperform a std::unordered_set anyway: the set has neither the memory locality nor the cheap element access of a std::vector.

I am not sure I understand the problem correctly, but it seems that lookup will be slower with std::vector/std::find, while iteration may be faster than with std::unordered_set. If that is the case, and you are not limited by memory constraints, you can mix both approaches:

Maintain both a std::unordered_set and a std::vector with the elements. Look up in the std::unordered_set to determine whether the element is already there; if it is not, add it to both containers. At the end, iterate over the std::vector.

Note that you can give both containers a hint about the 'expected' number of elements they will contain, which reduces the number of memory allocations/rehashes; see the sketch below.
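
A hedged sketch of this combined approach; the seen member and the expectedBodies hint are assumptions for illustration, not part of the original code:

const vector<Body*>& GridInfo::getBodiesToCheck()
{
    bodiesToCheck.clear();                  // std::vector<Body*>, kept for fast iteration
    seen.clear();                           // assumed std::unordered_set<Body*> member
    bodiesToCheck.reserve(expectedBodies);  // capacity hints cut reallocations...
    seen.reserve(expectedBodies);           // ...and rehashing
    for(auto& query : queries)
        for(auto& body : *query)
            if(seen.insert(body).second)    // insert() reports whether body was new
                bodiesToCheck.push_back(body);
    return bodiesToCheck;
}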

I ran into a similar problem where a linear search was faster than a hash-plus-compare lookup (supporting the first point in Mark's answer).

I tried using BFS to improve the CPU voxelization of a mesh, with a std::unordered_set marking visited voxels. However, the unordered_set version was 100% slower than linearly iterating the space. Comparing profiles, I found that the linear search is faster whenever the ratio of active voxels to all visited voxels is above 3%; otherwise, BFS with the unordered_set wins.

Here is what the documentation says about unordered_set:

"unordered_set containers are faster than set containers to access individual elements by their key, although they are generally less efficient for range iteration through a subset of their elements."

Well, since the find call eventually loops through a considerable number of elements, that's probably the reason...

Maybe, if you've plugged in a custom hash function, you should improve it to make it faster... That's the only thing I can think of; a hypothetical example is sketched below.
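
For illustration only, a hypothetical pointer hash; BodyPtrHash and the shift amount are assumptions, and whether this beats the default hash depends entirely on your allocator:

#include <cstddef>
#include <cstdint>
#include <unordered_set>

struct BodyPtrHash
{
    std::size_t operator()(const Body* p) const
    {
        // Pool-allocated objects tend to be aligned, so the low bits of the
        // address are always zero; shifting them away spreads the buckets.
        return reinterpret_cast<std::uintptr_t>(p) >> 4;
    }
};

std::unordered_set<Body*, BodyPtrHash> bodiesToCheck;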
