Question

I am implementing a quadtree. I replaced my first draft (full version can be seen here), which used smart pointers and references, with a version using raw pointers.

But filling the new tree is apparently up to two times slower. Why is this the case?

The old version's code:

// returns if coordinates fit in the tree
const bool contains(const double &x, const double &y, const double &w, const double &h) const {
    return (this->x < x &&
            this->y < y &&
            this->x + this->w > x + w &&
            this->y + this->h > x + h);
}
// returns if an element fits in the tree
const bool contains(const std::shared_ptr<Rectangle> &rect) const {
    return contains(rect->getX(), rect->getY(), rect->getW(), rect->getH());
}

// inserts an element in the tree
const bool insert(const std::shared_ptr<Rectangle> &rect) {
    // if rect is too big for this quadtree
    if(!contains(rect)) {
        auto sp = getParent();
        if(sp == nullptr) {
            return false;
        }
        return sp->insert(rect);
    }
    // if element theoretically fits in subtree
    else if(rect->getW() < getW() / 2 && rect->getH() < getH() / 2) {
        if(!subtrees[0]) {
            generateSubtrees();
        }
        for(const auto &subtree: subtrees) {
            if(subtree->contains(rect)) {
                return subtree->insert(rect);
            }
        }
    }
    children.insert(children.end(), rect);
    return true;
}

void generateSubtrees() {
    subtrees[0] = std::make_shared<QuadTree>(getW() / 2.0f, getH() / 2.0f, getX(),                 getY(),                 this);
    subtrees[1] = std::make_shared<QuadTree>(getW() / 2.0f, getH() / 2.0f, getX() + getW() / 2.0f, getY(),                 this);
    subtrees[2] = std::make_shared<QuadTree>(getW() / 2.0f, getH() / 2.0f, getX(),                 getY() + getH() / 2.0f, this);
    subtrees[3] = std::make_shared<QuadTree>(getW() / 2.0f, getH() / 2.0f, getX() + getW() / 2.0f, getY() + getH() / 2.0f, this);

}

Filling the tree with this version takes ca. 0.001367 seconds for 1000 elements.

Then I re-implemented this function:

// Returns if a Rectangle fits in the tree
const bool contains(const Rectangle *rect) const {
    return (this->x < rect->x &&
            this->y < rect->y &&
            this->x + this->w > rect->x + rect->w &&
            this->y + this->h > rect->y + rect->h);
}

// Inserts an element in the tree
const bool insert(Rectangle *rect) {
    if(!contains(rect) && parent == nullptr) {
        return false;
    }
    if(rect->w < this->w / 2.0f && rect->w < this->w / 2.0f) {
        if(children[0]==nullptr){
            generateSubtrees();
        }
        for(const auto child: children) {
            if(child->contains(rect)) {
                return child->insert(rect);
            }
        }
    }
    elements.push_back(rect);
    return true;
}

// Generate the subtrees
void generateSubtrees() {
    children[0] = new Quadtree(w/2.0f, h/2.0f, x,        y,        this);
    children[1] = new Quadtree(w/2.0f, h/2.0f, x+w/2.0f, y,        this);
    children[2] = new Quadtree(w/2.0f, h/2.0f, x,        y+w/2.0f, this);
    children[3] = new Quadtree(w/2.0f, h/2.0f, x+w/2.0f, y+w/2.0f, this);
}

Filling this version with 1000 elements takes ca. 0.00312 seconds.

As you can see, the second version using raw pointers is much slower.

PS: I fill the old tree (version 1) in a loop with

insert(std::make_shared<Rectangle>(std::rand()%999, std::rand()%999, 1, 1))

and the new one (version 2) with

insert(new Quadtree::Rectangle(std::rand()%999, std::rand()%999, 1, 1)).

Can you tell me what causes the performance loss?

(See the comments for additional information.)


Solution

This code

const bool contains(const double &x, const double &y, const double &w, const double &h) const {
    return (this->x < x &&
            this->y < y &&
            this->x + this->w > x + w &&
            this->y + this->h > x + h);  // <---- error here
}

is not the same as this code

const bool contains(const Rectangle *rect) const {
    return (this->x < rect->x &&
            this->y < rect->y &&
            this->x + this->w > rect->x + rect->w &&
            this->y + this->h > rect->y + rect->h);
}

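The first version wrongly compares against x + h; it should be y + h. Because of this, the two versions can give different answers for the same rectangle, so they do not necessarily build the same tree and their timings are not directly comparable.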
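To make the difference concrete, here is a minimal sketch that puts both checks side by side. The `Bounds` struct and the names `containsAsPosted` and `containsFixed` are hypothetical stand-ins for illustration, not the asker's actual classes.

```cpp
// Hypothetical stand-in for a tree node's bounds (not the asker's class).
struct Bounds {
    double x, y, w, h;

    // Version 1 as posted: the last comparison uses rx + rh.
    bool containsAsPosted(double rx, double ry, double rw, double rh) const {
        return x < rx &&
               y < ry &&
               x + w > rx + rw &&
               y + h > rx + rh;  // bug: rx where ry was intended
    }

    // Corrected check, equivalent to the raw-pointer version.
    bool containsFixed(double rx, double ry, double rw, double rh) const {
        return x < rx &&
               y < ry &&
               x + w > rx + rw &&
               y + h > ry + rh;
    }
};
```

For a node covering (0, 0) to (100, 100) and a 1x1 rectangle at (10, 150), `containsAsPosted` returns true although the rectangle lies below the node, while `containsFixed` returns false.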

OTHER TIPS

You need a larger test data set to make a reliable statement.

You should also repeat the time measurement multiple times.

After that, you might use a profiler to determine the root cause.

The cause could be CPU cache effects (due to the changed data layout) or something slower that the new code is doing.
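One way to repeat the measurement is sketched below. The helper name `medianSeconds` is an assumption made for this example; it only relies on `std::chrono::steady_clock` from the standard library and takes any callable that performs one complete fill.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: calls `fill` `runs` times (runs >= 1) and returns the
// middle sample of the sorted wall-clock durations, in seconds
// (the upper median when `runs` is even).
template <typename Fill>
double medianSeconds(std::size_t runs, Fill fill) {
    std::vector<double> samples;
    samples.reserve(runs);
    for (std::size_t i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        fill();  // e.g. build a fresh tree and insert all rectangles
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(std::chrono::duration<double>(stop - start).count());
    }
    // A median is less sensitive to a single slow outlier than one run.
    std::sort(samples.begin(), samples.end());
    return samples[samples.size() / 2];
}
```

The callable passed in would construct a tree and insert the rectangles, ideally with far more than 1000 elements and with the same seeded sequence of rectangles for both versions, so that both do comparable work.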

Licensed under: CC-BY-SA with attribution