Domanda

I am trying to understand how r-tree works, and saw that there are two types of splits: quadratic and linear.

What are actually the differences between linear and quadratic? and in which case one would be preferred over the other?

È stato utile?

Soluzione

The original R-Tree paper describes the differences between PickSeeds and LinearPickSeeds in sections 3.5.2 and 3.5.3, and the charts in section 4 show the performance differences between the two algorithms. Note that figure 4.2 uses an exponential scale for the Y-axis.

http://www.cs.bgu.ac.il/~atdb082/wiki.files/paper6.pdf

I would personally use LinearPickSeeds for cases where the R-Tree has high "churn" and memory usage is not critical, and QuadraticPickSeeds for cases where the R-Tree is relatively static or in a limited memory environment. But that's just a rule of thumb; I don't have benchmarks to back that up.

Altri suggerimenti

Both are heuristics to find small area split.

In quadratic you choose two objects that create as much empty space as possible. In linear you choose two objects that are farthest apart.

Quadratic provides a bit better quality of split. However for many practical purposes linear is as simple, fast and good as quadratic.

There are even more variants: Exhaustive search, Greenes split, Ang Tan split and the R*-tree split.

All of them are heuristics to find a good split in acceptable time.

In my experiments, R*-tree splitting works best, because it produces more rectangular pages. Ang-Tan, while being "linear" produces slices that are actually a pain for most queries. Often, cost at construction/insertion is not too important, but query is.

Autorizzato sotto: CC-BY-SA insieme a attribuzione
Non affiliato a StackOverflow
scroll top