Question

I have a set of roughly 10 million documents tagged with geolocation and time, growing at a rate of about 100,000 per day. We need a good way to query for documents near a given latitude/longitude, but we also want to take time into account (more recent documents should be weighted much more heavily).

My current solution takes about 300 ms to run the query and is struggling under increased load, so I am trying to figure out a better way to do it. I made a prototype using a 3-dimensional kd-tree (on latitude, longitude, and time), and it was insanely fast (<1 ms). However, it was not at all suitable for production -- it required loading the whole thing in memory, and more importantly there doesn't seem to be a good way to write to/delete from a kd-tree. I'm looking for a production-ready database that offers something approaching this kind of speed, but also supports normal INSERT and UPDATE operations.
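To make the ranking concrete, here is a minimal brute-force sketch of the kind of 3-D metric a kd-tree on latitude, longitude, and time would index. It is not the prototype itself: the `KM_PER_DAY` scale, the flat-earth projection, and the sample documents are all assumptions chosen for illustration.

```python
import math

# Assumption: one day of document age "costs" as much as 5 km of distance.
KM_PER_DAY = 5.0
EARTH_RADIUS_KM = 6371.0


def to_point(lat, lon, age_days, ref_lat):
    """Project (lat, lon, age) to rough local x/y/z coordinates in km."""
    x = math.radians(lon) * EARTH_RADIUS_KM * math.cos(math.radians(ref_lat))
    y = math.radians(lat) * EARTH_RADIUS_KM
    z = age_days * KM_PER_DAY
    return (x, y, z)


def nearest(docs, lat, lon, k=3):
    """Brute-force k nearest documents to (lat, lon, age 0) in the 3-D metric."""
    q = to_point(lat, lon, 0.0, lat)
    scored = []
    for doc_id, d_lat, d_lon, age_days in docs:
        p = to_point(d_lat, d_lon, age_days, lat)
        scored.append((math.dist(q, p), doc_id))
    scored.sort()
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical sample data: (id, latitude, longitude, age in days).
docs = [
    ("a", 40.7128, -74.0060, 30.0),  # at the query point, a month old
    ("b", 40.7300, -74.0000, 0.5),   # about 2 km away, half a day old
    ("c", 41.5000, -74.0060, 0.0),   # about 88 km away, brand new
]
print(nearest(docs, 40.7128, -74.0060, k=2))
```

With this scale, the recent nearby document outranks the month-old one that sits exactly at the query point. A spatial index only replaces the linear scan; the choice of time scale still has to be tuned by hand.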

I looked into PostGIS, which says it supports 2-4 dimensional spatial fields. However, I couldn't find any conclusive information on whether or not it supports >2 dimensional spatial indices. Does anyone know if it will support a 3D index, and if so does it seem relatively performant? If not, any other options out there?

Thanks in advance.


Solution

After a bit of googling, I found this page, which has useful information on N-D indexing in PostGIS. It looks like PostGIS is the way to go for this problem; I'll try to build a prototype tomorrow.
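As a rough sketch of what such a prototype might look like, the snippet below holds the SQL for an N-D GiST index and a nearest-neighbour query, plus a helper that builds the query parameters. The table `documents`, the column `geom_t` (assumed to be `geometry(PointZ, 4326)` with the timestamp stored as a scaled Z value), the index name, and the `SECONDS_PER_DEGREE` scale are all hypothetical. `gist_geometry_ops_nd` and `<<->>` are the PostGIS N-D operator class and N-D distance operator as I understand them; check them against the documentation for your PostGIS version before relying on this.

```python
# Assumption: one day of age counts the same as one degree of distance.
SECONDS_PER_DEGREE = 86400.0

# Hypothetical table/column names; builds a GiST index over all dimensions.
CREATE_INDEX_SQL = """
CREATE INDEX documents_geom_t_nd_idx
    ON documents
    USING GIST (geom_t gist_geometry_ops_nd);
"""

# Orders by N-D distance to the query point (lon, lat, scaled "now").
NEAREST_SQL = """
SELECT id
FROM documents
ORDER BY geom_t <<->> ST_SetSRID(ST_MakePoint(%(lon)s, %(lat)s, %(z)s), 4326)
LIMIT %(k)s;
"""


def time_to_z(epoch_seconds):
    """Scale a Unix timestamp so it is comparable to degrees of lat/lon."""
    return epoch_seconds / SECONDS_PER_DEGREE


def nearest_params(lat, lon, now_epoch, k=10):
    """Build the named parameters expected by NEAREST_SQL."""
    return {"lon": lon, "lat": lat, "z": time_to_z(now_epoch), "k": k}
```

The `%(name)s` placeholders follow the named-parameter style used by drivers such as psycopg, so the query could be run with something like `cursor.execute(NEAREST_SQL, nearest_params(40.7, -74.0, time.time()))`. New rows would need their Z value computed with the same `time_to_z` scale at insert time.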

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow