NoSQL and spatial data

https://stackoverflow.com/questions/2041622

19-09-2019

Question

Has anyone here had experience using NoSQL (non-relational) databases to store spatial data? Are there any potential benefits (speed, space, ...) of using such databases to hold data for, say, a desktop application, compared to using SpatiaLite or PostGIS?

I've seen posts about using MongoDB for spatial data, but I'm interested in some performance comparison.

Solution

Graph databases like Neo4j are a very good fit, especially as you can add different indexing schemes dynamically as you go. Typical things you can layer on your base data are 1D indexes (e.g. a timeline index or B-trees) or funkier schemes such as Hilbert curves; see Nick's blog. For a live demonstration, look at the AWE open source GIS desktop tool here; the underlying indexed graph is visible at around 07:00.
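
To make the Hilbert curve idea concrete, here is a minimal Python sketch of the standard grid-cell-to-curve-position mapping. It is not taken from Neo4j or from the blog mentioned above; the function name `hilbert_index` and the sample points are made up for illustration. The point is that a 2D position collapses to a single integer, which any ordinary 1D index (B-tree, sorted list) can then store while keeping spatially close cells mostly close in key order.

```python
def hilbert_index(order, x, y):
    """Map cell (x, y) on a 2**order by 2**order grid to its position on the Hilbert curve."""
    n = 1 << order          # grid side length
    d = 0                   # accumulated distance along the curve
    s = n >> 1              # size of the quadrant examined in this step
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip so the sub-curve inside the quadrant has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


# Hypothetical sample data: (name, x, y) cells on a 16 x 16 grid.
points = [("a", 3, 4), ("b", 12, 1), ("c", 3, 5), ("d", 13, 1)]

# Sorting by the curve position yields a 1D ordering suitable for a B-tree key.
by_curve = sorted(points, key=lambda p: hilbert_index(4, p[1], p[2]))
print([name for name, _, _ in by_curve])
```

Real coordinates would first have to be scaled and rounded onto the integer grid; how fine the grid should be is a tuning decision that depends on the data.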

OTHER TIPS

CouchDB also has a simple spatial extension:

http://vmx.cx/cgi-bin/blog/index.cgi/category/CouchDB

Currently, MongoDB uses geohashing with B-trees, which will be slower than the R-trees of PostGIS. (I can't give exact numbers, I'm afraid, but there is plenty of theoretical literature on the differences.) However, in these slides, http://www.slideshare.net/nknize/rtree-spatial-indexing-with-mongodb-mongodc the author talks about adding R-trees to MongoDB and sharding on a geo key. You mention desktop use, so geosharding may not be of interest, as the benefits of sharding are felt mostly on massive datasets.

Ultimately, it probably comes down to what you want to do with your spatial data. PostGIS has vastly more functions, plus support for topology, rasters, 3D, and conversions between coordinate systems, so if that is what you are looking for, PostGIS would still be the best option. If you are interested in storing billions or trillions of spatial objects and just running basic "find all points near this point or inside this area, based on some criteria" queries, then MongoDB is likely a very good choice.
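
For readers unfamiliar with the geohash-plus-B-tree approach mentioned above, the following Python sketch shows the general idea. It is not MongoDB's actual implementation: `geohash_encode` is the standard public geohash algorithm, and `prefix_scan` over a sorted list is only a stand-in for a B-tree range scan. Both names are made up for this example.

```python
from bisect import bisect_left

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=9):
    """Interleave longitude/latitude bisection bits and emit them as base-32 characters."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    value, bit_count = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, coord = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = (value << 1) | 1
            rng[0] = mid   # keep the upper half
        else:
            value <<= 1
            rng[1] = mid   # keep the lower half
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # 5 bits per base-32 character
            chars.append(_BASE32[value])
            value, bit_count = 0, 0
    return "".join(chars)


def prefix_scan(sorted_hashes, prefix):
    """Return every hash that starts with prefix; mimics a range scan on an ordered index."""
    start = bisect_left(sorted_hashes, prefix)
    end = start
    while end < len(sorted_hashes) and sorted_hashes[end].startswith(prefix):
        end += 1
    return sorted_hashes[start:end]


# Hypothetical usage: index a few points, then fetch everything in one geohash cell.
index = sorted(geohash_encode(lat, lon, 6) for lat, lon in
               [(57.64911, 10.40744), (57.64900, 10.40700), (42.6, -5.6)])
print(prefix_scan(index, geohash_encode(57.64911, 10.40744, 5)))
```

One known weakness of this scheme, and part of why an R-tree tends to do better for "near" queries, is that two points can be physically close yet fall on opposite sides of a cell boundary and share no prefix, so a real implementation also has to scan neighbouring cells.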

I've been storing spatial data with ZODB. There is surely some inherent performance advantage in accessing local file data (SpatiaLite) or a Unix socket (PostGIS) compared to TCP or HTTP requests (CouchDB etc.), but having a spatial index makes the biggest difference. I'm using the same R-trees mentioned in the MongoDB article, but there are plenty of good options. The JTS Topology Suite has various spatial indexes for Java.
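
To illustrate why a spatial index matters more than the transport, here is a small Python sketch. Note that it is deliberately not an R-tree (that would be too long for a short example) but a much simpler uniform grid index; the class `GridIndex` and its methods are hypothetical names invented here. The principle is the same: a bounding-box query only inspects buckets that overlap the query window instead of every stored object.

```python
import math
from collections import defaultdict


class GridIndex:
    """Uniform grid over 2D points; a simple stand-in for a real spatial index."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # (col, row) -> [(x, y, item), ...]

    def _cell(self, x, y):
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def insert(self, x, y, item):
        self.cells[self._cell(x, y)].append((x, y, item))

    def query(self, min_x, min_y, max_x, max_y):
        """Return items whose point lies inside the inclusive bounding box."""
        col0, row0 = self._cell(min_x, min_y)
        col1, row1 = self._cell(max_x, max_y)
        hits = []
        for col in range(col0, col1 + 1):
            for row in range(row0, row1 + 1):
                # .get avoids creating empty buckets for cells never written to.
                for x, y, item in self.cells.get((col, row), ()):
                    if min_x <= x <= max_x and min_y <= y <= max_y:
                        hits.append(item)
        return hits


# Hypothetical usage with made-up points.
index = GridIndex(cell_size=10.0)
index.insert(12.5, 7.0, "well")
index.insert(48.0, 52.0, "tower")
print(index.query(0, 0, 20, 20))
```

A grid works well for evenly spread points; R-trees cope better with clustered data and with objects that have extent (lines, polygons), which is one reason they are the usual choice in spatial databases.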

Cassandra is also an option for spatial data:

http://www.readwriteweb.com/cloud/2011/02/video-simplegeo-cassandra.php

Tarantool supports a two-dimensional spatial index (RTREE) with nearest-neighbor search, overlaps, contains, and other spatial operators. Tarantool keeps the entire data set in RAM, making it the only OSS in-memory database with spatial index support. https://github.com/tarantool/tarantool/wiki/R-tree-index-quick-start-and-usage

MarkLogic (Enterprise NoSQL) provides spatial functionality. It gives GIS applications the ability to conflate multiple objects into one entity, which supports managing relationships across structured and unstructured content, provenance and pedigree information about the data, historic and timeline information, etc., in a single entity.

Licensed under: CC-BY-SA with attribution
