Hierarchical data structure in Firebase with geospatial/GeoHash searching: am I doing this right?

StackOverflow https://stackoverflow.com/questions/21840822

Question

Intent:
Create a catalog of aerial photography. Records will be created through AngularJS forms but retrieved by a map viewer such as Leaflet.

Goals:
Retrieve information in multiple ways: by Flight, by Collection, by Image ID, and by Image geography.

Data Structure:

Collection: {
    meta: '',
    Flights: {
        meta: '',
        Aerials: {
            meta:'',
            geopoint: [lat, lon],
            geopoly: [[x1,y1], [x2,y2], [x3,y3], [x4,y4]]
        }
    }
}

Firebase (attempt at denormalizing) Structure:

  • /Collections/CID
  • /Collections_Flights/CID
  • /Collections_Images/CID
  • /Flights/FID
  • /Images/IID

(I referenced this stackoverflow)

Questions:

  1. If there are 1 million images, does the denormalization look adequate? (Each flight will have about 80 images and each collection will average 100 flights, if it matters.)

  2. Should I use a GeoHash? If so, does the GeoHash become the "Image ID (IID)" for the Firebase reference /Images/IID? (Or should I make another reference, e.g. /Images_Geo/?)

  3. Must I use a GeoHash like this example? (On most mapping servers I can pass in a bounding box of the user's current view and the server will return all the items in that location. Not sure how to go about this using Firebase.)


Solution

If Collections_Flights and Collections_Images only contain IDs, and if you always retrieve them at the same time you fetch Collections/$CID, then you may not need to denormalize those indexes.

The main purpose of denormalizing here would be to make it faster to fetch a list of Collections, or to grab Collections/$CID without having to wait for the image and flight lists to load. So again, if those are always fetched together, denormalizing probably adds complexity for no gain.

The split on Flights/ and Images/, referenced from an index, is an excellent choice.
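To make the index idea concrete, here is a minimal sketch using a plain in-memory object in place of the database. The sample IDs (c1, f1, ...), the meta strings, and the helper name flightsForCollection are all hypothetical, and no Firebase client calls are shown; the point is only that Collections_Flights/$CID holds IDs while the flight records live under Flights/.

```javascript
// Hypothetical sample data mirroring the paths in the question.
const db = {
  Collections: { c1: { meta: "sample collection" } },
  // ID-only index: which flights belong to which collection.
  Collections_Flights: { c1: { f1: true, f2: true } },
  Flights: {
    f1: { meta: "north strip" },
    f2: { meta: "south strip" },
    f3: { meta: "belongs to another collection" },
  },
};

// Resolve a collection's flights by walking the index, then
// looking each flight up by ID under Flights/.
function flightsForCollection(tree, cid) {
  const index = tree.Collections_Flights[cid] || {};
  return Object.keys(index)
    .filter((fid) => fid in tree.Flights)
    .map((fid) => ({ id: fid, ...tree.Flights[fid] }));
}

const flights = flightsForCollection(db, "c1");
console.log(flights.map((f) => f.id));
```

With a layout like this, listing Collections never pulls in flight or image records, and a single flight can be fetched by ID without touching its collection.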

One million images at 10 KB each would be about 10 GB of data, which is an important consideration. Assuming the images aren't changing at real-time speeds, you'd probably want to find very cheap storage (S3, a CDN, etc.) for the images and store only their URLs in Firebase, rather than storing them as data and paying real-time rates for bandwidth and storage of these bulky static assets.
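The arithmetic behind that estimate, plus one possible shape for a URL-only image record, can be sketched as follows. The function names, the url field, and the example.com address are assumptions made for illustration, not part of the original data structure.

```javascript
// Rough storage estimate in decimal units (1 GB = 1e6 KB).
function estimateStorageGB(imageCount, kbPerImage) {
  return (imageCount * kbPerImage) / 1e6;
}

// Hypothetical record kept under Images/$IID: metadata plus a URL
// pointing at the image file held in cheaper external storage.
function makeImageRecord(url, geopoint, geopoly) {
  return { meta: "", url, geopoint, geopoly };
}

const totalGB = estimateStorageGB(1e6, 10);
const record = makeImageRecord(
  "https://example.com/aerials/i1.jpg", // placeholder URL
  [57.64911, 10.40744],
  [[10.40, 57.64], [10.41, 57.64], [10.41, 57.65], [10.40, 57.65]]
);
console.log(totalGB, Object.keys(record));
```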

Whether you should use GeoHash is a huge topic and quite debatable; there are plenty of alternatives as you've already pointed out. I don't think anyone can tell you the answer to that huge implementation decision in a Q&A format (maybe as a chapter in a textbook or a discussion thread on an appropriate mailing list).
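Purely to illustrate what the GeoHash option would involve, and not as a recommendation, here is a small sketch of the standard geohash encoding together with a prefix lookup over a separate index, along the lines of the /Images_Geo/ path floated in the question. The key format (hash, underscore, image ID), the helper names, and the sample coordinates are assumptions. The lookup is a plain in-memory filter; it stands in for a range query over ordered keys and does not use any Firebase API.

```javascript
const BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

// Standard geohash: interleave longitude and latitude bisection bits,
// emitting one base-32 character for every 5 bits.
function encodeGeohash(lat, lon, precision = 8) {
  const latRange = [-90, 90];
  const lonRange = [-180, 180];
  let hash = "";
  let bits = 0;
  let ch = 0;
  let useLon = true; // the first bit always comes from longitude
  while (hash.length < precision) {
    const range = useLon ? lonRange : latRange;
    const value = useLon ? lon : lat;
    const mid = (range[0] + range[1]) / 2;
    if (value >= mid) {
      ch = (ch << 1) | 1;
      range[0] = mid;
    } else {
      ch = ch << 1;
      range[1] = mid;
    }
    useLon = !useLon;
    if (++bits === 5) {
      hash += BASE32[ch];
      bits = 0;
      ch = 0;
    }
  }
  return hash;
}

// Hypothetical Images_Geo index: "<geohash>_<IID>" -> IID.
function buildGeoIndex(images) {
  const index = {};
  for (const [iid, img] of Object.entries(images)) {
    const [lat, lon] = img.geopoint;
    index[encodeGeohash(lat, lon) + "_" + iid] = iid;
  }
  return index;
}

// All image IDs whose key falls inside one geohash cell.
function imagesInCell(index, cellPrefix) {
  return Object.keys(index)
    .filter((key) => key.startsWith(cellPrefix))
    .map((key) => index[key]);
}

const images = {
  i1: { geopoint: [57.64911, 10.40744] },
  i2: { geopoint: [57.6495, 10.408] },
  i3: { geopoint: [-33.8688, 151.2093] },
};
const geoIndex = buildGeoIndex(images);
const nearby = imagesInCell(geoIndex, encodeGeohash(57.649, 10.407, 3));
console.log(nearby);
```

Note that one prefix covers only one cell. A map viewer's bounding box will usually span several cells, so a real implementation would need to query the neighbouring cells as well and then filter the results against the exact box; that part is not shown here.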

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow