Question

I am working with Solr 4.5, and my use case requires sorting results by their distance to a given route. Each document contains one geo coordinate, stored in the RPT field geo (the location of a point of interest).

Here is an illustration of what I am aiming for: http://i.imgur.com/lGgMEal.jpg . I would like to calculate the shortest distance from each document to the given route and use it as a boosting component.

My current attempt uses the {!score=recipDistance} function in edismax mode and sends the description of the route as a LineString in WKT. Here is the query that is sent now:

fl=*,score,distdeg:query({!score=distance filter=false v=$spatialfilter})
defType=edismax
q.alt=*:*
boost=query({!score=distance filter=false v=$spatialfilter})
spatialfilter=geo:"Intersects(LINESTRING (59.79619 11.38690, 60.25974 11.63869))"

And in URI form:

http://sokemotortest:8080/solr/collection1/select?fl=*%2Cscore%2Cdistdeg%3Aquery%28{!score%3Ddistance+filter%3Dfalse+v%3D%24spatialfilter}%29&wt=json&debugQuery=true&defType=edismax&q.alt=*%3A*&boost=query%28{!score=distance%20filter=false%20v=$spatialfilter}%29&spatialfilter=geo:%22Intersects%28LINESTRING%20%2859.79619%2011.38690,%2060.25974%2011.63869%29%29%22
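
For reference, the URI above can be assembled from the parameter list rather than encoded by hand. The following is a minimal Python sketch using only the standard library; the helper name build_route_query is hypothetical, and the parameter values are the ones shown in the query above.

```python
from urllib.parse import urlencode

# Local-params expression shared by the pseudo-field and the boost,
# copied from the query in the question.
SPATIAL_SCORE = "query({!score=distance filter=false v=$spatialfilter})"


def build_route_query(base_url, wkt_linestring, field="geo"):
    """Return a select URL for the route query (hypothetical helper)."""
    params = [
        ("fl", "*,score,distdeg:" + SPATIAL_SCORE),
        ("defType", "edismax"),
        ("q.alt", "*:*"),
        ("boost", SPATIAL_SCORE),
        ("spatialfilter", '%s:"Intersects(%s)"' % (field, wkt_linestring)),
        ("wt", "json"),
        ("debugQuery", "true"),
    ]
    # urlencode percent-encodes braces, quotes, and parentheses,
    # and turns spaces into '+'.
    return base_url + "?" + urlencode(params)


url = build_route_query(
    "http://sokemotortest:8080/solr/collection1/select",
    "LINESTRING (59.79619 11.38690, 60.25974 11.63869)",
)
print(url)
```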

My issues with this approach are:

  • The distance seems to be calculated from the center of the shape (route). This means that we get the distance not to the line, but to a single spot. With this query it is Pt(x=60.027965,y=11.512795).
  • The results of the distance calculation seem wrong. There are 4 documents in the index and they come back in the following order:

    • (1) 59.7333, 7.61283
    • (2) 59.6236, 10.7263
    • (3) 59.6238, 10.7385
    • (4) 64.12379, 22.14029

    The order should rather have been:

    • (3) 59.6238, 10.7385
    • (2) 59.6236, 10.7263
    • (1) 59.7333, 7.61283
    • (4) 64.12379, 22.14029

You can take a look at the complete result with boost calc debug here: pastebin.com/5tvCb0Cf

Another workable solution might be to filter documents by their distance to the route (like this: http://i.imgur.com/EJu8Kcg.jpg ). This could be done with a buffered line, which seems to be supported in both JTS and Spatial4j. The only question is how to send a buffered line as input to the Intersects function (something like this: geo:"Intersects(LINESTRING (59.79619 11.38690, 60.25974 11.63869) d=1)").

One solution would be to create a custom search component that accepts the route as a LineString and forwards the query as a Polygon or MultiPolygon, but I would rather avoid developing custom components unless it is necessary.

My questions are:

  • Is it possible in Solr 4.5 to get the distance to the LineString, not to the center of the shape?
  • Can we send a buffered line as input to the Intersects function (something like this: geo:"Intersects(LINESTRING (59.79619 11.38690, 60.25974 11.63869) d=1)")?

PS: Description of the fields in the index:

<field name="geo" type="location_rpt" indexed="true" stored="true"/>

Field type definition:

<fieldType name="location_rpt"
    class="solr.SpatialRecursivePrefixTreeFieldType"
    spatialContextFactory="com.spatial4j.core.context.jts.JtsSpatialContextFactory" 
    geo="true"
    distErrPct="0.025"
    maxDistErr="0.000009"
    units="degrees"
    />

Solution

  1. It is not possible (without customizing Solr) to get the distance to a query LineString from an indexed point per document. You're going to need to write a ValueSourceParser that references the lineString (that you can parse with JTS WKT parser) and that also references your indexed point field. For the purposes of retrieving the point from the document efficiently on a per-document basis, use LatLonType, not RPT. JTS can calculate the distance between a point and a LineString, but keep in mind JTS operates in Euclidean space. To get better accuracy, you'll need to "project" the data (both indexed point and lineString) to a projection that is centered on the lineString. Proj4j can help with that.

  2. RE bufferedLineStrings, you may be interested to know that the master branch of Spatial4j has a "BufferedLineString" shape -- it's native to Spatial4j. However, it hasn't yet been integrated into shape parsing, so it's not completely ready yet. To be clear, it's well tested and I use it privately with a parser that isn't open-sourced. It's also Euclidean-space limited, like JTS. The best way to approach this is to add your own Solr query parser (easier than it may sound). This query parser would read a buffer distance, a LineString, and use JTS from there to buffer it. Projecting to the shape's center point isn't feasible because it has to align with the indexed data, so instead you might compensate by over-buffering by an appropriate amount, thereby increasing the shape size but at least ensuring the minimum distance is captured. I have plans to solve this better but I've been busy.
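
The geometry described in point 1 (project both the point and the LineString around the route, then measure a Euclidean point-to-line distance) can be sketched outside Solr. The Python below is only an illustration under stated assumptions: it uses a simple equirectangular projection centered on the mean of the route vertices in place of Proj4j, a hand-written point-to-segment distance in place of JTS, and hypothetical function names throughout. It is not a ValueSourceParser. Coordinates are passed as (lat, lon) pairs, matching how the documents are listed in the question; WKT itself lists x (longitude) before y (latitude).

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (spherical approximation)


def to_local_xy(lat, lon, lat0, lon0):
    """Equirectangular projection centered on (lat0, lon0); result in km."""
    x = math.radians(lon - lon0) * math.cos(math.radians(lat0)) * EARTH_RADIUS_KM
    y = math.radians(lat - lat0) * EARTH_RADIUS_KM
    return x, y


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Parameter of the perpendicular foot, clamped onto the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_route_km(point, route):
    """Approximate shortest distance (km) from a (lat, lon) point to a route."""
    lat0 = sum(lat for lat, _ in route) / len(route)
    lon0 = sum(lon for _, lon in route) / len(route)
    p = to_local_xy(point[0], point[1], lat0, lon0)
    verts = [to_local_xy(lat, lon, lat0, lon0) for lat, lon in route]
    if len(verts) == 1:
        return math.hypot(p[0] - verts[0][0], p[1] - verts[0][1])
    return min(point_segment_distance(p, a, b) for a, b in zip(verts, verts[1:]))


# Route and documents from the question, as (lat, lon).
route = [(59.79619, 11.38690), (60.25974, 11.63869)]
docs = {
    1: (59.7333, 7.61283),
    2: (59.6236, 10.7263),
    3: (59.6238, 10.7385),
    4: (64.12379, 22.14029),
}
order = sorted(docs, key=lambda doc_id: distance_to_route_km(docs[doc_id], route))
print(order)  # -> [3, 2, 1, 4]
```

The projection loses accuracy for points far from the route (document 4 here), so the distances are best treated as a ranking signal rather than exact values. The same distance could in principle also serve as a filter threshold in place of a buffered shape.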

Licensed under: CC-BY-SA with attribution