Question

I'm working with RDF data which contains georeferenced stuff, e.g. a POI with the specified location:

@prefix ogc:   <http://www.opengis.net/ont/geosparql#> .

:poi  ogc:hasGeometry  :geo .
:geo  ogc:asWKT        "POINT(48.5 11.7)"^^ogc:wktLiteral .

So there's some kind of POI that is located at (48.5, 11.7). I can use GeoSPARQL queries to work with these locations, but now I want to extract latitude and longitude separately, so I can feed them into a different application which does not support WKT.

SELECT ?lat ?lon
WHERE {
    # how do I get lat and lon from "POINT(48.5 11.7)"^^ogc:wktLiteral?
}

I didn't find anything useful in the OGC's GeoSPARQL specification, so I was wondering what's the best way to extract this kind of data by hand, inside a SPARQL query.


Solution

It's always a bit tricky to do this sort of stuff with regular expressions, and especially when it doesn't look like we have a precise grammar to work with, but I think the following approach works:

prefix ogc: <urn:ex:>

select ?lat ?long where {
  values ?point { "POINT(48.5 11.7)"^^ogc:wktLiteral }
  bind( replace( str(?point), "^[^0-9\\.]*([0-9\\.]+) .*$", "$1" ) as ?long )
  bind( replace( str(?point), "^.* ([0-9\\.]+)[^0-9\\.]*$", "$1" ) as ?lat )
}
-------------------
| lat    | long   |
===================
| "11.7" | "48.5" |
-------------------

The key here is in the regular expressions:

"^[^0-9\\.]*([0-9\\.]+) .*$" === <non-number>(number) <anything>
"^.* ([0-9\\.]+)[^0-9\\.]*$" === <anything> (number)<non-number>

Of course, [0-9\\.]+ is only an approximation of a number, since it would also match strings with multiple dots (e.g., 1.2.3), but if the data is good, you shouldn't have a problem. If you need these values as numeric types rather than strings, you can wrap each replace in a cast:

prefix ogc: <urn:ex:>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>

select ?lat ?long where {
  values ?point { "POINT(48.5 11.7)"^^ogc:wktLiteral }
  bind( xsd:decimal( replace( str(?point), "^[^0-9\\.]*([0-9\\.]+) .*$", "$1" )) as ?long )
  bind( xsd:decimal( replace( str(?point), "^.* ([0-9\\.]+)[^0-9\\.]*$", "$1" )) as ?lat )
}
---------------
| lat  | long |
===============
| 11.7 | 48.5 |  # note: no quotation marks; these are numbers
---------------

Note that there are other types of WKT points as well, and this code won't handle them correctly. E.g., some examples from Wikipedia's Well-known text article:

POINT ZM (1 1 5 60)
POINT M (1 1 80)
POINT EMPTY
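
One way to guard against those forms is to use a single anchored pattern that only matches a plain two-coordinate point, and to filter on it before extracting. This matters because replace returns its input unchanged when the pattern does not match, so without a filter "POINT EMPTY" would come back as the whole string rather than as no result. The following is a sketch under those assumptions, not a full WKT parser: it keeps the loose [0-9.]+ notion of a number used above, additionally allows a leading minus sign, and passes the "i" flag in case the data spells the keyword in lower case.

prefix ogc: <urn:ex:>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>

select ?point ?lat ?long where {
  values ?point {
    "POINT(48.5 11.7)"^^ogc:wktLiteral
    "POINT ZM (1 1 5 60)"^^ogc:wktLiteral
    "POINT EMPTY"^^ogc:wktLiteral
  }
  # keep only literals of the form POINT(<number> <number>)
  filter regex( str(?point), "^\\s*POINT\\s*\\(\\s*(-?[0-9.]+)\\s+(-?[0-9.]+)\\s*\\)\\s*$", "i" )
  # same pattern again: group 1 is the first coordinate, group 2 the second
  bind( xsd:decimal( replace( str(?point), "^\\s*POINT\\s*\\(\\s*(-?[0-9.]+)\\s+(-?[0-9.]+)\\s*\\)\\s*$", "$1", "i" )) as ?long )
  bind( xsd:decimal( replace( str(?point), "^\\s*POINT\\s*\\(\\s*(-?[0-9.]+)\\s+(-?[0-9.]+)\\s*\\)\\s*$", "$2", "i" )) as ?lat )
}

With these three inputs, only "POINT(48.5 11.7)" should pass the filter; the other two are dropped instead of producing misleading values.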

OTHER TIPS

Joshua's response doesn't take into account negative values for latitude or longitude. A corrected version is:

prefix ogc: <urn:ex:>
select ?lat ?long where {
   values ?point { "POINT(48.5 -11.7)"^^ogc:wktLiteral }
   bind( replace( str(?point), "^[^0-9\\.-]*([-]?[0-9\\.]+) .*$", "$1" ) as ?long )
   bind( replace( str(?point), "^.* ([-]?[0-9\\.]+)[^0-9\\.]*$", "$1" ) as ?lat )
}

and the result:

--------------------
| lat     | long   |
====================
| "-11.7" | "48.5" |
--------------------

I've tested the regular expressions with Rubular and the GeoSPARQL queries with a Parliament SPARQL endpoint, and they seem OK.
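
To run the extraction against data shaped like the example in the question rather than against an inline value, the values clause can be swapped for a graph pattern. The sketch below assumes the GeoSPARQL namespace from the question and that each geometry carries a plain two-coordinate POINT. The optional leading <...> group is an addition: a wktLiteral may start with a coordinate reference system IRI, and the first-coordinate patterns above may trip over the dots and digits in such an IRI. As in the answers above, the first coordinate is taken to be the longitude.

prefix ogc: <http://www.opengis.net/ont/geosparql#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>

select ?poi ?lat ?long where {
  ?poi ogc:hasGeometry ?geo .
  ?geo ogc:asWKT ?wkt .
  # group 1: optional CRS IRI, group 2: first coordinate, group 3: second coordinate
  filter regex( str(?wkt), "^\\s*(<[^>]*>\\s*)?POINT\\s*\\(\\s*(-?[0-9.]+)\\s+(-?[0-9.]+)\\s*\\)\\s*$", "i" )
  bind( xsd:decimal( replace( str(?wkt), "^\\s*(<[^>]*>\\s*)?POINT\\s*\\(\\s*(-?[0-9.]+)\\s+(-?[0-9.]+)\\s*\\)\\s*$", "$2", "i" )) as ?long )
  bind( xsd:decimal( replace( str(?wkt), "^\\s*(<[^>]*>\\s*)?POINT\\s*\\(\\s*(-?[0-9.]+)\\s+(-?[0-9.]+)\\s*\\)\\s*$", "$3", "i" )) as ?lat )
}

Geometries that are not plain points (or points with extra dimensions) simply produce no row here, which is usually easier to deal with downstream than a string that failed to parse.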

Licensed under: CC-BY-SA with attribution