Question

Scenario

I'm using the open source DBpedia browser to get a handle on SPARQL queries. I'm attempting to parse both the month and day values out of an xsd:date for dbo:birthDate. It is not a dateTime, so I cannot use the month() or dayOfWeek() functions. I've instead gone with substr, which works perfectly fine for the month. However, trying to substring with values 9,2 gives me an invalid index error. So I modified the query to show the string and the strlen values, and now I'm totally confused. The string it shows is a valid date in "yyyy-mm-dd" format, which is 10 characters. However, it shows a strlen of 7.

Here is the query:

PREFIX dbo: <http://dbpedia.org/ontology/>

SELECT ?name, ?bmonth, ?bstring, ?len1, ?len2
WHERE {
  ?person foaf:name ?name .
  ?person dbo:birthDate ?birth .
  BIND (str(?birth) AS ?bstring)
  BIND (strlen(?bstring) AS ?len1)
  BIND (substr(?bstring, 6,2) AS ?bmonth)
  BIND (strlen(?bstring) AS ?len2)
  FILTER ( ?bmonth = '12' )
} GROUP BY ?person ORDER BY ?len1 LIMIT 50

And here are a few rows of the results:

name                 bmonth  bstring       len1  len2
"Ann Ronell"@en      "12"    "1906-12-28"  7     7
"Anna Freud"@en      "12"    "1895-12-03"  7     7
"Dorothy Lamour"@en  "12"    "1914-12-10"  7     7
"Doug Mohns"@en      "12"    "1933-12-13"  7     7

Query: http://tinyurl.com/k6kuapd

Conclusion/Question

Essentially, I'm really just trying to get the dayOfMonth value from the string. But if anyone could also explain to me how it's coming up with 7 I'd truly appreciate it!

Solution

Extracting the day and month by casting to xsd:dateTime

SPARQL 1.1 has a day function that returns the day of a dateTime.

17.4.5.4 day

xsd:integer  DAY (xsd:dateTime arg)

Returns the day part of arg as an integer.

This function corresponds to fn:day-from-dateTime.

day("2011-01-10T14:45:13.815-05:00"^^xsd:dateTime)
=> 10

You can use xsd:dateTime as a function to convert xsd:dates into dateTimes to which you can apply this function. As an example:

prefix xsd: <http://www.w3.org/2001/XMLSchema#>

select ?month ?day where {
  # Provide some values for ?date.  Note that 
  # we can use xsd:date *and* xsd:dateTime here,
  # as both can be cast to xsd:dateTime.
  values ?date { 
    "2011-01-02"^^xsd:date
    "2012-10-11"^^xsd:date
    "2013-12-30T14:45:13.815-05:00"^^xsd:dateTime 
  }

  # Cast to a xsd:dateTime and extract the day and month.
  bind( day(xsd:dateTime(?date)) as ?day )
  bind( month(xsd:dateTime(?date)) as ?month )
}
---------------
| month | day |
===============
| 01    | 02  |
| 10    | 11  |
| 12    | 30  |
---------------

Why DBpedia does strange things

The easiest way to find out why you get seven as the length sometimes is to ask for values where the length is seven and see what you get. E.g., look at the results of

SELECT ?person ?birth WHERE {
  ?person dbpedia-owl:birthDate ?birth .
  bind( strlen(str(?birth)) as ?blen )
  filter( ?blen < 10 )
}
LIMIT 25

SPARQL results

Person                            Birth
-----------------------------------------------------------------------------------------
Alyson_No%C3%ABl                  "--12-03"^^<http://www.w3.org/2001/XMLSchema#gMonthDay>
Corneille_(singer)                "--03-24"^^<http://www.w3.org/2001/XMLSchema#gMonthDay>
Count_Karl_Sigmund_von_Hohenwart  "--02-12"^^<http://www.w3.org/2001/XMLSchema#gMonthDay>
David_Lewis_(politician)          "--06-23"^^<http://www.w3.org/2001/XMLSchema#gMonthDay>
…                                 …

There are values that aren't xsd:dateTimes or xsd:dates, but rather xsd:gMonthDays. These don't have a year specified, so the strings aren't as long as those for xsd:dates. This is another reason to cast to xsd:dateTime (or check the datatype, etc.), although note that you can't cast from an xsd:gMonthDay to an xsd:dateTime.
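Checking the datatype could look something like the following. This is only a minimal sketch, not a definitive solution: the two sample values are taken from the results shown earlier, and the substring offsets assume the lexical forms yyyy-mm-dd and --mm-dd (a negative or longer year would shift the offsets for xsd:date).

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?date ?month ?day WHERE {
  # Sample values in the two shapes seen in the DBpedia results.
  VALUES ?date {
    "1906-12-28"^^xsd:date
    "--12-03"^^xsd:gMonthDay
  }

  # Work on the lexical form; the offsets depend on the datatype.
  BIND( str(?date) AS ?lex )
  BIND( datatype(?date) = xsd:gMonthDay AS ?isMonthDay )

  # "--mm-dd":    month starts at 3, day at 6
  # "yyyy-mm-dd": month starts at 6, day at 9
  BIND( xsd:integer(substr(?lex, IF(?isMonthDay, 3, 6), 2)) AS ?month )
  BIND( xsd:integer(substr(?lex, IF(?isMonthDay, 6, 9), 2)) AS ?day )
}
```

Both rows should have 12 as the month, with days 28 and 3. Values of any other datatype would need their own handling, since this sketch treats everything that isn't an xsd:gMonthDay as an xsd:date.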

As was pointed out in the comments, your query isn't actually legal. You shouldn't be able to select non-grouped variables, so perhaps what's happening is that when you've done

SELECT ?name, ?bmonth, ?bstring, ?len1, ?len2 WHERE {
  ?person foaf:name ?name .
  ?person dbo:birthDate ?birth .
  BIND (str(?birth) AS ?bstring)
  BIND (strlen(?bstring) AS ?len1)
  BIND (substr(?bstring, 6,2) AS ?bmonth)
  BIND (strlen(?bstring) AS ?len2)
  FILTER ( ?bmonth = '12' )
}
GROUP BY ?person
ORDER BY ?len1
LIMIT 50

you're getting some DBpedia-specific behavior. E.g., perhaps you're grouping by the person, the select is then doing an implicit sample, and some of these persons have multiple values for their birth date. (I'm not certain about this; it's just a possibility that occurs to me. DBpedia is doing maintenance at the moment, so I can't check what data is there, and even if I could, this behavior is not defined by the standard.) I suggest that you check your queries with sparql.org's query validator and get legal SPARQL first. Only then can we tell you whether it's returning what it ought to.
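For reference, one way the query might be written as standard SPARQL is sketched below. Treat it as an illustration rather than a tested fix. It drops the GROUP BY and the commas in the SELECT clause (the SPARQL grammar has no commas there), declares its prefixes explicitly, and keeps only xsd:date values so that the substring offsets line up. The foaf IRI is the usual FOAF namespace, which the original query left to the endpoint's predefined prefixes, and the offsets assume a plain yyyy-mm-dd lexical form.

```sparql
PREFIX dbo:  <http://dbpedia.org/ontology/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>

SELECT ?name ?birth ?day WHERE {
  ?person foaf:name ?name ;
          dbo:birthDate ?birth .

  # Skip xsd:gMonthDay (and any other non-date) values.
  FILTER( datatype(?birth) = xsd:date )

  BIND( str(?birth) AS ?bstring )

  # Characters 6-7 are the month, 9-10 the day of the month.
  FILTER( substr(?bstring, 6, 2) = "12" )
  BIND( xsd:integer(substr(?bstring, 9, 2)) AS ?day )
}
LIMIT 50
```

A person with several names or several birth dates will still appear on more than one row; whether that needs grouping or sampling depends on what you want to do with the results.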

Licensed under: CC-BY-SA with attribution