Question

I have a list of somewhere between 5 and 100 properties and want to query for any entities having these properties (I'm not interested in the values), ranked by number of matches. How can this be achieved with a SPARQL query? For instance, say I have the following properties:

dbpedia-owl:country
dbpedia-owl:elevation
dbpedia-owl:leader
dbpprop:area
dbpprop:flag
dbpprop:name
…

The query should return all resources having values for all of these properties, as well as resources that match just some of the properties. The results will include lots of cities and countries, but they should also include, for example, an organization that has a leader and name, but not a flag, area or elevation.


Solution

This is a fairly expensive query to run, but it's pretty straightforward to write. You need something like this:

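# These prefixes are predefined on the public DBpedia endpoint;
# they are declared here so the query is self-contained.
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX dbpprop: <http://dbpedia.org/property/>

select ?subject (count(?property) as ?numProperties) where {
  # the properties of interest; extend this list as needed
  values ?property {
    dbpedia-owl:country dbpedia-owl:elevation
    dbpedia-owl:leader dbpprop:area dbpprop:flag
    dbpprop:name
  }
  # match any triple that uses one of those properties
  ?subject ?property ?object
}
group by ?subject
order by desc(?numProperties)
limit 10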

[Image: query results]

This says to find triples with any of the properties that you've enumerated and, for each ?subject, count the matching triples (one per property value) and call the total ?numProperties. The results are ordered by ?numProperties, with the greatest count first.

Those numbers look rather high, but that is because the list pages in the results have lots of values defined for certain properties, and every value is counted. For instance, List of Advanced Dungeons & Dragons 2nd edition monsters really does have a whole bunch of dbpprop:name values:

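PREFIX dbpprop: <http://dbpedia.org/property/>

# count the distinct dbpprop:name values on this one resource
select (count(distinct ?name) as ?numNames) where {
  <http://dbpedia.org/resource/List_of_Advanced_Dungeons_&_Dragons_2nd_edition_monsters>
    dbpprop:name ?name
}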

1971
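
If the goal is to rank resources by how many of the listed properties they have, rather than by how many matching triples they have, one possible variation is to count distinct properties. This is only a sketch and has not been run against the endpoint; it assumes the usual DBpedia namespace IRIs for the two prefixes and otherwise reuses the query above unchanged:

```sparql
# Assumption: these are the namespace IRIs behind the prefixes used above.
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX dbpprop: <http://dbpedia.org/property/>

# count(distinct ?property) counts each matched property once per subject,
# so a resource with many dbpprop:name values still contributes only 1 for it.
select ?subject (count(distinct ?property) as ?numProperties) where {
  values ?property {
    dbpedia-owl:country dbpedia-owl:elevation
    dbpedia-owl:leader dbpprop:area dbpprop:flag
    dbpprop:name
  }
  ?subject ?property ?object
}
group by ?subject
order by desc(?numProperties)
limit 10
```

With this form, ?numProperties can be at most the number of properties listed in the values block, so a resource that has all of them sorts to the top and partial matches follow.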


Licensed under: CC-BY-SA with attribution