Multiple values printed for rdf:type
Question
I have created a class Asset, which is a subclass of http://schema.org/CreativeWork. Asset has two subclasses, Article and Publication.
Asset is an abstract class, so I will only ever have instances of Article or Publication.
When I print the metadata for an article or publication, I also want to print its type, which in this case would be Article or Publication.
I run the following query:
SELECT ?id ?title ?type
WHERE
{
  ?asset rdf:type Asset ;
         somePrefix:id ?id ;
         somePrefix:title ?title ;
         rdf:type ?type .
}
Now, instead of printing the type as Article or Publication for every asset, I get multiple values for rdf:type. For example:
id  title                  type
1   this is a article      CreativeWork
1   this is a article      Asset
1   this is a article      Article
2   this is a publication  CreativeWork
2   this is a publication  Asset
2   this is a publication  Article
I want to print only Article or Publication in the type column.
How can I achieve this?
Solution
As I understand you, your class hierarchy is:
CreativeWork
  Asset
    Article
    Publication
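To try the queries below against something concrete, a small dataset mirroring this hierarchy could be loaded with SPARQL Update. This is only a sketch: the ex: and somePrefix: namespace IRIs and the instance names ex:a1 and ex:a2 are placeholders, not taken from the question, and the inferred types are written out explicitly so that no reasoner is needed.

```sparql
PREFIX rdf:        <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs:       <http://www.w3.org/2000/01/rdf-schema#>
PREFIX schema:     <http://schema.org/>
PREFIX ex:         <http://example.org/>        # placeholder namespace for the classes
PREFIX somePrefix: <http://example.org/props#>  # placeholder namespace for the properties

INSERT DATA {
  # Class hierarchy described in the question.
  ex:Asset       rdfs:subClassOf schema:CreativeWork .
  ex:Article     rdfs:subClassOf ex:Asset .
  ex:Publication rdfs:subClassOf ex:Asset .

  # Two hypothetical instances carrying every type along the hierarchy.
  ex:a1 rdf:type ex:Article, ex:Asset, schema:CreativeWork ;
        somePrefix:id 1 ;
        somePrefix:title "this is a article" .
  ex:a2 rdf:type ex:Publication, ex:Asset, schema:CreativeWork ;
        somePrefix:id 2 ;
        somePrefix:title "this is a publication" .
}
```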
You have a few options.
Getting nothing but Article and Publication
The simplest is to say that you only want to consider values of ?type that are Article and Publication, in which case you can specify this with values:
SELECT ?id ?title ?type
WHERE
{
  values ?type { Article Publication }
  ?asset rdf:type Asset ;
         somePrefix:id ?id ;
         somePrefix:title ?title ;
         rdf:type ?type .
}
This is the most specific thing that you can do, and you are guaranteed that ?type will only be Article or Publication.
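The queries in this answer abbreviate the class names. A complete version of the values query needs prefix declarations and full names for the classes. A minimal sketch, assuming placeholder namespaces ex: and somePrefix: that you would replace with your own:

```sparql
PREFIX rdf:        <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex:         <http://example.org/>        # placeholder namespace for the classes
PREFIX somePrefix: <http://example.org/props#>  # placeholder namespace for the properties

SELECT ?id ?title ?type
WHERE
{
  # Only these two classes are allowed as bindings for ?type.
  VALUES ?type { ex:Article ex:Publication }
  ?asset rdf:type ex:Asset ;
         somePrefix:id ?id ;
         somePrefix:title ?title ;
         rdf:type ?type .
}
```

The other queries can be completed the same way.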
Getting everything but CreativeWork and Asset
Of course, you might define other subclasses later, and you might not want to have to add more types to the values block every time you do that. You might consider simply filtering out CreativeWork and Asset, then:
SELECT ?id ?title ?type
WHERE
{
  ?asset rdf:type Asset ;
         somePrefix:id ?id ;
         somePrefix:title ?title ;
         rdf:type ?type .
  filter ( ?type != Asset && ?type != CreativeWork )
}
You can also do that filter with:
filter ( ?type NOT IN ( Asset, CreativeWork ) )
Getting only maximally specific classes
This doesn't make any guarantee about what classes you could have, though. If you later add subclasses of Article or Publication, e.g., JournalArticle ⊑ Article, then you'd get results that include both Article and JournalArticle, and you might not want that. What you might want instead is the "most specific" class for an individual. That is, you want a class C of the individual such that the individual has no other type D ⊑ C. (The word "other" is important there, since C ⊑ C.) The general idea is captured in How to get Least common subsumer in ontology using SPARQL Query?, along with some other questions, but it's easy to reproduce the important part here:
SELECT ?id ?title ?type
WHERE
{
  ?asset rdf:type Asset ;
         somePrefix:id ?id ;
         somePrefix:title ?title ;
         rdf:type ?type .
  filter not exists {                    # Don't take ?type as a result if
    ?asset rdf:type ?subtype .           # ?asset has some other ?subtype
    ?subtype rdfs:subClassOf* ?type .    # that is a subclass of ?type
    filter ( ?subtype != ?type )         # (other than ?type itself).
  }
}
This will get you the "deepest" class in the hierarchy that an individual has. This still could return multiple results, if your individual is a member of classes such that neither is a subclass of the other. Of course, in that case, you'd probably still be interested in all the results.
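If an individual can end up with several maximally specific classes and you would rather keep one row per asset, one option is to aggregate the surviving types into a single column. This is a sketch rather than part of the approach above: it reuses the same filter, adds GROUP_CONCAT, and assumes placeholder namespaces ex: and somePrefix:.

```sparql
PREFIX rdf:        <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs:       <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex:         <http://example.org/>        # placeholder namespace for the classes
PREFIX somePrefix: <http://example.org/props#>  # placeholder namespace for the properties

SELECT ?id ?title (GROUP_CONCAT(STR(?type); SEPARATOR=", ") AS ?types)
WHERE
{
  ?asset rdf:type ex:Asset ;
         somePrefix:id ?id ;
         somePrefix:title ?title ;
         rdf:type ?type .
  # Keep ?type only if ?asset has no other type below it.
  FILTER NOT EXISTS {
    ?asset rdf:type ?subtype .
    ?subtype rdfs:subClassOf* ?type .
    FILTER ( ?subtype != ?type )
  }
}
GROUP BY ?id ?title
```

Each asset then appears once, with its most specific classes joined into ?types.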
OTHER TIPS
You are getting this output because your Assets are all Assets and CreativeWorks, and they can also be Articles or Publications. If you only want to print the subclasses of Asset, then you can use the following query to restrict the values of ?type (same as yours with an extra line):
SELECT ?id ?title ?type
WHERE
{
  ?asset rdf:type Asset ;
         somePrefix:id ?id ;
         somePrefix:title ?title ;
         rdf:type ?type .
  ?type rdfs:subClassOf Asset .
}
where rdfs is the namespace prefix of http://www.w3.org/2000/01/rdf-schema#. It means that ?type should be a subclass of Asset only.
The first restriction (?asset rdf:type Asset) is not actually needed, but I leave it for clarity, since you have it in your initial query. You can safely skip it though.
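Without inference, the pattern ?type rdfs:subClassOf Asset matches only classes that are directly asserted as subclasses of Asset. If deeper subclasses should be included too, a property path is one possible variation. The sketch below assumes placeholder namespaces ex: and somePrefix:, drops the optional first restriction, and adds a filter in case the store contains a reflexive subClassOf triple for Asset.

```sparql
PREFIX rdf:        <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs:       <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex:         <http://example.org/>        # placeholder namespace for the classes
PREFIX somePrefix: <http://example.org/props#>  # placeholder namespace for the properties

SELECT ?id ?title ?type
WHERE
{
  ?asset somePrefix:id ?id ;
         somePrefix:title ?title ;
         rdf:type ?type .
  ?type rdfs:subClassOf+ ex:Asset .   # one or more subClassOf steps up to Asset
  FILTER ( ?type != ex:Asset )        # guard against a reflexive subClassOf triple
}
```

Note that an individual typed with both a class and one of its subclasses would still produce one row per type here.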