Question

I have created a class Asset, which is a subclass of http://schema.org/CreativeWork. Asset has two subclasses, say Article and Publication.

I will only ever have instances of Article or Publication; Asset is an abstract class.

When I print the metadata for an article or a publication, I also want to print the type of the Asset. In this case it would be Article or Publication.

I run the following query:

SELECT ?id ?title ?type
WHERE
{
   ?asset rdf:type Asset ;
          somePrefix:id ?id ;
          somePrefix:title ?title ;
          rdf:type ?type .
}

Instead of getting a single type (Article or Publication) for each Asset, I get one row for every value of rdf:type. For example:

 id  title                    Type
 1   this is a article        CreativeWork
 1   this is a article        Asset
 1   this is a article        Article
 2   this is a publication    CreativeWork
 2   this is a publication    Asset
 2   this is a publication    Publication

I want to print only Article or Publication in the type column.

How can I achieve this?

Solution

As I understand you, your class hierarchy is:

CreativeWork
  Asset
    Article
    Publication

You have a few options.

Getting nothing but Article and Publication

The simplest option is to say that the only values of ?type you want to consider are Article and Publication, which you can specify with values:

SELECT ?id ?title ?type
WHERE
{
   values ?type { Article Publication }
   ?asset rdf:type Asset ;
          somePrefix:id ?id ;
          somePrefix:title ?title ;
          rdf:type ?type .
}

This is the most specific thing that you can do, and you are guaranteed that ?type will only be Article or Publication.

Getting everything but CreativeWork and Asset

Of course, you might define other subclasses later, and you might not want to have to add more types to the values block every time you do that. You might consider simply filtering out CreativeWork and Asset, then:

SELECT ?id ?title ?type
WHERE
{
   ?asset rdf:type Asset ;
          somePrefix:id ?id ;
          somePrefix:title ?title ;
          rdf:type ?type .
   filter ( ?type != Asset && ?type != CreativeWork )
}

You can also do that filter with:

filter ( ?type NOT IN ( Asset, CreativeWork ) )
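
To see how the allow-list (values) and the deny-list (filter) diverge once the hierarchy grows, here is a minimal Python sketch over a hand-written set of type assertions. The sample data, the JournalArticle class, and the helper names are assumptions made for illustration; the code only mimics the logic of the two queries and is not a SPARQL engine.

```python
# Hypothetical data: (individual, class) pairs, as a store might report them.
TYPES = {
    ("asset1", "CreativeWork"), ("asset1", "Asset"), ("asset1", "Article"),
    ("asset2", "CreativeWork"), ("asset2", "Asset"), ("asset2", "Publication"),
    # Assumption: a subclass added later, JournalArticle being a kind of Article.
    ("asset3", "CreativeWork"), ("asset3", "Asset"),
    ("asset3", "Article"), ("asset3", "JournalArticle"),
}

def allow_list(types, allowed):
    """Like `values ?type { ... }`: keep only the listed classes."""
    return sorted((s, t) for s, t in types if t in allowed)

def deny_list(types, excluded):
    """Like `filter ( ?type NOT IN ( ... ) )`: drop only the listed classes."""
    return sorted((s, t) for s, t in types if t not in excluded)

by_values = allow_list(TYPES, {"Article", "Publication"})
by_filter = deny_list(TYPES, {"Asset", "CreativeWork"})

print(by_values)
print(by_filter)
```

With this data the allow-list never reports JournalArticle, while the deny-list reports both Article and JournalArticle for asset3.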

Getting only maximally specific classes

This doesn't make any guarantee about which classes you will get, though. If you later add subclasses of Article or Publication, e.g., JournalArticle ⊑ Article, then the results will include both Article and JournalArticle, and you might not want that. What you might want instead is the "most specific" class for an individual. That is, you want the class C of an individual such that the individual has no other type D ⊑ C. (The word "other" is important there, since C ⊑ C.) The general idea is captured in How to get Least common subsumer in ontology using SPARQL Query?, along with some other questions, but it's easy to reproduce the important part here:

SELECT ?id ?title ?type
WHERE
{
   ?asset rdf:type Asset ;
          somePrefix:id ?id ;
          somePrefix:title ?title ;
          rdf:type ?type .

   filter not exists {                    # Don't take ?type as a result if
      ?asset rdf:type ?subtype .          # ?asset has some other ?subtype 
      ?subtype rdfs:subClassOf* ?type .   # that is a subclass of ?type 
      filter ( ?subtype != ?type )        # (other than ?type itself).
   }
}

This will get you the "deepest" class in the hierarchy that an individual has. Note that it relies on the rdfs:subClassOf triples being present in the data you query. It can still return multiple results if your individual is a member of classes such that neither is a subclass of the other. Of course, in that case, you'd probably still be interested in all the results.
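
The same "no other type below it" test can be sketched outside SPARQL to check the reasoning. The following is a minimal Python illustration, assuming a single-parent hierarchy stored as a dictionary; the sample individuals, the JournalArticle class, and the function names are hypothetical.

```python
# Hypothetical schema: direct rdfs:subClassOf assertions (child -> parent).
# Assumption: every class has at most one parent.
SUBCLASS_OF = {
    "Asset": "CreativeWork",
    "Article": "Asset",
    "Publication": "Asset",
    "JournalArticle": "Article",  # assumed later addition
}

# Hypothetical type assertions for each individual.
TYPES = {
    "asset1": {"CreativeWork", "Asset", "Article"},
    "asset2": {"CreativeWork", "Asset", "Publication"},
    "asset3": {"CreativeWork", "Asset", "Article", "JournalArticle"},
    # Member of two classes, neither of which is a subclass of the other.
    "asset4": {"CreativeWork", "Asset", "Article", "Publication"},
}

def is_subclass_of(sub, sup):
    """Reflexive-transitive check, like the path `rdfs:subClassOf*`."""
    while sub is not None:
        if sub == sup:
            return True
        sub = SUBCLASS_OF.get(sub)
    return False

def most_specific(types):
    """Keep a type unless some *other* type of the individual is below it."""
    return sorted(
        t for t in types
        if not any(s != t and is_subclass_of(s, t) for s in types)
    )

result = {asset: most_specific(ts) for asset, ts in TYPES.items()}
print(result)
```

Here asset3 is reported only as JournalArticle, and asset4 keeps both Article and Publication because neither is a subclass of the other.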

OTHER TIPS

You are getting this output because every one of your instances is an Asset and a CreativeWork in addition to being an Article or a Publication. If you only want to print the subclasses of Asset, you can use the following query to restrict the values of ?type (the same as yours, with one extra line):

SELECT ?id ?title ?type
WHERE
{
     ?asset rdf:type Asset ;
            somePrefix:id ?id ;
            somePrefix:title ?title ;
            rdf:type ?type .
     ?type rdfs:subClassOf Asset . 
}

where rdfs is the namespace prefix for http://www.w3.org/2000/01/rdf-schema#.
The extra triple pattern keeps only those values of ?type that are stated to be subclasses of Asset, which excludes CreativeWork.

The first restriction (?asset rdf:type Asset) is not actually needed, but I leave it for clarity, since you have it in your initial query. You can safely skip it though.
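
One thing to keep in mind is that, on a store that applies no inference, the pattern ?type rdfs:subClassOf Asset only matches classes asserted as direct subclasses of Asset. The Python sketch below illustrates the difference from a transitive match (the SPARQL 1.1 property path rdfs:subClassOf+). It assumes a hypothetical JournalArticle subclass and data in which only the most specific type of each individual is asserted; all names are illustrative.

```python
# Hypothetical schema: asserted (child, parent) rdfs:subClassOf pairs.
SUBCLASS_OF = {
    ("Asset", "CreativeWork"),
    ("Article", "Asset"),
    ("Publication", "Asset"),
    ("JournalArticle", "Article"),  # assumed later addition
}

# Hypothetical asserted types, with no inference applied.
TYPES = {
    ("asset1", "Article"),
    ("asset2", "Publication"),
    ("asset3", "JournalArticle"),
}

def direct_subclasses(parent):
    """Classes matching `?type rdfs:subClassOf parent` on asserted triples."""
    return {child for child, p in SUBCLASS_OF if p == parent}

def all_subclasses(parent):
    """Classes matching the property path `?type rdfs:subClassOf+ parent`."""
    found, frontier = set(), {parent}
    while frontier:
        # Step one level down, skipping classes that were already collected.
        frontier = {c for c, p in SUBCLASS_OF if p in frontier} - found
        found |= frontier
    return found

def typed_within(classes):
    """(individual, type) pairs whose type is one of the given classes."""
    return sorted((s, t) for s, t in TYPES if t in classes)

direct = typed_within(direct_subclasses("Asset"))
transitive = typed_within(all_subclasses("Asset"))

print(direct)
print(transitive)
```

Under these assumptions the direct match misses asset3, whose only asserted type is two levels below Asset, while the transitive match includes it.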

Licensed under: CC-BY-SA with attribution