Here's some sample data with five classes. The first and second classes contain children with names "name1" and "name2". The third class contains "name1" and "name3", and the fourth contains "name3" and "name4". The fifth class contains all the children of the fourth, as well as "name5". So, the first and second classes are equivalent, and the fourth class is a subclass of the fifth.
@prefix : <http://example.org/> .
:class1 :child [ :objectName "name1" ] ,
               [ :objectName "name2" ] .
:class2 :child [ :objectName "name2" ] ,
               [ :objectName "name1" ] .
:class3 :child [ :objectName "name1" ] ,
               [ :objectName "name3" ] .
:class4 :child [ :objectName "name3" ] ,
               [ :objectName "name4" ] .
:class5 :child [ :objectName "name3" ] ,
               [ :objectName "name4" ] ,
               [ :objectName "name5" ] .
Your description sounds like you're actually looking for subclasses, since you mention classes all of whose children are also in another class. As such, this SPARQL query should take care of finding subclass relationships:
prefix : <http://example.org/>
select distinct ?c1 ?c2 where {
  ?c1 :child [] .
  ?c2 :child [] .
  # keep the pair only if no child name of ?c1 is missing from ?c2
  FILTER NOT EXISTS { ?c1 :child [ :objectName ?name ] .
                      FILTER NOT EXISTS { ?c2 :child [ :objectName ?name ] } }
  FILTER( !sameTerm( ?c1, ?c2 ) )
}
The nested NOT EXISTS patterns ensure that we select only pairs of classes for which there does NOT EXIST an element of ?c1 which does NOT EXIST in ?c2. That is, we reject any ?c1,?c2 pair where some element of ?c1 is not in ?c2, i.e., where ?c1 is not a subset of ?c2, so we keep just the pairs where ?c1 is a subset of ?c2. The sameTerm filter removes the trivial ?c,?c pairs, since everything is a subset of itself. Using Jena's command line ARQ tools, we get these results:
$ arq --data data.n3 --query query.sparql
---------------------
| c1      | c2      |
=====================
| :class4 | :class5 |
| :class2 | :class1 |
| :class1 | :class2 |
---------------------
As expected, :class1 and :class2 are each subsets of the other, and :class4 is a subset of :class5.
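The double negation can be easier to follow outside SPARQL. Here is a minimal Python sketch of the same logic, assuming the sample data is modeled as plain sets of child names; the classes dict and the subset_pairs function are illustrative names, not part of the query or of Jena.

```python
# Sample data modeled as plain sets of child names (an assumption for illustration).
classes = {
    "class1": {"name1", "name2"},
    "class2": {"name2", "name1"},
    "class3": {"name1", "name3"},
    "class4": {"name3", "name4"},
    "class5": {"name3", "name4", "name5"},
}


def subset_pairs(classes):
    """Return (c1, c2) pairs where no child name of c1 is missing from c2."""
    pairs = []
    for c1, names1 in classes.items():
        for c2, names2 in classes.items():
            if c1 == c2:  # mirrors the sameTerm filter: skip trivial pairs
                continue
            # Outer "not exists": there is no name in c1 ...
            # Inner "not exists": ... that fails to appear in c2.
            if not any(name not in names2 for name in names1):
                pairs.append((c1, c2))
    return pairs


print(sorted(subset_pairs(classes)))
```

The any(... not in ...) test plays the role of the inner pattern, and the surrounding not plays the role of the outer one, so the condition is the usual subset test written as a double negation.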
If you want equivalent classes, it is sufficient to add a second NOT EXISTS pattern to ensure that ?c2 is also a subset of ?c1:
prefix : <http://example.org/>
select distinct ?c1 ?c2 where {
  ?c1 :child [] .
  ?c2 :child [] .
  # ?c1 is a subset of ?c2
  FILTER NOT EXISTS { ?c1 :child [ :objectName ?name ] .
                      FILTER NOT EXISTS { ?c2 :child [ :objectName ?name ] } }
  # ?c2 is a subset of ?c1
  FILTER NOT EXISTS { ?c2 :child [ :objectName ?name ] .
                      FILTER NOT EXISTS { ?c1 :child [ :objectName ?name ] } }
  FILTER( !sameTerm( ?c1, ?c2 ) )
}
With this query, we get back just :class1 and :class2:
$ arq --data data.n3 --query query.sparql
---------------------
| c1      | c2      |
=====================
| :class2 | :class1 |
| :class1 | :class2 |
---------------------
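As a sanity check on the idea that two subset tests in opposite directions amount to equivalence, here is a small Python sketch, again assuming the sample data is modeled as plain sets of names; is_subset and equivalent_pairs are hypothetical helper names used only for illustration.

```python
# Sample data modeled as plain sets of child names (an assumption for illustration).
classes = {
    "class1": {"name1", "name2"},
    "class2": {"name2", "name1"},
    "class3": {"name1", "name3"},
    "class4": {"name3", "name4"},
    "class5": {"name3", "name4", "name5"},
}


def is_subset(a, b):
    """True when there is no element of a that is missing from b."""
    return not any(name not in b for name in a)


def equivalent_pairs(classes):
    """Return (c1, c2) pairs of distinct classes that are subsets of each other."""
    return [
        (c1, c2)
        for c1 in classes
        for c2 in classes
        if c1 != c2
        and is_subset(classes[c1], classes[c2])
        and is_subset(classes[c2], classes[c1])
    ]


print(sorted(equivalent_pairs(classes)))
```

On this data the mutual-subset test agrees with direct set equality, which is why :class4 and :class5 drop out of the results.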