SPARQL query and distinct count
Question
I have the following query:
SELECT ?tag WHERE {
?r ns9:taggedWithTag ?tagresource.
?tagresource ns9:name ?tag
}
LIMIT 5000
and the results are:
abc
abc
abc
abc
abc
abc
abc
abd
ads
anb
I want to get something like:
tag | count
-----------------
abc 7
abd 1
ads 1
anb 1
I have tried it with count(*) and count(?tag), but then I get the error message "Variable or "*" expected."
Can someone tell me how to do this correctly?
Solution
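If you're using Java and Jena's ARQ, you can use ARQ's support for aggregates. Group the results by ?tag and count the rows in each group. Your query would look something like:
SELECT ?tag (COUNT(?tag) AS ?count)
WHERE {
?r ns9:taggedWithTag ?tagresource .
?tagresource ns9:name ?tag
}
GROUP BY ?tag
LIMIT 5000
Use a plain COUNT here rather than COUNT(DISTINCT ?tag): every row in a group shares the same tag, so a distinct count would return 1 for each group. Note also that LIMIT now caps the number of grouped rows, not the number of raw matches.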
The original SPARQL specification from 2008 didn't include aggregates, but the current version, 1.1, from 2013 does.
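To make the grouping semantics concrete, here is a minimal Python sketch (standard library only, not SPARQL or Jena) of what a GROUP BY ?tag combined with COUNT(?tag) is expected to compute over the ten result rows from the question. The function name count_per_tag and the hard-coded rows are illustrative assumptions, not part of any SPARQL API.

```python
from collections import Counter

# The ten ?tag bindings returned by the original query in the question.
rows = ["abc"] * 7 + ["abd", "ads", "anb"]


def count_per_tag(bindings):
    """Mimic SELECT ?tag (COUNT(?tag) AS ?count) ... GROUP BY ?tag."""
    counts = Counter(bindings)  # one bucket per distinct tag
    # Sort by tag so the output is deterministic; SPARQL itself
    # guarantees no particular order without ORDER BY.
    return sorted(counts.items())


for tag, count in count_per_tag(rows):
    print(f"{tag} {count}")
```

Run as-is, this prints the four rows the question asks for: abc 7, abd 1, ads 1, anb 1.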
OTHER TIPS
Using COUNT(), MIN(), MAX(), SUM(), or AVG() with GROUP BY produces summary values for groups of matches. Note that aggregates and GROUP BY were introduced in SPARQL 1.1, so an endpoint that only implements SPARQL 1.0 will reject these queries.
For example, this query sums ?value for each ?category:
SELECT ?category (SUM(?value) as ?valueSum)
WHERE
{
?s ?category ?value .
}
GROUP BY ?category
This one counts the number of uses of each predicate ?p:
SELECT ?p (COUNT(?p) as ?pCount)
WHERE
{
?s ?p ?o .
}
GROUP BY ?p
These examples are inspired by material from Bob DuCharme (2011), "Learning SPARQL". O’Reilly Media, Sebastopol, CA, USA; see http://www.learningsparql.com/
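As a rough illustration of what the SUM and COUNT queries above compute, the following Python sketch (standard library only) applies the same grouping to a small in-memory list of triples. The triples, the ex: names, and the function names are all made up for this example and are not taken from the book or from any real dataset.

```python
from collections import Counter, defaultdict

# Hypothetical (subject, predicate, object) triples with numeric objects.
triples = [
    ("ex:item1", "ex:price", 10),
    ("ex:item2", "ex:price", 5),
    ("ex:item1", "ex:weight", 3),
]


def sum_by_predicate(data):
    """Mimic SELECT ?category (SUM(?value) AS ?valueSum) ... GROUP BY ?category."""
    totals = defaultdict(int)
    for _subject, category, value in data:
        totals[category] += value  # accumulate per predicate
    return dict(totals)


def count_by_predicate(data):
    """Mimic SELECT ?p (COUNT(?p) AS ?pCount) ... GROUP BY ?p."""
    return dict(Counter(predicate for _subject, predicate, _obj in data))


print(sum_by_predicate(triples))
print(count_by_predicate(triples))
```

With this sample data, ex:price sums to 15 over 2 uses and ex:weight sums to 3 over 1 use. In both cases the grouping key is the predicate position of the triple pattern, which is why the variable in that position appears in GROUP BY.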