Question

I'm using Spira as a model/persistence layer for a Ruby application. I'm having trouble getting a suitable serialization (e.g., as RDF/XML) for my individual models. For example, when I dump a model that contains "associations", I get XML that looks like:

<ns0:video rdf:about="info:whatever/videos/g91832990">
  <ns1:contributor rdf:resource="info:whatever/interviewees/g88129610"/>
  <ns1:title>Test Video</ns1:title>
  <ns0:files rdf:resource="info:whatever/files/g91776800"/>
</ns0:video>

However, I'd like this XML representation to resolve the rdf:resource references. That is, I'd like the XML to look more like this (which is what I get when I do a dump of the whole repository/triplestore):

<ns0:video rdf:about="info:repository/videos/g91832990">
  <ns1:contributor>
    <ns2:person rdf:about="info:repository/interviewees/g88129610">
      <ns2:name>Creator</ns2:name>
    </ns2:person>
  </ns1:contributor>
  <ns1:title>Test Video</ns1:title> <!-- ... -->
</ns0:video>

The contributor element is expanded to contain the relevant metadata. I can get the first-level references with a SPARQL query like:

sparql.construct([:o, :p2, :o2]).where([node, :p, :o], [:o, :p2, :o2])

where node is my "about" node. However, I want to do this to arbitrary depth. I understand that this question might touch on bigger issues, like doing recursive queries in SPARQL/RDF. However, I was hoping there would be some switch or setting in Spira or RDF.rb that would just change the output format.

Sorry about my terminology: I'm sure "resolving references" isn't the correct term to use.

EDIT

In Spira, models mix in RDF::Enumerable; each has an RDF representation comprising the RDF statements from the triplestore whose subject is the model's URI. "Dumping a model" looks like:

v = Video.find 'RDF::Enumerable'
v.dump(:rdfxml)

The RDF/XML generated contains only the model's RDF statements. It's also possible to dump the whole triplestore (e.g., my second example above) with the following command:

Spira.repository.dump(:rdfxml)

Solution

There are two parts to this answer. The first is that the particular structure of the XML used in the RDF/XML serialization doesn't matter (insofar as the RDF data is concerned; you're still free to have a preference about what it looks like). The second part is about getting what you want (for aesthetic reasons) out of RDF.rb.

The particular XML structure of RDF/XML doesn't matter

RDF is a graph-based data representation. The basic piece of information in RDF is the triple, also called a statement, which has the form

subject predicate object

A whole bunch of those make up an RDF graph. Those RDF graphs can be serialized in a number of formats. Some are easy to read and write by hand, and others are more complex. Some serialization formats might have a single way of writing a given RDF graph, or define a canonical way, but most will give you a number of different ways to write the same RDF graph.

For instance, the following data (in Turtle):

@prefix : <http://example.org/> .

<info:repository/videos/g91832990>
  a :video ;
  :contributor <info:repository/interviewees/g88129610> ;
  :title "Test Video" .

<info:repository/interviewees/g88129610>
  a :person ;
  :name "Creator" .

can be serialized in RDF/XML in different ways, because the format allows for lots of shorthand notation. For instance, with Jena, if I serialize as (plain) RDF/XML, I get:

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://example.org/" > 
  <rdf:Description rdf:about="info:repository/videos/g91832990">
    <rdf:type rdf:resource="http://example.org/video"/>
    <contributor rdf:resource="info:repository/interviewees/g88129610"/>
    <title>Test Video</title>
  </rdf:Description>
  <rdf:Description rdf:about="info:repository/interviewees/g88129610">
    <rdf:type rdf:resource="http://example.org/person"/>
    <name>Creator</name>
  </rdf:Description>
</rdf:RDF>

but if I serialize as RDF/XML-ABBREV, I get:

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://example.org/">
  <video rdf:about="info:repository/videos/g91832990">
    <contributor>
      <person rdf:about="info:repository/interviewees/g88129610">
        <name>Creator</name>
      </person>
    </contributor>
    <title>Test Video</title>
  </video>
</rdf:RDF>

Those are the same RDF graph. The latter might be a bit more expensive to write, since it uses more abbreviations, but they are the same RDF graph.
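One way to convince yourself of this is to compare the two documents as sets of triples rather than as XML. The sketch below is plain Ruby, with hand-written triples standing in for what a parser would produce from each file. The variable names and the `ex:` and `rdf:` shorthand are illustrative assumptions, not RDF.rb API.

```ruby
require 'set'

# Triples a parser would read from the plain RDF/XML document,
# written by hand as [subject, predicate, object] arrays.
plain = [
  ['info:repository/videos/g91832990', 'rdf:type', 'ex:video'],
  ['info:repository/videos/g91832990', 'ex:contributor', 'info:repository/interviewees/g88129610'],
  ['info:repository/videos/g91832990', 'ex:title', 'Test Video'],
  ['info:repository/interviewees/g88129610', 'rdf:type', 'ex:person'],
  ['info:repository/interviewees/g88129610', 'ex:name', 'Creator']
]

# The abbreviated document nests the person inside the contributor
# element, so a parser would meet the same triples in a different order.
abbreviated = [plain[0], plain[1], plain[3], plain[4], plain[2]]

# A graph is a set of triples: order and nesting are irrelevant.
same_graph = plain.to_set == abbreviated.to_set
puts same_graph
```

Comparing sets like this is only valid here because the data contains no blank nodes; with blank nodes you would need a proper graph isomorphism check.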

From the question: "However, I'd like this XML representation to resolve the rdf:resource references. That is, I'd like the XML to look more like this (which is what I get when I do a dump of the whole repository/triplestore):"

<ns0:video rdf:about="info:repository/videos/g91832990">
  <ns1:contributor>
    <ns2:person rdf:about="info:repository/interviewees/g88129610">
      <ns2:name>Creator</ns2:name>
    </ns2:person>
  </ns1:contributor>
  <ns1:title>Test Video</ns1:title> <!-- ... -->
</ns0:video>

It's OK to have aesthetic preferences, as long as you recognize that dumping the model in one format versus another doesn't change what graph you're getting. The structure of the serialization won't affect the results of your SPARQL queries, since a SPARQL query operates on the RDF graph, not on the serialization. In fact, trying to access RDF by using XML tools and the RDF/XML serialization is really a bad idea, as I've discussed in my answer to "How to access OWL documents using XPath in Java?".

Getting abbreviated RDF/XML with RDF.rb

According to its website, RDF.rb supports a number of serialization formats:

  • RDF::NTriples
  • RDF::JSON (plugin)
  • RDF::N3 (plugin)
  • RDF::Raptor::RDFXML (plugin)
  • RDF::Raptor::Turtle (plugin)
  • RDF::RDFa (plugin)
  • RDF::RDFXML (plugin)
  • RDF::Trix (plugin)

Note that there are two entries for RDF/XML there: one through Raptor (RDF::Raptor::RDFXML) and one from the rdf-rdfxml plugin (RDF::RDFXML). At least one of those should provide support for the more concise forms of RDF/XML. I haven't used RDF.rb lately, but I seem to recall that the Raptor libraries provide a number of options, so that might be a good bet here. The built-in one might have something too, of course.

If you start digging around in the source for rdf-rdfxml, you'll find that the writer has an initialization option that might help you out here:

# @option options [Integer]  :max_depth (3)
#   Maximum depth for recursively defining resources
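Keep in mind that a writer can only nest resources whose statements are in the data it is given, and the question notes that dumping a single model emits only the statements whose subject is the model's URI. A minimal sketch of gathering the related statements to a chosen depth follows. It is plain Ruby over arrays of [subject, predicate, object]; `reachable_triples` and the sample `store` are hypothetical names for illustration, not part of Spira or RDF.rb.

```ruby
require 'set'

# Collect every triple reachable from `start` by following objects that
# are themselves subjects, down to `max_depth` hops. `triples` is an
# array of [subject, predicate, object] arrays (a stand-in for a store).
def reachable_triples(triples, start, max_depth: 3)
  by_subject = triples.group_by(&:first)
  seen = Set.new([start])
  frontier = [start]
  result = []

  max_depth.times do
    next_frontier = []
    frontier.each do |subject|
      by_subject.fetch(subject, []).each do |triple|
        result << triple
        object = triple[2]
        # Only follow objects that appear as subjects, and only once,
        # so cycles in the data cannot cause an endless loop.
        next unless by_subject.key?(object) && seen.add?(object)
        next_frontier << object
      end
    end
    frontier = next_frontier
    break if frontier.empty?
  end
  result
end

store = [
  ['video1', 'contributor', 'person1'],
  ['video1', 'title', 'Test Video'],
  ['person1', 'name', 'Creator'],
  ['person1', 'appearsIn', 'video1'], # cycle back to the video
  ['video2', 'title', 'Unrelated']
]

shallow = reachable_triples(store, 'video1', max_depth: 1)
deep = reachable_triples(store, 'video1', max_depth: 5)
```

In a real application the collected statements could then be put into a graph of their own and written with a suitable :max_depth. I believe options given to dump(:rdfxml, ...) are passed through to the writer, but check that against the version of RDF.rb and rdf-rdfxml you have installed.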
Licensed under: CC-BY-SA with attribution