Question

I have an RDF document containing multiple resources, which I generate from my data model. Because each resource is serialized and appended (concatenated) separately, the prefix declarations are repeated (when in N3). It looks something like this:

@prefix dc: <someURL>.

<someURL/Tony_Benn>
     dc:title "Tony Benn";
     dc:publisher "Wikipedia".

@prefix dc: <someURL>.

<someURL/Someone_Else>
     dc:title "Someone Else";
     dc:publisher "Wikipedia".

I am using the Jena API to create the RDF, but I've written a wrapper around the API to keep it separate from the rest of my code. Is there a better way to approach this problem, or is there a way to remove the duplicate prefixes?

Solution

If you're using an RDF-aware utility (e.g., Jena's rdfcat) to concatenate the RDF documents, then you have nothing to worry about. Prefixes just make reading and writing a little easier, but RDF-aware tools don't really care. If being able to concatenate data with text-based tools (i.e., tools that aren't RDF-aware) is important, then you should probably use the N-Triples format. The format is very simple, just

subject predicate object .

with one triple per line. Since there is no provision for prefixes, text concatenation simply works. N-Triples also has the (even nicer) feature that if you need to split up a document, e.g., for distributed processing, you can just split the file, as long as you split at linebreaks. That's impossible with N3, RDF/XML, and other more complicated formats.
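
As a rough illustration of the two text-level operations described above, here is a minimal sketch in plain Java with no RDF library. The class name NTriplesText, its method names, and the example IRIs are hypothetical and not part of Jena or any other tool. It assumes the input is already valid N-Triples, so every triple sits on its own line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: treats N-Triples purely as lines of text.
final class NTriplesText {

    // Concatenate documents. Each one is trimmed and given exactly one
    // trailing newline, so the last triple of one document never ends up
    // on the same line as the first triple of the next.
    static String concat(List<String> documents) {
        StringBuilder out = new StringBuilder();
        for (String doc : documents) {
            String body = doc.trim();
            if (!body.isEmpty()) {
                out.append(body).append('\n');
            }
        }
        return out.toString();
    }

    // Split a document into chunks of at most maxLines triples each.
    // Cutting only at line breaks keeps every triple intact.
    static List<String> split(String document, int maxLines) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int count = 0;
        for (String line : document.split("\\R")) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            current.append(line).append('\n');
            if (++count == maxLines) {
                chunks.add(current.toString());
                current.setLength(0);
                count = 0;
            }
        }
        if (count > 0) {
            chunks.add(current.toString());
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Example IRIs are placeholders chosen for this sketch.
        String a = "<http://example.org/Tony_Benn> <http://purl.org/dc/elements/1.1/title> \"Tony Benn\" .";
        String b = "<http://example.org/Someone_Else> <http://purl.org/dc/elements/1.1/title> \"Someone Else\" .";
        String joined = concat(Arrays.asList(a, b));
        System.out.print(joined);
        System.out.println(split(joined, 1).size() + " chunks");
    }
}
```

Each chunk returned by split is itself a complete N-Triples document, which is what makes the format convenient for distributed processing.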

OTHER TIPS

Thanks @Joshua. I thought about it. Rather than removing the duplicate entries, I think it's better not to have them in the first place. Instead of concatenating two RDF documents, I found it better to take the union of the respective models. Here is what I did:

  • Read the documents into models.
  • Make a union of the models. This can be done with the union(Model model) method, or better:
  • Read the first RDF file into a model using the read(.., .., ..) method (because I had it as a string, I read it as an InputStream), then add the statements from the second one. As @Joshua suggested in the comment below, this is much more efficient in memory usage.
  • Get the unified model out.

I found this much easier and more predictable, and it handled the prefixes much better. It works with Notation3 as well.
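
The steps above might look roughly like the following sketch. It assumes a Jena 3 or later release (package org.apache.jena; older releases used com.hp.hpl.jena) on the classpath. The class name MergeModels, the sample documents, and the example IRIs are made up for illustration. Both documents are read into the same model, which adds the second document's statements to the first; if the second document were already a Model, model.add(otherModel) would add its statements instead.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;

import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

// Hypothetical wrapper class; only the Model and ModelFactory calls are Jena API.
final class MergeModels {

    // Sample inputs (assumed data, modelled on the question's N3 snippet).
    static final String DOC_A =
        "@prefix dc: <http://purl.org/dc/elements/1.1/> .\n"
      + "<http://example.org/Tony_Benn> dc:title \"Tony Benn\" ;\n"
      + "    dc:publisher \"Wikipedia\" .\n";

    static final String DOC_B =
        "@prefix dc: <http://purl.org/dc/elements/1.1/> .\n"
      + "<http://example.org/Someone_Else> dc:title \"Someone Else\" ;\n"
      + "    dc:publisher \"Wikipedia\" .\n";

    // Wrap a string as an InputStream, since the documents are held as strings.
    private static InputStream stream(String text) {
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }

    // Read both documents into one model. Each read() adds statements to
    // the model, so no separate union model has to be built.
    static Model merge(String first, String second, String lang) {
        Model model = ModelFactory.createDefaultModel();
        model.read(stream(first), null, lang);   // null base: IRIs here are absolute
        model.read(stream(second), null, lang);
        return model;
    }

    // Serialize the merged model in the requested syntax.
    static String serialize(Model model, String lang) {
        StringWriter out = new StringWriter();
        model.write(out, lang);
        return out.toString();
    }

    public static void main(String[] args) {
        Model merged = merge(DOC_A, DOC_B, "TURTLE");
        System.out.println(serialize(merged, "TURTLE"));
    }
}
```

Because the model keeps a single prefix mapping, the serializer writes each prefix once, however many input documents declared it.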

    Licensed under: CC-BY-SA with attribution