Question

I have implemented RDFa on a shopping website.

Now, how do I create a triple store from that structured data?

There are thousands of products on the website, so manually visiting every page and extracting the RDF is not a good solution. Are there any automatic tools for this?

Solution

The answer depends on how you "implemented RDFa". Most of your content is probably generated from an underlying datastore rather than written as static HTML, so most of it probably does not need to be scraped at all.

There are tools, such as D2R Server, that let you expose your underlying datastore as a read-only SPARQL endpoint. The only tricky part is static content: if you have some and want to expose it as automatically generated RDF as well, that will require some finessing.
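Once such an endpoint is running, it can be queried over HTTP like any other SPARQL service. Below is a minimal sketch of building such a request with the Python standard library. The endpoint URL is an assumption (a locally running server); adjust it to wherever your endpoint is actually published. The helper name `build_sparql_request` is made up for this example.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: a SPARQL endpoint published at this address.
ENDPOINT = "http://localhost:2020/sparql"


def build_sparql_request(query: str, endpoint: str = ENDPOINT) -> Request:
    """Build (but do not send) a SPARQL Protocol GET request."""
    # The SPARQL Protocol passes the query text in the "query" parameter.
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard JSON serialization of SELECT results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


# Fetch any ten triples exposed by the endpoint.
request = build_sparql_request("SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10")
```

Passing `request` to `urllib.request.urlopen` would send it; the JSON response lists the result rows under `results` and then `bindings`.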

OTHER TIPS

The data that appears as RDFa on your website probably comes from a database, where it is stored in relational form, since you probably didn't add the RDF triples to the HTML by hand. So the easiest way to get the data into a triple store is not to extract it from the HTML, but to transform the original data in the database. After all, RDF triples can be seen as a ternary relation, which can itself be stored in any relational database.
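To illustrate the idea, here is a rough sketch using SQLite from the Python standard library. The `product` table, the base URI, and the `ex` vocabulary URIs are all hypothetical stand-ins; your real schema and the vocabulary used in your RDFa will differ.

```python
import sqlite3

# Hypothetical product table standing in for the shop's real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price TEXT)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [(1, "Kettle", "19.99"), (2, "Toaster", "24.50")],
)

# A triple store in its simplest form: one three-column relation.
conn.execute("CREATE TABLE triple (subject TEXT, predicate TEXT, object TEXT)")

# Assumed URI scheme and vocabulary, for illustration only.
BASE = "http://example.com/product/"
PREDICATES = {
    "name": "http://example.com/vocab#name",
    "price": "http://example.com/vocab#price",
}


def row_to_triples(row_id, name, price):
    """Turn one product row into (subject, predicate, object) tuples."""
    subject = f"{BASE}{row_id}"
    return [
        (subject, PREDICATES["name"], name),
        (subject, PREDICATES["price"], price),
    ]


# Transform every row and load the resulting triples.
for row in conn.execute("SELECT id, name, price FROM product").fetchall():
    conn.executemany("INSERT INTO triple VALUES (?, ?, ?)", row_to_triples(*row))

# Query the triples: all product names, ordered by subject URI.
names = conn.execute(
    "SELECT object FROM triple WHERE predicate = ? ORDER BY subject",
    (PREDICATES["name"],),
).fetchall()
```

A dedicated triple store adds indexing and SPARQL on top, but the transformation step (one row in, a few triples out) looks much the same.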

GRDDL (Gleaning Resource Descriptions from Dialects of Languages) is a way of using XSLT to extract the RDF triples from the HTML, in case you do not have access to a relational database that stores the data. Hope this helps.
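If you do end up extracting from the HTML, the general shape of the job looks like the sketch below. Note that this is not GRDDL (it uses no XSLT) and it is not a conforming RDFa processor: it only understands `about`, `property` and `content`, does not resolve prefixes, and assumes well-balanced markup. The class name `SimpleRDFaParser` and the sample markup are invented for this example; for real use, an established RDFa parser would be the better choice.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag.
VOID_TAGS = {"meta", "link", "img", "br", "hr", "input"}


class SimpleRDFaParser(HTMLParser):
    """Collects (subject, property, value) tuples from a small RDFa subset."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self.subject_stack = [None]  # subject in scope for each open element
        self.capture = None          # (depth, subject, property, text parts)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # "about" sets a new subject; otherwise inherit from the parent.
        subject = attrs.get("about") or self.subject_stack[-1]
        if tag not in VOID_TAGS:
            self.subject_stack.append(subject)
        prop = attrs.get("property")
        if prop and subject:
            if attrs.get("content") is not None:
                # Value given explicitly in the "content" attribute.
                self.triples.append((subject, prop, attrs["content"]))
            elif tag not in VOID_TAGS and self.capture is None:
                # Value is the element's text; collect it until the end tag.
                self.capture = (len(self.subject_stack), subject, prop, [])

    def handle_data(self, data):
        if self.capture:
            self.capture[3].append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or len(self.subject_stack) == 1:
            return
        if self.capture and self.capture[0] == len(self.subject_stack):
            _, subject, prop, parts = self.capture
            self.triples.append((subject, prop, "".join(parts).strip()))
            self.capture = None
        self.subject_stack.pop()


# Invented sample page fragment; "ex:" is a placeholder prefix.
html = """
<div about="http://example.com/product/1">
  <span property="ex:name">Kettle</span>
  <meta property="ex:price" content="19.99">
</div>
"""

parser = SimpleRDFaParser()
parser.feed(html)
triples = parser.triples
```

To cover thousands of product pages, the same parser would be run in a loop over the page URLs (for example, taken from the site's sitemap) and the collected triples loaded into the store.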

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow