Question

I'm looking for a web service, browser extension, or anything else that directly extracts any and all semantic data contained in a given web page, as long as that data follows any of the many modern standards for embedding semantic information inside web pages. Somehow I couldn't find anything that works. I could find many 'semantic crawlers', but no tool that just shows what semantic data you have at hand on a given web page.

I'd be very glad to get pointers to any such tool, if one exists out there. I can't fathom how people debug or develop their semantic harvesters without it.

I listed some of the relevant standards as the tags for this question (see question's tags which usually show here below) but this list is not to be taken as exhaustive.

Thanks!

Solution

For some good starting points, you might consider:

Sindice is perhaps the most general of these; most of the others focus on RDFa (my own bias, sorry). Your choice might depend a bit on what you consider semantic data (e.g. do you want HTML5 semantics like <title> to count?). For just RDFa, I have found Apache's Any23 best for my needs, with a nice API, flexible formats and accurate extraction.

Good question though, I'd be curious to see what tools others most recommend. W3C has a longer list that may be slightly dated.

OTHER TIPS

Yandex has a tool for validating embedded semantic markup as well, and some documentation is available too. It works with microdata, schema.org, Open Graph, RDFa and microformats, not just with microformats as you might conclude from the title :)
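To get a rough idea of what such a validator looks for, here is a minimal sketch using only Python's standard library. It collects Open Graph `<meta>` tags and flat microdata `itemprop` values from a page. The class name `SemanticMarkupScanner` and the sample HTML are made up for illustration, and nested microdata items are deliberately not handled, so treat it as a quick debugging aid rather than a conforming extractor.

```python
from html.parser import HTMLParser


class SemanticMarkupScanner(HTMLParser):
    """Hypothetical helper: collects Open Graph tags and itemprop values."""

    def __init__(self):
        super().__init__()
        self.open_graph = {}   # e.g. {"og:title": "..."}
        self.microdata = []    # list of (itemprop, value) pairs
        self._pending = None   # itemprop still waiting for its text content

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        prop = a.get("property") or ""
        if tag == "meta" and prop.startswith("og:"):
            self.open_graph[prop] = a.get("content") or ""
        name = a.get("itemprop")
        if name:
            # Prefer values carried in attributes; otherwise wait for text.
            for key in ("content", "href", "src"):
                if a.get(key):
                    self.microdata.append((name, a[key]))
                    break
            else:
                self._pending = name

    def handle_data(self, data):
        # Simplification: the first non-blank text after the tag is the value.
        if self._pending and data.strip():
            self.microdata.append((self._pending, data.strip()))
            self._pending = None


SAMPLE = """
<html><head>
<meta property="og:title" content="Example page">
<meta property="og:type" content="article">
</head><body>
<div itemscope itemtype="https://schema.org/Person">
  <span itemprop="name">Jane Doe</span>
  <a itemprop="url" href="https://example.org/jane">home</a>
</div>
</body></html>
"""

scanner = SemanticMarkupScanner()
scanner.feed(SAMPLE)
print(scanner.open_graph)
print(scanner.microdata)
```

For real pages you would feed the downloaded HTML instead of `SAMPLE`; a dedicated extractor is still the better choice once the markup gets nested.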

If you're looking for open-source tools, there is the mighty RDFLib library on GitHub. It does a lot, parsing in particular.

The library contains parsers and serializers for RDF/XML, N3, NTriples, N-Quads, Turtle, TriX, RDFa and Microdata.

For RDF data, there is Tim Berners-Lee’s Tabulator, a data browser available as a web app (open-source JavaScript) and as a Firefox add-on. However, it no longer seems to be maintained (?).

For RDFa, there is the Firefox add-on RDFa Developer.

For RDF files linked in the page’s head, there is the Firefox add-on Semantic Radar.
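If you want to check for such head links without a browser add-on, a rough standard-library sketch follows. It lists `<link>` elements whose `type` is an RDF media type and resolves their `href` against the page URL. The class name `RdfLinkFinder`, the set of media types and the sample page are assumptions made for illustration; this is not how Semantic Radar itself is implemented.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed set of media types to treat as RDF; extend as needed.
RDF_TYPES = {"application/rdf+xml", "text/turtle", "text/n3"}


class RdfLinkFinder(HTMLParser):
    """Hypothetical helper: finds <link> elements pointing at RDF documents."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []  # list of (rel, absolute URL) pairs

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        media_type = (a.get("type") or "").strip().lower()
        href = a.get("href")
        if media_type in RDF_TYPES and href:
            # Resolve relative links against the page's own URL.
            self.links.append((a.get("rel") or "", urljoin(self.base_url, href)))


SAMPLE_PAGE = """
<html><head>
<link rel="stylesheet" type="text/css" href="/style.css">
<link rel="meta" type="application/rdf+xml" title="FOAF" href="/people/jane/foaf.rdf">
</head><body></body></html>
"""

finder = RdfLinkFinder("https://example.org/blog/post.html")
finder.feed(SAMPLE_PAGE)
print(finder.links)
```

The resulting URLs can then be fetched and handed to an RDF parser of your choice.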

Another Firefox add-on is OpenLink Data Explorer.

Licensed under: CC-BY-SA with attribution