For some good starting points, you might consider:
- Google Structured Data Testing Tool
- RDFa Play (More of a testbed, but nice visuals)
- GetSchema.org
- Apache's Any23.org
- Sindice
Sindice is perhaps the most general of these; most of the others focus on RDFa (my own bias, sorry). Your choice might depend a bit on what you consider semantic data (e.g. do you want HTML5 semantics like <title> to count?). For just RDFa, I have found Apache's Any23 best for my needs, with a nice API, flexible formats and accurate extraction.
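To make the "what counts as semantic data" question concrete, here is a rough Python sketch that scans a page and reports RDFa attributes separately from plain HTML5 semantic elements. It uses only the standard library; the names `SemanticScan`, `scan`, `RDFA_ATTRS` and `SEMANTIC_TAGS` are made up for illustration, the attribute and tag sets are deliberately partial, and this is not how Any23 or any of the tools above work internally. It only spots markup and does not build RDF triples.

```python
from html.parser import HTMLParser

# A subset of the attributes RDFa 1.1 uses to embed structured data.
# (href/src/rel are left out because plain HTML uses them too.)
RDFA_ATTRS = {"vocab", "typeof", "property", "resource", "prefix", "about"}

# A few HTML5 elements that carry meaning on their own.
# Illustrative only; extend or shrink to match your own definition.
SEMANTIC_TAGS = {"title", "article", "nav", "time", "header", "footer"}


class SemanticScan(HTMLParser):
    """Collects RDFa attributes and HTML5 semantic tags in separate lists."""

    def __init__(self):
        super().__init__()
        self.rdfa = []   # (tag, attribute, value) tuples, in document order
        self.html5 = []  # tag names, in document order

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag in SEMANTIC_TAGS:
            self.html5.append(tag)
        for name, value in attrs:
            if name in RDFA_ATTRS:
                self.rdfa.append((tag, name, value))


def scan(html):
    """Return (rdfa_hits, html5_hits) for an HTML string."""
    parser = SemanticScan()
    parser.feed(html)
    parser.close()
    return parser.rdfa, parser.html5


if __name__ == "__main__":
    sample = (
        '<html><head><title>Hi</title></head>'
        '<body vocab="http://schema.org/">'
        '<div typeof="Person"><span property="name">Ann</span></div>'
        '</body></html>'
    )
    rdfa_hits, html5_hits = scan(sample)
    print("RDFa:", rdfa_hits)
    print("HTML5:", html5_hits)
```

If the second list is the one you care about, an RDFa-focused tool will miss most of what you want; if only the first matters, a dedicated extractor such as Any23 will do far more than this sketch.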
Good question, though. I'd be curious to see which tools others recommend most. W3C has a longer list that may be slightly dated.