Question

I understand that DBpedia Spotlight performs named entity recognition on a given document. To do that, it uses downloaded DBpedia files stored on the local file system. See: https://github.com/dbpedia-spotlight/dbpedia-spotlight/wiki/Run-from-a-JAR

What I need is an equivalent tool or API, like Spotlight, for Freebase. I have searched but could not find any tool or API that operates on the Freebase triple store. Could someone help?


Solution

There is currently no equivalent project for named entity recognition against Freebase. However, Freebase entities are linked to DBpedia on sameAs.org, so you can use DBpedia Spotlight and then resolve the resulting IDs back to Freebase (those links are also available in the Freebase RDF dumps).
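As a rough sketch of the "resolve the IDs back to Freebase" step, the snippet below builds a lookup table from owl:sameAs triples in N-Triples form. It assumes the links have the shape `<DBpedia URI> owl:sameAs <Freebase URI> .`; the function names and the sample Freebase ID are hypothetical, and real dump files may need adjustments (for example, the subject and object may be reversed).

```python
import re

SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
DBPEDIA_PREFIX = "http://dbpedia.org/resource/"
FREEBASE_PREFIX = "http://rdf.freebase.com/ns/"

# Matches a simple N-Triples line whose subject, predicate and object are all IRIs.
TRIPLE = re.compile(r"^<([^>]+)>\s+<([^>]+)>\s+<([^>]+)>\s*\.\s*$")


def load_links(lines):
    """Build a dict mapping DBpedia resource URIs to Freebase IDs."""
    mapping = {}
    for line in lines:
        match = TRIPLE.match(line.strip())
        if not match:
            continue  # skip comments, blank lines and literal-valued triples
        subject, predicate, obj = match.groups()
        if (predicate == SAME_AS
                and subject.startswith(DBPEDIA_PREFIX)
                and obj.startswith(FREEBASE_PREFIX)):
            mapping[subject] = obj[len(FREEBASE_PREFIX):]
    return mapping


def to_freebase(dbpedia_uris, mapping):
    """Resolve URIs returned by an annotator; unknown URIs map to None."""
    return {uri: mapping.get(uri) for uri in dbpedia_uris}


# Hypothetical sample line; the Freebase ID is a placeholder, not a real one.
sample = [
    "<http://dbpedia.org/resource/Berlin> "
    "<http://www.w3.org/2002/07/owl#sameAs> "
    "<http://rdf.freebase.com/ns/m.0abc1> .",
]
links = load_links(sample)
resolved = to_freebase(
    ["http://dbpedia.org/resource/Berlin", "http://dbpedia.org/resource/Unknown"],
    links,
)
```

Here `resolved` maps each annotated DBpedia URI to its Freebase ID, or to `None` when no link exists, so unlinked entities can be handled explicitly.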

If you're looking for a coding project in this area, I think it should be possible to adapt the DBpedia Spotlight code so that you can train its models on Freebase data. The main benefit would be that Freebase covers a wider range of entities than DBpedia, so you'd get better recall. You may also be able to exploit other data in Freebase, such as "notable types", to get better precision as well.

You should be able to get a good set of "surface forms" for each entity by looking at the /type/object/name and /common/topic/alias properties in Freebase. Any Freebase entity that corresponds to a Wikipedia page will have one or more /type/object/key values in the /wikipedia/en namespace. These correspond to the Wikipedia page names (and redirects), which will allow you to parse the Wikipedia XML dumps and identify which links on each page correspond to Freebase topics. The Freebase key encoding scheme is described here.
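A minimal sketch of both ideas follows. It assumes Freebase keys escape special characters as `$` followed by four uppercase hexadecimal digits (for example `$0028` for an opening parenthesis), and that names and aliases have already been extracted into simple tuples. The function names and the topic ID are hypothetical.

```python
import re
from collections import defaultdict

# Assumed escape format: "$" followed by four uppercase hex digits.
ESCAPE = re.compile(r"\$([0-9A-F]{4})")


def decode_key(key):
    """Turn a Freebase key back into the raw string it encodes."""
    return ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), key)


def key_to_title(key):
    """Convert a /wikipedia/en key into a Wikipedia page title."""
    return decode_key(key).replace("_", " ")


def build_surface_forms(topics):
    """Map each lowercased name or alias to the set of topic IDs using it.

    `topics` is an iterable of (topic_id, name, aliases) tuples.
    """
    index = defaultdict(set)
    for topic_id, name, aliases in topics:
        for form in [name, *aliases]:
            index[form.lower()].add(topic_id)
    return dict(index)


title = key_to_title("Berlin_$0028disambiguation$0029")

# Hypothetical topic ID, for illustration only.
forms = build_surface_forms([("m.0abc1", "Berlin", ["City of Berlin"])])
```

Lowercasing the surface forms is just one possible normalization; a real system would likely keep case information, since it helps distinguish names from common words.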

You might also be interested in OpenCalais and AlchemyAPI, which provide named entity recognition as a service and include Freebase IDs in their API responses.

Licensed under: CC-BY-SA with attribution