Question

I would like to know which category a word falls into.

For example, it could be a place, food, clothing, etc. How do I get that from the Wikipedia API? In this example I am using Pizza, which falls into the category food. The following query gives me the entire page; how do I get only the category? I really appreciate any help. Thanks in advance.

http://en.wikipedia.org/w/api.php?action=query&prop=categories&rvprop=content&format=xml&titles=pizza  

response:

<api><warnings><main xml:space="preserve">Unrecognized parameter: 'rvprop'</main></warnings><query-continue><categories clcontinue="24768|Pizza"/></query-continue><query><normalized><n from="pizza" to="Pizza"/></normalized><pages><page pageid="24768" ns="0" title="Pizza"><categories><cl ns="14" title="Category:All articles needing additional references"/><cl ns="14" title="Category:All articles with unsourced statements"/><cl ns="14" title="Category:Articles including recorded pronunciations"/><cl ns="14" title="Category:Articles needing additional references from June 2010"/><cl ns="14" title="Category:Articles with unsourced statements from March 2013"/><cl ns="14" title="Category:Flatbreads"/><cl ns="14" title="Category:Greek inventions"/><cl ns="14" title="Category:Italian cuisine"/><cl ns="14" title="Category:Italian inventions"/><cl ns="14" title="Category:Mediterranean cuisine"/></categories></page></pages></query></api>

Solution

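Use prop=categories in place of prop=revisions and you will get a list of the categories the article is in. The rvprop parameter only applies to prop=revisions, which is why the response above warns about it, so it can be dropped as well.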

For example, api.php?action=query&prop=categories&titles=pizza gives you (among other things):

<api>
 <query>
  <pages>
   <page pageid="24768" ns="0" title="Pizza">
    <categories>
     <cl ns="14" title="Category:All articles needing additional references"/>
     <cl ns="14" title="Category:All articles with unsourced statements"/>
     <cl ns="14" title="Category:Articles including recorded pronunciations"/>
     <cl ns="14" title="Category:Articles needing additional references from June 2010"/>
     <cl ns="14" title="Category:Articles with unsourced statements from March 2013"/>
     <cl ns="14" title="Category:Flatbreads"/>
     <cl ns="14" title="Category:Greek inventions"/>
     <cl ns="14" title="Category:Italian cuisine"/>
     <cl ns="14" title="Category:Italian inventions"/>
     <cl ns="14" title="Category:Mediterranean cuisine"/>
    </categories>
   </page>
  </pages>
 </query>
</api>
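
As a rough illustration, the request and the parsing of this response could be sketched in Python as follows. The function names and the User-Agent string are made up for this example. The clshow=!hidden and cllimit=max parameters are not part of the query above; they are added on the assumption that you want to skip hidden (mostly maintenance) categories and get as many results per request as allowed. The sketch does not follow query-continue, so a page with many categories may come back incomplete.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

API_URL = "https://en.wikipedia.org/w/api.php"


def build_categories_url(title, hide_maintenance=True):
    """Build a prop=categories query URL for one page title."""
    params = {
        "action": "query",
        "prop": "categories",
        "format": "xml",
        "titles": title,
        "cllimit": "max",  # assumption: ask for as many categories as allowed
    }
    if hide_maintenance:
        # assumption: hidden categories are the maintenance ones we want to skip
        params["clshow"] = "!hidden"
    return API_URL + "?" + urllib.parse.urlencode(params)


def parse_categories(xml_text):
    """Return category names from an XML response, without the 'Category:' prefix."""
    root = ET.fromstring(xml_text)
    names = []
    for cl in root.iter("cl"):
        title = cl.get("title", "")
        names.append(title.split(":", 1)[1] if ":" in title else title)
    return names


def fetch_categories(title):
    """Fetch and parse the categories of one page (performs a network request)."""
    request = urllib.request.Request(
        build_categories_url(title),
        # hypothetical User-Agent; replace with one that identifies your tool
        headers={"User-Agent": "category-example/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_categories(response.read().decode("utf-8"))
```

Calling fetch_categories("Pizza") would then return a plain list of category names, which is usually easier to work with than the raw XML.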

Note that you get the article categories, not some kind of taxonomy of the thing the article is about. You could probably follow a 'chain' (more like a tree) of categories up to a handful of large categories that you care about. For example, the Flatbreads category leads to Breads, which leads to Foods. This means that you could assume that pizza is a food. It's no guarantee, though. Article categories are not really meant to be used that way.
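
Following the category tree upwards could look roughly like the sketch below. It is a breadth-first search that stops at the first category from a set you care about. The names find_top_category, get_parents and max_depth are invented for this example, and get_parents is an assumed callback that returns the parent categories of a category, for instance by running a prop=categories query on the title "Category:" plus the category name. The depth limit is there because the category graph is large and may contain cycles.

```python
from collections import deque


def find_top_category(start_categories, targets, get_parents, max_depth=5):
    """Search upward through parent categories, breadth first.

    Returns the path from a starting category to the first target reached,
    or None if no target is found within max_depth parent hops.
    """
    queue = deque((cat, [cat]) for cat in start_categories)
    seen = set(start_categories)
    while queue:
        category, path = queue.popleft()
        if category in targets:
            return path
        if len(path) - 1 >= max_depth:
            continue  # do not climb further than max_depth hops
        for parent in get_parents(category):
            if parent not in seen:  # avoid revisiting categories and looping
                seen.add(parent)
                queue.append((parent, path + [parent]))
    return None


# Toy parent map standing in for real API lookups (hypothetical data,
# mirroring the Flatbreads -> Breads -> Foods chain mentioned above).
toy_parents = {"Flatbreads": ["Breads"], "Breads": ["Foods"]}
path = find_top_category(
    ["Italian cuisine", "Flatbreads"],
    {"Foods", "Places", "Clothing"},
    lambda name: toy_parents.get(name, []),
)
print(path)
```

With the toy data the search reaches Foods through Flatbreads and Breads. Against the real category graph the result depends on which target categories you pick and how deep you allow the search to go, so treat a match as a hint rather than a classification.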

If you want a better classification of 'things', instead of articles, try something like Freebase - which, for example, helpfully lists pizza as a kind of food. A sample query would look like this:

[{"id":null,"name":"Pizza","type":[{"id":null,"name":null}]}]

This returns, among other things, Food, Cuisine, Bread and Man-made Thing.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow