Question

Is there a Wikipedia API that returns article content as plain JSON, ideally without BBCode or Wikipedia's special markup? Something similar to YouTube's JSON API.


Solution

Please take a look at the MediaWiki API help, which has all the necessary information. You can choose the response format from the following list:

json, jsonfm, php, phpfm, wddx, wddxfm, xml, xmlfm, yaml, yamlfm
rawfm, txt, txtfm, dbg, dbgfm, dump, dumpfm, none
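
As a rough illustration of how the format is selected, the sketch below builds a query URL in Python and passes the chosen format as the `format` parameter. The endpoint and the other parameters are taken from the curl example further down this page; the function name `build_query_url` is made up for this example, and nothing here is sent over the network.

```python
from urllib.parse import urlencode

# Endpoint taken from the curl example on this page.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_query_url(title: str, fmt: str = "json") -> str:
    """Build a MediaWiki API URL that asks for a page's latest revision content.

    `build_query_url` is a hypothetical helper, not part of any library.
    `fmt` is one of the response formats listed above (for example "json").
    """
    params = {
        "action": "query",
        "format": fmt,
        "formatversion": "2",
        "redirects": "1",  # boolean flag: its presence is what matters
        "prop": "revisions",
        "rvprop": "content",
        "titles": title,
    }
    # urlencode escapes spaces as "+", matching the curl example.
    return f"{API_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_query_url("Albert Einstein"))
```

Switching `fmt` to "xml" or "jsonfm" should be enough to compare the output formats for the same query.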

Other tips

You can also consume Wikipedia pages through a wrapper API such as JSONpedia. It works in two modes: live (request the current JSON representation of a Wikipedia page) and storage-based (query multiple pages previously ingested into Elasticsearch and MongoDB).

Here's a Windows curl command that returns a JSON response for a Wikipedia entry (Albert Einstein). The content comes back as wikitext, so most HTML markup is absent, although <ref> tags and some other Wikipedia markup remain.

curl "https://en.wikipedia.org/w/api.php?origin=*&action=query&format=json&formatversion=2&redirects&prop=revisions&rvprop=content&titles=Albert+Einstein" -o curl-wiktionary-result.json

Use this jq command to drill down into the "content" property:

jq ".query.pages[].revisions[].content" < curl-wiktionary-result.json
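
For anyone without jq, the same drill-down can be sketched in Python. This is a minimal sketch that assumes the response shape implied by the jq path above (`query`, then `pages`, then `revisions`, then `content`); the helper names `iter_revision_contents` and `load_contents` are invented for this example. Unlike jq, it silently skips pages or revisions that lack the expected keys.

```python
import json
from typing import Any, Iterator


def iter_revision_contents(response: dict[str, Any]) -> Iterator[str]:
    """Yield every revision's "content" string from a parsed API response.

    Hypothetical helper mirroring the jq path .query.pages[].revisions[].content.
    """
    pages = response.get("query", {}).get("pages", [])
    # With formatversion=2 "pages" is a list; without it, it is a dict
    # keyed by page id. Handle both, as jq's [] does.
    if isinstance(pages, dict):
        pages = pages.values()
    for page in pages:
        for revision in page.get("revisions", []):
            if "content" in revision:
                yield revision["content"]


def load_contents(path: str) -> list[str]:
    """Read a saved JSON response (e.g. the curl output file) and extract content."""
    with open(path, encoding="utf-8") as fh:
        return list(iter_revision_contents(json.load(fh)))


if __name__ == "__main__":
    # Tiny stand-in for a real response, used only to show the traversal.
    sample = {"query": {"pages": [{"title": "Example",
                                   "revisions": [{"content": "'''Example''' text"}]}]}}
    print(list(iter_revision_contents(sample)))
```

Calling `load_contents("curl-wiktionary-result.json")` on the file written by the curl command above should give the same strings that the jq command prints.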
Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow