Question

Is there a Wikipedia API that returns article content as plain JSON, if possible without BBCode or Wikipedia's special markup? Something similar to YouTube's JSON API like this.

Solution

Please take a look at the MediaWiki API help. There you can find all the necessary information. You can choose the response format from the following list:

json, jsonfm, php, phpfm, wddx, wddxfm, xml, xmlfm, yaml, yamlfm
rawfm, txt, txtfm, dbg, dbgfm, dump, dumpfm, none
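
To illustrate how the format is selected, here is a minimal Python sketch that builds a query URL with the `format` parameter set. The helper name `build_query_url` is made up for this example, and the other parameters (`action=query`, `prop=revisions`, `rvprop=content`) are just one plausible request, not the only way to use the API.

```python
from urllib.parse import urlencode

# English Wikipedia's API endpoint; other wikis expose the same path on their own host.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_query_url(title, response_format="json"):
    """Return an API URL asking for the latest revision text of `title`.

    `build_query_url` is a hypothetical helper for this example, not part of any library.
    """
    params = {
        "action": "query",
        "format": response_format,  # one of the formats listed above, e.g. json or xml
        "prop": "revisions",
        "rvprop": "content",
        "titles": title,
    }
    # urlencode escapes the values, turning the space in the title into "+".
    return f"{API_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_query_url("Albert Einstein"))
    print(build_query_url("Albert Einstein", response_format="xml"))
```

Changing `response_format` is the only thing needed to switch between the formats the server supports.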

Other suggestions

You can also consume Wikipedia pages through a wrapper API such as JSONpedia. It works in two modes: live (request the current JSON representation of a Wikipedia page) and storage-based (query multiple pages previously ingested into Elasticsearch and MongoDB).

Here's a curl command (written for Windows) that returns a JSON response for a Wikipedia entry (Albert Einstein). Most of the HTML markup is removed, although <ref> tags remain, and the content still contains some Wikipedia markup.

curl "https://en.wikipedia.org/w/api.php?origin=*&action=query&format=json&formatversion=2&redirects&prop=revisions&rvprop=content&titles=Albert+Einstein" -o curl-wiktionary-result.json

Use this jq command to drill down into the "content" property:

jq ".query.pages[].revisions[].content" < curl-wiktionary-result.json
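
If you would rather stay in one script, the same request and drill-down can be sketched in Python with only the standard library. This is an approximation of the curl and jq steps above, not a drop-in replacement: the function names and the `USER_AGENT` string are placeholders invented for this example, and `origin=*` is left out on the assumption that it only matters for browser (CORS) requests.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"
# Placeholder value: replace with something that identifies your own client.
USER_AGENT = "wiki-json-example/0.1 (placeholder contact: you@example.com)"


def fetch_page_json(title):
    """Send the same query as the curl command and return the parsed JSON."""
    params = {
        "action": "query",
        "format": "json",
        "formatversion": "2",  # with version 2, "pages" is a list
        "redirects": "1",
        "prop": "revisions",
        "rvprop": "content",
        "titles": title,
    }
    request = Request(
        f"{API_ENDPOINT}?{urlencode(params)}",
        headers={"User-Agent": USER_AGENT},
    )
    with urlopen(request, timeout=30) as response:
        return json.load(response)


def extract_contents(api_response):
    """Follow the jq path .query.pages[].revisions[].content.

    Unlike the jq filter, pages without a "revisions" key (for example,
    missing titles) are skipped instead of raising an error.
    """
    return [
        revision["content"]
        for page in api_response["query"]["pages"]
        for revision in page.get("revisions", [])
    ]


if __name__ == "__main__":
    for text in extract_contents(fetch_page_json("Albert Einstein")):
        print(text[:500])  # first 500 characters of the raw wikitext
```

As with the jq output, the returned strings are raw wikitext, so templates and <ref> tags are still present.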
Licensed under: CC-BY-SA with attribution