How do I get the HTML source of a Wiktionary page? [duplicate]

https://stackoverflow.com/questions/16254480

13-04-2022

Question

I am struggling with the Wiki API. How can I simply get a page's HTML using the API? I know it is possible because I have done it before, but I cannot remember how.

Say I want the page source for http://en.wiktionary.org/wiki/bicycle. How do I get it, and which API do I use? I do not want to look it up in the browser.

Solution

With Java and Jsoup you can do this:

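import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Fetch and parse the rendered page (get() throws IOException on failure)
Document document = Jsoup
        .connect("http://en.wiktionary.org/wiki/bicycle")
        .get();

// Select the div with id "bodyContent", which wraps the article body
Element bodyContent = document.select("div#bodyContent").first();

System.out.println(bodyContent.html());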

OTHER TIPS

Assuming you want the HTML, use the "parse" action of the MediaWiki API:

http://en.wiktionary.org/w/api.php?action=parse&page=bicycle&prop=text&disablepp=1&format=json

If you want the original wikitext instead, request a different property:

http://en.wiktionary.org/w/api.php?action=parse&page=bicycle&prop=wikitext&disablepp=1&format=json
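To call these URLs from Java rather than a browser, something like the following sketch may work. It assumes Java 11 or later (for java.net.http.HttpClient) and uses https instead of http, since this client does not follow redirects by default. The class name WiktionaryParse, the helpers buildParseUrl and fetch, and the User-Agent string are made up for illustration; the query parameters are the ones from the URLs above.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

class WiktionaryParse {

    // Builds the action=parse URL. Pass prop "text" for rendered HTML
    // or "wikitext" for the page source. (Hypothetical helper.)
    static String buildParseUrl(String page, String prop) {
        return "https://en.wiktionary.org/w/api.php?action=parse"
                + "&page=" + URLEncoder.encode(page, StandardCharsets.UTF_8)
                + "&prop=" + URLEncoder.encode(prop, StandardCharsets.UTF_8)
                + "&disablepp=1&format=json";
    }

    // Sends a GET request and returns the raw JSON response body.
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                // Placeholder User-Agent; replace with one identifying your tool.
                .header("User-Agent", "wiktionary-example/0.1")
                .GET()
                .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        // Prints the JSON wrapper that contains the HTML for "bicycle".
        System.out.println(fetch(buildParseUrl("bicycle", "text")));
    }
}
```

The response is JSON, not bare HTML: with format=json the markup is nested under the "parse" and "text" keys, so a JSON library of your choice is still needed to pull it out.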

Licensed under: CC-BY-SA with attribution
