Question

Using PHP, is there a nice way to get only the (parsed) introduction from a Wikipedia page?

I have two current methods:

  • The first is to call the API to get the page text, extract the introduction from it, then call the wiki parser on that introduction (two requests, and extracting the intro from the text isn't pretty either).
  • The second is to call the parser on the entire page and use XPath to retrieve every <p> tag before the table of contents.

With both methods I then have to re-parse the HTML to make sure the links inside the introduction point to Wikipedia.

Neither is really ideal. Is there a better way?


Solution

The action=parse API module accepts a section number parameter (section). The lead is section number 0, so requesting section=0 returns only the parsed introduction in a single request.
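As a rough illustration of the section parameter, here is a minimal PHP sketch that requests only section 0 of a page. The English Wikipedia endpoint, the function names buildLeadUrl and fetchLeadHtml, the User-Agent string, and the choice of prop=text and formatversion=2 are all assumptions made for this example, not part of the original answer.

```php
<?php
// Sketch only: the endpoint, helper names, and User-Agent are illustrative assumptions.

/**
 * Build an action=parse request URL for the lead (section 0) of a page.
 */
function buildLeadUrl(string $title, string $endpoint = 'https://en.wikipedia.org/w/api.php'): string
{
    $params = [
        'action'        => 'parse',
        'page'          => $title,
        'section'       => 0,       // section 0 is the lead
        'prop'          => 'text',  // ask only for the rendered HTML
        'format'        => 'json',
        'formatversion' => 2,
    ];

    // Pass '&' explicitly so the result does not depend on arg_separator.output.
    return $endpoint . '?' . http_build_query($params, '', '&');
}

/**
 * Fetch the rendered HTML of the lead section, or null on failure.
 */
function fetchLeadHtml(string $title): ?string
{
    // Send an identifying User-Agent with the request (placeholder value).
    $context = stream_context_create([
        'http' => [
            'header' => "User-Agent: LeadFetchExample/0.1 (contact@example.com)\r\n",
        ],
    ]);

    $json = @file_get_contents(buildLeadUrl($title), false, $context);
    if ($json === false) {
        return null;
    }

    $data = json_decode($json, true);
    $text = $data['parse']['text'] ?? null;

    // Older response formats wrap the HTML in an array under the '*' key.
    if (is_array($text)) {
        $text = $text['*'] ?? null;
    }

    return is_string($text) ? $text : null;
}
```

Calling fetchLeadHtml('PHP') should then return the introduction as HTML in one request. Links in the returned HTML may still be site-relative, so the link-rewriting step mentioned in the question may still be needed.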

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow