Question

I have been trying really hard to get a list of book writers from Wikipedia using its API. I would like to give users of my website the ability to show which writers they like. To show them others who like the same writer, I thought it would be a good idea to make an autocomplete/suggest textbox that shows them possible writers (after typing, let's say, 3 characters). This way, spelling problems are avoided and I can also store the pageId, which I can then use to match users.

The coding is not the problem! The problem is in constructing the right query. I tried several approaches but I just can't get what I want. There are also very few examples that show how to do this kind of thing.

What I would like:

  • returns titles of pages
  • pages only (so no categories, revisions etc)
  • pages about people, or if possible writers (nationality is unimportant)
  • searching on title only

And if possible:

  • a little bit of the text on a page (I guess one can only get this on Wikipedia?)
  • a URL to the page
  • date of birth, and when appropriate date of death

I am not sure if this is even possible.


Solution

Nowadays, Wikipedia data is usually queried via its structured-data counterpart, Wikidata; see https://www.wikidata.org/wiki/Wikidata:Data_access
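
For the autocomplete box itself, one option is the `wbsearchentities` action of the Wikidata API, which matches a typed prefix against item labels and aliases. Below is a minimal sketch that only builds the request URL. The helper name `build_suggest_url` and the three-character minimum are illustrative assumptions, and the results are not restricted to writers, so they would still need filtering:

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_suggest_url(prefix, language="en", limit=10):
    """Return a wbsearchentities URL for the typed prefix (hypothetical helper)."""
    prefix = prefix.strip()
    # Assumption taken from the question: only suggest after 3 characters.
    if len(prefix) < 3:
        raise ValueError("type at least 3 characters before suggesting")
    params = {
        "action": "wbsearchentities",  # label/alias search on Wikidata items
        "search": prefix,
        "language": language,
        "type": "item",
        "limit": limit,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urlencode(params)
```

Each hit in the `search` list of the JSON response should carry an `id` (the item's Q-number) and a `label`, usually with a short `description`. The Q-number could play the role of the pageId for matching users.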

You can for instance use WDQ to get a list of items which are marked as "being" or "having profession of" "writer": http://tools.wmflabs.org/autolist/autolist1.html?q=tree%5B36180%5D%5B%5D%5B31%2C106%5D (60k results).
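
To see what that link actually asks for, it helps to decode the `q` parameter. A small sketch (the function name `wdq_query` is made up for illustration):

```python
from urllib.parse import urlparse, parse_qs

AUTOLIST_URL = (
    "http://tools.wmflabs.org/autolist/autolist1.html"
    "?q=tree%5B36180%5D%5B%5D%5B31%2C106%5D"
)


def wdq_query(url):
    """Extract and percent-decode the WDQ query string from an autolist URL."""
    return parse_qs(urlparse(url).query)["q"][0]


decoded = wdq_query(AUTOLIST_URL)
print(decoded)  # tree[36180][][31,106]
```

Here 36180 is the Wikidata item Q36180 (writer), 31 is property P31 (instance of) and 106 is P106 (occupation). As far as I understand WDQ's tree syntax, the empty middle bracket means no properties are followed forward, and the last bracket lists the properties followed in reverse, i.e. items that point at "writer" through P31 or P106.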

Or you can also include all its subclasses (poet and so on): http://tools.wmflabs.org/autolist/autolist1.html?q=tree%5B36180%5D%5B%5D%5B31%2C106%2C279%5D (this gets a bit messy with 200k results and would need some filtering).
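
A comparable list can probably also be pulled from the Wikidata Query Service with SPARQL, which is a different tool from WDQ. The sketch below only builds the query and the request URL and has not been run against the live service. The function names are hypothetical, and restricting to humans (Q5) is my own assumption to match "pages about people", so it will not return exactly the same set as the WDQ links:

```python
from urllib.parse import urlencode

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_writers_sparql(include_subclasses=True, limit=100):
    """Build a SPARQL query for humans whose occupation is writer (Q36180)."""
    # P279* also follows "subclass of" links, e.g. poet -> writer.
    occupation_path = "wdt:P106/wdt:P279*" if include_subclasses else "wdt:P106"
    return f"""
SELECT ?person ?personLabel ?born ?died ?article WHERE {{
  ?person wdt:P31 wd:Q5 ;
          {occupation_path} wd:Q36180 .
  OPTIONAL {{ ?person wdt:P569 ?born . }}
  OPTIONAL {{ ?person wdt:P570 ?died . }}
  ?article schema:about ?person ;
           schema:isPartOf <https://en.wikipedia.org/> .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {int(limit)}
""".strip()


def build_request_url(query):
    """Return a GET URL for the query service that asks for JSON results."""
    return SPARQL_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})
```

P569 and P570 are date of birth and date of death, and `?article` is the English Wikipedia page URL, which covers the optional items in the question. Without a small `limit` the subclass variant may well time out, so paging or extra filters would be needed.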

Licensed under: CC-BY-SA with attribution