Question

If I have a Wikimedia category such as "Category:Google_Art_Project_works_by_Vincent_van_Gogh", is there an API to retrieve a list of the URLs linked to on that page?

I've tried this, but it doesn't return any links: https://en.wikipedia.org/w/api.php?action=query&titles=Category:Google_Art_Project_works_by_Vincent_van_Gogh&prop=links

(If not, I'll parse the html and obtain them that way.)

Once I have all the URLs linked to, is there an API to retrieve some of the information on the page? (Summary/Artist, Title, Date, Dimensions, Current location, Licensing)

I've tried this, but it doesn't seem to have a way to return that information: https://en.wikipedia.org/w/api.php?action=query&titles=File:Irises-Vincent_van_Gogh.jpg&prop=imageinfo&iiprop=url


Solution

is there an API to retrieve a list of the URLs linked to on that page?

You are probably looking for the categorymembers API (list=categorymembers), which lists the pages in a given category.
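As a rough illustration, a categorymembers request could be built and read like this in Python. This is only a sketch: the helper names (category_members_url, titles_from_response, fetch_category_members), the User-Agent string, and the default limit of 50 are made up for the example, and it assumes the standard JSON layout returned with format=json.

```python
import json
import urllib.parse
import urllib.request

# The category lives on Wikimedia Commons, so that is the endpoint queried here.
COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def category_members_url(category, limit=50):
    """Build a list=categorymembers query URL (hypothetical helper)."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmlimit": limit,
        "format": "json",
    }
    return COMMONS_API + "?" + urllib.parse.urlencode(params)


def titles_from_response(data):
    """Pull the page titles out of a decoded categorymembers response."""
    members = data.get("query", {}).get("categorymembers", [])
    return [member["title"] for member in members]


def fetch_category_members(category, limit=50):
    """Perform the request over the network and return the member titles."""
    request = urllib.request.Request(
        category_members_url(category, limit),
        # Assumed, illustrative User-Agent; use one that identifies your tool.
        headers={"User-Agent": "category-members-example/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        return titles_from_response(json.load(response))
```

Calling fetch_category_members("Category:Google_Art_Project_works_by_Vincent_van_Gogh") should return the file titles in the category. Results beyond the limit would need the API's continuation parameters, which this sketch leaves out.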

I've tried this, but it doesn't return any links: https://en.wikipedia.org/w/api.php?action=query&titles=Category:Google_Art_Project_works_by_Vincent_van_Gogh&prop=links

First, note that this is a Wikimedia Commons category, so querying en.wikipedia.org returns a missing page. However, even if you query the right project (commons.wikimedia.org), you will notice that the category description itself does not contain any links.

Once I have all the URLs linked to, is there an API to retrieve some of the information on the page?

You can use the categorymembers query as a generator and then specify the usual properties that you want from each page. However, the metadata you seem to be interested in is not available via the API; you need to parse it out of each image's description text.

Try https://commons.wikimedia.org/w/api.php?action=query&generator=categorymembers&gcmtitle=Category%3aGoogle_Art_Project_works_by_Vincent_van_Gogh&prop=links|imageinfo|revisions&iiprop=timestamp|user|url|size|mime&rvprop=ids|content&rvgeneratexml
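To give an idea of the parsing step, here is a minimal Python sketch that reads a decoded JSON response from a generator query like the one above and pulls "|name = value" lines out of each page's description text. It makes several assumptions: the response uses the legacy JSON layout in which revision content sits under the "*" key, the description uses a template with one parameter per line (such as an Artwork-style template), and the function names are invented for the example. Nested templates or multi-line values would need a real wikitext parser.

```python
import re

# Matches a single template parameter line such as " |artist = Vincent van Gogh".
PARAM_LINE = re.compile(r"\s*\|\s*([\w ]+?)\s*=\s*(.*)$")


def extract_template_fields(wikitext):
    """Collect `|name = value` parameters, assuming one parameter per line."""
    fields = {}
    for line in wikitext.splitlines():
        match = PARAM_LINE.match(line)
        if match:
            fields[match.group(1).strip().lower()] = match.group(2).strip()
    return fields


def summarize_pages(data):
    """Turn a decoded generator response into a list of per-file summaries."""
    results = []
    for page in data.get("query", {}).get("pages", {}).values():
        revisions = page.get("revisions", [])
        # Assumption: legacy format, where the wikitext is stored under "*".
        wikitext = revisions[0].get("*", "") if revisions else ""
        imageinfo = page.get("imageinfo", [])
        results.append({
            "title": page.get("title"),
            "url": imageinfo[0].get("url") if imageinfo else None,
            "fields": extract_template_fields(wikitext),
        })
    return results
```

Each summary then carries the file title, the image URL, and whatever fields (artist, title, date, and so on) happened to be present in that page's description.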

Licensed under: CC-BY-SA with attribution