Question

Is there a way to programmatically list all geo-tagged Wikipedia entries within a radius of a long/lat point? I'm thinking this is possible with the Google Maps API, but I am interested in any method. NOTE: I do not want to display a Google map.


Solution

Yes, it's possible. The hard part is getting the coordinates in the first place, which means either:

(1) Screen-scraping Wikipedia (a bad idea, unless you already have a small list of target pages)
(2) Downloading and parsing the massive Wikipedia data sets (a better idea)

Once you have lat/long coordinates, which I assume are in the wiki page's geotag format, you can use the great circle formula to compute relative distances, and bypass Google's API entirely.
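As a rough illustration of the great circle step, here is a minimal Python sketch using the haversine form of the formula. It assumes the coordinates have already been extracted as decimal degrees; the function names, the (title, lat, lon) tuple layout and the mean Earth radius constant are my own choices for the example, not anything Wikipedia prescribes.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; treating Earth as a sphere is an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def within_radius(center, entries, radius_km):
    """Return (title, distance_km) pairs within radius_km of center, nearest first.

    center is (lat, lon); entries is an iterable of (title, lat, lon) tuples
    (a hypothetical layout for pages you have already parsed).
    """
    lat0, lon0 = center
    hits = []
    for title, lat, lon in entries:
        d = haversine_km(lat0, lon0, lat, lon)
        if d <= radius_km:
            hits.append((title, d))
    return sorted(hits, key=lambda hit: hit[1])
```

For a large dump you would probably pre-filter by a cheap latitude/longitude box before calling haversine_km on every entry, but the brute-force loop is enough to show the idea.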

The moral of this story? When you're dealing with datasets this massive, you're going to want to do as much of the work offline as possible.

OTHER TIPS

I've solved a slightly similar problem by using the GeoNames webservices.

You can use the webservice to request cities and so on. There is a per-IP request limit that you must not exceed.

I searched a little further and found something interesting for you. The webservice is called findNearbyWikipedia. It may be the thing you're looking for...
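A minimal sketch of calling that GeoNames service from Python, using only the standard library. The JSON endpoint URL and the lat, lng, radius, maxRows and username parameters are written as I remember them from the GeoNames documentation, and the "geonames" key in the response is likewise an assumption, so check all of them against the current docs. You need your own registered GeoNames username; the helper names here are made up for the example.

```python
import json
import urllib.parse
import urllib.request

# Assumed JSON variant of the findNearbyWikipedia service.
GEONAMES_URL = "http://api.geonames.org/findNearbyWikipediaJSON"


def build_request_url(lat, lng, radius_km, username, max_rows=10):
    """Build the request URL; kept separate so it can be inspected without network access."""
    params = {
        "lat": lat,
        "lng": lng,
        "radius": radius_km,   # search radius in km
        "maxRows": max_rows,   # maximum number of articles to return
        "username": username,  # your registered GeoNames account name
    }
    return GEONAMES_URL + "?" + urllib.parse.urlencode(params)


def find_nearby_wikipedia(lat, lng, radius_km, username, max_rows=10):
    """Fetch nearby Wikipedia entries and return them as a list of dicts."""
    url = build_request_url(lat, lng, radius_km, username, max_rows)
    with urllib.request.urlopen(url, timeout=10) as resp:
        data = json.load(resp)
    # Assumption: results are listed under "geonames"; errors come back under another key.
    return data.get("geonames", [])
```

Usage would look something like find_nearby_wikipedia(50.45, 30.52, 5, "your_username"). Remember the request limit mentioned above if you call it in a loop.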

Another option is the DBpedia SPARQL interface. For example, the following SPARQL query returns Wikipedia articles whose coordinates fall inside a bounding box.

SPARQL libraries are widely available; for example, there is a SPARQL Endpoint interface to Python.

To test it, just paste the query below into this online query editor:

http://dbpedia.org/sparql

PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX onto: <http://dbpedia.org/ontology/>

SELECT * WHERE {
  ?s a onto:Place .
  ?s geo:lat ?lat .
  ?s geo:long ?long .
  FILTER ( ?long > 30.3 && ?long < 30.5 && ?lat > 50.4 && ?lat < 50.5 )
}
LIMIT 100

It returns the following result:

"s","lat","long"
"http://dbpedia.org/resource/Kotsiubynske","50.48972320556641","30.32888793945312"
"http://dbpedia.org/resource/Akademmistechko_%28Kiev_Metro%29","50.46555709838867","30.35499954223633"
"http://dbpedia.org/resource/Zhytomyrska_%28Kiev_Metro%29","50.45583343505859","30.36416625976562"
"http://dbpedia.org/resource/Sviatoshyn_Airfield","50.47833251953125","30.38500022888184"
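Since the question asks for a radius rather than a box, here is a hedged Python sketch that turns a centre point and radius into an approximate bounding box, fills in the same query, and sends it to the endpoint with the standard library. The 111.32 km-per-degree figure is an approximation that breaks down near the poles and across the antimeridian. The format request parameter and the JSON result layout are what I'd expect from the DBpedia endpoint and the SPARQL JSON results format, but treat them as assumptions. All function names are invented for the example.

```python
import json
import math
import urllib.parse
import urllib.request

ENDPOINT = "http://dbpedia.org/sparql"
KM_PER_DEGREE_LAT = 111.32  # rough average; good enough for small radii

QUERY_TEMPLATE = """PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX onto: <http://dbpedia.org/ontology/>

SELECT * WHERE {{
  ?s a onto:Place .
  ?s geo:lat ?lat .
  ?s geo:long ?long .
  FILTER ( ?long > {west:.6f} && ?long < {east:.6f} && ?lat > {south:.6f} && ?lat < {north:.6f} )
}}
LIMIT {limit}
"""


def bounding_box(lat, lon, radius_km):
    """Return (south, west, north, east) of a box that contains the circle."""
    dlat = radius_km / KM_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    dlon = radius_km / (KM_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return lat - dlat, lon - dlon, lat + dlat, lon + dlon


def build_query(lat, lon, radius_km, limit=100):
    """Fill the bounding-box query for a centre point and radius."""
    south, west, north, east = bounding_box(lat, lon, radius_km)
    return QUERY_TEMPLATE.format(south=south, west=west, north=north,
                                 east=east, limit=limit)


def run_query(query):
    """Send the query and return (resource_uri, lat, long) tuples."""
    params = urllib.parse.urlencode(
        {"query": query, "format": "application/sparql-results+json"})
    with urllib.request.urlopen(ENDPOINT + "?" + params, timeout=30) as resp:
        results = json.load(resp)
    return [(b["s"]["value"], float(b["lat"]["value"]), float(b["long"]["value"]))
            for b in results["results"]["bindings"]]
```

The box is slightly larger than the circle, so rows in the corners still need a proper distance check afterwards if you want an exact radius.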
Licensed under: CC-BY-SA with attribution