Question

If I try to get the language links for a page on Wikipedia via their API like this:

http://en.wikipedia.org/w/api.php?action=query&prop=langlinks&format=json&lllimit=10&llurl=&titles=wreck-it%20Ralph&redirects=

I get a list of results.

But if I lowercase the R in Ralph like this:

http://en.wikipedia.org/w/api.php?action=query&prop=langlinks&format=json&lllimit=10&llurl=&titles=wreck-it%20ralph&redirects=

I get no results.

Looking at the returned information, it looks like Wikipedia normalizes "wreck-it Ralph" in the first example to "Wreck-it Ralph" which redirects to "Wreck-It Ralph".

In the second example, "wreck-it ralph" is normalized to "Wreck-it ralph" which doesn't redirect anywhere, apparently.

Searching for "wreck-it ralph" on http://wikipedia.org works, of course:

http://www.wikipedia.org/search-redirect.php?family=wikipedia&search=wreck-it+ralph&language=en

Can I make the langlinks query work the same way, helping me when I don't know the exact case of all the characters of the search term?

Update: From the answer by Sorawee, I managed to find out how to do a case-insensitive search: https://en.wikipedia.org/w/api.php?action=query&generator=search&format=json&gsrsearch=wreck-it%20ralph&gsrlimit=1&prop=info
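As a rough sketch of how that search query could be combined with the original langlinks request, the snippet below only builds the URL; it does not send it. The function name `search_langlinks_url` is made up for illustration, and swapping `prop=info` for `prop=langlinks` is an assumption about how you would want to use the search generator, not something taken from the answer.

```python
from urllib.parse import urlencode, quote

API_URL = "https://en.wikipedia.org/w/api.php"


def search_langlinks_url(term, limit=10):
    """Build a query URL that searches for `term` and asks for the
    langlinks of the best match (hypothetical helper, not part of any library)."""
    params = {
        "action": "query",
        "format": "json",
        # Let the search module pick the page instead of an exact title.
        "generator": "search",
        "gsrsearch": term,
        "gsrlimit": 1,
        # Same langlinks parameters as in the question.
        "prop": "langlinks",
        "lllimit": limit,
        "llurl": "",
    }
    # quote_via=quote encodes spaces as %20, matching the URLs above.
    return API_URL + "?" + urlencode(params, quote_via=quote)


print(search_langlinks_url("wreck-it ralph"))
```

Because the search generator chooses the page, the exact case of the term should no longer matter, but the result is only the top search hit, which may not always be the page you meant.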


Solution

In MediaWiki, the first letter of every title is capitalized automatically. Therefore, "wreck-it Ralph" and "Wreck-it Ralph" are the same page. Similarly, "wreck-it ralph" and "Wreck-it ralph" are the same page. Note that capitalization only applies to the very first letter, so "Wreck-it Ralph" and "Wreck-it ralph" are still two different titles.

MediaWiki also has pages called "redirect pages." A redirect page sends you from its own title to another, possibly totally different, page. For example, https://en.wikipedia.org/wiki/Template:cn will redirect you to https://en.wikipedia.org/wiki/Template:Citation_needed. These pages are created by users, not by the software.

The situation you asked about looks like the diagram below.

"wreck-it Ralph" =normalized=> "Wreck-it Ralph" =redirected=> "Wreck-It Ralph" (found)

"wreck-it ralph" =normalized=> "Wreck-it ralph" (does not exist)
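The two chains in the diagram can be traced in code. The sketch below assumes the `query` object of a JSON response carries `normalized` and `redirects` lists of `from`/`to` pairs and a `pages` map in which a nonexistent page has a `missing` key. The sample dictionaries are hand-written for illustration (the page id is a placeholder), not captured API output, and the helper names are hypothetical.

```python
def resolve_title(query, title):
    """Apply the 'normalized' step, then the 'redirects' step, to one title."""
    for step in ("normalized", "redirects"):
        mapping = {m["from"]: m["to"] for m in query.get(step, [])}
        title = mapping.get(title, title)
    return title


def page_exists(query, title):
    """True if `title` appears in 'pages' without a 'missing' marker."""
    return any(
        page.get("title") == title and "missing" not in page
        for page in query.get("pages", {}).values()
    )


# Hand-written sample for "wreck-it Ralph" (page id is a placeholder).
sample_upper = {
    "normalized": [{"from": "wreck-it Ralph", "to": "Wreck-it Ralph"}],
    "redirects": [{"from": "Wreck-it Ralph", "to": "Wreck-It Ralph"}],
    "pages": {"12345": {"ns": 0, "title": "Wreck-It Ralph"}},
}

# Hand-written sample for "wreck-it ralph": normalized, but no such page.
sample_lower = {
    "normalized": [{"from": "wreck-it ralph", "to": "Wreck-it ralph"}],
    "pages": {"-1": {"ns": 0, "title": "Wreck-it ralph", "missing": ""}},
}

for sample, asked in ((sample_upper, "wreck-it Ralph"), (sample_lower, "wreck-it ralph")):
    final = resolve_title(sample, asked)
    print(asked, "->", final, "(found)" if page_exists(sample, final) else "(does not exist)")
```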

So now you know that you can't query the page "wreck-it ralph," because it doesn't exist.

However, if you query "wreck-it Ralph," you may or may not get the langlinks of "Wreck-It Ralph." It depends on the "&redirects=" parameter. Without this parameter, the API returns no langlinks, because the redirect page "Wreck-it Ralph" itself has none. With "&redirects=," the API follows the redirect (if one exists) and returns the langlinks of the target page, which are the ones you want. You can compare the results of the same query with and without the parameter.
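A minimal sketch of the two requests being compared, assuming the same parameters as the URL in the question. It only constructs the URLs so that the difference is visible; `langlinks_url` and its `follow_redirects` flag are illustrative names, not part of the MediaWiki API.

```python
from urllib.parse import urlencode, quote

API_URL = "https://en.wikipedia.org/w/api.php"


def langlinks_url(title, follow_redirects=True, limit=10):
    """Build a langlinks query URL for an exact title (hypothetical helper)."""
    params = {
        "action": "query",
        "prop": "langlinks",
        "format": "json",
        "lllimit": limit,
        "llurl": "",
        "titles": title,
    }
    if follow_redirects:
        # An empty value is enough: the parameter only needs to be present.
        params["redirects"] = ""
    return API_URL + "?" + urlencode(params, quote_via=quote)


# Same title, with and without redirect resolution.
print(langlinks_url("wreck-it Ralph", follow_redirects=True))
print(langlinks_url("wreck-it Ralph", follow_redirects=False))
```

Fetching both URLs (for example in a browser) should show the difference described above: only the first one asks the API to resolve the redirect before looking up langlinks.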

As for why http://www.wikipedia.org/search-redirect.php?family=wikipedia&search=wreck-it+ralph&language=en works: search-redirect.php is not part of the API. It runs a search and returns the nearest match, while the API query we are discussing only returns the page with the exact title.

Licensed under: CC-BY-SA with attribution