In a Wikipedia article's text, a link might be written as [Category:A B C], but the actual wiki URL ends with a suffix like Category:A_B_C. Where can I find the rules the wiki uses to derive the URL from a link in its text (e.g. converting spaces to underscores, capitalizing the first letter, handling non-ASCII characters, etc.)?

Solution

Roughly the following:

  • Normalize namespace, e.g. category: --> Category:.
  • Uppercase the first letter of title proper, e.g. Category:foo --> Category:Foo. Note: this depends on wiki settings and titles are never uppercased on Wiktionary, for example.
  • Replace spaces with underscores, e.g. Foo bar --> Foo_bar.
  • Percent-encode all the usual characters with PHP's standard function urlencode(), except for the following ones: ;:@$!*(),/.
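
The four steps above can be sketched in Python as follows. This is only an approximation under stated assumptions, not MediaWiki's actual code: the function name `title_to_url_suffix` and the small `KNOWN_NAMESPACES` table are hypothetical (a real wiki has its own configured namespace list), and Python's `quote()` stands in for PHP's `urlencode()`, so a few edge characters such as `~` may be treated differently.

```python
from urllib.parse import quote

# Characters left unescaped, per the list above, plus "_" produced by
# the space-replacement step.
SAFE_CHARS = ";:@$!*(),/_"

# Hypothetical, incomplete namespace table (lowercase -> canonical form).
KNOWN_NAMESPACES = {
    "category": "Category",
    "user": "User",
    "user talk": "User talk",
    "template": "Template",
}


def title_to_url_suffix(link_text, capitalize_first=True):
    """Approximate the URL suffix for link text such as 'category:a b c'."""
    prefix, sep, rest = link_text.strip().partition(":")
    namespace = KNOWN_NAMESPACES.get(prefix.strip().lower()) if sep else None
    # Only treat the prefix as a namespace if it is a known one;
    # otherwise the colon is simply part of the title.
    title = rest.strip() if namespace else link_text.strip()

    # Uppercase the first letter of the title proper (wiki-dependent).
    if capitalize_first:
        title = title[:1].upper() + title[1:]

    full = f"{namespace}:{title}" if namespace else title
    # Replace spaces with underscores, then percent-encode the rest.
    full = full.replace(" ", "_")
    return quote(full, safe=SAFE_CHARS)


print(title_to_url_suffix("category:a b c"))
print(title_to_url_suffix("café", capitalize_first=False))
```

Passing `capitalize_first=False` mimics wikis such as Wiktionary, where titles are not uppercased.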

For full technical details, see the MediaWiki source code, in particular the functions getLocalUrl() and wfUrlencode().

Other tips

There is no “etc.”; you already mentioned all the rules:

  1. spaces are converted to underscores
  2. the first letter of the article title is capitalized (the first letter of the namespace is capitalized too, if there is any)
  3. the whole link is percent-encoded

Note that rules #1 and #2 are not mandatory: if you create your own URL that doesn't follow them, Wikipedia will still show the page correctly.

Things get more complicated if you include namespace aliases (WP:WikiProject Computing --> Wikipedia:WikiProject_Computing) and interwiki links (wikia:gameofthrones:Westeros --> http://www.wikia.com/wiki/c:gameofthrones:Westeros).
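
As a rough illustration of the alias case only, the sketch below expands a namespace alias before the space-to-underscore step. The `NAMESPACE_ALIASES` table and the `expand_alias` name are hypothetical; each wiki defines its own aliases in its configuration, and interwiki prefixes are not handled here.

```python
# Hypothetical alias table; the real set is configured per wiki.
NAMESPACE_ALIASES = {"WP": "Wikipedia"}


def expand_alias(link_text):
    """Replace a known namespace alias prefix with its full namespace name."""
    prefix, sep, rest = link_text.partition(":")
    canonical = NAMESPACE_ALIASES.get(prefix.strip().upper())
    if sep and canonical:
        return f"{canonical}:{rest}"
    # No colon, or the prefix is not a known alias: leave the text unchanged.
    return link_text


expanded = expand_alias("WP:WikiProject Computing")
print(expanded.replace(" ", "_"))
```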

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow