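Turns out the path is generated from an MD5 hash of the file name (with spaces replaced by underscores): the first hex digit of the hash is the first directory, and the first two hex digits are the second. Something like the below in Scala will work, although I'm not sure how to predict whether the file is under /commons or under /en.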
    import org.apache.commons.codec.digest.DigestUtils

    def getImageUrl(fileName: String, rootUrl: String): String = {
      // The hash and the URL both use the name with underscores instead of spaces
      val name = fileName.replace(" ", "_")
      // md5Hex returns 32 zero-padded hex characters
      val md5 = DigestUtils.md5Hex(name)
      val hash1 = md5.substring(0, 1)
      val hash2 = md5.substring(0, 2)
      // rootUrl is expected to end with "/"
      rootUrl + hash1 + "/" + hash2 + "/" + name
    }
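Careful about leading zeros in the hex digest: if you build the hex string yourself (for example with BigInteger.toString(16)) they get dropped, and the directory part comes out wrong. This is discussed here: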
Does wikipedia use different methods to compute the hash part of an image path?
http://lists.wikimedia.org/pipermail/mediawiki-api/2011-December/thread.html#2446
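
To illustrate the leading-zero point, here is a minimal sketch that computes the hash part with only the JDK's MessageDigest, hex-encoding byte by byte so the zeros are kept. The names WikiImagePath, md5Hex and hashedPath are made up for this example, and the input "a" is a contrived file name chosen only because its MD5 happens to start with a zero.

```scala
import java.nio.charset.StandardCharsets
import java.security.MessageDigest

// Hypothetical helper object, not part of any library.
object WikiImagePath {
  // Hex-encode the MD5 digest one byte at a time; "%02x" pads each
  // byte to two digits, so leading zeros are preserved.
  def md5Hex(s: String): String =
    MessageDigest.getInstance("MD5")
      .digest(s.getBytes(StandardCharsets.UTF_8))
      .map(b => f"${b & 0xff}%02x")
      .mkString

  // Returns "<first hex digit>/<first two hex digits>/<name>".
  def hashedPath(fileName: String): String = {
    val name = fileName.replace(" ", "_")
    val md5 = md5Hex(name)
    s"${md5.substring(0, 1)}/${md5.substring(0, 2)}/$name"
  }
}
```

WikiImagePath.hashedPath("a") gives "0/0c/a", since the MD5 of "a" is 0cc175b9c0f1b6a831c399e269772661. Going through new java.math.BigInteger(1, digest).toString(16) instead would drop the leading zero and yield a 31-character string starting with "cc", so the directories would come out as "c/cc".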