Question

I am having trouble encoding a URL that contains both non-ASCII characters and spaces, for example http://xxx.xx.xx.xx/resources/upload/pdf/APPLE ははは.pdf. I've read here that you only need to encode the last segment of the URL's path.

Here's the code:

public static String getLastPathFromUrl(String url) {
    return url.replaceFirst(".*/([^/?]+).*", "$1");
}

So now I have APPLE ははは.pdf. The next step is to replace the spaces with %20 so that the link works, BUT the problem is that if I then encode APPLE%20ははは.pdf, it becomes APPLE%2520%E3%81%AF%E3%81%AF%E3%81%AF.pdf. I should get APPLE%20%E3%81%AF%E3%81%AF%E3%81%AF.pdf.

So I decided to:

1. Split the last path segment into words
2. Encode each word
3. Concatenate the encoded words, for example:
    3.A. APPLE (APPLE)
    3.B. %E3%81%AF%E3%81%AF%E3%81%AF.pdf (ははは.pdf)
    With the space converted to %20, this becomes APPLE%20%E3%81%AF%E3%81%AF%E3%81%AF.pdf

Here's my code:

public static String[] splitWords(String sentence) {
    String[] words = sentence.split(" ");
    return words;
}

The calling code:

String urlLastPath = getLastPathFromUrl(pdfUrl);
String[] splitWords = splitWords(urlLastPath);
for (String word : splitWords) {
    String urlEncoded = URLEncoder.encode(word, "utf-8"); // stuck here
}

I now want to concatenate the encoded strings (urlEncoded) produced in each iteration so that the final result is APPLE%20%E3%81%AF%E3%81%AF%E3%81%AF.pdf. How do I do this?


Solution

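The problem is that the % in %20 gets encoded again, which is why you end up with %2520. So don't replace the spaces first: just call URLEncoder.encode(word, "utf-8") on the unencoded string. You will get a result like APPLE+%E3%81%AF%E3%81%AF%E3%81%AF.pdf, and in that final result you replace + with %20.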
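To make that concrete, here is a minimal sketch of the encode-then-replace idea. The class and method names (EncodeLastSegment, encodeSegment, encodeLastSegment) are made up for this example, and it assumes the URL has no query string and that only the part after the last "/" needs encoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class EncodeLastSegment {

    // Encodes a single path segment. URLEncoder produces form encoding,
    // where a space becomes "+", so the "+" is swapped for "%20" afterwards.
    // A literal "+" in the input has already become "%2B" at that point,
    // so the replacement only touches former spaces.
    public static String encodeSegment(String segment) {
        try {
            return URLEncoder.encode(segment, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: keeps everything up to and including the last "/"
    // untouched and encodes only what follows it.
    public static String encodeLastSegment(String url) {
        int slash = url.lastIndexOf('/');
        return url.substring(0, slash + 1) + encodeSegment(url.substring(slash + 1));
    }

    public static void main(String[] args) {
        // \u306f is the hiragana "ha"; escapes are used so the source
        // compiles regardless of the file encoding.
        String url = "http://xxx.xx.xx.xx/resources/upload/pdf/APPLE \u306f\u306f\u306f.pdf";
        // prints http://xxx.xx.xx.xx/resources/upload/pdf/APPLE%20%E3%81%AF%E3%81%AF%E3%81%AF.pdf
        System.out.println(encodeLastSegment(url));
    }
}
```

With this there is no need to split the file name into words at all, because the whole segment is encoded in one call.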

OTHER TIPS

Do you want to do something like this?

// Get the whole url as string
String urlString = pdfUrl.toString();

// Get the string before the last path segment, including the trailing "/"
String result = urlString.substring(0, urlString.lastIndexOf("/") + 1);

String urlLastPath = getLastPathFromUrl(pdfUrl);
String[] splitWords = splitWords(urlLastPath);

for (int i = 0; i < splitWords.length; i++) {
    // Put back the space that split(" ") removed, already encoded as %20
    if (i > 0) {
        result += "%20";
    }

    // Add the encoded word to the url
    result += URLEncoder.encode(splitWords[i], "utf-8");
}

Now the string result is your encoded URL as a string.

This may be easier with org.apache.commons.io.FilenameUtils (from Apache Commons IO).

  1. Split your url into baseUrl and the file name and extension.
  2. Encode the file name and extension
  3. Join them together

String url = "http://xxx.xx.xx.xx/resources/upload/pdf/APPLE ははは.pdf";

String baseUrl = FilenameUtils.getPath(url); // GIVES: http://xxx.xx.xx.xx/resources/upload/pdf/
String myFile = FilenameUtils.getBaseName(url)
            + "." + FilenameUtils.getExtension(url); // GIVES: APPLE ははは.pdf
String encoded = URLEncoder.encode(myFile, "UTF-8"); // GIVES: APPLE+%E3%81%AF%E3%81%AF%E3%81%AF.pdf
encoded = encoded.replace("+", "%20"); // "+" only means a space in form data, so use %20 in the path
System.out.println(baseUrl + encoded);

Output:

http://xxx.xx.xx.xx/resources/upload/pdf/APPLE%20%E3%81%AF%E3%81%AF%E3%81%AF.pdf

Don't reinvent the wheel. Use URLEncoder for encoding the URL.

URLEncoder.encode(yourArgumentsHere, "utf-8");

Also, where does your URL come from, such that you have to split it before encoding? You should build the arguments (the last part) first, encode them, and then just append them to the base URL.
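As an alternative to URLEncoder (this is a different technique, not what the answers above use), the URL can be assembled from its parts with java.net.URI. The multi-argument constructor quotes characters that are illegal in a path, such as the space, and toASCIIString() percent-encodes the remaining non-ASCII characters as UTF-8. This is only a sketch: the class name BuildEncodedUrl and the helper buildUrl are invented for the example, and it assumes you have the scheme, host, directory and file name available separately.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class BuildEncodedUrl {

    // Hypothetical helper: builds the URL from unencoded parts instead of
    // taking an existing URL apart.
    public static String buildUrl(String scheme, String host, String directory, String fileName) {
        try {
            // The 4-argument constructor takes (scheme, host, path, fragment)
            // and quotes illegal path characters such as the space.
            URI uri = new URI(scheme, host, directory + fileName, null);
            // toASCIIString() additionally encodes non-ASCII characters.
            return uri.toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot build a URL from the given parts", e);
        }
    }

    public static void main(String[] args) {
        // \u306f is the hiragana "ha"; escapes are used so the source
        // compiles regardless of the file encoding.
        String fileName = "APPLE \u306f\u306f\u306f.pdf";
        // prints http://xxx.xx.xx.xx/resources/upload/pdf/APPLE%20%E3%81%AF%E3%81%AF%E3%81%AF.pdf
        System.out.println(buildUrl("http", "xxx.xx.xx.xx", "/resources/upload/pdf/", fileName));
    }
}
```

Because the space comes out as %20 directly, no + replacement is needed with this approach.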

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow