ESAPI canonicalize malforming url

Question 1

This is a known bug in ESAPI. I have started working on a fix, but since I don't know when a patch will be committed, the best I can offer for now is a workaround. In my comments to the OP I linked a similar answer that uses java.net.URI and javax.ws.rs.core.UriBuilder to break the URL into its parts, canonicalize each part, and then reconstruct the URL. I'll repost the link here. The example is in the second half of that question, after the OP switched topics mid-question.
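To give a rough idea of the parse/canonicalize/rebuild approach, here is a minimal sketch. It is not the code from the linked answer: it uses only java.net.URI and rebuilds the query string by hand instead of using UriBuilder. The class name UrlPieces and the canon parameter are made up for illustration; canon stands in for whatever canonicalizer you use (for example s -> ESAPI.encoder().canonicalize(s)) and is assumed to return decoded text. The sketch also assumes a hierarchical URL (scheme://host/path?query).

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;
import java.util.function.UnaryOperator;

final class UrlPieces {

    /**
     * Canonicalizes a hierarchical URL piece by piece and rebuilds it.
     * `canon` is a placeholder for the real canonicalizer (hypothetical here).
     */
    static String canonicalizeUrl(String url, UnaryOperator<String> canon) {
        try {
            URI in = URI.create(url);

            // Path: canonicalize it, then let URI re-quote any illegal characters.
            String rawPath = in.getRawPath();
            String path = rawPath == null ? null : canon.apply(rawPath);
            URI base = new URI(in.getScheme(), in.getAuthority(), path, null, null);
            StringBuilder out = new StringBuilder(base.toString());

            // Query: canonicalize each name and value separately, so the '&' and
            // '=' separators are never passed to the canonicalizer as data.
            String rawQuery = in.getRawQuery();
            if (rawQuery != null) {
                char sep = '?';
                for (String pair : rawQuery.split("&")) {
                    int eq = pair.indexOf('=');
                    String name = eq < 0 ? pair : pair.substring(0, eq);
                    out.append(sep).append(encode(canon.apply(name)));
                    if (eq >= 0) {
                        String value = pair.substring(eq + 1);
                        out.append('=').append(encode(canon.apply(value)));
                    }
                    sep = '&';
                }
            }

            // Fragment: copied through unchanged in this sketch.
            if (in.getRawFragment() != null) {
                out.append('#').append(in.getRawFragment());
            }
            return out.toString();
        } catch (URISyntaxException | UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Cannot rebuild URL: " + url, e);
        }
    }

    // Re-encodes one canonicalized query name or value.
    private static String encode(String s) throws UnsupportedEncodingException {
        return URLEncoder.encode(s, "UTF-8");
    }
}
```

Because the canonicalizer only ever sees one path or one parameter name/value at a time, it cannot mistake the URL's own structure for encoded data. Whether that is enough for your case depends on the URLs you need to accept.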

Question 2

I faced the same issue. In my case, for the string \fgdf\gghfh\fgh\dff, the canonicalize method produced the following:

Case 1: canonicalize(string) --> INTRUSION - Multiple (2x) encoding detected in \fgdf\gghfh\fgh\dff

Case 2: canonicalize(string, false) --> input=fgdfgghfhfghdff. In this case string validation then failed, because the ? character is not in the whitelist of allowed characters.

I finally managed to get it working. Below is the code:

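    // URL-encode the raw input first (encodeForURL throws EncodingException)
    value = ESAPI.encoder().encodeForURL(value);
    value = value.replaceAll("", "");
    // Arguments: context, input, validation rule name, max length, allowNull, canonicalize
    isSafe = validator.isValidInput("APPNAME", value, "URLSTRING", 255, true, false);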

Passing false as the last parameter turns off the validator's internal canonicalization, which is on by default.
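To see why encoding first helps, here is a small stand-alone sketch that does not need ESAPI on the classpath. It assumes encodeForURL behaves like java.net.URLEncoder with UTF-8, and the URL_STRING pattern is a made-up stand-in for the "URLSTRING" rule, whose real definition lives in your ESAPI validation configuration. The class and method names are hypothetical too.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.regex.Pattern;

final class EncodeThenValidate {

    // Hypothetical whitelist: the characters URLEncoder can emit.
    // Replace with whatever your actual "URLSTRING" rule allows.
    static final Pattern URL_STRING = Pattern.compile("^[A-Za-z0-9%._~+*-]*$");

    // Stand-in for ESAPI.encoder().encodeForURL(value).
    static String encodeForUrl(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Encode, then check length and whitelist with no canonicalization step.
    static boolean isSafe(String value, int maxLength) {
        String encoded = encodeForUrl(value);
        return encoded.length() <= maxLength
                && URL_STRING.matcher(encoded).matches();
    }
}
```

With this stand-in, each backslash in \fgdf\gghfh\fgh\dff is encoded as %5C, so the validated string contains only whitelisted characters and nothing is left for a decoder to interpret as an escape sequence.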

I hope this helps.