.Net Uri Encoding RFC 2396 vs RFC 3986

https://stackoverflow.com/questions/7034209

17-12-2020
|

Domanda

First, some quick background... As part of an integration with a third party vendor, I have a C# .Net web application that receives a URL with a bunch of information in the query string. That URL is signed with an MD5 hash and a shared secret key. Basically, I pull in the query string, remove their hash, perform my own hash on the remaining query string, and make sure mine matches the one that was supplied.

I'm retrieving the Uri in the following way...

Uri uriFromVendor = new Uri(Request.Url.ToString());
string queryFromVendor = uriFromVendor.Query.Substring(1); //Substring to remove question mark

My issue is stemming from query strings that contain special characters like an umlaut (ü). The vendor is calculating their hash based on the RFC 2396 representation which is %FC. My C# .Net app is calculating it's hash based on the RFC 3986 representation which is %C3%BC. Needless to say, our hashes don't match, and I throw my errors.

Strangely, the documentation for the Uri class in .Net says that it should follow RFC 2396 unless otherwise set to RFC 3986, but I don't have the entry in my web.config file that they say is required for this behavior.

How can I force the Uri constructor to use the RFC 2396 convention?

Failing that, is there an easy way to convert the RFC 3986 octet pairs to RFC 2396 octets?

Soluzione

Nothing to do with your question, but why are you creating a new Uri here? You can just do string queryFromVendor = Request.Url.Query.Substring(1); – atticae

+1 for atticae! I went back to try removing the extraneous Uri I was creating and suddenly, the string had the umlaut encoded as UTF-8 instead of UTF-16.

At first, I didn't think this would work. Somewhere along the line, I had tried retrieving the url using Request.QueryString, but this was causing the umlaut to come through as %ufffd which is the � character. In the interest of taking a fresh perspective, I tried atticae's suggestion and it worked.

I'm pretty sure the answer has to do with something I read here.

C# uses UTF-16 in all its strings, with tools to encode when it comes to dealing with streams and files that bring us onto...

ASP.NET uses UTF-8 by default, and it's hard to think of a time when it isn't a good choice...

My problems stemmed from here...

Uri uriFromVendor = new Uri(Request.Url.ToString());

By taking the Request.Url uri and creating another uri, it was encoding as the C# standard UTF-16. By using the original uri, it remained in the .Net standard UTF-8.

Thanks to all for your help.

Altri suggerimenti

I'm wondering if this is a bit of a red herring:

I say this because FC is the UTF16 representation of the u with umlaut; C2BC is the UTF8 representation.

I wonder if one of the System.Text.Encoding methods to convert the source data into a normal .Net string might help.

This question might be of interest too: Encode and Decode rfc2396 URLs

I don't know about the standard encoding for Uri constructors, but if everything else fails you could always decode the URL yourself and encode it in whatever encoding you like.

The HttpUtility-Class has an UrlDecode() and UrlEncode() method, which lets you specify the System.Text.Encoding as second parameter.

For example:

string decodedQueryString = HttpUtility.UrlDecode(Request.Url.Query.Substring(1));
string encodedQueryString = HttpUtility.UrlEncode(decodedQueryString, System.Text.Encoding.GetEncoding("utf-16"));
// calc hash here

Autorizzato sotto: CC-BY-SA insieme a attribuzione

Non affiliato a StackOverflow