I've looked into this, and the problem is that the HTTP Listener infers the request's character encoding as Windows-1251 instead of UTF-8.
It has to infer the encoding because a request's character encoding is normally specified in the Content-Type HTTP header, and this request doesn't specify one.
It would work as expected if you changed the Content-Type in Fiddler to:
Content-Type: application/x-www-form-urlencoded; charset=utf-8
Unfortunately, HTML forms don't let you specify a Content-Type with a charset. An attempt to do so would look like:
<form action="/hello" method="POST"
      enctype="application/x-www-form-urlencoded; charset=utf-8">
    <input name="Name" id="Name"/>
    <input type="submit" value="Send"/>
</form>
But browsers effectively ignore this and send the default form Content-Type instead, e.g.:
Content-Type: application/x-www-form-urlencoded
Without a charset in the Content-Type, the HTTP Listener tries to infer the character encoding from the POSTed data, which in this case is:
Name=%D0%BF%D1%80%D0%B8%D0%B2%D1%96%D1%82
It infers this as Windows-1251 and parses the value using that encoding, which garbles the UTF-8 text.
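To see why the wrong guess matters, here is a minimal sketch that decodes the same POSTed bytes under both encodings. It uses Python's standard library purely for illustration; it does not reproduce how the HTTP Listener infers the encoding, only what happens once an encoding has been chosen.

```python
from urllib.parse import unquote

# The form body from the request above: the value is percent-encoded UTF-8 bytes.
body = "Name=%D0%BF%D1%80%D0%B8%D0%B2%D1%96%D1%82"
raw_value = body.split("=", 1)[1]

# Decode the same bytes with the intended encoding and with the inferred one.
as_utf8 = unquote(raw_value, encoding="utf-8")
as_cp1251 = unquote(raw_value, encoding="cp1251")

print(as_utf8)    # привіт
print(as_cp1251)  # РїСЂРёРІС–С‚
```

Every byte in this body happens to be a valid Windows-1251 character, so decoding with the wrong encoding raises no error. It just produces twice as many characters, none of them the ones that were typed.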
There are a couple of solutions. The first is to override the ContentEncoding, which has just been enabled in this commit, and force a UTF-8 encoding, e.g.:
public override ListenerRequest CreateRequest(HttpListenerContext httpContext,
    string operationName)
{
    var req = new ListenerRequest(httpContext,
        operationName,
        RequestAttributes.None)
    {
        ContentEncoding = Encoding.UTF8
    };
    //Important: Set ContentEncoding before parsing attrs as it parses FORM Body
    req.RequestAttributes = req.GetAttributes();
    return req;
}
This feature will be in the v4.0.19 release, which is now available on MyGet.
The second solution is to give the request a hint so that it's inferred as UTF-8,
which you can do by making the first field's value plain English, e.g.:
<form action="/hello" method="POST">
    <input type="hidden" name="force" value="UTF-8"/>
    <input name="Name" id="Name"/>
    <input type="submit" value="Send"/>
</form>
There is nothing special about force=UTF-8
other than that it's English and uses the ASCII charset.