I use a console program (invoked via cmd) to translate a string passed on standard input into a string of special Unicode characters, which I read back from standard output. In C#, the returned string has the backslash before each Unicode character code escaped.

How can I undo this escaping?

Example of the returned string:

stdout = "\\x284b\\x2817\\x2801\\x281d\\x2835 \\x281a\\x2801\\x281b\\x281e \\x280a\\x280d \\x2805\\x2815\\x280d\\x280f\\x2807\\x2811\\x281e\\x281e \\x2827\\x2811\\x2817\\x283a\\x2801\\x2813\\x2817\\x2807\\x2815\\x280e\\x281e\\x2811\\x281d \\x285e\\x2801\\x282d"

... but it should be

stdout = "\x284b\x2817\x2801\x281d\x2835 \x281a\x2801\x281b\x281e \x280a\x280d \x2805\x2815\x280d\x280f\x2807\x2811\x281e\x281e \x2827\x2811\x2817\x283a\x2801\x2813\x2817\x2807\x2815\x280e\x281e\x2811\x281d \x285e\x2801\x282d"

My attempt to resolve this problem by doing

var stdout2 = stdout.Replace(@"\\", @"\");

has no effect.

Thanks for any help.


Solution 4

In the end it is easy and a little bit complex at the same time. The key is that a char can be created from an integer. An escape of the form '\x284b' denotes the hexadecimal value '284B', which is '10315' in decimal and can therefore be cast to a char. So I used these small functions to translate each code into an Int32 and from that into a string … voilà.

/// <summary>
/// Gets the char from unicode hexadecimal string.
/// </summary>
/// <param name="characterCode">The character code e.g. '\x2800'.</param>
/// <returns>the current available unicode character if available e.g. ' '</returns>
public static string GetCharFromUnicodeHex(String characterCode)
{

    if (!String.IsNullOrEmpty(characterCode))
    {
        if (characterCode.StartsWith(@"\"))
        {
            characterCode = characterCode.Substring(1);
        }
        if (characterCode.StartsWith("x"))
        {
            characterCode = characterCode.Substring(1);
        }

        int number;
        bool success = Int32.TryParse(characterCode, System.Globalization.NumberStyles.HexNumber, System.Globalization.CultureInfo.InvariantCulture, out number);

        if (success)
        {
            return GetCharFromUnicodeInt(number);
        }
    }
    return String.Empty;
}


/// <summary>
/// try to parse a char from unicode int.
/// </summary>
/// <param name="number">The number code e.g. 10241.</param>
/// <returns>the char of the given value e.g. ' '</returns>
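public static string GetCharFromUnicodeInt(int number)
{
    // A plain (char) cast never throws in the default unchecked context and
    // silently truncates values above 0xFFFF, so check the range explicitly.
    // Surrogate values (0xD800-0xDFFF) are not valid code points on their own.
    if (number < 0 || number > 0x10FFFF || (number >= 0xD800 && number <= 0xDFFF))
    {
        return String.Empty;
    }
    // Returns one char for BMP values and a surrogate pair for higher code points.
    return Char.ConvertFromUtf32(number);
}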
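
To decode a whole output string rather than one code at a time, the same idea (parse the hex digits, then cast to a char) can be applied to every match of a regular expression. The following is only a minimal sketch: the class name BrailleDecoder and the method Decode are hypothetical and not part of the answer above, and it assumes every escape has the form \x followed by exactly four hex digits, as in the example output from the question.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class BrailleDecoder
{
    // Matches a literal backslash, 'x', and exactly four hex digits, e.g. \x284b.
    private static readonly Regex HexEscape = new Regex(@"\\x([0-9A-Fa-f]{4})");

    public static string Decode(string encoded)
    {
        return HexEscape.Replace(encoded, match =>
        {
            // Parse the four hex digits captured in group 1 into an integer.
            int codeUnit = int.Parse(
                match.Groups[1].Value,
                NumberStyles.HexNumber,
                CultureInfo.InvariantCulture);

            // Four hex digits always fit in a single UTF-16 code unit.
            return ((char)codeUnit).ToString();
        });
    }
}
```

For example, BrailleDecoder.Decode(@"\x284b\x2817\x2801 \x281a") should give a five-character string: three Braille characters, the space, and one more Braille character. Text that is not part of an escape passes through unchanged.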

Other tips

You need to do

stdout = stdout.Replace(@"\\", @"\");

instead.
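
The point of assigning the result is that C# strings are immutable: Replace returns a new string and leaves the original untouched. A minimal sketch, where the class name ReplaceDemo and the method Unescape are hypothetical names used only for illustration:

```csharp
using System;

// Hypothetical helper for illustration only.
public static class ReplaceDemo
{
    public static string Unescape(string input)
    {
        // @"\\" is a verbatim literal containing two backslash characters;
        // @"\" contains a single backslash character.
        // Replace returns a new string; the input itself is never modified,
        // so the caller has to use or assign the return value.
        return input.Replace(@"\\", @"\");
    }
}
```

With this, ReplaceDemo.Unescape(@"\\x284b") gives the six-character string \x284b. If the input already contains only single backslashes (for example because a debugger view is displaying them in escaped form, which is an assumption about the asker's setup), the call returns the text unchanged.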

I assume you don't want to remove the \\ in the string, so it should print as \\x284b.... If that's the case, prefix the string literal with @ to make it a verbatim string. The following code prints the text with \\ intact:

       string stdout = @"\\x284b\\x2817\\x2801\\x281d\\x2835 \\x281a\\x2801\\x281b\\x281e
       \\x280a\\x280d \\x2805\\x2815\\x280d\\x280f\\x2807\\x2811\\x281e\\x281e   
       \\x2827\\x2811\\x2817\\x283a\\x2801\\x2813\\x2817\\x2807\\x2815\\x280e\\x281e\\x2811
        \\x281d \\x285e\\x2801\\x282d";

        Console.Write(stdout);
        Console.Read();

The result comes from a console program called liblouis.

Ah OK, LibLouis has its own curious non-standard string escape scheme, documented in section 3 here. If you want to turn it into a raw unescaped Unicode string there are many backslash escape sequences you'll want to handle in addition to \x. Something like (not tested):

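// Requires: using System; using System.Collections.Generic; using System.Text.RegularExpressions;

// Matches \xXXXX, \yXXXXX, \zXXXXXXXX, or a backslash followed by any single character.
var escape = new Regex(@"\\(x[0-9A-Fa-f]{4}|y[0-9A-Fa-f]{5}|z[0-9A-Fa-f]{8}|.)");
// Single-letter escapes and the text each one stands for.
var chars = new Dictionary<char, string> {
    { 'f', "\f" }, { 'n', "\n" }, { 'r', "\r" }, { 't', "\t" }, { 'v', "\v" },
    { 's', " " }, { 'e', "\x1B"}
};

// encoded_string is the raw liblouis output to decode.
var decoded_string = escape.Replace(encoded_string, match =>
    match.Length>2 ?
        // Hex escape: skip the backslash and the x/y/z marker, then parse the digits.
        Char.ConvertFromUtf32(
            int.Parse(
                match.Value.Substring(2),
                System.Globalization.NumberStyles.HexNumber
            )
        ) :
    // Known single-letter escape: look up its replacement.
    chars.ContainsKey(match.Value[1]) ?
        chars[match.Value[1]] :
    // Anything else: drop the backslash and keep the character.
    match.Value.Substring(1)
);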
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow