Question

I have a custom IComparer<string> that I use to compare strings while ignoring their case and symbols, like this:

public class LiberalStringComparer : IComparer<string>
{
    private readonly CompareInfo _compareInfo = CultureInfo.InvariantCulture.CompareInfo;
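    // IgnoreCase is used here because OrdinalIgnoreCase must be used alone and cannot be combined with other flags.
    private const CompareOptions COMPARE_OPTIONS = CompareOptions.IgnoreSymbols | CompareOptions.IgnoreCase;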

    public int Compare(string x, string y)
    {
        if (ReferenceEquals(x, y)) return 0; // same reference, or both null
        if (x == null) return -1;
        if (y == null) return 1;

        return this._compareInfo.Compare(x, y, COMPARE_OPTIONS);
    }
}

Can I obtain the output string which is, ultimately, used for the comparison?

My final goal is to produce an IEqualityComparer<string> which ignores symbols and casing in the same way as this comparer.

I could write a regex to do this, but there's no guarantee that my regex would use the same logic as the built-in comparison options.


Solution

There is probably no such "output string". I'd implement your Equals this way:

return liberalStringComparer.Compare(x, y) == 0;

GetHashCode is more complicated.

Some approaches:

  1. Use a poor implementation like return 0; (which means you always have to run a Compare to know if they're equal).
  2. Since your comparison is relatively simple (invariant culture, case-insensitive), you should be able to write a hash that works in most cases. Without extensive study of Unicode and testing, however, I wouldn't assume it works for every valid Unicode string from every culture.

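    As a rough sketch:

    public int GetHashCode(string value)
    {
        // Caveat: char.IsSymbol covers only the Unicode symbol categories, while
        // IgnoreSymbols also skips whitespace and punctuation, so this filter
        // may need to be broader to stay consistent with Compare.
        int hash = 17;
        for (int i = 0; i < value.Length; i++)
        {
            if (!char.IsSymbol(value, i))
                // combine as in http://stackoverflow.com/a/263416/781792
                hash = unchecked(hash * 23 + char.ToUpperInvariant(value[i]));
        }
        return hash;
    }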
    
  3. Form a string by removing every character for which char.IsSymbol is true, then call StringComparer.InvariantCultureIgnoreCase.GetHashCode on it (the plain InvariantCulture comparer is case-sensitive).
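  4. The hash code of the SortKey returned by CompareInfo.GetSortKey should be a suitable value, since strings that compare as equal under the same options should produce equal sort keys.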

    public int GetHashCode(string value)
    {
        return _compareInfo.GetSortKey(value, COMPARE_OPTIONS).GetHashCode();
    }
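Putting the Equals suggestion and approach 4 together, a minimal sketch of the resulting equality comparer could look like the following. The class name LiberalStringEqualityComparer is made up for illustration, and the options use CompareOptions.IgnoreCase because OrdinalIgnoreCase is documented as usable only on its own. Treat it as a starting point under those assumptions, not a verified implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical class name; a sketch of an equality comparer backed by sort keys.
public sealed class LiberalStringEqualityComparer : IEqualityComparer<string>
{
    private static readonly CompareInfo Info = CultureInfo.InvariantCulture.CompareInfo;

    // Assumption: IgnoreCase is used because OrdinalIgnoreCase cannot be
    // combined with other CompareOptions flags.
    private const CompareOptions Options =
        CompareOptions.IgnoreSymbols | CompareOptions.IgnoreCase;

    public bool Equals(string x, string y)
    {
        if (ReferenceEquals(x, y)) return true;    // same reference, or both null
        if (x == null || y == null) return false;  // exactly one is null
        return Info.Compare(x, y, Options) == 0;
    }

    public int GetHashCode(string value)
    {
        if (value == null) return 0;
        // Strings that compare as equal under Options should yield equal sort
        // keys, which keeps the hash code consistent with Equals.
        return Info.GetSortKey(value, Options).GetHashCode();
    }
}
```

It can then be passed to a HashSet<string> or Dictionary<string, T> constructor. On .NET Framework 4.7.1 and later, or .NET Core 2.0 and later, CompareInfo.GetHashCode(string, CompareOptions) should give an equivalent result without allocating a SortKey.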
    

Other suggestions

An interesting question. Internally, CompareInfo.Compare calls the InternalCompareString method, which is imported from COMNlsInfo::InternalCompareString in clr.dll:

// Compare a string using the native API calls -- COMNlsInfo::InternalCompareString   
...
private static extern int InternalCompareString(IntPtr handle, 
             IntPtr handleOrigin, String localeName, String string1, int offset1, 
             int length1, String string2, int offset2, int length2, int flags);

In other words, since you can't be sure about the logic of the built-in function, you may be better off writing your own and reusing it in both the IEqualityComparer and IComparer implementations.
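One way this could be sketched is a single class that implements both interfaces on top of one shared normalization step. The name NormalizingStringComparer and the rule "keep only letters and digits, upper-cased" are assumptions chosen for illustration. They will not match the built-in IgnoreSymbols behaviour exactly, and characters outside the Basic Multilingual Plane are dropped by this simple per-char filter.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical comparer whose rules are fully under your control: ordering,
// equality and hashing all go through the same Normalize step.
public sealed class NormalizingStringComparer : IComparer<string>, IEqualityComparer<string>
{
    // Assumption: a "symbol" is anything that is not a letter or a digit.
    private static string Normalize(string s)
    {
        if (s == null) return null;
        var sb = new StringBuilder(s.Length);
        foreach (char c in s)
        {
            if (char.IsLetterOrDigit(c))
                sb.Append(char.ToUpperInvariant(c));
        }
        return sb.ToString();
    }

    // CompareOrdinal treats null as less than any non-null string.
    public int Compare(string x, string y) =>
        string.CompareOrdinal(Normalize(x), Normalize(y));

    public bool Equals(string x, string y) =>
        string.Equals(Normalize(x), Normalize(y), StringComparison.Ordinal);

    public int GetHashCode(string value) =>
        value == null ? 0 : StringComparer.Ordinal.GetHashCode(Normalize(value));
}
```

Because Compare, Equals and GetHashCode all work on the same normalized key, they cannot disagree with each other, which is the property the built-in options make hard to verify. The cost is an extra string allocation per call.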

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow