MySQL's utf8_general_ci in C#
Question
Is there an easy way to replicate the behavior of MySQL's utf8_general_ci collation in C#?
In particular, given a Unicode string, I want to generate a string (ASCII, perhaps?) that can then be trivially sorted or compared the way utf8_general_ci would.
I found this question, which shows how to strip accents from strings. That looks like a similar but not quite equivalent function; for example, it doesn't decompose ß into ss.
For my purposes, that may end up being good enough, but if there's a way to replicate the collation's behavior completely, I'd prefer that.
No correct solution
OTHER TIPS
Take a look at the SortKey class and its KeyData property.
For a given collation (a CultureInfo in .NET terms) and a string, you can get a naturally sortable byte array using MyCultureInfo.CompareInfo.GetSortKey(mystring).KeyData.
I do not think the collation keys for .NET and for MySQL will necessarily match; however, both use the same technique, based on the Unicode Collation Algorithm.
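As a rough illustration of the SortKey approach, here is a minimal sketch. The class name SortKeyDemo, the helper names GetKeyBytes and CompareKeys, the choice of the invariant culture, and the IgnoreCase | IgnoreNonSpace options are all assumptions made for the example; they only approximate a case- and accent-insensitive collation and are not a reproduction of MySQL's rules.

```csharp
using System;
using System.Globalization;

public static class SortKeyDemo
{
    // Hypothetical helper: returns the sort key bytes that .NET generates
    // for the given string under the given culture and compare options.
    public static byte[] GetKeyBytes(string text, CultureInfo culture, CompareOptions options)
    {
        SortKey key = culture.CompareInfo.GetSortKey(text, options);
        return key.KeyData;
    }

    // Hypothetical helper: plain byte-by-byte comparison of two keys.
    // Returns a negative value, zero, or a positive value.
    public static int CompareKeys(byte[] a, byte[] b)
    {
        int n = Math.Min(a.Length, b.Length);
        for (int i = 0; i < n; i++)
        {
            if (a[i] != b[i])
            {
                return a[i] < b[i] ? -1 : 1;
            }
        }
        // All shared bytes are equal, so the shorter key sorts first.
        return a.Length.CompareTo(b.Length);
    }

    public static void Main()
    {
        // Assumption: the invariant culture, ignoring case and
        // non-spacing marks (accents), is close enough for the use case.
        CultureInfo culture = CultureInfo.InvariantCulture;
        CompareOptions options = CompareOptions.IgnoreCase | CompareOptions.IgnoreNonSpace;

        byte[] k1 = GetKeyBytes("résumé", culture, options);
        byte[] k2 = GetKeyBytes("RESUME", culture, options);

        // Zero means the two keys are byte-for-byte identical.
        Console.WriteLine(CompareKeys(k1, k2));
    }
}
```

The key bytes depend on the runtime's collation data (NLS or ICU, depending on platform and .NET version), so they are best treated as opaque values to compare against other keys produced by the same runtime with the same culture and options, not against anything MySQL stores.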