Question

Is there an easy way to replicate the behavior of MySQL's utf8_general_ci collation in C#?

In particular, given a Unicode string, I want to generate a (possibly ASCII) string that can then be trivially sorted or compared, as utf8_general_ci would.

I found this question, which shows how to strip accents from strings. That looks like a similar but not quite equivalent function; for example, it doesn't decompose ß into ss.

For my purposes, that may end up being good enough, but if there's a way to replicate the collation's behavior completely, I'd prefer that.

No correct solution

OTHER TIPS

Take a look at the SortKey class and its KeyData property.

For a given collation (a CultureInfo in .NET terms) and a string, you can get a naturally sortable byte array using MyCultureInfo.CompareInfo.GetSortKey(mystring).KeyData.
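As a rough illustration of that call, here is a minimal sketch. The class and method names (SortKeyDemo, GetKeyData, CompareKeys) are made up for this example, and the choice of the invariant culture with CompareOptions.IgnoreCase | CompareOptions.IgnoreNonSpace is only an assumption about what might come close to a case- and accent-insensitive collation. It is not a verified equivalent of utf8_general_ci.

```csharp
using System;
using System.Globalization;

public static class SortKeyDemo
{
    // Hypothetical helper: returns the culture-sensitive sort key bytes for a string.
    public static byte[] GetKeyData(string text, CultureInfo culture, CompareOptions options)
    {
        SortKey key = culture.CompareInfo.GetSortKey(text, options);
        return key.KeyData;
    }

    // Hypothetical helper: plain byte-by-byte comparison of two key arrays.
    // A shorter array that is a prefix of the longer one sorts first.
    public static int CompareKeys(byte[] a, byte[] b)
    {
        int n = Math.Min(a.Length, b.Length);
        for (int i = 0; i < n; i++)
        {
            if (a[i] != b[i])
            {
                return a[i] < b[i] ? -1 : 1;
            }
        }
        return a.Length.CompareTo(b.Length);
    }

    public static void Main()
    {
        // Assumption: invariant culture, ignoring case and combining accents.
        CultureInfo culture = CultureInfo.InvariantCulture;
        CompareOptions options = CompareOptions.IgnoreCase | CompareOptions.IgnoreNonSpace;

        byte[] k1 = GetKeyData("résumé", culture, options);
        byte[] k2 = GetKeyData("RESUME", culture, options);

        // A result of 0 means the two strings collate as equal under these options.
        Console.WriteLine(CompareKeys(k1, k2));
    }
}
```

The key bytes can be stored next to the original string and compared without consulting the culture again. The framework also provides the static SortKey.Compare method if you keep the SortKey objects instead of the raw bytes.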

I do not think the sort keys produced by .NET and by MySQL will necessarily match. However, both use the same general technique of comparing weight-based sort keys, as described by the Unicode Collation Algorithm.

Licensed under: CC-BY-SA with attribution