Question

I am getting to the last stage of my rope (a more scalable version of String) implementation. Obviously, I want all operations to give the same result as the operations on Strings whenever possible.

Doing this for ordinal operations is pretty simple, but I am worried about implementing culture-sensitive operations correctly, especially since I know only two languages, and in both of them culture-sensitive operations behave exactly the same as ordinal operations do!

So are there any specific things that I could test to get at least some confidence that I am doing things correctly? I know, for example, about ß being equal to SS when ignoring case in German, and about the dotted and undotted i in Turkish.

Solution

Surrogate pairs, if you plan to support them, including invalid combinations (e.g. a lone high or low surrogate with no partner).
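
As a rough illustration of the kind of check this implies, here is a minimal sketch that scans a string for an unpaired surrogate. The helper name IndexOfLoneSurrogate is hypothetical and not part of .NET; it only relies on char.IsHighSurrogate and char.IsLowSurrogate. A rope would presumably also need to make sure a split between two chunks never lands in the middle of a pair.

```csharp
using System;

// Hypothetical helper (not a framework API): returns the index of the first
// unpaired surrogate in text, or -1 if every surrogate belongs to a valid pair.
static int IndexOfLoneSurrogate(string text)
{
    for (int i = 0; i < text.Length; i++)
    {
        char c = text[i];
        if (char.IsHighSurrogate(c))
        {
            if (i + 1 < text.Length && char.IsLowSurrogate(text[i + 1]))
            {
                i++; // skip the low half of a valid pair
                continue;
            }
            return i; // high surrogate with no low surrogate after it
        }
        if (char.IsLowSurrogate(c))
        {
            return i; // low surrogate with no high surrogate before it
        }
    }
    return -1;
}

string validPair = "a" + "\uD83D\uDE00" + "b"; // U+1F600 as a surrogate pair
string loneHigh = "a" + "\uD83D" + "b";
string loneLow = "\uDE00" + "a";

Console.WriteLine(IndexOfLoneSurrogate(validPair)); // -1
Console.WriteLine(IndexOfLoneSurrogate(loneHigh));  // 1
Console.WriteLine(IndexOfLoneSurrogate(loneLow));   // 0
```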

If you're doing encoding and decoding, make sure you retain enough state to cope with being given arbitrary blocks of binary data to decode. A block may end halfway through a character, with the remaining half arriving in the next block.
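
One way to get this behaviour in .NET, sketched here under the assumption that the input arrives as a sequence of byte arrays, is to keep a single System.Text.Decoder alive across blocks. The Decoder remembers incomplete byte sequences between GetChars calls. The DecodeBlocks helper is hypothetical and only meant as a test fixture, not as a finished design.

```csharp
using System;
using System.Text;

// Hypothetical helper: decodes blocks in order with one stateful Decoder,
// so a character split across two blocks is still decoded correctly.
static string DecodeBlocks(Encoding encoding, byte[][] blocks)
{
    Decoder decoder = encoding.GetDecoder();
    var result = new StringBuilder();
    foreach (byte[] block in blocks)
    {
        // Worst-case buffer size for this block, including carried-over state.
        char[] buffer = new char[encoding.GetMaxCharCount(block.Length)];
        // This overload does not flush, so trailing partial bytes are kept
        // inside the decoder until the next call supplies the rest.
        int written = decoder.GetChars(block, 0, block.Length, buffer, 0);
        result.Append(buffer, 0, written);
    }
    return result.ToString();
}

byte[] euro = Encoding.UTF8.GetBytes("\u20AC"); // three bytes in UTF-8
byte[] first = { euro[0] };                     // block ends mid-character
byte[] second = { euro[1], euro[2] };

string decoded = DecodeBlocks(Encoding.UTF8, new[] { first, second });
Console.WriteLine(decoded == "\u20AC"); // True
```

Decoding each block separately with Encoding.GetString would lose the split character, which is exactly the case worth covering in a test.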

OTHER TIPS

The Turkish test is the best I know :)
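
For reference, a minimal sketch of what such a Turkish test might check. The helper name HasTurkishCasing is hypothetical. The sketch assumes the runtime has culture data for tr-TR available; under invariant globalization mode the Turkish rules are not applied, so the Turkish result is printed without a claimed value.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: true if the culture maps I <-> dotless ı (U+0131)
// and i <-> dotted İ (U+0130), as the Turkish casing rules do.
static bool HasTurkishCasing(CultureInfo culture)
{
    return "I".ToLower(culture) == "\u0131"
        && "i".ToUpper(culture) == "\u0130";
}

// The invariant culture uses the ordinary mapping, so this prints False.
Console.WriteLine(HasTurkishCasing(CultureInfo.InvariantCulture));

// Expected to differ from the line above when tr-TR culture data is present.
Console.WriteLine(HasTurkishCasing(CultureInfo.GetCultureInfo("tr-TR")));
```

A rope's culture-sensitive ToUpper, ToLower and case-insensitive comparison could be run against the same inputs and compared with what String returns for the same culture.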

You should mimic the String method implementations and let the core library do this work for you. It is very hard to take into account every aspect of every culture. Instead of reinventing the wheel, use Reflector on the String methods and look at the internal calls. For example, String.Compare uses CultureInfo.CurrentCulture.CompareInfo.Compare to compare two strings in the current culture.
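
A minimal sketch of that delegation, assuming the rope can expose its content as an array of string chunks (the CompareChunks helper and the chunk layout are hypothetical). It flattens both sides and hands them to CompareInfo.Compare, which makes it more of a reference oracle for tests than an efficient implementation. Comparing chunk by chunk is not safe in general, because a combining sequence can straddle a chunk boundary.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: compares two chunked strings by flattening them and
// delegating to the culture's CompareInfo, as String.Compare does internally.
static int CompareChunks(string[] left, string[] right,
                         CultureInfo culture, CompareOptions options)
{
    string flatLeft = string.Concat(left);
    string flatRight = string.Concat(right);
    return culture.CompareInfo.Compare(flatLeft, flatRight, options);
}

// Same text, different chunk boundaries: must compare as equal.
string[] a = { "Stra", "sse" };
string[] b = { "Str", "asse" };
Console.WriteLine(CompareChunks(a, b, CultureInfo.InvariantCulture, CompareOptions.None)); // 0

// A base letter and its combining accent split across two chunks, compared
// with the precomposed character. A culture-sensitive comparison is expected
// to see the sequence as a whole, which only works after flattening.
string[] decomposed = { "e", "\u0301" };
string[] precomposed = { "\u00E9" };
Console.WriteLine(CompareChunks(decomposed, precomposed, CultureInfo.InvariantCulture, CompareOptions.None));
```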

Licensed under: CC-BY-SA with attribution