Question

I was wondering why there is no case-insensitive Unicode character field type in ADS?

While you can set the collation of indexes on NVarChar fields to be case-insensitive, a simple query using WHERE field = 'HeLlO WoRlD' doesn't find the value 'Hello World'.

I know that WHERE field = 'HeLlO WoRlD' COLLATE ads_default_ci works, but doing that for every single comparison is not an option.

The CiChar field type is not Unicode-capable (unless you store UTF-8 strings in it, which causes other problems).

Solution

Fundamentally, unlike a regular character field, a Unicode field can store characters from all languages, so there is no specific collation or language associated with it. The collation comes from how the field is used, indexed, or sorted. If an NVarCiChar field type were defined, a language/locale would need to be associated with it (English has different case-sensitivity rules from French or German), and that would introduce unnecessary complexity to the system: for example, what should happen when an English case-insensitive field is compared with a German one?
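
To make the point about language-dependent case rules concrete, here is a small sketch in plain Python (not ADS code, and not how ADS implements collations). The helper names equal_simple and equal_folded are hypothetical and exist only for this illustration. It shows that two reasonable definitions of "case-insensitive" already disagree on non-ASCII text, which is why a collation has to be chosen rather than implied by the field type.

```python
# Illustration only: plain Python, not ADS code.
# Both helper names are hypothetical and exist just for this example.

def equal_simple(a: str, b: str) -> bool:
    """Compare two strings after simple lower-casing."""
    return a.lower() == b.lower()


def equal_folded(a: str, b: str) -> bool:
    """Compare two strings after default (language-neutral) Unicode case folding."""
    return a.casefold() == b.casefold()


# Plain ASCII text: both rules agree.
print(equal_simple("HeLlO WoRlD", "Hello World"))  # True
print(equal_folded("HeLlO WoRlD", "Hello World"))  # True

# German sharp s: lower-casing keeps "ß", while case folding maps it to "ss".
print(equal_simple("STRASSE", "straße"))  # False
print(equal_folded("STRASSE", "straße"))  # True

# Turkish dotless i: default folding maps "I" to "i", not to "ı",
# so these differ here even though Turkish rules treat them as a case pair.
print(equal_folded("I", "ı"))  # False
```

Neither answer is wrong in itself; each reflects a different set of rules, which is the role a collation plays in a comparison.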

Although the CiChar type is easier to use in some respects, it has drawbacks as well. The main one is that it is not standard, so it is not portable to other databases, and it requires some special handling in code. It is also less flexible, and it causes problems when comparing a CiChar field with a regular Char field: the COLLATE clause is required for such a comparison. Since the relatively standard COLLATE clause supports case-insensitive comparison more clearly and more flexibly, we decided that a case-insensitive Unicode field type is not necessary. It is also easy to do case-insensitive comparisons of Unicode strings without multiple COLLATE clauses by specifying a case-insensitive Unicode collation for the SQL statement handle.

Licensed under: CC-BY-SA with attribution