Question

Which MySQL UTF-8 collation supports case-insensitive searches ('Hello' == 'hello') but does not ignore umlauts in comparisons ('hällö' != 'hallo')?

utf8_unicode_ci and utf8_general_ci seem to do the former but not the latter; utf8_bin does the latter but not the former.

At first glance, utf8_swedish_ci seems to work, but I am not sure whether it causes other problems.

What's the best practice here?

Solution

Case-insensitive search and umlaut-sensitive comparison both work with utf8_swedish_ci:

mysql> CREATE TABLE users (
  id INT(11) NOT NULL AUTO_INCREMENT,
  name varchar(60) NOT NULL,
  PRIMARY KEY (id),
  UNIQUE KEY name(name)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE utf8_swedish_ci;

Query OK, 0 rows affected (0.13 sec)

mysql> INSERT INTO `users` (`name`) VALUES ('Hello'), ('hällö');
Query OK, 2 rows affected (0.10 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM users WHERE name='hello';
+----+-------+
| id | name  |
+----+-------+
|  1 | Hello |
+----+-------+
1 row in set (0.00 sec)

mysql> SELECT * FROM users WHERE name='hallo';
Empty set (0.00 sec)
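
The same behaviour can be checked without a table by comparing string literals under an explicit collation. This is only a sketch: it assumes the client sends UTF-8 and that the server still offers the utf8 (utf8mb3) character set with its utf8_* collation names; newer MySQL versions may report these names as utf8mb3_*. The column aliases are made up for illustration.

```sql
-- Compare literals directly; COLLATE forces the collation used for each comparison.
-- The _utf8 introducer marks the literals as utf8 so the utf8_* collations apply.
SELECT
  _utf8'Hello' = _utf8'hello' COLLATE utf8_swedish_ci AS case_insensitive,
  _utf8'hällö' = _utf8'hallo' COLLATE utf8_swedish_ci AS umlauts_ignored_swedish,
  _utf8'hällö' = _utf8'hallo' COLLATE utf8_general_ci AS umlauts_ignored_general;

-- Per-query override on the users table from the session above:
-- the explicit COLLATE on the literal takes precedence over the column's collation.
SELECT * FROM users WHERE name = _utf8'hallo' COLLATE utf8_general_ci;
```

Going by the session above and the behaviour described in the question, the first two columns should come back as 1 and 0, while the utf8_general_ci comparisons should treat 'hällö' and 'hallo' as equal. An explicit COLLATE in the WHERE clause will usually stop MySQL from using the index on name, so it is better suited to one-off checks than to regular queries.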
Licensed under: CC-BY-SA with attribution