Question

When I sort values from my MySQL database, I get the following wrong order:

SELECT * FROM tt_news WHERE pid=19 AND deleted=0 AND hidden=0 order by title ASC

A  
B  
C  
...  
Ä  
Ö

What can I do about this sorting problem? Ä should sort together with or directly after A, and so on.

MySQL server version: 5.0.51a with UTF-8 support

I found out that this has to do with the collation of the database (see this German link: http://mysql-faq.sourceforge.net/tables3.html).

The script is embedded in TYPO3 with setDBinit set to SET NAMES utf8 and forceCharset set to UTF-8. So the UTF-8 data should end up stored as ISO-8859-1 (Latin 1).

The column has the type text and the collation latin1_swedish_ci. When I enter SHOW VARIABLES LIKE 'collation%' in phpMyAdmin I get

collation_connection    utf8_general_ci
collation_database  latin1_swedish_ci
collation_server    latin1_swedish_ci

SHOW VARIABLES LIKE '%CHARACTER_SET%'; gives me in phpMyAdmin

character_set_client    utf8
character_set_connection    utf8
character_set_database  latin1
character_set_filesystem    binary
character_set_results   utf8
character_set_server    latin1
character_set_system    utf8
character_sets_dir  /usr/share/mysql/charsets/

Attempt No. 1:

I tried to use SET NAMES utf8; in my script, but that didn't change anything.

Attempt No. 2:

I wanted to do the sorting in PHP (following this Stack Overflow question: How to sort an array of associative arrays by value of a given key in PHP?), but that didn't change the sorting.

$title=array();
foreach ($result as $key => $row) {
    $title[$key]  = $row['title'];
}
array_multisort($title, SORT_ASC, $result);

Attempt No. 3:

I used this MySQL statement (taken from http://blog.mixable.de/mysql-order-by-und-deutsche-umlaute/):

SELECT * FROM tt_news WHERE pid=19 AND deleted=0 AND hidden=0 order by title COLLATE latin1_swedish_ci;

No change in the sorting. Using a utf8 collation instead leads to an error (collation not allowed).

Attempt No. 4:

SELECT *, REPLACE( REPLACE( REPLACE( REPLACE( REPLACE( REPLACE(REPLACE(title, 'Ä', 'A'), 'Ö', 'O'), 'Ü', 'U'), 'ä', 'a'), 'ö', 'o'), 'ü','u'), 'ß', 's') AS sortiert FROM tt_news WHERE pid=19 AND deleted=0 AND hidden=0 ORDER BY sortiert

Source: http://www.php-faq.de/q-mysql-umlaute-sortieren.html

Works in phpMyAdmin but not in my script. I get the following error: Illegal mix of collations (latin1_swedish_ci,IMPLICIT), (utf8_general_ci,COERCIBLE), (utf8_general_ci,COERCIBLE) for operation 'replace'

Can I make the correct sorting in PHP without changing the character set or collation?


Solution

The ordering you see is correct by Swedish rules: Å, Ä and Ö are the last three letters of the alphabet, after Z. If you don't like it, change the column collation to something else.

alter table tt_news modify title text collate latin1_general_ci;

The general variant considers all accented variations of a character distinct, but groups them together when sorting; for example AZ comes before ÄA. If you require some national standard other than Swedish here's a list of what MySQL supports out of the box: http://dev.mysql.com/doc/refman/5.6/en/charset-we-sets.html

If you can't change the column collation in the database, you can tell MySQL to use a particular collation just for ordering of the query. For example:

.... order by title collate latin1_general_ci

OTHER TIPS

Pure PHP solution:

function sortWUmlauts($s1, $s2)
{
    $s1 = $s1['title'];
    $s2 = $s2['title'];
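    // Multibyte umlauts are not case-folded reliably by str_ireplace(), so list both cases.
    $search = array('Ä','Ö','Ü','ä','ö','ü','ß');
    $replace = array('A','O','U','a','o','u','s');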

    return strcasecmp(
       str_ireplace($search, $replace, $s1),
       str_ireplace($search, $replace, $s2)
    );
}

// call
uasort($result, 'sortWUmlauts');

Taken from http://at2.php.net/manual/en/function.uasort.php#99017

A nice addition would be to have a variable which holds the search key for the associative array (directly embed the function in the uasort call).
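The addition suggested above could look roughly like this. It is only a minimal sketch, assuming PHP 5.3 or later (for closures) and that both the source file and the titles are UTF-8 encoded. The name makeUmlautComparator and the sample rows are hypothetical and not part of the original comment.

```php
<?php
// Hypothetical helper: builds a comparator for uasort() that compares rows
// by the given array key, treating German umlauts like their base letters.
function makeUmlautComparator($key)
{
    // Both cases are listed explicitly, so no case folding of multibyte
    // characters is needed before the case-insensitive ASCII comparison.
    $map = array(
        'Ä' => 'A', 'Ö' => 'O', 'Ü' => 'U',
        'ä' => 'a', 'ö' => 'o', 'ü' => 'u',
        'ß' => 's',
    );

    return function ($row1, $row2) use ($key, $map) {
        return strcasecmp(strtr($row1[$key], $map), strtr($row2[$key], $map));
    };
}

// Sample rows standing in for the query result (assumed data).
$result = array(
    array('title' => 'Zebra'),
    array('title' => 'Ärger'),
    array('title' => 'Apfel'),
    array('title' => 'Öl'),
    array('title' => 'Bär'),
);

// The sort key is now a parameter instead of being hard-coded.
uasort($result, makeUmlautComparator('title'));

$sortedTitles = array();
foreach ($result as $row) {
    $sortedTitles[] = $row['title'];
}
echo implode(', ', $sortedTitles), "\n"; // Apfel, Ärger, Bär, Öl, Zebra
```

Passing the key as an argument means the same comparator can be reused for any column of the result set.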

Use "order by title COLLATE latin1_german1_ci" for

Ä = A
Ö = O
Ü = U
ß = s

Use "order by title COLLATE latin1_german2_ci" for

Ä = AE
Ö = OE
Ü = UE
ß = ss

For more on sorting, see http://dev.mysql.com/doc/refman/5.6/en/charset-we-sets.html
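To see how the two mappings change the result, here is a small PHP sketch that imitates them with plain string replacement. It only approximates the letter equivalences listed above and is not how MySQL implements these collations. The names sortByMap, $german1 and $german2 and the sample words are made up for illustration; PHP 5.3 or later and UTF-8 source and data are assumed.

```php
<?php
// Hypothetical helper: returns a copy of $words sorted by comparing
// each word after replacing characters according to $map.
function sortByMap(array $words, array $map)
{
    usort($words, function ($a, $b) use ($map) {
        return strcasecmp(strtr($a, $map), strtr($b, $map));
    });
    return $words;
}

// Equivalences as listed for latin1_german1_ci (Ä = A, ...).
$german1 = array(
    'Ä' => 'A', 'Ö' => 'O', 'Ü' => 'U',
    'ä' => 'a', 'ö' => 'o', 'ü' => 'u',
    'ß' => 's',
);

// Equivalences as listed for latin1_german2_ci (Ä = AE, ...).
$german2 = array(
    'Ä' => 'AE', 'Ö' => 'OE', 'Ü' => 'UE',
    'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue',
    'ß' => 'ss',
);

// Sample words (assumed data).
$words = array('Mufti', 'Müll', 'Mud', 'Muzak');

$german1Order = sortByMap($words, $german1);
$german2Order = sortByMap($words, $german2);

echo implode(', ', $german1Order), "\n"; // Mud, Mufti, Müll, Muzak
echo implode(', ', $german2Order), "\n"; // Mud, Müll, Mufti, Muzak
```

With the first mapping "Müll" is compared as "Mull" and lands after "Mufti"; with the second it is compared as "Muell" and lands before it.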

You don't have to modify your database to do this (unless you want to, of course). Perhaps you have different columns that you want to sort according to a different language?

Simply specify a different collation in your query, e.g.:

SELECT * FROM tt_news WHERE pid=19 ORDER BY title COLLATE "utf8_german2_ci" ASC

Note that if your table is not already in a utf8 collation (maybe it is in a latin1 collation) then you will need to use a latin1 collation for the sorting. In this case you would use latin1_german2_ci instead of utf8_german2_ci in the query above.

A list of collations, along with a useful discussion of their uses, is available in the MySQL reference docs.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow