문제

I'm struggling with the encoding of the content of an external interface. In the MySQL database the collation is latin1_swedish_ci. Also the collation of the field ist latin1_swedish_ci. The php script is encoded in UTF-8 and the output in the browser gives me UTF-8. Everything is working fine except the content of this database. The database connection should be UTF-8 (Typo3 4.7) and the content is

straße

but it should be straße.

mb_detect_encoding($data['street'],'UTF-8') says it is UTF-8. If I use utf8_decode() I get

stra�?e

If I use utf8_encode() I get

straße

My assumption was that UTF-8 encoded data is stored in ISO-8859-1, but if this would be the case this shouldn't make such problems here. How do I find out what the real encoding is?

PS: I cannot change the encoding of the source!

My solution for my initial problem:

I had to set the datbase connection from UTF-8 to ISO-8859-1 with this line of code

$res = $GLOBALS['TYPO3_DB']->sql_query("SET NAMES latin1");
도움이 되었습니까?

해결책

The character ß 'LATIN SMALL LETTER SHARP S' (U+00DF) exist in UTF-8 of bytes 0xC3 and 0x9F as per the linked site:

UTF-8 (hex) 0xC3 0x9F (c39f)

If we look at the ISO-8859-1 codepage layout, then those bytes represent the characters à and a character not definied in the ISO-8859-1 codepage layout. This is thus not it. Another common character encoding which has some overlap with ISO-8859-1 is Windows CP1252 (also known as ANSI, used by default when saving a text file in Notepad — which is overridable by using Save As instead). If we look at CP1252 codepage layout, then those bytes represent the characters à and Ÿ which confirms what you're initially retrieving.

So, it's most likely CP1252 encoded.

다른 팁

What you see as “ß” is really the windows-1252 (also known as CP1252) interpretation of the two bytes 0xC3 and 0x9F that constitute the UTF-8 encoding of “ß”. But this seems to mean that the data is actually UTF-8 encoded and just gets misinterpreted as windows-1252 encoded. So I think it should be simply processed as UTF-8, with due precautions.

i recommend that you proceed to verify what charset is being used by your sql connection. it is NOT necessarily the same as the charset that you define for your databse.

FROM PHP

// Opens a connection to a MySQL server
$connection = mysql_connect ($server, $username, $password);
$charset = mysql_client_encoding($connection);
$flagChange = mysql_set_charset('utf8', $connection);
echo "The character set is: $charset</br>mysql_set_charset result:$flagChange</br>";

INSIDE PHPMYADMIN

  1. open database information_schema
  2. open table schemata
  3. check out your mysql default collation

you may or may not be able to change these parameters, depending on user privileges.

as shown above, i solved my conflicting character set problems in mysql by appending the following line to my connection.php file (which i call at the beginning of every page that uses db access):

$flagChange = mysql_set_charset('utf8', $connection);
라이센스 : CC-BY-SA ~와 함께 속성
제휴하지 않습니다 StackOverflow
scroll top