Question

I have a problem writing to a MySQL database in UTF-8 encoding. My application is a little complicated, so I will try to be as specific as possible. It requires Slovak special characters (they are in UTF-8) such as ľščťžýáí.

I am running Debian. I believe my locale is set correctly, but to be sure:

root@radiator:/var/scripts# locale
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
LANG=sk_SK.UTF-8
LANGUAGE=sk_SK.UTF-8:cs_CZ.UTF-8
LC_CTYPE="sk_SK.UTF-8"
LC_NUMERIC="sk_SK.UTF-8"
LC_TIME="sk_SK.UTF-8"
LC_COLLATE="sk_SK.UTF-8"
LC_MONETARY="sk_SK.UTF-8"
LC_MESSAGES="sk_SK.UTF-8"
LC_PAPER="sk_SK.UTF-8"
LC_NAME="sk_SK.UTF-8"
LC_ADDRESS="sk_SK.UTF-8"
LC_TELEPHONE="sk_SK.UTF-8"
LC_MEASUREMENT="sk_SK.UTF-8"
LC_IDENTIFICATION="sk_SK.UTF-8"
LC_ALL=

I have a bash script that should write text (in Slovak) to the database. (The leading hash character is there because Debian does not know how to handle the BOM; I still don't know how to deal with that.)

#
#!/bin/bash
table=$1
cycle=$2
sstart=$3
eend=$4
dbtext=$(cat /var/www/vids/$5/vars/$5.recogn.p.tmp2)

qry="INSERT INTO  \`video\`.\`$table\` (\`DB_ID\` , \`LNX_ID\` , \`STIME\` , \`ETIME\` , \`TEXT\` ) VALUES ( NULL , '$cycle', '$sstart', '$eend', '$dbtext');"

mysql --host=localhost --database 'video' --user=uzivatel --password=heslo << eof
$qry
eof

This is the content of the tmp2 file mentioned above (the file is encoded in UTF-8):

Tá žena držal poznali poznal jeho rodičov poznali podsvetie hodváb ulsteru mám ostatných tak veľmi dobre ako boli pre nato že sa bude vydávať ale skóre nevyšlo to potom zas nasťahovala.

And in phpMyAdmin it looks like this:

Tá žena držal poznali poznal jeho rodiÄov poznali podsvetie hodváb ulsteru mám ostatných tak veľmi dobre ako boli pre nato že sa bude vydávaÅ¥ ale skóre nevyÅ¡lo to potom zas nasÅ¥ahovala.

(The collation of this field is utf8_slovak_ci, and the Google Chrome encoding is UTF-8.)

I spent a whole day googling this and I still don't know what the problem is. Could you please help me? I know that you are the best. :)


Solution

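It looks like your UTF-8 input is being interpreted in a single-byte encoding at some point, most likely by mysql itself, because the database connection may default to latin1. Each two-byte UTF-8 character is then read as two separate one-byte characters, which is why ť shows up as Å¥ in phpMyAdmin.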
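
To see the effect in isolation, here is a minimal sketch that reproduces the garbling with iconv and no database involved. The function name misread_as_latin1 is made up for this illustration, and it assumes iconv is installed (it normally is on Debian).

```shell
#!/bin/bash
# Hypothetical helper: take UTF-8 bytes, pretend they are ISO-8859-1 (latin1),
# and convert the result to UTF-8. This mimics a connection that treats
# incoming UTF-8 data as latin1.
misread_as_latin1() {
  printf '%s' "$1" | iconv -f ISO-8859-1 -t UTF-8
}

# "ť" is the two bytes C5 A5 in UTF-8 (octal 305 245).
# Read as latin1, C5 is "Å" and A5 is "¥".
misread_as_latin1 "$(printf '\305\245')"; echo   # prints: Å¥
```

The result matches the "vydávaÅ¥" seen in phpMyAdmin, which supports the idea that the bytes are being decoded as latin1 somewhere between the script and the table.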

Try adding --default-character-set=utf8 to your mysql call. (Alternatively, a SET NAMES utf8 statement placed before the query should have the same effect.)
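
Applied to the script from the question, both variants could look roughly like the sketch below. The helper names with_utf8_names and run_utf8_query are made up for this example; the host, database, user and password are the ones from the question. Either change should be enough on its own; they are combined here only to show both.

```shell
#!/bin/bash
# Hypothetical helper: put SET NAMES in front of a statement so the server
# decodes the bytes that follow as UTF-8.
with_utf8_names() {
  printf 'SET NAMES utf8;\n%s\n' "$1"
}

# Hypothetical wrapper: the mysql call from the question plus the
# --default-character-set option. It is only defined here; nothing contacts
# the server until the function is called.
run_utf8_query() {
  with_utf8_names "$1" | mysql --default-character-set=utf8 \
    --host=localhost --database video --user=uzivatel --password=heslo
}
```

In the original script, the heredoc call would then become run_utf8_query "$qry".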

Licensed under: CC-BY-SA with attribution