Question

I have a MySQL database with an InnoDB table containing utf8_general_ci varchar fields. When I fetch them through PHP (via PEAR::MDB2) and try to output them (via Smarty), I get ??? symbols. I would like to know how to fix that problem, which is most likely caused by PHP.

Good information to know:

  • This is a new version of the site I'm working on. The old version had the same problem even though it used neither Smarty nor MDB2, so they are most likely not the cause. The old programmer used htmlentities() to remedy the problem, but I'm trying to avoid that.
  • The character encoding of all my files (template, source, etc.) is UTF-8 without BOM.
  • When I display a page, all accented characters (the ones in the templates, not the ones coming from MySQL) are shown correctly and the encoding in the browser is UTF-8. If I manually switch it over to ISO-8859-1, then the characters from MySQL are output correctly, but not the others.

Basically, it seems that PHP or MySQL transforms the UTF-8 data contained within the database to ISO-8859-1 at some point during the query/fetch process, and that is what I want to fix.

I've done a lot of searching but haven't found any solution, and I'm hoping the problem lies in a setting somewhere. I'd like to avoid having to use htmlentities() or utf8_encode(), although that might be the only way to go until PHP6 shows up.

Thank you for your input on this!


Solution

You need to execute a few queries to tell MySQL to use UTF-8 for the connection (the default is indeed Latin-1). Here's what I use:

SET CHARACTER SET 'utf8';
SET character_set_database = "utf8";
SET character_set_connection = "utf8";
SET character_set_server = "utf8";

I know some of these seem overkill, but they have been tested and do seem to work quite well...
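
For what it's worth, a shorter variant that is commonly used for the same purpose is SET NAMES, which sets the client, results and connection character sets in one statement. This is only a sketch of the idea, not necessarily what the setup above needs; the SHOW VARIABLES query is just there to check what the session ended up with:

```sql
-- Run once per connection, right after connecting and before any other query.
-- Sets character_set_client, character_set_results and character_set_connection.
SET NAMES 'utf8';

-- Inspect the character sets the current session is actually using.
SHOW VARIABLES LIKE 'character_set%';
```

If character_set_results still shows latin1 after this, the statement was probably run on a different connection than the one doing the fetch.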

OTHER TIPS

My guess is the data wasn't utf-8-encoded when it hit the database.
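
One way to check this guess is to look at the raw bytes stored in the column. A rough sketch, assuming a hypothetical table `articles` with a utf8 column `title` that contains an 'é' (both names are placeholders, substitute your own):

```sql
-- `articles`, `title` and `id` are placeholder names for illustration only.
SELECT title, HEX(title)
FROM articles
WHERE id = 1;

-- How a stored 'é' shows up in the HEX() output of a utf8 column:
--   C3A9      correctly stored UTF-8
--   C383C2A9  double-encoded (UTF-8 bytes were sent over a Latin-1 connection)
```

If the bytes are correct, the data itself is fine and the conversion happens on the connection when the rows are fetched.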

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow