Question

Do I really need to switch from VARCHAR to VARBINARY and from TEXT to BLOB for UTF-8 in MySQL and PHP? Or can I stick with CHAR/TEXT fields in MySQL?

Solution

Maybe. As jason pointed out (and I failed to notice), MySQL's utf8 character set only covers the Basic Multilingual Plane. The manual does point out, however, that "They [utf8 and ucs2] are sufficient for almost all characters in major languages." So it is probably safe, but you might want to check what the Basic Multilingual Plane contains just to be sure.

Original Answer

As long as your database is using UTF-8 you should be able to stick with VARCHAR and TEXT. (As a side note, the MySQL manual recommends using VARCHAR over CHAR with UTF-8 to save space. As this is the case, it should be safe to use VARCHAR and TEXT.)

OTHER TIPS

Not necessarily. MySQL's UTF-8 support is limited to 3-byte UTF-8, which covers everything up to and including the Basic Multilingual Plane. You only need BLOB storage if you need characters in the 4-byte range; that is uncommon, but it does happen. See the Wikipedia article for a breakdown of what you'll be missing, and decide whether anything there is a must-have.
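To check whether your own data would be affected, you can test whether a string contains any code point above U+FFFF, which is the range that needs 4 bytes in UTF-8. The following is a minimal sketch in Python rather than PHP; the function names are made up for illustration and are not part of MySQL or any library.

```python
def chars_outside_bmp(text):
    """Return the characters whose code point lies above U+FFFF.

    These are the characters that take 4 bytes in UTF-8 and therefore
    fall outside a 3-byte-per-character UTF-8 column.
    """
    return [ch for ch in text if ord(ch) > 0xFFFF]


def fits_three_byte_utf8(text):
    """True if every character in text is inside the Basic Multilingual Plane."""
    return not chars_outside_bmp(text)


if __name__ == "__main__":
    samples = ["plain ASCII", "caf\u00e9 \u65e5\u672c\u8a9e", "smile \U0001F600"]
    for sample in samples:
        print(repr(sample), fits_three_byte_utf8(sample))
```

Running a check like this over a sample of real input is one way to decide whether the 3-byte limit matters for your application.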

Here's a nice link on dealing with UTF-8 in PHP. MySQL does very well with UTF-8 if you set the collation right. PHP on the other hand has lots of problems.

Of course it is safe to use VARCHAR to store UTF-8 text and no VARBINARY is needed for that.

VARCHAR is a variable-length character type ("CHARACTER VARYING"): it adapts to the number of bytes needed to store the characters in the selected character set.
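To make the "variable number of bytes" point concrete, here is a small Python sketch (not MySQL code) that shows how many bytes individual characters take when encoded as UTF-8. The helper name is hypothetical.

```python
def utf8_byte_lengths(text):
    """Pair each character with the number of bytes its UTF-8 encoding uses."""
    return [(ch, len(ch.encode("utf-8"))) for ch in text]


if __name__ == "__main__":
    # "a" (U+0061), "é" (U+00E9) and "€" (U+20AC) need 1, 2 and 3 bytes.
    for ch, size in utf8_byte_lengths("a\u00e9\u20ac"):
        print(f"U+{ord(ch):04X} takes {size} byte(s)")
```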

There is also a reason why MySQL's UTF-8 support is limited to 3 bytes per character: three bytes are enough to encode every code point in the Basic Multilingual Plane. You would need to dive into the UTF-8 docs that describe the encoding procedure to see where that boundary falls.

And last but not least: if you're unsure about UTF-8, you can always opt for UTF-16. You'll still be using VARCHAR, as it adapts to the correct byte length either way.
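If you are weighing UTF-8 against UTF-16, it can help to compare how many bytes the same text takes in each. This is a rough sketch in Python using its built-in codecs; it measures encoded length only and makes no claim about how MySQL lays out rows on disk. The function name is made up for illustration.

```python
def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text.

    "utf-16-le" is used so that no byte order mark is counted.
    """
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


if __name__ == "__main__":
    for sample in ["abc", "\u65e5\u672c\u8a9e"]:
        utf8_size, utf16_size = encoded_sizes(sample)
        print(repr(sample), "UTF-8:", utf8_size, "UTF-16:", utf16_size)
```

Mostly-ASCII text tends to be smaller in UTF-8, while text made up largely of CJK characters tends to be smaller in UTF-16.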

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow