Question

I am running into an issue where a controller receives a string that is then assigned to an attribute on one of my models, which I then save to the database. A log message with an inspect call shows that the model holds the full string right up until the #save call. The problem is that, without any errors being thrown, if the string contains a French character, everything from that character to the end of the string is truncated.

Further investigation seems to show that the string gets truncated when being written to the MySQL database. I also came across this article: Stale Rails Issue

If I am reading that right, it looks like characters that are not in the ASCII character encoding but are in the ISO Latin-1 character encoding are subject to this bug. I actually upgraded my project from Rails 3.0 to Rails 3.2 and from Ruby 1.8 to Ruby 1.9 so I could easily use the mysql2 adapter with Rails, which some other articles seemed to suggest might solve the issue. However, it didn't.

So how do I prevent the string truncation from happening?

Edit1: If I enter the query SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%'; I get:

Variable Name, Value
'character_set_client', 'utf8'
'character_set_connection', 'utf8'
'character_set_database', 'utf8'
'character_set_filesystem', 'binary'
'character_set_results', 'utf8'
'character_set_server', 'latin1'
'character_set_system', 'utf8'
'collation_connection', 'utf8_general_ci'
'collation_database', 'utf8_unicode_ci'
'collation_server', 'latin1_swedish_ci'

I also noticed that if I enter the French character via the MySQL Query Browser and then refresh the Rails app in my browser so that it pulls the new data from the database, it displays correctly. The character only seems to be dropped when saving the model data.

Edit2: I just changed some config parameters to try to fix the problem, but it still exists. These are the values after the change:

Variable Name, Value
'character_set_client', 'utf8'
'character_set_connection', 'utf8'
'character_set_database', 'utf8'
'character_set_filesystem', 'binary'
'character_set_results', 'utf8'
'character_set_server', 'utf8'
'character_set_system', 'utf8'
'collation_connection', 'utf8_general_ci'
'collation_database', 'utf8_unicode_ci'
'collation_server', 'utf8_unicode_ci'

Solution 2

Sorry for all the trouble. I'll just put down the answer. It turned out that the database was correctly set up for utf8, but a user was inputting strings encoded in ISO Latin-1, and I wasn't checking the encoding of user input because I assumed all input would be UTF-8 compatible. It turns out that the bytes ISO Latin-1 uses for French accented characters are not valid UTF-8 sequences. The database seems to handle this by just raising a warning and truncating the string at the illegal character, keeping everything before it.
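For anyone hitting the same thing, here is a minimal sketch of the kind of check I was missing. The helper name normalize_to_utf8 is made up for illustration (it is not part of Rails), and falling back to ISO-8859-1 is an assumption that fit my case. Any byte sequence decodes as Latin-1, so this never raises, but it will mis-decode input that was really in some other encoding.

```ruby
# Hypothetical helper, not a Rails API: return the input as valid UTF-8.
def normalize_to_utf8(input)
  # Work on a copy and tag it as UTF-8 without changing any bytes.
  str = input.dup.force_encoding(Encoding::UTF_8)
  return str if str.valid_encoding?

  # Assumption: anything that is not valid UTF-8 was sent as ISO-8859-1.
  # Re-tag the bytes as Latin-1, then transcode them to real UTF-8.
  str.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

# "café" as Latin-1 bytes: 0xE9 on its own is not a valid UTF-8 sequence.
latin1_bytes = [0x63, 0x61, 0x66, 0xE9].pack("C*")
fixed = normalize_to_utf8(latin1_bytes)
fixed                  # => "café"
fixed.valid_encoding?  # => true
```

Running user input through something like this before assigning it to the model means the database only ever sees valid UTF-8, so there is nothing for it to truncate.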

OTHER TIPS

Well, you are already using utf8. For the collation, utf8_unicode_ci may be the better choice: utf8_general_ci performs better but can have problems with German, so if that matters to you, use utf8_unicode_ci. That covers the database side; for more information on MySQL character sets, check out MySQL's charset-unicode-sets. On the Rails and Ruby side, you should check out these questions: French accents in ruby, and also Rails messages in french. As a last resort, you could HTML-encode the data before inserting it into the database. This can mess up searches, but if you also encode the search data before querying the database, everything should be fine; for more information, check the French characters in rails page. I hope this helps. If you keep getting errors, please tell me so I can look for other ways to help you out.

Also, the comment by @Ahmed Ali could help you out. It looks like the encodings get changed:

Fetching data from any database (Mysql, Postgresql, Sqlite2 & 3), all configured to have UTF-8 as its character set, returns the data with ASCII-8BIT in ruby 1.9.1 and rails 2.3.2.1.

See the link Ahmed posted for the complete answer, and for the page the quote was taken from (ASCII-8BIT encoding of query results in rails 2.3.2 and ruby 1.9.1).
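To illustrate what that quote describes, here is a small sketch in plain Ruby with no database involved. The byte values are just an example I picked. It shows a string whose bytes are UTF-8 but which is tagged ASCII-8BIT, and how force_encoding re-tags it without touching the bytes.

```ruby
# The two UTF-8 bytes for "é", tagged as binary (ASCII-8BIT).
raw = [0xC3, 0xA9].pack("C*")
raw.encoding == Encoding::BINARY  # => true
raw.length                        # => 2 (counted as raw bytes)

# Re-tag a copy as UTF-8; the bytes stay the same, only the label changes.
text = raw.dup.force_encoding(Encoding::UTF_8)
text.valid_encoding?              # => true
text.length                       # => 1 (one character)
```

This only helps when the bytes really are UTF-8. If they are Latin-1, as in the accepted answer, re-tagging alone leaves an invalid string and the data has to be transcoded instead.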

Licensed under: CC-BY-SA with attribution