Question

I'm trying to share posts from my website.
The problem is that the infamous question-mark diamond shows up in some of the post descriptions.
All the OG meta tags look good (using Yoast SEO); it's 'just' the text on the shared post itself that has this � sign.
I understand it's probably due to some file not being encoded in UTF-8.
I've added default_charset = UTF-8 to my .ini file, but nothing changed.
The Content-Type is also properly set to UTF-8.
I've also run a validation check with the W3C validator and didn't find anything related.

The website itself displays fine with no weird characters; the problem only appears when sharing a post on Facebook.
How can I find the source of the wrong encoding?

Solution 2

I've found the issue.
UTF-8 is a multibyte encoding, so a single character can take several bytes (Hebrew characters, for example, take two bytes each).
Facebook truncates this string when a post is shared, to keep the description under a maximum number of characters.
However, it may cut a multibyte character in the middle, which leaves an invalid UTF-8 sequence that is displayed as �.
This issue actually reproduces on all websites with multibyte content.
The only workaround I've found is limiting the og:description to Facebook's maximum number of characters (about 110 in a comment and 300 in a post). I've submitted a bug to the Facebook OpenGraph platform and I hope it will be fixed shortly.
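
To illustrate the failure mode and the workaround, here is a minimal Python sketch (the site itself runs PHP; Python is used only as a neutral illustration). The function names are hypothetical, and the byte-level cut is an assumption about what happens on Facebook's side, not a confirmed description of their code.

```python
def truncate_bytes_naive(text: str, max_bytes: int) -> str:
    """Cut the UTF-8 bytes blindly, then decode.

    If the cut lands inside a multibyte character, the leftover
    partial sequence is decoded as U+FFFD (the diamond question mark).
    """
    return text.encode("utf-8")[:max_bytes].decode("utf-8", errors="replace")


def truncate_chars(text: str, max_chars: int) -> str:
    """Cut on character (code point) boundaries, so nothing is split."""
    return text[:max_chars]


sample = "שלום"  # Hebrew: each letter takes 2 bytes in UTF-8

# 5 bytes = two whole letters plus half of the third one.
print(truncate_bytes_naive(sample, 5))

# Cutting by characters never produces the replacement character.
print(truncate_chars(sample, 3))
```

Shortening og:description yourself, on character boundaries and below the limits mentioned above, means the truncation never has to happen on the other side.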

OTHER TIPS

I don't think this is a problem in your WP installation. The site is served as UTF-8, and so is your feed, etc.

Validating

To validate strings, use the PHP function mb_check_encoding. A small script can check the content of your database tables, so that you get feedback about the data stored in them.
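
The idea of such a check can be sketched as follows. This is a Python stand-in for what mb_check_encoding does in PHP, not the PHP function itself; the function names and the (id, raw bytes) row format are assumptions made for the example, and fetching the rows from the database is left out.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes form a well-formed UTF-8 sequence."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def find_invalid_rows(rows):
    """Return the ids of rows whose content is not valid UTF-8.

    `rows` is an iterable of (row_id, raw_bytes) pairs.
    """
    return [row_id for row_id, raw in rows if not is_valid_utf8(raw)]


# Hypothetical sample data standing in for post content read from the database.
rows = [
    (1, "שלום".encode("utf-8")),  # valid UTF-8
    (2, b"\xd7"),                 # half of a 2-byte character
    (3, b"caf\xe9"),              # Latin-1 bytes, not UTF-8
]
print(find_invalid_rows(rows))
```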

The library tchwork-grekas/utf8 (Patchwork-UTF8) is also helpful, both for finding problems and for fixing the broken strings.

Fix them via custom script

However, if you only want to search your database, a plugin that searches for the broken strings should help. The alternative is a custom script that checks and fixes them. I think the forceutf8 library is helpful here: its fixUTF8 method fixes the broken strings, if there are any in your data.
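
As a rough idea of what a check-and-fix script does, here is a deliberately simplified Python sketch. It is not how forceutf8 is implemented: it only assumes that any content that is not valid UTF-8 was originally Windows-1252 or Latin-1, which is a common case but not a general rule. The function name is hypothetical.

```python
def repair_to_utf8(raw: bytes) -> str:
    """Decode bytes as UTF-8 if possible, otherwise fall back.

    Assumption: anything that is not valid UTF-8 is Windows-1252,
    and if that fails too, Latin-1 (which accepts every byte).
    """
    for encoding in ("utf-8", "cp1252"):
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    return raw.decode("latin-1")


# Already valid UTF-8 is returned unchanged.
print(repair_to_utf8("café".encode("utf-8")))

# Latin-1 / Windows-1252 bytes are reinterpreted instead of becoming U+FFFD.
print(repair_to_utf8(b"caf\xe9"))
```

Run anything like this on a backup first, since a wrong guess about the original encoding changes the stored text.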

Another alternative is the Patchwork-UTF8 library mentioned in the Validating section above.

MySQL Table Collation

Before converting anything, check the collation of all tables. You can see their encoding in the Collation column in the output of SHOW TABLE STATUS (in phpMyAdmin or Adminer this is shown in the list of tables).

SHOW TABLE STATUS FROM <YOUR_DATABASE>

MySQL Variables

You can also check the character set variables. Run the following MySQL command in your database to verify that everything has been properly set to use UTF-8.

SHOW VARIABLES LIKE 'char%';

Convert

Convert the tables to InnoDB and utf8mb4, for example the posts table:

ALTER TABLE wp_posts ENGINE=InnoDB ROW_FORMAT=DYNAMIC;
ALTER TABLE wp_posts CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

WP Config

Check wp-config.php. Since WordPress 4.2, utf8mb4 is the right choice because it provides full UTF-8 support.

define( 'DB_CHARSET', 'utf8mb4' ); 

Licensed under: CC-BY-SA with attribution
Not affiliated with wordpress.stackexchange