Question
I apologize for the topic title, but it reflects the nature of the problem.
I'm writing a parser for Twitter, and when the script stumbles upon symbols such as 💗⚫️ in the text of a tweet, Yii generates an error like this:
SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\xF0\x9F\x98\x8D\xF0\x9F...' for column 'code' at row 1.
I wrote the following code:
if (preg_match('/😍/si', $texts[$i])) {
$texts[$i] = str_replace('😍', '', $texts[$i]);
}
But that did not help, because each of these characters has a different Unicode code point (on my side they only show up as squares)...
I also tried the following code:
if (preg_match('/xF0/si', $texts[$i])) {
unset($texts[$i]);
}
But that did not help either...
These symbols are: ✂ ✃ ✄ ✆ ✇ ✈ ✉ ✌ ✍ ✎ ✏ ✐ ✑ ✒ ✓ ✔ ✕ ✖ ✗ ✘ ✙ ✚ ✛ ✜ ✝ ✞ ✟ ✠ ✡ ✢ ✣ ✤ ✥ ✦ ✧ ✩ ✪ ✫ ✬ ✭ ✮ ✯ ✰ ✱ ✲ ✳ ✴ ✵ ✶ ✷ ✸ ✹ ✺ ✻ ✼ ✽ ✾ ✿ ❀ ❁ ❂ ❃ ❄ ❅ ❆ ❇ ❈ ❉ ❊ ❋ ❍ ❏ ❐ ❑ ❒ ❖ ❘ ❙ ❚ ❛ ❜ ❝ ❞ ❡ ❢ ❣ ❤ ❥ ❦ ❧ ❶ ❷ ❸ ❹ ❺ ❻ ❼ ❽ ❾ ❿ ➀ ➁ ➂ ➃ ➄ ➅ ➆ 7 ➇ ➈ ➉ ➊ ➋ ➌ ➍ ➎ ➏ ➐ ➑ ➒ ➓ ➔ ➘ ➙ ➚ ➛ ➜ ➝ ➞ ➟ ➠ ➡ ➢ ➣ ➤ ➥ ➦ ➧ ➨ ➩ ➪ ➫ ➬ ➭ ➮ ➯ ➱ ➲ ➳ ➴ ➵ ➶ ➷ ➸ ➹ ➺ ➻ ➼ ➽ and many others...
How can I remove all these symbols from the parsed text (without using utf8mb4)?
No correct solution
Other tips
You're oh so close. Combining your code with Marc B's comments, we have this:
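// Match \xF0 together with its three continuation bytes; removing \xF0 alone leaves invalid UTF-8 behind.
if (preg_match('/\xF0[\x90-\xBF][\x80-\xBF]{2}/s', $texts[$i])) {
    $texts[$i] = preg_replace('/\xF0[\x90-\xBF][\x80-\xBF]{2}/s', '', $texts[$i]);
}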
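To illustrate why the pattern has to cover the whole 4-byte sequence and not just the lead byte, here is a small sketch. It assumes PHP 7 or later (for the "\u{...}" escape); the variable names are made up for the demonstration. The bytes of U+1F60D are the same F0 9F 98 8D that appear in the error message from the question.

```php
<?php
// U+1F60D encoded as UTF-8 is the 4-byte sequence F0 9F 98 8D.
$emoji = "\u{1F60D}";
echo bin2hex($emoji), "\n"; // f09f988d

// Stripping only the lead byte leaves three orphaned continuation bytes.
$leadOnly = preg_replace('/\xF0/', '', "a{$emoji}b");
echo bin2hex($leadOnly), "\n"; // 619f988d62

// An empty pattern with the /u flag matches (returns 1) only when the
// subject is well-formed UTF-8, so this prints bool(false).
var_dump(preg_match('//u', $leadOnly) === 1);

// Removing the lead byte plus its three continuation bytes keeps the string valid.
$whole = preg_replace('/\xF0[\x90-\xBF][\x80-\xBF]{2}/', '', "a{$emoji}b");
var_dump($whole); // string(2) "ab"
```

The leftover 9F 98 8D bytes are not valid UTF-8 on their own, so a string cleaned that way is likely to be rejected by MySQL just like the original.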
function replace4byte($string) {
    return preg_replace('%(?:
          \xF0[\x90-\xBF][\x80-\xBF]{2}      # planes 1-3
        | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
        | \xF4[\x80-\x8F][\x80-\xBF]{2}      # plane 16
    )%xs', '', $string);
}
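As an alternative sketch (not part of the answers above), the same filtering can be expressed with a code point range and the /u modifier. It assumes PHP 7 or later and input that is already valid UTF-8; the helper name stripSupplementary and the sample $texts values are hypothetical. Only code points above U+FFFF are removed, since those are the ones that need 4 bytes in UTF-8. Symbols like ❤ (U+2764) take 3 bytes and are left alone.

```php
<?php
// Hypothetical helper: removes every code point above U+FFFF, i.e. every
// character that takes 4 bytes in UTF-8.
function stripSupplementary(string $text): string
{
    $clean = preg_replace('/[\x{10000}-\x{10FFFF}]/u', '', $text);
    // With /u, preg_replace returns null when $text is not valid UTF-8;
    // in that case this sketch simply hands back the input unchanged.
    return $clean ?? $text;
}

// Sample data standing in for the parsed tweets from the question.
$texts = ["I \u{2764} PHP", "so cute \u{1F60D}\u{1F497}", "plain"];
$texts = array_map('stripSupplementary', $texts);
var_dump($texts);
```

Mapping the helper over the whole array avoids the per-element preg_match check, because preg_replace already leaves strings without a match untouched.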