Question
I apologize for the odd title, but the problem itself is odd.
I'm writing a parser for Twitter, and when the script encounters symbols such as 💗⚫️ in the text of a tweet, Yii raises an error like this:
SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\xF0\x9F\x98\x8D\xF0\x9F...' for column 'code' at row 1.
I wrote the following code:
if (preg_match('/😍/si', $texts[$i])) {
$texts[$i] = str_replace('😍', '', $texts[$i]);
}
But that did not help, because each of these characters has a different Unicode code point (and they are displayed only as squares).
I also tried the following code:
if (preg_match('/xF0/si', $texts[$i])) {
unset($texts[$i]);
}
But that did not help either.
The symbols are: ✂ ✃ ✄ ✆ ✇ ✈ ✉ ✌ ✍ ✎ ✏ ✐ ✑ ✒ ✓ ✔ ✕ ✖ ✗ ✘ ✙ ✚ ✛ ✜ ✝ ✞ ✟ ✠ ✡ ✢ ✣ ✤ ✥ ✦ ✧ ✩ ✪ ✫ ✬ ✭ ✮ ✯ ✰ ✱ ✲ ✳ ✴ ✵ ✶ ✷ ✸ ✹ ✺ ✻ ✼ ✽ ✾ ✿ ❀ ❁ ❂ ❃ ❄ ❅ ❆ ❇ ❈ ❉ ❊ ❋ ❍ ❏ ❐ ❑ ❒ ❖ ❘ ❙ ❚ ❛ ❜ ❝ ❞ ❡ ❢ ❣ ❤ ❥ ❦ ❧ ❶ ❷ ❸ ❹ ❺ ❻ ❼ ❽ ❾ ❿ ➀ ➁ ➂ ➃ ➄ ➅ ➆ ➇ ➈ ➉ ➊ ➋ ➌ ➍ ➎ ➏ ➐ ➑ ➒ ➓ ➔ ➘ ➙ ➚ ➛ ➜ ➝ ➞ ➟ ➠ ➡ ➢ ➣ ➤ ➥ ➦ ➧ ➨ ➩ ➪ ➫ ➬ ➭ ➮ ➯ ➱ ➲ ➳ ➴ ➵ ➶ ➷ ➸ ➹ ➺ ➻ ➼ ➽ and many, many others.
How can I remove all these symbols from the parsed text (without using utf8mb4)?
No correct solution
OTHER TIPS
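You're very close. Combining your code with Marc B's comments, we have the code below. Note the backslash in \xF0, and that the replacement pattern also consumes the three continuation bytes; stripping only the \xF0 lead byte would leave invalid UTF-8 behind.
if (preg_match('/\xF0/', $texts[$i])) {
    // Remove the whole 4-byte sequence, not just its \xF0 lead byte
    $texts[$i] = preg_replace('/\xF0[\x90-\xBF][\x80-\xBF]{2}/', '', $texts[$i]);
}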
A function that strips every 4-byte UTF-8 sequence:
function replace4byte($string) {
    return preg_replace('%(?:
          \xF0[\x90-\xBF][\x80-\xBF]{2}      # planes 1-3
        | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
        | \xF4[\x80-\x8F][\x80-\xBF]{2}      # plane 16
    )%xs', '', $string);
}
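The same filtering can probably be expressed on code points instead of raw bytes. The following is only a sketch, assuming PHP 7 or later and valid UTF-8 input. The helper name stripSupplementary and the sample tweet are made up for illustration. It removes every code point above U+FFFF, which are exactly the characters UTF-8 encodes in four bytes.

```php
<?php
// Hypothetical helper (name chosen for this example only): strips every
// code point above U+FFFF, i.e. everything UTF-8 encodes in four bytes.
function stripSupplementary(string $text): string
{
    // The /u modifier makes PCRE treat the pattern and the subject as UTF-8,
    // so \x{10000}-\x{10FFFF} is a range of code points, not of bytes.
    $clean = preg_replace('/[\x{10000}-\x{10FFFF}]/u', '', $text);

    // preg_replace() returns null on failure (for example, malformed UTF-8);
    // in that case this sketch simply hands back the input unchanged.
    return $clean ?? $text;
}

// Made-up sample: U+2764 is a 3-byte character, the other two need 4 bytes.
$tweet = "I \u{2764} this \u{1F60D}\u{1F497} song";
echo stripSupplementary($tweet), PHP_EOL;
```

Note that the dingbats listed in the question (U+2702 to U+27BD) are three-byte characters, which MySQL's utf8 charset can store, so neither this sketch nor replace4byte() removes them. Only characters outside the Basic Multilingual Plane, such as 😍, produce the 1366 error.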