htmlpurifier returning question marks when user enters &nbsp; into html?
10-06-2021
Question
It hardly seems like html code that needs purifying.
Why does htmlpurifier turn that string into a question mark when it should obviously be a space?
My exact html purification code is:
//purify the html input
include_once('inc/htmlpurifier-4.4.0/library/HTMLPurifier.auto.php');
$config = HTMLPurifier_Config::createDefault();
$config->set('Core.Encoding', 'UTF-8');
$config->set('HTML.Doctype', 'HTML 4.01 Transitional');
if (defined('PURIFIER_CACHE')) {
    $config->set('Cache.SerializerPath', PURIFIER_CACHE);
} else {
    # Disable the cache entirely
    $config->set('Cache.DefinitionImpl', null);
}
$input = $_POST["about_me"];
# Help out the Purifier a bit, until it develops this functionality
while (($cleaner = preg_replace('!<(em|strong)>(\s*)</\1>!', '$2', $input)) != $input) {
    $input = $cleaner;
}
$filter = new HTMLPurifier($config);
$htmlpurified_output = $filter->purify($input);
I have UTF-8 enabled in my PHP page headers and also for MySQL when saving the information.
I am able to write, save to the DB, and re-display other UTF-8 characters inside other textareas on the same page. The culprit is definitely htmlpurifier returning the question marks in place of the actual characters.
I will answer any other questions I can.
Solution
And the answer is...
Always make sure your encoding is properly set in all areas.
I had the "about_me" column of the table set to accept only ASCII characters. Duh.
Sorry for wasting everybody's time.
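For anyone who lands here with the same symptom, here is a rough sketch of what was probably going on. It assumes the input in question was a non-breaking space (&nbsp;, U+00A0), which "should obviously be a space" suggests. As far as I can tell, with Core.Encoding set to UTF-8 the purifier hands that entity back as the literal U+00A0 character, and an ASCII-only column cannot store it, so a "?" is saved instead. The snippet reproduces the substitution with mbstring as an analogy (it is not the code path MySQL uses) and then prints two statements for inspecting and changing the column. The table name `users` and the TEXT column type are made up for illustration; only `about_me` comes from the post. utf8mb4 needs MySQL 5.5.3 or later.

```php
<?php
// Sketch only: requires the mbstring extension. The table name `users` and the
// TEXT column type are hypothetical; only `about_me` appears in the post.

// &nbsp; decodes to U+00A0, which is the two bytes C2 A0 in UTF-8.
$nbsp = "\xC2\xA0";
$text = "Hello" . $nbsp . "world";

// ASCII has no code for U+00A0, so the conversion substitutes a placeholder.
mb_substitute_character(0x3F); // 0x3F is "?"
$asciiOnly = mb_convert_encoding($text, 'ASCII', 'UTF-8');
echo $asciiOnly, PHP_EOL; // Hello?world

// Query to inspect the column's character set (run it against MySQL).
$checkSql = "SELECT CHARACTER_SET_NAME, COLLATION_NAME
  FROM information_schema.COLUMNS
  WHERE TABLE_SCHEMA = DATABASE()
    AND TABLE_NAME = 'users'
    AND COLUMN_NAME = 'about_me'";

// Statement to switch the column to a UTF-8 character set. MODIFY restates
// the whole column definition, so adjust the type and NULL-ability to match.
$fixSql = "ALTER TABLE users
  MODIFY about_me TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci";

echo $checkSql, PHP_EOL, $fixSql, PHP_EOL;
```

Note that any question marks already saved in the column are literal "?" characters, so changing the character set afterwards will not bring the original text back; those values have to be entered again.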