Question

I am using the preg_match function to reject unwanted characters from a textarea field in two PHP scripts I wrote, but in one of them it does not seem to work.

Here's the script with the problem:

<?php
    //Database connection, etc......

    mysql_select_db("etc", $con);
    $errmsg = '';
    $chido = $_POST['chido'];
    $gacho = $_POST['gacho'];
    $maestroid = $_POST['maestroid'];
    $comentario = $_POST['comment'];
    $voto = $_POST['voto'];

    if($_POST['enviado']==1) {
        if (preg_match ('/[^a-zA-Z áéíóúüñÁÉÍÓÚÜÑ]/i', $comentario))
            $errmsg = 1;
        if($errmsg == '') {
            //here's some queries, etc
        }
    }

    if($errmsg == 1)
        echo "ERROR: You inserted invalid characters...";
?>

As you can see, the preg_match call just rejects unwanted characters like !"#$%&/() etc.

But every time I type a special character like 'ñ' or 'á', it triggers the error.

I have this very similar script that works perfectly with the same preg_match and filters just the unwanted characters:

//Database connection, etc..
mysql_select_db("etc", $con);
$errmsg = '';

if ($_POST['enviado']==1) {
     $nombre = $_POST['nombre'];
     $apodo = $_POST['apodo'];
     $mat1 = $_POST['mat1'];
     $mat2 = $_POST['mat2'];
     $mat3 = $_POST['mat3'];

     if (preg_match ('/[^a-zA-Z áéíóúüñÁÉÍÓÚÜÑ]/i', $nombre))
         $errmsg = 1;


     if($errmsg == '') {
         //more queries after validation
     }
}

if($errmsg == 1)
    echo "ERROR: etc......."
?>

So the question is: what am I doing wrong in the first script?

I have tried everything I can think of, but it always fails and shows the error.

Any suggestions?

Answers

Try adding the u modifier after your i so that the pattern and the input are treated as UTF-8:

/[^a-zA-Z áéíóúüñÁÉÍÓÚÜÑ]/iu
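A minimal sketch of the check with the u modifier, assuming the script file and the submitted text are both UTF-8. The function name has_invalid_chars is hypothetical. One detail worth keeping in mind: with u, preg_match() returns false (not 0) when the subject is not valid UTF-8, so the sketch treats false as invalid input too.

```php
<?php
// Hypothetical helper, not part of the original scripts.
// Assumes this file is saved as UTF-8.
function has_invalid_chars(string $text): bool
{
    // With the u modifier, pattern and subject are read as UTF-8,
    // so "ñ" is matched as one character rather than as two bytes.
    $result = preg_match('/[^a-zA-Z áéíóúüñÁÉÍÓÚÜÑ]/iu', $text);

    // 1 means a disallowed character was found; false means the subject
    // was not valid UTF-8. Both count as invalid input here.
    return $result === 1 || $result === false;
}

var_dump(has_invalid_chars('niño'));    // bool(false)
var_dump(has_invalid_chars('niño!'));   // bool(true)
var_dump(has_invalid_chars("ni\xF1o")); // bool(true): a lone Latin-1 byte is not valid UTF-8
```

If the form data arrives in another encoding such as ISO-8859-1, it would need converting to UTF-8 before this check.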

Hi. Previously I was using this expression:

/^[a-z\d_]+$/i

It accepts the letters a to z, the digits 0 to 9 and the underscore '_'; the plus sign '+' repeats the class over the whole string, and the 'i' modifier makes the match case-insensitive. But I also needed to accept the letter 'ñ'.

So, what I tried, and what worked for me, was this regex:

/^[a-z\d_\w]+$/iu

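I added '\w' to accept any word character, and added a 'u' after the 'i' so that the pattern and the subject string are treated as UTF-8. (Strictly speaking, '\w' already covers a-z, digits and the underscore, so the rest of the class is redundant.)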

I also added accept-charset to the form:

<form accept-charset="utf-8">

Now it seems to work.
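To illustrate the difference between the two patterns in this answer, here is a small sketch. It assumes UTF-8 input and a PHP build where the u modifier also makes \w Unicode-aware (the usual case, as far as I know). The \p{L} variant is an alternative added for comparison, not something from the answer itself, and all three function names are made up.

```php
<?php
// ASCII-only pattern: no u modifier, so "ñ" is seen as two
// unknown bytes and rejected.
function is_ascii_identifier(string $s): bool
{
    return preg_match('/^[a-z\d_]+$/i', $s) === 1;
}

// The pattern from this answer: with u, \w also matches letters
// outside ASCII such as "ñ".
function is_word_identifier(string $s): bool
{
    return preg_match('/^[a-z\d_\w]+$/iu', $s) === 1;
}

// Alternative (an assumption, not from the answer): spell out
// "any Unicode letter" with \p{L} instead of relying on \w.
function is_letter_identifier(string $s): bool
{
    return preg_match('/^[\p{L}\d_]+$/u', $s) === 1;
}

var_dump(is_ascii_identifier('niño_1'));  // bool(false)
var_dump(is_word_identifier('niño_1'));   // bool(true)
var_dump(is_letter_identifier('niño_1')); // bool(true)
```

Note that both Unicode-aware variants accept letters from any script, which is broader than the explicit list of Spanish letters in the question.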

Why are you specifying /i yet enumerating all the upper‐ and lower‐case letters separately?

ALSO: This won’t work at all if you don’t normalize your input. Consider how ñ can be either the single character U+00F1 or the character U+006E followed by U+0303!

  • Unicode Normalization Form D will guarantee that both U+00F1 and U+006E,U+0303 turn into the canonically decomposed form U+006E,U+0303.

  • Unicode Normalization Form C will guarantee that both U+00F1 and U+006E,U+0303 turn into the composed form U+00F1, because it uses canonical decomposition followed by canonical composition.

Based on your pattern, it looks like you want the NFC form.

From PHP, you’ll need to use the Normalizer class (from the intl extension) on these to get it working reliably.
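A rough sketch of normalizing to NFC before the whitelist check. It assumes the intl extension is loaded, that the input is UTF-8, and that this file is saved as UTF-8 with the pattern's accented letters in precomposed form. The names to_nfc and is_valid_comment are hypothetical.

```php
<?php
// Requires the intl extension, which provides the Normalizer class.
function to_nfc(string $text): string
{
    $normalized = Normalizer::normalize($text, Normalizer::FORM_C);

    // normalize() returns false on failure (for example invalid UTF-8);
    // in that case the input is handed back unchanged.
    return $normalized === false ? $text : $normalized;
}

// Hypothetical wrapper around the whitelist from the question.
function is_valid_comment(string $text): bool
{
    // 0 means no disallowed character was found.
    return preg_match('/[^a-zA-Z áéíóúüñÁÉÍÓÚÜÑ]/iu', to_nfc($text)) === 0;
}

// Usage: "niño" typed as n + combining tilde (U+006E U+0303).
// Without to_nfc() the combining mark would be rejected by the whitelist.
// is_valid_comment("nin\u{0303}o") is expected to return true.
```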

I don't know if this will help, but I had exactly the same problem with these special characters and it drove me crazy for days. In the end I found that the cause was an htmlentities() call sanitizing the string before it reached preg_match(). Moving the htmlentities() call after preg_match() made it work.
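The ordering problem can be reproduced in a few lines. This is only a sketch, assuming UTF-8 input and a UTF-8 source file; validate_then_escape is a hypothetical name.

```php
<?php
// Validate the raw text first, escape afterwards.
// Returns the escaped string, or null if the input is rejected.
function validate_then_escape(string $raw): ?string
{
    // Anything other than 0 (a match, or false for invalid UTF-8) is rejected.
    if (preg_match('/[^a-zA-Z áéíóúüñÁÉÍÓÚÜÑ]/iu', $raw) !== 0) {
        return null;
    }
    return htmlentities($raw, ENT_QUOTES, 'UTF-8');
}

// Escaping first turns "ñ" into "&ntilde;", and the "&" and ";"
// are not in the whitelist, so the check would fail:
var_dump(htmlentities('niño', ENT_QUOTES, 'UTF-8')); // string(11) "ni&ntilde;o"

var_dump(validate_then_escape('niño')); // string(11) "ni&ntilde;o"
var_dump(validate_then_escape('<b>'));  // NULL
```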

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow