Question

I have one "go" script that fetches any other script requested and this is what I wrote to sanitize user input:

foreach ($_REQUEST as $key => $value){
    if (get_magic_quotes_gpc()) 
    $_REQUEST[$key] = mysql_real_escape_string(stripslashes($value));  
    else
    $_REQUEST[$key] = mysql_real_escape_string($value); 
}

I haven't seen anyone else use this approach. Is there any reason not to?

EDIT - amended for to work for arrays:

function mysql_escape($thing) {
  if (is_array($thing)) {
    $escaped = array();
    foreach ($thing as $key => $value) {
      $escaped[$key] = mysql_escape($value);
    }          
    return $escaped;
  }
  // else
  if (get_magic_quotes_gpc()) $thing = stripslashes($thing);
  return mysql_real_escape_string($thing);
}

foreach ($_REQUEST as $key => $value){
    $_REQUEST[$key] = mysql_escape($value); 
}
Was it helpful?

Solution

I find it much better to escape the data at the time it is used, not on the way in. You might want to use that data in JSON, XML, Shell, MySQL, Curl, or HTML and each will have it's own way of escaping the data.


Lets have a quick review of WHY escaping is needed in different contexts:

If you are in a quote delimited string, you need to be able to escape the quotes. If you are in xml, then you need to separate "content" from "markup" If you are in SQL, you need to separate "commands" from "data" If you are on the command line, you need to separate "commands" from "data"

This is a really basic aspect of computing in general. Because the syntax that delimits data can occur IN THE DATA, there needs to be a way to differentiate the DATA from the SYNTAX, hence, escaping.

In web programming, the common escaping cases are: 1. Outputting text into HTML 2. Outputting data into HTML attributes 3. Outputting HTML into HTML 4. Inserting data into Javascript 5. Inserting data into SQL 6. Inserting data into a shell command

Each one has a different security implications if handled incorrectly. THIS IS REALLY IMPORTANT! Let's review this in the context of PHP:

  1. Text into HTML: htmlspecialchars(...)

  2. Data into HTML attributes htmlspecialchars(..., ENT_QUOTES)

  3. HTML into HTML Use a library such as HTMLPurifier to ENSURE that only valid tags are present.

  4. Data into Javascript I prefer json_encode. If you are placing it in an attribute, you still need to use #2, such as

  5. Inserting data into SQL Each driver has an escape() function of some sort. It is best. If you are running in a normal latin1 character set, addslashes(...) is suitable. Don't forget the quotes AROUND the addslashes() call:

    "INSERT INTO table1 SET field1 = '" . addslashes($data) . "'"

  6. Data on the command line escapeshellarg() and escapeshellcmd() -- read the manual

-- Take these to heart, and you will eliminate 95%* of common web security risks! (* a guess)

OTHER TIPS

If you have arrays in your $_REQUEST their values won't be sanitized.

I've made and use this one:

<?php
function _clean($var){
    $pattern = array("/0x27/","/%0a/","/%0A/","/%0d/","/%0D/","/0x3a/",
                     "/union/i","/concat/i","/delete/i","/truncate/i","/alter/i","/information_schema/i",
                     "/unhex/i","/load_file/i","/outfile/i","/0xbf27/");
    $value = addslashes(preg_replace($pattern, "", $var));
    return $value;
}

if(isset($_GET)){
    foreach($_GET as $k => $v){
        $_GET[$k] = _clean($v);
    }
}

if(isset($_POST)){
    foreach($_POST as $k => $v){
        $_POST[$k] = _clean($v);
    }
}
?>

Your approach tries to sanitize all the request data for insertion into the database, but what if you just wanted to output it? You will have unnecessary backslashes in your output. Also, escaping is not a good strategy to protect from SQL exceptions anyway. By using parametrized queries (e.g. in PDO or in MySQLi) you "pass" the problem of escaping to the abstraction layer.

Apart from the lack of recursion into arrays and the unnecessary escaping of, say, integers, this approach encodes data for use in an SQL statement before sanitization. mysql_real_escape_string() escapes data, it doesn't sanitize it -- escaping and sanitizing aren't the same thing.

Sanitization is the task many PHP scripts have of scrutinizing input data for acceptability before using it. I think this is better done on data that hasn't been escaped. I generally don't escape data until it goes into the SQL. Those who prefer to use Prepared Statements achieve the same that way.

One more thing: If input data can include utf8 strings, it seems these ought to be validated before escaping. I often use a recursive utf8 cleaner on $_POST before sanitization.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top