Can someone clearly explain why mysqli_prepare()/bind_param() is better than real_escape_string()? [duplicate]

Question 1

What you are reading, that you need to use mysqli_prepare() and mysqli_bind_param() functions to "properly escape your MySQL queries" is wrong.

It is true that if you use mysqli_prepare() and mysqli_bind_param(), you needn't (and shouldn't) "escape" the values supplied as bind parameters. So, in that sense, there's some truth in what you are reading.

It's only when unsafe variables are included in the SQL text (the actual text of the query) that you need to "properly escape" the variables, usually by wrapping the variables in mysqli_real_escape_string() function calls.

(We note that it's possible to make of use of prepared statements and still include un-escaped variables in the SQL text, rather than passing the variable values as bind_parameters. That does sort of defeats the purpose of using prepared statements, but the point is, either way, you can write code that is vulnerable.

MySQL now supports "server side" prepared statements (if the option is enabled in the connection), and that's a performance optimization (in some cases) of repeated executions of identical SQL text. (This has been long supported in other databases, such as Oracle, where making use of prepared statements has been a familiar pattern for, like, since forever.)

Q: Did they implement [prepared statements] so that people wouldn't forget to escape values before sending them in a query?

A: Based on the number of examples of code vulnerable to SQL Injection when not using prepared statements, despite the documentation regarding mysql_real_escape_string() function, you'd think that certainly would be sufficient reason.

I think one big benefit is that when we're reading code, we can see a SQL statement as a single string literal, rather than a concatenation of a bunch of variables, with quotes and dots and calls to mysql_real_escape_string, which isn't too bad with a simple query, but with a more complex query, it is just overly cumbersome. The use of the ? placeholder makes for a more understandable SQL statement,... true, I need to look at other lines of code to figure out what value is getting stuffed there. (I think the Oracle style named parameters :fee, :fi, :fo, :fum is preferable to the positional ?, ?, ?, ? notation.) But having STATIC SQL text is what is really the benefit.

Q: Or is it somehow faster?

As I mentioned before, the use of server side prepared statements can be and advantage in terms of performance. It's not always the case that it's faster, but for repeated execution of the same statement, where the only difference is literal values (as in repeated inserts), it can provide a performance boost.

Q: Or should I use this method when I intend to use the same query repeatedly (since a mysqli_stmt can be reused) and use the traditional method in other cases?

That's up to you. My preference is for using STATIC SQL text. But this really comes from a long history of using Oracle, and using the same pattern with MySQL fits naturally. (Albeit, from Perl using the DBI interface, and Java using JDBC and MyBATIS, or other ORMs (Hibernate, Glassfish JPA, et al.)

Following the same pattern just feels natural in PHP; the introduction of mysqli_ and PDO are a welcome relief from the arcane (and abused) mysql_ interface.

Good code can be written following either pattern. But I challenge you to think ahead, about more complex SQL statements, and whether the choice to use mysqli_real_escape_string() and concatenating together a dynamic string to be executed, rather than using static SQL text and bind parameters, might make reading, and deciphering, the actual SQL being executed more complicated for the soul that finds themselves maintaining code they didn't write.

I think studies have shown that code is read ten times more than it is written, which is why we strive to produce readable, understandable code, even if that means more lines of code. (When each statement is doing a single identifiable thing, that's usually easier for me to understand than reading a jumble of concatenated function calls in one complicated statement.

Question 2

I think it's less a question of the latter method being more secure per se than encouraging separation of logic. With prepared statements the SQL query is independent of the values we use. This means, for example, when we go back and change our query we don't have to concatenate a bunch of different values to a string, and maybe risk forgetting to escape our input. Makes for more maintainable code!

Question 3

There are a couple main benefits I found that were well written:

The overhead of compiling and optimizing the statement is incurred only once, although the statement is executed multiple times. Not all optimization can be performed at the time the prepared statement is compiled, for two reasons: the best plan may depend on the specific values of the parameters, and the best plan may change as tables and indexes change over time.
Prepared statements are resilient against SQL injection, because parameter values, which are transmitted later using a different protocol, need not be correctly escaped. If the original statement template is not derived from external input, SQL injection cannot occur.

On the other hand, if a query is executed only once, server-side prepared statements can be slower because of the additional round-trip to the server. Implementation limitations may also lead to performance penalties: some versions of MySQL did not cache results of prepared queries, and some DBMSs such as PostgreSQL do not perform additional query optimization during execution.

Source

Question 4

I would like to add that mysqli_bind_param() has been removed as of PHP 5.4.0. You should use mysqli_stmt_bind_param()