Hah. Unfortunately, what's "appropriate" almost always comes down to monetary and managerial decisions, but "it is very tedious" is not generally considered a valid engineering concern -- it is merely an excuse not to refactor the code appropriately.
Meanwhile, the question asks for non-PreparedStatement methods: in short, if you cannot offload the work to a library (e.g., prepared statements), then the only other option is to do it yourself, for each "injected" item. Either way, the work of checking the input must be performed. The only questions are where it is done, and how much expertise went into the input-validation code.
For example, consider a simple SELECT statement:
String sql = "SELECT * FROM mytable WHERE id = " + untrustedVar;
For completeness, assume the classic injection examples, where untrustedVar is a string like 1 OR 1=1 or 1; DROP TABLE mytable; Obviously this results in unwanted behavior: either all rows are returned to the caller, or the table goes missing:
SELECT * FROM mytable WHERE id = 1 OR 1=1;
SELECT * FROM mytable WHERE id = 1; DROP TABLE mytable;
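The mechanics are worth making concrete: naive concatenation simply splices the attacker's text into the statement, so one query becomes two. A minimal sketch (class and method names are illustrative):

```java
// Demonstrates how string concatenation lets attacker input rewrite the query.
public class InjectionDemo {
    // Builds the query exactly the way the vulnerable code above does.
    static String buildSql(String untrustedVar) {
        return "SELECT * FROM mytable WHERE id = " + untrustedVar;
    }

    public static void main(String[] args) {
        // The "id" now carries a second, destructive statement.
        System.out.println(buildSql("1; DROP TABLE mytable;"));
        // prints: SELECT * FROM mytable WHERE id = 1; DROP TABLE mytable;
    }
}
```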
Obligatory XKCD reference
In this case, you could let the language semantics at least ensure that untrustedVar is an integer, perhaps in your function definition:
String[] selectRowById ( int untrustedVar ) { ...
Or if it's a string, you might do this with a regex like:
Pattern valid_id_re = Pattern.compile("^\\d{1,10}$"); // ensure id is 1 to 10 digits
Matcher m = valid_id_re.matcher( untrustedVar );
if ( ! m.matches() )
    return null;
But if you have longer inputs that do not have any grammar or structure guarantees (e.g., a web form textarea), then you will need to do the lower-level character replacements to escape the potentially bad characters. Per statement. Per variable. Per database flavor (PostgreSQL, Oracle, MySQL, SQLite, etc.). This ... is a can of worms.
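As a taste of what "doing it yourself" looks like, here is a minimal sketch of escaping a string for a MySQL-style single-quoted literal. The helper name is hypothetical, and this is NOT a complete defense: real escaping must also account for the connection charset and each database's own rules, which is exactly why this approach is a can of worms.

```java
// Illustrative-only escaper for a MySQL-style single-quoted string literal.
// Incomplete by design: ignores connection charset and other DB-specific rules.
public class NaiveEscaper {
    static String escapeSqlLiteral(String input) {
        StringBuilder sb = new StringBuilder(input.length() + 2);
        sb.append('\'');
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '\'': sb.append("''");   break; // double embedded quotes
                case '\\': sb.append("\\\\"); break; // MySQL treats \ as an escape
                case '\0': break;                    // drop NUL bytes entirely
                default:   sb.append(c);
            }
        }
        sb.append('\'');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeSqlLiteral("Robert'); DROP TABLE Students;--"));
    }
}
```

Note how quickly the special cases pile up for just one database flavor; multiply by every statement, variable, and DBMS you support.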
The upside, of course, is that if you are not using prepared statements, and no one has yet done the work to otherwise avoid SQL injection attacks for your application, you have nowhere to go but up.
Meanwhile, I urge, urge, urge you to reconsider your stance on "We can't use prepared statements because reasons." More to the point, and as Gord Thompson correctly points out in the comments below, "it will be a considerable amount of work in any case, so why not just do it right and be done with it?"
Edit
After writing the above, it occurred to me that some might think that merely writing a prepared statement = better security. Actually, it's prepared statements with bound parameters that improve security. For example, one could write this:
String sql = "SELECT * FROM mytable WHERE id = " + untrustedVar;
PreparedStatement pstmt = dbh.prepareStatement( sql );
return pstmt.executeQuery();
At this point, you've done little more than prepare a statement that has already been injected with malicious code. Instead, consider the actual binding of parameters:
String sql = "SELECT * FROM mytable WHERE id = ?"; // Raw string; never touched by "tainted" variable
PreparedStatement pstmt = dbh.prepareStatement( sql );
pstmt.setObject(1, untrustedVar); // Perform the actual binding.
return pstmt.executeQuery();
This latter example does two things: it first creates a known-safe format and sends that to the DB for preparation. Then, only after the DB has returned a handle to the prepared statement, do we bind the variables and execute the statement.
Answer from hunteke on Stack Overflow
Not sure what escapeSimple does exactly, but if it behaves like mysqli_real_escape_string then it's ok, as long as you are not using some weird character encoding or forget to set it correctly (UTF8 should be ok by default, as far as I know).
Remember that using MD5 or any other simple hashing function is considered bad practice today, and you should use better functions which take more resources to compute (and with salts), for example bcrypt, argon2, etc.
I also don't like that == in the comparison, you should always use === unless you have a good reason to want type juggling. Otherwise you are going to get into trouble and something like this might happen: https://stackoverflow.com/questions/22140204/why-md5240610708-is-equal-to-md5qnkcdzo
By the way, there might also be timing attacks to consider (see Alexander O'Mara's comment below).
That said, your question was about avoiding SQL injection without prepared statements. Yes, it's definitely possible to avoid it, if you are very careful and make sure every statement is correctly constructed and its parts correctly escaped. However in some cases it might be tricky, and to avoid any mistakes of course prepared statements are the way to go.
If escapeSimple behaves like mysqli_real_escape_string then there is one important caveat to be aware of:
mysqli::real_escape_string -- mysqli_real_escape_string — Escapes special characters in a string for use in an SQL statement, taking into account the current charset of the connection
(emphasis is mine)
So you need an active connection for this charset-aware function. But I don't see in your code how escapeSimple relates to an existing, open DB connection. So does it really behave the way you would expect?
This type of coding is not advisable for several reasons, one being the lack of portability. Software nowadays is often designed to run on different DBMSes (e.g., MySQL or SQLite), which is why we have abstraction layers like PDO.
But this code is problematic for other reasons: it uses mysql_fetch_array, which has been deprecated since PHP 5.5.0. In fact, what you are showing are the plain old mysql_* functions.
At the very least this means your environment is not up to date. Your PHP install may carry a number of security vulnerabilities that won't be fixed, as it is EOL. Thus, the real vulnerability could come from something other than this particular code.
Guarantee of 100% safe from SQL injection? Not going to get it (from me).
In principle, your database (or library in your language that is interacting with the db) could implement prepared statements with bound parameters in an unsafe way susceptible to some sort of advanced attack, say exploiting buffer overflows or having null-terminating characters in user-provided strings, etc. (You could argue that these types of attacks should not be called SQL injection as they are fundamentally different; but that's just semantics).
I have never heard of any of these attacks on prepared statements on real databases in the field, and I strongly suggest using bound parameters to prevent SQL injection. Without bound parameters or input sanitization, it's trivial to do SQL injection. With only input sanitization, it's quite often possible to find an obscure loophole around the sanitization.
With bound parameters, your SQL query execution plan is figured out ahead of time without relying on user input, which should make SQL injection not possible (as any inserted quotes, comment symbols, etc are only inserted within the already compiled SQL statement).
The only argument against using prepared statements is that you may want your database to optimize execution plans based on the actual query. Most databases, when given the full query, are smart enough to choose an optimal plan; e.g., if the query returns a large percentage of the table, the planner will want to walk the entire table to find matches, while if it's only going to fetch a few records, an index-based search may be faster.
EDIT: Responding to two criticisms (that are a tad too long for comments):
First, as others noted, not every relational database supporting prepared statements with bound parameters pre-compiles the prepared statement without looking at the values of the bound parameters. Many databases customarily do, but it's also possible for a database to look at the bound-parameter values when choosing an execution plan. This isn't a problem: the structure of a prepared statement with separated bound parameters makes it easy for the database to cleanly differentiate the SQL statement (including SQL keywords) from the data in the bound parameters, where nothing in a bound parameter will be interpreted as an SQL keyword. This is not possible when constructing SQL statements via string concatenation, where variables and SQL keywords get intermixed.
Second, as the other answer points out, using bound parameters when calling an SQL statement at one point in a program will safely prevent SQL injection when making that top-level call. However, if you have SQL injection vulnerabilities elsewhere in the application (e.g., in user-defined functions stored and run in your database that were unsafely written to construct SQL queries by string concatenation), you are still exposed.
For example, if in your application you wrote pseudo-code like:
sql_stmt = "SELECT create_new_user(?, ?)"
params = (email_str, hashed_pw_str)
db_conn.execute_with_params(sql_stmt, params)
There can be no SQL injection when running this SQL statement at the application level. However if the user-defined database function was written unsafely (using PL/pgSQL syntax):
CREATE FUNCTION create_new_user(email_str TEXT, hashed_pw_str TEXT) RETURNS VOID AS
$$
DECLARE
    sql_str TEXT;
BEGIN
    sql_str := 'INSERT INTO users VALUES (''' || email_str || ''', ''' || hashed_pw_str || ''');';
    EXECUTE sql_str;
END;
$$
LANGUAGE plpgsql;
then you would be vulnerable to SQL injection attacks, because it executes an SQL statement constructed via string concatenation that mixes the SQL statement with strings containing the values of user defined variables.
That said, unless you were trying to be unsafe (constructing SQL statements via string concatenation), it would be more natural to write the user-defined function in a safe way, like:
CREATE FUNCTION create_new_user(email_str TEXT, hashed_pw_str TEXT) RETURNS VOID AS
$$
BEGIN
INSERT INTO users VALUES (email_str, hashed_pw_str);
END;
$$
LANGUAGE plpgsql;
Further, if you really feel the need to compose an SQL statement from a string inside a user-defined function, you can still separate data variables from the SQL statement, in the same way as prepared statements with bound parameters. For example, in PL/pgSQL:
CREATE FUNCTION create_new_user(email_str TEXT, hashed_pw_str TEXT) RETURNS VOID AS
$$
DECLARE
sql_str TEXT;
BEGIN
sql_str := 'INSERT INTO users VALUES ($1, $2)';
EXECUTE sql_str USING email_str, hashed_pw_str;
END;
$$
LANGUAGE plpgsql;
So using prepared statements is safe from SQL injection, as long as you aren't just doing unsafe things elsewhere (that is constructing SQL statements by string concatenation).
100% safe? Not even close. Bound parameters (whether via prepared statements or otherwise) can effectively prevent, 100%, one class of SQL injection vulnerability (assuming no db bugs and a sane implementation). In no way do they prevent other classes. Note that PostgreSQL (my db of choice) has an option to bind parameters to ad hoc statements, which saves a round trip compared to prepared statements if you don't need their other features.
You have to figure that many large, complex databases are programs in themselves. The complexity of these programs varies quite a bit, and SQL injection is something that has to be looked out for inside programming routines. Such routines include triggers, user-defined functions, stored procedures, and the like. It is not always obvious how these things interact from the application level, as many good DBAs provide some degree of abstraction between the application access level and the storage level.
With bound parameters, the query tree is parsed, then, in PostgreSQL at least, the data is looked at in order to plan. The plan is executed. With prepared statements, the plan is saved so you can re-execute the same plan with different data over and over (this may or may not be what you want). But the point is that with bound parameters, a parameter cannot inject anything into the parse tree. So this class of SQL injection issue is properly taken care of.
But now suppose we need to log who writes what to a table, so we add triggers and user-defined functions to encapsulate that logic. These pose new issues. If you have any dynamic SQL in these, then you have to worry about SQL injection there. The tables they write to may have triggers of their own, and so forth. Similarly, a function call might invoke another query, which might invoke another function call, and so on. Each of these is planned independently of the main query tree.
What this means is that if I run a query with a bound parameter like foo'; drop user postgres; -- then it cannot directly affect the top-level query tree and cause it to add another command to drop the postgres user. However, if this query calls another function, directly or not, it is possible that somewhere down the line a function will be vulnerable and the postgres user will be dropped. The bound parameters offer no protection to secondary queries. Those secondary queries need to use bound parameters too, to the extent possible, and where that is not possible, they need to use appropriate quoting routines.
The rabbit hole goes deep.
BTW for a question on Stack Overflow where this problem is apparent, see https://stackoverflow.com/questions/37878426/conditional-where-expression-in-dynamic-query/37878574#37878574
Also a more problematic version (due to limitation on utility statements) at https://stackoverflow.com/questions/38016764/perform-create-index-in-plpgsql-doesnt-run/38021245#38021245
The ESAPI library has procedures for escaping input for SQL and for developing your own db specific encoders if necessary.
Check out JTDS FAQ - I'm pretty confident that with a combination of properties prepareSQL and maxStatements you could go there (or "could have gone" as you probably completed that task years ago :-) )
Java, untested.
List<String> clauses = new ArrayList<String>();
List<String> binds = new ArrayList<String>();
if (request.name != null) {
binds.add(request.name);
clauses.add("NAME = ?");
}
if (request.city != null) {
binds.add(request.city);
clauses.add("CITY = ?");
}
...
// Note: if clauses is empty this produces invalid SQL ("... WHERE ");
// guard for that case in real code.
String whereClause = String.join(" AND ", clauses);
String sql = "SELECT * FROM table WHERE " + whereClause;
PreparedStatement ps = con.prepareStatement(sql);
int col = 1;
for(String bind : binds) {
ps.setString(col++, bind);
}
ResultSet rs = ps.executeQuery();
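The clause-building portion of the pattern above can be isolated into a small helper and unit-tested without a database (class and method names here are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class WhereBuilder {
    // Joins "COL = ?" fragments with AND; returns the base query unchanged
    // when no filters were supplied, avoiding a dangling WHERE.
    static String buildSelect(String table, List<String> clauses) {
        String base = "SELECT * FROM " + table;
        if (clauses.isEmpty()) {
            return base;
        }
        return base + " WHERE " + String.join(" AND ", clauses);
    }

    public static void main(String[] args) {
        List<String> clauses = new ArrayList<>();
        clauses.add("NAME = ?");
        clauses.add("CITY = ?");
        System.out.println(buildSelect("table", clauses));
        // prints: SELECT * FROM table WHERE NAME = ? AND CITY = ?
    }
}
```

Because only the fixed "COL = ?" fragments ever reach the SQL string, and all user values go through setString, the dynamic query stays injection-safe.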
If you add arguments to prepared statements they will automatically be escaped.
Connection conn = pool.getConnection();
String selectStatement = "SELECT * FROM User WHERE userId = ? ";
PreparedStatement prepStmt = conn.prepareStatement(selectStatement);
prepStmt.setString(1, userId);
ResultSet rs = prepStmt.executeQuery();
Using string concatenation for constructing your query from arbitrary input will not make PreparedStatement safe. Take a look at this example:
preparedStatement = "SELECT * FROM users WHERE name = '" + userName + "';";
If somebody puts
' or '1'='1
as userName, your PreparedStatement will be vulnerable to SQL injection, since that query will be executed on database as
SELECT * FROM users WHERE name = '' OR '1'='1';
So, if you use
preparedStatement = "SELECT * FROM users WHERE name = ?";
preparedStatement.setString(1, userName);
you will be safe.
Some of this code taken from this Wikipedia article.
The prepared statement, if used properly, does protect against SQL injection. But please post a code example to your question, so we can see if you are using it properly.
It sounds like you haven't benchmarked the simplest solution - prepared statements. You say that they "might hurt performance" but until you've tested it, you really won't know.
I would definitely test prepared statements first. Even if they do hamper performance slightly, until you've tested them you won't know whether you can still achieve the performance you require.
Why spend time trying to find alternative solutions when you haven't tried the most obvious one?
If you find that prepared statement execution plan caching is costly, you may well find there are DB-specific ways of tuning or disabling it.
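For example, MySQL Connector/J exposes connection-URL properties such as useServerPrepStmts and cachePrepStmts for this; the sketch below only builds the URL string (other drivers expose different knobs, so verify the property names against your driver's documentation):

```java
// Sketch: tuning/disabling prepared-statement handling via JDBC URL
// properties. Property names are from MySQL Connector/J; other drivers differ.
public class JdbcUrlDemo {
    static String buildUrl(String host, String db, boolean serverPrepare) {
        return "jdbc:mysql://" + host + "/" + db
                + "?useServerPrepStmts=" + serverPrepare
                + "&cachePrepStmts=" + serverPrepare;
    }

    public static void main(String[] args) {
        // Client-side emulation only, no server-side plan caching:
        System.out.println(buildUrl("localhost", "mydb", false));
    }
}
```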
Not sure if I understand your question correctly. Is there something in PreparedStatement that isn't fitting your needs?
I think that whether or not the statement is cached on the server side is an implementation detail of the database driver and the specific database you're using; if your query/statement changes over time, then this should have no impact -- the cached/compiled statements simply won't be used.