This is an old revision of the document!
Table of Contents
SQL Injection - Why not to use escape quotes as a defence
Raw SQL
When you're writing SQL – for anything that takes human input really, a lot of things have been done to avoid the injection.
Everyone that's heard of SQL injection knows that (I'm going to use PHP as a sample) doing something like this isn't safe:
$sql = "SELECT * FROM `users` . WHERE `userName` = "{$_POST["username"]}" . AND `pass` = "{$_POST["pass"]}";";
Magic
Why can't we escape all user input using “magic”?
Magic quotes was a controversial feature of the PHP scripting language, wherein strings are automatically escaped. Special characters are prefixed with a backslash before being passed on.
The idea of using “magic escape quotes” to deal with program input that isn't sanitized correctly does not really solve the issue with SQL injection for the following reasons.
- At the time the magic is applied, it is unknown where the data will end up. So magic quotes are destroying data, unless it is written unescaped to a database.
- Not all data that are supplied by the user are intended for insertion into a database. They may be rendered directly to the screen, stored in a session, or previewed before saving. This can result in backslashes being added where they are not wanted and being shown to the end user.
- It may just be used in the HTML response sent back to the client. Think of a form that has not been filled in completely and is therefore shown again to the user. With magic quotes, the data entered on the first attempt will now be SQL escaped, which is meaningless on an HTML page. Even worse: on the second submission the data is SQL escaped again.
- Adding magic quotes and subsequently removing them where appropriate incurs a small but unnecessary amount of performance overhead.
- Magic quotes also use the generic functionality provided by PHP's addslashes() function, which is not Unicode-aware and is still subject to SQL injection vulnerabilities in some multi-byte character encodings.
- Magic quotes do not protect against other common security vulnerabilities such as cross-site scripting attacks or SMTP header injection attacks
Adding slashes
The addslashes() command escapes all the quotes which prevents hackers using something like:
' OR 1=1
The PHP directive magic_quotes_gpc was on by default in some versions of PHP, and it essentially automatically ran addslashes() on all GET, POST, and COOKIE data. If addslashes() was then run on the data that had already been escaped with magic_quotes_gpc it would result in double escaping. The function get_magic_quotes_gpc() could come in handy for checking this.
Even the documentation for addslashes says that you shouldn't use addslashes, but instead to use database-specific functions such as mysql_real_escape_string().
addslashes() also has issues with multibyte characters:
' 27
\ 5c
뼧 bf 5c
Those are two bytes, but only one Unicode character.
Since addslashes does not know anything about Unicode, it converts the input bf 27 to bf 5c 27. If this is read by a Unicode-aware program, it is seen as 뼧'. Boom.
There is a good explanation of this issue at http://shiflett.org/blog/2006/jan/addslashes-versus-mysql-real-escape-string.
Example of an addslashes() SQL injection attack
Demonstrating an SQL injection attack that is immune to addslashes() is a bit more involved, but still pretty straightforward.
In GBK, 0xbf27 is not a valid multi-byte character, but 0xbf5c is. Interpreted as single-byte characters, 0xbf27 is 0xbf (¿) followed by 0x27 ('), and 0xbf5c is 0xbf (¿) followed by 0x5c (\).
How does this help? If I want to attempt a SQL injection attack against a MySQL database, having single quotes escaped with a backslash is a bummer. If you're using addslashes(), however, I'm in luck. All I need to do is inject something like 0xbf27, and addslashes() modifies this to become 0xbf5c27, a valid multi-byte character followed by a single quote. In other words, I can successfully inject a single quote despite your escaping. That's because 0xbf5c is interpreted as a single character, not two. Oops, there goes the backslash.
Multibyte Issue
For convenience, single-byte representations are in parentheses. Keep in mind that this is not necessarily how these bytes are interpreted.
0xbf (¿) is used in this example, because the attack is assuming:
1. Single quotes are being escaped with a backslash.
2. 0xbf5c27 (¿\) is a valid multi-byte character in whatever character encoding the database is using.
In other words, 0xbf (¿) isn't a magic byte that works in all cases. It's a specific one used in this particular attack, because when followed by 0x5c (\), it makes a valid multi-byte character, 0xbf5c27 (¿\).
Because 0xbf27 (¿') is not a valid multi-byte character, this particular example doesn't work if single quotes are being escaped with single quotes. Therefore, instead of 0xbf(¿), the attack would use a byte (or bytes) that can be followed by 0x27 (') to form a valid multi-byte character.
It's the exact same attack. Using single quotes to escape single quotes offers no protection. Again, a single quote isn't always a single quote, and that's really the focus of this particular post.
Adding slashes specific to Database
They are a little better because they ensure that the escaping interprets the data in the same way the database does.
From a security point of view, they are okay if you use them for every single database input. But they have the risk that you may forget either of them somewhere, or that you use xxx_escape_strings for numbers without adding quotation marks around them in the SQL statement and without ensuring that they are actual numbers by casting or converting the input to the appropriate data type.
From a software development perspective they are bad because they make it a lot harder to add support for other SQL database server software.
DBMSes have different escape specification for identifiers (e.g. Table name, field name) than parameters. Not all DBMS provide escape API.
Summary
The following should not be used:
- addslashes
- magic_quotes
- DBMS specific _real_escape_string