Reproducing SQL Injection in sqlite3 and PyMySQL

SQL injection is a security vulnerability that allows an unauthorized user to execute SQL queries against a database. It can be exploited in any type of application (console, web, desktop, mobile). An application might be vulnerable to SQL injection when allowing the user to enter some data that is later included in a database query. In this post you will learn how to create and exploit vulnerable code in Python, using SQLite and MySQL as example databases. If you want to keep your Python application secure, you need to know how vulnerabilities arise!

Reproducing in SQLite

Let's start with the sqlite3 standard module. Consider the following program that prompts the user for a name and age and then inserts this data into an SQLite database.

import sqlite3
conn = sqlite3.connect("database.sqlite")
cursor = conn.cursor()
cursor.execute("CREATE TABLE IF NOT EXISTS people (name TEXT, age NUMERIC)")
name = input("Name: ")
age = int(input("Age: "))
cursor.execute(f"INSERT INTO people VALUES ('{name}', {age})")
conn.commit()
print("Data successfully inserted!")
conn.close()

Here's a typical usage session of this application:

Name: Charles
Age: 30
Data successfully inserted!

Note that the user input is included into the SQL query by using Python's latest formatting syntax, which works by prefixing the string with f and enclosing within braces the names of the variables whose values we want to inject. This magic is done in the following line:

cursor.execute(f"INSERT INTO people VALUES ('{name}', {age})")

Thus, if name is "Charles" and age is 30, the resulting query will be:

INSERT INTO people VALUES ('Charles', 30)

The problem with this approach is that no validation is performed on the data entered by the user. What happens if the user writes the following?

Name: Charles', 30); DELETE FROM people; --
Age: 30

Because these two values are injected into the string, the query would look like this (note that there is a space after the two hyphens):

INSERT INTO people VALUES ('Charles', 30); DELETE FROM people; -- ', 30)

Since SQL lets you separate multiple queries using semicolons, the above code does the following:

  1. Insert a row with the values "Charles" and 30 as name and age, respectively.

  2. Delete every row from the table.

The double hyphen at the end of the query introduces a comment, i.e., the last ', 30) characters are completely ignored, and this is necessary for the database engine (in this case, SQLite) to not throw any syntax errors. So, until now, we would theoretically have managed to inject an SQL query (DELETE FROM) from the input() function. But if you enter such a name and age, you will see that the program throws the following error.

Traceback (most recent call last):
    [...]
    cursor.execute(f"INSERT INTO people VALUES ('{name}', {age})")
sqlite3.Warning: You can only execute one statement at a time.

We find out that SQLite's execute() function only allows a single query to be executed at a time, thus strings containing more than one query (separated by semicolons) will cause the sqlite3 module to produce the above exception. But beware: this happens with this particular module and may not apply to other database modules. If we want to execute more than one query with a single string, we must use the executescript() function. So try to replace this line

cursor.execute(f"INSERT INTO people VALUES ('{name}', {age})")

with

cursor.executescript(f"INSERT INTO people VALUES ('{name}', {age})")

Now run the code again and enter the following values:

Name: Charles', 30); DELETE FROM people; --
Age: 30

You will see that the program does not throw any errors and, after opening the database, the people table will be effectively empty. You have successfully injected SQL code!

How do we prevent this vulnerability? Remember that its origin lies in injecting user-entered variables within the SQL query. The method by which this operation is carried out is indistinct, that is, it can be any of the following formatting syntaxes:

# WARNING: THESE ARE ALL VULNERABLE.
.execute(f"INSERT INTO people VALUES ('{name}', {age})")
.execute("INSERT INTO people VALUES ('{}', {})".format(name, age))
.execute("INSERT INTO people VALUES ('" + name + "', " + str(age) + ")")
.execute("INSERT INTO people VALUES ('%s', %d)" % (name, age))

These four methods lead to SQL injection vulnerability. The only way to make it safe is to send separately the SQL query and the user-entered variables (name and age, or wherever you want to include in the query). And database engines support this. When calling execute(), reserve the first argument for your SQL query and pass a tuple containing user-entered data in the second argument, but make sure to use a question mark as a placeholder in the query to indicate where the elements of the tuple must be replaced.

# This is not vulnerable.
cursor.execute("INSERT INTO people VALUES (?, ?)", (name, age))

Note that even though name is a string, we don't use single quotes around the question mark inside the query. The database engine will take care of that.

Now run the program again with this updated line and enter the malicious payload:

Name: Charles', 30); DELETE FROM people; --
Age: 30
Data successfully inserted!

You will note that the program does not fail with an exception, but SQL injection does not get place either. That is, the DELETE FROM people query is not executed. Rather, the result is the following:

/images/reproducing-sql-injection-in-sqlite3-and-pymysql/sqlite3-browser.png

This is totally to be expected, since we have said that the name we wanted to enter was Charles', 30); DELETE FROM people; -- (a bit of an odd name). Thus we have eliminated the possibility of SQL injection in our application.

Now you will understand xkcd's famous comic about SQL injection:

https://imgs.xkcd.com/comics/exploits_of_a_mom.png

Reproducing in PyMySQL

The logic to prevent SQL injection is the same in every database engine, database module and programming language: do not include variables whose values are user-originated inside an SQL query. Instead, let the database engine take care of it by shipping them separately. However, the exact way in wich this is done may have some variations from what we saw above for SQLite. We will consider them in this section for the MySQL database engine using the PyMySQL module. (There are other modules to communicate with MySQL databases, such as mysql-connector-python and mysqlclient.)

A program similar to the one above, but using the pymysql module, would look something like this:

import pymysql
conn = pymysql.connect(
    host="localhost",
    user="user",
    passwd="password",
    db="dbname"
)
cursor = conn.cursor()
cursor.execute("CREATE TABLE IF NOT EXISTS people (name VARCHAR(50), age INT)")
name = input("Name: ")
age = int(input("Age: "))
cursor.execute(f"INSERT INTO people VALUES ('{name}', {age})")
conn.commit()
print("Data successfully inserted!")
conn.close()

As stated in the previous section, this line makes the code vulnerable:

cursor.execute(f"INSERT INTO people VALUES ('{name}', {age})")

However, if you run the program and enter this data:

Name: Charles', 30); DELETE FROM people; --
Age: 30

You will get the following exception:

Traceback (most recent call last):
    [...]
pymysql.err.ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'DELETE FROM people; -- ', 30)' at line 1")

This happens because the PyMySQL module does not allow multiple queries to be executed in the same execute() call by default. In order to do this, you will need to pass the client_flag=pymysql.constants.CLIENT.MULTI_STATEMENTS parameter to the connect() function.

conn = pymysql.connect(
    host="localhost",
    user="user",
    passwd="password",
    db="dbname",
    client_flag=pymysql.constants.CLIENT.MULTI_STATEMENTS
)

Now repeat the steps and you will see that the query is executed successfully:

Name: Charles', 30); DELETE FROM people; --
Age: 30
Data successfully inserted!

Thus, the SQL injection will have succeeded and all rows in the people table will have been deleted.

The solution to fix the vulnerability is the same as for the sqlite3 module, but in this case, the placeholder is %s instead of ?.

cursor.execute("INSERT INTO people VALUES (%s, %s)", (name, age))

Not to be confused with the following code that does generate the vulnerability:

cursor.execute("INSERT INTO people VALUES ('%s', %d)" % (name, age))

Placeholders might change when switching between database modules. However, possible placeholders are limited by Python's Database API Specification.