Detect SQL injections in the source code

Question

Consider the following code snippet:

import MySQLdb

def get_data(id):
    db = MySQLdb.connect(db='TEST')
    cursor = db.cursor()
    cursor.execute("SELECT * FROM TEST WHERE ID = '%s'" % id)

    return cursor.fetchall()

print(get_data(1))

There is a major problem in the code - it is vulnerable to SQL injections attacks since the query is not parameterized through DB API and is constructed via string formatting. If you call the function this way:

get_data("'; DROP TABLE TEST -- ")

the following query would be executed:

SELECT * FROM TEST WHERE ID = ''; DROP TABLE TEST --

Now, my goal is to analyze the code in the project and detect all places potentially vulnerable to SQL injections. In other words, where the query is constructed via string formatting as opposed to passing query parameters in a separate argument.

Is it something that can be solved statically, with the help of pylint, pyflakes or any other static code analysis packages?

I'm aware of sqlmap popular penetration testing tool, but, as far as I understand, it is working against a web resource, testing it as a black-box through HTTP requests.

alecxe · Accepted Answer

There is a tool that tries to solve exactly what the question is about, py-find-injection:

py_find_injection uses various heuristics to look for SQL injection vulnerabilities in python source code.

It uses ast module, looks for session.execute() and cursor.execute() calls, and checks whether the query inside is formed via string interpolation, concatenation or format().

Here is what it outputs while checking the snippet in the question:

$ py-find-injection test.py
test.py:6   string interpolation of SQL query
1 total errors

The project, though, is not actively maintained, but could be used as a starting point. A good idea would be to make a pylint or pyflakes plugin out of it.

PaulMcG · Answer

Not sure how this will compare with the other packages, but to a certain extent you need to parse the arguments being passed to cursor.execute. This bit of pyparsing code looks for:

arguments using string interpolation
arguments using string concatenation with variable names
arguments that are just variable names

But sometimes arguments use string concatenation just to break up a long string into - if all the strings in the expression are literals being added together, there is no risk of SQL injection.

This pyparsing snippet will look for calls to cursor.execute, and then look for the at-risk argument forms:

from pyparsing import *
import re

identifier = Word(alphas, alphanums+'_')
integer = Word(nums)
LPAR,RPAR,PLUS,PERCENT = map(Literal, '()+%')

stringInterpRE = re.compile(r"%-?\d*\*?\.?\d*\*?s")
def containsStringInterpolation(s,l,tokens):
    if not stringInterpRE.search(tokens[0]):
        raise ParseException(s,l,"No string interpolation")
tupleContents = identifier | integer
tupleExpr = LPAR + delimitedList(tupleContents) + RPAR
stringInterpArg = identifier | tupleExpr        
interpolatedString = originalTextFor(quotedString.copy().setParseAction(containsStringInterpolation) + 
                                    PERCENT + stringInterpArg)

stringTerm = interpolatedString | OneOrMore(quotedString.copy()) | identifier
stringTerm.setName("stringTerm")

unsafeStringExpr = (stringTerm + OneOrMore(PLUS + stringTerm)) | identifier | interpolatedString
def unsafeExpr(s,l,tokens):
    if not any(term == interpolatedString or term == identifier
                for term in tokens):
        raise ParseException(s,l,"No unsafe string terms")
unsafeStringExpr.setParseAction(unsafeExpr)
unsafeStringExpr.setName("unsafeExpr")

func = Literal("cursor.execute")
statement = func + LPAR + unsafeStringExpr + RPAR
statement.setName("execute stmt")
#statement.ignore(pythonComment)

for tokens in statement.searchString(sample):
    print ' '.join(tokens.asList())

This will scan through the following sample:

sample = """
import MySQLdb

def get_data(id):
    db = MySQLdb.connect(db='TEST')
    cursor = db.cursor()
    cursor.execute("SELECT * FROM TEST WHERE ID = '%s' -- UNSAFE" % id)
    cursor.execute("SELECT * FROM TEST WHERE ID = '" + id + "' -- UNSAFE")
    cursor.execute(sqlVar + " -- UNSAFE")
    cursor.execute("SELECT * FROM TEST WHERE ID = 'FRED' -- SAFE")
    cursor.execute("SELECT * FROM TEST WHERE ID = " + 
                        "'FRED' -- SAFE")
    cursor.execute("SELECT * FROM TEST "
                        "WHERE ID = "
                        "'FRED' -- SAFE")
    cursor.execute("SELECT * FROM TEST "
                        "WHERE ID = " +
                        "'%s' -- UNSAFE" % name)
    return cursor.fetchall()

print(get_data(1))"""

and report these unsafe statements:

cursor.execute ( "SELECT * FROM TEST WHERE ID = '%s' -- UNSAFE" % id )
cursor.execute ( "SELECT * FROM TEST WHERE ID = '" + id + "' -- UNSAFE" )
cursor.execute ( sqlVar + " -- UNSAFE" )
cursor.execute ( "SELECT * FROM TEST " "WHERE ID = " + "'%s' -- UNSAFE" % name )

You can also have pyparsing report the location of the found lines, using scanString instead of searchString.

Detect SQL injections in the source code

Tags:

python

security

sql

sql-injection

static-code-analysis

alecxe

2 Answers

alecxe

PaulMcG

Recent Activity

Donate For Us

Detect SQL injections in the source code

Tags:

python

security

sql

sql-injection

static-code-analysis

alecxe

2 Answers

alecxe

PaulMcG

Related questions

Recent Activity

Donate For Us