 

about pandasql locals() and globals() method issue

The sqldf function in the pandasql package takes a "session/environment variables" argument, which can be locals() or globals(). Could anyone explain what it is for? Is there any documentation on when to use locals() and when to use globals()?

https://github.com/yhat/pandasql/

Here is my code. What is pandasql looking for through locals()? And does locals() here mean the namespace inside the function select_first_50?

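import pandas
import pandasql

def select_first_50(filename):
    # Load the CSV and normalise column names: spaces -> underscores, lowercase
    students = pandas.read_csv(filename)
    students.rename(columns=lambda x: x.replace(' ', '_').lower(), inplace=True)

    # The table name in the query must match the local variable name `students`
    q = "select major, gender from students limit 50"

    # Execute the SQL command against the pandas frame; locals() lets
    # pandasql find `students` by name
    results = pandasql.sqldf(q.lower(), locals())
    return results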
Lin Ma asked Feb 10 '23 07:02



1 Answer

locals() and globals() are Python built-in functions that return the corresponding namespace as a dictionary.

In Python, a namespace is how scope is implemented. The global namespace corresponds to the global scope, so names defined there are visible throughout the module.

The local namespace is the namespace that is local to a particular function.

globals() returns a dictionary representing the current global namespace.

What locals() returns depends on where it is called. Called at module level (not inside a function), it returns the same dictionary as globals(), that is, the global namespace. Called inside a function, it returns that function's local namespace.
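A minimal sketch of the difference, using only built-ins. The names threshold, label and show_namespaces are made up for illustration:

```python
# Module-level name: lives in the global namespace.
threshold = 10

def show_namespaces():
    # Name defined inside the function: lives in its local namespace.
    label = "inside"
    local_names = set(locals())    # names visible only in this function
    global_names = set(globals())  # names defined at module level
    return local_names, global_names

local_names, global_names = show_namespaces()

# `label` is local to the function; it does not leak into the globals.
print("label" in local_names)    # True
print("label" in global_names)   # False
# `threshold` is not part of the function's local namespace.
print("threshold" in local_names)  # False
```

When this runs as a plain script, `locals() is globals()` also evaluates to True at module level, which matches the description above.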

In pandasql, the second argument is this namespace: a dictionary containing the variables that your query refers to. For example, suppose you create a DataFrame called a and then write a query against it. pandasql needs to find the DataFrame that corresponds to the name a, and it does that by looking the name up in the local or global namespace you pass in. That is what the second argument is for.
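The lookup idea can be sketched without pandasql at all. The helper find_table below is hypothetical and is not pandasql's actual implementation; it only illustrates how a name in a query can be resolved through the dictionary passed in. A plain list stands in for a DataFrame:

```python
import re

def find_table(query, env):
    """Hypothetical helper: return the object in `env` named after FROM."""
    match = re.search(r"\bfrom\s+(\w+)", query, flags=re.IGNORECASE)
    if match is None:
        raise ValueError("no table name found in query")
    name = match.group(1)
    if name not in env:
        raise KeyError(f"{name!r} is not defined in the namespace passed in")
    return env[name]

def run():
    a = [{"x": 1}, {"x": 2}]  # stand-in for a DataFrame, local to run()
    # locals() here contains `a`, so the lookup succeeds.
    return find_table("select * from a", locals())

table = run()
```

Passing a dictionary that does not contain a (an empty dict, for instance) makes the same lookup fail with a KeyError, which mirrors what happens when the wrong namespace is handed over.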

So you need to decide which one to pass in. For example, if your DataFrame is defined only inside a function and does not exist in the global scope, pass in the dictionary returned by locals(). If your DataFrame exists in the global scope, pass in the result of globals().

Anand S Kumar answered Feb 11 '23 19:02
