I have two Python files. I use one of them to import all the prerequisite libraries and the other to execute some code. Here is the first file, imports.py:
def importAll(process):
    import pandas as pd
    import numpy as np
    import seaborn as sns
    import matplotlib.pyplot as plt
    print('Success')
    if process == 'train':
        import sklearn
The second python file train.py is as follows:
from imports import importAll
importAll('train')

def load_data(date):
    # load only data till Sep
    df = pd.read_csv('df.csv')
    return(df[df['date'] < date])

date = '2012-09-01'
df = load_data(date)
When I run train.py, 'Success' is printed (from imports.py). However, I also get an error saying that pd is not defined (on the line df = pd.read_csv('df.csv')). Is there any way to correct this error?
When you import inside a function, the imported name is bound only in that function's local scope, not in the scope the function is called from. That is why pd is undefined in train.py even though importAll ran and printed 'Success'.
I'd recommend looking at this question for a good explanation of the scope rules in Python.
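The scoping behaviour can be reproduced without pandas. The following is a minimal sketch that uses the standard-library statistics module as a stand-in for pandas; the function name import_inside_function is hypothetical and only serves the illustration.

```python
def import_inside_function():
    # The import binds the name `st` as a local variable of this function only.
    import statistics as st
    return "st" in locals()


bound_locally = import_inside_function()

# After the call returns, the module-level namespace still has no `st` name.
visible_at_module_level = "st" in globals()

try:
    st.mean([1, 2, 3])  # same situation as pd.read_csv(...) in train.py
    error_name = None
except NameError as exc:
    error_name = type(exc).__name__

print(bound_locally, visible_at_module_level, error_name)
```

The function sees the name while it runs, but the caller never does, which matches the "pd is not defined" error in the question.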
To fix this, you can use python's star import.
imports.py:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
__all__ = [
    'pd',
    'np',
    'sns',
    'plt'
]
train.py:
from imports import *
...
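The syntax from module import * imports every name listed in that module's __all__. If the module does not define __all__, it imports every public name instead, i.e. every name that does not begin with an underscore.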
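To see how __all__ filters a star import without creating files, here is a small sketch that builds a throwaway module in memory. The module name demo_imports and the aliases m and js are assumptions made for the demonstration, and math and json stand in for the data-science libraries.

```python
import sys
import types

# Build a throwaway module in memory (the name `demo_imports` is hypothetical).
demo = types.ModuleType("demo_imports")
exec(
    "import math as m\n"
    "import json as js\n"
    "__all__ = ['m']\n",  # only `m` is exported by a star import
    demo.__dict__,
)
sys.modules["demo_imports"] = demo

# Run the star import in a fresh namespace, as a separate file would.
namespace = {}
exec("from demo_imports import *", namespace)

imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)
```

Only the names listed in __all__ reach the importing namespace; js is defined in the module but is not exported.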
It is also possible to make imports.py import different libraries depending on which file imports it, as your process argument does. I strongly discourage the use of the code below, because it has the opposite effect of what you intend. It removes the "clutter" of redundant import statements at the cost of something much worse: confusing code and a hidden bug waiting to surface (explained below).
Alas, the solution:
import inspect

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

CODE_CONTEXT = ['from imports import *\n']

__all__ = [
    'pd',
    'np',
    'sns',
    'plt'
]

def _get_filename():
    frame, *_ = filter(
        lambda frame: getattr(frame, 'code_context', None) == CODE_CONTEXT,
        inspect.stack()
    )
    return frame.filename

imported_from = _get_filename()

if imported_from == 'train.py':
    import sklearn
    __all__.append('sklearn')
elif imported_from == 'eda.py':
    ...
To understand how a bug might come from this code, consider this example:
imports.py:
import inspect

CODE_CONTEXT = ['from imports import *\n']

__all__ = []

def _get_filename():
    frame, *_ = filter(
        lambda frame: getattr(frame, 'code_context', None) == CODE_CONTEXT,
        inspect.stack()
    )
    return frame.filename

imported_from = _get_filename()
print(imported_from)
a.py:
from imports import *
b.py:
from imports import *
main.py:
import a
import b
When you run python3 main.py what will print to the console?
a.py
Why isn't b.py printed? Because a module is executed only once, on its first import. Since a.py imported the module first, every subsequent import of imports.py reuses the module object cached in sys.modules instead of re-executing the code.
TL;DR
Any alterations made to __all__ are decided by the first import of the module only, so later importers might be missing the modules they need.
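The caching behaviour described above can be checked with a short script. This is a sketch under stated assumptions: the module name once_demo and the log file runs.log are hypothetical, and the module is written to a temporary directory purely so the example is self-contained.

```python
import importlib
import os
import sys
import tempfile

tmp_dir = tempfile.mkdtemp()
log_path = os.path.join(tmp_dir, "runs.log")

# The module body appends one line to a log file each time it is executed.
source = (
    f"with open({log_path!r}, 'a') as log:\n"
    "    log.write('executed\\n')\n"
)
with open(os.path.join(tmp_dir, "once_demo.py"), "w") as fh:
    fh.write(source)

sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

first = importlib.import_module("once_demo")   # executes the module body
second = importlib.import_module("once_demo")  # served from sys.modules

with open(log_path) as log:
    run_count = len(log.readlines())

same_object = first is second and sys.modules["once_demo"] is first
print(run_count, same_object)
```

The second import returns the already-built module object, so any logic in the module body that depends on who is importing it only ever sees the first importer.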
Ideally you should work with a Python package-like structure. A regular Python package has an __init__.py file, and in that file you can import everything from the other files/modules in the package.
Suppose you make a package named package. The file structure will be:
package/
    __init__.py
    file1.py
    file2.py
outer.py
If file1.py has two classes
class A:
    def __init__(self):
        pass

class B:
    def __init__(self):
        pass
and file2.py has two more classes
class C:
    def __init__(self):
        pass

class D:
    def __init__(self):
        pass
and you want to make all of these classes available in an outer file, import them in the __init__.py file, like so:
from .file1 import *
from .file2 import *
Now, in the outer file (outer.py), you can simply do this:
from package import *
# this will import all four A, B, C, D classes here
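The package layout above can be tried end to end with the following sketch, which writes the files into a temporary directory. The package name demo_package is hypothetical (chosen to avoid clashing with anything installed), and the class bodies are reduced to pass for brevity.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_package")  # hypothetical package name
os.mkdir(pkg_dir)

files = {
    "file1.py": "class A:\n    pass\n\nclass B:\n    pass\n",
    "file2.py": "class C:\n    pass\n\nclass D:\n    pass\n",
    # __init__.py re-exports the public names of both submodules.
    "__init__.py": "from .file1 import *\nfrom .file2 import *\n",
}
for name, body in files.items():
    with open(os.path.join(pkg_dir, name), "w") as fh:
        fh.write(body)

sys.path.insert(0, root)
importlib.invalidate_caches()

# Equivalent to the star import in outer.py, run in a fresh namespace.
namespace = {}
exec("from demo_package import *", namespace)

classes = sorted(
    name for name, obj in namespace.items()
    if isinstance(obj, type) and not name.startswith("_")
)
print(classes)
```

Because __init__.py here defines no __all__, the star import also carries along the submodule names file1 and file2; listing the intended names in __all__ would restrict the export to the four classes.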