py file), or Jupyter Notebook. Remember the file that contains the function definitions and the file calling the functions must be in the same directory. To use the functions written in one file inside another file include the import line, from filename import function_name .
The most Pythonic way to import a module from another folder is to place an empty file named __init__.py into that folder and use the relative path with the dot notation. For example, a module in the parent folder would be imported with from .. import module .
I had almost the same example as you in this notebook where I wanted to illustrate the usage of an adjacent module's function in a DRY manner.
My solution was to tell Python of that additional module import path by adding a snippet like this one to the notebook:
import os
import sys
module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
sys.path.append(module_path)
This allows you to import the desired function from the module hierarchy:
from project1.lib.module import function
# use the function normally
function(...)
Note that it is necessary to add empty __init__.py
files to project1/ and lib/ folders if you don't have them already.
Came here searching for best practices in abstracting code to submodules when working in Notebooks. I'm not sure that there is a best practice. I have been proposing this.
A project hierarchy as such:
├── ipynb
│ ├── 20170609-Examine_Database_Requirements.ipynb
│ └── 20170609-Initial_Database_Connection.ipynb
└── lib
├── __init__.py
└── postgres.py
And from 20170609-Initial_Database_Connection.ipynb
:
In [1]: cd ..
In [2]: from lib.postgres import database_connection
This works because by default the Jupyter Notebook can parse the cd
command. Note that this does not make use of Python Notebook magic. It simply works without prepending %bash
.
Considering that 99 times out of a 100 I am working in Docker using one of the Project Jupyter Docker images, the following modification is idempotent
In [1]: cd /home/jovyan
In [2]: from lib.postgres import database_connection
So far, the accepted answer has worked best for me. However, my concern has always been that there is a likely scenario where I might refactor the notebooks
directory into subdirectories, requiring to change the module_path
in every notebook. I decided to add a python file within each notebook directory to import the required modules.
Thus, having the following project structure:
project
|__notebooks
|__explore
|__ notebook1.ipynb
|__ notebook2.ipynb
|__ project_path.py
|__ explain
|__notebook1.ipynb
|__project_path.py
|__lib
|__ __init__.py
|__ module.py
I added the file project_path.py
in each notebook subdirectory (notebooks/explore
and notebooks/explain
). This file contains the code for relative imports (from @metakermit):
import sys
import os
module_path = os.path.abspath(os.path.join(os.pardir, os.pardir))
if module_path not in sys.path:
sys.path.append(module_path)
This way, I just need to do relative imports within the project_path.py
file, and not in the notebooks. The notebooks files would then just need to import project_path
before importing lib
. For example in 0.0-notebook.ipynb
:
import project_path
import lib
The caveat here is that reversing the imports would not work. THIS DOES NOT WORK:
import lib
import project_path
Thus care must be taken during imports.
All other answers here depends on adding code the the notebook(!)
In my opinion is bad practice to hardcode a specific path into the notebook code, or otherwise depend on the location, since this makes it really hard to refactor you code later on. Instead I would recommend you to add the root project folder to PYTHONPATH when starting up your Jupyter notebook server, either directly from the project folder like so
env PYTHONPATH=`pwd` jupyter notebook
or if you are starting it up from somewhere else, use the absolute path like so
env PYTHONPATH=/Users/foo/bar/project/ jupyter notebook
I have just found this pretty solution:
import sys; sys.path.insert(0, '..') # add parent folder path where lib folder is
import lib.store_load # store_load is a file on my library folder
You just want some functions of that file
from lib.store_load import your_function_name
If python version >= 3.3 you do not need init.py file in the folder
Researching this topic myself and having read the answers I recommend using the path.py library since it provides a context manager for changing the current working directory.
You then have something like
import path
if path.Path('../lib').isdir():
with path.Path('..'):
import lib
Although, you might just omit the isdir
statement.
Here I'll add print statements to make it easy to follow what's happening
import path
import pandas
print(path.Path.getcwd())
print(path.Path('../lib').isdir())
if path.Path('../lib').isdir():
with path.Path('..'):
print(path.Path.getcwd())
import lib
print('Success!')
print(path.Path.getcwd())
which outputs in this example (where lib is at /home/jovyan/shared/notebooks/by-team/data-vis/demos/lib
):
/home/jovyan/shared/notebooks/by-team/data-vis/demos/custom-chart
/home/jovyan/shared/notebooks/by-team/data-vis/demos
/home/jovyan/shared/notebooks/by-team/data-vis/demos/custom-chart
Since the solution uses a context manager, you are guaranteed to go back to your previous working directory, no matter what state your kernel was in before the cell and no matter what exceptions are thrown by importing your library code.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With