I have a Jupyter notebook that I plan to run repeatedly. It contains functions, and the code is structured like this:
def construct_url(data):
    ...
    return url

def scrape_url(url):
    ...  # fetch url, extract data
    return parsed_data

for i in mylist:
    url = construct_url(i)
    data = scrape_url(url)
    ...  # use the data to do analysis
I'd like to write tests for construct_url and scrape_url. What's the most sensible way to do this?
Some approaches I've considered:
Jupyter Notebook can show the documentation of the function you are calling: press Shift+Tab to view it. This is helpful because you don't need to open the documentation website every time.
In this guide, we cover how to test the code inside a Jupyter notebook using pytest. This approach allows you to build comprehensive yet flexible tasks for the user to complete. For example, you can test the contents of a given variable, the return value of a function, or even a class.
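As a rough illustration of the pytest style, here is a minimal sketch of tests for construct_url. The body of construct_url and the example.com URL pattern are hypothetical stand-ins, since the question elides the real implementation. By default pytest collects functions whose names start with test_ and treats a failing assert as a test failure.

```python
def construct_url(item):
    # Hypothetical stand-in; the real function body is elided in the question.
    return f"https://example.com/items/{item}"


def test_construct_url_embeds_item():
    # A plain assert is all pytest needs to report a failure.
    assert construct_url("abc") == "https://example.com/items/abc"


def test_construct_url_returns_string():
    assert isinstance(construct_url(42), str)


# Outside pytest, the same functions can be called directly as a quick check.
test_construct_url_embeds_item()
test_construct_url_returns_string()
```

Because the tests are ordinary functions, they can live in a notebook cell during development and later be moved unchanged into a separate test file.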
For example, suppose the unit-test file is named Basic_Test.py. The command to run the tests with unittest is then: $ python3.6 -m unittest Basic_Test.Testing. If you want verbose output, the command is: $ python3.
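To make the command-line form concrete, here is a sketch of what a file like Basic_Test.py might contain. The class name Testing is inferred from the command above; construct_url and its URL pattern are hypothetical stand-ins, not the original code.

```python
# Basic_Test.py (hypothetical contents)
import unittest


def construct_url(item):
    # Hypothetical stand-in for the notebook's construct_url.
    return f"https://example.com/items/{item}"


class Testing(unittest.TestCase):
    # "Basic_Test.Testing" on the command line selects this class.
    def test_construct_url(self):
        self.assertEqual(construct_url("abc"), "https://example.com/items/abc")

    def test_construct_url_accepts_numbers(self):
        self.assertEqual(construct_url(7), "https://example.com/items/7")
```

With this layout, the dotted name passed to python -m unittest is the module name (the file name without .py) followed by the test class name.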
Python standard testing tools, such as doctest and unittest, can be used directly in a notebook.
Using doctest, a notebook cell with a function and a test case in its docstring:
def add(a, b):
    '''
    This is a test:
    >>> add(2, 2)
    5
    '''
    return a + b
A notebook cell (the last one in the notebook) that runs all test cases in the docstrings:
import doctest
doctest.testmod(verbose=True)
Output:
Trying:
    add(2, 2)
Expecting:
    5
**********************************************************************
File "__main__", line 4, in __main__.add
Failed example:
    add(2, 2)
Expected:
    5
Got:
    4
1 items had no tests:
    __main__
**********************************************************************
1 items had failures:
   1 of   1 in __main__.add
1 tests in 2 items.
0 passed and 1 failed.
***Test Failed*** 1 failures.
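Applied to the question's construct_url, the doctest approach might look like the sketch below. The function body and the example.com URL pattern are hypothetical, since the question does not show them. Instead of doctest.testmod, this sketch uses doctest.run_docstring_examples, which checks a single function's docstring and can be handy when only one function in the notebook should be tested.

```python
import doctest


def construct_url(item):
    """Build the URL for one item (hypothetical stand-in).

    >>> construct_url("abc")
    'https://example.com/items/abc'
    >>> construct_url(7)
    'https://example.com/items/7'
    """
    return f"https://example.com/items/{item}"


# Check only this function's docstring. With the default verbose=False,
# nothing is printed unless an example fails.
doctest.run_docstring_examples(
    construct_url, {"construct_url": construct_url}, name="construct_url"
)
```

The second argument is the namespace in which the docstring examples run, so it has to contain every name the examples use.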
Using unittest, a notebook cell with a function:
def add(a, b):
    return a + b
A notebook cell (the last one in the notebook) that contains a test case. The last line in the cell runs the test case when the cell is executed:
import unittest

class TestNotebook(unittest.TestCase):

    def test_add(self):
        self.assertEqual(add(2, 2), 5)

unittest.main(argv=[''], verbosity=2, exit=False)
Output:
test_add (__main__.TestNotebook) ... FAIL

======================================================================
FAIL: test_add (__main__.TestNotebook)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "<ipython-input-15-4409ad9ffaea>", line 6, in test_add
    self.assertEqual(add(2, 2), 5)
AssertionError: 4 != 5

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (failures=1)
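For a function like scrape_url, which fetches a page, a unit test usually should not touch the network. The sketch below assumes, purely for illustration, that scrape_url uses urllib.request.urlopen and returns the page title; the real function is not shown in the question. It replaces urlopen with a fake via unittest.mock.patch, and it runs only this one test case through an explicit suite rather than unittest.main.

```python
import unittest
import urllib.request
from unittest import mock


def scrape_url(url):
    # Hypothetical stand-in: fetch the page and return the text of its <title> tag.
    with urllib.request.urlopen(url) as response:
        html = response.read().decode("utf-8")
    start = html.find("<title>") + len("<title>")
    end = html.find("</title>")
    return html[start:end]


class TestScrapeUrl(unittest.TestCase):
    def test_extracts_title(self):
        # Fake response object that also works as a context manager.
        fake_response = mock.MagicMock()
        fake_response.__enter__.return_value.read.return_value = (
            b"<html><title>Hello</title></html>"
        )
        with mock.patch(
            "urllib.request.urlopen", return_value=fake_response
        ) as fake_open:
            self.assertEqual(scrape_url("https://example.com/items/abc"), "Hello")
        # The function should have requested exactly the URL it was given.
        fake_open.assert_called_once_with("https://example.com/items/abc")


# Run only this test case; the returned result object can be inspected afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestScrapeUrl)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```

Loading one test case explicitly keeps the run independent of whatever other TestCase classes happen to be defined elsewhere in the notebook session.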
While debugging a failed test, it is often useful to halt the test case execution at some point and run a debugger. For this, insert the following code just before the line at which you want the execution to halt:
import pdb; pdb.set_trace()
For example:
def add(a, b):
    '''
    This is the test:
    >>> add(2, 2)
    5
    '''
    import pdb; pdb.set_trace()
    return a + b
For this example, the next time you run the doctest, the execution will halt just before the return statement and the Python debugger (pdb) will start. You will get a pdb prompt directly in the notebook, which will allow you to inspect the values of a and b, step over lines, etc.
Note: Starting with Python 3.7, the built-in breakpoint() can be used instead of import pdb; pdb.set_trace().
I created a Jupyter notebook for experimenting with the techniques I have just described. You can try it out with