Synchronizing code between Jupyter/IPython notebook script and class methods

I'm trying to figure out the best way to keep code in a Jupyter/IPython notebook in sync with the same code inside a class method. Here's the use case:

I wrote a long pandas script in a notebook, split across multiple cells, which made development easy because I could check intermediate results as I went. This is very useful with pandas scripts. I then downloaded the working code as a Python ".py" file and converted the script into a method of a Python class in my program. The class is instantiated with the input data, and the method returns the output. Everything works great. The class is used in a much larger application, so it is the real deliverable.

But then a bug showed up for a certain data set in the method, and it was also present in my script. I could go back to my notebook and step through the cells to find the issue. Once I fix it there, though, I have to carefully make the same change in the regular Python class method code. This is a bit painful.

Ideally, I'd like to be able to run a class method across cells, so I can check intermediate results. I can't figure out how to do this.

So what is the best practice for keeping script code and the same code embedded in a class method in sync?

Yes, I know that I can import the class into the notebook, but then I lose the ability to look at intermediate results inside the class method via individual cells, which is what I do when it is a pure script. With pandas, this is very useful.

asked Aug 04 '16 14:08

Irv

1 Answer

I have used the same development workflow and recognize the value of being able to step through code in a Jupyter notebook. I've developed several packages by first hashing out the details in a notebook and then moving the polished product into separate .py files. I do not think there is a simple solution to the inconvenience you describe (I have run into the same issues), but I will describe my practice (I'm not so bold as to proclaim it the "best" practice) and maybe it will be helpful in your use case.

In my experience, once I have created a module/package from my Jupyter notebook, it is easier to maintain and develop the code outside of the notebook and import that module into the notebook for testing.

Keeping each method small is good practice in general, and is very helpful for testing the logic at each step using the notebook. You can break larger "public" methods into smaller "private" methods named with a leading underscore (e.g. '_load_file'). You can call the "private" methods in your notebook for testing/debugging, but users of your module should know to ignore these methods.
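As a rough illustration of that decomposition, here is a minimal sketch. The class name SalesReport, its methods, and the sample data are all hypothetical, and plain lists of dicts stand in for pandas DataFrames to keep the example self-contained. The idea is that each "private" step takes and returns an intermediate result, so the notebook can call the steps one cell at a time.

```python
import csv
import io


class SalesReport:
    """Hypothetical class: each processing step is a small method of its own."""

    def __init__(self, raw_csv):
        self.raw_csv = raw_csv

    def _load_rows(self):
        # Step 1: parse the raw text into a list of dicts.
        return list(csv.DictReader(io.StringIO(self.raw_csv)))

    def _clean_rows(self, rows):
        # Step 2: drop rows with a missing amount and convert the rest to float.
        return [
            {"region": r["region"], "amount": float(r["amount"])}
            for r in rows
            if r["amount"]
        ]

    def _total_by_region(self, rows):
        # Step 3: sum the amounts per region.
        totals = {}
        for r in rows:
            totals[r["region"]] = totals.get(r["region"], 0.0) + r["amount"]
        return totals

    def run(self):
        # Public entry point: the same steps, chained together.
        return self._total_by_region(self._clean_rows(self._load_rows()))


# In the notebook, run one step per cell and inspect each intermediate result.
report = SalesReport("region,amount\nnorth,10\nsouth,\nnorth,5\nsouth,2.5\n")
rows = report._load_rows()
cleaned = report._clean_rows(rows)
totals = report._total_by_region(cleaned)
```

Because run() only chains the same small methods, a fix made while stepping through them in the notebook is a fix to the class itself, and nothing has to be copied back.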

You can use the reload function in the importlib module to quickly refresh your imported modules with changes made to the source.

import mymodule
from importlib import reload

# Re-execute mymodule's source so that edits saved to disk take effect in this kernel
reload(mymodule)

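Running import again will not pick up your changes, because Python reuses the module it has already cached in sys.modules. You need the reload function (or similar) to force Python to recompile and re-execute the module code. Objects created before the reload keep using the old class definitions, so re-create them afterwards.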
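The difference between a repeated import and a reload can be seen in a small self-contained sketch. The module name demo_mymodule and the class Cleaner are hypothetical, and the module is written to a temporary directory only so the example can run on its own. In real use you would edit your own .py file in an editor instead.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Avoid a stale .pyc being reused when the file changes within the same second.
sys.dont_write_bytecode = True

# Hypothetical throwaway module written to a temp directory.
module_dir = Path(tempfile.mkdtemp())
module_path = module_dir / "demo_mymodule.py"
module_path.write_text("class Cleaner:\n    def step(self):\n        return 'old'\n")

sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()
import demo_mymodule

cleaner = demo_mymodule.Cleaner()
before = cleaner.step()

# Edit the source on disk, as you would in an editor.
module_path.write_text(
    "class Cleaner:\n    def step(self):\n        return 'new and fixed'\n"
)

import demo_mymodule  # no effect: the module is already cached in sys.modules
after_plain_import = demo_mymodule.Cleaner().step()

importlib.reload(demo_mymodule)  # re-executes the module source
after_reload = demo_mymodule.Cleaner().step()

# An object created before the reload still belongs to the old class.
stale_instance = cleaner.step()
```

Only after_reload reflects the edited source. In IPython, the autoreload extension (`%load_ext autoreload` followed by `%autoreload 2`) is one of the "similar" options and reloads changed modules automatically before each cell runs.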

Inevitably, you will still need to step through individual functions line by line, but if you've decomposed your code into small methods, the amount of code you need to "re-write" in the notebook is very small.

answered Nov 04 '22 02:11

Gordon Bean

Tags: python, pandas, jupyter-notebook, jupyter, ipython-notebook
