
Variable scope is changed in consecutive cells using %%time in Jupyter notebook

TL;DR

I am facing a weird issue (or I am missing something basic). I have a Jupyter notebook in which one cell stores a variable as a numpy.ndarray, but when I print its type in the next cell, it shows up as a list. How is this possible? It works fine on my machine but not in a VM.


Detailed description:

I am working on a pull request that updates a Jupyter notebook, and since I was having some plotting issues in my current setup, I tried testing it on a different machine/system with updated packages and components.

On my laptop I have Ubuntu 16.04 and this configuration:

> The version of the notebook server is: 5.7.4 The server is running on
> this version of Python: Python 3.5.2 (default, Nov 12 2018, 13:43:14) 
> [GCC 5.4.0 20160609]
> 
> Current Kernel Information: Python 3.5.2 (default, Nov 12 2018,
> 13:43:14)
> IPython 7.2.0 -- An enhanced Interactive Python. Type '?' for help.

I created a virtual machine, installed Ubuntu 18.04, and used this configuration:

> The version of the notebook server is: 5.7.6 The server is running on
> this version of Python: Python 3.6.7 (default, Oct 22 2018, 11:32:17) 
> [GCC 8.2.0]
> 
> Current Kernel Information: Python 3.6.7 (default, Oct 22 2018, 11:32:17)
> IPython 7.4.0 -- An enhanced Interactive Python. Type '?' for help.

Then I noticed that in the VM a variable changes its type from numpy.ndarray to list for no reason that I can see. The variable is pos. This causes problems because it is used later for indexing.

Laptop: [screenshot]

Virtual Machine: [screenshot]

What is going on here? Am I missing something?

Any hint is appreciated :) Thanks.


UPDATE:

I've tried another notebook in the VM, and now it's not just the type changing: the variable cannot be reached at all from a different cell (variable joint_vars):

[screenshot]

Could it be a misconfiguration of the environment in the VM?

asked Mar 25 '19 15:03 by gustavovelascoh



1 Answer

I believe the problem here is a change in how cell magics handle scope. Your laptop is running IPython 7.2.0; your VM is running 7.4.0. The behavior changed in 7.4.0 (this may be a bug that will be fixed in a future release).

I suspect that pos had previously been defined as a list in your notebook. In 7.4.0 (as on your VM), everything in a %%time cell is treated as being in a local scope. For example:

Python 3.7.2 | packaged by conda-forge | (default, Mar 19 2019, 20:46:22)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.4.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: foo = "bar"

In [2]: foo
Out[2]: 'bar'

In [3]: %%time
   ...: foo = 5
   ...:
   ...:
CPU times: user 3 µs, sys: 1 µs, total: 4 µs
Wall time: 5.72 µs

In [4]: foo
Out[4]: 'bar'

If you run the same thing with 7.3.0, you end up with

In [4]: foo
Out[4]: 5

Since foo was defined as a string previously, the effect you observe is that the type of foo (as of cell 4) changes depending on the version of IPython. (Here, what should be an integer changes to a string.) This is more subtle when the types involved are closely related, like lists and numpy arrays in your case. It isn't that the type changed because of the cell; it's that the new value never got assigned, so it kept its old type.
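To make the mechanism concrete, here is a toy model in plain Python. It is only a sketch of the idea and not IPython's actual implementation: the helper names run_cell_shared and run_cell_isolated are made up for illustration, and a list/tuple pair stands in for your list/ndarray pair so the snippet has no dependencies.

```python
# Toy model of the scoping difference. This is NOT how IPython is implemented;
# the helper names below are hypothetical and exist only for this illustration.

def run_cell_shared(source, namespace):
    """Run cell code directly in the notebook namespace: assignments persist."""
    exec(source, namespace)


def run_cell_isolated(source, namespace):
    """Run cell code with a throwaway local namespace: assignments are lost."""
    throwaway_locals = {}
    exec(source, namespace, throwaway_locals)


# The "cell" rebinds pos to a different type (tuple stands in for ndarray).
cell_source = "pos = tuple(sorted(pos))"

# Case 1: the cell body runs in a local scope, as I believe happens in 7.4.0.
isolated_ns = {"pos": [3, 1, 2]}
run_cell_isolated(cell_source, isolated_ns)
isolated_type = type(isolated_ns["pos"]).__name__

# Case 2: the cell body runs in the shared namespace, as in 7.3.0.
shared_ns = {"pos": [3, 1, 2]}
run_cell_shared(cell_source, shared_ns)
shared_type = type(shared_ns["pos"]).__name__

print(isolated_type, shared_type)
```

In the isolated case the cell can still read the old pos, but the new binding lands in the throwaway dictionary, so the notebook namespace keeps the old list.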

The solution is to downgrade the VM to IPython 7.3.0 for the time being, or to avoid using the %%time cell magic.
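If you go the second route and still want timings, one possible replacement is a small context manager built on the standard library. This is a minimal sketch under my own assumptions: wall_timer is a hypothetical helper name, and it reports wall time only, not the CPU times that %%time prints.

```python
import time
from contextlib import contextmanager


@contextmanager
def wall_timer(label="Wall time"):
    """Print the elapsed wall-clock time of the enclosed block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        print(f"{label}: {elapsed:.6f} s")


foo = "bar"

# A with block does not open a new scope, so the assignment below
# rebinds foo in the enclosing (notebook-level) namespace.
with wall_timer():
    foo = 5

print(type(foo).__name__)
```

Because this is ordinary Python rather than a cell magic, the assignment does not depend on how a given IPython version runs magic cells.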

answered Oct 13 '22 01:10 by dwhswenson