Kaggle TypeError: slice indices must be integers or None or have an __index__ method

I am trying to plot a seaborn histogram on a Kaggle notebook in this way:

 sns.distplot(myseries, bins=50, kde=True)

but I get this error:

TypeError: slice indices must be integers or None or have an __index__ method

This is the Kaggle notebook: https://www.kaggle.com/asindico/slice-indices-must-be-integers-or-none/

Here is the head of the series:

0     5850000
1     6000000
2     5700000
3    13100000
4    16331452
Name: price_doc, dtype: int64
Andrea Sindico asked May 16 '17 19:05


2 Answers

As @ryankdwyer pointed out, this was a bug in the underlying statsmodels implementation that is fixed in the 0.8.0 release.
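A minimal sketch of the kind of failure behind this message. It assumes the slice bound was computed with true division (the old statsmodels source is not shown in this answer). On Python 3, `/` yields a float, and floats are rejected as slice indices. The function names and sample data are hypothetical, and a plain list stands in for the array:

```python
def first_part_true_division(values):
    """Slice with a bound computed via '/', which is a float on Python 3."""
    m = len(values)
    return values[:m / 2 + 1]


def first_part_floor_division(values):
    """Slice with a bound computed via '//', which stays an integer."""
    m = len(values)
    return values[:m // 2 + 1]


data = [10, 20, 30, 40, 50, 60]  # hypothetical sample data

message = None
try:
    first_part_true_division(data)
except TypeError as exc:
    message = str(exc)
    print("TypeError:", message)

fixed = first_part_floor_division(data)
print(fixed)
```

Floor division keeps the bound an integer, which is what the `m // 2+1` expression in the patched function further down relies on.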

Since Kaggle does not allow internet access from a kernel/script, upgrading the package is not an option. That leaves the following two alternatives:

  1. Use sns.distplot(myseries, bins=50, kde=False). This will of course not draw the KDE curve.
  2. Manually patch the statsmodels implementation with the code from version 0.8.0. Admittedly, this is a bit hacky, but you will get the KDE plot.

Here is an example (and a proof on kaggle):

import numpy as np

def _revrt(X,m=None):
    """
    Inverse of forrt. Equivalent to Munro (1976) REVRT routine.
    """
    if m is None:
        m = len(X)
    # Floor division keeps the slice index an integer on Python 3.
    i = int(m // 2 + 1)
    y = X[:i] + np.r_[0,X[i:],0]*1j
    return np.fft.irfft(y)*m

from statsmodels.nonparametric import kdetools

# replace the implementation with new method.
kdetools.revrt = _revrt

# import seaborn AFTER replacing the method. 
import seaborn as sns

# draw the distplot with the kde function
sns.distplot(myseries, bins=50, kde=True)

Why does it work? Well, it relates to the way Python loads modules. From the Python docs:

5.3.1. The module cache

The first place checked during import search is sys.modules. This mapping serves as a cache of all modules that have been previously imported, including the intermediate paths. So if foo.bar.baz was previously imported, sys.modules will contain entries for foo, foo.bar, and foo.bar.baz. Each key will have as its value the corresponding module object.

Therefore, once from statsmodels.nonparametric import kdetools has run, the kdetools module sits in this module cache. When seaborn later imports it, the Python import system returns the cached module object instead of loading a fresh copy. Since that cached module is the one we modified, our replacement for the revrt function is used. This technique is known as monkey patching, and it is the same idea that mocking relies on when writing unit tests.
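The caching behaviour can be sketched without statsmodels or seaborn. In the example below, `demo_helper` is a hypothetical stand-in module created on the fly, not a real package. It shows that a later import returns the already patched object, while a reference taken before the patch keeps the old function. That second point is presumably why the answer imports seaborn only after replacing revrt.

```python
import importlib
import sys
import types

# Hypothetical stand-in for a library helper module (plays the role of kdetools).
helper = types.ModuleType("demo_helper")
helper.revrt = lambda: "original"
sys.modules["demo_helper"] = helper  # what a first import would leave in the cache

# Mimics "from demo_helper import revrt" executed BEFORE the patch:
# the name is bound to the old function object and will not follow the patch.
early_binding = helper.revrt

# Patch the cached module object, as the answer does with kdetools.revrt.
import demo_helper
demo_helper.revrt = lambda: "patched"

# A later import anywhere in the process returns the same cached object.
later = importlib.import_module("demo_helper")
same_object = later is helper
result = later.revrt()
early_result = early_binding()

print(same_object, result, early_result)
```

Code that looks the function up through the module after the patch sees the replacement. Code that bound the name earlier does not, so the order of patching and importing matters.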

Jan Trienes answered Nov 15 '22 02:11


This error appears to be a known issue.

https://github.com/mwaskom/seaborn/issues/1092

Potential solution: update your statsmodels package to 0.8.0

pip install -U statsmodels
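To see whether the upgrade is needed, the installed version can be compared against 0.8.0. This is a rough sketch, assuming Python 3.8+ for `importlib.metadata`. The helpers `version_tuple` and `needs_upgrade` are hypothetical names, and the parsing only looks at the leading digits of each version component:

```python
import re
from importlib import metadata

FIXED_IN = (0, 8, 0)  # release that contains the fix, per the answers above


def version_tuple(text):
    """Turn '0.8.0' (or '0.8.0rc1') into a tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def needs_upgrade(installed_version):
    """True if the given version string is older than the fixed release."""
    return version_tuple(installed_version) < FIXED_IN


try:
    installed = metadata.version("statsmodels")
except metadata.PackageNotFoundError:
    installed = None  # statsmodels is not installed in this environment

if installed is None:
    print("statsmodels is not installed")
else:
    print(installed, "needs upgrade:", needs_upgrade(installed))
```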

ryankdwyer answered Nov 15 '22 00:11