
TypeError: Cannot cast array data from dtype('int64') to dtype('int32') according to the rule 'safe' while plotting a seaborn.regplot

I'm trying to plot a regplot using seaborn, but the plot fails with TypeError: Cannot cast array data from dtype('int64') to dtype('int32') according to the rule 'safe'.

My data has 731 rows and 16 columns:

>>> bike_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 731 entries, 0 to 730
Data columns (total 16 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0   instant     731 non-null    int64  
 1   dteday      731 non-null    object 
 2   season      731 non-null    int64  
 3   yr          731 non-null    int64  
 4   mnth        731 non-null    int64  
 5   holiday     731 non-null    int64  
 6   weekday     731 non-null    int64  
 7   workingday  731 non-null    int64  
 8   weathersit  731 non-null    int64  
 9   temp        731 non-null    float64
 10  atemp       731 non-null    float64
 11  hum         731 non-null    float64
 12  windspeed   731 non-null    float64
 13  casual      731 non-null    int64  
 14  registered  731 non-null    int64  
 15  cnt         731 non-null    int64  
dtypes: float64(4), int64(11), object(1)
memory usage: 88.6+ KB

Here is a snippet of the data (image: data snippet). When I try to plot a regplot using seaborn:

>>> sns.regplot(x="casual", y="cnt", data=bike_df);

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-54-68533af96906> in <module>
----> 1 sns.regplot(x="casual", y="cnt", data=bike_df);

~\AppData\Local\Continuum\anaconda3\envs\rstudio\lib\site-packages\seaborn\regression.py in regplot(x, y, data, x_estimator, x_bins, x_ci, scatter, fit_reg, ci, n_boot, units, seed, order, logistic, lowess, robust, logx, x_partial, y_partial, truncate, dropna, x_jitter, y_jitter, label, color, marker, scatter_kws, line_kws, ax)
    816     scatter_kws["marker"] = marker
    817     line_kws = {} if line_kws is None else copy.copy(line_kws)
--> 818     plotter.plot(ax, scatter_kws, line_kws)
    819     return ax
    820 

~\AppData\Local\Continuum\anaconda3\envs\rstudio\lib\site-packages\seaborn\regression.py in plot(self, ax, scatter_kws, line_kws)
    363 
    364         if self.fit_reg:
--> 365             self.lineplot(ax, line_kws)
    366 
    367         # Label the axes

~\AppData\Local\Continuum\anaconda3\envs\rstudio\lib\site-packages\seaborn\regression.py in lineplot(self, ax, kws)
    406         """Draw the model."""
    407         # Fit the regression model
--> 408         grid, yhat, err_bands = self.fit_regression(ax)
    409         edges = grid[0], grid[-1]
    410 

~\AppData\Local\Continuum\anaconda3\envs\rstudio\lib\site-packages\seaborn\regression.py in fit_regression(self, ax, x_range, grid)
    214             yhat, yhat_boots = self.fit_logx(grid)
    215         else:
--> 216             yhat, yhat_boots = self.fit_fast(grid)
    217 
    218         # Compute the confidence interval at each grid point

~\AppData\Local\Continuum\anaconda3\envs\rstudio\lib\site-packages\seaborn\regression.py in fit_fast(self, grid)
    239                                     n_boot=self.n_boot,
    240                                     units=self.units,
--> 241                                     seed=self.seed).T
    242         yhat_boots = grid.dot(beta_boots).T
    243         return yhat, yhat_boots

~\AppData\Local\Continuum\anaconda3\envs\rstudio\lib\site-packages\seaborn\algorithms.py in bootstrap(*args, **kwargs)
     83     for i in range(int(n_boot)):
     84         resampler = integers(0, n, n)
---> 85         sample = [a.take(resampler, axis=0) for a in args]
     86         boot_dist.append(f(*sample, **func_kwargs))
     87     return np.array(boot_dist)

~\AppData\Local\Continuum\anaconda3\envs\rstudio\lib\site-packages\seaborn\algorithms.py in <listcomp>(.0)
     83     for i in range(int(n_boot)):
     84         resampler = integers(0, n, n)
---> 85         sample = [a.take(resampler, axis=0) for a in args]
     86         boot_dist.append(f(*sample, **func_kwargs))
     87     return np.array(boot_dist)

TypeError: Cannot cast array data from dtype('int64') to dtype('int32') according to the rule 'safe'

I tried changing the dtype of the columns with astype, like below:

>>> bike_df['cnt'] = bike_df['cnt'].astype(np.int32)

but this did not help; I got the same error again while plotting.

Any suggestions are appreciated.

Thanks in advance.

Jeevan NH asked Feb 04 '20 15:02


2 Answers

Update: this bug is solved in Seaborn version 0.10.1 (April 2020).
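To see whether an environment already has the fixed release, the installed version can be compared against 0.10.1. This is only a rough sketch: `release_tuple` and `has_fix` are hypothetical helper names made up for illustration, and the parsing only looks at the leading numeric parts of the version string.

```python
from importlib.metadata import PackageNotFoundError, version


def release_tuple(version_string):
    """Return the leading numeric parts of a version string as a tuple of ints."""
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as "dev0" or "1rc1"
        parts.append(int(piece))
    return tuple(parts)


def has_fix(version_string, fixed_in=(0, 10, 1)):
    """True if the version is at least the release that contains the fix."""
    return release_tuple(version_string) >= fixed_in


try:
    installed = version("seaborn")
    print(installed, "has the fix" if has_fix(installed) else "predates the fix")
except PackageNotFoundError:
    print("seaborn is not installed in this environment")
```

If the version predates 0.10.1, upgrading seaborn is probably simpler than patching the installed file by hand.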

I encountered the same problem. It is issue 1950 on Seaborn's GitHub and is related to running a 32-bit version of numpy. It will be solved in the next release.

To work around the problem, I changed line 84 of my local copy of Seaborn's algorithms.py:

resampler = integers(0, n, n, dtype=np.int_)

This happened with:

  • numpy version: 1.18.1

  • seaborn version: 0.10.0
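This also suggests why casting the `cnt` column in the question made no difference: judging by the traceback, the int64 array being rejected is the `resampler` index array built inside `bootstrap`, not the DataFrame column. The sketch below illustrates the idea with plain numpy. It uses `np.intp` (numpy's index type) rather than the `np.int_` from the workaround above, which is my own choice for the illustration, and `cnt` here is a stand-in array, not the real data.

```python
import numpy as np

# "Safe" casting never narrows an integer type, which is what the error reports.
print(np.can_cast(np.int64, np.int32, casting="safe"))  # False

rng = np.random.default_rng(0)
n = 731  # number of rows in bike_df

# Generator.integers returns int64 unless a dtype is passed explicitly.
default_idx = rng.integers(0, n, n)
# np.intp is the platform's index type: 32-bit on 32-bit Python, 64-bit on 64-bit.
index_idx = rng.integers(0, n, n, dtype=np.intp)
print(default_idx.dtype, index_idx.dtype)

# The dtype of the data itself does not matter for the indexing step.
cnt = np.arange(n, dtype=np.int32)  # stand-in for bike_df["cnt"]
sample = cnt.take(index_idx, axis=0)
print(sample.dtype, sample.shape)
```

On a 64-bit interpreter both index arrays come out as int64, so the failure does not reproduce there; on a 32-bit one, only the second matches the index type.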

JohanC answered Nov 16 '22 07:11


I had this issue on my machine too!

I tried modifying Seaborn's algorithms.py as JohanC mentioned, but it didn't work.

Then I realized that my Python version was 32-bit, so I installed a newer 64-bit Python version and ran the same code.

The version I downloaded and installed was the 64-bit build of Python 3.8.2.

After that, Python ran the script without problems!
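A quick way to check whether the interpreter is a 32-bit or 64-bit build, using only the standard library (a minimal sketch; the variable names are just for illustration):

```python
import struct
import sys

# Size of a C pointer in bits: 32 for a 32-bit interpreter, 64 for a 64-bit one.
pointer_bits = struct.calcsize("P") * 8
# sys.maxsize is 2**31 - 1 on 32-bit builds and 2**63 - 1 on 64-bit builds.
is_64bit = sys.maxsize > 2**32

print(f"Python {sys.version_info.major}.{sys.version_info.minor}, {pointer_bits}-bit")
print("64-bit interpreter:", is_64bit)
```

Run it inside the same environment that produces the error, since a machine can have both 32-bit and 64-bit interpreters installed.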

Chandler Klüser answered Nov 16 '22 08:11