I want to plot a heatmap of my table df, which looks normal at first:
Total Paid Post Engaged Negative like
1 2178 0 0 66 0 1207
2 1042 0 0 60 0 921
3 2096 0 0 112 0 1744
4 1832 0 0 109 0 1718
5 1341 0 0 38 0 889
6 1933 0 0 123 0 1501
...
but after I applied:
df= full_Data.iloc[1:,4:10]
df= pd.DataFrame(df,columns=['A','B','C', 'D', 'E', 'F'])
corrMatrix = df.corr()
sn.heatmap(corrMatrix, annot=True)
plt.show()
it returned an empty graph:
C:\Users\User\Anaconda3\lib\site-packages\seaborn\matrix.py:204: RuntimeWarning: All-NaN slice encountered
vmin = np.nanmin(calc_data)
C:\Users\User\Anaconda3\lib\site-packages\seaborn\matrix.py:209: RuntimeWarning: All-NaN slice encountered
vmax = np.nanmax(calc_data)

and df returned:
A B C D E F
1 nan nan nan nan nan nan
2 nan nan nan nan nan nan
3 nan nan nan nan nan nan
4 nan nan nan nan nan nan
5 nan nan nan nan nan nan
...
Why are all the values turned into NaN?
Update:
I tried renaming the columns without the constructor call I used before:
df.columns = ['A','B','C', 'D', 'E', 'F']
and
df= pd.DataFrame(df.to_numpy(),columns=['A','B','C', 'D', 'E', 'F'])
and both raised an error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-43-3a27f095066b> in <module>
12
13 corrMatrix = df.corr()
---> 14 sn.heatmap(corrMatrix, annot=True)
15 plt.show()
16
~\Anaconda3\lib\site-packages\seaborn\_decorators.py in inner_f(*args, **kwargs)
44 )
45 kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 46 return f(**kwargs)
47 return inner_f
48
~\Anaconda3\lib\site-packages\seaborn\matrix.py in heatmap(data, vmin, vmax, cmap, center, robust, annot, fmt, annot_kws, linewidths, linecolor, cbar, cbar_kws, cbar_ax, square, xticklabels, yticklabels, mask, ax, **kwargs)
545 plotter = _HeatMapper(data, vmin, vmax, cmap, center, robust, annot, fmt,
546 annot_kws, cbar, cbar_kws, xticklabels,
--> 547 yticklabels, mask)
548
549 # Add the pcolormesh kwargs here
~\Anaconda3\lib\site-packages\seaborn\matrix.py in __init__(self, data, vmin, vmax, cmap, center, robust, annot, fmt, annot_kws, cbar, cbar_kws, xticklabels, yticklabels, mask)
164 # Determine good default values for the colormapping
165 self._determine_cmap_params(plot_data, vmin, vmax,
--> 166 cmap, center, robust)
167
168 # Sort out the annotations
~\Anaconda3\lib\site-packages\seaborn\matrix.py in _determine_cmap_params(self, plot_data, vmin, vmax, cmap, center, robust)
202 vmin = np.nanpercentile(calc_data, 2)
203 else:
--> 204 vmin = np.nanmin(calc_data)
205 if vmax is None:
206 if robust:
<__array_function__ internals> in nanmin(*args, **kwargs)
~\Anaconda3\lib\site-packages\numpy\lib\nanfunctions.py in nanmin(a, axis, out, keepdims)
317 # Fast, but not safe for subclasses of ndarray, or object arrays,
318 # which do not implement isnan (gh-9009), or fmin correctly (gh-8975)
--> 319 res = np.fmin.reduce(a, axis=axis, out=out, **kwargs)
320 if np.isnan(res).any():
321 warnings.warn("All-NaN slice encountered", RuntimeWarning,
ValueError: zero-size array to reduction operation fmin which has no identity
I think the problem is that you passed a DataFrame object to the pd.DataFrame constructor. The original column names differ from the new names in the list, so the constructor looks up columns that do not exist and only NaNs are created.
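To see the effect in isolation, here is a minimal sketch. The frame `src` and its two columns are made-up stand-ins for `full_Data.iloc[1:,4:10]`, not the real data:

```python
import pandas as pd

# Hypothetical stand-in for the sliced table; names and values are illustrative.
src = pd.DataFrame({"Total": [2178, 1042, 2096], "like": [1207, 921, 1744]})

# When the data argument is already a DataFrame, columns= selects by label.
# 'A' and 'B' are not labels in src, so both columns come back as NaN.
all_missing = pd.DataFrame(src, columns=["A", "B"])

# A label that does exist is kept; only the unknown one is filled with NaN.
partly_missing = pd.DataFrame(src, columns=["Total", "A"])

print(all_missing)
print(partly_missing)
```

So columns= behaves like a selection here rather than a rename, which is why the values disappear.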
One solution is to convert it to a NumPy array first:
df= pd.DataFrame(df.to_numpy(),columns=['A','B','C', 'D', 'E', 'F'])
Or set the new column names in a separate step, without the DataFrame constructor:
df = full_Data.iloc[1:,4:10]
df.columns = ['A','B','C', 'D', 'E', 'F']
Another solution is to build a dict from the existing column names and rename:
old = df.columns
new = ['A','B','C', 'D', 'E', 'F']
df = df.rename(columns=dict(zip(old, new)))
print(df)
A B C D E F
1 2178 0 0 66 0 1207
2 1042 0 0 60 0 921
3 2096 0 0 112 0 1744
4 1832 0 0 109 0 1718
5 1341 0 0 38 0 889
6 1933 0 0 123 0 1501
print(df.corr())
A B C D E F
A 1.000000 NaN NaN 0.606808 NaN 0.727034
B NaN NaN NaN NaN NaN NaN
C NaN NaN NaN NaN NaN NaN
D 0.606808 NaN NaN 1.000000 NaN 0.916325
E NaN NaN NaN NaN NaN NaN
F 0.727034 NaN NaN 0.916325 NaN 1.000000
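The NaN rows and columns for B, C and E are expected for the rows shown: those columns are all zeros, and correlation is undefined for a column with zero variance. If you would rather not have them in the heatmap, one option is to drop constant columns before calling corr(). This is a sketch using three of the columns from the sample above; the `nunique` filter is my suggestion, not part of the original code:

```python
import pandas as pd

# Three columns copied from the sample rows above.
df = pd.DataFrame({
    "A": [2178, 1042, 2096, 1832, 1341, 1933],
    "B": [0, 0, 0, 0, 0, 0],
    "D": [66, 60, 112, 109, 38, 123],
})

# B is constant, so every correlation involving B is NaN.
corr_all = df.corr()

# Keep only columns with more than one distinct value, then correlate.
varying = df.loc[:, df.nunique() > 1]
corr_varying = varying.corr()

print(corr_all)
print(corr_varying)
```

The second matrix can then be passed to sn.heatmap without blank cells.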
EDIT:
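The problem was that the columns were not numeric, so df.corr() had no numeric columns to work with and the heatmap received an empty matrix. Convert them first: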
df = df.astype(int)
Or:
df = df.apply(pd.to_numeric, errors='coerce')
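The difference between the two conversions shows up when a cell cannot be parsed. Below is a small sketch with made-up object-dtype data; the "n/a" cell is an assumption for illustration, not something from the original table:

```python
import pandas as pd

# Hypothetical data: numbers stored as strings, with one unparseable cell.
raw = pd.DataFrame(
    {"A": ["2178", "1042", "2096"], "F": ["1207", "n/a", "1744"]},
    dtype=object,
)

# errors='coerce' replaces cells that cannot be parsed with NaN
# instead of raising, so the rest of the column is still converted.
numeric = raw.apply(pd.to_numeric, errors="coerce")

print(raw.dtypes)
print(numeric.dtypes)
print(numeric.corr())
```

With data like this, astype(int) would raise a ValueError on the "n/a" cell, so it is only the simpler choice when every value is known to be a clean integer.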