
numpy: difference between NaN and masked array

Tags: python, nan, numpy

In numpy there are two ways to mark missing values: I can use either NaN or a masked array. I understand that using NaNs is (potentially) faster, while a masked array offers more functionality (which?).
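To make the two options concrete, here is a minimal sketch of marking the same missing value both ways. The `readings` array and the `-999` sentinel are made-up illustration data, not something from the question. One practical difference it shows is that NaN only exists for floating-point dtypes, while a masked array can keep an integer dtype.

```python
import numpy as np

# Hypothetical integer data where -999 stands for "missing" (an assumption
# made for this example only).
readings = np.array([3, 7, -999, 5])

# Option 1: NaN. NaN is a float value, so the data must be converted to
# float before the missing entry can be marked.
as_float = readings.astype(float)
as_float[readings == -999] = np.nan

# Option 2: masked array. The data keeps its integer dtype and the missing
# entry is flagged in a separate boolean mask.
as_masked = np.ma.masked_equal(readings, -999)

nan_count = int(np.isnan(as_float).sum())
masked_count = int(np.ma.count_masked(as_masked))

print(as_float.dtype, nan_count)
print(as_masked.dtype, masked_count)
print(as_masked.sum())  # reductions on a masked array skip masked entries
```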

I guess my question is: if and when should I use one over the other? What is the use case for np.NaN in a regular array vs. a masked array?

I am sure the answer must be out there but I could not find it...

mathause asked May 29 '15 11:05


1 Answer

Keep in mind that the strange np.nan behaviours mentioned by jrmyp include unexpected results from functions in statsmodels (e.g. ttest) or numpy (e.g. average). In my experience, most of those functions have workarounds for NaNs, but finding them can drive you mad for a while. This seems like a reason to mask arrays whenever possible.
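As a rough illustration of the numpy case, the sketch below compares `np.average` on an array containing NaN with the NaN-aware `np.nanmean` and with the masked-array route. The sample values are made up for the example, and the statsmodels case is not shown here.

```python
import numpy as np

# Made-up sample data with one missing value.
data = np.array([1.0, 2.0, np.nan, 4.0])

# A plain reduction propagates the NaN, so the result is NaN.
plain_avg = np.average(data)

# Workaround: use the NaN-aware variant, which ignores NaN entries.
nan_aware_avg = np.nanmean(data)

# Alternative: mask the invalid entries once, then use masked-array functions.
masked = np.ma.masked_invalid(data)
masked_avg = np.ma.average(masked)

print(plain_avg, nan_aware_avg, masked_avg)
```

With the NaN approach you have to remember the right variant for each function, whereas the mask travels with the array.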

Pauli answered Oct 08 '22 01:10