Let's say I have a dumb text file with the contents:
Year Recon Observed
1505 162.38 23
1506 46.14 -9999
1507 147.49 -9999
-9999 is used to denote a missing value (don't ask).
So I should be able to read this into a NumPy array with:
import numpy as np
x = np.genfromtxt("file.txt", dtype=None, names=True, missing_values=-9999)
And have all my little -9999s turn into numpy.nan. But instead I get:
>>> x
array([(1505, 162.38, 23), (1506, 46.14, -9999), (1507, 147.49, -9999)],
      dtype=[('Year', '<i8'), ('Recon', '<f8'), ('Observed', '<i8')])
... That's not right...
Am I missing something?
Nope, you're not doing anything wrong. Using the missing_values argument indeed tells np.genfromtxt that the corresponding values should be flagged as "missing/invalid". The problem is that dealing with missing values is only supported if you use the usemask=True argument (I probably should have made that clearer in the documentation, my bad).
With usemask=True, the output is a masked array. You can turn it into a regular ndarray, with the missing values replaced by np.nan, using the .filled(np.nan) method.
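For the file above, that looks roughly like this (a minimal sketch):

import numpy as np

# usemask=True makes genfromtxt return a masked array in which
# every value that matched -9999 is flagged as missing
x = np.genfromtxt("file.txt", dtype=None, names=True,
                  missing_values=-9999, usemask=True)

print(x.mask)         # True wherever a value was -9999
print(x['Observed'])  # the missing entries show up as masked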
Be careful, though: if a column was detected as having an int dtype and you try to fill its missing values with np.nan, you won't get what you expect (np.nan is only supported for float columns).
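One way around that, as a sketch (assuming you don't mind reading every column as float), is to force dtype=float so that np.nan is a valid fill value for all fields:

import numpy as np

# With names=True and dtype=float, every field is read as float64,
# so filling the masked entries with NaN is safe
x = np.genfromtxt("file.txt", dtype=float, names=True,
                  missing_values=-9999, usemask=True)

data = x.filled(np.nan)   # plain structured ndarray, -9999s replaced by nan

If you'd rather keep the integer columns as they are, you can instead convert a single field before filling it, e.g. x['Observed'].astype(float).filled(np.nan).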