Despite the advice from the previous questions:
-9999 as missing value with numpy.genfromtxt()
Using genfromtxt to import csv data with missing values in numpy
I still am unable to process a text file that ends with a missing value,
a.txt:
1 2 3
4 5 6
7 8
I've tried multiple arrangements of the missing_values and filling_values options and cannot get this to work:
import numpy as np
sol = np.genfromtxt("a.txt",
                    dtype=float,
                    invalid_raise=False,
                    missing_values=None,
                    usemask=True,
                    filling_values=0.0)
print(sol)
What I would like to get is:
[[1.0 2.0 3.0]
[4.0 5.0 6.0]
[7.0 8.0 0.0]]
but instead I get:
/usr/local/lib/python2.7/dist-packages/numpy/lib/npyio.py:1641: ConversionWarning: Some errors were detected !
Line #3 (got 2 columns instead of 3)
warnings.warn(errmsg, ConversionWarning)
[[1.0 2.0 3.0]
[4.0 5.0 6.0]]
The genfromtxt() function loads data from a text file, with missing values handled as specified. Each line past the first skip_header lines is split at the delimiter character, and characters following the comments character are discarded.
It accepts a number of arguments for cleaning the data as it is read, and it can deal with missing or null values by filtering, removing, or replacing them.
The first line of a CSV file usually contains the field names. The function recfromcsv invokes genfromtxt with names=True by default, which means that it reads the first line of the data as the header. Definition: http://docs.scipy.org/doc/numpy/reference/generated/numpy.genfromtxt.html.
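As a rough illustration of the missing-value handling described above, here is a minimal sketch. It assumes a comma-separated layout, which is not the whitespace-separated a.txt from the question: with an explicit delimiter, the trailing empty field still counts as a third column, so filling_values can be applied to it. The sample text is made up for the example.

```python
import io

import numpy as np

# Hypothetical comma-separated version of the data; the last row ends
# with an empty field instead of simply stopping after two values.
text = "1,2,3\n4,5,6\n7,8,\n"

# With an explicit delimiter every row splits into three fields, and the
# empty one is treated as missing and replaced by filling_values.
arr = np.genfromtxt(io.StringIO(text), delimiter=",", filling_values=0.0)
print(arr)
```

With whitespace as the delimiter there is no empty field to detect, so the short row just has fewer columns, which matches the "got 2 columns instead of 3" warning in the question.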
Using pandas:
import pandas as pd
df = pd.read_table('data', sep=r'\s+', header=None)
df.fillna(0, inplace=True)
print(df)
#    0  1  2
# 0  1  2  3
# 1  4  5  6
# 2  7  8  0
pandas.read_table replaces missing data with NaNs. You can replace those NaNs with some other value using df.fillna. df is a pandas.DataFrame. You can access the underlying NumPy array with df.values:
print(df.values)
# [[ 1. 2. 3.]
# [ 4. 5. 6.]
# [ 7. 8. 0.]]
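If pandas is not available, the same padding can be done by hand with plain NumPy. This is only a sketch under stated assumptions: load_padded is a hypothetical helper name, not a NumPy function, and it assumes whitespace-separated numeric fields where short rows are missing values only at the end.

```python
import numpy as np


def load_padded(lines, fill=0.0):
    """Parse whitespace-separated rows and pad short rows with `fill`.

    `lines` is any iterable of strings, e.g. an open file object.
    """
    # Split each non-blank line into floats; rows may differ in length.
    rows = [[float(tok) for tok in line.split()]
            for line in lines if line.strip()]
    width = max(len(row) for row in rows)
    # Start from an array full of the fill value, then copy each row in.
    out = np.full((len(rows), width), fill, dtype=float)
    for i, row in enumerate(rows):
        out[i, :len(row)] = row
    return out


# The three lines of a.txt from the question, given inline here.
sol = load_padded(["1 2 3\n", "4 5 6\n", "7 8\n"])
print(sol)
```

For a real file, the call would look like load_padded(open("a.txt")).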