I just picked up Pandas to do with some data analysis work in my biology research. Turns out one of the proteins I'm analyzing is called 'NA'.
I have a matrix with pairwise 'HA, M1, M2, NA, NP...' on the column headers, and the same as "row headers" (for the biologists who might read this, I'm working with influenza).
When I import the data into Pandas directly from a CSV file, it reads the "row headers" as 'HA, M1, M2...' and then NA gets read as NaN. Is there any way to stop this? The column headers are fine - 'HA, M1, M2, NA, NP etc...'
by-default pandas consider #N/A, -NaN, -n/a, N/A, NULL etc as NaN value.
Pandas DataFrame fillna() Method The fillna() method replaces the NULL values with a specified value. The fillna() method returns a new DataFrame object unless the inplace parameter is set to True , in that case the fillna() method does the replacing in the original DataFrame instead.
Turn off NaN detection this way: pd.read_csv(filename, keep_default_na=False)
I originally suggested na_filter=False
, which gets the job done. But, if I understand Jeff's comments below, this is a cleaner solution.
Example:
In [1]: pd.read_csv('test')
Out[1]:[4]: pd.read_csv('test', keep_default_na=False)
Out[4]:1 2
2 3
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With