force pandas to read nan as string

Question

I could not find any other question related to mine. Please point me to a link if I missed one.

I have a CSV file that looks like this:

"concentration"
"5"
"5"
"5"
"5"
"5"

"nan"
"nan"
"nan"
"nan"
"nan"

If I read it with pandas' read_csv, the "nan" values are automatically interpreted as NaN, but I would like to keep them as strings. The only value that should be NaN is the missing one in line 7 (where nothing is written).

I tried to read it like this:

df = pd.read_csv(path, dtype={'concentration': 'string'}, quoting=csv.QUOTE_NONNUMERIC, sep=',')

Can anybody help?

Roman Pekar · Accepted Answer

Looks like you can use keep_default_na and na_values. From the docs:

na_values : list-like or dict, default None
Additional strings to recognize as NA/NaN. If dict passed, specific per-column NA values

keep_default_na : bool, default True
If na_values are specified and keep_default_na is False the default NaN values are overridden, otherwise they’re appended to
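
As a rough illustration of the dict form of na_values described in the excerpt above, here is a minimal sketch. The two-column data and the label column are invented for the example and are not part of the question's file.

```python
import io

import pandas as pd

# Hypothetical two-column data; the "label" column is made up for illustration.
csv_text = "concentration,label\n5,nan\n,nan\nnan,\n"

df = pd.read_csv(
    io.StringIO(csv_text),
    keep_default_na=False,               # drop the built-in NA markers such as "nan"
    na_values={"concentration": [""]},  # only "" is missing, and only in this column
)

# Which rows of "concentration" were read as missing?
print(df["concentration"].isna().tolist())
# The "label" column has no NA markers at all, so its text is kept as-is.
print(df["label"].tolist())
```

With these settings, the empty field in concentration should come back as missing, while the literal "nan" text (and the empty field in label) should stay as strings.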

So for your file, the call would be:

# Raw string, so that "\t" in the Windows path is not read as a tab character.
pd.read_csv(r'c:\temp\temp.txt', keep_default_na=False, na_values=[''])

   concentration
0              5
1              5
2              5
3              5
4              5
5            NaN
6            nan
7            nan
8            nan
9            nan
10           nan
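
To try this without a file on disk, the sketch below rebuilds the question's data in memory. It assumes the empty row is written as an empty quoted field (""), which may not match the real file; csv_text and the other variable names are illustrative only.

```python
import io

import pandas as pd

# Data modelled on the question: five "5" rows, one empty value, five "nan" rows.
# Assumption: the empty row is an empty quoted field, not a completely blank line.
csv_text = '"concentration"\n' + '"5"\n' * 5 + '""\n' + '"nan"\n' * 5

df = pd.read_csv(
    io.StringIO(csv_text),
    keep_default_na=False,  # do not treat "nan", "NA", etc. as missing
    na_values=[""],         # only the empty string counts as missing
)

n_missing = int(df["concentration"].isna().sum())
n_nan_strings = int((df["concentration"] == "nan").sum())
print(n_missing, n_nan_strings)
```

If the row is a completely blank line instead, note that recent pandas versions skip blank lines by default (skip_blank_lines=True), so passing skip_blank_lines=False may be needed to keep that row as a missing value.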

Tags: python, pandas

Antje Janosch