Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas read csv replacing #DIV/0! and #VALUE! with NaN

I am new to Pandas for Python and am busy reading a csv file. Unfortunately the Excel file has some cells with #VALUE! and #DIV/0! in them. I cannot fix this in Excel because the data is pulled from other sheets. Pandas turns these columns into objects instead of numpy64, so I cannot plot from them. I want to replace the #VALUE! and #DIV/0! strings with NaN entries in Pandas, however i cannot find how to do this. I have tried the following (my code runs, but it changes nothing):

import pandas as pd
import numpy as np
df = pd.read_csv('2013AllData.csv')
df.replace('#DIV/0!', np.nan)
like image 876
nicolejane33 Avatar asked Jan 20 '15 10:01

nicolejane33


People also ask

How do I rename a column in reading CSV?

When reading a CSV file, it may be more sensible to rename columns using read_csv() with names argument. When you want to rename some selected columns, the rename() function is the best choice. columns. str.

What does replace do in pandas?

Pandas DataFrame replace() Method The replace() method replaces the specified value with another specified value. The replace() method searches the entire DataFrame and replaces every case of the specified value.


1 Answers

Rather than replacing after loading, just set the param na_values when reading the csv in and it will convert them to NaN values when the df is created:

df = pd.read_csv('2013AllData.csv', na_values=['#VALUE!', '#DIV/0!'])

Check the docs: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html#pandas.read_csv

like image 187
EdChum Avatar answered Oct 01 '22 06:10

EdChum