pandas read_csv: ignore trailing lines with empty data

Question

I would like to read the following data from a csv file:

id;type;start;end
Test;OIS;01/07/2016;01/07/2018
;;;
;;;

However, pandas read_csv will try reading the empty lines ;;; as well. Is there a way to automatically ignore these trailing lines of empty data?

These lines are causing a problem because I am using read_csv with converters, and the functions in the converters will dutifully throw an exception when they encounter invalid data, meaning I don't even arrive at a valid dataframe. I could change the functions to convert invalid data to NaN and then drop NaNs from the dataframe, but then I would silently be dropping erroneous data as well as those empty lines.

Some clarifications:

The lines of empty data will always be trailing; this is a common problem with CSV files generated from Excel.
The data is user-generated so manually cleaning the file is not an option.

Padraic Cunningham · Accepted Answer

I'm not sure you can do it directly with read_csv, but you can use dropna to remove the rows in which every column is empty:

import pandas as pd

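# Read the semicolon-delimited file; the ";;;" lines are parsed as all-NaN rows.
df = pd.read_csv("in.csv", delimiter=";")
# Drop only the rows in which every column is NaN.
df.dropna(how="all", inplace=True)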
print(df)
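
Note that dropna only runs after parsing, so converters that raise on empty values will still fail before it gets a chance. One possible workaround is to strip the trailing delimiter-only lines before handing the text to read_csv. The following is a minimal sketch rather than a built-in pandas feature: it assumes the file fits in memory and that no quoted field ends in a line consisting only of delimiters. The names read_csv_without_trailing_empty_rows and parse_date are hypothetical, introduced only for illustration.

```python
import io
from datetime import datetime

import pandas as pd


def read_csv_without_trailing_empty_rows(path_or_buffer, sep=";", **kwargs):
    """Hypothetical helper: strip trailing delimiter-only lines, then parse."""
    if hasattr(path_or_buffer, "read"):
        text = path_or_buffer.read()
    else:
        with open(path_or_buffer, newline="") as handle:
            text = handle.read()

    lines = text.splitlines()
    # Pop lines from the end while they contain nothing but separators
    # and whitespace. Empty rows elsewhere in the file are left alone,
    # so strict converters will still flag them.
    while lines and not lines[-1].replace(sep, "").strip():
        lines.pop()

    return pd.read_csv(io.StringIO("\n".join(lines)), sep=sep, **kwargs)


def parse_date(value):
    # Stand-in for a strict converter: raises ValueError on invalid input.
    return datetime.strptime(value, "%d/%m/%Y")


# Sample data from the question, including the trailing ";;;" lines.
sample = "id;type;start;end\nTest;OIS;01/07/2016;01/07/2018\n;;;\n;;;\n"

df = read_csv_without_trailing_empty_rows(
    io.StringIO(sample),
    converters={"start": parse_date, "end": parse_date},
)
print(df)
```

Because only trailing lines are removed, an empty or malformed row in the middle of the file still reaches the converters and raises, so erroneous data is not silently dropped.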

Tags:

python

pandas

Anne
