 

Read all but last line of CSV file in pandas

I have CSV files that I read into pandas with:

    #!/usr/bin/env python

    import pandas as pd
    import sys

    filename = sys.argv[1]
    df = pd.read_csv(filename)

Unfortunately, the last line of these files is often corrupt (has the wrong number of commas). Currently I open each file in a text editor and remove the last line.

Is it possible to remove the last line in the same Python/pandas script that loads the CSV, so that I can skip this manual step?

Asked by graffe on Nov 13 '15 at 09:11



1 Answer

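Pass error_bad_lines=False and pandas will skip the malformed line automatically. Note that pandas 1.3 deprecated this parameter in favour of on_bad_lines='skip', and pandas 2.0 removed it.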

df = pd.read_csv(filename, error_bad_lines=False) 

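The advantage of error_bad_lines is that it skips any malformed line, wherever it occurs, instead of raising an error. If the last line is always the corrupt one, though, skipfooter=1 is the more targeted choice, since it drops exactly the final line.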
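As a minimal sketch of the skipfooter approach, the snippet below parses a small in-memory CSV whose final line has too many fields. The sample data is made up for illustration; in the question's script the first argument would be filename instead of the StringIO object.

```python
import io

import pandas as pd

# Hypothetical sample data: the last line has the wrong number of commas.
raw = "a,b,c\n1,2,3\n4,5,6\n7,8,9,10,11\n"

# skipfooter=1 discards the final line of the input. engine="python" is set
# explicitly because the C engine does not support skipfooter.
df = pd.read_csv(io.StringIO(raw), skipfooter=1, engine="python")

print(df)
```

Setting engine="python" explicitly avoids the warning pandas otherwise emits when it falls back from the C engine.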

Thanks to @DexterMorgan for pointing out that the skipfooter option forces pandas to use the Python parsing engine, which is slower than the C engine.
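If the slower Python engine is a concern, one possible workaround (not part of the original answer) is to drop the last line yourself and hand the rest to read_csv, which then uses its default C engine. The helper name read_csv_without_last_line and the sample data are hypothetical, and the sketch assumes the corrupt line is the final physical line of the file and that the file fits in memory.

```python
import io
import os
import tempfile

import pandas as pd


def read_csv_without_last_line(path, **kwargs):
    """Parse a CSV file with pandas, ignoring the file's final line."""
    # newline="" keeps the original line endings untouched.
    with open(path, newline="") as handle:
        lines = handle.readlines()
    # Drop the final line and parse the remainder from memory.
    return pd.read_csv(io.StringIO("".join(lines[:-1])), **kwargs)


# Demo with a temporary file whose last line is malformed.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("a,b,c\n1,2,3\n4,5,6\n7,8,9,10,11\n")
try:
    df = read_csv_without_last_line(tmp.name)
finally:
    os.remove(tmp.name)

print(df)
```

Reading the whole file into a list of lines trades memory for parsing speed, so this is only a reasonable choice for files of moderate size.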

Answered by EdChum on Sep 17 '22 at 21:09