 

How to get the number of rows in a Pandas chunk?

Tags: python, pandas, csv

I'm reading a huge CSV file by iterating over chunks. How can I get the number of rows in the chunk currently being processed? In particular, the last chunk may have fewer rows than the value passed to the chunksize parameter.

import pandas as pd
reader = pd.read_table('myFile.csv', sep=';', chunksize=100)
Jonathan Roth asked Jan 07 '17 17:01



1 Answer

Each chunk is a DataFrame, so you need to check its length:

for x in reader:
    # All three expressions return the number of rows in the current chunk
    print(len(x.index))
    print(len(x))
    print(x.shape[0])
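
To see the smaller final chunk in practice, here is a minimal self-contained sketch. It assumes a made-up in-memory table of 250 rows (the name csv_text and the columns a and b are hypothetical stand-ins for 'myFile.csv') and uses pd.read_csv in place of pd.read_table, since both accept sep and chunksize.

```python
import io

import pandas as pd

# Hypothetical stand-in for 'myFile.csv': a header line plus 250 data rows,
# separated by ';' as in the question.
csv_text = "a;b\n" + "\n".join(f"{i};{i * 2}" for i in range(250))

# With chunksize set, read_csv returns an iterator of DataFrames
# instead of a single DataFrame.
reader = pd.read_csv(io.StringIO(csv_text), sep=";", chunksize=100)

chunk_sizes = []
for chunk in reader:
    # len(chunk) is the number of rows in this chunk
    chunk_sizes.append(len(chunk))

print(chunk_sizes)  # [100, 100, 50]
```

The last chunk holds only the 50 leftover rows, so code that needs the actual row count should use len(chunk) rather than the chunksize value.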
jezrael answered Oct 23 '22 08:10