
Python Pandas read_csv skip first x and last y rows

Tags: python, pandas, csv

I think I may be missing something obvious here, but I am new to Python and pandas. I am reading a large text file and only want to use the rows in range(61, 75496). I can skip the first 60 rows with

keywords = pd.read_csv('keywords.list', sep='\t', skiprows=60)

How can I include only the rows in between these values? Unfortunately, there is no userows parameter.

Is there something like

range(start, stop, start, stop)?
PandaBearSoup asked Jul 27 '15 18:07


2 Answers

Maybe you can use the nrows argument to give the number of rows to read.

From the documentation:

nrows : int, default None
Number of rows of file to read. Useful for reading pieces of large files

Code:

keywords = pd.read_csv('keywords.list', sep='\t', skiprows=60, nrows=75436)  # 75436 is 75496 - 60
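
To check how skiprows and nrows count lines, here is a minimal sketch on a small in-memory file. The ten-line sample and the variable names are made up for illustration and are not the asker's keywords.list. It passes header=None so that the first kept line is treated as data; without it, that line is consumed as the column names and nrows counts the rows after it.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real file: ten tab-separated lines, numbered 1..10.
sample = "\n".join(f"{i}\tword{i}" for i in range(1, 11))

# Skip the first 3 lines, then read the next 4 (file lines 4 through 7).
# header=None keeps line 4 as data instead of using it as column names.
subset = pd.read_csv(io.StringIO(sample), sep="\t", skiprows=3, nrows=4, header=None)

print(subset[0].tolist())  # [4, 5, 6, 7]
```

Under these assumptions, keeping file lines a through b inclusive (counting from 1) works out to skiprows=a-1 and nrows=b-a+1.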
Anand S Kumar answered Nov 14 '22 23:11


From the documentation, you can skip the first few rows using

skiprows = X

where X is an integer. If the header sits a few rows into your file, you can also skip straight to it using

header = X

To skip rows counting up from the bottom of the file, use

skipfooter = X

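Putting it together, to skip the first 3 rows so that the next row becomes the header, and to ignore the bottom 4 rows: pd.read_csv('path/or/url/to/file.csv', skiprows=3, skipfooter=4, engine='python'). The engine='python' argument is there because the default C parser does not support skipfooter.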
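
As a rough illustration of the skiprows and skipfooter combination, the sketch below builds a small in-memory file. Its layout (3 junk lines, a header, 5 data rows, 4 footer lines) and the column names id and word are assumptions invented for this example. It sets engine='python' explicitly, since skipfooter is not supported by the C parser.

```python
import io

import pandas as pd

# Hypothetical file: 3 junk lines, a header line, 5 data rows, 4 footer lines.
lines = (
    ["junk"] * 3
    + ["id\tword"]
    + [f"{i}\tword{i}" for i in range(1, 6)]
    + ["footer"] * 4
)
sample = "\n".join(lines)

# skiprows=3 drops the junk, so the next line is read as the header;
# skipfooter=4 drops the last 4 lines of the file.
df = pd.read_csv(
    io.StringIO(sample), sep="\t", skiprows=3, skipfooter=4, engine="python"
)

print(df["id"].tolist())  # [1, 2, 3, 4, 5]
```

Note that skipfooter makes the parser read to the end of the file before trimming, so for a very large file the nrows approach may be the lighter option.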

k3t0 answered Nov 15 '22 01:11