I'm having trouble figuring out how to skip n rows in a CSV file while keeping the header, which is the first row.
What I want to do is iterate over the file but keep the header from the first row. skiprows makes the header the first row after the skipped rows. What is the best way of doing this?
data = pd.read_csv('test.csv', sep='|', header=0, skiprows=10, nrows=10)
Skipping rows at specific index positions while reading a CSV file into a DataFrame: if we pass the skiprows argument to pandas.read_csv() as a list of ints, it skips the rows at the indices given in that list. For example if we want to skip lines at index 0, 2 and 5 while reading users.
In Python, when reading a CSV with the csv module, you can skip the first line by calling the built-in next() function on the reader. We usually want to skip the first line when the file contains a header row and we don't want to print or import that row.
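As a minimal sketch of the next() approach, the snippet below reads a small in-memory file. The sample data and the '|' delimiter are assumptions made for illustration, not part of the original question's file.

```python
import csv
import io

# Hypothetical in-memory file standing in for a real CSV on disk.
sample = io.StringIO("name|age\nann|31\nbob|27\n")

reader = csv.reader(sample, delimiter="|")
header = next(reader)   # consume the first line so the loop below sees only data rows
rows = list(reader)     # remaining lines, each as a list of strings

print(header)
print(rows)
```

Keeping the value returned by next() instead of discarding it gives you the column names for later use.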
You can pass a list of row numbers to skiprows instead of an integer.
By giving the function the integer 10, you're just skipping the first 10 lines, so the header is taken from the first line after them.
To keep row 0 as the header and then skip rows 1 through 9, so that the data starts at row 10, you can write:
pd.read_csv('test.csv', sep='|', skiprows=range(1, 10))
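Since the question is about iterating through the file while keeping the header, the same idea can be wrapped in a small helper. This is only a sketch: read_page, its parameters, and the sample data are hypothetical names introduced here, and it rereads the source from the start on every call.

```python
import io

import pandas as pd


def read_page(source, page, page_size=10, sep="|"):
    """Read one page of data rows while keeping line 0 as the header."""
    start = 1 + page * page_size  # first data line of the requested page
    return pd.read_csv(
        source,
        sep=sep,
        header=0,
        skiprows=range(1, start),  # skip data lines before the page, never line 0
        nrows=page_size,
    )


# Hypothetical data: a header line followed by 25 data rows.
text = "id|value\n" + "\n".join(f"{i}|{i * i}" for i in range(25))

first = read_page(io.StringIO(text), page=0)
second = read_page(io.StringIO(text), page=1)
print(second.head())
```

For a single pass over a large file, read_csv's chunksize parameter may be a better fit, since it yields successive DataFrames that all share the header without rereading earlier lines.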
read_csv
The two main ways to control which rows read_csv uses are the header and skiprows parameters.
Suppose we have the following CSV file with one column:
a
b
c
d
e
f
In each of the examples below, this file is f = io.StringIO("\n".join("abcdef")).
Read all lines as values (no header; column names default to integers):
>>> pd.read_csv(f, header=None)
   0
0  a
1  b
2  c
3  d
4  e
5  f
Use a particular row as the header (skip all lines before that):
>>> pd.read_csv(f, header=3)
   d
0  e
1  f
Use multiple rows as the header, creating a MultiIndex (skip all lines before the last specified header line):
>>> pd.read_csv(f, header=[2, 4])
   c
   e
0  f
Skip N rows from the start of the file (the first row that's not skipped is the header):
>>> pd.read_csv(f, skiprows=3)
   d
0  e
1  f
Skip one or more rows by giving the row indices (the first row that's not skipped is the header):
>>> pd.read_csv(f, skiprows=[2, 4])
   a
0  b
1  d
2  f
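skiprows also accepts a callable that receives each row index and returns True for rows to skip. The sketch below uses it on the same six-line file to keep row 0 as the header while dropping the next two rows. The make_file helper is a hypothetical convenience added here so that each read gets a fresh buffer.

```python
import io

import pandas as pd


def make_file():
    # Fresh buffer for each read; a StringIO is exhausted after one pass.
    return io.StringIO("\n".join("abcdef"))


# Skip rows 1 and 2 ("b" and "c"); row 0 ("a") stays as the header.
df = pd.read_csv(make_file(), skiprows=lambda i: 0 < i < 3)
print(df)
```

A callable avoids building a long list of indices when the rows to skip follow a simple rule.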