How to detect and remove lines above data set while reading from csv?

Question

I have a csv that looks like this:

name: john
date modified: 2018-09
from: jane
colum1 column2 column3
data    data    data

Is there a function I can apply that strips off any lines before the tabular data begins when reading the csv? Currently, the lines above the column headers show up as strange characters when I read them in.

New table should look like this:

colum1 column2 column3
data    data    data

AGN Gazer · Accepted Answer

I would do something like this:

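from io import StringIO
import pandas as pd

with open('filename.csv') as f:
    lines = f.readlines()
# Keep only the lines without a colon, then parse them from an in-memory buffer
s = StringIO(''.join(l for l in lines if ':' not in l))
df = pd.read_csv(s)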
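
One caveat with filtering on ':' is that it applies to every line, not only the ones above the table. The sketch below uses a hypothetical in-memory sample (modeled on the question, with one made-up data row containing a time value) to show which lines survive the filter:

```python
from io import StringIO

# Hypothetical sample: three metadata lines, a header, and two data rows.
# The first data row contains a colon ("10:30") on purpose.
raw = (
    "name: john\n"
    "date modified: 2018-09\n"
    "from: jane\n"
    "column1 column2 column3\n"
    "a 1 10:30\n"
    "b 2 x\n"
)

lines = raw.splitlines(keepends=True)

# Same filter as above: drop every line that contains a colon
kept = [l for l in lines if ':' not in l]
buffer = StringIO(''.join(kept))

# Only the header and the second data row remain; "a 1 10:30" was dropped too
print(buffer.getvalue())
```

If the real data can contain colons, this filter is probably too broad. Also, the sample in the question looks whitespace-delimited rather than comma-delimited; if that is the case, passing sep=r'\s+' to pd.read_csv may be needed to get separate columns.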

Alternatively:

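import pandas as pd

with open('filename.csv') as f:
    lines = f.readlines()
# Indices of the lines that contain a colon; read_csv skips exactly these rows
skip_rows_idx = [i for i, l in enumerate(lines) if ':' in l]
df = pd.read_csv('filename.csv', skiprows=skip_rows_idx)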
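
Since skiprows also accepts a plain integer (the number of lines to skip from the top of the file), a variant is to count only the leading lines that contain a colon. This is a minimal sketch; count_preamble and its marker parameter are hypothetical names, and the sample lines are made up to mirror the question:

```python
from itertools import takewhile

def count_preamble(lines, marker=':'):
    """Count the consecutive lines at the top that contain `marker`."""
    return sum(1 for _ in takewhile(lambda line: marker in line, lines))

# Hypothetical sample standing in for f.readlines()
sample = [
    "name: john\n",
    "date modified: 2018-09\n",
    "from: jane\n",
    "column1 column2 column3\n",
    "a 1 10:30\n",
]

n_skip = count_preamble(sample)
print(n_skip)  # 3: counting stops at the header line, which has no colon
```

The result could then be used as pd.read_csv('filename.csv', skiprows=n_skip). Unlike the index list, this leaves later rows alone even if they happen to contain a colon.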

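If the column header line contains no colons, the first example can be adapted to drop only the leading lines, like this:

import itertools
# Reuses `lines` and StringIO from the first example.
# dropwhile discards lines until it reaches the first one without a colon.
s = StringIO(''.join(itertools.dropwhile(lambda l: ':' in l, lines)))

(assuming there are no "bad" lines after the header, since everything from the header onward is kept as-is).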
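
To make the dropwhile behavior concrete, here is a self-contained sketch on a hypothetical in-memory sample (modeled on the question, with a made-up data row that contains a colon). It shows that only the leading lines are removed:

```python
from io import StringIO
from itertools import dropwhile

# Hypothetical sample: metadata lines, then the header and data rows.
# One data row deliberately contains a colon ("10:30").
raw = (
    "name: john\n"
    "date modified: 2018-09\n"
    "from: jane\n"
    "column1 column2 column3\n"
    "a 1 10:30\n"
    "b 2 x\n"
)

lines = raw.splitlines(keepends=True)

# Drop lines while they contain a colon; stop dropping at the header line
table_lines = list(dropwhile(lambda line: ':' in line, lines))
buffer = StringIO(''.join(table_lines))

# The header and both data rows are kept, including the row with "10:30"
print(buffer.getvalue())
```

The buffer can be handed to pd.read_csv in the same way as s above. If the columns are separated by whitespace, as the sample in the question suggests, sep=r'\s+' is likely needed as well.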

Tags: python, python-3.x, pandas, csv

RustyShackleford
