
More pythonic way of skipping header lines

Tags: python

Is there a shorter (perhaps more pythonic) way of opening a text file and reading past the lines that start with a comment character?

In other words, a neater way of doing this:

fin = open("data.txt")
line = fin.readline()
while line.startswith("#"):
    line = fin.readline()
pufferfish asked Nov 13 '09 17:11




3 Answers

At this stage in my arc of learning Python, I find this most Pythonic:

from itertools import dropwhile

def iscomment(s):
    return s.startswith('#')

with open(filename, 'r') as f:
    for line in dropwhile(iscomment, f):
        pass  # do something with line

That skips only the lines starting with # at the top of the file. To skip every line starting with #, wherever it appears:

from itertools import filterfalse  # named ifilterfalse in Python 2

with open(filename, 'r') as f:
    for line in filterfalse(iscomment, f):
        pass  # do something with line

For me the choice is almost entirely about readability; functionally there's almost no difference between:

for line in filterfalse(iscomment, f):

and

for line in (x for x in f if not x.startswith('#')):

Breaking out the test into its own function makes the intent of the code a little clearer; it also means that if your definition of a comment changes you have one place to change it.
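As a small illustration of both points, here is a sketch that runs on an in-memory file instead of a real one. The sample text, the helper names body_after_header and non_comment_lines, and the choice to treat ';' as a second comment marker are all assumptions made up for the example. Only iscomment has to change when the definition of a comment changes.

```python
import io
from itertools import dropwhile, filterfalse

# Hypothetical sample data standing in for a real file.
SAMPLE = "# header 1\n# header 2\nalpha\n# inline comment\nbeta\n"

def iscomment(s):
    # The single place that defines what counts as a comment line.
    # Treating ';' as a second marker is an assumption for this example.
    return s.startswith(('#', ';'))

def body_after_header(f):
    """Drop only the leading run of comment lines."""
    return list(dropwhile(iscomment, f))

def non_comment_lines(f):
    """Drop every comment line, wherever it appears."""
    return list(filterfalse(iscomment, f))

print(body_after_header(io.StringIO(SAMPLE)))
print(non_comment_lines(io.StringIO(SAMPLE)))
```

With this sample, the dropwhile version keeps the comment that follows "alpha", while the filterfalse version removes it.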

Robert Rossney answered Sep 21 '22 14:09


for line in open('data.txt'):
    if line.startswith('#'):
        continue
    # work with line

Of course, if your commented lines are only at the beginning of the file, you can optimise this, for example by no longer testing lines once the first non-comment line has been read.
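One possible reading of that optimisation, sketched under the assumption that comments appear only in a leading block: test lines until the first non-comment line, then pass the rest of the file through untested. The helper name skip_leading_comments and the sample text are made up for this example.

```python
import io

def skip_leading_comments(lines, marker='#'):
    """Yield the lines that follow the leading block of comment lines.

    `lines` is any iterable of strings, such as an open file.
    """
    it = iter(lines)
    for line in it:
        if not line.startswith(marker):
            yield line  # first non-comment line
            break
    # No more startswith() checks from here on.
    yield from it

# Hypothetical in-memory data standing in for data.txt.
sample = io.StringIO("# a\n# b\nx\n# kept, not at the top\ny\n")
for line in skip_leading_comments(sample):
    print(line, end='')
```

This is essentially what itertools.dropwhile does already: it stops calling the predicate after the first line for which it returns false.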

SilentGhost answered Sep 19 '22 14:09


from itertools import dropwhile
with open('data.txt') as f:  # file() exists only in Python 2
    for line in dropwhile(lambda line: line.startswith('#'), f):
        pass
ephemient answered Sep 19 '22 14:09