
Easiest way to ignore blank lines when reading a file in Python

Tags:

python

I have some code that reads a file of names and creates a list:

names_list = open("names", "r").read().splitlines() 

Each name is separated by a newline, like so:

Allman
Atkinson

Behlendorf

I want to ignore any lines that contain only whitespace. I know I can do this by creating a loop, checking each line as I read it, and adding it to a list if it's not blank.
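For reference, the loop approach described above might look like the following minimal sketch. It uses io.StringIO as a stand-in for the real "names" file; the sample contents are an assumption for illustration only.

```python
import io

# Hypothetical sample data standing in for open("names", "r").
sample = io.StringIO("Allman\nAtkinson\n   \n\nBehlendorf\n")

names_list = []
for line in sample:
    # strip() removes all surrounding whitespace, so a whitespace-only
    # line becomes the empty string, which is falsy.
    if line.strip():
        names_list.append(line.rstrip("\n"))

print(names_list)
```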

I was just wondering if there was a more Pythonic way of doing it?

Ambrosio asked Jan 30 '11 09:01




1 Answer

I would stack generator expressions:

with open(filename) as f_in:
    lines = (line.rstrip() for line in f_in)  # All lines including the blank ones
    lines = (line for line in lines if line)  # Non-blank lines

Now, lines is a generator over all of the non-blank lines. It is lazy, so iterate over it inside the with block, before the file is closed. This will save you from having to call strip on the line twice. If you want a list of lines, then you can just do:

with open(filename) as f_in:
    lines = (line.rstrip() for line in f_in)
    lines = list(line for line in lines if line)  # Non-blank lines in a list
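One thing to watch with stacked generators: they are lazy, so nothing is read until something iterates over them. The sketch below contrasts consuming the generator after the with block against building the list inside it. It uses io.StringIO as a stand-in for a real file; open_sample and its contents are hypothetical names introduced for illustration.

```python
import io


def open_sample():
    # Hypothetical stand-in for open(filename): an in-memory text file.
    return io.StringIO("Allman\n\nAtkinson\n   \nBehlendorf\n")


# Lazy: the generators are created inside the block but consumed after it.
with open_sample() as f_in:
    lazy = (line.rstrip() for line in f_in)
    lazy = (line for line in lazy if line)

try:
    list(lazy)
    failed = False
except ValueError:
    # The underlying file was already closed when the with block exited.
    failed = True

# Eager: list() reads everything while the file is still open.
with open_sample() as f_in:
    stripped = (line.rstrip() for line in f_in)
    names = list(line for line in stripped if line)

print(failed, names)
```

The list version is the safer choice whenever the result has to outlive the with block.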

You can also do it in a one-liner (excluding the with statement), but it's no more efficient and is harder to read:

with open(filename) as f_in:
    lines = list(line for line in (l.strip() for l in f_in) if line)

Update:

I agree that this is ugly because of the repetition of tokens. You could just write a generator if you prefer:

def nonblank_lines(f):
    for l in f:
        line = l.rstrip()
        if line:
            yield line

Then call it like:

with open(filename) as f_in:
    for line in nonblank_lines(f_in):
        pass  # Stuff
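Because nonblank_lines only iterates over its argument, it should work on any iterable of strings, not just an open file. Here is a small self-contained sketch; the sample data is an assumption for illustration.

```python
import io


def nonblank_lines(f):
    # Yield each line with trailing whitespace removed, skipping blank ones.
    for l in f:
        line = l.rstrip()
        if line:
            yield line


# Hypothetical sample inputs: an in-memory file and a plain list of strings.
sample_file = io.StringIO("Allman\n\nAtkinson\n \t \nBehlendorf\n")
sample_list = ["Allman\n", "   \n", "Atkinson\n"]

from_file = list(nonblank_lines(sample_file))
from_list = list(nonblank_lines(sample_list))

print(from_file)
print(from_list)
```

Note that rstrip() leaves leading whitespace in place, so an indented name keeps its indentation; use strip() instead if that is not wanted.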

Update 2:

with open(filename) as f_in:
    lines = filter(None, (line.rstrip() for line in f_in))

And on CPython (with deterministic reference counting), where the file object is closed as soon as its last reference goes away:

lines = filter(None, (line.rstrip() for line in open(filename))) 

In Python 2, use itertools.ifilter if you want a generator. In Python 3, filter already returns a lazy iterator, so just pass the whole thing to list if you want a list.
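As a rough Python 3 sketch of the filter variant: passing None as the first argument to filter keeps only truthy items, which drops the empty strings left after rstrip(). The in-memory sample file is an assumption for illustration.

```python
import io

# Hypothetical sample data standing in for open(filename).
sample = io.StringIO("Allman\n\nAtkinson\n   \nBehlendorf\n")

with sample as f_in:
    # list() forces the lazy filter object while the file is still open.
    lines = list(filter(None, (line.rstrip() for line in f_in)))

print(lines)
```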

aaronasterling answered Sep 16 '22 14:09