
Python: Line that does not start with #

Tags: python, regex

I have a file that contains something like this:

# comment
# comment
not a comment

# comment
# comment
not a comment

I'm trying to read the file line by line and capture only the lines that do not start with #. What is wrong with my code/regex?

import re

def read_file():
    pattern = re.compile("^(?<!# ).*")

    with open('list') as f:
        for line in f:
            print pattern.findall(line)

The code above captures every line instead of only the non-comment lines.

Mico asked Dec 07 '15 09:12


1 Answer

Use the match function in this case, since it only checks for a match at the beginning of the string.

The expression is then \s*[^#]. The \s* is there to allow for leading whitespace.

The OP's code then becomes:

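import re

def read_file():
    # Optional leading whitespace followed by any character other than '#'
    pattern = re.compile(r"\s*[^#]")
    with open(r"C:\test.txt") as f:
        for line in f:
            # match() only looks for a match at the start of the line
            if pattern.match(line):
                print(line)

read_file()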
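
One caveat with \s*[^#] is that the regex engine can backtrack: for an indented comment such as "   # note", \s* can give back the spaces, and [^#] then matches a space, so the line is still accepted. Blank lines are accepted for the same reason. Below is a minimal sketch of a stricter variant, assuming indented comments and blank lines should both be skipped. The name keep_line, the STRICT pattern, and the sample lines are made up for illustration and are not part of the answer above.

```python
import re

# LOOSE is the pattern from the answer; STRICT is an assumed variant in which
# the first non-whitespace character must not be '#'.
LOOSE = re.compile(r"\s*[^#]")
STRICT = re.compile(r"\s*[^#\s]")


def keep_line(line, pattern=STRICT):
    """Return True if the line should be kept as a non-comment line."""
    return pattern.match(line) is not None


# Hypothetical sample input: a comment, an indented comment, text, a blank line
sample = ["# comment\n", "   # indented comment\n", "not a comment\n", "\n"]
kept = [line for line in sample if keep_line(line)]
print(kept)
```

Passing pattern=LOOSE to keep_line reproduces the behaviour of the original expression, which makes the two easy to compare on the same input.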

EDIT:

A brief explanation of why the OP's pattern does not work:

The . matches any character except a line break, and that includes #. The negative lookbehind (?<!# ) inspects the text before the current position, but right after ^ there is nothing before it, so the assertion always succeeds. The pattern ^(?<!# ).* therefore matches every line in full, whatever character it starts with.
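
To see that the asker's pattern accepts comment lines as well, a small check can help. The sample strings are made up for illustration. The point is that right after ^ there is nothing for the lookbehind to inspect, so it never rejects a line.

```python
import re

# The asker's pattern: the lookbehind is evaluated at position 0, where no
# text precedes it, so the negative assertion always succeeds.
pattern = re.compile(r"^(?<!# ).*")

comment_result = pattern.findall("# comment\n")
text_result = pattern.findall("not a comment\n")

# Both lists contain the full line (without the trailing newline)
print(comment_result)
print(text_result)
```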


Solution:

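Negate the first character instead, for example ^(?<!# )[^#]. The lookbehind is redundant at the start of a line, so ^[^#] behaves the same.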
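
With findall, a pattern that stops at [^#] returns only the first character of each matching line, so the rest of the line has to be consumed as well to capture it whole. Below is a minimal sketch, assuming comments are never indented. The helper name non_comment_lines is hypothetical, the lookbehind is dropped because it adds nothing at the start of a line, and \n is added to the character class (an assumption) so that blank lines are skipped too.

```python
import re

# First character must be neither '#' nor a newline; .* then takes the rest
# of the line up to (not including) the line break.
pattern = re.compile(r"^[^#\n].*")


def non_comment_lines(lines):
    """Collect the text of every line that does not start with '#'."""
    result = []
    for line in lines:
        result.extend(pattern.findall(line))
    return result


# Hypothetical input mirroring the file from the question
lines = ["# comment\n", "# comment\n", "not a comment\n", "\n"]
print(non_comment_lines(lines))
```

The same function works on an open file object, since iterating over a file yields one line at a time.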

SIslam answered Oct 18 '22 20:10