Remove regex elements from list

I use Python 2.7. I have this data in a file named 'a':

[email protected];description1
[email protected];description2
myname3@this_is_ok.ok;description3
[email protected];description4
[email protected];description5
[email protected];description7

I read this file like:

with open('a', 'r') as f:
    data = [x.strip() for x in f.readlines()]

I have a list named bad:

bad = ['abc', 'qwe'] # could be more than 20 elements

Now I'm trying to remove all lines that have 'abc' or 'qwe' after the @ and write the rest to newfile. So newfile should contain only 2 lines:

myname3@this_is_ok.ok;description3
[email protected];description7

I've been trying to use the regexp (.*?)@(.*?);(.*) to get groups, but I don't know what to do next.

Please advise!

Alex asked May 25 '26 12:05


2 Answers

Here's a non-regex solution:

bad = set(['abc', 'qwe'])

with open('a', 'r') as f:
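    # keep only lines whose first domain label (the text between '@' and the first '.') is NOT in bad
    data = [line.strip() for line in f if line.split('@')[1].split('.')[0] not in bad]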
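
The question also asks for the remaining lines to be written to newfile, which the snippet above does not do. Here is a minimal sketch of that step. It assumes the same rule as above (compare the first dot-separated label after the '@' against bad) and that lines without an '@' should be kept. The names keep_line and filter_file are made up for illustration.

```python
def keep_line(line, bad):
    """Return True if the first domain label after '@' is not in bad."""
    local, sep, rest = line.partition('@')
    if not sep:
        # Assumption: a line without '@' is not an address line, so keep it.
        return True
    domain = rest.split(';', 1)[0]          # drop the ';description' part
    return domain.split('.', 1)[0] not in bad


def filter_file(src, dst, bad):
    """Copy the lines of src that pass keep_line() into dst."""
    bad = set(bad)                          # set membership tests are O(1)
    with open(src) as fin, open(dst, 'w') as fout:
        for line in fin:
            line = line.strip()
            if line and keep_line(line, bad):
                fout.write(line + '\n')
```

Calling filter_file('a', 'newfile', ['abc', 'qwe']) would then produce the output file. Using partition instead of split('@')[1] avoids an IndexError on a line that has no '@'.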

Joel Cornett answered May 28 '26 04:05


import re
bad = ['abc', 'qwe']

with open('a') as f:
    # keep lines where none of the bad words appears after the '@'
    print [line.strip()
           for line in f
           if not re.search('|'.join(bad), line.partition('@')[2])]

This solution works as long as bad only contains ordinary characters (e.g. letters, numbers, underscores) and nothing that has a special meaning in a regular expression, such as 'a|b', as @phihag pointed out.
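
If bad might contain regex metacharacters, one option is to pass each entry through re.escape before joining. The sketch below is an assumption about how that could look, not part of the original answer. The names build_pattern and filter_lines are made up, and it searches only the text between the '@' and the first ';' so that a match in the description does not remove a line.

```python
import re


def build_pattern(bad):
    # re.escape makes every entry match literally, so 'a|b' or 'a.b' is safe.
    return re.compile('|'.join(re.escape(word) for word in bad))


def filter_lines(lines, bad):
    """Return the stripped lines whose domain part contains no bad word."""
    lines = [line.strip() for line in lines]
    if not bad:
        # An empty pattern would match every line, so return early.
        return lines
    pattern = build_pattern(bad)
    kept = []
    for line in lines:
        # Domain part: everything after '@' and before the first ';'.
        domain = line.partition('@')[2].partition(';')[0]
        if not pattern.search(domain):
            kept.append(line)
    return kept
```

Note that this is a substring search, so 'abc' would also match a domain such as 'xabcx.com'. If only whole labels should match, the non-regex comparison is the simpler choice.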

jamylak answered May 28 '26 05:05


