 

Keep all alphabetic and numeric characters in Python

Tags:

python-3.x

I have a file.txt and I want to keep only the alphabetic and numeric characters, without whitespace, and save them in a list. Is there another way to do it? This is my new code, which is faster. What do you think about it?

fin = open(fcompiti, encoding = 'UTF-8')
s = fin.read()
s = s.replace(' ', '').replace('\n','')
MISTERCEC asked Jul 02 '26 01:07



2 Answers

Regex (regular expressions) are your friend.

import re

with open('file.txt') as fin:
    s = fin.read()
# Strip every run of non-word characters and underscores
alphanums = re.sub(r'[\W_]+', '', s)

This answer will give you more knowledge and examples on how and why this works.
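
To see what the pattern does, here is a minimal sketch on a made-up sample string (the string and the variable names are assumptions, not taken from the question). `\W` matches anything that is not a word character, and `_` is listed separately because `\w` treats the underscore as a word character.

```python
import re

# Hypothetical sample text standing in for the contents of file.txt.
sample = "Hello, World_2024!\nfoo bar\t42"

# Remove every run of non-word characters and underscores.
alphanums = re.sub(r'[\W_]+', '', sample)

# The question asks for a list; list() splits the string into characters.
chars = list(alphanums)

print(alphanums)  # HelloWorld2024foobar42
```

Note that in Python 3 `\w` also matches non-ASCII letters and digits, so accented characters are kept as well.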

emporerblk answered Jul 04 '26 18:07



You can try with a regex, which may or may not be faster than your approach (depending on the size and structure of your text).

import re

with open('file.txt') as f:
    s = f.read()

# A-Za-z rather than A-z: the range A-z also matches [ \ ] ^ _ and `
s = ''.join(re.findall(r'[A-Za-z\d]+', s))
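
Whether the regex is actually faster is something to measure rather than guess. A rough sketch with `timeit`, assuming a synthetic input string (the text, the function names, and the repetition count are all made up for illustration):

```python
import re
import timeit

# Synthetic input; a real file may behave differently.
text = "abc 123, def_456!\n" * 10_000

def with_regex(s):
    # Join every run of the letters A-Z, a-z and digits.
    return ''.join(re.findall(r'[A-Za-z\d]+', s))

def with_generator(s):
    # Keep each character that is a letter or a number.
    return ''.join(c for c in s if c.isalnum())

# Both approaches should agree on this ASCII-only input.
assert with_regex(text) == with_generator(text)

for fn in (with_regex, with_generator):
    seconds = timeit.timeit(lambda: fn(text), number=10)
    print(f"{fn.__name__}: {seconds:.3f}s for 10 runs")
```

The two functions only agree on ASCII input: `str.isalnum()` also accepts non-ASCII letters, which the `[A-Za-z\d]` class does not.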

As a side note, your code is not as memory efficient as it could be. Instead of creating a list in memory and then passing it to join, you can use a generator expression.

s = ''.join(c for c in s if c.isalpha() or c.isnumeric())
# note absence of square brackets
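
If memory is the concern, the file itself can also be processed lazily instead of being read in one go. A sketch under the assumption that iterating line by line is acceptable; `keep_alnum` and the in-memory `io.StringIO` stand-in for file.txt are hypothetical:

```python
import io

def keep_alnum(lines):
    # Yield alphanumeric characters one at a time, line by line.
    for line in lines:
        for c in line:
            if c.isalnum():
                yield c

# io.StringIO stands in for open('file.txt', encoding='UTF-8') here.
fake_file = io.StringIO("ab 1\nc_d!\n2 3\n")
chars = list(keep_alnum(fake_file))
print(chars)  # ['a', 'b', '1', 'c', 'd', '2', '3']
```

The final list still holds every kept character, so this only avoids holding the whole raw file in memory at once.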
DeepSpace answered Jul 04 '26 19:07



