I have a file.txt and I just want to keep all alphabetic and numeric characters, without whitespace, and save them in a list. Is there another way to do it? This is the new code, which is faster. What do you think about it?
fin = open(fcompiti, encoding = 'UTF-8')
s = fin.read()
s = s.replace(' ', '').replace('\n','')
Regular expressions (regex) are your friend.
import re

# the with block closes the file automatically
with open('file.txt') as fin:
    s = fin.read()
alphanums = re.sub(r'[\W_]+', '', s)
This answer will give you more knowledge and examples on how and why this works.
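As a small illustration of why the pattern works: `\W` matches any character that is not a word character, and adding `_` to the class also drops underscores, which `\w` would otherwise keep. The sample string below is made up for the example.

```python
import re

# Made-up sample text with punctuation, whitespace and an underscore
sample = "Hello, world_42!\n  Second line: 3.14"

# Remove every run of non-word characters and underscores
cleaned = re.sub(r'[\W_]+', '', sample)
print(cleaned)  # Helloworld42Secondline314
```

If you need a list of characters rather than a string, `list(cleaned)` gives you one.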
You can try with a regex, which may or may not be faster than your approach (depending on the size and structure of your text).
import re
with open('file.txt') as f:
    s = f.read()
# A-z would also match [ \ ] ^ _ and `, so spell out both ranges
s = ''.join(re.findall(r'[\dA-Za-z]+', s))
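Whether the regex really is faster is easy to check on your own data. The following is only a rough sketch with `timeit`: the sample text, the function names and the repeat count are assumptions, and the results will vary with the size and content of the real file.

```python
import re
import timeit

# Made-up sample data; replace it with the contents of your file
text = "some text, with_punctuation 123\n" * 1000

# [^\W_] matches a single letter or digit (a word character except underscore)
PATTERN = re.compile(r'[^\W_]+')

def with_regex(s):
    return ''.join(PATTERN.findall(s))

def with_str_methods(s):
    return ''.join(c for c in s if c.isalnum())

# Both approaches should agree on this sample before timing them
assert with_regex(text) == with_str_methods(text)

for func in (with_regex, with_str_methods):
    seconds = timeit.timeit(lambda: func(text), number=20)
    print(f"{func.__name__}: {seconds:.4f} s")
```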
As a side note, your code is not as memory-efficient as it could be. Instead of creating a list in memory and then passing it to join, you can use a generator expression.
s = ''.join(c for c in s if c.isalpha() or c.isnumeric())
# note absence of square brackets
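Since the question asks for a list in the end, here is a short sketch of both forms side by side. The sample string is made up, and `str.isalnum()` is swapped in as a shorter spelling of the `isalpha() or isnumeric()` check.

```python
# Made-up sample input
s = "abc 123\n_x!"

# List comprehension: use this when the list itself is the result you want
chars = [c for c in s if c.isalnum()]

# Generator expression: no intermediate list is written out in your code
joined = ''.join(c for c in s if c.isalnum())

print(chars)   # ['a', 'b', 'c', '1', '2', '3', 'x']
print(joined)  # abc123x
```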