I am trying to read a text file, split it into words, and store the words in a list.
The current problem I'm having is getting rid of commas and periods from the text file.
My code is below:
#Process a '*.txt' file.
def Process():
    name = input("What is the name of the file you would like to read from? ")
    file = open(name, "r")
    text = [word for line in file for word in line.lower().split()]
    word = word.replace(",", "")
    word = word.replace(".", "")
    print(text)
The output I'm currently getting is this:
['this', 'is', 'the', 'first', 'line', 'of', 'the', 'file.', 'this', 'is', 'the', 'second', 'line.']
As you can see, the words "file" and "line" have a period at the end of them.
The text file I'm reading is:
This is the first line of the file.
This is the second line.
Thanks in advance.
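These lines have no effect on the list:
    word = word.replace(",", "")
    word = word.replace(".", "")
They run once, after the list comprehension has already built text, and assign only to the name word, so the strings stored in text are never changed. Just change your list comprehension to this: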
[word.replace(",", "").replace(".", "")
for line in file for word in line.lower().split()]
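As a rough illustration of the point above, here is a minimal sketch that uses an in-memory list of lines in place of the file, so it runs without any input. The names `lines` and `clean_words` are made up for this example and are not part of the original code. It also shows that `str.replace` returns a new string rather than modifying the original, which is why the cleaned value has to end up in the list itself.

```python
# Stand-in for the file from the question (assumption: iterating over
# the opened file yields lines like these).
lines = [
    "This is the first line of the file.\n",
    "This is the second line.\n",
]


def clean_words(lines):
    """Lower-case each line, split on whitespace, and drop commas and periods."""
    return [
        word.replace(",", "").replace(".", "")
        for line in lines
        for word in line.lower().split()
    ]


# Strings are immutable: replace() returns a new string and leaves
# the original untouched.
original = "file."
cleaned = original.replace(".", "")
print(original, cleaned)

print(clean_words(lines))
```

Because the replacement happens per word inside the comprehension, every element of the resulting list is already cleaned when the list is built.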
Maybe strip is more appropriate than replace here, since it removes the given characters only from the start and end of each word:
def Process():
    name = input("What is the name of the file you would like to read from? ")
    file = open(name, "r")
    text = [word.strip(",.") for line in file for word in line.lower().split()]
    print(text)
>>> help(str.strip)
Help on method_descriptor:

strip(...)
    S.strip([chars]) -> string or unicode

    Return a copy of the string S with leading and trailing
    whitespace removed.
    If chars is given and not None, remove characters in chars instead.
    If chars is unicode, S will be converted to unicode before stripping
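The practical difference between the two approaches shows up when punctuation sits inside a word. The sketch below compares them on a few sample strings. The helper names are hypothetical, and the use of `string.punctuation` is an extension beyond the commas and periods asked about in the question, so treat it as one possible variation rather than a required change.

```python
import string


def strip_ends(word, chars=",."):
    """Remove the given characters from both ends of the word only."""
    return word.strip(chars)


def replace_all(word):
    """Remove every comma and period, wherever it appears in the word."""
    return word.replace(",", "").replace(".", "")


def strip_punctuation(word):
    """Variation (assumption): strip any ASCII punctuation from both ends."""
    return word.strip(string.punctuation)


samples = ["file.", "line,", "3.14", "don't!", '"quoted".']
for sample in samples:
    # Print the three results side by side for comparison.
    print(sample, strip_ends(sample), replace_all(sample), strip_punctuation(sample))
```

With these samples, `strip_ends` leaves the decimal point in "3.14" alone while `replace_all` removes it, so which one is appropriate depends on whether punctuation inside a word should survive.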