Stripping Commas and Periods

I am trying to read a text file, split it into words, and collect the words in a list.

The current problem I'm having is getting rid of commas and periods from the text file.

My code is below:

#Process a '*.txt' file.
def Process():
    name = input("What is the name of the file you would like to read from? ")

    file = open( name , "r" )
    text = [word for line in file for word in line.lower().split()]
    word = word.replace(",", "")
    word = word.replace(".", "")

    print(text)

The output I'm currently getting is this:

['this', 'is', 'the', 'first', 'line', 'of', 'the', 'file.', 'this', 'is', 'the', 'second', 'line.']

As you can see, the words "file" and "line" have a period at the end of them.

The text file I'm reading is:

This is the first line of the file.

This is the second line.

Thanks in advance.

Keyfer Mathewson asked Mar 20 '13 22:03


2 Answers

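These lines have no effect on text. They run after the list comprehension has already built the list, and replace returns a new string rather than modifying the ones stored in the list: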

word = word.replace(",", "")
word = word.replace(".", "")

Just change your list comprehension to this, so the punctuation is removed from each word as the list is built:

text = [word.replace(",", "").replace(".", "")
        for line in file for word in line.lower().split()]
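
As a minimal sketch of the fix, here is the same comprehension run on an in-memory list instead of a file. The `lines` list and the `clean_words` helper are hypothetical names used only for illustration; the two sample lines are assumed to match the text file in the question.

```python
# Stand-in for the file from the question (assumption: the same two lines).
lines = ["This is the first line of the file.\n", "This is the second line.\n"]


def clean_words(lines):
    """Lowercase each line, split on whitespace, and drop commas and periods."""
    # replace() returns a new string, so it must be applied to each word
    # while the list is being built, not to a leftover variable afterwards.
    return [word.replace(",", "").replace(".", "")
            for line in lines for word in line.lower().split()]


text = clean_words(lines)
print(text)
```

With this change the words "file" and "line" should come out without the trailing period.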
jamylak answered Oct 05 '22 23:10


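Maybe strip is more appropriate than replace, since it only removes the given characters from the start and end of each word and leaves any inside the word alone.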

def Process():
    name = input("What is the name of the file you would like to read from? ")

    # The with block closes the file once the list has been built.
    with open(name, "r") as file:
        text = [word.strip(",.") for line in file for word in line.lower().split()]
    print(text)
>>> help(str.strip)
Help on method_descriptor:

strip(...)
    S.strip([chars]) -> string or unicode

    Return a copy of the string S with leading and trailing
    whitespace removed.
    If chars is given and not None, remove characters in chars instead.
    If chars is unicode, S will be converted to unicode before stripping
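
To illustrate how the two approaches differ, here is a small sketch. The sample words are made up for illustration and are not from the question. strip(",.") only removes commas and periods at the ends of a word, while the chained replace removes them everywhere.

```python
# Hypothetical sample words with punctuation in different positions.
words = ["file.", "e.g.,", "3.14", "...end"]

# strip() trims the listed characters from both ends only.
stripped = [w.strip(",.") for w in words]

# replace() removes every occurrence, including ones inside the word.
replaced = [w.replace(",", "").replace(".", "") for w in words]

print(stripped)  # ['file', 'e.g', '3.14', 'end']
print(replaced)  # ['file', 'eg', '314', 'end']
```

Which one is right depends on whether punctuation inside a word, such as the decimal point in "3.14", should survive.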
John La Rooy answered Oct 06 '22 01:10