
How to return unique words from a text file using Python

How do I return all the unique words from a text file using Python? For example:

I am not a robot

I am a human

Should return:

I

am

not

a

robot

human

Here is what I've done so far:

def unique_file(input_filename, output_filename):
    input_file = open(input_filename, 'r')
    file_contents = input_file.read()
    input_file.close()
    word_list = file_contents.split()

    file = open(output_filename, 'w')

    for word in word_list:
        if word not in word_list:
            file.write(str(word) + "\n")
    file.close()

The text file that Python creates is empty. I'm not sure what I am doing wrong.

user927584 asked Apr 10 '14 04:04




1 Answer

for word in word_list:
    if word not in word_list:

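Every word is in word_list by definition, because the loop iterates over that same list. The condition is therefore never true, so nothing is ever written to the file.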
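To see why the condition can never be true, here is a small sketch that uses the sample text from the question as an in-memory string instead of a file (the variable name `written` is only for illustration):

```python
# Sample text from the question, split into words the same way.
word_list = "I am not a robot\nI am a human".split()

# Each word is drawn from word_list, so "word not in word_list"
# is False for every word and nothing is collected.
written = [word for word in word_list if word not in word_list]

print(written)  # []
```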

Instead of that logic, use a set:

unique_words = set(word_list)
for word in unique_words:
    file.write(str(word) + "\n")

Sets hold only unique members, which is exactly what you're trying to achieve.
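Put together, the function from the question might look like the sketch below. It keeps the original function and parameter names; the `with` blocks are an addition here so the files are closed even if an error occurs.

```python
def unique_file(input_filename, output_filename):
    # Read the whole file and split it on whitespace.
    with open(input_filename, 'r') as input_file:
        word_list = input_file.read().split()

    # A set drops duplicate words (and does not keep their order).
    unique_words = set(word_list)

    # Write one unique word per line.
    with open(output_filename, 'w') as output_file:
        for word in unique_words:
            output_file.write(word + "\n")
```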

Note that order won't be preserved, but you didn't specify if that's a requirement.
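If order does matter, one option is to deduplicate with a dict instead of a set, since dict keys are unique and keep insertion order in Python 3.7 and later. This is a sketch of that alternative, not part of the approach above; the helper name `unique_words_in_order` is made up for illustration.

```python
def unique_words_in_order(text):
    # dict.fromkeys keeps the first occurrence of each word and
    # preserves insertion order (guaranteed since Python 3.7).
    return list(dict.fromkeys(text.split()))


print(unique_words_in_order("I am not a robot\nI am a human"))
# ['I', 'am', 'not', 'a', 'robot', 'human']
```

The resulting list can then be written out one word per line, in the same way as the set.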

mhlester answered Nov 07 '22 16:11