How do I return all the unique words from a text file using Python? For example:
I am not a robot
I am a human
Should return:
I
am
not
a
robot
human
Here is what I've done so far:
def unique_file(input_filename, output_filename):
    input_file = open(input_filename, 'r')
    file_contents = input_file.read()
    input_file.close()
    word_list = file_contents.split()

    file = open(output_filename, 'w')

    for word in word_list:
        if word not in word_list:
            file.write(str(word) + "\n")
    file.close()
The text file that Python creates has nothing in it. I'm not sure what I'm doing wrong.
To count the unique words in a text file: Read the contents of the file into a string and split it into words. Use the set() class to convert the list to a set object. Use the len() function to count the unique words in the text file.
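The counting steps above can be sketched roughly as follows. The function names `count_unique_words` and `count_unique_words_in_file` are made up for illustration, and the sketch assumes words are separated by whitespace and compared case-sensitively (so "I" and "i" would count as two words).

```python
def count_unique_words(text):
    """Return the number of distinct whitespace-separated words in text."""
    words = text.split()      # split on any run of whitespace
    return len(set(words))    # a set drops duplicates; len() counts what is left


def count_unique_words_in_file(filename):
    """Read a whole file and count its distinct words (hypothetical helper)."""
    with open(filename, "r") as f:
        return count_unique_words(f.read())


print(count_unique_words("I am not a robot\nI am a human"))  # → 6
```

Punctuation is not stripped here, so "robot" and "robot." would be treated as different words; whether that matters depends on the input.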
The problem is in this loop:

for word in word_list:
    if word not in word_list:

Every word is in word_list, by definition from the first line, so the condition is never true and nothing is ever written to the file.
Instead of that logic, use a set:
unique_words = set(word_list)
for word in unique_words:
    file.write(str(word) + "\n")
Sets only hold unique members, which is exactly what you're trying to achieve.
Note that order won't be preserved, but you didn't specify if that's a requirement.
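If order does matter (the expected output in the question lists words in order of first appearance), one possible variation is to use `dict.fromkeys` instead of a set, since dict keys are unique and keep insertion order in Python 3.7 and later. This is only a sketch: `unique_words_in_order` is a helper name invented here, and `unique_file` keeps the signature from the question but uses `with` blocks so the files are closed automatically.

```python
def unique_words_in_order(text):
    """Return the distinct words of text in order of first appearance."""
    # dict keys are unique and preserve insertion order (Python 3.7+),
    # so duplicates are dropped while the first occurrence keeps its place.
    return list(dict.fromkeys(text.split()))


def unique_file(input_filename, output_filename):
    """Write each distinct word of the input file on its own line."""
    with open(input_filename, "r") as input_file:
        words = unique_words_in_order(input_file.read())

    with open(output_filename, "w") as output_file:
        for word in words:
            output_file.write(word + "\n")
```

For the two-line example in the question, this would write I, am, not, a, robot, human, one per line.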