
How to find common words and print them using Python?

Tags: python-3.x

I want to cross-check names from two Word documents and then print the names common to both in the same program. How do I do this? Should I use regex or simply the in operator?

user1596017 asked Aug 13 '12 17:08


2 Answers

Once you have the text out of the Word documents, it's really quite easy:

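document_1_text = 'This is document one'
document_2_text = 'This is document two'

# split() breaks on whitespace; punctuation stays attached to words
document_1_words = document_1_text.split()
document_2_words = document_2_text.split()

# Words that appear in both documents
common = set(document_1_words).intersection(document_2_words)
# Words that appear in exactly one of the documents
unique = set(document_1_words).symmetric_difference(document_2_words)

print(common)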
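On the regex question: plain split() treats 'Bob,' and 'bob' as different words, so names next to punctuation or in different case will not match. A minimal sketch of one way to normalize first, assuming case-insensitive matching is acceptable (extract_words and the sample strings are made up for illustration):

```python
import re

def extract_words(text):
    # Lowercase the text, then collect runs of letters, digits, and apostrophes.
    return set(re.findall(r"[a-z0-9']+", text.lower()))

doc_1 = "Alice, Bob and Carol attended."
doc_2 = "Present: bob; carol; Dave."

common = extract_words(doc_1) & extract_words(doc_2)
for name in sorted(common):
    print(name)
# prints:
# bob
# carol
```

The character class only covers ASCII; for accented names, a pattern such as r"\w+" may be a better fit.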

If you're not sure how to get the text out of Word docs:

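from win32com.client import Dispatch

def get_text_from_doc(filename):
    # Needs Windows with Microsoft Word and pywin32; pass an absolute path
    word = Dispatch('Word.Application')
    word.Visible = False
    try:
        wdoc = word.Documents.Open(filename)
        try:
            return wdoc.Content.Text.strip()
        finally:
            wdoc.Close(False)  # close the document without saving
    finally:
        word.Quit()  # do not leave a hidden Word process running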
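If Word is not installed, or the script does not run on Windows, the text can also be read straight from the file. This is a rough sketch that assumes the documents are in .docx format (a zip archive holding word/document.xml), not the legacy .doc format. get_text_from_docx is a hypothetical helper, and it ignores headers, footnotes, and nested layouts such as text boxes:

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by WordprocessingML elements such as <w:p> and <w:t>
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def get_text_from_docx(path):
    """Return the body text of a .docx file, one paragraph per line."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is spread over one or more <w:t> runs.
        paragraphs.append("".join(t.text or "" for t in para.iter(W_NS + "t")))
    return "\n".join(paragraphs)
```

The returned string can be passed to split() in the same way as the sample text above.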
Matthew Trevor answered Sep 18 '22 20:09


str1 = "Hello world its a demo"
str2 = "Hello world"

str1_words = set(str1.split())
str2_words = set(str2.split())

common = str1_words & str2_words
print(common)

output (set order may vary):

{'Hello', 'world'}
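Sets are unordered, so the common words may print in any order. If the order of appearance matters, one possible approach is to walk the first text and test each word against a set built from the second using the in operator. A small sketch (common_in_order is an illustrative name, not part of any library):

```python
def common_in_order(text_a, text_b):
    # Set membership with `in` compares whole words, not substrings.
    words_b = set(text_b.split())
    seen = set()
    ordered = []
    for word in text_a.split():
        if word in words_b and word not in seen:
            seen.add(word)
            ordered.append(word)
    return ordered

print(common_in_order("Hello world its a demo", "Hello world"))
# prints ['Hello', 'world']
```

Note that using in directly on two strings would do substring matching instead ('Ann' in 'Hannah' is True), which is usually not what is wanted for names.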
Suman answered Sep 21 '22 20:09