I want to cross-check names from two Word documents and then print the common names in the same program. How do I do that? Should I use regex or simply use the in operator?
The Python print() function takes in any number of parameters, and prints them out on one line of text. The items are each converted to text form, separated by spaces, and there is a single '\n' at the end (the "newline" char). When called with zero parameters, print() just prints the '\n' and nothing else.
Once you have the text out of the Word documents, it's really quite easy:
document_1_text = 'This is document one'
document_2_text = 'This is document two'
document_1_words = document_1_text.split()
document_2_words = document_2_text.split()
common = set(document_1_words).intersection( set(document_2_words) )
unique = set(document_1_words).symmetric_difference( set(document_2_words) )
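One caveat with a plain split(): punctuation stays attached to the words ('Bob,' is not equal to 'Bob') and the comparison is case-sensitive. The following is a minimal sketch of one way around that, assuming each name is a single word and that a simple regex tokenizer is good enough. The helper name extract_words and the sample sentences are made up for illustration and are not part of the original answer.

```python
import re

def extract_words(text):
    # \w+ matches runs of letters, digits and underscores, so punctuation is dropped.
    # casefold() makes the comparison case-insensitive.
    return {word.casefold() for word in re.findall(r"\w+", text)}

document_1_text = 'Alice, Bob and Carol attended.'
document_2_text = 'Minutes sent to bob, Carol and Dave.'

# Words that appear in both documents, ignoring case and punctuation
common = extract_words(document_1_text) & extract_words(document_2_text)
print(sorted(common))
```

Note that the result also contains ordinary words shared by both documents (here 'and'), so if only names are wanted, the intersection still has to be filtered against a list of known names.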
If you're not sure how to get the text out of Word docs:
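from win32com.client import Dispatch  # pywin32; requires Windows with Word installed

def get_text_from_doc(filename):
    # Start Word through COM without showing a window
    word = Dispatch('Word.Application')
    word.Visible = False
    # An absolute path is safest here
    wdoc = word.Documents.Open(filename)
    if wdoc:
        text = wdoc.Content.Text.strip()
        wdoc.Close(False)  # close without saving so the file is not left open
        return text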
str1 = "Hello world its a demo"
str2 = "Hello world"
str1_words = set(str1.split())
str2_words = set(str2.split())
common = str1_words & str2_words
output:
common = {'Hello', 'world'}
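On the "regex or in" part of the question: if you already have a list of the names to look for, the in operator on the document text is usually enough, and it also handles names made of several words, which a word-by-word set intersection cannot. A rough sketch, assuming such a list exists; names_in_both, known_names and the sample texts are hypothetical and only for illustration.

```python
def names_in_both(names, text_1, text_2):
    # Compare case-insensitively by folding both the texts and the names
    folded_1, folded_2 = text_1.casefold(), text_2.casefold()
    # Keep the names that occur as a substring in both texts
    return [name for name in names
            if name.casefold() in folded_1 and name.casefold() in folded_2]

known_names = ['Mary Ann Smith', 'John Doe', 'Li Wei']
doc_1 = 'Attendees: Mary Ann Smith, John Doe.'
doc_2 = 'Signed by John Doe and Li Wei.'

common_names = names_in_both(known_names, doc_1, doc_2)
for name in common_names:
    print(name)
```

Because in is a plain substring test, a short name such as 'Ann' would also match inside 'Annabel'. If that matters, a regex with word boundaries (\b) around the escaped name is the stricter alternative.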