I want to write a program that copies text from one Word document and pastes it into another. I'm trying to do that using the python-docx library. The following code works, but it copies only the text: the bold, italic, underlined and colored parts lose their formatting:
from docx import Document
input = Document('SomeDoc.docx')
paragraphs = []
for para in input.paragraphs:
    p = para.text
    paragraphs.append(p)
output = Document()
for item in paragraphs:
    output.add_paragraph(item)
output.save('OutputDoc.docx')
I've also tried copying the paragraph object directly into the output document, but that doesn't work either:
from docx import Document
input = Document('SomeDoc.docx')
output = Document()
for para in input.paragraphs:
    output.add_paragraph(para)
output.save('OutputDoc.docx')
To copy the text together with its styles, you will need to write your own function, as python-docx has no function that does this.
This is the function I wrote:
def get_para_data(output_doc_name, paragraph):
    """
    Write each run of the paragraph to the new file, then set the run's bold, italic, underline, color and style data and the paragraph's alignment.
    """
    output_para = output_doc_name.add_paragraph()
    for run in paragraph.runs:
        output_run = output_para.add_run(run.text)
        # Run's bold data
        output_run.bold = run.bold
        # Run's italic data
        output_run.italic = run.italic
        # Run's underline data
        output_run.underline = run.underline
        # Run's color data
        output_run.font.color.rgb = run.font.color.rgb
        # Run's style data: assign the style by name (setting style.name
        # would rename the output document's style instead). A style that
        # does not exist in the output document is skipped.
        try:
            output_run.style = run.style.name
        except KeyError:
            pass
    # Paragraph's alignment data
    output_para.paragraph_format.alignment = paragraph.paragraph_format.alignment
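Let's see what this function does:
1. Adds a new paragraph object to the file.
2. Adds a new run to that paragraph for each run in the source paragraph.
3. Checks whether each of bold, italic and underline is True, False or None. If it's True, the run will be in that style; if it's False, it won't be in that style; and if it's None, the setting is inherited from the default style of the paragraph it's in. Then it applies these settings to the run.
4. Reads the run's RGB color and applies it to the new run.
5. Reads the run's style and applies it to the new run.
6. Reads the paragraph's alignment and applies it to the new paragraph.
You need to pass it your output document object and the paragraph you want to copy. For example: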
# Imports
from docx import Document
input_doc = Document('InputDoc.docx')
output_doc = Document()
# Call the function
get_para_data(output_doc, input_doc.paragraphs[3])
# Save the new file
output_doc.save('OutputDoc.docx')
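As a side note on the True / False / None values that bold and italic can take, here is a minimal sketch of how such a tri-state setting can be resolved. The function name effective_setting and the single-level fallback are assumptions made for illustration: python-docx does not provide a helper like this, and Word resolves inheritance through a longer chain of styles.

```python
def effective_setting(run_value, inherited_value):
    """Resolve a tri-state formatting flag (hypothetical helper).

    run_value is True, False or None, as returned by run.bold or run.italic.
    inherited_value is the value supplied by the enclosing style.
    """
    # None means "not set on the run", so the inherited value applies.
    if run_value is None:
        return inherited_value
    # True or False set directly on the run overrides the inherited value.
    return run_value


print(effective_setting(None, True))   # falls back to the inherited value
print(effective_setting(False, True))  # explicitly switched off on the run
```

This is why copying run.bold as-is is usually enough: a None value stays None in the output, so the new run keeps inheriting from whatever style applies there.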
If you'd like to copy the entire document I suggest you do this:
for para in input_doc.paragraphs:
    get_para_data(output_doc, para)
output_doc.save('OutputDoc.docx')
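The function above does not carry over the typeface or the font size. A possible extension is sketched below, assuming the runs expose font.name and font.size the way python-docx runs do. The helper name copy_run_font is hypothetical, and it only copies values set directly on the run: None means the value is inherited from a style, and assigning None leaves the new run inheriting as well.

```python
def copy_run_font(source_run, target_run):
    """Copy typeface and size from one run to another (hypothetical helper)."""
    # font.name is the typeface name, or None when inherited from a style.
    target_run.font.name = source_run.font.name
    # font.size is a length value, or None when inherited from a style.
    target_run.font.size = source_run.font.size
```

Inside get_para_data it could be called as copy_run_font(run, output_run) right after the new run is created.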