I'm currently trying to work with docx files using the PHPWord library and its templating system. I found and updated someone's patch to this library (I can't remember whose, but it's not important) that can work with tables: it replicates a table's rows and then uses the standard setValue() from PHPWord on each row.
If I create my own document, the XML has a normal structure, so the variable to be replaced, ${variable}, sits in its own tag like this:
<w:tbl>
<w:tr>
...
${variable}
</w:tr>
</w:tbl>
I simplified the code; the actual code contains a number of other tags describing sizes, styles, etc.
My problem is that I have to process documents from other people, and I am not allowed to make big changes to them. I get a document that at some point contains a table with one blank row. I add the ${variable} variables and run it through PHPWord. The problem is that it fails. After doing some research, I found out that the source XML looks like this:
....
...
${va
...
riab
...
le}
....
(again heavily simplified, but you get the picture)
This structure is a problem for me, because the function that clones rows relies on strpos(), substr() and regular expressions, and it does not work with this structure (and I can't imagine an elegant way to make it work).
So the question is: does anybody know why docx does this and how to prevent it? I am looking for a solution in Word, not PHP (I need the current functions to work without much editing).
I have worked on this problem a lot.
In Word, the document can end up saved like this:
<w:t>{</w:t>...
<w:t>variable</w:t>
<w:t>}</w:t>
I have therefore created a JS library that works even if variable names are split: Docxtemplater (it works server-side too). What I found out during development is that variable names aren't split if:
I don't think there's a way to fix a docx document with one command in Word, but retyping the variables in one stroke should work.
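To find out which variables need retyping, a small script can compare the visible text of each paragraph with the raw XML. The following is a minimal sketch in Python (used here only for illustration, it is not part of PHPWord or Docxtemplater); `SAMPLE`, `find_split_placeholders` and the regular expressions are hypothetical names and assumptions, and the regex-based parsing ignores edge cases such as nested paragraphs in text boxes.

```python
import re

# Hypothetical fragment: the placeholder ${variable} split across three runs.
SAMPLE = (
    '<w:p>'
    '<w:r><w:t>${va</w:t></w:r>'
    '<w:r><w:t>riab</w:t></w:r>'
    '<w:r><w:t>le}</w:t></w:r>'
    '</w:p>'
)

# <w:p> or <w:p ...>, but not <w:pPr>
PARAGRAPH = re.compile(r'<w:p[ >].*?</w:p>', re.DOTALL)
# <w:t> or <w:t ...>, but not <w:tbl>, <w:tr>, <w:tab/>
TEXT_RUN = re.compile(r'<w:t(?: [^>]*)?>(.*?)</w:t>', re.DOTALL)
PLACEHOLDER = re.compile(r'\$\{[^}]*\}')


def find_split_placeholders(xml):
    """Return placeholders that appear in the visible text of a paragraph
    but not as one contiguous string in its XML."""
    split = []
    for paragraph in PARAGRAPH.findall(xml):
        # Join the text of all runs, as a reader of the document would see it.
        visible_text = ''.join(TEXT_RUN.findall(paragraph))
        for placeholder in PLACEHOLDER.findall(visible_text):
            if placeholder not in paragraph:
                split.append(placeholder)
    return split


print(find_split_placeholders(SAMPLE))
```

A plain substring search for `${variable}` on `SAMPLE` finds nothing, which is the same reason strpos()-based code fails on such a document; the joined run text still contains the placeholder.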
The primary cause of this is the proofErr element: Word identifies something that it deems spelt incorrectly and wraps it in <w:proofErr> elements, inevitably splitting the original text.
If this happens to you I recommend the following. It's tedious, but it's the only sure-fire way:
1. Rename the .docx to .zip.
2. Open word\document.xml from the archive.
3. Repair the split placeholders.
4. Rename the .zip back to .docx.
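If there are many documents, the manual steps above can also be scripted. The following is a rough sketch in Python, not a tested tool: `repair_placeholders` and `repair_docx` are hypothetical names, and the approach (stripping every XML tag found between `${` and `}`) assumes the placeholder text itself never contains `<` or `}`. Formatting that differed between the merged runs is lost, so keep a copy of the original file.

```python
import re
import zipfile

# A "$", optionally followed by tags, then "{ ... }" with tags mixed in.
BROKEN_PLACEHOLDER = re.compile(r'\$(?:<[^>]+>)*\{[^}]*\}')
TAG = re.compile(r'<[^>]+>')


def repair_placeholders(xml):
    """Strip the XML tags that sit inside a ${...} placeholder."""
    return BROKEN_PLACEHOLDER.sub(lambda m: TAG.sub('', m.group(0)), xml)


def repair_docx(source_path, target_path):
    """Copy a .docx (a zip archive), repairing only word/document.xml."""
    with zipfile.ZipFile(source_path) as source, \
            zipfile.ZipFile(target_path, 'w', zipfile.ZIP_DEFLATED) as target:
        for item in source.infolist():
            data = source.read(item.filename)
            if item.filename == 'word/document.xml':
                text = data.decode('utf-8')
                data = repair_placeholders(text).encode('utf-8')
            # All other parts of the package are copied unchanged.
            target.writestr(item, data)
```

Usage would look like `repair_docx('template.docx', 'template_fixed.docx')`, where both file names are placeholders. The repaired file can then go through the existing strpos()/substr()-based code without changes to it.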
EDIT
This Visual Studio Extension lets you edit the contents of the OpenXML package directly. This allows you to skip steps 1 & 2.