We're looking for a program that can convert a .doc or .docx document to a .txt file. We're on Linux and want to run a website that converts user-uploaded doc files. We don't want to use OpenOffice/LibreOffice because we've had bad experiences with them. Pandoc can't handle .doc files :/
Does anyone have an idea?
Click the File tab. Do one of the following: To convert the document without saving a copy, click Info, and then click Convert. To create a new copy of the document in Word 2016 or Word 2013 mode, click Save As, and then choose the location and the folder where you want to save the new copy.
A .txt file contains only plain text characters, with no formatting, whereas a .doc file also stores Word document metadata such as font style, size, and page margins.
You will have to use two different command-line tools, depending on whether you are working with the .doc or the .docx format.
For .doc use catdoc:
catdoc foo.doc > foo.txt
For .docx use docx2txt:
docx2txt foo.docx
The latter will produce a file called foo.txt in the same directory as the original.
I'm not sure which Linux distribution you are using, but both catdoc and docx2txt are available from the Ubuntu repositories, for example:
apt-get install catdoc docx2txt
Or with Homebrew on Mac:
brew install docx2txt
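Since the website will receive both formats, the two commands above could be wrapped in one small shell function that picks the tool from the file extension. This is only a rough sketch: the function name convert_to_txt is made up for illustration, it assumes catdoc and docx2txt are installed and on the PATH, and it only matches lowercase extensions.

```shell
#!/bin/sh
# Hypothetical helper (not part of catdoc or docx2txt): choose the
# converter from the file extension and write foo.txt next to the input.
convert_to_txt() {
    input=$1
    case $input in
        *.docx)
            # docx2txt creates foo.txt beside foo.docx on its own
            docx2txt "$input"
            ;;
        *.doc)
            # catdoc prints the text to stdout, so redirect it to foo.txt
            catdoc "$input" > "${input%.doc}.txt"
            ;;
        *)
            # Anything else is rejected with a non-zero status
            echo "unsupported file type: $input" >&2
            return 2
            ;;
    esac
}
```

You would call it as convert_to_txt upload.doc or convert_to_txt upload.docx. For user uploads you probably also want to check the real file type instead of trusting the extension.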