Convert doc to txt via commandline

We're looking for a program that converts a .doc or .docx document to a .txt file. We're working on Linux and want to launch a website that converts user-uploaded .doc files. We don't want to use OpenOffice/LibreOffice because we've had bad experiences with it. Pandoc can't handle .doc files :/

Does anyone have an idea?

user698601 asked Jun 28 '11 16:06



1 Answer

You will have to use two different command-line tools, depending on whether you are working with the .doc or the .docx format.

For .doc use catdoc:

catdoc foo.doc > foo.txt

For .docx use docx2txt:

docx2txt foo.docx

The latter will produce a file called foo.txt in the same directory as the original.
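Since the website will receive both formats, the two commands above can be wrapped in one small script that picks the tool from the file extension. This is only a minimal sketch: the function names converter_for and doc_to_txt are made up for illustration, and it assumes catdoc and docx2txt are installed and behave as described above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the name of the converter for a given file,
# based only on its extension. Returns non-zero for anything else.
converter_for() {
    local name
    # Lowercase the name so FOO.DOC and foo.doc are treated alike.
    name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$name" in
        *.docx) echo docx2txt ;;
        *.doc)  echo catdoc ;;
        *)      return 1 ;;
    esac
}

# Hypothetical wrapper: convert INPUT to a .txt file next to the original.
doc_to_txt() {
    local input=$1 output="${1%.*}.txt" tool
    tool=$(converter_for "$input") || {
        echo "unsupported file type: $input" >&2
        return 1
    }
    case "$tool" in
        # catdoc writes to stdout, so redirect it into the .txt file.
        catdoc)   catdoc "$input" > "$output" ;;
        # docx2txt writes the .txt file next to the input on its own.
        docx2txt) docx2txt "$input" ;;
    esac
}
```

After sourcing the script, doc_to_txt uploads/foo.doc should leave uploads/foo.txt beside the original. Because the file names come from user uploads, keep the variables quoted as shown, and consider renaming uploads to a server-generated name before converting.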

I'm not sure which Linux distribution you are using, but both catdoc and docx2txt are available from the Ubuntu repositories, for example:

apt-get install catdoc docx2txt

Or with Homebrew on Mac:

brew install docx2txt
harlandski answered Sep 27 '22 23:09