
JavaScript library to read doc and docx on client

I am looking for a JavaScript library that can read .doc and .docx files. I only need the text content; I am not interested in pictures, formulas, or other special structures in the MS Word file.

It would be great if the library worked with the JavaScript FileReader, as shown in the code below.

function readExcel(currfile) {
  var reader = new FileReader();

  reader.onload = (function (_file) {
      return function (e) {
          //here should the magic happen
      };
  })(currfile);

  reader.onabort = function (e) {
      alert('File read canceled');
  };

  reader.readAsBinaryString(currfile);
}

I searched the internet, but I could not find what I was looking for.

Torben asked Jun 22 '17 12:06




1 Answer

You can use docxtemplater for this. It is normally used for templating, but it can also simply return the text of a document:

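// `content` is the binary content of the .docx file, e.g. e.target.result inside reader.onload
var zip = new JSZip(content);
var doc = new Docxtemplater().loadZip(zip);
var text = doc.getFullText();
console.log(text);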

See the docxtemplater documentation for installation instructions (I'm the maintainer of this project).

However, it only handles .docx files, not .doc.
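
To give an idea of what "just get the text" involves: a .docx file is a zip archive, and the body text sits in the part named word/document.xml, inside <w:t> elements grouped into <w:p> paragraphs. The sketch below is not how docxtemplater does it internally; it is a rough illustration that assumes you already have the unzipped XML as a string. The names extractDocxText and decodeXmlEntities are made up for this example, and the regex approach ignores tabs, line breaks, and other special structures.

```javascript
// Hypothetical helper: decode the five predefined XML entities.
function decodeXmlEntities(s) {
  return s
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, '&'); // done last so "&amp;lt;" becomes "&lt;", not "<"
}

// Hypothetical helper: collect visible text from the XML of word/document.xml.
// Assumption: `xml` is the already-unzipped content of that part, as a string.
function extractDocxText(xml) {
  // Matches either a <w:t ...>text</w:t> run (text in group 1)
  // or the end of a paragraph (</w:p>, group 1 undefined).
  var tokenRe = /<w:t(?:\s[^>]*[^>\/])?>([\s\S]*?)<\/w:t>|<\/w:p>/g;
  var paragraphs = [];
  var current = '';
  var m;
  while ((m = tokenRe.exec(xml)) !== null) {
    if (m[1] !== undefined) {
      current += decodeXmlEntities(m[1]); // text run: append to current paragraph
    } else {
      paragraphs.push(current);           // paragraph end: start a new one
      current = '';
    }
  }
  if (current !== '') {
    paragraphs.push(current);             // text after the last </w:p>, if any
  }
  return paragraphs.join('\n');
}
```

Getting the XML string out of the archive is the zip library's job. With the JSZip object from the snippet above, something like zip.file('word/document.xml').asText() should return it, but treat that call as an assumption and check it against the JSZip version you have installed. For anything beyond a quick experiment, a real XML parser or doc.getFullText() is the safer choice.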

edi9999 answered Sep 19 '22 16:09