I am following their "Code Example" guide on their github. https://github.com/modesty/pdf2json#code-example
In the example that says "Parse a PDF then write a .txt file (which only contains textual content of the PDF)", I copied and pasted the exact implementation into my a local JavaScript file and called it but the output text file was completely blank.
'use strict';
let fs = require('fs');
let PDFParser = require("pdf2json");
let pdfParser = new PDFParser();
pdfParser.on("pdfParser_dataError", errData => console.error(errData.parserError) );
pdfParser.on("pdfParser_dataReady", pdfData => {
fs.writeFile("./node_modules/pdf2json/test/F1040EZ.content.txt", pdfParser.getRawTextContent());
});
pdfParser.loadPDF("./node_modules/pdf2json/test/pdf/fd/form/F1040EZ.pdf");
Is it something that I am doing wrong? Or does this not work on their part? Also are there any alternatives to pdf to text converters for Nodejs without additional binaries installed?
The frontpage documentation is a bit wrong! In order to make this work simply set to PDFParser parameters null and 1
This one works:
var fs = require("fs");
// https://github.com/modesty/pdf2json
var PDFParser = require("./node_modules/pdf2json/PDFParser");
var pdfParser = new PDFParser(this,1);
pdfParser.on("pdfParser_dataError", errData => console.error(errData.parserError));
pdfParser.on("pdfParser_dataReady", pdfData => {
console.log(pdfParser)
fs.writeFile("./content.txt", pdfParser.getRawTextContent());
});
HTH -XDVarpunen
Link to issue in pdf2json: https://github.com/modesty/pdf2json/issues/76
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With