I have a PDF document that is saved in Google Drive. I can use the Google Drive Web UI search to find text in the document.
How can I programmatically extract a portion of the text in the document using Google Apps Script?
See pdfToText() in this gist.
To invoke the OCR built into Google Drive on a PDF file, e.g. myPDF.pdf, here is what you do:
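function myFunction() {
  // Look up the PDF by name; stop early if no such file exists
  var files = DriveApp.getFilesByName("myPDF.pdf");
  if (!files.hasNext()) {
    throw new Error("myPDF.pdf not found in Drive");
  }
  var blob = files.next().getBlob();
  // Get the text from the PDF using pdfToText() from the gist
  var filetext = pdfToText(blob, {keepTextfile: false});
  // Now do whatever you want with filetext...
}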
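Since the question asks for only a portion of the text, one way to narrow down filetext is plain string searching once the OCR result is in hand. The sketch below is a minimal illustration, not part of the gist: extractBetween is a hypothetical helper name, and the sample string and its markers are made up to stand in for whatever pdfToText() returns for your document.

```javascript
// Hypothetical helper (not part of the gist): returns the text found between
// two marker strings, or null if either marker is missing.
function extractBetween(text, startMarker, endMarker) {
  var start = text.indexOf(startMarker);
  if (start === -1) return null;
  start += startMarker.length;
  // Search for the end marker only after the start marker
  var end = text.indexOf(endMarker, start);
  if (end === -1) return null;
  return text.substring(start, end).trim();
}

// Made-up sample standing in for the string returned by pdfToText()
var sample = "Invoice No: 12345\nDate: 2020-01-01\nTotal: $99.00";
var invoiceNo = extractBetween(sample, "Invoice No:", "\n");
```

In myFunction() you would pass filetext instead of sample. OCR output can contain unexpected whitespace or misread characters, so markers that work on one document may need adjusting for another.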