I am using the Google OCR API to read both images and PDF files. I can read and process image files, but for PDF files the Google OCR API documentation says the document has to be stored in Google Cloud first.
However, because of data confidentiality, I can't store my data in Google Cloud, and I want to upload the PDF from my local system to read its text. Is it possible to upload a PDF from local disk and process it, instead of first uploading the file to Google Cloud?
As you said, it's not possible to do that locally. I filed a Feature Request [1] on your behalf for you to follow updates there.
That said, there is a possible workaround that might satisfy your data confidentiality requirements. It consists of using the Cloud Storage client libraries [2] to both upload and delete those files:
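A minimal sketch of that upload-then-delete flow in Python, assuming the `google-cloud-storage` package and a bucket you already own. The helper names (`temporary_gcs_copy`, `ocr_local_pdf`), the `tmp-ocr` prefix, and the `run_ocr` callback are hypothetical and only there for illustration; `run_ocr` stands in for whatever OCR request you send with the `gs://` URI.

```python
import uuid
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def temporary_gcs_copy(bucket, local_path, prefix="tmp-ocr"):
    """Upload local_path to bucket, yield its gs:// URI, then delete the object.

    `bucket` is expected to behave like google.cloud.storage.Bucket.
    """
    # Random object name: avoids collisions and keeps the original
    # file name out of the bucket.
    object_name = f"{prefix}/{uuid.uuid4().hex}{Path(local_path).suffix}"
    blob = bucket.blob(object_name)
    blob.upload_from_filename(str(local_path))
    try:
        yield f"gs://{bucket.name}/{object_name}"
    finally:
        # Runs even if the OCR step raises, so the file never lingers.
        blob.delete()


def ocr_local_pdf(bucket_name, local_path, run_ocr):
    """Hypothetical wrapper: run_ocr(gcs_uri) is your own OCR call."""
    # Imported here so the helper above stays usable without the package.
    from google.cloud import storage

    bucket = storage.Client().bucket(bucket_name)
    with temporary_gcs_copy(bucket, local_path) as gcs_uri:
        return run_ocr(gcs_uri)
```

Because the delete sits in a `finally` block, the uploaded PDF is removed whether the OCR call succeeds or fails. If your OCR request also writes its results into the bucket, those output objects would need the same cleanup once you have read them.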
This should work as long as you don't mind having those files in buckets for a brief period of time.