I'm using the Google Drive API to store and retrieve PDF files, and I would like to query these files using the search parameters.
Before I start implementing this, I would like to know how Google handles the indexing of large PDF files (600+ pages, 25 MB+). I'm asking specifically about text-based PDFs, which don't need OCR.
I've tried some searches on the Drive website and they don't always work.
I would like to know whether there are any limitations and, if so, what they are.
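For context, here is a minimal sketch of the kind of query I have in mind, assuming the Drive API v3 search syntax (`fullText contains`, `mimeType`, `trashed`). The helper names are hypothetical and only build the query string; they don't call the API.

```python
def escape_drive_query_value(value: str) -> str:
    """Escape backslashes and single quotes for use inside a Drive query literal."""
    return value.replace("\\", "\\\\").replace("'", "\\'")


def build_pdf_fulltext_query(phrase: str) -> str:
    """Build a query matching non-trashed PDFs whose indexed text contains `phrase`.

    Hypothetical helper: it only assembles the string passed as the `q` parameter.
    """
    escaped = escape_drive_query_value(phrase)
    return (
        "mimeType = 'application/pdf' and trashed = false "
        f"and fullText contains '{escaped}'"
    )


if __name__ == "__main__":
    print(build_pdf_fulltext_query("annual report"))
```

With the Python client this string would be passed as `q`, for example `service.files().list(q=query, fields="files(id, name)").execute()`, where `service` is an already authorized Drive v3 client.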
According to this page for PDFs with OCR:
The maximum size for images (.jpg, .gif, .png) and PDF files (.pdf) is 2 MB. For PDF files, we only look at the first 10 pages when searching for text to extract.
And this page for PDFs with text:
You can search for text in PDF and image files by:
In theory you should be able to search the first 100 pages of any text documents or text-based PDFs that you've uploaded. You'll also be able to search for text found on the first ten pages of any image PDFs on your Drive.
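One way to check the quoted 100-page figure empirically would be to pick a distinctive phrase from several pages of a known PDF and see which of them the search actually finds. The sketch below assumes a `search` callable (hypothetical, supplied by the caller) that takes a phrase and returns the set of matching file IDs, for example a thin wrapper around a `fullText contains` query.

```python
from typing import Callable, Dict, List, Set


def find_indexed_pages(
    file_id: str,
    page_phrases: Dict[int, str],
    search: Callable[[str], Set[str]],
) -> List[int]:
    """Return the page numbers whose sample phrase leads back to `file_id`.

    `page_phrases` maps a page number to a phrase that occurs only on that page.
    `search` is a hypothetical callable returning the file IDs matching a phrase.
    """
    return sorted(
        page
        for page, phrase in page_phrases.items()
        if file_id in search(phrase)
    )


if __name__ == "__main__":
    # Stand-in for a real search: pretends only the first 100 pages are indexed.
    samples = {1: "alpha-001", 100: "alpha-100", 101: "alpha-101", 600: "alpha-600"}
    indexed = {phrase for page, phrase in samples.items() if page <= 100}

    def fake_search(phrase: str) -> Set[str]:
        return {"file-123"} if phrase in indexed else set()

    print(find_indexed_pages("file-123", samples, fake_search))
```

If the highest page returned sits consistently around 100 across several large files, that would support the documented limit. Indexing after upload may take some time, so an empty result shortly after uploading is not conclusive.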