
Reading PDF file using OpenCV

Tags:

c++

opencv

pdf

Is it possible to convert a PDF file to a cv::Mat? I know that a PDF file is generally a collection of vector objects, but given a required resolution, is there any tool that can do such a conversion?

Mercury asked Jan 21 '13 09:01




1 Answer

OpenCV doesn't support the PDF format at all, so you need to render each PDF page to an image with another library and then load that image into a cv::Mat. See this discussion: Open source PDF library for C/C++ application?

This question is also similar to yours: What C++ library can I use to convert a PDF to an image on windows?
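Whichever rendering library you pick, two small steps sit between its output and a cv::Mat: working out the pixel size for the resolution you want, and getting the raster into a layout OpenCV expects. Below is a minimal sketch of both using only the standard library. It assumes the renderer hands back a 4-byte-per-pixel buffer in B, G, R, A byte order with possibly padded rows (a common output layout, but check your library's documentation); `pointsToPixels` and `bgraToPackedBgr` are hypothetical helper names, not part of OpenCV or any PDF library.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// PDF page sizes are expressed in points (1 pt = 1/72 inch).
// Convert a length in points to pixels at the requested resolution (DPI).
inline int pointsToPixels(double points, double dpi) {
    return static_cast<int>(std::lround(points * dpi / 72.0));
}

// Hypothetical helper: copy a BGRA raster whose rows may be padded
// (strideBytes >= width * 4) into tightly packed 3-byte BGR, which is
// the memory layout of a continuous CV_8UC3 cv::Mat.
// ASSUMPTION: the source byte order is B, G, R, A. The alpha byte is dropped.
inline std::vector<std::uint8_t> bgraToPackedBgr(const std::uint8_t* src,
                                                 int width, int height,
                                                 std::size_t strideBytes) {
    assert(src != nullptr && width > 0 && height > 0);
    assert(strideBytes >= static_cast<std::size_t>(width) * 4);

    std::vector<std::uint8_t> dst(static_cast<std::size_t>(width) * height * 3);
    for (int y = 0; y < height; ++y) {
        // Row starts are computed from the stride, not from width * 4,
        // so any padding at the end of a source row is skipped.
        const std::uint8_t* row = src + static_cast<std::size_t>(y) * strideBytes;
        std::uint8_t* out = dst.data() + static_cast<std::size_t>(y) * width * 3;
        for (int x = 0; x < width; ++x) {
            out[3 * x + 0] = row[4 * x + 0];  // B
            out[3 * x + 1] = row[4 * x + 1];  // G
            out[3 * x + 2] = row[4 * x + 2];  // R
        }
    }
    return dst;
}
```

For example, a US Letter page (612 x 792 pt) rendered at 300 DPI comes out at 2550 x 3300 pixels. If OpenCV is available you can skip the manual copy: wrap the renderer's buffer with the cv::Mat constructor that takes an external data pointer and a row step (type CV_8UC4), then call clone() or cv::cvtColor with cv::COLOR_BGRA2BGR so the Mat owns its own memory before the renderer's buffer is freed.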

ArtemStorozhuk answered Sep 26 '22 17:09
