
How to read a .pdf file programmatically and convert it into audio (.mp3 format)?

I want to parse a PDF file from my C# app and create an audio file from it. How would I do that?

I'm particularly looking for a good PDF-to-text library or a way to extract the text from a PDF file.

Attilah asked Jun 06 '09 13:06





3 Answers

Ideally, your input is a tagged PDF document, that is, one that contains tags marking up the logical structure of the document (a typical PDF document contains only visual information).

This PDF could then be converted into DAISY format, which is a standard for digital talking books, i.e. an intermediate XML format storing the text of books along with the logical structure and navigation features.

The DAISY XML can then either be converted to an audio format, or played on a DAISY reader, a physical device similar to an MP3 player, to listen to the book.

There is a presentation available on the DAISY website explaining the principles of this toolchain:

Accessible PDF to DAISY/NIMAS Conversion

Dirk Vollmar answered Oct 28 '22 01:10



Use Festival for the text-to-speech step. Various PDF-to-text APIs exist...

dicroce answered Oct 28 '22 01:10



You need the Speech SDK from Microsoft. Read the instructions here
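
As a rough illustration of the speech step, here is a minimal sketch. It assumes the managed System.Speech.Synthesis namespace from the .NET Framework (Windows only, requires a reference to System.Speech.dll), which may not be the exact SDK the link above pointed to. It also assumes the PDF's text has already been extracted to a plain-text file by a PDF-to-text library of your choice; the file name book.txt is hypothetical. The synthesizer writes WAV, not MP3, so producing an .mp3 would need a separate encoding step that is not shown here.

```csharp
using System;
using System.IO;
using System.Speech.Synthesis; // add a reference to System.Speech.dll

class TextToWave
{
    static void Main(string[] args)
    {
        // Assumption: the text was already pulled out of the PDF and saved
        // as plain text. "book.txt" is a hypothetical default file name.
        string textPath = args.Length > 0 ? args[0] : "book.txt";
        string wavePath = Path.ChangeExtension(textPath, ".wav");

        string text = File.ReadAllText(textPath);

        using (var synth = new SpeechSynthesizer())
        {
            // Redirect the synthesizer's output from the speakers to a WAV file.
            synth.SetOutputToWaveFile(wavePath);

            // Speak() blocks until the whole text has been rendered.
            synth.Speak(text);

            // Detach from the file so it is flushed and closed.
            synth.SetOutputToNull();
        }

        Console.WriteLine("Wrote " + wavePath);
    }
}
```

The resulting WAV file could then be passed to any MP3 encoder to get the format asked for in the question.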

jao answered Oct 28 '22 02:10
