I need to convert a .pdf file to a .txt file.
How can I do this in C#?
I've had the need myself and I used this article to get me started: http://www.codeproject.com/KB/string/pdf2text.aspx
Ghostscript can do what you need. Below is a command that extracts the text from a PDF file into a text file (run it from a command line first to check that it works for you):
gswin32c.exe -q -dNODISPLAY -dSAFER -dDELAYBIND -dWRITESYSTEMDICT -dSIMPLE -c save -f ps2ascii.ps "test.pdf" -c quit >"test.txt"
For details on how to use Ghostscript with C#, see the CodeProject article "Convert PDF to Image Using Ghostscript API".
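If you only need the text, one option is to launch the Ghostscript command above from C# with System.Diagnostics.Process and write its standard output to a file yourself, since the `>` redirect is a shell feature. This is a minimal sketch, not a tested implementation: the class name PdfToText and its methods are made up for illustration, and it assumes gswin32c.exe is on the PATH and that Ghostscript can find ps2ascii.ps (it normally ships in Ghostscript's lib folder).

```csharp
using System;
using System.Diagnostics;
using System.IO;

// Hypothetical helper class; the names here are illustrative only.
static class PdfToText
{
    // Builds the same arguments as the command line above, minus the shell redirect.
    public static string BuildArguments(string pdfPath)
    {
        return "-q -dNODISPLAY -dSAFER -dDELAYBIND -dWRITESYSTEMDICT -dSIMPLE " +
               "-c save -f ps2ascii.ps \"" + pdfPath + "\" -c quit";
    }

    // Runs Ghostscript and saves whatever it prints to stdout into txtPath.
    // Assumption: gsExe resolves via PATH; pass a full path otherwise.
    public static void Convert(string pdfPath, string txtPath, string gsExe = "gswin32c.exe")
    {
        var psi = new ProcessStartInfo
        {
            FileName = gsExe,
            Arguments = BuildArguments(pdfPath),
            UseShellExecute = false,        // required for stream redirection
            RedirectStandardOutput = true,  // replaces the ">" redirect
            CreateNoWindow = true
        };

        using (var process = Process.Start(psi))
        {
            // Read all output before waiting, so a full pipe cannot block the child.
            string text = process.StandardOutput.ReadToEnd();
            process.WaitForExit();

            if (process.ExitCode != 0)
                throw new InvalidOperationException(
                    "Ghostscript exited with code " + process.ExitCode);

            File.WriteAllText(txtPath, text);
        }
    }
}
```

Usage would be something like PdfToText.Convert("test.pdf", "test.txt"). Extraction quality depends on the PDF itself; scanned pages with no text layer will produce little or no output.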