Is there any way to extract the content of a PDF from bash? (I have a big folder of academic papers, which sadly have names like "1010.3423.pdf". I'd like to write a bash script to name them more sensibly, which involves, say, googling the first few lines.)
There is pdftotext, which can help you get the title and authors from the pdf file. You can then use this to google, or generate a filename yourself.
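A rough sketch of how that could look in bash. It assumes pdftotext (from poppler-utils or xpdf) is installed, and it makes the optimistic assumption that the first non-blank line of page 1 is the title, which often fails for papers with a journal header or an arXiv stamp. The function names slugify and rename_pdfs are made up for this example.

```shell
#!/usr/bin/env bash

# Turn a title into a safe filename stem: lowercase, runs of
# non-alphanumerics collapsed to one underscore, capped at 80 chars.
slugify() {
  printf '%s' "$1" \
    | LC_ALL=C tr '[:upper:]' '[:lower:]' \
    | LC_ALL=C tr -cs 'a-z0-9' '_' \
    | sed 's/^_//; s/_$//' \
    | cut -c1-80
}

# Print a proposed rename for every PDF in the given directory.
# This is a dry run: it only echoes the mv commands.
rename_pdfs() {
  local dir=$1 pdf title name
  for pdf in "$dir"/*.pdf; do
    [ -e "$pdf" ] || continue   # glob matched nothing
    # -f 1 -l 1 limits extraction to page 1; "-" writes to stdout.
    # Assumption: the first non-blank line is the title.
    title=$(pdftotext -f 1 -l 1 "$pdf" - 2>/dev/null | awk 'NF { print; exit }')
    name=$(slugify "$title")
    [ -n "$name" ] || continue  # nothing usable extracted (e.g. scanned PDF)
    echo mv -n -- "$pdf" "$dir/$name.pdf"
  done
}
```

Calling `rename_pdfs ~/papers` prints the proposed mv commands so you can check them first; drop the `echo` once the names look right. Scanned PDFs with no text layer will yield nothing and are skipped.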
Try pdftotext to extract the text: http://en.wikipedia.org/wiki/Pdftotext