
Is there any way to extract content of a pdf from bash?

Tags:

bash

Is there any way to extract the content of a PDF from bash? I have a big folder of academic papers, which sadly have names like "1010.3423.pdf". I'd like to write a bash script to name them more sensibly, which involves, say, googling the first few lines.

MSmth asked Dec 10 '12 04:12


2 Answers

There is pdftotext (shipped with Xpdf and with Poppler's utilities), which converts a PDF to plain text. The first few lines of the output usually contain the title and authors, so you can use them to google, or to generate a filename yourself.
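As a rough sketch of how that could look in a script (not a tested recipe for every paper layout): the helpers below are hypothetical names made up for illustration, and the sketch assumes pdftotext is on your PATH and that the first non-blank line of page 1 is a usable title, which will not hold for every PDF. It only prints the mv commands, so you can inspect them before renaming anything.

```shell
#!/usr/bin/env bash
# Sketch: propose new names for PDFs based on the first line of their text.
# All function names here are hypothetical helpers, not part of pdftotext.

# slugify: turn a line of text into a filename-safe, lowercase slug.
slugify() {
    printf '%s' "$1" \
        | tr '[:upper:]' '[:lower:]' \
        | LC_ALL=C tr -cs 'a-z0-9' '_' \
        | cut -c1-80 \
        | sed -e 's/^_*//' -e 's/_*$//'
}

# first_line: print the first line of page 1 that contains a letter or digit.
first_line() {
    # -l 1 stops after page 1; "-" sends the extracted text to stdout
    pdftotext -l 1 "$1" - 2>/dev/null | grep -m 1 '[[:alnum:]]'
}

# rename_pdfs: dry run over every *.pdf in a directory (default: current one).
rename_pdfs() {
    local dir=${1:-.} pdf title slug
    for pdf in "$dir"/*.pdf; do
        [ -e "$pdf" ] || continue        # glob matched nothing
        title=$(first_line "$pdf")
        slug=$(slugify "$title")
        [ -n "$slug" ] || continue       # no text extracted (e.g. scanned PDF)
        # Only print the command; drop "echo" once the names look right.
        echo mv -n -- "$pdf" "$dir/$slug.pdf"
    done
}
```

Source the file and run something like `rename_pdfs ~/papers` to see the proposed renames. The `mv -n` is there so that two papers ending up with the same slug do not overwrite each other.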

perreal answered Sep 24 '22 15:09


Try pdftotext to extract the text: http://en.wikipedia.org/wiki/Pdftotext

Dyno Fu answered Sep 23 '22 15:09
