Correct word-count of a LaTeX document

I'm currently searching for an application or a script that does a correct word count for a LaTeX document.

So far I have only found scripts that work on a single file. What I want is a script that safely ignores LaTeX commands and also traverses linked files, i.e. follows \include and \input links, to produce a correct word count for the whole document.

In vim I currently use ggVG followed by g CTRL+G, but that only shows the count for the current file and does not ignore LaTeX commands.

Does anyone know of any script (or application) that can do this job?

Andreas Grech asked Jun 04 '10 14:06


2 Answers

I use texcount. The webpage has a Perl script to download (and a manual).

It can follow files that are pulled into the document with \input or \include (see the -inc option), supports macros, and has many other nice features.

When following included files, you get details for each separate file as well as a total. For example, here is the total output for a 12-page document of mine:

TOTAL COUNT
Files: 20
Words in text: 4188
Words in headers: 26
Words in float captions: 404
Number of headers: 12
Number of floats: 7
Number of math inlines: 85
Number of math displayed: 19

If you're only interested in the total, use the -total argument.
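
To make concrete what "following \input and \include" involves, here is a rough Python sketch. It is not how texcount works internally, only a naive illustration: the function names (`expand`, `count_words`) and the short list of commands whose arguments are skipped are my own assumptions, and math, macros, and most LaTeX subtleties are not handled. It also assumes included paths are relative to the main file's directory and that files are UTF-8.

```python
import re
from pathlib import Path

COMMENT_RE = re.compile(r"(?<!\\)%.*")                      # unescaped % to end of line
INCLUDE_RE = re.compile(r"\\(?:input|include)\{([^}]+)\}")  # \input{f} or \include{f}
BODY_RE = re.compile(r"\\begin\{document\}(.*?)\\end\{document\}", re.DOTALL)
# Commands whose braced argument is not prose (illustrative list, far from complete).
NON_PROSE_RE = re.compile(
    r"\\(?:begin|end|label|ref|cite|includegraphics)\*?(?:\[[^\]]*\])?\{[^}]*\}"
)
COMMAND_RE = re.compile(r"\\[A-Za-z@]+\*?")                 # any remaining command name
WORD_RE = re.compile(r"[^\W_]+(?:['’-][^\W_]+)*")           # letters/digits, inner ' or -


def expand(name, root, stack=()):
    """Return the text of `name` with \\input/\\include targets spliced in."""
    path = root / name
    if path.suffix == "":
        path = path.with_suffix(".tex")      # \input{intro} means intro.tex
    resolved = path.resolve()
    if resolved in stack:                    # circular include: stop recursing
        return ""
    # Strip comments first so commented-out \input lines are not followed.
    text = COMMENT_RE.sub("", path.read_text(encoding="utf-8"))
    return INCLUDE_RE.sub(
        lambda m: expand(m.group(1).strip(), root, stack + (resolved,)), text
    )


def count_words(main_file):
    """Naive word count for a LaTeX document, following included files."""
    main = Path(main_file)
    text = expand(main.name, main.parent)
    body = BODY_RE.search(text)
    if body:                                 # ignore the preamble when there is one
        text = body.group(1)
    text = NON_PROSE_RE.sub(" ", text)       # drop \label{...}, \cite{...}, etc.
    text = COMMAND_RE.sub(" ", text)         # drop command names, keep their arguments
    return len(WORD_RE.findall(text))
```

Calling `count_words("main.tex")` returns a single integer for the whole document. Expect it to disagree with texcount, which separates headers, captions, and math; for anything beyond a rough check, texcount itself is the better tool.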

Geoff answered Oct 02 '22 20:10


I went with icio's comment and did a word-count on the pdf itself by piping the output of pdftotext to wc:

pdftotext file.pdf - | wc -w
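
If you want the same count from inside a script, something like the following Python sketch should work. It assumes pdftotext is installed and on the PATH and that its output is UTF-8; the function names are hypothetical. Keep in mind that counting from the PDF includes everything that gets rendered, such as captions, page numbers, and the bibliography.

```python
import shutil
import subprocess


def count_whitespace_words(text):
    """Count whitespace-separated tokens, roughly what `wc -w` reports."""
    return len(text.split())


def pdf_word_count(pdf_path):
    """Run `pdftotext <pdf> -` and count the words it writes to stdout."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],   # "-" sends the extracted text to stdout
        capture_output=True,
        encoding="utf-8",
        errors="replace",
        check=True,
    )
    return count_whitespace_words(result.stdout)
```

For example, `pdf_word_count("file.pdf")` would return the number that the shell pipeline prints.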
Andreas Grech answered Oct 02 '22 22:10