
how to extract pdf index/table-of-contents with poppler?

I see that PDF viewers like Okular and Evince can display the index (table of contents) of a PDF document such as a book very well, with a link to every paragraph. How do they do it? They use the poppler library, so how can I extract that index with poppler, or in general?

P5music Avatar asked Jan 20 '26 17:01



1 Answer

The code below only prints the first level of the table of contents; recursion is needed to go deeper.

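// toc() returns a QDomDocument owned by the caller, or null if the PDF has no outline
QDomDocument *toc = document->toc();
if (toc) {
    QDomElement docElem = toc->documentElement();

    QDomNode n = docElem.firstChild();
    while (!n.isNull()) {
        QDomElement e = n.toElement(); // try to convert the node to an element
        if (!e.isNull()) {
            // the node really is an element; its tag name is the entry's title
            qDebug("elem %s", qPrintable(e.tagName()));
        }
        n = n.nextSibling();
    }
    delete toc; // release the document once the walk is done
}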
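To illustrate the recursion mentioned above, here is a minimal sketch that uses only the standard library. `OutlineNode`, `TocEntry`, `collectEntries` and `flattenOutline` are hypothetical names made up for this example and are not part of poppler or Qt; the struct simply stands in for a DOM element with a title and nested children. The same shape should carry over to the `QDomNode` loop above by calling the function again on `n.firstChild()` for each element, but that part is an assumption and is not shown here.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one outline element: a title plus nested children.
struct OutlineNode {
    std::string title;
    std::vector<OutlineNode> children;
};

// One flattened table-of-contents entry: (nesting depth, title).
using TocEntry = std::pair<int, std::string>;

// Depth-first walk: record each node, then recurse into its children
// with the depth increased by one.
void collectEntries(const std::vector<OutlineNode>& nodes, int depth,
                    std::vector<TocEntry>& out) {
    for (const OutlineNode& node : nodes) {
        out.emplace_back(depth, node.title);
        collectEntries(node.children, depth + 1, out);
    }
}

// Convenience wrapper: flatten a whole outline starting at depth 0.
std::vector<TocEntry> flattenOutline(const std::vector<OutlineNode>& topLevel) {
    std::vector<TocEntry> out;
    collectEntries(topLevel, 0, out);
    return out;
}
```

Calling `flattenOutline` on the top-level entries gives a list in reading order, where the depth value can be used to indent each title when printing.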
P5music Avatar answered Jan 23 '26 20:01



