
Extracting vector graphics from pdf with Inkscape [closed]

I'd like to extract some images from a PDF paper for presentation purposes. On Windows, Adobe Illustrator works just fine, but I now have to do this on a Debian box.

Two popular solutions I found online are using

  • pdfimages
  • Inkscape

pdfimages does not meet my needs, since I want vector graphics (PDF) rather than JPEGs, so I would prefer to use Inkscape, but it does not work as expected. I had hoped to use a selection tool to drag a box and select everything inside it, as I normally do in Illustrator, but none of the tools in Inkscape does that.

If I use the "select and transform objects" tool (the black arrow), the whole PDF page is selected while I only want a small portion; if I use the "edit path by nodes" tool (the black triangle arrow with some nodes), I can only select a single object at a time. Drag and drop (even with the Shift key pressed) does not work.

I'm wondering if there's a way around this, or whether there is a better tool on Debian to achieve the same thing. Thanks.

Yang asked Aug 23 '12 04:08



2 Answers

Here is the approach I use to get vector images out of a PDF.

There is a tool called pdftocairo, included in poppler-utils:

  • http://poppler.freedesktop.org/

syntax:

pdftocairo [options] <PDF-file> [<output-file>]

pdftocairo can produce both raster and vector output. Among the vector formats, it can convert the content of a single PDF page (if you have a multi-page PDF, first split it into single-page PDFs, with pdftk for instance) into:

  • -ps : generate PostScript file
  • -eps : generate Encapsulated PostScript (EPS)
  • -svg : generate a Scalable Vector Graphics (SVG) file

The best output format for your needs is probably SVG. After converting the PDF page, open the SVG in any SVG editor (Inkscape, or the good old Sodipodi, for instance), select the vector elements you want to extract, and save them.
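
As a possible alternative to splitting the document first, pdftocairo also has -f and -l options for the first and last page to convert. The sketch below assumes that passing the same number to both selects a single page for SVG output in your poppler version. The function name pdf_page_to_svg and the default output name are hypothetical, chosen only for this example.

```shell
#!/usr/bin/env bash
# Sketch: convert one page of a PDF to SVG with pdftocairo (poppler-utils).
# pdf_page_to_svg is a hypothetical helper name, not part of poppler.
pdf_page_to_svg() {
    local pdf=$1 page=$2
    # Default output name (an assumption for this sketch): paper.pdf -> paper-p3.svg
    local out=${3:-"${pdf%.pdf}-p${page}.svg"}
    # -f and -l set the first and last page; equal values pick exactly one page.
    pdftocairo -svg -f "$page" -l "$page" "$pdf" "$out"
}
```

For example, `pdf_page_to_svg paper.pdf 3` would write paper-p3.svg, which can then be opened in Inkscape.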

SUMMARY:

if you have a MULTIPAGE PDF

  1. FIRST split the multi-page PDF into single pages (create a folder for these pages)

    pdftk file.pdf burst
    
  2. then use pdftocairo to convert each PDF page to SVG

    for f in *.pdf; do pdftocairo -svg "$f" "${f%.pdf}.svg"; done
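
The two steps can be wrapped in one small function. This is only a sketch under a few assumptions: the name pdf_to_svg_pages, the default folder name pages, and the pg_%04d.pdf naming pattern are my own choices, and both pdftk and pdftocairo are expected to be on the PATH.

```shell
#!/usr/bin/env bash
# Sketch: split a PDF into pages with pdftk, then convert each page to SVG.
# pdf_to_svg_pages is a hypothetical helper name used only for this example.
pdf_to_svg_pages() {
    local pdf=$1 outdir=${2:-pages}
    mkdir -p "$outdir" || return 1
    # burst writes one PDF per page; the output pattern keeps them in outdir.
    pdftk "$pdf" burst output "$outdir/pg_%04d.pdf" || return 1
    local page
    for page in "$outdir"/pg_*.pdf; do
        [ -e "$page" ] || continue   # skip if the glob matched nothing
        # Explicit output name: pg_0001.pdf -> pg_0001.svg
        pdftocairo -svg "$page" "${page%.pdf}.svg" || return 1
    done
}
```

Calling `pdf_to_svg_pages file.pdf pages` should leave one SVG next to each single-page PDF in the pages folder.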
    
Dingo answered Jan 03 '23 11:01



You can split multi-page PDF files using pdftk, then use Inkscape to convert each PDF to an SVG file from the command line, e.g.:

inkscape --without-gui --file=input.pdf --export-plain-svg=output.svg
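
The options above are the Inkscape 0.9x spelling. As far as I know, Inkscape 1.x takes the input file as a plain argument and uses --export-filename for the output, with --export-plain-svg as a bare flag. The wrapper below is a sketch that picks the syntax from the reported major version; the name inkscape_pdf_to_svg is hypothetical, and the version parsing assumes `inkscape --version` prints a line starting with "Inkscape <major>.<minor>".

```shell
#!/usr/bin/env bash
# Sketch: convert a PDF to plain SVG with whichever Inkscape CLI syntax applies.
# inkscape_pdf_to_svg is a hypothetical wrapper name used only for this example.
inkscape_pdf_to_svg() {
    local src=$1 dst=$2 major
    # Extract the major version number from e.g. "Inkscape 1.2.2 (...)".
    major=$(inkscape --version 2>/dev/null \
        | sed -n 's/^Inkscape \([0-9][0-9]*\)\..*/\1/p' | head -n 1)
    if [ "${major:-0}" -ge 1 ]; then
        # Assumed Inkscape 1.x syntax.
        inkscape "$src" --export-plain-svg --export-filename="$dst"
    else
        # Inkscape 0.9x syntax, as in the command above.
        inkscape --without-gui --file="$src" --export-plain-svg="$dst"
    fi
}
```

Usage would be `inkscape_pdf_to_svg input.pdf output.svg`.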

wang.aurora answered Jan 03 '23 12:01
