Adobe offers different solutions to process a PDF file, its Acrobat, Photoshop and Illustrator all help on PDF to SVG, but Illustrator is the best option to convert PDF to vector SVG format.
Best 5 PDF to SVG Converter on Windows/Mac/Online:Inkscape. Adobe Illustrator. Zamzar. Convertio.
Head over to ConvertPDF. Today, upload your PDF file and choose either PDF or SVG. Because SVG is a vector format, fonts may be substituted during conversion. To ensure the results match the PDF exactly, make sure to choose “Convert Characters to Curves” on the next page.
You can use Inkscape on the commandline only, without opening a GUI. Try this:
inkscape \
--without-gui \
--file=input.pdf \
--export-plain-svg=output.svg
For a complete list of all commandline options, run inkscape --help
.
Inkscape is used by many people on Wikipedia to convert PDF to SVG.
http://inkscape.org/
They even have a handy guide on how to do so!
http://en.wikipedia.org/wiki/Wikipedia:Graphic_Lab/Resources/PDF_conversion_to_SVG#Conversion_with_Inkscape
I am currently using PDFBox which has good support for graphic output. There is good support for extracting the vector strokes and also for managing fonts. There are some good tools for trying it out (e.g. PDFReader will display as Java Graphics2D). You can intercept the graphics tool with an SVG tool like Batik (I do this and it gives good capture).
There is no simple way to convert all PDF to SVG - it depends on the strategy and tools used to create the PDFs. Some text is converted to vectors and cannot be easily reconstructed - you have to install vector fonts and look them up.
UPDATE: I have now developed this into a package PDF2SVG which does not use Batik any more:
which has been tested on a range of PDFs. It produces SVG output consisting of
<svg:text>
per character<svg:path>
<svg:image>
Later packages will (hopefully) convert the characters to running text and the paths to higher-level graphics objects
UPDATE: We can now re-create running text from the SVG characters. We've also converted diagrams to domain-specific XML (e.g. chemical spectra). See https://bitbucket.org/petermr/svg2xml-dev. It's still in Alpha, but is moving at a useful speed. Anyone can join in!
UPDATE. (@Tim Kelty) We are continuing to work on PDF2SVG and also downstream tools that do (limited) Java OCR and creation of higher-level graphics primitives (arrows, boxes, etc.) See https://bitbucket.org/petermr/imageanalysis https://bitbucket.org/petermr/diagramanalyzer https://bitbucket.org/petermr/norma and https://bitbucket.org/petermr/ami-core . This is a funded project to capture 100 million facts from the scientific literature (contentmine.org) much of which is PDF.
This topic is quite old, but here is a handy solution that I found:
http://www.cityinthesky.co.uk/opensource/pdf2svg/
It offers a tool, pdf2png, which once installed does exactly the job in command line. I've tested it with irreproachable results so far, including with bitmaps.
EDIT : My mistake, this tool also converts letters to paths, so it does not address the initial question. However it does a good job anyway, and can be useful to anyone who does not intend to modify the code in the svg file, so I'll leave the post.
Here is the process that I ended up using. The main tool I used was Inkscape which was able to convert text alright.
Using Adobe Acrobat Pro Actions (formerly Batch Processing) create a custom action to separate PDF pages into separate files. Alternatively you may be able to split up PDFs with GhostScript
/* Extract Pages to Folder */
var re = /.*\/|\.pdf$/ig;
var filename = this.path.replace(re,"");
{
for ( var i = 0; i < this.numPages; i++ )
this.extractPages
({
nStart: i,
nEnd: i,
cPath : filename + "_s" + ("000000" + (i+1)).slice (-3) + ".pdf"
});
};
Using Windows Cmd created batch file to loop through all PDF files in a folder and convert them to SVG
:: ===== SETUP =====
@echo off
CLS
echo Starting SVG conversion...
echo.
:: setup working directory (if different)
REM set "_work_dir=%~dp0"
set "_work_dir=%CD%"
:: setup counter
set "count=1"
:: setup file search and save string
set "_work_x1=pdf"
set "_work_x2=svg"
set "_work_file_str=*.%_work_x1%"
:: setup inkscape commands
set "_inkscape_path=D:\InkscapePortable\App\Inkscape\"
set "_inkscape_cmd=%_inkscape_path%inkscape.exe"
:: ===== FIND FILES IN WORKING DIRECTORY =====
:: Output from DIR last element is single carriage return character.
:: Carriage return characters are directly removed after percent expansion,
:: but not with delayed expansion.
pushd "%_work_dir%"
FOR /f "tokens=*" %%A IN ('DIR /A:-D /O:N /B %_work_file_str%') DO (
CALL :subroutine "%%A"
)
popd
:: ===== CONVERT PDF TO SVG WITH INKSCAPE =====
:subroutine
echo.
IF NOT [%1]==[] (
echo %count%:%1
set /A count+=1
start "" /D "%_work_dir%" /W "%_inkscape_cmd%" --without-gui --file="%~n1.%_work_x1%" --export-dpi=300 --export-plain-svg="%~n1.%_work_x2%"
) ELSE (
echo End of output
)
echo.
GOTO :eof
:: ===== INKSCAPE REFERENCE =====
:: print inkscape help
REM "%_inkscape_cmd%" --help > "%~dp0\inkscape_help.txt"
REM "%_inkscape_cmd%" --verb-list > "%~dp0\inkscape_verb_list.txt"
I realize it is not best practice to manually brute force edit SVG or XML tags or attributes due to potential variations and should use an XML parser instead. However I had a simple issue where the stroke width on one drawing was very small, and on another the font family was being incorrectly identified, so I basically modified the previous Windows Cmd batch script to do a simple find and replace. The only changes were to the search string definitions and changing to call a PowerShell command. The PowerShell command will perform a find and replace and save the modified file with an added suffix. I did find some other references that could be better used to parse or modify the resultant SVG files if some other minor cleanup is needed to be performed.
:: setup file search and save string
set "_work_x1=svg"
set "_work_x2=svg"
set "_work_s2=_mod"
set "_work_file_str=*.%_work_x1%"
powershell -Command "(Get-Content '%~n1.%_work_x1%') | ForEach-Object {$_ -replace 'stroke-width:0.06', 'stroke-width:1'} | ForEach-Object {$_ -replace 'font-family:Times Roman','font-family:Times New Roman'} | Set-Content '%~n1%_work_s2%.%_work_x2%'"
Hope this might help someone
If DVI to SVG is an option, you can also use dvisvgm to convert a DVI file to an SVG file. This works perfectly for instance for LaTeX formulas (with option --no-fonts
):
dvisvgm --no-fonts input.dvi -o output.svg
There is also pdf2svg which uses poppler and Cairo to convert a pdf into SVG. When I tried this, the SVG was perfectly rendered in inkscape
.
Bash script to convert each page of a PDF into its own SVG file.
#!/bin/bash
#
# Make one PDF per page using PDF toolkit.
# Convert this PDF to SVG using inkscape
#
inputPdf=$1
pageCnt=$(pdftk $inputPdf dump_data | grep NumberOfPages | cut -d " " -f 2)
for i in $(seq 1 $pageCnt); do
echo "converting page $i..."
pdftk ${inputPdf} cat $i output ${inputPdf%%.*}_${i}.pdf
inkscape --without-gui "--file=${inputPdf%%.*}_${i}.pdf" "--export-plain-svg=${inputPdf%%.*}_${i}.svg"
done
To generate in png, use --export-png
, etc...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With