Converting HTML to text with Perl

Tags:

perl

cpan

I have a bunch of HTML files that I need to convert to formatted text with Perl, i.e. something like <br/> should be interpreted as \n.

I found the module HTML::FormatText on CPAN. It formats the text well, but it strips links. Is there an option in HTML::FormatText to format the HTML as text and still keep the URL when there is a link like this:

<a href="http://www.microsoft.com">http://www.microsoft.com</a>

i.e. something like this:

<br /><b>Microsoft</b><br /><a href="http://www.microsoft.com">

will be converted to:

Microsoft
http://www.microsoft.com
smith asked Jan 11 '12 19:01


1 Answer

Take a look at HTML::FormatText::WithLinks.

Setting the after_link option to, say, " (%l)" will put the link in line after the anchor text. In your example you would get Microsoft (http://www.microsoft.com).
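A minimal sketch of that suggestion, assuming HTML::FormatText::WithLinks is installed from CPAN. The sample HTML and the variable names are made up for illustration. As far as I recall from the module's documentation, before_link defaults to a "[%n]" marker and footnote to a numbered list of links, so both are set to empty strings here to leave only the inline URL:

```perl
use strict;
use warnings;
use HTML::FormatText::WithLinks;   # CPAN module, not part of core Perl

# Sample input modeled on the question; the anchor text "Microsoft" is an
# assumption made for illustration.
my $html = '<p><a href="http://www.microsoft.com">Microsoft</a></p>';

my $formatter = HTML::FormatText::WithLinks->new(
    before_link => '',        # no numbered marker before the anchor text
    after_link  => ' (%l)',   # %l is replaced by the link URL
    footnote    => '',        # no list of links at the end of the text
);

my $text = $formatter->parse($html);
die "could not parse HTML: " . $formatter->error . "\n" unless defined $text;

print $text;
```

For a directory of files, the same formatter object can be reused in a loop. The module also documents a parse_file method that takes a file name instead of an HTML string, which may be more convenient there.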

Borodin answered Sep 19 '22 23:09