Extract information from a large XML file

Tags: xml

I need to extract some URLs from a large XML file.

The XML file has the following structure:

<Main>
 <Product>
  <Images>
   <URL>image1.jpg</URL>
   <URL>image2.jpg</URL>
   <URL>image3.jpg</URL>
   <URL>image4.jpg</URL>
  </Images>
 </Product>

......

I need to extract all the links into a text file. Any idea how to do this?

Asked by CARASS, Feb 18 '26 02:02


1 Answer

If you have Perl installed (or can install it), you can use xml_grep, which ships with XML::Twig (available in ActiveState Perl, in Strawberry Perl, and of course on CentOS).

xml_grep --text_only URL product_file.xml > url.txt

It can deal with very large files, since it works in stream mode.
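
If Perl is not an option, the same streaming idea can be sketched in Python with the standard library's xml.etree.ElementTree.iterparse. This is only a rough equivalent, not what xml_grep does internally. The function name extract_urls is hypothetical, and the tag name URL is taken from the sample in the question.

```python
import xml.etree.ElementTree as ET


def extract_urls(xml_path, out_path, tag="URL"):
    """Write the text of every <tag> element to out_path, one per line.

    Returns the number of lines written. The default tag "URL" is an
    assumption based on the sample structure in the question.
    """
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        # iterparse yields each element once its closing tag has been read,
        # so the file is processed incrementally instead of loaded whole.
        for _event, elem in ET.iterparse(xml_path, events=("end",)):
            if elem.tag == tag and elem.text and elem.text.strip():
                out.write(elem.text.strip() + "\n")
                count += 1
            # Drop the element's text and children once it is finished.
            # Emptied elements stay attached to their parent, so memory
            # still grows slightly with the number of top-level records.
            elem.clear()
    return count
```

Calling extract_urls("product_file.xml", "url.txt") would produce the same kind of url.txt as the xml_grep command above, assuming the file is well-formed XML without namespaces.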

Answered by mirod, Feb 20 '26 19:02


