Extract specific content from an HTML file and print it to a text file in Perl

I have an HTML file that contains paper IDs and paper titles, and I want to print each ID with its title, in order. Here are the HTML file and an example of the output.

<META content="MSHTML 6.00.2900.2802" name=GENERATOR></HEAD>
<BODY bgColor=#ffffff leftMargin=0 topMargin=0 rightMargin=0 marginwidth="0" 
marginheight="0">
<DIV class=conf><A class=confName 
href="http://ehw.jpl.nasa.gov/events/ahs2006/">1st Conference on Adaptive 
Hardware and Systems (AHS-2006)</A></DIV>
<DIV class=menuoc>OpenConf Conference Management System</DIV>
<DIV class=menu>
<DIV class=menuitem><A 
href="http://www.eng.bahcesehir.edu.tr/openconf/chair/">Chair Home</A></DIV>
<DIV class=menuitem><A 
href="http://www.eng.bahcesehir.edu.tr/openconf/chair/signout.php">Sign 
Out</A></DIV>
<DIV class=menufiller>Logged in as: ahs2006&nbsp;</DIV></DIV>
<DIV class=mainbody><BR>
<P class=header>Assign Reviews</P>
<FORM action=/openconf/chair/assign_reviews.php method=post>
<DL>
<DT>
<P><B>Select Paper(s):</B></P>
<DD>
<P>[ Paper ID - Title (# Reviewers) ]</P>
<DD><SELECT multiple size=10 name=papers[]> <OPTION value=2>&nbsp;&nbsp;2 - 
Switchable Glass: A possible medium for Evolvable Hardware (4)</OPTION> 
<OPTION value=3>&nbsp;&nbsp;3 - An Efficient Multi-Objective Evolutionary 
Algorithm for Combinational Circuit Design (3)</OPTION> <OPTION 
value=4>&nbsp;&nbsp;4 - A Background Mismatch Calibration for Capacitive 
Digital-to-Analog Converters (3)</OPTION> <OPTION value=5>&nbsp;&nbsp;5 - 
Designing Electronic Circuits by Means of Gene Expression Programming 
(3)</OPTION> <OPTION value=6>&nbsp;&nbsp;6 - Coherence Based Fault Detection 
And Error Correction (3)</OPTION> <OPTION value=7>&nbsp;&nbsp;7 - Wormhole 
Routing with Virtual Channels using Dynamic Rate Control for Network-on... 
(2)</OPTION> <OPTION value=8>&nbsp;&nbsp;8 - Noise Analysis of Phase Locked 
Loops (3)</OPTION> <OPTION value=9>&nbsp;&nbsp;9 - Design and Analysis of a 
Second Order Phase Locked Loops (PLLs) (2)</OPTION> <OPTION value=10>10 - 
SW-HW Co-design and fault tolerant implementation for the LRID Wireless 
communication... (3)</OPTION> <OPTION value=11>11 - Adaptive PID Controller 
Using Parameter Optimization Algorithm (2)</OPTION> <OPTION value=12>12 - A 
Novel Self-organizing Hybrid Network Protocol (2)</OPTION> <OPTION 
value=13>13 - An Adaptive FPGA-Based Mechatronic Control System Supporting 
Partial Reconfiguration... (3)</OPTION> <OPTION value=14>14 - Generalized 
Disjunction Decomposition for the Evolution of Programmable Logic Array... 
(3)</OPTION> <OPTION value=15>15 - Woofer-Tweeter Adaptive Optics System 
(1)</OPTION> <OPTION value=16>16 - A Re-Programmable Platform for Dynamic 
Burn-in Test of Xilinx VirtexII 3000 FPGA... (3)</OPTION> <OPTION 
value=17>17 - Using hardware-based particle swarm method for dynamic 
optimization of adaptive ... (2)</OPTION> <OPTION value=18>18 - 
Hardware/software coevolution of genome programs and cellular processors 
(2)</OPTION> <OPTION value=19>19 - Systolic Array Based Adaptive Beamformer 
Modelling in SystemC Environment (2)</OPTION> <OPTION value=20>20 - A 
Reconfigurable Hardware Design Using FPGA (2)</OPTION> <OPTION value=21>21 - 
An FPGA Implemented Processor Architecture with Adaptive Resolution 
(2)</OPTION> <OPTION value=22>22 - Evolving Hardware with 
Self-reconfigurable connectivity in Xilinx FPGAs (2)</OPTION> <OPTION 
value=23>23 - Particle Swarm Optimization with Discrete Recombination: An 
Online Optimizer for... (2)</OPTION> <OPTION value=24>24 - Towards the 
Integration of Drive Control Loop Electronics of the JPL/Boeing Gyroscope... 
(2)</OPTION> <OPTION value=25>25 - An Incremental Evolutionary Strategy for 
the Design of FIR Filters Targeting Real... (2)</OPTION> <OPTION value=26>26 
- Adaptive Micro-Antenna on Silicon Substrate (3)</OPTION> <OPTION 
value=27>27 - Towards Fluent Sensor Networks: A Scalable and Robust 
Self-Deployment Approach (3)</OPTION> <OPTION value=28>28 - Comparison of 
Fuzzy-C Means, Hard C-Means and Differential Evolution Algorithm in... 
(2)</OPTION> <OPTION value=29>29 - Evolutionary Design of Digital Circuits: 
Where Are Current Limits? (2)</OPTION> <OPTION value=30>30 - GEZGİN &amp; 
GEZGİN-2: Adaptive Real-Time Image Processing Subsystems for Earth 
Observing... (3)</OPTION> <OPTION value=31>31 - A Multi-objective Genetic 
Algorithm for On-chip Real-time Adaptation of a Multi-... (2)</OPTION> 
<OPTION value=32>32 - An Efficient Technique for Preventing Single Event 
Disruptions in Synchronous and... (1)</OPTION> <OPTION value=33>33 - 
Architecture of a Dynamically Reconfigurable NoC for Adaptive Reconfigurable 
MPSoC (0)</OPTION> <OPTION value=34>34 - Embedded Reconfigurable Array 
Fabrics for Efficient Implementation of Image Compression... (1)</OPTION> 
<OPTION value=35>35 - Routing in Wireless Sensor Networks Using Ant Colony 
Optimization (2)</OPTION> <OPTION value=36>36 - Simulation of 
Multifunctional Combinational Modules Controlled by Vdd (3)</OPTION> <OPTION 
value=37>37 - Reconfigurable Parallel Computing Architecture for On-Board 
Data Processing (2)</OPTION> <OPTION value=38>38 - On comparison of Variable 
Length Representations by Means of Unconstrained Evolution... (3)</OPTION> 
<OPTION value=39>39 - VLSI Implementation of LMS Equaliser with Adaptive 
Length Selection for Wireless... (0)</OPTION> <OPTION value=41>41 - A 
Scalable Reconfigurable Analog to Digital Converter Architecture Targeting 
Low... (0)</OPTION> <OPTION value=42>42 - Linear Prediction with 
Differential Evolution Algorithm (2)</OPTION> <OPTION value=43>43 - Genetic 
Algorithm based Engine for Domain-Specific Reconfigurable Arrays 
(0)</OPTION> <OPTION value=44>44 - Non-Uniform Search Domain based Genetic 
Algorithm for the Synthesis and Continuous... (2)</OPTION> <OPTION 
value=45>45 - Design Concepts for a Dynamically Reconfigurable Wireless 
Sensor Node (2)</OPTION> <OPTION value=46>46 - On-Board Partial Run-Time 
Reconfiguration for Pico-Satellite Constellations (2)</OPTION> <OPTION 
value=47>47 - A Framework of Evolvable and Reconfigurable Sensor Networks 
for Aerospace –based... (0)</OPTION> <OPTION value=48>48 - Analytical 
Modelling of Power Attenuation under Parameter Fluctuations with 
Applications... (2)</OPTION> <OPTION value=49>49 - A New State Space 
Representation Method for Adaptive Log Domain Systems (2)</OPTION> <OPTION 
value=50>50 - Swarm Based Incremental Learning for Combinational Circuit 
Evolution (2)</OPTION> <OPTION value=51>51 - Gene Regulation Mechanisms 
introduced in the E valuation Criteria for a Hardware... (2)</OPTION> 
<OPTION value=52>52 - Automatic Hybrid Genetic Algorithm Based Printed 
Circuit Board Inspection (2)</OPTION> <OPTION value=53>53 - Population based 
FPGA solution to Mastermind game (2)</OPTION> <OPTION value=54>54 - A Large 
Scale Adaptable Multiplier for Cryptographic Applications (2)</OPTION> 
<OPTION value=55>55 - A Self-Tuning Analog Proportional-Integral-Derivative 
(PID) Controller (2)</OPTION> <OPTION value=56>56 - Self-Configurable Neural 
Network Processor for Adaptable FIR Filters (3)</OPTION> <OPTION value=57>57 
- On-Chip Evolution Using a Soft Processor Core Applied to Image Recognition 
(2)</OPTION> <OPTION value=58>58 - A Novel Adaptive Viterbi Algorithm and 
Its Implementation (2)</OPTION> <OPTION value=59>59 - An Efficient Hardware 
Architecture for H.264 Adaptive Deblocking Filter (2)</OPTION> <OPTION 
value=60>60 - A Low-Complexity Self-Calibrating Adaptive Quadrature Receiver 
(2)</OPTION> <OPTION value=61>61 - A Honeycomb Development Architecture for 
Robust Fault-Tolerant Design (2)</OPTION> <OPTION value=62>62 - Sate-Space 
based Analytical Modelling for Real-Time Fault Recovery and Self-Repair... 
(2)</OPTION> <OPTION value=63>63 - Strategies to On- Line Failure Recovery 
in Self- Adaptive Systems based on Dynamic... (2)</OPTION> <OPTION 
value=64>64 - A Platform for Digital Intrinsic Hardware Evolution 
(2)</OPTION> <OPTION value=65>65 - Face Recognition Using a Gabor Filter 
Bank Approach (2)</OPTION> <OPTION value=66>66 - Protecting Fingerprint Data 
using Watermarking (2)</OPTION> <OPTION value=67>67 - Debug Support for 
System-on-Chips, Considerations for Reconfigurable and Hybrid ... 
(2)</OPTION> <OPTION value=68>68 - Novel Techniques for Ensuring Secure 
Communications for Distributed Low Power Devices (2)</OPTION> <OPTION 
value=69>69 - A Modular Framework for the Evolution of Circuits on 
Configurable Transistor Array... (2)</OPTION> <OPTION value=70>70 - Power 
Driven Reconfigurable Complex Continuous Wavelet Transform Processor 
(2)</OPTION> <OPTION value=71>71 - A Tuning Technique for Switched-Capacitor 
Circuits (0)</OPTION> <OPTION value=72>72 - An Automatic Technique to 
Synthesize System-on-a-Chip to Adapt to Changing Environments (2)</OPTION> 
<OPTION value=73>73 - Picosatellite Constellations for Remote Sensing in LEO 
(2)</OPTION> <OPTION value=74>74 - Evolvable Hardware Applied to 
Nanotechnology (1)</OPTION> <OPTION value=75>75 - Gate-level Morphogenetic 
Evolvable Hardware for Scalability and Adaptation on FPGAs (2)</OPTION> 
<OPTION value=76>76 - Synthesis of MOS Analog Circuits by Evolutionary 
Methods (2)</OPTION> <OPTION value=77>77 - An Adaptive HDL Design 
Methodology for Hard IP and Soft IP Co-Protection (2)</OPTION> <OPTION 
value=78>78 - FSM and HSM watermarking: A Tutorial (3)</OPTION> <OPTION 
value=79>79 - Physics-based Model applied to Evolvable Hardware (2)</OPTION> 
<OPTION value=80>80 - A Generic On-Chip Debugger for Wireless Sensor 
Networks (goCDWSN) (2)</OPTION> <OPTION value=81>81 - The Gannet 
Service-based SoC: A Service-level Reconfigurable Architecture (2)</OPTION> 
<OPTION value=82>82 - A FPGA simulation using asexual genetic algorithms for 
integrated self-repair (2)</OPTION> <OPTION value=83>83 - USING THE 
“CELOXICA” FPGA BOARD AND THE MACHINE LEARNING ALGORITHM “LEM3.. 
(2)</OPTION> <OPTION value=84>84 - A Comparing Design of Satellite Attitude 
Control System Based on Reaction Wheel (0)</OPTION></SELECT> 
<P></P>
<DT>
<P><B>Select Reviewer(s):</B></P>
<DD>
<P><SPAN class=note>Tip: Click on ID, Name, or Reviews on the line below to 
re-sort this list (page will reload)</SPAN></P>
<DD>
<P>[ Reviewer ID - <A 
href="http://www.eng.bahcesehir.edu.tr/openconf/chair/assign_reviews.php?s=name">Name</A> 
(# <A 
href="http://www.eng.bahcesehir.edu.tr/openconf/chair/assign_reviews.php?s=reviews">Reviews</A>) 
]</P>
<DD><SELECT multiple size=10 name=reviewers[]> <OPTION value=4>&nbsp;&nbsp;4 
- [PC] Nizamettin Aydin (0)</OPTION> <OPTION value=5>&nbsp;&nbsp;5 - [PC] 
Yalcin Cekic (0)</OPTION> <OPTION value=6>&nbsp;&nbsp;6 - [PC] Didier 
Keymeulen (1)</OPTION> <OPTION value=7>&nbsp;&nbsp;7 - [PC] Emin Anarim 
(0)</OPTION> <OPTION value=8>&nbsp;&nbsp;8 - [PC] Murat Askar (0)</OPTION> 
<OPTION value=9>&nbsp;&nbsp;9 - [PC] Peter Athanas (3)</OPTION> <OPTION 
value=10>10 - [PC] Juergen Becker (3)</OPTION> <OPTION value=11>11 - [PC] 
Neil Bergmann (3)</OPTION> <OPTION value=12>12 - [PC] John Choma 
(2)</OPTION> <OPTION value=13>13 - [PC] Carlos A. Coello Coello (3)</OPTION> 
<OPTION value=14>14 - [PC] Sorin Cristoloveanu (1)</OPTION> <OPTION 
value=15>15 - [PC] Antonio Di Nola (1)</OPTION> <OPTION value=16>16 - [PC] 
Wai-Chi Fang (3)</OPTION> <OPTION value=17>17 - [PC] F. Joel Ferguson 
(0)</OPTION> <OPTION value=18>18 - [PC] Dario Floreano (1)</OPTION> <OPTION 
value=19>19 - [PC] Manfred Glesner (3)</OPTION> <OPTION value=20>20 - [PC] 
Maya Gokhale (3)</OPTION> <OPTION value=21>21 - [PC] Pauline Haddow 
(3)</OPTION> <OPTION value=22>22 - [PC] Ilker Hamzaoglu (1)</OPTION> <OPTION 
value=23>23 - [PC] Tetsuya Higuchi (2)</OPTION> <OPTION value=24>24 - [PC] 
Daniel Howard (3)</OPTION> <OPTION value=25>25 - [PC] Lishan Kang 
(3)</OPTION> <OPTION value=26>26 - [PC] Haluk Konuk (3)</OPTION> <OPTION 
value=27>27 - [PC] John Koza (3)</OPTION> <OPTION value=28>28 - [PC] Jason 
Lahn (1)</OPTION> <OPTION value=29>29 - [PC] Bernard Manderick (3)</OPTION> 
<OPTION value=30>30 - [PC] Trent McConaghy (2)</OPTION> <OPTION value=31>31 
- [PC] Bob McKay (1)</OPTION> <OPTION value=32>32 - [PC] Brian Meadows 
(3)</OPTION> <OPTION value=33>33 - [PC] Karlheinz Meier (2)</OPTION> <OPTION 
value=34>34 - [PC] Mohammad Mojarradi (2)</OPTION> <OPTION value=35>35 - 
[PC] J. M. Moreno (2)</OPTION> <OPTION value=36>36 - [PC] Masahiro Murakawa 
(3)</OPTION> <OPTION value=37>37 - [PC] Alex Orailoglu (0)</OPTION> <OPTION 
value=38>38 - [PC] Christos Papachristou (3)</OPTION> <OPTION value=39>39 - 
[PC] Marek A. Perkowski (1)</OPTION> <OPTION value=40>40 - [PC] Viktor 
Prasanna (3)</OPTION> <OPTION value=41>41 - [PC] Justinian Rosca 
(3)</OPTION> <OPTION value=42>42 - [PC] Eduardo Sanchez (3)</OPTION> <OPTION 
value=43>43 - [PC] Radu Secareanu (2)</OPTION> <OPTION value=44>44 - [PC] 
Sakir Sezer (3)</OPTION> <OPTION value=45>45 - [PC] Hajime Shibata 
(3)</OPTION> <OPTION value=46>46 - [PC] Horia-Nicolai Teodorescu 
(3)</OPTION> <OPTION value=47>47 - [PC] Jim Torresen (3)</OPTION> <OPTION 
value=48>48 - [PC] Andy Tyrrell (3)</OPTION> <OPTION value=49>49 - [PC] 
Sezer Goren Ugurdag (0)</OPTION> <OPTION value=50>50 - [PC] Ranga Vemuri 
(3)</OPTION> <OPTION value=51>51 - [PC] Tanya Vladimirova (3)</OPTION> 
<OPTION value=52>52 - [PC] Svetlana Yanushkevich (3)</OPTION> <OPTION 
value=53>53 - [PC] Xin Yao (3)</OPTION> <OPTION value=54>54 - [PC] Nukhet 
Yetis (0)</OPTION> <OPTION value=55>55 - [PC] Sanyou Zeng (3)</OPTION> 
<OPTION value=56>56 - [PC] Nazeeh Aranki (3)</OPTION> <OPTION value=57>57 - 
[PC] Hugo deGaris (3)</OPTION> <OPTION value=58>58 - [PC] Erik Dirkx 
(3)</OPTION> <OPTION value=59>59 - [PC] Ahmet Erdogan (2)</OPTION> <OPTION 
value=60>60 - [PC] Sharon Graves (2)</OPTION> <OPTION value=61>61 - [PC] 
David Gwaltney (2)</OPTION> <OPTION value=62>62 - [PC] Alister Hamilton 
(2)</OPTION> <OPTION value=63>63 - [PC] Alan Hunsberger (3)</OPTION> <OPTION 
value=64>64 - [PC] Srinivas Katkoori (2)</OPTION> <OPTION value=65>65 - [PC] 
Semion Kizhner (3)</OPTION> <OPTION value=66>66 - [PC] Gregory Larchev 
(2)</OPTION> <OPTION value=67>67 - [PC] Derek Linden (1)</OPTION> <OPTION 
value=68>68 - [PC] Klaus McDonald-Maier (1)</OPTION> <OPTION value=69>69 - 
[PC] Julian Miller (1)</OPTION> <OPTION value=70>70 - [PC] Lukas Sekanina 
(3)</OPTION> <OPTION value=71>71 - [PC] Raphael Some (3)</OPTION> <OPTION 
value=72>72 - [PC] Adrian Stoica (3)</OPTION> <OPTION value=73>73 - [PC] 
Gianluca Tempesti (1)</OPTION> <OPTION value=74>74 - [PC] Anil Thakoor 
(2)</OPTION> <OPTION value=75>75 - [PC] Gunnar Tufte (3)</OPTION> <OPTION 
value=76>76 - [PC] Tina Yu (2)</OPTION> <OPTION value=77>77 - [PC] Rolf 
Drechsler (3)</OPTION> <OPTION value=78>78 - [PC] Rajesh Galivanche 
(3)</OPTION> <OPTION value=79>79 - [PC] Paul Hasler (2)</OPTION> <OPTION 
value=80>80 - [PC] Kalmanje S Krishnakumar (0)</OPTION> <OPTION value=81>81 
- [PC] Osman Nuri Ucan (0)</OPTION> <OPTION value=82>82 - [PC] H Fatih 
Ugurdag (0)</OPTION></SELECT> 
<P></P>
<DT><INPUT type=submit value="Assign Reviews" name=submit> </DT></DL></FORM>
<P></P></DIV><!-- mainbody -->
<P>&nbsp;</P>
<DIV class=footerBorder></DIV><!-- DO NOT REMOVE THIS COPYRIGHT NOTICE -->
<P>
<DIV class=powered>Powered by <A href="http://www.openconf.org/" 
target=_blank>OpenConf</A><!--1.22--><BR>Copyright ©2002-2005 <A 
href="http://www.zakongroup.com/technology/" target=_blank>Zakon Group 
LLC</A></DIV><!-- DO NOT REMOVE THIS COPYRIGHT NOTICE --></BODY></HTML>

The text file that I want to create with Perl is:

2 - Switchable Glass: A possible medium for Evolvable Hardware (4)
3 - An Efficient Multi-Objective Evolutionary Algorithm for Combinational Circuit Design (3)
4 - A Background Mismatch Calibration for Capacitive Digital-to-Analog Converters (3)
5 - Designing Electronic Circuits by Means of Gene Expression Programming (3)
6 - Coherence Based Fault Detection And Error Correction (3)
7 - Wormhole Routing with Virtual Channels using Dynamic Rate Control for Network-on... (2)
8 - Noise Analysis of Phase Locked Loops (3)
9 - Design and Analysis of a Second Order Phase Locked Loops (PLLs) (2)
10 - SW-HW Co-design and fault tolerant implementation for the LRID Wireless communication... (3)
11 - Adaptive PID Controller Using Parameter Optimization Algorithm (2)
12 - A Novel Self-organizing Hybrid Network Protocol (2)
13 - An Adaptive FPGA-Based Mechatronic Control System Supporting Partial Reconfiguration... (3)
14 - Generalized Disjunction Decomposition for the Evolution of Programmable Logic Array... (3)
15 - Woofer-Tweeter Adaptive Optics System (1)
16 - A Re-Programmable Platform for Dynamic Burn-in Test of Xilinx VirtexII 3000 FPGA... (3)
17 - Using hardware-based particle swarm method for dynamic optimization of adaptive ... (2)
18 - Hardware/software coevolution of genome programs and cellular processors (2)
19 - Systolic Array Based Adaptive Beamformer Modelling in SystemC Environment (2)
20 - A Reconfigurable Hardware Design Using FPGA (2)
21 - An FPGA Implemented Processor Architecture with Adaptive Resolution (2)
22 - Evolving Hardware with Self-reconfigurable connectivity in Xilinx FPGAs (2)
23 - Particle Swarm Optimization with Discrete Recombination: An Online Optimizer for... (2)
24 - Towards the Integration of Drive Control Loop Electronics of the JPL/Boeing Gyroscope... (2)
25 - An Incremental Evolutionary Strategy for the Design of FIR Filters Targeting Real... (2)
26 - Adaptive Micro-Antenna on Silicon Substrate (3)
27 - Towards Fluent Sensor Networks: A Scalable and Robust Self-Deployment Approach (3)
28 - Comparison of Fuzzy-C Means, Hard C-Means and Differential Evolution Algorithm in... (2)
29 - Evolutionary Design of Digital Circuits: Where Are Current Limits? (2)

And so on.

So far I've written this code, but I can't figure out why it doesn't work. It prints nothing to either the screen or the text file. Any help will be appreciated. Thank you!

use strict;
use warnings;  

use HTML::TreeBuilder;

my $tree = HTML::TreeBuilder->new_from_content(
    do { local $/; <DATA> }
);
open(my $fh, '>', 'outputs.txt');
my $i = 2;
for ( $tree->look_down( 'name' => 'papers' ) ) {
    my $papers = $_->look_down( 'OPTION value' => 'i' )->as_trimmed_text;
    # my $comment  = $_->look_down( 'class' => 'content' )->as_trimmed_text;
    # my $name     = $_->look_down( '_tag'  => 'h3' )->as_trimmed_text;
    # $name =~ s/^Re:\s*//;
    # $name =~ s/\s*$location\s*$//;

    print "Paper: $papers\n";
    print $fh "Paper: $papers\n";
    $i++;
}

Gunner1905 asked May 29 '15 07:05

2 Answers

You're overcomplicating this with the second look_down, which matches elements by their attributes. Simply find() the <option> elements instead.

use feature 'say';    # say() is only available once this feature is enabled

foreach my $papers ( $tree->look_down( 'name' => 'papers[]' ) ) {
    foreach my $option ( $papers->find( 'option' ) ) {
        say $option->as_trimmed_text;
    }
}

Also note that the name attribute of the <select> is papers[], not papers. The [] are part of the name.
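Neither snippet on this page writes the text file the question asks for, so here is a minimal sketch of a complete script that does. It is an illustration under stated assumptions, not the answerer's code: the sub names option_texts and write_papers, the script name in the comment, and the input file name are hypothetical, and the input is assumed to be UTF-8 encoded. Only outputs.txt and papers[] come from the question.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use HTML::TreeBuilder;

# Return the trimmed text of every <option> inside the <select> elements
# whose name attribute equals $select_name.
sub option_texts {
    my ( $html, $select_name ) = @_;

    my $tree = HTML::TreeBuilder->new_from_content($html);
    my @texts;
    for my $select ( $tree->look_down( _tag => 'select', name => $select_name ) ) {
        for my $option ( $select->find('option') ) {
            # '\xA0' is spliced into a regex character class, so the leading
            # &nbsp; padding is trimmed along with ordinary whitespace.
            push @texts, $option->as_trimmed_text( extra_chars => '\xA0' );
        }
    }
    $tree->delete;    # release the parse tree
    return @texts;
}

# Read $in_file (assumed to be UTF-8) and write one paper per line to $out_file.
sub write_papers {
    my ( $in_file, $out_file ) = @_;

    open my $in, '<:encoding(UTF-8)', $in_file
        or die "Cannot read $in_file: $!";
    my $html = do { local $/; <$in> };
    close $in;

    open my $out, '>:encoding(UTF-8)', $out_file
        or die "Cannot write $out_file: $!";
    print {$out} "$_\n" for option_texts( $html, 'papers[]' );
    close $out or die "Cannot close $out_file: $!";
}

# Only runs when an input file is given, e.g.: perl extract_papers.pl papers.html
write_papers( $ARGV[0], 'outputs.txt' ) if @ARGV;
```

Keeping the extraction in its own sub means the same code can be pointed at reviewers[] as well, and checking the result of open avoids the silent failure that an unchecked open allows.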

simbabque answered Sep 22 '22 23:09


Try this:

#! /usr/bin/env perl

use strict;
use warnings;

use HTML::TreeBuilder;

my $tree = HTML::TreeBuilder->new_from_content(
    do { local $/; <DATA> }
);

# $tree->dump;

for ( $tree->look_down( 'name' => 'papers[]' ) ) {
    for my $p ( $_->look_down( '_tag' => 'option' ) ) {
        print "Paper: " . $p->as_trimmed_text( extra_chars => '\xA0' ) . "\n";
    }      
}          

__DATA__
...

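There are two problems with your code. First, the interesting section is named papers[], not papers (I used $tree->dump to find that out), so the outer look_down() matches nothing and the loop body never runs. Second, both the arguments and the return value of your inner look_down() are wrong: 'OPTION value' is not an attribute name, 'i' is a literal string rather than the variable $i, and in scalar context look_down() returns only the first matching element (or undef when nothing matches).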
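As a rough illustration of both points, the sketch below runs look_down() against a two-option stand-in for the real page. The HTML string and its titles are made up for the example; only the papers[] name and the look_down(), as_trimmed_text() and new_from_content() calls come from the code on this page.

```perl
use strict;
use warnings;

use HTML::TreeBuilder;

# A two-option stand-in for the real page (made-up titles).
my $html = '<select name="papers[]">'
         . '<option value="2">2 - First title (4)</option>'
         . '<option value="3">3 - Second title (3)</option>'
         . '</select>';
my $tree = HTML::TreeBuilder->new_from_content($html);

# Attribute values are compared as whole strings, so 'papers' finds nothing
# while 'papers[]' finds the <select>.
my @wrong = $tree->look_down( name => 'papers' );
my @right = $tree->look_down( name => 'papers[]' );
printf "papers: %d match(es), papers[]: %d match(es)\n",
    scalar @wrong, scalar @right;

# Criteria are attribute => value pairs; "_tag" stands for the element name.
my $i = 3;
my $option = $right[0]->look_down( _tag => 'option', value => $i );

# In scalar context look_down() returns the first match or undef, so guard
# the method call instead of chaining it directly.
print $option->as_trimmed_text, "\n" if defined $option;
```

Selecting one option by its value attribute is rarely needed here, since looping over all the <option> elements already yields them in document order.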

lcd047 answered Sep 20 '22 23:09