
How can I extract a tag's attribute value from an HTML file?

Tags: regex, bash

I know, HTML shouldn't be parsed with curl, grep and sed. But I am looking for an easy approach, not a very safe one.

So I fetch an HTML file with curl, and I need the value of a certain attribute from one of its tags. I use grep to find the line containing token, which occurs only once. That gives me a whole div:

<div class="userlinks">
  <span class="arrow flleft profilesettings">settings</span>
  <form class="logoutform" method="post" action="/logout">
    <input class="logoutbtn arrow flright" type="submit" value="Log out">
    <input type="hidden" name="ltoken" value="a5fc8828a42277538f1352cf9ea27a71">
  </form>
</div>

How can I get just the value attribute (e.g. "a5fc8828a42277538f1352cf9ea27a71")?


asked Jul 17 '12 13:07

tzippy

2 Answers

There's no need to grep:

sed -n '/token/s/.*name="ltoken"\s\+value="\([^"]\+\).*/\1/p' input_file
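Note that \s and \+ are GNU sed extensions, so the one-liner above may not behave the same on BSD or macOS sed. Here is a minimal sketch of a POSIX-style spelling of the same idea. It assumes the markup looks exactly like the snippet in the question, and the temporary file is only a stand-in for the curl output:

```shell
#!/bin/sh
# Stand-in for the downloaded page: the div from the question, written to a
# temporary file (hypothetical; in practice this would be curl's output).
input_file=$(mktemp)
cat > "$input_file" <<'EOF'
<div class="userlinks">
  <span class="arrow flleft profilesettings">settings</span>
  <form class="logoutform" method="post" action="/logout">
    <input class="logoutbtn arrow flright" type="submit" value="Log out">
    <input type="hidden" name="ltoken" value="a5fc8828a42277538f1352cf9ea27a71">
  </form>
</div>
EOF

# -n suppresses the default output and the trailing p prints only the lines
# where the substitution succeeded. [[:space:]]\{1,\} and [^"]\{1,\} are the
# POSIX spellings of GNU sed's \s\+ and [^"]\+.
token=$(sed -n 's/.*name="ltoken"[[:space:]]\{1,\}value="\([^"]\{1,\}\)".*/\1/p' "$input_file")

echo "$token"
rm -f "$input_file"
```

Because sed reads standard input when no file is given, the same expression should also work at the end of a curl pipeline.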

answered Nov 08 '22 05:11

perreal

One way, using sed:

sed "s/.* value=\"\(.*\)\".*/\1/" file.txt

Results:

a5fc8828a42277538f1352cf9ea27a71

HTH
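The command above gives that result when its input is only the line containing the token. If it is run on the whole div with one tag per line, sed also passes every non-matching line through unchanged and reduces the submit button's line to its own value. Here is a rough sketch of a restricted variant, assuming the attribute order shown in the question. The temporary file and the variable names are illustrative only:

```shell
#!/bin/sh
# Hypothetical sample: the whole div from the question, one tag per line.
sample_file=$(mktemp)
cat > "$sample_file" <<'EOF'
<div class="userlinks">
  <span class="arrow flleft profilesettings">settings</span>
  <form class="logoutform" method="post" action="/logout">
    <input class="logoutbtn arrow flright" type="submit" value="Log out">
    <input type="hidden" name="ltoken" value="a5fc8828a42277538f1352cf9ea27a71">
  </form>
</div>
EOF

# Original command on the full snippet: count how many lines it prints.
# Lines without ' value="' are printed as they are.
raw_lines=$(sed 's/.* value="\(.*\)".*/\1/' "$sample_file" | grep -c '')

# Restricted variant: -n plus the p flag print only substituted lines, the
# /name="ltoken"/ address skips the submit button, and [^"]* stops the
# capture at the closing quote instead of matching greedily.
token=$(sed -n '/name="ltoken"/s/.* value="\([^"]*\)".*/\1/p' "$sample_file")

echo "lines printed by the original command: $raw_lines"
echo "token: $token"
rm -f "$sample_file"
```

The address plus -n is what keeps the output to a single line, and the [^"]* class is a guard in case more quoted attributes follow value on the same line.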

answered Nov 08 '22 03:11

Steve
