I am trying to parse a fairly simple web page for information in a shell script. The page I'm working with now is generated here. For example, I would like to pull the internet service provider into a shell variable. It may make sense to use one of the programs xmllint, XMLStarlet or xpath for this purpose. I am quite familiar with shell scripting, but I am new to XPath syntax and the utilities that implement it, so I would appreciate a few pointers in the right direction.
Here's the beginning of the shell script:
HTMLISPInformation="$(curl --user-agent "Mozilla/5.0" http://aruljohn.com/details.php)"
# ISP="$(<XPath magic goes here.>)"
For your convenience, here is a utility for dynamically testing XPath syntax online:
http://www.bit-101.com/xpath/
Quick and dirty solution...
xmllint --html --xpath "//table/tbody/tr[6]/td[2]" page.html
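To feed this from the script in the question, the page can be piped straight from curl into xmllint, which reads stdin when given -. This is a sketch assuming the ISP really does sit in row 6, column 2 of that table: wrapping the expression in string() yields just the text content, and stderr is silenced because xmllint's HTML parser is noisy about real-world markup.
ISP="$(curl -s --user-agent "Mozilla/5.0" http://aruljohn.com/details.php \
    | xmllint --html --xpath 'string(//table/tbody/tr[6]/td[2])' - 2>/dev/null)"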
You can find the XPath of a node using Chrome and its Developer Tools: inspect the node, right-click it in the Elements panel, and select Copy → Copy XPath.
I wouldn't rely on this too much, though; it is fragile, since the copied XPath breaks as soon as the page layout changes.
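One extra caveat: Chrome copies the XPath from the rendered DOM, where a tbody element is inserted into every table even when the raw HTML omits it, so the copied path can fail outright in xmllint. If the expression matches nothing, it is worth trying it without tbody:
xmllint --html --xpath "//table/tr[6]/td[2]" page.html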
All the information on that page can be found elsewhere anyway: run whois on your own IP, for instance...
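As a rough sketch of that approach (the ifconfig.me service used to discover the public IP is an assumption, and the organisation field varies by regional registry, e.g. OrgName for ARIN versus org-name for RIPE):
# Discover the public IP; ifconfig.me is an assumed service, any echo-IP service works.
IP="$(curl -s https://ifconfig.me)"
# Pull the first organisation-like field from the whois record; field names vary by registry.
ISP="$(whois "$IP" | awk -F': *' '/^(OrgName|org-name)/ {print $2; exit}')"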