The page https://developer.android.com/studio/index.html has a link to the Android SDK tools for Linux, which I'd like to download from a script. Unfortunately, there is no stable "easy" link that always points to the latest version, so I'd like to extract the link from the HTML itself.
The link is identified by the id linux-tools and spans multiple lines:
<a onclick="return onDownload(this)" id="linux-tools" data-modal-toggle="studio_tos"
href="https://dl.google.com/android/repository/sdk-tools-linux-3859397.zip">sdk-tools-linux-38593
I'd like to extract that href into a variable in a Bash script. The closest I've gotten so far is the following:
grep -o -z '<a.[^<]*id="linux-tools"[^<]*</a>' index.html
which outputs the above two lines.
How do I get at the actual link using typically-available shell commands?
You can use sed to first select the range of lines you want to work on, for example:
sed -n '/id="linux-tools"/,+1 p' index.html
That prints the line containing id="linux-tools" plus the one line after it. Note that the addr,+1 range form is a GNU sed extension.
Now you can use sed's substitute command to extract just the href from that range:
sed -n '/id="linux-tools"/,+1 s/.*href="\([^"]*\).*$/\1/p' index.html
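Since the question asks for the link in a Bash variable, here is a minimal sketch of wrapping that command in command substitution. The function name extract_linux_tools_url and the temporary sample file are made up for illustration; the sketch assumes GNU sed and that the href sits on the same line as the id or on the line directly after it, as in the snippet from the question.

```shell
#!/usr/bin/env bash

# Hypothetical helper (name is an assumption, not from the original answer):
# prints the href of the anchor whose id is linux-tools.
# Relies on GNU sed, because the "addr,+1" range is a GNU extension.
extract_linux_tools_url() {
  sed -n '/id="linux-tools"/,+1 s/.*href="\([^"]*\).*$/\1/p' "$1"
}

# Sample input copied from the question so the sketch runs on its own.
# In real use you would pass the downloaded index.html instead.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<a onclick="return onDownload(this)" id="linux-tools" data-modal-toggle="studio_tos"
href="https://dl.google.com/android/repository/sdk-tools-linux-3859397.zip">sdk-tools-linux-38593
EOF

# Command substitution captures the sed output in a variable.
url=$(extract_linux_tools_url "$sample")
rm -f "$sample"

echo "$url"
```

If the page layout changes so that the href moves more than one line away from the id, the variable will come back empty, so it is worth checking for that before using it.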