Using the curl CommandThe -L option tells the curl command to follow the URL redirections. The -s option is for silent mode, which means the command should not output anything to the terminal, while the -o option provides the path that it should send the output to instead of sending it to stdout.
GetRedirectURL.php * Gets the address that the provided URL redirects to, * or FALSE if there's no redirect. * Follows and collects all redirects, in order, for the given URL. * Gets the address that the URL ultimately leads to.
To check internal redirect chains or identify redirect loops, simply export the 'redirect chains' report. This report maps out chains of redirects, the number of hops along the way and will identify the source, as well as if there is a loop.
Use Python urllib Library To Get Redirection URL. request module. Define a web page URL, suppose this URL will be redirected when you send a request to it. Get the response object. Get the webserver returned response status code, if the code is 301 then it means the URL has been redirected permanently.
curl
's -w
option and the sub variable url_effective
is what you are
looking for.
Something like
curl -Ls -o /dev/null -w %{url_effective} http://google.com
More info
-L Follow redirects -s Silent mode. Don't output anything -o FILE Write output to <file> instead of stdout -w FORMAT What to output after completion
More
You might want to add -I
(that is an uppercase i
) as well, which will make the command not download any "body", but it then also uses the HEAD method, which is not what the question included and risk changing what the server does. Sometimes servers don't respond well to HEAD even when they respond fine to GET.
Thanks, that helped me. I made some improvements and wrapped that in a helper script "finalurl":
#!/bin/bash
curl $1 -s -L -I -o /dev/null -w '%{url_effective}'
-o
output to /dev/null
-I
don't actually download, just discover the final URL-s
silent mode, no progressbarsThis made it possible to call the command from other scripts like this:
echo `finalurl http://someurl/`
as another option:
$ curl -i http://google.com
HTTP/1.1 301 Moved Permanently
Location: http://www.google.com/
Content-Type: text/html; charset=UTF-8
Date: Sat, 19 Jun 2010 04:15:10 GMT
Expires: Mon, 19 Jul 2010 04:15:10 GMT
Cache-Control: public, max-age=2592000
Server: gws
Content-Length: 219
X-XSS-Protection: 1; mode=block
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>
But it doesn't go past the first one.
Thank you. I ended up implementing your suggestions: curl -i + grep
curl -i http://google.com -L | egrep -A 10 '301 Moved Permanently|302 Found' | grep 'Location' | awk -F': ' '{print $2}' | tail -1
Returns blank if the website doesn't redirect, but that's good enough for me as it works on consecutive redirections.
Could be buggy, but at a glance it works ok.
You can do this with wget usually. wget --content-disposition
"url" additionally if you add -O /dev/null
you will not be actually saving the file.
wget -O /dev/null --content-disposition example.com
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With