
HttpClient 4 - how to capture last redirect URL

I have fairly simple HttpClient 4 code that calls HttpGet to fetch HTML output. The returned HTML has its script and image locations all set to site-relative paths (e.g. <img src="/images/foo.jpg"/>), so I need the calling URL to make them absolute (<img src="http://foo.com/images/foo.jpg"/>). Now comes the problem: during the call there may be one or two 302 redirects, so the original URL no longer reflects the location of the HTML.

How do I get the final URL of the returned content, given all the redirects I may (or may not) have had?

I looked at HttpGet#getAllHeaders() and HttpResponse#getAllHeaders() - couldn't find anything.

Edited: HttpGet#getURI() returns the original calling address.

Bostone asked Sep 21 '09 21:09



2 Answers

That would be the current URL, which you can get by calling

  HttpGet#getURI(); 

EDIT: You didn't mention how you are handling redirects. That works for us because we handle the 302 ourselves.

It sounds like you are using DefaultRedirectHandler. We used to do that. Getting the current URL is somewhat tricky: you need to pass in your own context. Here are the relevant code snippets:

    HttpGet httpget = new HttpGet(url);
    // Pass our own context so it can be inspected after the redirects.
    HttpContext context = new BasicHttpContext();
    HttpResponse response = httpClient.execute(httpget, context);
    if (response.getStatusLine().getStatusCode() != HttpStatus.SC_OK)
        throw new IOException(response.getStatusLine().toString());
    // The context holds the last request that was actually executed.
    HttpUriRequest currentReq = (HttpUriRequest) context.getAttribute(
            ExecutionContext.HTTP_REQUEST);
    HttpHost currentHost = (HttpHost) context.getAttribute(
            ExecutionContext.HTTP_TARGET_HOST);
    // The request URI may be relative, so prepend the target host if needed.
    String currentUrl = (currentReq.getURI().isAbsolute())
            ? currentReq.getURI().toString()
            : (currentHost.toURI() + currentReq.getURI());

The default redirect handling didn't work for us, so we changed it, but I have forgotten what the problem was.

ZZ Coder answered Sep 22 '22 01:09


In HttpClient 4, if you are using LaxRedirectStrategy or any subclass of DefaultRedirectStrategy, this is the recommended way (see the source code of DefaultRedirectStrategy):

    HttpContext context = new BasicHttpContext();
    HttpResult<T> result = client.execute(request, handler, context);
    URI finalUrl = request.getURI();
    RedirectLocations locations = (RedirectLocations) context.getAttribute(
            DefaultRedirectStrategy.REDIRECT_LOCATIONS);
    if (locations != null) {
        finalUrl = locations.getAll().get(locations.getAll().size() - 1);
    }

Since HttpClient 4.3.x, the above code can be simplified as:

    HttpClientContext context = HttpClientContext.create();
    HttpResult<T> result = client.execute(request, handler, context);
    URI finalUrl = request.getURI();
    List<URI> locations = context.getRedirectLocations();
    if (locations != null) {
        finalUrl = locations.get(locations.size() - 1);
    }
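
Once the final URL is known, the remaining step from the question (turning relative links into absolute ones) can probably be done with plain java.net.URI. The following is only a minimal sketch under stated assumptions: the class FinalUrlSketch and the helpers finalLocation and absolutize are hypothetical names, not part of HttpClient, and the redirect list is hard-coded here to stand in for whatever context.getRedirectLocations() returned. The extra isEmpty() check is a defensive assumption and is not in the snippets above.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;

class FinalUrlSketch {

    // Hypothetical helper: returns the last recorded redirect target, or the
    // originally requested URI when no redirect was recorded (null or empty list).
    static URI finalLocation(URI requested, List<URI> redirects) {
        if (redirects == null || redirects.isEmpty()) {
            return requested;
        }
        return redirects.get(redirects.size() - 1);
    }

    // Hypothetical helper: resolves a link taken from the HTML against the final URL.
    static URI absolutize(URI base, String link) {
        return base.resolve(link);
    }

    public static void main(String[] args) {
        URI requested = URI.create("http://foo.com/start");
        // Stand-in for the redirect list read from the HttpClient context.
        List<URI> redirects = Arrays.asList(
                URI.create("http://foo.com/step"),
                URI.create("http://www.foo.com/pages/index.html"));

        URI finalUrl = finalLocation(requested, redirects);
        System.out.println(finalUrl);
        // Site-relative path: resolved against the host of the final URL.
        System.out.println(absolutize(finalUrl, "/images/foo.jpg"));
        // Document-relative path: resolved against the directory of the final URL.
        System.out.println(absolutize(finalUrl, "css/site.css"));
    }
}
```

With the hard-coded values above, the two resolved links come out as http://www.foo.com/images/foo.jpg and http://www.foo.com/pages/css/site.css, which is why the final URL matters rather than the one originally requested.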
david_p answered Sep 24 '22 01:09