I have rather simple HttpClient 4 code that calls HttpGet to get HTML output. The HTML comes back with script and image locations all set to local paths (e.g. <img src="/images/foo.jpg"/>), so I need the calling URL to turn these into absolute ones (<img src="http://foo.com/images/foo.jpg"/>). Now comes the problem: during the call there may be one or two 302 redirects, so the original URL no longer reflects the location of the HTML.
How do I get the latest URL of the returned content given all the redirects I may (or may not) have?
I looked at HttpGet#getAllHeaders() and HttpResponse#getAllHeaders() but couldn't find anything.
Edited: HttpGet#getURI() returns the original calling address.
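Once the final URL is known, turning the local paths into absolute ones can be left to java.net.URI. The following is only a minimal sketch of that step: the class name, the method name and the sample URLs are made up for illustration, and it assumes the final URL has a non-empty path.

```java
import java.net.URI;

public class AbsoluteLinks {
    // Resolves a link found in the HTML against the URL the HTML was served from.
    public static String toAbsolute(String finalUrl, String link) {
        return URI.create(finalUrl).resolve(link).toString();
    }

    public static void main(String[] args) {
        // Hypothetical URL of the page after all redirects.
        String base = "http://foo.com/section/page.html";
        System.out.println(toAbsolute(base, "/images/foo.jpg"));   // http://foo.com/images/foo.jpg
        System.out.println(toAbsolute(base, "css/site.css"));      // http://foo.com/section/css/site.css
        System.out.println(toAbsolute(base, "http://cdn.example.org/lib.js")); // already absolute, unchanged
    }
}
```

The point is that the base passed in has to be the post-redirect URL, otherwise page-relative links such as css/site.css resolve against the wrong directory.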
The automatic redirection policy is checked whenever a 3XX response code is received. If redirection does not happen automatically, then the response, containing the 3XX response code, is returned, where it can be handled manually. Redirect policy is set through the Builder.followRedirects method.
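The Builder.followRedirects method mentioned here appears to be the one on java.net.http.HttpClient (Java 11+), not Apache HttpClient 4, so the sketch below assumes that API. With automatic redirects enabled, HttpResponse#uri() reports the URI the response was finally received from. The local test server, its /old and /new paths, and the class and method names are all invented for the demonstration.

```java
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class FollowRedirectsDemo {
    // Hypothetical throwaway server: /old answers 302 -> /new, /new answers 200.
    public static HttpServer startServer() throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/old", ex -> {
            ex.getResponseHeaders().add("Location", "/new");
            ex.sendResponseHeaders(302, -1); // -1: no response body
            ex.close();
        });
        server.createContext("/new", ex -> {
            byte[] body = "<img src=\"/images/foo.jpg\"/>".getBytes(StandardCharsets.UTF_8);
            ex.sendResponseHeaders(200, body.length);
            ex.getResponseBody().write(body);
            ex.close();
        });
        server.start();
        return server;
    }

    // Fetches the URL with automatic redirects and returns the URI the body came from.
    public static URI finalUri(String url) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpResponse<String> response = client.send(
                HttpRequest.newBuilder(URI.create(url)).GET().build(),
                HttpResponse.BodyHandlers.ofString());
        return response.uri();
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = startServer();
        try {
            int port = server.getAddress().getPort();
            // Prints the /new URL even though /old was requested.
            System.out.println(finalUri("http://127.0.0.1:" + port + "/old"));
        } finally {
            server.stop(0);
        }
    }
}
```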
Class LaxRedirectStrategy: a lax RedirectStrategy implementation that automatically redirects all HEAD, GET, POST, and DELETE requests. This strategy relaxes restrictions on automatic redirection of POST methods imposed by the HTTP specification.
You do not need to explicitly close the HttpClient; however (you may be doing this already, but it is worth noting), you should ensure that connections are released after method execution. Edit: The ClientConnectionManager within the HttpClient is responsible for maintaining the state of connections.
That would be the current URL, which you can get by calling HttpGet#getURI().
EDIT: You didn't mention how you are handling redirects. That works for us because we handle the 302 ourselves.
Sounds like you are using DefaultRedirectHandler. We used to do that. It's kind of tricky to get the current URL. You need to use your own context. Here are the relevant code snippets:
    HttpGet httpget = new HttpGet(url);
    HttpContext context = new BasicHttpContext();
    HttpResponse response = httpClient.execute(httpget, context);
    if (response.getStatusLine().getStatusCode() != HttpStatus.SC_OK)
        throw new IOException(response.getStatusLine().toString());
    // The context holds the last request actually sent and the host it went to.
    HttpUriRequest currentReq = (HttpUriRequest) context.getAttribute(
            ExecutionContext.HTTP_REQUEST);
    HttpHost currentHost = (HttpHost) context.getAttribute(
            ExecutionContext.HTTP_TARGET_HOST);
    // The request URI may be relative, in which case the host is prepended.
    String currentUrl = (currentReq.getURI().isAbsolute())
            ? currentReq.getURI().toString()
            : (currentHost.toURI() + currentReq.getURI());
The default redirect handling didn't work for us, so we changed it, but I forget what the problem was.
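For the "handle the 302 ourselves" route, the idea is to switch automatic redirects off and follow each Location header in a loop, so the current URL is known at every step. The sketch below is not the code this answer refers to: it swaps in the JDK's java.net.http.HttpClient (Java 11+) in place of Apache HttpClient 4 so that it runs without extra dependencies, and the class name, method names, hop limit and local test server are all assumptions made for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ManualRedirects {
    // Follows 3xx responses by hand and returns the URL that finally answered.
    public static URI resolveFinalUri(URI start, int maxHops) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NEVER)
                .build();
        URI current = start;
        for (int hop = 0; hop <= maxHops; hop++) {
            HttpResponse<Void> response = client.send(
                    HttpRequest.newBuilder(current).GET().build(),
                    HttpResponse.BodyHandlers.discarding());
            int status = response.statusCode();
            String location = response.headers().firstValue("Location").orElse(null);
            if (status < 300 || status >= 400 || location == null) {
                return current; // not a redirect: the content lives here
            }
            // Location may be relative, so resolve it against the current URL.
            current = current.resolve(location);
        }
        throw new IllegalStateException("More than " + maxHops + " redirects");
    }

    // Hypothetical local test server: /a -> 302 /b -> 302 /c -> 200.
    public static HttpServer startServer() throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        addRedirect(server, "/a", "/b");
        addRedirect(server, "/b", "/c");
        server.createContext("/c", ex -> {
            ex.sendResponseHeaders(200, -1); // -1: no response body
            ex.close();
        });
        server.start();
        return server;
    }

    private static void addRedirect(HttpServer server, String from, String to) {
        server.createContext(from, ex -> {
            ex.getResponseHeaders().add("Location", to);
            ex.sendResponseHeaders(302, -1);
            ex.close();
        });
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = startServer();
        try {
            URI start = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/a");
            System.out.println(resolveFinalUri(start, 5)); // ends at the /c URL
        } finally {
            server.stop(0);
        }
    }
}
```

The hop limit guards against redirect loops; a real client would probably also want to re-request the final URL with a body handler that keeps the HTML.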
In HttpClient 4, if you are using LaxRedirectStrategy or any subclass of DefaultRedirectStrategy, this is the recommended way (see the source code of DefaultRedirectStrategy):
    HttpContext context = new BasicHttpContext();
    HttpResult<T> result = client.execute(request, handler, context);
    URI finalUrl = request.getURI();
    RedirectLocations locations = (RedirectLocations) context.getAttribute(
            DefaultRedirectStrategy.REDIRECT_LOCATIONS);
    if (locations != null) {
        finalUrl = locations.getAll().get(locations.getAll().size() - 1);
    }
Since HttpClient 4.3.x, the above code can be simplified as:
    HttpClientContext context = HttpClientContext.create();
    HttpResult<T> result = client.execute(request, handler, context);
    URI finalUrl = request.getURI();
    List<URI> locations = context.getRedirectLocations();
    if (locations != null) {
        finalUrl = locations.get(locations.size() - 1);
    }