I'm using FileUtils.copyURLToFile(URL, File), part of Apache Commons IO 2.4, to download a file and save it on my computer. The problem is that some sites refuse the connection without referrer and user agent data.
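My questions:
1. Is there a way to set the user agent and referrer with the copyURLToFile method?
2. If not, how should I download the file and save the resulting InputStream to a file?

I've re-implemented the functionality with HttpComponents instead of Commons IO's copyURLToFile. This code downloads a file in Java from its URL and saves it at a specific destination.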
The final code:
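import java.io.File;
import java.io.IOException;
import java.net.URL;
import org.apache.commons.io.FileUtils;
import org.apache.http.HttpEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

public static boolean saveFile(URL imgURL, String imgSavePath) {
    HttpGet httpGet = new HttpGet(imgURL.toString());
    // Some sites refuse requests that lack these headers.
    httpGet.addHeader("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.11 Safari/537.36");
    httpGet.addHeader("Referer", "https://www.google.com");
    // try-with-resources closes the response and the client, which also releases the connection.
    try (CloseableHttpClient httpClient = HttpClients.createDefault();
         CloseableHttpResponse httpResponse = httpClient.execute(httpGet)) {
        HttpEntity imageEntity = httpResponse.getEntity();
        if (imageEntity != null) {
            // copyInputStreamToFile closes the source stream when it finishes.
            FileUtils.copyInputStreamToFile(imageEntity.getContent(), new File(imgSavePath));
        }
        return true;
    } catch (IOException e) {
        return false;
    }
}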
Of course, the code above takes more space than a single line of code:
FileUtils.copyURLToFile(imgURL, new File(imgSavePath),
URLS_FETCH_TIMEOUT, URLS_FETCH_TIMEOUT);
but it gives you more control over the process and lets you specify not only timeouts but also the User-Agent and Referer values, which many websites require.
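For comparison, the same two headers can also be set with the JDK's own URLConnection when adding HttpComponents is not an option. This is only a minimal sketch, not part of the answer above: the class name HeaderDownload, the method download, and its parameters are hypothetical, and it uses java.nio.file.Files.copy rather than Commons IO to save the stream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class HeaderDownload {
    /**
     * Downloads url to target, sending the given User-Agent and Referer values.
     * Returns the number of bytes written. (Hypothetical helper for illustration.)
     */
    public static long download(URL url, Path target, String userAgent, String referer,
                                int timeoutMillis) throws IOException {
        URLConnection connection = url.openConnection();
        // Request properties and timeouts must be set before the connection is opened.
        connection.setRequestProperty("User-Agent", userAgent);
        connection.setRequestProperty("Referer", referer);
        connection.setConnectTimeout(timeoutMillis);
        connection.setReadTimeout(timeoutMillis);
        // getInputStream() opens the connection; try-with-resources closes the stream.
        try (InputStream in = connection.getInputStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("Usage: java HeaderDownload <url> <target-file>");
            return;
        }
        long bytes = download(URI.create(args[0]).toURL(), Paths.get(args[1]),
                "Mozilla/5.0", "https://www.google.com", 10_000);
        System.out.println("Saved " + bytes + " bytes to " + args[1]);
    }
}
```

The trade-off is that URLConnection offers less control than HttpComponents (for example, no connection pooling configuration), so this fits simple one-off downloads best.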
Completing the accepted answer on how to handle timeouts:
If you want to set timeouts, you have to create the CloseableHttpClient like this:
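RequestConfig config = RequestConfig.custom()
        .setConnectTimeout(connectionTimeout)          // ms allowed to establish the connection
        .setConnectionRequestTimeout(readDataTimeout)  // ms to wait for a connection from the connection manager
        .setSocketTimeout(readDataTimeout)             // ms of inactivity allowed while waiting for data
        .build();
CloseableHttpClient httpClient = HttpClientBuilder
        .create()
        .setDefaultRequestConfig(config)
        .build();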
It may also be a good idea to create your CloseableHttpClient in a try-with-resources statement so that it is closed automatically:
try (CloseableHttpClient httpClient = HttpClientBuilder.create().setDefaultRequestConfig(config).build()) {
    // ... rest of the code using httpClient
}
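To illustrate what try-with-resources guarantees, here is a small sketch that uses a hypothetical stand-in class, Tracked, in place of a real client and response (it does not touch HttpComponents at all). It records that resources are closed in reverse order of declaration, and before the catch block runs, even when the body throws.

```java
import java.util.ArrayList;
import java.util.List;

public class CloseOrderDemo {
    /** Hypothetical stand-in for a closeable client or response; logs when it is closed. */
    static final class Tracked implements AutoCloseable {
        private final String name;
        private final List<String> log;

        Tracked(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void close() {
            log.add("closed " + name);
        }
    }

    /** Runs a body that fails and returns the events recorded along the way. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (Tracked client = new Tracked("client", log);
             Tracked response = new Tracked("response", log)) {
            log.add("working");
            throw new IllegalStateException("simulated failure");
        } catch (IllegalStateException e) {
            // Both resources have already been closed by the time this runs.
            log.add("caught");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [working, closed response, closed client, caught]
    }
}
```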