403 when trying to download a remote image

I am trying to download pictures from some URLs. For some pictures it works fine, but for others I get 403 errors.

For example, this one: http://blog.zenika.com/themes/Zenika/img/zenika.gif

Access to this picture does not require any authentication. You can open the link yourself and verify that your browser gets it with a 200 status code.

The following code throws an exception: new java.net.URL(url).openStream(). The same happens with org.apache.commons.io.FileUtils.copyURLToFile(new java.net.URL(url), tmp), which uses the same openStream() method under the hood.

java.io.IOException: Server returned HTTP response code: 403 for URL: http://blog.zenika.com/themes/Zenika/img/zenika.gif
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1626) ~[na:1.7.0_45]
at java.net.URL.openStream(URL.java:1037) ~[na:1.7.0_45]
at services.impl.DefaultStampleServiceComponent$RemoteImgUrlFilter$class.downloadAsTemporaryFile(DefaultStampleServiceComponent.scala:548) [classes/:na]
at services.impl.DefaultStampleServiceComponent$RemoteImgUrlFilter$class.services$impl$DefaultStampleServiceComponent$RemoteImgUrlFilter$$handleImageUrl(DefaultStampleServiceComponent.scala:523) [classes/:na]

I develop with Scala / Play Framework. I tried to use the built-in AsyncHttpClient.

// TODO: it could be better to use iteratees on the GET call because I think AHC loads the whole body in memory
WS.url(url).get.flatMap { res =>
  if (res.status >= 200 && res.status < 300) {
    val bodyStream = res.getAHCResponse.getResponseBodyAsStream
    val futureFile = TryUtils.tryToFuture(createTemporaryFile(bodyStream))
    play.api.Logger.info(s"Successfully downloaded file $filename with status code ${res.status}")
    futureFile
  } else {
    Future.failed(new RuntimeException(s"Download of file $filename returned status code ${res.status}"))
  }
} recover {
  case NonFatal(e) => throw new RuntimeException(s"Could not downloadAsTemporaryFile url=$url", e)
}

With this AHC code, it works fine. Can someone explain this behavior and why I get a 403 error with the URL.openStream() method?

Sebastien Lorber asked Apr 09 '14 09:04



1 Answer

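As mentioned, some hosts block requests like this based on a header such as User-Agent. By default, URL.openStream() and URL.openConnection() send a User-Agent of the form Java/<version>, which this server apparently rejects with a 403: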

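This doesn't work:

import java.io.{BufferedReader, InputStreamReader}
import java.net.URL

val urls = """http://blog.zenika.com/themes/Zenika/img/zenika.gif"""
val url = new URL(urls)
val urlConnection = url.openConnection()
// No User-Agent is set, so the JDK default (Java/<version>) is sent and getInputStream() throws on the 403
val inputStream = urlConnection.getInputStream()
val bufferedReader = new BufferedReader(new InputStreamReader(inputStream))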

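This works:

import java.io.{BufferedReader, InputStreamReader}
import java.net.URL

val urls = """http://blog.zenika.com/themes/Zenika/img/zenika.gif"""
val url = new URL(urls)
val urlConnection = url.openConnection()
// Override the default User-Agent before the request is sent (NING/1.0 is the agent AHC used here)
urlConnection.setRequestProperty("User-Agent", """NING/1.0""")
val inputStream = urlConnection.getInputStream()
val bufferedReader = new BufferedReader(new InputStreamReader(inputStream))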
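
For an image you probably want the raw bytes rather than a Reader, which decodes characters and would corrupt binary data. Here is a minimal sketch of the same idea applied to the download from the question. The object name ImageDownload, the default agent value, and the temp-file prefix are my own assumptions for illustration, not something taken from the question's code base:

```scala
import java.io.IOException
import java.net.{HttpURLConnection, URL}
import java.nio.file.{Files, Path, StandardCopyOption}

object ImageDownload {
  // Hypothetical helper: downloads `url` into a temporary file while sending
  // an explicit User-Agent instead of the JDK default ("Java/<version>").
  def downloadAsTemporaryFile(url: String, userAgent: String = "NING/1.0"): Path = {
    // The cast assumes an http:// or https:// URL.
    val connection = new URL(url).openConnection().asInstanceOf[HttpURLConnection]
    connection.setRequestProperty("User-Agent", userAgent)

    // Check the status explicitly so the failure message names the code.
    val status = connection.getResponseCode
    if (status < 200 || status >= 300) {
      connection.disconnect()
      throw new IOException(s"Download of $url returned status code $status")
    }

    val tmp = Files.createTempFile("download-", ".tmp")
    val in = connection.getInputStream
    // Copy bytes directly; no Reader is involved, so binary content is preserved.
    try Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING)
    finally in.close()
    tmp
  }
}
```

Calling ImageDownload.downloadAsTemporaryFile("http://blog.zenika.com/themes/Zenika/img/zenika.gif") should then behave like the AHC version, assuming the server only filters on the User-Agent header.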
ouertani answered Sep 22 '22 22:09