
HTTP URL Address Encoding in Java

My standalone Java application gets a URL (which points to a file) from the user, and I need to request it and download the file. The problem is that I cannot encode the HTTP URL address properly.

Example:

URL: http://search.barnesandnoble.com/booksearch/first book.pdf

    java.net.URLEncoder.encode(url.toString(), "ISO-8859-1");

returns:

http%3A%2F%2Fsearch.barnesandnoble.com%2Fbooksearch%2Ffirst+book.pdf 

But what I want is:

http://search.barnesandnoble.com/booksearch/first%20book.pdf 

(space replaced by %20)

I guess URLEncoder is not designed to encode HTTP URLs... The JavaDoc says "Utility class for HTML form encoding"... Is there any other way to do this?

suDocker asked Apr 07 '09 03:04




1 Answer

The java.net.URI class can help; the documentation of java.net.URL says:

Note, the URI class does perform escaping of its component fields in certain circumstances. The recommended way to manage the encoding and decoding of URLs is to use an URI

Use one of the constructors with more than one argument, like:

    URI uri = new URI(
        "http",
        "search.barnesandnoble.com",
        "/booksearch/first book.pdf",
        null);
    URL url = uri.toURL();
    // or
    String request = uri.toString();

(the single-argument constructor of URI does NOT escape illegal characters)
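To make the difference between the two kinds of constructor concrete, here is a minimal sketch. The class and method names (UriConstructorSketch, fromComponents, singleArgRejects) are made up for illustration; only java.net.URI and its constructors come from the JDK.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class; the names are illustrative, not part of any API.
final class UriConstructorSketch {

    // Builds the URI from separate components. The four-argument constructor
    // (scheme, host, path, fragment) quotes characters that are illegal in a
    // path, such as the space.
    static String fromComponents(String host, String path) {
        try {
            return new URI("http", host, path, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Returns true when the single-argument constructor rejects the raw string.
    // That constructor only parses; it does not escape anything.
    static boolean singleArgRejects(String raw) {
        try {
            new URI(raw);
            return false;
        } catch (URISyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // prints http://search.barnesandnoble.com/booksearch/first%20book.pdf
        System.out.println(fromComponents(
            "search.barnesandnoble.com", "/booksearch/first book.pdf"));

        // prints true: the unescaped space makes the string an invalid URI
        System.out.println(singleArgRejects(
            "http://search.barnesandnoble.com/booksearch/first book.pdf"));
    }
}
```

This assumes the URL has already been split into host and path. If the user supplies one raw string, it has to be split into components first, since passing it whole to new URI(String) fails as shown.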


The above code escapes only illegal characters; it does NOT escape non-ASCII characters (see fatih's comment).
The toASCIIString method returns a String that contains only US-ASCII characters:

    URI uri = new URI(
        "http",
        "search.barnesandnoble.com",
        "/booksearch/é",
        null);
    String request = uri.toASCIIString();
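A small sketch of how toString and toASCIIString differ on the same URI. The class name AsciiUriSketch and the helper pathUri are hypothetical, and the é is written as the escape \u00E9 so the result does not depend on the source file encoding.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class comparing the two string forms of one URI.
final class AsciiUriSketch {

    // Wraps the four-argument constructor (scheme, host, path, fragment).
    static URI pathUri(String host, String path) {
        try {
            return new URI("http", host, path, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI uri = pathUri("search.barnesandnoble.com", "/booksearch/\u00E9");

        // toString keeps the non-ASCII character as it is:
        // http://search.barnesandnoble.com/booksearch/é
        System.out.println(uri.toString());

        // toASCIIString percent-encodes its UTF-8 bytes:
        // http://search.barnesandnoble.com/booksearch/%C3%A9
        System.out.println(uri.toASCIIString());
    }
}
```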

For a URL with a query, like http://www.google.com/ig/api?weather=São Paulo, use the 5-parameter version of the constructor:

    URI uri = new URI(
        "http",
        "www.google.com",
        "/ig/api",
        "weather=São Paulo",
        null);
    String request = uri.toASCIIString();
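The query case can be sketched as a small helper. QueryUriSketch and request are made-up names, and the second call uses a made-up query string to show one limitation: characters that are legal in a query, such as & and =, are left untouched, so a parameter value containing a literal & would have to be encoded separately before being passed in.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class for the five-argument constructor
// (scheme, authority, path, query, fragment).
final class QueryUriSketch {

    // Returns the request string restricted to US-ASCII characters.
    static String request(String host, String path, String query) {
        try {
            return new URI("http", host, path, query, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // prints http://www.google.com/ig/api?weather=S%C3%A3o%20Paulo
        System.out.println(request("www.google.com", "/ig/api", "weather=S\u00E3o Paulo"));

        // '&' and '=' are legal in a query, so only the space is escaped:
        // prints http://www.google.com/ig/api?q=a&b=c%20d
        System.out.println(request("www.google.com", "/ig/api", "q=a&b=c d"));
    }
}
```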
user85421 answered Sep 19 '22 15:09