 

Java URL encoding: URLEncoder vs. URI


The W3Schools URL encoding page says that @ should be encoded as %40 and that a space should be encoded as %20.

I've tried both URLEncoder and URI, but neither does the above properly:

import java.net.URI;
import java.net.URLEncoder;

public class Test {
    public static void main(String[] args) throws Exception {

        // Prints me%40home.com (CORRECT)
        System.out.println(URLEncoder.encode("me@home.com", "UTF-8"));

        // Prints Email+Address (WRONG: Should be Email%20Address)
        System.out.println(URLEncoder.encode("Email Address", "UTF-8"));

        // http://www.home.com/test?Email%20Address=me@home.com
        // (WRONG: it has not encoded the @ in the email address)
        URI uri = new URI("http", "www.home.com", "/test", "Email Address=me@home.com", null);
        System.out.println(uri.toString());
    }
}

For some reason, URLEncoder encodes the email address correctly but not spaces, and URI encodes spaces correctly but not email addresses.

How should I encode these 2 parameters to be consistent with what w3schools says is correct (or is w3schools wrong?)

John Farrelly asked Jan 14 '13 15:01


People also ask

What is URLEncoder encode in Java?

public class URLEncoder extends Object. Utility class for HTML form encoding. This class contains static methods for converting a String to the application/x-www-form-urlencoded MIME format. For more information about HTML form encoding, consult the HTML specification.
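As a minimal sketch of what form encoding means in practice, the following wraps URLEncoder and URLDecoder and shows a space becoming + rather than %20. The class name FormEncodingDemo and the helpers formEncode and formDecode are hypothetical names chosen for illustration; only URLEncoder.encode and URLDecoder.decode come from the JDK.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class; the names here are illustrative, not from the JDK.
final class FormEncodingDemo {

    // Encodes a value as application/x-www-form-urlencoded using UTF-8.
    static String formEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException("UTF-8 not supported", e);
        }
    }

    // Reverses formEncode: '+' becomes a space and %XX becomes the byte it names.
    static String formDecode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 not supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(formEncode("Email Address")); // Email+Address
        System.out.println(formEncode("me@home.com"));   // me%40home.com
        System.out.println(formDecode("Email+Address")); // Email Address
    }
}
```

A receiver that decodes form data will read + back as a space, so the two sides only disagree when the receiver expects %20 specifically.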

What is the difference between encodeURI and encodeURIComponent?

encodeURI is used to encode a full URL. Whereas encodeURIComponent is used for encoding a URI component such as a query string. There are 11 characters which are not encoded by encodeURI , but encoded by encodeURIComponent .

Should I encode URL parameters?

Why do we need to encode? A URL may contain only a limited set of characters from the standard 128-character ASCII set. Any character outside that set, and any reserved character used as literal data, must be percent-encoded before it is placed in a URL.

What is the difference between Htmlencode and Urlencode?

HTML encoding turns a character such as the less-than sign (<) into "&lt;", its encoded representation. URL encoding does the same for URLs, where the set of special characters is different, although there is some overlap.


1 Answer

Although I think the answer from @fge is the right one, I was using a third-party web service that relied on the encoding outlined in the W3Schools article, so I followed the answer to "Java equivalent to JavaScript's encodeURIComponent that produces identical output?":

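import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Mimics JavaScript's encodeURIComponent on top of URLEncoder.
public static String encodeURIComponent(String s) {
    String result;

    try {
        result = URLEncoder.encode(s, "UTF-8")
                .replaceAll("\\+", "%20")  // URLEncoder writes a space as '+'
                .replaceAll("\\%21", "!")  // restore the characters that
                .replaceAll("\\%27", "'")  // encodeURIComponent leaves as-is
                .replaceAll("\\%28", "(")
                .replaceAll("\\%29", ")")
                .replaceAll("\\%7E", "~");
    } catch (UnsupportedEncodingException e) {
        // UTF-8 is always supported, so this fallback should never run
        result = s;
    }

    return result;
}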
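For illustration, here is a self-contained sketch of how the method above could be applied to the two parameters from the question. It is a variant rather than the exact code above: it uses String.replace (literal matching) instead of replaceAll, and it throws instead of returning the unencoded input if UTF-8 is somehow unavailable. The class name QueryStringDemo and the helper buildUrl are hypothetical names introduced for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class; the names here are illustrative only.
final class QueryStringDemo {

    // Variant of the answer's method: same replacements, literal matching.
    static String encodeURIComponent(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8")
                    .replace("+", "%20")  // a literal '+' in the input is already %2B here
                    .replace("%21", "!")
                    .replace("%27", "'")
                    .replace("%28", "(")
                    .replace("%29", ")")
                    .replace("%7E", "~");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException("UTF-8 not supported", e);
        }
    }

    // Assumed helper: joins one name/value pair onto a base URL that has no query yet.
    static String buildUrl(String base, String name, String value) {
        return base + "?" + encodeURIComponent(name) + "=" + encodeURIComponent(value);
    }

    public static void main(String[] args) {
        // Prints http://www.home.com/test?Email%20Address=me%40home.com
        System.out.println(buildUrl("http://www.home.com/test", "Email Address", "me@home.com"));
    }
}
```

Encoding the name and the value separately, then joining them with = by hand, keeps the separator itself from being escaped.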
John Farrelly answered Oct 14 '22 20:10