I've been experimenting with various bits of Java code trying to come up with something that will encode a string containing quotes, spaces and "exotic" Unicode characters and produce output that's identical to JavaScript's encodeURIComponent function.
My torture test string is: "A" B ± "
If I enter the following JavaScript statement in Firebug:
encodeURIComponent('"A" B ± "');
Then I get:
"%22A%22%20B%20%C2%B1%20%22"
Here's my little test Java program:
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class EncodingTest {
    public static void main(String[] args) throws UnsupportedEncodingException {
        String s = "\"A\" B ± \"";
        System.out.println("URLEncoder.encode returns " + URLEncoder.encode(s, "UTF-8"));
        System.out.println("getBytes returns " + new String(s.getBytes("UTF-8"), "ISO-8859-1"));
    }
}
This program outputs:
URLEncoder.encode returns %22A%22+B+%C2%B1+%22
getBytes returns "A" B ± "
Close, but no cigar! What is the best way of encoding a UTF-8 string using Java so that it produces the same output as JavaScript's encodeURIComponent?
EDIT: I'm using Java 1.4 moving to Java 5 shortly.
encodeURIComponent should be used to encode a URI Component - a string that is supposed to be part of a URL. encodeURI should be used to encode a URI or an existing URL.
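The difference between encodeURI and encodeURIComponent lies in which characters they leave unescaped. encodeURI leaves the characters that give a URI its structure ( ; / ? : @ & = + $ , # ) alone, so the scheme ('http://'), host and path separators of a complete URL survive. encodeURIComponent escapes those characters as well and leaves only letters, digits and - _ . ! ~ * ' ( ) unescaped, which makes it the right choice for a single query parameter or path segment.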
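To make the difference concrete, here is a minimal Java sketch that percent-encodes the UTF-8 bytes of a string while keeping a configurable set of characters. The class name UriEscapeSketch and its members are hypothetical, not part of any library, and the code assumes Java 7 or later for StandardCharsets. It only approximates the JavaScript functions: for example, JavaScript throws a URIError on lone surrogates, while this sketch does not.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: percent-encodes every UTF-8 byte of the input
// except the ASCII characters listed in `keep`.
final class UriEscapeSketch {
    private static final String ALNUM =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
    // Left unescaped by encodeURIComponent: alphanumerics plus - _ . ! ~ * ' ( )
    static final String COMPONENT_KEEP = ALNUM + "-_.!~*'()";
    // encodeURI additionally leaves the URI reserved characters and '#'.
    static final String URI_KEEP = COMPONENT_KEEP + ";/?:@&=+$,#";

    static String escape(String s, String keep) {
        StringBuilder out = new StringBuilder();
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < bytes.length; i++) {
            int b = bytes[i] & 0xFF;
            if (b < 0x80 && keep.indexOf((char) b) >= 0) {
                out.append((char) b); // kept as a literal character
            } else {
                // Everything else becomes %XX with uppercase hex digits.
                out.append('%');
                out.append(Character.toUpperCase(Character.forDigit(b >> 4, 16)));
                out.append(Character.toUpperCase(Character.forDigit(b & 0xF, 16)));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String url = "http://example.com/a b?q=x&y";
        System.out.println(escape(url, URI_KEEP));       // http://example.com/a%20b?q=x&y
        System.out.println(escape(url, COMPONENT_KEEP)); // http%3A%2F%2Fexample.com%2Fa%20b%3Fq%3Dx%26y
    }
}
```

With the component set, the torture string from the question comes out as %22A%22%20B%20%C2%B1%20%22, matching the Firebug output.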
decodeURI() takes a string produced by encodeURI() and returns the decoded string. decodeURIComponent() takes a string produced by encodeURIComponent() and returns the decoded string.
Simply put, URL encoding replaces characters that are not allowed in a URL with %XX escapes (typically of their UTF-8 bytes), so that the result adheres to the spec and can be correctly understood and interpreted.
This is the class I came up with in the end:
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

/**
 * Utility class for JavaScript compatible UTF-8 encoding and decoding.
 *
 * @see http://stackoverflow.com/questions/607176/java-equivalent-to-javascripts-encodeuricomponent-that-produces-identical-output
 * @author John Topley
 */
public class EncodingUtil {

    /**
     * Decodes the passed UTF-8 String using an algorithm that's compatible with
     * JavaScript's <code>decodeURIComponent</code> function. Returns
     * <code>null</code> if the String is <code>null</code>. Note that
     * URLDecoder also decodes "+" as a space, which JavaScript does not.
     *
     * @param s The UTF-8 encoded String to be decoded
     * @return the decoded String
     */
    public static String decodeURIComponent(String s) {
        if (s == null) {
            return null;
        }
        String result = null;
        try {
            result = URLDecoder.decode(s, "UTF-8");
        }
        // This exception should never occur.
        catch (UnsupportedEncodingException e) {
            result = s;
        }
        return result;
    }

    /**
     * Encodes the passed String as UTF-8 using an algorithm that's compatible
     * with JavaScript's <code>encodeURIComponent</code> function. Returns
     * <code>null</code> if the String is <code>null</code>.
     *
     * @param s The String to be encoded
     * @return the encoded String
     */
    public static String encodeURIComponent(String s) {
        // URLEncoder.encode throws a NullPointerException for null input,
        // so honour the documented contract explicitly.
        if (s == null) {
            return null;
        }
        String result = null;
        try {
            result = URLEncoder.encode(s, "UTF-8")
                    .replaceAll("\\+", "%20")
                    .replaceAll("\\%21", "!")
                    .replaceAll("\\%27", "'")
                    .replaceAll("\\%28", "(")
                    .replaceAll("\\%29", ")")
                    .replaceAll("\\%7E", "~");
        }
        // This exception should never occur.
        catch (UnsupportedEncodingException e) {
            result = s;
        }
        return result;
    }

    /**
     * Private constructor to prevent this class from being instantiated.
     */
    private EncodingUtil() {
        super();
    }
}
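One difference in the decode direction is worth illustrating: URLDecoder treats "+" as an encoded space, whereas JavaScript's decodeURIComponent leaves a literal "+" alone. The sketch below is one possible way around that: it escapes plus signs before handing the string to URLDecoder. The class name DecodeSketch is hypothetical, and the code assumes Java 5 or later for String.replace with string arguments. Malformed escapes still surface as IllegalArgumentException rather than a JavaScript URIError.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper illustrating the "+" handling only.
final class DecodeSketch {
    static String decodeURIComponent(String s) {
        if (s == null) {
            return null;
        }
        try {
            // Protect literal '+' so URLDecoder does not turn it into a space.
            return URLDecoder.decode(s.replace("+", "%2B"), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException("UTF-8 not supported", e);
        }
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        System.out.println(URLDecoder.decode("a+b%20c", "UTF-8")); // a b c
        System.out.println(decodeURIComponent("a+b%20c"));         // a+b c
    }
}
```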
Looking at the implementation differences, I see that:
MDC on encodeURIComponent(): the characters left as literals (as a regex character class) are [-a-zA-Z0-9._*~'()!]
Java 1.5.0 documentation on URLEncoder: the characters left as literals are [-a-zA-Z0-9._*], and the space character " " is converted into a plus sign "+".
So basically, to get the desired result, use URLEncoder.encode(s, "UTF-8") and then do some post-processing:
replace "+" with "%20";
replace "%xx" representing any of [~'()!] back to their literal counterparts.
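The two post-processing steps above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the class name PostProcessSketch and the RESTORE table are hypothetical, and the code assumes Java 5 or later for String.replace with string arguments. It relies on URLEncoder emitting uppercase hex digits, as the output in the question shows.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper showing the post-processing recipe step by step.
final class PostProcessSketch {
    // Characters that encodeURIComponent keeps literal but URLEncoder escapes.
    private static final char[] RESTORE = {'~', '\'', '(', ')', '!'};

    static String encode(String s) {
        String out;
        try {
            out = URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException("UTF-8 not supported", e);
        }
        // Step 1: URLEncoder writes a space as "+". A literal "+" in the
        // input is already "%2B" here, so every remaining "+" is a space.
        out = out.replace("+", "%20");
        // Step 2: turn the escapes for ~ ' ( ) ! back into literals.
        for (int i = 0; i < RESTORE.length; i++) {
            String escaped = "%" + Integer.toHexString(RESTORE[i]).toUpperCase();
            out = out.replace(escaped, String.valueOf(RESTORE[i]));
        }
        return out;
    }

    public static void main(String[] args) {
        // \u00B1 is the plus-minus sign from the torture string.
        System.out.println(encode("\"A\" B \u00B1 \"")); // %22A%22%20B%20%C2%B1%20%22
    }
}
```

A percent sign in the encoded string only ever starts a two-digit escape, so the plain string replacements cannot match across escape boundaries.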