URL to URI encoding changes a "%3D" to "%253D"

Tags: java, url, uri, encoding

I'm having trouble encoding a URL to a URI:

mUrl = "A string url that needs to be encoded for use in a new HttpGet()";
URL url = new URL(mUrl);
URI uri = new URI(url.getProtocol(), url.getAuthority(), url.getPath(), 
    url.getQuery(), null);

This does not do what I expect for the following URL:

Passing in the String:

http://m.bloomingdales.com/img?url=http%3A%2F%2Fimages.bloomingdales.com%2Fis%2Fimage%2FBLM%2Fproducts%2F3%2Foptimized%2F1140443_fpx.tif%3Fwid%3D52%26qlt%3D90%2C0%26layer%3Dcomp%26op_sharpen%3D0%26resMode%3Dsharp2%26op_usm%3D0.7%2C1.0%2C0.5%2C0%26fmt%3Djpeg&ttl=30d

Comes out as:

http://m.bloomingdales.com/img?url=http%253A%252F%252Fimages.bloomingdales.com%252Fis%252Fimage%252FBLM%252Fproducts%252F3%252Foptimized%252F1140443_fpx.tif%253Fwid%253D52%2526qlt%253D90%252C0%2526layer%253Dcomp%2526op_sharpen%253D0%2526resMode%253Dsharp2%2526op_usm%253D0.7%252C1.0%252C0.5%252C0%2526fmt%253Djpeg&ttl=30d

Which is broken. For example, the %3D is turned into %253D. It seems to be doing something mysterious to the %'s already in the string.

What's going on and what am I doing wrong here?

984

asked Feb 01 '11 01:02

cottonBallPaws

3 Answers

You are first putting the (already-escaped) string into the URL class. That doesn't escape anything. Then you are pulling out sections of the URL, which returns them without any further processing (so they are still escaped, since they were escaped when you put them in). Finally, you are putting the sections into the URI class, using the multi-argument constructor. This constructor is specified as percent-encoding the URI components it is given.

Therefore, it is in this final step that, for example, " " becomes "%20" (good) and "%3A" becomes "%253A" (bad), because the multi-argument constructors always quote the percent character. Since you are putting in URLs which are already-encoded*, you don't want to encode them again.

Therefore, the single-argument constructor of URI is your friend. It doesn't escape anything, and requires that you pass a pre-escaped string. Hence, you don't need URL at all:

mUrl = "A string url is already percent-encoded for use in a new HttpGet()";
URI uri = new URI(mUrl);
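A minimal sketch of the difference between the two constructors, assuming a shortened stand-in URL on example.com rather than the one from the question. The class and method names are made up for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

class DoubleEncodingDemo {
    // Mirrors the question: parse with URL, then rebuild with the
    // multi-argument URI constructor, which quotes every '%' it is given.
    static URI viaMultiArg(String encoded) throws Exception {
        URL url = new URL(encoded); // deprecated in recent JDKs, still functional
        return new URI(url.getProtocol(), url.getAuthority(), url.getPath(),
                url.getQuery(), null);
    }

    // Mirrors the answer: the single-argument constructor only parses
    // the string and leaves existing escapes alone.
    static URI viaSingleArg(String encoded) throws URISyntaxException {
        return new URI(encoded);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical, already percent-encoded input.
        String encoded =
                "http://example.com/img?url=http%3A%2F%2Fexample.org%2Fa.tif&ttl=30d";
        System.out.println(viaMultiArg(encoded));
        // http://example.com/img?url=http%253A%252F%252Fexample.org%252Fa.tif&ttl=30d
        System.out.println(viaSingleArg(encoded));
        // http://example.com/img?url=http%3A%2F%2Fexample.org%2Fa.tif&ttl=30d
    }
}
```

The second result can be passed to new HttpGet(uri) unchanged, which is what the question was after.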

*The only problem is if your URLs are sometimes not percent-encoded, and sometimes they are. Then you have a bigger problem. You need to decide whether your program is starting out with a URL which is always encoded, or one which needs to be encoded.

Note that there is no such thing as a full URL which is not percent-encoded. For example, you can't take the full URL "http://example.com/bob&co" and somehow turn it into the properly-encoded URL "http://example.com/bob%26co" -- how can you tell the difference between the syntax (which shouldn't be escaped) and the characters (which should)? This is why the single-argument form of URI requires that strings are already-escaped. If you have unescaped strings, you need to percent-encode them before inserting them into the full URL syntax, and that is what the multi-argument constructor of URI helps you do.
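For the opposite case, where the parts really are unescaped, here is a small sketch of the multi-argument constructor doing the encoding. The host, path and query values are invented for illustration, and the helper name is hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

class BuildFromRawPartsDemo {
    // The components are passed in unescaped; the constructor quotes
    // characters that are illegal in each component, and always quotes '%'.
    static URI build(String host, String path, String query) throws URISyntaxException {
        return new URI("http", host, path, query, null);
    }

    public static void main(String[] args) throws URISyntaxException {
        System.out.println(build("foo.com", "/hello world/", "q=50% off"));
        // http://foo.com/hello%20world/?q=50%25%20off
    }
}
```

Because the syntax (scheme, host, the "?" separator) is supplied separately from the data, the constructor never has to guess which characters are delimiters.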

Edit: I missed the fact that the original code discards the fragment. If you want to remove the fragment (or any other part) of the URL, you can construct the URI as above, then pull all the parts out as required (they will be decoded into regular strings), then pass them back into the URI multi-argument constructor (where they will be re-encoded as URI components):

uri = new URI(uri.getScheme(), uri.getUserInfo(), uri.getHost(), uri.getPort(),
              uri.getPath(), uri.getQuery(), null);  // Remove fragment
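One caveat worth checking before relying on this round trip: getQuery() returns the decoded query, and the constructor only re-quotes characters that are illegal in a query, so escapes such as %26 or %3D come back as plain & and =. For a URL like the one in the question, where a whole URL is nested inside a parameter, that appears to change the meaning. The sketch below compares the round trip with a plain string cut at the '#'. The input URL and the names are assumptions for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

class FragmentRemovalDemo {
    // The round trip described above: decode the parts, then re-encode them.
    static URI roundTrip(URI uri) throws URISyntaxException {
        return new URI(uri.getScheme(), uri.getUserInfo(), uri.getHost(), uri.getPort(),
                uri.getPath(), uri.getQuery(), null);
    }

    // Alternative sketch: cut the string at '#', which in a valid URI can
    // only be the fragment delimiter, so no decoding or re-encoding happens.
    static URI stripFragment(URI uri) throws URISyntaxException {
        String s = uri.toString();
        int hash = s.indexOf('#');
        return hash < 0 ? uri : new URI(s.substring(0, hash));
    }

    public static void main(String[] args) throws URISyntaxException {
        // Hypothetical input with a nested, encoded URL and a fragment.
        URI in = new URI(
                "http://example.com/img?url=http%3A%2F%2Fexample.org%3Fa%3D1%26b%3D2&ttl=30d#top");
        System.out.println(roundTrip(in));
        // http://example.com/img?url=http://example.org?a=1&b=2&ttl=30d
        System.out.println(stripFragment(in));
        // http://example.com/img?url=http%3A%2F%2Fexample.org%3Fa%3D1%26b%3D2&ttl=30d
    }
}
```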

144

answered Oct 06 '22 04:10

mgiuca

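%3D means = (the equals sign).

And

%253D means %3D: %25 is the percent-encoding of the % character itself (hex 25), followed by the literal characters 3D.

So %253D is %3D encoded a second time, and it takes two rounds of decoding to get back to =.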
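The two layers can be seen by decoding one step at a time. This is a sketch using java.net.URLDecoder, which is meant for form data (it also turns "+" into a space), so it is used here only to illustrate the %25 layer. The class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class PercentDecodeDemo {
    // Removes exactly one layer of percent-encoding.
    static String decodeOnce(String s) throws UnsupportedEncodingException {
        return URLDecoder.decode(s, "UTF-8");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        System.out.println(decodeOnce("%253D")); // %3D
        System.out.println(decodeOnce("%3D"));   // =
    }
}
```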

answered Oct 06 '22 06:10

Sarat Patel

The URL class didn't decode the %-sequences when it parsed the URL, but the URI class is encoding them (again). Use URI to parse the URL string.

Javadocs:

http://download.oracle.com/javase/6/docs/api/java/net/URL.html

The URL class does not itself encode or decode any URL components according to the escaping mechanism defined in RFC2396. It is the responsibility of the caller to encode any fields, which need to be escaped prior to calling URL, and also to decode any escaped fields, that are returned from URL. Furthermore, because URL has no knowledge of URL escaping, it does not recognise equivalence between the encoded or decoded form of the same URL. For example, the two URLs:

http://foo.com/hello world/ and http://foo.com/hello%20world

would be considered not equal to each other. Note, the URI class does perform escaping of its component fields in certain circumstances.

The recommended way to manage the encoding and decoding of URLs is to use URI, and to convert between these two classes using toURI() and URI.toURL().
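A short sketch of that conversion, assuming the input string is already percent-encoded. The example URL and the helper names are made up. Neither direction re-encodes anything, so the string survives the round trip unchanged.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

class UrlUriConversionDemo {
    // Parse with URI first, then convert; no escaping is applied.
    static URL toUrl(String encoded) throws Exception {
        return new URI(encoded).toURL();
    }

    // URL.toURI() re-parses the URL's string form as a URI.
    static URI toUri(URL url) throws URISyntaxException {
        return url.toURI();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical, already percent-encoded input.
        String encoded = "http://example.com/img?url=http%3A%2F%2Fexample.org&ttl=30d";
        URL url = toUrl(encoded);
        System.out.println(toUri(url));
        // http://example.com/img?url=http%3A%2F%2Fexample.org&ttl=30d
    }
}
```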

answered Oct 06 '22 05:10

Bert F
