
Java library for URL encoding if necessary (like a browser)

Tags: java, urlencode

If I enter the URL http://localhost:9000/space test in a web browser's address bar, the browser calls the server with http://localhost:9000/space%20test. Similarly, http://localhost:9000/specÁÉÍtest is encoded to http://localhost:9000/spec%C3%81%C3%89%C3%8Dtest.

If I enter already-encoded URLs in the address bar (i.e. http://localhost:9000/space%20test and http://localhost:9000/spec%C3%81%C3%89%C3%8Dtest), they remain the same (they are not double-encoded).

Is there a Java API or library that does this encoding? The URLs come from the user, so I don't know whether they are already encoded.

(If there isn't, would it be enough to search for % in the input string and encode only if it isn't found, or is there a special case where this would not work?)

Edit:

URLEncoder.encode("space%20test", "UTF-8") returns space%2520test, which is not what I want because it is double-encoded.

Edit 2:

Furthermore, browsers handle partially encoded URLs, such as http://localhost:9000/specÁÉ%C3%8Dtest, correctly, without double-encoding them. In this case the server receives http://localhost:9000/spec%C3%81%C3%89%C3%8Dtest, which is the same as the encoded form of ...specÁÉÍtest.

palacsint asked Jan 16 '13 11:01



1 Answer

What every web developer must know about URL encoding

Url Encoding Explained

Why do I need URL encoding?

The URL specification RFC 1738 allows only a small set of characters to appear
unencoded in a URL (apart from reserved characters used for their reserved
purpose). Those characters are:

A to Z (ABCDEFGHIJKLMNOPQRSTUVWXYZ)
a to z (abcdefghijklmnopqrstuvwxyz)
0 to 9 (0123456789)
$ (Dollar Sign)
- (Hyphen / Dash)
_ (Underscore)
. (Period)
+ (Plus sign)
! (Exclamation / Bang)
* (Asterisk / Star)
' (Single Quote)
( (Open Bracket)
) (Closing Bracket)
, (Comma)

How does URL encoding work?

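Each offending character is replaced by a % followed by a two-digit hexadecimal
value for each byte of the character in the chosen character encoding
(originally an ISO character set; the URLs in the question use UTF-8, where Á
becomes %C3%81). Here are a couple of examples: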

$ (Dollar Sign) becomes %24
& (Ampersand) becomes %26
+ (Plus) becomes %2B
, (Comma) becomes %2C
: (Colon) becomes %3A
; (Semi-Colon) becomes %3B
= (Equals) becomes %3D
? (Question Mark) becomes %3F
@ (Commercial A / At) becomes %40
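
The mapping above can be reproduced in a few lines of Java. This is only a minimal sketch, assuming UTF-8 as the encoding; the class and method names (PercentEncodeDemo, percentEncodeAll) are made up for illustration, and the method deliberately encodes every byte, including characters that would normally be left alone.

```java
import java.nio.charset.StandardCharsets;

class PercentEncodeDemo {
    /** Percent-encodes every UTF-8 byte of the input, one %XX triplet per byte. */
    static String percentEncodeAll(String s) {
        StringBuilder out = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            // Mask to 0..255 so that negative bytes still format as two hex digits.
            out.append(String.format("%%%02X", b & 0xFF));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(percentEncodeAll("$"));      // %24
        System.out.println(percentEncodeAll("@"));      // %40
        System.out.println(percentEncodeAll("\u00C1")); // Á -> %C3%81 (two UTF-8 bytes)
    }
}
```

A non-ASCII character such as Á produces one triplet per byte, which is why the browser in the question sends %C3%81 rather than a single %XX.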

Simple Example:

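import java.util.logging.Level;
import java.util.logging.Logger;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

public class TextHelper {
    // Null on JDKs without a bundled JavaScript engine
    // (Nashorn was removed in Java 15).
    private static final ScriptEngine engine = new ScriptEngineManager()
        .getEngineByName("JavaScript");

    /**
     * Escapes a string with JavaScript's escape(), then restores the
     * URL delimiters %$&+,/:;=?@<>#.
     * Note: escape() emits Latin-1 %XX and %uXXXX sequences, not UTF-8 bytes.
     *
     * @param str the string to encode
     * @return the encoded result, or null if the script fails
     */
    public static String escapeJavascript(String str) {
        try {
            // Pass the input as a binding instead of splicing it into the
            // script, so quotes and backslashes cannot break the expression.
            engine.put("input", str.replace("%20", " "));
            // replace() is literal; replaceAll("%24", "$") would throw on a
            // match because "$" is a group reference in a regex replacement.
            return engine.eval("escape(input)").toString()
                    .replace("%3A", ":")
                    .replace("%2F", "/")
                    .replace("%3B", ";")
                    .replace("%40", "@")
                    .replace("%3C", "<")
                    .replace("%3E", ">")
                    .replace("%3D", "=")
                    .replace("%26", "&")
                    .replace("%24", "$")
                    .replace("%23", "#")
                    .replace("%2B", "+")
                    .replace("%2C", ",")
                    .replace("%3F", "?")
                    // Restore % last so that escapes already present in the
                    // input (e.g. %3F) are not decoded by the lines above.
                    .replace("%25", "%");
        } catch (ScriptException ex) {
            Logger.getLogger(TextHelper.class.getName())
                .log(Level.SEVERE, null, ex);
            return null;
        }
    }
}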
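
For comparison, the browser-like behaviour described in the question (keep existing %XX escapes, encode everything else as UTF-8) can also be sketched in plain Java without a script engine. This is a hedged illustration, not a standard API: the class name LenientUrlEncoder, the method encodeIfNecessary, and the KEEP character set are assumptions made for this example, and real browsers apply slightly different rules to each part of a URL.

```java
import java.nio.charset.StandardCharsets;

class LenientUrlEncoder {
    // Assumption: characters copied through untouched (the RFC 3986
    // unreserved and reserved sets). Browsers vary this per URL component.
    private static final String KEEP =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
        + "-._~!$&'()*+,;=:@/?#[]";

    static String encodeIfNecessary(String url) {
        StringBuilder out = new StringBuilder(url.length());
        int i = 0;
        while (i < url.length()) {
            int cp = url.codePointAt(i);
            if (cp == '%' && isHex(url, i + 1) && isHex(url, i + 2)) {
                // Existing %XX escape: copy it as-is so it is not double-encoded.
                out.append(url, i, i + 3);
                i += 3;
                continue;
            }
            if (cp < 128 && KEEP.indexOf(cp) >= 0) {
                out.append((char) cp);
            } else {
                // Spaces, stray '%' signs and non-ASCII characters become
                // one %XX triplet per UTF-8 byte.
                byte[] bytes = new String(Character.toChars(cp))
                    .getBytes(StandardCharsets.UTF_8);
                for (byte b : bytes) {
                    out.append(String.format("%%%02X", b & 0xFF));
                }
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    private static boolean isHex(String s, int index) {
        return index < s.length()
            && "0123456789ABCDEFabcdef".indexOf(s.charAt(index)) >= 0;
    }

    public static void main(String[] args) {
        // Prints http://localhost:9000/space%20test
        System.out.println(encodeIfNecessary("http://localhost:9000/space test"));
        // Prints http://localhost:9000/spec%C3%81%C3%89%C3%8Dtest
        System.out.println(encodeIfNecessary("http://localhost:9000/spec\u00C1\u00C9%C3%8Dtest"));
    }
}
```

Checking each % individually, rather than searching the whole string for %, is what lets partially encoded input work: a % that is not followed by two hex digits (as in "100%") is encoded to %25, while a valid escape is passed through.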
Veniamin answered Sep 18 '22 05:09