
URL Encoding with httpclient

I have a list of URLs whose content I need to fetch. The URLs contain special characters and therefore need to be encoded. I use Commons HttpClient to get the content.

When I use:

GetMethod get = new GetMethod(url);

I get an Invalid "illegal escape character" exception. When I use

 GetMethod get = new GetMethod();
 get.setURI(new URI(url.toString(), false, "UTF-8"));

I get a 404 when trying to fetch the page, because a space is turned into %2520 instead of just %20.

I've seen many posts about this problem, and most of them advise building the URI part by part. The problem is that this is a given list of URLs, not one that I can handle manually.

Is there any other solution for this problem?

Thanks.

Asked by user1251654 on Jul 26 '12 at 10:07


2 Answers

What if you create a new URL object from its string, e.g. URL urlObject = new URL(url), then call urlObject.getQuery() and urlObject.getPath() to split it correctly, parse the query parameters into a List or a Map, and send them as shown in the code that follows?

EDIT: I just found out that the HttpClient library has a URLEncodedUtils.parse() method, which fits easily into that code. I have edited the code to use it; however, it is untested.

With Apache HttpClient it would be something like:

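import java.net.URI;
import java.util.List;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.NameValuePair;
import org.apache.http.client.HttpClient;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.client.utils.URLEncodedUtils;
import org.apache.http.impl.client.DefaultHttpClient;

// java.net.URI has no (String, String) constructor; new URI(url) needs a validly encoded string.
URI urlObject = new URI(url);
HttpClient httpclient = new DefaultHttpClient();
// Split the query string into decoded name/value pairs, then re-encode them as a form body.
List<NameValuePair> formparams = URLEncodedUtils.parse(urlObject, "UTF-8");
UrlEncodedFormEntity entity = new UrlEncodedFormEntity(formparams, "UTF-8");

// The request target needs the scheme and host, not only the path.
HttpPost httppost = new HttpPost(urlObject.getScheme() + "://" + urlObject.getAuthority() + urlObject.getRawPath());
httppost.setEntity(entity);
httppost.addHeader("Content-Type","application/x-www-form-urlencoded");

HttpResponse response = httpclient.execute(httppost);
HttpEntity entity2 = response.getEntity();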

With Java URLConnection it would be something like:

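import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLEncoder;
import java.util.Map;

// Assumes urlObject is a java.net.URL and that the query params from urlObject.getQuery()
// were already parsed into a Map<String, String> called yourMapOfValues (placeholder name).
try {
    StringBuilder str = new StringBuilder();
    for (Map.Entry<String, String> param : yourMapOfValues.entrySet()) {
        if (str.length() > 0) {
            str.append('&'); // separator between pairs only, not before the first one
        }
        str.append(URLEncoder.encode(param.getKey(), "UTF-8"))
           .append('=')
           .append(URLEncoder.encode(param.getValue(), "UTF-8"));
    }
    // Rebuild the target without the query string; the path alone is not a valid URL.
    URL u = new URL(urlObject.getProtocol(), urlObject.getHost(), urlObject.getPort(), urlObject.getPath());
    URLConnection uc = u.openConnection();
    uc.setDoOutput(true);
    uc.setRequestProperty("Content-Type","application/x-www-form-urlencoded");
    PrintWriter pw = new PrintWriter(uc.getOutputStream());
    pw.print(str);
    pw.close();

    BufferedReader in = new BufferedReader(
            new InputStreamReader(uc.getInputStream()));
    String res = in.readLine();
    in.close();
    // ...
} catch (IOException e) {
    // handle or rethrow
}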
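
For a fixed list of URLs where some entries may already be partly encoded, another option (not the approach above, and it keeps the request a plain GET) is to percent-encode only the characters that are not legal in a URL and to leave existing %XX escapes alone, so that %20 never becomes %2520. The following is a minimal sketch using only the JDK. The names LenientUrlEncoder and encode are hypothetical, and treating every % followed by two hex digits as an existing escape is an assumption that will misfire if such a sequence was meant as literal text.

```java
import java.nio.charset.StandardCharsets;

final class LenientUrlEncoder {
    // Characters that may appear literally in a URI (RFC 3986 unreserved + reserved sets).
    private static final String ALLOWED =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
            + "-._~:/?#[]@!$&'()*+,;=";

    private static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    // Hypothetical helper: encodes illegal characters once and keeps existing escapes.
    static String encode(String url) {
        StringBuilder out = new StringBuilder(url.length() + 16);
        int i = 0;
        while (i < url.length()) {
            int cp = url.codePointAt(i);
            if (cp == '%') {
                // Assumption: "%" plus two hex digits is an escape that is already present.
                boolean existingEscape = i + 2 < url.length()
                        && isHex(url.charAt(i + 1)) && isHex(url.charAt(i + 2));
                out.append(existingEscape ? "%" : "%25");
            } else if (cp < 128 && ALLOWED.indexOf(cp) >= 0) {
                out.append((char) cp);
            } else {
                // Everything else is written as the percent-encoded bytes of its UTF-8 form.
                byte[] utf8 = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
                for (byte b : utf8) {
                    out.append('%').append(String.format("%02X", b & 0xFF));
                }
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

Each entry of the list could then be passed through LenientUrlEncoder.encode(url) before it is handed to new GetMethod(...), which expects an already escaped string. The sketch does not validate the URL structure; it only guarantees that no character is encoded twice.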
Answered by hectorg87 on Nov 14 '22 at 19:11


If you need to manipulate request URIs, it is strongly advisable to use the URIBuilder shipped with Apache HttpClient.
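
URIBuilder itself is not shown here, since it requires the HttpClient 4.x jar. As a stand-in, the same part-by-part idea can be sketched with the JDK's java.net.URI, whose multi-argument constructors quote illegal characters in each component. The class name PartByPartUri and the method build are hypothetical. Note that these constructors always quote the percent character, so the parts passed in must be unencoded; handing them an already encoded path reproduces the %2520 problem from the question.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class PartByPartUri {
    // Hypothetical helper: assembles a URI from unencoded components.
    // java.net.URI quotes characters that are illegal in each component.
    static String build(String scheme, String host, String path, String query)
            throws URISyntaxException {
        // userInfo and fragment are omitted (null); port -1 means "no explicit port".
        return new URI(scheme, null, host, -1, path, query, null).toASCIIString();
    }
}
```

For example, build("http", "example.com", "/my docs/file name.html", "q=a b") yields http://example.com/my%20docs/file%20name.html?q=a%20b, whereas build("http", "example.com", "/a%20b", null) yields http://example.com/a%2520b because the existing % is quoted again.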

Answered by ok2c on Nov 14 '22 at 21:11