Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to parse a URI like this in Java

Tags:

java

uri

parsing

I'm trying to parse the following URI : http://translate.google.com/#zh-CN|en|你

but got this error message :

java.net.URISyntaxException: Illegal character in fragment at index 34: http://translate.google.com/#zh-CN|en|你
        at java.net.URI$Parser.fail(URI.java:2809)
        at java.net.URI$Parser.checkChars(URI.java:2982)
        at java.net.URI$Parser.parse(URI.java:3028)

It's having problem with the "|" character, if I get rid of the "|", the last Chinese char is not causing any problem, what's the right way to handle this ?

My method look like this :

  public static void displayFileOrUrlInBrowser(String File_Or_Url)
  {
    try { Desktop.getDesktop().browse(new URI(File_Or_Url.replace(" ","%20").replace("^","%5E"))); }
    catch (Exception e) { e.printStackTrace(); }
  }

Thanks for the answers, but BalusC's solution seems to work only for an instance of the url, my method needs to work with any url I pass to it, how would it know where's the starting point to cut the url into two parts and only encode the second part ?

like image 967
Frank Avatar asked Dec 01 '09 20:12

Frank


People also ask

What is URI parsing?

URL Parsing. The URL parsing functions focus on splitting a URL string into its components, or on combining URL components into a URL string.

How do you parse a method in Java?

The Period instance can be obtained from a string value using the parse() method in the Period class in Java. This method requires a single parameter i.e. the string which is to be parsed. This string cannot be null. Also, it returns the Period instance obtained from the string value that was passed as a parameter.

What is an URI in Java?

A URI is a uniform resource identifier while a URL is a uniform resource locator. Hence every URL is a URI, abstractly speaking, but not every URI is a URL. This is because there is another subcategory of URIs, uniform resource names (URNs), which name resources but do not specify how to locate them.


1 Answers

You should use java.net.URLEncoder to URL-encode the query with UTF-8. You don't necessarily need regex for this. You don't want to have a regex to cover all of those thousands Chinese glyphs, do you? ;)

String query = URLEncoder.encode("zh-CN|en|你", "UTF-8");
String url = "http://translate.google.com/#" + query;
Desktop.getDesktop().browse(new URI(url));    
like image 152
BalusC Avatar answered Sep 20 '22 18:09

BalusC