
java.net.URI get host with underscores

I ran into strange behavior with URI.getHost():

import java.net.URI;

    URI url = new URI("https://pmi_artifacts_prod.s3.amazonaws.com");
    System.out.println(url.getHost());  // returns null
    URI url2 = new URI("https://s3.amazonaws.com");
    System.out.println(url2.getHost());  // returns s3.amazonaws.com


I want the first url.getHost() to return pmi_artifacts_prod.s3.amazonaws.com, but it returns null. It turns out the problem is the underscores in the domain name. This is a known bug, but what can be done about it, given that I need to work with exactly this host?

proxy asked Feb 17 '15 18:02


3 Answers

The bug is not in Java but in the hostname itself: an underscore is not a valid character in a hostname. Such names are widely used anyway, but java.net.URI refuses to parse them as hosts, so getHost() returns null.

https://en.wikipedia.org/wiki/Hostname#Restrictions_on_valid_hostnames

A possible workaround:

import java.lang.reflect.Field;
import java.net.URI;
import java.net.URISyntaxException;

public static void main(String... a) throws URISyntaxException, NoSuchFieldException, IllegalAccessException {
    URI url = new URI("https://pmi_artifacts_prod.s3.amazonaws.com");
    System.out.println(url.getHost()); // null

    URI uriObj = new URI("https://pmi_artifacts_prod.s3.amazonaws.com");
    if (uriObj.getHost() == null) {
        // Force the private "host" field via reflection. On Java 16+ this needs
        // --add-opens java.base/java.net=ALL-UNNAMED, otherwise setAccessible throws.
        final Field hostField = URI.class.getDeclaredField("host");
        hostField.setAccessible(true);
        hostField.set(uriObj, "pmi_artifacts_prod.s3.amazonaws.com");
    }
    System.out.println(uriObj.getHost()); // pmi_artifacts_prod.s3.amazonaws.com

    URI url2 = new URI("https://s3.amazonaws.com");
    System.out.println(url2.getHost()); // s3.amazonaws.com
}
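
A reflection-free alternative is to read the authority component, which URI still populates when the host part fails to parse. The sketch below assumes the authority has the usual [userinfo@]host[:port] shape; the class AuthorityHost and the helper hostOf are hypothetical names for illustration, not part of the JDK.

```java
import java.net.URI;

class AuthorityHost {
    // Hypothetical helper: use getHost() when it is available, otherwise
    // carve the host out of the raw authority string.
    static String hostOf(URI uri) {
        String host = uri.getHost();
        if (host != null) {
            return host;
        }
        String authority = uri.getRawAuthority();
        if (authority == null) {
            return null; // no authority at all, e.g. a relative or opaque URI
        }
        // Drop an optional "userinfo@" prefix.
        int at = authority.lastIndexOf('@');
        if (at >= 0) {
            authority = authority.substring(at + 1);
        }
        // Drop an optional ":port" suffix (assumed to be digits only).
        int colon = authority.lastIndexOf(':');
        if (colon >= 0 && authority.substring(colon + 1).chars().allMatch(Character::isDigit)) {
            authority = authority.substring(0, colon);
        }
        return authority;
    }

    public static void main(String[] args) {
        System.out.println(hostOf(URI.create("https://pmi_artifacts_prod.s3.amazonaws.com")));
        System.out.println(hostOf(URI.create("http://my_favorite_host:3892")));
        System.out.println(hostOf(URI.create("https://s3.amazonaws.com")));
    }
}
```

This leaves the URI object untouched, so it keeps working where reflective access to java.net internals is blocked; the trade-off is that callers must go through the helper instead of getHost().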
Vurtatoo answered Nov 07 '22 00:11


Note that although

new URI("https://pmi_artifacts_prod.s3.amazonaws.com");

will not throw, and the workaround provided by @Vurtatoo works for this case, that workaround cannot handle a URL such as https://a_b?c={1}

I also found out that

new URI("https://a_b?c={1}")

will throw but

new URI("https://a_b?c=1")

won't.

I'm not sure why that is, but my takeaway is that we should not make any assumptions about the implementation details of the Java URI class. If you have to use Java URI, it's probably better to fork the source code and make the changes you need.
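
A likely explanation for the difference: raw { and } are not legal URI characters, so the constructor rejects the whole string regardless of the host, whereas an underscored host only makes the host component unparseable. The sketch below reproduces the three cases; QueryBraces and tryParse are hypothetical names, and the percent-encoded variant is an added assumption about how such a query could be passed in.

```java
import java.net.URI;
import java.net.URISyntaxException;

class QueryBraces {
    // Hypothetical helper: returns the parsed URI, or null if parsing throws.
    static URI tryParse(String s) {
        try {
            return new URI(s);
        } catch (URISyntaxException e) {
            System.out.println("rejected: " + e.getMessage());
            return null;
        }
    }

    public static void main(String[] args) {
        // Raw braces in the query: the constructor throws.
        System.out.println(tryParse("https://a_b?c={1}"));

        // Only the host is unusual: parsing succeeds, but getHost() is null
        // while getAuthority() still carries the name.
        URI plain = tryParse("https://a_b?c=1");
        System.out.println(plain.getHost() + " / " + plain.getAuthority());

        // Percent-encoding the braces makes the query legal again;
        // getQuery() returns the decoded form.
        URI encoded = tryParse("https://a_b?c=%7B1%7D");
        System.out.println(encoded.getRawQuery() + " -> " + encoded.getQuery());
    }
}
```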

Zidong answered Nov 07 '22 01:11


Underscore support can be added directly to URI by patching its internal character masks via reflection:

import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.net.URI;

public static void main(String[] args) throws Exception {
    patchUriField(35184372088832L, "L_DASH");
    patchUriField(2147483648L, "H_DASH");

    URI s = URI.create("http://my_favorite_host:3892");
    // prints "my_favorite_host"
    System.out.println(s.getHost());
}

// Overwrites a private static final long field of java.net.URI.
// Note: on JDK 12+ Field.class.getDeclaredField("modifiers") throws
// NoSuchFieldException, so this approach only works on older JDKs.
private static void patchUriField(long maskValue, String fieldName)
        throws NoSuchFieldException, IllegalAccessException {
    Field field = URI.class.getDeclaredField(fieldName);

    // Clear the FINAL flag so the static field can be reassigned.
    Field modifiers = Field.class.getDeclaredField("modifiers");
    modifiers.setAccessible(true);
    modifiers.setInt(field, field.getModifiers() & ~Modifier.FINAL);

    field.setAccessible(true);
    field.setLong(null, maskValue);
}
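
The two numeric constants are easier to read as bit masks. As far as I can tell, URI represents an allowed character set as a pair of longs with one bit per ASCII character: a "low" mask for characters 0 to 63 and a "high" mask for characters 64 to 127. The sketch below re-derives both values under that assumption; UriMasks, lowMask and highMask are illustrative re-implementations here, not calls into the JDK.

```java
class UriMasks {
    // Bit mask for a character in the range 0..63 (the "low" half of ASCII).
    static long lowMask(char c) {
        return c < 64 ? 1L << c : 0L;
    }

    // Bit mask for a character in the range 64..127 (the "high" half of ASCII).
    static long highMask(char c) {
        return (c >= 64 && c < 128) ? 1L << (c - 64) : 0L;
    }

    public static void main(String[] args) {
        System.out.println(lowMask('-'));   // '-' is ASCII 45, so bit 45 of the low mask
        System.out.println(highMask('_'));  // '_' is ASCII 95, so bit 31 of the high mask
        System.out.println(highMask('-'));  // '-' has no bit in the high mask
    }
}
```

Read this way, the patch keeps '-' allowed through the low mask and additionally sets the bit for '_' in the high mask.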
Nikita Koksharov answered Nov 07 '22 00:11