
What java library can I use to compare two URLs for equality?

Tags: java, url

This question has been asked here:

  • Comparing URLs with parameters in Java
  • How to compare two URLs in java?

but I'm completely unsatisfied with the answers. I need a way to compare two URLs for equality, ideally without writing it by hand. The library needs to understand that the URLs in each of these pairs are equal:

http://stackoverflow.com
https://stackoverflow.com/

https://stackoverflow.com/questions/ask
https://stackoverflow.com/questions/ask/

http://stackoverflow.com?paramName=
http://stackoverflow.com?paramName

http://stackoverflow.com?paramName1=value1&paramName2=value2
http://stackoverflow.com?paramName2=value2&paramName1=value1

http://stackoverflow.com?param name 1=value 1
http://stackoverflow.com?param%20name%201=value%201

These URLs are not equal:

https://stackoverflow.com/questions/ask
https://stackoverflow.com/questionz/ask

http://stackoverflow.com?paramName1=value1&paramName2=value2
http://stackoverflow.com?paramName1=value1&paramName2=value3

It should also handle other complications like these. Where can I find such a library?

BTW, here is a unit test of this:

import org.junit.Test;

import java.net.URI;
import java.net.URISyntaxException;

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertNotEquals;

public class UriTest {

    @Test
    public void equality() throws URISyntaxException {
        assertUrlsEqual("http://stackoverflow.com", "https://stackoverflow.com/");
        assertUrlsEqual("https://stackoverflow.com/questions/ask", "https://stackoverflow.com/questions/ask/");
        assertUrlsEqual("http://stackoverflow.com?paramName=", "http://stackoverflow.com?paramName");
        assertUrlsEqual("http://stackoverflow.com?paramName1=value1&paramName2=value2", "http://stackoverflow.com?paramName2=value2&paramName1=value1");
        assertUrlsEqual("http://stackoverflow.com?param name 1=value 1", "http://stackoverflow.com?param%20name%201=value%201");
    }

    @Test
    public void notEqual() throws URISyntaxException {
        assertUrlsNotEqual("https://stackoverflow.com/questions/ask", "https://stackoverflow.com/questionz/ask");
        assertUrlsNotEqual("http://stackoverflow.com?paramName1=value1&paramName2=value2", "http://stackoverflow.com?paramName1=value1&paramName2=value3");
    }

    private void assertUrlsNotEqual(String u1, String u2) throws URISyntaxException {
        //...?
    }

    private void assertUrlsEqual(String u1, String u2) throws URISyntaxException {
        //...?
    }

}
Daniel Kaplan asked Aug 16 '13 19:08


1 Answer

java.net.URI compares two URLs without making network requests (unlike java.net.URL, whose equals method may resolve host names), and you can use its normalize method to put the path of an absolute URL into canonical form.
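As a rough illustration of that approach, here is a minimal sketch using only the JDK. The class name UriCompareSketch and the helper sameAfterNormalize are made up for this example; they are not part of any library.

```java
import java.net.URI;

public class UriCompareSketch {

    // Hypothetical helper (not from the answer): parse both strings as
    // java.net.URI, normalize the paths, then compare with URI.equals.
    // Nothing here touches the network.
    static boolean sameAfterNormalize(String a, String b) {
        return URI.create(a).normalize().equals(URI.create(b).normalize());
    }

    public static void main(String[] args) {
        // normalize() removes "." and ".." path segments: prints true
        System.out.println(sameAfterNormalize(
                "https://stackoverflow.com/questions/./ask",
                "https://stackoverflow.com/questions/x/../ask"));

        // Scheme and host are compared case-insensitively: prints true
        System.out.println(sameAfterNormalize(
                "HTTPS://StackOverflow.com/questions/ask",
                "https://stackoverflow.com/questions/ask"));

        // normalize() does not add or strip a trailing slash: prints false
        System.out.println(sameAfterNormalize(
                "https://stackoverflow.com/questions/ask",
                "https://stackoverflow.com/questions/ask/"));
    }
}
```

Note that this covers only part of what the question asks for: a differing scheme (http vs. https) or a trailing slash still makes two URIs unequal.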

There are some problems with your examples:

http://stackoverflow.com?paramName=
http://stackoverflow.com?paramName

http://stackoverflow.com?paramName1=value1&paramName2=value2
http://stackoverflow.com?paramName2=value2&paramName1=value1

Servers are allowed to assign meaning to the order of parameters, and to the presence of an equals sign, so those pairs are not equivalent according to RFC 3986.
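java.net.URI behaves consistently with this: it compares the query as one opaque string. A small sketch to confirm it (the class QueryEqualitySketch and the helper uriEquals are hypothetical names used only here):

```java
import java.net.URI;

public class QueryEqualitySketch {

    // Hypothetical helper: plain URI.equals on the two parsed strings.
    static boolean uriEquals(String a, String b) {
        return URI.create(a).equals(URI.create(b));
    }

    public static void main(String[] args) {
        // "paramName=" and "paramName" are different query strings: prints false
        System.out.println(uriEquals(
                "http://stackoverflow.com?paramName=",
                "http://stackoverflow.com?paramName"));

        // Same parameters in a different order: prints false
        System.out.println(uriEquals(
                "http://stackoverflow.com?paramName1=value1&paramName2=value2",
                "http://stackoverflow.com?paramName2=value2&paramName1=value1"));

        // The query is exposed as a single string, not as parsed parameters
        System.out.println(
                URI.create("http://stackoverflow.com?paramName=").getRawQuery());
    }
}
```

If your application knows that its server ignores parameter order, you would have to parse and compare the parameters yourself; that is an application-level decision, not something the URI syntax guarantees.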

http://stackoverflow.com?param name 1=value 1
http://stackoverflow.com?param%20name%201=value%201

Not all URL libraries will treat these as equivalent, because the first is not a valid URL according to RFC 3986 (it contains unencoded spaces), although most user-agents agree on how to convert the former to the latter.
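With java.net.URI specifically, the single-string constructor rejects the first form outright. One possible way to get from the unencoded parts to the second form is a multi-argument URI constructor, which percent-encodes characters that are illegal in a component. The sketch below assumes you already have the scheme, authority and query as separate, unencoded strings; the class and helper names are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class SpaceEncodingSketch {

    // Hypothetical helper: true if the strict single-argument parser accepts s.
    static boolean parsesStrictly(String s) {
        try {
            new URI(s);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // Hypothetical helper: builds a URI from unencoded parts. The five-argument
    // constructor quotes illegal characters such as spaces; it also always
    // quotes '%', so the inputs must not be percent-encoded already.
    static URI fromParts(String scheme, String authority, String query)
            throws URISyntaxException {
        return new URI(scheme, authority, null, query, null);
    }

    public static void main(String[] args) throws URISyntaxException {
        // Unencoded spaces are illegal in a query: prints false
        System.out.println(parsesStrictly(
                "http://stackoverflow.com?param name 1=value 1"));

        URI built = fromParts("http", "stackoverflow.com", "param name 1=value 1");
        // Prints the percent-encoded form of the URL
        System.out.println(built);

        // The built URI equals the already-encoded one: prints true
        System.out.println(built.equals(
                URI.create("http://stackoverflow.com?param%20name%201=value%201")));
    }
}
```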

Mike Samuel answered Sep 18 '22 13:09