
In Java, what's the best way to read a URL and split it into its parts?

First, I am aware that there are other similar posts, but since mine involves a URL and I am not always sure what my delimiter will be, I feel that I am all right posting my question. My assignment is to make a crude web browser. I have a textField into which the user enters the desired URL. I then obviously have to navigate to that webpage. Here is an example from my teacher of roughly what my code would look like. This is the text I'm supposed to be sending to my socket. Sample URL: http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol

GET /wiki/Hypertext_Transfer_Protocol HTTP/1.1\n
Host: en.wikipedia.org\n
\n

So my question is this: I am going to read in the URL as one complete string, so how do I extract just the "en.wikipedia.org" part and just the extension? I tried this as a test:

    String url = "http://en.wikipedia.org/wiki/Hypertext Transfer Protocol";
    String done = " ";
    String[] hope = url.split(".org");

    for ( int i = 0; i < hope.length; i++)
    {
        done = done + hope[i];
    }
    System.out.println(done);

This just prints out the URL without the ".org" in it. I think I'm on the right track; I am just not sure. Also, I know that websites can have different endings (.org, .com, .edu, etc.), so I am assuming I'll need a few if statements that compensate for the possible different endings. Basically, how do I get the URL into the two parts that I need?

art3m1sm00n asked Feb 25 '13 21:02





2 Answers

The URL class pretty much does this; take a look at the tutorial. For example, given this URL:

http://example.com:80/docs/books/tutorial/index.html?name=networking#DOWNLOADING

This is the kind of information you can expect to obtain:

protocol = http
authority = example.com:80
host = example.com
port = 80
path = /docs/books/tutorial/index.html
query = name=networking
filename = /docs/books/tutorial/index.html?name=networking
ref = DOWNLOADING
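
As a rough sketch of how that listing could be produced, the getters on java.net.URL can be called one by one. The class name UrlParts and the helper describe are made up for this illustration and are not part of any library; the URL is built through URI.create(...).toURL() only because the new URL(String) constructor is deprecated in recent JDKs.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

public class UrlParts {
    // Hypothetical helper: returns the parts of a URL, one "name = value" pair per line.
    static String describe(String spec) {
        URL url;
        try {
            url = URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            // Rethrown unchecked so callers do not need a try/catch for this sketch.
            throw new IllegalArgumentException("Not a usable URL: " + spec, e);
        }
        return "protocol = " + url.getProtocol() + "\n"
             + "authority = " + url.getAuthority() + "\n"
             + "host = " + url.getHost() + "\n"
             + "port = " + url.getPort() + "\n"
             + "path = " + url.getPath() + "\n"
             + "query = " + url.getQuery() + "\n"
             + "filename = " + url.getFile() + "\n"
             + "ref = " + url.getRef() + "\n";
    }

    public static void main(String[] args) {
        System.out.print(describe(
            "http://example.com:80/docs/books/tutorial/index.html?name=networking#DOWNLOADING"));
    }
}
```

Note that getPort() returns -1 when the URL does not name a port explicitly, and getQuery() and getRef() return null when those parts are absent.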
Óscar López answered Sep 22 '22 10:09



This is how you should split your URL parts: http://docs.oracle.com/javase/tutorial/networking/urls/urlInfo.html
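
Applied to the request in the question, a minimal sketch might look like the following. The class RequestBuilder and the method buildGetRequest are hypothetical names chosen for this example, and the "\n" line endings simply mirror the teacher's sample (the HTTP/1.1 specification itself calls for "\r\n").

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

public class RequestBuilder {
    // Hypothetical helper: builds the GET request text for the given URL string.
    static String buildGetRequest(String spec) {
        URL url;
        try {
            url = URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a usable URL: " + spec, e);
        }
        // getFile() is the path plus "?query" when a query is present;
        // an empty result is treated here as a request for the root.
        String target = url.getFile().isEmpty() ? "/" : url.getFile();
        // Line endings copy the sample from the question (an assumption of this sketch).
        return "GET " + target + " HTTP/1.1\n"
             + "Host: " + url.getHost() + "\n"
             + "\n";
    }

    public static void main(String[] args) {
        System.out.print(buildGetRequest(
            "http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol"));
    }
}
```

No per-ending if statements are needed, because the host is taken from the parsed URL rather than found by splitting on ".org" or ".com".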

piokuc answered Sep 23 '22 10:09
