I am trying to get the base URL of a page in Java. I use the JTidy parser to fetch the page title, and that works, but I cannot work out how to get the base URL from a given URL.
I have some URLs as input:
String s1 = "http://staff.unak.is/andy/GameProgramming0910/new_page_2.htm";
String s2 = "http://www.complex.com/pop-culture/2011/04/10-hottest-women-in-fast-and-furious-movies";
From the first string, I want "http://staff.unak.is/andy/GameProgramming0910/" as the base URL, and from the second string, I want "http://www.complex.com/" as the base URL.
I am using this code:
URL url = new URL(s1);
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
InputStream in = conn.getInputStream();
Document doc = new Tidy().parseDOM(in, null);
String titleText = doc.getElementsByTagName("title").item(0).getFirstChild()
.getNodeValue();
I am getting titleText correctly, but could you please let me know how to get the base URL from the URLs given above?
There's nothing in the Android URI class that gives you the base URL directly. As gnuf suggests, you'd have to construct it as the protocol + getHost(). Parsing the string yourself might be easier, and it lets you avoid wrapping everything in a try/catch block.
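The string-parsing route mentioned above could look roughly like the sketch below. The class name StringBaseUrl and the method root are hypothetical, and the code assumes the input contains "://" and does no further validation.

```java
final class StringBaseUrl {
    /** Returns scheme + "://" + authority + "/" using only String operations. */
    static String root(String url) {
        int schemeEnd = url.indexOf("://");
        if (schemeEnd < 0) {
            throw new IllegalArgumentException("No scheme in: " + url);
        }
        int authorityStart = schemeEnd + 3;
        // The authority ends at the first '/', '?' or '#', or at the end of the string.
        int end = url.length();
        for (int i = authorityStart; i < url.length(); i++) {
            char c = url.charAt(i);
            if (c == '/' || c == '?' || c == '#') {
                end = i;
                break;
            }
        }
        return url.substring(0, end) + "/";
    }
}
```

No checked exception is involved, so the caller needs no try/catch, but a malformed input is only caught if it lacks "://".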
The BaseURL interface of the portlet API (since 2.0) defines the basic capabilities of a portlet URL pointing back to the portlet.
In your Java program, you can use a String containing this text to create a URL object: URL myURL = new URL("http://example.com/"); The URL object created above represents an absolute URL. An absolute URL contains all of the information necessary to reach the resource in question.
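Try the java.net.URL class; it will help here:
For the second case, which is the easier one, you could use new URL(s2).getHost(). This returns only the host name ("www.complex.com"), so prepend the protocol and "://" and append "/" to get the base URL.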
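For the first case, you could combine the protocol and host with the path from getPath(), cutting off everything after the last slash ("/"). getPath() is safer than getFile() here because getFile() also includes the query string. Something like this (code not tested):
URL url = new URL(s1);
// Keep everything up to and including the last '/' so the trailing slash survives.
String path = url.getPath().substring(0, url.getPath().lastIndexOf('/') + 1);
// getAuthority() is the host plus the port, if one was given.
String base = url.getProtocol() + "://" + url.getAuthority() + path;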
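To make the idea in this answer easier to try out, here is a self-contained sketch that covers both cases. The class name UrlParts and the methods root and directory are hypothetical names chosen for illustration, and converting MalformedURLException into IllegalArgumentException is a design choice of this sketch rather than anything the answer prescribes.

```java
import java.net.MalformedURLException;
import java.net.URL;

final class UrlParts {
    private static URL parse(String spec) {
        try {
            return new URL(spec);
        } catch (MalformedURLException e) {
            // Rethrow unchecked so callers do not need their own try/catch.
            throw new IllegalArgumentException("Not a valid URL: " + spec, e);
        }
    }

    /** Second case: protocol + authority + "/". */
    static String root(String spec) {
        URL url = parse(spec);
        return url.getProtocol() + "://" + url.getAuthority() + "/";
    }

    /** First case: same as root, plus the path up to and including its last '/'. */
    static String directory(String spec) {
        URL url = parse(spec);
        String path = url.getPath();
        String dir = path.substring(0, path.lastIndexOf('/') + 1);
        return url.getProtocol() + "://" + url.getAuthority() + dir;
    }
}
```

With the inputs from the question, UrlParts.directory(s1) should give "http://staff.unak.is/andy/GameProgramming0910/" and UrlParts.root(s2) should give "http://www.complex.com/".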
You can use the java.net.URL class to resolve relative URLs.
For the first case, remove the filename from the path by resolving "." against the URL:
new URL(new URL(s1), ".").toString()
For the second case, set the root path by resolving "/" against the URL:
new URL(new URL(s2), "/").toString()
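The same relative-resolution trick can be sketched with java.net.URI instead of java.net.URL. This is a swapped-in variant, not the code from the answer: it may be preferable on newer JDKs, where the URL constructors are deprecated (since Java 20). The class name UriBase and its method names are hypothetical.

```java
import java.net.URI;

final class UriBase {
    /** Resolves "." against the URI, which drops the last path segment. */
    static String parentDirectory(String spec) {
        return URI.create(spec).resolve(".").toString();
    }

    /** Resolves "/" against the URI, which replaces the whole path with the root. */
    static String root(String spec) {
        return URI.create(spec).resolve("/").toString();
    }
}
```

URI.create throws an unchecked IllegalArgumentException on a malformed string, so no try/catch is required. For the URLs in the question, UriBase.parentDirectory(s1) should return "http://staff.unak.is/andy/GameProgramming0910/" and UriBase.root(s2) should return "http://www.complex.com/".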