I'm working on a project that requires me to use the border wait-time information published on the Canada Border Services Agency (CBSA) website to build a visual representation of the wait-time distribution.
I'm trying to find a way to have a Java program regularly check the website and extract the information for a few different border stations (not all of them). I suppose I would use XPath to get the specific stations, but how do I load the webpage on a regular basis?
(P.S. I know they have a Twitter account now too, but it's only updated once a day, and more specifically I'd like to learn how to work with websites and XPath.)
OK, I had a little time off at work today and thought I'd write it for you. Excuse any mistakes; it's the first time I've parsed a site. I did a little research and decided to use jsoup for this.
This code parses the wait-time table and prints the three columns with their values; you can adapt it to your own needs :)
You have to download the jsoup jar first (see the jsoup download page).
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.util.Iterator;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

/**
 * Fetches the CBSA border wait-time page and prints the Office,
 * Commercial Flow and Travellers Flow columns of the wait-time table.
 * (Jsoup.connect(url).get() could also fetch and parse in a single step.)
 */
public class ParseWithJsoup {

    public static void main(String[] args) {
        try {
            URL url = new URL("http://www.cbsa-asfc.gc.ca/bwt-taf/menu-eng.html");
            URLConnection conn = url.openConnection();

            // Read the whole page into a buffer
            StringBuilder buffer = new StringBuilder();
            try (BufferedReader buffRead =
                    new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
                String inputLine;
                while ((inputLine = buffRead.readLine()) != null) {
                    buffer.append(inputLine);
                }
            }

            // Parse the HTML and locate the wait-time table
            Document doc = Jsoup.parse(buffer.toString());
            Element table = doc.select("table[class=bwt]").first();

            // Office column iterator
            Iterator<Element> officeElements = table.select("td[headers=Office]").iterator();
            // Commercial Flow column iterator
            Iterator<Element> comElements = table.select("td[headers=Com ComCanada]").iterator();
            // Travellers Flow column iterator
            Iterator<Element> travElements = table.select("td[headers=Trav TravCanada]").iterator();

            // Walk the three columns row by row
            while (officeElements.hasNext()) {
                System.out.println("Office: " + officeElements.next().text());
                System.out.println("Commercial Flow: " + comElements.next().text());
                System.out.println("Travellers Flow: " + travElements.next().text());
            }
        } catch (Exception e) {
            System.out.println("Exception: " + e.getMessage());
        }
    }
}
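A note on the XPath part of your question: jsoup selects elements with CSS-style selectors (as above) rather than XPath. If you specifically want to practise XPath, the JDK's built-in javax.xml.xpath package can do it, but it needs well-formed XML/XHTML, so a real page usually has to be cleaned up (for example with jsoup) before the XML parser will accept it. Here is a minimal sketch against an invented table snippet; the markup and office names below are placeholders, not the live CBSA page:

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class XPathExample {

    public static void main(String[] args) throws Exception {
        // Placeholder markup loosely modelled on the wait-time table
        String xml = "<table class=\"bwt\">"
                   + "<tr><td headers=\"Office\">Sample Office A</td></tr>"
                   + "<tr><td headers=\"Office\">Sample Office B</td></tr>"
                   + "</table>";

        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new java.io.ByteArrayInputStream(xml.getBytes("UTF-8")));

        XPath xpath = XPathFactory.newInstance().newXPath();

        // Select every td whose headers attribute is "Office"
        NodeList offices = (NodeList) xpath.evaluate(
                "//td[@headers='Office']", doc, XPathConstants.NODESET);

        for (int i = 0; i < offices.getLength(); i++) {
            System.out.println("Office: " + offices.item(i).getTextContent());
        }
    }
}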
Use the URL class in Java: create a URL and then call its openConnection() method to start reading from the website.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

public class WebVisitor {

    public static void main(String[] args) {
        try {
            URL url = new URL("http://seinfeldaudio.com");
            URLConnection conn = url.openConnection();

            // Print the page source line by line
            try (BufferedReader buffRead =
                    new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
                String inputLine;
                while ((inputLine = buffRead.readLine()) != null) {
                    System.out.println(inputLine);
                }
            }
        } catch (Exception e) {
            System.out.println("Exception: " + e.getMessage());
        }
    }
}
More info here: http://www.mkyong.com/java/how-to-get-url-content-in-java/
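As for checking the site on a regular basis: either answer's fetching code can be wrapped in a ScheduledExecutorService so that the fetch runs at a fixed interval. A minimal sketch, assuming a 15-minute interval and a hypothetical fetchAndParse() method that holds whichever fetching/parsing code you prefer:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class BorderWaitTimePoller {

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

        // Run fetchAndParse() immediately, then every 15 minutes
        scheduler.scheduleAtFixedRate(BorderWaitTimePoller::fetchAndParse,
                0, 15, TimeUnit.MINUTES);
    }

    // Hypothetical placeholder: put the jsoup (or URLConnection) fetching
    // and parsing code from the answers above in here.
    private static void fetchAndParse() {
        System.out.println("Checking border wait times...");
    }
}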