I have read through many similar questions, but I am still stuck with logging in to my school gradebook (https://parents.mtsd.k12.nj.us/genesis/parents) to retrieve data. My networking class is shown below:
public class WebLogin {
public String login(String username, String password, String url) throws IOException {
URL address = new URL(url);
HttpURLConnection connection = (HttpURLConnection) address.openConnection();
connection.setDoOutput(true);
connection.setRequestProperty("j_username", username);
connection.setRequestProperty("j_password", password);
connection.setRequestProperty("__submit1__","Login");
InputStream response = connection.getInputStream();
Document document = Jsoup.parse(response, null, "");
//don't know what to do here!
return null;
}
}
I am not sure what to do with the InputStream or if I am going about logging in correctly, and once again, I have gone through and read several online resources to try to understand this concept but I am still confused. Any help is appreciated!
UPDATE: So now I understand the basic process and I know what I have to do. When using Jsoup to handle the connection, I know you can do something like this:
Connection.Response res = Jsoup.connect("---website---")
.data("j_username", username, "j_password", password)
.followRedirects(true)
.method(Method.POST)
.execute();
However, I am still a little confused about how to actually send the data (such as a user's username/password) to the website with HttpURLConnection, and how to actually store the obtained cookie. Otherwise, the help has been really useful and I am fine with everything else.
EXAMPLE HOW-TO:
(explanation of the origin in the edit part below)
(using value/key pairs with HttpURLConnection):
1-2. Assuming we have some URL address (a login page, "http://blabla/login.php") and we want to log in, here is what we do:
// create & open the connection
HttpURLConnection connection = (HttpURLConnection) new URL(address).openConnection();
// we will write a request body, which makes this a POST
connection.setDoOutput(true);
// tell the server we are sending a form-encoded body
connection.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
/** charset used for encoding the parameters */
String CHARSET = "UTF-8";
// Construct the POST value/key pair data.
String data = "login=" + URLEncoder.encode(login, CHARSET)
        + "&password=" + URLEncoder.encode(password, CHARSET)
        + "&remember_me=on";
byte[] dataBytes = data.getBytes(CHARSET);
// create an output stream to write our credentials
OutputStream outputStream = new BufferedOutputStream(connection.getOutputStream());
// write the value/key data to the output stream
outputStream.write(dataBytes);
outputStream.flush();
// the request is now sent; we can read the response code, headers, input stream, etc.
int responseCode = connection.getResponseCode();
/**
 * here we grab the cookies from the "Set-Cookie" response headers
 * (how exactly - that's another story)
 */
connection.disconnect();
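The cookie-grabbing step above can be sketched with the JDK's own `java.net.HttpCookie` parser. In real code the header value comes from `connection.getHeaderFields().get("Set-Cookie")`; the sample value below is made up:

```java
import java.net.HttpCookie;
import java.util.List;

public class CookieGrab {
    public static void main(String[] args) {
        // In real code you would read this from the login response:
        // List<String> setCookies = connection.getHeaderFields().get("Set-Cookie");
        String header = "JSESSIONID=abc123; Path=/genesis; HttpOnly";

        // HttpCookie.parse understands the Set-Cookie header grammar
        List<HttpCookie> cookies = HttpCookie.parse(header);
        for (HttpCookie c : cookies) {
            // re-send later as a "Cookie: name=value" request property
            System.out.println(c.getName() + "=" + c.getValue());
        }
    }
}
```

The `name=value` string printed here is exactly what you pass to `addRequestProperty("Cookie", ...)` in the next step.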
3. THEN WE GET THE SECOND PAGE WITH THE USER DATA ( http://blabla/userdata.php ) (*if we were not already redirected; **we can also reuse the connection, or do this request as the next step of the above)
// we create & open another connection, to the new address, just like at the beginning
HttpURLConnection connection = (HttpURLConnection) new URL(addressToUserData).openConnection();
// but this time we do not build value/key data and don't create an output stream;
// we just add the previously obtained cookies as a request property
connection.addRequestProperty("Cookie", _cookie);
// connecting
connection.connect();
// getting the input stream
InputStream is = connection.getInputStream();
// parse the data, for example with Jsoup
Document doc = Jsoup.parse(is, null, "");
// show the parsed result, for example in a grid view
Then you have the "page" as a Document from point 3, and you select what you need:
Elements tableWithGrades = doc.select("table.grades");
You can select single elements such as a table row (tr), cell (td), span, or div - by id, class, name, etc. - from the HTML, whatever you like. You just need to learn the selector syntax of Jsoup and have some basic knowledge of HTML. :)
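As an alternative to copying cookie headers by hand between the two requests, the JDK can keep them for you: installing a `java.net.CookieManager` makes every later `HttpURLConnection` in the process store and re-send cookies automatically. This is standard JDK API, not part of the answer above; the host and cookie value below are just illustrative:

```java
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.HttpCookie;
import java.net.URI;

public class AutoCookies {
    public static void main(String[] args) throws Exception {
        // Install a process-wide cookie jar; every HttpURLConnection
        // opened afterwards stores and re-sends matching cookies on its own.
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
        CookieHandler.setDefault(manager);

        // Simulate what the login response would leave behind
        // (in real use the manager fills the store for you):
        manager.getCookieStore().add(
                new URI("https://parents.mtsd.k12.nj.us/"),
                new HttpCookie("JSESSIONID", "abc123"));

        for (HttpCookie c : manager.getCookieStore().getCookies()) {
            System.out.println(c.getName() + "=" + c.getValue());
        }
    }
}
```

With this in place, the second request to the user-data page needs no manual `addRequestProperty("Cookie", ...)` at all.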
EDIT:
"I am not sure what to do with the InputStream or if I am going about logging in correctly, and once again, I have gone through and read several online resources to try to understand this concept but I am still confused."
to be clear:
You want to (--goals--):
so you need to ACT & "THINK" like a WEB BROWSER (--way--):
You already know how to do 1 and 2 from my previous answer.
Now you need to think about how to do 3 and 4 - there are many ways, many roads to one place :P It is your choice which one you take, but you still need to be aware of those roads. :)
So which part of the gathered data do you want to present, and in what way?
EDIT2:
"However, I am still a little confused as to how to actually send the data (such as a user's username/password to the website with HttpURLConnection"
before you call:
// change the POST string "?login=xxxx&password=zzz" into a byte array *
byte[] dataBytes = data.getBytes(CHARSET);
which goes together with:
// write the value/key data to the output stream
outputStream.write(dataBytes);
outputStream.flush();
or, before you call, as in your Jsoup example:
res.execute();
you need to set String login = "..."; and String password = "...";. Why? Because your code is executed sequentially (excluding the parallel parts) and Java uses references.
"and how to actually store the obtained cookie..."
"and does writing to the outputStream do the same thing as what my Jsoup example would do? "
In your statement, Jsoup uses the so-called builder pattern, which does the same thing as the code I wrote in points 1-4. It amounts to: "the details of how you do it are irrelevant to me as long as it works" - you can dig a hole with an axe or a shovel and still get the expected result.
"when you are posting login information, why would that be an encoded param?"
Imagine your password looks like this: $$$s-_xxx.php?xxaasfs??dfsdfśś%%___////**"* - now try to make a browser request with that kind of URL: http://server.com/home.php&action=login&username=xxx&password=$$$s-_xxx.php?xxaasfs??dfsdfśś :)))
URLEncoder.encode(password, CHARSET)
/**
* This class is used to encode a string using the format required by
* {@code application/x-www-form-urlencoded} MIME content type.
*
* <p>All characters except letters ('a'..'z', 'A'..'Z') and numbers ('0'..'9')
* and characters '.', '-', '*', '_' are converted into their hexadecimal value
* prepended by '%'. For example: '#' -> %23. In addition, spaces are
* substituted by '+'.
*/
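To make the encoding rules concrete, here is a minimal sketch that builds the POST body with the JDK's `URLEncoder`, using the same field names as the example above (the credentials are made up):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class FormBody {
    // Build an application/x-www-form-urlencoded body from the
    // field names used in the steps above.
    static String body(String login, String password) throws UnsupportedEncodingException {
        return "login=" + URLEncoder.encode(login, "UTF-8")
             + "&password=" + URLEncoder.encode(password, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        // '&', '#', '?' and spaces would otherwise corrupt the request:
        System.out.println(body("john doe", "p&ss#word?"));
        // -> login=john+doe&password=p%26ss%23word%3F
    }
}
```

Note how the space becomes '+' and '&', '#', '?' become %26, %23, %3F - without this, the server would split the password on the stray '&'.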
SOME OTHER HINTS:
PS. For your case you should also look at
Using Form-Based Login in JavaServer Faces Web Applications
I hope this gives you an overview of one of the possible paths - and sorry for the typos, my English is not great. ;p
One way to do this is to use the Apache HttpClient fluent API. Maybe something like the following could work:
Request.Post("https://parents.mtsd.k12.nj.us/genesis/j_security_check")
.bodyForm(
Form.form()
.add("j_username", username)
.add("j_password", password).build()
)
.execute().returnContent();
You'll have to figure out what information is necessary for logging in (URL, request method, names and values of parameters, etc.).
Here is an example I found for logging into Gmail.