
Reading information from a password protected site

Tags: r

I have been using readLines() to scrape information from a website in an R tutorial. I now want to extract data from my own website (specifically the awstats data); however, the domain is password-protected.

Is there a way to pass a username and password along with the URL for the specific awstats data I need?

The format of the URL is:

http://domain.name:port/awstats.pl?month=02&year=2011&config=domain.name&lang=en&framename=mainright&output=alldomains

Thanks.

John asked Mar 24 '11 14:03




1 Answer

Formatting the URL as http://username:password@domain... for use with download.file didn't work for me, but the R.utils package provides the function downloadFile, which works perfectly:

library(R.utils)
# myurl is the password-protected URL; myfile is the local path to save it to
downloadFile(myurl, myfile, username = "myusername", password = "mypassword")

See @joris-meys's answer for a way to avoid including your username and password in plain text in your script.
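
As an illustration only (this is not necessarily the approach taken in the answer mentioned above), one common way to keep credentials out of a script is to read them from environment variables, for example ones set in ~/.Renviron. The helper name get_credentials and the variable names AWSTATS_USER and AWSTATS_PASS are made up for this sketch.

```r
# Hypothetical helper: read credentials from environment variables
# instead of hard-coding them in the script.
get_credentials <- function(user_var = "AWSTATS_USER",
                            pass_var = "AWSTATS_PASS") {
  user <- Sys.getenv(user_var, unset = NA)
  pass <- Sys.getenv(pass_var, unset = NA)
  # Fail early if either variable is missing or empty.
  if (is.na(user) || !nzchar(user) || is.na(pass) || !nzchar(pass)) {
    stop("Set ", user_var, " and ", pass_var, " before running this script.")
  }
  list(username = user, password = pass)
}

# Usage (not run here, since it needs network access and real credentials):
# creds <- get_credentials()
# R.utils::downloadFile(myurl, myfile,
#                       username = creds$username,
#                       password = creds$password)
```

The password still ends up in the R session's memory, so this only keeps it out of the script file and out of version control.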

EDIT: Except it looks like downloadFile just reformats the URL to http://username:password@domain...? Hmm...
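
If the credentials really do end up embedded in the URL, one thing that can break a hand-built URL is a username or password containing reserved characters such as @, : or /. This is only a guess at a possible cause, not a confirmed diagnosis of the download.file failure above. A minimal sketch of building such a URL with percent-encoded credentials follows; build_auth_url is a hypothetical helper and the example.com address is a placeholder.

```r
# Hypothetical helper: embed percent-encoded credentials in a URL.
build_auth_url <- function(url, username, password) {
  # reserved = TRUE also encodes characters like "@", ":" and "/";
  # repeated = TRUE encodes even if the input already looks encoded.
  enc <- function(x) utils::URLencode(x, reserved = TRUE, repeated = TRUE)
  userinfo <- paste0(enc(username), ":", enc(password), "@")
  # Insert "user:pass@" right after the scheme (http:// or https://).
  sub("^(https?://)", paste0("\\1", userinfo), url)
}

# Placeholder URL and credentials, for illustration only.
auth_url <- build_auth_url(
  "http://example.com:8080/awstats.pl?month=02&year=2011",
  "myusername", "p@ss/word"
)
print(auth_url)
```

The resulting string could then be passed to a download function in place of the original URL. Note that credentials embedded in a URL may show up in logs and command histories.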

mikeck answered Oct 29 '22 13:10