
Scraping a Facebook App for Data

Tags: c#, .net, facebook

I'm using a Facebook application that has a rich set of information that I'd like to get at offline. To do this, I essentially need to read the information from the web pages into my own database. Obviously, I'd prefer not to save pages manually; I'd rather have my application fetch the pages and pull the relevant details from them. Unfortunately, I am blocked by the requirement to authenticate to Facebook first. When I run this code:

private static string getPage(string pageAddress)
{
    // baseUri is a static Uri field defined elsewhere in the class.
    HttpWebRequest req = (HttpWebRequest)WebRequest.Create(new Uri(baseUri, pageAddress));
    // The using blocks dispose the response and reader even if reading throws.
    using (HttpWebResponse response = (HttpWebResponse)req.GetResponse())
    using (StreamReader readStream = new StreamReader(response.GetResponseStream()))
    {
        return readStream.ReadToEnd();
    }
}

I get this response:

<script type="text/javascript">
if (parent != self) 
top.location.href = "http://www.facebook.com/login.php?api_key=<obscured>&canvas&v=1.0";
else self.location.href = "http://www.facebook.com/login.php?api_key=<obscured>&canvas&v=1.0";
</script>

Any ideas how to tell the app that I really am the authentic me?

Jacob Proffitt Avatar asked Feb 05 '26 18:02



1 Answer

As far as I understand, you just need to log in to the Facebook application, right? Use a web scraping/crawling framework for that. These tools emulate ordinary web browsing: they keep cookies across requests, and the ones that drive a real browser (WatiN, for example) also run the page's JavaScript. For example, try these:

http://scrapy.org/

http://wwwsearch.sourceforge.net/mechanize/

http://watin.sourceforge.net/

Also see

.Net Screen scraping and session
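
If you would rather stay with plain HttpWebRequest, the piece those frameworks add is session state: cookies received from one response are sent back with the next request. Here is a minimal sketch of that idea, assuming a hypothetical CookieSession class (the class, its member names, and the login-redirect check are mine, not from the question). It does not perform the login itself: the login form's fields are not shown in the question, so that POST is left out, and it would have to go through the same session before the app pages are fetched.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper for illustration; not part of the question's code.
public class CookieSession
{
    // One container shared by every request, so cookies set by an earlier
    // response (for example, after a login) accompany later requests.
    private readonly CookieContainer cookies = new CookieContainer();
    private readonly Uri baseUri;

    public CookieSession(Uri baseUri)
    {
        this.baseUri = baseUri;
    }

    // Builds a request that participates in the shared cookie session.
    public HttpWebRequest CreateRequest(string pageAddress)
    {
        HttpWebRequest req = (HttpWebRequest)WebRequest.Create(new Uri(baseUri, pageAddress));
        req.CookieContainer = cookies;
        return req;
    }

    // Same behavior as getPage in the question, but with cookies attached.
    public string GetPage(string pageAddress)
    {
        HttpWebRequest req = CreateRequest(pageAddress);
        using (HttpWebResponse response = (HttpWebResponse)req.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }

    // Heuristic (an assumption): the page is the JavaScript bounce to
    // login.php shown in the question rather than real app content.
    public static bool LooksLikeLoginRedirect(string page)
    {
        return page != null
            && page.IndexOf("facebook.com/login.php", StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

You would create one CookieSession for the whole run and call GetPage on it for each page. If LooksLikeLoginRedirect returns true for a result, the session is not authenticated (or no longer is), and the page should not be parsed for data.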

Alexey Kalmykov Avatar answered Feb 07 '26 10:02
