
Is there a good tutorial for figuring out what a website is doing so your program can do the same thing? [closed]

Is there a good guide or tutorial for people who need to programmatically interact with dynamic websites? There's been a rash of Perl questions about that lately, and I haven't found a good resource to point people toward. I'm asking not because I need one but because I don't want to waste my time writing it if it already exists. Although I'm most interested in Perl, the extra tools and techniques are mostly the same.

Typically, I see these problems in people's questions:

  • Handling, setting, and saving cookies
  • Finding and interacting with forms
  • Handling JavaScript inside your user-agent
    • especially things like onLoad, onSubmit, and Ajax
  • Using HTTP sniffer tools
  • Using Web developer plugins in interactive browsers
  • Interacting with DOM, screen scraping, etc.
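
To make the first two items concrete, here is a minimal sketch of the kind of example such a tutorial might open with, written in Python with only the standard library (in Perl, modules such as WWW::Mechanize cover the same ground). Everything named here is an assumption made up for illustration: the `FormFinder` class, the `find_forms` and `cookie_header` helpers, the sample HTML, and the cookie value. A real user-agent would fetch the page over HTTP and keep a proper cookie jar rather than parse a single header by hand.

```python
from html.parser import HTMLParser
from http.cookies import SimpleCookie


class FormFinder(HTMLParser):
    """Collect each <form> with its action, method, and named <input> fields."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None  # the form we are currently inside, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action", ""),
                "method": (attrs.get("method") or "get").lower(),
                "fields": {},
            }
            self.forms.append(self._current)
        elif tag == "input" and self._current is not None and attrs.get("name"):
            # Keep the default value so hidden fields (e.g. tokens) are preserved.
            self._current["fields"][attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def find_forms(html):
    """Return a list of dicts describing the forms found in an HTML string."""
    parser = FormFinder()
    parser.feed(html)
    parser.close()
    return parser.forms


def cookie_header(set_cookie_value):
    """Turn one Set-Cookie header value into a Cookie request header value."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())


# Hypothetical page and response header, invented for this example.
SAMPLE_HTML = """
<form action="/login" method="POST">
  <input type="hidden" name="token" value="abc123">
  <input type="text" name="user">
  <input type="password" name="pass">
</form>
"""
SAMPLE_SET_COOKIE = "session=xyz789; Path=/; HttpOnly"

forms = find_forms(SAMPLE_HTML)
print(forms[0]["action"], forms[0]["method"], forms[0]["fields"])
print(cookie_header(SAMPLE_SET_COOKIE))
```

The point of the sketch is that a script has to send back whatever the browser would: the hidden `token` field as well as the visible ones, plus the session cookie from the earlier response. It does not touch the JavaScript items in the list, which usually need either a sniffer to see the resulting requests or a scriptable browser.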

If there's no good tutorial, I'll add it to my list of things to do (unless someone else wants to do it). Along the way, if you don't have a suggestion for an existing tutorial, please suggest the things that you think should be in a new one, including links, your favorite tools, and your own user-agent development experiences. I don't care about the particular language you use.

brian d foy asked May 02 '10 19:05




1 Answer

The best I've seen is a Defcon presentation video.

Alex answered Jan 02 '23 09:01