
Python web crawler with MySQL database

I want to create or find an open source web crawler (spider/bot) written in Python. It must find and follow links, collect the meta tags (including the meta description), the title, and the URL of each web page, and store all of that data in a MySQL database.

Does anyone know of any open source scripts that could help me? Pointers on how to approach this are also welcome.

Callum Whyte asked Aug 10 '11 20:08


1 Answer

Yes, I know of a few.

Libraries:

https://github.com/djay/transmogrify.webcrawler

http://code.google.com/p/harvestman-crawler/

http://code.activestate.com/pypm/orchid/

Open source web crawler:

http://scrapy.org/

Tutorials:

http://www.example-code.com/python/pythonspider.asp
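To give an idea of what a crawler has to do for each page, here is a minimal sketch that uses only the Python standard library. It is not taken from any of the projects linked above; the names PageParser and parse_page are made up for illustration. It pulls out the title, the meta description and keywords, and the links to follow next.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageParser(HTMLParser):
    """Collects the title, named meta tags, and outgoing links of one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.title = ""
        self.meta = {}    # lowercased meta name -> content
        self.links = []   # absolute URLs found in <a href="...">
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content") is not None:
            self.meta[attrs["name"].lower()] = attrs["content"]
        elif tag == "a" and attrs.get("href"):
            # Resolve relative links against the URL of the page itself.
            self.links.append(urljoin(self.base_url, attrs["href"]))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def parse_page(url, html):
    """Return the fields the question asks for, plus the links to crawl next."""
    parser = PageParser(url)
    parser.feed(html)
    return {
        "url": url,
        "title": parser.title.strip(),
        "description": parser.meta.get("description", ""),
        "keywords": parser.meta.get("keywords", ""),
        "links": parser.links,
    }
```

A real crawler would also need to download the pages (for example with urllib.request), remember which URLs it has already visited, and respect robots.txt. Scrapy handles those parts for you.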

PS: I don't know whether any of these use MySQL, because in my experience Python projects more often use SQLite or PostgreSQL. If you want MySQL, you could use the libraries I gave you, import the MySQL-python module (it is imported as MySQLdb), and write the data to the database yourself:

http://sourceforge.net/projects/mysql-python/
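Storing one crawled page could look roughly like the sketch below. The table layout, the save_page and connect_mysql helpers, and the connection settings are all assumptions for illustration, so adjust them to your own setup. save_page works with any DB-API connection that uses %s placeholders, which is what MySQLdb does.

```python
# Hypothetical schema: one row per crawled URL.
CREATE_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS pages (
    url VARCHAR(255) NOT NULL PRIMARY KEY,
    title TEXT,
    description TEXT,
    keywords TEXT
)
"""

# Insert a new page, or refresh the stored fields if the URL is already known.
UPSERT_SQL = """
INSERT INTO pages (url, title, description, keywords)
VALUES (%s, %s, %s, %s)
ON DUPLICATE KEY UPDATE
    title = VALUES(title),
    description = VALUES(description),
    keywords = VALUES(keywords)
"""


def save_page(conn, page):
    """Write one page dict (url, title, description, keywords) to the database."""
    cur = conn.cursor()
    try:
        # Values are passed separately so the driver escapes them for us.
        cur.execute(UPSERT_SQL, (
            page["url"],
            page.get("title", ""),
            page.get("description", ""),
            page.get("keywords", ""),
        ))
        conn.commit()
    finally:
        cur.close()


def connect_mysql():
    """Open a connection and make sure the table exists (placeholder credentials)."""
    import MySQLdb  # from MySQL-python, or the mysqlclient fork on Python 3
    conn = MySQLdb.connect(host="localhost", user="crawler",
                           passwd="secret", db="crawl")
    cur = conn.cursor()
    cur.execute(CREATE_TABLE_SQL)
    cur.close()
    return conn
```

Typical use would be conn = connect_mysql() once at startup, then save_page(conn, page) for every page the crawler parses. Passing the values as parameters instead of formatting them into the SQL string matters here, because page titles and descriptions come from untrusted websites.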

Lynob answered Sep 23 '22 12:09