
How to make a Twitter Crawler using Scrapy? [closed]

I have used Scrapy to scrape websites such as Pinterest, which do not require a logged-in session. How can I use Scrapy to scrape and crawl Twitter, given that accessing followers and other data requires logging in first?

Aman asked Oct 12 '25 16:10


1 Answer

Log in to Twitter, then request the followers page of the user you are interested in. An example using the Python library Requests:

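import requests

# Placeholders: replace with your own Twitter credentials
account = 'your_username_or_email'
password = 'your_password'

# Submit the login form fields to the login URL
url = "https://twitter.com/login"
payload = { 'session[username_or_email]': account, 
            'session[password]': password}
r = requests.post(url, data=payload)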

It is better to add a browser's headers to the request, so that the Twitter server treats the spider as a regular browser user.

# Fill in the empty values below after checking the request headers in your browser
headers = {
        'Host': 'twitter.com',
        'User-Agent': '',
        'Accept': '',
        'Accept-Language': '',
        'Accept-Encoding': '',
        'X-Requested-With': '',
        'Cookie': '',
        'Connection': ''}
someone = 'some_username'  # placeholder: screen name of the account to inspect
url = 'https://twitter.com/%s/followers' % someone
p = requests.get(url, headers=headers)
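The two steps above can also be combined with a requests.Session, which stores the cookies returned by the login response and sends them on later requests, so the Cookie header does not have to be copied by hand. This is only a minimal sketch: it assumes the login form accepts the same two fields used above, and the helper names (make_session, followers_url, login_and_fetch_followers) are made up for illustration.

```python
import requests

LOGIN_URL = "https://twitter.com/login"


def make_session(user_agent):
    """Create a session that sends a browser-like User-Agent on every request."""
    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    return session


def followers_url(screen_name):
    """Build the followers page URL for a given screen name."""
    return "https://twitter.com/%s/followers" % screen_name


def login_and_fetch_followers(session, account, password, screen_name):
    """Log in, then fetch the followers page with the same session.

    Assumption: the login form takes only these two fields, as in the
    example above. The session keeps the cookies set by the login response.
    """
    payload = {
        "session[username_or_email]": account,
        "session[password]": password,
    }
    login = session.post(LOGIN_URL, data=payload)
    login.raise_for_status()
    page = session.get(followers_url(screen_name))
    page.raise_for_status()
    return page.text
```

Nothing is sent over the network until login_and_fetch_followers is called, for example as login_and_fetch_followers(make_session(my_user_agent), account, password, someone), where my_user_agent is the User-Agent string copied from your browser.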

You then have the page, and you can parse it with a tool such as BS4 (Beautiful Soup 4), Scrapy, or anything else.
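As a rough illustration of the parsing step, here is a sketch that uses only the standard library's html.parser in place of BS4. The sample markup and the profile-link class name are invented for the example and do not reflect Twitter's real page structure, so inspect the actual HTML and adjust accordingly.

```python
from html.parser import HTMLParser


class ProfileLinkParser(HTMLParser):
    """Collect href values of <a> tags that carry a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        if self.css_class in classes and attr_map.get("href"):
            self.links.append(attr_map["href"])


def extract_profile_links(html, css_class="profile-link"):
    """Return the hrefs of all matching links, in document order."""
    parser = ProfileLinkParser(css_class)
    parser.feed(html)
    return parser.links


# Invented sample markup, standing in for the text of the fetched page.
SAMPLE_HTML = """
<div>
  <a class="profile-link" href="/alice">Alice</a>
  <a class="nav" href="/home">Home</a>
  <a class="profile-link extra" href="/bob">Bob</a>
</div>
"""

print(extract_profile_links(SAMPLE_HTML))  # prints ['/alice', '/bob']
```

With a real response you would pass the page text (for example p.text) instead of SAMPLE_HTML.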

Hao Lyu answered Oct 16 '25 11:10
