
How to scrape Instagram account info in Python

I am trying to do something extremely simple in Python, yet somehow it's very difficult. All I want to do is write a Python script that records the number of people an Instagram user is following and the number of its followers. That's it.

Can anyone point me to a good package to do this? Preferably not Beautiful Soup, as that is overly complicated for what I want to do. I just want something like

[user: example_user, followers:9019, following:217] 

Is there an Instagram-specific Python library?

The account I want to scrape is public. This is very simple to do for Twitter.

Any help is appreciated.

user53558 asked Dec 23 '22 14:12


2 Answers

Since the content you are looking for is available in the page source, you can fetch it using requests in combination with BeautifulSoup.

Give it a try:

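import requests
from bs4 import BeautifulSoup

# Fetch the public profile page; the 'lxml' parser needs the lxml package installed
html = requests.get('https://www.instagram.com/michaeljackson/')
soup = BeautifulSoup(html.text, 'lxml')
# The og:description meta tag is expected to hold the counts as comma-separated text
item = soup.select_one("meta[property='og:description']")
# The previous sibling tag's content carries the display name before the "•"
name = item.find_previous_sibling().get("content").split("•")[0]
# First comma-separated field: followers; second field: following
followers = item.get("content").split(",")[0]
following = item.get("content").split(",")[1].strip()
print(f'{name}\n{followers}\n{following}')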

Results:

Name :Michael Jackson
Followers :1.6m
Following :4
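
The splits above depend on the exact layout of the og:description text. As a rough sketch, the same fields could be pulled out with a regular expression instead, which also yields the format asked for in the question. The layout encoded in PATTERN is an assumption inferred from the split logic, and the sample string and the parse_description name are made up for illustration (no request is sent):

```python
import re

# Assumed layout of the og:description content (inferred, not confirmed):
# "<followers> Followers, <following> Following, ... (@<handle>)"
PATTERN = re.compile(
    r"^(?P<followers>[\d.,]+[kmb]?)\s+Followers,\s+"
    r"(?P<following>[\d.,]+[kmb]?)\s+Following,"
    r".*\(@(?P<user>[\w.]+)\)",
    re.IGNORECASE | re.DOTALL,
)


def parse_description(content: str) -> dict:
    """Return followers, following and user handle as strings."""
    match = PATTERN.search(content)
    if match is None:
        # Fail loudly if the page layout differs from the assumption
        raise ValueError(f"unexpected description format: {content!r}")
    return match.groupdict()


# Hypothetical sample text, built from the numbers in the question
sample = (
    "9019 Followers, 217 Following, 12 Posts - "
    "See Instagram photos and videos from Example User (@example_user)"
)
print(parse_description(sample))
```

Because the handle is taken from the parentheses, this does not care how many words the display name has. If the layout changes, the function raises instead of returning wrong fields.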
SIM answered Jan 11 '23 11:01


I don't know why you want to avoid BeautifulSoup, since it is actually quite convenient for tasks like this. Something along the following lines should do the job:

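import requests
from bs4 import BeautifulSoup

html = requests.get('https://www.instagram.com/cristiano/') # input URL here
soup = BeautifulSoup(html.text, 'lxml')

# The og:description meta tag carries the counts and the account name
data = soup.find_all('meta', attrs={'property':'og:description'})
# Split on whitespace: token 0 is the follower count, token 2 the following count
text = data[0].get('content').split()

# The last three tokens are "First Last (@handle)"; this assumes a two-word display name
user = '%s %s %s' % (text[-3], text[-2], text[-1])
followers = text[0]
following = text[2]

print('User:', user)
print('Followers:', followers)
print('Following:', following)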

Output:

User: Cristiano Ronaldo (@cristiano)
Followers: 111.5m
Following: 387

Of course, you would need a bit of arithmetic to turn abbreviated values into actual (though truncated) numbers when the user has more than 1m followers (or is following more than 1m users), which should not be too difficult.
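
A minimal sketch of that conversion, assuming the counts come back as strings like "111.5m", "387" or "9,019". Only the "m" suffix appears in the outputs above; "k" and "b" are assumptions, and parse_count is a hypothetical helper name:

```python
from decimal import Decimal

# Only "m" shows up in the outputs above; "k" and "b" are assumed suffixes
SUFFIXES = {"k": 1_000, "m": 1_000_000, "b": 1_000_000_000}


def parse_count(text: str) -> int:
    """Convert an abbreviated count such as '111.5m' to an integer."""
    cleaned = text.strip().lower().replace(",", "")
    multiplier = 1
    if cleaned and cleaned[-1] in SUFFIXES:
        multiplier = SUFFIXES[cleaned[-1]]
        cleaned = cleaned[:-1]
    # Decimal avoids float rounding surprises on values like "1.6"
    return int(Decimal(cleaned) * multiplier)


for raw in ("111.5m", "1.6m", "387", "9,019"):
    print(raw, "->", parse_count(raw))
```

The result is only as precise as the abbreviated text: "111.5m" becomes 111500000, not the exact follower count.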

adder answered Jan 11 '23 12:01