
How to get the hidden input's value with beautifulsoup?

In Python 3, I want to extract information from a page using requests and BeautifulSoup:

import requests
from bs4 import BeautifulSoup

link = "https://portal.stf.jus.br/processos/listarPartes.asp?termo=AECIO%20NEVES%20DA%20CUNHA"

try:
    res = requests.get(link)
    res.raise_for_status()  # turn HTTP error statuses into exceptions
except requests.exceptions.RequestException as e:
    # RequestException is the base class of HTTPError, ConnectionError and Timeout
    raise SystemExit(str(e))  # stop here, otherwise res would be undefined below

html = res.content.decode('utf-8')

soup = BeautifulSoup(html, "lxml")

pag = soup.find('div', {'id': 'total'})

print(pag)

In this case the information is in an HTML snippet like this:

<div id="total" style="display: inline-block"><input type="hidden" name="totalProc" id="totalProc" value="35">35</div>

What I want to access is the value attribute of the hidden input, which in this case is 35.

That is why I used pag = soup.find('div', {'id': 'total'}): to narrow the result down step by step until only the number 35 is left.

But the content returned was only: <div id="total" style="display: inline-block"><img src="ajax-loader.gif"/></div>

Does anyone know how to capture only the content of value?

Reinaldo Chaves asked Dec 04 '25 12:12



1 Answer

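The value is not in the initial HTML. The page pulls it in dynamically through a separate XHR call, which you can find in the network tab of your browser's developer tools. Request that endpoint directly: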

import requests
from bs4 import BeautifulSoup as bs

r = requests.get('https://portal.stf.jus.br/processos/totalProcessosPartes.asp?termo=AECIO%20NEVES%20DA%20CUNHA&total=0')
soup = bs(r.content, 'lxml')
print(soup.select_one('#totalProc')['value'])

With a regex instead of BeautifulSoup:

import re
import requests

r = requests.get('https://portal.stf.jus.br/processos/totalProcessosPartes.asp?termo=AECIO%20NEVES%20DA%20CUNHA&total=0')
# The quote before the digits is optional, so value=35 and value="35" both match
print(re.search(r'value="?(\d+)', r.text).group(1))
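The two steps above (build the XHR URL, then read the hidden input) could be wrapped into small helpers along these lines. This is only a sketch: the names total_url, extract_total and fetch_total are hypothetical, the endpoint and the total=0 parameter are copied from the request above, and the built-in "html.parser" backend is swapped in for lxml so no extra parser is needed. The helper returns None when the input is missing, which is what happens if you parse the original page and only get the ajax-loader placeholder.

```python
from typing import Optional
from urllib.parse import quote

from bs4 import BeautifulSoup

# Endpoint copied from the XHR call above; everything else here is illustrative.
TOTAL_ENDPOINT = "https://portal.stf.jus.br/processos/totalProcessosPartes.asp"


def total_url(term: str) -> str:
    """Build the XHR URL for a search term, encoding spaces as %20."""
    return f"{TOTAL_ENDPOINT}?termo={quote(term)}&total=0"


def extract_total(html: str) -> Optional[int]:
    """Return the value of the hidden #totalProc input as an int, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    hidden = soup.find("input", {"id": "totalProc"})
    if hidden is None:
        return None  # e.g. the page still shows the ajax-loader placeholder
    value = hidden.get("value", "").strip()
    return int(value) if value.isdigit() else None


def fetch_total(term: str, timeout: float = 10.0) -> Optional[int]:
    """Request the XHR endpoint and parse the total (performs a network call)."""
    import requests  # imported here so the parsing helpers work without requests

    response = requests.get(total_url(term), timeout=timeout)
    response.raise_for_status()
    return extract_total(response.text)


# Offline check against the snippet quoted in the question.
SAMPLE = ('<div id="total" style="display: inline-block">'
          '<input type="hidden" name="totalProc" id="totalProc" value="35">35</div>')
print(extract_total(SAMPLE))  # prints 35
```

For the live site you would call fetch_total("AECIO NEVES DA CUNHA"); whether the endpoint still responds in the same way is an assumption.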
QHarr answered Dec 07 '25 00:12