BeautifulSoup tag is type bs4.element.NavigableString and bs4.element.Tag

I'm trying to scrape a table in a Wikipedia article and the type of each table element appears to be both <class 'bs4.element.Tag'> and <class 'bs4.element.NavigableString'>.

import requests
import bs4
import lxml


resp = requests.get('https://en.wikipedia.org/wiki/List_of_municipalities_in_Massachusetts')

soup = bs4.BeautifulSoup(resp.text, 'lxml')

munis = soup.find(id='mw-content-text')('table')[1]

for muni in munis:
    print type(muni)
    print '============'

produces the following output:

<class 'bs4.element.Tag'>
============
<class 'bs4.element.NavigableString'>
============
<class 'bs4.element.Tag'>
============
<class 'bs4.element.NavigableString'>
============
<class 'bs4.element.Tag'>
============
<class 'bs4.element.NavigableString'>
...

When I try to retrieve muni.contents, I get an AttributeError: 'NavigableString' object has no attribute 'contents'.

What am I doing wrong? How do I get the bs4.element.Tag object for each muni?

(Using Python 2.7).

asked Nov 30 '16 03:11

2 Answers

#!/usr/bin/env python
# coding:utf-8
'''黄哥Python'''

import requests
from bs4 import BeautifulSoup
from bs4.element import Tag

html = requests.get('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
soup = BeautifulSoup(html.text, 'lxml')

# next_siblings yields the whitespace strings between rows as well as the <tr> tags
symbolslist = soup.find('table').tr.next_siblings
for sec in symbolslist:
    # keep only real tags; skip the NavigableString whitespace nodes
    if isinstance(sec, Tag):
        print(sec.get_text())
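
To see where the alternating types come from without hitting the network, here is a minimal sketch. It assumes Python 3 and the built-in 'html.parser' backend rather than lxml, and the two-row table is a made-up stand-in for the Wikipedia markup, not the real page.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical stand-in for the Wikipedia table. The newlines between
# the rows are what BeautifulSoup stores as NavigableString nodes.
HTML = """<table>
<tr><td>Abington</td></tr>
<tr><td>Acton</td></tr>
</table>"""

soup = BeautifulSoup(HTML, "html.parser")
table = soup.find("table")

# Iterating a Tag walks its direct children, strings included.
child_types = [type(child).__name__ for child in table]

# Keep only the element nodes and read their text.
rows = [child for child in table if isinstance(child, Tag)]
row_texts = [row.get_text() for row in rows]

print(child_types)
print(row_texts)
```

The isinstance check also accepts subclasses of Tag, which an exact type() comparison would not.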

answered Sep 20 '22 19:09

黄哥Python培训

If your markup has whitespace between nodes, BeautifulSoup turns it into NavigableString objects, which is why they alternate with the Tag objects when you iterate over the table. Wrap the access in a try/except and check whether the contents come back the way you want:

for muni in munis:
    #print type(muni)
    try:
        print muni.contents
    except AttributeError:
        pass
    print '============'
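
If you would rather avoid the try/except altogether, one alternative is to ask for the rows explicitly, since find_all only returns Tag objects. This is a rough sketch, assuming Python 3 and the built-in 'html.parser' backend; the small table and its column names are invented for illustration and are not taken from the Wikipedia page.

```python
from bs4 import BeautifulSoup

# Invented sample markup standing in for the municipalities table.
HTML = """<table>
<tr><th>Municipality</th><th>Type</th></tr>
<tr><td>Abington</td><td>Town</td></tr>
<tr><td>Boston</td><td>City</td></tr>
</table>"""

soup = BeautifulSoup(HTML, "html.parser")
table = soup.find("table")

records = []
# find_all returns only matching tags, so whitespace strings never show up here.
for row in table.find_all("tr"):
    cells = [cell.get_text(strip=True) for cell in row.find_all(["th", "td"])]
    records.append(cells)

print(records)
```

Each entry in records is then a plain list of cell strings, which is usually easier to work with than muni.contents.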

answered Sep 20 '22 19:09

Vivek Kalyanarangan

Tags: python, beautifulsoup, web-scraping

lsimmons
