
Python/BeautifulSoup - how to remove all tags from an element?

How can I simply strip all tags from an element I find in BeautifulSoup?

Daniele B asked Apr 25 '13 04:04


People also ask

How do you remove a tag in Python?

Use the decompose() method, which is built into BeautifulSoup. Tag.decompose() removes a tag from the tree of a given HTML document, then completely destroys it and its contents.
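
As a minimal sketch of the behaviour described above, assuming an invented HTML snippet and the standard-library "html.parser" backend, decompose() can be used to drop every script tag together with its contents:

```python
from bs4 import BeautifulSoup

# Sample markup invented for illustration only.
html = "<div><p>Keep this.</p><script>track()</script></div>"
soup = BeautifulSoup(html, "html.parser")

# decompose() removes the tag and everything inside it from the tree.
for script in soup.find_all("script"):
    script.decompose()

remaining = str(soup)
print(remaining)  # <div><p>Keep this.</p></div>
```

Note that this deletes the text inside the tag as well, so it suits unwanted elements rather than the "strip tags but keep text" case asked about in the question.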

How do I select multiple tags in BeautifulSoup?

To find multiple tags, you can use a CSS selector list, in which the tag names are separated by commas. To use a CSS selector, call the .select_one() method instead of .find().
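
A short sketch of the comma-separated selector, assuming an invented HTML snippet. It also uses select(), which returns every match rather than only the first one; the variable names are illustrative:

```python
from bs4 import BeautifulSoup

# Sample markup invented for illustration only.
html = "<h1>Title</h1><p>Intro</p><h2>Section</h2><span>skip</span>"
soup = BeautifulSoup(html, "html.parser")

# "h1, h2" matches any element that is an <h1> or an <h2>.
headings = soup.select("h1, h2")
names = [tag.name for tag in headings]

# select_one() returns only the first match in document order.
first = soup.select_one("h1, h2")

print(names)             # ['h1', 'h2']
print(first.get_text())  # Title
```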

How do you scrape a tag with BeautifulSoup?

Step-by-step approach. Step 1: Import the beautifulsoup module for the scraping itself and the requests module for fetching the website. Step 2: Request the URL by calling the get method.
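
The two steps above might look roughly like the sketch below. The helper names fetch_html and extract_links are hypothetical, and the sample markup is invented; the download step needs the third-party requests package and network access, so it is kept in its own function and the parsing step is shown on a local string:

```python
from bs4 import BeautifulSoup


def fetch_html(url):
    """Download a page (hypothetical helper; needs requests and network access)."""
    import requests  # imported lazily so the parsing part runs without it

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def extract_links(html):
    """Return the href of every <a> tag that has one (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    return [a["href"] for a in soup.find_all("a", href=True)]


# Sample markup invented for illustration only.
sample = '<p><a href="/one">One</a> <a>no href</a> <a href="/two">Two</a></p>'
links = extract_links(sample)
print(links)  # ['/one', '/two']
```

For a live page, the two helpers would be chained, for example extract_links(fetch_html(url)) with a URL of your choosing.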


2 Answers

With BeautifulStoneSoup gone in bs4, it's even simpler in Python 3:

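    from bs4 import BeautifulSoup

    # html is assumed to hold the markup as a string
    soup = BeautifulSoup(html, "html.parser")  # name the parser explicitly
    text = soup.get_text()  # all text in the document, with tags stripped
    print(text)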
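
Since the question is about a single element rather than the whole document, here is a hedged sketch of the same call on one found element. The markup and the id "post" are invented for illustration; separator and strip are optional get_text() arguments:

```python
from bs4 import BeautifulSoup

# Sample markup invented for illustration only.
html = "<div id='post'><p>First line</p><p>Second <b>bold</b> line</p></div>"
soup = BeautifulSoup(html, "html.parser")

element = soup.find("div", id="post")

# Default: text fragments are concatenated with nothing between them.
plain = element.get_text()

# separator joins the fragments; strip=True trims whitespace from each one.
spaced = element.get_text(separator=" ", strip=True)

print(plain)   # First lineSecond bold line
print(spaced)  # First line Second bold line
```

The default output runs adjacent blocks together, which is why a separator is often worth passing.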
shawnl answered Oct 13 '22 16:10


Why has no answer I've seen mentioned the unwrap method? Or, even easier, the get_text method:

http://www.crummy.com/software/BeautifulSoup/bs4/doc/#unwrap
http://www.crummy.com/software/BeautifulSoup/bs4/doc/#get-text
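
To illustrate the difference between the two methods, a minimal sketch on an invented snippet: unwrap() replaces a tag with its own contents and leaves the surrounding tree in place, while get_text() returns a plain string.

```python
from bs4 import BeautifulSoup

# Sample markup invented for illustration only.
html = "<p>Some <b>bold</b> and <a href='#'>linked</a> text</p>"
soup = BeautifulSoup(html, "html.parser")
paragraph = soup.p

# find_all(True) matches every tag below the paragraph;
# unwrap() removes each tag but keeps what was inside it.
for child in paragraph.find_all(True):
    child.unwrap()

markup = str(paragraph)
text = paragraph.get_text()

print(markup)  # <p>Some bold and linked text</p>
print(text)    # Some bold and linked text
```

If only the text is needed, get_text() alone is enough; unwrap() is the better fit when the element itself should stay in the document.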

Bobby answered Oct 13 '22 16:10