I'm scraping text from a webpage, and when I use .text the text of all nested elements is joined together with nothing in between. However, I want the text of each element separated by a space.
For example, I have this text:
data=['<span class="sub-title title-block"><span class="nowrap">1.2</span><span class="nowrap">TEKNA</span></span>',
'<span class="sub-title title-block"><span class="nowrap">Amr</span><span class="nowrap">V12 5.2</span></span>',
'<span class="sub-title title-block"></span>']
When I do the following:
from bs4 import BeautifulSoup
for i in data:
    soup = BeautifulSoup(i, 'lxml')
    for d in soup:
        print(d.text)
I get:
1.2TEKNA
AmrV12 5.2
But the output I want is:
1.2 TEKNA
Amr V12 5.2
where the text of each inner element is separated from the next by a space.
You can use the get_text() method and pass your own separator as its first argument, as below:
from bs4 import BeautifulSoup
data = ['<span class="sub-title title-block"><span class="nowrap">1.2</span><span class="nowrap">TEKNA</span></span>',
        '<span class="sub-title title-block"><span class="nowrap">Amr</span><span class="nowrap">V12 5.2</span></span>',
        '<span class="sub-title title-block"></span>']
for i in data:
    soup = BeautifulSoup(i, 'lxml')
    for d in soup:
        # The separator passed to get_text() is placed between adjacent pieces of text
        print(d.get_text(" "))
Output:
1.2 TEKNA
Amr V12 5.2
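If the real page has stray whitespace around the text, or you want to skip the empty span, something like the following may help. It is only a sketch: spaced_text is a hypothetical helper name, and it assumes the built-in 'html.parser' instead of 'lxml' so it runs without extra dependencies. The strip=True argument to get_text() trims each piece of text before joining.

```python
from bs4 import BeautifulSoup

data = [
    '<span class="sub-title title-block"><span class="nowrap">1.2</span><span class="nowrap">TEKNA</span></span>',
    '<span class="sub-title title-block"><span class="nowrap">Amr</span><span class="nowrap">V12 5.2</span></span>',
    '<span class="sub-title title-block"></span>',
]


def spaced_text(html, sep=" "):
    """Hypothetical helper: join the text of all nested elements with `sep`."""
    # Assumption: 'html.parser' is used here instead of 'lxml' to avoid an extra install
    soup = BeautifulSoup(html, "html.parser")
    # strip=True trims each piece of text and drops whitespace-only pieces
    return soup.get_text(sep, strip=True)


results = [spaced_text(item) for item in data]

for text in results:
    # The empty span yields an empty string, so skip it
    if text:
        print(text)
# prints:
# 1.2 TEKNA
# Amr V12 5.2
```

Calling get_text() on the soup object itself also avoids the inner loop over its children.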