I have a script with these two functions:
# Getting content of each page
def GetContent(url):
    response = requests.get(url)
    return response.content

# Extracting the sites
def CiteParser(content):
    soup = BeautifulSoup(content)
    print "---> site #: ",len(soup('cite'))
    result = []
    for cite in soup.find_all('cite'):
        result.append(cite.string.split('/')[0])
    return result
When I run the program, I get the following error:
result.append(cite.string.split('/')[0])
AttributeError: 'NoneType' object has no attribute 'split'
Output Sample:
URL: <URL That I use to search 'can be google, bing, etc'>
---> site #: 10
site1.com
.
.
.
site10.com
URL: <URL That I use to search 'can be google, bing, etc'>
File "python.py", line 49, in CiteParser
result.append(cite.string.split('/')[0])
AttributeError: 'NoneType' object has no attribute 'split'
cite.string can be None. BeautifulSoup returns None for .string when the tag is empty or when it contains more than one child (for example, nested tags), and None has no split method. So check that the string is not None before splitting it:
# Extracting the sites
def CiteParser(content):
    soup = BeautifulSoup(content)
    #print soup
    print "---> site #: ",len(soup('cite'))
    result = []
    for cite in soup.find_all('cite'):
        # .string is None for empty tags and for tags with several children
        if cite.string is not None:
            result.append(cite.string.split('/')[0])
            print cite
    return result
for cite in soup.find_all('cite'):
    if( (cite.string is None) or (len(cite.string) == 0)):
        continue
    result.append(cite.string.split('/')[0])
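The guard used in both answers can be pulled into a small helper. The following is only a sketch in Python 3 syntax: the name extract_hosts is hypothetical, and SimpleNamespace objects stand in for BeautifulSoup tags so the snippet runs without any third-party package. It assumes each item exposes a .string attribute, as bs4 tags do.

```python
from types import SimpleNamespace


def extract_hosts(cites):
    """Return the text before the first '/' for each tag-like object.

    Items whose .string is None or empty are skipped instead of
    raising AttributeError.
    """
    hosts = []
    for cite in cites:
        text = cite.string  # may be None (empty tag or several children)
        if not text:
            continue  # covers both None and the empty string
        hosts.append(text.split('/')[0])
    return hosts


# Hypothetical stand-ins for the <cite> tags returned by soup.find_all('cite')
sample = [
    SimpleNamespace(string="site1.com/some/page"),
    SimpleNamespace(string=None),
    SimpleNamespace(string=""),
    SimpleNamespace(string="site2.com"),
]

print(extract_hosts(sample))
```

With real BeautifulSoup output, the call would presumably look like extract_hosts(soup.find_all('cite')). If the text of tags with nested children should be kept rather than skipped, cite.get_text() is worth considering, since it returns all text inside the tag as one string.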