Why won't Python regex work on a formatted string of HTML?

Question

from bs4 import BeautifulSoup
import urllib
import re

soup = urllib.urlopen("http://atlanta.craigslist.org/cto/")
soup = BeautifulSoup(soup)
souped = soup.p
print souped
m = re.search("\$.",souped)
print m.group(0)

I can download and print the HTML just fine, but the script always breaks when I add the last two lines.

I get this error:

Traceback (most recent call last):
  File "C:\Python27\Lib\site-packages\pythonwin\pywin\framework\scriptutils.py", line 323, in RunScript
    debugger.run(codeObject, __main__.__dict__, start_stepping=0)
  File "C:\Python27\Lib\site-packages\pythonwin\pywin\debugger\__init__.py", line 60, in run
    _GetCurrentDebugger().run(cmd, globals,locals, start_stepping)
  File "C:\Python27\Lib\site-packages\pythonwin\pywin\debugger\debugger.py", line 655, in run
    exec cmd in globals, locals
  File "C:\Users\Zack\Documents\Scripto.py", line 1, in <module>
    from bs4 import BeautifulSoup
  File "C:\Python27\lib\re.py", line 142, in search
    return _compile(pattern, flags).search(string)
TypeError: expected string or buffer

Thanks lots!

Roman Bodnarchuk · Accepted Answer

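You probably want re.search("\$.", str(souped)). soup.p gives you a BeautifulSoup Tag object rather than a string. print works because it converts the object to text for you, but re.search only accepts a string or buffer, hence the TypeError.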
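Here is a minimal sketch of that fix which runs without a network call or bs4 installed. FakeTag is a hypothetical stand-in for the Tag object (not a string, but convertible with str()), and the sample markup is made up for illustration, so the real page will differ.

```python
import re


class FakeTag(object):
    """Hypothetical stand-in for a bs4 Tag: not a string, but printable."""

    def __init__(self, markup):
        self.markup = markup

    def __str__(self):
        return self.markup


# Assumed sample markup; the real Craigslist page content will differ.
souped = FakeTag('<p><a href="/cto/1.html">2004 Honda Civic - $4500</a></p>')

# Passing the object itself fails with TypeError, as in the question.
try:
    re.search(r"\$.", souped)
    failed = False
except TypeError:
    failed = True

# Converting to a string first gives re.search something it can scan.
m = re.search(r"\$.", str(souped))
first_match = m.group(0) if m else None  # "$" plus the one character after it

# "." matches only a single character; r"\$\d+" takes the whole run of digits.
price = re.search(r"\$\d+", str(souped))
full_price = price.group(0) if price else None
```

Note that re.search returns None when nothing matches, so it is worth checking m before calling m.group(0). With a real Tag, souped.get_text() may be a better input than str(souped) if you only want the visible text and not the markup.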

Tags: python, regex

user1232812
