
Problems running beautifulsoup4 within Apache/mod_python/Django

I was trying to render an HTML page on the fly using BeautifulSoup version 4 in Django (on Apache2 with mod_python). However, as soon as I pass any HTML string to the BeautifulSoup constructor (see code below), the browser just hangs waiting for the web server. Equivalent code run from the CLI works like a charm, so I'm guessing the problem is related to the environment BeautifulSoup runs in, in this case Django + Apache + mod_python.

import bs4
import django.shortcuts as shortcuts

def test(request):
    s = bs4.BeautifulSoup('<b>asdf</b>')
    return shortcuts.render_to_response('test.html', {})

I installed BeautifulSoup 4 using pip (pip install beautifulsoup4). I also tried installing BeautifulSoup 3 from the standard Debian package (apt-get install python-beautifulsoup), and with that the following equivalent code works fine, both from the browser and from the CLI.

from BeautifulSoup import BeautifulSoup
import django.shortcuts as shortcuts

def test(request):
    s = BeautifulSoup('<b>asdf</b>')
    return shortcuts.render_to_response('test.html', {})

I have looked in Apache's access and error logs, and they show nothing about what happens to the stalled request. I have also checked /var/log/syslog and /var/log/messages, but found no further information.

Here's the Apache configuration I used:

<VirtualHost *:80>
    DocumentRoot /home/nandersson/src
    <Directory /home/nandersson/src>
        SetHandler python-program
        PythonHandler django.core.handlers.modpython
        SetEnv DJANGO_SETTINGS_MODULE app.settings
        PythonOption django.root /home/nandersson/src
        PythonDebug On
        PythonPath "['/home/nandersson/src'] + sys.path"
    </Directory>

    <Location "/media/">
        SetHandler None
    </Location>
    <Location "/app/poc/">
        SetHandler None
    </Location>
</VirtualHost>

I'm not sure how to debug this further, or whether it's a bug at all. Does anyone have ideas on how to get to the bottom of this, or has anyone run into similar problems?

Niklas9 asked Dec 01 '22 05:12


1 Answer

I'm using Apache2 with mod_python. I solved the hang by explicitly passing 'html.parser' as the second argument to the BeautifulSoup constructor.

s = bs4.BeautifulSoup('<b>asdf</b>', 'html.parser')
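
For context, when no parser is named, bs4 picks the best one it finds installed (lxml first, then html5lib, then Python's built-in html.parser), so the parser used under Apache may not be the one you expect. Whether that auto-selection is what causes the hang here is an assumption, not something confirmed above. A minimal sketch of pinning the parser and checking which tree builder was actually used (make_soup is a hypothetical helper name, not part of bs4):

```python
import bs4


def make_soup(markup):
    """Parse markup with Python's built-in parser instead of letting
    bs4 auto-select one (hypothetical helper, for illustration only)."""
    return bs4.BeautifulSoup(markup, 'html.parser')


soup = make_soup('<b>asdf</b>')

# The class name of the tree builder shows which parser bs4 ended up using.
builder_name = type(soup.builder).__name__

# The parsed content is available as usual.
bold_text = soup.b.string
```

Inspecting type(soup.builder).__name__ the same way on a soup created without a parser argument, from the CLI, should show which parser bs4 would otherwise choose on that machine.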
chx3 answered Dec 04 '22 05:12
