I'm trying to use the Django ORM in some standalone screen scraping scripts. I know this question has been asked before, but I'm unable to figure out a good solution for my particular problem.
I have a Django project with defined models. What I would like to do is use these models and the ORM in my scraping script. My directory structure is something like this:
project
    scrape
        #scraping scripts
        ...
        test.py
    web
        django_project
            settings.py
            ...
            #Django files
I tried doing the following in project/scrape/test.py:
import os
import sys

print os.path.join(os.path.abspath('..'), 'web', 'django_project')
sys.path.append(os.path.join(os.path.abspath('..'), 'web', 'django_project'))
print sys.path
print "-------"
os.environ['DJANGO_SETTINGS_MODULE'] = 'django_project.settings'
#print os.environ
from django_project.myapp.models import MyModel
print MyModel.objects.count()
However, I get an ImportError when I try to run test.py:
Traceback (most recent call last):
  File "test.py", line 12, in <module>
    from django_project.myapp.models import MyModel
ImportError: No module named django_project.myapp.models
One workaround I found is to create a symbolic link to ../web/govcheck in the scrape folder:
:scrape rmanocha$ ln -s ../web/govcheck ./govcheck
With this, I can then run test.py just fine. However, this seems like a hack, and more importantly, is not very portable (I will have to create this symbolic link everywhere I run this code).
So, I was wondering if anyone has any better solutions for my problem?
Django is one of the popular Python frameworks; critics have argued that it is bloated. In practice it is quite modular, and its components can be used independently.
Yes, that is possible, but many of the ways Django helps with web development are built on its models. For example, from a model Django can generate a ModelForm that automates rendering HTML forms mapped to the model, validating user input, and saving it to the database.
Django doesn't have a separate package for its ORM, but it's still possible to use it on its own.
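As a rough illustration of using the ORM on its own, the sketch below configures Django in-process with settings.configure() instead of a settings module. It assumes Django 1.7 or later (for django.setup()); the app label "myapp", the SQLite file name, and the helper names build_settings and configure_orm are placeholders chosen for this example, not anything from the project above.

```python
import os


def build_settings(db_path):
    """Return a minimal settings dict for ORM-only use.

    "myapp" is a hypothetical app label; replace it with the app
    that actually defines your models.
    """
    return {
        "INSTALLED_APPS": ["myapp"],
        "DATABASES": {
            "default": {
                "ENGINE": "django.db.backends.sqlite3",
                "NAME": os.path.abspath(db_path),
            }
        },
    }


def configure_orm(db_path="scrape.sqlite3"):
    """Configure Django so that models can be imported afterwards."""
    # Imported lazily so this module loads even where Django is absent.
    import django
    from django.conf import settings

    if not settings.configured:
        settings.configure(**build_settings(db_path))
    django.setup()  # assumption: Django 1.7 or later
```

After calling configure_orm(), model imports such as "from myapp.models import MyModel" should work, provided the app package is importable from sys.path.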
Found an easy way to reuse an existing Django app's settings in a console script (setup_environ was deprecated in Django 1.4 and removed in 1.6, so this applies to older versions only):
from django.core.management import setup_environ
import settings
setup_environ(settings)
from myapp.models import Object
for o in Object.objects.all():
    print o
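On newer Django versions, where setup_environ is not available, a roughly equivalent bootstrap might look like the sketch below. It assumes Django 1.7 or later and the settings module name from the question (django_project.settings); the helper names prepare_environment and bootstrap are made up for this example.

```python
import os
import sys


def prepare_environment(web_dir, settings_module="django_project.settings"):
    """Put web_dir on sys.path and point Django at the settings module."""
    web_dir = os.path.abspath(web_dir)
    if web_dir not in sys.path:
        sys.path.insert(0, web_dir)
    # setdefault keeps any value already exported in the shell.
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", settings_module)
    return web_dir


def bootstrap(web_dir):
    """Initialise Django; call this before importing any models."""
    prepare_environment(web_dir)
    # Imported lazily so this module loads even where Django is absent.
    import django
    django.setup()  # assumption: Django 1.7 or later
```

A script would call bootstrap() with the path to project/web and only then run "from django_project.myapp.models import MyModel".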
Are you sure it shouldn't be:
sys.path.append(os.path.join(os.path.abspath('..'), 'web'))
Also, make sure there's an __init__.py file (empty is fine) in project/web/django_project.
P.S. I'd recommend feeding os.path.join's output to os.path.abspath instead of the other way.
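A minimal sketch of that suggestion, assuming the directory layout from the question: join the path components first, then normalise with os.path.abspath. Anchoring on the script's own location rather than the current working directory is an extra assumption here, and web_dir_for is a hypothetical helper name.

```python
import os


def web_dir_for(script_path):
    """Return the absolute path of project/web for a script in project/scrape."""
    script_dir = os.path.dirname(os.path.abspath(script_path))
    # Join first, then normalise: abspath collapses the ".." component.
    return os.path.abspath(os.path.join(script_dir, "..", "web"))
```

In test.py this could be used as sys.path.append(web_dir_for(__file__)), which keeps the import working regardless of the directory the script is launched from.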