Is it possible to access my Django models inside a Scrapy pipeline, so that I can save my scraped data straight to my model?
I've seen this, but I don't really get how to set it up.
You may use the scrapy-djangoitem extension, which defines Scrapy Items from existing Django models. Once you declare an item class for a model, you can save scraped data directly through the item.
In Django 1.7+ it is better to use get_model() on the Django app registry, which is available via django.apps.apps.get_model(app_label, model_name).
Since Django only looks for models under the Python path <app>.models, you must declare a relative import in the main models.py file for each of the models inside sub-folders, to make them visible to the app.
Mine is simpler to implement, and you can pass a list, dict, or anything that can be converted into JSON. In Django 1.8 and above, there is also an ArrayField (PostgreSQL only, in django.contrib.postgres.fields) you can use.
If anyone else is having the same problem, this is how I solved it.
I added this to my Scrapy settings.py file:
    def setup_django_env(path):
        import imp, os
        from django.core.management import setup_environ
        f, filename, desc = imp.find_module('settings', [path])
        project = imp.load_module('settings', f, filename, desc)
        setup_environ(project)

    setup_django_env('/path/to/django/project/')
Note: the path above is to your Django project folder, not to the settings.py file itself.
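A caveat for newer versions: setup_environ was removed in Django 1.6, so the snippet above only runs on old releases. A minimal sketch of the same idea for Django 1.7+, using django.setup(), might look like the following. The function names and the "myproject.settings" module name are hypothetical, and it assumes the folder you pass in is the one that contains your settings package.

```python
import os
import sys


def configure_django_env(project_path, settings_module):
    """Make the Django project importable and point Django at its settings."""
    project_path = os.path.abspath(project_path)
    if project_path not in sys.path:
        sys.path.insert(0, project_path)
    # setdefault keeps any value already exported in the shell.
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", settings_module)


def setup_django(project_path, settings_module):
    """Configure the environment, then load settings and the app registry."""
    configure_django_env(project_path, settings_module)
    import django  # imported late so the path/env changes happen first
    django.setup()


# Hypothetical values; replace with your own folder and settings module.
# setup_django("/path/to/django/project/", "myproject.settings")
```

Calling setup_django(...) near the top of the Scrapy settings.py plays the same role as setup_django_env(...) in the original snippet.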
Now you will have full access to your Django models inside your Scrapy project.
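To show what that access can look like in practice, here is a rough sketch of a pipeline that writes each scraped item to a model. The class name, the "myapp.Article" label, and the assumption that item keys match the model's field names are all hypothetical; it also assumes Django has already been configured before the spider opens.

```python
class DjangoModelPipeline:
    """Scrapy item pipeline that stores each item as a row of a Django model."""

    # Hypothetical "app_label.ModelName"; item keys are assumed to match
    # the model's field names.
    model_label = "myapp.Article"

    def __init__(self):
        self.model = None

    def open_spider(self, spider):
        # Imported here so Django is only touched once the spider starts,
        # after the Django environment has been set up.
        from django.apps import apps
        self.model = apps.get_model(self.model_label)

    def process_item(self, item, spider):
        # dict(item) works for plain dicts and for scrapy.Item instances.
        self.model.objects.create(**dict(item))
        return item
```

The pipeline would then be enabled through Scrapy's ITEM_PIPELINES setting, for example {"myscraper.pipelines.DjangoModelPipeline": 300}, where the module path is again a placeholder.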