A short description of my working environment: Windows 7 x64, Python 2.7 x64, Scrapy 0.22, cx_Freeze 4.3.2.
First, I developed a simple crawl spider, and it works fine. Then, using the core Scrapy API, I created an external script, main.py, which runs the spider, and it also works as required. Here is the code of the script:
# external main.py using the Scrapy core API; 'test' is just the replaced name of my project
from twisted.internet import reactor
from scrapy.crawler import Crawler
from scrapy import log, signals
from scrapy.utils.project import get_project_settings
from test.spiders.testSpider import TestSpider
# imported explicitly so the project modules are seen by the freezer
from test import settings, pipelines

spider = TestSpider(domain='test.com')
settings = get_project_settings()
crawler = Crawler(settings)
# stop the reactor once the spider finishes
crawler.signals.connect(reactor.stop, signal=signals.spider_closed)
crawler.configure()
crawler.crawl(spider)
crawler.start()
log.start()
reactor.run()  # blocks until reactor.stop() is called
So now I'm trying to build a binary out of all of this with cx_Freeze, using a setup.py like the one in another topic here. Here is the code:
from cx_Freeze import setup, Executable

includes = ['scrapy', 'pkg_resources', 'lxml.etree', 'lxml._elementpath']

build_options = {'compressed': True,
                 'optimize': 2,
                 'namespace_packages': ['zope', 'scrapy', 'pkg_resources'],
                 'includes': includes,
                 'excludes': []}

executable = Executable(script='main.py',
                        copyDependentFiles=True,
                        includes=includes)

setup(name='Stand-alone scraper',
      version='0.1',
      description='Stand-alone scraper',
      options={'build_exe': build_options},
      executables=[executable])
It compiles into an exe file without errors. The problems start when I try to run it:
Traceback (most recent call last):
File "C:\Python27\lib\site-packages\cx_Freeze\initscripts\Console.py", line 27, in <module>
exec code in m.__dict__
File "main.py", line 2, in <module>
from scrapy.crawler import Crawler
File "C:\Python27\lib\site-packages\scrapy\__init__.py", line 6, in <module>
__version__ = pkgutil.get_data(__package__, 'VERSION').strip()
File "C:\Python27\lib\pkgutil.py", line 591, in get_data
return loader.get_data(resource_name)
IOError: [Errno 2] No such file or directory: 'scrapy\\VERSION'
I solved this problem by moving the scrapy\VERSION file from the original source (Python\Lib\site-packages\scrapy) into library.zip\scrapy in the build folder. On the second run of main.exe I got another message:
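Instead of moving the VERSION file by hand after every build, the copy can be automated from setup.py. This is only a sketch, assuming your cx_Freeze version supports the 'zip_includes' build option, which places extra data files inside library.zip:

```python
# Hypothetical addition to build_options in setup.py: bundle Scrapy's
# VERSION data file into library.zip\scrapy automatically, so
# pkgutil.get_data('scrapy', 'VERSION') can find it in the frozen app.
import os
import scrapy

scrapy_version_file = os.path.join(os.path.dirname(scrapy.__file__), 'VERSION')

build_options = {
    # ... the other options shown above ...
    # each entry is (source_path, target_path_inside_library.zip)
    'zip_includes': [(scrapy_version_file, os.path.join('scrapy', 'VERSION'))],
}
```

If 'zip_includes' is unavailable in your cx_Freeze release, the manual move described above remains the workaround.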
Traceback (most recent call last):
File "C:\Python27\lib\site-packages\cx_Freeze\initscripts\Console.py", line 27, in <module>
exec code in m.__dict__
File "main.py", line 11, in <module>
crawler = Crawler(settings)
File "C:\Python27\lib\site-packages\scrapy\crawler.py", line 20, in __init__
self.stats = load_object(settings['STATS_CLASS'])(self)
File "C:\Python27\lib\site-packages\scrapy\utils\misc.py", line 42, in load_object
raise ImportError("Error loading object '%s': %s" % (path, e))
ImportError: Error loading object 'scrapy.statscol.MemoryStatsCollector': No module named statscol
I didn't find any solution for this, so I just tried importing the module from the error message in my main.py. In short, it didn't work: every new import produced a new message about another missing module (in total I tried to import 15 :) modules), until I got an error about the aes module in cryptography. I also tried cx_Freeze alternatives like py2exe and PyInstaller, with the same result.
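Rather than adding missing modules one by one, the whole list can be generated up front. A minimal sketch, assuming the goal is to feed every submodule of a package into cx_Freeze's 'includes' option (the helper name find_submodules is mine, not part of any library):

```python
# Enumerate a package and all of its submodules as dotted names,
# so the full list can be passed to build_options['includes'].
import importlib
import pkgutil


def find_submodules(package_name):
    """Return the package plus all of its submodules as dotted names."""
    package = importlib.import_module(package_name)
    names = [package_name]
    # walk_packages yields (finder, name, ispkg) for every submodule
    for _, name, _ in pkgutil.walk_packages(package.__path__,
                                            prefix=package_name + '.'):
        names.append(name)
    return names


# e.g. includes = find_submodules('scrapy')
```

This sidesteps the freeze tool's static import analysis, which cannot see modules that Scrapy loads dynamically via load_object from setting strings such as 'scrapy.statscol.MemoryStatsCollector'.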
Can anybody help me to solve this problem? Thank you for reading to this point.
Replace your cx_Freeze setup code with this:
import sys
from cx_Freeze import setup, Executable

build_exe_options = {"packages": ["os", "twisted", "scrapy", "test"],
                     "excludes": ["tkinter"],
                     "include_msvcr": True}

base = None

setup(name="MyScript",
      version="0.1",
      description="Demo",
      options={"build_exe": build_exe_options},
      executables=[Executable("C:\\MyScript", base=base)])
The difference is the "packages" option: it includes each listed package in its entirety (every submodule, including ones only loaded dynamically), so you can access all functions from them.
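After replacing setup.py, rebuild and run the new exe from the build output folder (the exact folder name varies with your platform and Python version; the one below is what I'd expect on Windows 7 x64 with Python 2.7 x64):

```shell
python setup.py build
build\exe.win-amd64-2.7\main.exe
```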