I have a Debian Linux server that I use for a variety of things. I want it to run some web-scraping jobs that I need done regularly.
This code can be found here.
import sys
from PyQt4.QtGui import *
from PyQt4.QtCore import *
from PyQt4.QtWebKit import *

class Render(QWebPage):
    def __init__(self, url):
        self.app = QApplication(sys.argv, False) # Line updated based on mata's answer
        QWebPage.__init__(self)
        self.loadFinished.connect(self._loadFinished)
        self.mainFrame().load(QUrl(url))
        self.app.exec_()

    def _loadFinished(self, result):
        self.frame = self.mainFrame()
        self.app.quit()
A simple test of it would look like this:
url = 'http://example.com'
print Render(url).frame.toHtml()
On the call to the constructor it dies with this message (it's printed to stdout, not an uncaught exception).
: cannot connect to X server
How can I use Python (2.7), Qt4, and WebKit on a headless server? Nothing ever needs to be displayed, so I can tweak any settings that need to be tweaked.
I've looked into alternatives, but this is the best fit for me and my projects. If I did have to install an X server, how could I do it with minimal overhead?
On GitLab CI/CD, adding ['-platform', 'minimal'] and using xvfb didn't work for me. Instead, I set the variable QT_QPA_PLATFORM: "offscreen".
See https://stackoverflow.com/a/55442821/6000005
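Outside of a CI configuration, the same variable can be set from Python before the QApplication is created. This is only a minimal sketch, assuming Qt 5 (QT_QPA_PLATFORM is a Qt 5 setting and does not help with the PyQt4 code in the question). The helper name use_offscreen_platform is hypothetical.

```python
import os


def use_offscreen_platform(env=os.environ):
    """Select Qt 5's "offscreen" platform plugin unless one is already set.

    Hypothetical helper: call it before constructing QApplication,
    because Qt reads QT_QPA_PLATFORM when the application object is created.
    """
    env.setdefault("QT_QPA_PLATFORM", "offscreen")
    return env["QT_QPA_PLATFORM"]
```

Using setdefault keeps any value already exported by the shell or the CI job, so the script does not override a deliberate choice.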
If PyQt5 is an option, Qt 5 has the "minimal" platform plugin. To use it, modify the argv passed to QApplication to include ['-platform', 'minimal'].
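As a rough sketch of that argv change, assuming PyQt5 is installed: the helper with_platform below is a hypothetical name, and the QApplication call is left as a comment so that the snippet itself runs without Qt.

```python
import sys


def with_platform(argv, platform="minimal"):
    """Return a copy of argv that asks Qt 5 for the given platform plugin.

    Hypothetical helper: leaves argv unchanged if the caller already
    passed a -platform option.
    """
    argv = list(argv)
    if "-platform" not in argv:
        argv += ["-platform", platform]
    return argv


# Usage, assuming PyQt5 is available on the server:
#   from PyQt5.QtWidgets import QApplication
#   app = QApplication(with_platform(sys.argv))
```

Copying argv rather than appending to sys.argv in place keeps the process's own argument list untouched for any other code that reads it.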