I want to periodically scrape a website with Selenium and a headless PhantomJS driver.
My boss wants me to run it "in the cloud" for reasons, and a serverless Azure Function looks like it could be a useful way to do it, instead of having to run a VM or something.
I've got my VS.net code to do the scraping mostly done, but I just realized that I'm not sure I can actually deploy it as a function: it looks like it wants me to include phantomjs.exe in my project in order to run, which may not work in an Azure Function...
Can I do what I wanted to do, or should I explore other options?
Selenium is the standard tool for automated web browser testing. On top of that, Selenium is a popular tool for web scraping. When creating a web scraper in Azure, Azure Functions is a logical candidate to run your code in.
Selenium requires the agent to run in interactive mode to execute UI tests. In the VM, open a web browser, sign in to your Azure DevOps organization, and navigate to the Agent pools tab: choose Azure DevOps, then Organization settings, then Agent pools.
PhantomJS is listed as an unsupported framework in App Service, which is the same environment Azure Functions runs on.
You can find more information here: https://github.com/projectkudu/kudu/wiki/Azure-Web-App-sandbox#unsupported-frameworks