
How to run Selenium Webdriver correctly on Heroku with a Rails app

I'm implementing a very basic scraper in my app with the watir gem. It runs perfectly fine locally, but when I run it on Heroku, it triggers this error: Webdrivers::BrowserNotFound: Failed to find Chrome binary.

I added the google-chrome and chromedriver buildpacks to my app to tell Selenium where to find Chrome on Heroku, but it still does not work. Moreover, when I print the options, the binary seems to be correctly set:

#<Selenium::WebDriver::Chrome::Options:0x0000558bdf7ecc30 @args=#<Set: {"--user-data-dir=/app/tmp/chrome", "--no-sandbox", "--window-size=1200x600", "--headless", "--disable-gpu"}>, @binary="/app/.apt/usr/bin/google-chrome-stable", @prefs={}, @extensions=[], @options={}, @emulation={}, @encoded_extensions=[]>

These are my app's buildpack URLs:

1. heroku/ruby
2. heroku/google-chrome
3. heroku/chromedriver

This is my code:

def new_browser(downloads: false)

  options = Selenium::WebDriver::Chrome::Options.new

  chrome_dir = File.join Dir.pwd, %w(tmp chrome)
  FileUtils.mkdir_p chrome_dir
  user_data_dir = "--user-data-dir=#{chrome_dir}"
  options.add_argument user_data_dir

  if chrome_bin = ENV["GOOGLE_CHROME_SHIM"]
    options.add_argument "--no-sandbox"
    options.binary = chrome_bin
  end

  options.add_argument "--window-size=1200x600"
  options.add_argument "--headless"
  options.add_argument "--disable-gpu"

  browser = Watir::Browser.new :chrome, options: options

  if downloads
    downloads_dir = File.join Dir.pwd, %w(tmp downloads)
    FileUtils.mkdir_p downloads_dir

    bridge = browser.driver.send :bridge
    path = "/session/#{bridge.session_id}/chromium/send_command"
    params = { behavior: "allow", downloadPath: downloads_dir }
    bridge.http.call(:post, path, cmd: "Page.setDownloadBehavior",
                                  params: params)
  end
  browser
end

Any idea how to fix this? I checked many similar issues on different websites but did not find anything that works.

Rémi JUHE asked Aug 22 '19 12:08


3 Answers

I also worked on the same thing for the last two days and, as you said, tried a lot of different things. I finally got it working.

The problem is that on Heroku the Chrome binary is installed in a different path than the default system locations. In the source code of the webdrivers gem I found that it looks for Chrome either in the default system path (for Linux, macOS, Windows), which is why it works locally, or in the path defined in the WD_CHROME_PATH environment variable. To set the path on Heroku, we must set this environment variable:

"WD_CHROME_PATH": "/app/.apt/usr/bin/google-chrome"

It must be google-chrome, not google-chrome-stable as shown in some examples.

That is, just run this from a terminal:

heroku config:set WD_CHROME_PATH=/app/.apt/usr/bin/google-chrome
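
If the error persists after setting the variable, it may help to confirm from a Rails console on the dyno that the path really points at an executable file. The following is only a minimal sketch in plain Ruby; the helper name chrome_path_problem is hypothetical and is not part of the webdrivers gem.

```ruby
# Hypothetical helper (not part of webdrivers): returns nil when the path in
# WD_CHROME_PATH looks usable, otherwise a short description of the problem.
def chrome_path_problem(env = ENV)
  path = env["WD_CHROME_PATH"]
  return "WD_CHROME_PATH is not set" if path.nil? || path.empty?
  return "#{path} does not exist" unless File.exist?(path)
  return "#{path} is not executable" unless File.executable?(path)

  nil
end

# Report the result for the current environment.
if (problem = chrome_path_problem)
  warn "Chrome binary check failed: #{problem}"
else
  puts "WD_CHROME_PATH looks usable"
end
```

A check like this makes the google-chrome versus google-chrome-stable mix-up visible immediately, since the wrong name shows up as a "does not exist" message instead of a browser start-up failure.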
Nedim answered Nov 18 '22 12:11


None of the solutions worked for me (Heroku-18 stack, with the 'https://github.com/heroku/heroku-buildpack-google-chrome.git' and 'https://github.com/heroku/heroku-buildpack-chromedriver' buildpacks).

I tried all kinds of solutions and finally found a fail-proof way to debug it yourself.

It involves a couple of resources: https://www.simon-neutert.de/2018/watir-chrome-heroku/ and https://github.com/jormon/minimal-chrome-on-heroku/blob/master/runner.thor in particular.

Check where your actual binary and drivers are on Heroku:

$ heroku run bash
~ $ which chromedriver
/app/.chromedriver/bin/chromedriver
~ $ which google-chrome
/app/.apt/usr/bin/google-chrome
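
The same lookup can be done from Ruby, for example inside heroku run rails console, which is handy when you want to log the result at boot. This is a rough sketch that mirrors what the shell's which does; the which method below is a hypothetical helper written for illustration, not a Ruby built-in.

```ruby
# Hypothetical helper mirroring the shell's `which`: returns the first
# executable file named `cmd` found in the PATH directories, or nil.
def which(cmd, path = ENV.fetch("PATH", ""))
  path.split(File::PATH_SEPARATOR).reject(&:empty?).each do |dir|
    candidate = File.join(dir, cmd)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Print where (or whether) each binary is found in the current environment.
%w[google-chrome chromedriver].each do |name|
  puts "#{name}: #{which(name) || 'not found on PATH'}"
end
```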

The shims that the buildpacks set up for me didn't work. In fact, even if you set the buildpacks' shim variables on Heroku to something different, the buildpacks reset them, so you lose the new value (see here: https://github.com/heroku/heroku-buildpack-google-chrome/blob/master/bin/compile ). So I made new shims under different names:

$ heroku config:set GOOGLE_CHROME_REAL=/app/.apt/usr/bin/google-chrome
$ heroku config:set CHROME_DRIVER_REAL=/app/.chromedriver/bin/chromedriver

Then, I modified the browser initializer (from: https://github.com/jormon/minimal-chrome-on-heroku/blob/master/runner.thor ):

def new_browser(downloads: false)
    require 'watir'
    require 'webdrivers'
    options = Selenium::WebDriver::Chrome::Options.new

    # make a directory for chrome if it doesn't already exist
    chrome_dir = File.join Dir.pwd, %w(tmp chrome)
    FileUtils.mkdir_p chrome_dir
    user_data_dir = "--user-data-dir=#{chrome_dir}"
    # add the option for user-data-dir
    options.add_argument user_data_dir

    # let Selenium know where to look for chrome if we have a hint from
    # heroku. chromedriver-helper & chrome seem to work out of the box on osx,
    # but not on heroku.
    if chrome_bin = ENV["GOOGLE_CHROME_REAL"]
        Selenium::WebDriver::Chrome.path = chrome_bin
    end
    if chrome_driver = ENV["CHROME_DRIVER_REAL"]
        Selenium::WebDriver::Chrome.driver_path = chrome_driver
    end

    # headless!
    options.add_argument "--window-size=1200x600"
    options.add_argument "--headless"
    options.add_argument "--disable-gpu"

    # make the browser
    browser = Watir::Browser.new :chrome, options: options

    # setup downloading options
    if downloads
      # make download storage directory
      downloads_dir = File.join Dir.pwd, %w(tmp downloads)
      FileUtils.mkdir_p downloads_dir

      # tell the bridge to use downloads
      bridge = browser.driver.send :bridge
      path = "/session/#{bridge.session_id}/chromium/send_command"
      params = { behavior: "allow", downloadPath: downloads_dir }
      bridge.http.call(:post, path, cmd: "Page.setDownloadBehavior",
                                    params: params)
    end
    browser
end

Hope this helps others.

Cody answered Nov 18 '22 10:11


I tried to solve this for a while with different approaches, but none of them worked. Then I checked the webdrivers source code and found that you need to set the "WD_CHROME_PATH" env variable for it to work. I'm attaching my full setup here, since this cost me a few hours to debug and fix.

spec_helper.rb

require 'webdrivers'
require 'capybara/rspec'

 # The Heroku buildpack puts the Chrome binary in a non-standard location specified by GOOGLE_CHROME_SHIM
 chrome_bin = ENV.fetch('GOOGLE_CHROME_SHIM', nil)

 options = {}
 options[:args] = ['headless', 'disable-gpu', 'window-size=1280,1024']
 options[:binary] = chrome_bin if chrome_bin

 Capybara.register_driver :headless_chrome do |app|
   Capybara::Selenium::Driver.new(app,
      browser: :chrome,
      options: Selenium::WebDriver::Chrome::Options.new(options)
    )
 end

 Capybara.javascript_driver = :headless_chrome
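
One way to check the local versus Heroku behaviour of this setup without starting a browser is to build the options hash in a small function. The sketch below uses plain Ruby only; the method name headless_chrome_options is hypothetical, and the resulting hash is meant to be passed to Selenium::WebDriver::Chrome::Options.new exactly as in the spec_helper above.

```ruby
# Hypothetical helper: builds the same options hash as the spec_helper,
# adding :binary only when GOOGLE_CHROME_SHIM is set (i.e. on Heroku).
def headless_chrome_options(env = ENV)
  options = { args: ["headless", "disable-gpu", "window-size=1280,1024"] }
  chrome_bin = env["GOOGLE_CHROME_SHIM"]
  options[:binary] = chrome_bin if chrome_bin && !chrome_bin.empty?
  options
end

# Locally the variable is normally unset, so no :binary key is added and
# the default Chrome lookup is used.
p headless_chrome_options({})
```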

Gemfile

group :test do
  gem 'capybara'
  gem 'timecop'
  gem 'selenium-webdriver'
  gem 'webdrivers'
end

app.json

{
  "name": "evocal",
  "repository": "https://github.com/zeitdev/evocal",
  "environments": {
    "test": {
      "addons":[
        "heroku-postgresql:in-dyno"
      ],
      "scripts": {
        "test-setup": "bundle exec rake db:seed",
        "test": "bundle exec rspec"
      },
      "buildpacks": [
        { "url": "heroku/ruby" },
        { "url": "https://github.com/heroku/heroku-buildpack-google-chrome" },
        { "url": "https://github.com/heroku/heroku-buildpack-chromedriver" },
        { "url": "heroku/nodejs" }
      ],
      "env": {
        "WD_CHROME_PATH": "/app/.apt/opt/google/chrome/chrome"
      }
    }
  }
}

I don't yet fully understand how selenium, webdriver and the gem interact with each other. Some also wrote that you can leave out one of the buildpacks. But this works, at least for now :-D.

Hendrik answered Nov 18 '22 10:11