I am running a Django service that starts chromedriver through Selenium and scrapes a website for data. The Django service is called over HTTP by a separate Java service.
Here is the code:
views.py
from selenium import webdriver

path_to_chromedriver = '/path/to/chromedriver'
browser = webdriver.Chrome(executable_path = path_to_chromedriver)
try:
    response = get_data(browser)
except Exception as e:
    print str(e)
finally:
    browser.close()
    browser.quit()
scraper.py
from selenium.common.exceptions import NoSuchElementException

def get_data(browser):
    try:
        .
        .
        .
        for i in range(1,6):
            try:
                .
                .
                .
                return "success data"
            except NoSuchElementException:
                browser.back()
        # all attempts failed
        raise Exception("No results found")
    except Exception as e:
        print str(e)
        raise
The problem is that after the Java service has finished making all its calls and the whole process is complete, between 25 and 50 orphaned Chrome processes remain in RAM, occupying over 1 GB. Am I doing anything wrong here?
You may have noticed that Google Chrome will often have more than one process open, even if you only have one tab open. This occurs because Google Chrome deliberately separates the browser, the rendering engine, and the plugins from each other by running them in separate processes.
But if you open the Task Manager, you may be surprised to see many Google Chrome processes running. I could see 18 running even though I had only a single window open with 4 tabs. This is because Chrome opens a separate process for each tab, extension, and subframe.
Click the “≡” button in the upper-right corner of the Chrome browser window and select Exit. This closes all tabs and windows and ends the process.
That's an old issue. What works for me, although dirty, is to add a sleep before quitting:
import time

time.sleep(5)  # workaround: pause before quitting
browser.quit()
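One way to keep that workaround in a single place is a small context manager, so every request path goes through the same cleanup. The following is only a sketch under assumptions: managed_browser, factory and settle_seconds are hypothetical names that do not appear in the question, and the 5-second default simply mirrors the sleep above. It calls only quit(), which Selenium documents as closing every associated window and ending the driver session, so the separate close() call is left out. Whether this removes every orphaned process in your setup is not guaranteed.

```python
import time
from contextlib import contextmanager


@contextmanager
def managed_browser(factory, settle_seconds=5):
    """Create a browser with factory() and always quit it afterwards.

    factory and settle_seconds are hypothetical parameters for this sketch:
    factory is any zero-argument callable returning an object with quit().
    """
    browser = factory()
    try:
        yield browser
    finally:
        # Pause first (the workaround from the answer above), then quit.
        # This runs whether the body finished normally or raised.
        time.sleep(settle_seconds)
        browser.quit()
```

In the view this might be used as `with managed_browser(lambda: webdriver.Chrome(executable_path=path_to_chromedriver)) as browser:` followed by `response = get_data(browser)` in the body. An exception raised by get_data still propagates to the caller, but only after the cleanup has run.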