Mechanize: too many values to unpack (expected 2)

Tags:

mechanize-python

I have tried to write the following code, I am trying to write a code in Python 3.7 that just opens a web browser and the website fed to it in the Command Line:

Example.py

import sys

from mechanize import Browser
browser = Browser()

browser.set_handle_equiv(True)
browser.set_handle_gzip(True)
browser.set_handle_redirect(True)
browser.set_handle_referer(True)
browser.set_handle_robots(False)

# pretend you are a real browser
browser.addheaders = [('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36')]

listOfSites = sys.argv[1:]
for i in listOfSites:
    browser.open(i)

I have entered the following command in the cmd:

python Example.py https://www.google.com

And I have the following traceback:

Traceback (most recent call last):
  File "Example.py", line 19, in <module>
    browser.open(i)
  File "C:\Python37\lib\site-packages\mechanize\_mechanize.py", line 253, in open
    return self._mech_open(url_or_request, data, timeout=timeout)
  File "C:\Python37\lib\site-packages\mechanize\_mechanize.py", line 283, in _mech_open
    response = UserAgentBase.open(self, request, data)
  File "C:\Python37\lib\site-packages\mechanize\_opener.py", line 188, in open
    req = meth(req)
  File "C:\Python37\lib\site-packages\mechanize\_urllib2_fork.py", line 1104, in do_request_
    for name, value in self.parent.addheaders:
ValueError: too many values to unpack (expected 2)

I am very new to Python. This is my first code here. I am stuck with the above traceback but haven't found the solution yet. I have searched for a lot of questions on SO community as well but they didn't seem to help. What should I do next?

UPDATE:

As suggested by @Jean-François-Fabre, in his answer, I have added 'User-agent' to the header, now there is no traceback, but still there is an issue where my link cannot be opened in the browser.

Here is how the addheader looks like now:

browser.addheaders = [('User-agent', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36')]

831

asked Feb 05 '19 06:02

Code_Ninja

4 Answers

I have just found out a way around to this issue even if the above issue still exists. I am posting this only to let the readers know that we can do it this way too:

Instead of using the mechanize package, we can use the webbrowser package and write the following python code in the Example.py:

import webbrowser
import sys

#This is an upgrade suggested by @Jean-François Fabre
listOfSites = sys.argv[1:]

for i in listOfSites:
    webbrowser.open_new_tab(i)

Then we can run this python code by executing the following command in the terminal/command prompt:

python Example.py https://www.google.com https://www.bing.com

This command mentioned above in the example will open two sites at a time. One is Google and the other is Bing

165

answered Oct 22 '22 21:10

Code_Ninja

Here you go :)

import sys
from mechanize import Browser, Request


browser = Browser()

browser.set_handle_equiv(True)
browser.set_handle_gzip(True)
browser.set_handle_redirect(True)
browser.set_handle_referer(True)
browser.set_handle_robots(False)

# setup your header, add anything you want
header = {'User-Agent': 'Mozilla/5.0 (Windows NT 5.1; rv:14.0) Gecko/20100101 Firefox/14.0.1', 'Referer': 'http://whateveritis.com'}


url_list = sys.argv[1:]
for url in url_list:
    request = Request(url=url, data=None, headers=header)
    response = browser.open(request)
    print(response.read())
    response.close()

answered Oct 22 '22 22:10

han solo

I don't know mechanize at all, but the traceback and variable names (and some googling) can help.

You're initializing addheaders with a list of strings. Some other examples (ex: Mechanize Python and addheader method - how do I know the newest headers?) show a list of tuples, which seem to match the traceback. Ex:

browser.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]

so it unpacks properly into name and value in loop

for name, value in whatever.addheaders:

You have to add the 'User-agent' property name (you can pass other less common parameters than the browser name)

answered Oct 22 '22 22:10

Jean-François Fabre

Let me try to answer your question in parts:

You're correct in adding "Browser Headers". Many servers might outright drop your connection as it's a definite sign of being crawled by a bot.
mechanize as stated by the docs "is a tool for programmatic web browsing".
This means it is primarily used to crawl webpages, parse their contents, fill forms, click on things, send requests, but not using a "real" web browser, with parts such as CSS rendering. For example you cannot open a page and take a screenshot, as there's not something "rendered", and to achieve this you would need to save the page, and render it using another solution.
If this suits you needs, check headless browsers as a technology, there are a lot of them. In the Python ecosystem, other than mechanize, I'd check "headless chromium", as "phantomjs" is unfortunately discontinued.

But if I understand correctly, you need the actual web browser to open up with the webpage, right? For this reason, you actually need, well, a browser in your system to take care of that!

Case 1 : Use your native system's browser

Find out where your browser's executable lies in your system. For example, my Firefox executable lies in "C:\Program Files\Mozilla Firefox\firefox.exe" and add it to your PATH.

As you're using Windows, use the start menu to navigate to Advanced System Settings --> Advanced --> Environment Variables, and add the path above to your PATH variable.

If you're using Linux export PATH=$PATH:"/path/to/your/browser" will take care of things.

Then, your code can run as simply as

import subprocess
import sys

listOfSites = sys.argv[1:]
links = ""
for i in listOfSites:
    links += "-new-tab " + i
print(links)
subprocess.run(["firefox", links])

Firefox will open new windows, one for each of the links you have provided.

Case 2 : Use selenium

Then comes Selenium, which in my opinion is the most mature solution to browser-related problems, and what most people use. I've used it in a production setting with very good results. It provides both the UI/frontend of a browser that renders the webpages, but also allows you to programmatically work with these webpages.

It needs some setup, (for example, if you're using Firefox, you'll need to download the geckodriver executable from their releases page, and then add it your PATH variable again.

Then you define your webdriver, spawn one for each of the websites you need to visit, and get the webpage. You can also take screenshot, as a proof that the page has been rendered correctly.

from selenium import webdriver
import sys

listOfSites = sys.argv[1:]
for i in listOfSites:
    driver = webdriver.Firefox()
    driver.get('http://'+i)
    driver.save_screenshot(i+'-screenshot.png')

# When you're finished
# driver.quit()

I've tested both of these code snippets, and they work as expected. Please let me know how all these sound, and if you need any more additional information..! ^^

answered Oct 22 '22 21:10

hyperTrashPanda

Related questions
                            
                                is it possible to make input() throw errors?
                            
                                Redirecting python output to a file causes UnicodeEncodeError on Windows
                            
                                Get precise decimal string representation of Python Decimal?
                            
                                Pygame: Rescale pixel size
                            
                                Running `connect_get_namespaced_pod_exec` using kubernetes client corev1api gives bad request
                            
                                Dynamically change size of arrowheads in networkx based on list/dict
                            
                                Python convert the day of year to month on an axis
                            
                                Convert categorical data back to numbers using keras utils to_categorical
                            
                                Google Drive API v3 Change File Permissions and Get Publicly Shareable Link (Python)
                            
                                Multithreading with spacy: Is joblib necessary?
                            
                                PySpark 2.x: Programmatically adding Maven JAR Coordinates to Spark
                            
                                Deep learnin on Google Colab: loading large image dataset is very long, how to accelerate the process?
                            
                                How do I write a Django ORM query for the reverse relationship in a one-to-many relationship?
                            
                                Selectively import from another Jupyter Notebook
                            
                                Automate the Boring Stuff Chapter 7: Regular Expressions - phone number and email extractor only extracting phone numbers
                            
                                Pandas Dataframe: to_dict() poor performance
                            
                                python websockets, how to setup connect timeout
                            
                                Python Setup.py: set environment variable prior to running install_requires
                            
                                How to get Flask-SQLAlchemy to work with the Application Factory Pattern
                            
                                Why is this message printed more than once during multiprocessing with concurrent.futures.ProcessPoolExecuter()?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Mechanize: too many values to unpack (expected 2)

Tags:

python

mechanize-python

Code_Ninja

People also ask

4 Answers

Code_Ninja

han solo

Jean-François Fabre

Case 1 : Use your native system's browser

Case 2 : Use selenium

hyperTrashPanda

Recent Activity

Donate For Us