Mechanize: too many values to unpack (expected 2)

I have written the following code in Python 3.7; it should simply open a web browser with each website fed to it on the command line:

Example.py

import sys

from mechanize import Browser
browser = Browser()

browser.set_handle_equiv(True)
browser.set_handle_gzip(True)
browser.set_handle_redirect(True)
browser.set_handle_referer(True)
browser.set_handle_robots(False)

# pretend you are a real browser
browser.addheaders = [('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36')]

listOfSites = sys.argv[1:]
for i in listOfSites:
    browser.open(i)

I entered the following command in cmd:

python Example.py https://www.google.com

And I have the following traceback:

Traceback (most recent call last):
  File "Example.py", line 19, in <module>
    browser.open(i)
  File "C:\Python37\lib\site-packages\mechanize\_mechanize.py", line 253, in open
    return self._mech_open(url_or_request, data, timeout=timeout)
  File "C:\Python37\lib\site-packages\mechanize\_mechanize.py", line 283, in _mech_open
    response = UserAgentBase.open(self, request, data)
  File "C:\Python37\lib\site-packages\mechanize\_opener.py", line 188, in open
    req = meth(req)
  File "C:\Python37\lib\site-packages\mechanize\_urllib2_fork.py", line 1104, in do_request_
    for name, value in self.parent.addheaders:
ValueError: too many values to unpack (expected 2)

I am very new to Python; this is my first program. I am stuck on the above traceback and haven't found a solution yet. I have searched through a lot of related questions on the SO community as well, but they didn't help. What should I do next?

UPDATE:

As suggested by @Jean-François Fabre in his answer, I have added 'User-agent' to the header. There is no traceback now, but the link still does not open in the browser.

Here is what addheaders looks like now:

browser.addheaders = [('User-agent', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36')]
asked Feb 05 '19 by Code_Ninja



4 Answers

I have found a workaround for this issue, even though the underlying problem remains. I am posting it only to let readers know that we can do it this way too:

Instead of using the mechanize package, we can use the standard-library webbrowser module and write the following Python code in Example.py:

import webbrowser
import sys

#This is an upgrade suggested by @Jean-François Fabre
listOfSites = sys.argv[1:]

for i in listOfSites:
    webbrowser.open_new_tab(i)

Then we can run this python code by executing the following command in the terminal/command prompt:

python Example.py https://www.google.com https://www.bing.com

The command in the example above will open two sites at once: one is Google and the other is Bing.
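For clarity, `sys.argv[0]` is the script name itself, so slicing from index 1 keeps only the URLs. A minimal sketch, with the argument list simulated instead of taken from a real command line:

```python
# Simulated command line: python Example.py https://www.google.com https://www.bing.com
argv = ['Example.py', 'https://www.google.com', 'https://www.bing.com']

listOfSites = argv[1:]  # drop the script name, keep only the URLs
print(listOfSites)      # ['https://www.google.com', 'https://www.bing.com']
```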

answered Oct 22 '22 by Code_Ninja


Here you go :)

import sys
from mechanize import Browser, Request


browser = Browser()

browser.set_handle_equiv(True)
browser.set_handle_gzip(True)
browser.set_handle_redirect(True)
browser.set_handle_referer(True)
browser.set_handle_robots(False)

# setup your header, add anything you want
header = {'User-Agent': 'Mozilla/5.0 (Windows NT 5.1; rv:14.0) Gecko/20100101 Firefox/14.0.1', 'Referer': 'http://whateveritis.com'}


url_list = sys.argv[1:]
for url in url_list:
    request = Request(url=url, data=None, headers=header)
    response = browser.open(request)
    print(response.read())
    response.close()
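The same headers-dict idea carries over to the standard library. As a rough sketch (the URL here is just a placeholder), urllib's Request accepts the dict at construction time without touching the network, and normalizes the header names:

```python
from urllib.request import Request

# Placeholder URL; building the Request object sends nothing over the network
header = {'User-Agent': 'Mozilla/5.0', 'Referer': 'http://example.com'}
request = Request(url='http://example.com', data=None, headers=header)

# urllib stores header names capitalized, e.g. 'User-Agent' -> 'User-agent'
print(request.get_header('User-agent'))  # Mozilla/5.0
```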
answered Oct 22 '22 by han solo


I don't know mechanize at all, but the traceback and variable names (and some googling) can help.

You're initializing addheaders with a list of strings. Some other examples (ex: Mechanize Python and addheader method - how do I know the newest headers?) show a list of tuples, which seem to match the traceback. Ex:

browser.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]

so that each entry unpacks properly into name and value in the loop:

for name, value in whatever.addheaders:

You have to add the 'User-agent' property name (you can also pass other, less common header names besides the browser name).

answered Oct 22 '22 by Jean-François Fabre


Let me try to answer your question in parts:

  • You're correct in adding "Browser Headers". Many servers might outright drop your connection, since a missing or default user agent is a definite sign of being crawled by a bot.

  • mechanize as stated by the docs "is a tool for programmatic web browsing".
    This means it is primarily used to crawl webpages, parse their contents, fill forms, click on things, and send requests, but not through a "real" web browser with features such as CSS rendering. For example, you cannot open a page and take a screenshot, as nothing is actually "rendered"; to achieve that you would need to save the page and render it with another solution.

  • If this suits your needs, check out headless browsers as a technology; there are a lot of them. In the Python ecosystem, other than mechanize, I'd look at headless Chromium, as PhantomJS is unfortunately discontinued.

But if I understand correctly, you need the actual web browser to open up with the webpage, right? For this reason, you actually need, well, a browser in your system to take care of that!

Case 1 : Use your native system's browser

Find out where your browser's executable lies on your system (for example, my Firefox executable is "C:\Program Files\Mozilla Firefox\firefox.exe") and add that directory to your PATH.

As you're using Windows, use the start menu to navigate to Advanced System Settings --> Advanced --> Environment Variables, and add the path above to your PATH variable.

If you're using Linux, export PATH=$PATH:"/path/to/your/browser" will take care of things.

Then, your code can run as simply as

import subprocess
import sys

listOfSites = sys.argv[1:]
args = ["firefox"]
for i in listOfSites:
    # each URL needs its own -new-tab flag, passed as separate arguments
    args += ["-new-tab", i]
print(args)
subprocess.run(args)

Firefox will open a new tab for each of the links you have provided.

Case 2 : Use selenium

Then comes Selenium, which in my opinion is the most mature solution to browser-related problems, and what most people use. I've used it in a production setting with very good results. It provides both the UI/frontend of a browser that renders the webpages, but also allows you to programmatically work with these webpages.

It needs some setup (for example, if you're using Firefox, you'll need to download the geckodriver executable from their releases page, and then add it to your PATH variable again).

Then you define your webdriver, spawn one for each of the websites you need to visit, and get the webpage. You can also take a screenshot as proof that the page has been rendered correctly.

from selenium import webdriver
import sys

listOfSites = sys.argv[1:]
for i in listOfSites:
    driver = webdriver.Firefox()
    driver.get('http://'+i)
    driver.save_screenshot(i+'-screenshot.png')

# When you're finished
# driver.quit()

I've tested both of these code snippets, and they work as expected. Please let me know how all this sounds, and if you need any additional information..! ^^

answered Oct 22 '22 by hyperTrashPanda